Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing

Stuart Russell and Eric Wefald

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-88-435
October 1988

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-435.pdf

In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.


BibTeX citation:

@techreport{Russell:CSD-88-435,
    Author = {Russell, Stuart and Wefald, Eric},
    Title = {Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1988},
    Month = {Oct},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html},
    Number = {UCB/CSD-88-435},
    Abstract = {In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.}
}

EndNote citation:

%0 Report
%A Russell, Stuart
%A Wefald, Eric
%T Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-435
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html
%F Russell:CSD-88-435