Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing
Stuart Russell and Eric Wefald
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-88-435
, 1988
http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-435.pdf
In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.
BibTeX citation:
@techreport{Russell:CSD-88-435, Author= {Russell, Stuart and Wefald, Eric}, Title= {Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing}, Year= {1988}, Month= {Oct}, Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html}, Number= {UCB/CSD-88-435}, Abstract= {In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.}, }
EndNote citation:
%0 Report %A Russell, Stuart %A Wefald, Eric %T Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing %I EECS Department, University of California, Berkeley %D 1988 %@ UCB/CSD-88-435 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html %F Russell:CSD-88-435