Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing

Stuart Russell and Eric Wefald

EECS Department, University of California, Berkeley

Technical Report No. UCB/CSD-88-435

October 1988

In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as if it were the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsume or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.
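The single-step assumption can be illustrated with a minimal Python sketch. It is not the MGSS* algorithm from the report. It only shows the underlying idea: a computation has value only if it might change which move is chosen. The function names, the discrete outcome distributions, and the fixed `time_cost` are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Optional, Tuple

Outcomes = List[Tuple[float, float]]  # (probability, revised value) pairs


def single_step_value(estimates: Dict[str, float], move: str,
                      outcomes: Outcomes, time_cost: float = 0.0) -> float:
    """Estimated net value of one computation that revises `move`'s estimate.

    Assumes the agent acts immediately after the computation, so the benefit
    is the expected gain from switching to a better move, minus time cost.
    """
    ranked = sorted(estimates, key=estimates.get, reverse=True)
    if len(ranked) < 2:
        return -time_cost  # with a single move, no computation can change the choice
    best, runner_up = ranked[0], ranked[1]
    benefit = 0.0
    for prob, new_value in outcomes:
        if move == best:
            # The current best may fall below the runner-up, prompting a switch.
            benefit += prob * max(estimates[runner_up] - new_value, 0.0)
        else:
            # Another move may rise above the current best.
            benefit += prob * max(new_value - estimates[best], 0.0)
    return benefit - time_cost


def choose_step(estimates: Dict[str, float], candidates: Dict[str, Outcomes],
                time_cost: float) -> Tuple[Optional[str], float]:
    """Pick the computation with the highest estimated net value, or stop
    (return None) when no computation is expected to pay for itself."""
    values = {m: single_step_value(estimates, m, o, time_cost)
              for m, o in candidates.items()}
    top = max(values, key=values.get)
    return (top, values[top]) if values[top] > 0 else (None, 0.0)


estimates = {"a": 0.6, "b": 0.5, "c": 0.1}
candidates = {
    "a": [(0.5, 0.7), (0.5, 0.3)],
    "b": [(0.5, 0.9), (0.5, 0.4)],
    "c": [(0.5, 0.2), (0.5, 0.0)],
}
step, value = choose_step(estimates, candidates, time_cost=0.02)
```

In this toy example, examining move `c` can never change the decision, so its net value is just the negative time cost. Raising `time_cost` far enough makes every computation unprofitable, which is the termination condition in this sketch.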

BibTeX citation:

@techreport{Russell:CSD-88-435,
    Author= {Russell, Stuart and Wefald, Eric},
    Title= {Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing},
    Year= {1988},
    Month= {Oct},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html},
    Number= {UCB/CSD-88-435},
    Abstract= {In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as if it were the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsume or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.},
}

EndNote citation:

%0 Report
%A Russell, Stuart 
%A Wefald, Eric 
%T Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-435
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html
%F Russell:CSD-88-435