Probabilistic Models with Unknown Objects
Brian Christopher Milch
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2006-174
December 13, 2006
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-174.pdf
Humans and other intelligent agents must make inferences about the real-world objects that underlie their observations: for instance, the objects visible in an image, or the people mentioned in a set of text documents. The agent may not know in advance how many objects exist, how they are related to each other, or which observations correspond to which underlying objects. Existing declarative representations for probabilistic models do not capture the structure of such scenarios.
This thesis introduces Bayesian logic (BLOG), a first-order probabilistic modeling language that specifies probability distributions over possible worlds with varying sets of objects. A BLOG model contains statements that define conditional probability distributions for a certain set of random variables; the model also specifies certain context-specific independence properties. We provide criteria under which such a model is guaranteed to fully define a probability distribution. These criteria go beyond existing results in that they can be satisfied even when the Bayesian network defined by the model is cyclic, or contains nodes with infinitely many ancestors.
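To give a concrete sense of what "possible worlds with varying sets of objects" means, the following Python sketch (not BLOG syntax, and not taken from the thesis) shows the kind of generative process such a model describes: the number of objects is itself random, so each sampled world may contain a different set of objects, and observations do not say which object they came from. The urn-and-balls setup, the geometric prior, and the noise rate are assumptions made purely for illustration.

import random

def sample_world():
    # Unknown number of balls in an urn; a simple geometric prior is
    # assumed here so that every world has at least one ball.
    num_balls = 1
    while random.random() < 0.5:
        num_balls += 1
    # Each ball has a latent color.
    return [random.choice(["blue", "green"]) for _ in range(num_balls)]

def sample_observations(colors, num_draws=4, noise=0.2):
    # Each draw picks a ball uniformly at random (with replacement) and
    # reports its color, flipped with some probability. The observation
    # does not reveal which ball was drawn.
    obs = []
    for _ in range(num_draws):
        c = random.choice(colors)
        if random.random() < noise:
            c = "green" if c == "blue" else "blue"
        obs.append(c)
    return obs

world = sample_world()
print(len(world), world, sample_observations(world))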
We describe several approximate inference algorithms that exploit the context-specific dependence structure revealed by a BLOG model. First, we present rejection sampling and likelihood weighting algorithms that are guaranteed to converge to the correct probability for any query on a structurally well-defined BLOG model. Because these algorithms instantiate only those variables that are context-specifically relevant, they can generate samples in finite time even when the model defines infinitely many variables. We then define a general framework for inference on BLOG models using Markov chain Monte Carlo (MCMC) algorithms. This framework allows a programmer to plug in a domain-specific proposal distribution, which helps the Markov chain move to high-probability worlds. Furthermore, the chain can operate on partial world descriptions that specify values only for context-specifically relevant variables. We give conditions under which MCMC over such partial world descriptions is guaranteed to converge to correct probabilities. We also show that this framework performs efficiently on a real-world task: reconstructing the set of distinct publications referred to by a set of bibliographic citations.
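The following Python sketch illustrates the likelihood weighting idea mentioned above on the same toy urn model: latent variables (the number of balls and their colors) are sampled from the prior, and each sample is weighted by the likelihood of the observed draws. The prior, noise rate, and query are assumptions for this sketch, not details drawn from the thesis.

import random
from collections import Counter

NOISE = 0.2  # assumed observation noise rate

def prior_num_balls():
    # Same assumed geometric prior as in the sketch above.
    n = 1
    while random.random() < 0.5:
        n += 1
    return n

def likelihood_weighting(observed_colors, num_samples=20000):
    # Estimate the posterior over the number of balls given the observed
    # draw colors, by weighting prior samples with the evidence likelihood.
    weights = Counter()
    for _ in range(num_samples):
        n = prior_num_balls()
        colors = [random.choice(["blue", "green"]) for _ in range(n)]
        w = 1.0
        for obs in observed_colors:
            # Each draw picks a ball uniformly; summing over that
            # unobserved choice gives the draw's likelihood.
            p_match = sum(1 for c in colors if c == obs) / n
            w *= p_match * (1 - NOISE) + (1 - p_match) * NOISE
        weights[n] += w
    total = sum(weights.values())
    return {n: w / total for n, w in sorted(weights.items())}

# Posterior over the number of balls given four observed draws.
print(likelihood_weighting(["blue", "blue", "green", "blue"]))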
Advisor: Stuart J. Russell
BibTeX citation:
@phdthesis{Milch:EECS-2006-174,
    Author = {Milch, Brian Christopher},
    Title = {Probabilistic Models with Unknown Objects},
    School = {EECS Department, University of California, Berkeley},
    Year = {2006},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-174.html},
    Number = {UCB/EECS-2006-174},
    Abstract = {Humans and other intelligent agents must make inferences about the real-world objects that underlie their observations: for instance, the objects visible in an image, or the people mentioned in a set of text documents. The agent may not know in advance how many objects exist, how they are related to each other, or which observations correspond to which underlying objects. Existing declarative representations for probabilistic models do not capture the structure of such scenarios. This thesis introduces Bayesian logic (BLOG), a first-order probabilistic modeling language that specifies probability distributions over possible worlds with varying sets of objects. A BLOG model contains statements that define conditional probability distributions for a certain set of random variables; the model also specifies certain context-specific independence properties. We provide criteria under which such a model is guaranteed to fully define a probability distribution. These criteria go beyond existing results in that they can be satisfied even when the Bayesian network defined by the model is cyclic, or contains nodes with infinitely many ancestors. We describe several approximate inference algorithms that exploit the context-specific dependence structure revealed by a BLOG model. First, we present rejection sampling and likelihood weighting algorithms that are guaranteed to converge to the correct probability for any query on a structurally well-defined BLOG model. Because these algorithms instantiate only those variables that are context-specifically relevant, they can generate samples in finite time even when the model defines infinitely many variables. We then define a general framework for inference on BLOG models using Markov chain Monte Carlo (MCMC) algorithms. This framework allows a programmer to plug in a domain-specific proposal distribution, which helps the Markov chain move to high-probability worlds. Furthermore, the chain can operate on partial world descriptions that specify values only for context-specifically relevant variables. We give conditions under which MCMC over such partial world descriptions is guaranteed to converge to correct probabilities. We also show that this framework performs efficiently on a real-world task: reconstructing the set of distinct publications referred to by a set of bibliographic citations.}
}
EndNote citation:
%0 Thesis
%A Milch, Brian Christopher
%T Probabilistic Models with Unknown Objects
%I EECS Department, University of California, Berkeley
%D 2006
%8 December 13
%@ UCB/EECS-2006-174
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-174.html
%F Milch:EECS-2006-174