Nonparametric Hierarchical Bayesian Models of Categorization

Kevin Canini

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2011-133

December 14, 2011

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-133.pdf

Categorization, or classification, is a fundamental problem in both cognitive psychology and machine learning. Classical psychological models of categorization fall into two main groups: prototype models and exemplar models, which are equivalent, respectively, to the statistical methods of parametric density estimation and kernel density estimation. Many categorization studies in psychology attempt to understand how people solve this problem by comparing their inferences to those of formal computational models such as prototype or exemplar models. From this perspective, different models make different predictions about the representations and mechanisms people use to make categorization judgments. Instead, one can seek to understand categorization by viewing it as a problem of statistical inference and attempting to characterize the inductive biases of human learners. These inductive biases can be exposed directly using an experimental method called iterated learning, which provides insight into human categorization independently of any particular proposed model. I describe the results of an iterated learning study of human categorization that supports previous psychological findings that people's representations are more flexible than either prototype or exemplar models alone would imply.
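
To make this contrast concrete, the following is a minimal sketch in Python (illustrative, not code from the thesis; the one-dimensional stimuli and kernel bandwidth are invented for the example) of a prototype model as parametric density estimation and an exemplar model as kernel density estimation:

    import numpy as np
    from scipy.stats import norm

    category_a = np.array([1.0, 1.5, 2.0, 2.5])   # hypothetical training stimuli

    def prototype_density(x, examples):
        # Prototype model = parametric density estimation: the category is
        # summarized by a single Gaussian fit to all of its examples.
        mu, sigma = examples.mean(), examples.std(ddof=1)
        return norm.pdf(x, loc=mu, scale=sigma)

    def exemplar_density(x, examples, bandwidth=0.5):
        # Exemplar model = kernel density estimation: every stored example
        # contributes its own kernel; nothing is summarized away.
        return norm.pdf(x, loc=examples, scale=bandwidth).mean()

    x_new = 1.8
    print(prototype_density(x_new, category_a))   # similarity to the summary
    print(exemplar_density(x_new, category_a))    # averaged similarity to each example

Under a prototype model the category's examples can be discarded after fitting the summary; under an exemplar model every example must be retained, which is what makes its representation maximally complex.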

Prototype and exemplar models both use a single, fixed level of complexity in their representations of categories: prototype models use the simplest representations and exemplar models the most complex. Treating categorization as a type of statistical inference, I describe a family of nonparametric Bayesian models of categorization based on the Dirichlet process mixture model (DPMM). These models represent categories as combinations of clusters of objects and, together, produce a continuum of representational complexities in which prototype and exemplar models are special cases occupying opposite ends of the spectrum. DPMM-based models allow the complexity of category representations to be chosen to suit the task at hand or to change over time; this flexibility can explain psychological results demonstrating that people's inferences are sometimes more congruent with prototype models and at other times with exemplar models.
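
As a rough illustration of this continuum (a sketch assuming the standard Chinese restaurant process representation of the DPMM prior, not an implementation from the thesis), the concentration parameter alpha controls how finely a category is partitioned into clusters: driving alpha toward 0 yields a single cluster per category, the prototype end of the spectrum, while driving it toward infinity yields one cluster per object, the exemplar end:

    import numpy as np

    def crp_partition(n_items, alpha, rng):
        # Sample a partition of n_items objects from a Chinese restaurant
        # process with concentration parameter alpha.
        counts = []        # current size of each cluster
        assignments = []
        for _ in range(n_items):
            # An existing cluster attracts the next object in proportion to
            # its size; a brand-new cluster is created in proportion to alpha.
            probs = np.array(counts + [alpha], dtype=float)
            k = rng.choice(len(probs), p=probs / probs.sum())
            if k == len(counts):
                counts.append(0)
            counts[k] += 1
            assignments.append(k)
        return assignments

    rng = np.random.default_rng(0)
    print(crp_partition(10, 1e-9, rng))  # ~[0,0,0,...]: one cluster  (prototype end)
    print(crp_partition(10, 1e9, rng))   # ~[0,1,2,...]: one per item (exemplar end)

Intermediate values of alpha produce a handful of clusters per category, the representations that neither classical model family can express.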

The DPMM can be generalized into a larger framework of models based on the hierarchical Dirichlet process (HDP). The HDP subsumes the DPMM and several previous psychological models, including prototypes, exemplars, and the Rational Model of Categorization. In addition, the HDP contains a family of previously unexplored models that make interesting predictions about how information can be shared between multiple categories. While most other categorization models learn each category in isolation, independently of the others, these HDP models share information between categories. This sharing of information can improve the speed and accuracy of learning and explains certain transfer learning effects observed in people's judgments. I introduce an extension of the HDP, called the tree-HDP, which is designed to infer systems of hierarchically related categories. The tree-HDP simultaneously learns categories at multiple levels of generality and infers the taxonomic relationships among them.
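
The following is a minimal sketch (again illustrative, assuming the standard Chinese restaurant franchise view of the HDP rather than the thesis's own code) of how the HDP shares information: each category draws its clusters from a global menu whose popularity is pooled across categories, so a cluster discovered in one category becomes easier to reuse in another:

    import numpy as np

    def seat_in_franchise(n_per_category, alpha, gamma, rng):
        # Chinese restaurant franchise: each category is a restaurant whose
        # tables order dishes (clusters) from one menu shared by all
        # categories. dish_counts tracks, globally, how many tables serve
        # each dish, so popular clusters are reused across categories.
        dish_counts = []
        category_dishes = []
        for n in n_per_category:
            tables = []                          # [dish index, customer count]
            for _ in range(n):
                probs = np.array([c for _, c in tables] + [alpha], dtype=float)
                t = rng.choice(len(probs), p=probs / probs.sum())
                if t == len(tables):             # new table: choose its dish globally
                    dprobs = np.array(dish_counts + [gamma], dtype=float)
                    d = rng.choice(len(dprobs), p=dprobs / dprobs.sum())
                    if d == len(dish_counts):
                        dish_counts.append(0)    # a never-before-seen cluster
                    dish_counts[d] += 1
                    tables.append([d, 1])
                else:
                    tables[t][1] += 1
            category_dishes.append(sorted({d for d, _ in tables}))
        return category_dishes

    rng = np.random.default_rng(1)
    print(seat_in_franchise([8, 8, 8], alpha=1.0, gamma=1.0, rng=rng))
    # Overlapping dish indices across the three categories = shared clusters.

Because the dish counts are global, evidence about a cluster gathered while learning one category transfers to every other category that uses it, which is the mechanism behind the transfer learning effects described above.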

Advisor: Stuart J. Russell


BibTeX citation:

@phdthesis{Canini:EECS-2011-133,
    Author= {Canini, Kevin},
    Title= {Nonparametric Hierarchical Bayesian Models of Categorization},
    School= {EECS Department, University of California, Berkeley},
    Year= {2011},
    Month= {Dec},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-133.html},
    Number= {UCB/EECS-2011-133},
}

EndNote citation:

%0 Thesis
%A Canini, Kevin 
%T Nonparametric Hierarchical Bayesian Models of Categorization
%I EECS Department, University of California, Berkeley
%D 2011
%8 December 14
%@ UCB/EECS-2011-133
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-133.html
%F Canini:EECS-2011-133