Research page

Research projects

SVM-KNN and work on Caltech-101
Shape feature selection
Shapetime photography
Digit classifier with few prototypes

SVM-KNN and work on Caltech-101

Details at Caltech-101
Publication: Hao Zhang, Alexander C. Berg, Michael Maire, and Jitendra Malik. SVM-kNN: Discriminative nearest neighbor classification for visual category recognition, IEEE Computer Vision and Pattern Recognition, 2006. [preprint in PDF] [poster in PPT]
See also:
- Data set home page: Caltech-101 (and newer Caltech-256)
- Local metric learning with very good performance at Andrea Frome's page
- Pyramid match kernel at Kristin Grauman's page
- Spatial pyramid match kernel at Lana Lazebnik's page
- Generative classification based on dependent regions at Feifei Li's page
- Hierarchical Max features on Caltech-101 at Lior Wolf's page and Jim Mutch's page
- Compositional features at Buhmann's page
- An extension of SVM-KNN is at Pang Ning Tan's page

Selecting Shape Features for Multi-way Classification

Details/Abstract: Visual object recognition benefits from feature selection: it reduces the computation needed to recognize a new instance of an object, and the selected features give insight into the classification process. We focus on a class of current feature selection methods known as embedded methods. Because object recognition is a multi-way classification problem, we derive an extension of the Relevance Vector Machine technique to the multi-class setting. In experiments, we apply the Relevance Vector Machine to the problem of digit classification and study its effects. Experimental results show that our classifier enhances accuracy, yields a readily interpretable subset of selected features, and runs at a cost within a constant factor of the baseline classifier.

Publication: Hao Zhang, Jitendra Malik. Selecting shape features using multi-class relevance vector machine. Technical Report UCB/EECS-2005-6, EECS Department, University of California, Berkeley, October 10, 2005.

Graphical model of the multi-class RVM classifier

Shapetime photography

With Bill Freeman

We try to capture in a still image the movement of an object over time.

Details/Abstract:
We introduce a new method to describe, in a single image, shape relationships over time. We acquire both range and image information in a sequence of frames using a stationary stereo camera. From the pictures taken, we compute a composite image consisting of the image pixels from the surface closest to the camera over all time frames. This composite reveals 3-D relationships between the shapes at different times, displayed by occlusion cues in the composite image. We call the composite a shape-time photograph.
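The compositing rule described above (keep, at each pixel, the image value from the frame whose surface is closest to the camera) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the nested-list data layout, and the convention that a smaller depth value means a closer surface are assumptions made for illustration, and it omits the Markov network correction entirely.

```python
def shapetime_composite(images, depths):
    """Naive shape-time composite (illustrative sketch, not the published code).

    images, depths: lists of T frames, each an H x W nested list.
    Assumption: depths[t][y][x] is distance from the camera, so the
    smallest value marks the front-most surface at that pixel.
    Returns (composite, frame_index), both H x W nested lists.
    """
    if not images or len(images) != len(depths):
        raise ValueError("need the same non-zero number of image and depth frames")
    height, width = len(depths[0]), len(depths[0][0])
    composite = [[None] * width for _ in range(height)]
    frame_index = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Pick the frame with the closest surface; ties go to the earliest frame.
            t_best = min(range(len(depths)), key=lambda t: depths[t][y][x])
            frame_index[y][x] = t_best
            composite[y][x] = images[t_best][y][x]
    return composite, frame_index


# Two 1 x 2 frames: frame 0 is closer at the left pixel, frame 1 at the right.
images = [[[10, 11]], [[20, 21]]]
depths = [[[1.0, 5.0]], [[2.0, 3.0]]]
composite, frame_index = shapetime_composite(images, depths)
```

Because each pixel is decided independently, noisy depth values can flip the chosen frame from one pixel to the next, which is the kind of artifact a spatial continuity prior is meant to suppress.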

Small errors in stereo depth measurements can create artifacts in the shape-time images. We correct most of these using a Markov network to estimate, at each pixel, the most probable time-frame showing the front-most surface, taking into account (a) the stereo depth measurements and their uncertainties, and (b) spatial continuity assumptions for the front-surface time-frame assignments.

Links:

Publication: Bill Freeman, Hao Zhang. Shapetime photography, IEEE Computer Vision and Pattern Recognition, 2003. [PDF]

Shape Classifier (feature weighting)

We try to classify object classes using as much information as we can from shape descriptors (e.g. shape context).

Details: For the purpose of object recognition, we learn one discriminative classifier per prototype, using shape context distances as the feature vector. The outputs of the classifiers for multiple prototypes are combined using error-correcting output codes. The overall classifier is tested on a benchmark dataset (MNIST) and is shown to outperform existing methods while using far fewer prototypes.
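As a rough illustration of how error-correcting output codes combine several binary decisions into one class label, here is a minimal Python sketch of Hamming-distance decoding. The codebook, the class labels, and the function name are hypothetical; the description above does not give the code matrix used in the paper or say how prototype classifiers map to code bits, so this shows only the general decoding idea.

```python
def ecoc_decode(bits, codebook):
    """Return the label whose codeword is nearest to `bits` in Hamming distance.

    bits: tuple of 0/1 outputs, one per binary classifier.
    codebook: dict mapping class label -> codeword tuple of the same length.
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# Hypothetical 3-class codebook over 5 binary classifiers.
# Any two codewords differ in at least 3 positions, so one wrong
# classifier output can still be decoded to the intended class.
CODEBOOK = {
    "0": (0, 0, 0, 0, 0),
    "1": (1, 1, 1, 0, 0),
    "2": (0, 0, 1, 1, 1),
}

# The codeword for "1" with its third bit flipped still decodes to "1".
label = ecoc_decode((1, 1, 0, 0, 0), CODEBOOK)
```

The redundancy in the codewords is what gives the scheme its tolerance: with a minimum pairwise distance of 3, a single erroneous classifier output leaves the correct codeword as the unique nearest one.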

Links:

Berkeley shape matching - overview page

Publication: Hao Zhang, Jitendra Malik. Learning a discriminative classifier using shape context distances, IEEE Computer Vision and Pattern Recognition, 2003. [PDF]