Research projects



SVM-KNN and work on Caltech-101

-with Alex Berg and Michael Maire

     



Selecting Shape Features for Multi-way Classification

Details/Abstract: The task of visual object recognition benefits from feature selection: it reduces the computation needed to recognize a new instance of an object, and the selected features give insight into the classification process. We focus on a class of current feature selection methods known as embedded methods. Because object recognition is inherently a multi-way classification problem, we derive an extension of the Relevance Vector Machine technique to the multi-class setting. In experiments, we apply the Relevance Vector Machine to the problem of digit classification and study its effects. Experimental results show that our classifier enhances accuracy, yields a good interpretation of the selected subset of features, and has a cost within a constant factor of the baseline classifier's.

Publication: Hao Zhang, Jitendra Malik. Selecting shape features using multi-class relevance vector machine. Technical Report UCB/EECS-2005-6, EECS Department, University of California, Berkeley, October 10, 2005.

Graphical model of the multi-class RVM classifier



Shapetime photography

-with Bill Freeman

We try to capture in a still image the movement of an object over time.

Details/Abstract:
We introduce a new method to describe, in a single image, shape relationships over time. We acquire both range and image information in a sequence of frames using a stationary stereo camera. From the pictures taken, we compute a composite image consisting of the image pixels from the surface closest to the camera over all time frames. This composite reveals 3-D relationships between the shapes at different times, displayed by occlusion cues in the composite image. We call the composite a shape-time photograph.
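The compositing rule described above (keep, at each pixel, the image value from the frame whose surface is closest to the camera) can be sketched in a few lines. This is a minimal illustration, not the published implementation: the function name `shape_time_composite`, the nested-list image layout, and the toy frames are all assumptions made for the example, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List, Tuple

Image = List[List[int]]    # grayscale pixels, image[row][col] (assumed layout)
Depth = List[List[float]]  # distance to the camera, depth[row][col] (assumed layout)


def shape_time_composite(
    images: List[Image], depths: List[Depth]
) -> Tuple[Image, List[List[int]]]:
    """For each pixel, copy the value from the frame whose surface is nearest.

    Returns the composite image and, per pixel, the index of the chosen frame.
    """
    rows, cols = len(images[0]), len(images[0][0])
    composite = [[0] * cols for _ in range(rows)]
    frame_index = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Frame with the smallest depth at this pixel
            # (ties go to the earliest frame).
            best = min(range(len(depths)), key=lambda t: depths[t][r][c])
            frame_index[r][c] = best
            composite[r][c] = images[best][r][c]
    return composite, frame_index


# Toy example: two frames of a 1x2 image (hypothetical values).
images = [[[10, 10]], [[20, 20]]]
depths = [[[1.0, 5.0]], [[2.0, 3.0]]]
composite, frame_index = shape_time_composite(images, depths)
```

In the toy example the left pixel is taken from frame 0 and the right pixel from frame 1, because those frames have the smaller depth at the respective pixels. The per-pixel frame index is the quantity that a later smoothing step would regularize.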

Small errors in stereo depth measurements can create artifacts in the shape-time images. We correct most of these using a Markov network to estimate, at each pixel, the most probable time-frame showing the front-most surface, taking into account (a) the stereo depth measurements and their uncertainties, and (b) spatial continuity assumptions for the front-surface time-frame assignments.
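One generic way to write down such a Markov network is as a pairwise energy over the per-pixel time-frame labels. The form below is an assumed, textbook-style formulation for illustration only; the symbols $\phi_p$, $\psi$, $\lambda$, and $\mathcal{N}$ are not taken from the paper, and the actual potentials used there may differ.

```latex
% t_p : time-frame label assigned to pixel p
% \phi_p(t_p) : data cost, low when frame t_p has the front-most surface at p,
%               given the stereo depth measurements and their uncertainties
% \psi(t_p, t_q) : smoothness cost, low when neighboring pixels share a label
% \mathcal{N} : set of neighboring pixel pairs; \lambda : smoothness weight
E(t) = \sum_{p} \phi_p(t_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} \psi(t_p, t_q),
\qquad
P(t) \propto \exp\bigl(-E(t)\bigr),
\qquad
\hat{t} = \arg\min_{t} E(t)
```

Under this reading, term (a) of the text corresponds to $\phi_p$ and term (b) to $\psi$; the most probable labeling $\hat{t}$ is the one that minimizes the total energy.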

Links:

Publication: Bill Freeman, Hao Zhang. Shapetime photography, IEEE Computer Vision and Pattern Recognition, 2003. [PDF]



Shape Classifier (feature weighting)

    

We try to classify objects using as much information as we can from shape descriptors (e.g., shape context).

Details: For the purpose of object recognition, we learn one discriminative classifier per prototype, using shape context distances as the feature vector. The outputs of the classifiers for multiple prototypes are combined using the method called "error-correcting output codes". The overall classifier is tested on a benchmark dataset (MNIST) and is shown to outperform existing methods with far fewer prototypes.
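To illustrate how error-correcting output codes can combine several classifier outputs, here is a minimal decoding sketch. It assumes that each binary classifier produces a signed score for one bit of a class codeword and that decoding picks the codeword at the smallest Hamming distance; the code matrix `CODES`, the function `ecoc_decode`, and the scores are hypothetical, and the paper may use a different code design or decoding rule.

```python
from typing import Dict, Sequence

# Hypothetical code matrix: one +1/-1 codeword per class, one bit per classifier.
# Any two codewords differ in at least 3 bits, so one flipped bit is corrected.
CODES: Dict[str, Sequence[int]] = {
    "0": [+1, +1, +1, -1, -1],
    "1": [+1, -1, -1, +1, -1],
    "2": [-1, +1, -1, -1, +1],
}


def ecoc_decode(code_matrix: Dict[str, Sequence[int]], outputs: Sequence[float]) -> str:
    """Return the class whose codeword is closest to the thresholded outputs."""
    # Threshold the real-valued classifier scores to +1/-1 bits.
    bits = [1 if score >= 0 else -1 for score in outputs]

    def hamming(codeword: Sequence[int]) -> int:
        # Number of bit positions where the prediction disagrees with the codeword.
        return sum(b != c for b, c in zip(bits, codeword))

    return min(code_matrix, key=lambda label: hamming(code_matrix[label]))


# The second classifier is wrong for class "0" (score -0.2 instead of positive),
# but the nearest codeword is still the one for class "0".
scores = [0.9, -0.2, 0.7, -0.5, -0.8]
predicted = ecoc_decode(CODES, scores)
```

The redundancy in the codewords is what lets the combined classifier tolerate mistakes by individual binary classifiers.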

Links:

Publication: Hao Zhang, Jitendra Malik. Learning a discriminative classifier using shape context distances, IEEE Computer Vision and Pattern Recognition, 2003. [PDF]