CS294-2 Visual Grouping and Object Recognition (Prof. Jitendra Malik)

November 22, 1999

Lecture 22: 3D Object Recognition

Scribe Notes by Jeng Lung



There are two approaches to recognizing 3D objects from 2D views:


1. Multiple Views Approach


This approach models a 3D object as a set of 2D views.  Think of the 3D object as surrounded by a viewing sphere, with pictures taken from different points on the sphere.  Each picture is called an “aspect”.  Two aspects are the same if the sets of visible and hidden surfaces in the two pictures are the same.  In graph terms, views are represented as graphs, and two views are equivalent if their graphs are isomorphic.  Visual events, at which the aspect changes, correspond to curves on the viewing sphere.
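The grouping of views into aspects can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the lecture: the object is a cube (convex, so a face is visible exactly when it faces the camera), the camera is far away (orthographic projection), and the names CUBE_NORMALS, visible_faces, and sample_aspects are made up for this example.

```python
import math
import random

# Outward unit normals of an axis-aligned cube (illustrative convex object).
CUBE_NORMALS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def random_view_direction(rng):
    """Draw a point uniformly at random on the unit viewing sphere."""
    while True:
        v = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return tuple(c / norm for c in v)

def visible_faces(view_dir, normals=CUBE_NORMALS):
    """Faces of a convex polyhedron seen by a distant camera placed along view_dir."""
    return frozenset(
        name for name, n in normals.items()
        if sum(a * b for a, b in zip(n, view_dir)) > 0
    )

def sample_aspects(num_views=2000, seed=0):
    """Group sampled viewpoints by their set of visible faces (one group per aspect)."""
    rng = random.Random(seed)
    aspects = {}
    for _ in range(num_views):
        d = random_view_direction(rng)
        aspects.setdefault(visible_faces(d), []).append(d)
    return aspects

aspects = sample_aspects()
for faces, views in sorted(aspects.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(faces), len(views))
```

Under these assumptions every randomly chosen view shows three faces of the cube, one group per corner. Views showing only one or two faces occupy curves or points on the viewing sphere and are essentially never hit by random sampling.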




1.1 Vertex-Edge Event


This occurs when a vertex transitions between occlusion and disocclusion in the picture.


1.2 Edge-Edge-Edge Event


This occurs when an entire surface appears/disappears in the picture.



1.3 Aspect Graph


You can construct an aspect graph by making each aspect a node in the graph and connecting neighboring aspects with edges.  The idea of an aspect graph came from Koenderink & Van Doorn in 1979.  Gigus & Malik modified it for polyhedral objects in 1989 and Ponce & Kriegman modified it for curved objects in 1992.
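As a rough sketch of the data structure, the Python below stores an aspect graph as an adjacency map and links two aspects whenever they occur next to each other along a sampled path of viewpoints. This is an approximation for illustration only: the works cited above compute the visual event curves exactly rather than by sampling. The names build_aspect_graph and cube_equator_aspect are hypothetical, and the example object is a cube viewed from points around its equator.

```python
import math
from collections import defaultdict

def build_aspect_graph(view_path, aspect_of):
    """Nodes are aspects; an edge joins aspects that are adjacent along view_path."""
    graph = defaultdict(set)
    previous = None
    for view in view_path:
        aspect = aspect_of(view)
        graph[aspect]  # touching the key registers the node
        if previous is not None and aspect != previous:
            graph[aspect].add(previous)
            graph[previous].add(aspect)
        previous = aspect
    return dict(graph)

# Side faces of a cube, given by their outward normals in the equatorial plane.
SIDE_NORMALS = {"+x": (1.0, 0.0), "-x": (-1.0, 0.0), "+y": (0.0, 1.0), "-y": (0.0, -1.0)}

def cube_equator_aspect(theta_degrees):
    """Set of side faces visible from a distant viewpoint at angle theta on the equator."""
    theta = math.radians(theta_degrees)
    d = (math.cos(theta), math.sin(theta))
    return frozenset(
        name for name, n in SIDE_NORMALS.items() if n[0] * d[0] + n[1] * d[1] > 0
    )

# Half-degree offsets avoid the singular views that look straight at a face.
angles = [k + 0.5 for k in range(360)]
graph = build_aspect_graph(angles + angles[:1], cube_equator_aspect)  # close the loop
for aspect, neighbors in graph.items():
    print(sorted(aspect), "->", [sorted(n) for n in neighbors])
```

For this path the graph is a cycle of four aspects, each showing two adjacent side faces.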


1.4 Problems with multiple views approach


While the multiple views approach is sound in theory, it does not work well in practice because too many aspects are needed.  If an object has n vertices, in the worst case there will be O(n^6) aspects.  A second problem is that the method depends on segmentation: the image is first segmented, a line drawing of the object is created from the result, and the line drawing is then matched against the different aspects for recognition.  Segmentation is not yet reliable enough for this, so bad segmentation leads to bad recognition.
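To get a feel for how fast the worst-case bound grows, the short Python snippet below evaluates the n^6 term for a few vertex counts. It ignores constant factors and is not the actual number of aspects of any particular object. The function name is made up for this example.

```python
def worst_case_aspect_growth(n_vertices):
    """Growth term n^6 from the O(n^6) worst-case bound (constant factors ignored)."""
    return n_vertices ** 6

for n in (8, 20, 100):
    print(n, worst_case_aspect_growth(n))
```

Even for an object with 100 vertices the term is already 10^12, which is why storing and matching every aspect is impractical.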


1.5 Generic/Singular views


Results from studies in psychology suggest that the number of views needed could be reduced.  People recognize objects more easily in familiar views.  For example, people recognize a horse more easily from the side or the front than from the top or the bottom.


There are “generic/non-accidental” views and “singular/accidental” views.  For example, in the figure below, a straight line in the picture generally comes from a straight line in space, but there is an “accidental” view in which it is actually a circle viewed from the side (edge-on).
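The circle example can be checked numerically. The Python sketch below assumes orthographic projection and a unit circle lying in the plane z = 0. It projects the circle for a given viewing direction and returns the ratio of the minor to the major axis of its image. The function name projected_axis_ratio and the helper functions are made up for this illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(c / n for c in a)

def projected_axis_ratio(view_dir, samples=360):
    """Minor/major axis ratio of the image of a unit circle in the plane z = 0."""
    d = normalize(view_dir)
    # Build an orthonormal basis (u, v) of the image plane perpendicular to d.
    helper = (0.0, 0.0, 1.0) if abs(d[2]) < 0.9 else (1.0, 0.0, 0.0)
    u = normalize(cross(helper, d))
    v = cross(d, u)
    pts = []
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        p = (math.cos(t), math.sin(t), 0.0)
        pts.append((dot(p, u), dot(p, v)))
    mx = sum(x for x, _ in pts) / samples
    my = sum(y for _, y in pts) / samples
    sxx = sum((x - mx) ** 2 for x, _ in pts) / samples
    syy = sum((y - my) ** 2 for _, y in pts) / samples
    sxy = sum((x - mx) * (y - my) for x, y in pts) / samples
    # Eigenvalues of the 2x2 covariance matrix give the squared axis lengths up to scale.
    trace, det = sxx + syy, sxx * syy - sxy * sxy
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam_max = (trace + disc) / 2.0
    lam_min = max((trace - disc) / 2.0, 0.0)
    return math.sqrt(lam_min / lam_max)

print(projected_axis_ratio((1.0, 0.0, 0.0)))  # edge-on: the accidental view
print(projected_axis_ratio((1.0, 0.0, 1.0)))  # a generic oblique view
print(projected_axis_ratio((0.0, 0.0, 1.0)))  # looking straight down at the circle
```

The ratio drops to zero only when the viewing direction lies in the plane of the circle, which is the accidental view. Any small change of viewpoint turns the line back into a thin ellipse.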



Similarly, in the figure below, there is an accidental view in which l1, l2, and l3 appear to meet in the picture even though they do not intersect in 3D space.



2. Similarity among visually perceived forms


Erich Goldmeier studied how people judge shapes to be similar back in the 1930s.  The hypothesis was that two shapes are more similar if their feature vectors are more similar.

This means that changing the parts of a figure proportionally should result in a figure that is more similar to the original.


In the example below, people pick the figure on the right as more similar to the original image, which contradicts the hypothesis.



The hypothesis is therefore modified: if there are many closely spaced elements, similarity is determined by texture, while if the elements are not closely spaced, similarity is determined by proportionality.