CS294-2 Visual Grouping and Object Recognition (Prof. Jitendra Malik)
November 22, 1999
Lecture 22: 3D Object Recognition
Scribe Notes by Jeng Lung
There are two approaches to recognizing 3D objects from 2D views:
The multiple views approach attacks the problem by modeling a 3D object as a set of 2D views. Think of the 3D object as surrounded by a viewing sphere, with pictures taken from different points on that sphere. Each picture is called an “aspect”. Two aspects are the same if the sets of visible and hidden surfaces seen in both pictures are the same. Put another way, two views are equivalent when the graphs of their line drawings are isomorphic, and visual events, where the aspect changes, correspond to curves on the viewing sphere.
1.1 Vertex-Edge Event
This occurs when a vertex transitions between occlusion and disocclusion in the picture.
1.2 Edge-Edge-Edge Event
This occurs when an entire surface appears or disappears in the picture.
1.3 Aspect Graph
You can construct an aspect graph by making each aspect a node in the graph and connecting neighboring aspects with edges.
The idea of an aspect graph came from Koenderink & Van Doorn in 1979. Gigus & Malik modified it for polyhedral objects in 1989, and Ponce & Kriegman modified it for curved objects in 1992.
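The construction above can be sketched numerically for a very simple object. The example below is only an illustration under stated assumptions, not the algorithm of any of the papers cited: it assumes a cube, orthographic projection, an aspect defined as the set of visible faces, and a latitude/longitude sampling of the viewing sphere. The names FACE_NORMALS, view_direction, aspect, and build_aspect_graph are hypothetical.

```python
import math

# Outward normals of an axis-aligned cube; the face labels are hypothetical.
FACE_NORMALS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def view_direction(theta, phi):
    """Unit vector from the object toward a viewpoint on the viewing sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def aspect(direction):
    """Faces visible from `direction` (orthographic view of a convex object)."""
    return frozenset(
        name for name, normal in FACE_NORMALS.items()
        if sum(n * d for n, d in zip(normal, direction)) > 0
    )

def build_aspect_graph(n_theta=18, n_phi=36):
    """Sample the viewing sphere on a grid and return (nodes, edges)."""
    # With the default grid sizes, the half-cell offsets keep every sample
    # off the curves where the set of visible faces changes.
    grid = [
        [aspect(view_direction((i + 0.5) * math.pi / n_theta,
                               (j + 0.5) * 2 * math.pi / n_phi))
         for j in range(n_phi)]
        for i in range(n_theta)
    ]
    nodes, edges = set(), set()
    for i in range(n_theta):
        for j in range(n_phi):
            here = grid[i][j]
            nodes.add(here)
            neighbors = [grid[i][(j + 1) % n_phi]]   # next sample in longitude
            if i + 1 < n_theta:
                neighbors.append(grid[i + 1][j])     # next sample in latitude
            for there in neighbors:
                if there != here:                    # crossed a visual event
                    edges.add(frozenset((here, there)))
    return nodes, edges

nodes, edges = build_aspect_graph()
print(len(nodes), "aspects,", len(edges), "edges")
```

Under these assumptions every sampled view shows three faces, so the graph has 8 nodes, joined by 12 edges. Sampling is used here only because it is short to write; it can miss small regions of the viewing sphere for more complicated objects.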
1.4 Problems with multiple views approach
While the multiple views approach is sound in theory, it does not work well in practice because too many aspects are needed. If an object has n vertices, in the worst case there will be O(n^6) aspects. Another problem with this method is poor segmentation. Segmentation is first done on an image, and then a line drawing of the object is created. The line drawing is then matched against the different aspects for recognition. The method therefore relies on segmentation, which is not yet reliable.
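To get a feel for how quickly the worst-case bound grows, the short sketch below evaluates n^6 for a few object sizes. It ignores the constant hidden by the O notation, so the numbers indicate growth only and are not actual aspect counts. The function name worst_case_growth is hypothetical.

```python
def worst_case_growth(n, exponent=6):
    """Growth term n**exponent of the worst-case bound, constant factor ignored."""
    return n ** exponent

for n in (8, 20, 100):
    print(f"n = {n:3d}: n^6 = {worst_case_growth(n):,}")
```

Going from 8 to 100 vertices raises the term from 262,144 to 1,000,000,000,000, which is why storing and matching every aspect is impractical for complex objects.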
1.5 Generic/Singular views
Using results from studies in psychology, the number of views needed could be reduced. People recognize objects more easily in familiar views. For example, people can recognize a horse more easily from the side or the front, but find it harder to recognize the horse from the top or the bottom.
There are “generic/non-accidental” views and “singular/accidental” views. For example, the straight line in the figure below would in general be the image of a straight line in 3D, but there is an “accidental” view in which it is the image of a circle viewed edge-on.
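The circle example can be checked with a small projection sketch. It assumes orthographic projection along the z-axis and a circle whose plane is tilted about the x-axis by a given angle (0 degrees is face-on, 90 degrees is edge-on). The names project_circle, image_height, and looks_like_line, and the tolerance value, are hypothetical choices for illustration.

```python
import math

def project_circle(tilt_deg, radius=1.0, samples=360):
    """Image points of a tilted circle under orthographic projection along z."""
    t = math.radians(tilt_deg)
    points = []
    for k in range(samples):
        s = 2 * math.pi * k / samples
        x = radius * math.cos(s)
        y_in_plane = radius * math.sin(s)
        # Rotating the circle's plane about the x-axis by t scales the
        # image y-coordinate by cos(t); the z-coordinate is dropped.
        points.append((x, y_in_plane * math.cos(t)))
    return points

def image_height(points):
    """Extent of the projected curve along the image y-axis."""
    ys = [y for _, y in points]
    return max(ys) - min(ys)

def looks_like_line(tilt_deg, tol=1e-6):
    """True when the projected circle collapses to a line segment."""
    return image_height(project_circle(tilt_deg)) < tol

for tilt in (0, 60, 89, 90):
    print(tilt, round(image_height(project_circle(tilt)), 4), looks_like_line(tilt))
```

Only the exactly edge-on view collapses the circle to a segment. Even one degree away from it the image is a thin ellipse, which is why such a view is called accidental: an arbitrarily small change of viewpoint destroys it.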
Similarly, in the figure below, there is an accidental view in which l1, l2, and l3 appear to intersect in the image even though they do not intersect in 3D space.
2. Similarity among visually perceived forms
Erich Goldmeier studied how people judge shapes to be similar back in the 1930s and hypothesized that two shapes are more similar if their feature vectors are more similar.
Under this hypothesis, changing the parts of a figure proportionally results in a more similar figure.
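What the hypothesis predicts can be illustrated with a toy computation. The sketch below is an assumption made for illustration, not Goldmeier's procedure: it describes a figure by a vector of part sizes and compares figures with cosine similarity, which is unchanged when all parts are scaled by the same factor. The vectors and the name cosine_similarity are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

original = [4.0, 2.0, 1.0]                 # hypothetical part sizes
proportional = [2 * x for x in original]   # every part doubled
disproportional = [8.0, 2.0, 1.0]          # only the first part doubled

sim_proportional = cosine_similarity(original, proportional)
sim_disproportional = cosine_similarity(original, disproportional)
print(round(sim_proportional, 3), round(sim_disproportional, 3))
```

With this measure the proportionally changed figure scores higher than the one in which a single part was changed, which is the behavior the hypothesis expects of human judgments.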
In the example below, people pick the figure on the right as more similar to the original image, which contradicts the hypothesis.
The hypothesis is therefore modified: if there are many closely spaced elements, similarity is determined by texture, whereas if the elements are not closely spaced, similarity is determined by proportionality.