CS 294-2 "Grouping and Recognition"
November 29th, 1999
Scribe notes by Natsuko Toyofuku
Two main applications of practical image recognition are:
Image retrieval (Internet searches, engineering, computer science)
Biological object recognition (cortical locality of image processing
and neural processes)
In essence we wish to be able to retrieve an image containing desired features
from a database of images.
Currently, the best tool we have for this is an Internet search engine
such as AltaVista. AltaVista is generally not very successful at image
retrieval, since it matches the query against text associated with the
image (e.g. searching captions) rather than the image content itself.
How can we improve this?
Rough sketch of process:
1. Content Analysis: methodically search through each image or file.
Drawbacks: this can become far too labor-intensive, since it involves an
exhaustive pass over every image. Keywords (for text/caption searches),
or, for color images, histograms can improve performance.
2. Build Index: build a more limited database to search. This
also increases speed, but sacrifices comprehensiveness.
3. Query Analysis: search the index.
Problems: you have to make sure the query is in terms that match the
format of the index.
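The three steps above can be sketched concretely. Here is a minimal, illustrative pipeline assuming images arrive as lists of (r, g, b) pixel tuples; all function names and the choice of histogram intersection as the similarity score are assumptions for the sketch, not the specific systems discussed in lecture.

```python
# Hypothetical sketch of the retrieval pipeline: content analysis
# (color histograms), index construction, and query analysis.

def color_histogram(pixels, bins=4):
    """Content analysis: quantize each color channel into `bins` buckets."""
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        i = ((r * bins // 256) * bins * bins
             + (g * bins // 256) * bins
             + (b * bins // 256))
        hist[i] += 1
    total = float(len(pixels))
    return [h / total for h in hist]  # normalize so any image sizes compare

def build_index(images):
    """Build index: precompute one feature vector per named image."""
    return {name: color_histogram(px) for name, px in images.items()}

def query(index, example_pixels):
    """Query analysis: rank indexed images by histogram intersection
    with an example image supplied by the user."""
    q = color_histogram(example_pixels)
    def score(name):
        return sum(min(a, b) for a, b in zip(q, index[name]))
    return sorted(index, key=score, reverse=True)
```

Note that the query must be expressed in the same terms as the index (here, a histogram computed with the same bin count), which is exactly the query-analysis problem mentioned above.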
There are many indexes of images, and although the Internet has created
an explosion of publicly available image collections (the size of the
Internet is estimated at ~15 terabytes and growing), image collections
existed even before the Internet, e.g. the Hulton Deutsch Collection:
12,000,000 photographs housed in racks in many rooms.
Recommended site: www.thinker.edu
This is the Legion of Honor museum in SF. Look for the "Image
With this many images in so many databases, how can you find what you want?
One solution is to add text to the pictures (i.e. use a set of descriptive
terms or phrases to represent each image). There are two problems
with this solution. First, such a method could never be completely
exhaustive, and any attempt would become far too unwieldy or impractical.
Second, the describer's words may not match the searcher's. Take a Van
Gogh painting, for example: how could you describe it so that someone
else could query it?
Luckily, there are patterns and trends in how and what people query
for, such as "classes" (e.g. animal, landscape) or "qualified classes"
(e.g. puppies, seashores with lighthouses).
People tend to want to find "things" rather than "stuff" (e.g. colors
and textures), and "categories" rather than "specific instances".
The main paradigm is
Correspondence — Pose — Verification
We've looked at several methods for object recognition.
Other methods include iconic matching and histogram matching.
But if you have enough detail for Query 3, then you practically have
the queried image already, which is not terribly helpful.
Histogram Matching (wavelets)
Blobworld (from Berkeley) uses query-by-example: you supply an image
similar to what you want. A flower would be a bad example image to use
to find a tiger.
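The "(wavelets)" note above refers to multiresolution decompositions of the image. As a toy illustration, here is one level of a 1-D Haar wavelet step; this is only a sketch of the underlying idea, not the 2-D decomposition an actual wavelet-based retrieval system would use.

```python
# One level of a 1-D Haar wavelet transform (illustrative sketch).
# The coarse averages summarize overall intensity; the detail
# coefficients capture high-frequency variation (e.g. texture).

def haar_step(signal):
    """Split an even-length signal into coarse averages and details."""
    averages = [(signal[i] + signal[i + 1]) / 2
                for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2
               for i in range(0, len(signal), 2)]
    return averages, details
```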
Why this mixed bag of errors?
The relationship between precision and recall
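The two quantities trade off against each other: retrieving more images tends to raise recall but lower precision. A minimal sketch of the standard definitions, with hypothetical set-valued inputs:

```python
# Precision: fraction of retrieved images that are relevant.
# Recall: fraction of relevant images that were retrieved.

def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for sets of image identifiers."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```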
Object recognition is also application dependent: what do you
intend to do with the image, and what kinds of images are you looking at?
What about real tasks? What causes the errors you see?
But these methods operate on single images.
Solution: normalized cuts
While you do get better recognition and better segmentation with normalized
cuts, it is hard to ignore spurious boundaries (see examples).
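Normalized cuts (Shi and Malik) treats segmentation as graph partitioning: pixels are nodes, affinities are edge weights W, and one solves the generalized eigenproblem (D - W) y = lambda D y, splitting on the second-smallest eigenvector. Below is a toy sketch on a hand-built 6-node affinity matrix (the matrix and threshold rule are assumptions for illustration; real use builds W from pixel similarity):

```python
# Toy normalized-cuts bipartition via the normalized Laplacian,
# which shares eigenvectors (up to scaling) with the generalized
# eigenproblem (D - W) y = lambda * D * y.
import numpy as np

def ncut_bipartition(W):
    """Split a graph in two, given a symmetric affinity matrix W."""
    d = W.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L = D_isqrt @ (np.diag(d) - W) @ D_isqrt   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)                # eigenvalues ascending
    y = D_isqrt @ vecs[:, 1]                   # second-smallest eigenvector
    return y > np.median(y)                    # threshold into two groups

# Two dense 3-node clusters joined by one weak edge:
W = np.array([[0, 1, 1, .01, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [.01, 0, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
labels = ncut_bipartition(W)
```

On this example the weak .01 edge is the natural cut, so nodes 0-2 and nodes 3-5 land in different groups.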
What about color? These are all black-and-white images;
if they were in color, it would be easier to segment certain areas.
What about focus? With less focus, high frequency texture
information gets lost.
End Scribe Notes
Prof. Malik will continue Wednesday with the biological object recognition
(cortical locality of image processing and neural processes) part of this
lecture. Prof. Malik will not be holding make-up sessions for the two lectures.