CS 764: Seminar in Computer Vision

CORNELL
UNIVERSITY
CS 764
Seminar in Computer Vision
Ramin Zabih
Fall 1998
Course mechanics

Meeting time will be Tue/Thu 11-12, here
• Starting a week from today

Home page is now up
www/CS764

Assignment: present one paper
• You’ll have a lot of freedom, but you need to
talk to me in advance
• Some possible papers will be posted shortly
Topic of this seminar

The use of “knowledge” in the analysis of
visual data
• Sometimes called “context”

Clearly this is vital
• On both psychological and technical grounds
• But how? No one has much of an idea…

What is the interface between reasoning
and perception? (Or, mind and body?)
What is the visual system’s “contract”?

Two standard (bad) answers

Answer 1: describe the scene in terms of
surfaces [low-level vision]
• There is a green patch 2” wide 1’ away

Answer 2: describe the scene in terms of
objects [model-based recognition]
• Start with a set of 3D models (modelbase)
• Determine position and pose
Why are these answers wrong?

They are almost purely data-driven
• Bottom-up (from the data) versus top-down
(from somewhere else)

They report “objective fact”, with no room
for the task at hand
• For a given image, there is only one right
answer

Other problems as well
• Not very useful, etc.
Technical and psychological arguments

There are technical arguments against this
• Vision is an inverse problem
– Many 3D scenes could explain a single 2D image
• On engineering grounds, this makes no sense
– Ultimately, perception is used for some task

The human perceptual system has both top-down and bottom-up elements
• Various optical illusions
– Two people can look at the same picture and see
something completely different
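
The "inverse problem" point above can be made concrete with a small sketch. It assumes an idealized pinhole camera at the origin looking down the z axis; the function name project and the sample points are made up for illustration and are not from the slides.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z), z > 0, onto the image plane.

    Illustrative assumption: an ideal pinhole camera at the origin.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Three different 3D points that all lie on one ray through the camera centre.
near = (1.0, 2.0, 4.0)
middle = (2.0, 4.0, 8.0)
far = (4.0, 8.0, 16.0)

images = [project(p) for p in (near, middle, far)]

# All three scene points land on the same image location, so the image
# alone cannot say which of them was actually there.
print(images)
```

Depth is lost in the projection, so recovering the scene from the image requires assumptions that come from somewhere other than the data.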
Your vision system doesn’t listen
It makes “reasonable” assumptions
Low-level vision has its solution

Inverse problems require assumptions

The assumptions for low-level vision are
extremely general (i.e., weak)
• Reflect the physics of the visible world
• For example, motion or depth or intensity tend
to be “coherent”
– Saying that every pixel is moving differently from its
neighbors is a very unlikely answer
– The world we live in tends not to do that
– Helmholtz’s “unconscious inference”
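
One way to read the "coherence" assumption above is as a penalty on answers whose neighboring values disagree. The sketch below is a toy, assuming a single row of pixels and a quadratic penalty; the names incoherence and prefer and the sample motion values are hypothetical, not a method from the slides.

```python
def incoherence(flow):
    """Sum of squared differences between neighbouring motion values.

    Lower means more coherent. The quadratic form is an illustrative choice.
    """
    return sum((a - b) ** 2 for a, b in zip(flow, flow[1:]))


def prefer(candidates):
    """Pick the candidate motion field with the lowest incoherence."""
    return min(candidates, key=incoherence)


# Two candidate explanations for the motion along one row of pixels.
coherent = [2, 2, 2, 2, 2, 2]      # every pixel moves like its neighbours
scattered = [2, -3, 5, 0, -4, 1]   # every pixel moves differently

print(incoherence(coherent), incoherence(scattered))
print(prefer([scattered, coherent]))
```

The penalty says nothing about any particular object or task; it only encodes that the world tends to move, and to be lit and shaped, in locally consistent ways.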
We’ll need high-level vision

Most of the field is low-level vision or model-based recognition
• Partly to avoid the confusion that CS 764 is about

Key question: how to avoid brittleness?
• Can make the visual system compute just what we
need for our task (e.g., berries)
• But how to handle the unexpected (e.g., lions)?
A short historical perspective

1960’s vision was completely task-specific
• A black blob in the center of the image is a
telephone
• These efforts are now considered “hacks”

1970’s vision became completely general
• Marr pushed the field towards precise technical
questions
• Low-level vision and recognition became
dominant
Tasks strike back

In the mid-1980’s, several attempts were
made to re-introduce a notion of task
• Active/animate/purposive vision

These attempts are widely viewed as
failures, for good reasons
• We’ll look at them a bit next week

It’s not enough to have good intuitions
• There needs to be technical merit as well
Desiderata

Technical solutions (algorithms) that are
very roughly consistent with human data
• Goal is not AI, psychology or philosophy

Provide visual summaries useful for tasks,
but degrade gracefully
• Handle open/unstructured environments
• Deal with expectations and breakdown
Our path for 764

No good computational work to read
• Perhaps Vera will fix this?

We will examine papers along these lines:
• Computational approaches that failed
• Psychological data that is highly suggestive
• Neurologically inspired architectures
• Cognitive scientists and philosophers
– Their goal is argument, not algorithm!
– They’ve thought the most about these issues