Themes in Computer Vision Carlo Tomasi

Applications
• autonomous cars, planes, missiles, robots, ...
• space exploration
• aid to the blind, ASL recognition
• manufacturing, quality control
• surveillance, security
• image retrieval
• medical imaging
• ...
• perceptual input for cognition
(CMU NavLab ‘90)
Vision is Effortless to Us
• driving a car
• threading a needle
• recognizing a distant, occluded object
• understanding (flat!) pictures
• perceiving the mood of a painting
Technical Difficulties
• 512 × 512 × 3 × 30 ≈ 23.6 MB/s was a problem 10 years ago
• technology just got good enough
• great opportunity!
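The data-rate figure in the first bullet can be checked with a few lines of arithmetic. This is a minimal sketch that assumes one byte per color sample (8-bit channels), which the slide does not state.

```python
# Raw data rate of an uncompressed color video stream.
# Assumption: one byte per channel sample (not stated on the slide).
width, height = 512, 512      # pixels per frame
channels = 3                  # color channels
frames_per_second = 30
bytes_per_sample = 1          # assumed 8-bit samples

bytes_per_second = width * height * channels * frames_per_second * bytes_per_sample
print(bytes_per_second)                  # 23592960
print(round(bytes_per_second / 1e6, 1))  # 23.6 (decimal megabytes per second)
print(bytes_per_second / 2**20)          # 22.5 (mebibytes per second)
```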
Fundamental Challenges I
• 3D → 2D implies information loss
[Figure labels: graphics, vision]
• sensitivity to errors
• need for models
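The information loss in going from 3D to 2D can be illustrated with an ideal pinhole camera, where every point on a viewing ray lands on the same image point. The sketch below is only an illustration: the function name `project`, the focal length, and the sample points are assumptions, not taken from the slides.

```python
# Pinhole projection: a 3D point (X, Y, Z) maps to the image point (f*X/Z, f*Y/Z).
# The focal length f and the sample points are illustrative assumptions.

def project(point, f=1.0):
    """Project a 3D point in camera coordinates onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * x / z, f * y / z)

near = (1.0, 2.0, 4.0)
far = (2.5, 5.0, 10.0)   # same viewing ray, 2.5 times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both points produce the same image coordinates, so depth cannot be recovered from a single projection without further measurements or a model.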
Reconstruction and Geometry
must use redundancy to address sensitivity to noise
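One simple form of redundancy is to measure the same quantity several times and combine the measurements. The sketch below uses the standard stereo relation Z = f·B/d (depth from disparity) as a stand-in reconstruction problem; the camera parameters and the noisy measurements are hypothetical numbers chosen for illustration, and plain averaging is only one of many ways to exploit redundancy.

```python
# Depth from stereo disparity: Z = f * B / d.
# All numbers below are hypothetical, chosen only to illustrate the point.
focal_length_px = 400.0   # assumed focal length in pixels
baseline_m = 0.25         # assumed distance between the two cameras
true_disparity_px = 5.0   # so the true depth is 400 * 0.25 / 5 = 20 m

def depth(disparity_px):
    """Depth in meters for a given disparity in pixels."""
    return focal_length_px * baseline_m / disparity_px

# Five noisy measurements of the same disparity (errors of up to half a pixel).
measurements = [5.5, 4.6, 5.3, 4.5, 5.2]

true_depth = depth(true_disparity_px)
single_errors = [abs(depth(d) - true_depth) for d in measurements]
mean_disparity = sum(measurements) / len(measurements)
combined_error = abs(depth(mean_disparity) - true_depth)

print(round(max(single_errors), 2))  # worst depth error from a single measurement (m)
print(round(combined_error, 2))      # depth error after averaging all measurements (m)
```

With these numbers, a half-pixel disparity error shifts the depth estimate by about two meters, while the averaged estimate is much closer to the true depth.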
Reconstruction Example
(Tomasi & Kanade ‘91)
Fundamental Challenges II
• Appearance changes with viewpoint, i.e., the same thing looks different
• Geometric changes: surface slant depends on viewpoint
• Photometric changes: surface brightness and color depend on viewpoint
• Occlusions: what is hidden depends on viewpoint
• Ambiguity: different things look similar
• Correspondence is hard
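The last two bullets can be made concrete with a toy matching problem. The sketch below compares a small patch against every position on a scanline using the sum of squared differences (SSD), a common matching cost; the intensities are made up, and the function names are hypothetical.

```python
# Sum-of-squared-differences (SSD) matching of a small patch along one scanline.
# The scanline is a made-up periodic pattern, so the patch matches equally well
# at several positions: the correspondence is ambiguous.

def ssd(a, b):
    """Sum of squared differences between two equal-length intensity lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_costs(patch, scanline):
    """SSD cost of placing the patch at every valid offset on the scanline."""
    n = len(patch)
    return [ssd(patch, scanline[i:i + n]) for i in range(len(scanline) - n + 1)]

scanline = [10, 80, 10, 10, 80, 10, 10, 80, 10]   # hypothetical pixel intensities
patch = [10, 80, 10]                              # patch taken from the other image

costs = match_costs(patch, scanline)
best = min(costs)
candidates = [i for i, c in enumerate(costs) if c == best]
print(candidates)  # [0, 3, 6]
```

Three positions tie for the best score, so the matching cost alone cannot decide which one is the true correspondence.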
Photometric and Geometric Change
Occlusion
Technicality: Motion Blur
Wrong Correspondence
Simple Images are Harder
(Birchfield and Tomasi ‘01)
Models
• must be insensitive to
  • viewing position changes
  • lighting changes
  • object configuration changes
  • occlusion
  • clutter
• must be sensitive to
  • object changes!
Low-Level Models are General
Model: surfaces are smooth, connected
(Marr and Poggio ‘80)
Higher-Level Models Work Better…
(Lin and Tomasi ‘01)
• … when they are right
• (and much worse when they are wrong)
[Figure panels: left input image; ground truth disparity; our result]
State of the Art
[Figure: disparity error]
(Lin and Tomasi ‘01)
Fundamental Challenges III
• An old problem in the new context of recognition:
• Variation of appearance: Objects change over time, with context, viewpoint, lighting, pose, expression, …
• Similarity: Different objects look similar
• [BTW, objects do not always appear in isolation…]
(US Army FERET Database)
Modeling Images as Points
[Figure: an image shown as a point; axes labeled 1, 2, …, n]
principal components form an approximate basis for all the images in the set
Example: Eigenfaces
the projection of a new image onto the eigenbasis is a compressed representation of that image
can use this to recognize faces, synthesize new images, ...
(Turk, Pentland ‘91; Murase-Nayar ‘93; many others)
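The projection step can be sketched on a toy data set. The example below treats each 4-pixel "image" as a point in 4-dimensional space, finds the leading principal component by power iteration, and represents a new image by a single coefficient. The data and the function names are hypothetical, and power iteration is just one convenient way to obtain the component; real eigenface systems work on face photographs and keep many components.

```python
import math

def mean_image(images):
    """Pixel-wise mean of a list of equal-length images."""
    n = len(images)
    return [sum(img[i] for img in images) / n for i in range(len(images[0]))]

def covariance(centered):
    """Covariance matrix of mean-centered images (one image per row)."""
    n, dim = len(centered), len(centered[0])
    return [[sum(x[i] * x[j] for x in centered) / n for j in range(dim)]
            for i in range(dim)]

def top_component(cov, iterations=50):
    """Leading eigenvector of a symmetric matrix, by power iteration.

    The starting vector must not be orthogonal to the leading eigenvector;
    that holds for the toy data used here.
    """
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in cov]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Hypothetical training set: five 4-pixel "images".
images = [[80, 80, 120, 120], [90, 90, 110, 110], [100, 100, 100, 100],
          [110, 110, 90, 90], [120, 120, 80, 80]]
mean = mean_image(images)
centered = [[p - m for p, m in zip(img, mean)] for img in images]
component = top_component(covariance(centered))

# Project a new image onto the component: four pixels become one coefficient.
new_image = [106, 104, 95, 95]
coefficient = sum((p - m) * c for p, m, c in zip(new_image, mean, component))
approximation = [m + coefficient * c for m, c in zip(mean, component)]
print(coefficient)
print(approximation)
```

The approximation rebuilt from the single coefficient is close to the new image but not identical, which is the sense in which the representation is compressed.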
Fundamental Challenges IV: Motions can be complex
• Variation, self-occlusion, occlusion, clutter, …
[Figure captions: “run”, “read my lips”]
Simple Models Are Fast
a head is an ellipse with two colors, surrounded by strong intensity gradients
(Birchfield ‘98)
2D Articulated Models for Tracking
(Bregler ‘93)
3D Models are More Accurate…
• … when they are right
• [BTW, why is she wearing a black shirt?]
(Isard & Blake ‘99)
Probabilistic Models Handle Uncertainty
• world state w, observation (image) p
• prior P(w)
  • colors change moderately (?)
  • arms move with limited acceleration (boxing?)
  • the height of a head can only change so much (dancing?)
  • contours are smooth and change smoothly
  • balls follow the laws of gravity
  • …
• sensor model P(p|w)
  • image motion can be measured only so well
  • motion blurs the image
  • noise corrupts pixel values
  • ...
Bayesian Tracking
• Bayes’ rule: P(w|p) ∝ P(p|w) P(w)
• what is the world state w likely to be, given that we observed the image p?
(Isard & Blake ‘99)
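Bayes' rule can be worked through on a tiny discrete example. The sketch below assumes a world state w that takes one of three values (a tracked head in the left, center, or right part of the frame) and a single observed image p. The probabilities and the function name `posterior` are hypothetical, and practical trackers work with continuous states rather than a three-entry table.

```python
def posterior(prior, likelihood):
    """Return P(w|p) for each world state w, given P(w) and P(p|w)."""
    unnormalized = {w: likelihood[w] * prior[w] for w in prior}
    evidence = sum(unnormalized.values())   # P(p), the normalizing constant
    return {w: value / evidence for w, value in unnormalized.items()}

# Hypothetical prior P(w): the head was near the center a moment ago.
prior = {"left": 0.25, "center": 0.5, "right": 0.25}

# Hypothetical sensor model P(p|w): how well the observed image fits each state.
likelihood = {"left": 0.1, "center": 0.2, "right": 0.8}

belief = posterior(prior, likelihood)
most_likely = max(belief, key=belief.get)
print(most_likely)  # right
for state, probability in belief.items():
    print(state, round(probability, 3))
```

The prior favors the center, but the observation fits the right-hand state so much better that the posterior shifts there.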
Even Higher Models May Be Needed
[MY COMPUTER CAN UNDERSTAND SIGN]
computer    No(1(HandsIpsi 1 1 0 S Out Down, NeutralIpsi 0 0 0 S Out Down)(,-)
               0(" " 0 -1 " " ", " " " " " " ")(",-)
               0(" " -1 0 " " ", " " " " " " ")(",-)
               0(" " 0 1 " " ", " " " " " " ")(",-)
               1(" " 1 0 " " ", " " " " " " "))
understand  No(1(HandIn 0 0 0 X Out Contra, NeutralOut 0 0 0 D Up Contra)(-,-)
               "(" 1 " " " " ", " " " " " " "))
signs       No(1( 0 0 0 B Up Out, - - - - -)(-,-)
               "(" 1 0 0 " " ", - - - - -))
can         No(1(HandUp 0 0 0 Out Contra, NeutralOut 0 0 -1 B Out Up)(-,-)
               "(" " " " " " ", " " " 1 " " "))
(Richards & Tomasi ‘02)
Fundamental Challenges V: Images are Diverse
Previous Work in Image Retrieval
Hulton Deutsch
Color and Texture Models
[Figure labels: scale, texture, orientation]
Image Distances
(Rubner & Tomasi ‘97)
Retrieval by Refinement - 1
(Rubner & Tomasi ‘97)
Retrieval by Refinement - 2
(Rubner & Tomasi ‘97)
Vision is AI Complete
• Vision is an inverse problem
• Strong models of the world are required
• Vision implies reasoning about the world
• Vision is AI