Computational Vision
Jitendra Malik
University of California, Berkeley

What is in an image?
The input is just an array of
brightness values; humans perceive
structure in it.
From Pixels to Perception
Labels in the example image: outdoor, wildlife; Water, Tiger, Grass, Sand; back, head, eye, tail, legs, shadow, mouse
If visual processing were purely feedforward… (it isn’t)
Pixels → Local Neighborhoods → Contours → Surfaces → Objects → Scenes
Example region labels: Water, Tiger, Grass, Sand
Low-level: Image Processing
Mid-level: Grouping, Figure/Ground, Surface Attributes
High-level: Recognition
Boundaries of image regions are defined by a number of attributes
• Brightness/color
• Texture
• Motion
• Binocular disparity
• Familiar configuration
Grouping is hierarchical
Perceptual organization forms a tree:
Image
  BG: grass, bush, far
  L-bird: beak, body, eye, head
  R-bird: beak, body, eye, head
Panels A, B, C: segmentations of the image
• A, C are refinements of B
• A, C are mutual refinements
• A, B, C represent the same percept
Two segmentations are consistent when they can be explained by the same segmentation tree
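The consistency idea can be illustrated with a minimal sketch. It assumes each segmentation is a flat list of region labels, one per pixel, and it reads "explained by the same tree" as "every pair of regions is either disjoint or nested". The function names and the toy 6-pixel segmentations A, B, C, D are hypothetical and not taken from the slides.

```python
from itertools import product

def regions(labels):
    """Group pixel indices by label; returns a list of pixel-index sets."""
    out = {}
    for i, lab in enumerate(labels):
        out.setdefault(lab, set()).add(i)
    return list(out.values())

def is_refinement(fine, coarse):
    """True if every region of `fine` lies inside a single region of `coarse`."""
    coarse_regions = regions(coarse)
    return all(any(r <= c for c in coarse_regions) for r in regions(fine))

def consistent(seg1, seg2):
    """True if every pair of regions is disjoint or nested,
    i.e. both segmentations could be cuts of one segmentation tree."""
    return all(not (a & b) or a <= b or b <= a
               for a, b in product(regions(seg1), regions(seg2)))

# Toy 1-D "image" of 6 pixels (hypothetical data).
B = [0, 0, 0, 1, 1, 1]   # background vs. bird
A = [0, 1, 1, 2, 2, 2]   # splits the background further
C = [0, 0, 0, 1, 1, 2]   # splits the bird further
D = [0, 0, 1, 1, 1, 1]   # boundary cuts across a region of B

print(is_refinement(A, B), is_refinement(C, B))  # A and C both refine B
print(consistent(A, C))                          # A and C fit one tree
print(consistent(B, D))                          # B and D do not
```

In this toy case A and C each refine B, neither refines the other, and yet they remain consistent because their regions never partially overlap.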
Humans assign a depth ordering to
surfaces across a contour


R1 appears in front of R2
R2 appears in front of R3
This can be done for images of natural scenes …
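Pairwise judgments such as "R1 appears in front of R2" can be combined into a single depth ordering. The sketch below is one way to do that, assuming the judgments are given as (near, far) pairs and contain no cycles. `depth_order` is a hypothetical helper, and topological sorting is a technique chosen here for illustration, not one named in the slides.

```python
from graphlib import TopologicalSorter, CycleError

def depth_order(in_front_of):
    """Return regions sorted nearest-first from (near, far) pairs.

    Raises CycleError if the pairwise judgments contradict each other.
    """
    sorter = TopologicalSorter()
    for near, far in in_front_of:
        # `near` is registered as a predecessor, so it is emitted before `far`.
        sorter.add(far, near)
    return list(sorter.static_order())

pairs = [("R1", "R2"), ("R2", "R3")]
order = depth_order(pairs)
print(order)
```

Contradictory judgments (for example R1 in front of R2 and R2 in front of R1) raise `CycleError` rather than producing an ordering.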
Figure-Ground Labeling
- red is near; blue is far
Figure/Ground Organization

A contour belongs to one of the two (but not
both) abutting regions.
Figure (face), Ground (shapeless)
Ground (shapeless), Figure (goblet)
Important for the perception of shape
Some other aspects of perceptual organization
Good continuation
Amodal completion
Modal completion
What do we see here?
And here?
Some Pictorial Cues
Support, Size
Cast Shadows
Shading
Measuring Surface Orientation
Binocular Stereopsis
Optical flow for a pilot
Object Category Recognition
Shape variation within a category

D’Arcy Thompson, On Growth and Form (1917), studied transformations between shapes of organisms
Attneave’s Cat (1954)
Line drawings convey most of the
information
Objects are in Scenes
Human stick figure from single image
Input image
Stick figure
Support masks
This is hard…
• Variety of poses
• Clothing
• Missing parts
• Small support for parts
• Background clutter
Taxonomy and Partonomy

Taxonomy: e.g., cats are in the family Felidae, which in turn is in the class Mammalia.
Recognition can be at multiple levels of categorization, or be identification at the level of specific individuals, as in faces.
Partonomy: objects have parts, which have subparts, and so on. The human body contains the head, which in turn contains the eyes.
These notions apply equally well to scenes and to activities.
Psychologists have argued that there is a “basic level” at which categorization is fastest (Eleanor Rosch et al.).
In a partonomy, each level contributes useful information for recognition.
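Both hierarchies can be represented the same way, as child-to-parent links that are followed upward. The sketch below uses only the two examples from the text. The dictionary names and the `ancestors` helper are hypothetical, and the taxonomy omits intermediate ranks for brevity.

```python
# Hypothetical child -> parent maps built from the examples in the text.
IS_A = {"cat": "Felidae", "Felidae": "Mammalia"}     # taxonomy ("is a kind of")
PART_OF = {"eye": "head", "head": "human body"}      # partonomy ("is a part of")

def ancestors(node, parent):
    """Follow parent links upward and return the chain of ancestors."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

print(ancestors("cat", IS_A))      # categories a cat belongs to, most specific first
print(ancestors("eye", PART_OF))   # wholes that contain the eye, smallest first
```

Each entry in the returned chain is one level at which recognition could operate, from the most specific to the most general.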
Visual Control of Action
• Locomotion
  - Navigation/Way-finding
  - Obstacle Avoidance
• Manipulation
  - Grasping
  - Pick and Place
  - Tool use
Camera Obscura
(Reinerus Gemma-Frisius, 1544)
Camera Obscura
(Angelo Sala, 1576-1637)