CV: Perceiving 3D from 2D
Many cues from 2D images enable interpretation of the structure of the 3D world producing them.
MSU CSE 803 Stockman
Topic roadmap
• Labeling 3D structure in a 2D image
• Labeling constraints on edge graphs
• Huffman-Clowes-Waltz labeling
• Other cues
  • motion parallax
  • shape from texture or shading
  • stereo from two 2D images
Topic roadmap: mathematical models
• Shape from shading
• Depth from stereo
• Depth from focus
• Perspective transformation (review)
Many 3D cues
How can humans and other machines reconstruct the 3D nature of a scene from 2D images?
What other world knowledge needs to be added in the process?
Vocabulary for image labeling
Interpret the local structure of the
scene in the image space
Some terms for local 3D structure
Figure: (left) intensity image of 3 blocks; (right) result of 5x5 Prewitt operator
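The edge image in the caption can be approximated with a few lines of code. The sketch below assumes one common form of the 5x5 Prewitt masks, with weights -2..2 across columns (for the x-gradient) or rows (for the y-gradient); the slides do not give the mask values, and the helper names `prewitt_masks` and `gradient_magnitude` are made up for illustration.

```python
import math

def prewitt_masks(size=5):
    """Return (mask_x, mask_y) with weights -half..half (assumed mask form)."""
    half = size // 2
    offsets = list(range(-half, half + 1))
    mask_x = [offsets[:] for _ in offsets]    # weights vary along columns
    mask_y = [[r] * size for r in offsets]    # weights vary along rows
    return mask_x, mask_y

def apply_mask(image, mask, row, col):
    """Correlate one mask with the neighborhood centered at (row, col)."""
    half = len(mask) // 2
    return sum(
        mask[i][j] * image[row + i - half][col + j - half]
        for i in range(len(mask))
        for j in range(len(mask))
    )

def gradient_magnitude(image, size=5):
    """Gradient magnitude at every pixel whose full neighborhood is inside the image."""
    mask_x, mask_y = prewitt_masks(size)
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            gx = apply_mask(image, mask_x, r, c)
            gy = apply_mask(image, mask_y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out

# Synthetic test image: a vertical step edge between columns 3 and 4.
step = [[0] * 4 + [10] * 4 for _ in range(8)]
magnitude = gradient_magnitude(step)
```

The response peaks at the columns next to the step and is zero on the untouched border, which is the kind of contour image the labeling vocabulary below is applied to.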
Blade: (>) as in the blade of a knife, where the normal to the occluding surface element continues to face the viewer as the contour is approached. The occluding and occluded surfaces are unrelated.
Crease: (convex + or concave -) formed by an abrupt change to a surface or the joining of two surfaces. The surface on both sides of the crease can be sensed.
Limb: smooth object contour
Figure: an egg and a soup can
Limb: (>>) formed by viewing a smooth 3D object, such as an arm or a soup can: when approaching the contour, the surface normal becomes perpendicular to the line of sight.
(The right side of the arrow is the occluding surface.)
Albedo and lighting
• Mark: (M) a surface mark, that is, a change of “albedo” (reflectance) and not of the 3D surface shape, creating an intensity contour in the image
• Shading: (S) an illumination change due to a change in lighting or a shadow on the surface, creating an intensity contour in the image
Labeling image contours interprets the 3D scene structure
Figure: an egg and a thin cup on a table top, lighted from the top right
• “Shadow” relates to illumination, not material
• Logo on cup is a “mark” on the material
“Intrinsic image” stores 3D info in “pixels” and not intensity.
For each point of the image, we want the depth to the 3D surface point, the surface normal at that point, the albedo of the surface material, and the illumination of that surface point.
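One way to picture the intrinsic image as a data structure is a grid of records, one per pixel, holding the four quantities listed above. This is only an illustrative sketch: the class name `IntrinsicPixel`, the field names, and the default values are assumptions, not something the slides define.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class IntrinsicPixel:
    """Per-pixel 3D information (hypothetical layout, for illustration only)."""
    depth: float                        # distance to the 3D surface point
    normal: Tuple[float, float, float]  # unit surface normal at that point
    albedo: float                       # reflectance of the surface material
    illumination: float                 # light arriving at that surface point

def make_intrinsic_image(rows: int, cols: int) -> List[List[IntrinsicPixel]]:
    """Allocate a rows x cols grid with placeholder values (unknown scene)."""
    return [
        [
            IntrinsicPixel(depth=float("inf"), normal=(0.0, 0.0, 1.0),
                           albedo=0.0, illumination=0.0)
            for _ in range(cols)
        ]
        for _ in range(rows)
    ]

image = make_intrinsic_image(2, 3)
image[0][0].depth = 4.2   # fill in one pixel once its depth has been estimated
```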
Practice labeling the contours
(left) an unopened can of Brand X soda is a solid blue can
with a bright orange block ‘X’. (right) an empty box with all
four flaps open so even the bottom of the box is visible.
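3D scene versus 2D image
• Creases (3D scene) correspond to edges (2D image)
• Corners (3D scene) correspond to junctions (2D image)
• Faces (3D scene) correspond to regions (2D image)
• Occlusions for some viewpoint (3D scene) correspond to blades, limbs, T’s (2D image)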
Labeling of simple polyhedra
Labeling of a block floating in space. BJ and KI are convex creases. Blades AB, BC, CD, etc. model the occlusion of the background. Junction K is a convex trihedral corner. Junction D is a T-junction modeling the occlusion of blade CD by blade JE.
Trihedral Blocks World Image Junctions: only 16 cases!
Only 16 possible junctions in 2D are formed by viewing 3D corners formed by 3 planes and viewed from a general viewpoint! From top to bottom: L-junctions, arrows, forks, and T-junctions.
Challenge!
Create a scene shot of no more than 2 blocks that creates all 16 junctions in the image.
How do we obtain the catalog?
• think about solid/empty assignments to the 8 octants about the X-Y-Z origin
• think about non-accidental viewpoints
• account for all possible topologies of junctions and edges
• then handle T-junction occlusions
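The first step above starts from a finite set of candidates, which is easy to enumerate. The sketch below only lists the raw solid/empty assignments to the 8 octants; it does not decide which of them form valid trihedral corners, nor does it model viewpoints or T-junctions, so it should be read as the starting point of the derivation rather than the derivation itself. The variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Each octant is identified by the signs of its x, y, z coordinates.
OCTANTS = list(product((+1, -1), repeat=3))

# Every way of marking each of the 8 octants as empty (False) or solid (True).
assignments = list(product((False, True), repeat=len(OCTANTS)))

# Group the raw candidates by how many octants are filled with material.
by_solid_count = Counter(sum(assignment) for assignment in assignments)
```

There are 2^8 = 256 raw assignments, so the catalog can be obtained by exhaustive case analysis rather than by guesswork.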
Blocks world labeling
Left: block floating in space
Right: block glued to a wall at the back
Try labeling these: interpret the 3D structure, then label parts
What does it mean if we can’t label them? If we can label them?
Waltz filtering discards edge
interpretations spanning junctions
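The filtering idea can be sketched as constraint propagation: an edge runs between two junctions, so a candidate labeling of one junction survives only if the junction at the other end of each shared edge still has a candidate giving that edge the same label. The catalog below is a made-up toy, not the real 16-junction catalog, and it ignores the direction of occluding-edge arrows; the function name `waltz_filter` and the data layout are assumptions for illustration.

```python
def waltz_filter(junctions, catalog):
    """Discard junction labelings that a neighboring junction cannot support.

    junctions: junction name -> ordered tuple of incident edge names
    catalog:   junction name -> list of allowed label tuples (same edge order)
    """
    domains = {name: list(catalog[name]) for name in junctions}
    changed = True
    while changed:                       # repeat until nothing more is discarded
        changed = False
        for j, j_edges in junctions.items():
            for k, k_edges in junctions.items():
                if j == k:
                    continue
                for edge in (e for e in j_edges if e in k_edges):
                    # Labels the neighbor k can still give to the shared edge.
                    supported = {t[k_edges.index(edge)] for t in domains[k]}
                    kept = [t for t in domains[j]
                            if t[j_edges.index(edge)] in supported]
                    if len(kept) != len(domains[j]):
                        domains[j] = kept
                        changed = True
    return domains

# Toy example: three junctions joined in a triangle by edges ab, bc, ca.
junctions = {"A": ("ca", "ab"), "B": ("ab", "bc"), "C": ("bc", "ca")}
toy_catalog = {
    "A": [("+", "+"), ("-", "-"), ("+", "-")],
    "B": [("+", "+")],
    "C": [("+", "+"), ("-", "-")],
}
filtered = waltz_filter(junctions, toy_catalog)
```

In this toy case the single candidate at B forces edge ab to be "+", which removes two candidates at A and then one at C, without any search.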
1975: researchers very excited
• very strong constraints on interpretations
• several hundred junctions in the catalogue when cracks and shadows are allowed (Waltz): the algorithm works very well with them
• but the world is not made of blocks!
• later on, curved blocks world work was done, but it was not as interesting
Backtracking or interpretation tree
Interpretation tree search
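A minimal sketch of interpretation tree search by backtracking: edges are labeled one at a time, and a branch is abandoned as soon as some junction touching the newly labeled edge has no catalog entry compatible with the labels chosen so far. As before, the catalog is a made-up toy rather than the real junction catalog, arrow directions are ignored, and the names `consistent` and `label_edges` are hypothetical.

```python
def consistent(junction_edges, allowed, assignment):
    """True if some allowed tuple agrees with every edge labeled so far."""
    return any(
        all(assignment.get(edge, label) == label
            for edge, label in zip(junction_edges, labels))
        for labels in allowed
    )

def label_edges(edges, labels, junctions, catalog, assignment=None):
    """Depth-first search of the interpretation tree; returns all full labelings."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(edges):
        return [dict(assignment)]            # leaf: every edge has a label
    edge = edges[len(assignment)]            # next edge to label
    solutions = []
    for label in labels:
        assignment[edge] = label
        touched = [j for j in junctions if edge in junctions[j]]
        if all(consistent(junctions[j], catalog[j], assignment) for j in touched):
            solutions.extend(
                label_edges(edges, labels, junctions, catalog, assignment))
        del assignment[edge]                 # backtrack
    return solutions

# Toy example: three junctions joined in a triangle by edges ab, bc, ca.
junctions = {"A": ("ca", "ab"), "B": ("ab", "bc"), "C": ("bc", "ca")}
toy_catalog = {
    "A": [("+", "+"), ("-", "-"), ("+", "-")],
    "B": [("+", "+"), ("-", "-")],
    "C": [("+", "+"), ("-", "-")],
}
solutions = label_edges(["ab", "bc", "ca"], ["+", "-"], junctions, toy_catalog)
```

Finding more than one complete labeling, as happens here, corresponds to an image with more than one consistent interpretation.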
“Necker cube” has multiple interpretations
Label the different interpretations
A human staring at one of these cubes typically experiences changing interpretations. The interpretation of the two forks (G and H) flip-flops between “front corner” and “back corner”. What is the explanation?
Depth cues in 2D images
“Interposition” cue
Def: Interposition occurs when one object occludes another object, thus indicating that the occluding object is closer to the viewer than the occluded object.
Interposition
• T-junctions indicate occlusion: the top (crossbar) of the T is the occluding edge, while the stem is the occluded edge
• bench occludes lamp post
• leg occludes bench
• lamp post occludes fence
• railing occludes trees
• trees occlude steeple
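The occlusion pairs listed above define a partial depth ordering, which can be recovered with a topological sort. The sketch below uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later); the function name `near_to_far` is made up for illustration. Interposition alone gives only relative order, so objects in unrelated chains (for example the leg and the railing) may come out in either order.

```python
from graphlib import TopologicalSorter

# (occluder, occluded) pairs read off the T-junctions in the scene.
occludes = [
    ("bench", "lamp post"),
    ("leg", "bench"),
    ("lamp post", "fence"),
    ("railing", "trees"),
    ("trees", "steeple"),
]

def near_to_far(pairs):
    """Return one ordering in which every occluder precedes what it occludes."""
    # TopologicalSorter expects: node -> set of nodes that must come before it.
    predecessors = {}
    for front, behind in pairs:
        predecessors.setdefault(front, set())
        predecessors.setdefault(behind, set()).add(front)
    return list(TopologicalSorter(predecessors).static_order())

order = near_to_far(occludes)
```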
• Perspective scaling: the railing looks smaller at the left; the bench looks smaller at the right; the 2 steeples are far away
• Foreshortening: the bench is sharply angled relative to the viewpoint; its image length is affected accordingly
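Both effects can be checked numerically with a pinhole camera model. The sketch below assumes a camera at the origin looking along +Z with focal length f, so a point (X, Y, Z) projects to (fX/Z, fY/Z); the function names and the sample coordinates are illustrative and do not come from the slides.

```python
import math

def project(point, f=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) with Z > 0."""
    x, y, z = point
    return (f * x / z, f * y / z)

def image_length(p, q, f=1.0):
    """Length in the image of the 3D segment from p to q."""
    (u1, v1), (u2, v2) = project(p, f), project(q, f)
    return math.hypot(u2 - u1, v2 - v1)

# Perspective scaling: the same unit-length segment at depth 10 and at depth 20.
near = image_length((0, 0, 10), (1, 0, 10))
far = image_length((0, 0, 20), (1, 0, 20))

# Foreshortening: a unit-length segment at depth 10, rotated 60 degrees
# away from the image plane so that it points partly along the line of sight.
angle = math.radians(60)
tilted = image_length((0, 0, 10),
                      (math.cos(angle), 0, 10 + math.sin(angle)))
```

Doubling the depth halves the image length, and tilting the segment away from the image plane shortens its image even though its 3D length is unchanged.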
Texture gradient reveals surface orientation
(In East Lansing, we call it “corn”, not “maize”.)
Note also that the rows appear to converge in 2D.
Texture gradient: change of image texture along some direction, often corresponding to a change in distance or orientation in the 3D world containing the objects creating the texture.
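A small numeric sketch of a texture gradient: texture elements that are evenly spaced on the ground project to image positions that crowd together with distance. It assumes a pinhole camera with focal length f at height h above a flat ground plane, looking horizontally; the function name and the sample depths are illustrative assumptions.

```python
def row_image_positions(depths, camera_height=1.0, f=1.0):
    """Image y-coordinate of ground points straight ahead at each depth Z.

    A ground point at (0, -camera_height, Z) projects to y = -f * h / Z.
    """
    return [-f * camera_height / z for z in depths]

# Texture elements evenly spaced on the ground, one unit apart in depth.
depths = [2.0, 3.0, 4.0, 5.0, 6.0]
ys = row_image_positions(depths)

# Gaps between consecutive elements as they appear in the image.
spacings = [b - a for a, b in zip(ys, ys[1:])]
```

The image spacing shrinks steadily with depth even though the 3D spacing is constant, and the rate of shrinkage is what reveals the orientation of the surface.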
3D Cues from Perspective
3D Cues from perspective
More 3D cues
Virtual lines
Falsely perceived interposition
More 3D cues
2D alignment usually means 3D alignment
2D image curves create perception of a 3D surface
“Structured light” can enhance surfaces in industrial vision
Sculpted object
Potatoes with light stripes
Shape (normals) from shading
Clearly, intensity encodes shape in this case
Cylinder with white paper and pen stripes
Intensities plotted as a surface
Shape (normals) from shading
Plot of intensity of one image row reveals the 3D shape of these diffusely reflecting objects.
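The link between one image row and shape can be illustrated with a simple diffuse (Lambertian) model, in which intensity is the albedo times the cosine of the angle between the surface normal and the light direction. The sketch below assumes an upright unit-radius cylinder, an orthographic view along the z axis, and a distant light in the viewing direction; these conditions and the function names are assumptions chosen for illustration, not the setup of the pictured objects.

```python
import math

def lambertian_intensity(normal, light, albedo=1.0):
    """Diffuse reflection: albedo * max(0, n . s) for unit vectors n and s."""
    dot = sum(n * s for n, s in zip(normal, light))
    return albedo * max(0.0, dot)

def cylinder_row(num_samples=11, light=(0.0, 0.0, 1.0), albedo=1.0):
    """Intensities along one image row crossing a unit-radius upright cylinder."""
    row = []
    for i in range(num_samples):
        x = -1.0 + 2.0 * i / (num_samples - 1)   # image x across the cylinder
        # Surface normal of the cylinder at this x (facing the viewer).
        normal = (x, 0.0, math.sqrt(max(0.0, 1.0 - x * x)))
        row.append(lambertian_intensity(normal, light, albedo))
    return row

row = cylinder_row()
```

The row is brightest at the center, where the normal faces the light, and falls to zero at the limbs, where the normal is perpendicular to the line of sight, so the intensity profile traces the curvature of the surface.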