CV: Perceiving 3D from 2D
Many cues from 2D images enable interpretation of the structure of the 3D world producing them.
MSU CSE 803 Stockman

Topic roadmap
• Labeling 3D structure in a 2D image
• Labeling constraints on edge graphs
• Huffman-Clowes-Waltz labeling
• Other cues: motion parallax, shape from texture or shading, stereo from two 2D images

Topic roadmap: mathematical models
• Shape from shading
• Depth from stereo
• Depth from focus
• Perspective transformation (review)

Many 3D cues
How can humans and other machines reconstruct the 3D nature of a scene from 2D images? What other world knowledge needs to be added in the process?

Vocabulary for image labeling
Interpret the local structure of the scene in the image space.

Some terms for local 3D structure
(left) intensity image of 3 blocks; (right) result of 5x5 Prewitt operator
• Blade (>): as in the blade of a knife, an edge where one surface occludes another. The occluding and occluded surfaces are unrelated.
• Crease (convex + or concave -): formed by an abrupt change to a surface or the joining of two surfaces. The surface on both sides of the crease can be sensed.

Limb: smooth object contour
(examples: egg, soup can)
Limb (>>): formed by viewing a smooth 3D object, such as an arm or a soup can. When approaching the contour, the surface normal becomes perpendicular to the line of sight. (The right side of the arrow is the occluding surface.)
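To make the limb definition concrete, here is a minimal sketch that is not part of the slides. It assumes a unit sphere viewed orthographically along the z axis and reports the angle between the surface normal and the line of sight as the image point moves toward the contour. The function name `normal_view_angle_deg` and the sphere setup are illustrative assumptions.

```python
import math

def normal_view_angle_deg(r):
    """Angle (degrees) between the surface normal and the line of sight.

    Assumes a unit sphere centred at the origin, viewed orthographically
    along the z axis; r is the image-plane distance from the sphere's
    centre (0 <= r <= 1).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1] for a unit sphere")
    # On a unit sphere the outward normal equals the surface point, so its
    # z component is sqrt(1 - r^2); the view direction is (0, 0, 1).
    cos_angle = math.sqrt(1.0 - r * r)
    return math.degrees(math.acos(cos_angle))

# Walk from the centre of the sphere's image out to its contour.
for r in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(f"r = {r:4.2f}  angle = {normal_view_angle_deg(r):5.1f} deg")
```

Under these assumptions the angle grows from 0 degrees at the center of the sphere's image to 90 degrees at the contour, which is the limb condition stated above.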
Albedo and lighting
• Mark (M): a surface mark or change of “albedo” (reflectance), not of the 3D surface, creating an intensity contour in the image.
• Shading (S): an illumination change due to a change in lighting or a shadow on the surface, creating an intensity contour in the image.

Labeling image contours interprets the 3D scene structure
(figure: an egg and a thin cup on a table top, lighted from the top right)
• The “shadow” relates to illumination, not material.
• The logo on the cup is a “mark” on the material.

“Intrinsic image” stores 3D info in “pixels” and not intensity
For each point of the image, we want the depth to the 3D surface point, the surface normal at that point, the albedo of the surface material, and the illumination of that surface point.

Practice labeling the contours
(left) An unopened can of Brand X soda: a solid blue can with a bright orange block ‘X’. (right) An empty box with all four flaps open, so even the bottom of the box is visible.

3D scene versus 2D image
3D scene                          2D image
Creases                           Edges
Corners                           Junctions
Faces                             Regions
Occlusions (for some viewpoint)   Blades, limbs, T’s

Labeling of simple polyhedra
Labeling of a block floating in space. BJ and KI are convex creases. Blades AB, BC, CD, etc. model the occlusion of the background. Junction K is a convex trihedral corner. Junction D is a T-junction modeling the occlusion of blade CD by blade JE.

Trihedral blocks world image junctions: only 16 cases!
Only 16 junctions are possible in 2D when 3D corners formed by 3 planes are viewed from a general viewpoint. From top to bottom: L-junctions, arrows, forks, and T-junctions.

Challenge!
Create a scene shot of no more than 2 blocks that creates all 16 junctions in the image.

How do we obtain the catalog?
• think about solid/empty assignments to the 8 octants about the X-Y-Z origin
• think about non-accidental viewpoints
• account for all possible topologies of junctions and edges
• then handle T-junction occlusions

Blocks world labeling
(left) block floating in space; (right) block glued to a wall at the back

Try labeling these: interpret the 3D structure, then label parts
What does it mean if we can’t label them? If we can label them?

Waltz filtering discards edge interpretations that are inconsistent between the junctions they span

1975 researchers very excited
• very strong constraints on interpretations
• several hundred entries in the catalogue when cracks and shadows are allowed (Waltz); the algorithm works very well with them
• but the world is not made of blocks!
• later on, work was done on a curved blocks world, but it was not as interesting

Backtracking or interpretation tree

Interpretation tree search

“Necker cube” has multiple interpretations
Label the different interpretations. A human staring at one of these cubes typically experiences changing interpretations. The interpretation of the two forks (G and H) flip-flops between “front corner” and “back corner”. What is the explanation?

Depth cues in 2D images

“Interposition” cue
Def: Interposition occurs when one object occludes another object, thus indicating that the occluding object is closer to the viewer than the occluded object.
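The interposition definition implies that a set of pairwise "A occludes B" observations constrains a nearest-to-farthest ordering. The sketch below is an illustration, not something from the slides: it assumes the occlusions are given as (occluder, occluded) pairs and uses a standard topological sort. The function name `depth_order` is hypothetical, and the example pairs are the occlusions listed for the interposition example image.

```python
from collections import deque

def depth_order(occlusions):
    """Return object names ordered nearest-first.

    occlusions: iterable of (occluder, occluded) pairs. The result is one
    ordering consistent with every pair; occlusion alone gives only a
    partial order, so other valid orderings may exist.
    """
    hides = {}           # object -> objects it occludes
    occluder_count = {}  # object -> number of occluders not yet placed
    for near, far in occlusions:
        for name in (near, far):
            hides.setdefault(name, [])
            occluder_count.setdefault(name, 0)
        hides[near].append(far)
        occluder_count[far] += 1

    # Start with objects that nothing occludes (Kahn's algorithm).
    ready = deque(n for n, count in occluder_count.items() if count == 0)
    order = []
    while ready:
        obj = ready.popleft()
        order.append(obj)
        for far in hides[obj]:
            occluder_count[far] -= 1
            if occluder_count[far] == 0:
                ready.append(far)

    if len(order) != len(occluder_count):
        raise ValueError("occlusion pairs are cyclic; no depth order exists")
    return order

pairs = [
    ("leg", "bench"),
    ("bench", "lamp post"),
    ("lamp post", "fence"),
    ("railing", "trees"),
    ("trees", "steeple"),
]
print(depth_order(pairs))
```

Objects that are never related by a chain of occlusions, such as the bench and the railing here, are left unordered by this cue; the sort simply picks one consistent arrangement.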
Interposition
• T-junctions indicate occlusion: the top of the T is the occluding edge, while the bar (stem) is the occluded edge
• bench occludes lamp post
• leg occludes bench
• lamp post occludes fence
• railing occludes trees
• trees occlude steeple

• Perspective scaling: the railing looks smaller at the left; the bench looks smaller at the right; the 2 steeples are far away
• Foreshortening: the bench is sharply angled relative to the viewpoint; image length is affected accordingly

Texture gradient reveals surface orientation
(In East Lansing, we call it “corn”, not “maize”.) Note also that the rows appear to converge in 2D.
Texture gradient: change of image texture along some direction, often corresponding to a change in distance or orientation in the 3D world containing the objects creating the texture.

3D cues from perspective

More 3D cues
• Virtual lines
• Falsely perceived interposition

More 3D cues
• 2D alignment usually means 3D alignment
• 2D image curves create perception of a 3D surface

“Structured light” can enhance surfaces in industrial vision
(figures: sculpted object; potatoes with light stripes)

Shape (normals) from shading
Clearly intensity encodes shape in this case.
(figures: cylinder with white paper and pen stripes; intensities plotted as a surface)

Shape (normals) from shading
A plot of the intensity of one image row reveals the 3D shape of these diffusely reflecting objects.
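Why a single image row can reveal shape is easiest to see in an idealized case. The sketch below rests on assumptions the slides do not state: a Lambertian (diffusely reflecting) cylinder with a vertical axis, orthographic viewing along the z axis, a distant light source in the viewing direction, and uniform albedo. The names `lambertian_row` and `heights_from_row` are illustrative.

```python
import math

def lambertian_row(xs, radius, albedo=1.0):
    """Intensities along one image row crossing a vertical cylinder.

    Assumes orthographic viewing along z and a light along the viewing
    direction, so intensity = albedo * n_z, where n_z is the z component
    of the unit surface normal.
    """
    row = []
    for x in xs:
        nx = x / radius                            # normal's x component
        nz = math.sqrt(max(0.0, 1.0 - nx * nx))    # normal's z component
        row.append(albedo * nz)
    return row

def heights_from_row(row, radius, albedo=1.0):
    """Recover surface height z(x) from the row intensities.

    Valid only under the assumptions of lambertian_row: there n_z equals
    intensity / albedo, and on a cylinder z = radius * n_z.
    """
    return [radius * intensity / albedo for intensity in row]

radius = 1.0
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
row = lambertian_row(xs, radius)
heights = heights_from_row(row, radius)
for x, i, z in zip(xs, row, heights):
    print(f"x = {x:5.2f}  intensity = {i:.3f}  height = {z:.3f}")
```

In this special case the intensity profile is proportional to the cylinder's cross-section, so plotting the row traces the shape directly. With a different light direction, varying albedo, or a non-cylindrical surface, the relation between intensity and shape is less direct and this simple inversion no longer applies.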