Sensation & Perception Ch. 5: Perceiving Objects and Scenes
© Takashi Yamauchi (Dept. of Psychology, Texas A&M University)

Main topics
• The challenge of object perception
• Inverse projection problem
• Gestalt psychology
• Perceptual organization
• Figure-ground segregation
• Biederman’s Recognition by Components theory

Minsky’s estimation
• Why is it so difficult for the computer to recognize objects as we do?
• http://web.media.mit.edu/~minsky/

What AI can do
• 50th anniversary of AI in 2006 (from Rodney Brooks)
  – The most advanced AI can barely beat a 2-year-old child in object recognition.

Why did AI fall short?
• Recognizing objects
  – is much more than just detecting lines, colors, or shapes.
  – requires a large amount of background knowledge.
    • both implicit and explicit knowledge

The challenge of object perception
• The stimulus on the receptors is ambiguous
  – Inverse projection problem: an image on the retina can be caused by an infinite number of objects
• Objects can be hidden or blurred
  – Occlusions are common in the environment

Figure 5.3 The principle behind the inverse projection problem. The small square stimulus creates a square image on the retina. However, this image could also have been created by the other two shapes and many other stimuli. This is why we say that the image on the retina is ambiguous.

Fig. 5-4, p. 96

The Challenge of Object Perception - continued
• Objects look different from different viewpoints
  – Viewpoint invariance: the ability to recognize an object regardless of the viewpoint
• The reasons for changes in lightness and darkness in the environment can be unclear

Fig. 5-6a, p. 96

Fig. 5-8, p. 97

Part + Part = Whole?
• Gestalt psychology

Group discussion
• Given the pictures shown in the slides, write down everything you see

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 7a, Fig. 7b, Fig. 7c

Three Musicians: P. Picasso
Woman with a book: P. Picasso
La lectrice: P. Picasso
Music: H. Matisse
Dance: H. Matisse
View of Toledo: El Greco
Baptism of Christ: El Greco
The Burial of Count Orgaz: El Greco
Metamorphosis of Narcissus: S. Dali
War: S. Dali
Mont Sainte-Victoire: Cezanne
Hokusai, Hiroshige

What do these pictures tell us?
• Perceiving things is a lot more than just detecting lines, colors, motion, and so on.
• We see
  – face, body, hands, apples, mountains, legs, tables, houses, streets, ...
  – motion, movement, action, rhythm, expansion, contraction, upheaval, ...

How come?
• Your eyes receive 2-D flat information.
• All you have is the activation of neurons.
• How come you perceive faces, trees, apples…?

Why & how do we see those?
• Perception involves
  – Organization
    • We see a woman sitting on a sofa because we organize visual information in a certain way.

How do we organize visual information?
• Are there any principles behind it?
• What principles do we follow?
• Gestalt laws of organization

Tell me what you see.

Tell me what you see.

Perception is not just detection
• Perception is not just about detecting color or shape.
• Perception is about organizing visual information.
• How do we organize visual information?

Or when do we fail to organize visual information?

Gestalt psychology
• “Gestalt” means “whole.”
• Organizational principles:
  – Similarity
  – Proximity
  – Continuity
• And more (see the textbook)

Law of similarity
Similar things are put together.

Items with similar colors are put together.

Do you see a sofa? Do you see 3 musicians? Why? Why not?
Do you see a blouse and a blue skirt? Do you see a table?

Law of proximity
• Things that are close to each other are put together.

Are there two separate worlds, this world and that world?
Law of good continuity
We tend to put things together when they show nice continuity.

Do you see a spiral motion?
This picture is called “Dance.” Do you see why?

Do you feel an upward motion or a downward motion?

• There are many more ways to organize visual information (see the textbook).

Perceptual Segregation
• Figure-ground segregation
  – determining what part of the environment is the figure so that it “stands out” from the background

Figure-Ground segregation
• Distinguishing a figure from the ground.
• We know the difference between the figure and the ground.

How does the perceptual system distinguish the figure from the ground?

Which one is the figure and which one is the ground?
Depending on the locus of attention, the figure and the ground switch rapidly.
Attention plays some role in determining the figure and the ground.

The figure represents “some thing.”
The contours belong to the figure rather than to the ground.

Which one is the figure and which one is the ground?
This is easy. The figure tends to have a solid and continuous surface.

Which one is the figure and which one is the ground?
Symmetric items tend to be seen as a figure.

What did you see? Some arrows and what?
Meaningful items are seen as the figure.
Do you see something else?

What does this tell us?
• Figure-ground segregation is influenced by our knowledge about the world in general.
  – What things are and what they look like.
  – Two things can’t occupy the same space simultaneously.

What does this tell us?
• Gestalt laws of organization
  – Similarity, continuity, proximity, ...
• Figure-ground segregation
• Perceptual organization is based on our world knowledge (what we know about the world).

Input to and output from the LGN

Any idea?
• Gestalt laws of organization
  – Similarity, continuity, proximity, ...
• Figure-ground segregation
• What do they tell you?
Some top-down processes are going on.

Modern research on object perception
• The perceptual system is tuned to capture the regularities in the environment.
• Biederman’s Recognition by Components model

How do we store objects in the brain?
• See an object.
• Put it into your brain.
• Take out the one you stored earlier.
• Compare it to what you saw.
How do you store what you saw before?

You see billions of objects every day
• Does your brain have enough space to store them?
• Storing everything is inefficient (very expensive).
• So, you have to do something different.
• Is there a better way to store things?

How do you do that?

Biederman’s Recognition by Components
• You just need about 36 components to represent millions of different objects.

Biederman’s Recognition by Components (RBC)
• Objects are described and stored by simple geometric components (geons).
• There are about 36 geons.
• To represent objects, we use geons and their arrangements.

Geons

Combinations of geons
Combining 4 geons can yield more than 1 million objects (36 x 36 x 36 x 36 = 1,679,616).

Biederman & Ju (1988) Cognitive Psychology, 20, 38-64
• Do people need more information than geons provide?
• Contrast subjects’ object-recognition performance when two types of pictures (actual pictures and schematic pictures) are shown one at a time.

Schematic pictures depicted by geons

Actual pictures

Experiment:
• The subject named the object shown on the screen.
• In one case, an actual picture of the object was shown.
• In the other case, a schematic illustration of the object (depicted by geons) was shown.
Questions/design:
• Do subjects name the objects more quickly and accurately when actual pictures are shown or when schematic pictures (depicted by geons) are shown?
• Each picture (either actual or schematic) was flashed on the computer screen for only 50 ms, 65 ms, or 400 ms.

Why 50 ms? Or 400 ms?
• The task needs to be not too easy but not too difficult.
  – Test college students’ math ability.
    • Test adding and subtracting? (too easy)
  – Test high school students’ math ability.
    • Give quantum mechanics questions? (too difficult)

Results:
Error rates (%)
Presentation      50 ms    65 ms    400 ms
Line drawings     20       13       5
Photographs       35       9        3

Response times in ms (correct responses only)
Presentation      50 ms    65 ms    400 ms
Line drawings     990      870      860
Photographs       1010     875      850

The intelligence of human object perception
• Why are humans much better than computers at object recognition?
• Theory of unconscious inference
  – Human object perception is like problem solving.
    • We make an unconscious inference.
  – Likelihood principle
    • Objects are perceived based on what is most likely to have caused the pattern.
  – Humans have a vast array of knowledge (intelligence) that can disambiguate ambiguous stimuli.
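The likelihood principle can be illustrated with a small calculation. The sketch below is only one possible reading: it assumes the principle can be written as Bayes' rule, which the slides do not state, and every candidate name and probability is a made-up illustrative value rather than data. It takes a square retinal image (as in the inverse projection problem) and picks the candidate object most likely to have caused it.

```python
# Hypothetical candidate causes of one square retinal image.
# All numbers are illustrative assumptions, not measured values.
# prior: how common the object/pose is assumed to be in the environment.
# likelihood: assumed probability that it projects a square image.
CANDIDATES = {
    "small square, facing the eye": {"prior": 0.60, "likelihood": 0.90},
    "large tilted trapezoid": {"prior": 0.10, "likelihood": 0.30},
    "distant tilted rectangle": {"prior": 0.30, "likelihood": 0.20},
}


def posterior(candidates):
    """Return P(cause | image) for each cause using Bayes' rule."""
    unnormalized = {
        name: c["prior"] * c["likelihood"] for name, c in candidates.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


def most_likely_cause(candidates):
    """Pick the cause with the highest posterior probability."""
    post = posterior(candidates)
    return max(post, key=post.get)


if __name__ == "__main__":
    for name, p in posterior(CANDIDATES).items():
        print(f"{name}: {p:.2f}")
    print("Perceived as:", most_likely_cause(CANDIDATES))
```

The same image is compatible with all three causes. Background knowledge, encoded here as the assumed priors, is what settles the ambiguity.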