Higher Perceptual Functions

Object Recognition
- Segregation of function
- Visual hierarchy
- What and where (ventral and dorsal streams)
- Single cell coding and ensemble coding
- Distributed representations of object categories
- Face recognition
- Object recognition as a computational problem

Functional Segregation
Segregation of function exists already in the early visual system:
- M channel (magnocellular): from M-type retinal ganglion cells to magnocellular LGN layers to layer IVB of V1; wavelength-insensitive in LGN; orientation selectivity in V1 (“simple cells”); binocularity and direction selectivity in layer IVB; processes visual motion.
- P channel (parvocellular): from P-type retinal ganglion cells to parvocellular LGN layers to interblob regions of layer III in V1; many cells in LGN show color opponency; cells in interblob regions of V1 have strong orientation selectivity and binocularity (“complex cells”); the channel is also called P-IB; processes visual object shape.

Functional Segregation
Segregation of function can also be found at the cortical level:
- within each area: cells form distinct columns.
- multiple areas form the visual hierarchy …

The Visual Hierarchy
van Essen and Maunsell, 1983

The Visual Hierarchy
van Essen et al., 1990

The Visual Hierarchy
- functional segregation of visual features into separate (specialized) areas.
- increased complexity and specificity of neural responses.
- columnar groupings, horizontal integration within each area.
- larger receptive fields at higher levels.
- visual topography is less clearly defined at higher levels, or disappears altogether.
- longer response latencies at higher levels.
- large number of pathways linking each segregated area to other areas.
- existence of feedforward, as well as lateral and feedback connections between hierarchical levels.
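The point about larger receptive fields at higher levels can be illustrated with a toy calculation. This is a minimal sketch, assuming each level simply pools over a fixed window of neighboring units from the level below. The function name, the window width, the stride, and the number of levels are illustrative choices, not anatomical measurements.

```python
def receptive_field_sizes(stages):
    """Return the receptive field size (in input units) after each level.

    stages: list of (pool_width, stride) tuples, one per hierarchical level.
    All values are hypothetical and chosen only for illustration.
    """
    rf = 1      # a unit at the input level "sees" one input location
    jump = 1    # spacing, in input units, between neighboring units
    sizes = []
    for pool_width, stride in stages:
        # Pooling over pool_width neighbors widens the field by
        # (pool_width - 1) steps of the current spacing.
        rf = rf + (pool_width - 1) * jump
        # Subsampling by stride increases the spacing at the next level.
        jump = jump * stride
        sizes.append(rf)
    return sizes


# Four levels, each pooling over 3 neighbors and subsampling by 2.
print(receptive_field_sizes([(3, 2)] * 4))  # [3, 7, 15, 31]
```

Under these assumptions the receptive field roughly doubles at every level, even though each level only integrates over a small local neighborhood.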
The Architecture of Visual Cortex
Lesion studies in the macaque monkey suggest that there are two large-scale cortical streams of visual processing:
- Dorsal stream (“where”)
- Ventral stream (“what”)
Mishkin and Ungerleider, 1983

What and Where
- Object discrimination task: bilateral lesion of the temporal lobe leads to a behavioral deficit in a task that requires the discrimination of objects.
- Landmark discrimination task: bilateral lesion of the parietal lobe leads to a behavioral deficit in a task that requires the discrimination of locations (landmarks).
Mishkin and Ungerleider, 1983

The Architecture of Visual Cortex
Lateral views of the macaque monkey brain (figure labels: motion, form, color)

Single Cells and Recognition
What is the cellular basis for visual recognition (visual long-term memory)?
1. Where are the cellular representations localized?
2. What processes generate these representations?
3. What underlies their reactivation during recall and recognition?

Single Cells and Recognition
Visual recognition involves the inferior temporal cortex (multiple areas). These areas are part of a distributed network and are subject to both bottom-up (feature-driven) and top-down (memory-driven) influences.
Miyashita and Hayashi, 2000

Single Cells and Recognition
Characteristics of neural responses in IT:
1. Object-specific (tuned to object class), selective for general object features (e.g. shape)
2. Non-topographic (large RF)
3. Long-lasting (100s of ms)
- Columnar organization (“object feature columns”)
- Specificity often has a rather broad range (distributed response pattern)

Distributed Representations
Are there specific, dedicated modules (or cells) for each and every object category? No. Why not?

Distributed Representations
Evidence for a feature-based and widely distributed representation of objects across (ventral) temporal cortex.
What is a distributed representation?

Distributed Representations
Experiments conducted by Ishai et al.:
Experiment 1:
1. fMRI during passive viewing
2. fMRI during delayed match-to-sample
Experiment 2:
1. fMRI during delayed match-to-sample with photographs
2. fMRI during delayed match-to-sample with line drawings
Three categories: houses, faces, chairs.

Distributed Representations
Findings:
- Experiment 1: Consistent topography in areas that most strongly respond to each of the three categories. Modules? No: responses are distributed (more so for non-face stimuli).
- Experiment 2: Are low-level features (spatial frequency, texture etc.) responsible for the representation? No: line drawings elicit similar distributions of responses.

Distributed Representations
From Ishai et al., 1999

Distributed Representations
From Ishai et al., 1999 (figure labels: houses, faces, chairs)

Face Recognition
- Face recognition achieves a very high level of specificity: hundreds, if not thousands, of individual faces can be recognized.
- Visual agnosia specific to faces: prosopagnosia.
- High specificity of face cells: “gnostic units”, “grandmother cells”.
- Many face cells respond to faces only and show very little response to other object stimuli.

Face Recognition
Typical neural responses in the primate inferior temporal cortex:
Desimone et al., 1984

Face Recognition
Face cells (typically) do not respond to:
1. “jumbled” faces
2. “partial” faces
3. “single components” of faces (although some face-component cells have been found)
4. other “significant” stimuli
Face cells (typically) do respond to:
1. faces anywhere in a large bilateral visual field
2. faces with “reduced” feature content (e.g. b/w, low contrast)
Face cell responses can vary with: facial expression, view-orientation

Face Recognition
Face cells are (to a significant extent) anatomically segregated from other cells selective for objects. They are found in multiple subdivisions across the inferior temporal cortex (in particular in or near the superior temporal sulcus).

Face Recognition
Faces versus objects in a recent fMRI study (Halgren et al. 1999)

Object Recognition: Why is it a Hard Problem?
Objects can be recognized over huge variations in appearance and context!
Ability to recognize objects in a great number of different ways: object constancy (stimulus equivalence)
Sources of variability:
- Object position/orientation
- Viewer position/orientation
- Illumination (wavelength/brightness)
- Groupings and context
- Occlusion/partial views

Object Recognition: Why is it a Hard Problem?
Examples for variability (figure label: field of view):
- Translation invariance
- Rotation invariance

Object Recognition: Why is it a Hard Problem?
More examples for variability (figure label: field of view):
- Size invariance
- Color

Object Recognition: Why is it a Hard Problem?
Variability in visual scenes (figure label: field of view):
- Partial occlusion and presence of other objects

Object Recognition: Theories
Representation of visual shape (set of locations):
- Viewer-centered coordinate systems: frame of reference is the viewer; examples: retinotopic coordinates, head-centered coordinates; easily accessed, but very unstable …
- Environment-centered coordinate systems: locations specified relative to the environment
- Object-centered coordinate systems: intrinsic to or fixed to the object itself (frame of reference: object); less accessible

Object Recognition: Theories
A taxonomy:
1. Template matching models (viewer-centered, normalization stage and matching)
2. Prototype models
3. Feature analysis model
4. Recognition by components (object-centered)

Object Recognition: Geons
Theory proposed by Irv Biederman.
- Objects have parts.
- Objects can be described as configurations of a (relatively small) number of geometrically defined parts.
- These parts (geons) form a recognition alphabet.
- 24 geons for four basic properties that are viewpoint-invariant.

Object Recognition: Geons
How geons are constructed:

Object Recognition: Geons
Geons in IT?
Irv Biederman, JCN, 2001

How does Invariance Develop?

Higher Perceptual Functions: Agnosias
Deficits of feature perception (such as achromatopsia) generally do not cause an inability to recognize objects.
Failure of knowledge or recognition = “agnosia” (visual agnosia).
In visual agnosias, feature processing and memory remain intact, and recognition deficits are limited to the visual modality. Alertness, attention, intelligence and language are unaffected. Other sensory modalities (touch, smell) may substitute for vision in allowing objects to be recognized.

Two Kinds of Agnosias
- Apperceptive agnosia: a perceptual deficit that affects visual representations directly. Components of the visual percept are picked up but cannot be integrated. Effects may be graded; recognition of unusual views of objects is often affected.
- Associative agnosia: visual representations are intact, but cannot be accessed or used in recognition. Lack of information about the percept. “Normal percepts stripped of their meaning” (Teuber).
This distinction was introduced by Lissauer (1890).

Apperceptive Agnosia
Diagnosis: ability to recognize degraded stimuli is impaired.
Farah: Many “apperceptive agnosias” are “perceptual categorization deficits” …

Apperceptive Agnosia
Studies by E. Warrington:
- Laterality in recognition deficits: patients with right-hemispheric lesions (parietal, temporal) showed lower performance on degraded images than controls or patients with left-hemispheric lesions.
- Hypothesis: object constancy is disrupted (not contour perception).
- Experiment: unusual views of objects. Patients with right-hemispheric lesions show a characteristic deficit for these views.

Apperceptive Agnosia
Is “perceptual categorization deficit” a general impairment of viewpoint-invariant object recognition?
1. Patients are not impaired in everyday life (unlike associative agnosics).
2. They are not impaired in matching different “normal” views of objects, only “unusual views”.
3. Impairment follows unilateral lesions, not bilateral (as would be expected if visual shape representations were generally affected).
Associative Agnosia
- Patients do well on perceptual tests (degraded images, image segmentation), but cannot access names (“naming”) or other information (“recognition”) about objects.
- Agnosics fail to experience familiarity with the stimulus.
- When given names of objects, they can (generally) give accurate verbal descriptions.
- Warrington’s analysis places associative agnosia in the left hemisphere.

Associative Agnosia
Associative agnosics can copy drawings of objects but cannot name them (evidence for intactness of perceptual representations…) but…

Agnosia Restricted to Specific Categories
Specific deficits in recognizing living versus non-living things.
Warrington and Shallice (1984): patients with bilateral temporal lobe damage showed loss of knowledge about living things (failures in visual identification and verbal knowledge).
Their interpretation: a distinction between knowledge domains, functional significance (vase-jug) versus sensory properties (strawberry-raspberry).
Evolutionary explanation…

Agnosia Restricted to Specific Categories
Another view: Damasio (1990)
Many inanimate objects are manipulated by humans in characteristic ways.
Interpretation: inanimate objects will tend to evoke kinesthetic representations.
Agreeing with Warrington, the difficulty is not due to visual characteristics or visual discriminability.

Agnosia Restricted to Specific Categories
Yet another view: Gaffan and Heywood (1993)
Presented images (line drawings) of animate and inanimate objects to normal humans and normal monkeys, tachistoscopically (20 ms). Both subject groups made more errors in identifying animate than inanimate objects.
Interpretation: living things are more similar to each other than non-living things.
“category-specific agnosia”

How is Semantic Knowledge Organized?
- Category-based system
- Property-based system
Network model by Farah and McClelland (1991).

Prosopagnosia
Is face recognition “special”?
- Anatomical localization
- Functional independence
Associative visual agnosia (prosopagnosia): lost ability to recognize familiar faces. Affects faces known from previous experience as well as (anterograde component) newly experienced faces. Patients can recognize people by their voice, distinctive clothing, hairstyle etc.

Prosopagnosia
What is special about faces:
1. Higher specificity of categorization
2. Higher level of expertise
3. Higher degree of visual similarity
4. Evolutionary significance
Can face and object recognition be dissociated?
Neuropsychological evidence suggests yes (study by McNeil and Warrington).
Also, remember Ishai et al. (object category map).
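
To make the idea of a distributed representation (the object category map) concrete, here is a toy sketch. It assumes a category is read out from the pattern of responses across several regions rather than from a single peak region. The response values, the number of regions, and the function names are made up for illustration; this is not the analysis used by Ishai et al.

```python
import math

# Hypothetical mean responses (arbitrary units) of five ventral temporal
# regions to each category. The numbers are invented for illustration.
PATTERNS = {
    "houses": [0.9, 0.5, 0.3, 0.2, 0.4],
    "faces":  [0.2, 0.3, 0.9, 0.6, 0.3],
    "chairs": [0.4, 0.8, 0.3, 0.5, 0.6],
}


def pearson(x, y):
    """Pearson correlation between two equal-length response vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def peak_region(category, patterns=PATTERNS):
    """Index of the region that responds most strongly to a category."""
    values = patterns[category]
    return values.index(max(values))


def classify(response, patterns=PATTERNS, exclude=()):
    """Pick the category whose pattern correlates best with the response.

    Regions listed in `exclude` are ignored, which mimics reading out the
    category without the maximally responsive region.
    """
    keep = [i for i in range(len(response)) if i not in exclude]

    def subset(values):
        return [values[i] for i in keep]

    return max(
        patterns,
        key=lambda c: pearson(subset(response), subset(patterns[c])),
    )


# A noisy, face-like response across the five hypothetical regions.
observed = [0.25, 0.35, 0.8, 0.55, 0.3]
print(classify(observed))
print(classify(observed, exclude={peak_region("faces")}))
```

With these toy numbers both calls return "faces": the category is still recoverable when the most face-responsive region is left out, which is what one would expect if information is spread across regions rather than confined to one module.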