Gluck_OutlinePPT_Ch09

Chapter 9
Generalization,
Discrimination, and
the Representation
of Similarity
9.1 Behavioral Processes
•
When Similar Stimuli Predict Similar Consequences
•
When Similar Stimuli Predict Different Consequences
•
Unsolved Mysteries—Why Are Some Feature Pairs
Easier to Discriminate Between Than Others?
•
When Dissimilar Stimuli Predict the Same
Consequence
•
Learning and Memory in Everyday Life—
Discrimination and Stereotypes in Generalizing
about Other People
3
Generalization and
Discrimination
•
Generalization—transfer of past learning to
new situations/problems.
Responding to one stimulus (S) as a result of
training with another; influenced by similarity to
the training stimulus.
Specificity—deciding how narrowly a rule applies.
Generality—deciding how broadly a rule applies.
•
Discrimination—recognition of differences
between stimuli.
4
When Similar Stimuli Predict
Similar Consequences
•
Generalization gradient—graph showing
how physical changes in stimuli correspond
to behavioral response changes.
•
In the Guttman and Kalish (1956) study:
Pigeons learned to peck a yellow light (training S)
for food.
Gradient shows how often they subsequently
pecked different color shades (Fig 9.1).
Gradient width indicates the degree of stimulus generalization.
5
(Fig 9.1) Stimulus Generalization
Gradients in Pigeons
Adapted from Guttman & Kalish, 1956, pp. 79–88.
6
What Causes
Generalization Gradients?
•
Is it discrimination error?
•
Logical inference about shared consequences?
•
Shepard (1987): Identify regions of shared
consequence.
Assume all possible regions, small and large.
Average probabilistically over all.
Result: Standard exponentially declining gradient
Argues: “View exp-declining gradients as representing
attempt to predict, based on past experience, how likely
it is that what is true about the consequences of one
stimulus will also be true of other similar stimuli.”
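Shepard's argument can be illustrated numerically. The following is a minimal simulation, not Shepard's own derivation: it assumes a one-dimensional stimulus axis, region widths drawn from an exponential distribution, and a region position chosen uniformly so that every region contains the training stimulus at 0. All function and parameter names are illustrative.

```python
import random

def sample_region_extents(n_regions=20000, mean_width=1.0, seed=0):
    """Sample how far each hypothetical consequential region extends
    to the right of the training stimulus (located at 0)."""
    rng = random.Random(seed)
    extents = []
    for _ in range(n_regions):
        width = rng.expovariate(1.0 / mean_width)  # assumed size distribution
        offset = rng.random()                      # fraction of the region left of 0
        extents.append(width * (1.0 - offset))
    return extents

def generalization(distance, extents):
    """Fraction of sampled regions that also contain a stimulus
    `distance` units away from the training stimulus."""
    return sum(1 for e in extents if e >= distance) / len(extents)

extents = sample_region_extents()
distances = [0.0, 0.25, 0.5, 1.0, 2.0]
gradient = [generalization(d, extents) for d in distances]
for d, g in zip(distances, gradient):
    print(f"distance {d:4.2f}: expected generalization {g:.3f}")
```

Averaging over small and large regions gives a gradient that starts at 1 for the training stimulus and declines steadily with distance. How closely it follows an exponential depends on the assumed size distribution.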
7
Generalization as a Search for
Similar Consequences
•
Consequential region—all stimuli with the
same results as the training stimulus, as
mapped on a generalization gradient.
•
For example, the pigeon has a moderate
expectation to get food from pecking a
yellow-range light (given Fig 9.1).
8
The Challenge of Incorporating
Similarity into Learning Models
•
Discrete-component representation—
representation in which each individual
stimulus (or stimulus feature) corresponds
to its own node or “component.”
Simplest possible scheme to represent stimuli.
•
Fig 9.2 uses discrete-component
representations.
Shows an unrealistic generalization gradient.
9
(Fig 9.2) Stimulus Generalization Model
Using Discrete-Component Representations
10
Limitations of Discrete-Component Representations
•
Representations are applicable to situations
in which stimuli are dissimilar and little
generalization would occur.
Fail when stimuli have high degree of physical
similarity.
•
Note: Different representations in different
contexts provide different patterns of similarity.
Representations are context-specific.
11
(Fig 9.3) Generalization Gradient Produced
by Discrete-Component Network of Fig 9.2
*Shows no response to yellow-orange light (despite its similarity to the previously
trained yellow light). Responds only to the trained “yellow” stimulus, so it fails to
show a smooth generalization gradient like that shown in Fig 9.1.
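The failure of the discrete-component scheme can be reproduced in a few lines. This is a hedged sketch: the five color names, the one-node-per-color coding, and the error-correcting (delta rule) weight update are assumptions chosen for illustration, not details taken from Fig 9.2.

```python
COLORS = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]

def one_hot(color):
    """Discrete-component coding: each color has its own input node."""
    return [1.0 if c == color else 0.0 for c in COLORS]

def respond(weights, inputs):
    """Output node activity is the weighted sum of active inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def train(weights, inputs, target, rate=0.2, trials=50):
    """Error-correcting updates (assumed learning rule)."""
    for _ in range(trials):
        error = target - respond(weights, inputs)
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Train on yellow only, then test every color.
weights = train([0.0] * len(COLORS), one_hot("yellow"), target=1.0)
gradient = {c: round(respond(weights, one_hot(c)), 3) for c in COLORS}
print(gradient)
```

Only the yellow node's weight changes during training, so every other color, including the neighboring yellow-orange, produces no response at all.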
12
Shared Elements and
Distributed Representations
•
Thorndike (law of effect), Estes (stimulus
sampling theory), Rumelhart (connectionist
models) contributed to a contemporary
associative-learning model.
Conceptualized with distributed representations
(overlapping pools of stimulus nodes).
Similar stimuli activate common elements; something
learned about one stimulus transfers to other stimuli
that activate the same nodes.
13
Thorndike and Estes
Shared Elements
Yellow
Orange
Network model follows…
14
Shared Elements and
Distributed Representations
•
Fig 9.4a–d shows a network model using
distributed representations.
Nodes laid out in topographic representation
(nodes responding to physically similar stimuli
placed beside each other in the model).
•
9.4a shows the model (which is only slightly
more complicated than Fig 9.2).
•
9.4b shows the outcome in distributed
weights after many acquisition trials.
15
(Fig 9.4a) Distributed
Representation Network
16
(Fig 9.4b)
Train “Yellow”
17
Shared Elements and
Distributed Representations
•
9.4c shows the response strength when a similar
stimulus (yellow-orange) is tested.
•
9.4d shows the weaker response when a more
dissimilar stimulus (orange) is tested.
•
Such a distributed representation model
better matches real-life gradients, much like
Fig 9.1 (see Fig 9.5).
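A distributed representation of this kind can be sketched as follows. The layout is an assumption made for illustration: seven topographically ordered input nodes, each color activating three neighboring nodes so that adjacent colors share two of them, and an error-correcting (delta rule) update. The node counts in Fig 9.4 may differ.

```python
COLORS = ["green", "yellow-green", "yellow", "yellow-orange", "orange"]
N_NODES = len(COLORS) + 2  # topographically ordered input nodes

def distributed(color):
    """Activate three neighboring nodes; adjacent colors share two of them."""
    start = COLORS.index(color)
    return [1.0 if start <= i < start + 3 else 0.0 for i in range(N_NODES)]

def respond(weights, inputs):
    """Output node activity is the weighted sum of active inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def train(weights, inputs, target, rate=0.1, trials=100):
    """Error-correcting updates (assumed learning rule)."""
    for _ in range(trials):
        error = target - respond(weights, inputs)
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Train on yellow only, then test every color.
weights = train([0.0] * N_NODES, distributed("yellow"), target=1.0)
gradient = {c: round(respond(weights, distributed(c)), 2) for c in COLORS}
print(gradient)
```

Training on yellow spreads the learned weight over its three nodes. Yellow-orange shares two of them and orange shares one, so the response falls from about 1.0 to 0.67 to 0.33, a graded decline rather than an all-or-none response.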
18
(Fig 9.4c) Test
“Yellow-Orange”
Some Decline in
Response
19
(Fig 9.4d) Test “Orange”
More Decline in Response
20
(Fig 9.5) Stimulus Generalization
Gradient Produced by Distributed
Representation Model of Fig 9.4
21
When Similar Stimuli Predict
Different Consequences
•
Two substances that initially appear similar
may become distinguishable over time.
•
Example:
Gooseberries look like green grapes. If you are
allergic to gooseberries, you learn to distinguish
them from green grapes (discrimination).
22
Discrimination Training and
Learned Specificity
•
The weaker the generalization, the stronger
the discrimination.
Discrimination = differential responding to two
stimuli.
Discrimination can be trained; in discrimination
training, two different (but similar) stimuli are
presented on each trial.
•
The steeper (and skinnier) the gradient, the
higher the discrimination.
23
Discrimination Training and
Learned Specificity
•
Fig 9.6 shows results adapted from a
classic 1962 experiment (Jenkins and Harrison's
study of tone discrimination in pigeons).
One gradient represents the test pattern for
pigeons that heard a 1000 Hz tone before they
pecked and received food.
The other gradient represents the generalization
for pigeons that were intermittently exposed to a
similar 950 Hz tone without food.
Which is the control group? Experimental group?
24
(Fig 9.6) Generalization Gradients for
Tones of Different Frequencies
Adapted from Jenkins and Harrison, 1962.
25
Unsolved Mysteries—Why Are Some
Feature Pairs Easier to Discriminate
Between Than Others?
•
Some pairs of stimulus features are
separable, such as brightness and hue.
•
Other feature pairs are perceived holistically,
such as brightness and saturation.
•
Understanding the nature of feature pairs
relates to stimulus generalization.
26
The Two-Dimensional Filtering Task
27
Negative Patterning:
Differentiating Configurations from
Their Individual Components
•
Negative patterning occurs when we
respond positively to two stimuli presented
separately, but we respond negatively to
the compound (i.e., the combination).
•
Example:
Mom at home? Eat dinner in the kitchen. Dad at
home? Eat dinner in the kitchen. Both Mom and
Dad at home? Don’t eat dinner in the kitchen
(Eat in the dining room).
28
Negative Patterning
•
Rats, monkeys, and humans learn negative
patterning tasks.
•
Rabbits can learn to blink to either a tone or
a light, and to not blink to a simultaneous
tone and light.
29
Negative Patterning
in Rabbit Eyeblink Conditioning
Adapted from Kehoe, 1988, Figure 9.
30
Negative Patterning
•
Single-layer network models using discrete-component
representations cannot learn negative patterning.
31
Negative Patterning
•
Fig 9.11 shows a multi-layer network model
for negative patterning.
Includes extra nodes that fire only when two or
more specific features are present.
•
In Fig 9.11, a configural node for “tone +
light” will fire only if both inputs are active.
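The contrast between the two kinds of network can be checked with a small simulation. This is a sketch under stated assumptions: linear output units, an error-correcting (delta rule) update, and a configural input computed as the product of the tone and light inputs. The function names are illustrative and not taken from Fig 9.11.

```python
def respond(weights, inputs):
    """Output node activity is the weighted sum of active inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def train(patterns, n_inputs, rate=0.1, epochs=1000):
    """Cycle through the trial types with error-correcting updates."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in patterns:
            error = target - respond(weights, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Inputs are (tone, light): blink to each cue alone, not to the compound.
elemental = [((1, 0), 1.0), ((0, 1), 1.0), ((1, 1), 0.0)]

def with_configural(inputs):
    """Add a third input that is active only when tone AND light are on."""
    tone, light = inputs
    return (tone, light, tone * light)

configural = [(with_configural(x), t) for x, t in elemental]

w_elem = train(elemental, 2)
w_conf = train(configural, 3)

elem_tone = respond(w_elem, (1, 0))
elem_compound = respond(w_elem, (1, 1))
conf_tone = respond(w_conf, with_configural((1, 0)))
conf_compound = respond(w_conf, with_configural((1, 1)))
print(f"no configural node: tone {elem_tone:.2f}, compound {elem_compound:.2f}")
print(f"configural node:    tone {conf_tone:.2f}, compound {conf_compound:.2f}")
```

Without the configural input, the compound response is always the sum of the two single-cue responses, so the network responds more to tone + light than to either alone. With it, a strong negative weight on the configural input cancels the two positive weights when both cues are present.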
32
(Fig 9.11) Solving Negative
Patterning with a Network Model
33
Configural Learning in
Categorization
•
Configural tasks require sensitivity to
combinations of stimulus cues, above and
beyond what is known about stimulus
components.
•
Configural nodes can be applied to
categorization learning, where humans
learn to classify stimuli into categories.
e.g., diagnosis from symptoms.
34
Configural Learning in
Categorization
•
Fig 9.12a–c shows a configural-node
model of category learning.
•
9.12a shows the model.
•
In 9.12b, fever and soreness together
(without ache) predict the disease.
Dilemma = combinatorial explosion
•
9.12c is a simpler, more flexible (alternative)
model.
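The combinatorial explosion is easy to quantify. The sketch below assumes one configural node for every combination of two or more features; the helper name and the ten-symptom example are illustrative.

```python
from itertools import combinations

def configural_nodes(features):
    """List every combination of two or more features."""
    return [combo
            for size in range(2, len(features) + 1)
            for combo in combinations(features, size)]

symptoms = ["fever", "ache", "soreness"]
print(len(configural_nodes(symptoms)))  # 3 pairs + 1 triple

many = [f"symptom_{i}" for i in range(10)]
print(len(configural_nodes(many)))      # 2**10 - 10 - 1
```

Three symptoms need only 4 configural nodes, but ten symptoms already need 1013, which is why a model with a dedicated node per combination does not scale.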
35
(Fig 9.12a)
36
(Fig 9.12b)
37
(Fig 9.12c)
38
When Dissimilar Stimuli Predict
the Same Consequence
•
Co-occurrence of stimuli may increase
generalization.
•
Example:
If you like the cookies at a new bakery, you may
like their brownies.
39
Sensory Preconditioning: Similar
Predictions for Co-occurring Stimuli
•
Sensory Preconditioning—conditioning
without an explicit US.
Prior presentation of compound stimuli results in
later tendency for learning about one stimulus to
generalize to the other.
40
Sensory Preconditioning
*Example*
•
Step 1: (tone, light)
•
Step 2: (light, puff)
CR eyeblink should develop over acquisition trials.
•
Step 3: (tone alone)
If CR eyeblink occurs, we call this phenomenon
“sensory preconditioning.”
•
Illustrates the generalizability of a stimulus’s
power! The tone was never presented as a
cue for the puff!
41
Sensory Preconditioning
42
Acquired Equivalence:
Novel Similar Predictions Based on
Prior Similar Consequences
•
Acquired equivalence—prior training in
stimulus equivalence increases amount of
generalization between two stimuli, even if
stimuli are superficially dissimilar.
•
In the Hall study, pigeons learned that dissimilar
colors, each paired separately with the same
color, had the same result.
Demonstrated this generalization in a new
situation.
43
Acquired Equivalence
44
Learning and Memory in Everyday Life—
Discrimination and Stereotypes in
Generalizing about Other People
•
Category formation is a basic cognitive
process.
•
Rational generalizations let us tentatively
generalize individual outcomes from
previous experiences.
•
Stereotyping is denying exceptions for
individuals from a group for which we may
hold oversimplified beliefs.
Attempts to justify unfair treatment.
45
9.1 Interim Summary
•
Generalization = transfer of past learning to
new situations and problems.
Requires finding balance between specificity
(knowing how narrowly a rule applies) and
generality (knowing how broadly the rule applies).
•
Discrimination = recognizing differences
between stimuli; knowing which to prefer.
•
Understanding similarity is essential to
understand generalization and discrimination.
46
9.1 Interim Summary
•
Discrete-component representations: assign
each stimulus (or feature) to its own node.
Applicable to situations in which similarity among
features is small enough that there is negligible
transfer of response from one to another.
•
Distributed representations: incorporate idea
of shared elements.
Allow creation of psychological models with
concepts represented as patterns of activity over
many nodes; provide ability to model stimulus
similarity and generalization.
47
9.1 Interim Summary
•
We tend to assume that patterns formed from
compound cues will have consequences that
parallel (or even combine) what we know
about the individual cues.
•
However, some discriminations require
sensitivity to the configurations of stimulus
cues above and beyond what is known about
the individual stimulus cues.
48
9.1 Interim Summary
•
Animals and people can learn to generalize
between stimuli that have no physical
similarity but that do have a history of
co-occurrence or of predicting the same
outcome.
49
9.2 Brain Substrates
•
Cortical Representations and
Generalization
•
Generalization and the Hippocampal
Region
51
Cortical Representations of
Sensory Stimuli
•
Initial cortical processing of sensory
information (vision, sound, touch, etc.)
occurs in areas dedicated to each sense.
•
Areas in the mammalian cortex can be
organized into topographical maps (e.g.,
homunculi for primary sensory and motor
cortices).
52
Topographic
Map of the
Primary
Sensory
Cortex
Adapted from Penfield & Rasmussen, 1950.
53
Shared-Elements Models of
Receptive Fields
•
Does receptive field function match
generalization theories?
•
If brain is organized in distributed
representations, similar stimuli should
activate common nodes (or neurons).
54
Shared-Elements Models
•
Fig 9.17a–c shows a shared-elements
network model of generalization.
9.17a shows how a 550-Hz tone might activate
nodes 2, 3, and 4.
9.17b shows how a similar 560-Hz tone might
activate nodes 3, 4, and 5.
9.17c illustrates how the overlap in active nodes (3 and 4)
produces generalization between a 550-Hz tone and a 560-Hz tone.
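The overlap can be written out directly. The node sets below come from the description of Fig 9.17; the overlap score is an illustrative measure, not one defined in the text.

```python
# Nodes activated by each tone, as described for Fig 9.17a and 9.17b.
ACTIVE_NODES = {550: {2, 3, 4}, 560: {3, 4, 5}}

def shared_nodes(freq_a, freq_b):
    """Nodes activated by both tones."""
    return ACTIVE_NODES[freq_a] & ACTIVE_NODES[freq_b]

def overlap_fraction(freq_a, freq_b):
    """Share of freq_a's nodes that freq_b also activates (illustrative)."""
    return len(shared_nodes(freq_a, freq_b)) / len(ACTIVE_NODES[freq_a])

print(sorted(shared_nodes(550, 560)))
print(round(overlap_fraction(550, 560), 2))
```

Anything learned through the weights on nodes 3 and 4 during training with the 550-Hz tone carries over to the 560-Hz tone, which is the source of generalization in this model.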
55
(Fig 9.17a)
56
(Fig 9.17b)
57
(Fig 9.17c)
58
Shared-Elements Models of
Receptive Fields
•
Auditory neurons respond to varying
frequencies. Each neuron responds best to
one frequency (see Fig 9.18).
The wider the neuron’s receptive field, the broader
the range of physical stimuli (in this case, auditory
frequencies) processed by that neuron.
59
(Fig 9.18)
Activity of node/neuron #3 in Fig
9.17 is recorded for each of the
tones between 520 Hz and 580 Hz;
the best frequency is 550 Hz.
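A receptive field like this one can be approximated with a simple tuning curve. The Gaussian shape and the 10-Hz width are assumptions made for illustration; only the 520–580 Hz range and the 550-Hz best frequency come from the figure description.

```python
import math

def tuning_curve(freq_hz, best_hz=550.0, width_hz=10.0):
    """Relative activity of the neuron for a tone at freq_hz
    (assumed Gaussian receptive field)."""
    return math.exp(-((freq_hz - best_hz) ** 2) / (2 * width_hz ** 2))

frequencies = list(range(520, 581, 10))
activity = {f: tuning_curve(f) for f in frequencies}
best = max(activity, key=activity.get)

for f in frequencies:
    print(f"{f} Hz: {activity[f]:.2f}")
print("best frequency:", best)
```

Increasing `width_hz` widens the receptive field, so the neuron responds to a broader range of frequencies and shares more of its activity with neighboring tones.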
60
Topographic Organization and
Generalization
•
Richard Thompson’s 1960s animal studies
found that intact auditory cortex is
necessary for auditory generalization from
a specific tone.
Such sensory receptive fields can change from
learning.
61
Plasticity of Cortical
Representations
•
Even in adult animals, cortical areas
temporarily shrink from disuse and spread
from use.
•
Weinberger studies indicate that cortical
plasticity is due to stimulus pairing.
Stimulus presentation alone doesn’t drive
plasticity; the stimulus needs to be meaningfully
related to a consequence.
62
Plasticity of Representation in the
Primary Auditory Cortex
Adapted from Weinberger, 1977, figure 2.
63
Plasticity of Cortical
Representations
•
The nucleus basalis in the basal forebrain
releases acetylcholine (ACh) throughout
the cortex.
ACh facilitates cortical plasticity.
64
Generalization and the
Hippocampal Region
•
Generalization shown in sensory
preconditioning is disrupted by lesions of the hippocampal region.
Lesioned rabbits display no sensory
preconditioning.
65
Hippocampal Region and
Sensory Preconditioning
Drawn from data presented in Port & Patterson, 1984.
66
Generalization and the
Hippocampal Region
•
Similarly, rats with hippocampal region
damage (lesions in the entorhinal cortex)
showed poor acquired equivalence.
Latent inhibition in rabbit eyeblink conditioning
was eliminated with entorhinal cortical lesions.
67
Latent Inhibition in Rabbit
Eyeblink Conditioning
Adapted from Shohamy, Allen, & Gluck, 2000.
68
Modeling the Role of the Hippocampus
in Adaptive Representations
•
Gluck and Myers (1993, 2001) propose that
compression and differentiation of stimulus
representations are computed in the
hippocampal region.
Region acts as an “information highway.”
Cerebral cortex and cerebellum process
associations for behavioral response and storage.
•
Research supports this model.
69
9.2 Interim Summary
•
While it is possible for an animal without an
auditory cortex to learn to respond to
auditory stimuli, an intact auditory cortex is
essential for normal auditory generalization.
Without their auditory cortex, animals can learn
to respond to the presence of a tone, but cannot
respond precisely to a specific tone.
70
9.2 Interim Summary
•
Cortical plasticity is driven by the correlation
between stimulus and salient event.
Plasticity is not driven by presentation alone;
stimulus has to be meaningfully related to ensuing
consequences.
But, primary sensory cortices do not receive
information about which consequence occurred,
only that some sort of salient event has occurred.
Thus, primary sensory cortices only determine
which stimuli deserve expanded representation
and which do not.
71
9.2 Interim Summary
•
When stimulus is paired with salient event
(such as food or shock), nucleus basalis
becomes active.
Delivers acetylcholine to cortex.
Enables cortical remapping to enlarge the
representation of stimulus in the appropriate
primary sensory cortex.
72
9.2 Interim Summary
•
Hippocampal region plays key role in
learning behaviors that depend on stimulus
generalization.
e.g., classical conditioning paradigms of sensory
preconditioning and latent inhibition.
•
Computational modeling suggests that role
is related to hippocampal region’s
compression and differentiation of stimulus
representations.
73
9.3 Clinical Perspectives
•
Generalization Transfer and Hippocampal
Atrophy in the Elderly
•
Rehabilitation of
Language-Learning-Impaired Children
75
Generalization Transfer and
Hippocampal Atrophy in the Elderly
•
Hippocampal or entorhinal cortical atrophy
may be early sign for Alzheimer’s disease.
Adapted from de Leon et al., 1993. Images courtesy of Dr. Mony de Leon, NYU School of Medicine.
76
Generalization Transfer and
Hippocampal Atrophy in the Elderly
•
Myers and associates developed a method
to study generalization transfer in the elderly.
Human acquired equivalence study:
Phase 1: equivalence training
Phase 2: train new outcome
Phase 3: transfer
Adapted from Myers et al., 2003.
77
Human Acquired
Equivalence Study
•
Phase 1:
Learned to associate the blue fish with the
brunette and the blue fish with the blonde
(equivalent preference).
•
Phase 2:
Learned to associate the red fish with the brunette.
•
Phase 3:
Can they generalize this red fish preference to
the blonde?
78
Human Acquired
Equivalence Study
•
Results:
Healthy participants completed all three phases.
Participants with hippocampal atrophy completed
phases 1 and 2, but could not transfer learning in
phase 3.
•
Test might be a quick and easy screening
tool for potential cognitive impairment.
79
Rehabilitation of
Language-Learning-Impaired Children
•
Language learning impairment (LLI)—
language-learning problems not attributable
to known factors.
Children with normal intelligence but very low
scores on oral language tests.
Tallal found that the problem was not language-specific;
rather, it was a problem in rapid sensory processing.
80
Rehabilitation of
Language-Learning-Impaired Children
•
In one study (Temple et al., 2003):
Participants = 20 dyslexic children (8–12 years
old) and 12 children matched for age, gender,
handedness, and non-verbal IQ.
All received fMRI during a rhyming task before
and after dyslexic children’s training.
Study includes behavioral remediation program
to improve auditory and language processing.
Uses non-linguistic and acoustically modified
speech.
Conducted 5 days per week, 100 min. per day, for
27.9 (average) training days.
81
Rehabilitation of
Language-Learning-Impaired Children
•
Results:
Children’s language and reading scores
increased.
fMRI showed increased activity in language-processing
areas (left temporo-parietal cortex).
•
Illustrates cortical plasticity in children from
intense behavioral treatment.
82
Brain Plasticity in
Children with Dyslexia
Data from Temple et al., 2003; Images courtesy of Elise Temple.
83
9.3 Interim Summary
•
Some forms of generalization depend on
medial temporal lobe mediation.
•
Elderly individuals with hippocampal region
atrophy (a risk factor for subsequent
development of Alzheimer’s disease) can
learn initial discriminations but fail to
appropriately transfer learning in later tests.
84
9.3 Interim Summary
•
Studies of dyslexia and other language
impairments provide examples of how
insights from animal research on cortical
function can have clinical implications for
humans with learning impairments.
85