Class 19 - Generalized matching

Class Business
• No lab/office hours
• Papers
– Format- subheadings
– Progress Reports
– Early drafts – Thanksgiving
• Rachlin
– Summaries
• Tests
– Review
– Rewrites
Matching Continued
Herrnstein’s Hyperbola -- Application
Given this equation:
$$B_1 = \frac{kR_1}{R_1 + R_e}$$
What two ways could we reduce B?
$$B_1 = \frac{kR_1}{R_1 + R_e}$$
k = 10, R1 = 10, Re = 10
10*10/(10+10) = 5
Decrease Rf for the Behavior
$$B_1 = \frac{kR_1}{R_1 + R_e}$$
k = 10, R1 = 5, Re = 10
10*5/(5+10) = 3.33
Increase Re
$$B_1 = \frac{kR_1}{R_1 + R_e}$$
k = 10, R1 = 10, Re = 20
10*10/(10+20) = 3.33
We do this all the time…
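The three worked cases above can be checked with a few lines of Python. This is a minimal sketch; the function name `herrnstein_rate` and the variable names are illustrative assumptions, not part of the slides.

```python
def herrnstein_rate(k, r1, re):
    """Herrnstein's hyperbola: B1 = k * R1 / (R1 + Re)."""
    return k * r1 / (r1 + re)

# Baseline, then the two ways of reducing B1 discussed above.
baseline = herrnstein_rate(k=10, r1=10, re=10)  # starting point
less_r1 = herrnstein_rate(k=10, r1=5, re=10)    # decrease reinforcement for the behavior
more_re = herrnstein_rate(k=10, r1=10, re=20)   # increase reinforcement from other sources

print(round(baseline, 2), round(less_r1, 2), round(more_re, 2))
```

Halving R1 and doubling Re give the same reduction here only because of the particular values chosen.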
Herrnstein's law and concurrent schedules
The strict-matching equations for describing rates of
responding during either alternative in concurrent VI VI
schedules are:
Alternative 1:
$$\frac{B_1}{B_1 + B_2 + B_e} = \frac{R_1}{R_1 + R_2 + R_e}$$
Alternative 2:
$$\frac{B_2}{B_1 + B_2 + B_e} = \frac{R_2}{R_1 + R_2 + R_e}$$
According to Herrnstein's assumptions, the denominators
(B1 + B2 + Be) both equal k, so:
$$B_1 = \frac{kR_1}{R_1 + R_2 + R_e}$$
and
$$B_2 = \frac{kR_2}{R_1 + R_2 + R_e}$$
If now we divide B1 by B2 to see what relative behaviour
allocation would look like, we get:
$$\frac{B_1}{B_2} = \frac{R_1}{R_2}$$
As we saw already, this is just the same relation as:
$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$
which is the strict matching law.
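Written out, dividing B1 by B2 is a cancellation of the terms the two response-rate equations share. The block below is a worked sketch using only the symbols already defined:

```latex
\frac{B_1}{B_2}
  = \frac{kR_1 / (R_1 + R_2 + R_e)}{kR_2 / (R_1 + R_2 + R_e)}
  = \frac{R_1}{R_2}
```

Both k and Re drop out, so the ratio form makes no claim about total output or extraneous reinforcement.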
So, the response rate on one of the schedules comprising
a concurrent VI VI schedule is:
$$B_1 = \frac{kR_1}{R_1 + R_2 + R_e}$$
How good is Herrnstein's law (1)?
A law is only as good as the assumptions are true.
It fits the data well,
But is that enough? -- many other equations could
be good fits.
How good is Herrnstein's law (2)?
The assumption that total output (k) is constant seems
counter-intuitive, but it could be correct if you measure all
the responses in the same modulus -- which Herrnstein's
theory does. Even doing nothing is behaving.
How good is Herrnstein's law (3)?
The assumption about Re existing -- that there are other
reinforcers available -- cannot be faulted. Logically, it
seems that this must be correct.
BUT the assumption that Re remains constant when R1 is
varied is probably unreasonable.
It seems unlikely that Re could remain constant when Be
varies considerably. Re surely must fall when R1 is
increased.
How good is Herrnstein's law (4)?
The assumption of strict matching is likely to be wrong.
Considerable research has shown that the behavior ratio
changes rather less with changes in reinforcer ratios than
is suggested by strict matching.
Generalized Matching
Strict matching law
 This one is the relative form, because it shows relative
response and reinforcer rates (proportions) :
$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$
 You can rearrange this to give the ratio form :
$$\frac{B_1}{B_2} = \frac{R_1}{R_2}$$
 Lots of concurrent-schedule experiments supported the
strict matching law, at least roughly, but not all of them
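The rearrangement from the relative form to the ratio form takes three steps. This is a worked sketch, assuming B1 + B2 and R1 + R2 are both non-zero:

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
\;\Rightarrow\; B_1 (R_1 + R_2) = R_1 (B_1 + B_2)
\;\Rightarrow\; B_1 R_2 = R_1 B_2
\;\Rightarrow\; \frac{B_1}{B_2} = \frac{R_1}{R_2}
```

The B1 R1 term appears on both sides after cross-multiplying and cancels, which is why the two forms carry the same information.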
Staddon (1968) was the first to collect data that clearly
didn’t fit the strict matching law
His data did NOT fit on the major diagonal on relative coordinates, as required by the strict matching law.
He studied concurrent DRL VI(DRL) schedules in pigeons, and varied the VI schedule.
[Figure: B1/(B1+B2) plotted against R1/(R1+R2), both axes from 0.0 to 1.0]
So Staddon (1968) plotted his data in log-ratio form,
log(B1/B2) against log(R1/R2) :
Strict matching is still shown as a diagonal line at 45º.
Equal responses means log (B1/B2) = 0.
Equal reinforcers means log (R1/R2) = 0.
The data clearly fall along a line, whose equation is y = .59x + .51
[Figure: log (B1/B2) against log (R1/R2), showing the fitted line y = .59x + .51, the strict-matching diagonal, and the equal-choice and equal-reinforcers reference lines]
Two deviations from strict matching are visible in
Staddon’s data :
Bias – over all log reinforcer ratios, there are more B1 responses than predicted by strict matching
Undermatching – there’s less change in log response ratio than predicted by strict matching (the slope of the line is less than 1)
[Figure: Staddon (1968), Bird 420. log (B1/B2) against log (R1/R2), showing the data, the fit y = 0.59x + 0.51, the strict matching line, and the equal-choice and equal-reinforcers reference lines]
More undermatching and bias
• At first, researchers didn’t take much notice of Staddon’s results – weird procedure, weird results
• People liked Herrnstein’s strict matching theory
– simple and parsimonious
• Reports of these deviations from strict matching continued
– in more standard concurrent schedules
• Hollard and Davison (1971) – concurrent VI VI, pigeons, food vs electrical brain stimulation as reinforcers
• Trevett, Davison and Williams (1972) – concurrent VI FI, pigeons, food reinforcers
• Baum (1974): concluded that undermatching was actually more common than strict matching
The Generalized Matching Law - GML
 Baum (1974) therefore suggested a more general
form of the matching law that included parameters to
describe undermatching and bias :
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
 The GML has the form of the equation for a straight line :
y = slope (x) + intercept
 The slope a is called sensitivity to reinforcer rate
(Lobb & Davison, 1975)
 The intercept log c is called bias (Baum, 1974)
The Generalized Matching Law - GML
 Why is the equation in log-ratio form? Because otherwise it
wouldn’t be a straight line. If we take the GML out of logs :
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
becomes a power function :
$$\frac{B_1}{B_2} = c \left(\frac{R_1}{R_2}\right)^{a}$$
 Not a straight line, harder to interpret on a graph, and
much harder to find the best-fitting values of a and log c
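Because the GML is a straight line in log coordinates, a and log c can be estimated with ordinary least-squares regression on the log ratios. The sketch below assumes base-10 logs; the function name `fit_gml` and the data are hypothetical, generated from a known line rather than taken from any study.

```python
import math

def fit_gml(response_ratios, reinforcer_ratios):
    """Least-squares fit of log(B1/B2) = a * log(R1/R2) + log c.

    Returns (a, log_c): sensitivity and bias in log10 units.
    """
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    log_c = mean_y - a * mean_x
    return a, log_c

# Hypothetical data generated from a = 0.8, log c = 0.1 (not real results).
reinforcer_ratios = [0.125, 0.5, 1.0, 2.0, 8.0]
response_ratios = [10 ** (0.8 * math.log10(r) + 0.1) for r in reinforcer_ratios]

a, log_c = fit_gml(response_ratios, reinforcer_ratios)
print(f"a = {a:.2f}, log c = {log_c:.2f}")
```

Fitting the power-function form directly would need an iterative nonlinear method, which is the practical reason for working in logs.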
The Generalized Matching Law
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• If sensitivity (a) < 1, undermatching
• If sensitivity (a) > 1, overmatching
• If bias (log c) > 0, biased towards Alternative 1
• If bias (log c) < 0, biased towards Alternative 2
• If a = 1 and log c = 0, strict matching
• If a = 1 and log c ≠ 0, biased matching
• For Staddon’s data, a = 0.59, so undermatching, and log c = 0.51, so biased towards Alternative 1 (the shorter DRL)
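The interpretation rules for a and log c can be written as a small lookup. This is a sketch only: the function name `describe_gml_fit` and the tolerance `tol` (used to decide when a value counts as exactly 1 or 0) are assumptions, not part of the GML.

```python
def describe_gml_fit(a, log_c, tol=1e-9):
    """Label a GML fit by its sensitivity (a) and bias (log c).

    tol is an assumed tolerance for treating a as 1 or log c as 0.
    """
    if a < 1 - tol:
        sensitivity = "undermatching"
    elif a > 1 + tol:
        sensitivity = "overmatching"
    else:
        sensitivity = "matching"

    if log_c > tol:
        bias = "biased towards Alternative 1"
    elif log_c < -tol:
        bias = "biased towards Alternative 2"
    else:
        bias = "no bias"
    return sensitivity, bias

# Staddon's fitted values from the slides.
staddon = describe_gml_fit(a=0.59, log_c=0.51)
print(staddon)
```

In practice, whether a fitted slope differs from 1 is a statistical question about the estimate, not a comparison against a fixed tolerance.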
Examples
Three GML plots:
• Strict matching: a = 1, log c = 0
• Undermatching: a < 1, log c = 0
• Biased undermatching: a < 1, log c < 0
[Figure: log-ratio measures plot. Log response ratio against log reinforcer ratio, both axes from -1.0 to 1.0]
Examples
Much harder to see what’s going on in relative plots
[Figure: relative measures plot. Relative responses against relative reinforcers, both axes from 0.0 to 1.0, for strict matching, undermatching and biased undermatching]
Examples
And even worse with ratio axes
[Figure: ratio measures plot. Response ratio against reinforcer ratio, both axes from 0 to 12, for strict matching, undermatching and biased undermatching]
The Generalized Matching Law
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
 Undermatching means that when we vary the log
reinforcer ratio (the distribution of reinforcers
between alternatives), the animal’s choice changes
less than strict matching predicts
 Undermatching is more common than strict matching
– usual value of sensitivity is about 0.8 to 0.9
 Example – Alsop and Elliffe (1988) – six pigeons,
food reinforcement, five different concurrent VI VI
schedules
Alsop and Elliffe (1988)
[Figure: log response ratio against log reinforcer ratio, one panel each for Birds 131 to 136, axes from -1 to 1]
• The GML fitted the data well (data points close to line)
• All birds showed undermatching (slopes < 1)
• All showed some bias – 132 and 134 to B1, others to B2
Response vs. time allocation
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
 Just as with the strict matching law, we could measure choice
in terms of time allocation instead of response allocation :
$$\log\frac{T_1}{T_2} = a \log\frac{R_1}{R_2} + \log c$$
 Results are very similar, but sensitivity is usually a little
higher (a is closer to 1, less undermatching)
Alsop and Elliffe (1988)
All birds showed higher time than response sensitivities
[Figure: log response ratio and log time ratio against log reinforcer ratio, one panel each for Birds 131 to 136, axes from -1 to 1]
Why do we usually get undermatching?
 Baum’s view – undermatching is a failure of matching
 Organisms are ‘trying to match’ but various things prevent it –
usually related to confusion between the alternatives
• e.g., no changeover delay (COD) or COD too short (Schroeder & Holland, 1969)
– without a COD, reinforcers on one alternative will affect
responses on the other, so not surprising that we get
undermatching
 e.g., stimuli signaling the alternatives may be confused
 Miller, Saunders and Bourland (1980) ran a switching-key
concurrent VI VI with the alternatives signaled by line
orientations
 Found almost strict matching when lines were 45º different,
but a = 0.33 when lines were 15º different
Why do we sometimes get overmatching?
 It can’t just be a failure of matching though, because there are
conditions that reliably produce overmatching (a > 1)
 This happens if we make it difficult to switch between
alternatives
 e.g., Baum (1982) – two-key concurrent VI VI with pigeons,
but they had to walk around a partition or jump over a hurdle
to switch between keys – a was about 1.9
 e.g., Davison and Elliffe (2000) – switching-key concurrent VI
VI, but a switch response produced a delay of 4.5 s before the
other stimulus and schedule appeared – a was about 1.6
 This doesn’t seem consistent with Baum’s idea that strict
matching is normative, but sometimes animals can’t do it
Why do we get bias?
 Much easier to understand – bias is a constant
proportional preference for one alternative
 doesn’t change when we change the reinforcer ratio
• Choice is biased towards:
– Large reinforcers over small reinforcers
– Immediate reinforcers over delayed reinforcers
– Easy responses over hard responses (e.g., force required to peck)
– Higher quality reinforcers over lower quality reinforcers
– Reinforcers that occur at unpredictable times (VI) over reinforcers that occur at regular times (FI)
 There might also be some inherent bias specific to the
individual subject,
 like muscular differences: handedness (beakedness?)
Why do we get bias?
 All these variables affect choice. If one of them is constant
but unequal on the two alternatives, it will produce a bias.
 e.g., if left-key responses are reinforced with 3 s access to
wheat, and right-key responses are reinforced with 6 s access
to wheat, and we then vary the reinforcer rates on each key, we
would expect the GML to fit the data well, but with a negative
value of log c – a bias towards B2, the right key
 e.g., Hollard and Davison (1971) – pigeons, left key always
had 3 s food as the reinforcer, right key had 10 s of ectostriatal
electrical brain stimulation, varied the reinforcer rate ratio over
conditions
Hollard and Davison (1971)
[Figure: Hollard & Davison (1971) group data. Log response or time ratio against log obtained reinforcer ratio, both axes from -1.0 to 1.0. Time: y = 1.06x + 0.71. Responses: y = 0.77x + 0.65. Strict matching line shown for comparison]
• Sensitivity values look normal, but there was a big bias in favor of the alternative providing food reinforcers
• Responses: bias to food = 0.65 in log units
• $10^{0.65} \approx 4.5$, so birds liked food 4.5 times as much as EBS
• Time: bias to food = 0.71 in log units
• $10^{0.71} \approx 5.1$, so birds liked food 5.1 times as much as EBS
• But there’s a problem with the design of this experiment …
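Converting a bias from log units back to a preference ratio is a single exponentiation, since log c is a base-10 log. A minimal sketch, with `bias_as_ratio` as an illustrative name:

```python
def bias_as_ratio(log_c):
    """Convert a GML bias in log10 units to a preference ratio c."""
    return 10 ** log_c

# Intercepts of the fitted lines for the group data.
response_bias = bias_as_ratio(0.65)
time_bias = bias_as_ratio(0.71)

print(round(response_bias, 1), round(time_bias, 1))
```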
Hollard and Davison (1971)
 We can’t be certain that the bias was because of the
different reinforcer qualities, or because of inherent
bias, or some other asymmetry between the keys
 e.g., maybe the food key was easier to peck than the
EBS key, and that’s why the birds were biased towards
it
 Hollard and Davison needed a series of control
conditions where each key provided the same
reinforcer, so they could measure any inherent bias etc
and subtract it from the bias in the food v EBS
conditions
Trevett, Davison and Williams (1972)
• Does the GML apply to concurrent FI VI as well as to VI VI?
 Is there a bias to either FI or VI?
 Ran the control conditions
 Bias should be due to a preference for FI or VI
 Baseline (control):
 Five concurrent VI VI schedules in which reinforcer
ratio was varied from 1:4 to 4:1 (log reinforcer ratio
varied from –0.6 to +0.6)
 Experimental conditions:
 Five concurrent FI VI schedules that varied reinforcer
ratio over the same range
Trevett, Davison and Williams (1972)
[Figure: Trevett et al. (1972) group data. Log response ratio (FI/VI) against log obtained reinforcer ratio, both axes from -1.0 to 1.0, for conc VI VI and conc FI VI. Fitted lines: y = 0.69x - 0.10 and y = 0.62x - 0.25]
• Sensitivities (slopes) not significantly different for the two schedule types
• conc VI VI – small bias towards Key 2 (inherent bias? easier to peck?)
• conc FI VI – bigger bias towards VI. Difference between the biases, 0.15 log units, must measure preference for VI over FI
• Pigeons seem to prefer unpredictable reinforcers over predictable ones – risk-prone rather than risk-averse, perhaps
• Similar results for time allocation
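The logic of the control conditions is a subtraction: the bias in the baseline conditions estimates inherent key bias, and whatever remains in the experimental conditions is attributed to schedule type. The sketch below assumes the smaller intercept (-0.10) belongs to conc VI VI and the larger (-0.25) to conc FI VI. That assignment is inferred from the "small bias" and "bigger bias" descriptions, not stated explicitly with the fitted lines.

```python
# Intercepts (log c) of the two fitted lines; the assignment to schedule
# types is an assumption based on the "small" vs "bigger" bias wording.
vi_vi_bias = -0.10  # control: inherent bias only
fi_vi_bias = -0.25  # experimental: inherent bias plus FI-vs-VI preference

# Remove the inherent bias to isolate the schedule-type preference.
schedule_bias = fi_vi_bias - vi_vi_bias   # negative means towards VI
preference_ratio = 10 ** abs(schedule_bias)

print(round(schedule_bias, 2), round(preference_ratio, 2))
```

On these numbers the VI alternative is preferred by a factor of about 1.4 once inherent bias is removed.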
White & Davison (1973) FI FI
 White and Davison varied relative reinforcer rates in
conc FI FI
 Sometimes, the pigeons showed normal FI scallops,
but sometimes the cumulative records looked like VI
schedules (high constant response rate)
 When they showed the same pattern of responses on
both keys, sensitivity was close to 1 (strict matching)
and there was no bias
 When they showed different patterns on each key
(one FI-like, one VI-like), they found undermatching,
and a bias towards the VI-like alternative
Concurrent VI VR
 The results depend on whether you measure response
or time allocation
 Response bias towards VR schedule
 Time bias towards VI schedule
 Seems reasonable?
 VR is response-based, VI is time-based
 So response and time measures aren’t always the
same, and can be in opposite directions
 Davison (1982): strange result that is a problem for
the GML:
 Keep the VI schedule constant and vary the VR, get strict
matching (a = 1)
 Keep the VR constant and vary the VI, get undermatching.
 So maybe the GML doesn’t apply to ratio schedules very
well.
Other IVs in the GML
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
 Reinforcer rates control choice according to the GML
 Other independent variables, like reinforcer magnitudes,
delays, qualities, etc. produce biases in the GML if they are
held constant but unequal while we vary reinforcer rates
 What if we keep the reinforcer rates constant and vary
something else instead?
 e.g., arrange conc VI 60 s VI 60 s and vary magnitude
$$\log\frac{B_1}{B_2} = a \log\frac{M_1}{M_2} + \log c$$
Schneider (1973) and Todorov (1973)
$$\log\frac{B_1}{B_2} = a \log\frac{M_1}{M_2} + \log c$$
 Version of the GML that describes the effect of
reinforcer magnitude on choice
 Schneider and Todorov both varied reinforcer
magnitude and found that the above equation did
describe their data
 But sensitivity to magnitude was about 0.5 – much
less than sensitivity to rate
 Reinforcer rate is more effective at influencing choice
than reinforcer magnitude
The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
 If we put the GML descriptions of control by rate and
magnitude together, we could write the above
equation
 The sensitivity terms have subscripts identifying the
IV they refer to
 Each IV controls choice in the same way, but
sensitivity to each might differ
 Because reinforcer rate affects choice more than
reinforcer magnitude does, ar is greater than am
The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + a_d \log\frac{D_2}{D_1} + a_A \log\frac{A_2}{A_1} + a_q \log\frac{Q_1}{Q_2} + \log c$$
 We could just keep going, adding a log ratio and a sensitivity
term for all the other IVs that affect choice
 Notice that reinforcer delay and response arduousness have
reversed log ratios – why?
 Sensitivity term: how much influence that IV has on choice
 If an IV is constant and equal for both alternatives, its log
ratio will be zero and it will drop out of the equation
 If an IV is constant and unequal it will produce a bias
The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
 e.g., suppose M1 is 6 s access to wheat and M2 is 3 s
 We know that am is about 0.5
 The magnitude ratio is 2, so log magnitude ratio is 0.3, so the
magnitude term is about 0.5 x 0.3 = 0.15
 So if we vary the reinforcer rates and measure bias, it should
be about 0.15 towards the larger magnitude, as long as all the
other IVs in the concatenated GML are constant and equal, and
there’s no inherent bias
Vollmer and Bourret (2000)
American college basketballers, 3-point shots vs 2-point shots.
Across players, the ratio of shots attempted (responses) slightly undermatched the ratio of successful shots (reinforcer rate), with a bias in favour of 3-point shots (reinforcer magnitude).
The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = \sum a_x \log\frac{X_1}{X_2}$$
• This is an efficient way to write the concatenated GML. The
symbol Σ means “the sum of”
 X means any independent variable that affects choice
 The equation says that the predicted log response ratio is the
sum of all the log independent-variable ratios multiplied by
their sensitivities
 There’s no bias term in the equation, because the hope is that
eventually, when all the IVs that affect choice have been
discovered, there will be no inherent bias
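The summed form translates directly into a loop over independent variables. The sketch below is illustrative only: `predicted_log_ratio` is a hypothetical name, the rate sensitivity of 0.85 is an assumed value within the 0.8 to 0.9 range, and the reinforcer rates are made up. The magnitude values are the 6 s versus 3 s wheat example with sensitivity 0.5.

```python
import math

def predicted_log_ratio(terms):
    """Concatenated GML: sum of sensitivity * log10(X1 / X2).

    terms is a list of (sensitivity, x1, x2) triples. For variables with
    reversed ratios (delay, arduousness), pass the Alternative 2 value as x1.
    """
    return sum(a_x * math.log10(x1 / x2) for a_x, x1, x2 in terms)

terms = [
    (0.85, 60.0, 60.0),  # reinforcer rate: equal on both keys (hypothetical values)
    (0.5, 6.0, 3.0),     # reinforcer magnitude: 6 s vs 3 s access to wheat
]
log_ratio = predicted_log_ratio(terms)

print(round(log_ratio, 2))
```

With equal rates the rate term is zero and drops out, leaving only the magnitude term of about 0.15 towards the larger reinforcer.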
The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = \sum a_x \log\frac{X_1}{X_2}$$
 There’s a body of research on measuring sensitivities
to different independent variables – we’ve already
seen that sensitivity to :
 reinforcer rate is about 0.8 to 0.9
 reinforcer magnitude is about 0.5
 We’ll look at two more IVs – reinforcer delay and
response force requirement (a way of manipulating
the arduousness of a response)
Response force: Hunter and Davison (1982)
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_A \log\frac{A_2}{A_1} + \log c$$
 Hunter and Davison varied both reinforcer rate in concurrent
VI VI and the force required for a key-peck to be counted as a
response (they had weights behind the keys to make them
harder to peck)
 Response measures:
ar = 0.88, aA = 0.71
 Time measures:
ar = 0.98, aA = 0.41
 Force required affected choice less than reinforcer rate
 Sensitivities measured using time allocation aren’t always
higher than for response allocation
 This is the only thorough published test of the concatenated
GML, and it fitted the data very well
Rf magnitude: Elliffe, Davison, Landon (2008)
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
 But what about with other variables?
 Both reinforcer rate and magnitude varied thoroughly
 The concatenated GML fitted the data very well, but
 Sensitivity to rate was highest when the magnitudes were equal
 Sensitivity to magnitude was highest when the rates were equal
 The CGML can’t be exactly right
 probably near enough for predicting behavior in practical terms
Reinforcer delay
$$\log\frac{B_1}{B_2} = a_d \log\frac{D_2}{D_1} + \log c$$
 The first experiment on this (Chung & Herrnstein, 1967)
reported strict matching to relative delay (ad = 1)
 But a more thorough later experiment (Williams & Fantino,
1978) found that sensitivity increased with the average delay
 This shouldn’t happen according to the GML – sensitivity
should stay constant for a particular IV
Assessing food preference: reinforcer quality
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_q \log\frac{Q_1}{Q_2} + \log c$$
 You can find out which of two foods an animal
prefers by arranging each as a reinforcer for a
different response and varying the reinforcer rates
 The bias will reflect preference for one or other food,
because it will be affected by the log quality ratio
 This is the same logic that Hollard and Davison used
with food v EBS
 Matthews and Temple (1979) did this with different
cow feeds – crushed barley seems to be what they
like best
Assessing food preference: reinforcer quality
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_q \log\frac{Q_1}{Q_2} + \log c$$
 Notice how important it is to use dependent schedules
 If you use independent schedules the animal will respond more
for the preferred food (say, Q1)
 and so its reinforcer rate for that food (R1) will increase
 so log (R1/R2) increases
 and that increases preference more
 so log (R1/R2) increases more
 And so on. You can end up with an apparently very strong
preference for one food that is actually caused by the animal
receiving that food at a higher rate – watch for this in pet-food
advertisements.
Summary
• Generalized matching seems to describe choice well
in a wide variety of situations – different species,
responses, reinforcers
 It says that different IVs to do with responses and
reinforcers all affect choice in the same way …
 But that some of them are more effective than others,
as measured by their different sensitivity values
 It’s the best, most widely applicable, description of
choice that we have at the moment
 But is it a theory of choice? Why should it happen?
 Is Baum’s idea that it’s really ‘failed strict matching’
right? Or is it an outcome of some other mechanism
for choice, like maximization?
Theories of Matching
Experimental Analysis of Choice
• Methods: concurrent schedules, concurrent chains, delay
discounting, foraging contingencies, behavioral economic
contingencies.
• Models and Issues: matching/melioration,
maximizing/optimality, hyperbolic discounting, behavioral
economic/ecological models, behavioral momentum, molar
versus molecular issue, concepts of response strength.
• Applications: self-control, drug abuse, gambling, risk,
economics, behavioral ecology, social/political decision
making.
Melioration: A Theory of Matching
“To make better”: Behavior shifts to the higher
return (lower cost) or equal local rates of
reinforcement.
(R1 / R2) = (r1 / r2), or (r1 / R1) = (r2 / R2)
(reinforcers per response, i.e., return).
(T1 / T2) = (r1 / r2), or (r1 / T1) = (r2 / T2) (local rate
of reinforcement).
Example: Conc VI 30”VI 120”
Suppose in the first hour of exposure 1000 responses were
emitted to each alternative:
(VI 30”) r1 / R1 = (120 rfs / 1000 resps). Return = 0.120
(VI 120”) r2 / R2 = (30 rfs / 1000 resps). Return = 0.03
Ultimately behavior will shift toward the higher return. What will
be the result?
120 / (1000 + x) = 30 / (1000 – x); x = 600.
120 / 1600 = 30 / 400 i.e., matching (80% responses on VI 30”
alternative).
Return = 0.075 rfs/resp on each alternative.
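The value x = 600 comes from setting the two returns equal and solving for the number of responses that shift. A worked sketch of the algebra, in this section's notation (r for reinforcers, R for responses):

```latex
\frac{120}{1000 + x} = \frac{30}{1000 - x}
\;\Rightarrow\; 120\,(1000 - x) = 30\,(1000 + x)
\;\Rightarrow\; 120000 - 120x = 30000 + 30x
\;\Rightarrow\; 150x = 90000
\;\Rightarrow\; x = 600
```

So 1600 of the 2000 responses (80%) go to the VI 30” alternative, which also earns 120 of the 150 reinforcers (80%).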
Problem: Conc VR 30 VR 120
ALL responses will ultimately be made to the VR 30
alternative. This is consistent with matching, but
same would be said if all the responses were made to
the VR 120 alternative. But melioration can predict
which alternative should receive all the responses:
VR 30: r1/ R1 = 1/30; VR 120: r2/ R2 = 1/120.
These cannot change, so shifting to the higher return
means all the responses will go to VR 30 alternative.
Operant Conditioning
MATCHING LAW
• Herrnstein: $R_1 / R_2 = r_1 / r_2$
• Baum: $R_1 / R_2 = b\,(r_1 / r_2)^{a}$
• Rachlin’s “Value”: $V_1 / V_2 = b\,(r_1 / r_2)^{a_1} (M_1 / M_2)^{a_2} (D_2 / D_1)^{a_3}$
FUNCTIONAL PROPERTIES AND CURVE
FITTING
• What is the “real” delay function?
$$V_t = \frac{V_0}{1 + Kt}$$
$$V_t = \frac{V_0}{(1 + Kt)^{s}}$$
$$V_t = \frac{V_0}{M + Kt^{s}}$$
$$V_t = \frac{V_0}{M + t^{s}}$$
$$V_t = V_0 \exp(-Mt)$$
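To see why the choice of delay function matters, the simple hyperbola and the exponential can be compared at a few delays. This is a sketch under stated assumptions: the function names and the parameter values (V0 = 100, K = M = 0.1) are hypothetical and chosen only for illustration.

```python
import math

def hyperbolic_value(v0, k, t):
    """Hyperbolic discounting: Vt = V0 / (1 + K * t)."""
    return v0 / (1 + k * t)

def exponential_value(v0, m, t):
    """Exponential discounting: Vt = V0 * exp(-M * t)."""
    return v0 * math.exp(-m * t)

# Hypothetical parameters, not fitted to any data.
v0, rate = 100.0, 0.1
for t in (0, 10, 50):
    hyp = hyperbolic_value(v0, rate, t)
    exp_v = exponential_value(v0, rate, t)
    print(t, round(hyp, 1), round(exp_v, 1))
```

With matched parameters the hyperbola keeps more value at long delays than the exponential does, which is the sort of difference curve fitting has to detect.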