Schedules of Reinforcement

Schedules of Reinforcement
- Types of Schedule
- Schedule Performance
- Analysis

What Is a Schedule of Reinforcement?
- A schedule of reinforcement arranges a contingency (relationship) between an operant and the delivery of a reinforcer.
- Continuous reinforcement (CRF): every response is reinforced.
- Partial or intermittent schedule: not every response is reinforced.
- Schedules may also specify relationships between reinforcement contingencies and discriminative stimuli.

Schedules of Reinforcement as Feedback Functions
- In a simple control system:
  - A perceptual input is compared to an internally specified reference state for that perception.
  - The difference between them creates an error signal, which drives an output.
  - The output affects the environment so as to push the perceptual input toward its reference value (negative feedback), thus closing the loop.
- A schedule of reinforcement specifies how the output will affect the perceptual input; it thus constitutes an environmental feedback function.

Control-System Diagram Showing Feedback Function
[Diagram: a comparator takes the reference and the perceptual input and produces an error signal; the output function turns the error into output (key pecks); the feedback function (the reinforcement schedule, e.g. FR-5), together with disturbances, turns output back into input (grain accesses), closing the loop.]
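
To make the loop concrete, here is a minimal simulation sketch of the diagram above. All names are illustrative rather than standard; the schedule of reinforcement appears as the feedback function converting output into input.

```python
# A minimal sketch of the closed loop described above; names are illustrative.
# The feedback function maps output (key pecks) to input (grain accesses);
# the comparator drives output from the error signal.

def run_control_loop(reference=10, ratio=5, max_steps=1000):
    grain = 0                # perceptual input: grain accesses obtained so far
    pecks_since_food = 0
    for _ in range(max_steps):
        error = reference - grain        # comparator: reference minus input
        if error <= 0:                   # input has reached its reference:
            break                        # no error, so no further output
        pecks_since_food += 1            # output function emits a key peck
        if pecks_since_food == ratio:    # feedback function: the FR-5 schedule
            grain += 1                   # environment delivers a grain access
            pecks_since_food = 0
    return grain

print(run_control_loop())  # -> 10: responding stops once the error is zero
```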

Basic Intermittent Schedules
- Schedules may be based on the number of responses, the passage of time, or both.
- Ratio schedules: reinforcement after a specified number of responses has been completed.
- Interval schedules: reinforcement follows the first response after the passage of a specified amount of time.
- Differential reinforcement schedules: reinforcement based on the spacing of responses in time or on response omission.

Fixed Ratio Schedules
- Reinforcement after a fixed number of responses has occurred.
- Symbolized FR-x, where x is the fixed-ratio requirement.
- Example: FR-5: five responses per reinforcement. Delivery of the reinforcer resets the response counter back to zero.
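
A minimal sketch of the FR contingency in code (class and method names are our own): the schedule is just a response counter that resets whenever the reinforcer is delivered.

```python
# A minimal sketch of the FR contingency; names are illustrative.

class FixedRatio:
    def __init__(self, ratio):
        self.ratio = ratio    # e.g., FixedRatio(5) for FR-5
        self.count = 0        # responses since the last reinforcer

    def record_response(self):
        """Return True if this response is reinforced."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0    # delivery resets the response counter to zero
            return True
        return False

fr5 = FixedRatio(5)
print([fr5.record_response() for _ in range(10)])
# -> every 5th entry is True: [False, False, False, False, True, ...]
```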

Responding on Fixed Ratio Schedules
- “Break-and-run” pattern:
  - High rate sustained during ratio completion.
  - Pause follows delivery of the reinforcer (the “post-reinforcement pause”).
  - Pause length increases with ratio size.
  - Too high a ratio leads to ratio strain.
[Figure: cumulative records for FR-200 and FR-120.]

Analysis of Performance on Fixed-Ratio Schedules
- On ratio schedules, the rate of reinforcement depends directly on the rate of responding.
- Because reinforcement is more effective at short delays than at long ones, high response rates are more strongly reinforced than low ones. This positive feedback loop drives the rate during the “ratio run” toward the maximum.
- After reinforcement, alternative activities compete for their own sources of potential reinforcement. Returning to the lever is only weakly reinforced, owing to the delay that the ratio requirement imposes between returning to the lever and reinforcer delivery. This explains the post-reinforcement pause, and why the pause is longer at higher ratios.
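
One way to see the direct dependence (the numbers here are illustrative, not from the source): on FR-n the obtained reinforcement rate r is tied to the response rate b by r = b / n. On FR-200, a rat responding 60 times per minute earns 60 / 200 = 0.3 reinforcers per minute; doubling its response rate to 120 per minute doubles the payoff to 0.6. This linear feedback function is what makes the loop a positive one.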

Variable Ratio Schedules
- Reinforcement after completion of a variable number of responses; the schedule is specified by the average number of responses per reinforcement.
- Symbolized VR-x, where x is the average ratio requirement.
- Example: VR-20: an average of 20 responses is required before the reinforcer is delivered, but the actual ratio varies unpredictably after each reinforcement.
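
A minimal sketch of the VR contingency (the uniform sampling rule is our simplification; actual VR schedules often cycle through a preset list of ratios): after each reinforcer, a new requirement is drawn so the ratio varies unpredictably while averaging x.

```python
import random

# A sketch of VR-20; names and the sampling rule are illustrative.

class VariableRatio:
    def __init__(self, mean_ratio, seed=0):
        self.mean = mean_ratio
        self.rng = random.Random(seed)
        self.remaining = self._draw()

    def _draw(self):
        # Uniform on 1..(2*mean - 1) averages `mean` responses per reinforcer.
        return self.rng.randint(1, 2 * self.mean - 1)

    def record_response(self):
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self._draw()   # new, unpredictable requirement
            return True
        return False

vr20 = VariableRatio(20)
wins = sum(vr20.record_response() for _ in range(20_000))
print(wins)  # close to 1,000: about one reinforcer per 20 responses
```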

Performance on Variable Ratio Schedules
- Variable ratio schedules maintain a relatively high, steady rate of responding.
- Little or no evidence of post-reinforcement pausing.
- Too high a ratio produces ratio strain.
[Figure: cumulative record for VR-173.]

Analysis of Performance on Variable-Ratio Schedules
- On ratio schedules, the rate of reinforcement depends directly on the rate of responding. This positive feedback loop drives responding upward toward the maximum.
- Because VR schedules occasionally present very short ratios, returning to the lever is sometimes reinforced almost immediately, allowing this behavior to compete effectively against other behaviors. This explains the lack of a post-reinforcement pause.
- The lack of pauses also rules out fatigue as an explanation for post-reinforcement pauses on FR schedules.

Fixed Interval Schedules
- Reinforcement follows the first response to occur after a specified fixed interval has elapsed.
- Delivery of the reinforcer resets the interval timer to zero.
- Symbolized FI x, where x is the interval size in seconds or minutes.
- Examples: FI 30-s, FI 1-m.
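
A minimal sketch of the FI contingency, using a simulated clock (names are illustrative): the first response after the interval has elapsed is reinforced, and delivery resets the timer.

```python
# A sketch of FI 30-s with a simulated clock; names are illustrative.

class FixedInterval:
    def __init__(self, interval_s):
        self.interval = interval_s
        self.start = 0.0             # time at which the current interval began

    def record_response(self, now):
        """Reinforce the first response after `interval` seconds have passed."""
        if now - self.start >= self.interval:
            self.start = now         # delivery resets the interval timer
            return True
        return False

fi30 = FixedInterval(30)
print(fi30.record_response(now=10.0))   # False: interval not yet over
print(fi30.record_response(now=31.0))   # True: first response after 30 s
print(fi30.record_response(now=45.0))   # False: the new interval has begun
```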

Performance on Fixed Interval Schedules
- A low or zero rate begins to increase perhaps halfway into the interval and accelerates toward the end.
- This pattern is called the fixed-interval scallop because of its scalloped or fluted appearance.
[Figure: cumulative record for FI 4-m.]

Analysis of Performance on Fixed Interval Schedules
- It is evident that rats can time the interval; therefore there must be internally generated stimuli that correlate with different times since reinforcement. These may function as discriminative stimuli; those occurring near the end of the interval are more closely associated with reinforcement and therefore generate higher rates of responding.
- Skinner demonstrated that an external clock stimulus would produce an inverted scallop if the clock were run backwards, supporting this analysis.
- Stable performance may represent a tradeoff between responding too soon (wasted effort) and responding too late (reduced rate of reinforcement).

Variable Interval Schedules
- The reinforcer is delivered immediately following the first response to occur after the current interval is over.
- Interval size varies unpredictably after each reinforcer delivery; the schedule is specified by the average interval size.
- Symbolized VI x, where x is the average interval length in seconds or minutes.
- Examples: VI 20-s, VI 3-m.
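
A minimal sketch of the VI contingency (the exponential sampling is our choice; it yields the constant-probability property discussed in the analysis below): the interval length is redrawn after each reinforcer.

```python
import random

# A sketch of VI 20-s; names and the exponential sampling are illustrative.

class VariableInterval:
    def __init__(self, mean_s, seed=0):
        self.mean = mean_s
        self.rng = random.Random(seed)
        self.start = 0.0
        self.current = self._draw()

    def _draw(self):
        # Exponentially distributed intervals averaging `mean` seconds.
        return self.rng.expovariate(1.0 / self.mean)

    def record_response(self, now):
        if now - self.start >= self.current:   # a reinforcer was set up
            self.start = now
            self.current = self._draw()        # next interval is unpredictable
            return True
        return False

vi20 = VariableInterval(20)
# Reinforced only once the randomly drawn interval has elapsed:
print(vi20.record_response(now=100.0))  # very likely True after 100 s
```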

Performance on Variable Interval Schedules
- VI schedules tend to sustain a relatively moderate but steady rate of responding.
- As the average interval size increases, the response rate decreases.
- VI schedules sustain lower rates than VR schedules that produce the same rate of reinforcement.
[Figure: cumulative record for VI 3-m.]

Analysis of Performance on Variable-Interval Schedules
- On variable-interval schedules, the probability that a response will be immediately followed by reinforcer delivery is roughly constant.
- Because the currently timed interval is sometimes very short, returning to the lever is occasionally followed almost immediately by reinforcer delivery, encouraging an immediate return to responding (no pauses or scallops).
- Reinforcement rate is nearly independent of response rate above some minimum. Because the only way to find out whether the schedule has “set up” a reinforcer is to respond, this encourages a moderate, steady rate of responding (see the simulation below).
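
Here is a quick illustrative simulation (ours, not from the source) of that rate-independence: doubling the response rate on a VI 20-s schedule barely changes the number of reinforcers earned.

```python
import random

# Illustrative check that VI payoff is nearly independent of response rate.

def vi_reinforcers(mean_interval, response_period, total_time=100_000, seed=1):
    rng = random.Random(seed)
    setup_at = rng.expovariate(1 / mean_interval)  # when a reinforcer is set up
    earned, t = 0, 0.0
    while t < total_time:
        t += response_period                       # time of the next response
        if t >= setup_at:                          # the reinforcer was waiting
            earned += 1
            setup_at = t + rng.expovariate(1 / mean_interval)
    return earned

print(vi_reinforcers(20, response_period=2.0))  # responding every 2 s
print(vi_reinforcers(20, response_period=1.0))  # twice the rate, similar payoff
```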

Differential Reinforcement Schedules
- Differential reinforcement of high rates (DRH): a response is reinforced if it occurs within x seconds of the previous response.
- Differential reinforcement of low rates (DRL): a response is reinforced if it occurs no sooner than x seconds after the previous response.
- Differential reinforcement of other behavior (DRO): a reinforcer is delivered after x seconds, but only if no response has occurred. Also called an omission schedule.
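
These three contingencies reduce to simple predicates on the inter-response time (IRT), anticipating the analysis below; the function names are our own.

```python
# Illustrative predicates for the three differential schedules;
# `irt` is the time since the previous response, in seconds.

def drh_reinforced(irt, x):
    """DRH: reinforce a response occurring within x s of the previous one."""
    return irt <= x

def drl_reinforced(irt, x):
    """DRL: reinforce a response occurring no sooner than x s after the last."""
    return irt >= x

def dro_reinforced(time_without_response, x):
    """DRO (omission): deliver the reinforcer after x s with no response."""
    return time_without_response >= x

print(drh_reinforced(irt=1.5, x=2.0))  # True: fast responding pays on DRH
print(drl_reinforced(irt=1.5, x=2.0))  # False: too soon under DRL
```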

Analysis of Differential Reinforcement Performance
- Inter-response times (IRTs) vary from response to response.
- Only those IRTs meeting the schedule requirement are followed immediately by reinforcement.
- These IRTs become more probable and other IRTs relatively less probable, so performance changes in the direction required by the schedule: short IRTs for DRH, long IRTs for DRL, infinite IRTs (no responding) for DRO.

Applications
- The simple schedules just described were not intended to model “real-world” contingencies, but rather to explore how schedule properties affect the patterning of behavior.
- Nevertheless, we can draw some parallels between these laboratory schedules and reinforcement contingencies found outside the laboratory. Some examples follow.

Piece-work
- Some jobs pay so much per item produced.
- The pay rate depends directly on the rate of production of pieces; thus this is a ratio schedule.
- Where each item requires several steps to complete, breaks will tend to occur after an item is completed rather than in the middle of assembly: a post-reinforcement pause.

Gambling
- The slot machine is an excellent example.
- Each response (put money in the slot, pull the lever) brings you closer to a pay-off.
- The faster you play, the sooner you win.
- How many responses you will have to make before a pay-off varies unpredictably after each win.
- It’s a variable-ratio schedule!
- And what do we know about VR schedules? They generate a high, steady rate of play. Just what the “house” wants!

Waiting for Company to Arrive
- Your guests are expected at 8 pm.
- As the time approaches, you glance out the front window to see if anyone has pulled into your driveway.
- The closer it gets to 8 pm, the more often you glance.
- If your guests are punctual, the first glance after 8 pm will find them in the drive.
- It’s (approximately) a fixed-interval schedule, complete with the FI scallop.

Calling the Plumber
- One of the water pipes at your house has sprung a leak, and you are desperate to get your plumber out to fix it.
- He doesn’t have an answering machine, so you call, call, call, call, call. Finally, he answers the phone.
- It’s not how often you called that counted, but whether enough time had passed for him to be back from wherever he went. But you had no idea how long that would be.
- It’s a variable-interval schedule, and it generated a moderate, steady rate of calling.