
Theories of Reinforcement
A list of known reinforcers would be endless
The main issue: why are certain events effective as reinforcers?
Two broad categories of answers:
1) Stimulus-intrinsic theories
2) Response-intrinsic theories
1) Stimulus-intrinsic theories
a) Thorndike’s Law of Effect
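-circular: a reinforcer is identified only by its effect on behavior, which is the very thing it is meant to explain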
b) Hull’s Drive-reduction theory
-similar to Pavlov’s biological strength
-includes the concept of homeostatic imbalance (drive) when deprived
-evidence not supportive:
-many reinforcers do not appear to reduce drive (saccharin, opportunity to mate)
-some reinforcers not even tied to a physical stimulus (opportunity to play, novel environment, see a movie)
-could postulate new drives, but now that list is infinite as well
c) Brain Stimulation theory
Olds & Milner (1954) accidentally discovered “pleasure centres”
-very powerful reinforcer: animals will work until they drop from exhaustion
-maybe this is what all reinforcers have in common….
Problems with this notion:
-must prime subjects, even well-trained ones
-extinction curve too sharp
-could argue that the differences are due to the strength of activation or the pathways stimulated
-lots of work still being carried out using ESB as reinforcer
2) Response-intrinsic theories
-general view is that it is not the reinforcer itself, but the behavior one engages in with the reinforcer, that is reinforcing (e.g. a car: driving it is what reinforces)
-homeostasis as a behavioral state
a) Premack Principle
-prior to Premack, responses were classified as instrumental responses (R) or reinforcers (Rfer) based on their “instrumental” or “consummatory” nature
-Premack rejected this distinction, arguing instead that responses differ only in their baseline frequencies/durations of occurrence
-given two responses arranged in an operant conditioning procedure, the more probable response will reinforce the less probable response, not the other way around
-reinforcing ability is measured by an increase in the response in question
-e.g. eating reinforces bar-pressing because, if unconstrained, a hungry rat is more likely to eat than to press the bar
-measure baseline engagement time, can then decide what will reinforce what
e.g.: if unconstrained, a rat will spend 70% of its time running, 10% drinking
-so running can be used as a reinforcer for drinking behavior
-contingency: drink -> run (the rat must drink to gain access to running)
-see drinking behavior go up
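The decision rule above (measure baselines, then let the more probable response reinforce the less probable one) can be sketched in a few lines of Python. This is only an illustration: the function name `premack_reinforcers` and the dictionary layout are hypothetical, and the baseline figures are the 70%/10% values from the example.

```python
def premack_reinforcers(baseline):
    """Return (reinforcer, reinforced) pairs predicted by the Premack principle.

    `baseline` maps each activity to the proportion of unconstrained time
    spent on it. Hypothetical helper, for illustration only.
    """
    pairs = []
    for activity_a, p_a in baseline.items():
        for activity_b, p_b in baseline.items():
            # The more probable response reinforces the less probable one.
            if p_a > p_b:
                pairs.append((activity_a, activity_b))
    return pairs


# Baseline from the notes: 70% of time running, 10% drinking.
rat_baseline = {"running": 0.70, "drinking": 0.10}
print(premack_reinforcers(rat_baseline))  # [('running', 'drinking')]
```

The pair reads "running can reinforce drinking"; the reverse pair is absent because, at these baselines, drinking is the less probable response.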
Example with humans:
-children offered opportunity to either eat candy or play pinball
-some preferred one, some preferred the other
-then set up a contingency: in one condition, children had to eat a certain amount of candy to earn the opportunity to play pinball
-children who had a high baseline preference for pinball increased the amount of candy they ate
-then, reversed the contingency: children now had to play pinball for a certain amount of time to receive candy
-now, children who had a high baseline preference for eating candy increased their pinball-playing
-in theory, deprivation could turn any activity into a reinforcer, provided access is held below baseline for long enough (bread pudding example)
-Premack principle very useful in applied settings
-punishment, deprivation not allowed in schools
-traditional consummatory reinforcers not reinforcing to many clients
-with Premack principle, simply use a more-probable behavior to reinforce a less-probable one
-children given opportunity to run and shout after sitting quietly for a specified amount of time
Problems with Premack:
-time spent on a behavior is sometimes a fuzzy measure of its value
-some behaviors don’t take much time, but are highly valued; other behaviors take lots of time without much value
Response deprivation: addendum to Premack
-responses will increase in “value” if the organism is deprived of the opportunity to engage in them
e.g.: recall the original drinking/running example (running reinforces drinking behavior)
-can reverse the contingency with deprivation: water-deprived rats will run more than baseline in order to have the opportunity to drink (run -> drink)
-any time you carry out an instrumental experiment, you are necessarily depriving organisms of some reinforcers until they perform the appropriate behavior
-these deprived responses will act as reinforcers only when the schedule holds access to the activity below its baseline level
-for some very low-level behaviors, this can take a while, since baseline is virtually zero to begin with (bread pudding example)
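The "below baseline" condition can be checked numerically. The sketch below uses the standard response-deprivation inequality, which these notes do not spell out: a schedule that requires I units of the instrumental response for C units of the contingent response deprives the contingent response when I/C exceeds the ratio of their baselines. The function and parameter names are hypothetical, and the schedule values are made up for illustration; only the 70/10 baseline comes from the earlier example.

```python
def is_response_deprived(instrumental_req, contingent_access,
                         baseline_instrumental, baseline_contingent):
    """True if the schedule holds the contingent response below its baseline.

    Condition: I / C > O_i / O_c, written with cross-multiplication so that
    a zero baseline or zero access does not cause a division error.
    """
    return (instrumental_req * baseline_contingent
            > contingent_access * baseline_instrumental)


# Baselines from the earlier example: 70 units running, 10 units drinking.
# Schedule A (illustrative): run 8 units to earn 1 unit of drinking.
print(is_response_deprived(8, 1, 70, 10))  # True: drinking can reinforce running
# Schedule B (illustrative): run 5 units to earn 1 unit of drinking.
print(is_response_deprived(5, 1, 70, 10))  # False: drinking is not deprived
```

Under schedule B a rat running at its baseline rate would already earn more drinking than it takes when unconstrained, so the contingency gives it no reason to run more.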
-all of the above is formally outlined in:
Bliss-Point theory
It is a highly comprehensive mathematical treatment
Postulates that there is nothing intrinsic to the reinforcer that provides reinforcement.
Rather, an instrumental contingency causes a restructuring of activities in the client.
In an unconstrained situation: “behavioral bliss” achievable
[Figure: bliss-point graph with axes for running and drinking]
-bliss point, e.g.: run 15 sec, drink 15 sec
-schedule line, e.g.: drink 10 sec for 5 sec of running
-schedule line, e.g.: run 10 sec for 5 sec of drinking
-must run more than they want to in order to achieve drinking ‘bliss’
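The restructuring of activities can be made concrete with the figure's numbers. The sketch below assumes a minimum-deviation rule, which is one common way to formalize bliss-point theory but is not stated in these notes: the organism settles on the point of the schedule line that lies closest (in straight-line distance) to the bliss point. The function and parameter names are hypothetical.

```python
def closest_point_on_schedule(bliss_run, bliss_drink,
                              run_per_cycle, drink_per_cycle):
    """Point on a ratio-schedule line nearest to the bliss point.

    The schedule is modeled as a line through the origin:
    drink = k * run, with k = drink_per_cycle / run_per_cycle.
    Assumes a Euclidean minimum-deviation rule (an assumption, see lead-in).
    """
    k = drink_per_cycle / run_per_cycle
    # Orthogonal projection of the bliss point onto the line drink = k * run.
    run = (bliss_run + k * bliss_drink) / (1 + k * k)
    return run, k * run


# Bliss point from the figure: run 15 sec, drink 15 sec.
# Schedule: run 10 sec for 5 sec of drinking.
print(closest_point_on_schedule(15, 15, 10, 5))  # (18.0, 9.0)
# Schedule: drink 10 sec for 5 sec of running.
print(closest_point_on_schedule(15, 15, 5, 10))  # (9.0, 18.0)
```

Under the first schedule the predicted running time (18 sec) is above the 15 sec bliss level while drinking (9 sec) stays below it, which matches the note that the animal must run more than it wants to in order to approach drinking ‘bliss’.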