Operant Conditioning

Modules 20 – 22
Learning Theory
Introduction
Learning: relatively permanent
changes in behavior due to experience
 Measured objectively (i.e., behavior
must be observable and recordable)
 Behaviorist perspective

Introduction—How do we
learn?

Conditioning: process by which
associations are learned
– Two types: Classical conditioning and operant
conditioning
– Classical Conditioning: two stimuli are
associated to produce behavior
– Operant conditioning: consequence is
associated with the behavior

Observational Learning: learn by
watching others’ behaviors
Classical Conditioning:
Introduction
Ivan Pavlov
 Russian physiologist
 Studied digestion in dogs
 Discovered dogs were salivating in
response to experimenter’s footsteps
in anticipation of food
 Called these “psychic secretions”

Classical Conditioning:
Introduction




Classical conditioning: type of learning in
which one stimulus is associated with
another
Learning occurs through repeated pairings
of neutral stimulus (footsteps) with natural
stimulus (food)
Most basic form of learning
Also called Pavlovian Conditioning or
Respondent Conditioning
Classical Conditioning:
Important Terms




Unconditioned stimulus (UCS): stimulus
that naturally triggers response without prior learning
Unconditioned response (UCR): unlearned
or natural response to UCS (reflex)
Conditioned stimulus (CS): neutral stimulus
that comes to elicit (cause) conditioned
response
Conditioned response (CR): learned
response to previously neutral stimulus (CS)
Classical Conditioning:
Paradigm
Before Conditioning
– UCS → UCR
– CS → No response
 During Conditioning
– CS + UCS → UCR
 After Conditioning
– CS → CR

Classical Conditioning:
Paradigm with example


Before Conditioning
– Food (UCS) → Salivation (UCR)
– Bell (CS) → No response
During Conditioning
– Bell (CS) + Food (UCS) → Salivation (UCR)

After Conditioning
– Bell (CS) → Salivation (CR)

Video Clip: http://www.youtube.com/watch?v=CpoLxEN54ho
Classical Conditioning:
Examples
 Fears
and phobias
 Food aversions (one-trial
learning)
 Dentist’s drill
 Police sirens and lights
 Others???
Find the UCS, UCR, CS, CR in the following:
– The door to your house squeaks loudly
when you open it. Soon, your dog begins
wagging its tail when the door squeaks.
– The nurse says, “This won’t hurt a bit,” just
before stabbing you with a needle. The
next time you hear “This won’t hurt,” you
cringe in fear.
– You have a meal at a fast food restaurant
that causes food poisoning. The next time
you see a sign for that restaurant, you feel
nauseated.
Classical Conditioning:
Types (in order of best learning)




Delayed conditioning: CS precedes and
overlaps presentation of UCS
Simultaneous conditioning: CS and UCS
presented at same time (begin and end
simultaneously)
Trace conditioning: CS presented and
stops with gap before presentation of UCS
Backward conditioning: UCS presented
before CS
Basic Principles of Learning
Acquisition = how is it learned
Extinction = how is it “forgotten”
Generalization = when is response also given to similar stimuli
Discrimination = when is it given only in specific situations

Classical Conditioning:
Acquisition
 Acquisition:
how is beh learned
– Conditioning occurs because of
repeated pairings of CS and UCS
– Learn association between CS and UCS
 Learning
curve increases rapidly
and then levels off
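
A learning curve that rises quickly and then levels off can be sketched with a simple update rule. This is only an illustration, not a model taken from these slides: the function name `acquisition_curve`, the `learning_rate` of 0.3, and the ceiling `max_strength` are all assumptions chosen for the example. Each CS + UCS pairing closes a fixed fraction of the remaining gap to the ceiling, so early pairings produce big gains and later pairings produce small ones.

```python
def acquisition_curve(trials, learning_rate=0.3, max_strength=1.0):
    """Return the CS-UCS association strength after each pairing.

    Hypothetical sketch: learning_rate and max_strength are illustrative
    values, not figures from the lecture.
    """
    strength = 0.0
    curve = []
    for _ in range(trials):
        # Each pairing closes a fixed fraction of the remaining gap,
        # so gains are large at first and shrink as the ceiling nears.
        strength += learning_rate * (max_strength - strength)
        curve.append(strength)
    return curve


curve = acquisition_curve(10)
# Gain on each trial: the first entry is the gain from zero.
gains = [curve[0]] + [later - earlier for earlier, later in zip(curve, curve[1:])]
```

Printing `gains` shows each trial adding less than the one before it, which is the "levels off" part of the curve.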
Classical Conditioning:
Factors that affect Acquisition

Order and timing of CS and UCS
– Most important—critical for learning
– Delayed conditioning is best
– CS seems to signal UCS but needs to overlap
to be associated


Intensity of CS and UCS (food aversions)
How connected are CS and UCS → how
well does CS predict UCS
Classical Conditioning:
Extinction
Extinction: weakening of learned response
when CS is repeatedly presented without UCS
 Gradual process
 Does not erase what is learned
 Spontaneous recovery: re-emergence

of extinguished response after period of
time away (CR is not as intense)
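
The gradual nature of extinction, and the weaker CR seen in spontaneous recovery, can be sketched in the same style. This is a rough illustration under stated assumptions: the decay `rate` of 0.3 and the `recovery_fraction` of 0.5 are made-up values, and both function names are hypothetical. The only points taken from the slide are that the response fades gradually and that the recovered CR is less intense than the original.

```python
def extinction_curve(start_strength, trials, rate=0.3):
    """CR strength after each CS-alone trial (UCS withheld).

    Hypothetical sketch: `rate` is an illustrative value.
    """
    strength = start_strength
    curve = []
    for _ in range(trials):
        # Each unreinforced trial removes a fraction of what is left,
        # so the response fades gradually rather than all at once.
        strength -= rate * strength
        curve.append(strength)
    return curve


def spontaneous_recovery(original_strength, recovery_fraction=0.5):
    """CR strength after a rest period: it returns, but weaker.

    recovery_fraction is an assumption made for illustration only.
    """
    return recovery_fraction * original_strength


original = 1.0
extinction = extinction_curve(original, trials=10)
recovered = spontaneous_recovery(original)
```

After ten CS-alone trials the response is close to zero, yet `recovered` sits between that value and the original strength.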
Classical Conditioning:
Generalization



Generalization: tendency to respond to
stimuli that are similar to CS
In Pavlov’s experiment, dog would
salivate to different tones
Other examples:
– Food aversions  start with one type of
seafood and are associated with others
– Phobias
– Others ???
Classical Conditioning:
Discrimination



Discrimination: tendency not to respond
to similar stimuli, but only to original CS
In Pavlov’s experiments, dog was trained
to salivate only to certain tone.
Examples
– Food aversions  in some cases, may only
respond to fish but will eat shellfish
– Others ???
Classical Conditioning:
Higher-Order Learning
Can CS become UCS?
 Yes
 Higher-order conditioning: by
pairing learned CS with new stimulus,
the original CS acts as the UCS
 Example – dog salivates to bell and
then bell is paired with light

Classical Conditioning:
Applications





Phobias: extreme fear of specific stimulus
John Watson’s research (Little Albert)
Wanted to demonstrate behavioral
explanation for phobias
Created phobia in Little Albert
Ways to treat phobias have been
developed using the principles of
Classical Conditioning
Classical Conditioning:
Systematic Desensitization



Systematic Desensitization: decreases
phobic response by substituting an
incompatible response
Works by re-conditioning/re-learning
Process:
– Client creates hierarchy of fear-producing stimuli
– Learns progressive muscle relaxation
– Begins with lowest stimuli on hierarchy and tries
to substitute relaxation
– Continue up hierarchy until actually dealing with
stimulus
Classical Conditioning:
Flooding

Flooding: fear-producing stimuli
presented continuously until fear
response diminishes and is extinguished
– Uses principle of extinction to treat
phobia
Classical Conditioning:
Other applications


Advertisements
Social attitudes
Classical Conditioning
Video Links





http://www.youtube.com/watch?v=TJ3dLm2j5uk
http://www.youtube.com/watch?v=e1g3y0SRbVc
Frasier:
http://www.youtube.com/watch?v=2c4_l2oe22U
The Office: http://vimeo.com/35754924
Dog training:
http://www.youtube.com/watch?v=tPAnp6Oxc6E
New Major Topic:
Operant Conditioning


Classical Conditioning involved
learning through association of neutral
stimulus with a stimulus that caused a
reflexive response.
Operant Conditioning involves
learning through the connection of a
consequence with a behavior.
Operant Conditioning:
EL Thorndike and Trial-and-Error
Learning

Research
– Placed cat in “puzzle box”
– Cat needed to hit lever to open door to get food
– With successive trials, cat would hit lever sooner


Law of effect: beh followed by satisfying
outcome is stamped in or repeated, while
behaviors followed by negative or no
outcome are extinguished
Video clip
Operant Conditioning:
Introduction—BF Skinner

B.F. Skinner
– Behaviorist
– Major books: Beyond Freedom and
Dignity and Walden Two
– Skinner box: structured environment that
allowed for control of response and
outcome
Operant Conditioning:
Introduction—Definition


Operant conditioning: process by
which organism learns to behave in
ways that produce desirable outcomes
Other ways to say this:
– Learning to behave because of
effects/results of beh
– Beh influenced by consequences
Operant Conditioning:
Paradigm
S + R → R+

S = stimulus
– Something that signals that reinforcement is
likely if you respond

R = response
– Specific behavior

R+ = reinforcement
– Consequence of beh that increases likelihood
that beh is repeated
Operant Conditioning:
Reinforcement
 Reinforcement:
anything that
increases likelihood that beh
will be repeated
– Primary and secondary
reinforcement (more later)
– Positive and negative reinforcement
(more later)
Operant Conditioning:
Reinforcement (cont’d)

Primary versus secondary
reinforcement
– Primary reinforcement: anything that is
naturally reinforcing or automatically
reduces drive or need (e.g., food,
warmth, attention)
– Secondary reinforcement: anything that
has acquired ability to be reinforcing
(e.g., money, stickers, etc.)
Operant Conditioning:
Reinforcement (cont’d)

Positive and negative
reinforcement
– Positive reinforcement: addition of
a stimulus, which increases likelihood beh
is repeated
– Negative reinforcement: removal of
an aversive stimulus, which increases likelihood beh
is repeated
Operant Conditioning:
Punishment

Punishment: any consequence that
decreases likelihood that beh is
repeated
– Positive punishment: addition of stimulus
to decrease behavior
– Negative punishment: removal of
stimulus to decrease behavior
Operant Conditioning:
Reinforcement and Punishment
Reviewed
                                 Increases behavior        Decreases behavior
Add stimulus to situation        Positive reinforcement    Positive punishment
Remove stimulus from situation   Negative reinforcement    Negative punishment
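
The four consequence types defined above come down to two questions: is a stimulus added or removed, and does the behavior increase or decrease? A minimal sketch of that lookup follows. The function name `classify_consequence` and the string labels `"add"`, `"remove"`, `"increase"`, and `"decrease"` are hypothetical, chosen only for this example.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the consequence type from two yes/no style questions.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    behavior_effect: "increase" or "decrease" (what happens to the behavior)
    """
    table = {
        ("add", "increase"): "positive reinforcement",
        ("remove", "increase"): "negative reinforcement",
        ("add", "decrease"): "positive punishment",
        ("remove", "decrease"): "negative punishment",
    }
    return table[(stimulus_change, behavior_effect)]


# Parents stop yelling after you apologize, so you apologize more often:
# a stimulus is removed and the behavior increases.
example = classify_consequence("remove", "increase")
```

Note that "positive" and "negative" describe adding or removing a stimulus, not whether the outcome is pleasant.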
Operant Conditioning:
Avoidance and Escape Learning

Escape learning: When our response
to aversive stimulus (something we
don’t like) removes that stimulus
– Example → parents yelling at you b/c you
came home late
– You apologize and say it will not happen
again
– So, they STOP yelling at you
Operant Conditioning:
Avoidance and Escape Learning (cont’d)

Avoidance learning: when our
response prevents aversive stimulus
(consequence) from occurring
– Example → You come home late.
– You apologize to your parents BEFORE
they begin to yell at you.
– They do not yell at you
Operant Conditioning:
Schedules of reinforcement

Introduction
– How often beh is reinforced has influence
– Discovered by accident out of necessity
– Financial concerns required Skinner not to reinforce every
behavior
– Led to hypothesis concerning the impact of altering how
often behavior is reinforced
– Continuous reinforcement: reward given for beh every
single time
– Partial (or intermittent) reinforcement: reward given part
of the time

Two ways to vary how often
– According to number of responses (ratio)
– According to when response occurs (interval)
Operant Conditioning:
Schedules of reinforcement (cont’d)

Fixed ratio: reinforcement given after set #
of responses
– Response-to-reinf ratio remains constant
– Tend to see burst of responses until reinforced,
then see pause in response rate
– Examples → CD clubs, frequent flyer miles

Variable ratio: reinforcement given after
varying/changing # of responses
– Constant high rate of response (WHY?)
– Examples → slot machine
Operant Conditioning:
Schedules of reinforcement (cont’d)

Fixed interval: reinf. given for first response given
after set time period
– “Wait for it.”
– Produces slow, scalloped response pattern
– Learn that certain period of time must pass
– Examples → Tests on every Friday

Variable interval: reinf. given for first response after
varying period of time
– Slow but steady response patterns
– Examples → pop quizzes

Video clip
Operant Conditioning:
Schedules of Reinforcement
                      Set               Changing
Number of responses   Fixed ratio       Variable ratio
Time                  Fixed interval    Variable interval
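
The four schedules described above can be sketched as small decision rules that answer one question per response: is this response reinforced? This is an illustrative sketch, not a description of Skinner's apparatus. All function names are hypothetical, and drawing the "varying" counts and waits from a uniform range around the mean is an assumption made for the example.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(time=None):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after a changing number of responses that averages mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumed uniform spread

    def respond(time=None):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(period):
    """Reinforce the first response made once `period` time units have passed."""
    last_reward = 0.0

    def respond(time):
        nonlocal last_reward
        if time - last_reward >= period:
            last_reward = time
            return True
        return False

    return respond


def variable_interval(mean_period, rng):
    """Reinforce the first response after a changing wait averaging mean_period."""
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_period)  # assumed uniform spread

    def respond(time):
        nonlocal last_reward, wait
        if time - last_reward >= wait:
            last_reward = time
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return respond


# Rat gets food every third lever press (fixed ratio).
fr = fixed_ratio(3)
fr_results = [fr() for _ in range(6)]

# Reward available every 5 time units; responses at times 1, 4, 5, 6, 9, 10.
fi = fixed_interval(5)
fi_results = [fi(t) for t in (1, 4, 5, 6, 9, 10)]
```

On the ratio schedules only the number of responses matters, so responding faster earns reinforcement sooner. On the interval schedules extra responses before the time has passed earn nothing, which fits the slower response patterns noted above.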
Identify the schedule of reinforcement–
Fixed Ratio, Variable Ratio, Fixed Interval, or
Variable Interval
1. Rat gets food every third time it presses the lever
2. Getting paid weekly no matter how much work is done
3. Getting paid for every ten boxes you make
4. Hitting a jackpot sometimes on the slot machine
5. Winning sometimes on the lottery you play once a day
6. Checking cell phone all day; sometimes getting a text
7. Buy eight pizzas, get the next one free
8. Fundraiser averages one donation for every eight houses visited
9. Kid has tantrum, parents sometimes give in
10. Repeatedly checking mail until paycheck arrives

Answers: 1. FR  2. FI  3. FR  4. VR  5. VI/VR  6. VI  7. FR  8. VR  9. VR  10. FI
Operant Conditioning:
Principles of Learning
 Acquisition
–Shaping: reinforce successive
approximations of desired beh
 Reinforce
initially for getting close
 Video clips
– Teaching pigeon to play ping pong
– Dog agility training
– Fred
Operant Conditioning:
Principles of Learning

Acquisition (cont’d)
– Effect of schedule of reinforcement on
acquisition


Quickest learning → continuous reinforcement (every
beh)
Strongest overall response → variable (partial
reinforcement) schedules
– Reinforcement versus punishment


Reinforcement works best
Reinf demonstrates correct response
Operant Conditioning:
Principles of Learning (cont’d)

Extinction: elimination of learned response
b/c it is no longer reinforced
– Extinction happens most quickly in fixed ratio
schedule of reinf
– Extinction less likely with variable schedules.
WHY?
– Partial reinforcement is best to avoid extinction
– Spontaneous recovery: return of extinguished
response after rest period (you never forget how
to ride a bike)
Operant Conditioning:
Principles of Learning (cont’d)

Generalization: learning to respond to
similar stimuli
– Example → studying in Psych leads to good
grades, so now you study in other classes

Discrimination: learning to respond
differently to similar stimuli
– Example → how you act in one class versus
another
Operant Conditioning:
Applications

Behavior Modification/Behavior change
– Behavior modification: use of operant conditioning
principles to change or modify beh
– Token economy
– Examples → Villa Maria’s behavior mod program
– Video clip: Big Bang Theory
– Video Clip: Cheers Shock Therapy

Depression
– Martin Seligman’s research
– Learned helplessness: ind learns that response is not
connected to outcome
– So, they stop responding


Superstitions
Others
– Video
Operant Conditioning
Videos



Intro: http://www.youtube.com/watch?v=B8vIbuoktew
Shaping: http://www.youtube.com/watch?v=OCUWHP4YDgU
Schedules of Reinforcement:
http://www.youtube.com/watch?v=i0ad2NSwGb0
Contrasting Types of Conditioning

Basic Idea
– Classical Conditioning: associating events/stimuli with each other; organism associates events
– Operant Conditioning: associating chosen behaviors with resulting events

Response
– Classical Conditioning: involuntary, automatic reactions such as salivating
– Operant Conditioning: voluntary actions “operating” on our environment

Acquisition
– Classical Conditioning: NS linked to US by repeatedly presenting NS before US
– Operant Conditioning: behavior is associated with punishment or reinforcement

Extinction
– Classical Conditioning: CR decreases when CS is repeatedly presented alone
– Operant Conditioning: target behavior decreases when reinforcement stops

Spontaneous Recovery
– Classical Conditioning: extinguished CR starts again after a rest period (no CS)
– Operant Conditioning: extinguished response starts again after a rest (no reward)

Generalization
– Classical Conditioning: CR is triggered by stimuli similar to the CS
– Operant Conditioning: response behavior similar to the reinforced behavior

Discrimination
– Classical Conditioning: distinguishing between a CS and an NS not linked to the US
– Operant Conditioning: distinguishing what will get reinforced and what will not
New Major Topic:
Cognitive Factors in Learning

Classical Conditioning and Operant
Conditioning => ind must experience
conditioning directly

Social Learning Theory (subtopic)

Cognitive Maps (subtopic)
Social Learning Theory:
Introduction






Albert Bandura
Bobo Doll studies
Children observed live model hitting bobo
clown doll
After observing this, they were given
opportunity to play in the room with bobo
doll
Children engaged in similar behavior
Even when they had witnessed aggression
against a live clown
Social Learning Theory
 Observational
Learning: occurs
when individual’s beh changes
after viewing another ind
engage in specific beh
Social Learning Theory:
Four Important Processes

Attention (first)
– Must pay attention to beh when it is modeled
– Characteristics of model are important



Similar in age, gender, race, etc.
Also if considered prestigious, competent, etc.
Retention (second)
– Must remember behavior
– Involves use of imagery and language
Social Learning Theory:
Four Important Processes

Reproduction (third)
– Must be capable (intellectually and
physically) of reproducing beh
– Our ability to imitate improves with
practice → even when just imagining
ourselves engaging in beh
Social Learning Theory:
Four Important Processes

Motivation (fourth)
– Observer performs beh when motivated
to perform it
– Motivation comes from presence or
absence of reinforcement or punishment
– Motivation



Past reinforcement → they have been rewarded
Promised reinforcement → they believe they will be
rewarded
Vicarious reinforcement → they observed another
being rewarded
Social Learning Theory

Distinction between acquiring behavior and
performing behavior
– Attention and Retention → acquire beh
– Reproduction and Motivation → perform

Reinforcement causes us to demonstrate what we
have learned
– Operant conditioning => we must experience
reinforcement directly to learn
– Observational Learning => can learn without direct
reinforcement

Bobo Doll Video
Cognitive Factors in Learning:
New topic—Cognitive Maps



Edward Tolman
Cognitive map: mental picture of location in
space
Research
– Placed rat in maze and allowed it to explore (no
reinforcement)
– When reintroduced to maze and food placed at
end, rats learned correct route more quickly
– When shortest route blocked, would take next
shortest route
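
The detour finding can be illustrated by treating the maze as a map of connected locations and searching it for the shortest open route. This is only a sketch of the idea of having a map to consult. It is not a claim about how rats compute routes or about Tolman's actual maze: the layout in `maze`, the location names, and the use of breadth-first search are all assumptions made for the example.

```python
from collections import deque


def shortest_route(maze, start, goal, blocked=frozenset()):
    """Breadth-first search for the shortest open route from start to goal.

    maze: dict mapping each location to the locations it connects to
    blocked: set of frozenset({a, b}) pairs marking closed passages
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == goal:
            return path
        for nxt in maze[here]:
            passage = frozenset((here, nxt))
            if nxt not in visited and passage not in blocked:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no open route exists


# Hypothetical maze: a short route through A and a longer one through B and C.
maze = {
    "start": ["A", "B"],
    "A": ["start", "food"],
    "B": ["start", "C"],
    "C": ["B", "food"],
    "food": ["A", "C"],
}

direct = shortest_route(maze, "start", "food")
detour = shortest_route(maze, "start", "food",
                        blocked={frozenset(("A", "food"))})
```

With the passage from A to the food closed, the search returns the longer route through B and C without any new exploration, because the whole layout is already stored.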
Cognitive Maps (cont’d)

Latent learning: learning that occurs
but is only exhibited when there is
opportunity for reinforcement
– Beh only given when motivated by
possibility of reinforcement
– Example → when preparing for test

Learning videos
– Classical conditioning (marines)

http://www.youtube.com/watch?v=DUa_F2OJT0k&playnext=1&list=PL5323550EAE54D712&feature=results_video
– Operant conditioning (marines)

http://www.youtube.com/watch?v=tMMNkxxXVKI
– Observational learning

http://www.youtube.com/watch?v=eqNaLerMNOE&list=PL5323550EAE54D712&index=32