Operant Conditioning
Introduction
Through classical (Pavlovian) conditioning,
an organism associates different stimuli
that it does not control. Through operant
conditioning, the organism associates its
behaviors with consequences. Behaviors
followed by reinforcements increase; those
followed by punishers decrease. This
simple but powerful principle has many
applications and also several important
qualifications.
Operant means…
….Explain and train
Operant Conditioning
• A type of learning in which
responses can be controlled by
their consequences
i.e. rewards
or
punishments
Reward vs. Punishment
Reward = more likely
behavior will repeat
Punishment = less likely to
repeat behaviors
Which is better?
Behavior
• Respondent Behavior
– Behavior that occurs as
an automatic response
to some stimulus
Ex: salivating at food
when hungry
• Operant Behavior
– The act operates on
the environment to
produce rewarding or
punishing stimuli
Ex: good grades =
MONEY; bad grades =
grounded
Important People in
Operant Conditioning
B.F. Skinner
Radical Behaviorism
Skinner Box
Edward
Thorndike
Law of Effect
Puzzle Box
Skinner
• Operant Chamber– “Skinner Box”
– Soundproof
– Bar or key that an animal presses or pecks to
release a reward of food or water
– Device that records these responses
• Shaping– Procedure in which reinforcers (like food)
gradually guide an animal’s actions toward a
desired behavior
-Operant Conditioning
Edward L. Thorndike
• Law of Effect:
– Rewarded behavior is likely to recur
– Puzzle Box
Operant Conditioning Chamber
Skinner Box
Puzzle Box
Two important concepts used in Operant Conditioning
• Reinforcer
– A stimulus or event that increases the odds of repeating
the behavior that led to it
• I give my kids money when they clean their room…this stimulus
increases the odds they will do it again
• Punisher
– A stimulus or event that decreases the odds
of repeating the behavior that led to it
• I spank my kids when they throw food at the dinner table…this
event decreases the odds they will do it again
• Remember…
– It is often the learner that determines if something is a
reinforcement or punishment
– This is related to the Premack Principle: an activity the learner
prefers can reinforce one the learner prefers less
– I might give Ryan broccoli after he did a chore and if he likes it he
will do more chores
– Or I might give Ryan broccoli after he did a chore and he may never
do that chore again
– My feelings toward broccoli make no difference
Reinforcer
Anything likely to increase a behavior
Two Types of Reinforcement:
Positive and Negative
Positive Reinforcement
• Something desirable is added to the
environment and this encourages (reinforces)
behavior
– Behaviors are strengthened when they are followed
by the introduction of a stimulus
Negative Reinforcement
• Something undesirable is subtracted from the
environment and this encourages (reinforces)
behavior
– Negative reinforcers (NR) are aversive stimuli such as loud noise, cold, pain, or
nagging
• We are more likely to repeat behaviors that lead to their removal
– Example
• Say I have a headache
• The NR is the pain of the headache
• I take aspirin and the headache goes away
• Headache pain (stimulus) - - aspirin (response) - - consequence
(headache gone)
• I will take aspirin again because it removed something unpleasant
So…positive and negative do not mean good or bad.
Instead, positive means adding a stimulus, and negative
means removing a stimulus.
The Simpsons
Reinforcement
Schedules
The pattern (schedule) in which
reinforcement is given.
These schedules influence learning
Continuous
Reinforcement
• Reinforcing the desired response every
time it occurs.
– Example – vending machine
Quick Acquisition
Quick Extinction
Partial Reinforcement
• Reinforcing a response only part of the time.
– slot machine
– You don’t expect to win every
time but hope to win sometime
– The acquisition process is slower, but…
– Greater resistance to extinction.
• 4 different partial reinforcement schedules
– Two focus on time between reinforcement
(interval schedule)
– Two focus on number of responses between
reinforcement (ratio schedule)
Fixed-Interval Schedule
• Reinforcement of a behavior after a
specified or fixed time (interval) has
passed.
• You get paid every two weeks
• A worker gets a bonus once a year
– After receiving a reward (a
reinforcement) the worker has to wait
one year for another reward (fixed
interval)
Variable-interval Schedule
• Reinforcement of a behavior
at unpredictable (variable)
time intervals.
• You don’t know when the
reinforcement is coming so
you keep trying or have to be
prepared to take action
Pop Quizzes
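The two interval schedules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reinforced_responses`, the `next_interval` callable, and the choice to draw variable gaps uniformly from 1 to 27 days (about 14 on average) are all made up for the example and are not part of the slides. A response is reinforced only if it comes after the current interval has elapsed.

```python
import random

def reinforced_responses(response_times, next_interval):
    """Return the response times that earn reinforcement on an interval schedule.

    next_interval is a hypothetical callable that gives the wait before
    reinforcement becomes available again.
    """
    available_at = next_interval()
    reinforced = []
    for t in sorted(response_times):
        if t >= available_at:
            # First response after the interval has passed is reinforced.
            reinforced.append(t)
            available_at = t + next_interval()
    return reinforced

daily_checks = range(1, 57)  # one response per day for 8 weeks

# Fixed interval: paid every two weeks.
paydays = reinforced_responses(daily_checks, lambda: 14)

# Variable interval (assumption): gaps drawn uniformly from 1 to 27 days.
rng = random.Random(0)
quiz_days = reinforced_responses(daily_checks, lambda: rng.randint(1, 27))

print(paydays)    # [14, 28, 42, 56]
print(quiz_days)  # unpredictable spacing; depends on the seed
```

On the fixed schedule, responding more often earns nothing extra between paydays. On the variable schedule the learner cannot tell which day will pay off, which is why steady responding (or staying prepared) makes sense.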
Fixed-ratio Schedules
• Reinforcement of a behavior only
after a specified (fixed) number of
responses
• Movie rentals that say rent 5 get one
free
• A worker gets a bonus after every
three items he sells
Variable-ratio Schedule
• Reinforcement of a behavior after an
unpredictable (variable) number of
responses.
– Working on sales commission
• Sometimes called the gambler’s schedule
– Back to the lottery…
– You don’t know when you will win but you do
know the more you buy the better your
chances
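The two ratio schedules can be sketched the same way. The function names and numbers below are hypothetical, chosen only for illustration. One loud assumption: the variable-ratio version reinforces each response with probability 1/mean_ratio (sometimes called a random-ratio schedule), which is one simple way to get an unpredictable number of responses between rewards.

```python
import random

def fixed_ratio(n_responses, ratio):
    """Response numbers that are reinforced: every `ratio`-th response."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]

def variable_ratio(n_responses, mean_ratio, rng):
    """Response numbers that are reinforced when each response pays off
    with probability 1/mean_ratio (an assumption for this sketch)."""
    return [i for i in range(1, n_responses + 1)
            if rng.random() < 1 / mean_ratio]

# Fixed ratio: a bonus after every three items sold.
bonuses = fixed_ratio(10, 3)
print(bonuses)  # [3, 6, 9]

# Variable ratio: on average one win per five plays, but no telling which.
wins = variable_ratio(1000, 5, random.Random(1))
print(len(wins))  # roughly 200; the exact count depends on the seed
```

In both cases more responses mean more reinforcement, but only on the fixed schedule can the learner count down to the next reward.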
Overjustification Effect
• When external rewards undermine the
intrinsic satisfaction of performing a
behavior
– Makes people only do something for reward or
prize and not for pure joy
– The reward may lessen and replace the
person’s original, natural motivation so that the
behavior stops if the reward is eliminated
• Pizza for reading
– “what, I don’t get a free pizza for reading 10
books?”
Before we move on…
• Operant Conditioning uses much of the same
terminology as classical conditioning…(acquisition,
extinction, generalization, discrimination, etc…)
• For example, if I want a child to increase his
bathing behavior, I can give him an extra 30
minutes of TV time after he bathes.
• The reinforcer is extra TV time and acquisition
occurs when he links together the idea that bathing
gives him more Cartoon Network.
• Extinction would occur if I stop giving him TV time
for bathing and he stops seeing the association.
Types of Reinforcers
• Primary Reinforcers- reinforcements that
happen naturally; not learned (e.g. getting food
when hungry, taking your hand off a burning
stove to relieve pain)
• Conditioned Reinforcers- (secondary
reinforcers) are learned (e.g. if a rat in Skinner’s
box learns that a light coming on
signals food, the light becomes a
secondary reinforcer)
Primary Reinforcer
• Things that are in themselves rewarding and
satisfy biological needs
• Like food, warmth, or water
Secondary (or Conditioned)
Reinforcer
• Something that you have learned to value
through classical conditioning
– Money, fines or grades
• Secondary reinforcers can lose their effectiveness
Intrinsic vs. Extrinsic Motivation
Punishment
• Flip side of reinforcement
• The introduction of a bad
stimulus or the removal of a
reinforcing stimulus after a
response occurs
– Weakens a behavior or makes
it less likely to occur again in
the future
Does punishment work?
Yes, but…
It often tells the learner what behavior should NOT
be exhibited, but not what behavior should be
And…don’t forget the Premack Principle
Difference between Negative
Reinforcement and Punishment
• Punishment
– the introduction of a negative consequence
after a behavior weakens the behavior
– Time out for hitting other children
• Negative Reinforcement
– the removal of a negative stimulus after a
behavior strengthens the behavior
– Picking up a crying baby
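One way to keep these terms straight is a two-by-two lookup: was a stimulus added or removed, and did the behavior increase or decrease? The sketch below is illustrative only. The function name `classify_consequence` and its string arguments are hypothetical, and the labels "positive punishment" and "negative punishment" are the standard textbook names for the two kinds of punishment described earlier (introducing a bad stimulus vs. removing a reinforcing one), not terms used in these slides.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant procedure from what happened to the stimulus
    ("added" or "removed") and to the behavior ("increases" or "decreases")."""
    table = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    return table[(stimulus_change, behavior_change)]

# Money for cleaning a room: something is added, cleaning increases.
print(classify_consequence("added", "increases"))    # positive reinforcement

# Aspirin for a headache: pain is removed, aspirin-taking increases.
print(classify_consequence("removed", "increases"))  # negative reinforcement

# Spanking for throwing food: something is added, food-throwing decreases.
print(classify_consequence("added", "decreases"))    # positive punishment
```

Note that "positive" and "negative" only track the first argument (added vs. removed); whether it counts as reinforcement or punishment depends entirely on what the behavior does afterward.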
Observational
Learning
Learning by Observation
• Learning occurs not only through
conditioning but also from our
observation of others.
• “We are, in truth, more than half
what we are by imitation”
Lord Chesterfield
Observational Learning:
Definition
• Observe and imitate others
• Modeling- Process of observing and imitating
a specific behavior
• We learn all kinds of social behaviors by
observing and imitating others
Mirror Neurons
• Mirror neurons provide a neural basis for
observational learning
• Example: when a baby imitates a face an
adult is making, mirror neurons are firing
Bandura’s Experiment
• Albert Bandura
– Pioneer of research in observational learning
• Bobo Doll Experiment
– Whether a model is reinforced or punished
influences whether observers imitate the behavior
Social Influence on
Observational Learning
• Columbine High School: “copycat threats”
• Prosocial- models can have
positive effects
– Gandhi and Martin Luther King Jr.
• Television:
– The more hours children spend
watching violent TV or playing
violent video games, the more at risk
they are for aggression and crime as
teens and adults
– Homicides doubled between
1957 and 1974, coinciding with
the introduction of television
Aversive Conditioning
• In aversive conditioning, the client is exposed to an
unpleasant stimulus while engaging in the targeted
behavior
• Goal- create an aversion to it.
• In adults, aversive conditioning is often used to
combat addictions such as smoking or alcoholism.
• Example- a nausea-producing drug is given while the client
is smoking or drinking so that unpleasant
associations are paired with the addictive
behavior.
• Also used to treat nail biting, sex addiction, and
other strong habits or addictions.
Observational Learning influenced debates on
the effect of television violence and parental
role models
Studies have shown the amount
of violent TV watched by children
in elementary school is correlated with
their aggressiveness as teenagers
and with their criminal behavior as adults
Antisocial models vs. prosocial
models