Operant Conditioning
The Learner is NOT passive.
Learning based on consequence!!!
The Law of Effect
• Edward Thorndike
• Locked cats in a cage
• Behavior changes because
of its consequences.
• Rewards strengthen
behavior.
• If consequences are
unpleasant, the stimulus-response connection will
weaken.
• Called the whole process
instrumental learning (aka
operant conditioning).
Thorndike’s Puzzle Box
• Edward Thorndike (1874-1949):
created a puzzle box: cage with
latched door that could only be
opened by pressing lever inside
• cats became quicker and quicker
to press lever once they figured it
out
• Law of Effect: rewarded
behaviors are more likely to be
repeated
B.F. Skinner
• The Mac Daddy of
Operant Conditioning.
• Nurture guy through
and through.
• Used a Skinner Box
(Operant
Conditioning
Chamber) to prove
his concepts.
Skinner Box
Shaping
• Gradually guiding the subject’s actions toward the desired behavior.
• Successive approximations
• A rat needs to press a lever to receive food.
• Observe to learn natural behaviors → build on what the rat already does
• Give food when it approaches the bar
• Give food when it stands next to the bar
• Give food when it touches the bar
• Ignore any other behaviors → discrimination
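The rat example above can be sketched as a short simulation. This is a minimal illustration, not a model of real training: the step names come from the slide, while the function name `shape` and the rule that the criterion rises after a single reinforcement are simplifying assumptions (a real trainer would reinforce each approximation many times before moving on).

```python
# Successive approximations, ordered from easiest to the target behavior.
STEPS = ["approaches bar", "stands next to bar", "touches bar", "presses lever"]

def shape(observed_behaviors):
    """Return (behavior, reinforced) pairs for a sequence of observed behaviors.

    Only the behavior matching the current criterion earns food;
    everything else is ignored (discrimination).
    """
    criterion = 0  # index of the approximation currently required
    log = []
    for behavior in observed_behaviors:
        reinforced = behavior == STEPS[criterion]
        log.append((behavior, reinforced))
        # Assumption: raise the criterion after one reinforcement,
        # and stay on the final step once the target is reached.
        if reinforced and criterion < len(STEPS) - 1:
            criterion += 1
    return log

session = shape([
    "grooms",              # ignored: not an approximation
    "approaches bar",      # reinforced, criterion moves up
    "approaches bar",      # no longer enough, so ignored
    "stands next to bar",
    "touches bar",
    "presses lever",
])
for behavior, reinforced in session:
    print(f"{behavior:20s} {'food' if reinforced else 'ignored'}")
```

Notice that "approaches bar" earns food the first time but not the second: once the rat has met one approximation, the trainer demands the next one.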
Reinforcer
• Any event that STRENGTHENS the behavior it
follows.
Two Types of Reinforcement:
Positive and Negative
Positive Reinforcement
• Strengthens a response by presenting a stimulus
after a response.
Negative Reinforcement
• Strengthens a response by reducing or removing an
aversive stimulus.
Negative Reinforcement:
• The removal of something unpleasant.
• NEGATIVE REINFORCEMENT IS NOT PUNISHMENT!
Positive or Negative?
Putting your seatbelt on.
Faking sick to
avoid AP Psych
class.
Studying for a test.
Having a headache and
taking an aspirin.
Breaking out
of jail.
Getting a kiss
for doing the
dishes.
Negative Reinforcement Examples
• Taking aspirin to get rid of headache
• Hurrying home in winter to get out of cold
• Giving in to an argument or dog’s begging/whining
• Fanning yourself to escape the heat
• Leaving theater if movie is bad
• Smoking to relieve anxiety
• Following prison rules to be released from
confinement
• Putting on seatbelt to make car stop beeping
• Umbrella to escape rain
• Saying “uncle” to stop being beaten
Punishment
Meant to decrease a
behavior.
Positive Punishment
• Addition of something
unpleasant.
(spanking=positive
punishment)
Negative Punishment
(Omission Training)
• Removal of something
pleasant.
Punishment works best
when it immediately
follows the behavior and
when it is harsh!
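The four terms come down to two questions: is something added or removed, and does the behavior increase or decrease? A minimal sketch of that decision, where the function name `classify_consequence` and its parameter names are made up for illustration:

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant procedure from two yes/no questions.

    stimulus_added:     True if something is presented, False if something is removed.
    behavior_increases: True if the behavior is strengthened, False if it is weakened.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"

# Spanking: something unpleasant is added, behavior should decrease.
print(classify_consequence(stimulus_added=True, behavior_increases=False))
# Aspirin: the headache is removed, aspirin-taking increases.
print(classify_consequence(stimulus_added=False, behavior_increases=True))
```

"Positive" and "negative" here mean added and removed, not good and bad, which is why negative reinforcement is not punishment.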
Punishment & Classical Conditioning
• Spanking can lead to association b/w parent &
pain.
• US=spanking
• CS=Parent’s upraised hand that does the
spanking
• UR=fear & crying from pain
• If parent’s upraised hand is paired often
enough with the pain of spanking, children can
begin to associate the gesture with pain & fear
of their own parents. Can lead to children not
trusting parents at early age
The following are examples
of what???
Answer choices are positive punishment,
negative punishment, positive reinforcement,
negative reinforcement
Spanking a child for writing
on the walls.
Giving candy for correct
answers.
Nagging and nagging until
you do the dishes.
Child whines and cries until he
gets his candy at the store.
Taking away cell phone
privileges to reduce low grades.
Stop jamming toothpicks
up one’s fingernails in
exchange for information
Practice Applying Concepts
• Complete reinforcement vs punishment worksheet
How do we actually use Operant
Conditioning?
Do we wait for the
subject to deliver the
desired behavior?
Sometimes, we use a
process called shaping.
Shaping is reinforcing
small steps on the way
to the desired
behavior.
To train a dog to get
your slippers, you would
have to reinforce him in
small steps. First, to
find the slippers. Then
to put them in his
mouth. Then to bring
them to you and so
on…this is shaping
behavior.
To get Adam to become a better student, you
need to do more than give him a massage when
he gets good grades. You have to give him
massages when he studies for ten minutes, or
when he completes his homework. Small
steps to get to the desired behavior.
Big Bang Theory
• Sheldon trains Penny
Chaining Behaviors
• Subjects are taught a
number of responses
successively in order
to get a reward.
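Chaining can be sketched as a sequence check: the reward arrives only when every link is performed in order. The chain below (`climb ladder`, `cross bridge`, `press lever`) is a hypothetical example, and the rule that an out-of-order response restarts the chain is an assumption made for this sketch.

```python
def count_rewards(chain, responses):
    """Count how many times the full chain is completed in order."""
    position = 0  # index of the next response the chain expects
    rewards = 0
    for response in responses:
        if response == chain[position]:
            position += 1
        else:
            # Assumption: an out-of-order response breaks the chain.
            # If it happens to be the first link, the chain restarts there.
            position = 1 if response == chain[0] else 0
        if position == len(chain):
            rewards += 1   # reward only at the end of the whole chain
            position = 0
    return rewards

CHAIN = ["climb ladder", "cross bridge", "press lever"]  # hypothetical chain

in_order = count_rewards(CHAIN, CHAIN * 2)
out_of_order = count_rewards(CHAIN, ["cross bridge", "press lever", "climb ladder"])
print(in_order, out_of_order)  # prints: 2 0
```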
Can all animals be taught
anything?
Instinctive drift
• Animals will drift (or revert) back to instinctual behaviors while
performing tasks.
• Example: Pigs will deposit coins in a piggy bank, but on the way they will push the coins
through the mud and flip them around.
• Behaviorists Keller and Marian Breland
successfully taught a
raccoon to deposit wooden coins into a
metal container for food reinforcement.
But soon the raccoon started rubbing
the coins together and dipping them
(not dropping them) into the container.
It was performing the motor program
raccoons use to "wash" food in a
stream. This interfered with the trick to
such an extent the Brelands had to give
up on it. Instead, they trained the
raccoon to "play basketball." The
basketball was so large that the
raccoon did not attempt to wash it.
Primary v. Secondary Reinforcers
Primary Reinforcer
• Things that are in
themselves
rewarding.
Secondary Reinforcer
• Things we have learned
to value.
• Money is a special
secondary reinforcer
called a generalized
reinforcer (because it
can be traded for just
about anything)
Token Economy
• Every time a desired
behavior is performed,
a token is given.
• They can trade tokens
in for a variety of
prizes (reinforcers)
• Used in homes, prisons,
mental institutions and
schools. (Book-it = free
pizza)
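A token economy is essentially a ledger: tokens go in for desired behaviors and come out for prizes. Here is a minimal sketch; the class name `TokenEconomy`, its methods, and the price of 5 tokens for a pizza are all hypothetical values chosen for illustration.

```python
class TokenEconomy:
    """Track tokens earned for desired behaviors and spent on prizes."""

    def __init__(self, prices):
        self.prices = dict(prices)  # prize name -> cost in tokens
        self.tokens = 0

    def record_desired_behavior(self):
        """Give one token each time the desired behavior is performed."""
        self.tokens += 1

    def redeem(self, prize):
        """Trade tokens for a prize; return False if there are too few tokens."""
        cost = self.prices[prize]
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True

# Hypothetical reading program: 5 books read earns a free pizza.
economy = TokenEconomy({"free pizza": 5})
for _ in range(4):
    economy.record_desired_behavior()
too_early = economy.redeem("free pizza")   # only 4 tokens so far
economy.record_desired_behavior()
redeemed = economy.redeem("free pizza")    # now 5 tokens
print(too_early, redeemed, economy.tokens)  # prints: False True 0
```

The tokens themselves are secondary reinforcers: they matter only because they can be traded for something the learner already values.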
How is operant conditioning used at school?
• Discuss
Premack Principle
Wick’s pizza might
be a great positive
reinforcer for me,
but it would not
work well on a
vegetarian.
• You have to take into
consideration the
reinforcers used.
• Is the reinforcer
wanted, or at least is it
preferable to the
targeted behavior?
Reinforcement Schedules
How often do you give the
reinforcer?
• Every time you see the
behavior, or just
sometimes?
Continuous v. Partial Reinforcement
Continuous
• Reinforce the behavior
EVERY TIME the
behavior is exhibited.
• Usually done when the
subject is first
learning to make the
association.
• Acquisition comes
really fast.
• But so does extinction.
Partial/Intermittent
• Reinforce the behavior
only SOME of the
times it is exhibited.
• Acquisition comes more
slowly.
• But is more resistant
to extinction.
• FOUR types of Partial
Reinforcement
schedules.
Ratio Schedules
Fixed Ratio
• Provides a reinforcement
after a SET number of
responses.
Variable Ratio
• Provides a reinforcement
after a RANDOM number of
responses.
• Very hard to get acquisition
but also very resistant to
extinction.
Fixed Ratio: She gets a manicure for every 5
pounds she loses.
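The two ratio schedules can be sketched as response counters. This is an illustrative sketch only: the function names are made up, and drawing the variable-ratio requirement uniformly from 1 to 2n-1 (so that it averages n) is one assumed way of making the count unpredictable.

```python
import random

def fixed_ratio(n):
    """Return a respond() function that reinforces every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n, rng):
    """Return a respond() function that reinforces after an unpredictable
    number of responses averaging mean_n (assumed uniform on 1..2*mean_n-1)."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)  # draw the next requirement
            return True
        return False
    return respond

# Fixed ratio 5: a manicure for every 5 pounds lost.
fr5 = fixed_ratio(5)
fr_outcomes = [fr5() for _ in range(10)]
print([i + 1 for i, hit in enumerate(fr_outcomes) if hit])  # prints: [5, 10]

# Variable ratio averaging 5: the learner cannot tell which response pays off.
vr5 = variable_ratio(5, random.Random(0))
vr_outcomes = [vr5() for _ in range(1000)]
print(sum(vr_outcomes), "reinforcers in 1000 responses")
```

On the fixed schedule the payoff is perfectly predictable; on the variable schedule any single response might be the one that pays, which is the usual explanation for why such behavior is so resistant to extinction.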
Interval Schedules
Fixed Interval
• Requires a SET amount of
time to elapse before giving
the reinforcement.
Variable Interval
• Requires a RANDOM amount
of time to elapse before
giving the reinforcement.
• Very hard to get acquisition
but also very resistant to
extinction.
• E.g., pop quizzes
Fixed Interval: She gets a
manicure for every 7 days she
stays on her diet.
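Interval schedules depend on elapsed time rather than on a count of responses: the first response after the interval has passed is reinforced, and then the clock restarts. A minimal sketch under those assumptions, where the class and function names are hypothetical and the variable interval is drawn uniformly between 0 and twice the mean:

```python
import random

class IntervalSchedule:
    """Reinforce the first response made after the current interval has elapsed."""

    def __init__(self, next_interval):
        self._next_interval = next_interval       # callable returning a wait time
        self._available_at = next_interval()      # when reinforcement first becomes available

    def respond(self, now):
        if now >= self._available_at:
            self._available_at = now + self._next_interval()  # restart the clock
            return True
        return False

def fixed_interval(days):
    return IntervalSchedule(lambda: days)

def variable_interval(mean_days, rng):
    # Assumption: waits are uniform between 0 and 2 * mean_days.
    return IntervalSchedule(lambda: rng.uniform(0, 2 * mean_days))

# Fixed interval 7: she checks in once a day for two weeks.
fi7 = fixed_interval(7)
fi_days = [day for day in range(1, 15) if fi7.respond(day)]
print(fi_days)  # prints: [7, 14]

# Variable interval averaging 7 days: like pop quizzes, the timing is unpredictable.
vi7 = variable_interval(7, random.Random(1))
vi_days = [day for day in range(1, 101) if vi7.respond(day)]
print(len(vi_days), "reinforcers in 100 days")
```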
Delayed Gratification
• Delayed gratification is a skill that many
parents want their children to have. People
who can delay gratification are able to achieve
more in life. Consider the following:
• What types of real world rewards occur on a
delayed schedule?
• Did you learn to wait for rewards? Why or why
not?
• Does the media encourage people to delay
gratification? Why or why not?
Practice Applying Concepts
• Complete schedules of reinforcement worksheet