Operant Conditioning

Objective 12/11/15:
Provided notes and an activity
SWBAT describe the process of
operant conditioning including the
procedure of shaping as
demonstrated by Skinner’s
experiments.
Agenda:
1. Do Now
2. Notes- Operant Conditioning
3. Activity
LEARNING
Operant Conditioning
Comparison of negative reinforcement & punishment
Although punishment can occur when a response leads to the removal of a
rewarding stimulus, it more typically involves the presentation of an aversive
stimulus.
Students often confuse punishment with negative reinforcement because they
associate both with aversive stimuli. However, as this diagram shows,
punishment & negative reinforcement represent opposite procedures that have
opposite effects on behavior.
Behaviorism
To a Behaviorist Everything you know...
Everything you are...
is the result of human behavior.
“Psychology is the study of behavior,
not of the mind!”
- say the behaviorists.
Picked up steam in the late 1960s &
during the 1970s.
A reaction to the non-scientific work
of Freud. Freud was all the rage. Yet
where was the ‘science’ & how can you
prove anything about the unconscious
mind?
Classical vs. Operant Conditioning
Both use acquisition, discrimination, spontaneous recovery, generalization &
extinction.
Classical Conditioning:
Automatic or Respondent Behavior
Ex.) Your dog gets sick & requires several painful trips to
the vet. Now he hides every time he hears you rattle
your keys. Automatic.
Or - Your cat gets excited to eat as soon as you get home,
because that is when he gets fed.
Operant Conditioning:
Behavior that operates on the environment to produce
consequences (operant behavior).
Ex.) A teacher's comments on a test.
A child working on homework can play Xbox once it is completed.
If it is not completed, the child may lose play time.
Or - Your dog sits nicely because he knows he’ll get a
treat for ‘being a good boy.’ :)
Operant
Conditioning
A type of learning in which behavior is
strengthened if followed by reinforcement
or diminished if followed by punishment
in rats:
★ trial and error learning
★ allows acquisition of motor programs that are not instinctive
★ behavior shaped by rewards
★ develops as a result of the association of reinforcement with a
particular response
★ on a proportion of occasions
Trial & Error -> Trial & Reward -> Operant Conditioning
Operant Response -> Reinforcement -> Learned Behavior
YouTube: Big Bang Theory - Operant Conditioning
Edward Thorndike
Law of Effect: rewarded behavior
is likely to be repeated.
Studied cats inside a ‘puzzle box’ - found
that a well-practiced cat will find the way out.
If an action brings a reward,
Thorndike believed that the action
becomes stamped into the mind.
Behavior changes because of the
consequences of that behavior.
Previous theories had emphasized
practice or repetition.
Thorndike gave equal consideration
to the effects of reward or
punishment, success or failure, &
satisfaction or annoyance on the
learner.
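One way to picture the Law of Effect is as a table of response strengths that is nudged up after a satisfying outcome and down after an annoying one. The sketch below is a toy model, not Thorndike's own formulation. The function name `apply_law_of_effect`, the `rate` parameter, and the puzzle-box response names are all assumptions made for illustration.

```python
def apply_law_of_effect(strengths, response, satisfying, rate=0.2):
    """Return a new strength table after one trial.

    strengths  -- dict mapping response name -> strength between 0 and 1
    response   -- the response the animal just made
    satisfying -- True if the outcome was rewarding, False if annoying
    rate       -- hypothetical learning rate (an assumption of this sketch)
    """
    updated = dict(strengths)
    old = updated[response]
    if satisfying:
        # Rewarded responses are "stamped in": move strength toward 1.
        updated[response] = old + rate * (1.0 - old)
    else:
        # Annoying outcomes weaken the response: move strength toward 0.
        updated[response] = old - rate * old
    return updated


# A cat in a puzzle box starts with no preference among three responses.
cat = {"scratch bars": 0.5, "meow": 0.5, "press lever": 0.5}
for _ in range(5):
    cat = apply_law_of_effect(cat, "press lever", satisfying=True)
    cat = apply_law_of_effect(cat, "meow", satisfying=False)

most_likely = max(cat, key=cat.get)
print(most_likely)  # press lever
```

The response that was never tried ("scratch bars") keeps its starting strength, which matches the idea that behavior changes because of its consequences.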
YouTube: Thorndike’s Puzzle Box for Cats
B.F.
Skinner
Instead of the antecedents of behavior (what
comes before), a new focus on the
consequences of behavior.
B.F. Skinner argued that classical conditioning (CC) did not
explain complex behavior.
2 categories of consequences:
Reinforcement & Punishment
Reinforcement is designed to
increase the probability that a
behavior will occur again.
Punishment is designed to decrease
the probability that a behavior will
occur again.
Operant Conditioning Chamber
YouTube: Difference Between Classical & Operant Conditioning, TedEd Vid
Shaping
A procedure in Operant Conditioning in which reinforcers guide behavior closer and
closer towards a goal.
Reinforcers guide behavior, step-by-step. Closer and
closer to the target behavior through successive
approximations.
“Baby Steps”
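Shaping can be sketched as a loop that keeps raising the bar for reinforcement until the target behavior is reached. This is only an illustration of successive approximations. The function name `shape`, its parameters, and the dog-training numbers are hypothetical.

```python
def shape(start, target, step):
    """Return the list of criteria that earn a reinforcer on the way to the target.

    start  -- the level of behavior the learner already shows
    target -- the final behavior the trainer wants
    step   -- how much closer each successive approximation must be
    """
    if step <= 0:
        raise ValueError("step must be positive")
    reinforced = []
    criterion = start
    while criterion < target:
        # Raise the bar a little: only a closer approximation now earns a reward.
        criterion = min(criterion + step, target)
        reinforced.append(criterion)
    return reinforced


# Hypothetical example: teaching a dog to hold a sit for 10 seconds,
# asking for 4 more seconds at each stage.
print(shape(start=0, target=10, step=4))  # [4, 8, 10]
```

Each entry in the list is one "baby step": the behavior that earns the treat at that stage of training.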
Reinforcers
Any event that STRENGTHENS the behavior it
follows.
There are + and – reinforcers.
+ Positive Reinforcers: Strengthen a response by
presenting a stimulus after a response.
- Negative Reinforcers: Strengthen a response by
reducing or removing an aversive stimulus.
YouTube: “What About Bob” - Baby Steps
Positive Reinforcement
Strengthens a response by presenting a
stimulus after a response.
$$$ Getting Paid!
We may continue to go to work each day
because we receive a paycheck on a
weekly or monthly basis.
***AWARDS***
If we receive awards for writing short stories,
we may be more likely to increase the
frequency of writing short stories.
"PRAISE!"
Receiving praise for our karaoke
performances can increase how often we
sing.
Negative Reinforcement
Strengthens a response by reducing or
removing an aversive stimulus.
Example: Driving in heavy traffic is a negative
condition for most of us. You leave home earlier than
usual one morning, & don't run into heavy traffic. You
leave home earlier again the next morning & again you
avoid heavy traffic. Your behavior of leaving home
earlier is strengthened by the consequence of the
avoidance of heavy traffic.
The concept of Negative Reinforcement is difficult
to learn because of the word negative. Negative
Reinforcement is often confused with Punishment. They
are very different, however.
Negative Reinforcement
strengthens a behavior
because a negative condition
is stopped or avoided as a
consequence of the behavior.
Punishment
Weakens a behavior because a
negative condition is
introduced or experienced as a
consequence of the behavior.
Punishment is often confused
with negative reinforcement.
Remember, reinforcement always
increases the chances that a
behavior will occur
&
Punishment always decreases the
chances that a behavior will occur.
Positive Punishment
aka... "punishment by application"
Positive punishment involves presenting
an aversive stimulus after a behavior has
occurred.
For example, when a student talks out
of turn in the middle of class, the
teacher might scold the child for
interrupting her.
Negative Punishment
aka... "punishment by removal"
Negative punishment involves taking
away a desirable stimulus after a
behavior has occurred.
Example: When the student talks out of turn
again, the teacher promptly tells the
child that he will have to miss recess
because of his behavior.
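The four procedures defined so far come down to two questions: was a stimulus added or removed, and did the behavior increase or decrease? The lookup below is a study aid built on that idea. The function name `classify_consequence` and its string arguments are assumptions of this sketch.

```python
def classify_consequence(stimulus, behavior):
    """Name the operant procedure.

    stimulus -- "added" or "removed" (what happens right after the behavior)
    behavior -- "increases" or "decreases" (what the behavior does afterward)
    """
    table = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    return table[(stimulus, behavior)]


# Leaving early removes the traffic, and leaving early becomes more frequent.
print(classify_consequence("removed", "increases"))  # negative reinforcement
# Recess is taken away, and talking out of turn becomes less frequent.
print(classify_consequence("removed", "decreases"))  # negative punishment
```

"Positive" and "negative" describe only whether something is added or removed; "reinforcement" and "punishment" describe only what happens to the behavior.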
Punishment also has some notable drawbacks.
First, any behavior changes that result from punishment are
often temporary. "Punished behavior is likely to reappear after
the punitive consequences are withdrawn," Skinner explained
in his book About Behaviorism.
Perhaps the greatest drawback is the fact that punishment
does not actually offer any information about more
appropriate or desired behaviors. While subjects might be
learning to not perform certain actions, they are not really
learning anything about what they should be doing.
Another thing to consider about punishment is that it can
have unintended and undesirable consequences.
For example, while approximately 75% of parents in the United
States report spanking their children on occasion, researchers
have found that this type of physical punishment can lead to
antisocial behavior, aggressiveness & delinquency among
children.
For this reason, Skinner and other psychologists suggest that
any potential short-term gains from using punishment as a
behavior modification tool need to be weighed against the
potential long-term consequences.
Positive reinforcement - when something is given (apply a rewarding
stimulus).
Negative reinforcement - when something is removed (remove an aversive
stimulus).
Skinner: punishment should be judicious, immediate,
consistent, & severe enough actually to be a punishment.
YouTube: Schallhorn Operant Conditioning - Reinforcement & Punishment
Many students are confused about negative
reinforcement.
What's the difference between that and
punishment?
Remember, "reinforcement" - behavior
increases, & because it's "negative," an aversive
stimulus is removed after the response.
Positive or Negative Reinforcement?
Cleaning the house to get rid of the disgusting
mess and/or to stop your mother from nagging
NEGATIVE REINFORCEMENT
Strengthens a response by reducing or removing an aversive stimulus.
Nagging/Mess as negative reinforcer to cleaning.
Positive or Negative Reinforcement?
Taking aspirin to relieve a headache
NEGATIVE REINFORCEMENT
Strengthens a response by reducing or removing an aversive
stimulus. (The headache is the aversive stimulus)
headache as negative reinforcer to taking medication
Positive or Negative Reinforcement?
Listening to your favorite music after studying
for an hour
POSITIVE REINFORCEMENT: Strengthens a response by
presenting a stimulus after a response.
Positive or Negative Reinforcement?
Leaving the movie theater if the movie is bad
Negative Reinforcement: strengthens a behavior because a negative condition is
stopped or avoided as a consequence of the behavior.
YouTube: Schallhorn - Schedules of Reinforcement
Fixed-ratio Schedules
A schedule that reinforces a
response only after a specified
number of responses.
Examples in natural environments:
Jobs that pay based on units delivered.
Employees often find this schedule undesirable because it
produces a rate of response that leaves them nervous & exhausted
at the end of the day (e.g., selling cars).
They may feel pressured not to slow down or take rest breaks,
since they feel that doing so will cost them money. This is an example
of how a schedule can produce a high rate of response even
though the response rate is aversive to the subject.
Collecting tokens.
Many games require the player to collect a fixed number of
tokens to advance to the next level, obtain a new life point, or
receive some other reinforcers.
Attaining a new level in an RPG - Role Playing Game. Some RPGs
clearly indicate how much experience is required to achieve the
next level.
A high degree of certainty as to the level of work that
will be required to achieve the next level puts the
player on a fixed ratio schedule.
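A fixed-ratio schedule can be sketched as a simple counter: every Nth response earns the reinforcer. The class name `FixedRatio` and the token example are hypothetical, chosen only to mirror the game examples above.

```python
class FixedRatio:
    """Reinforce every `ratio`-th response (hypothetical helper class)."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def respond(self):
        """Record one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # start counting toward the next reinforcer
            return True
        return False


# FR-5: e.g. a game that grants an extra life for every 5 tokens collected.
schedule = FixedRatio(5)
outcomes = [schedule.respond() for _ in range(10)]
print(outcomes.count(True))  # 2
```

Because the learner can always tell how many responses remain, the payoff is fully predictable, which is what sets this schedule apart from the variable-ratio case.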
Variable-ratio Schedule
Reinforces a response after an
unpredictable number of responses.
Slot machines:
Gambler has no way of predicting how many times
he must put a coin in the slot & pull the lever to hit
a payoff but the more times a coin is inserted the
greater the chance of a payout.
People who play slot machines are often reluctant to
leave them, especially when they have had a large
number of un-reinforced responses.
Playing golf:
Golfer is uncertain how good each shot will be, but
the more often they play, the more likely they are to get
a good shot.
Door to door salesmen:
Uncertain how many houses they will have to visit
to make a sale, but the more houses they try, the more
likely that they will succeed.
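A variable-ratio schedule can be sketched the same way as a counter, except that the required number of responses is redrawn at random after every payoff. The class name `VariableRatio`, the uniform draw between 1 and 2*mean - 1, and the seed are assumptions of this sketch. Real slot machines use their own payout logic.

```python
import random


class VariableRatio:
    """Reinforce after an unpredictable number of responses.

    The required count is redrawn after each reinforcer from
    1 .. 2*mean_ratio - 1, so it averages mean_ratio (an assumption
    of this sketch, not a standard definition).
    """

    def __init__(self, mean_ratio, seed=None):
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Record one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


# VR-10: a slot machine that pays off, on average, once every 10 pulls.
slot_machine = VariableRatio(mean_ratio=10, seed=1)
pulls = [slot_machine.respond() for _ in range(1000)]
print(sum(pulls))  # roughly 100 payouts, at unpredictable points
```

The gambler cannot tell whether the next pull will pay, only that more pulls mean more payouts overall, which fits the reluctance to walk away described above.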
Fixed-interval Schedule
A schedule of reinforcement that
reinforces a response only after a
specified time has elapsed.
Getting a raise every year and not in between.
A major issue with this schedule is that people
tend to improve their performance right before
the time period expires so as to "look good"
when the review comes around.
Example:
A weekly paycheck is a good example of a
fixed-interval schedule. The employee receives
reinforcement every seven days, which may result
in a higher response rate as payday approaches.
Fish Feeding. You feed your fish every day at
4:00. After a few days of this, you might start
noticing your fish starts swimming toward the top
of his tank every day around 4:00.
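A fixed-interval schedule depends on the clock rather than on a count: the first response after the interval has elapsed is reinforced, and earlier responses earn nothing. The class name `FixedInterval` and the day-by-day loop are hypothetical, meant only to mirror the paycheck example.

```python
class FixedInterval:
    """Reinforce the first response made after `interval` time units."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforced = 0

    def respond(self, now):
        """Response at time `now`; return True if it is reinforced."""
        if now - self.last_reinforced >= self.interval:
            self.last_reinforced = now
            return True
        return False


# FI-7: a paycheck every 7 days; the worker shows up every day.
payday = FixedInterval(7)
paid_on = [day for day in range(1, 22) if payday.respond(day)]
print(paid_on)  # [7, 14, 21]
```

Responding on days 1 through 6 changes nothing, which is one way to see why effort tends to bunch up just before the interval ends.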
Dear Mister Rogers,
Please say when you are
feeding your fish, because I
worry about them. I can't see
if you are feeding them, so
please say you are feeding
them out loud.
A letter from a blind child to the
children's television icon. He always
verbally explained that he was
feeding the fish to comfort this one
child.
Variable-interval Schedule
A schedule of reinforcement that
reinforces a response at unpredictable
time intervals.
If you have a boss who checks your work at unpredictable times, you
understand the power of this schedule. Because you don’t
know when the next ‘check-up’ might come, you have to be
working hard at all times in order to be ready.
In this sense, the variable schedules are more powerful and
result in more consistent behaviors.
Example: Teacher observations by administrators.
This may not be as true for punishment, since
consistency in its application is so important, but
for reinforcement, variable schedules tend to result
in stronger responses.
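A variable-interval schedule can be sketched like the fixed-interval one, except that the waiting time is redrawn at random after each reinforcer. The class name `VariableInterval`, the uniform draw between 0 and twice the mean, and the seed are assumptions made for illustration.

```python
import random


class VariableInterval:
    """Reinforce the first response after an unpredictable amount of time."""

    def __init__(self, mean_interval, seed=None):
        self.mean_interval = mean_interval
        self.rng = random.Random(seed)
        # Time at which the next reinforcer becomes available.
        self.available_at = self._draw()

    def _draw(self):
        # Hypothetical choice: waits are uniform between 0 and 2 * mean.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        """Response at time `now`; return True if it is reinforced."""
        if now >= self.available_at:
            self.available_at = now + self._draw()
            return True
        return False


# VI-5: a boss who drops by, on average, every 5 hours; the employee
# is working at every whole hour from 1 to 100.
boss = VariableInterval(mean_interval=5, seed=2)
checks = [hour for hour in range(1, 101) if boss.respond(hour)]
gaps = [b - a for a, b in zip(checks, checks[1:])]
print(len(checks), sorted(set(gaps)))
```

The uneven gaps between reinforced hours show why steady responding is the only reliable strategy here: no single moment is a safe time to slack off.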
Punishment
An event that DECREASES the behavior that it follows.
Does punishment work?