Instrumental/Operant Conditioning

Thorndike’s Puzzle Box Result

Thorndike’s Law of Effect
• “Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction…will be more likely to recur”

Situation: Puzzle Box | Response: Pull Loop | Outcome: Meat or Fish
S-R association

Two Theories
• Thorndike
  – Stimulus associated with response (S-R), so the response is a “habit” triggered by the situation
• Grandmother
  – “Cat is working to get food” (R-O)

Situation: Puzzle Box | Response: Pull Loop | Outcome: Meat or Fish
R-O association

Test of Grandma’s Theory
• Stage 1: Train instrumental S-R-O
• Stage 2: Alter value of O (devalue) in the absence of R and S
• Stage 3: Test to determine if R is reduced
[Figure: Responding (axis 0 to 100); categories: Non-Devalued, Devalued, Untrained]

Shaping
• Shaping is a method for encouraging novel behavior
  – Reinforcing successive approximations to the target behavior

Types of Reinforcers
• Primary Reinforcers satisfy a need and reinforce behavior without any special experiences
• Secondary Reinforcers become valuable through association with primary reinforcers

Delay of Reinforcement
• Delayed reinforcers are steeply discounted
• Loss of self-control and impulsivity
• Precommitment
[Figure: Reinforcer Potency (axis 0 to 100) versus Delay (axis -9 to 0); labels: small immediate, large delayed]

Stimulus Discrimination

Stimulus Discrimination?

Positive and Negative Reinforcement

Shuttle Box

Escape versus Avoidance Conditioning

Schedules of Reinforcement
• Continuous Reinforcement Schedule: Reinforcer is delivered every time a particular response occurs.
• Partial or Intermittent Reinforcement Schedule: Reinforcement is given only some of the time.

Partial Reinforcement Schedules
• Fixed Ratio (FR): Reinforcement occurs after a fixed number of responses.
• Variable Ratio (VR): Reinforcement occurs after a variable number of responses.
• Fixed Interval (FI): Reinforcement occurs for the first response after a fixed time interval.
• Variable Interval (VI): Reinforcement occurs for the first response after a variable time interval.

Partial Reinforcement Schedules

Schedules and Extinction
• Failure to reinforce a response eventually extinguishes it.
• Partially reinforced responses are more difficult to extinguish.
  – “Partial reinforcement extinction effect”
  – “Superstitious behavior” is resistant to extinction for this reason

Why Reinforcers Work
• Being deprived of the opportunity to engage in a behavior (drinking, eating, etc.) makes that behavior reinforcing; this is called the response deprivation hypothesis
• Physiological
  – James Olds and “pleasure centres”
  – Nucleus Accumbens and Dopamine

Punishment and Learning
• Punishers decrease the probability of the immediately preceding response
  – Two kinds of punishment.
• Negative Reinforcement versus Punishment
  – Negative Reinforcement: Strengthens behavior
  – Punishment: Weakens behavior

Figure 5.11: Two Kinds of Punishment

Drawbacks of Punishment
• Only suppresses unwanted behavior
• Unwanted side effects
  – target becomes aggressive, avoidance
• Often ineffective unless a strong punisher is given immediately after every response (e.g., red light camera)
• Does not specify what should be done.

Guidelines for Effective Punishment
• Specify why punishment is being given
• Emphasize the behavior, not the person, being punished
• Without being abusive, make sure the punishment is immediate and noticeable
• Identify and positively reinforce more appropriate responses.

Some Applications of Instrumental Conditioning
• Classroom Management
• Token Economies in Mentally Challenged
• Autism
• Self-Control

Other Specialized Forms of Learning
• Spatial Learning
• Knowledge Attribution
• Helplessness
• Observational Learning
  – Mirror neurons
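
Sketch: Partial Reinforcement Schedules in Code
The four partial reinforcement schedules listed earlier (FR, VR, FI, VI) can be sketched as small Python classes. This is only an illustrative sketch, not something from the slides: the class names, the `respond(t)` method, and the `count_reinforcers` helper are all hypothetical, and drawing the variable requirements from a uniform distribution is an assumption, since the slides say only that the number of responses or the interval varies.

```python
import random


class FixedRatio:
    """FR-n: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR-n: reinforce after a varying number of responses that averages n."""

    def __init__(self, n, rng):
        self.n = n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2n-1, which has mean n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI-d: reinforce the first response made at least d time units
    after the previous reinforcer (or after the start of the session)."""

    def __init__(self, d):
        self.d = d
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.d:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI-d: like FI, but the required wait varies around a mean of d."""

    def __init__(self, d, rng):
        self.d = d
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2d, which has mean d.
        return self.rng.uniform(0, 2 * self.d)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Feed a sequence of response times to a schedule and count reinforcers."""
    return sum(schedule.respond(t) for t in response_times)


if __name__ == "__main__":
    times = list(range(1, 21))  # one response per time unit, t = 1..20
    rng = random.Random(0)
    for schedule in (FixedRatio(5), VariableRatio(5, rng),
                     FixedInterval(5), VariableInterval(5, rng)):
        print(type(schedule).__name__, count_reinforcers(schedule, times))
```

Ratio schedules ignore the time argument and count responses, while interval schedules ignore the response count and look only at elapsed time. In this sketch, responding faster earns more reinforcers under FR and VR but not under FI and VI.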