Operant Conditioning
• The learner is NOT passive.
• Learning based on consequence!

Operant Conditioning
• Learning controlled by a connection to the consequence of one's behavior
• Consequences of behavior determine whether it will be repeated in the future

Vs. Classical Conditioning
• Behavior is…
  • CC: elicited, automatic, reflexive
  • OC: emitted, voluntary, complex behaviors
• Reward is…
  • CC: provided independent of actions
  • OC: dependent on behavior

B.F. Skinner
• The most influential behaviorist and proponent of operant conditioning.
• Nurture guy through and through.
• Used a Skinner box (operant conditioning chamber) to prove his concepts.

Skinner
• Operant box: non-reflexive behaviors could be altered by learning

Chaining Behaviors
• Subjects are taught a number of responses successively in order to get a reward.

Thorndike's Puzzle and the Law of Effect
• Edward Thorndike
• Locked cats in a cage
• Behavior changes because of its consequences.
• If a response is rewarded, that response is more likely to occur.
• If consequences are unpleasant, the stimulus-response connection will weaken. (LOE)
• Called the whole process instrumental learning.
  • Instrumental behaviors

Operant Conditioning
• Reinforcement: increases probability of response
  • Positive: desirable stimulus is added
  • Negative: undesirable stimulus is removed
• Punishment: decreases probability of response
  • Positive: adding something bad
  • Negative: removing something good

Reinforcement
• When an event increases the likelihood that a response will occur again
• Positive
  • Adding something good
  • Designed to increase behavior
• Negative
  • Removing something bad
  • Designed to increase behavior

Types of reinforcers
• Primary vs. secondary
  • Primary: inherently satisfying to most people
  • Secondary: gain value from conditioning
• Immediate & delayed
  • Usually needs to be immediate, but humans can handle delayed reinforcers
  • Important for self-control

Rat basketball
• What type of learning was this an example of?
• Can you explain what helped the rats learn to score a basket?

Punishment/Consequence
• When an event decreases the likelihood that a response will occur again
• Two types: Positive & Negative
• Positive ≠ Good. POSITIVE = ADD
  • Adding something bad
  • Designed to decrease behavior
• Negative ≠ Bad. NEGATIVE = SUBTRACT
  • Removing something good
  • Designed to decrease behavior

Importance of reinforcement
• Punishment signals undesirable behavior but doesn't inform of desired behavior
• Punished behavior is suppressed
• Punishment teaches stimulus discrimination
• Punishment (esp. physical) teaches fear & aggression
• Ignore behavior that one wants to punish; look for what to reinforce

Punishment tends to be ineffective
• It tells the organism what not to do, rather than what to do
• Creates anxiety that can interfere with future learning
• Encourages subversive behavior (sneakiness)
• Provides a model for aggressive behavior
• Only true for some races/cultures

Neg. reinforcement ≠ punishment

The Decision Tree
How to solve operant conditioning problems
• Should the behavior increase or decrease?
  • Increase. (Reinforcement)
  • Decrease. (Punishment)
• Is something being added or taken away?
  • Added. (Positive)
  • Removed. (Negative)

Review
            Punishment (decreases behavior)   Reinforcement (increases behavior)
Positive    ADD something unfavorable         ADD something desirable
Negative    SUBTRACT something desirable      SUBTRACT something unfavorable

Applications of Operant Conditioning
Behavior Modification
• Started with Thorndike
• Altering individual behavior (frequency) through positive and negative reinforcement and positive and negative punishment
• Reduction of behavior through its extinction and punishment
• Adaptive behaviors
• A.K.A. Applied Behavior Analysis or Positive Behavior Support (PBS)

A child is riding with an adult, and the child is thirsty. So, the child asks to stop and get a drink. The adult says no, the child asks again, and again, and again... Finally, the adult gives in, saying, "All right, just this once." Big mistake, right? Why? The adult has now put the child on a partial reinforcement schedule, guaranteeing a repetition of the same behavior later on. Instead, the adult should have said, "All right, I'll get you a drink IF you don't ask for one for the next 10 minutes" (the time may have to vary, depending on the child). Then, the adult is providing the child with positive reinforcement for being quiet.

Ending a Relationship?

Behavior Modification
• Reinforcement provides a system of rewards and punishments to change negative behavior into positive responses.
  • Provides rewards when someone acts in a positive manner.
  • Rewards can range from a compliment to granting a special privilege to the patient whose behavior becomes desirable.
  • A negative consequence, such as removing a favorite object or taking away a privilege, might follow unwanted behavior.
• Cognitive behavior modification techniques focus on thought patterns that affect behavior.
  • Involve teaching a patient to recognize thoughts that may be unrealistic or distort reality.
  • Keeping a journal, role-playing, and being asked to defend thoughts that defy reality.
  • Used for eating disorders, anxiety disorders, OCD, panic attacks
• Aversion behavior modification techniques center on the premise that all behavior is learned and can be unlearned. (aka CC)
  • Electrical shock treatment is one example of aversive stimuli used to treat deviant behavior.
  • (Mild) medication given to alcoholics that might make them ill if they drink while using the drug.
• The token system provides immediate rewards while setting goals for future conduct.
  • Distribute a token or similar object each time a patient or student exhibits positive behavior.
  • Tokens can be amassed and later exchanged for a prize or privilege, or lost due to unwanted behavior.
  • This form of behavior modification is commonly used in mental institutions and prisons to help control individuals who show violent tendencies.

Premack principle
• A less frequently performed behavior can be increased by reinforcing it with a more frequent behavior
• Eat your vegetables before you can have dessert!

Operant Conditioning in Daily Life
• Do we wait for the subject to deliver the desired behavior?
• Sometimes, we use a process called shaping.
• Shaping is reinforcing small steps on the way to the desired behavior.

To train a dog to get your slippers, you would have to reinforce him in small steps. First, to find the slippers. Then to put them in his mouth. Then to bring them to you, and so on… this is shaping behavior.

To get Barry to become a better student, you need to do more than give him a massage when he gets good grades. You have to give him massages when he studies for ten minutes, or when he completes his homework. Small steps to get to the desired behavior.

Shaping
• Reinforcing responses that come successively closer to the desired response
• Successive approximations

Shaping
• Reinforcers gradually move the organism's actions toward the desired end behavior
• Successive approximations: behaviors closer & closer to the end learning goal get rewarded
1. Simply turning toward the lever will be reinforced
2. Only stepping toward the lever will be reinforced
3. Only moving to within a specified distance from the lever will be reinforced
4. Only touching the lever with a part of the body will be reinforced
5. Only touching the lever with a specified paw will be reinforced
6. Only depressing the lever partially with the specified paw will be reinforced
7. Only depressing the lever completely with the specified paw will be reinforced

Schedules of reinforcement
• How often do you give the reinforcer?
• Every time, or just sometimes, you see the behavior?
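Returning to the seven lever-press shaping steps listed above: the idea of raising the criterion as the subject improves can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure from the slides. The `ShapingProgram` class, its method names, and the rule "advance after a fixed number of reinforced responses" (`successes_to_advance`) are all assumptions made for the example.

```python
# Hypothetical sketch of shaping by successive approximations.
# The step names paraphrase the seven lever-press criteria listed above;
# the class, its methods, and the advancement rule are illustrative assumptions.

SHAPING_STEPS = [
    "turn toward the lever",
    "step toward the lever",
    "move within a specified distance of the lever",
    "touch the lever with any part of the body",
    "touch the lever with the specified paw",
    "partially depress the lever with the specified paw",
    "completely depress the lever with the specified paw",
]


class ShapingProgram:
    """Reinforce only behavior that meets the current criterion, then raise the bar."""

    def __init__(self, steps, successes_to_advance=3):
        self.steps = steps
        self.successes_to_advance = successes_to_advance  # assumed mastery rule
        self.current = 0    # index of the criterion currently in force
        self.successes = 0  # reinforced responses at the current criterion

    def observe(self, behavior_level):
        """Return True if the observed behavior earns the reinforcer.

        behavior_level is the index in `steps` of the closest approximation
        the subject just produced.
        """
        if behavior_level < self.current:
            # Earlier approximations no longer earn the reinforcer.
            return False
        self.successes += 1
        at_final_step = self.current == len(self.steps) - 1
        if self.successes >= self.successes_to_advance and not at_final_step:
            # Criterion met often enough: demand the next approximation.
            self.current += 1
            self.successes = 0
        return True


if __name__ == "__main__":
    program = ShapingProgram(SHAPING_STEPS, successes_to_advance=2)
    for level in [0, 0, 0, 1, 2]:
        reinforced = program.observe(level)
        print(f"behavior: {SHAPING_STEPS[level]!r:55} reinforced: {reinforced}")
```

The key design point is that a behavior that used to be rewarded (simply turning toward the lever) stops being rewarded once the criterion has moved on, which is what pushes the subject toward the final behavior.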
Schedules of Reinforcement
• Continuous reinforcement schedule: reinforcing a response every time
  • Learning occurs rapidly, extinction occurs rapidly
• Partial reinforcement schedule: reinforcing a response only some of the time
  • Slower acquisition, but resistant to extinction
• Fixed vs. variable, ratio vs. interval
  • Fixed ratio: after set # of responses
  • Variable ratio: after unpredictable # of responses
  • Fixed interval: after set amount of time has passed
  • Variable interval: after unpredictable amount of time has passed

Continuous v. Partial Reinforcement
Continuous
• Reinforce the behavior EVERY TIME the behavior is exhibited.
• Usually done when the subject is first learning to make the association.
• Acquisition comes really fast.
• But so does extinction.
Partial
• Reinforce the behavior only SOME of the times it is exhibited.
• Acquisition comes more slowly.
• But is more resistant to extinction.
• FOUR types of partial reinforcement schedules.

Schedules of reinforcement
Continuous vs. partial
Ratio schedules
1. Fixed-ratio (FR) schedules: reinforcement after a fixed (predictable) number of responses
   Ex: paid $1 for every 20 apples you pick
2. Variable-ratio (VR) schedules: reinforcement after a varying (unpredictable) number of responses
   Induces very high rate of responding
   Ex: scratch & win lottery tickets
Interval schedules
3. Fixed-interval (FI) schedule: reinforcement after a fixed (predictable) amount of time
4. Variable-interval (VI) schedule: reinforcement after varying (unpredictable) amounts of time

Reinforcement Schedules
            Fixed                            Variable
Ratio       after set number of responses    after random number of responses
Interval    after set amount of time         after random amount of time

Name that Schedule!
A. Variable Ratio   B. Fixed Ratio   C. Variable Interval   D. Fixed Interval
• Winning at the slot machines
• Getting a free flight after accumulating 10,000 flight miles
• Receiving an allowance every Saturday regardless of chores, as long as you've done one chore
• Random drug testing at your job
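
One way to make the four partial schedules concrete is to write each as a small rule that decides whether a given response earns the reinforcer. The Python sketch below is an illustration under stated assumptions, not part of the slides. The function names, the `now` time argument, and the choice of uniform random draws for the "variable" schedules are all hypothetical. Interval schedules here reinforce the first response made once the interval has elapsed, timed from the last reinforcer.

```python
import random


def fixed_ratio(n):
    """FR: reinforce every n-th response (e.g. paid per 20 apples picked)."""
    count = 0

    def respond(now):  # `now` is unused; kept so all schedules share one interface
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumed distribution: uniform

    def respond(now):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(duration):
    """FI: reinforce the first response once `duration` time units have passed."""
    available_at = duration

    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + duration
            return True
        return False

    return respond


def variable_interval(mean_duration, rng):
    """VI: like FI, but the wait is unpredictable, averaging mean_duration."""
    available_at = rng.uniform(0, 2 * mean_duration)  # assumed distribution

    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + rng.uniform(0, 2 * mean_duration)
            return True
        return False

    return respond


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so a run is repeatable
    schedules = {
        "FR-20": fixed_ratio(20),
        "VR-5": variable_ratio(5, rng),
        "FI-7": fixed_interval(7),
        "VI-10": variable_interval(10, rng),
    }
    # One response per time unit for 100 time units.
    for name, respond in schedules.items():
        rewards = sum(respond(t) for t in range(100))
        print(f"{name}: {rewards} reinforcers for 100 responses")
```

Note the structural difference this exposes: on ratio schedules, responding faster earns reinforcers faster, while on interval schedules extra responses before the interval elapses earn nothing.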