Learning
Chapter 5 Part II
William G. Huitt
Last revised: May 2005

Operant Conditioning

The major theorists for the development of operant conditioning are:
• Edward Thorndike
• John Watson
• B.F. Skinner

• Operant conditioning investigates the influence of consequences on subsequent behavior.
• Operant conditioning investigates the learning of voluntary responses.
• It was the dominant school in American psychology from the 1930s through the 1950s.

• Where classical conditioning illustrates S-->R learning, operant conditioning is often viewed as R-->S learning.
• It is the consequence that follows the response that influences whether the response is likely or unlikely to occur again.

• The three-term model of operant conditioning (S-->R-->S) incorporates the concept that responses cannot occur without an environmental event (e.g., an antecedent stimulus) preceding them.
• While the antecedent stimulus in operant conditioning does not ELICIT or CAUSE the response (as it does in classical conditioning), it can influence its occurrence.

• When the antecedent does influence the likelihood of a response occurring, it is technically called a discriminative stimulus.
• It is the stimulus that follows a voluntary response (i.e., the response's consequence) that changes the probability of whether the response is likely or unlikely to occur again.

• There are two types of consequences:
  – positive (sometimes called pleasant)
  – negative (sometimes called aversive)

• Two actions can be taken with these stimuli:
  – they can be ADDED to the learner’s environment.
  – they can be SUBTRACTED from the learner’s environment.
• If adding or subtracting the stimulus results in a change in the probability that the response will occur again, the stimulus is considered a CONSEQUENCE.
• Otherwise the stimulus is considered a NEUTRAL stimulus.

• There are 4 major techniques or methods used in operant conditioning.
• They result from combining:
  – the two major purposes of operant conditioning (increasing or decreasing the probability that a specific behavior will occur in the future),
  – the types of stimuli used (positive/pleasant or negative/aversive), and
  – the action taken (adding or removing the stimulus).

Outcomes of Conditioning

Stimulus             Increase Behavior                   Decrease Behavior
Positive/pleasant    Add: Positive Reinforcement         Subtract: Response Cost
Negative/aversive    Subtract: Negative Reinforcement    Add: Punishment

• Two stages of negative reinforcement
  – Escape learning
    • Learning to perform a behavior because it terminates an aversive event
  – Avoidance learning
    • Learning to avoid events or conditions associated with dreaded or aversive outcomes
    • Many avoidance behaviors are maladaptive and occur in response to phobias

• Disadvantages of punishment and response cost
  – Punishment and response cost do not extinguish an undesirable behavior; rather, they suppress that behavior when the punishing agent is present.
  – Punishment and response cost indicate that a behavior is unacceptable but do not help people develop more appropriate behaviors.
  – The person who is severely punished often becomes fearful and feels angry and hostile toward the punisher. These reactions may be accompanied by a desire to retaliate or to avoid or escape from the punisher and the punishing situation.
  – Punishment frequently leads to both negative affect and aggression. Those who administer physical punishment may become models of aggressive behavior.

• Shaping behavior
  – An operant conditioning technique that consists of gradually molding a desired behavior (response) by reinforcing responses that become progressively closer to the desired behavior
  – B. F. Skinner demonstrated that shaping is particularly effective in conditioning complex behaviors
  – Successive approximations
    • A series of gradual steps, each of which is more like the final desired response
  – The motives of the shaper and the person or animal whose behavior is being shaped are different
  – The shaper seeks to change another’s behavior by controlling its consequences
  – The person or animal’s motive is to gain rewards or avoid unwanted consequences

Use Conditioning to Modify Your Own Behavior

1. Identify the target behavior.
   • observable
   • measurable
2. Gather and record baseline data.
   • daily record
   • note where the behavior takes place
   • note cues (or temptations) in the environment
3. Plan your behavior modification program.
   • Set goals (small steps, moderate & systematic change)
   • Develop steps and actions
4. Choose reinforcers.
   • Pick reinforcers appropriate for the action
   • Be prepared for not receiving them
5. Set the reinforcement conditions and begin recording and reinforcing your progress.
   • Keep in mind Skinner’s concept of shaping: rewarding small steps toward the desired outcome.
   • Be perfectly honest with yourself and claim a reward only when you meet the goals.
   • Chart your progress as you work toward gaining more control over the target behavior.

Schedules of consequences

Stimuli are presented in the environment according to a schedule, of which there are two basic categories:
• Continuous
• Partial or Intermittent

Continuous reinforcement simply means that the behavior is followed by a consequence each time it occurs.
• Excellent for getting a new behavior started.
• Behavior stops quickly when reinforcement stops.
• Is the schedule of choice for punishment and response cost.

Intermittent schedules are based either on the
• passage of time OR
• number of correct responses

The consequence can be delivered based on
• a fixed amount of time or number of correct responses OR
• a slightly different amount of time or number of responses that varies around a particular number

This results in four classes of intermittent schedules.

Fixed Interval
• The first correct response after a set amount of time has passed is reinforced (i.e., a consequence is delivered).
• The time period required is always the same.
• Example: Spelling test every Friday.
(Figure: pattern of behavior for fixed interval schedule)

Variable Interval
• The first correct response after a set amount of time has passed is reinforced (i.e., a consequence is delivered).
• After the reinforcement, a new time period (shorter or longer) is set, with the average equaling a specific number over a sum total of trials.
• Example: Pop quiz
(Figure: pattern of behavior for variable interval schedule)

Fixed Ratio
• A reinforcer is given after a specified number of correct responses. This schedule is best for learning a new behavior.
• The number of correct responses required for reinforcement remains the same.
• Example: Ten math problems for homework
(Figure: pattern of behavior for fixed ratio schedule)

Variable Ratio
• A reinforcer is given after a set number of correct responses.
• After reinforcement, the number of correct responses necessary for reinforcement changes. This schedule is best for maintaining behavior.
• Example: Student raises hand to be called on.
(Figure: pattern of behavior for variable ratio schedule)

Rules In Analyzing Examples

• The following questions can help in determining whether operant conditioning has occurred.
  a. What behavior in the example was increased or decreased?
  b. Was the behavior
     • increased (if yes, the process has to be either positive or negative reinforcement), OR
     • decreased (if the behavior was decreased, the process is either response cost or punishment)?
  c. What was the consequence / stimulus that followed the behavior in the example?
  d. Was the consequence (stimulus) added or removed?
     • If added, the process was either positive reinforcement or punishment.
     • If it was subtracted, the process was either negative reinforcement or response cost.

Analyzing An Example

Billy likes to camp out in the backyard. He camped out every Friday during the month of June. The last time he camped out, some older kids snuck up to his tent while he was sleeping and threw a bucket of cold water on him. Billy has not camped out for three weeks.

a. What behavior was changed?
   Camping out
b. Was the behavior strengthened or weakened?
   Weakened (behavior decreased). Eliminate positive and negative reinforcement.
c. What was the consequence?
   Having water thrown on him.
d. Was the consequence added or subtracted?
   Added

Since a consequence was ADDED and the behavior was WEAKENED (REDUCED), the process was PUNISHMENT.

Classical vs Operant Conditioning

• Processes of generalization, discrimination, extinction, and spontaneous recovery occur in both classical and operant conditioning
• Both types of conditioning depend on associative learning
• In classical conditioning, an association is formed between two stimuli
• In operant conditioning, the association is established between a response and its consequences

• In classical conditioning, the focus is on what precedes the response
• In operant conditioning, the focus is on what follows the response
• In classical conditioning, the subject is passive and responds to the environment rather than acting on it
• In operant conditioning, the subject is active and operates on the environment
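
The four analysis questions used in the camping example (did the behavior increase or decrease, and was the stimulus added or subtracted) amount to a two-by-two lookup. The Python sketch below is only an illustration of that lookup; the names Change, Action, PROCESS, and classify_process are hypothetical and do not come from the original slides.

```python
from enum import Enum


class Change(Enum):
    """Answer to question (b): what happened to the behavior?"""
    INCREASED = "increased"
    DECREASED = "decreased"


class Action(Enum):
    """Answer to question (d): what was done with the stimulus?"""
    ADDED = "added"
    SUBTRACTED = "subtracted"


# Hypothetical lookup table mirroring the four outcomes named in the slides.
PROCESS = {
    (Change.INCREASED, Action.ADDED): "positive reinforcement",
    (Change.INCREASED, Action.SUBTRACTED): "negative reinforcement",
    (Change.DECREASED, Action.ADDED): "punishment",
    (Change.DECREASED, Action.SUBTRACTED): "response cost",
}


def classify_process(change: Change, action: Action) -> str:
    """Return the operant conditioning process for one analyzed example."""
    return PROCESS[(change, action)]


# Billy's example: camping out decreased after cold water was added.
print(classify_process(Change.DECREASED, Action.ADDED))
```

For the camping example, the behavior decreased and the stimulus was added, so the lookup returns punishment, matching the conclusion reached above.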