CHAPTER 6
PSYCH 100
LEARNING

Background
• Behavioral and Social Learning Theories: Behaviorism (ch. 1)
• Learning is “the relatively permanent change in behavior brought about as a result of experience or practice”
• Learning is an internal event, but learning is not recognized as learning until it is displayed by overt behavior

Classical Conditioning: Learning Through Association

Ivan Pavlov
Ivan Pavlov was a prominent Russian physiologist who did research on digestion. Pavlov discovered that dogs will salivate in response to the sound of a tone in a process we now call classical conditioning.
• Classical conditioning is the process of learning by which a previously neutral stimulus comes to elicit a response identical or similar to one that was originally elicited by another stimulus as the result of the pairing or association of the two stimuli.

Classical Conditioning
• Simple/Reflex Behavior
• Focuses on ANTECEDENTS: Events that take place BEFORE the response
• Learning takes place when an association is made between two previously unrelated stimuli
Stimulus → Response

Pavlov’s Demo
Pavlov noticed that the dogs in his experiments salivated before being fed, whenever they heard the clanging of the metal food carts being wheeled into the laboratory. To isolate the cause, Pavlov paired the sound of a bell tone with the presentation of meat powder several times, then presented the sound of the tone alone (without meat powder).

The dog reacted to the sound of the tone even without the presentation of meat powder. Pavlov demonstrated that a learned association was formed by the pairing of events (tone and meat powder) in the animal’s environment.

(NS) Neutral Stimulus → (NR) No Response
At first, the bell tone is a neutral stimulus that does not cause the dog to drool.
Pavlov’s Demo
(UCS) Unconditioned Stimulus → (UCR) Unconditioned Response
The meat powder is an unconditioned stimulus that elicits the unconditioned response. Unconditioned means unlearned: dogs naturally salivate around food.

Repeated pairings
(NS) Neutral Stimulus → (NR) No Response
(UCS) Unconditioned Stimulus → (UCR) Unconditioned Response
When the neutral stimulus (bell) is paired with the presentation of meat powder, the meat powder elicits the unconditioned response (drooling). The neutral stimulus by itself still has no effect.

(CS) Conditioned Stimulus → (CR) Conditioned Response
After conditioning (repeated pairing of the neutral stimulus with the unconditioned stimulus), the neutral stimulus becomes a conditioned stimulus: the bell alone now elicits salivation as the result of learning by association.

Phase 1: Before Conditioning
(a) US (food in mouth) → UR (salivation)
(b) NS (tone) → (no salivation)
Phase 2: During Conditioning
(c) NS (tone) + US (food in mouth) → UR (salivation)
Phase 3: After Conditioning
(d) CS (tone) → CR (salivation)

Extinction
[Figure: extinction curve, CS presented alone]
Extinction is the process in which the association between the unconditioned stimulus (meat powder) and the conditioned stimulus (bell ringing) is broken. When the bell is presented enough times without being paired with meat, the response extinguishes.

Spontaneous Recovery
[Figure: extinction (CS alone), a 24-hour rest, then spontaneous recovery (CS alone)]
Spontaneous recovery occurs when the conditioned stimulus again elicits an extinguished conditioned response after some time has passed since extinction. Once the response to the CS has undergone extinction, a 24-hour rest period is followed by the return of the response, but not at full strength.

Stimulus generalization is the tendency for stimuli that are similar to the conditioned stimulus to elicit the conditioned response.
Stimulus discrimination is the ability to differentiate among related stimuli, so that only some of them elicit the conditioned response.

Factors that strengthen classical conditioning include:
1. Frequency of pairings
2. Timing
3. Intensity of the unconditioned stimulus

Little Albert
John Watson and Rosalie Rayner applied principles of classical conditioning to create a fear response in a young boy called “Little Albert.” Albert developed a conditioned emotional response, a fear of a white rat, through repeated pairing of the rat with an unpleasant jarring sound. Watson and Rayner then examined the generalization of the acquired fear to other related stimuli, such as other furry objects, including a rabbit, a fur coat, and even Watson himself wearing a Santa Claus mask.

(US) Loud Gong → (UR) Fear
(CS) White Rat → (CR) Fear
• In the famous case of Little Albert, the CS was the white rat and the US was the loud gong sound. The CR to the white rat was a learned fear response.
• While this project taught us quite a bit, one must consider the ethical failures of Watson and Rayner’s work.
• Do you think the results were worth the ethical violations? Would such a project be permitted today?

Taste Aversion
(US) Unconditioned Stimulus (Illness) → (UR) Nausea
(CS) Conditioned Stimulus (Taste of Poisoned Berries) → (CR) Nausea
Taste aversion is a special instance of conditioning because it breaks two of the cardinal rules of the process: it may occur after only one pairing of CS and US, and the presentation of the US (illness) and the CS (taste) may be separated by hours. Taste aversion also shows the adaptive value of conditioning. It is a crucial response that allows us to learn to avoid certain foods that have sickened us in the past.

Classical Conditioning in the Real World
1. Advertising
2. Positive Emotions
3. Drug Cravings
4. Taste Aversions

Operant Conditioning: Learning Through Consequences

Operant Conditioning
• A second form of learning is operant conditioning, in which behavior changes as a result of the consequences that follow a response.
• The major figure in operant conditioning was the American psychologist B. F. Skinner.

Operant Conditioning
• A response that is followed by a reinforcer is strengthened and is more likely to occur again
• A reinforcer increases the frequency of the response it follows
• Sequence: Response → Stimulus (Reinforcer)

Operant Conditioning
• Complex/Voluntary Behavior
• Focuses on CONSEQUENCES: Events which take place AFTER the target response
• Learning takes place when a desired (target) response is affected by its consequences
• Shaping: A procedure for teaching complex behaviors that at first reinforces behaviors similar to the target behavior

EXAMPLES
• Child does homework → gets to play game
• Child doesn’t do homework → no game
• As children grow, schools expect them to sit for longer and longer periods of time
• Teacher using drafts to teach writing:
  1st → basic formatting
  2nd → content in right place
  3rd → improve quality of content

Edward Thorndike
Based on his use of a puzzle box in animal experiments, Edward Thorndike, an early learning theorist, proposed the Law of Effect, which emphasized the role of consequences in shaping behavior.

B. F. Skinner
Operant conditioning is a form of learning in which responses come to be strengthened by their consequences. B. F. Skinner of Harvard University first described this type of learning in the late 1930s.

A Skinner box is a small enclosure in which an animal can be reinforced with a food pellet for a particular response, such as pressing a lever. The rate of response is recorded. The floor of the Skinner box can be electrified to investigate escape or avoidance learning.
[Figure: Skinner box, with labels: speaker, signal lights, lever, food dispenser, food pellet, electric grid, shock generator]

Operant Conditioning
• Skinner’s principle of reinforcement holds that organisms tend to repeat those responses that are followed by favorable consequences, or reinforcement.
• A stimulus is positively reinforcing if the rate or probability of a response increases after the stimulus is presented. Food, water, and sleep are examples.
• An example of positive reinforcement is when you tell a joke and all your friends laugh. You then become more likely to keep telling jokes. But what happens to the likelihood of your joke telling if no one laughs?

Operant Conditioning
[Figure: positive reinforcement diagrams (behavior/response, rewarding stimulus presented, consequence); surviving labels from the first diagram: “Patronize diner”, “Tendency to tell jokes increases”]
• Behavior (response): Press lever
• Rewarding stimulus presented: Food delivered
• Consequence: Tendency to press lever increases
Responses can be strengthened either by presenting positive reinforcers or by removing negative reinforcers. Positive reinforcement occurs when a response is strengthened because it is followed by the presentation of a (rewarding) stimulus.

• Behavior (response): Press lever
• Aversive stimulus removed: Shock turned off
• Consequence: Tendency to press lever increases
A stimulus is negatively reinforcing when its removal strengthens the preceding response. Negative reinforcers are aversive stimuli, such as pain or anxiety. The rat in this example presses the lever, which removes the aversive effects of an electric shock. Similarly, a person learns to turn on a fan or an air conditioner when these responses are reinforced by relief from uncomfortable heat.
[Figure: Primary Reinforcer; Secondary Reinforcer ($)]
• Primary reinforcers: satisfy basic biological needs or drives
• Secondary reinforcers: acquire their value through learning and association with a primary reinforcer

Discriminative Stimulus
[Figure: Skinner box, with labels: water, light, glass, food pellet dispenser, food tray, lever]
When the light shines (a discriminative stimulus), the food dispenser will release a food pellet when the animal performs the desired response (pressing the lever).

Shaping
Operant conditioning is usually established through a gradual process called shaping, which involves the reinforcement of closer and closer approximations to a desired response. Shaping is necessary when an organism does not, on its own, offer the desired response. For example, when a rat is first placed in a Skinner box, it may not press the lever at all. In this case the experimenter begins shaping lever-pressing behavior by reinforcing (feeding) the rat for successive steps toward the target response, such as when it moves closer to the lever.

Extinction
Extinction in operant conditioning is the process by which the association between response and reinforcer is broken. The most efficient means of unpairing a response and a reinforcer is to stop reinforcing the operant response; for example, by not presenting food when the bar is pressed.

A schedule of reinforcement determines which occurrences of a specific response result in the presentation of a reinforcer. Continuous reinforcement occurs when every instance of a designated response is reinforced. Partial reinforcement occurs when a designated response is reinforced only some of the time.

A fixed-ratio schedule entails giving a reinforcer after a fixed number of desired responses are produced. A fixed-ratio schedule generally provides a rapid rate of response, indicated by the steep slope of the cumulative-response curve.
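The difference between continuous reinforcement and a fixed-ratio schedule can be made concrete with a short sketch. This is only an illustration in Python, not something from the lecture: the function name `fixed_ratio_reinforced` and the ratio of 3 are assumptions chosen for the example, and treating continuous reinforcement as a ratio of 1 is a simplification.

```python
def fixed_ratio_reinforced(num_responses: int, ratio: int) -> list[bool]:
    """Return, for each response in order, whether it earns a reinforcer.

    Hypothetical helper for illustration: on a fixed-ratio schedule the
    reinforcer follows every `ratio`-th response. A ratio of 1 reinforces
    every response, which matches continuous reinforcement.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [(n % ratio == 0) for n in range(1, num_responses + 1)]


# Continuous reinforcement: every lever press is reinforced.
continuous = fixed_ratio_reinforced(6, ratio=1)

# Partial reinforcement on an assumed fixed-ratio schedule of 3:
# only the 3rd and 6th lever presses are reinforced.
fixed_ratio_3 = fixed_ratio_reinforced(6, ratio=3)

print(continuous)
print(fixed_ratio_3)
```

The sketch only marks which responses are reinforced. It says nothing about how quickly the animal responds, which is what the cumulative-response curve describes.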
[Figure: cumulative responses over time on a fixed-ratio (FR) schedule: rapid responding, short pause after reinforcement, lower resistance to extinction]

A variable-ratio schedule entails giving a reinforcer after a desired response occurs following a variable number of non-reinforced responses. Variable-ratio schedules, like fixed-ratio schedules, tend to produce a rapid response rate. Intermittent reinforcement, or partial reinforcement, occurs when a designated response is reinforced only some of the time.

[Figure: cumulative responses over time on a variable-ratio (VR) schedule: high, steady rate without pauses, higher resistance to extinction. Note: higher ratios generate higher response rates]

Avoidance Learning
Avoidance behavior is conditioned by presenting a stimulus, usually a light or bell, that signals that the aversive stimulus (electric shock) will follow in a few seconds. The animal learns to avoid the aversive stimulus by moving to the adjoining (non-electrified) compartment when the light appears, but before the painful stimulus is presented.

Punishment
Type of punishment: Presentation of unpleasant stimulus
• Behavior: You bite into a hot red pepper
• Punishment: Your tongue burns
• Effect (frequency of behavior declines): You avoid biting hot peppers in the future
Type of punishment: Removal of reinforcing stimulus
• Behavior: Child hits another child in the playground
• Punishment: Child is removed from the playground or is required to sit out for a period of time
• Effect (frequency of behavior declines): Child no longer hits other children in the playground
Punishment involves the presentation of an aversive or unpleasant stimulus, or removal of a reinforcing stimulus, after an undesired response occurs. Punishment generally leads to a decline in the frequency of the punished response.
Drawbacks of punishment
• May temporarily suppress but not eliminate behaviors
• Does not teach new behaviors
• Can have undesirable consequences
• May become abusive
• May represent inappropriate modeling

Occasional use of mild punishment may sometimes be appropriate
• Verbal reprimands
• Removal of a reinforcer
• Time-outs

Observational Learning
• Observational learning occurs when an organism’s response is influenced by the observation of others as “models.”
• Psychologist Albert Bandura investigated observational learning and identified four key processes: attention, retention, reproduction, and reinforcement.
• Influence of modeling is generally stronger when the model is similar to the learner and when the model is positively reinforced for performing the behavior.
• Fears may also be acquired by modeling.

*COMPARISON*
Occurs when
• Classical Conditioning: Two stimuli are paired
• Operant Conditioning: Behavior (response) is followed by a reinforcing stimulus
Nature of Response
• Classical Conditioning: Involuntary, Simple
• Operant Conditioning: Voluntary, Complex
Focuses on
• Classical Conditioning: ANTECEDENTS: Events which take place BEFORE the target response
• Operant Conditioning: CONSEQUENCES: Events which take place AFTER the target response
Association Required
• Classical Conditioning: S → R
• Operant Conditioning: R → S
Learning takes place when…
• Classical Conditioning: An association is made between two unrelated stimuli
• Operant Conditioning: A desired response is affected by its consequences

YOUR TURN – SORTING TIME!
Sort each item under Classical Conditioning or Operant Conditioning:
• Response is followed by a reinforcing stimulus
• Do homework, get a good grade
• Involuntary
• Candy in class, love class
• Yelled at by history teacher out of the blue, never want to take history again
• Voluntary
• Reinforced Response
• Participate in class → get candy
• Association
• Stimulus, then Response