Overheads for Unit 4: Learning

Learning through Conditioning

Learning: a relatively durable change in behaviour or knowledge that is due to experience.

Conditioning: learning associations between events that occur in an organism's environment.

Classical Conditioning: a stimulus acquires the capacity to evoke a response that was originally evoked by another stimulus. (Ivan Pavlov)

Classical Conditioning Terminology:
- Unconditioned stimulus (UCS): a stimulus that evokes an unconditioned response without previous conditioning
- Unconditioned response (UCR): an unlearned reaction to an unconditioned stimulus that occurs without previous conditioning
- Conditioned stimulus (CS): a previously neutral stimulus that has, through conditioning, acquired the capacity to evoke a conditioned response
- Conditioned response (CR): a learned reaction to a conditioned stimulus that occurs because of previous conditioning

CS + UCS → UCR   (bell + meat → salivation)
CS → CR          (bell → salivation)

Conditioned fears: many irrational fears and phobias can be traced back to experiences that involve classical conditioning.

Conditioning and Physiological Responses

Ader & Cohen (1981; 1984; 1993) have shown that classical conditioning procedures can lead to immunosuppression - a decrease in the production of antibodies.
- They paired a drug that chemically suppresses the immune system (UCS) with an unusual-tasting liquid (CS). Eventually, the liquid alone resulted in immunosuppression (CR).

Acquisition: the process of pairing the CS and UCS until the conditioned response is elicited by the conditioned stimulus.
- stimuli that are novel, unusual, or especially intense are the most likely to become conditioned stimuli
- the timing of the stimulus presentations is critical
  o Simultaneous conditioning: CS and UCS begin and end together
  o Trace conditioning: CS begins and ends before the UCS is presented
  o Short-delayed conditioning: CS begins just before the UCS (about half a second before) and stops at the same time as the UCS. This is the timing that works best.

Extinction: when the UCS no longer follows the CS, the CR gets weaker and eventually ceases.

Spontaneous Recovery: the reappearance of an extinguished response after a period of non-exposure to the conditioned stimulus.
- the renewal effect: if a response is extinguished in a different environment than the one in which it was acquired, it will reappear if the animal is returned to the original environment where acquisition took place
- so extinction leads to suppression of the CR, not "unlearning"

Stimulus Generalization: a stimulus similar to the CS may also elicit the CR.
- e.g. Little Albert (Watson & Rayner, 1920)
- the more similar new stimuli are to the original CS, the greater the generalization

Stimulus Discrimination: the process by which an organism learns to respond differently to stimuli distinct from the CS.
- the less similar new stimuli are to the original CS, the greater the likelihood and ease of discrimination

Higher-order conditioning: when a conditioned stimulus functions as if it were an unconditioned stimulus.

Phase 1: a neutral stimulus is paired with a UCS until it becomes a CS
  CS (tone) + UCS (meat) → UCR (salivation)
  CS (tone) → CR (salivation)

Phase 2: another neutral stimulus is paired with the previously established CS, and acquires the capacity to elicit the response originally evoked by the UCS
  CS (red light) + CS (tone) → CR (salivation)
  CS (red light) → CR (salivation)
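Before moving on to operant conditioning, here is a small numerical sketch of the acquisition and extinction pattern described above. This is only an illustration, not anything from the notes: it assumes that associative strength moves a fixed fraction of the way toward a ceiling on each CS-UCS pairing, and a fixed fraction of the way back toward zero on each CS-alone trial. The learning rate and trial counts are made-up values.

```python
# Illustrative sketch only: a toy model of acquisition and extinction.
# Assumption (not from the notes): the strength of the CS-CR association
# changes by a fixed fraction of the remaining distance to its target
# (maximum strength during acquisition, zero during extinction).

def run_trials(strength, target, n_trials, learning_rate=0.3):
    """Update associative strength across trials and return its trajectory."""
    history = []
    for _ in range(n_trials):
        strength += learning_rate * (target - strength)
        history.append(round(strength, 3))
    return strength, history

strength = 0.0

# Acquisition: CS (bell) is repeatedly paired with UCS (meat).
strength, acquisition = run_trials(strength, target=1.0, n_trials=10)

# Extinction: CS is presented alone, so the CR gradually weakens.
strength, extinction = run_trials(strength, target=0.0, n_trials=10)

print("Acquisition:", acquisition)   # rises toward 1.0 (CR reliably elicited)
print("Extinction: ", extinction)    # falls toward 0.0 (CR suppressed)
```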
Operant Conditioning (also called instrumental learning)

Operant conditioning: a form of learning in which volitional responses come to be controlled by their consequences.

Edward Thorndike (1898): cats and "puzzle boxes". Cats used trial-and-error learning, not simple reflexes, to figure a way out of the puzzle box.
* responses leading to + outcomes are strengthened
* responses leading to - outcomes are weakened

Thorndike's Law of Effect: if a response in the presence of a stimulus leads to satisfying effects, the association between the stimulus and the response is strengthened.

B.F. Skinner (1953, 1969, 1984): organisms tend to repeat those responses that are followed by favourable consequences.

Reinforcement: when an event following a response increases an organism's tendency to make that response.

Operant chamber (a.k.a. a Skinner box): a small enclosure in which an animal can make a specific response that is recorded while the consequences of the response are systematically controlled.

Reinforcement contingencies: the circumstances or rules that determine whether responses lead to the presentation of reinforcers.
- the experimenter manipulates whether positive consequences occur when the animal makes a designated response (typically the delivery of a bit of food into a food cup in the chamber)
- the key dependent variable is the response rate over time, recorded by a cumulative recorder

Acquisition: the initial stage of learning some new pattern of responding.

Shaping: the reinforcement of closer and closer approximations of a desired response.

Extinction: the gradual weakening and disappearance of a response tendency because the response is no longer followed by a reinforcer.

Resistance to extinction: when an organism continues to make a response after delivery of the reinforcer for that response has been terminated.
- the greater the resistance to extinction, the longer the responding will continue

Antecedent stimuli: stimuli that precede a response can also exert considerable influence over operant behaviour.
- when a response is consistently followed by a reinforcer in the presence of a particular stimulus, that stimulus can serve as a signal indicating that the response is likely to lead to a reinforcer
- it becomes a discriminative stimulus: a cue that influences operant behaviour by indicating the probable consequences of a response

Delayed Reinforcement: the longer the delay between the response and the delivery of a reinforcer, the more slowly conditioning proceeds.

Conditioned Reinforcement
- primary reinforcers are events that are inherently reinforcing because they satisfy biological needs (e.g. food, water, sex, warmth)
- secondary reinforcers are events that acquire reinforcing qualities by being associated with primary reinforcers (e.g. money, good grades, praise)

Reinforcement Schedules

Continuous reinforcement: when every instance of a designated response is reinforced.

Intermittent, or partial, reinforcement: when a designated response is reinforced only some of the time.

Four Popular Intermittent Schedules

Fixed-Ratio (FR): reinforcement comes after a fixed number of responses, e.g. FR-25 = reinforcement on every 25th response.
- high rate of responding

Variable-Ratio (VR): the number of responses needed for reinforcement varies, e.g. VR-10 = reinforced, on average, after every 10th response.
- highest rate of responding
- greatest resistance to extinction

Fixed-Interval (FI): a reinforcer is delivered for the first response made after a fixed period of time has gone by, e.g. FI-10 = the subject has to wait 10 seconds after a reinforcement before a response will yield another reinforcer.
- response patterns are "scalloped"

Variable-Interval (VI): the first response after a variable period of time has elapsed is reinforced, e.g. VI-20 = reinforcers delivered, on average, once every 20 seconds.
- generates a low but stable response rate
- hard to extinguish

Comparing the schedules:
- ratio schedules produce more rapid responding
- variable schedules tend to produce steadier response rates and greater resistance to extinction
- shifting to a higher ratio stimulates harder work and greater productivity
- gambling is reinforced on a variable-ratio schedule, which produces rapid, steady responding with great resistance to extinction
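To make the four schedules concrete, here is a minimal sketch (not from the notes) of how each one decides which responses earn a reinforcer. The response times, schedule parameters, and function names are hypothetical choices made for the example.

```python
# Illustrative sketch only: deciding which responses earn a reinforcer
# under the four intermittent schedules. Response times (in seconds) and
# schedule parameters are hypothetical.
import random

random.seed(0)  # reproducible example output

def fixed_ratio(responses, n):
    """FR-n: every nth response is reinforced."""
    return [i for i in range(1, len(responses) + 1) if i % n == 0]

def variable_ratio(responses, n):
    """VR-n: reinforced after a varying number of responses, averaging n."""
    reinforced, needed, count = [], random.randint(1, 2 * n - 1), 0
    for i in range(1, len(responses) + 1):
        count += 1
        if count >= needed:
            reinforced.append(i)
            count, needed = 0, random.randint(1, 2 * n - 1)
    return reinforced

def fixed_interval(times, t):
    """FI-t: first response made at least t seconds after the last reinforcer."""
    reinforced, last = [], 0.0
    for when in times:
        if when - last >= t:
            reinforced.append(when)
            last = when
    return reinforced

def variable_interval(times, t):
    """VI-t: like FI, but the required wait varies, averaging t seconds."""
    reinforced, last, wait = [], 0.0, random.uniform(0, 2 * t)
    for when in times:
        if when - last >= wait:
            reinforced.append(when)
            last, wait = when, random.uniform(0, 2 * t)
    return reinforced

responses = [round(0.8 * i, 1) for i in range(1, 51)]  # one response every 0.8 s
print("FR-25 reinforces response numbers:", fixed_ratio(responses, 25))
print("VR-10 reinforces response numbers:", variable_ratio(responses, 10))
print("FI-10 reinforces responses at times:", fixed_interval(responses, 10))
print("VI-20 reinforces responses at times:", variable_interval(responses, 20))
```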
Positive Reinforcement: when a stimulus that follows a behaviour increases the probability of that behaviour over time.

Negative Reinforcement: when the removal of an unpleasant or aversive stimulus is made contingent on a particular behaviour, thereby strengthening that response.

Avoidance behaviour

Escape learning: when an organism acquires a response that decreases or ends some aversive stimulation.
- shuttle box paradigm: two compartments with a door that can be opened and closed by the experimenter
- the animal is placed in one compartment and an electric current in the floor of the chamber is turned on with the doorway open
- the animal learns to escape the shock by going into the other compartment
- the escape response is strengthened through negative reinforcement

Escape learning can lead to avoidance learning: when an organism acquires a response that prevents some aversive stimulus from happening at all.
- e.g. the experimenter gives the animal a signal that the shock is forthcoming
- avoidance responses are long-lasting, even though we are not sure how the behaviour continues to be reinforced; the best explanation is the two-process theory of avoidance

The Two-Process Theory of Avoidance
- the warning light becomes a CS (via classical conditioning), eliciting conditioned fear in the animal
- fleeing to the other side of the box is an operant response that produces negative reinforcement because it reduces conditioned fear
- the avoidance response removes an internal aversive stimulus (conditioned fear) rather than an external aversive stimulus (the shock)
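The two-process account can also be rendered as a toy simulation. This is only a sketch of the logic, not an established model from the notes: the learning rates, probabilities, trial count, and variable names are all made-up assumptions.

```python
# Illustrative sketch only: a toy rendering of the two-process account of
# avoidance in a shuttle box. All numbers are hypothetical; only the
# structure (classical process + operant process) matters.
import random

random.seed(1)
fear = 0.0             # conditioned fear elicited by the warning light
avoid_strength = 0.05  # operant tendency to cross when the light comes on

for trial in range(1, 16):
    crossed_early = random.random() < avoid_strength  # crossed before the shock?

    if not crossed_early:
        # Light is followed by shock: classical conditioning makes the light
        # a CS for fear, and the eventual escape response is negatively
        # reinforced because it ends the shock.
        fear += 0.3 * (1.0 - fear)
        avoid_strength += 0.2 * (1.0 - avoid_strength)
    else:
        # Crossing at the light prevents the shock entirely: the response is
        # negatively reinforced because it reduces an internal aversive state
        # (conditioned fear), not the external shock.
        avoid_strength += 0.4 * fear * (1.0 - avoid_strength)

    print(f"trial {trial:2d}  fear={fear:.2f}  "
          f"avoidance strength={avoid_strength:.2f}  avoided shock={crossed_early}")
```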
Punishment: when an event following a response weakens the tendency to make that response.
- typically involves the presentation of an aversive stimulus
- can also involve the removal of a rewarding stimulus
- it can have unintended side-effects:
  - general suppression of behavioural activity
  - strong emotional responses, including fear, anxiety, anger, and resentment
  - physical punishment often leads to an increase in aggressive behaviour

Instinctive Drift: when an animal's innate response tendencies interfere with the conditioning process.

Conditioned taste aversion: the tendency to associate a substance's taste with illness caused by eating that substance.
- can be acquired in just one pairing
- even with a delay of hours between the taste and the illness
- John Garcia (1989)
- probably a by-product of our evolutionary history

Preparedness and phobias

Preparedness (Seligman, 1971): a species-specific predisposition to be conditioned in certain ways and not others.
- examples: instinctive drift, conditioned taste aversion, phobic responses
- we carry an innate tendency, acquired through natural selection, to respond quickly and automatically to stimuli that posed a survival threat to our ancestors

Observational Learning (Bandura, 1977, 1986)

Observational learning: an organism's responding is influenced by the observation of others, who are called models.
- e.g. the Bobo doll study

Four key processes:
1. Attention. You need to be paying attention to someone else's behaviour and its consequences.
2. Retention. You need to store a mental representation of what you have witnessed in memory so you can use it later.
3. Reproduction. You have to be able to reproduce the response.
4. Motivation. You are not likely to engage in the behaviour unless you are motivated, e.g. by the expectation that it will pay off for you.

Reinforcement affects which responses are actually performed more than which responses are acquired.

Bandura's theory explains why physical punishment tends to increase aggressive behaviour in children: the adults who administer it are unwittingly serving as models of aggressive behaviour.