Chapter 5: Models to explain learning

Conditioning is the process of learning associations between a stimulus in the environment (one event) and a behavioural response (another event).

Classical conditioning

Classical conditioning is a model of learning that occurs through the repeated involuntary association of two or more different stimuli.
Classically conditioned responses are like reflexes in that they occur involuntarily, but unlike reflexes they are learned.
A stimulus is any object, environment or event that precedes an action.
A response is an action that follows a stimulus.
Conditioned reflexes are automatic responses that occur as the result of previous experience. Humans may associate a reflex response with a new stimulus, and hence it becomes a learned reflex response. This involves little conscious thought or awareness.
For example:
Hitting the brakes in response to seeing a speed camera
Stopping talking when the lights dim in a cinema
Packing up books when the bell sounds
Conditioned emotional responses are emotional responses to a stimulus that doesn’t naturally produce that response, learned through the process of classical conditioning.

Pavlov’s experiment of classical conditioning

In Pavlov’s experiments, the dogs salivated each time meat powder was presented to them. The meat powder in this situation was an unconditioned stimulus (UCS). The dogs’ salivation to the meat powder was an unconditioned response (UCR). Pavlov would sound a tone (like ringing a bell) and then give the dogs the meat powder. The sound of the bell was the neutral stimulus (NS). Prior to conditioning, the dogs did not salivate when they heard the bell because the sound of the bell had no association for the dogs. When Pavlov paired the bell with the meat powder over and over again, the previously neutral stimulus also began to elicit salivation from the dogs. Therefore, the neutral stimulus became the conditioned stimulus (CS).
Eventually, the dogs began to salivate to the sound of the bell alone, just as they previously had salivated at the sound of the assistant’s footsteps. The behaviour caused by the conditioned stimulus is called the conditioned response (CR).

Classical conditioning as a three-phase process

All examples are from Pavlov’s experiment.
The neutral stimulus (NS) is any stimulus that does not normally produce a predictable response.
For example, no salivation in response to the ringing of a bell
The unconditioned stimulus (UCS) is any stimulus that consistently produces a particular, naturally occurring, automatic response.
For example, the dog food
The unconditioned response (UCR) is the response that occurs automatically when the UCS is presented.
For example, salivation in response to the presence of food
The conditioned stimulus (CS) is the stimulus that is ‘neutral’ at the start of conditioning, but eventually triggers a very similar response to that caused by the UCS after conditioning.
For example, the ringing of the bell
The conditioned response (CR) is the learned response that is produced by the conditioned stimulus.
For example, salivation by the dogs in response to the ringing of the bell

UCS: Think . . . what action/object stimulated a natural automatic response initially?
UCR: Think . . . what was the reaction that occurred automatically?
NS: Think . . . what action/object used to produce no reaction before conditioning?
CS: Think . . . this was the NS that now produces a reaction/response
CR: Think . . . this is the same as the UCR but it now occurs due to the CS being present

NOTE: The neutral stimulus is the same as the conditioned stimulus (NS is before conditioning, CS is after conditioning). The unconditioned response is the same behaviour as the conditioned response; it is just triggered by a different event.

What happens in each stage of classical conditioning

Before conditioning:
The neutral stimulus elicits no response
The unconditioned stimulus elicits an unconditioned response
This is involuntary behaviour, as it occurs without conscious awareness

During conditioning:
The neutral stimulus is paired with the unconditioned stimulus to produce an unconditioned response – they must be repeatedly associated
The timing and order of the presentation are important
  o The neutral stimulus has to be presented first, otherwise the organism will ignore the presence of the neutral stimulus
Typically, the neutral stimulus is presented almost immediately before the unconditioned stimulus, and there should only be a brief interval between the presentation of the two (about 0.5 seconds)

After conditioning:
This stage follows multiple pairings of the neutral stimulus and the unconditioned stimulus
The neutral stimulus becomes a conditioned stimulus
The organism has learned to respond to the conditioned stimulus in the same way that it originally responded to the unconditioned stimulus

Key processes of classical conditioning

Acquisition is used to describe the overall process during which an organism learns to associate two events – the NS and the UCS – until the NS alone has become a CS that produces the CR (basically just the whole process of classical conditioning).
Stimulus generalisation is the tendency for another stimulus that is similar to the original CS to produce a response that is similar, but not identical, to the CR.
For example, the dog salivating in response to the ringing of a mobile phone or a doorbell
Stimulus discrimination occurs when a person or animal responds to the CS only, but not to any other stimulus that is similar to the CS.
For example, the dog salivating only in response to the sound of the ‘experimental’ bell, and not to other similar sounds such as a doorbell or a mobile phone ringing
Extinction is the gradual decrease in the strength or rate of a CR that occurs when the UCS is no longer presented.
For example, Pavlov’s dogs eventually ceased salivating (CR) in response to the bell (CS) presented alone after a number of trials in which the food (UCS) did not follow the sound of the bell
Spontaneous recovery is the reappearance of a CR when the CS is presented, following a rest period (no CS present) after the CR appears to have been extinguished.
For example, Pavlov’s dogs starting to salivate again to the sound of the bell after extinction was initially achieved as a part of the experimental research

Examples of key processes of classical conditioning during Pavlov’s dog experiment

Extinction was explored in Pavlov’s experiments with dogs. Pavlov found that when he repeatedly presented the bell (CS) without the meat powder (UCS), extinction occurred: the dogs stopped salivating in response to the bell. However, after a couple of hours of rest from this extinction training, the dogs began to salivate again when Pavlov rang the bell. This behaviour is an example of spontaneous recovery.
Stimulus discrimination was demonstrated when the dogs distinguished between the tone that sounded before they were fed and other tones (e.g. a doorbell), responding only to the one they had learned to associate with food. This ability to discriminate was useful to them because the other sounds did not predict the arrival of food. If stimulus generalisation had occurred, the dogs would have salivated in response to many tones that were similar to the conditioned tone, incorrectly treating these tones as also predicting the arrival of food.

The Little Albert experiment

The experiment was run in 1920 by John B. Watson and Rosalie Rayner, who used classical conditioning to condition a baby boy to fear a white rat.

What occurred during the experiment:
Initially, Little Albert was presented with various neutral stimuli, including a rabbit, a dog, a monkey, different masks and a white rat
He did not show any fear response to these stimuli
Watson and Rayner would present Little Albert with the white rat, and immediately afterwards strike a hammer against a metal bar behind Little Albert’s head
The loud noise made by the hammer and metal bar would frighten Little Albert, causing him to cry
They repeatedly paired the loud sound with the white rat, until the presence of the white rat alone elicited a fear response from Little Albert
This fear response was then generalised to other stimuli such as a rabbit, a dog and a sealskin coat, with slightly less fearful reactions to cotton-wool balls and a Santa Claus mask
After the experiment, he also demonstrated stimulus generalisation in that his fear response was extended to all white or furry stimuli, such as a rabbit, a dog and even a white mask.
Little Albert’s mother moved away with him before the experiment ended, meaning that Watson and Rayner did not get the chance to extinguish this fear response. Little Albert is reported to have passed away at the age of 6, due to causes unrelated to the experiment.

Note: The mother was not made fully aware that her son was used in these experiments. She left the workplace with her son before the fear responses were removed. The fears may have disappeared over time, but Albert did suffer emotional trauma and may have suffered some kind of lasting psychological harm.

NS: white rat
UCS: loud noise
UCR: crying in response to the loud noise
CS: white rat
CR: crying in response to the white rat

How was stimulus generalisation demonstrated?
Little Albert also produced a conditioned response when presented with similar objects such as a white rabbit, a dog, a sealskin coat, cotton-wool balls and a Santa Claus mask.

Ethical considerations breached during the Little Albert experiment

Voluntary participation was not adhered to, as Little Albert’s parents did not volunteer him to be a part of the study.
Informed consent was not adhered to, as his mother was not informed about the study and did not give consent for Little Albert to be a part of the study.
Debriefing was not adhered to, as his mother and Little Albert did not go through debriefing to extinguish his conditioned fear response because they moved away.
Confidentiality was not adhered to, as the researchers gave Little Albert’s name and also presented video and photos of the experiment.
Deception was not adhered to, as the classical conditioning was not presented as a part of the experimental design.
Withdrawal rights were not adhered to, as Little Albert wanted to escape the experiment, as seen through him trying to crawl away, but he was not allowed to.
The no-harm principle was not adhered to, as Little Albert was put under great psychological distress and his fear response was never extinguished, and the lasting effects may have caused ongoing psychological harm.
Beneficence was not adhered to: although the experiment did contribute to our modern understanding of classical conditioning, the benefits of the research did not outweigh the harm caused to Little Albert.

Example of a classical conditioning prac

Aim: To investigate the effect of classical conditioning through the repeated association of two stimuli (the Wizz Fizz and a bell)
Hypothesis: It is hypothesised that students will salivate to the ring of the bell alone after the process of conditioning has taken place
Discussion/what happened: All three phases of classical conditioning were present in the experiment. During the first phase, ‘before conditioning’, the neutral stimulus (NS) (bell) produced no response in the participants, and the unconditioned stimulus (UCS) (Wizz Fizz) caused the participants to salivate, which is an unconditioned response (UCR).
The second phase, ‘during conditioning’, then commenced: the NS was paired and repeatedly associated with the UCS, which caused a UCR of salivation in the participants. The final phase, ‘after conditioning’, then took place, in which the NS became a conditioned stimulus (CS) and the UCR became a conditioned response (CR). Now participants salivated (CR) to the sound of the bell alone (CS). However, shortly after, the process of extinction took place and students no longer salivated to the sound of the bell.

Operant conditioning

Operant conditioning, also known as instrumental learning, is a type of learning in which the likelihood of a behaviour being repeated is determined by the consequence of that behaviour. A response may be followed by the consequences of reinforcement (such as food) or punishment (such as being reprimanded) or nothing (no consequence at all). These consequences determine whether the same response is likely to be repeated when an organism is presented with the same stimulus.
An operant is any response (or set of responses) that acts (‘operates’) on the environment to produce some kind of consequence.
The key difference between classical conditioning and operant conditioning is that in operant conditioning the individual ‘operates’ on the environment to solve a problem, whereas in classical conditioning the individual is passive (it just happens automatically).
Classical conditioning doesn’t explain voluntary behaviour
  o Often we need to make adjustments to our behaviour depending upon outcomes/consequences
Operant conditioning is a form of learning in which responses come to be controlled by their consequences (learning from mistakes)
Operant responses tend to be voluntary (by choice)

Operant conditioning as a three-phase model

3 phases of operant conditioning:
1. Antecedent (A), a stimulus that occurs before the behaviour
   a. Environmental stimulus that causes the behaviour to occur
2. Behaviour (B), occurs due to the antecedent
3. Consequence (C), occurs as a result of the behaviour

The antecedent is the stimulus (object or event) that precedes a specific behaviour, signals the probable consequence for the behaviour and therefore influences the occurrence of the behaviour.
For example, sleeping in on a school day or your mobile phone ring tone when you are expecting a call from a friend
The behaviour is the voluntary action that occurs in the presence of the antecedent stimulus.
For example, arriving to school late or tapping ‘accept’ on your mobile phone screen
The consequence is the environmental event that occurs immediately after the behaviour and has an effect on the occurrence of the behaviour again.
For example, getting a detention or being able to have a conversation with your friend

NOTE: When writing the answer, work in the order C, B, A, as it is easier to find the C and B first. Also think of it as A leading to B, which leads to C.
The antecedent can be a memory from a previous behaviour
Consequences can act as a stimulus for next time
A consequence can be an A for next time

Reinforcement: the behaviour is more likely to occur next time

Reinforcement is said to occur when a stimulus strengthens or increases the frequency or likelihood of a response that it follows (the response is strengthened because it leads to rewarding consequences).
A reinforcer is any stimulus that strengthens or increases the frequency or likelihood of a response that it follows.

Positive reinforcer/reinforcement: something given (added)

A positive reinforcer is a stimulus that strengthens or increases the frequency or likelihood of a desired response by providing a satisfying consequence.
Positive reinforcement involves giving or applying a positive reinforcer after the desired response has been made, which strengthens the likelihood of the behaviour occurring again.
For example, getting $100 because you got an A+ on your exam, so next time you have an exam, you will study hard to get $100 again

Negative reinforcer/reinforcement: something removed or avoided (subtracted)

A negative reinforcer is any unpleasant or aversive stimulus that, when removed or avoided, strengthens or increases the frequency or likelihood of a desired response.
Negative reinforcement involves the removal or avoidance of an unpleasant stimulus, which strengthens the response.
For example, taking Panadol to remove a headache. Because it took the headache away last time, you are more likely to take Panadol next time you get a headache – so it strengthens your behaviour
Escape – the response stops an aversive event that has already begun, causing a negative event to stop. E.g. a rat pressing the lever in Skinner’s box to stop the electric shock
Avoidance – the response prevents a negative event from happening at all, stopping the aversive event altogether. E.g. taking an umbrella to avoid getting wet

Punishment

Punishment is the delivery of an unpleasant consequence following a response, or the removal of a pleasant consequence following a response.

Positive punishment

Positive punishment involves giving something unpleasant or painful that will weaken/decrease the likelihood of a response occurring again.
For example, giving an electric shock, giving a verbal reprimand, parents giving a child a smack on their behind, being given extra chores at home for doing something wrong.

Negative punishment/response cost

Negative punishment/response cost involves the removal of something good, which will weaken/decrease the likelihood of the response occurring again.
For example, taking away privileges, losing a licence for speeding, being grounded for not coming home on time.
Note: Giving a school detention can sometimes be seen as negative punishment (response cost), as you are having your free time taken away from you by getting a detention, or as positive punishment, as you are getting a fluoro sticker in your diary.

Factors influencing the effectiveness of reinforcement and punishment

Order of presentation
Timing
Appropriateness

Key concept: Any type of reinforcement is intended to increase the likelihood of a behaviour being repeated, and any type of punishment is intended to decrease the likelihood of a behaviour being repeated.

Skinner’s experiment with rats

B. F. Skinner believed that any behaviour that is followed by a consequence will change in strength and frequency depending on the nature of the consequence.
For example, consider how parents discipline children: one behaviour (a tantrum) will be followed by a certain consequence (time out in a room). A different behaviour (eating dinner) will be followed by a different consequence (praise).
Skinner used a special box to condition rats to behave in certain ways. He would reward certain behaviours with food pellets and punish other behaviours with a mild electric shock. He had a cumulative recording device that would show how various behaviours were encouraged and discouraged. The cumulative recorder shows the responses (lever pressing) made by the rat.

Key processes in operant conditioning

Acquisition
Acquisition is the overall learning process by which a specific response or pattern of responses is established through reinforcement. The speed at which it is established depends on whether reinforcement is partial (only sometimes) or continuous.

Extinction
Extinction is the gradual weakening and disappearance of a response because the response is no longer followed by a reinforcer. Extinction doesn’t happen straight away, and is considered to have occurred when the response is no longer present.
For example, in Skinner’s box, when Skinner stopped reinforcing his rats with food pellets, the rats eventually stopped pressing the lever.

Spontaneous recovery
After the apparent extinction of a conditioned response, spontaneous recovery can occur and the organism will once again show the response in the absence of any reinforcement. The conditioned response shown is often weaker. If reinforcement is not continued, the response will usually extinguish even more quickly than it did the first time.

Stimulus generalisation
Stimulus generalisation is the tendency to respond to stimuli that are similar to those which preceded reinforcement, that is, responding to stimuli other than the original stimulus.
For example, in Skinner’s box, a pigeon trained to peck at a switch that was lit by a green light would generalise from the original stimulus to lights of varying colours.

Stimulus discrimination
Stimulus discrimination occurs when an organism makes the correct response to a stimulus and is reinforced, but does not respond to any other stimulus, even when the stimuli are similar.
For example, in Skinner’s box, a pigeon could be taught to discriminate between a red and a green light.

Application of operant conditioning: shaping

Shaping involves reinforcing successive responses that closely resemble or progress towards the ultimate desired response.
In simpler terms, it means breaking down complex behaviour into smaller parts, so that the organism has to perform each step before moving on to learn the next step.
Reinforcement is given only when the organism successfully accomplishes each step towards the target behaviour.
Often used in animal training.
Also referred to as the method of successive approximations.
For example, in Skinner’s box, a rat may not press the lever at all, so it is first reinforced when it moves towards the lever, then only when it touches the lever, then only when it presses the lever.

Comparison of classical and operant conditioning

Classical conditioning (based on repeated association)
Role of learner (passive or active): The learner is passive, i.e. the response is elicited by the UCS.
Nature of response/behaviour: Involuntary, reflex. It is not a conscious or deliberate response. Involves the A.N.S.
Timing of the stimulus (before or after): The stimulus occurs before the response/behaviour. The NS/UCS pairing comes before the natural reflex response.
Timing of response (before or after): The response/behaviour happens after the stimulus is presented.

Operant conditioning (relies on consequences and rewards)
Role of learner (passive or active): The learner is active, i.e. the response is emitted.
Nature of response/behaviour: Conscious, voluntary. It is intentional and often goal directed. Involves the C.N.S.
Timing of the stimulus (before or after): The stimulus occurs after the response. The response is followed by a reinforcing stimulus, more commonly known as a consequence. The consequence acts as a stimulus for ‘next time’.
Timing of response (before or after): The response/behaviour happens before reinforcement occurs.

Example of operant conditioning prac

Aim: To trial operant conditioning through the use of negative and positive consequences in response to specific behaviour.
Antecedent: Being told to look for an object (chocolate) in the classroom
Behaviour: Looking for the object (chocolate) by walking around the room
Consequence: The consequence was either a positive response (‘hot’) or a negative response (‘cold’) from the rest of the class. If the participant walked closer to the chocolate, the class would say ‘hot’. If the participant walked away from the chocolate, the class would say ‘cold’.
Discussion: Operant conditioning was present, as the participant would repeat an action if it had a positive response from the class and would not repeat an action that had a negative response from the class. The positive response from the class is positive reinforcement and the negative response is positive punishment.

Observational learning/modelling

Observational learning occurs when someone uses observation of a model’s actions and the consequences of those actions to guide their future actions.
A model is who or what is being observed (for example, a person), and may be live or symbolic.
Vicarious conditioning (indirect learning) is when an individual observes a model displaying behaviour that is either reinforced or punished and later behaves in the same way, in a modified way, or refrains from doing so as a result of the observation.
Vicarious reinforcement increases the likelihood of the observer behaving in a similar way to a model whose behaviour is reinforced.
For example, a student who sees another student being allowed to leave a class early after correctly finishing all their work may be more inclined in another class to model the behaviour and respond in a similar way if they consider leaving class early a desirable outcome (a reinforcer).
Vicarious punishment decreases the likelihood of an observer performing a particular behaviour after having seen a model’s behaviour being punished.
For example, a student may observe someone else in class receiving detention for calling out without permission. The observer is likely to refrain from that behaviour in the future if they view a detention as an undesirable outcome (a punisher).

Observational learning processes

1. Attention – The learner must pay close attention in order to observe the modelled behaviour
   a. The learner is more likely to imitate a model who is liked and has a high status, has perceived similarities with the observer, and is familiar and known, and a behaviour that is visible and stands out
2. Retention – The learner must code and store the observed information in memory
   a. This is achieved by a mental representation (visual image) of the behaviour observed
3. Reproduction – Depending on their physical capabilities, the learner retrieves the stored information and reproduces the observed behaviour
4. Motivation – The learner must want to reproduce/imitate the learnt behaviour – WHY ARE THEY DOING THIS?
   a. Depends on whether the learner believes that there will be a desirable consequence (reinforcement) for reproducing the learnt behaviour
5. Reinforcement – Reinforcement influences the learner’s motivation to perform the observed/learned behaviour – WHAT DO THEY GET?
   a. A person is likely to reproduce an observed response if they have a good incentive to do so
      i. Being rewarded, or if the response is likely to ‘pay off’ in the particular situation

Example of the observational learning process

1. Attention – Celebrities or sporting stars are often asked to model clothing or use new products in TV commercials, as this catches the attention of more people
2. Retention – Try to picture in your mind the steps involved in turning on your oven (the layout of your kitchen is remembered and stored in your mind)
3. Reproduction – A 3-year-old will not have the skills to kick a football like Lionel Messi
4. Motivation – A 10-year-old being motivated to play with their dog because their sister looks like she’s having fun doing so
5. Reinforcement – A feeling of excitement, pride or fun after playing with their dog, so they want to do it again

Bandura’s experiment using a Bobo doll

Bandura allocated children into 3 groups, with each group watching one of three movies. The movies all displayed an adult model punching, hitting, kicking and verbally abusing a large inflated Bobo doll. The children were split into 3 conditions.
In the first condition, children observed the aggressive model being rewarded with lollies, soft drink and praise from another adult.
In the second condition, children observed the aggressive model being punished with a spanking and verbal criticisms such as ‘Hey there, you big bully! Quit picking on that clown.’
In the third condition, there were no consequences for the aggressor’s behaviour.
Children who watched the aggressive model either being reinforced or experiencing no consequences for their aggressive behaviour imitated aggressive behaviour more than the children who watched the aggressive model being punished. When children were offered a reward (positive reinforcement) for imitating the model’s aggressive behaviour, even children who had seen the model punished tended to imitate the model’s behaviour by behaving more aggressively.
The results indicate that observational learning can sometimes occur by simply viewing a model, even if the model is neither reinforced nor punished. Clearly the children had learned something from observing the model. This highlights an important distinction between learning and performance (the actual production of a learned response). If someone observes a model’s behaviour and does not perform the actions they had observed, it does not mean that the behaviour was not learned.
The results of Bandura’s experiment indicate that probably all the children learned the model’s behaviour, regardless of whether they observed the model being reinforced or punished or experiencing no consequence for aggressive behaviour. Some children simply did not perform what they had learned until they were offered an incentive (reward) to do so.
Bandura’s studies of observational learning led him to develop social learning theory. Social learning theory emphasises the importance of the environment, or ‘social context’, in which learning occurs. Bandura proposed that from the time we are born we are surrounded by other people displaying a huge variety of behaviours, all of which we can observe.
In the 1960s, Albert Bandura conducted a series of experiments to investigate different aspects of observational learning by young children. Bandura was particularly interested in how observational learning is not totally separate from conditioning.
Just because a person does not display the behaviour they have observed does not mean they have not learned it! For example, the children who saw the model punished did not copy the aggressive behaviour until they were offered a reward. This showed that they had paid attention to the behaviour and had the ability to reproduce it, but had not done so.

Example of observational learning prac

Aim: To be able to recreate a balloon animal through the process of observational learning
Hypothesis: It is hypothesised that Waverley Christian College VCE psychology students will be able to replicate the process of creating a balloon animal through the process of observational learning.
Attention: Observing the model creating the balloon animal
Retention: The learner mentally represents and retains how the balloon has been transformed
Reproduction: The learner converts the mental representation into an action to imitate the actions of the model, if physically capable of doing so.
Motivation: The learner wants to recreate the balloon animal
Reinforcement: The learner has a sense of fulfilment as they were able to recreate the balloon animal.

Comparing and contrasting classical conditioning and operant conditioning

Classical conditioning
--The association of two stimuli, the NS and UCS, provides the basis of learning
--Extinction takes place over a period when the UCS is withdrawn or is no longer present and the CS is repeatedly presented alone
--Accounts for the acquisition of the response
--The behaviour of the organism does not have any environmental consequences
--The response is involuntary
--The learner is a passive participant and does not control the learning process
--Response relies on the UCS being presented first
--Association is between two stimuli
--The timing of the two stimuli must be close (ideally about half a second) and the sequencing is vital
--The response is often one involving the action of the A.N.S. and the association of the two stimuli is often not conscious or deliberate

Similarities
--There is an acquisition process whereby a response is conditioned or learned
--Both types of conditioning are achieved as a result of the repeated association of events that follow each other closely in time
--Extinction of the learned response can occur
--Spontaneous recovery can occur
--Stimulus generalisation can occur
--Stimulus discrimination can occur

Operant conditioning
--Behaviour is associated with consequences that follow it
--Extinction also occurs over time, but after reinforcement is no longer given
--Accounts for the perpetuation (maintenance) of the response
--The consequence of a response is a vital component of the learning process
--Involves voluntary responses that are initiated by the organism, as well as involuntary responses
--The learner is an active participant and does control the learning process
--The presentation of the reinforcer or punisher depends on the response occurring first
--Association is between the stimulus and the response
--While learning generally occurs faster when the reinforcement or punishment occurs soon after the response (behaviour), there can be a considerable time difference between them (especially in humans)
--The response may involve the A.N.S. but often involves higher order brain processes because the response is conscious, intentional and often goal-directed

Comparison of 3 types of learning (outline to use when explaining ‘why’ questions)

Classical conditioning
Type of behaviour (voluntary/involuntary) + role of the learner (passive/active): Involuntary, passive
Number of stages: 3
Type of learning: Association

Operant conditioning
Type of behaviour (voluntary/involuntary) + role of the learner (passive/active): Voluntary, active
Number of stages: 3
Type of learning: Consequences

Observational learning
Type of behaviour (voluntary/involuntary) + role of the learner (passive/active): Voluntary
Number of stages: 5
Type of learning: Consequences