NATIONAL QUALIFICATIONS CURRICULUM SUPPORT

Human Biology

Case Study into Rewarded Behaviour, Unrewarded Behaviour and Shaping Learning

Student’s Guide

[HIGHER]

The Scottish Qualifications Authority regularly reviews the arrangements for National Qualifications. Users of all NQ support materials, whether published by Learning and Teaching Scotland or others, are reminded that it is their responsibility to check that the support materials correspond to the requirements of the current arrangements.

Acknowledgement

Learning and Teaching Scotland gratefully acknowledges this contribution to the National Qualifications support programme for Human Biology.

The publisher gratefully acknowledges permission to use the following sources:
- image of Anton Van Dalen from http://antonvandalen.com/images/ANTON_PAINTINGS/ANTON_Painting_B.F.SkinnerWithProjectPigeon.jpg © Artist: Anton van Dalen, Title: B.F. Skinner with Project Pigeon. Medium: Oil on canvas. Size: 48"x64". Date: 1985;
- image of BF Skinner from http://sgspsychology2.webs.com/Crime/bf-skinner.jpg © BF Skinner Foundation;
- picture of a rat in a basic Skinner box from http://animals.howstuffworks.com/pets/dog-training1.htm, Courtesy of HowStuffWorks.com;
- picture of will press lever from http://content.perspicuity.com/?q=node/229 © Craig Swanson www.perspicuity.com;
- Microsoft clipart © Microsoft Clipart, 2011;
- text ‘Animal Intelligence’ by EL Thorndike from http://www.psych.appstate.edu/~kms/classes/psy5150/Documents/Thorndike1898.pdf; © Thorndike, E. L. (1998). Animal intelligence: An experimental study of the associative processes in animals. American Psychologist, 53(10), 1125-1127. doi:10.1037/0003-066X.53.10.1125. Animal Intelligence: An Experimental Study of the Associative Processes in Animals, by E. L. Thorndike, 1898, Psychological Review Monograph Supplement, 2(4), 1-8.
This article is in the public domain.

Every effort has been made to trace all the copyright holders but if any have been inadvertently overlooked, the publishers will be pleased to make the necessary arrangements at the first opportunity.

© Learning and Teaching Scotland 2011

This resource may be reproduced in whole or in part for educational purposes by educational establishments in Scotland provided that no profit accrues at any stage.

NEUROBIOLOGY AND BEHAVIOUR (H, HUMAN BIOLOGY) © Learning and Teaching Scotland 2011

Contents

Overview
Activities
Activity 1: A practical with a human subject
Activity 2: An introduction to rewarded behaviour
Activity 3A: Identifying examples of positive reinforcement of behaviour
Activity 3B (Extension): Beyond rewarded behaviour: other categories of operant conditioning
Activity 4: Interpreting and evaluating the first reported experimental evidence of operant conditioning
Activity 5A: An investigation into rewarded and unrewarded behaviour: analysing results and drawing conclusions
Activity 5B (Extension): Presenting a scientific research paper on operant conditioning: what is learned?
Activity 6: Research into the shaping of behaviour and its applications
Appendix: Learning activities
Glossary and reference

Overview

In this case study you will find out about an important form of learning called operant conditioning. You will evaluate original research, analyse and draw conclusions from experimental results, research and present information on some interesting applications of this form of learning, and complete a practical activity. There are also two extension activities, which will allow you to explore operant conditioning further. The sheets referred to in the instructions of each activity can be found in the Appendix.

Project Orcon: During World War II the American psychologist B.F.
Skinner attempted to train pigeons to guide missiles to targets by rewarding them for performing certain behaviour. The pigeons were to actually sit within the flying missiles!

Activities

Activity 1: A practical with a human subject

Before you learn about this topic in detail you are going to complete a practical activity. This activity will introduce some of the important concepts of this topic, but these may not become apparent until the experiment has been completed and you have been debriefed.

In this activity you will work in groups of three. One person will be the experimenter, one person will be the monitor and the third person will be the experimental subject. It is important that everyone in the group reads their instructions carefully and carries them out exactly as described. If you already have some knowledge about this topic you will need to keep this to yourself for the time being and take the role of the experimenter or the monitor rather than being the subject of the experiment.

Once you have decided on which role you will take you need to read the instruction card for your role. Don’t read the cards for the other roles. Then begin the experiment (see the instruction card sheets at the end of this section). You should work somewhere quiet where other groups will not distract you.

After you have completed the experiment you will have a chance to discuss your observations with your group members. Your teacher will then debrief you and explain the theory underlying the experiment.

Read the following after you have completed the experiment and been debriefed.

To complete this activity use the class data spreadsheet (a separate Excel file) to plot a line graph of the average number of responses per 30 seconds against time interval (there is a tutorial below on how to do this). Print out a copy of the graph for your notes.
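If your class prefers a short script to the spreadsheet, the averaging step can be sketched in Python. This is only an illustrative sketch and not part of the official activity: the function name `average_per_interval`, the variable `group_counts` and the numbers in it are all hypothetical placeholders, and it assumes every group recorded the same number of 30-second intervals.

```python
from statistics import mean


def average_per_interval(group_counts):
    """Return the class mean response count for each 30-second interval.

    group_counts: one list per group, each holding that group's number of
    responses in interval 1, 2, 3, ... (all lists must be the same length).
    """
    lengths = {len(counts) for counts in group_counts}
    if len(lengths) != 1:
        raise ValueError("every group must report the same number of intervals")
    # zip(*...) regroups the data interval by interval instead of group by group.
    return [mean(interval) for interval in zip(*group_counts)]


# Placeholder numbers for illustration only (NOT real class data).
group_counts = [
    [0, 1, 3, 5],
    [1, 1, 2, 6],
    [0, 2, 4, 4],
]

for number, value in enumerate(average_per_interval(group_counts), start=1):
    start = (number - 1) * 30
    print(f"Interval {number} ({start}-{start + 30} s): {value:.2f} responses")
```

The list of averages can then be plotted against the interval number in whichever graphing tool you normally use.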
Write a paragraph describing the experiment that you carried out, the pattern shown by the class results and what this experiment tells you about rewarded behaviour. Are the conclusions from the class data valid for your own group’s data? What would be a more accurate title for this activity?

How to calculate an average value using Excel
1. Highlight the cells containing the data that you want to calculate the average value of.
2. Click on the AutoSum button in the horizontal toolbar and select Average.

How to plot a line graph using Excel
1. Go to Insert, Chart.
2. Select X Y (Scatter) and click on the first dots and lines option.
3. Right click on the chart and click on the Select Data option.
4. Select the X values box and highlight the range of data in your table that you wish to use.
5. Select the Y values box and highlight the range of data in your table that you wish to use.
6. Click OK.

Activity 1: A practical with a human subject

Instructions for the experimenter – do not show to the subject

You will need a pencil and paper and a big bag of chocolate buttons. You will sit at a table opposite the subject. Read out the following instructions to the subject:
1. Tell the subject: ‘You are taking part in a test of intelligence.’
2. Tell the subject: ‘You must work to receive points from me.’
3. Tell the subject: ‘You will receive a point every time I tap the table with my pencil.’
4. Tell the subject: ‘When you receive a point you should tally it on your paper.’
5. Tell the subject: ‘Your goal is to get as many points as possible.’
6. Tell the subject: ‘You will receive a chocolate button from me every time you have earned five points.’
7. Tell the subject: ‘I can’t answer any questions or provide you with any more information.’
8. Tell the subject: ‘The experiment will begin as soon as I tap the table with the pencil for the first time.’
9. Tell the subject: ‘I won’t say anything else to you.’

During the experiment it is important that you do not say anything to the subject or smile or nod persistently at them. You should maintain a blank face and try not to make any repeated gestures or movements other than tapping your pencil on the table. Try to stay quiet apart from when you tap the pencil. Keep the pencil in your hand at all times.

When you are ready to start the experiment spend the first minute or so observing the subject’s behaviour. Then tap the table with your pencil every time the subject taps their chin with their fingers. If the subject does not touch their chin at all, you will need to tap your pencil when they make a different response. This could be crossing their legs, touching the table, touching another part of their face or hair, or any other action that you observe them to make. Try not to make it too obvious that this behaviour is what you are looking for. You will need to keep count of your taps so that you know when you should pass a chocolate button to the subject.

Make your pencil tap as soon as the chin-tapping action has been completed and the subject’s finger has been withdrawn.
If the subject tries to record a point before you have given one to them, remind them that they will only get a point when you tap the table with your pencil.

The rate at which the subject makes chin taps should increase. If the rate increases for a time then begins to decrease, tell the subject: ‘Keep earning points.’

Once the subject is making frequent chin taps, continue pencil tapping as normal for another 5 minutes, then for the last 5 minutes of the experiment stop tapping the table with your pencil, but keep the pencil in your hand. You should continue the experiment for a maximum time of 30 minutes.

After the experiment

At the end of the experiment you should come together with the subject and the monitor. You will ask the monitor and the subject some questions about what was happening. Don’t confirm anything until you have heard from both of them, and ask the monitor first.

1. Questions to ask the monitor at the end of the experiment:
- How did the number of points the subject received per 30-second interval change during the course of the experiment?
- What observations did you make about the subject’s general behaviour and emotions during the experiment?
- Did the subject’s general behaviour and emotions change during the course of the experiment?

2. Questions to ask the subject at the end of the experiment:
- What do you think was happening during the experiment?
- How did the number of points you were receiving change during the course of the experiment?
- Why do you think you received points? When did you realise this?
- Why do you think there were times when you received points very quickly and other times when you didn’t receive points very quickly?
- Do you think that any aspect of your behaviour affected whether you received points or not? When did you realise this?
- Why do you think that you received points more frequently later in the experiment?
(Ask this question if appropriate.)
- What do you think happened in the last 5 minutes of the experiment?

Activity 1: A practical with a human subject

Instructions for the monitor – do not show to the subject

You will need a stop-clock, a pen and paper. You should sit so that you can see the subject, but not so close that you are a distraction.

The subject of the experiment will receive one point every time the experimenter taps the table with their pencil. They must update their score in writing every time they receive a point. The subject will receive a chocolate button from the experimenter every time they earn five points.

Your task is to record the number of times the subject taps their chin with their fingers in each 30 seconds of the experiment. You should start the stop-clock when the experimenter indicates that they are starting the experiment. In the last 5 minutes of the experiment the subject will not receive any points (they don’t know this).

If the subject is not touching their chin when they receive points, you will need to watch closely to see exactly what action they are doing to get points and record the frequency of this instead.

You should set out your results like this:

Time interval number | Time interval (seconds) | Number of responses made by the subject
1 | 0–30 |
2 | 30–60 |
3 | 60–90 |
etc | etc |

You should also record any spontaneous comments that the subject makes and any observations you make about the way they are behaving and their emotions.

The experiment should take no longer than 30 minutes. At the end of the experiment you will be asked some questions by the experimenter; you should not mention the subject’s chin tapping at this point. After the subject has answered their questions you will present your data to your group.

After the discussion you should add your group’s data to the class results spreadsheet on a computer.
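The tally described on the monitor’s card can also be checked afterwards with a few lines of Python. This is a minimal sketch under an assumption that goes beyond the card: that the monitor noted the stop-clock time (in seconds) of each response rather than only a running tally. The function name `count_per_interval` and the example times are hypothetical.

```python
def count_per_interval(event_times, total_seconds, interval=30):
    """Count how many responses fall in each interval of the session.

    event_times: times in seconds since the start of the experiment at which
    the subject made the response. A response at exactly 30 s is counted in
    the second interval (30-60 s).
    """
    n_intervals = -(-total_seconds // interval)  # ceiling division
    counts = [0] * n_intervals
    for t in event_times:
        if 0 <= t < total_seconds:
            counts[int(t // interval)] += 1
    return counts


# Made-up response times for illustration only (NOT real data).
times = [5, 12, 31, 59, 60, 95]
for number, count in enumerate(count_per_interval(times, 120), start=1):
    print(f"Interval {number}: {count} responses")
```

Each value in the returned list corresponds to one row of the results table, in order.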
Activity 1: A practical with a human subject

Instructions for the subject

Listen carefully to everything the experimenter tells you and follow their instructions.

Activity 2: An introduction to rewarded behaviour

The set of responses an animal makes to environmental stimuli allows it to control the events that occur in its environment. These responses are collectively known as behaviour. Animals (including humans) that make responses that produce beneficial outcomes maximise the chances of the survival and reproduction of their genes. The outcomes of their behaviour can act as rewards that reinforce the set of responses that made them available.

When an animal is rewarded for responding to a particular stimulus, for example pressing a lever and obtaining food, the behaviour is reinforced: the food acts as a positive reinforcer. The animal’s nervous system enables it to learn that there is an association between the responses it made and the stimulus that led to the reward. The frequency of the responses increases because they result in the animal being rewarded: its behaviour has been subjected to positive reinforcement.

This type of learning is known as operant or instrumental conditioning because the animal’s behaviour operates on its environment and can be instrumental to it receiving a reward. The diagram below summarises the three key variables involved in the process:

In Activity 1 the subject of the experiment received a reward of points (and food) when they made a specific response in the presence of the experimenter. As a result, the frequency of that response increased over the course of the experiment. When they no longer received a reward as a result of their behaviour, the frequency of the response decreased and was extinguished.
What was the stimulus in this case?

Operant conditioning in action

Watch a video clip (http://www.youtube.com/watch?v=PQtDTdDr8vs) of a rat demonstrating instrumentally conditioned behaviour. The container the rat is in is called an operant conditioning chamber, or Skinner box. It takes the latter name from B. F. Skinner (1904–1990), an eminent American psychologist who pioneered its use in behavioural research.

Answer the following questions about the clip and then draw a diagram like the one above to illustrate the process of operant conditioning in this example:
- What is the behaviour that is being rewarded?
- What is the reward likely to be?
- What is the stimulus that the rat associates with the rewarded behaviour?
- Could there be more than one stimulus that the rat associates with the rewarded behaviour?
- Why is it adaptive for the rat to behave in this way?
- What advantages are there to studying behaviour using an operant conditioning chamber?

Find out more about B. F. Skinner here: http://www.bfskinner.org/BFSkinner/AboutSkinner.html

Image: Burrhus Frederic Skinner (1904–1990) with a Skinner box.

Activity 3A: Identifying examples of positive reinforcement of behaviour

To practise identifying cases of positive reinforcement visit the website http://psych.athabascau.ca/html/prtut/. Read through the introduction, definition and examples of positive reinforcement then work through the practice exercises. For each question write down your answer and explanation then check whether it is correct and amend your answer as necessary.

Activity 3B (Extension): Beyond rewarded behaviour: other categories of operant conditioning

So far you have looked at one category of operant conditioning: rewarded behaviour. Three other categories exist.
Essentially there are two variables to consider: the value of the reinforcer to the animal (always determined by the animal’s preferences) and the relationship between a particular response and the presentation of the reinforcer (this can be determined by the experimenter).

In rewarded behaviour the reinforcer has an attractive value to the animal (the animal would ‘like’ to receive more of it). There is also a positive relationship between the response and the reinforcer. This means that carrying out the response results in the presentation of the reinforcer. This category of operant conditioning is also known as positive reinforcement and increases the rate of response.

To learn about the other categories of operant conditioning look at the illustrations on the sheet ‘The categories of operant conditioning’. Each shows an example of one combination of reinforcer value and response–reinforcer relationship for a typical human situation. For each category, identify the precise nature of the response, the value of the reinforcer, the relationship between the response and whether the reinforcer is presented, and the effect on the rate of responding.

To summarise the four categories of operant conditioning and their outcomes complete the table below (the columns give the relationship between response and reinforcer):

Value of reinforcer | Positive (the reinforcer is presented when the response is made) | Negative (the reinforcer is omitted when the response is made)
Attractive (its presence benefits the animal) | Positive reinforcement (reward). The rate of responding increases. |
Aversive (its presence harms the animal) | |

Now make up two human examples for each category using a realistic reinforcer and set of responses. Present your examples as cartoon strips.
Activity 4: Interpreting and evaluating the first reported experimental evidence of operant conditioning

This extract is taken from one of the first papers written by the American psychologist Edward L. Thorndike (1874–1949). In this paper he describes his initial observations of rewarding animals for specific behaviours.

Part 1

Read the extract then discuss and complete the following tasks:
- Draw the experimental set-up as Thorndike describes it.
- What were the independent and dependent variables in Thorndike’s investigation?
- Why do you think Thorndike did not feed the animal if it failed to escape?
- What were the stimulus, rewarded behaviour and reward in the experiment?
- Sketch a graph to show the results that Thorndike describes.
- Which sentence(s) indicates that Thorndike has taken measures to increase the reliability of his results?
- What did Thorndike do to ensure his comparison was done fairly?
- Can you think of any variables that have not been accounted for?
- If you were going to replicate Thorndike’s experiments, what additional information would you need to know?
- How could the experiment be altered to test whether the food was actually acting as a positive reinforcer of the animals’ behaviour?

Part 2

This video clip (http://www.youtube.com/watch?v=Vk6H7Ukp6To) shows a dramatisation of Thorndike’s experiments. How closely do the events shown in the clip match your answers to Part 1 of this activity? Amend your notes as necessary.

What was Thorndike’s law of effect? Find out more about Thorndike’s experiments at this website: http://psychclassics.yorku.ca/Thorndike/Animal/.
Activity 5A: An investigation into rewarded and unrewarded behaviour: analysing results and drawing conclusions

Graphs 1 and 2 show the results of two parts of an experiment carried out to study the effect of reinforcement on the behaviour of rats. Read the descriptions of both experiments and the results, then complete the tasks. You may find it helpful to discuss the tasks with your classmates before attempting to write down answers.

Part 1 of the experiment

Two rats were placed in identical operant conditioning chambers (similar to the one you saw in Activity 2) for 10 periods each of 10 minutes’ duration. The rate at which each rat pressed a lever in the chamber was measured. Rat 1’s lever-pressing behaviour was reinforced by rewarding it with a sucrose pellet every time it pressed the lever (a continuous reinforcement (CRF) schedule). Rat 2 did not receive anything for pressing the lever.

Graph 1

Part 2 of the experiment

Two rats were used, Rat 1 and Rat 3. Rat 1 was the same individual as used in the first part of the experiment. Rat 3’s lever-pressing behaviour was conditioned in a similar way to Rat 1, but the schedule of reinforcement was altered so that it only received a reward every third time that it pressed the lever (a partial reinforcement schedule).

Rat 1 and Rat 3 were put into individual operant conditioning chambers for 10-minute periods for six consecutive days. When the rats pressed the lever they did not receive a sucrose pellet; their lever-pressing behaviour was unrewarded (the term for this condition is extinction).

Graph 2

Tasks
1. Describe and compare the trend in the results for both rats on Graph 1. Make sure your descriptions are precise: you should make reference to the patterns of change and use numbers to support your description.
2. Explain the difference in response rates between the two rats in Graph 1 in terms of what you have learned about operant conditioning.
3. Suggest a reason for the pattern of results obtained from Rat 1 between trial number 6 and trial number 10 in Graph 1.
4. What other information would you need to know in order to decide whether your conclusions for Graph 1 are valid?
5. How could you adapt the design of the experiment to condition lever-pressing behaviour in response to the stimulus of an illuminated light bulb?
6. Suggest why the rats were placed in separate operant conditioning chambers in both parts of the experiment.
7. Use Graph 2 to describe the effect of omitting the reward on conditioned behaviour.
8. What was the difference between Rat 1 and Rat 3?
9. What do the results in Graph 2 suggest about the effect of the reinforcement schedule on unreinforced behaviour?
10. Predict the trend in the results for the second part of the experiment if it was continued for a further three days.
11. Explain why withholding the reward after prior conditioning is called ‘extinction’.
12. What improvements could be made to the design of the experiments?

Activity 5B (Extension): Presenting a scientific research paper on operant conditioning: what is learned?

An introduction to how scientists communicate their findings

The primary way in which scientists communicate their discoveries to the research community is by publishing research papers in science journals. Research papers contain a complete account of the scientists’ investigations, including enough detail to allow other scientists to replicate them. Before a paper can be published it is shown to a pair of scientists who work on the same topic.
The scientists will review the paper, checking to make sure that the experiments have been carried out fairly, that the results are reliable, that the paper’s conclusions are valid, and that the writing is satisfactory. This process is known as peer review. If the paper is acceptable to the reviewers then the journal will publish it; otherwise it may be sent back to the authors with suggestions for improving it, or simply rejected.

In this activity you will gain experience of reading an original research paper and presenting its findings in the form of a poster. The first thing to do is to familiarise yourself with the standard layout and content of a research paper by reading the sheet entitled ‘The basic structure of a scientific research paper’. Research papers can often be long and complex so it is important to understand how to read them effectively so that you pick up the main experimental findings. Read the sheet ‘How to read a scientific research paper for comprehension’ for an introduction on how to do this.

Reading a paper and presenting the findings

Background

The paper you will be reading is entitled ‘Variations in the sensitivity of instrumental responding to reinforcer devaluation’ (Adams, 1982). It describes a series of experiments carried out on rats. You will look at the first experiment, which is on pages 2–7.

You have seen that rewarded behaviour involves an animal forming associations. However, a key question that researchers have asked is what does the animal actually learn to associate its behaviour with? One answer is that the animal associates its behaviour with the stimulus that triggers it, eg the sight of a lever, so that every time it sees the lever it presses it.
Another answer is that the animal associates its behaviour with the stimulus and the reward it receives for responding to it, eg the food pellet that is delivered, so that every time it sees the lever it presses it because it knows it will receive something of value to it.

The latter association provides the animal with more flexibility. If the reward ceases to be of value to the animal then it can stop carrying out the rewarded behaviour. By contrast, if an animal has not encoded the value of the reinforcer, but has just learnt to associate its behaviour with, for example, the sight of a lever, then it will not stop pressing, even when the reward has been devalued. Its behaviour in response to the stimulus has become a habit, rather than being directed towards achieving a goal (the reward). These two associations are illustrated below.

Conditioned behaviour as a habitual response to a stimulus: ‘Every time I see the lever I press it, but I don’t know why.’

Conditioned behaviour as a goal-directed action: ‘Every time I see the lever I press it because I know I will get food for doing this.’

The standard way to investigate this aspect of operant conditioning is to first condition a particular behaviour in an animal by positively reinforcing it, so that the animal is trained to respond at a high rate to the stimulus that leads to the reward. The next step is to devalue the reinforcer. This means exposing the animal to it (in the absence of the conditioned behaviour) in such a way that it is no longer regarded as a reward. In the final step the animal is again exposed to the stimulus (in extinction) and its rate of responding is recorded.
If the animal responds at a lower rate compared to an animal that has not experienced reinforcer devaluation, this would indicate that the animal’s lever-pressing behaviour has been established as a goal-directed action rather than a stimulus-response habit. Its rate of responding has decreased because it ‘knows’ that the reinforcer has lost its value so there is no point in working to gain it any more.

The experiment described on pages 2–7 of the paper was carried out to investigate how the number of reinforced lever presses that rats are allowed to make affects whether the result is the formation of a stimulus-response habit (the rats simply learn that pressing the lever is a good thing) or a goal-directed action (the rats learn that pressing the lever is a good thing because it results in the delivery of a food pellet).

Instructions for completing the task

You will be given your own copy of this paper, which you should annotate as necessary. Your task is to read and understand the research presented on pages 2–7 of the paper (Experiment 1) and then present your understanding in the form of an A2 poster. The poster should include:
- the key questions the experiment was designed to answer
- a description of the background to the work
- a description of the methods employed by the authors
- a description of the results obtained (this could be done as a simplified graph)
- the main findings of the research
- any ideas you can think of for further experiments
- diagrams and pictures to illustrate your written descriptions.

For assistance in interpreting the research, including discussion questions to get you started, read the document ‘A guide to the paper’.

Activity 6: Research into the shaping of behaviour and its applications

Positive reinforcement can be used to increase the frequency with which an animal behaves in a particular way.
However, there may be cases where the behaviour that we wish to reinforce is never performed (the baseline response rate is zero). Rats do not have innate ‘knowledge’ that pressing a bar will result in the delivery of food; this situation is far removed from their natural environment. Rats may not even realise that they can press the lever when they are initially presented with it. The process by which an animal’s behaviour is gradually modified until it matches a specific, and often complicated, target behaviour is known as shaping.

Your task is to use sources of information on the internet to find out precisely how shaping works and how its principles can be applied to altering the behaviour of humans and other animals. You should access information to help you answer questions 1–3, then pick two more questions to answer.

You will work in a group for this activity. Your group can present its findings to the class in any way you choose, as long as everyone contributes to the presentation. Your presentation should also include a dramatised example of how behaviour can be shaped to produce a final desired behaviour.

1. Explain how the process of shaping behaviour works and how positive reinforcement is involved.
2. Why is shaping necessary in behavioural research?
3. How can the process of shaping be applied to developing desirable patterns of human behaviour?
4. How can a rat’s behaviour be shaped so it will press a lever for food?
5. How can the process of shaping be applied to training dogs?
6. How can shaping be used to train rats to play basketball?
7. How can shaping be used to train a pigeon to guide a missile to its target or go bowling? (Search for ‘Project Pigeon’ or ‘Project Orcon’.)
Appendix: Learning activities

Activity 3B (Extension): Beyond rewarded behaviour: other categories of operant conditioning

Activity 4: Interpreting and evaluating the first reported experimental evidence of operant conditioning

An extract from Animal Intelligence: An Experimental Study of the Associative Processes in Animals (Thorndike, 1898)

‘After considerable preliminary observation of animals’ behaviour under various conditions, I chose for my general method one which, simple as it is, possesses several other marked advantages besides those that accompany experiment of any sort. It was merely to put animals when hungry in enclosures from which they could escape by some simple act, such as pulling at a loop of cord, pressing a lever, or stepping on a platform. (A detailed description of these boxes and pens will be given later.) The animal was put in the enclosure, food was left outside in sight, and his actions observed. Besides recording his general behavior, special notice was taken of how he succeeded in doing the necessary act (in case he did succeed), and a record was kept of the time that he was in the box before performing the successful pull, or clawing, or bite. This was repeated until the animal had formed a perfect association between the sense-impression of the interior of that box and the impulse leading to the successful movement. When the association was thus perfect, the time taken to escape was, of course, practically constant and very short. If, on the other hand, after a certain time the animal did not succeed, he was taken out, but not fed.
If, after a sufficient number of trials, he failed to get out, the case was recorded as one of complete failure. Enough different sorts of methods of escape were tried to make it fairly sure that association in general, not association of a particular sort of impulse, was being studied. Enough animals were taken with each box or pen to make it sure that the results were not due to individual peculiarities. None of the animals used had any previous acquaintance with any of the mechanical contrivances by which the doors were opened. So far as possible the animals were kept in a uniform state of hunger, which was practically utter hunger. That is, no cat or dog was experimented on when the experiment involved any important question of fact or theory, unless I was sure that his motive was of the standard strength. Cats (or rather kittens), dogs and chicks were the subjects of the experiments. All were apparently in excellent health, save an occasional chick.
By this method of experimentation the animals are put in situations, which call into activity their mental functions and permit them to be carefully observed. The animal’s behavior is quite independent of any factors save its own hunger, the mechanism of the box it is in, the food outside, and such general matters as fatigue, indisposition, etc. Therefore the work done by one investigator may be repeated and verified or modified by another. No personal factor is present save in the observation and interpretation. Again our method gives some very important results that are quite uninfluenced by any personal factor in any way. The curves showing the progress of the formation of associations, which are obtained from the records of the times taken by the animal in successive trials, are facts, which may be obtained by any observer who can tell time.
They are absolute, and whatever can be deduced from them is sure. So also the question of whether an animal does or does not form a certain association requires for an answer no higher qualification in the observer than a pair of eyes.’
Activity 5B (Extension): Presenting a scientific research paper on operant conditioning: what is learned?
The basic structure of a research paper
Title: This tells you the topic of the authors’ investigation and the overall outcome. A research paper is usually the work of a team of researchers.
Abstract: In this section the authors briefly summarise their investigation.
Introduction: In this section the authors describe the background to their research. They will say what is already known about the topic they are investigating, and make references to the experiments that have established this.
Materials and methods: In this section the authors describe in detail how they carried out their experiments.
Results: In this section the authors present the results they obtained. These may be displayed in the form of tables of data and/or graphs.
Discussion: In this section the authors draw conclusions from their results and discuss their significance. They describe how their conclusions fit in with the knowledge discussed in the introduction. The authors may also evaluate their investigation and suggest ideas for further experiments.
References: In this section the authors provide details of all the previous work that they used in order to carry out their investigation and write their paper.
Acknowledgements: In this section the authors may thank other people and funding boards that have made contributions to their work.
How to read a scientific research paper for comprehension
General tips
Annotate your copy of the paper as much as possible with any comments, queries or thoughts you have.
Highlight unfamiliar scientific terms and find out what they mean. You may need to consult other sources of information to help you familiarise yourself with what is being discussed.
The authors will often use abbreviations in their writing and symbols on their figures. You may find it useful to make a key that explains what each one represents.
At the paper’s heart will usually be the simple question of ‘What is the effect of X on Y?’ You need to work out what X and Y are, how they were manipulated and measured, and what the effect was.
The abstract
Read this first to find out what the authors’ main questions, experiments and conclusions were.
The introduction
Read as much as you need to in order to understand the background to the experiments.
The methods
Read this to familiarise yourself with the experimental procedures the authors used. You will need to identify the independent and dependent variables, and how they were manipulated and measured. Try drawing a flow chart to show how the experiment was done.
The results
Look at the figures (graphs) first. These often convey most of the results. Make sure you know what any abbreviations mean. Try writing a short summary of each figure to describe what it shows. If you want more details read the authors’ description of the results. If statistics are used, focus on what they imply, not how they were produced.
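One way to picture what a reported ‘statistically significant’ difference implies is a shuffle (permutation) test, sketched below in Python. This is only an illustration under stated assumptions: the response rates are made-up numbers, the function name `permutation_p_value` is hypothetical, and this is not necessarily the statistical method used by the authors of any particular paper. The idea is to ask how often a difference at least as large as the one observed would turn up if the group labels were meaningless, by shuffling the results between the two groups many times.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, shuffles=10000, seed=0):
    """Estimate how often chance alone gives a difference in means
    at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(shuffles):
        # Pretend the group labels mean nothing: deal the results out at random.
        rng.shuffle(pooled)
        new_a = pooled[:len(group_a)]
        new_b = pooled[len(group_a):]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / shuffles


# Made-up lever presses per minute for two groups of six rats.
treated = [2, 3, 1, 4, 2, 3]
untreated = [8, 7, 9, 6, 8, 7]
p = permutation_p_value(treated, untreated)
print("estimated probability:", p)
print("significant at the 0.05 level:", p < 0.05)
```

A small value of `p` means that shuffled groups almost never differ by as much as the real groups did, which is what a statistically significant result is telling you.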
If a measured difference between two groups is reported to be statistically significant then this means that the probability of finding a difference at least this large, if there were actually no difference between the two groups, is very low (less than 0.05). The authors can be reasonably confident that the difference measured in the dependent variable has been caused by their differing treatment of the two groups (the independent variable).
The discussion
Read this to find out what conclusions the authors came to. How do the conclusions relate to the original experimental aims? How do the conclusions relate to the evidence presented?
A guide to the paper
Summary
The aim of the experiment was to find out which factors influence whether rats encode a representation of the reinforcer that strengthens their behaviour, or just the stimulus that initiates it. That is, do the rats press a lever because they associate it with the delivery of a reinforcing reward (a goal-directed, or response-outcome (R-O), action) or has their behaviour been conditioned ‘blindly’ so that the rats press the lever out of habit without associating it with the reward (a stimulus-response (S-R) habit)? The reward has to act as a reinforcer of the behaviour at the cellular level in the rats’ brains, but this does not imply that a representation of the reward is actually held by the rats in their memory and available to them when they are exposed to the lever.
The standard way of investigating this aspect of operant conditioning is to first condition a particular behaviour in an animal by positively reinforcing it, so that the animal is trained to respond at a high rate to the stimulus that leads to the reward. The next step is to devalue the reinforcer. This means exposing the animal to it (in the absence of the conditioned behaviour) in such a way that it is no longer regarded as a reward. In the final step the animal is again exposed to the stimulus (in extinction, that is, without reinforcement) and its rate of responding is recorded. If the animal responds at a lower rate compared to an animal that has not experienced reinforcer devaluation, then this would indicate that the animal’s lever pressing behaviour has been established as a goal-directed action rather than a stimulus-response habit. Its rate of responding has decreased because it ‘knows’ that the reinforcer has lost its value so there is no point in working to gain it any more. It is worth contrasting this with the punishment condition of operant conditioning. It is important that the value of the reinforcer is not confused with the relationship between the reinforcer and the response.
In experiment 1 the factor being investigated was the amount of training the rats received. ‘Training’ refers to how many reinforced lever presses each rat carried out. In this case the reinforcer was a sucrose pellet, the rewarded behaviour was pressing a lever, and the stimulus that initiated this behaviour was the presentation of the lever. The subjects of this experiment were 48 naïve male hooded Lister rats. To address the aim of the experiment the rats were divided into four experimental groups.
The rats in two of the groups were reinforced 100 times for 100 lever presses (designated as the 100-P and 100-U groups), and the other two groups were reinforced 500 times for 500 lever presses (designated as the 500-P and 500-U groups). This was the baseline training phase.
The P groups were then subjected to food aversion training. They were given free access to sucrose pellets, but were then injected with lithium chloride, which induces sickness. As a result of this pairing between the sucrose and sickness (a form of Pavlovian conditioning) the rats in these groups developed an aversion to the sucrose and their consumption of pellets given freely (without a requirement for lever pressing behaviour) decreased. The reinforcer had been devalued. The rats in the U groups were subjected to the same procedure, but there was no pairing between the sucrose and sickness, so the reinforcer was not devalued. This was the food aversion training phase.
Each group was then given an extinction test and the frequency of lever pressing responses was measured and standardised. In an extinction test lever-pressing behaviour is not reinforced. This was done so that the sucrose pellets could not have a direct impact on the response rate.
It was found that devaluation had a significant effect on lever pressing for the group that performed 100 lever presses in baseline training (100-P). The relative response rate of individuals in this group decreased significantly compared to the 100-U group, so it was concluded that the lever pressing of these rats was goal-directed. There was no significant difference in the relative response rates of individuals in the 500-P and 500-U groups so it was concluded that extended training had resulted in the rats forming a stimulus-response habit: they were not sensitive to reinforcer devaluation.
A reacquisition test was used to measure the effectiveness of devaluing the reinforcer.
In the reacquisition test lever-pressing behaviour was reinforced again. It was found that devaluation of the reinforcer had been equally effective for the 100-P and 500-P groups. There was no significant difference between their response rates, neither group’s response rate recovered to the baseline value, and none of the animals in these groups ate any of the sucrose pellets that were delivered when they engaged in lever pressing behaviour. These results mean that the contrast seen in the extinction test, where the 100-P group responded at a lower relative rate than the 100-U group but the 500-P and 500-U groups did not differ, was not caused by a difference in the effectiveness of the reinforcer devaluation. This allowed the authors to be more certain that their conclusion that extended training results in the formation of a stimulus-response habit was the correct one.
Questions for discussion
1. What were the reward and rewarded behaviour in this investigation?
2. What does it mean to change the value of the reinforcer?
3. What does successful integration mean?
4. What is an extinction test?
5. If the rat had encoded information about the value of the reinforcer, what would be the appropriate change in responding during the extinction test?
6. What were the independent and dependent variables in this experiment?
7. How was the reinforcer (the sucrose) devalued?
8. What does the adjective naïve imply about the rats? Why is this important?
9. Why was the mass of each rat reduced to 80% of its free-feeding mass and maintained at this level during the investigation?
10. Can you draw a labelled diagram to show the set-up of each operant chamber (Skinner box)?
11. Can you draw a flow diagram to show the training, testing and treatment that each of the four groups of rats received?
You should divide it up by group and by the phase of training and testing (baseline, food aversion, extinction test, reacquisition test).
12. What controls were used in the experiment?
13. How long did food aversion training go on for?
14. Which group of rats pressed the lever most frequently after baseline training?
15. How were the test response rates of the rats in each group presented to make comparisons fair?
16. Which groups in Figure 1(a) do the authors calculate to have significantly different mean relative response rates?
17. In Figure 1(a) which group of rats showed sensitivity to reinforcer devaluation?
18. What was the purpose of carrying out the reacquisition test?
19. What did the results of the reacquisition test show? (Look carefully at the graph on a zoomed-in view.)
20. What could be an overall conclusion to the experiment?
21. Which confounding variables do the authors identify in their discussion?
Glossary
Behaviour: The set of responses an organism makes.
Extinction: A decrease in the frequency of a particular behaviour as a result of it no longer being rewarded.
Learning: A long-term modification of the responses an animal makes as a result of the presentation of particular patterns of stimuli.
Negative reinforcer: A reinforcer that is omitted when an animal makes a particular response.
Operant conditioning: A form of learning in which an animal learns to associate its behaviour with a particular outcome.
Positive reinforcer: A reinforcer that is presented when an animal makes a particular response.
Reinforcer: Any stimulus that changes the frequency of a particular behaviour, eg the sight, smell or taste of food.
Response: How an organism reacts to a particular stimulus.
Animals are capable of making two types of response: muscle contraction or glandular secretion.
Reward: Anything that is beneficial to an animal, increasing its chances of surviving and reproducing, eg an energy-rich food.
Rewarded behaviour: A set of responses that result in the presentation of a reward to an animal.
Shaping: The reinforcement of behaviours that are closer and closer approximations to the behaviour that one is aiming to condition.
Stimulus: Any environmental change that an organism can detect.
Unrewarded behaviour: A set of responses that do not result in the presentation of a reward to an animal.
Reference
Adams, C D. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q J Exp Psychol. 1982; 34B: 77–98.