U4- Learning 5 - Operant Conditioning

Operant Conditioning = A form of learning in which the likelihood of a particular response occurring is determined by the consequences of that response. A response that has a desirable consequence will tend to be repeated and a response that has an undesirable consequence will tend not to be repeated.

The three-phase model of Operant Conditioning
1. The Stimulus (S) that precedes (comes before) the operant response.
2. The Operant Response (R) to the stimulus.
3. The Consequence (C) to the operant response.
Stimulus → Operant Response → Consequence

Skinner and his Rats… (page 480)
• Skinner placed a hungry animal (a rat or a pigeon) into a SKINNER BOX. The box contained a lever and a food tray.
• The animal would move around the box and at some stage it would accidentally press the lever, releasing a food pellet.
• Skinner took note of how many trials it took for the rat to learn to press the lever straight away.
• Skinner concluded that BEHAVIOUR IS SHAPED AND MAINTAINED BY ITS CONSEQUENCES.
• This means that what happens directly after a behaviour will determine whether that behaviour will be repeated (strengthened) or will stop (weakened).

Elements of Operant Conditioning

Reinforcement = Any stimulus that strengthens or increases the likelihood of a response that it follows. It can involve either receiving a pleasant stimulus or ‘escaping’ an unpleasant one.

Reinforcer = Any stimulus that provides reinforcement, often referred to as a reward.

Positive Reinforcement = The presentation of a PLEASANT stimulus (consequence) following a desired response, thereby strengthening the response or making it more likely to occur again.
E.g. The food pellet.
OR: We wash the dishes for mum, receive praise (positive reinforcement) and are then more likely to do it again.
Any other ideas?

Negative Reinforcement (Page 487) = The removal or avoidance of an unpleasant stimulus following a response. Because the outcome is a pleasant one, the response that removes the unpleasant stimulus is strengthened or made more likely to occur again.
E.g. We take a Panadol to get rid of a headache. Taking the tablet is the response, and it removes an unpleasant stimulus (the headache), so we are more likely to take Panadol again.
OR: On a rainy day we use an umbrella to avoid the unpleasant experience of having wet clothes.

For the maths geeks among us…
To help you remember this difference, you could link the terms with mathematical symbols: positive (+) reinforcer = adding something pleasant; negative (-) reinforcer = subtracting something unpleasant. In mathematics, two negatives make a positive. The same applies to negative reinforcement: the subtraction of a negative (unpleasant) stimulus results in a positive (desirable) consequence or outcome.

SCHEDULES OF REINFORCEMENT:
Reinforcement can be provided on a continuous schedule (that is, after each correct response) or on a partial reinforcement schedule (only on SOME occasions after the correct response). There are four basic schedules of PARTIAL reinforcement. Each produces a different effect on the response acquisition rate and the strength of the response.
1. Fixed-Ratio Schedule
2. Variable-Ratio Schedule
3. Fixed-Interval Schedule
4. Variable-Interval Schedule

1. Fixed-Ratio Schedule
A schedule of reinforcement in which a correct response is reinforced after a SET NUMBER of correct responses. It is learned quickly and produces a high rate of response.
EXAMPLES:
A rat must press the lever 10 times to receive the food. The rat soon learns to press the lever 10 times very quickly to get the food.
People who are employed on a ‘piecework’ basis are on a fixed-ratio schedule.
For example, $20 payment for every 100 newspapers sold or $5 for every bucket of cherries picked.

2. Variable-Ratio Schedule
A schedule of reinforcement in which a reinforcer is given after an UNPREDICTABLE NUMBER of correct responses. This is also an effective and speedy method, as the uncertainty of when the next reward will occur keeps organisms responding.
EXAMPLES:
A gambler has no way of predicting how many times he must put a coin in the slot to win, but the more times a coin is inserted the greater the chance of a payout. People who play the pokies are often reluctant to leave them, especially when they have had a large number of unreinforced responses.
OR: Door-to-door salesmen. It is uncertain how many houses they will have to visit to make a sale, but the more houses they try, the more likely it is that they will succeed.

3. Fixed-Interval Schedule
A schedule of reinforcement in which a correct response is reinforced after a SET PERIOD OF TIME has elapsed since the previous reinforcer. It usually produces a moderate rate of response, particularly once the organism works out that time is the key factor.
EXAMPLES:
A worker who has monthly performance reviews is much more likely to perform at a higher level in the days just before the review.
Also, baking a cake for 30 minutes without a timer. You are not going to bother checking much in the first few minutes, but from about the 20-minute mark (by your estimate) you will check more often.

4. Variable-Interval Schedule
A schedule of reinforcement in which a reinforcer is given after IRREGULAR PERIODS OF TIME have passed, provided the correct response has been made. This is the weakest schedule, producing a low but steady rate of response.
EXAMPLES:
Fishing: people don’t know if they will get a bite at 20 seconds, 20 minutes, or at all, so the person checks their line every so often.
Same for emails. If you usually get about 10 emails a day, but not at consistent times, you might check your mail randomly throughout the day.

Punishment = The delivery of an unpleasant consequence (an unpleasant event or the removal of something that is pleasant) following a response, which decreases the likelihood of that response occurring again. Punishment can be negative or positive, just like reinforcement.
E.g. Receiving a speeding fine and demerit points for speeding. These are unpleasant consequences intended to reduce the behaviour. (+ Positive Punishment)
If you continue to speed, you will accumulate points and lose your licence. This is the removal of something pleasant as a form of punishment. (- Negative Punishment)

Since negative punishment involves taking a stimulus away or not obtaining a reinforcer as a consequence of behaviour, it is often referred to as response cost.
Response cost = The removal of any valued stimulus, whether or not that stimulus causes the behaviour. (E.g. licence removal)

Factors that Influence the Effectiveness of Reinforcement and Punishment (pg 490)
• Order of Presentation: Reinforcement/punishment needs to be presented AFTER a response. Never before.
• Timing: Reinforcement/punishment is most effective when delivered IMMEDIATELY AFTER the response. This ensures an association between the response and the consequence.
• Appropriateness: Reinforcement/punishment needs to be relevant to the individual. Reinforcers need to be desirable and punishments undesirable.

Key Processes of Operant Conditioning

Acquisition = The establishment of a response through reinforcement. The speed of acquisition depends on the schedule of reinforcement applied.
COMPARISON TO CLASSICAL CONDITIONING: Similar to classical conditioning in that it refers to the overall learning process during which a response is established, but differs with regard to HOW the behaviour is learned.
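Since the speed of acquisition depends on the schedule, it can help to see the four partial schedules described above written as simple rules. The following is only a rough sketch, not something from the textbook: the function names, the use of a random number generator, and the way the "unpredictable" numbers and times are drawn are all assumptions made for illustration. Each function returns a rule that is called once per correct response (with the time of that response) and answers "is this response reinforced?".

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response (e.g. every 10th lever press)."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n.

    Assumption: the required number is drawn evenly from 1 to 2*mean_n - 1.
    """
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(period):
    """Reinforce the first response made once `period` has passed since the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= period:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_period, rng):
    """Reinforce the first response after an irregular wait averaging mean_period.

    Assumption: each wait is drawn evenly from 0 to 2*mean_period.
    """
    last = 0.0
    wait = rng.uniform(0, 2 * mean_period)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return respond


if __name__ == "__main__":
    # The rat example: food arrives on every 10th lever press.
    rat = fixed_ratio(10)
    print([press for press in range(1, 31) if rat(press)])  # [10, 20, 30]

    # The cake example: checking once a minute, only the 30-minute check pays off.
    cake = fixed_interval(30)
    print([minute for minute in range(1, 61) if cake(minute)])  # [30, 60]
```

Notice that the two ratio schedules count responses while the two interval schedules only look at the clock, which is why responding faster pays off on a ratio schedule but not on an interval schedule.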

Extinction = The gradual decrease in the strength or rate of a conditioned (learned) response following consistent non-reinforcement of the response. Extinction is less likely to occur when partial reinforcement has been used because the reinforcement itself is uncertain. For example, the gambler is USED to the reward being unpredictable.
COMPARISON TO CLASSICAL CONDITIONING: Occurs after the removal of the REINFORCEMENT rather than the unconditioned stimulus.

Spontaneous Recovery = The return of a response in expectation of a reinforcer after a period of extinction. The response is likely to be weaker and will not last long.
COMPARISON TO CLASSICAL CONDITIONING: Very similar to classical conditioning in explanation. (Different components but same concept.)

Stimulus Generalisation = Occurs when the correct response is made to another stimulus that is similar (not necessarily identical) to the stimulus that initially produced the response.
E.g. The pigeon that initially learned to peck a green light to receive food also pecked yellow and red lights.
COMPARISON TO CLASSICAL CONDITIONING: Very similar to classical conditioning in explanation. (Different components but same concept.)

Stimulus Discrimination = Occurs when an organism makes the correct response to a stimulus and is reinforced, but does not respond to any other stimulus, even a similar one.
E.g. Skinner trained his pigeons to discriminate between the red and the green light by only rewarding them when they pecked the green.
READ pg 496: Sniffer Dogs.
COMPARISON TO CLASSICAL CONDITIONING: Very similar to classical conditioning in explanation. (Different components but same concept.)

Further Comparisons between Classical and Operant Conditioning

The Role of the Learner
• Classical Conditioning: Learner is passive. Doesn’t have to do anything for learning to occur. Has no control.
• Operant Conditioning: Learner is active. Must perform some activity to receive a consequence. Has control over learning.

Timing of the Stimulus and Response
• Classical Conditioning: Response (salivation) requires presentation of the UCS (meat) first. Timing of CS then UCS needs to be immediate.
• Operant Conditioning: The presentation of the reinforcer/punishment relies on the response occurring first. Timing can be further apart (e.g. variable interval).

The Nature of the Response
• Classical Conditioning: Response by learner is usually reflexive and involuntary (salivating, blinking).
• Operant Conditioning: Response is usually voluntary (pressing a lever).
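
As a recap of the reinforcement and punishment definitions earlier in these notes, the four kinds of consequence come down to two yes/no questions: is a stimulus added or taken away, and is that stimulus pleasant or unpleasant? The small function below is just a memory aid written for these notes; the function name and argument names are made up for illustration.

```python
def classify_consequence(stimulus_added: bool, stimulus_pleasant: bool) -> str:
    """Name the type of consequence from two yes/no questions.

    stimulus_added:    True if something is presented, False if something is removed.
    stimulus_pleasant: True if that stimulus is pleasant, False if it is unpleasant.
    """
    if stimulus_added and stimulus_pleasant:
        return "positive reinforcement"  # response strengthened
    if not stimulus_added and not stimulus_pleasant:
        return "negative reinforcement"  # response strengthened
    if stimulus_added and not stimulus_pleasant:
        return "positive punishment"  # response weakened
    return "negative punishment (response cost)"  # response weakened


if __name__ == "__main__":
    # Examples taken from the notes above.
    print(classify_consequence(True, True))    # praise for washing the dishes
    print(classify_consequence(False, False))  # Panadol removes the headache
    print(classify_consequence(True, False))   # speeding fine and demerit points
    print(classify_consequence(False, True))   # losing your licence
```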
