Chapter 6 – Schedules of Reinforcement and Choice Behavior

• Outline
  – Simple Schedules of Intermittent Reinforcement
    • Ratio Schedules
    • Interval Schedules
    • Comparison of Ratio and Interval Schedules
  – Choice Behavior: Concurrent Schedules
    • Measures of Choice Behavior
    • The Matching Law
  – Complex Choice
    • Concurrent-Chain Schedules
    • Studies of “Self Control”

• Simple Schedules of Intermittent Reinforcement
• Ratio Schedules
  – RF depends only on the number of responses performed
  – Continuous reinforcement (CRF) – each response is reinforced
    • bar press = food
    • key peck = food
    • CRF is rare outside the lab.
  – Partial or intermittent RF
• Partial or Intermittent Schedules of Reinforcement
• FR (Fixed Ratio) – a fixed number of operants (responses) is required
  – CRF is FR 1
  – FR 10 = every 10th response is RF
  – Originally recorded using a cumulative record
    • Now computers – can be graphed similarly
  – The cumulative record represents responding as a function of time
    • the slope of the line represents the rate of responding: steeper = faster
• Responding on FR schedules
  – Faster responding = sooner RF, so responding tends to be pretty rapid
  – Postreinforcement pause
    • The postreinforcement pause is directly related to FR size.
      – Small FR (e.g., FR 5) = shorter pauses
      – Large FR (e.g., FR 100) = longer pauses – animals wait a while before they start working.
  – Domjan points out this may have more to do with the upcoming work than with the recent RF
    • A “pre-ratio pause”?
    • How would you respond if you received $1 on an FR 5 schedule? On FR 500?
      – Post-RF pauses?
  – RF-history explanation of the post-RF pause
    • Contiguity of the 1st response and RF
      – FR 5 – the 1st response is close to RF – only 4 more
      – FR 100 – the 1st response is a long way from RF – 99 more
• VR (Variable Ratio) schedules
  – The number of responses is still critical
  – But the requirement varies from trial to trial
    • VR 10 – reinforced, on average, for every 10th response.
      – sometimes only 1 or 2 responses are required
      – other times 15 or 19 responses are required.
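The FR/VR distinction above can be sketched with a few lines of code. This is a minimal simulation, not a procedure from the chapter; in particular, drawing VR requirements uniformly from 1–19 is just one simple way (my assumption) to produce an average requirement of 10.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# FR 10: the requirement never changes -- every 10th response is reinforced.
fr10 = [10] * 8

# VR 10: the requirement varies from trial to trial but averages 10.
# (Sampling uniformly from 1..19 is one simple way to get a mean of 10.)
vr10 = [rng.randint(1, 19) for _ in range(10_000)]

print(sum(fr10) / len(fr10))  # 10.0
print(sum(vr10) / len(vr10))  # close to 10
```

On an FR schedule the requirement is always the same; on a VR schedule only the long-run average is fixed, which is why the very next response might always pay off.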
• Example (# = response requirement):
    VR 10:  19 RF | 2 RF | 8 RF | 18 RF | 5 RF | 15 RF | 12 RF | 1 RF
    FR 10:  10 RF | 10 RF | 10 RF | 10 RF | 10 RF | 10 RF | 10 RF | 10 RF
  – VR 10: (19 + 2 + 8 + 18 + 5 + 15 + 12 + 1) / 8 = 10
• VR = very little postreinforcement pause – why would this be?
  – Slot machines – a very lean schedule of RF
    • But – the next lever pull could result in a payoff.
• FI (Fixed Interval) schedule
  – The 1st response after a given time period has elapsed is reinforced.
    • FI 10 s – the 1st response after 10 s is RF.
    • RF waits for the animal to respond
    • responses made before 10 s have elapsed are not RF.
  – Scalloped responding pattern – the FI scallop
    • Similarity of the FI scallop to the post-RF pause?
      – FI 10 s? FI 120 s?
    • The FI scallop has been used to assess animals’ ability to time.
• VI (Variable Interval) schedule
  – Time is still the important variable
  – However, the elapsed-time requirement varies around a set average
    • VI 120 s – time to RF can vary from a few seconds to a few minutes
  – $1 on a VI 10-minute schedule for button presses?
    • Could be RF in seconds
    • Could be 20 minutes
    • Postreinforcement pause?
  – Produces stable responding at a constant rate
    • peck..peck..peck..peck..peck
    • sampling whether enough time has passed
• The rate on a VI schedule is not as fast as on FR and VR schedules – why?
  – Ratio schedules are based on responses:
    • faster responding gets you to the response requirement quicker, regardless of what it is.
  – On a VI schedule the number of responses doesn’t matter,
    • so a steady, even pace makes sense.
• Interval Schedules and Limited Hold
  – Limited-hold restriction
    • You must respond within a certain amount of time after RF is set up
  – Like lunch at school – too late and you miss it
• Comparison of Ratio and Interval Schedules
  – What if you hold RF constant?
    • Rat 1 = VR
    • Rat 2 = yoked control rat on VI
      – RF is set up when Rat 1 reaches his RF requirement
    • If Rat 1 responds faster, RF will set up sooner for Rat 2
    • If Rat 1 is slower, RF will be delayed
  – Why is responding faster on ratio schedules?
  – Molecular view
    • Based on moment-by-moment RF
    • Inter-response times (IRTs)
      – R1……………R2 RF » reinforces a long IRT
      – R1..R2 RF » reinforces a short IRT
    • Short IRTs are more likely to be RF on VR than on VI
  – Molar view
    • Feedback functions
      – The average RF rate for the session is the result of the average response rate
    • How can the animal increase reinforcement in the long run (across the whole session)?
      – Ratio: respond faster = more RF for that day
        » FR 30: responding 1 per second → RF at 30 s; 2 per second → RF at 15 s
  – Molar view continued
    • Interval: no real benefit to responding faster
      – FI 30: responding 1 per second → RF at 30 or 31 s (average 30.5); 2 per second → RF at 30 or 30.5 s (average 30.25)
    • Pay: salary? clients?
• Choice Behavior: Concurrent Schedules
  – The responding we have discussed so far has involved schedules where there is only one thing to do.
  – In real life we tend to have choices among various activities.
  – Concurrent schedules
    • examine how an animal allocates its responding between two schedules of reinforcement.
    • The animals are free to switch back and forth.
• Measures of choice behavior
  – Relative rate of responding for the left key: BL / (BL + BR)
    • BL = behavior (responses) on the left key
    • BR = behavior (responses) on the right key
    • We are just dividing left-key responding by total responding.
  – This computation is very similar to the computation for the suppression ratio.
    • If the animals respond equally on each key, what should our ratio be? 20 / (20 + 20) = .50
    • If they respond more on the left key? 40 / (40 + 20) = .67
    • If they respond more on the right key? 20 / (20 + 40) = .33
  – Relative rate of responding for the right key
    • Will be the complement of the left-key rate (1 minus it), but can also be calculated with the same formula: BR / (BR + BL)
• Concurrent schedules?
  – On concurrent VI 60 VI 60, the relative rate of responding for either key will be .5
    • Responding splits equally between the two keys
  – What about the relative rate of reinforcement?
    • Left key?
      – Simply divide the rate of reinforcement on the left key by the total rate of reinforcement: rL / (rL + rR)
  – VI 60 VI 60?
    • If the animals are dividing responding equally? .50 again
• The Matching Law
  – The relative rate of responding matches the relative rate of RF when the same VI schedule is used on both keys
    • .50 and .50
  – What if different schedules of RF are used on each key?
    • Left key = VI 6 min (10 RF per hour); right key = VI 2 min (30 RF per hour)
    • Left-key relative rate of responding: BL / (BL + BR) = rL / (rL + rR) = 10 / 40 = .25 left
    • Right key? Simply the complement: .75. It can also be calculated: BR / (BR + BL) = rR / (rR + rL) = 30 / 40 = .75 right
    • Thus – three times as much responding on the right key: .25 × 3 = .75
  – Matching Law continued: a simpler computation. BL / BR = rL / rR = 10 / 30 – again, three times as much responding on the right key
• Herrnstein (1961) compared various VI schedules – the Matching Law.
  – Figure 6.5 in your book
• Application of the matching law
  – The matching law indicates that we match our behaviors to the available RF in the environment.
  – Bulow and Meller (1998)
    • Predicted that adolescent girls living in RF-barren environments would be more likely to engage in sexual behaviors
    • Girls with a greater array of RF opportunities should allocate their behaviors toward those other activities
    • Surveyed girls about the activities they found rewarding and about their sexual activity
    • The matching law did a pretty good job of predicting sexual activity
  – Many kids today have a lot of RF opportunities.
    • May make it more difficult to motivate behaviors you want them to do – like homework
      » Xbox
      » Texting friends
      » TV
• Complex Choice
  – Many of the choices we make require us to live with those choices
    • We can’t always just switch back and forth
      – Go to college?
      – Get a full-time job?
  • Sometimes the short-term and long-term consequences (RF) of those choices are very different
    – Go to college » poor now; make more later
    – Get a full-time job » money now; less earning in the long run
• Concurrent-Chain Schedules
  – Allow us to examine these complex choice behaviors in the lab
  – Example
    • Do animals prefer a VR or an FR? – Variety is the spice of life?
    • Choice A – 10 minutes on VR 10
    • Choice B – 10 minutes on FR 10
    • Subjects prefer the VR 10 over the FR 10
      – How do we know?
    • Subjects will even prefer VR schedules that require somewhat more responding than the FR
      – Why do you think that happens?
• Studies of “Self Control”
  – Often a matter of delaying immediate gratification (RF) in order to obtain a greater reward (RF) later.
    • Study or go to a party?
    • Work in the summer to pay for school, or enjoy the time off?
  – Self-control in pigeons?
    • Rachlin and Green (1972)
      – Choice A = immediate small reward
      – Choice B = large reward after a 4-s delay
    • Direct choice procedure
      – Pigeons choose the immediate, small reward
    • Concurrent-chain procedure
      – Pigeons could learn to choose the larger reward
        » but only if there was a long enough delay between the initial choice and the next link.
  – This idea – that imposing a delay between a choice and its eventual outcomes helps organisms make “better” (higher-RF) choices – works for people too.
• Value-discounting function: V = M / (1 + KD)
  – V = value of the RF
  – M = magnitude of the RF
  – D = delay of the reward
  – K = a correction factor for how much the animal is influenced by the delay
  – All this equation is saying is that the value of a reward is inversely related to how long you have to wait to receive it.
  – If there is no delay, D = 0
    • Then the value is simply the magnitude divided by 1
  – If I offer you –
    • $50 now or $100 now? 50 / (1 + 1×0) = 50 vs. 100 / (1 + 1×0) = 100
    • $50 now or $100 next year? 50 / (1 + 1×0) = 50 vs. 100 / (1 + 1×12) ≈ 7.7 (taking K = 1 and the delay in months)
• As noted above, K is a factor that lets us correct these delay functions for individual differences in delay discounting
  – People with steep delay-discounting functions will have a more difficult time delaying immediate gratification to meet long-term goals
    • Young children
    • Drug abusers
• Madden, Petry, Badger, and Bickel (1997)
  – Two groups
    • Heroin-dependent patients
    • Controls
  – Offered hypothetical choices
    • a smaller amount of money – now
    • more money – later
  – Amounts varied
    • $1,000, $990, $960, $920, $850, $800, $750, $700, $650, $600, $550, $500, $450, $400, $350, $300, $250, $200, $150, $100, $80, $60, $40, $20, $10, $5, and $1
  – Delays varied
    • 1 week, 2 weeks, 2 months, 6 months, 1 year, 5 years, and 25 years.
• The matching relation has also been described mathematically in the following way (Baum, 1974):
    RA / RB = b (rA / rB)^a
  – RA and RB refer to the rates of responding on keys A and B (i.e., left and right)
  – rA and rB refer to the rates of reinforcement on those keys
  – When the exponent a equals 1.0, simple matching occurs: the ratio of responses perfectly matches the ratio of reinforcers obtained.
  – The variable b adjusts for differences in response effort between A and B when they are unequal, or for unequal reinforcers on A and B.
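Baum's (1974) generalized matching relation can be checked numerically. The sketch below reuses the VI 6-min / VI 2-min reinforcement rates (10 and 30 RF per hour) from the matching-law example earlier in the chapter; the function name and the specific `a` value used to show undermatching are my own illustrative choices.

```python
def generalized_matching(rA, rB, a=1.0, b=1.0):
    """Predicted response ratio: R_A / R_B = b * (r_A / r_B) ** a."""
    return b * (rA / rB) ** a

# Strict matching (a = 1, b = 1): 10 vs. 30 RF/hour -> response ratio of 1/3,
# i.e., three times as much responding on key B.
ratio = generalized_matching(10, 30)
print(ratio)                # ~0.333

# The same result as a relative rate: ratio / (1 + ratio) = .25 on key A,
# which matches the .25 left / .75 right split computed earlier.
print(ratio / (1 + ratio))  # ~0.25

# Undermatching (a < 1): preference is less extreme than the RF ratio.
print(generalized_matching(10, 30, a=0.8))  # between 1/3 and 1
```

With `a = 1` and `b = 1` the equation collapses to the simple matching law (BL/BR = rL/rR); the exponent and bias terms only matter when behavior deviates from perfect matching.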
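The value-discounting function from the self-control section, V = M / (1 + KD), is also easy to verify with a short script. K = 1 with the delay measured in months mirrors the $50-vs-$100 worked example; treating K as a steepness parameter for individual differences follows the slides, but the specific `k = 2.0` value below is just an illustration.

```python
def discounted_value(magnitude, delay, k=1.0):
    """Value-discounting function V = M / (1 + K*D)."""
    return magnitude / (1 + k * delay)

# $50 now vs. $100 in a year (D = 12 months, K = 1):
print(discounted_value(50, 0))           # 50.0 -- no delay, so V = M
print(discounted_value(100, 12))         # ~7.7 -- the delayed $100 is worth far less
# A steeper discounter (larger K) devalues delayed rewards even more:
print(discounted_value(100, 12, k=2.0))  # 4.0
```

The numbers reproduce the slide's conclusion: with any appreciable delay discounting, $50 now outvalues $100 next year.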