Class Business
• No lab/office hours
• Papers
  – Format: subheadings
  – Progress reports
  – Early drafts
  – Thanksgiving
• Rachlin
  – Summaries
• Tests
  – Review
  – Rewrites

Matching Continued

Herrnstein's Hyperbola: Application

$$B_1 = \frac{k R_1}{R_1 + R_e}$$

Given this equation, what two ways could we reduce $B_1$?

Baseline: $k = 10$, $R_1 = 10$, $R_e = 10$
$$B_1 = \frac{10 \times 10}{10 + 10} = 5$$

Decrease the reinforcement (Rf) for the behavior: $k = 10$, $R_1 = 5$, $R_e = 10$
$$B_1 = \frac{10 \times 5}{5 + 10} = 3.33$$

Increase $R_e$: $k = 10$, $R_1 = 10$, $R_e = 20$
$$B_1 = \frac{10 \times 10}{10 + 20} = 3.33$$

We do this all the time…

Herrnstein's law and concurrent schedules

The strict-matching equations for describing rates of responding on either alternative in concurrent VI VI schedules are:

Alternative 1:
$$\frac{B_1}{B_1 + B_2 + B_e} = \frac{R_1}{R_1 + R_2 + R_e}$$

Alternative 2:
$$\frac{B_2}{B_1 + B_2 + B_e} = \frac{R_2}{R_1 + R_2 + R_e}$$

According to Herrnstein's assumptions, the denominators $(B_1 + B_2 + B_e)$ both equal $k$, so:
$$B_1 = \frac{k R_1}{R_1 + R_2 + R_e} \quad \text{and} \quad B_2 = \frac{k R_2}{R_1 + R_2 + R_e}$$

If we now divide $B_1$ by $B_2$ to see what relative behaviour allocation would look like, we get:
$$\frac{B_1}{B_2} = \frac{R_1}{R_2}$$

As we saw already, this is just the same relation as
$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$
which is the strict matching law.

So, the response rate on one of the schedules comprising a concurrent VI VI schedule is:
$$B_1 = \frac{k R_1}{R_1 + R_2 + R_e}$$

How good is Herrnstein's law (1)?
• A law is only as good as its assumptions are true.
• It fits the data well, but is that enough? Many other equations could also be good fits.

How good is Herrnstein's law (2)?
• The assumption that total output ($k$) is constant seems counter-intuitive, but it could be correct if you measure all the responses in the same modulus, which Herrnstein's theory does.
• Even doing nothing is behaving.

How good is Herrnstein's law (3)?
• The assumption that $R_e$ exists (that there are other reinforcers available) cannot be faulted. Logically, it seems that this must be correct.
• BUT the assumption that $R_e$ remains constant when $R_1$ is varied is probably unreasonable.
• It seems unlikely that $R_e$ could remain constant when $B_e$ varies considerably. $R_e$ surely must fall when $R_1$ is increased.

How good is Herrnstein's law (4)?
• The assumption of strict matching is likely to be wrong.
• Considerable research has shown that the behavior ratio changes rather less with changes in reinforcer ratios than strict matching suggests.

Generalized Matching

Strict matching law
• This is the relative form, because it shows relative response and reinforcer rates (proportions):
$$\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}$$
• You can rearrange this to give the ratio form:
$$\frac{B_1}{B_2} = \frac{R_1}{R_2}$$
• Lots of concurrent-schedule experiments supported the strict matching law, at least roughly, but not all of them.
• Staddon (1968) was the first to collect data that clearly didn't fit the strict matching law.
• He studied concurrent DRL VI(DRL) schedules in pigeons, and varied the VI schedule.
• His data did NOT fall on the major diagonal in relative coordinates, as the strict matching law requires.

[Figure: Staddon's (1968) data in relative coordinates, $B_1/(B_1+B_2)$ against $R_1/(R_1+R_2)$, both axes from 0.0 to 1.0.]

So Staddon (1968) plotted his data in log-ratio form, $\log(B_1/B_2)$ against $\log(R_1/R_2)$:
• Strict matching is still shown as a diagonal line at 45º.
• Equal responses means $\log(B_1/B_2) = 0$.
• Equal reinforcers means $\log(R_1/R_2) = 0$.
• The data clearly fall along a line, whose equation is $y = 0.59x + 0.51$.

[Figure: Staddon (1968), $\log(B_1/B_2)$ against $\log(R_1/R_2)$, showing the fitted line $y = 0.59x + 0.51$, the strict-matching diagonal, and the equal-choice and equal-reinforcers reference lines.]

Two deviations from strict matching are visible in Staddon's data:
• Bias: over all log reinforcer ratios, there are more $B_1$ responses than predicted by strict matching.
• Undermatching: there's less change in log response ratio than predicted by strict matching (the slope of the line is less than 1).

[Figure: Staddon (1968), Bird 420. $\log(B_1/B_2)$ against $\log(R_1/R_2)$: data, fitted line $y = 0.59x + 0.51$, and the strict-matching line.]

More undermatching and bias
• At first, researchers didn't take much notice of Staddon's results: weird procedure, weird results.
• People liked Herrnstein's strict matching theory: simple and parsimonious.
• Reports of these deviations from strict matching continued in more standard concurrent schedules:
  – Hollard and Davison (1971): concurrent VI VI, pigeons, food vs electrical brain stimulation as reinforcers
  – Trevett, Davison and Williams (1972): concurrent VI FI, pigeons, food reinforcers
• Baum (1974) concluded that undermatching was actually more common than strict matching.

The Generalized Matching Law - GML
• Baum (1974) therefore suggested a more general form of the matching law that included parameters to describe undermatching and bias:
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• The GML has the form of the equation for a straight line: y = slope (x) + intercept.
• The slope $a$ is called sensitivity to reinforcer rate (Lobb & Davison, 1975).
• The intercept $\log c$ is called bias (Baum, 1974).

The Generalized Matching Law - GML
• Why is the equation in log-ratio form? Because otherwise it wouldn't be a straight line.
• If we take the GML out of logs,
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
becomes a power function:
$$\frac{B_1}{B_2} = c\left(\frac{R_1}{R_2}\right)^{a}$$
• This is not a straight line, is harder to interpret on a graph, and makes it much harder to find the best-fitting values of $a$ and $c$.

The Generalized Matching Law
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• If sensitivity ($a$) < 1, undermatching
• If sensitivity ($a$) > 1, overmatching
• If bias ($\log c$) > 0, biased towards Alternative 1
• If bias ($\log c$) < 0, biased towards Alternative 2
• If $a = 1$ and $\log c = 0$, strict matching
• If $a = 1$ and $\log c \neq 0$, biased matching
• For Staddon's data, $a = 0.59$, so undermatching, and $\log c = 0.51$, so biased towards Alternative 1 (the shorter DRL)

Examples
Three GML plots:
• Strict matching: $a = 1$, $\log c = 0$
• Undermatching: $a < 1$, $\log c = 0$
• Biased undermatching: $a < 1$, $\log c < 0$

[Figure: Log-ratio measures plot. Log response ratio against log reinforcer ratio, both axes from -1.0 to 1.0, with lines for strict matching, undermatching and biased undermatching.]

Examples
• Much harder to see what's going on in relative plots.

[Figure: Relative measures plot. Relative responses against relative reinforcers, both axes from 0.0 to 1.0, with the same three cases.]

Examples
• And even worse with ratio axes.

[Figure: Ratio measures plot. Response ratio against reinforcer ratio, both axes from 0 to 12, with the same three cases.]

The Generalized Matching Law
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• Undermatching means that when we vary the log reinforcer ratio (the distribution of reinforcers between alternatives), the animal's choice changes less than strict matching predicts.
• Undermatching is more common than strict matching: the usual value of sensitivity is about 0.8 to 0.9.
• Example: Alsop and Elliffe (1988), six pigeons, food reinforcement, five different concurrent VI VI schedules.

Alsop and Elliffe (1988)
• The GML fitted the data well (data points close to the line).

[Figure: Alsop and Elliffe (1988). Log response ratio against log reinforcer ratio, one panel each for Birds 131 to 136.]
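A minimal sketch of how a GML fit of this kind could be computed: an ordinary least-squares line through the log ratios, with the slope read as sensitivity $a$ and the intercept as bias $\log c$. The function names `fit_gml` and `describe`, the tolerance, and the data are all hypothetical; the data are generated from the power function with $a = 0.8$ and $c = 1.5$ purely so the fit can be checked, and are not taken from any study cited here.

```python
import math


def fit_gml(response_ratios, reinforcer_ratios):
    """Least-squares fit of log10(B1/B2) = a * log10(R1/R2) + log c.

    Returns (a, log_c): the sensitivity (slope) and the bias (intercept).
    """
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    log_c = mean_y - a * mean_x
    return a, log_c


def describe(a, log_c, tol=1e-6):
    """Label a fit using the slide's rules (tol is an illustrative assumption)."""
    if abs(a - 1.0) <= tol:
        sensitivity = "matching"
    elif a < 1.0:
        sensitivity = "undermatching"
    else:
        sensitivity = "overmatching"
    if abs(log_c) <= tol:
        bias = "no bias"
    elif log_c > 0:
        bias = "bias towards Alternative 1"
    else:
        bias = "bias towards Alternative 2"
    return sensitivity, bias


# Hypothetical data: B1/B2 = c * (R1/R2)**a with a = 0.8 and c = 1.5.
reinforcer_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
response_ratios = [1.5 * r ** 0.8 for r in reinforcer_ratios]

a, log_c = fit_gml(response_ratios, reinforcer_ratios)
print(round(a, 3), round(log_c, 3))
print(describe(a, log_c))
```

Because the data were generated without noise, the fit should recover $a \approx 0.8$ and $\log c \approx 0.18$. Applied to Staddon's reported values, `describe(0.59, 0.51)` labels the result as undermatching with a bias towards Alternative 1, in line with the interpretation on the slide.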
Alsop and Elliffe (1988)
• All birds showed undermatching (slopes < 1).

[Figure: Alsop and Elliffe (1988). Log response ratio against log reinforcer ratio, one panel each for Birds 131 to 136.]

Alsop and Elliffe (1988)
• All showed some bias: 132 and 134 to B1, the others to B2.

[Figure: Alsop and Elliffe (1988). Log response ratio against log reinforcer ratio, one panel each for Birds 131 to 136.]

Response vs. time allocation
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• Just as with the strict matching law, we could measure choice in terms of time allocation instead of response allocation:
$$\log\frac{T_1}{T_2} = a \log\frac{R_1}{R_2} + \log c$$
• Results are very similar, but sensitivity is usually a little higher ($a$ is closer to 1, less undermatching).

Alsop and Elliffe (1988)
• All birds showed higher time sensitivities than response sensitivities.

[Figure: Alsop and Elliffe (1988). Log response ratio and log time ratio against log reinforcer ratio, one panel each for Birds 131 to 136, with separate lines for responses and time.]

Why do we usually get undermatching?
• Baum's view: undermatching is a failure of matching.
• Organisms are 'trying to match' but various things prevent it, usually related to confusion between the alternatives.
• e.g., no changeover delay (COD) or a COD that is too short (Schroeder & Holland, 1969). Without a COD, reinforcers on one alternative will affect responses on the other, so it is not surprising that we get undermatching.
• e.g., stimuli signaling the alternatives may be confused.
  – Miller, Saunders and Bourland (1980) ran a switching-key concurrent VI VI with the alternatives signaled by line orientations.
  – They found almost strict matching when the lines were 45º different, but $a = 0.33$ when the lines were 15º different.

Why do we sometimes get overmatching?
• It can't just be a failure of matching though, because there are conditions that reliably produce overmatching ($a > 1$).
• This happens if we make it difficult to switch between alternatives.
• e.g., Baum (1982): two-key concurrent VI VI with pigeons, but they had to walk around a partition or jump over a hurdle to switch between keys. $a$ was about 1.9.
• e.g., Davison and Elliffe (2000): switching-key concurrent VI VI, but a switch response produced a delay of 4.5 s before the other stimulus and schedule appeared. $a$ was about 1.6.
• This doesn't seem consistent with Baum's idea that strict matching is normative, but sometimes animals can't do it.

Why do we get bias?
• Much easier to understand: bias is a constant proportional preference for one alternative that doesn't change when we change the reinforcer ratio.
• Choice is biased towards:
  – Large reinforcers over small reinforcers
  – Immediate reinforcers over delayed reinforcers
  – Easy responses over hard responses (e.g., force required to peck)
  – Higher quality reinforcers over lower quality reinforcers
  – Reinforcers that occur at unpredictable times (VI) over reinforcers that occur at regular times (FI)
• There might also be some inherent bias specific to the individual subject, like muscular differences: handedness (beakedness?)

Why do we get bias?
• All these variables affect choice. If one of them is constant but unequal on the two alternatives, it will produce a bias.
• e.g., if left-key responses are reinforced with 3 s access to wheat, and right-key responses are reinforced with 6 s access to wheat, and we then vary the reinforcer rates on each key, we would expect the GML to fit the data well, but with a negative value of $\log c$: a bias towards B2, the right key.
• e.g., Hollard and Davison (1971): pigeons, left key always had 3 s food as the reinforcer, right key had 10 s of ectostriatal electrical brain stimulation (EBS), and the reinforcer rate ratio was varied over conditions.

Hollard and Davison (1971)
• Sensitivity values look normal, but there was a big bias in favor of the alternative providing food reinforcers.
• Responses: bias to food = 0.65 in log units.
• $10^{0.65} \approx 4.5$, so birds liked food 4.5 times as much as EBS.

[Figure: Hollard & Davison (1971), group data. Log response or time ratio against log obtained reinforcer ratio, both axes from -1.0 to 1.0. Time: $y = 1.06x + 0.71$. Responses: $y = 0.77x + 0.65$. Strict-matching line shown for reference.]

Hollard and Davison (1971)
• Time: bias to food = 0.71 in log units.
• $10^{0.71} \approx 5.1$, so birds liked food 5.1 times as much as EBS.
• But there's a problem with the design of this experiment …

Hollard and Davison (1971)
• We can't be certain whether the bias was due to the different reinforcer qualities, to inherent bias, or to some other asymmetry between the keys.
• e.g., maybe the food key was easier to peck than the EBS key, and that's why the birds were biased towards it.
• Hollard and Davison needed a series of control conditions where each key provided the same reinforcer, so they could measure any inherent bias etc. and subtract it from the bias in the food v EBS conditions.

Trevett, Davison and Williams (1972)
• Does the GML apply to concurrent FI VI as well as to VI VI?
• Is there a bias to either FI or VI?
• They ran the control conditions, so any remaining bias should be due to a preference for FI or VI.
• Baseline (control): five concurrent VI VI schedules in which the reinforcer ratio was varied from 1:4 to 4:1 (log reinforcer ratio varied from -0.6 to +0.6).
• Experimental conditions: five concurrent FI VI schedules that varied the reinforcer ratio over the same range.

Trevett, Davison and Williams (1972)
• Sensitivities (slopes) were not significantly different for the two schedule types.
• conc VI VI: small bias towards Key 2 (inherent bias? easier to peck?)
• conc FI VI: bigger bias towards VI.
• The difference between the biases, 0.15 log units, must measure preference for VI over FI.
• Pigeons seem to prefer unpredictable reinforcers over predictable ones: risk-prone rather than risk-averse, perhaps.
• Similar results for time allocation.

[Figure: Trevett et al. (1972), group data. Log response ratio (FI/VI) against log obtained reinforcer ratio, both axes from -1.0 to 1.0. conc VI VI: $y = 0.69x - 0.10$. conc FI VI: $y = 0.62x - 0.25$.]

White & Davison (1973) FI FI
• White and Davison varied relative reinforcer rates in conc FI FI.
• Sometimes the pigeons showed normal FI scallops, but sometimes the cumulative records looked like those from VI schedules (high constant response rate).
• When they showed the same pattern of responses on both keys, sensitivity was close to 1 (strict matching) and there was no bias.
• When they showed different patterns on each key (one FI-like, one VI-like), there was undermatching, and a bias towards the VI-like alternative.

Concurrent VI VR
• The results depend on whether you measure response or time allocation.
• Response bias towards the VR schedule.
• Time bias towards the VI schedule.
• Seems reasonable?
• VR is response-based, VI is time-based.
• So response and time measures aren't always the same, and can be in opposite directions.
• Davison (1982): a strange result that is a problem for the GML:
  – Keep the VI schedule constant and vary the VR: get strict matching ($a = 1$).
  – Keep the VR constant and vary the VI: get undermatching.
• So maybe the GML doesn't apply to ratio schedules very well.

Other IVs in the GML
$$\log\frac{B_1}{B_2} = a \log\frac{R_1}{R_2} + \log c$$
• Reinforcer rates control choice according to the GML.
• Other independent variables (IVs), like reinforcer magnitudes, delays, qualities, etc., produce biases in the GML if they are held constant but unequal while we vary reinforcer rates.
• What if we keep the reinforcer rates constant and vary something else instead?
• e.g., arrange conc VI 60 s VI 60 s and vary magnitude:
$$\log\frac{B_1}{B_2} = a \log\frac{M_1}{M_2} + \log c$$

Schneider (1973) and Todorov (1973)
$$\log\frac{B_1}{B_2} = a \log\frac{M_1}{M_2} + \log c$$
• Version of the GML that describes the effect of reinforcer magnitude on choice.
• Schneider and Todorov both varied reinforcer magnitude and found that the above equation did describe their data.
• But sensitivity to magnitude was about 0.5, much less than sensitivity to rate.
• Reinforcer rate is more effective at influencing choice than reinforcer magnitude.

The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
• If we put the GML descriptions of control by rate and magnitude together, we could write the above equation.
• The sensitivity terms have subscripts identifying the IV they refer to.
• Each IV controls choice in the same way, but sensitivity to each might differ.
• Because reinforcer rate affects choice more than reinforcer magnitude does, $a_r$ is greater than $a_m$.

The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + a_d \log\frac{D_2}{D_1} + a_A \log\frac{A_2}{A_1} + a_q \log\frac{Q_1}{Q_2} + \dots + \log c$$
• We could just keep going, adding a log ratio and a sensitivity term for all the other IVs that affect choice.
• Notice that reinforcer delay ($D$) and response arduousness ($A$) have reversed log ratios. Why?
• Sensitivity term: how much influence that IV has on choice.
• If an IV is constant and equal for both alternatives, its log ratio will be zero and it will drop out of the equation.
• If an IV is constant and unequal, it will produce a bias.

The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
• e.g., suppose $M_1$ is 6 s access to wheat and $M_2$ is 3 s.
• We know that $a_m$ is about 0.5.
• The magnitude ratio is 2, so the log magnitude ratio is 0.3, so the magnitude term is about $0.5 \times 0.3 = 0.15$.
• So if we vary the reinforcer rates and measure bias, it should be about 0.15 towards the larger magnitude, as long as all the other IVs in the concatenated GML are constant and equal, and there's no inherent bias.

Vollmer and Bourret (2000)
• American college basketballers, 3-point shots vs 2-point shots.
• Across players, the ratio of shots attempted (responses) slightly undermatched the ratio of successful shots (reinforcer rate), with a bias in favour of 3-point shots (reinforcer magnitude).

The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = \sum a_x \log\frac{X_1}{X_2}$$
• This is an efficient way to write the concatenated GML.
• The symbol $\Sigma$ means "the sum of".
• $X$ means any independent variable that affects choice.
• The equation says that the predicted log response ratio is the sum of all the log independent-variable ratios multiplied by their sensitivities.
• There's no bias term in the equation, because the hope is that eventually, when all the IVs that affect choice have been discovered, there will be no inherent bias.

The concatenated generalized matching law
$$\log\frac{B_1}{B_2} = \sum a_x \log\frac{X_1}{X_2}$$
• There's a body of research on measuring sensitivities to different independent variables. We've already seen that sensitivity to:
  – reinforcer rate is about 0.8 to 0.9
  – reinforcer magnitude is about 0.5
• We'll look at two more IVs: reinforcer delay and response force requirement (a way of manipulating the arduousness of a response).

Response force: Hunter and Davison (1982)
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_A \log\frac{A_2}{A_1} + \log c$$
• Hunter and Davison varied both reinforcer rate in concurrent VI VI and the force required for a key-peck to be counted as a response (they had weights behind the keys to make them harder to peck).
• Response measures: $a_r = 0.88$, $a_A = 0.71$
• Time measures: $a_r = 0.98$, $a_A = 0.41$
• Force required affected choice less than reinforcer rate did.
• Sensitivities measured using time allocation aren't always higher than those for response allocation.
• This is the only thorough published test of the concatenated GML, and it fitted the data very well.

Rf magnitude: Elliffe, Davison, Landon (2008)
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_m \log\frac{M_1}{M_2} + \log c$$
• But what about with other variables?
• Both reinforcer rate and magnitude were varied thoroughly.
• The concatenated GML fitted the data very well, but:
  – Sensitivity to rate was highest when the magnitudes were equal.
  – Sensitivity to magnitude was highest when the rates were equal.
• The CGML can't be exactly right, but it is probably near enough for predicting behavior in practical terms.

Reinforcer delay
$$\log\frac{B_1}{B_2} = a_d \log\frac{D_2}{D_1} + \log c$$
• The first experiment on this (Chung & Herrnstein, 1967) reported strict matching to relative delay ($a_d = 1$).
• But a more thorough later experiment (Williams & Fantino, 1978) found that sensitivity increased with the average delay.
• This shouldn't happen according to the GML: sensitivity should stay constant for a particular IV.

Assessing food preference: reinforcer quality
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_q \log\frac{Q_1}{Q_2} + \log c$$
• You can find out which of two foods an animal prefers by arranging each as a reinforcer for a different response and varying the reinforcer rates.
• The bias will reflect preference for one or other food, because it will be affected by the log quality ratio.
• This is the same logic that Hollard and Davison used with food v EBS.
• Matthews and Temple (1979) did this with different cow feeds: crushed barley seems to be what they like best.

Assessing food preference: reinforcer quality
$$\log\frac{B_1}{B_2} = a_r \log\frac{R_1}{R_2} + a_q \log\frac{Q_1}{Q_2} + \log c$$
• Notice how important it is to use dependent schedules.
• If you use independent schedules:
  – the animal will respond more for the preferred food (say, $Q_1$),
  – so its reinforcer rate for that food ($R_1$) will increase,
  – so $\log(R_1/R_2)$ increases,
  – and that increases preference more,
  – so $\log(R_1/R_2)$ increases more.
  – And so on.
• You can end up with an apparently very strong preference for one food that is actually caused by the animal receiving that food at a higher rate. Watch for this in pet-food advertisements.
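The concatenated GML used across the preceding slides can be sketched as a sum of sensitivity-weighted log ratios. This is an illustrative sketch only: the function name `concatenated_gml` and its argument layout are hypothetical, and the single worked case reuses the magnitude example from the slides (6 s vs 3 s access, $a_m \approx 0.5$). For delay and arduousness the caller passes the reversed ratio, as in the equation.

```python
import math


def concatenated_gml(terms, log_c=0.0):
    """Predicted log10(B1/B2) under the concatenated GML.

    terms: iterable of (sensitivity, numerator, denominator) tuples, one per
    independent variable. For rate, magnitude and quality pass (a, X1, X2);
    for delay and arduousness pass the reversed ratio (a, X2, X1).
    log_c: any remaining (inherent) bias, in log units.
    """
    return sum(a * math.log10(num / den) for a, num, den in terms) + log_c


# Worked case from the slides: equal reinforcer rates, magnitudes of 6 s vs 3 s.
# The rate sensitivity of 0.85 is an illustrative value within the 0.8 to 0.9
# range quoted in the slides; it has no effect here because the rates are equal.
predicted = concatenated_gml([
    (0.85, 30.0, 30.0),  # reinforcer rate: equal, so this term is zero
    (0.5, 6.0, 3.0),     # reinforcer magnitude: 0.5 * log10(2)
])

print(round(predicted, 2))        # predicted bias in log units
print(round(10 ** predicted, 2))  # the same bias as a response ratio
```

The result is about 0.15 log units towards the larger magnitude, matching the hand calculation on the slide. An IV that is equal on both alternatives contributes $\log 1 = 0$, which is why it drops out of the sum.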
Summary
• Generalized matching seems to describe choice well in a wide variety of situations: different species, responses and reinforcers.
• It says that different IVs to do with responses and reinforcers all affect choice in the same way …
• … but that some of them are more effective than others, as measured by their different sensitivity values.
• It's the best, most widely applicable description of choice that we have at the moment.
• But is it a theory of choice? Why should it happen?
• Is Baum's idea that it's really 'failed strict matching' right?
• Or is it an outcome of some other mechanism for choice, like maximization?

Theories of Matching

Experimental Analysis of Choice
• Methods: concurrent schedules, concurrent chains, delay discounting, foraging contingencies, behavioral economic contingencies.
• Models and Issues: matching/melioration, maximizing/optimality, hyperbolic discounting, behavioral economic/ecological models, behavioral momentum, molar versus molecular issue, concepts of response strength.
• Applications: self-control, drug abuse, gambling, risk, economics, behavioral ecology, social/political decision making.

Melioration: A Theory of Matching
• "To make better": behavior shifts toward the alternative with the higher return (lower cost) until the returns, or local rates of reinforcement, are equal.
• Notation in these slides: $R$ = responses, $r$ = reinforcers, $T$ = time.
• $R_1 / R_2 = r_1 / r_2$, or $r_1 / R_1 = r_2 / R_2$ (reinforcers per response, i.e., return).
• $T_1 / T_2 = r_1 / r_2$, or $r_1 / T_1 = r_2 / T_2$ (local rate of reinforcement).

Example: Conc VI 30" VI 120"
• Suppose in the first hour of exposure 1000 responses were emitted to each alternative:
  – (VI 30") $r_1 / R_1$ = 120 rfs / 1000 resps. Return = 0.12
  – (VI 120") $r_2 / R_2$ = 30 rfs / 1000 resps. Return = 0.03
• Ultimately behavior will shift toward the higher return. What will be the result?
$$\frac{120}{1000 + x} = \frac{30}{1000 - x}, \qquad x = 600$$
$$\frac{120}{1600} = \frac{30}{400}$$
• i.e., matching (80% of responses on the VI 30" alternative).
• Return = 0.075 rfs/resp on each alternative.
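The melioration arithmetic in the VI 30" VI 120" example can be checked with a short sketch. It assumes, as the slide's calculation does, that the reinforcer counts (120 and 30 per hour) stay fixed while responses are reallocated; the function name `equalize_returns` is hypothetical.

```python
def equalize_returns(r1, r2, total_responses):
    """Split total_responses so that r1/n1 == r2/n2 (equal returns).

    Solving r1/n1 = r2/n2 with n1 + n2 = total_responses gives
    n1 = total_responses * r1 / (r1 + r2).
    """
    n1 = total_responses * r1 / (r1 + r2)
    n2 = total_responses - n1
    return n1, n2


r1, r2 = 120, 30                  # reinforcers per hour on VI 30" and VI 120"
n1, n2 = equalize_returns(r1, r2, total_responses=2000)

shift = n1 - 1000                 # responses moved from the VI 120" alternative
print(n1, n2, shift)              # allocation at equal returns
print(r1 / n1, r2 / n2)           # return on each alternative
print(n1 / (n1 + n2))             # proportion of responses on VI 30"
```

With these numbers the allocation is 1600 and 400 responses (a shift of 600), the return is 0.075 reinforcers per response on each alternative, and 80% of responses go to the VI 30" alternative, which equals its share of the reinforcers (120 of 150).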
Problem: Conc VR 30 VR 120
• ALL responses will ultimately be made to the VR 30 alternative.
• This is consistent with matching, but the same would be said if all the responses were made to the VR 120 alternative.
• But melioration can predict which alternative should receive all the responses:
  – VR 30: $r_1 / R_1 = 1/30$; VR 120: $r_2 / R_2 = 1/120$.
  – These cannot change, so shifting to the higher return means all the responses will go to the VR 30 alternative.

Operant Conditioning

MATCHING LAW
• Herrnstein: $R_1 / R_2 = r_1 / r_2$
• Baum: $R_1 / R_2 = b\,(r_1 / r_2)^{a}$
• Rachlin's "Value": $V_1 / V_2 = b\,(r_1 / r_2)^{a_1} (M_1 / M_2)^{a_2} (D_2 / D_1)^{a_3}$

FUNCTIONAL PROPERTIES AND CURVE FITTING
• What is the "real" delay function?
  – $V_t = V_0 / (1 + Kt)$
  – $V_t = V_0 / (1 + Kt)^{s}$
  – $V_t = V_0 / (M + Kt^{s})$
  – $V_t = V_0 / (M + t^{s})$
  – $V_t = V_0 \exp(-Mt)$
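A minimal sketch comparing two of the candidate delay functions listed above: the simple hyperbola $V_t = V_0 / (1 + Kt)$ and the exponential $V_t = V_0 \exp(-Mt)$. The function names and every parameter value are hypothetical, chosen only to show how the two curves differ in shape; nothing here is fitted to data.

```python
import math


def hyperbolic_value(v0, k, t):
    """Value after delay t under V_t = V_0 / (1 + K t)."""
    return v0 / (1 + k * t)


def exponential_value(v0, m, t):
    """Value after delay t under V_t = V_0 * exp(-M t)."""
    return v0 * math.exp(-m * t)


# Illustrative parameters (assumptions, not estimates from any study).
V0, K, M = 10.0, 0.5, 0.5

for t in [0, 1, 2, 5, 10]:
    h = hyperbolic_value(V0, K, t)
    e = exponential_value(V0, M, t)
    print(t, round(h, 2), round(e, 2))
```

Both functions start at $V_0$ with no delay, but with these parameters the exponential falls towards zero much faster at long delays than the hyperbola does. Deciding which form is the "real" one is a curve-fitting question, which is the point the slide raises.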