Decisions reduce sensitivity to subsequent information

Zohar Z. Bronfman1,2*, Noam Brezis1*, Rani Moran1, Konstantinos Tsetsos3, Tobias Donner4,5,6** & Marius Usher1,7**

1: School of Psychology, Tel-Aviv University
2: The Cohn Institute for the History and Philosophy of Science and Ideas, Tel-Aviv University
3: Department of Experimental Psychology, University of Oxford, Oxford, UK
4: Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
5: Amsterdam Brain & Cognition, University of Amsterdam, Amsterdam, The Netherlands
6: Bernstein Center for Computational Neuroscience, Charité-Universitätsmedizin, Berlin, Germany
7: Sagol School of Neuroscience, Tel-Aviv University
* Co-first authors; ** Co-senior authors

Supplementary materials

Data Analysis in Exp. 1

We obtained participants' decision, evaluation and response time (RT; measured from the sequence's offset until the mouse button press) in each trial. To quantify observers' accuracy in estimating the mean value of the numerical sequences, we used the Pearson correlation between the sequences' averages and the estimations. For the comparison of accuracy within the DEE condition between decision-congruent and decision-incongruent trials (incongruent trials being those in which the mean of the additional information was opposite to the preliminary decision, e.g., decision < 50 and mean additional information > 50), we used absolute deviations (rather than correlation) as the measure of accuracy, in order to avoid possible artefacts that result from computing correlations on trials with dissimilar numerical ranges; nevertheless, the same pattern of results is observed for correlation (see Results). Other factors that can influence accuracy, such as variance or numerical magnitude, did not differ between the trial types. None of the reported regression models included an intercept, unless stated otherwise. All reported ANOVAs were conducted on non-normalized regression weights.

Exp. 1, full time-course regression

A full regression (16 regressors, one for each number), run separately for the DEE and NEE trials, shows an increase in the weight of the 8th item and a decrease in the weight of the 11th item in the DEE condition as compared to the NEE condition.

Figure S1 – Exp. 1, full time-course regression, separately for the DEE (blue) and NEE (red) trials. Coefficients for the 8th and 11th items differ significantly between the decision and the no-decision conditions [t(20)=5.8; p<0.0001; t(20)=-6.44; p<0.00001; respectively].

Model fitting procedures

In Exp. 1, all models were fitted to participants' evaluations, separately for the DEE and NEE conditions, using the maximum-likelihood method. For each model we computed the likelihood of the entire data given the model parameters (using a grid of model parameters). The likelihood was obtained by generating 1000 simulations of the model's evaluations for each sequence (experimental trial), from which we obtained a density distribution of the model's evaluation for that sequence. Next, we calculated the likelihood of the actual evaluation given that density distribution. The likelihood of the entire data was calculated by multiplying the likelihoods of the separate trials. Finally, parameters were estimated by maximizing the likelihood term using grid search.

In Exp. 2, all competing models are identical until the first decision in session-I. We therefore first fitted a 3-parameter model (internal noise, boundary and leak) to the group RT distributions for correct and incorrect responses (obtained by averaging RT quantiles over participants; (1)) of the first decision in session-I. Observed response times for correct and incorrect decisions were binned in 5 quantiles, and the models simulated an estimate of the expected proportion of trials in each quantile, given the observed quantile values (2).
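As a rough illustration of the quantile-based fitting step described above, the following Python sketch computes a log-likelihood from observed and simulated RTs. It is a minimal sketch under stated assumptions, not the authors' code: the function name `quantile_log_likelihood`, the use of five bins delimited by observed quantile values, and the small probability floor are hypothetical choices made for this example.

```python
import math
from bisect import bisect_right


def quantile_log_likelihood(observed_rts, simulated_rts, n_bins=5):
    """Return sum_i n_i * log(p_i) over RT bins (log of prod_i p_i ** n_i).

    n_i is the number of observed RTs in bin i and p_i is the proportion of
    simulated RTs that fall in the same bin. Bin edges are taken from the
    observed RTs (assumption: n_bins roughly equal-count bins).
    """
    obs = sorted(observed_rts)
    # Interior edges that split the observed RTs into n_bins bins.
    edges = [obs[(k * len(obs)) // n_bins] for k in range(1, n_bins)]

    observed_counts = [0] * n_bins
    for rt in obs:
        observed_counts[bisect_right(edges, rt)] += 1

    simulated_counts = [0] * n_bins
    for rt in simulated_rts:
        simulated_counts[bisect_right(edges, rt)] += 1

    floor = 1e-10  # assumption: avoids log(0) when a bin gets no simulated RTs
    total = len(simulated_rts)
    log_lik = 0.0
    for n_i, s_i in zip(observed_counts, simulated_counts):
        p_i = max(s_i / total, floor)
        log_lik += n_i * math.log(p_i)
    return log_lik
```

In a grid search, one could evaluate this quantity for every parameter combination and keep the combination with the largest value.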
Best-fitting parameters were obtained by maximizing the likelihood term:

$$\text{Likelihood} = \prod_{i=1}^{5} \left(\text{Expected\_proportion}_i\right)^{\text{Observed\_quantility}_i},$$

where i denotes the RT quantile. Given the best-fitting 3 parameters, we next fitted each of the competing models to the regression slope observed in session-I (blue line, Figure 4A) by maximizing the likelihood of the observed slope. The likelihood corresponds to the probability density for the observed slope, which was calculated based on a simulation of 1000 trials. In addition, we constrained the models to the observed fraction of change-of-mind trials (i.e., trials in which observers changed their preliminary choice), within +/- 1 S.E.M. We used the best-fitting parameters of each of the competing models to predict both the regression slope observed in session-II (red line, Figure 4A) and the fraction of responses that followed incongruent additional evidence (Figure 4B).

Figure S2 – Exp. 1, model predictions of the selective reduction in gain model, with best-fitting parameters (see Table-1); error bars denote 1 S.E.M.

Figure S3 – Exp. 1, model predictions for the value-shift model with a value-shift parameter of 4. This model fails to capture the observed change in spread as well as the difference in performance between the congruent and incongruent trials in the DEE condition.

| Model | Name | Internal noise | Additional parameters | Log likelihood | BIC |
|---|---|---|---|---|---|
| I | Unbiased | 24.5 | NA | -4,676 | 9,361 |
| II | Global Reduction | 24.5 | ω_2=0.4 | -4,667 | 9,349* |
| III | Value-shift | 24.5 | β=0 | -4,673 | 9,360 |
| IV | Selective Reduction | 25.0 | ω_2c=0.48; ω_2i=0.39 | -4,664* | 9,351 |
| V | Hybrid (gain + inflation) | 24.5 | ω_2=0.4; β=0 | -4,667 | 9,356 |

Table S1 - Summary of the model comparison in Exp. 1, DEE condition. The values of highest likelihood (selective reduced gain) and the highest BIC ranking (global reduced gain) are marked with an asterisk (*).
The weights ω_1 and ω_2 correspond to the weights given to evidence before and after the decision and are normalized to sum to 1, in order to prevent systematic under- or over-estimation. Thus ω_2 < 0.5 corresponds to reduced gain; ω_2c and ω_2i correspond to the weights for extra evidence in trials in which the latter is congruent/incongruent with the decision (see Computational Methods).

Introducing leak to the models in Exp. 1

We tested an additional class of models that include a leak parameter, representing the decay of earlier items as compared to more recent ones. We first fitted the model with a leak parameter, λ, to the no-decision extra-evidence (NEE) condition. The model is given by:

$$y = 50 + \frac{\sum_{i=1}^{16} (x_i - 50 + \varepsilon_i)\,(1-\lambda)^{16-i}}{16}$$

As suggested by the full time-course regression analysis, we find that λ=0, suggesting that there is no leak in the NEE condition. Nonetheless, in order to exclude the possibility that the reduced gain results from changes in leak, we fitted to the DEE condition a variant of the reduced-gain model that includes a leak parameter. The model is given by:

$$y = 50 + \omega_1 \frac{\sum_{i=1}^{8} (x_i - 50 + \varepsilon_i)\,(1-\lambda)^{8-i}}{8} + \omega_2 \frac{\sum_{i=9}^{16} (x_i - 50 + \varepsilon_i)\,(1-\lambda)^{16-i}}{8}, \quad \text{with } \omega_1 = 1 - \omega_2$$

We find that, unlike in the NEE condition, in the DEE condition there is leak (λ=0.04; BIC=9,314). However, in this model, just as in the reduced-gain model, ω_2 = 0.4, suggesting that leak does not account for the observed reduction in participants' sensitivity to additional information. Finally, we tested an additional model variant with two leak parameters (λ_1 and λ_2), one for each of the two 8-number sequences. The model is given by:

$$y = 50 + \omega_1 \frac{\sum_{i=1}^{8} (x_i - 50 + \varepsilon_i)\,(1-\lambda_1)^{8-i}}{8} + \omega_2 \frac{\sum_{i=9}^{16} (x_i - 50 + \varepsilon_i)\,(1-\lambda_2)^{16-i}}{8}, \quad \text{with } \omega_1 = 1 - \omega_2$$

Here, we find that λ_1 = 0.04, λ_2 = 0.02 and ω_2 = 0.4 (BIC=9,324). The BIC for the two-leak model is higher (by 10) than for the single-leak model, providing support against different leaks for information before and after the decision.
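The single-leak and two-leak variants above can be written compactly in code. The following is a minimal Python sketch, not the authors' implementation: it assumes Gaussian internal noise added to each item, and the function name `leaky_estimate` and its argument names are hypothetical. Setting `leak1 == leak2` gives the single-leak variant, and `w2=0.5` with zero leak and zero noise reduces to the plain arithmetic mean.

```python
import random


def leaky_estimate(xs, w2=0.5, leak1=0.0, leak2=0.0, noise_sd=0.0, rng=None):
    """Estimate the mean of 16 numbers with separate gain and leak per half.

    xs       : sequence of 16 numbers (items 1-8 precede the decision)
    w2       : weight of the post-decision half; the pre-decision weight is 1 - w2
    leak1/2  : leak applied within the first / second half
    noise_sd : SD of the per-item internal noise (assumed Gaussian here)
    """
    if len(xs) != 16:
        raise ValueError("expected a sequence of 16 numbers")
    rng = rng or random.Random()
    w1 = 1.0 - w2  # the two weights sum to 1

    def half(items, first_index, last_index, leak):
        total = 0.0
        for i, x in enumerate(items, start=first_index):
            noisy = x - 50 + rng.gauss(0.0, noise_sd)
            # Earlier items in the half are discounted more strongly.
            total += noisy * (1.0 - leak) ** (last_index - i)
        return total / 8

    pre = half(xs[:8], 1, 8, leak1)
    post = half(xs[8:], 9, 16, leak2)
    return 50 + w1 * pre + w2 * post
```

For example, with eight 40s followed by eight 60s and no noise or leak, `w2=0.5` returns 50, whereas `w2=0.4` returns 48, which shows how a reduced post-decision weight pulls the estimate toward the pre-decision evidence.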
Nevertheless, in the two-leak model too, the weight of the extra evidence is reduced (ω_2 = 0.4).

Figure S4 – Between-participants correlation between loss of sensitivity to additional information and loss of accuracy in value estimation (blue circles depict data points; the blue line depicts the regression line (R=0.69); the red starred line depicts the prediction of the global reduction in gain model, across various values of the gain parameter).

Exp. 2 method

On each time-frame (16.7 ms), the mean brightness level of one of the disks was higher than that of the other (Strong=0.7; Weak=0.4; SD=0.15). Each trial began with a central fixation cross (300 ms). On the next frame, two disks (25 mm) appeared symmetrically on both sides of the cross (30 mm from screen center), and one of the disks (chosen at random) was brighter than the other. From the following frame onward, the brightness levels of the disks depended solely on the preceding frame. When the predetermined disk that represented the correct response was brighter on a given frame, it had a higher probability of being brighter on the next frame than the incorrect disk had of maintaining a brighter state (0.988 and 0.97, respectively). This procedure resulted in a dynamic stream of brightness in which at every moment one disk was brighter than the other, and one of the disks (the one corresponding to the correct response) had a higher probability of being brighter. The participants were told that each disk represents the momentary strength of a fighting boxer during a boxing match, and that on each trial they would observe random parts of a match (a different match on each trial). The participants were asked to guess, at their own pace (free response), which of the two boxers was more likely to win the match (keyboard press).

Exp. 3 Results

The regression weights for Exp. 3 are shown in Figure S5a and, for comparison, the regression weights for Exp. 2 are reproduced in Figure S5b. Like the weights for session-II in Exp. 2, the regression weights in Exp. 3 show recency [t(9)=2.31; p=0.046]. Similarly, the weights of Exp. 3 are significantly different from those observed in Exp. 2, session-I [repeated-measures ANOVA with weights as a within-participant factor and session as a between-participant factor, F(1, 18)=5.23; p=0.035]. This shows that the increase in the weight assigned to the additional evidence (red lines) is not due to participants having done this test as a second session.

Figure S5 – A) Regression weights obtained in Exp. 3; B) regression weights obtained in Exp. 2. Error bars denote 1 within-participants S.E.M.

Figure S6 – Observed (blue) and model-predicted (black) group RT distribution (in frames of 16.7 ms) in Exp. 2, session-I, using the Vincentizing method (3).

Figure S7 – Exp. 2, predictions of the selective reduction in gain model.

Figure S8 – Exp. 2, predictions of the value-shift model (shift parameter=20).

Decision-modulated change in attractor dynamics is consistent with reduced gain of extra evidence

We demonstrate here that the reduced gain for extra evidence that we observed in Exp. 2, using the diffusion framework (Figure 4c), is equivalent to a decision-induced change in the underlying attractor dynamics within a leaky competing accumulator (LCA) model framework.

The LCA assumes that the momentary local evidence (brightness of spots) in favor of the Left and Right alternatives, I_1(t) and I_2(t), is integrated in two accumulators (x_1, x_2), which are subject to neural leak (decay in activity) and to mutual inhibition:

$$x_1(t+1) = I_1(t) + (1-\lambda)\,x_1(t) - \beta\,x_2(t) + \xi_1(0,\sigma)$$

$$x_2(t+1) = I_2(t) + (1-\lambda)\,x_2(t) - \beta\,x_1(t) + \xi_2(0,\sigma)$$

where λ is the leak, β the mutual inhibition, and ξ_i(0, σ) represents processing noise thought to be intrinsic to the accumulation, which is assumed to be Gaussian (with mean 0 and SD σ).
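The LCA update above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' simulation code: the function names `lca_step` and `run_lca` are hypothetical, the accumulators are assumed to start at zero, and the clipping at zero reflects the LCA's lower reflecting bound on activation.

```python
import random


def lca_step(x1, x2, i1, i2, leak, inhibition, noise_sd, rng):
    """Apply one LCA update to two accumulators and clip activations at zero."""
    new_x1 = i1 + (1.0 - leak) * x1 - inhibition * x2 + rng.gauss(0.0, noise_sd)
    new_x2 = i2 + (1.0 - leak) * x2 - inhibition * x1 + rng.gauss(0.0, noise_sd)
    # Reflecting lower bound: activations cannot become negative.
    return max(new_x1, 0.0), max(new_x2, 0.0)


def run_lca(inputs, leak, inhibition, noise_sd, seed=0):
    """Integrate a list of (i1, i2) input pairs; return the (x1, x2) trajectory."""
    rng = random.Random(seed)
    x1 = x2 = 0.0  # assumption: both accumulators start at zero
    trajectory = []
    for i1, i2 in inputs:
        x1, x2 = lca_step(x1, x2, i1, i2, leak, inhibition, noise_sd, rng)
        trajectory.append((x1, x2))
    return trajectory
```

Raising `inhibition` relative to `leak` makes the currently leading accumulator suppress the other more strongly, which is the sense in which the dynamics become more competitive.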
Furthermore, the LCA assumes that information integration is subject to a reflecting-boundary nonlinearity at the lower bound (0): the activation of the accumulators cannot be negative. Note that there are two stochastic sources in the LCA update equations. One is external and is included in the inputs I_1, I_2 (subject to Poisson statistics), while the other is internal.

In applying the LCA to the data in Exp. 2 we make a few assumptions, consistent with our previous studies of the LCA and other literature. For the interrogation session we assume the absence of decision criteria, as shown in previous data fits with this model (4). Accordingly, the decision is determined by the accumulator whose activation is highest at the end of the evidence stream. To account for the free-response session, we assume the presence of a decision criterion (θ_1) that determines the first response, i.e., the preliminary decision (see Figure S9 a, b). Such decision criteria are thought to correspond to a change in the dynamics of the decision variables (5), which are usually not modeled explicitly, as they take place after the decision is "triggered". Since the task in Exp. 2 is not terminated with the preliminary decision, but requires integrating additional evidence for at least 1.5 seconds, we test two assumptions regarding this follow-up process.

i) The default model: Following the preliminary decision at boundary θ_1, the observers keep integrating the additional evidence without any change in network parameters until a second boundary, θ_2, is reached (6).

ii) The decision-induced attractor model: Following the decision, the network parameters change as a result of neuromodulation (7, 8), so as to increase the inhibition parameter of the model (see (9) for an explicit model). Consistent with this, we assume that the neuromodulatory effect enhances the magnitude of the lateral inhibition, making the decision dynamics more competitive.
This results in decision-induced attractor dynamics (inhibition-dominant LCA; (4)), which make the model less sensitive to changes in the evidence that follows the preliminary decision, as compared to the default model. After reaching the boundary and making the preliminary decision, the accumulation process continues for at least 1.5 sec (the duration of the mandatory suspension in the experiment) and afterwards until a second bound is reached (a free parameter, θ_2). The model assigns the final decision to the accumulator that reaches the θ_2 boundary first.

To illustrate the difference between the default and the decision-induced attractor model, we present in Figure S9 the neural activation of the two accumulators in response to the actual stimulus that was presented to the participants in a single trial. The bottom row shows the stochastic input (I_1, I_2), which corresponds to a trial that overall favors I_1 but which, due to the stochastic nature of the stimuli, happens to start with an advantage for I_2 (note that we did not switch the input; rather, we selected a case that occurs naturally as part of the stimulus presented in the experiment). The default model's response to this stochastic input is shown in the upper panel, and that of the decision-induced attractor model in the middle panel (see caption for the models' parameters). The dashed line corresponds to the moment of the preliminary decision, when the decision criterion (θ_1) was reached (in session-I). For the default model, in which the parameters do not change following the preliminary decision, we see that the units respond to the change in evidence (the blue accumulator (I_1) becomes the dominant one after 150 time-steps), thus triggering a 'change of mind'.
Conversely, for the decision-induced attractor model we see that, since the preliminary decision (at t=58) has triggered enhanced mutual inhibition, the model is less sensitive to the change in the additional evidence (the red accumulator continues to dominate).

Figure S9 – A, B: simulations of the LCA models for a single trial in session-I (free response); A) the default LCA (λ=0.025; β=0.01; σ=1.2; θ_1=22 and θ_2=25); and B) the attractor LCA with a decision-induced increase in mutual inhibition (β_After decision=0.085; all other parameters identical to the default LCA). C) Illustration of the evidence profile on a single trial in which, due to the stochastic nature of the stimulus, the mean signal changes during the course of the trial.

To confirm this illustration, we simulated the two models over multiple trials (same inputs as in the experiment), under two conditions that correspond to the two experimental sessions. The default model was simulated for session-II (interrogation), in which the decision criteria are not in place and thus the network parameters do not change. The decision-induced attractor model was simulated for session-I (free response). We then carried out a logistic regression of the decision on the evidence in intervals I and II (until and following the preliminary decision, respectively; see Figure 4 in the main text). We see that while the default LCA indicates mild recency (leak dominance in the LCA parameters), the decision-induced attractor indicates strong primacy (due to the increase in mutual inhibition). This shows that the decision-induced attractor mimics the reduced-gain model, which we consider a simplified and theory-neutral account of the phenomenon, and is consistent with the data in our experiment.

Figure S10 – Simulated logistic regression weights for the default LCA model and the decision-induced attractor LCA.
The default model was simulated for session-II (i.e., interrogation, in which there is no decision criterion; red line). The attractor LCA was simulated for session-I (free response; blue line).

References

1. Ratcliff R. Group reaction time distributions and an analysis of distribution statistics. Psychological Bulletin. 1979;86(3):446.
2. Heathcote A, Brown S, Mewhort D. Quantile maximum likelihood estimation of response time distributions. Psychonomic Bulletin & Review. 2002;9(2):394-401.
3. Rouder JN, Speckman PL. An evaluation of the Vincentizing method of forming group-level response time distributions. Psychonomic Bulletin & Review. 2004;11(3):419-27.
4. Tsetsos K, Gao J, McClelland JL, Usher M. Using time-varying evidence to test models of decision dynamics: bounded diffusion vs. the leaky competing accumulator model. Frontiers in Neuroscience. 2012;6.
5. Simen P. Evidence accumulator or decision threshold–which cortical mechanism are we observing? Frontiers in Psychology. 2012;3.
6. Resulaj A, Kiani R, Wolpert DM, Shadlen MN. Changes of mind in decision-making. Nature. 2009;461(7261):263-6.
7. Usher M, Cohen JD, Servan-Schreiber D, Rajkowski J, Aston-Jones G. The role of locus coeruleus in the regulation of cognitive performance. Science. 1999;283(5401):549-54.
8. Aston-Jones G, Cohen JD. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci. 2005;28:403-50.
9. Usher M, Davelaar EJ. Neuromodulation of decision and response selection. Neural Networks. 2002;15(4):635-45.