Supplementary Information A - Proceedings of the Royal Society B

Moving in time: Bayesian causal inference explains movement coordination to auditory
beats
Mark T. Elliott*, Alan M. Wing, Andrew E. Welchman
School of Psychology, University of Birmingham, Edgbaston, UK, B15 2TT
*Corresponding Author: Email: m.t.elliott@bham.ac.uk; Tel: 0121 414 7260; Fax: 0121
414 4897
Supplementary Information A:
Bayesian Inference Model Derivation and Simulation Description
In the experiment, participants were asked to synchronise to auditory metronomic cues. Cues
were formed of two independent metronomes (A and B) with equal underlying tempo, but
with B offset in phase (ϕ) relative to A. The metronomes also varied in temporal reliability: each onset was perturbed by a random value sampled from a distribution with standard deviation σjittA or σjittB, respectively.
We model this task using an observer who must synchronise their movements to the rhythmic
auditory cues presented to them. From the observer’s point of view, each set of cues consists
of two discrete tones of different pitch (sA and sB). The observer must estimate the onset of
the underlying beat that is formed by the auditory cues and in the context of the previous cue
onsets to allow them to make movements in synchrony with those beats. They do this using a
causal inference process based on (i) the likelihood of the onsets of the two auditory cues,
whose true onset times are corrupted by sensory noise and (ii) the prior expectation of where
the beat will occur, based on the previous beat onset estimate. The causal inference process
allows the observer to determine if the two auditory cues should form a single common beat
and hence combine the likelihood of the two beats with the prior to obtain the estimated onset
time of that beat (ŝ). Alternatively, if the causal inference process indicates that the two
auditory cues are in fact independent, then two beat onset times are estimated (ŝA, ŝB) based
on the prior and likelihood of each independent cue onset.
Here, we formally derive the causal inference model (CI) we used to fit to the experimental
data. Subsequently, we show the alterations made to this model to derive the alternative
models we tested (Causal Inference with phase-offset adaptation (CIPA); Mandatory
Integration (MI); Mandatory Separation (MS)).
Generative model
We assume there are two scenarios: one where the observer registers a single common beat
(C=1) or alternatively, two independent beats (C=2). C is determined by drawing from a
binomial distribution, with p(C=1) = psingle, where psingle is the prior probability of the
auditory cues forming a single beat [1]. For a single common beat we sample the onset of the
beat, s, from a prior distribution, N(µp, σp), where µp is based on the estimated onset time of the previous beat. We then set sA = s and sB = s. For independent beats, we sample two beat onsets, sA and sB, from the same prior distribution. We assume the observer's estimated onset times of the two auditory signals are corrupted by Gaussian noise. The onset times tA and tB registered by the observer are therefore sampled from Gaussian distributions N(sA, σA) and N(sB, σB), respectively.
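As an illustrative sketch of this generative process, the following Python snippet (our own illustration; the function name and any example values are assumptions, not taken from the paper) draws one pair of registered onsets:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trial(p_single, mu_p, sigma_p, sigma_A, sigma_B):
    """Draw one pair of registered onsets (t_A, t_B) from the generative model."""
    if rng.random() < p_single:
        # C = 1: a single common beat, shared by both metronomes
        s_A = s_B = rng.normal(mu_p, sigma_p)
    else:
        # C = 2: two independent beats, each drawn from the same prior
        s_A = rng.normal(mu_p, sigma_p)
        s_B = rng.normal(mu_p, sigma_p)
    # Sensory registration corrupts the true onsets with Gaussian noise
    t_A = rng.normal(s_A, sigma_A)
    t_B = rng.normal(s_B, sigma_B)
    return t_A, t_B
```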
Time Referencing
With a stream of isochronous beats, the mth beat would occur at mT seconds, where T is the
interval between beats. For the model, we are only concerned with the relative timing of each
beat estimate, so we define a relative time of 0s where each beat of a single isochronous
metronome would occur. Time onsets occurring before this, are deemed negative; later events
positive. All estimates of sA, sB and their derivations are centred around this relative time
frame. The final asynchrony of the simulated observer’s response is also measured relative to
this reference. We further use this reference to define our initial condition of the prior
probability of where the next beat will occur, by starting with µp = 0.
Inference Calculations
First we determine the probability of a single common beat based on the estimated onset
times tA, tB of the signals:
\[
p(C=1 \mid t_A, t_B) = \frac{p(t_A, t_B \mid C=1)\, p(C=1)}{p(t_A, t_B)}
\tag{S1}
\]
As there are only two possibilities, a single beat (C=1) or two independent beats (C=2), and the overall probability must sum to one, the denominator can be expanded to give:

\[
p(C=1 \mid t_A, t_B) = \frac{p(t_A, t_B \mid C=1)\, p_{\mathrm{single}}}{p(t_A, t_B \mid C=1)\, p_{\mathrm{single}} + p(t_A, t_B \mid C=2)\,(1 - p_{\mathrm{single}})}
\tag{S2}
\]
Therefore, we need to calculate the likelihood function, p(tA,tB|C=1), which is given by:
pt A , t b | C  1   p(t A , t B | s ) p( s ) ds
(S3)
  p (t A | s ) p (t B | s ) p ( s ) ds
(S4)
All three terms in the integral are Gaussians and hence this can be solved analytically:
\[
p(t_A, t_B \mid C=1) = \frac{1}{2\pi\sqrt{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_p^2 + \sigma_B^2\sigma_p^2}} \exp\!\left[-\frac{1}{2}\,\frac{(t_B - t_A)^2\sigma_p^2 + (t_B - \mu_p)^2\sigma_A^2 + (t_A - \mu_p)^2\sigma_B^2}{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_p^2 + \sigma_B^2\sigma_p^2}\right]
\tag{S5}
\]
For the condition where the signals are treated independently (C=2), we get:
pt A , tb | C  2    p(t A | s A ) p(t B | s B ) p( s A , s B ) ds A ds B
(S6)
  p(t A | s A ) p( s A ) ds A  p(t B | s B ) p( s B ) ds B
(S7)
Similarly to S5, this can be solved analytically to get:
\[
p(t_A, t_B \mid C=2) = \frac{1}{2\pi\sqrt{(\sigma_A^2 + \sigma_p^2)(\sigma_B^2 + \sigma_p^2)}} \exp\!\left[-\frac{1}{2}\left(\frac{(t_A - \mu_p)^2}{\sigma_A^2 + \sigma_p^2} + \frac{(t_B - \mu_p)^2}{\sigma_B^2 + \sigma_p^2}\right)\right]
\tag{S8}
\]
From equation S2, we consider the cues to form a single beat when p(C=1 | tA, tB) > 0.5.
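The inference step can be sketched numerically. In this illustrative Python snippet (our own; the `s2*` arguments denote variances, and all numerical values in the usage example are assumptions), equations S5, S8 and S2 are evaluated directly:

```python
import numpy as np

def likelihood_C1(t_A, t_B, mu_p, s2A, s2B, s2p):
    """Equation S5: p(t_A, t_B | C=1); the s2* arguments are variances."""
    denom = s2A * s2B + s2A * s2p + s2B * s2p
    num = ((t_B - t_A) ** 2 * s2p + (t_A - mu_p) ** 2 * s2B
           + (t_B - mu_p) ** 2 * s2A)
    return np.exp(-0.5 * num / denom) / (2 * np.pi * np.sqrt(denom))

def likelihood_C2(t_A, t_B, mu_p, s2A, s2B, s2p):
    """Equation S8: p(t_A, t_B | C=2)."""
    vA, vB = s2A + s2p, s2B + s2p
    num = (t_A - mu_p) ** 2 / vA + (t_B - mu_p) ** 2 / vB
    return np.exp(-0.5 * num) / (2 * np.pi * np.sqrt(vA * vB))

def p_common(t_A, t_B, mu_p, s2A, s2B, s2p, p_single):
    """Equation S2: posterior probability that the cues form a single beat."""
    l1 = likelihood_C1(t_A, t_B, mu_p, s2A, s2B, s2p)
    l2 = likelihood_C2(t_A, t_B, mu_p, s2A, s2B, s2p)
    return l1 * p_single / (l1 * p_single + l2 * (1 - p_single))
```

For instance, with σA = σB = 17 ms, σp = 100 ms and psingle = 0.5 (assumed values), coincident onsets near the prior yield p(C=1 | tA, tB) well above 0.5, while onsets 150 ms apart push it towards zero.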
Optimal estimate of signal onset
Once the observer has inferred if the signals form a single beat or not, they must then
estimate the onset of the beat or beats, based on the inference made.
The optimal Bayesian estimate is defined generally as:
\[
\hat{s}(m) = \arg\min_{\hat{s}} \int C(\hat{s} - s)\, p(s \mid m)\, ds
\tag{S9}
\]
where C(ŝ − s) is the cost function of the error between the estimate of the signal and the signal itself, m denotes the parameters describing the signal, and p(s | m) is the posterior probability of the signal given m.
Here, we aim to minimise the cost function that is the squared error between the signal and
the estimate. Hence, the estimate of the signals is:
\[
\hat{s}_{j,C=1} = \arg\min_{\hat{s}_j} \int (\hat{s}_j - s_j)^2\, p(s_j \mid t_A, t_B, C=1)\, ds_j, \quad \text{when } C=1 \text{ and where } j = A \text{ or } B.
\tag{S10}
\]
Similarly,
\[
\hat{s}_{j,C=2} = \arg\min_{\hat{s}_j} \int (\hat{s}_j - s_j)^2\, p(s_j \mid t_A, t_B, C=2)\, ds_j, \quad \text{when } C=2.
\tag{S11}
\]
This results in the equivalent of calculating the mean of the posterior and, in this case where
the posterior is Gaussian, is also equivalent to calculating the maximum a-posteriori (MAP)
estimate.
For the condition when C=1, the MAP estimate can be calculated analytically, linearly
weighting the cues according to their variances:
sˆ j ,C 1  sˆ A,C 1  sˆB ,C 1 
t A A2  tb B2   p p2
 A2   B2   p2
j  A or B
(S12)
For independent beats (C=2), the estimates of tA and tB are deemed independent but are
combined with the prior expectation of where the current beat should occur:
sˆ j ,C 2 
t j j 2   p p2
 j 2   p2
j  A or B .
,
(S13)
Hence, our overall estimate of the current beat(s) ŝj can be described by:
\[
\hat{s}_j = \begin{cases} \hat{s}_{j,C=1}, & p(C=1 \mid t_A, t_B) > 0.5 \\ \hat{s}_{j,C=2}, & p(C=1 \mid t_A, t_B) \le 0.5 \end{cases} \qquad j = A \text{ or } B.
\tag{S14}
\]
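Equations S12 to S14 reduce to simple precision-weighted averages. A minimal Python sketch (our own function names; `p_C1` stands for the posterior p(C=1 | tA, tB) from equation S2, computed separately):

```python
def estimate_C1(t_A, t_B, mu_p, s2_A, s2_B, s2_p):
    """Equation S12: fuse both cues and the prior, weighted by reliability."""
    num = t_A / s2_A + t_B / s2_B + mu_p / s2_p
    return num / (1 / s2_A + 1 / s2_B + 1 / s2_p)

def estimate_C2(t_j, mu_p, s2_j, s2_p):
    """Equation S13: combine a single cue with the prior."""
    return (t_j / s2_j + mu_p / s2_p) / (1 / s2_j + 1 / s2_p)

def beat_estimates(t_A, t_B, mu_p, s2_A, s2_B, s2_p, p_C1):
    """Equation S14: select fused or independent estimates via the inference."""
    if p_C1 > 0.5:
        s_hat = estimate_C1(t_A, t_B, mu_p, s2_A, s2_B, s2_p)
        return s_hat, s_hat
    return (estimate_C2(t_A, mu_p, s2_A, s2_p),
            estimate_C2(t_B, mu_p, s2_B, s2_p))
```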
Alternative model definitions
Adapting to the phase offset (CIPA)
In the experiment, we used a fixed phase offset, ϕ, to separate the onset times of the
metronome cues in addition to any deviations created by adding jitter. We therefore
questioned whether participants would adapt to this phase offset and hence recalibrate their
judgement of the level of deviation required between cues before they were considered
separate beats. To test this, we modified the model such that the observer has knowledge of the consistent phase offset between the cues and subsequently ignores it in their inference of whether the cues are deemed a single beat or independent beats. This results in subtracting ϕ from tB in equation S5 to give:

\[
p(t_A, t_B \mid C=1) = \frac{1}{2\pi\sqrt{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_p^2 + \sigma_B^2\sigma_p^2}} \exp\!\left[-\frac{1}{2}\,\frac{((t_B - \phi) - t_A)^2\sigma_p^2 + ((t_B - \phi) - \mu_p)^2\sigma_A^2 + (t_A - \mu_p)^2\sigma_B^2}{\sigma_A^2\sigma_B^2 + \sigma_A^2\sigma_p^2 + \sigma_B^2\sigma_p^2}\right]
\tag{S15}
\]
Similarly, we do the same to equation S8:

\[
p(t_A, t_B \mid C=2) = \frac{1}{2\pi\sqrt{(\sigma_A^2 + \sigma_p^2)(\sigma_B^2 + \sigma_p^2)}} \exp\!\left[-\frac{1}{2}\left(\frac{(t_A - \mu_p)^2}{\sigma_A^2 + \sigma_p^2} + \frac{((t_B - \phi) - \mu_p)^2}{\sigma_B^2 + \sigma_p^2}\right)\right]
\tag{S16}
\]
The causal inference is now based on the deviation of the signals after taking into account
any constant phase offset between the signals. The estimated beat onset calculations (S9-S14)
do not change as they remain based on the actual sensory registrations tA and tB and the prior,
µp.
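A sketch of the CIPA inference step in Python (our own illustration; argument names are assumptions): the only change from the CI model is that the known offset ϕ is subtracted from tB inside the likelihoods (equations S15 and S16), while the onset estimates themselves still use the raw tA and tB:

```python
import numpy as np

def p_common_cipa(t_A, t_B, phi, mu_p, s2A, s2B, s2p, p_single):
    """Equations S15, S16 and S2: the known phase offset phi is removed from
    t_B before the causal inference (but NOT before the onset estimates)."""
    tB_adj = t_B - phi
    denom = s2A * s2B + s2A * s2p + s2B * s2p
    num1 = ((tB_adj - t_A) ** 2 * s2p + (tB_adj - mu_p) ** 2 * s2A
            + (t_A - mu_p) ** 2 * s2B)
    l1 = np.exp(-0.5 * num1 / denom) / (2 * np.pi * np.sqrt(denom))
    vA, vB = s2A + s2p, s2B + s2p
    num2 = (t_A - mu_p) ** 2 / vA + (tB_adj - mu_p) ** 2 / vB
    l2 = np.exp(-0.5 * num2) / (2 * np.pi * np.sqrt(vA * vB))
    return l1 * p_single / (l1 * p_single + l2 * (1 - p_single))
```

With ϕ = 150 ms, two cues exactly 150 ms apart now look coincident to the inference, so p(C=1 | tA, tB) rises above 0.5 where the unadapted CI model would treat them as separate.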
Mandatory Integration (MI) / Mandatory Separation (MS)
We further tested the causal inference model against the simpler models of mandatory
integration (MI), where all cues are integrated into a single beat estimate, regardless of
statistics, and mandatory separation (MS), where all cues are treated as independent beats.
To test these models, we fixed the previously free parameter psingle to zero for MS and one for MI. This forces p(C=1 | tA, tB) (see S2) to zero or one, respectively, such that for MS all estimates are calculated using equation S13 only, and for MI all estimates are calculated using equation S12 only.
Simulation of experimental conditions
It was not possible to analytically calculate the distributions of ŝ due to the non-linearity
created by the model selection process shown in S14 [1]. Instead, simulations of the
experimental conditions we tested were generated based on an observer using each of the
models described above to estimate the signal onsets. Additionally, we added motor noise
and an anticipation effect (a negative motor delay) to calculate the observer’s asynchrony to
the actual beat, which allowed a direct comparison of the simulated results to the empirical
asynchrony distributions.
We generated 2,000 simulated signal pairs (sA and sB) for each of the experimental
conditions. The signals were separated by phase offsets of 0, 50, 100 or 150 ms, with sA
always occurring at 0, and sB at the offset time. The observer estimated the onset times of the
underlying signals sA and sB, as tA and tB, respectively. The uncertainties in the estimates, σA and σB, were defined by the temporal jitter applied (as in the experiment) and the observer's sensory noise:

\[
\sigma_A^2 = \sigma_{\mathrm{sens}}^2 + \sigma_{\mathrm{jittA}}^2, \qquad \sigma_B^2 = \sigma_{\mathrm{sens}}^2 + \sigma_{\mathrm{jittB}}^2
\tag{S17}
\]
We used a fixed sensory registration noise value of σsens = 17 ms, estimated from the data of a previous study in which participants completed a similar task, synchronising movements to multisensory metronomes [2]. The values of temporal jitter were set to match the experimental conditions: {0, 0 ms}, {10, 50 ms} and {50, 10 ms} for σjittA and σjittB, respectively. Hence, there were 12 simulations per participant for each model (four phase offsets and three jitter conditions).
Each simulation step produced an estimate of a single common beat onset (C=1) or two
independent beat onsets (C=2). In the latter scenario, the observer must choose which beat
they will target their movement to, either ŝA or ŝB. From the experimental data, we found that
when the signals were equally reliable (jitter {0,0 ms}), participants did not target A or B
equally (as would be expected statistically), but in fact tended to be biased towards A over B,
or vice versa. We therefore added a free parameter, β, fitted only to the condition with jitter {0, 0 ms} and phase offset 150 ms. β determined the bias towards metronome A and was subsequently used to split the simulation sample, such that a proportion of the 2,000 samples selected ŝA and the remainder ŝB when the signals were deemed independent.
In practice, participants plan their finger tap to coincide with the next beat onset, based on
their knowledge of the current beat. This results in a commonly observed ‘anticipation effect’
of a negative asynchrony between the tap and the beat onset [3]. In the simulation we added motor noise, sampled from a Gaussian distribution with mean zero and variance equal to a participant's estimated motor variance (σM²; see Supplementary Information B), and fitted a negative asynchrony offset (d; free parameter) to the current beat estimate, ŝ. This
simplification to the model did not affect the validity of the resulting model output relative to
the empirical data, as we were interested in the overall distribution of asynchronies. Hence, it
does not matter if we align the finger movement to the current or next beat, as long as the
underlying cue estimates are valid.
Finally, the estimated onset of the beat, ŝ, was carried over to the next simulation step and
used to update the prior, such that µp(m) = ŝ(m-1), where m is the mth simulation step. This
represents the observer updating their ‘timekeeper’ on each finger tap [4,5], to maintain an
expectation of when the next beat will occur. We defined the initial conditions of the prior to
be µp = sA for the portion of the simulation where the observer is set to target sA in the C=2
scenarios, and µp = sB for the remaining portion, which coincides with the observer targeting
sB in the C=2 scenarios.
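The steps above can be drawn together into a single simulated condition. This Python sketch is an illustrative simplification of the procedure (our own function name and argument conventions; times in seconds), not the fitted implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

SIGMA_SENS = 0.017  # fixed sensory registration noise, 17 ms (as in the text)

def simulate_condition(phi, jitt_A, jitt_B, p_single, sigma_p, beta, d,
                       sigma_M, n=2000):
    """Return n simulated asynchronies (seconds) for one condition."""
    s2A = SIGMA_SENS ** 2 + jitt_A ** 2          # equation S17
    s2B = SIGMA_SENS ** 2 + jitt_B ** 2
    s2p = sigma_p ** 2
    asyncs = []
    for target in ("A", "B"):
        # beta splits the sample between targeting A or B in the C=2 case
        m = int(round(n * beta)) if target == "A" else n - int(round(n * beta))
        mu_p = 0.0 if target == "A" else phi     # initial prior per portion
        for _ in range(m):
            t_A = rng.normal(0.0, np.sqrt(s2A))  # s_A occurs at 0
            t_B = rng.normal(phi, np.sqrt(s2B))  # s_B occurs at the offset
            # causal inference (equations S5, S8 and S2), inlined
            denom = s2A * s2B + s2A * s2p + s2B * s2p
            num1 = ((t_B - t_A) ** 2 * s2p + (t_A - mu_p) ** 2 * s2B
                    + (t_B - mu_p) ** 2 * s2A)
            l1 = np.exp(-0.5 * num1 / denom) / (2 * np.pi * np.sqrt(denom))
            vA, vB = s2A + s2p, s2B + s2p
            l2 = (np.exp(-0.5 * ((t_A - mu_p) ** 2 / vA
                                 + (t_B - mu_p) ** 2 / vB))
                  / (2 * np.pi * np.sqrt(vA * vB)))
            p_C1 = l1 * p_single / (l1 * p_single + l2 * (1 - p_single))
            if p_C1 > 0.5:                       # equation S12: fuse both cues
                s_hat = ((t_A / s2A + t_B / s2B + mu_p / s2p)
                         / (1 / s2A + 1 / s2B + 1 / s2p))
            else:                                # equation S13: targeted cue only
                t, s2 = (t_A, s2A) if target == "A" else (t_B, s2B)
                s_hat = (t / s2 + mu_p / s2p) / (1 / s2 + 1 / s2p)
            # motor noise plus the anticipation offset d gives the asynchrony
            asyncs.append(s_hat + rng.normal(0.0, sigma_M) - d)
            mu_p = s_hat                         # prior update: mu_p(m) = s_hat(m-1)
    return np.array(asyncs)
```

With zero phase offset and zero jitter, the mean simulated asynchrony sits near −d, as expected from the anticipation effect.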
Note: a dataset of simulated asynchronies is available, along with the corresponding experimentally measured asynchronies, for each condition. See doi:10.5061/dryad.m5k62.
Parameters
Below we provide a summary of the parameters used in the model and subsequent
simulation.
| parameter | description | type |
|---|---|---|
| C | Denotes the two possible scenarios: C=1, the signals define a single common beat; C=2, the signals define two independent beats. | Model outcome |
| psingle | Prior probability of the signals defining a single common beat (p(C=1)). | Free parameter |
| sA, sB | Underlying beat onset times of signals A and B. | Experimentally defined |
| ŝ | The observer's final estimate of the underlying beat. For C=2, the observer estimates ŝA and ŝB, but targets only ŝA or ŝB to get ŝ. | Model outcome |
| µp | Mean of the prior expectation of the current beat onset time; µp(m) = ŝ(m−1), where m is the mth simulation step. | Model outcome |
| σp | Uncertainty in the prior expectation of the current beat onset time. | Free parameter |
| tA, tB | Observer's sensory registration of signals A and B. | Model outcome |
| σA, σB | Uncertainty in the sensory estimates of signals A and B; made up of σjittA/B and σsens. | Experimentally defined |
| σjittA, σjittB | Standard deviations of the distributions from which the experimentally manipulated temporal jitter is sampled. | Experimentally defined |
| σsens | The observer's sensory noise. | Experimentally defined |
| ϕ | The consistent phase offset between signals, which the observer adapts to in the CIPA model. | Experimentally defined |
| β | The proportion of simulation samples for which the observer chooses ŝA as their estimate of ŝ when the signals are deemed independent (C=2). | Free parameter |
| σM² | Motor variance due to producing the finger tap movement. | Experimentally defined (participant specific) |
| d | Negative delay time; represents the anticipation effect resulting in a negative mean asynchrony in finger tapping tasks. | Free parameter |
Table S1: Summary table of all parameters used in the simulation of an observer using
Bayesian causal inference to synchronise their movements to rhythmic timing cues.
Free parameter values
We fitted three parameters (d, psingle and σp) to each participant's data for each condition. In addition, a fourth parameter, β, was fitted to each participant's experimental data from the condition in which the phase offset was 150 ms and the jitter was {0, 0 ms}. Based on the fitted values, we
generated the distributions of timing errors (after adding motor noise) and thereafter were
able to compare the simulated results to the experimental results. Below we show the mean
fitted values of the CIPA model (which showed the best fit to participants’ data) across
participants.
We checked that each of the free parameters contributed independently to describing the
model by testing for correlations between the values. We found no significant correlation
between the parameters amongst participants and conditions (Table S3), confirming each
parameter contributed to the model fit.
| free parameter | offset (ms) | jitter {0,0 ms} | jitter {10,50 ms} | jitter {50,10 ms} |
|---|---|---|---|---|
| d (ms) | 0 | 40 | 49 | 56 |
| d (ms) | 50 | 55 | 40 | 64 |
| d (ms) | 100 | 59 | 31 | 54 |
| d (ms) | 150 | 35 | 40 | 40 |
| psingle | 0 | 0.43 | 0.59 | 0.53 |
| psingle | 50 | 0.46 | 0.51 | 0.59 |
| psingle | 100 | 0.31 | 0.37 | 0.25 |
| psingle | 150 | 0.11 | 0.18 | 0.20 |
| σp (ms) | 0 | 188 | 233 | 85 |
| σp (ms) | 50 | 232 | 217 | 101 |
| σp (ms) | 100 | 147 | 138 | 123 |
| σp (ms) | 150 | 109 | 153 | 68 |
| β | 150 | 0.60 | - | - |

Table S2. Free parameter values (means across participants, fitted to the CIPA model). β was fitted only in the condition with jitter {0, 0 ms} and offset 150 ms.
| free parameter | d | psingle | σp |
|---|---|---|---|
| psingle | .15 | | |
| σp | -.04 | .14 | |
| β* | .18 | .56 | -.26 |

Table S3: Correlation coefficients between free parameters (from the CIPA model), collapsed across participants and conditions. None of the correlations was significant (all p > .05). *Correlations with the bias to metronome A (β) were measured only in the condition offset = 150 ms, jitter = {0, 0 ms}.
Supplementary Information B: Procedure for estimating motor variance
The causal inference model we have developed is primarily concerned with the processes of
sensory estimation. However, the production of synchronised finger taps involves additional
stages of processing that each introduce variability, in terms of 'timekeeper' and motor noise
[4,5]. We conducted measurements on each participant in order to estimate the contributions
from these additional sources of variability. To this end, participants completed a
synchronisation-continuation tapping task at the beginning and end of the experiment.
Participants tapped in synchrony to a metronome (isochronous, period 500 ms) presented for
five beats and then continued to tap their finger at the same tempo for another 60 seconds
without the metronome. To analyse these data, the inter-response intervals (IRIs) between
each tap onset were calculated. Autocovariance of the tapping responses at lag 0 and lag 1
were measured along the sequence using a sliding window technique (window size = 40
intervals, sliding in steps of 5 intervals). Based on the Wing-Kristofferson model [6], we thereby estimated the motor variance (σM²) and the timekeeper variance (σT²):

\[
\gamma_I(0) = \sigma_T^2 + 2\sigma_M^2
\tag{S18}
\]
\[
\gamma_I(1) = -\sigma_M^2
\tag{S19}
\]

where γI(k) is the lag-k autocovariance of the IRIs. We calculated the median of the timekeeper and motor variances within each window, and then took the mean of these values across repetitions of these trials for each participant. Given that timekeeper and
motor variance are independent of variability associated with sensory processing of the
stimuli, we simplify our treatment by combining these two (statistically independent) terms
into a single term that we refer to as the motor variance.
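As an illustration, the estimators in equations S18 and S19 can be applied to an IRI series as follows (Python sketch, our own; for simplicity it uses the full series rather than the sliding windows described above):

```python
import numpy as np

def wk_variances(iri):
    """Estimate timekeeper and motor variance from a series of
    inter-response intervals via equations S18 and S19."""
    x = np.asarray(iri, dtype=float)
    x = x - x.mean()
    n = len(x)
    g0 = np.dot(x, x) / n            # lag-0 autocovariance of the IRIs
    g1 = np.dot(x[:-1], x[1:]) / n   # lag-1 autocovariance
    var_M = -g1                      # S19: gamma_I(1) = -sigma_M^2
    var_T = g0 - 2.0 * var_M         # S18: gamma_I(0) = sigma_T^2 + 2 sigma_M^2
    return var_T, var_M
```

Synthesising IRIs from the Wing-Kristofferson model (each interval is a timekeeper interval plus the difference of consecutive motor delays) recovers the generating variances to within sampling error.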
References
1. Körding, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B. & Shams, L. 2007 Causal inference in multisensory perception. PLoS ONE 2, e943. (doi:10.1371/journal.pone.0000943)
2. Elliott, M. T., Wing, A. M. & Welchman, A. E. 2010 Multisensory cues improve sensorimotor synchronisation. Eur. J. Neurosci. 31, 1828–1835. (doi:10.1111/j.1460-9568.2010.07205.x)
3. Aschersleben, G. & Prinz, W. 1995 Synchronizing actions with events: the role of
sensory information. Percept. Psychophys. 57, 305–317.
4. Vorberg, D. & Wing, A. M. 1996 Modeling variability and dependence in timing. In
Handbook of perception and action, pp. 181–262. London: Academic Press.
5. Vorberg, D. & Schulze, H. H. 2002 Linear Phase-Correction in Synchronization:
Predictions, Parameter Estimation, and Simulations. J. Math. Psychol. 46, 56–87.
6. Wing, A. M. & Kristofferson, A. B. 1973 Response delays and the timing of discrete
motor responses. Percept. Psychophys. 14, 5–12.