Quantifying the uncertainty of activation periods
in fMRI data via changepoint analysis
Christopher F. H. Nam
Department of Statistics, University of Warwick
Joint work with John A. D. Aston and Adam M. Johansen
CRiSM Changepoint Workshop,
Tuesday 27th March 2012
A Biological Application
fMRI data from an anxiety-induction experiment
Figure: Design of the anxiety-induction experiment (image from Lindquist et al. [2007]).
- 215 images/time points
- 24 subjects
Figure: fMRI data (image from Lindquist et al. [2007]).
A Biological Application
Brain regions of interest
Figure: (a) RMPFC time series (Cluster 6 data); (b) VC time series (Cluster 20 data), each plotted against time.
A Biological Application
Questions of interest
Given the exact design of the experiment:
- When might activations have occurred?
- How many activations may have occurred?
- How long are these activation periods?
Proposal
Aim
Propose a methodology to fully capture the uncertainty of changepoint characteristics given a sequence of data $y = y_{1:n} = (y_1, \ldots, y_n)$, in particular:
- the Changepoint Probability (CPP) of a change to a regime occurring at time $t$;
- the probability of $m$ changepoints occurring in the data.
We focus on these quantities, although others are easily accessible.
Hidden Markov Models
- Assume $y$ can be generated by a finite number of states.
- Use Hidden Markov Models (HMMs) to model the time series $y$, with $\{X_t\}$ as the underlying Markov chain on a finite state space $\Omega_X$.

General finite-state Hidden Markov Model
Emission: $y_t \mid y_{1:t-1}, x_{1:t}, \theta \sim f(y_t \mid x_{t-r:t}, y_{1:t-1}, \theta)$
Transition: $p(x_t \mid y_{1:t-1}, x_{-r+1:t-1}, \theta) = p(x_t \mid x_{t-1}, \theta)$

Changepoints are defined within this HMM framework.
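As an illustration of this structure (not part of the original slides), a minimal sketch of the simplest case, $r = 0$ with two states and Gaussian emissions; all numerical values are assumed for the example:

```python
import numpy as np

# Assumed illustrative values: 2 states, Gaussian emissions, r = 0.
P = np.array([[0.95, 0.05],      # transition matrix for {X_t}
              [0.10, 0.90]])
mu = np.array([0.0, 2.0])        # state-dependent emission means
sigma = 1.0                      # common emission standard deviation

def simulate_hmm(n, P, mu, sigma, rng):
    """Simulate (x_1:n, y_1:n) from a finite-state Gaussian HMM."""
    x = np.zeros(n, dtype=int)
    y = np.zeros(n)
    x[0] = rng.integers(len(mu))                   # uniform initial state
    y[0] = rng.normal(mu[x[0]], sigma)
    for t in range(1, n):
        x[t] = rng.choice(len(mu), p=P[x[t - 1]])  # transition p(x_t | x_{t-1})
        y[t] = rng.normal(mu[x[t]], sigma)         # emission f(y_t | x_t)
    return x, y

x, y = simulate_hmm(215, P, mu, sigma, np.random.default_rng(0))
```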
Changepoints
Changepoint Definition
A changepoint to a regime is said to have occurred at time $t$ when the new state persists for at least $k$ time periods in $\{X_t\}$:
$$x_{t-1} \neq x_t = x_{t+1} = \ldots = x_{t+j}, \qquad j \geq k - 1.$$
Define $\tau^{(k)}$ and $M^{(k)}$ to be the times and number of such changepoints.
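A small sketch of what this definition means operationally, assuming the state sequence were known (in practice it is latent and must be inferred):

```python
import numpy as np

def changepoints(x, k):
    """Times t (1-indexed) with x_{t-1} != x_t = ... = x_{t+j}, j >= k - 1."""
    x = np.asarray(x)
    taus = []
    for t in range(1, len(x)):             # candidate changepoint at 0-indexed t
        if x[t] == x[t - 1]:
            continue
        run = 1
        while t + run < len(x) and x[t + run] == x[t]:
            run += 1
        if run >= k:                        # the new state persists >= k periods
            taus.append(t + 1)              # report 1-indexed time
    return taus

x = [0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tau_k = changepoints(x, k=5)                # [3, 10]: only runs of length >= 5 count
M_k = len(tau_k)                            # number of changepoints, here 2
```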
Model Parameters
$\theta$ denotes the model parameter vector, which needs to be estimated. Its components vary depending on the particular type of HMM used:
- $P$, the $|\Omega_X| \times |\Omega_X|$ transition probability matrix for $\{X_t\}$;
- parameters of the emission distribution, which can depend on the underlying state: means $\mu_{X_t}$, variances $\sigma^2_{X_t}$, Poisson intensity rates $\lambda_{X_t}$, ...
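For concreteness, a hypothetical container for $\theta$ in the simplest 2-state Gaussian case (an illustrative structure, not the authors' code):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HMMParams:
    """Hypothetical container for theta in a 2-state Gaussian HMM."""
    P: np.ndarray      # |Omega_X| x |Omega_X| transition matrix for {X_t}
    mu: np.ndarray     # state-dependent emission means (mu_1, mu_2)
    sigma2: float      # emission variance (could also be made state-dependent)

theta = HMMParams(P=np.array([[0.95, 0.05], [0.10, 0.90]]),
                  mu=np.array([0.0, 2.0]),
                  sigma2=1.0)
```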
Proposed Methodology
Suppose we are interested in the changepoint probability $P(\tau^{(k)} = t \mid y)$:
$$P(\tau^{(k)} = t \mid y) = \int_\theta P(\tau^{(k)} = t, \theta \mid y)\, d\theta = \int_\theta P(\tau^{(k)} = t \mid \theta, y)\, p(\theta \mid y)\, d\theta$$
$$\approx \widehat{P}_N(\tau^{(k)} = t \mid y) = \sum_{i=1}^{N} W^i\, P(\tau^{(k)} = t \mid \theta^i, y)$$
by standard Monte Carlo results, where $\{W^i, \theta^i\}_{i=1}^N$ approximates $p(\theta \mid y)$.
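A minimal sketch of this weighted average; `cpp_given_theta` is a placeholder for the exact per-particle computation described later:

```python
import numpy as np

def cpp_marginal(weights, thetas, cpp_given_theta):
    """Weighted average of exact per-particle changepoint distributions.

    weights         : (N,) normalised SMC weights W^i
    thetas          : length-N sequence of parameter particles theta^i
    cpp_given_theta : callable theta -> (n,) vector of P(tau^(k) = t | theta, y)
    """
    per_particle = np.stack([cpp_given_theta(th) for th in thetas])  # (N, n)
    return weights @ per_particle              # (n,) estimate of P(tau^(k) = t | y)
```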
Proposed Methodology
1. Approximate $p(\theta \mid y)$ via SMC samplers to obtain a normalised weighted cloud of particles $\{W^i, \theta^i\}_{i=1}^N$.
2. For $i = 1, \ldots, N$, compute the associated exact changepoint distribution conditioned on $\theta^i$, $P(\tau^{(k)} = t \mid \theta^i, y)$, via FMCI in a HMM context.
3. Take the weighted average of these exact changepoint distributions to obtain the changepoint distributions in light of parameter uncertainty, $P(\tau^{(k)} = t \mid y)$.

This combines recent work of Del Moral et al. [2006] and Aston et al. [2009]; specific details can be found in Nam et al. [in press].
Approximating p(θ|y), the model parameter posterior
Sequential Monte Carlo (SMC) Samplers
$$\pi_b(\theta) \propto p(\theta)\, l(y \mid \theta)^{\gamma_b}$$
where
- $p(\theta)$ is the prior of the model parameters;
- $l(y \mid \theta)$ is the likelihood;
- $0 = \gamma_1 \leq \gamma_2 \leq \ldots \leq \gamma_B = 1$ is a tempering schedule.
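A one-line sketch of the tempered target on the log scale; `log_prior` and `log_lik` are placeholders for the model-specific densities:

```python
def log_tempered_target(theta, gamma_b, log_prior, log_lik):
    """log pi_b(theta), up to an additive constant:
    log p(theta) + gamma_b * log l(y | theta)."""
    return log_prior(theta) + gamma_b * log_lik(theta)
```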
Exact distributions conditional on model parameters, P(τ^(k) = t | θ^i, y)
Exact computation of $P(\tau^{(k)} = t \mid \theta^i, y)$:
- Let $\tau_u^{(k)}$ denote the $u$th changepoint under our definition.
- $P(\tau^{(k)} = t \mid \theta^i, y) = \sum_{u=1,2,\ldots} P(\tau_u^{(k)} = t \mid \theta^i, y)$.
- $P(\tau_u^{(k)} = t \mid \theta^i, y) \equiv P(W(k, u) = t + k - 1 \mid \theta^i, y)$: re-express as a waiting time for runs.
- The waiting time distribution for runs can be computed exactly via Finite Markov Chain Imbedding (FMCI).
Finite Markov Chain Imbedding (FMCI)
Waiting Time Distributions via FMCI
- $P(W(k, u) \leq t + k - 1 \mid \theta^i) = P(Z^{(u)}_{t+k-1} \in A \mid \theta^i)$, where $A$ denotes the set of absorption states in $\Omega_Z$.
- Computed by standard Markov chain results.
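An illustrative FMCI computation, not the authors' exact construction: the waiting time of the first run of $k$ consecutive visits to state 1 in a two-state Markov chain, using an imbedded chain $Z_t$ that tracks the current run length of state 1 (run length $k$ is absorbing):

```python
import numpy as np

def waiting_time_cdf_first_run(P, init, k, n):
    """P(W(k, 1) <= t), t = 1..n: first time state 1 has been occupied for k
    consecutive periods in a 2-state Markov chain with transition matrix P
    and initial distribution init.

    Imbedded chain Z_t = current run length of state 1 (0, 1, ..., k);
    run length k forms the absorbing set A."""
    M = np.zeros((k + 1, k + 1))
    M[0, 0] = P[0, 0]              # stay in state 0: run length stays 0
    M[0, 1] = P[0, 1]              # enter state 1: run length becomes 1
    for j in range(1, k):
        M[j, 0] = P[1, 0]          # fall back to state 0: run resets
        M[j, j + 1] = P[1, 1]      # remain in state 1: run grows by one
    M[k, k] = 1.0                  # absorbed once a run of length k has occurred

    z = np.zeros(k + 1)
    z[0], z[1] = init[0], init[1]  # run length at time 1 is 0 or 1
    cdf = np.zeros(n)
    for t in range(n):
        if t > 0:
            z = z @ M              # propagate the imbedded chain one step
        cdf[t] = z[k]              # P(Z_{t+1} in A) = P(W(k, 1) <= t + 1)
    return cdf

P = np.array([[0.95, 0.05], [0.10, 0.90]])    # assumed illustrative chain
cdf = waiting_time_cdf_first_run(P, init=[0.5, 0.5], k=5, n=215)
pmf = np.diff(np.concatenate(([0.0], cdf)))   # P(W(k, 1) = t)
```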
FMCI in a HMM context
- Inference on the underlying state sequence.
- Compute posterior transition probabilities $P(X_t \mid X_{t-1}, y)$ to consider all possible state sequences.
- This gives a sequence of time-dependent posterior transition probability matrices $\{\tilde{P}_1, \tilde{P}_2, \ldots, \tilde{P}_n\}$.
- These in turn yield a sequence of time-dependent posterior transition probability matrices $\{\tilde{M}_1, \tilde{M}_2, \ldots, \tilde{M}_n\}$ for the auxiliary Markov chains $\{Z_t\}$.
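A sketch of how the time-dependent posterior transition matrices $\tilde{P}_t$ could be computed by a backward recursion in the simplest $r = 0$ Gaussian case (the HMS-AR models used later would need the corresponding conditional likelihoods):

```python
import numpy as np

def posterior_transition_matrices(y, P, mu, sigma):
    """Ptilde[t, i, j] = P(X_t = j | X_{t-1} = i, y_1:n) for a Gaussian HMM
    (meaningful for t >= 1), via a backward recursion with rescaling."""
    y = np.asarray(y, dtype=float)
    n, S = len(y), len(mu)
    # f(y_t | x_t = j) for each t and j
    emis = np.exp(-0.5 * ((y[:, None] - mu[None, :]) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    beta = np.ones((n, S))                   # beta_t(j) ~ p(y_{t+1:n} | x_t = j), up to scale
    for t in range(n - 2, -1, -1):
        beta[t] = P @ (emis[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()             # rescale; cancels in the row normalisation below

    Ptilde = np.zeros((n, S, S))
    for t in range(1, n):
        unnorm = P * (emis[t] * beta[t])[None, :]   # row i: P_ij f(y_t | j) beta_t(j)
        Ptilde[t] = unnorm / unnorm.sum(axis=1, keepdims=True)
    return Ptilde

# Illustrative use with the assumed 2-state Gaussian values from before
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(2.0, 1.0, 115)])
Ptilde = posterior_transition_matrices(y, P=np.array([[0.95, 0.05], [0.10, 0.90]]),
                                       mu=np.array([0.0, 2.0]), sigma=1.0)
```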
Exact Distributions for Changepoint Characteristics
Probability of the $u$th changepoint at a specific time:
$$P(\tau_u^{(k)} = t \mid \theta^i, y) = P(W(k, u) = t + k - 1 \mid \theta^i, y) = P(W(k, u) \leq t + k - 1 \mid \theta^i, y) - P(W(k, u) \leq t + k - 2 \mid \theta^i, y)$$
Distribution of the number of changepoints:
$$P(M^{(k)} = u \mid \theta^i, y) = P(W(k, u) \leq n \mid \theta^i, y) - P(W(k, u + 1) \leq n \mid \theta^i, y)$$
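These identities translate directly into differencing of the waiting-time CDFs; a small sketch (assuming each CDF is available as a length-$n$ vector evaluated at $t = 1, \ldots, n$):

```python
import numpy as np

def tau_u_pmf(W_cdf_u, k):
    """P(tau_u^(k) = t) from the CDF of W(k, u) evaluated at 1..n:
    difference the CDF and shift by k - 1."""
    pmf_W = np.diff(np.concatenate(([0.0], W_cdf_u)))   # P(W(k, u) = s), s = 1..n
    return pmf_W[k - 1:]                                # entry t-1 holds P(tau_u = t)

def num_changepoints_pmf(W_cdfs, n):
    """P(M^(k) = u) = P(W(k, u) <= n) - P(W(k, u+1) <= n); W_cdfs[u-1] is the
    CDF of W(k, u). Appending 0 assumes at most len(W_cdfs) changepoints."""
    at_n = np.array([1.0] + [cdf[n - 1] for cdf in W_cdfs] + [0.0])
    return -np.diff(at_n)                               # P(M = 0), P(M = 1), ...
```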
fMRI application
fMRI data
Figure: (c) RMPFC time series (Cluster 6 data); (d) VC time series (Cluster 20 data), each plotted against time.
fMRI application
Preliminary I
- An AR error process leads to a Hamilton's Markov switching AR (HMS-AR) model.
- Detrending is performed within the model.

HMS-AR(r) with detrending:
$$y_t - \mu_{x_t} - m_t'\beta = a_t, \qquad a_t = \phi_1 a_{t-1} + \ldots + \phi_r a_{t-r} + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2)$$

- The detrending parameters are within the model and hence estimated within the SMC samplers.
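A simulation sketch of this model under assumed parameter values (the detrending basis $m_t$ here is a cubic polynomial purely for illustration):

```python
import numpy as np

def simulate_hmsar(n, P, mu, phi, sigma, beta, m, rng):
    """Simulate y_1:n from an HMS-AR(r) with detrending:
    y_t = mu_{x_t} + m_t' beta + a_t,  a_t = phi_1 a_{t-1} + ... + phi_r a_{t-r} + eps_t,
    eps_t ~ N(0, sigma^2).  m is the (n, d) matrix of detrending basis vectors m_t."""
    r = len(phi)
    x = np.zeros(n, dtype=int)
    a = np.zeros(n)
    x[0] = rng.integers(len(mu))
    a[0] = rng.normal(0.0, sigma)
    for t in range(1, n):
        x[t] = rng.choice(len(mu), p=P[x[t - 1]])
        ar_part = sum(phi[j] * a[t - 1 - j] for j in range(min(r, t)))
        a[t] = ar_part + rng.normal(0.0, sigma)
    return x, mu[x] + m @ beta + a

rng = np.random.default_rng(1)
n = 215
m = np.vander(np.linspace(0, 1, n), 4, increasing=True)   # cubic polynomial trend basis
x, y = simulate_hmsar(n, P=np.array([[0.95, 0.05], [0.10, 0.90]]),
                      mu=np.array([0.0, 2.0]), phi=[0.5], sigma=1.0,
                      beta=np.array([0.0, 1.0, -1.0, 0.5]), m=m, rng=rng)
```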
fMRI application
Preliminary II
- Detrending types: no detrending, polynomial of order 3, Discrete Cosine Basis (DCB).
- HMS-AR of order 0 and 1.
- $\Omega_X$ = {"resting", "active"}.
- 5 consecutive active states required for an activated region ($k = 5$).
- SMC samplers: $N = 500$ particles, $B = 100$ iterations.
Cluster 6, CPP, HMS-AR(0)
Figure: Changepoint probabilities (CPP) into and out of the activation regime for Cluster 6 under HMS-AR(0). Panels: (e) RMPFC time series; (f) AR(0), no detrending; (g) AR(0), polynomial detrending; (h) AR(0), DCB detrending. Each CPP panel plots P(into) and P(out of) against time.
Cluster 6, CPP, HMS-AR(1)
Figure: Changepoint probabilities (CPP) into and out of the activation regime for Cluster 6 under HMS-AR(1). Panels: (i) RMPFC time series; (j) AR(1), no detrending; (k) AR(1), polynomial detrending; (l) AR(1), DCB detrending.
Cluster 6, Distribution of the Number of Activation Regimes
Figure: Distribution of the number of activation regimes (0-9) for Cluster 6 under each detrending type (none, polynomial, DCB). Panels: (m) AR(0); (n) AR(1).
Cluster 20, CPP, HMS-AR(0)
Figure: Changepoint probabilities (CPP) into and out of the activation regime for Cluster 20 under HMS-AR(0). Panels: (o) VC time series; (p) AR(0), no detrending; (q) AR(0), polynomial detrending; (r) AR(0), DCB detrending.
Cluster 20, CPP, HMS-AR(1)
Figure: Changepoint probabilities (CPP) into and out of the activation regime for Cluster 20 under HMS-AR(1). Panels: (s) VC time series; (t) AR(1), no detrending; (u) AR(1), polynomial detrending; (v) AR(1), DCB detrending.
Cluster 20, Distribution of the Number of Activation Regimes
Figure: Distribution of the number of activation regimes (0-5) for Cluster 20 under each detrending type (none, polynomial, DCB). Panels: (w) AR(0); (x) AR(1).
Conclusions
- Proposed a methodology quantifying the uncertainty of changepoint characteristics in light of model parameter uncertainty.
- Combines recent work on exact changepoint distributions via FMCI in HMMs.
- Accounts for parameter uncertainty via SMC samplers.
- Application to fMRI data: effects of error process assumptions and types of detrending.
References
John A. D. Aston, J. Y. Peng, and Donald E. K. Martin. Implied distributions in multiple change point problems. CRiSM Research Report 08-26, University of Warwick, 2009.
Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society Series B, 68(3):411-436, 2006.
Martin A. Lindquist, Christian Waugh, and Tor D. Wager. Modeling state-related fMRI activity using change-point theory. NeuroImage, 35(3):1125-1141, 2007. doi: 10.1016/j.neuroimage.2007.01.004.
C. F. H. Nam, J. A. D. Aston, and A. M. Johansen. Quantifying the uncertainty in change points. Journal of Time Series Analysis, in press.
Additional Information
Implementation of SMC samplers I
- Consider the model parameters for a 2-state Hamilton's MS-AR(r) model: $\theta = (P, \mu_1, \mu_2, \sigma^2, \phi_1, \ldots, \phi_r)$.
- Simple linear tempering schedule: $\gamma_b = \frac{b-1}{B-1}$, $b = 1, \ldots, B$.
- Reparameterisation of variances to precisions, $\lambda = 1/\sigma^2$, and of AR parameters to Partial Autocorrelation Coefficients (PACs).
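A sketch of the schedule and the two reparameterisations; the AR-to-PAC map below uses the standard step-down (inverse Levinson-Durbin) recursion, which may differ in detail from the authors' implementation:

```python
import numpy as np

def linear_schedule(B):
    """gamma_b = (b - 1) / (B - 1), b = 1, ..., B."""
    return np.arange(B) / (B - 1)

def ar_to_pac(phi):
    """Map AR coefficients (phi_1, ..., phi_r) to partial autocorrelation
    coefficients using the step-down (inverse Levinson-Durbin) recursion."""
    phi = list(phi)
    pac = []
    for p in range(len(phi), 0, -1):
        kappa = phi[p - 1]                 # p-th PAC
        pac.insert(0, kappa)
        if p > 1:
            denom = 1.0 - kappa ** 2
            phi = [(phi[j] + kappa * phi[p - 2 - j]) / denom for j in range(p - 1)]
    return np.array(pac)

gammas = linear_schedule(B=100)            # 0 = gamma_1 <= ... <= gamma_B = 1
lam = 1.0 / 0.8 ** 2                       # precision lambda = 1 / sigma^2
pac = ar_to_pac([0.5])                     # AR(1): the PAC equals phi_1
```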
Implementation of SMC samplers II
- Initialisation: assume independence between the components of $\theta$, and sample from Bayesian priors.
- Mutation: Random Walk Metropolis-Hastings (RWMH).
- Selection: resample if $\mathrm{ESS} = \{\sum_{i=1}^N (W_b^{(i)})^2\}^{-1} < N/2$; re-weight resampled particles to $1/N$.
- Intermediary output: the weighted cloud of $N$ particles $\{W_b^{(i)}, \theta_b^{(i)}\}_{i=1}^N$ approximates the distribution $\pi_b$.
- Approximation of the posterior: $p(\theta \mid y) \approx \{W_B^{(i)}, \theta_B^{(i)}\}_{i=1}^N = \{W^i, \theta^i\}_{i=1}^N$.
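A compact sketch of such an SMC sampler under simplifying assumptions: $\theta$ treated as an unconstrained real vector, a single RWMH sweep per temperature, and a fixed proposal scale:

```python
import numpy as np

def smc_sampler(log_prior, log_lik, sample_prior, N=500, B=100, rw_scale=0.1, rng=None):
    """Sketch of an SMC sampler targeting p(theta | y) through
    pi_b(theta) propto p(theta) l(y | theta)^gamma_b.

    log_prior, log_lik : callables theta -> float
    sample_prior       : callable rng -> theta (1-D array)
    Returns normalised weights {W^i} and particles {theta^i}."""
    rng = rng or np.random.default_rng()
    gammas = np.arange(B) / (B - 1)

    thetas = np.array([sample_prior(rng) for _ in range(N)])  # initialise from the prior (gamma_1 = 0)
    logW = np.full(N, -np.log(N))
    loglik = np.array([log_lik(th) for th in thetas])

    for b in range(1, B):
        # Re-weight: incremental weight l(y | theta)^(gamma_b - gamma_{b-1})
        logW += (gammas[b] - gammas[b - 1]) * loglik
        logW -= np.logaddexp.reduce(logW)                     # normalise

        # Selection: resample when ESS < N / 2, then reset weights to 1/N
        W = np.exp(logW)
        if 1.0 / np.sum(W ** 2) < N / 2:
            idx = rng.choice(N, size=N, p=W)
            thetas, loglik = thetas[idx], loglik[idx]
            logW = np.full(N, -np.log(N))

        # Mutation: one random-walk Metropolis-Hastings step targeting pi_b
        for i in range(N):
            prop = thetas[i] + rw_scale * rng.standard_normal(thetas[i].shape)
            prop_loglik = log_lik(prop)
            log_accept = (log_prior(prop) + gammas[b] * prop_loglik
                          - log_prior(thetas[i]) - gammas[b] * loglik[i])
            if np.log(rng.uniform()) < log_accept:
                thetas[i], loglik[i] = prop, prop_loglik

    return np.exp(logW), thetas
```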
Waiting Time Distributions for Runs in a HMM context
For $t = 1, \ldots, n$:
$$\Psi_t = \Psi_{t-1} \tilde{M}_t$$
$$\psi_t^{(u)} \leftarrow \psi_t^{(u)} + \psi_{t-1}^{(u-1)} (\tilde{M}_t - I)\,\Upsilon, \qquad u = 2, \ldots, \lfloor n/k \rfloor$$
$$P(W(k, u) \leq t \mid \theta^i, y) = P(Z_t^{(u)} \in A) = \psi_t^{(u)}\, U(A)$$
where:
- $\psi_t^{(u)}$ is a $1 \times |\Omega_Z|$ vector storing the probabilities of the $u$th chain being in the corresponding state;
- $\Psi_t$ is a $\lfloor n/k \rfloor \times |\Omega_Z|$ matrix with $(\psi_t^{(1)}, \psi_t^{(2)}, \ldots)$ as row vectors;
- $I$ is the $|\Omega_Z| \times |\Omega_Z|$ identity matrix;
- $\Upsilon$ is a $|\Omega_Z| \times |\Omega_Z|$ matrix connecting each absorption state to the corresponding continuation state;
- $U(A)$ is a $|\Omega_Z| \times 1$ vector with 1s in the locations of the absorption states and 0s elsewhere.
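A sketch of this recursion, taking the model-specific ingredients (the imbedded-chain transition matrices $\tilde{M}_t$, the matrix $\Upsilon$ and the absorbing set $A$) as given inputs:

```python
import numpy as np

def waiting_time_cdfs(M_list, psi1, Upsilon, absorbing, n_chains):
    """Run the recursion above for u = 1, ..., n_chains.

    M_list    : list of |Omega_Z| x |Omega_Z| matrices Mtilde_1, ..., Mtilde_n
    psi1      : length-|Omega_Z| initial distribution of the first auxiliary chain
    Upsilon   : |Omega_Z| x |Omega_Z| matrix mapping absorbing states to the
                corresponding continuation states of the next chain
    absorbing : indices of the absorbing set A
    Returns cdfs[u - 1, t - 1] = P(W(k, u) <= t)."""
    S = len(psi1)
    I = np.eye(S)
    U = np.zeros(S)
    U[absorbing] = 1.0                       # indicator vector U(A)

    Psi = np.zeros((n_chains, S))            # rows psi_t^(1), psi_t^(2), ...
    Psi[0] = psi1
    cdfs = np.zeros((n_chains, len(M_list)))
    Psi_prev = Psi.copy()
    for t, M in enumerate(M_list):
        if t > 0:
            Psi = Psi_prev @ M                                    # Psi_t = Psi_{t-1} Mtilde_t
            for u in range(1, n_chains):                          # u = 2, 3, ... in the slide's notation
                Psi[u] += Psi_prev[u - 1] @ ((M - I) @ Upsilon)   # feed newly absorbed mass forward
        cdfs[:, t] = Psi @ U                                      # P(Z_t^(u) in A) = psi_t^(u) U(A)
        Psi_prev = Psi.copy()
    return cdfs
```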