The General Linear Model and Statistical Parametric Mapping

Sensor & Source Space Statistics
Rik Henson
(MRC CBU, Cambridge)
With thanks to Jason Taylor, Vladimir Litvak, Guillaume Flandin,
James Kilner & Karl Friston
Overview
A mass-univariate statistical approach to localising
effects in space/time/frequency (using replications
across trials/subjects)…
Overview
• Sensor Space:
1. Random Field Theory (RFT)
2. 2D Time-Freq (within-subject)
3. 3D Scalp-Time (within-subject)
4. 3D Scalp-Time (between-subjects)
• Source Space:
1. 3D contrast images
2. SPM vs SnPM vs PPM (vs FDR)
3. Other issues & Future directions
4. Multivariate
1. Random Field Theory (RFT)
RFT is a method for correcting for multiple statistical
comparisons with N-dimensional spaces (for parametric
statistics, eg Z-, T-, F- statistics)…
1. When is there an effect in time, eg GFP (1D)?
2. Where is there an effect in time-frequency space (2D)?
3. Where is there an effect in time-sensor space (3D)?
4. Where is there an effect in time-source space (4D)?
Worsley et al (1996). Human Brain Mapping, 4:58-73
2. Single-subject Example
• “Multimodal” Dataset in SPM8
manual (and website)
• Single subject:
128 EEG
275 MEG
3T fMRI (with nulls)
1mm3 sMRI
• Two sessions
• ~160 face trials and ~160
scrambled trials per session
• (N=12 subjects soon, as in
Henson et al, 2009 a, b, c)
Chapter 33, SPM8 Manual
2. Where is an effect in time-frequency (2D)?
• Single MEG channel
• Mean over trials of Morlet Wavelet projection (i.e., induced + evoked)
• Write as t x f x 1 image per trial
• SPM, correct on extent / height
[Figure panels: Faces; Scrambled; Faces > Scrambled]
Kilner et al (2005) Neurosci. Letters
Chapter 33, SPM8 Manual
3. Where is an effect in scalp-time space (3D)?
• 2D sensor positions specified or
projected from 3D digitised positions
• Each sample projected to a 32x32 grid
using linear interpolation
• Samples tiled to create a 3D volume
Chapter 33, SPM8 Manual
[Figure: 3D volume with axes x, y and t]
• F-test of means of ~150 EEG trials of
each type (since polarity not of interest)
• (Note that clusters depend on reference)
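The gridding and tiling steps above can be sketched in NumPy. This is only an illustration under stated assumptions, not SPM's implementation: it swaps in inverse-distance weighting for the linear interpolation named on the slide (to stay NumPy-only), and the function name `scalp_time_volume`, the unit-square sensor coordinates and the random data are all hypothetical.

```python
import numpy as np

def scalp_time_volume(pos2d, data, grid=32, power=2.0):
    """Interpolate sensor data onto a grid x grid image per sample and tile over time.

    pos2d : (n_sensors, 2) 2D sensor positions, assumed scaled to [0, 1]
    data  : (n_sensors, n_times) sensor time courses
    Returns an array of shape (grid, grid, n_times).
    """
    axis = np.linspace(0.0, 1.0, grid)
    gx, gy = np.meshgrid(axis, axis, indexing="ij")
    pixels = np.column_stack([gx.ravel(), gy.ravel()])           # (grid*grid, 2)
    # Distance from every pixel to every sensor
    dist = np.linalg.norm(pixels[:, None, :] - pos2d[None, :, :], axis=2)
    # Inverse-distance weights (stand-in for linear interpolation), rows sum to 1
    weights = 1.0 / np.maximum(dist, 1e-9) ** power
    weights /= weights.sum(axis=1, keepdims=True)
    # One 2D image per time sample, stacked along the third axis
    return (weights @ data).reshape(grid, grid, data.shape[1])

rng = np.random.default_rng(0)
pos2d = rng.uniform(0.0, 1.0, size=(128, 2))    # hypothetical 128-channel layout
data = rng.standard_normal((128, 50))           # hypothetical 50 time samples
volume = scalp_time_volume(pos2d, data)
```

Each pixel is a convex combination of sensor values, so the volume never exceeds the range of the data at any sample.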
3. Where is an effect in scalp-time space (3D)?
More sophisticated 1st-level design matrices, e.g., to remove trial-by-trial confounds within each subject, and create mean adjusted ERP for 2nd-level analysis across subjects
[Figure: within-subject (1st-level) design matrix over each trial, with regressors for each trial-type (6) and confounds (4); across-subjects (2nd-level) design matrix]
beta_00* images reflect mean (adjusted) 3D scalp-time volume for each condition
Henson et al (2008) Neuroimage
4. Where is an effect in scalp-time space (3D)?
Mean ERP/ERF images can also be tested between-subjects. Note however for
MEG, some alignment of sensors may be necessary (e.g, SSS, Taulu et al, 2005)
Without transformation to Device Space
Stats over 18 subjects on RMS of 102 planar gradiometers
With transformation to Device Space
Taylor & Henson (2008) Biomag
Overview
• Sensor Space:
1. Random Field Theory (RFT)
2. 2D Time-Freq (within-subject)
3. 3D Scalp-Time (within-subject)
4. 3D Scalp-Time (between-subjects)
• Source Space:
1. 3D contrast images
2. SPM vs SnPM vs PPM (vs FDR)
3. Other issues & Future directions
4. Multivariate
Where is an effect in source space (3D)?
Source analysis of N=12 subjects; 102 magnetometers; MSP; evoked; RMS; smooth 12mm
1. Estimate evoked/induced energy (RMS) at each dipole for a certain time-frequency contrast (e.g., from sensor stats, e.g. 0-20Hz, 150-200ms), for each condition (e.g., faces & scrambled) and subject
2. Smooth along the 2D surface
3. Write these data into a 3D image in MNI space (if canonical / template mesh used)
4. Smooth by 8-12mm in 3D (to allow for normalisation errors)
Note sparseness of MSP inversions…
[Figure label: Analysis Mask]
Henson et al (2007) Neuroimage
Where is an effect in source space (3D)?
Source analysis of N=12 subjects; 102 magnetometers; MSP; evoked; RMS; smooth 12mm
1. Classical SPM approach
Caveats:
• Inverse operator induces
long-range error correlations
(e.g, similar gain vectors from
non-adjacent dipoles with
similar orientation), making
RFT conservative
• Need a cortical mask, else
activity “smoothed” outside
• Distributions over subjects
may not be Gaussian…
SPM
p<.05 FWE
Where is an effect in source space (3D)?
Source analysis of N=12 subjects; 102 magnetometers; MSP; evoked; RMS; smooth 12mm
2. Nonparametric, SnPM
• Robust to non-Gaussian
distributions
• Less conservative than RFT
when dfs<20
Caveats:
• No idea of effect size (e.g, for
future experiments)
• Exchangeability difficult for
more complex designs
SnPM
p<.05 FWE
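A nonparametric FWE threshold of the kind listed above can be sketched with a maximum-statistic sign-flipping test. This is a generic illustration, not the SnPM toolbox: it assumes a one-sample (paired-difference) design in which signs are exchangeable under the null, and the function name `max_t_sign_flip` and the simulated data are hypothetical.

```python
import numpy as np

def max_t_sign_flip(diff, n_perm=1000, alpha=0.05, seed=0):
    """Max-T sign-flipping test for paired differences.

    diff : (n_subjects, n_voxels) per-subject contrast values
    Returns the observed t-map and the FWE-corrected threshold.
    """
    rng = np.random.default_rng(seed)
    n = diff.shape[0]

    def t_map(x):
        return x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n))

    t_obs = t_map(diff)
    max_null = np.empty(n_perm)
    for k in range(n_perm):
        # Randomly flip the sign of each subject's whole image
        signs = rng.choice([-1.0, 1.0], size=(n, 1))
        max_null[k] = t_map(diff * signs).max()
    # Threshold controlling FWE: (1 - alpha) quantile of the null maxima
    threshold = np.quantile(max_null, 1.0 - alpha)
    return t_obs, threshold

rng = np.random.default_rng(1)
diff = rng.standard_normal((12, 200))   # 12 subjects, 200 voxels (simulated)
diff[:, :5] += 3.0                      # simulated effect in the first 5 voxels
t_obs, threshold = max_t_sign_flip(diff)
significant = t_obs > threshold
```

Because the threshold comes from the distribution of the maximum over voxels, no smoothness or Gaussianity assumption is needed, at the cost of requiring exchangeability.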
Where is an effect in source space (3D)?
Source analysis of N=12 subjects; 102 magnetometers; MSP; evoked; RMS; smooth 12mm
3. PPMs
PPM
p>.95 (γ>1SD)
• No need for RFT (no MCP!)
• Threshold on posterior
probability of an effect
(greater than some size)
• Can show effect size after
thresholding…
Caveats:
• Assume normal distributions
(e.g, of mean over voxels);
sometimes not met for MSP
(though usually fine for IID)
Grayscale = Effect Size
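Thresholding on the posterior probability of an effect greater than some size γ can be sketched for a single voxel as below. This assumes a Gaussian posterior with known mean and standard deviation; the function name `posterior_prob_exceeds` and the example numbers are hypothetical.

```python
import math

def posterior_prob_exceeds(post_mean, post_sd, gamma):
    """P(effect > gamma) under a Gaussian posterior N(post_mean, post_sd**2)."""
    z = (gamma - post_mean) / post_sd
    # Upper-tail probability of a standard normal, via the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Example: posterior mean 3, SD 1, effect-size threshold gamma = 1
prob = posterior_prob_exceeds(3.0, 1.0, 1.0)
shown_in_ppm = prob > 0.95
```

Voxels passing the probability threshold can then be displayed with their posterior mean, which is what allows a PPM to show effect size.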
Where is an effect in source space (3D)?
Source analysis of N=12 subjects; 102 magnetometers; MSP; evoked; RMS; smooth 12mm
4. FDR?
• Topological issues…?
SPM
p<.05 FWE
Where is an effect in source space (3D)?
Some further thoughts:
• Since data live in sensor space, why not perform stats there, and just report
some mean localisation (e.g, across subjects)?
True but:
What if sensor data not aligned (e.g, MEG) (Taylor & Henson, 2008)?
What if want to fuse modalities (e.g, MEG+EEG) (Henson et al, 2009)?
What if want to use source priors (e.g, fMRI) (Henson et al, submitted)?
• Contrast localisations of conditions, or localise contrast of conditions?
“DoL” or “LoD” (Henson et al, 2007, Neuroimage)
LoD has higher SNR (though difference only lives in trial-average, i.e evoked)?
But how test localised energy of a difference (versus baseline)?
Construct inverse operator (MAP) from a difference, but then apply that operator
to individual conditions (Taylor & Henson, in prep)
Future Directions
• Extend RFT to 2D cortical surfaces (“surfstat”)
Pantazis et al (2005) NeuroImage
• Go multivariate…
– To localise (linear combinations) of spatial (sensor or source) effects in time, using Hotelling-T2 and RFT
Carbonell et al (2004) NeuroImage
– To detect spatiotemporal patterns in 3D images (MLM / PLS)
Duzel et al (2003) Neuroimage
Kherif et al (2004) NeuroImage
Multivariate Model (MM) toolbox
Multivariate Linear Model (MLM) across subjects on MEG Scalp-Time volumes (now with 3 conditions: Famous, Novel, Scrambled)
Sensitive (and suggestive of spatiotemporal dynamic networks), but “imprecise”
[Figure labels: X; “M170”?]
Kherif et al (2004) NeuroImage
The End
2. Parametric Empirical Bayes (PEB)
• Weighted Minimum Norm & Bayesian equivalent
• EM estimation of hyperparameters (regularisation)
• Model evidence and Model Comparison
• Spatiotemporal factorisation and Induced Power
• Automatic Relevance Detection (hyperpriors)
• Multiple Sparse Priors
• MEG and EEG fusion (simultaneous inversion)
Weighted Minimum Norm, Regularisation
Linear system to be inverted:
$Y = LJ + E, \quad E \sim N(0, C_e)$
Y = Data, n sensors x t=1 time-samples
J = Sources, p sources x t time-samples
L = Forward model, n sensors x p sources
E = Multivariate Gaussian noise, n x t
Ce= error covariance over sensors
Since n<p, need to regularise, e.g. “weighted minimum (L2) norm” (WMN):
$\hat{J} = \arg\min_J \{ \| C_e^{-1/2} (Y - LJ) \|^2 + \lambda \| WJ \|^2 \}$
$\quad = (W^T W)^{-1} L^T [ L (W^T W)^{-1} L^T + \lambda C_e ]^{-1} Y$ (Tikhonov)
W = Weighting matrix
λ = regularisation (hyperparameter), e.g. chosen by the “L-curve” method (||Y – LJ||² plotted against ||WJ||²)
W = I : minimum norm
W = DD^T : coherent
W = diag(L^T L)^{-1} : depth-weighted
W_p = (L_p^T C_y^{-1} L_p)^{-1} : SAM
W = … : …
Phillips et al (2002) Neuroimage, 17, 287–301
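The weighted minimum norm solution can be checked numerically. The sketch below, with hypothetical function names and random matrices, computes the estimate once from the normal equations of the penalised least-squares problem (a p x p solve) and once from the sensor-space form (an n x n solve); the two agree by the matrix inversion lemma.

```python
import numpy as np

def wmn_source_space(Y, L, W, Ce, lam):
    """Minimise ||Ce^{-1/2}(Y - LJ)||^2 + lam * ||WJ||^2 via the p x p normal equations."""
    Ce_inv = np.linalg.inv(Ce)
    A = L.T @ Ce_inv @ L + lam * (W.T @ W)
    return np.linalg.solve(A, L.T @ Ce_inv @ Y)

def wmn_sensor_space(Y, L, W, Ce, lam):
    """Same estimate written with an n x n inverse (the Tikhonov form)."""
    P = np.linalg.inv(W.T @ W)
    G = L @ P @ L.T + lam * Ce
    return P @ L.T @ np.linalg.solve(G, Y)

rng = np.random.default_rng(0)
n, p = 8, 20                          # fewer sensors than sources
L = rng.standard_normal((n, p))       # hypothetical forward model
Y = rng.standard_normal((n, 1))       # data, t = 1 time sample
W = np.eye(p)                         # W = I gives the minimum norm solution
Ce = 0.5 * np.eye(n)
J_a = wmn_source_space(Y, L, W, Ce, lam=0.1)
J_b = wmn_sensor_space(Y, L, W, Ce, lam=0.1)
```

The sensor-space form is usually preferred in practice because n is much smaller than p.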
Equivalent Bayesian Formulation
Equivalent “Parametric Empirical Bayes” formulation:
$Y = LJ + E^{(e)}, \quad E^{(e)} \sim N(0, C^{(e)})$
$J = 0 + E^{(j)}, \quad E^{(j)} \sim N(0, C^{(j)})$
Posterior is product of likelihood and prior:
$p(J \mid Y) \propto p(Y \mid J)\, p(J)$
Maximal A Posteriori (MAP) estimate is:
$\hat{J} = C^{(j)} L^T (L C^{(j)} L^T + C^{(e)})^{-1} Y$
(Contrasting with Tikhonov):
$(W^T W)^{-1} L^T [ L (W^T W)^{-1} L^T + C^{(e)} ]^{-1} Y$
$\Rightarrow C^{(j)} = (W^T W)^{-1}$
Y = Data, n sensors x t=1 time-samples
J = Sources, p sources x t time-samples
L = Forward model, n sensors x p sources
C(e) = covariance over sensors
C(j) = covariance over sources
$Y^{(1)} = X^{(1)} \theta^{(2)} + E^{(1)}$
$\theta^{(2)} = X^{(2)} \theta^{(3)} + E^{(2)}$
$\theta^{(3)} = 0 \ldots$
W = Weighting matrix
W = I : minimum norm
W = DD^T : coherent
W = diag(L^T L)^{-1} : depth-weighted
W_p = (L_p^T C_y^{-1} L_p)^{-1} : SAM
W = … : …
Phillips et al (2005) Neuroimage, 997-1011
Covariance Constraints (Priors)
How parameterise C(e) and C(j)?
$C^{(e)} = \sum_i \lambda_i^{(e)} Q_i^{(e)}$
$C^{(j)} = \sum_i \lambda_i^{(j)} Q_i^{(j)}$
Q = (co)variance components (Priors)
λ = estimated hyperparameters
“IID” constraint on sensors (Q(e)=I(n)) [# sensors x # sensors]
“IID” constraint on sources (Q(j)=I(p)) [# sources x # sources]
Sparse priors on sources (Q1(j), Q2(j), …) [each # sources x # sources]
Expectation-Maximisation (EM)
How estimate λ? … Use EM algorithm:
$\hat{\lambda}_j^{(i)} = EM(YY^T, \tilde{Q})$
$\tilde{Q} = \{ Q_1^{(e)}, Q_2^{(e)}, \ldots, L Q_1^{(j)} L^T, L Q_2^{(j)} L^T, \ldots \}$
…to maximise the (negative) “free energy” (F):
$F = -\tfrac{1}{2} \mathrm{tr}(C^{-1} Y Y^T) - \tfrac{t}{2} \ln|C| - \tfrac{pt}{2} \ln|2\pi|$
$C = L C^{(j)} L^T + C^{(e)}, \quad C^{(i)} = \sum_j \lambda_j^{(i)} Q_j^{(i)}$
(Note estimation in nxn sensor space)
Once estimated hyperparameters (iterated M-steps), get MAP for parameters (single E-step):
$\hat{J} = MY, \quad M = \hat{C}^{(j)} L^T \hat{C}^{-1}$
(Can also estimate conditional covariance of parameters, allowing inference:)
$\hat{\Sigma} = \hat{C}^{(j)} - M L \hat{C}^{(j)}$
Phillips et al (2005) Neuroimage
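The free energy and the MAP step can be illustrated numerically. The sketch below is not the EM/ReML scheme used in SPM: it replaces the iterative M-steps with a crude grid search over two hyperparameters (one IID sensor component, one IID source component), purely to show that F is a function of λ that can be maximised in sensor space. All names and the simulated data are hypothetical, and the constant term uses the number of rows of Y, which does not affect the maximiser.

```python
import numpy as np

def free_energy(Y, C):
    """F = -1/2 tr(C^{-1} Y Y^T) - t/2 ln|C| - const, for an n x t data matrix Y."""
    n, t = Y.shape
    _, logdet = np.linalg.slogdet(C)
    return (-0.5 * np.trace(np.linalg.solve(C, Y @ Y.T))
            - 0.5 * t * logdet
            - 0.5 * n * t * np.log(2.0 * np.pi))

def map_estimate(Y, L, Cj, Ce):
    """MAP sources J_hat = M Y and conditional covariance Sigma = Cj - M L Cj."""
    C = L @ Cj @ L.T + Ce
    M = Cj @ L.T @ np.linalg.inv(C)
    return M @ Y, Cj - M @ L @ Cj

rng = np.random.default_rng(0)
n, p, t = 10, 30, 200
L = rng.standard_normal((n, p))                       # hypothetical leadfield
J_true = np.sqrt(2.0) * rng.standard_normal((p, t))   # simulated sources
Y = L @ J_true + np.sqrt(0.5) * rng.standard_normal((n, t))

# Grid search over (lambda_e, lambda_j): a stand-in for iterated M-steps
grid = np.logspace(-2, 2, 21)
F_best, lam_e, lam_j = max(
    (free_energy(Y, lj * (L @ L.T) + le * np.eye(n)), le, lj)
    for le in grid for lj in grid
)

# Single E-step with the selected hyperparameters
J_hat, Sigma = map_estimate(Y, L, lam_j * np.eye(p), lam_e * np.eye(n))
```

Note that every evaluation of F involves only n x n matrices, which is why the estimation is cheap even when p is large.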
Multiple Constraints (Priors)
Multiple constraints: Smooth sources (Qs), plus valid (Qv) or invalid (Qi) focal prior
[Figure: results over 500 simulations for models Qs; Qs,Qv; Qs,Qi; Qs,Qi,Qv, with the priors Qs, Qv and Qi illustrated]
Mattout et al (2006) Neuroimage, 753-767
Model Evidence
A (generative) model, M, is defined by the set of {Q(e), Q(j), L}.
The “model log-evidence” is bounded by the free energy:
$\ln p(Y \mid M) = \ln \int p(Y, J \mid M)\, dJ \geq F$
Friston et al (2007) Neuroimage, 34, 220-34
(F can also be viewed as the difference of an “accuracy” term and a “complexity” term):
$F = \text{accuracy} - \text{complexity} = -\tfrac{1}{2} \mathrm{tr}(C^{-1} Y Y^T) - \tfrac{t}{2} \ln|C| - \tfrac{pt}{2} \ln|2\pi|$
Two models can be compared using the “Bayes factor”:
$p(Y \mid M_1) \,/\, p(Y \mid M_2)$
Also useful when comparing different forward models, ie L’s, Henson et al (submitted-b)
Model Comparison (Bayes Factors)
Multiple constraints: Smooth sources (Qs), plus valid (Qv) or invalid (Qi) focal prior
Model        LogEvidence
Qs           205.2
Qs,Qv        214.1
Qs,Qv,Qi     214.7
(Qs,Qi)      204.9
Bayes Factors: 7047; 1.8; (1/9899)
[Figure panels: Qs, Qv, Qi]
Mattout et al (2006) Neuroimage, 753-767
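Since the Bayes factor is a ratio of evidences, it follows from a difference of log-evidences. A minimal sketch, with a hypothetical helper name, using the rounded log-evidences listed above for Qs,Qv versus Qs (because the inputs are rounded, the result need not reproduce the Bayes factors quoted on the slide exactly):

```python
import math

def bayes_factor(log_evidence_1, log_evidence_2):
    """Bayes factor p(Y|M1) / p(Y|M2) from the two log-evidences."""
    return math.exp(log_evidence_1 - log_evidence_2)

# Model with smooth + valid priors versus smooth prior only (rounded values)
bf_valid_vs_smooth = bayes_factor(214.1, 205.2)
```

Working with differences of log-evidences avoids exponentiating the individual evidences, which would overflow or underflow for realistic values.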
Temporal Correlations
To handle temporally-extended solutions, first assume temporal-spatial factorisation:
$\tilde{Y} = LJ + E$
$E \sim N(0, V^{(e)} \otimes C^{(e)})$
$J \sim N(0, V^{(j)} \otimes C^{(j)})$
$\tilde{Y}$ = vectorised data, nt x 1
C(e) = spatial error covariance over sensors
V(e) = temporal error covariance over sensors
C(j) = spatial error covariance over sources
V(j) = temporal error covariance over sources
In general, temporal correlation of signal (sources) and noise (sensors) will differ, but can project onto a temporal subspace (via S) such that:
$S^T V^{(e)} S = S^T V^{(j)} S = S^T V S$
V typically Gaussian autocorrelations…
$V = KK^T, \quad K(\sigma)_{ij} = \exp\left( -\frac{(i-j)^2}{2\sigma^2} \right), \quad \sigma \sim 4\,\text{ms}$
Friston et al (2006) Human Brain Mapping, 27:722–735
S typically an SVD into Nr temporal modes…
Then turns out that EM can simply operate on prewhitened data (covariance), where Y size n x t:
$\hat{\lambda} = EM\left( \frac{1}{N_r} Y S (S^T V S)^{-1} S^T Y^T,\; Q \right)$
$\hat{J} = M Y S S^T$
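The temporal projection can be sketched as follows. This is an illustration under assumptions rather than SPM's code: σ is expressed in samples (i.e. assuming one sample per ms), S is taken as the leading right singular vectors of the data, and all function names and the simulated data are hypothetical.

```python
import numpy as np

def gaussian_kernel(t, sigma):
    """K_ij = exp(-(i - j)^2 / (2 sigma^2)) over t time samples."""
    idx = np.arange(t)
    return np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * sigma ** 2))

def temporal_modes(Y, n_modes):
    """Leading temporal modes (t x n_modes) from an SVD of the n x t data."""
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return Vt[:n_modes].T

def projected_covariance(Y, S, V):
    """(1/Nr) Y S (S^T V S)^{-1} S^T Y^T: the n x n covariance passed to EM."""
    core = np.linalg.inv(S.T @ V @ S)
    return Y @ S @ core @ S.T @ Y.T / S.shape[1]

rng = np.random.default_rng(0)
n, t = 8, 100
K = gaussian_kernel(t, sigma=4.0)
V = K @ K.T                                   # temporal covariance V = K K^T
# Simulated smooth data plus a little white noise
Y = rng.standard_normal((n, t)) @ K + 0.1 * rng.standard_normal((n, t))
S = temporal_modes(Y, n_modes=4)
C_sample = projected_covariance(Y, S, V)
```

The projection reduces the data from t samples to Nr modes while keeping the covariance that EM operates on at n x n.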
Localising Power (eg induced)
Friston et al (2006) Human Brain Mapping, 27:722–735
Automatic Relevance Detection (ARD)
When have many constraints (Q’s), pairwise model comparison becomes arduous
Moreover, when Q’s are correlated, F-maximisation can be difficult (eg local maxima),
and hyperparameters can become negative (improper for covariances)
[Figure: sensor-level and source-level covariance components: Prestim Baseline, Anti-Averaging, Smoothness, Depth-Weighting]
Note: Even though Qs may be uncorrelated in source space, they can become correlated when projected through L to sensor space (where F is optimised):
$C = L C^{(j)} L^T + C^{(e)}$
Henson et al (2007) Neuroimage, 38, 422-38
To overcome this, one can:
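1) impose positivity constraint on hyperparameters:
$\alpha = \ln(\lambda) \Rightarrow \lambda = \exp(\alpha)$
2) impose (sparse) hyperpriors on the (log-normal) hyperparameters:
$p(\alpha) \sim N(\mu, \Sigma_\alpha), \quad \mu = -8, \quad \Sigma_\alpha = aI, \; a = 32$
Uninformative priors are then “turned-off” as $\alpha \to -\infty$, $\lambda \to 0$ (“ARD”)
Complexity (final two terms):
$F = -\tfrac{1}{2} \mathrm{tr}(C^{-1} Y Y^T) - \tfrac{t}{2} \ln|C| - \tfrac{pt}{2} \ln|2\pi| + \tfrac{1}{2} \ln|\Sigma_\lambda \Sigma_\alpha^{-1}| - \tfrac{1}{2} (\eta - \mu)^T \Sigma_\alpha^{-1} (\eta - \mu)$
(…where η and Σλ are the posterior mean and covariance of hyperparameters)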
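The two devices above (a log-transform for positivity and a Gaussian hyperprior on the log-hyperparameters) can be sketched as below. The function names are hypothetical and the numerical values of the hyperprior mean and variance are chosen for illustration only.

```python
import numpy as np

def to_hyperparameters(alpha):
    """lambda = exp(alpha): any real alpha gives a positive covariance scale."""
    return np.exp(alpha)

def log_hyperprior(alpha, mean, variance):
    """Log-density of an isotropic Gaussian N(mean, variance * I) evaluated at alpha."""
    alpha = np.asarray(alpha, dtype=float)
    resid = alpha - mean
    return (-0.5 * (resid @ resid) / variance
            - 0.5 * alpha.size * np.log(2.0 * np.pi * variance))

alpha = np.array([-40.0, -8.0, 2.0])     # illustrative log-hyperparameters
lam = to_hyperparameters(alpha)          # first component is effectively switched off

# A component sitting at the prior mean is penalised less than one far from it
penalty_at_mean = log_hyperprior([-8.0, -8.0], mean=-8.0, variance=32.0)
penalty_far = log_hyperprior([0.0, 0.0], mean=-8.0, variance=32.0)
```

A component whose log-hyperparameter is driven strongly negative contributes essentially nothing to the covariance, which is the "turning off" behaviour described above.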
[Figure: sensor-level and source-level components (Prestim Baseline, Anti-Averaging, Smoothness, Depth-Weighting)]
Henson et al (2007) Neuroimage, 38, 422-38
Multiple Sparse Priors (MSP)
So why not use ARD to select from a large number of sparse source priors….!?
[Figure: sparse source priors Q(2)1, …, Q(2)j, Q(2)j+1, Q(2)j+2, …, Q(2)N, covering left patches, right patches and bilateral patches]
Friston et al (2008) Neuroimage
No depth bias!
Friston et al (2008) Neuroimage
Fusion of MEG/EEG
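Separate Error Covariance components for each of i=1..M modalities (Ci(e)):
$\begin{bmatrix} \tilde{Y}_1 \\ \tilde{Y}_2 \\ \vdots \\ \tilde{Y}_M \end{bmatrix} = \begin{bmatrix} \tilde{L}_1 \\ \tilde{L}_2 \\ \vdots \\ \tilde{L}_M \end{bmatrix} J + \begin{bmatrix} E_1^{(e)} \\ E_2^{(e)} \\ \vdots \\ E_M^{(e)} \end{bmatrix}$
Data and leadfields scaled (with mi spatial modes):
$\tilde{Y}_i = \dfrac{Y_i}{\sqrt{\tfrac{1}{m_i} \mathrm{tr}(Y_i Y_i^T)}}, \quad \tilde{L}_i = \dfrac{L_i}{\sqrt{\tfrac{1}{m_i} \mathrm{tr}(L_i L_i^T)}}$
$C^{(e)} = \begin{bmatrix} C_1^{(e)} & 0 & \cdots & 0 \\ 0 & C_2^{(e)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_M^{(e)} \end{bmatrix}, \quad C_i^{(e)} = \sum_j \lambda_{ij}^{(e)} Q_{ij}^{(e)}$
Remember, EM returns conditional precisions (Σ) of sources (J), which can be used to compare separate vs fused inversions…
$\hat{\Sigma} = C^{(j)} - M L C^{(j)}$
Henson et al (2009b) Neuroimage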
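The rescaling and stacking of modalities can be sketched as below. This assumes the normalisation is by the square root of the average variance per spatial mode, and takes the number of modes to be the number of channels; the function name `rescale` and the simulated MEG/EEG matrices are hypothetical.

```python
import numpy as np

def rescale(X, m):
    """Divide X by sqrt(tr(X X^T) / m), so its average variance per mode is 1."""
    return X / np.sqrt(np.trace(X @ X.T) / m)

rng = np.random.default_rng(0)
p, t = 50, 100
# Simulated modalities with very different physical scales
Y_meg = 1e-13 * rng.standard_normal((20, t))
Y_eeg = 1e-6 * rng.standard_normal((12, t))
L_meg = 1e-10 * rng.standard_normal((20, p))
L_eeg = 1e-4 * rng.standard_normal((12, p))

# Scale each modality, then concatenate over channels for a single inversion
Y_fused = np.vstack([rescale(Y_meg, 20), rescale(Y_eeg, 12)])
L_fused = np.vstack([rescale(L_meg, 20), rescale(L_eeg, 12)])
```

After scaling, neither modality dominates the fused inversion simply because of its measurement units.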
Fusion of MEG/EEG
[Figure: inversions for Magnetometers (MEG), Gradiometers (MEG), Electrodes (EEG) and + Fused…; listed values: 71, 73, 98, 111]
Henson et al (2009b) Neuroimage
Overview
1. Random Field Theory for Space-Time images
2. Empirical Bayesian approach to the Inverse Problem
3. A Canonical Cortical mesh and Group Analyses
4. [ Dynamic Causal Modelling (DCM) ]
3. Canonical Mesh & Group Analyses
• A “canonical” (Inverse-normalised) cortical mesh
• Group analyses in 3D
• Use of fMRI spatial priors (in MNI space)
• Group-based inversions
A “Canonical” Cortical Mesh
Given the difficulty in (automatically) creating accurate cortical meshes from
MRIs, how about inverse-normalising a (quality) template mesh in MNI space?
[Figure: Original MRI → Spatial Normalisation (warps) → Normalised MRI, matched to Template MRI (in “MNI” space)]
Ashburner & Friston (2005) Neuroimage
A “Canonical” Cortical Mesh
N=1
Apply inverse of warps from spatial normalisation
of whole MRI to a template cortical mesh…
[Figure: Individual, Canonical and Template meshes]
Mattout et al (2007) Comp. Intelligence & Neuroscience
A “Canonical” Cortical Mesh
N=9
But warps from cortex not appropriate
to skull/scalp, so use individually (and
easily) defined skull/scalp meshes…
“CanInd” = Canonical Cortex + Individual Skull + Individual Scalp
Statistical tests of model evidence
over N=9 MEG subjects show:
1. MSP > MMN
2. BEMs > Spheres (for CanInd)
3. (7000 > 3000 dipoles)
4. (Normal > Free for MSP)
Free Energy/104
Henson et al (2009a) Neuroimage
Group Analyses in 3D
Once have a 1-to-1 mapping from M/EEG source to MNI space, can create 3D
normalised images (like fMRI) and use SPM machinery to perform group-level
classical inference…
N=19, MNI space, Pseudowords>Words
300-400ms with >95% probability
[Figure: smoothed, interpolated J]
Taylor & Henson (2008), Biomag
fMRI spatial priors
Group fMRI results in MNI space can be used as spatial priors on individual source space...
1. Thresholding and connected component labelling
2. Project onto cortical surface using Voronoï diagram
3. Prior covariance components Qj
...importantly each fMRI cluster is separate prior, so is “weighted” independently
Henson et al (submitted)
Group-based source priors
Concatenate data and leadfields over i=1..N subjects…
$\begin{bmatrix} \tilde{Y}_1 \\ \tilde{Y}_2 \\ \vdots \\ \tilde{Y}_N \end{bmatrix} = \begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_N \end{bmatrix} J + \begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_N \end{bmatrix}$
…projecting data and leadfields to a reference subject (0):
$\tilde{Y}_i = A_i Y_i, \quad A_i = L_0 L_i^T (L_i L_i^T)^{-1}$
Common source-level priors:
$C^{(j)} = \sum_k \lambda_k^{(j)} Q_k^{(j)}$
Subject-specific sensor-level priors:
$C_i^{(e)} = \sum_k \lambda_{ik}^{(e)} A_i Q_k^{(e)} A_i^T$
$C^{(e)} = \begin{bmatrix} C_1^{(e)} & 0 & \cdots & 0 \\ 0 & C_2^{(e)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_N^{(e)} \end{bmatrix}$
$C = L_0 C^{(j)} L_0^T + C^{(e)}$
Litvak & Friston (2008), Neuroimage
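The projection to a reference subject can be sketched numerically. The code below builds the operator A_i = L_0 L_i^T (L_i L_i^T)^{-1} for random leadfields and applies it to one subject's data; the function name `projector_to_reference` and all matrices are hypothetical.

```python
import numpy as np

def projector_to_reference(L0, Li):
    """A_i = L0 Li^T (Li Li^T)^{-1}: maps subject i's sensor space onto the reference's."""
    return L0 @ Li.T @ np.linalg.inv(Li @ Li.T)

rng = np.random.default_rng(0)
n, p, t = 10, 40, 60
L0 = rng.standard_normal((n, p))      # reference subject's leadfield
Li = rng.standard_normal((n, p))      # subject i's leadfield
Yi = rng.standard_normal((n, t))      # subject i's data

Ai = projector_to_reference(L0, Li)
Yi_projected = Ai @ Yi                # data expressed in the reference sensor space

# Sanity check: projecting the reference subject onto itself changes nothing
A_self = projector_to_reference(L0, L0)
```

By construction A_i L_i L_i^T = L_0 L_i^T, so the projected leadfield matches the reference leadfield on the row space of L_i.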
Group-based source priors
N=19, MNI space, Pseudowords>Words, 300-400ms with >95% probability
Individual Inversions
Group Inversion
Taylor & Henson (in prep)
Summary
SPM also implements Random Field Theory for principled
correction of multiple comparisons over space/time/freq
SPM implements a variant of the L2-distributed norm that:
1. effectively automatically “regularises” in principled fashion
2. allows for multiple constraints (priors), valid & invalid
3. allows model comparison, or automatic relevance detection…
4. …to the extent that multiple (100’s) of sparse priors possible
5. also offers a framework for MEG+EEG fusion
SPM can also inverse-normalise a template cortical mesh that:
1. obviates manual cortex meshing
2. allows use of fMRI priors in MNI space
3. allows using group constraints on individual inversions