The statistical analysis of fMRI data
Keith Worsley (1,2), Chuanhong Liao (1), John Aston (1,2,3), Jean-Baptiste Poline (4), Gary Duncan (5), Vali Petre (2), Frank Morales (6), Alan Evans (2), Tom Nichols (7), Satoru Hayasaki (7)
(1) Department of Mathematics and Statistics, McGill University
(2) Brain Imaging Centre, Montreal Neurological Institute
(3) Imperial College, London
(4) Service Hospitalier Frédéric Joliot, CEA, Orsay
(5) Centre de Recherche en Sciences Neurologiques, Université de Montréal
(6) Cuban Neuroscience Centre
(7) University of Michigan
fMRI data: 120 scans, 3 scans each of hot, rest, warm, rest, hot, rest, …
[Figure: first scan of fMRI data, with time courses (signal against time in seconds, 0 to 300) at two voxels under the hot / rest / warm paradigm: one with a highly significant effect, T=6.59, and one with no significant effect, T=-0.74. Drift is annotated on the time courses.]
T statistic for hot - warm effect:
$T = (\text{hot} - \text{warm effect}) / \text{S.d.} \sim t_{110}$ if no effect
[Figure: map of the T statistic, colour scale -5 to 5.]
Choices …
• Time domain / frequency domain?
• AR / ARMA / state space models?
• Linear / non-linear time series model?
• Fixed HRF / estimated HRF?
• Voxel / local / global parameters?
• Fixed effects / random effects?
• Frequentist / Bayesian?
More importantly ...
• Fast execution / slow execution?
• Matlab / C?
• Script (batch) / GUI?
• Lazy / hard working … ?
Why not just use SPM?
Develop new ideas ...
FMRISTAT: Simple, general, valid, robust,
fast analysis of fMRI data
PCA_IMAGE: PCA of time × space:
[Figure: temporal components against frame (0 to 140) and spatial components against slice (0 based, 0 to 12), colour scale -1 to 1.]
Component: sd, % variance explained: interpretation
1: 0.68, 46.9%: exclude first frames
2: 0.29, 8.6%: drift
3: 0.17, 2.9%: long-range correlation or anatomical effect: remove by converting to % of brain
4: 0.15, 2.4%: signal?
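FMRILM: fits a linear model for fMRI time series with AR(p) errors
• Linear model:
$Y_t = (\mathrm{stimulus}_t * \mathrm{HRF})\, b + \mathrm{drift}_t\, c + \mathrm{error}_t$
(unknown parameters: $b$, $c$)
• AR(p) errors:
$\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + \dots + a_p\, \mathrm{error}_{t-p} + \sigma\, \mathrm{WN}_t$
(unknown parameters: $a_1, \dots, a_p$, $\sigma$)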
FMRIDESIGN example: pain perception
Alternating hot and warm stimuli separated by rest (9 seconds each).
[Figure: hot and warm stimuli against time (0 to 350 seconds).]
Hemodynamic response function: difference of two gamma densities
[Figure: HRF against time (0 to 50 seconds), range -0.2 to 0.4.]
Responses = stimuli * HRF, sampled every 3 seconds
[Figure: hot and warm responses against time (0 to 350 seconds).]
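The design above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gamma shape values (6 and 16), the 1-second scale and the dip weight of 1/6 are placeholder choices, not the FMRIDESIGN defaults, and the function names are hypothetical. The stimuli are built on a 1-second grid, convolved with the HRF, and then sampled every 3 seconds.

```python
import math

def gamma_pdf(t, shape, scale):
    """Gamma density at time t (zero for t <= 0)."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def hrf(t, shape1=6.0, shape2=16.0, scale=1.0, dip=1.0 / 6.0):
    """Difference of two gamma densities (parameter values are assumptions)."""
    return gamma_pdf(t, shape1, scale) - dip * gamma_pdf(t, shape2, scale)

def block_stimulus(n_seconds, onset, period, duration):
    """1 while the stimulus is on, 0 otherwise, on a 1-second grid."""
    return [1.0 if t >= onset and (t - onset) % period < duration else 0.0
            for t in range(n_seconds)]

def convolve(stimulus, kernel):
    """Causal discrete convolution, truncated to the stimulus length."""
    out = []
    for t in range(len(stimulus)):
        total = 0.0
        for lag in range(min(t + 1, len(kernel))):
            total += stimulus[t - lag] * kernel[lag]
        out.append(total)
    return out

TR = 3          # seconds between scans
N_SCANS = 120   # 120 scans = 360 seconds
kernel = [hrf(t) for t in range(30)]

# hot, rest, warm, rest: 9 seconds each, so each stimulus repeats every 36 seconds
hot = block_stimulus(TR * N_SCANS, onset=0, period=36, duration=9)
warm = block_stimulus(TR * N_SCANS, onset=18, period=36, duration=9)

# Responses = stimuli * HRF, sampled every 3 seconds
hot_response = convolve(hot, kernel)[::TR]
warm_response = convolve(warm, kernel)[::TR]
```

The two response vectors would then serve as the stimulus columns of the linear model, next to the drift terms.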
FMRILM first step: estimate the autocorrelation
AR(1) model: $\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + \sigma\, \mathrm{WN}_t$
• Fit the linear model using least squares
• $\mathrm{error}_t = Y_t - \text{fitted } Y_t$
• $\hat a_1 = \mathrm{Correlation}(\mathrm{error}_t, \mathrm{error}_{t-1})$
• Estimating the errors changes their correlation structure slightly, so $\hat a_1$ is slightly biased: which_stats = ‘_cor’
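As a rough illustration of the correlation step, the sketch below computes the lag-1 autocorrelation of a residual series. It leaves out the bias correction mentioned above. The function names and the simulated AR(1) series are assumptions made for the example.

```python
import random

def lag1_autocorrelation(residuals):
    """Sample correlation between residual_t and residual_{t-1}."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [r - mean for r in residuals]
    numerator = sum(centred[t] * centred[t - 1] for t in range(1, n))
    denominator = sum(c * c for c in centred)
    return numerator / denominator

def simulate_ar1(n, a1, sigma, seed=0):
    """Simulate error_t = a1 * error_{t-1} + sigma * WN_t (illustration only)."""
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        errors.append(a1 * errors[-1] + rng.gauss(0.0, sigma))
    return errors

errors = simulate_ar1(n=5000, a1=0.3, sigma=1.0)
a1_hat = lag1_autocorrelation(errors)  # close to 0.3 for a long series
```

With only 120 scans per voxel the estimate is much noisier than in this long simulated series.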
[Figure: maps of the autocorrelation on a colour scale from -0.1 to 0.3. Panels: raw autocorrelation; smoothed 12.4mm; bias corrected â1. Annotations on the maps: ~ -0.05 and ~0.]
Effective df depends on smoothing
• Variability in acor lowers df
• Df depends on contrast
• Smoothing acor brings df back up:
$\mathrm{df}_{\mathrm{acor}} = \mathrm{df}_{\mathrm{residual}} \left( 2\, \frac{\mathrm{FWHM}_{\mathrm{acor}}^2}{\mathrm{FWHM}_{\mathrm{data}}^2} + 1 \right)^{3/2}$
$\frac{1}{\mathrm{df}_{\mathrm{eff}}} = \frac{1}{\mathrm{df}_{\mathrm{residual}}} + \frac{2\, \mathrm{acor}(\text{contrast of data})^2}{\mathrm{df}_{\mathrm{acor}}}$
FWHM_data = 8.79
[Figure: df_eff (0 to 100) against FWHM_acor (0 to 30), two panels, each with residual df = 110 and target = 100 df. Hot stimulus: contrast of data, acor = 0.61, FWHM = 10.3mm. Hot-warm stimulus: contrast of data, acor = 0.79, FWHM = 12.4mm.]
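The two formulas on this slide can be checked numerically. The sketch below (function and argument names are hypothetical) plugs in the values quoted in the figure and recovers roughly the 100 df target in both cases.

```python
def df_acor(df_residual, fwhm_acor, fwhm_data):
    """df of the smoothed autocorrelation estimate."""
    return df_residual * (2.0 * (fwhm_acor / fwhm_data) ** 2 + 1.0) ** 1.5

def effective_df(df_residual, fwhm_acor, fwhm_data, acor_contrast):
    """1/df_eff = 1/df_residual + 2 * acor(contrast)^2 / df_acor."""
    inverse = (1.0 / df_residual
               + 2.0 * acor_contrast ** 2 / df_acor(df_residual, fwhm_acor, fwhm_data))
    return 1.0 / inverse

# Values quoted on the slide (FWHM_data = 8.79, residual df = 110)
df_hot = effective_df(110, fwhm_acor=10.3, fwhm_data=8.79, acor_contrast=0.61)
df_hot_warm = effective_df(110, fwhm_acor=12.4, fwhm_data=8.79, acor_contrast=0.79)
print(round(df_hot, 1), round(df_hot_warm, 1))
```

Setting `fwhm_acor` to 0 (no smoothing of the autocorrelation) gives a noticeably lower effective df, which is the loss that smoothing is meant to recover.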
FMRILM second step: refit the linear model
Pre-whiten: $Y_t^* = Y_t - \hat a_1 Y_{t-1}$, then fit using least squares:
which_stats = ‘_mag_ef _mag_sd _mag_t’
[Figure: maps of the hot - warm effect, % ‘_mag_ef’ (scale -1 to 1); sd of effect, % ‘_mag_sd’ (scale 0 to 0.25); T = effect / sd, 110 df ‘_mag_t’ (scale -6 to 6), with T > 4.93 (P < 0.05, corrected).]
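A minimal sketch of the pre-whitening step, assuming a single regressor and no drift terms so that least squares can be written out by hand. The same filter is applied to the data and to the regressor, and the first time point is simply dropped. Both simplifications are assumptions of this example, and the helper names are hypothetical.

```python
import math

def prewhiten(series, a1):
    """Return series*_t = series_t - a1 * series_{t-1} for t = 1, ..., n-1."""
    return [series[t] - a1 * series[t - 1] for t in range(1, len(series))]

def fit_one_regressor(y, x):
    """Least squares for y_t = x_t * b + error_t; returns (effect, sd, df)."""
    sxx = sum(v * v for v in x)
    effect = sum(xv * yv for xv, yv in zip(x, y)) / sxx
    rss = sum((yv - effect * xv) ** 2 for xv, yv in zip(x, y))
    df = len(y) - 1
    sd = math.sqrt(rss / df / sxx)
    return effect, sd, df

def prewhitened_fit(y, x, a1):
    """Pre-whiten data and regressor with the same a1, then fit."""
    return fit_one_regressor(prewhiten(y, a1), prewhiten(x, a1))

# Toy data: true effect 2.0 plus a small alternating disturbance
x = [math.sin(0.3 * t) for t in range(60)]
y = [2.0 * xv + 0.01 * (-1) ** t for t, xv in enumerate(x)]
effect, sd, df = prewhitened_fit(y, x, a1=0.3)
t_stat = effect / sd
```

The three outputs correspond in spirit to ‘_mag_ef’, ‘_mag_sd’ and ‘_mag_t’.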
Higher order AR model? Try AR(3): ‘_AR’
[Figure: maps of a1, a2 and a3 on a colour scale from -0.1 to 0.3.]
AR(1) seems to be adequate
… has little effect on the T statistics:
[Figure: T statistic maps (scale -5 to 5) for no correlation, AR(1), AR(2) and AR(3).]
No correlation: biases T up ~12% → more false positives
Results from 4 runs on the same subject
[Figure: for each of Run 1 to Run 4, maps of the effect E_i ‘_mag_ef’ (scale -1 to 1), the sd S_i ‘_mag_sd’ (scale 0 to 0.2) and the T stat E_i / S_i ‘_mag_t’ (scale -5 to 5).]
Problem: 4 runs, 3 df for random effects sd ...
… very noisy sd:
… and T>15.96 for P<0.05 (corrected):
… so no response is detected …
[Figure: columns Run 1 to Run 4 and MULTISTAT; rows effect E_i ‘_mag_ef’ (scale -1 to 1), sd S_i ‘_mag_sd’ (scale 0 to 0.2) and T stat E_i / S_i ‘_mag_t’ (scale -5 to 5).]
MULTISTAT: mixed effects linear model for combining effects from different runs/sessions/subjects:
• E_i = effect for run/session/subject i (from FMRILM)
• S_i = standard error of effect (from FMRILM)
• Mixed effects model:
$E_i = \mathrm{covariates}_i\, c + S_i\, \mathrm{WN}_i^F + \sigma\, \mathrm{WN}_i^R$
(unknown parameters: $c$, $\sigma$)
covariates_i: usually 1, but could add group, treatment, age, sex, ...
$S_i\, \mathrm{WN}_i^F$: ‘fixed effects’ error, due to variability within the same run
$\sigma\, \mathrm{WN}_i^R$: random effect, due to variability from run to run
REML estimation using the EM algorithm
• Slow to converge (10 iterations by default).
• Stable (maintains estimate $\hat\sigma^2 > 0$), but
• $\hat\sigma^2$ biased if $\sigma^2$ (random effect) is small, so:
• Re-parameterize the variance model:
$\mathrm{Var}(E_i) = S_i^2 + \sigma^2 = (S_i^2 - \min_j S_j^2) + (\sigma^2 + \min_j S_j^2) = S_i^{*2} + \sigma^{*2}$
• $\hat\sigma^2 = \hat\sigma^{*2} - \min_j S_j^2$ (less biased estimate)
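To make the re-parameterization concrete, here is a simplified sketch. It is not the MULTISTAT implementation: it uses a plain maximum-likelihood EM rather than REML, fixes the covariate at 1, and uses hypothetical function names and made-up inputs. It does show the two properties listed above: the EM update keeps the variance estimate positive, and the re-parameterized estimate is obtained by subtracting min_j S_j^2 afterwards.

```python
import math

def em_random_effect_variance(effects, sds, n_iter=10, sigma2=1.0):
    """ML-type EM for E_i = c + S_i WN_i^F + sigma WN_i^R (covariate fixed at 1)."""
    c = 0.0
    for _ in range(n_iter):
        # Generalized least squares estimate of c given the current sigma^2
        weights = [1.0 / (s * s + sigma2) for s in sds]
        c = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        # E-step: posterior mean and variance of each random effect;
        # M-step: sigma^2 becomes the average posterior second moment
        total = 0.0
        for e, s in zip(effects, sds):
            shrink = sigma2 / (sigma2 + s * s)
            post_mean = shrink * (e - c)
            post_var = shrink * s * s
            total += post_mean ** 2 + post_var
        sigma2 = total / len(effects)
    return c, sigma2

def reparameterized_variance(effects, sds, n_iter=10):
    """Run EM on S_i*^2 = S_i^2 - min_j S_j^2, then subtract min_j S_j^2."""
    min_s2 = min(s * s for s in sds)
    sds_star = [math.sqrt(s * s - min_s2) for s in sds]
    c, sigma2_star = em_random_effect_variance(effects, sds_star, n_iter)
    return c, sigma2_star - min_s2

effects = [0.0, 2.0, 1.0, 3.0]   # made-up effects E_i from four runs
sds = [0.5, 0.6, 0.7, 0.8]       # made-up standard errors S_i
c_hat, sigma2_hat = reparameterized_variance(effects, sds)
```

Note that the re-parameterized estimate can come out negative; this sketch does not truncate it.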
Solution: Spatial regularization of the sd
• Basic idea: increase df by spatial smoothing (local pooling) of the sd.
• Can’t smooth the random effects sd directly, - too much anatomical structure.
• Instead,
sd = smooth( random effects sd / fixed effects sd ) × fixed effects sd
which removes the anatomical structure before smoothing.
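The ratio trick can be illustrated in one dimension. This is a sketch only: the function names are hypothetical, the Gaussian smoother is a simple normalized kernel along a line of voxels, and the input values are invented.

```python
import math

def gaussian_smooth(values, fwhm):
    """Normalized Gaussian kernel smoothing along a 1-D line of voxels."""
    sigma = fwhm / math.sqrt(8.0 * math.log(2.0))
    smoothed = []
    for i in range(len(values)):
        weights = [math.exp(-0.5 * ((i - j) / sigma) ** 2) for j in range(len(values))]
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return smoothed

def regularized_sd(random_sd, fixed_sd, fwhm):
    """sd = smooth(random effects sd / fixed effects sd) * fixed effects sd."""
    ratio = [r / f for r, f in zip(random_sd, fixed_sd)]
    return [s * f for s, f in zip(gaussian_smooth(ratio, fwhm), fixed_sd)]

# Invented example: strong "anatomical" structure in the fixed effects sd,
# and a random effects sd that is a constant 1.3 times larger
fixed_sd = [0.10, 0.20, 0.10, 0.20, 0.10]
random_sd = [1.3 * f for f in fixed_sd]
sd = regularized_sd(random_sd, fixed_sd, fwhm=3.0)
```

Because the ratio is constant here, smoothing leaves it unchanged and the structure of the fixed effects sd is restored exactly by the final multiplication. Smoothing the random effects sd directly would have blurred that structure.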
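[Figure: the random effects sd (3 df) is divided by the fixed effects sd (average S_i, 440 df) to give random sd / fixed sd; the ratio is smoothed (smoothed sd ratio ‘_sdratio’, scale 0.5 to 1.5; random effect where sd ratio ~1.3) and multiplied by the fixed effects sd to give the mixed effects sd, ~100 df. Sd maps use a scale of 0 to 0.2.]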
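Effective df depends on smoothing
$\mathrm{df}_{\mathrm{ratio}} = \mathrm{df}_{\mathrm{random}} \left( 2\, \frac{\mathrm{FWHM}_{\mathrm{ratio}}^2}{\mathrm{FWHM}_{\mathrm{data}}^2} + 1 \right)^{3/2}$
$\frac{1}{\mathrm{df}_{\mathrm{eff}}} = \frac{1}{\mathrm{df}_{\mathrm{ratio}}} + \frac{1}{\mathrm{df}_{\mathrm{fixed}}}$
e.g. df_random = 3, df_fixed = 4 × 110 = 440, FWHM_data = 8mm:
[Figure: df_eff (0 to 400) against FWHM_ratio (0 to infinity). At FWHM_ratio = 0: random effects analysis, df_eff = 3. At FWHM_ratio = infinity: fixed effects analysis, df_eff = 440. Target = 100 df is reached at FWHM = 19mm.]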
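The worked example on this slide can be reproduced directly from the two formulas. The function names below are hypothetical.

```python
import math

def df_ratio(df_random, fwhm_ratio, fwhm_data):
    """df of the smoothed sd ratio."""
    return df_random * (2.0 * (fwhm_ratio / fwhm_data) ** 2 + 1.0) ** 1.5

def effective_df_mixed(df_random, df_fixed, fwhm_ratio, fwhm_data):
    """1/df_eff = 1/df_ratio + 1/df_fixed."""
    return 1.0 / (1.0 / df_ratio(df_random, fwhm_ratio, fwhm_data) + 1.0 / df_fixed)

# Slide example: df_random = 3, df_fixed = 440, FWHM_data = 8mm
for fwhm_ratio in (0.0, 19.0, math.inf):
    print(fwhm_ratio, round(effective_df_mixed(3, 440, fwhm_ratio, 8.0), 1))
```

No smoothing gives about 3 df (random effects analysis), 19mm gives about 100 df, and infinite smoothing gives 440 df (fixed effects analysis).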
Final result: 19mm smoothing, 100 effective df …
… less noisy sd:
… and T>4.93 for P<0.05 (corrected):
… and now we can detect a response!
[Figure: columns Run 1 to Run 4 and MULTISTAT; rows effect E_i (‘_mag_ef’ for runs, ‘_ef’ for MULTISTAT; scale -1 to 1), sd S_i (‘_mag_sd’, ‘_sd’; scale 0 to 0.2) and T stat E_i / S_i (‘_mag_t’, ‘_t’; scale -5 to 5).]
FWHM – the local smoothness of the noise
$\mathrm{FWHM} = \mathrm{voxel\ size} \times \frac{(2 \log 2)^{1/2}}{(1 - \mathrm{correlation})^{1/2}}$
(If the noise is modeled as white noise smoothed with a Gaussian kernel, this would be its FWHM)
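A one-line numerical version of this formula is sketched below. The function name is hypothetical, and it is assumed here that "correlation" means the correlation of the noise between neighbouring voxels.

```python
import math

def fwhm_from_correlation(voxel_size, correlation):
    """FWHM = voxel size * sqrt(2 log 2) / sqrt(1 - correlation)."""
    return voxel_size * math.sqrt(2.0 * math.log(2.0)) / math.sqrt(1.0 - correlation)

# Example with made-up inputs: 2mm voxels, neighbour correlation 0.5
print(round(fwhm_from_correlation(2.0, 0.5), 2))
```

Higher correlation between neighbours gives a larger FWHM, that is, smoother noise.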
P-values depend on Resels:
$\mathrm{Resels} = \frac{\mathrm{Volume}}{\mathrm{FWHM}^3}$
[Figure, two panels. P value of local max (0 to 0.1) against resels of search volume (0 to 1000), for a local maximum T = 4.5. P value of cluster (0 to 0.1) against resels of cluster (0 to 2), for clusters above t = 3.0, search volume resels = 500.]
Non-isotropic data
(spatially varying FWHM)
• fMRI data is smoother in GM than WM
• VBM data is highly non-isotropic
• Has little effect on P-values for local maxima (use
‘average’ FWHM inside search region), but
• Has a big effect on P-values for spatial extents:
smooth regions → big clusters,
rough regions → small clusters, so
• Replace cluster volume by cluster resels = volume / FWHM³
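With a spatially varying FWHM, one natural way to compute cluster resels is to add up volume / FWHM³ voxel by voxel. The sketch below assumes exactly that (it is an assumption, not a statement of how STAT_SUMMARY does it), with a hypothetical function name and invented numbers.

```python
def cluster_resels(voxel_volume, fwhm_per_voxel):
    """Sum of voxel volume / FWHM^3 over the voxels in a cluster."""
    return sum(voxel_volume / fwhm ** 3 for fwhm in fwhm_per_voxel)

# Two clusters of 100 voxels, each voxel 8 mm^3 (same volume of 800 mm^3)
smooth_cluster = cluster_resels(8.0, [10.0] * 100)  # smooth region, FWHM = 10mm
rough_cluster = cluster_resels(8.0, [5.0] * 100)    # rough region, FWHM = 5mm
print(smooth_cluster, rough_cluster)
```

The two clusters have the same volume, but the one in the rough region is eight times larger in resels, so it is the more surprising of the two.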
[Figure, four maps. FWHM (mm) of scans (110 df) ‘_fwhm’ (scale 0 to 20), with two clusters marked: Resels=1.90, P=0.007 and Resels=0.57, P=0.387. FWHM (mm) of effects (3 df) ‘_fwhm’ (scale 0 to 20). FWHM of effects (smoothed) (scale 0 to 20). Effects / scans FWHM (smoothed) (scale 0.5 to 1.5).]
STAT_SUMMARY
Low FWHM: use Bonferroni. In between: use Discrete Local Maxima (DLM). High FWHM: use Random Field Theory.
[Figure: Gaussianized threshold (3.7 to 4.7) against FWHM of smoothing kernel (0 to 10 voxels). Curves: True, Bonferroni, Random Field Theory, Discrete Local Maxima (DLM); shown for Gaussian, T with 20 df and T with 10 df.]
[Figure: P-value (0 to 0.12) against FWHM of smoothing kernel (0 to 10 voxels). Curves: True, Bonferroni, Random Field Theory, Discrete Local Maxima, and Bonferroni with N=Resels; shown for Gaussian, T with 20 df and T with 10 df.]
DLM can halve the P-value when FWHM ~3 voxels
STAT_SUMMARY example: single run, hot-warm
[Figure: T statistic maps thresholded at T > 4.93 (P < 0.05, corrected) and at T>4.86. One region is detected by BON and DLM but not by RFT; another is detected by DLM, but not by BON or RFT.]
Conjunction: Minimum Ti > threshold
[Figure: maps of the minimum of Ti ‘_conj’ and the average of Ti ‘_mag_t’ (both on a scale of -6 to 6), with the thresholded versions below.]
Minimum of Ti: for P=0.05, threshold = 1.82; efficiency = 82%
Average of Ti: for P=0.05, threshold = 4.93
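The conjunction itself is just a voxel-wise minimum over the runs. A small sketch with hypothetical function names and invented T values at three voxels, using the threshold of 1.82 quoted above:

```python
def conjunction(t_maps):
    """Voxel-wise minimum of the T statistics across runs."""
    return [min(values) for values in zip(*t_maps)]

def detected(stat_map, threshold):
    """True where the statistic exceeds the threshold."""
    return [value > threshold for value in stat_map]

# Invented T statistics: 4 runs (rows) at 3 voxels (columns)
runs = [
    [5.1, 2.0, 0.3],
    [4.8, 2.5, -0.2],
    [6.0, 1.9, 2.1],
    [5.5, 3.0, 1.0],
]
minimum_t = conjunction(runs)
in_all_runs = detected(minimum_t, threshold=1.82)
print(minimum_t, in_all_runs)
```

A voxel passes only if every run exceeds the threshold, which is why the conjunction threshold can be much lower than the 4.93 needed for a single map.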
Efficiency : optimum block design
[Figure: contour plots of the sd of the hot stimulus and the sd of hot-warm, for magnitude and for delay (secs), against stimulus duration (secs, 0 to 20) and interstimulus interval (secs, 0 to 20). The optimum design is marked X in each panel; regions with short durations are marked (Not enough signal).]
Efficiency : optimum event design
[Figure: sd of effect (secs for delays; 0 to 0.5) against average time between events (5 to 20 secs), for uniform, random and concentrated event timing. Solid lines: magnitudes; dotted lines: delays. One region is marked (Not enough signal).]
How many subjects?
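• Largest portion of variance comes from the last stage i.e. combining over subjects:
$\frac{\mathrm{sd}_{\mathrm{run}}^2}{n_{\mathrm{run}}\, n_{\mathrm{sess}}\, n_{\mathrm{subj}}} + \frac{\mathrm{sd}_{\mathrm{sess}}^2}{n_{\mathrm{sess}}\, n_{\mathrm{subj}}} + \frac{\mathrm{sd}_{\mathrm{subj}}^2}{n_{\mathrm{subj}}}$
• If you want to optimize total scanner time, take more subjects.
• What you do at early stages doesn’t matter very much!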
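A quick numerical check of this argument, using the variance formula above with invented standard deviations (all set to 1) and a hypothetical function name:

```python
def group_effect_variance(sd_run, sd_sess, sd_subj, n_run, n_sess, n_subj):
    """Variance of the effect averaged over runs, sessions and subjects."""
    return (sd_run ** 2 / (n_run * n_sess * n_subj)
            + sd_sess ** 2 / (n_sess * n_subj)
            + sd_subj ** 2 / n_subj)

baseline = group_effect_variance(1.0, 1.0, 1.0, n_run=2, n_sess=2, n_subj=2)
more_runs = group_effect_variance(1.0, 1.0, 1.0, n_run=4, n_sess=2, n_subj=2)
more_subjects = group_effect_variance(1.0, 1.0, 1.0, n_run=2, n_sess=2, n_subj=4)
print(baseline, more_runs, more_subjects)
```

Doubling the number of runs doubles the scanner time but only reduces the first term. Doubling the number of subjects costs the same time and halves every term.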
Estimating the delay of the response
• Delay or latency to the peak of the HRF is approximated by a linear combination of two optimally chosen basis functions:
HRF(t + shift) ~ basis1(t) w1(shift) + basis2(t) w2(shift)
[Figure: HRF, basis1 and basis2 against t (seconds, -5 to 25), with the delay and the shift marked.]
• Convolve bases with the stimulus, then add to the linear model
• Fit linear model, estimate w1 and w2
• Equate w2 / w1 to estimates, then solve for shift (Henson et al., 2002)
• To reduce bias when the magnitude is small, use
shift / (1 + 1/T²)
where T = w1 / Sd(w1) is the T statistic for the magnitude
• Shrinks shift to 0 where there is little evidence for a response.
[Figure: w1, w2 and w2 / w1 against shift (seconds, -5 to 5).]
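The bias-reduction step is easy to illustrate. The function name below is hypothetical; the formula is rewritten as shift × T² / (1 + T²), which is algebraically the same and avoids dividing by zero when T = 0.

```python
def shrunken_shift(shift, t_magnitude):
    """shift / (1 + 1/T^2), written as shift * T^2 / (1 + T^2)."""
    t_squared = t_magnitude ** 2
    return shift * t_squared / (1.0 + t_squared)

# A raw shift of 1 second at different magnitude T statistics
for t in (0.0, 1.0, 2.0, 6.0):
    print(t, round(shrunken_shift(1.0, t), 3))
```

With no evidence for a response (T = 0) the shift is pulled all the way to 0; at T = 6 it is almost unchanged.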
Shift of the hot stimulus
[Figure, four maps: T stat for magnitude ‘_mag_t’ (scale -6 to 6); T stat for shift ‘_del_t’ (scale -6 to 6); shift (secs) ‘_del_ef’ (scale -4 to 4); sd of shift (secs) ‘_del_sd’ (scale 0 to 2).]
[Figure: the same four maps with annotations: T stat for magnitude ‘_mag_t’, T>4; T stat for shift ‘_del_t’, T~2; shift (secs) ‘_del_ef’, ~1 sec; sd of shift (secs) ‘_del_sd’, +/- 0.5 sec.]
Combining shifts of the hot stimulus
(Contours are T stat for magnitude > 4)
[Figure: columns Run 1 to Run 4 and MULTISTAT; rows effect E_i (‘_del_ef’ for runs, ‘_ef’ for MULTISTAT; scale -4 to 4), sd S_i (‘_del_sd’, ‘_sd’; scale 0 to 2) and T stat E_i / S_i (‘_del_t’, ‘_t’; scale -5 to 5).]
[Figure: shift of the hot stimulus. T stat for magnitude ‘_mag_t’ > 4.93, and shift (secs) ‘_del_ef’.]
Comparison: SPM’99 and fmristat
• Different slice acquisition times: SPM’99 adds a temporal derivative; fmristat shifts the model
• Drift removal: SPM’99 uses low frequency cosines (flat at the ends); fmristat uses splines (free at the ends)
• Temporal correlation: SPM’99 uses AR(1), global parameter, bias reduction not necessary; fmristat uses AR(p), voxel parameters, bias reduction
• Estimation of effects: SPM’99 uses a band pass filter, then least-squares, then correction for temporal correlation; fmristat pre-whitens, then least squares (no further corrections needed)
• Rationale: SPM’99 is more robust, but lower df; fmristat is more accurate, higher df
• Random effects: SPM’99 has no regularization, low df, no conjunctions; fmristat has regularization, high df, conjunctions
• Map of the delay: SPM’99 no; fmristat yes
References
• http://www.math.mcgill.ca/keith/fmristat
• Worsley et al. (2002). A general statistical analysis for fMRI data. NeuroImage, 15:1-15.
• Liao et al. (2002). Estimating the delay of the response in fMRI data. NeuroImage, 16:593-606.
Functional connectivity
• Measured by the correlation between residuals at every pair of voxels (6D data!)
[Figure: scatter plots of Voxel 2 against Voxel 1, one panel labelled "Activation only" and one labelled "Correlation only".]
• Local maxima are larger than all 12 neighbours
• P-value can be calculated using random field theory
• Good at detecting focal connectivity, but
• PCA of residuals x voxels is better at detecting large regions of co-correlated voxels
[Figure: |Correlations| > 0.7, P<10^-10 (corrected); First Principal Component > threshold.]
False Discovery Rate (FDR)
Benjamini and Hochberg (1995), Journal of the Royal Statistical Society
Benjamini and Yekutieli (2001), Annals of Statistics
Genovese et al. (2001), NeuroImage
• FDR controls the expected proportion of false
positives amongst the discoveries, whereas
• Bonferroni / random field theory controls the
probability of any false positives
• No correction controls the proportion of false
positives in the volume
[Figure: signal, noise, and signal + Gaussian white noise (scale -4 to 4), thresholded three ways with true + and false + marked:]
P < 0.05 (uncorrected), T > 1.64: 5% of volume is false +
FDR < 0.05, T > 2.82: 5% of discoveries is false +
P < 0.05 (corrected), T > 4.22: 5% probability of any false +
Comparison of thresholds
• FDR depends on the ordered P-values: P1 < P2 < … < Pn. To control the FDR at α = 0.05, find K = max {i : Pi < (i/n) α}, threshold the P-values at PK
Proportion of true +   1      0.1    0.01   0.001  0.0001
Threshold T            1.64   2.56   3.28   3.88   4.41
• Bonferroni thresholds the P-values at α/n:
Number of voxels       1      10     100    1000   10000
Threshold T            1.64   2.58   3.29   3.89   4.42
• Random field theory: resels = volume / FWHM³:
Number of resels       0      1      10     100    1000
Threshold T            1.64   2.82   3.46   4.09   4.65
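The FDR and Bonferroni rules above can be sketched with the Python standard library alone. Function names and the example P-values are hypothetical. The FDR rule uses the usual Benjamini-Hochberg "less than or equal" comparison, which agrees with the strict inequality on the slide whenever the P-values are continuous. The Bonferroni thresholds are computed for a Gaussian statistic, which is an assumption about how the table was produced; under that assumption the values agree with the Bonferroni row.

```python
from statistics import NormalDist

def fdr_threshold(p_values, alpha=0.05):
    """P-value cutoff P_K with K = max{i : P_(i) <= (i/n) alpha}, or None."""
    ordered = sorted(p_values)
    n = len(ordered)
    cutoff = None
    for i, p in enumerate(ordered, start=1):
        if p <= (i / n) * alpha:
            cutoff = p
    return cutoff

def bonferroni_z_threshold(n_voxels, alpha=0.05):
    """One-sided Gaussian threshold for a P-value cutoff of alpha / n."""
    return NormalDist().inv_cdf(1.0 - alpha / n_voxels)

# Invented P-values at 10 voxels
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
cutoff = fdr_threshold(p_values)
discoveries = [p for p in p_values if cutoff is not None and p <= cutoff]

for n in (1, 10, 100, 1000, 10000):
    print(n, round(bonferroni_z_threshold(n), 2))
```

Here only the two smallest P-values are declared discoveries. If no P-value satisfies the rule, `fdr_threshold` returns `None` and nothing is declared.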
[Figure: T statistic map thresholded three ways:]
P < 0.05 (uncorrected), T > 1.64: 5% of volume is false +
FDR < 0.05, T > 2.67: 5% of discoveries is false +
P < 0.05 (corrected), T > 4.93: 5% probability of any false +
Random fields and
brain mapping
Keith Worsley
Department of Mathematics and Statistics,
McConnell Brain Imaging Centre,
Montreal Neurological Institute,
McGill University