Spatial smoothing of autocorrelations to control the degrees of freedom in fMRI analysis

Keith Worsley
Department of Mathematics and Statistics, McGill University
McConnell Brain Imaging Centre, Montreal Neurological Institute

fMRI data: 120 scans, 3 scans each of hot, rest, warm, rest, hot, rest, …
[Figure: first scan of fMRI data; time series (time in seconds) at a voxel with a highly significant effect, T = 6.59, and at a voxel with no significant effect, T = -0.74, each shown against the hot/rest/warm design, plus the drift; map of the T statistic for the hot - warm effect.]
T = (hot - warm effect) / S.d. ~ $t_{110}$ if no effect

FMRISTAT: fits a linear model for fMRI time series with AR(p) errors
• Linear model (unknown parameters $b$, $c$):
  $Y_t = (\mathrm{stimulus}_t * \mathrm{HRF})\, b + \mathrm{drift}_t\, c + \mathrm{error}_t$
• AR(p) errors (unknown parameters $a_1, \dots, a_p, s$):
  $\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + \dots + a_p\, \mathrm{error}_{t-p} + s\, \mathrm{WN}_t$

DESIGN example: pain perception
Alternating hot and warm stimuli separated by rest (9 seconds each).
[Figure: stimulus time course, 0 to 350 seconds.]
Hemodynamic response function: difference of two gamma densities
[Figure: HRF against time, 0 to 50 seconds.]
Responses = stimuli * HRF, sampled every 3 seconds
[Figure: responses against time in seconds, 0 to 350.]

First step: estimate the autocorrelation
AR(1) model: $\mathrm{error}_t = a_1\, \mathrm{error}_{t-1} + s\, \mathrm{WN}_t$
• Fit the linear model using least squares
• $\mathrm{error}_t = Y_t - \text{fitted } Y_t$
• $\hat{a}_1 = \mathrm{Correlation}(\mathrm{error}_t, \mathrm{error}_{t-1})$
• Estimating the $\mathrm{error}_t$'s changes their correlation structure slightly, so $\hat{a}_1$ is slightly biased:
[Figure: maps labelled "Raw autocorrelation", "Smoothed 12.4mm" and "Bias corrected $\hat{a}_1$", with annotations "~ -0.05" and "~ 0"; colour scale -0.1 to 0.3.]

Second step: refit the linear model
Pre-whiten: $Y_t^* = Y_t - \hat{a}_1 Y_{t-1}$, then fit using least squares:
[Figure: maps of the hot - warm effect (%), the Sd of the effect (%), and T = effect / sd with 100 df; T > 4.93 (P < 0.05, corrected).]

Why bother to smooth the acor?
• Sample variability in the estimated acor adds variability to the sd
• Lowers the effective df of the T statistic
• Increases the threshold
• Less power
• Particularly after correction for search
[Figure: threshold against df (0 to 150), for one voxel and corrected for whole-brain search.]

Gautama et al. (2005): smooth the autocorrelations, choosing the amount of smoothing to optimally predict the autocorrelations using e.g. cross-validation or model selection.

Effect of variability in sample acor on dbn of T: first idea
• Why not write the linear model with e.g. AR(1) errors,
  $Y_t = x_t'\beta + \eta_t$, $\eta_t = a_1 \eta_{t-1} + \varepsilon_t$, where $\varepsilon_t$ iid ~ $N(0, \sigma^2)$,
  as
  $Y_t = a_1 Y_{t-1} + x_t'\beta - x_{t-1}'(a_1\beta) + \varepsilon_t$ ?
• Least-squares estimates are approximately maximum likelihood, so
• Non-linear l.s.: $\mathrm{df}_\mathrm{eff} \approx n - (\#a) - (\#\beta)$ …. ???? or
• Linear l.s.: $\mathrm{df}_\mathrm{eff} \approx n - (\#a) - (\#\beta) - (\#a)\times(\#\beta)$ …. ????
• Doesn't work (see later) because:
  – the design matrix is random?
  – approximately maximum likelihood only for large samples, i.e. df = ∞?

Better idea: Harville et al. (1974), …, Kenward, Roger (1997) … SAS PROC MIXED …
• Linear model at a single voxel: $Y \sim N_n(X\beta, V(\theta))$, $\theta = (\sigma^2, a_1, \dots, a_p)$
• Fit by ReML; interested in the effect $E = c'\beta$, $S = \mathrm{Sd}(E)$
• $T = E/S$
• $E$ depends on $\beta$, $S$ depends on $\theta$
• $\beta$, $\theta$ are approximately independent, so variability in $\theta$ only affects $S$

Continued …
• $S$ depends on $\theta$, and from ReML theory we know the approximate mean and variance of $\theta$
• Use a linear approximation to $S^2(\theta)$ to find the approximate mean and variance of $S^2$
• $\mathrm{df}_\mathrm{eff}$ is a surrogate for the variability of $S^2$: $\mathrm{df}_\mathrm{eff} := 2\, E(S^2)^2 / \mathrm{Var}(S^2)$
• Satterthwaite: $S^2 \sim \mathrm{cons} \times \chi^2_{\mathrm{df}_\mathrm{eff}}$, $T \sim t_{\mathrm{df}_\mathrm{eff}}$

Expression for $\mathrm{df}_\mathrm{eff}$
• $\mathrm{df}_\mathrm{eff}$ depends on the contrast (!)
and θ
  – Could plug in θ, but we don't know θ in advance
  – Explicit expression if acors = 0
  – Hope it is a good approximation when acors ≠ 0
• Contrast in the observations: $x = X(X'X)^{-1}c$, so $E = x'Y$
• $\tau_j$ = lag-$j$ acor of $x$, $\mathrm{df}_\mathrm{residual}$ = least-squares df
• $1/\mathrm{df}_\mathrm{eff} = 1/\mathrm{df}_\mathrm{residual} + 2(\tau_1^2 + \dots + \tau_p^2)/\mathrm{df}_\mathrm{residual}$

Effect of smoothing acor
• Assume ε ~ white noise smoothed by a Gaussian filter of width $\mathrm{FWHM}_\mathrm{data}$, i.e. GRF($\mathrm{FWHM}_\mathrm{data}$)
• Autocors ~ GRF($\mathrm{FWHM}_\mathrm{data}/\sqrt{2}$)
• Smoothing the acors in $D$ dimensions by $\mathrm{FWHM}_\mathrm{acor}$ reduces their variance by the factor
  $f = (2\, \mathrm{FWHM}_\mathrm{acor}^2 / \mathrm{FWHM}_\mathrm{data}^2 + 1)^{D/2}$
• Define $\mathrm{df}_\mathrm{acor} := f\, \mathrm{df}_\mathrm{residual}$
• $1/\mathrm{df}_\mathrm{eff} = 1/\mathrm{df}_\mathrm{residual} + 2(\tau_1^2 + \dots + \tau_p^2)/\mathrm{df}_\mathrm{acor}$

[Figure: four panels of effective df against $\mathrm{FWHM}_\mathrm{filter}/\mathrm{FWHM}_\mathrm{data}$ (0 to 4) for the contrasts Hot ($\tau_1$ = 0.61), Hot + Warm ($\tau_1$ = 0.5), Hot - Warm ($\tau_1$ = 0.79) and Cubic drift ($\tau_1$ = 0.94). Each panel marks residual df = 114 and shows the theory curve for $a_1$ = 0 together with simulations for $a_1$ = 0, 0.1, 0.2, 0.3, 0.4.]

Summary
• Variability in the acor lowers df
• Df depends on the contrast
• Smoothing the acor brings df back up:
  $\mathrm{df}_\mathrm{acor} = \mathrm{df}_\mathrm{residual}\, (2\, \mathrm{FWHM}_\mathrm{acor}^2 / \mathrm{FWHM}_\mathrm{data}^2 + 1)^{3/2}$
  $1/\mathrm{df}_\mathrm{eff} = 1/\mathrm{df}_\mathrm{residual} + 2\, \mathrm{acor}(\text{contrast of data})^2 / \mathrm{df}_\mathrm{acor}$

Applications: $\mathrm{FWHM}_\mathrm{data}$ = 8.79
[Figure: two panels of $\mathrm{df}_\mathrm{eff}$ against $\mathrm{FWHM}_\mathrm{acor}$ (0 to 30), each with residual df = 110 and target = 100 df. Hot stimulus: contrast of data, acor = 0.61, FWHM = 10.3mm. Hot-warm stimulus: contrast of data, acor = 0.79, FWHM = 12.4mm.]

Application: Hot - warm stimulus
[Figure: maps of the autocorrelation $a_1$ and of the T statistic for hot-warm, with no smoothing and with 12.4mm FWHM smoothing. No smoothing: autocorrelation map labelled effective df = 110; T map labelled effective df = 49, threshold = 5.25. 12.4mm FWHM smoothing: autocorrelation map labelled effective df = 1249; T map labelled effective df = 100, threshold = 4.93. Thresholds are for P = 0.05, corrected.]

Refinements
• Could get a rough
estimate of the acor first, then use this to get a better estimate of $\mathrm{df}_\mathrm{eff}$, but this is time-consuming
• The acor varies spatially, so $\mathrm{df}_\mathrm{eff}$ varies spatially, but we don't have any random field theory for P-values in that case
• Could use a spatially varying filter to achieve approximately constant $\mathrm{df}_\mathrm{eff}$, but again this is time-consuming
• All the theory is built on asymptotic and/or questionable assumptions, so maybe we can't take it too far …
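
The two formulas on the Summary slide can be turned into a few lines of arithmetic. The sketch below is only an illustration of those formulas, not FMRISTAT code: the function names (`variance_reduction`, `effective_df`, `fwhm_acor_for_target`) are hypothetical, and it assumes the explicit acors = 0 expression is adequate, as the slides hope.

```python
import math


def variance_reduction(fwhm_acor, fwhm_data, dims=3):
    """Factor f by which smoothing the acors reduces their variance:
    f = (2 FWHM_acor^2 / FWHM_data^2 + 1)^(D/2)."""
    return (2.0 * (fwhm_acor / fwhm_data) ** 2 + 1.0) ** (dims / 2.0)


def effective_df(df_residual, taus, fwhm_acor=0.0, fwhm_data=1.0, dims=3):
    """1/df_eff = 1/df_residual + 2 * sum(tau_j^2) / df_acor,
    with df_acor = f * df_residual. taus are the lag-j acors of the contrast."""
    df_acor = variance_reduction(fwhm_acor, fwhm_data, dims) * df_residual
    tau_sq = sum(t * t for t in taus)
    return 1.0 / (1.0 / df_residual + 2.0 * tau_sq / df_acor)


def fwhm_acor_for_target(df_target, df_residual, taus, fwhm_data, dims=3):
    """Invert the formula: the FWHM_acor that gives df_eff = df_target."""
    gap = 1.0 / df_target - 1.0 / df_residual
    if gap <= 0.0:
        # df_eff can approach df_residual but never reach or exceed it.
        raise ValueError("df_target must be below df_residual")
    tau_sq = sum(t * t for t in taus)
    f_needed = 2.0 * tau_sq / (gap * df_residual)
    if f_needed <= 1.0:
        return 0.0  # the target is already met without smoothing
    return fwhm_data * math.sqrt((f_needed ** (2.0 / dims) - 1.0) / 2.0)


# Hot - warm contrast from the slides: residual df = 110, lag-1 acor = 0.79.
df_unsmoothed = effective_df(110, [0.79])
width = fwhm_acor_for_target(100, 110, [0.79], fwhm_data=8.79)
```

With these inputs the unsmoothed effective df comes out at about 49, which agrees with the Application slide. The width returned for a target of 100 df is in the same range as the 12.4mm quoted there, but exact agreement should not be expected, since the inputs printed on the slides are rounded.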