Overview of SPM
[Figure: the SPM analysis pipeline. Image time-series → realignment → smoothing (kernel) → general linear model (design matrix, parameter estimates) → statistical parametric map (SPM) → statistical inference (Gaussian field theory, p < 0.05); spatial normalisation to a template.]

The General Linear Model (GLM)
Frederike Petzschner
Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich & ETH Zurich
With many thanks for slides & images to: FIL Methods group, Virginia Flanagin and Klaas Enno Stephan

Imagine a very simple experiment…
• One session
• 7 cycles of rest and listening
• Blocks of 6 scans with a 7 s TR

What we measure: a single-voxel time series. What we know: the timing of the rest and listening blocks.
Question: Is there a change in the BOLD response between listening and rest?

You need a model of your data…
linear model → effects estimate + error estimate → statistic

Explain your data…
Single-voxel regression model: the BOLD signal is explained as a combination of the experimental manipulation, confounds, and error:
$y = x_1\beta_1 + x_2\beta_2 + e$
Here $x_1$ and $x_2$ are the regressors, $\beta_1$ and $\beta_2$ the parameters, and $e$ the error.

In matrix form: $y = X\beta + e$, where
• $y$: data, $n \times 1$
• $X$: design matrix, $n \times p$
• $\beta$: parameter vector, $p \times 1$
• $e$: error, $n \times 1$
• $n$: number of scans, $p$: number of regressors

The design matrix: the black and white version in SPM
$y = X\beta + e$, with the design matrix displayed as a black-and-white image.

Model assumptions
The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.
Talk: Experimental Design, Wed 9:45 – 10:45

You want to estimate your parameters such that you minimize the sum of squared errors
$\sum_{t=1}^{N} e_t^2.$
This can be done using ordinary least squares (OLS) estimation, assuming i.i.d. errors: $e \sim N(0, \sigma^2 I)$.
The GLM assumes identically and independently distributed (i.i.d.) errors, i.e. the error covariance is a scalar multiple of the identity matrix: $\mathrm{Cov}(e) = \sigma^2 I$.
Examples (for two time points $t_1, t_2$):
• i.i.d.: $\mathrm{Cov}(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
• non-identity: $\mathrm{Cov}(e) = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}$
• non-independence: $\mathrm{Cov}(e) = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$

How to fit the model and estimate the parameters?
$y = X\beta + e$
„Option 1“: by hand.
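To make the design matrix concrete, here is a minimal NumPy sketch (not part of the original slides) that builds $X$ for the toy experiment above: 7 cycles of rest and listening, 6 scans per block, with one boxcar regressor for listening and a constant for the session mean. All variable names are illustrative.

```python
import numpy as np

# Block design from the example: 7 cycles of (rest, listening), 6 scans per block
n_cycles, block_len = 7, 6
n_scans = n_cycles * 2 * block_len                  # n = 84 scans

# Boxcar regressor x1: 0 during rest, 1 during listening
boxcar = np.tile(np.r_[np.zeros(block_len), np.ones(block_len)], n_cycles)

# Design matrix X (n x p): listening regressor plus a constant modelling the mean
X = np.column_stack([boxcar, np.ones(n_scans)])
print(X.shape)                                      # (84, 2), i.e. n = 84, p = 2
```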
OLS (Ordinary Least Squares)
Data predicted by our model: $\hat{y} = X\hat{\beta}$
Error between predicted and actual data: $e = y - \hat{y} = y - X\hat{\beta}$
The goal is to determine the betas such that we minimize the quadratic error:
$\min(e^T e) = \min\big((y - X\hat{\beta})^T (y - X\hat{\beta})\big)$

Expanding the quadratic error:
$e^T e = (y - X\hat{\beta})^T (y - X\hat{\beta})$
$\qquad = y^T y - y^T X\hat{\beta} - \hat{\beta}^T X^T y + \hat{\beta}^T X^T X \hat{\beta}$
$\qquad = y^T y - 2\hat{\beta}^T X^T y + \hat{\beta}^T X^T X \hat{\beta}$
($y^T X\hat{\beta}$ is a scalar, and the transpose of a scalar is the same scalar, so $y^T X\hat{\beta} = \hat{\beta}^T X^T y$.)

You find the extremum of a function by taking its derivative and setting it to zero:
$\frac{\partial\, e^T e}{\partial \hat{\beta}} = -2X^T y + 2X^T X\hat{\beta} = 0$
$\Rightarrow\ \hat{\beta} = (X^T X)^{-1} X^T y$

A geometric perspective on the GLM
The OLS estimates are $\hat{\beta} = (X^T X)^{-1} X^T y$. Geometrically, $\hat{y} = X\hat{\beta}$ is the orthogonal projection of the data $y$ onto the design space defined by the columns of $X$ (e.g. $x_1, x_2$), and the residual $e$ is orthogonal to that space.
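As a sanity check on the closed-form estimator just derived, here is a short sketch continuing the toy design above. The data are simulated and the parameter values are illustrative, not from the slides.

```python
import numpy as np

# Rebuild the 84-scan design matrix from the earlier sketch
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([boxcar, np.ones(boxcar.size)])

# Simulate a voxel time series: y = X beta + e, with i.i.d. Gaussian errors
rng = np.random.default_rng(0)
beta_true = np.array([2.0, 100.0])                  # [listening effect, session mean], illustrative
y = X @ beta_true + rng.normal(0.0, 1.0, size=X.shape[0])

# OLS: beta_hat = (X^T X)^{-1} X^T y; the pseudoinverse is the numerically stable route
beta_hat = np.linalg.pinv(X) @ y
y_hat = X @ beta_hat                                # data predicted by the model
e = y - y_hat                                       # residuals, orthogonal to the design space
print(beta_hat)                                     # close to beta_true
```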
Correlated and orthogonal regressors
Model: $y = x_1\beta_1 + x_2\beta_2 + e$. With correlated regressors, explained variance is shared between the regressors; in the example, $\beta_1 = \beta_2 = 1$.
After orthogonalizing $x_2$ with respect to $x_1$: $y = x_1\beta_1 + x_2^*\beta_2^* + e$, with $\beta_2^* = 1$.
When $x_2$ is orthogonalized with regard to $x_1$, only the parameter estimate for $x_1$ changes, not that for $x_2$!

We are nearly there…
linear model → effects estimate + error estimate → statistic
What are the problems?
1. BOLD responses have a delayed and dispersed form.
2. The BOLD signal includes substantial amounts of low-frequency noise.
3. The data are serially correlated (temporally autocorrelated); this violates the assumptions of the noise model in the GLM.

Problem 1: Shape of the BOLD response
The response of a linear time-invariant (LTI) system is the convolution of the input with the system's response to an impulse (delta function):
$(f \otimes g)(t) = \int_0^t f(\tau)\, g(t - \tau)\, d\tau$

Solution 1: Convolution model of the BOLD response
expected BOLD response = input function ⊗ impulse response function (HRF)
[Figure: blue = data; green = predicted response, convolved with the HRF; red = predicted response, NOT taking the HRF into account.]

Problem 2: Low-frequency noise
[Figure: blue = data; black = mean + low-frequency drift; green = predicted response, taking the low-frequency drift into account; red = predicted response, NOT taking it into account.]

Solution 2: High-pass filtering
Model the slow drift with a discrete cosine transform (DCT) basis set and remove it from the data.

Problem 3: Serial correlations
Recall the i.i.d., non-identity, and non-independence covariance examples: in fMRI the $n \times n$ error covariance $\mathrm{Cov}(e)$ ($n$ = number of scans) is non-diagonal because of temporal autocorrelation. The errors are well described by a 1st-order autoregressive process, AR(1):
$e_t = a\, e_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim N(0, \sigma^2)$,
which induces a characteristic autocovariance function over the error covariance matrix.

Solution 3: Pre-whitening
1. Use an enhanced noise model with multiple error covariance components, i.e. $e \sim N(0, \sigma^2 V)$ instead of $e \sim N(0, \sigma^2 I)$.
2. Use the estimated serial correlation to specify a filter matrix $W$ for whitening the data:
$Wy = WX\beta + We$, where $We$ is i.i.d.

How do we define W?
• Enhanced noise model: $e \sim N(0, \sigma^2 V)$
• Remember the linear transform for Gaussians: $x \sim N(\mu, \sigma^2),\ y = ax \ \Rightarrow\ y \sim N(a\mu, a^2\sigma^2)$
• Choose $W$ such that the error covariance becomes spherical: $We \sim N(0, \sigma^2 WVW^T)$; setting $W^2 V = I$ gives $W = V^{-1/2}$
• Conclusion: $W$ is a simple function of $V$, so how do we estimate $V$?

Find V: Multiple covariance components
Enhanced noise model: $e \sim N(0, \sigma^2 V)$, with $V = \sum_i \lambda_i Q_i$, i.e. a linear combination of error covariance components $Q_i$ weighted by hyperparameters $\lambda_i$ (e.g. $V = \lambda_1 Q_1 + \lambda_2 Q_2$).
Estimation of the hyperparameters with EM (expectation maximisation) or ReML (restricted maximum likelihood).

We are there…
linear model → effects estimate + error estimate → statistic
• The GLM includes all known experimental effects and confounds
• Convolution with a canonical HRF
• High-pass filtering to account for low-frequency drifts
• Estimation of multiple variance components (e.g. to account for serial correlations)
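The summary above maps onto a short end-to-end sketch: convolve the boxcar with an HRF (Problem 1), add DCT drift regressors (Problem 2), pre-whiten under an AR(1) model (Problem 3), fit by OLS, and compute the contrast t-statistic previewed on the next slide. This is a toy illustration, not SPM's implementation; the double-gamma HRF parameters, the fixed AR coefficient (SPM estimates $V$ via ReML), and all names are assumptions.

```python
import numpy as np
from scipy.stats import gamma

TR = 7.0                                            # seconds, from the toy experiment
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7) # 7 cycles of rest/listening
n_scans = boxcar.size                               # 84 scans

# --- Problem 1: convolve the stimulus function with an HRF ----------------
# Illustrative double-gamma HRF (peak near 6 s, undershoot near 16 s); SPM's
# canonical HRF has this shape but is produced by spm_hrf, not copied here.
t = np.arange(0.0, 32.0, TR)
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
hrf /= hrf.sum()
x_task = np.convolve(boxcar, hrf)[:n_scans]         # expected BOLD response

# --- Problem 2: DCT high-pass drift regressors ----------------------------
cutoff = 128.0                                      # seconds; a common high-pass cutoff
n_basis = int(2 * n_scans * TR / cutoff) + 1
frames = np.arange(n_scans)
drift = np.column_stack([np.cos(np.pi * k * (2 * frames + 1) / (2 * n_scans))
                         for k in range(1, n_basis)])
X = np.column_stack([x_task, drift, np.ones(n_scans)])

# --- Problem 3: pre-whitening with W = V^{-1/2} under an AR(1) model ------
a = 0.3                                             # AR(1) coefficient, fixed here for illustration
V = a ** np.abs(np.subtract.outer(frames, frames))  # AR(1) covariance, up to sigma^2
evals, evecs = np.linalg.eigh(V)                    # V is symmetric positive definite
W = evecs @ np.diag(evals ** -0.5) @ evecs.T        # matrix inverse square root

# Simulated voxel data (white noise for simplicity, so whitening is benign here)
rng = np.random.default_rng(1)
beta_true = np.r_[2.0, np.zeros(X.shape[1] - 2), 100.0]
y = X @ beta_true + rng.normal(size=n_scans)

# Whitened GLM: Wy = WX beta + We, then OLS as before
Wy, WX = W @ y, W @ X
beta_hat = np.linalg.pinv(WX) @ Wy

# --- Contrast t-statistic, t = c^T beta_hat / Std(c^T beta_hat) -----------
c = np.zeros(X.shape[1])
c[0] = 1.0                                          # pick out the listening effect
res = Wy - WX @ beta_hat
dof = WX.shape[0] - np.linalg.matrix_rank(WX)
sigma2 = (res @ res) / dof                          # estimate of sigma^2
t_stat = (c @ beta_hat) / np.sqrt(sigma2 * (c @ np.linalg.pinv(WX.T @ WX) @ c))
print(beta_hat[0], t_stat)
```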
We are there…
To test for an effect we specify a contrast vector, e.g. $c = [1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0]^T$, test the null hypothesis $\beta_1 = 0$, and compute the t-statistic
$t = \dfrac{c^T \hat{\beta}}{\mathrm{Std}(c^T \hat{\beta})}$
Talk: Statistical Inference and design efficiency (next talk).

We are there…
• Mass-univariate approach: the GLM is applied to > 100,000 voxels, one single-voxel time series at a time.
• At a threshold of p < 0.05, more than 5,000 voxels are significant by chance!
• Massive problem with multiple comparisons!
• Solution: Gaussian random field theory

Outlook: further challenges
• Correction for multiple comparisons (Talk: Multiple Comparisons, Wed 8:30 – 9:30)
• Variability in the HRF across voxels (Talk: Experimental Design, Wed 9:45 – 10:45)
• Limitations of frequentist statistics (Talk: entire Friday)
• The GLM ignores interactions among voxels (Talk: Multivariate Analysis, Thu 12:30 – 13:30)

Thank you!

References
• Friston K, Ashburner J, Kiebel S, Nichols T, Penny W (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier.
• Christensen R (1996) Plane Answers to Complex Questions: The Theory of Linear Models. Springer.
• Friston KJ et al. (1995) Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2: 189–210.