Accurate and Computationally Efficient Analysis of Longitudinal fMRI Data

Bryan Guillaume, Thomas E. Nichols & Lourens Waldorp
University of Warwick, Coventry, England and University of Amsterdam, The Netherlands

Introduction

There is a growing need for longitudinal analyses of structural and functional MRI data. Standard software, SPM and FSL in particular, cannot accurately model these data when there are k > 2 visits, and cannot accommodate between-subject covariates (e.g. gender) within their repeated measures models. In this work we propose the use of Ordinary Least Squares (OLS) combined with sandwich estimator (SwE) standard errors [1] to provide fast and valid inferences. We compare this approach to the naive OLS (N-OLS) longitudinal model typically used in FSL & SPM, and to Generalized Least Squares (GLS) [2]. N-OLS is obtained by including subject indicator variables as covariates; while this is fast, it is only correct for balanced designs with a certain "compound symmetric" (CS) covariance structure, and it precludes fitting subject-level covariates. GLS is the gold standard (used in R's lmer & SAS's PROC MIXED) but is slow and may fail to converge.

Methods

We compare three univariate models for inference on longitudinal data (N-OLS, GLS, and SwE) via Monte Carlo simulation (10,000 simulations for each setting). The settings cover a range of subject sample sizes (n = 12, 25, 50, 100, 200) and numbers of visits (k = 3, 5, 8), with compound symmetric (ρ = 0.9) and non-compound symmetric (Toeplitz, [0.9, 0.8, ...]) intra-visit correlation structures. We evaluate the properties of the estimates of a within-subject covariate (the linear effect of visit number), and of a pure between-subject covariate (a random covariate value assigned to each subject, akin to initial age). We also compare the three methods with an unbalanced design matrix taken from a real fMRI study with 41 subjects and 2 to 3 visits per subject [3].

For each setting, we evaluate: (1) relative SE (standard error) bias, the relative average error in the estimated variance of the parameter of interest; (2) relative P-value bias, the mismatch between nominal α and observed false positive rate (FPR); and (3) relative SE stdev, the standard deviation of the standard error estimates (normalized to the mean Monte Carlo standard error). We also evaluate 4 different versions of the SwE that differ by the assumption of homogeneity ("Hom") or heterogeneity ("Het", as suggested in [4]) between subjects, and by the use of unstandardized residuals ("type A") or standardized residuals ("type B", as suggested in [5] and [6]).

Results

Under CS and a balanced design, N-OLS and GLS give identical performance, and the SwE similar performance (Fig. 1). But with a non-CS (Toeplitz) correlation, N-OLS and GLS give appreciable bias, with false positives up to 7× nominal (Fig. 2). For a between-subject covariate, N-OLS cannot be used, and SwE gives similar performance to GLS (Fig. 3). In each instance, the "SwE Hom Type B" was the most accurate, i.e. SwE assuming homogeneous variance over subjects & using standardized residuals. With the design from the real study, we find similar results, with SwE giving performance similar to GLS & N-OLS under CS and the best performance without CS correlation (Fig. 4).

Figure 1: Linear effect of visit with compound symmetry; in the rows are relative bias in the standard error and relative FPR of N-OLS, GLS and SwE. [Plots not reproduced. Panels: k = 3, 5 and 8 visits; x-axis: number of subjects; y-axes: % relative bias, % relative FPR; curves: OLS/GLS, SwE-Het. type A/SwE-Hom. type A, SwE-Het. type B, SwE-Hom. type B.]

Figure 2: Linear effect of visit with non-compound symmetry (Toeplitz structure); same format as Figure 1 otherwise. [Plots not reproduced.]

Figure 3: Effect of a pure between covariate with compound symmetry; in the rows are relative bias in the standard error and relative FPR of GLS and SwE. [Plots not reproduced. Panels: k = 3, 5 and 8 visits; x-axis: number of subjects; y-axes: % relative bias, % relative FPR; curves: GLS, SwE-Het. type A, SwE-Het. type B, SwE-Hom. type A, SwE-Hom. type B.]

Figure 4: Performance with imbalanced real study design, inference on difference of slopes (age-dependent BOLD) between two groups. In the columns are relative bias in the standard error, relative FPR, and relative SE stdev of N-OLS, GLS and SwE; in the rows are compound symmetric and Toeplitz structure for the intra-visit correlation. [Plots not reproduced. Panel title: effect of age in a particular group versus the effect of age in a group of reference (group 2 vs group 1); y-axes: % relative bias, % relative FPR, sd/mean of SE; curves: OLS, GLS, SwE-Het. type A, SwE-Het. type B, SwE-Hom. type A, SwE-Hom. type B.]

Discussion

Our results show that OLS with sandwich estimator standard errors provides accurate inferences in a variety of settings and the ability to estimate between-subject effects without resorting to GLS.

References

[1] White (1981). JASA, 76(374):419-433.
[2] Laird & Ware (1982). Biometrics, 38(4):963-974.
[3] Heitzeg et al. (2008). Alcoholism: Clin. & Exp. Res., 32:414-426.
[4] Diggle et al. (1994). Analysis of Longitudinal Data. OUP.
[5] MacKinnon & White (1985). J. Econometrics, 29:305-325.
[6] Long & Ervin (2000). Am. Statistician, 54:217-224.

Contact

Bryan.Guillaume@doct.ulg.ac.be
t.e.nichols@warwick.ac.uk, http://go.warwick.ac.uk/tenichols
waldorp@uva.nl, http://home.medewerker.uva.nl/l.j.waldorp
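
Illustrative sketch (not from the poster). The OLS-plus-sandwich approach compared in the Methods can be sketched in NumPy as below. The poster does not state the estimator formulas, so several things here are assumptions: the standard subject-clustered sandwich form (X'X)^-1 [sum over subjects of X_i' V_i X_i] (X'X)^-1; "Het" taken to mean V_i = e_i e_i' per subject; "Hom" taken to mean one residual covariance pooled over subjects (which needs a balanced design with visits in the same order for every subject); and "type B" taken to mean HC2-style residuals e / sqrt(1 - h) with h the leverage. The function name `ols_sandwich` and its parameters are hypothetical, and the simulated data only mimic one compound symmetric setting (n = 50, k = 3, ρ = 0.9).

```python
import numpy as np


def ols_sandwich(X, y, subject, homogeneous=False, standardize=False):
    """OLS estimates with a subject-clustered sandwich covariance (sketch).

    homogeneous=False -> assumed "Het": each subject contributes e_i e_i'.
    homogeneous=True  -> assumed "Hom": one covariance pooled over subjects
                         (balanced design, same visit order per subject).
    standardize=True  -> assumed "type B": residuals divided by sqrt(1 - h).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    subject = np.asarray(subject)

    bread = np.linalg.inv(X.T @ X)          # (X'X)^-1
    beta = bread @ X.T @ y                  # OLS estimates
    resid = y - X @ beta

    if standardize:
        # Leverage h_i = x_i' (X'X)^-1 x_i for every row of X.
        leverage = np.einsum("ij,jk,ik->i", X, bread, X)
        resid = resid / np.sqrt(1.0 - leverage)

    ids = np.unique(subject)
    p = X.shape[1]
    meat = np.zeros((p, p))
    if homogeneous:
        # Stack residuals as (subjects x visits); fails if unbalanced.
        E = np.stack([resid[subject == s] for s in ids])
        V = E.T @ E / len(ids)              # pooled visit-by-visit covariance
        for s in ids:
            Xi = X[subject == s]
            meat += Xi.T @ V @ Xi
    else:
        for s in ids:
            gi = X[subject == s].T @ resid[subject == s]
            meat += np.outer(gi, gi)        # X_i' e_i e_i' X_i

    cov = bread @ meat @ bread
    return beta, cov


# Simulated example: null data with compound symmetric correlation.
rng = np.random.default_rng(0)
n, k, rho = 50, 3, 0.9
subject = np.repeat(np.arange(n), k)
visit = np.tile(np.arange(k) - (k - 1) / 2, n)   # within-subject covariate
age = np.repeat(rng.normal(size=n), k)           # pure between-subject covariate
X = np.column_stack([np.ones(n * k), visit, age])
V_cs = rho * np.ones((k, k)) + (1 - rho) * np.eye(k)
y = rng.multivariate_normal(np.zeros(k), V_cs, size=n).ravel()

beta, cov = ols_sandwich(X, y, subject, homogeneous=True, standardize=True)
se = np.sqrt(np.diag(cov))                       # sandwich standard errors
print("beta:", beta)
print("SE:  ", se)
```

Because the between-subject covariate enters the design directly, no subject indicator columns are needed, which is what lets this model estimate effects that N-OLS cannot. Repeating the simulation many times and comparing the mean of `se**2` with the Monte Carlo variance of `beta` would give a relative SE bias of the kind reported in the figures.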