Optimization of Designs for fMRI IPAM Mathematics in Brain Imaging Summer School July 18, 2008 Thomas Liu, Ph.D. UCSD Center for Functional MRI T.T. Liu July 18, 2008 T.T. Liu July 18, 2008 Why optimize? • • • • Scans are expensive. Subjects can be difficult to find. fMRI data are noisy A badly designed experiment is unlikely to yield publishable results. • Time = Money If your result needs a statistician then you should design a better experiment. --Baron Ernest Rutherford T.T. Liu July 18, 2008 What to optimize? • Statistical Efficiency: maximize contrast of interest versus noise. • Psychological factors: is the design too boring? Minimize anticipation, habituation, boredom, etc. T.T. Liu July 18, 2008 Which is the best design? It depends on the experimental question. E T.T. Liu July 18, 2008 Possible Questions • Where is the activation? • What does the hemodynamic response function (HRF) look like? T.T. Liu July 18, 2008 Model Assumptions 1) Assume we know the shape of the HRF but not its amplitude. 2) Assume we know nothing about the HRF (neither shape nor amplitude). 3) Assume we know something about the HRF (e.g. it’s smooth). T.T. Liu July 18, 2008 General Linear Model Design Matrix Data y Additive Nuisance Gaussian Matrix Noise = Xh + Parameters of Interest T.T. Liu July 18, 2008 Sb + n Nuisance Parameters Example 1: Assumed HRF shape Stimulus Convolve w/ HRF Design matrix depends on both stimulus and HRF T.T. Liu July 18, 2008 0 1 0 1 0.5 1 1 1 y 1 h1 1 1 1 0.5 1 0 1 0 1 1 1 2 2 0 3 4 3 b1 5 1 b 6 2 1 7 2 .5 8 9 .2 Parameter = amplitude of response Design Regressor The process can be modeled by convolving the activity curve with a "hemodynamic response function" or HRF = HRF Predicted neural activity Predicted response T.T. Liu July 18, 2008 Courtesy of FSL Group and Russ Poldrack Example 2: Unknown HRF shape h1 h2 h3 h4 Note: Design matrix only depends on stimulus, not HRF T.T. Liu July 18, 2008 0 0 0 1 y 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 1 0 h1 0 1 h2 0 1 h3 0 1 h4 1 1 1 1 1 1 1 1 2 2 0 3 4 3 b1 5 1 b2 6 1 7 2 .5 8 9 .2 Unknown shape and amplitude FIR design matrix FIR estimates T.T. Liu July 18, 2008 Courtesy of Russ Poldrack Example 3: Basis Functions If we know something about the shape, basis function expansion : h Bc we can use a 4 basis functions 5 random HDRs using basis functions 5 random HDRs w/o basis functions T.T. Liu July 18, 2008 Here if we assume basis functions, we only need to estimate 4 parameters as opposed to 20. Test Statistic Stimulus, neural activity, field strength, vascular state parameter estimate t variance of parameter estimate Thermal noise, physiological noise, low frequency drifts, motion Also depends on Experimental Design!!! T.T. Liu July 18, 2008 Efficiency 1 Efficiency Variance of Parameter Estimate T.T. Liu July 18, 2008 Covariance Matrix cov hˆ1, hˆ 2 var hˆ cov hˆ N , hˆ 2 var hˆ 1 ˆ , hˆ cov h 2 1 cov hˆ cov hˆ N , hˆ1 2 Known Efficiency T.T. Liu July 18, 2008 cov hˆ1, hˆ N cov hˆ 2 , hˆ N var hˆ N 1 Trace cov hˆ Known as an A-optimal design Efficiency Example 1 : Efficiency 1 Var hˆ 1 Example 2 : Efficiency T.T. Liu July 18, 2008 1 Var hˆ 1 Var hˆ 2 Var hˆ 3 Var hˆ 4 What does a matrix do? y Xh N-dimensional vector k-dimensional vector N x k matrix The matrix maps from a k-dimensional space to a N-dimensional space. T.T. Liu July 18, 2008 Matrix Geometry Geometric fact: The image of the a kdimensional unit sphere under any N x k matrix is an N-dimensional hyperellipse. Xv1= 1u1 1u1 v1 k-dimensional unit sphere T.T. Liu July 18, 2008 N-dimensional hyperellipse Singular Value Decomposition Xv1= 1u1 v2 v1 1u1 2u2 The right singular vectors v1 and v 2 are transformed into scaled vectors 1 u1 and 2 u2 , where u1 and u2 are the left singular vectors and 1 and 2 are the singular values. The singular values are the k square roots of the eigenvalues of the kxk matrix XT X. T.T. Liu July 18, 2008 Assumed HDR shape Parameter Space Data Space Xv1= 1u1 v1 h0 Parameter Noise Space Xh0 1u1 h0 Good for Detection Efficiency here is optimized by amplifying the singular vector closest to the assumed HDR. This corresponds to maximizing one singular value while minimizing the others. T.T. Liu July 18, 2008 No assumed HDR shape Parameter Space Data Space Xv1= 1u1 Parameter Noise Space 1u1 v1 Here the HDR can point in any direction, so we don’t want to preferentially amplify any one singular value. This corresponds to an equal distribution of singular values. T.T. Liu July 18, 2008 Some assumptions about shape If no assumptions, then use equal singular values. If we know the the HDR lies within a subspace spanned by a set of basis functions, we should maximize the singular values in this subspace and minimize outside of this subspace. T.T. Liu July 18, 2008 Knowledge (Assumptions) about HRF None Some Make singular Maximize singular values as equal as values associated possible with subspace that contains HDR T.T. Liu July 18, 2008 Total Maximize singular value associated with right singular vector closest to HDR Singular Values and Power Spectrum The singular values i are the k square roots of the eigenvalues of the The eigenvalues of the T kxk matrix X X. kxk matrix XT X asymptotically approach the the power spectral coefficients of the first T row of X X - - i.e. the power spectrum of the design. *(assume mean has been removed from design) T.T. Liu July 18, 2008 Knowledge (Assumptions) about HRF Some None Equally distributed singular values = flat power spectrum Total Set of dominant singular values = spread of power spectral components f One dominant singular value = one dominant power spectral component f Power Spectra T.T. Liu July 18, 2008 f Frequency Domain Interpretation Boxcar Randomized ISI single-event Regressor Low-pass FFT T.T. Liu July 18, 2008 Adapted From S. Smith and R. Poldrack Low-pass Which is the best design? E T.T. Liu July 18, 2008 Given an assumed shape h0, which design maximizes Xh 0 ? Xv1 = 1 u1 v1 E T.T. Liu July 18, 2008 h0 1 u1 Xh0 Which design will have the most equal spread of singular values? E T.T. Liu July 18, 2008 Which design will have the flattest power spectrum? f Which design will have a tailored spread of singular values? Which design will have a shaped power spectrum? E T.T. Liu July 18, 2008 f Detection Power When detection is the goal, we want to answer the question: is an activation present or not? When trying to detect something, one needs to specify some knowledge about the “target”. In fMRI, the target is approximated by the convolution of the stimulus with the HRF. Once we have specified our target (e.g. stimuli and assumed HRF shape), the efficiency for estimating the amplitude of that target can be considered our detection power. T.T. Liu July 18, 2008 Theoretical Trade-off HDR Estimation Efficiency Equal Singular values One Dominant Singular value Detection Power T.T. Liu July 18, 2008 T.T. Liu July 18, 2008 1 0.5 0.5 90 0 45 0 0 0.5 1 Detection power R (c) k = 10 1 0 90 63 45 0 0 0.5 1 Detection power R (d) k = 15 1 0.5 0.5 90 72 0 (b) k = 5 1 Estimation Efficiency Estimation Efficiency Estimation Efficiency (a) k = 2 Estimation Efficiency Theoretical Curves 45 0 0 0.5 1 Detection power R 0 90 75 45 0 0 0.5 1 Detection power R Efficiency vs. Power Estimation Efficiency 1.2 1 A random B 0.8 0.6 C 0.4 0.2 0 4 2 1 8 16 32 Periodic Single Trial D -0.2 0 T.T. Liu July 18, 2008 0.2 0.4 0.6 0.8 Detection Power 1 1.2 Efficiency with Basis Functions T.T. Liu July 18, 2008 Knowledge (Assumptions) about HRF Some None Experiments where you want to characterize in detail the shape of the HDR. Total Experiments where you have a good guess as to the shape (either a canonical form or measured HDR) and want to detect activation. A reasonable compromise between 1 and 2. Detect activation when you sort of know the shape. Characterize the shape when you sort of know its properties T.T. Liu July 18, 2008 Question If block designs are so good for detecting activation, why bother using other types of designs? Problems with habituation and anticipation Less Predictable T.T. Liu July 18, 2008 Entropy Perceived randomness of an experimental design is an important factor and can be critical for circumventing experimental confounds such as habituation and anticipation. Conditional entropy is a measure of randomness in units of bits. Rth order conditional entropy (Hr) is the average number of binary (yes/no) questions required to determine the next trial type given knowledge of the r previous trial types. 2 Hr T.T. Liu July 18, 2008 is a measure of the average number of possible outcomes. Entropy Example AA N AA N AAA N Maximum entropy is 1 bit, since at most one needs to only ask one question to determine what the next trial is (e.g. is the next trial A?). With maximum entropy, 21 = 2 is the number of equally likely outcomes. A C B N C B AA B C N A Maximum entropy is 2 bits, since at most one would need to ask 2 questions to determine the next trial type. With maximum entropy, the number of equally likely outcomes to choose from is 4 (22). T.T. Liu July 18, 2008 T.T. Liu July 18, 2008 Efficiency 2^(Entropy) T.T. Liu July 18, 2008 Multiple Trial Types 1 trial type + control (null) AA N AA N AAA N Extend to experiments with multiple trial types ABABNNANBBANANA BADBANDBCNDNBCN T.T. Liu July 18, 2008 Multiple Trial Types GLM y = Xh + Sb + n X = [X1 X2 … XQ] h = [h1T h2T … hQT]T T.T. Liu July 18, 2008 Multiple Trial Types Overview Efficiency includes individual trials and also contrasts between trials. Rtot K average variance of HRF amplitude estimates for all trial types and pairwise contrasts tot T.T. Liu July 18, 2008 1 average variance of HRF estimates for all trial types and pairwise contrasts Multiple Trial Types Trade-off T.T. Liu July 18, 2008 Optimal Frequency Can also weight how much you care about individual trials or contrasts. Or all trials versus events. Optimal frequency of occurrence depends on weighting. Example: With Q = 2 trial types, if only contrasts are of interest p = 0.5. If only trials are of interest, p = 0.2929. If both trials and contrasts are of interest p = 1/3. Q(2k1 1) Q (1 k1 ) k 2 p T.T. Liu July 18, 2008 1/2 1 Q(2k 1) Q 1 Q(Q 1)(k1Q Q k1 ) 2 (1 k1 ) 1/2 Design As the number of trial types increases, it becomes more difficult to achieve the theoretical trade-offs. Random search becomes impractical. For unknown HDR, should use an m-sequence based design when possible. Designs based on block or m-sequences are useful for obtaining intermediate trade-offs or for optimizing with basis functions or correlated noise. T.T. Liu July 18, 2008 Optimality of m-sequences T.T. Liu July 18, 2008 Clustered m-sequences T.T. Liu July 18, 2008 Genetic Algorithms QuickTime™ and a decompressor are needed to see this picture. T.T. Liu July 18, 2008 Wager and Nichols 2003 Genetic Algorithms QuickTime™ and a decompressor are needed to see this picture. T.T. Liu July 18, 2008 Wager and Nichols 2003 Genetic Algorithms QuickTime™ and a decompressor are needed to see this picture. T.T. Liu July 18, 2008 Wager and Nichols 2003 Genetic Algorithms QuickTime™ and a decompressor are needed to see this picture. T.T. Liu July 18, 2008 Kao et al. available at http://aaron.stat.uga.edu/~amandal Topics we haven’t covered. The impact of correlated noise -- this will change the optimal design. Impact of nonlinearities in the BOLD response. Designs where the timing is constrained by psychology. T.T. Liu July 18, 2008 Summary • Efficiency as a metric of design performance. • Efficiency depends on both experimental design and assumptions about HRF. • Inherent tradeoff between power (detection of known HRF) and efficiency (estimation of HRF) T.T. Liu July 18, 2008