Multivariate Approaches to Analyze fMRI Data
Yuanxin Hu

Outline
1. Summary of principles of three approaches
   a) Principal Component Analysis (PCA)
   b) Canonical Correlation Analysis (CCA)
   c) Independent Component Analysis (ICA)
2. Procedure to analyze fMRI data by the ICA approach
   a) data preprocessing
      1) centering
      2) whitening
      3) data reduction
   b) strategies of ICA
      1) temporal ICA
      2) spatial ICA
   c) validation of ICA results
3. Group ICA
   a) models/methods
   b) comparison of methods
   c) modifications of classical group ICA

Nature of fMRI data
1. Multivariate
2. Subspaces / high dimensions / directions
   a) Space: regions of the brain with similar temporal behavior
   b) Time course
   c) Space & time course

Principal Component Analysis (PCA)
Goal
To find linear combinations of the original variables that reflect the dependence structure of the data.
Strategy
Create a new set of orthogonal variables that contains the same information as the original set. Each successive orthogonal axis captures as much of the remaining sample variance as possible, so the leading axes give the dominant directions of the dataset.
Steps
1. Find the principal components. Let X = (X1, X2, ..., Xp) follow a multivariate distribution with mean µ and covariance Σ, and let S be the sample covariance matrix.
   1st component = $a_1^T X$, with maximal sample variance $a_1^T S a_1$ subject to $a_1^T a_1 = 1$;
   2nd component = $a_2^T X$, with $a_2^T a_2 = 1$ and $a_1^T a_2 = 0$ (its coefficient vector is orthogonal to that of the 1st component);
   ...
   kth component = $a_k^T X$, with $a_k^T a_k = 1$ and $a_j^T a_k = 0$ for all j < k.
2. Transform the components into coordinates. The successive components form a new coordinate system whose axes are the corresponding eigenvectors of S.
Consequences
1. The sample variances of the components are ordered: 1st ≥ 2nd ≥ 3rd ≥ ..., so the 1st component lies along the principal axis of the p-dimensional scatter cloud;
2. The coefficient vector of each subsequent component is orthogonal to those of all previous components.

Canonical Correlation Analysis (CCA)
A way to quantify the correlation between two sets of variables.
Pairs of canonical variables:

  Random variables:   X            Y
  1st pair:           $a_1^T X$    $b_1^T Y$
  2nd pair:           $a_2^T X$    $b_2^T Y$
  ...
  kth pair:           $a_k^T X$    $b_k^T Y$

The ith canonical correlation is Cor($a_i^T X$, $b_i^T Y$), and the coefficient vectors satisfy
  $(\Sigma_{xy} \Sigma_{yy}^{-1} \Sigma_{yx} - C_i \Sigma_{xx}) a_i = 0$
  $(\Sigma_{yx} \Sigma_{xx}^{-1} \Sigma_{xy} - C_i \Sigma_{yy}) b_i = 0$
where $C_i$ is the corresponding eigenvalue (the squared ith canonical correlation).

Independent Component Analysis (ICA)
(Originates from the "cocktail-party problem")
In the cocktail-party problem, you are attending a party where hundreds of guests are holding simultaneous conversations. The same number of microphones, located at different places in the room, simultaneously record the conversations.
Each microphone recording can be considered a linear mixture of the individual 'independent' conversations.

Key to ICA: non-normality (the independent sources must be non-Gaussian to be separable)
Each microphone signal (X) is modeled as a linear superposition of the source signals (S), mixed by an unknown matrix A: $X = A S$.
The original source signals are recovered by finding an unmixing matrix W such that $S \approx W X$.

[Figure: ICA in studying fMRI data (panels: Sensor 1, Sensor 2, Sensor 3)]
[Figure: Difference between PCA and ICA] Jung TP, et al. (2001): Proceedings of the IEEE 89(7).

Preprocessing for ICA
1. Centering: the most basic and necessary preprocessing step is to center x by subtracting its mean vector m = E{x}, which makes x a zero-mean variable. This simplifies the ICA algorithms.
2. Whitening: linearly transform the observed vector x into a vector x̃ whose components are uncorrelated and have unit variance, so that the covariance matrix of x̃ equals the identity matrix: $E\{\tilde{x} \tilde{x}^T\} = I$.
3. Data reduction: remove noise-dominated dimensions to decrease the data dimension and keep the part of the data that makes biological sense.
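The three preprocessing steps above can be sketched with NumPy. This is a minimal illustration, not the pipeline used in the talk: it assumes the data are arranged as signals × samples, whitens by eigendecomposition of the sample covariance (PCA whitening, one of several valid whitening transforms), and performs data reduction by keeping the `n_keep` largest-variance directions. The names `center`, `whiten`, and `n_keep`, and the toy data, are hypothetical.

```python
import numpy as np

def center(x):
    """Subtract the mean of each signal (row) so every row has zero mean."""
    return x - x.mean(axis=1, keepdims=True)

def whiten(x, n_keep=None):
    """Center x, then whiten it so that its sample covariance is the identity.

    If n_keep is given, only the n_keep directions with the largest
    variance are kept (data reduction).
    """
    xc = center(x)
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc))
    # eigh returns eigenvalues in ascending order; sort them descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_keep is not None:
        eigvals, eigvecs = eigvals[:n_keep], eigvecs[:, :n_keep]
    # Rows of the transform are eigenvectors scaled by 1/sqrt(eigenvalue).
    return (eigvecs / np.sqrt(eigvals)).T @ xc

# Toy data: 5 observed signals, each a mixture of 5 non-Gaussian sources,
# with a constant offset that centering removes.
rng = np.random.default_rng(0)
mixing = rng.standard_normal((5, 5))
x = mixing @ rng.laplace(size=(5, 2000)) + 3.0

x_white = whiten(x)               # centered and whitened, 5 x 2000
x_reduced = whiten(x, n_keep=3)   # also reduced to 3 dimensions

print(np.round(np.cov(x_white), 3))
print(x_reduced.shape)
```

In real fMRI data the number of retained dimensions is a modelling choice; the value 3 here is arbitrary.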
Preprocessing reference: Hyvärinen A, et al. (2000): Neural Networks 13(4-5):411-430.

ICA Multivariate Analyses
1. Spatial ICA
2. Temporal ICA
[Figure: decomposition of the data matrix (time points × voxels). Spatial ICA panel: spatially independent component images, each with an associated time course. Temporal ICA panel: temporally independent time courses, each with an associated image.]

In theory, once the independent components are identified, further statistical tests can be carried out, for example on the probability distribution over all voxels or on the correlation between the activation of different regions in response to stimuli. However, the nature of the procedure limits how much confidence can be placed in such tests, so the ICA results need to be validated first.

Validation of ICA Results
Reasons for validation
1. Different algorithms can yield different components, which leads to different interpretations of the same data;
2. Algorithms have stochastic elements, so different runs of the same algorithm can give different results.
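The second reason can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not the software used in the cited studies: it implements a basic symmetric fixed-point ICA iteration with a tanh nonlinearity, runs it from two different random initializations on the same mixed signals, and compares the runs by absolute correlation. The function names, the two toy sources, and the mixing matrix are all hypothetical.

```python
import numpy as np

def whiten(x):
    """Center x (signals x samples) and make its sample covariance the identity."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    return (eigvecs / np.sqrt(eigvals)).T @ x

def sym_decorrelate(w):
    """Return (w w^T)^(-1/2) w, which makes the rows of w orthonormal."""
    s, u = np.linalg.eigh(w @ w.T)
    return (u / np.sqrt(s)) @ u.T @ w

def fixed_point_ica(z, seed, n_iter=200):
    """Symmetric fixed-point ICA on whitened data z, started from a random w."""
    rng = np.random.default_rng(seed)
    n_comp, n_samples = z.shape
    w = sym_decorrelate(rng.standard_normal((n_comp, n_comp)))
    for _ in range(n_iter):
        y = np.tanh(w @ z)
        # Fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w for every row of w.
        w = (y @ z.T) / n_samples - np.diag((1.0 - y**2).mean(axis=1)) @ w
        w = sym_decorrelate(w)
    return w @ z

def abs_corr(a, b):
    """Absolute correlation between every row of a and every row of b."""
    n = a.shape[0]
    return np.abs(np.corrcoef(a, b)[:n, n:])

# Two independent non-Gaussian sources, linearly mixed into two "sensors".
rng = np.random.default_rng(0)
sources = np.vstack([rng.uniform(-1, 1, 5000), rng.laplace(size=5000)])
mixed = np.array([[1.0, 0.5], [0.3, 1.0]]) @ sources
z = whiten(mixed)

run_a = fixed_point_ica(z, seed=1)
run_b = fixed_point_ica(z, seed=2)

between_runs = abs_corr(run_a, run_b)   # run-to-run agreement
to_sources = abs_corr(run_a, sources)   # agreement with the true sources
print(np.round(between_runs, 2))
print(np.round(to_sources, 2))
```

The order and sign of the components may differ between the two runs, which is why they are compared by absolute correlation rather than row by row. Components that find a close match in every run are the ones a clustering-based validation would treat as reliable.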
Validation of ICA Results
Strategies of validation
1. Fixed-point based: normalize the differential entropy to obtain negentropy, and maximize negentropy to find the directions of maximal non-normality of the data;
2. Bootstrap: the aim of validation is to find out whether the estimated components are reproducible and consistent. To account for the variation caused by the stochastic elements of the algorithm, the analysis is started from different initial values, which can be done together with bootstrap resampling. In practice, the same analysis is run repeatedly and the estimates are clustered. A tight cluster of points indicates a real independent component; widely scattered clusters should not be selected, because they do not correspond to real independent components. Cluster tightness is judged by the cluster quality index; higher is better.

[Figure: Selection of clusters] Himberg J, et al. (2004): NeuroImage 22(3):1214-1222.

Methods of Group ICA
1. Averaging across subjects
2. Calhoun's model (temporal basis, subject-wise concatenation): combination of data from individual subjects.
   The data set is large, so data reduction is essential: clean the individual data, transform the original data into Talairach coordinates, and then concatenate all subjects' data for analysis.
3. Svensén model (spatial basis, row-wise concatenation): the data are reduced by masking out air voxels, which decreases the data dimension by about 50%, so there is no need to transform the data into Talairach coordinates.

1) Calhoun VD, et al. (2001): NeuroImage 14(5):1080-1088.
2) Beckmann CF, et al. (2005): NeuroImage 25(1):294-311.
3) Calhoun VD, et al. (2001): Hum. Brain Map. 14(3):140-151.
4) Esposito F: NeuroImage 25(1):193-205.
5) Schmithorst VJ, et al. (2004): J. Magn. Reson. Imaging 19(3):365-368.
6) Svensén M, et al. (2002): NeuroImage 16:551-563.

Accuracy Comparison of Three Methods by Simulation
[Figure: simulation results for the three group ICA methods]

Accuracy Comparison of Three Methods by Simulation (+ data from 5 subjects)
[Figure: simulation results with data from 5 subjects added]

MSE = mean-squared error between original and estimated sources
Average CC = average cross-correlation value between original and estimated associated time courses

Modification of Basic ICA Approaches
1. Spatiotemporal ICA (stICA)
   sICA and tICA each seek independence in only one dimension (space or time) and leave the other dimension unconstrained, which is hard to justify on scientific grounds; stICA seeks independence over space and time jointly.
2. Skew-ICA
   Real images are surrounded by a homogeneous background, which produces a skewed distribution. To account for this, the method models the sources with a more realistic skewed (long-tailed) distribution instead of a symmetric heavy-tailed one.

[Figure: panels for GLM, PCA, tICA, sICA, stICA, Skewed-ICA]

Correlation Between the Four Time Courses Extracted by Each Method

  Method     GLM   PCA   tICA  sICA  stICA  Skew-sICA  Skew-stICA
  Source 1   0.87  0.76  0.87  0.48  0.87   0.83       0.87
  Source 2   0.93  0.55  0.90  0.75  0.89   0.90       0.92
  Source 3   0.95  0.77  0.91  0.40  0.91   0.88       0.94
  Source 4   0.84  0.81  0.72  0.71  0.72   0.81       0.83
  Mean       0.90  0.72  0.85  0.59  0.85   0.86       0.89

Stone JV (2002): NeuroImage 15:407-421.

Outline
1. Summary of principles of three approaches
   a) Principal Component Analysis (PCA)
   b) Canonical Correlation Analysis (CCA)
   c) Independent Component Analysis (ICA)
2. Procedure to analyze fMRI data by the ICA approach
   a) data preprocessing
      1) centering (simplifies calculation)
      2) whitening (linear transformation that ensures the components are uncorrelated)
      3) data reduction (remove noise signal, keep biologically related information only)
   b) strategies of ICA
      1) temporal ICA
      2) spatial ICA
   c) validation of ICA results (fixed-point method; bootstrap to identify real independent clusters)
3. Group ICA
   a) models/methods (averaging, row-wise group ICA, subject-wise group ICA)
   b) comparison of methods (the subject-wise group ICA is more accurate)
   c) modifications of classical group ICA (skewed ICA is more consistent)
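The two accuracy measures used in the method comparison, MSE and average CC, can be sketched as follows. This is an illustrative reading of the definitions, not the exact procedure behind the reported numbers. It assumes the estimated signals have already been matched row by row to the originals, and it z-scores both and aligns their signs before computing the error, because ICA estimates have arbitrary scale and sign. The function names and the toy data are hypothetical.

```python
import numpy as np

def zscore(a):
    """Scale every row to zero mean and unit variance."""
    return (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)

def accuracy(original, estimated):
    """Return (MSE, average CC) for row-matched original and estimated signals."""
    o, e = zscore(original), zscore(estimated)
    cc = (o * e).mean(axis=1)        # Pearson correlation of each row pair
    e = e * np.sign(cc)[:, None]     # undo the sign ambiguity of ICA
    mse = np.mean((o - e) ** 2)
    return mse, np.mean(np.abs(cc))

# Toy check: a perfect estimate up to sign and scale, and a noisy one.
rng = np.random.default_rng(0)
original = rng.standard_normal((3, 500))
perfect = original * np.array([[-2.0], [0.5], [3.0]])
noisy = perfect + 0.5 * rng.standard_normal((3, 500))

mse_perfect, cc_perfect = accuracy(original, perfect)
mse_noisy, cc_noisy = accuracy(original, noisy)
print(mse_perfect, cc_perfect)
print(mse_noisy, cc_noisy)
```

A lower MSE and a higher average CC both indicate that a method recovers the simulated sources and time courses more accurately.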