Graph theory for TLE lateralization: Supporting Information

APPENDIX A. fMRI measures of brain graph topology.

1. Local Clustering Coefficient (CC) and Normalized Clustering Coefficient (γ)

The normalized clustering coefficient (γ) provides a measure of the level of “cliquishness,” or local interconnectedness, of a network (1). The local clustering coefficient of node i ($\mathrm{CC}_i$) was calculated for each node as the ratio of the number of existing connections to the number of possible connections in $G_i$, the subgraph formed by the neighbors of node i:

$$\mathrm{CC}_i = \frac{E_i}{K_i (K_i - 1)/2}, \qquad [1]$$

where $E_i$ denotes the number of edges in subgraph $G_i$ and $K_i$ denotes the degree of node i. The absolute clustering coefficient of the network (C) was then calculated as the mean value of $\mathrm{CC}_i$ over all nodes (2). To avoid the influence of other network characteristics, γ was calculated as the ratio of C to $C_{\mathrm{random}}$, the clustering coefficient averaged over 500 randomly rewired null models (3).

2. Normalized Characteristic Path Length (λ)

The characteristic path length (L) is defined as the average shortest path length between all pairs of nodes, and provides a measure of the global efficiency of information transfer in a network (1). To account for infinite path lengths between disconnected nodes, L was calculated as the harmonic mean of the shortest path lengths:

$$L = \frac{N (N - 1)}{\sum_{i \neq j \in G} 1 / d_{i,j}}, \qquad [2]$$

where $d_{i,j}$ denotes the shortest path length between nodes i and j, and N denotes the number of nodes in the graph (2). To avoid the influence of other network characteristics, λ was then calculated as the ratio of L to $L_{\mathrm{random}}$, the characteristic path length averaged over 500 randomly rewired null models (3).

3. Small-World Index (σ)

The small-world index (σ) is a measure of the network’s balance between segregation and integration (4), and was calculated as the ratio of the normalized clustering coefficient to the normalized characteristic path length:

$$\sigma = \frac{\gamma}{\lambda}. \qquad [3]$$

Small-world networks typically have γ>1, λ≈1, σ>1 (1).
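The three measures above can be illustrated with a minimal Python sketch. It is not the Matlab and Brain Connectivity Toolbox code used for the analysis; the function names (`mean_clustering`, `harmonic_path_length`, `rewire`, `small_world`) and the ring-lattice test graph are hypothetical, the null models are built with simple degree-preserving edge swaps, and only 20 null models are drawn instead of 500 to keep the example fast.

```python
import random
from collections import deque


def neighbors(adj, i):
    """Indices of the nodes connected to node i in a binary adjacency matrix."""
    return [j for j, a in enumerate(adj[i]) if a and j != i]


def mean_clustering(adj):
    """C: mean over all nodes of CC_i = E_i / (K_i (K_i - 1) / 2) (Equation 1)."""
    total = 0.0
    for i in range(len(adj)):
        nb = neighbors(adj, i)
        k = len(nb)
        if k < 2:
            continue  # assumption: CC_i = 0 for nodes with fewer than two neighbors
        e = sum(adj[u][v] for a, u in enumerate(nb) for v in nb[a + 1:])
        total += e / (k * (k - 1) / 2)
    return total / len(adj)


def harmonic_path_length(adj):
    """L = N (N - 1) / sum_{i != j} 1 / d_ij (Equation 2)."""
    n = len(adj)
    inv_sum = 0.0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from node s
            u = queue.popleft()
            for v in neighbors(adj, u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # unreachable nodes are absent from dist, so they add 0 (1 / infinity)
        inv_sum += sum(1.0 / d for node, d in dist.items() if node != s)
    return n * (n - 1) / inv_sum if inv_sum else float("inf")


def rewire(adj, n_swaps, rng):
    """Degree-preserving null model: swap edges (a-b, c-d) into (a-d, c-b)."""
    adj = [row[:] for row in adj]
    n = len(adj)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j]]
    for _ in range(n_swaps):
        x, y = rng.sample(range(len(edges)), 2)
        a, b = edges[x]
        c, d = edges[y]
        if rng.random() < 0.5:
            c, d = d, c  # pick the orientation of the second edge at random
        if len({a, b, c, d}) < 4 or adj[a][d] or adj[c][b]:
            continue  # swap would create a self-loop or a duplicate edge
        adj[a][b] = adj[b][a] = adj[c][d] = adj[d][c] = 0
        adj[a][d] = adj[d][a] = adj[c][b] = adj[b][c] = 1
        edges[x], edges[y] = (a, d), (c, b)
    return adj


def small_world(adj, n_null=20, seed=0):
    """Return (gamma, lambda, sigma) relative to rewired null models."""
    rng = random.Random(seed)
    n_edges = sum(map(sum, adj)) // 2
    c_rand = l_rand = 0.0
    for _ in range(n_null):
        null = rewire(adj, 10 * n_edges, rng)
        c_rand += mean_clustering(null) / n_null
        l_rand += harmonic_path_length(null) / n_null
    gamma = mean_clustering(adj) / c_rand
    lam = harmonic_path_length(adj) / l_rand
    return gamma, lam, gamma / lam  # sigma = gamma / lambda (Equation 3)


def ring_lattice(n, k):
    """Toy graph: each node linked to its k nearest neighbors on a ring (k even)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i][j] = adj[j][i] = 1
    return adj


if __name__ == "__main__":
    gamma, lam, sigma = small_world(ring_lattice(20, 4))
    print(f"gamma={gamma:.2f} lambda={lam:.2f} sigma={sigma:.2f}")
```

Because the harmonic mean sums inverse distances, a disconnected pair contributes zero to the denominator and does not make L infinite, which is the motivation given for Equation 2.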
Because noise reduction and better class separation can potentially be achieved through the inclusion of redundant variables (5), the small-world index was included along with clustering coefficient and path length as a feature input.

4. Local Efficiency (LE) and Global Efficiency (GE)

The local efficiency (LE) of node i was calculated as the mean of the inverse shortest path length from node i to all other nodes:

$$\mathrm{LE}_i = \frac{1}{N} \sum_{j \in G:\, j \neq i} \frac{1}{d_{i,j}}, \qquad [4]$$

where $d_{i,j}$ denotes the shortest path length between nodes i and j and N denotes the number of nodes in the graph. Global efficiency (GE), which measures the average level of efficiency in interregional communication, was calculated as the average local efficiency over all nodes:

$$\mathrm{GE} = \frac{1}{N} \sum_{i \in G} \mathrm{LE}_i. \qquad [5]$$

5. Connectivity Strength (CS)

Connectivity strength provides a measure of the network’s average level of connectivity, and was calculated as the mean of all pairwise correlations (6):

$$\mathrm{CS} = \frac{1}{N} \sum_{i \neq j \in G} r_{ij}, \qquad [6]$$

where $r_{ij}$ denotes the pairwise Pearson correlation between nodes i and j and N denotes the number of nodes in the graph.

6. Connectivity Diversity (CD)

Connectivity diversity provides a measure of the heterogeneity of the network’s connectivity. The connectivity diversity of node i ($\mathrm{CD}_i$) is defined as the unbiased sample variance of all pairwise correlations with node i:

$$\mathrm{CD}_i = \frac{1}{N - 1} \sum_{j \in G:\, j \neq i} \left( r_{ij} - \bar{r}_i \right)^2, \qquad [7]$$

where $\bar{r}_i$ denotes the mean pairwise Pearson correlation between node i and all other nodes in graph G, $r_{ij}$ denotes the pairwise Pearson correlation between nodes i and j, and N denotes the number of nodes in the graph. The connectivity diversity of a network was then calculated as the mean value of $\mathrm{CD}_i$ across all nodes (6):

$$\mathrm{CD} = \frac{1}{N} \sum_{i \in G} \mathrm{CD}_i. \qquad [8]$$

7. Betweenness Centrality (BC)

The betweenness centrality of a node describes the number of shortest paths that pass through the node.
It provides a measure of the node’s importance, with higher levels of betweenness centrality indicating nodes located on highly traveled paths. Because hippocampal sclerosis is the most common finding in TLE (7), the betweenness centrality of the left hippocampus (BC(L)) and of the right hippocampus (BC(R)) were calculated as the fraction of shortest paths in the network containing the respective hippocampal node:

$$\mathrm{BC}(L) = \sum_{i \neq j \neq \mathrm{LH}} \frac{M_{ij}(\mathrm{LH})}{M_{ij}}, \qquad [9]$$

$$\mathrm{BC}(R) = \sum_{i \neq j \neq \mathrm{RH}} \frac{M_{ij}(\mathrm{RH})}{M_{ij}}, \qquad [10]$$

where RH denotes the right hippocampus, LH denotes the left hippocampus, $M_{ij}(\mathrm{RH})$ and $M_{ij}(\mathrm{LH})$ denote the number of shortest paths in the network from node i to node j that pass through the right and left hippocampus, respectively, and $M_{ij}$ denotes the number of shortest paths in the network from node i to node j.

All unweighted measures were averaged across the non-random connection density range, defined as the range of connection densities that retained small-world topology, such that (1) more than 99% of nodes were connected and (2) the small-world index (σ) was greater than 1. Graph construction and feature definitions were performed using Matlab (Mathworks, Inc., Natick, MA, USA). Graph theory metrics were calculated using the Brain Connectivity Toolbox (2).

APPENDIX B. Quadratic discriminant analysis and leave-one-out cross-validation.

Quadratic discriminant analysis (QDA) is a classification procedure used to estimate the decision boundary between classes, based on a nonlinear generalization of Fisher discriminant analysis. This method assumes a multivariate Gaussian distribution for each class density and allows unequal covariance matrices, resulting in quadratic decision boundaries. QDA has been found to perform well on many diverse classification tasks, due to the stability of estimates provided by Gaussian models (8).
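For concreteness, the standard form of the quadratic discriminant function under these assumptions, as presented in (8), is sketched below. The notation is introduced here for illustration and does not appear in the original text: x is the feature vector of a subject, and $\mu_k$, $\Sigma_k$ and $\pi_k$ are the mean vector, covariance matrix and prior probability of class k, with k ranging over left (L) and right (R) TLE.

```latex
\delta_k(x) = -\tfrac{1}{2} \log \lvert \Sigma_k \rvert
              - \tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k)
              + \log \pi_k ,
\qquad
\hat{k}(x) = \operatorname*{arg\,max}_{k \in \{L,\, R\}} \delta_k(x)
```

Because each class keeps its own $\Sigma_k$, the terms that are quadratic in x do not cancel between classes, which is why the boundary $\delta_L(x) = \delta_R(x)$ is quadratic rather than linear.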
For further details, we refer the reader to (8,9).

Cross-validation is a widely used statistical method for estimating prediction error. It is especially useful for moderate sample sizes, where it may not be feasible to set aside a validation set to assess predictive performance (10). The leave-one-out cross-validation (LOO-CV) estimator of prediction error has been shown to be almost unbiased for the true prediction error (11,12). In LOO-CV, the discriminant function is trained on a training set that includes all subjects except one. The excluded subject is then classified using the discriminant function built from the other subjects. This is repeated until each subject has been excluded once, so that an average prediction accuracy can be computed from the number of correctly classified subjects. Because, for each cross-validation fold, the “test set” (the excluded subject) is kept apart from the “training set” (all subjects except the excluded subject), this method allows for feature selection and assessment of classification performance that are not biased by the test sample.

APPENDIX C. Fisher separability criterion and backward stepwise variable importance.

The Fisher separability criterion (FSC) of the jth fMRI feature is defined as

$$\mathrm{FSC}_j = \left( \frac{\mu_{L,j} - \mu_{R,j}}{\sigma_j} \right)^{2} \qquad [11]$$

(13), where $\mu_{L,j}$ is the mean value of the jth fMRI feature among left TLE patients, $\mu_{R,j}$ is the mean value of the jth fMRI feature among right TLE patients, and $\sigma_j$ is the standard deviation of the jth fMRI feature among all TLE patients. As can be seen from Equation 11, FSC indicates greater separability when the means are further apart and when the variance is lower. FSC has been observed by (13) to be a natural unit measuring univariate separability of clusters.

The backward stepwise variable importance (BSVI) of a feature was defined as the increase in LOO-CV error upon removing the respective feature from the optimal feature set.
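The two importance measures of this appendix, together with the LOO-CV loop of Appendix B, can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code: all function names are hypothetical, the standard deviation in FSC is taken as the sample standard deviation, and a diagonal-covariance Gaussian classifier is swapped in for full QDA so that the example needs only the standard library. The toy data are invented for the example.

```python
import math
from statistics import mean, stdev, variance


def fsc(left, right):
    """FSC for one feature: ((mean_L - mean_R) / sigma)^2 (Equation 11).
    Assumption: sigma is the sample standard deviation over all patients."""
    return ((mean(left) - mean(right)) / stdev(left + right)) ** 2


def fit(rows, labels, cols):
    """Per-class log prior plus mean and variance of each selected feature.
    Stand-in for QDA: covariance matrices are restricted to be diagonal."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = [(mean(r[c] for r in members),
                  variance([r[c] for r in members]) + 1e-9) for c in cols]
        model[cls] = (math.log(len(members) / len(rows)), stats)
    return model


def predict(model, row, cols):
    """Assign the class with the largest Gaussian log-discriminant score."""
    def score(cls):
        log_prior, stats = model[cls]
        return log_prior - 0.5 * sum(
            math.log(v) + (row[c] - m) ** 2 / v
            for c, (m, v) in zip(cols, stats))
    return max(model, key=score)


def loo_error(rows, labels, cols):
    """Leave-one-out error: train on all subjects but one, test on that one."""
    errors = 0
    for i in range(len(rows)):
        model = fit(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:], cols)
        errors += predict(model, rows[i], cols) != labels[i]
    return errors / len(rows)


def bsvi(rows, labels, optimal_cols):
    """Increase in LOO-CV error when each feature is dropped from the set."""
    base = loo_error(rows, labels, optimal_cols)
    return {c: loo_error(rows, labels, [k for k in optimal_cols if k != c]) - base
            for c in optimal_cols}


# Toy data (invented): feature 0 separates the groups, feature 1 does not.
rows = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [1.1, 6.0],
        [3.0, 5.0], [3.2, 3.0], [2.9, 4.0], [3.1, 6.0]]
labels = ["L"] * 4 + ["R"] * 4

if __name__ == "__main__":
    for j in range(2):
        left = [r[j] for r, y in zip(rows, labels) if y == "L"]
        right = [r[j] for r, y in zip(rows, labels) if y == "R"]
        print(f"feature {j}: FSC = {fsc(left, right):.3f}")
    print("BSVI:", bsvi(rows, labels, [0, 1]))
```

FSC scores each feature on its own, whereas BSVI depends on the other features in the set: a feature whose removal leaves the LOO-CV error unchanged receives a BSVI of zero even if its univariate FSC is large.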
First, the optimal feature subset was defined as the subset of fMRI features that minimized LOO-CV error based on a quadratic discriminant function. Next, the contribution of each fMRI feature in the optimal feature subset to the multivariate discriminatory power was estimated by computing the increase in LOO-CV error when the respective feature was removed.

REFERENCES (SUPPORTING INFORMATION)

1. Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature 1998;393:440-442.
2. Rubinov M, Sporns O. Complex network measures of brain connectivity: uses and interpretations. Neuroimage 2010;52:1059-1069.
3. Maslov S, Sneppen K. Specificity and stability in topology of protein networks. Science 2002;296:910-913.
4. Bullmore ET, Bassett DS. Brain graphs: graphical models of the human brain connectome. Annu Rev Clin Psychol 2011;7:113-140.
5. Guyon I, Elisseeff A. An introduction to variable and feature selection. The Journal of Machine Learning Research 2003;3:1157-1182.
6. Lynall ME, Bassett DS, Kerwin R, et al. Functional connectivity and brain networks in schizophrenia. J Neurosci 2010;30:9477-9487.
7. Lee DH, Gao FQ, Rogers JM, et al. MR in temporal lobe epilepsy: analysis with pathologic confirmation. AJNR Am J Neuroradiol 1998;19:19-27.
8. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. Springer; 2009.
9. Lachenbruch PA. Discriminant analysis. Wiley Online Library; 1975.
10. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning. New York: Springer Series in Statistics; 2001. p. 241-244.
11. Luntz A, Brailovsky V. On estimation of characters obtained in statistical procedure of recognition. Technicheskaya Kibernetica 1969;3.
12. Kearns M, Ron D. Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Comput 1999;11:1427-1453.
13. Fisher RA. The use of multiple measurements in taxonomic problems. Annals of Eugenics 1936;7:179-188.