Canonical Correlation: Equations Psy 524 Andrew Ainsworth Data for Canonical Correlations CanCorr actually takes raw data and computes a correlation matrix and uses this as input data. You can actually put in the correlation matrix as data (e.g. to check someone else’s results) Data The input correlation set up is: Rxx Rxy Ryx Ryy Equations To find the canonical correlations: First create a canonical input matrix. To get this the following equation is applied: 1 yy 1 xx R R Ryx R Rxy Equations To get the canonical correlations, you get the eigenvalues of R and take the square root rci i Equations In this context the eigenvalues represent percent of overlapping variance accounted for in all of the variables by the two canonical variates Equations Testing Canonical Correlations Since there will be as many CanCorrs as there are variables in the smaller set not all will be meaningful. Equations Wilk’s Chi Square test – tests whether a CanCorr is significantly different than zero. kx k y 1 N 1 ln m 2 Where N is number of cases, k x is number of x var iables and 2 k y is number of y var iables m m (1 i ) i 1 Lamda, , is the product of difference between eigenvalues and 1, generated across m canonical correlations. Equations From the text example - For the first canonical correlation: 2 (1 .84)(1 .58) .07 2 2 1 8 1 ln .07 2 2 (4.5)(2.68) 12.04 2 df (k x )(k y ) (2)(2) 4 Equations The second CanCorr is tested as 1 (1 .58) .42 2 2 1 8 1 ln .42 2 2 (4.5)(.87) 3.92 2 df (k x 1)(k y 1) (2 1)(2 1) 1 Equations Canonical Coefficients Two sets of Canonical Coefficients are required One set to combine the Xs One to combine the Ys Similar to regression coefficients Equations By ( Ryy1/ 2 ) ' Bˆ y Where ( Ryy1/ 2 ) ' is the transpose of the inverse of the "special" matrix form of square root that keeps all of the eigenvalues positive and Bˆ y is a normalized matrix of eigen vectors for yy Bx Rxx-1 Rxy B*y Where B*y is By from above dividing each entry by their corresponding canonical correlation. Equations Canonical Variate Scores Like factor scores (we’ll get there later) What a subject would score if you could measure them directly on the canonical variate X Z x Bx Y Z y By Equations Matrices of Correlations between variables and canonical variates; also called loadings or loading matrices Ax Rxx Bx Ay Ryy By Equations First Set Second Set TS TC BS BC Canonical Variate Pairs First Second -.74 .68 .79 .62 -.44 .90 .88 .48 Equations Redundancy Within Percent of variance explained by the canonical correlate on its own side of the equation kx 2 ixc ky 2 iyc a pvxc i 1 k x pv yc i 1 a ky (.74) .79 .58 2 2 pvxc1 2 Equations Redundancy Across - variance in Xs explained by the Ys and vice versa rd ( pv)(r ) 2 c (.74) .79 (.84) .48 2 2 rd x1 y 2