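Winter Conference Borgafjäll

Detecting changes in brain scale, shape and connectivity via the geometry of random fields

Jin Cao, Lucent
Nick Chamandy, Google
Khalil Shafie, Northern Colorado
Jonathan Taylor, Stanford
Keith Worsley, McGill

EC heuristic

At high thresholds $t$, the excursion set $X_t = \{s : T(s) \ge t\}$ is either one connected component (EC = 1) or empty (EC = 0), so
$$P\Bigl(\max_{s \in S} T(s) \ge t\Bigr) \approx E(\mathrm{EC}(S \cap X_t)).$$

Theorem (1981, 1995). If $T(s)$ is a smooth isotropic random field then
$$E(\mathrm{EC}(S \cap X_t)) = \sum_{d=0}^{D} \mu_d(S)\,\rho_d(t),$$
where $\mu_d(S)$ is the intrinsic volume of $S$ and $\rho_d(t)$ is the EC density of $T$.

Intrinsic volume $\mu_d(S)$

For $S$ with smooth boundary $\partial S$ that has curvature matrix $C(s)$,
$$\mu_d(S) = \frac{\Gamma((D-d)/2)}{2\pi^{(D-d)/2}} \int_{\partial S} \mathrm{detr}_{D-1-d}\{C(s)\}\,ds,$$
and $\mu_D(S) = |S|$. For $D = 3$:
$\mu_0(S) = \mathrm{EC}(S)$
$\mu_1(S) = 2\,\mathrm{Diameter}(S)$
$\mu_2(S) = \tfrac{1}{2}\,\mathrm{Surface\ area}(S)$
$\mu_3(S) = \mathrm{Volume}(S)$

EC density $\rho_d(t)$

Morse theory:
$$\mathrm{EC}(S \cap X_t) = \sum_{s} 1_{\{T(s) \ge t\}}\, 1_{\{\partial T(s)/\partial s = 0\}} \times \mathrm{sign}\Bigl(-\frac{\partial^2 T}{\partial s\,\partial s'}\Bigr) + \text{boundary},$$
$$\rho_D(t) = E\Bigl(1_{\{T \ge t\}} \det\Bigl(-\frac{\partial^2 T}{\partial s\,\partial s'}\Bigr) \Bigm| \frac{\partial T}{\partial s} = 0\Bigr)\, P\Bigl(\frac{\partial T}{\partial s} = 0\Bigr).$$
For a Gaussian random field, with $\lambda = \mathrm{Sd}(\partial Z/\partial s)$,
$$\rho_d(t) = \Bigl(\frac{-\lambda}{\sqrt{2\pi}} \frac{\partial}{\partial t}\Bigr)^{d} P(Z \ge t).$$

Beautiful symmetry:
$$E(\mathrm{EC}(S \cap X_t)) = \sum_{d=0}^{D} \mathcal{L}_d(S)\,\rho_d(R_t)$$
Lipschitz-Killing curvature $\mathcal{L}_d(S)$: Steiner-Weyl Tube Formula (1930)
EC density $\rho_d(R_t)$: Taylor Kinematic Formula (2003)

• Put a tube of radius $r$ about the search region $\lambda S$ and rejection region $R_t$, where $\lambda = \mathrm{Sd}(\partial Z/\partial s)$.
[Figure: Tube($\lambda S$, r) about the search region $\lambda S$, and Tube($R_t$, r) about the rejection region $R_t$, which starts at $t$ so that the tube starts at $t - r$; axes $Z_1 \sim N(0,1)$ and $Z_2 \sim N(0,1)$.]
• Find volume or probability, expand as a power series in $r$, pull off coefficients:
$$|\mathrm{Tube}(\lambda S, r)| = \sum_{d=0}^{D} \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, \mathcal{L}_{D-d}(S)\, r^d, \qquad P(\mathrm{Tube}(R_t, r)) = \sum_{d=0}^{\infty} \frac{(2\pi)^{d/2}}{d!}\, \rho_d(R_t)\, r^d.$$

EC density of the T-statistic field, $\nu$ df
- After lots of messy algebra (Morse)
- Two lines (Taylor's Gaussian Tube Formula)
$$\rho_0(t) = \int_t^\infty \frac{\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{(\nu\pi)^{1/2}\,\Gamma\bigl(\frac{\nu}{2}\bigr)} \Bigl(1 + \frac{u^2}{\nu}\Bigr)^{-(\nu+1)/2} du$$
$$\rho_1(t) = (2\pi)^{-1} \Bigl(1 + \frac{t^2}{\nu}\Bigr)^{-(\nu-1)/2}$$
$$\rho_2(t) = (2\pi)^{-3/2}\, \frac{\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{\bigl(\frac{\nu}{2}\bigr)^{1/2}\,\Gamma\bigl(\frac{\nu}{2}\bigr)}\; t\, \Bigl(1 + \frac{t^2}{\nu}\Bigr)^{-(\nu-1)/2}$$
$$\rho_3(t) = (2\pi)^{-2} \Bigl(\frac{\nu-1}{\nu}\, t^2 - 1\Bigr) \Bigl(1 + \frac{t^2}{\nu}\Bigr)^{-(\nu-1)/2}$$

Multivariate linear models for random field data

$Y(s)_{n \times q}$ is the observations × variables data matrix at point $s \in S \subset \mathbb{R}^D$. At every point $s$ we have a multivariate linear model:
$$Y(s)_{n \times q} = X_{n \times p}\, B(s)_{p \times q} + E(s)_{n \times q}, \qquad E(s)_{n \times q} \sim N(0, I \otimes \Sigma).$$
We detect sparse $s$ where $B(s) \ne 0$ by testing $H_0 : B(s) = 0$ at every point $s$. Test statistics $T(s)$:

                       # regressors
                       p = 1             p > 1
# variables   q = 1    T                 F
              q > 1    Hotelling's T²    Wilks' Λ, Pillai's trace, Roy's max root, etc.

Need null distribution: $P(\max_{s \in S} T(s) \ge t)$

Deformation Based Morphometry (DBM)

D'Arcy Thompson (1917) On Growth and Form
[Figure: deformation vector Y(s) at point s = (s1, s2).]
n1 = 17 non-missile brain trauma patients, 3-14 days in coma, n2 = 19 age and gender matched controls
Y(s) = vector (q=3) deformations needed to warp each MRI to an atlas standard
Locate damage: find regions where deformations are different, i.e. shape change

Hotelling's T² random field, $\nu$ df
- After lots of messy algebra (Morse)
$$\rho_0(t) = \int_t^\infty \frac{\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{\Gamma\bigl(\frac{q}{2}\bigr)\Gamma\bigl(\frac{\nu-q+1}{2}\bigr)}\, (1+u)^{-\frac{\nu+1}{2}}\, u^{\frac{q-2}{2}}\, du,$$
$$\rho_1(t) = \frac{\pi^{-\frac{1}{2}}\,\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{\Gamma\bigl(\frac{q}{2}\bigr)\Gamma\bigl(\frac{\nu-q+2}{2}\bigr)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-1}{2}},$$
$$\rho_2(t) = \frac{\pi^{-1}\,\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{\Gamma\bigl(\frac{q}{2}\bigr)\Gamma\bigl(\frac{\nu-q+1}{2}\bigr)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-2}{2}} \Bigl(t - \frac{q-1}{\nu-q+1}\Bigr),$$
$$\rho_3(t) = \frac{\pi^{-\frac{3}{2}}\,\Gamma\bigl(\frac{\nu+1}{2}\bigr)}{\Gamma\bigl(\frac{q}{2}\bigr)\Gamma\bigl(\frac{\nu-q}{2}\bigr)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-3}{2}} \Bigl(t^2 - \frac{2q-1}{\nu-q}\, t + \frac{(q-1)(q-2)}{(\nu-q+2)(\nu-q)}\Bigr).$$

Last case

                       # regressors
                       p = 1               p > 1
# variables   q = 1    T ✔                 F ✔
              q > 1    Hotelling's T² ✔    Wilks' Λ, Pillai's trace, Roy's max root, etc. ?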
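The EC densities listed earlier for the Gaussian field and the T-statistic field can be evaluated numerically. The following is a minimal sketch, not the authors' software: the function names are hypothetical, the search region is assumed to be a ball of radius `radius` (so that the intrinsic volumes follow the D = 3 list above), and the T-field densities are taken as written on the slide, i.e. without a $\lambda$ factor.

```python
import math

def gaussian_ec_densities(t, lam=1.0):
    """rho_0..rho_3 of a unit-variance Gaussian field; lam = Sd(dZ/ds)."""
    e = math.exp(-t * t / 2)
    return [
        0.5 * math.erfc(t / math.sqrt(2)),                # rho_0 = P(Z >= t)
        lam / (2 * math.pi) * e,                          # rho_1
        lam ** 2 / (2 * math.pi) ** 1.5 * t * e,          # rho_2
        lam ** 3 / (2 * math.pi) ** 2 * (t * t - 1) * e,  # rho_3
    ]

def t_ec_densities(t, nu):
    """rho_1..rho_3 of a T field with nu df, as on the slide (no lam factor)."""
    base = (1 + t * t / nu) ** (-(nu - 1) / 2)
    # Gamma((nu+1)/2) / Gamma(nu/2), computed on the log scale for stability
    gamma_ratio = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))
    return [
        base / (2 * math.pi),
        (2 * math.pi) ** -1.5 * gamma_ratio / math.sqrt(nu / 2) * t * base,
        (2 * math.pi) ** -2 * ((nu - 1) / nu * t * t - 1) * base,
    ]

def ball_intrinsic_volumes(radius):
    """mu_0..mu_3 of a ball (assumed search region), using the D = 3 list:
    EC, 2 x diameter, half the surface area, volume."""
    return [
        1.0,
        2 * (2 * radius),
        0.5 * 4 * math.pi * radius ** 2,
        4 / 3 * math.pi * radius ** 3,
    ]

def gaussian_p_value(t, radius, lam=1.0):
    """EC heuristic: P(max Z >= t) ~ sum_d mu_d(S) rho_d(t)."""
    mu = ball_intrinsic_volumes(radius)
    rho = gaussian_ec_densities(t, lam)
    return sum(m * r for m, r in zip(mu, rho))

p_approx = gaussian_p_value(4.0, radius=10.0)
```

As a consistency check on the formulas, `t_ec_densities(t, nu)` approaches the last three entries of `gaussian_ec_densities(t)` as `nu` grows, which is what one expects since a T field with many degrees of freedom is close to Gaussian.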
We shall now find a P-value approximation (but not quite the EC density) for the maximum of Roy's maximum root,
$$T(s) = \text{maximum eigenvalue of } Y(s)'X(X'X)^{-1}X'Y(s)\,\bigl(Y(s)'(I - X(X'X)^{-1}X')Y(s)\bigr)^{-1}.$$
The above messy algebra is just too complicated. Instead . . .

Roy's union-intersection principle

Make it into a univariate linear model by multiplying by vector $u_{q \times 1}$:
$$(Y(s)u)_{n \times 1} = X_{n \times p}\,(B(s)u)_{p \times 1} + (E(s)u)_{n \times 1}, \qquad H_{0u} : B(s)u = 0.$$
Let $F(s, u)$ be the usual F-statistic for testing $H_{0u}$. Let $\bigcirc_q \subset \mathbb{R}^q$ be the unit q-sphere. Roy's maximum root is
$$T(s) = \max_{u \in \bigcirc_q} F(s, u).$$
Now $F(s, u)$ is an F-field in search region $S \times \bigcirc_q$ and we already know the EC density of the F-field, $\rho_d(F \ge t)$, so
$$P\Bigl(\max_{s \in S} T(s) \ge t\Bigr) = P\Bigl(\max_{s \in S,\, u \in \bigcirc_q} F(s, u) \ge t\Bigr) \approx \frac{1}{2} \sum_{d=0}^{D+q} \mu_d(S \times \bigcirc_q)\,\rho_d(F \ge t) = \frac{1}{2} \sum_{d=0}^{D} \mu_d(S) \sum_{k=0}^{q} \mu_k(\bigcirc_q)\,\rho_{d+k}(F \ge t).$$
Why ½? $F(s, u) = F(s, -u)$.
The inner sum over $k$ is almost the EC density of the Roy's maximum root field.

Cross correlation random field

Let $X(s)$, $s \in S \subset \mathbb{R}^M$, and $Y(t)$, $t \in T \subset \mathbb{R}^N$ be $n \times 1$ vectors of Gaussian random fields. Define the cross correlation random field as
$$C(s, t) = \frac{X(s)'Y(t)}{\bigl(X(s)'X(s)\; Y(t)'Y(t)\bigr)^{1/2}}.$$
$$P\Bigl(\max_{s \in S,\, t \in T} C(s, t) \ge c\Bigr) \approx E(\mathrm{EC}\{s \in S, t \in T : C(s, t) \ge c\}) = \sum_{i=0}^{\dim(S)} \sum_{j=0}^{\dim(T)} \mathcal{L}_i(S)\,\mathcal{L}_j(T)\,\rho_{ij}(C \ge c),$$
$$\rho_{ij}(C \ge c) = \frac{2^{n-2-h}\,(i-1)!\,j!}{\pi^{h/2+1}} \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} (-1)^k c^{h-1-2k} (1 - c^2)^{(n-1-h)/2+k} \sum_{l=0}^{k} \sum_{m=0}^{k} \frac{\Gamma\bigl(\frac{n-i}{2} + l\bigr)\,\Gamma\bigl(\frac{n-j}{2} + m\bigr)}{l!\,m!\,(k-l-m)!\,(n-1-h+l+m+k)!\,(i-1-k-l+m)!\,(j-k-m+l)!}.$$

Maximum canonical cross correlation random field

Let $X(s)_{n \times p}$, $s \in S \subset \mathbb{R}^M$, and $Y(t)_{n \times q}$, $t \in T \subset \mathbb{R}^N$ be matrices of Gaussian random fields. Define the maximum canonical cross correlation random field as
$$C(s, t) = \max_{u, v} \frac{u'X(s)'Y(t)v}{\bigl(u'X(s)'X(s)u\;\; v'Y(t)'Y(t)v\bigr)^{1/2}},$$
the maximum of the canonical correlations between $X$ and $Y$, defined as the singular values of $(X'X)^{-1/2}X'Y(Y'Y)^{-1/2}$.
$$P\Bigl(\max_{s \in S,\, t \in T} C(s, t) \ge c\Bigr) \approx \frac{1}{2} \sum_{i=0}^{M} \mathcal{L}_i(S) \sum_{j=0}^{N} \mathcal{L}_j(T) \sum_{k=0}^{p} \mathcal{L}_k(\bigcirc_p) \sum_{l=0}^{q} \mathcal{L}_l(\bigcirc_q)\,\rho_{i+k,\,j+l}(C \ge c),$$
where
$$\mathcal{L}_k(\bigcirc_p) = \frac{2^{k+1}\,\pi^{k/2}\,\Gamma\bigl(\frac{p+1}{2}\bigr)}{k!\,\bigl(\frac{p-1-k}{2}\bigr)!}$$
if $p - 1 - k$ is even, and zero otherwise, $k = 0, \ldots, p-1$.

What is 'bubbles'?

Nature (2005)
Subject is shown one of 40 faces chosen at random … (Happy, Sad, Fearful, Neutral) … but face is only revealed through random 'bubbles'
First trial: "Sad" expression
[Figure: the Sad face; 75 random bubble centres; the centres smoothed by a Gaussian 'bubble' (colour scale 0 to 1); what the subject sees.]
Subject is asked the expression. Response: "Neutral". INCORRECT

Your turn …
Trial 2. Subject response: "Fearful". CORRECT
Trial 3. Subject response: "Happy". INCORRECT (Fearful)
Trial 4. Subject response: "Happy". CORRECT
Trial 5. Subject response: "Fearful". CORRECT
Trial 6. Subject response: "Sad". CORRECT
Trial 7. Subject response: "Happy". CORRECT
Trial 8. Subject response: "Neutral". CORRECT
Trial 9. Subject response: "Happy". CORRECT
…
Trial 3000. Subject response: "Happy". INCORRECT (Fearful)

Bubbles analysis

E.g. Fearful (3000/4 = 750 trials):
Trial 1 + 2 + 3 + 4 + 5 + 6 + 7 + … + 750 = Sum
[Figure: bubble masks for the individual trials, their sum over all trials, and their sum over correct trials.]
Proportion of correct bubbles = (sum correct bubbles) / (sum all bubbles)
Thresholded at proportion of correct trials = 0.68, scaled to [0,1]
Use this as a bubble mask

Results

Mask average face
[Figure: masked average face for Happy, Sad, Fearful, Neutral.]
But are these features real or just noise? Need statistics …

Statistical analysis

Correlate bubbles with response (correct = 1, incorrect = 0), separately for each expression
Equivalent to 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful:
[Figure: bubble masks for trials 1, 2, …, 750 with the response (0 or 1) for each trial, and the resulting Z~N(0,1) statistic image.]
Very similar to the proportion of correct bubbles.

Results

Thresholded at Z=1.64 (P=0.05)
[Figure: thresholded Z~N(0,1) statistic on the average face for Happy, Sad, Fearful, Neutral; colour scale from 1.64 to 4.58.]
Multiple comparisons correction?
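The equivalence stated in the statistical analysis above, between correlating a bubble mask with the response and a two-sample statistic for correct vs. incorrect trials, can be checked numerically at a single pixel. This is a minimal sketch on simulated data; the pooled-variance form of the two-sample statistic, the simulated mask values and all names are assumptions made for illustration, not taken from the slides.

```python
import math
import random

def pooled_two_sample_stat(x, y):
    """Two-sample statistic (pooled variance) for x on trials with y == 1
    versus x on trials with y == 0."""
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    n1, n0 = len(x1), len(x0)
    m1, m0 = sum(x1) / n1, sum(x0) / n0
    ss = sum((v - m1) ** 2 for v in x1) + sum((v - m0) ** 2 for v in x0)
    pooled_var = ss / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))

def correlation_stat(x, y):
    """Sample correlation r between x and y, converted to the usual
    test statistic r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Simulated data for one pixel (hypothetical): 750 trials, response 1 = correct.
rng = random.Random(0)
n_trials = 750
response = [1 if rng.random() < 0.6 else 0 for _ in range(n_trials)]
# Mask value at this pixel, made slightly larger on correct trials.
mask = [rng.random() + 0.1 * r for r in response]

z_two_sample = pooled_two_sample_stat(mask, response)
z_corr = correlation_stat(mask, response)
```

The two statistics agree up to rounding error, and with 750 trials the reference distribution is close to N(0,1). Repeating the calculation at every pixel would give a Z image of the kind shown in the slides.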
Need random field theory …

Euler Characteristic Heuristic

Euler characteristic (EC) = #blobs - #holes (in 2D)
Excursion set $X_t = \{s : Z(s) \ge t\}$, e.g. for neutral face:
[Figure: excursion sets of the neutral face at a range of thresholds, each labelled with its EC (values shown include 0, -7, -11, 13, 14, 9 and 1); plot of observed and expected EC($X_t$) against threshold $t$ from -4 to 4.]
Heuristic: At high thresholds $t$, the holes disappear, EC ~ 1 or 0, E(EC) ~ P(max Z ≥ t).
• Exact expression for E(EC) for all thresholds,
• E(EC) ~ P(max Z ≥ t) is extremely accurate.

The result

If $Z(s) \sim N(0, 1)$ is an isotropic Gaussian random field, $s \in \mathbb{R}^2$, with $\lambda^2 I_{2 \times 2} = V(\partial Z/\partial s)$,
$$P\Bigl(\max_{s \in S} Z(s) \ge t\Bigr) \approx E(\mathrm{EC}(S \cap \{s : Z(s) \ge t\}))$$
$$= \mathrm{EC}(S) \times \int_t^\infty \frac{1}{(2\pi)^{1/2}}\, e^{-z^2/2}\, dz \;+\; \lambda\,\tfrac{1}{2}\,\mathrm{Perimeter}(S) \times \frac{1}{2\pi}\, e^{-t^2/2} \;+\; \lambda^2\,\mathrm{Area}(S) \times \frac{1}{(2\pi)^{3/2}}\, t\, e^{-t^2/2}.$$
In each term the first factor is a Lipschitz-Killing curvature of $S$, namely $\mathcal{L}_0(S)$, $\mathcal{L}_1(S)$, $\mathcal{L}_2(S)$ (= Resels(S) × c), and the second factor is an EC density of $Z$ above $t$, namely $\rho_0(Z \ge t)$, $\rho_1(Z \ge t)$, $\rho_2(Z \ge t)$.

If $Z(s)$ is white noise convolved with an isotropic Gaussian filter of Full Width at Half Maximum FWHM then
$$\lambda = \frac{\sqrt{4 \log 2}}{\mathrm{FWHM}}.$$
[Figure: Z(s) = white noise * filter of width FWHM.]

Results, corrected for search

Random field theory threshold: Z=3.92 (P=0.05)
[Figure: thresholded Z~N(0,1) statistic on the average face for Happy, Sad, Fearful, Neutral; colour scale from 3.92 to 4.58.]
Saddle-point approx (2007): Z = 3.82, 3.80, 3.81, 3.80 (P=0.05; the values shown under the four panels)
Bonferroni: Z=4.87 (P=0.05) – nothing

Scale space: smooth Z(s) with range of filter widths w
= continuous wavelet transform
adds an extra dimension to the random field: Z(s,w)

[Figure: Z(s,w) for "Scale space, no signal" and for "One 15mm signal"; horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 34.]
15mm signal is best detected with a 15mm smoothing filter

Matched Filter Theorem (= Gauss-Markov Theorem): "to best detect signal + white noise, filter should match signal"

[Figure: Z(s,w) for "10mm and 23mm signals" and for "Two 10mm signals 20mm apart"; horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 34.]
But if the signals are too close together they are detected as a single signal half way between them

Scale space can even separate two signals at the same location!
[Figure: Z(s,w) for "8mm and 150mm signals at the same location"; horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 170.]

Scale space Lipschitz-Killing curvatures

Suppose $f$ is a kernel with $\int f^2 = 1$ and $B$ is a Brownian sheet. Then the scale space random field is
$$Z(s, w) = w^{-D/2} \int f\Bigl(\frac{s - h}{w}\Bigr)\, dB(h) \sim N(0, 1).$$
Lipschitz-Killing curvatures:
$$\mathcal{L}_d(S \times [w_1, w_2]) = \frac{w_1^{-d} + w_2^{-d}}{2}\, \mathcal{L}_d(S) + \sum_{j=0}^{\lfloor (D-d+1)/2 \rfloor} \frac{w_1^{-d-2j+1} - w_2^{-d-2j+1}}{d + 2j - 1} \times \frac{\kappa^{(1-2j)/2}\,(-1)^j\,(d+2j-1)!}{(1-2j)\,(4\pi)^j\, j!\,(d-1)!}\, \mathcal{L}_{d+2j-1}(S),$$
where $\kappa = \int \bigl(s'\,\frac{\partial f(s)}{\partial s} + \frac{D}{2} f(s)\bigr)^2 ds$. For a Gaussian kernel, $\kappa = D/2$. Then
$$P\Bigl(\max_{s \in S} Z(s) \ge t\Bigr) \approx \sum_{d=0}^{D+1} \mathcal{L}_d(S \times [w_1, w_2])\,\rho_d(R_t).$$

Rotation space: Try all rotated elliptical filters
[Figure: unsmoothed data; maximum filter; threshold Z=5.25 (P=0.05).]

Bubbles task in fMRI scanner

Correlate bubbles with BOLD at every voxel:
[Figure: bubble masks for trials 1, 2, …, 3000 and the corresponding fMRI images.]
Calculate Z for each pair (bubble pixel, fMRI voxel): a 5D "image" of Z statistics …

Thresholding? Cross correlation random field

Correlation between 2 fields at 2 different locations, searched over all pairs of locations, one in S, one in T:
$$P\Bigl(\max_{s \in S,\, t \in T} C(s, t) \ge c\Bigr) \approx E(\mathrm{EC}\{s \in S, t \in T : C(s, t) \ge c\}) = \sum_{i=0}^{\dim(S)} \sum_{j=0}^{\dim(T)} \mathcal{L}_i(S)\,\mathcal{L}_j(T)\,\rho_{ij}(C \ge c),$$
$$\rho_{ij}(C \ge c) = \frac{2^{n-2-h}\,(i-1)!\,j!}{\pi^{h/2+1}} \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} (-1)^k c^{h-1-2k} (1 - c^2)^{(n-1-h)/2+k} \sum_{l=0}^{k} \sum_{m=0}^{k} \frac{\Gamma\bigl(\frac{n-i}{2} + l\bigr)\,\Gamma\bigl(\frac{n-j}{2} + m\bigr)}{l!\,m!\,(k-l-m)!\,(n-1-h+l+m+k)!\,(i-1-k-l+m)!\,(j-k-m+l)!}.$$
Bubbles data: P=0.05, n=3000, c=0.113, T=6.22

Discussion: modeling

The random response is Y=1 (correct) or 0 (incorrect), or Y=fMRI
The regressors are $X_j$ = bubble mask at pixel j, j = 1 … 240x380 = 91200 (!)
Logistic regression or ordinary regression:
logit(E(Y)) or E(Y) = $b_0 + X_1 b_1 + \ldots + X_{91200}\, b_{91200}$
But there are only n=3000 observations (trials) …
Instead, since regressors are independent, fit them one at a time:
logit(E(Y)) or E(Y) = $b_0 + X_j b_j$
However the regressors (bubbles) are random with a simple known distribution, so turn the problem around and condition on Y:
E($X_j$) = $c_0 + Y c_j$
Equivalent to conditional logistic regression (Cox, 1962) which gives exact inference for $b_1$ conditional on sufficient statistics for $b_0$
Cox also suggested using saddle-point approximations to improve accuracy of inference …
Interactions? logit(E(Y)) or E(Y) = $b_0 + X_1 b_1 + \ldots + X_{91200}\, b_{91200} + X_1 X_2\, b_{1,2} + \ldots$

MS lesions and cortical thickness

Idea: MS lesions interrupt neuronal signals, causing thinning in down-stream cortex
Data: n = 425 mild MS patients
[Figure: average cortical thickness (mm) against total lesion volume (cc). Correlation = -0.568, T = -14.20 (423 df).]

MS lesions and cortical thickness at all pairs of points

Dominated by total lesions and average cortical thickness, so remove these effects as follows:
CT = cortical thickness, smoothed 20mm
ACT = average cortical thickness
LD = lesion density, smoothed 10mm
TLV = total lesion volume
Find partial correlation(LD, CT-ACT) removing TLV via linear model:
CT-ACT ~ 1 + TLV + LD
test for LD
Repeat for all voxels in 3D, nodes in 2D
~1 billion correlations, so thresholding essential!
Look for high negative correlations …
Threshold: P=0.05, c=0.300, T=6.48

Cluster extent rather than peak height (Friston, 1994)

Choose a lower level, e.g. t=3.11 (P=0.001)
Find clusters, i.e. connected components of excursion set
Measure cluster extent by resels $\mathcal{L}_D(\text{cluster})$
Distribution: fit a quadratic to the peak
[Figure, D=1: Z against s, showing the peak height Y above the level t and the cluster extent, with the annotation $\mathcal{L}_D(\text{cluster}) \sim c \ldots$]
Distribution of maximum cluster extent: Bonferroni on N = #clusters ~ E(EC).

References

Adler, R.J. and Taylor, J.E. (2007). Random fields and geometry. Springer.
Adler, R.J., Taylor, J.E. and Worsley, K.J. (2008). Random fields, geometry, and applications. In preparation.