The geometry of random fields in astrophysics and brain mapping

Keith Worsley, Farzan Rohani, McGill
Nicholas Chamandy, McGill and Google
Jonathan Taylor, Stanford and Université de Montréal
Jin Cao, Lucent
Arnaud Charil, Montreal Neurological Institute
Frédéric Gosselin, Université de Montréal
Philippe Schyns, Fraser Smith, Glasgow

Astrophysics

Sloan Digital Sky Survey, release 6, Aug. '07.

[Figure: observed vs. expected Euler characteristic (EC) of the excursion set of the Sloan Digital Sky Survey data (FWHM = 19.8335), plotted against the Gaussian threshold. High thresholds give a "meat ball" topology (isolated blobs, positive EC), intermediate thresholds a "sponge" topology (many holes, negative EC), and low thresholds a "bubble" topology (isolated holes, positive EC).]

fMRI data: 120 scans, 3 scans each of hot, rest, warm, rest, ...

[Figure: first scan of the fMRI data; time courses at a voxel with a highly significant effect (T = 6.59) and at a voxel with no significant effect (T = -0.74), each showing the alternating hot/rest/warm blocks plus drift; image of the T statistic for the hot - warm effect.]

T = (hot - warm effect) / s.d. ~ t110 if there is no effect.

Linear model regressors

Alternating hot and warm stimuli separated by rest (9 seconds each).
Hemodynamic response function (HRF): difference of two gamma densities.
Regressors = stimuli * HRF, sampled every 3 seconds.

[Figure: stimulus time courses, the HRF, and the resulting regressors over 0-350 seconds.]

Brain imaging

Detect sparse regions of "activation":
- Construct a test statistic "image" for detecting activation.
- Activated regions: test statistic > threshold.
- Choose the threshold to control the false positive rate at, say, 0.05, i.e. P(max test statistic > threshold) = 0.05.
- Bonferroni??? Too conservative ...
- False discovery rate??? Not appropriate ...

Detecting sparse cone alternatives

Test statistics are usually functions of Gaussian fields, e.g. T or F statistics. Let's take a challenging example: a random field of chi-bar statistics for detecting a sparse cone alternative.
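Throughout the talk the EC of an excursion set is the basic computable quantity, so here is a minimal sketch (Python; the function name and toy shapes are ours, not from the talk) of EC = #blobs - #holes on a pixel lattice:

```python
import numpy as np

def euler_char(mask):
    """EC (#blobs - #holes) of a 2D binary excursion set.

    Treats each 'on' pixel as a vertex of a cubical complex and counts
    EC = #vertices - #edges (4-neighbour pairs) + #faces (2x2 blocks)."""
    m = np.asarray(mask, dtype=bool)
    v = int(m.sum())
    e = int((m[:, :-1] & m[:, 1:]).sum() + (m[:-1, :] & m[1:, :]).sum())
    f = int((m[:-1, :-1] & m[:-1, 1:] & m[1:, :-1] & m[1:, 1:]).sum())
    return v - e + f

# One blob has EC 1; a blob with a hole in it has EC 0.
blob = np.ones((3, 3))
ring = np.ones((3, 3)); ring[1, 1] = 0
print(euler_char(blob), euler_char(ring))   # 1 0
```

Sweeping the threshold t and plotting euler_char(Z >= t) against t gives observed EC curves of the kind shown above.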
Application: detecting fMRI activation in the presence of unknown latency of the hemodynamic response function (HRF)

Linear model with two regressors, the HRF shifted by +/-2 seconds. Fit by non-negative least squares; this is equivalent to a cone alternative.

[Figure: the HRF shifted by -2 and +2 seconds, and the stimulus with the two resulting regressors over 0-70 seconds.]

Example test statistic: chi-bar

    chi-bar = max_{0 <= theta <= pi/2} Z1 cos(theta) + Z2 sin(theta),  Z1, Z2 ~ N(0,1) independently.

Excursion sets X_t = {s : chi-bar(s) >= t} of the chi-bar random field over the search region S; rejection regions R_t = {Z : chi-bar >= t}, a cone alternative about the null.

[Figure: the search region S in (s1, s2) with its excursion sets above the threshold t; the rejection regions R_t in (Z1, Z2), a cone about the null.]

Euler characteristic heuristic again

EC = #blobs - #holes. The heuristic:

    P( max_{s in S} chi-bar(s) >= t ) ~ E(EC(S ∩ X_t)) = 0.05  =>  t = 3.75.

[Figure: observed vs. expected EC of the excursion sets, plotted against the threshold t.]

The expected EC is EXACT:

    E(EC(S ∩ X_t)) = sum_{d=0}^{D} L_d(S) rho_d(t),  lambda = Sd(dZ/ds).

Lipschitz-Killing curvature L_d(S): Steiner-Weyl tube formula (1930). Put a tube of radius r about the search region lambda*S, find its volume, expand as a power series in r, and pull off the coefficients:

    |Tube(lambda*S, r)| = sum_{d=0}^{D} [ pi^{d/2} / Gamma(d/2 + 1) ] L_{D-d}(S) r^d.

EC density rho_d(t): Morse theory approach (1995),

    rho_d(t) = (1/lambda^d) E( 1{Z >= t} det( -d^2 Z / ds ds' ) | dZ/ds = 0 ) P(dZ/ds = 0).

For Z a Gaussian random field,

    rho_d(t) = ( -(2 pi)^{-1/2} d/dt )^d P(Z >= t).

For Z a chi-bar random field???
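The chi-bar statistic defined above is simple enough to compute directly; a small sketch (Python; function names are ours) compares a brute-force maximisation over theta with the closed form, which follows by writing (Z1, Z2) = R (cos(phi), sin(phi)): the max is R when phi lies in [0, pi/2], and is attained at an endpoint otherwise.

```python
import numpy as np

def chibar_grid(z1, z2, ngrid=10001):
    """max over theta in [0, pi/2] of z1*cos(theta) + z2*sin(theta),
    by brute-force grid search."""
    theta = np.linspace(0.0, np.pi / 2, ngrid)
    return float(np.max(z1 * np.cos(theta) + z2 * np.sin(theta)))

def chibar(z1, z2):
    """Closed form: the radius if (z1, z2) lies in the first-quadrant
    cone, otherwise the larger of the two endpoint values."""
    if z1 >= 0 and z2 >= 0:
        return float(np.hypot(z1, z2))
    return float(max(z1, z2))

print(chibar(3.0, 4.0))   # 5.0
```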
Lipschitz-Killing curvature L_d(S) of a triangle

Steiner-Weyl volume of tubes formula (1930), with lambda = Sd(dZ/ds) = sqrt(4 log 2) / FWHM: put a tube of radius r about the scaled triangle lambda*S, then

    Area(Tube(lambda*S, r)) = sum_{d=0}^{D} [ pi^{d/2} / Gamma(d/2 + 1) ] L_{D-d}(S) r^d
                            = L_2(S) + 2 L_1(S) r + pi L_0(S) r^2
                            = Area(lambda*S) + Perimeter(lambda*S) r + EC(lambda*S) pi r^2,

so that

    L_0(S) = EC(lambda*S),
    L_1(S) = (1/2) Perimeter(lambda*S),
    L_2(S) = Area(lambda*S).

Lipschitz-Killing curvatures are just "intrinsic volumes" or "Minkowski functionals" in the (Riemannian) metric of the variance of the derivative of the process.

Lipschitz-Killing curvature L_d(S) of any set S

Triangulate S, and measure edge lengths scaled by lambda = Sd(dZ/ds). For the pieces of the triangulation, points (.), edges (-) and triangles (tri):

    L_0(.) = 1,  L_0(-) = 1,  L_0(tri) = 1;
    L_1(-) = edge length,  L_1(tri) = (1/2) perimeter;
    L_2(tri) = area.

The Lipschitz-Killing curvature of S, the union of the triangles, follows by inclusion-exclusion:

    L_0(S) = sum L_0(.) - sum L_0(-) + sum L_0(tri),
    L_1(S) = sum L_1(-) - sum L_1(tri),
    L_2(S) = sum L_2(tri).

Non-isotropic data? Use the Riemannian metric of Var(grad Z): measure the edge lengths locally in that metric, then apply the same formulae.

[Figure: a triangulated search region with edge lengths scaled by lambda, for Z ~ N(0,1); in the non-isotropic case lambda varies over the region.]

Estimating Lipschitz-Killing curvature L_d(S)

We need independent and identically distributed random fields, e.g.
residuals from a linear model, Z_1, Z_2, ..., Z_n. Replace the coordinates of the triangles in S, a subset of R^2, by the normalised residuals Z/||Z||, Z = (Z_1, ..., Z_n) in R^n, and measure the triangulation there. Taylor & Worsley, JASA (2007).

Beautiful symmetry:

    E(EC(S ∩ X_t)) = sum_{d=0}^{D} L_d(S) rho_d(t),  lambda = Sd(dZ/ds).

Lipschitz-Killing curvature L_d(S): Steiner-Weyl tube formula (1930). Put a tube of radius r about the search region lambda*S, find its volume, expand as a power series in r, and pull off the coefficients:

    |Tube(lambda*S, r)| = sum_{d=0}^{D} [ pi^{d/2} / Gamma(d/2 + 1) ] L_{D-d}(S) r^d.

EC density rho_d(t): Taylor's Gaussian tube formula (2003). Put a tube of radius r about the rejection region R_t, find its probability, expand as a power series in r, and pull off the coefficients:

    P(Tube(R_t, r)) = sum_{d=0}^{infinity} [ (2 pi)^{d/2} / d! ] rho_d(t) r^d.

EC density rho_d(t) of the chi-bar statistic

For Z_1, Z_2 ~ N(0,1),

    P((Z_1, Z_2) in Tube(R_t, r)) = rho_0(t) + (2 pi)^{1/2} rho_1(t) r + (2 pi) rho_2(t) r^2 / 2 + ...
                                  = integral_{t-r}^{infinity} (2 pi)^{-1/2} e^{-z^2/2} dz + e^{-(t-r)^2/2} / 4,

and matching powers of r gives

    rho_0(t) = integral_t^{infinity} (2 pi)^{-1/2} e^{-z^2/2} dz + e^{-t^2/2} / 4,
    rho_1(t) = (2 pi)^{-1} e^{-t^2/2} + (2 pi)^{-1/2} e^{-t^2/2} t / 4,
    rho_2(t) = (2 pi)^{-3/2} e^{-t^2/2} t + (2 pi)^{-1} e^{-t^2/2} (t^2 - 1) / 8,
    ...

Taylor & Worsley, Annals of Statistics, submitted (2007).

General cone alternatives

Let Z = (Z_1, ..., Z_n) ~ N(mu, I_{n x n} sigma^2).
We wish to test H_0: mu = 0 vs. H_1: mu in Cone = {a u : a >= 0, u in U}, U a subset of the unit sphere in R^n. If sigma is unknown, the likelihood ratio test statistic is equivalent to

    B-bar = max_{u in U} (u'Z)^2 / ||Z||^2.

Lin & Lindsay, Takemura & Kuriki (1997): with B_{j,n-j} ~ Beta(j/2, (n-j)/2),

    P(B-bar >= t) = sum_{j=1}^{n} w_j P(B_{j,n-j} >= t).

Taylor & Worsley (2007): let rho_d^F(t) be the EC density of the random field F; then

    rho_d^{B-bar}(t) = sum_{j=1}^{n} w_j rho_d^{B_{j,n-j}}(t).

Applying the EC heuristic:

    P( max_{s in S} B-bar(s) >= t ) ~ sum_{j=1}^{n} w_j P( max_{s in S} B_{j,n-j}(s) >= t ).

Proof for n = 3: the Gaussian random field in 3D

If Z(s) ~ N(0,1) is an isotropic Gaussian random field, s in R^3, with lambda^2 I_{3x3} = Var(dZ/ds), then

    E(EC(S ∩ {s : Z(s) >= t}))
      =          EC(S)            x integral_t^{infinity} (2 pi)^{-1/2} e^{-z^2/2} dz   [L_0(S) rho_0(t)]
      + lambda   2 Diameter(S)    x (2 pi)^{-1} e^{-t^2/2}                              [L_1(S) rho_1(t)]
      + lambda^2 (1/2) Area(S)    x (2 pi)^{-3/2} t e^{-t^2/2}                          [L_2(S) rho_2(t)]
      + lambda^3 Volume(S)        x (2 pi)^{-2} (t^2 - 1) e^{-t^2/2}                    [L_3(S) rho_3(t)].

If Z(s) is white noise convolved with an isotropic Gaussian filter of full width at half maximum FWHM, then lambda = sqrt(4 log 2) / FWHM.

The accuracy of the EC heuristic

Under the same assumptions, for some alpha > 1,

    P( max_{s in S} Z(s) >= t ) = E(EC(S ∩ {s : Z(s) >= t})) + O(e^{-alpha t^2 / 2}),

with E(EC) given term by term as above. The expected EC gives all the polynomial terms in the expansion for the P-value; the remaining error is exponentially smaller.

What is 'bubbles'?
Nature (2005). The subject is shown one of 40 faces chosen at random, with a happy, sad, fearful or neutral expression, but the face is only revealed through random 'bubbles'.

First trial: "Sad" expression. 75 random bubble centres are smoothed by a Gaussian 'bubble'; what the subject sees is the face masked by the smoothed bubbles. The subject is asked the expression. Response: "Neutral". Incorrect.

Your turn ... Trial 2: response "Fearful", correct. Trial 3: "Happy", incorrect (Fearful). Trial 4: "Happy", correct. Trial 5: "Fearful", correct. Trial 6: "Sad", correct. Trial 7: "Happy", correct. Trial 8: "Neutral", correct. Trial 9: "Happy", correct. ... Trial 3000: "Happy", incorrect (Fearful).

Bubbles analysis

E.g. Fearful (3000/4 = 750 trials): sum the bubble masks over trials. Proportion of correct bubbles = (sum of bubbles on correct trials) / (sum of bubbles on all trials). Threshold at the proportion of correct trials = 0.68, scale to [0,1], and use this as a bubble mask.

Results: bubble mask applied to the average face, for each of Happy, Sad, Fearful and Neutral. But are these features real or just noise? Need statistics ...

Statistical analysis

Correlate the bubbles with the response (correct = 1, incorrect = 0), separately for each expression. This is equivalent to a 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful, and is very similar to the proportion of correct bubbles.

Results: thresholded at Z = 1.64 (P = 0.05, uncorrected), shown over the average face for each expression. Multiple comparisons correction?
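The two-sample Z statistic just described is one line of numpy per pixel. A toy sketch (Python; the trial count, pixel count, seed and the single "diagnostic" pixel are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# X[trial, pixel]: bubble mask value; y[trial]: 1 if the response was
# correct.  Responses are driven (noisily) by how visible pixel 17 was.
ntrial, npix = 750, 100
X = rng.random((ntrial, npix))
y = (X[:, 17] + 0.3 * rng.standard_normal(ntrial) > 0.5).astype(int)

def zstat(X, y):
    """Two-sample Z per pixel: (mean bubble value on correct trials -
    mean on incorrect trials) / standard error of the difference."""
    x1, x0 = X[y == 1], X[y == 0]
    se = np.sqrt(x1.var(axis=0, ddof=1) / len(x1) +
                 x0.var(axis=0, ddof=1) / len(x0))
    return (x1.mean(axis=0) - x0.mean(axis=0)) / se

Z = zstat(X, y)
print(Z.argmax(), Z.max())   # the diagnostic pixel should stand out
```

Pixels that genuinely drive correct responses should stand out with large Z while the rest hover around N(0,1); deciding how large is large enough is exactly the multiple comparisons question.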
Need random field theory ...

Euler characteristic = #blobs - #holes

[Figure: excursion sets {Z > threshold} of the Z statistic for the neutral face at a sequence of thresholds, with observed EC values 30, 20, 0, -7, -11, 13, 14, 9, 1, 0; observed and expected EC plotted against the threshold.]

Heuristic: at high thresholds t the holes disappear, the EC is ~ 1 or 0, and E(EC) ~ P(max Z > t). There is an exact expression for E(EC) at all thresholds, and the approximation E(EC) ~ P(max Z > t) is extremely accurate.

The result

If Z(s) ~ N(0,1) is an isotropic Gaussian random field, s in R^3, with lambda^2 I_{3x3} = Var(dZ/ds), then

    P( max_{s in S} Z(s) >= t ) ~ E(EC(S ∩ {s : Z(s) >= t}))
      =          EC(S)            x integral_t^{infinity} (2 pi)^{-1/2} e^{-z^2/2} dz   [L_0(S) rho_0(t)]
      + lambda   2 Diameter(S)    x (2 pi)^{-1} e^{-t^2/2}                              [L_1(S) rho_1(t)]
      + lambda^2 (1/2) Area(S)    x (2 pi)^{-3/2} t e^{-t^2/2}                          [L_2(S) rho_2(t)]
      + lambda^3 Volume(S)        x (2 pi)^{-2} (t^2 - 1) e^{-t^2/2}                    [L_3(S) rho_3(t)].

If Z(s) is white noise convolved with an isotropic Gaussian filter of full width at half maximum FWHM, then lambda = sqrt(4 log 2) / FWHM.

Results, corrected for search

Random field theory threshold: Z = 3.92 (P = 0.05), applied over the average face for each expression. Bonferroni threshold: Z = 4.87 (P = 0.05): nothing survives.

Bubbles task in the fMRI scanner

Correlate the bubbles with BOLD at every voxel. Calculate Z for each pair (bubble pixel, fMRI voxel): a 5D "image" of Z statistics. Thresholding?

Cross correlation random field

Correlation between two fields at two different locations, searched over all pairs of locations, one in S, one in T:

    P( max_{s in S, t in T} C(s,t) >= c ) ~ E( EC{ s in S, t in T : C(s,t) >= c } )
                                          = sum_{i=0}^{dim(S)} sum_{j=0}^{dim(T)} L_i(S) L_j(T) rho_ij(c),

where, with h = i + j,

    rho_ij(c) = [ 2^{n-2-h} (i-1)! j! / pi^{h/2+1} ]
                sum_{k=0}^{floor((h-1)/2)} (-1)^k c^{h-1-2k} (1 - c^2)^{(n-1-h)/2 + k}
                sum_{l=0}^{k} sum_{m=0}^{k} [ Gamma((n-i)/2 + l) Gamma((n-j)/2 + m) ]
                / [ l! m! (k-l-m)! (n-1-h+l+m+k)! (i-1-k-l+m)! (j-k-m+l)! ].
Cao & Worsley, Annals of Applied Probability (1999).

Bubbles data: P = 0.05, n = 3000, c = 0.113, T = 6.22.

MS lesions and cortical thickness

Idea: MS lesions interrupt neuronal signals, causing thinning in downstream cortex.
Data: n = 425 mild MS patients; lesion density smoothed 10mm; cortical thickness smoothed 20mm.
Find connectivity, i.e. find voxels in 3D and nodes in 2D with high correlation(lesion density, cortical thickness); look for high negative correlations.
Threshold: P = 0.05, c = 0.300, T = 6.48.

[Figure: average cortical thickness vs. average lesion volume for the n = 425 subjects; correlation = -0.568.]

Discussion: modeling

The random response is Y = 1 (correct) or 0 (incorrect), or Y = fMRI. The regressors are X_j = bubble mask at pixel j, j = 1, ..., 240 x 380 = 91,200 (!). Logistic regression or ordinary regression:

    logit(E(Y)) or E(Y) = b_0 + X_1 b_1 + ... + X_91200 b_91200.

But there are only n = 3000 observations (trials) ... Instead, since the regressors are independent, fit them one at a time:

    logit(E(Y)) or E(Y) = b_0 + X_j b_j.

However, the regressors (bubbles) are random with a simple known distribution, so turn the problem around and condition on Y:

    E(X_j) = c_0 + Y c_j.

This is equivalent to conditional logistic regression (Cox, 1962), which gives exact inference for b_1 conditional on sufficient statistics for b_0. Cox also suggested using saddle-point approximations to improve the accuracy of inference ... Interactions?

    logit(E(Y)) or E(Y) = b_0 + X_1 b_1 + ... + X_91200 b_91200 + X_1 X_2 b_{1,2} + ...

Three methods so far

The set-up: S is a subset of a D-dimensional lattice (e.g. pixels); Z(s) ~ N(0,1) at most points s in S; Z(s) ~ N(mu(s), 1), mu(s) > 0, at a sparse set of points; Z(s_1), Z(s_2) are spatially correlated.
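This set-up is straightforward to simulate, which is also how the P-value approximations that follow are usually validated. A toy sketch (Python; the grid size, FWHM, threshold and simulation count are all invented):

```python
import numpy as np

rng = np.random.default_rng(3)

def smooth_field(shape, fwhm, rng):
    """Stationary lattice field: white noise convolved with a separable
    Gaussian filter, rescaled so Z(s) ~ N(0,1) away from the edges."""
    sigma = fwhm / np.sqrt(8 * np.log(2))
    x = np.arange(-int(4 * sigma), int(4 * sigma) + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    pad = len(x)
    z = rng.standard_normal((shape[0] + 2 * pad, shape[1] + 2 * pad))
    for axis in (0, 1):
        z = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"),
                                axis, z)
    z = z[pad:pad + shape[0], pad:pad + shape[1]]
    return z / (k**2).sum()   # sd after both convolutions is sum(k^2)

# Monte Carlo estimate of P(max Z >= t) over the search region
nsim, t = 200, 3.5
count = sum(smooth_field((64, 64), 8.0, rng).max() >= t
            for _ in range(nsim))
print(count / nsim)
```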
To control the false positive rate at <= alpha, we want a good approximation to alpha = P(max_S Z(s) >= t):

- Bonferroni (1936)
- Random field theory (1970's)
- Discrete local maxima (2005, 2007)
- Simulations (99999)

[Figure: P-value of the maximum vs. FWHM of the smoothing filter, comparing Bonferroni, random field theory, discrete local maxima and simulation.]

Discrete local maxima

Bonferroni applied to the events {Z(s) >= t and Z(s) is a discrete local maximum}, i.e. {Z(s) >= t and the neighbouring Z's <= Z(s)}. Conservative. If Z(s) is stationary, with Cor(Z(s_1), Z(s_2)) = rho(s_1 - s_2), all we need is

    P{ Z(s) >= t and the neighbouring Z's <= Z(s) },

a (2D+1)-variate integral.

"Markovian" trick

If rho is "separable", s = (x,y), rho((x,y)) = rho((x,0)) rho((0,y)), e.g. the Gaussian spatial correlation function rho((x,y)) = exp(-(1/2)(x^2 + y^2)/w^2), then Z(s) has a "Markovian" property: conditional on the central Z(s), the Z's on different axes are independent,

    Z(s±1) independent of Z(s±2) given Z(s).

So condition on Z(s) = z, find

    P{ neighbouring Z's <= z | Z(s) = z } = prod_d P{ Z(s±d) <= z | Z(s) = z },

then take expectations over Z(s) = z. This cuts the (2D+1)-variate integral down to a bivariate integral.

The result only involves the correlation rho_d between adjacent voxels along each lattice axis d, d = 1, ..., D. First let the Gaussian density and uncorrected P-value be

    phi(z) = exp(-z^2/2) / sqrt(2 pi),  Phi(z) = integral_z^{infinity} phi(u) du,

respectively. Then define

    Q(rho, z) = 1 - 2 Phi(h z) + (1/pi) integral_0^{alpha} exp( -(1/2) h^2 z^2 / sin^2(theta) ) d theta,

where

    h = sqrt( (1 - rho) / (1 + rho) ),  alpha = sin^{-1}( sqrt( (1 - rho^2) / 2 ) ).

Then the P-value of the maximum is bounded by

    P( max_{s in S} Z(s) >= t ) <= |S| integral_t^{infinity} prod_{d=1}^{D} Q(rho_d, z) phi(z) dz,

where |S| is the number of voxels s in the search region S. For a voxel on the boundary of the search region with just one neighbour in axis direction d, replace Q(rho, z) by 1 - Phi(h z), and by 1 if it has no neighbours.
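The bound above fits in a few lines of code. A sketch (Python; the function names, the Simpson-rule quadrature and the truncation of the z-integral at t + 8 are ours). At rho = 0 the two neighbours are conditionally independent given Z(s) = z, so Q(0, z) must reduce to P(Z <= z)^2, which gives a handy check:

```python
import math

def Phi(z):
    """Upper-tail probability P(Z >= z), as on the slide."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def _simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) intervals."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h)
                          for i in range(1, n))
    return s * h / 3

def Q(rho, z):
    """P(both neighbours along one axis <= z | Z(s) = z), slide formula."""
    h = math.sqrt((1 - rho) / (1 + rho))
    alpha = math.asin(math.sqrt((1 - rho**2) / 2))
    def f(theta):
        s = math.sin(theta)
        return math.exp(-h**2 * z**2 / (2 * s**2)) if s > 0 else 0.0
    return 1 - 2 * Phi(h * z) + _simpson(f, 0.0, alpha) / math.pi

def p_max_bound(t, nvox, rhos):
    """|S| * integral_t^inf prod_d Q(rho_d, z) phi(z) dz, interior voxels."""
    def g(z):
        phi = math.exp(-z**2 / 2) / math.sqrt(2 * math.pi)
        return phi * math.prod(Q(r, z) for r in rhos)
    return nvox * _simpson(g, t, t + 8.0, n=400)

print(p_max_bound(4.0, 32768, (0.9, 0.9, 0.9)))
```

Because Q <= 1, the bound can never be worse than Bonferroni, while for high correlation it is much smaller.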
Comparison

Bonferroni (1936):
- Conservative.
- Accurate if the spatial correlation is low.
- Simple.

Discrete local maxima (2005, 2007):
- Conservative.
- Accurate for all ranges of spatial correlation.
- A bit messy; only easy for stationary, separable Gaussian data on rectilinear lattices.
- Even if not separable, it always seems to be conservative (but there is no proof!).

Random field theory (1970's):
- An approximation based on assuming that S is continuous.
- Accurate if the spatial correlation is high.
- Elegant, and easily extended to non-Gaussian, non-isotropic random fields.
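To make the comparison concrete, here is a sketch (Python; the spherical search region, voxel size and FWHM are invented, and we read the slide's L_1 term as lambda x 2 x Diameter(S)) that solves E(EC) = 0.05 for the random field theory threshold and sets it against the Bonferroni threshold:

```python
import math

def gauss_rho(t):
    """Gaussian EC densities rho_0..rho_3 (slide formulas)."""
    e = math.exp(-t**2 / 2)
    return [0.5 * math.erfc(t / math.sqrt(2)),
            e / (2 * math.pi),
            e * t / (2 * math.pi)**1.5,
            e * (t**2 - 1) / (2 * math.pi)**2]

def expected_ec(t, fwhm, ec, diam, area, vol):
    """E(EC) for a 3D search region: L0 = EC(S), L1 = lambda*2*Diameter(S),
    L2 = lambda^2*Area(S)/2, L3 = lambda^3*Volume(S),
    lambda = sqrt(4 log 2)/FWHM."""
    lam = math.sqrt(4 * math.log(2)) / fwhm
    L = [ec, lam * 2 * diam, lam**2 * area / 2, lam**3 * vol]
    return sum(l * r for l, r in zip(L, gauss_rho(t)))

def _solve_decreasing(f, target, lo=2.0, hi=10.0):
    """Bisection for f(t) = target, f decreasing on [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > target else (lo, mid)
    return 0.5 * (lo + hi)

def ec_threshold(alpha, **region):
    return _solve_decreasing(lambda t: expected_ec(t, **region), alpha)

def bonferroni_threshold(alpha, nvox):
    return _solve_decreasing(
        lambda t: nvox * 0.5 * math.erfc(t / math.sqrt(2)), alpha)

# Made-up search region: a ball of radius 60mm, 10mm FWHM, 2mm voxels.
R = 60.0
ball = dict(fwhm=10.0, ec=1, diam=2 * R,
            area=4 * math.pi * R**2, vol=4 / 3 * math.pi * R**3)
nvox = int(ball["vol"] / 2**3)
print(ec_threshold(0.05, **ball), bonferroni_threshold(0.05, nvox))
```

With these made-up numbers the random field theory threshold comes out below the Bonferroni one, mirroring the Z = 3.92 vs. Z = 4.87 comparison on the bubbles data.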