Detecting sparse connectivity: the 'bubbles' task in the fMRI scanner Keith Worsley, Math + Stats, Arnaud Charil, Montreal Neurological Institute, McGill Philippe Schyns, Fraser Smith, Psychology, Glasgow Jonathan Taylor, Stanford and Université de Montréal What is ‘bubbles’? Nature (2005) Subject is shown one of 40 faces chosen at random … Happy Sad Fearful Neutral … but face is only revealed through random ‘bubbles’ First trial: “Sad” expression Sad 75 random Smoothed by a bubble centres Gaussian ‘bubble’ What the subject sees 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 Subject is asked the expression: Response: “Neutral” Incorrect Your turn … Trial 2 Subject response: “Fearful” CORRECT Your turn … Trial 3 Subject response: “Happy” INCORRECT (Fearful) Your turn … Trial 4 Subject response: “Happy” CORRECT Your turn … Trial 5 Subject response: “Fearful” CORRECT Your turn … Trial 6 Subject response: “Sad” CORRECT Your turn … Trial 7 Subject response: “Happy” CORRECT Your turn … Trial 8 Subject response: “Neutral” CORRECT Your turn … Trial 9 Subject response: “Happy” CORRECT Your turn … Trial 3000 Subject response: “Happy” INCORRECT (Fearful) Bubbles analysis E.g. Fearful (3000/4=750 trials): 1 Trial 4 + 5 + 2 + 3 + + 6 + 7 + … + 750 1 = Sum 300 0.5 200 0 100 250 200 150 100 50 Correct trials Proportion of correct bubbles =(sum correct bubbles) /(sum all bubbles) 0.75 Thresholded at proportion of 0.7 correct trials=0.68, 0.65 scaled to [0,1] 1 Use this as a 0.5 bubble mask 0 Results Mask average face Happy Sad Fearful Neutral But are these features real or just noise? Need statistics … Statistical analysis Correlate bubbles with response (correct = 1, incorrect = 0), separately for each expression Equivalent to 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful: Z~N(0,1) Trial 1 2 3 4 5 6 7 … 750 1 0.5 0 1 1 Response 0 1 statistic 4 2 0 -2 0 1 1 … 1 0.75 Very similar to the proportion of correct bubbles: 0.7 0.65 Comparison Both depend on average correct bubbles, rest is ~ constant Z=(Average correct bubbles -average incorrect bubbles) / pooled sd 4 2 0 -2 Proportion correct bubbles = Average correct bubbles / (average all bubbles * 4) 0.75 0.7 0.65 Results Thresholded at Z=1.64 (P=0.05) Happy Average face Sad Fearful Neutral Z~N(0,1) statistic 4.58 4.09 3.6 3.11 2.62 2.13 1.64 Multiple comparisons correction? Need random field theory … Euler Characteristic = #blobs - #holes Excursion set {Z > threshold} for neutral face EC = 0 30 Euler Characteristic 20 0 -7 -11 13 14 9 1 0 Heuristic: At high thresholds t, the holes disappear, EC ~ 1 or 0, E(EC) ~ P(max Z > t). Observed Expected 10 0 -10 -20 -4 -3 -2 -1 0 Threshold 1 2 • Exact expression for E(EC) for all thresholds, • E(EC) ~ P(max Z > t) 3 4accurate. is extremely The details If ¡Z (s)¢ » N(0; 1) is an isot ropic Gaussian random ¯eld, s 2 < 2 , wit h ¸ I 2£ 2 = V @Z , µ ¶ @s P max Z (s) ¸ t ¼ E(E C(S \ f s : Z (s) ¸ tg)) s2 S Z 1 1 = E C(S) e¡ z 2 =2 dz (2¼) 1=2 t Intrinsic volumes or Minkowski functionals ¸ 1=2 + 1 Perimet er(S) e¡ t 2 =2 2 2¼ ¸ + Area(S) te¡ t 2 =2 : (2¼) 3=2 If Z (s) is whit e noise convolved wit h an isot ropic Gaussian ¯lt er of Full Widt h at Half Maximum FWHM t hen ¸ = 4 log 2 . FW HM 2 The brain mapping version If Z (s) is whit e noise smoot hed wit h an isot ropic Gaussian ¯lt er of Full Widt h at Half Maximum FWHM µ ¶ P max Z (s) ¸ t ¼ E(E C(S \ f s : Z (s) ¸ tg)) s2 S Z 1 1 = E C(S) e¡ z 2 =2 dz EC0(S) (2¼) 1=2 Resels0(S) t Resels1(S) Resels2(S) Perimet er(S) (4 log 2) 1=2 e¡ FWHM 2¼ Area(S) 4 log 2 + te¡ t 2 =2 : FWHM 2 (2¼) 3=2 + Resels (Resolution elements) 1 2 t 2 =2 EC1(S) EC2(S) EC densities Math version If T (s) = f (Z 1 (s); : : : ; Z n (s)) is a funct ion³ of i.i.d. Gaussian random ¯elds ´ Z i (s) » Z (s) » N(0; 1), s 2 < D , wit h V @Z ( s) = ¤ D £ D (s), t hen replace @s resels by Lipschitz-K illing curvature L d (S; ¤ ): µ ¶ P max T (s) ¸ t s2 S XD ¼ E(E C(S \ f s : T (s) ¸ tg)) = L d (S; ¤ )½d (t); d= 0 where ½d (t) is t he same EC density for t he isot ropic case wit h ¤ (s) = I D £ D (Taylor & Adler, 2003). Bot h Lipschit z-K illing curvat ure L d (S; ¤ ) and EC density ½d (t) are de¯ned implicit ly as coe± cient s of a power series expansion of t he volume of a t ube as a funct ion of it s radius. In t he case of Lipschit z-K illing curvat ure, t he t ube is about t he search region S in t he Riemannian met ric induced by ¤ (s); in t he case of t he EC densit ies, t he t ube is about t he reject ion region f ¡ 1 ([t; 1 )) and volume is replaced by Gaussian probability. Fort unat ely t here are simple ways of est imat ing Lipschit z-K illing curvat ure from sample dat a (Taylor & Worsley, 2007), and simple ways of calculat ing EC densit ies. The details … 2 Tube(S,r) r S 3 A B 6 Λ is big TubeΛ(S,r) S Λ is small r ν 2 U(s1) S S s1 Tube s2 s3 Tube U(s2) U(s3) Z2 R r Tube(R,r) Z1 N2(0,I) Tube(R,r) R t-r z t z1 RR Tube(R,r) r z2 z3 Summary Random field theory results For searching in D (=2) dimensions, P-value of max Z is (Adler, 1981; W, 1995): P(max Z > z) Resels (=Lipschitz-Killing curvature/c) is Image area / (bubble FWHM)2 = 146.2 (unitless) Euler characteristic density(×c) is ~ E( Euler characteristic of thresholded set ) = Resels × Euler characteristic density (+ boundary terms) (4 log(2))D/2 zD-1 exp(-z2/2) / (2π)(D+1)/2 See forthcoming book Adler, Taylor (2007) Results, corrected for search Thresholded at Z=3.92 (P=0.05) Happy Average face Sad Fearful Neutral Z~N(0,1) statistic 4.58 4.47 4.36 4.25 4.14 4.03 3.92 Bubbles task in fMRI scanner Correlate bubbles with BOLD at every voxel: Trial 1 2 3 4 5 6 7 … 3000 1 0.5 0 fMRI 10000 0 Calculate Z for each pair (bubble pixel, fMRI voxel) – a 5D “image” of Z statistics … Discussion: thresholding Thresholding in advance is vital, since we cannot store all the ~1 billion 5D Z values Resels=(image resels = 146.2) × (fMRI resels = 1057.2) for P=0.05, threshold is Z = 6.22 (approx) The threshold based on Gaussian RFT can be improved using new non-Gaussian RFT based on saddle-point approximations (Chamandy et al., 2006) Model the bubbles as a smoothed Poisson point process The improved thresholds are slightly lower, so more activation is detected Only keep 5D local maxima Z(pixel, voxel) > Z(pixel, 6 neighbours of voxel) > Z(4 neighbours of pixel, voxel) Discussion: modeling The random response is Y=1 (correct) or 0 (incorrect), or Y=fMRI The regressors are Xj=bubble mask at pixel j, j=1 … 240x380=91200 (!) Logistic regression or ordinary regression: logit(E(Y)) or E(Y) = b0+X1b1+…+X91200b91200 But there are only n=3000 observations (trials) … Instead, since regressors are independent, fit them one at a time: logit(E(Y)) or E(Y) = b0+Xjbj However the regressors (bubbles) are random with a simple known distribution, so turn the problem around and condition on Y: E(Xj) = c0+Ycj Equivalent to conditional logistic regression (Cox, 1962) which gives exact inference for b1 conditional on sufficient statistics for b0 Cox also suggested using saddle-point approximations to improve accuracy of inference … Interactions? logit(E(Y)) or E(Y)=b0+X1b1+…+X91200b91200+X1X2b1,2+ … MS lesions and cortical thickness Idea: MS lesions interrupt neuronal signals, causing thinning in down-stream cortex Data: n = 425 mild MS patients 5.5 Average cortical thickness (mm) 5 4.5 4 3.5 3 2.5 Correlation = -0.568, T = -14.20 (423 df) 2 1.5 0 10 20 30 40 50 Total lesion volume (cc) 60 70 80 Charil et al, NeuroImage (2007) MS lesions and cortical thickness at all pairs of points Dominated by total lesions and average cortical thickness, so remove these effects Cortical thickness CT, smoothed 20mm Lesion density LD, smoothed 10mm Find partial correlation(lesion density, cortical thickness) removing total lesion volume linear model: CT-av(CT) ~ 1 + TLV + LD, test for LD Repeat of all voxels in 3D, nodes in 2D Subtract average cortical thickness ~1 billion correlations, so thresholding essential! Look for high negative correlations … Thresholding? Cross correlation random field Correlation between 2 fields at 2 different locations, searched over all pairs of locations one in R (D dimensions), one in S (E dimensions) sample size n Cao & Worsley, Annals of Applied Probability (1999) MS lesion data: P=0.05, c=0.300, T=6.46 Cluster extent rather than peak height (Friston, 1994) fit a quadratic Y to the peak: L D (clust er) » c ® k Peak height Choose a lower level, e.g. t=3.11 (P=0.001) Find clusters i.e. connected components of excursion set Z D=1 Measure cluster LD extent extent by t Distribution: Distribution of maximum cluster extent: Bonferroni on N = #clusters ~ E(EC). s Cao, Advances in Applied Probability (1999) How do you choose the threshold t for defining clusters? If signal is focal i.e. ~FWHM of noise If signal is broad i.e. >>FWHM of noise Choose a high threshold i.e. peak height is better Choose a low threshold i.e. cluster extent is better Conclusion: cluster extent is better for detecting broad signals Alternative: smooth data with filter that matches signal (Matched Filter Theorem)… try range of filter widths … scale space search … correct using random field theory … a lot of work … Cluster extent is easier! Thresholding? Cross correlation random field Correlation between 2 fields at 2 different locations, searched over all pairs of locations one in R (D dimensions), one in S (E dimensions) µ ¶ XD XE P max Correlat ion > c ¼ L d (R) L e (S) ½d;e (c) R ;S ½d;e (c) = d= 0 e= 0 (d ¡ 1)!e!2n ¡ ¼d +2 e + 1 d¡ e¡ 2 X X (¡ 1) k cd+ e¡ 1¡ 2k (1 ¡ c2 ) n ¡ d ¡ e¡ 1 2 +k k i ;j ¡ ( n ¡ d + i )¡ ( n ¡ e + j ) 2 2 i !j !(k ¡ i ¡ j )!(n ¡ 1 ¡ d ¡ e + i + j + k)!(d ¡ 1 ¡ k ¡ i + j )!(e ¡ k ¡ j + i )! Cao & Worsley, Annals of Applied Probability (1999) MS lesion data: P=0.05, c=0.300, T=6.46 0.1 correlation 0 5 x 10 2.5 2 -0.1 1.5 -0.2 1 -0.3 threshold -0.4 -0.5 0 50 100 150 Different hemisphere 0.1 5 x 10 2.5 0 correlation Histogram Same hemisphere -0.1 2 -0.2 1.5 -0.3 1 0.5 -0.4 0 -0.5 0 threshold 50 100 150 0.5 0 ‘Conditional’ histogram: scaled to same max at each distance 0.1 1 -0.1 0.6 -0.2 0.4 -0.3 -0.4 -0.5 0 threshold 50 100 150 distance (mm) 1 0 0.8 correlation correlation 0 0.1 0.8 -0.1 0.6 -0.2 0.4 -0.3 0.2 -0.4 0 -0.5 0 threshold 50 100 150 distance (mm) 0.2 0 Science (2004) fMRI activation detected by correlation between subjects at the same voxel The average nonselective time course across all activated regions obtained during the first 10 min of the movie for all five subjects. Red line represents the across subject average time course. There is a striking degree of synchronization among different individuals watching the same movie. Voxel-by-voxel intersubject correlation between the source subject (ZO) and the target subject (SN). Correlation maps are shown on unfolded left and right hemispheres (LH and RH, respectively). Color indicates the significance level of the intersubject correlation in each voxel. Black dotted lines denote borders of retinotopic visual areas V1, V2, V3, VP, V3A, V4/V8, and estimated border of auditory cortex (A1).The face-, object-, and building-related borders (red, blue, and green rings, respectively) are also superimposed on the map. Note the substantial extent of intersubject correlations and the extension of the correlations beyond visual and auditory cortices. What are the subjects watching during high activation? Faces … … buildings … … hands Thresholding? Homologous correlation random field Correlation between 2 equally smooth fields at the same location, searched over all locations in R (in D dimensions) Cao & Worsley, Annals of Applied Probability (1999) P-values are larger than for the usual correlation field (correlation between a field and a scalar) E.g. resels=1000, df=100, threshold=5, usual P=0.051, homologous P=0.139