Borgafjall2

Winter Conference Borgafjäll
Detecting changes in brain
scale, shape and connectivity
via the geometry of random fields
Jin Cao, Lucent
Nick Chamandy, Google
Khalil Shafie, Northern Colorado
Jonathan Taylor, Stanford
Keith Worsley, McGill
EC heuristic
At high thresholds $t$, the excursion set $X_t = \{s : T(s) \ge t\}$ is either one connected component (EC = 1) or empty (EC = 0), so
\[ P\Big(\max_{s \in S} T(s) \ge t\Big) \approx E(\mathrm{EC}(S \cap X_t)). \]
Theorem (1981, 1995)
If $T(s)$ is a smooth isotropic random field then
\[ E(\mathrm{EC}(S \cap X_t)) = \sum_{d=0}^{D} \mu_d(S)\,\rho_d(t), \]
where $\mu_d(S)$ is the intrinsic volume of $S$ and $\rho_d(t)$ is the EC density.
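For $S$ with smooth boundary $\partial S$ that has curvature matrix $C(s)$,
\[ \mu_d(S) = \frac{\Gamma((D-d)/2)}{2\pi^{(D-d)/2}} \int_{\partial S} \mathrm{detr}_{D-1-d}\{C(s)\}\,ds, \]
and $\mu_D(S) = |S|$. For $D = 3$,
\[ \mu_0(S) = \mathrm{EC}(S), \quad \mu_1(S) = 2\,\mathrm{Diameter}(S), \quad \mu_2(S) = \tfrac{1}{2}\,\mathrm{Surface\ area}(S), \quad \mu_3(S) = \mathrm{Volume}(S). \]
Morse theory:
\[ \mathrm{EC}(S \cap X_t) = \sum_{s} 1_{\{T(s) \ge t\}}\, 1_{\{\partial T(s)/\partial s = 0\}} \times \mathrm{sign}\Big(-\frac{\partial^2 T}{\partial s\,\partial s'}\Big) + \text{boundary}, \]
\[ \rho_D(t) = E\Big( 1_{\{T \ge t\}} \det\Big(-\frac{\partial^2 T}{\partial s\,\partial s'}\Big) \,\Big|\, \frac{\partial T}{\partial s} = 0 \Big)\, P\Big(\frac{\partial T}{\partial s} = 0\Big). \]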
For a Gaussian random field, $\lambda = \mathrm{Sd}(\partial Z/\partial s)$,
\[ \rho_d(t) = \Big(-\frac{\lambda}{\sqrt{2\pi}}\,\frac{\partial}{\partial t}\Big)^{d} P(Z \ge t). \]
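The Gaussian EC density can be evaluated in closed form, because repeated differentiation of $P(Z \ge t)$ produces probabilists' Hermite polynomials: for $d \ge 1$, $\rho_d(t) = (\lambda/\sqrt{2\pi})^d\, He_{d-1}(t)\,\phi(t)$. The following is a minimal sketch of that identity; the function names `hermite_prob` and `gaussian_ec_density` are illustrative and not from the talk.

```python
import math

def hermite_prob(n, t):
    """Probabilists' Hermite polynomial He_n(t) via He_{k+1} = t He_k - k He_{k-1}."""
    h_prev, h = 1.0, t              # He_0 and He_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, t * h - k * h_prev
    return h

def gaussian_ec_density(d, t, lam=1.0):
    """rho_d(t) = (-(lam / sqrt(2 pi)) d/dt)^d P(Z >= t) for a unit-variance Gaussian field."""
    if d == 0:
        return 0.5 * math.erfc(t / math.sqrt(2.0))          # P(Z >= t)
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)  # standard normal density
    return (lam / math.sqrt(2.0 * math.pi)) ** d * hermite_prob(d - 1, t) * phi
```

For $d = 1, 2, 3$ this reduces to $e^{-t^2/2}/(2\pi)$, $t\,e^{-t^2/2}/(2\pi)^{3/2}$ and $(t^2-1)\,e^{-t^2/2}/(2\pi)^2$ when $\lambda = 1$.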
Beautiful symmetry:
\[ E(\mathrm{EC}(S \cap X_t)) = \sum_{d=0}^{D} \mathcal{L}_d(S)\,\rho_d(R_t), \qquad \lambda = \mathrm{Sd}\Big(\frac{\partial Z}{\partial s}\Big). \]
Lipschitz-Killing curvature $\mathcal{L}_d(S)$: Steiner-Weyl Tube Formula (1930)
EC density $\rho_d(R_t)$: Taylor Kinematic Formula (2003)
• Put a tube of radius r about the search region λS and rejection region Rt:
[Figure: Tube(λS, r) around the search region λS and Tube(Rt, r) around the rejection region Rt, plotted on axes Z1 ~ N(0,1) and Z2 ~ N(0,1)]
• Find volume or probability, expand as a power series in r, pull off coefficients:
\[ |\mathrm{Tube}(\lambda S, r)| = \sum_{d=0}^{D} \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, \mathcal{L}_{D-d}(S)\, r^d, \]
\[ P(\mathrm{Tube}(R_t, r)) = \sum_{d=0}^{\infty} \frac{(2\pi)^{d/2}}{d!}\, \rho_d(R_t)\, r^d. \]
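As a sanity check of the probability expansion, take the simplest rejection region, $R_t = [t, \infty)$ for a single Gaussian with $\lambda = 1$. The tube is then $[t - r, \infty)$, so the series should reproduce $P(Z \ge t - r)$. The sketch below assumes the Gaussian EC densities $\rho_d(R_t) = (2\pi)^{-d/2} He_{d-1}(t)\phi(t)$ for $d \ge 1$; the function name `tube_probability_series` is illustrative.

```python
import math

def tube_probability_series(t, r, terms=40):
    """Sum_{d>=0} (2 pi)^{d/2} / d! * rho_d(R_t) * r^d for R_t = [t, inf), lambda = 1."""
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    total = 0.5 * math.erfc(t / math.sqrt(2.0))    # d = 0 term: rho_0 = P(Z >= t)
    he_prev, he = 0.0, 1.0                         # He_{-1} := 0 and He_0 = 1
    for d in range(1, terms):
        # (2 pi)^{d/2} rho_d(R_t) equals He_{d-1}(t) phi(t)
        total += he * phi * r ** d / math.factorial(d)
        # advance from He_{d-1} to He_d
        he_prev, he = he, t * he - (d - 1) * he_prev
    return total

def tube_probability_exact(t, r):
    """P(Z >= t - r): the probability content of the tube around [t, inf)."""
    return 0.5 * math.erfc((t - r) / math.sqrt(2.0))
```

In this one-dimensional case the expansion is just the Taylor series of $P(Z \ge t - r)$ in $r$, which is why the coefficients are derivatives of the tail probability.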
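EC density of the T-statistic field, $\nu$ df
- After lots of messy algebra (Morse)
- Two lines (Taylor’s Gaussian Tube Formula)
\[ \rho_0(t) = \int_t^{\infty} \frac{\Gamma\big(\frac{\nu+1}{2}\big)}{(\nu\pi)^{1/2}\,\Gamma\big(\frac{\nu}{2}\big)} \Big(1 + \frac{u^2}{\nu}\Big)^{-(\nu+1)/2} du \]
\[ \rho_1(t) = (2\pi)^{-1} \Big(1 + \frac{t^2}{\nu}\Big)^{-(\nu-1)/2} \]
\[ \rho_2(t) = (2\pi)^{-3/2}\, \frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\big(\frac{\nu}{2}\big)^{1/2}\,\Gamma\big(\frac{\nu}{2}\big)}\; t\, \Big(1 + \frac{t^2}{\nu}\Big)^{-(\nu-1)/2} \]
\[ \rho_3(t) = (2\pi)^{-2} \Big(\frac{\nu-1}{\nu}\, t^2 - 1\Big) \Big(1 + \frac{t^2}{\nu}\Big)^{-(\nu-1)/2} \]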
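A short sketch of the T-field EC densities for $d = 1, 2, 3$ (unit $\lambda$), useful for checking that they approach the Gaussian densities as $\nu \to \infty$. The function name `t_field_ec_density` is illustrative, and $\rho_0$ (the upper tail of the t distribution) is left out to keep the sketch free of numerical integration.

```python
import math

def t_field_ec_density(d, t, nu):
    """EC density rho_d(t), d = 1, 2, 3, of a T-statistic field with nu df (lambda = 1)."""
    base = (1.0 + t * t / nu) ** (-(nu - 1.0) / 2.0)
    if d == 1:
        return base / (2.0 * math.pi)
    if d == 2:
        # Gamma((nu+1)/2) / ((nu/2)^{1/2} Gamma(nu/2)), using lgamma to avoid overflow
        ratio = math.exp(math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0))
        ratio /= math.sqrt(nu / 2.0)
        return (2.0 * math.pi) ** -1.5 * ratio * t * base
    if d == 3:
        return (2.0 * math.pi) ** -2 * ((nu - 1.0) / nu * t * t - 1.0) * base
    raise ValueError("only d = 1, 2, 3 are implemented in this sketch")
```

With very large `nu` the three values should agree closely with $e^{-t^2/2}/(2\pi)$, $t\,e^{-t^2/2}/(2\pi)^{3/2}$ and $(t^2-1)\,e^{-t^2/2}/(2\pi)^2$.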
Multivariate linear models for random field data
$Y(s)_{n \times q}$ is the observations × variables data matrix at point $s \in S \subset \mathbb{R}^D$.
At every point $s$ we have a multivariate linear model:
\[ Y(s)_{n \times q} = X_{n \times p}\, B(s)_{p \times q} + E(s)_{n \times q}, \qquad E(s)_{n \times q} \sim N(0, I \otimes \Sigma). \]
We detect the sparse points $s$ where $B(s) \ne 0$ by testing $H_0 : B(s) = 0$ at every point $s$.
Test statistics $T(s)$:

                   # regressors
# variables   p = 1               p > 1
q = 1         T                   F
q > 1         Hotelling’s T^2     Wilks’ Λ, Pillai’s trace, Roy’s max root, etc.

Need null distribution: $P(\max_{s \in S} T(s) \ge t)$
Deformation Based Morphometry (DBM)
D’Arcy Thompson (1917) On Growth and Form
[Figure: deformation vector Y(s) at each point s = (s1, s2)]
n1 = 17 non-missile brain trauma patients, 3-14 days in coma,
n2 = 19 age and gender matched controls
Y(s) = vector (q=3) deformations needed to warp each MRI to an atlas standard
Locate damage: find regions where deformations are different, i.e. shape change
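Hotelling’s $T^2$ random field, $\nu$ df
- After lots of messy algebra (Morse)
\[ \rho_0(t) = \int_t^{\infty} \frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\Gamma\big(\frac{q}{2}\big)\,\Gamma\big(\frac{\nu-q+1}{2}\big)}\, (1+u)^{-\frac{\nu+1}{2}}\, u^{\frac{q-2}{2}}\, du, \]
\[ \rho_1(t) = \frac{\pi^{-1/2}\,\Gamma\big(\frac{\nu+1}{2}\big)}{\Gamma\big(\frac{q}{2}\big)\,\Gamma\big(\frac{\nu-q+2}{2}\big)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-1}{2}}, \]
\[ \rho_2(t) = \frac{\pi^{-1}\,\Gamma\big(\frac{\nu+1}{2}\big)}{\Gamma\big(\frac{q}{2}\big)\,\Gamma\big(\frac{\nu-q+1}{2}\big)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-2}{2}} \Big( t - \frac{q-1}{\nu-q+1} \Big), \]
\[ \rho_3(t) = \frac{\pi^{-3/2}\,\Gamma\big(\frac{\nu+1}{2}\big)}{\Gamma\big(\frac{q}{2}\big)\,\Gamma\big(\frac{\nu-q}{2}\big)}\, (1+t)^{-\frac{\nu-1}{2}}\, t^{\frac{q-3}{2}} \Big( t^2 - \frac{2q-1}{\nu-q}\, t + \frac{(q-1)(q-2)}{(\nu-q+2)(\nu-q)} \Big). \]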
Last case

                   # regressors
# variables   p = 1                 p > 1
q = 1         T ✔                   F ✔
q > 1         Hotelling’s T^2 ✔     Wilks’ Λ, Pillai’s trace, Roy’s max root, etc. ?

We shall now find a P-value approximation (but not quite the EC density) for the maximum of Roy’s maximum root,
\[ T(s) = \text{maximum eigenvalue of } \; Y(s)'X(X'X)^{-1}X'Y(s)\,\big(Y(s)'(I - X(X'X)^{-1}X')Y(s)\big)^{-1}. \]
The above messy algebra is just too complicated. Instead . . .
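Roy’s union-intersection principle
Make it into a univariate linear model by multiplying by a vector $u_{q \times 1}$:
\[ (Y(s)u)_{n \times 1} = X_{n \times p}\,(B(s)u)_{p \times 1} + (E(s)u)_{n \times 1}, \qquad H_{0u} : (B(s)u)_{p \times 1} = 0. \]
Let $F(s, u)$ be the usual F-statistic for testing $H_{0u}$. Let $O_q \subset \mathbb{R}^q$ be the unit q-sphere. Roy’s maximum root is
\[ T(s) = \max_{u \in O_q} F(s, u). \]
Now $F(s, u)$ is an F-field in search region $S \times O_q$ and we already know the EC density of the F-field, $\rho_d(F \ge t)$, so
\[ P\Big(\max_{s \in S} T(s) \ge t\Big) = P\Big(\max_{s \in S,\, u \in O_q} F(s, u) \ge t\Big) \approx \frac{1}{2} \sum_{d=0}^{D+q} \mu_d(S \times O_q)\,\rho_d(F \ge t) = \frac{1}{2} \sum_{d=0}^{D} \mu_d(S) \sum_{k=0}^{q} \mu_k(O_q)\,\rho_{d+k}(F \ge t). \]
Why ½? Because $F(s, u) = F(s, -u)$, so every excursion above $t$ is counted twice on the sphere.
The inner sum $\sum_{k=0}^{q} \mu_k(O_q)\,\rho_{d+k}(F \ge t)$ is almost the EC density of the Roy’s maximum root field.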
Cross correlation random field
Let $X(s)$, $s \in S \subset \mathbb{R}^M$, and $Y(t)$, $t \in T \subset \mathbb{R}^N$, be $n \times 1$ vectors of Gaussian random fields. Define the cross correlation random field as
\[ C(s, t) = \frac{X(s)'Y(t)}{\big(X(s)'X(s)\; Y(t)'Y(t)\big)^{1/2}}. \]
\[ P\Big(\max_{s \in S,\, t \in T} C(s, t) \ge c\Big) \approx E(\mathrm{EC}\{s \in S, t \in T : C(s, t) \ge c\}) = \sum_{i=0}^{\dim(S)} \sum_{j=0}^{\dim(T)} \mathcal{L}_i(S)\,\mathcal{L}_j(T)\,\rho_{ij}(C \ge c), \]
\[ \rho_{ij}(C \ge c) = \frac{2^{n-2-h}\,(i-1)!\,j!}{\pi^{h/2+1}} \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} (-1)^k c^{h-1-2k} (1 - c^2)^{(n-1-h)/2+k} \sum_{l=0}^{k} \sum_{m=0}^{k} \frac{\Gamma\big(\frac{n-i}{2} + l\big)\,\Gamma\big(\frac{n-j}{2} + m\big)}{l!\,m!\,(k-l-m)!\,(n-1-h+l+m+k)!\,(i-1-k-l+m)!\,(j-k-m+l)!}. \]
Maximum canonical cross correlation random field
Let $X(s)_{n \times p}$, $s \in S \subset \mathbb{R}^M$, and $Y(t)_{n \times q}$, $t \in T \subset \mathbb{R}^N$, be matrices of Gaussian random fields. Define the maximum canonical cross correlation random field as
\[ C(s, t) = \max_{u, v} \frac{u'X(s)'Y(t)v}{\big(u'X(s)'X(s)u\; v'Y(t)'Y(t)v\big)^{1/2}}, \]
the maximum of the canonical correlations between $X$ and $Y$, defined as the singular values of $(X'X)^{-1/2} X'Y (Y'Y)^{-1/2}$.
\[ P\Big(\max_{s \in S,\, t \in T} C(s, t) \ge c\Big) \approx \frac{1}{2} \sum_{i=0}^{M} \mathcal{L}_i(S) \sum_{j=0}^{N} \mathcal{L}_j(T) \sum_{k=0}^{p} \mathcal{L}_k(O_p) \sum_{l=0}^{q} \mathcal{L}_l(O_q)\, \rho_{i+k,\,j+l}(C \ge c), \]
where
\[ \mathcal{L}_k(O_p) = \frac{2^{k+1}\, \pi^{k/2}\, \Gamma\big(\frac{p+1}{2}\big)}{k!\, \big(\frac{p-1-k}{2}\big)!} \]
if $p - 1 - k$ is even, and zero otherwise, $k = 0, \ldots, p - 1$.
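The Lipschitz-Killing curvatures of the unit sphere given above are easy to check against familiar geometry: for the circle in the plane ($p = 2$) the only non-zero term is the circumference $2\pi$, and for the sphere in 3D ($p = 3$) they are the Euler characteristic 2 and the surface area $4\pi$. A minimal sketch, with the illustrative function name `sphere_lkc`:

```python
import math

def sphere_lkc(k, p):
    """L_k of the unit sphere in R^p: 2^{k+1} pi^{k/2} Gamma((p+1)/2) / (k! ((p-1-k)/2)!)
    when p-1-k is even, and zero otherwise (k = 0, ..., p-1)."""
    if not 0 <= k <= p - 1 or (p - 1 - k) % 2:
        return 0.0
    half = (p - 1 - k) // 2
    numerator = 2 ** (k + 1) * math.pi ** (k / 2) * math.gamma((p + 1) / 2)
    return numerator / (math.factorial(k) * math.factorial(half))
```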
What is ‘bubbles’?
Nature (2005)
Subject is shown one of 40 faces chosen at random …
[Figure: example faces with expressions Happy, Sad, Fearful, Neutral]
… but face is only revealed through random ‘bubbles’

First trial: “Sad” expression
[Figure: the Sad face; 75 random bubble centres; smoothed by a Gaussian ‘bubble’ (colour scale 0 to 1); what the subject sees]
Subject is asked the expression:
Response: “Neutral” (incorrect)
Your turn …
Trial 2: subject response “Fearful” (correct)
Trial 3: subject response “Happy” (incorrect; the face was Fearful)
Trial 4: subject response “Happy” (correct)
Trial 5: subject response “Fearful” (correct)
Trial 6: subject response “Sad” (correct)
Trial 7: subject response “Happy” (correct)
Trial 8: subject response “Neutral” (correct)
Trial 9: subject response “Happy” (correct)
…
Trial 3000: subject response “Happy” (incorrect; the face was Fearful)
Bubbles analysis
E.g. Fearful (3000/4=750 trials):
Trial 1 + 2 + 3 + 4 + 5 + 6 + 7 + … + 750 = Sum
[Figure: bubble masks for each trial, their sum over all trials, and their sum over the correct trials only]
Proportion of correct bubbles = (sum correct bubbles) / (sum all bubbles)
Thresholded at proportion of correct trials = 0.68, scaled to [0,1]
Use this as a bubble mask
Results
Mask average face
[Figure: masked average face for each expression: Happy, Sad, Fearful, Neutral]
But are these features real or just noise?
Need statistics …
Statistical analysis
Correlate bubbles with response (correct = 1, incorrect = 0), separately for each expression
Equivalent to 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful:
[Figure: bubble masks for trials 1, 2, 3, …, 750, the response (0 or 1) on each trial, and the resulting Z ~ N(0,1) statistic image]
Very similar to the proportion of correct bubbles.
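The equivalence between correlating with a 0/1 response and a two-sample statistic can be checked numerically at a single pixel. The sketch below uses synthetic data (not the study's data) and the exact finite-sample form of the identity: the T statistic for a correlation, $r\sqrt{n-2}/\sqrt{1-r^2}$, equals the pooled two-sample t statistic, which is close to the Z statistic when there are 750 trials. The function names and the simulated effect size are assumptions for illustration.

```python
import math
import random

def correlation_t(x, y):
    """T statistic for the correlation between x and y: r sqrt(n-2) / sqrt(1 - r^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic comparing x where y == 1 against y == 0."""
    g1 = [a for a, b in zip(x, y) if b == 1]
    g0 = [a for a, b in zip(x, y) if b == 0]
    n1, n0 = len(g1), len(g0)
    m1, m0 = sum(g1) / n1, sum(g0) / n0
    ss = sum((a - m1) ** 2 for a in g1) + sum((a - m0) ** 2 for a in g0)
    pooled_var = ss / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n0))

# Synthetic single-pixel example: 750 trials, about 68% correct (hypothetical effect size)
rng = random.Random(0)
response = [1 if rng.random() < 0.68 else 0 for _ in range(750)]
bubble = [rng.random() + 0.1 * y for y in response]
```

Calling `correlation_t(bubble, response)` and `two_sample_t(bubble, response)` gives the same value up to rounding error; applying either function pixel by pixel would give the statistic image.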
Results
Thresholded at Z=1.64 (P=0.05)
[Figure: Z ~ N(0,1) statistic on the average face for Happy, Sad, Fearful, Neutral; colour scale from 1.64 to 4.58]
Multiple comparisons correction?
Need random field theory …
Euler Characteristic Heuristic
Euler characteristic (EC) = #blobs - #holes (in 2D)
Excursion set Xt = {s: Z(s) ≥ t}, e.g. for neutral face:
[Figure: excursion sets at a range of thresholds, each labelled with its EC; plot of observed and expected EC(Xt) against threshold t from -4 to 4]
Heuristic: At high thresholds t, the holes disappear, EC ~ 1 or 0, E(EC) ~ P(max Z ≥ t).
• Exact expression for E(EC) for all thresholds,
• E(EC) ~ P(max Z ≥ t) is extremely accurate.
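The result
If $Z(s) \sim N(0, 1)$ is an isotropic Gaussian random field, $s \in \mathbb{R}^2$, with $\lambda^2 I_{2 \times 2} = V\big(\frac{\partial Z}{\partial s}\big)$, then
\[ P\Big(\max_{s \in S} Z(s) \ge t\Big) \approx E(\mathrm{EC}(S \cap \{s : Z(s) \ge t\})) \]
\[ = \underbrace{\mathrm{EC}(S)}_{\mathcal{L}_0(S)} \times \underbrace{\int_t^{\infty} \frac{1}{(2\pi)^{1/2}}\, e^{-z^2/2}\, dz}_{\rho_0(Z \ge t)} \;+\; \underbrace{\lambda\, \tfrac{1}{2}\,\mathrm{Perimeter}(S)}_{\mathcal{L}_1(S)} \times \underbrace{\frac{1}{2\pi}\, e^{-t^2/2}}_{\rho_1(Z \ge t)} \;+\; \underbrace{\lambda^2\, \mathrm{Area}(S)}_{\mathcal{L}_2(S)} \times \underbrace{\frac{1}{(2\pi)^{3/2}}\, t\, e^{-t^2/2}}_{\rho_2(Z \ge t)}. \]
The $\mathcal{L}_d(S)$ are the Lipschitz-Killing curvatures of S (= Resels(S) × c); the $\rho_d(Z \ge t)$ are the EC densities of Z above t.
If Z(s) is white noise convolved with an isotropic Gaussian filter of Full Width at Half Maximum FWHM then
\[ \lambda = \frac{\sqrt{4 \log 2}}{\mathrm{FWHM}}. \]
[Figure: Z(s) = white noise * filter of width FWHM]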
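The three-term formula above translates directly into a P-value function, and a threshold for a given P-value can be found by bisection. The sketch below is only an illustration: the function names are invented, and the example search region (a 240 × 380 pixel rectangle with a filter FWHM of 20 pixels) uses a hypothetical FWHM, since the talk does not state the smoothness of the bubbles data.

```python
import math

def rft_p_value_2d(t, ec, perimeter, area, fwhm):
    """Approximate P(max Z >= t) for a 2D isotropic Gaussian field over a region S."""
    lam = math.sqrt(4.0 * math.log(2.0)) / fwhm
    rho0 = 0.5 * math.erfc(t / math.sqrt(2.0))                      # P(Z >= t)
    rho1 = math.exp(-0.5 * t * t) / (2.0 * math.pi)
    rho2 = t * math.exp(-0.5 * t * t) / (2.0 * math.pi) ** 1.5
    return ec * rho0 + lam * 0.5 * perimeter * rho1 + lam ** 2 * area * rho2

def rft_threshold_2d(p, ec, perimeter, area, fwhm, lo=1.0, hi=10.0):
    """Bisection for the threshold t whose approximate P-value equals p.
    Assumes the P-value exceeds p at lo and is below p at hi; every term
    of the approximation is decreasing in t for t >= 1."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if rft_p_value_2d(mid, ec, perimeter, area, fwhm) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: 240 x 380 pixel rectangle, FWHM = 20 pixels (assumed value)
threshold = rft_threshold_2d(0.05, ec=1, perimeter=2 * (240 + 380), area=240 * 380, fwhm=20.0)
```

Smoother fields (larger FWHM) have fewer resels and therefore a lower threshold.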
Results, corrected for search
Random field theory threshold: Z=3.92 (P=0.05)
[Figure: Z ~ N(0,1) statistic on the average face for Happy, Sad, Fearful, Neutral; colour scale from 3.92 to 4.58]
Saddle-point approx (2007): Z=3.82, 3.80, 3.81, 3.80 (P=0.05)
Bonferroni: Z=4.87 (P=0.05), nothing
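The Bonferroni threshold can be reproduced by dividing the P-value by the number of pixels, taking the 240 × 380 = 91200 pixel image size quoted in the discussion of the bubbles model. A minimal sketch with the illustrative function name `bonferroni_threshold`:

```python
from statistics import NormalDist

def bonferroni_threshold(p, n_tests):
    """One-sided Z threshold such that n_tests * P(Z >= z) = p."""
    # Use the lower tail and negate, which is numerically safer than inv_cdf(1 - p/n)
    return -NormalDist().inv_cdf(p / n_tests)

z_bonferroni = bonferroni_threshold(0.05, 240 * 380)
```

Bonferroni ignores the spatial smoothness of the field, which is why its threshold is higher than the random field theory threshold.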
Scale space: smooth Z(s) with range of filter widths w
= continuous wavelet transform
adds an extra dimension to the random field: Z(s,w)
[Figure: Z(s,w) for scale space with no signal (top) and with one 15mm signal (bottom); horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 34]
15mm signal is best detected with a 15mm smoothing filter
Matched Filter Theorem (= Gauss-Markov Theorem): “to best detect signal + white noise, filter should match signal”
[Figure: Z(s,w) for 10mm and 23mm signals (top) and for two 10mm signals 20mm apart (bottom); horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 34]
But if the signals are too close together they are detected as a single signal half way between them
Scale space can even separate two signals at the same location!
[Figure: Z(s,w) for 8mm and 150mm signals at the same location; horizontal axis s (mm) from -60 to 60, vertical axis w = FWHM (mm, on log scale) from 6.8 to 170]
Scale space Lipschitz-Killing curvatures
Suppose $f$ is a kernel with $\int f^2 = 1$ and $B$ is a Brownian sheet. Then the scale space random field is
\[ Z(s, w) = w^{-D/2} \int f\Big(\frac{s - h}{w}\Big)\, dB(h) \sim N(0, 1). \]
Lipschitz-Killing curvatures:
\[ \mathcal{L}_d(S \times [w_1, w_2]) = \frac{w_1^{-d} + w_2^{-d}}{2}\, \mathcal{L}_d(S) + \sum_{j=0}^{\lfloor (D-d+1)/2 \rfloor} \frac{w_1^{-d-2j+1} - w_2^{-d-2j+1}}{d + 2j - 1} \times \frac{\kappa^{(1-2j)/2}\, (-1)^j\, (d + 2j - 1)!}{(1 - 2j)\,(4\pi)^j\, j!\, (d - 1)!}\, \mathcal{L}_{d+2j-1}(S), \]
where
\[ \kappa = \int \Big( s'\,\frac{\partial f(s)}{\partial s} + \frac{D}{2}\, f(s) \Big)^2 ds. \]
For a Gaussian kernel, $\kappa = D/2$. Then
\[ P\Big(\max_{s \in S} Z(s) \ge t\Big) \approx \sum_{d=0}^{D+1} \mathcal{L}_d(S \times [w_1, w_2])\, \rho_d(R_t). \]
Rotation space:
Try all rotated elliptical filters
Unsmoothed data
Threshold
Z=5.25 (P=0.05)
Maximum filter
Bubbles task in fMRI scanner
Correlate bubbles with BOLD at every voxel:
[Figure: bubble masks for trials 1, 2, 3, …, 3000 and the fMRI time series over the same trials]
Calculate Z for each pair (bubble pixel, fMRI voxel)
a 5D “image” of Z statistics …
Thresholding? Cross correlation random field
Correlation between 2 fields at 2 different locations, searched over all pairs of locations, one in S, one in T:
\[ P\Big(\max_{s \in S,\, t \in T} C(s, t) \ge c\Big) \approx E(\mathrm{EC}\{s \in S, t \in T : C(s, t) \ge c\}) = \sum_{i=0}^{\dim(S)} \sum_{j=0}^{\dim(T)} \mathcal{L}_i(S)\,\mathcal{L}_j(T)\,\rho_{ij}(C \ge c), \]
\[ \rho_{ij}(C \ge c) = \frac{2^{n-2-h}\,(i-1)!\,j!}{\pi^{h/2+1}} \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} (-1)^k c^{h-1-2k} (1 - c^2)^{(n-1-h)/2+k} \sum_{l=0}^{k} \sum_{m=0}^{k} \frac{\Gamma\big(\frac{n-i}{2} + l\big)\,\Gamma\big(\frac{n-j}{2} + m\big)}{l!\,m!\,(k-l-m)!\,(n-1-h+l+m+k)!\,(i-1-k-l+m)!\,(j-k-m+l)!}. \]

Bubbles data: P=0.05, n=3000, c=0.113, T=6.22
Discussion: modeling
• The random response is Y=1 (correct) or 0 (incorrect), or Y=fMRI
• The regressors are Xj=bubble mask at pixel j, j=1 … 240x380=91200 (!)
• Logistic regression or ordinary regression:
   logit(E(Y)) or E(Y) = b0+X1b1+…+X91200b91200
• But there are only n=3000 observations (trials) …
• Instead, since regressors are independent, fit them one at a time:
   logit(E(Y)) or E(Y) = b0+Xjbj
• However the regressors (bubbles) are random with a simple known distribution, so turn the problem around and condition on Y:
   E(Xj) = c0+Ycj
   Equivalent to conditional logistic regression (Cox, 1962) which gives exact inference for b1 conditional on sufficient statistics for b0
   Cox also suggested using saddle-point approximations to improve accuracy of inference …
• Interactions? logit(E(Y)) or E(Y)=b0+X1b1+…+X91200b91200+X1X2b1,2+ …
MS lesions and cortical thickness
Idea: MS lesions interrupt neuronal signals, causing thinning in down-stream cortex
Data: n = 425 mild MS patients
[Figure: scatter plot of average cortical thickness (mm) against total lesion volume (cc)]
Correlation = -0.568, T = -14.20 (423 df)
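The quoted T statistic follows from the correlation through the usual relation $T = r\sqrt{\mathrm{df}}/\sqrt{1 - r^2}$, with df = n - 2 = 423 for n = 425 patients. A minimal sketch (the function name `correlation_to_t` is illustrative):

```python
import math

def correlation_to_t(r, df):
    """Convert a correlation r with df degrees of freedom into a T statistic."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

# Values quoted for total lesion volume against average cortical thickness
t_ms = correlation_to_t(-0.568, 423)
```

The result agrees with the quoted T = -14.20 to within the rounding of the correlation.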
MS lesions and cortical thickness at all pairs of points
Dominated by total lesions and average cortical thickness, so remove these
effects as follows:
CT = cortical thickness, smoothed 20mm
ACT = average cortical thickness
LD = lesion density, smoothed 10mm
TLV = total lesion volume

Find partial correlation(LD, CT-ACT) removing TLV via linear model:
 CT-ACT ~ 1 + TLV + LD
 test for LD

Repeat for all voxels in 3D, nodes in 2D
~1 billion correlations, so thresholding essential!
Look for high negative correlations …
Threshold: P=0.05, c=0.300, T=6.48
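The partial correlation step for one pair of points (the model CT-ACT ~ 1 + TLV + LD, testing for LD) can be sketched by regressing both CT-ACT and LD on TLV and correlating the residuals, which gives the same T statistic as the test for LD in the full linear model. The data below are simulated purely for illustration, and the function names and the df = n - 3 convention (three fitted coefficients) are assumptions of this sketch.

```python
import math
import random

def residuals(y, x):
    """Residuals from the simple regression y ~ 1 + x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

def partial_correlation_t(y, ld, tlv):
    """Partial correlation of y and ld removing tlv, with its T statistic (df = n - 3)."""
    ry, rl = residuals(y, tlv), residuals(ld, tlv)
    r = sum(a * b for a, b in zip(ry, rl))
    r /= math.sqrt(sum(a * a for a in ry) * sum(b * b for b in rl))
    df = len(y) - 3
    return r, r * math.sqrt(df) / math.sqrt(1.0 - r * r)

# Simulated example for one (lesion voxel, cortical node) pair, n = 425 subjects
rng = random.Random(1)
tlv = [rng.uniform(0, 80) for _ in range(425)]
ld = [0.05 * v + rng.gauss(0, 1) for v in tlv]
ct_minus_act = [3.0 - 0.02 * v - 0.5 * d + rng.gauss(0, 0.1) for v, d in zip(tlv, ld)]
r_partial, t_partial = partial_correlation_t(ct_minus_act, ld, tlv)
```

Repeating this for every pair of points gives the field of correlations that is then thresholded.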



Cluster extent rather than peak height (Friston, 1994)
• Choose a lower level, e.g. t=3.11 (P=0.001)
• Find clusters i.e. connected components of excursion set
• Measure cluster extent by resels $\mathcal{L}_D(\text{cluster})$
Z
D=1
extent

L (cluster) » c
D

t
®
k
Distribution of maximum cluster extent:
 Bonferroni on N = #clusters ~ E(EC).
Peak
height
Distribution:
 fit a quadratic to the
peak:
Y
s
References

Adler, R.J. and Taylor, J.E. (2007). Random fields and geometry. Springer.

Adler, R.J., Taylor, J.E. and Worsley, K.J. (2008). Random fields, geometry,
and applications. In preparation.