chicago wshop

The Geometry of Random Fields in Astrophysics and Brain Mapping
Keith Worsley, McGill
Jonathan Taylor, Stanford and Université de Montréal
Robert Adler, Technion
Frédéric Gosselin, Université de Montréal
Philippe Schyns, Fraser Smith, Glasgow
Arnaud Charil, Montreal Neurological Institute
Astrophysics
Sloan Digital Sky Survey, data release 6, Aug. ‘07
[Figure: Euler Characteristic (EC) of the Sloan Digital Sky Survey data, FWHM=19.8335, plotted against Gaussian threshold (-5 to 5), observed and expected; regions of the curve are labelled "Meat ball" topology, "Bubble" topology and "Sponge" topology.]
Nature (2005)
Subject is shown one of 40
faces chosen at random …
Happy
Sad
Fearful
Neutral
… but face is only revealed
through random ‘bubbles’

First trial: “Sad” expression
[Figure panels: the “Sad” face; 75 random bubble centres; smoothed by a Gaussian ‘bubble’; what the subject sees. Colour scale 0 to 1.]


Subject is asked the expression:
Response:
“Neutral”
Incorrect
Your turn …

Trial 2
Subject response:
“Fearful”
CORRECT
Your turn …

Trial 3
Subject response:
“Happy”
INCORRECT
(Fearful)
Your turn …

Trial 4
Subject response:
“Happy”
CORRECT
Your turn …

Trial 5
Subject response:
“Fearful”
CORRECT
Your turn …

Trial 6
Subject response:
“Sad”
CORRECT
Your turn …

Trial 7
Subject response:
“Happy”
CORRECT
Your turn …

Trial 8
Subject response:
“Neutral”
CORRECT
Your turn …

Trial 9
Subject response:
“Happy”
CORRECT
Your turn …

Trial 3000
Subject response:
“Happy”
INCORRECT
(Fearful)
Bubbles analysis

E.g. Fearful (3000/4=750 trials):
Trial 1 + 2 + 3 + 4 + 5 + 6 + 7 + … + 750 = Sum
[Figure: the bubble mask of each trial, the sum over all trials, and the sum over correct trials.]
Proportion of correct bubbles = (sum correct bubbles) / (sum all bubbles)
Thresholded at proportion of correct trials = 0.68, scaled to [0,1]
Use this as a bubble mask
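The bubbles analysis above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' code: the image size, bubble width, number of trials and the simulated observer are all assumptions made up for the example, and the function names are hypothetical.

```python
import math
import random

def gaussian_bubble_mask(centres, shape, sigma):
    """Sum of Gaussian bumps at the given (row, col) centres, clipped to [0, 1]."""
    rows, cols = shape
    mask = [[0.0] * cols for _ in range(rows)]
    for r0, c0 in centres:
        for r in range(rows):
            for c in range(cols):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                mask[r][c] += math.exp(-d2 / (2.0 * sigma ** 2))
    return [[min(v, 1.0) for v in row] for row in mask]

def proportion_correct(masks, correct):
    """Pixelwise (sum of masks on correct trials) / (sum of masks on all trials)."""
    rows, cols = len(masks[0]), len(masks[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = sum(m[r][c] for m in masks)
            good = sum(m[r][c] for m, ok in zip(masks, correct) if ok)
            out[r][c] = good / total if total > 0 else float("nan")
    return out

def threshold_and_rescale(prop, level):
    """Zero out pixels below `level`, then rescale the rest linearly to [0, 1]."""
    top = max(v for row in prop for v in row if not math.isnan(v))
    span = top - level
    return [[0.0 if (math.isnan(v) or v < level or span <= 0) else (v - level) / span
             for v in row] for row in prop]

# Synthetic experiment (all numbers are assumptions for illustration only).
rng = random.Random(0)
shape, sigma = (8, 8), 1.0
masks, correct = [], []
for _ in range(200):
    centres = [(rng.randrange(8), rng.randrange(8)) for _ in range(5)]
    m = gaussian_bubble_mask(centres, shape, sigma)
    masks.append(m)
    # Hypothetical observer: more likely to be correct when pixel (2, 2) is revealed.
    correct.append(rng.random() < 0.25 + 0.7 * m[2][2])

prop = proportion_correct(masks, correct)
overall = sum(correct) / len(correct)      # proportion of correct trials
bubble_mask = threshold_and_rescale(prop, overall)
```

Thresholding at the overall proportion of correct trials mirrors the slide's choice of 0.68; pixels that do not help the observer should hover around that level.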
Results

Mask average face: Happy, Sad, Fearful, Neutral
But are these features real or just noise?
→ Need statistics …
Statistical analysis
Correlate bubbles with response (correct = 1, incorrect =
0), separately for each expression
Equivalent to 2-sample Z-statistic for correct vs. incorrect
bubbles, e.g. Fearful:


[Figure: bubble masks for trials 1, 2, …, 750 with their 0/1 responses, and the resulting Z~N(0,1) statistic image.]
Very similar to the proportion of correct bubbles.
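The equivalence between correlating bubbles with the 0/1 response and a two-sample statistic can be checked at a single pixel. The sketch below uses the pooled-variance two-sample statistic and the usual t transform of a Pearson correlation; with 750 trials either one is close to N(0,1) under the null. The function names and the toy data are assumptions for illustration.

```python
import math

def two_sample_stat(values, correct):
    """Pooled-variance two-sample statistic: bubble values on correct vs. incorrect trials."""
    a = [v for v, ok in zip(values, correct) if ok]
    b = [v for v, ok in zip(values, correct) if not ok]
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    pooled_var = ss / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))

def correlation_stat(values, correct):
    """Same quantity via the correlation r with the 0/1 response: r*sqrt(n-2)/sqrt(1-r^2)."""
    n = len(values)
    y = [1.0 if ok else 0.0 for ok in correct]
    mx, my = sum(values) / n, sum(y) / n
    sxy = sum((x - mx) * (yy - my) for x, yy in zip(values, y))
    sxx = sum((x - mx) ** 2 for x in values)
    syy = sum((yy - my) ** 2 for yy in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Toy data at one pixel: bubble mask value and whether the trial was correct.
values = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.6, 0.4]
correct = [True, True, False, True, False, True, True, False]
z_two_sample = two_sample_stat(values, correct)
z_correlation = correlation_stat(values, correct)
```

Applying either function at every pixel gives the statistic image that is thresholded on the following slides.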
Results

Thresholded at Z=1.64 (P=0.05)
Average face: Happy, Sad, Fearful, Neutral
[Figure: Z~N(0,1) statistic, colour scale 1.64 to 4.58.]
Multiple comparisons correction?
→ Need random field theory …
Euler Characteristic Heuristic
Euler characteristic (EC) = #blobs - #holes (in 2D)
Excursion set Xt = {s: Z(s) ≥ t}, e.g. for neutral face:
[Figure: excursion sets at a sequence of thresholds; EC values shown: 0, 30, 20, 0, -7, -11, 13, 14, 9, 0.]
Heuristic:
At high thresholds t, the holes disappear, EC ~ 1 or 0, E(EC) ~ P(max Z ≥ t).
[Figure: observed and expected EC(Xt) against threshold t.]
• Exact expression for E(EC) for all thresholds,
• E(EC) ~ P(max Z ≥ t) is extremely accurate.
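To make "EC = #blobs - #holes" concrete, here is a minimal sketch that computes the EC of a pixelated excursion set as vertices minus edges plus faces, treating each pixel above threshold as a closed unit square. This discretisation is an assumption of the example (it joins diagonally touching pixels into one blob); the slides do not say how the EC was computed.

```python
def euler_characteristic(field, t):
    """EC of {pixels with value >= t}, each pixel taken as a closed unit square."""
    vertices, edges, faces = set(), set(), 0
    for r, row in enumerate(field):
        for c, v in enumerate(row):
            if v >= t:
                faces += 1
                vertices.update([(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)])
                edges.update([
                    ((r, c), (r, c + 1)),          # top edge
                    ((r + 1, c), (r + 1, c + 1)),  # bottom edge
                    ((r, c), (r + 1, c)),          # left edge
                    ((r, c + 1), (r + 1, c + 1)),  # right edge
                ])
    # Shared corners and edges are counted once because they live in sets.
    return len(vertices) - len(edges) + faces

ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]           # one blob with one hole
two_blobs = [[1, 0, 1]]      # two separate blobs
ec_ring = euler_characteristic(ring, 0.5)
ec_two = euler_characteristic(two_blobs, 0.5)
```

Sweeping `t` over a range of thresholds and plotting the result gives an observed EC curve of the kind compared with the expected EC.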
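The result
If $Z(s) \sim N(0,1)$ is an isotropic Gaussian random field, $s \in \mathbb{R}^2$, with $\lambda^2 I_{2\times 2} = V\!\left(\frac{\partial Z}{\partial s}\right)$, then
\[
\begin{aligned}
P\Big(\max_{s\in S} Z(s) \ge t\Big) &\approx E\big(EC(S \cap \{s : Z(s) \ge t\})\big) \\
&= EC(S) \times \int_t^\infty \frac{1}{(2\pi)^{1/2}}\, e^{-z^2/2}\,dz \\
&\quad + \lambda\,\tfrac{1}{2}\,\mathrm{Perimeter}(S) \times \frac{1}{2\pi}\, e^{-t^2/2} \\
&\quad + \lambda^2\,\mathrm{Area}(S) \times \frac{1}{(2\pi)^{3/2}}\, t\, e^{-t^2/2}.
\end{aligned}
\]
In each line the first factor is a Lipschitz-Killing curvature of S, $L_0(S)$, $L_1(S)$, $L_2(S)$, and the second factor is an EC density of Z above t, $\rho_0(Z \ge t)$, $\rho_1(Z \ge t)$, $\rho_2(Z \ge t)$.
If $Z(s)$ is white noise convolved with an isotropic Gaussian filter of Full Width at Half Maximum FWHM, then
\[ \lambda = \frac{\sqrt{4 \log 2}}{\mathrm{FWHM}}. \]
[Figure: Z(s) = white noise * filter of width FWHM.]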
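The 2D formula is simple enough to evaluate directly. The sketch below computes the expected EC for a search region with given EC, perimeter and area, and finds the threshold with a given P-value by bisection. The 256 × 256 pixel image and FWHM of 10 pixels are assumptions chosen for illustration; the slides do not state the values behind the thresholds they quote.

```python
import math

def expected_ec_2d(t, ec, perimeter, area, fwhm):
    """E(EC) of the excursion set above t for an isotropic Gaussian field in 2D."""
    lam = math.sqrt(4.0 * math.log(2.0)) / fwhm
    rho0 = 0.5 * math.erfc(t / math.sqrt(2.0))                 # P(Z >= t)
    rho1 = math.exp(-t * t / 2.0) / (2.0 * math.pi)
    rho2 = t * math.exp(-t * t / 2.0) / (2.0 * math.pi) ** 1.5
    return ec * rho0 + lam * 0.5 * perimeter * rho1 + lam ** 2 * area * rho2

def rft_threshold(p, ec, perimeter, area, fwhm, lo=1.0, hi=10.0):
    """Threshold t with expected_ec_2d(t) = p; E(EC) is decreasing in t for t >= 1."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_ec_2d(mid, ec, perimeter, area, fwhm) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical search region: a 256 x 256 pixel square, FWHM = 10 pixels.
side, fwhm = 256.0, 10.0
t05 = rft_threshold(0.05, ec=1.0, perimeter=4 * side, area=side * side, fwhm=fwhm)
```

Smoother data (larger FWHM) or a smaller search region gives a lower threshold, which is why this can be less conservative than Bonferroni on smooth images.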
Results, corrected for search

Random field theory threshold: Z=3.92 (P=0.05)
Average face: Happy, Sad, Fearful, Neutral
[Figure: Z~N(0,1) statistic, colour scale 3.92 to 4.58.]
Bonferroni threshold: Z=4.87 (P=0.05) – nothing exceeds it
Theory (1981,1995)
Let $T(s)$, $s \in S \subset \mathbb{R}^D$, be a smooth isotropic random field.
Let $X_t = \{s : T(s) \ge t\}$ be the excursion set.
Let $R_t = \{z : f(z) \ge t\}$ be the rejection region of $T$.
Then
\[ E(EC(S \cap X_t)) = \sum_{d=0}^{D} L_d(S)\,\rho_d(R_t). \]
Proof.
(Hadwiger, 1930s): Suppose $\phi(S)$, $S \subset \mathbb{R}^D$, is a set functional that is invariant under translations and rotations of $S$, and satisfies the additivity property
\[ \phi(A \cup B) = \phi(A) + \phi(B) - \phi(A \cap B). \]
Then $\phi(S)$ must be a linear combination of intrinsic volumes $L_d(S)$:
\[ \phi(S) = \sum_{d=0}^{D} L_d(S)\, c_d. \]
The choice
\[ \phi(S) = E(EC(S \cap X_t)) \]
is invariant under translations and rotations because the random field is isotropic, and is additive because the EC is additive:
\[ EC(A \cup B) = EC(A) + EC(B) - EC(A \cap B). \]
Hence
\[ E(EC(S \cap X_t)) = \sum_{d=0}^{D} L_d(S)\,\rho_d(R_t), \]
with the coefficients $c_d$ being the EC densities $\rho_d(R_t)$.
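Lipschitz-Killing curvature $L_d(S)$: Steiner-Weyl Tube Formula (1930)
• Put a tube of radius r about the search region λS, where $\lambda = \mathrm{Sd}\!\left(\frac{\partial Z}{\partial s}\right)$.
[Figure: Tube(λS,r) of radius r around λS.]
• Find volume, expand as a power series in r, pull off coefficients:
\[ |\mathrm{Tube}(\lambda S, r)| = \sum_{d=0}^{D} \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, L_{D-d}(S)\, r^d \]
EC density $\rho_d(R_t)$: Morse Theory method (1981, 1995)
EC has a point-set representation:
\[ EC(S \cap X_t) = \sum_{s} 1_{\{T(s) \ge t\}}\, 1_{\{\partial T(s)/\partial s = 0\}}\, \mathrm{sign}\!\left(-\frac{\partial^2 T}{\partial s\,\partial s'}\right) + \text{boundary} \]
\[ \rho_D(R_t) = \frac{1}{\lambda^D}\, E\!\left( 1_{\{T \ge t\}} \det\!\left(-\frac{\partial^2 T}{\partial s\,\partial s'}\right) \,\middle|\, \frac{\partial T}{\partial s} = 0 \right) P\!\left(\frac{\partial T}{\partial s} = 0\right) \]
For a Gaussian random field:
\[ \rho_d(Z \ge t) = \left(-\frac{1}{\sqrt{2\pi}}\,\frac{\partial}{\partial t}\right)^{d} P(Z \ge t) \]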
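Lipschitz-Killing curvature $L_d(S)$ of a triangle
[Figure: Tube(λS,r) of radius r around a triangle λS, where $\lambda = \mathrm{Sd}\!\left(\frac{\partial Z}{\partial s}\right)$.]
Steiner-Weyl Volume of Tubes Formula (1930)
\[
\begin{aligned}
\mathrm{Area}(\mathrm{Tube}(\lambda S, r)) &= \sum_{d=0}^{D} \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, L_{D-d}(S)\, r^d \\
&= L_2(S) + 2 L_1(S)\, r + \pi L_0(S)\, r^2 \\
&= \mathrm{Area}(\lambda S) + \mathrm{Perimeter}(\lambda S)\, r + EC(\lambda S)\, \pi r^2
\end{aligned}
\]
\[ L_0(S) = EC(\lambda S), \qquad L_1(S) = \tfrac{1}{2}\,\mathrm{Perimeter}(\lambda S), \qquad L_2(S) = \mathrm{Area}(\lambda S) \]
Lipschitz-Killing curvatures are just “intrinsic volumes” or “Minkowski functionals” in the (Riemannian) metric of the variance of the derivative of the process.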
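As a small worked instance of the tube formula, the sketch below computes $L_0$, $L_1$, $L_2$ for a convex polygon scaled by λ and assembles the tube area from them. It assumes a convex polygon (so EC = 1 and the expansion is exact for every r); the function names are hypothetical.

```python
import math

def lipschitz_killing_convex_polygon(vertices, lam=1.0):
    """(L0, L1, L2) = (EC, half perimeter, area) of the convex polygon lam * S."""
    pts = [(lam * x, lam * y) for x, y in vertices]
    n = len(pts)
    # Shoelace formula for twice the signed area.
    twice_area = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
                     for i in range(n))
    perimeter = sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))
    return 1.0, 0.5 * perimeter, 0.5 * abs(twice_area)

def tube_area(vertices, r, lam=1.0):
    """Area of all points within distance r of lam * S, via L2 + 2*L1*r + pi*L0*r^2."""
    l0, l1, l2 = lipschitz_killing_convex_polygon(vertices, lam)
    return l2 + 2.0 * l1 * r + math.pi * l0 * r * r

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]   # 3-4-5 right triangle
lkc = lipschitz_killing_convex_polygon(triangle)
area_r1 = tube_area(triangle, 1.0)
```

For the 3-4-5 triangle the area is 6 and the perimeter 12, so the tube of radius 1 has area 6 + 12 + π: the triangle itself, a strip along each edge, and three circular sectors that together make one full disc.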
Lipschitz-Killing curvature $L_d(S)$ of any set S
$\lambda = \mathrm{Sd}\!\left(\frac{\partial Z}{\partial s}\right)$
[Figure: the set S filled with small triangles; edge length × λ.]
Lipschitz-Killing curvature of triangles (• = vertex, — = edge, ▲ = triangle):
\[ L_0(\bullet) = 1, \qquad L_0(—) = 1, \qquad L_0(\blacktriangle) = 1 \]
\[ L_1(—) = \text{edge length}, \qquad L_1(\blacktriangle) = \tfrac{1}{2}\,\text{perimeter} \]
\[ L_2(\blacktriangle) = \text{area} \]
Lipschitz-Killing curvature of union of triangles:
\[ L_0(S) = \sum_{\bullet} L_0(\bullet) - \sum_{—} L_0(—) + \sum_{\blacktriangle} L_0(\blacktriangle) \]
\[ L_1(S) = \sum_{—} L_1(—) - \sum_{\blacktriangle} L_1(\blacktriangle) \]
\[ L_2(S) = \sum_{\blacktriangle} L_2(\blacktriangle) \]
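The inclusion-exclusion sums over vertices, edges and triangles translate directly into code. This is a minimal sketch for a planar triangulation with a constant λ; the mesh representation (a list of points plus triangles as index triples) is an assumption of the example.

```python
import math

def lkc_of_mesh(points, triangles, lam=1.0):
    """(L0, L1, L2) of a union of triangles, with every edge length multiplied by lam."""
    def length(u, v):
        return lam * math.dist(points[u], points[v])

    vertices, edges = set(), set()
    sum_half_perimeter = 0.0
    sum_area = 0.0
    for a, b, c in triangles:
        vertices.update((a, b, c))
        sides = []
        for u, v in ((a, b), (b, c), (a, c)):
            edges.add((min(u, v), max(u, v)))   # shared edges are stored once
            sides.append(length(u, v))
        s = 0.5 * sum(sides)                    # half perimeter of this triangle
        sum_half_perimeter += s
        # Heron's formula gives the area from the three (scaled) side lengths.
        sum_area += math.sqrt(max(s * (s - sides[0]) * (s - sides[1]) * (s - sides[2]), 0.0))
    sum_edge_length = sum(length(u, v) for u, v in edges)

    l0 = len(vertices) - len(edges) + len(triangles)
    l1 = sum_edge_length - sum_half_perimeter
    l2 = sum_area
    return l0, l1, l2

# Unit square split into two triangles along a diagonal.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
l0, l1, l2 = lkc_of_mesh(points, triangles)
```

For the unit square this returns EC 1, half perimeter 2 and area 1: the interior diagonal is counted once as an edge and twice (at half weight) in the triangle perimeters, so it cancels out of $L_1$.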
Non-isotropic data
We must restrict $T(s)$ to
\[ T(s) = f(Z(s)), \quad s \in S \subset \mathbb{R}^D, \]
where $Z(s) = (Z_1(s), \dots, Z_n(s))$ and the $Z_i(s)$ are independent and identically distributed non-isotropic Gaussian random fields, $Z(s) \sim N(0,1)$. Luckily this covers many of the usual test statistics such as $T$, $\chi^2$, $F$ for testing for contrasts in a linear model.
Heuristic: If we know the spatial correlation function of $Z(s)$, can we warp (deform) the space so that the data becomes isotropic?
Obviously not globally, but perhaps locally . . .
We may need to embed S in a higher dimensional space . . .
How many dimensions are needed? Nash Embedding Theorem says it is finite.
Better idea: Replace local Euclidean distance by the variogram:
\[ d(s_1, s_2) = V(Z(s_1) - Z(s_2)) \]
or Riemannian metric by $\lambda(s)^2 = V\!\left(\frac{\partial Z(s)}{\partial s}\right)$.
Non-isotropic data
$\lambda(s) = \mathrm{Sd}\!\left(\frac{\partial Z}{\partial s}\right)$
[Figure: a non-isotropic field Z~N(0,1) over (s1, s2), and the search region filled with small triangles; edge length × λ(s).]
The Lipschitz-Killing curvatures of the triangles and of their union are given by the same vertex, edge and triangle sums as in the isotropic case, with each edge length now multiplied by the local λ(s).
Estimating Lipschitz-Killing curvature Ld (S)
We need independent & identically distributed random fields, e.g. residuals from a linear model: Z1, Z2, …, Zn.
Replace coordinates of the triangles $\in \mathbb{R}^2$ by normalised residuals
\[ \frac{Z}{\|Z\|} \in \mathbb{R}^n, \qquad Z = (Z_1, \dots, Z_n). \]
[Figure: residual images Z1, Z2, …, Zn.]
The Lipschitz-Killing curvatures of the triangles and of their union are then computed from the edge lengths in these new coordinates, using the same vertex, edge and triangle sums.
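The estimation recipe can be sketched as follows: each mesh vertex gets the unit vector of its n residuals as coordinates, and the triangle sums are evaluated with edge lengths measured in that space. This is a bare illustration of the idea under the assumption of a planar triangulation with residuals stored per field and per vertex; it makes no claim about bias or about how the authors' software implements it.

```python
import math

def normalised_residuals(resid):
    """resid[i][v] is residual field i at vertex v; return one unit vector in R^n per vertex."""
    n, n_vertices = len(resid), len(resid[0])
    out = []
    for v in range(n_vertices):
        col = [resid[i][v] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in col))
        out.append([x / norm for x in col])
    return out

def estimate_lkc(resid, triangles):
    """(L0, L1, L2) of the triangulated region, edge lengths taken in residual space."""
    u = normalised_residuals(resid)
    vertices, edges = set(), set()
    sum_half_perimeter = 0.0
    sum_area = 0.0
    for a, b, c in triangles:
        vertices.update((a, b, c))
        sides = []
        for p, q in ((a, b), (b, c), (a, c)):
            edges.add((min(p, q), max(p, q)))
            sides.append(math.dist(u[p], u[q]))
        s = 0.5 * sum(sides)
        sum_half_perimeter += s
        # Heron's formula: area of the triangle spanned in R^n.
        sum_area += math.sqrt(max(s * (s - sides[0]) * (s - sides[1]) * (s - sides[2]), 0.0))
    sum_edge_length = sum(math.dist(u[p], u[q]) for p, q in edges)
    l0 = len(vertices) - len(edges) + len(triangles)
    return l0, sum_edge_length - sum_half_perimeter, sum_area

# Toy check: 4 residual fields on 4 vertices, chosen so the unit vectors are orthonormal.
resid = [[3.0, 0.0, 0.0, 0.0],
         [0.0, 3.0, 0.0, 0.0],
         [0.0, 0.0, 3.0, 0.0],
         [0.0, 0.0, 0.0, 3.0]]
triangles = [(0, 1, 2), (0, 2, 3)]
e0, e1, e2 = estimate_lkc(resid, triangles)
```

In the toy case every edge has length √2, so the two triangles are equilateral with total area √3 and $L_1$ comes out as 5√2 − 3√2 = 2√2. With real residuals no explicit estimate of λ(s) is needed, since the metric enters only through these edge lengths.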
Scale space
How much to smooth the data? Why not try all smooths (in a range), then choose the maximum. Suppose $f$ is a kernel with $\int f^2 = 1$ and $B$ is a Brownian sheet. Then the scale space random field is
\[ Z(s, w) = w^{-D/2} \int f\!\left(\frac{s - h}{w}\right) dB(h) \sim N(0,1). \]
Note: scaled to preserve variance, not mean.
Lipschitz-Killing curvatures:
\[
L_i(S \times [w_1, w_2]) = \frac{w_1^{-i} + w_2^{-i}}{2}\, L_i(S)
+ \sum_{j=0}^{\lfloor (D-i+1)/2 \rfloor} \frac{w_1^{-i-2j+1} - w_2^{-i-2j+1}}{i + 2j - 1}
\times \frac{\kappa^{(1-2j)/2}\, (-1)^j\, (i+2j-1)!}{(1-2j)\,(4\pi)^j\, j!\,(i-1)!}\, L_{i+2j-1}(S),
\]
where
\[ \kappa = \int \left( s' \frac{\partial f(s)}{\partial s} + \frac{D}{2} f(s) \right)^{2} ds. \]
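The "scaled to preserve variance" point can be illustrated in one dimension. The sketch below smooths discrete white noise with Gaussian kernels of several widths, each renormalised so that the sum of squared weights is 1 (a discrete stand-in for $\int f^2 = 1$ with the $w^{-D/2}$ factor), and then takes the maximum over location and scale. The grid length, the scales and the circular boundary handling are assumptions for illustration.

```python
import math
import random

def unit_energy_kernel(w, half_width=4.0):
    """Discrete Gaussian kernel of scale w, rescaled so that sum(f**2) == 1."""
    radius = int(math.ceil(half_width * w))
    raw = [math.exp(-0.5 * (k / w) ** 2) for k in range(-radius, radius + 1)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def smooth(noise, kernel):
    """Circular convolution; with unit-energy weights each output has variance 1."""
    n, radius = len(noise), len(kernel) // 2
    return [sum(f * noise[(i + j - radius) % n] for j, f in enumerate(kernel))
            for i in range(n)]

def scale_space_max(noise, scales):
    """Maximum of Z(s, w) over locations s and scales w: (value, location, scale)."""
    best = (-math.inf, None, None)
    for w in scales:
        z = smooth(noise, unit_energy_kernel(w))
        i = max(range(len(z)), key=z.__getitem__)
        if z[i] > best[0]:
            best = (z[i], i, w)
    return best

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]
scales = [1.0, 2.0, 4.0, 8.0]
z_max, location, scale = scale_space_max(noise, scales)
```

Because the search now also runs over scale, the threshold for the maximum has to account for the extra dimension, which is what the Lipschitz-Killing curvatures of S × [w1, w2] provide.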
Beautiful symmetry:
\[ E(EC(S \cap X_t)) = \sum_{d=0}^{D} L_d(S)\,\rho_d(R_t) \]
Lipschitz-Killing curvature $L_d(S)$: Steiner-Weyl Tube Formula (1930)
EC density $\rho_d(R_t)$: Taylor Gaussian Tube Formula (2003)
• Put a tube of radius r about the search region λS and rejection region Rt, where $\lambda = \mathrm{Sd}\!\left(\frac{\partial Z}{\partial s}\right)$.
[Figure: Tube(λS,r) around λS in the search space; Tube(Rt,r) around Rt, with boundary moved from t to t-r, in the (Z1, Z2) plane with Z1, Z2 ~ N(0,1).]
• Find volume or probability, expand as a power series in r, pull off coefficients:
\[ |\mathrm{Tube}(\lambda S, r)| = \sum_{d=0}^{D} \frac{\pi^{d/2}}{\Gamma(d/2+1)}\, L_{D-d}(S)\, r^d \]
\[ P(\mathrm{Tube}(R_t, r)) = \sum_{d=0}^{\infty} \frac{(2\pi)^{d/2}}{d!}\, \rho_d(R_t)\, r^d \]
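EC density $\rho_d(\bar\chi \ge t)$ of the $\bar\chi$ statistic
\[ \bar\chi(s) = \max_{0 \le \theta \le \pi/2} Z_1(s)\cos\theta + Z_2(s)\sin\theta \]
[Figure: rejection region Rt and Tube(Rt,r), with boundary moved from t to t-r, in the (Z1, Z2) plane with Z1, Z2 ~ N(0,1).]
Taylor’s Gaussian Tube Formula (2003)
\[
\begin{aligned}
P\big((Z_1, Z_2) \in \mathrm{Tube}(R_t, r)\big) &= \sum_{d=0}^{\infty} \frac{(2\pi)^{d/2}}{d!}\, \rho_d(\bar\chi \ge t)\, r^d \\
&= \rho_0(\bar\chi \ge t) + (2\pi)^{1/2} \rho_1(\bar\chi \ge t)\, r + (2\pi)\, \rho_2(\bar\chi \ge t)\, r^2/2 + \cdots \\
&= \int_{t-r}^{\infty} (2\pi)^{-1/2} e^{-z^2/2}\, dz + e^{-(t-r)^2/2}/4
\end{aligned}
\]
Matching powers of r:
\[ \rho_0(\bar\chi \ge t) = \int_t^{\infty} (2\pi)^{-1/2} e^{-z^2/2}\, dz + e^{-t^2/2}/4 \]
\[ \rho_1(\bar\chi \ge t) = (2\pi)^{-1} e^{-t^2/2} + (2\pi)^{-1/2} e^{-t^2/2}\, t/4 \]
\[ \rho_2(\bar\chi \ge t) = (2\pi)^{-3/2} e^{-t^2/2}\, t + (2\pi)^{-1} e^{-t^2/2} (t^2 - 1)/4 \]
\[ \vdots \]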
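"Pull off coefficients" can be checked numerically: differentiate the tube probability in r at r = 0 and divide by $(2\pi)^{d/2}$. The sketch below does this with central finite differences and compares against closed forms obtained by differentiating by hand. Note that the second derivative of $e^{-(t-r)^2/2}/4$ at r = 0 is $(t^2-1)e^{-t^2/2}/4$, so the closed form used here for $\rho_2$ carries a factor $(t^2-1)/4$. The step size and function names are assumptions of the example.

```python
import math

def tube_probability(t, r):
    """P((Z1, Z2) in Tube(R_t, r)) for the chi-bar rejection region: R_t grown to R_(t-r)."""
    u = t - r
    return 0.5 * math.erfc(u / math.sqrt(2.0)) + math.exp(-u * u / 2.0) / 4.0

def ec_densities_numeric(t, h=1e-3):
    """rho_0, rho_1, rho_2 from finite-difference derivatives of the tube probability."""
    g0 = tube_probability(t, 0.0)
    g1 = (tube_probability(t, h) - tube_probability(t, -h)) / (2.0 * h)
    g2 = (tube_probability(t, h) - 2.0 * g0 + tube_probability(t, -h)) / (h * h)
    # The coefficient of r^d is g^(d)(0)/d! and also (2*pi)^(d/2) * rho_d / d!.
    return g0, g1 / math.sqrt(2.0 * math.pi), g2 / (2.0 * math.pi)

def ec_densities_closed_form(t):
    """The same three densities, differentiated by hand."""
    e = math.exp(-t * t / 2.0)
    rho0 = 0.5 * math.erfc(t / math.sqrt(2.0)) + e / 4.0
    rho1 = e / (2.0 * math.pi) + t * e / (4.0 * math.sqrt(2.0 * math.pi))
    rho2 = t * e / (2.0 * math.pi) ** 1.5 + (t * t - 1.0) * e / (4.0 * 2.0 * math.pi)
    return rho0, rho1, rho2

numeric = ec_densities_numeric(3.0)
closed = ec_densities_closed_form(3.0)
```

The same recipe works for any rejection region whose tube probability can be written down, which is what makes the Gaussian tube approach attractive for new test statistics.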
EC densities for some standard test statistics
Using Morse theory method (1981, 1995):
• T, χ2, F (1994)
• Scale space (1995, 2001)
• Hotelling’s T2 (1999)
• Correlation (1999)
• Roy’s maximum root, maximum canonical correlation (2007)
• Wilks’ Lambda (2007) (approximation only)
Using Gaussian Kinematic Formula:
• T, χ2, F are now one line …
• Likelihood ratio tests for cone alternatives (e.g. chi-bar, beta-bar) and nonnegative least-squares (2007)
• …
Accuracy of the P-value approximation
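If $Z(s) \sim N(0,1)$ is an isotropic Gaussian random field, $s \in \mathbb{R}^2$, with $\lambda^2 I_{2\times 2} = V\!\left(\frac{\partial Z}{\partial s}\right)$, then
\[
\begin{aligned}
P\Big(\max_{s\in S} Z(s) \ge t\Big) &\approx E\big(EC(S \cap \{s : Z(s) \ge t\})\big) \\
&= EC(S) \times \int_t^\infty \frac{1}{(2\pi)^{1/2}}\, e^{-z^2/2}\,dz \\
&\quad + \lambda\,\tfrac{1}{2}\,\mathrm{Diameter}(S) \times \frac{1}{2\pi}\, e^{-t^2/2} \\
&\quad + \lambda^2\,\mathrm{Area}(S) \times \frac{1}{(2\pi)^{3/2}}\, t\, e^{-t^2/2} \\
&= c_0 \int_t^\infty \frac{1}{(2\pi)^{1/2}}\, e^{-z^2/2}\,dz + (c_1 + c_2 t + \cdots + c_D t^{D-1})\, e^{-t^2/2}
\end{aligned}
\]
\[
\left| P\Big(\max_{s\in S} Z(s) \ge t\Big) - E\big(EC(S \cap \{s : Z(s) \ge t\})\big) \right| = O\big(e^{-\alpha t^2/2}\big), \quad \alpha > 1.
\]
The expected EC gives all the polynomial terms in the expansion for the P-value.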
Bubbles task in fMRI scanner

Correlate bubbles with BOLD at every voxel:
[Figure: bubble masks for trials 1, 2, …, 3000 alongside the fMRI time series at one voxel.]

Calculate Z for each pair (bubble pixel, fMRI voxel) – a
5D “image” of Z statistics …
Thresholding? Cross correlation random field
Correlation between 2 fields at 2 different locations, searched over all pairs of locations, one in S, one in T:
\[
\begin{aligned}
P\Big(\max_{s \in S,\, t \in T} C(s,t) \ge c\Big) &\approx E\big(EC\{s \in S,\, t \in T : C(s,t) \ge c\}\big) \\
&= \sum_{i=0}^{\dim(S)} \sum_{j=0}^{\dim(T)} L_i(S)\, L_j(T)\, \rho_{ij}(C \ge c)
\end{aligned}
\]
\[
\rho_{ij}(C \ge c) = \frac{2^{\,n-2-h}\,(i-1)!\,j!}{\pi^{h/2+1}}
\sum_{k=0}^{\lfloor (h-1)/2 \rfloor} (-1)^k\, c^{\,h-1-2k}\, (1-c^2)^{(n-1-h)/2+k}
\sum_{l=0}^{k} \sum_{m=0}^{k}
\frac{\Gamma\!\left(\frac{n-i}{2} + l\right) \Gamma\!\left(\frac{n-j}{2} + m\right)}
{l!\,m!\,(k-l-m)!\,(n-1-h+l+m+k)!\,(i-1-k-l+m)!\,(j-k-m+l)!}
\]
Cao & Worsley, Annals of Applied Probability (1999)

Bubbles data: P=0.05, n=3000, c=0.113, T=6.22
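The quoted pair of thresholds, correlation c and T, can be related by the usual t transform of a correlation coefficient. The sketch below assumes n − 2 degrees of freedom, which the slide does not state; with that assumption it reproduces the quoted T values to within about 0.01, consistent with c having been rounded to three decimals.

```python
import math

def correlation_to_t(c, n):
    """t statistic equivalent to a sample correlation c from n observations (n - 2 df assumed)."""
    return c * math.sqrt(n - 2) / math.sqrt(1.0 - c * c)

def t_to_correlation(t, n):
    """Inverse transform: the correlation that corresponds to a given t."""
    return t / math.sqrt(n - 2 + t * t)

t_bubbles = correlation_to_t(0.113, 3000)   # bubbles data threshold
t_ms = correlation_to_t(0.300, 425)         # MS lesion data threshold
```

With n = 3000 even a correlation of 0.113 is a large T, which is why such a small correlation threshold is enough after correcting for the search over all pairs of locations.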
MS lesions and cortical thickness

Idea: MS lesions interrupt neuronal signals, causing thinning in downstream cortex

Data: n = 425 mild MS patients

Lesion density, smoothed 10mm

Cortical thickness, smoothed 20mm

Find connectivity i.e. find voxels in 3D, nodes in 2D with high
correlation(lesion density, cortical thickness)

Look for high negative correlations …

Threshold: P=0.05, c=0.300, T=6.48
n=425 subjects, correlation = -0.568
[Figure: scatter plot of average cortical thickness against average lesion volume.]
Summary





• Points are in a low dimensional space, physically meaningful
• Smooth (choice of kernel? Scale space …), threshold
• Galaxies:
  ➢ Looking for sheets, strings, clusters of high density
  ➢ EC is used to measure “topology”; other intrinsic volumes (diameter, surface area, volume) are also used
  ➢ Compare observed EC with expected EC under some model (e.g. “inflation”)
• Bubbles, MS lesions:
  ➢ Detect sparse high-density clusters with a very low signal to noise
  ➢ Thresholding gives maximum likelihood estimates under certain conditions
  ➢ EC is merely a device for getting an extremely accurate approximation to the false positive rate (P-value of the maximum)
• Brain mapping data:
  ➢ Detect sparse high-density “activations” with a very low signal to noise
  ➢ Riemannian metric is usually unknown
  ➢ But we only need to estimate Lipschitz-Killing curvature
  ➢ Fill with small simplices, work out LKC on each component, sum using inclusion-exclusion formula