ca - Wellcome Trust Centre for Neuroimaging

Computational Anatomy & Statistical Shape Models
John Ashburner
john@fil.ion.ucl.ac.uk
Functional Imaging Lab, 12 Queen Square, London, UK.
Why?
o The Wellcome Trust is keen that there is a translational
component to the work in the FIL.
o E.g. develop some potentially useful diagnostic stuff.
o For proper generative models of brain shape differences.
o More accurate spatial normalisation.
o More accurate shape characterisations.
o To use these models for proper characterisation of
population differences.
o These may be multivariate.
o Join the mainstream.
o How do more established fields of biology compare shapes?
NeuroImage Volume 23, Supplement 1, Pages CO2-S299 (2004)
Mathematics in Brain Imaging
Edited by P.M. Thompson, M.I. Miller, T. Ratnanather, R.A. Poldrack and T.E. Nichols
o Mapping cortical change in Alzheimer's disease, brain development, and schizophrenia. Paul M. Thompson, Kiralee M. Hayashi, Elizabeth R. Sowell, Nitin Gogtay, Jay N. Giedd, Judith L. Rapoport, Greig I. de Zubicaray, Andrew L. Janke, Stephen E. Rose, James Semple et al.
o Computational anatomy: shape, growth, and atrophy comparison via diffeomorphisms. Michael I. Miller
o Geometric strategies for neuroanatomic analysis from MRI. James S. Duncan, Xenophon Papademetris, Jing Yang, Marcel Jackowski, Xiaolan Zeng and Lawrence H. Staib
o Variational, geometric, and statistical methods for modeling brain anatomy and function. Olivier Faugeras, Geoffray Adde, Guillaume Charpiat, Christophe Chefd'Hotel, Maureen Clerc, Thomas Deneux, Rachid Deriche, Gerardo Hermosillo, Renaud Keriven, Pierre Kornprobst et al.
o Computational anatomy and neuropsychiatric disease: probabilistic assessment of variation and statistical inference of group difference, hemispheric asymmetry, and time-dependent change. John G. Csernansky, Lei Wang, Sarang C. Joshi, J. Tilak Ratnanather and Michael I. Miller
o Sequence-independent segmentation of magnetic resonance images. Bruce Fischl, David H. Salat, André J.W. van der Kouwe, Nikos Makris, Florent Ségonne, Brian T. Quinn and Anders M. Dale
o Expert knowledge-guided segmentation system for brain MRI. Alain Pitiot, Hervé Delingette, Paul M. Thompson and Nicholas Ayache
o Surface-based approaches to spatial localization and registration in primate cerebral cortex. David C. Van Essen
o Cortical surface segmentation and mapping. Duygu Tosun, Maryam E. Rettmann, Xiao Han, Xiaodong Tao, Chenyang Xu, Susan M. Resnick, Dzung L. Pham and Jerry L. Prince
o Cortical cartography using the discrete conformal approach of circle packings. Monica K. Hurdal and Ken Stephenson
o A framework to study the cortical folding patterns. J.-F. Mangin, D. Rivière, A. Cachia, E. Duchesnay, Y. Cointepas, D. Papadopoulos-Orfanos, P. Scifo, T. Ochiai, F. Brunelle and J. Régis
o Geodesic estimation for large deformation anatomical shape averaging and interpolation. Brian Avants and James C. Gee
o Unbiased diffeomorphic atlas construction for computational anatomy. S. Joshi, Brad Davis, Matthieu Jomier and Guido Gerig
o Statistics on diffeomorphisms via tangent space representations. M. Vaillant, M.I. Miller, L. Younes and A. Trouvé
o Soliton dynamics in computational anatomy. Darryl D. Holm, J. Tilak Ratnanather, Alain Trouvé and Laurent Younes
o Implicit brain imaging. Facundo Mémoli, Guillermo Sapiro and Paul Thompson
Computational anatomy: shape, growth, and atrophy comparison via diffeomorphisms
Michael I. Miller
Training and Classifying
[Figure: patient training data and control training data, with unlabelled points marked "?"]
Classifying
[Figure: the points marked "?" assigned to Patients or Controls]
$y = f(a^T x + b)$
Support Vector Classifier (SVC)
[Figure: three points labelled "Support Vector"]
a is a weighted linear combination of the support vectors
Some Equations
o Linear classification is by $y = f(a^T x + b)$
o where a is a weighting vector, x is the test data, b is an offset, and f(.) is a thresholding operation
o a is a linear combination of SVs: $a = \sum_i w_i x_i$
o So $y = f(\sum_i w_i x_i^T x + b)$
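The equations above can be sketched in a few lines. This is only an illustration: the support vectors, weights and offset are hypothetical toy values (a real SVC would estimate them from training data), and f(.) is assumed to be a sign threshold.

```python
import numpy as np

def linear_svc_predict(support_vectors, weights, b, x):
    """Classify x with y = f(sum_i w_i x_i^T x + b), where f thresholds at zero."""
    a = weights @ support_vectors          # a = sum_i w_i x_i
    return 1 if a @ x + b >= 0 else -1

# Hypothetical toy values: two support vectors in 2D with opposite-signed weights.
support_vectors = np.array([[1.0, 1.0], [-1.0, -1.0]])
weights = np.array([0.5, -0.5])
b = 0.0

label_pos = linear_svc_predict(support_vectors, weights, b, np.array([2.0, 0.5]))
label_neg = linear_svc_predict(support_vectors, weights, b, np.array([-0.5, -2.0]))
```

With these values a = (1, 1), so the first test point falls on the positive side and the second on the negative side.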
Going Nonlinear
o Nonlinear classification is by
$y = f(\sum_i w_i \phi(x_i, x))$
o where $\phi(x_i, x)$ is some function of $x_i$ and $x$.
o e.g. RBF classification: $\phi(x_i, x) = \exp(-\|x_i - x\|^2 / (2\sigma^2))$
o Requires a matrix of distance measures (metrics)
between each pair of images.
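A minimal sketch of the RBF matrix, assuming each image has already been reduced to a feature vector and that plain Euclidean distance is used. Both are assumptions for illustration; the slides go on to argue for distances derived from deformations instead.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for the rows x_i of X."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)   # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma**2))

# Hypothetical feature vectors standing in for three images.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel_matrix(X, sigma=1.0)
```

Any other pairwise distance could be substituted for `sq_dists`, which is why the choice of metric matters for nonlinear classification.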
Nonlinear SVC
What is a Metric?
o Positive
o Dist(A,B) ≥ 0
o Dist(A,A) = 0
o Symmetric
o Dist(A,B) = Dist(B,A)
o Satisfy triangle inequality
o Dist(A,B)+Dist(B,C) ≥ Dist(A,C)
[Figure: three points labelled A, B and C]
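The three requirements can be checked numerically on a matrix of pairwise distances. The helper name `is_metric` and the example points are hypothetical; the example shows that squared Euclidean distance fails the triangle inequality.

```python
import numpy as np

def is_metric(D, tol=1e-12):
    """Check positivity, symmetry and the triangle inequality for a distance matrix D."""
    D = np.asarray(D, dtype=float)
    positive = np.all(D >= -tol) and np.allclose(np.diag(D), 0.0, atol=tol)
    symmetric = np.allclose(D, D.T, atol=tol)
    # Dist(A,B) + Dist(B,C) >= Dist(A,C) for every triple (A, B, C).
    triangle = np.all(D[:, :, None] + D[None, :, :] >= D[:, None, :] - tol)
    return bool(positive and symmetric and triangle)

points = np.array([0.0, 1.0, 3.0])                 # three points on a line
D_euclid = np.abs(points[:, None] - points[None, :])
D_squared = D_euclid**2                            # 1 + 4 < 9 breaks the triangle inequality
```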
Concise representations
o Information reduction/compression
o Most parsimonious representation - best
generalisation
o Occam’s Razor
o Registration compresses data
o signal is partitioned into
o deformations
o residuals
The Small deformation setting
o Most “nonlinear” registration is done in the small-deformation setting.
o Involves adding a smooth displacement field to an identity transform:
y=x+u
o No one-to-one constraint
o Inverse from: x = y - u
o Can be a poor approximation to the real inverse
o Adding and subtracting displacements doesn’t work properly
o Smoothing and averaging displacements doesn’t work properly
o Not the most parsimonious model
o Unrealistic generative model
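A 1D sketch of two of the problems listed above, using a hypothetical sinusoidal displacement field: the inverse x = y - u is only approximate, and a large enough displacement makes y = x + u fold over so that it is no longer one-to-one.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 201)

def u(x, amplitude):
    """Hypothetical smooth displacement field that vanishes at 0 and 1."""
    return amplitude * np.sin(2.0 * np.pi * x)

# Forward map y = x + u(x), followed by the approximate inverse x ~ y - u(y).
y = x + u(x, 0.05)
x_back = y - u(y, 0.05)
roundtrip_error = np.max(np.abs(x_back - x))

# A larger displacement breaks the one-to-one property: y is no longer monotonic.
y_large = x + u(x, 0.3)
folds = bool(np.any(np.diff(y_large) <= 0.0))
```

The round-trip error is nonzero even for the small displacement, and `folds` is true for the larger one.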
Small deformation: displacements are linear within the Eulerian framework
Small-deformation setting
[Figure panels: forward transform; backward transform; small def. approx. to backward transform; small def. approx. to forward transform]
Illustrating some concepts with rotations
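o Consider a 2D rotation $y = Rx$, where
$R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$
o This can be formulated as the solution of a differential equation at time t=1
o $\dot{x}_1(t) = \theta x_2(t)$
o $\dot{x}_2(t) = -\theta x_1(t)$
o or
o $\dot{x}(t) = A x(t)$, where $A = \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix}$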
Flow field for rigid 2D rotation
Exponentials
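o The solution can be obtained by
$x(1) = R x(0) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} x(0) = e^A x(0) = \exp\begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix} x(0)$
o The exponential is defined as:
$R = e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \frac{A^4}{4!} + \dots$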
o There are many ways of computing R from A, but one of
the easiest is by scaling and squaring
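A minimal sketch of scaling and squaring: divide A by a power of two, exponentiate the small matrix with a few Taylor terms, then square the result repeatedly. The function name and the default numbers of squarings and Taylor terms are arbitrary choices for illustration.

```python
import numpy as np

def expm_scaling_squaring(A, n_squarings=10, n_terms=6):
    """Approximate exp(A) as (exp(A / 2^n))^(2^n)."""
    A_small = np.asarray(A, dtype=float) / (2 ** n_squarings)
    # Short Taylor series for the scaled-down matrix: I + A + A^2/2! + ...
    R = np.eye(A_small.shape[0])
    term = np.eye(A_small.shape[0])
    for k in range(1, n_terms + 1):
        term = term @ A_small / k
        R = R + term
    # Undo the scaling by repeated squaring.
    for _ in range(n_squarings):
        R = R @ R
    return R

theta = 0.7
A = np.array([[0.0, theta], [-theta, 0.0]])
R = expm_scaling_squaring(A)
R_expected = np.array([[np.cos(theta), np.sin(theta)],
                       [-np.sin(theta), np.cos(theta)]])
```

For the 2D rotation generator the result can be compared directly with the closed-form rotation matrix.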
Averaging rotations
o It makes no sense to average the rotation matrices themselves.
o The result may not be a rotation
o The elements of a rotation
matrix lie on a manifold.
o Average by minimising the
sum of squared distances
tangential to the manifold
o Distance derived from
velocity (distance travelled in
unit time)
o Shortest distances are geodesics,
which require constant velocity
[Figure: axes r11 and r12]
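A small 2D sketch of the point about averaging, with two hypothetical rotation angles. Averaging the matrix elements gives a matrix whose determinant is below one, whereas averaging the angles (the tangent-space coordinates) and mapping back gives a proper rotation. In 2D rotations commute, so a single averaging step is enough; 3D needs an iterative scheme.

```python
import numpy as np

def rot2(theta):
    """2D rotation matrix in the same convention as R above."""
    return np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

def angle_of(R):
    """Logarithm of a 2D rotation: recover theta from R."""
    return np.arctan2(R[0, 1], R[0, 0])

rotations = [rot2(0.2), rot2(1.4)]

# Naive element-wise average: not orthogonal, determinant below one.
naive = sum(rotations) / len(rotations)
naive_det = np.linalg.det(naive)

# Average the angles, then map back onto the manifold.
mean_theta = np.mean([angle_of(R) for R in rotations])
mean_rotation = rot2(mean_theta)
```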
Groups
o 2D rotation matrices form a Lie group under multiplication, SO(2).
o Group Requirements
o Composition of group members is another group
member
o The members have inverses
o There is an identity member
o The composition operations are associative
o Lie Group requirements
o Continuous and differentiable manifold
3D Rotations
o 3D rotations defined by
$R = e^A = \exp\begin{pmatrix} 0 & \theta_1 & \theta_2 \\ -\theta_1 & 0 & \theta_3 \\ -\theta_2 & -\theta_3 & 0 \end{pmatrix}$
o These do not commute ($R_1 R_2 \neq R_2 R_1$)
o Similarly $\exp(A_1)\exp(A_2) \neq \exp(A_2)\exp(A_1)$
o Both differ from $\exp(A_1 + A_2)$
o Makes life more difficult
o Iterative schemes needed for averaging etc.
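The non-commutativity is easy to confirm numerically. The sketch below uses a hand-rolled matrix exponential (scaling and squaring, as an assumption to keep it self-contained) and two hypothetical generators about different axes.

```python
import numpy as np

def expm(A, n_squarings=10, n_terms=8):
    """Matrix exponential by scaling and squaring with a short Taylor series."""
    A = np.asarray(A, dtype=float) / (2 ** n_squarings)
    R = term = np.eye(A.shape[0])
    for k in range(1, n_terms + 1):
        term = term @ A / k
        R = R + term
    for _ in range(n_squarings):
        R = R @ R
    return R

def skew(t1, t2, t3):
    """Skew-symmetric generator laid out as in the 3D rotation formula above."""
    return np.array([[0.0, t1, t2], [-t1, 0.0, t3], [-t2, -t3, 0.0]])

A1 = skew(0.5, 0.0, 0.0)
A2 = skew(0.0, 0.5, 0.0)

R12 = expm(A1) @ expm(A2)
R21 = expm(A2) @ expm(A1)
R_sum = expm(A1 + A2)

diff_order = np.linalg.norm(R12 - R21)   # nonzero: the order of composition matters
diff_sum = np.linalg.norm(R12 - R_sum)   # nonzero: exp(A1)exp(A2) is not exp(A1 + A2)
```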
Lie Algebra
o A is an element of the Lie algebra of the rotation group.
o The amount of non-commutativity is measured by the Lie bracket
o [A,B] = AB-BA
o Results from curvature of the manifold
[Figure labels: $e^A$, $e^B$, e2(AB-BA)]
How much rotation is in a rotation matrix?
o Given two rotation matrices, R and S, the relative difference between the rotations can be found by computing
$C = \log(R^{-1} S)$
and then computing the RMS of C:
$(\theta_1^2 + \theta_2^2 + \theta_3^2)^{1/2}$
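A sketch of this measurement for 3D rotations. It assumes the relative rotation angle is below pi, so that the matrix logarithm has the closed form used here; the helper names and test angles are hypothetical.

```python
import numpy as np

def expm(A, n_squarings=10, n_terms=8):
    """Matrix exponential by scaling and squaring with a short Taylor series."""
    A = np.asarray(A, dtype=float) / (2 ** n_squarings)
    R = term = np.eye(A.shape[0])
    for k in range(1, n_terms + 1):
        term = term @ A / k
        R = R + term
    for _ in range(n_squarings):
        R = R @ R
    return R

def skew(t1, t2, t3):
    return np.array([[0.0, t1, t2], [-t1, 0.0, t3], [-t2, -t3, 0.0]])

def rotation_between(R, S):
    """Return (theta_1^2 + theta_2^2 + theta_3^2)^(1/2) for C = log(R^-1 S)."""
    C = R.T @ S                                   # R^-1 = R^T for rotations
    phi = np.arccos(np.clip((np.trace(C) - 1.0) / 2.0, -1.0, 1.0))
    if phi < 1e-12:
        return 0.0
    log_C = phi / (2.0 * np.sin(phi)) * (C - C.T)  # closed-form log, valid for phi < pi
    thetas = np.array([log_C[0, 1], log_C[0, 2], log_C[1, 2]])
    return float(np.sqrt(thetas @ thetas))

R = expm(skew(0.1, 0.2, 0.3))
S = R @ expm(skew(0.3, -0.2, 0.6))     # differs from R by a rotation of 0.7 radians
amount = rotation_between(R, S)
```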
Cartan decomposition
o A matrix can be decomposed into
$A = (A + A^T)/2 + (A - A^T)/2$
o $(A + A^T)/2$ is symmetric
o Encodes zooms and shears
o $(A - A^T)/2$ is skew symmetric
o Encodes rigid rotations
o A can be converted into a column vector (a)
o There is a matrix L, such that La gives the elements of $(A - A^T)/2$.
o The square of the rotation angle is then given by $(La)^T (La) = a^T (L^T L) a$
o This excludes any zooming and shearing from the measure
o Similarly – and more usefully - the amount of zooming and shearing
can be computed in a way that is independent of the rotations.
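The decomposition and the matrix L can be written out explicitly for a 3x3 matrix. In this sketch L is assumed to pick out only the three independent elements of the skew-symmetric part, and a is assumed to be A flattened row by row; the numbers in A are hypothetical.

```python
import numpy as np

A = np.array([[0.10, 0.40, 0.00],
              [-0.20, 0.05, 0.30],
              [0.10, -0.10, -0.20]])

symmetric_part = (A + A.T) / 2.0   # zooms and shears
skew_part = (A - A.T) / 2.0        # rigid rotations

# L maps the row-major vectorisation a of A onto the three independent
# elements of (A - A^T)/2: entries (0,1), (0,2) and (1,2).
L = np.zeros((3, 9))
L[0, 1], L[0, 3] = 0.5, -0.5
L[1, 2], L[1, 6] = 0.5, -0.5
L[2, 5], L[2, 7] = 0.5, -0.5

a = A.flatten()
theta = L @ a
squared_rotation = a @ (L.T @ L) @ a

# Adding a pure zoom (a symmetric matrix) leaves the rotation measure unchanged.
A_zoomed = A + np.diag([0.5, -0.2, 0.1])
theta_zoomed = L @ A_zoomed.flatten()
```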
Nonlinear Registration
Mapping
Flow field for nonlinear deformation
… and the resulting deformation
A diffeomorphism and its inverse
Diffeomorphisms have curved trajectories (variable velocity) if followed in the Eulerian reference frame (fixed). If followed within the Lagrangian frame (moves over time), they appear to have constant velocity.
Partial Differential Equations
Model one image as it deforms to match another.
$\dot{x}(t) = u(x(t))$
$x(1) = e^u(x(0))$
Matrix representations of diffeomorphisms
$x(1) = e^U x(0)$
$x(0) = e^{-U} x(1)$
For large k, $e^U \approx (I + U/k)^k$
Compositions
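Large deformations generated from compositions of small deformations
$S_1 = S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8} \circ S_{1/8}$
Recursive formulation
$S_1 = S_{1/2} \circ S_{1/2}$, $S_{1/2} = S_{1/4} \circ S_{1/4}$, $S_{1/4} = S_{1/8} \circ S_{1/8}$
Small deformation approximation
$S_{1/8} \approx I + U/8$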
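The recursion can be sketched in 1D, where composing two deformations amounts to interpolating one at the positions given by the other. The velocity field, grid size and number of squarings are hypothetical choices, and linear interpolation is an assumption made to keep the example short.

```python
import numpy as np

def exponentiate_velocity(v, x, n_squarings=8):
    """Integrate a stationary 1D velocity field v (sampled on grid x) over unit time."""
    phi = x + v / (2 ** n_squarings)       # small deformation: S_{1/2^n} ~ I + U/2^n
    for _ in range(n_squarings):
        phi = np.interp(phi, x, phi)       # S_{2t} = S_t o S_t
    return phi

x = np.linspace(0.0, 1.0, 257)
v = 0.3 * np.sin(2.0 * np.pi * x)          # hypothetical velocity field

phi = exponentiate_velocity(v, x)          # large deformation built by composition
small_def = x + v                          # small-deformation model with the same field

one_to_one_large = bool(np.all(np.diff(phi) > 0.0))
one_to_one_small = bool(np.all(np.diff(small_def) > 0.0))
```

With this field the composed deformation stays monotonic, while simply adding the field to the identity folds over.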
The shape metric
o Don’t use the straight distance (i.e. $\sqrt{u^T u}$)
o Distance = $\sqrt{u^T L^T L u}$
o What’s the best form of L?
o Membrane Energy
o Bending Energy
o Linear Elastic Energy
$L^T L$ for “membrane energy”
$L^T L u$ for “membrane energy” is generated by convolving with [kernel figure]
$L^T L$ for “bending energy”
$L^T L u$ for “bending energy” is generated by convolving with [kernel figure]
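As a 1D illustration only, and not a reconstruction of the kernels on the slides: if L is taken to be a first-difference operator for membrane energy and a second-difference operator for bending energy, with wrap-around boundaries, then applying $L^T L$ is the same as convolving with (-1, 2, -1) and (1, -4, 6, -4, 1) respectively.

```python
import numpy as np

n = 16
I = np.eye(n)
S = np.roll(I, 1, axis=1)        # circular shift: (S u)[i] = u[i+1]

L_membrane = S - I               # first differences (assumed form of L)
L_bending = S - 2.0 * I + S.T    # second differences (assumed form of L)

rng = np.random.default_rng(0)
u = rng.standard_normal(n)       # hypothetical 1D displacement field

def circular_convolve(u, kernel):
    """Convolve u with a centred, odd-length, symmetric kernel, wrapping at the ends."""
    half = len(kernel) // 2
    return sum(k * np.roll(u, half - j) for j, k in enumerate(kernel))

membrane_via_matrix = L_membrane.T @ L_membrane @ u
membrane_via_kernel = circular_convolve(u, [-1.0, 2.0, -1.0])

bending_via_matrix = L_bending.T @ L_bending @ u
bending_via_kernel = circular_convolve(u, [1.0, -4.0, 6.0, -4.0, 1.0])

# The corresponding distance is sqrt(u^T L^T L u), i.e. the norm of L u.
membrane_distance = np.sqrt(u @ membrane_via_matrix)
```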
Registration with different models
Consistent registration
[Figure: images A, B and C registered to one another]
Totally impractical for lots of scans
Register to a mean shaped image
[Figure: images A, B and C registered to a mean µ]
Problem: How can the distance between e.g. A and B be computed? Inverse exponentiating is iterative and slow.
Baker-Campbell-Hausdorff series
o $\exp^{-1}(\exp(A)\exp(B)) = A + B + [A,B]/2 + [A,[A,B]]/12 - [B,[A,B]]/12 - [B,[A,[A,B]]]/48 - [A,[B,[A,B]]]/48 + \dots$
o Where [A,B] is the Lie bracket applied to flow fields
[Figure labels: $e^A$, $e^B$, e2(AB-BA)]
Sometimes unstable. Looks like proper nonlinear
methods would be impractical.
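For small matrices the series can be checked numerically. The sketch below uses matrix commutators in place of Lie brackets on flow fields (an assumption, since matrices are easier to test) and measures how close the exponential of each truncation comes to exp(A)exp(B). The generators are hypothetical.

```python
import numpy as np

def expm(A, n_squarings=10, n_terms=8):
    """Matrix exponential by scaling and squaring with a short Taylor series."""
    A = np.asarray(A, dtype=float) / (2 ** n_squarings)
    R = term = np.eye(A.shape[0])
    for k in range(1, n_terms + 1):
        term = term @ A / k
        R = R + term
    for _ in range(n_squarings):
        R = R @ R
    return R

def bracket(A, B):
    return A @ B - B @ A

def skew(t1, t2, t3):
    return np.array([[0.0, t1, t2], [-t1, 0.0, t3], [-t2, -t3, 0.0]])

A = skew(0.3, 0.2, 0.1)
B = skew(0.1, 0.4, 0.2)
AB = bracket(A, B)

# Successive groups of terms in the series, in the order given above.
terms = [
    A + B,
    AB / 2.0,
    bracket(A, AB) / 12.0 - bracket(B, AB) / 12.0,
    -bracket(B, bracket(A, AB)) / 48.0 - bracket(A, bracket(B, AB)) / 48.0,
]

target = expm(A) @ expm(B)
errors = []
Z = np.zeros((3, 3))
for term in terms:
    Z = Z + term
    errors.append(np.linalg.norm(expm(Z) - target))
```

For generators this small the error shrinks as terms are added; for large deformations the series need not behave so well, which fits the instability noted on the slide.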
Alternative strategy
o Assume the manifold is locally flat around some point (the template image).
o The results depend on the point on the manifold that is chosen.
o Ideally use a mean shape as the template.
o Best approximation.
Visualisation
o The results of multivariate analyses are difficult
to visualise
o For linear classifiers, it can be done by
“caricaturing” the difference
o E.g. for the separation of two groups, it would be
possible to show two exaggerated versions of the mean
image.
[Figure: Controls and Patients separated by $y = f(a^T x + b)$]
[Figure panels: t = -1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
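One possible reading of caricaturing for a linear classifier is to move the mean image along the weight vector a by an amount t. This is a sketch under that assumption; the mean image and weights below are hypothetical, and the slides may instead scale deformations rather than intensities.

```python
import numpy as np

def caricatures(mean_image, a, scales):
    """Return mean_image + t * a for each t: exaggerations along the weight vector."""
    return [mean_image + t * a for t in scales]

mean_image = np.array([0.2, 0.5, 0.8, 0.5])     # hypothetical flattened mean image
a = np.array([0.0, 0.1, -0.1, 0.0])             # hypothetical classifier weights
scales = np.linspace(-1.0, 1.0, 9)              # t = -1.0, -0.75, ..., 1.0
frames = caricatures(mean_image, a, scales)
```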
I could say more about the registration algorithm next time
But let’s just say that it shows potential…
Average of 452 images
Only affine registered
2D average of 471 images
Registration of each 2D image takes about 3 seconds per iteration, and needs about 16 iterations. I see no problems scaling it to 3D.
Over-fitting
Test data
A simpler model can often do better...
Cross-validation
o Methods must be able to generalise to new data
o Various control parameters
o More complexity -> better separation of training data
o Less complexity -> better generalisation
o Optimal control parameters determined by cross-validation
o Test with data not used for training
o Use control parameters that work best for these data
Two-fold Cross-validation
Use half the data for training, and the other half for testing.
Then swap around the training and test data.
Leave One Out Cross-validation
Use all data except one point for training. The one that was left out is used for testing.
Then leave another point out. And so on...
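Leave-one-out cross-validation can be sketched as a simple loop. A nearest-mean classifier stands in for the SVC here (an assumption made to keep the example free of extra dependencies), and the data are hypothetical.

```python
import numpy as np

def nearest_mean_predict(X_train, y_train, x):
    """Stand-in classifier: assign x to the class with the closer training mean."""
    classes = np.unique(y_train)
    means = [X_train[y_train == c].mean(axis=0) for c in classes]
    dists = [np.linalg.norm(x - m) for m in means]
    return classes[int(np.argmin(dists))]

def leave_one_out_accuracy(X, y):
    correct = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # train on everything except point i
        prediction = nearest_mean_predict(X[keep], y[keep], X[i])
        correct += int(prediction == y[i])     # test on the point that was left out
    return correct / len(y)

# Hypothetical, well-separated toy data: three "controls" and three "patients".
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]])
y = np.array([0, 0, 0, 1, 1, 1])
accuracy = leave_one_out_accuracy(X, y)
```

To choose a control parameter, the same loop would be repeated for each candidate value and the value with the best accuracy kept.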
Interpretation??
o Significance assessed from accuracy based on
cross-validation.
o Main problems:
o No simple interpretation.
o Mechanism of classification is difficult to visualise
o especially for nonlinear classifiers
o Difficult to understand (not like blobs)
o May be able to use the separation to derive simple (and more publishable) hypotheses.
Group Theory
o Diffeomorphisms (smooth continuous one-to-one mappings) form a Group.
o Closure
o $A \circ B$ remains in the same group.
o Associativity
o $(A \circ B) \circ C = A \circ (B \circ C)$
o Identity
o Identity transform I exists.
o Inverse
o $A^{-1}$ exists, and $A^{-1} \circ A = A \circ A^{-1} = I$
o It is a Lie Group.
o The group of diffeomorphisms constitutes a smooth manifold.
o The operations are differentiable.
Lie Groups
o Simple Lie Groups include various classes of affine transform matrices.
o E.g. SO(2) : Special Orthogonal 2D (rigid-body rotation in 2D).
o Manifold is a circle
o Lie Algebra is exponentiated to give Lie group. For square matrices, this involves a matrix exponential.
Relevance to Diffeomorphisms
o Parameterise with velocities, rather than displacements.
o Velocities are the Lie Algebra. These are exponentiated to a deformation by recursive application of tiny displacements, over a period of time=0..1.
o $A(1) = A(1/2) \circ A(1/2)$
o $A(1/2) = A(1/4) \circ A(1/4)$
o Don’t actually use matrices.
o For tiny deformations, things are almost linear.
o $x(1/1024) \approx x(0) + v_x/1024$
o $y(1/1024) \approx y(0) + v_y/1024$
o $z(1/1024) \approx z(0) + v_z/1024$
o Recursive application by
o $x(1/2) = x(1/4)(x(1/4), y(1/4), z(1/4))$
o $y(1/2) = y(1/4)(x(1/4), y(1/4), z(1/4))$
o $z(1/2) = z(1/4)(x(1/4), y(1/4), z(1/4))$
Working with Diffeomorphisms
o Averaging Warps.
o Distances on the manifold are given by geodesics.
o Average of a number of deformations is a point on the manifold with the shortest sum of squared geodesic distances.
o E.g. average position of London, Sydney and Honolulu.
o Inversion.
o Negate the velocities, and exponentiate.
o $x(1/1024) \approx x(0) - v_x/1024$
o $y(1/1024) \approx y(0) - v_y/1024$
o $z(1/1024) \approx z(0) - v_z/1024$
o Priors for registration
o Based on smoothness of the velocities.
o Velocities relate to distances from origin.
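Inversion by negating the velocities can be sketched in 1D. The velocity field and grid are hypothetical, and composition is done by linear interpolation as a simplifying assumption. For comparison, the sketch also computes the small-deformation inverse, which subtracts the forward displacement.

```python
import numpy as np

def exponentiate_velocity(v, x, n_squarings=10):
    """Exponentiate a stationary 1D velocity field by recursive composition."""
    phi = x + v / (2 ** n_squarings)       # x(1/1024) ~ x(0) + v/1024
    for _ in range(n_squarings):
        phi = np.interp(phi, x, phi)       # x(2t) = x(t) evaluated at x(t)
    return phi

x = np.linspace(0.0, 1.0, 513)
v = 0.1 * np.sin(2.0 * np.pi * x)          # hypothetical velocity field

forward = exponentiate_velocity(v, x)
inverse = exponentiate_velocity(-v, x)     # negate the velocities, and exponentiate

# Applying the inverse after the forward map should be close to the identity.
roundtrip = np.interp(forward, x, inverse)
error_exp = np.max(np.abs(roundtrip - x))

# Small-deformation alternative: subtract the forward displacement u = forward - x.
u = forward - x
error_small = np.max(np.abs(forward - np.interp(forward, x, u) - x))
```

On this example the exponentiated inverse brings points back much closer to where they started than subtracting the displacement does.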