A Survey of Validation Techniques for Image Segmentation and Registration, with a Focus on the STAPLE Algorithm

Simon K. Warfield, Ph.D.
Associate Professor of Radiology, Harvard Medical School
Computational Radiology Laboratory, Department of Radiology
Children’s Hospital, Boston, Massachusetts
www.crl.med.harvard.edu
Outline
• Validation of image segmentation
– Overview of approaches
– STAPLE
• Validation of image registration
• STAPLE algorithm available as open
source software from:
– http://www.nitrc.org/projects/staple
– http://crl.med.harvard.edu/
Computational Radiology Laboratory.
Slide 2
Segmentation
• Goal: identify or label structures
present in the image.
• Many methods:
– Interactive or manual delineation,
– Supervised approaches with user
initialization,
– Alignment with a template,
– Statistical pattern recognition.
• Applications:
– Quantitative measurement of
volume, shape or location of
structures,
– Provides boundary for visualization
by surface rendering.
Newborn MRI
Segmentation.
Validation of Image Segmentation
• Spectrum of accuracy versus realism in
reference standard.
• Digital phantoms.
– Ground truth known accurately.
– Not so realistic.
• Acquisitions and careful segmentation.
– Some uncertainty in ground truth.
– More realistic.
• Autopsy/histopathology.
– Addresses pathology directly, at much higher resolution than imaging.
• Clinical data?
– Hard to know ground truth.
– Most realistic model.
Validation of Image Segmentation
• Comparison to digital and physical
phantoms:
– Excellent for testing against the anatomy, noise and artifacts that are modeled.
– Typically lacks the range of normal or pathological variability encountered in practice.
MRI of brain phantom from Styner et al., IEEE TMI 2000.
Comparison to Higher Resolution MRI
Panels: photograph and MRI. Provided by Peter Ratiu and Florin Talos.
Comparison to Higher Resolution Photograph
Panels: MRI, photograph, and microscopy. Provided by Peter Ratiu and Florin Talos.
Comparison to Autopsy Data
• Neonate gyrification index
– Ratio of length of cortical boundary to length
of smooth contour enclosing brain surface
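The gyrification index can be illustrated on a synthetic contour. This is a sketch for illustration only: the convex hull perimeter stands in for the smooth enclosing contour, and the "cortex" is an artificial folded curve, not real data.

```python
import numpy as np
from scipy.spatial import ConvexHull

def contour_length(pts):
    """Length of a closed polygonal contour given as an (n, 2) point array."""
    diffs = np.diff(np.vstack([pts, pts[:1]]), axis=0)
    return np.linalg.norm(diffs, axis=1).sum()

def gyrification_index(pts):
    """Ratio of the folded contour length to the length of a smooth
    enclosing contour, approximated here by the convex hull perimeter."""
    hull = ConvexHull(pts)
    # scipy quirk: for 2-D hulls, .area is the perimeter (not the enclosed area).
    return contour_length(pts) / hull.area

# Synthetic "cortex": a circle with 8 sinusoidal indentations.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
r = 1.0 + 0.3 * np.cos(8.0 * theta)
cortex = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
gi = gyrification_index(cortex)  # greater than 1: folding lengthens the contour
```

A smooth (convex) contour yields a GI of 1; folding drives it above 1, which is why the index tracks cortical maturation.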
Staging
Stage 3 (at 28 w GA): shallow indentations of the inferior frontal and superior temporal gyrus (1 infant at 30.6 w GA; normal range: 28.6 ± 0.5 w GA).
Stage 4 (at 30 w GA): 2 indentations divide the frontal lobe into 3 areas; the superior temporal gyrus is clearly detectable (3 infants, 30.6 ± 0.4 w GA; normal range: 29.9 ± 0.3 w GA).
Stage 5 (at 32 w GA): frontal lobe clearly divided into three parts: superior, middle and inferior frontal gyrus (4 infants, 32.1 ± 0.7 w GA; normal range: 31.6 ± 0.6 w GA).
Stage 6 (at 34 w GA): temporal lobe clearly divided into 3 parts: superior, middle and inferior temporal gyrus (8 infants, 33.5 ± 0.5 w GA; normal range: 33.8 ± 0.7 w GA).
“Assessment of cortical gyrus and sulcus formation using MR images in normal fetuses”, Abe S. et al., Prenatal Diagn 2003.
Neonate GI: MRI vs Autopsy
Plot: gyrification index (GI, 0–3) versus post-conceptional age in days (200–340), comparing MRI scan 1, MRI scan 2, and Armstrong autopsy data.
GI Increase Is Proportional to Change in Age
Plot: change in GI (0–0.8) versus time interval between scans in days (50–90), showing the change of total brain GI with a linear fit.
GI Versus Qualitative Staging
Plot: total brain GI (1–2.4) versus staging grade (3–9), for MRI scans 1 and 2.
Neonate Gyrification
GI: interactive versus automatic segmentation.
Plot: GI from automatic segmentation versus GI from hand segmentation, with the fit y = 1.2241x + 0.4443 and the line of equality.
Validation of Image Segmentation
• Comparison to expert performance; to other
algorithms.
• Why compare to experts?
– Experts are currently doing the segmentation tasks
that we seek algorithms for.
– Surgical planning.
– Neuroscience research.
• What is the appropriate measure for such comparisons?
Measures of Expert Performance
• Repeated measures of volume
– Intra-class correlation coefficient
• Spatial overlap
– Jaccard: Area of intersection over union.
– Dice: increased weight of intersection.
– Vote counting: majority rule, etc.
• Boundary measures
– Hausdorff, 95% Hausdorff.
• Bland-Altman methodology:
– Requires a reference standard.
• Measures of correct classification rate:
– Sensitivity, specificity ( Pr(D=1|T=1), Pr(D=0|T=0) )
– Positive predictive value and negative predictive value
(posterior probabilities Pr(T=1|D=1), Pr(T=0|D=0) )
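For binary masks, the overlap and classification-rate measures above all reduce to the four counts of a confusion table. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def overlap_measures(seg, ref):
    """Spatial overlap and classification-rate measures for two binary masks,
    treating `ref` as the reference standard (T) and `seg` as the decision (D)."""
    seg = np.asarray(seg).astype(bool)
    ref = np.asarray(ref).astype(bool)
    tp = np.logical_and(seg, ref).sum()    # true positives
    fp = np.logical_and(seg, ~ref).sum()   # false positives
    fn = np.logical_and(~seg, ref).sum()   # false negatives
    tn = np.logical_and(~seg, ~ref).sum()  # true negatives
    return {
        "jaccard": tp / (tp + fp + fn),       # area of intersection over union
        "dice": 2 * tp / (2 * tp + fp + fn),  # doubles the weight of the intersection
        "sensitivity": tp / (tp + fn),        # Pr(D=1 | T=1)
        "specificity": tn / (tn + fp),        # Pr(D=0 | T=0)
    }
```

Note that Dice is always at least as large as Jaccard, since the intersection is counted twice in both numerator and denominator.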
Validation of Image Segmentation
• STAPLE (Simultaneous Truth and
Performance Level Estimation):
– An algorithm for estimating performance
and ground truth from a collection of
independent segmentations.
STAPLE papers
– Image segmentation with labels:
• Warfield, Zou, Wells, ISBI 2002.
• Warfield, Zou, Wells, MICCAI 2002.
• Warfield, Zou, Wells, IEEE TMI 2004.
• Commowick and Warfield, IPMI 2009.
– Image segmentation with boundaries:
• Warfield, Zou, Wells, MICCAI 2006.
• Warfield, Zou, Wells, PTRSA 2008.
– Diffusion data and vector fields:
• Commowick and Warfield, IEEE TMI 2009.
STAPLE: Estimation Problem
• Complete data density: f ( D, T | p, q )
• Binary ground truth Ti for each voxel i.
• Expert j makes segmentation decisions Dij.
• Expert performance characterized by sensitivity
p and specificity q.
– We observe expert decisions D. If we knew
ground truth T, we could construct
maximum likelihood estimates for each
expert’s sensitivity (true positive fraction)
and specificity (true negative fraction):
\hat{p}, \hat{q} = \arg\max_{p, q} \ln f(D, T \mid p, q)
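With T known, these maximum likelihood estimates reduce to simple per-expert fractions. A minimal sketch (the function name and the voxels-as-rows layout are illustrative):

```python
import numpy as np

def ml_performance(D, T):
    """ML sensitivity and specificity of each expert when ground truth is known.
    D: (n_voxels, n_raters) binary decisions; T: (n_voxels,) binary truth."""
    D = np.asarray(D)
    T = np.asarray(T)
    p = (D[T == 1] == 1).mean(axis=0)  # true positive fraction per expert
    q = (D[T == 0] == 0).mean(axis=0)  # true negative fraction per expert
    return p, q
```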
Expectation-Maximization
• Since we don’t know ground truth T, treat T as a
random variable, and solve for the expert performance
parameters that maximize:
Q ( | 
( t 1 )
)  E ln f (D, T |  ) | D, 
( t 1 )
• Parameter values θj=[pj qj]T that maximize the
conditional expectation of the log-likelihood function
are found by iterating two steps:
– E-step: Estimate probability of hidden ground truth T given a
previous estimate of the expert quality parameters, and take
expectation.
– M-step: Estimate expert performance parameters by
comparing D to the current estimate of T.
Probability Estimate of True Labels
Estimate the probability of each tissue class in the reference standard:

W_{si}^{(k)} = f(T_i = s \mid D_i, \theta^{(k)}) = \frac{f(T_i = s) \prod_j f(D_{ij} \mid T_i = s, \theta_j^{(k)})}{\sum_{s'} f(T_i = s') \prod_j f(D_{ij} \mid T_i = s', \theta_j^{(k)})}
Binary Input: True Segmentation

W_i^{(k)} = f(T_i = 1 \mid D_i, p^{(k)}, q^{(k)})
= \frac{\prod_j f(D_{ij} \mid T_i = 1, p_j^{(k)}, q_j^{(k)}) \, f(T_i = 1)}{\sum_{T_i} \prod_j f(D_{ij} \mid T_i, p_j^{(k)}, q_j^{(k)}) \, f(T_i)}
= \frac{f(T_i = 1) \prod_{j: D_{ij} = 1} p_j^{(k)} \prod_{j: D_{ij} = 0} (1 - p_j^{(k)})}{f(T_i = 1) \prod_{j: D_{ij} = 1} p_j^{(k)} \prod_{j: D_{ij} = 0} (1 - p_j^{(k)}) + f(T_i = 0) \prod_{j: D_{ij} = 0} q_j^{(k)} \prod_{j: D_{ij} = 1} (1 - q_j^{(k)})}

f(T_i = 1): prior probability that the true label at voxel i is 1.
W_i^{(k)}: conditional probability that the true label is 1.
Expert Performance Estimate

p_j^{(k+1)} = \frac{\sum_{i: D_{ij} = 1} W_i^{(k)}}{\sum_i W_i^{(k)}}
\qquad
q_j^{(k+1)} = \frac{\sum_{i: D_{ij} = 0} (1 - W_i^{(k)})}{\sum_i (1 - W_i^{(k)})}

p (sensitivity, true positive fraction): ratio of expert-identified class 1 to total class 1 in the image.
q (specificity, true negative fraction): ratio of expert-identified class 0 to total class 0 in the image.

For multiple labels:

\theta_{j s' s}^{(k+1)} = \frac{\sum_{i: D_{ij} = s'} W_{si}^{(k)}}{\sum_i W_{si}^{(k)}}
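The E- and M-steps above can be sketched as a short iteration for the binary case. This is a minimal illustrative implementation, not the reference STAPLE code: the initialization, the voxel-independent prior, the clipping, and the convergence test are assumptions.

```python
import numpy as np

def staple_binary(D, prior=None, max_iter=100, tol=1e-6):
    """Binary STAPLE sketch: EM estimation of the hidden true segmentation W
    and per-rater sensitivity p and specificity q.
    D: (n_voxels, n_raters) array of 0/1 expert decisions."""
    D = np.asarray(D, dtype=float)
    n_vox, n_raters = D.shape
    # Prior f(T_i = 1); a voxel-independent default (an illustrative assumption).
    g = np.full(n_vox, D.mean()) if prior is None else np.asarray(prior, float)
    p = np.full(n_raters, 0.9)  # initial sensitivities
    q = np.full(n_raters, 0.9)  # initial specificities
    for _ in range(max_iter):
        # E-step: W_i = f(T_i = 1 | D_i, p, q), using p_j if D_ij = 1, else 1 - p_j.
        a = g * np.prod(np.where(D == 1, p, 1 - p), axis=1)
        b = (1 - g) * np.prod(np.where(D == 0, q, 1 - q), axis=1)
        W = a / (a + b)
        # M-step: re-estimate each rater's sensitivity and specificity from W.
        p_new = (W[:, None] * (D == 1)).sum(axis=0) / W.sum()
        q_new = ((1 - W)[:, None] * (D == 0)).sum(axis=0) / (1 - W).sum()
        p_new = np.clip(p_new, 1e-6, 1 - 1e-6)  # numerical safety
        q_new = np.clip(q_new, 1e-6, 1 - 1e-6)
        converged = max(np.abs(p_new - p).max(), np.abs(q_new - q).max()) < tol
        p, q = p_new, q_new
        if converged:
            break
    return W, p, q
```

Thresholding W at 0.5 gives the estimated "true" segmentation; p and q are the estimated performance of each rater.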
Newborn MRI Segmentation
Newborn MRI Segmentation
Summary of segmentation quality (posterior probability
Pr(T=t|D=t) ) for each tissue type for repeated manual
segmentations.
Indicates limits of accuracy of interactive segmentation.
Expert and Student Segmentations
Panels: test image, expert consensus, and segmentations by students 1, 2 and 3.
Phantom Segmentation
Panels: image; expert segmentation; student segmentations; reference standards estimated by voting and by STAPLE.
STAPLE Summary
• Key advantages of STAPLE:
– Estimates the “true” segmentation.
– Assesses expert performance.
• Principled mechanism which enables:
– Comparison of different experts.
– Comparison of algorithm and experts.
• Extensions for the future:
– Prior distribution or extended models for
expert performance characteristics.
– Estimate bounds on parameters.
Image registration
• A metric: measures similarity of images
given an estimate of the transformation.
• Best metric depends on nature of the
images.
• The alignment quality ultimately achievable depends on the model of the transformation.
• The transformation is identified by
solving an optimization problem.
– Seek the transform parameters that
maximize the metric of image similarity
Validation of Registration
• Compare transformations
– Take some images, apply a transformation
to them.
– Estimate the transformation using registration.
– How well does the estimated transformation
match the applied transform?
• Check alignment of key image features
– Fiducial alignment
– Spatial overlap
• Segment structures, assess overlap after
alignment.
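The first strategy can be illustrated in one dimension: apply a known translation to a signal, re-estimate it by maximizing a similarity metric over candidate shifts, and compare the estimate to the applied transform. A sketch under illustrative assumptions (integer shifts, a periodic signal model, grid search over a correlation metric):

```python
import numpy as np

def register_shift(fixed, moving, max_shift=10):
    """Estimate an integer 1-D translation by grid search, choosing the shift
    that maximizes the correlation coefficient (periodic boundary model)."""
    def cc(a, b):
        a = a - a.mean()
        b = b - b.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return (a * b).sum() / denom if denom > 0 else 0.0
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: cc(fixed, np.roll(moving, -s)))

# Validation protocol: apply a known transform, estimate it, compare.
rng = np.random.default_rng(1)
fixed = rng.random(200)
moving = np.roll(fixed, 4)           # known applied shift of 4 samples
estimated = register_shift(fixed, moving)
error = abs(estimated - 4)           # estimated vs. applied transform
```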
Intraoperative Nonrigid Registration
• Fast: the registration should take no more than one minute.
• Robust: the registration should work with poor-quality images, artifacts, tumor...
• Physics-based: we are not only concerned with intensity matching, but also with recovering the physical (mechanical) deformation of the brain.
• Accurate: neurosurgery requires precise knowledge of the position of structures.
• Archip et al. NeuroImage 2007
Block Matching Algorithm
Similarity measure: correlation coefficient ∈ [0, 1].
Divides a global optimization problem into many simple local ones.
Highly parallelizable, as blocks can be matched independently.
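The block-matching idea can be sketched in a few lines: for each block of the fixed image, exhaustively search a small window in the moving image for the displacement that maximizes the correlation coefficient. The block and window sizes below are toy values for illustration, not those of the clinical system.

```python
import numpy as np

def corrcoef(a, b):
    """Correlation coefficient of two equally sized blocks."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def block_match(fixed, moving, block=8, search=4):
    """For each block of `fixed`, find the displacement within ±`search`
    that maximizes the correlation with `moving`. Blocks are independent,
    so in practice they can be matched in parallel."""
    H, W = fixed.shape
    matches = []
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            ref = fixed[y:y + block, x:x + block]
            best, best_d = -2.0, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > H or xx + block > W:
                        continue  # candidate window falls outside the image
                    c = corrcoef(ref, moving[yy:yy + block, xx:xx + block])
                    if c > best:
                        best, best_d = c, (dy, dx)
            matches.append(((y, x), best_d, best))
    return matches
```

The per-block displacements form a sparse, noisy displacement field, which is why the next step regularizes them with a biomechanical model.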
Block Matching Algorithm
Displacement estimates are noisy.
Patient-specific Biomechanical Model
Pre-operative image → automatic brain segmentation → brain finite element model (linear elastic).
Registration Validation
• Landmark matching assessment in six cases.
• Parallel version runs in 35 seconds on a cluster of 10 dual 2 GHz PCs.
• Parameters:
– 7x7x7 block size
– 11x11x25 search window
– 1x1x1 step
– 50,000 blocks
– 10,000 tetrahedra
• 60 landmarks:
– Average error = 0.75 mm
– Maximum error = 2.5 mm
– Data voxel size 0.8 x 0.8 x 2.5 mm³
Plot: measured error (0–3 mm) versus displacement (0–15 mm) for patients 1–6, from landmark correspondences.
Registration Validation
• 11 prospective consecutive cases.
• Alignment computed during the surgery.
• Estimate of the registration accuracy: the 95% Hausdorff distance between the edges of the registered preoperative MRI and the intraoperative MRI.
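The 95% Hausdorff distance can be sketched for small point sets with a brute-force distance matrix. This is an illustrative implementation, not the pipeline's: it takes the 95th percentile of nearest-neighbor distances in each direction, which makes the measure robust to a few outlier edge points compared with the maximum (classical) Hausdorff distance.

```python
import numpy as np

def hausdorff_95(A, B):
    """95th-percentile symmetric Hausdorff distance between two point sets,
    given as (n, d) and (m, d) arrays (e.g. Canny edge points)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # all pairwise distances
    d_ab = d.min(axis=1)  # each point of A to its nearest point of B
    d_ba = d.min(axis=0)  # each point of B to its nearest point of A
    return max(np.percentile(d_ab, 95), np.percentile(d_ba, 95))
```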
Automatic selection of fiducials
(1) Non-rigid alignment of preoperative MPRAGE.
(2) Intraoperative whole brain SPGR at 0.5T.
Contours are extracted from (1) and from (2) with the Canny edge detector, and the 95% Hausdorff metric is computed between them.
Alignment improvement
Non-rigid registration – preop to intraop scans (95% Hausdorff distance)

Case    | Tumor position          | Tumor pathology                        | Max displacement measured (mm) | Rigid registration accuracy (mm) | Non-rigid registration accuracy (mm) | Ratio rigid/non-rigid
Case 1  | right posterior frontal | oligoastrocytoma Grade II              | 10.68 | 5.95  | 1.90 | 3.13
Case 2  | left posterior temporal | glioblastoma Grade IV                  | 21.03 | 10.71 | 2.90 | 3.69
Case 3  | left medial temporal    | glioblastoma Grade IV                  | 15.27 | 7.65  | 1.70 | 4.50
Case 4  | left temporal           | anaplastic oligoastrocytoma Grade III  | 10.00 | 6.80  | 0.85 | 8.00
Case 5  | right frontal           | oligoastrocytoma Grade II              | 9.87  | 5.10  | 1.27 | 4.01
Case 6  | left frontal            | anaplastic astrocytoma Grade III       | 17.48 | 10.20 | 3.57 | 2.85
Case 7  | right medial temporal   | anaplastic astrocytoma Grade III       | 19.96 | 9.35  | 2.55 | 3.66
Case 8  | right frontal           | oligoastrocytoma Grade II              | 17.44 | 8.33  | 1.19 | 7.00
Case 9  | right frontotemporal    | oligoastrocytoma Grade II              | 15.08 | 7.14  | 1.87 | 3.81
Case 10 | right occipital         | anaplastic oligodendroglioma Grade III | 9.48  | 5.95  | 1.44 | 4.13
Case 11 | left frontotemporal     | oligodendroglioma Grade II             | 10.74 | 4.76  | 0.85 | 5.60
AVG     |                         |                                        | 14.27 | 7.44  | 1.82 | 4.58
Visualization of aligned data
• Matched preoperative fMRI and DT-MRI
aligned with intraoperative MRI.
Tensor alignment: Ruiz et al. 2000
Conclusion
• Validation strategies for registration:
– Comparison of transformations.
– Fiducials
• Manual, automatic.
– Overlap statistics – as for segmentation.
• Validation strategies for segmentation:
– Digital and physical phantoms.
– Comparison to domain experts.
– STAPLE.
Acknowledgements
Collaborators:
• Neil Weisenfeld
• Andrea Mewes
• Richard Robertson
• Joseph Madsen
• Karol Miller
• Michael Scott
• William Wells
• Kelly H. Zou
• Frank Duffy
• Arne Hans
• Olivier Commowick
• Alexandra Golby
• Vicente Grau
This study was supported by:
R01 RR021885, R01 EB008015, R01 GM074068