Jan Kamenický, Mariánská 2008

We deal with medical images
◦ Different viewpoints – multiview
◦ Different times – multitemporal
◦ Different sensors – multimodal

Area-based methods (no features)
Transformation model
Cost function minimization

Transformation model
◦ Displacement field $u(x)$:
  $T(x) = x + u(x)$, so that $I_R(x) \approx I_S(T(x))$

Cost function
◦ Similarity measure (external forces)
◦ Smoothing (penalization) term (internal forces)
◦ Additional constraints (landmarks, volume preservation)
  $C(T; I_R, I_S) = S(T; I_R, I_S) + P(T) + C_{soft}(T)$

Minimization
  $\hat{T} = \arg\min_T C(T; I_R, I_S)$, or over the parameters: $\hat{\mu} = \arg\min_{\mu} C(\mu; I_R, I_S)$

Transformation models
Translation
Rigid (Euler)
◦ Translation, rotation
Similarity
◦ Translation, rotation, scaling
Affine
B-splines
◦ Control points $x_k$ on a regular grid over the reference image:
  $T(x) = x + \sum_{x_k \in N_x} p_k \, \beta^3(x - x_k)$

Similarity measures
Sum of Squared Differences (SSD)
◦ Assumes equal intensity distributions (same modality)
  $SSD(\mu; I_R, I_S) = \frac{1}{|\Omega_R|} \sum_{x_i \in \Omega_R} \left( I_R(x_i) - I_S(T(x_i)) \right)^2$
Normalized Correlation Coefficient (NCC)
◦ Assumes a linear relation between intensity values (but still same modality)
  $NCC(\mu; I_R, I_S) = \frac{\sum_{x_i \in \Omega_R} (I_R(x_i) - \bar{I}_R)(I_S(T(x_i)) - \bar{I}_S)}{\sqrt{\sum_{x_i \in \Omega_R} (I_R(x_i) - \bar{I}_R)^2 \sum_{x_i \in \Omega_R} (I_S(T(x_i)) - \bar{I}_S)^2}}$
Mutual Information (MI)
◦ Captures any statistical dependence
  $MI(\mu; I_R, I_S) = \sum_{s \in L_S} \sum_{r \in L_R} p(r, s; \mu) \log_2 \frac{p(r, s; \mu)}{p_R(r) \, p_S(s; \mu)}$
Normalized Gradient Field (NGF)
Mutual Information (MI)
◦ From entropy:
  $H(X) = -\sum_{x \in X} p(x) \log_2 p(x)$, with $\sum_{x \in X} p(x) = 1$
  $MI(X, Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X, Y)$
  $MI(X, Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log_2 \frac{p(x, y)}{p_X(x) \, p_Y(y)}$
◦ From the Kullback–Leibler distance:
  $KL(p, q) = \sum_i p(i) \log \frac{p(i)}{q(i)}$
  i.e. MI is the KL distance between the joint distribution and the product of the marginals.
◦ For images, $p(x)$ is the normalized image histogram:
  $MI(\mu; I_R, I_S) = \sum_{s \in L_S} \sum_{r \in L_R} p(r, s; \mu) \log_2 \frac{p(r, s; \mu)}{p_R(r) \, p_S(s; \mu)}$
◦ Normalized Mutual Information (NMI):
  $NMI = \frac{H(X) + H(Y)}{H(X, Y)}$
  $NMI(\mu; I_R, I_S) = \frac{-\sum_{r \in L_R} p_R(r) \log_2 p_R(r) - \sum_{s \in L_S} p_S(s; \mu) \log_2 p_S(s; \mu)}{-\sum_{s \in L_S} \sum_{r \in L_R} p(r, s; \mu) \log_2 p(r, s; \mu)}$
◦ Joint probability estimated using B-spline Parzen windows:
  $p(r, s; \mu) = \frac{1}{|\Omega_R|} \sum_{x_i \in \Omega_R} w_R\!\left(\frac{r - I_R(x_i)}{\sigma_R}\right) w_S\!\left(\frac{s - I_S(T(x_i))}{\sigma_S}\right)$
  where $\sigma_R$ and $\sigma_S$ are defined by the histogram bin widths.

Normalized Gradient Field (NGF)
◦ Based on edges:
  $n_e(I, x) = \frac{\nabla I(x)}{\sqrt{\|\nabla I(x)\|^2 + \epsilon^2}}$
  $NGF(\mu; I_R, I_S) = \sum_{x_i \in \Omega_R} \left\| n_e(I_R, x_i) \times n_e(I_S(T), x_i) \right\|^2$

Penalization terms
Elastic
◦ Elastic potential (motivated by material properties):
  $P^{elas}[u] = \int \frac{\mu}{4} \sum_{j,k} \left( \frac{\partial u_k}{\partial x_j} + \frac{\partial u_j}{\partial x_k} \right)^2 + \frac{\lambda}{2} (\operatorname{div} u)^2 \, dx$
Fluid
◦ Viscous fluid model (based on the Navier–Stokes equations)
Diffusion
◦ Much faster:
  $P^{diff}[u] = \frac{1}{2} \int \sum_l \|\nabla u_l\|^2 \, dx$
Curvature
◦ Does not penalize affine transformations:
  $P^{curv}[u] = \frac{1}{2} \int \sum_{l=1}^{d} (\Delta u_l)^2 \, dx$
Bending energy (thin-plate splines)
  $P[u] = \int \sum_{p,q,r} \left( \frac{\partial^2 u_p}{\partial x_q \partial x_r}(x) \right)^2 dx$

[Figure: deformation fields obtained with the curvature, elastic, diffusion, and fluid regularizers]

Additional constraints
Landmarks (fiducial markers)
◦ "Hard" constraint:
  $C_j:\; r_j + u(r_j) = t_j, \quad j = 1, 2, \ldots, m$
◦ "Soft" constraint:
  $C_{soft} = \sum_{j=1}^{m} \left\| r_j + u(r_j) - t_j \right\|^2$
Volume preservation
  $C_{soft} = \int_{\Omega_R} \left| \log \det \nabla T(x) \right| dx$, with $T(x) = x + u(x)$

Samplers
Full grid
◦ Used with multi-resolution
Random
◦ A random subset of voxels is selected
◦ Improved speed

Iterative optimization
  $\mu_{k+1} = \mu_k + a_k d_k, \quad k = 0, 1, 2, \ldots$
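The iterative scheme $\mu_{k+1} = \mu_k + a_k d_k$ can be sketched with the SSD similarity measure and a plain gradient-descent direction $d_k = -g(\mu_k)$ on a synthetic 1-D translation problem. This is a minimal illustration with an invented setup (Gaussian blob images, finite-difference gradient), not the implementation used by any toolkit:

```python
import numpy as np

def i_s(t):
    """Source image: a 1-D Gaussian blob, sampled at positions t."""
    return np.exp(-0.5 * t ** 2)

def ssd(mu, i_r, x):
    """SSD between the reference and the translated source, T(x) = x + mu."""
    return np.mean((i_r - i_s(x + mu)) ** 2)

# Synthetic data: the reference is the source shifted, so the optimum is mu = 3.
x = np.linspace(-10.0, 10.0, 400)
i_r = i_s(x + 3.0)

# mu_{k+1} = mu_k + a_k * d_k with d_k = -g(mu_k); the gradient g is
# approximated here by central finite differences.
mu, step, eps = 0.0, 5.0, 1e-4
for _ in range(200):
    g = (ssd(mu + eps, i_r, x) - ssd(mu - eps, i_r, x)) / (2 * eps)
    mu -= step * g

print(round(mu, 2))  # recovers the translation, mu ~ 3.0
```

With a fixed step size $a_k$ this exhibits the linear convergence rate mentioned for GD; swapping in a stochastic gradient estimate would turn the same loop into SGD.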
Optimizers
Gradient Descent (GD)
◦ Linear rate of convergence
  $\mu_{k+1} = \mu_k - a_k \, g(\mu_k)$
Quasi-Newton (QN)
◦ Can be superlinearly convergent
  $\mu_{k+1} = \mu_k - H^{-1}(\mu_k) \, g(\mu_k)$
Nonlinear Conjugate Gradient (NCG)
◦ A superlinear rate of convergence can be achieved
  $d_k = -g(\mu_k) + \beta_k d_{k-1}$
Stochastic Gradient Descent (SGD)
◦ Similar to GD, but uses an approximation of the gradient (Kiefer–Wolfowitz, Simultaneous Perturbation, Robbins–Monro)
Evolution Strategy (ES)
◦ Covariance matrix adaptation
◦ Tries several candidate directions (sampled randomly according to the adapted covariance matrix); the best ones are chosen and their weighted average is used

Multi-resolution
Data complexity
◦ Gaussian pyramid
◦ Laplacian pyramid
◦ Wavelet pyramid
Transformation complexity
◦ Transformation superposition
◦ Different B-spline grid densities

elastix
Registration toolkit based on ITK
Handles many methods:
◦ Similarity measures (SSD, NCC, MI, NMI)
◦ Transformations (rigid, affine, B-splines)
◦ Optimizers (GD, SGD-RM)
◦ Samplers, interpolators, multi-resolution, …
http://elastix.isi.uu.nl
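The data-complexity side of multi-resolution can be sketched with a pyramid built by blurring and subsampling. As a NumPy-only stand-in for a true Gaussian pyramid, this sketch uses a separable binomial (1, 2, 1) filter as the anti-aliasing blur; all names here are illustrative:

```python
import numpy as np

def smooth(a):
    """Separable 3x3 binomial (1,2,1)/4 blur with edge padding (anti-aliasing)."""
    p = np.pad(a, 1, mode="edge")
    v = (p[:-2, :] + 2 * p[1:-1, :] + p[2:, :]) / 4.0       # vertical pass
    return (v[:, :-2] + 2 * v[:, 1:-1] + v[:, 2:]) / 4.0    # horizontal pass

def gaussian_pyramid(image, levels):
    """Level 0 is the original image; each coarser level is blurred and
    subsampled by 2. Registration then runs coarse-to-fine over the pyramid,
    each level initialized with the result of the previous (coarser) one."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(smooth(pyramid[-1])[::2, ::2])
    return pyramid

img = np.random.rand(128, 128)
pyr = gaussian_pyramid(img, levels=4)
print([p.shape for p in pyr])  # [(128, 128), (64, 64), (32, 32), (16, 16)]
```

A Laplacian pyramid would additionally store the difference between each level and the upsampled next-coarser level; the transformation-complexity counterpart refines the B-spline grid density between levels instead of the image.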