Performance in Medical Image Computing
Dr Daniel Rueckert
Department of Computing
Imperial College London

Introduction
The IXI project is about the application of e-science to medical imaging research:
• Distributed image acquisition
• Distributed data storage
• Distributed image analysis
• Workflow

Aims
• To build a grid infrastructure for medical image analysis
• To apply it to exemplars relevant to:
  – biomedical research
  – drug discovery
  – healthcare
• To use e-science as a driver for novel algorithm development

IXI System: Overview
• How does IXI work?
• What sort of images?
• How big are images?
• How long do the algorithms take?
• What are the end-user applications?

IXI System: Aims
• Make large remote compute resources available via Grid Services
  – Dedicated service for each algorithm
  – Be able to compose one service with another to form a workflow
• Hide the complexity from the user
  – Seamless integration with the interface
  – Invisible and secure transfer of files
• Make the results easily available
  – Store in a database

IXI System: Architecture
• Local network
  – Web portal
  – Relational database (image and metadata are imported directly from DICOM)
  – XML database (stored workflows)
  – File system (image files)
  – Locally hosted grid service (reinsertion)
• Remote
  – Registry Service (index)
  – Workflow Service (workflow execution)
  – IXI Core Service (delegation)

How it works
[Diagram. Components: User, Web Portal Workbench, Relational Database, XML Database, Workflow Service, IXI Core Service, Reinsertion Service, Condor, GT3 GRAM. Action labels: File transfer via Grid FTP; Insert derived data and copy to file system; Retrieve workflow; Query local file locations; Initiate reinsertion; Finished; File transfer via Grid FTP; Matching subjects located; Submit workflow; Co-ordinate and delegate; E-mail user; Retrieve workflow & parse XML; Select workflow template; Clicks link in e-mail; Review results; User enters search criteria; Instantiate workflow; Select files to keep and submit; Execute locally.]

IXI: Dynamic brain atlas demonstrator
[Figure labels: database of medical images; rigid/non-rigid registration; classification by age group (16-35, 35-65, 65+); statistical or probabilistic atlas.]

How it works: Problems
[Diagram. Components: User, Web Portal Workbench, Relational Database, XML Database, Workflow Service. Labels: Submit workflow; E-mail user; "?".]

What sorts of images?
• 2D images (e.g. x-ray)
• 3D images (e.g. CT, MR, PET)
• 4D images (e.g. CT, MR)

How big are images?
• Current clinical routine:
  – MRI examination: 200-300 slices of 256 x 256 pixels x 2 bytes per pixel: ~30 Mbytes
  – CT examination: 10-30 slices of 512 x 512 pixels x 2 bytes per pixel: ~10 Mbytes
  – Digital x-ray: 512 x 512 pixels x 2 bytes per pixel x 8-25 fps x 100-500 seconds: ~1.5 Gbytes
• But only a small fraction of this is used for measurement or archived

How big will images be soon?
• Latest technology:
  – MRI examination: 300-500 slices of 512 x 512 pixels x 2 bytes per pixel: ~150 Mbytes
  – CT examination: 100-300 slices of 512 x 512 pixels x 2 bytes per pixel: ~100 Mbytes
  – And can be dynamic, e.g. 10-50 cardiac phases
• The raw data problem:
  – Latest techniques manipulate raw data, e.g. 32 complex channels, which is 128x larger than the reconstructed data: ~20 Gbytes

How long do the algorithms take to run?
• Segmentation
  – tissue segmentation: between 30 seconds and 10 minutes
  – anatomical segmentation: between several minutes and several hours
• Registration
  – rigid and affine: between 30 seconds and 5 minutes
  – non-rigid: between 10 minutes and 24 hours
• Visualisation
  – rendering: near real-time even on standard PCs

Broad categories of IXI applications
Biomedical Research
• Accessing, collecting and mining image data:
  – Genomics, proteomics, gene expression
  – Drug discovery
  – Clinical trials
• Large-scale simulation and analysis
  – Simulation of cardiac blood flow using CFD
Healthcare
• Large image-based databases
  – Interpretation, training
• Support of multidisciplinary and collaborative environments requiring complex planning and guidance tasks
  – Diagnosis
  – Treatment planning
  – Treatment verification

Why does performance matter?
• Performance depends mainly on:
  – Computing time
  – Data transfer time
  – Reliability and availability of services
• Performance has a different priority for different applications:
  – Drug discovery study with 100 subjects
  – Computer-assisted surgery

Biomedical Research: Drug discovery
• Image mining:
  – Statistical parametric maps of volume change in patients with schizophrenia undergoing drug treatment
[Figure labels: population at time t = 1; population at time t = 2; intrasubject registration; intersubject registration; TBM; reference.]

Why does performance matter?
• Drug discovery study with 100 subjects
  – End user: researcher
  – Computing time for each job: ca. 8 hours
  – Total computing time: 100 x 8 hours, but jobs can run in parallel
  – Data transfer time for each job: ca. 1-2 minutes
  – Total transfer time: 100 x 2 minutes; transfers cannot run in parallel (complication: firewalls slow data transfer down significantly)
• Reliability is more important than run-time

Healthcare: Computer-assisted interventions
• Use non-rigid registration to update the preoperative plan
• Ideally real-time, but 10-20 minutes is acceptable

Why does performance matter?
• Computer-assisted surgery
  – End user: clinicians and surgeons
  – Computing time: ca. 1-8 hours on a workstation
  – Total computing time: depending on the available machine, between 10 minutes (cluster) and several hours (single workstation)
  – Data transfer time: can be neglected
• Reliability is important, but performance prediction is far more important:
  – On which machine should I run the job?
  – How long will it take on that machine?

Performance modelling for image registration
[Figure: source and target images. Rueckert et al., IEEE TMI 1999.]

Performance modelling for image registration
[Flowchart: Initial transformation T → Calculate cost function C for transformation T → Generate new estimate of T by minimizing C → Is new transformation an improvement? If yes: update transformation T and repeat. If no: final transformation T.]
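The optimisation loop outlined above (initial transformation, cost evaluation, new estimate, accept only if improved) can be sketched in a few lines of Python. This is a minimal illustration under strong simplifying assumptions, not the non-rigid method of Rueckert et al.: the "images" are 1D signals, the transformation T is a single translation, the cost C is a sum of squared differences, and the optimiser is a simple step-halving search. The names `sample`, `cost` and `register` are hypothetical.

```python
import math


def sample(signal, x):
    """Linearly interpolate a 1D signal at position x (zero outside its range)."""
    if x < 0 or x > len(signal) - 1:
        return 0.0
    i = int(math.floor(x))
    if i == len(signal) - 1:
        return float(signal[i])
    frac = x - i
    return (1 - frac) * signal[i] + frac * signal[i + 1]


def cost(source, target, shift):
    """Cost C: sum of squared differences after translating the source by `shift`."""
    return sum((sample(source, i - shift) - t) ** 2 for i, t in enumerate(target))


def register(source, target, initial_shift=0.0, step=4.0, tolerance=0.01):
    """Iteratively refine a translation T; returns (shift, cost, iterations)."""
    shift = initial_shift                      # initial transformation T
    best = cost(source, target, shift)         # cost C for the initial T
    iterations = 0
    while step >= tolerance:
        iterations += 1
        # Generate new estimates of T on either side of the current one.
        candidates = [shift - step, shift + step]
        costs = [cost(source, target, c) for c in candidates]
        new_cost = min(costs)
        if new_cost < best:
            # The new transformation is an improvement: update T.
            shift = candidates[costs.index(new_cost)]
            best = new_cost
        else:
            # No improvement at this step size: search more finely.
            step /= 2
    return shift, best, iterations             # final transformation T


# Synthetic example: the target is the source bump moved 5 samples to the right.
source = [math.exp(-((i - 20) ** 2) / 18.0) for i in range(64)]
target = [math.exp(-((i - 25) ** 2) / 18.0) for i in range(64)]
shift, final_cost, iterations = register(source, target)
print(f"estimated shift: {shift:.2f}, cost: {final_cost:.3g}, iterations: {iterations}")
```

The number of iterations depends on the data and on the starting estimate, which is one reason the runtime of registration is difficult to predict analytically.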
[Flowchart label: Non-linear optimization]

Performance modelling
• Analytical performance modelling:
  – Seems impossible
  – Not desirable, since it often takes more time than developing the algorithms
• Experimental performance modelling:
  – Run algorithms with different parameters and datasets
Work by Stephen Jarvis, Dan Spooner, Brian Foley, University of Warwick

Performance modelling
• Highly variable runtime: a factor of 16 between fastest and slowest at the same image size
• Two classes of registration, depending on the destination image
• Self-registration is fast
• Significant speedup using an MPI cluster implementation
• Prediction based on timing of subsampled images
Work by Stephen Jarvis, Dan Spooner, Brian Foley, University of Warwick
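One way to turn timings of subsampled images into a prediction is to fit a simple scaling law and extrapolate to full resolution. The sketch below assumes that runtime grows as a power law in the number of voxels, t = a * n^b. That is an assumption made for illustration, not a result from the Warwick work. The timings are synthetic, generated from a made-up law in place of real measurements, and `fit_power_law` and `predict_runtime` are hypothetical names.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(a) + b * log(n); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_runtime(a, b, size):
    """Extrapolate the fitted law t = a * n^b to a new image size n."""
    return a * size ** b


# Voxel counts of three subsampled versions of a 256^3 image.
sizes = [32 ** 3, 64 ** 3, 128 ** 3]

# SYNTHETIC timings in seconds, generated from an invented law.
# In practice these would be measured by running the registration
# on each subsampled image.
times = [2e-6 * n ** 1.2 for n in sizes]

a, b = fit_power_law(sizes, times)
full_size = 256 ** 3
predicted = predict_runtime(a, b, full_size)
print(f"fitted exponent b = {b:.3f}, predicted full-size runtime = {predicted:.1f} s")
```

Given the factor-of-16 spread in runtime at a fixed image size, a fit of this kind would presumably have to be made per image pair rather than once per algorithm.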
Modelling systems and applications
• Highly variable runtime: a factor of 16 between fastest and slowest at the same image size
• Two classes of registration, depending on the destination image
• Self-registration is fast
• Prediction based on timing of subsampled images
• Significant speedup using an MPI cluster implementation

What next?
• Incorporate performance modelling and prediction into the IXI workflow (with help from S. Jarvis, Warwick):
  – to enable the user to tune parameters of the workflow with respect to the predicted performance
  – to enable the user to specify performance constraints
  – to inform the user about the progress of the workflow and provide updated measures of predicted performance
  – to implement different scheduling policies for different IXI applications and end users

What next: Challenges
• Data transfer can affect performance significantly:
  – Model data transfer times
  – Model bottlenecks such as firewalls or database servers
• Performance modelling for different algorithms is a time-consuming, tedious task:
  – Large number of different algorithms and different implementations
  – Can this be automated?
• Reliability and availability are generally more important than performance, but this will change as the grid middleware and infrastructure become more mature
• Future projects require near real-time performance
  – Analyze data while the patient is inside the scanner (Neurogrid)

Acknowledgements
• IXI team
  – Imperial College: Jo Hajnal, Andrew Rowland, Raj Chandrashekara, Michael Burns, Dimitrios Perperidis
  – University College: Derek Hill, Kelvin Leung, Bea Sneller
  – University of Oxford: Steve Smith, John Vickers
• Stephen Jarvis, Dan Spooner, Brian Foley, High Performance Systems Group, Department of Computer Science, University of Warwick