Optimisation of Monte Carlo codes for High Performance Computing in Radiotherapy Applications

Optimisation of Monte Carlo codes for
High Performance Computing
in Radiotherapy Applications
aka
The Full Monte!
Dr Iwan Cornelius, M.B. Flegg, C.M. Poole, Prof Christian Langton
Faculty of Science and Technology
Queensland University of Technology
Queensland Cancer Physics Collaborative
CRICOS No. 00213J
Queensland University of Technology
AstroMed09, 14-16th December, The University of Sydney
Outline
• Introduction
• Development of a LINAC Monte Carlo model using GEANT4
• Optimisation
• Future Directions
• Conclusions
Introduction: Radiotherapy
• LINAC: produces a highly controllable source of MeV photons
– Energy
– Gantry angle
– Patient position
– Multi Leaf Collimators (MLCs) to define arbitrarily shaped fields
Introduction: Radiotherapy
• Planning
– Patient imaged
– PTV and OARs contoured
– Optimisation of fields to conform dose to tumour and spare healthy tissue
• Delivery
– Fractionated
• Based on analytical calculations
– Can be inaccurate in regions of high heterogeneity
Monte Carlo
• What is it?
• How is it used in radiotherapy?
– Treatment plan verification
– Support new dosimetry measurements used in QA
• What tools exist?
– EGSnrc/BEAMnrc, PENELOPE, MCNPX, GEANT4
• Challenges to overcome
– Reduce computation times (maintain accuracy)
• Code optimisation
• Variance reduction
• High Performance Computing (HPC)
– Usability
High Performance Computing
• Monte Carlo: trivial to parallelise
– Launch identical application with unique random number generator seed
– Collate results
• Centralised Clusters
– Multiple machines, Beowulf
– Multiple CPU, Shared memory (SGI Altix)
• Cons
– Look better on paper
– Sharing resource with other users
– Often limited in number of processors, wait in queue
• Single machine, multiple processors
– Dual quad core
– With hyper-threading, appears as 16 cores
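The "unique seed, then collate" pattern can be sketched in a few lines. This is a toy illustration, not the authors' application: the attenuation coefficient, slab thickness and function names are all hypothetical, and the replicas run one after another here, whereas on a cluster each call would be a separate process.

```python
import math
import random

MU_PER_CM = 0.05   # hypothetical attenuation coefficient (illustrative only)
SLAB_CM = 10.0     # hypothetical slab thickness


def run_replica(seed, n_histories):
    """One independent run: count photons whose first interaction lies in the slab."""
    rng = random.Random(seed)  # private generator, unique seed per replica
    hits = 0
    for _ in range(n_histories):
        depth = -math.log(1.0 - rng.random()) / MU_PER_CM  # exponential free path
        if depth < SLAB_CM:
            hits += 1
    return hits, n_histories


def collate(results):
    """Pool the tallies from every replica into a single estimate."""
    hits = sum(h for h, _ in results)
    total = sum(n for _, n in results)
    return hits / total


seeds = range(1, 9)  # one unique seed per (hypothetical) core
results = [run_replica(s, 20_000) for s in seeds]  # sequential stand-in for parallel launch
estimate = collate(results)
expected = 1.0 - math.exp(-MU_PER_CM * SLAB_CM)
print(f"pooled estimate {estimate:.4f}, analytic {expected:.4f}")
```

Because the replicas share no state, they need no communication until the final collation step, which is what makes the problem trivial to parallelise.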
High Performance Computing: GPGPU
• General Purpose Graphics Processing Units
– Hundreds of processors on a chip
– NVIDIA Tesla C1060: PCIe, 240 cores per card, 4 GB memory
• CUDA
– Compute Unified Device Architecture
– Write ‘kernel’ in ‘C for CUDA’ to run on the GPU
– Copy from main memory to device memory
– Kernel executes on GPU
– Copies result back to main memory
– Great for loops
• How to ‘Accelerate’ Monte Carlo codes with GPUs
– Re-engineer entire code into C for CUDA kernels
– Re-write computationally intensive portions of code into ‘kernels’ using CUDA
– Calculation time doesn’t scale with # of processors
GEANT4
• Toolkit of C++ classes
– Primary beam, geometry, physics processes, scoring
– User must create their own application based on these
• Very powerful general purpose Monte Carlo tool
– High energy physics, space physics, medical physics, optics, radiation protection, astrophysics
GEANT4
• Pros
– Extremely flexible
– Time dependent geometries
– Radioactive decay, Neutron transport
– Various visualisation tools
• Cons
– Extremely flexible
– Requires proficiency with C++ programming
– Steep learning curve
– Deterrent for first time users
– Hospital based Medical Physicists with limited research time
The Full Monte!
• Create generic LINAC application using GEANT4
– Capable of modelling Elekta, Varian, Siemens LINACs
– Do for GEANT4 what BEAMnrc did for EGSnrc (just text inputs)
– Accurate. Verify against experimental data.
• Optimise for HPC environments (Desktop Supercomputer)
– Distribute over available CPUs
– Port to the GPU
• User interface
– Simple text-file based interface
– Graphical User Interface
• Interface with TPS
– Able to routinely verify treatment plans
Geometry
• Varian 2100 Clinac
– Dimensions, material composition from Varian Docs
• Target
• Primary Collimator
• Vacuum window
Geometry
• Flattening filter
– Compensate for forward peaked distribution of bremsstrahlung photons
• Ionisation chamber
– Monitor total dose delivery
Geometry
• Jaws
– Define square fields
Geometry
• Multi-Leaf Collimators (MLCs)
– Interleaved tungsten leaves
– Varian Millennium
– Brad Oborn (UoW)
Primary Beam
• Monoenergetic electron beam
• Normally incident on target
• Gaussian spread radially
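A minimal sketch of sampling such a primary beam, assuming the beam travels along -z and that the 5.85 MeV quoted on the results slides is the tuned electron energy. The spot width `SIGMA_CM` is a hypothetical value, since the slides do not give one.

```python
import random

ENERGY_MEV = 5.85   # tuned energy quoted on the results slides
SIGMA_CM = 0.1      # hypothetical spot width; the slides give no value


def sample_primary(rng):
    """Return (x, y, direction, energy) for one primary electron."""
    # Independent Gaussians in x and y give a radially symmetric 2D Gaussian spot.
    x = rng.gauss(0.0, SIGMA_CM)
    y = rng.gauss(0.0, SIGMA_CM)
    direction = (0.0, 0.0, -1.0)  # normal incidence on the target (assumed -z axis)
    return x, y, direction, ENERGY_MEV  # monoenergetic: same energy every time


rng = random.Random(42)
primaries = [sample_primary(rng) for _ in range(10_000)]
mean_x = sum(p[0] for p in primaries) / len(primaries)
print(f"mean x offset: {mean_x:+.4f} cm")
```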
Physics
• Photons
– Photoelectric effect
– Compton
– GammaConversion
• Electrons
– Multiple scatter
– Ionisation
– Bremsstrahlung
• Positrons
– Same processes as electrons
– Annihilation
Scoring
• Water Phantom
– 50 cm x 50 cm x 50 cm
– Score in voxelised geometry
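Scoring in a voxelised phantom amounts to mapping each energy deposit to a voxel index and accumulating a tally. The sketch below assumes a phantom centred on the origin and 100 voxels per axis (0.5 cm voxels); the voxel size and the class name `VoxelGrid` are hypothetical, as the slides state only the phantom dimensions.

```python
class VoxelGrid:
    """Cubic phantom centred on the origin, split into n x n x n voxels."""

    def __init__(self, size_cm=50.0, n=100):
        self.size_cm = size_cm
        self.n = n
        self.voxel_cm = size_cm / n
        self.energy_mev = {}  # sparse tally keyed by (i, j, k)

    def index(self, x, y, z):
        """Return the (i, j, k) voxel containing a point, or None if outside."""
        half = self.size_cm / 2.0
        ijk = []
        for c in (x, y, z):
            if not -half <= c < half:
                return None
            # min() guards against floating-point round-up at the upper face
            ijk.append(min(self.n - 1, int((c + half) / self.voxel_cm)))
        return tuple(ijk)

    def deposit(self, x, y, z, e_mev):
        """Add an energy deposit to the voxel containing the point."""
        key = self.index(x, y, z)
        if key is not None:
            self.energy_mev[key] = self.energy_mev.get(key, 0.0) + e_mev
        return key


grid = VoxelGrid()
grid.deposit(0.2, -0.3, 10.0, 1.5)
grid.deposit(0.1, -0.4, 10.2, 0.5)  # lands in the same voxel as the first deposit
print(grid.energy_mev)
```

Dose per voxel would then follow by dividing each tally by the voxel mass.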
Validation / Commissioning
• Comparison with ionisation chamber measurements in a water phantom
– Scanning in x, y, z
• Dose along beam axis
Validation: Tune Electron Beam Energy
• Tuning of electron beam energy for best match
– 10 cm x 10 cm field
– Compare between 10 cm and 30 cm depths
Results: Tune Electron Beam Energy
• Comparison with ionisation chamber measurements in water
• Tuning of electron beam energy for best match
– 10 cm x 10 cm field
– Compare between 10 cm and 30 cm depths
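One plausible way to automate "best match" is to pick the candidate energy whose simulated depth-dose curve has the smallest RMS percentage difference from measurement over the 10 cm to 30 cm window. The slides do not state which metric was used, so the metric, the function names and the depth-dose curves below are all assumptions; the curves are synthetic placeholders, not measured or simulated data.

```python
import math


def rms_percent_diff(measured, simulated, d_min=10.0, d_max=30.0):
    """RMS of the percentage differences at depths inside [d_min, d_max] cm."""
    diffs = [
        100.0 * (simulated[d] - measured[d]) / measured[d]
        for d in measured
        if d_min <= d <= d_max
    ]
    return math.sqrt(sum(x * x for x in diffs) / len(diffs))


def best_energy(measured, simulated_by_energy):
    """Candidate energy whose curve is closest to the measurement."""
    return min(
        simulated_by_energy,
        key=lambda e: rms_percent_diff(measured, simulated_by_energy[e]),
    )


# Synthetic placeholder curves: simple exponentials with made-up slopes.
depths_cm = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
measured = {d: 100.0 * math.exp(-0.050 * d) for d in depths_cm}
toy_slopes = {5.5: 0.053, 5.85: 0.050, 6.2: 0.047}  # energy (MeV) -> slope (1/cm)
simulated_by_energy = {
    e: {d: 100.0 * math.exp(-mu * d) for d in depths_cm}
    for e, mu in toy_slopes.items()
}

tuned = best_energy(measured, simulated_by_energy)
print(f"best-matching candidate energy: {tuned} MeV")
```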
Results: 5.85 MeV, 10 cm x 10 cm
• Within 2% agreement between 0.5 cm and 38 cm depth
Results: 5.85 MeV, 5 cm x 5 cm
Results: 5.85 MeV, 20 cm x 20 cm
Results: 5.85 MeV, 40 cm x 40 cm
Optimisations
• No Optimisation
– Many photons produced will never reach the sensitive region of the geometry
Optimisations
• Kill zones
– Nothing fancy-pants
– Terminate histories that are unlikely to contribute to the observable
– Above target
– Around primary collimator
• Relative Computation Time: 78 %
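A kill zone reduces to a geometric test applied to each particle, with the history dropped when the test succeeds. The sketch below assumes the beam travels along -z from a target at z = 0; every dimension and name in it is hypothetical, and the slides do not say how the authors implemented the test. In a GEANT4 application an equivalent check would typically sit in a user stepping action that sets the track status to `fStopAndKill`.

```python
import math
from dataclasses import dataclass

# Hypothetical geometry, beam travelling along -z from a target at z = 0.
TARGET_Z_CM = 0.0
COLLIMATOR_Z_RANGE_CM = (-8.0, -1.0)
COLLIMATOR_OUTER_RADIUS_CM = 10.0


@dataclass
class Particle:
    x: float
    y: float
    z: float


def in_kill_zone(p):
    """True if the particle sits where it is unlikely to reach the phantom."""
    if p.z > TARGET_Z_CM:  # region above (upstream of) the target
        return True
    z_lo, z_hi = COLLIMATOR_Z_RANGE_CM
    radius = math.hypot(p.x, p.y)
    if z_lo <= p.z <= z_hi and radius > COLLIMATOR_OUTER_RADIUS_CM:
        return True  # region around the outside of the primary collimator
    return False


def surviving(particles):
    """Keep only the histories worth tracking further."""
    return [p for p in particles if not in_kill_zone(p)]


stack = [Particle(0.0, 0.0, 2.0), Particle(15.0, 0.0, -4.0), Particle(0.5, 0.2, -4.0)]
print(len(surviving(stack)), "of", len(stack), "particles kept")
```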
Optimisations
• Phase space files
– Some aspects of geometry don’t change
– Create pre-calculated radiation field at plane
– Sample this population to save computation time
• Relative Computation Time: 38 %
• 380 hrs, O(10^10)
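The phase space idea can be sketched as two steps: run the expensive, unchanging part of the geometry once and record what crosses a scoring plane, then draw from that recorded population in later runs. Everything below is a toy stand-in: the survival fraction, spatial spread and uniform energy spectrum are hypothetical, and a real phase space would be written to a file rather than held in a list.

```python
import random


def simulate_treatment_head(n_primaries, rng):
    """Toy stand-in for the costly, patient-independent part of the simulation."""
    phase_space = []
    for _ in range(n_primaries):
        if rng.random() < 0.3:  # hypothetical fraction reaching the scoring plane
            record = (
                rng.gauss(0.0, 2.0),      # x at the plane, cm (made-up spread)
                rng.gauss(0.0, 2.0),      # y at the plane, cm
                rng.uniform(0.1, 5.85),   # energy, MeV (toy uniform spectrum)
            )
            phase_space.append(record)
    return phase_space


def sample_phase_space(phase_space, n, rng):
    """Draw n particles, with replacement, from the stored population."""
    return [rng.choice(phase_space) for _ in range(n)]


rng = random.Random(7)
phase_space = simulate_treatment_head(5_000, rng)       # done once
run_a = sample_phase_space(phase_space, 1_000, rng)     # reused for every later run
run_b = sample_phase_space(phase_space, 1_000, rng)
print(len(phase_space), "records stored;", len(run_a) + len(run_b), "particles reused")
```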
HPC: GPU/CPU Desktop Supercomputer
• Purchase of Xenon T5 Desktop Supercomputer
– “The Terminator”
– 4 x C1060 Tesla cards = 960 cores!
– 2 x quad core processors
• hyper-threading
• Linux ‘sees’ 16 processors
• NVIDIA Professorial partnership grant
– Awarded 3 x C1060 Tesla cards
• Research team learning CUDA
– Mark Harris, local CUDA guru
Optimisations: Parallelise on CPUs
• Message Passing Interface (MPI)
– Run identical simulation on different cores, each with a unique random number seed
– Geant4 MPImanager class
– Speed-up scales roughly linearly with number of processors
– Simulations in 24 hrs, O(10^10)
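The quoted timings can be cross-checked with simple arithmetic. This assumes, without confirmation from the slides, that the relative computation times are measured against the unoptimised run, and that the 24 hr figure was obtained with phase space files on all 16 logical processors.

```python
phase_space_hours = 380.0   # quoted for the phase space optimisation
relative_time = 0.38        # quoted relative computation time
logical_processors = 16     # what Linux 'sees' on the desktop machine
reported_mpi_hours = 24.0   # quoted for the MPI runs

# Implied run time with no optimisation at all.
baseline_hours = phase_space_hours / relative_time

# Ideal run time if the work divides perfectly across processors.
ideal_parallel_hours = phase_space_hours / logical_processors

# Fraction of ideal scaling actually achieved.
efficiency = ideal_parallel_hours / reported_mpi_hours

print(f"implied unoptimised run: about {baseline_hours:.0f} hrs")
print(f"ideal 16-way run: {ideal_parallel_hours:.2f} hrs")
print(f"parallel efficiency: {efficiency:.1%}")
```

Under those assumptions the ideal 16-way time is 23.75 hrs, close to the reported 24 hrs.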
The GPU Dilemma
• 1. Re-write entire code into C for CUDA?
– C for CUDA doesn’t support sophisticated data types (classes)
– O(10^6) lines of code, dozens of developers
– Wait for CUDA to catch up (?)
• 2. Create C++ wrapper classes for certain methods
– First step, random number generator
– Incorporated into GEANT4 framework via inheritance
– Implementing Mersenne Twister algorithm (hack example from CUDA SDK) to generate cache of random numbers
– Improvement of only a few percent
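The cache idea can be illustrated on the CPU: a wrapper generates random numbers in batches and hands them out one at a time, so the caller sees an ordinary generator. This is a sketch, not the authors' wrapper class. Python's built-in `random.Random` (itself a Mersenne Twister) stands in for the GPU batch generator, and the class name and cache size are hypothetical. The method is called `flat` after the CLHEP engine method of the same name.

```python
import random


class CachedRandom:
    """Serves uniform random numbers from a cache that is refilled in batches."""

    def __init__(self, seed, cache_size=1024):
        self._source = random.Random(seed)  # stand-in for a GPU batch generator
        self._cache_size = cache_size
        self._cache = []
        self._pos = 0
        self.refills = 0

    def _refill(self):
        # In the GPU version, this is where a kernel would fill the whole cache.
        self._cache = [self._source.random() for _ in range(self._cache_size)]
        self._pos = 0
        self.refills += 1

    def flat(self):
        """Return the next uniform number in [0, 1)."""
        if self._pos >= len(self._cache):
            self._refill()
        value = self._cache[self._pos]
        self._pos += 1
        return value


cached = CachedRandom(seed=123, cache_size=256)
draws = [cached.flat() for _ in range(1000)]
print(f"{len(draws)} numbers served from {cached.refills} batch refills")
```

A small overall gain is what one would expect in general: speeding up one component can save at most that component's share of the total run time.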
Profiling!
• Great first step when optimising code
• Linux: gprof requires recompiling with profiling flags set (e.g. -pg with GCC)
• MacOSX
– Profiling tool doesn’t require recompile
Conclusions
• GEANT4 LINAC application has been developed
– Specific to Varian Clinac
– Many parameters hard-coded
– Work commenced on text-file based UI commands
– Preliminary validation promising
• Optimisation
– Phase space files
– Kill zones
– MPI for parallel processing on CPUs
– Porting random number generator to GPU
Future Directions
• Validation
– Verify dose distributions in heterogeneous phantoms
– Verify model of MLCs (irregular fields)
– Develop interface to Treatment Planning System
• Optimisation
– Re-write part of GEANT4 to run on GPU
• Interface
– User friendly text-file based commands
• Treatment Plan interface
– Implement DICOM-RT interface
Acknowledgements
• QUT
– Scott Crowe, Tanya Kairn, Andrew Fielding
• discussion on Varian LINAC model, Experimental data
– Mark Barry, Mark O Dwyer
• discussion on CPU optimisation, High Performance Computing
• Mater Hospital, Brisbane
– Radiation Oncology Group
• UoW
– Brad Oborn
• Millennium MLC model
• GEANT4 Collaboration
– Joseph Perl (SLAC)
• discussion on visualisation / profiling
• NVIDIA
– Mark Harris