Monte Carlo dose calculation algorithm on a distributed system
Stéphane Chauvie (a,b,*), Matteo Dominoni (c), Piergiorgio Marini (a), Michele Stasi (a), Maria Grazia Pia (b), Giuseppe Scielzo (a)
(a) Medical Physics Division, Ordine Mauriziano-IRCC, L.go Turati 62, I-10128 Torino, Italy
(b) INFN, Genoa and Turin, Italy
(c) DISCO, University of Milan-Bicocca, Milan, Italy
The main goal of modern radiotherapy techniques, such as 3D conformal radiotherapy and intensity-modulated radiotherapy, is to deliver a high dose to the target volume while sparing the surrounding healthy tissue. The accuracy of dose calculation in a treatment planning system is therefore a critical issue. Among the many algorithms developed over the last years, those based on Monte Carlo (MC) methods have proven to be very promising in terms of accuracy. The most severe obstacle to their application in clinical practice is the long computation time. We have studied a high-performance network of Personal Computers as a realistic alternative to high-cost dedicated parallel hardware, to be used routinely for the evaluation of treatment plans.
We set up a Beowulf cluster of 4 nodes connected by a low-cost network and installed the MC code Geant4 to describe our irradiation facility. The MC code, once parallelised, was run on the Beowulf cluster. The first run of the full simulation showed that the calculation time decreases linearly with an increasing number of distributed processes. This good scalability allows both statistically significant accuracy and good time performance.
The scalability of the Beowulf cluster offers a new instrument for dose calculation that could be applied in clinical practice. It would be a good support, particularly for highly challenging prescriptions that require good calculation accuracy in zones of high dose gradient and large inhomogeneities.
1. INTRODUCTION
The new radiological strategy in diagnosing and treating tumours requires a novel approach based on an advanced and integrated technological environment. Moreover, the use of analytical and numerical methods to merge information from different radiological examinations is ever more useful for defining and localising targets [1]. This framework offers the possibility of treating these volumes with high-energy X-ray beams from linear accelerators using advanced radiotherapy techniques such as 3-Dimensional Conformal Radiotherapy (3D-CRT) and Intensity Modulated Radiotherapy (IMRT) [2-7]. Such an improvement in the delineation of the target volume and organs at risk ought to be supported by greater precision in describing the physical interactions in the human body. In clinical practice, the dose distribution inside the patient is calculated by a Treatment Planning System (TPS). TPSs use as input the experimentally obtained dosimetric characteristics of the radiation machine, such as dose profiles and percent depth dose (PDD) curves. The patient anatomy is described in terms of electron density through the Computed Tomography (CT) data set.
Different algorithms [8-11] make it possible to calculate the dose analytically inside the irradiated regions. The oversimplifications introduced in these methods yield a fast dose engine that is, however, locally inaccurate [12-14]. The accuracy of dose calculation algorithms is now the main feature of a TPS. MC algorithms [15-17] describe with the same accuracy the beam transport in regions with and without electronic equilibrium, whereas conventional algorithms work well only when electronic equilibrium exists. MC is therefore particularly relevant for calculating the dose at the interface between different media, such as between bone and tissue, or tissue and air.

(*) Corresponding author: chauvie@to.infn.it
Geant4 [18,19] is a simulation toolkit designed for a variety of applications, including HEP, astrophysics and nuclear physics experiments, space science and medical physics. The Geant4 kernel handles the management of runs, events and tracks. Several characteristics of Geant4 are relevant to its use in the medical domain. A fast parameterisation framework is integrated with the full simulation, allowing an independent and simplified geometrical description of the set-up and direct production of hits. Thanks to the power and flexibility of the Geant4 geometry, beam sources and patients can be described in the same framework. Low Energy Electromagnetic [20] is a Geant4 package specialised in the low-energy electromagnetic interactions of leptons, photons and muons, as well as hadrons and ions. Low-energy processes assure good precision down to 250 eV for photons and electrons [21] and 1 keV for hadrons [22] and ions [23]. Geant4 offers a user support team and a quality assurance programme based on a rigorous software process and the use of standards; in addition, a worldwide community of users validates it independently. A major strength is the transparency of the physics implementation, which comes both from public-domain manuals and from evaluated data libraries, without hidden formulas. Moreover, thanks to its object-oriented design, it is extensible and easy to interface to other tools for visualisation and analysis.
The critical point for applying MC methods in clinical practice is the time needed for calculations. Until recently, MC calculations on a good workstation required hours or even a day to provide statistically significant results. In the last 15 years computer technology has enhanced the speed of calculation by nearly a factor of 100. This escalation of computational power has offered a solution to problems otherwise unaffordable: it is now possible to perform complex calculations on a desktop Personal Computer (PC), whereas in the past powerful mainframes or vector supercomputers were needed.
Moreover, since MC calculations are intrinsically parallel, their natural implementation is on parallel machines. The idea of using an array of ordinary PCs as an alternative to supercomputers was born in the 90s at the Goddard Space Flight Centre [24]. The idea was simple and, with modern tools, affordable. High-performance networks of Personal Computers (namely Beowulf clusters) are now a realistic alternative to high-cost dedicated parallel hardware.
In the next sections we describe our experience with MC calculations on a prototype Beowulf cluster, which offers parallel processing at a lower cost with competitive performance. In particular, we focus on the time performance and cost of the system, describing in some detail the overall construction process, so as to make this tool a routine system in the Medical Physics Division of a hospital.
2. MATERIALS AND METHODS
In March 2002 we set up a prototype Beowulf cluster consisting of a few nodes. We will go briefly through the hardware installation, the software configuration and the benchmarking of the cluster parallelisation. Then we describe the MC simulation set-up. Finally, we go through the MC parallelisation and the runs on the cluster, offering, at the end, some preliminary points for discussion.
                          Workstation        Master               Slaves
CPU                       Pentium III 450    AMD Athlon XP 1500+  AMD Athlon XP 1500+
Memory                    256 MB DIMM        512 MB DIMM          512 MB DIMM
Hard Disk                 Present            Present              Present
Video Card                Present            Present              Present
Ethernet Card             Present            Present              Present
CD-ROM                    Present            Present              -
Floppy Disk               Present            Present              -
Monitor, keyboard, mouse  Present            Present              -

Table 1: Hardware configuration for the benchmarking machines. The cluster is made of 1 master and 3 slaves.
2.1 Hardware set-up
Each off-the-shelf PC is a good compromise between computing power and cost (about 1 k€). The requirements for a slave are a CPU, memory and a fast Ethernet card. The master is more expensive since it is equipped to be the interface between the cluster and the outside world, and hence it is the only one with a monitor, a double Ethernet card, a keyboard, a floppy drive, a CD-ROM and so on. Tab. 1 shows the hardware configuration of the 3 kinds of PCs used in this work. The PCs have been networked on a private LAN through a switching hub, and the master is also connected to the hospital LAN in order to exchange data and images with TPSs, CTs, Magnetic Resonance (MR) scanners, etc.
We installed Linux Red Hat 7.2 [25] with the XFS journaled file system [26]. Linux provides a collection of libraries for parallelisation in a multiprocessor environment. We used the LAM/MPI parallel libraries [27], which allow a parallelisation daemon to be run on the cluster over secure shell (ssh) connections [28]. Message Passing Interface (MPI) is a complete interface for message-based interprocess communication. Local Area Multicomputer (LAM) is an MPI programming environment and development system for heterogeneous computers on a network. With LAM, a dedicated cluster or an existing network computing infrastructure can act as one parallel computer solving one problem. The cluster configuration was refined by reorganising the authorisation protocols on the nodes and by sharing partitions over the Network File System (NFS).
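As a configuration sketch, a LAM session of the kind described above might be started as follows. The host file path and node names are placeholders, not our cluster's actual configuration:

```shell
# Hypothetical LAM host file (one node name per line), e.g. /etc/lam/hostfile:
#   master
#   slave1
#   slave2
#   slave3

lamboot -v /etc/lam/hostfile   # start the LAM daemons on all nodes over ssh
mpirun -np 4 ./mc_simulation   # spread the parallelised MC run over 4 processes
lamhalt                        # shut the LAM session down
```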
After the hardware set-up we were ready to run simple C++ programs on the cluster, both to verify its correct behaviour and to prove its scalability. Fig. 1 shows the execution time versus the number of nodes. Recalling Amdahl's law, we can calculate the speed-up S_up:

    S_up = T1 / (Tm + T0) ≈ 3.99
Figure 1: Execution time [s] for a benchmarking C++ program with an increasing number of nodes of the cluster (Node 01, Nodes 01-02, Nodes 01-03, Nodes 01-04).
where T1 is the task execution time on a single processor, Tm is the execution time of the multiprocessing portion and T0 the execution time of the sequential and communication portions (overheads). This means that we gain about a factor of 4 in calculation time using the cluster. Normalising to the number of processors, we obtain the efficiency E = S_up/N ≈ 0.997, proving that the architecture offers good scalability. We then set out to verify whether the same scalability would be achievable for a more complex MC calculation.
2.2 Simulation
As regards geometry, we simulated a 600CD electron linear accelerator (Varian, Palo Alto, California) operated at a 6 MV accelerating potential. This machine is routinely used in our centre, in particular for complex radio-treatments such as IMRT and Total Body Irradiation (TBI). The accelerator is equipped with a Millennium Multi-Leaf Collimator (MLC) (Varian, Palo Alto, California). The MLC is composed of 120 thin round-ended leaves of different sizes. The 40 central leaves are small, 0.5 cm at the isocentre, and are able to produce a 20x20 cm2 field; the remaining leaves have a width of 1 cm, except for the 4 external ones, which are 1.4 cm wide. The scheme of the accelerator as described in the MC is given in Fig. 2. Electrons at about 6 MeV impinge on a metallic target where bremsstrahlung photons arise; these then interact with the flattening filter, surrounded by the primary collimator. Below the monitor chambers, which are used for on-line monitoring of the beam, are mounted the jaws, which can move to define the field size during the run. The last device is the MLC, in which every single leaf can be moved and positioned according to the MLC files defined for the machine.
The dose calculation can be performed, through software interfaces, on different media such as homogeneous and anthropomorphic phantoms or real patients. The CT data sets are used as input for the geometrical description of phantom and patient, using Hounsfield numbers to recover the electron density and hence reconstruct their material composition.
Figure 2. Accelerator’s head. Components are
described in the text.
As regards physics, in the Geant4 simulation we model electrons, positrons and photons with a set of processes including both elastic and inelastic interactions: multiple scattering, bremsstrahlung, ionisation, and the photoelectric, Compton and Rayleigh effects. We would highlight that Geant4 has no tracking cut but only production thresholds, and particles are followed down to zero range unless otherwise specified by the user: it is therefore not necessary to tune any parameter.
Special care has been devoted to software engineering, allowing the fast and easy addition of different tools for data analysis, graphical user interface (GUI) and input data management. Moreover, a general GUI has been created to manage the overall simulation: the compilation and linking of the C++ code, the run in the parallel environment and the analysis of the data.
Figure 3: Simulated versus experimental PDD curve for a 40x40 cm2 field in water at SSD = 100 cm.
Fig. 4 shows the dose profile for a 10x10 cm2 field at SSD = 100 cm. We examined different field sizes and set-ups to compare MC and experimental measurements.
We then parallelised our simulation using the LAM/MPI parallel libraries. Fig. 5 shows the number of events per second for the full simulation under several conditions for the hardware described in Tab. 1. The comparison of the first 2 points gives an indication of how computing power developed in about 1 year. One can then see how the number of events per second increases using 2, 3 or 4 parallel nodes. The efficiency E = 0.987 proves the scalability of the system.
3. RESULTS AND DISCUSSION
Once the simulation was set up, to prove the correctness of the implementation we reproduced PDD and dose profiles for different fields at different SSDs in a water phantom. Fig. 3 shows the PDD curve for a 40x40 cm2 field at a Source-Surface Distance (SSD) of 100 cm. We obtained a general agreement of the order of 1% between experiment and MC in the exponential tail, while a 2-4% deviation was found in the build-up region, where the experimental measurements become less accurate.
Figure 4: Simulated versus experimental normalised dose profile for a 10x10 cm2 field in water at SSD = 100 cm.
Figure 5: Number of events processed per second for the full simulation (WS, Node 01, Nodes 01-02, Nodes 01-03, Nodes 01-04).
With this work we were aiming at evaluating advanced dose calculation engines against both experimental measurements and numerical simulation with a Monte Carlo algorithm. While experimental measurements provided an estimate of the accuracy of dose calculation in homogeneous and inhomogeneous phantoms, Monte Carlo proved to be a valid alternative for dose calculations where ordinary algorithms fail. Monte Carlo calculations performed on patient CT data would allow an evaluation of the error associated with the commercial algorithms for any patient of interest.
Due to the high statistical precision necessary, and to the complexity of the geometry and of the physics involved, a long time is required for the simulation. This time is unacceptably high compared to traditional algorithms. To lower the calculation time, the Monte Carlo code has been parallelised and the simulation carried out on high-performance computers, presently obtainable at reasonable cost with a cluster of workstations. We have hence proved the scalability of the parallel implementation. More intensive runs on complex treatments and patient anatomies are in progress to understand whether it is possible to create a tool to be used in clinical practice instead of less precise and more expensive analytical algorithms.
ACKNOWLEDGEMENTS

This study was supported by a programme grant from the Italian Association for Cancer Research (AIRC).

REFERENCES

1. Gregoire V, Coche E, Cosnard G, Hamoir M, Reychler H. Radiother Oncol 56(2):135-50 (2000)
2. Brahme A. Radiother Oncol 12:129-140 (1988)
3. Bortfeld T, Bukelbach R, Boesecke R, Schlegel W. Phys Med Biol 35:1423-1434 (1990)
4. Bortfeld T, Kahler DL, Waldron TJ, Boyer AL. Int J Radiat Oncol Biol Phys 30:889-90 (1994)
5. Kallman P, Lind BK, Brahme A. Phys Med Biol 37:871-889 (1992)
6. Webb S. Phys Med Biol 34:1349-1370 (1989)
7. Spirou S, Chui CS. Med Phys 21:1031-1041 (1994)
8. Ahnesjo A, Andreo P, Brahme A. Acta Oncol 26:49-56 (1987)
9. Boyer A, Mock EC. Med Phys 12:169-77 (1985)
10. Mohan R, Chui C, Lidofsky L. Med Phys 13:64-73 (1986)
11. Mohan R, Chui CS. Med Phys 14:70-77 (1987)
12. Cygler J, Battista JJ, Scrimger JW, Mah E, Antolak J. Phys Med Biol 32:1073-1083 (1987)
13. Bielajew AF, Rogers DWO. Nucl Instrum Methods B 18:165-81 (1987)
14. Ma CM, Nahum A. Med Phys 20:1189-99 (1993)
15. Shortt K, Ross CK, Bielajew AF, Rogers DWO. Phys Med Biol 31:235-49 (1986)
16. Nahum AE. Monte Carlo Transport of Electrons and Photons. New York, Plenum, 523-47 (1988)
17. Andreo P. Phys Med Biol 36:861-920 (1991)
18. Agostinelli S et al. Submitted to Nucl Instrum Methods A. Preprints: CERN-IT-2002-003, KEK Preprint 2002-85, SLAC-PUB-9350 (2002)
19. cern.ch/geant4
20. www.ge.infn.it/geant4
21. Apostolakis J et al. Report INFN/AE-99/18 (1999)
22. Giani S et al. Report INFN/AE-99/20 (1999)
23. Pia MG et al. MC2000 Proceedings (2000)
24. beowulf.gsfc.nasa.gov
25. www.redhat.com
26. www.sgi.com/software/xfs
27. www.lam-mpi.org
28. www.ssh.com