TeraTomo project: a fully 3D GPU based reconstruction code for exploiting the imaging capability of the NanoPET™/CT system
M. Magdics1), L. Szirmay-Kalos1), Á. Szlavecz1), G. Hesz1), B. Benyó1), Á. Cserkaszky2), J. Lantos2), D. Légrády2), Sz. Czifrus2), A. Wirth3), B. Kári3), G. Patay4),
D. Völgyes4), T. Bükki4), P. Major4), G. Németh4), B. Domonkos4)
1) Department of Informatics, Budapest University of Technology and Economics, Hungary
2) Institute of Nuclear Techniques, Budapest University of Technology and Economics, Hungary
3) Department of Radiology, Semmelweis University of Budapest, Hungary
4) Mediso Ltd., Budapest, Hungary
The TeraTomo project is dedicated to the development of a fully 3D iterative reconstruction code for multi-modality
(PET/SPECT/CT) imaging. Recently, we have employed the EM/OSEM scheme for the reconstruction of PET images, focusing
on the on-the-fly calculation of the system matrix elements as precisely as possible, taking the following physical
effects into account: 3D geometry, detector response, positron range, attenuation, and scatter in the medium.
The reconstruction algorithms have been tailored to the massively parallel GPU platform (using CUDA technology),
which allows the code to run in parallel on multiple graphics cards [1]. The reconstruction algorithm employs
Monte Carlo (MC) techniques for sampling the lines of response (LORs) and voxels in the forward projection and
backprojection steps. To cope with the ill-posedness of the EM scheme, our implementation contains regularization
techniques such as Gaussian and median filtering, as well as Total Variation (TV) regularization, which significantly
increases the quality of reconstructions at a negligible additional cost. With these advancements, voxel arrays of
over 256³ resolution can be reconstructed in a few minutes.
Verification
At the current stage of TeraTomo development, we have implemented and carefully verified the
3D geometry based reconstruction engine, including detector response modeling.
[Figure: verification workflow. Labels: Mathematical phantoms; Compute expected detector response; TeraTomo tomography reconstruction; Source estimation; Compare]
System overview
The reconstruction algorithm is an iterative maximum likelihood estimation method (EM/OSEM), which alternately
executes photon transport simulation (forward projection) and source correction (backprojection). We implemented
two simulation approaches, both running on multiple GPUs: 1) Monte Carlo particle transport simulation (MC) [2]
and 2) adjoint Monte Carlo approximation (adjMC) [3].
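The alternation of forward projection and backprojection can be illustrated on a toy problem. The following is a minimal sketch of a plain EM iteration, assuming a small dense system matrix `A` stored in memory (TeraTomo computes the matrix elements on the fly instead) and no regularization or subsets; the function names are hypothetical and are not taken from the TeraTomo code.

```python
def forward_project(A, x):
    """Expected LOR values: y_tilde[L] = sum over voxels V of A[L][V] * x[V]."""
    return [sum(a * xv for a, xv in zip(row, x)) for row in A]


def back_project(A, lor_values):
    """Gather LOR values into voxels: result[V] = sum over LORs L of A[L][V] * lor_values[L]."""
    n_vox = len(A[0])
    return [sum(A[l][v] * lor_values[l] for l in range(len(A))) for v in range(n_vox)]


def mlem(A, y, n_iter=50):
    """Plain EM iteration: x <- x * backproject(y / forward(x)) / sensitivity."""
    sens = back_project(A, [1.0] * len(A))  # sensitivity of each voxel
    x = [1.0] * len(A[0])                   # uniform initial estimate
    for _ in range(n_iter):
        y_tilde = forward_project(A, x)
        ratios = [yl / yt if yt > 0 else 0.0 for yl, yt in zip(y, y_tilde)]
        corr = back_project(A, ratios)
        x = [xv * c / s if s > 0 else 0.0 for xv, c, s in zip(x, corr, sens)]
    return x


# Toy example: 3 LORs, 2 voxels, noise-free measurement of the activity [2, 5].
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
y = forward_project(A, [2.0, 5.0])
print(mlem(A, y))
```

With noise-free, consistent data and this well-conditioned toy matrix, the estimate approaches the original activity; real measurements are noisy, which is why the regularization steps matter.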
[Figure: iteration scheme. Labels: reconstructed voxels x_V^(n); filtering; forward projection; expected detector response ỹ_L; measured LORs y_L; backprojection + TV regularization; x_V^(n+1); phantom; GATE]
Monte Carlo particle transport (MC)
In both forward projection and backprojection, MC assigns particles to GPU computing threads. Each thread simulates
the particle transport by closely mimicking nature: every possible interaction is sampled from the corresponding
probability distribution for as long as the particle is inside the object (phantom). A particle initiated in a
voxel hits a detector only with a given probability; if it misses, the computing effort spent on it is lost.
To reduce this price paid for simulation accuracy, non-analog MC techniques are used, such as source direction
biasing, implicit capture, biased source sampling, and precomputed detector response. Free paths are sampled by
Woodcock tracking, so the simulation efficiency depends only loosely on the material constituents.
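As an illustration of the free path sampling mentioned above, here is a minimal sketch of Woodcock (delta) tracking along a single ray. It assumes a one-dimensional parametrization of the ray and a user-supplied cross section function; `woodcock_distance`, `sigma_at` and `sigma_max` are hypothetical names, and the GPU implementation in TeraTomo is not reproduced here.

```python
import math
import random


def woodcock_distance(sigma_at, sigma_max, rng, s_max):
    """Distance to the next real collision along a ray, or None on escape.

    sigma_at(s) is the total cross section at ray parameter s, and
    sigma_max is an upper bound of sigma_at over the whole object.
    """
    s = 0.0
    while True:
        # Tentative flight sampled with the constant majorant cross section.
        s += -math.log(1.0 - rng.random()) / sigma_max
        if s >= s_max:
            return None  # the particle left the object
        # Accept as a real collision with probability sigma(s) / sigma_max;
        # otherwise it is a virtual collision and the flight continues.
        if rng.random() * sigma_max < sigma_at(s):
            return s


rng = random.Random(1)
# Two-material example: dense slab between s = 2 and s = 4.
print(woodcock_distance(lambda s: 1.0 if 2.0 <= s <= 4.0 else 0.1, 1.0, rng, 10.0))
```

Because every tentative flight uses the same majorant, the loop never needs to find material boundaries, which is why the cost depends only loosely on the material layout.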
Each particle history contains only a few events from positron
annihilation to escape or energy cut-off, hence computing
threads hardly diverge. The overall simulation speed reaches up
to 5·10⁸ positrons/sec, roughly 2.5 orders of magnitude faster than
a general-purpose MC particle transport code. The advantages of
faithful physics simulation are expected in media with high
scattering components, at the price of slower convergence and a
more pronounced tendency for noise build-up.
Forward projection and backprojection employ the same MC engine
for calculating system matrix elements. In both reconstruction
phases, the optimal sampling density of voxels was found to be
proportional to the activity.
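Sampling voxels with a density proportional to the activity can be sketched with a cumulative distribution and a binary search. This is an illustrative sketch only: the flat activity list and the function names are assumptions, and a full estimator would also have to weight each sample by the inverse of its selection probability.

```python
import bisect
import itertools
import random


def build_cdf(activity):
    """Normalized cumulative distribution of the voxel activities."""
    total = sum(activity)
    return [c / total for c in itertools.accumulate(activity)]


def sample_voxel(cdf, rng):
    """Pick a voxel index with probability proportional to its activity."""
    return bisect.bisect_right(cdf, rng.random())


activity = [1.0, 0.0, 3.0]  # toy activity estimate of three voxels
cdf = build_cdf(activity)
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(40000):
    counts[sample_voxel(cdf, rng)] += 1
print(counts)  # the zero-activity voxel is never selected
```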
[Figure labels: 4D detector response image; source correction]
In order to validate our system, we used both simulated and measured data. Mathematical phantoms were
simulated with GATE [4]. Both simulated and measured list-mode data were binned into LOR files before
reconstruction. Then, taking the simulated detector hits, we reconstructed the source distribution with TeraTomo
and compared the result with the original phantom. For measured data, the normalization information was
passed to the TeraTomo reconstruction engine.
PET equipment: NanoPET™/CT
The NanoPET™/CT is an ultra-high-resolution, high-sensitivity pre-clinical PET/CT system built from the most advanced
commercially available components: an 18 cm diameter PET detector polygon with 12 detector modules, each
consisting of 81×39 LYSO crystals (1.12×1.12×13 mm³), tightly packed and coupled to two 256-channel PS-PMTs. The
imaging capability of this system can be fully exploited only by a fully 3D reconstruction algorithm that models the
detector response, positron range, gamma attenuation, and scatter effects.
Speed of reconstruction
When reconstructing the Derenzo phantom into 144×144×128 voxels with the NanoPET™/CT PET detector
geometry [5], which contains 180 million LORs, the speed of reconstruction is similar to that of FORE rebinning and
2D-EM running on multiple CPUs. The TeraTomo reconstruction times on a single Nvidia Fermi GPU card are shown in
the following table. When two GPU cards are enabled, the running time is halved.
Method      Forward projection    Backprojection
3D-MC       28 sec/iteration      28 sec/iteration
3D-adjMC    2 sec/iteration       16 sec/iteration
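The per-iteration times above translate into total reconstruction times as follows. This is a back-of-the-envelope sketch under stated assumptions: forward projection and backprojection times simply add up within an iteration, 50 iterations are run (the count used for the Derenzo example on this poster), and the halving reported for two GPU cards is applied as a division by the number of cards.

```python
SECONDS_PER_ITERATION = {
    "3D-MC": {"forward": 28.0, "back": 28.0},
    "3D-adjMC": {"forward": 2.0, "back": 16.0},
}


def total_minutes(method, n_iterations, n_gpus=1):
    """Estimated wall-clock time in minutes (assumes ideal scaling with GPU count)."""
    per_iteration = sum(SECONDS_PER_ITERATION[method].values())
    return per_iteration * n_iterations / n_gpus / 60.0


for method in SECONDS_PER_ITERATION:
    print(method, total_minutes(method, 50), total_minutes(method, 50, n_gpus=2))
```

Under these assumptions, 50 adjMC iterations take 15 minutes on one card and 7.5 minutes on two.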
Effect of Total Variation regularization
TeraTomo reconstruction of the rotated Derenzo phantom with rod sizes of 1.0, 1.1, 1.2, 1.3, 1.4, and 1.5 mm. The GATE
simulation in the NanoPET™/CT PET detector geometry was reconstructed into a 144×144×128 voxel volume. The central
sagittal slice of the reconstructed volume after 50 iterations is depicted below.
Adjoint Monte Carlo approximation (adjMC)
The adjMC method employs approximate adjoint transfer in forward projection and a geometric backprojection,
assigning LORs to threads in forward projection and voxels to threads in backprojection. In order to increase the
accuracy of integral quadratures, we use quasi-Monte Carlo techniques combined with stochastic iteration and filtered
sampling.
Forward projection of the adjMC method
[Figure: forward projection pipeline. Labels: voxels; direct contribution: geometry, positron range, detector model; indirect contribution: scatter + attenuation + detector model; combination + stochastic iteration; expected LORs]
Noise tolerance
Regularization methods ensure correct reconstruction even for low-dose measurements,
when the average number of hits per LOR is very low.
The left image depicts a Derenzo phantom that was
reconstructed from 4 hits per LOR on average; the
right image shows the reconstruction from only 0.2
hits per LOR on average.
The forward projector of the adjoint method examines LORs one-by-one and computes the expected number of hits
due to annihilations in voxels intersected by the line samples of this LOR (direct contribution) and all voxels
(scattered contribution). Scattering in the object and scattering in the detectors are handled independently via the
interface of sample points generated on the surface of the detector crystal. The high-dimensional integral of a LOR is
estimated by quasi-Monte Carlo quadrature including Poisson-disk sampling. In order to reduce the variance of this
estimator, we employ both spatial and temporal filtering. Temporal filtering is called stochastic iteration, which
averages the estimated LOR values with the results of previous steps. Spatial filtering may include either Gaussian or
median filtering before the execution of the forward projector.
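The temporal filtering step can be sketched as a running average of the noisy LOR estimates. This is a simplified illustration that assumes uniform weights (each step contributes 1/n) and a fixed underlying signal; in the actual reconstruction the voxel estimate changes between steps, and the weighting used by TeraTomo is not specified here. The function name is hypothetical.

```python
import random


def stochastic_iteration_step(running_mean, steps_done, new_estimate):
    """Blend a new noisy LOR estimate into the average of the previous steps."""
    n = steps_done + 1
    w = 1.0 / n  # weight of the newest estimate
    updated = [(1.0 - w) * m + w * e for m, e in zip(running_mean, new_estimate)]
    return updated, n


# Toy example: two LORs whose noisy estimates fluctuate around 10 and 20.
rng = random.Random(0)
mean, steps = [0.0, 0.0], 0
for _ in range(2000):
    noisy = [10.0 + rng.gauss(0.0, 2.0), 20.0 + rng.gauss(0.0, 2.0)]
    mean, steps = stochastic_iteration_step(mean, steps, noisy)
print(mean)
```

Averaging lets each individual forward projection use few samples, since the variance is reduced over the course of the iteration rather than within a single step.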
Scatter compensation of adjMC method
Results of incorporating detector response modeling

3D reconstruction of the GATE simulated Derenzo phantom. Modeling the 3D geometry and the detector response
in the reconstruction significantly improves the image quality.

[Figure: steps of the scatter estimation. 1. Scattering points; 2. Ray marching from detector to scattering points; 3. Combination of paths]
The sampling process estimating the scattered contribution has three steps. First, scattering points are sampled with
a probability density that is proportional to the scattering cross section of the material. Then, the total annihilation
and out-scattering are computed between these sample points and the detector crystals. In the final step, we
combine these results and compute the direct component. As the number of crystal pairs is much larger
than the number of crystals and scattering points, the scattering computation adds only a small overhead to
that of the direct component.
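The overhead argument can be checked with the detector figures given on this poster. The sketch below assumes that one ray march is needed per crystal and scattering point pair, and it uses a hypothetical count of 128 scattering points, which is not stated in the source; the cheap per-LOR combination step is not counted.

```python
N_MODULES = 12
CRYSTALS_PER_MODULE = 81 * 39
N_LORS = 180_000_000    # LOR count of the NanoPET/CT geometry
N_SCATTER_POINTS = 128  # hypothetical value, chosen only for illustration


def scatter_ray_marches(n_crystals, n_scatter_points):
    """Ray marches needed if every crystal is connected to every scattering point."""
    return n_crystals * n_scatter_points


n_crystals = N_MODULES * CRYSTALS_PER_MODULE
n_marches = scatter_ray_marches(n_crystals, N_SCATTER_POINTS)
print(n_crystals, n_marches, n_marches / N_LORS)
```

Under these assumptions, the scatter step marches fewer than 3% as many line segments as there are LORs.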
[Figure panels: 2D reconstruction (SSRB + OSEM); 3D adjMC reconstruction without detector response compensation; 3D adjMC reconstruction with detector response compensation]
Results of measured data reconstruction
Detector response modeling
Photons may get scattered in detector crystals before they are finally absorbed. Unlike the measured object, which is
different in each measurement, the detectors are fixed, so the probabilities of photon paths between detector
crystals can be pre-computed and included in the estimator. During the pre-computation, we consider a single input
crystal, simulate incident photons arriving from a direction given by its inclination and azimuthal angles, and
compute the probabilities that such a photon is absorbed in another crystal.
These probabilities can be visualized as a two-dimensional image, where arrival probabilities are depicted by gray
levels. These images are too large to be sampled efficiently. Therefore, during the pre-computation, we pre-generate
quasi-Monte Carlo sample sets that contain just a few samples, but whose cumulative distribution is as close to the
simulated distribution as possible.
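One simple way to obtain a small sample set whose cumulative distribution stays close to a given discrete distribution is stratified inversion of the CDF. The sketch below illustrates that idea only; the poster does not state how the TeraTomo sample sets are generated, so this technique and the function name are assumptions.

```python
import bisect
import itertools


def low_discrepancy_samples(probabilities, n_samples):
    """Select n_samples crystal indices by inverting the CDF at evenly spaced points."""
    total = sum(probabilities)
    cdf = [c / total for c in itertools.accumulate(probabilities)]
    samples = []
    for i in range(n_samples):
        u = (i + 0.5) / n_samples  # midpoint of the i-th stratum of [0, 1)
        samples.append(bisect.bisect_right(cdf, u))
    return samples


# Toy absorption probabilities of four crystals for one incident direction.
probabilities = [0.5, 0.25, 0.125, 0.125]
print(low_discrepancy_samples(probabilities, 8))  # [0, 0, 0, 0, 1, 1, 2, 3]
```

With eight samples, the selection counts (4, 2, 1, 1) reproduce the toy probabilities exactly; in general the empirical distribution of such a set deviates from the target by at most one sample per crystal.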
Backprojection of the adjMC method
[Figure: backprojection geometry. Labels: Detector 1; Detector 2; voxel; z1; v; z2]
The backprojector of the adjMC method has been developed with the
objective of efficient GPU execution. Thus, it is also of the gathering
type: each thread computes the update of one voxel from all LORs
intersecting it. To select these LORs, a point is sampled in the voxel,
then the detector module is centrally projected onto its coincidence
pair through the voxel sample.
The backprojector is also responsible for TV regularization: it
reads the neighboring voxel values of the previous step, computes the
gradient, and includes it in the iteration formula.
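How a TV gradient can enter an EM update is sketched below in one dimension. The sketch assumes a smoothed TV term and a one-step-late update, in which the gradient evaluated on the previous estimate is added to the sensitivity in the denominator; the poster does not spell out the iteration formula, so this form, the parameter values, and the function names are assumptions.

```python
import math


def tv_gradient_1d(x, eps=1e-3):
    """Gradient of the smoothed total variation sum_i sqrt((x[i+1] - x[i])^2 + eps^2)."""
    grad = [0.0] * len(x)
    for i in range(len(x) - 1):
        d = x[i + 1] - x[i]
        g = d / math.sqrt(d * d + eps * eps)
        grad[i] -= g
        grad[i + 1] += g
    return grad


def osl_update(x, backprojected_ratio, sensitivity, tv_weight):
    """One-step-late EM update: x * ratio / (sensitivity + tv_weight * dTV/dx)."""
    grad = tv_gradient_1d(x)
    new_x = []
    for xv, c, s, g in zip(x, backprojected_ratio, sensitivity, grad):
        denom = max(s + tv_weight * g, 1e-12)  # guard against a non-positive denominator
        new_x.append(xv * c / denom)
    return new_x


# A local peak is pulled down and its neighbors are pulled up.
print(osl_update([1.0, 2.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], tv_weight=0.1))
```

Because each voxel only reads its own neighbors from the previous step, the regularization fits the gathering-type backprojection without extra synchronization.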
3D EM reconstruction of the F-18 mouse bone PET study, acquired with the NanoPET™/CT. No scatter, attenuation,
or detector model compensation was applied.
Future work
In the near future, we are going to verify the scatter and attenuation modeling. Then we are
planning to implement random coincidence modeling as well as dead time correction in order to achieve a 3D
quantitative reconstruction tool.
References
[1] DOMONKOS, B., AND JAKAB, B. A Programming Model for GPU-based Parallel Computing with Scalability and
Abstraction. In Spring Conference on Computer Graphics (2009), pp. 115–122.
[2] WIRTH, A., CSERKASZKY, A., KÁRI, B., LÉGRÁDY, D., FEHÉR, S., CZIFRUS, S., AND DOMONKOS, B. Implementation of
3D Monte Carlo PET Reconstruction Algorithm on GPU. In IEEE Medical Imaging Conference (2009).
[3] SZIRMAY-KALOS, L., TÓTH, B., MAGDICS, M., LÉGRÁDY, D., AND PENZOV, A. Gamma Photon Transport on the GPU
for PET. Lecture Notes in Computer Science 5910 (2010), pp. 433–440.
[4] JAN, S., ET AL. GATE: a simulation toolkit for PET and SPECT. Phys. Med. Biol. 49 (2004), pp. 4543–4561.
See also http://opengatecollaboration.healthgrid.org/
[5] NanoPET™/CT device information: http://www.bioscan.com/molecular-imaging/nanopet-ct