Radiotherapy and Oncology 60 (2001) 203–213
www.elsevier.com/locate/radonline

Application of a test package in an intercomparison of the photon dose calculation performance of treatment planning systems used in a clinical setting

Jack Venselaar a,*, Hans Welleweerd b

a Department of Radiotherapy, Dr B. Verbeeten Institute, P.O. Box 90120, 5000 LA Tilburg, The Netherlands
b Department of Radiotherapy, University Medical Center, P.O. Box 8500, 3508 GA Utrecht, The Netherlands

Received 26 May 2000; received in revised form 12 December 2000; accepted 9 January 2001

Abstract

Background and purpose: Testing the performance of treatment planning systems by using the AAPM Task Group 23 test package is a useful approach, but has its limitations. To be able to include technical developments, such as the asymmetric collimator, it was decided to remeasure the AAPM data set on more modern radiotherapy equipment, to extend the test geometries, and to evaluate the use of the new package.

Materials and methods: A coherent set of beam data of 6, 10 and 18 MV photon beams was measured on two modern linear accelerators. These data served as input data in seven commercially available treatment planning systems, which were clinically in use in different radiotherapy departments. Next, a test package was measured which included a missing tissue geometry and fields with asymmetrical collimator setting, with and without a wedge.

Results: The absolute dose predicted by the different treatment planning systems into which the measured beam data had been entered was compared, for all test points, with the results of direct measurements. The criteria of acceptability were exceeded by some systems in cases of irregular field geometry and missing tissue geometry. The majority of the systems had difficulties with accurate dose calculation for asymmetrically wedged fields.
Conclusions: The application of the new test package did not introduce insuperable difficulties and was highly appreciated by the participating centres. Most systems performed reasonably well for the majority of the beam geometries, with the exception of asymmetrically wedged beams. The extended test package is available for other users or user groups for the purpose of commissioning new treatment planning systems, or new releases of existing systems. © 2001 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Treatment planning system; Quality assurance; Dose calculation

1. Introduction

Commissioning of the dose calculation algorithms of a treatment planning system is generally performed (i) by entering basic beam data into the system according to the methods and requirements described in the user's manual of the system; and (ii) by comparing the results of dose calculations with the entered data and with data that were measured specifically for this purpose. Most commonly, existing beam data are used as input data. Differences between calculated and actual dose values may be encountered, partly due to uncertainties in the measured data, and partly due to imperfect beam modelling. Criteria for acceptability have to be applied before accepting a treatment planning system for clinical use. Several authors have developed such criteria [4,6,27,29]. These criteria for acceptability can, in a first approach, be based on an analysis of clinical dose–response curves. Mijnheer et al. [17] stated that for dose delivery in the patient, one should strive for an overall accuracy of ±3.5% (1 SD) in the value of the dose delivered to the ICRU reference point [15]. The evaluation of dose–response curves requires an accurate knowledge of the dose, for tumour control as well as for normal tissue damage. Treatment planning is one of the main steps in radiotherapy.
These steps include: calibration of the dosimeter, determination of the absorbed dose under reference conditions, phantom measurements under non-reference conditions, calculation of dose distributions in the patient, and finally, treatment delivery. Out of the total uncertainty budget, only a relatively small margin remains for the accuracy of the treatment planning part of the total procedure.

* Corresponding author.
0167-8140/01/$ - see front matter © 2001 Elsevier Science Ireland Ltd. All rights reserved. PII: S0167-8140(01)00304-8

On the other hand, the criteria may also be based on a judgement of the actual performance of state-of-the-art treatment planning systems. These criteria should then be based on an analysis of the applied algorithms and on studies in which the results of calculations are compared with measured data [4,12,16,27]. The final criteria should reflect both what is achievable in clinical practice with up-to-date equipment, and the radiobiological requirements for accuracy [14]. This paper discusses the accuracy achievable with commercially available treatment planning systems employed in a clinical environment. The optimal evaluation of such systems makes use of one set of data, entered into the different systems in order to avoid the influence of uncertainties which result from different sets of measurements performed at different sites and obtained at various times [21,24]. However, such an evaluation procedure is very time consuming, because it requires the development of a set of basic data in such a format that these can be entered into the various treatment planning systems. Furthermore, these systems need to be made available for study purposes, which means that a close collaboration is needed between several radiotherapy departments and/or the manufacturers of treatment planning systems.
For the individual departments, the burden is high, because the work needs to be done in addition to the work of testing the local beam data of the clinic. However, not all centres have to participate, as the results of a specific system are, in general, applicable to all systems with the same version from the same manufacturer, provided that individual user influences can be excluded. So, test results and experiences may be shared by others in the user group. Finally, new technical developments of treatment machines, such as asymmetric collimators, and new computer facilities with, e.g. more sophisticated algorithms for inhomogeneity corrections and 3D calculations, necessitate this type of test to be repeated at certain intervals, and eventually, to be extended to encompass the new tools. An important attempt to produce a standard set of beam data for testing treatment planning systems was made by AAPM Task Group 23, who developed a set of beam data from a 4 MV and an 18 MV photon beam [2]. This set includes a number of test cases which can be used for comparison purposes. However, a few drawbacks are associated with the test package presented by Task Group 23. The first drawback is that, in clinical practice, photon beam qualities in the range from 6 to 15 MV are used most frequently, which are energies just in between the energies provided by the task group. Another point is that the high energy photon beam was measured on a linear accelerator of a type (Therac-20, AECL, Kanata, Canada) which is obsolete. Furthermore, the AAPM data set is static in the way it is proposed. It is not presented as an open data set to which new devices can be added, such as asymmetrical collimator settings and multi-leaf collimators (MLCs), which are now commonly used; neither can the specific demands of a given treatment planning system be added to the data set. As a consequence, the test package risks being outdated in a relatively short period of time.
In the present study, we have tried to overcome these drawbacks for at least a certain period of time. Photon beam data were measured on two modern linear accelerators installed in the radiotherapy department of the University Medical Center, Utrecht, the Netherlands. These machines (Elekta SL 15 and SL 20, EOS, Crawley, UK) operated with 6, 10 and 18 MV nominal photon beam quality and were both equipped with asymmetrical collimator jaws. One of the treatment units was equipped with an MLC. This means that it was possible to adapt the data set from these machines to the latest technical developments and to incorporate newer tests related to these developments in the set of test data. Furthermore, the specific demands of basic beam data for entering into a system could be met by performing additional measurements. The purpose of this paper was to describe the present status of this new data set, measured at the Elekta SL machines. The set of test configurations was kept, as much as possible, similar, or even identical, to the Task Group 23 set of test configurations, but was expanded with tests for a 'missing tissue' geometry and for asymmetrical collimator settings, including asymmetrically wedged beams. We refer to this test package as the NCS set, as it will be part of a forthcoming report on Quality Assurance of Treatment Planning Systems of the NCS (the Netherlands Commission on Radiation Dosimetry). Furthermore, we present the results of an intercomparison that was performed using the NCS test package for seven different treatment planning systems, some of which are commercially available as '3D systems'. These systems, all in use in a clinical setting, are considered representative of the currently available treatment planning systems. The results are discussed with respect to the criteria for acceptability and quality assurance of treatment planning systems, which have been suggested elsewhere [29].
The feasibility of these criteria has been investigated by applying them to the results of the intercomparison. It is emphasized here that it was not the aim of this study to judge the performance of each individual system, but to investigate: (i) the methodology of using such a test package; and (ii) the accuracy which can be obtained in general with modern treatment planning systems in a clinical setting. All participants agreed to publish the results of the tests for their specific centre including the name of their system.

2. Methods and materials

2.1. Measurement techniques

Beam data were obtained at an Elekta SL 15 linear accelerator for 6 and 10 MV photon beams with quality indexes (QI) of 0.676 and 0.734, respectively, and at an Elekta SL 20 for an 18 MV photon beam with a QI equal to 0.770. Linear accelerators of this type were the most widespread treatment machines in the Netherlands at the time of this investigation. The percentage depth–dose (PDD) and profile measurements were performed in a WMS water phantom (Nucletron, Veenendaal, the Netherlands), and by point-by-point measurements with a 0.1 cm³ ionization chamber of type RK01 (Scanditronix, Uppsala, Sweden). The data were obtained in several different measurement sessions. At the start of each new measurement session, the water phantom was accurately positioned in the treatment room and left there for at least 2 h in order to reach temperature stability. The profile at 5 cm depth in a field of 40 × 40 cm² and the PDD in a 10 × 10 cm² field were measured and compared with a reference profile and a reference PDD curve obtained at the time of acceptance of the accelerator. The profiles of the beams of these linear accelerators were adjusted for optimal flatness at a depth of 5 cm at the time of installation of the machine.
The correspondence between the profiles and PDD values was confirmed to be better than 1% of the local dose before accepting the beam regulation of the accelerator for further measurements. Furthermore, a dose measurement at the reference point at 10 cm depth was performed at the beginning and at the end of each session. The value agreed with the reference value within 1%. This measurement was repeated and recorded at the end of each measurement session. Point doses in the test cases were obtained by registration of the integrated signal of an irradiation with 100 monitor units (MU). The measured data were evaluated with great care in order to construct a data set with good internal consistency. Several open beam situations were remeasured during each session, and the results of these situations were compared. Some smoothing of the test data was applied and/or data were averaged, e.g. by using the average of the dose point values over 2 mm of the profile. Occasionally, differences in beam quality were present and the results of the measurements of such a session were corrected for these differences by making use of the reference measurements. As a result, the basic beam data had an estimated overall internal consistency of better than ±1% (1 SD; maximum error, 1.5%) of the local dose at any point within the beam.

2.2. The basic beam data set

The following data were collected for the purpose of modelling the 6, 10 and 18 MV photon beams in any of the treatment planning systems used in the study. The reference conditions for the determination of output factors and the measurement of profiles and depth–dose data were: a field size of 10 × 10 cm², a depth of 5 cm, and a source–surface distance (SSD) of 100 cm.

2.2.1. Depth–dose data

Open beam central axis depth–dose data for square field sizes of 3, 5, 7, 10, 15, 20, 30 and 40 (cm × cm), for depths from 0 up to 30 cm.

2.2.2. Profiles

Five open beam depth profiles for each of these square fields at depths of, e.g. dmax, 5, 10, 20 and 30 cm. The profiles were measured over a maximum distance of 49 cm.

2.2.3. Wedged field data

Depth–dose data from 60° wedged fields and wedge profiles for square field sizes of 5, 7, 10, 15, 20 and 30 (cm × cm) at five depths of, e.g. dmax, 5, 10, 20 and 30 cm.

2.2.4. Output factors

Output factors, Scp versus square field size, separated into head scatter correction factors, Sc, and phantom scatter correction factors, Sp [28]. These output factors were measured using a full-scatter phantom for the determination of Scp, and a mini-phantom of 3.0 cm diameter polymethylmethacrylate (PMMA) with the ionization chamber in an upright position for Sc measurements. The Sp is then obtained from the ratio of Scp and Sc for a given field size [28]. The measurement depth was 5 cm for the 6 and 10 MV beams, and 10 cm for the 18 MV beam. For those users of the data set for whom the treatment planning system required output factors at dmax, Scp was recalculated to the output factor defined at this depth using the ratios of PDD values.

2.2.5. Wedge factors, tray factor, block transmission

Wedge transmission factors were measured at 5 cm depth as a function of square field size. These wedge factors were recalculated to the depth of dmax, again using the PDD ratios. Tray transmission factors for the reference field were determined in the same way. Transmission factors from wide field measurement were included. The dimensions of blocks were the dimensions projected to the source-axis distance.

In principle, all data were made available for evaluation to each of the investigators participating in the intercomparison of this study. Occasionally, some additional data had to be measured to meet the requirements of a specific treatment planning system. The data were originally documented in print, but are currently also available in digital form on CD-ROM. Further developments and test plans may lead to an extension of the data set.
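The output factor relations of Section 2.2.4 can be illustrated with a minimal Python sketch. The function names and all numerical values are hypothetical and serve only as an illustration. The renormalisation to dmax assumes a fixed SSD and PDD values normalised to 100% at dmax; the text states only that ratios of PDD values were used, so the exact form of the ratio is an assumption.

```python
def phantom_scatter(scp: float, sc: float) -> float:
    """Phantom scatter factor Sp as the ratio Scp / Sc for one field size."""
    if sc <= 0:
        raise ValueError("Sc must be positive")
    return scp / sc


def scp_at_dmax(scp_at_depth: float, pdd_field: float, pdd_ref_field: float) -> float:
    """Renormalise an output factor measured at the reference depth to dmax.

    Assumption: both PDD values (in %) are taken at the measurement depth,
    at the same SSD, and are normalised to 100% at dmax, so that
    Scp(dmax) = Scp(depth) * PDD_ref_field(depth) / PDD_field(depth).
    """
    if pdd_field <= 0 or pdd_ref_field <= 0:
        raise ValueError("PDD values must be positive")
    return scp_at_depth * pdd_ref_field / pdd_field


if __name__ == "__main__":
    # Hypothetical numbers for a 20 x 20 field relative to the 10 x 10 reference field.
    sp = phantom_scatter(scp=1.050, sc=1.020)
    scp_dmax = scp_at_dmax(scp_at_depth=1.050, pdd_field=88.0, pdd_ref_field=86.0)
    print(f"Sp = {sp:.4f}, Scp at dmax = {scp_dmax:.4f}")
```

For the reference field itself both PDD values coincide, so the output factor remains 1 at either depth, which is a quick sanity check on the direction of the ratio.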
An example is the set of diagonal profiles, which, in the meantime, has been added to the data set in order to comply with other treatment planning systems.

2.3. The test package data

The following data were measured for the purpose of the test cases. In the development of the NCS test geometries, the list of tests as proposed by the AAPM Task Group 23 [2] was followed as closely as possible. For numbering of the tests, the original AAPM Task Group 23 nomenclature has been used, to which other tests were added. Table 1 presents an overview of the proposed tests and the corresponding test numbers in the NCS set and in the AAPM Task Group 23 set. The test situations are shown graphically in Fig. 1 for tests 10–12 only, which form the present extension of the Task Group 23 test package. A more detailed description of the other test cases, including the deviations from the Task Group 23 tests, can be found in Appendix A [1,2]. The data in the test package were expressed in dose (in cGy), for an irradiation with 100 MU. So, the dose values included the effects of field size, tray, wedge, inhomogeneities, etc. on the output. Unless specified otherwise, the SSD is 100 cm, and points at depths 1, 3, 5, 10, 15, 20, 25 and 30 cm were measured, on the central axis, off-axis and just outside the penumbra region.

Table 1
Correspondence of the NCS test set and the AAPM Task Group 23 test set a

NCS | Short description of the test (dimensions in cm) | AAPM TG 23
1a | Square field, 5 × 5 | 1
1b | Square field, 10 × 10 | 1
1c | Square field, 25 × 25 | 1
2a | Rectangular field, 5 × 25 | 2
2b | Rectangular field, 25 × 5 | 2
3 | Square field, 10 × 10, SSD 85 | 3
4 | Square field, 9 × 9, wedge | 4
5 | Square field, 16 × 16, central block | 5
6 | Square field, 10 × 10, off-axis | 6
7 | Square field, 16 × 16, blocked to L-shaped field (irregular) | 7
8a | Square field, 6 × 6, lung inhomogeneity | 8
8b | Square field, 16 × 16, lung inhomogeneity | 8
8c | Square field, 16 × 16, bone inhomogeneity | 8
9 | Square field, 10 × 10, oblique incidence | 9
10a | Square field, 10 × 10, half phantom ('missing tissue') | –
10b | Square field, 20 × 20, half phantom ('missing tissue') | –
11 | Asymmetrical field, 15 × 15; geometric radiation field centre at: 7.5,0; 0,7.5; 7.5,7.5 | –
12 | Asymmetrically wedged field, 15 × 15; geometric radiation field centre at: ±7.5,0; 0,7.5; ±7.5,7.5 | –

a Tests 10–12 were not included in the original set.

Fig. 1. Graphical representation of the test geometries 10–12. The dots in the side views of the beams represent the points at which the dose was measured. In test 10, the collimator rotation axis of the treatment unit was positioned at the edge of the phantom. In the beam's eye view of tests 11 and 12, the cross indicates the position of the collimator rotation axis, while the dots represent the position of the geometric radiation field centre of the asymmetrical 15 × 15 cm² field. Three asymmetrical positions of the field centre were included in test 11 and five positions were used for the asymmetrically wedged field of test 12. The tests 10–12 form the extension of the AAPM Task Group 23 test package, and these have now been included in the NCS set.

2.4. Treatment planning systems

The basic beam data of the 6, 10 and 18 MV photon beams were entered into the treatment planning systems listed in Table 2. For this procedure, beam data were sent to the physicist who was locally responsible for clinical acceptance of the treatment planning system. This local physicist supervised and controlled the beam data entry into the system. Then, the test situations described above were entered into the system and the results were listed in tables and in a spreadsheet program, again by or under supervision of the responsible physicist. Comparison of computed with measured dose values was done on the basis of the output of the treatment planning system. If the system did not provide the result of the calculation in absolute dose values, but only in relative dose distributions, the output was converted into dose values using the locally applied MU calculation programme. This was done by taking into account the relevant correction factors: the head scatter factor, the phantom scatter factor, and whenever applicable, the tray and/or wedge factor. If these factors were a function of field size, linear interpolation was applied between the nearest measured data points. These correction factors were part of the data set. The results were reported back to the responsible physicists. In a number of cases (see below), recalculation of a part of the test was accepted by the investigators, e.g. due to inappropriate initial beam modelling or misinterpretations of the tests.

Table 2
List of treatment planning systems participating in the intercomparison

System | Version | Manufacturer | Test centre
LPS | 4C | LPS | Eindhoven, Catharina Hospital
TheraplanPlus | 3.0 | Theratronics | Tilburg, Dr B. Verbeeten Institute
Plato RTS | 2.01 | Nucletron | Utrecht, University Medical Center
UM-Plan | 339 | University of Michigan | Amsterdam, The Netherlands Cancer Institute
CadPlan | 2.7.7 | Varian | Nijmegen, University Hospital
Pinnacle³ | 4.0e | ADAC | Antwerpen, AZ Middelheim
Renderplan | 3.5 | Elekta | Arnhem, ARTI

2.5. The analysis of the results

The deviations between the calculated dose, $D_{\mathrm{calc}}$, and the measured dose, $D_{\mathrm{meas}}$, reported in this paper have been defined as percentage deviations of the local dose, i.e. the dose measured at a specific depth, according to $(D_{\mathrm{calc}} - D_{\mathrm{meas}}) \times 100\% / D_{\mathrm{meas}}$.
In those cases where the points were outside the penumbra or under a block, the results of the comparison were expressed relative to the dose measured at the same depth, but on the central axis of the open beam, $D_{\mathrm{meas,cax}}$, according to $(D_{\mathrm{calc}} - D_{\mathrm{meas}}) \times 100\% / D_{\mathrm{meas,cax}}$ [29]. To avoid the necessity to present all data in the form of a large number of histograms, the concept of the confidence limit, Δ, has been used in addition to the graphical representations [29,30]. The confidence limit is based on the calculation of the average deviation between calculated and measured dose values for a group of data points in comparable situations, and the standard deviation (1 SD of the average) of the differences. The confidence limit has been defined as follows in Eq. (1):

$$\Delta = |\text{average deviation}| + 1.5 \times \mathrm{SD} \qquad (1)$$

For each specific test situation, the confidence limit was used to judge the performance of a set of calculations. The tolerance for the confidence limit could be exceeded because the average deviation of all points is too large, but also in cases where a few data points showed extreme deviations and therefore increased the SD. The factor 1.5 in this expression is a weighting factor which was shown in this study to be practical for this type of test. If a factor of >1.5 was chosen, then the effect of random errors would have been emphasized, while a factor <1.5 would increase the relative importance of systematic deviations. For many test situations with open beams, the confidence limit should not exceed a tolerance of 3%, but in more complex cases, a larger value of the tolerance was justifiable (see Table 3) [29]. The results of the comparison between the measured and calculated dose values at all points of the test set were made available to the participants of the study.
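The two deviation measures and the confidence limit of Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function names and example values are hypothetical, and using the sample standard deviation (n − 1 in the denominator) for the SD is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def local_deviation(d_calc: float, d_meas: float) -> float:
    """Deviation in % of the local measured dose."""
    return (d_calc - d_meas) * 100.0 / d_meas


def out_of_field_deviation(d_calc: float, d_meas: float, d_meas_cax: float) -> float:
    """Deviation in % of the open-beam central axis dose at the same depth.

    Intended for points outside the penumbra or under a block.
    """
    return (d_calc - d_meas) * 100.0 / d_meas_cax


def confidence_limit(deviations: list[float], factor: float = 1.5) -> float:
    """Eq. (1): |average deviation| + factor * SD.

    Assumption: SD is the sample standard deviation of the deviations.
    """
    return abs(mean(deviations)) + factor * stdev(deviations)


if __name__ == "__main__":
    # Hypothetical (calculated, measured) dose pairs in cGy per 100 MU.
    pairs = [(101.0, 100.0), (89.1, 90.0), (81.6, 80.0), (70.0, 70.0)]
    deviations = [local_deviation(c, m) for c, m in pairs]
    print(f"confidence limit = {confidence_limit(deviations):.2f}%")
```

Because the SD enters with a weight of 1.5, a handful of outlying points can push the confidence limit over the tolerance even when the average deviation is small, which matches the behaviour described above.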
In the presentation of the results in this paper, all relevant data points up to 25 cm depth were included, except for the points lying in the build-up area (at 1 cm depth) and the points outside the beam borders. It is generally acknowledged that the accuracy of calculations in areas with a very high dose gradient (>3%/mm) can be better expressed as a shift of isodose lines [27,29]. For an evaluation in such areas, the data in the test package were considered unsuitable in the context of the present analysis. The build-up and penumbra points were therefore not taken into account in the present analysis.

Table 3
Values of the criterion for the confidence limit for the different types of test geometries a

Description | Tolerance
1 Homogeneous, simple geometry (in % of local dose) |
   Output factors | 1
   Central axis data of square fields | 2
   Off-axis data | 3
2 Complex geometry (wedged fields, inhomogeneities, irregular fields, asymmetrical collimator setting) (in % of local dose) |
   Central and off-axis data | 3
3 More complex geometries, i.e. combinations of #2 (in % of local dose) |
   Central and off-axis data | 4
4 Outside beam edges (in % relative to the dose at the same depth, but at the central axis of the open beam) |
   In simple geometry | 3
   In complex geometry (see #2) | 4
   In more complex geometry (combinations of #2) | 5

3. Results

Table 4
Intercomparison of seven treatment planning systems a,b,c

Test number | Description of fields | Tolerance (%) | LPS-1 (LPS-2) d | Theraplan | Plato-RTS | UM-Plan | CadPlan | Pinnacle | RenderPlan
1a–c | Square | 3 | 2.1 (0.8) | 1.4 | 1.1 | 1.2 | 1.3 | 1.3 | 0.9
2a–b | Rectangular | 3 | 3.0 (1.0) | 2.3 | 2.9 | 4.0 | 2.1 | 2.5 | 1.9
3 | Short SSD (85) | 3 | 1.6 (0.8) | 1.5 | 1.1 | 1.4 | 1.1 | 1.1 | 1.0
4 | Wedged | 3 | 4.1 (0.8) | 3.0 | 2.5 | 2.4 | 1.7 | 2.3 | 2.3
5 | Central block | 4 | 3.9 (0.9) | 3.4 | 3.3 | 2.1 | 3.8 | 2.6 | 3.1
6 | Off-centre plane | 3 | 2.1 (0.9) | 3.0 | 2.1 | 1.9 | 3.4 | 1.1 | 1.8
7 | Irregular block | 3 | 10.7 (2.8) | 4.5 | 5.2 | 6.9 | 4.5 | 3.3 | 4.5
8a–b | Lung inhomogeneity | 3 | 2.0 (1.2) | 3.1 | 1.2 | 1.5 | 2.4 | 3.2 | 3.5
8c | Bone inhomogeneity | 3 | 4.3 (0.7) | 1.7 | 3.6 | 2.8 | NA | 2.0 | 1.6
9 | Oblique incidence | 3 | 1.5 (1.5) | 1.9 | 1.5 | 1.4 | 1.2 | 1.7 | 1.5
10a–b | Missing tissue | 3 | 2.6 (1.5) | 2.2 | 1.6 | 3.4 | 4.3 | 1.7 | 1.3
11 | Asymmetrically open | 3 | 4.9 (1.7) | 4.0 | 3.3 | 2.0 e | 2.8 | 2.4 | 2.6
12 | Asymmetrically wedged | 4 | 8.9 (2.0) | 9.7 | 8.5 | 5.4 e | 10.1 | 4.8 | 8.8

a The values in the columns under the system names represent the confidence limit (in % of the local dose value) calculated for the groups of pre-selected points in the geometries presented.
b Values are presented in bold if the tolerance for the confidence limit (see Table 3) was exceeded. Note that, if the calculated value of the confidence limit was within the tolerance set for Δ, this means that still 6.5% of the pre-selected points in that geometry may exceed that limit.
c NA, 'not available'.
d The column LPS-1 includes the results obtained initially with the LPS system, while the values between brackets show the final results (LPS-2) obtained after remodelling the basic beam data, illustrating the user's influence on the outcome.
e Only data for the relative dose calculation were available.

Table 4 presents the final outcome of the intercomparison for the different test geometries. The outcome was expressed for each treatment planning system as the confidence limit found for the individual test or groups of tests.
In those cases where the tolerance was exceeded, the results in the table are marked in bold font. Table 4 illustrates, for one specific treatment planning system, the influence of the user on the results of this type of comparison. The confidence limit which was initially found for the LPS system is shown under the heading LPS-1. When we reported these values back to the responsible physicist, a number of corrections in the modelling process were considered necessary, resulting in the data shown between brackets in the same column (LPS-2). As the evaluation of the test was done for each beam separately, more information was available and we actually could have reported here the results of the tests of each beam quality separately. In one particular case, test 2 of the UM-Plan, considerable differences were found for the three beam qualities, but, in general, the results were quite similar for the different beam qualities of a given treatment planning system. This finding allowed us to combine the results and, as a consequence, to limit the data presented in this paper. The results of the square fields test were all in compliance with the tolerance set for Δ in Table 3. For each of the systems, the square beam situations of the tests 1a–c were combined. Only data points lying within the geometrical borders of the field up to a depth of 25 cm were included. For each system, 108 data points were used. The largest deviations were found for the Theraplan system, with values calculated for Δ of 1.4%. The initial value of 2.1% calculated for LPS was later reduced to 0.8%. The overall result was very satisfactory for all systems.
The same observation as made for square fields, compliance with the tolerance for Δ, held for practically all results of the tests with rectangular fields (test 2, except for Δ of the UM-Plan exceeding the tolerance), shortened SSD (test 3), the wedged field (test 4, except for LPS-1), and the off-centre plane (test 6, one system with 3.4%, others ≤3%). In the central block test (test 5), the calculated values of Δ were within 4%. For the irregular field geometry of test 7, a 3% tolerance for the confidence limit was defined, but only two (LPS-2 and Pinnacle) out of the seven systems were able to reach or approximate this goal. The UM-Plan showed a large deviation (6.9%) from the criterion. For the other systems, Δ was in the range of 3.3–5.2% (see Fig. 2). This histogram shows the frequency distribution of the relative dosimetric errors, expressed in percentages, of the local dose for all preselected dose points included in the evaluation of the specific test geometry. We have chosen, for clearer presentation, to include in this and other histograms only the results of the three systems which showed the largest values of Δ calculated for each test geometry. Thus, if not shown, other systems performed better for the given test.

Fig. 2. Frequency distribution of the relative dose deviations (in % of the local dose) for test 7, irregular field. For this geometry, the results of the points off-axis, but within the borders of the beam were included. Eighteen data points were included for each treatment planning system.

Both the low and high density inhomogeneities of test 8 were calculated, in general, with a reasonable accuracy. All results were within 4%, and the majority of the data was within the 3% tolerance set for this case. For the situation with a beam obliquely incident on the surface of the phantom of test 9, a few points showed a large deviation between measurement and calculation.
All these points were at the shallowest depth. The confidence limit, Δ, for this situation was calculated only for the points beyond the depth of dmax. Then, for these points, no problems were found for any of the systems (see Table 4). The results for all systems were well within the 3% tolerance. In the test for 'missing tissue', all points were taken into account for both field sizes, i.e. for the 10 and 20 cm square fields. The results are shown in Fig. 3. Some of the systems showed very good agreement between calculation and measurement, but others (UM-Plan and CadPlan) had larger deviations. This result indicated a less adequate handling of the scatter dose contribution by these systems, particularly by CadPlan, which was clearly in excess of the criterion of 3%. In general, no serious problems were detected with the open asymmetrical field settings. Two systems were in excess of the tolerance for Δ of 3%, but did not exceed 4%; all others had better results. In contrast with this finding, the asymmetrically wedged case was apparently too difficult for most of the systems, as can be seen in Table 4, and is shown for three systems in Fig. 4. All systems except LPS-2 exceeded the tolerance of 4% set for Δ in this case; one other system (Pinnacle) was within 5%. For one of the systems, Renderplan, a histogram of the results of test 12 is shown in Fig. 5, in which a separation was made for points lying below the thin part, the central part, and the thick part of the wedge. In most of the systems, we found that there is a similar continuous shift in the dose deviation if the position of the rayline from the source to the point is varied from the thin to the thick part of the wedge. The calculated dose was typically too low under the thin part and too high under the thick part of the wedge.

Fig. 3. Frequency distribution of the relative dose deviations (in % of the local dose) for test 10, missing tissue situation. The two different field sizes were combined. CadPlan was in excess of the tolerance with a calculated value of Δ of 4.3%. For comparison, the results of Theraplan and UM-Plan were also included in the graph, with values for Δ of 2.2 and 3.4%, respectively. For these two systems, 108 data points were included, while 90 points were available for CadPlan.

Fig. 4. Frequency distribution of the relative dose deviations (in % of the local dose) for test 12 with asymmetrically wedged beam settings. Results are shown here for the wedged fields for Theraplan, CadPlan, and Renderplan. For each system for test 12, 225 data points were included.

Fig. 5. Frequency distribution of the relative dose deviations (in % of the local dose) for the asymmetrical beam settings of wedged fields in test 12 shown for Renderplan. This histogram is a typical example of the results of the systems with larger deviations, when these were separated for points below the thin part, the central part, and the thick part of the wedge.

4. Discussion

Quality assurance of treatment planning systems has been the subject of study of several groups of physicists, formulating recommendations for the commissioning and routine quality control of these systems [2,3,4,6,7,13,14,18,24]. Different types of studies can be distinguished. Dale [8], Prasad et al. [19], and Sauer et al. [23] presented studies in which the performance of one specific treatment planning system was discussed, using machine data obtained in the same clinic and comparing these data with results of calculations. Other groups reported results of intercomparisons of the performance of different planning systems. This was occasionally done using the planning systems from different clinics using the locally available data [22,31,32]. In some reports, data for a specific treatment technique or tumour localization, such as breast treatment, were used [9,11,25].
With such a type of intercomparison, in general, no judgement could be given of the relative quality of a specific system, because these tests were not performed with the same data set. A new and interesting approach to analyze the performance of a treatment planning system was followed in the European Dynarad project [5]. In this project, specific test conditions were defined, for which dosimetric reference data were made available for various beam qualities. A test protocol was developed, in which calculated correction factors could be compared with measured correction factors which depend on the quality index (QI) only.

The generally applied approach to test treatment planning systems is to enter one common data set into different systems and to compare the results of calculations with measured data [21,24]. Recently, a test package was developed for this purpose by the AAPM Task Group 23 [2]. The first reports on the use of this data set were published, e.g. by Alam et al. [1], comparing two treatment planning systems, Plato version 1.21 and Theraplan version 5. Ramsey et al. recently added a version of the ADAC Pinnacle and Varian CadPlan system to the list [20]. Declich et al. [10] reported the work of the Italian CadPlan user's group in this field for versions 2.7.9, 3.0.6, and 3.1.1 of this system.

In the present study, we have tried to overcome the drawbacks associated with the AAPM data set [2] as discussed earlier. The beam qualities of the Elekta SL linear accelerators used in our study reflect commonly used photon beam energies: 6, 10 and 18 MV. These machines will be in use for a number of years and are equipped with modern accessories, such as a multileaf collimator (MLC). The set of beam data described in the present paper was compatible with the demands of the planning systems listed in Table 2.

With respect to the test results of the different treatment planning systems, the following remarks can be made.
All systems taking part in our intercomparison were available as full 3D treatment planning systems, except LPS [26]. This latter system was a relatively simple treatment planning system, which could perform all dose calculations in 3D. It had, however, a number of limitations with respect to, for instance, table rotation and patient contouring.

All systems performed reasonably well, with results for most of the tests in compliance with the stated tolerances for Δ, except for tests 7 and 12. In those cases where a system exceeded the criterion for Δ in tests 1–11, the results were, in general, not far above this criterion. The somewhat larger deviations of test 2 with rectangular fields in comparison with the open square fields of test 1 compared well with the results previously reported by Declich et al. [10] for the CadPlan system. The fact that most systems used only the output factors of square fields, and not the data separated for the X and Y jaw settings (the collimator exchange effect), may have increased the deviations unnecessarily. These data were available in the NCS set. It is noted that the AAPM Task Group 23 package lacks this information.

The irregular field geometry of test 7 simulated blocked fields used in conventional therapy (e.g. mantle fields). This test is even more relevant when applying conformal radiotherapy and intensity modulated radiotherapy. The largest deviations were found for UM-Plan, with a value for Δ above 6%. With values around 5%, most other systems were above the criterion of 3% set for this situation. Only for LPS-2 and Pinnacle was a relatively good result found in the irregular field case.

For the inhomogeneity tests, no serious deviations were observed, with values of Δ of 3.6% at maximum, which was close to the tolerance of 3%. Extreme values for the inhomogeneity test, such as those reported previously for the older Plato version 1.21 by Alam et al. [1], were not found in this study.
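The comparison of a confidence limit with a tolerance, as used throughout these remarks, can be illustrated with a minimal sketch. It assumes that Δ is formed as the absolute mean deviation plus 1.5 times the standard deviation of the deviations, which is our reading of the definition referred to in [29]; the factor of 1.5, the use of the sample standard deviation, the function names and the example numbers are all assumptions made for illustration and are not taken from the measured data.

```python
import statistics


def confidence_limit(deviations_percent, multiplier=1.5):
    """Return |mean deviation| + multiplier * SD.

    deviations_percent: deviations between calculated and measured dose,
    in % of the local dose. The multiplier of 1.5 and the sample SD (n - 1)
    are assumptions of this sketch.
    """
    mean_dev = statistics.fmean(deviations_percent)
    sd = statistics.stdev(deviations_percent)
    return abs(mean_dev) + multiplier * sd


def within_tolerance(deviations_percent, tolerance_percent):
    """True if the confidence limit does not exceed the tolerance."""
    return confidence_limit(deviations_percent) <= tolerance_percent


if __name__ == "__main__":
    # Hypothetical deviations for one test geometry (not measured values).
    example = [1.0, -1.0, 2.0, -2.0, 0.0]
    print(f"confidence limit: {confidence_limit(example):.2f}%")
    print("within 3% tolerance:", within_tolerance(example, 3.0))
```

Because the standard deviation enters with a weight larger than one, a system with a small mean deviation but a wide spread of deviations can still exceed the tolerance.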
For test 9, oblique incidence, satisfactory results were found for all systems, provided that the points in the build-up region were excluded from the analysis. It is generally agreed that deviations between calculation and measurement for points in a region of high dose and large dose gradient should be expressed as a millimetre shift of isodose lines, rather than as a percentage deviation of the dose values. Most systems may have suffered from deviations due to interpolations or to the choice of grid size. For this reason, we have excluded these points from the calculation of the confidence limit for this test.

The 'missing tissue' geometry of test 10 was not included in the AAPM Task Group 23 package. The UM-Plan and CadPlan systems did not fulfil the requirement of 3%. Other systems performed better.

We have found that the results of test 11 with asymmetrically collimated fields were almost all within, or close to, the stated criterion. Deviations were, in general, slightly larger than those for the rectangular fields. In contrast with this observation, the asymmetrically wedged fields showed a much larger deviation, as illustrated in Table 4 and Fig. 4. Apparently, present-day algorithms in treatment planning systems cannot handle this situation properly. An explanation might be that the systems have a method of calculating the dose for the symmetrical wedge settings, in which the lack of scatter contribution from the region under the thick part of the wedge is balanced by increased scatter contribution from the region under the thin part of the wedge. In asymmetrical settings, this balance may be lost, which results in large deviations between calculations and measurements. Improper beam hardening calculation for thick and thin parts of the wedge may also have contributed to this discrepancy.
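A systematic trend across the wedge, such as the one shown for test 12 in Fig. 5, can be made visible by grouping the signed deviations according to the position of the point relative to the wedge. The sketch below is only an illustration: the sign convention (calculated minus measured, in % of the local dose), the 2 cm half-width of the 'central' region, the function name and the example points are assumptions and do not reproduce the analysis or the data of this study.

```python
from collections import defaultdict
from statistics import fmean


def mean_deviation_by_region(points, central_halfwidth_cm=2.0):
    """Group signed dose deviations by wedge region and average them.

    points: iterable of (offset_cm, deviation_percent), where offset_cm is
    the position along the wedge direction relative to the field centre,
    taken as positive towards the thick end of the wedge (an assumption of
    this sketch).
    """
    groups = defaultdict(list)
    for offset_cm, deviation in points:
        if offset_cm < -central_halfwidth_cm:
            region = "thin"
        elif offset_cm > central_halfwidth_cm:
            region = "thick"
        else:
            region = "central"
        groups[region].append(deviation)
    return {region: fmean(values) for region, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical points (offset in cm, deviation in %), not measured data.
    example = [(-6.0, -4.0), (-6.0, -2.0), (0.0, 0.5), (6.0, 3.0), (6.0, 5.0)]
    for region, mean_dev in mean_deviation_by_region(example).items():
        print(f"{region}: mean deviation {mean_dev:+.1f}%")
```

A negative mean under the thin part combined with a positive mean under the thick part would correspond to the continuous shift described in the results, whereas a purely random spread would give regional means close to zero.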
Most of the systems showed the same type of deviation, leading to too low a value of the calculated dose under the thin part of the wedge, and too high a value under the thick part. It can therefore be concluded that software developers should put their efforts into improvements of their models specifically for the situation of test 12.

We have compared our detailed test results with similar results published recently by Alam et al. [1], Ramsey et al. [20], and Declich et al. [10]. In this comparison, it was important to note that different versions of treatment planning systems have been used. Furthermore, differences in the results may have been caused by different investigators implementing the beam data into the system. As Alam et al. [1] stated, the test package can be installed differently by different users, computation options can be chosen differently by the user executing the treatment planning test cases, and variations may have been applied in the fitting procedures used in the calculation algorithms. Personal choices by the user could thus have affected the results. Finally, the results of our study were obtained with the NCS test package, whereas the others used the original AAPM test package.

In summary, we have found in this comparison overall results for the CadPlan system which were slightly better than those reported by Declich et al. [10]. Another conclusion is that the results of the newer Plato system V.2.01 and the TheraplanPlus system 3.0 were much improved compared with the older 2D or 2.5D versions, either by improved algorithms or better operator skills. This was especially the case for the central block (test 5) and the inhomogeneity tests (test 8), where improvements were significant. For irregular block geometry, however, no significant difference was found between the two versions of each system.
The ADAC Pinnacle system showed better results for most tests, except for the wedge and inhomogeneity cases.

With respect to the criteria of acceptability, we conclude that the concept of a confidence limit, Δ, works well to reduce the amount of data in the presentation of the results of these test packages. The tolerances set for the different geometries [29] seem realistic, although for some situations, some of the systems have difficulties in complying with the proposed criteria. However, the fact that other treatment planning systems were able to meet a certain criterion, as illustrated by the results in Table 4, should be a challenge for the suppliers of any other system. It is our opinion that a confidence limit of 4% should be aimed for in cases of the asymmetrically wedged fields of test 12, although most of the systems were unable to meet this criterion. In clinical practice, fields with asymmetrical collimation and a wedge are used more and more routinely. The users, on the other hand, should therefore be warned about the differences that were found and should set limitations on the clinical use of such settings unless the results of dose calculations are verified thoroughly.

In general, the data presented as a result of a test package calculation in the form of mean deviations, standard deviations or, as we have done, the confidence limit, cannot be considered to have a definitive value. Too many factors had more or less influence on these results. The data points used in the tests form a selected set of points in a 3D beam geometry which we assumed to be representative for the whole volume. Another major factor was the effort of the users to model the beams in their systems. In our experience, with an intermediate report of the results to all participants, we found that several tests of several systems gave better results after a recalculation, i.e. after a remodelling of the beams. Some rather extreme examples are presented in Table 4.
This table demonstrates that the results presented in this paper not only reflect the relative accuracy of the treatment planning systems, but also indicate that increased attention of the user leads to improved results. Another point is that profiles of certain types of linear accelerators may, in some treatment planning systems, be better modelled than profiles of other types of linacs. Profiles may be indented at the central part of the beam, which changes with depth. Wedge design is typical for each linear accelerator type as well. The NCS package is composed of data from Elekta SL type accelerators only. Finally, new versions of treatment planning systems will outdate at least some of the results. Our results can therefore best be considered as a good set of reference values, against which individual users or user groups can check which improvements are achieved by newer versions of their systems or by better beam modelling.

For future work, the beam data set has been further extended to comply with treatment planning systems other than those listed in Table 2. In this respect, the set will include, in the near future, output factor data determined at a reference depth of 10 cm and an SSD of 90 cm. Beam profiles at this depth including cross profiles, as demanded by the Helax TMS system (MDS Nordion), have been included. As part of new studies, data from developments in treatment delivery systems will also be measured and added to the test package. In this way, the compatibility of dose calculation algorithms with these devices can still be tested. As an example, we have developed a specific test program for the MLC of one of the linear accelerators of the Utrecht department. Also, more complex test configurations, including 3D inhomogeneities, are under development and will be added to the package.

5. Conclusions

A new data set has been developed, analogous to the AAPM Task Group 23 test package, but measured on accelerators having beam qualities that are clinically applied more frequently. The set of test configurations was expanded with tests for a 'missing tissue' geometry and for asymmetrical collimator settings, including asymmetrically wedged beams. Further development towards new features of radiation equipment, e.g. an MLC, is under way.

The new test package was used in an intercomparison of seven different treatment planning systems, which were considered to be representative of state-of-the-art systems. The set proved to be complete and, in digital form, suitable for beam modelling purposes for most of the tests of the participating systems. Additional data making the test package suitable for other systems have been measured and will be added to the set.

Typical results of the tests performed on the different systems are presented and discussed. Major problems were found for the asymmetrically wedged fields for the majority of the treatment planning systems in this study.

The criteria for acceptability and quality assurance of treatment planning systems, which have been suggested elsewhere [29], could be analyzed. The feasibility of these tolerances was established in this intercomparison.

The new test package is available for other investigators or user groups and will be part of a forthcoming NCS report on QA of treatment planning systems.

Acknowledgements

Arjan Renders, Piet van der Linden, Henk Huizenga, Bob Smulders, Lau Mestrom and Bie De Ost are acknowledged for their kind co-operation, their help and useful comments regarding all aspects of this study. The authors thank Ben Mijnheer and Bob Schaeken for their detailed comments on the manuscript.
This work was performed within the framework of the Task Group on Quality Assurance of Treatment Planning Systems of the Netherlands Commission on Radiation Dosimetry (NCS).

Appendix A. Description of the test geometries and differences with the Task Group 23 test package

Test 1: square fields; the dose at points on the central beam axis was measured for fields 5 × 5, 10 × 10 and 25 × 25 cm². The distances from the central beam axis to the off-axis points were 1, 3 and 9 cm, respectively. The distances from the central beam axis to the points just outside the penumbra were 5, 9 and 19 cm, respectively. So, these points were all on lines parallel to the central axis of the beam.

Test 2: rectangular fields; the dose was measured for fields with collimator settings of 5 × 25 and 25 × 5 cm², in points on the central beam axis and in off-axis points: at distances from the central axis of 3 and 9 cm in the beam, respectively, and 9 and 19 cm, i.e. outside the penumbra.

Test 3: variation of SSD; in this case, the isocentre was set at a depth of 15 cm, i.e. at an SSD of 85 cm. The field size at the isocentre was 10 × 10 cm². Points were at the central axis and at off-axis distances of 2.5 and 7 cm.

Test 4: wedged field; the field size was 9 × 9 cm² at the isocentre. The motorized 60° wedge of the accelerators was used. Points on the central beam axis were chosen, as well as points at a distance 2.5 cm left and right from the central beam axis, in the direction of the wedge. Note that in the Task Group 23 test package, a 45° wedge was used.

Test 5: central block; the field size was 16 × 16 cm². Centrally and symmetrically with respect to the beam axis, a block was positioned, of which the projection shades an area of 2 cm width and 7 cm length at isocentre distance. Points were located on the central beam axis (shaded by the block) and at an off-axis distance of 4 cm. Note that the block dimensions differ slightly from the block used in the equivalent Task Group 23 test.
Test 6: off-centre plane; a plane parallel to the central beam axis was used, shifted 4 cm in the GT-direction towards the gantry. The field size was 10 × 10 cm². Points were located on the central line of this plane, and at off-line distances of 3 and 8 cm.

Test 7: irregular field; an 'L'-shaped field was obtained by blocking an area of 12 × 12 cm² in the corner of a 16 × 16 cm² field. Points were located on the central beam axis shaded by the block, and at an off-axis distance of 7 cm, i.e. in the open part of the 'L'-shaped field. Due to the divergence of the open part of the beam, an off-axis distance of 7 cm was chosen instead of the 6 cm distance in the equivalent Task Group 23 test.

Tests 8a,b: inhomogeneity, 'lung'; the lung inhomogeneity was created in a field of: (a), 6 × 6; and (b), 16 × 16 cm² using a cylinder with a relative electron density of 0.2, perpendicular to the central axis. The centre of the cylinder was positioned on the central beam axis at 8 cm depth. The diameter of the cylinder was 6 cm. The points were below the inhomogeneity, starting at depths of 12 cm up to the maximum depth, along the central axis and off-axis at the same depths at a distance of 2 cm in the 6 × 6 cm² field and at a distance of 5 cm in the 16 × 16 cm² field. Note that the relative electron density of the cylinder is different from the cylinder in the Task Group 23 test.

Test 8c: inhomogeneity, 'bone'; for this test case, a field size of 16 × 16 cm² was used. The bone simulating inhomogeneity was created by using a cylinder with a relative electron density of 1.8 with a diameter of 2.0 cm. The cylinder was positioned perpendicularly to the central beam axis with its centre at 6 cm depth. The points were on the central beam axis, below the inhomogeneity, starting at depths of 10 cm up to the maximum depth, and at the same depths, at a 4 cm off-axis distance. Note that the relative electron density of the cylinder is different from the cylinder in the Task Group 23 test.
Test 9: oblique incidence; a beam was positioned at an angle of 45° to the phantom, with the entrance point at SAD. The field size was 10 × 10 cm², defined perpendicular to the beam axis. Points for comparison were taken at the central beam axis, and at distances of +3 and −3 cm from the central beam axis in planes parallel to the phantom surface. Depths were taken from 1 to 20 cm.

Tests 10a,b: missing tissue (half phantom); two fields were used with field sizes of 10 × 10 and 20 × 20 cm². In practice, the beam was measured with the gantry at 90°, and the central beam axis coincided with the surface of the phantom: only half the beam hits the phantom. Points were at 1.0, 2.5 and 4.0 cm from the phantom side wall in the smaller field, and at 1.0, 5.0 and 9.0 cm from the phantom side wall in the larger field (see Fig. 1).

Tests 11–12: half and quarter fields with asymmetrical collimator setting; for a test of an open (11) and wedged (12) asymmetrical field setting, a square field projection of 15 × 15 cm² was used. The geometric radiation field centre coincided in the first instance with the collimator rotation axis (the setting is symmetrical, at position 0,0), and was then shifted to the asymmetrical positions, 0,7.5; ±7.5,0; and ±7.5,7.5 (in cm, from the collimator rotation axis). Points were chosen at +6.0 and −6.0 cm from the geometric radiation field centre, i.e. below the thick part and below the thin part of the wedge (see Fig. 1).

The latter tests (10–12) were not included in the Task Group 23 test package.

For more information regarding the NCS activities and the availability of the forthcoming report, see http://www.ncs-dos.org.

References

[1] Alam R, Ibbott GS, Pourang R, Nath R. Application of AAPM Radiation Therapy Committee Task Group 23 test package for comparison of two treatment planning systems for photon external beam radiotherapy. Med Phys 1997;24:2043–2054.
[2] American Association of Physicists in Medicine. Report of Task Group 23 of the Radiation Therapy Committee. AAPM Report No. 55. Radiation treatment planning dosimetry verification. Woodbury, NY: American Institute of Physics, 1995.
[3] American Association of Physicists in Medicine. AAPM Report No. 53. Quality assurance for clinical radiotherapy treatment planning. Report of Task Group 53 of the Radiation Therapy Committee. Med Phys 1998;25:1773–1829.
[4] Brahme A, Chavaudra J, Landberg T, et al. Accuracy requirements and quality assurance of external beam therapy with photons and electrons. Acta Oncol 1988;27(Suppl 1):1–76.
[5] Caneva S, Rosenwald JC, Zefkili S. A method to check the accuracy of dose computation using quality index: application to scatter contribution in high energy photon beams. Med Phys 2000;27:1018–1024.
[6] Dahlin H, Lamm I-L, Landberg T, Levernes S, Ulsø N. User requirements on CT-based computed dose planning systems in radiation therapy; presentation of 'check lists'. Comput Methods Programs Biomed 1983;16:131–138.
[7] Dahlin H, Lamm I-L, Landberg T, Levernes S, Ulsø N. User requirements on CT-based computed dose planning systems in radiation therapy. Acta Radiol Oncol 1983;22:396–415.
[8] Dale RG. Implementation of the Philips treatment planning system for use in radiation teletherapy. Br J Radiol 1978;51:613–621.
[9] Davis JB, Pfäfflin A, Cozzi AF. Accuracy of two- and three-dimensional dose calculation for tangential irradiation of the breast. Radiother Oncol 1997;42:245–248.
[10] Declich F, Fumasoni K, Mangili P, Cattaneo GM, Iori M. Dosimetric evaluation of a commercial 3-D treatment planning system using Report 55 by AAPM Task Group 23. Radiother Oncol 1999;52:69–77.
[11] Dunscombe P, McGhee P, Lederer E. Anthropomorphic phantom measurements for the validation of a treatment planning system. Phys Med Biol 1996;41:399–411.
[12] Harrison RM. External beam treatment planning, can we deliver what we plan? Acta Oncol 1993;32:445–451.
[13] International Commission on Radiation Units and Measurements. ICRU Report No. 24. Determination of absorbed dose in a patient irradiated by beams of X or gamma rays in radiotherapy procedures. Bethesda, MD: ICRU, 1976.
[14] International Commission on Radiation Units and Measurements. ICRU Report No. 42. Use of computers in external beam radiotherapy procedures with high-energy photons and electrons. Bethesda, MD: ICRU, 1987.
[15] International Commission on Radiation Units and Measurements. Supplement to ICRU Report 50. ICRU Report No. 60. Prescribing, recording and reporting photon beam therapy. Bethesda, MD: ICRU, 1999.
[16] McCullough EC, Krueger AM. Performance evaluation of computerized treatment planning systems for radiotherapy: external photon beams. Int J Radiat Oncol Biol Phys 1980;6:1599–1605.
[17] Mijnheer BJ, Battermann JJ, Wambersie A. What degree of accuracy is required and can be achieved in photon and neutron therapy? Radiother Oncol 1987;8:237–252.
[18] Netherlands Commission on Radiation Dosimetry. NCS Report 11. Quality control of simulators and CT scanners; some basic requirements for the quality control of treatment planning systems. Delft: NCS, 1997.
[19] Prasad SC, Glasgow GP, Purdy JA. Dosimetric evaluation of a computed tomography treatment system. Radiology 1979;130:777–781.
[20] Ramsey CR, Cordrey IL, Spencer KM, Oliver AL. Dosimetric verification of two commercially available three-dimensional treatment planning systems using the TG 23 test package. Med Phys 1999;26:1188–1195.
[21] Rosenow UF, Dannhausen H-W, Lubbert K, et al. Quality assurance in treatment planning. Report from the German task group. Proc. IXth ICCR. In: Bruinvis IAD, van der Giessen PH, van Kleffens HJ, Whitkamper FW, editors. The use of computers in radiation therapy. Amsterdam: Elsevier, 1987. pp. 45–48.
[22] Samulski T, Dubuque GL, Cacak RK, et al. Radiation therapy dosimetry reviews by the centers for radiological physics. Int J Radiat Oncol Biol Phys 1981;7:379–383.
[23] Sauer O, Nowak G, Richter J. Accuracy of dose calculations of the Philips treatment planning system OSS for blocked fields. Quality assurance in treatment planning. Report from the German task group. Proc. Xth ICCR. In: Bruinvis IAD, van der Giessen PH, van Kleffens HJ, Whitkamper FW, editors. The use of computers in radiation therapy. Amsterdam: Elsevier, 1987. pp. 57–60.
[24] Société Française des Physiciens d'Hôpital. Evaluation des systèmes informatiques en radiothérapie. Paris: SFPH, 1982. Siège: Institut Curie, 26 Rue d'Ulm, 75231 Paris.
[25] van Bree NAM, van Battum LJ, Huizenga H, Mijnheer BJ. Three-dimensional dose distribution of tangential breast treatment: a national dosimetry intercomparison. Radiother Oncol 1991;22:252–260.
[26] van der Linden PM. A three-dimensional photon beam model used in a microcomputer planning system. Proc. IXth ICCR. In: Bruinvis IAD, van der Giessen PH, van Kleffens HJ, Whitkamper FW, editors. The use of computers in radiation therapy. Amsterdam: Elsevier, 1987. pp. 517–519.
[27] Van Dyk J, Barnett RB, Cygler JE, Shragge PH. Commissioning and quality assurance of treatment planning computers. Int J Radiat Oncol Biol Phys 1993;26:261–273.
[28] van Gasteren JJM, Heukelom S, van Kleffens HJ, van der Laarse R, Venselaar JLM, Westermann CF. The determination of phantom and collimator scatter components of the output of megavoltage photon beams: measurement of the collimator scatter part with a beam-coaxial narrow cylindrical phantom. Radiother Oncol 1991;20:250–257.
[29] Venselaar JLM, Welleweerd J, Mijnheer BJ. Tolerances for photon beam dose calculations using a treatment planning system. Radiother Oncol 2000 (submitted for publication).
[30] Welleweerd J, van der Zee W. Dose calculations for asymmetric fields using Plato version 2.01. In: Proc. 17th Annual ESTRO Meeting. Radiother Oncol 1998;48(Suppl 1):S134.
[31] Westermann C, Mijnheer B, van Kleffens H. Determination of the accuracy of different computer planning systems for treatment with external photon beams. Radiother Oncol 1984;1:339–347.
[32] Wittkämper FW, Mijnheer BJ, van Kleffens HJ. Dose intercomparison at the radiotherapy centres in the Netherlands. I. Photon beams under reference conditions and for prostatic cancer treatment. Radiother Oncol 1987;9:33–44.