Traci M. Glass
CSC 540

Ethical Software Development in the Medical Field: The Therac-25 Incident

The Therac-25 incident was one of the worst series of radiation accidents in the 35-year history of medical accelerators. Massive overdoses of radiation were administered to six known patients between June 1985 and January 1987, resulting in three deaths and severe injuries to the other three patients. The Therac-25 was a linear accelerator produced by Atomic Energy of Canada Limited (AECL). It was designed to generate high-energy electron beams to treat shallow tumors and X-ray beams to treat deeper tumors. The Therac-25 was infamous for its unreliability and typically malfunctioned around 40 times each day (Quinn).

To gain a better understanding of the Therac-25 incident, I will first discuss the history of the earlier Therac linear accelerators, the Therac-6 and the Therac-20. During the 1970s, AECL collaborated with a French corporation, CGR, to build the Therac-6 and the Therac-20, both of which were modernized CGR accelerators. The modernized accelerators were distinguished from their older counterparts by the addition of a minicomputer, the DEC PDP 11, as a front end. The addition of the PDP 11 made the linear accelerators much simpler to operate (Quinn).

The Therac-6, a 6 million electron volt accelerator, was capable of producing X-rays only. Its successor, the Therac-20, was a 20 million electron volt, dual-mode accelerator capable of producing either X-rays or electrons. Both the Therac-6 and the Therac-20 had limited software functionality; the computer was added mainly for convenience (Leveson and Turner). Because all of the safety features were built into the hardware, both accelerators were able to work independently of the PDP 11. After completion of the Therac-20, CGR split from AECL, with the split attributed to “competitive pressures” (Leveson and Turner).
AECL then continued with the development and deployment of a new linear accelerator, later named the Therac-25 (Quinn).

The Therac-25 was designed to deliver either photons at 25 million electron volts or electrons at various energy levels, and it used a new double-pass design. This double-pass electron accelerator required less space to develop high energy levels than previous installations, and it was also much more economical to manufacture (Leveson and Turner). Like its predecessors, the Therac-25 made use of the PDP 11. Unlike the Therac-6 and the Therac-20, however, the Therac-25 was designed so that it could not function without the PDP 11. This decision allowed AECL to cut costs by removing hardware safety elements and replacing them with software safety features (Quinn). Overall, the Therac-25 was designed to be easier to use, more compact, and more versatile than its ancestors (Genesis of the Therac-25).

The software development of the Therac-25 is an interesting subject. The Therac-25 software was essentially revised Therac-20 software, which was in turn revised Therac-6 software. This makes sense, since the Therac-25 was largely an upgraded version of those machines. The software was developed by a single person over a few years. One critical difference, noted earlier, that could contribute to major issues is that, unlike the Therac-6 and the Therac-20, the Therac-25 was completely incapable of functioning without the PDP 11. The Therac-25 software was responsible for monitoring the status of the machine, receiving input regarding the treatment, setting up the machine, turning the beam on and off, detecting hardware malfunctions, and delivering diagnostics. The Therac-25 also ran on its own customized operating system (Porrello).

In 1976, the first Therac-25 prototype was produced. It was not until late 1982 that the Therac-25 was commercialized.
In March 1983, AECL decided to perform a safety analysis of the Therac-25 in the form of a fault tree. This analysis did not include the software, and numerous assumptions were made along the way. The final report claimed that programming errors had been greatly reduced by “extensive” testing on a simulator. It also stated that software does not degrade over time and that computer execution errors are caused by faulty hardware (Leveson and Turner). According to reports, this analysis did not seem to consider computer failure of any kind.

The first Therac-25 was shipped in 1983. In total, 11 systems were distributed: five installations in the United States and the remaining six in Canada (Leveson and Turner).

One of the major components in the design of the Therac-25 was the turntable, which was also a crucial element in the accidents. The turntable had three modes: electron, photon (or X-ray), and field-light. The electron mode used low-energy electron beams to treat shallow tumors. Even at low energy, the raw electron beam was too intense to be applied directly to the patient, so scanning magnets were placed in the path of the beam to spread the electrons, which in turn reduced the strength of the beam. The second mode, the photon or X-ray mode, used a high-energy beam to treat deeper tumors. The high-energy beam struck a metal foil, which emitted the photons. From there, the beam was flattened beneath the foil to achieve the desired dosage. The final mode was the field-light mode, which allowed the machine to be aligned prior to treatment. It used a mirror and a light to show where the beam would hit the patient (How Therac-25 worked).

As stated earlier, the operator interface was controlled with the DEC minicomputer.
The interface screen displayed quite a bit of information, including the patient’s name, the treatment mode, the beam type, the prescribed and actual amounts of radiation, and the position of the turntable. The operator’s procedure was to position the patient on the treatment table, set the treatment field sizes, and attach any necessary accessories to the Therac-25 unit. After completing these tasks, the operator left the treatment room and returned to the minicomputer. There, the operator entered the patient identification, all fields of the treatment prescription, and any other remaining data. The system then compared the values set in the treatment room with those entered on the minicomputer. If the values matched, the treatment could proceed; otherwise, it would not (The Operator Interface).

After operators complained about the length of time it took to enter the treatment data, the manufacturer changed the software before the first unit was installed. This modification allowed the operator to copy the treatment data that had been set in the treatment room by using a series of carriage returns. This modification eventually played a part in several of the accidents (Leveson and Turner).

The Therac-25, sadly, had very few safety features. Upon detection of an error, the Therac-25 could shut down in one of two ways. The first, treatment suspend, required a full system reset to restart treatment. The second, treatment pause, required only a single key press to resume: the operator would press “P” to proceed using the previous treatment values. This feature could be invoked up to five times before the machine suspended and required a full reset (Leveson and Turner). Although the Therac-25 displayed error messages, they were very cryptic and unhelpful.
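The pause and suspend behavior described above can be sketched as a small state machine. This is only an illustration in Python, not AECL's code: the names State, TreatmentSession, report_error, proceed, and reset are hypothetical, and the sketch assumes that the fifth pause is the one that forces a suspend.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    PAUSED = "paused"
    SUSPENDED = "suspended"


# Assumption: the fifth pause itself triggers the suspend.
MAX_PAUSES = 5


class TreatmentSession:
    """Hypothetical model of treatment pause and treatment suspend."""

    def __init__(self):
        self.state = State.READY
        self.pauses = 0

    def report_error(self, severe=False):
        """Record a detected error and move to PAUSED or SUSPENDED."""
        if severe:
            self.state = State.SUSPENDED
            return self.state
        self.pauses += 1
        if self.pauses >= MAX_PAUSES:
            self.state = State.SUSPENDED
        else:
            self.state = State.PAUSED
        return self.state

    def proceed(self):
        """Model the operator pressing "P" after a treatment pause."""
        if self.state is not State.PAUSED:
            raise RuntimeError("cannot proceed from state " + self.state.value)
        self.state = State.READY

    def reset(self):
        """Model the full system reset required after a suspend."""
        self.state = State.READY
        self.pauses = 0
```

In this sketch a single key press is enough to leave the paused state, which mirrors how easy it was for an operator to continue a treatment without knowing what had caused the pause.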
Most error messages consisted simply of the word “malfunction” followed by a number. These malfunctions were not described in the operator’s manual, and it was not known at the time that they could harm a patient (Leveson and Turner).

In 1983, the first Therac-25 was shipped, and in June 1985 the first incident occurred. In Marietta, Georgia, at the Kennestone Regional Oncology Center, a 61-year-old breast cancer patient was to receive radiation after having a lumpectomy to remove a malignant tumor. This Therac-25 unit had been installed three months earlier and had run with no reported incidents until that day. The patient was set to receive treatment to the area around her collarbone. After the treatment was completed, the patient complained of being burned. AECL was contacted and asked whether the Therac-25 unit could have failed to diffuse the electron beam. A few days later, AECL called back to explain that this was not possible.

The patient went home but soon developed swelling and reddening in the treatment area. She later suffered severe pain in her shoulder, which eventually became so intense that she could no longer move the shoulder and began to have spasms. The reddening soon spread to her back, and the skin in the swollen areas began to come off in layers. It was obvious that the patient had suffered a radiation burn, but it was not until much later that the physicist who performed her treatment estimated that she had received approximately 75 to 100 times more radiation than prescribed. In the end, the patient had to have her breast removed because of the serious burns. She also lost all use of her shoulder and arm. The patient lived in constant pain, but the manufacturer refused to believe that the Therac-25 was responsible (Leveson and Turner).
A little over one month later, the second known incident occurred at the Ontario Cancer Foundation in Hamilton, Ontario. The patient, a 40-year-old woman, came in for her 24th treatment on the Therac-25 unit. Having had so many treatments on the machine, she was fully aware that something was wrong when the unit shut off after about five seconds. The operator’s screen displayed that no dose had been administered, so the operator made a second attempt. This attempt failed in the same way with the same message, so the operator made four more attempts, each failing in the same manner. After the fifth pause, as described earlier, the unit went into suspend mode and a technician was called. The technician found nothing wrong with the Therac-25 unit.

After the treatment, the patient complained that she had been burned, much like the patient in the first incident. She also described a sensation like an electric shock. After her treatment, six other patients were treated without error. The patient returned three days later for further treatment and complained of burning, hip pain, and swelling in the treatment region. That day, the Therac-25 unit was removed from service, and the patient was hospitalized the next day. AECL was informed and later sent a technician to investigate. It was estimated that the patient had received 65 to 85 times more radiation than was prescribed. The patient died that November from cancer, but the autopsy noted that, had she lived, she would have needed a hip replacement because of the excessive radiation (Leveson and Turner).

After this incident, AECL began its first investigation of the Therac-25 in July 1985. An engineer was sent to the Ontario Cancer Foundation in hopes of reproducing the malfunction.
Although the AECL engineer was never able to reproduce the malfunction, he suspected a problem with the microswitch used to determine the position of the turntable. During the investigation, other design flaws and probable hardware issues were found. AECL released both hardware and software updates for the Therac-25 and reported that “analysis of the hazard rate of the new solution indicates an improvement over the old system by at least five orders of magnitude” (Leveson and Turner). The investigation concluded in September 1985 (Leveson and Turner).

In December 1985, a woman went to Yakima Valley Memorial Hospital for radiation treatments. Following one of her treatments, she developed a reddening of the skin in the treatment area in the form of parallel lines. Because her reaction was not considered dangerous or unusual at the time, she continued treatments on the Yakima Therac-25 unit and completed them in January 1986. In late January, the red marks were finally deemed unusual. The staff monitored the red stripes and came to believe that the blocking trays were the cause, but by this time the blocking trays had been removed and discarded, so the pattern could not be reproduced. On January 31, the hospital staff sent AECL a letter about the incident. AECL did not respond until February 24. Its response claimed that it was impossible for the Therac-25 to have produced the red markings and went on for two pages explaining why the incident was technically impossible. In the end, the patient survived, but with major scarring and mild disability (Leveson and Turner).

The next incident occurred at the East Texas Cancer Center in Tyler, Texas, in March 1986. On this day, a male patient came in for his ninth radiation treatment for a cancerous tumor on his back.
The operator followed the standard procedure of setting the patient up in the treatment room and then returning to the operator room to enter the treatment data on the minicomputer. While doing so, she realized that she had accidentally typed “X” for X-ray instead of “E” for electron. After quickly correcting her error, she began treatment. A few seconds into treatment, the Therac-25 shut down and displayed a malfunction error, “Malfunction 54,” along with a treatment pause message. The only description of this malfunction was “dose input 2.” It was later discovered that this meant the dose administered was either too high or too low. The display showed that the patient had received only 6 of 202 units, so the operator proceeded with treatment.

Because the patient and the operator are in separate rooms, the facility had audio and video monitors. Unfortunately, the monitors were not working that day. It was not until the operator heard the patient beating on the door that she stopped treatment. The patient described the first attempt as feeling like an electric shock, or as if someone had poured hot coffee on him. After this first hit, he began to get up from the treatment table, and at that point the second treatment attempt hit his arm. He compared this, too, to an electric shock and said it felt as if his hand were trying to leave his body. He then beat on the door until the operator stopped treatment and released him.

The patient was examined immediately. It was suspected that he had received an electric shock, and he was sent home. Following the incident, the patient continued to experience pain and soon became paralyzed in his left arm, both legs, his left vocal cord (leaving him unable to speak), and the left side of his diaphragm. He also experienced problems with his bowels and bladder, developed a lesion on his left lung, and suffered recurring herpes infections.
Five months later, the patient died of complications from the radiation overdose (Leveson and Turner).

The day after the incident in Tyler, Texas, the second investigation began. An AECL engineer from Canada and a local engineer spent an entire day testing the Therac-25 unit in an attempt to reproduce Malfunction 54. After failing to reproduce the error, the engineer from Canada stated that it was impossible for the unit to administer an overdose. The local engineer then asked whether there had been any other incidents involving radiation overdoses and was informed that none had occurred. AECL suggested that the patient had received an electric shock caused by a fault in the hospital’s electrical system. After the electrical system was checked, the Therac-25 unit in Tyler was put back into service (Leveson and Turner).

On April 11, 1986, only three weeks after the first incident at the East Texas Cancer Center, a male skin cancer patient came in for an electron treatment. The operator, the same one involved in the earlier incident, set up the machine for treatment. Again she typed too quickly and made an error, which she swiftly corrected before beginning the treatment. As in the first incident, the machine shut down after a few seconds and the screen displayed the “Malfunction 54” error message. After hearing the patient moaning loudly over the now-working intercom, she ran into the treatment room. The patient described a feeling of “fire” on his face in the treatment area. The operator ran to find the hospital physicist to inform him that another patient had been “burned.” The patient told the physicist that something had hit him on the side of the face, after which he saw a flash of light and heard a sound reminiscent of frying eggs. After the incident, the patient’s condition worsened greatly: he progressively slipped into a coma, developed a fever of 104 degrees, and suffered neurological damage.
Three weeks after the incident, on May 1, 1986, the patient died from a high radiation overdose to the right temporal lobe of the brain and the brain stem (Leveson and Turner).

Following the second incident at the East Texas Cancer Center, the machine was immediately taken out of service. After contacting AECL, the hospital physicist and the operator began their own investigation. The pair was eventually able to decode the “Malfunction 54” error. They found that if the operator made a mistake and corrected it quickly, the machine did not have time to register the change, and an overdose would then occur (Leveson and Turner).

On January 17, 1987, a male patient came to Yakima Valley Memorial Hospital to be treated for carcinoma. As in the other incidents, the machine shut down after a few seconds of treatment and displayed a message. The operator proceeded with treatment, and again the machine shut down with a treatment pause. After hearing the patient make a noise, the operator went into the treatment room to check on him. The patient described a burning sensation in his chest. He later developed a skin burn that, a few days afterward, took on the same striped pattern seen in the earlier Yakima incident. In April, the patient died of complications from the radiation overdose (Leveson and Turner).

On February 10, 1987, the Therac-25 was officially declared defective by the FDA under the Radiation Control for Health and Safety Act. AECL was ordered to inform all purchasers of Therac-25 units of the defect. To regain FDA approval, AECL had to demonstrate how it would make the system safe: investigate the issues, develop a solution, and then submit a corrective plan to the FDA. After five revisions over the course of five months, AECL finally won FDA approval.
Included in the revisions were a variety of hardware interlocks to prevent the machine from administering overdoses or activating the beam when the turntable was not in the correct position (Leveson and Turner).

One of the biggest mistakes in the development and production of the Therac-25 was that there was only one developer. In most development environments there are at least two developers, and often there are multiple teams of developers. With critical software such as that of the Therac-25, the utmost care should be taken in development and production. Another mistake was that only limited testing was done on simulators, a result of excessive confidence. The developer also transferred faith in the reliability of the hardware onto the software. It was assumed that there were no design flaws because there had been no issues with the hardware. The developer assumed that if a problem did arise, it would lie in the hardware and not the software, since errors in previous versions had always been traced to the hardware. There was also very little documentation produced during development.

Incidents such as the Therac-25 make us ask what we can do to ensure safe software. How can we encourage our employers to spend the money to implement more safety features? What can we do to make sure this doesn’t happen to us? To ensure our software is safe, we can use stricter software development methods. One thing we should always include in our code is some form of exception handling, such as try and catch statements. If you are developing for a mission-critical system, you should use a language suited to mission-critical work, such as Ada. We can also use languages that are strongly typed, and we can test our code frequently on simulators. It is necessary in the technical field to program with safety and security in mind. These thoughts apply not only to mission-critical systems, but to all software development.
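As a minimal sketch of the exception handling and typing advice above, the following Python example refuses to turn a beam on unless a few preconditions hold. Python's try and except play the role of try and catch here. Everything in it is hypothetical and for illustration only: the names BeamMode, InterlockError, check_setup, and start_treatment, the max_dose parameter, and the wording of the messages are assumptions, not part of any real Therac-25 software.

```python
from enum import Enum


class BeamMode(Enum):
    ELECTRON = "E"
    XRAY = "X"


class InterlockError(Exception):
    """Raised when the requested setup is not safe to run."""


def check_setup(mode, turntable, dose, max_dose):
    """Raise an exception unless every precondition holds."""
    # Reject raw strings such as "E"; only BeamMode values are accepted.
    if not isinstance(mode, BeamMode) or not isinstance(turntable, BeamMode):
        raise TypeError("mode and turntable must be BeamMode values")
    # The turntable position must agree with the requested mode.
    if turntable is not mode:
        raise InterlockError(
            f"turntable is set for {turntable.name}, "
            f"but {mode.name} was requested"
        )
    # The dose must be positive and within the (hypothetical) limit.
    if dose <= 0 or dose > max_dose:
        raise InterlockError(f"dose {dose} is outside the limit {max_dose}")


def start_treatment(mode, turntable, dose, max_dose):
    """Return a status string; the beam stays off after any failed check."""
    try:
        check_setup(mode, turntable, dose, max_dose)
    except (InterlockError, TypeError) as err:
        return f"BEAM OFF: {err}"
    return f"BEAM ON: {mode.name} at {dose} units"


print(start_treatment(BeamMode.ELECTRON, BeamMode.XRAY, 200, 202))
```

The failure message in this sketch states the cause in words, in contrast to a bare “Malfunction 54.” A software check of this kind would complement, not replace, the hardware interlocks that AECL eventually added.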
Being hasty or lazy in this field can result not only in inefficient software or other software issues, but also in serious harm, because in some cases we are developing software for uses where accuracy is of the greatest importance.

Bibliography

Genesis of the Therac-25. 31 March 2011 <http://computingcases.org/case_materials/therac/supporting_docs/levenson/Therac%20History.html>.
How Therac-25 worked. 31 March 2011 <http://computingcases.org/case_materials/therac/supporting_docs/therac_case_narr/Machine_Design.html>.
Leveson, Nancy and Clark S. Turner. An Investigation of the Therac-25 Accidents. July 1993. 31 March 2011 <http://courses.cs.vt.edu/cs3604/lib/Therac_25/Therac_1.html>.
Porrello, Anne Marie. Death and Denial: The Failure of the THERAC-25, A Medical Linear Accelerator. 31 March 2011 <http://users.csc.calpoly.edu/~jdalbey/SWE/Papers/THERAC25.html>.
Quinn, Michael J. Ethics for the Information Age. Addison-Wesley, 2011.
The Operator Interface. 6 April 2011 <http://computingcases.org/case_materials/therac/supporting_docs/levenson/Interface.html>.
Therac-25 Wikipedia. <http://en.wikipedia.org/wiki/Therac-25>.