Nancy Leveson: Safeware

CHAPTER 1: Risk in Modern Society

• Three Mile Island, 1979
• Bhopal, 1984
• Space Shuttle Challenger, 1986
• Chernobyl, 1986
• Therac-25, 1985-1987

IS NEW TECHNOLOGY MAKING OUR WORLD RISKIER?

In the United States, technological hazards account for 15 to 25 percent of human mortality and have significantly surpassed natural hazards in impact, cost, and general importance. [87: Ned Franklin, "The Accident at Chernobyl," The Chemical Engineer, pages 17-22, November 1986] (p. 4)

Flood damage in the United States, for example, has increased even as expenditures on flood control have increased. [154: Trevor Kletz, Myths of the Chemical Industry, The Institution of Chemical Engineers, Rugby, Warwickshire, United Kingdom, 1984] (p. 4)

In fact, all hazards are affected by complex interactions among technological, ecological, sociopolitical, and cultural systems [30, 174, 3339]. Attempts to control risk by treating it simply as a technical problem or only as a social issue are doomed to fail or to be less effective than possible.

1.1 Changing Attitudes toward Risk

• Nothing is safe, only safer.
• Societies are recognizing human and workers' rights.
• Workers are no longer assumed to be "at the mercy of their employers" in matters of safety.
• Complete abdication of personal responsibility, however, is not always wise.

In some instances -- such as the Bhopal accident -- the public has completely trusted others to plan for and respond effectively to an emergency, with tragic results.
– Many aspects of Bhopal guaranteed that an accident would occur:
• Emergency and evacuation planning, training, and equipment were inadequate.
• The public was not told of simple measures (such as closing their eyes and putting a wet cloth over their faces) that could have saved their lives.

In writing about the Bhopal tragedy, Bogard expresses this new attitude:

We are not safe from the risks posed by hazardous technologies, and any choice of technology carries with it possible worst-case scenarios that must be taken into account in any implementation decision. The public has the right to know precisely what these worst-case scenarios are and to participate in all decisions that directly or indirectly affect their future health and well-being. In many cases, we must accept the fact that the result of employing such criteria may be a decision to forego the implementation of a hazardous technology altogether. [30, p. 109] (p. 6)

Is risk increasing in our modern society as a result of new technological achievements, or are we simply experiencing a new and unjustified form of Luddism? (p. 6)

• [The Luddite disturbances occurred in Yorkshire between 1811 and 1816, when workers in the English woolen industry tried, through violence, to stem the increasing mechanization of the mills. Luddism has become a generic term describing opposition to technological innovation.]

1.2 Is Increased Concern Justified?

• Is technological risk increasing? The answer depends on the data used and its interpretation. (p. 6)
• Harris and colleagues argue that technological hazards, in terms of human mortality, were greater in the earlier, less fully managed stages of industrial development [112].
• Data cited from the National Safety Council show that occupational death and injury rates have declined steadily since the early part of the twentieth century [112]. The NSC concludes that technological hazard mortality is not currently rising.
– However, with a warning: "The positive effects of technology have for some time reached their maximum effect on human mortality, while the hazards of technology continue partially unchecked, affecting particularly the chronic causes of death, which currently account for 85 percent of mortality in the U.S.A." [112]

• On the other hand, examination of the technological accident rate, rather than the occupational death and injury rate, suggests that technological risk is increasing:
– 60 percent of all major industrial disasters from 1921 to 1989 occurred after 1975 [30]. Bogard (1989) argued that 12 of the 19 major industrial accidents of the twentieth century involving 100 or more deaths occurred after 1950.
– If small-scale accidents (transportation, dams, structural collapses) are also included, the evidence is even more compelling.
– Example: in military aviation, the accident rate has declined, probably due to the emphasis on system safety.
– Past experience does not allow us to predict the future when the risk factors in the present and future differ from those in the past. Examining these changes will help us understand the problems we face.

1.3 Unique Risk Factors in Industrialized Society

• RISK = the likelihood of an accident combined with the severity of its potential consequences.
• Factors include: new hazards, increasing complexity, exposure, energy, automation, centralization, scale, and the pace of technological change in systems.

1.3.1 The Appearance of New Hazards

• Misuse and overuse of antibiotics have created resistant microbes.
• Children no longer work in mines or as chimney sweeps, but they are exposed to man-made chemicals and pesticides in their food and to increased environmental pollution [57].
• Atomic energy has increased the potential for death and injury from radiation exposure.

Redundancy (duplication of components to protect against individual failures)
– Not an effective means of controlling risks: it does not protect against hazards that arise from interactions among components in today's increasingly complex and interactive systems.
– May in fact increase complexity to the point where the redundancy itself contributes to accidents.

1.3.2 Increasing Complexity

• Perrow distinguishes between accidents caused by component failures and those he calls system accidents, which are caused by interactive complexity in the presence of tight coupling [259, Normal Accidents].
• High-technology systems are often made up of networks of closely related subsystems. Conditions leading to hazards emerge in the interfaces between subsystems, and disturbances progress from one component to another.
• As an example of this increasingly common type of complexity, modern petrochemical plants often combine several separate chemical processes into one continuous production process, without the intermediate storage that would de-couple the subsystems [274]. (The sketch after this list illustrates the de-coupling effect.)
• "Analyses of major industrial accidents invariably reveal highly complex sequences of events leading up to accidents, rather than single component failures." (p. 8)
• In the past, component failure was cited as the major factor in accidents; today, more accidents result from dangerous design characteristics and interactions among components [108]. (p. 8)
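The de-coupling role of intermediate storage, noted in the petrochemical example above, can be illustrated with a toy simulation. This sketch is not from Leveson's text; the function and all numbers are invented. It models an upstream unit feeding a downstream unit: with no buffer (tight coupling), even a brief upstream outage propagates downstream immediately, while a small stock of intermediate storage absorbs it.

```python
def run_plant(initial_buffer: int, outage_steps: set, total_steps: int = 10) -> list:
    """Return the steps at which the downstream unit is starved of feed.

    initial_buffer == 0 models tight coupling (no intermediate storage):
    any upstream outage halts the downstream unit on the same step.
    A larger buffer de-couples the units and absorbs short disturbances.
    """
    stock = initial_buffer
    starved = []
    for t in range(total_steps):
        if t not in outage_steps:
            stock += 1            # upstream delivers one batch this step
        if stock > 0:
            stock -= 1            # downstream consumes one batch
        else:
            starved.append(t)     # the disturbance has propagated downstream
    return starved

# Tightly coupled: a two-step upstream outage stops the downstream unit at once.
print(run_plant(initial_buffer=0, outage_steps={3, 4}))  # -> [3, 4]
# De-coupled by intermediate storage: the same outage is fully absorbed.
print(run_plant(initial_buffer=5, outage_steps={3, 4}))  # -> []
```

The structure, not the numbers, is the point: removing intermediate storage may be economical, but it tightens the coupling between subsystems and lets disturbances propagate from one to the next.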
Not only does functional complexity make the designer's task more difficult, but the complexity and scope of these projects require numerous people and teams to work together, and the anonymity of team projects dilutes individual responsibility [172].

• Paradox: people are willing to spend money on complexity but not on simplicity [158, Kletz]. WHY IS THIS THE CASE?

Case in point: a British chemical plant (p. 9)

• A pump and its various pipelines had several uses, including:
– transferring methanol from a road tanker to storage,
– charging it to the plant,
– moving recovered methanol back from the plant.
• On this particular occasion, a tank truck was being emptied:
– The pump had been started from the control panel but had been stopped by means of a local button. The next job was to transfer some methanol from storage to the plant.
– The computer set the valves, but because the pump had been stopped manually, it had to be started manually. "When the transfer was complete the computer told the pump to stop, but because it had been started manually it did not stop and a spillage occurred" [157, p. 225]. (p. 9)
– A simpler design -- independent pipelines for different functions, actually installed after the spill -- made such errors much less likely and was no more expensive over the lifetime of the equipment.

Computers

• may encourage the introduction of unnecessary and dangerous complexity;
• enable more interactive, tightly coupled, and error-prone designs to be built.
• Kletz has noted: "Programmable electronic systems have not introduced new forms of error, but by increasing the complexity of the processes that can be controlled, they have increased the scope for the introduction of conventional errors" [158].

Conclusion

• If we accept Perrow's argument that interactive complexity and coupling are a cause of serious accidents, then the introduction of computers to control dangerous systems may increase risk unless great care is taken to minimize complexity and coupling.

1.3.3 More People Exposed to Hazards

• Larger flight capacities.
• Dangerous plant facilities sited closer to population centers.
• More of the workforce in cities or within commuting distance.
• Interdependencies and complexity cause ripple effects that magnify the potential consequences of hazards.

1.3.4 High Energy Sources Increase Risks

1.3.5 Increasing Automation of Manual Operations

• Automation does not remove humans but tends to redefine their roles: operators become concerned with maintenance, repair, and higher-level supervisory control and decision making [270]. Operators are relegated to central control rooms.

Case in point: the 1977 New York City blackout

• The operator had only indirect information about the state of the system.
• The operator followed the prescribed procedures, but the electrical system was nevertheless brought to a complete halt.
• The operator could not know that there had been two relay failures:
– one causing high flow over a line that normally carries little or no current (a reading that by itself would have alerted the operator),
– the other blocking the flow over that line, so that its reading appeared normal.
• The operator had no way of knowing that the zero reading was in fact abnormal.
• Operators become the "scapegoats" of an automated system.

Operators and Embedded Systems

• Embedded systems can mask the occurrence and subsequent development of a problem.
• When a malfunction finally is discovered, it may be more difficult to control.
• Symptoms may be hidden or distorted.
• Such designs further limit operator options and hinder broad comprehension. (p. 11)

Case: China Airlines, 1985

• A 747 suffered a slow loss of power in its right outer engine.
• The autopilot compensated, preventing the yaw to the right that would have alerted the crew (sketched below).
• When the autopilot's compensation limit was reached, the crew had no time to determine the cause.
• The plane rolled and went into a vertical dive of 31,500 feet. The aircraft was severely damaged.
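The masking effect in the China Airlines case can be sketched with a toy control loop. This is an illustrative model with invented numbers, not flight data or anything from the book: an automatic controller exactly cancels a slowly growing disturbance until it exhausts its authority, so no symptom is visible during the entire onset of the failure.

```python
AUTHORITY_LIMIT = 10.0  # maximum correction the automation can apply (invented)

def visible_error(disturbance: float) -> float:
    """Error observable by the crew after automatic compensation.

    The automation cancels the disturbance exactly, up to its authority
    limit; a slowly growing fault therefore produces no visible symptom
    until the limit is reached -- and then the error grows unchecked.
    """
    correction = min(disturbance, AUTHORITY_LIMIT)
    return disturbance - correction

# A slowly worsening failure, e.g. a gradual loss of thrust on one side:
for step in range(16):
    disturbance = float(step)
    print(f"step {step:2d}: disturbance={disturbance:5.1f}  "
          f"visible error={visible_error(disturbance):5.1f}")
```

The shape of the output is the point: the visible error stays at zero throughout the development of the fault and then jumps, which is precisely when the least time remains to diagnose the cause.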
Multiple goals may lead to conflicts

• Attempts in the 1970s to combine energy-saving measures with process plants led to complications.
• Safety and economy often conflict.
• Component interactions make system functions less transparent to designers and operators.
• The place where trouble is first recognized may not be where it started.

1.3.7 Increasing Scale and Centralization

• Bupp and Derian (1968) observed that manufacturers were taking orders for nuclear power plants six times the size of those in operation.
• Previously, extrapolations of 1-2 times existing size were considered the outer boundary of acceptable risk.
• The Browns Ferry plant, site of a 1975 accident, was 10 times the size of the plants under construction in 1966.

Supertankers were built without sound design practice and without redundancy.

• Mostert writes: The gigantic scale of these vessels creates an abstract environment in which crews are far removed from direct experience of the sea's unforgiving qualities and potentially hostile environment. Heavy automation undermines much of the old-fashioned vigilance and induces engineers to lose their occupational instincts -- qualities that in earlier days of shipping were an invaluable safety factor.

1.3.8 Increasing Pace of Technological Change

• The average time to convert a technical discovery into a commercial product was 30 years between 1880 and 1919; it is now about 5 years.
• The number of new products is increasing exponentially.
• Dangerous new substances appear, with economic pressures preventing adequate testing.
• Learning by trial and error is not possible in modern times: design and testing procedures must be right from the start.
• Christopher Hinton, writing about nuclear power in 1957, pointed out that in other domains learning from failures was still possible.
• Progress continues at a torrid pace; new standards and regulatory procedures are necessary.

1.4 How Safe Is Safe Enough?

• The goal is to understand and manage risk in order to eliminate accidents or to reduce their consequences.
• Frola and Miller [88] claim that system safety investment has reduced losses where it has been applied rigorously, as in military and aerospace programs.
• Goals conflict: safety versus performance and other objectives.
– Example: an industrial robot arm that cannot be stopped easily once it is determined that it is about to hit something; a slower arm would be safer but less productive.
– Human interfaces designed for ease of use are often more troublesome from a safety standpoint.

Risk-Benefit Analysis

• Often viewed as the only rational way to make decisions about technology and risk.
• Requires the ability to:
– measure risk,
– choose an appropriate level of risk.
• Yet systems must be designed and built while knowledge of their risk is incomplete or nonexistent.
• Risk is impossible to measure before the system is built, and very low predicted failure rates (on the order of one failure in 10,000 years) cannot be validated: the usable body of experience needed to confirm them is missing.
• Risk assessment is hard to perform, and an acceptable level of risk is hard to determine.
– Case: the Ford Pinto gas tank.
• Optimal risk involves a tradeoff that minimizes the sum of all undesirable consequences.

Risky Systems and Perrow

• Perrow divides high-risk systems into three categories: