MKO Process Safety Journal
October 2020

Editor in Chief: Mark Rosenzweig, mrosenzweig@putman.net
Executive Editor: Stewart Behie, stewart_behie@exchange.tamu.edu
Associate Editor: Noor Quddus, nooralquddus@tamu.edu
Contributing Editor: Dirk Willard, dwillard@putman.net
Art Director: Jennifer Dakas, jdakas@putman.net
Production Manager: Daniel Lafleur, dlafleur@putman.net
Publisher: Brian Marz, bmarz@putman.net

MKO Process Safety Journal is published jointly by the Mary Kay O’Connor Process Safety Center, Texas A&M University, Jack E. Brown Chemical Engineering Building, 3122, 100 Spence St., College Station, TX 77843, and Putman Media, 1501 E. Woodfield Road, Suite 400N, Schaumburg, IL 60173. Copyright 2020, Mary Kay O’Connor Process Safety Center, Texas A&M University and Putman Media. All rights reserved. The contents of this publication may not be reproduced in whole or in part without the consent of the copyright owners.

CONTENTS

Understand the Importance of Process Safety Management Systems
Public acceptance and adequate plant safety require assessment of a number of issues and factors

Get the Best Out of Incident Data
Let machine learning and artificial intelligence do the heavy lifting

Effectively Share Insights from Incidents
A proven approach can ensure lessons get communicated and acted upon

Consider Industrial Detonations in Vapor Cloud Explosions
Risk assessments often overlook the issue but should not

Rethink Your Process Safety Procedures
Interdisciplinary approach supports workers, strengthens companies

Boost Process Plant Resilience
Measures can strengthen the ability to recover from an incident

Abandon-in-Place Must End
Leaving equipment derelict instead of demolishing it can prove costly

The Emerging Hydrogen Economy Demands Attention
Recent accidents underscore the need for inherently safer designs
Understand the Importance of Process Safety Management Systems
Public acceptance and adequate plant safety require assessment of a number of issues and factors
By Stewart Behie, Texas A&M University

INTRODUCTION
A recent article, “Guidance to Improve the Effectiveness of Process Safety Management Systems in Operating Facilities,” published in the Journal of Loss Prevention in Process Industries, summarizes the significant number of loss of process containment (LOPC) incidents that have occurred in Texas in 2019 and 2020. These incidents resulted in substantial damage to the plants in question, caused significant impact on the environment and damaged the reputations of the operating companies involved in the eyes of the public. The article identified a number of factors that led to these events and offered an overall approach for operating companies to improve the effectiveness of their process safety management systems to reduce the potential of future significant incidents occurring. A few fundamental questions that undoubtedly are on the minds of the public living in the affected regions are as follows:
• Are these plants safe?
• Am I safe to continue living near these plants?
• What systems are in place to prevent a similar accident from happening in the future?

This article examines the relevant issues and factors that need to be assessed to provide robust answers to the questions posed above and addresses other important questions that need to be considered in determining what constitutes a “safe” operation.

Answering the questions requires understanding the concepts of “hazards,” “risk,” “risk acceptability” and “safety.” Every operating plant is faced with managing a number of hazards related to the materials handled and the chemicals used to process feedstock into marketed products. They also must deal with hazards associated with the equipment and the conditions in the plant. How a facility manages these hazards and the associated risks will determine whether the operation is “safe.” To begin, let’s take a step back and examine some of the issues and factors, starting with defining several relevant terms.

DEFINING THE TERMS
Hazard: A chemical or physical condition that has the potential to cause damage to a receptor such as the public, plant personnel, equipment or the environment as well as company reputation. A hazard is a material property. For example, a knife’s hazards are the sharp point and the cutting edge, while toxicity is the hazard posed by a toxic gas. The knife and gas in and of themselves do not constitute hazards. The management of their properties will dictate the hazard.

Risk: The measure of potential damage a hazard or group of hazards can cause to a receptor, taking into account both the potential damage’s magnitude and the probability of its occurrence.

Risk = Consequence × Probability

Safety: A judgement of the acceptability of risks related to the hazards associated with a facility’s operation (in this context). A facility is considered safe only if remaining risks associated with its operation are acceptable. There are degrees of risk and hence degrees of safety.

Risk Acceptability: A criterion established by a company or industry or by the risk attitude or appetite of the risk receptors such as the people living in an operation’s vicinity.

A risk matrix often is used to assess risks on a qualitative basis by assigning a consequence level to an event scenario along with the estimated probability of occurrence. Figure 1 is an example of a 5 x 5 risk matrix that has five consequence levels ranging from negligible (1) to catastrophic (5) in consequence categories of “Public injury” and “Asset damage” and five probability levels ranging from very unlikely (A) to frequent (E). The probability level is determined by the number of layers of protection available. The greater the number of protection layers, the lower the probability.

RISK MATRIX
Figure 1. This matrix shows five levels of severity and the likelihood of their occurring.
Severity | Likelihood A | Likelihood B | Likelihood C | Likelihood D | Likelihood E
5 | 5A | 5B | 5C | 5D | 5E
4 | 4A | 4B | 4C | 4D | 4E
3 | 3A | 3B | 3C | 3D | 3E
2 | 2A | 2B | 2C | 2D | 2E
1 | 1A | 1B | 1C | 1D | 1E

Table 1 shows risk matrix consequence or severity levels in the two categories noted above. Table 2 is an example of risk matrix probability or likelihood categories. The risk level determined for each event scenario is at the intersection of the predicted consequence and the estimated probability. The predicted risk levels range from negligible (green), with no action required in terms of risk reduction, to unacceptable (red), which requires operations to be terminated and mandatory risk reduction measures to be implemented immediately. The intermediate risk level (yellow) is acceptable with precautions while the major risk level (orange) requires a detailed safety review to be undertaken and approved by management.

POTENTIAL IMPACT
Table 1. Example of Risk Matrix Consequence Levels
Rank | Category | Personnel Injury | Asset Damage
5 | Catastrophic | Fatality, permanent disabling injury, life-threatening events | Loss of equipment or significant damage to the equipment; equipment failure results in catastrophic loss of containment; severe fire or explosion
4 | Critical | Major exposure or severe injury requiring physician or hospitalization | Upset results in major leak, spill; minor equipment damage; minor fire event
3 | Moderate | Minor injury requiring medical treatment | Unplanned deviation requires equipment shutdown; minor leak or spill
2 | Marginal | Minor exposure or minor injury | Unplanned deviation requires intervention to correct
1 | Negligible | No expected effects | Minor upset

LIKELIHOOD
Table 2. Example of Risk Matrix Likelihood Levels
Rank | Category | Frequency definition | Definition based on number of barriers
E | Frequent | This event/upset reported more than once in a year or in the project lifecycle | No engineering barriers and/or only one administrative barrier
D | Likely | This event reported once in a year anywhere; event is likely to occur for new research | One preventive or mitigative engineering barrier; or combination of no more than two administrative barriers
C | Probable | This event reported once in the last 10 years anywhere; event is probable for new research | Only one preventive and one mitigative engineering barrier and at least one administrative barrier; or one preventive or mitigative engineering barrier and at least three administrative barriers
B | Remote | The event is remotely probable, but not reported anywhere | At least two preventive and one mitigative engineering barriers; or one preventive and one mitigative engineering barrier, good inherently safer design and at least one administrative barrier
A | Very unlikely | This event is very unlikely and not reported anywhere | At least two preventive and two mitigative engineering barriers; good inherently safer design and some administrative barriers

To be considered safe, a facility must identify all hazards associated with the materials handled and the equipment in operation, assess the risks associated with each hazard, implement protective measures to reduce the potential for LOPC upsets, implement mitigative measures to reduce the escalation of impacts of LOPC events should they occur, and monitor plant operations continuously to identify and react to incipient stages of upset conditions.

It should also be noted that the larger the potential consequence of an upset, the lower the acceptable probability. In addition, threats that cannot be sensed directly, such as radioactivity, are given a much higher risk factor than upsets that can be sensed, such as an explosion.
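The matrix lookup described in this section can be illustrated with a short sketch. It is only a minimal illustration: the cell labels (1A to 5E) and rank names come from Figure 1 and the two tables, but the numeric scoring and the score bands that map cells to colors are assumptions made for the example. The article does not say which cells fall into which risk level, and each company sets its own criteria.

```python
# Likelihood ranks from Table 2, ordered from very unlikely (A) to frequent (E).
LIKELIHOOD_RANKS = "ABCDE"

# Hypothetical score bands (assumption): a real matrix assigns a level to each
# cell according to company criteria, which the article does not list.
BANDS = [
    (15, "unacceptable (red)"),
    (8, "major (orange)"),
    (4, "intermediate (yellow)"),
    (1, "negligible (green)"),
]


def risk_cell(severity: int, likelihood: str) -> str:
    """Return the matrix cell label, e.g. severity 4 and likelihood 'C' give '4C'."""
    if severity not in range(1, 6):
        raise ValueError("severity must be 1 (negligible) to 5 (catastrophic)")
    if len(likelihood) != 1 or likelihood not in LIKELIHOOD_RANKS:
        raise ValueError("likelihood must be A (very unlikely) to E (frequent)")
    return f"{severity}{likelihood}"


def risk_level(severity: int, likelihood: str) -> str:
    """Classify a cell with an assumed score: severity rank x likelihood rank."""
    risk_cell(severity, likelihood)  # reuse the input validation
    score = severity * (LIKELIHOOD_RANKS.index(likelihood) + 1)
    for minimum, label in BANDS:
        if score >= minimum:
            return label
    raise AssertionError("unreachable: every valid score is at least 1")


if __name__ == "__main__":
    # Same severity before and after protection layers lower the likelihood.
    print(risk_cell(4, "D"), risk_level(4, "D"))
    print(risk_cell(4, "B"), risk_level(4, "B"))
```

The two calls at the end mirror the point made above: adding layers of protection moves a scenario to a lower likelihood column while its severity row stays the same.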
HAZARD EVALUATION AND RISK ASSESSMENT (DESIGN STAGE)
The evaluation of hazards and societal risks starts at the plant design stage. During a project’s concept development stage and preliminary engineering design stage, hazard identification studies are undertaken. The study results frame the nature of the anticipated plant equipment’s design. The potential impacts of upsets leading to LOPC determine the approach to managing these upsets in terms of safety systems design such as pressure relief systems, process control, emergency control and shutdown systems, etc.

Once the design has progressed to a stage in which engineering drawings of the process flow and the anticipated equipment are needed, process hazards analysis (PHA) studies are initiated, including failure modes and effects analysis (FMEA) and fault tree analysis (FTA). The most important PHA is the hazard and operability (HAZOP) study, which interrogates the design and identifies potential causes of deviations from the design intent. HAZOP studies are completed at the preliminary design stage and at the completion of the front-end engineering design (FEED) stage and are updated as the design progresses to the final stage ready for construction.

In addition to identifying causes of deviation from design intent, HAZOP studies should uncover hidden design flaws that need to be addressed through design changes and modifications and operating procedures. The HAZOP process is critical and must be facilitated by an engineer with the requisite experience in the process as well as in HAZOP studies and conducted by a team of experts in all fields relevant to the design. The bottom line is that changes in the plant design and associated control systems at the HAZOP stage (i.e., made on paper) avoid the implementation of costly changes identified and made during the construction or startup phases. The HAZOP process is time-consuming but extremely valuable and cannot be rushed.

An effective HAZOP process asks and answers the five questions of risk assessment shown in Figure 2, based on plausible event scenarios the assessment team identified for every part of the design (node). The component listed with each question is the corresponding term in risk assessment terminology. The combination of the responses to questions 2 and 3 provides the risk associated with the event scenario because, as noted above, risk is the product of the event scenario’s anticipated consequence and its estimated probability.

RISK MANAGEMENT COMPONENTS
Figure 2. These five risk assessment questions are part of an effective HAZOP process.
1. What can go wrong? (Hazard identification)
2. What are the adverse impacts or consequences? (Consequence analysis)
3. How likely is it to happen? (Frequency analysis)
4. Do I need to do anything about it? (Risk evaluation/risk assessment)
5. What should I do about it? (Risk control)

Several consequence categories are assessed during the risk assessment process, including personnel injury, offsite/public impacts, environmental impact, equipment damage and loss of revenue. A company-developed risk matrix is used to estimate the risk of each event scenario, the risk being the intersection of the predicted consequence and estimated probability. The risk matrix defines the action required should the predicted risk fall in the unacceptable range as defined by the company’s criteria. When the estimated risk level of an event scenario falls into the unacceptable range, the HAZOP team must identify additional control measures or layers of protection to reduce the predicted consequence so that the resultant risk level falls into the acceptable range. Most companies have a rigorous process for determining which protection measures are acceptable. An example of such a program developed by a large oil and gas company is shown in the references (Behie et al., 2016). The team documents the reassessed residual risk level with the proposed measures in place.

In addition to proposing prevention measures to reduce predicted risk levels, the HAZOP team identifies measures to mitigate the impact of an event scenario should it occur. These preventive and mitigative measures are recorded in the HAZOP worksheets.

An important step in the overall process is a third-party verification of the risk assessment studies to ensure that all components and risk aspects have been addressed. Even when HAZOP and FMEA members are well-regarded and have sufficient time for the analysis, there is solid evidence that hazards are overlooked. Also, there are examples showing that residual risk still can materialize.

An effective risk assessment process will ensure that the design as proposed is safe, which means that the risks associated with the possible event scenarios (i.e., upset conditions) fall within the overall acceptable range.

PREVENTION BARRIERS
Among the critical components of an overall plant design are the measures built in to function as barriers or protective measures to prevent the conditions that have the potential to lead to a LOPC event. The conditions that must be avoided in the operation of process vessels or piping systems include:
• overpressure or vacuum conditions, which can result from a variety of mechanisms
• excessive wall corrosion
• vehicle collision
• water hammer

The design features that prevent these conditions include:
• pressure relief systems on vessels and piping systems designed to relieve pressures at levels well below the maximum allowable working pressures
• corrosion-monitoring systems to monitor the rate of wall thickness degradation so action can be taken well before minimum wall thickness is reached
• corrosion coupons placed in areas of anticipated high corrosion to provide an indication of the rate of degradation, allowing corrective action to be planned

These measures are defined as layers of protection designed to keep processing fluids and chemicals from escaping from the process units and associated piping. In other words, these systems are designed to maintain mechanical integrity of plant systems. Should the systems fail to perform as designed or perform only partially, LOPC events can occur, resulting in the release of processing fluids (gases, liquids and chemicals) with potential impacts on personnel, equipment and the environment.

FIRE AND GAS DETECTION AND PROTECTION SYSTEMS
Should a LOPC occur, it is critically important that releases are detected immediately and warnings given by way of alarms to alert plant operators to the release. Undetected releases could escalate into a significant incident if corrective action is not taken immediately and effectively. Proper design of detection systems that provide these warnings is critical to maintaining safe operations. The main early-warning systems from a process safety perspective are the fire and gas detection systems, which are designed to detect flammable and toxic gas releases as well as releases of excessive heat (i.e., fire) indicating a potential upset condition in a process unit or piping.

The system responds by sounding an alarm to alert operators to a gas release or a potential fire and identifies the specific area of the plant where the event occurred. The system design includes a range of protection responses from alarm only to executive action that could initiate an automatic discharge of a water system such as a vessel deluge flood. The design of fire and gas detection systems is critical and involves expertise in the instrumentation as well as experience in system design and installation. The fire and gas detection and protection systems design is supported by sophisticated modeling to determine optimal placement of detectors as well as the number, type and voting logic to be used to initiate a response.

MITIGATION MEASURES
The plant emergency response team should go immediately to the site of the alarm to control the situation and mitigate the potential of the event escalating to a major incident. A number of incidents have occurred in which the detection system did not identify a gas release or a fire until the event escalated to the point that substantial impacts occurred. Other important detection systems that provide an indication of a trend to failure are condition monitoring systems and risk-based inspection programs.
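The voting logic mentioned for fire and gas detection can be pictured with a small sketch. Everything specific in it is an assumption for illustration: the article does not describe a particular voting scheme, so the example uses a simple "N detectors must agree" rule (two out of three is a common arrangement), a hypothetical alarm threshold, and three response tiers that echo the range from alarm only to executive action.

```python
def fire_and_gas_response(readings, alarm_threshold, votes_for_action=2):
    """Map detector readings for one plant area to a response tier.

    readings: gas readings from the detectors covering the area
    alarm_threshold: reading at or above which a detector votes "gas present"
        (hypothetical value chosen by the caller)
    votes_for_action: number of detectors that must agree before executive
        action, such as an automatic deluge, is initiated (assumed rule)
    """
    if votes_for_action < 1:
        raise ValueError("votes_for_action must be at least 1")
    votes = sum(1 for reading in readings if reading >= alarm_threshold)
    if votes >= votes_for_action:
        return "executive action"
    if votes >= 1:
        return "alarm only"
    return "normal"


if __name__ == "__main__":
    # Three detectors in one area; the threshold of 20.0 is illustrative only.
    print(fire_and_gas_response([5.0, 7.0, 6.0], 20.0))
    print(fire_and_gas_response([25.0, 7.0, 6.0], 20.0))
    print(fire_and_gas_response([25.0, 30.0, 6.0], 20.0))
```

Requiring agreement between detectors before executive action is a trade-off: it reduces spurious trips from a single faulty detector, while a single vote still raises an alarm so operators can investigate.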

PROJECT CONSTRUCTION AND STARTUP PHASE
This phase’s objective is to construct the plant as per the design depicted in the detailed drawings (be they PFDs, P&IDs, C&E, etc.) that reflect the modifications recommended during the detailed risk assessment process. In addition, the construction team ensures that all equipment and components are built and installed per the required engineering specifications and codes. Inspection teams visit equipment assembly sites to ensure that equipment and components are built to the design specifications and are coded/stamped as such.

During this phase, while the plant construction is going on, operations management assembles the operations team to start the planning for operations by setting up the various management systems needed to ensure robust and safe operations. These systems include the HSE management plan/system, the risk management plan, the process safety management system, the asset integrity plan, the equipment inspection plan, the emergency and response plan, etc. Managing the risks from construction to initial operations is discussed in detail elsewhere (Behie et al., 2008).

From a safety perspective, it is critical to have these systems set up and functional so that the operations risk assessment team can conduct the risk assessments required to determine the readiness of the equipment components and plant systems to be started up. This process also will help confirm that all design requirements have been met. Equipment components and operating systems are started up in the proper sequence to confirm that they meet the warranty requirements the equipment suppliers guarantee. Once the startup of all the systems is complete, the entire plant is brought online and performance specifications are confirmed.

OPERATIONS PHASE
Risks occur during the operations phase in many different ways: human failure, process control failures, wear-out of rotating equipment, heavy pipe vibration, unplanned large concentrations of people, etc. The management of risks and safety during a facility’s operational phase is best accomplished by establishing a comprehensive process safety management (PSM) program that incorporates and integrates all of the relevant programs such as risk management, mechanical integrity, inspection, testing and maintenance, emergency response, etc. This program focuses on “critical equipment.” Equipment that meets the criteria of providing protection against major LOPC events is identified by risk studies and usually includes the fire and gas detection systems that provide warning of release events. The asset integrity team, which includes the equipment inspection team, establishes the initial frequency of component testing and vessel inspection of critical plant equipment and adjusts the frequency based on results. Risk-based inspection and condition monitoring inspection programs offer advanced techniques to improve the performance of mechanical integrity programs and reduce the potential for LOPC events. These programs are critically important as plants age, requiring adjustments to be made in the frequency and intensity of monitoring the health of protective barriers.

CHALLENGES OF MAINTAINING AN EFFECTIVE PSM PROGRAM
Behie (2020) outlined a number of challenges to improving the effectiveness of PSM systems at operating plants in view of the significant incidents that have occurred in the southeastern U.S. over the past few years. These challenges include:
• Ensuring senior management is given adequate risk-based information on which to make decisions and ensuring that decisions are referred to the correct management level based on overall risk level
• Adjusting operations staff to effectively accommodate the dynamic changes in the workforce that have resulted in a substantial drop in experience levels, at the facility level in particular
• Maintaining effectiveness of process safety training and knowledge at all levels of the organization
• Adjusting to meet the needs of aging plants
• Ensuring programs are in place to monitor the health of protective barriers
• Improving all aspects of emergency preparedness, response and recovery plans
• Improving communications with external stakeholders

Of particular importance is the need to educate, train and raise process safety awareness among all levels of the organization. From frontline operators to the topmost senior management of the company, the importance of safety should be understood and made second nature. In that regard, targeted training focused on the level of the organization should be provided. Technical colleges still largely fail to provide sufficient process safety knowledge to their students, who may end up working in process facilities. Curriculum development focused on meeting the needs of such colleges, as well as universities, can help fill the gap. Bootcamps designed for executives can help senior management make more informed risk-based decisions.

The general public, with whom this article began, can be made risk-aware through good communication and education. Understanding the hazards of a chemical facility in their vicinity and the role of barriers in containing those hazards can impact public acceptance criteria.

THE ROLE OF GOVERNMENT REGULATION
In response to major incidents with significant impacts, governments around the world have instituted regulations that set out minimum standards for operating companies. Examples of the major incidents that have driven government oversight are:

EU Seveso Directives (1982, 1996 and 2012): These directives were implemented in response to major chemical plant incidents, the first of which was at the township of Seveso in Italy in 1976. The incident in Seveso resulted in devastating environmental impact on surrounding lands, killing livestock and wild animals, from a dispersed, highly toxic dioxin released from a runaway reaction at a nearby plant. An amendment to the Directive that came in 2003 also followed several incidents: a cyanide spill in the Danube river, a fireworks warehouse explosion that killed firefighters and nearby residents in the Netherlands and an ammonium nitrate detonation in France. The final Seveso III Directive came out in 2012.

United Kingdom HSE: The undertakings of the UK HSE primarily followed from 1974’s Flixborough incident in Britain, where the lack of process safety expertise and insufficient appreciation of the failure consequences led to an explosion that killed 28 people and injured 36. HSE ensures the implementation of its legislation Control of Major Accident Hazards (COMAH), which was derived from the EU Seveso Directive I. The performance-based regulation requires facilities to take responsibility for protecting their employees and the public by reducing their risk of operation to “as low as reasonably practicable” (ALARP). The Offshore Installations (Safety Case) Regulations were introduced in 1992 in response to the Piper Alpha disaster that resulted in 167 deaths from fires and explosions on a large offshore platform in the North Sea.

United States Occupational Safety and Health Administration (OSHA): Established in 1971, OSHA promulgated the PSM regulations in 1992 in response to a series of industrial accidents, beginning with the toxic methyl isocyanate (MIC) release from the Union Carbide facility in Bhopal, India in 1984. In 1985, Union Carbide had another release of toxic chemicals, this time within the United States at its West Virginia facility, that injured 135 people. Other incidents included the Phillips incident in Pasadena, Texas in 1989, which resulted in 23 fatalities and the complete destruction of a large chemical plant from several powerful explosions and associated fires after a massive LOPC of volatile liquids and flammable gases occurred during regular maintenance work.

The regulations that came about in response to these major incidents established minimum standards for industry to achieve from the perspective of programs related to safety and process safety of their operating facilities. These regulations, supported by regulatory audits of operating facilities, have been significant in driving safety performance. For operations that do not have the resources or expertise to implement their own corporate standards, compliance with regulatory requirements guides their operating performance. This overall approach is an outside-in paradigm, which by its very nature gives rise to mixed results.

The Voluntary Protection Programs (VPP) promoted by OSHA encourage the development of effective work-site safety and health programs through voluntary cooperation among management, labor and OSHA. Operations that can demonstrate these cooperative relationships in the workplace and have implemented a comprehensive safety and health management system can apply for VPP status. Approval into VPP is OSHA’s official recognition of the outstanding efforts of employees and employers who have achieved exemplary occupational health and safety. OSHA also promotes other programs such as the OSHA Strategic Partnerships and the Alliances Program to promote safe and healthy work environments through cooperative programs among OSHA, industry and other stakeholders.

United States Environmental Protection Agency (EPA): In the initial years of EPA, Superfund sites were established following the disastrous consequences of unregulated hazardous waste dumping in the community of Love Canal. In 1994, two years after the launch of OSHA’s PSM, EPA published its first List of Substances and Threshold Quantities and in 1996, it issued the Risk Management Plan (RMP) rule. While OSHA focused on employee safety within a facility through its PSM, the purpose of EPA’s RMP was to protect communities beyond a facility’s fence line. The rule required the facility owner or operator to conduct hazard assessment, including offsite consequence analysis; develop prevention programs that ran parallel to OSHA’s PSM; have emergency response programs; and submit a risk management plan to EPA.

United States Bureau of Safety and Environmental Enforcement (BSEE): Following the 2010 Deepwater Horizon incident, also known as the Macondo Disaster, the then Minerals Management Service (MMS) was broken down into the Bureau of Ocean Energy Management, Regulation and Enforcement (BOEMRE) and BSEE. BSEE enacted regulatory reforms to achieve both improved drilling safety and worker safety through the Safety and Environmental Management Systems (SEMS). The widescale environmental impact, along with the loss of 11 lives and the entire drilling rig at the Macondo well, had a significant impact on the entire oil and gas industry post 2010.

ROLE OF INDUSTRY
Most larger corporations, however, have developed internal standards and corporate performance requirements that go well beyond regulatory compliance. Examples of these integrated performance-driving standards include ExxonMobil’s Operations Integrity Management System (OIMS) program, DuPont’s Process Safety Management System and Chevron’s Operational Excellence Management System.
These programs integrate all of the components of process safety, mechanical integrity, integrity management, etc., into one overarching program that drives operations toward the goal of achieving operational excellence.

Responsible Care (RC) Initiative: In addition to the corporate standards and programs mentioned above, industry groups have developed standards to provide further guidance to their member companies to drive facility performance toward the long-term goal of operational excellence. Responsible Care is an example of such an industry initiative designed to drive improvements in the health and safety performance related to employees, the environment and the communities in which they operate. This initiative, which started in Canada in 1984, has grown into a global initiative now adopted in 68 countries worldwide, including by the American Chemistry Council, which has made participation in the RC program a condition of membership. The commitment to the guiding principles is an inside-out paradigm, which by its very nature drives overall performance from the C-suite to the operating floor. Member companies welcome the input of the communities in which they operate by holding open houses and proudly outlining the many programs in place to protect the health and welfare of their local communities. With programs like RC, industry is seeking to earn the right to continue to operate in its current areas through overall transparency.

OVERALL SAFETY OF OPERATING FACILITIES

Operating plants that have developed and maintained effective process safety programs are indeed safe to work in and live near. However, there are no guarantees unless proper process safety measures are taken and appreciated. While the probability of a LOPC event occurring and escalating into a significant incident with offsite impacts is extremely remote, things will go wrong unless adequate barriers are put in place to stop the trajectory of an initiating event. To gain a high level of confidence, members of the public are encouraged to form their own opinions by attending plant open houses and asking questions related to the safety of plant operations. Statistically, one of the safest places to be is in a well-managed and well-operated processing plant.

Process safety implementation that begins with the inception of a facility and continues throughout its life cycle can ensure that incidents such as those that occurred in Bhopal, Texas City and Macondo are not repeated. Operational discipline through commitment to safe operation, not only by frontline operators but by all levels of the organization, is of utmost importance. This can be achieved through targeted training and education on process safety.

The future of process safety is indeed promising. As companies move toward digitalization and data on plant operations becomes readily available, it will enable relevant parties to monitor the quality of the safety management system by following trends in data that can indicate imminent danger, termed leading indicators. Past incident data can be analyzed to learn and take proactive measures that prevent the onset of future incidents. Some companies already are on that route. It will give safety chiefs, for the first time, more indication than just gut feeling.

DR. STEWART BEHIE is the Interim Director of the Mary Kay O'Connor Process Safety Center and Professor of Practice at the Department of Chemical Engineering at Texas A&M University. He has more than 40 years of experience in the oil and gas industry in various roles related to process safety, risk assessment and management in North America and the Middle East. While with Dolphin Energy in Qatar, Dr.
Behie filled in as Chief Emergency Officer for a year, responsible for preparing the fire response and medical teams for full operations. His last role was HES and Safety Manager for onshore and offshore operations. He can be reached at stewart_behie@exchange.tamu.edu.

Get the Best Out of Incident Data
Let machine learning and artificial intelligence do the heavy lifting
By Noor Quddus, Mary Kay O'Connor Process Safety Center, Texas A&M University

Digitalization and automation of industrial activities have led to data playing a much bigger role than ever before. Buzzwords such as big data, machine learning, Internet of Things, digital twin, 5G and similar terms are commonplace without a clear understanding of how they are related to our work and how they will change the working environment in the future. The technologies behind the buzzwords not only are beneficial to computer science or information technology professionals but also provide a technological edge to all engineering professionals who understand and use them.

These terms have infiltrated the process safety and risk assessment disciplines in diverse ways. However, one trend has received a lot of interest among process safety professionals in both industry and academia recently: the analysis of process safety incident data to extract useful information and trends to generate valuable knowledge and actionable solutions. While the focus currently is on lagging indicators, the future focus needs to be on the development of leading indicators derived from facility safety management systems.

The ultimate goal is to understand what incident data is telling us about the safety of a process, an operation, a whole plant or an industry at large. We see efforts to uncover treasures from both very scant and very big datasets. The real challenge is to turn the data into actionable solutions using both human and artificial intelligence. Of course, incident data is generated in individual facilities, but a great deal of data is available in the public domain. To illustrate the point that I want to make, I'll focus on the latter.

PUBLIC DOMAIN INCIDENT DATA

A large amount of incident data is available in the public domain as collected and hosted by several federal agencies such as the Occupational Safety and Health Administration (OSHA), Pipeline and Hazardous Materials Safety Administration (PHMSA) and Bureau of Safety and Environmental Enforcement (BSEE). The National Response Center (NRC) database hosted by the United States Coast Guard also collects a huge amount of release data. A few other databases are maintained by industry bodies such as the Center for Chemical Process Safety (CCPS) and Center for Offshore Safety (COS); these are not publicly available and are accessible only to affiliated paid members. These databases contain incident records reported by operating companies as guided by the regulatory requirements or industry standard (e.g., API RP 754). Different reporting criteria, formats and data processing standards result in significant variation in the datasets. Another great source of incident data is the incident investigation reports available at the Chemical Safety and Hazard Investigation Board (CSB) and National Transportation Safety Board (NTSB) along with the other agencies mentioned above.

USING THE DATA

There is no doubt that the data is beneficial and provides valuable recommendations to improve process safety management in operating facilities, but we should be aware of the limitations of the data that currently is available. Extracting useful information from these colossal amounts of data is tedious and challenging. Because the datasets are not like each other, the techniques developed to analyze one dataset may not be directly applicable for another. In many cases, in-depth data has not been collected because of limitations of the predetermined reporting criteria. Nevertheless, these databases and collected data have one thing in common: They all are related to hazardous materials and hence to process safety. It is possible to translate the gathered data into information and convert it into knowledge using necessary domain expertise. However, we need appropriate computational capacity through advanced machine learning and artificial intelligence techniques to analyze these large sets of data, extract valuable information and convert it into useful knowledge and actionable solutions.

The greatest learning from incident data is identifying an incident's root cause and obtaining insight on which proactive measures will prevent future incidents. However, typical incident databases report and categorize the direct causes of incidents and sometimes provide hints of underlying causes. Interactions among the causes or factors that contributed to the incidents can be identified only from in-depth analysis or incident investigation reports. It is possible to walk back to identify the deviations of process variables, inadequate measures, human and organizational issues, all the way to policy, standards and regulatory pieces that contributed to the incidents. Such information is instrumental for incident prevention and mitigation measures, for adapting process safety management policies for the future and for helping us identify components of process safety management elements that need to be improved.

TURNING DATA INTO KNOWLEDGE

We often call it "learning from incidents" in the process safety literature. Sometimes these learnings can be complex but comprehensible. More important, there is abundant relevant information extracted from these data sources in various well-known resources. It all becomes knowledge when someone can connect the dots among different pieces of information and understand the underlying mechanism that led to similar incidents.

However, the challenge is that the relevant data, information and knowledge are not organized in an easy, accessible and structured manner. No formal and uniform structure or classification system has been developed that can guide us to distinguish and differentiate among data, information and knowledge. This overall situation is part of the reason why companies have a poor history of learning from past incidents, whether their own or from their industry sector.

Data-Information-Knowledge-Wisdom (DIKW), a hierarchical model, can be useful to explain some structural and functional relationships among data, information, knowledge and wisdom. From the perspective of incident data analysis, data represents gathered facts and figures relevant to material (e.g., flammability limit and amount of hazardous material), processes and equipment (e.g., pressure and temperature), personnel (e.g., experience and training) and organization and industry (e.g., standards and policy), to name a few. Any higher level of observation that establishes an association among the data or with the incident can be considered information. Typically such information is used to characterize most of the incidents. Examples include fire, toxic release, equipment failure, lack of training and inadequate standards. The next level of conclusions that can be drawn from such information is termed knowledge. A few examples of extracted knowledge are determining the overpressure of an explosion from incident data, understanding issues leading to a deflagration-to-detonation transition (DDT) event or failure from corrosion in a pipeline system, and understanding causes behind an operator's failure to adhere to a procedure or the safety culture maintained in a facility.
Information gathered from various sources needs to be processed and analyzed before a conclusion (i.e., knowledge) can be drawn. Sometimes this information is compromised by subjective observations and interpretation. This is one of the reasons that wide variations in gathered knowledge exist. It is relatively easy to make a distinction between data and information, but it is harder to do the same between information and different knowledge levels.

METHODOLOGY

Some knowledge can be explicit while some is tacit. It is important that we establish a methodology that allows one to convert information into knowledge in an objective manner:
• Causal inference, a process of drawing a conclusion about a causal connection between events based on the condition of effects, can be useful in establishing relationships among information at hand.
• Ladder of causality provides different levels of insights by converting data and information into knowledge in a more objective fashion. The model has three levels of causation: association, intervention and counterfactual.

Basic level. At the basic level, the associations invoke purely statistical relationships, defined by the data. For instance, from past incident data we know that the extent of vapor cloud congestion somehow plays an important role in DDT incidents, or that the pH value of a medium is important for a corrosion mechanism to propagate. Such associations can be inferred directly from the gathered information. However, these associations are purely statistical relations and do not necessarily provide any cause-and-effect relationship among the variables. Hence, these associations are placed at the bottom of the hierarchy.

Second level. The second level, intervention, ranks higher than association because it can answer what happens if there is any change in one variable. This level should be able to answer what happens (effect) if we change the level of congestion (cause). How does it affect the onset of DDT? If we increase the pH value (cause), how does it affect the corrosion rate (effect)? Answers to such questions may or may not come from incident data alone. More information may be needed, from external sources, to answer these questions. However, the past explosion incidents might have occurred at varying conditions, or the pipe failures may have taken place at different operating conditions. The models at an interventional level will not be able to answer any questions about conditions for which they were not developed.

Top level. The top level is called counterfactuals. A typical question in the counterfactual category is "What if I were to act differently," thus necessitating retrospective reasoning. What if the obstruction has a different geometrical shape? What if we use corrosion inhibitors? Counterfactuals are placed at the top of the hierarchy because they subsume interventional and associational questions. If we have a model that can answer counterfactual queries, we also can answer questions about interventions and associations. For example, if we have a robust corrosion model that is at the counterfactual level, it will be able to answer all mechanistic questions regarding corrosion even if some conditions change. For example, it will be able to determine the required cathodic protection or inhibition necessary to control the rate of corrosion. However, the models at the interventional or associational levels will not be able to answer such queries. Corrosion rate cannot be predicted simply by knowing that pH is playing an important role in a corrosion mechanism, which is associational-level knowledge. No counterfactual question involving retrospection can be answered from purely interventional information because the effect may depend on multiple causes, and changes in any of those causes will change the effect.

MIND VS. MACHINE

The human mind can understand immediately what is cause and what is effect; mathematics and computers cannot. On the other hand, our mind is limited in its ability to comprehend the multiplicity of causes and effects that exist in a system at one time, but computers can handle large quantities of operations fast. To resolve this situation and apply a computer's capability, our target should be to develop a datacentric model at the counterfactual level that can answer counterfactual queries.

Because ladder of causation models can be developed using a formal mathematical structure, it is possible to develop them in a more objective manner. Moreover, machine learning algorithms are available to scavenge and classify large datasets for information. However, currently it is only possible to develop such models for small systems such as DDT determination and corrosion rate measurement. Developing such a model is very challenging for a large complex system such as an entire chemical plant, oil refinery or offshore drilling rig.

MACHINE LEARNING TECHNIQUES

Machine learning and artificial intelligence techniques and tools can be very helpful when converting data into information and knowledge. They are beneficial for all three levels of the hierarchy of causal models. Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large datasets. It is particularly useful in reducing the number of variables to be used for further processing when a dataset has a large number of fields for each incident record.

Once the most relevant variables are identified, several other machine learning techniques can be used effectively for further model development, such as classification techniques (e.g., decision tree, random forest and support vector machine), clustering techniques (k-means and density-based) and self-learning (neural network). The latter, artificial neural networks, now come in many variations for different tasks, but in general they have excellent data-fitting capability and are suitable for pattern recognition. Some are more effective than others at identifying the relationships among the variables and supporting prediction, although they don't have much power outside the data range of interest. Artificial neural networks vary in ease of use, data screening requirements, robustness and accuracy, and it might not be wise to rank one over another in terms of effectiveness.

The major limitation of this class of techniques is that the models are useful only as long as they are used for the same system from which the data was collected. Consider a dataset collected for corrosion failure of a pipeline system, from which a model is developed with an excellent accuracy level. If the operator introduces a new inspection mechanism or replaces a reinforced old steel pipeline with plastic pipes, the model's prediction capability will deteriorate because no relevant data was used to train the model. A self-learning model will regain its effectiveness over time as it acquires the new data of the new system. In other words, these models do not get past the second rung of the ladder of causation, i.e., intervention, to being counterfactual.

Bayesian network. A solution to this can be the use of a Bayesian network (BN). This probabilistic graphical model represents the conditional dependency among all identified variables with a directed acyclic graph and as such forms a causal structure. It is acyclic because an effect cannot be its own cause, although pertinent feedback can be included. It applies Bayes' theorem, which formulates how prior statistical information can be updated with new information. A BN has the advantage of meeting the requirements of all three hierarchies of the ladder of causation.
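To make the Bayesian network idea concrete, here is a minimal sketch in plain Python. It is not the author's model: the three variables (congestion_high, fuel_reactive, ddt), the graph structure and every probability value are hypothetical numbers chosen only for illustration. The sketch factorizes the joint probability along a small directed acyclic graph and uses enumeration to update the prior belief about congestion once a DDT event has been observed, which is Bayes' theorem at work.

```python
from itertools import product

# Hypothetical priors for two root causes (illustrative values, not incident data).
P_CONGESTION_HIGH = 0.3
P_FUEL_REACTIVE = 0.2

# Hypothetical conditional probability table:
# P(ddt = True | congestion_high, fuel_reactive).
P_DDT = {
    (True, True): 0.40,
    (True, False): 0.10,
    (False, True): 0.05,
    (False, False): 0.01,
}

NAMES = ("congestion_high", "fuel_reactive", "ddt")


def joint(congestion_high, fuel_reactive, ddt):
    """Joint probability from the DAG factorization P(C) * P(F) * P(D | C, F)."""
    p_c = P_CONGESTION_HIGH if congestion_high else 1 - P_CONGESTION_HIGH
    p_f = P_FUEL_REACTIVE if fuel_reactive else 1 - P_FUEL_REACTIVE
    p_d = P_DDT[(congestion_high, fuel_reactive)]
    return p_c * p_f * (p_d if ddt else 1 - p_d)


def probability(query, evidence=None):
    """P(query | evidence), summing the joint over all states consistent with the evidence."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        state = dict(zip(NAMES, values))
        if any(state[name] != value for name, value in evidence.items()):
            continue  # state contradicts the observed evidence
        p = joint(**state)
        denominator += p
        if all(state[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


prior = probability({"congestion_high": True})
p_ddt = probability({"ddt": True})
posterior = probability({"congestion_high": True}, {"ddt": True})
print(f"P(high congestion)       = {prior:.3f}")
print(f"P(DDT)                   = {p_ddt:.4f}")
print(f"P(high congestion | DDT) = {posterior:.3f}")
```

With these made-up numbers, observing a DDT event raises the belief in high congestion from 0.30 to roughly 0.79. A real application would estimate the tables from incident data or expert judgment and would normally rely on a dedicated BN library rather than brute-force enumeration, which scales poorly beyond a handful of variables.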
Given data, the network can determine the probabilities of other dependent variables, hence effects that in turn can be causes for others, and it can take advantage of expert knowledge where data is absent. However, building a robust BN requires precise knowledge of a system's causation structure as in a fault tree or an event tree. Given data, though, algorithms are available to build the most probable network. Overall, it is safe to say that BNs have specific advantages over other machine learning techniques for both incident analysis and for learning from the incidents.

Natural language processing. Another relevant tool that is useful for learning from incident findings is natural language processing (NLP), in particular for analyzing incident descriptions and incident investigation reports. Very often, incidents are reported in a structured format, and if there is additional information that does not fit in that structure, then the only means to report it is through incident narratives. It is almost impossible to examine thousands of such incident narratives manually and extract useful information. NLP uses different types of machine learning algorithms to analyze and extract useful information that can be used for further analysis. NLP also can be used to analyze the incident investigation reports for thematic study. The data extracted by NLP then can be used to develop a counterfactual model via BN.

At the Mary Kay O'Connor Process Safety Center (MKOPSC), machine learning techniques have been implemented at all three levels of causation for various application areas. For predicting material properties and corrosion failure in pipelines, interventional models have been developed using algorithms such as random forest, support vector machines, decision tree, k-nearest neighbors and artificial neural network. BNs have been used to predict pipeline failure, microbial corrosion, offshore fire incidents and procedural complexity for developing leading indicators, performing fault diagnosis and identifying weak signals, to name a few. NLP has been used to extract information from both incident records and incident investigation reports.

Although there are only a few excellent incident databases and incident investigation resources, there still is scope for improvements in data collection methodologies and subsequent analysis techniques. We continue to observe similar incidents occurring repeatedly and wonder why we are not learning. The knowledge is gathered from the incident data, which is scattered and inaccessible to stakeholders when they need it. We need to ponder how we can build a system that will not only translate the data and information into knowledge, but also identify the required knowledge and deliver it at the right time in the right place to the right people. Perhaps we can call that wisdom. Will the future AI have that wisdom?

NOOR QUDDUS, PHD, is a research scientist at the Mary Kay O'Connor Process Safety Center, Texas A&M University. He can be reached at nooralquddus@tamu.edu.

Effectively Share Insights from Incidents
A proven approach can ensure lessons get communicated and acted upon
By Mike Bearrow, Rolls-Royce Controls and Data Services, and Kim Turner, consultant

We all are continually learning. Learning from our own mistakes as well as from the experiences of others makes us better prepared to deal with future events. Organizations are no different: successful ones use insights gained from mistakes, incidents, accidents and other undesirable occurrences to improve. Identifying those lessons you really must learn, regardless of their source, and putting into action a process to take advantage of the learning is the very definition of continuous improvement that can make your organization safer, smarter and more sustainable. The opportunities those lessons afford can play a key role in remaining competitive, effective, efficient and profitable.

Every site in your organization spots local opportunities. However, some ideas, e.g., ones related to operational improvement or process safety, that could have broader applicability often don't get entered into a corporate knowledge- or incident-management system because site staff don't appreciate their wider value to the organization.

Only a small percentage of opportunities identified at a facility will be important to its larger business unit. Even fewer will have enterprise-wide relevance (Figure 1). However, those that do must be managed in an efficient and effective way.

[Figure 1 diagram: sites grouped into regions within a global company. Labels: 85% of events lead to local learning and CAPA; 7% to BU learning and CAPA; 3% to global learning and CAPA. Note: "Most IMS value is at the site level where the risk is and the events happen..."]

BROADER APPLICABILITY
Figure 1. Events at sites lead to local learning and corrective or preventive action (CAPA) but some may apply more widely, to the business unit (BU) or entire enterprise.

An organization should look for more than internal ideas. It should check for opportunities uncovered elsewhere. Regulatory bodies often spell out general lessons from major events. So, review high-profile incident investigations and findings (e.g., on the explosion at BP's Texas City refinery, the Buncefield fuel-depot disaster in the U.K., and the Fukushima nuclear catastrophe in Japan) for opportunities that might apply to your organization. Changes in reporting requirements by the U.S. Environmental Protection Agency and Occupational Safety and Health Administration, and the U.K. Health and Safety Executive's Control of Major Accident Hazards (COMAH) regulations also mandate all facilities to react and adapt.

In addition, you must consider how your organization manages global opportunities today. Often, it involves casual or informal sharing. Unfortunately, this usually is inadequate because the appropriate people in your organization may not see or understand the opportunities. Companies are using different techniques today to address this tough problem, with mixed results. We recommend an approach called HUAA that has proven effective.

A MORE EFFECTIVE APPROACH

HUAA stands for Heard, Understood, Acknowledged and Actioned. It consists of the following core steps:
1. Identifying opportunities (Heard);
2. Entering them into an electronic system;
3. Having experts review them (Understood);
4. Accepting or rejecting them (Acknowledged); and
5. Assigning each to leadership to resolve and track to closure (Actioned).

A good HUAA process provides visibility to leadership about the opportunities being identified, and reports on the progress in real time (Figure 2). The approach offers a number of other significant benefits:
• Efficient collection. Important opportunities from many sources (internal and external) are collected and entered into the HUAA management system, streamlining the recording process and getting the opportunities quickly to decision-makers.
• Expert review. These opportunities are vetted, parsed and resolved for inclusion or exclusion, with the decision documented and communicated.
• Accountable assignment. Accepted opportunities are systematically assigned to leaders and managed across the enterprise at the executive level.
• Action. Leaders are held accountable for their actions. Moreover, results of their actions as well as inaction can be seen in real time, which spurs action.
• Results. The organization will gain competitive advantage by active listening, learning from its experience as well as the experience of others, and by taking action. The results of the process can be audited, judged and continually improved.

The HUAA process can underpin an organization's knowledge-management strategy and provide the most important part: action!

[Figure 2 diagram: External and internal opportunities (global concerns, incident learnings, CSB findings, API standards, audit best practice) flow through four stages. Heard: opportunity collected and submitted to the HUAA system. Understood: analysis of the opportunity, root causes, lessons learned, review of applicability. Acknowledged: acceptance or rejection of the opportunity into the HUAA system. Action: prioritize the opportunity, assign and report, measure performance. Outputs: feedback to initiator; action for senior management; communication and reporting along the HUAA continuum.]

HUAA APPROACH
Figure 2. This formal process involves several distinct steps and requires both feedback to the idea initiator and action.

HUAA STEPS

The management system is a blend of people, technology and process. All the pieces must fit together and work efficiently. Probably the most important design element is the visible accountability of everyone involved, which allows leadership to constantly review progress and ask questions about global opportunities and their closure. Let's take a deeper look at each of the HUAA steps.

Heard. First, opportunities must be heard. That means key staff members (idea generators) must be on the lookout for opportunities at all times in all places. They must be active attendees at events that might produce an opportunity.
For instance, a key mechanical engineer might be charged with going to meetings of the American Society of Mechanical Engineers and even participating in groups developing industry guidance. A process safety management (PSM) specialist might attend meetings of AIChE's Center for Chemical Process Safety and the Mary Kay O'Connor Process Safety Center International Symposium and even volunteer to chair a working group. This PSM person also must keep up-to-date on, e.g., safety incidents occurring in the industry as well as changes to PSM and risk-management-plan rules, suggested or on the horizon.

After identifying a potential opportunity, the idea generator enters it into the HUAA management system. This involves explaining what the opportunity is and why it's important to the organization (upside and downside). The person should be gathering opportunities and entering them all the time.

Understood. Once an idea is captured, the HUAA process must initiate a quality review by the right person at the right time. Inaction should generate escalation so ideas don't await review for too long.

The reviewer acts as gatekeeper, evaluating the idea for applicability (understanding), and either rejecting or accepting it into the system. The person also provides a value and priority for any accepted idea. The next step is the acknowledgment of the idea.

Acknowledged. The gatekeeper communicates the decision to the initiator and documents it in the system. If the idea is rejected, recording closure comments and sending these to the initiator closes the communication loop: the initiator has spent time and energy entering the idea in the first place and believes it has merit, and so deserves such feedback. Documenting the reason for rejection is a great way of monitoring involvement and the quality of the process.

The gatekeeper then selects a leader, like a plant manager, to be responsible for the accepted idea. Assigning a senior manager is an essential part of this process. The individual must be someone with the resources, both human and monetary, and influence to address the opportunity. More importantly, the person must be accountable to see that these important opportunities are handled appropriately.

The gatekeeper also must ensure the idea is assigned a priority (importance) that takes into account severity, frequency, etc. Setting that priority simply may involve checking a box for big, medium or small, or may leverage the corporate risk-ranking matrix with severity and frequency. Severity must consider safety, environment, reputation, assets, etc., to properly compare one opportunity to another. Ideally, both current and future risk should be identified for each opportunity.

Up to this point, there's been motion but no action. Knowledge or wisdom without action is wasted.

Actioned. This step is where we reap the rewards of the HUAA process. Creating an action plan to address the opportunity, assigning actions and tracking those actions to ensure everything is achieved on time and to quality allow realizing the opportunity quickly, effectively and efficiently.

The responsible person can develop an action plan on how to address the assigned idea in as much detail as desired, and can make as many subordinates as needed responsible for specific actions. The person must set firm deadlines for all actions to underscore that things must get done. Monitoring progress becomes easy, as does confirming the value is realized.

Continual monitoring ensures the opportunity remains satisfied. (Maybe we should add another "A" for auditing to the HUAA process; that would make it HUAAA!) The audit step covers two angles: ensuring changes continue to be embedded into the operation of a facility; and understanding who is making recommendations for opportunities, their quality, and acceptance and rejection rate. This allows a holistic view not only of the volume but also of the quality of opportunities and reviews. It can assist in identifying any weaknesses in your HUAA process. For instance:
• Are opportunities being documented effectively to allow the reviewer to understand?
• Are the reviewers rejecting opportunities because of a weakness in their knowledge of a specific area?
• Are there people who you would have expected to enter opportunities who never have?
• Are there people who consistently miss their deadlines for actions?

All these findings can help make your HUAA process even more effective.

A good HUAA process relies on people. People identify, review and implement the opportunities. You must choose the right listeners and properly motivate them. If they are too busy, not interested, not experienced enough, too experienced or lack an innovative spirit, the process will flounder. People without adequate experience, education or curiosity won't spot opportunities. Those with the right credentials must be vigilant and feed the HUAA process as if the organization's future depends upon it. It may!

The gatekeepers filtering and accepting/rejecting opportunities must be exceptional people who can be trusted to separate the wheat from the chaff. They should be held accountable for inclusion or exclusion. Likewise, it's vital to hold members of the leadership team personally accountable for disposition of global opportunities assigned to them. Measurement by corporate-level executives and board members is essential to ensure you get value from the HUAA process. What gets measured really does get done.

AVOID CONFUSION
Many people use the terms data, information, knowledge, wisdom and ideas interchangeably but they have very different meanings.
• Data are numbers on a spreadsheet, maybe without context or units of measure.
• Information has context like units of measure. Data are turned into information by organizing them so one can easily draw conclusions. Data and information deal with the past.
• Knowledge has the complexity of experience, which comes about by seeing it from different perspectives. Information is static, knowledge is dynamic. Knowledge deals with the present.
• Wisdom is the ultimate level of understanding (Figure 3). We can share our experiences that create the building blocks for wisdom. However, imparting wisdom involves more than just such sharing; it requires putting knowledge into the personal context of the audience.
• Ideas are thoughts or suggestions as to a possible course of action. We refer to ideas in this discussion as "opportunities;" opportunities can be information, knowledge or wisdom.

[Figure 3 diagram: pyramid with Data at the base, then Information, Knowledge and Wisdom at the top.]

REACHING THE PINNACLE
Figure 3. Data provide the foundation but putting knowledge into appropriate context is the key to gaining wisdom.

THE KEYS TO SUCCESS

To have a successful HUAA process, you must:
• Have the right automation for the job. It must be easy to use, intuitive, follow good business processes, enable reporting and auditing, and engage all users in the process.
• Empower the right people to be able to identify opportunities. Curiosity, innovation and a desire to improve, along with being given the time and space to go hunting for opportunities, are key characteristics. Efficient identification and collection of opportunities is the foundation of HUAA.
• Empower the right people to make the decisions on where the value lies in the opportunities identified. For example, a mechanical engineer reviews a new American Petroleum Institute recommended practice on mechanical integrity, while a PSM expert evaluates opportunities arising from findings of a U.S. Chemical Safety Board incident report. Properly prioritizing the opportunities means your business is kept safe, productive and profitable.
• Empower the right people to make the decision on how to implement the opportunity to get the best value. Using the people who best know the particular aspect impacted and have the authority to take action to manage the process is the most efficient way to make the opportunity a reality.
• Make everyone in the HUAA process responsible for his or her part in it. Responsibility breeds interest, involvement and commitment.
• Measure and continually improve. Getting better and better will allow you to identify global learnings, concerns and opportunities.

A HUAA opportunity-management process helps ensure that vital initiatives and important opportunities are consistently and systematically identified, evaluated and prioritized. Imagine how much more efficiently and effectively an organization would run if we were taking advantage of all the knowledge around us.

If properly designed and implemented, a HUAA management system will ensure your overall organization's risk profile is known, visible and manageable at the lowest level possible. Not having a HUAA management system may be the most expensive mistake you ever make.

MIKE BEARROW, PE, is principal consultant, process safety management, for Rolls-Royce Controls and Data Services, Houston. KIM TURNER is a consultant based in Nottingham, U.K. E-mail them at Michael.Bearrow@controlsdata.com and KimnTurner@hotmail.com.

CONTINUING EDUCATION
The Mary Kay O'Connor Process Safety Center offers continuing education courses year-round both online and in Houston. The continuing education classes are taught by experienced engineers with years of industrial, chemical, research, and process safety knowledge.
The Center strives to deliver the courses and topics that are important and vital to the ever-changing environment and industrial audiences. These courses can be taken for continuing education credit and can be applied toward the Safety Practice Certificate.

PROCESS SAFETY PRACTICE CERTIFICATE FOR INDUSTRY
The Process Safety Practice Certificate is a program that allows engineers in industry to gain greater knowledge in process safety. The certificate requires 125 Professional Development Hours (PDHs) for completion within a three-year timeframe.

COST OF CERTIFICATE
The approximate cost to complete the certificate is $5,400-$6,470.
- 1 day course: $495 (7 PDHs)
- 2 day course: $990 (14 PDHs)
- 3 day course: $1,485 (21 PDHs)
- Semester long SENG Courses: $1,800 (42 PDHs)

ONLINE COURSES
Existing online courses are available now
- SENG 655 Process Safety Engineering
- SENG 660 Quantitative Risk Assessment
- SENG 674 System Safety Engineering
- SENG 670 Industrial Safety Engineering
- SENG 677 Fire Protection Engineering
In-person courses are being offered as online courses. New courses are being included in the program.
For questions email us at: mkopsc@tamu.edu
Apply here: tx.ag/MKOpspcert
Visit psc.tamu.edu for more information.

Consider Industrial Detonations in Vapor Cloud Explosions
Risk assessments often overlook the issue but should not
By Cassio B. Ahumada, Texas A&M University

Unwanted detonations are powerful events with enormous destructive potential if not controlled adequately. The recent explosion of approximately 2,700 tons of ammonium nitrate at Beirut’s Port, which shook the entire city and claimed more than 200 lives, reminded us how dangerous uncontrolled detonations can be and why we must ensure these types of hazards are managed effectively.

Although detonations can originate through different means, a common route is through flame propagation in gaseous mixtures. This phenomenon, known as deflagration-to-detonation transition (DDT), occurs when the right conditions are in place to create and sustain high-speed flame fronts. Gaseous detonations are intrinsically different from solid-phase ignition, such as the Beirut explosion, because the energy is more dispersed throughout flammable gas clouds. However, this reduction in energy density does not make gaseous detonation less dangerous.

VAPOR CLOUD EXPLOSIONS
In the last decades, we have seen devastating damage from industrial vapor cloud explosions (VCEs) resulting from releases of explosive mixtures in processing facilities. A well-known example in the process safety community is the series of explosions at a hydrocarbon storage facility on December 11, 2005, in Buncefield, United Kingdom, after a gasoline storage tank overflowed. Even though multiple blast events occurred during the Buncefield incident, the first and most energetic one involved a flame front propagating more than 100 m that created high-pressure loads destroying onsite buildings, vehicles and equipment. Post-incident investigations later hypothesized this first explosion to be a detonation wave.

Industrial detonations from VCEs often are overlooked during risk assessment reports because they are believed to be possible only with more energetic substances. However, the latest VCE research programs have challenged this assumption and provided scientific-based evidence showing that DDT in industrial vapor clouds is more common than previously believed and involves a range of materials. Besides, it is known that confinement and congestion enhance flame acceleration, creating conditions favorable to detonation onset. Such conditions are challenging to avoid in industrial plants given space constraints on equipment arrangement, especially in offshore units that result in congestion.

Evaluating detonation hazards in process plants is crucial for several reasons. In the perspective of industrial safety, DDTs are catastrophic events with consequences that can go beyond the facility boundaries. Therefore, evaluating detonation hazards in process plants becomes crucial to facility siting and land use planning, explosion prevention and mitigation actions as well as effective emergency response plans. These components are pillars of good hazard identification and risk management plans.

RESEARCH UPDATES
Current research in the area of detonations at the Mary Kay O’Connor Process Safety Center (MKOPSC) at Texas A&M University is focused on understanding how the facility layout affects flame acceleration and, ultimately, the DDT process. More important, our goal is to identify potential layout modifications or mitigative measures that can be employed to achieve inherently safer facilities that avoid DDT impacts. We also are searching for VCE assessment methods that reliably quantify DDT likelihood to support site layout decisions.

For example, in a joint project conducted with Gexcon in 2014, researchers from MKOPSC expanded the application of the computational fluid dynamics (CFD) commercial software FLACS to include DDT prediction for various fuels, including hydrogen, ethylene, propane and natural gas. The authors validated their proposed methodology using experimental data from multiple scenarios, ranging from confined lab-scale setups to semiconfined large-scale geometries.

A more recent study published this year reviewed easy-to-use empirical VCE correlations and evaluated their ability to assess the likelihood of DDT and fast deflagrations for less energetic fuels such as propane and methane in large, unconfined structures.
The six models analyzed included the TNO Multi-Energy method, the Baker-Strehlow-Tang (BST) method and Shell’s Congestion Assessment Method (CAM). The review demonstrated that simplified VCE methodologies can be applied to indicate DDT in scales relevant for industrial applications with relatively good accuracy after minor model modifications. Both studies demonstrate initial progress toward including detonation hazards in industrial risk assessments.

Overall, the steps necessary for estimating DDT likelihood based on empirical correlations can be summarized as follows:
1. Define the project’s scope. At this initial stage, the safety professional should identify and establish the physical boundaries of the process under investigation.
2. Collect relevant process information, including chemical inventory, equipment list, process variables (temperature and pressure), weather conditions, etc.
3. Identify potential release sources and perform dispersion studies to estimate the dimensions of flammable clouds formed.
4. Separate congested areas based on their degree of confinement and equipment density. In this step, it is essential to highlight regions that could result in flame acceleration should a flammable cloud be formed nearby.
5. For each explosion scenario identified, estimate the maximum flame speed and/or overpressure generation applying at least two empirical VCE models for comparison purposes.
6. Subsequently, compare the predicted outcomes obtained from the VCE modeling with DDT criteria defined by the fuel type.
7. Finally, assess onsite and offsite consequences applying pressure load response curves and estimate the event severity.
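Steps 5 and 6 above lend themselves to a simple screening calculation. The Python sketch below is only an illustration of that comparison, not the method used in the studies cited: the function and class names are hypothetical, and the threshold values are placeholders chosen for the example. Real DDT criteria must be taken from the literature or test data for each fuel, and real flame speeds must come from the VCE models themselves.

```python
from dataclasses import dataclass

# ASSUMPTION: placeholder thresholds (m/s) for illustration only.
# They are NOT published DDT criteria for these fuels.
ILLUSTRATIVE_DDT_FLAME_SPEED_M_S = {"propane": 600.0, "hydrogen": 500.0}


@dataclass
class ModelEstimate:
    """Maximum flame speed predicted by one empirical VCE model."""
    model: str              # e.g. "TNO Multi-Energy", "BST", "CAM"
    flame_speed_m_s: float


def screen_ddt(fuel, estimates):
    """Compare each model's predicted flame speed with the fuel's criterion."""
    # Step 5 asks for at least two models so results can be compared.
    if len(estimates) < 2:
        raise ValueError("use at least two empirical VCE models for comparison")
    threshold = ILLUSTRATIVE_DDT_FLAME_SPEED_M_S[fuel]
    # Step 6: flag every model whose prediction meets or exceeds the criterion.
    flagged = [e.model for e in estimates if e.flame_speed_m_s >= threshold]
    if len(flagged) == len(estimates):
        verdict = "DDT credible"
    elif flagged:
        verdict = "models disagree; review"
    else:
        verdict = "DDT not indicated"
    return {
        "fuel": fuel,
        "threshold_m_s": threshold,
        "flagged_by": flagged,
        "verdict": verdict,
    }


if __name__ == "__main__":
    result = screen_ddt(
        "propane",
        [ModelEstimate("BST", 650.0), ModelEstimate("CAM", 400.0)],
    )
    print(result["verdict"], result["flagged_by"])
```

Treating disagreement between models as its own outcome is a deliberate choice in this sketch: it sends the scenario back for closer review rather than averaging away a prediction that exceeds the criterion.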
To conclude, VCE research programs and post-incident investigations have demonstrated that industrial detonations can occur in processing facilities if the right conditions prevail. Failing to account for detonation hazards from flammable gas releases may substantially underpredict the explosion severity at a particular site, which in turn underestimates the risks of event escalation and critical building damage. Therefore, identifying the potential for DDT is crucial for effectively managing explosion risks in industrial plants and reducing the potential occurrence of such catastrophic events.

CASSIO BRUNORO AHUMADA is a doctoral candidate in the Chemical Engineering Department at Texas A&M University. His research investigates how the congestion pattern variation affects the deflagration-to-detonation transition (DDT) mechanism on flammable gaseous mixtures. He is also involved in many safety-related projects, including facility risk assessments, facility siting, and vapor cloud explosion modeling studies. He can be reached at cassioahumada@tamu.edu.

Rethink Your Process Safety Procedures
Interdisciplinary approach supports workers, strengthens companies
By S. Camille Peres, Texas A&M University

In 2014, a unique and special thing happened: Industry users of procedures and a procedure technology company approached Texas A&M University’s Mary Kay O’Connor Process Safety Center (MKOPSC) to conduct research regarding procedure design. This was motivated by the number of incidents that occurred as a result of procedural deviations. Many of them involved significant loss of process containment and were publicly visible (e.g., Macondo, BP Texas City and the Formosa Plastics vinyl chloride explosion).

This ultimately was the beginning of the Next Generation Advanced Procedures (NGAP; https://advancedprocedures.tamu.edu/) consortium, a collaborative effort between academia and industry to identify issues related to procedural systems that contribute to system failures. NGAP continues to this day and has made surprising discoveries about procedures and procedural systems. This article summarizes some of our findings to date and shares our current concerns for the industry moving forward.

HUMAN FACTORS AND THE PROCESS INDUSTRY
One of the initial charges to NGAP was to “human factor” the current procedure designs, guidelines and frameworks. Although using this term as a verb in some industries is common, most human factors professionals do not use it in this manner. Our consortium needed to better comprehend how this industry understands the term. We learned that many, if not most, in the process industry domain understood human factors as a list of issues associated with human error, such as fatigue, task complexity and quality of the interface. As Human Factors (capital “H” capital “F”) professionals, we consider these a list of “performance shaping factors” from the human reliability domain (e.g., SPAR-H [1], HFACS [2]).

Human factors (HF) is the scientific and professional domain that applies methods and theories regarding humans’ capabilities and constraints to improve the efficiency, effectiveness and safety of their performance. The approach of many HF professionals is to identify how the system’s design is more or less likely to support the desired behavior. The approach is not “how to fix the humans so they are less likely to make mistakes” (e.g., follow procedures). If people reliably make mistakes when using a tool designed for a particular task, then the error is with the tool’s design, not with the human using the tool. Therefore, we do not focus on reducing human error. We focus on supporting human performance.

Although this difference may seem semantic and trivial, it puts the focus on the system’s attributes that result in humans making reliable and predictable errors. This focus shifts the overall responsibilities to the system’s designers and managers to improve them to support the user’s work. This systems approach is associated with highly reliable organizations and requires a thorough understanding of not only the identified “problem” (here, the procedure) but also the users, the tasks and the contexts in which this problem occurs.

PAST FINDINGS
As with any endeavor, our efforts built on previous research that identified some consistent issues associated with procedural systems. For several incidents, the availability of procedures was a primary issue. However, other studies found additional common problems with procedures: they were incorrect or out of date; difficult to read (too technical or wordy); difficult to access; or generally of poor quality, which made them difficult to use (frequent spelling, grammar and punctuation errors). In many of our investigations we have confirmed these issues. However, we have discovered more issues that may decrease the likelihood that written procedures will support workers’ performance when they most need it.

NGAP MAJOR FINDINGS
Our findings are based on studies using multiple research methods, including interview analysis, literature reviews and controlled experiments in both a lab environment and a field training environment (we went to Shell’s Robert training facility in Robert, Louisiana). Here, we will articulate a summary of the findings but not necessarily discuss the specifics of the methods used to identify those findings. The interested reader can refer to the list of publications on our website or our webinar series for more details on these studies.

Finding 1: Guidance often is not based on empirical findings. Our first effort was to identify the guidance currently available for procedure writers and the current standards and regulations regarding procedures. We learned that most of that guidance was based on historical practice and found little to no published empirical evidence to support the guidelines in most of the regulations and standards. Further, writers’ guides, books and other guidelines regarding procedure writing and procedure system development provided little evidence. When some was provided, it often was based only on “years of experience.” Certainly, experience is a valuable teacher and without question many of the guidelines within these documents are very good. However, as will be seen in some of the findings below regarding hazard statements, some seemingly intuitively obvious designs can result in behavior antithetical to what the procedure writer desired. It is important to have empirical research conducted for some of the most critical aspects of the procedure design to ensure that the designs support the desired behavior.

Finding 2: Units within facilities differed remarkably on the health of their procedural systems. During studies at which we were onsite at multiple facilities, we found that within the same facilities, it was common for units to differ in their attitudes toward and reported use of procedures. Workers in the units that were more positive about procedures generally held a higher opinion of them and reported using them more regularly, as recommended by management systems. These workers also seemed to have ownership of the procedures’ content, and when they requested changes or turned in redlined procedures (a procedure that has been marked up as needing changes, typically with a red pen), they reported receiving feedback on them quickly (within days or weeks). One experienced worker in this type of unit shared a challenge. As the unit became more engaged with procedures, its workers asked more questions to clarify attributes of the task and procedure. Although this seems fine on the face of it, for the diminishing number of experienced workers in the field, this means they get asked questions regularly and thus often are distracted from their own tasks.

Workers in units that were more negative about procedure use often reported the procedures did not help them much in performing their tasks and viewed them more as a control mechanism for management. They were more likely to report not using them as recommended, and when they submitted redlined procedures, they said that if they ever received feedback (because often they would not), it could be months or even years before that happened. This observation of units having remarkably different attitudes toward procedures has been a pervasive finding in multiple facilities around the world. Thus, we think it does not reflect an idiosyncratic facility that has a different management method.

Finding 3: Reports regarding procedure deviation and use caused concern. In interviews with workers, we found issues related to management and safety climate that were extremely concerning. When we asked them why they or another worker would deviate from a procedure, several reported that it was not uncommon to be in situations in which their direct supervisor would either overtly or indirectly instruct them to deviate from the procedure because of time pressures. When asked about possible benefits of using digital procedures, many workers reported they might help them stick to the procedure in time pressure situations because deviations would be documented and the supervisor could not “get around that.”

Another concerning perspective was that workers always would follow procedures even if they knew they were incorrect because following the written procedure exactly would protect them from liability if something went wrong. The focus for these workers in their organizations was about protecting their jobs. Ideally, an organization should support workers’ thinking critically regarding how to perform their tasks effectively, efficiently and safely. When workers are focused exclusively on “watching their backs,” it can put the organization’s productivity and safety at risk.

Finding 4: Procedure cannot be one size fits all. One of the clear findings we have made in both experimental and observational studies is that procedure use and performance differ by experience level. Many in the process industry who write and use procedures are familiar with the dilemma of how much content is enough. The experienced workers want less content and more parsimonious, bulleted-style steps, while the less experienced workers want more content to facilitate their understanding of the task itself and how that task relates to the entire system. From our research we have found that this is not simply a preference for these two groups; having different amounts of content affects their performance. For those organizations still using paper procedures, this is not something that can be accommodated because effectively managing multiple versions of the same procedure is a recipe for a procedural system failure. However, for those organizations that are adopting digitally based procedures, many vendors can develop procedures with the content presented in different formats for more or less experienced workers.

SAFETY NOTIFICATIONS
Figure 1. These two designs (labeled “Good Design” and “Bad Design”) were used in an eye-tracking study; both carry the statement “Electrical shock hazard. Can cause serious bodily injury or death. Power off the equipment to prevent electrocution.” The top one is the design that had the longest gaze duration and was associated with the best performance.

Finding 5: Written procedures likely never will be perfect. Although many organizations have as a goal to develop “perfect” procedures that need to be reviewed or updated only every three to four years, what we have seen and heard from workers is that this likely is not a realistic goal for many situations. Many process industries are highly complex socio-technical systems and constantly are undergoing subtle changes that can create the need for changes in the procedures. For instance, upgrading or replacing a pump, identifying a more efficient method of performing a task or integrating a new monitoring system all may require procedure changes to ensure clarity and accuracy.

One of the major challenges many organizations face is having sufficient resources for ongoing procedure revision processes. We have identified two major organizational components needed to maintain a healthy procedural system:
• A robust and efficient procedure revision process to support safe, effective and efficient operations, which will increase the likelihood that procedures are correct. Incorrect procedures historically have been one of the biggest worker complaints regarding procedural systems.
• Investments into the procedure change process, which may also impact or be a reflection of the safety climate in facilities. When workers see that the organization is not only putting effort into having correct procedures but also into incorporating their feedback to update those procedures, this may facilitate the workers’ ownership and use of the procedures (as we saw in those units that had a more positive attitude toward procedural systems in “Finding 2”).

Finding 6: Efficacy in communicating safety information was surprising. One of the goals of procedures, which many regulatory agencies require, is to communicate hazard information regarding the task itself. We found some interesting and surprising issues regarding current practices for communicating this information in written procedures. The first was, contrary to evidence in the consumer product domain, when hazard statements are embedded in procedures, the more “stuff” (e.g., shading and icons) they have around them, the less likely workers will attend to them or to the content of the hazard statement. We documented this with behavioral studies using existing workers as well as with eye-tracking studies (Figure 1).

Another surprising finding was that most workers equated the meaning of the signal words CAUTION and WARNING. Although WARNING generally is supposed to communicate a more dangerous hazard than CAUTION, workers thought the two words communicated the same level of hazards. This study involved workers who used procedures that had leveraged these words to differentiate between different levels of hazards. These findings suggest that instead of using signal words to communicate the level of the hazard in procedures, it is better to simply communicate the specific hazard and the method for avoiding or mitigating the hazard.

Finding 7: Procedure usefulness requires high-quality attributes. As mentioned previously, poor procedure quality historically has been found to be a prominent reason for workers to deviate from or not use procedures. Improving procedure quality is challenging given the number of procedures many facilities have. The good news is that it is a relatively straightforward task. However, in a survey of workers who use procedures, we found that the procedure’s usefulness was related to use and deviation as much as to its quality. Further, both variables were related to the number of incidents or near misses per year. This indicates that it is important for a procedure to have the attributes of quality (meaning it is easy to understand, has no typos, contains current information, has steps in the correct order and is well-organized) and to help the workers do their jobs.

Finding 8: Adoption of digital procedures is important and complex. Many of the findings and recommendations from our research indicate that using a digital procedure system will allow for more flexibility with regard to the procedure’s presentation, revision and likely usability. Further, in interviews with workers using digital procedures, one of the benefits was not having to handle paper that would get dirty, wet, torn, etc., while they were performing the task.

Before adopting digital procedure systems, it is extremely important that organizations leverage methods of assessing their readiness to adopt and accept this type of technology as it is not just about the hand-held digital procedures themselves. It also requires ensuring sufficient bandwidth in facilities for the system to be used effectively; providing sufficient resources for procedural conversion, including the engagement of workers during the conversion process; and planning an effective integration of digital procedures (e.g., not during a startup). If an organization does not prepare for the conversion to digital procedures properly, it may never see the benefit of that conversion, or, if so, it may be at a much larger cost than originally expected.

SUMMARY AND CONCLUSION
We have found through interviews, experiments and worker observations that work as imagined often is different from work as done. In interviews with workers and managers at the same facility, the issues that managers believe workers were having with procedures were different from the issues workers were reporting.

Overall, workers perceive procedures as a good tool: for training, for less experienced workers and for tasks done infrequently. For more frequent critical tasks, particularly for experienced workers, it is likely that well-designed checklists would be a better solution than step-by-step procedures. These checklists also could indicate steps that have to be done in a particular order.

When procedures are designed primarily for documenting workers’ accountability or regulatory compliance, they are not necessarily going to be effective at supporting the workers’ tasks. This limited usability will decrease the likelihood that workers will adhere to them. Accountability is important in a working environment, but we found the procedure is not an effective method for documenting it.

A healthy procedural system can play a keystone role in an organization’s safety system. The healthy system consists of pressure (information and guidance) coming from both top down and bottom up, which correlates to the right and left sides of the arch (Figure 2). For an organization with a healthy safety system, the procedural system should be a place where the top-down and bottom-up influences converge (the keystone) to create a strong structure that can last for years. However, if the keystone is weak or has too much pressure from one side or the other, the structure falls apart.

ENDURING STRUCTURE
Figure 2. A healthy system, like a keystone, should last for years.

IMPORTANT COLLABORATIONS AND COLLABORATORS
Our findings over the past six years have contributed to a better understanding of how procedures and procedural systems can support workers better while they perform their tasks and also improve safety in these high-risk industries. Two important hallmarks of NGAP have made these findings possible: the interdisciplinary team of scientists conducting the research and the tight collaborations between industry and academia regarding what questions needed to be asked and how they should be asked.

Interdisciplinary research. Although the original charge to NGAP was to “human factor” procedures (or apply the principles, theories and methods of HF to procedural systems), the work we have done to date has been an excellent example of interdisciplinary research. My approach to HF is from the information processing paradigm as my training is primarily cognitive psychology. However, many, if not most, of our interesting findings came from intersections between my approach and that of another discipline or from another discipline entirely:
• Chemical Engineering (ChemE): The collaborators from ChemE were my first regular collaborators, and they brought deep knowledge of risk analysis as well as the classic engineering approach. With their influence, we looked first to the documents that guide industry practice: standards and regulatory documents as well as existing procedure writing guides. Further, they have explored methods of systematically identifying complexity using natural language processing given their comfort and prowess with many computer programs.
• Industrial and Systems Engineering (ISEN): The collaborators from ISEN often are closest to me scientifically in that they also are HF researchers. However, their training and research paradigms are different, as they are engineers and not psychologists. For instance, one colleague sees issues with procedures through the lens of a systems approach. He and his team have used rigorous methods to conduct and analyze interviews of workers to understand the issues and strengths of procedural systems better. Another colleague used machine learning methods to identify attributes of steps in written procedures that are associated with the successful completion of that step. Still other colleagues in ISEN consider how the workers’ state (e.g., fatigue or stress) may impact their interactions with written procedures and task performance.
• Industrial and Organizational Psychology (IO): The collaborators from IO are psychologists like me, but they typically focus more on how attributes of the social or organizational system impact worker behavior or safety (where I investigate the tool itself, meaning the procedure). This has led us to the beginnings of an investigation on the relationship between attributes of the procedural systems (redlines and turnaround times) as key performance indicators (KPIs) of safety climate. Other collaborators with IO have a strong knowledge of survey construction and analysis. This collaboration has been integral to NGAP for collecting and analyzing data from a large number of workers to understand trends and relations between variables in procedural systems better.

Collaboration between industry and academia. The second hallmark of NGAP has been the regular collaborations with industry partners. Two industry partners, Elliott Lander from ATR and Abbe Barr from Chevron, originally came to MKOPSC with the need for this type of consortium. The industry/academia (IA) collaborations in NGAP have matured over time to the following process:
1. The industry partners (the board) let us know the major current issues.
2. We identify several possible studies or approaches we could take to identify causal elements associated with these issues and develop and empirically test mitigation methods.
3. Given the resources available, the board votes on the studies it would most like to see move forward, and those are the studies we conduct for that period of time.

This process allows the academics to hold fast to what we do well, which is rigorous, empirical research to build the body of science. At the same time, it holds us accountable to do this science in a manner and on a topic that is directly relevant to topics that can be applied immediately in these high-risk industries. This translation of science to practice is something particularly exciting about NGAP, and honestly, I am pretty proud of it. The added benefit is that many of our findings regarding procedural systems have come from outside the consortium’s funding (such as federal and local funds), so the NGAP has been more of a constant source of seed funding to keep the effort going with the IA there to continue to hold us accountable to the original effort.

STILL TO COME
Like any good academic effort, the more we learn, the more questions we discover. As can be seen by the list below, a lot of work remains to be done regarding procedural systems:
1. What are the most important design guidelines for digital procedures, for example, for moving between multiple procedures and for providing feedback regarding incorrect procedures?
2. How can we identify whether organizations are ready to adopt digital procedures? What are the measures and methods they should use to identify this?
3. What are the implications for electronic performance monitoring (EPM) with digital procedures? Previous research has found that workers’ performance can suffer if EPM is implemented badly.
4. Can attributes of procedural systems (such as number of redlines or redline turnaround time) be leveraged as KPIs for safety climate?
5. Can we develop a procedural [...]

[...] “languages” that psychologists [...] we were saying the same words. [...] works side by side with me to manage the projects and students. He is a premier researcher with a systems engineering perspective and has contributed greatly to NGAP’s strength. Dr.
Joseph Hendricks, an IO researcher, has been the brains behind all of the statistical analysis and engineers speak, even though Through intense and intentional listening to hear and learn from each other, we started the ethic of collaboration for NGAP that remains to this day. Of course, as always, I thank Dr. that was beyond my reach (and Sam Mannan. His presence with I know some stats). He also has ment and support for those first that’s saying something because ensured that any survey we use goes through a rigorous develop- NGAP as well as his encourageyears were pivotal. ment process. REFERENCES ical, is an industry partner who H., Marble, J., Byers, J., & Smith, ning and provides insight and reliability analysis method. US Roger Young, with NovaChem- has been with us from the beginsupport that keep us motivated. Wendy Schram is with Dow and worked for two years to pro- [1] Gertman, D., Blackman, C. (2005). The SPAR-H human Nuclear Regulatory Commission, 230, 35. [2] Shappell, S. A., & Wieg- system that adapts to user needs vide us important opportunities mann, D. A. (2000). The human and the context of the work- interact with both paper and digital system—HFACS. and profile, task attributes ing environment? ACKNOWLEDGEMENTS An acknowledgement section for this effort is difficult because there have been, and are, a lot of people who have contributed substantively to this effort’s success. A few deserve special thanks and acknowledgment. First is Dr. Farzan Sasango- har who for the past two years has been a Co-PI on NGAP and to learn more about how workers procedures. We appreciate all the personnel at Shell who provided us access to the BOOST facility at the Robert training facility. Conducting the experiment there was a wonder- ful experience, and the staff was superlative. Dr. Noor Quddus, from MKOPSC, was the first person from MKOPSC with whom I collaborated. 
He and I learned so much about the different Oct o ber 2020 / M KO P roc e s s S a fe t y J our na l - 36- factors analysis and classification DR. CAMILLE PERES is an Associate Professor with Environmental and Occupational Health at Texas A&M University as well as the assistant director of Human Systems Engineering with the Mary Kay O’Connor Process Safety Center. Her expertise is Human Factors and she does research regarding: procedures; Human Robotic Interaction in disasters; and team performance in Emergency Operations. She can be reached at peres@tamu.edu. Boost Process Plant Resilience Measures can strengthen the ability to recover from an incident By Hans J. Pasman and Changwon Son, Texas A&M University ABSTRACT Conducting robust risk assessment does not guarantee that all possible hazardous events have been identi- fied. Unacceptable residual risk may materialize after risk assessments have been completed. To that end, resilience measures that may alert of a threat, call for action and require emergency response and recovery preparedness should be available. Error-tolerant equipment will help, too. The paper will describe how resilience can be measured and what is being done a sudden substantial upset such as Hollnagel et al. (2011, p. xxix), on any disastrous effect on its product that are necessary for a system to a market breakdown, a strike or output and turnover. Furthermore, guided by the United Nations during the past 20 years, national governments have been urged to work on their resilience capabilities so they are able to respond better to large-scale damage from extreme weather events such as tropical cyclones and natural calamities such as earthquakes. T he term resilience is become increasingly known in management circles. 
In the 1990s and early 2000s, scientists attempting to describe what made an organization strong and successful mentioned resilience as one of the properties (e.g., Weick & Sutcliffe, 2011). In 2004, Erik Hollnagel launched the first in a series of symposia on what he called resilience engineering (Hollnagel et al., 2006). These symposia focused on the functioning of the people in an organization and, quoting Hollnagel et al. (2011, p. xxix), on strengthening the “four abilities that are necessary for a system to be resilient. These are the ability to respond to events, to monitor ongoing developments, to anticipate future threats and opportunities, and to learn from past failures and successes alike.” The “engineering” here, though, is not the way we, engineers, understand it as it misses the physical hardware component.

At process plants, in addition to business aspects such as production efficiency, process safety is an important consideration. The question of how safe we are is determined by risk assessment and management. But one has to ask whether or not a detailed quantitative risk assessment is fully reliable. The short answer is no. Many reports indicate that tools to identify hazards and threats such as hazard and operability (HAZOP) are fallible, and teams using them can fail (Baybutt, 2015; Cameron et al., 2017; Casal & Olsen, 2016; Jarvis & Goddard, 2017; Lauridsen et al., 2002; Pasman et al., 2017; Suokas & Rouhiainen, 1989; Taylor, 2016). As such, unexpected externally or internally initiated hazardous events with serious consequences are possible. Despite all safety measures a disastrous event can occur without warning, causing a great deal of damage and interruption to business (such as COVID-19 to the airline industry!). Resilience measures should help to avoid such events not foreseeable in traditional risk assessment or to lower the damage and accelerate recovery to restore performance when the events occur, as shown in Figure 1.

PERFORMANCE RESILIENCE DURING A DISASTROUS EVENT
Figure 1. Effective resilience measures can have a demonstrable effect on performance during an unexpected disastrous event. (Axes: performance versus time; curve annotation: resilience measures applied.)

Failures of hazard identification methods have led to impressive efforts attempting to improve process hazard analysis. Two of these efforts, based on a socio-technical system (STS) approach, will be mentioned briefly. An STS encompasses the entire hierarchical line from regulators down via board and plant management to the various work floor levels to the plant equipment and technology. STS theory was introduced to the process industry by Rasmussen (1997) for accident investigations and further developed by Leveson (2004) and Leveson (2011), who also turned it into a predictive analysis tool (system theoretic process analysis, or STPA). Recognizing that safety is a control problem, STPA considers all process control loops, organizational and technical, acting on internal and external disturbances. By asking four questions, a team can find out what may go wrong. The other highly automated effort is called blended HAZID, or BLHAZID. It focuses on plant, people and procedures and their interactions. BLHAZID is making use of massive digitization for equipment and generates causal models (Németh & Cameron, 2018).

Even though STS provides a comprehensive system view and promises improved risk assessment, in safety the devil often is in the details. And in the myriad possible details, hazards still can be overlooked. In addition, there may be unknown threats while, unexpectedly, accepted residual risks still may materialize, so resilience has an important role to play.

Starting around 2009, the Mary Kay O’Connor Process Safety Center (MKOPSC) began studying the concept of resilience as a means to guard against unexpected mishaps from the STS point of view. To date, two MKO students have completed their Ph.D. studies on the topic of plant resilience: Dinh (2011) and Jain (2018). Pasman et al. (2020) presented a summary and review of their work that includes not only references to their dissertations but also references to their published journal articles.

The work by Dinh (2011) and Jain (2018) covers mainly resilience aspects of process operations. A co-author of this article, Changwon Son, who is near his Ph.D. graduation, focused on organizational resilience engineering and elaborated how resilience of emergency response organizations is achieved through interactions among cognitive system elements such as humans and technologies.

This article will discuss methods to strengthen resilience, the current state of resilience, the analysis of emergency response operations with respect to organizational resilience, future priorities and conclusions.
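One way to put a number on the two curves sketched in Figure 1 is to integrate the performance shortfall over time. The article does not prescribe this measure; the sketch below is only an illustration under that assumption, and the function name `resilience_loss`, the normalized performance scale and the sample data are all hypothetical.

```python
from typing import Sequence


def resilience_loss(times: Sequence[float],
                    performance: Sequence[float],
                    nominal: float = 1.0) -> float:
    """Area between nominal performance and the observed curve.

    Uses the trapezoidal rule; a smaller area means less lost performance.
    This is an assumed illustrative measure, not one taken from the article.
    """
    if len(times) != len(performance) or len(times) < 2:
        raise ValueError("need two or more matching time/performance samples")
    loss = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        gap_prev = max(nominal - performance[i - 1], 0.0)
        gap_curr = max(nominal - performance[i], 0.0)
        loss += 0.5 * (gap_prev + gap_curr) * dt
    return loss


# Hypothetical daily performance (fraction of normal output) after an upset.
days = [0, 1, 2, 3, 4, 5, 6]
without_measures = [1.0, 0.2, 0.2, 0.4, 0.6, 0.8, 1.0]
with_measures = [1.0, 0.6, 0.8, 1.0, 1.0, 1.0, 1.0]

loss_without = resilience_loss(days, without_measures)
loss_with = resilience_loss(days, with_measures)
print(f"lost performance without measures: {loss_without:.2f} plant-days")
print(f"lost performance with measures:    {loss_with:.2f} plant-days")
```

In this made-up example the plant with resilience measures both dips less and recovers sooner, so its lost-performance area is smaller. That is the qualitative message of Figure 1.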
Even physical attacks MEASURES TO STRENGTHEN Error-tolerant design of process zational resilience, future priorities RESILIENCE Dinh (2011) focused on avoiding process upsets and identified a number of resilience principles: • Minimization of failure • Early detection • Flexibility • Controllability • Minimization of effects • Administrative controls and procedures All these principles were eluci- ing signals • Preparation of recoverability and plant includes inherently safer design, but its scope is broader. It means that human-machine interac- tion is optimized, e.g., maintenance operations are facilitated and in are much influenced by process and making speed is reckoned with. plant design, but the other prin- ciples have their effects. Further, besides the design factor, Dinh et al. (2012) proposed additional con- tributing factors including warning signal detection potential, emer- gency response capability, human cannot be excluded. Extreme weather conditions with hurricanes and flooding from the effects of climate change should be warned for. A company’s business intelli- gence used to signal detection and Starting around 2009, the Mary Kay O’Connor Process Safety Center began studying the concept of resilience. dated and illustrated. In particular, process flexibility and controllability cyberattacks on installations are control human capability in decision interpretation also could warn in case of heightened risk. Early detection and identification Also, as much as possible the design of signals is not enough. If a fast an error is made, consequences the organization’s top management should be forgiving” so that when should be limited and/or recovery should be possible. Detection of early warning signals of disturbance is an important Oct o ber 2020 / M KO P roc e s s S a fe t y J our na l - 39- response is required, it is critical that be informed and understand the need to alert the entire organiza- tion and take the necessary actions. 
The required mental attitude from management down throughout halted production, to avoid losing constraints. All this is supported by of thinking. It means having the be supplied with products from zation in the future, digital twins). the organization is called plasticity flexibility to divert attention from ongoing business to quickly grasping the seriousness of a threat and taking the right action. However, a company needs to avoid jumping to conclusions and embrace the concept market share, customers could elsewhere, while additional man- ous statistical methods (analytics) Most important, reserve financial model simulation and congruence to handle the disaster aftermath. fluidity should be available. HOW RESILIENT ARE WE? signals were available but manage- concept and the resulting strategies seen many instances in which clear By starting from the resilience ment response was inadequate or to build resilience and then fol- Emergency response capability is fully recognized as a necessary asset of any organization. In case of a major upset event such as an explosion, fire or toxic chemical lowing the principles, Dinh (2011) sorted out the factors that influence resilience and their weights and composed an algorithm to determine factor index values. Jain (2018) went further and release, avoiding delays in imple- developed the first process resilience minimum is crucial. Many mecha- prediction, PRAF should be able to menting mitigative actions to a nisms incorporated in the layers of protection of a plant are available. Ultimately, well-prepared and trained plant and community fire brigades will do their jobs to minimize damage. However, recoverability encompasses more. It requires a detailed plan of what should be done to repair damage quickly, to get a supply of required materials and to maintain avail- ability of a specialized workforce to repair the plant. In case of fully This is realized by using vari- agement capacity may be needed of resistive flexibility. 
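To make the idea of congruence analysis more tangible, the sketch below follows one common reading of the socio-technical congruence measure associated with Cataldo et al. (2008): coordination requirements are derived from who performs which task and which tasks depend on each other, and congruence is the share of those requirements matched by actual coordination. How PRAF applies the concept is not detailed in this article, so the roles, tasks, matrices and function names here are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def congruence(task_assignment, task_dependency, actual_coordination):
    """Share of required coordination links that actually exist.

    task_assignment:     people x tasks, 1 if the person works on the task
    task_dependency:     tasks x tasks, 1 if the two tasks depend on each other
    actual_coordination: people x people, 1 if the two people coordinate
    """
    # Coordination requirements: CR = TA * TD * TA^T
    required_matrix = matmul(matmul(task_assignment, task_dependency),
                             transpose(task_assignment))
    required = satisfied = 0
    for i, row in enumerate(required_matrix):
        for j, need in enumerate(row):
            if i != j and need > 0:
                required += 1
                if actual_coordination[i][j] > 0:
                    satisfied += 1
    return satisfied / required if required else 1.0


# Hypothetical example: operator, maintenance technician, shift supervisor.
assignment = [
    [1, 0, 0, 0],  # operator: task 0
    [0, 1, 1, 0],  # technician: tasks 1 and 2
    [0, 0, 0, 1],  # supervisor: task 3
]
dependency = [
    [0, 1, 0, 0],  # task 0 <-> task 1
    [1, 0, 0, 0],
    [0, 0, 0, 1],  # task 2 <-> task 3
    [0, 0, 1, 0],
]
coordination = [
    [0, 1, 0],  # operator and technician talk to each other
    [1, 0, 0],  # technician and supervisor do not
    [0, 0, 0],
]

score = congruence(assignment, dependency, coordination)
print(f"congruence: {score:.2f}")
```

Here the technician and supervisor share dependent tasks but do not coordinate, so only half of the required links are covered. A low score of this kind flags where communication gaps could let something go wrong.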
History has nonexistent. process models (after process digiti- analysis framework (PRAF). By shed light on the influencing factors to treat and mine data, dynamic analysis. Congruence analysis is about the extent of dependen- cies in an STS determined by the coordination of (temporal) tasks to run the process and the communication and coordination dependencies as determined by the organizational structure (Cataldo et al., 2008). The larger the inten- sity of dependencies, the larger the chance something goes wrong. The analysis is completed by economic optimization under defined limits of product quality, safety, environmental impact and sustainability. After the development of PRAF, and effectiveness of the three phases the next question is how to obtain recovery. The basis of PRAF is an mance with respect to resilience. of resilience: avoidance, survival and extensive risk assessment, consider- ing the system architecture of people, plant and procedures with inputs data that indicate plant perfor- To this end, Jain, Mentzer et al. (2018) introduced 26 measurable indicators based on the PRAF fac- about processes, safety require- tors, such as alarm rates, number and a step beyond conventional action item closure and number ments and costs. However, crucial risk assessment is the identification of uncertainties, their quantification and determination of safety Oct o ber 2020 / M KO P roc e s s S a fe t y J our na l - 40- of trips per month, process safety of mock drills for emergency sit- uations per year. These indicators come in three groups according to the three PRAF phases: avoidance, survival and recovery. A survey to determine the weighting of proposed indicators was given to respondents from the oil, gas, and chemical indus- tries; academia; risk analysts; and those with expertise in process operations, process safety and risk holistically, hence including simul- guide words are linked to an indica- (startup, turnaround and shutdown). 
Next, Jain, Rogers, Pasman, Keim et al. (2018) and Jain, Rogers, Pasman and Mannan (2018) developed the resilience-based integrated process system hazard analysis (RIPSHA). This is a HAZOP-type protocol conducted by a team consisting of a facilitator, subject matter professionals and a scribe, covering the operation under question holistically, hence including simultaneous and transient operations (startup, turnaround and shutdown). The approach is bi-layered, meaning it distinguishes management and plant system layers, each with three subsystems, which in turn consist of parameters (functions and aspects).

First, the plant system layer is considered with the subsystems: process/plant equipment hazards, procedural hazards and operator/human hazards, with safeguards analysis conducted separately. Equipment failure rate and unavailability also are taken into account, both of which are substantially influenced by organizational and human performance factors.

Next, the management system layer is analyzed with process safety system, operational discipline and process safety culture and leadership subsystems. The analysis process runs similar to the conventional HAZOP. Guide words for the plant equipment part are the same as for a HAZOP, but new guide words were defined for the people and procedures (Jain, Rogers, Pasman, Keim et al., 2018). New guide words also have been created for the management system layer’s subsystem (Jain, Rogers, Pasman & Mannan, 2018). All guide words are linked to an indicator metric. Using the guide words, deviations are considered and causes and consequences of the deviations identified. The worksheet also captures the actions required as well as the person responsible for completing the action.

A few example cases have been worked out, including LNG storage tank startup (Jain, Chakraborty et al., 2018), a PVC batch reactor upset event prediction analysis (Jain, Chakraborty et al., 2018; Jain, Diangelakis et al., 2019) and a cooling tower maintenance optimization (Jain, Pistikopoulos et al., 2019).
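The worksheet described above can be pictured as a simple record per deviation. The sketch below is only a guess at a workable layout based on the fields the text mentions (layer, subsystem, guide word, deviation, causes, consequences, linked indicator metric, action and responsible person). The field names, the sample rows and the people named in them are hypothetical and are not taken from the RIPSHA papers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List


@dataclass
class WorksheetRow:
    layer: str             # "plant" or "management"
    subsystem: str         # e.g. "process/plant equipment hazards"
    guide_word: str        # classic HAZOP word for equipment; others are hypothetical
    deviation: str
    causes: List[str]
    consequences: List[str]
    indicator_metric: str  # metric the guide word is linked to
    action: str
    responsible: str
    closed: bool = False


def open_actions_by_owner(rows):
    """Group actions that are still open by the person responsible."""
    result = defaultdict(list)
    for row in rows:
        if not row.closed:
            result[row.responsible].append(row.action)
    return dict(result)


# Hypothetical rows for a cooling water system review.
rows = [
    WorksheetRow("plant", "process/plant equipment hazards", "LESS",
                 "less cooling water flow", ["pump degradation"],
                 ["reactor temperature rise"], "number of trips per month",
                 "add flow trend alarm", "A. Engineer"),
    WorksheetRow("plant", "procedural hazards", "STEP SKIPPED",
                 "valve line-up check omitted", ["time pressure at startup"],
                 ["dead-headed pump"], "process safety action item closure",
                 "add line-up check to startup checklist", "B. Supervisor",
                 closed=True),
    WorksheetRow("management", "operational discipline", "LATE",
                 "overdue inspection", ["backlog"], ["undetected fouling"],
                 "process safety action item closure",
                 "review inspection backlog weekly", "A. Engineer"),
]

pending = open_actions_by_owner(rows)
for owner, actions in pending.items():
    print(owner, "->", actions)
```

Keeping the indicator metric on each row is what ties the hazard review back to the monitored resilience metrics; the grouping function is just one convenient way to follow up on the recorded actions.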
RESILIENCE AFTER INCIDENTS OCCUR

The fourth element of the resilience process Jain (2018) proposed is recovery. Unlike the design that prevents or tolerates process upsets and the early detection of such upsets, responding to and recovering from process incidents inevitably necessitates communication and coordination between multiple human elements.

An archetypical case that revealed the need for adaptive response to and recovery from a disastrous incident is the Deepwater Horizon (DWH) incident in 2010. The official government response to the DWH began when the U.S. Coast Guard was deployed to search for missing drilling crew members. After the DWH sank into the water, the oil spill became a serious issue that required an adaptive transition from the search and rescue to the oil spill response (National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling, 2011).

Because of the unprecedented amount of oil spilled, the DWH disaster required immediate, large-scale response efforts to mitigate the detrimental effects. More than 47,000 people were involved in the containment, recovery and dispersion of the released oil across multiple emergency response organizations (EROs) (Starbird et al., 2015; U.S. Coast Guard, 2010). Because of unknown factors about the oil spill and continued failures to shut the oil well, the response organizations’ ability to adjust their functioning, or resilience, was the key to successful management during the disaster (Birkland & DeYoung, 2011).

Realizing the growing importance of resilience in emergency management, Changwon Son started examining the current state of resilience research and conducted a series of empirical investigations. Son, Sasangohar, Neville et al. (2020a) conducted a systematic review of resilience literature in the emergency management domain. The review suggests four key factors of resilient performance of EROs: collective sensemaking, coordinated decision making, interactions between ERO members and reconciling work-as-imagined (WAI) and work-as-done (WAD).

Collective sensemaking indicates the ERO’s awareness of evolving situations by sharing relevant information in a timely manner. Based on the collective sensemaking, organizational decisions are made to bring necessary adjustment in incident objectives and strategic and tactical plans. As the EROs consist of multiple members with different expertise and knowledge, interactions between them are a crucial mechanism that facilitates the sharing of incident information and making mutually agreed-upon decisions as the disaster continues. The uncertainty associated with situations that unfold during the disaster means that the expected course of actions (WAI) often is different from activities that actually take place in the field (WAD). Findings from the review indicate that it is important to identify and reconcile the gaps between the two, assuming that neither WAI nor WAD is absolutely correct and hence should be solely pursued.
Acknowledging a research gap that lies in the methodology of identifying WAI and WAD in EROs, Son et al. (2018) developed an analytical method called interaction episode analysis (IEA) that incorporates cognitive systems engineering theory to represent interactions between humans and technologies to deal with tasks. The IEA enables analysts to identify three essential aspects of interactions: context (which human operators and technologies are involved in the interactions), characteristics (how often and long the interactions occur) and content (what conversation or action occurs during the interactions).

To apply the IEA to a naturalistic ERO, Son, Sasangohar, Neville et al. (2020b) conducted an observational study at the Emergency Operations Training Center of TEEX (Figure 2) and identified WAI and WAD of incident command operations. Findings from the observational study confirmed that not every expected interaction occurs and that the members of EROs strive to achieve given tasks via alternative patterns of interactions (e.g., communicating with different roles from expected ones) despite challenges (e.g., confusing information) they face during the emergency operations.

TEEX EMERGENCY OPERATIONS TRAINING CENTER
Figure 2. An observational study at the TEEX Emergency Operations Training Center was undertaken to look at how WAI and WAD were handled.
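The three aspects of an interaction episode, and the comparison of expected with observed interactions, lend themselves to a small data sketch. The code below is not the published IEA tooling; it simply assumes each observed episode is logged with who interacted with whom (context), when it started and ended (characteristics) and a short note (content). The role names and the expected interaction pairs are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    source: str     # context: who or what initiates the interaction
    target: str     # context: who or what receives it
    start_s: float  # characteristics: start time in seconds
    end_s: float    # characteristics: end time in seconds
    content: str    # content: what was said or done


def summarize(episodes):
    """Frequency and total duration for each (source, target) pair."""
    summary = defaultdict(lambda: {"count": 0, "duration_s": 0.0})
    for ep in episodes:
        pair = (ep.source, ep.target)
        summary[pair]["count"] += 1
        summary[pair]["duration_s"] += ep.end_s - ep.start_s
    return dict(summary)


def wai_wad_gaps(expected_pairs, episodes):
    """Return (expected but not observed, observed but not expected)."""
    observed = {(ep.source, ep.target) for ep in episodes}
    return expected_pairs - observed, observed - expected_pairs


# Work-as-imagined: interactions the plan expects (hypothetical).
expected = {
    ("Incident Commander", "Planning Chief"),
    ("Planning Chief", "Situation Display"),
}

# Work-as-done: episodes logged by an observer (hypothetical).
observed_episodes = [
    Episode("Incident Commander", "Planning Chief", 0, 60, "requests status"),
    Episode("Incident Commander", "Planning Chief", 120, 150, "confirms objective"),
    Episode("Planning Chief", "Operations Chief", 200, 260, "asks for field update"),
]

summary = summarize(observed_episodes)
missing, unexpected = wai_wad_gaps(expected, observed_episodes)
print(summary)
print("expected but not observed:", missing)
print("observed but not expected:", unexpected)
```

In this toy log the planning chief never uses the situation display and instead asks the operations chief directly, which mirrors the kind of alternative interaction pattern the observational study reports.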
To understand resilient performance of real-world EROs, Son, Sasangohar, Peres et al. (2020) and Son, Larsen et al. (2020) carried out interview studies with emergency managers of government organizations and a regional hospital that responded to Hurricane Harvey in 2017, respectively. These studies identified what has made the EROs able to function in resilient ways as well as what has hindered them from doing so. One of the major characteristics of resilience of the EROs was creating and maintaining a common operating picture (COP), a practical concept of collective sensemaking proposed in the literature review work (Son, Sasangohar, Neville, et al., 2020a). For a deeper understanding of cognition in the EROs, Moon et al. (2020) conducted an integrative review in which a number of concepts for cognition in EROs have been defined.

WHAT ARE FUTURE RESEARCH PRIORITIES?

On error-tolerant design, including inherently safer design, work has been done at various places, but an overall integrating and resilience-oriented study is lacking. As regards resilience in operations, a cooperative research study with industry would be welcome to see how the approach could contribute in practice using the indicator metrics. In addition, the resilience aspects of the dynamics in process operation by adaptation to product requirements, changing feed stock and such should be investigated. In other words, how is process resilience to overcome abnormal situations changing when sudden process condition modifications need to be implemented? A well-organized and fast response will help to lighten the recovery, but the latter usually takes much longer and more attention, e.g., because of improvised action and liability cases.

The previous and current efforts to identify the gaps between WAI and WAD, two essential phenomena that reveal how EROs cope with complexities imposed by disasters, are relevant to work relations in general and in particular to cases of complex operations under time pressure. A question to be tackled in the future is how we can address such gaps to strengthen resilience. Based on our current knowledge, the following are suggested as future research agenda items to better answer the question:

• Predicting the unpredictable: the gaps between WAI and WAD often result from a lack of understanding about what undesired events could happen. Therefore, future research should be focused on predicting extreme incident scenarios quickly, developing associated response plans and translating them into procedures.

• Improving organizational adaptive capacity: it is nearly impossible to consider all potential hazards and to prepare prescribed courses of actions for such hazards (“thickening a rule book does not solve all the problems”). Thus, what is required to make an organization more resilient during major upset conditions is to increase adaptive capacity, that is, to flexibly adjust performance to changes in the environment. An inventory of strategies used for the adaptive capacity in the health care domain (Son et al., 2019) can be applicable to the process procedures and training programs.

CONCLUSIONS

Resilience analysis comes on top of risk assessment to increase not only process safety but also to deal with uncertainties and will contribute to the long-term benefit of profitability. It may influence maintenance and other work procedures so they are less vulnerable to unexpected threats and can recover more quickly if incidents occur.

REFERENCES

Baybutt, P. (2015). A critique of the Hazard and Operability (HAZOP) study. Journal of Loss Prevention in the Process Industries, 33, 52-58.

Birkland, T. A., & DeYoung, S. E. (2011). Emergency response, doctrinal confusion, and federalism in the Deepwater Horizon oil spill. Publius: The Journal of Federalism, 41(3), 471-493.

Cameron, I., Mannan, S., Németh, E., Park, S., Pasman, H., Rogers, W., & Seligmann, B. (2017). Process hazard analysis, hazard identification and scenario definition: Are the conventional tools sufficient, or should and can we do much better? Process Safety and Environmental Protection, 110, 53-70.

Casal, A., & Olsen, H. (2016). Operational risks in QRAs. Chemical Engineering Transactions, 48, 589-594.

Cataldo, M., Herbsleb, J. D., & Carley, K. M. (2008). Socio-technical congruence: A framework for assessing the impact of technical and work dependencies on software development productivity. The 2nd ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, Kaiserslautern, Germany.

Dinh, L. T., Pasman, H., Gao, X., & Mannan, M. S. (2012). Resilience engineering of industrial processes: principles and contributing factors. Journal of Loss Prevention in the Process Industries, 25(2), 233-241.

Dinh, L. T. T. (2011). Safety-oriented resilience evaluation in chemical processes [Doctoral Dissertation, Texas A&M University]. College Station, TX.

Hollnagel, E., Paries, J., Woods, D. D., & Wreathall, J. (2011). Resilience engineering in practice: A guidebook (Vol. 3). Ashgate Publishing.

Hollnagel, E., Woods, D. D., & Leveson, N. (2006). Resilience Engineering: Concepts and Precepts. Ashgate Publishing.

Jain, P. (2018). Process Resilience Analysis Framework for Design and Operations [Doctoral Dissertation, Texas A&M University]. College Station, TX.

Jain, P., Chakraborty, A., Pistikopoulos, E. N., & Mannan, M. S. (2018). Resilience-based process upset event prediction analysis for uncertainty management using Bayesian deep learning: application to a polyvinyl chloride process system. Industrial & Engineering Chemistry Research, 57(43), 14822-14836.

Jain, P., Diangelakis, N. A., Pistikopoulos, E. N., & Mannan, M. S. (2019). Process resilience based upset events prediction analysis: Application to a batch reactor. Journal of Loss Prevention in the Process Industries, 62, 103957.

Jain, P., Mentzer, R., & Mannan, M. S. (2018). Resilience metrics for improved process-risk decision making: survey, analysis and application. Safety Science, 108, 13-28.

Jain, P., Pistikopoulos, E. N., & Mannan, M. S. (2019). Process resilience analysis based data-driven maintenance optimization: Application to cooling tower operations. Computers & Chemical Engineering, 121, 27-45.

Jain, P., Rogers, W. J., Pasman, H. J., Keim, K. K., & Mannan, M. S. (2018). A resilience-based integrated process systems hazard analysis (RIPSHA) approach: Part I plant system layer. Process Safety and Environmental Protection, 116, 92-105.

Jain, P., Rogers, W. J., Pasman, H. J., & Mannan, M. S. (2018). A resilience-based integrated process systems hazard analysis (RIPSHA) approach: Part II management system layer. Process Safety and Environmental Protection, 118, 115-124.

Jarvis, R., & Goddard, A. (2017). An analysis of common causes of major losses in the onshore oil, gas & petrochemical industries. Loss Prevention Bulletin (255).

Lauridsen, K., Kozine, I., Markert, F., Amendola, A., Christou, M., & Fiori, M. (2002). Assessment of uncertainties in risk analysis of chemical establishments: The ASSURANCE project. Final Report, Risø.

Leveson, N. G. (2004). A new accident model for engineering safer systems. Safety Science, 42(4), 237-270.

Leveson, N. G. (2011). Engineering a safer world: Systems thinking applied to safety. The MIT Press.

Moon, J., Sasangohar, F., Son, C., & Peres, S. C. (2020). Cognition in Crisis Management Teams: An Integrative Analysis of Definitions. Ergonomics, 63(9).

National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling. (2011). The Gulf Oil Disaster and the Future of Offshore Drilling: Report to the President.

Németh, E., & Cameron, I. (2018). Multi-level failure, causality and hazard insights via knowledge based systems. In Mary Kay O’Connor Process Safety Center (MKOPSC) 21st Annual International Symposium Octo-

Pasman, H. J., Rogers, W. J., & Mannan, M. S. (2017). Risk assessment: What is it worth? Shall we just do away with it, or can it do a better job? Safety Science, 99, 140-155.

Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety Science, 27(2-3), 183-213.

Son, C., Larsen, E. P., Sasangohar, F., & Peres, S. C. (2020). Opportunities and Challenges for Resilient Hospital Incident Management: Case Study of a Hospital’s Response to Hurricane Harvey. Journal of Critical Infrastructure Policy, 1(1), 81-104.

Son, C., Sasangohar, F., Neville, T., Peres, S. C., & Moon, J. (2020a). Investigating resilience in emergency management: An integrative review of literature. Applied Ergonomics, 87, 103114.

Son, C., Sasangohar, F., Neville, T. J., Peres, S. C., & Moon, J. (2020b). Evaluation of work-as-done in information management of multidisciplinary incident management teams via Interaction Episode Analysis. Applied Ergonomics, 84, 103031.

Son, C., Sasangohar, F., Peres, S. C., & Moon, J. (2020). Muddling through troubled water: resilient performance of incident management teams during Hurricane Harvey. Ergonomics, 63(6), 643-659.

Son, C., Sasangohar, F., Peres, S. C., Neville, T. J., Moon, J., & Sam Mannan, M. (2018). Modeling an incident management team as a joint cognitive system. Journal of Loss Prevention in the Process Industries.

Son, C., Sasangohar, F., Rao, A. H., Larsen, E. P., & Neville, T. (2019).

[...] and the 2010 BP Deepwater Horizon oil spill. Human and Ecological Risk Assessment: An International Journal, 21(3), 605-630.

Suokas, J., & Rouhiainen, V. (1989). Quality control in safety and risk analyses. Journal of Loss Prevention in the Process Industries, 2(2), 67-77.

Taylor, R. (2016). Can process plant QRA reduce risk?–experience of ALARP from 92 QRA studies over 36 years. Chemical Engineering Transactions, 48, 811-816.

U.S. Coast Guard. (2010). National Incident Commander’s Report: MC252 Deepwater Horizon. http://www.nrt.org/production/NRT/NRTWeb.nsf/AllAttachmentsByTitle/SA-1065NICReport/$File/Binder1.pdf

Weick, K. E., & Sutcliffe, K. M. (2011). Managing the unexpected: Resilient performance in an age of uncertainty (Vol. 8). John Wiley & Sons.

DR. HANS J. PASMAN is a Research Professor at the Mary Kay O’Connor Process Safety Center at the Texas A&M University. Together with being Emeritus Professor Chemical Risk Management at the Delft University of Technology, and in management of TNO Industrial safety NL, various roles related to almost all areas of [...] he has more than 50 years of experience in process safety and risk management. He can be reached at hjpasman@gmail.com.
Resil- CHANGWON SON is a graduate student Pasman, H. J., Kottawar, K., & Jain, P. Patterns, models and strategies. Safety Sci- Lab, Industrial and Systems Engineering ber 23–25, College Station, TX. ient performance of emergency department: (2020). Resilience of process plant: what, ence, 120, 362-373. safety and sustainability. Sustainability, 12, Leschine, T. M., Pavia, R., & Bostrom, A. why, and how - How resilience can improve 6152; doi:10.3390/su12156152 Starbird, K., Dailey, D., Walker, A. H., (2015). Social media, public participation, Oct o ber 2020 / M KO P roc e s s S a fe t y J our na l - 45- at the Applied Cognitive Ergonomics Department, Texas A&M University, and the Mary Kay O’Connor Process Safety Center. Email: coralx@tamu.edu. CALL FOR PAPERS Abstract topics include but are not limited to the focus areas below: Instrumented Safeguards Held virtually February 16-17, 2021 - Alarm Management - Safety Instrumented Systems - Fire and Gas Systems - BPCS Protection Layers Technology Focus - Control System Migration - Process Control and Optimization - Artificial Intelligence (Manufacturing 4.0) - New Field Device Technology - Smart Sensors & IIOT Instrument Reliability For more information email mkopsc@tamu.edu or visit tx.ag/instrumentation symposium - Methods and Tools - Maintenance Strategies - Technician Training - Prior Use Justification - Case Studies Regulatory Compliance - Control and Monitoring Systems - Maintenance Tools and Practices - Business Case for Automation - Cybersecurity Technical paper and Workshop topic submission: - An abstract submission - A manuscript submission (a template will be available in the website) - A presentation (30 minute presentation and 10 minute live Q&A) - Workshops consist of 1 hour lecture and 15 minute Q&A Find us on Due: October 15, 2020 Submit your abstract here: tx.ag/2021callforpapers Abandon-in-Place Must End Leaving equipment derelict instead of demolishing it can prove costly. 
By Dirk Willard, Contributing Editor

At 2:09 p.m., February 16, 2007, a cracked elbow in bypassed piping at the Valero-McKee refinery in Sunray, Texas, leaked propane. This leak triggered a series of events around a de-asphalting extraction column that, although serious, could have been much worse. As it was, three workers were severely burned.

The U.S. Chemical Safety Board (CSB) investigated and concluded that a foreign object had probably lodged in the block valve that was supposed to isolate the dead-leg section of piping. Water, an impurity in the feed, accumulated in the elbow. In early February, temperatures dropped to well below zero; the water turned to ice and expanded, cracking the elbow. Then, when temperatures rose and the ice melted, propane escaped.

It’s interesting to speculate on who opened the block valve and why. The dead leg was properly isolated for many years. This question wasn’t addressed in the CSB video: “Fire from Ice,” http://www.chemsafety.gov/.

The CSB found the dead-leg piping originally was designed to provide propane mixed with pitch to the top of the extraction column. In the early 1990s the process was changed, presumably after a hazard and operability (HAZOP) review.

This accident highlights a common problem in our industry. Plants sideline equipment and processes for weeks, months or even years. The units are isolated, or perhaps not, and allowed to rust. I’ve seen this at refineries, chemical and food plants, and even in municipal water-treatment facilities. This approach euphemistically is called “abandon-in-place.” Regardless of what you call it, it’s a poor engineering practice.

At Anheuser-Busch, we abandoned piping because of asbestos insulation; it was cheaper to leave it in place than to remove the asbestos. In the end we took out the pipe because block valves leaked, causing product contamination. We had a similar situation when I worked at Ralston Purina.

Sometimes, abandon-in-place can involve electrical service. Derelict switchgear, it seems to me, prompts a great many so-called ground faults. At the isomerization unit of a refinery such a fault caused a power blip. As a process engineer at a chemical plant I saw a similar problem affect our distributed control system and backup programmable logic controller. These incidents, really near misses, are more serious than ones that spoil product. When I was at Anheuser-Busch we lost several batches of yeast because water poured down inside a rusty neglected starter box, causing a ground-fault trip. If the electrician had pulled the wires as instructed, this never would have occurred.

SO HOW CAN THESE HAZARDS BE AVOIDED?

For aboveground piping, the solution is simple. Install complete isolation: a blind flange is better than a block valve; a weld cap is best. For underground piping, costs significantly increase, but digging is a lot cheaper than risking an accident or a visit from the EPA.

You have more to worry about than leaks. Dealing with wiring can pose greater challenges, especially at an older site. Drawings may not be current or accurate; this is especially true for one-line diagrams.

Regardless of the nature of the abandoned equipment, a management of change committee should review its disposition. Inspect the equipment after it’s been isolated to assure compliance with the wishes of the committee. Don’t forget the drawings! Electrical and mechanical diagrams should reflect current plant operation.

While much of the responsibility falls on operating staff, design engineers aren’t completely off the hook. They should allow for potential situations that require bypassing a line. Ideally, design for double-block-and-bleed (DB&B). This is familiar to most engineers from the food-and-drug side of our business but new to refineries. DB&B enables safe removal of equipment such as pressure gauges. If DB&Bs had been used instead of a single block valve, Valero might have avoided its accident.

NOW, LET’S MOVE ON TO “TEMPORARILY” BYPASSED EQUIPMENT.

Some engineers expect such equipment to perform like new. Don’t be fooled. Take special precautions to avoid problems. Maintain records of the equipment. Completely tear down rotary equipment if it has been left standing for more than a month, even if it was properly stored with the correct lubricant (make sure the storage lubricant is compatible with the lubricant required when the equipment is running). Check and clean static equipment such as tanks. Replace all relief equipment. Inspect high maintenance items. And, of course, pressure-test the bypassed process.

As with most start-ups, a thorough checklist can prevent anything from being overlooked. Modify the checklist to include what-if provisions. This tool will provide inexperienced engineers with some insight into solving problems.

With good judgment and caution, you can avoid accidents involving bypassed equipment.

The Emerging Hydrogen Economy Demands Attention

Recent accidents underscore the need for inherently safer designs

By Nilesh Ade, Texas A&M University

As the world moves increasingly toward a hydrogen economy, the safe use and management of this important material is coming into sharper focus. The researchers at Mary Kay O’Connor Process Safety Center (MKOPSC) are identifying safer design alternatives for the hydrogen economy to facilitate its further commercialization.

HYDROGEN ECONOMY

John Bockris coined the term “hydrogen economy” in the 1970s. It refers to the application of hydrogen as a fuel in a clean energy system instead of conventional hydrocarbon fuels. Hydrogen is the lightest and one of the most abundant elements on earth and is present in molecules of organic compounds and water. Hydrogen currently is used in a gaseous form for various industrial applications such as metallurgy, the chemical industry, the glass industry, etc. Based on analyses by a market research firm, Markets and Markets, approximately 55% of the global hydrogen demand is for ammonia synthesis, 25% in refineries and 10% for methanol production.

The hydrogen market can be classified broadly into a “merchant hydrogen market” and a “captive hydrogen market.” The merchant hydrogen market comprises central production of hydrogen that is supplied to end point consumers through transportation methods such as truck delivery and pipeline. In the captive hydrogen market, hydrogen is produced on-site by the consumers themselves. Currently the captive hydrogen market dominates the overall hydrogen market with a 95% market share. However, the merchant hydrogen market share is growing rapidly at a rate of 7% per year.

Interestingly, hydrogen is gaining increasing attention as a potential fuel in the energy sector, which would allow the establishment of an overall hydrogen economy. Hydrogen can be used as an energy carrier in both stationary and transport applications. This increased interest in hydrogen as an energy carrier is based on the following advantages:

• Combustion of hydrogen results in formation of steam and water as by-products as opposed to environmental pollutants from combusting conventional hydrocarbon fuels.
• Hydrogen is nontoxic.
• A major raw material is water (which is abundantly present on earth).
• It can be used as an energy source in fuel cells that convert hydrogen into electricity without the formation of heat, resulting in higher efficiency.
• Transmission of hydrogen over long distances is more economical than high-voltage AC current.

Hydrogen typically exists in a compound form, thus the generation methods for hydrogen involve hydrogen extraction from these compounds. Some of the current and future methods for hydrogen production include steam reforming of hydrocarbons, partial oxidation of hydrocarbons, thermal decomposition of hydrogen-containing compounds, thermochemical cycles, electrolysis of water, electrochemical decomposition, photolysis of water, photochemical decomposition of water, photo-electrochemical decomposition of water and biological decomposition. A schematic representation of the hydrogen economy is shown in Figure 1.

HYDROGEN ECONOMY
Hydrogen production: electrochemical, thermochemical and biochemical methods
Hydrogen packaging: compression, liquefaction and hydrides
Hydrogen distribution: pipelines, road, rail, ship
Hydrogen storage: pressure and cryogenic
Hydrogen usage: transportation and stationary applications
Figure 1. This chart shows the various stages of the hydrogen economy, from production to usage.

SAFETY INCIDENTS RELEVANT TO THE HYDROGEN ECONOMY

Although the application of hydrogen as a fuel is gaining attention, among the key factors inhibiting its growth are the associated safety concerns. These safety concerns arise mainly because of hydrogen’s unique properties such as fast burning speed, high energy content, low ignition energy and wide flammability range. Since 1969, more than 200 incidents pertaining to hydrogen have been reported, and these incidents serve as a major hindrance toward the hydrogen economy’s further commercialization.

A study published in International Journal of Hydrogen Energy examined 32 hydrogen-based incidents that occurred before 2011 and found that approximately 44% of these incidents resulted in a fire, 31% resulted in an explosion, and 16% were both fire and explosion. Only
a small fraction of these incidents were near misses (9%). However, one of the key findings from this study was that “design error” constituted the primary cause leading to these incidents.

PEMFC forklift incident in May 2018. One of the recent explosions pertaining to hydrogen was the forklift explosion that occurred on May 24, 2018 at a Procter & Gamble plant in Pineville, Louisiana. The forklift was based on the proton exchange membrane fuel cell (PEMFC) technology. The incident led to the forklift operator’s death and injured six other plant personnel. Although the causes behind the explosion are ambiguous, a lawsuit was filed against the forklift manufacturing company with design defect as one of the reasons leading to the fatality. The incident also resulted in significant economic repercussions to the forklift manufacturing company, thus hampering the growth of the hydrogen economy.

Hydrogen refueling station explosion in June 2019. Another recent explosion pertaining to hydrogen is the hydrogen refueling station (HRS) explosion that occurred in Sandvika, Norway on June 11, 2019. The impact resulting from the explosion triggered the airbags of cars in the station’s vicinity and led to two injuries. The incident currently is under investigation; however, the preliminary findings indicate an assembly error in the plug of the high-pressure accumulators led to a hydrogen leak and subsequent explosion. Following the incident, 10 HRSs constructed by the associated company were closed temporarily, reiterating the impact of safety incidents on further hydrogen economy implementation.

INHERENTLY SAFER DESIGN PHILOSOPHY

The inherently safer design (ISD) philosophy was put forth by Dr. Trevor Kletz in his seminal article “What You Don’t Have Can’t Leak,” which encourages design processes that eliminate or reduce process hazards. ISD philosophy is based on the following principles:

• Intensification or minimization is reducing the amount of hazardous chemicals involved in the process.
• Substitution means substituting a hazardous chemical in the process with a safer one. The hazards associated with a chemical can be determined according to its flammability, explosiveness, toxicity and chemical reactivity.
• Attenuation or moderation refers to reducing the severity of operating conditions (such as operating temperature and pressure) of the process involving hazardous chemicals.
• Limitation of effects involves altering the design (mainly process design and operating conditions) based on the process to limit the effect of hazards associated with the hazardous chemicals.
• Simplicity is designing simpler plants with relatively fewer pieces of equipment to reduce opportunities for failures in the process.

The ISD philosophy typically is used to generate alternate process designs that improve overall safety. This philosophy is based on the understanding that modifying the process in the early design stages can be most effective in reducing the associated process hazards.

RESEARCH OBJECTIVES

A detailed literature review performed as a part of this study identified significant research gaps. The first gap was the limited research pertaining to the hydrogen safety incidents such as those discussed above and the current scientific literature to reduce the risk of such incidents. Second, a substantial gap existed between the safety research and fundamental engineering research relevant to hydrogen. Limited research was carried out that involved a comparative analysis between safety and performance for components of a hydrogen economy such as PEMFC and HRS. Last, although the ISD philosophy has been applied widely in onshore and offshore chemical processing facilities, its application toward all components of a hydrogen economy was limited.

Based on the identified research gaps, the following research objectives were proposed:

• Identify the potential causes that led to improper design of hydrogen economy components in recent incidents (forklift explosion and HRS explosion).
• Apply the ISD philosophy to investigate potential improvements to component design.
• Perform a comparative analysis between a safety metric and a performance metric for the suggested ISD alternatives so that performance is not affected negatively while improving their safety.

TOWARD AN INHERENTLY SAFER HYDROGEN ECONOMY

A two-fold study was proposed to achieve the above objectives. The first part focuses on the PEMFC’s design relating to the forklift explosion incident. The second part focuses on the HRS design relating to the HRS explosion incident.

In Part I, a mathematical model was developed that relates the microscale PEMFC degradation to the probability of a macroscale explosion in a fuel cell electric vehicle (FCEV). Using the model and the inherent safety principle of intensification, one can conclude that increasing the PEMFC system’s operating temperature can improve both its safety and durability significantly while intensifying membrane design parameters, i.e., membrane thickness and membrane conductivity, does not provide any significant improvements. A key observation from this study is that a PEMFC system’s durability (expressed in voltage loss) and safety (expressed in explosion probability) are not correlated perfectly.

In Part II, the research team developed an integrated model using queuing theory, process synthesis, quantitative risk assessment (QRA) and economic analysis for designing HRS.
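The article does not say how queuing theory enters the integrated HRS model. As a hedged illustration only, a station could be idealized as an M/M/c queue: vehicles arrive at random, c dispensers serve them, and the Erlang C formula gives the chance that an arriving vehicle has to wait. The function names, arrival rate and fill time below are hypothetical and are not taken from the published study.

```python
import math


def erlang_c(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Probability that an arriving vehicle has to wait in an M/M/c queue."""
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / servers         # rho = a / c
    if utilization >= 1.0:
        raise ValueError("unstable queue: need arrival_rate < servers * service_rate")
    # Sum of a^k / k! for k = 0 .. c-1 (states with at least one free dispenser).
    free_states = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    # Combined weight of all states in which every dispenser is busy.
    busy_states = offered_load ** servers / (math.factorial(servers) * (1.0 - utilization))
    return busy_states / (free_states + busy_states)


def mean_wait_hours(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Mean time spent queuing, in hours when both rates are per hour."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical station: 6 vehicles per hour, 10-minute fills (6 per hour per dispenser).
    for dispensers in (2, 3, 4):
        p = erlang_c(6.0, 6.0, dispensers)
        w = 60.0 * mean_wait_hours(6.0, 6.0, dispensers)
        print(f"{dispensers} dispensers: P(wait) = {p:.3f}, mean wait = {w:.2f} min")
```

In a sketch like this, adding dispensers shortens the queue but also adds equipment, which is the kind of service-versus-risk-versus-cost tension an integrated design model would have to weigh.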
Currently HRS designs are based primarily on economic consideration to supply hydrogen at a competitive price, and their safety is evaluated through QRA as dictated by the codes and standards. However, the lack of relevant safety perspective in the design stage itself leads to a possibility of HRS being overdesigned in terms of safety. The application of the integrated model was demonstrated using ISD philosophy. For the base design under consideration, the results indicated that reducing liquid storage capacity can reduce the risk associated with explosion significantly along with improving HRS economics, while reducing the dispenser hose diameter can reduce the risk associated with jet-fire with a slight detriment to HRS economics.

These two studies were published in International Journal of Hydrogen Energy to disseminate the findings. The article titles are as follows:

• “Intensifying vehicular proton exchange membrane fuel cells for safer and durable, design and operation”
• “An integrated approach for safer and economical design of hydrogen refueling stations”

CONCLUSIONS AND FUTURE WORK

It can be concluded from this study’s findings that ISD can be effective for improving hydrogen system safety. However, ISD implementation can result in both beneficial and detrimental effects on the performance/economics and overall safety of these systems. It is imperative to support such ISD improvements with a holistic analysis, incorporating both safety and performance quantification.

The current research impetus is on the aspect of hydrogen production within the hydrogen economy. This research focus is motivated by the explosion that occurred at a hydrogen reforming facility in Catawba County, North Carolina on April 7, 2020.
The explosion resulting from this incident damaged 60 homes in the facility’s vicinity while negatively impacting the perception of hydrogen as a fuel and thus serving as a hindrance toward hydrogen economy growth. The incident highlights the need to revisit the design of the traditional steam reforming process. The research team currently is investigating potential ISD-based safety improvements in the reforming process while trying to ensure that the economics of hydrogen production remain competitive with hydrocarbon-based fuels.

NILESH ADE is a Ph.D. student at the Mary Kay O’Connor Process Safety Center in the Chemical Engineering Department at the Texas A&M University. He pursued his bachelor’s degree in Chemical Engineering from Institute of Chemical Technology, Mumbai. Nilesh has been involved in multiple areas of research in process safety including consequence analysis, inherent safety, reliability analysis, quantitative risk analysis, and human factors. Nilesh is currently working on the safety of the hydrogen economy as part of his dissertation. He can be reached at nilesh14@tamu.edu.
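The article's conclusions call for backing ISD changes with both safety and performance quantification. One generic way to do that, offered here as a sketch and not as the research team's method, is to score each design alternative on a safety metric and a performance or economic metric and keep only the alternatives that no other option beats on both. Every design name and number below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str
    risk: float   # hypothetical safety metric, lower is better
    cost: float   # hypothetical performance/economic metric, lower is better


def dominates(a: Alternative, b: Alternative) -> bool:
    """True if a is at least as good as b on both metrics and strictly better on one."""
    no_worse = a.risk <= b.risk and a.cost <= b.cost
    strictly_better = a.risk < b.risk or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(alternatives: list[Alternative]) -> list[Alternative]:
    """Keep the alternatives that no other alternative dominates."""
    return [a for a in alternatives
            if not any(dominates(b, a) for b in alternatives)]


# Hypothetical scores relative to a base design (base = 1.0 on both metrics).
designs = [
    Alternative("base", risk=1.0, cost=1.0),
    Alternative("smaller_storage", risk=0.7, cost=0.95),
    Alternative("narrower_hose", risk=0.9, cost=1.05),
    Alternative("both_changes", risk=0.5, cost=1.0),
]

if __name__ == "__main__":
    for design in pareto_front(designs):
        print(design.name, design.risk, design.cost)
```

With these made-up scores, the base design and the narrower hose alone drop out, while the two remaining options represent a genuine trade-off between lower risk and lower cost that a designer would still have to judge.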