Key Principles of Process Safety for: Incident Investigation
KP3 - II, Oct 2023
Copyright 2023 American Institute of Chemical Engineers
www.aiche.org/ccps

Acknowledgments

The American Institute of Chemical Engineers (AIChE) and the Center for Chemical Process Safety (CCPS) express their appreciation and gratitude to all members of the Golden Rules of Process Safety for Key Principles project subcommittee for their generous efforts in the development and preparation of this important work. CCPS also wishes to thank the subcommittee members’ respective companies for supporting their involvement during the different phases in this project.

Core Team Members:
• Denise Chastain-Knight, Chair (exida)
• Della Mann, Vice Chair (CCPS Emeritus)
• Warren Greenfield, Project Consultant (CCPS Consultant)
• Denise Albrecht (3M)
• Kevin Campbell (Shell)
• Walt Frank (CCPS Emeritus)
• Anil Gokhale (CCPS Staff)
• Mike Hazzan (AcuTech)
• Ng Ern Huay (Petronas)
• Pete Lodal (D&H Process Safety)
• Frank Renshaw (CCPS Emeritus)
• Louisa Nara (CCPS Staff)
• Linda Bergeron (CCPS Emeritus)
• Jeff Fox (CCPS Emeritus)
• Curtis Clements (Chemours)

Sub Team Incident Investigation Key Principles Team Members:
• Mike Broadribb, Lead (BakerRisk)
• Eric Atkins (OLIN)
• Amy Gay (Occidental Petroleum)
• Ammar Alkhawaldeh (BASF)
• Mike Hazzan (AcuTech)
• Della Mann (CCPS Emeritus)
• Denise Albrecht (3M)
• Warren Greenfield, Project Manager (CCPS Consultant)

The collective industrial experience and knowledge of the team members make this guideline especially valuable to those who develop and manage process safety programs and management systems.

Before publication, all CCPS guidelines are subjected to a peer review process. CCPS gratefully acknowledges the thoughtful comments and suggestions of the peer reviewers. Their work enhanced the accuracy and clarity of this guideline.
Peer reviewers for the Key Principles of Process Safety for Incident Investigation:
• Jerry Forest (Celanese)
• John Herber (Process Hazards Management)
• Della Mann (CCPS Emeritus)
• Denise Albrecht (3M)

Although the peer reviewers provided comments and suggestions, they were not asked to endorse this guideline and did not review the final manuscript before its release.

The Center for Chemical Process Safety was established by the American Institute of Chemical Engineers in 1985 to focus on the engineering and management practices to prevent and mitigate major incidents involving the release of hazardous chemicals and hydrocarbons. CCPS is active worldwide through its comprehensive publishing program, annual technical conference, research, and instructional material for undergraduate engineering education.

For more information about CCPS, please call (+1) 646-495-1371, e-mail ccps@aiche.org, or visit www.aiche.org/ccps

This document is made available for use with no legal obligation or assumptions. Corrections, updates, and recommendations should be sent to CCPS at ccps@aiche.org

If you are reading this offline, you may not be reading the latest version. Please check on the CCPS website for the current release. https://www.aiche.org/ccps/tools/golden-rules-process-safety

It is sincerely hoped that the information presented in this document will lead to an even more impressive safety record for the entire industry; however, neither the American Institute of Chemical Engineers, its consultants, CCPS Technical Steering Committee and Subcommittee members, their employers, their employers' officers and directors, and its employees warrant or represent, expressly or by implication, the correctness or accuracy of the content of the information presented in this document.
As between (1) American Institute of Chemical Engineers, its consultants, CCPS Technical Steering Committee and Subcommittee members, their employers, their employers' officers and directors, and its employees and subcontractors, and (2) the user of this document, the user accepts any legal liability or responsibility whatsoever for the consequence of its use or misuse.

Key Principles of Process Safety for Incident Investigation

Table of Contents
• Key Principle #1: Know when an event is a process safety incident or a near miss
• Key Principle #2: Develop and implement a written procedure to investigate process safety incidents and near misses
• Key Principle #3: The Investigation Team should consist of an appropriate number of trained and competent members
• Key Principle #4: Follow-up on process safety incident investigations by developing and resolving recommendations into final actions, and implementing the actions in a timely manner
• Key Principle #5: Learn from Process Safety and Near Miss Incidents
• References and Supplemental Readings

Issued October 2023

This monograph addresses Incident Investigation, which is a key element of Risk Based Process Safety (RBPS) [1]. The key principles presented reflect good, common, or successful practices and are intended to assist in the design and implementation of this element. This module is intended to strengthen and support Incident Investigation programs.
For the purposes of this monograph the following definitions of incidents and near misses are used. These are derived from the CCPS “Guidelines for Investigating Chemical Process Incidents”, 3rd Ed. [2]

Incident: an unusual, unplanned, or unexpected occurrence that either resulted in, or had the potential to result in, harm to people, damage to the environment, asset/business losses, or loss of public trust or stakeholder confidence in a company’s reputation. Some examples are:
• process upset with potential process excursions beyond operating limits,
• release of energy or materials,
• challenges to a protective barrier,
• loss of product quality control.
An accident is an incident that results in a significant consequence.

Near-miss: an incident in which an adverse consequence could potentially have resulted if circumstances (weather conditions, process safeguard response, adherence to procedure, etc.) had been slightly different. Sometimes near misses are referred to as near hits or close calls.

Key Principle #1: Know when an event is a process safety incident or a near miss

Why:
Knowing when an event is a possible process safety incident or a near miss is critical for the Risk Based Process Safety element of Incident Investigation. It is important because it is common practice in industry to combine all Environment, Health & Safety (EHS) related incidents, including the process safety incidents, in one incident management system. This may lead to a lack of recognition of process safety incidents and their investigation requirements. For example, a process safety interlock that shut down a piece of equipment might not be investigated if no one recognized that this was a near miss. The lack of investigation may result in reoccurrences of process transients or other upset events that become normalized.
This will decrease the margin of safety, and the opportunity to learn about the root causes of the event will be lost.

Knowing when an event is a process safety incident or a near miss sets the priority and practice (Refer to Key Principle #2) for company/facility personnel to follow. The “CCPS Process Safety Metrics - Guide for Selecting Leading and Lagging Indicators” and “API RP 754 Process Safety Performance Indicators for the Refining and Petrochemical Industries” provide guidance to identify, classify, and prioritize process safety incidents and near misses [3] [4]. Investigating a near miss is a good opportunity to learn important lessons without having to suffer the consequences of an actual incident.

Clearly communicating the criteria for defining a process safety incident or a near miss will enable more immediate reporting which will provide the following benefits:
• Preservation of time-sensitive evidence
• Obtaining witness statements before memories fade
• Meeting company and regulatory requirements (refer to Key Principle #2).

Incident History:
A refinery did not recognize events that should have been recorded as process safety incidents or near misses. In 1994, the refinery had an event that involved flammable vapors being released at ground level from an atmospheric blowdown drum. In 2005, a similar release led to a multi-fatality explosion [5].

The learning relevant to this Key Principle is: If the 1994 event had been recognized as a process safety incident, then it might have been investigated. Knowing when a process safety incident investigation is required, and the investigation of this precursor event, could have led to measures that could have prevented the 2005 explosion.

On January 9, 2004, a company began manufacturing its first full-scale batch of a gasoline additive chemical in a new process line. The batch produced an unanticipated exothermic reaction in the first step.
During the processing of additional batches, the facility had quality problems and made several recipe changes. It was observed that exotherms were warning signs of side reactions occurring. These exotherms were not considered near misses and, thus, were not evaluated for their actual and potential severity. On December 19, 2007, during the production of another batch, an explosion occurred, resulting in 4 fatalities and 13 injuries [6].

The learning relevant to this Key Principle is: If these exotherms had been identified as near misses, then a process safety incident investigation might have been initiated. The investigation could have identified measures to mitigate and possibly prevent the actual incident in 2007.

Incident:
On February 20, 2003, an oven had a malfunctioning temperature controller that caused the oven to overheat. The oven door was left open to help control the temperature. At the same time, workers cleaning the area created a dust cloud of phenolic resin powder. A history of small oven fires, routinely extinguished by plant personnel using extinguishers and hoses, had not been investigated. Investigators concluded that on February 20, 2003, a fire developed inside the oven and ignited the dust cloud. The resulting dust explosion and fire set off subsequent explosions and destroyed much of several production lines. There were 7 fatalities and 37 injuries.

The learning relevant to this Key Principle is: The company was aware of the fire and explosion risks of combustible dust but did not formally investigate small fires or explosions, nor did it communicate the importance of these events to the staff. Company memoranda and safety committee meeting minutes from 1992 to 1995 showed concerns with creating explosive dust hazards.
The company did not make the staff aware of the possible dust explosion hazard of phenolic resins and did not conduct a near miss incident investigation [7].

How – Leadership:
Leadership should:
• Provide a clearly documented definition for a process safety incident and a near miss [8] [4].
• Provide a tool to help classify events as a process safety incident or a near miss and to clarify when an event should be investigated [4].
• Provide training to all facility personnel on how to recognize and report process safety incidents and near misses. They should provide more detailed training on incident investigation methods and techniques to personnel who might be assigned to incident investigation teams.
• Promote and encourage reporting occurrences regardless of whether they are investigated. Personnel should not have concerns that any occurrence that they report will negatively influence personnel-related considerations such as promotions, assignments, bonuses, or job security.
• Provide necessary resources (including software tools) to allow the efficient reporting and investigating of process safety incidents and near misses. They should demonstrate a visible leadership interest in the actual and potential risk of incidents and near misses and continuously improve the incident investigation process so that more incidents are investigated and more and better lessons learned are shared.
• Maintain metrics to monitor the effectiveness of the incident and near miss reporting system [9] [3] [10] [11].
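As an illustration of what a simple classification tool might look like, the following Python sketch applies the definitions of incident, accident, and near miss given at the start of this monograph. It is a minimal sketch under stated assumptions, not a CCPS or API RP 754 tool: the `EventReport` record, its field names, and the `classify` function are hypothetical, and real classification criteria would need to be taken from the cited guidance [3] [4].

```python
from dataclasses import dataclass
from enum import Enum


class EventClass(Enum):
    ACCIDENT = "accident"            # incident with a significant consequence
    INCIDENT = "incident"            # harm occurred, but not a significant consequence
    NEAR_MISS = "near miss"          # no harm, but harm was possible
    NOT_AN_INCIDENT = "not an incident"


@dataclass
class EventReport:
    """Hypothetical report fields; a real reporting form would carry more detail."""
    description: str
    unplanned: bool                  # unusual, unplanned, or unexpected occurrence
    harm_occurred: bool              # harm to people, environment, assets, or reputation
    harm_was_possible: bool          # harm could have resulted had circumstances differed
    significant_consequence: bool = False


def classify(report: EventReport) -> EventClass:
    """Classify an event using the monograph's definitions (illustrative only).

    By those definitions a near miss is itself a kind of incident; it is
    reported as a separate class here so that it is not overlooked.
    """
    if not report.unplanned:
        return EventClass.NOT_AN_INCIDENT
    if report.harm_occurred:
        if report.significant_consequence:
            return EventClass.ACCIDENT
        return EventClass.INCIDENT
    if report.harm_was_possible:
        return EventClass.NEAR_MISS
    return EventClass.NOT_AN_INCIDENT


# Example from the text: an interlock trip with no harm is still a near miss.
interlock_trip = EventReport(
    description="Process safety interlock shut down a piece of equipment",
    unplanned=True,
    harm_occurred=False,
    harm_was_possible=True,
)
print(classify(interlock_trip).value)
```

The value of even a simple tool like this is that an interlock trip is given an explicit classification instead of being treated as routine.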
How – Implementers/ Users:
• Learn the differences between an event, a process safety incident, and a near miss through training (informal or formal), reviews of work experiences where incidents or near misses have occurred, or use of posted materials that constantly remind personnel of these important distinctions.
• Employees can learn the normal and abnormal events that could create a process safety incident or near miss by participating in Process Hazard Analyses (PHAs). Deviations from normal operation (e.g., human error) may become a process safety incident or near miss [12]. The PHA should enable better recognition of potential process safety incidents and near misses because the PHA hazard scenarios (i.e., cause and consequence) may be outside of acceptable operating conditions.
• Ask subject matter experts who are knowledgeable in the process safety incident investigation process if the event is a normal event, a process safety incident, or a near miss.
• Understand the mechanics and timing requirements of the process safety incident and near miss reporting system.
• Report all events that are possible process safety incidents or near misses.
• Watch for abnormal occurrences that become normalized because they did not result in any adverse consequences [12].

Supplemental Reading: [1] [2] [3] [4]

Key Principle #2: Develop and implement a written procedure to investigate process safety incidents and near misses

Why:
A well-executed and documented process safety incident investigation procedure will provide a consistent and repeatable process to enable collecting evidence, analyzing and identifying causal factors, identifying root causes, and developing recommendations.
The practice of having a well-executed and documented incident investigation procedure should help ensure that lessons are learned and applied to support safe and reliable operations.

How – Leadership:
Leadership should:
• Support the development of, and approve, an incident investigation procedure that provides guidance to organize, execute, document, follow up, and communicate the investigation of process safety incidents and near misses.
• Reinforce the consistent implementation of the incident investigation program across the site or company.
• Ensure adequate resources have been allocated for the incident investigation program.
• Remove barriers to enable transparency for conducting a thorough investigation; e.g., ensuring the objectivity of the investigation team, and restricting the influence of stakeholders who fear being blamed [13] [14].
• Establish a training program for incident investigation team leaders and team members.
• Include incident investigation key performance indicators in the process safety metrics program [3] [10] [11].
• Establish a system to track and manage incident investigation recommendations. This may be part of an overall process safety recommendations tracking system or a separate system.

How – Implementers/ Users:
• Develop a dependable process safety incident investigation procedure that includes the following [2, pp. 47-77] [14]:
  • Definition of a process safety incident or near miss (Refer to Key Principle #1). Examples can be useful to illustrate when an event is/is not a process safety incident or near miss.
  • Reporting of events which have the potential to be a process safety incident or near miss. If the event is required to be reported to a government agency, the specific reporting time limits and reporting format/content should be included in the incident investigation procedure.
  • Criteria for when to initiate a process safety incident or near miss investigation.
  • Responsibilities and competencies required for the various roles in the incident investigation program.
  • Establishing the boundary/scope of the incident under investigation.
  • Choices for an appropriate technique of investigation based upon the actual and potential severity classification of the process safety incident or near miss. Commonly available investigation techniques vary in complexity, and the choice depends on the complexity and severity of the incident or near miss.
  • Selection of the incident investigation team, including contractors where they were involved in the incident. Avoid conflicts of interest where possible (Refer to Key Principle #3) [14].
  • Practices for securing the incident scene, collecting evidence, and determining when the area is safe for the field investigation to begin.
  • Practices and methodologies to preserve, gather, analyze, and log evidence, which include:
    • Protocols for tasks such as sampling, equipment testing, and any other changes to the site/evidence. Obtain agreement to the protocols with relevant parties before collecting evidence.
    • An evidence custody transfer system, should testing of evidence by an independent third party be required.
    • Evidence including operating data from the data historian, security camera video, photographs, samples of released materials or debris, drone fly-overs, and other records [2].
    • Interviewing witnesses and taking their statements.
    • Identifying and collecting relevant records and documents.
    • Preserving physical evidence so that it is not compromised.
  • Timeline development.
  • Methodology for conducting the root cause analysis [15].
  • Incident Report including its format, content, review, and approval. The report should reside in an archive system and have a specified retention period.
    In some cases, there may be a protocol for handling confidential and privileged data.
  • Tracking and managing incident investigation recommendations through to their implementation.
  • Communication of lessons learned from the incident investigation to the affected personnel, including contractors. This includes information relevant to other process safety elements such as operating procedures, training, asset integrity, etc. (See Key Principle #5).
  • External communications, where appropriate. For example, it may be necessary to communicate the results of an incident investigation to external regulators based on a valid order issued by a regulatory agency.
  • Monitoring and trending of process safety incidents to identify reoccurring types.
  • Communication to the facility’s Process Hazard Analysis program manager so the incident event scenario can be analyzed during the next revalidation.
• Implement and follow the written incident investigation procedure. Witness interviews should be conducted early, before witnesses forget details or are able to discuss the event with others.
• Incident investigation team members collect evidence, conduct interviews, perform root cause analysis, and write the incident investigation report, as assigned.
• Place the equipment in a safe and stable condition, and confirm that it remains so, so that investigation activities in the field can be performed safely.
• Participate in the incident investigation, as assigned. Most investigations involve interviewing operations and maintenance personnel who were likely in the field when the incident began and have first-hand knowledge of some of the contributing factors of the incident.
• Report process safety incidents and near misses when they observe events that warrant reporting.
• Understand and comply with field evidence preservation requirements when assigned to collect evidence.
• Keep investigation records (e.g., witness statements, data sheets) so that the data is secured.
• Provide engineering support to the investigation team, such as engineering calculations, simulation, or modeling.

Supplemental Reading: [2] [6] [14] [16] [17]

Key Principle #3: The Investigation Team should consist of an appropriate number of trained and competent members

Why:
Providing for the appropriate number of trained and competent people on the incident investigation team should allow for a more efficient, focused, and in-depth incident investigation. It will help ensure that:
• The incident investigation leader is, as a minimum, formally trained for that role. For significant incidents, the team members should also be trained in incident investigation [2] [18] [19] [20].
• Team members are competent in their area of expertise [18] [19].
• The selected investigation methodologies are understood and applied consistently [14].
• There is an understanding of the technology of the process and the equipment.
• There is an objective and unbiased analysis of the incident [2].
• The causal factors, root cause(s), and recommendation(s) are identified to prevent the incident’s reoccurrence [2].
• There will be complete and effective documentation and a report [2] [14].
The size of a process safety incident investigation team should be commensurate with the complexity of the investigation [1].

Incident History:
A Polyamide Unit produced a high-performance nylon, and it had experienced polymer reaction incidents (waste polymer fires and explosions) from its initial startup in 1993 to 2001. In 2001, the end plate of the Polymer Catch Tank blew off, fatally striking three employees during preparations to clean the vessel that was full of decomposing waste polymer [21]. Although the prior incidents had been investigated, the investigation teams had not adequately identified the controls required to prevent reoccurrence.
No one on the prior investigation teams understood that the process design did not identify this reactive hazard. Process instrumentation and vessel opening practices failed to provide adequate warnings of the state of the material inside.

The learning relevant to this Key Principle is that: The multiple investigation teams did not have process engineering expertise sufficient to identify the reactive chemistry hazard and, therefore, could not identify how to control the hazard. There were witnesses who described the molten cores of exploded pods as they discolored, which is consistent with decomposition. A competent investigation team could have recognized that a significant hazard was associated with accumulating large quantities of molten polymer, including waste in the polymer catch tank, which created a reactivity hazard.

How – Leadership:
Leadership should:
• Ensure that initial training for lead investigators and team members is conducted.
• Ensure that refresher or ongoing training for lead investigators and team members is conducted.
• Ensure that those participating on the investigation team are competent in the expertise function that they are providing.
• Ensure that the investigation team members are allowed the time required to support the investigation.
• Ensure, to the extent possible, that the incident investigation teams are impartial. This will help ensure that there is an objective and unbiased analysis of the incident [2].

How – Implementers / Users:
• Provide the appropriate number of trained and competent people so that:
  • The investigation team has all of the needed technical expertise to ensure that the incident or near miss is investigated thoroughly.
  • The incident investigation leader is formally trained.
  • The team members are trained in the incident investigation activities that they will perform. For example, team members who will conduct formal interviews of witnesses should receive training in how to conduct witness interviews [2] [18] [19] [20].
  • The incident investigation leader and team members are objective and unbiased.
  • The selected investigation methodologies are understood and applied consistently by the team [14].
  • The team members have a basic understanding of the technology of the process and the equipment design and operation. However, additional technical support may be necessary to supplement the incident investigation team’s expertise.
  • The correct causal factors and root cause(s) are identified [2].
  • The recommendations to prevent the incident’s reoccurrence are formulated.
  • The incident investigation report can be produced [2] [14].
  • The size of the process safety incident investigation team is commensurate with the complexity of the investigation. The actual number is dependent on factors such as the nature and severity of the incident [1].
• Participate in the incident investigation (knowledgeable operations and maintenance personnel, engineers, emergency responders, project representatives, and personnel with process safety expertise).

Supplemental Reading: [2] [14] [22]

Key Principle #4: Follow-up on process safety incident investigations by developing and resolving recommendations into final actions, and implementing the actions in a timely manner

Why:
Risks from Process Safety Incidents are not reduced until the incident investigation recommendations are resolved and the action items are completed [2].
Incidents: The Challenger incident of 1986, as well as several chemical/process industry incidents, including BP’s Texas City refinery explosion of 2005, occurred, in part, because previous incidents or near misses had been investigated, but the recommendations from these investigations were not followed-up adequately. Both the Rogers Commission (on the Challenger incident) and the Baker Panel (on the BP Texas City incident) noted that precursor events had occurred which were related to the final catastrophic incidents they investigated. In both cases, the organizations had conducted investigations of the prior incidents and had generated recommendations to correct the root causes found, but these recommendations had not been completed. Additionally, recommendations for process changes not related to a previous incident investigation but based on engineering analysis and operational experience had been made but not implemented and these recommendations could have removed or altered the root causes of the final catastrophic events. For example, the Rogers Commission stated in its final report [23]: “Morton Thiokol, Inc., the contractor, did not accept the implication of tests early in the program that the design had a serious and unanticipated flaw. NASA did not accept the judgment of its engineers that the design was unacceptable, and as the joint problems grew in number and severity NASA minimized them in management briefings and reports. Thiokol's stated position was that "the condition is not desirable but is acceptable." Neither Thiokol nor NASA expected the rubber O-rings sealing the joints to be touched by hot gases of motor ignition, much less to be partially burned. However, as tests and then flights confirmed damage to the sealing rings, the reaction by both NASA and Thiokol was to increase the amount of damage considered "acceptable." 
At no time did management either recommend a redesign of the joint or call for the Shuttle's grounding until the problem was solved.” The Baker Panel drew the following conclusions in its final report [24]: “The ultimate objective of incident investigation is preventing reoccurrence of a specific incident scenario or related similar incidents. Considerable effort and resources are expended in determining an incident’s root causes and identifying suggested preventive measures. Despite this effort, the potential for a repeat occurrence remains unchanged until recommendations are implemented. The value of the investigation is entirely dependent on the effectiveness of follow-up activities. The team that conducted the review, for example, identified a backlog of unclosed action items in the tracking databases relating to various aspects of process safety management, including those stemming from incident investigation. Some of the action items from incident investigations extended back over a period of more than 12 months.” The learning relevant to this Key Principle is that: The failure to promptly resolve recommendations and close/complete the resulting action items stemming from incident and near miss investigations represents a serious systemic management system issue. The underlying reasons for this failure (which might also apply not only to incident investigations, but also to PHAs, audits, and other process safety program elements) should be thoroughly identified, understood, and corrected. Otherwise, the likelihood of reoccurrence is higher and the consequences of the reoccurrence could be more severe than the original event. 
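The kind of backlog the Baker Panel describes can be made visible with a simple aging calculation. The following Python sketch is illustrative only: the `ActionItem` record, its fields, and the `aging_report` function are hypothetical and do not describe any particular tracking system. It assumes that age is measured from the original due date, so that a granted deferral does not hide how long an item has actually been open.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ActionItem:
    """Hypothetical action item record from an incident investigation."""
    item_id: str
    owner: str                           # a named individual, not a department
    due: date                            # original target date
    closed: Optional[date] = None        # completion date, if completed
    deferred_to: Optional[date] = None   # approved extension date, if any


def days_overdue(item: ActionItem, today: date) -> int:
    """Days past the original due date; 0 if the item is closed or not yet due."""
    if item.closed is not None:
        return 0
    return max((today - item.due).days, 0)


def aging_report(items: List[ActionItem], today: date) -> List[Tuple[ActionItem, int]]:
    """Open items past their original due date, oldest first.

    Deferred items are included on purpose so that extensions stay visible.
    """
    aged = [(item, days_overdue(item, today)) for item in items]
    late = [(item, days) for item, days in aged if days > 0]
    return sorted(late, key=lambda pair: pair[1], reverse=True)


# Example data (invented for illustration).
items = [
    ActionItem("A-1", "J. Doe", due=date(2023, 1, 1)),
    ActionItem("A-2", "R. Roe", due=date(2023, 1, 21), deferred_to=date(2023, 6, 1)),
    ActionItem("A-3", "P. Poe", due=date(2023, 1, 1), closed=date(2023, 1, 15)),
]
for item, days in aging_report(items, today=date(2023, 1, 31)):
    status = "deferred" if item.deferred_to else "overdue"
    print(item.item_id, item.owner, days, status)
```

Reviewing a report of this kind periodically is one way to notice items that have stayed open for many months before they become a systemic problem.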
How – Leadership:
Leadership should:
• Ensure the site/facility has a system for resolving recommendations and tracking action items to completion. This system may be the same as or separate from the system that is used to resolve and track other process safety and EHS related recommendations and action items.
• Provide appropriate resources to allow the management of recommendations and action items to be completed in a technically sound manner and as expeditiously as possible. This should include periodic reviews to monitor the status and progress of recommendation resolution and action item completion.
• Assign a due date and a person who is responsible for the resolution of the recommendation(s). If the completion of the final action items will be delayed, ensure that appropriate interim/temporary safeguards are implemented as warranted to ensure safe operation during the delay. The temporary safety measures should be commensurate with the risk that the permanent action item is intended to abate. Implementation difficulties may be discovered later, requiring due date deferral/extension and consideration of the need for interim safety measure(s).
• Review incident investigation recommendations, and accept, reject, or modify them at an appropriate level of management to assure the recommendation is an effective resolution and that the risk identified by the investigation is reduced as much as practicable. The reasons for the rejection or modification of recommendations should be documented. The appropriate approval level may be a function of the complexity of the recommendation, or the consequence or severity of the incident.
  The documented rationale for rejection or modification could be based on the following criteria (adapted from [25]):
  • The underlying incident investigation root cause analysis and other work contained factual errors which resulted in flawed recommendations.
  • The recommendation was not necessary to protect the health and safety of people, i.e., the recommendation addressed issues unrelated to safety or process safety, such as product quality, production costs, etc.
  • An alternative measure that would provide a sufficient level of protection was substituted.
  • The recommendation was infeasible. When claiming that a recommendation is not feasible, the evidence substantiating such a claim should be documented and an alternate means of mitigating the risk should be identified and implemented.
• Make the decisions regarding which action items are completed, when they are completed, and their priorities. Leadership approves the projects, work orders, MOCs, or other processes as provided for in the procedures governing those activities, which should designate an appropriate level of approval depending upon relevant factors such as the risk being abated, the cost, the impact on production or operability of the facility or one or more of its processes, or other factors. The approval should also include assigning and empowering personnel responsible to develop and implement action items for each incident investigation recommendation.
• Clearly assign responsibility for each action item to an individual (not an organizational function) and ensure that a system exists for handover of action item responsibilities when organizational changes occur.
• Prepare a plan with target dates for implementation of the final action items. The implementation target dates should be assigned so that they are “timely.” The following criteria should be applied when defining “timely”:
  • The risks incurred if the action item is not completed.
    For example, there may be an increased likelihood of reoccurrence with current conditions. When does “timely” become immediate? Under what conditions should this high priority action be taken? Normally, the uncorrected risk will be an important criterion to consider in making this decision.
  - The length of time is not excessive. A low complexity action item might be completed quickly and relatively easily. For example, action items involving procedural or administrative changes are easier and should be quicker than action items that involve equipment changes.
  - The date is reasonable and defensible. For example, the opportunity for action item completion might require a process or equipment outage that delays implementation.
  - The existing trend of meeting target dates. For example, if the current resources do not enable meeting the target dates then it may be necessary to add resources or to extend the target date. Care should be exercised with extensions or deferrals of action items. These should be tracked and managed, along with those action items that are considered overdue, to understand the complete picture of the status of the action items.
- Track the aging of overdue and deferred action items to understand the amount of time that has elapsed since they were due for completion.
  - If the action item involves modifications to equipment, has an opportunity to complete the action item come and gone without taking any action? For example, has a turnaround occurred and was an action item that was scheduled for completion during the outage not completed?
  - If the permanent corrective action cannot be accomplished on schedule or within a reasonable length of time, can any interim measures be provided to reduce the risk?
- Communicate expectations and manage completion of outstanding action items to meet the target due dates.
- Establish a formal process for deferrals or extensions when the completion of action items is overdue. The deferrals/extensions should be justified by reasons that are reasonable and defensible, and not based solely on cost considerations. When deferrals/extensions are granted for completion of incident investigation action items, they should be monitored in facility process safety program metrics [3] [10] [11].
- Communicate action item progress to upper management and/or regulators when needed.

How – Implementers / Users:

- The investigation team should determine and develop initial recommendations to address the root cause(s) and any contributing causes, and resolve those recommendation(s) to produce final action items (facility/company) by:
  - Addressing all of the root causes identified
  - Incorporating the concept of inherent safety to the extent possible [26]
  - Preventing future incidents by following this hierarchical philosophy:
    - Eliminating the hazard where possible
    - Avoiding the hazard where possible
    - Identifying changes to the management system governing the element
  - Avoiding disciplinary or other human resources related action
  - Avoiding incompletely or vaguely worded recommendations
  - Entering the recommendation(s) into the facility or company’s incident tracking system, or alternatively, the tracking system used for all process safety related recommendations
  - Resolving recommendation(s) using technical reviewers, SMEs, or other resources to produce the final action items to prevent the reoccurrence of the incident or near miss
  - Explicitly documenting what was done to resolve the recommendation
  - Tracking and managing the final action items in the system used by the facility or company
- Implement/complete assigned action items to reduce the risk and prevent reoccurrence of the incident or near miss by:
  - Establishing projects, maintenance work orders, MOCs, or other administrative processes to make any physical changes to equipment or procedures and practices described in the final action items, and scheduling those projects or work orders for completion as soon as feasible
  - Executing those projects or work orders in accordance with facility/company engineering, construction, and commissioning specifications, standards, and procedures
  - Documenting the execution of the projects, work orders, MOCs, or other changes in records specified by administrative procedures governing the activities, including the date the action items were completed
- Do not permit the designation of action items as closed based upon the promise of some future action. This includes projects, work orders, or MOCs that are approved but not yet executed. A project or other activity should only be considered complete when the physical changes to equipment or procedures have been completed and verified.
- Perform engineering work to support development and resolution of recommendations and action items produced by incident investigations.
- Implement action items from incident investigations by the target due dates. If a due date extension is required, submit the request in advance of the due date to allow proper consideration of alternatives. Have valid reasons for requesting due date extensions.

Supplemental Reading: [2]

Key Principle #5: Learn from Process Safety and Near Miss Incidents

Why:

- Process safety incidents cost money.
- Process risks are not reduced until the action items to implement incident investigation recommendations are completed [2].
- Process safety program performance should improve with learning from incidents.
- Sharing process safety incident information may prevent another division or company from having to learn the “hard way” (i.e., learning from their own losses).
- Keeping the awareness of process safety incidents in the minds of the workforce should create a sense of vulnerability and vigilance for warning signs of hazards that can be controlled [27].
- A learning culture can improve the design, operation, and maintenance of a process to foster safe and reliable operations.

Incident History:

A serious fire and explosion occurred at a compressor station involving the failure of a check valve. A joint EPA/OSHA investigation determined that another incident involving a failed check valve had occurred, and the company was cited for failure to adequately apply lessons learned from the previous incident [28]. The learning relevant to this Key Principle is that the company’s culture did not assign sufficient priority to learning from prior incidents and, in this instance, the failure of a check valve reoccurred.

How – Leadership:

Leadership should:

- Approve the level of sharing of facility incident information. Share incident and near miss root causes and lessons learned internally, and carefully define the level of external sharing. These provisions should be clearly described in the facility incident investigation procedure.
- Include leading and lagging key performance indicators (KPIs) of the process safety incident investigation program in the facility or company process safety program metrics. These KPIs should be collected approximately every 1 to 3 months. Schedule formal reviews of process safety metrics on a frequent, ongoing basis and generate recommendations to improve incident investigations based on the trends of the KPIs [3] [10] [11].
  Possible incident investigation lagging KPIs include:
  - number of near misses that were reported (near misses can also be categorized as leading indicators)
  - number of incidents that were reported
  - number of near misses that were investigated
  - number of incidents that were investigated
  Possible incident investigation leading KPIs include:
  - number of overdue incident investigation action items
  - number of deferred incident investigation action items
- Establish an open, trusting, and no-blame culture even when human error was identified as a contributor or root cause. Human error usually indicates lack of training, inadequate procedures, or other management system failures.
- Establish positive reinforcement for recognizing the importance of near misses. Near misses provide an opportunity to learn valuable lessons without suffering any adverse consequences. They represent early warning signs of more serious events and they should be thoroughly analyzed for their root causes [27].
- Conduct “safety stand down” sessions when required. A safety stand down is an organized break from work during which employers hold a safety discussion with their employees. Stand downs can occur at any time that employers deem it necessary. They are often used as occasions to discuss safety problems that are serious or reoccurring, while also reinforcing the organization’s policies regarding safety in general.
- Consider performing an independent internal review of past incident investigations to determine if the root cause analysis was complete and if the lessons learned were disseminated adequately.

How – Implementers / Users:

- Establish a formal process for sharing the lessons learned from incident investigations with relevant internal personnel whose jobs are affected by the lessons learned.
  This process should include methods of presenting the lessons to personnel, e.g., face-to-face briefings, use of other regular forums such as safety meetings to present them, use of e-mail systems to transmit lessons learned to relevant personnel, or other methods. The process should also provide for documentation of what was shared, when it was shared, and who received it.
- Establish a process for sharing the lessons learned from incident investigations with relevant external organizations. For the purposes of this guidance, external organizations can include other sites/facilities within the same company, as well as organizations outside the company where the incident occurred. This process should carefully define what “external” means in this context, and this definition should be carefully reviewed by all relevant groups, disciplines, and individuals within the company before any external sharing takes place.
- Use lessons learned from previous incidents and near misses to maintain and heighten the workforce’s sense of vulnerability to the risks in their facility.
- Use the lessons learned in training programs for operators, maintenance personnel, engineering and project personnel, and others as appropriate.
- Frequently “re-tell the story” of higher importance near misses and incidents in order to maintain corporate memory.
- Promote communication of process safety incident investigation learnings internally. These communication forums can include:
  - Process Safety Bulletins (internal and external)
  - Toolbox meetings
  - Video re-enactments
  - Process Safety Town Halls / Meetings
  - Any other event that supports improving process safety culture
- Obtain incident reports and incorporate the lessons learned from previous incidents and near misses when writing and reviewing operating procedures, particularly in warning and caution statements.
- Obtain incident reports and use the lessons learned from previous incidents and near misses during operator training activities.
- Obtain incident reports and include the lessons learned from previous incidents and near misses in the emergency response plan and its supporting procedures [29].
- Seek and apply lessons learned from other company locations and industry.

Supplemental Reading: [2] [5] [22] [30]

References and Supplemental Readings

[1] CCPS (Center for Chemical Process Safety), Guidelines for Risk Based Process Safety, Hoboken, NJ: John Wiley and Sons, 2007.
[2] CCPS (Center for Chemical Process Safety), Guidelines for Investigating Process Safety Incidents, Hoboken, NJ: John Wiley and Sons, 2019.
[3] CCPS (Center for Chemical Process Safety), "Process Safety Metrics - Guide for Selecting Leading and Lagging Indicators," CCPS, New York, 2022.
[4] API (American Petroleum Institute), API RP 754 Process Safety Performance Indicators for the Refining and Petrochemical Industries, 3rd Edition, Washington, D.C.: API (American Petroleum Institute), 2021.
[5] CSB, "Refinery Explosion and Fire, Investigation Report No. 2005-04-I-Tx, BP, Texas City," US Chemical Safety and Hazard Investigation Board (CSB), Washington, D.C., 2007.
[6] CSB, "T2 Laboratories, Inc. Runaway Reaction, Report No. 2008-3-I-FL," US Chemical Safety and Hazard Investigation Board (CSB), Washington, D.C., 2009.
[7] CSB, "Combustible Dust Fire & Explosions, Report No. 2003-09-I-KY," US Chemical Safety and Hazard Investigation Board (CSB), Washington, D.C., 2003.
[8] CCPS (Center for Chemical Process Safety), "CCPS Process Safety Glossary," Center for Chemical Process Safety, 2021. [Online]. Available: www.aiche.org/ccps/resources/glossary.
[9] CCPS (Center for Chemical Process Safety), Guidelines for Integrating Management Systems and Metrics to Improve Process Safety Performance, Hoboken, NJ: John Wiley and Sons, 2016.
[10] The American Chemistry Council (ACC), "Performance Metrics Guidance Document," The American Chemistry Council (ACC), Washington, D.C., 2014.
[11] International Association of Oil and Gas Producers, "Recommended Practice on Key Performance Indicators," Report 456, November 2018.
[12] CCPS (Center for Chemical Process Safety), Recognizing and Responding to Normalization of Deviance, Hoboken, NJ: John Wiley and Sons, 2018.
[13] CCPS (Center for Chemical Process Safety), Essential Practices for Creating, Strengthening, and Sustaining Process Safety Culture, Hoboken, NJ: John Wiley and Sons, 2018.
[14] American Petroleum Institute (API), API RP 585 – Pressure Equipment Integrity Incident Investigation – First Edition, Washington, D.C.: American Petroleum Institute (API), 2014.
[15] NFPA, Guide for Fire and Explosion Investigations, NFPA 921, Quincy, MA: National Fire Protection Association, 2017.
[16] CCPS (Center for Chemical Process Safety), "CCPS-Process Safety Incident Evaluation Tool," [Online]. Available: www.aiche.org/ccps.
[17] CCPS (Center for Chemical Process Safety), Guidelines for Process Safety Documentation, Hoboken, NJ: John Wiley and Sons, 1995.
[18] US Occupational Safety and Health Administration (OSHA), 29 CFR 1910.119 Process safety management of highly hazardous chemicals, OSHA.
[19] US EPA, 40 CFR 68 Chemical Accident Prevention Provisions, Washington, D.C.: US Environmental Protection Agency.
[20] CSA Group, CSA Z 767 2nd edition, Process Safety Management, CSA Group, 2017.
[21] CSB, "Thermal Decomposition Incident, BP Amoco Polymers, Inc., Report No. 2001-03-GA," US Chemical Safety and Hazard Investigation Board (CSB), Washington, D.C., 2002.
[22] CCPS (Center for Chemical Process Safety), Incidents that Define Process Safety, New York: AIChE, 2001.
[23] Rogers Commission, Report to the President by the Presidential Commission On the Space Shuttle Challenger Accident, June 6, 1986.
[24] Baker Panel, "The Report Of The BP U.S. Refineries Independent Safety Review Panel," January 2007.
[25] US Occupational Safety and Health Administration (OSHA), OSHA Instruction CPL 2-2.45A, Washington, D.C.: OSHA, 1994.
[26] CCPS (Center for Chemical Process Safety), Inherently Safer Chemical Processes - A Life Cycle Approach (3rd Edition), Hoboken, NJ: John Wiley and Sons, 2019.
[27] CCPS (Center for Chemical Process Safety), Recognizing Catastrophic Incident Warning Signs in the Process Industries, Hoboken, NJ: John Wiley and Sons, 2011.
[28] U.S. EPA, Joint Chemical Accident Investigation Report, Shell Chemical Company, Deer Park, Texas, US EPA document #550-R-98-005, Washington, D.C.: U.S. Environmental Protection Agency, 1998.
[29] CSB, "Toxic Chemical Release at the DuPont La Porte Chemical Facility," US Chemical Safety and Hazard Investigation Board (CSB), Washington, D.C., 2019.
[30] A. Ness, "Lessons Learned from Recent Process Safety Incidents," CEP, pp. 23-29, March 2015.