Risk management assessments, including the use of risk matrices, are crucial for controlling operational risks associated with safety-critical equipment (SCE) in the oil and gas industry. However, despite the use of validated industry risk identification templates (such as the bowtie diagram), these assessments can be flawed, not because of the template itself, but because the underlying data is outdated.
Unfortunately, process safety management systems may be considered acceptable until a safety incident, or even a major risk event, occurs.
Without a comprehensive and integrated assessment, the risks of safety-critical equipment cannot be mitigated. For example, energy companies, including those in the oil and gas sector, often assume that the risk identification, risk assessment, and bowtie diagrams for each piece of safety-critical equipment are based on the latest incident history database and are consistent with their current process safety management system. These assumptions should be continuously validated, including for safety-critical equipment, control rooms, alarm management, and human-machine interfaces.
Safety-critical equipment and risk prevention tools
In its recommendations for fire and explosion loads on offshore facilities (2006), the American Petroleum Institute defines a safety-critical component as "any component of a structure, equipment, plant, or system whose failure could result in a major accident."
This definition applies to all safety-critical equipment, but each safety-critical equipment element is unique. Almost every enterprise relies on a risk matrix to identify safety-critical equipment and prioritize process management (see Figure 1).
Figure 1: The risk matrix rates the probability of an event and the severity of its consequences. Note the red blocks in the "Major" and "Catastrophic" consequence categories. Image credit: Niresh Behari
This risk matrix helps identify large-scale risks, which may include equipment involving hazardous materials such as gasoline, liquefied petroleum gas, and paraffin. These safety-critical devices require the highest priority for risk management, and all stakeholders must understand this definition.
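The matrix lookup described above can be sketched in a few lines of Python. This is only an illustration: the "Major" and "Catastrophic" categories come from Figure 1, while the five-step likelihood scale, the lower severity levels, the score thresholds, and the function names are all assumptions made for the example, not values from any published matrix.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    # Hypothetical five-step likelihood scale (not taken from the article).
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    ALMOST_CERTAIN = 5


class Severity(IntEnum):
    # MAJOR and CATASTROPHIC follow Figure 1; the lower levels are assumed.
    MINOR = 1
    MODERATE = 2
    SERIOUS = 3
    MAJOR = 4
    CATASTROPHIC = 5


def risk_band(likelihood: Likelihood, severity: Severity) -> str:
    """Map a matrix cell to a colour band using assumed score thresholds."""
    score = int(likelihood) * int(severity)
    if score >= 12:
        return "red"
    if score >= 5:
        return "amber"
    return "green"


def is_sce_candidate(likelihood: Likelihood, severity: Severity) -> bool:
    """Flag red cells in the Major or Catastrophic columns for SCE review."""
    return severity >= Severity.MAJOR and risk_band(likelihood, severity) == "red"


if __name__ == "__main__":
    print(risk_band(Likelihood.POSSIBLE, Severity.CATASTROPHIC))
    print(is_sce_candidate(Likelihood.ALMOST_CERTAIN, Severity.SERIOUS))
```

In this sketch a cell is only flagged when it is both red and in one of the two highest consequence columns, which mirrors the emphasis in Figure 1. A real matrix would use the company's own calibrated categories.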
After completing the Hazard and Operability (HAZOP) study and identifying and examining the risks associated with the equipment or operation of the equipment, the next step is to create a bowtie analysis and risk management plan, focusing on the risks associated with safety-critical equipment.
The left side of the bowtie diagram represents threats and proactive responses, while the right side represents reactive controls. This approach trains all safety-critical equipment operators to understand what causes "incidents," and how operators, equipment, and even the public can be exposed to them. Although tailored for process engineers and plant operators, bowtie analysis is a mandatory risk prevention tool that operators and others must understand and follow.
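The bowtie layout described above can be represented as a simple data structure. The sketch below is a minimal illustration, assuming a bowtie with threats and preventive barriers on the left, a central top event, and consequences with reactive barriers on the right. The class names, the `healthy` flag, and the example barriers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Barrier:
    name: str
    kind: str             # "hardware" or "human" (illustrative labels)
    healthy: bool = True  # hypothetical status from barrier assurance checks


@dataclass
class Bowtie:
    top_event: str
    # Left side: each threat maps to its proactive (preventive) barriers.
    threats: Dict[str, List[Barrier]] = field(default_factory=dict)
    # Right side: each consequence maps to its reactive barriers.
    consequences: Dict[str, List[Barrier]] = field(default_factory=dict)

    def unprotected_paths(self) -> List[str]:
        """Return threats or consequences that have no healthy barrier."""
        paths = []
        for side in (self.threats, self.consequences):
            for name, barriers in side.items():
                if not any(b.healthy for b in barriers):
                    paths.append(name)
        return paths


if __name__ == "__main__":
    bowtie = Bowtie(
        top_event="Loss of containment",
        threats={
            "Corrosion": [Barrier("Inspection programme", "human")],
            "Overpressure": [Barrier("Relief valve", "hardware", healthy=False)],
        },
        consequences={"Fire": [Barrier("Deluge system", "hardware")]},
    )
    print(bowtie.unprotected_paths())
```

Listing the paths that have no healthy barrier is one simple way to show operators where a threat can reach the top event, or a top event can reach a consequence, unchecked.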
Barrier assurance is another component of risk management and risk prevention. Some argue that barriers are limited to hardware that prevents the release of hazards or limits the negative consequences of a loss of process containment. However, a second barrier, human intervention, is just as important as hardware. The hardware barrier assurance template published by Oman LPG includes eight safety risk barriers, but human factors are its foundation.
The template includes questions specifically designed for safety-critical equipment operators and staff, requiring proactive responses. Examples include: "What can be done to keep the barriers operating safely?" and "How can damaged barriers be managed safely?"
The hardware barrier assurance form illustrates this: "Do you have your own bowtie barrier?" Any answer short of a definite response points to a process safety vulnerability unless proactive measures are taken to close the gap. The goal is to always have appropriate barriers in place to manage risk and limit it as much as possible.
Safety Integrity Level and Maintenance Standards
Setting a target Safety Integrity Level (SIL) for each safety instrumented function can lead to a false sense of security. Each level expresses the required performance as a probability of failure within a specified timeframe: SIL1 corresponds to one failure in 10 years, SIL2 to one in 100 years, and SIL3 to one in 1,000 years. These levels are the petrochemical industry standard for the control systems that protect critical equipment such as pressure vessels, column stacks, and storage tanks.
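For comparison, IEC 61508 and IEC 61511 define SIL for low-demand functions in terms of the average probability of failure on demand (PFDavg) rather than years. The sketch below assumes those low-demand bands, in which each SIL step tightens the allowed PFDavg by a factor of ten. The function names are illustrative and not part of any standard or library.

```python
from typing import Optional

# Low-demand PFDavg bands per IEC 61508/61511: (SIL, lower bound, upper bound).
# A function meets SIL n when lower <= PFDavg < upper.
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_for_pfd(pfd_avg: float) -> Optional[int]:
    """Return the SIL band containing pfd_avg, or None if it is outside SIL1-SIL4."""
    for sil, lower, upper in SIL_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return None


def risk_reduction_factor(pfd_avg: float) -> float:
    """Risk reduction factor is the reciprocal of PFDavg."""
    if pfd_avg <= 0:
        raise ValueError("pfd_avg must be positive")
    return 1.0 / pfd_avg


if __name__ == "__main__":
    for pfd in (0.05, 0.005, 0.0005):
        print(pfd, sil_for_pfd(pfd), risk_reduction_factor(pfd))
```

Seen this way, a SIL target is a band rather than a single number, which is one reason a bare "SIL2" label says little about how much margin a given loop actually has.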
However, classifying SIL2 and SIL3 safety-critical equipment is not so simple. The challenge lies in identifying the instrumented functions of safety-critical equipment: there are thousands of SIL1 safety instruments and control loops, so equipment is easily misclassified. Incorrect classification affects the maintenance priority of critical instruments and control loops.
Best industry practice is to use established safety-critical equipment management processes for safety protection, filter all SIL1 instruments and safety processes, and treat SIL2 and higher-level safety protection control loops as primary risk safety-critical equipment.
The SIL1 filtering process should include reviewing recent incidents. Modern safety management systems do not need to reference outdated incident data when past safety management systems bear no resemblance to those used today.
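The filtering practice described above can be sketched as a small classification routine. This is a hedged illustration only: the `ControlLoop` record, the `recent_incident` flag, the bucket names, and the rule that a SIL1 loop with a recent incident is sent for review are all assumptions made for the example, not a published procedure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ControlLoop:
    tag: str                       # hypothetical instrument tag
    sil: int                       # target SIL assigned to the loop
    recent_incident: bool = False  # assumed flag from a review of recent incident data


def classify_loops(loops: Iterable[ControlLoop]) -> Dict[str, List[str]]:
    """Split loops into primary SCE, SIL1 needing review, and routine SIL1."""
    result: Dict[str, List[str]] = {"primary_sce": [], "sil1_review": [], "sil1_routine": []}
    for loop in loops:
        if loop.sil >= 2:
            # SIL2 and higher loops are treated as primary-risk SCE.
            result["primary_sce"].append(loop.tag)
        elif loop.sil == 1 and loop.recent_incident:
            # Assumption: recent incident history escalates a SIL1 loop.
            result["sil1_review"].append(loop.tag)
        elif loop.sil == 1:
            result["sil1_routine"].append(loop.tag)
    return result


if __name__ == "__main__":
    loops = [
        ControlLoop("PT-101", sil=2),
        ControlLoop("LT-205", sil=1, recent_incident=True),
        ControlLoop("FT-310", sil=1),
    ]
    print(classify_loops(loops))
```

Keeping the incident flag as an explicit input makes it visible when the classification was last checked against current data rather than an older incident history.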
What does this mean for interactions between people and safety-critical equipment? A robust and modern SIL1 filtering process for identifying safety-critical equipment, using relevant and up-to-date process safety incident data, can reduce maintenance and turnaround priorities and save operating costs. Even though the likelihood of an incident is extremely low, management must advise personnel working with safety-critical equipment not to rely on outdated SIL1 time parameters.
In the oil and gas industry, the Fire, Explosion, and Release Severity Index (FER-SI) is another important risk prevention and mitigation tool for identifying and quantifying safety-critical equipment. It is a hydrocarbon leak quantification and identification model used to assess potential equipment defects that could lead to leaks. The index can also be used as a lagging indicator to provide a conceptual framework for identifying safety-critical equipment.
For those managing and operating safety-critical equipment, it is important to consider the probabilities involved, including the type of plume generated by a leak, the location of a safe room, the safe distance for workers in the event of a toxic release, and the consequences of falling objects.
Figure 2 shows existing practices regarding leading and lagging indicators related to design and safety-critical equipment management, alarm control, and shutdown bypass. Figures 3, 4, and 5 show the root causes of process safety failures related to safety-critical equipment. The hydrocarbon leak shown in Figure 3 could be related to equipment design, change management issues, asset integrity, or operation. The latter two deficiencies tend to be more severe in plants lacking a robust safety-critical equipment management system.
Figure 2: This model describes leading and lagging safety-critical equipment problem indicators.
Figure 3: This figure shows the possible sources of hydrocarbon leakage.
Figure 4 breaks down equipment and mechanical failures at natural gas processing plants and shows that failure to follow maintenance plans is a significant cause of leaks and catastrophic process safety incidents. This is often related to deficiencies in prioritizing safety-critical equipment and inadequate process safety management systems.
Figure 4: This figure shows the various causes of safety-critical equipment problems by percentage.
Several potential human factors are associated with ineffective management of safety-critical equipment (see Figure 5). These include applying incorrect risk management principles when assessing reliability-centered maintenance, such as principles that are incompatible with existing process safety management systems or based on outdated data.
Figure 5: This figure shows the percentage of safety-critical equipment problems caused by personnel and enterprise issues.
Do operators or workers understand the usage restrictions involved in each item defined as safety-critical equipment? For almost all companies, published standard operating procedures require documentation that clearly defines each operational boundary. For example, the boundary approach implemented in the oil sands industry includes identifying pressure vessels, heat exchangers, rotating equipment, and tanks used for hydrocarbon services as safety-critical equipment.
Human-machine interface evaluation
Individual work-related factors and corporate factors are integral components of process safety culture and of the interaction with safety-critical equipment. To facilitate understanding, multidisciplinary process safety experts and human factors engineering researchers at Sasol Gas & Chemicals, South Africa, conducted a cognitive survey between 2009 and 2013 to assess human-machine interface systems, followed by risk assessment interviews to address technical, mechanical, maintenance, and work-related issues, as well as personnel stress. These issues covered topics such as safety-critical communication and remote operation. On the corporate side, the work included FER-SI assessments of leak-related mechanical performance and audits of process safety management systems.
The question templates covered control rooms, alarm handling, and process control systems, as well as the perspectives of safety-related staff on relevant issues. For example, one template addressed loss of containment and the reliability of general plant equipment. The assessment results were surprising, even disturbing. The survey found that complacency was becoming a major problem due to a lack of staff rotation, significantly affecting other issues related to alarm management.
On the positive side, the facility is committed to problem-solving. A review of the monthly alarm downtime list indicates that alarm automation controls incorporating AI components are considered effective support for the process control system architecture. On the other hand, respondents described insufficient review of trends and patterns. One plant mentioned the frequency of false alarms.
The responses explain why staffing, workload, and maintenance surveys should be included in the risk assessment template. A list of safety-critical equipment technologies and human factors for each project should be considered mandatory.
An inadequate risk matrix can have a significant negative impact on human-machine interaction, prompting executives and management to demand ongoing reviews of processes and procedures to identify every potential risk. This template can demystify process safety culture assessment by collecting the best performance indicators from all leading and lagging data related to safety-critical equipment, including surveys of how employees perceive their interactions with the equipment.
This approach can reverse a disturbing industry trend in which companies learn to mitigate risks to safety-critical equipment only after major risk events or widely publicized disasters. While valuable lessons can be learned from past disasters, today's risk matrix must be relevant to current safety management systems. Any complacency can lead to an unacceptable increase in risk.