Connecting alarm management and process safety

Making the connection between alarm management and process safety management can ensure a safe and productive process.

By Lee Swindler May 29, 2019

Two highly specialized areas of expertise in the process automation industry are alarm management and process safety management. While you might think these are separate topics, they actually go hand-in-hand. Let’s examine the relationship. To get us all on the same page, here is a definition of each:

Alarm management is the application of human factors (ergonomics) to design and maintain an alarm system to maximize its effectiveness. A common problem is having too many alarms annunciated during a plant upset, commonly referred to as an “alarm flood.” However, other problems can exist with an alarm system, such as poor prioritization, improperly set alarm points, ineffective annunciation, unclear alarm meanings and so on. Improper alarm management is one of the leading causes of unplanned downtime, contributing to more than $20 billion in lost production every year, and to major industrial incidents, such as the 2005 Texas City refinery explosion.

Process safety management (PSM) is a disciplined framework for managing the integrity of systems and processes that handle hazardous substances. It relies on good design principles, well-implemented automation systems and engineering, operating and maintenance practices. It deals with the prevention and control of events that have the potential to release hazardous materials and energy. For the process industry, emphasis is placed on preventing unplanned releases that could escalate into a major incident. A major incident typically is initiated by a hazardous release, but it also may result from a structural failure or loss of stability.

It’s about risk

How does alarm management impact process safety? In addition to keeping a facility operating better, it comes into play when determining the risk, or more precisely the residual risk, in a given process. PSM risk analysis can be broken down into three steps:

Step 1: First, you must systematically evaluate the hazards (inherent risks) in operating a given process unit. This is typically done in a team setting performing a process hazard analysis (PHA) with the most common method being a hazard and operability (HAZOP) study. Hazards are identified and individually evaluated to determine the probability of occurrence along with the severity of consequences if the hazard is realized. In most companies, the overall risk is defined as the probability times the severity.

Step 2: For each identified hazard, the team must then evaluate any safeguards that mitigate those hazards (e.g., the alarm system) to determine how much residual risk remains after taking credit for the safeguards. The safeguards are called independent protection layers (IPLs). Independence is important because if one safeguard fails, it should not affect any other safeguard’s ability to mitigate risk. Several methods are used to evaluate the safeguards with the most common being a layer of protection analysis (LOPA).

Step 3: After taking credit for the IPLs, the team will compare the residual risk to the company-defined tolerable risk level to determine if action needs to be taken. If residual risk is greater than tolerable, either the process needs to be redesigned or additional safeguards installed. A common safeguard is to install a safety instrumented system (SIS) to reduce the residual risk to acceptable levels. The size of the gap between the residual risk and the tolerable risk determines the safety integrity level (SIL), which is a measure of how “safe” the SIS needs to be (see Figure 1).

In the case shown in Figure 1, the residual risk level exceeds the tolerable risk level even after taking credit for IPLs, such as the basic process control system (BPCS) and mechanical protection (e.g., relief valves or rupture disks). To fill this gap in protection, a SIS can be implemented to reduce the residual risk to a tolerable level.

Note that the BPCS IPL usually is a credit for a safety alarm that triggers an operator response preventing the hazardous event, or for some type of automated control that keeps the process from reaching the hazardous condition. However, you can count on the BPCS for only one IPL credit, because the alarm and automated control functions are not truly independent. Certain BPCS failures could disable both functions.
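The three-step risk analysis above can be sketched numerically. The following is a minimal illustration, not a substitute for a real LOPA: the function names, the initiating frequency, the PFD values and the tolerable frequency are all hypothetical, and the SIL bands are the low-demand risk reduction ranges commonly tabulated in the functional safety standards (SIL 1 for a risk reduction factor up to 100, SIL 2 up to 1,000 and so on).

```python
def residual_frequency(initiating_freq_per_year, ipl_pfds):
    """Event frequency left after crediting each independent protection layer."""
    freq = initiating_freq_per_year
    for pfd in ipl_pfds:
        freq *= pfd  # the event propagates only when this IPL fails on demand
    return freq


def required_sil(residual_freq, tolerable_freq):
    """Return (risk reduction factor, SIL) needed to close the gap; SIL 0 means no gap."""
    if residual_freq <= tolerable_freq:
        return 1.0, 0
    rrf = residual_freq / tolerable_freq
    # Pick the lowest SIL whose upper risk-reduction bound covers the gap.
    for sil, max_rrf in ((1, 1e2), (2, 1e3), (3, 1e4), (4, 1e5)):
        if rrf <= max_rrf:
            return rrf, sil
    raise ValueError("gap too large for a single SIS; consider redesigning the process")


# Hypothetical numbers for illustration only.
residual = residual_frequency(0.5, [0.1, 0.01])  # one BPCS credit, one relief valve
rrf, sil = required_sil(residual, tolerable_freq=1e-5)
print(f"residual = {residual:.1e}/yr, RRF = {rrf:.0f}, SIL {sil}")
```

Each company sets its own tolerable frequency and IPL credit rules, so the numbers above only show the shape of the calculation.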

Gaining independence

As mentioned, the BPCS can be counted on for only one IPL credit for reasons of independence. But what if you are already taking credit for automated control and also would like to have credit for a safety IPL alarm? Is there a way to implement alarms that are kept separate from the BPCS? Well, yes, there is. Here are two common approaches:

  1. You can directly wire a field instrument to a lightbox annunciator. This was the traditional approach for alarming in the “old days” of panel board control systems and still is used today. Because the signal and annunciation are kept separate from the BPCS, you can take IPL credit for operator action from the alarm in addition to the automated control in the BPCS.
  2. To avoid the limitations of using a lightbox annunciator, a field instrument instead can be routed to an independent monitoring system with a human-machine interface (HMI) separate from the BPCS. The SIS or a dedicated programmable logic controller (PLC) can feed the independent HMI. Note that this independent system as a whole needs to have a probability of failure on demand (PFD) of 0.1 or less.

The key is to have the safety IPL alarm function completely separate from the BPCS such that no single failure, including cybersecurity attack, could jeopardize the function of both systems. This independence includes the field instrument, which cannot be shared with the BPCS.

Requirements for safety IPL alarms

Several requirements need to be met to use an alarm as a safety IPL:

  1. The alarm must be designed to work for a specific initiating event that leads to the hazard you are trying to prevent.
  2. You need to test the alarm, including associated instrumentation, at an appropriate frequency to verify that the alarm will work.
  3. The alarm must be independent from other IPLs and not disabled by the initiating event for the hazard.
  4. The likelihood that the alarm will annunciate and that the operator will respond properly must be high enough for the alarm to be counted as an IPL. The overall alarm function (sensor + logic solver + HMI + operator response) needs to have a PFD of less than or equal to 0.1. This subject is addressed in detail in various Center for Chemical Process Safety (CCPS) books, such as “Guidelines for Safe Automation of Chemical Processes” and “Layer of Protection Analysis.”
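To see how the fourth requirement plays out, the overall PFD of the alarm function can be estimated from its elements. This is a rough sketch that assumes the elements fail independently and that the function succeeds only if every element works; the element PFD values and the function name are hypothetical and are not taken from the CCPS guidance.

```python
def overall_pfd(element_pfds):
    """PFD of a chain in which every element must work for the function to succeed."""
    p_success = 1.0
    for pfd in element_pfds.values():
        p_success *= 1.0 - pfd  # probability this element works on demand
    return 1.0 - p_success


# Hypothetical element values for illustration only.
alarm_function = {
    "sensor": 0.01,
    "logic_solver": 0.005,
    "hmi": 0.005,
    "operator_response": 0.05,
}

pfd = overall_pfd(alarm_function)
qualifies = pfd <= 0.1  # the threshold stated in requirement 4
print(f"overall PFD = {pfd:.4f}, qualifies as IPL: {qualifies}")
```

In this illustration the operator response dominates the total, which is why response time and alarm clarity matter so much for an alarm credited as an IPL.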

PHA team responsibilities

When performing the PHA, it is essential that the team properly evaluates and documents every alarm that has been designated for IPL credit. This includes answering the following questions:

  • Does the alarm meet the four requirements defined above for safety IPL alarms?
  • Will the operator have adequate time to recognize the alarm and take necessary action before the hazard is realized?
  • What should the alarm setpoint be?
  • What does the test frequency need to be?
  • What is the proper operator action required to mitigate the hazard?

All this information needs to be entered into the alarm management system with a designated class for safety IPL alarms to ensure they are not modified unless a PHA is conducted. Regardless of what sort of alarm management system is used, the IPL alarms need to be clearly designated because they require special handling.
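One way to picture the designated class is as a record that carries the PHA team's answers and refuses changes that lack a PHA reference. The sketch below is purely illustrative: the field names, the class label, the tag and the PHA identifier are hypothetical and do not describe any particular alarm management product.

```python
from dataclasses import dataclass, replace
from typing import Optional

SAFETY_IPL = "SAFETY_IPL"  # hypothetical class label


@dataclass(frozen=True)
class AlarmRecord:
    tag: str
    alarm_class: str
    setpoint: float
    test_interval_months: int
    operator_action: str
    operator_response_min: float  # time needed to recognize the alarm and act
    time_to_hazard_min: float     # time from alarm until the hazard is realized

    def operator_has_time(self) -> bool:
        """True when the operator can act before the hazard is realized."""
        return self.operator_response_min < self.time_to_hazard_min


def change_setpoint(record: AlarmRecord, new_setpoint: float,
                    pha_reference: Optional[str] = None) -> AlarmRecord:
    """Return an updated record; safety IPL alarms require a PHA reference."""
    if record.alarm_class == SAFETY_IPL and not pha_reference:
        raise PermissionError(f"{record.tag}: safety IPL alarm changes require a PHA")
    return replace(record, setpoint=new_setpoint)


# Hypothetical alarm for illustration only.
alarm = AlarmRecord("LAH-EXAMPLE", SAFETY_IPL, 85.0, 12,
                    "Close the feed valve", 10.0, 30.0)
print(alarm.operator_has_time())
```

Calling change_setpoint(alarm, 90.0) without a PHA reference raises PermissionError, while passing a reference returns a new record and leaves the original untouched.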

Impact on alarm management

Using alarms as safeguards for process safety hazards increases their importance and makes proper alarm management more imperative than ever.

Maintaining the performance of the alarm system is critical to ensure that operator response is timely and accurate. Alarm floods, chattering or an excessive number of active alarms will reduce the chance that the safety IPL alarm will receive the attention needed. Alarm response procedures should be clear and easily accessible (ideally in the HMI) so operators can respond quickly and effectively.
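Alarm floods can be checked for in the alarm log. The sketch below flags a flood when more than 10 alarms fall within any 10-minute window for one operator; that threshold is a commonly cited rule of thumb and is treated here as an assumption, as are the function name and the sample timestamps.

```python
def has_alarm_flood(times_min, window_min=10.0, threshold=10):
    """True if any sliding window of window_min minutes holds more than threshold alarms."""
    times = sorted(times_min)
    start = 0
    for end, t in enumerate(times):
        # Drop alarms that have fallen out of the window ending at time t.
        while t - times[start] >= window_min:
            start += 1
        if end - start + 1 > threshold:
            return True
    return False


# Hypothetical annunciation times, in minutes since the start of the shift.
upset = [i * 0.5 for i in range(11)]   # 11 alarms within 5 minutes
steady = [i * 2.0 for i in range(11)]  # 11 alarms spread over 20 minutes

print(has_alarm_flood(upset), has_alarm_flood(steady))
```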

Auditing is an ANSI/ISA-18.2 lifecycle requirement that calls for a comprehensive assessment of the alarm system, including evaluation of alarm system performance and of the work practices used to administer it. Periodic reviews of how frequently safety IPL alarms have been triggered, along with the timing and accuracy of the associated operator response, will reveal gaps not apparent from routine monitoring and will allow identification of necessary improvements.
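A periodic review of this kind could start from a simple summary of the event log. The following is a minimal sketch; the log format, tag names, allowed response times and function name are all hypothetical.

```python
from collections import defaultdict


def review_ipl_alarms(events, allowed_response_min):
    """Count activations, late responses and incorrect actions per safety IPL alarm."""
    summary = defaultdict(lambda: {"activations": 0, "late": 0, "incorrect": 0})
    for event in events:
        stats = summary[event["tag"]]
        stats["activations"] += 1
        if event["response_min"] > allowed_response_min[event["tag"]]:
            stats["late"] += 1
        if not event["correct_action"]:
            stats["incorrect"] += 1
    return dict(summary)


# Hypothetical log entries for illustration only.
events = [
    {"tag": "LAH-EXAMPLE", "response_min": 4.0, "correct_action": True},
    {"tag": "LAH-EXAMPLE", "response_min": 12.0, "correct_action": True},
    {"tag": "PAH-EXAMPLE", "response_min": 3.0, "correct_action": False},
]
allowed = {"LAH-EXAMPLE": 10.0, "PAH-EXAMPLE": 5.0}

report = review_ipl_alarms(events, allowed)
for tag, stats in report.items():
    print(tag, stats)
```

A frequently triggered safety IPL alarm, or one with late or incorrect responses, is a candidate for the improvements the audit is meant to identify.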

Safer processes

As you can see, there are many interactions between alarm management and process safety management. Each discipline requires a rigorous methodology to properly implement, yet understanding how they interact is equally important to ensure a safe and productive process.

Lee Swindler is an industry manager with Maverick Technologies, a CFE Media content partner. He has 31 years of automation industry experience, including 21 years in manufacturing and 10 years on the engineering services side. He has a PMP certification along with being a TÜV certified functional safety engineer.

This article appears in the Applied Automation supplement for Control Engineering and Plant Engineering.

Original content can be found at Control Engineering.

