Effective alarm management requires planning, maintenance

ISA 18.2 specification offers a roadmap to successful implementation


Every year existing HMI/SCADA systems are upgraded. Engineers update control systems seeking new technology or better support. HMI screens are ported or redrawn, storage methods are configured, and security roles are created. With new technology HMI screens become more effective through consolidation of information and enhanced visualization. Storage becomes smarter by utilizing advanced algorithms to store data faster and make it more accessible for data mining. 

The result is more devices, more data, and more alarms. Often overlooked or ignored are the questions of what and why. Why is this data important? What is the purpose of this HMI display? Why is this alarm triggered?

Figure 1: ISA 18.2 Alarm Management Lifecycle example. Source: ICONICS Inc.

In 2009 the International Society of Automation sought to guide engineers and operators in the management of alarm systems. The resulting ISA 18.2 specification defines a simple, iterative process to maintain the operability and efficacy of alarms in automation systems.


The ISA 18.2 specification defines an alarm as “an audible and/or visual means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response.” Because an alarm by definition demands a response, those designing the application must know the automation system well enough to decide which process values or device states qualify as valid alarms.

Whether an application is old or new, alarms are a depiction of the system itself. Too many alarms and the image becomes cluttered; too few and the image is blurry. The ISA 18.2 specification and its guidelines help eliminate poorly defined alarms from the initial design, which keeps the alarm system manageable.

Alarm management lifecycle

In order to qualify all potential alarms with a consistent procedure, the ISA 18.2 specification outlines an Alarm Management Lifecycle. The lifecycle is an iterative process diagram that begins with an alarm philosophy, guides through the rationalization and implementation of new alarms, defines a maintenance cycle, and recommends periodic auditing of the alarm system. 

The process identified by the ISA 18.2 specification begins with a discussion of the individualized philosophy, followed by a design, implementation, and change cycle that is fed by operational and maintenance-based feedback and periodically checked by an auditing process.


Alarm philosophy is the first step in the alarm management lifecycle defined in the ISA 18.2 specification and is fundamental to the process. Without a good plan defining roles for implementation, operation, maintenance, and periodic auditing, it is easy for any organization to let an alarm system degrade. The philosophy should document objectives and be revisited during auditing to ensure that it evolves with the alarm system.

Figure 2: An example of a weekly alarm distribution by priority report. Source: ICONICS Inc.

Design, implementation, and change cycle

For existing systems, the process should begin with the operational feedback stages of the alarm management lifecycle, because the information gathered there is crucial to eliminating alarms that are not practical or effective. The design of new alarms should involve operations and maintenance representatives to ensure viability. Following the documented philosophy, a new alarm must be identified as necessary before it is implemented and organized into the system.

Operational feedback

The operational feedback loop observed as part of the alarm management lifecycle in the ISA 18.2 specification is the basis of the process for continuously improving an alarm system. Once an alarm system is in place and operational, the alarm management lifecycle shrinks in scope to the steps of operation, maintenance, monitoring and assessment, and auditing. 

Figure 3: An example of an alarm chattering report. Source: ICONICS Inc.

Emphasis should be placed on gaining insight from the live system by running periodic reports that identify key performance indicators of the alarm system. Key performance indicators are essential to identifying nuisance alarms, such as stale or chattering alarms, and to gathering root cause analysis evidence and Pareto charts that provide insight into the operation of the alarm management system.
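As a rough illustration of the Pareto idea mentioned above, the following sketch ranks alarm tags by how often they occurred and adds a cumulative percentage, so the few alarms responsible for most of the load stand out. The function name and the tag-list input are assumptions for illustration only; a real report would read from the alarm historian of the system in use.

```python
from collections import Counter


def pareto(alarm_tags):
    """Rank alarm tags by occurrence count with a cumulative percentage.

    alarm_tags: iterable of tag names, one entry per annunciated alarm
    (a hypothetical input format). Returns a list of
    (tag, count, cumulative_percent) rows, most frequent first.
    """
    counts = Counter(alarm_tags)
    total = sum(counts.values())
    rows = []
    running = 0
    # Sort by descending count; break ties alphabetically for stable output.
    for tag, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += count
        rows.append((tag, count, round(100.0 * running / total, 1)))
    return rows


if __name__ == "__main__":
    for row in pareto(["LT101"] * 6 + ["PT7"] * 3 + ["TT1"]):
        print(row)
```

Tags near the top of the list are natural candidates for review at the next audit.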

Combining ISA 18.2 with the Engineering Equipment and Materials Users’ Association's EEMUA 191 best practices can help applications reduce unnecessary stoppages and lead to a more effective alarm system.

Load balancing

For applications that have multiple operators per shift, balancing the load of alarms through alarm filtering can lead to better productivity and reduce the alarm load per operator station. The ISA 18.2 standard recommends the alarm performance metrics listed in Table 1, based on at least 30 days of data.

If all unnecessary alarms are eliminated, then the issue becomes routing the alarms proportionally to the appropriate level of staffing. Many alarming systems allow for alarm areas to be configured for any desired organization. Filtering on those alarm areas can allow for simple alarm load balancing.
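A minimal sketch of how area-based filtering translates into per-station load is shown here. It assumes each alarm carries an area name and that each area is routed to one operator station; the function name, the area names, and the "UNASSIGNED" bucket are hypothetical and not part of any particular product.

```python
from collections import Counter


def alarms_per_operator(alarm_areas, area_to_operator):
    """Count alarms routed to each operator station through area filters.

    alarm_areas: iterable of area names, one per annunciated alarm.
    area_to_operator: dict mapping area name -> operator station
    (a hypothetical routing table). Alarms from areas with no
    assignment are counted under "UNASSIGNED" so they stay visible.
    """
    load = Counter()
    for area in alarm_areas:
        load[area_to_operator.get(area, "UNASSIGNED")] += 1
    return dict(load)


if __name__ == "__main__":
    alarms = ["Boiler", "Boiler", "Packaging", "Utilities", "Boiler"]
    routing = {"Boiler": "Station A", "Packaging": "Station B"}
    print(alarms_per_operator(alarms, routing))
```

Comparing the resulting counts across stations shows whether one position carries a disproportionate share and whether any area has been left without a filter.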

Figure 4: An example of a standing alarm by interval report. Source: ICONICS Inc.

Reporting

Periodic reporting is an important tool not only to monitor current operations but also to gain insight over time for use in larger-scale system audits. Reports can be scheduled to run continuously and customized to summarize any metric the data allows. In general, reports can be categorized as relating to individual alarms, concerning operations, or examining the system as a whole.

For successful monitoring and assessment of an alarm system, it is important to generate reports from all categories to include a variety of perspectives. Every application looking to integrate the ISA 18.2 specification and EEMUA 191 standards into an alarm management system should consider using standard reports for the example alarm metrics described below.

Individual alarm reports

Reports designed to visualize alarm distribution, alarm chattering, and alarm frequency can be plotted against any time period and often offer a way to dynamically change the time period if necessary. The example report in Figure 2 parameterizes the start time, end time, and interval when the report is run to allow users to execute the report desired.

This report shows a good example of the alarm distribution based on a single week of operations and is organized by priority. While providing detailed information, this report shows a general increase of alarms over time. Although important, this report alone cannot tell engineers what to change; it points to where further investigation is needed.
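The parameterized distribution report described above can be sketched as a simple bucketing routine. This is an illustration under assumptions, not the way any particular product builds the report: alarms are taken to be (timestamp, priority) pairs, and the function and parameter names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def distribution_by_priority(events, start, end, interval):
    """Count alarms per interval and priority between start and end.

    events: iterable of (timestamp, priority) pairs (assumed format).
    start, end: datetime bounds; start is inclusive, end is exclusive.
    interval: timedelta giving the bucket width.
    Returns {bucket_start: {priority: count}} ordered by bucket start.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    buckets = defaultdict(lambda: defaultdict(int))
    for ts, priority in events:
        if start <= ts < end:
            index = (ts - start) // interval  # whole intervals since start
            buckets[start + index * interval][priority] += 1
    return {k: dict(v) for k, v in sorted(buckets.items())}


if __name__ == "__main__":
    week_start = datetime(2024, 1, 1)
    sample = [
        (datetime(2024, 1, 1, 5), "High"),
        (datetime(2024, 1, 1, 9), "Low"),
        (datetime(2024, 1, 2, 1), "High"),
    ]
    report = distribution_by_priority(
        sample, week_start, week_start + timedelta(days=7), timedelta(days=1)
    )
    for bucket, counts in report.items():
        print(bucket.date(), counts)
```

Running the same routine with a different start, end, or interval yields the daily, weekly, or hourly view without changing the logic.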

Other individual reports, such as alarm chattering and alarm frequency, can offer insights into specific alarms that should be considered for modification. Alarm chattering reports can expose specific alarms that tend to enter and exit the alarm condition multiple times within a predefined time period. The configuration of these alarms may be too sensitive, require a deadband, or be redundant.

The first occurrence of the alarm may indeed require a response; subsequent occurrences are an operator distraction. To identify these chattering alarms, the simple report in Figure 3 reveals the average number of times the alarm triggered within the 60-second interval and the number of times this happened within the reporting period. These alarms can be changed or marked for review at the next scheduled audit.
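One way to approximate such a chattering report is sketched below. The 60-second window comes from the text; the threshold of three activations per window is an assumption chosen for illustration, as are the function name and the (tag, time in seconds) input format.

```python
from collections import defaultdict


def chattering_report(activations, window_s=60.0, min_count=3):
    """Find alarms that activate repeatedly within a short window.

    activations: iterable of (tag, time_in_seconds) pairs (assumed format).
    window_s: window length in seconds (60 s, as in the article).
    min_count: activations per window that count as chattering
    (an assumed threshold, not taken from the article).
    Returns {tag: (episodes, average_activations_per_episode)}.
    """
    by_tag = defaultdict(list)
    for tag, t in activations:
        by_tag[tag].append(t)

    report = {}
    for tag, times in by_tag.items():
        times.sort()
        episode_sizes = []
        i = 0
        while i < len(times):
            j = i
            # Extend the episode while activations stay within the window
            # measured from the first activation of the episode.
            while j + 1 < len(times) and times[j + 1] - times[i] <= window_s:
                j += 1
            size = j - i + 1
            if size >= min_count:
                episode_sizes.append(size)
                i = j + 1  # continue after the episode just recorded
            else:
                i += 1  # try a window starting at the next activation
        if episode_sizes:
            average = sum(episode_sizes) / len(episode_sizes)
            report[tag] = (len(episode_sizes), average)
    return report


if __name__ == "__main__":
    sample = [("LT101", t) for t in (0, 10, 20, 30, 500, 510, 520)]
    sample += [("PT7", 0), ("PT7", 300)]
    print(chattering_report(sample))
```

Tags that appear in the result are candidates for a deadband, a delay, or removal; tags that never reach the threshold are left out of the report entirely.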

System-wide alarm reports

For alarm systems it is essential that the operations also be viewed as one entity. Calculating the average alarm rate per time period is the most basic of these reports but will allow the health of the system to be monitored over time and is an important metric for the auditing process. 

Figure 5: An example of an operator response report. Source: ICONICS Inc.

The report above can be modified to detail the average alarm rate per 10 minutes, per hour, or per day to provide this information for auditing purposes. Other reporting templates can be used to detail metrics such as peak alarm distribution. 
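The average-rate calculation can be sketched as follows, with the results compared against the values in Table 1. The thresholds are taken from Table 1; the function name and the rating labels ("acceptable", "manageable", "overloaded") are assumptions made for this example.

```python
# Interval lengths in seconds for the three rates reported in Table 1.
INTERVAL_SECONDS = {"10 minutes": 600, "hour": 3600, "day": 86400}

# (very likely to be acceptable, maximum manageable) per operating
# position, as listed in Table 1.
TABLE_1 = {"10 minutes": (1, 2), "hour": (6, 12), "day": (150, 300)}


def average_alarm_rates(alarm_count, period_seconds):
    """Average alarm rate per interval for one operating position.

    alarm_count: annunciated alarms in the reporting period.
    period_seconds: length of the reporting period in seconds.
    Returns {interval: (average_rate, rating)}; the rating labels
    are illustrative, not terms from the standard.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    rates = {}
    for name, seconds in INTERVAL_SECONDS.items():
        rate = alarm_count * seconds / period_seconds
        acceptable, maximum = TABLE_1[name]
        if rate <= acceptable:
            rating = "acceptable"
        elif rate <= maximum:
            rating = "manageable"
        else:
            rating = "overloaded"
        rates[name] = (rate, rating)
    return rates


if __name__ == "__main__":
    thirty_days = 30 * 86400
    for interval, (rate, rating) in average_alarm_rates(4320, thirty_days).items():
        print(f"per {interval}: {rate:.2f} ({rating})")
```

Because the Table 1 values for the three intervals are not exact multiples of one another, the same data set can rate differently per day than per hour, which is one reason to report all three.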

Operator-based reports

An operator response time report can reveal alarms that are ignored or become stale and can identify alarms that may be hidden by filters. Alarms that are left in the system the longest should also be considered during an audit to identify their value in the system. Filters should be checked to make sure that they do not remove alarms unintentionally. Figure 5 illustrates an example operator response time report and details the minimum, maximum, and average time to respond as well as the time to return.
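The statistics in such a report can be sketched as shown here. The example assumes each alarm record holds the activation time, the acknowledgment time, and the return-to-normal time in seconds, with None where an event never happened. Time to respond is taken as acknowledgment minus activation and time to return as return-to-normal minus activation; both definitions, like the function and field names, are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def response_time_report(records):
    """Summarize operator response per alarm tag.

    records: iterable of (tag, active_s, acked_s, returned_s) tuples
    (assumed format); acked_s and returned_s may be None.
    Returns {tag: {...}} with minimum, maximum, and average time to
    respond, average time to return, and a count of alarms that were
    never acknowledged.
    """
    ack_times = defaultdict(list)
    return_times = defaultdict(list)
    unacked = defaultdict(int)
    for tag, active, acked, returned in records:
        if acked is None:
            unacked[tag] += 1  # ignored or possibly hidden by a filter
        else:
            ack_times[tag].append(acked - active)
        if returned is not None:
            return_times[tag].append(returned - active)

    report = {}
    for tag in set(ack_times) | set(return_times) | set(unacked):
        acks = ack_times.get(tag, [])
        rets = return_times.get(tag, [])
        report[tag] = {
            "min_respond": min(acks) if acks else None,
            "max_respond": max(acks) if acks else None,
            "avg_respond": mean(acks) if acks else None,
            "avg_return": mean(rets) if rets else None,
            "unacknowledged": unacked.get(tag, 0),
        }
    return report


if __name__ == "__main__":
    sample = [
        ("TT1", 0, 30, 120),
        ("TT1", 1000, 1090, 1300),
        ("FT2", 50, None, None),
    ]
    for tag, stats in sorted(response_time_report(sample).items()):
        print(tag, stats)
```

Tags with a high unacknowledged count or a very long average time to return are the ones worth checking against the active filters.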

Figure 6: An example of an alarm cross-correlation report. Source: ICONICS Inc.

Audit

Auditing is a continuous and iterative process that, combined with reporting tools, can reveal vital information about operations. Auditing is a process of periodic evaluations defined by the philosophy as outlined in the ISA 18.2 specification. The goal of an audit is to identify problems in the process and to continuously monitor, assess, and update the alarm system.

To reach the performance metrics mentioned in the ISA 18.2 specification and Table 1, the information must first be exposed. Through scheduled reporting and circulation of findings, contributors can identify the alarms and subsystems that represent the least effective processes to target first.

A critical tool to eliminate redundant alarms is to view the alarm system as a whole and to look at the interactivity between alarms. Alarm cross-correlation reports can show connections between alarm conditions as parent-child relationships indicating the percentage of times that an alarm triggers with another. By identifying child alarms through this type of root cause analysis, auditing can revisit the intention of the alarm to see if it is unnecessary or poorly defined. Typically the root condition is the only true alarm requiring a response.
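A simplified version of such a cross-correlation can be sketched as follows. For every ordered pair of alarms, it reports the percentage of the first alarm's activations that were followed by an activation of the second within a short window. The 30-second window, the function name, and the (tag, time in seconds) input format are assumptions for illustration; a production report would likely use a more refined correlation method.

```python
from bisect import bisect_left
from collections import defaultdict


def cross_correlation(activations, window_s=30.0):
    """Percentage of parent activations followed by a child activation.

    activations: iterable of (tag, time_in_seconds) pairs (assumed format).
    window_s: how soon after the parent the child must activate
    (an assumed value, not taken from the article).
    Returns {(parent, child): percent}; pairs with no match are omitted.
    """
    by_tag = defaultdict(list)
    for tag, t in activations:
        by_tag[tag].append(t)
    for times in by_tag.values():
        times.sort()

    result = {}
    for parent, parent_times in by_tag.items():
        for child, child_times in by_tag.items():
            if parent == child:
                continue
            followed = 0
            for t in parent_times:
                # Index of the first child activation at or after time t.
                k = bisect_left(child_times, t)
                if k < len(child_times) and child_times[k] - t <= window_s:
                    followed += 1
            if followed:
                result[(parent, child)] = 100.0 * followed / len(parent_times)
    return result


if __name__ == "__main__":
    sample = [("PUMP_TRIP", t) for t in (0, 100, 200, 300)]
    sample += [("LOW_FLOW", t) for t in (5, 105, 205)]
    sample += [("HIGH_TEMP", 1000)]
    print(cross_correlation(sample))
```

A pair with a high percentage in one direction and none in the other suggests a parent-child relationship, and the child alarm is then a candidate for review as redundant.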

Strategize, collect and compare

There are always barriers to beginning a project of this magnitude, but below are a few ideas to start the process. The ISA 18.2 specification is a list of guidelines. Some applications can benefit from full compliance, while others can benefit from even basic adherence.

For any running system, data will need to be gathered in order to begin. If reports are already being generated, consider adding some of the suggested reports to highlight problem areas. 

Does the system meet the performance metrics? Are too many alarms coming in? Are operator response times acceptable? Are there unnecessary alarms? Use all available information to define a plan of action.

Identify, rationalize, design and implement changes. Document any changes. This will help clear the system and reduce the noise created by extra alarms. 

Define an audit schedule

Plan a recurring audit to keep the system manageable. As processes change, so should the alarm management system. Through continuous system monitoring, assessment, and auditing, growing alarm management systems will evolve efficiently without degradation according to an application's alarm philosophy.

Table 1: ISA 18.2 Alarm Performance

Annunciated alarms per time | Very likely to be acceptable | Maximum manageable
Annunciated alarms per day per operating position | 150 alarms per day | 300 alarms per day
Annunciated alarms per hour per operating position | 6 (average) | 12 (average)
Annunciated alarms per 10 minutes per operating position | 1 (average) | 2 (average)


An alarm management system is an integral piece of any HMI/SCADA system and is often configured for safety concerns. Allowing an alarm system to degrade can be a major factor in poor efficiency and unsafe conditions. Following the guidelines set forth in both the ISA 18.2 specification and EEMUA 191 can transform an alarm management system into one that helps increase efficiency.

An effective alarm system also requires a closed loop. It should have continuous and preferably automatic analysis to assure that it properly utilizes the attention of operators. Alarm management systems require planning, maintenance, and iterative auditing to remain effective, but with guidelines like the ISA 18.2 specification and products that comply with EEMUA 191, any application in any industry can benefit.

Pinkham is GENESIS64 Product Manager for ICONICS, Inc. He can be reached via the company website at www.iconics.com.
