Dealing with Undocumented Field Device Changes
Field device parameter alterations persist across industries despite the potentially enormous problems they can cause. The good news: fixing the problem isn't difficult. The bad news: the options and concerns surrounding selection of the right correction method are nearly limitless.
The most popular subject of discussion in 2009 on Control Engineering’s Facebook and LinkedIn groups (see box for direct links) grew out of the following discussion post: Should operators be free to make their own changes to instrumentation settings, or should this be subject to more formal change management?
On our LinkedIn group alone, this topic drew more than 100 comments.
This discussion got its start after we heard about an increasing number of instances in which operators were adjusting instruments in the field — usually to reduce nuisance alarms — but without making a corresponding database change. When this happens, no one else is aware of the change, and if a process tied to one of those instruments goes out of spec due to that change, correcting it can become a complex process since no notice of the change is reflected in the database.
To summarize this wide-ranging discussion, I’ve collected some of the more salient points and grouped them into the four areas under which most of the comments seemed to fall — impact on controls; instrumentation and software tools; procedures; and ultimate responsibility.
Impact on controls
The fact that so many embedded control issues can be affected by ill-advised and undocumented instrumentation changes led Allan Gibson, senior consultant and functional safety engineer with Invensys, to assert that making tuning changes and tracking those changes is “one of the critical tasks” of an operator.
“No alarm setting should be changed without a careful analysis,” said Gibson. “If the alarm is so non-critical and such a nuisance that the operators are given authority to change it, then the first question should be: Why is there an alarm in the first place? Alarms should always indicate a loss of control, and no alarm should be installed without an automated layer of control preventing it from being generated under normal operating conditions. Frequent alarms are a sign of either a failed control system or a failed instrumentation strategy and, in either case, the instrumentation or DCS engineer should have this as his (or her) highest priority to fix.”
Bill Hollifield, principal alarm management and HMI consultant at PAS, argues against such changes “because proper tuning is accomplished by an engineering methodology, not by intuition.
“Allowing operators to change settings means that an operator coming on shift has no idea where hundreds of alarms are set,” Hollifield said. “The setting of individual preferences as alarm limits results in sub-optimization of the process, causes shift-based process variation, introduces non-rationalized alarms, contributes to alarm floods, and is therefore not in keeping with best practices. And since security settings for control of alarm setpoint change are not always effective, that is why the use of automated audit and enforcement mechanisms to ensure proper alarm setpoints has become common.”
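The automated audit Hollifield mentions can be pictured as a periodic comparison between the approved (master) alarm settings and the values actually found in the control system. The sketch below is only an illustration: the tag names, setpoint values, and the audit_setpoints helper are hypothetical and are not taken from any vendor product.

```python
def audit_setpoints(master, live, tolerance=0.0):
    """Return tags whose live alarm setpoint differs from the approved value.

    master, live: dicts mapping tag name -> alarm setpoint (hypothetical data).
    """
    deviations = {}
    for tag, approved in master.items():
        found = live.get(tag)
        # A missing tag or a value outside tolerance is reported for review.
        if found is None or abs(found - approved) > tolerance:
            deviations[tag] = (approved, found)
    return deviations


# Hypothetical data: one setpoint was changed on shift without a database update.
master_db = {"TI-101.HI": 150.0, "PI-205.HI": 12.5, "LI-310.LO": 20.0}
live_dcs = {"TI-101.HI": 150.0, "PI-205.HI": 14.0, "LI-310.LO": 20.0}

report = audit_setpoints(master_db, live_dcs)
for tag, (approved, found) in report.items():
    print(f"{tag}: approved={approved}, found={found}")
```

Whether a deviation is then overwritten with the approved value or routed to an engineer for review is a site decision that this sketch leaves open.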
Hardware and software tools
Increasing the variability of an operation’s outcome is yet another cause for concern when operators adjust field devices without noting the change in the database.
Charles Cohon, president and founder, Prime Devices Corp., offered this example: “An assembly line operator is responsible for drilling a hole inside a circle with a half-inch radius. When that operator drills a hole at the extreme left edge of the tolerance, the operator’s instinct suggests that the drill should be repositioned a half inch to the right because ‘the hole seemed so very far off center.’ By repositioning the drill each time the hole is at the extreme edge of the tolerance, the tolerance goes from being within a circle with a half-inch radius to being within a circle with a one-inch radius. In other words, the operator who chases the optimal results actually increases the range of potential outcomes (variability) fourfold.”
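The arithmetic behind Cohon's example can be checked in a few lines. The sketch below assumes that the “fourfold” figure refers to the area of the circle in which a hole can land, which is one reading of the quote, and it looks only at the worst case along a single axis.

```python
import math

TOLERANCE_RADIUS = 0.5  # inches, the drill's natural scatter in the example

# Worst case: a hole lands at the extreme left edge, so the operator moves
# the drill a half inch to the right to "compensate".
first_hole_error = -TOLERANCE_RADIUS
drill_offset = -first_hole_error  # operator's correction: +0.5 in.

# The drill's own scatter is unchanged, so the next hole can land up to a
# half inch beyond the new drill position.
worst_next_hole = drill_offset + TOLERANCE_RADIUS  # distance from true center

original_area = math.pi * TOLERANCE_RADIUS ** 2
enlarged_area = math.pi * worst_next_hole ** 2
area_ratio = enlarged_area / original_area

print(f"worst-case distance from center: {worst_next_hole} in.")
print(f"area ratio: {area_ratio:.1f}")
```

Doubling the radius from a half inch to one inch quadruples the area, which matches the fourfold increase in the quote.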
While nearly everyone agrees that specific processes and procedures should govern any field device changes, most also agree that some ability to change parameters is a necessary evil.
“Having worked in a chemical plant in a very damp and changeable climate, I know that if the operators did not have some control over process flow and temperature, the process would have been frequently bogged down,” said Glenn Hutson, compliance assurance coordinator at BP Exploration. “I have seen control boards on which the critical process controls can only be accessed by a process engineer with a key (yes, a real physical key). Likewise, on the silly yet effective side, I have seen an instrumentation engineer disconnect a dial entirely. The operators, however, continued to happily adjust it, thinking that it was making a difference.”
Jon McClain, senior electrical/software design engineer at ATI, noted that “this problem of operator authorization in the user software has been an issue for a very long time, ever since microcontroller-based products started replacing the older analog products. On older analog products, I would often just see them key-locked in a larger enclosure. To maintain control authority for setting changes and calibration in micro-based products, most of the instruments in our industry have some kind of multi-level password control and logging features that note when any system changes are made.”
McClain notes that this method is “by no means bullet-proof, but it does stop the casual user. It can also help you figure out who the individual is who is changing settings. Bottom line, someone can always figure out passwords. Ultimately, violations involving circumventing passwords and such can only be effectively dealt with at a personnel level.”
Mike Boudreaux, product manager at Emerson Process Management, pointed out that HART specifications include write protection capabilities for smart devices. “You can write-protect HART devices and the write-protect status is auditable using an asset management system. When in write-protect mode, calibration changes using a hand-held communicator are not possible.”
Boudreaux also noted that software products such as Emerson’s QuickCheck Snap-On can generate a report showing which devices are currently in write-protect mode, “making it easy to audit the HART instruments to find devices that are unprotected,” he said.
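The kind of write-protect audit Boudreaux describes boils down to filtering a device inventory by protection status. The sketch below assumes the inventory has already been exported from an asset management system into simple records. The DeviceRecord type, the tags, and the unprotected_devices helper are hypothetical, and the code does not call any real HART or Emerson interface.

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    tag: str
    protocol: str
    write_protected: bool


def unprotected_devices(inventory):
    """Return the tags of devices that are not in write-protect mode."""
    return [d.tag for d in inventory if not d.write_protected]


# Hypothetical inventory, as it might be exported from an asset management system.
inventory = [
    DeviceRecord("FT-101", "HART", True),
    DeviceRecord("PT-202", "HART", False),
    DeviceRecord("TT-303", "HART", True),
]

print(unprotected_devices(inventory))
```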
Processes and procedures
Most comments made on this discussion focused on resolving the issue with well-thought-out and mandated procedural requirements. Some group members even went so far as to share their guidelines as a means of helping others review or develop their own requirements. Others suggested taking a closer look at the capabilities inherent in some control technologies that may already be in place at a user’s facility.
One such post came from Arif Mustafa, Kuwait country manager for Yokogawa Middle East. Mustafa said that some control technologies help account for operator changes in the field at the operator station level in several ways:
Critical and non-critical loops, interlocks, etc., can be clearly defined based on severity levels;
Asset management software can be used to interact with intelligent field instruments utilizing HART, Foundation Fieldbus, or Profibus;
Certain changes in the field at the instrument level can be propagated via “operator guide messages” or DCS event notifications. Operators have visibility into these changes, and asset management software integrated with DCS operator software can be deployed to synchronize all changes made in the field as event data;
Intelligent field instruments also allow certain parameters to be “locked in” or a “threshold” of permissions to be set up for defined changes (i.e., the system can lock out someone making a change in the field that might have an adverse effect on the upstream or downstream process);
Operator station logs also provide visibility into alarm and event data associated with tags on a what, where, when, and by whom basis.
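Two of the capabilities in Mustafa's list, permission thresholds and what/where/when/who logging, can be combined in a small sketch. Everything named here (ChangeEvent, request_change, the tags, and the limits) is hypothetical and is meant only to show the idea, not how any particular DCS or instrument implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ChangeEvent:
    what: str       # parameter and its new value
    where: str      # tag of the instrument
    when: datetime  # time of the change
    who: str        # user who made the change


def request_change(log, tag, parameter, old, new, user, max_step, locked=False):
    """Accept a change only if the parameter is unlocked and the step is
    within the permitted threshold; record accepted changes in the log."""
    if locked or abs(new - old) > max_step:
        return old  # change rejected, value unchanged
    log.append(ChangeEvent(f"{parameter}={new}", tag,
                           datetime.now(timezone.utc), user))
    return new


event_log = []
# A small damping adjustment is within the threshold and is accepted.
damping = request_change(event_log, "FT-101", "damping_s", 2.0, 2.5,
                         "operator_a", max_step=1.0)
# A large range change exceeds the threshold and is rejected.
span = request_change(event_log, "FT-101", "upper_range", 100.0, 150.0,
                      "operator_a", max_step=10.0)
```

A fuller version would probably log rejected attempts as well, since those are often the ones worth reviewing.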
Leslie Parchomchuk, president of ClearSky Risk Management, provided a pragmatic set of guidelines. “Start with an alarm prioritization/alarm rationalization study to determine which alarms are critical to the safety and integrity of the process (safe limits), and which are process variable alarms that aid in the efficiency of the process,” Parchomchuk said. “With the ability to alarm every DCS point, many facilities are overwhelmed by the number of alarms and many alarms become nuisance alarms. For critical/emergency/high priority alarms, these certainly need to follow management of change (MOC) processes, and should be locked out from being changed by operators. Other DCS indicators can have flexible alarm points set by the operators themselves. This gives the benefit of having the operating limits safeguarded by fixed alarm points, but lets the operators actually operate their process efficiently.”
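Parchomchuk's split between locked and flexible alarm points amounts to a simple routing rule once the rationalization study has assigned priorities. The sketch below assumes a four-level priority scheme. The Priority names, the tags, and the change_route helper are hypothetical.

```python
from enum import Enum


class Priority(Enum):
    CRITICAL = "critical"
    EMERGENCY = "emergency"
    HIGH = "high"
    PROCESS = "process"  # process variable alarms that aid efficiency


# Priorities whose alarm points are locked and require management of change.
MOC_REQUIRED = {Priority.CRITICAL, Priority.EMERGENCY, Priority.HIGH}


def change_route(priority):
    """Return how a requested alarm point change should be handled."""
    if priority in MOC_REQUIRED:
        return "MOC"
    return "operator"


# Hypothetical result of a rationalization study for two alarms.
alarms = {"PSH-100": Priority.CRITICAL, "TI-210.HI": Priority.PROCESS}
routes = {tag: change_route(p) for tag, p in alarms.items()}
print(routes)
```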
Some see the crux of the discussion as being more about where changes can be allowed and where they absolutely cannot be permitted.
“In my experience, operators are given authority/ability to change control parameters only as far as the process or processes are concerned, and in no way, shape or form should they have access to calibration or safety parameters,” said Scott Mahler, an electrical/electronic manufacturing professional. “Operators will try and find ways to ‘work’ around perceived inconveniences to them. But any changes in calibration or safety parameters should be handled through proper channels and documented.”
Assessing the issue of unauthorized and untracked field device changes as both a system level problem and a device problem, Herman Storey, chief technology officer at Herman Storey Consulting, said both cases fall into the category of unmanaged or uncontrolled work processes.
Storey added that, in his experience, he has seen many projects where alarms were established at the last minute (during startup) without proper engineering review. “After that, no system was established to make sure alarm settings are proper,” he said.
“Sometimes instrument ranges are done the same way. [Ultimately this] is a management failure, but generally the database tools are included in a DCS to keep backup and restore settings that are inside the system. What is often not implemented is a comprehensive way to keep all the metadata around the alarm setting such as why the alarm is there, what it is protecting, and what unique action an operator is to perform if the alarm becomes active. Lacking a comprehensive tool and a management system, authorization is a minor point.”
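The alarm metadata Storey describes (why the alarm exists, what it protects, and what the operator should do) can be kept alongside the setpoint in a single record. This is a minimal sketch under assumed names: AlarmRecord, its fields, and the example values are hypothetical and are not drawn from any DCS database schema.

```python
from dataclasses import dataclass


@dataclass
class AlarmRecord:
    tag: str
    setpoint: float
    purpose: str = ""          # why the alarm is there
    protects: str = ""         # what it is protecting
    operator_action: str = ""  # the action expected when it becomes active


def missing_metadata(record):
    """List which rationale fields are still empty for an alarm."""
    fields = ("purpose", "protects", "operator_action")
    return [name for name in fields if not getattr(record, name).strip()]


# Hypothetical alarm set during startup with only part of its rationale recorded.
startup_alarm = AlarmRecord("LI-310.HI", 85.0, purpose="Prevent tank overfill")
print(missing_metadata(startup_alarm))
```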
Expanding on that point about proper authorization, Storey noted how authorized changes can be just as troublesome as unauthorized changes. Many companies “give a technician a hand-held communicator and tell the technician to ‘fix’ the field devices,” Storey said. “This communicator is used for troubleshooting and configuration management [which is typically] done in the field with whatever information the technician has in his head. Often, there is no connected system to manage device configuration or to receive diagnostic messages. The accuracy of configuration using handheld communicators is very low and affects operational accuracy and failure modes. This problem of authorized people doing authorized work with inadequate tools [can lead to them] producing work that is just as bad as the unauthorized work.”
Storey said that he has seen that “most owner/operators have a large installed base of systems that do not have good support tools for managing assets and manpower or work process failures where the tools are installed.” So what are these owner/operators doing in response to the problem? Storey says that they continue to cut costs in the face of this issue.
Ultimate responsibility
With opinions on the subject ranging from never allowing such changes to empowering operators to make the proper decisions through training, communication, and better systems and procedures, the question becomes: Who is ultimately responsible for correcting the problem of undocumented field device changes?
“It is the control engineer’s responsibility to program a system to allow for safe operation while maximizing freedom to keep the system running…period!” said Jerold Aulph, program manager of Battery Management Systems & Automation. “Evoking the name of safety and quoting an OSHA number will not stop a system from melting down. Set your process failure modes and effects analysis in the beginning of the project design phase to design a system that is fail safe, and will limit operator access to a safe level, [thereby giving] them the freedom to be safe operators.”
Marc Houwelijckx, senior process control engineer at Total, holds a decidedly different viewpoint. “As a control engineer, I’m not responsible for the outside instrumentation,” he said. “However, I’m heavily dependent on [others] to do their proper job, otherwise all the tuning I do is a waste of time.”
Duane Mahnke, president of DBMahnke Consulting, said operators are more cooperative if they know the implications of their action and are involved as part of the issue correction. “[This can be done] through incentives, monetary or otherwise, to make them step up and report the changes or inconveniences they face, in order to do their jobs,” said Mahnke. “Without incentives, and being asked to run without downtime, they tend to do things that make it easier for them and they don’t report it because they either don’t know how it affects the process/product or they know it might, but don’t want to let anyone know they are doing something [in the name of efficiency] that they shouldn’t be doing. If the system design engineer and management have good communication with operators and encourage them to be part of a ‘team’ and get recognition for doing so, less of these things happen.”
David Greenfield is editorial director of Control Engineering; email@example.com
Control Engineering’s social media forums
To become a part of the conversation or merely follow the discussions in real time, access our forums on Facebook and LinkedIn via the following links:
Automation & Control Engineering group on LinkedIn: budurl.com/celinkedin
Automation & Engineering group on Facebook: budurl.com/cefacebook
When is a device change a violation?
Apart from all the discussion of best practices, communication, and optimal processes, making changes to field devices can often be a direct violation of industry or government standards.
Michael G. Mariscalco, PE and owner of QEI Engineers, noted that, in the U.S., “if the process involves hazardous chemicals, unplanned or unauthorized process control changes would be considered a violation of the OSHA Process Safety Management standard. Further, the practice would violate the requirements of ANSI/ISA 84.00.01-2004 ‘Safety Instrumented Systems for the Process Industry Sector.’”
Mike Boudreaux of Emerson Process Management indicated that IEC 61511-1 clause 11.6.4 (and ANSI/ISA 84) requires that smart sensors be write-protected to prevent inadvertent modification from a remote location, unless appropriate safety review allows the use of read/write. “The standard requires that the review should take into account human factors such as failure to follow procedures,” added Boudreaux.
Richard Quinnette, a chemical and environmental engineering professional, also pointed out that, according to OSHA 29 CFR 1910.119, in the section on “Management of Change,” if you are dealing with a “covered process” in the U.S., and an operator makes a change without documentation, the facility would be in violation. “This is something that should be strongly discouraged by operations management,” Quinnette noted, “not just for compliance reasons, but for safety reasons as well.”
A note of thanks
The social media discussion on which this article is based proved to be a truly global conversation, with comments posted by engineers from all over the world. One person in particular helped to facilitate the global nature of this discussion—Fabienne Salimi, technical director and founder of ADEPP Academy in France. Fabienne thought so highly of this topic that she posted it to various other LinkedIn engineering forums of which she is a member and then re-posted responses collected from those groups on our group where the discussion originated.
For all her efforts, I would like to thank Fabienne personally for taking the time to help this story develop in such a widespread manner.
— David Greenfield, editorial director, Control Engineering