Jump-starting human reliability

Eddie Habibi, founder and chief executive at PAS, sat down and talked with ISSSource Editor and Founder Gregory Hale just before the release of PAS’ Integrity iMOC, electronic Management of Change software.

Source: ISS Source, July 15, 2011

ISSSOURCE: What is your new software all about?

HABIBI: The emphasis is on human reliability and leveraging information that already exists in a plant to help people make better decisions.

If you think about where the knowledge exists in a processing plant, a tremendous amount of plant operating knowledge resides in the automation system. That is what we call explicit knowledge, the knowledge that is already committed. It is there; it is configured. It is almost like a book that has already been written. You don’t have to get it out of somebody’s head. Our Integrity software collects that information and brings it into an environment where it can be managed. We can inventory it; we can identify links and references among pieces of information so we can make sense of it. When we do that, we identify instances where configuration is outright defective and could lead to incidents.

We also find a serious issue around managing the configuration of these systems. Our industry, primarily hydrocarbon processing (oil and gas, refining and petrochemicals), is mired in incidents, and many of them can be tied back to a failure to manage change. At BP Texas City, they had bad instrumentation and failed to manage it. In the Texaco Pembroke Refinery incident back in 1994, they had made a modification to the flare line and hadn’t communicated it to the proper plant personnel, the operations people. This new flare line was not supposed to handle liquids, but they ended up sending liquids down through it, which caused a rupture and an eventual accident. Thankfully it wasn’t as bad as the Texas City incident. Nevertheless, it was a bad accident.

A major contributor to many of these incidents is lack of proper management of change. Management of change is mandated in the United States under OSHA 1910.119, the process safety management (PSM) standard, which has been well known in the industry since OSHA issued it in 1992. One of the most critical, as well as useful, elements of the 1910.119 regulation is Management of Change (MOC). Operating companies are looking for ways to improve their MOC processes. The problem in our view is the tremendous amount of information a plant has to deal with.

We categorize existing (change management) practices into two areas. One is paper-based, which has been around for decades. The second method is electronic forms, which is basically the old paper put in electronic forms that gets routed around and signed off.

The challenge here is that there is so much change going on at a processing plant that people can’t keep track of it all. There is also this notion among some operating companies that the automation system does not necessarily fall under the OSHA 1910.119 MOC requirement.

At many plants, a technician or a control engineer can walk up to a control system and change its tuning constants, alarms, and control configuration, all without requiring a rigorous management of change process. We feel that is a huge vulnerability in the industry. Unless it is dealt with, it will continue to cause problems.

One of the reasons there is not a rigorous management of change on these automation systems is that it is not an easy thing to do. If you look at an automation system, it is the platform for change. That is where you go to make improvements to the process. A lot of good optimization takes place on the control system. A lot of good information gathering and information management comes out of the control systems to enable the plant to be more optimized. So, making improvements has to do with making changes to the automation system. Because it is a continuous improvement situation, you have a continuous change situation. How you manage change in that environment becomes a challenge. Given the underlying foundation we have in our Integrity software, we have built an intelligent application that facilitates a robust, low-overhead MOC workflow for automation systems.

We call it intelligent because it has the ability to automatically compile the initial information and documentation needed to start an MOC process. It also automatically reconciles the changes in the system to those outlined in the MOC package. It is easy to configure; you design it and it pushes relevant information to the end user automatically. Part of the issue with the management of change process is that it takes a lot of time to gather the documentation you need to understand the changes you want to make.

Having Integrity as the underlying platform for this module enables us to push relevant information to the end user. All you have to do is tell the application what you are going to change; all you need is a tag reference and the application gives you all the process safety related information, as well as the process control information associated with that tag so you can make an intelligent decision.
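The tag-driven lookup described above can be pictured with a small sketch. This is not PAS code: the `TagRecord` structure, the `build_index` function, and the sample tags and documents are all hypothetical, and the sketch only assumes that control settings, alarm settings, and safety documents can each be keyed by tag name.

```python
from dataclasses import dataclass, field


@dataclass
class TagRecord:
    """Everything known about one tag, gathered from several hypothetical sources."""
    tag: str
    control: dict = field(default_factory=dict)      # e.g. tuning constants
    alarms: dict = field(default_factory=dict)       # e.g. alarm limits and priorities
    safety_refs: list = field(default_factory=list)  # safety documents that mention the tag


def build_index(control_cfg, alarm_cfg, safety_docs):
    """Merge per-source data into one record per tag."""
    index = {}
    for tag in set(control_cfg) | set(alarm_cfg):
        index[tag] = TagRecord(tag, control_cfg.get(tag, {}), alarm_cfg.get(tag, {}))
    for doc, tags in safety_docs.items():
        for tag in tags:
            # A tag may appear only in documents, so create its record on demand.
            index.setdefault(tag, TagRecord(tag)).safety_refs.append(doc)
    return index


# Illustrative data only; tag names and documents are invented.
control_cfg = {"FIC-101": {"gain": 0.8, "reset_min": 2.5}}
alarm_cfg = {"FIC-101": {"PVHI": {"limit": 120.0, "priority": "HIGH"}}}
safety_docs = {"HAZOP-Unit1-Node4": ["FIC-101"], "SOP-017": ["FIC-101", "LI-205"]}

index = build_index(control_cfg, alarm_cfg, safety_docs)
record = index["FIC-101"]
print(record.control, record.alarms, record.safety_refs)
```

Given only the tag reference `FIC-101`, the person preparing the change sees the control settings, the alarm settings, and the safety documents that mention that tag in one place.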

ISSSOURCE: How would the MOC policy slow down the process? Does all this happen in real time?

HABIBI: Your top tier companies already have policies in place that require personnel to go through the management of change process, even for the automation systems, if that change is going to impact any item covered by PSM documentation. What we have come to find out is that a lot of people don’t follow the practice and that is a vulnerability.

As a standard practice, a control engineer is asked to fix a problem. The engineer then goes into the system, does a little research, fixes the problem and then goes on to the next thing to take care of. There is a tendency to not document changes in the control system and that causes a problem because changes you make in one area of a control system may influence other aspects of the control system. There is also the aspect of communication amongst individuals who deal with the control system. Imagine changing the settings of a critical alarm during one shift without communicating the changes to the operators on the next shift. The next operator that comes on is sitting there waiting for the alarm to go off and the alarm doesn’t go off. By the time he sees the impact of that alarm on the rest of the plant, it may be too late.

So, strict change control becomes essential to safety at a processing plant. We want to make MOC as easy as possible for plant personnel. You initiate the process electronically; you notify people who need to know about it. Approvals are done electronically. The information you need to make decisions on is pushed to you.

Once you have made the modifications, we allow you to come back to document all the changes automatically, so there is no extra work for that. As well, the software will reconcile the changes.

The key factor in successful management of change is reconciliation of change. You decide to make the change, you make the change, and then you validate or reconcile the change. The beauty of that is, if there are other changes that have not been reconciled you can easily identify those changes. You can look at the impact of the changes and who made the changes.

If you look closer, there is a tremendous cyber security aspect to this. Remember with Stuxnet, it disguised itself as a Windows patch and was installed at the controller level and manipulated controls without allowing the operator to see what was being manipulated. So changes were being made to the configuration of the control system, but not to the operator display and the operator could not see what was going on. Had you had configuration change management software in place to routinely capture these changes, someone would have looked at the block of configuration changes in the DCS or the PLC ladder logic and asked the question, “who made these changes?” “We sure can’t find an MOC for these changes, but who made the changes? OK, these are unauthorized changes and we have been breached.”
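A minimal sketch of that kind of routine capture, assuming configuration snapshots can be exported as nested dictionaries keyed by tag and parameter. The function names, tags, and ladder-logic strings are hypothetical and do not describe any vendor's format.

```python
def diff_snapshots(before, after):
    """Return (tag, parameter, old, new) for every value that differs between snapshots."""
    changes = []
    for tag in sorted(set(before) | set(after)):
        old_params = before.get(tag, {})
        new_params = after.get(tag, {})
        for param in sorted(set(old_params) | set(new_params)):
            old, new = old_params.get(param), new_params.get(param)
            if old != new:
                changes.append((tag, param, old, new))
    return changes


def unauthorized(changes, approved):
    """Keep only changes whose (tag, parameter) pair is not covered by an open MOC."""
    return [c for c in changes if (c[0], c[1]) not in approved]


# Illustrative snapshots taken on two consecutive days (invented data).
before = {
    "FIC-101": {"gain": 0.8},
    "PLC7.RUNG12": {"logic": "XIC A OTE B"},
}
after = {
    "FIC-101": {"gain": 1.0},
    "PLC7.RUNG12": {"logic": "XIC A XIC C OTE B"},
}
approved = {("FIC-101", "gain")}  # the only change with an MOC on file

suspect = unauthorized(diff_snapshots(before, after), approved)
for tag, param, old, new in suspect:
    print(f"No MOC found for {tag}.{param}: {old!r} -> {new!r}")
```

Here the tuning change has an MOC behind it, while the ladder-logic change does not, so it is the one that prompts the question of who made it.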

ISSSOURCE: The operator goes in and makes a change and then says we are good to go, is there a prompt that comes up that asks if you are good with the change?

HABIBI: There will be a history of the change and a reconciliation of the change. If you go into the DCS and you make a change to the alarm settings from one priority to another, we capture the changes to the control system and automatically make the reconciliation for you. If it is the same, you don’t have to make any changes. If you make the wrong changes, we flag them and highlight that these are not the proper changes authorized by the MOC.
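The reconciliation step can be sketched as a comparison between the values an MOC authorizes and the values actually captured from the control system. This is an illustration under stated assumptions, not the product's logic: the `reconcile` function, the status names, and the tags are hypothetical.

```python
def reconcile(moc_changes, observed):
    """Compare approved values with captured values.

    moc_changes: {(tag, parameter): approved new value}
    observed:    {(tag, parameter): value captured from the control system}
    """
    report = {}
    for key, approved_value in moc_changes.items():
        if key not in observed:
            report[key] = "pending"      # approved but not yet made
        elif observed[key] == approved_value:
            report[key] = "reconciled"   # matches the MOC, nothing more to do
        else:
            report[key] = "flagged"      # a change was made, but not the authorized one
    return report


# Illustrative MOC package and captured changes (invented data).
moc_changes = {
    ("TI-300", "PVHI.priority"): "HIGH",
    ("TI-300", "PVHI.limit"): 450.0,
    ("LI-205", "PVLO.priority"): "LOW",
}
observed = {
    ("TI-300", "PVHI.priority"): "HIGH",
    ("TI-300", "PVHI.limit"): 475.0,
}

report = reconcile(moc_changes, observed)
for (tag, param), status in report.items():
    print(f"{tag}.{param}: {status}")
```

The priority change matches the MOC and is reconciled, the limit was set to a value the MOC did not authorize and is flagged, and the third change has not been made yet.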

ISSSOURCE: When that flag comes up, is that part of the alarm management program?

HABIBI: What I just described is a work flow process module. It is independent of the DCS. So, it is not an alarm system. It is a work process management application specifically designed for control systems.

ISSSOURCE: Were your customers saying that was something they needed or was it something you created and are bringing to market?

HABIBI: This came out of a customer requirement and a NERC-CIP requirement. We had a major project with an electric company and, specifically in that project, one of the requirements was a change tracking module and a management of change application for the automation system. We had intended to do this for quite a while. The electric power generation company project simply expedited our development plans.

ISSSOURCE: This is going to be interesting.

HABIBI: DOC3000, our first configuration management software for the Honeywell control system, came out in August 1996, 15 years ago. In 2001, we decided that since DOC3000 was such a big hit, and automation managers liked it so much, we would expand it to PLCs and other control systems. Since then, we have been preaching this need and best practice around configuration management for control systems. We have had people come back saying “look, it impacts safety and we have to do something about it.”

ISSSOURCE: You mention change, and that word often has a bad connotation in the industry, but here you talk about management of change and it could be something that could be automated to the point where you eliminate as much human error as possible.

HABIBI: Precisely. It is all about human reliability. Not to get too philosophical about it, but if you think about it, no improvement is possible without change. It is a corollary to the old expression “if you always do what you always did, you always get what you always got.” The way you break that routine is change. Change is essential to growth.

The whole essence of manufacturing, whether it is power production or high density polyethylene, is about reducing costs, becoming more efficient, becoming more compliant with regulations and running a safe plant. That requires continuous focus on improvement and focus on continuous change. Change in how you run your business. That is the essence of change. Unmanaged change is chaos and honestly, that is what is going on in some plants today. You have changes that are undetected. You have changes that go unsupervised and we have seen the result of that in industrial accidents.

ISSSOURCE: The technology is there, but people need to be involved, and have the answers right at their fingertips.

HABIBI: You just hit on a note about knowledge retention and collaboration. Every forum you go to and in every book you pick up these days, you see collaboration is the key to success. That is true. The simple form of collaboration is if you add one and one together and come up with three. Collaboration means minds coming together to solve problems. Collaboration is about exchange of information and knowledge. In a processing plant, that means people can see information. Not just fresh information that is being created, but existing information which has been deposited in automation systems over decades. Collecting and aggregating the existing knowledge in the processing plant is key to our core Integrity technology. I mentioned earlier that knowledge resides in the automation systems. That is not the only place it resides. It also resides in safety incident reports, standard operating procedures, drawings and design documents, and a plethora of other systems that exist in a plant. If you think about the amount of information that comes together that runs a plant safely, it is mind boggling.

Our mission is to collect all the bits and pieces of information within a process plant, give it context, and present it to personnel where and when they need it.

ISSSOURCE: Context is where the action is.

HABIBI: Precisely. Safety is all about having the right information and making the right decision. The ASM Consortium concluded that 42 percent of all incidents are related to human error. I have a lot of respect for the ASM Consortium; they have done a lot of good work in creating market awareness about situation awareness, but if you look at the other 58 percent and you ask enough questions, you’ll find that the deep root cause of the other 58 percent is also human error. Process equipment doesn’t fail by itself. It fails because it was under-designed or over-designed, or because it was unmanaged, improperly maintained, or overdriven by operations. All of those are human error. So, better visibility to asset performance and better visibility to design flaws help plant personnel make better decisions to prevent incidents. At the end of the day in my view, all incidents are human driven. They can be connected back somewhere along the value chain to someone making a mistake.

Human reliability is key, and if we can’t rely on humans to make the right decisions and do the right thing, we might as well pack it in and go home. You have to have faith in humans and their ability to make the right decisions. But you have to also provide the right information. Here is a philosophy I hold very dear. No plant personnel gets up early in the morning and says “today is a good day to screw up and blow up the plant.” People don’t intentionally go to work to fail. People want to succeed in what they do. Plant operators are the most conscientious people in processing plants. We need to retool them. We have ignored the operator over the past 30 years. We owe it to the operators to make the plant a safe place to operate. Providing context-based information in a simple way is one step toward retooling operators.

– Edited by Amanda McLeman, Plant Engineering, www.plantengineering.com