Combining functional safety and cyber security
Both have to work for a plant to be truly safe. How can we learn and use the best from each?
As business needs have evolved, demand has grown for merging real-time data from process control systems with business data such as production planning, yields, quality, energy consumption, and emissions. This has created the need to integrate corporate network domains with process control domains, resulting in more dynamic business operations and enabling asset owners to adjust output both to meet demand and to capitalize on market changes.
If your process control network is still separate from your corporate network and you expect it to stay that way, then the risk of cross-contamination is relatively low. (If you believe your networks are still separated, you might want to check that. There are likely many more places where they’re connected than you realize.) But if you, like many of your colleagues, are modernizing your plant to achieve greater connectivity between the corporate and process control domains, then this call for greater collaboration between the systems that protect the process control and engineering domains and those that protect the business domains and IT infrastructure should be of great interest to you.
Converging protection models
For years, the IT industry has been meticulous in carrying out risk and threat assessments of IT networks, from inception through design and implementation to deployment. These assessments have been carried out in accordance with a plethora of standards, including BS 7799, ISO/IEC 17799, ISO/IEC 27001 through 27005, and ISA99, to name just a few.
In parallel, manufacturers such as oil, gas, and petrochemical producers have been carrying out similar activities to identify and evaluate potential risks to personnel, equipment, and the environment. These include structured and systematic safety, risk, and threat assessments; hazard and operability (HAZOP) analysis; layer of protection analysis (LOPA); safety integrity level (SIL) determination and validation; and others. These functional safety activities are conducted in accordance with the pertinent industry standards, such as IEC 61511 and ANSI/ISA-84.00.01.
Today’s cyber threats are increasingly malicious and increasingly aimed at automation systems. Their continued evolution suggests that we should fight them with the combined force of the IT cyber security approach and the engineering functional safety approach.
It is no longer adequate to assume that hazards arise only from equipment failures, fires, floods, or other events within a plant facility. Hazards can now be initiated from outside the plant, and some of them would never be considered from a purely engineering perspective. Even where they have been considered, the analysis might already be out of date, because the threat to any system is not constant. It evolves continuously.
Functional safety assessments typically focus on the failure of a piece of equipment, addressing the probability of failure, the potential consequences, and the impact on safety, the environment, and the business. IT assessments are very similar, but the consequence of a compromised system is more likely to be the massive economic impact of a production interruption than loss of life.
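To illustrate how the two kinds of assessment could share one view, here is a minimal sketch of a combined risk register that ranks equipment-failure and cyber scenarios on the same scale. The 1-to-5 likelihood and severity scales, the scenario names, and the `Scenario` and `rank` helpers are all assumptions made for illustration. This is a simplified risk matrix, not a LOPA or SIL calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    origin: str      # "equipment" or "cyber"
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    severity: int    # hypothetical scale: 1 (minor) to 5 (catastrophic)

    @property
    def risk(self) -> int:
        # Simple risk-matrix score: likelihood multiplied by severity.
        return self.likelihood * self.severity


def rank(scenarios):
    """Return the scenarios sorted from highest to lowest risk score."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


# Illustrative entries only; real values come from the assessment team.
register = [
    Scenario("Pressure transmitter fails low", "equipment", 3, 4),
    Scenario("Malware halts operator stations", "cyber", 2, 5),
    Scenario("Cooling pump trips", "equipment", 2, 2),
]

for s in rank(register):
    print(f"{s.risk:>2}  {s.origin:<9}  {s.name}")
```

Putting both origins in one list means a cyber scenario competes for attention directly with the equipment failures around it, rather than sitting in a separate IT document.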
Following corporate IT practices, tools have been created for use on process control engineering networks that scan the system for details such as asset identification, protocols in use, operating system status, version levels, patch levels, and service pack installation. If they are not applied with cyber security concerns in mind, however, these same tools might, by the nature of the network design, provide potential hackers with intelligence that was not previously accessible to them. For this reason, a combination of skill sets or departments should be employed in all aspects of safety, from the field device to the corporate firewall that connects to the Internet.
Assessments should be carried out on the use of protocols down to the field device level. Which of them are vulnerable to attacks over an Internet Protocol (IP) network? Assessment should continue to the level at which field device output is converted for compatibility with the DCS. A multi-skilled assessment team must determine, for example, whether a compromised field device can effectively control or interrupt operations. Making such determinations requires an understanding of many different technologies, including the protocols, the network on which they reside, the switches, the cable traces, and the power supply.
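As a rough sketch of how a team might triage field links for that kind of review, the example below assigns a review priority based on two questions: is the link carried over IP, and can it write to process outputs? The device tags, the `FieldLink` record, and the three-level priority rule are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldLink:
    device: str
    protocol: str
    ip_based: bool           # reachable over an IP network
    can_write_outputs: bool  # able to change the process, not just report on it


def review_priority(link: FieldLink) -> str:
    """Assumed triage rule: IP-reachable links that can write outputs come first."""
    if link.ip_based and link.can_write_outputs:
        return "high"
    if link.ip_based or link.can_write_outputs:
        return "medium"
    return "low"


# Hypothetical inventory for illustration.
links = [
    FieldLink("FT-101 flow transmitter", "HART over 4-20 mA", False, False),
    FieldLink("PLC-7 remote I/O", "Modbus TCP", True, True),
    FieldLink("XV-220 valve positioner", "PROFIBUS DP", False, True),
    FieldLink("AT-310 analyzer", "Modbus TCP", True, False),
]

for link in links:
    print(f"{review_priority(link):<6}  {link.device} ({link.protocol})")
```

A non-IP link is not automatically safe: if a gateway bridges it to an IP network, it may still be reachable, so a fuller model would also record the gateways and converters in the path.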
If a network switch connecting field infrastructure to the process control domain were to lose power or be otherwise compromised, for example, what might the consequence be? Redundancy is not always the answer, since the design could leave the redundant switch exposed to the same virus that attacks the primary one. What if a hacker port-forwards or otherwise redirects traffic from one network port or route to another? This is a common technique for extracting data through switches, but what if the attacker could reroute process command and control data in the same way?
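One way to reason about rerouted command traffic is to compare observed flows against the flows the design says should exist. The sketch below assumes that flow records are already available as (source, destination, port) tuples from some monitoring source. The addresses, the `EXPECTED_FLOWS` set, and the `unexpected_flows` helper are hypothetical.

```python
# Hypothetical design baseline: which hosts may send control traffic where.
# Port 502 is the registered TCP port for Modbus TCP.
EXPECTED_FLOWS = frozenset({
    ("10.0.1.10", "10.0.2.20", 502),  # operator station A -> controller
    ("10.0.1.11", "10.0.2.20", 502),  # operator station B -> controller
})


def unexpected_flows(observed, expected=EXPECTED_FLOWS):
    """Return observed flows that are not part of the design baseline."""
    return [flow for flow in observed if flow not in expected]


# Illustrative observations: the second flow carries the same kind of
# traffic from the same source, but to a destination outside the baseline.
observed = [
    ("10.0.1.10", "10.0.2.20", 502),
    ("10.0.1.10", "10.0.9.99", 502),
]

for src, dst, port in unexpected_flows(observed):
    print(f"Unexpected flow: {src} -> {dst}:{port}")
```

The comparison only works if engineering supplies the baseline of legitimate control paths and IT supplies the flow data, which is the kind of joint effort the assessment calls for.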
After field protocols, network components, and ports have been assessed, the next in line is the process control domain (PCD) or DCS. The days of DCS network infrastructures being managed solely by instrumentation and controls engineers are waning fast. These networks are now exposed to the same malware, viruses, and attack trends that IT threat analysis tracks. The IT department usually has the skill set and knowledge to ascertain the damage that can be done to the network should malicious programs enter. Of course, the engineering department still has to assess the damage to the production process should an IT-based attack take down a particular device.
Further, change in the process control domain has traditionally been slow compared with change in IT systems. IT systems have been responding to outside interference since the 1970s and have evolved a tactical approach to security, whereas process control has taken a more strategic approach. Only now is process control becoming more tactical, making it essential that IT and engineering skills and practices be combined in the assessment of today’s plant risk.
The human factor
Among recent attacks, the Stuxnet virus is probably the best known, owing to the surrounding publicity, the in-depth reporting, and the fact that it specifically targeted process automation systems. This threat was created outside the plant and designed to disrupt the routine running of a process or processes. The virus was designed with specific knowledge of the protocols used on the process control network, enabling it to wreak havoc. It would be interesting to know whether any pre-HAZOP or risk and threat assessment considered this type of attack when the systems were designed and implemented.
But Stuxnet wasn’t just about technology. It also involved human weakness and error. The virus was apparently introduced via a USB device that was left where employees would find it. Did the people who found the device consider the risk, impact, and consequence of using it in the process control domain? Probably not.
So which department’s safety, security, risk, and threat assessment should be responsible for addressing this type of threat? The answer should be both!
In retrospect, it is easy to see that, from an IT perspective, better management of the USB ports would have helped. Had employees followed policies and procedures, assuming these were in place and being enforced, they would not have introduced the virus into the process control domain. Many IT departments now provide employees with authorized USB devices to prevent such incidents and to keep employees from using their own devices.
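The decision behind an authorized-device policy can be sketched as a simple allowlist lookup. In practice, enforcement is handled by operating system policy or endpoint management tools, so the code below shows only the logic. The vendor ID, product ID, and serial values, the `AUTHORIZED` set, and the `is_authorized` helper are hypothetical.

```python
# Hypothetical allowlist of company-issued devices:
# (vendor ID, product ID, serial number), stored in normalized form.
AUTHORIZED = frozenset({
    ("1234", "abcd", "SN0001"),
    ("1234", "abcd", "SN0002"),
})


def is_authorized(vendor_id, product_id, serial, allowlist=AUTHORIZED):
    """Return True only if the device matches an issued, registered device."""
    # Normalize case so that "ABCD" and "abcd" refer to the same product ID.
    key = (vendor_id.lower(), product_id.lower(), serial.upper())
    return key in allowlist


# A company-issued device is accepted; a device found in the parking lot is not.
print(is_authorized("1234", "ABCD", "sn0001"))
print(is_authorized("5678", "0001", "UNKNOWN"))
```

Matching on the serial number as well as the model means that an identical-looking device bought elsewhere is still rejected.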
Another recent attack, this one using the Shamoon malware, involved three functions: reporting; overwriting drivers, programs, or libraries; and, most damaging, overwriting the master boot record and then instructing the machine to reboot, rendering it useless.
Had cyber security been considered during the functional safety assessment, the consequences and impact might have been assessed and understood, prompting the implementation of relevant risk reduction measures. These could have included disabling USB ports, ensuring that policies and procedures are implemented, training personnel on the risks of using rogue USB devices, and so on.
Integrating practices and procedures
Just having IT and engineering groups in communication is not enough. Effective collaboration requires a close analysis of the practices and procedures of both departments to see whether they contradict each other. Synergy is good, but any contradiction could be a weakness in your system.
The time has come for us to combine the best of the IT world and the functional safety world. The next time a HAZOP is performed, consider not just the process hazards but also the IT hazards, consequences, and impact. The sooner we can eradicate the traditional divide between the safety and cyber security protection functions, the better off we will all be, from both a welfare and a financial perspective. And the best way to do that is to understand the best practices of both worlds, learn from them, and apply the knowledge together.
Gary Williams, MSc, ITSEC, is a control and safety product manager for Invensys Operations Management. Steve J. Elliott is the Triconex product director for Invensys Operations Management.
- While the safety systems of a plant may be designed to protect against human or technical failures, they may not consider the effects of a deliberate cyber attack from the outside.
- Your process safety group can and probably should work with your IT security group, as the two need to work hand in hand to protect against both traditional safety threats and cyber threats.
For more information, visit: http://iom.invensys.com