Power fail-safe state: After the power outage

Identifying a power outage – small or large – is only a fraction of the battle for your equipment.


Every time you order a valve or configure a programmable drive, you specify the safe state that the equipment will move to in the event of a power failure. Motors generally just stop, and valves frequently move with the assistance of springs or other stored-energy devices. Some facilities have battery banks that allow for orderly, controlled shutdown. Others have separate safety systems that ensure that everything goes down in the safest possible manner.

Suppose your power fails, the system goes down, and no energy is released in a way that endangers people or property. Good job. Now what? Blip! The power just came back.

Now we have to deal with the aftermath of the power outage. Controllers are powering back up, circuits are energized, and the system needs to stay safe. Generally, the protocol is to leave all the equipment in the power-off safe state until something external to the direct control verifies that all is well. This external entity might be a Safety Instrumented System, or it might be an operator who walks down the process and presses the reset button.
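As a rough illustration of that protocol, here is a minimal Python sketch of a restart latch. The class and method names (`RestartLatch`, `power_restored`, `reset`) are hypothetical and not taken from any real controller; the only point is that returning power never re-enables outputs by itself.

```python
from enum import Enum


class State(Enum):
    SAFE = "safe"        # outputs de-energized, waiting for permission
    RUNNING = "running"  # normal operation


class RestartLatch:
    """Hypothetical latch: every power event leaves the equipment in SAFE."""

    def __init__(self) -> None:
        self.state = State.SAFE

    def power_lost(self) -> None:
        # Springs or other stored-energy devices drive the equipment safe.
        self.state = State.SAFE

    def power_restored(self) -> None:
        # Power coming back must not restart anything on its own.
        self.state = State.SAFE

    def reset(self, external_ok: bool) -> State:
        # Only an external check (SIS permissive or operator reset)
        # is allowed to release the latch.
        if external_ok:
            self.state = State.RUNNING
        return self.state


latch = RestartLatch()
latch.reset(external_ok=True)   # operator walked the process down
latch.power_lost()
latch.power_restored()
print(latch.state)              # still safe until someone resets again
```

The design choice worth noticing is that `power_restored` and `power_lost` end in the same state, so the equipment cannot tell a restart from an ongoing outage and has to wait for the external reset either way.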

What if we had a power outage and no one noticed?

How do we define a power outage? Is it the drop in voltage? My stereo might drop offline if my nominal 120 Vac falls to 100 Vac, but my incandescent desk lamp might work fine (if dimmer) at 70 Vac. Every device has different voltage requirements, so it can be difficult to precisely define the minimum voltage. How about time? Clearly, if the power is out for an hour you would call it an outage. How would you refer to an outage that lasted one cycle, about 17 ms at 60 Hz? The power supply in my desktop computer might notice, but my box fan certainly will not.

Pick a voltage drop of X% and a duration of Y cycles, or maybe a product X*Y that allows for variations in both, and you have just defined your own ‘power outage’ metric. Unfortunately, every device in the plant has a different metric and its own unique way of defining when it thinks an outage has occurred. You generally cannot know these limits on a device-by-device basis.
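To make the X% and Y cycles idea concrete, here is a small Python sketch. The `Device` class and every threshold in it are assumptions chosen to loosely match the stereo and lamp examples above; real ride-through limits are usually unknown, which is exactly the problem.

```python
from dataclasses import dataclass

NOMINAL_VAC = 120.0        # nominal supply voltage from the example above
CYCLE_MS = 1000.0 / 60.0   # one 60 Hz cycle, about 16.7 ms


def sag_percent(measured_vac: float) -> float:
    """Voltage drop as a percentage of nominal."""
    return 100.0 * (NOMINAL_VAC - measured_vac) / NOMINAL_VAC


@dataclass(frozen=True)
class Device:
    name: str
    max_sag_pct: float   # deepest drop the device rides through (assumed)
    max_cycles: float    # longest such drop it rides through (assumed)

    def notices(self, sag_pct: float, cycles: float) -> bool:
        # The device "sees" an outage only when the event is both
        # deeper and longer than it can ride through.
        return sag_pct > self.max_sag_pct and cycles > self.max_cycles


# Illustrative numbers only, not measured data.
stereo = Device("stereo", max_sag_pct=15.0, max_cycles=1.0)
lamp = Device("lamp", max_sag_pct=45.0, max_cycles=1.0)

event_sag = sag_percent(90.0)   # supply dips to 90 Vac
event_cycles = 3.0              # for three cycles, roughly 3 * CYCLE_MS

for device in (stereo, lamp):
    print(device.name, device.notices(event_sag, event_cycles))
```

With these assumed limits, the same three-cycle dip counts as an outage for the stereo but not for the lamp. Both devices are right by their own metric.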

When the outages are long and deep, you can expect every device to notice the loss and to shut down. When outages are very short and shallow, you might get them regularly and never notice because all your equipment is able to ride through the episode.

What if we had a power outage and only some noticed?

Device A notices a short power outage. The enabling signal to a valve is released and the spring return forces it to the safe state. After the momentary outage, a controller is back online waiting for an operator to press the reset button before the valve is re-enabled. Device B does not notice the same short power outage, so it stays in normal operational mode. If Device A and Device B are not communicating or otherwise linked, an unsafe situation can easily result.

Some years ago, a facility I was associated with had multiple independent single-loop controllers operating a boiler. A short power outage caused some of these controllers to reset to their safe state. Others did not notice the outage and continued on in normal operation. Things went boom and that boiler suddenly had a slightly different shape.

How do we make sure everyone notices?

There are two ways to make sure that all devices stay up or go down together.

The first is to find the minimum incident that any equipment would notice, install a sensor that can detect those conditions, and kill power to everything whenever those conditions are met. Everything will go to safe state together, at the expense of frequent nuisance shutdowns of all the equipment.

The second is to install a supplemental power supply that will filter out all incidents smaller than the one that would take out every device. No device would notice an outage until the outage is large enough that everyone notices. Such supplemental power supplies are generally prohibitively expensive.
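The two options bracket the problem from opposite ends, which a short Python sketch can show. It assumes each device's tolerance can be boiled down to a single severity score (for example, percent sag times cycles) and that those scores are known. Neither assumption usually holds in a real plant, and the device names and numbers are made up.

```python
from typing import Mapping


def classify(severity: float, ride_through: Mapping[str, float]) -> str:
    """Say who notices an event, given each device's ride-through limit."""
    most_sensitive = min(ride_through.values())
    least_sensitive = max(ride_through.values())
    if severity <= most_sensitive:
        return "nobody notices"
    if severity > least_sensitive:
        return "everyone notices"
    return "only some notice"   # the dangerous middle ground


def option_one_trips_all(severity: float, ride_through: Mapping[str, float]) -> bool:
    # Option 1: a sensor kills power as soon as ANY device would notice.
    return severity > min(ride_through.values())


def option_two_hides_event(severity: float, ride_through: Mapping[str, float]) -> bool:
    # Option 2: a supplemental supply hides every event that would
    # not already take out ALL devices.
    return severity <= max(ride_through.values())


# Hypothetical limits in percent-cycles.
limits = {"controller_a": 20.0, "controller_b": 100.0, "drive_c": 300.0}

for severity in (10.0, 50.0, 400.0):
    print(severity, classify(severity, limits))
```

Option one moves the trip point down to the most sensitive device, and option two moves the ride-through up to the least sensitive one. Either way the "only some notice" band disappears, at the cost of nuisance trips or expensive hardware.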

If you cannot make sure that the devices all stay up or go down together, then the solution needs to be communication. Devices that notice the outage perform their shutdown function. Devices that do not notice the outage must instead detect the shutdown of the other devices and take their own actions accordingly. These webs of communication and detection links tend to grow very rapidly, and the interrelationships between different systems and process areas can become very complex.
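Here is a minimal Python sketch of that communication approach, using a hypothetical `LinkedDevice` class in which a trip on one device propagates to every device it is linked to. It stands in for whatever hardwired interlock or network signal a real plant would use. The helper `full_mesh_links` counts the links needed if every device watches every other one directly.

```python
class LinkedDevice:
    """Hypothetical device that trips when any linked peer trips."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tripped = False
        self.peers: list = []

    def link(self, other: "LinkedDevice") -> None:
        # Links are two-way: each device watches the other.
        self.peers.append(other)
        other.peers.append(self)

    def trip(self) -> None:
        if self.tripped:
            return              # already safe; stops endless echoing
        self.tripped = True
        for peer in self.peers:
            peer.trip()         # peers follow us to the safe state


def full_mesh_links(device_count: int) -> int:
    """Pairwise links needed if every device watches every other."""
    return device_count * (device_count - 1) // 2


a, b, c = LinkedDevice("A"), LinkedDevice("B"), LinkedDevice("C")
a.link(b)
b.link(c)

a.trip()   # only A noticed the outage
print([d.tripped for d in (a, b, c)])

for n in (3, 10, 30):
    print(n, "devices ->", full_mesh_links(n), "links")
```

Even though only Device A noticed the outage, B and C follow it down through the links. The link count grows roughly with the square of the device count, which is why these webs become complex so quickly.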

The perfect solution does not exist.

Every situation is different, and every risk evaluation is different. What works at my plant may be completely inappropriate at yours. The key is to recognize the risks that uncertainty about power failures can cause.

Power-fail events are not yes/no situations. There are wide ranges of maybe that must be considered. If you ignore these potential areas of risk, you might find yourself trying to explain just why that system went boom.

This post was written by Robert Henderson. Robert is a Principal Engineer at MAVERICK Technologies, a leading automation solutions provider offering industrial automation, strategic manufacturing, and enterprise integration services for the process industries. MAVERICK delivers expertise and consulting in a wide variety of areas including industrial automation controls, distributed control systems, manufacturing execution systems, operational strategy, business process optimization and more. 
