Predictive maintenance: Analyze data properly

Smart maintenance keeps that ‘new machine feeling’ intact.
By Salman Aftab Sheikh October 2, 2015

Anyone who has owned a new car or bike will remember the thrill and excitement of those first few years with their vehicles: shining parts, flawless performance, and maintenance-free running. As is the natural course of things, a few years later the shining parts begin to rust and the roaring engine starts to sputter while the suspension creaks and squeaks.

Though some may be motivated to discard their vehicles and get new ones, most people either cannot afford that luxury or see no prudence in doing so, particularly when all that’s needed is good maintenance to keep things running smoothly for many years ahead.

The same dilemma is faced by plant owners; neatly cabled control panels, dust-free HMI screens, and smoothly operating mechanical parts don’t last forever. And unlike a vehicle, replacing everything in a plant instead of maintaining it is an absurd idea. While new equipment can make a good first impression, what builds long-term charm is smart maintenance of all that equipment.

Reactive vs. preventive

The term "maintenance" broadly covers any continuous activity undertaken to improve the faltering health of equipment and assets, ensuring that both short-term and long-term performance are not adversely affected. There are two main types of maintenance activities, defined by the state of assets being maintained: reactive maintenance and preventive maintenance.

Reactive maintenance is, as the name suggests, maintenance carried out to remedy a failure or incident. Replacing broken tool parts on a machine, restarting a conveyor that stopped due to overloading, repairing damaged pipelines, and run-time debugging of a software error all fall under reactive maintenance.

Preventive maintenance, by comparison, focuses on preventing failures or incidents by promptly replacing or repairing equipment during routine shutdowns/inspections, before it fails and causes trouble during operation. Replacing filters every two months, changing generator oil every 50 hr, and adjusting low inlet pressure on compressors are typical examples of preventive maintenance.

Choosing which method to use depends a lot on the process, application, and budget in question. Reactive maintenance is relatively low-cost in itself, but depending on the criticality of the processes in question, the shutdown/failure cost could be very significant. Conversely, preventive maintenance incurs much higher costs, but the potential shutdown and damage averted can save a significant amount of operating expense (OPEX).

The key here is for end users to evaluate which processes are critical and which processes don’t affect productivity as much. A water-filtration system failure for residents of a plant facility with clean tap water is hardly the type of failure that requires monitoring. A similar system providing treated water to an ammonia plant for production becomes very critical, by comparison, and requires frequent monitoring.

Since the ultimate goal of any plant owner is to maximize productivity and minimize expenditures, a very general approach to choosing a maintenance strategy is to evaluate average failure cost against average maintenance cost and choose the lower cost.
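The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the choice to compare costs on an annual basis, and all figures in the usage example are hypothetical and do not come from any real plant. It also ignores the fact that preventive maintenance rarely eliminates every failure.

```python
def choose_strategy(failure_cost, failures_per_year,
                    maintenance_cost, maintenance_events_per_year):
    """Compare expected annual cost of failures against annual maintenance cost.

    Assumption: both costs are averaged over one year so they are comparable.
    Returns the cheaper strategy name plus both annual figures.
    """
    annual_failure_cost = failure_cost * failures_per_year
    annual_maintenance_cost = maintenance_cost * maintenance_events_per_year
    if annual_failure_cost <= annual_maintenance_cost:
        strategy = "reactive"
    else:
        strategy = "preventive"
    return strategy, annual_failure_cost, annual_maintenance_cost


# Hypothetical figures: one failure every two years costing 50,000,
# versus six scheduled services a year costing 2,000 each.
strategy, fail_cost, maint_cost = choose_strategy(50_000, 0.5, 2_000, 6)
print(strategy, fail_cost, maint_cost)
```

With these made-up numbers the scheduled servicing is cheaper, so the sketch picks the preventive option; a low-criticality asset with a small failure cost would tip the result the other way.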

Preventive maintenance: going deeper

Preventive maintenance is further divided into three categories with varying degrees of system uptime/reliability and operating costs:

Periodic maintenance: Perhaps the most typical form of maintenance, periodic maintenance follows original equipment manufacturer (OEM) and learned-from-experience timelines for expected failures. These potential issues are addressed by replacing components as scheduled to ensure smooth operation. This maintenance incurs medium to high costs and delivers marginally high uptime.

Predictive maintenance: Going a step beyond periodic maintenance, predictive maintenance also includes reliability tests and health checks to ensure asset availability. Results are compared with past data to predict when failures may occur. Issues are addressed before expected failures happen. This maintenance incurs high costs and delivers high uptime.

Condition-based maintenance/monitoring (CBM): This advanced form of preventive maintenance relies on sophisticated software algorithms and continuous monitoring of data from field sensors and instruments to accurately predict the state of each component. CBM presents a holistic view of plant reliability and can help maintain a plant with practically no downtime at all. This maintenance incurs very high costs and delivers very high uptime.
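As a rough illustration of comparing health-check results with past data to predict when a failure may occur, the sketch below fits a straight line to past readings and estimates when the trend would reach a limit. The linear trend, the function name, and the sample readings are all assumptions made for illustration; real degradation is often nonlinear and calls for more careful modeling.

```python
def estimate_limit_crossing(times, readings, limit):
    """Least-squares straight-line fit; returns the time at which the
    fitted trend reaches `limit`, or None if the trend is not rising."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, readings))
    if sxx == 0:
        return None  # all readings taken at the same time; no trend to fit
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or improving trend never reaches the limit
    intercept = mean_r - slope * mean_t
    return (limit - intercept) / slope


# Hypothetical monthly health-check readings (arbitrary units).
months = [0, 1, 2, 3]
values = [0.6, 0.7, 0.8, 0.9]
print(estimate_limit_crossing(months, values, limit=1.2))
```

The returned time is in the same units as the inspection timestamps, so it can be read as a rough deadline for scheduling the repair.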

Why predictive maintenance?

Choosing the right maintenance method is critical. Just as relying on reactive maintenance for heat exchangers can be catastrophic, applying CBM to auxiliary waste-treatment plants is a major OPEX sink.

For oil and gas applications (and many similar high-risk industries), the default form of maintenance is shifting from periodic to predictive. While CBM provides the highest level of reliability and control, the degree of sophistication required for implementation and the costs incurred make it a luxury that is not prudent in most typical applications. Predictive maintenance then becomes the best tool for ensuring sustained productivity, and it can be improved with smart planning and careful analyses.

The key to good predictive maintenance is analyzing data from sensors, data from probes, physical appearance, device state on visual displays, etc. A good plan to analyze data and key insights on what to check help take maintenance a long way in ensuring a great return on investment.

Know what to check

At the heart of it all, predictive maintenance requires a thorough understanding of how everything works, both internally and with its environment. This knowledge is often found in OEM documentation and in the experience of maintenance teams who have worked on similar systems long enough to know their nitty-gritty details.

A starting point for any performance measurement is benchmarking: finding out how a system is performing becomes easier when it can be compared to how it performed at its peak. Benchmark results help point out the key performance issues. The next step is to use root-cause analysis to trace these issues to their sources. These sources are usually physical parts such as proportional-integral-derivative (PID) controllers, valves, transmitters, and sensors, and they can indicate potential issues in equipment and assets.

Automation systems have made tracking changes in these parts much simpler with the option to include alarms and metrics reporting instantaneously within a control system. A pressure transmitter that monitors outlet pressure can be programmed to generate alarms whenever the pressure goes over or under a certain limit; similarly, a control valve can be set with feedback (via a positioner) to indicate valve position in response to the control signal. An alarm generated in these situations calls for an inspection of all associated equipment.
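The high/low limit alarm described above might look like the following in Python. In practice this logic lives in the control system's own alarm configuration, so treat this purely as a sketch: the class name, the tag "PT-101", and the limit values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LimitAlarm:
    """High/low limit check for a single measured value."""
    tag: str      # instrument tag (hypothetical naming)
    low: float    # alarm when the value drops below this limit
    high: float   # alarm when the value rises above this limit

    def check(self, value: float) -> Optional[str]:
        """Return an alarm message, or None when the value is within limits."""
        if value < self.low:
            return f"{self.tag}: LOW alarm ({value} < {self.low})"
        if value > self.high:
            return f"{self.tag}: HIGH alarm ({value} > {self.high})"
        return None


# Hypothetical outlet-pressure limits, in arbitrary pressure units.
outlet_pressure = LimitAlarm(tag="PT-101", low=2.0, high=8.0)
for reading in (5.0, 1.5, 9.0):
    message = outlet_pressure.check(reading)
    if message:
        print(message)  # an alarm calls for inspecting associated equipment
```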

Many times, alarms are triggered by failing voltages, aging sensors, or ungrounded wires, but they can also point to serious issues such as process parameter constraints or aging equipment. Early indicators help address potential threats and are among the most common methods used in predictive maintenance.

Good examples of early indicators are the probes on compressors; a low-pressure compressor typically has around 10 probes monitoring all aspects of it. When inlet pressure goes below a certain limit, anti-surge control becomes active to maintain outlet pressure. Prolonged anti-surge operation can cause serious damage to the compressor and is usually detected through displacement sensors. Typical compressor vibrations are between 0.6 mils and 0.8 mils; alarms are generated at 1.2 mils to 1.6 mils; and vibrations over 2 mils require an emergency shutdown, since 10 mils will cause catastrophic failure. Detecting increased vibrations early on can help address issues in time and prevent shutdowns or worse.
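The vibration thresholds quoted above can be turned into a simple classifier. The sketch below uses the 1.2 mil alarm level and the 2 mil shutdown level from the text; how to treat readings that fall between the quoted bands (for example, between 0.8 and 1.2 mils, or between 1.6 and 2 mils) is an assumption made here, as are the function and constant names.

```python
ALARM_MILS = 1.2      # alarms start at 1.2 mils, per the figures above
SHUTDOWN_MILS = 2.0   # vibrations over 2 mils require emergency shutdown


def classify_vibration(mils: float) -> str:
    """Map a displacement reading (in mils) to an action level.

    Assumptions: anything below the alarm level counts as "normal",
    and the alarm state persists up to and including 2 mils.
    """
    if mils > SHUTDOWN_MILS:
        return "shutdown"
    if mils >= ALARM_MILS:
        return "alarm"
    return "normal"


for reading in (0.7, 1.4, 2.5):
    print(reading, classify_vibration(reading))
```

A real installation would typically add time delays and voting across several probes before tripping, which this sketch leaves out.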

In many cases, alarms and alerts cannot be configured, and the assets concerned require manual inspection to ensure reliability. In such situations, a well-defined checklist helps ensure long-term maintenance and uptime. These checklists can be designed based on OEM documentation and best practices for maintenance. For example, a checklist for a redundant-configuration control-system cabinet would include aspects of physical inspection, wiring/junction inspection, and humidity/moisture inspection. A detailed list of possible LED states on the controllers themselves would be compared against a list of possible causes and immediate remedies, if available.

These checklists can then be filled out by periodic inspection staff members who are able to pick out problems early on and communicate them to the maintenance team. The more thorough and analytical a checklist is, the better it can reflect on equipment and asset health.

Contingency plans

Another important aspect of predictive maintenance is planning for contingencies in the face of unforeseen issues. For very critical applications, a fail-safe system or fallback mechanism can help ensure minimal damage and loss of productivity. Backup generators, redundant controllers, spare input/output modules, auxiliary plants, and work-in-progress storage buffers are some of the typical contingency methods used in the industry to prevent damage and loss from failures.

While it may seem a daunting task for someone new to predictive maintenance, the actual task is not as intimidating with the proper homework and the right professionals to help plan things out. In the absence of experience, a good third-party operations-and-maintenance team can help create operator and inspection checklists, set up alarms, and optimize process parameters. Such a team can also help with training to ensure that in-house staff can perform these tests reliably and efficiently.

Knowing what degree of control is needed and understanding all aspects of the facility requiring maintenance are keys to building a good maintenance plan. After that, it is just a matter of having the right tools for analyzing data and training the right people to actively participate in these activities.

Salman Aftab Sheikh is a lead engineer at Intech Process Automation. Intech Process Automation is a CFE Media Content Partner. Edited by Joy Chang, digital project manager, CFE Media, jchang@cfemedia.com.