The failure dilemma: How to move towards proaction when working in a reactive environment

Consultants have the opportunity to visit many different plants across various industries. This perspective provides a unique viewpoint, especially when compared to an individual who may have worked in one plant or industry his entire career.

By Robert J. Latino, Senior Vice President, Strategic Development and Operational Integrity, Reliability Center, Inc., Hopewell, VA February 1, 2001

Consultants see the entire spectrum, from archaic programs to Star Wars technology; from vibration programs that prefer hickory sticks over oak, to the use of finite element modeling in root cause analysis (RCA). Maintenance strategies are an extremely important element of the consultant’s environment.

Maintenance strategies

There are a number of maintenance philosophies employed throughout industry.

Breakdown maintenance. This approach is totally reactive. When something breaks, fix it!

Preventive maintenance. This approach is a time-based maintenance strategy. Equipment is taken off-line, opened up, and inspected on a predetermined periodic timeframe. Based on the visual inspection, necessary repairs are made and the equipment is put back on-line.

Some preventive maintenance is always necessary. For example, some state laws require annual boiler inspections. While this is a well-intended strategy, it can be very expensive since 95% of the time everything is working properly.

Predictive maintenance. This version is a condition-based maintenance strategy that uses predictive technologies such as vibration monitoring, infrared thermography, and ultrasonics to determine the condition of equipment. Decisions are then made about necessary repairs. Predictive maintenance is a much more economically feasible strategy since labor, materials, and production schedules are used more efficiently.

Precision approach. This concept is futuristic, though not a revelation. It is based on the fact that if jobs are done with a mastery level of precision, then the only failures experienced are wear-outs.

Maintenance statistics reveal that 10% or less of industrial equipment ever reaches the wear-out stage. Therefore, about 90% of the mechanical failures are “personnel avoidable events.” This figure means the human being has intervened in some manner to prevent the wear-out stage from being attained.

However, it is important to be practical and not fool ourselves. The introduction of new technology does not, in and of itself, ensure a more reliable operation. Many companies own fancy, high-tech testing instruments and devices, but no one has been trained properly in their use.

This point is further illustrated by the array of computerized maintenance management systems (CMMSs) that collect tons of data, but yield little information. It does not matter if $10,000,000 is spent on a complex software package that can do wonderful things for the organization if the users do not know how to operate it properly. It becomes a “trash-in, trash-out” system worth a small fraction of the original investment.

Reviewing the first three maintenance strategies shows a common thread: they all focus on the prediction of when failure will occur. The last approach deals with ensuring that the failure does not recur.

Strategies to control failure rates

Assuming that the type of maintenance strategy in place is defined, how do you control failure?

As discussed earlier, preventive and predictive approaches seek to sharpen the timing as to when failure will occur. However, these two approaches do not address why the failure occurs in the first place.

The focus should be on the elimination of failure recurrence. If this objective is attained, there are no signals (temperature, vibration, etc.) for preventive/predictive methods to detect.

It is very difficult for maintenance organizations to accept this premise. For years predictive programs were refined, which increased the accuracy of their information. Now consultants are saying that eliminating the recurrence of the failure is more important.

Two strategies

There are two common strategies that attempt to manage failure.

Reliability centered maintenance (RCM). In its traditional state, the goal of RCM is to determine the criticality of equipment in any process. Based on this information, a customized preventive/predictive maintenance strategy is designed for the organization. This approach is an effort to optimize the use of maintenance resources.

However, this method is an extremely time-consuming and expensive process when done according to the text. The end result is sharper responders, but the approach is still reactive.

Returns from such efforts vary with the RCM approach used and skill of the execution. However, on average, significant returns take more than 1 yr due to the preparation and set-up times.

There is still a great need for RCM, but realize that it is most effective as a short-term solution until the failure can be analyzed and eliminated so that it never occurs again.

Root cause analysis (RCA). This strategy is based on failures that have already occurred, whether chronic or sporadic. Their impact is determinable because the labor, materials, and lost production are now sunk costs.

RCA focuses on eliminating the risk of recurrence of the failures by identifying the physical, human, and latent (organizational) system roots that lead to the failure. Typically the worst drain on an organization’s maintenance budget is not one-time occurrences, but the “cost of doing business” failures.

Generally, 20% or less of the failure events cause 80% or more of the losses experienced. When resources are dedicated to identifying this 20%, and the events are analyzed and corrective action is taken, an immediate bottom-line result is realized. The accepted chronic failures stop occurring; therefore, labor hours are not assigned to them, materials are not expended, and production increases. This scenario also increases product quality and reduces the risk of safety and environmental incidents.
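The 20/80 screening described above can be sketched in a few lines of Python. This is only an illustration, not a method prescribed in this article: the event names, frequencies, and costs are invented, the FailureEvent class and significant_few function are hypothetical names, and annual loss is assumed to be occurrences per year times cost per occurrence.

```python
from dataclasses import dataclass


@dataclass
class FailureEvent:
    """One recurring failure mode (all figures below are hypothetical)."""
    name: str
    occurrences_per_year: int
    cost_per_occurrence: int  # assumed: labor + materials + lost production

    @property
    def annual_loss(self) -> int:
        return self.occurrences_per_year * self.cost_per_occurrence


def significant_few(events, loss_share=0.80):
    """Return the highest-loss events that together reach loss_share of total loss."""
    ranked = sorted(events, key=lambda e: e.annual_loss, reverse=True)
    total = sum(e.annual_loss for e in ranked)
    selected, running = [], 0
    for event in ranked:
        if running >= loss_share * total:
            break  # target share already covered
        selected.append(event)
        running += event.annual_loss
    return selected


# Invented example data: chronic, low-cost events dominate the total.
events = [
    FailureEvent("pump seal leak", 44, 5_000),         # 220,000 per year
    FailureEvent("conveyor belt jam", 120, 1_500),     # 180,000 per year
    FailureEvent("compressor trip", 1, 20_000),        # 20,000 per year
    FailureEvent("motor bearing failure", 4, 4_000),   # 16,000 per year
    FailureEvent("valve actuator fault", 6, 2_000),    # 12,000 per year
    FailureEvent("gearbox oil leak", 5, 2_000),        # 10,000 per year
    FailureEvent("sensor drift", 10, 800),             # 8,000 per year
    FailureEvent("fan imbalance", 2, 3_000),           # 6,000 per year
    FailureEvent("coupling misalignment", 2, 2_500),   # 5,000 per year
    FailureEvent("filter plugging", 6, 500),           # 3,000 per year
]

total_loss = sum(e.annual_loss for e in events)
for e in significant_few(events):
    print(f"{e.name}: {e.annual_loss:,} ({e.annual_loss / total_loss:.0%} of total)")
```

With these made-up figures, two of the ten events (20%) account for about 83% of the annual loss, and both are frequent, individually inexpensive failures of the “cost of doing business” kind rather than the single most expensive event.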

When failures are not occurring, there is no need to detect them. Consequently, resources allocated for preventive/predictive maintenance (P/PM) efforts can be used to assist in RCA.

Most costs associated with conducting an RCA are in people’s time and the resources needed to verify findings. Recommendations are generally noncapital expenditures that correct people’s decision-making skills and the information they receive.

Average returns on investment (ROI) for conducting RCA range from 600% to 1000%. Compare these numbers against the average ROIs expected for engineering projects and it leaves you wondering why more resources are not devoted to this type of work.
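To put the ROI range in concrete terms, here is a small worked example. It assumes the common definition of ROI as net savings divided by cost; the article does not say how its figures were calculated, and the dollar amounts and the roi_percent function name are hypothetical.

```python
def roi_percent(annual_savings: float, analysis_cost: float) -> float:
    """ROI as a percentage, assuming ROI = (savings - cost) / cost."""
    if analysis_cost <= 0:
        raise ValueError("analysis_cost must be positive")
    return (annual_savings - analysis_cost) / analysis_cost * 100.0


# Hypothetical analysis: mostly people's time plus verification of findings.
analysis_cost = 20_000
# Hypothetical yearly losses avoided once the chronic failure stops recurring.
annual_savings = 180_000

print(f"ROI: {roi_percent(annual_savings, analysis_cost):.0f}%")  # prints "ROI: 800%"
```

Under these assumed numbers, an analysis costing 20,000 that eliminates 180,000 in yearly losses returns 800%, which falls inside the range quoted above.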

People realize they should be doing these things, but the reactive culture of the organization does not allow the time and resources to do so. In essence, employees are so busy fire fighting they do not have time for fire prevention.

If failure is accepted as inevitable, the best we can do is sharpen our prediction efforts. Conversely, if we believe that failure is not inevitable, we will refine our RCA efforts.

What do you believe?

Bob Latino holds a degree in Business Administration and Management from Virginia Commonwealth University. Mr. Latino has been a teacher and practitioner of root cause analysis for over 15 yr. He has developed several courses and a software package dealing with the subjects presented in this article.