Beware of the fallacy of the bathtub curve


By Anthony M. Smith and Glenn R. Hinchcliffe February 1, 2006

As this title suggests, all may not be totally well with the bathtub curve. True, some devices may follow its general shape, but the fact is that more has been assumed along those lines than has actually been measured and proven to be the case. The commercial aviation industry, however, does have fairly large populations of identical or similar components in its aircraft fleets. As a part of the extensive investigation that was conducted in the late 1960s as a prelude to the RCM methodology, United Airlines used this database to develop the age-reliability patterns for the nonstructural components in their fleet. The results of this analysis are summarized in Figure 1.

These results came as a surprise to almost everyone, and they continue to do so today when people see them for the first time. Follow-up studies using aircraft data in Sweden in 1973, and by the U.S. Navy in 1983, produced similar results, as shown in Figure 2.

The United Airlines results and the Bromberg results (the 1973 study in Sweden) are essentially identical, and the U.S. Navy results show very similar patterns. In these three studies, random failures accounted for between 77% and 92% of the total failure population, and age-related failures for the remaining 8% to 23%. The authors are not aware of any other studies of a similar nature, so one can only conjecture that the age-reliability characteristics of your plant would show similar trends.

The significance of these results, and their potential importance to the maintenance engineer, cannot be stated too strongly. Let’s examine these more closely, assuming for the moment that these curves may be characteristic of your plant or system.

Only a very small fraction of the components (3% to 4%) actually replicated the traditional bathtub curve concept (curve A).

More significantly, only between 4% and 20% of the components experienced a distinct aging region during the useful life of the aircraft fleets (curves A and B). If we are generous in our interpretation, and allow that curve C also is an aging pattern, this still means that only between 8% and 23% of the components experienced an aging characteristic.

Conversely, between 77% and 92% of the components never saw any aging or wear mechanism developing over the useful life of the airplanes (curves D, E, and F). Thus, while common perceptions tend toward the belief that 9 out of 10 components have “bathtub” behavior, the analysis indicated that this trend was completely reversed when the facts were known.

Notice that many components, however, did experience the infant mortality phenomenon (curves A and F).
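The percentages quoted above can be checked directly against the per-curve figures reported for the three studies (tabulated at the end of this article). The sketch below simply sums the relevant curves for each study. The dictionary layout and the function names are illustrative choices, not part of the original analysis.

```python
# Percentage of components following each age-reliability curve,
# as reported in the article's table for the three studies.
STUDIES = {
    "UAL 1968":       {"A": 4, "B": 2,  "C": 5, "D": 7,  "E": 14, "F": 68},
    "Bromberg 1973":  {"A": 3, "B": 1,  "C": 4, "D": 11, "E": 15, "F": 66},
    "U.S. Navy 1982": {"A": 3, "B": 17, "C": 3, "D": 6,  "E": 42, "F": 29},
}


def share(study, curves):
    """Sum the percentages of the given curve letters for one study."""
    return sum(STUDIES[study][c] for c in curves)


def summarize():
    """Return the grouped percentages discussed in the text, per study."""
    rows = {}
    for study in STUDIES:
        rows[study] = {
            "bathtub (A)": share(study, "A"),
            "distinct aging (A+B)": share(study, "AB"),
            "any aging (A+B+C)": share(study, "ABC"),
            "no aging (D+E+F)": share(study, "DEF"),
            "infant mortality (A+F)": share(study, "AF"),
        }
    return rows


if __name__ == "__main__":
    for study, row in summarize().items():
        print(study)
        for label, pct in row.items():
            print(f"  {label:<24}{pct:>3}%")
```

The sums reproduce the ranges cited in the text: 4% to 20% for curves A and B, 8% to 23% for curves A through C, and 77% to 92% for curves D through F.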

What does all of this mean? Quite a bit! First, recall that a constant failure rate region (curves A, B, D, E, and F all have this region) means that the equipment failures in this region are random in nature; that is, the state of the art is not developed to the point where we can predict what failure mechanisms may be involved, nor do we know precisely when they will occur. We only know that, on average in a large population, the instantaneous failure rate (or, equivalently, the mean time between failures) is a constant value. Of course, we hope that this constant failure rate value is very small, and we thus have a very reliable set of components in our system. But, for the maintenance engineer, these constant failure rate regions mean that overhaul actions will essentially (short of luck) do very little, if anything, to restore the equipment to a like-new condition.
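One way to see why overhaul achieves so little in a constant failure rate region is the memoryless property of the exponential life model, the standard textbook model for a constant failure rate. The sketch below compares the chance of surviving the next 1,000 hours for a new unit and for a unit that has already run 5,000 hours. The failure rate and the hour figures are hypothetical values chosen for illustration, not data from the studies.

```python
import math


def survival(t, rate):
    """Probability of surviving to age t under a constant failure rate."""
    return math.exp(-rate * t)


def conditional_survival(age, extra, rate):
    """Probability of surviving `extra` more hours, given survival to `age`."""
    return survival(age + extra, rate) / survival(age, rate)


if __name__ == "__main__":
    # Hypothetical rate: one failure per 10,000 hours (MTBF = 10,000 h).
    rate = 1.0 / 10_000
    new_unit = conditional_survival(0, 1_000, rate)
    aged_unit = conditional_survival(5_000, 1_000, rate)
    print(f"new unit:  {new_unit:.4f}")
    print(f"aged unit: {aged_unit:.4f}")
```

Under this assumed model the two probabilities are the same, so a unit restored to like-new condition is no more reliable over the next interval than the aged unit it replaced.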

In this constant value region, overhaul is usually a waste of money because we really do not know what to restore, nor do we really know the proper time to initiate an overhaul. (In the constant failure rate region, any time you might select is essentially the wrong time!) Second, and worse yet, is that these overhaul actions may actually be harmful because, in our haste to restore equipment to new, pristine condition, we may have inadvertently pushed it back into the infant mortality region of the curve due to human error during the intrusive actions.
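The second point can be illustrated with a toy calculation. The sketch below assumes a failure rate made up of a constant base rate plus an infant mortality term that decays with time since the last intrusive action, and it assumes each overhaul restarts that decaying term. The model, the function names, and every parameter value are hypothetical; they are not drawn from the airline or Navy data.

```python
import math


def expected_failures(hours, base_rate, infant_rate, infant_decay):
    """Expected failures over `hours` since new (or since an overhaul).

    Toy hazard: base_rate + infant_rate * exp(-t / infant_decay),
    integrated from 0 to `hours`. Assumes a failed unit is repaired
    without resetting its age.
    """
    infant = infant_rate * infant_decay * (1.0 - math.exp(-hours / infant_decay))
    return base_rate * hours + infant


def with_overhauls(horizon, interval, base_rate, infant_rate, infant_decay):
    """Expected failures when every overhaul restarts the infant mortality period."""
    full_cycles, remainder = divmod(horizon, interval)
    per_cycle = expected_failures(interval, base_rate, infant_rate, infant_decay)
    tail = expected_failures(remainder, base_rate, infant_rate, infant_decay)
    return full_cycles * per_cycle + tail


if __name__ == "__main__":
    # All values below are hypothetical, chosen only to show the effect.
    params = dict(base_rate=1e-4, infant_rate=4e-4, infant_decay=500.0)
    horizon = 20_000  # operating hours considered
    print("no overhaul:      ", round(expected_failures(horizon, **params), 2))
    print("overhaul @ 5,000h:", round(with_overhauls(horizon, 5_000, **params), 2))
```

In this model the overhauled unit never accumulates fewer expected failures than the unit left alone, because the overhaul cannot lower the constant base rate and can only reintroduce the early-life term.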

In this specific study, for example, overhaul actions on the components in curves D, E, and F would be susceptible to this counterproductive situation. A third point relates to the periodicity that should be specified for an overhaul task when such an action is considered to be the correct step to take.

For example, if a component is either a curve A or B type, we want to ensure that the overhaul action is not taken too soon, or again, we may be wasting our resources. Often, we do not know what the correct interval should be, or even if an overhaul PM task is the right thing to do. Why? Because we do not have sufficient data to tie down the age-reliability patterns for our equipment.

In summary, we should be very careful about selecting overhaul PM tasks because our equipment may not have an age-reliability pattern that justifies such tasks. In addition, due to human errors, overhauls are likely to cause more problems than they prevent if aging regions are not present. When data is absent to guide us on this very fundamental and important issue, we should initiate an Age Exploration program and/or the collection of data for statistical analyses that will permit us to make the right decisions.
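One common statistical analysis for this purpose is to fit a Weibull distribution to recorded times to failure and inspect the shape parameter: a value near 1 indicates a roughly constant failure rate, a value well above 1 indicates aging, and a value below 1 indicates infant mortality. The text does not prescribe a particular method, so the following is only a minimal sketch. It assumes complete (uncensored) failure times and solves the maximum-likelihood equation for the shape by bisection. The function name and the simulated data are hypothetical.

```python
import math
import random


def weibull_shape_mle(times, lo=0.01, hi=50.0, tol=1e-6):
    """Estimate the Weibull shape parameter from complete failure times.

    All times must be positive. The estimate is limited to [lo, hi].
    """
    t_max = max(times)
    scaled = [t / t_max for t in times]   # rescaling does not change the shape
    logs = [math.log(s) for s in scaled]
    mean_log = sum(logs) / len(logs)

    def score(beta):
        # Left-hand side of the maximum-likelihood equation for the shape.
        powers = [s ** beta for s in scaled]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / beta - mean_log

    # score() increases with beta, so bisection finds the single root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    random.seed(1)
    # Simulated failure times in hours (hypothetical, not field data).
    random_failures = [random.weibullvariate(10_000, 1.0) for _ in range(500)]
    wearout_failures = [random.weibullvariate(10_000, 3.0) for _ in range(500)]
    print("constant-rate sample shape:", round(weibull_shape_mle(random_failures), 2))
    print("wear-out sample shape:     ", round(weibull_shape_mle(wearout_failures), 2))
```

In practice, field data usually include units that have not yet failed, so a real Age Exploration analysis would need a method that handles censored observations; this sketch does not.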

We should also defer, where possible, to the non-intrusive condition-directed tasks until we have more definitive results from the age exploration process. It is indeed a curious (and unfortunate) fact that, in today’s world of modern technology, one of the least understood phenomena about our marvelous machines is how and why they fail.

Printed with permission from Butterworth-Heinemann, a division of Elsevier, from RCM—Gateway to World Class Maintenance, by Anthony M. Smith, AMS Associates Inc. in California, and Glenn R. Hinchcliffe, Consulting Professional Engineer, G&S Associates Inc. in North Carolina. Copyright 2004. For more information about this title and similar titles, please visit www.books.elsevier.com .

Age-reliability patterns (percentage of components following each failure rate curve type)

Curve type    UAL 1968    Bromberg 1973    U.S. Navy 1982
A             4%          3%               3%
B             2%          1%               17%
C             5%          4%               3%
D             7%          11%              6%
E             14%         15%              42%
F             68%         66%              29%