Understanding event data collection: Part 1

Editor's note: This is the first of a two-part series on the importance of effective event data collection for reliability analysis. The second part will appear in the August issue.

By Ken Latino July 8, 2004

Collecting event data is a minimum requirement to measure the effectiveness of your current asset management strategies. Without this information, one tends to allocate resources to the “problem of the day.” This means you probably do not have a systematic approach to removing defects from your operation. Having a complete set of event data for all asset events provides a much clearer understanding of how to get the greatest return from your human and capital resources.

Sources of event data

Event data can come from a variety of sources, such as maintenance management systems, predictive and inspection systems, and production systems. For this discussion, we will focus primarily on collecting equipment-related event data that reside in a computerized maintenance management system (CMMS).

It is important to understand the reasoning behind the data collection effort before getting into the details of how it is actually accomplished. The collection of event data has a double benefit. The primary benefit of comprehensive event data is to alert process owners as to whether their asset strategies are effective. The second benefit is that, once ineffective strategies are identified, the same event data are used to drill down and determine why those strategies are ineffective.

What is reliability event data?

Reliability event data can comprise many items within many contexts. For the purposes of this discussion, reliability event data include:

  • Work events that occur on equipment

  • Type of work performed

  • Conditions found at the time of work

  • Technical findings after work is completed

  • Dates/time associated with the work

  • Cost associated with performing the work.

When an event occurs on a piece of equipment, it is critical to record what type of event actually occurred. For example, was it a failure, repair, or preventive maintenance (PM)? What was the condition of the equipment at the time of the event? Among the most critical items of information to record for any event are the date and time stamps related to the event and the costs associated with it, such as labor, material, contractor, and production losses (see “Recommended event data to collect”).

After the work is completed, it is necessary to record the technical findings, such as the failed item, failure mode, and cause, along with several other data elements discussed in further detail in Part 2 of this article in the August issue.

A company’s balanced scorecard is composed of standardized, enterprise-wide performance measurements related to production assets. A balanced scorecard provides a holistic view of key performance indicators (KPIs) spanning multiple plants and allows management to make strategic, fact-based decisions with greater confidence.

      Author Information
      Ken Latino is a senior consultant in the Meridium, Inc. Asset Performance Management Consulting Group, a root cause analysis practitioner, a frequent speaker on the topic at conferences, and an educator. He is the author of several root cause analysis courses, books, and trade magazine articles. Ken designed a software program that assists analysts in conducting a disciplined root cause analysis. He can be reached at 540-344-9205 ext. 1176 or KLatino@meridium.com.

      Recommended event data to collect

      These lists include the data and supplemental descriptions that are recommended to be collected on a given event. These data will be used as the basis for compiling a balanced scorecard as well as the information required to find the underlying causes.

      Identification

      Event ID — the unique identifier for each failure event

      CMMS ID — useful if you are using a CMMS as the base data collection system for failure events

      Functional location — typically a “smart” ID that represents what function takes place at a given location (for example, pump 01-G-0001 must move liquid X from point A to point B)

      Functional location hierarchy — functional hierarchy to roll up metrics at various levels:

      Level 1

      Level 2

      Level 3

      Level n (System)

      Equipment ID — usually a randomly generated ID that reflects the asset that is in service at the functional location. The equipment ID is kept separate from the functional location because assets can move from place to place, whereas functional locations do not

      Equipment name — name or description of equipment for identification purposes

      Equipment category (for example, rotating) — indicates the category of equipment the work was performed on — generally organized by discipline (rotating, fixed, electrical, instrument)

      Equipment class (for example, pump) — indicates the class of equipment on which the work was performed. Failure codes can depend on this value

      Equipment type (for example, centrifugal) — indicates the type of equipment on which the work was performed. Failure codes can depend on this value also
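The identification fields above can be sketched as a simple record, with the functional location hierarchy used to roll up a metric (here, an event count) at a chosen level. This is a minimal illustration, not a CMMS schema: the class name, field names, helper function, and all sample values other than the pump ID 01-G-0001 are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EventIdentification:
    """Identification fields for one event (names are illustrative)."""
    event_id: str
    cmms_id: str
    functional_location: str      # "smart" ID, e.g. "01-G-0001"
    hierarchy: Tuple[str, ...]    # Level 1 .. Level n, top level first
    equipment_id: str             # asset currently installed at the location
    equipment_name: str
    equipment_category: str       # e.g. "rotating"
    equipment_class: str          # e.g. "pump"
    equipment_type: str           # e.g. "centrifugal"


def count_events_by_level(
    events: List[EventIdentification], level: int
) -> Dict[Tuple[str, ...], int]:
    """Count events per hierarchy node at the given 1-based level."""
    counts: Dict[Tuple[str, ...], int] = defaultdict(int)
    for event in events:
        if len(event.hierarchy) >= level:
            counts[event.hierarchy[:level]] += 1
    return dict(counts)


# Hypothetical sample events; only the functional location format
# follows the article's example.
events = [
    EventIdentification("E-1", "WO-100", "01-G-0001",
                        ("Plant A", "Unit 01", "Feed system"),
                        "EQ-48213", "Feed pump", "rotating", "pump", "centrifugal"),
    EventIdentification("E-2", "WO-101", "01-G-0001",
                        ("Plant A", "Unit 01", "Feed system"),
                        "EQ-51977", "Feed pump", "rotating", "pump", "centrifugal"),
    EventIdentification("E-3", "WO-102", "02-E-0004",
                        ("Plant A", "Unit 02", "Cooling system"),
                        "EQ-30412", "Cooler", "fixed", "heat exchanger", "shell and tube"),
]

print(count_events_by_level(events, 2))
```

Note that events E-1 and E-2 share a functional location but carry different equipment IDs, which is the situation the separate equipment ID is meant to capture: the asset changed, the location did not.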

      History

      Functional loss — indicates whether the equipment experienced a functional loss as part of this event. A functional loss can be defined as:

      Complete loss of function

      Partial loss of function

      Potential loss of function

      Functional failure (ISO failure mode) — basically the symptoms of a failure if one has occurred. Any physical asset is installed to fulfill a number of functions. The functional failure describes which function the asset no longer is able to fulfill

      Effect — the effect of the event on production, safety, environmental, or quality

      Maintainable item — the actual component identified as causing the asset to lose its ability to operate (for example, bearing)

      Condition — indicates the type of damage found on the maintainable item. In some cases this also indicates a failure mechanism

      Cause — the general cause of the condition. This is not the root cause; it is recommended to use root cause failure analysis (RCFA) to assess root causes

      Maintenance action — corrective action performed to mitigate the damaged item

      Narrative — text description of work and suggestions for improvements
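One way to keep the history fields consistent is to store the functional loss as a fixed set of codes and to check that the coded fields are filled in before an event is closed. The sketch below assumes that approach; the class, the field names, the list of required fields, and the sample values are hypothetical and would be replaced by a site's own code tables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FunctionalLoss(Enum):
    """The three kinds of functional loss listed above."""
    COMPLETE = "complete"
    PARTIAL = "partial"
    POTENTIAL = "potential"


@dataclass
class EventHistory:
    """History fields for one event (names are illustrative)."""
    functional_loss: Optional[FunctionalLoss]  # None means no functional loss
    functional_failure: str
    effect: str
    maintainable_item: str
    condition: str
    cause: str
    maintenance_action: str
    narrative: str = ""


# Assumption for this sketch: every coded field is required, the narrative is optional.
CODED_FIELDS = (
    "functional_failure", "effect", "maintainable_item",
    "condition", "cause", "maintenance_action",
)


def missing_fields(history: EventHistory) -> List[str]:
    """Return the names of coded fields that were left blank."""
    return [name for name in CODED_FIELDS if not getattr(history, name).strip()]


history = EventHistory(
    functional_loss=FunctionalLoss.COMPLETE,
    functional_failure="Fails to deliver required flow",
    effect="production",
    maintainable_item="bearing",
    condition="worn",
    cause="",  # left blank on purpose
    maintenance_action="replace",
)
print(missing_fields(history))
```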

      Dates

      Event date — the date that the event was first observed and documented

      Mechanically unavailable date/time — the date/time that the equipment is actually taken out of service either due to a failure or for the repair work

      Mechanically available date/time — the date/time that the equipment is available for service after the repair work is completed

      Mechanical downtime — the difference between the mechanically unavailable date/time and the mechanically available date/time, expressed in hours

      Maintenance start date/time — the date/time that the equipment is actually being worked on by maintenance

      Maintenance end date/time — the date/time that the equipment repair is completed

      Time to repair — the total maintenance time to repair the equipment
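The two derived values in this list can be computed directly from the recorded time stamps. The sketch below follows the definition of mechanical downtime given above; treating time to repair as the span from maintenance start to maintenance end is an assumption, as are the function name and the sample time stamps.

```python
from datetime import datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Return the elapsed time from start to end in hours."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (end - start).total_seconds() / 3600.0


# Hypothetical time stamps for a single event.
mechanically_unavailable = datetime(2004, 7, 1, 6, 0)
maintenance_start = datetime(2004, 7, 1, 9, 30)
maintenance_end = datetime(2004, 7, 1, 15, 30)
mechanically_available = datetime(2004, 7, 1, 18, 0)

# Mechanical downtime: unavailable -> available, in hours.
mechanical_downtime = hours_between(mechanically_unavailable, mechanically_available)

# Time to repair (assumed here to be maintenance start -> maintenance end).
time_to_repair = hours_between(maintenance_start, maintenance_end)

print(mechanical_downtime, time_to_repair)  # 12.0 6.0
```

The gap between the two figures (six hours in this example) is time the equipment was out of service but not being worked on, which is only visible when all four time stamps are recorded.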

      Consequence

      Maintenance cost — the total maintenance expenditure to rectify the failure. This could be company or contractor cost. This cost could be organized into categories, such as material, labor, and contractor

      Production cost — the amount of business loss associated with not having the assets in service. This cost includes lost opportunity, such as when an asset fails to perform its intended function and there is no spare asset or capability to make up the loss.
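The consequence fields lend themselves to a small roll-up per event. The sketch below assumes maintenance cost is split into the material, labor, and contractor categories mentioned above, and that the total consequence of an event is maintenance cost plus production cost; the class name, field names, and amounts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EventConsequence:
    """Cost fields for one event (names and categories are illustrative)."""
    material_cost: float = 0.0
    labor_cost: float = 0.0
    contractor_cost: float = 0.0
    production_cost: float = 0.0  # business loss while the asset is out of service

    @property
    def maintenance_cost(self) -> float:
        """Total maintenance expenditure across the assumed categories."""
        return self.material_cost + self.labor_cost + self.contractor_cost

    @property
    def total_cost(self) -> float:
        """Maintenance cost plus production cost (an assumption of this sketch)."""
        return self.maintenance_cost + self.production_cost


consequence = EventConsequence(
    material_cost=1200.0,
    labor_cost=800.0,
    contractor_cost=500.0,
    production_cost=10000.0,
)
print(consequence.maintenance_cost, consequence.total_cost)  # 2500.0 12500.0
```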