Understanding event data collection: Part 1

Editor's note: This is the first of a two-part series on the importance of effective event data collection for reliability analysis. The second part will appear in the August issue.

Collecting event data is a minimum requirement to measure the effectiveness of your current asset management strategies. Without this information, one tends to allocate resources to the “problem of the day.” This means you probably do not have a systematic approach to removing defects from your operation. Having a complete set of event data for all asset events provides a much clearer understanding of how to get the greatest return from your human and capital resources.

Sources of event data

Event data can come from a variety of sources, such as maintenance management systems, predictive and inspection systems, and production systems. For this discussion, we will focus primarily on equipment-related event data that reside in a computerized maintenance management system (CMMS).

It is important to understand the reasoning behind the data collection effort before getting into the details of how it is actually accomplished. The collection of event data has a double benefit. The primary benefit of comprehensive event data is to alert process owners as to whether their asset strategies are effective. The second benefit is that, once ineffective strategies are identified, the same event data can be used to drill down and determine why those strategies are ineffective.

What is reliability event data?

Reliability event data can comprise many items within many contexts. For purposes of this discussion, reliability event data include:

Work events that occur on equipment
Type of work performed
Conditions found at the time of work
Technical findings after work is completed
Dates/time associated with the work
Cost associated with performing the work.

When an event occurs on a piece of equipment, it is critical to record what type of event actually occurred. For example, was it a failure, repair, or preventive maintenance (PM)? What was the condition of the equipment at the time of the event? Among the most critical items to record for any event are the date and time stamps related to the event and the costs associated with it, such as labor, material, contractor, and production losses (see “Recommended event data to collect”).

After the work is completed, it is necessary to record the technical findings, such as the failed item, failure mode, and cause, along with several other data elements discussed in further detail in Part 2 of this article in the August issue.

A company’s balanced scorecard is composed of standardized, enterprise-wide performance measurements related to production assets. A balanced scorecard provides a holistic view of key performance indicators (KPIs) spanning multiple plants and allows management to make strategic, fact-based decisions with greater confidence.

Author Information

Ken Latino is a senior consultant in Meridium, Inc.'s Asset Performance Management Consulting Group, a root cause analysis practitioner, a frequent speaker on the topic at conferences, and an educator. He is the author of several root cause analysis courses, books, and trade magazine articles. Ken designed a software program that assists analysts in conducting a disciplined root cause analysis. He can be reached at 540-344-9205 ext. 1176 or [email protected].

Recommended event data to collect

These lists include the data, with supplemental descriptions, that are recommended to be collected on a given event. These data will be used as the basis for compiling a balanced scorecard as well as the information required to find the underlying causes.

Identification

Event ID — the unique identifier for each failure event

CMMS ID — useful if you are using a CMMS as the base data collection system for failure events

Functional location — typically a “smart” ID that represents what function takes place at a given location (for example, pump 01-G-0001 must move liquid X from point A to point B)

Functional location hierarchy — functional hierarchy to roll up metrics at various levels:

Level 1

Level 2

Level 3

Level n (System)

Equipment ID — usually a randomly generated ID that reflects the asset that is in service at the functional location. A separate equipment ID and functional location are needed because assets can move from place to place, while functional locations stay fixed

Equipment name — name or description of equipment for identification purposes

Equipment category (for example, rotating) — indicates the category of equipment the work was performed on — generally organized by discipline (rotating, fixed, electrical, instrument)

Equipment class (for example, pump) — indicates the class of equipment on which the work was performed. Failure codes can depend on this value

Equipment type (for example, centrifugal) — indicates the type of equipment on which the work was performed. Failure codes can depend on this value also
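
As a rough illustration of how the identification fields support roll-ups, the sketch below sums event costs at every level of a functional location hierarchy. The field names, the hierarchy labels ("Plant A," "Unit 01"), the equipment IDs, and the cost figures are all hypothetical and are not taken from any particular CMMS.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventIdentification:
    """Identification fields for one event (hypothetical field names)."""
    event_id: str
    functional_location: str   # "smart" ID, e.g. "01-G-0001"
    hierarchy: tuple           # Level 1 .. Level n, listed top down
    equipment_id: str
    equipment_category: str    # e.g. "rotating"
    equipment_class: str       # e.g. "pump"
    equipment_type: str        # e.g. "centrifugal"


def roll_up(events_with_cost):
    """Sum cost at every hierarchy level that each event belongs to."""
    totals = defaultdict(float)
    for ident, cost in events_with_cost:
        for depth in range(1, len(ident.hierarchy) + 1):
            totals[ident.hierarchy[:depth]] += cost
    return dict(totals)


# Illustrative data only.
events = [
    (EventIdentification("E-1", "01-G-0001",
                         ("Plant A", "Unit 01", "Feed system"),
                         "EQ-5531", "rotating", "pump", "centrifugal"), 1200.0),
    (EventIdentification("E-2", "01-G-0002",
                         ("Plant A", "Unit 01", "Feed system"),
                         "EQ-7710", "rotating", "pump", "centrifugal"), 800.0),
    (EventIdentification("E-3", "02-E-0001",
                         ("Plant A", "Unit 02", "Cooling system"),
                         "EQ-1042", "fixed", "heat exchanger", "shell and tube"), 500.0),
]

totals = roll_up(events)
for level, cost in sorted(totals.items()):
    print(" / ".join(level), cost)
```

Keying the totals by the hierarchy path means the same event contributes to its system, its unit, and its plant, which is what lets one data set feed metrics at several levels.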

History

Functional loss — indicates whether the equipment experienced a functional loss as part of this event. A functional loss can be defined as:

Complete loss of function

Partial loss of function

Potential loss of function

Functional failure (ISO failure mode) — basically the symptoms of a failure if one has occurred. Any physical asset is installed to fulfill a number of functions. The functional failure describes which function the asset no longer is able to fulfill

Effect — the effect of the event on production, safety, the environment, or quality

Maintainable item — the actual component identified as causing the asset to lose its ability to operate (for example, bearing)

Condition — indicates the type of damage found to the maintainable item. In some cases this also indicates a failure mechanism

Cause — the general cause of the condition. This is not the root cause; it is recommended to use root cause failure analysis (RCFA) to assess root causes

Maintenance action — corrective action performed to mitigate the damaged item

Narrative — text description of work and suggestions for improvements

Dates

Event date — the date that the event was first observed and documented

Mechanically unavailable date/time — the date/time that the equipment is actually taken out of service either due to a failure or for the repair work

Mechanically available date/time — the date/time that the equipment is available for service after the repair work is completed

Mechanical downtime — difference between mechanically unavailable date and mechanically available date expressed in hours

Maintenance start date/time — the date/time that the equipment is actually being worked on by maintenance

Maintenance end date/time — the date/time that the equipment repair is completed

Time to repair — the total maintenance time to repair the equipment
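
The two durations in this list can be derived from the four recorded timestamps. The sketch below is one possible calculation; it assumes that time to repair is measured from maintenance start to maintenance end, which the list implies but does not state outright. The function name and the sample timestamps are illustrative.

```python
from datetime import datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed time from start to end, expressed in hours."""
    if end < start:
        raise ValueError("end date/time precedes start date/time")
    return (end - start).total_seconds() / 3600.0


# Illustrative timestamps only.
mechanically_unavailable = datetime(2024, 3, 4, 6, 0)
maintenance_start = datetime(2024, 3, 4, 9, 30)
maintenance_end = datetime(2024, 3, 4, 15, 30)
mechanically_available = datetime(2024, 3, 4, 18, 0)

# Mechanical downtime: unavailable -> available, in hours.
mechanical_downtime = hours_between(mechanically_unavailable, mechanically_available)

# Time to repair (assumed here to be maintenance start -> maintenance end).
time_to_repair = hours_between(maintenance_start, maintenance_end)

print(mechanical_downtime, time_to_repair)  # 12.0 6.0
```

Rejecting an end time that precedes its start time is a simple way to catch mis-keyed dates before they distort downtime figures.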

Consequence

Maintenance cost — the total maintenance expenditure to rectify the failure. This could be company or contractor cost. This cost could be organized into categories, such as material, labor, and contractor.

Production cost — the amount of business loss associated with not having the assets in service. This cost includes lost opportunity, such as when an asset fails to perform its intended function and there is no spare asset or capability to make up the loss.
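
A minimal sketch of how the consequence fields might be stored and totaled follows. The class name, field names, and figures are hypothetical, and adding maintenance cost to production cost to get a single event total is an assumption made for illustration rather than a method prescribed here.

```python
from dataclasses import dataclass


@dataclass
class EventConsequence:
    """Cost consequences of one event (hypothetical field names)."""
    labor: float = 0.0
    material: float = 0.0
    contractor: float = 0.0
    production_cost: float = 0.0   # business loss while the asset is out of service

    @property
    def maintenance_cost(self) -> float:
        # Maintenance cost organized by category, then summed.
        return self.labor + self.material + self.contractor

    @property
    def total_cost(self) -> float:
        # Assumption: total consequence = maintenance cost + production cost.
        return self.maintenance_cost + self.production_cost


# Illustrative figures only.
consequence = EventConsequence(labor=800.0, material=1500.0,
                               contractor=0.0, production_cost=12000.0)
print(consequence.maintenance_cost, consequence.total_cost)  # 2300.0 14300.0
```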
