How to avoid fault tree analysis (FTA) mistakes
FTA is a complex exercise that requires effort to fully understand the system and its associated subsystems and components
Fault tree analysis (FTA) is a popular reliability and safety technique whose roots go back to 1962, when Bell Laboratories performed a safety analysis of the Minuteman Missile Control System. FTA is a top-down approach that involves diagramming lower-level subsystems and components to understand how their failures affect the upper-level, or top, system.
It is similar to failure mode and effects analysis (FMEA), a bottom-up approach, in that both are used to analyze the risk of failure of a system. However, unlike FMEA, which generally considers one failure at a time, FTA can analyze combinations of multiple failures through Boolean logic. As a result, it requires a higher degree of expertise. This article presents common mistakes made while performing FTA and the steps to avoid or correct them.
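To make the Boolean computation concrete, here is a minimal Python sketch of a fault tree built from AND and OR gates. It is an illustration under stated assumptions, not a full FTA tool: basic events are assumed to be statistically independent, each basic event is assumed to appear only once in the tree, and the event names and probabilities are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # assumed failure probability (illustrative value)


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def probability(node: Union[Gate, BasicEvent]) -> float:
    """Evaluate a node, assuming independent, non-repeated basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must fail: multiply the probabilities.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input fails: 1 minus the chance that none fail.
        none_fail = 1.0
        for p in child_probs:
            none_fail *= 1.0 - p
        return 1.0 - none_fail
    raise ValueError(f"Unknown gate kind: {node.kind}")


# Hypothetical tree: the pump stops if the motor fails, OR if both the
# primary AND backup power supplies are lost.
power_loss = Gate("loss of power", "AND", [
    BasicEvent("primary supply lost", 0.10),
    BasicEvent("backup supply lost", 0.05),
])
pump_stops = Gate("pump stops", "OR", [
    BasicEvent("motor failure", 0.02),
    power_loss,
])

top_probability = probability(pump_stops)
print(f"P(pump stops) = {top_probability:.4f}")
```

The AND gate is what lets FTA express a failure that needs several things to go wrong at once, which a single-failure FMEA row does not capture directly. Trees with repeated or dependent events need a minimal cut set or similar treatment that this sketch does not attempt.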
Incorrect definitions of FTA
An accurate definition of the top event plays a crucial role in bringing the FTA to a meaningful conclusion. A fundamental mistake is failing to clearly define the system and its boundary. In real-world scenarios, a failure event can often affect more than one interconnected system. The top-level event in the FTA corresponds to the definition of the system and its boundary. If the system is not adequately defined, the top-level event will likely be too generic or will overlap more than one system.
For example, when a crude oil distillation unit in an oil refinery fails, the impact can extend beyond the boundary of the distillation unit and lead to the shutdown of the entire refinery. If the system is defined as the entire refinery, the resulting FTA will be mathematically complex and may not converge on the actual cause of failure. However, if the system is defined as just the distillation unit, the resulting FTA will be more focused and accurate and could converge on the actual cause, which, in this case, was a crude oil pump motor failure that led to the shutdown of the distillation unit.
The boundary of the system should be clearly defined to overcome this type of challenge. The choice of boundary should depend on which aspects of system performance are of interest. For example, it may be outside the scope of the study to review the impact of a failure event on the environment. In that case, any environmental impact from a fuel spill may be considered outside the boundary of the analysis. To provide clarity on the system boundary, the following information should be provided:
The physical and functional interface of the system with other external systems
Functional descriptions and failure modes of each component
The initial state of the components
The functions of humans and machines.
A lack of clarity of FTA
The resolution of the analysis is vital to understand because it determines the level at which the analysis should stop. A common mistake is leaving the resolution unclear and, as a result, continuing the FTA down to a level that holds no value for the organization as a business. For example, in electronic printed circuit board (PCB) manufacturing, the objective of FTA might be to break down the failure event to the very lowest-level component, which could be a small capacitor or switch on the PCB. That level of detail makes sense in this application because the manufacturing plant can rework faulty PCBs by soldering in new components.
On the other hand, in the case of an oil refinery fuel pump failure, the FTA will most likely stop at the pump motor without breaking the motor down further into components such as the stator and rotor windings. This is because it is more practical and economical for refinery operators to replace the motor with a new one than to troubleshoot and repair the motor, its control circuits and the associated interfaces.
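The two examples above can be sketched as a stopping rule applied while walking a component hierarchy. The Python below is a hypothetical illustration: the `replaced_as_unit` flag and the `analysis_leaves` function are names invented here, and the rule (stop at any item the organization replaces whole) is one possible way to encode resolution, not a standard FTA definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    name: str
    # Hypothetical flag: True when the organization replaces this item as a
    # whole instead of repairing its internals.
    replaced_as_unit: bool = False
    children: List["Component"] = field(default_factory=list)


def analysis_leaves(node: Component) -> List[str]:
    """Return the components at which the fault tree stops being developed."""
    if node.replaced_as_unit or not node.children:
        return [node.name]
    leaves: List[str] = []
    for child in node.children:
        leaves.extend(analysis_leaves(child))
    return leaves


# Refinery case: the motor is swapped out whole, so its windings are not analyzed.
fuel_pump = Component("fuel pump", children=[
    Component("pump motor", replaced_as_unit=True, children=[
        Component("stator winding"),
        Component("rotor winding"),
    ]),
])

# PCB manufacturing case: individual parts are reworked, so analysis goes deeper.
pcb = Component("PCB", children=[
    Component("capacitor"),
    Component("switch"),
])

refinery_leaves = analysis_leaves(fuel_pump)
pcb_leaves = analysis_leaves(pcb)
print(refinery_leaves)
print(pcb_leaves)
```

Recording the stopping rule explicitly, whatever form it takes, keeps the team from developing branches that the business will never act on.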
Misinterpretation of events
Since FTA is a top-down analysis that starts from the top-level event, there is a high chance of misinterpretation at the lowest level of component analysis, especially if data is missing. Data availability is a challenge for most organizations. As a result, FTA analysts often fail to clearly understand the cause-effect relationships leading to the top-level event. Accurate, clean data is the most important prerequisite for starting an FTA.
To overcome this challenge, the starting point of the FTA should be to understand the system through FMEA. Since FMEA is a bottom-up analysis, it provides good insight into the lower-level subsystems and components. To keep the FTA from stalling midway because of missing information, the following documents should be on hand before constructing the fault tree:
System and subsystem block diagram
Process flow diagram (PFD) and/or process and instrumentation diagram (P&ID)
Integrated parts catalogue or manuals
Maintenance and test procedures
Installation and commissioning diagrams/results.
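The document list above lends itself to a simple pre-work check. The following Python is a minimal sketch: the `REQUIRED_DOCUMENTS` entries paraphrase the list, and the `missing_documents` helper and the sample input are hypothetical, so a real project would adapt the names to its own document register.

```python
from typing import Iterable, List

# Paraphrased from the list above; exact labels are an assumption.
REQUIRED_DOCUMENTS = [
    "system and subsystem block diagram",
    "PFD and/or P&ID",
    "integrated parts catalogue or manuals",
    "maintenance and test procedures",
    "installation and commissioning diagrams/results",
]


def missing_documents(available: Iterable[str]) -> List[str]:
    """Return required documents not found in `available` (case-insensitive)."""
    available_set = {doc.strip().lower() for doc in available}
    return [doc for doc in REQUIRED_DOCUMENTS if doc.lower() not in available_set]


# Hypothetical project that has gathered only two of the five documents.
on_hand = [
    "System and subsystem block diagram",
    "Maintenance and test procedures",
]
gaps = missing_documents(on_hand)
for doc in gaps:
    print(f"Missing before FTA can start: {doc}")
```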
FTA is a complex exercise that requires extensive effort from the team to fully understand the system and its associated subsystems and components. There are many aspects to consider in FTA, each open to a range of human errors. This article presents only the elementary mistakes that can arise during the pre-work and early steps of performing FTA, along with possible steps to avoid them.