The impact of imperfect manual testing on safety systems

Safety instrumented system standards (IEC 61508 & 61511) are performance based. Essentially, the greater the level of process risk, the better the systems needed to control it. A variety of techniques may be used to determine the level of performance (referred to as Safety Integrity Level, or SIL) needed of safety instrumented functions.

By Paul Gruhn, PE, CFSE, ICS Triplex, July 15, 2008

A variety of techniques may also be used to analyze system performance to see if the hardware meets the SIL targets that have been assigned. These modeling techniques — often referred to as SIL verification — account for many different factors such as failure rates, failure modes, quantities, levels of automatic diagnostic coverage, manual test intervals and more. Assumptions made during modeling can have a dramatic impact on the overall answers. For example, assuming 95% manual test coverage (rather than the optimistic 100% often used) results in an average 40% reduction in performance; assuming 90% results in an average 57% reduction.

Manual test coverage has a significant impact that should be accounted for in system modeling. The results also indicate that manual testing needs to be as realistic and thorough as possible.

Failure modes

Safety instrumented system (SIS) hardware is normally dormant (e.g., an isolation valve remains open for long periods of time). Such hardware may fail in two ways: safe and dangerous. Safe failures result in nuisance trips and lost production (e.g., the valve slams shut because the normally-energized solenoid coil burned out; there was no process hazard). Dangerous failures result in the system not being able to perform its safety function (e.g., the solenoid de-energizes, but the valve is stuck open and does not close on demand).

Automatic diagnostics, manual testing

Some hardware has built-in automatic diagnostics to detect dangerous failures. For example, PLCs can detect a variety of internal problems, such as being stuck in an endless loop. However, automatic diagnostics can never be 100% effective, and some devices, such as non-intelligent sensors and valves, have no automatic diagnostics at all. A solenoid-operated valve, for instance, cannot tell by itself whether it is stuck open.

All safety devices must be manually tested in order to detect potentially dangerous failures. SIL verification calculations are based on a manual test interval. The more often devices are tested, the quicker potentially dangerous failures can be detected and repaired, which results in better safety performance. The performance requirements for a safety instrumented function, as listed in the ANSI/ISA 84 standard, are shown in Table 1.

Safety Integrity Level (SIL) | Probability of Failure on Demand (PFD) | Risk Reduction Factor (RRF = 1/PFD)
4 | ≥ 0.00001 to < 0.0001 | > 10,000 to ≤ 100,000
3 | ≥ 0.0001 to < 0.001 | > 1,000 to ≤ 10,000
2 | ≥ 0.001 to < 0.01 | > 100 to ≤ 1,000
1 | ≥ 0.01 to < 0.1 | > 10 to ≤ 100
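
As a quick illustration of Table 1 (my sketch, not from the article), the following Python snippet maps a computed PFD to its SIL band and risk reduction factor. The band boundaries are taken directly from the table; the sample PFD values are arbitrary.

```python
# Map a probability of failure on demand (PFD) to its SIL band per Table 1.
def sil_for_pfd(pfd: float) -> int | None:
    """Return the SIL (1-4) whose PFD band contains pfd, or None if out of range."""
    bands = {4: (1e-5, 1e-4), 3: (1e-4, 1e-3), 2: (1e-3, 1e-2), 1: (1e-2, 1e-1)}
    for sil, (low, high) in bands.items():
        if low <= pfd < high:
            return sil
    return None

for pfd in (0.0025, 0.005, 0.00005):
    print(f"PFD = {pfd:g} -> RRF = {1 / pfd:,.0f}, SIL {sil_for_pfd(pfd)}")
```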

Assuming no automatic diagnostics, the probability of failure on demand (PFD) of a non-redundant device is calculated using the following formula:

PFD = λ_d × (TI_m / 2)

Where:
λ_d is the dangerous failure rate
TI_m is the manual test interval
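
A minimal numeric sketch of this formula follows. The failure rate is an assumed illustrative value, not a figure from the article; yearly manual testing is assumed.

```python
# PFD of a single device with no automatic diagnostics: PFD = λ_d × (TI_m / 2).
lambda_d = 2e-6      # dangerous failure rate, failures per hour (assumed value)
ti_manual = 8760.0   # manual test interval, hours (yearly testing)

pfd = lambda_d * (ti_manual / 2)
print(f"PFD = {pfd:.3g}, RRF = {1 / pfd:,.0f}")  # PFD ≈ 8.76e-3, RRF ≈ 114 (SIL 2 band)
```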

If a device has a level of automatic diagnostics, then dangerous failures are split into two categories: dangerous detected and dangerous undetected. Accounting for automatic diagnostics, the PFD calculation becomes:

PFD = (λ_dd × (TI_a / 2)) + (λ_du × (TI_m / 2))

Where:
λ_dd is the dangerous detected failure rate
TI_a is the automatic diagnostic test interval
λ_du is the dangerous undetected failure rate
TI_m is the manual test interval

In almost every case, the PFD due to automatic diagnostics is insignificant compared to the PFD due to manual testing (usually by two orders of magnitude) and can therefore be ignored.
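
To see why the detected term can be ignored, consider this sketch (all values are assumed for illustration): with diagnostics running every second and manual tests once a year, the first term is dwarfed by the second.

```python
# Two-term PFD: PFD = (λ_dd × TI_a / 2) + (λ_du × TI_m / 2).
lambda_d = 2e-6                    # total dangerous failure rate, per hour (assumed)
diag_coverage = 0.90               # fraction caught by automatic diagnostics (assumed)
lambda_dd = diag_coverage * lambda_d          # dangerous detected rate
lambda_du = (1 - diag_coverage) * lambda_d    # dangerous undetected rate

ti_auto = 1.0 / 3600.0             # automatic diagnostic interval: 1 second, in hours
ti_manual = 8760.0                 # manual test interval: 1 year, in hours

term_detected = lambda_dd * (ti_auto / 2)
term_undetected = lambda_du * (ti_manual / 2)
print(f"detected term = {term_detected:.2g}, undetected term = {term_undetected:.2g}")
print(f"PFD ≈ {term_detected + term_undetected:.3g}  (detected term is negligible)")
```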

What is ‘manual test coverage’?

Assuming that manual testing is 100% effective is unrealistic. For example, full or partial stroking of a valve does not determine whether the valve will seat properly or whether it might leak. Partial stroking will not determine whether the seat is eroded or whether there is a welding rod stuck in the valve.

Testing the electronics of a sensor does not determine whether the sensing element itself is responding properly. Removing a sensor and testing it in a laboratory or maintenance shop does not determine whether the sensor will respond properly in the actual process. (Stories have been told at conferences of cases where it did not.) Testing a level float switch by moving the float with a rod will not ensure that the float will actually float. So in reality, manual testing is not 100% effective in detecting all possible failures; the fraction of failures it does detect is referred to by some as the manual test coverage. This can be accounted for in these calculations:

PFD = (λ_dd × (TI_a / 2)) + (λ_du × (TI_m / 2)) + (λ_dn × (Life / 2))

Where:
λ_dd is the dangerous detected failure rate
TI_a is the automatic diagnostic test interval
λ_du is the dangerous undetected failure rate
TI_m is the manual test interval
λ_dn is the dangerous never-detected failure rate
Life is the proposed life of the hardware

In other words, some dangerous failures will remain in the system for its entire life.
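
The sketch below (illustrative values again) extends the calculation with the never-detected term for a simple device with no automatic diagnostics. Failures the proof test cannot reveal accumulate over the full hardware life.

```python
# Three-term PFD: PFD = (λ_dd × TI_a/2) + (λ_du × TI_m/2) + (λ_dn × Life/2).
# All numeric values are illustrative assumptions, not figures from the article.
lambda_d = 2e-6        # total dangerous failure rate, per hour (assumed)
test_coverage = 0.90   # fraction of failures the manual test detects (assumed)

lambda_dd = 0.0                              # simple device: no automatic diagnostics
lambda_du = test_coverage * lambda_d         # detected by manual testing
lambda_dn = (1 - test_coverage) * lambda_d   # never detected in service

ti_auto = 1.0 / 3600.0   # automatic diagnostic interval, hours (moot here: λ_dd = 0)
ti_manual = 8760.0       # manual test interval: 1 year, hours
life = 15 * 8760.0       # proposed hardware life: 15 years, hours

pfd = (lambda_dd * ti_auto / 2) + (lambda_du * ti_manual / 2) + (lambda_dn * life / 2)
print(f"PFD = {pfd:.3g}, RRF = {1 / pfd:,.0f}")
# With perfect testing the same device gives PFD ≈ 8.76e-3 (RRF ≈ 114);
# 90% coverage cuts the achieved risk reduction by more than half.
```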

Imperfect manual testing

The impact of imperfect manual testing is significant. Calculations may be done by hand or with any of a number of commercially available programs that account for imperfect manual testing. The impact is the same regardless of the hardware or configuration.

In other words, calculations for single devices (switches or valves), triplicated transmitters with 99% automatic diagnostics, dual valves, and valves with partial stroke testing all reveal the same results.

If the manual test coverage drops to 95%, the risk reduction is reduced by an average of 40%. If the manual test coverage drops to 90%, the risk reduction is reduced by an average of 57%. These results assume a 15-year life and yearly manual testing. Assumptions that can change the final answer by a factor of two are significant enough to deserve attention.
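
For a single device with yearly testing and a 15-year life, this degradation can be checked by hand, since the failure rate cancels out of the ratio: RRF_imperfect / RRF_perfect = TI_m / (c × TI_m + (1 − c) × Life), where c is the manual test coverage. A short sketch of the single-device case follows; the article's 40% and 57% figures are averages across several configurations, and this simple case lands close to them.

```python
# Fractional RRF remaining vs. manual test coverage for a single device,
# yearly testing (TI_m = 1 year) and a 15-year life, per the ratio above.
ti_manual = 1.0   # years
life = 15.0       # years

for c in (1.00, 0.95, 0.90):
    remaining = ti_manual / (c * ti_manual + (1 - c) * life)
    print(f"coverage {c:.0%}: RRF drops to {remaining:.0%} ({1 - remaining:.0%} reduction)")
```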

SIL performance targets vary by an entire order of magnitude (as shown in Table 1). A system with perfect manual testing and a risk reduction factor of 400 will be reduced to 170, assuming 90% manual test coverage.

Both numbers are in the SIL 2 range. However, a system with an initial risk reduction of 200 will be reduced to 86, assuming 90% manual test coverage. This is enough to slip from an assumed SIL 2 level of performance down to an actual SIL 1 level of performance.

Thorough manual testing

The results summarized here show that manual testing needs to be as realistic and thorough as possible, or else intended performance levels may not be met. The goal is to have the manual test coverage percentage as close to 100% as possible. While most will accept that manual test coverage is rarely 100% (just as automatic diagnostics can never be 100%, and no redundant system can have 0% common cause), arriving at an accurate estimate of the manual test coverage percentage is problematic.

Until detailed failure rate data is accumulated — which is certainly possible considering the databases that most users now have available — estimating the manual test coverage may remain a SWAG (Scientific Wild Ass Guess) for the time being.

Author Information
Paul Gruhn, PE, CFSE, is the training manager at ICS Triplex.

The ISA 84 committee wrote a technical report in 2002 titled Guidance for Testing of Process Sector Safety Instrumented Functions (SIF) Implemented as or Within Safety Instrumented Systems (SIS) (ISA-TR84.00.03-2002). As the name implies, this 222-page document describes methods for testing safety devices. The document is currently being rewritten by the committee.