The impact of imperfect manual testing on safety systems

Safety instrumented system standards (IEC 61508 & 61511) are performance based. Essentially, the greater the level of process risk, the better the systems needed to control it. A variety of techniques may be used to determine the level of performance (referred to as Safety Integrity Level, or SIL) needed of safety instrumented functions.



A variety of techniques may also be used to analyze system performance to see if the hardware meets the SIL targets that have been assigned. These modeling techniques, often referred to as SIL verification, account for many different factors such as failure rates, failure modes, quantities, levels of automatic diagnostic coverage, manual test intervals and more. Assumptions made during modeling can have a dramatic impact on the overall answers. For example, assuming 95% manual test coverage (rather than the optimistic 100% often used) results in an average 40% reduction in risk reduction performance; assuming 90% results in an average 57% reduction.

Manual test coverage has a significant impact on the results and should be accounted for in system modeling. The results also indicate that manual testing needs to be as realistic and thorough as possible.

Failure modes

Safety instrumented system (SIS) hardware is normally dormant (e.g., an isolation valve remains open for long periods of time). Such hardware may fail in two ways: safe and dangerous. Safe failures result in nuisance trips and lost production (e.g., the valve slams shut because the normally-energized solenoid coil burned out; there was no process hazard). Dangerous failures result in the system not being able to perform its safety function (e.g., the solenoid de-energizes, but the valve is stuck open and does not close on demand).

Automatic diagnostics, manual testing

Some hardware has built-in automatic diagnostics to detect dangerous failures. For example, PLCs can detect a variety of internal problems such as being stuck in an endless loop. However, automatic diagnostics can never be 100% effective. Some devices, such as non-intelligent sensors and valves, have no automatic diagnostics at all; a solenoid-operated valve, for instance, cannot tell by itself whether it is stuck open.

All safety devices must be manually tested in order to detect potentially dangerous failures. SIL verification calculations are based on a manual test interval. The more often devices are tested, the quicker potentially dangerous failures can be detected and repaired, which results in better safety performance. The performance requirements for a safety instrumented function, as listed in the ANSI/ISA 84 standard, are shown in Table 1.

Table 1. Performance requirements for safety instrumented functions

Safety Integrity    Probability of Failure     Risk Reduction Factor
Level (SIL)         on Demand (PFD)            (RRF = 1/PFD)

4                   ≥ 0.00001 to < 0.0001      > 10,000 to ≤ 100,000
3                   ≥ 0.0001 to < 0.001        > 1,000 to ≤ 10,000
2                   ≥ 0.001 to < 0.01          > 100 to ≤ 1,000
1                   ≥ 0.01 to < 0.1            > 10 to ≤ 100
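The bands in Table 1 can be expressed as a small lookup. The following is a minimal Python sketch; the function names `sil_from_pfd` and `rrf_from_pfd` are hypothetical and chosen only for illustration.

```python
def sil_from_pfd(pfd):
    """Return the SIL band (1 to 4) for an average PFD, or None if outside Table 1."""
    # Each band is (SIL, lower bound inclusive, upper bound exclusive).
    bands = [
        (4, 1e-5, 1e-4),
        (3, 1e-4, 1e-3),
        (2, 1e-3, 1e-2),
        (1, 1e-2, 1e-1),
    ]
    for sil, low, high in bands:
        if low <= pfd < high:
            return sil
    return None


def rrf_from_pfd(pfd):
    """Risk reduction factor is the reciprocal of the PFD."""
    return 1.0 / pfd


if __name__ == "__main__":
    for pfd in (0.05, 0.005, 0.0005, 0.00005):
        print(f"PFD {pfd}: SIL {sil_from_pfd(pfd)}, RRF about {rrf_from_pfd(pfd):.0f}")
```

A PFD outside all four bands returns `None` rather than a SIL, since Table 1 does not classify it.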

Assuming no automatic diagnostics, the probability of failure on demand (PFD) of a non-redundant device is calculated using the following formula:

$$\mathrm{PFD} = \lambda_{d} \cdot \frac{TI_{m}}{2}$$

Where:

$\lambda_{d}$ is the dangerous failure rate

$TI_{m}$ is the manual test interval
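As a rough illustration of the single-device formula, the sketch below compares two manual test intervals. The failure rate of 0.02 per year is an assumed value for illustration only, not data from any real device, and `pfd_simple` is a hypothetical name. The failure rate and the test interval must use the same time unit.

```python
def pfd_simple(lambda_d, ti_manual):
    """Average PFD of a non-redundant device with no automatic diagnostics.

    lambda_d  : dangerous failure rate (failures per year)
    ti_manual : manual test interval (years)
    """
    return lambda_d * ti_manual / 2.0


if __name__ == "__main__":
    lambda_d = 0.02  # assumed dangerous failure rate, per year (illustrative only)
    for ti in (1.0, 2.0):
        pfd = pfd_simple(lambda_d, ti)
        print(f"test every {ti:g} yr: PFD = {pfd:.3f}, RRF = {1 / pfd:.0f}")
```

Doubling the test interval doubles the PFD and halves the risk reduction factor, which is why more frequent testing gives better safety performance.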

If a device has a level of automatic diagnostics, then dangerous failures are split into two categories: dangerous detected and dangerous undetected. Accounting for automatic diagnostics, the PFD calculation becomes:

$$\mathrm{PFD} = \left(\lambda_{dd} \cdot \frac{TI_{a}}{2}\right) + \left(\lambda_{du} \cdot \frac{TI_{m}}{2}\right)$$

Where:

$\lambda_{dd}$ is the dangerous detected failure rate

$TI_{a}$ is the automatic diagnostic test interval

$\lambda_{du}$ is the dangerous undetected failure rate

$TI_{m}$ is the manual test interval

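In almost every case, the PFD contribution from dangerous detected failures (the automatic diagnostics term) is insignificant compared to the contribution from dangerous undetected failures (the manual testing term), usually by two orders of magnitude, and can therefore be ignored.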
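To see why the detected term can usually be dropped, the two contributions can be computed separately. This is a sketch under assumed numbers: a total dangerous failure rate of 0.02 per year, 90% automatic diagnostic coverage, diagnostics that run every 8 hours, and yearly manual testing. None of these values come from the article, and `pfd_terms` is a hypothetical name.

```python
HOURS_PER_YEAR = 8760.0


def pfd_terms(lambda_dd, ti_auto, lambda_du, ti_manual):
    """Return the (detected, undetected) contributions to the average PFD.

    Rates are per year and intervals are in years.
    """
    detected = lambda_dd * ti_auto / 2.0
    undetected = lambda_du * ti_manual / 2.0
    return detected, undetected


if __name__ == "__main__":
    # Assumed illustrative values, not data from the article.
    lambda_dd = 0.018               # dangerous detected rate, per year
    lambda_du = 0.002               # dangerous undetected rate, per year
    ti_auto = 8.0 / HOURS_PER_YEAR  # diagnostics run every 8 hours
    ti_manual = 1.0                 # manual test once per year

    detected, undetected = pfd_terms(lambda_dd, ti_auto, lambda_du, ti_manual)
    print(f"detected term:   {detected:.2e}")
    print(f"undetected term: {undetected:.2e}")
    print(f"ratio:           {undetected / detected:.0f}")
```

With these assumptions the manual testing term is a little over 100 times larger than the diagnostics term, even though 90% of the dangerous failures are detected automatically.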

What is 'manual test coverage'?

Assuming that manual testing is 100% effective is unrealistic. For example, full or partial stroking of a valve does not determine whether the valve will seat properly or whether it might leak. Partial stroking will not determine whether the seat is eroded or whether there is a welding rod stuck in the valve.

Testing the electronics of a sensor does not determine whether the sensing element itself is responding properly. Removing a sensor and testing it in a laboratory or maintenance shop does not determine whether the sensor will respond properly in the actual process. (Stories have been told at conferences of cases where it did not.) Testing a level float switch by moving the float with a rod will not ensure that the float will actually float. So in reality, manual testing is not 100% effective in detecting all possible failures; the fraction of failures it does detect is referred to by some as manual test coverage. This can be accounted for in the calculation:

$$\mathrm{PFD} = \left(\lambda_{dd} \cdot \frac{TI_{a}}{2}\right) + \left(\lambda_{du} \cdot \frac{TI_{m}}{2}\right) + \left(\lambda_{dn} \cdot \frac{\mathit{Life}}{2}\right)$$

Where:

$\lambda_{dd}$ is the dangerous detected failure rate

$TI_{a}$ is the automatic diagnostic test interval

$\lambda_{du}$ is the dangerous undetected failure rate

$TI_{m}$ is the manual test interval

$\lambda_{dn}$ is the dangerous never-detected failure rate

$\mathit{Life}$ is the proposed life of the hardware.

In other words, some dangerous failures will remain in the system for the life of the system.
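The three-term calculation can be sketched as follows. The numbers are assumptions for illustration: a device with no automatic diagnostics, a total dangerous failure rate of 0.02 per year, and 95% manual test coverage, so that 5% of the dangerous failures are treated as never detected. Splitting the rate by the coverage percentage in this way is an assumption of this sketch, and `pfd_imperfect` is a hypothetical name.

```python
def pfd_imperfect(lambda_dd, ti_auto, lambda_du, ti_manual, lambda_dn, life):
    """Average PFD including failures that manual testing never reveals.

    Rates are per year; intervals and life are in years.
    """
    detected = lambda_dd * ti_auto / 2.0
    undetected = lambda_du * ti_manual / 2.0
    never_detected = lambda_dn * life / 2.0
    return detected + undetected + never_detected


if __name__ == "__main__":
    lambda_d = 0.02   # assumed total dangerous failure rate, per year
    coverage = 0.95   # assumed manual test coverage

    pfd = pfd_imperfect(
        lambda_dd=0.0, ti_auto=0.0,             # no automatic diagnostics
        lambda_du=coverage * lambda_d,          # found at the next manual test
        ti_manual=1.0,
        lambda_dn=(1.0 - coverage) * lambda_d,  # remain for the hardware life
        life=15.0,
    )
    print(f"PFD = {pfd:.4f}")
```

In this example the never-detected failures are only 5% of the failure rate but contribute nearly half of the PFD, because they persist for 15 years rather than one.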

Imperfect manual testing

The impact of imperfect manual testing is significant. Calculations may be done by hand or by using a number of commercially available programs that do account for imperfect manual testing. The impact is the same regardless of the hardware or configuration.

In other words, calculations for single devices (switches or valves), triplicated transmitters with 99% automatic diagnostics, dual valves, valves with partial stroke testing, all reveal the same results.

If the manual test coverage drops to 95%, the risk reduction is reduced by an average of 40%. If the manual test coverage drops to 90%, the risk reduction is reduced by an average of 57%. These results assume a 15 year life and yearly manual testing. Assumptions that can vary the final answer by a factor of two are significant enough to warrant paying attention to.
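The size of this effect can be checked with a simple single-device model. The sketch below assumes no automatic diagnostics, a 15-year life and yearly manual testing, and it treats the uncovered fraction of dangerous failures as never detected. The function name `rrf_retained` is hypothetical. Because the failure rate cancels out of the ratio, no failure rate is needed.

```python
def rrf_retained(coverage, ti_manual=1.0, life=15.0):
    """Fraction of the perfect-testing RRF that remains at a given coverage."""
    perfect = ti_manual / 2.0
    actual = coverage * ti_manual / 2.0 + (1.0 - coverage) * life / 2.0
    return perfect / actual


if __name__ == "__main__":
    for coverage in (1.0, 0.95, 0.90):
        reduction = 1.0 - rrf_retained(coverage)
        print(f"coverage {coverage:.0%}: RRF reduced by {reduction:.0%}")
    # coverage 100%: RRF reduced by 0%
    # coverage 95%: RRF reduced by 41%
    # coverage 90%: RRF reduced by 58%
```

This simple model gives roughly 41% and 58%, close to the averaged 40% and 57% figures quoted above for a range of configurations.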

Each SIL performance band spans an entire order of magnitude (as shown in Table 1). A system with perfect manual testing and a risk reduction factor of 400 will be reduced to 170, assuming 90% manual test coverage.

Both numbers are in the SIL 2 range. However, a system with an initial risk reduction of 200 will be reduced to 86 assuming 90% manual test coverage. This is enough to slip from an assumed SIL 2 level of performance, down to an actual SIL 1 level of performance.
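The slip from one SIL band to another can be illustrated by derating a risk reduction factor and looking up its band. This sketch uses the same single-device assumptions as the article's figures (15-year life, yearly manual testing) and hypothetical function names, so its results differ slightly from the averaged values quoted above.

```python
def derated_rrf(rrf_perfect, coverage, ti_manual=1.0, life=15.0):
    """RRF after accounting for imperfect manual test coverage."""
    return rrf_perfect * ti_manual / (coverage * ti_manual + (1.0 - coverage) * life)


def sil_from_rrf(rrf):
    """Return the SIL band for an RRF per Table 1, or None if outside it."""
    # Each band is (SIL, lower bound exclusive, upper bound inclusive).
    bands = ((1, 10, 100), (2, 100, 1_000), (3, 1_000, 10_000), (4, 10_000, 100_000))
    for sil, low, high in bands:
        if low < rrf <= high:
            return sil
    return None


if __name__ == "__main__":
    for rrf in (400, 200):
        actual = derated_rrf(rrf, coverage=0.90)
        print(f"RRF {rrf} (SIL {sil_from_rrf(rrf)}) -> "
              f"about {actual:.0f} (SIL {sil_from_rrf(actual)})")
```

Under these assumptions an RRF of 400 falls to about 167 and stays in SIL 2, while an RRF of 200 falls to about 83 and drops to SIL 1.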

Thorough manual testing

The results summarized here show that manual testing needs to be as realistic and thorough as possible or else intended performance levels may not be met. The goal is to have the manual test coverage percentage as close to 100% as possible. While most will accept that manual test coverage is rarely 100% (just as automatic diagnostics can never be 100%, and no redundant system can have 0% common cause), determining an accurate assessment of the manual test coverage percentage is problematic.

Until detailed failure rate data is accumulated (which is certainly possible considering the databases that most users now have available), estimating the manual test coverage may remain a SWAG (Scientific Wild Ass Guess) for the time being.

Author Information

Paul Gruhn, PE, CFSE is the training manager at

The ISA 84 committee wrote a technical report in 2002 titled Guidance for Testing of Process Sector Safety Instrumented Functions (SIF) Implemented as or Within Safety Instrumented Systems (SIS) (ISA-TR84.00.03-2002). As the name would imply, this 222-page document describes methods for testing safety devices. The document is currently being re-written by the committee.
