IEC 61508 and Functional Safety
System Selection
Issue 2.1
4 April 2006
Author
Dil Wetherill
Measurement Technology Ltd.
Power Court, Luton, Bedfordshire
ENGLAND LU1 3JJ
COPYRIGHT © 2005 by Measurement Technology, Ltd. All rights reserved.
No part of this publication may be copied or distributed, transmitted, transcribed, stored in a retrieval
system or translated into any human or computer language in any form or by any means, electronic,
mechanical, magnetic, manual or otherwise, or disclosed to third parties without the express written
permission of:
Measurement Technology, Ltd., Power Court, Luton, Bedfordshire, England LU1 3JJ
Functional Safety System Selection
___________________________________________________________________________________
Table of Contents
1 INTRODUCTION
2 OVERVIEW OF FUNCTIONAL SAFETY
  2.1 Safety Integrity Level
  2.2 Low and High Demand Modes
  2.3 Constraints on Safety Integrity Level
    2.3.1 Design Processes
    2.3.2 Proportion of Failures that are Safe
    2.3.3 Design Techniques and Measures
    2.3.4 Tolerance of Hardware Faults
    2.3.5 Safety Integrity Level and Architecture
    2.3.6 Probabilities of Failure
  2.4 Process Safety Time
  2.5 Summary of Safety-related System Selection
    2.5.1 System Architecture
    2.5.2 Probability of Dangerous Failure
    2.5.3 Speed of Response
  2.6 Management Requirements
  2.7 Certified Products
  2.8 IEC 61508 and ANSI/ISA S84.01
3 APPLICATION EXAMPLE
  3.1 Low Demand Application - Emergency Shutdown System
    3.1.1 Description of application
    3.1.2 MOST SafetyNet System
    3.1.3 Required Input and Output types
    3.1.4 Configuration and Programming
    3.1.5 Probability of Dangerous Failure
    3.1.6 Response Time
APPENDIX A – GLOSSARY OF TERMS AND ABBREVIATIONS
  Terms and Abbreviations for IEC61508
List of Figures
Figure 1 The Relationship between EUC Risk, Tolerable Risk and Residual Risk
Figure 2 Probability of Failure on Demand with Proof Testing
Figure 3 Determining if a Safety-Related System is suitable for the application
Figure 4 Typical Emergency Shutdown Application
Figure 5 Typical Low Demand Application
Figure 6 Typical ESD System Response Times
List of Tables
Table 1 Safety Integrity Level with Architecture for Type A Subsystems
Table 2 Safety Integrity Level with Architecture for Type B Subsystems
Table 3 PFH and PFD for High and Low Demand Applications
______________________________________________________________________________________
© MTL Open Systems Technology Ltd.
Issue: 1.0
04 April 2006
Page 3 of 24
1 Introduction
This paper provides an introduction to IEC 61508 and describes an illustrative application
example using the MOST SafetyNet System.
2 Overview of Functional Safety
Machinery, process plant and equipment may malfunction in such ways that people are put at
risk of harm. The malfunctions may arise through physical faults (such as random
hardware failures), through systematic faults (such as errors made in software) or from
common cause failures (such as temperature extremes affecting a number of pieces of
equipment).
IEC 61508 provides a framework for:
• Assessing the level of risk initially presented by the machinery, process plant and
equipment and establishing if this risk is acceptable.
• Implementing a safety function that will provide a level of protection such that the risk is
reduced to an acceptable level - if the initial level of risk is found to be too high.
• Providing a means by which the equipment selected to implement the safety function
can be shown to provide the required protection.
The machinery, process plant and equipment is referred to as the Equipment Under Control
or EUC. The system which is used to monitor inputs from the EUC (and its Operators) and
which then generates outputs, causing the EUC to operate in the desired manner, is called
the EUC Control System.
The risk presented by the EUC and its Control System (the EUC risk) is the starting point
from which risk reduction begins. Risk reduction should initially focus on the EUC and its
Control System – perhaps by re-designing the machinery, process plant or equipment.
Eliminating or reducing the EUC risk itself is preferable to using protection techniques to
reduce that risk.
IEC 61508 concentrates on protection using electrical, electronic or programmable
electronic systems. These are referred to as E/E/PE systems. Since they are used to
reduce the equipment under control (or EUC) risk they are said to be “safety-related”.
Other methods for providing protection can be used – either “alternative technologies” (for
example hydraulic systems which are alternatives to E/E/PE systems) or “external”
protection (such as bunds, firewalls or drainage systems). Neither alternative technologies
nor external protection are specifically covered by IEC 61508, but their use is recognised as
an integral part of reducing the EUC risk to a tolerable level. The combination of E/E/PE,
alternative technology and external protection employed to reduce the EUC risk is
described as “Functional Safety” – in the sense that the correct operation (or function) of the
protective systems provides the required reduction in risk (i.e. the required level of safety).
An E/E/PE system will normally be made up of one or more input devices such as switches
or transmitters (sensors), a programmable logic solver of some form (a logic system) and
one or more output devices such as pumps or valves (final elements). In practice, initiation
of the safety function is by the E/E/PE system setting its outputs to a safe state for the
application in question.
This gives the background to the title of IEC 61508: “Functional safety of
electrical/electronic/programmable electronic safety-related systems”.
Figure 1 shows the relationship between EUC risk, tolerable risk, residual risk and the
necessary and actual risk reduction through the functional safety provided by E/E/PE,
alternative technology and external protection systems.
[Figure 1: diagram on an axis of increasing risk, marking residual risk, tolerable risk and
EUC risk, with spans labelled "Necessary risk reduction" and "Actual risk reduction".]
Figure 1 The Relationship between EUC Risk, Tolerable Risk and Residual Risk
2.1 Safety Integrity Level
The level of risk reduction required varies according to the risk that is to be reduced and the
tolerable risk that must be achieved. The techniques described in IEC 61508 lead to a
formal determination of the reduction required for each risk under consideration. Once the
level of required risk reduction is found, it is normally expressed as a safety function within a
particular band of Safety Integrity Level. An appropriate safety-related system can then be
selected by choosing a system that falls in the appropriate band of Safety Integrity Level.
Products that provide the highest degrees of protection are designated SIL 4, with SIL 3,
SIL 2 and SIL 1 providing progressively lower degrees of protection. The majority of safety
systems are designated SIL 3 or SIL 2; SIL 4 is rarely used.
In addition to assessing the likelihood of hardware failure, Safety Integrity Level is assessed
against the rigour of the design processes used to prevent systematic failures (as might
occur in software) and the hardware architecture used to provide the safety function. It is
not sufficient for the probability of hardware failure alone to be compatible with a particular
Safety Integrity Level; it is also necessary for the manufacturer to satisfy the design
process requirements, hardware fault tolerance and safe failure fraction for the target
SIL.
2.2 Low and High Demand Modes
IEC 61508 defines two fundamental modes of operation for the safety-related system – Low
Demand and High or Continuous Demand. Which of these is required depends on the
frequency with which the system might be required to perform its safety function in the given
application.
High demand is defined as being more than one demand per year; low demand is defined
as one demand per year or less.
A typical application requiring a safety-related system operating in a high demand mode
would be a guard for a machine press, where the guard prevents the operator from being at
risk from personal injury. The safety system could be expected to operate significantly more
than once per year. A safety-related system that operates in high demand mode is
therefore required.
A typical application requiring a safety-related system operating in low demand mode
would be a fire and gas system that would only be required to operate in the case of a fire or
gas leak. This system would be expected to operate less than once per year and a safety-related system operating in low demand mode would be appropriate. (Note: it could be
considered that a fire and gas system is a “mitigation” system – i.e. one whose objective is to
limit the damage caused by a failure, rather than preventing the failure – and therefore not
subject to IEC 61508. Here, it is considered as a protection system, which prevents a fire or
gas release from causing further harm.)
2.3 Constraints on Safety Integrity Level
A number of constraints are defined which limit the SIL that can be claimed for any
safety-related system. The constraints are:
• the design processes used by the manufacturers of the elements of the system
• the design techniques and measures used to limit the effects of failures during operation
• the tolerance of the system to hardware faults
• the proportion of faults that lead to safe failure modes
• the probability of the system failing to provide protection
Each of the above constraints is discussed in more detail in the following Sections.
2.3.1 Design Processes
In order to use a product as part of a safety-related system, the end-user or system
designer must establish that the manufacturer of the product has met the requirements of IEC
61508 in the processes used to manage the specification and design of the product. This is
to ensure that all relevant measures have been taken to avoid failures (i.e. to ensure that
failures are not inadvertently designed in to the product).
2.3.2 Proportion of Failures that are Safe
IEC 61508 defines a concept known as the safe failure fraction. This is a simple measure of
the proportion of hardware failures that are either safe, or dangerous but detected, compared
with the total number of possible failures (the total being made up of safe, dangerous
detected and dangerous undetected failures). Obviously, the proportion of undetected
dangerous failures is of critical importance in a safety-related system.
The level of safe failure fraction, together with hardware fault tolerance, limits the SIL that
can be claimed for a particular safety-related system.
For simplicity, bands of safe failure fraction are defined by the standard: <60%, 60% to
<90%, 90% to <99% and ≥ 99%.
IEC 61508 defines type A and type B subsystems, the difference between the two being the
level of confidence in the understanding of failure modes of components, the behaviour of
subsystems under fault conditions and the field data collected to provide practical
confirmation of the theoretical analysis. Type A subsystems are those for which there is a
higher level of confidence; for type B subsystems there is less confidence. A significant
difference is that more field failure data has been collected for type A subsystems.
2.3.3 Design Techniques and Measures
IEC 61508 specifies techniques and measures that should be used in the detailed design of
the product. Their purpose is to avoid failures such as software and manufacturing faults
and to control failures during operation.
Only by using the techniques and measures specified can manufacturers claim a particular
safe failure fraction and safety integrity level.
2.3.4 Tolerance of Hardware Faults
A safety-related system is said to have a hardware fault tolerance of N when N+1 faults
could cause the loss of the safety function. The level of hardware fault tolerance (0, 1
or 2) is one of the determining factors for the safety integrity level of a particular product.
Hardware fault tolerance not only determines the highest SIL that can be claimed for a
product, but also determines whether or not the speed with which the product carries out its
internal diagnostics needs to be considered in relation to the process safety time – see Section 2.4.
2.3.5 Safety Integrity Level and Architecture
The safety integrity levels that can be claimed for given safe failure fractions - given the
restrictions on design techniques and measures - and hardware fault tolerances for type
A and type B systems are shown in the tables below:
                         Hardware Fault Tolerance
Safe Failure Fraction    0          1          2
< 60%                    SIL 1      SIL 2      SIL 3
60% to < 90%             SIL 2      SIL 3      SIL 4
90% to < 99%             SIL 3      SIL 4      SIL 4
≥ 99%                    SIL 3      SIL 4      SIL 4
Table 1 Safety Integrity Level with Architecture for Type A Subsystems
                         Hardware Fault Tolerance
Safe Failure Fraction    0              1          2
< 60%                    Not allowed    SIL 1      SIL 2
60% to < 90%             SIL 1          SIL 2      SIL 3
90% to < 99%             SIL 2          SIL 3      SIL 4
≥ 99%                    SIL 3          SIL 4      SIL 4
Table 2 Safety Integrity Level with Architecture for Type B Subsystems
Note – if any subsystem of a particular safety function is type B, then the safety function
must be treated as if it were type B.
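The two tables can be read as a simple lookup. The following Python sketch is illustrative only: the names `MAX_SIL`, `sff_band` and `max_sil` are introduced here and are not part of the standard or of any product. It returns the highest SIL that Tables 1 and 2 allow for a given subsystem type, safe failure fraction and hardware fault tolerance.

```python
from typing import Optional

# Lower bounds (in percent) of the four safe failure fraction bands.
SFF_BAND_LOWER_BOUNDS = (0.0, 60.0, 90.0, 99.0)

# Maximum claimable SIL, indexed as [SFF band][hardware fault tolerance 0..2].
# None stands for "Not allowed". Values are copied from Tables 1 and 2.
MAX_SIL = {
    "A": ((1, 2, 3), (2, 3, 4), (3, 4, 4), (3, 4, 4)),
    "B": ((None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 4)),
}


def sff_band(sff_percent: float) -> int:
    """Return the index (0..3) of the band containing the safe failure fraction."""
    if not 0.0 <= sff_percent <= 100.0:
        raise ValueError("safe failure fraction must be between 0 and 100 percent")
    band = 0
    for index, lower in enumerate(SFF_BAND_LOWER_BOUNDS):
        if sff_percent >= lower:
            band = index
    return band


def max_sil(subsystem_type: str, sff_percent: float, hft: int) -> Optional[int]:
    """Highest SIL allowed by the architecture tables, or None if not allowed."""
    if subsystem_type not in MAX_SIL:
        raise ValueError("subsystem type must be 'A' or 'B'")
    if hft not in (0, 1, 2):
        raise ValueError("hardware fault tolerance must be 0, 1 or 2")
    return MAX_SIL[subsystem_type][sff_band(sff_percent)][hft]


print(max_sil("B", 95.0, 0))  # type B, SFF 90% to < 99%, no fault tolerance
```

Following the note above, a safety function containing any type B subsystem would be looked up with `subsystem_type="B"`.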
2.3.6 Probabilities of Failure
The probability that a safety-related system will fail to provide the required protection
can be expressed either as the probability of a dangerous failure per hour (PFH) or as
an average probability of failure of protection on demand (PFDavg). Which of these two
measures is used depends on the nature of the hazard. If it is continually (or very often)
present, as in high demand applications, then the probability of dangerous failure per
hour is the most useful figure to use. If the hazard is infrequently present, as in low demand
applications, then the probability of failure of protection on demand is most appropriate.
For safety functions that do not employ hardware fault tolerance, PFH is simply calculated
as the sum of the undetected dangerous failure rates for each element of the safety
function. Where hardware fault tolerance is used, the calculations are considerably more
complicated, and have not been considered here.
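Written as a formula, the simple case without hardware fault tolerance is shown below. The notation is introduced here for illustration: n is the number of elements in the safety function and \lambda_{DU,i} is the undetected dangerous failure rate per hour of element i.

```latex
\mathrm{PFH} = \sum_{i=1}^{n} \lambda_{DU,i}
```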
Probability of Failure on Demand
PFDavg is calculated from the failure rate, but is also dependent on the proof test
interval defined for the product. For simplicity, it is assumed that the failure rate
is constant – such that as time passes after the last proof test, the probability of an
undetected failure having occurred increases linearly. (The probability of failure actually
follows an exponential curve, but it can be taken as approximately linear for the early part
of the curve.) This probability of failure is effectively reset to zero by carrying out a proof
test (this assumes that the proof test is a “complete” test, which may be only approximately
true). The average probability of failure over the proof test interval can then be found. This
is shown in Figure 2.
[Figure 2: plot of probability of failure on demand against time, with the proof test
interval and PFDavg marked.]
Figure 2 Probability of Failure on Demand with Proof Testing
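Under the linear approximation described above, the average of a ramp that rises from zero to (failure rate × proof test interval) is half its final value. The Python sketch below compares that approximation with the average of the exponential curve. It assumes a single channel with no hardware fault tolerance and a complete proof test. The function names and the example failure rate are hypothetical and chosen only for illustration.

```python
import math


def pfd_avg_linear(lambda_du_per_hour: float, proof_test_interval_h: float) -> float:
    """Average of the linear ramp lambda*t over one proof test interval: lambda*T/2."""
    return lambda_du_per_hour * proof_test_interval_h / 2.0


def pfd_avg_exponential(lambda_du_per_hour: float, proof_test_interval_h: float) -> float:
    """Average of 1 - exp(-lambda*t) over one proof test interval."""
    x = lambda_du_per_hour * proof_test_interval_h
    if x == 0.0:
        return 0.0
    # (1/T) * integral of (1 - exp(-lambda*t)) dt  =  1 + (exp(-x) - 1) / x
    return 1.0 + math.expm1(-x) / x


# Hypothetical figures: 1e-6 dangerous undetected failures per hour,
# proof tested once a year (8760 hours).
linear = pfd_avg_linear(1e-6, 8760.0)
exact = pfd_avg_exponential(1e-6, 8760.0)
print(f"linear approximation: {linear:.4e}")
print(f"exponential average:  {exact:.4e}")
```

For values of (failure rate × interval) that are small compared with 1, the two results are close, which is why the linear form is commonly used for the early part of the curve.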
IEC 61508 defines the Safety Integrity Level required for both Continuous/High Demand
applications and Low Demand applications, according to the required PFH or PFDavg.
PFH or PFDavg by Safety Integrity Levels for high and low demand applications are shown
in Table 3 below.
Safety       Continuous/High-demand Mode of Operation    Low Demand Mode of Operation
Integrity    (prob. of dangerous failure per hour)       (prob. of Failure on Demand)
Level
4            ≥ 10⁻⁹ to < 10⁻⁸                            ≥ 10⁻⁵ to < 10⁻⁴
3            ≥ 10⁻⁸ to < 10⁻⁷                            ≥ 10⁻⁴ to < 10⁻³
2            ≥ 10⁻⁷ to < 10⁻⁶                            ≥ 10⁻³ to < 10⁻²
1            ≥ 10⁻⁶ to < 10⁻⁵                            ≥ 10⁻² to < 10⁻¹
Table 3 PFH and PFD for High and Low Demand Applications
Note: When “probability of dangerous failure per hour” and “probability of failure (to
protect) on demand” are given, these relate specifically to the probability that the
safety-related system will fail to provide the necessary protection i.e. fail in a dangerous manner.
These figures give no indication as to the likely level of overall failure (i.e. the availability of
the system).
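Table 3 can also be expressed as a band lookup. In the sketch below the band limits are copied from Table 3, with the upper limit of each band treated as exclusive. The names `HIGH_DEMAND_PFH`, `LOW_DEMAND_PFD` and `sil_band` are illustrative and are not defined by the standard.

```python
from typing import Optional, Sequence, Tuple

Band = Tuple[int, float, float]  # (SIL, lower limit inclusive, upper limit exclusive)

# Probability of dangerous failure per hour (continuous/high demand mode).
HIGH_DEMAND_PFH: Sequence[Band] = (
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
)

# Average probability of failure on demand (low demand mode).
LOW_DEMAND_PFD: Sequence[Band] = (
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
)


def sil_band(value: float, bands: Sequence[Band]) -> Optional[int]:
    """Return the SIL whose band contains the value, or None if it is outside all bands."""
    for sil, lower, upper in bands:
        if lower <= value < upper:
            return sil
    return None


print(sil_band(4.38e-3, LOW_DEMAND_PFD))   # falls in the 1e-3 to 1e-2 band
print(sil_band(1.7e-7, HIGH_DEMAND_PFH))   # falls in the 1e-7 to 1e-6 band
```

A result of `None` means the figure lies outside the tabulated bands. The band found here is only one of the constraints: the architecture and design process requirements described earlier still apply.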
2.4 Process Safety Time
IEC 61508 defines the concept of process safety time as “the period of time between a
failure occurring in the EUC or the EUC control system (with the potential to give rise to a
hazardous event) and the occurrence of the hazardous event if the safety function is not
performed”.
It follows that implementation of the safety function must be appropriate to the process
safety time of the given risk.
The time to carry out the safety function is not given a specific name within IEC 61508, but
will be termed response time in this document and treated as if it were defined in the
standard. It is easy to see that the response time of the safety function must be shorter
than the process safety time.
In high demand applications, it is also necessary to consider the length of time to detect and
respond to - and/or repair - faults revealed by internal diagnostics. The time taken to detect
internal faults is known as the diagnostic test interval and the time taken to respond once a
fault is detected is known as the fault reaction time. Further, the mean time to repair the
system must be taken into account for applications that will continue to operate – and
therefore continue to present a risk – before the safety function can be repaired.
Note - it is not necessary to consider the diagnostic test interval and the fault reaction time
when the system is tolerant to hardware faults – but they must be considered when a single
hardware fault could render the system incapable of carrying out its safety function on
demand.
Note – the response time of the system must include the time taken for input and output
devices to respond (e.g. the time taken for a valve to close) and must make worst case
assumptions for any cyclical (or non-deterministic) processes.
2.5 Summary of Safety-related System Selection
Establishing the suitability of a certified safety-related system to carry out a particular safety
function can be simplified into three basic tests:
• Is the system architecture suitable?
• Is the probability of a dangerous failure low enough?
• Can the system respond sufficiently quickly?
The following sections provide a simple summary of what is required to check the suitability of
a safety-related system against these basic tests.
2.5.1 System Architecture
Tables 1 and 2 in Section 2.3.5 show the maximum safety integrity level that a system can be
used to provide, given its subsystem type (A or B), the hardware fault tolerance and the safe
failure fraction.
2.5.2 Probability of Dangerous Failure
Table 3 in Section 2.3.6 shows PFH and PFDavg by Safety Integrity Level, for high and low
demand applications.
2.5.3 Speed of Response
For any safety function, the process safety time must be longer than the response time.
In high demand applications with no hardware fault tolerance, the process safety time must
also be longer than the sum of the diagnostic test interval and the fault reaction time.
Figure 3 summarises the steps that must be taken to determine if the performance of a
safety-related system is sufficient to achieve a particular safety function. It assumes that
the system architecture is suitable for the SIL being considered and that the mean time to
repair need not be considered due to the nature of the application (i.e. the safety function will
be carried out in the event of a fault). Further, in high demand applications, it assumes that
the diagnostic test interval is at least an order of magnitude shorter than the expected
interval between demands.
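[Figure 3: flowchart. The decision steps recoverable from the figure are:
1. Is Process Safety Time > Response Time? If no, the system is not suitable.
2. What is the demand rate?
   More than once a year: is Process Safety Time > Diagnostic Test Interval + Fault
   Reaction Time? If no, the system is not suitable. If yes, use High Demand Mode and
   calculate the PFH of each safety loop. Is PFH < limit for target SIL? If no, the system
   is not suitable; if yes, it is suitable.
   Once a year or less: use Low Demand Mode and calculate the PFDavg of each safety
   loop. Is PFDavg < limit for target SIL? If no, the system is not suitable; if yes, it is
   suitable.]
Figure 3 Determining if a Safety-Related System is suitable for the application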
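The procedure in Figure 3 can be sketched in Python as follows. This is an illustration under the same assumptions as the figure (architecture already checked, mean time to repair not considered). The class `SafetyLoop`, the function `check_suitability` and all example figures are hypothetical. The limits are the upper band limits from Table 3, and the diagnostic timing check is applied only when there is no hardware fault tolerance, as stated in the text above.

```python
from dataclasses import dataclass
from typing import Tuple

# Upper limits for each target SIL, taken from Table 3.
PFH_LIMIT = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}
PFD_LIMIT = {1: 1e-1, 2: 1e-2, 3: 1e-3, 4: 1e-4}


@dataclass
class SafetyLoop:
    response_time_s: float             # sensor + logic solver + final element
    diagnostic_test_interval_s: float
    fault_reaction_time_s: float
    hardware_fault_tolerance: int
    pfh: float                         # probability of dangerous failure per hour
    pfd_avg: float                     # average probability of failure on demand


def check_suitability(loop: SafetyLoop, process_safety_time_s: float,
                      demands_per_year: float, target_sil: int) -> Tuple[bool, str]:
    """Walk the decision steps of Figure 3 and report the first failed check."""
    if not process_safety_time_s > loop.response_time_s:
        return False, "response time is not shorter than the process safety time"

    if demands_per_year > 1:  # high demand mode
        if loop.hardware_fault_tolerance == 0:
            detect_and_react_s = (loop.diagnostic_test_interval_s
                                  + loop.fault_reaction_time_s)
            if not process_safety_time_s > detect_and_react_s:
                return False, "diagnostics and fault reaction are too slow"
        if not loop.pfh < PFH_LIMIT[target_sil]:
            return False, "PFH is too high for the target SIL"
        return True, "suitable (high demand mode)"

    # low demand mode
    if not loop.pfd_avg < PFD_LIMIT[target_sil]:
        return False, "PFDavg is too high for the target SIL"
    return True, "suitable (low demand mode)"


example = SafetyLoop(response_time_s=2.0, diagnostic_test_interval_s=5.0,
                     fault_reaction_time_s=1.0, hardware_fault_tolerance=0,
                     pfh=1.7e-7, pfd_avg=4.0e-3)
print(check_suitability(example, process_safety_time_s=10.0,
                        demands_per_year=0.1, target_sil=2))
```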
2.6 Management Requirements
IEC 61508 places a number of requirements on the individuals and organisations that are
involved in the design, implementation and maintenance of safety-related systems.
It does not prescribe exactly how this management should be done, but it does require that
formal development processes be specified, followed and audited; see IEC 61508-1
clause 6.
Organisations may have their functional safety management capability assessed and
certified (for example under the CASS scheme), to demonstrate their competence. This can
be particularly useful to end users and system integrators in demonstrating compliance to
regulatory authorities.
2.7 Certified Products
One of the advantages of using products certified to IEC 61508 by a recognised body, is that
the certificate validates much more than just the actual product itself. The certification also
confirms the suitability of:
• the design processes used by the manufacturer of the product to avoid failures
• the design techniques and measures used to control failures (or limit the effects of
failures) during operation
• the methods used to define the hardware fault tolerance
• the methods used to measure the safe failure fraction
• the methods used to measure the probabilities of failure
Many more aspects are brought into the certifying process and the certificate is sufficient
proof that all requirements have been met for the safety integrity level claimed for the
product.
IEC 61508 does not require that certified safety products are used in safety-related
systems, but if an end user or system designer elects to use non-certified products then
they must take responsibility themselves for validating that all these elements have been
carried out according to the standard.
2.8 IEC 61508 and ANSI/ISA S84.01
ANSI/ISA S84.01 is the process industry functional safety standard for North America,
designed to be compatible with draft versions of IEC 61508. Now that IEC 61508 is
published, it is expected that ANSI/ISA S84.01 will evolve further.
ANSI/ISA S84.01 uses only three safety integrity levels (SIL 1 to 3), which are defined
almost identically to those in IEC 61508. A further difference is that the ANSI/ISA standard
does not cover the full safety lifecycle – from design to decommissioning – as does IEC
61508.
IEC 61508 may be used on a voluntary basis in North America.
3 Application Example
3.1 Low Demand Application - Emergency Shutdown System
3.1.1 Description of application
An Emergency Shutdown (ESD) System is intended to shut down the process safely in the
event of a failure of the system controlling the process (the EUC control system in IEC 61508
terms, commonly known in the process industry as the Basic Process Control System or
BPCS), or when certain critical parameters exceed pre-set limits. It is used to protect against
injury, loss of life, damage to the plant and environmental damage in the event of a
malfunction.
The ESD System is almost always separate from the BPCS and usually has its own dedicated
sensors and actuators.
A typical application is shown in the diagram below.
[Figure 4: block diagram showing input devices (e.g. temperature or pressure transmitters),
the MOST SafetyNet System, actuators (e.g. shut-off valves, dump valves etc.) and the
control room.]
Figure 4 Typical Emergency Shutdown Application
3.1.2 MOST SafetyNet System
The MOST SafetyNet System is a SIL2 certified “logic solver”. It comprises a number of IO
Modules to which field instruments are connected and a Controller that runs the safety
application programme. Data relevant to a safety application is given in the table below.
General Information
Manufacturer                MTL Instruments Ltd
Model                       MOST SafetyNet System
Logic Solver Type           Safety PLC
Configuration               1oo1
Architecture Type           B
Certified for use up to     SIL2
Hardware Fault Tolerance    0

Failure Rate Data
Part                   Model         λDU (dangerous undetected failure rate per 10⁹ hours)
Safety Controller      8851-LC-MT    100
AI Safety Module       8810-HI-TX    20
DI/DO Safety Module    8811-IO-DC    50
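As an illustration of how the failure rate data might be used, the Python sketch below converts the tabulated figures to failures per hour and sums them for the logic solver part of one loop. The choice of one analogue input module, one controller and one DI/DO module is an assumption made here for the example, as are the names `LAMBDA_DU_PER_1E9_H` and `lambda_du_per_hour`. Sensors and final elements are not included.

```python
# Dangerous undetected failure rates from the table above, per 10^9 hours.
LAMBDA_DU_PER_1E9_H = {
    "8851-LC-MT": 100,  # Safety Controller
    "8810-HI-TX": 20,   # AI Safety Module
    "8811-IO-DC": 50,   # DI/DO Safety Module
}


def lambda_du_per_hour(models):
    """Sum the dangerous undetected failure rates of the listed modules, per hour."""
    return sum(LAMBDA_DU_PER_1E9_H[model] for model in models) / 1e9


# Assumed loop: analogue input module -> controller -> digital output module.
logic_solver_models = ["8810-HI-TX", "8851-LC-MT", "8811-IO-DC"]
rate = lambda_du_per_hour(logic_solver_models)
print(f"logic solver lambda_DU = {rate:.2e} per hour")
```

A simple sum is appropriate only because the configuration is 1oo1 with a hardware fault tolerance of 0, so the loss of any one of these modules affects the loop.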
3.1.3 Required Input and Output types
The input and output types discussed below are those required for the safety-related
functions of the ESD System. Other input and output types (for non-safety-related functions)
may also be used in the system. These must not compromise the safety function.
• 4/20mA analogue inputs are used to interface to a number of measurement transmitters.
  Line fault monitoring is carried out by checking if the current input is either under- or over-range.
• Digital outputs are used to control valves. These will be normally energised and used with
  either shut-off or release valves. (Shut-off valves are kept normally open by the
  energised digital output. Release valves are kept normally closed.) Line fault monitoring
  is not required on these outputs, as they would be de-energised by any line fault.
• Digital inputs are used for monitoring volt-free contacts. If the field wiring to the switch
  became open-circuit, it would not be possible to detect the closure of the switch and if the
  line became short-circuit, it would not be possible to detect that the switch was open.
  Line fault monitoring, in conjunction with end-of-line resistors, is used to identify and
  report open and short circuit faults in field wiring.
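The line fault checks described in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not the behaviour of the MTL modules: the function names and all four threshold values are assumptions chosen for the example, and the digital input check is simplified to a single loop resistance measurement.

```python
from enum import Enum


class LineState(Enum):
    HEALTHY = "healthy"
    UNDER_RANGE = "under-range fault"
    OVER_RANGE = "over-range fault"
    SHORT_CIRCUIT = "short-circuit fault"
    OPEN_CIRCUIT = "open-circuit fault"


# Illustrative thresholds only (assumptions, not MTL module data).
AI_UNDER_RANGE_MA = 3.6       # below this the 4/20mA loop is treated as faulty
AI_OVER_RANGE_MA = 21.0       # above this the 4/20mA loop is treated as faulty
DI_SHORT_BELOW_OHM = 100.0    # lower than a closed contact plus its resistor
DI_OPEN_ABOVE_OHM = 50_000.0  # higher than an open contact plus its resistors


def check_analogue_input(current_ma: float) -> LineState:
    """Flag a loop current that lies outside the assumed valid band."""
    if current_ma < AI_UNDER_RANGE_MA:
        return LineState.UNDER_RANGE
    if current_ma > AI_OVER_RANGE_MA:
        return LineState.OVER_RANGE
    return LineState.HEALTHY


def check_digital_input(loop_resistance_ohm: float) -> LineState:
    """Flag a contact loop whose resistance is outside the band that the
    end-of-line resistors allow for a healthy open or closed switch."""
    if loop_resistance_ohm < DI_SHORT_BELOW_OHM:
        return LineState.SHORT_CIRCUIT
    if loop_resistance_ohm > DI_OPEN_ABOVE_OHM:
        return LineState.OPEN_CIRCUIT
    return LineState.HEALTHY
```

With end-of-line resistors fitted, a healthy switch never presents zero or infinite resistance, which is what makes the two wiring faults distinguishable from the two switch positions.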
______________________________________________________________________________________
 MTL Open Systems Technology Ltd.
Issue: 1.0
04 April 2006
3.1.4 Configuration and Programming
The safe state for an ESD system is for the normally energised outputs to be de-energised,
which can be triggered either by the programmed application or automatically by the safety
features built into the Safety System.
Depending on the requirements of the particular ESD application and the nature of the detected
fault, immediately shutting down the process may be neither necessary nor desirable. It may be
possible, for example, to report the fault to the operator and start a timer, so that shutdown
is triggered only if the fault has not been cleared when the timer expires. A line fault on an
input channel is an example of a fault that might be treated in this way.
Such faults leave the system with some level of safety functionality, but the
consequences of not immediately shutting down the process must be carefully considered as
part of the safety analysis.
Further, an approach in which the process is not shut down immediately requires the
mean time to repair the system to be considered directly in the analysis of
probabilities of failure, and the simplified procedure shown in Figure 3 cannot be used.
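The delayed shutdown approach described in this section can be sketched as a small piece of timer logic. The Python below is a minimal illustration only: the class name, the `grace_period_s` parameter and the explicit `now_s` timestamp are all assumptions made for the example, and it does not represent the programming environment of the MOST SafetyNet System.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelayedShutdown:
    """Hypothetical fault timer: shut down only if a fault persists."""

    grace_period_s: float                  # assumed time allowed for repair
    fault_since_s: Optional[float] = None  # time the current fault was first seen

    def update(self, fault_present: bool, now_s: float) -> bool:
        """Return True when the shutdown must be triggered."""
        if not fault_present:
            # Fault cleared before expiry: cancel the timer.
            self.fault_since_s = None
            return False
        if self.fault_since_s is None:
            # First detection: start the timer (the operator would be
            # notified at this point in a real application).
            self.fault_since_s = now_s
        return now_s - self.fault_since_s >= self.grace_period_s
```

The grace period chosen for such a timer is effectively the repair time that has to be carried into the probability of failure analysis.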
3.1.5 Probability of Dangerous Failure
This Section gives a basic introduction to calculating the average probability of failure on
demand (PFDavg) for a safety function incorporating the MOST SafetyNet System.
PFDavg for a particular safety function is the sum of the average probabilities of failure on
demand of each element of the system, taking into account the proof test interval of each
element.
Figure 5 below includes a pressure transmitter for an input device, an 8810-HI-TX Analogue
Input Module, a Safety Controller, an 8811-IO-DC Digital I/O Module configured as an output
and a pilot and control valve.
Element                           λDU (per hour)   Tp (hours)   PFDavg
Pressure Transmitter              100 x 10^-9      8760         5 x 10^-4
8810-HI-TX Safety AI Module       20 x 10^-9       8760         1 x 10^-4
8851-LC-MT Safety Controller      100 x 10^-9      8760         5 x 10^-4
8811-IO-DC Safety DI/DO Module    50 x 10^-9       8760         3 x 10^-4
Pilot & Control Valve             1400 x 10^-9     8760         6.1 x 10^-3

λDU for all elements is the failure rate per hour; PFDavg is the average probability of dangerous failure on demand.
Tp is the proof test interval; 8760 hours is 1 year.
PFDavg = 1/2 * Tp * λDU
Figure 5 Typical Low Demand Application
PFDavg for each element is calculated according to the equation above, where λDU is the
undetected dangerous failure rate per hour and Tp is the proof test interval (also in hours).
Tp in this example is 8760 hours (1 year) for all components of the safety function. The
value for PFDavg is half of the product of Tp and λDU – see Section 2.3.6.
The overall PFDavg for the safety function is then:
PFDavg = 5 x 10^-4 + 1 x 10^-4 + 5 x 10^-4 + 3 x 10^-4 + 6.1 x 10^-3 = 7.5 x 10^-3
The PFDavg limit for SIL2 is < 10^-2, which is the case for this example.
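The calculation above can be reproduced with a short Python sketch. It applies PFDavg = 1/2 * Tp * λDU to the failure rates shown in Figure 5 and sums the results. The function and variable names are assumptions made for the example, and the SIL lookup uses the low demand PFDavg bands of IEC 61508 only; it does not check the architectural constraints. Because the sketch does not round the per-element values, its total (about 7.3 x 10^-3) is slightly lower than the 7.5 x 10^-3 obtained from the rounded figures.

```python
PROOF_TEST_INTERVAL_H = 8760.0  # Tp: 1 year, as in Figure 5

# Dangerous undetected failure rates per hour, taken from Figure 5.
LAMBDA_DU_PER_HOUR = {
    "Pressure Transmitter": 100e-9,
    "8810-HI-TX Safety AI Module": 20e-9,
    "8851-LC-MT Safety Controller": 100e-9,
    "8811-IO-DC Safety DI/DO Module": 50e-9,
    "Pilot & Control Valve": 1400e-9,
}


def pfd_avg(lambda_du_per_hour, proof_test_interval_h):
    """Simplified low demand formula: PFDavg = 1/2 * Tp * lambda_DU."""
    return 0.5 * proof_test_interval_h * lambda_du_per_hour


def sil_band_low_demand(pfd):
    """Return the SIL whose low demand PFDavg band contains pfd, else None."""
    bands = [(1e-5, 1e-4, 4), (1e-4, 1e-3, 3), (1e-3, 1e-2, 2), (1e-2, 1e-1, 1)]
    for low, high, sil in bands:
        if low <= pfd < high:
            return sil
    return None


element_pfd = {
    name: pfd_avg(rate, PROOF_TEST_INTERVAL_H)
    for name, rate in LAMBDA_DU_PER_HOUR.items()
}
total_pfd = sum(element_pfd.values())

for name, value in element_pfd.items():
    print(f"{name}: {value:.2e}")
print(f"Total PFDavg: {total_pfd:.2e}, SIL band: {sil_band_low_demand(total_pfd)}")
```

The per-element output makes it easy to see that the pilot and control valve dominates the total, so shortening its proof test interval has the largest effect on the result.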
3.1.6 Response Time
The response time requirement of a typical ESD System, from detecting a fault or alarm to
completion of an action by an output device, can vary considerably according to the nature of
the process under control. A safety function with an input transmitter or switch as a sensor
and a valve as a final element would normally give a response time better than 10 seconds,
with the operating time of the valve the dominant factor. The response time of the MOST
SafetyNet System, in the range 50 to 200ms, is significantly faster than that of a typical
valve. This allows much shorter response times when the system is combined with faster
acting final elements.
Figure 6 below shows typical response time figures for an ESD system.
Element                           Response time
Pressure Transmitter              50ms
8810-HI-TX Safety AI Module       30ms
8851-LC-MT Safety Controller      100ms
8811-IO-DC Safety DI/DO Module    10ms
Pilot & Control Valve             4s
Figure 6 Typical ESD System Response Times
The typical response time for the system outlined above is:
0.05 + 0.03 + 0.1 + 0.01 + 4 = 4.19 seconds (i.e. within the 10 second process safety time)
The worst case response time for the system outlined above (which would occur when the
input cycles of the transmitter and the AI module become as unsynchronised as possible,
so that their individual contributions to the response time are doubled) is:
0.05*2 + 0.03*2 + 0.1 + 0.01 + 4 = 4.27 seconds (i.e. still within the process safety time)
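The same arithmetic can be checked with a short Python sketch using the element response times from Figure 6 (including 4 s for the valve). The dictionary keys and variable names are assumptions made for the example; the 10 second process safety time is the figure quoted in this section.

```python
# Response times in seconds, taken from Figure 6.
RESPONSE_TIME_S = {
    "Pressure Transmitter": 0.05,
    "8810-HI-TX Safety AI Module": 0.03,
    "8851-LC-MT Safety Controller": 0.1,
    "8811-IO-DC Safety DI/DO Module": 0.01,
    "Pilot & Control Valve": 4.0,
}

# Elements whose input cycles can be fully unsynchronised, which doubles
# their contribution in the worst case.
UNSYNCHRONISED = {"Pressure Transmitter", "8810-HI-TX Safety AI Module"}

PROCESS_SAFETY_TIME_S = 10.0

typical_s = sum(RESPONSE_TIME_S.values())
worst_case_s = sum(
    2 * t if name in UNSYNCHRONISED else t
    for name, t in RESPONSE_TIME_S.items()
)

print(f"Typical response time: {typical_s:.2f} s")
print(f"Worst case response time: {worst_case_s:.2f} s")
print(f"Within process safety time: {worst_case_s < PROCESS_SAFETY_TIME_S}")
```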
Appendix A – Glossary of terms and abbreviations
Terms and Abbreviations for IEC61508
Note: where a definition of the term or abbreviation is given in IEC61508-4 “Definitions and
abbreviations”, the definition from the standard is given first in quotation marks, followed by
further explanation if this is necessary.
1oo1D – a system which has no hardware fault tolerance and some level of diagnostic
coverage to detect faults.
1oo2D – a system which has a hardware fault tolerance of “1” and some level of diagnostic
coverage to detect faults.
Average probability of failure of protection on demand (PFDavg) – the probability that
a safety system will be unable to carry out its required safety function when a hazardous
situation arises and a demand – in other words a request for the safety function to act –
occurs. This probability is used to determine the suitability of safety systems in low demand
applications. The value of PFDavg of a particular element within the safety system is
determined by its intrinsic reliability, but also by the length of time between proof tests.
Average probability of failure on demand (PFDavg) – “is the safety integrity failure
measure for safety-related protection systems operating in low demand mode”
Continuous mode – also known as high demand. A safety function for high demand or
continuous mode may be required to carry out its safety function more often than once per
year. The alternative is a low demand application, in which the safety function would
normally be required to operate once per year, or less.
Control failures – a number of techniques are specified in the standard to control failures during
the operation of the E/E/PE safety-related system. These techniques, when combined with
the techniques specified for fault avoidance in all stages of the safety life cycle, play an
important part in ensuring that the E/E/PE safety-related system attains its safety integrity
level.
Diagnostic test interval – “interval between on-line tests to detect faults in a safety-related
system that has a specified diagnostic test coverage”. The diagnostic test interval is an
important factor (when combined with the fault reaction time), in determining if a particular
safety-related system (with no tolerance to hardware faults) is suitable for use in a given
high demand/continuous mode application.
Electrical, electronic or programmable electronic system (E/E/PES) – “system for control,
protection or monitoring based on one or more electrical/electronic/programmable electronic
devices, including all elements of the system such as power supplies, sensors and other input
devices, data highways and other communication paths, and actuators and other output
devices.”
Equipment under control (EUC) – the equipment, plant and machinery that is the source of
the risk.
EUC control system – “system which responds to input signals from the process and/or from
an operator and generates output signals causing the EUC to operate in the desired manner”.
EUC risk – “risk arising from the EUC or its interaction with the EUC control system”.
External risk reduction facility – “measure to reduce or mitigate the risks which are
separate and distinct from, and do not use, E/E/PE safety-related systems or other
technology safety-related systems”. Examples: a drain system, a fire wall and a bund are
all external risk reduction facilities.
Fault avoidance – “use of techniques and procedures which aim to avoid the introduction of
faults during any phase of the safety lifecycle of the safety-related system”.
Fault reaction time – the time taken for a safety function to perform its specified action to
achieve or maintain a safe state. This should be considered along with the diagnostic test
interval and the process safety time for systems that have a hardware fault tolerance of zero
and which are operating in high demand mode.
Final elements – the actuators (such as valves, solenoids, solenoid valves, pumps, alarms
etc.) that carry out an action to control the process or carry out the safety function.
Functional safety – “part of the overall safety relating to the EUC and the EUC control
system which depends on the correct functioning of the E/E/PE safety-related systems,
other technology safety-related systems and external risk reduction facilities”.
Hardware fault tolerance – IEC 61508 defines fault tolerance as “ability of a functional unit
to continue to perform a required function in the presence of faults or errors”. Hardware fault
tolerance is obviously fault tolerance specifically related to hardware.
Harm – “physical injury or damage to the health of people either directly or indirectly as a
result of damage to property or to the environment”
Hazard – “a potential source of harm”. The standard covers harm caused in both the short
term – such as harm from an explosion – and the long term – such as harm from the release
of a toxic substance.
Hazard and risk analysis – part of the development of the overall safety requirements.
Hazardous event – “a hazardous situation which results in harm”.
Hazardous situation – “circumstances in which a person is exposed to hazard(s)”.
High demand – also known as continuous mode. A safety function for high demand or
continuous mode may be required to carry out its safety function more often than once per
year. The alternative is a low demand application, in which the safety function would
normally be required to operate once per year, or less.
Low demand – a safety function for low demand applications may be required to carry out
its safety function once per year or less. The alternative is a high demand/continuous
mode application, in which the safety function would normally be required to operate more
than once per year.
Other technologies – IEC 61508 is concerned with the use of electrical, electronic and
programmable electronic systems to provide safety systems. “Other technologies” are
neither electrical, electronic nor programmable electronic, but the standard recognises that
such protection based on alternative technologies – such as a hydraulic system - can be used
in risk reduction.
Probability of dangerous failure per hour (PFH) - “is the safety integrity failure measure for
safety-related protection systems operating in high demand mode”
Process safety time – “the period of time between a failure occurring in the EUC or the EUC
control system (with the potential to give rise to a hazardous event) and the occurrence of
the hazardous event if the safety function is not performed”.
Programmable electronic system – “system for control, protection or monitoring based on
one or more programmable electronic devices, including all elements of the system such as
power supplies, sensors and other input devices, data highways and other communication
paths, and actuators and other output devices”.
Proof test – “periodic test performed to detect failures in a safety-related system so that, if
necessary, the system can be restored to an “as new” condition or as close as practical to this
condition”.
Random hardware failures – “failure, occurring at a random time, which results from one or
more of the possible degradation mechanisms in the hardware”.
Residual risk – “risk remaining after protective measures have been taken”. It is not
necessary that this risk is zero, but once protective measures have been taken it should be
below what is considered a “tolerable risk”.
Response time – the standard does not specifically define “response time”, but for
convenience in this safety manual, it is taken as if it were a defined concept. Given that
condition, response time is the time taken from the input to the sensor (or input device)
associated with a particular safety function being set, to the output device (final element)
completing its required action. This time period includes the time taken for the E/E/PE system
to carry out any software applications and communicate with the sensors and final elements.
Risk – “the combination of the probability of occurrence of harm and the severity of that
harm”.
Safe failure fraction – “of a subsystem is defined as the ratio of the average rate of safe
failures plus dangerous detected failures of the subsystem to the total average failure rate
of the subsystem”.
Safe state – “state of the EUC when safety is achieved”.
Safety function – “function to be implemented by an E/E/PE safety-related system, other
technology safety-related system or external risk reduction facilities, which is intended
to achieve or maintain a safe state for the EUC, in respect of a specific hazardous event”.
Safety integrity level – “discrete level (one out of a possible four) for specifying the safety
integrity requirements of the safety functions to be allocated to the E/E/PE safety-related
systems, where safety integrity level 4 has the highest level of safety integrity and safety
integrity level 1 has the lowest”.
Safety life cycle – “necessary activities involved in the implementation of safety-related
systems, occurring during a period of time that starts at the concept phase of a project and
finishes when all of the E/E/PE safety-related systems, other technology safety-related
systems and external risk reduction facilities are no longer available for use”.
Safety-related systems – “designated system that both
- implements the required safety functions necessary to achieve or maintain a safe state
  for the EUC; and
- is intended to achieve, on its own or with other E/E/PE safety-related systems, other
  technology safety-related systems or external risk reduction facilities, the necessary
  safety integrity for the required safety functions”.
Safety requirements specification – “specification containing all the requirements of the
safety functions that have to be performed by the safety-related systems”. This should
include the action the safety function is required to perform and also the safety integrity
requirements of the safety function.
Sensors – input devices to the safety function.
SIL – see safety integrity level.
Systematic failure – “failure related in a deterministic way to a certain cause, which can only
be eliminated by a modification of the design or of the manufacturing process,
operational procedures, documentation or other relevant factors”.
Tolerable risk – “risk which is accepted in a given context based on the current values of
society”
Type A system – a subsystem will be regarded as type A if the components used to
achieve the safety function satisfy all of the following requirements:
(a) the failure modes of all the constituent components are well defined
(b) the behaviour of the subsystems under fault conditions can be completely determined
(c) there is sufficient dependable failure data from field experience to show that the claimed
rates of failure for detected and undetected dangerous failures are met.
Type B system – a subsystem will be regarded as type B if, for the components used to
achieve the safety function, any of the following applies:
(a) the failure mode of at least one constituent component is not well defined
(b) the behaviour of the subsystems under fault conditions cannot be completely determined
(c) there is insufficient dependable failure data from field experience to support the claims for
rates of failure for detected and undetected dangerous failures.