Notice
The information presented in this publication is for the general education of the reader. Because neither the author nor
the publisher has any control over the use of the information by the reader, both the author and the publisher disclaim any
and all liability of any kind arising out of such use. The reader is expected to exercise sound professional judgment in using
any of the information presented in a particular application.
Additionally, neither the author nor the publisher has investigated or considered the effect of any patents on the ability of
the reader to use any of the information in a particular application. The reader is responsible for reviewing any possible
patents that may affect any particular use of the information presented.
Any references to commercial products in the work are cited as examples only. Neither the author nor the publisher
endorses any referenced commercial product. Any trademarks or tradenames referenced in this publication, even without
specific indication thereof, belong to the respective owner of the mark or name and are protected by law. Neither the author
nor the publisher makes any representation regarding the availability of any referenced commercial product at any time. The
manufacturer’s instructions on the use of any commercial product must be followed at all times, even if in conflict with the
information in this publication.
The opinions expressed in this book are the author’s own and do not reflect the view of the International Society of
Automation.
Copyright © 2018 International Society of Automation (ISA)
All rights reserved.
Printed in the United States of America.
10 9 8 7 6 5 4 3 2
ISBN 978-1-941546-91-8
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.
ISA
67 T. W. Alexander Drive
P.O. Box 12277
Research Triangle Park, NC 27709
Library of Congress Cataloging-in-Publication Data in process
About the Editors
Nicholas P. Sands PE, CAP, ISA Fellow
Nick Sands is currently a senior manufacturing technology fellow with more than 27
years at DuPont, working in a variety of automation roles at several different businesses
and plants. He has helped develop several company standards and best practices in the
areas of automation competency, safety instrumented systems, alarm management, and
process safety.
Sands has been involved with the International Society of Automation (ISA) for more
than 25 years, working on standards committees, including ISA18, ISA101, ISA84, and
ISA105, as well as training courses, the ISA Certified Automation Professional (CAP)
certification, and section and division events. His path to automation started when he
earned a BS in Chemical Engineering from Virginia Tech.
Ian Verhappen P Eng, CAP, ISA Fellow
Since receiving a BS in Chemical Engineering with a focus on process control, Ian
Verhappen has worked in all three aspects of the automation industry:
end user, supplier, and engineering consultant. Verhappen has been an active ISA
volunteer for more than 25 years, learning from and sharing his knowledge with other
automation professionals as an author, presenter, international speaker, and volunteer
leader.
Through a combination of engineering work, standards committee involvement, and a
desire for continuous learning, Verhappen has been involved in all facets of the process
automation industry from field devices, including process analyzers, to controllers to the
communication networks connecting these elements together. A Guide to the
Automation Body of Knowledge is a small way to share this continuous learning and
pass along the expertise gained from all those who have helped develop the body of
knowledge used to edit this edition.
Preface to the Third Edition
It has been some years since the second edition was published in 2006. Times have
changed. We have changed. Technology has changed. Standards have changed. Some
areas of standards change include alarm management, human-machine interface
design, procedural automation, and intelligent device management.
Another change came in 2009, when we lost the pioneer of A Guide to the Automation Body of
Knowledge and the Certified Automation Professional (CAP) program, my friend,
Vernon Trevathan. He had a vision of defining automation engineering and developing
automation engineers.
With the changes in technology, it is clear that the trend of increasing automation will
continue into the future. What is not clear is how to support that trend with capable
engineers and technicians. This guide is a step towards a solution. The purpose of this
edition is the same as that of the first edition: to provide a broad overview of
automation, broader than just instrumentation or process control, to include topics like
HAZOP studies, operator training, and operator effectiveness. The chapters are written
by experts who share their insights in a few pages.
The third edition was quite a project for many reasons. It was eventually successful
because of the hard work and dedication of Susan Colwell and Liegh Elrod of the ISA
staff, and the unstoppable force of automation that is my co-editor Ian Verhappen. Every
chapter has been updated and some new chapters have been added. It is my hope that
you find this guide to be a useful quick reference for the topics you know, and an
overview for the topics you seek to learn. May you enjoy reading this third edition, and I
hope Vernon enjoys it as well.
Nicholas P. Sands
May 2018
Contents
About the Editors
Preface
I – Control Basics
1 Control System Documentation
By Frederick A. Meier and Clifford A. Meier
Reasons for Documentation
Types of Documentation
Process Flow Diagram (PFD)
Piping and Instrument Diagrams (P&IDs)
Instrument Lists
Specification Forms
Logic Diagrams
Location Plans (Instrument Location Drawings)
Installation Details
Loop Diagrams
Standards and Regulations
Other Resources
About the Authors
2 Continuous Control
By Harold Wade
Introduction
Process Characteristics
Feedback Control
Controller Tuning
Advanced Regulatory Control
Further Information
About the Author
3 Control of Batch Processes
By P. Hunter Vegas, PE
What Is a Batch Process?
Controlling a Batch Process
What Is ANSI/ISA-88.00.01?
Applying ANSI/ISA-88.00.01
Summary
Further Information
About the Author
4 Discrete Control
By Kelvin T. Erickson, PhD
Introduction
Ladder Logic
Function Block Diagram
Structured Text
Instruction List
Sequential Problems
Further Information
About the Author
II – Field Devices
5 Measurement Uncertainty
By Ronald H. Dieck
Introduction
Error
Measurement Uncertainty (Accuracy)
Calculation Example
Summary
Definitions
References
Further Information
About the Author
6 Process Transmitters
By Donald R. Gillum
Introduction
Pressure and Differential Pressure Transmitters
Level Measurement
Hydraulic Head Level Measurement
Fluid Flow Measurement Technology
Temperature
Conclusion
Further Information
About the Author
7 Analytical Instrumentation
By James F. Tatera
Introduction
Sample Point Selection
Instrument Selection
Sample Conditioning Systems
Process Analytical System Installation
Maintenance
Utilization of Results
Further Information
About the Author
8 Control Valves
By Hans D. Baumann
Valve Types
Actuators
Accessories
Further Information
About the Author
9 Motor and Drive Control
By Dave Polka and Donald G. Dunn
Introduction
DC Motors and Their Principles of Operation
DC Motor Types
AC Motors and Their Principles of Operation
AC Motor Types
Choosing the Right Motor
Adjustable Speed Drives (Electronic DC)
Adjustable Speed Drives (Electronic AC)
Automation and the Use of VFDs
Further Information
About the Authors
III – Electrical Considerations
10 Electrical Installations
By Greg Lehmann, CAP
Introduction
Scope
Definitions
Basic Wiring Practices
Wire and Cable Selection
Ground, Grounding, and Bonding
Surge Protection
Electrical Noise Reduction
Enclosures
Raceways
Distribution Equipment
Check-Out, Testing, and Start-Up
Further Information
About the Author
11 Safe Use and Application of Electrical Apparatus
By Ernie Magison, Updated by Ian Verhappen
Introduction
Philosophy of General-Purpose Requirements
Equipment for Use Where Explosive Concentrations of Gas, Vapor, or Dust
Might Be Present
Equipment for Use in Locations Where Combustible Dust May Be Present
For More Information
About the Author
12 Checkout, System Testing, and Start-Up
By Mike Cable
Introduction
Instrumentation Commissioning
Software Testing
Factory Acceptance Testing
Site Acceptance Testing
System Level Testing
Safety Considerations
Further Information
About the Author
IV – Control Systems
13 Programmable Logic Controllers: The Hardware
By Kelvin T. Erickson, PhD
Introduction
Basic PLC Hardware Architecture
Basic Software and Memory Architecture (IEC 61131-3)
I/O and Program Scan
Forcing Discrete Inputs and Outputs
Further Information
About the Author
14 Distributed Control Systems
By Douglas C. White
Introduction and Overview
Input/Output Processing
Control Network
Control Modules
Human-Machine Interface—Operator Workstations
Human-Machine Interface—Engineering Workstation
Application Servers
Future DCS Evolution
Further Information
About the Author
15 SCADA Systems: Hardware, Architecture, and Communications
By William T. (Tim) Shaw, PhD, CISSP, CPT, C|EH
Key Concepts of SCADA
Further Information
About the Author
V – Process Control
16 Control System Programming Languages
By Jeremy Pollard
Introduction
Scope
What Is a Control System?
What Does a Control System Control?
Why Do We Need a Control Program?
Instruction Sets
The Languages
Conclusions
About the Author
17 Process Modeling
By Gregory K. McMillan
Fundamentals
Linear Dynamic Estimators
Multivariate Statistical Process Control
Artificial Neural Networks
First Principle Models
Capabilities and Limitations
Process Control Improvement
Costs and Benefits
Further Information
About the Author
18 Advanced Process Control
By Gregory K. McMillan
Fundamentals
Advanced PID Control
Valve Position Controllers
Model Predictive Control
Real-Time Optimization
Capabilities and Limitations
Costs and Benefits
MPC Best Practices
Further Information
About the Author
VI – Operator Interaction
19 Operator Training
By Bridget A. Fitzpatrick
Introduction
Evolution of Training
The Training Process
Training Topics
Nature of Adult Learning
Training Delivery Methods
Summary
Further Information
About the Author
20 Effective Operator Interfaces
By Bill Hollifield
Introduction and History
Basic Principles for an Effective HMI
Display of Information Rather Than Raw Data
Embedded Trends
Graphic Hierarchy
Other Graphic Principles
Expected Performance Improvements
The HMI Development Work Process
The ISA-101 HMI Standard
Conclusion
Further Information
About the Author
21 Alarm Management
By Nicholas P. Sands
Introduction
Alarm Management Life Cycle
Getting Started
Alarms for Safety
References
About the Author
VII – Safety
22 HAZOP Studies
By Robert W. Johnson
Application
Planning and Preparation
Nodes and Design Intents
Scenario Development: Continuous Operations
Scenario Development: Procedure-Based Operations
Determining the Adequacy of Safeguards
Recording and Reporting
Further Information
About the Author
23 Safety Instrumented Systems in the Process Industries
By Paul Gruhn, PE, CFSE
Introduction
Hazard and Risk Analysis
Allocation of Safety Functions to Protective Layers
Determine Safety Integrity Levels
Develop the Safety Requirements Specification
SIS Design and Engineering
Installation, Commissioning, and Validation
Operations and Maintenance
Modifications
System Technologies
Key Points
Rules of Thumb
Further Information
About the Author
24 Reliability
By William Goble
Introduction
Measurements of Successful Operation: No Repair
Useful Approximations
Measurements of Successful Operation: Repairable Systems
Average Unavailability with Periodic Inspection and Test
Periodic Restoration and Imperfect Testing
Equipment Failure Modes
Safety Instrumented Function Modeling of Failure Modes
Redundancy
Conclusions
Further Information
About the Author
VIII – Network Communications
25 Analog Communications
By Richard H. Caro
Further Information
About the Author
26 Wireless Transmitters
By Richard H. Caro
Summary
Introduction to Wireless
Powering Wireless Field Instruments
Interference and Other Problems
ISA-100 Wireless
WirelessHART
WIA-PA
WIA-FA
ZigBee
Other Wireless Technologies
Further Information
About the Author
27 Cybersecurity
By Eric C. Cosman
Introduction
General Security Concepts
Industrial Systems Security
Standards and Practices
Further Information
About the Author
IX – Maintenance
28 Maintenance, Long-Term Support, and System Management
By Joseph D. Patton, Jr.
Maintenance Is Big Business
Service Technicians
Big Picture View
Production Losses from Equipment Malfunction
Performance Metrics and Benchmarks
Further Information
About the Author
29 Troubleshooting Techniques
By William L. Mostia, Jr.
Introduction
Logical/Analytical Troubleshooting Framework
The Seven-Step Troubleshooting Procedure
Vendor Assistance: Advantages and Pitfalls
Other Troubleshooting Methods
Summary
Further Information
About the Author
30 Asset Management
By Herman Storey and Ian Verhappen, PE, CAP
Asset Management and Intelligent Devices
Further Information
About the Authors
X – Factory Automation
31 Mechatronics
By Robert H. Bishop
Basic Definitions
Key Elements of Mechatronics
Physical System Modeling
Sensors and Actuators
Signals and Systems
Computers and Logic Systems
Data Acquisition and Software
The Modern Automobile as a Mechatronic Product
Classification of Mechatronic Products
The Future of Mechatronics
References
Further Information
About the Author
32 Motion Control
By Lee A. Lane and Steve Meyer
What Is Motion Control?
Advantages of Motion Control
Feedback
Actuators
Electric Motors
Controllers
Servos
Feedback Placement
Multiple Axes
Leader/Follower
Interpolation
Performance
Conclusion
Further Information
About the Authors
33 Vision Systems
By David Michael
Using a Vision System
Vision System Components
Vision Systems Tasks in Industrial/Manufacturing/Logistics Environments
Implementing a Vision System
What Can the Camera See?
Conclusion
Further Information
About the Author
34 Building Automation
By John Lake, CAP
Introduction
Open Systems
Information Management
Summary
Further Information
About the Author
XI – Integration
35 Data Management
By Diana C. Bouchard
Introduction
Database Structure
Data Relationships
Database Types
Basics of Database Design
Queries and Reports
Data Storage and Retrieval
Database Operations
Special Requirements of Real-Time Process Databases
The Next Step: NoSQL and Cloud Computing
Data Quality Issues
Database Software
Data Documentation
Database Maintenance
Data Security
Further Information
About the Author
36 Mastering the Activities of Manufacturing Operations Management
By Charlie Gifford
Introduction
Level 3 Role-Based Equipment Hierarchy
MOM Integration with Business Planning and Logistics
MOM and Production Operations Management
Other Supporting Operations Activities
The Operations Event Message Enables Integrated Operations Management
The Level 3-4 Boundary
References
Further Information
About the Author
37 Operational Performance Analysis
By Peter G. Martin, PhD
Operational Performance Analysis Loops
Process Control Loop Operational Performance Analysis
Advanced Control Operational Performance Analysis
Plant Business Control Operational Performance Analysis
Real-Time Accounting
Enterprise Business Control Operational Performance Analysis
Summary
Further Information
About the Author
XII – Project Management
38 Automation Benefits and Project Justifications
By Peter G. Martin, PhD
Introduction
Identifying Business Value in Production Processes
Capital Projects
Life-Cycle Cost Analysis
Life-Cycle Economic Analysis
Return on Investment
Net Present Value
Internal Rate of Return
Project Justification Hurdle
Getting Started
Further Information
About the Author
39 Project Management and Execution
By Michael D. Whitt
Introduction
Contracts
Project Life Cycle
Project Management Tools and Techniques
References
About the Author
40 Interpersonal Skills
By David Adler
Introduction
Communicating One-on-One
Communicating in Group Meetings
Writing
Building Trust
Mentoring Automation Professionals
Negotiating
Resolving Conflict
Justifying Automation
Selecting the Right Automation Professionals
Building an Automation Team
Motivating Automation Professionals
Conclusion
References
About the Author
Index
I
Control Basics
Documentation
One of the basic tenets of any project or activity is to be sure it is properly documented.
Automation and control activities are no different, though they do have different and
unique requirements to properly capture the requirements, outcomes, and deliverables of
the work being performed. The International Society of Automation (ISA) has developed
standards that are broadly accepted across the industry as the preferred method for
documenting a basic control system; however, documentation encompasses more than
just these standards throughout the control system life cycle.
Continuous and Process Control
Continuous processes require controls to keep them within safe operating boundaries
while maximizing the utilization of the associated equipment. These basic regulatory
controls are the foundation on which the automation industry relies and builds more
advanced techniques. It is important to understand the different forms of basic
continuous control and how to configure or tune the resulting loops—from sensor to
controller to actuator—because they form the building blocks of the automation
industry.
Batch Control
Not all processes are continuous. Some treat a discrete amount of material within a
shorter period of time and therefore have a different set of requirements than a
continuous process. The ISA standards on batch control are the accepted industry best
practices in implementing control in a batch processing environment; these practices
are summarized.
Discrete Control
This chapter provides examples of how to implement discrete control, which is typically
used in a manufacturing facility. These systems mainly have discrete sensors and
actuators, that is, sensors and actuators that have one of two values (e.g., on/off or
open/closed).
1
Control System Documentation
By Frederick A. Meier and Clifford A. Meier
Reasons for Documentation
Documentation used to define control systems has evolved over the past half century as
the technology used to generate it has evolved. Information formerly stored on
smudged, handwritten index cards in the maintenance shop is now more likely stored in
computer databases. The purpose of that documentation, however, remains largely
unchanged: to impart information efficiently and clearly to a knowledgeable viewer. The
information that is recorded evolves in the conceptualization, design, construction,
operation, and maintenance of a facility that produces a desired product.
The documents described in this chapter form a typical set used to accomplish the goal
of defining the work to be done, be it design, construction, or maintenance. The
documents were developed and are commonly used for a continuous process, but they
also work for other applications, such as batch processes. The authors know of no
universal “standard” for documentation, but these can be considered typical. Some
facilities or designs won’t include all the described documents, and some designs may
include documents not described, but the information provided on these documents will
likely be found somewhere in any successful document suite.
All the illustrations and much of the description used in this section were published in
the 2011 International Society of Automation (ISA) book Instrumentation and Control
System Documentation by Frederick A. Meier and Clifford A. Meier. That book includes
many more illustrations and a lot more explanation.
This section uses the term automation and control (A&C) to identify the group or
discipline responsible for the design and maintenance of a process control system; the
group that prepares and, hopefully, maintains these documents. Many terms are used to
identify the people responsible for a process control system; the group titles differ by
industry, company, and even region. In their book, the Meiers use the term instrument
and control (I&C) to describe the engineers and designers who develop control system
documentation; for our purposes, the terms are interchangeable.
Types of Documentation
This chapter provides descriptions and typical, albeit simple, sketches for the following
documents:
• Process flow diagrams (PFDs)
• Piping and instrument diagrams (P&IDs)
• Loop and tag numbering
• Instrument lists
• Specification forms
• Logic diagrams
• Location plans (instrument location drawings)
• Installation details
• Loop diagrams
• Standards and regulations
• Operating instructions
Figure 1-1 is a timeline that illustrates a sequence for document development. There are
interrelationships where information developed in one document is required before a
succeeding document can be developed. Data in the process flow diagram drives the
design of the P&ID. P&IDs must be essentially complete before instrument
specification forms can be efficiently developed. Loop diagrams are built from most of
the preceding documents in the list.
The time intervals and percentage of total effort for each task will vary by industry and
by designer. The intervals can be days, weeks, or months, but the sequence will likely be
similar to that shown above. The documents listed are not all developed or used solely
by a typical A&C group. However, the A&C group contributes to, and uses, the
information contained in them.
Process Flow Diagram (PFD)
A process flow diagram is a “big picture” schematic representation of the major features
of a process. These diagrams summarize the overall intent of the process using a
graphical representation of the material flow and the conversion of resources into a
product. Points where resources and energy combine to produce material are identified
graphically. These points are then defined in more detail in associated mass balance
calculations. The PFD shows how much of each resource or product a plant might make
or treat; it includes descriptions and quantities of needed raw materials, as well as byproducts produced. PFDs show critical process conditions—pressures, temperatures,
and flows; necessary equipment; and major process piping. They differ from P&IDs,
which will be discussed later, in that they have far less detail and less ancillary
information. They are, however, the source from which P&IDs grow.
Figure 1-2 shows a simple PFD of a knockout drum used to separate liquid from a wet
gas stream.
Process designers produce PFDs to sketch out the important aspects of a process. In a
way, a PFD serves the same purpose that an abstract does for a technical paper. Only the
information or major components needed to define the process are included, using the
minimal amount of detail that is sufficient to define the quantities and energy needed.
Also shown on the drawing are the main components for storage, conversion of
materials, and transfer of materials, as well as the main interconnections between
components. Because a schematic is a very broad view and A&C design is ultimately about
details, little A&C information is included.
Identification marks are used on the lines of interconnection, commonly called streams.
The marks link to tables containing the content and conditions for that stream. This
information comes from a mass and energy balance calculated by the process designers.
The mass balance is the calculation that defines what the process will accomplish. PFDs
may include important—or high-cost—A&C components because one purpose of a PFD
is to support the preparation of cost estimates made to determine if a project will be
done.
There is no ISA standard for PFDs, but ANSI/ISA-5.1-2009, Instrumentation Symbols and
Identification, and ISA-5.3-1983, Graphic Symbols for Distributed Control/Shared
Display Instrumentation, Logic, and Computer Systems, contain symbols that can be
used to indicate A&C components.
Batch process plants configure their equipment in various ways as raw materials and
process parameters change. Many different products are often produced in the same
plant. A control recipe, or formula, is developed for each product. A PFD might be used
to document the different recipes.
Piping and Instrument Diagrams (P&IDs)
The acronym P&ID is widely understood within process industries to identify the
principal document used to define the equipment, piping, and all A&C components
needed to implement a process. ISA’s Automation, Systems, and Instrumentation
Dictionary definition for P&ID tells us what they do: P&IDs “show the interconnection
of process equipment and instrumentation used to control the process.”1 The PFD says
what the process will do; the P&ID defines how it happens.
P&IDs are developed in steps by members of the various design disciplines as a project
evolves. Information placed on a P&ID by one discipline is then used by other
disciplines as the basis for their design.
The P&ID shown in Figure 1-3 has been developed from the PFD in Figure 1-2. The
P&ID includes the control system definition using symbols from ISA-5.1 and ISA-5.3.
In this example, there are two electronic loops that are part of the shared
display/distributed control system (DCS): FRC-100, a flow loop with control and
recording capability, and LIC-100, a level loop with control and indicating capability.
There is one field-mounted pneumatic loop, PIC-100, with control and indication
capability. There are several switches and indication lights on a local (field) mounted
panel, including hand-operated switches and lights HS and ZL-400, HS and HL-401,
and HS and HL-402. Other control system components are also included in the drawing.
The P&ID also includes piping and mechanical equipment details; for instance, the data
string “10″ 150 CS 001” associated with an interconnecting line defines it as a pipe with
the following characteristics:
• 10″ = 10-inch nominal pipe size
• 150 = ANSI Class 150 pressure rating
• CS = carbon steel pipe
• 001 = associated pipe is line number 1
The project standards, as defined on a legend sheet, will establish the format and terms
used in the data string.
Loop and Tag Numbering
A unique loop number is used to group a set of functionally connected process control
elements. Grouped elements generally include the measurement of a process variable,
an indication or control element, and often a manipulated variable. Letters combined
with the loop number comprise a tag number. Tag numbers provide unique identification
for each A&C component. All the devices that make up a single process control loop
have the same number but use different letter designations to define their process
function. The letter string is formatted and explained on a legend sheet based on ISA5.1, Table 1.
Figure 1-4 consists of LT-100, a field-mounted electronic level transmitter; LI-100, a
field-mounted electronic indicator; LIC-100, a level controller that is part of the
distributed control system; LY-100, a current-to-pneumatic (I/P) converter; and LV-100,
a pneumatic butterfly control valve. In this case, loop L-100 would have some variation
of the title “KO Drum 100 Level.” ISA-5.1 states that loop numbers may be parallel,
using a single numeric sequence for all process variables, or serial, requiring a new
number for each process variable. Figure 1-3 illustrates a P&ID with a parallel
numbering system using a single loop number (100) with multiple process variables,
flow, level and temperature. The flow loop is FRC-100, the level loop is LIC-100, and
the temperature loop is TI-100.
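Because tag numbers follow a regular letters-plus-loop-number pattern, they lend themselves to programmatic handling, for example, when assembling an instrument index from a long tag list. The short Python sketch below splits a tag into its function letters and loop number. The tag format assumed here (and the regular expression that encodes it) is illustrative only; real projects define the exact format on their legend sheets.

import re

# Illustrative ISA-5.1-style tag: function letters, optional dash, loop number,
# optional suffix (e.g., "FRC-100", "LIC-100A"). Real formats are project-specific.
TAG_PATTERN = re.compile(r"^(?P<letters>[A-Z]{1,4})-?(?P<loop>\d+)(?P<suffix>[A-Z]?)$")

def parse_tag(tag: str) -> tuple[str, int, str]:
    """Split a tag number into function letters, loop number, and suffix."""
    match = TAG_PATTERN.match(tag.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized tag format: {tag!r}")
    return match.group("letters"), int(match.group("loop")), match.group("suffix")

# All devices in loop 100 share the number but differ in function letters:
for tag in ("LT-100", "LIC-100", "LY-100", "LV-100"):
    letters, loop, _ = parse_tag(tag)
    print(f"{tag}: function letters {letters}, loop {loop}")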
Figure 1-5 shows how tag marks may also identify the loop location or service. Number
prefixes, suffixes, and other systems can be used that further tie instruments to a P&ID,
a piece of equipment, or a location.
Instrument Lists
The instrument list, also called an instrument index, is an alphanumeric listing of all tag-marked components. The list or index provides a place for each tag identifier to
reference the relevant drawings and documents for that device.
Figure 1-6 is a partial listing that includes the level devices on the D-001 K.O. drum—
LG-1, level gauge; LT-100, level transmitter; and LI-100, level indicator—all of which
are shown in Figure 1-3. The list includes instruments on P&IDs that were not included
in the figures for this chapter. Figure 1-6 has nine columns: “Tag,” “Desc(ription),”
“P&ID,” “Spec Form,” “REQ(uisition),” “Location Plan,” “Install(ation) Detail,”
“Piping Drawing,” and “Loop Diagram.” The instrument list is developed by the A&C
group.
There is no ISA standard defining an instrument list; thus, the list may contain as many
columns as the project design team or the owner needs to support the work, including
design, procurement, maintenance, and operation. The data contained within the
document will have various uses during the life of the facility. In the example index,
level gauges show “n/a,” not applicable, in the loop drawing column because, for this
facility, gauges are mechanical devices that are not wired to the control system so no
loop drawings are made for them. This is a common approach, but your facility may
choose to prepare and use loop diagrams for all components even when they are not wired
to anything.
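Because the instrument index is a simple table keyed by tag number, it is commonly maintained in a spreadsheet or database. The Python sketch below writes two index rows using the nine column names from Figure 1-6; all of the field values are illustrative placeholders, not data from any real project.

import csv
import io

# Column names follow Figure 1-6; the values below are placeholders.
COLUMNS = ["Tag", "Desc", "P&ID", "Spec Form", "REQ",
           "Location Plan", "Install Detail", "Piping Drawing", "Loop Diagram"]

rows = [
    {"Tag": "LT-100", "Desc": "Level transmitter", "P&ID": "D-001",
     "Spec Form": "SF-100", "REQ": "RQ-12", "Location Plan": "LP-01",
     "Install Detail": "ID-07", "Piping Drawing": "PD-03", "Loop Diagram": "LD-100"},
    # A mechanical gauge not wired to the control system carries "n/a"
    # in the loop diagram column, as described above.
    {"Tag": "LG-1", "Desc": "Level gauge", "P&ID": "D-001",
     "Spec Form": "SF-101", "REQ": "RQ-12", "Location Plan": "LP-01",
     "Install Detail": "ID-02", "Piping Drawing": "PD-03", "Loop Diagram": "n/a"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())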
Specification Forms
The A&C group defines tag-marked physical or real devices on specification forms, also
known as data sheets. The forms include information useful to designers and device
suppliers during the design and procurement phase of a project, enabling vendors and
contractors to quote and supply the correct device. The forms record for maintenance
and operations the features and capabilities of the devices installed. The forms list all
the component information including materials, ratings, area classification, range,
signal, power, and service. A specification form is completed for each component. In
some cases, similar devices can all be listed on one specification form as long as the
complete model number of the device is the same for all tag numbers listed.
Let’s look at LT-100 from Figure 1-3. The P&ID symbol defines it as an electronic
displacement-type level transmitter. Figure 1-7 is the completed specification form for
LT-100. This form is from ISA-20-1981, Specification Forms for Process Measurement
and Control Instruments, Primary Elements, and Control Valves. There are many
variations of specification forms. Most engineering contractors have developed their
own set of forms; some control component suppliers have done this as well. ISA has a
revised set of forms in a “database/dropdown selection” format in technical report ISA-TR20.00.01-2001, Specification Forms for Process Measurement and Control
Instruments – Part 1: General Considerations. The purpose of all the forms is to aid the
A&C group in organizing the information needed to fully and accurately define control
components.
Logic Diagrams
Continuous process control is shown clearly on P&IDs; however, different presentations
are needed for on/off control. Logic diagrams are one form of these presentations. ISA’s
set of symbols is defined in ISA-5.2-1976 (R1992), Binary Logic Diagrams for
Process Operations.
ISA symbols AND, OR, NOT, and MEMORY (FLIP-FLOP) with an explanation of
their meaning are shown in Figures 1-8 and 1-9. Other sets of symbols and other
methods may be used to document on/off control: for example, text descriptions (a written explanation of the on/off system), ladder diagrams, or electrical elementary diagrams, known as elementaries.
Some designers develop a functional specification or operation description to describe
the intended operation of the system. These documents usually include a description of
the on/off control of the process.
Figure 1-10 is an illustration of a motor start circuit as an elementary diagram and in
ISA logic form.
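A motor start circuit of this kind can also be stated directly in Boolean terms. The Python sketch below shows the classic start/stop seal-in logic (an OR of the start command with a seal-in MEMORY contact, ANDed with NOT stop), which is one common reading of such a circuit; the actual contact arrangement in Figure 1-10 may differ.

def motor_seal_in(start: bool, stop: bool, running: bool) -> bool:
    """One scan of a start/stop seal-in circuit: OR the start button with
    the seal-in (MEMORY) contact, then AND with NOT stop."""
    return (start or running) and not stop

running = False
for start, stop in [(True, False), (False, False), (False, True), (False, False)]:
    running = motor_seal_in(start, stop, running)
    print(f"start={start}, stop={stop} -> running={running}")
# Prints True, True (sealed in), False (stopped), False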
Location Plans (Instrument Location Drawings)
There is no ISA standard that defines or describes a location plan or an instrument
location drawing. Location plans show the ordinate location and sometimes, if the user
chooses, the elevation of control components on plan drawings of a plant. These
predominantly orthographic drawings are useful to the people building the facility, and
they can be used by maintenance and operations as a road map of the system.
Figure 1-11 shows one approach for a location plan. It shows the approximate location
and elevation of the tag-marked devices included on the P&ID (see Figure 1-3), air
supplies for the devices, and the interconnection tubing needed to complete the
pneumatic loop. Other approaches to location plans might include conduit and cabling
information, and fitting and junction box information. Location plans are developed by
the A&C or electrical groups. They are used during construction and by maintenance
personnel after the plant is built to locate the various devices.
Installation Details
Installation details define the requirements for properly installing the tag-marked
devices. The installation details show process connections, pneumatic tubing, or conduit
connections; insulation and even winterizing requirements; and support methods. There
is no ISA standard that defines installation details. However, libraries of installation
details have been developed and maintained by engineering contractors, A&C device
suppliers, some plant owners, installation contractors, and some individual designers.
They all have the same aim—successful installation of the component so that it operates
properly and so it can be operated and maintained. However, they may differ in the
details as to how to achieve reliable operations.
Figure 1-12 shows one approach. This drawing includes a material list to aid in
procuring installation materials and to assist the installation personnel.
Installation details may be developed by the A&C group during the design phase.
However, they are sometimes developed by the installer during construction or by an
equipment supplier for the project.
Loop Diagrams
ISA’s Automation, Systems, and Instrumentation Dictionary defines a loop diagram as
“a schematic representation of a complete hydraulic, electric, magnetic, or pneumatic
circuit.”2 The circuit is called a loop. For a typical loop see Figure 1-4. ISA-5.4-1991,
Instrument Loop Diagrams, presents six typical loop diagrams, two each for pneumatic,
electronic, and shared display and control. One of each type shows the minimum items
required, and the other shows additional optional items.
Figure 1-13 is a loop diagram for electronic flow loop FIC-301. Loop diagrams are
helpful documents for maintenance and troubleshooting because they show how the
components are connected from the process to the control device, all on one sheet. Loop
diagrams are sometimes produced by the principal project A&C supplier, the installation
contractor, or by the plant owner’s operations, maintenance, or engineering personnel.
They are arguably less helpful for construction because there are other more efficient
presentations that are customized to only present information in support of initial
construction efforts, such as cable and conduit schedules and termination lists, which are
not discussed here because they are more appropriate for electrical disciplines than
A&C. Sometimes loop diagrams are produced on an as-needed basis after the plant is
running.
Standards and Regulations
Mandatory Standards
Federal, state, and local governments establish mandatory requirements in the form of
codes, laws, and regulations. The Food and Drug Administration issues Good
Manufacturing Practices. The National Fire Protection Association (NFPA) issues
Standard 70, the National Electrical Code (NEC). The United States government manages
about 50,000 mandatory standards. The Occupational Safety and Health Administration
(OSHA) issues many regulations including government document 29 CFR 1910.119,
Process Safety Management of Highly Hazardous Chemicals (PSM). There are three
paragraphs in the PSM that list documents that are required if certain hazardous
materials are handled. Some of these documents require input from the plant A&C
group.
Consensus Standards
Consensus standards include recommended practices, standards, and other documents
developed by professional societies and industry organizations. The standards developed
by ISA are the ones used most often by A&C personnel. Relevant ISA standards
include:
• ISA-5.1-2009, Instrumentation Symbols and Identification – Defines symbols for
A&C devices.
• ISA-TR5.1.01/ISA-TR77.40.01-2012, Functional Diagram Usage – Illustrates
usage of function block symbols and functions.
• ISA-5.2-1976 (R1992), Binary Logic Diagrams for Process Operations –
Provides additional symbols used on logic diagrams.
• ISA-5.3-1983, Graphic Symbols for Distributed Control/Shared Display
Instrumentation, Logic, and Computer Systems – Contains symbols useful for
DCS definition. The key elements of ISA-5.3 are now included in ISA-5.1, and
ISA-5.3 will be withdrawn in the future.
• ISA-5.4, Instrument Loop Diagrams – Includes additional symbols and six
typical instrument loop diagrams.
• ISA-5.5, Graphic Symbols for Process Displays – Establishes a set of symbols
used in process display.
Other ISA standards of interest include:
• ISA-20-1981, Specification Forms for Process Measurement and Control
Instruments, Primary Elements, and Control Valves – Provides standard
instrument specification forms, including space for principal descriptive options
to facilitate quoting, purchasing, receiving, accounting, and ordering.
• ISA-TR20.00.01-2001, Specification Forms for Process Measurement and
Control Instruments – Part 1: General Considerations – Updates ISA-20.
• ANSI/ISA-84.00.01-2004, Functional Safety: Safety Instrumented Systems for the
Process Industry Sector – Defines the requirements for safe systems.
• ANSI/ISA-88.01-1995 (R2006), Batch Control Part 1: Models and Terminology
– Shows the relationships involved between the models and the terminology.
In addition to ISA, other organizations develop documents to guide professionals. These
organizations include the American Petroleum Institute (API), American Society of
Mechanical Engineers (ASME), National Electrical Manufacturers Association
(NEMA), Process Industry Practice (PIP), International Electrotechnical Commission
(IEC), and the Technical Association of the Pulp and Paper Industry (TAPPI).
Operating Instructions
Operating instructions, also known as control narratives, are necessary to operate a
complex plant. They range from a few pages describing how to operate one part of a
plant to a complete set of books covering the operation of all parts of a facility. They
might be included in a functional specification or an operating description. There is no
ISA standard to aid in developing operating instructions. They might be prepared by a
group of project, process, electrical, and A&C personnel during plant design; however,
some owners prefer that plant operations personnel prepare these documents. The
operating instructions guide plant operators and other personnel during normal and
abnormal plant operation, including start-up, shutdown, and emergency operation of the
plant.
OSHA requires operating procedures for all installations handling hazardous chemicals.
Their requirements are defined in government document 29 CFR 1910.119(d) Process
Safety Information, (f) Operating Procedures, and (l) Management of Change. For many
types of food processing and drug manufacturing, the Food and Drug Administration
issues Good Manufacturing Practices.
Other Resources
Standards
ANSI/ISA-5.1-2009. Instrumentation Symbols and Identification. Research Triangle
Park, NC: ISA (International Society of Automation).
ANSI/ISA-88.01-1995 (R2006). Batch Control Part 1: Models and Terminology.
Research Triangle Park, NC: ISA (International Society of Automation).
ISA. Automation, Systems, and Instrumentation Dictionary. 4th ed. Research Triangle
Park, NC: ISA (International Society of Automation), 2003.
ISA-TR5.1.01/ISA-TR77.40.01-2012. Functional Diagram Usage. Research Triangle
Park, NC: ISA (International Society of Automation).
ISA-5.2-1976 (R1992). Binary Logic Diagrams for Process Operations. Research
Triangle Park, NC: ISA (International Society of Automation).
ISA-5.3-1983. Graphic Symbols for Distributed Control/Shared Display
Instrumentation, Logic, and Computer Systems. Research Triangle Park, NC: ISA
(International Society of Automation).
ISA-5.4-1991. Instrument Loop Diagrams. Research Triangle Park, NC: ISA
(International Society of Automation).
ISA-5.5-1985. Graphic Symbols for Process Displays. Research Triangle Park, NC: ISA
(International Society of Automation).
ISA-20-1981. Specification Forms for Process Measurement and Control Instruments,
Primary Elements, and Control Valves. Research Triangle Park, NC: ISA
(International Society of Automation).
ISA-TR20.00.01-2007. Specification Forms for Process Measurement and Control
Instruments – Part 1: General Considerations. Research Triangle Park, NC: ISA
(International Society of Automation).
ISA-84.00.01-2004 (IEC 61511 Mod). Functional Safety: Safety Instrumented Systems
for the Process Industry Sector. Research Triangle Park, NC: ISA (International
Society of Automation).
NFPA 70. National Electrical Code (NEC). Quincy, MA: NFPA (National Fire Protection
Association).
OSHA 29 CFR 1910.119(d) Process Safety Information, (f) Operating Procedures, and
(l) Management of Change. Washington, DC: OSHA (Occupational Safety and
Health Administration).
Books
Meier, Frederick A., and Clifford A. Meier. Instrumentation and Control System
Documentation. Research Triangle Park, NC: ISA (International Society of
Automation), 2011.
Training
ISA FG15E. Developing and Applying Standard Instrumentation and Control
Documentation. Online training course. Research Triangle Park, NC: ISA
(International Society of Automation).
About the Authors
Frederick A. Meier’s career in engineering and engineering management spanned 50
years, and he was an active member of ISA for more than 40 years. He earned an ME
from Stevens Institute of Technology and an MBA from Rutgers University, and he has
held Professional Engineer licenses in the United States and in Canada. Meier and his
son, Clifford Meier, are the authors of Instrumentation and Control System
Documentation published by ISA in 2004. He lives in Chapel Hill, North Carolina.
Clifford A. Meier began his career in 1978 as a mechanical engineer for fossil and
nuclear power generation projects. He quickly transitioned to instrumentation and
control system design for oil and gas production facilities, cogeneration plants, chemical
plants, paper and pulp mills, and microelectronic chip factories. Meier left the
consulting engineering world in 2004 to work for a municipal utility in Portland,
Oregon. He retired in 2017. Meier holds a Professional Engineer license in control
systems, and lives in Beaverton, Oregon.
1. The Automation, Systems, and Instrumentation Dictionary, 4th ed. (Research Triangle Park, NC: ISA [International Society of Automation], 2003): 273.
2. Ibid., 299.
2
Continuous Control
By Harold Wade
Introduction
Continuous control refers to a form of automatic process control in which the
information—from sensing elements and actuating devices—can have any value
between minimum and maximum limits. This is in contrast to discrete control, where
the information normally is in one of two states, such as on/off, open/closed, and
run/stop.
Continuous control is organized into feedback control loops, as shown in Figure 2-1. In
addition to a controlled process, each control loop consists of a sensing device that
measures the value of a controlled variable, a controller that contains the control logic
plus provisions for human interface, and an actuating device that manipulates the rate of
addition or removal of mass, energy, or some other property that can affect the
controlled variable. The sensor, control and human-machine interface (HMI) station,
and actuator are usually connected by some form of signal communication system, as
described elsewhere in this book.
Continuous process control is used extensively in industries where the product is in a
continuous, usually fluid, stream. Representative industries are petroleum refining,
chemical and petrochemical, power generation, and municipal utilities. Continuous
control can also be found in processes in which the final product is produced in batches,
strips, slabs, or as a web in, for example, the pharmaceutical; pulp and paper; steel; and
textile industries. There are also applications for continuous control in the discrete
industries—for instance, a temperature controller on an annealing furnace or motion
control in robotics.
The central device in a control loop, the controller, may be built as a stand-alone device
or may exist as shared components in a digital system, such as a distributed control
system (DCS) or programmable logic controller (PLC). In emerging technology, the
control logic may be located at either the sensing or the actuating device.
Process Characteristics
In order to understand feedback control loops, one must understand the characteristics
of the controlled process. Listed below are characteristics of almost all processes,
regardless of the application or industry.
• Industrial processes are nonlinear; that is, they will exhibit different responses at
different operating points.
• Industrial processes are subject to random disturbances, due to fluctuations in
feedstock, environmental effects, and changes or malfunctions of equipment.
• Most processes contain some amount of dead time; a control action will not
produce an immediate feedback of its effect.
• Many processes are interacting; a change in one controller’s output may affect
other process variables besides the intended one.
• Most process measurements contain some amount of random variation, called
noise.
• Most processes are unique; processes using apparently identical equipment may
have individual idiosyncrasies.
A typical response to a step change in signal to the actuating device is shown in Figure
2-2.
In addition, there are physical and environmental characteristics that must be considered
when selecting equipment and installing control systems.
• The process may be toxic, requiring exceptional provisions to prevent release to
the environment.
• The process may be highly corrosive, limiting the selection of materials for
components that come in contact with the process.
• The process may be highly explosive, requiring special equipment housing or
installation technology for electrical apparatus.
Feedback Control
The principle of feedback control is this: if a controlled variable deviates from its
desired value (set point), corrective action moves a manipulated variable (the controller
output) in a direction that causes the controlled variable to return toward the set point.
Most feedback control loops in industrial processes utilize a proportional-integral-derivative (PID) control algorithm. There are several forms of the PID algorithm, and
their names are not standardized. The names ideal, interactive, and parallel are used
here, although some vendors may use other names.
Ideal PID Algorithm
The most common form of PID algorithm is the ideal form (also called the
noninteractive form or the ISA form). This is represented in mathematical terms by
Equation 2-1, and in block diagram form by Figure 2-3.
\[ m = K_C \left( e + \frac{1}{T_I} \int_0^t e \, dt + T_D \frac{de}{dt} \right) \tag{2-1} \]
Here, m represents the controller output; e represents the error (difference between the
set point and the controlled variable). Both m and e are in percent of span. The symbols
KC (controller gain), TI (integral time), and TD (derivative time) represent tuning
parameters that must be adjusted for each application.
The terms in the algorithm represent the proportional, integral, and derivative
contributions to the output. The proportional mode is responsible for most of the
correction. The integral mode assures that, in the long term, there will be no deviation
between the set point and the controlled variable. The derivative mode may be used for
improved response of the control loop. In practice, the proportional and integral modes
are almost always used; the derivative mode is often omitted, simply by setting TD = 0.
There are other forms for the tuning parameters. For instance, controller gain may be
expressed as proportional band (PB), which is defined as the amount of measurement
change (in percent of measurement span) required to cause 100% change in the
controller output. The conversion between controller gain and proportional band is
shown by Equation 2-2:
\[ PB = \frac{100}{K_C} \tag{2-2} \]
The integral mode tuning parameter may be expressed in reciprocal form, called reset
rate. Whereas TI is normally expressed in “minutes per repeat,” the reset rate is
expressed in “repeats per minute.” The derivative mode tuning parameter, TD, is always
in time units, usually in minutes. (Traditionally, the time units for tuning parameters have
been “minutes.” Today, however, some vendors are expressing the time units in
“seconds.”)
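To make the ideal algorithm concrete, here is a minimal discrete-time sketch in Python, using the positional form with a fixed sample period. It illustrates Equation 2-1 only; output clamping is shown, but anti-reset-windup, bumpless transfer, and the other practical features of a vendor implementation are omitted.

class IdealPID:
    """Positional ideal (ISA) PID per Equation 2-1.

    All signals are in percent of span; TI and TD are in the same
    time units as the sample period dt (minutes or seconds).
    """
    def __init__(self, kc: float, ti: float, td: float = 0.0, dt: float = 1.0):
        self.kc, self.ti, self.td, self.dt = kc, ti, td, dt  # ti must be nonzero
        self.integral = 0.0      # running integral of error
        self.prev_error = 0.0

    def update(self, setpoint: float, measurement: float) -> float:
        e = setpoint - measurement                 # error, reverse-acting convention
        self.integral += e * self.dt
        de = (e - self.prev_error) / self.dt       # derivative-on-error
        self.prev_error = e
        m = self.kc * (e + self.integral / self.ti + self.td * de)
        return min(100.0, max(0.0, m))             # clamp to 0-100% (no anti-windup)

# Example: KC = 2 (PB = 50% by Equation 2-2), TI = 5 min, sampled every 0.1 min
pid = IdealPID(kc=2.0, ti=5.0, td=0.0, dt=0.1)
output = pid.update(setpoint=50.0, measurement=47.0)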
Interactive PID Algorithm
The interactive form, depicted by Figure 2-4, was the predominant form for analog
controllers and is used by some vendors today. Other vendors provide a choice of the
ideal or interactive form. There is essentially no technological advantage to either form;
however, the required tuning parameters differ if the derivative mode is used.
Parallel PID Algorithm
The parallel form, shown in Figure 2-5, uses independent gains on each mode. This
form has traditionally been used in the power generation industry and in such
applications as robotics, flight control, and motion control. Other than power generation,
it is rarely found in the continuous process industries. With compatible tuning, the ideal,
interactive, and parallel forms of PID produce identical performance; hence no
technological advantage can be claimed for any form. The tuning procedure for the
parallel form differs decidedly from that of the other two forms.
Time Proportioning Control
Time proportioning refers to a form of control in which the PID controller output
consists of a series of periodic pulses whose duration is varied to relate to the normal
continuous output. For example, if the fixed cycle base is 10 seconds, a controller output
of 30% will produce an on pulse of 3 seconds and an off pulse of 7 seconds. An output
of 75% will produce an on pulse of 7.5 seconds and an off pulse of 2.5 seconds. This
type of control is usually applied where the cost of an on/off final actuating device is
considerably less than the cost of a modulating device. In a typical application, the on
pulses apply heating or cooling by turning on a resistance-heating element, a silicon-controlled rectifier (SCR), or a solenoid valve. The mass of the process unit (such as a
plastics extruder barrel) acts as a filter to remove the low-frequency harmonics and
apply an even amount of heating or cooling to the process.
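The pulse arithmetic just described is easy to state in code. In the sketch below, the 10-second cycle base and the 30% and 75% outputs are taken from the example in the text; the function itself is an illustration, not a vendor algorithm.

def time_proportion(output_pct: float, cycle_s: float = 10.0) -> tuple[float, float]:
    """Convert a continuous controller output (0-100%) into on/off pulse durations."""
    output_pct = min(100.0, max(0.0, output_pct))
    on_time = cycle_s * output_pct / 100.0
    return on_time, cycle_s - on_time

print(time_proportion(30.0))   # (3.0, 7.0): 3 s on, 7 s off
print(time_proportion(75.0))   # (7.5, 2.5): 7.5 s on, 2.5 s off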
Manual-Automatic Switching
It is desirable to provide a means for process operator intervention in a control loop in
the event of abnormal circumstances, such as a sensor failure or a major process upset.
Figures 2-3, 2-4, and 2-5 show a manual/automatic switch that permits switching
between manual and automatic modes. In the manual mode, the operator can set the
signal to the controller output. However, when the switch is returned to the automatic
position, the automatic controller output must match the operator’s manual setting or
else there will be a “bump” in the controller output. (The term bumpless transfer is
frequently used.) With older technology, it was the operator’s responsibility to prevent
bumping the process. With current technology, bumpless transfer is built into most
control systems; some vendors refer to this as initializing the control algorithm.
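One common way to achieve bumpless transfer is to back-calculate the integral contribution at the instant of switching so that the first automatic output equals the operator's last manual output. The sketch below applies that idea to the IdealPID class shown earlier; it is an illustrative assumption about how such initialization might be done, not any particular vendor's method.

def initialize_for_auto(pid: IdealPID, setpoint: float,
                        measurement: float, manual_output: float) -> None:
    """Back-calculate the integral so the first automatic output matches
    the manual output (exact to within one integration step)."""
    e = setpoint - measurement
    # Solve m = Kc * (e + integral/Ti) for the integral term (TD ignored at the switch):
    pid.integral = (manual_output / pid.kc - e) * pid.ti
    pid.prev_error = e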
Direct- and Reverse-Acting
For safety and environmental reasons, most final actuators, such as valves, will close in
the event of a loss of signal or power to the actuator. There are instances, however, when
the valve should open in the event of signal or power failure. Once the failure mode of
the valve is determined, the action of the controller must be selected. Controllers may be
either direct-acting (DA) or reverse-acting (RA). If a controller is direct-acting, an
increase in the controlled variable will cause the controller output to increase. If the
controller is reverse-acting, an increase in the controlled variable will cause the output
to decrease. Because most control valves are fail-closed, the majority of the controllers
are set to be reverse-acting. The setting— DA or RA—is normally made at the time the
control loop is commissioned. With some DCSs, the DA/RA selection can be made
without considering the failure mode of the valve; then a separate selection is made as to
whether to reverse the analog output signal. This permits the HMI to display all valve
positions in a consistent manner, which is 0% for closed and 100% for open.
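In implementation terms, the DA/RA setting is a sign convention on the error term fed to the PID calculation. A minimal sketch (the function name is illustrative):

def error(sp, cv, direct_acting=False):
    """Return the error term for a PID calculation.
    Reverse-acting (the common case with fail-closed valves): the
    output rises when the CV falls below the set point.
    Direct-acting: the output rises when the CV rises above the set point."""
    return (cv - sp) if direct_acting else (sp - cv)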
Activation for Proportional and Derivative Modes
Regardless of algorithm form, there are certain configuration options that every vendor
offers. One configuration option is the DA/RA setting. Other configuration options
pertain to the actuating signal for the proportional and derivative modes. Note that, in
any of the forms of algorithms, if the derivative mode is being used (TD ≠ 0), a set point
change will induce an undesirable spike on the controller output. A configuration option
permits the user to make the derivative mode sensitive only to changes in the controlled
variable, not to the set point. This choice is called derivative-on-error or derivative-on-measurement.
Even with derivative-on-measurement, on a set-point change, the proportional mode
will cause a step change in the controller output. This, too, may be undesirable.
Therefore, a similar configuration option permits the user to select proportional-on-measurement or proportional-on-error. Figure 2-6 shows both proportional and
derivative modes sensitive to measurement changes alone. This leaves only the integral
mode on error, where it must remain, because it is responsible for assuring the long-term
equality of the set point and the controlled variable. In the event of a disturbance, there is no difference between the responses of a configuration using either or both derivative-on-measurement and proportional-on-measurement and a configuration with all modes on error.
Two-Degree-of-Freedom PID
A combination of Figure 2-3 (ideal PID algorithm) and Figure 2-6 (ideal PID algorithm
with proportional and derivative modes on measurement) is shown in Figure 2-7 and
described in mathematical terms by Equation 2-3. If the parameters b and c each have
values of 0 or 1, then this would be the equivalent of providing configuration options of
proportional-on-measurement or proportional-on-error, and derivative-on-measurement
or derivative-on-error. Although the implementation details may differ, some
manufacturers have taken this conceptual idea even further by permitting parameters b
and c to take on any value equal to or between 0 and 1. For instance, rather than having
the derivative mode wholly on error or wholly on measurement, a fraction of the signal
can come from error and the complementary fraction can come from measurement.
Such a controller is called a two-degree-of-freedom control algorithm. Some
manufacturers only provide the parameter modification b on the proportional mode.
This configuration is normally called a set-point weighting controller.
m = KC [(b·SP – CV) + (1/TI) ∫ e dt + TD d(c·SP – CV)/dt], where e = SP – CV
(2-3)
A problem often encountered with an ideal PID, or one of its variations, is that if it is tuned to give an acceptable response to a set-point change, it may not be sufficiently aggressive in eliminating the effect of a disturbance. On the other hand, if it is tuned to aggressively eliminate a disturbance effect, it may respond too aggressively to a set-point change. By utilizing the freedom provided by the two additional tuning
parameters, the two-degree-of-freedom control algorithm can be tuned to produce an
acceptable response to both a set-point change and a disturbance. This is due to the fact
that, in essence, there is a different controller acting on a disturbance from the controller
that acts on a set-point change. Compare the loop diagrams shown in Figures 2-8a and
2-8b.
Discrete Forms of PID
The algorithm forms presented above, using calculus symbols, are applicable to analog
controllers that operate continuously. However, control algorithms implemented in a
digital system are processed at discrete sample instants (for instance, 1-second
intervals), rather than continuously. Therefore, a modification must be made to show
how a digital system approximates the continuous forms of the algorithm presented
above. Digital processing of the PID algorithm also presents an alternative that was not
present in analog systems. At each sample instant, the PID algorithm can calculate
either a new position for the controller output or the increment by which the output
should change. These forms are called the position and the velocity forms, respectively.
Assuming that the controller is in the automatic mode, the following steps (Equation 2-4) are executed at each sample instant for the position algorithm. (The subscript “n”
refers to the nth processing instant, “n-1” to the previous processing instant, and so on.)
en = SPn – CVn
Sn = Sn–1 + (ΔT/TI) en, where ΔT is the sample interval
mn = KC [en + Sn + (TD/ΔT)(en – en–1)]
(2-4)
Save Sn and en values for the subsequent processing time.
The velocity mode or incremental algorithm is similar. It computes the amount by which
the controller output should be changed at the nth sample instant.
Use Equation 2-5 to compute the change in controller output:
Δmn = KC [(en – en–1) + (ΔT/TI) en + (TD/ΔT)(en – 2en–1 + en–2)]
(2-5)
Add the incremental output to the previous value of controller output, to create the new
value of output (Equation 2-6):
mn = mn–1 + Δmn
(2-6)
Save mn, en–1, and en–2 values for the subsequent processing time.
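The two discrete forms can be sketched in Python (a minimal illustration consistent with the standard position and velocity equations given above; class and variable names are illustrative, and TI and ΔT are assumed nonzero):

class PositionPID:
    """Position form: computes a new output m_n at each sample instant."""
    def __init__(self, kc, ti, td, dt):
        self.kc, self.ti, self.td, self.dt = kc, ti, td, dt
        self.s = 0.0       # running integral sum, S_n
        self.e_prev = 0.0  # e_(n-1)

    def update(self, sp, cv):
        e = sp - cv                       # reverse-acting error
        self.s += (self.dt / self.ti) * e
        m = self.kc * (e + self.s + (self.td / self.dt) * (e - self.e_prev))
        self.e_prev = e                   # save e_n for the next scan
        return m

class VelocityPID:
    """Velocity (incremental) form: computes the change in output,
    then adds it to the previous output (Equation 2-6)."""
    def __init__(self, kc, ti, td, dt, m0=0.0):
        self.kc, self.ti, self.td, self.dt = kc, ti, td, dt
        self.m = m0       # last output, m_(n-1)
        self.e1 = 0.0     # e_(n-1)
        self.e2 = 0.0     # e_(n-2)

    def update(self, sp, cv):
        e = sp - cv
        dm = self.kc * ((e - self.e1)
                        + (self.dt / self.ti) * e
                        + (self.td / self.dt) * (e - 2 * self.e1 + self.e2))
        self.m += dm                      # m_n = m_(n-1) + dm
        self.e2, self.e1 = self.e1, e     # save errors for the next scan
        return self.m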
From a user point of view, there is no advantage of one form over the other. Vendors,
however, may prefer a particular form due to the ease of incorporation of features of
their system, such as tuning and bumpless transfer.
The configuration options—DA/RA, derivative-on-measurement or derivative-on-error, and proportional-on-measurement or proportional-on-error—are also applicable to the discrete forms of PID. In fact, there
are more user configuration options offered with digital systems than were available
with analog controllers.
Controller Tuning
In the previous section, it was mentioned that the parameters KC, TI (or their equivalents, proportional band and reset rate), and TD must be adjusted so that the response of the
controller matches the requirements of a particular process. This is called tuning the
controller. There are no hard and fast rules as to the performance requirements for
tuning. These are largely established by the particular process application and by the
desires of the operator or controller tuner.
Acceptable Criteria for Loop Performance
One widely used response criterion is this: the loop should exhibit a quarter-amplitude
decay following a set-point change. See Figure 2-9. For many applications, however,
this is too oscillatory. A smooth response to a set-point change with minimum overshoot
is more desirable. A response to a set-point change that provides minimum overshoot is
considered less aggressive than quarter-amplitude decay. If an ideal PID controller (see
Figure 2-3) is being used, the penalty for less aggressive tuning is that a disturbance will
cause a greater deviation from the set point or a longer time to return to the set point. In
this case, the controller tuner must decide the acceptable criterion for loop performance
before actual tuning.
Recent developments, applicable to digital controllers, have provided the two-degree-of-freedom PID controller, conceptually shown in Figure 2-7. With this configuration
(which is not yet offered by all manufacturers), the signal path from the set point to the
controller output is different from the signal path from the measurement to the controller
output. This permits the controller to be tuned for acceptable response to both
disturbances and set-point changes. Additional information regarding the two-degree-of-freedom controller can be found in Aström and Hägglund (2006).
Controller tuning techniques may be divided into two broad categories: those that
require testing of the process, either with the controller in automatic or manual, and
those that are less formal, often called trial-and-error tuning.
Tuning from Open-Loop Tests
The open-loop process testing method uses only the manually set output of the
controller. A typical response to a step change in output was shown in Figure 2-2. It is
often possible to approximate the response with a simplified process model containing
only three parameters—the process gain (Kp), the dead time in the process (Td), and the
process time constant (Tp). Figure 2-10 shows the response of a first-order-plus-dead-time (FOPDT) model that approximates the true process response.
Figure 2-10 also shows the parameter values, Kp, Td, and Tp. There are several published
correlations for obtaining controller tuning parameters from these process parameters.
The best known is based on the Ziegler-Nichols Reaction Curve Method. Correlations
for P-only, PI, and PID controllers are given here in Table 2-1.
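The classic Ziegler-Nichols reaction-curve correlations are widely published and can be coded directly. The sketch below uses the commonly cited textbook constants; Table 2-1 in the text should be taken as the authoritative source:

def zn_open_loop(kp, td, tp, mode="PID"):
    """Ziegler-Nichols reaction-curve tuning from an FOPDT model:
    process gain kp, dead time td, and time constant tp (td and tp
    in the same time units)."""
    if mode == "P":
        return {"KC": tp / (kp * td)}
    if mode == "PI":
        return {"KC": 0.9 * tp / (kp * td), "TI": 3.33 * td}
    if mode == "PID":
        return {"KC": 1.2 * tp / (kp * td), "TI": 2.0 * td, "TD": 0.5 * td}
    raise ValueError(mode)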
Another tuning technique that uses the same open-loop process test data is called
lambda tuning. The objective of this technique is for the set-point response to be an
exponential rise with a specified time constant, λ. This technique is applicable whenever
it is desired to have a very smooth set-point response, at the expense of degraded
response to disturbances.
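One widely cited lambda rule for a PI controller on an FOPDT process is sketched below; the exact correlation varies by reference, so treat the constants as illustrative:

def lambda_pi(kp, td, tp, lam):
    """Lambda tuning for a PI controller on an FOPDT process.
    lam is the desired closed-loop time constant: a larger lam gives
    a smoother, less aggressive set-point response."""
    kc = tp / (kp * (lam + td))   # controller gain
    ti = tp                       # integral time set to the process time constant
    return {"KC": kc, "TI": ti}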
There are other elaborations of the open-loop test method, including multiple valve
movements in both directions and numerical regression methods for obtaining the
process parameters. Despite its simplicity, the open-loop method suffers from the
following problems:
• It may not be possible to interrupt normal process operations to make the test.
• If there is noise on the measurement, it may not be possible to get good data,
unless the controlled variable change is at least five times the amplitude of the
noise. For many processes, that may be too much disturbance.
• The technique is very sensitive to parameter estimation error, particularly if the
ratio of Td/Tp is small.
• The method does not take into consideration the effects of valve stiction.
• The actual process response may be difficult to approximate with an FOPDT
model.
• A disturbance to the process during the test will severely deteriorate the quality of
the data.
• For very slow processes, the complete results of the test may require one or more
working shifts.
• The data is valid only at one operating point. If the process is nonlinear,
additional tests at other operating points may be required.
Despite these problems, under relatively ideal conditions—minimal process noise,
minimal disturbances during the test, minimal valve stiction, and so on—the method
provides acceptable results.
Tuning from Closed-Loop Tests
Another technique is based on testing the process in the closed loop. (Ziegler-Nichols
referred to this as the ultimate sensitivity method.) To perform this test, the controller is
placed in the automatic mode, integral and derivative actions are removed (or a
proportional-only controller is used), a low controller gain is set, and then the process is
disturbed—either by a set-point change or a forced disturbance—and the oscillating
characteristics are observed. The objective is to repeat this procedure with increased
gain until sustained oscillation (neither increasing nor decreasing) is achieved. At that
point, two pieces of data may be obtained: the value of controller gain (called the
ultimate gain, or KCU) that produced sustained oscillation, and the period of the
oscillation, PU. With this data, one can use Table 2-2 to calculate tuning parameters for a
P-only, PI, or PID controller.
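The classic Ziegler-Nichols ultimate-sensitivity correlations are commonly given as in the sketch below; Table 2-2 in the text should be taken as the authoritative source:

def zn_closed_loop(kcu, pu, mode="PID"):
    """Ziegler-Nichols ultimate-sensitivity tuning from the
    ultimate gain kcu and ultimate period pu."""
    if mode == "P":
        return {"KC": 0.5 * kcu}
    if mode == "PI":
        return {"KC": 0.45 * kcu, "TI": pu / 1.2}
    if mode == "PID":
        return {"KC": 0.6 * kcu, "TI": pu / 2.0, "TD": pu / 8.0}
    raise ValueError(mode)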
There are also problems with the closed-loop method:
• It may not be possible to subject the process to a sustained oscillation.
• Even if that were possible, it is difficult to predict or to control the magnitude of
the oscillation.
• Multiple tests may be required, resulting in long periods of interruption to normal
operation.
Despite these problems, there are certain advantages to the closed-loop method:
• Minimal uncertainty in the data. (Frequency, or its inverse, period, can be
measured quite accurately.)
• The method inherently includes the effect of a sticking valve.
• Moderate disturbances during the testing can be tolerated.
• No a priori assumption as to the form of the process model is required.
A modification of the closed-loop method, called the relay method, attempts to exploit
the advantages while circumventing most of the problems. The relay method utilizes the
establishment of maximum and minimum limits for the controller output. For instance,
if the controller output normally is 55%, the maximum output can be set at 60% and the
minimum at 50%. While this does not establish hard limits for excursion of the
controlled variable, persons familiar with this process will feel comfortable with these
settings or will reduce the difference between the limits.
The process is then tested by a set-point change or a forced disturbance, using an on/off
controller. If an on/off controller is not available, then a P-only controller with a
maximum value of controller gain can be substituted. For a reverse-acting controller, the
controlled variable will oscillate above and below the set point, with the controller
output at either the maximum or minimum value, as shown in Figure 2-11.
If the period of time when the controller output is at the maximum setting exceeds the
time at the minimum, then both the maximum and minimum limits should be shifted
upward by a small but identical amount. After one or more adjustments, the output
square wave should be approximately symmetrical. At that condition, the period of
oscillation, PU, is the same as would have been obtained by the previously described
closed-loop test. Furthermore, the ultimate gain can be determined from a ratio of the
controller output and CV amplitudes as shown in Equation 2-7:
KCU = 4d / (πa), where d is the amplitude of the controller output square wave and a is the amplitude of the controlled variable oscillation
(2-7)
Thus, the data required to enter in Table 2-2 and calculate tuning parameters have been
obtained in a much more controlled manner than the unbounded closed-loop test.
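Assuming the standard describing-function result for the relay method (the familiar 4/π relation, consistent with Equation 2-7 as reconstructed above), the ultimate gain follows directly from the two amplitudes:

import math

def relay_ultimate_gain(output_amplitude, cv_amplitude):
    """Estimate the ultimate gain from a relay test.
    output_amplitude: half the peak-to-peak swing of the output square wave.
    cv_amplitude: half the peak-to-peak swing of the controlled variable."""
    return 4.0 * output_amplitude / (math.pi * cv_amplitude)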
While the relay method is a viable technique for manual testing, it can also be easily
automated. For this reason, it is the basis for some vendors’ self-tuning techniques.
Trial-and-Error Tuning
Despite these tools for formal process testing for determination of tuning parameters,
many loops are tuned by trial-and-error. That is, for an unsatisfactory loop, closed-loop
behavior is observed, and an estimate (often merely a guess) is made as to which
parameter(s) should be changed and by how much. Good results often depend on the
person’s experience. Various methods of visual pattern recognition have been described
but, in general, such tuning techniques remain more of an art than a science.
In the book Basic and Advanced Regulatory Control: System Design and Application (Wade 2017), a technique called improving as-found tuning, or intelligent trial-and-error tuning, attempts to place controller tuning on a more methodological basis. The
premise of this technique, which is applicable only to PI controllers, is that a well-tuned
controller exhibiting a slight oscillation (oscillations that are decaying rapidly) will have
a predictable relation between the integral time and period of oscillation. The relation in
Equation 2-8 has been found to provide acceptable results:
1.5 ≤ P/TI ≤ 2.0
(2-8)
Further insight into this technique can be gained by noting that the phase shift through a
PI controller, from error to controller output, depends strongly on the ratio P/TI and only
slightly on the decay ratio. For a control loop with a quarter-amplitude decay, the limits
above are equivalent to specifying a phase shift of approximately 15°.
If a control system engineer or instrumentation technician is called on to correct the
errant behavior of a control loop, then (assuming that it is a tuning problem and not
some external problem) the as-found behavior is caused by the as-found tuning
parameter settings. The behavior can be characterized by the decay ratio (DR) and the
period (P) of oscillation. The as-found data set—KC, TI, DR, and P—represents a quantum of knowledge about the process. If either an open-loop or closed-loop test were
made in an attempt to determine tuning parameters, then the existing knowledge about
the process would be sacrificed.
From Equation 2-8, the upper and lower limits for an acceptable period can be
established using Equation 2-9.
1.5TI ≤ P ≤ 2.0TI
(2-9)
If the as-found period P meets these criteria, the implication is that the integral time is
acceptable. Hence, adjustments should be made to the controller gain (KC) until the
desired decay ratio is obtained. If the period is outside this limit, then the present period
can be used in the inverted relation to determine a range of acceptable new values for TI
(Equation 2-10).
0.5P ≤ TI ≤ 0.67P
(2-10)
Basic and Advanced Regulatory Control: System Design and Application (Wade 2017)
and “Trial and Error: An Organized Procedure” (Wade 2005) contain more information,
including a flow chart, describing this technique.
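The decision logic of Equations 2-9 and 2-10 is simple enough to sketch directly (an illustration only; Wade (2017) gives the full flow chart):

def as_found_advice(ti, period):
    """Intelligent trial-and-error check for a PI loop: compare the
    observed oscillation period with the current integral time
    (Equations 2-9 and 2-10)."""
    if 1.5 * ti <= period <= 2.0 * ti:
        return "TI acceptable; adjust KC until the desired decay ratio is reached."
    lo, hi = 0.5 * period, 0.67 * period
    return f"Retune integral time: choose TI between {lo:.3g} and {hi:.3g}."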
Self-Tuning
Although self-tuning, auto-tuning, and adaptive-tuning have slightly different
connotations, they will be discussed collectively here.
There are two different circumstances where some form of self-tuning would be
desirable:
1. If a process is highly nonlinear and also experiences a wide range of operating
points, then a technique that automatically adjusts the tuning parameters for
different operating conditions would be highly beneficial.
2. If a new process unit with many control loops were to be commissioned, it
would be beneficial if the controllers could determine their own best tuning
parameters.
There are different technologies that address these situations.
For initial tuning, there are commercial systems that, in essence, automate the open-loop
test procedure. On command, the controller will revert to the manual mode, test the
process, characterize the response by a simple process model, and then determine
appropriate tuning parameters. Most commercial systems that follow this procedure
display the computed parameters and await confirmation before entering the parameters
into the controller. An automation of the relay tuning method described previously falls
into this category.
The simplest technique addressing the nonlinearity problem is called scheduled tuning.
If the nonlinearity of a process can be related to a key parameter such as process
throughput, then a measure of that parameter can be used as an index to a lookup table
(schedule) for appropriate tuning parameters. The key parameter may be divided into
regions, with suitable tuning parameters listed for each region. Note that this technique
depends on the correct tabulation of tuning parameters for each region. There is nothing
in the technique that evaluates the loop performance and automatically adjusts the
parameters based on the evaluation.
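A scheduled-tuning lookup can be as simple as a table keyed by the region of the scheduling parameter. The sketch below is purely illustrative; the regions and tuning values are hypothetical:

# Hypothetical schedule: tuning regions indexed by process throughput (%).
TUNING_SCHEDULE = [
    # (throughput upper bound, KC, TI in minutes)
    (40.0,  2.0, 5.0),   # low-throughput region
    (70.0,  1.2, 4.0),   # mid-throughput region
    (100.0, 0.8, 3.0),   # high-throughput region
]

def scheduled_tuning(throughput_pct):
    """Return the tabulated tuning for the region containing the current
    throughput. Note that nothing here evaluates loop performance;
    the tabulated values themselves must be correct."""
    for upper, kc, ti in TUNING_SCHEDULE:
        if throughput_pct <= upper:
            return {"KC": kc, "TI": ti}
    raise ValueError("throughput out of range")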
There are also systems that attempt to recognize features of the response to normal
disturbances to the loop. From these features, heuristic rules are used to calculate new
tuning parameters. These may be displayed for confirmation, or they may be entered
into the algorithm “on the fly.” Used in this manner, the system tries to adapt the
controller to the random environment of disturbances and set-point changes as they
occur.
There are also third-party packages, typically running on a notebook computer, that
access data from the process, such as by transferring data from the DCS data highway.
The data is then analyzed and advisory messages are presented that suggest tuning
parameters and provide an indication of the “health” of control loop components,
especially the valve.
Advanced Regulatory Control
If the process disturbances are few and not severe, feedback controllers will maintain
the average value of the controlled variable at the set point. But in the presence of
frequent or severe disturbances, feedback controllers permit significant variability in the
control loop. This is because a feedback controller must experience a deviation from the
set point in order to change its output. This variability may result in an economic loss.
For instance, a process may operate at a safe margin away from a target value to prevent
encroaching on the limit and producing off-spec product. Reducing the margin of safety
will produce some economic benefit, such as reduced energy consumption, reduced raw
material usage, or increased production. Reducing the variability cannot be done by
feedback controller tuning alone. It may be accomplished using more advanced control
loops such as ratio, cascade, feedforward, decoupling, and selector control.
Ratio Control
Often, when two or more ingredients are blended or mixed, the flow rate of one of the
ingredients paces the production rate. The flow rates for the other ingredients are
controlled to maintain a specified ratio to the pacing ingredient. Figure 2-12 shows a
ratio control loop. Ratio control systems are found in batch processing, fuel oil blending,
combustion processes where the air flow may be ratioed to the fuel flow, and many
other applications. The pacing stream is often called the wild flow; it may or may not be provided with an independent flow rate controller, because only a measurement of the wild flow stream is utilized in ratio control.
The specified ratio may be manually set, automatically set from a batch recipe, or
adjusted by the output of a feedback controller. An example of the latter is a process
heater that uses a stack oxygen controller to adjust the air-to-fuel ratio. When the
required ratio is automatically set by a higher-level feedback controller, the ratio control
strategy is merely one form of feedforward control.
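At its core, the ratio station is one multiplication per scan (a minimal sketch; the names are illustrative):

def ratio_setpoint(wild_flow, ratio):
    """Set point for the controlled stream's flow controller: the measured
    wild (pacing) flow times the specified ratio. The ratio may be entered
    manually, loaded from a batch recipe, or adjusted by a higher-level
    feedback controller (for example, a stack oxygen controller)."""
    return ratio * wild_flow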
Cascade Control
Cascade control refers to control schemes that have an inner control loop nested within
an outer loop. The feedback controller in the outer loop is called the primary controller.
Its output sets the set point for the inner loop controller, called the secondary. The
secondary control loop must be significantly faster than the primary loop. Figure 2-13
depicts an example of cascade control applied to a heat exchanger. In this example, a
process fluid is heated with a hot oil stream. A temperature controller on the heat
exchanger output sets the set point of the hot oil flow controller.
If the temperature controller directly manipulated the valve, there would still be a valid
feedback control loop. Any disturbance to the loop, such as a change in the process
stream flow rate or a change in the hot oil supply pressure, would require a new position
of the control valve. Therefore, a deviation of temperature from the set point would be
required to move the valve.
With the secondary loop installed as shown in Figure 2-13, a change in the hot oil
supply pressure will result in a change in the hot oil flow. This will be rapidly detected
by the flow controller, which will then make a compensating adjustment to the valve.
The momentary variation in the hot oil flow will cause minimal, if any, disturbance to
the temperature control loop.
In the general situation, all disturbances within the secondary loop—a sticking valve,
adverse valve characteristics, or (in the example) variations in supply pressure—are
confined to the secondary loop and have minimal effect on the primary controlled
variable. A disturbance that directly affects the primary loop, such as a change in the
process flow rate in the example, will require a deviation at the primary controller for its
correction regardless of the presence or absence of a secondary controller.
When examining a process control system for possible improvements, consider whether
intermediate control loops can be closed to encompass certain of the disturbances. If so,
the effect of these disturbances will be removed from the primary controller.
Feedforward Control
Feedforward control is defined as the manipulation of the final control element—the
valve position or the set point of a lower-level flow controller—using a measure of a
disturbance rather than the output of a feedback controller. In essence, feedforward
control is open-loop control. Feedforward control requires a process model in order to
know how much and when correction should be made for a given disturbance. If the
process model were perfect, feedforward control alone could be used. In actuality, the
process model is never perfect; therefore, feedforward and feedback control are usually
combined.
The example in the previous section employed cascade control to overcome the effect of
disturbances caused by variations in the hot oil supply pressure. It was noted, however,
that variations in the process flow rate would still cause a disturbance to the primary
controller. If the process and hot oil flow rates varied in a proportionate amount, there
would be only minimal effect on the process outlet temperature. Thus, a ratio between
the hot oil and process flow rates should be maintained. While this would eliminate
most of the variability at the temperature controller, there may be other circumstances,
such as heat exchanger tube scaling, that would necessitate a long-term shift in the
required ratio. This can be implemented by letting the feedback temperature controller
set the required ratio as shown in Figure 2-14.
Ratio control, noted earlier as an example of feedforward-feedback control, corrects for
the steady-state effects on the controlled variable. Suppose that there is also a difference
in dynamic effects of the hot oil and process streams on the outlet temperature. In order
to synchronize the effects at the outlet temperature, dynamic compensation may be
required in the feedforward controller.
To take a more general view of feedforward, consider the generic process shown within
the dotted lines in Figure 2-15. This process is subject to two influences (inputs)—a
disturbance and a control effort. The control effort may be the signal to a valve or to a lower-level flow controller. In the latter case, the flow controller can be considered as a
part of the process. Transfer functions A(s) and B(s) are mathematical abstractions of the
dynamic effect of each of the inputs on the controlled variable. A feedforward controller
C(s), a feedback controller, and the junction combining feedback and feedforward are
also shown in Figure 2-15.
There are two paths of influence from the disturbance to the controlled variable. If the
disturbance is to have no effect on the controlled variable (that is the objective of
feedforward control), these two paths must be mirror images that cancel out each other.
Thus, the feedforward controller must be the ratio of the two process dynamic effects,
with an appropriate sign adjustment. The correct sign will be obvious in any practical
situation. That is:

C(s) = –A(s)/B(s)
(2-11)

If both A(s) and B(s) have been approximated as FOPDT models (see “Tuning from Open-Loop Tests” in this chapter), then C(s) is comprised of, at most, a steady-state gain, a lead-lag, and a dead-time function. These functions are contained in every vendor’s function block library. The dynamic compensation can often be simpler than this. For instance, if the dead times through A(s) and B(s) are identical, then no dead-time term is required in the dynamic compensation.
Now consider combining feedback and feedforward control. Figure 2-15 shows a
junction for combining these two forms of control but does not indicate how they are
combined. In general, feedback and feedforward can be combined by adding or by
multiplying the signals. A multiplicative combination is essentially the same as ratio
control. In situations where a ratio must be maintained between the disturbance and the
control effort, a multiplicative combination of feedback and feedforward will provide a
relatively constant process gain for the feedback controller. If the feedback and
feedforward were combined additively, variations in process gain seen by the feedback
controller would require frequent retuning. In other situations, it is better to combine
feedback and feedforward additively, a control application often called feedback trim.
Regardless of the method of combining feedback and feedforward, the dynamic
compensation terms should be only in the feedforward path, not the feedback path. It
would be erroneous for the dynamic compensation terms to be placed after the
combining junction in Figure 2-15.
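A discrete lead-lag with dead time, the usual building blocks of C(s), might be sketched as follows. This is a minimal illustration using a simple backward-difference discretization; the class name and structure are illustrative, and the dead time is rounded to a whole number of samples (at least one):

from collections import deque

class LeadLagDeadTime:
    """Discrete feedforward compensator: gain * (lead-lag) followed by a
    dead time. lead, lag, dead_time, and dt share the same time units."""
    def __init__(self, gain, lead, lag, dead_time, dt):
        self.gain, self.lead, self.lag, self.dt = gain, lead, lag, dt
        self.y = 0.0        # lag filter state
        self.x_prev = 0.0   # previous input, for the lead (derivative) term
        n = max(1, round(dead_time / dt))
        self.buf = deque([0.0] * n)   # dead-time delay line

    def update(self, x):
        # Backward-difference form of (T_lead*s + 1)/(T_lag*s + 1)
        u = x + (self.lead / self.dt) * (x - self.x_prev)
        self.y += (self.dt / (self.lag + self.dt)) * (u - self.y)
        self.x_prev = x
        self.buf.append(self.gain * self.y)
        return self.buf.popleft()     # output delayed by the dead time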
Feedforward control is one of the most powerful control techniques for minimizing
variability in a control loop. It is often overlooked due to a lack of familiarity with the technique.
Decoupling Control
Frequently in industrial processes, a manipulated variable—a signal to a valve or to a
lower-level flow controller—will affect more than one controlled variable. If each
controlled variable is paired with a particular manipulated variable through a feedback
controller, interaction between the control loops will lead to undesirable variability.
One way of coping with the problem is to pair the controlled and manipulated variables
to reduce the interaction between the control loops. A technique for pairing the
variables, called relative gain analysis, is described in most texts on process control, as
well as in the book Process Control Systems: Application Design and Tuning by F. G.
Shinskey and the ISA Transactions article “Inverted Decoupling, A Neglected
Technique” by Harold Wade. If, after applying this technique, the residual interaction is
too great, the control loops should be modified for the purpose of decoupling. With
decoupled control loops, each feedback controller output affects only one controlled
variable.
Figure 2-16 shows a generic process with two manipulated inputs—a signal to valves or
set points to lower-level flow controllers—and two controlled variables. The functions
P11, P12, P21, and P22 represent dynamic influences of inputs on the controlled variables.
With no decoupling, there will be interaction between the control loops. However,
decoupling elements can be installed so that the output of PID#1 has no effect on CV#2,
and the output of PID#2 has no effect on CV#1.
Using an approach similar to feedforward control, note that there are two paths of
influence from the output of PID#1 to CV#2. One path is through the process element
P21(s). The other is through the decoupling element D21(s) and the process element
P22(s). For the output of PID#1 to have no effect on CV#2, these paths must be mirror
images that cancel out each other. Therefore, the decoupling element must be as shown
in Equation 2-12:
D21(s) = –P21(s)/P22(s)
(2-12)
In a practical application, the appropriate sign will be obvious. In a similar fashion, the
other decoupling element is given in Equation 2-13:
D12(s) = –P12(s)/P11(s)
(2-13)
If the process elements are approximated with FOPDT models as described in the
“Tuning from Open-Loop Tests” section of this chapter, the decoupling elements are, at
most, comprised of gain, lead-lag, and dead-time functions, all of which are available in most vendors’ function block libraries.
The decoupling technique described here can be called forward decoupling. A
disadvantage of this technique is that if one of the manual-auto switches is in manual,
the apparent process seen by the alternate control algorithm is not the same as if both
controllers were in auto.
Hence, to get an acceptable response from the control algorithm still in auto, its tuning would have to be changed.
An alternative technique to forward decoupling is inverted decoupling, depicted in
Figure 2-17. With this technique, the output of each controller’s manual-automatic
switch is fed backward through the decoupling element and combined with the output of
the alternate control algorithm PID. The forms of the decoupling elements are identical
to that used in forward decoupling.
The very significant advantage of using inverted decoupling is that the apparent process
seen by each controller algorithm does not change with changing the position of the
alternate control algorithm’s manual/automatic switch. A caution about using inverted
decoupling, however, is that an inner loop is formed by the presence of the decoupling
elements; this inner loop may or may not be stable. This loop is comprised of elements with known parameters and, therefore, can be precisely analyzed and its stability verified before installation. Chapter 13 of Basic and Advanced Regulatory Control:
System Design and Application (Wade 2017) provides a rigorous procedure for verifying
stability, as well as for investigating realizability and robustness of the decoupling
technique.
An alternative to complete decoupling, either forward or inverted, is partial decoupling. If
one variable is of greater priority than the other, partial decoupling should be
considered. Suppose that CV#1 in Figure 2-16 is a high-valued product and CV#2 is a
low-valued product. Variability in CV#1 should be minimized, whereas variability in
CV#2 can be tolerated. The upper decoupling element in Figure 2-16 can be
implemented and the lower decoupling element eliminated.
Some form of the decoupling described above can be utilized if there are two, or
possibly three, interacting loops. If there are more loops, using a form of advanced
process control is recommended.
Selector (Override) Control
Selector control, also called override control, differs from the other techniques because
it does not have as its objective the reduction of variability in a control loop. It does
have an economic consequence, however, because the most economical operating point
for many processes is near the point of encroachment on a process, equipment, or safety
limit. Unless a control system is present that prevents such encroachment, the tendency
will be to operate well away from the limit, at a less-than-optimum operating point.
Selector control permits operating closer to the limit.
As an example, Figure 2-18 illustrates a process heater. In normal operation, an outlet
temperature controller controls the firing rate of the heater. During this time, a critical
tube temperature is below its limit. Should, however, the tube temperature encroach on
the limit, the tube temperature controller will override the normal outlet temperature
controller and reduce the firing rate of the heater. The low-signal selector in the
controller outputs provides for the selection of the controller that is demanding the
lower firing rate.
If ordinary PI or PID controllers are used for this application, one or the other of the controlled variables will be at its set point, with the other variable less than its set point.
The integral action of the nonselected controller will cause it to wind up—that is, its
output will climb to 100%. In normal operation, this will be the tube temperature
controller. Should the tube temperature rise above its set point, its output must unwind
from 100% to a value that is less than the other controller’s output before there is any
effect on heater firing. Depending on the controller tuning, there may be a considerable
amount of time when the tube temperature is above its limit.
When the tube temperature controller overrides the normal outlet temperature controller
and reduces heater firing, there will be a drop in heater outlet temperature. This will
cause the outlet temperature controller to wind up. Once the tube temperature is
reduced, returning to normal outlet temperature control is as awkward as was the switch
to tube temperature control.
These problems arise because ordinary PID controllers were used in the application.
Most vendors have PID algorithms with alternative functions to circumvent these
problems. Two techniques will be briefly described.
Some vendors formulate their PID algorithm with external reset. The integral action is
achieved by feeding the output of the controller back to a positive feedback loop that
utilizes a unity-gain first-order lag. With the controller output connected to the external
feedback port, the response of a controller with this formulation is identical to that of an
ordinary PID controller. Different behavior occurs when the external reset feedback is
taken from the output of the selector, as shown in Figure 2-18. The nonselected
controller will not wind up. Instead, its output will be equal to the selected controller’s
output plus a value representing its own gain times error. As the nonselected controlled
variable (for instance, tube temperature) approaches its limit, the controller outputs
become more nearly equal, but with the nonselected controller’s output being higher.
When the nonselected controller’s process variable reaches the limit, the controller
outputs will be equal. Should the nonselected controller’s process variable continue to
rise, its output will become the lower of the two—hence it will be selected for control.
Because there is no requirement for the controller to unwind, the switch-over will be
immediate.
Other systems do not use the external feedback. The nonselected controller is identified
from the selector switch. As long as it remains the nonselected controller, it is
continually initialized so that its output equals the other controller output plus the value
of its own gain times error. This behavior is essentially the same as external feedback.
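The external-reset idea can be sketched with the positive-feedback formulation the text describes: integral action comes from a unity-gain first-order lag on the fed-back signal, so a nonselected controller cannot wind up. A minimal sketch, assuming PI control only (class and function names are illustrative):

class ExternalResetPI:
    """PI controller with external reset: integral action is obtained by
    passing the externally fed-back signal through a unity-gain first-order
    lag with time constant TI. Feeding back the controller's own output
    reproduces an ordinary PI controller."""
    def __init__(self, kc, ti, dt):
        self.kc, self.ti, self.dt = kc, ti, dt
        self.f = 0.0   # lag state (reset feedback)

    def update(self, sp, cv, external_reset):
        # First-order lag on the external reset signal.
        self.f += (self.dt / (self.ti + self.dt)) * (external_reset - self.f)
        # Output = lagged feedback + gain times error (reverse-acting).
        return self.f + self.kc * (sp - cv)

def override_scan(controllers, selected_prev):
    """One scan of a low-select override scheme. Each controller receives
    the previously selected output as its external reset feedback, so the
    nonselected controller rides just above the selected one and the
    switch-over is immediate."""
    outputs = [pid.update(sp, cv, selected_prev) for pid, sp, cv in controllers]
    return min(outputs)   # low-signal selector sets the firing rate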
There are many other examples of selector control in industrial processes. On a pipeline,
for instance, a variable speed compressor may be operated at the lower speed demanded
by the suction and discharge pressure controllers. For distillation control, reboiler heat
may be set by the lower of the demands of a composition controller and a controller of
differential pressure across one section of a tower, indicative of tower flooding.
Further Information
Aström, K. J., and T. Hägglund. Advanced PID Control. Research Triangle Park, NC:
ISA (International Society of Automation), 2006.
Shinskey, F. G. Process Control Systems: Application Design and Tuning. 4th ed. New
York: McGraw-Hill, 1996.
The Automation, Systems, and Instrumentation Dictionary. 4th ed. Research Triangle
Park, NC: ISA (International Society of Automation), 2003.
Wade, H. L. Basic and Advanced Regulatory Control: System Design and Application.
Research Triangle Park, NC: ISA (International Society of Automation), 2017.
——— “Trial and Error Tuning: An Organized Procedure.” InTech (May 2005).
——— “Inverted Decoupling, A Neglected Technique.” ISA Transactions 36, no. 1
(1997).
About the Author
Harold Wade, PhD, is president of Wade Associates, Inc., a Houston-based consulting
firm specializing in control systems engineering, instrumentation, and process control
training. He has more than 40 years of instrumentation and control industry experience
with Honeywell, Foxboro, and Biles and Associates. A senior member of ISA and a
licensed Professional Engineer, he is the author of Basic and Advanced Regulatory
Control: System Design and Application, published by ISA. He started teaching for ISA
in 1987 and was the 2008 recipient of the Donald P. Eckman Education Award. He was
inducted into Control magazine’s Automation Hall of Fame in 2002.
3
Control of Batch Processes
By P. Hunter Vegas, PE
What Is a Batch Process?
A batch process is generally considered one that acts on a discrete amount of material in
a sequential fashion. Probably the easiest way to describe a batch process is to compare
and contrast it to a continuous process, which is more common in industry today. The
examples discussed below are specifically geared to continuous and batch versions of
chemical processes, but these same concepts apply to a diverse range of industries.
Batch manufacturing techniques can be found in wine/beer making, food and beverage
production, mining, oil and gas processing, and so on.
A continuous chemical process usually introduces a constant stream of raw materials
into the process, moving the material through a series of vessels to perform the
necessary chemical steps to make the product. The material might pass through a
fluidized bed reactor to begin the chemical reaction, pass through a water quench vessel
to cool the material and remove some of the unwanted byproducts, and finally be
pushed through a series of distillation columns to refine the final product before
pumping it to storage. In contrast, a batch chemical process usually charges the raw
materials to a batch reactor, and then performs a series of chemical steps in that same
vessel until the desired product is achieved. These steps might include mixing, heating,
cooling, batch distilling, and so on. When the steps are complete, the material might be
pumped to storage or it may be an intermediate material that is transferred to another
batch vessel where more processing steps are performed.
Another key difference between continuous and batch processes is the typical running
state of the process. A continuous process usually has a short start-up sequence and then
it achieves steady state and remains in that state for days, weeks, months, and even
years. The start-up and shutdown sequences are often a tiny fraction of the production
run. In comparison, a batch process rarely achieves steady state. The process is
constantly transitioning from state to state as the control system performs the processing
sequence on the batch.
A third significant difference between batch and continuous processes is one of
flexibility. A continuous process is specifically designed to make a large quantity of a
single product (or a narrow family of products). Modification of the plant to make other
products is often quite expensive and difficult to implement. In contrast, a batch process
is intentionally designed to make a large number of products easily. Processing vessels
are designed to handle a range of raw materials; vessels can be added (or removed) from
the processing sequence as necessary; and the reactor jackets and overhead piping are
designed to handle a wide range of conditions.
The flexibility of the batch process is an advantage and a disadvantage. The inherent
flexibility allows a batch process to turn out a large number of very different products
using the same equipment. The batch process vessels and programming can also be
easily reconfigured to make completely new products with a short turnaround.
However, the relatively small size of the batch vessels generally limits the throughput of
the product so batch processes can rarely match the volume and efficiency of a large
continuous process. This is why both continuous and batch processes are extensively
used in manufacturing today. Each has advantages that serve specific market needs.
Controlling a Batch Process
From a control system perspective the design of a continuous plant is usually quite
straightforward. The instruments are sized and selected for the steady-state conditions,
and the configuration normally consists of a large number of continuous proportional-integral-derivative (PID) controllers that keep the process at the desired steady state.
The design of a batch control system is another matter entirely. The field
instrumentation will often face a larger dynamic range of process conditions, and the
control system must be configured to handle a large number of normal, transitional, and
abnormal conditions. In addition, the control system must be easily reconfigured to
address the changing process vessel sequences, recipe changes, varying product
requirements, and so on. The more flexible a batch process is, the more demanding the
requirements on the control system. In some cases, a single set of batch vessels might
make 50 to 100 different products. Clearly such programming complexity poses quite a
challenge for the automation professional.
Due to the difficulties that batch sequences posed, most batch processes were run
manually for many years. As sequential controllers became available, simple batch
systems were occasionally “automated” with limited sequences programmed into
programmable logic controllers (PLCs) and drum programmers. Fully computerized
systems that could deal with variable sequences for flexible processes were not broadly
available until the mid-1980s, and around that time several proprietary batch control products were introduced. Unfortunately, each had its own method of handling the complexities of batch programming, and each company used different terminology, adding to the confusion. The need for a common batch standard was obvious.
The first part of the ISA-88 batch control standard was published in 1995 and has had a
remarkable effect on the industry. Later it was adopted by the American National Standards Institute (ANSI); it is currently designated ANSI/ISA-88.00.01-2010 but is still broadly known as S88. It provides a standard terminology, an internally consistent set of
principles, and a set of models that can be applied to virtually any batch process.
ANSI/ISA-88.00.01 can be (and has been) applied to many other manufacturing processes that require procedural sequence control.
What Is ANSI/ISA-88.00.01?
ANSI/ISA-88.00.01 is such a thorough standard that an exhaustive description of the
contents is impossible in this chapter. However, it is important for the automation
professional to understand several basic concepts presented in the standard and learn
how to apply these concepts when automating a batch process. The following
description is at best a cursory overview of an expansive topic. It will not make the
reader a batch expert, but it will provide a basic knowledge of the subject and serve as a
starting point for further study and understanding.
Before discussing the various parts of the standard, it is important to understand what
the ISA88 standards committee was trying to accomplish. ANSI/ISA-88.00.01 was
written to define batch control systems and give automation professionals a common set
of terms and definitions that could be understood by everyone. Prior to the release of the
standard, different control systems had different definitions for words such as “phases,”
“operations,” “procedures,” and “recipes,” and each engineer and control system used
different methods to implement the complicated sequential control that batch processing
requires. ANSI/ISA-88.00.01 was written to standardize the underlying batch recipe
logic and ideally allow pieces of logic code to be reused within the control system and
even with other control systems. This idea is a critical point to remember. When batch
software is done correctly, the recipes and procedures consist of relatively simple,
generic pieces of logic that can be reused again and again without developing new logic.
Such compartmentalization of the code also allows recipes to be easily adapted and
modified as requirements change without rewriting the whole program.
ANSI/ISA-88.00.01 spends a great deal of time explaining the various models
associated with batch control. Among the models addressed in the standard are the
Process Model, the Physical Model, the Equipment Entity Model, the Procedural
Control Model, the Recipe Type Model, the General Recipe Procedure Model, and the
Master Recipe Model. While each of these models was created to define and describe
different aspects of batch control, there are two basic model concepts that are critical to
understanding batch software. These are the Physical Model and the Procedural Control
Model.
ANSI/ISA-88.00.01 Physical Model
The Physical Model describes the plant equipment itself (see Figure 3-1). As the
diagram shows, the model starts at the highest level (enterprise), which includes all the
equipment in the entire company. The enterprise is composed of sites (or plants), which
may be one facility or dozens of facilities spread across the globe. Each site is composed
of areas, which may include single or multiple processing areas in a plant. Within a
specific plant area there is one (or multiple) process cell(s), which are usually dedicated
to a particular product or family of products. Below the level of process cell, ANSI/ISA-88.00.01 becomes more complicated and begins to define the plant
equipment in much more detail. It is crucial that the automation professional understand
these lower levels of the Physical Model because the batch control code itself is
structured around these same levels.
Units are officially defined in the standard as “a collection of associated equipment
modules and/or control modules that can carry out one or more major processing
activities.” In the plant, this usually translates into a vessel (or a group of related vessels) dedicated to processing one batch (and only one batch) at a time. In most cases,
this will be a single batch reactor with its associated jacket and overhead piping. The
various processing steps performed in the unit are defined by the recipe’s “unit
procedure,” which will be discussed in the next section. The units are comprised of
control modules and possibly equipment modules. Both are discussed in the next
paragraphs.
Defining the units in a plant can be a difficult proposition for an automation professional
new to batch processing. Generally, one should look for vessels (and their associated
equipment) that perform processing steps on a batch (and only one batch) at a time.
Storage tanks are generally not considered a unit as they do not usually perform any
batch operations on the product and often contain several batches at any given time.
Consider the example vessels shown in Figure 3-2. There are two identical batch
reactors (Rx101 and Rx102) that feed a third vessel, Mix Tank 103. Batches are
processed in the two reactors; when the process is finished, they are alternately
transferred to Mix Tank 103 for further raw material addition before the final product is
shipped to storage. (The mixing process is much faster than the reaction steps so a single
mix tank can easily process the batch and have it shipped before the next reactor is
ready.) There are four raw material headers (A, B, C, and D) and each reactor has jacket
valves and controls (not shown), as well as an overhead condenser and reflux drum that
is used to remove solvent during batch processing. In such an arrangement, each reactor
(along with its associated jacket and overhead condenser equipment) would be
considered a unit, as would the mix tank.
Control modules are officially defined in ANSI/ISA-88.00.01 as “the lowest level
grouping of equipment in the Physical Model that can carry out batch control.” In
practice, the control modules tend to follow the typical definition of “instrument loops”
in a plant. For example, one control module might include a PID loop (transmitter, PID
controller, and control valve), and another control module might incorporate the controls
around an automated on/off valve (i.e., open/close limit switches, an on/off DCS control
block, a solenoid, and a pneumatic air-operated valve). Only a control module can
directly manipulate a final control element. All other modules can affect equipment only
by commanding one or more control modules. Every physical piece of equipment is
controlled by one (and only one) control module. Sensors are treated differently.
Regardless of which control modules contain measurement instrumentation, all modules
can share that information. Control modules are given commands by the phases in the
unit (to be discussed later) or by equipment modules (discussed below).
There are many control modules in Figure 3-2. Every on/off valve, each pump, each
agitator, and each PID controller would be considered a control module.
Equipment modules may exist in a batch system or they may not. Sometimes it is
advantageous to group control modules into a single entity that can be controlled by a
common piece of logic. For instance, the sequencing of all the valves associated with a
reactor’s heating/cooling jacket may be better controlled by one piece of logic that can
correctly open/close the proper valves to achieve heating/cooling/draining modes and
handle the sequenced transitions from one mode to another. Another common example
would be a raw material charge header. Occasionally it is easier to combine the charge
valves, pumps, and flow meters associated with a particular charge into one equipment
module that can open/close the proper valves and start/stop the pump as necessary to
execute the material charge to a specific vessel. Equipment modules can be dedicated to
a unit (such as the reactor jacket) or they may be used as a shared resource that can be
allocated by several units (such as a raw material header). They are also optional—some
batch systems will employ them while others may simply control the control modules as
individual entities.
In the example in Figure 3-2, the reactor jacket controls might be considered an
equipment module, as could the individual raw material headers. The decision to create
an equipment module is usually based on the complexity of the logic associated with
that group of equipment. If the reactor jacket has several modes of operation (such as
steam heating, tempered water heating, tempered water cooling, cooling water cooling,
brine chilling, etc.), then it is probably worth creating an equipment module to handle
the complex transitions from mode to mode independent of the batch recipe software.
(The batch recipe would just send mode commands to the equipment module and let it
handle the transition logic.) If the reactor jacket were only capable of steam heating, then it probably is not worth the effort to create an equipment module for the jacket—the batch recipe can easily issue commands to the steam valve control module directly.
Procedural Control Model
While the Physical Model describes the plant equipment, the Procedural Control Model
describes the batch control software. The batch recipe contains the following parts:
• Header – This contains administrative information including product
information, version number, revision number, approval information, etc.
• Formulas – Many recipes can make an entire family of related products by
utilizing the same processing steps but changing the amount of ingredients, cook
times, cook temperatures, etc. The formula provides the specific details that the
recipe needs to make a particular subproduct.
• Equipment requirements – This is a list of equipment that the recipe must
acquire in order to make a batch. In many cases, this equipment must be acquired
before the batch can start, but if the recipe must process the batch in other
equipment later in the process then a recipe can “defer” the allocation of the other
equipment until it needs it. (In the example in Figure 3-2, the recipe would need
to acquire a reactor before starting and then acquire the mix tank once the reactor
processing was completed.)
• Procedure – The procedure makes up the bulk of the recipe and it contains the
detailed control sequences and logic required to run the batch. The structure of
the procedure is defined and described by the Procedural Control Model, which is
described below.
The Procedural Control Model defines the structure of the batch control software. It is
made up of four layers (see Figure 3-3), which will be described in the next sections.
Rather than starting at the top and working down, it is easier to start with the lowest
layers and work up through the structure as each layer is built on the one preceding it.
Phase
The phase is defined in ANSI/ISA-88.00.01 as “the lowest level of procedural element
in the Procedural Control Model that is intended to accomplish all or part of a process
action.” This is a rather vague description, but typically a phase performs some discrete
action, such as starting an agitator, charging a raw material, or placing a piece of
equipment in a particular mode (such as jacket heating/cooling). It is called by the
batch software as needed (much like a programming subroutine), and it usually has one
or more parameters that allow the batch software to direct the phase in certain ways.
(For instance, a raw material phase might have a CHARGE_AMT parameter that tells
the phase how much material to charge.) The appropriate level of phase complexity has
been an ongoing debate. Some argue that phases should be extremely simple (such
as opening/closing a single valve), while others build phases that perform much
more complicated and involved actions. Phases that are too simple result in large,
unwieldy recipes that can strain network communications. Phases that are too complex
work fine, but troubleshooting them can be quite difficult and making even the slightest
change to the logic can become very involved. Ideally the phase should be a relatively
simple piece of code that is as generic and flexible as possible and applicable to as large
a number of recipes as possible. The subject of what to include in a phase will be
handled later in the “Applying ANSI/ISA-88.00.01” section of this chapter.
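As an illustration of the subroutine-like nature of a phase, the following is a minimal material-charge phase sketched in IEC 61131-3 Structured Text (covered in Chapter 4). All tags, the CHARGE_AMT parameter handling, and the completion flag are hypothetical:

VAR
    CHARGE_AMT : REAL;     (* phase parameter passed down from the recipe *)
    FQ_Total : REAL;       (* totalized flow from the charge flow meter *)
    XV_Charge : BOOL;      (* charge valve command *)
    Phase_Complete : BOOL; (* reported back to the batch engine *)
END_VAR

(* Charge until the totalizer reaches the requested amount, then close and report done *)
IF FQ_Total < CHARGE_AMT THEN
    XV_Charge := TRUE;
ELSE
    XV_Charge := FALSE;
    Phase_Complete := TRUE;
END_IF;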
Refer to Figure 3-2. Here is a list of phases that would likely apply to the system shown.
The items in parentheses are the phase parameters:
Reactor Phases
• Setup
• Agitate (On/Off)
• Jacket (Mode, Set Point)
• Charge_Mat_A (Amount)
• Charge_Mat_B (Amount)
• Drain
• Overhead (Mode)
Mix Tank Phases
• Setup
• Agitate (On/Off)
• Charge_Mat_C (Amount)
• Charge_Mat_D (Amount)
• Ship (Destination)
Operations
An operation is a collection of phases that performs some part of the recipe while working
in a single unit. If the recipe runs in only one unit, then it may have only one operation
that contains the entire recipe. If the recipe must run in multiple units, then there will be
at least one operation for every unit required. Sometimes it is advantageous to break up
the processing in a single unit into multiple operations, especially if the process requires
the same set of tasks repeated multiple times. For instance, a reactor recipe might charge
raw materials to a reactor, and then run it through multiple distillation sequences that
involve the same steps but use different temperature set points or cook times. In this
case, the batch programmer would be wise to create a distillation operation that could be
run multiple times in the recipe using different operation parameters to dictate the
changing temperature set points or cook times. This would avoid having multiple sets of
essentially identical code that must be maintained and modified every time a change
was made to the distillation process.
Some reactor operations for the example in Figure 3-2 might look like the illustration in
Figure 3-4. (The phases with their parameters are in bold.)
Unit Procedures
Unit procedures combine the operations of a single unit into a single entity that is called
by the highest level of the recipe (the procedure). Every recipe will have at least one
unit procedure for every unit that is required by the recipe and every unit procedure will
contain at least one operation. When a unit procedure is encountered in a recipe, the
batch system immediately allocates the unit in question before beginning to process the
operations and phases it contains. When the unit procedure is completed, the
programmer has the option of releasing the unit (for use by other recipes) or retaining
the unit for some later step and thus keeping other recipes from getting control.
In the example in Figure 3-2 the unit procedures might look like this:
Reactor Unit Procedure “UP_RX_PROCESS”
Contains the Reactor Operation “OP_PROCESS” (see above).
Mix Tank Unit Procedure “UP_MIX_TANK_ACQUIRE”
Contains the Mix Tank Operation “OP_SETUP,” which acquires the mix tank
when it is available and sets it up for a transfer from the reactor.
Reactor Unit Procedure “UP_RX_TRANSFER”
Contains the Reactor Operation “OP_TRANSFER” (see above).
Procedure
This is the highest level of the batch software code contained in the recipe. Many
consider the procedure to be the recipe because it contains all the logic required to make
the recipe work; however, ANSI/ISA-88.00.01 defines the recipe to include the
procedure, as well as the header, equipment requirements, and formula information
mentioned previously. The procedure is composed of at least one unit procedure (which
then contains at least one operation and its collection of phases).
In the example in Figure 3-2 the final procedure might look like this:
Reactor Unit Procedure “UP_RX_PROCESS”
Acquire the reactor before starting.
Mix Tank Unit Procedure “UP_MIX_TANK_ACQUIRE”
Acquire the mix tank before starting.
Reactor Unit Procedure “UP_RX_TRANSFER”
Release the reactor when finished.
Mix Tank Unit Procedure “UP_MIX_TANK_PROCESS”
Release the mix tank when finished.
As one considers the Procedural Control Model, one might wonder why ANSI/ISA-88.00.01
has defined such an elaborate structure for the recipe logic. The reason is
“reusability.” Similar to the phases, it is possible to write generic operations and even unit
procedures that can apply to several recipes. This is the strength of ANSI/ISA-88.00.01:
when implemented wisely, a relatively small number of flexible phases, operations,
and unit procedures can be employed to manufacture many, very diverse products.
Also, simply arranging the phase blocks in a different order can often create completely
new product recipes.
Applying ANSI/ISA-88.00.01
Now that the basic terminology has been discussed, it is time to put ANSI/ISA-88.00.01
into practice. When tackling a large batch project, it is usually best to execute the project
in the following order.
1. Define the equipment (piping and instrumentation drawings (P&IDs), control
loops, instrument tag list, etc.).
2. Understand the process. What processing steps must occur in each vessel? Can
equipment be shared or is it dedicated? How does the batch move through the
vessels? Can there be different batches in the system at the same time? Do the
automated steps usually function as designed or must the operator take control of
the batch and manually adjust the process often?
3. Define the units.
4. Carefully review the batch sheets and make a list of the required phases for all
the recipes that will run on the equipment. Be sure this list is comprehensive! A
partial list will result in a great deal of rework and additional programming later.
5. Review the phase list and determine which phases can be combined. For
instance, there might be multiple Agitate phases in the list to handle the different
types of agitators (on/off, variable-frequency drive [VFD], agitators with speed
switches, etc.). Rather than creating a special type of agitator phase for each
agitator type, it is usually not difficult to create a single phase that can handle all
the types. This results in a single phase to maintain and adjust rather than a
handful. Of course, such a concept can be carried to the extreme and phases can
get extremely complicated if they are designed to handle every contingency.
Combine phases where it makes sense and where the logic is easily
implemented. If it is not easy to handle the different scenarios, create two
versions of the phase. Document the phases (usually using simple flow charts).
This documentation will eliminate a great deal of miscommunication between
the programming staff and the system designer during logic development.
6. With the completed phase list in hand, start building operations. Watch for
recipes that use the same sequence of steps repeatedly. If this situation exists,
create an operation for each repeated sequence so that the operation can be
called multiple times in the recipe. (Similar to the phases, this re-usability saves
programming time and makes system maintenance easier.)
7. Build the unit procedures and final recipe. Document these entities with flow
charts.
At this point the engineering design team should review the phases and recipe
procedures with operations and resolve any issues. (Do NOT start programming phases
and/or batch logic until this step is done.) Once the reviews have been completed, the
team can start configuring the system. To avoid problems and rework, it is best to
perform the configuration in the following order:
1. Build the I/O and low-level control modules, such as indicators, PID controllers,
on/off valves, motors, etc. Be sure to implement the interlocks at this level.
2. Fully test the low-level configuration. Do NOT start the phase or low-level
equipment module programming until the low level has been tested.
3. Program any equipment modules (if they exist) and any non-batch sequences
that might exist on the system. Fully test these components.
4. Begin the phase programming. It is best to fully test the phases as they are
completed rather than testing at the end. (Systemic issues in alarming and
messaging can be caught early and corrected before they are replicated.)
5. Once the phases are fully built and tested, build the operations and test them.
6. Finally, build and test the unit procedures and recipe procedures.
Following this order of system configuration will radically reduce the amount of rework
required to create the system. It also results in a fully documented system that is easy to
troubleshoot and/or modify.
Summary
As stated at the beginning of this chapter, programming a batch control system is
usually orders of magnitude more difficult than programming a continuous system. The
nearly infinite combination of possible processing steps can be overwhelming and the
opportunity for massive rework and error correction can keep project managers awake at
night. The ANSI/ISA-88.00.01 batch control standard was created to establish a
common set of terminology and programming techniques that are specifically designed
to handle the daunting challenges of creating a batch processing system in a consistent
and efficient manner. When implemented wisely, ANSI/ISA-88.00.01 can result in
flexible, reusable code that is easy to troubleshoot and can be used across many recipes
and control systems. Study the standard and seek to take advantage of it.
Further Information
ANSI/ISA-88.00.01-2010. Batch Control Part 1: Models and Terminology. Research
Triangle Park, NC: ISA (International Society of Automation).
Parshall, J. H., and L. B. Lamb. Applying S88: Batch Control from a User’s Perspective.
Research Triangle Park, NC: ISA (International Society of Automation), 2000.
Craig, Lynn W. A. “Control of Batch Processes.” Chap. 14 in A Guide to the Automation
Body of Knowledge, 2nd ed, edited by Vernon L. Trevathan. Research Triangle
Park, NC: ISA (International Society of Automation), 2006.
About the Author
P. Hunter Vegas, PE, was born in Bay St. Louis, Miss., and he received his BS degree
in electrical engineering from Tulane University in 1986. Upon graduating, Vegas joined
Babcock and Wilcox, Naval Nuclear Fuel Division in Lynchburg, Va., where his
primary job responsibilities included robotic design and construction, and advanced
computer control. In 1987, he began working for American Cyanamid (now Cytec
Industries) as an instrument engineer. In the ensuing 12 years, his job titles included
instrument engineer, production engineer, instrumentation group leader, principal
automation engineer, and unit production manager. In 1999, he joined a specially
formed group to develop next-generation manufacturing equipment for a division of
Bristol-Myers Squibb. In 2001, he entered the systems integration industry, and he is
currently working for Wunderlich Malec as an engineering project manager in
Kernersville, N.C.
Vegas holds Louisiana and North Carolina Professional Engineering licenses in
electrical and controls system engineering, a North Carolina Unlimited Electrical
contractor license, a TUV Functional Safety Engineering certificate, and an MBA from
Wake Forest University. He has executed thousands of instrumentation and control
projects over his career, with budgets ranging from a few thousand to millions of
dollars. He is proficient in field instrumentation sizing and selection, safety interlock
design, electrical design, advanced control strategy, and numerous control system
hardware and software platforms. He co-authored a book, 101 Tips to a Successful
Automation Career, with Greg McMillan and co-sponsors the ISA Mentoring program
with McMillan as well. Vegas is also a frequent contributor to InTech, Control, and
numerous other publications.
4
Discrete Control
By Kelvin T. Erickson, PhD
Introduction
A discrete control system mainly has discrete sensors and actuators, that is, sensors and
actuators that have one of two values (e.g., on/off or open/closed). Though Ladder
Diagram (LD) is the primary language of discrete control, the industry trend is toward
using the IEC 61131-3 (formerly 1131-3) standard. Besides Ladder Diagram, IEC
61131-3 defines four additional languages: Function Block Diagram (FBD), Structured
Text (ST), Instruction List (IL), and Sequential Function Chart (SFC). Even though
Ladder Diagram was originally developed for the programmable logic controller (PLC)
and Function Block Diagram (FBD) was originally developed for the distributed control
system (DCS), a PLC is not limited to ladder logic and a DCS is not limited to function
block. The five IEC languages apply to all platforms for implementation of discrete
control.
Ladder Logic
Early technology for discrete control used the electromechanical relays originally
developed for the telegraph industry of the 1800s. Interconnections of relays
implemented logic and sequential functions. The PLC was originally developed to
replace relay logic control systems. By using a programming language that closely
resembles the wiring diagram documentation for relay logic, the new technology was
readily adopted. To introduce LD programming, simple logic circuits are converted to
relay logic and then to LD (also called ladder logic).
Consider the simple problem of opening a valve, XV103, when pressure switches PS101
and PS102 are both closed, as in Figure 4-1a. To implement this function using relays,
the switches are not connected to the valve directly but are connected to relay coils
labeled PS101R and PS102R whose normally open (NO) contacts control a relay coil,
XV103R, whose contacts control the valve (see Figure 4-1b). When PS101 and PS102
are both closed, the corresponding relay coils PS101R and PS102R are energized,
closing two contacts and energizing the XV103R relay coil. The contact controlled by
XV103R is closed, supplying power to the XV103 valve.
The output (a valve in this case) is driven by the XV103R relay to provide voltage
isolation from the relays implementing the logic. The need for this isolation is more
obvious when the output device is a three-phase motor operating at 440 volts. The input
switches, PS101 and PS102, control relay coils so that the one switch connection to an
input relay can be used multiple times in the logic. A typical industrial control relay can
have up to 12 poles, or sets of contacts, per coil. For example, if the PS101R relay has
six poles (only one is shown), then the other five poles (contacts) are available for use in
the relay logic without requiring five other connections to PS101.
The ladder logic notation (Figure 4-1c) is shortened from the relay wiring diagram to
show only the third line, the relay contacts, and the coil of the output relay. Ladder logic
notation assumes that the inputs (switches in this example) are connected to discrete
input channels (equivalent to the relay coils PS101R and PS102R in Figure 4-1b). Also,
the actual output (valve) is connected to a discrete output channel (equivalent to the NO
contacts of XV103R in Figure 4-1b) controlled by the coil. The label shown above the
contact symbol is not the contact label; it is the label of the control for the coil that
drives the contact. Also, the output for the rung occurs on the extreme right side of the
rung, and power is assumed to flow from left to right. The ladder logic rung is
interpreted as follows: “When input (switch) PS101 is closed and input (switch) PS102
is closed, then XV103 is on.”
Suppose the control function is changed so that valve XV103 is open when switch
PS101 is closed and switch PS102 is open. The only change needed in the relay
implementation in Figure 4-1b is to use the normally closed (NC) contact of the PS102R
relay. The ladder logic for this control is shown in Figure 4-2a and is different from
Figure 4-1c only in the second contact symbol. The ladder logic is interpreted as
follows: “When PS101 is closed (on) and PS102 is open (off), then XV103 is on.”
Further suppose the control function is changed so that valve XV103 is open when
either switch PS101 or PS102 is closed. The only change needed in the relay
implementation in Figure 4-1b is to wire the two NO contacts in parallel rather than in
series. The ladder logic for this control is shown in Figure 4-2b. The ladder logic is
interpreted as follows: “When PS101 is closed (on) or PS102 is closed (on), then
XV103 is on.”
Summarizing these three examples, one should notice that key words in the description
of the operation translate into certain aspects of the solution:
and → series connection of contacts
or → parallel connection of contacts
on → NO contact
off → NC contact
These are key ladder logic concepts.
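Anticipating the Structured Text (ST) language described later in this chapter, the three examples translate into one-line Boolean assignments. This is a sketch that assumes the inputs map directly to program variables; each line is an alternative control function, not a sequence:

XV103 := PS101 AND PS102;      (* Figure 4-1c: "and" becomes a series connection *)
XV103 := PS101 AND NOT PS102;  (* Figure 4-2a: "off" becomes an NC contact *)
XV103 := PS101 OR PS102;       (* Figure 4-2b: "or" becomes a parallel connection *)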
An example of a PLC ladder logic diagram appears in Figure 4-3. The vertical lines on
the left and right are called the power rails. The contacts are arranged horizontally
between the power rails, hence the term rung. The ladder diagram in Figure 4-3 has
three rungs. The arrangement is similar to a ladder one uses to climb onto a roof. In
addition, Figure 4-3 shows an example of a diagram one would see when monitoring the
running program. The thick lines indicate continuity, and the state (on/off) of the inputs
and outputs is shown next to the contact/coil. Regardless of the contact symbol, if the
contact is closed (when it has continuity through it), it is shown as thick lines. If the
contact is open, it is shown as thin lines. In a relay ladder diagram, power flows from
left to right. In ladder logic, there is no real power flow, but there still must be a
continuous path through closed contacts in order to energize an output. In Figure 4-3,
the XV101 output on the first rung is off because the contact for PS132 is open
(meaning PS132 is off), blocking continuity through the PS124 and LS103 contacts.
Also note that the LS103 input is off, which means the NC contact in the first rung is
closed and the NO contact in the second rung is open. According to IEC 61131-3, the
right power rail may be explicit or implied.
Figure 4-3 also introduces the concept of function blocks. Any object that is not a
contact or a coil is called a function block because of its appearance in the ladder
diagram. The most common function blocks are timer, counter, comparison, and
computation operations. More advanced function blocks include message, sequencer,
and shift register operations.
Some manufacturers group the ladder logic objects into two classes: inputs and outputs.
This distinction was made because in relay ladder logic, outputs were never connected
in series and always occurred on the extreme right-hand side of the rung. Contacts
always appeared on the left side of coils and never on the right side. To turn on multiple
outputs simultaneously, coils are connected in parallel. This restriction was relaxed in
IEC 61131-3 and now outputs may be connected in series. Also, contacts can occur on
the right side of a coil as long as a coil is the last element in the rung.
Many novice ladder logic programmers tend to use the same type of contact (NO or NC)
in the ladder that corresponds to the type of field switch (or sensor) wired to the discrete
input channel. While this is true in many cases, this is not the best way to think of the
concept. The type of contact (NO, NC) in the field is determined by safety or fail-safe
factors, but these factors are not relevant to the PLC ladder logic. The PLC is only
concerned with the current state of the discrete input (closed/on or open/off).
As an example, consider the problem of starting and stopping a motor with momentary
switches. The motor is representative of any device that must run continuously but is
started and stopped with momentary switches. The start and stop momentary switches
are shown with the general ladder logic in Figure 4-4. Concentrating on the two
momentary switches, the stop pushbutton is an NC switch, but an NO contact is used in
the ladder logic. In order for the motor EX101 to start, the STOP_PB must be closed
(not pushed) and the START_PB must also be closed (pushed). When the EX101 coil is
energized, it also provides continuity through the parallel path and the motor remains
running when START_PB is released. When STOP_PB is pushed, the discrete input is
now off, opening the contact in the ladder logic. The EX101 output is then de-energized.
The start and stop switches are chosen and wired this way for safety. If any part of the
system fails (switch or wiring), the motor will go to the safe state, which is stopped. If
the start-switch wiring is faulty (open wire), then the motor cannot be started because
the controller will not sense a closed start switch. If the stop-switch wiring is faulty
(open wire), then the motor will immediately stop if it is running. Also, the motor
cannot be started with an open wire to the stop switch.
The ladder logic in Figure 4-4 is also called a seal or latching circuit, and appears in
other contexts. In many systems, the start and stop of a device, such as a motor, have
more conditions that must be satisfied for the motor to run. These conditions are
referred to as permissives, permits, lockouts, inhibits, or restrictions. These conditions
are placed on the start/stop rung as shown in Figure 4-4. Permissives allow the motor to
start and lockouts will stop the motor, as well as prevent it from being started.
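Expressed in ST (a sketch; the Permissive and Lockout flags are hypothetical stand-ins for whatever conditions a given system requires), the seal circuit with permissives and lockouts reduces to one assignment:

(* STOP_PB reads TRUE while the NC stop button is not pushed *)
EX101 := (START_PB OR EX101) AND STOP_PB AND Permissive AND NOT Lockout;

The EX101 term on the right side is what seals the circuit in; the lockout term both stops the motor and prevents a restart.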
After coils and contacts, timers are the next most common ladder logic object. IEC
61131-3 defines three types of timers: on-delay, off-delay, and pulse. The on-delay timer
is by far the most prevalent and so it is the only one described here. For a description of
the other timers, the interested reader is referred to the references.
The TON on-delay timer is shown in Figure 4-5a. The EN input is the block execution
control and must be on for the block to execute. The ENO output echoes the EN input
and is on if EN is on and the block executes without error. The TON timer basically
delays the turn-on of a signal and does not delay the turn-off. When the IN input turns
on, the internal time (ET) increases. When ET equals the preset time (PT), the timer is
timed out and the Q output turns on. If IN turns off during the timing interval, Q remains
off, ET is set to zero, and timing recommences when the IN turns on. The ET output can
be connected to a variable (of type TIME) to monitor the internal time. The instance
name of the timer appears above the block. The preset time can be a variable or a literal
of type TIME. The prefix must be TIME#, T#, time#, or t#. The time is specified in days
(d), hours (h), minutes (m), seconds (s), and milliseconds (ms), for example,
t#2d4h45m12s450ms. The accuracy is 1 millisecond.
The timing diagram associated with the ladder in Figure 4-5a is shown in Figure 4-5b.
The LS_1 discrete input must remain on for at least 15 seconds before the LS1_Hold
coil is turned on. When LS_1 is turned off after 5 seconds, ET is set to zero time and the
LS1_Hold coil remains off.
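For reference, the same timer might be written in ST roughly as follows (a sketch based on Figure 4-5a; the instance name is assumed):

VAR
    LS1_Tmr : TON;          (* on-delay timer instance *)
    LS_1, LS1_Hold : BOOL;
END_VAR

LS1_Tmr(IN := LS_1, PT := T#15s);  (* Q turns on after LS_1 has been on for 15 s *)
LS1_Hold := LS1_Tmr.Q;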
As a final ladder logic example, consider the control of a fixed-speed pump motor. The
specifications are as follows.
The logic controls a motor starter through a discrete output. There is a Hand-Off-Auto
(HOA) switch between the discrete output and the motor starter that allows the operator
to override the PLC control. The control operates in two modes: Manual and Automatic.
When in the Manual mode, the motor is started and stopped with Manual Start and
Manual Stop commands. When in the Automatic mode, the motor is started and stopped
by Sequence Start and Sequence Stop commands issued by the
sequences. When switching between the two modes, the motor control
discrete output should not change.
The logic must monitor and report the following faults:
• Motor fails to start within 10 seconds
• Motor overload
• HOA switch is not in the Auto position
• Any fault (on when any of the above three are on)
The Fail to Start fault must be latched when it occurs. An Alarm Reset input must be
provided that when on, resets the Fail to Start fault indication. The other fault
indications track the appropriate status. For example, the Overload Fault is on when the
overload is detected and off when the overload has been cleared. When any fault
indication is on, the discrete output to the motor starter should be turned off and remain
off until all faults are cleared. In addition, a Manual Start or Sequence Start must be
used to start the motor after a fault has cleared. To detect these faults, the following
discrete inputs are available:
• Motor auxiliary switch
• Overload indication from the motor starter
• HOA switch position indication
The connections to the pump motor, tagged as EX100, are explained as follows:
• EX100_Aux – Auxiliary contact; closes when the motor is running at the proper
speed
• EX100_Hoa – HOA switch; closes when the HOA switch is in the Auto position
• EX100_Ovld – Overload indication from the motor starter; on when overloaded
• Alarm_Reset – On to clear an auxiliary contact-fail-to-close failure indication
• EX100_ManMode – On for Manual mode; off for Automatic mode
• EX100_ManStart – On to start the motor when in Manual mode; ignored in
Automatic
• EX100_ManStop – On to stop the motor when in Manual mode; ignored in
Automatic
• EX100_SeqStart – On to start the motor when in Automatic mode; ignored in
Manual mode
• EX100_SeqStop – On to stop the motor when in Automatic mode; ignored in
Manual mode
• EX100_MtrStrtr – Motor starter contactor; on to start and run the motor; off to
stop the motor
• EX100_AnyFail – On when any failure indication is on
• EX100_AuxFail – On when the auxiliary contact failed to close 10 seconds after
motor started
• EX100_OvldFail – On when the motor is overloaded
• EX100_HoaFail – On when the HOA switch is not in the Auto position,
indicating that the PLC does not control the motor
The ladder logic that implements the control of pump EX100 is shown in Figure 4-6.
The start and stop internal coils needed to drive the motor contactor are determined by
the first and second rungs. In these two rungs, the operator generates the start and stop
commands when the control is in the Manual mode. When the control is in the
Automatic mode (not the Manual mode), the motor is started and stopped by steps in the
various sequences (function charts). The appropriate step sets the EX100_SeqStart or
EX100_SeqStop internal coil to control the motor. These two internal coils are always
reset by this ladder logic. This method of sequence-based control allows one to
start/stop the motor in multiple sequence steps without having to change the motor
control logic. The third rung delays checking for the auxiliary fail alarm until 10
seconds after the motor is started. This alarm must be latched since this failure will
cause the output to the starter to be turned off, thus disabling the conditions for this
alarm. The fourth and fifth rungs generate the overload fail alarm and the indication that
the HOA switch is not in the Auto position. The sixth rung resets the auxiliary failure
alarm so that another start attempt is allowed. The reset is often for a group of
equipment, rather than being specific to each device. The seventh rung generates one
summary failure indication that would appear on an alarm summary screen. The eighth
rung controls the physical output that drives the motor contactor (or motor starter). Note
that this rung occurs after the failure logic is scanned and the motor is turned off
immediately when any failure is detected. The last rung resets the sequential control
commands.
Function Block Diagram
The Function Block Diagram (FBD) language is another graphical programming
language. The typical DCS of the 1970s used this type of language to program the PID
loops and associated functions and logic. An FBD is a set of interconnected blocks
displaying the flow of signals between blocks. It is similar to a ladder logic diagram,
except that function blocks replace the contact interconnections and the coils are simply
Boolean outputs of function blocks.
Figure 4-7 contains two FBD equivalents to the start/stop logic in Figure 4-4. The “&”
block is a logical AND block, and the >=1 block is a logical OR block. The number of
inputs for each block can be increased. The circle at the lower input of the rightmost &
block is a logical inversion, equivalent to the NC relay contact. The FBD in Figure 4-7a
has an implicit feedback path, because the EX101 output of the rightmost & block is
also an input to the leftmost & block. The EX101 variable is called the feedback
variable. An alternative FBD is shown in Figure 4-7b. This FBD has an explicit
feedback path, where there is an explicit path from the EX101 output to an input of the
first & block. On this path, the value of EX101 passes from right to left, which is
opposite to the normal left-to-right flow.
The execution of the function blocks can be controlled with the optional EN input.
When the EN input is on, the block executes. When the EN input is off, the block does
not execute. For example, in Figure 4-8, the set point from a recipe
(Recipe2.FIC102_SP) is moved (“:=” block) into FIC102_SP when both Step_32 and
Temp_In_Band are true. The EN/ENO connections are completely optional in the FBD
language.
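Figure 4-8 is not reproduced here, but its EN-gated move is logically equivalent to this short ST fragment (a sketch using the variable names given above):

IF Step_32 AND Temp_In_Band THEN
    FIC102_SP := Recipe2.FIC102_SP;  (* move the recipe set point when both conditions hold *)
END_IF;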
IEC 61131-3 does not specify a strict order of network evaluation. However, most FBDs
are portrayed so that execution generally proceeds from left to right and top to bottom.
The only real exception to this generality is an explicit feedback path. IEC 61131-3
specifies that network evaluation obey the following rules:
1. If the input to a function block is the output from another function block, then it
should be executed after the other function block. In Figure 4-7a, the execution
order is the leftmost &, the >=1, and then the rightmost &.
2. The outputs of a function block should not be available to other blocks until all
outputs are calculated.
3. The execution of an FBD network is not complete until all outputs of all function
blocks are determined.
4. When data is transferred from one FBD to another, the second FBD should not
be evaluated until the values from the first FBD are available.
According to IEC 61131-3, when the FBD contains an implicit or explicit feedback
path, it is handled in the following manner:
1. “Feedback variables shall be initialized by one of the mechanisms defined in
clause 2. The initial value shall be used during the first evaluation of the
network.” Clause 2 of IEC 61131-3 defines possible variable initialization
mechanisms.
2. “Once the element with a feedback variable as output has been evaluated, the
new value of the feedback variable shall be used until the next evaluation of the
element.”
One of the more powerful aspects of the FBD language is the use of function blocks to
encapsulate standard operations. A function block can be invoked multiple times in a
program without actually duplicating the code. As an example, the ladder logic of
Figure 4-6 can be encapsulated as the function block of Figure 4-9. The “EX100_” part
of all symbols in Figure 4-6 is stripped and most of them become inputs or outputs to
the function block. The block inputs are shown on the left side and the outputs are
shown on the right side.
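Figure 4-9 itself is not reproduced here, but a plausible ST skeleton of the encapsulated block might look like the following. The block name and the exact input/output split are assumptions based on the tag list given earlier:

FUNCTION_BLOCK MotorControl
VAR_INPUT
    Aux, Hoa, Ovld : BOOL;               (* field status inputs *)
    Alarm_Reset : BOOL;
    ManMode, ManStart, ManStop : BOOL;
    SeqStart, SeqStop : BOOL;
END_VAR
VAR_OUTPUT
    MtrStrtr : BOOL;                     (* motor starter contactor *)
    AnyFail, AuxFail, OvldFail, HoaFail : BOOL;
END_VAR
VAR
    AuxF_Tmr : TON;                      (* fail-to-start delay timer *)
END_VAR
(* Body: the logic of Figure 4-6, with the EX100_ prefixes stripped *)
END_FUNCTION_BLOCK

Each motor in the plant can then be an instance of this one block, so a fix to the logic propagates to every motor.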
Structured Text
The Structured Text (ST) language defined by IEC 61131-3 is a high-level language
whose syntax is similar to Pascal. In general, ST is useful for implementing calculation-intensive functions and other functions that are difficult to implement in the other
languages. The ST language has a complete set of constructs to handle variable
assignment, conditional statements, iteration, and function block calling. For a detailed
language description, the interested reader is referred to the references.
As a simple example, the ST equivalent to the ladder logic of Figure 4-4 and the FBD of
Figure 4-7 is shown in Figure 4-10. One could also write one ST statement that
incorporated the logic of Figure 4-4 without using the IF-THEN-ELSE construct.
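Figure 4-10 is not reproduced here; a plausible rendering of that single statement is:

EX101 := (START_PB OR EX101) AND STOP_PB;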
As a more complicated example, the ST equivalent to the ladder logic implementation
of the motor control of Figure 4-6 is shown in Figure 4-11. The use of the timer function
block in lines 8–10 needs some explanation. A function block has an associated
algorithm, embedded data, and named outputs. Multiple calls to the same function block
may yield different results. To use a function block in ST, an instance of the function
block must be declared in the ST program. For the ST shown in Figure 4-11, the
instance of the TON timer block is called AuxF_Tmr and must be declared in another
part of the program as:
VAR
    AuxF_Tmr : TON;
END_VAR
Line 8 of Figure 4-11 invokes the AuxF_Tmr instance of the TON function block.
Invoking a function block does not return a value. The outputs must be referenced in
subsequent statements. For example, line 9 of Figure 4-11 uses the Q output of
AuxF_Tmr along with other logic to set the auxiliary failure indication.
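Figure 4-11 is not reproduced here, but the timer call and its use might plausibly read as follows (a sketch; the exact conditions in the book's figure may differ):

AuxF_Tmr(IN := EX100_MtrStrtr AND NOT EX100_Aux, PT := T#10s);  (* invoke the timer (line 8) *)
IF AuxF_Tmr.Q THEN
    EX100_AuxFail := TRUE;   (* use the Q output to latch the alarm (line 9) *)
END_IF;
IF Alarm_Reset THEN
    EX100_AuxFail := FALSE;  (* reset logic, per the sixth rung of Figure 4-6 *)
END_IF;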
Instruction List
The Instruction List (IL) language defined by IEC 61131-3 is a low-level language
comparable to the assembly language programming of microprocessors. The IL
language has a set of instructions to handle variable assignment, conditional statements,
simple arithmetic, and function block calling. The IL and ST languages share many of
the same elements. Namely, the definition of variables and direct physical PLC
addresses are identical for both languages.
An instruction list consists of a series of instructions. Each instruction begins on a new
line and contains an operator with optional modifiers, and if necessary for the particular
operation, one or more operands separated by commas. Figure 4-12 shows an example
list of IL instructions illustrating the various parts. For a detailed language description,
the interested reader is referred to the references.
As an example, the IL equivalent to the ladder logic of Figure 4-4 and the FBD of
Figure 4-7 is shown in Figure 4-13.
As a more complicated example, the IL equivalent to the ladder logic implementation of
the motor control of Figure 4-6 is shown in Figure 4-14. As for the ST implementation
in Figure 4-11, the instance of the TON timer block is called AuxF_Tmr and is declared
in the same manner. The timer function block is invoked in line 16 and the Q output is
loaded in line 17 to be combined with other logic.
Sequential Problems
The Sequential Function Chart (SFC) is the basic design tool for sequential control
applications. The IEC 61131-3 SFC language is derived from the IEC 848 function chart
standard. IEC 61131-3 defines the graphical, semigraphical, and textual formats for an
SFC. Only the graphical format is explained here because it is the most common form.
The general form of the function chart is shown in Figure 4-15. The function chart has
the following major parts:
• Steps of the sequential operation
• Transition conditions to move to the next step
• Actions of each step
The initial step is indicated by the double-line rectangle. The initial step is the initial
state of the program when the controller is first powered up or when the operator resets
the operation. The steps of the operation are shown as an ordered set of labeled steps
(rectangles) on the left side of the diagram. Unless shown by an arrow, the progression
of the steps proceeds from top to bottom. The transition condition is shown as a
horizontal bar between steps. If a step is active and the transition condition below that
step becomes true, the step becomes inactive, and the next step becomes active. The
stepwise flow (called step evolution) continues to the bottom of the diagram. Branching
is permitted to cause the step evolution to lead back to an earlier step or to proceed
along multiple paths. A step without a vertical line below it is the last step of the
sequence. The actions associated with a step are shown to the right of the step. Each
step action is shown separately with a qualifier (“N” in Figure 4-15).
Only the major aspects of SFCs are described in this section. For a detailed language
description, the interested reader is referred to the references.
Each step within an SFC has a unique name and should appear only once in an SFC.
Every step has two variables that can be used to monitor and synchronize step
activation. The step flag is a Boolean of the form ****.X, where **** is the step name,
which is true while the step is active and false otherwise. The step elapsed time
(****.T) variable of type TIME indicates how long the step has been active. When a
step is first activated, the value of the step elapsed time is set to T#0s. While the step is
active, the step elapsed time is updated to indicate how long the step has been active.
When the step is deactivated, the step elapsed time remains at the value it had when the
step was deactivated; that is, it indicates how long the step was active.
The Cool_Wait step in Figure 4-24 (later in this chapter) is an example using the step
elapsed time for a transition.
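In ST form, such a transition condition might read as follows (the 30-minute preset is illustrative only; Figure 4-24 is not reproduced here):

Cool_Wait.T >= T#30m   (* true once the Cool_Wait step has been active for 30 minutes *)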
The flow of active steps in an SFC is called step evolution and generally starts with the
initial step and proceeds downward. The steps and transitions alternate, that is:
• Two steps are never directly linked; they are always separated by a transition.
• Two transitions are never directly linked; they are always separated by a step.
Sequence selection is also possible, causing the step evolution to choose between two or
more different paths. An example sequence selection divergence, and its corresponding
convergence, is shown in Figure 4-16. The transition conditions must be mutually
exclusive; that is, no more than one can be true. Alternate forms of sequence selection
specify the order of transition evaluation, relaxing this rule. In the alternate forms, only
one path is selected, even if more than one transition condition is true. There are two
special cases of sequence selection. In a sequence skip, one or more branches do not
contain steps. A sequence loop is a sequence selection in which one or more branches
return to a previous step. These special cases are shown in the references.
The evolution out of a step can cause multiple sequences to be executed, called
simultaneous sequences. An example of simultaneous sequence divergence, and its
corresponding convergence, is shown in Figure 4-17. In Figure 4-17, if the
Prestart_Check step is active and Pre_OK becomes true, all three branches are executed.
One branch adds ingredient A followed by ingredient X. A second branch adds
ingredient B. The third branch starts the agitator after the tank level reaches a certain
value. When all these actions are completed, the step evolution causes the Heat_Reac
step to be active. The three wait steps are generally needed to provide a holding state for
each branch when waiting for one or more of the other two branches to complete.
The most common format of a transition condition is a Boolean expression in the
Structured Text (ST) language to the right of the horizontal bar below the step box
(Figure 4-18a). Two other popular formats are a ladder diagram network intersecting the
vertical link instead of a right rail (Figure 4-18b), and an FBD network whose output
intersects the vertical link (Figure 4-18c).
Action blocks are associated with a step. Each step can have zero or more action blocks.
Figure 4-15 shows multiple action blocks associated with the Step_1_Name step. An
action can be a Boolean variable, a ladder logic diagram, an FBD, a collection of ST
statements, a collection of IL statements, or an SFC. The action box is used to perform a
process action, such as opening a valve, starting a motor, or calculating an endpoint for
the transition condition. Generally, each step has an action block, although in cases
where a step is only waiting for a transition (e.g., waiting for a limit switch to close) or
executing a time delay, no action is attached.
Each step action block may have up to four parts, as is seen in Figure 4-19:
1. a – action qualifier
2. b – action name
3. c – Boolean indicator variable
4. d – action description using the IL, ST, LD, FBD, or SFC language
The “b” field is the only required part of the step block. The “c” field is an optional
Boolean indicator variable, set by the action to signify step completion, time-out, error
condition, and so on. When the “b” field is a Boolean variable, the “c” and “d” fields are
absent. When the “d” field is present, the “b” field is the name of the action whose
description is shown in the “d” field. IEC 61131-3 defines the “d” field as a box below
the action name. Figure 4-20 shows an action defined as ladder logic, an FBD, ST, and
an SFC. An action may also be defined as an IL, which is not shown.
The action qualifier is a letter or a combination of letters describing how the step action
is processed. If the action qualifier is absent, it is assumed to be N. Possible action
qualifiers are defined in Table 4-1. Only the N, S, R, and L qualifiers are described in
the following paragraphs. The interested reader is referred to the references for a
description of the other qualifiers.
N – Non-Stored Action Qualifier
A non-stored action is active only when the step is active. In Figure 4-21, the action
P342_Start executes continuously while the Start_P342 step is active, that is, while the
Start_P342.X flag is on. In this example, P342_Start is an action described in the ST
language. When the transition P342_Aux turns on, the Start_P342 step becomes
inactive, and P342_SeqStart is turned off. Deactivation of the step causes the action to
execute one last time (often called a postscan) in order to deactivate the outputs (the left
side of the assignment expressions).
S and R – Stored (Set) and Reset Action Qualifiers
A stored action becomes active when the step becomes active. The action continues to
be executed even after the step is inactive. To stop the action, another step must have an
R qualifier that references the same action. Figure 4-22 is an example use of the S and R
qualifiers. The S qualifier on the action for the Open_Rinse step causes
XV345_SeqOpen to be turned on immediately after the Open_Rinse step becomes
active. XV345_SeqOpen remains on until the Close_Rinse step, which has an R
qualifier on the XV345_Open action. As soon as the Close_Rinse step becomes active,
the action is executed one last time to deactivate XV345_SeqOpen.
L – Time-Limited Action Qualifier
A time-limited action becomes active when the step becomes active. The action
becomes inactive when a set length of time elapses or the step becomes inactive,
whichever happens first. In Figure 4-23a, the L qualifier on the action for the Agit_Tank
step causes A361_Run to be turned on immediately after the Agit_Tank step becomes
active. If the step elapsed time is longer than 6 minutes, A361_Run remains on for 6
minutes (Figure 4-23b). If the step elapsed time is less than 6 minutes, A361_Run turns
off when the step becomes inactive (Figure 4-23c).
A more complicated SFC example is shown in Figure 4-24. This SFC controls a batch
process. The reaction vessel is heated to a desired initial temperature and then the
appropriate main ingredient is added, depending on the desired product. The reactor
temperature is raised to the soak temperature and then two more ingredients are added
while agitating. The vessel is cooled and then dumped.
Further Information
Erickson, Kelvin T. Programmable Logic Controllers: An Emphasis on Design and
Applications. 3rd ed. Rolla, MO: Dogwood Valley Press, 2016.
IEC 61131-3:2003. Programmable Controllers – Part 3: Programming Languages.
Geneva 20 – Switzerland: IEC (International Electrotechnical Commission).
IEC 848:1988. Preparation of Function Charts for Control Systems. Geneva 20 –
Switzerland, IEC (International Electrotechnical Commission).
Lewis, R. W. Programming Industrial Control Systems Using IEC 1131-3. Revised ed.
The Institution of Electrical Engineers (IEE) Control Engineering Series. London:
IEE, 1998.
About the Author
Kelvin T. Erickson, PhD, is a professor of electrical and computer engineering at the
Missouri University of Science and Technology (formerly the University of Missouri-Rolla
[UMR]) in Rolla, Missouri. His primary areas of interest are in manufacturing
automation and process control. Before coming to UMR in 1986, he was a senior design
engineer at Fisher Controls International, Inc. (now part of Emerson Process
Management). During 1997, he was on a sabbatical leave from UMR, working for
Magnum Technologies (now Maverick Technologies) in Fairview Heights, Illinois.
Erickson received BS and MS degrees in electrical engineering from the University of
Missouri-Rolla and a PhD in electrical engineering from Iowa State University. He is a
registered professional engineer (control systems) in Missouri. He is a member of the
International Society of Automation (ISA) and senior member of the Institute of
Electrical and Electronics Engineers (IEEE).
II
Field Devices
Measurement Accuracy and Uncertainty
It is true that you can control well only those things that you can measure—and
accuracy and reliability requirements are continually improving. Continuous
instrumentation is required in many applications throughout automation, although we
call it process instrumentation because the type of transmitter packaging discussed in
this chapter is more widely used in process applications.
There are so many measurement principles and variations on those principles that we
can only scratch the surface of all the available ones; however, this section strives to
cover the more popular/common types.
Process Transmitters
The field devices, sensors, and final control elements are the most important links in
process control and automation. The reason is that if you cannot measure or control
your process, everything built upon those devices cannot compensate for a poor
input or for the inability to control the output without excessive variance.
Analytical Instrumentation
Analytical instrumentation is commonly used for process control, environmental
monitoring, and related applications in a variety of industries.
Control Valves
Final control elements, such as control valves and now increasingly variable speed or
variable/adjustable frequency motors, are critical components of a control loop in the
process and utility industries. It has been demonstrated in nearly all types of process
plants that control valve problems are a major cause of poor loop performance. A
general knowledge of the impact of the control valve on loop performance is critical to
process control.
Today it has become commonplace for automation professionals to delegate the
selection and specification of instrumentation and control valves, as well as the tuning
of controllers, to technicians. However, performance in all these areas may depend on
advanced technical details that require the attention of an automation professional;
there are difficult issues including instrument selection, proper instrument installation,
loop performance, advanced transmitter features, and valve dynamic performance. A
knowledgeable automation professional could likely go into any process plant in the
world and drastically improve the performance of the plant by tuning loops, redesigning
the installation of an instrument for improved accuracy, or determining a needed
dynamic performance improvement on a control valve—at minimal cost. More
automation professionals need that knowledge.
Motor Controls
Not all final control elements are valves. Motors with adjustable speed drives are used
for pumps, fans, and other powered equipment. This chapter provides a basic review of
motor types and adjustable speed drive functionality.
5
Measurement Uncertainty
By Ronald H. Dieck
Introduction
All automation measurements are taken so that useful data for the decision process may
be acquired. For results to be useful, it is necessary that their measurement errors be
small in comparison to the changes, effects, or control process under evaluation.
Measurement error is unknown but its limits may be estimated with statistical
confidence. This estimate of error is called measurement uncertainty.
Error
Error is defined as the difference between the measured value and the true value of the
measurand [1], as is shown in Equation 5-1:
E = (measured) – (true)
(5-1)
where

E = the measurement error
(measured) = the value obtained by a measurement
(true) = the true value of the measurand
It is only possible to estimate, with some confidence, the expected limits of error. The
first major type of error with limits needing estimation is random error. The extent or
limits of a random error source are usually estimated with the standard deviation of the
average, which is written as:
$$S_{\bar{X}} = \frac{S_X}{\sqrt{M}} \tag{5-2}$$

where

$S_{\bar{X}}$ = the standard deviation of the average; the sample standard deviation of the data divided by the square root of M
M = the number of values averaged for a measurement
$S_X$ = the sample standard deviation
$\bar{X}$ = the sample average
Note in Equation 5-2 that N does not necessarily equal M. It is possible to obtain $S_X$
from historical data with many degrees of freedom ([N – 1] greater than 29) and to run
the test only M times. The test result, or average, would therefore be based on M
measurements, and the standard deviation of the average would still be calculated with
Equation 5-2.
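As a worked illustration with assumed numbers: if historical data give a sample standard deviation $S_X = 2.0$ (in the units of the measurement, with more than 29 degrees of freedom) and the test average is based on $M = 4$ measurements, then

$$S_{\bar{X}} = \frac{2.0}{\sqrt{4}} = 1.0$$

in the same units.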
Measurement Uncertainty (Accuracy)
One needs an estimate of the uncertainty of test results to make informed decisions.
Ideally, the uncertainty of a well-run experiment will be much less than the change or
test result expected. In this way, it will be known, with high confidence, that the change
or result observed is real or acceptable and not a result of errors from the test or
measurement process. The limits of those errors are estimated with uncertainty, and
those error sources and their limit estimators, the uncertainties, may be grouped into
classifications to make them easier to understand.
Classifying Error and Uncertainty Sources
There are two classification systems in use. The final uncertainty calculated at a chosen
confidence is identical for the two systems no matter what classification system is used.
The two classifications utilized are the International Organization for Standardization
(ISO) classifications and the American Society of Mechanical Engineers
(ASME)/engineering classifications. The former groups errors and their uncertainties by
type, depending on whether or not there is data available to calculate the sample
standard deviation for a particular error and its uncertainty. The latter classification
groups errors and their uncertainties by their effect on the experiment or test. That is, the
engineering classification groups errors and uncertainties by random and systematic
effects, with subscripts used to denote whether there are data to calculate a standard
deviation or not for a particular error or uncertainty source.
ISO Classifications
In the ISO system, errors and uncertainties are classified as Type A if there are data
available to calculate a sample standard deviation and Type B if there are not [2]. In the
latter case, the sample standard deviation might be obtained, for example, from
engineering estimates, experience, or manufacturer’s specifications.
The impact of multiple sources of error is estimated by root-sum-squaring their
corresponding elemental uncertainties. The operating equations are as follows.
ISO Type A Errors and Uncertainties
For Type A, data are available for the calculation of the standard deviation:
(5-3)
Copyright © 2018. International Society of Automation (ISA). All rights reserved.
where
$u_{A_i}$ = the standard deviation (based on data) of the average for uncertainty source i of Type A, each with its own degrees of freedom; $u_A$ is in units of the measurement. It is considered an S and an elemental uncertainty.
$N_A$ = the number of parameters with a Type A uncertainty
$\theta_i$ = the sensitivity of the test or measurement result, R, to the ith Type A uncertainty; $\theta_i$ is the partial derivative of the result with respect to each ith independent measurement
ISO Type B Errors and Uncertainties
For Type B (no data for standard deviation), uncertainties are calculated as follows:
$u_B = \left[\sum_{i=1}^{N_B} (\theta_i u_{B_i})^2\right]^{1/2}$
(5-4)
where
$u_{B_i}$ = the standard deviation of the average (based on an estimate, not data) for uncertainty source i of Type B; $u_B$ is in units of the measurement. It is considered an S and an elemental uncertainty.
$N_B$ = the number of parameters with a Type B uncertainty
$\theta_i$ = the sensitivity of the measurement result to the ith Type B uncertainty
For these uncertainties, it is assumed that $u_{B_i}$ represents one standard deviation of the
average for one uncertainty source.
ISO Combined Uncertainty
In computing a combined uncertainty, the uncertainties noted by Equations 5-3 and 5-4
are combined by root-sum-square. For the ISO model [2], this is calculated as:
$u_{R,\mathrm{ISO}} = \left[u_A^2 + u_B^2\right]^{1/2}$
(5-5)
The degrees of freedom of the $u_{A_i}$ and the $u_{B_i}$ are needed to compute the degrees of
freedom of the combined uncertainty. This is calculated with the Welch–Satterthwaite
approximation. The general formula for degrees of freedom [2] is:
$\nu_R = \dfrac{u_{R,\mathrm{ISO}}^4}{\sum_{i=1}^{N_A} \dfrac{(\theta_i u_{A_i})^4}{\nu_{A_i}} + \sum_{i=1}^{N_B} \dfrac{(\theta_i u_{B_i})^4}{\nu_{B_i}}}$
(5-6)
The degrees of freedom (df) calculated with Equation 5-6 are often a fraction. This may
be truncated to the next lower whole number to be conservative.
ISO Expanded Uncertainty
Then the expanded, 95% confidence uncertainty is obtained with Equation 5-7:
$U_{R,\mathrm{ISO}} = \pm K\, u_{R,\mathrm{ISO}}$
(5-7)
where usually K = t95 = Student's t for $\nu_R$ degrees of freedom, as shown in Table 5-1.
Note that alternative confidences are permissible. The ASME recommends 95% [1], but
99% or 99.7% or any other confidence is obtained by choosing the appropriate Student’s
t. However, 95% confidence is recommended for uncertainty analysis.
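To make the sequence from Equation 5-5 through Equation 5-7 concrete, here is a minimal Python sketch; the elemental uncertainties, sensitivities, and degrees of freedom are hypothetical placeholders, and Student's t is taken from scipy rather than Table 5-1:

    import math
    from scipy import stats

    # Hypothetical uncertainty budget: (elemental u, sensitivity theta, df).
    elements = [
        (0.15, 1.0, 9),         # Type A source, data-based
        (0.10, 1.0, 24),        # Type A source
        (0.05, 1.0, math.inf),  # Type B source, estimate, infinite df
    ]

    # Combined standard uncertainty (Eq. 5-5): root-sum-square of theta * u.
    u_r = math.sqrt(sum((theta * u) ** 2 for u, theta, _ in elements))

    # Welch-Satterthwaite degrees of freedom (Eq. 5-6), truncated to the
    # next lower whole number to be conservative.
    df = math.floor(u_r ** 4 / sum((theta * u) ** 4 / v for u, theta, v in elements))

    # Expanded 95% confidence uncertainty (Eq. 5-7): U = t95 * u_R.
    t95 = stats.t.ppf(0.975, df)
    print(f"u_R = {u_r:.3f}, df = {df}, t95 = {t95:.3f}, U_R = {t95 * u_r:.3f}")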
In all the above, the errors were assumed to be independent. Independent sources of
error are those that have no relationship to each other. That is, an error in a measurement
from one source cannot be used to predict the magnitude or direction of an error from
the other independent error source. Nonindependent error sources are related. That is, if
it were possible to know the error in a measurement from one source, one could
calculate or predict an error magnitude and direction from the other nonindependent
error source. These are sometimes called dependent error sources. Their degree of
dependence may be estimated with the linear correlation coefficient. If they are
nonindependent, whether Type A or Type B, Equation 5-7 becomes [3]:
$U_{R,\mathrm{ISO}} = \pm K \left[\sum_{T=A,B} \sum_{i=1}^{N_T} (\theta_i u_{i,T})^2 + \sum_{T=A,B} \sum_{i=1}^{N_T} \sum_{j=1}^{N_T} \theta_i \theta_j\, u_{(i,T),(j,T)} (1 - \delta_{i,j})\right]^{1/2}$
(5-8)
where
$u_{i,T}$ = the ith elemental uncertainty of Type T (can be Type A or B)
$U_{R,\mathrm{ISO}}$ = the expanded uncertainty of the measurement or test result
$\theta_i$ = the sensitivity of the test or measurement result to the ith Type T uncertainty
$\theta_j$ = the sensitivity of the test or measurement result to the jth Type T uncertainty
$u_{(i,T),(j,T)}$ = the covariance of $u_{i,T}$ on $u_{j,T}$, so that:
$u_{(i,T),(j,T)} = \sum_{l=1}^{K} u_{i,T,l}\, u_{j,T,l}$
(5-9)
where
l = an index or counter for common uncertainty sources
K = the number of common source pairs of uncertainties
$\delta_{i,j}$ = the Kronecker delta; $\delta_{i,j} = 1$ if i = j, and $\delta_{i,j} = 0$ if not
T = an index or counter for the ISO uncertainty Type, A or B
$N_{i,T}$ = the number of error sources for Types A and B
Equation 5-9 equals the sum of the products of the elemental systematic standard
uncertainties that arise from a common source (l).
This ISO classification equation will yield the same expanded uncertainty as the
engineering classification, but the ISO classification does not provide insight into how
to improve an experiment’s or test’s uncertainty. That is, it does not indicate whether to
take more data because the random standard uncertainties are too high or calibrate better
because the systematic standard uncertainties are too large. The engineering
classification now presented is therefore the preferred approach.
ASME/Engineering Classifications
The ASME/engineering classification recognizes that experiments and tests have two
major types of errors that affect results and whose limits are estimated with
uncertainties at some chosen confidence. These error types may be grouped as random
and systematic. Their corresponding limit estimators are the random standard
uncertainties and systematic standard uncertainties, respectively.
ASME/Engineering Random Standard Uncertainty
The general expression for random standard uncertainty is the 1S standard deviation of
the average [4]:
$s_{\bar{X},R} = \left[\sum_{T=A,B} \sum_{i=1}^{N_{i,T}} \left(\theta_i\, \dfrac{S_{X_{i,T}}}{\sqrt{M_{i,T}}}\right)^2\right]^{1/2}$
(5-10)
where
$S_{X_{i,T}}$ = the sample standard deviation of the ith random error source of Type T
$s_{\bar{X}_{i,T}}$ = the random standard uncertainty (standard deviation of the average) of the ith parameter random error source of Type T
$s_{\bar{X},R}$ = the random standard uncertainty of the measurement or test result
$N_{i,T}$ = the total number of random standard uncertainties, Types A and B, combined
$M_{i,T}$ = the number of data points averaged for the ith error source, Type A or B
$\theta_i$ = the sensitivity of the test or measurement result to the ith random standard uncertainty
T = an index or counter for the ISO uncertainty Type, A or B
Note that $s_{\bar{X},R}$ is in units of the test or measurement result because of the use of the
sensitivities, $\theta_i$. Here, the elemental random standard uncertainties have been root-sum-squared with due consideration for their sensitivities, or influence coefficients. Since
these are all random standard uncertainties, there is, by definition, no correlation in their
corresponding error data so these can always be treated as independent uncertainty
sources.
(Note: The term standard is inserted to provide harmony with ISO terminology and to
indicate that the uncertainties are standard deviations of the average.)
ASME/Engineering Systematic Standard Uncertainty
The systematic standard uncertainty of the result, bR, is the root-sum-square of the
elemental systematic standard uncertainties with due consideration for those that are
correlated [4]. The general equation is:
$b_R = \left[\sum_{T=A,B} \sum_{i=1}^{N_T} (\theta_i b_{i,T})^2 + \sum_{T=A,B} \sum_{i=1}^{N_T} \sum_{j=1}^{N_T} \theta_i \theta_j\, b_{(i,T),(j,T)} (1 - \delta_{i,j})\right]^{1/2}$
(5-11)
where
$b_{i,T}$ = the ith parameter elemental systematic standard uncertainty of Type T
$b_R$ = the systematic standard uncertainty of the measurement or test result
$N_T$ = the total number of systematic standard uncertainties
$\theta_i$ = the sensitivity of the test or measurement result to the ith systematic standard uncertainty
$\theta_j$ = the sensitivity of the test or measurement result to the jth systematic standard uncertainty
$b_{(i,T),(j,T)}$ = the covariance of $b_i$ on $b_j$:
$b_{(i,T),(j,T)} = \sum_{l=1}^{K} b_{i,T,l}\, b_{j,T,l}$
(5-12)
where
l = an index or counter for common uncertainty sources
$\delta_{i,j}$ = the Kronecker delta; $\delta_{i,j} = 1$ if i = j, and $\delta_{i,j} = 0$ if not
T = an index or counter for the ISO uncertainty Type, A or B
Equation 5-12 equals the sum of the products of the elemental systematic standard
uncertainties that arise from a common source (l). Here, each $b_{i,T}$ and $b_{j,T}$ is estimated
as 1S for an assumed normal distribution of errors at 95% confidence with infinite
degrees of freedom.
ASME/Engineering Combined Uncertainty
The random standard uncertainty, Equation 5-10, and the systematic standard
uncertainty, Equation 5-11, must be combined to obtain a combined uncertainty,
Equation 5-13.
$u_R = \left[b_R^2 + s_{\bar{X},R}^2\right]^{1/2}$
(5-13)
ASME/Engineering Expanded Uncertainty
Then the expanded 95% confidence uncertainty may be calculated with Equation 5-14:
$U_{R,\mathrm{ENG}} = \pm t_{95} \left[b_R^2 + s_{\bar{X},R}^2\right]^{1/2}$
(5-14)
Note that $b_R$ is in units of the test or measurement result, as is $s_{\bar{X},R}$.
The degrees of freedom will be needed for the engineering-system combined and
expanded uncertainties in order to determine Student's t. This is accomplished with the
Welch–Satterthwaite approximation, whose general form was given in Equation 5-6; the
specific formulation here is:
$\nu_R = \dfrac{\left[b_R^2 + s_{\bar{X},R}^2\right]^2}{\sum_{T=A,B} \sum_{i=1}^{N} \dfrac{(\theta_i s_{\bar{X}_{i,T}})^4}{\nu_{i,T}} + \sum_{T=A,B} \sum_{i=1}^{M} \dfrac{(\theta_i b_{i,T})^4}{\nu_{i,T}}}$
(5-15)
where
N = the number of random standard uncertainties of Type T
M = the number of systematic standard uncertainties of Type T
$\nu_{i,T}$ = the degrees of freedom for the ith uncertainty of Type T; infinity for all systematic standard uncertainties
t = Student's t associated with the degrees of freedom (df) for each $b_i$
High Degrees of Freedom Approximation
It is often assumed that the degrees of freedom are 30 or higher. In
these cases, the equations for uncertainty simplify further by setting t95 equal to 2.000
[1]. This approach is recommended for a first-time user of uncertainty analysis
procedures, as it is a fast way to reach an approximation of the measurement uncertainty.
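A minimal Python sketch of the engineering-model combination (Equations 5-13 and 5-14) under this high-degrees-of-freedom shortcut, with hypothetical values for $b_R$ and $s_{\bar{X},R}$:

    import math

    # Hypothetical systematic and random standard uncertainties of the result,
    # both already in the units of the measurement.
    b_r = 0.10       # systematic standard uncertainty (Eq. 5-11)
    s_xbar_r = 0.07  # random standard uncertainty (Eq. 5-10)

    u_r = math.sqrt(b_r ** 2 + s_xbar_r ** 2)  # combined uncertainty (Eq. 5-13)
    U_r_eng = 2.000 * u_r                      # t95 ~= 2.000 for df >= 30 (Eq. 5-14)
    print(f"u_R = {u_r:.4f}, U_R,ENG = {U_r_eng:.4f}")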
Calculation Example
In the following calculation example, all the uncertainties are independent and are in the
units of the test result: temperature. It is a simple example that illustrates the
combination of measurement uncertainties in their most basic case. More detailed
examples are given in many of the references cited. Their review may be needed to
assure a more comprehensive understanding of uncertainty analysis.
It has been shown [5] that there is no difference in the uncertainties calculated with the
different models. The data from Table 5-2 will be used to calculate measurement
uncertainty with these two models. These data are all in temperature units and thus the
influence coefficients, or sensitivities, are all unity.
Note the use of subscripts “A” and “B” to denote where data do or do not exist to
calculate a standard deviation. Note also that in this example, all errors (and therefore
uncertainties) are independent and all degrees of freedom for the systematic standard
uncertainties are infinity except for the reference junction whose degrees of freedom are
12.
Each uncertainty model will now be used to derive a measurement uncertainty. For the
$U_{\mathrm{ISO}}$ model, Equations 5-3 and 5-4 yield the following expressions:

$u_A = \left[\sum_i (\theta_i u_{A_i})^2\right]^{1/2} = 0.21$
(5-16)

$u_B = \left[\sum_i (\theta_i u_{B_i})^2\right]^{1/2} = 0.058$
(5-17)

Thus:

$u_{R,\mathrm{ISO}} = \left[(0.21)^2 + (0.058)^2\right]^{1/2} = 0.22$
(5-18)
Here, remember that the 0.21 is the root-sum-square of the 1S Type A uncertainties in
Table 5-2, and 0.058 is the root-sum-square for the 1S Type B uncertainties. Also note
that in most cases, the Type B uncertainties have infinite degrees of freedom and
represent an equivalent $1S_{\bar{X}}$.
If K is taken as Student’s t95, the degrees of freedom must first be calculated. All the
systematic components of Type B have infinite degrees of freedom except for the 0.032,
which has 12 degrees of freedom. Also, all the systematic standard uncertainties, (b), in
Table 5-2 represent an equivalent $1S_{\bar{X}}$. All Type A uncertainties, whether systematic or
random in Table 5-2, have degrees of freedom as noted in the table. The degrees of
freedom for UISO are then:
$\nu_R = \dfrac{u_{R,\mathrm{ISO}}^4}{\sum_i \left[(\theta_i u_i)^4 / \nu_i\right]} = 22$
(5-19)
t95 is therefore 2.074. $U_{R,\mathrm{ISO}}$, the expanded uncertainty, is determined with Equation 5-7:

$U_{R,\mathrm{ISO}} = \pm 2.074 \times 0.22 \approx \pm 0.45$
(5-20)
For the engineering system, UR,ENG, model, one obtains the random standard uncertainty
with Equation 5-10:
$s_{\bar{X},R} = \left[\sum_i (\theta_i s_{\bar{X}_i})^2\right]^{1/2}$
(5-21)
The systematic standard uncertainty is obtained with Equation 5-11 greatly simplified as
there are no correlated errors or uncertainties in this example. Equation 5-11 then
becomes:
$b_R = \left[\sum_i (\theta_i b_{i,T})^2\right]^{1/2}$
(5-22)
The combined uncertainty is then computed with Equation 5-13:
$u_R = \left[b_R^2 + s_{\bar{X},R}^2\right]^{1/2}$
(5-23)
$U_{R,\mathrm{ENG}}$, the expanded uncertainty, is then obtained with Equation 5-14:

$U_{R,\mathrm{ENG}} = \pm t_{95} \left[b_R^2 + s_{\bar{X},R}^2\right]^{1/2}$
(5-24)
The degrees of freedom must be calculated just as in Equation 5-19. Therefore, the
degrees of freedom are 22 and t95 equals 2.07. UR,ENG is then:
$U_{R,\mathrm{ENG}} = \pm 2.07 \times 0.22 \approx \pm 0.45$
(5-25)
This is identical to UR,ISO, Equation 5-20, as predicted.
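The arithmetic of this example can be verified in a few lines of Python using only the numbers quoted in the text (0.21, 0.058, and t95 = 2.074); Table 5-2 itself is not reproduced here:

    import math

    u_r = math.sqrt(0.21 ** 2 + 0.058 ** 2)  # root-sum-square, Eq. 5-18
    U_r = 2.074 * u_r                        # expanded uncertainty, Eqs. 5-20/5-25
    print(f"u_R = {u_r:.3f}, U_R = {U_r:.2f}")  # both models give the same result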
Summary
Although these formulae for uncertainty calculations will not handle every conceivable
situation, they will provide, for most experimenters, a useful estimate of test or
measurement uncertainty. For more detailed treatment or specific applications of these
principles, consult the references and the “Further Information” section
at the end of this chapter.
Definitions
Accuracy – The antithesis of uncertainty. An expression of the maximum possible limit
of error at a defined confidence.
Combined uncertainty – The root-sum-square combination of either the Type A and
Type B uncertainties for the ISO error classifications or the random and systematic
standard uncertainties for the engineering error classifications.
Confidence – A statistical expression of percent likelihood.
Correlation – The relationship between two data sets. It is not necessarily evidence of
cause and effect.
Degrees of freedom – The amount of room left for error. It may also be expressed as the
number of independent opportunities for error contributions to the composite error.
Error – [Error] = [Measured] – [True]. It is the difference between the measured value
and the true value.
Expanded uncertainty – The 95% confidence interval uncertainty. It is the product of
the combined uncertainty and the appropriate Student’s t.
Influence coefficient – See sensitivity.
Measurement uncertainty – The maximum possible error, at a specified confidence,
that may reasonably occur. Errors larger than the measurement uncertainty should rarely
occur.
Propagation of uncertainty – An analytical technique for evaluating the impact of an
error source (and its uncertainty) on the test result. It employs the use of influence
coefficients.
Random error – An error that causes scatter in the test result.
Random standard uncertainty – An estimate of the limits of random error, usually one
standard deviation of the average.
Sensitivity – An expression of the influence an error source has on a test or measured
result. It is the ratio of the change in the result to an incremental change in an input
variable or parameter measured.
Standard deviation of the average or mean – The standard deviation of the data
divided by the square root of the number of measurements in the average.
Systematic error – An error that is constant for the duration of a test or measurement.
Systematic standard uncertainty – An estimate of the limits of systematic error,
usually taken as an equivalent single standard deviation of an average.
True value – The desired result of an experimental measurement.
Welch-Satterthwaite – The approximation method for determining the number of
degrees of freedom for a combined uncertainty.
References
1. ANSI/ASME PTC 19.1-2005. Instruments and Apparatus, Part 1, Test Uncertainty
(American National Standards Institute/American Society of Mechanical Engineers),
1985: 5.
2. ISO. Guide to the Expression of Uncertainty in Measurement. Geneva, Switzerland:
ISO (International Organization for Standardization), 1993: 10 and 11.
3. Brown, K. K., H. W. Coleman, W. G. Steele, and R. P. Taylor. “Evaluation of
Correlated Bias Approximations in Experimental Uncertainty Analysis.” In
Proceedings of the 32nd Aerospace Sciences Meeting & Exhibit. AIAA paper no.
94-0772. Reno, NV, January 10–13, 1994.
4. Dieck, R. H. Measurement Uncertainty, Methods and Applications. 4th ed. Research
Triangle Park, NC: ISA (International Society of Automation), 2006: 45.
5. Strike, W. T., III, and R. H. Dieck. “Rocket Impulse Uncertainty; An Uncertainty
Model Comparison.” In Proceedings of the 41st International Instrumentation
Symposium, Denver, CO, May 1995. Research Triangle Park, NC: ISA (International
Society of Automation).
Further Information
Abernethy, R. B., et al. Handbook-Gas Turbine Measurement Uncertainty. United States
Air Force Arnold Engineering Development Center (AEDC), 1973.
Abernethy, R. B., and B. Ringhiser. “The History and Statistical Development of the
New ASME-SAE-AIAA-ISO Measurement Uncertainty Methodology.” In
Proceedings of the AIAA/SAE/ASME, 21st Joint Propulsion Conference. Monterey,
CA, July 8–10, 1985.
ICRPG Handbook for Estimating the Uncertainty in Measurements Made with Liquid
Propellant Rocket Engine Systems. Chemical Propulsion Information Agency. no.
180, 30 April 1969.
Steele, W. G., R. A. Ferguson, and R. P. Taylor. “Comparison of ANSI/ASME and ISO
Models for Calculation of Uncertainty.” In Proceedings of the 40th International
Instrumentation Symposium. Paper number 94-1014. Research Triangle Park, NC:
ISA (International Society of Automation), 1994: 410-438.
About the Author
Ronald H. Dieck is an ISA Fellow and president of Ron Dieck Associates, Inc., an
engineering consulting firm in Palm Beach Gardens, Florida. He has more than 35 years
of experience in measurement uncertainty methods and applications for flow,
temperature, pressure, gas analysis, and metrology, as well as in the testing of
instrumentation, thermocouples, and air-pollution and gas-analysis systems.
Dieck is a former president of ISA (1999) and the Asian Pacific Federation of
Instrumentation and Control Societies (2002). He has served as chair of ASME PTC19.1
on Test Uncertainty for more than 20 years.
From 1965 to 2000, Dieck worked at Pratt & Whitney, a world leader in the design,
manufacture, and service of aircraft engines and auxiliary power units. He earned a BS
in physics and chemistry from Houghton College in Houghton, New York, and a
master’s in physics from Trinity College, Hartford, Connecticut.
Dieck can be contacted at rondieck@aol.com.
6
Process Transmitters
By Donald R. Gillum
Introduction
With the emphasis on improved control quality and advanced control
systems, the significance of measurement is often overlooked. In early
industrial facilities, it was soon realized that many variables needed to be measured. The
first measuring devices consisted of simple pointer displays located in the processing
area, a pressure gauge for example. When the observation of a variable needed to be
remote from the actual point of measurement, hydraulic impulse lines were filled with a
fluid and connected to a readout device mounted to a panel for local indication of a
measured value.
The need for transmitting measurement signals through greater distance became
apparent as the size and complexity of process units increased and control moved from
the process area to a centrally located control room. Transmitters were developed to
isolate the process area and material from the control room. In general terms, the
transmitter is a device that is connected to the process and generates a transmitted signal
proportional to the measured value. The output signal is generally 3–15 psi for
pneumatic transmitters and 4–20 mA for electronic transmitters. As it has taken many
years for these standard values to be adopted, other scaled output values may be used
and converted to these standard units. The input to the transmitter will represent the
value of the process to be measured and can be nearly any range of values. Examples
can be: 0–100 psi, 0–100 in of water, 50–500°F and 10–100 in of level measurement or
0–100 kPa, 0–10 mm Hg, 40–120°C and 5–50 cm of level measurement. The actual
value of input measurement, determined by the process requirements, is established
during initial setup and calibration of the device.
Although transmitters have been referred to as transducers, this term does not define the
entire function of a transmitter, which usually has both an input and an output transducer. A
transducer is a device that converts one form of energy into another form of energy that
is generally more useful for a particular application.
Pressure and Differential Pressure Transmitters
The most common type of transmitters used in the processing industries measure
pressure and differential pressure (d/p). These types will be discussed in greater detail in
the presentation of related measurement applications. The input transducer for most
process pressure and d/p transmitters is a pressure element which responds to an applied
pressure and generates a proportional motion, movement, or force. Pressure is defined
as force per unit area which can be expressed as P = F/A where P is the pressure to be
measured, F is the force, and A is the area over which the force is applied. By
rearranging the expression, F = PA. So, the force produced by a pressure element is a
function of the applied pressure acting over the area of the pressure element to which
the force is applied.
Pressure elements represent a broad classification of transducers in pressure
instruments. The deflection of the free end of a pressure element, the input transducer, is
applied to the secondary or output transducer to generate a pneumatic or electronic
signal which is the corresponding output of the transmitter. A classification of pressure
instruments is called Bourdon elements, and it includes:
• “C” tube
• Spiral
• Helical
• Bellows
• Diaphragm
• Capsule
While most of these transducers can be used on pressure gauges or transmitters, the “C”
tube is predominant in pressure gauges. The diaphragm or capsule is used in d/p
transmitters because of their ability to respond to the low pressures in such applications.
Other types of Bourdon elements are used in most pressure transmitters.
A flapper-nozzle arrangement or a pilot-valve assembly is used for the output transducer
in most pneumatic transmitters.
A variety of output transducers have been used for electronic transmitters. The list
includes:
• Potentiometers or other resistive devices – This measurement system results in
a change of resistance in the output transducer, which results in a change in
current or voltage in a bridge circuit or other type of signal conditioning system.
• Linear variable differential transformer (LVDT) – This system is used to
change the electrical energy generated in the secondary winding of a transformer
as a pressure measurement changes the position of a movable core between the
primary and secondary windings. This device can detect a very small deflection
from the input transducer.
• Variable capacitance device – This device generates a change in capacitance as
the measurement changes the relative position of capacitance plates. A
capacitance detecting circuit converts the resulting differential capacitance to a
change in output current.
• Electrical strain gauge – This device operates much like the potentiometric or
variable resistance circuit mentioned above. A deformation of a pressure element
resulting from a change in measurement causes a change in tension or
compression of an electrical strain gauge. This results in a change in electrical
resistance which through a signal conditioning circuit produces a corresponding
change in current.
Other types of secondary transducers for electrical pressure and d/p transmitters include:
• Resonant frequency
• Quartz resonant frequency
• Silicon resonant sensors
• Variable conductors
• Variable reluctance
• Piezoresistive transmitters
These transducers are not very prominent. The reader is directed to the listed reference
material for further clarification.
Level Measurement
Level measurement is defined as the determination of the position of an existing
interface between two media, which are usually fluids but can be solids or a
combination of a fluid and a solid. Many technologies are available to measure this
interface. They include:
• Visual
• Hydraulic head
• Displacer
• Capacitance
• Conductance
• Sonic and ultrasonic
• Weight
• Load cells
• Radar
• Fiber optics
• Magnetostrictive
• Nuclear
• Thermal
• Laser
• Vibrating paddle
• Hydrostatic tank gauging
This section will discuss the most common methods in present use.
Automatic Tank Gauges
An automatic tank gauge (ATG) is defined in the American Petroleum Institute (API)
Manual of Petroleum Measurement Standards as “an instrument which automatically
measures and displays liquid level or ullages in one or more tanks, either continuously,
periodically, or on demand.” From this description, an ATG is a level-measuring system
that produces a measurement from which the volume and/or weight of liquid in a vessel
can be calculated. API Standard 2545 (1965), Methods of Gauging Petroleum and
Petroleum Products, described float-actuated or float-tape ATGs, power-operated or
servo-operated ATGs, and electronic surface-detecting-type level instruments. These
definitions and standards imply that ATGs encompass nearly all level-measurement
technologies, including hydrostatic tank gauging.
Some ATG technology devices are float-tape types, which rival visual level-measurement
techniques in their simplicity and dependability. These devices operate by
float movement with a change in level. The movement is then used to convey a level
measurement.
Many methods have been used to indicate level from a float position, the most common
being a float and cable arrangement. The float is connected to a pulley by a chain or a
flexible cable, and the rotating member of the pulley is in turn connected to an
indicating device with measurement graduations. When the float moves upward, the
counterweight keeps the cable tight and the indicator moves along a circular scale.
When chains are used to connect the float to the pulley, a sprocket on the pulley mates
with the chain links. When a flat metallic tape is used, holes in the tape mate with metal
studs on a rotating drum.
Perhaps the simplest readout device commonly used with float systems is a weight
connected to the float with a cable. As the float moves, the
weight also moves by means of a pulley arrangement. The weight, which moves along a
board with calibrated graduations, will be at the extreme bottom position when the tank
is full and at the top when the tank is empty. This type is generally used for closed tanks
at atmospheric pressure.
Variable-Displacement Measuring Devices
When an object is heavier than an equal volume of the fluid into
which it is submerged, full immersion results and the object never floats. Although the
object (displacer) never floats on the liquid surface, it does assume a relative position in
the liquid, and as the level moves up and down along the length of the displacer, the
displacer undergoes a change in weight caused by the buoyancy of the liquid. Buoyancy
is explained by Archimedes’ principle, which states that the resultant pressure of a fluid
on a body immersed in it acts vertically upward through the center of gravity of the
displaced fluid and is equal to the weight of the fluid displaced. The upward pressure
acting on the area of the displacer creates the force called buoyancy. The buoyancy is of
sufficient magnitude to cause the float (displacer) to be supported on the surface of a
liquid or a float in float-actuated devices. However, in displacement level systems, the
immersed body or displacer is supported by arms or springs that allow some small
amount of vertical movement or displacement with changes of resulting buoyancy
forces caused by level changes. This buoyancy force can be measured to reflect the level
variations.
Displacers Used for Interface Measurement
Recall that level measurement is the determination of the position of an interface
between two fluids or between a fluid and a solid. It can be clearly seen that displacers
for level measurement operate in accordance with this principle. The previous paragraph
concerning displacer operation considered a displacer suspended in two fluids, with the
displacer weight being a function of the interface position. The magnitude of displacer
travel is described as being dependent on the interface change and on the difference in
specific gravity between the upper and lower fluids. When both fluids are liquids, the
displacer is always immersed in liquid. In displacement level transmission, a signal is
generated proportional to the displacer position.
Hydraulic Head Level Measurement
Many level-measurement techniques are based on the principle of hydraulic head
measurement. From this measurement, a level value can be inferred. Such level-measuring
devices are used primarily in the water and wastewater, oil, chemical, and
petrochemical industries, and to a lesser extent in the refining, pulp and paper, and
power industries.
Principle of Operation
The weight of a 1 ft³ container of water is 62.427 lb, and this force is exerted over the
surface of the bottom of the container. The area of this surface is 144 in²; the pressure
exerted is

P = F/A
(6-1)

where P is pressure in pounds per square inch, F is a force of 62.427 lb, and A is an area
of 1 ft² = 144 in².

P = 62.427 lb / 144 in² = 0.433 psi per foot of water

This pressure is caused by the weight of a 12-in column of liquid pushing downward on
a 1 in² surface of the container. For a column of height H, the pressure on 1 in² of area is

P = 0.433 psi/ft × H(ft) = 0.036 psi/in × H(in)
(6-2)
By the reasoning expressed above, the relationship between the vertical height of a
column of water (expressed as H in feet) and the pressure exerted on the supporting
surface is established. This relationship is important not only in the measurement of
pressure but also in the measurement of liquid level.
By extending the discussion one step further, the relationship between level and pressure
can be expressed in feet of length and pounds per square inch of pressure:

1 psi = 2.31 ft wc = 27.7 in wc
(6-3)
(The term wc, which stands for water column, is usually omitted as it is understood in
the discussion of hydraulic pressure measurement.) It is apparent that the height of a
column of liquid can be determined by measuring the pressure exerted by that liquid.
Open-Tank Head Level Measurement
Figure 6-1 illustrates an application where the level value is inferred from a pressure
measurement. When the level is at the same elevation point as the measuring
instrument, atmospheric pressure is applied to both sides of the transducer in the
pressure transmitter, and the measurement is at the zero reference level. When the level
is elevated in the tank, the force created by the hydrostatic head of the liquid is applied
to the measurement side of the transducer, resulting in an increase in the instrument
output. The instrument response caused by the head pressure is used to infer a level
value. Assuming the fluid is water, the relationship between pressure and level is
expressed by Equation 6-3. If the measured pressure is 1 psi, the level would be 2.31 ft,
or 27.7 in. Changes in atmospheric pressure will not affect the measurement because
these changes are applied to both sides of the pressure transducer.
When the specific gravity of a fluid is other than 1, Equation 6-2 must be corrected. This
equation is based on the weight of 1 ft3 of water. If the fluid is lighter, the pressure
exerted by a specific column of liquid is less. The pressure will be greater for heavier
liquids. Correction for specific gravity is expressed by Equation 6-4.
P = 0.433(G)(H)
(6-4)
where G is specific gravity and H is the vertical displacement of a column in feet.
The relationship expressed in Equation 6-4 is used to construct the scale graduations on
the gauge. For example, instead of reading in pounds per square inch, the movement on
the gauge corresponding to 1 psi pressure would express graduations of feet and tenths
of feet.
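A small Python sketch of Equations 6-2 and 6-4 (the fluid values are hypothetical) shows how a level is inferred from a head-pressure reading:

    PSI_PER_FT_WATER = 0.433  # pressure per foot of water column (Eq. 6-2)

    def level_ft(pressure_psi, specific_gravity=1.0):
        """Level in feet inferred from head pressure, corrected for G (Eq. 6-4)."""
        return pressure_psi / (PSI_PER_FT_WATER * specific_gravity)

    print(level_ft(1.0))         # water: 1 psi -> about 2.31 ft
    print(level_ft(1.0, 0.85))   # a lighter fluid: the same 1 psi reads ~2.72 ft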
Many head-level transmitters are calibrated in inches of water. The receiver instrument
also is calibrated in inches of water or linear divisions of percent.
Air Purge or Bubble System
The system known by various names, such as air purge, air bubble, or dip tube, is an
adaptation of head-level measurement. With the supply air blocked, the water level in
the tube will be equal to that in the tank. When the air pressure from the regulator is
increased until the water in the tube is displaced by air, the air pressure on the tube is
equal to that required to displace the liquid and equal to the hydraulic head of the liquid
in the tube.
The pressure set on the regulator must be great enough to displace the liquid from the tube
at maximum pressure, which will coincide with maximum level. This will be indicated
by a continuous flow, which is evidenced by the formation of bubbles rising to the level
of the liquid in the tank. As it may not be convenient to visually inspect the tank for the
presence of bubbles, an airflow indicator will usually be installed in the air line running
into the tank. A rotameter is generally used for this purpose. The importance of
maintaining a flow through the tube lies in the fact that the liquid in the tube must be
displaced by air, and the back-pressure on the air line provides the measurement, which
is related to level.
The amount of airflow through the dip tube is not critical but should be fairly constant
and not too great. If the airflow were great enough to create a back-pressure in the tube
caused by the airflow restriction, this back-pressure would falsely signify a higher
level, resulting in a measurement error. For this reason, 3/8-in tubing or 1/4-in pipe
should be used.
An important advantage of the bubble system is the fact that the measuring instrument
can be mounted at any location or elevation with respect to the tank. This application is
advantageous for level-measuring applications where it would be inconvenient to mount
the measuring instrument at the zero reference level. An example of this situation is
level measurement in underground tanks and water wells. The zero reference level is
established by the relative position of the open end of the tube with respect to the tank.
This is conveniently fixed by the length of the tube, which can be adjusted for the
desired application. It must be emphasized that variations in back-pressure on the tube
or static pressure in the tank cannot be tolerated. This method of level measurement is
generally limited to open-tank applications but can be used in closed-tank applications
with special precautions listed below.
Measurement in Pressurized Vessels: Closed-Tank Applications
The open-tank measurement applications that have been discussed are referenced to
atmospheric pressure. That is, the pressures (usually atmospheric) on the surface of the
liquid and on the reference side of the pressure element in the measuring instrument are
equal. When atmospheric pressure changes, the change is by equal amounts on both the
measuring and the reference sides of the measuring element. The resulting forces
created are canceled, one opposing the other, and no change occurs in the measurement
value.
Suppose, however, that the static pressure in the level vessel is different from
atmospheric pressure. Such would be the case if the level was measured in a closed tank
or vessel. Pressure variations within the vessel would be applied to the level surface and
have an accumulated effect on the pressure instrument, thus affecting level
measurement. For this reason, pressure variations must be compensated for in closed-tank
applications. Instead of a pressure-sensing instrument, a differential-pressure
instrument is used for head-type level measurements in closed tanks.
Since a differential-pressure instrument responds only to a difference in pressure applied
to the measuring ports, the static tank pressure on the liquid surface in a closed tank has
no effect on the measuring signal. Variations in static tank pressure, therefore, do not
cause an error in level measurement as would be the case when a pressure instrument is
used.
Mounting Considerations: Zero Elevation and Suppression
Unless dip tubes are used, the measuring instrument is generally mounted at the zero
reference point on the tank. When another location point for the instrument is desired or
necessary, the head pressure caused by liquid above or below the zero reference point
must be discounted.
When the high-pressure connection of the differential-pressure instrument is below the
zero reference point on the tank, the head pressure caused by the elevation of the fluid
from the zero point to the pressure tap will cause a measured response or instrument
signal. This signal must be suppressed to make the output represent a zero-level value.
The term zero suppression defines this operation. This refers to the correction taken or
the instrument adjustment required to compensate for an error caused by the mounting
of the instrument to the process. With the level at the desired zero reference level, the
instrument output or response is made to represent the zero-level value. In the case of
transmitters, this would be 3 psi, 4 mA, 10 mA, or the appropriate signal level to
represent the minimum process value. More commonly, however, the zero-suppression
adjustment is a calibration procedure carried out in a calibration laboratory or shop,
sometimes requiring a kit for the transmitter that consists of an additional zero bias
spring.
When using differential-pressure instruments in closed-tank level measurement for
vapor service, quite often the vapor in the line connecting the low side of the instrument
to the tank will condense to a liquid. This condensed liquid, sometimes called a wet leg,
produces a hydrostatic head pressure on the low side of the instrument, which causes the
differential-pressure instrument reading to be below zero. Compensation is required to
eliminate the resulting error. This compensation or adjustment is called zero elevation.
To prevent condensation, fouling, or other external factors from affecting the
wet leg, in most applications a sealed wet leg, as described below, is used to reduce the
impact of variables such as condensation as a function of ambient temperature.
Repeaters Used in Closed-Tank Level Measurement
Some level applications require special instrument-mounting considerations. For
example, sometimes the process liquid must be prevented from forming a wet leg. Some
liquids are nonviscous at process temperature but become very thick or may even
solidify at the ambient temperature of the wet leg. For such applications, a pressure
repeater can be mounted in the vapor space above the liquid in the process vessel, and a
liquid-level transmitter (e.g., a differential-pressure instrument) can be mounted below
in the liquid section at the zero reference level. The pressure in the vapor section is
duplicated by the repeater and transmitted to the low side of the level instrument. The
complications of the outside wet leg are avoided, and a static pressure in the tank will
not affect the operation of the level transmitter.
A sealed pressure system can also be used for this application. This system is similar to
a liquid-filled thermal system. A flexible diaphragm is flanged to the process and is
connected to the body of a differential-pressure cell by flexible capillary tubing. The
system should be filled with a fluid that is noncompressible and that has a high boiling
point, a low coefficient of thermal expansion, and a low viscosity. A silicone-based liquid
is commonly used in sealed systems.
For reliable operation of sealed pressure devices, the entire system must be evacuated
and filled completely with the fill fluid. Any air pockets that allow contraction and
expansion of the filled material during operation can result in erroneous readings. Most
on-site facilities are not equipped to carry out the filling process; a ruptured system
usually requires component replacement.
Radar Measurement
The function of a microwave gauge can be described by dividing the gauge and its
environment into five parts: microwave electronic module, antenna, tank
atmosphere, additional sensors (mainly temperature sensors), and a remote (or local)
display unit. The display may include some further data processing, such as calculation
of the mass. Normally, the transmitter is located at the top of a vessel and the solid-state
oscillator transmits an electromagnetic wave at a selected carrier frequency and waveform
that is aimed downward at the surface of the process fluid in the vessel. The standard
frequency is 10 GHz. The signal is radiated by a dish or horn-type antenna that can take
various forms depending on the need for a specific application. A portion of the wave is
reflected to the antenna where it is collected and sent to the receiver where a
microprocessor determines the time of flight for the transmitted and reflected waveform.
Knowing the speed of the waveform and travel time, the distance from the transmitter to
the process fluid surface can be calculated. The detector output is based on this distance.
Non-contact radar detectors operate by using pulsed radar waves or frequency
modulated continuous waves (FMCW). In pulsed wave operation, short-duration radar
pulses are transmitted and the target distance is calculated using the transit time. The
FMCW sensor sends out continuous frequency-modulated signals, usually in successive
(linear) ramps. The frequency difference caused by the time delay between transmission
and reception indicates the distance, from which the level is directly inferred.
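For the pulsed case, the level arithmetic is a simple time-of-flight calculation; the tank height and timing values in this Python sketch are hypothetical:

    C = 299_792_458.0  # speed of light, m/s

    def ullage_m(round_trip_s):
        """Distance from antenna to liquid surface: half the round trip."""
        return C * round_trip_s / 2.0

    tank_height_m = 10.0
    print(f"level = {tank_height_m - ullage_m(40.0e-9):.2f} m")  # 40 ns -> ~6 m ullage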
The low power of the beam permits safe installation for both metallic and nonmetallic
vessels. Radar sensors can be used when the process material is flammable and when
the composition or temperature of the material in the vapor space varies.
Contact radar measuring devices send a pulse down a wire to a vapor-liquid interface
where a sudden change in the dielectric of the materials causes the signal to be partially
reflected. The time of flight is measured and the distance traversed by the signal is
calculated. The non-reflected portion of the signal travels to the end of the probe and
gives a signal for a zero reference point. Contact radar can be used for liquids and small
granular bulk solids. In radar applications, the reflective properties of the process
material will affect the transmitted signal strength. Liquids have good reflective
qualities but solids usually do not. When heavy concentrations of dust particles or other
such foreign materials are present, these materials will be measured instead of the
liquid.
Tank Atmosphere
The radar signal is reflected directly off the liquid surface to obtain an accurate level
measurement. Any dust or mist particles present have no significant influence, as
the diameters of such particles are much smaller than the 3-cm radar wavelength. For
optical systems with shorter wavelengths, this is not the case. For comparison, when
navigating with radar aboard ships, a substantial reduction of the possible measuring
range is experienced, but even with a heavy tropical rain the range will be around 1 km,
which is large compared to a tank.
There can be slight measurement errors for a few specific products in the vapor space of
the tank. This is especially true when the composition may vary between no vapor and
fully saturated conditions. For these specific products, pressure and temperature
measurement may be required for compensation. Such compensation is made by the
software incorporated in the tank intelligence system provided by the equipment
manufacturer.
End-of-the-Probe Algorithm
The end-of-the-probe algorithm can be used in guided-wave radar when there is no
reflection coming back from the product. This innovation provides a
downward-looking time-of-flight measurement, which allows the guided-wave radar system
to measure the distance from the probe mounting to the material level. An
electromagnetic pulse is transmitted and guided down a metal cable or rod which acts as
a surface wave transmission line. When the surface wave meets a discontinuity in the
surrounding medium such as a sudden change in dielectric constant, some of the signal
is reflected to the source where it is detected and timed. The portion of the signal that is
not reflected travels on and is reflected at the end of the probe.
The radar level gauging technique used on tanker ships for many years has been used in
refineries and tank farms in recent years. Its high degree of integrity against virtually all
environmental influences has resulted in high level-measurement accuracy; one example
is the approval of radar gauges for 1/16-in accuracy. Nearly all level gauging in a
petroleum storage environment can be done with radar level gauges adapted for that
purpose.
Although radar level technology is a relatively recent introduction in the process and
manufacturing industries, it is gaining respect for its reliability and accuracy.
Ultrasonic/Time-of-Flight Measurement
While the time-of-flight principle of sonic and ultrasonic level-measurement systems is
similar to radar, there are distinct differences. The primary difference is that sound
waves produced by ultrasonic units are mechanical and transmit sound by expansion of
a material medium. Since the transmission of sonic waves requires a medium, changes
in the medium can affect the propagation. The resulting change in velocity will affect
the level measurement. Other factors can also affect the transmitted or reflected signal,
including dust, vapors, foam, mist, and turbulence. Radar waves do not require a
medium for propagation and are inherently immune to the factors that confuse sonic-type devices.
Fluid Flow Measurement Technology
Flow transmitters are used to determine the amount of fluid flowing in a pipe, tube, or
open stream. These flow values are normally expressed in volumetric or mass flow
units. Flow rates are inferred from other measured values. When the velocity of a fluid
through a pipe can be determined, the actual flow value can be derived, usually by
calculation. For example, when the velocity can be expressed in ft/sec through a cross-sectional
area of the pipe measured in ft², the volumetric flow rate can be given in cubic
feet per second (ft³/sec). By knowing relationships between volume and weight, the
flow rate can be expressed in gallons per minute (gal/min), pounds per hour (lb/hr), or any
desired units.
Head Flow Measurement
Most individual flow meters either measure the fluid velocity directly or infer a velocity
from other measurements. Head meters are described by Bernoulli's principle and the
continuity relationship, which states that for incompressible fluids under steady flow
conditions the product of the area (A) and the velocity (V) is equal to the flow rate. In a
pipe, the flow rate (Q) is equal at all points. So, Q1 = Q2 = Q3 = ... = Qn, where
Q = (A)(V), and A1V1 = A2V2 = A3V3 = ... = AnVn. This continuity relationship requires
that the velocity of a fluid increase as the cross-sectional area of the pipe decreases.
From these relationships, a working equation is developed that shows:
Q = KA(∆P/ρ)^1/2

where
∆P = the pressure drop across a restriction
ρ = the density of the fluid
The constant (K) adjusts for dimensional units, nonideal fluid losses and behavior,
discharge coefficients, pressure tap locations, various operating conditions, gas
expansion factor, Reynolds number, and viscosity corrections which are accounted for
by empirical flow testing.
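A minimal Python sketch of this working equation (K, area, and process values are hypothetical) makes the square-root characteristic visible: quadrupling the differential pressure only doubles the indicated flow.

    import math

    def head_flow(k, area, dp, rho):
        """Q = K * A * sqrt(dP / rho); K comes from empirical flow testing."""
        return k * area * math.sqrt(dp / rho)

    print(head_flow(1.0, 0.05, 100.0, 62.4))  # baseline
    print(head_flow(1.0, 0.05, 400.0, 62.4))  # 4x dP -> only 2x flow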
Many types of restrictions or differential producers and tap locations (or points on the
pipe where the pressure is measured) are presently used.
The overall differential flow meter performance will vary based on several conditions,
but accuracy is generally limited to about 2% under ideal conditions with a turndown ratio
of about 3–5:1. Head-type flow meter rangeability can be as great as 10:1, but care should be
exercised in expressing a rangeability greater than about 6–8:1, as the accuracy may be
affected. These performance characteristics will preclude the use of head flow meters in
several applications where flow measurement is more of a consideration than flow
control. Their high precision, or repeatability, makes them suitable candidates for flow
control.
Particular attention must be given to instrument mounting and connection to the process
line. For liquids, the instrument is located below the process lines, and for gas or vapor
service the instrument is located above the lines. This is to assure that the instrument
and connecting or lead-lines, as they are called, are liquid full for liquid and vapor
service and liquid free for gas service. This is to prevent liquid legs of unequal values
from forming in the instrument and lead-lines.
Because of the nonlinear relationship between transmitter output and flow rate, head-type
flow meters are not used in ratio control, totalizing, or other applications where the
transmitter output must represent the same flow value at every point in the measuring
range.
An important factor to consider in flow meter engineering is the velocity profile and
Reynolds number (Re) of the flowing fluid. Re is the ratio of inertial forces to viscous
forces of the fluid in the pipe, which is equal to ρ(V)(D)/μ, where
ρ = the density of the flowing fluid in pounds per cubic foot (lb/ft³)
V = the velocity of the fluid in feet per second
D = the pipe internal diameter in inches
μ = the fluid viscosity in centipoise
This scientific equation is reduced to the following working equations:

Re = (3160)(Qgpm)(GF)/[(μ)(D)] for liquids
Re = (379)(Qacfm)(ρ)/[(μ)(D)] for gases

where the units are as given before, Qacfm is the gas flow in actual cubic feet per
minute, and GF is the specific gravity of the liquid.
The velocity profile should be such that the fluid is flowing at near-uniform velocity, as
noted by a turbulent flow profile. This will be the case when Re is above about 6,000; in
actual practice Re should be significantly higher. For applications where Re is lower
than about 3,000, the flow profile is noted as laminar and head-type flow meters should
not be considered.
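The liquid working equation can be applied directly; the service conditions in this Python sketch are hypothetical:

    def reynolds_liquid(q_gpm, gf, mu_cp, d_in):
        """Re = (3160)(Q)(GF) / [(mu)(D)] for liquids."""
        return 3160.0 * q_gpm * gf / (mu_cp * d_in)

    # 100 gpm of water (GF = 1, mu = 1 cP) in a 4-in line:
    re = reynolds_liquid(100.0, 1.0, 1.0, 4.0)
    print(f"Re = {re:,.0f}")  # 79,000 -- well above the ~6,000 turbulent threshold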
Flow meter engineering for head flow meters consists of determining the relationship
between the maximum flow (QM), the maximum differential (hM) at maximum flow, and β,
the ratio of the bore of the differential producer to the pipe size (d/D). Once β is
determined and the pipe size is known, d can be calculated.
In general, QM and D are known; β is calculated, and then d is determined from the
equation d = βD.
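The final sizing step is a one-line computation (values hypothetical):

    D = 4.0     # pipe internal diameter, in
    beta = 0.6  # beta ratio from the flow meter engineering calculation
    d = beta * D
    print(f"bore d = {d:.1f} in")  # 2.4 in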
Although head flow technology is well established for flow meter applications, the
limited accuracy, low Re, limited turndown, limited rangeability, and relatively high
permanent head loss (which may be excessive) can preclude the adoption of this
technology. However, it is a well-established technology to use when the requirement is
to control flow within about 5% of a given maximum value.
In the oil, chemical, petrochemical, and water and wastewater industries, head meters are
still predominant but other flow measuring technologies are expanding and replacing
head meters in some new installations. Three such technologies are velocity measuring
flow instruments, mass measuring flow instruments, and volumetric measurement by
positive displacement meters.
Velocity measurement flow meters comprise a large group which include:
• Magnetic
• Turbine
• Vortex shedding
• Ultrasonic
• Doppler
• Time-of-flight
Magnetic Flow Meters
Magnetic flow meters operate on the principle of Faraday's law, which states that the
magnitude of voltage induced in a conductive medium moving through a magnetic field
at right angles to the field is directly proportional to the product of the strength of the
magnetic flux density (B), the velocity of the medium (V), and the path length between
the probes (L). These terms are expressed in the following formula:

E = KBLV

where K is a constant based on the design of the meter and the other terms are as
previously defined.
To continue the discussion of meter operation, a magnetic coil around the flow tube
establishes a magnetic field of constant strength through the tube. L is the distance
between the pick-up electrodes on each side of the tube, and V, the only variable, is the
velocity of a material to be measured through the tube. It can be seen from the above
expression that a voltage is generated directly proportional to the velocity of the flowing
fluid. Recall that flow through a pipe is Q = A(V), where A is the cross-sectional
area of the pipe at the point of velocity measurement, V. The application of magnetic
flow meters is limited to liquids with a conductivity of at least about 1–5
microsiemens (micromhos). This limits the use of magnetic meters to conductive liquids or a solution
where the mixture is at least about 10% of a conductive liquid.
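Rearranging E = KBLV for velocity and applying Q = A(V) gives a flow estimate; the meter constants and signal level in this Python sketch are hypothetical:

    import math

    def magmeter_flow_m3s(e_volts, k, b_tesla, l_m, pipe_id_m):
        v = e_volts / (k * b_tesla * l_m)        # fluid velocity from E = KBLV
        area = math.pi * (pipe_id_m / 2.0) ** 2  # cross-sectional area for Q = A * V
        return area * v

    print(magmeter_flow_m3s(1.0e-3, 1.0, 0.01, 0.1, 0.1))  # ~0.00785 m^3/s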
The velocity range of most flow meters is about 3–15 ft per second for liquids. The
accuracy of magnetic flow meters can be as good as 0.5%–2% of rate with about a 10:1
turndown.
When magnetic meters were first developed, they were four-wire devices: the field
excitation voltage was supplied by a twisted pair of wires separated from the signal
cable, and because of phase-shift considerations, the field voltage and signal-conditioning
voltage were supplied from the same source. Now, however, magnetic
meters are two-wire systems where the supply and signal voltages are on the same
cable.
Magnetic meters have no obstruction to the fluid flow, have no Reynolds number
constraints, and some can measure flow in either direction. Calibration and initial startup can be provided with a compatible external handheld communicator. Earlier models
were calibrated with a secondary calibrator which replaced the magnetic pick-up coil
signal for an input equal to that corresponding to a given flow rate.
Turbine Meters
Turbine meters consist of a rotating device called a rotor that is positioned in a flowing
stream so that the rotational velocity of the rotor is proportional to the velocity of the
flowing fluid. The rotor generates a voltage, the amplitude or frequency of which is
proportional to the angular rotation and fluid velocity. A pulse signal proportional to
angular velocity of the rotor can also be generated. A signal conditioning circuit
converts the output of the turbine meter to a scaled signal proportional to flow rate.
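A sketch of this scaling step, assuming a hypothetical K-factor from a calibration sheet:

    def turbine_flow_gpm(pulse_hz, k_factor_pulses_per_gal):
        """Flow = pulse frequency / K-factor, converted to gal/min."""
        return pulse_hz / k_factor_pulses_per_gal * 60.0

    print(f"{turbine_flow_gpm(500.0, 900.0):.1f} gpm")  # ~33.3 gpm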
The accuracy of turbine meters can be as high as ±0.25% of rate for liquids and about
±0.5% of rate for gas service. They have good repeatability, as great as ±0.05% of rate.
Some may not be suitable for continuous service and are most prominent for flow
measurement rather than flow control.
In turbine meter operation, a low Re will result in a high pressure drop and a high Re
can result in excessive slippage. The flowing fluid should be free of suspended solids
and solids along the bottom of the pipe.
Ultrasonic Flow Meters
Ultrasonic flow meters are velocity-measuring devices that operate on either the Doppler
effect or time of flight. Both use acoustic waves or vibration to detect flow
through a pipe. Ultrasonic energy is coupled to the fluid in the pipe using transducers
that can be either wetted or non-wetted depending on the design.
Doppler meters operate on the principle of the Doppler shift in the frequency of a sound
wave as a result of the velocity of the sound source. The Doppler meter has a transmitter
that injects a sound wave of specific frequency into the flowing fluid. The sound wave is
reflected to a receiver across the pipe and the frequency shift between the injected wave
form and the reflected wave form is a function of the velocity of the particle that
reflected the injected wave form. The injected and reflected wave forms are “beat
together” and a pulse is generated equal to the beat frequency and represents the
velocity of the flowing fluid. Because Doppler meters require an object or substance in
the flowing fluid to reflect the injected signal to form an echo, in cases where the fluid is
extremely clean, particles such as bubbles can be introduced to the fluid to cause the
echo to form. Care should be exercised to prevent contamination of the process fluid
with the injected material.
Time-of-flight ultrasonic flow transmitters measure the difference in travel time for a
given length of pipe between pulses transmitted downstream in the fluid and upstream
in the fluid. The transmitters and receivers are transponders that alternate between
functions as a transmitter and a receiver each cycle of operation. The difference in
downstream and upstream transit time is a function of fluid velocity.
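For the common geometry of an acoustic path of length L inclined at angle θ to the pipe axis, the fluid velocity follows directly from the upstream and downstream transit times and is independent of the speed of sound in the fluid. A minimal sketch with hypothetical values:

import math

def tof_velocity(t_up_s, t_down_s, path_length_m, path_angle_deg):
    """Fluid velocity from upstream/downstream ultrasonic transit times.

    For a path of length L at angle theta to the pipe axis:
        v = L / (2*cos(theta)) * (t_up - t_down) / (t_up * t_down)
    The speed of sound cancels out of this expression.
    """
    theta = math.radians(path_angle_deg)
    return (path_length_m / (2.0 * math.cos(theta))
            * (t_up_s - t_down_s) / (t_up_s * t_down_s))

# Hypothetical 0.2 m path at 45 degrees; transit times in seconds.
print(f"{tof_velocity(135.25e-6, 134.75e-6, 0.2, 45.0):.2f} m/s")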
Differential frequency ultrasonic flow meters incorporate a transducer positioned so that
the ultrasonic wave form is beamed at an angle in the pipe. One transducer is located
upstream of the other. The frequencies of the ultrasonic beam in the upstream and
downstream directions are detected and used to calculate the fluid flow through the pipe.
Ultrasonic flow meters can be used in nearly any pipe size above about 1/8 in and at flows
as low as 0.1 gpm (0.38 L/min). Pipe thickness and material must be considered with
clamp-on transducers to limit attenuation of the signal and to ensure that the signal
strength does not fall to a point that renders the device inoperable.
Ultrasonic flow meter accuracy can vary from about 0.5% to 10% of full scale
depending on specific applications, meter types, and characteristics. These flow meters
are usually limited to liquid flow applications.
Vortex Meters
As the fluid in a vortex flow meter passes an object of specific design in the pipe,
called a bluff body, vortices form and shed alternately from each side of the body (the
von Kármán effect). The frequency of these vortices is measured by various means, and the
velocity of the fluid in the pipe can be determined from the following
expression:
f = (St)(V/shedder width)

where

f = the frequency of the vortices
V = the velocity of the fluid in the pipe
shedder width = the characteristic width of the bluff body, set by its physical dimensions and design
St = the Strouhal number, a dimensionless number determined by the manufacturer
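Rearranged for velocity, V = f × (shedder width)/St. A minimal sketch with hypothetical values:

def vortex_velocity(shedding_freq_hz, shedder_width_ft, strouhal):
    """Fluid velocity from vortex shedding frequency: V = f * d / St."""
    return shedding_freq_hz * shedder_width_ft / strouhal

# Hypothetical values: 66 Hz shedding frequency, 0.03 ft wide shedder,
# manufacturer-determined Strouhal number of 0.2.
print(f"V = {vortex_velocity(66.0, 0.03, 0.2):.1f} ft/s")  # 9.9 ft/s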
While most vortex flow meters use shedders of nearly similar width, their specific
design will vary significantly from one manufacturer to another. Specific shapes include
trapezoidal, rectangular, triangular, various T-shapes, and others. The frequency of the
vortices varies directly with the fluid velocity and inversely with the width of the
bluff body. Various frequency-sensing systems are designed to respond to the physical
properties of the vortices. Meters using metal shedders vary in size from 1/2–12 in for
liquid service, with flow values ranging from 3–5,000 gpm. PVC vortex flow meter sizes
vary from 1/4–2 in, with flow values ranging from 0.6–200 gpm.
The accuracy of vortex meters can be as good, under ideal conditions, as ±0.5%–1.0% of
rate for liquids and 1.5%–2% for gases. They have a limited turndown of 7–8:1, and they do
have Re constraints: when the combination of Re and St varies at extreme values,
operation becomes nonlinear. This situation can exist when Re values drop as low as
3,000. Manufacturers can be consulted for low-Re and velocity constraints. Vortex meters
are very popular; the technology is one of those replacing head meters.
When high accuracies are required for flow measurement applications of custody
transfer, batching applications, and totalizing, among others, the following technologies
may be considered.
Coriolis Meters
Coriolis meters are true mass-measuring devices that operate on the Coriolis acceleration
of a flowing fluid, which is related to the fluid's mass. The meter consists of a tube
with the open ends fixed, through which the fluid enters and leaves. The closed end is
forced to vibrate, and the Coriolis force causes the tube to twist; it is the amount of
"twist" that is detected. The amount of twist is a function of the mass of the fluid
flowing through the tube.
These meters have no significant pressure drop and no Re constraint. Temperature and
density variations do not affect the accuracy: the mass-volume relationship changes but
the true mass is measured. The accuracy can be as good as 0.1–0.2% of rate.
Coriolis meters are usually applied to liquids, as the density of most gases is
generally too low to accurately operate the meter. In cases where gas applications are
possible, temperature and pressure compensation is very advantageous.
Positive Displacement Flow Meters
Positive displacement (PD) flow meters operate on the principle of repeatedly filling
and emptying a chamber of known volume and counting the fill-and-empty cycles.
Accuracy on the order of 0.2–0.4% of rate can be realized. Some PD meters have no Re
constraint but can have significant pressure drop, especially with highly viscous fluids.
Low-viscosity fluids can cause significant error from slippage. PD meters are commonly
used in custody transfer applications, such as measurement for fuel purchase at the pump,
household water metering, and natural gas metering, but not as transmitters for process
control.
Applications for Improved Accuracy
In cases where the accuracy of a device is given as a percentage of the measured value
(percent of rate), the accuracy is the same at every point on the measurement scale.
However, when the accuracy of a device is given as a percentage of full-scale value, the
stated accuracy is achieved only when the measurement is at full scale. If the accuracy
is given as ±1% of full scale, for example, that accuracy is achieved only when the
measurement is at 100%; at a 50% reading, the error can be ±2% of the measured value, and
so forth. In effect, measurement range is sacrificed for accuracy. Therefore, it is
desirable to keep the measurement at or near the full-scale value. This can be done by
re-ranging the transmitter, historically a laborious process, to keep the measured value
near 100% of range. For this purpose, a wide-range measurement procedure was developed
whereby two (and seldom more) parallel lines with transmitters of different measurement
ranges can be switched to keep the measurement at the upper end of the scale. With
digital transmitters, this re-ranging can easily be performed.
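The distinction can be made concrete with a short sketch that converts a hypothetical accuracy specification into worst-case error at a given reading:

def error_percent_of_reading(reading_pct, accuracy_pct, basis):
    """Worst-case error expressed as a percent of the measured value.

    basis = "rate": the accuracy statement applies to the reading itself.
    basis = "full_scale": the accuracy statement is a fixed fraction of
    span, so its effect grows as the reading drops below 100%.
    """
    if basis == "rate":
        return accuracy_pct
    if basis == "full_scale":
        return accuracy_pct * 100.0 / reading_pct
    raise ValueError(basis)

# A hypothetical +/-1% full-scale device read at 100%, 50%, and 25%:
for reading in (100.0, 50.0, 25.0):
    err = error_percent_of_reading(reading, 1.0, "full_scale")
    print(f"at a {reading:.0f}% reading: +/-{err:.1f}% of measured value")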
Another term used to define the quality of a measurement system is turndown, the ratio
of the maximum measurement to the minimum measurement that can be made with an
acceptable degree of error. For the transmitters used in head-type flow measurement
systems, this can be as high as 10:1; the flow turndown is more limited because the flow
rate is proportional to the square root of the pressure drop across a restrictor or
differential producer inserted in the flowing fluid. This means that if the error is ±1%
at full scale (which is a good quality statement), the error would be about ±5% at 25% of
flow rate, resulting in a turndown of 4:1. Some flow transmitters are said to have a
turndown of 100:1.
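A sketch of how the square-root relationship compresses turndown, assuming (hypothetically) that the differential pressure signal itself is usable over a given turndown:

import math

def flow_turndown(dp_turndown):
    """Flow varies as sqrt(dP), so a usable dP range of N:1
    yields a flow turndown of only sqrt(N):1."""
    return math.sqrt(dp_turndown)

for dp_td in (10.0, 16.0, 100.0):
    print(f"dP turndown {dp_td:>5.0f}:1 -> flow turndown {flow_turndown(dp_td):.1f}:1")

A 16:1 differential pressure turndown, for example, yields only a 4:1 flow turndown, consistent with the figure quoted above.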
Temperature
Temperature is a measure of the relative heat in a substance, that is, of the energy
associated with the activity of its molecules. Two arbitrarily established scales,
Celsius (Centigrade) and Fahrenheit, and their associated absolute scales, Kelvin and
Rankine, are used in temperature measurement.
Many methods have been developed to measure temperature. Some of the most
common include:
• Filled thermal systems
• Liquid in glass thermometers
• Thermocouples
• Resistance temperature detector (RTD)
• Bimetallic strip
• Thermistors
• Optical and other non-contact pyrometers
This section deals with those predominantly used with transmitters in the processing
industries. These devices can be classified as mechanical or electrical, as discussed
below.
Filled Thermal Systems
A large classification of mechanical temperature-measuring devices is filled thermal
systems, which consist of sensors or bulbs filled with a selected fluid that is either a
liquid, a gas, or a vapor. The bulb is connected by capillary tubing to a pressure readout
device for local indication or for signal generation and transmission. Flapper-nozzle
systems are most commonly used in pneumatic temperature transmitters based on filled
thermal systems. Filled temperature systems are grouped into four classes denoted by the
fill fluid:
Class    Fill Fluid
I        Liquid
II       Vapor
III      Gas
V        Mercury*
(*The use of mercury is not very common in the processing industries and this class is
not prominent.)
Because the readout device can be separated from the measuring sensor or bulb by several
feet, through changing ambient temperatures, compensation must be provided so the
measurement is immune to ambient temperature fluctuations. Full compensation
corrects for temperature changes along the capillary tube and at the readout section,
called the case. Case-only compensation corrects for temperature variations in the case
only. Full compensation is designated by "A" in the related nomenclature, and case
compensation is noted as "B." Class II (vapor) systems do not require compensation, but
the measured temperature cannot cross the ambient temperature: Class IIA specifies that
the measured temperature must be above ambient, and IIB indicates measured temperatures
below ambient.
The class of system will be selected in accordance with desired measurement range,
speed of response, compensation required/desired, sensor size, capillary length, and cost
or complexity in design. The accuracy of all classes is ±0.5 to ±1% of the measured
span. Filled systems are used with pneumatic transmitters and local indicators.
Thermocouples
The most commonly used electrical temperature systems are thermocouples (T/Cs) and
RTDs. A thermocouple consists of two dissimilar metals joined at the ends to form
measuring and reference junctions. An electromotive force (EMF) is produced
when the junctions are at different temperatures. The measuring (hot) junction is in the
medium whose temperature is to be measured, and the reference (cold) junction is
normally at the connection to the readout device or measuring instrument. The EMF
generated is directly proportional to the temperature difference between the two
junctions.
Because the thermocouple EMF output is a function of the temperature difference between
the two junctions, compensation must be made for temperature fluctuations at the
reference junction. Thermocouple types depend on the metals used and are specified by
the International Society of Automation (ISA), with tables that relate EMF to the
temperature at the measuring junction for each thermocouple type, with the reference
junction at a standard temperature, normally 32°F (0°C). In calibration and laboratory
applications, the reference junction is normally maintained at the standard temperature.
This method of reference junction compensation cannot be used in most industrial
applications; there, the reference junction is established at the connection between the
thermocouple and the readout device.
Because of the expense of thermocouple material, it is inconvenient to run the
thermocouple wire from the processing area to the readout instrument. Accordingly,
thermocouple lead wire has been developed whereby the connection of the lead wire to
the thermocouple does not form a junction and the reference junction is established at
the location of the readout instrument.
The current trend is toward mounting the transmitter in the thermowell head, thus
minimizing the thermocouple wire length and eliminating the need for extension wires.
In such cases, a method of reference junction compensation is to design and construct an
electrical reference junction compensator. This device generates an electrical bucking
voltage in opposition to the voltage at the measuring junction, which is subtracted
from the measuring junction voltage. Care must be taken to prevent the temperature of
the measuring junction from crossing the temperature of the reference junction, as this
will cause a change in polarity of the thermocouple output voltage. To reiterate,
thermocouples are listed as types in accordance with their materials of construction,
which establish the temperature/EMF relationship. Thermocouple instruments used for
calibration and process measurement are designed for a particular thermocouple type. A
thermocouple arrangement is shown in Figure 6-2. Figure 6-3 shows EMF versus
temperature for several thermocouple types with a 32°F reference junction.
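Reference junction compensation amounts to adding the EMF the thermocouple type would produce at the measured reference-junction temperature, then converting the sum to temperature. A minimal sketch, using a deliberately coarse lookup table in place of the full published thermocouple tables:

# Coarse type-K style table: temperature (deg C) -> EMF (mV).
# Real applications use the full published ISA/NIST tables or polynomials.
TABLE_C_TO_MV = {0: 0.000, 25: 1.000, 100: 4.096, 200: 8.138}

def emf_at(temp_c):
    """Interpolate EMF at temp_c from the table."""
    pts = sorted(TABLE_C_TO_MV.items())
    for (t0, e0), (t1, e1) in zip(pts, pts[1:]):
        if t0 <= temp_c <= t1:
            return e0 + (e1 - e0) * (temp_c - t0) / (t1 - t0)
    raise ValueError("temperature outside table")

def temp_at(emf_mv):
    """Inverse lookup: EMF -> temperature, by interpolation."""
    pts = sorted(TABLE_C_TO_MV.items())
    for (t0, e0), (t1, e1) in zip(pts, pts[1:]):
        if e0 <= emf_mv <= e1:
            return t0 + (t1 - t0) * (emf_mv - e0) / (e1 - e0)
    raise ValueError("EMF outside table")

# The measured EMF is relative to the reference junction, so add the EMF
# corresponding to the reference-junction temperature, then convert.
measured_mv, ref_junction_c = 3.2, 25.0
compensated_mv = measured_mv + emf_at(ref_junction_c)
print(f"hot junction ~ {temp_at(compensated_mv):.1f} deg C")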
Thermocouple configurations can be designed for different applications. When
thermocouples are connected in parallel, the average of the temperatures at the
individual thermocouples can be determined. An example of this configuration is
measuring the temperature at individual points in a reactor bed and obtaining the
average bed temperature from the voltage at the parallel connection.
Another common thermocouple application is measuring the temperature difference
between two points, for example, the temperature drop across a heat exchanger.
Thermocouples connected in series opposition are used for this purpose: when the
temperatures at both points are equal, regardless of the magnitude of the voltages
generated, the net EMF is zero. From thermocouple calibration curves, it can be seen
that the voltage/temperature relationship is small and expressed in millivolts.
Therefore, to measure small increments of temperature, a very sensitive
voltage-measuring device is needed. Potentiometric recorders and millivolt
potentiometers are used to provide the needed sensitivity.
Resistance Temperature Devices
The final type of temperature-measuring device discussed here is the resistance
temperature device (RTD). An RTD is a conductor, usually a coil of platinum wire,
with a known resistance/temperature relationship. By measuring the resistance of the
RTD and using appropriate calibration tables, the temperature of the sensor can be
determined. Obtaining an accurate RTD resistance measurement with a conventional
ohmmeter is complicated by the self-heating effect produced in the sensor by the current
the ohmmeter supplies. Also, such measurement techniques require human intervention and
cannot provide the continuous measurement needed for process control.
Most RTD applications use a DC Wheatstone bridge circuit to measure temperature. A
change in temperature produces a corresponding change in RTD resistance, which produces
a change in the voltage or current of the bridge circuit. Equivalent-circuit theorems
can be used to establish the relationship between temperature and bridge voltage for a
particular application. Empirical methods are normally used to calibrate an RTD
temperature-measuring device.
The resistance of wires used to connect the RTD to the bridge circuit will also change
with temperature. To negate the effect of the wire resistance, a three-wire bridge circuit
is used with RTD applications.
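As an illustration of a known resistance/temperature relationship, the sketch below uses the widely published IEC 60751 Pt100 equation for temperatures at or above 0°C (R0 = 100 Ω, A = 3.9083e-3, B = -5.775e-7) and inverts it by solving the quadratic; an actual transmitter would apply the manufacturer's calibration:

import math

R0 = 100.0      # Pt100 resistance at 0 deg C, ohms
A = 3.9083e-3   # IEC 60751 coefficients, valid for T >= 0 deg C
B = -5.775e-7

def pt100_resistance(temp_c):
    """R(T) = R0 * (1 + A*T + B*T**2)."""
    return R0 * (1.0 + A * temp_c + B * temp_c ** 2)

def pt100_temperature(resistance_ohm):
    """Invert R(T) by solving the quadratic for T."""
    disc = A ** 2 - 4.0 * B * (1.0 - resistance_ohm / R0)
    return (-A + math.sqrt(disc)) / (2.0 * B)

r = pt100_resistance(100.0)  # ~138.5 ohms at 100 deg C
print(f"{r:.2f} ohms -> {pt100_temperature(r):.1f} deg C")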
Thermistors
Thermistors, like platinum elements and other metallic conductors, can also be used
for temperature measurement. Being much more sensitive to temperature, they can be
used to measure much smaller temperature changes. Thermistors are semiconductor (P-N
junction) devices and have a nonlinear resistance/temperature relationship, which
requires linearization techniques for wide-temperature-range applications.
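One common linearization approach is the beta model (the Steinhart-Hart equation is a more accurate alternative). A minimal sketch, assuming a hypothetical 10 kΩ NTC thermistor with β = 3950 K:

import math

def thermistor_temp_c(resistance_ohm, r0_ohm=10_000.0, t0_k=298.15, beta_k=3950.0):
    """Beta-model conversion: 1/T = 1/T0 + (1/beta) * ln(R/R0)."""
    inv_t = 1.0 / t0_k + math.log(resistance_ohm / r0_ohm) / beta_k
    return 1.0 / inv_t - 273.15

# Resistance falls steeply with temperature (NTC behavior):
for r in (25_000.0, 10_000.0, 4_000.0):
    print(f"{r:>8.0f} ohms -> {thermistor_temp_c(r):6.1f} deg C")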
Conclusion
This chapter has presented the most common types of measuring devices used in the process
industries. No attempt has been made to cover every principle of operation or
application; the intent is merely to introduce the industrial measurement methods. Many
process transmitters are used in analytical applications, including chromatographs, pH,
conductivity, turbidity, O2 content, dissolved oxygen, and others discussed elsewhere.
For further detail, the reference material should be consulted.
Further Information
Anderson, Norman A. Instrumentation for Process Measurement and Control. 3rd ed.
Boca Raton, FL: CRC Press, Taylor & Francis Group, 1998.
Berge, Jonas. Fieldbuses for Process Control: Engineering, Operation and
Maintenance. Research Triangle Park, NC: ISA, 2004.
Gillum, Donald R. Industrial Pressure, Level, and Density Measurement. 2nd ed.
Research Triangle Park, NC: ISA, 2009.
Lipták, Béla G. Instrument Engineer’s Handbook: Process Measurement and Analysis.
Vol. 1, 4th ed. Boca Raton, FL: CRC Press, 1995.
About the Author
Donald R. Gillum’s industrial background includes 10 years as an instrument/analyzer
engineering technician and application engineer at Lyondell Petrochemical. Before that
he was a process operator at that plant and a chemical operator at a gaseous diffusion
nuclear facility. He was employed by Texas State Technical College for 40 years where
he was actively involved in all facets of student recruitment, curriculum development,
laboratory design, instruction, industrial training, and continuing education. Gillum
developed and taught courses for ISA from 1982 to 2008 and was instrumental in the
purchase and development of the original training center in downtown Raleigh. He has
taught courses at this facility and on location since that time, and he has written articles
for various technical publications and training manuals. He is the author of the textbook
Industrial Pressure, Level, and Density Measurement.
On the volunteer side, Gillum served two terms on ISA's Executive Board as vice president of District VII and vice president of the Education Department, now PDD. As
PDD director, he served in the area of certification and credentialing. He was a program
evaluator for ABET, Commissioner of the Technical Accreditation Commission, and
was a member of ABET’s Board of Directors.
Gillum is an ISA Life Member and a recipient of ISA’s Golden Achievement Award. He
holds a BS from the University of Houston and is a PE in Control System Engineering.
7
Analytical Instrumentation
By James F. Tatera
Introduction
Process analytical instruments are a unique category of process control instruments.
They are a special class of sensors that enable the control engineer to control and/or
monitor process and product characteristics in significantly more complex and various
ways than traditional, more physical sensors—such as pressure, temperature, and flow
—allow.
Today’s safety and environmental requirements, time-sensitive production processes,
inventory reduction efforts, cost reduction efforts, and process automation schemes have
made process analysis a requirement for many process control strategies. Most process
analyzers are providing real-time information to the control scheme that many years ago
would have been the type of feedback the production process would have received from
a plant’s quality assurance laboratory. Today most processes require faster feedback to
control the process, rather than just being advised that their process was or wasn’t in
control at the time the lab sample was taken, and/or that the sample was or wasn’t in
specification.
Various individuals have attempted to categorize the large variety of monitors typically
called process analyzers. None of these classification schemes has been widely
accepted; the result is that there are many categorization schemes in use, simultaneously.
Most of these schemes are based on either the analytical/measurement technology being
utilized by the monitor, the application to which the monitor is being applied, or the type
of sample being analyzed. There are no hard and fast definitions for analyzer types.
Consequently, most analytical instruments are classed under multiple and different
groupings. Table 7-1 depicts a few of the analyzer type labels commonly used.
An example of how a single analyzer can be classified under many types would be a pH
analyzer. This analyzer is designed to measure the pH (an electrochemical property of a
solution—usually water-based). As such, it can be used to report the pH of the solution
and may be labeled as a pH analyzer or electrochemical analyzer (its analytical
technology label). It may be used to monitor the plant's water outfall, in which case
it may be called an environmental or water-quality analyzer (based on its application
and sample type). It may be used to monitor the acid or base concentration of a process
stream, in which case it may be labeled a single-component concentration analyzer
(based on its application and the desired result being reported). This is just an example
and it is only intended to assist in understanding that there are many process analyzers
that will be labeled under multiple classifications. Don’t allow this to confuse or bother
you.
There are too many process analyzer technologies to mention in this chapter so only a
few will be used as examples. A few of the many books published on process analysis
are listed in the reference summary at the end of this chapter. I recommend consulting
them for further information on individual/specific technologies. The balance of this
chapter will be used to introduce concepts and technical details that are important in the
application of process analyzers.
Sample Point Selection
Determining which sample to analyze is usually an iterative process based on several
factors and inputs. Some of these factors include regulatory requirements, product
quality, process conditions, control strategies, economic justifications, and more.
Usually, the final selection is a compromise that may not be optimum for any one factor,
but is the overall best of the options under consideration. Too often, mistakes are made
on existing processes when selections are based on a simple single guideline, like the
final product or the intermediate sample that has been routinely taken to the lab. True,
you usually have relatively good data regarding the composition of that sample. But, is
it the best sample to use to control the process continuously to make the best product
and/or manage the process safely and efficiently? Or, is it the one that will just tell you
that you have or have not made good product? Both are useful information, but usually
the latter is more effectively accomplished in a laboratory environment.
When you consider all the costs of procurement, engineering, installation, and
maintenance, you rarely save enough money to justify installing process analyzers just
to shut down or replace some of your lab analyzers. Large savings are usually achieved
through improved process control, based on analyzer input. If you can see a process
moving out of control and correct the issue before it loses control, you can avoid making
bad products, rework, waste, and so on. If you must rely on detecting a final product that
is already moving toward or out of specification, you are more likely to make additional
bad product. Consequently, lab approval samples usually focus on the final product for
approval, while process analyzers used for process control more often measure upstream
and intermediate samples.
Think of a distillation process. Typically, lab samples are taken from the top or bottom
of columns and the results usually contain difficult-to-measure, very low concentrations
of lights or heavies because they are being distilled out. Often you can better achieve
your goal by taking a sample from within the column at or near a major concentration
break point. This sample can indicate that lights or heavies are moving in an undesired
direction before a full change has reached the column take-offs, and you can adjust the
column operating parameters (temperature, pressure, and/or flow) in a way that returns
the column operation to the desired state. An intermediate sample from within the
distillation column usually contains analytes at concentrations that are easier and
more robust to analyze.
To select the best sample point and analyzer type, a multidisciplinary team is usually the
best approach. In the distillation example mentioned above, you would probably want a
process engineer, controls engineer, analyzer specialist, quality specialist, and possibly
others on the team to help identify the best sampling point and analyzer type. If you
have the luxury of an appropriate pilot plant, that is often the best place to start because
you do not have to risk interfering with production while possibly experimenting with
different sample points and control strategies. In addition, pilot plants are often allowed
to intentionally make bad product and demonstrate undesirable operating conditions.
Other tools that are often used to help identify desirable sample points and control
strategies are multiple, temporary, relocatable process analyzers (TURPAs). With an
appropriate selection of these instruments, you can do a great job modeling the process
and evaluating different control strategies. Once you have identified the sample point
and desired measurement, you are ready to begin the instrument selection phase of the
process.
Instrument Selection
Process analyzer selection is also typically best accomplished by a multidisciplinary
team. This team’s members often include the process analytical specialist, quality
assurance and/or research lab analysts, process analyzer maintenance personnel, process
engineers, instrument engineers, and possibly others. The list should include all the
appropriate individuals who have a stake in the project. Of these job categories, the
individuals most often not contacted until the selection has been made—and the ones
who are probably most important to its long-term success—are the maintenance
personnel. Everyone relies on them having confidence in the instrument and keeping it
working over the long term.
At this point, you have identified the sample and component(s) to be measured, the
concentration range to be measured (under anticipated normal and upset operating
conditions), the required measurement precision and accuracy, the speed of analysis
required to support the identified control strategy, and so on. You also should have
identified all other components/materials that could be present under normal and
abnormal conditions. The method selection process must include determining if these
other components/materials could interfere with the method/technologies being
considered.
You are now identifying technology candidates and trying to select the best for this
measurement. If you have a current lab analytical method for this or a similar sample,
you should not ignore it; however, more often than not, it is not the best technology for
the process analysis. It is often too slow, complex, fragile, or expensive (or it has other
issues) to successfully pass the final cut. Lastly, if you have more than one good
candidate, consider the site’s experience maintaining these technologies. Does
maintenance have experience and training on the technologies? Does the site have
existing spare parts and compatible data communications systems? If not, what are the
spare parts supply and maintenance support issues?
These items and additional maintenance concerns should be identified by the
maintenance representative on the selection team and need to be considered in the
selection process. The greatest measurement technology in the world will not meet your
long-term needs if it cannot be maintained and kept performing in a manner consistent
with the process's operations. The minimum acceptable availability of the technology is
an important factor and is typically at least 95%.
At this point, the selection team needs to review its options and select the analytical
technology that will best serve the process needs over the long term. The analytical
technology selected does not need to be the one that will yield the most accurate and
precise measurement. It should be the one that will provide the measurement that you
require in a timely, reliable, and cost-effective manner over the long-term.
Sample Conditioning Systems
Sample conditioning sounds simple. You take the sample that the process provides and
condition/modify it in ways that allow the selected analyzer to accept it. Despite this
relatively simple-sounding mission, most process analyzer specialists attribute 50% to
80% of process analyzer failures to sample conditioning issues. Recall that the system
must deliver an acceptable sample to the analyzer under a variety of normal, upset,
start-up, shutdown, and other process conditions. Usually the sampling system not only
has to address the interface requirements of getting an acceptable sample from the
process to the analyzer, it must also dispose of that sample in a reliable and
cost-effective manner. Disposing of the sample often involves returning it to the
process and sometimes conditioning it to make it appropriate for the return journey.
What is considered an acceptable sample for the analyzer? Often this is defined as one
that is “representative” of the process stream and compatible with the analyzer’s sample
introduction requirements. In this case, “representative” can be a confusing term.
Usually it doesn’t have to represent the process stream in a physical sense. Typical
sample conditioning systems change the process sample’s temperature, pressure, and
some other parameters to make the process sample compatible with the selected process
analyzer. In many cases, conditioning goes beyond the simple temperature and pressure
issues, and includes things that change the composition of the sample—things like
filters, demisters, bubblers, scrubbers, membrane separators, and more. However, if the
resulting sample is compatible with the analyzer and the resulting analysis is
correlatable/representative of the process, the sample conditioning is considered to be
done well.
Some sampling systems also provide stream switching and/or auto calibration
capabilities. Because the process and calibration samples often begin at different
conditions and to reduce the possibility of cross contamination, most process analyzers
include sampling systems with stream selection capabilities (often double-block and
bleed valving) just before the analyzer.
Figure 7-1 depicts a typical boxed sample conditioning system. A total sampling system
would normally also include process sample extraction (possibly a tee, probe, vaporizer,
etc.), transport lines, return lines, fast loop, slow loop, sample return, and so on. The
figure contains an assortment of components including filters, flow controllers, and
valves. Figure 7-2 depicts an in-situ analyzer that requires little sample conditioning and
is essentially installed in the process loop.
Another type of sample conditioning system is defined by the ISA-76.00.02 and IEC 62339-1
Ed. 1.0 B:2006 standards and is commonly referred to as NeSSI (New Sampling/Sensor
Initiative). This sample conditioning system minimizes dead volume and space by using
specially designed sample conditioning components on a predefined high-density
substrate system.
Most analyzers are designed to work on clean, dry, noncorrosive samples in a specific
temperature and pressure range. The sample system should convert the process sample
conditions to the conditions required by the analyzer in a timely, “representative,”
accurate, and usable form. A well-designed, operating, and maintained sampling system
is necessary for the success of the process analyzer project. Sampling is a crucial art and
science for successful process analysis projects. It is a topic that is too large to more
than touch on in this chapter. For more information on sampling, refer to some of the
references listed.
Process Analytical System Installation
The installation requirements of process analyzers vary dramatically. Figure 7-2 depicts
an environmentally hardened process viscometer installed directly in the process
environment with very little sample or environmental conditioning.
Figure 7-3 depicts process gas chromatographs (GCs) installed in an environmentally
conditioned shelter. These GC analyzers require a much more conditioned/controlled
sample and installation environment.
Figure 7-4 shows the exterior of one of these environmentally controlled shelters. Note
the heating and ventilation unit near the door on the left and the sample conditioning
cabinets mounted on the shelter wall under the canopy.
When installing a process analyzer, the next most important thing to measurement and
sampling technologies—as in real estate—is location. If the recommended environment
for a given analyzer is not appropriate, the project is likely doomed to failure. Also,
environmental conditioning can be expensive and must be included in the project cost.
In many cases, the cost of a shelter and/or other environmental conditioning can easily
exceed the costs of the instrument itself.
Several highly hardened analyzers are suitable for direct installation in various
hazardous process environments, while others may not be. Some analyzers can operate
with only a moderate amount of process-induced vibration, sample condition variation,
and ambient environmental fluctuation. Others can require highly stable environments
(almost like a lab environment). This all needs to be taken into consideration during the
technology selection and installation design processes. To obtain the best technology
and design choice for the situation, you need to consider all these issues and
process/project specific sampling issues, such as how far and how fast the sample will
have to be transported and where it must be returned. You should also determine the
best/nearest suitable location for installing the analyzer.
Typical installation issues to consider include the hazardous rating of the process sample
area, as compared to the hazardous area certification of the process analyzers. Are there
large pumps and/or compressors nearby that may cause a lot of vibration and require
special or distant analyzer installation techniques? Each analyzer comes with detailed
installation requirements. These should be reviewed prior to making a purchase.
Generalized guidelines for various analyzer types are mentioned in some of the
references cited at the end of this chapter.
Maintenance
Maintenance is the backbone of any analyzer project. If the analyzer cannot be
maintained and kept running when the process requires the results, it should not have
been installed in the first place. No company that uses analyzers has an objective of
buying analyzers. They buy them to save money and/or to keep their plants running.
A cadre of properly trained and dedicated craftsmen with access to appropriate
maintenance resources is essential to keep process analyzers working properly. It is not
uncommon for a complex process analyzer system to require 5% to 10% of its purchase
price in annual maintenance. One Raman spectroscopy application required a $25,000
laser twice a year, while the system itself cost only $150,000, an annual maintenance
cost of 33% of the purchase price. This is high, but not necessarily
unacceptable, depending on the benefit the analyzer system is providing.
Many maintenance approaches have been used successfully to support analyzers for
process control. Most of these approaches have included a combination of predictive,
preventive, and break-down maintenance. Issues like filter cleaning, utility gas cylinder
replacement, mechanical valve and other moving part overhauls, and many others tend
to lend themselves to predictive and/or preventive maintenance. Most process analyzers
are complex enough to require microprocessor controllers, and many of these have
sufficient excess microprocessor capacity that vendors have used for added features,
such as diagnostics that advise the maintenance team of failures and/or approaching
failures, and resident operation and maintenance manuals.
Analyzer shelters have helped encourage frequent and thorough maintenance checks, as
well as appropriate repairs. Analyzer hardware is likely to receive better attention from
the maintenance department if it is easily accessible and housed in a desirable work
environment (like a heated and air-conditioned shelter). Figure 7-5 depicts a moderately
complex GC analyzer oven. It has multiple detectors, multiple separation columns, and
multiple valves to monitor, control, and synchronize the GC separation and
measurement.
Obviously, a system this complex will be more easily maintained in a well-lit and
environmentally conducive work environment. Complex analyzers, like many GC
systems and spectroscopy systems, have typically demonstrated better performance
when installed in environmentally stable areas. Figure 7-3 depicts one of these areas
with three process GCs. The top section of the unit includes a microprocessor and the
complex electronics required to control the analyzer functions and operation,
communicate with the outside world, and (sometimes) to control a variety of sample
system features. The middle section is the utility gas control section that controls the
required flows of several application essential gases. The lower section is the guts, so to
speak. It is the thermostatically controlled separation oven with an application-specific
assortment of separation columns, valves, and detectors (see Figure 7-5).
Utilizing an acceptable (possibly not the best) analytical technology that the
maintenance department is already familiar with can have many positive benefits.
Existing spare parts may be readily available. Maintenance technicians may already
have general training in the technology and, if the demonstrated applications have been
highly successful, they may go into the start-up phase with a positive attitude. In
addition, the control system may already have connections to an appropriate GC and/or
other data highways.
Calibration is generally treated as a maintenance function, although it is treated
differently in different applications and facilities. Most regulatory and environmental
applications require frequent calibrations (automatic or manual). Many plants that are
strongly into statistical process control (SPC) and Six Sigma have come to realize they
can induce some minor instability into a process by overcalibrating their monitoring and
control instruments. These organizations have begun using techniques like averaging
multiple calibrations and using the average numbers for their calibration. They also can
conduct benchmark calibrations, or calibration checks, and if the check is within the
statistical guidelines of the original calibration series, they make no adjustments to the
instrument. If the results are outside of the acceptable statistical range, not only do they
recalibrate, but they also go over the instrument/application to try to determine what
may have caused the change.
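The benchmark-check logic described above can be sketched in a few lines, assuming (hypothetically) that the original calibration series is summarized by its mean and standard deviation with a 3-sigma acceptance band:

from statistics import mean, stdev

def calibration_check(check_reading, calibration_series, k_sigma=3.0):
    """Return True if a benchmark check falls within the statistical
    guidelines of the original calibration series (no adjustment made).
    """
    m, s = mean(calibration_series), stdev(calibration_series)
    return abs(check_reading - m) <= k_sigma * s

# Hypothetical calibration series against a reference standard:
series = [50.1, 49.9, 50.0, 50.2, 49.8]
for reading in (50.15, 51.2):
    ok = calibration_check(reading, series)
    action = "leave as-is" if ok else "recalibrate and investigate"
    print(f"check {reading}: {action}")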
With the declining price of microprocessors, equipment manufacturers are using them
more readily, even in simpler analytical instruments that don’t really require them. With
the excess computing capability that comes with many of these systems, an increasing
number of vendors have been developing diagnostic and maintenance packages to aid in
maintaining these analytical systems. The packages are typically called performance
monitoring and/or diagnostics systems; they often monitor the status of the instrument
and its availability to the process, keep a failure and maintenance history, contain
software maintenance documentation/manuals, and provide appropriate alarms to the
process and maintenance departments.
Lastly, maintenance work assignments and priorities are especially tricky for process
analytical instruments. Most are somewhat unique in complexity and other issues. Two
process GC systems that look similar may require significantly different maintenance
support because of process, sample, and/or application differences. Consequently, it is
usually best to develop maintenance workload assignments based on actual maintenance
histories when available, averaged to eliminate individual maintenance-worker
variations and time-weighted toward recent history, which best reflects the current
(possibly improved, or aging and soon needing replacement) installation.
Maintenance priorities are quite complex and require a multidisciplinary team effort to
determine. What the analyzer is doing at any given time can impact its priority. Some
analyzers are primarily used during start-up and shutdown and have a higher priority as
these operations approach. Others are required to run the plant (environmental and
safety are a part of this category) and, consequently, have an extremely high priority.
Others can be prioritized based on the financial savings they provide for the company.
The multidisciplinary team must decide which analyzers justify immediate maintenance,
including call-ins and/or vendor support. Some may only justify normal available
workday maintenance activities. After you have gone through one of these priority
setting exercises, you will have a much better understanding of the value of your
analytical installations to the plant’s operation. If you don’t have adequate maintenance
monitoring programs/activities in place, it can be very difficult to assess workloads
and/or priorities. The first step in implementing these activities must be to collect the
data that is necessary to make these types of decisions.
Utilization of Results
Process analytical results are used for many purposes. The following are some of the
most prominent uses:
• Closed-loop control
• Open-loop control
• Process monitoring
• Product quality monitoring
• Environmental monitoring
• Safety monitoring
With the large installed data communications base that exists in most modern plants, the
process analysis results/outputs are used by most major systems including process
control systems, laboratory information management systems, plant information
systems, maintenance systems, safety systems, enterprise systems, and regulatory
reporting systems (like environmental reports to the EPA). As mentioned previously,
various groups use these results in different ways. Refer to the cited references for more
information.
Further Information
Carr-Brion, K. G., and J. R. P. Clarke. Sampling Systems for Process Analysers. 2nd ed.
Oxford, England: Butterworth-Heinemann, 1996.
Houser, E. A. Principles of Sample Handling and Sampling Systems Design for Process
Analysis. Research Triangle Park, NC: ISA (Instrument Society of America,
currently the International Society of Automation), 1972.
IEC 62339-1:2006 Ed. 1.0. Modular Component Interfaces for Surface-Mount Fluid
Distribution Components – Part 1: Elastomeric Seals. Geneva 20 - Switzerland:
IEC (International Electrotechnical Commission).
ISA-76.00.02-2002. Modular Component Interfaces for Surface-Mount Fluid
Distribution Components – Part 1: Elastomeric Seals. Research Triangle Park, NC:
ISA (International Society of Automation).
Lipták, Béla G., ed. Instrument Engineers’ Handbook. Vol. 1, Process Measurement and
Analysis. 4th ed. Boca Raton, FL: CRC Press/ISA (International Society of
Automation), 2003.
Sherman, R. E., ed., and L. J. Rhodes, assoc. ed. Analytical Instrumentation: Practical
Guides for Measurement and Control Series. Research Triangle Park, NC: ISA
(International Society of Automation), 1996.
Sherman, R. E. Process Analyzer Sample-Conditioning System Technology. Wiley
Series in Chemical Engineering. New York: John Wiley & Sons, Inc., 2002.
Waters, Tony. Industrial Sampling Systems. Solon, OH: Swagelok, 2013.
About the Author
James F. (Jim) Tatera is a senior process analysis consultant with Tatera Associates.
For many years, he has provided process analytical consulting/contracting services to
user, vendor, and academic organizations, authored many book chapters, and coauthored many process-analytical-related marketing reports with PAI Partners. His more
than 45-year career included 27 years of working with process analyzers for Dow
Corning, including both U.S. and international assignments in analytical research,
process engineering, project engineering, production management, maintenance
management, and a Global Process Analytical Expertise Center. Tatera is an ISA Fellow,
is one of the original Certified Specialists in Analytical Technology (CSAT), is active in
U.S. and IEC process analysis standards activities, and has received several awards for
his work in the process analysis field. He is the ANSI U.S. National Committee (USNC)
co-technical advisor to IEC SC 65B (Industrial Process Measurement, Control, and
Automation—Measurement and Control Devices), the convener of IEC SC 65B WG14
(Analyzing Equipment), and has served as the chair and as a member of the ISA-SP76
(Composition Analyzers) committee. He has also served in several section and
international leadership roles in both the International Society of Automation (ISA) and
American Chemical Society (ACS).
8
Control Valves
By Hans D. Baumann
Since the onset of the electronic age, preoccupied with keeping up with the
ever-increasing sophistication of control instrumentation and control algorithms,
instrument engineers have paid less and less attention to final control elements, even
though no process control loop could function without them.
Final control elements may be the most important part of a control loop
because they control process variables, such as pressure, temperature, tank
level, and so on. All these control functions involve the regulation of fluid
flow in a system. The control valve is the most versatile device able to do
this. Thermodynamically speaking, the moving element of a valve, be it
a plug, ball, or vane, together with one or more orifices, restricts the
flow of fluid. This restriction causes the passing fluid to accelerate
(converting potential energy into kinetic energy). The fluid exits the orifice
into an open space in the valve housing, which causes the fluid to
decelerate and create turbulence. This turbulence in turn creates heat, and
at the same time reduces the flow rate or pressure.
Unfortunately, this wastes potential energy because part of the process is irreversible. In
addition, high-pressure reduction in a valve can cause cavitation in liquids or substantial
aerodynamic noise with gases. One must choose special valves designed for those
services.
There are other types of final control elements, such as speed-controlled pumps and
variable speed drives. Speed-controlled pumps, while more efficient when flow rates are
fairly constant, lack the size ranges, material choices, high pressure and temperature
ratings, and wide flow ranges that control valves offer. Advertising claims touting better
efficiency than valves cite as proof only the low-power consumption of the variable
speed motor and omit the high-power consumption of the voltage or frequency
converter that is needed.
Similarly, variable speed drives are mechanical devices that vary the speed between a
motor and a pump or blower. These don’t need an electric current converter because
their speed is mechanically adjusted. Control valves have a number of advantages over
speed-controlled pumps: they are available in a variety of materials and sizes, they have
a wider rangeability (range between maximum and minimum controllable flow), and
they have a better speed of response.
To make the reader familiar with at least some of the major types of control valves (the
most important final control element), here is a brief description.
Valve Types
There are two basic styles of control valves: rotary motion and linear motion. The valve
shaft of rotary motion valves rotates a vane or plug following the commands of a rotary
actuator. The valve stem of linear motion valves moves toward or away from the orifice
driven by reciprocating actuators. Ball valves and butterfly valves are both rotary
motion valves; a globe valve is a typical linear motion valve. Rotary motion valves are
generally used in moderate-to-light-duty service in sizes above 2 in (50 mm), whereas
linear motion valves are commonly used for more severe duty service. For the same
pipe size, rotary valves are smaller and lighter than linear motion valves and are more
economical in cost, particularly in sizes above 3 in (80 mm).
Globe valves are typical linear motion valves. They have less pressure recovery (a higher
pressure recovery factor [FL]) than rotary valves and, therefore, less noise and
fewer cavitation problems. The penalty is a lower CV (KV) per unit diameter
compared to rotary types.
Ball Valves
When a ball valve is used as a control valve, it will usually have design modifications to
improve performance. Instead of a full spherical ball, it will typically have a ball
segment. This reduces the amount of seal contact, thus reducing friction and allowing
for more precise positioning. The leading edge of the ball segment may have a V-shaped
groove to improve the control characteristic. Ball valve trim material is generally 300
series stainless steel (see Figure 8-1).
Segmented ball valves are popular in paper mills due to their capability to shear fibers.
Their flow capacity is similar to butterfly valves; hence they have high pressure
recovery in mid- and high-flow ranges.
Eccentric Rotary Plug Valves
Another form of rotary control valve is the eccentric, rotary-plug type (see Figure 8-2)
with the closure member shaped like a mushroom and attached slightly offset to the
shaft. This style provides good control along with a tight shutoff, as the offset supplies
leverage to cam the disc face into the seat. The advantage of this valve is tight shutoff
without the elastomeric seat seals used in ball and butterfly valves. The trim material for
eccentric disc valves is generally 300 series stainless steel, which may be clad with
Stellite® hard facing.
The flow capacity is about equal to that of globe valves. These valves are less
susceptible to fouling by slurries or gummy fluids because of their rotary shaft bearings.
Butterfly Valves
Butterfly valves are a low-cost solution for control loops using low pressure and
temperature fluids. They save space due to their small profile, they are available in sizes
from 2 in (50 mm) to 50 in (1,250 mm), and they can be rubber lined for tight shut off
(see Figure 8-3). A more advanced variety uses a double-eccentric vane that, in
combination with a metal seal ring, can provide low leakage rates even at elevated
temperatures.
A more advanced design of butterfly valves is shown in Figure 8-4. Here the vane is in
the shape of a letter Z. The gradual opening described above produces a preferred equal
percentage characteristic in contrast to conventional butterfly valves having a typically
linear characteristic.
A butterfly valve used as a control valve may have a somewhat S-shaped disc (see
Figure 8-3) to reduce the flow-induced torque on the disc; this allows for more precise
positioning and prevents torque reversal. Butterfly valve trim material may be bronze,
ductile iron, or 300 series stainless steel. A feature separating on/off rotary valves
from those adapted for control is the tight connection between the plug or vane and the
seat to ensure a tight seal when closed. Also needed is a tight coupling between the
valve stem and actuator stem, which avoids the loose play that leads to deadband,
hysteresis, and, in turn, control-loop instability.
Globe Valves
These valves can be subdivided in two common styles: post-guided and cage-guided. In
post-guided valves, the moving closure member or stem is guided by a bushing in the
valve bonnet. The closure member is usually unbalanced, and the fluid pressure drop
acting on the closure member can create significant forces. Post-guided trims are well
suited for slurries and fluids with entrained solids. The post-guiding area is not in the
active flow stream, reducing the chance of solids entering the guiding area. Post-guided
valve-trim materials are usually either 400 series, 300 series, or type 17-4 PH stainless
steel. Post-guided valves are preferred in valve sizes of 1/4 in (6 mm) to 2 in (50 mm)
due to the lower cost (see Figure 8-5).
A cage-guided valve has a cylindrical cage between the body and the bonnet. Below the
cage is a seat ring. The cage/seat-ring stack is sealed with resilient gaskets on both
ends. The cage guides the closure member, also known as the plug. The plug is often
pressure-balanced, with a dynamic seal between the plug and cage. A balanced plug has
passageways through the plug to eliminate the pressure differential across the plug and
the resulting pressure-induced force. The trim materials for cage-guided valves are often
either 400 series, 300 series, or 17-4 PH stainless steel, sometimes nitrided or Stellite
hard-faced (see Figure 8-6). Care should be taken to allow for thermal expansion of the
cage at high temperatures, especially when the cage and housing are of dissimilar
metals. One advantage of cage-guided valves is that they can easily be fitted with
low-noise or anti-cavitation trim. The penalty is a reduction in flow capacity.
Note: Do not use so-called cage-guided valves for sticky fluids or fluids that have solid
contaminants, such as crude oil.
Typical configurations are used for valve sizes from 1/2 in (12.5 mm) to 3 in (80 mm).
The plug and seat ring may be hard-faced for high pressure or slurries.
Three-Way Valves
Another common valve type is the three-way valve. As the name implies, these valves
have three ports and are used for either bypass or mixing services where a special plug
(if it is linear motion) or a rotating vane or plug (if it is rotary motion) controls fluid
flow. Most designs are either for bypass or for mixing, although some types, like the one
shown in Figure 8-7, can be used for either service. Three-way valves are quite common
in heating and air conditioning applications.
Figure 8-7 shows a novel three-way valve design, featuring a scooped-out vane that can
be rotated 90 degrees to open either port A or port B for improved flow capacity. There
is practically no flow interruption when throttling between ports because flow areas
totaling A plus B are always equal to the flow area of port C regardless of vane travel;
this maintains a constant flow through port C. This valve is equally suitable for either
mixing or bypass operations.
The valve has a very good flow characteristic. For mixing service, flow enters from ports A and B and discharges through port C. For bypass service, fluid enters port C and is discharged alternately through ports A and B. Regardless of the angular vane position, the combined flow through ports A and B always equals the flow through port C.
Actuators
By far, the most popular actuators for globe valves are the diaphragm actuators discussed in this section. Here an air-loaded diaphragm creates a force that is opposed by a coiled spring. They offer low-cost, fail-safe action and, in contrast to piston actuators, have low friction.
The air signal to a pneumatic actuator typically has a range from 3 to 15 psi or 6 to 30
psi (0.2 to 1 bar or 0.4 to 2 bar) and can originate from an I/P transducer or, more
commonly, from an electronic valve positioner typically receiving a 4–20 mA signal
from a process controller. These are command signals and can be different from internal
actuator signals needed to open or close a valve.
Diaphragm Actuators
Diaphragm actuators consist of a flexible rubber diaphragm clamped between stamped housings; when subjected to an actuator signal, the diaphragm exerts a force that is opposed by a coiled spring. Multiple parallel springs are also popular, because they offer
a lower actuator profile. Typical internal actuator signals are 5–15 psi (0.35–1 bar) or
3–12 psi (0.2–0.8 bar). The difference between a command signal from a controller or
transducer and the internal signal span is used to provide sufficient force to close a valve
plug. In a typical example, a diaphragm actuator having a 50 in2 (320 cm2) effective
diaphragm area receives a controller signal of 3 psi (0.2 bar) to close a valve. This then
provides an air pressure excess of 2 psi (0.14 bar), because the internal actuator signal is
5–15 psi (0.35–1 bar). At a 50 in2 (320 cm2) area, there is a 100 lb (45 kg) force available
to close a valve plug against fluid pressure.
The spring rate can be calculated by multiplying the difference between the maximum
and minimum actuator signals times the diaphragm area, and then by dividing this by
the amount of travel. The spring rate of the coiled spring is important to know because it
defines the stiffness factor (Kn) of the actuator-valve combination. Such stiffness is
necessary to overcome fluid-imposed dynamic instability of the valve trim. It is also
necessary to prevent a single-seated plug from slamming onto the valve orifice on a
valve with a flow direction that tends to close. Equation 8-1 below can be used to
estimate the stiffness factor for actuator springs.
Required Kn = (orifice area • maximum inlet pressure)/(one-quarter of the valve travel)
(8-1)
Example:
Valve orifice diameter = 4 in (0.1 m)
P1 = 100 psi (7 bar)
Valve travel = 1.5 in (38 mm)
Result:
Kn = 12.6 in2 • 100 psi/(1.5 in ÷ 4) = 3,360 lb/in (600 kg/cm)
In this case, a spring having a rating of at least 3,360 lb/in could meet the stiffness
requirement.
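Because these sizing checks recur, they are easy to script. The following is a minimal Python sketch (the function names are illustrative, not from any vendor library) that reproduces the closing-force example and Equation 8-1:

import math

def closing_force_lb(pressure_excess_psi, diaphragm_area_in2):
    """Net seating force: excess air pressure times effective diaphragm area."""
    return pressure_excess_psi * diaphragm_area_in2

def spring_rate_lb_per_in(signal_span_psi, diaphragm_area_in2, travel_in):
    """Spring rate: actuator signal span times diaphragm area, divided by travel."""
    return signal_span_psi * diaphragm_area_in2 / travel_in

def required_kn(orifice_diameter_in, max_inlet_psi, travel_in):
    """Equation 8-1: orifice area times max inlet pressure over one-quarter travel."""
    orifice_area_in2 = math.pi * orifice_diameter_in**2 / 4.0
    return orifice_area_in2 * max_inlet_psi / (travel_in / 4.0)

print(closing_force_lb(2.0, 50.0))    # 100 lb, as in the text's example
print(required_kn(4.0, 100.0, 1.5))   # ~3,351 lb/in (3,360 with area rounded to 12.6 in2)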
Commercially available spring-diaphragm actuators are available in sizes from 25 in2 to
500 in2 (160 cm2 to 3,226 cm2).
Rotary valves (see the example in Figure 8-8) more commonly employ piston-type pneumatic actuators, using a linkage to convert linear piston travel to rotary action.
The stiffness of piston-type actuators depends primarily on the volume-change-induced
pressure change in the actuator. Below is a generally accepted method for calculating
the stiffness factor, Knp.
Knp = Sb • Ab2/(Vb + h • Ab) + St • Ab2/(Vt + h • At)
(8-2)
where
h = stem motion
Ab = area below the piston (typically At less the area of the stem)
At = area above the piston
Vb = unused volume of the cylinder below the piston
Vt = unused volume of the cylinder above the piston
Sb = signal pressure below the piston
St = signal pressure above the piston
Example:
h = 1 in (2.5 cm)
At = 12 in2 (75 cm2)
Ab = 11 in2 (69 cm2)
Vb = 2.5 in3 (41 cm3)
Vt = 5 in3 (82 cm3)
Sb = 30 psia (2 bar abs)
St = 60 psia (4 bar abs)
Result:
Knp = 610 lb/in (109 kg/cm)
Any reduction in unused (dead) volume and increase in signal pressures aids stiffness.
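As a sketch only, the following Python function evaluates Equation 8-2 as reconstructed above. Because the printed equation is ambiguous about the grouping of its second term, this literal reading does not reproduce the worked example's 610 lb/in exactly; it illustrates the method rather than the book's precise arithmetic.

def piston_stiffness(sb, st, ab, at, vb, vt, h):
    """Equation 8-2 (as reconstructed): each trapped air column acts as a
    gas spring with stiffness ~ pressure * area^2 / volume (lb/in)."""
    below = sb * ab**2 / (vb + h * ab)   # air column below the piston
    above = st * ab**2 / (vt + h * at)   # air column above the piston
    return below + above

# Values from the text's example; this reading gives ~696 lb/in, while the
# text quotes approximately 610 lb/in.
print(piston_stiffness(sb=30, st=60, ab=11, at=12, vb=2.5, vt=5, h=1))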
The actuator shown in Figure 8-9 is in the fail-close position. By flipping the diaphragm
and placing the spring below it, the actuator makes the valve fail-open. The normal flow
direction for this globe valve is flow-to-open for better dynamic stability (here the entry
port is on the right).
However, an opposite flow direction may be desired to help the actuator close the valve
in case of an emergency. Here the fluid pressure tends to push the plug down against the
seat, which can lead to instability; a larger actuator with greater stiffness (factor Kn)
may be required to compensate for this downward force.
Pneumatic, Diaphragm-Less Piston Actuators
Another way of using compressed air to operate valves is by utilizing piston-cylinder
actuators (see Figure 8-10). Here a piston is driven by high-pressure air of up to 100 psi
(7 bar), which can be opposed by a coiled spring (for fail-safe action) or by a somewhat
lower air pressure on the opposite side of the piston. As a result, piston-cylinder
actuators are more powerful than diaphragm actuators for a given size.
A disadvantage of piston-cylinder actuators is that they always need a valve positioner
and they require higher air pressure supply—up to 100 psi (7 bar)—than is required for
diaphragm actuators.
Electric Actuators
Electric actuators are found more often in power plants where there are valves with
higher pressure requirements, or where there is no compressed air available. They
employ either a mechanical system (i.e., with gear and screws) or a magnetic piston-type system to drive the valve stem. Screw or gear types offer more protection against
sudden erratic stem forces caused by fluid-induced instability due to their inherent
stiffness. A potential problem with electric actuators is the need for standby batteries to
guard against power failure. Travel time typically is 30 to 60 seconds per inch (25.4
mm) compared to 7 to 10 seconds per inch for pneumatic actuators. Magnetic actuators
are limited by available output forces.
Electric actuators utilize alternating current (AC) motors to drive speed-reducing gear
trains. The last gear rotates an acme threaded spindle, which in turn is attached to a
valve stem. Acme threaded spindles have less than 50% efficiency and, therefore,
provide “self-locking” against high or sudden valve stem forces. Smaller electric
actuators sometimes employ ball-bearing-supported thread and nuts for greater force
output. However, they do not prevent back-sliding under excess loads.
The positioning sensitivity of electric actuators is relatively low due to the high inertia
of the mechanical components. There always is a trade-off between force output and
travel speed. High-force output is typically chosen over speed due to the high cost of
electric actuators.
Hydraulic Actuators
These are typically electrohydraulic devices (see example in Figure 8-11) because they
derive their operating energy from electric motor-driven oil pumps. They too offer
stiffness against dynamic valve instability. The cost of these actuators is high, and they
require more maintenance than pneumatic actuators. Hydraulic actuators are found
primarily indoors because the viscosity of operating oil varies significantly at lower
outdoor temperatures. Another problem to consider is oil leakage that would cause a
drift in the stem position.
Fail-safe action can be provided by placing a coiled spring opposite the oil pressure
driven piston in a cylinder. The oil flow is controlled by three-way spool valves
positioned by a voice coil. A stem-connected potentiometer provides feedback. The
typical operating signal is 4–20 mA. Quick-acting solenoids can evacuate a cylinder and
close or open a valve in an emergency. Such features make electrohydraulic actuators
popular on fuel-control valves used for commercial gas turbines.
Electrohydraulic units are typically self-contained; hence, there is no need for external
pipes or tanks. These devices are used on valves or dampers that require a high
operating torque and that need to overcome high-friction forces; the stiffness provided by the incompressible oil overcomes the deadband normally caused by such friction. Note that the oil-pumping motors have to run continuously; hence, power consumption is higher, and oil-blocking valves are needed in case of power failure.
Electrohydraulic actuators require special positioners using a 4–20 mA signal. Typically,
a three-way spool valve is used, activated by a voice coil to guide the oil. Other systems
use stepping motors to pulse the oil flow at the pump.
Potential problems with hydraulic systems are change in viscosity (at low temperatures),
which reduces response time, and fluid leakage, which causes stem drifting.
Accessories
Valve Positioners
These devices are attached to an actuator to compare the position of the valve stem or
rotary valve shaft with the position intended by the signal generated by a digital
controller or computer. There are two basic types: pneumatic and electronic (digital).
Pneumatic positioners are gradually being phased out in favor of electronic ones.
However, they are useful in plant areas that must be explosion-proof and in gas fields
where compressed gas is used instead of compressed air (which is by far the most
common actuating medium). Commonly, 30 psi (2 bar) pressure controlled air is used as
the supply and 3–15 psi (0.2–1 bar) is used as the output signal to the actuator.
Positioners are best thought of as “position controllers.” The input is the signal from the
process controller. The feedback is a signal from a potentiometer or Hall-effect sensor
measuring the position of the valve stem. Any offset between the input and feedback
causes activation of a command signal to the actuator.
As with any process controller, positioners have stability problems. Adjustable gain and
speed of travel adjustments are used to fight instability, which is typically caused by
valve friction. Typical modes of adjustment include the following: if a high gain (increased position accuracy) is required, reduce the travel speed of the actuator; if higher speed is needed (such as in pressure control applications), reduce the gain to avoid “hunting.”
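The gain-versus-speed trade-off can be pictured with a toy proportional loop. This Python sketch is purely illustrative (it is not any positioner's firmware); the rate limit stands in for the travel-speed adjustment:

def positioner_step(setpoint, stem_position, gain, max_rate, dt):
    """One update of an idealized proportional positioner: drive the stem
    toward the setpoint, with the correction rate clamped by the actuator's
    travel-speed limit."""
    error = setpoint - stem_position
    rate = max(-max_rate, min(max_rate, gain * error))
    return stem_position + rate * dt

# Higher gain tightens positioning; lowering max_rate (travel speed) or the
# gain is how "hunting" is damped in practice.
stem = 0.0
for _ in range(30):
    stem = positioner_step(setpoint=0.5, stem_position=stem,
                           gain=4.0, max_rate=0.2, dt=0.1)
print(round(stem, 3))  # settles near the 0.5 setpoint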
Varying the speed of the valve (time constants) can also be used to alter the ratio
between the time constant of the process loop and the time constant of the valve. A
minimum ratio is 3:1 (preferably above 5:1) to avoid process-loop instability (see Table
8-1.)
Electronic or digital positioners typically employ a 4–20 mA or a digital signal from an
electronic control device. Most digital positioners employ microprocessors that, besides
controlling the stem position, can monitor the valve’s overall function and can feed back
any abnormality. Some devices may be able to function as their own process controllers
responding directly to input from a process data transmitter.
The word “digital” is somewhat of a misnomer because, besides the microprocessor, the
positioners employ analog signal-activated voice coils to control the air output to the
valve.
Most modern digital valve positioners (see Figure 8-12) can monitor the performance of
the valve by triggering an artificial controller signal upset and then measuring the
corresponding valve stem reaction (see Figure 8-13). Such tests normally are done when
the process to be controlled is shut down because process engineers are averse to having
an artificial upset while a process is running.
Self-testing electronic positioners need sufficient software programming to fulfill the
following functions:
• Reading the valve’s diagnostic data
• Supporting the diagnostic windows display
• Checking the valve configuration and associated database
• Monitoring the device capabilities
• Configuring the valve’s characteristic
• Listing general processes and the valve database
• Showing the diagnostic database
• Indicating security levels as required by the user
The overall aim is to create a real-time management control system integrating all
aspects of plant management, production control, process control, plant efficiency (see
Figure 8-14), and maintenance.
Position transmitters determine the size of the valve opening by reading the stem
position and then transmit the information to the remote operator, who uses the data to
verify that the valve is functioning.
Limit switches restrict the amount of travel and can also be used to monitor the valve
position. Activation can trigger an alarm system, or it can trigger a solenoid valve to
quickly open or close the valve or confirm the valve is fully open or closed.
Hand wheels are used to move the valve stem manually, which enables rudimentary
manual control of the process in case of signal or power failure. Hand wheels are
typically on top of a diaphragm actuator, for valves up to 2 in (50 mm) size, or on the
side of the actuator yoke.
I/P transducers are sometimes used in place of electronic valve positioners; they are
popular with small valves and in noncritical applications where no valve positioners are
specified. They receive input from a 4–20 mA command system from a process
controller, and normally create a 3–15 psi (0.2–1.0 bar) air signal to a valve.
Solenoid valves can be used as safety devices to shut off air supply to the valve in an
emergency. Alternatively, they can lock air in a piston operator.
Air sets are small pressure regulators used to maintain a constant supply pressure,
typically 30 psi (2 bar), to valve positioners or other pneumatic instruments.
Further Information
ANSI/FCI-70-2-2003. Control Valve Seat Leakage. Cleveland, OH: FCI (Fluid Controls
Institute, Inc.).
ANSI/ISA-75.01.01-2002 (IEC 60534-2-1 Mod). Flow Equations for Sizing Control
Valves. Research Triangle Park, NC: ISA (International Society of Automation).
ASME/ANSI-B16.34-1996. Valves—Flanged, Threaded, and Welding End. New York:
ASME (American Society of Mechanical Engineers).
Borden, G., ed., and P. G. Friedmann, style ed. Control Valves: Practical Guides for
Measurement and Control. Research Triangle Park, NC: ISA (International Society
of Automation), 1998.
Baumann, Hans D. Control Valve Primer: A User’s Guide. 4th ed. Research Triangle
Park, NC: ISA (International Society of Automation), 2009.
IEC 60534-8-3:2010. Noise Considerations – Control Valve Aerodynamic Noise
Prediction Method. Geneva, Switzerland: IEC (International Electrotechnical
Commission).
IEC 60534-8-4:2015. Noise Considerations – Prediction of Noise Generated for
Hydrodynamic Flow. Geneva, Switzerland: IEC (International Electrotechnical
Commission).
ISA-75.17-1989. Control Valve Aerodynamic Noise Prediction. Research Triangle Park,
NC: ISA (International Society of Automation).
ISA-dTR75.24.01-2017. Linear Pneumatic Control Valve Actuator Sizing and Selection.
Research Triangle Park, NC: ISA (International Society of Automation).
About the Author
Hans D. Baumann, PhD, PE, and honorary member of the International Society of
Automation (ISA), is a primary consultant for HB Services Partners, LLC, of West Palm
Beach, Florida. He was formerly a corporate vice president of Fisher Controls and the
owner of Baumann Associates, a manufacturer of control valves. He is the author of the
acclaimed Control Valve Primer and owns titles to 103 U.S. patents. He served for 36
years as a U.S. technical expert on the International Electrotechnical Commission (IEC)
Standards Committee on control valves, where he made significant contributions to
valve sizing and noise prediction standards.
9
Motor and Drive Control
By Dave Polka and Donald G. Dunn
Introduction
Automation is a technique, method, or system of operating or controlling a process by
highly automatic means utilizing electronic devices, which reduces human intervention
to a minimum. Processes utilize mechanical devices to produce a force, which performs work within the process. A motor is a device that converts electrical energy into mechanical energy. There are both alternating current (AC) and direct current (DC) motors, with the AC induction motor being the most common type utilized within most
industries. It is vital that automation engineers have a basic understanding of motor and
electronic drive principles. The drive is the device that controls the motor. The two
interact or work together to provide the torque, speed, and horsepower (hp) necessary to
operate the application or process.
The simplest concept of any motor, either direct or alternating current, is that it consists
of a magnetic circuit interlinked with an electrical circuit in such a manner as to produce a
mechanical turning force. It was recognized long ago that a magnet could be produced
by passing an electric current through a coil wound around magnetic material. Later it
was established that when a current is passed through a conductor or a coil situated in a magnetic field, a force is set up that tends to produce motion of the coil relative to the field.
Thus, a current flowing through a wire will create a magnetic field around the wire. The
more current (or turns) in the wire, the stronger the magnetic field; by changing the
magnetic field, one can induce a voltage in the conductor. Finally, a force is exerted on a
current-carrying conductor when it is in a magnetic field. A magnetic flux is produced
when an electric current flows through a coil of wire (referred to as a stator), and
current is induced in a conductor (referred to as a rotor) adjacent to the magnetic field.
A force is applied at right angles to the magnetic field on any conductor when current
flows through that conductor.
DC Motors and Their Principles of Operation
There are two basic circuits in any DC motor: the armature (device that rotates) and the
field (stationary part with windings). The two components magnetically interact with
one another to produce rotation of the armature. Both the armature and the field are two
separate circuits and are physically next to each other, in order to promote magnetic
interaction.
The armature (IA) has an integral part, called a commutator (see Figure 9-1). The
commutator acts as an electrical switch, always changing polarity of the magnetic flux
to ensure there is a “repelling” force taking place. The armature rotates as a result of the
“repelling” motion created by the magnetic flux of the armature, in opposition to the
magnetic flux created by the field winding (IF).
The physical connection of voltage to the armature is done through “brushes.” Brushes
are made of a carbon material that is in constant contact with the armature’s commutator
plates. The brushes are typically spring loaded to provide constant pressure of the brush
to the commutator plates.
Control of Speed
The speed of a DC motor is a direct result of the applied armature voltage. The field
receives voltage from a separate power supply, sometimes referred to as a field exciter.
This exciter provides power to the field, which in turn generates current and magnetic
flux. In a normal condition, the field is kept at maximum strength, allowing the field
winding to develop maximum current and flux (known as the armature range). The only
way to control the speed is through a change in armature voltage.
Control of Torque
Under certain conditions, motor torque remains constant when operating below base
speed. However, when operating in the field-weakening range, torque drops off as 1/speed2. If the field flux is held constant, as well as the design constant of the motor, then torque is proportional to the armature current. The more load the motor sees, the more current is consumed by the armature.
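In the armature range, that proportionality can be written directly. A minimal sketch, using a hypothetical torque constant Kt that lumps the field flux and the motor design constant:

def dc_motor_torque_lb_ft(kt_lb_ft_per_a, armature_current_a):
    """With constant field flux, torque is proportional to armature current:
    T = Kt * Ia."""
    return kt_lb_ft_per_a * armature_current_a

print(dc_motor_torque_lb_ft(0.8, 25.0))  # hypothetical motor: 20 lb-ft at 25 A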
Enclosure Types and Cooling Methods
In most cases, to allow the motor to develop full torque at less than 50% speed, an
additional blower is required for motor cooling. The enclosures most commonly found
in standard industrial applications are drip-proof, fully guarded (DPFG; see Figure 9-2);
totally enclosed, non-ventilated (TENV); and totally enclosed, fan-cooled (TEFC).
DC Motor Types
Series-Wound
A series-wound DC motor has the armature and field windings connected in a series
circuit. Starting torque developed can be as high as 500% of the full load rating. The
high starting torque is a result of the field winding being operated below the saturation
point. An increase in load will cause a corresponding increase in both armature and field
winding current, which means both armature and field winding flux increase together.
Torque in a DC motor increases as the square of the current value. Compared to a shunt
wound motor, a series-wound motor will generate a larger torque increase for a given
increase in current.
Shunt (Parallel) Wound
Shunt wound DC motors have the armature and field windings connected in parallel.
This type of motor requires two power supplies—one for the armature and one for the
field winding. The starting torque developed can be 250% to 300% of the full load
torque rating, for a short period of time. Speed regulation (speed fluctuation due to load)
is acceptable in many cases, between 5% and 10% of maximum speed, when operated
from a DC drive.
Compound-Wound
Compound-wound DC motors are basically a combination of shunt and series-wound
configurations. This type of motor offers the high starting torque of a series-wound
motor and constant speed regulation (speed stability) under a given load. The torque and
speed characteristics are the result of placing a portion of the field winding circuit, in
series, with the armature circuit. When a load is applied, there is a corresponding
increase in current through the series winding, which also increases the field flux,
increasing torque output.
Permanent Magnet
Permanent magnet motors are built with a standard armature and brushes but have
permanent magnets in place of the shunt field winding. The speed characteristic is close
to that of a shunt wound DC motor. This type of motor is simple to install, with only the
two armature connections needed, and simple to reverse, by simply reversing the
connections to the armature. Though this type of motor has very good starting torque
capability, the speed regulation is slightly less than that of a compound-wound motor.
Peak torque is limited to about 150%.
AC Motors and Their Principles of Operation
All AC motors can be classified into single-phase and polyphase motors (poly meaning many phases, most commonly three-phase). A polyphase squirrel-cage induction motor does not require
a commutator, brushes, or slip rings that are common in DC motors. It has the fewest
windings, least insulation, and lowest cost per horsepower when compared to other
motor designs.
The two main electrical components of an AC induction motor are the stator (i.e., the
stationary element that generates the magnetic flux) and the rotor (i.e., the rotating
element). The stator is the stationary or primary side, and the rotor is the rotating or
secondary part of the motor. The power is transmitted to the rotor inductively from the
stator through a transformer action. The rotor consists of copper or aluminum bars,
connected at the ends by end rings. The rotor core is built up from many individual discs of steel, called laminations. The stator consists of cores that are also constructed with
laminations. These laminations are coated with insulating varnish and then welded
together to form the core (see Figure 9-3).
The revolving field set up by the stator currents cuts the squirrel-cage conducting aluminum bars of the rotor. This induces voltage in these bars, with a corresponding
current flow, which sets up north and south poles in the rotor. Torque (turning of the
rotor) is produced due to the attraction and repulsion between these poles and the poles
of the revolving stator field.
Each magnetic pole pair in Figure 9-4 is wound in such a way that allows the stator
magnetic field to “rotate.” A simple two-pole stator shown in the figure has three coils
in each pole group (a two-pole motor would have two poles and three phases, equaling
six physical poles). Each coil in a pole group is connected to one phase of the three-phase power source. With three-phase power, each phase current reaches a maximum
value at different time intervals. This is shown by maximum and minimum values in the
lower part of Figure 9-4.
Control of Speed
The speed of a squirrel-cage motor depends on the frequency and number of poles for
which the motor is wound (see Equation 9-1).
N = 120 × F/P
(9-1)
where
N = Shaft speed (RPM)
F = frequency of the power supply (hertz)
P = number of stator poles
Squirrel-cage motors are built with the slip ranging from about 3% to 20%. The actual
speed, allowing for slip, is referred to as the base speed, which is the speed of the motor at rated voltage, rated frequency, and rated load. Motor direction is reversed by interchanging
any two motor input leads.
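Equation 9-1 and the slip definition reduce to a few lines of arithmetic. A minimal sketch:

def synchronous_speed_rpm(freq_hz, poles):
    """Equation 9-1: N = 120 * F / P."""
    return 120.0 * freq_hz / poles

def slip_percent(sync_rpm, actual_rpm):
    """Slip expressed as a percentage of synchronous speed."""
    return 100.0 * (sync_rpm - actual_rpm) / sync_rpm

ns = synchronous_speed_rpm(60.0, 4)           # 1,800 RPM for a 4-pole, 60 Hz motor
print(ns, round(slip_percent(ns, 1750.0), 1)) # ~2.8% slip at a 1,750 RPM base speed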
Control of Torque and Horsepower
Horsepower (hp) takes into account the “speed” at which the shaft rotates (see Equation
9-2). By rearranging the equation, a corresponding value for torque can also be
determined.
hp = (T × N)/5,252
(9-2)
where
T = Torque in lb-ft
N = Speed in RPM
A higher number of poles in a motor means a larger amount of torque is developed, with
a corresponding lower base speed. With a lower number of poles, the opposite would be
true.
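Equation 9-2 rearranges to give torque from nameplate horsepower and speed. A minimal sketch using the standard 5,252 constant:

def torque_lb_ft(hp, speed_rpm):
    """Rearranged Equation 9-2: T = 5,252 * hp / N."""
    return 5252.0 * hp / speed_rpm

print(torque_lb_ft(10.0, 1750.0))  # ~30 lb-ft: a 10 hp motor at 1,750 RPM
print(torque_lb_ft(10.0, 875.0))   # ~60 lb-ft: the same hp at half the speed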
Enclosure Types and Cooling
The more common types of AC motor enclosures are the open drip-proof motor (ODP);
the TENV motor; and the TEFC motor (see Figure 9-5).
AC Motor Types
Standard AC Motors
AC motors can be divided into two major categories: asynchronous and synchronous.
The induction motor is the most common type of asynchronous motor (meaning speed is
dependent on slip) (see Figure 9-6).
There are major differences between a synchronous motor and an induction motor, such
as construction and method of operation.
A synchronous motor has a stator with axial slots that consist of stator windings wound
for a specific number of poles. Typically, a salient pole rotor is used on which the rotor
winding is mounted. The rotor winding is fed by a DC supply through slip rings; alternatively, a rotor with permanent magnets is used.
The induction motor has a stator winding that is similar to that of a synchronous motor.
It is wound for a specific number of poles. A squirrel-cage rotor or a wound rotor can be
used. In a squirrel-cage rotor, the rotor bars are permanently short-circuited with end rings, so no slip rings are required. In a wound rotor, the windings are brought out through slip rings so that external resistance can be inserted in the rotor circuit.
Synchronous motor stator poles rotate at the synchronous speed (Ns) when fed with a
three-phase supply. The rotor is fed with a DC supply. The rotor needs to be rotated at a
speed close to the synchronous speed during starting. This causes the rotor poles to
become magnetically coupled with the rotating stator poles, and thus the rotor starts
rotating at the synchronous speed. A synchronous motor always runs at a speed equal to
its synchronous speed (i.e., actual speed = synchronous speed or N = Ns = 120 × F/P).
When an induction motor stator is fed with a two- or three-phase AC supply, a rotating
magnetic field (RMF) is produced. The relative speed between stator’s rotating
magnetic field and the rotor will cause an induced current in the rotor conductors. The
rotor current gives rise to the rotor flux. The direction of this induced current is such
that it will tend to oppose the cause of its production (i.e., the relative speed between the
stator’s RMF and the rotor). Thus, the rotor will try to catch up with the RMF and
reduce the relative speed. An induction motor always runs at a speed that is less than the
synchronous speed (i.e., N < Ns).
The following is a subset of the uses and advantages of both a synchronous motor and
an induction motor.
A synchronous motor is used in various industrial applications where constant speed is
necessary, such as compressor applications. The advantages of these types of motors are
the speed is independent of the load and the power factor can be adjusted. Facilities with
numerous synchronous machines have the ability to operate the machines overexcited, which creates a leading power factor similar to that of a capacitor (the current phase leads the voltage phase) and helps achieve power factor correction for the facility.
The induction motor is the most commonly used motor within manufacturing facilities.
The advantages of induction motors are they can operate in a wide range of industrial
conditions and they are robust and sturdy. Induction motors are lower in cost due to
their simple construction. Induction motors do not have accessories, such as brushes,
slip rings, or commutators, which make their maintenance costs low in comparison to
synchronous machines. Simply put, induction motors require very little maintenance if
applied and installed correctly. In addition, they do not require any complex circuit for
starting. The three-phase motor is self-starting while the single-phase motor can be
made self-starting simply by connecting a capacitor in the auxiliary winding. Induction
motors can be operated in hazardous environments and even under water because they
do not produce sparks like DC motors do. However, the proper motor enclosure
classification is required for operation in these types of applications.
The disadvantage of an induction motor is the difficulty of controlling the speed. At low
loads, the power factor drops to very low values, as does the efficiency. The low power
factor causes a higher current to be drawn and results in higher copper losses. Induction
motors have low starting torque; thus, they cannot be used for applications such as
traction and lifting loads.
Wound Rotor
The wound-rotor motor has controllable speed and torque characteristics. Different
values of resistance are inserted into the rotor circuit to obtain various performance
results. Changes in resistance values normally begin with a secondary resistance
connected to the rotor circuit. The resistance is reduced to allow the motor to increase in
speed. This type of motor can develop substantial torque and, at the same time, limit the
amount of locked rotor current.
Synchronous
The two types of synchronous motors are non-excited and DC-excited. Without complex
electronic control, this motor type is inherently a fixed-speed motor. The synchronous
motor could be considered a three-phase alternator, only operated backwards. DC is
applied directly to the rotor to produce a rotating electromagnetic field, which interacts
with the separately powered stator windings to produce rotation. In reality, synchronous
motors have little to no starting torque. An external device must be used for the initial
start of the motor.
Multiple Pole
Multiple pole motors could be considered “multiple speed” motors. Most of the multiple
pole motors are “dual speed.” Essentially, the conduit box would contain two sets of
wiring configurations—one for low-speed and one for high-speed windings. The
windings would be engaged by electrical contacts or a two-position switch.
Choosing the Right Motor
Application Related
The application must be considered to apply the correct motor. Factors that influence the
selection of the correct motor are application, construction, industry or location, power
requirements and/or restrictions, as well as installed versus operating costs (see Figure
9-7). Typical applications for an induction motor are pumps, fans, compressors,
conveyors, crushers, mixers, shredders, and extruders. Typical applications for a
synchronous motor are high- and low-speed compressors, large pumps, extruders, chippers, special applications with large drives, and mining mills.
In addition, the manufacturer constructs the motors to comply with the standards
provided or specified by the end user. In the United States, the standards utilized vary
depending on the size and application of the motor. The primary standard for the
construction of motors in the United States is the National Electrical Manufacturers
Association (NEMA) MG 1. In addition, there are several other standards typically
utilized that vary depending on the horsepower (hp) or type of machine. The following
are some of those standards:
• American Petroleum Institute – API Standard 541, Form-wound Squirrel Cage
Induction Motors—375 kW (500 Horsepower) and Larger
• American Petroleum Institute – API Standard 546, Brushless Synchronous
Machines—500 kVA and Larger
• Institute of Electrical and Electronics Engineers – IEEE Std. 841-2009, IEEE
Standard for Petroleum and Chemical Industry—Premium-Efficiency, Severe-Duty, Totally Enclosed Fan-Cooled (TEFC) Squirrel Cage Induction Motors—Up
to and Including 370 kW (500 hp)
If the motor is constructed outside of the United States, it typically complies with
International Electrotechnical Commission (IEC) standards. NEMA motors are in
widespread use throughout the United States and are used by some end users globally.
There are some differences between NEMA and IEC standards with regard to terms,
ratings, and so on. Typically, NEMA standards are considered more conservative, which
allows for slight variations in design and application. IEC standards are specific and require significant care when applying them to a particular application.
AC versus DC
There are no fundamental performance limitations that would prevent a flux vector
adjustable speed drive (ASD) from being used in any application where DC drives are
used. In areas such as high-speed operation, the inherent capability of AC motors
exceeds the capability of DC motors. Inverter duty motors have speed range capabilities
that are equal to or above the capabilities of DC motors. DC motors usually require
cooling air forced through the interior of the motor in order to operate over wide speed
ranges. Totally enclosed AC motors are also available with wide speed range
capabilities.
Although DC motors are usually significantly more expensive than AC motors, the
motor-drive package price for an ASD is often comparable to the price of a DC drive
package. If spare motors are required, the package price tends to favor the ASD.
Because AC motors are more reliable in a variety of situations and have a longer
average life, the DC drive alternative may require a spare motor while the AC drive may
not. AC motors are available with a wide range of optional electrical and mechanical
configurations and accessories. DC motors are generally less flexible and the optional
features are generally more expensive.
DC motors are typically operated from a DC drive, which has reduced efficiency at
lower speeds. Because DC motors tend to be less efficient than AC motors, they
generally require more elaborate cooling arrangements. Most AC motors are supplied in
totally enclosed housings that are cooled by blowing air over the exterior surface.
The motor is the controlling element of a DC drive system, while the electronic
controller is the controlling element of an AC drive system. The purchase cost of a DC
drive, in low horsepower sizes, may be less than that of its corresponding AC drive of
the same horsepower. However, the cost of the DC motor may be twice that of the
comparable AC motor. Technology advancements in ASD design have narrowed the purchase price gap between AC and DC drives. DC motor brushes and commutators must be
maintained and replaced after periods of operation. AC motors are typically less
maintenance intensive and are more “off-the-shelf” compared to comparable
horsepower DC motors.
Adjustable Speed Drives (Electronic DC)
Principles of Operation
IEEE 100 defines an adjustable speed drive (ASD) as “an electric drive designed to
provide an easily operable means of speed adjustment of the motor, within a specified
speed range.” Note the use of the word adjustable rather than variable. Adjustable
implies that the speed can be controlled, while variable implies that it may change on its
own. The definition also refers to the motor as being part of the adjustable speed system,
implying that it is a system rather than a single device.
This type of drive converts fixed voltage and frequency AC to an adjustable voltage DC.
A DC drive can operate a shunt wound DC motor or a permanent magnet motor. Most
DC drives use silicon-controlled rectifiers (SCRs) to convert AC to DC (see Figure 9-8).
SCRs provide output voltage when a small voltage is applied to the gate circuit. Output
voltage depends on when the SCR is “gated on,” causing output for the remainder of the
cycle. When the current through the SCR passes through zero, it automatically shuts off until it is gated “on”
again. Three-phase DC drives use six SCRs for full-wave bridge rectification. Insulated
gate bipolar transistors (IGBTs) are now replacing SCRs in power conversion. IGBTs
also use an extremely low voltage to gate “on” the device.
When the speed controller circuit calls for voltage to be produced, the “M” contactor
(main contactor) is closed and the SCRs conduct. In one instant of time, voltage from
the line enters the drive through one phase, is conducted through the SCR, and into the
armature. Current flows through the armature and back into the SCR bridge and returns
through the power line through another phase. At the time this cycle is about complete,
another phase conducts through another SCR, through the armature and back into yet
another phase. The cycle keeps repeating 60 times per second due to 60-hertz line input.
Shunt field winding power is supplied by a DC field exciter, which supplies a constant
voltage to the field winding, thereby creating a constant field flux. Many field exciters
have the ability to reduce supply voltage, used in above base speed operation.
Control of Speed and Torque
A speed reference is given to the drive’s input, which is then fed to the speed controller
(see Figure 9-9).
The speed controller determines the output voltage for desired motor speed. The current
controller signals the SCRs in the firing unit to “gate on.” The SCRs in the converter
section convert fixed, three-phase voltage to a DC voltage and current output in relation
to the desired speed. The current measuring/scaling section monitors the output current
and makes current reference corrections based on the torque requirements of the motor.
If precise speed is not an issue, the DC drive and motor could operate “open loop.”
When more precise speed regulation is required, the speed measuring/scaling circuit
will be engaged by making the appropriate feedback selection. If the feedback is fed by
the EMF measurement circuit, then the speed measuring/scaling circuit will monitor the
armature voltage output. The summing circuit will process the speed reference and
feedback signal and create an error signal. This error signal is used by the speed
controller as a new speed command—or a corrected speed command.
If tighter speed regulation is required (<1%), then “tachometer generator” feedback is
required (e.g., tach feedback or tacho). A tachometer mounts on the end of the motor,
opposite that of the shaft, and feeds back exact shaft speed to the speed controller. When
operating in current regulation mode (controlling torque), the drive closely monitors
values of the current measuring/scaling circuit.
Braking Methods (Dynamic and Regenerative)
After “ramp to stop,” the next fastest stopping time would be achieved by dynamic
braking (see Figure 9-10).
This form of stopping uses a fixed, high-wattage resistor (or bank of resistors) to
transform the rotating energy into heat. This system uses a contactor “M” (see Figure 9-8) to connect the resistor across the armature at the time needed to dissipate the motor-generated voltage.
The fastest “electronic” stopping method is that of “regeneration.” With “regenerative
braking,” all the motor’s energy is fed directly back into the AC power line. To
accomplish this, a second set of “reverse connected” SCRs is required. This allows the
drive to conduct current in the opposite direction (generating the motor’s energy back to
the line). A regenerative drive allows “motoring” and “regenerating” in both the forward
and reverse directions.
Adjustable Speed Drives (Electronic AC)
Principles of Operation – Pulse Width Modulation (PWM)
There are several types of AC drives (ASDs). They all have one concept in common:
they convert fixed voltage and frequency input into an adjustable voltage and frequency
output to change the speed of a motor (see Figures 9-10 and 9-11).
Three-phase power is applied to the input section of the drive, called the converter. This
section contains six diodes arranged in an electrical “bridge.” These diodes convert AC
voltage to DC voltage. The DC bus section accepts the now-converted, fixed-voltage DC. The DC bus filters and smooths the waveform, using “L” (inductors) and
“C” (capacitors). The diodes reconstruct the negative halves of the waveform onto the
positive half, with an average DC voltage of approximately 650–680 V (for a 460 VAC unit).
Once filtered, the DC bus voltage is delivered to the final section of the drive, called the
inverter section. This section “inverts” DC voltage back to AC—but in an adjustable
voltage and frequency output. Devices called insulated gate bipolar transistors (IGBTs)
act as power switches that turn on and off the DC bus voltage, at specific intervals.
Control circuits, called gate drivers, cause the control part of the IGBT (gate) to turn
“on” and “off” as needed.
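The cited DC bus level follows from the rectifier arithmetic: the filtered bus sits near the peak of the line-to-line voltage. A minimal sketch:

import math

def dc_bus_volts(line_to_line_vac):
    """Ideal filtered output of a six-diode bridge: approximately the peak
    of the line-to-line voltage, Vdc ~ sqrt(2) * Vll."""
    return math.sqrt(2.0) * line_to_line_vac

print(round(dc_bus_volts(460.0)))  # ~650 V, the low end of the 650-680 V cited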
Control of Speed and Torque
The torque of a motor is determined by a basic characteristic—the volts per hertz ratio
(V/Hz). If an induction motor is connected to a 460-volt power source at 60 Hz, the
ratio is 7.67 V/Hz. As long as this ratio is kept in proportion, the motor will develop
rated torque. The output of the drive doesn’t provide an exact replica of the AC input
sine waveform (see Figure 9-12).
It provides voltage pulses of constant magnitude. The positive and
negative switching of the IGBTs re-creates the three-phase output. The speed at which
IGBTs are switched is called the carrier frequency or switch frequency. The higher the
switch frequency, the more resolution each PWM pulse contains (typical carrier
frequencies range from 3 kHz to 16 kHz).
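A minimal sketch of the constant volts-per-hertz rule, holding the 7.67 V/Hz ratio up to base speed and capping the voltage there:

def vhz_voltage(freq_hz, rated_volts=460.0, rated_hz=60.0):
    """Command voltage for a constant volts-per-hertz profile: hold
    rated_volts/rated_hz (7.67 V/Hz for a 460 V, 60 Hz motor) below base
    speed, then cap at rated voltage."""
    ratio = rated_volts / rated_hz
    return min(ratio * freq_hz, rated_volts)

print(vhz_voltage(30.0))  # 230 V at 30 Hz keeps the ratio, and rated torque
print(vhz_voltage(75.0))  # capped at 460 V above base speed (field weakening)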
An enhanced version of motor control is termed the direct torque control method. In this
control scheme, field orientation (stator flux) is achieved without feedback using
advanced motor control theory to calculate the motor torque directly. With this method,
there is no need for a tachometer device to indicate motor shaft position (see Figure 9-13).
Braking Methods (Dynamic and Regenerative)
The DC bus of a typical AC drive will take on as much voltage as possible, without
tripping. If an overvoltage trip occurs, the operator has three choices—increase the
deceleration time, add “DC injection or flux braking” (a parameter), or add an external
dynamic braking package. If the deceleration time is extended, the DC bus has more
time to dissipate the energy and stay below the trip point.
With DC injection braking, DC voltage is “injected” into the stator windings for a preset
period of time. Braking torque (counter torque) brings the motor to a quicker stop,
compared to “ramp.” Dynamic braking (DB) uses an externally mounted, fixed, high-wattage resistor (or bank of resistors) to transform the rotating energy into heat (see
Figure 9-14).
When the motor is going faster than commanded speed, the rotational energy is fed back
to the DC bus. Once the bus level increases to a predetermined point, the “chopper”
module activates, and the excess voltage is transferred to the DB resistor.
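Resistor sizing for the chopper circuit is a simple power calculation while the resistor is connected across the bus. A minimal sketch with hypothetical values:

def db_resistor_peak_watts(bus_volts_at_chopper_on, resistance_ohms):
    """Peak power dumped into the dynamic-braking resistor while the chopper
    holds it across the DC bus: P = V^2 / R."""
    return bus_volts_at_chopper_on**2 / resistance_ohms

print(db_resistor_peak_watts(750.0, 10.0))  # 56,250 W peak for a 10-ohm bank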
For regenerative braking, a second set of “reverse-connected” power semiconductors is
required. The latest AC regenerative drives use two sets of IGBTs in the converter
section (some manufacturers term this an “active front end”). The reverse set of power
components allows the drive to conduct current in the opposite direction (taking the
motor’s energy and generating it back to the line). As expected with a four-quadrant
system, this unit allows driving the motor and regenerating in both the forward and
reverse directions.
Automation and the Use of AFDs
The more complex AC drive applications are now accomplished with modifications in
drive software, which some manufacturers call firmware. IGBT technology and high-speed application chips and processors have made the AC drive a true competitor to the traditional DC drive system.
Intelligent and Compact Packaged Designs
Because of the use of microprocessors and IGBTs, today’s 1-horsepower drive is about
one-quarter the size of a 1-horsepower drive 10 years ago. This size reduction is also
attributed to “surface mount” technology used to assemble components to circuit boards.
AC drive designs have fewer parts to replace and include troubleshooting features
available through “on-board” diagnostic or “maintenance assistant” software. In most
cases, packaged AC drives of approximately 75 hp or less use only two circuit boards: a control board and a motor control board.
Programming is typically done with a removable touch keypad or remote operator
panel. With the latest advancements in “flash memory,” the programming panel retains parameter values permanently after power is removed. Drive panels guide the
user through use of a “start-up assistant” and feature multi-language programming, “soft
keys,” and high-resolution graphical displays with Bluetooth capability. The functions
of the keys change depending on the mode of the keypad. Keypads also feature pre-programmed default application values called macros, such as proportional-integral-derivative (PID).
Serial and Fiber-Optic Communications
Control and diagnostic data can be transferred to the upper-level control system every 100 milliseconds. With only three wires for control connections, the drive “health”
and operating statistics are available through any connected laptop.
Fiber-optic communications use plastic or silica (glass fiber) and an intense light source
to transmit data. With optical fiber, thousands of bits of information can be transmitted
at a rate of 4 million bits per second (4M baud). Drive manufacturers typically offer
serial and fiber-optic software that installs directly onto a laptop or desktop computer,
giving access to all drive parameters.
Fieldbus Communications (PLCs)
Data links to programmable logic controllers (PLCs) are common in many high-speed
systems that process control and feedback information. Several manufacturers of PLCs
offer a direct connection to many drive products. Because each PLC uses a specific
programming language, drive manufacturers are required to build an “adapter” (called a
fieldbus module) to translate one language to another (called a protocol). Several
manufacturers allow drive connections to existing internal network structures using
Ethernet modules. Modules that communicate through Transmission Control Protocol
(TCP) and Internet Protocol (IP) addresses allow high-level controls through automated
internal systems. Additional interfaces through wireless technologies and smart phone
“apps” are also making their way into programmable ASDs.
Drive Configurations
Several manufacturers offer a variation of the standard 6-pulse drive. An AC drive that
is termed “12-pulse ready” offers the optional feature of converting a standard 6-pulse
drive to a 12-pulse drive (additional controls and a phase-shift transformer are required).
The 12-pulse drive does an impressive job of reducing the highest contributors of
harmonic distortion back to the power line.
One of the features of AC drive technology is the ability to “bypass” the drive if it stops
operating for any reason. Known as bypass, this configuration is used in many
applications where a fan or pump must continue operating, even though it is at fixed
speed. One manufacturer, ABB Inc., offers “electronic bypass” circuitry. If required, a
circuit board operates all the diagnostics and logic for “automatic” operation in bypass
and feeds bypass information to the building automation system.
Application
The first question to ask about the application of any adjustable speed drive system is,
“What are the advantages of using an ASD?” If a sound business decision is to be made,
the answer is usually fiscally based. The justifications can be summarized as follows:
reduced energy consumption, better process control, reduced amount of mechanical
equipment, and ease of motor starting.
Many loads, such as fans and centrifugal pumps, are centrifugal loads that follow the
affinity laws of performance, which means that if the speed is reduced to 50% of rated
speed, half as much product will be moved for one eighth of the power. In a process
where flow rates vary significantly, the motor can be run at reduced speed with an ASD
to save energy, or it can be run at full speed with the product flow rate controlled by
recirculating the product back through the pump. In the case of a fan, opening and
closing dampers control the flow rate. Both methods reduce the flow rate but do not
reduce energy consumption. Using ASDs in applications such as these reduces the flow
rate and significantly reduces the amount of energy consumed for a given amount of
delivered product, producing a huge energy savings.
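The affinity laws behind that claim are easy to check numerically. A minimal sketch:

def affinity_scaling(speed_fraction):
    """Affinity laws for centrifugal loads: flow ~ N, head ~ N^2, power ~ N^3."""
    return {"flow": speed_fraction,
            "head": speed_fraction**2,
            "power": speed_fraction**3}

print(affinity_scaling(0.5))  # {'flow': 0.5, 'head': 0.25, 'power': 0.125}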
ASDs simplify the process of making fine adjustments in flow rates of a material and
can give better control of the process temperatures, product proportions, and pressures.
To properly evaluate the performance of an ASD in such an application, the process
must be fully understood in detail before the appropriate ASD can be specified.
Being able to control flow rates precisely with an ASD eliminates the need for throttle
valves, fan dampers, and adjustable pitch fan blades. The elimination of these
mechanical devices significantly reduces the installation’s capital investment and
mechanical maintenance.
An ASD usually reduces the adverse impact on a power system when a large motor is
started across-the-line. The drive can be set to draw no more current from the power
system than the motor’s full load current, compared with a normal across-the-line start,
which can draw as much as five to six times the motor’s normal full load current. An
ASD allows 100% torque from a very low motor speed, which is usually more torque
than a motor can produce during across-the-line starting. The use of an ASD makes starting a high-inertia load on a weak power system practical. The use of an ASD allows
the motor to accelerate a load to full speed with minimal impact to the power system. If
desired, the running motor can be transferred to the regular power supply, freeing up the
ASD to either start or control another motor. There are many applications where one
drive accelerates several loads in this way.
An example of the cost savings that can be achieved by using ASDs can be seen on
large motors. One major U.S. oil company found that about one-third of the financial
return from using ASDs on installations came from the energy savings that were
achieved by using an ASD. Additional savings were achieved from better process
control, reduced mechanical equipment, and maintenance. They also found that if they
selected their applications carefully, the installation had an average 2-year payback.
Further Information
ABB Inc. Drive Operations, ST-223-1. Basics of Polyphase AC Motors. Reference
Information, October 1998: 6–29.
API Standard 541. Form-wound Squirrel Cage Induction Motors—375 kW (500
Horsepower) and Larger. Washington, DC: API (American Petroleum Institute).
API Standard 546. Brushless Synchronous Machines—500 kVA and Larger. 3rd ed.
Washington, DC: API (American Petroleum Institute), 2008.
Carrow, Robert S. Electronic Drives. New York: TAB Books, an imprint of McGraw-Hill, 1996: 96–100, 201–207, 254–255.
Ebasco Services Inc. and EA-Mueller Inc. Adjustable Speed Drive Applications
Guidebook. January 1990: 28–29, 32–33, 36–37. (Prepared for the Bonneville
Power Administration.)
IEEE 100-2000. The Authoritative Dictionary of IEEE Standards Terms. 7th ed.
Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers).
IEEE Std. 841-2009. IEEE Standard for Petroleum and Chemical Industry—Premium-Efficiency, Severe-Duty, Totally Enclosed Fan-Cooled (TEFC) Squirrel Cage
Induction Motors—Up to and Including 370 kW (500 hp). Piscataway, NJ: IEEE
(Institute of Electrical and Electronics Engineers).
IEEE Std. 1566-2015. Performance of Adjustable Speed AC Drives Rated 375 kW and
Larger. Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers).
NEMA ICS 7.2-2015. Application Guide for AC Adjustable Speed Drive Systems.
Arlington, VA: NEMA (National Electrical Manufacturers Association).
NEMA MG 1-2016. Motors and Generators. Arlington, VA: NEMA (National Electrical
Manufacturers Association).
Oliver, James A., prepared by, in cooperation with James N. Poole and Tejindar P.
Singh, PE, principal investigator. Adjustable Speed Drives – Application Guide.
JARSCO Engineering Corp., December 1992: 46–47. (Prepared for the Electric
Power Research Institute, Marek J. Samoty, EPRI Project Manager.)
Patrick, Dale R., and Stephen W. Fardo. Rotating Electrical Machines and Power
Systems. 2nd ed. Lilburn, GA: The Fairmont Press, Inc., 1997: 122, 249–250, 287–
290, 296–297.
Polka, Dave. Motors & Drives: A Practical Technology Guide. Research Triangle Park,
NC: ISA (International Society of Automation), 2003.
——— “What is a VFD?” Training notes, P/N Training Notes 01-US-00. ABB Inc.,
June 2001: 1–3.
“Power Transmission Design.” 1993 Guide to PT Products. Penton Publishing Inc.,
1993: A151–A155, A183, A235–A237, A270–A273, A337–A339.
U.S. Electrical Motors, Division of Emerson Electric Co. DC Motors Home Study
Course. No. HSC616-124. St. Louis, MO: U.S. Electrical Motors, 1993:12–18, 20–
27.
About the Authors
Dave Polka is principal technical instructor for ABB Inc., Automation Technologies,
Low-Voltage Drives, in New Berlin, Wisconsin. He has been involved with AC and DC
drive technology for more than 35 years, much of that time focusing on training and
education efforts on AC drives. A technical writer, he has written user manuals and
technical bulletins, along with several motor speed control articles published in major
trade journals. He graduated from the University of Wisconsin–Stout in Menomonie,
Wisconsin, with a BS in industrial education and emphasis in electronics and controls.
Donald G. Dunn is a senior consultant with Allied Reliability Group who has provided
services to the refining, chemical, and various other industries for more than 28 years.
He is currently a senior member of the Institute of Electrical and Electronics Engineers
(IEEE) and the International Society of Automation (ISA). He is a member of the IEEE,
ISA, National Fire Protection Association (NFPA), American Petroleum Institute (API),
and IEC standards development organizations. He co-chairs ISA18, chairs IEEE 841,
and is the convener of IEC 62682. Dunn served as the vice president for the ISA
Standards and Practices Board from 2011 to 2012, chairman of the IEEE Industrial
Applications Society (IAS) Petroleum and Chemical Industry Committee (PCIC) from
2012 to 2014, and chairman of the API Subcommittee on Electrical Equipment from
2012 to 2015. In 2015, he was elected to serve a 3-year term on the ISA Board of
Directors.
III
Electrical Considerations
Electrical Installations
The chapter on electrical installations underscores the reality that the correct installation of electrical equipment is essential to implementing almost any automation system. Electrical installation is an extensive topic, encompassing rigid codes, practices, conductor selection, distribution, grounding, interference factors, and surge protection, of which this book can only scratch the surface.
This chapter is based on U.S. codes and regulations; other countries, not surprisingly, have similar documents.
Electrical Safety
While the safety of electrical installations is closely related to correct electrical
installation in general, electrical safety is addressed separately to ensure that sufficient
emphasis is placed on the safety aspects of the devices and field installations.
System Checkout
There are a variety of ways to verify the integrity of the installation, integration, and
operation of control systems at the component and system levels. The objectives are to
ensure the integrated system functions the way it was planned during the project’s
conceptual stage and specified during the design phase, and to ensure that the full
system functions as intended after it is assembled.
10
Electrical Installations
By Greg Lehmann, CAP
Introduction
The successful automation professional strives to deliver a work product that will not
only meet client requirements but also provide a safe and productive workplace. These
efforts should be ever-present during the design, construction, checkout, start-up, and
operational phases of the project. Instrumentation and control systems should be
designed with diligence to assure that all electrical, hydraulic, and pneumatic
installations provide safe, fault-tolerant conditions that will protect personnel, product,
and the environment.
This chapter will introduce the reader to some basic premises that should be considered
in the electrical installations of automated industrial facilities. While not a complete
“How To,” the chapter will present the fundamentals necessary to assure a safe, reliable,
and practical electrical installation.
Codes, standards, and engineering practices should be viewed as minimum requirements in electrical installations, and the automation professional should, within reason, attempt to go beyond simply complying with these documents. Local electrical codes and regulations must be adhered to; as an example, the National Electrical Code (NEC) is cited throughout this chapter to illustrate the referencing of geographically appropriate code.
Scope
The scope of this chapter is not limited to the presentation of applicable codes,
standards, and practices as they relate to electrical installations. The chapter will also
present some useful and significant aspects and criteria that may serve as an aid in
designing a safe and dependable electrical installation. The information presented will
establish the groundwork necessary for a proficient design, which will lead to a
successful electrical installation.
The chapter topics include basic wiring practices, wire and cable selection, grounding,
noise reduction, lightning protection, electrical circuit and surge protection, raceways,
distribution equipment, and more.
The automation professional is responsible for the design and deployment of systems
and equipment for many diverse industries. While these industries may share a common
electrical code within their countries or jurisdictions, they will likely have differing
standards and practices between their specific industries. This chapter will present a
non-industry-specific, practical approach to electrical installations that will assist the
automation professional working with electrical engineers in providing an installation
that is safe, reliable, and productive.
Definitions
• Ampacity – The maximum current, in amperes, that a conductor can carry
continuously under the conditions of use without exceeding its temperature rating
[NEC Article 100, Definitions].
• Bonded (bonding) – Connected to establish electrical continuity and
conductivity [NEC Article 100, Definitions].
• Bonding of electrically conductive materials and other equipment – Normally
non-current-carrying electrically conductive materials that are likely to become
energized shall be connected together and to the electrical supply source in a
manner that establishes an effective ground-fault current path [NEC Article
250.4(A)(4)].
• Bonding of electrical equipment – Normally non-current-carrying conductive
materials enclosing electrical conductors or equipment, or forming part of such
equipment, shall be connected together and to the electrical supply source in a
manner that establishes an effective ground-fault current path [NEC Article
250.4(A)(3)].
• Cable – A factory assembly of two or more conductors having an overall
covering [NEC Article 800.2, Definitions].
• Cable tray system – A unit or assembly of units or sections and associated
fittings forming a structural system used to securely fasten or support cables and
raceways [NEC Article 392.2, Definitions].
• Effective ground-fault current path – Electrical equipment and wiring and
other electrically conductive material likely to become energized shall be
installed in a manner that creates a low-impedance circuit facilitating the
operation of the overcurrent device or ground detector for high-impedance
grounded systems [NEC Article 250.4(A)(5)].
• Enclosure – The case or housing of apparatus, or the fence or walls surrounding
an installation to prevent personnel from accidentally contacting energized parts
or to protect the equipment from physical damage [NEC Article 100, Definitions].
• Ground – The earth [NEC Article 100, Definitions].
• Grounded (Grounding) – Connected (connecting) to ground or to a conductive
body that extends the ground connection [NEC Article 100, Definitions].
• Grounding conductor, equipment (EGC) – The conductive path(s) that
provides a ground-fault current path and connects normally non-current-carrying
parts of equipment together and to the system grounded conductor or to the
grounding electrode conductor, or both [NEC Article 100, Definitions].
• Grounding of electrical equipment – Normally non-current-carrying conductive
materials enclosing electrical conductors or equipment, or forming part of such
equipment, shall be connected to earth so as to limit the voltage to ground on
these materials [NEC Article 250.4(A)(2)].
• Overcurrent – Any current in excess of the rated current of equipment or the
ampacity of a conductor. It may result from overload, short circuit, or ground
fault [NEC Article 100, Definitions].
• Overload – Operation of equipment in excess of normal, full-load rating, or of a
conductor in excess of rated ampacity that, when it persists for a sufficient length
of time, would cause damage or dangerous overheating. A fault, such as a short
circuit or ground fault, is not an overload [NEC Article 100, Definitions].
• Raceway – An enclosed channel designed expressly for holding wires, cables, or
busbars, with additional functions as permitted in (the NEC) [NEC Article 100,
Definitions].
• Separately derived system (SDS) – An electrical source, other than a service, having no direct connection to circuit conductors of any other electrical source other than those established by grounding and bonding connections [NEC Article 100, Definitions].
• Wire – A factory assembly of one or more insulated conductors without an
overall covering [NEC Article 800.2, Definitions].
Basic Wiring Practices
Creating an accurate and detailed design that includes complete specifications and drawings is the most important facet of a successful electrical installation. Comprehensive drawings require less interpretation, allowing the installer to concentrate more effort on a safe, neat, and workmanlike installation.
The design of the electrical installation should contain properly sized enclosures and
raceways that will also provide adequate protection for the equipment and wiring.
Described below are several types of enclosures and raceways, as well as discussions
about which types are the most prevalent in industrial facilities. Hazardous, wet, and
corrosive environments, both inside and outside of the facility, are major concerns that
need to be addressed when specifying the enclosures and raceways.
To facilitate wire pulling and access, junction boxes, pull boxes, or conduit fittings should be installed when conduit runs are overly long or need to contain more bends than the electrical code allows between access points. NEC Article 314.29 states that boxes, conduit bodies, and handhole enclosures must be installed so that the wiring contained in them is accessible without removing any part of the building or, in underground circuits, without excavating earth or any other substance that is used to establish the finished grade. Raceway systems of conduit with properly located and sized junction boxes and pull boxes should be practical installations that lend themselves well to future expansion (adding conductors) and provide sufficient access to the installed circuits; for this reason, initial installations normally have an upper design limit of about 40% fill, as sketched below.
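As a minimal sketch of the fill check (the conduit and conductor dimensions below are illustrative placeholders, not NEC table values):

```python
import math

def fill_percent(conduit_id_in, conductor_ods_in):
    """Percent of the conduit cross-section occupied by the listed conductors."""
    conduit_area = math.pi * (conduit_id_in / 2) ** 2
    conductor_area = sum(math.pi * (od / 2) ** 2 for od in conductor_ods_in)
    return 100.0 * conductor_area / conduit_area

# Illustrative only: a 1-in trade size conduit (~1.049-in internal diameter)
# carrying six conductors of 0.20-in overall diameter each.
pct = fill_percent(1.049, [0.20] * 6)
print(f"Fill: {pct:.1f}% (design limit typically 40%)")
```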
Junction boxes, pull boxes, control panels, and marshalling panels should be centrally
and appropriately located to facilitate the interconnection of field devices to the I/O
panels of the control system. In large facilities, signals of like type from multiple instruments or equipment can be individually routed to intermediate field terminal boxes, where they can be combined into multiconductor cables and further routed to marshalling panels.
Marshalling panels provide a convenient method of reorganizing individual signals into
groupings that make routing and termination at the programmable logic controller
(PLC), distributed control system (DCS), and input/output (I/O) interface more efficient.
Wire and Cable Selection
Conductors are electrically conductive materials that can intentionally or unintentionally
carry electrical current. Wires are intentional current-carrying conductors, strands of
copper or aluminum, that are used to distribute electrical power, electrical signals, and
electronic data throughout facilities—to energize equipment or establish
communications. Each wire is individually covered with a nonconductive coating that
insulates the conductor from other voltages and insulates other equipment or wires from
the voltage it contains. Cables are groupings of two or more insulated wires that are
contained within one sheath or protective jacket.
The selected wire and cable insulation must be appropriate for the raceways and
environments where they will be installed. Each end of the terminated wire should be
uniquely identified and labeled with a number, or letter and number combination, from a
numbering scheme that is designed and reflected on the drawings. Wire, raceway, and
equipment numbering schemes should be standardized and used, unchanged, throughout
the entire facility. Useful numbering schemes may include a drawing number followed by a sequentially numbered suffix that is unique to the specific wire or piece of equipment. It is
very useful to design and use a numbering scheme that will allow the operations and
maintenance personnel to determine what drawing to reference based on the number on
the equipment, raceway, or wire.
Wire Selection
Wires, as defined and used in NEC Article 800, are the conductors utilized in
communication circuits. The NEC uses the term conductor for the “current carriers”; this chapter uses both wire and conductor, loosely, to describe the physical conductors. The NEC also uses the term wiring to describe the act of interconnecting
equipment with conductors. NEC Article 310, Conductors for General Wiring, covers
the general requirements for conductors and their type designations, insulations,
markings, mechanical strengths, ampacity ratings, and uses.
Wire is selected based on the type of insulation needed and the sizing and material of the
current-carrying conductor. The insulation must be suitable for the voltage, location
(i.e., wet, dry, damp, direct sunlight, or corrosive), and temperature (e.g., ambient and
internally generated). The conductor (copper or aluminum) must be selected and sized
to be of adequate ampacity for the current it will carry. Specifying solely by the NEC
tables will not be enough for some applications. It may be necessary to contact the
manufacturer to verify the proper application of the insulation and conductor.
Wire Insulation Selection
NEC 310.10 defines dry, damp, wet, and direct sunlight locations and, most importantly, the insulation types that are suited for these locations (see Figure 10-1).
THHN and THWN are commonly used types of wiring in industrial applications.
THHN is a 90°C (194°F) rated flame-retardant, heat-resistant thermoplastic insulation that is approved for dry or damp locations. THWN is a 75°C (167°F) rated flame-retardant, moisture- and heat-resistant thermoplastic insulation that is approved for dry or wet locations. Both types of insulation have a nylon (or equivalent) outer covering that facilitates installation by providing a slick finish for easy pulling through conduit.
SIS cable, a “switchboard only” type of insulation, and machine tool wire (MTW), a “dry location” type of insulation, are the two types most commonly used in switchboards, industrial control panels, and motor control centers (MCCs). Both MTW and SIS insulation are very pliable and easy to work with.
Conductor Identification
Grounded Conductors
NEC Article 200.6, Means of Identifying Grounded Conductors, states that the
insulation of grounded conductors (neutral or common), 6 American wire gauge (AWG)
or smaller, should be identified with a continuous white or gray outer finish or by three
continuous white stripes, on other than green insulation, along its entire length.
Grounding Conductors
NEC Article 250.119, Identification of Equipment Grounding Conductors, states that
grounding conductors may be bare, covered, or insulated. Covered or insulated
grounding conductors should be green, or green with one or more yellow stripes only.
NEC Article 250.119 further states that when only qualified persons will be maintaining
or servicing equipment, one or more insulated conductors in a multiconductor cable, at
the time of installation, can have their exposed insulation marked with green tape to
signify them as grounding conductors.
Ungrounded Conductors
The finish or color of ungrounded conductors, whether used as single conductor or in
multiconductor cables, needs to be clearly distinguishable from grounded and grounding
conductors.
The actual colors to be used for the coding of phase conductors are not specified in the NEC. Common practice in the United States is to use black (L1), red (L2), and blue (L3) for 120/208Y-volt systems, and brown (L1), orange (L2), and yellow (L3) for 277/480Y-volt systems.
NEC Article 110.15, High-Leg Marking, states that the high leg (wild leg) of a four-wire, delta-connected system where the midpoint of one phase winding is grounded shall be durably and permanently marked by an outer finish that is orange in color or by other effective means.
In Canada, red/black/blue are mandated for L1/L2/L3 of non-isolated systems.
The National Fire Protection Association (NFPA) standard NFPA 79 (2007), Electrical Standard for Industrial Machinery, Article 13.2.4.3, states that, with a few exceptions, the following colors are to be used for ungrounded conductors:
1. Black – for ungrounded alternating current (AC) and direct current (DC) power
conductors
2. Red – for ungrounded AC control conductors
3. Blue – for ungrounded DC control conductors
This color scheme is not mandated by the NEC, but these colors are commonly used in
industrial facilities for the uses stated.
Un-Isolatable Conductors
Per NFPA 79, orange is used to signify a conductor within an industrial control panel or enclosure that is not de-energized by the main panel disconnecting means. NFPA 79-2002 allowed either orange or yellow for this purpose, and yellow was the most commonly used color; that edition has been superseded by the “orange only” NFPA 79-2007.
Conductor Selection
The equipment nameplate data will show the amperage rating of the equipment under
normal noncontinuous duty use. The amperage rating of equipment that will be used
continuously (three or more continuous hours of operation) will have to be adjusted to
125% of the rated current. Nameplate and adjusted loads of all connected equipment
will then have to be tallied to properly determine the current that the conductor will
have to carry. NEC 210.19(A), 215.2, 230.42(A), and Article 430 should be used to properly calculate continuous, noncontinuous, and motor full-load currents for equipment powered by branch circuits, feeders, service conductors, and motor circuit conductors, respectively.
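A minimal sketch of this tally, assuming hypothetical nameplate loads (the 125% multiplier is the continuous-load adjustment just described):

```python
def required_ampacity(loads):
    """Sum noncontinuous loads plus 125% of continuous loads.

    loads: iterable of (amps, is_continuous) tuples taken from nameplate data.
    """
    return sum(amps * 1.25 if continuous else amps
               for amps, continuous in loads)

# Hypothetical branch circuit: a 16 A continuous process heater plus
# 12 A of noncontinuous utility loads.
loads = [(16.0, True), (12.0, False)]
print(f"Minimum required ampacity: {required_ampacity(loads):.1f} A")  # 32.0 A
```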
The selected insulation type is used in conjunction with the computed full load current
to determine which AWG conductor size should be used from NEC Table 310.15(B)
(16). NEC Table 310.15(B)(16) reflects the safe current-carrying ability of a conductor
when it is operating at an ambient temperature of less than 30°C (86°F) and in a
raceway with no more than three current-carrying conductors. The listed ampacity for
the selected conductor will have to be adjusted if the raceway contains four or more
conductors or if the ambient temperature will be greater than 30°C (86°F).
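The adjustment amounts to multiplying the table ampacity by the applicable correction factors, roughly as sketched below; the factors shown are illustrative and must be verified against the NEC ampacity correction tables for the specific installation:

```python
def adjusted_ampacity(table_ampacity, ambient_factor=1.0, bundle_factor=1.0):
    """Apply the ambient-temperature correction and the conductor-count
    adjustment to a table ampacity; both factors come from the NEC tables."""
    return table_ampacity * ambient_factor * bundle_factor

# Illustrative: a 65 A table ampacity, derated for 4-6 current-carrying
# conductors in the raceway (0.80) and an elevated ambient (0.88).
# Verify both factors against the current edition of the NEC.
print(f"Adjusted ampacity: {adjusted_ampacity(65.0, 0.88, 0.80):.1f} A")  # ~45.8 A
```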
Reducing Voltage Drop
Resistance within a system causes voltage drop, and long runs of undersized wire increase the resistance within the circuit. The greater the resistance, the higher the voltage drop.
The NEC ampacity tables were tabulated using conductor current-carrying properties to assure personnel safety and to protect conductors and equipment from overheating and failure. The NEC tables are primarily safety related and are not concerned with the length of conductor runs and the associated voltage drop. Excessive voltage drop in a circuit may not pose any direct safety concerns; it will, however, waste power and overstress equipment that functions best at full voltage, such as motors for pumps and blowers.
The automation professional will not be able to design out all voltage drop within a system, but it can be minimized through design by upsizing conductors, paralleling conductors, optimizing circuit lengths, or upsizing voltages where sizable loads or considerable distances are involved.
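One common approximation for single-phase voltage drop, sketched below, is Vd = (2 × K × I × L) / CM, where K is the conductor resistivity in ohm-circular-mils per foot (about 12.9 for copper), I the load current, L the one-way circuit length in feet, and CM the conductor area in circular mils. This is a rule-of-thumb estimate, not an NEC requirement:

```python
def voltage_drop_single_phase(amps, one_way_ft, cmil, k=12.9):
    """Approximate drop: Vd = 2 * K * I * L / CM, with K in ohm-cmil/ft."""
    return 2 * k * amps * one_way_ft / cmil

# Illustrative: 16 A over a 200 ft one-way run of 10 AWG copper (10,380 cmil).
vd = voltage_drop_single_phase(16.0, 200.0, 10_380)
print(f"Drop: {vd:.1f} V ({100 * vd / 120.0:.1f}% on a 120 V circuit)")
```

The roughly 8 V (6.6%) result in this illustrative case is exactly the situation where upsizing the conductor or shortening the run would be warranted.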
Cable Selection
Multiconductor cabling is used in industrial facilities for power distribution, motor
control, process control, instrumentation signaling, and communications (data and
voice), where individually insulated conductors are physically protected by an overall
metallic or nonmetallic jacket. Nonmetallic, thermoplastic materials (e.g., PVC,
polyethylene, PVDF, Tefzel, and TFE) are the most commonly used materials in the
cable jacketing. Cable is specified and selected based on use, installation method,
protection level required, and environmental conditions.
Cabling should be selected based on the individual application. Environmental conditions, cable construction requirements, raceway or cable-support systems, and permitted uses all need to be considered in cable selection.
Instrumentation signal cable contains an electrically isolating shielding that surrounds
the twisted pairs of conductors for the entire length of the cable and is referred to as a
shielded twisted-pair (STP) cable. Shielding can be metallic foil, a braid of thin-gauge wire, or a combination of both, typically with a minimum of 90% coverage.
prevents the conducted signal from radiating out and interfering with other cables or
signals, as well as protecting the signal from any external interference or noise. To
facilitate the grounding of the shield, instrumentation signal cables will contain a drain
wire that is in continuous contact with the shield and can be easily terminated to the
grounding system.
Cabling for communication is most often an unshielded twisted-pair (UTP) cable that
consists of individually twisted pairs of wires in one overall thermoplastic protective
jacket.
Some communication protocols (RS-232, RS-422, and RS-485), when used in industrial
environments, require the use of STP cabling to minimize signal interference.
Ground, Grounding, and Bonding
The grounding or earthing system of an electrical installation is a very important part of
the entire facility and should be approached with no less regard than any other mission
critical equipment or process. Electrical system grounding, equipment grounding,
lightning protection, surge protection, and noise reduction are all concerns within an
electrical installation that require a connection to an effective earth ground. Common
attributes or problems among the five issues can be identified, minimized, or totally
resolved with one well-designed and properly installed grounding system.
The purpose of the National Electrical Code is the practical safeguarding of persons and
property from hazards arising from the use of electricity [NEC 90.1(A)].
NEC Article 250, Grounding and Bonding, reinforces this purpose by covering the
requirements for the grounding and bonding of electrical systems with the foremost
concern for personnel safety, equipment protection, and fire protection. Grounding and
bonding are complex and interrelated topics whose functions seem to be characterized,
refined, and redefined with each new issuance of the electrical code.
This section will impart a general overview of grounding and bonding practices and
explain how they are the building blocks of an effective grounding system. Subsequent
sections will build upon these practices and illustrate how they can be employed by the
automation professional to reduce system noise, eliminate ground loops, and assist in
electrical circuit protection.
Ground
The earth contains a vast amount of positively charged protons and negatively charged
electrons that are dispersed rather evenly throughout its surface. This uniform
distribution of charged particles makes the earth neutral with an electrical potential
nearing zero. The earth’s neutrality is used as a reference in absolute measurements of
voltage and as a “ground” reference in electrical installations.
The electrical system is “grounded” to earth by connecting (bonding) it to an electrode
or electrode system that is in direct contact with the earth. In the United Kingdom, the
terms earth and earthing are used where North Americans use the terms ground and
grounding.
Grounding
Electrical equipment failure or lightning strikes can induce dangerous, and sometimes
fatal, unexpected voltages on the non-grounded metal components of electrical
installations. Grounding the normally non-current-carrying conductive components of
the system can prevent shock and electrocution. It is virtually impossible for the metal components of an electrical system to retain an electrical charge when the system’s boxes, equipment, and raceways have been grounded to earth.
Isolated Grounding of Sensitive Electronic Equipment
Bonded and grounded metal raceways, enclosures, and equipment can act as collectors
for electromagnetic interference (EMI) and can impose this noise on electronic signals,
instrumentation, and computers. Some electrical installations require a component’s
grounding circuit to be isolated from the equipment-grounding system within the facility
to reduce electromagnetic interference with computers, sensitive electronic equipment,
or signals.
Bonding
Bonding is the process of using a conductive jumper to permanently join the normally
non-current-carrying metallic components of a system together. Bonding provides an
electrically conductive path that will ensure electrical continuity and capacity to safely
conduct any current likely to be imposed. Bonded equipment should then be connected
to ground to provide an effective ground-fault current path.
A ground fault occurs when there is an unintentional connection between an energized
electrical conductor and a normally non-current-carrying conductive material that has an
established ground-fault current path. Lengthy bonding and grounding conductors with
many loops and bends should be avoided when designing and installing ground-fault
current paths to minimize the imposed voltage. Longer lengths due to poor routing and
unnecessary bends in grounding conductors and system bonding jumpers will increase
the impedance of the grounding system, thereby hindering the proper operation of
overcurrent devices.
Connecting Grounding and Bonding Equipment
NEC Article 250.8, Connection of Grounding and Bonding Equipment, outlines
permissible and non-permissible methods of connecting grounding conductors and
bonding jumpers.
The surface of the equipment to be grounded should be clean and free of all nonconductive coatings that may hinder good electrical continuity between the equipment and the grounding/bonding connector. Paint and other coatings should be removed to ensure a good connection unless the connector has been specifically designed to provide an effective connection without removing the coating (see Figure 10-2).
Grounding System
The automation professional must realize the significance of a properly designed, installed, and tested facility-grounding system. An effective grounding system does much more than wait in anticipation of a lightning strike or a voltage surge that it can eliminate. It is a perpetually functioning system that not only provides personnel and equipment protection but also provides an active zero-volt signal reference that reduces noise and improves instrumentation and equipment functionality.
Utilizing the techniques mentioned in the “Ground,” “Grounding,” and “Bonding”
subsections, the grounding system is a collection of bonded connections of known low
impedance between an earth-grounded grid and the power and instrumentation systems
within the facility. These connections minimize voltage differentials on the ground plane
that produce noise or interference on instrument signal circuits. They also ensure
personnel and equipment protection by providing a direct path to ground to facilitate the
rapid operation of protective overcurrent devices during an electrical ground fault. The principal function of a grounding system is to provide a low-impedance path, of adequate capacity, to return ground faults to their source and minimize transient voltages.
The grounding system consists of five electrically interconnected subsystems. The five
subsystems are the grounding electrode subsystem, fault protection subsystem, lightning
protection subsystem, electrical system grounding subsystem, and instrumentation
signal shield grounding subsystem.
While not all facility-grounding systems contain a lightning protection subsystem or an
instrumentation signal shield grounding subsystem, most contain an electrical system
grounding subsystem, and all contain a fault protection subsystem and a grounding
electrode subsystem.
Grounding Electrode Subsystem
Using grounding electrodes to establish a low-resistance, equipotential ground grid
under and surrounding a facility can be a complex task due to varying soil resistivity.
The grounding electrode subsystem will most commonly consist of driven ground rods,
buried interconnecting cables, and connections to underground metallic pipes, tanks, and
structural members of buildings that are grounded. Metal underground gas piping systems and aluminum electrodes are not permitted to be used as, or be a part of, the grounding electrode subsystem per NEC Article 250.52(B). The grounding electrode system should be augmented with additional electrodes as needed to obtain the minimum resistance specified in the construction documents as well as the minimum resistance requirements of NEC Article 250.53. The lower the resistance achieved in the installation of the grounding electrode system, the greater the protection; every attempt should be made to reduce the resistance to the lowest practical value. All grounding electrodes present at the structure should be bonded together to form one grounding electrode system for the facility.
All below-grade or permanently concealed connections should be exothermic-type connections, and the connections should be visually inspected by the automation professional, or a designated representative, prior to backfill or concealment. Measurements of the resistance to ground of the installed grounding electrode system should be made before the electrical system is energized. A “fall-of-potential” or three-point procedure should be used, as described in IEEE 81-1983, Guide for Measuring Earth Resistivity, Ground Impedance, and Earth Surface Potentials of a Ground System.
Grounding system modifications, including the addition of supplemental electrodes,
should be made to comply with the specified system resistance.
Electrode-to-ground or system-to-ground resistance measurements for system
verification and testing purposes should not be done less than 48 hours following
rainfall and should be made under normally dry conditions. Abnormally wet or dry soil
conditions can unduly influence the soil’s natural resistance and lead to inaccurate test
results.
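The arithmetic of the test itself is simple: the resistance at each potential-probe position is the measured voltage divided by the injected test current, and readings taken at several spacings should form a plateau. A minimal sketch, with illustrative readings:

```python
def ground_resistance(v_measured, i_test):
    """Resistance at one potential-probe position: R = V / I."""
    return v_measured / i_test

def plateau_ok(readings_ohms, tolerance=0.05):
    """Readings taken near 52%, 62%, and 72% of the current-probe distance
    should agree closely; if they do, the probe spacing is adequate."""
    low, high = min(readings_ohms), max(readings_ohms)
    return (high - low) / high <= tolerance

# Illustrative: a 1.0 A test current and voltages at three probe positions.
voltages = [4.8, 4.7, 4.9]
readings = [ground_resistance(v, 1.0) for v in voltages]
if plateau_ok(readings):
    print(f"System resistance approximately {readings[1]:.1f} ohms")
else:
    print("Move the current probe farther out and retest")
```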
Fault Protection Subsystem
According to the military handbook Grounding, Bonding, and Shielding for Electronic Equipments and Facilities, Vol. 1: Basic Theory by the Department of Defense, the fault protection subsystem works as follows:
For effective fault protection, a low resistance path must be provided
between the location of the fault and the transformer supplying the faulted
line. The resistance of the path must be low enough to cause ample fault
current to flow and rapidly trip breakers or blow fuses. The necessary low
resistance return path inside a building is provided by the grounding (green
wire) conductor and the interconnected facility ground network. An
inadvertent contact between energized conductors and any conducting
object connected to the grounding (green wire) conductor will immediately
trip breakers or blow fuses. In a building containing a properly installed
third-wire grounding network, as prescribed by MIL-STD-188-124A, faults
internal to the building are rapidly cleared regardless of the resistance of
the earth connection [MIL-HDBK-419A].
The fault protection subsystem contains an equipment-grounding conductor (EGC) that should be routed in the same raceway or cable with the power conductors. Equipment-grounding conductors can be bare, covered, or insulated.
The proper identification of equipment-grounding conductors is stated in NFPA 79-2007 Section 13.2.2 and NEC Article 250.119, and the appropriate sizing is reflected in NFPA 79 Table 8.2.2.3 and NEC Table 250.122.
All normally non-current-carrying metal objects within the facility should be grounded
with the EGC to provide fault protection. A ground bus should be installed in all
switchboards, panelboards, and MCCs to facilitate the installation of EGCs to all metal
equipment (e.g., conduit, building structural steel, piping, enclosures, and electrical
supporting structures). The EGC should be routed to and terminated on all equipment in
such a manner that the removal of any equipment will not interrupt the continuity of the
equipment-grounding circuit throughout the facility.
Lightning Protection Subsystem
Lightning can damage or destroy unprotected electrical equipment and electronic circuitry by a direct strike to the facility or, indirectly, by transient voltage spikes.
The lightning protection subsystem (LPS) is designed to provide a preferred low-resistance path to ground for lightning discharges without causing facility damage, equipment damage, or injury to personnel. The crucial components of the LPS are air terminals (lightning rods), roof and down conductors, a grounding system, and surge protection.
The LPS should be designed and installed according to the National Fire Protection
Association (NFPA) 780, Standard for the Installation of Lightning Protection Systems.
All materials used for an LPS should adhere to and be listed in Underwriters
Laboratories (UL) 96, Lightning Protection Components.
Electrical System Grounding Subsystem
In grounded AC distribution systems, the main bonding jumper is used to connect a
current-carrying conductor to the equipment-grounding terminal or grounding bar at the
main service disconnecting means. A grounding electrode conductor then connects the
grounding bar or terminal to the facility-grounding electrode system. The proper sizing
for main bonding jumpers and grounding electrode conductors in AC systems is
reflected in NEC Table 250.66.
NEC Article 250.26 defines which conductor is to be grounded in AC premises wiring
systems. The single-point grounded circuit (grounded conductor), typically the neutral
(common), is continued throughout the system to establish reference to ground and
provides a low-impedance path for any fault current (see Figure 10-3).
A ground fault will occur in a grounded system if one of the ungrounded current-carrying conductors contacts a grounded surface. The ground fault will cause an elevated current flow within the system that will cause a circuit breaker to trip, or a fuse to blow, opening the circuit and thereby removing voltage. The main bonding jumper’s function is to provide a very low-impedance path for the ground-fault current to return directly to the source and allow the trip to occur immediately. If the main bonding jumper were not used, the ground-fault current would have to traverse the high-impedance path through the earth, possibly retarding or preventing the breaker operation (see Figure 10-3).
There are a few types of AC systems of 50 volts to 1000 volts that are not required to be
grounded, and they are explained in NEC Article 250.21(A)(1) through (A)(4). AC
systems in this voltage range that supply industrial electric furnaces and separately
derived systems used exclusively for rectifiers that supply only adjustable-speed
industrial drives are two of the defined systems that are not required to be grounded per
NEC Article 250.21.
A ground fault on one conductor in an ungrounded AC system will not open an
overcurrent protective device as it does in a grounded system. Therefore, ungrounded
AC systems of the types listed in 250.21(A) that operate between 120 volts and 1000
volts are required to be fitted with ground detectors per NEC Article 250.21(B). Ground
detectors are designed to provide an alert to the operations and maintenance staff of the
faulted circuit condition but are not intended to open the circuit.
Some separately derived system (SDS) power sources are batteries, solar power
systems, generators, or transformers, and they have no connection to the supply
conductors in other systems. Functioning in the same way as the main bonding jumper
in the grounded AC distribution system, a grounded SDS has a system bonding jumper
that connects the equipment-grounding conductor to the grounded conductor (neutral or
common). The system bonding jumper can be installed anywhere between the SDS
power source and the first disconnecting means or overcurrent device. The grounding
electrode conductor must be connected at the same point that the system bonding
jumper is connected to the grounded conductor. Figure 10-3 illustrates a shielded
isolation transformer and distribution panel where the neutral is grounded at the first
disconnecting means by the grounding electrode conductor and system bonding jumper.
DC power distribution systems that must be grounded are defined in NEC Article
250.160. The neutral conductor of all three-wire, DC systems supplying premises wiring
should be grounded. All two-wire DC systems between 50 volts and 300 volts that supply premises wiring systems are to be grounded, with these exceptions: industrial equipment in limited areas that is equipped with ground detectors, rectifier-derived DC systems supplied by grounded AC systems, and DC fire alarm circuits that have a maximum current draw of 0.03 amps.
Low-voltage DC power supplies (e.g., 24 VDC) used for discrete and analog
instrumentation may have the negative grounded to provide a zero-volt reference or a
“loop-back” path for the circuit. The ground reference can help minimize noise and
ensure proper operation of sensitive electronic equipment. Grounding the negative side
of the power supply is a commonly accepted standard, but non-grounded “floating” DC
power systems are common and may be preferred in certain applications.
Cable shields should be grounded at one end only and tied together at a single-point ground, preferably where the power source negative has been grounded. Grounding a shield at both ends can produce a conductive path (ground loop), and undesirable currents may flow through this loop if there is a difference in potential between the two grounds. These currents may themselves become a source of EMI and adversely affect the signal, nullifying the intended purpose of the shield.
Cables with signals of like levels should be routed together within the raceways and
enclosures and terminated to terminal blocks that have been grouped together and away
from any possible EMI sources in each enclosure. The length of untwisted conductors
and exposed drain wire should be minimized during termination (see Figure 10-4).
The drain wires and shields should remain isolated and ungrounded from enclosures,
junction boxes, conduit, and all other grounded metal objects throughout the entire cable
run up to their termination point. Drain wire integrity and isolation should be maintained
in all pass-through terminal boxes. The individual shields and drains should be isolated
from each other until the shared interconnection to ground at termination. It is common
to jumper the grounded drain wire terminal blocks together with one ground wire that is
then terminated on the enclosure grounding bus. If necessary, an isolated ground bus
should be provided, and one insulated grounding conductor can then be routed to ground
(see Figure 10-5).
Surge Protection
Surges are short durations of transient voltages or currents that exceed the limits of the
electrical system and components. Electrostatic discharge (ESD), lightning strikes,
circuit switching, and ground-faults are the most common reasons for electrical surges
and transient voltages in electrical systems. A surge protection device (SPD) protects
equipment from transient voltages by discharging or diverting the destructive surges
before they contact the equipment. The NEC defines SPDs as follows and addresses
four types.
• Surge-protective device (SPD) – A protective device for limiting transient voltages by diverting or limiting surge current. It also prevents continued flow of follow current while remaining capable of repeating these functions. The four types are:
Type 1: Permanently connected SPDs intended for installation between the
secondary of the service transformer and the line side of the service
disconnect overcurrent device.
Type 2: Permanently connected SPDs intended for installation on the load
side of the service disconnect overcurrent device, including SPDs located
at the branch panel.
Type 3: Point of utilization SPDs.
Type 4: Component SPDs, including discrete components, as well as
assemblies.
Note: For further information on Type 1, Type 2, Type 3, and Type 4 SPDs, see UL
1449, Standard for Surge Protective Devices [NEC Article 100, Definitions].
NEC Article 285 covers the general requirements, installation requirements, and
connection requirements for SPDs (surge arrestors and transient voltage surge
suppressors [TVSSs]) permanently installed on premises wiring systems 1 kV or less.
Successful protection from the damage of surges is best accomplished by the proper selection and application of SPDs. SPDs need to complement the environment in which they are installed and meet the specifications and voltage/current limitations of the equipment to be protected.
Electrical Noise Reduction
Industrial facilities contain many different voltage types, sources, and levels that may generate electromagnetic interference (EMI) noise. The automation professional must identify the troublesome signals that most commonly interfere with others, as well as the sensitive signals that are susceptible to EMI. Once identified, both types must be made electromagnetically compatible. Electromagnetic compatibility (EMC) is the ability of electronic equipment to function without disruption in an electromagnetic environment and to function without disrupting the signals or operation of other equipment with EMI (noise).
Noise is unwanted electrostatic and/or magnetic signals superimposed on
instrumentation signals that will cause erratic readings and possible equipment damage.
Noise has a three-phased life cycle: it is emitted, conducted, and received. Motors,
motor starters, fluorescent lamps, welders, radios, electrical storms, AC power lines,
transformers, and switching circuits are some noise emitters found in industrial
facilities. Long parallel wiring runs and ungrounded raceways may conduct the noise to
the unshielded and/or ungrounded sensitive electronics (receptors).
Electrostatic Noise
Electrostatic noise can be produced anywhere voltage is present, with or without any
current flowing on the emitting conductor. The electrostatic noise emitter imposes
(radiates) its electrical charge in all directions and it can capacitively couple onto any
adjacent conductors that have varying electrical fields (see Figure 10-6). The source of
the noise is most often the result of switching or changing voltage levels within the
facility. Common sources of electrostatic noise are fluorescent lighting, lightning, faults
(short-circuit or ground fault), and circuit or power switching (e.g., motor starts/stops).
A continuous foil shield, single-point grounded, is the most effective way to mitigate the
effects of electrostatic noise (see Figure 10-7). The effectiveness of a shielded cable is
represented by its transfer impedance (Zt) value. Zt relates a current on one surface of
the shield to the corresponding voltage drop generated by this current on the opposite
side of the shield. Because shielding is designed to minimize the ingress and egress of
imposing signals, the lower the Zt, the more effective the shielding.
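To make the Zt relationship concrete, the noise voltage coupled onto the inner circuit can be approximated as the shield current times the transfer impedance times the cable length. A minimal sketch with illustrative values:

```python
def coupled_noise_mv(shield_current_a, zt_mohm_per_m, length_m):
    """Noise voltage (mV) seen by the inner circuit:
    V = I_shield * Zt * length, with Zt in milliohms per meter."""
    return shield_current_a * zt_mohm_per_m * length_m

# Illustrative: 0.5 A of shield current on a 50 m cable run.
for zt in (100.0, 10.0):  # a poor shield vs. a good one (milliohm/m)
    mv = coupled_noise_mv(0.5, zt, 50.0)
    print(f"Zt = {zt:5.1f} mohm/m -> {mv:.0f} mV coupled onto the circuit")
```

The tenfold reduction in coupled noise at the lower Zt illustrates why a low transfer impedance means a more effective shield.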
Magnetic Noise
Current flow through a conductor will form a magnetic field surrounding the conductor
and the strength of this field will be proportional to the amount of flowing current.
Magnetic noise can be created whenever there is a strong magnetic field present. Motor
circuits, power lines, transformers, and generators will all create and radiate magnetic
fields of varying strengths. Magnetic fields will extend out radially from the emitter and
as this magnetic flux crosses an unprotected circuit, a resulting voltage (noise) will be
induced into the circuit. Long parallel runs of conductors will allow the magnetic noise
to propagate throughout the circuit (see Figure 10-8). Unlike the mitigation of
electrostatic noise, shielding alone will not stop magnetic noise. The use of cables with
twisted conductors, unshielded twisted-pair (UTP), will alleviate magnetic noise by
creating small loops that minimize the area where the noise is able to exist (see Figure
10-9).
The electrostatic and magnetic noise mitigation techniques shown in Figures 10-8 and 10-9 are described as they relate to protecting a circuit from the ingress of noise. These techniques are also effective in minimizing the egress of electrostatic and magnetic noise from within the conductors. Using a combination of both techniques, shielded twisted-pair (STP) cabling provides the facility the best protection from ingress and egress of both types of noise (see Figure 10-10).
Common-Mode Noise
Electrostatic and magnetic noise is induced without any physical connections of the
receptor to the noise-emitting sources. Unlike these noise types, common-mode noise is
the result of a physical connection made to an instrumentation circuit. Common-mode
noise is the unwanted flow of current on two conductors of an instrument circuit that is
frequently caused by a ground loop. When a circuit is grounded in two places that have
differing potentials, the current that is produced will flow through the circuit and thus
becomes common-mode noise.
Thermocouples present a unique case with two different causes of common-mode noise. A thermocouple operates by producing a desired millivolt-level signal that is in direct relation to the temperature sensed at the junction of two dissimilar metals. The thermocouple is installed into, and directly contacts, a grounded thermowell that has been inserted into the pipe or vessel. If the input terminals of the connected recorder are also grounded, a ground loop is established, and an unwanted flow of current will result if different potentials exist at each end. To prevent this common-mode noise in a thermocouple circuit, recorder manufacturers differentiate the input terminals from ground by providing high-impedance circuitry. This high-impedance ground connection is referred to as common-mode rejection.
The second cause of common-mode noise in a thermocouple circuit results from the
difference in potentials of the temperature junction or thermocouple extension wire and
the surrounding metal objects (conduit, raceways, thermowells, etc.). The different
potentials can cause common-mode current to flow from the metal objects into the
thermocouple circuit resulting in common-mode noise. The best defense for this type of
common-mode noise is to provide a shielded cable as shown in Figure 10-7. As
suggested previously, the cable shielding should be single-point grounded with the
source voltage zero-reference ground, which, in this case, is at the thermocouple
reference junction.
Crosstalk Noise
Crosstalk noise occurs when the unbalanced signals of parallel conductors are
superimposed onto adjacent signal pairs. Crosstalk can be caused by magnetic,
capacitive, or conductive coupling. Sufficient circuit isolation and crosstalk noise
minimization can be achieved by the individual shielding of the twisted-pair circuits and
single-point grounding of the shielding.
High-Frequency Noise
Variable-frequency drives (VFDs), switching power supplies, and lighting ballasts are
some of the most common sources of high-frequency noise in industrial facilities. More
and more facilities are taking advantage of the cost savings and superior operating
efficiency of variable-frequency drives over traditional motors and controllers.
VFDs typically operate with a line-side voltage of 480 VAC at 60 Hz and vary the load-side frequency to control the connected motor’s speed and torque. This operation makes VFDs inherent sources of high-frequency, common-mode noise that will conductively couple not only to electronics in close proximity, but also to the line voltage, load voltage, and analog input signals of the VFD. The installation of line- and load-side reactors, in series with the line- and load-side wiring, provides harmonic attenuation that greatly reduces the high-frequency noise on all connected wiring. The reactors impede the harmonics by providing inductive reactance in the circuit, which decreases the amount of high-frequency noise or harmonics generated by the VFD, as illustrated below.
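The underlying relation is XL = 2πfL: the reactance a reactor presents grows in proportion to frequency, so it passes the 60 Hz fundamental while increasingly impeding the higher-order harmonics. A sketch with an illustrative 0.5 mH reactor:

```python
import math

def inductive_reactance(freq_hz, inductance_h):
    """X_L = 2 * pi * f * L, in ohms."""
    return 2 * math.pi * freq_hz * inductance_h

L_REACTOR = 0.5e-3  # a 0.5 mH line reactor (illustrative value)
for harmonic in (1, 5, 7, 11):  # fundamental and common VFD harmonics
    f = 60 * harmonic
    print(f"{f:4.0f} Hz: X_L = {inductive_reactance(f, L_REACTOR):5.2f} ohms")
```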
Line- and load-side reactors are superb high-frequency noise filters, and they also have current-limiting abilities. The current-limiting properties of line-side reactors help protect the sensitive VFD circuitry from damage-causing transient voltage spikes that often appear on incoming power lines. Decreasing the load-side harmonics and high-frequency currents with a load-side reactor will increase motor performance and decrease operating temperatures, thereby increasing motor life and efficiency.
Long runs of shielded cabling in and through high-frequency noise environments will typically suffer from higher shield impedances. This increased impedance is a high-resistance path that will hamper the shield’s ability to drain its acquired potentials to ground. Shielded cables in these high-frequency noise environments may need to be grounded at both ends, and possibly at multiple points along the route, to sufficiently improve the shielding operation. The grounds at these multiple grounding points MUST all be at identical potential, or a ground loop will be formed with the shield that can induce noise on the enclosed circuit.
Enclosures
Electrical enclosures (panels, not fences or walls) are sized and type-specified based on the footprint and use of the enclosed components and the environment in which the enclosures will be located. NEC 110.28 and its table (Table 110.28, “Enclosure Selection”) should be used to select the enclosures of switchboards, panelboards, industrial control panels, motor control centers, meter sockets, and motor controllers, rated 600 volts or less, for use in non-hazardous locations. NEC 110.28 provides the automation professional with a baseline for selecting the properly rated enclosure to be used in a variety of environmental conditions.
The NEC Table 110.28 enclosure-type numbering scheme and protection criteria are
based on the National Electrical Manufacturers Association’s (NEMA’s) Standards
Publication 250-2003, Enclosures for Electrical Equipment (1000 Volts Maximum).
Enclosures for use in hazardous (classified) locations, defined in NEC Articles 500-503,
are also identified in NEMA 250-2003 as NEMA types 7, 8, 9, and 10.
NEMA-type enclosure ratings are commonly used for specification in North America and have been adopted by the Canadian Standards Association (CSA), UL, and NFPA (NEC). IEC 60529 (IEC 529), Classification of Degrees of Protection Provided by Enclosures, is a system for specifying non-hazardous location enclosures that is utilized in Europe and other countries outside of North America.
IEC 529 assigns an International Protection (IP) Rating designation for enclosure types
that consists of the letters IP followed by two numbers. The first number, which ranges
from 0 to 6, represents the increasing protection of the equipment from solid objects or
persons entering the enclosure. The number 0 indicates no protection and 6 indicates the
maximum protection. The second number ranges from 0 to 8 and indicates the increasing protection of the equipment from water entering the enclosure. The number 0 indicates no protection and 8 indicates the maximum protection (submersible), as decoded in the sketch below.
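A minimal sketch of decoding an IP designation along these lines; the one-line descriptions are abbreviated paraphrases of the IEC 60529 categories:

```python
# Abbreviated paraphrases of the IEC 60529 protection levels.
SOLIDS = {0: "no protection", 1: ">=50 mm objects", 2: ">=12.5 mm objects",
          3: ">=2.5 mm objects", 4: ">=1 mm objects", 5: "dust-protected",
          6: "dust-tight"}
WATER = {0: "no protection", 1: "dripping water", 2: "dripping (15 deg tilt)",
         3: "spraying water", 4: "splashing water", 5: "water jets",
         6: "powerful jets", 7: "temporary immersion", 8: "continuous immersion"}

def decode_ip(code):
    """Split a designation such as 'IP66' into its two digits and describe each."""
    first, second = int(code[2]), int(code[3])
    return f"{code}: solids, {SOLIDS[first]}; water, {WATER[second]}"

print(decode_ip("IP66"))
```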
Testing methods and other significant characteristic differences between IP classifications and NEMA ratings do not allow the two ratings to be correlated exactly in both directions. However, a chart in NEMA 250-2003 assists in the conversion of a NEMA-type enclosure rating to an IP classification designation.
The environmental factors that the enclosure will be exposed to are prime concerns
when selecting the proper enclosure for an application. The materials of construction are
very important to the types of environments that the enclosure will be able to withstand.
Outdoor and indoor environmental factors of concern may be extreme hot or cold (harsh
weather), direct sunlight (UV deterioration of enclosure), water (salt or fresh), rain,
sleet, snow, dust, dirt, oil, solvents, and corrosive chemicals. Stainless steel enclosures are typically the most expensive but will provide the best all-around protection against the various environmental/corrosive factors, as well as the best panel longevity. Coated (painted) steel, aluminum, polycarbonate, and fiberglass are other material choices that offer varying degrees of environmental protection. Proper conduit fittings should be used at all enclosure entries to maintain the NEMA and IP ratings.
Electromagnetic interference is a very important environmental issue that is often
overlooked when designing and selecting enclosures, particularly enclosures for
industrial control panels. Electromagnetic compatibility (EMC) can be assured with the
selection of appropriate materials; conductive materials typically provide the overall
best protection.
Raceways
Metal raceways—galvanized rigid conduit (GRC), intermediate metal conduit (IMC),
electrical metallic tubing (EMT), flexible metal conduit (FMC), liquid-tight flexible
metal conduit (LFMC), aluminum cable tray, and metal wireways (gutters)—are the
most common raceways used in industrial facilities and provide the best protection.
Galvanized rigid conduit (GRC) is a rigid metallic conduit (RMC) that has been
galvanized for corrosion protection and is permitted to be used under all atmospheric
conditions and occupancies per NEC Article 344 and is the absolute best choice overall
for industrial facilities. GRC, as a raceway, should be installed as a complete system
(per NEC 300.18) and securely fastened in place and supported (per NEC 344.30) prior
to the installation of conductors. Any bending of GRC required for installation should be done without damaging the conduit or reducing its internal diameter. The total of all bends between pull points (junction boxes and conduit bodies) should not exceed 360 degrees (NEC 344.26); excessive bends may damage conductor insulation as the conductors are pulled through the GRC.
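As a simple illustration of the 360-degree rule, a planned conduit run can be checked by totaling its bends between pull points; a minimal Python sketch:

```python
def bends_ok(bend_angles_deg):
    """True if the total bend angle between pull points is within 360 degrees (NEC 344.26)."""
    return sum(bend_angles_deg) <= 360

print(bends_ok([90, 90, 90, 90]))      # True: exactly at the 360-degree limit
print(bends_ok([90, 90, 90, 45, 45]))  # False: add a pull point or conduit body
```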
Flexible metal conduit (FMC) and liquid-tight flexible metal conduit (LFMC) are both
raceways made of helically wound, formed, and interlocked aluminum or steel metal
strip. The two flex types are very similar in installation criteria, conductor capacity, and
grounding and bonding requirements; however, LFMC has an outer liquid-tight,
nonmetallic, sunlight-resistant jacket, making it more appropriate for industrial locations where protection from liquids and vapors is required. LFMC is also permitted in direct burial installations and in hazardous locations where specifically permitted by NEC 501.10(B), 502.10, 503.10, and 504.20.
The cable tray wiring methods for industrial facilities should adhere to the methods
listed in NEC Table 392.18 for the specific cabling conditions. Cable trays can be used
as a support system for service conductors, feeders, branch circuits, communication
circuits, control circuits, and signaling circuits so long as the cabling is rated and
approved for cable tray installation. Cables rated at 600 volts or more and cables rated at
600 volts or less may be installed within the same tray, so long as they are separated by
a solid, fixed barrier made of the same material as the tray. Cable trays should have side
rails or structural members and be of suitable strength to provide adequate support for
all contained wiring. Cabling that passes between trays should not exceed 6 feet in length, should be fastened securely to the trays on both sides of the transition, and should be guarded from physical damage.
Cable trays should be installed as a complete system prior to the installation of the
cabling. Electrical continuity should be maintained throughout the cable tray system and
between the cable tray and any associated raceway or equipment (see Figure 10-11).
Cable tray installations should be exposed and accessible, and sufficient space should be
provided and maintained surrounding the cable tray to allow adequate access to install
and maintain cables.
Distribution Equipment
NEC 409, Industrial Control Panels, and UL 508A, Safety Standard for Industrial
Control Panels, define the construction specifications and requirements for conductor
sizing, overcurrent protection, wiring space, and marking of industrial control panels
operating at 600 volts or less. Electrical enclosures for industrial control panels should
be sized, selected, and designed to provide the appropriate amount of space that the
enclosed components require. The components should be mounted, and wireways
should be routed, so there is no potential for interference between the unlike voltages of
the various components. Recommended clearances need to be maintained surrounding
electrical components per manufacturers’ requirements for heat dissipation and/or
electrical safety.
According to the NEC and UL, an industrial control panel is an assembly of two or
more components consisting of one of the following:
• Power circuit components only, such as motor controllers, overload relays, fused
disconnect switches, and circuit breakers
• Control circuit components only, such as pushbuttons, pilot lights, selector switches, timers, switches, and control relays
• A combination of power and control circuit components
These components, with associated wiring and terminals, are mounted on or contained
within an enclosure or mounted on a subpanel. The industrial control panel does not
include the controlled equipment.
Check-Out, Testing, and Start-Up
Piping and instrumentation drawings (P&IDs), general arrangement drawings, electrical
and mechanical installation details, cable schedules, loop drawings, and schematics are
created during the design phase of the project. Accurate final termination information is
not always available during design and will have to be provided to the installing
contractors when the shop drawings and operations and maintenance (O&M) manuals
are received from equipment providers.
The automation professional’s responsibilities do not end with the design and the
issuance of the construction drawings. An automation professional will almost always
be involved throughout the construction, check-out, and start-up phases of the electrical
installation.
All wire/cable routing and labeling should be verified to be in accordance with the
appropriate construction drawings after installation. A point-to-point verification of all
wiring should be completed and all conductors should be properly supported and
securely and tightly terminated.
Circuit breakers, fuses, and motor starter overloads should all be verified as being
installed and properly sized per the design drawings and documents. Any discrepancies
should be noted and investigated prior to energizing the system. It is imperative that an up-to-date and accurate set of as-built drawings is maintained throughout the construction, system check-out, and start-up phases. The accuracy of the markups and redlines on the construction drawings is the true legacy of the construction phase and will be the basis for the accurate “record” drawings for the facility.
All DCS, programmable logic controller (PLC), and networking equipment DIP switch and jumper settings should be verified, and terminating resistors should be installed on communication cable terminations where needed. All PLC and DCS I/O, communication, and power supply
modules should be securely installed and all wiring arms and terminal blocks should be
firmly snapped into place.
Branch circuits to the control panels and equipment should be energized and verified individually and systematically where possible and appropriate. Voltage levels should be checked, and ground-fault circuit interrupter operation should be verified. All three-phase motors should be “bumped” to check for proper direction of rotation.
I/O checklists should be created for the various subsystems, processes, and equipment to
facilitate system check-out. Analog and discrete inputs to the PLC and DCS systems should be jumpered and simulated to verify indication within the human-machine
interface (HMI) or control system. Analog and discrete outputs should be forced from
within the control system to verify the proper operation of the connected field devices.
The check-out of all equipment and I/O should be documented on the appropriate
checklist and should be signed off by the automation professional or owner’s
representative.
When all the power, control, and instrumentation circuits have been checked and
verified, the facility can be turned over to the start-up and operation personnel.
Instrumentation can then be calibrated, complete loops can be verified, and the
integration and interface of system components can be demonstrated and proper
operation confirmed.
Further Information
Standards
ISA (International Society of Automation)
ANSI/ISA-84.00.01-2004, Parts 1-3 (IEC 61511-1-3 Mod), Functional Safety: Safety
Instrumented Systems for the Process Industry Sector
ANSI/ISA-61010-1 (82.02.01)-2004, Safety Requirements for Electrical Equipment for
Measurement, Control, and Laboratory Use – Part 1: General Requirements
IEEE (Institute of Electrical and Electronics Engineers)
IEEE 81-1983, IEEE Guide for Measuring Earth Resistivity, Ground Impedance, and
Earth Surface Potentials of a Ground System
IEEE 142-1991, Recommended Practice for Grounding of Industrial and Commercial
Power Systems
NFPA (National Fire Protection Association)
NFPA 70, National Electrical Code (NEC), 2017 ed.
NFPA 79, Electrical Standard for Industrial Machinery, 2007 ed.
NFPA 780, Standard for the Installation of Lightning Protection Systems, 2008 ed.
Books
Coggan, D. A., ed. Fundamentals of Industrial Control: Practical Guides for
Measurement and Control Series. 2nd ed. Research Triangle Park, NC: ISA
(International Society of Automation), 2005.
DoD (Department of Defense). Military Handbook: Grounding, Bonding, and Shielding for Electronic Equipments and Facilities, Vol. 1: Basic Theory. MIL-HDBK-419A.
Washington DC: DoD, 1987.
ISA (International Society of Automation). The Automation, Systems, and
Instrumentation Dictionary. 4th ed. Research Triangle Park, NC: ISA, 2003.
Morrison, Ralph. Grounding and Shielding Techniques in Instrumentation. 3rd ed. New York: John Wiley & Sons, 1986.
Richter, Herbert P., and Frederic P. Hartwell. Practical Electrical Wiring. 19th ed. Park Publishing, 2005.
Trout, Charles M. Electrical Installation and Inspection. Based on the 2002 National
Electrical Code. Delmar, 2002.
Whitt, Michael D. Successful Instrumentation and Control Systems Design. Research
Triangle Park, NC: ISA (International Society of Automation), 2004.
About the Author
Greg Lehmann, CAP, is a process automation technical manager with extensive
experience in engineering, design, construction supervision, start-up, and
commissioning of various process equipment, instrumentation, and control systems.
Employed by AECOM, Lehmann is currently assigned to the AECOM-Denver, Global
Oil & Gas office. Lehmann is also the ISA Image and Membership Department vice
president, co-chair of the ISA 101 HMI Standard Committee, and a director on the ISA
Standards and Practices Board.
11
Safe Use and Application of Electrical
Apparatus
By Ernie Magison, Updated by Ian Verhappen
Introduction
This chapter discusses ways to ensure electrical equipment does not endanger personnel
or the plant. Refer to other chapters in this book for discussions on process safety and
safety instrumented systems (SISs)—protecting the plant against the risk of equipment
failing to perform its function in a control system or in a safety instrumented system.
Developments during the past half century have made it easier to select safe equipment.
Standardization of general-purpose safety requirements has made it possible to design a
single product that is acceptable with minor modifications, if any, in all nations.
Worldwide adoption of common standards is progressing for constructing and selecting
equipment for use in hazardous locations; but transitioning from long-accepted national
practices to adopt a somewhat different international or harmonized practice is, of
necessity, slowed by historical differences in philosophy. Emphasis in this chapter is on
the common aspects of design and use. Present-day differences in national standards,
codes, and practices will be reduced in coming years. To ensure safety today, a user
anywhere must select, install, and use equipment in accordance with local standards and
codes.
General-purpose safety standards address construction requirements that ensure
personnel will not be injured by electrical shock, hot surfaces, or moving parts—and
that the equipment will not become a fire hazard. Requirements for constructing an
apparatus to ensure it does not become a source of ignition of flammable gases, vapors,
or dusts are superimposed on the general-purpose requirements for equipment that is to
be used where potentially explosive atmospheres may be present. The practice in all
industrialized countries, and in many developing countries, is that all electrical
equipment must be certified as meeting these safety standards. An independent
laboratory is mandated for explosion-protected apparatus but, in some cases, adherence
to general-purpose requirements may be claimed by the manufacturer, subject to strict
oversight by a third party. Thus, by selecting certified or approved equipment, the user
can be sure that it meets the applicable construction standards and is suitable for the environment of the location in which it is to be installed. It is the user’s duty to install and use the
equipment in a manner that ensures that safety designed into the equipment is not
compromised in use.
Philosophy of General-Purpose Requirements
Protection against electrical shock is provided by construction rules that recognize that
voltages below about 30 VAC, 42.4 VAC peak, or 60 VDC do not pose a danger of
electrocution in normal industrial or domestic use, whereas contact with higher voltages
may be life threatening. Design rules, therefore, specify insulation, minimum spacings,
or partitions between low-voltage and higher-voltage circuits to prevent them from
contacting each other and causing accessible extra low-voltage circuits to become
hazardous. Construction must ensure higher-voltage circuits cannot be touched in
normal operation or by accidental exposure of live parts. Any exposed metallic parts
must be grounded or otherwise protected from being energized by hazardous voltages.
Protection against contact with hot parts or moving parts is provided by an enclosure, or
by guards and interlocks.
To prevent the apparatus from initiating a fire, construction standards specify careful selection of materials with respect to the temperature rise of parts, minimum clearances between conductive parts to prevent short circuits, and an enclosure that prevents arcs or sparks from leaving the equipment.
As part of the approval process, the approval authorities evaluate conformity to device
manufacturing rules, instructions, warnings, and equipment installation diagrams. The
user must install and use the apparatus in accordance with these specifications and
documents to ensure safety.
Equipment for Use Where Explosive Concentrations of Gas,
Vapor, or Dust Might Be Present
Equipment intended for use in hazardous locations is always marked for the hazardous
locations in which use is permitted and the kind of protection incorporated. It is almost
always certified by an independent approval authority. The user may depend on this
marking when selecting equipment for use.
Area Classification
Any hazardous area classification system defines the kind of flammable material that
could be present, and the probability that it will be present. In North America and some
other locations, two nomenclatures are in use to denote type and probability: Class,
Group, and Division (as summarized in Table 11-1), and the more recent international
usage of Material Group and Zone. The Zone classification process is the one most
commonly used around the world and is gaining broader acceptance in North America,
especially for new facilities where the benefits of this system for sourcing products can
be realized. Class and Group or Material Group define the nature of the hazardous
material that may be present. Division or Zone indicates the probability of the location
having a flammable concentration of the material.
Though not defined in a standard, a common interpretation of Division 2 or Zone 2 is
that the hazardous condition can be present no more than 2 hours per year. Therefore, the overall risk can be managed to an “acceptable” level because of the low probability that the flammable material and sufficient ignition energy to ignite the fuel mixture will both be present at the same time.
In international practice, mining hazards are denoted as Group I, due to the potential
presence of both methane and dust. Equipment for use in mines is constructed to protect
against both hazards in a physically and chemically arduous environment. Industrial
facilities are denoted as Group II and the gases and vapors are classified as shown in
Table 11-2.
The degree of hazard is indicated by the Zone designation:
• Zone 0, Zone 20 – Hazardous atmosphere or a dust cloud may be present
continuously or a large percentage of the time
• Zone 1, Zone 21 – Hazardous atmosphere or a dust cloud may be present
intermittently
• Zone 2, Zone 22 – Hazardous atmosphere or a dust cloud is present only
abnormally, as after equipment or containment failure
Thus, Class I, Division 1 includes Zone 0 and Zone 1; and Class I, Division 2 is
approximately equivalent to Zone 2. Class II, Division 1 includes Zone 20 and Zone 21,
and Class II, Division 2 is approximately equivalent to Zone 22.
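For illustration only, this approximate correspondence can be captured in a small lookup table; the following minimal Python sketch is not a substitute for the applicable code.

```python
# Approximate Division-to-Zone correspondence described above.
DIVISION_TO_ZONES = {
    ("Class I", "Division 1"): ["Zone 0", "Zone 1"],
    ("Class I", "Division 2"): ["Zone 2"],
    ("Class II", "Division 1"): ["Zone 20", "Zone 21"],
    ("Class II", "Division 2"): ["Zone 22"],
}

for (cls, div), zones in DIVISION_TO_ZONES.items():
    print(f"{cls}, {div} ~ {' and '.join(zones)}")
```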
Temperature Class
Electrical apparatus often develops areas of sufficiently high temperature to ignite a
material it may contact. To allow the user to ensure equipment will not become a
thermal source of ignition, the manufacturer must indicate the value of the highest
temperature reached by a part that is accessible to the explosive or combustible mixture.
At present, this is most often done by specifying a temperature class, defined in Table
11-3. Classes without a suffix are internationally recognized. Because many
intermediate temperature limits were defined in North American electrical codes or
standards prior to 1971, they are also included in the table below. As a practical matter,
a T4 class is safe for all but carbon disulphide and perhaps one or two other vapors. The
maximum surface temperature for some equipment used in dusty locations may be
controlled by the testing standard and may, therefore, not be marked on the equipment.
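As an illustration of how a temperature class is assigned and applied, the following minimal Python sketch uses the internationally recognized class limits (T1 through T6, without the intermediate North American suffix classes); the example autoignition temperature is a hypothetical value.

```python
# Internationally recognized temperature classes and their limits in degC.
T_CLASS_LIMITS_C = [("T6", 85), ("T5", 100), ("T4", 135),
                    ("T3", 200), ("T2", 300), ("T1", 450)]

def temperature_class(max_surface_temp_c: float) -> str:
    """Return the most protective T class whose limit is not exceeded."""
    for t_class, limit in T_CLASS_LIMITS_C:
        if max_surface_temp_c <= limit:
            return t_class
    raise ValueError("surface temperature exceeds the T1 limit of 450 degC")

def thermally_safe(t_class: str, gas_autoignition_c: float) -> bool:
    """True if the class temperature limit is below the gas autoignition temperature."""
    return dict(T_CLASS_LIMITS_C)[t_class] < gas_autoignition_c

print(temperature_class(120.0))     # T4
print(thermally_safe("T4", 160.0))  # True: the 135 degC limit is below a 160 degC AIT
```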
Selection of Apparatus
Once the classification of a location has been established, usually during plant design,
one can select electrical apparatus that is safe to use in that location.
Table 11-4 shows the types of protection against the ignition of gases and vapors in
common use, and their applicability. The International Electrotechnical Commission
(IEC) designations for that type of protection (Ex d, Ex p, etc.) are also recognized in
the NEC (denoted AEx) and Canadian Electrical Code (denoted Ex or EEx) for marking
apparatus for use in the specific zones.
Table 11-5 describes the protection concept, the salient construction features of each
type of protection, and the aspects of safe use specific to that type of protection. To
ensure safe use of any type of protection, the user must install the apparatus according to
the instructions provided—especially with regard to protection from the environment,
shock, vibration, and, where specified, grounding and bonding. The user must inspect
the installation frequently enough to ensure the integrity of the enclosure, and to ensure
the continuity of grounding and bonding has not become impaired by accident or
environmental attack.
All the protection techniques referred to in Tables 11-4 and 11-5, except intrinsic-safety
and energy-limited Type n or Division 2 constructions, are device-oriented. In principle,
a user purchases the equipment, opens the box, and installs the equipment in accordance
with the installation instructions and the applicable installation code—the NEC or CEC
in North America. Intrinsic safety and energy-limited designs for Division 2/Zone 2
applications are system-oriented. Figure 11-1 illustrates an intrinsically safe system. A
discussion of an energy-limited system for Division 2/Zone 2 would be based on a
similar diagram.
An intrinsically safe apparatus is composed entirely of intrinsically safe circuits. It can
be described in terms of the maximum voltage, maximum current, and maximum power
that may be impressed on its terminals, and by the effective capacitance, inductance, or, optionally, inductance-to-resistance (L/R) ratio that can be observed as energy at the terminals.
Figure 11-1 shows only a single pair of terminals. There may be multiple pairs of
terminals, each of which must be characterized by the values noted. When
two symbols are shown in the legend below, the first symbol is the one commonly used
in international standards and the second symbol is common in North America, though
both are permitted. The associated apparatus contains circuits that are intrinsically safe
and circuits that are not intrinsically safe. If the nonintrinsically safe circuits are
protected by some technique, the apparatus may be located in a hazardous location.
Otherwise, it must be located in an unclassified location.
The voltage, Um, in Figure 11-1 is usually 250 V (DC or rms) for equipment connected to the power line, but it could be 24 V or another low voltage for other equipment. In the case of 250 V, the design will be based on some minimum prospective current available from the power line. In the latter case, the assessment is carried out assuming the presence of only the low voltage, but the certificate or control drawing may demand that the 24 V supply be provided from a protective transformer constructed in accordance with the provisions of the standard. The intent of this mandate is to ensure that the transformer, which is not part of the certified apparatus, is of a quality of construction that reduces to an acceptably low value the probability that its failure allows a voltage higher than 24 V to appear at the terminals. In countries that follow
the IEC procedure, an apparatus is usually certified as an entity, without regard to the
specific design of the other equipment to which it is connected in a system. An
intrinsically safe apparatus is defined by the parameters indicated in Figure 11-1. In
principle, using these parameters to select devices for a system is straightforward if the
intrinsically safe device is a two-terminal device. It is only necessary to ensure that Uo
and Io are equal to or less than Ui and Ii, and that Li and Ci are equal to or less than Lo
and Co. If the intrinsically safe apparatus is a two-wire device and both wires are
isolated from ground, an associated apparatus (barrier) must be installed in each wire. In
North America, it has become a requirement for the manufacturer of the intrinsically
safe or associated apparatus to provide a “control drawing” that provides the details of
the permissible interconnections and any special installation requirements specific to
that apparatus. This control drawing is assessed and verified by the certifying agency as
part of its examination of the product.
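For illustration, the following minimal Python sketch applies the entity-parameter comparison described above for a simple two-terminal device, including the cable capacitance and inductance that must be added to the device parameters in practice; all numeric values are hypothetical.

```python
def entity_check(barrier: dict, device: dict,
                 cable_c_uF: float, cable_l_mH: float) -> bool:
    """True if every entity-parameter criterion is satisfied."""
    return (barrier["Uo"] <= device["Ui"] and               # voltage
            barrier["Io"] <= device["Ii"] and               # current
            barrier["Po"] <= device["Pi"] and               # power
            device["Ci"] + cable_c_uF <= barrier["Co"] and  # capacitance
            device["Li"] + cable_l_mH <= barrier["Lo"])     # inductance

# Hypothetical associated apparatus (barrier) and field device parameters.
barrier = {"Uo": 28.0, "Io": 0.093, "Po": 0.65, "Co": 0.083, "Lo": 4.2}
device  = {"Ui": 30.0, "Ii": 0.100, "Pi": 1.00, "Ci": 0.010, "Li": 0.5}
print(entity_check(barrier, device, cable_c_uF=0.06, cable_l_mH=1.0))  # True
```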
Standards following the IEC procedure define two levels of protection of intrinsic
safety: level of protection ia apparatus and level of protection ib apparatus.
• Level of protection ia apparatus, suitable for Zone 0, will not cause ignition when the maximum permitted values are applied to its terminals:
a. in normal operation with application of those noncountable faults that give the most onerous condition,
b. in normal operation with application of one countable fault and those noncountable faults that give the most onerous condition, and
c. in normal operation with application of two countable faults and those noncountable faults that give the most onerous condition.
Normal operation means that the apparatus conforms to the design specification
supplied by the manufacturer, and it is used within electrical, mechanical, and
environmental limits specified by the manufacturer. Normal operation also includes
open circuiting, shorting, and grounding of external wiring at connection facilities.
When assessing or testing for spark ignition, the safety factors to be applied to voltage
or current are 1.5 in conditions a and b, and 1.0 in condition c. (These factors should
properly be called test factors. The real safety of intrinsic safety is inherent in the use of
the sensitive IEC apparatus to attempt to ignite the most easily ignitable mixture of the
test gas with hundreds of sparks. This combination of conditions is many times more
onerous than any likely to occur in practice.) North American Intrinsic Safety design
standards are equivalent to ia intrinsic safety.
• Level of protection ib apparatus, suitable for Zone 1, is assessed or tested under conditions a and b above, with a safety factor of 1.5 on voltage or current.
• Level of protection ic apparatus has replaced type of protection nL, suitable for
use in Zone 2, in the IEC standards.
Figure 11-2 shows typical grounded and ungrounded two-wire intrinsically safe circuits.
It illustrates the principle that every ungrounded conductor entering the Division 1/Zone
0 or Zone 1 location, in this case where the transmitter or transducer is located, must be
protected against unsafe voltage and current by appropriate associated apparatus. The
boxes with three terminals represent barriers, independently certified protective
assemblies, certified and rated according to the national standard. Nonintrinsically safe
devices connected to the barrier need only be suitable for their location, and must not
contain voltages higher than the Um rating of the barrier.
Many barriers are passive, consisting of current-limiting resistors and voltage-limiting diodes in configurations and with redundancy appropriate to meet the requirements of the standard. Others have active current- or voltage-limiting elements; these active designs are often called isolators. Both types may be combined with other circuitry for
regulating voltages, processing signals, and so on. The user should follow the
recommendation in the control drawing from the intrinsically safe apparatus
manufacturer, or discuss the selection of appropriate barriers with barrier vendors, all of
whom have proven configurations for many field-mounted devices.
Because international standards are on a 5-year ratification cycle, remaining differences should be reconciled within 10 years (two cycles), with the present IEC 61010 series of standards serving as the basis for electrical safety globally. The international certifying agencies, such as Underwriters Laboratories (UL) and Factory Mutual (FM) in the United States, the Canadian Standards Association (CSA), and the British Standards Institution (BSI), are all part of the IECEE (IEC System of Conformity Assessment Schemes for Electrotechnical Equipment and Components) INDAT (Industrial Automation) certification program, which seeks to harmonize the standards certification processes for industrial automation equipment across participating agencies.
Equipment for Use in Locations Where Combustible Dust
May Be Present
Table 11-6 illustrates the methods approved by the NEC in 2005 for use in Class II
locations. Dust ignition-proof construction is designed and tested to ensure that no dust
enters the enclosure under swirling test conditions when cycling equipment creates a
pressure drop to draw dust into the enclosure. The temperature rise of the enclosure
surface is determined after a thick layer of dust has accumulated on the enclosure. For
IEC installations, dust-tight construction may or may not be tested by a third party. If the
device is tested by a third party, the test is essentially the same as above, but the
equipment is cycled only once during the dust test.
In European practice, the user is responsible for determining that the equipment surface
temperature under the expected accumulation of dust is safely below the ignition
temperature of the dust involved. Equipment enclosure standards and classification ratings reflect this difference. The IEC enclosure standards originally standardized the construction of both practices. The European practice of the user determining the surface temperature under the dust layer is similar to the American dust ignition-proof and dust-tight enclosure requirements. In addition to the dust layer test, Europe requires that
enclosures for Zone 21 be tested under vacuum to be dust-tight, that is, degree of
protection IP6X. Enclosures for Zone 22 are permitted to have entry of some dust, but
not enough to impair function or decrease safety.
The recommendations adopted by the IEC in 2015 for dust protection are summarized in
Table 11-7.
The suffix “D” after the symbol for type of protection indicates the version of that
technique intended for use with dusts, sometimes with reduced construction and test
requirements. The symbol “t” refers to protection by enclosure, essentially dust-tight or
dust ignition-proof construction, and suffixes “A” and “B” are as discussed previously.
It is likely that the IEC and U.S. standards for area classification and equipment
selection in dusty areas will grow closer in agreement in the coming years.
The Label Tells about the Device
In North America, an apparatus for use in Division 1 or 2 is marked with the Class and
Group for which it is approved. It may be marked Division 1, but it must be marked
Division 2 if it is suitable for Division 2 only.
Examples:
Class I, Groups A-C, Division 1 T4
Class I, Groups C, D, Division 2 T6
(The temperature code, discussed earlier, is also shown in these examples.)
Equipment approved for Class I, Division 1, or Division 2 may also be marked with
Class, Zone, gas group, and temperature codes. For the above examples, the additional
marking on the devices to show both representations would be:
Class I, Zone 1, IIC, T4
Class I, Zone 2, IIA, T6
In the United States, apparatus approved specifically for use in Zone 0, Zone 1, or Zone
2 must be marked with the Class, the Zone, AEx, the symbol for the method of
protection, the applicable gas group, and the temperature code.
Examples:
Class I, Zone 0 AEx ia IIC, T4
Class I, Zone 1 AEx m IIC, T6
In Canada, the Class and Zone markings are optional and AEx is replaced by Ex or EEx,
which are the symbols for explosion-protected apparatus conforming to IEC and
CENELEC (European Committee for Electrotechnical Standardization) standards,
respectively. In countries using only the Zone classification system, markings such as those below are required.
Ex ia IIC, T4
EEx ia IIC, T4
If more than one type of protection is used, all symbols are shown, such as:
Ex d e mb IIC, T4
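As an illustration of how such a marking string decomposes, the following minimal Python sketch splits a multi-protection marking into its parts; real markings vary, so this is an illustration rather than a general-purpose validator.

```python
def parse_marking(marking: str) -> dict:
    """Split a marking such as 'Ex d e mb IIC, T4' into its parts."""
    tokens = marking.replace(",", "").split()
    return {
        "prefix": tokens[0],             # Ex, EEx, or AEx
        "protection": tokens[1:-2],      # one or more protection symbols
        "group": tokens[-2],             # gas group, e.g., IIC
        "temperature_class": tokens[-1]  # e.g., T4
    }

print(parse_marking("Ex d e mb IIC, T4"))
# {'prefix': 'Ex', 'protection': ['d', 'e', 'mb'], 'group': 'IIC', 'temperature_class': 'T4'}
```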
In addition to the markings specifically relevant to explosion protection, a label will usually provide
additional information such as:
• Name and address of the manufacturer
• Degree of protection afforded by the enclosure
• The National Electrical Manufacturers Association (NEMA) or the Canadian
Standards Association (CSA) enclosure rating in North America, or the IP code in
countries following IEC zone classifications
• Symbol of certifying authority and an approval document number
• Voltage, current, and power ratings
• Pressure ratings, if applicable
If equipment is certified by a recognized approval agency, the user may be assured it is
safe if it is installed and used according to the instructions supplied by the manufacturer.
Conditions of use and installation instructions may be incorporated by reference in the
certification documents or in drawings provided by the manufacturer. In all cases, it is
essential that the user install equipment in accordance with local codes and operate it
within its electrical supply and load specifications and its rated ambient conditions.
For More Information
Few users need more detail about the fire and shock hazard protection that is built into
the equipment they buy, but those who wish to dig deeper may consult the documents
produced by the standards committees UL STP 3102 (Electrical Equipment for
Measurement, Control, and Laboratory Use) and IEC Technical Committee 66.
The ISA82 committee developed ANSI/ISA-61010-1 (82.02.01), which became a co-publication with UL and CSA in 2004. The standard and the committee work were transferred to UL in 2013 and are now under UL Standards Technical Panel (STP) 3102.
The current edition, which remains a co-publication of ISA, UL, and CSA, is
ANSI/ISA-61010-1 (82.02.01)-2012, Safety Requirements for Electrical Equipment for
Measurement, Control, and Laboratory Use – Part 1: General Requirements, Third
Printing 29 April 2016. IEC TC66 produced IEC 61010-1:2010+AMD1:2016 CSV
(Consolidated version), Safety requirements for electrical equipment for measurement,
control, and laboratory use – Part 1: General requirements. Together they provide a broader understanding of area classification and the types of protection, which aids in the safe application of equipment in classified locations.
The 60079 (Electrical Equipment for Hazardous Locations) publications by the related
UL Standards Technical Panel (STP) include ANSI/ISA-60079-0 (12.00.01)-2013
(R2017), Explosive Atmospheres – Part 0: Equipment – General Requirements. The
series includes information about design and use of electrical apparatus for classified
locations, as do the U.S. versions of the IEC standards for the types of protection
discussed above.
IEC publications form the basis of an increasing number of national standards of IEC
member countries, the local versions of which can be purchased from the national
member body of the IEC.
The following is a list of standards that address the use and application of electrical
apparatus rather than design.
ISA
www.isa.org
ANSI/ISA-60079-0 (12.00.01)-2013, Explosive Atmospheres – Part 0: Equipment –
General Requirements
ANSI/ISA-60079-10-1 (12.24.01)-2014, Explosive Atmospheres – Part 10-1: Classification of Areas – Explosive Gas Atmospheres
ANSI/ISA-61010-1 (82.02.01)-2012, Safety Requirements for Electrical Equipment for
Measurement, Control, and Laboratory Use – Part 1: General Requirements, Third
Printing 29 April 2016
ANSI/ISA-12.01.01-2013, Definitions and Information Pertaining to Electrical
Instruments in Hazardous (Classified) Locations
ANSI/ISA-12.02.02-2014, Recommendations for the Preparation, Content, and
Organization of Intrinsic Safety Control Drawings
ISA-TR12.2-1995, Intrinsically Safe System Assessment Using the Entity Concept
IEC
www.iec.ch
IEC 60079-10-1, Explosive Atmospheres – Part 10-1: Classification of Areas –
Explosive Gas Atmospheres
IEC 60079-10-2, Explosive Atmospheres – Part 10-2: Classification of Areas –
Explosive Dust Atmospheres
IEC 60079-17:2013, Explosive Atmospheres – Part 17: Electrical Installations
Inspection and Maintenance
IEC 61285, Industrial-Process Control – Safety of Analyser Houses
NFPA
www.nfpa.org
NFPA 70, National Electrical Code®
NFPA 70B, Recommended Practice for Electrical Equipment Maintenance
NFPA 70E, Standard for Electrical Safety in the Workplace®
NFPA 496, Standard for Purged and Pressurized Enclosures for Electrical Equipment
NFPA 497, Recommended Practice for the Classification of Flammable Liquids, Gases,
or Vapors and of Hazardous (Classified) Locations for Electrical Installations in
Chemical Process Areas
NFPA 499, Recommended Practice for the Classification of Combustible Dusts and of
Hazardous (Classified) Locations for Electrical Installations in Chemical Process
Areas
Many manufacturers provide free literature, often viewable on their websites, that
discusses the subjects of this chapter.
For an in-depth treatment of the science behind the types of protection and the history of
their development, as well as design, installation, and inspection of installations,
consult:
Magison, Ernest. Electrical Instruments in Hazardous Locations, 4th ed. Research
Triangle Park, NC: ISA (International Society of Automation), 1998.
For discussion of apparatus installation, refer to:
Schram, P. J., and M. W. Earley. Electrical Installations in Hazardous Locations.
Quincy, MA: NFPA (National Fire Protection Association), 2009.
McMillan, Alan. Electrical Installations in Hazardous Areas. Woburn, MA:
Butterworth-Heinemann, 1998.
Acknowledgment
The author is indebted to William G. Lawrence, PE, senior engineering specialist,
hazardous locations, FM approvals, for many valuable contributions to improve the
accuracy of this chapter.
About the Author
Ernie Magison worked as an electrical engineer for Honeywell for 33 years and as a
professor at Drexel University in Philadelphia for 15 years concurrently. He was active
in standards development for the International Society of Automation (ISA), the
International Electrotechnical Commission (IEC), and the National Fire Protection
Association (NFPA) for four decades. Magison authored 40 articles, as well as many
papers and several books—most focusing on the application of electrical apparatus in
potentially explosive atmospheres— including four editions of his book Electrical
Instruments in Hazardous Locations. In addition, he taught many short courses and
consulted in the field. Magison passed away in 2011.
Ian Verhappen, PE, CAP, and ISA Fellow, has worked in all three aspects of the
automation industry: end user, supplier, and engineering consultant. After approximately
25 years as an end user in the hydrocarbon industry (where he was responsible for
analyzer support, as well as integration of intelligent devices in the facilities),
Verhappen moved to a supplier company as director of digital networks. For the past 5+
years, he has been working for engineering firms as a consultant. In addition to being a
regular trade journal columnist, Verhappen has been active in ISA and IEC standards for
many years, including serving a term as vice president of the ISA Standards and
Practices (S&P) Board. He is presently the convener of IEC SC65E WG10 Intelligent
Device Management, a member of the SC65E WG2 (List of Properties), and managing
director of several ISA standards including ISA-108.
12
Checkout, System Testing, and Start-Up
By Mike Cable
Introduction
Many automation professionals are involved in planning, specifying, designing,
programming, and integrating instrumentation and controls required for process
automation. In this chapter, we will describe the various methods of testing the
installation, integration, and operation at the component and system level. The end goal
is to ensure the integrated system functions the way it was intended when everyone got
together in the beginning of the project and specified what they wanted the systems to
do.
In an ideal world, we could wait until the entire system is ready for operation and
perform all testing at the end. This would allow a much more efficient means of testing,
with everything connected, communicating, and operational. However, we all know
several problems would be uncovered that would lead to long start-up delays.
Uncovering the majority of these problems at the earliest opportunity eliminates many
of these delays and provides the opportunity to make corrections at a much lower cost.
By properly planning, communicating, and using a standardized approach to instrumentation and control system commissioning, an efficient means of testing the system can be developed with limited duplication of effort. Instrumentation and
control system commissioning can be defined as a planned process by which
instrumentation and control loops are methodically placed into service. Instrument
commissioning can be thought of as building a case to prove the instrumentation and
controls will perform as specified.
This chapter does not cover testing performed during software development, as this
should be covered in the developer’s software quality assurance procedures. However, a
formal testing of the function blocks or program code, which will be described later in
this chapter, should be performed and documented. This chapter does not cover
documentation and testing required for equipment (e.g., pumps, tanks, heat exchangers,
filters, and air handling units). The scope for this chapter begins at the point when
instruments are received, panels are installed, point-to-point wiring is completed, and
systems have been turned over from construction.
The plan for testing described in this chapter must consider where the system is being
built. There may be several suppliers building skid systems at their facility for delivery
to the end user. There may be one main supplier that receives a majority of the
components, some of which are used for building panels and skid systems, while others
will be installed at the end user’s facility. For another project, all components are
delivered directly to the end user’s facility. Depending on the logistics, some testing
may be performed at the supplier’s location, even by the supplier, if properly trained.
Other testing will be performed at the end user’s facility.
The flowchart in Figure 12-1 illustrates instrument commissioning activities covered in
the “Instrumentation Commissioning” and “Software Testing” sections in this chapter.
Instrumentation Commissioning
To begin to build a case, we need to gather evidence that the components will work once
they are installed and integrated into the system. At the component level, the first
opportunity to gather this evidence occurs when the component is received. Once
receipt verification is completed, a bench calibration of calibrated devices can be
performed prior to installation. After installation, instruments should be verified for
proper installation. When all wiring and system connections have been completed and
the system has been powered up, loop checks and field calibrations can begin.
Preferably, the control system is installed and the control programs are loaded when
loop checks are performed. If, for example, the programmable logic controller (PLC)
code is not loaded, all testing to verify the proper indications at the human-machine
interface (HMI) would need to be duplicated later.
Not all components require all testing. For example, it might make sense to perform all
tests for a pressure transmitter, but none of the testing for an alarm light mounted on a
panel. You might skip receipt verification and only perform installation verification for a
solenoid valve because it is an off-the-shelf common item that would be simple to
replace later, if defective. A testing matrix to identify the instrument commissioning
activities for each element of the instrumentation and control system should be
developed. All instruments and input/output (I/O) should be accounted for in the testing
matrix (i.e., all the diamonds and bubbles for each piping and instrumentation drawing
[P&ID]). The I/O listing should be used to verify all control system inputs and outputs
are accounted for. Organize the project in a way that makes sense. For example,
organize projects with more than one system by P&ID. Use a database program,
preferably interfaced with the overall project database, to provide an efficient means of
entering the instrument information, tracking required commissioning activities, printing
test forms, and generating status reports.
A simple example of an instrument commissioning testing matrix is illustrated in Table
12-1.
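As an illustration of tracking such a matrix in a database, the following minimal Python sketch models one instrument record with its required and completed activities; the tag, P&ID number, and activity codes are hypothetical examples.

```python
from dataclasses import dataclass, field

# Activity codes: receipt verification, bench calibration,
# installation verification, field calibration, loop check.
ACTIVITIES = ["RV", "BC", "IV", "FC", "LC"]

@dataclass
class Instrument:
    tag: str
    pid: str                                   # P&ID on which the device appears
    required: set = field(default_factory=set)
    completed: set = field(default_factory=set)

    def outstanding(self) -> set:
        return self.required - self.completed

pt101 = Instrument("PT-101", "P&ID-001", required={"RV", "BC", "IV", "FC", "LC"})
pt101.completed.update({"RV", "BC"})
print(sorted(pt101.outstanding()))  # ['FC', 'IV', 'LC']
```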
Receipt Verification
The main objectives of performing receipt verification (RV) are to verify that the device
received is the device that was ordered, that the device meets the specification for that
instrument, and that the correct vendor manuals are received for the device. For most
projects, instrument specifications are developed and approved prior to purchase.
Instrument specifications are developed using ISA-TR20.00.01-2007, Specification
Forms for Process Measurement and Control Instruments – Part 1: General
Considerations (updated with 27 new specification forms in 2004–2006, and updated
with 11 new specification forms in 2007). The purchase order should also be referenced
in case any additional requirements are listed there.
Receipt verification is performed upon receipt of the instrument at the end user’s site,
supplier’s site, or off-site storage location. RV is performed per an approved RV
procedure and documented on an approved data sheet, printed as a report from the entered database information, if applicable. At a minimum, the following activities
should be performed during RV:
• The instrument matches the purchase order and instrument specification.
• The manufacturer and model number are verified.
• The serial number and other nameplate data are recorded.
• The permanent tag is verified to be correct and properly applied.
• The correct quantity of manufacturers’ manuals is received and logged in.
• Any deficiencies are noted on the RV data sheet.
• Any deficiencies that are not corrected are added to the punch list.
The technical manuals should be organized for turnover to the end user. A good way to
organize manuals is by manufacturer and model number. Once the RV is complete,
properly assign the device to bench calibration, designated storage location, or
installation.
That being said, RV is optional. It is convenient to perform the activities listed above at
receipt. Even if RV is not performed for most devices, it should be performed for long-lead-time devices to prevent significant delays later. If RV is not performed, all the required RV activities should be performed with the installation verification.
Installation Verification
Installation verification (IV) is performed to verify the instrument or device is installed
in accordance with project drawings, customer specifications, and manufacturer’s
instructions. Project drawings may include instrument installation details and P&IDs.
For example, it is very important to verify proper orientation for some sensors, such as
flow and pressure instrumentation. Pneumatic valves must have air connected to the
correct port for proper operation and fail position.
To minimize duplication of effort, IV will typically be performed after all instruments in
a system have been installed and authorization to begin has been received from the
project manager. Once IV has started, all installation changes must be communicated to
the commissioning team. If the start of IV is not communicated, undocumented changes
may continue to occur during construction even after the IV has been completed. At a
minimum, the following should be verified during IV:
• The instrument is installed according to the project drawings (i.e., installation
detail and P&ID) and manufacturer’s instructions.
• The instrument is properly tagged.
• The instrument wiring is properly terminated.
• The instrument air is properly connected, if applicable.
• The instrument is installed in the proper location.
• The instrument can be removed for periodic maintenance and calibration (i.e., slack in the flexible conduit, isolation valves installed, resistance temperature detectors [RTDs] installed in the thermowell).
Note any discrepancies and whether each discrepancy is a deviation from the specification or an observation. All deficiencies are noted on the IV data sheet, and corrective actions taken are documented. Any deficiencies that are not corrected are added to the punch list.
Loop Checks
An instrument loop is a combination of interconnected instruments that measure and/or
control a process variable. An instrument loop diagram is a composite representation of
instrument loop information containing all associated electrical and piping connections.
Instrument loop diagrams are developed in accordance with ISA-5.4-1991, Instrument Loop Diagrams. Minimum content requirements include:
• Identifying the loop and loop components
• Point-to-point interconnections with identifying numbers or colors of electrical and/or pneumatic wires and tubing, including junction boxes, terminals, bulkheads, ports, and grounding connections
• General location of devices, such as field, panel, I/O cabinet, and control room
devices
• Energy sources of devices
• Control action or fail-safe conditions
Loop checks are performed for every I/O point. Loop checks are performed to verify
that each instrument loop is connected properly, indications are properly scaled, alarms
are functioning properly, and fail positions are properly configured from the field device
to the HMI. A formalized loop check should be documented prior to placing any loop in
service. This formalized program should include verifying the installation against the
loop diagram and simulating signals to verify output responses and indications
throughout the range.
Why is this important? A significant percentage of instrument loops have problems, some of which would result in hidden failures. As an example, a temperature transmitter
output wired to a PLC analog input provides a temperature display on an operator
interface.
The transmitter is calibrated from 0–100°C to provide a proportional 4–20 mA output. If
the PLC programmer writes the code for this input as 0–150°C for a 4–20 mA input and
a loop check is not performed, an inaccurate displayed value will result (and possibly an
improper control action).
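The following minimal Python sketch reproduces this scaling mismatch numerically; the function names are illustrative only.

```python
def ma_from_temp(temp_c, lo=0.0, hi=100.0):
    """Transmitter: proportional 4-20 mA over its calibrated 0-100 degC range."""
    return 4.0 + 16.0 * (temp_c - lo) / (hi - lo)

def temp_from_ma(ma, lo=0.0, hi=150.0):
    """PLC as (mis)configured: scales the same signal over 0-150 degC."""
    return lo + (hi - lo) * (ma - 4.0) / 16.0

actual = 50.0                                   # degC at the sensor
displayed = temp_from_ma(ma_from_temp(actual))  # value shown at the HMI
print(displayed)                                # 75.0 - a 25 degC hidden error
```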
Other typical problems found and corrected by performing loop checks include wiring
connected to the wrong points, ground loops, and broken wires. Whenever possible,
loop checks should be performed at the same time as the field calibration for devices in
the loop. The same test equipment will be utilized and some of the same test
connections will be used. This makes more efficient use of time and resources.
To perform loop checks with maximum effectiveness, they should be coordinated with
the field calibration requirements and should be performed with the control system
program, such as PLC code or a distributed control system (DCS) program, completed
and loaded.
Why is this important? Consider the following.
Let’s say a field calibration of a temperature transmitter was performed with the RTD
connected and placed in a temperature block. The 4–20 mA transmitter output was
checked over the calibrated input range of 0–100°C. No remote indications or alarms
were checked during the calibration. That is perfectly normal. There are a few options
for the loop check. The loop check could have been performed at the same time as the
calibration with the remote indications at the HMI verified and alarms checked. Or, if
the field calibration was already completed, the loop check could be performed by using a milliamp simulator connected at the transmitter output to verify remote indications and alarms.
However, if no field calibration is required because the bench calibration was the only
calibration requirement, you would have to start the loop check at the RTD to verify all
loop components are working together. Of course, it is very important to properly
reconnect the loop after completing the loop check, or the whole thing is null and void.
Let’s consider the issue where the PLC program is not loaded when performing loop
checks. This has been common in this author’s experience, so it takes excellent planning
to make sure the program is ready when it is time for loop checks. If the program is not
ready and the loop check is performed by verifying that the correct controller input
displays the correct bits, we have not checked the whole loop. In too many cases, this
author has had to do loop checks multiple times, once to the control system input (just to
show progress) and again later from the input to verify HMI indications. A little
planning would have saved time and money (and a lot of complaining).
Loop checks can be divided into four main categories: analog input, analog output,
discrete input, and discrete output. Analog refers to devices that accept or deliver signals that change proportionately (including analog signals that have been digitized).
Discrete refers to on-off signals. Examples of each signal type include:
• Discrete input (DI) – Pushbuttons, switches, and valve position feedback1
• Discrete output (DO) – Solenoids and alarms
• Analog input (AI) – Process parameters such as temperature, pressure, level, and
flow
• Analog output (AO) – Control output to an I/P or proportional valve,
retransmitted analog input
Other loop checks may include RTD input, thermocouple input, and block valve. The
RTD input is an analog input but may require a different test procedure and form. The
block valve loop check can be used for efficiency to test the solenoid valve and valve
position inputs all together.
• Temperature input (RTD or T/C) – Temperature input direct to an RTD or millivolt/thermocouple input card without the use of a transmitter.
• Block valve (BV) – Includes testing the solenoid, the valve, and the valve-position feedback switch(es) in one loop check, if desired. Otherwise, it is acceptable to do a discrete output check for the solenoid and valve, and discrete input checks for the valve-position switch(es).
For additional information on loop checks, refer to Loop Checking: A Technician’s
Guide by Harley M. Jeffery, a part of the International Society of Automation (ISA)
Technician Guide Series.
Calibration
The ISA definition of calibration is “a test during which known values of measurand are
applied to the transducer and corresponding output readings are recorded under
specified conditions.” To perform a calibration, a test standard is connected to the device
input and the input is varied through the calibration range of the instrument while the
device output is measured with an appropriate test standard. Once the as-found readings
have been recorded, the device is adjusted until all readings are within the required
tolerance and the procedure is repeated to obtain as-left readings.
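As an illustration of recording as-found data against a tolerance, consider the following minimal Python sketch; the test points and the percent-of-span tolerance are hypothetical.

```python
SPAN_MA = 16.0        # 4-20 mA output span
TOLERANCE_PCT = 0.25  # example tolerance, percent of span

test_points = [  # (applied input degC, expected mA, as-found mA)
    (0.0,    4.0,  4.01),
    (50.0,  12.0, 12.06),
    (100.0, 20.0, 19.97),
]

for applied, expected, found in test_points:
    error_pct = 100.0 * (found - expected) / SPAN_MA
    status = "PASS" if abs(error_pct) <= TOLERANCE_PCT else "FAIL - adjust and repeat"
    print(f"{applied:6.1f} degC: error {error_pct:+.3f}% of span -> {status}")
```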
There are a few times during the project when calibration can be performed. First, the
instrument vendor can calibrate the instrument to the required specification and provide
calibration documentation. In this case, a certificate simply stating the device has been
calibrated is not sufficient. A calibration report including calibration data, test standards
used, procedures referenced, the technician’s signature, and the date must be included as
a minimum. A bench calibration can be performed upon receipt or just prior to
installation. A field calibration can be performed after the instrument is installed and
integrated with the instrument loop. A field calibration can be performed at the same
time as a loop check, described previously, for increased efficiency. Depending on the
instrument and end-user requirements, a vendor calibration, bench calibration, and field
calibration may be performed for some instruments with only a vendor calibration for
other instruments. Here are some suggestions for when to calibrate during the
commissioning of a new project:
• Vendor calibration – The calibration parameters are specified prior to ordering
the instrument, and bench calibration will typically not be performed for this
project.
• Bench calibration – Vendor calibration is not performed and/or it is the end user’s practice to perform bench calibrations.
• Field calibration – Field calibration should always be performed, unless it is impossible to access the instrument safely in the field or it is the end user’s practice to perform bench calibrations.
• Other calibration – Refer to ISA’s Calibration: A Technician’s Guide (2005) for examples of calibration procedures and additional information on the various elements of calibration.
Software Testing
Software Development
Software should be developed by the supplier in accordance with approved software
development standards and a software quality assurance program. These documents
should detail programming standards, version control, internal testing during
development, documentation of bugs, and corrective actions taken to debug.
Program Code Testing
Prior to deployment, the functional elements of the programming code should be tested
using simulations. The simulations can be performed by forcing inputs internal to the program and observing the outputs. If practicable, it is better to use an external control-system simulator to provide the simulated inputs and observe the outputs. Using a simulator avoids modifying the program for testing purposes; such modifications may lead to errors if the program is not restored to its original condition.
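As an illustration of this approach, the following minimal sketch exercises a hypothetical piece of interlock logic with simulated inputs and checks the outputs against expected results; the function and its test cases are invented for this example.

```python
# Hypothetical simulation harness: feed simulated inputs to a logic
# function and check the outputs, rather than forcing values inside
# the deployed program.

def high_level_interlock(level_pct, pump_running):
    """Example logic under test: stop the pump on high tank level."""
    HIGH_LIMIT = 90.0
    return pump_running and level_pct < HIGH_LIMIT  # pump permitted?

test_cases = [
    # (level, pump running, expected pump-permitted output)
    (50.0, True, True),    # normal operation
    (95.0, True, False),   # high level trips the pump
    (95.0, False, False),  # a stopped pump stays stopped
]

for level, running, expected in test_cases:
    actual = high_level_interlock(level, running)
    assert actual == expected, f"Failed at level={level}, running={running}"
print("All simulated input cases passed.")
```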
Security Testing
Adequate controls must be placed on process control system programs and configurable
devices to prevent unauthorized and undocumented changes. The controls are specified
during program development or device design. Elements of system security that should
be tested at a minimum are login security and access levels.
Login security ensures the appropriate controls are in place to gain authorized access to
the program and to prevent unauthorized access. This can be a combination of a user ID
and password or the use of biometrics, such as a fingerprint or retina scan. Examples of
login security include passwords that can be configured for a minimum number of
characters, use of both alpha and numeric characters, forced password change at initial
login, and required password change at specified intervals. Access should be locked out
for any user that cannot successfully log in after a specified number of attempts. All the
specified configurations should be challenged during testing.
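The sketch below illustrates the idea of challenging each configured rule with inputs that should pass and inputs that should fail. The policy values (minimum length, required character classes) are assumptions for illustration, not requirements from any specific system.

```python
# Hypothetical checks that challenge a configured password policy
# during security testing; the rule values are assumptions, not
# taken from any particular control system.

MIN_LENGTH = 8

def password_accepted(password):
    """Policy under test: minimum length plus alpha and numeric characters."""
    return (len(password) >= MIN_LENGTH
            and any(c.isalpha() for c in password)
            and any(c.isdigit() for c in password))

challenges = [
    ("abc123", False),      # too short -> must be rejected
    ("abcdefgh", False),    # no numeric character -> rejected
    ("12345678", False),    # no alpha character -> rejected
    ("opstation7", True),   # meets all configured rules
]

for pwd, expected in challenges:
    assert password_accepted(pwd) == expected, f"Policy failed for {pwd!r}"
print("All password-policy challenges behaved as specified.")
```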
Each user should be assigned an access level based on his or her job function associated
with the program or device. Examples of access levels include Read-Only, Operator,
Supervisor, Engineer, and Administrator. Each access level would have the ability to
view, add, edit, delete, print, create reports, and perform other functions as applicable to
their job function. All the access levels would be challenged during testing.
Factory Acceptance Testing
Factory acceptance testing (FAT) is a major milestone in a project when the system has
been built, the supplier is ready to deliver, and the end user has an opportunity to review
documentation and witness performance testing. FAT is used to verify that the system hardware, configuration, and software have been built, assembled, and programmed as specified.
Whenever appropriate, supplier documentation deliverables should be reviewed and
testing conducted at the supplier site prior to system delivery. This will allow for
troubleshooting and problem resolution prior to shipment, providing a higher level of
assurance that the system will meet specifications and function properly upon delivery.
In addition, problems found at the supplier site can be corrected at less cost and with less impact on the project schedule. FAT should be performed with end-user representatives from various departments, such as manufacturing, engineering,
maintenance, information technology, and quality. During testing, systems should be
challenged to best simulate actual production conditions. FAT test plans and procedures
should be developed by the supplier and approved by the end user prior to testing. Some
of the activities to consider during FAT are described in the next section.
Documentation
During FAT, perform a formal review of all project deliverables, such as:
• Design specifications
• P&IDs
• The instrument index
• Instrument specifications
• Instrument location drawings
• Loop diagrams
• Instrument installation details
• Sequence of events
• Logic diagrams
• The DCS/PLC program
• Operating instructions
• Process flow diagrams
• Instrument, equipment, and component technical manuals
Software
No matter how much review and testing of the programming is performed, it is
impossible to test it all. The most efficient use of resources is to:
• Verify compliance with software development quality assurance
• Verify compliance with software programming standards
• Test against software design specifications
• Test security and critical functions during FAT, site acceptance testing (SAT),
commissioning, and/or qualification testing
• Generate and review a sampling of historical data, trend data, and reports for
adequacy
Operator Interface
The end user’s ability to interact with the system depends heavily on the effort taken to
ensure ergonomic and practical implementation of the operator interface. The following
items should be reviewed for each operator interface:
• HMI screen layouts
• Usability
• Readability
• Responsiveness
Hardware
Hardware testing is performed to verify the system has been built according to the
approved hardware design specification. In addition, completed documentation should
be reviewed for any instrument commissioning, software testing, and system level
testing (described in the “Instrumentation Commissioning” and “Software Testing”
sections of this chapter) that has been performed.
If possible, and many times it is, start up and operate the system. Many suppliers now
have the facilities to connect temporary utilities so the system can be operated as if it
were installed at the end user’s facility. Take advantage of this opportunity. Any testing
completed at the FAT stage can significantly reduce the testing requirements at the end-user facility, which can reduce the burden on an already tight schedule.
Change Management
Throughout any project of this nature, design changes must be managed appropriately.
Ideally, a design freeze is implemented at the end of FAT with no changes made so that
all conditions with respect to software revisions, hardware (such as servers), and similar
elements that could change broader elements of the system are the same for the next
phase—site acceptance testing. However, it is more likely that based on observations
made during FAT, additional configuration and point changes are required. Therefore,
engineering or project change management procedures must be followed.
Site Acceptance Testing
The site acceptance test (SAT) demonstrates the system is working in its operational
environment and interfaces with instruments and equipment from other suppliers. The
SAT normally constitutes a repeat of elements of the FAT in the user’s environment plus
those tests made possible with all process, field instruments, interfaces, and service
connections established. A repeat of all FAT testing is not necessary. After all, one of the
purposes of FAT is to minimize the testing required at the end user’s facility. Obviously,
for systems built at the end-user facility, FAT is not performed. All elements mentioned
previously in the FAT should therefore be performed during SAT. A SAT plan should be
codeveloped by the supplier and end user prior to testing. During SAT, consider
performing some of the following activities:
• Repeat critical FAT elements possibly compromised by disassembly, shipment,
and reassembly.
• Perform testing that is now made possible with all process, field instrumentation,
interfaces, communications, and service connections established.
• Perform interface testing of critical elements of the Level 3 system (e.g.,
manufacturing execution system [MES]), if applicable.
• Perform interface testing with critical elements of Level 4 system (e.g., enterprise
resource planning [ERP]), if applicable.
System Level Testing
To continue building a case, we need to gather evidence that the components are
working together when integrated with the equipment and control systems. At the
system level, the first opportunity to gather this evidence is when the instrumentation,
controls, equipment, and utilities have been connected, powered up, and turned over
from construction.
Alarm and Interlock Testing
Alarm and interlock testing is performed to verify all alarms and interlocks are
functioning properly and activate at the proper set points/conditions. Many of the alarm
and interlock tests can be performed with loop checks, since most alarms and interlocks
are activated from some loop component or control system output. If alarm and
interlock tests are performed separately from loop checks, they are documented as part
of the FAT, SAT, or operational qualification test.
Wet Testing and Loop Tuning
Now that we have evidence that the specified components are properly installed and calibrated, the loop components are communicating, and the programming has been developed using appropriate quality procedures and tested, we can make a final case by
testing the integrated operation of the system. Loop tuning would also be performed
with wet testing.
Loop tuning is performed to optimize loop response to set-point changes and process
upsets. Controller PID parameters—proportional band (or gain), integral, and derivative
—are adjusted to optimize the loop response and minimize overshoot. Various methods
have been developed to perform loop tuning, including the Ziegler-Nichols and trial-and-error methods. ISA has several resources for additional information on loop tuning,
such as Tuning of Industrial Control Systems (Corripio 2015) and Maintenance of
Instruments and Systems (Goettsche 2005).
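For reference, the short sketch below encodes the classic closed-loop Ziegler-Nichols rules, which compute starting controller settings from the ultimate gain (Ku) and ultimate period (Pu) observed when the loop is driven to sustained oscillation; the numeric inputs are illustrative only.

```python
# Classic Ziegler-Nichols closed-loop tuning rules: given the
# ultimate gain (Ku) and ultimate period (Pu) found at sustained
# oscillation, compute starting PID parameters.

def ziegler_nichols(ku, pu):
    """Return (gain, integral time, derivative time) for P, PI, PID."""
    return {
        "P":   (0.50 * ku, None,     None),
        "PI":  (0.45 * ku, pu / 1.2, None),
        "PID": (0.60 * ku, pu / 2.0, pu / 8.0),
    }

# Example: a loop oscillates steadily at controller gain 4.0 with a
# 2-minute period (values are illustrative only).
for mode, (kc, ti, td) in ziegler_nichols(ku=4.0, pu=2.0).items():
    print(f"{mode:>3}: Kc={kc:.2f}  Ti={ti}  Td={td}")
```

These settings are starting points; final tuning is refined against the loop's actual response.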
Detailed test procedures must be used so wet testing is performed safely. The system
must be in operational condition with all utilities connected, and filled with an
appropriate medium, if applicable.
Each system should be tested against the specified sequence of operations for start-up,
shutdown, and operations. As an example, the system should be started up and all
aspects of the start-up sequence verified. In some cases, simply pressing the ON button
is all that is required to bring the system up to normal operation. In other cases, operator
intervention is required during the start-up sequence. In either case, all specified
operations must be verified.
After the sequence of operation testing is performed, the system should be tested for
normal operation, process set-point changes, process upsets, and abnormal operations
(to verify it recognizes improper operating parameters and places the system in a safe
condition). Examples include:
• Normal operation – Tank level drops to the low-fill set point and initiates a tank
filling operation
• Process set-point change – Temperature set point is changed from 40°C to 80°C
• Process upset or disturbance – Heat exchanger steam pressure decreases
• Abnormal operations – Valve(s) out of position, pump not running, filter is
clogged
Safety Considerations
It is worth mentioning safety considerations during commissioning and system testing.
Automation professionals who are not permanently assigned to a manufacturing location
are typically involved with start-up, commissioning, and system testing. In addition,
systems are placed in unusual configurations and construction activities are usually still
going on during commissioning. Also, more equipment failures occur when initially
placed in service, before they are “broken in.” For these reasons, more incidents occur
during checkout and commissioning. Of course, the Occupational Safety and Health
Administration (OSHA) regulations must be followed, but let's mention a few common-sense safety precautions for those engineers who don't often get out in the field.
• Electrical safety – Technically, any voltage source over 30 volts can kill you.
During construction and start-up, electrical panels are routinely left open. You
should always assume the wires are live until proven otherwise. Take adequate
precautions when working in and around control panels that are energized. For
example, remove all metal objects (e.g., watches or jewelry), insulate surrounding
areas, wear rubber-soled shoes, and never work alone. Always have someone
working with you (and not in the panel with you!). Depending on the voltages
present and risk involved, it may be a good idea to tie a rope around yourself so
the other person can pull you out, just in case.
• Pressurized systems – Precautions must be taken for any system or component
that could be under pressure. During start-up, systems are turned over from
construction in an unknown status. Valve line-up checks should be performed to
place the system in a known condition. Even then systems are placed in unusual
operating conditions to perform testing. Always know what you’re working with
and always proceed with caution when manipulating system components and
changing system conditions. As mentioned before, if a component is going to fail,
it will tend to fail when initially placed in service. On more than one occasion,
this author has seen tank rupture disks fail the first time a system is brought up
under steam pressure. This is even after the rupture disk was visually inspected
for defects during installation verification. When a rupture disk blows, you do not want to be in its discharge path.
Check system pressure using any available indications before removing a
component that would breach system integrity. However, even if a gauge reads 0 psig, remove the component carefully; the gauge could be faulty. And again,
never work alone. Make sure somebody knows where you are working.
• Extreme temperature – High- and low-temperature systems, such as steam, hot
water, cryogenic storage, and ultra-low temperature freezer systems can be very
dangerous even if everything is kept inside the pipes, tanks, and chambers.
Although insulation minimizes burn risks, there is always some exposed piping. Do not grab any exposed piping until you know it is safe to touch. We all learned this as kids the first time we touched a hot stove, yet we seem to have to relearn it on every new project. Even the smallest steam burn hurts for several days. Exposure to cryogenic media, such as liquid nitrogen, causes an injury much like frostbite or a steam burn; it hurts just as badly.
There are also dangers with using liquid nitrogen in small, enclosed spaces. The
nitrogen will displace the air and cause you to pass out and eventually die. The
bottom line: know what you’re working with, be smart, and never work
alone.
Further Information
Cable, Mike. Calibration: A Technician’s Guide. ISA Technician Guide Series. Research
Triangle Park, NC: ISA (International Society of Automation), 2005.
Coggan, D. A. ed. Fundamentals of Industrial Control. 2nd ed. Practical Guides for
Measurement and Control Series. Research Triangle Park, NC: ISA (International
Society of Automation), 2005.
Corripio, Armando B. Tuning of Industrial Control Systems. 3rd ed. Research Triangle
Park, NC: ISA (International Society of Automation), 2015.
GAMP Good Practice Guide Validation of Process Control Systems. Bethesda, MD:
ISPE (International Society for Pharmaceutical Engineering), 2003.
GAMP Guide for Validation of Automated Systems in Pharmaceutical Manufacture.
Bethesda, MD: ISPE (International Society for Pharmaceutical Engineering), 2001.
Goettsche, Lawrence D. Maintenance of Instruments and Systems. 2nd ed. Research
Triangle Park, NC: ISA (International Society of Automation), 2005.
ISA-RP105.00.01-2017. Management of a Calibration Program for Industrial
Automation and Control Systems. Research Triangle Park, NC: ISA (International
Society of Automation).
Jeffery, Harley M. Loop Checking: A Technician’s Guide. ISA Technician Guide Series.
Research Triangle Park, NC: ISA (International Society of Automation), 2005.
Whitt, Michael D. Successful Instrumentation and Control Systems Design. Research
Triangle Park, NC: ISA (International Society of Automation), 2004.
About the Author
Mike Cable is a Level 3 Certified Control System Technician and author of ISA’s
Calibration: A Technician’s Guide. Cable started his career as an electronics technician
in the Navy Nuclear Power Program, serving as a reactor operator and engineering
watch supervisor aboard the USS Los Angeles submarine. After the military, he spent 11
years as a validation contractor, highlighted by an assignment managing instrument
qualification projects for Eli Lilly Corporate Process Automation. Cable is currently the
manager of operations technology at Argos Therapeutics in Durham, North Carolina.
1. In the past, DI and DO were referred to as digital inputs and digital outputs. Many people still use that terminology.
IV
Control Systems
Programmable Logic Controllers
One of the most ubiquitous control platforms uses the programmable logic controller
(PLC) as its hardware basis. This chapter describes the main distinguishing
characteristics of the PLC, its basic hardware and software architecture, and the
methods by which the program and input/output modules are scanned.
Distributed Control Systems
Distributed control systems (DCSs) are responsible for real-time management and
control of major process plants and, therefore, are typically larger than PLC
installations. The term “distributed” implies that various subsystem control and
communication tasks are performed in different physical devices. The entire system of
devices is then connected via the digital control network that provides overall
communication, coordination, and monitoring.
SCADA
Supervisory control and data acquisition (SCADA) systems have been developed for
geographically distributed sites requiring monitoring and control, typically from a
central location or control center. This chapter describes how the three major SCADA
elements—field-based controllers called remote terminal units (RTUs), a central control
facility from which operations personnel monitor and control the field sites through the
data-collecting master terminal unit (MTU) or a host computer system, and a wide-area
communications system to link the field-based RTUs to the central control facility—are
combined into a system.
13
Programmable Logic Controllers: The
Hardware
By Kelvin T. Erickson, PhD
Introduction
In many respects, the architecture of the programmable logic controller (PLC) resembles
a general-purpose computer with specialized input/output (I/O) modules. However,
some important characteristics distinguish a PLC from a general-purpose computer.
First, and most importantly, a PLC is much more reliable, designed for a mean time
between failure (MTBF) measured in years. Second, a PLC can be placed in an
industrial environment with its substantial amount of electrical noise, vibration, extreme
temperatures, and humidity. Third, plant technicians with less than a college education
can easily maintain PLCs.
This chapter describes the main distinguishing characteristics of the PLC, its basic
hardware and software architecture, and the method in which the program and I/O
modules are scanned.
Basic PLC Hardware Architecture
The basic architecture of a PLC is shown in Figure 13-1. The main components are the
processor module, the power supply, and the I/O modules. The processor module
consists of the central processing unit (CPU) and memory. In addition to a
microprocessor, the CPU also contains at least an interface to a programming device and
may contain interfaces to remote I/O and other communication networks. The power
supply is usually a separate module and the I/O modules are separate from the
processor. The types of I/O modules include discrete (on/off), analog (continuous
variable), and special modules, like motion control or high-speed counters. The field
devices are connected to the I/O modules.
Depending on the amount of I/O and the particular PLC processor, the I/O modules may
be in the same chassis as the processor and/or in one or more other chassis. Up until the
late 1980s, the I/O modules in a typical PLC system were in a chassis separate from the
PLC processor. In the more typical present-day PLC, some of the I/O modules are
present in the chassis that contains the processor. Some PLC systems allow more than
one processor in the same chassis. Smaller PLCs are often mounted on a DIN rail. The
smallest PLCs (often called micro-PLCs or nano-PLCs) include the power supply,
processor, and all the I/O in one package. Some micro-PLCs contain a built-in operator-interface panel. For many micro-PLCs, the amount of I/O is limited and not expandable.
Basic Software and Memory Architecture (IEC 61131-3)
The International Electrotechnical Commission (IEC) 61131-3 programming language
standard defines a memory and program model that follows modern software
engineering concepts. This model incorporates such features as top-down design,
structured programming, hierarchical organization, formal software interfaces, and
program encapsulation. Fortunately, extensive training in software engineering
techniques is not necessary to become a proficient programmer. If fully implemented,
the model is reasonably complicated. The main disadvantages of the model are its
complexity and its contradiction to the simplicity of the early PLCs.
Only the overall IEC 61131-3 program and memory model is described in this chapter. Various implementations of the standard are detailed in Programmable Logic Controllers: An Emphasis on Design and Applications (Erickson 2016). The IEC 61131-3 memory model (what the standard calls the software model) is presented in Figure 13-2. The model is layered (i.e., each layer hides many of the features of the layers beneath). Each of the main elements is described next.
• Configuration – The configuration is the entire body of software (program and
data) that corresponds to a PLC system. Generally, a configuration equates with
the program and data for one PLC. In large complex systems that require multiple
cooperating PLCs, each PLC has a separate configuration. A configuration
communicates with other IEC configurations within the control system through
defined interfaces, called access paths. Unfortunately, the choice of the term
configuration conflicts with the historic use of this term in the controls industry.
Generally, configuration refers to the process of specifying items such as the PLC
processor model, communication interfaces, remote I/O connections, memory
allocation, and so on. Therefore, the vendors producing IEC-compliant PLCs that
use the term configuration in the historic sense refer to the entire body of
software with some other term.
• Resource – A resource provides the support functions for the execution of
programs. One or more resources constitute a configuration. Normally a resource
exists within a PLC, but it may exist within a personal computer (PC) to support
program testing. One of the main functions of a resource is to provide an interface
between a program and the physical I/O of the PLC.
• Program – A program generally consists of an interconnection of function
blocks, each of which may be written in any of the IEC languages. A function
block or program is also called a program organization unit (POU). In addition to
the function blocks, the program contains declarations of physical I/Os and any
variables local to the program. A program can read from and write to I/O
channels, read from and write to global variables, and communicate with other
programs.
• Task – A task controls the execution of one or more programs and/or function blocks.
The execution of a program implies that all the function blocks in the program are
processed once. The execution of a function block implies that all the software
elements of the function block are processed once. There are no implied
mechanisms for program execution. In order for a program to be executed, it must
be assigned to a task, and the task must be configured to execute continuously,
periodically, or with a trigger.
• Variables – Variables are declared within the different software elements of the
model. A local variable is defined at the software element and can only be
accessed by the software element. Local variables can be defined for the function
block, program, resource, or configuration. A global variable defined for a
configuration, resource, or program is accessible to all elements contained in it.
For example, a global configuration variable is accessible to all software
elements in the configuration. A global program variable is accessible to all
function blocks in the program.
Directly represented variables are memory and I/O locations in the PLC. IEC
61131-3 defines formats for references to such data, for example %IX21, %Q4,
and %MW24. However, many implementers of the standard use their own
formats, which are not consistent with the IEC standard.
I/O and Program Scan
The PLC processor has four major tasks executed repeatedly in the following order:
1. Read the physical inputs
2. Scan the ladder logic program
3. Write the physical outputs
4. Perform housekeeping tasks
The processor repeats these tasks as long as it is running. The housekeeping tasks
include communication with external devices and hardware diagnostics.
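The following minimal sketch illustrates this four-step cycle, including the watchdog check described later in this chapter; all names and the time-out value are hypothetical stand-ins for what the processor firmware actually does.

```python
# Minimal sketch of the PLC scan cycle using input/output image
# tables; all functions here are hypothetical stand-ins for the
# processor firmware.
import time

input_image = {}    # snapshot of physical inputs, fixed for the scan
output_image = {}   # outputs computed by the logic, written at scan end
WATCHDOG_S = 0.5    # assumed watchdog time-out

def scan_cycle(read_inputs, ladder_logic, write_outputs, housekeeping):
    while True:
        start = time.monotonic()
        input_image.update(read_inputs())        # 1. read the physical inputs
        ladder_logic(input_image, output_image)  # 2. scan the program
        write_outputs(output_image)              # 3. write the physical outputs
        housekeeping()                           # 4. communication and diagnostics
        if time.monotonic() - start > WATCHDOG_S:
            # A real processor halts and signals a fault here.
            raise RuntimeError("Watchdog: scan time exceeded")
```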
The time required to complete these four tasks is defined as the scan time and typically
ranges from a few milliseconds up to a few hundred milliseconds, depending on the
length of the program. For very large programs, the scan time can be relatively long,
causing the PLC program to miss transient events, especially if they are shorter than the
scan time. In this situation, the possible solutions are:
1. Break the program into program units (tasks, function blocks, or routines) that
are executed at a slower rate and execute the logic to detect the transient event
on every scan.
2. Lengthen the time of the transient event so that it is at least twice the maximum
scan time.
3. Place the logic examining the transient in a program unit that is executed at a
fixed time interval, smaller than one-half the length of the transient event.
4. Partition long calculations (e.g., array manipulation) into smaller parts so that
only a portion of the calculation is solved during a scan time.
Depending on the PLC processor, one or more of these solutions may be unavailable.
Normally, during the Ladder Diagram (LD) program scan, changes in physical inputs
cannot be sensed, nor can physical outputs be changed at the output module terminals.
However, some PLC processors have a function block that can read the current state of a physical input and another function block that can immediately set the current state of a physical output. Using these immediate I/O blocks incurs a severe time penalty on the program scan: scanning one contact in the Ladder Diagram typically requires less than one microsecond, whereas executing an immediate I/O instruction typically requires 200 to 300 microseconds. Consequently, these instructions are used sparingly.
From the standpoint of the physical I/O and program execution, the processor scan is
shown in Figure 13-3. The state of the actual physical inputs is copied to a portion of the
PLC memory, commonly called the input image table. When the program is scanned, it
examines the input image table to read the state of a physical input. When the logic
determines the state of a physical output, it writes to a portion of the PLC memory
commonly called the output image table. The output image may also be examined
during the program scan. To update the physical outputs, the output image table contents
are copied to the physical outputs after the program is scanned. In reality, this close
coordination between the program scan and the reading/writing of I/O applies only to
I/O modules in the same chassis as the processor. For I/O modules in another chassis,
there is a communication link or network whose operation is generally not coordinated
with the processor program scan. As shown in Figure 13-4, the communication module
scans the remote I/O at its own rate and maintains buffer data blocks. The transfer of
data between the processor and the buffer data is coordinated with the program scan. In
addition, some PLC processors have no coordination between the program scan and the
I/O modules. When an input module channel changes, its status is immediately updated
in the input image table and is not coordinated with the start of the program scan.
Most PLC processors have a watchdog timer that monitors the scan time. If the
processor scan time exceeds the watchdog timer time-out value, the processor halts the
program execution and signals a fault. This type of fault usually indicates the presence
of an infinite loop in the program or too many interrupts to the program scan.
The overall execution of the PLC processor scan is controlled by the processor mode.
When the PLC processor is in the run mode, the physical inputs, physical outputs, and
program are scanned as described previously. When the processor is in program mode
(sometimes called stopped), the program is not scanned. Depending on the particular
PLC processor, the physical inputs may be copied into the input image, but the physical
outputs are disabled. Some processors have a test mode, where the physical inputs and
Ladder Diagram are scanned. The output image table is updated, but the physical
outputs remain disabled.
Within the processor program, the scan order of the function blocks (see Figure 13-2)
can be specified. Within a function block (or program organization unit), the program
written in any of the IEC languages generally proceeds from top to bottom. For a Ladder
Diagram, the scan starts at the top of the ladder and proceeds to the bottom of the ladder,
examining each rung from left to right. Once a rung is examined, it is not examined
again until the next ladder scan. The rungs are not examined in reverse order. Most processors do have a jump instruction that one could use to jump back up the ladder and execute previous rungs; however, that use of the instruction is not recommended, because the PLC could be caught in an infinite loop. Even if the processor is caught in an infinite loop, the watchdog timer will cause a processor halt so the problem can be corrected.
Forcing Discrete Inputs and Outputs
One of the unique characteristics of PLCs is their ability to override the status of a
physical discrete input or to override the logic driving a physical output coil and force
the output to a desired status. This characteristic is very useful for testing. With few exceptions, this override function applies only to physical discrete inputs and physical discrete outputs. In addition, depending on the PLC manufacturer, this
function is called by different names (e.g., input/output forcing, input/output disabling,
or override). The forcing function modifies the PLC program scan as shown in Figure
13-5.
For discrete inputs, the force function acts like a mask or a filter. When a physical discrete input is forced, the value in the force table overrides the actual input device status. An input force/disable/override is often useful
for testing a PLC program. If the PLC is not connected to physical I/O, the forces allow
one to simulate the inputs. An input force can also be used to temporarily bypass a
failed discrete input device so that operation may continue while the device is being
repaired. However, overriding safety devices in this manner is not recommended.
For discrete outputs, the forcing/disable function acts like a mask. When a particular
physical discrete output is forced, the value in the force table overrides the value
determined by the result of the logic that drives the output coil. An output
force/disable/override is useful for troubleshooting discrete outputs. Forcing an output to the on and/or off state can be used to test that particular output. Otherwise, output forces should not be used; an output force that overrides the PLC logic should be applied only temporarily, until the logic can be corrected.
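A simple way to picture the forcing function is as a mask applied between the physical inputs and the image table, as in the hypothetical sketch below; the tag names are invented for illustration.

```python
# Sketch of how a force table masks the input image: forced points
# take the table value; unforced points pass through unchanged.

def apply_forces(image, force_table):
    """Return the image seen by the logic after forcing."""
    return {tag: force_table.get(tag, value) for tag, value in image.items()}

physical_inputs = {"LS_101": 0, "PB_START": 1}
forces = {"LS_101": 1}   # simulate a failed limit switch as 'made'

print(apply_forces(physical_inputs, forces))
# {'LS_101': 1, 'PB_START': 1} -- the logic sees the forced value
```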
Further Information
Erickson, Kelvin T. Programmable Logic Controllers: An Emphasis on Design and
Applications. 3rd ed. Rolla, MO: Dogwood Valley Press, 2016.
Hughes, Thomas A. Programmable Controllers. 4th ed. Research Triangle Park, NC:
ISA (International Society of Automation), 2004.
IEC 61131-3:2013. Programmable Controllers – Part 3: Programming Languages.
Geneva 20 - Switzerland: IEC (International Electrotechnical Commission).
Lewis, R. W. Programming Industrial Control Systems Using IEC 1131-3. Revised ed.
The Institution of Electrical Engineers (IEE) Control Engineering Series. London:
IEE, 1998.
About the Author
Kelvin T. Erickson, PhD, is a professor of electrical and computer engineering at the
Missouri University of Science and Technology (formerly the University of Missouri-Rolla, UMR) in Rolla, Mo. His primary areas of interest are in manufacturing
automation and process control. Before coming to UMR in 1986, he was a senior design
engineer at Fisher Controls Intl., Inc. (now part of Fisher-Rosemount). During 1997, he
was on a sabbatical leave from UMR, working for Magnum Technologies (now part of
Maverick Technologies), Fairview Heights, Ill. Erickson received the BSEE and MSEE
degrees from the University of Missouri-Rolla and the PhD EE degree from Iowa State
University. He is a registered professional engineer (control systems) in Missouri. He is
a member of the International Society of Automation (ISA) and a senior member of the
Institute of Electrical and Electronics Engineers (IEEE).
14
Distributed Control Systems
By Douglas C. White
Introduction and Overview
Modern distributed control systems (DCSs) support the real-time management and
control of major process plants. They provide the computing power, connectivity, and
infrastructure to link measurement sensors, actuators, process control algorithms, and
plant monitoring systems. They are extensively used in power generation plants and the
continuous and batch process industries, and less commonly in discrete manufacturing
factories.
DCSs were initially introduced in the 1970s and have subsequently been widely
adopted. They convert field measurements to digital form, execute multiple control
algorithms in a controller module, use computer screens and keyboards for the operator
interface, and connect all components together with a single digital data network. The
“distributed” in DCS implies that various sub-system control and communication tasks
are performed in different physical devices. The entire system of devices is then
connected via the digital control network that provides overall communication,
coordination, and monitoring.
The continuing evolution of computing and communication capabilities that is evident
in consumer electronics being cheaper, faster, and more reliable also impacts DCS
developments. New functionality is continually being added.
The elements of a modern DCS are shown in Figure 14-1. As illustrated in the figure,
DCS systems also provide the data and connectivity required for plant and corporate
systems. In addition, they consolidate information from safety systems and machinery
monitoring systems.
Major components of a typical system include input/output (I/O) processing and
connectivity, control modules, operator stations, engineering/system workstations,
application servers, and a process control network bus. These components are discussed
in the following section.
Input/Output Processing
The first step in control is to convert the sensed measurement into a digital value that
can be evaluated by the control algorithms. It has been common in the past to bring all
the input/output (I/O) in the plant to marshalling panels, perhaps located in the field,
from which connections are made to the I/O terminations for the DCS controllers in the
control center. Traditionally, this connection was a bundle of individual wires
connecting the terminations in the marshalling panel to I/O connections at the controller
and was called the home run cable.
Many types of equipment must be connected to a modern DCS, with different specific
electronic and physical requirements for each interface. Each I/O type, discussed in the
next section, requires its own specialized I/O interface that will convert the signal to the
digital value used in the DCS. These interface devices are installed on an electronic bus
that provides high-speed data transfer to the controller. Typically, the inputs will be
accessed and converted approximately once per second with special control functions
executing in the millisecond range. Modern marshalling panels can include embedded
computing and communication capabilities, which are sometimes called configurable
I/O. This permits the signal conversion to be done in the marshalling cabinet rather than
in the DCS controller and the connection to the controller to be network wiring rather
than a bundle of wires.
Large oil refineries, power stations, and chemical plants can have tens of thousands of
measurements connected directly to the DCS with additional tens of thousands of
readings accessed via communication links.
Standard Electronic Analog I/O: 4–20 mA
When electronic instrumentation was first introduced, there were many different
standards for the scalar/continuous electronic I/O signals. Eventually, the 4–20
milliampere (mA) analog direct current (DC) signal was adopted as standard and is still
the most widely used I/O format. Each signal used for control, both input and output,
has its own wire, which can be connected to a marshalling panel and then connected to
the controller. Output signals are connected to and regulate the control valves, variablefrequency drive (VFD) motors, and other control devices in the plant. More information
on valves and VFDs has been provided in other chapters in this book.
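The conversion between the 4–20 mA signal and engineering units is a simple linear scaling, sketched below with an illustrative calibrated range.

```python
# Standard linear scaling of a 4-20 mA input to engineering units;
# the range values are illustrative.

def ma_to_eu(signal_ma, lrv, urv):
    """Convert a 4-20 mA current to the calibrated range lrv..urv."""
    return lrv + (urv - lrv) * (signal_ma - 4.0) / 16.0

# A 12 mA reading on a 0-300 degC transmitter is 50% of span:
print(ma_to_eu(12.0, 0.0, 300.0))   # 150.0
```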
Discrete I/O
There are several devices, such as limit switches and motors, whose state is binary (i.e.,
on or off). These require different I/O processing from the analog I/O and, most
commonly, separate I/O interfaces. Discrete signals are normally 24 VDC or 120 VAC
(North America) and 230 VAC (rest of world).
Specialized I/O
There are many specialized measurements with equally special interface requirements.
Thermocouples
Thermocouples may be cold junction compensated or uncompensated, or alternately,
temperature may be sensed via a resistance temperature detector (RTD). Each of these
input formats requires different calculations to convert the signal to a reading.
Multiplexers are often used for multiple thermocouple readings to reduce I/O wiring.
The current trend is to replace multiplexers and long lead temperature sensor wires with
head-mounted transmitters installed in the thermowell cap itself with an analog signal
output.
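As an example of the input-specific calculation involved, the sketch below converts a Pt100 RTD resistance to temperature using the IEC 60751 coefficients for 0°C and above; the resistance value is illustrative.

```python
# Resistance-to-temperature conversion for a Pt100 RTD (0 degC and
# above) using the IEC 60751 Callendar-Van Dusen coefficients --
# the kind of calculation an RTD input card performs internally.
import math

R0 = 100.0        # ohms at 0 degC for a Pt100
A = 3.9083e-3
B = -5.775e-7

def pt100_temperature(resistance_ohm):
    """Invert R = R0*(1 + A*T + B*T^2) for T >= 0 degC."""
    return (-A + math.sqrt(A * A - 4.0 * B * (1.0 - resistance_ohm / R0))) / (2.0 * B)

print(round(pt100_temperature(138.51), 1))   # ~100.0 degC
```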
Pulse Counts
For turbine meters and some other devices, it is necessary to accumulate the number of
pulses transmitted between a predefined start time and the time of the reading. The
number of pulses is then converted to a volume flow reading based on the type and size
of the meter.
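The sketch below shows this conversion for a hypothetical turbine meter with an assumed K-factor.

```python
# Converting an accumulated pulse count to volume and flow; the
# K-factor is illustrative for a hypothetical turbine meter.

K_FACTOR = 250.0        # pulses per gallon (from the meter data sheet)

def pulses_to_flow(pulse_count, interval_s):
    """Return (total gallons, gallons per minute) over the interval."""
    gallons = pulse_count / K_FACTOR
    gpm = gallons / (interval_s / 60.0)
    return gallons, gpm

print(pulses_to_flow(12500, 60.0))   # (50.0, 50.0)
```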
HART
With continuing computer miniaturization, it became possible to add enhanced
calculation and self-diagnostic capability to field devices (“smart” instrumentation).
This created a need for a communication protocol that could support transmitting more
information without adding more wires. Rosemount initially developed the Highway
Addressable Remote Transducer (HART) protocol in the 1980s, and in 1993, it was
turned over to an independent body, the HART Communication Foundation. The HART
protocol retains the 4–20 mA signal for measurement and transmits other diagnostic
information on the same physical line via a digital protocol. A specialized HART I/O
card is required that will read both the analog signal and diagnostic information from the
same wire.
Digital Buses
Because modern DCSs are digital and new field instrumentation supports digital
communication, there was a need for a fully digital bus to connect them. Such a
communication protocol reduces wiring requirements, because several devices can be
connected to each bus segment (i.e., it is not necessary to have individual wires for each
signal). Several digital buses are in use with some of the more popular ones described in
the next section. It is common to connect the bus wiring directly to a DCS controller bus
I/O card, bypassing the marshalling panel and further reducing installation costs.
The various fieldbus protocols are defined in the IEC 61158, IEC 61804, and IEC 62769
standards covering approximately 20 different protocols. The following represent some
of the protocols most commonly used in the process industries.
Fieldbus
FOUNDATION Fieldbus is a digital communication protocol supporting interconnection
between sensors, actuators, and controllers. It provides power to the field devices,
supports distribution of computational functions, and acts as a local network. For
example, it is possible, with FOUNDATION Fieldbus, to digitally connect a smart
transmitter and a smart valve and execute the proportional-integral-derivative (PID)
controller algorithm connecting them locally in compatible valve electronics or in a
compatible I/O card, with the increased reliability that results from such an architecture.
The FOUNDATION Fieldbus standard is administered by an independent body, the Fieldbus Foundation, which merged with the HART Communication Foundation and is now known as the FieldComm Group.
PROFIBUS
PROFIBUS, or PROcess FieldBUS, was originally a German standard and is now a
European standard. There are several variations. PROFIBUS DP (decentralized
peripherals) is focused on factory automation, while PROFIBUS PA (process
automation) targets the process industries. The standard is administered by PROFIBUS
and PROFINET International.
DeviceNet
DeviceNet is a digital communication protocol that can support bidirectional messages
up to 8 bytes in size. It is commonly used for variable speed drives, solenoid valve
manifolds, discrete valve controls, and some motor starters. DeviceNet is now part of
the common industrial protocol (CIP) suite of protocols.
Ethernet
Some field devices, such as sophisticated online analyzers, are now supporting direct
Ethernet connectivity using standard Transmission Control Protocol/Internet Protocol
(TCP/IP) protocols.
Serial Communication
For relatively complicated equipment, such as gas chromatographs, tank gauging,
machinery condition monitors, turbine controls, and packaged equipment programmable
logic controllers (PLCs), it is desirable to communicate more than just an analog value
and/or a digital signal. Serial communication protocols and interface electronics were
developed to support this communication with one of the most common protocols being
Modbus. Recent implementations support block data transfer, as well as single-value-at-a-time transmission.
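As a hedged illustration of serial polling, the sketch below reads two holding registers from a Modbus RTU device using the third-party pymodbus package; the serial port, addresses, and keyword names (which vary somewhat between pymodbus releases) are assumptions.

```python
# Sketch of reading two holding registers from a Modbus RTU device
# over a serial line with the third-party pymodbus library; port,
# register address, and slave id are assumptions for illustration.
from pymodbus.client import ModbusSerialClient

client = ModbusSerialClient(port="/dev/ttyUSB0", baudrate=9600, timeout=1)
if client.connect():
    result = client.read_holding_registers(address=0, count=2, slave=1)
    if not result.isError():
        print("Registers:", result.registers)
    client.close()
```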
Wireless
In a plant environment, there are now commonly two types of wireless field networks.
The first type includes communication with security cameras, people-tracking devices,
safety stations (e.g., eye wash), and mobile workstations. This type of wireless field
network is generally referred to as a wireless plant network. Displays from these devices
are often incorporated into plant operator stations, sometimes with dedicated consoles.
The second type of wireless field network consists of field instrumentation and is
generally referred to as a wireless field instrumentation network. Wireless sensor
connections are becoming more popular, particularly for less critical readings. Multiple
standards have been adopted for plant implementation with cybersecurity as an
important consideration. Wireless sensor input signals are usually collected in one or
more wireless signal receivers that are then connected to the DCS. Sensor diagnostic
information can be transmitted wirelessly as well, using protocols such as the HART protocol discussed previously.
Control Network
The control bus network is the backbone of the DCS and supports communication
among the various components. Data and status information from the I/O components is
transferred to and from the controllers. Similarly, data and status information from the
I/O and controllers goes to and from the human-machine interface (HMI). Transit
priorities are enforced: data and communication concerning control have the highest priority, while process information and configuration changes have lower priority.
Typically, the bus is redundant and supports high-speed transfer. With early DCSs, each
vendor had their own proprietary bus protocol that defined data source and destination
addressing, data packet length, and the speed of data transmission. Today, Ethernet
networks with TCP/IP-based protocol and IP addressing have become the most common
choice.
Control Modules
The control modules in a typical DCS are connected to the I/O cards by a high-speed
bus that brings the raw data to the module. The module contains a microprocessor that performs, in real time, process variable input processing and status monitoring, alarm and status alert processing, execution of the specified control algorithms, and process variable output processing.
Execution frequencies for these functions can be multiple times per second; hence, the
number of I/O points, control loops, and calculations that can be processed in a single
control module is limited. Multiple modules are used to handle the complete
requirements for a process. Control modules are usually physically located in a climate-controlled environment.
Database
Each control module contains a database that stores the current information scanned and
calculated, as well as configuration and tuning information. Backup information is
stored in the engineering station.
Redundancy
For critical control functions, redundant control modules and power supplies are often
used. These support automatic switching from the primary to the backup controller upon
detection of a failure.
Human-Machine Interface—Operator Workstations
There are usually two different user interfaces for the DCS—one for the operator
running the process, and a second one for engineering use that supports configuration,
system diagnostics, and maintenance. In a small application, these two interfaces may
be physically resident in the same workstation. For systems of moderate or larger size,
they will be physically separate. The operator interface is covered in this section, and
the system interface in the next section. A typical operator station is shown in Figure 14-2.
The number of consoles required is set by the size of the system and the complexity of
the control application. The consoles access the control module database via the control
bus to display information about the current and past state of the process and are used to
initiate control actions, such as set-point changes to loops and mode changes. Access
security is enforced by the consoles through individual login and privilege assignment.
There can be redundancy in the consoles and in the computers that are used to drive the
consoles.
Keyboard
Standard computer keyboards and mice are the most common operator console
interface, supplemented occasionally with dedicated key pads, which include common
sets of keystrokes preprogrammed into individual keys.
Standard Displays
The operator and engineer workstations support standard displays commonly used by
the operators and engineers to monitor and control plant operation.
Faceplates
Faceplate displays show dynamic and status parameters about a single control loop and
permit an operator to change control mode and selected parameter values for the loop.
Custom Graphic Displays
These displays present graphic representations of the plant with real-time data displays,
regularly refreshed, superimposed on the graphics at a point in the display
corresponding to their approximate location in the process. A standard display is shown
in Figure 14-3, with a faceplate display superimposed.
Displays can be grouped and linked via hierarchies and paging to permit closer
examination of the data from a specific section of a plant or an overview of the whole
plant operation. The International Society of Automation (ISA) issued the ANSI/ISA-101.01-2015 standard, Human Machine Interfaces for Process Automation, to assist in
the effective design of HMIs.
Alarms
Alarms generated will cause a visible display on the operator console, such as a blinking
red tag identifier and often an audible indication. Operators acknowledge active alarms
and take appropriate action. Alarms are time-stamped and stored in an alarm history
system retrievable for analysis and review. Different operator stations may have
responsibility for acknowledgement of different alarms.
Alarm “floods” occur when a plant has a major upset, and the number of alarms can
actually distract the operator and consume excessive system resources. Finding the right
balance between providing enough alarms to alert the operator to problems and avoiding overwhelming them with minor issues is a continuing challenge. The ANSI/ISA-18.2-2016 standard, Management of Alarm Systems for the Process Industries, is a
comprehensive guide to design and implementation of alarming systems. It has served
as the basis for the International Electrotechnical Commission (IEC) global standard
IEC 62682. ISA maintains a committee actively working on alarm issues. There is also a
European guide, EEMUA 191, on alarm management.
Sequence of Events
Other events, such as operator logins, set-point changes, mode changes, system
parameter changes, status point changes, and automation equipment error messages, are
captured, time-stamped, and stored in a sequence of events system—again retrievable
for analysis and review. If sequence of events recording is included on specific process
equipment, such as a compressor, it may be integrated with this system.
Historian/Trend Package
A historical data collection package is used to support trending, logging, and reporting.
The trend package shows real-time and historical process data on the trend display.
Preconfigured trends are normally provided along with the capabilities for user-defined
trends. A typical trend display is shown in Figure 14-4.
Human-Machine Interface—Engineering Workstation
The system or engineering workstation supports the following functionality:
• System and control configuration
• Database generation, edit, and backup
• System access management
• Diagnostics access
• Area/plant/equipment group definition and assignment
Configuration
Configuration of the control module is normally performed off-line in the engineering
workstation with initial configuration and updates downloaded to the control module for
execution. Configuration updates can be downloaded with only momentary interruption
of process control. Configuring a control loop in the system typically involves the
following steps:
• Mapping DCS process variable tags to the specific input or output channels on
the I/O cards, which contain the wiring terminations for the relevant field devices
and identify the required input conversions. Process variable ranges may be
manually set or read from the device directly if using modern digital buses.
• Identifying the specific control algorithm (i.e., PID) connecting the I/O variables,
their required configuration settings, and setting initial estimates of the tuning
parameters.
• Building a graphical display, which contains the control loop, for the operator.
Today, this configuration is usually done in a graphically based system with drop-and-drag icons, dialogue windows, and fill-in-the-blank forms, such that minimal actual
programming is required. A typical screen for configuration is shown in Figure 14-5,
where the boxes represent I/O or PID control blocks. The lines represent the data
transfer between control blocks.
Generally, there are predefined templates for standard I/O and control functions, which
combine icons, connections, and alarming functions. Global search and replace is
supported.
Prior to downloading to the control modules, the updated configuration is checked for
validity, and any errors are identified.
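Although every vendor's tools differ, the underlying configuration can be pictured as structured data of the kind sketched below, including the sort of validity check run before a download; all field names are illustrative, not from any particular DCS.

```python
# Hypothetical data-structure view of one control-loop
# configuration: tag-to-channel mapping, algorithm selection, and
# initial tuning; field names are illustrative only.

loop_config = {
    "tag": "TIC-101",
    "input":  {"channel": "AI-03/CH2", "range": (0.0, 300.0), "units": "degC"},
    "output": {"channel": "AO-01/CH4", "device": "TV-101"},
    "algorithm": "PID",
    "tuning": {"gain": 0.8, "reset_min": 2.0, "rate_min": 0.0},
    "alarms": {"high": 250.0, "high_high": 275.0},
}

def validate(cfg):
    """Validity check of the kind run before downloading."""
    lo, hi = cfg["input"]["range"]
    assert lo < hi, "Input range reversed"
    assert cfg["alarms"]["high"] < cfg["alarms"]["high_high"], "Alarm order"
    return True

print(validate(loop_config))   # True -> ready to download
```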
Other configuration functionality includes the following:
Graphic Building
A standard utility is provided to generate and modify user-defined graphics. This uses
preconfigured graphical elements, including typical ISA symbols and user fill-in tables.
The ISA-5.4-1991 standard, Instrument Loop Diagrams, provides guidelines for control-loop symbols. Typically, new graphics can be added and graphics deleted without
interrupting control functionality.
Audit Trail/Change Control
It is common to require an audit trail or record of configuration and parameter changes
in the system, along with documentation of the individual authorizing the changes.
Simulation/Emulation
It is desirable to test and debug configuration changes and graphics prior to
downloading to the control module. Simulation/emulation capabilities permit this to be
performed in the system workstation using the current actual plant configuration.
Application Servers
Application servers are used to host additional software applications that are
computationally intensive, complicated, or transaction-oriented, such as complex batch
execution control and management, production management, operator training, online
process modeling for plant and energy optimization, and so on. Again, redundant
hardware can be used, if appropriate. Application servers are also used to link higher-level plant and corporate networks to the control network.
Remote Accessibility
It is desirable for users to be able to access information from the DCS remotely.
Application servers can act as a secure remote terminal server, providing access for
multiple users simultaneously and controlling privileges and area access.
Connectivity
The application server is also used to host communication software for external data
transfer, such as laboratory and custody transfer systems. There are several standards
that are commonly used for this transfer. One common data transfer standard is OPC.
OPC
The object linking and embedding (OLE) standard was initially developed by Microsoft
to facilitate application interoperability. The standard was extended to become OLE for
Process Control (OPC, now Open Platform Communications), which is a specialized
standard for the data transfer demands of real-time application client and servers. It is
widely supported in the automation industry and permits communication among
software programs from many different sources.
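As a rough illustration of OPC-based data access, the short Python sketch below reads one value from an OPC UA server using the open-source python-opcua package; the endpoint address and node identifier are hypothetical placeholders, not any particular vendor's configuration.

    from opcua import Client  # open-source python-opcua package

    client = Client("opc.tcp://192.168.1.50:4840")  # hypothetical server endpoint
    client.connect()
    try:
        node = client.get_node("ns=2;s=TankTemp")   # hypothetical node identifier
        print("Tank temperature:", node.get_value())
    finally:
        client.disconnect()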
Mobile Devices
Smartphones and similar devices have transformed personal daily activities. This
transformation has generated a demand for real-time access to essential plant operations
data, independent of the individual’s current location, whether in the plant or outside.
DCS vendors have responded by providing connectivity to these mobile devices (see
Figure 14-6). Cybersecurity is a top priority. Typically, this connectivity is read-only,
meaning that the user can access data but can’t change any data in the system. Normally,
secure VPN or Wi-Fi access points are used, and the data transferred is encrypted. In-plant wireless communication is often handled with special ruggedized tablets that are compatible with standards for electrical equipment operation in hazardous areas.
DCS Physical Operating Environment
Commonly the control center is a physically separate building located some distance
from the actual process unit(s). The building is designed to protect the occupants in the
event of a major process event, such as a fire or explosion. Within the building, the DCS
architecture is often physically split. The controllers, I/O terminations, power supplies,
and computer servers are in one room, which is often called the rack room. This room is
environmentally controlled at a relatively low temperature (typically 40°F to 70°F) and
a relatively low humidity (near 50%), with air filters to exclude corrosive gases, dust,
and other airborne contaminants.
The operator consoles are grouped in another room, often called the control center, with
pods of consoles for one major process unit or closely coupled units. Several major
process units or indeed the entire site can be controlled from one control center. A
typical control center is shown in Figure 14-7.
Cybersecurity
Initially DCSs were isolated systems running proprietary control and communication
software on specialized hardware. They were commonly physically separate from other
plant business systems with very limited external communication. The evolution of DCSs toward more standard commercial hardware and software, along with increased communication with external business systems and remote access, has brought greater potential vulnerability to cybersecurity incidents. A major
cybersecurity incident on a DCS in a modern process plant could lead to significant
health, safety, and environmental risks and large potential financial losses. Vendors of
these systems are well aware of these risks and typically provide defense against
cybersecurity incursions in their hardware and software. The ISA99 committee has
developed the ISA/IEC 62443 series of standards on the cybersecurity of industrial
automation and control systems, which outline policies and procedures for keeping
cybersecurity defenses current. A number of other government and industry groups also
provide cybersecurity guidelines.
Future DCS Evolution
New functionality is continually added to DCSs with the ongoing evolution of
computation and communication capabilities. Several trends are evident. One is that
central control rooms are being installed remotely from the actual plant; in some cases,
they are hundreds of miles away and are responsible for remotely controlling many
plants simultaneously. This increases the demand for diagnostic information on both the
instrumentation and other process equipment to better diagnose and predict process
problems so that corrective action can be taken before the faults occur. Second, new
plants have many more measurements than older plants with similar processes, and
“smart field devices” have extensive diagnostic information to transmit. This leads to
significant increases in the quantity of data that DCSs must process. A third related
trend is the increased requirement for “sensor-to-boardroom” integration, which
imposes ever-increasing communication bandwidth demands. Good, real-time corporate
decisions depend on good, real-time information about the state of the plant. Secure
integration of wireless field devices and terminals into the control system remains an
active area of current development. This includes further enhancement of the mobile
device interface mentioned previously.
Further Information
The ISA committees referenced in this chapter maintain up-to-date portals on their latest
standard development activities. The committees and standards can be accessed online
at www.isa.org/standards.
For additional background, see the chapter on distributed control systems in:
McMillan, G. K., and D. M. Considine. Process/Industrial Instruments and
Controls Handbook. 5th ed. Research Triangle Park, NC: ISA (International
Society of Automation), 1999.
For feedback control design, the following books are good general references:
Blevins, Terrance, and Mark Nixon. Control Loop Foundation: Batch and
Continuous Processes. Research Triangle Park, NC: ISA (International Society
of Automation), 2011.
Wade, Harold L. Basic and Advanced Control: System Design and Application. 3rd
ed. Research Triangle Park, NC: ISA (International Society of Automation),
2017.
For further information on the evolution of control systems, refer to:
Feeley, J., et al. “100 Years of Process Automation.” Control XII, no. 12 (December
1999 special issue).
About the Author
Douglas C. “Doug” White is a principal consultant with Emerson Automation
Solutions. Previously, he held senior management and technical positions with MDC
Technology, Profitpoint Solutions, Aspen Technology, and Setpoint. In these positions,
White has been responsible for developing and implementing state-of-the-art advanced
automation and optimization systems in process plants around the world and has
published more than 50 papers on these subjects. He started his career with Caltex
Petroleum Corporation with positions at their Australian refinery and central
engineering groups. He has a BChE from the University of Florida, an MS from
California Institute of Technology, and an MA and PhD from Princeton University, all in
chemical engineering.
15
SCADA Systems: Hardware,
Architecture, and Communications
By William T. (Tim) Shaw, PhD, CISSP, CPT, C|EH
In the world of industrial automation, there are processes that are geographically
distributed over large areas, making it difficult, if not impossible, to interconnect all
associated sites with local area network (LAN) technology. The classic examples are
crude oil, refined products, and gas pipelines; electric power transmission systems; water and sewage transportation and processing systems; and transportation infrastructure (e.g., highways, subways, and railroads). The common theme in all these applications is
the need to monitor and control large numbers of geographically distant sites, in real
time, from some (distant) central location. In order to perform this task, supervisory
control and data acquisition (SCADA) systems have been developed, evolving from
earlier telemetry and data acquisition systems into today’s powerful, computer-based
systems. SCADA systems classically consist of three major elements: field-based
controllers called remote terminal units (RTUs), a central control facility from which
operations personnel monitor and control the field sites through the data-collecting
master terminal unit (MTU, a term that predates computer-based systems) or a host
computer system, and a wide-area communications system to link the field-based RTUs
to the central control facility (as illustrated in Figure 15-1).
Depending on the industry or the application, these three elements will have differences.
For example, RTUs range in sophistication from “dumb” devices that are essentially just
remote input/output (I/O) hardware with some form of serial communications, all the
way to microprocessor-based devices capable of extensive local automatic control,
sequential logic, calculations, and even wide area network (WAN) communications that
are based on Internet Protocol (IP). In the water/waste industry, conventional industrial
programmable logic controllers (PLCs) have become the dominant devices used as
RTUs. In the pipeline industry, specialized RTUs have been developed due to a need for
specialized volumetric calculations and the ability to interface with a range of analytical
instruments. The electric utility RTUs have tended to be very basic, but today include
units capable of direct high-speed sampling of the alternating current (AC) waveforms
and the computing of real and reactive power, as well as spectral (harmonics) energy
composition. One historic issue with RTUs has been the fact that every SCADA vendor
tended to develop their own communication protocol as part of developing their own
family of RTUs. This has led to a wide range of obsolete and ill-supported
communication protocols, especially some old “bit-oriented” ones that require special
communications hardware.
In the last decade, several protocols have either emerged as industry de facto standards
(like Modbus, IEC 61850, and DNP3.0) or have been proposed as standards (like
UCA2.0 and the various IEC protocols). There are vendors who manufacture protocol
“translators” or gateways—microcomputers programmed with two or more protocols—
so that old, legacy SCADA systems (that still use these legacy protocols) can be
equipped with modern RTU equipment or so that new SCADA systems can
communicate with an installed base of old RTUs. (If located adjacent to a field
device/RTU, the typical protocol translator would support just two protocols—the one
spoken by the RTU and the one spoken by the SCADA master. But if the protocol
translator is located at the SCADA master, a gateway might support several protocols
for the various field devices and convert them all to the one protocol used/ preferred by
the SCADA master.)
One of the most recent developments has been RTUs that are Internet Protocol (IP)-enabled, meaning that they can connect to a Transmission Control Protocol/Internet Protocol (TCP/IP) network (usually via an Ethernet port) and offer one or more of the
IP-based protocols like Modbus/TCP or Distributed Network Protocol/Internet Protocol
(DNP/IP), or they are integrated using Inter-Control Center Communications Protocol
(ICCP). RTUs with integral cellular modems or license-free industrial, scientific, and
medical (or ISM) band radios (replacing the traditional licensed radios) have also come
on the market.
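As a sketch of what such an IP-based protocol exchange looks like on the wire, the following Python fragment builds a standard Modbus/TCP request (function 3, read holding registers) and sends it to a hypothetical RTU address; the register map and IP address are assumptions for illustration.

    import socket
    import struct

    # MBAP header (transaction ID, protocol ID 0, length field, unit ID) followed
    # by the PDU: function 0x03, starting register address, and register quantity.
    request = struct.pack(">HHHBBHH", 1, 0, 6, 1, 3, 0, 2)

    with socket.create_connection(("10.0.0.5", 502)) as conn:  # hypothetical RTU
        conn.sendall(request)
        response = conn.recv(256)

    # Skip the 7-byte MBAP header plus function code and byte count,
    # then unpack the two 16-bit register values.
    print("Registers 0-1:", struct.unpack(">HH", response[9:13]))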
Another recent development is the use of RTUs as data concentrators with interfaces
(usually serial) to multiple smart devices at the remote site, so that all their respective
data can be delivered to the SCADA system via a single communication channel. This is
a variation on the master RTU concept where a given RTU polls a local set of smaller
RTUs, thus eliminating the need for a communications channel to each RTU from the
SCADA master. In a data concentrator application, the RTU may have little or no
physical I/O, but it may have multiple serial ports and support multiple protocols. A
variation often seen in the electric utility industry is RTUs that are polled by multiple
SCADA masters (multi-ported), often using different communication protocols, as
shown in Figure 15-2. This presents a unique challenge if the RTUs being shared have
control outputs because some scheme is usually needed to restrict supervisory control of
those outputs to one or the other SCADA system, but not both concurrently.
Another way in which RTUs can be differentiated is by their ability to be customized in
both physical configuration and in functionality. Smaller RTUs tend to come with a
fixed amount of physical I/O whereas larger RTUs usually allow the user to create a
somewhat flexible I/O mix through the addition of I/O modules. I/O is somewhat
different from industry to industry. The electric power industry uses momentary contact
outputs almost exclusively, whereas latching contacts are more prevalent in the process
industries. Analog outputs are common in the process industries and almost nonexistent
in the electric power world. Electric utilities require millisecond-level, sequence-of-event recording (time tagging) on contact inputs, whereas one-second time tagging is fine
for most process applications. The ability of the user to program an RTU is another
differentiator. PLCs certainly support this ability, but not all RTUs do. There is also the
difference in being able to remotely “download” program logic. Some RTUs can only be
given new application logic by physically attaching a PC to the RTU unit; others allow
program download over the polling channel (as one of the commands supported in the
communications protocol). In the past two decades, it has been common to see RTUs
supporting user-developed application logic in languages that look like BASIC and C.
However, in the last decade, many have adopted the same International Electrotechnical
Commission (IEC 61131) programming tools commonly used by PLCs, which support
relay ladder logic, sequential function charts, block functions, and even structured text.
Communications between the control center(s) and the field sites have traditionally been
based on long-distance, serial, low-bandwidth technology, mostly borrowed from voice
communication applications (i.e., telephone and radio technologies). Data rates of 300
to 9,600 bits per second (to each RTU or shared by multiple RTUs) have been typical.
Consequently, in many industries SCADA system RTUs are event-driven to optimize
bandwidth, only reporting when there has been a change in an input or output. Most
SCADA systems also have the ability to resynchronize the measurement and status
values in the field devices (RTUs) with that of the central system if communications are
lost for a period of time.
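The event-driven (report-by-exception) idea is simple to express in code. The Python sketch below, with assumed tag names and an assumed deadband, reports only the points that have moved meaningfully since the last report.

    def report_by_exception(last_reported, current, deadband=0.5):
        """Return only the points that moved past the deadband since the last report."""
        changes = {}
        for tag, value in current.items():
            previous = last_reported.get(tag)
            if previous is None or abs(value - previous) > deadband:
                changes[tag] = value
                last_reported[tag] = value
        return changes

    state = {}
    print(report_by_exception(state, {"flow": 100.2, "press": 55.1}))  # both report
    print(report_by_exception(state, {"flow": 100.3, "press": 57.0}))  # only "press"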
As with other aspects of SCADA systems, the applicable communication technology
varies by industry and application. Water and sewage operations are usually
concentrated in municipal areas and thus tend to be spread over distances where
telephone service is available or where the distances involved do not exceed the usable
range of conventional radio transmission equipment. Thus, those two technologies have
tended to predominate in that industry segment. Pipelines, railroads, and high-voltage
transmission lines tend to run great distances through unpopulated or lightly populated
areas, so it has been the norm for those industries to have to construct their own
communication infrastructure. This was usually done by applying the same technology
the phone company used, which initially consisted of microwave repeaters. Today, this
technology would be fiber-optic cable. For very remote locations, it has also been
possible to use satellite communications.
In the last decade, networking technology has evolved in both wired and wireless forms
and has begun to be applied to SCADA systems. A SCADA system today might use
leased, frame-relay communications out to its field sites, and have RTUs capable of
TCP/IP communications. A municipality might erect a Worldwide Interoperability for
Microwave Access (WiMAX) wireless broadband network to provide high-speed
networking access within the municipal area. Pipeline and transmission operators
regularly bury or string fiber-optic cable along their rights-of-way, and both utilize the
available bandwidth for their own needs, selling the excess to telecommunication
companies. Radio equipment has become “smarter” and packet-switching mesh radio
networks can be created to provide coverage in topographically challenging areas.
It is also not unusual to see a mixture of communication technologies, particularly as a
means for communications backup (e.g., primary communications might be via leased
telephone circuits, but a VHF radio system exists as a backup in case the telephone
system has a failure). A major factor for many SCADA system operators is whether to
own or rent their telecommunication infrastructure. In remote areas, the decision is easy
(i.e., there is nothing to rent), but in a municipal area, the trade-off may need to be
considered.
The ability of a SCADA central monitoring facility to collect up-to-date measurements
from the field-based RTUs is usually dependent on four factors: the number of
communication channels going out to the field, the data rate of those channels, the
amount of data that needs to be uploaded from each RTU, and the efficiency of the
protocol being used. The electric power transmission industry needs constant data
updates every few seconds in order to run their advanced power network models (e.g.,
state estimation calculations). For that reason, they usually run a dedicated
communications channel to each and every RTU. The water/sewer industry generally
can manage with data updates every 2 to 10 minutes and so they often use a single
radio-polling channel to communicate to all their RTUs. In the middle you have the
pipeline industry, which (for a variety of reasons, such as running leak-detection, batch-tracking, or “line pack” models) usually wants fresh data every 10 to 30 seconds, so they
will have multiple communication channels and may place more than one RTU on each
channel.
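The interplay of these four factors is easy to estimate. The back-of-the-envelope Python calculation below uses hypothetical numbers for a single shared channel:

    rtus = 30                # RTUs sharing one polling channel (assumed)
    bits_per_poll = 2000     # request plus response, with protocol overhead (assumed)
    data_rate = 9600         # channel data rate, bits per second
    turnaround = 0.05        # modem/RTU turnaround time per poll, seconds (assumed)

    scan_time = rtus * (bits_per_poll / data_rate + turnaround)
    print(f"Full scan of all RTUs: {scan_time:.1f} seconds")  # about 7.7 seconds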
The place where the major differences between SCADA systems designed for specific industries can be seen is in the application software that runs in the central monitoring
(host) facility. Today the central SCADA system (also called the host or master)
architecturally appears much like a business/IT system and, in fact, uses much of the
same technology (see Figure 15-3). Over the past two decades, SCADA vendors have
adopted standard, commercially available computing, operating system, and networking
technology (which unfortunately has also made these systems more susceptible to a
cyberattack and to malware).
Today, the computers and workstations of a SCADA system will be based on standard
PC and server technology, and they will be connected with standard Ethernet
networking and TCP/IP communications. The operating systems of the various
computers will be either a Microsoft Windows or a Unix/Linux variation. A central master can be as simple as a single PC running one of the Microsoft Windows-compatible SCADA software packages, or it can fill a large room and incorporate dozens of workstations and servers, plus a lot of peripheral equipment. One of the major
considerations of SCADA system design is system reliability/availability. In many
SCADA applications, it is essential that the central master is operational 100% of the
time, or at least that critical functions have that level of availability. Computers and
networks are prone to failures, as are all electronic devices, and so SCADA vendors
must devise mechanisms to improve the availability of these systems. Two general
approaches used to get high levels of availability are redundancy and replication.
Redundancy schemes use a complete second set (backup) of hardware and software and
employ some (presumably automatic) mechanism that switches from the faulty set to the
backup set. The trick with redundancy is keeping the primary and backup hardware and
software synchronized. This has been one of the greatest technical challenges for
SCADA vendors. In some applications, such as electric power transmission and long-haul product pipelines, redundancy is carried out to the level of building duplicated
control centers, each with a fully redundant system, so that if a natural disaster (or
terrorist attack) were to disable the primary facility, operations could be resumed at the
alternate control center. Replication is also used where redundancy isn’t possible. A
good example would be operator workstations. If you have several workstations and any
one of them can be used for critical operations, then losing one can be tolerated.
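A minimal sketch of the redundancy switchover idea, assuming the primary publishes a heartbeat timestamp that the backup can read, might look like this in Python (the timeout value and the two callables are assumptions):

    import time

    TIMEOUT = 5.0  # seconds of heartbeat silence before failover (assumed value)

    def backup_watchdog(read_last_heartbeat, promote_to_primary):
        """Run on the backup: promote it once the primary's heartbeat goes stale."""
        while True:
            if time.time() - read_last_heartbeat() > TIMEOUT:
                promote_to_primary()  # assumes state was kept synchronized beforehand
                return
            time.sleep(1.0)

The hard part, as noted above, is not the switchover itself but keeping the two systems synchronized so the backup can take over cleanly.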
The application programs running on the SCADA central system will, for obvious
reasons, vary by industry but the “core” functions will be quite similar. Regardless of
industry, all SCADA systems poll RTUs and collect field measurements and status data,
and then they perform alarm checking and alarm annunciation on those values. The
alarming and annunciation capabilities of a system can range from making color/blink
changes on operational displays and setting off audible alarms, to sending automated
email and pager/text messages. All SCADA systems provide a set of operational
displays (including those that are customizable by the user) on which current field data
can be viewed, and on which current and prior alarms can be reviewed. SCADA systems
also provide some level of historical trending of measurements, usually short-term (e.g.,
last 24 hours) and long-term (e.g., last 6 months or longer). They also include some
level of report generation and logging, including user-configurable reports and
diagnostic displays. Finally, they provide display pages through which the operational
personnel can remotely control and adjust the field site plant equipment and observe the
process. SCADA systems are designed for supervisory control, and in most cases they support both remote manual control (also called open-loop control) and remote automatic supervisory (closed-loop) control, in which control output commands are issued by application programs rather than by human operators. As has been mentioned,
many RTUs are also capable of local automatic control, including regulatory control
(e.g., proportional-integral-derivative [PID]) and sequential control.
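At its core, the alarm checking described above is a limit comparison. A minimal Python sketch, with an assumed deadband so the alarm does not chatter when the value hovers near a limit:

    def update_alarm(value, low, high, in_alarm, deadband=1.0):
        """Limit check with a deadband to prevent alarm chatter near a limit."""
        if not in_alarm and (value > high or value < low):
            return True   # raise and annunciate the alarm
        if in_alarm and (low + deadband) < value < (high - deadband):
            return False  # clear only once the value is well inside the limits
        return in_alarm

    alarm = False
    alarm = update_alarm(101.2, 0.0, 100.0, alarm)  # True: high limit exceeded
    alarm = update_alarm(99.5, 0.0, 100.0, alarm)   # still True: within deadband
    alarm = update_alarm(98.0, 0.0, 100.0, alarm)   # False: alarm cleared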
Operational displays are one place where the industry-specific aspects are obvious. A
pipeline control system will have graphics showing the pipeline, pump stations, major
valves, and tank farm facilities. They will show pressures, temperatures, and flow rates.
In a power transmission control room, the displays will show the power grid in a “one-line” (all three phases shown as a single line) diagram format with circuit breakers, transformers, generators, and inter-tie points, displaying voltages, currents, power factor,
and frequency. In a water distribution system, the operational displays will show pipes,
storage tanks, and booster stations, as well as the current pressures, levels, volumes, and
flow rates. Specialized supervisory applications also exist in each of the industries that
use SCADA technology, such as leak detection and “line pack” models for natural gas
pipelines, and batch-tracking models for liquid (“product”) pipelines. “Send out” (daily utilization) forecasting models are used by both gas and water utilities, and state estimation and load forecasting models are used by the electric power industry.
One final point regarding the cybersecurity (or vulnerability) of SCADA systems: up
until very recently, no SCADA vendor had addressed the issues associated with
cybersecurity. In the last few years, efforts have been made to apply IT security
mechanisms and to devise “fixes” to close security gaps. There is still much to be done
to build inherent security into these systems but much development is underway. As
modern SCADA systems employ a great deal of hardware, software, and networking
technology that is also used for information technology (IT) applications, many (but not
all) of the approaches used to secure IT systems can be
applied to at least the “host” portions of most SCADA systems. The same is true for
communications between the host and the field equipment where IP-based networking is
being used for RTU communications. But too many SCADA systems in use today
running critical infrastructure components still have exploitable cybersecurity
vulnerabilities.
Key Concepts of SCADA
The following are some key concepts that specifically relate to SCADA technology, as
compared to conventional in-plant industrial (distributed control system [DCS]-based)
automation:
1. The geographic distribution of the process is over a large area.
2. Low-bandwidth communication channels and specialized serial protocols are
often used (although this is changing as IP networking is extended “to the
field”).
3. Updates on change of state and resumption of lost communications are possible
with adequately intelligent RTU devices.
4. RTU functionality varies by industry segment, although basic remote control and
data acquisition are common to all industry segments.
5. SCADA central system (“host”) technology migrated onto commercial
computing platforms starting in the early 1990s.
6. Software applications used in the central host system (and operational displays)
are the major differentiator between systems deployed into various industrial
sectors.
7. Operational performance requirements vary across industry segments.
8. All SCADA systems incorporate a basic set of data acquisition, display, and
alarming functions and features.
9. SCADA systems can range from very small to huge, depending on the industry
and application.
10. Special approaches and architectures are employed to give SCADA systems high
levels of availability and reliability.
11. SCADA systems can be vulnerable to cyber threats unless appropriate steps are
taken.
Further Information
The following technical references provide additional information about SCADA
systems and technology.
ANSI/ISA-TR99.00.02. Integrating Electronic Security into Manufacturing and Control
System Environments. Research Triangle Park, NC: ISA (International Society of
Automation).
ISA has worked with industry experts and industrial users to devise a set of
recommendations for industrial automation systems. This is primarily focused on
plant automation, but much of the information is relevant to SCADA systems as
well.
Clarke, Gordon, and Deon Reynders. Practical Modern SCADA Protocols: DNP3,
60870.5 and Related Systems. IDC Technologies, 2004.
A useful book regarding some of the details on several standard SCADA/RTU
protocols that have emerged as industry leaders, both in the United States and in
Europe, especially in the electric utility market.
DOE (U.S. Department of Energy). 21 Steps to Improve Cyber Security of SCADA
Networks. Accessed July 2, 2015.
http://www.oe.netl.doe.gov/docs/prepare/21stepsbooklet.pdf.
This short list of steps includes obvious and not-so-obvious actions that can be
taken to improve the overall security of SCADA networks. This is a good
checklist of things that must be done for any SCADA system installation.
NERC (North American Electric Reliability Corporation). Critical Infrastructure
Protection standards (CIP-001 through CIP-009). Accessed July 3, 2015.
http://www.nerc.com/pa/CI/Comp/Pages/default.aspx.
NERC has developed a set of technical, procedural, and operational strategies for
making SCADA systems more secure, from both a cyber and a physical
perspective. Many of the proposed strategies are good system administration
policies and procedures. Much of this material is based on IEC IT standards and
best practices.
NIST (National Institute of Standards and Technology). 800-series standards on Wireless
Network Security (800-48), Firewalls (800-41), Security Patches (800-40), and
Intrusion Detection Technology (800-31). www.nist.gov.
In general, NIST is taking a broad leadership role in exploring both technologies
and techniques for improving the cybersecurity of automation systems and IT
security as well. NIST provides a good range of online resources that provide a
strong basis in many technical areas.
Shaw, William T. Cybersecurity for SCADA Systems. Tulsa, OK: PennWell Corporation,
2006.
A general-purpose text that gives a good, broad, easy-to-understand overview of
SCADA system technology from RTUs and telecommunication options, all the
way to technologies and techniques for making SCADA systems cyber secure. It
includes a good discussion of cyber threats and the vulnerability assessment
process.
About the Author
William T. (Tim) Shaw, PhD, CISSP, C|EH, CPT, is the former (now retired) senior
security architect for Industrial Automation at a privately held professional services firm
that provides technical and management services to government agencies and
commercial customers. Shaw has more than 40 years of experience in industrial
automation, including process/plant automation, electrical substation automation, factory automation, electrical/water/pipeline SCADA, and equipment condition
monitoring.
He has held a diverse range of positions throughout his career, including technical
contributor, technical manager, general manager, CTO, and division president/CEO.
Shaw is an expert in control system cybersecurity, particularly on NERC CIP standards,
ISA-99 standards, NRC RG 5.71, and NIST 800-53 standards, and he is highly
knowledgeable in the areas of U.S. nuclear power plant cybersecurity and physical
security.
Shaw wrote Computer Control of Batch Processes and Cybersecurity for SCADA
Systems; is a contributing author to two other books; is the co-author of Industrial Data
Communications, fifth edition, and has written magazine articles, columns, and
technical papers on a variety of topics. He is also a contributing editor on security
matters to Electric Energy T&D magazine.
V
Process Control
Programming Languages
Programming languages, like any other language, use a defined set of instructions to put “words” together to create sentences or paragraphs that tell, in this case, the control equipment what to do. A variety of standardized programming
languages, based on the programming languages in International Electrotechnical
Commission (IEC) programming language standard IEC 61131-3, are recommended to
ensure consistent implementation and understanding of the coding used to control the
process to which the equipment is connected.
Process Modeling and Simulation
Process models can be broadly categorized as steady state and dynamic. A steady-state
or dynamic model can be experimental or first principle. Steady-state models are largely
used for process and equipment design and real-time optimization (RTO) of continuous
processes, while dynamic models are used for system acceptance testing (SAT), operator
training systems (OTS), and process control improvement (PCI).
Advanced Process Control
Advanced process control uses process knowledge to develop process models to make
the control system more intelligent. The resulting quantitative process models can
provide inferred controlled variables. This chapter builds on regulatory proportional-integral-derivative (PID) control to discuss both quantitative and qualitative models
(such as fuzzy logic) to provide better tuning settings, set points, and algorithms for
feedback and feedforward control.
16
Control System Programming Languages
By Jeremy Pollard
Introduction
Today’s autonomous systems have some form of embedded intelligence that allows
them to act and react, communicate and publish information in real time, and, in some
cases, self-heal.
Driving this intelligence are instruction sets that have been employed in a specific
programming language that a developer can use to create control programs.
Remember that every language has been created to serve a core competency. English as a spoken language is different from French, for instance. However, the meaning of the spoken word is similar, if not the same.
Programming languages are much the same. They allow a developer to specify what the
final functions will do, and the developer must then figure out how to put the “words”
together to create a sentence or a paragraph that says and does what is needed. Also,
choosing the correct hardware is key, as it can sometimes limit a developer’s available programming language options; however, a developer still needs to be able to compose
that sentence or paragraph with the tools available. If a developer has a choice in which
language is used, they must choose the right language for the task. That choice will
affect:
• Future enhancements
• Maintenance situations
• General understanding of the task that it is performing
• Troubleshooting the system(s)
• Maintainability
When developing control system programs, a developer must be aware of the audience
and the support available for the systems. A control program should not, for example, be
written in French when the team speaks only English. The task may be defined in
English, however, the programming must be done in languages that the systems
understand. That is what we as developers do.
Scope
This is not a programming course or a lesson on instruction sets as such. The intent is to
introduce the various languages that most hardware vendors currently employ, of which
there are many. Most vendors, however, use certain language standards because of their
audience. Introducing these languages is the goal, along with the variants and extensions
that some vendors have included in their control system offering(s).
The control system language is the translator that tells a system what to do. We will
delve into the various options available for that translation to happen.
Legacy
The systems we use today may have been in operation for 10 years or more, and the
systems we commission now will be in operation for more than 10 years. Because of this longevity, new technology and advancements assimilate into the industrial control space slowly. The International Electrotechnical Commission (IEC)
programming language standard, IEC 61131-3, for instance, was first published in 1993
and only now is beginning to gain some traction in the automation workplace.
Embedded controls using IEC 61131-3 are becoming more prevalent for control
systems. To quote a cliché, “The only thing that is constant is change.” We need to be
prepared for it at all times.
What Is a Control System?
This is a loaded question. We can make the argument that a control system is any system
that controls. Profound, I know!
Think of a drink machine. The process of retrieving a can of soda is this:
• The user deposits money (or makes a payment of some sort) into the machine.
The system must determine if the right amount has been tendered.
• The machine gives the user a sign that it is OK to choose the type of soda.
It also must indicate if the chosen soda is out of stock.
• The machine delivers the can.
This process can happen in many ways. In the good old days, it was purely mechanical.
But that mechanical system was still a control system. Today, most drink machines are
electronic. They can accept various payment forms, such as tap-and-go credit cards and cell phones, as well as cold, hard cash—coins or bills!
One of the first control systems this author ever tinkered with was a washing machine
timer. It had no language as such, yet it was still a fundamental control system. It provided control for the task at hand, which was to wash clothes. These mechanical systems have
given way to electronic microprocessor-based systems that are Internet-enabled and part
of the Internet of Things revolution. But, they still wash clothes.
Look around right now. What do you see? What can be defined as a control system?
• TV remote
• Microwave
• TV
• Thermostat
• Dishwasher
• Stove
That is what defines a control system—it makes stuff happen.
Hardware Platforms
Just as we see various items around us that can be defined as control systems, in industry various devices form the fundamental basis of a control system.
These hardware platforms have been designated with certain definitions that let us
identify the type of control system each one is. Take a robot, for instance. If you are familiar
with robot control, you know that various devices make up the system, such as:
• End effectors
• Servo drives and motors
• Encoders
• Safety mechanisms
• Communications
• Axis control
Certain control devices and characteristics will be different from system to system, but
the fundamentals remain the same. Today, we have a multitude of hardware platforms
that perform functions that constitute a control system:
• Single-board computers
• Stand-alone proportional-integral-derivative (PID) controllers
• Robot controllers
• Motion controllers
• Programmable logic controllers (PLCs)
• Programmable automation controllers (PACs)
• Distributed control systems (DCSs)
• PC-based control
• Embedded controllers (definite purpose)
• Safety PLCs
• Vision systems
Each of the above systems can perform various tasks. For instance, a PLC can replace a
motion controller, as well as perform some control functions that a DCS would do.
Just as you wouldn’t put your clothes in a dishwasher, you wouldn’t choose a hardware
platform that is not suited for the task at hand.
The common thread with all these devices, however, is software.
Software Platforms
In today’s world, all hardware is driven by software. It is important to note that software
comes in many forms. Depending on the hardware you are using, there is typically an
operating system that runs the core functions of the system.
Microsoft Windows is a well-known operating system for personal computers. It is also
available in various forms, such as:
• Windows Mobile
• Windows CE
• Windows Embedded
An operating system controls the system and its connected devices, such as DVD drives,
keyboards, and the like.
An industrial control system also uses an operating system that is typically created by
the vendor to support its hardware platform and connectable devices. These devices,
such as drive controllers, basic input/output (I/O) modules, specialty modules, and
display screens, all need components in the operating system for them to run properly.
Functions (e.g., communications and the ability to send emails) are also part of the
operating system. This operating system would have support for the available
programming languages. In most systems, there is a development software environment
or programming/configuration software that a vendor created to support their hardware
and available functionality.
It is in this integrated development environment (IDE) that we discover what the
hardware can do, how to configure it, and how to monitor the end result. That end result
is called the application. It is this application that is created by someone who employs
some form of programming language and instructions to allow the system to perform as
advertised. The system or process that is being controlled will determine which
programming language should be used.
What Does a Control System Control?
The intrinsic responsibility of any control system is to control a target. This target could
be a palletizer, welding cell, paint booth, coking oven, or a tube bender. These targets
typically can be divided into two very different categories:
1. Discrete
Machine control
Packaging
Parts assembly
2. Process
Food/beverage
Pharma
Refinery or petrochemical complex
Discrete
A discrete application typically uses a sequence to function. Think of it as a list of things
to do that is presented in a specific order.
These control systems typically use signal devices, such as limit switches, photocells,
and motor control. The type of control could be defined as on/off control. While simplistic, these applications can also extend to other control functions, such as:
• Speed control
• Motion
• Process sequence
A characteristic of a typical discrete application is repeatability, where the same things
are done over and over. The language chosen for this type of application needs to
support the instructions required to allow the target to function as the machine designer
and the end product demands.
Typical instruction requirements would include:
• Normally open/closed (examine if on/off)
• Timing
• Counting
• Output control (solenoids, motors, actuators)
• Data handling
• Operator input
Typical target applications would include:
• Metal forming
• Conveyor systems
• Material handling
• Product cartoner
• Palletizer
Process
Process systems have a different set of requirements than a discrete application. These
systems employ signal devices as well, but also employ such functions as:
• Temperature
• Pressure
• Flow
• Weight
• Speed
The variability of control requirements defines a process-type application. While
repeatability is important in the end product, a process system typically must be able to
adapt to changing situations in order to maintain the “recipe” for the end product.
Imagine a process that makes beer. The system must weigh components, heat water,
measure alkalinity and pH, and combine the ingredients according to a recipe. The
language chosen for this system requires a totally different instruction set to perform
these tasks than the languages used in the discrete manufacturing industries.
Why Do We Need a Control Program?
Good question. The simple answer is that we need to control the target application according to its intended function, regardless of whether it is a discrete or process application.
The control program’s function is to tell a story using the instruction set available in an
organized and functional manner. The instructions chosen must perform the desired
function at the desired time and place in the application.
Most industrial control program languages are symbolic or graphical in nature. Pictures,
if you will, represent an action and tell the developer and/or user what the function of
any given instruction is. Languages that use English typically will use mnemonics,
which are groups of letters that represent an action.
Developing the end-result control program involves many points of responsibility:
• System designer/engineer
The person or team that defines what the end result is to be, such as “I need
to have a machine that does this.”
Functional requirements come from here:
• Recipes
• Timing diagrams
• Sequence of operations
End user requirements
• Application designer/engineer
Applies the functional requirements to implement the design
Determines what control system to use (PLC, DCS, PC-based control, etc.)
based on the target function (discrete/process)
Selects the signal and process devices, such as proximity switches and
thermocouples
Creates the target system overview, such as device placement, distances,
tolerances, etc.
• Application programmer/designer
Determines which language to use for every component of the control
program
Implements the functional requirements using that language and its
instruction set
Documents the control program for others to use
In each of these steps, the proper language and instruction set must be selected in order
to be successful.
It can be said that there are many ways to get to New York, for example, and application
programming is no different. In most cases, the control program for a given target
system would be different if given to 10 different programmer/designers. Program
structure, instruction usage, and fundamental control flow would vary depending on the
person’s education and personal experiences, as well as the cost constraints and
development time allowed. The end result is affected by all those things and more, but if
good engineering is done at the front end, the resulting success will show up.
Logic Solving
When designing a control program, it is important for the developer to understand the
logic solving approach. This will affect how the program is developed. It is important to
note that in any control system, the software program is typically solved one instruction or state at a time. Some hardware platforms employ external co-processors for multi-threaded program execution, but for the vast majority of control applications, the logic solving is linear.
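The linear solve is commonly organized as a scan cycle. A minimal Python sketch of the idea follows; the three callables stand in for the platform's own input, logic, and output stages, which are assumptions for illustration.

    def scan_cycle(read_inputs, solve_logic, write_outputs):
        """One pass of the classic scan: snapshot the inputs, solve the logic one
        instruction at a time against that snapshot, then write all outputs."""
        input_image = read_inputs()              # input image table (a snapshot)
        output_image = solve_logic(input_image)  # solved linearly, top to bottom
        write_outputs(output_image)              # outputs updated at scan end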
Data Types
In any control program, there is a need to deal with data. This data can be external (e.g.,
from a thermocouple) or can be an internally generated result of a math computation or
data movement.
Most control hardware platforms will perform a conversion of data between data types.
For instance, if you divide an integer by an integer, you will often end up with a
decimal place. The resulting data would be considered a floating-point or an Institute of
Electrical and Electronics Engineers (IEEE) number. The receiving data location or
variable should be data typed as a floating-point number; however, if it is an integer data
type then the number would normally get rounded. Some control platforms do not
perform this function and will throw an error when the program attempts to execute the
instruction.
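Python happens to behave like the auto-converting platforms described above, which makes it a convenient way to illustrate the point:

    quotient = 7 / 2     # two integers, but the result carries a decimal place: 3.5
    rounded = int(round(quotient))   # a platform that auto-converts might store 4
    truncated = int(quotient)        # a platform that truncates would store 3
    print(quotient, rounded, truncated)   # 3.5 4 3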
Examples of data types that all control programs use are:
• Integer: signed and unsigned (-5432, 1234)
• Boolean: true/false
• Real: +/– 1234.567
• Double integer: 256,779
• Time and date
• String: “Hello World”
Compiled versus Interpretive
The developer must understand the way that the native control program is stored and
executed on the chosen hardware platform. There are a few reasons why this can be
important. With language choices, the ability to change the program at run time may be
important to the application. Certain smaller controllers run compiled programs and do
not allow for any program modifications on the fly. This typically results in a smaller
instruction set for the controller, so some of the instructions we think should be there may not be. This allows the program to run fairly quickly even when the internal hardware’s processing power and speed are less than optimal.
The program execution speed is the main issue between compiled and interpretive. A
compiled program will run much faster than an interpretive program.
The instructions and methods chosen to implement the target application can affect the
success of the project. When selecting the language used to develop the control
program, processing execution speed and how it relates to the target application must be
considered.
Application Program Scan Sequence Considerations
Back in the good old days, when I was with Rockwell Automation, one of the best
arguments I ever had was with an engineer who was a big Modicon fan and user at a
steel plant. Modicon used segmented program networks that were logically solved
(executed) in order from top to bottom, left to right. This created issues for many rookie
programmers because it didn’t follow “standard” relay ladder electrical flow.
And that is when the fight started, because it allowed for different arguments about
program constructs!
A scan sequence of a control program is not real time. If you must wait for an additional
program scan to make a programmatic decision, it can create ripples in the application
depending on the application’s required response time.
The sequence of execution may determine which language is chosen and which
instructions in that language are chosen based on execution speed of the scan. When
asked about the differences, Dick Morley, the “father” of the programmable logic
controller (PLC), commented, “It’s all real time enough!”
Standards
One of the common complaints is the lack of a programming standard—a certain
collection of instructions that should be used in a specific way to create, for example, a standard PID loop.
The IEC 61131-3 programming language standard is gaining traction after being in the
marketplace for more than 20 years. PLCopen, an independent organization, is
expanding the standard to create some common ground to address some of the
application programming issues by creating a library of various application-specific
software instructions for motion and safety. Part of the goal is to allow companies to
create application programming standards for their projects and for their customers.
Part of the standards movement was started and propagated by the Open Modular Architecture Controls (OMAC) group, which was formed by the “Big Three” automotive companies to generate some standards in language and constructs.
One of the offshoots of this movement is a programming instruction set called PackML,
which is a state machine language specifically designed for packaging machinery. As a
programming language, it is still used in multi-vendor packaging applications.
Instruction Sets
Any programming language is composed of things to do, or commands and/or instructions. Various enterprises have developed these collections to support some level
of functionality that is important to them.
Consider a PC. There are various devices that can be part of the “system” using the
internal bus system, or external ports, such as a universal serial bus (USB). These
devices have vendor-provided drivers that allow the operating system to provide an
application program interface (API) to any application developer who wants to write a
program to do something.
If a developer wants to display a list of files in a directory, the API instruction could be
“ReadDirectoryListFiles,” which would do exactly what the developer needed. If that
function wasn’t available, the developer could use two separate functions to create the
desired result.
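To make the compose-two-functions case concrete, here is a small Python sketch: the standard library has no single call that lists only the files in a directory, so two simpler calls are combined to create the desired result.

    import os

    def list_directory_files(path):
        """Enumerate the directory, then keep only the entries that are files."""
        return [name for name in os.listdir(path)
                if os.path.isfile(os.path.join(path, name))]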
In looking at any industrial application, that need could be defined as a statement, such
as “I need to divert that part at this time to that bin.” So, the programmer, using the
selected programming language, would access the available instructions (the control
system’s API) and create the instruction that would do just that.
This is where the languages have diverged in recent years. The genesis of industrial
control programming as such started as Boolean logic using hardware. Boolean logic is
the basis of microprocessors as well.
The maturation of software has allowed for such a rise in the complexity of available
instructions that these instructions are actually programs within themselves. All
instructions, however, require information to operate. Remember that all industrial
processes interface with their control systems using devices and systems, like servo
drives or closed-loop controllers. The information an instruction needs defines which
device we want to control, or monitor, for state change. The next step is what to do with
the detected change when it happens. That step is a program written in a language that
performs the desired function in the connected system.
Object Orientation
Each instruction is an object. All IDEs treat each instruction as an object, and these
objects are programmed in the language of choice based on the rules (syntax) given by
the vendor and the language.
Consider a timing function (see the sketch after these lists). Logically, the function needs to know:
• When to start
• Timing precision (time base) required
• How long to time for (preset)
The instruction then needs to report:
• When it is enabled
• When it is finished
• Where it is in the timing process
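A minimal Python sketch of such a timer object, with its inputs and reported states named after the two lists above (the class itself is hypothetical, not any vendor's implementation):

    import time

    class OnDelayTimer:
        """Knows when to start and how long to time; reports enabled/done/elapsed."""

        def __init__(self, preset):
            self.preset = preset        # how long to time for, in seconds
            self._started = None

        def start(self):                # when to start
            if self._started is None:
                self._started = time.monotonic()

        def reset(self):
            self._started = None

        @property
        def enabled(self):              # reports that it is timing
            return self._started is not None

        @property
        def elapsed(self):              # reports where it is in the timing process
            if self._started is None:
                return 0.0
            return min(time.monotonic() - self._started, self.preset)

        @property
        def done(self):                 # reports that it is finished
            return self.enabled and self.elapsed >= self.preset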
All components of any programming language must be considered in this manner.
Addressing
When an instruction is selected and applied, one of the main components of the
instruction is the who, such as “turn on a light.” Well, which light?
Examples of addressing used to identify devices (regardless of the language chosen, they mean the same thing!) are:
• A limit switch
X112
I:1.0/12
120223
PartPresent
• A solenoid valve
O112
O:1.0/12
020223
DivertValve
• Analog devices
N10:2
30012
TankTemp
The advent of modern language orientations has moved control system programming
from absolute identifications, such as a specific physical address, to a more computer-based approach of using variable names. While this may not help in choosing which
language to employ to solve the problem at hand, it must be stated that a control system
cannot control the system without having the information supplied by the target
application. This information for the control system comes from the addressing method
that connects information to the device(s).
This is the link between physical and logical, and all control programming systems need
this link to perform the needed functions. Regardless of the language chosen, this link
will always be required.
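A simple way to picture this physical-to-logical link is as a mapping table. The Python sketch below reuses the hypothetical addresses from the list above:

    # Mapping table linking logical tag names to physical addresses (hypothetical).
    io_map = {
        "PartPresent": "I:1.0/12",  # limit switch input
        "DivertValve": "O:1.0/12",  # solenoid valve output
        "TankTemp":    "N10:2",     # analog temperature value
    }

    def resolve(tag_name):
        """The program refers to tags; the map supplies the physical address."""
        return io_map[tag_name]

    print(resolve("DivertValve"))   # -> O:1.0/12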
The Languages
One of the major concerns in the automation industry is the learning curve associated
with various vendors’ programming software environments. While IEC 61131-3
addresses some of the concerns regarding the standardization of data types, it has no
boundaries set for instruction sets, presentation, execution, or documentation.
An IDE based on the IEC 61131-3 standard (see Figure 16-1) provided by vendor A
may not be the same as, or even similar to, an IDE provided by vendor B. However, the
concepts may be similar. There are six major languages that are employed in industry.
The first five are supported and defined to some degree by the IEC 61131-3
programming language standard.
• Ladder Logic or Diagram
• Function Block
• Sequential Function Chart
• Structured Text
• Instruction List
• Flow Chart
There are language variants that have surfaced as well, such as the packaging industry
standard, PackML, which many hardware vendors have implemented in different forms.
It has a common look and feel to it, but the details from vendor to vendor tend to be
different. However, it can still be considered a programming language.
You will run into many other variants of instruction sets, as previously mentioned, but
we will focus on the most common languages.
IEC 61131-3
Prior to 1993, there was a meeting of the minds in the industry to come up with a programming standard that multiple vendors could adopt, allowing for a common platform for the creation of control programs.
While the standard is outside of the scope of this chapter, it is important to note that the
standard includes the aforementioned five languages. The standard defines some
functions and instructions, as well as data typing and execution models.
The intent of the standard is to provide a common programming platform that enables
users to utilize multi-vendor hardware platforms in their control system designs.
One of the specifications of the standard is that any vendor is “allowed” to extend their
implementation of the standard to include various additional functions and still be able
to claim compliance. There are many examples of these extensions already in the
marketplace.
In an IEC control system, you can use any supported language as a stand-alone routine, and you can also use specific “tasks” within another control strategy, such as a Ladder Diagram (see Figure 16-2). This level of flexibility can allow
the development of a control system that uses the right language for the task at hand.
One of the most drastic changes with an IEC 61131-3 language is that all programming
systems use a tag naming convention for linking data to instructions. It is important to
develop this convention for sustainability with plant and target applications.
PLCopen supports the standard by providing extensions and compliance testing. It has created some standard program segments and routines that anyone can use to create and maintain standardized programs and routines.
Parts of the standard include:
• Configuration – The complete system (PLC/PAC)
• Resource – Akin to a central processing unit (CPU) or processing unit (multiple
resources can exist to emulate multi-tasking)
• Task – A control program consisting of groups of instructions in any language
• Program – Where the individual instructions are present in a program
organization unit (POU)
Ladder Diagram (LD)
Back in the good old days, control systems were created by using large relay systems or hardwired logic boards to create the required control narrative for the target application. The systems were parallel in that the power rail (typically 120 VAC) was active for all connections. This created a control approach in which power flow was king. In any relay or logic system, it was all about controlling power flow. However, it really wasn’t controllable because the power rail was always active. The relay systems were solved left to right and top to bottom based on this power flow. The drawings and systems were fondly known as relay ladder diagrams (see Figure 16-3).
When automated and software-based control systems came into being, this relay
approach was duplicated in software as a language called Ladder Diagram (LD). A
large percentage of control programs are developed in LD. Most legacy systems are also
programmed in LD.
Power flow is replaced with logic flow.
An LD program is created by adding LD rungs in groups to form a routine or program. There can be many programs within one control system, and they are commonly referred to as subroutines.
A recent enhancement to the LD realm is the advent of user-defined instructions. This
means that you can “import” an instruction that is written in a high-level language, such
as C++, and insert it into the LD and have it execute its “logic” under program control.
Most vendor-based LD solutions are associated with PLC/PAC offerings. Legacy
systems use physical addressing to associate the real world to the logical world. In the
IEC 61131-3 LD model, all program associations are done using variable names. The
connection to real-world I/Os is done using a mapping table.
One drawback of LD is the potential lack of structure in a developed program, which
has been called spaghetti code. This refers to the instances where rungs of LD are added
to the program in places where they don’t belong. This makes it difficult to add
organized code and to understand and interpret the program for troubleshooting
purposes.
LD is a part of the IEC 61131-3 standard.
Instruction Set
The power of LD comes from its instruction set. The vendor-supplied instructions will
typically suffice for most applications. However, these instruction sets vary depending
on the hardware platform. They are not all created equal.
Some vendors have developed their own instruction set, and some have employed the
IEC 61131-3 basic instruction set. Hardware platforms typically have high-level
instructions to interface with their systems and specialized hardware, such as serial port
data instructions.
Communication instructions can be present to enable the system to, for example, send emails when a certain combination of data points and logic conditions occurs.
A basic instruction set includes:
• Relay on/off
• Timer/counter
• Data manipulation and transfer
• Math
• Program control/subroutines
Advanced instructions could include:
• Communication
• Analog/process
• Motion/drives
• Device networks
• User-defined
Applications
Ladder Diagram has risen in popularity over the years due to its simple nature and symbolic representation of the target application. In the beginning, it replaced relays, so whatever relays could do, LD could do. The target applications for LD have grown to include almost any application, regardless of function and type (process versus discrete). It is widely used in all control systems.
Troubleshooting
One of the intrinsic benefits of LD is its familiarity to the electricians and control
engineers who must deal with the system once it has been implemented.
Troubleshooting LD requires that the user understand the instructions used and how
they function.
The LD program defines how the target application works, as is true for any control program. And because the instruction set used is typically fundamental, it is easy to interpret the program to understand what the target application is not doing, in order to get things working again.
It is easy to use a printed copy of the program and the connected I/O modules to troubleshoot a system. However, if data points are used, such as temperature or any of the advanced instructions, troubleshooting from a printout becomes difficult. At that point, the actual logic must be monitored.
Structured Text (ST)
The ST language is a computer-like textual approach (see Figure 16-4) to creating a control program, used either to interface to an existing program of another language or as a self-contained control strategy. While ST is part of the IEC 61131-3 standard, there have been implementations of ST in legacy systems that are similar to the IEC implementation.
ST has been compared to Pascal, which is a computer-based language that has been used
for many years. It is termed a procedural language, which is intended to encourage
good programming constructs and data typing. Some may compare it to the BASIC
programming language as well, although the syntax of ST can be very different.
Using ST in any control application has some advantages. Because it is procedural, the
flow of execution is top to bottom and one instruction at a time. There are program
control instructions, such as a JUMP command or a GOTO command.
In certain implementations of ST, the developer can create program segments that could
be created in LD, but the ST language is used instead. The language environment is
typically object-oriented so that the instruction entry mode prompts you for the needed
variables and data types.
In IEC 61131-3, this language is widely considered to be the only language that can be
used to allow for the sharing and usage of common code. That is to say that a program
or procedure written in ST that conforms to the IEC standard can be used in any control
system that supports the IEC standard.
Be aware that any vendor-specific ST statements or operators would not be supported by
any other vendor.
Instruction Set
The ST language is made up of:
• Expressions
• Operators
• Assignments
• Iteration statements
• Selection statements
Any vendor could expand their ST offering to include hardware-specific operators and
expressions.
ST also introduces syntax. In the world of LD and Function Block Diagram (FBD), the creation of the control program is typically mouse-driven, drag-and-drop work. ST requires the developer to write the program with the keyboard; drag and drop is not common. While ST is easy to implement, adherence to the syntax of instruction entry is required.
Applications
ST excels not in writing programs for full control, but in writing small segments of
control and/or data handling where the ST instructions can better reflect the
requirements of the task at hand.
The IF-THEN-ELSE structures of many real-world procedures lend themselves very well to ST; the language excels at decision-making, as the sketch below shows.
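A minimal sketch, written in Python rather than ST syntax (the tag names and duty-balancing rule are invented for illustration), of the kind of small decision segment that ST handles well:

```python
# Hypothetical tags; in an ST routine these would be declared variables
# linked to the control system's tag database.
def select_discharge_pump(tank_level_pct, pump1_hours, pump2_hours):
    """IF-THEN-ELSE support segment: pick a pump, balancing run hours."""
    if tank_level_pct < 10.0:
        return None                      # too low to pump; do nothing
    elif pump1_hours <= pump2_hours:
        return "PUMP1"                   # favor the lesser-used pump
    else:
        return "PUMP2"

print(select_discharge_pump(55.0, 1200.0, 1450.0))   # -> PUMP1
```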
Troubleshooting
ST routines are rarely used for actual system control, so troubleshooting them is not as important as it is in other languages. Because an ST program typically plays a support role, troubleshooting an ST routine is rarely a high concern.
Instruction List (IL)
IL (see Figure 16-5) is by far the most underutilized language in the IEC standard. European legacy systems have used IL as part of their control language palette, and it was included in the standard because of this legacy.
IL can be compared to assembly language (Assembler), which was used to program computers and microprocessor systems when those systems first hit the street many years ago. Higher-level languages, such as C, C++, and Visual Basic, have since replaced Assembler, bringing richer interfaces, drag and drop, and object orientation.
The original use of IL was to create fast executing programs for high-speed applications.
Creating these programs could only be described as a nightmare! It is textual in nature
and similar to ST; however, each command is on a single line and operates on a single
data point.
The only intrinsic benefit to creating a program in IL is speed; however, this speed could
be lost based on the hardware that is used. Where the benefit shows up is in extensive
and complex mathematical operations where the IL program would not bog down the
hardware, because the language is so efficient.
Instruction Set
The basic instruction set for IL includes such commands as:
• LD – Load
• ST – Store
• Boolean operators (AND, OR, etc.)
• Basic four-function math
• Basic compare functions
Applications
In the author’s opinion, the applications for this language in an IEC 61131-3 environment are limited. Applying this language makes sense when a complex problem must be solved quickly, or when a routine must have a negligible effect on the processing scan time of the control hardware the control program is running on.
The instruction set is primitive, so certain complex mathematical equations cannot be
solved due to the lack of resources. However, should the vendor enhance this language,
then additional applications could join the party.
Function Block Diagram
In LD, some of the basic instructions, such as a timer, can be considered a function
block instruction. By definition, it is a function represented by a logic block in a
program. It is the function that is important to understand, as well as its execution.
An FBD instruction (see Figure 16-6) is a graphical representation of that function. For instance, an OnDelay timer can be considered an FBD because it does more than one thing and has underlying code associated with it. Anything that does more than turn on/off or move one data point to another data point can be called a function block.
DCSs have almost exclusively used FBD in their control systems; thus, FBD has been around for a long time and is included in the IEC 61131-3 programming language specification.
An FBD program is developed and created differently than a standard LD program. Most platforms that support the FBD programming language follow the IEC 61131-3 standard. As a language, the building blocks typically consist of a set of functions to support creating the application.
An FBD is created by choosing functions and tying them together like Lego® blocks. All blocks have data point entrance and exit points and are placed on a canvas in a free-form fashion. The blocks are connected block to block with logical “wires” and are executed based on the variable flow.
Think of each function block as having a given set of responsibilities. An F-to-C
temperature conversion block would take a temperature in Fahrenheit and convert it to
Celsius. Imagine that underneath the block there are instructions in another language,
such as C++, that perform the conversion. The result of the conversion is held in a
public tag database for access by other parts of the control program.
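As a minimal sketch (in Python, not any vendor’s block implementation; the tag name is invented), the idea looks something like this: a block whose hidden code performs the conversion and publishes the result to a shared tag database for other parts of the program:

```python
tag_database = {}  # public tags shared across the control program

class FahrenheitToCelsius:
    """Hypothetical F-to-C block: the 'underlying code' that a
    graphical FBD block would hide from the user."""

    def __init__(self, output_tag):
        self.output_tag = output_tag

    def execute(self, temp_f):
        temp_c = (temp_f - 32.0) * 5.0 / 9.0
        tag_database[self.output_tag] = temp_c   # publish for other POUs
        return temp_c

FahrenheitToCelsius("TT101_degC").execute(212.0)  # tag becomes 100.0
```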
A function block can have many discrete I/O points associated with it. These I/O points do not have to be physical I/O and, in fact, can be any Boolean data point. Variable data can also be used as data inputs, and the block can create data outputs. This data can be used elsewhere in the FBD, or it can be accessed in an LD program, as all data can be shared; however, the function block itself must be a function block POU.
Typically, an FBD program executes on its own at a periodic time base or scan time, which is individual to each FBD canvas.
The complexity of a target application can create complex control programs and algorithms. At times, trying to create these algorithms in LD can be tedious, and the diagrams can be difficult to understand. Moving the application over to an FBD POU can create a format that is easier to read and understand.
This is where a user-defined function block (UDFB) comes into play. While a vendor
can supply a myriad of function blocks, such as a PID loop or motion-control move
profiles, a user can create their own function block to perform various functions.
A UDFB is a block that the user defines and creates. It can be designed to perform a
machine or process function that is unique to the user or to the original equipment
manufacturer (OEM). The block definition starts with the connection points, variables,
and then the function. The language that the block is created in can be C++, LD,
Structured Text, or Instruction List.
Once the logical function is created, it can be locked so that it is protected. It should be
noted that a function block, once tested, verified, and locked, can then run or execute a
proven routine. This means that the resulting output data points are not to be questioned
because the underlying program has been verified.
FBD is a good complement to LD in some control program strategies because a
complex strategy can be easily programmed and presented to the user and the process in
a very efficient way.
Instruction Set
Some of LD’s shortcomings are addressed by the basic and intrinsic function block instruction set of most programming systems. An LD instruction has only a single input connect point and a single output connect point for logic flow. A normal function block can have many data connection points.
Most hardware vendors that support the IEC 61131-3 specification will support FBD.
The library of function block instructions will vary, but most vendors will include
standard groups such as:
• Drives and motion
• Process
• Advanced math and trigonometry
• Statistics
• Alarming functions
• UDFBs
It is important to note that not all vendors support the importation of a UDFB.
An FBD can also support certain LD-type instructions so that an FBD canvas can in fact
be the main executable program that controls the target application.
Any function block will have data points that the developer sets. These data points determine the block’s behavior. Another program running in the same hardware space has access to read and write from/to these data points, so there is no need for any transfer-type instructions to send and receive data from any other element in the target control system.
Applications
FBDs are especially useful in OEM applications. While these can range from discrete to process, the advantage to the OEM is that they can protect any intellectual property they may want to hide.
Having said that, an FBD program is typically developed to support an LD program. It is highly probable that any application will need the sequencing and control that LD can deliver, with FBD as the supporting cast.
With the above instruction set, it is easy to see that a function block environment is akin
to having a computer-based language at your disposal and presented graphically.
This level of functionality lends itself well to most process-based applications, as well
as mixed, discrete/process-based applications.
A pure discrete application would not lend itself well to an FBD.
Troubleshooting
This is where an FBD can shine. Because a function block abstracts functions, we as
troubleshooters do not have to understand the “how” of creating the block; we simply
concern ourselves with the data the block is using and the resulting outputs.
Because a function block’s underlying code is not available for modification, we know
that the operation of the block is not in question. So, the focus is strictly the data in and
the data out.
If a bake oven is not hot enough, the temperature control loop program, which could be
on an FBD canvas, can be viewed as it operates. Maybe the block is showing that the
gas valve is open 71%, which should correspond to a specific gas flow. The monitored
gas flow, however, is below specification. The options then would be that the valve
feedback is out of calibration or the temperature detection device is reading incorrectly.
While a simplistic view, it gives an idea of the focus level available in troubleshooting an FBD-controlled system.
Sequential Function Chart (SFC)
SFC (see Figure 16-7) has been around since the early 1980s. SFC has been called the
non-language because it is really a graphical representation of a repeatable sequential
process that has language components underneath it. SFCs were first referred to as
Grafcet.
As the name implies, SFC exists for sequential machine control or sequential processes (batch-type processes). The language allows a developer to partition a control program into steps and transitions.
Imagine any sequential operation and start at the beginning of the operation. Let’s use a
washing machine as an example.
SFC varies from an IF-THEN-ELSE paradigm in that it is more of a DO-THIS-THEN
paradigm. Steps define actions, and transitions define how the system progresses to the
next step for the next action.
It might go something like this:
• Load clothes when done
• Add soap when done
• Select cycle when done
• Add water when done
• Wash when done
• Rinse when done
• Finished
Simple, but effective!
The logic behind a step is in the step expression, which could be as simple as a time
delay or a complex expression using operators and available tags.
The execution of the SFC is straightforward—execute the logic in the step until the
logic in the transition is “true.” When the transition is true, then the step is finalized, and
the next step begins execution.
The transition is the “when done” answer. According to the IEC specification, the
transition can have various components to determine the Boolean output, such as an LD
or ST function, an FBD, or a simple embedded expression.
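A minimal sketch (in Python, not IEC-conformant code; step names, actions, and conditions are illustrative) of this execution rule, running each active step until its transition is true and then beginning the next step:

```python
def add_water(state):
    state["level"] += 10            # step action: fill

def wash(state):
    state["wash_time"] += 1         # step action: agitate

chart = [
    # (step action, transition: the "when done" test)
    (add_water, lambda s: s["level"] >= 40),
    (wash,      lambda s: s["wash_time"] >= 3),
]

state, active = {"level": 0, "wash_time": 0}, 0
while active < len(chart):          # simplistic scan loop
    action, transition = chart[active]
    action(state)                   # execute the logic in the step
    if transition(state):           # transition true?
        active += 1                 # step is finalized; next step begins
print(state)                        # {'level': 40, 'wash_time': 3}
```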
Note that the SFC canvas can support multiple steps and transitions, but design
considerations should limit the size of an SFC due to visualization.
The execution of an SFC program only considers the current active step and transition.
Parallel execution can occur, and in fact, some design considerations could be that the
main chart thread is the control, and the parallel thread would be the fault monitoring
and data reset thread.
We can use various languages beneath the chart actions and transitions. For instance, if
we use LD, we could use a retentive control bit in the action. Once that step action is
completed, that bit or variable will remain set so we can use its status in another action
or step. This is similar to the LD programming rule regarding retentive actions: the bit
or variable will remain set until it is reset. Everything we do should be deliberate
because there are no specified functions to put data back into a default condition.
Instruction Set
Because SFC is not classified as an actual language, the instruction set is more of a
definition of actions.
The SFC specification allows for parallel paths so that steps can execute in parallel and the ending transitions can be ANDed or ORed.
Actions can be:
• Step with action(s)
• Transition
• Double divergence and convergence threads
As previously mentioned, the content for the actions and transitions can be in LD, FBD,
ST, or IL. The programming rules for each language will prevail within the SFC, along
with the boundaries of the SFC itself.
Applications
In order to use SFC, the application MUST have well-defined start and end positions, as well as a definition of every step in the control program.
We could consider an SFC a state machine. So, any target application that can be
defined from start to finish in defined states for each step would be a good application
for an SFC.
Motion control and profiles are a common application for SFC, as are batch/recipe-type applications. Machine control was the first widely used application for SFC.
Troubleshooting
The Holy Grail for any system is the ability to determine why a system is not functioning as designed, has stalled, or has stopped. Should an SFC application stall for any reason, the execution engine stops on the step whose transition it is waiting on to continue.
Remembering that it is the transition that breaks the SFC out of the current step narrows the playing field dramatically. When a process function stops, there has been an error in executing the action or something is missing to enable the transition. Some SFC IDEs will identify where the scan stalled.
However, the best way to troubleshoot is to know the sequence of the process, identify
the last successful action, and then evaluate the next action and transition to determine
the reason for failure.
Flowcharting
When IEC 61131-3 came to North America, Steeplechase Software Inc. applied to the IEC committee to include flowcharting in the standard but was turned down. Steeplechase had created an environment that supported the use of flowcharting for industrial applications (see Figure 16-8).
While Ron Lavalee created the genesis of flowcharting with FloPro, which was used in certain areas of the automotive industry, the success rate of applications was low in the eyes of those who had to implement it.
FC is exclusively a PC-based control system. Some hardware vendors provide a PC-based control system in form factors that may be more suitable to a specific environment. While this hardware limitation may force some users to seek other options, FC can be used in a variety of applications. It is recommended, however, that the applications be kept small, on the premise that the FC itself is the troubleshooting tool to use. What is unclear to the author is the advancement of the technology as it applies to industrial PC-based control. However, many applications still run using flowcharting, thus its inclusion in this chapter.
FC is a lone wolf, if you will. It hasn’t conformed to any of the current standards, such as Open Platform Communications Unified Architecture (OPC UA) or IEC 61131-3, for obvious reasons.
Due to the lack of general support and current available programming techniques, FC
has been included strictly for legacy purposes.
Conclusions
All programming languages have instructions to which data is attached, as well as standalone functions. The method by which the data gets attached varies from vendor to vendor, but the common thread among them all is the process: the target application that demands control through the aggregate operation of the language(s) chosen to meet those demands.
Instructions are nothing more than words in the vocabulary of industrial solutions, there for a scribe to put together into sentences and paragraphs called a control program.
Standards can play a big part in language selection; however, the target application
should determine what words are put on the page.
When it comes to industrial control, one pet peeve users have is the training component.
The people who develop, implement, and maintain the components of the target
application must have a certain set of skills and knowledge to make things happen. Once
we have that set of tools, it is difficult to change horses in midstream.
IEC 61131-3 was defined to help with that, and it has moved the goalposts over the last
10 years. But vendors will do what they have to do to keep their audience captive.
Having said that, many cross-platform topics can be transferred from product to product
and vendor to vendor.
While the vendor selection determines the dictionary we have to work with, the
language of choice comes under the microscope for various reasons, and it is best to
choose wisely. The process is important, and this author has always believed that as
engineers and software people, regardless of language, we will make things work.
With the general acceptance of the six languages described, we now have choices from which we can write the script. Whether the responsibility for choice and implementation lies with a team or an individual, the right choice may only seem obvious based on our own experiences.
Some factors to consider:
• Ease of control program maintenance—will the process change often?
• The execution speed of the process may determine which language to use.
• The instruction set is more dependent on the hardware platform used; determine
what you need first.
• Who will maintain the target application? Always remember the 3 a.m. phone call
paradigm.
• Consider abstracting functions for easier understanding using FBD/SFC.
• Use of standards can create a successful cross-training component in your
maintenance department.
• The language chosen must fit into the overall data strategy of integrating the
“plant floor to the top floor” (maintenance department to management).
• The target audience of the resulting control program must be considered at all
times based on their knowledge base.
• Do you need to develop for the lowest common denominator?
We live in a fast-paced technological world; however, the industrial world doesn’t move
as fast due to processes and legacy. All we learn today will be transportable to the next
platform of control paradigms.
Go forth and write a masterpiece using these languages and word sets that will give you
satisfaction and respect in your programming and maintenance world.
About the Author
Jeremy Pollard, the “Caring Canuckian,” has been involved in automation for 40 years in programming, design, integration, teaching, and consulting for various companies, as well as serving as a columnist for Control Design magazine and a former columnist for Manufacturing Automation. A former Rockwell employee and professor at Durham
College, Pollard has been in charge of his own destiny since 1986, including a brief stint
in Houston, Texas, as a product manager for a software firm. Pollard has been actively
involved in the International Society of Automation (ISA) since the early 90s in
organizing and presenting conference sessions on various topics. He has been quoted in
numerous trade publications.
17
Process Modeling
By Gregory K. McMillan
Fundamentals
Process models can be broadly categorized as steady state and dynamic. Steady-state
models are largely used for process and equipment design and real-time optimization
(RTO) of continuous processes. Dynamic models are used for system acceptance testing
(SAT), operator training systems (OTS), and process control improvement (PCI). A
steady-state or dynamic model can be experimental or first principle. Experimental
models are identified from process testing. First principle models are developed using
the equations for charge, energy, material, and momentum balances; equilibrium
relationships; and driving forces.
Steady-state models can be multivariate statistical, neural-network, or first principle
models. Multivariate statistical and neural-network models are primarily used to detect
abnormalities and make predictions in process operation. First principle steady-state
models are widely used by process design engineers to develop process flow diagrams
and, to a much lesser extent, by advanced control engineers for RTO.
Multivariate statistical and neural-network dynamic models presently rely on the
insertion of dead-time blocks on process inputs to model the effect on downstream
process outputs. The lack of a process time constant or integrating response means that
these models cannot be used for testing feedback control systems. For batch processes,
the prediction of batch endpoint conditions does not require dead-time blocks because
synchronization with current values is not required.
Dynamic first principle models should include the dynamics of the automation system in
addition to the process as shown in Figure 17-1. Valve and variable-speed drive (VSD)
models should include the installed flow characteristic, resolution and sensitivity limits,
deadband, dead time, and a velocity limited exponential response. Measurement models
should include transportation delays, sensor lag and delay, signal filtering, transmitter
damping, resolution and sensitivity limits, and update delays. For analyzers, the
measurement models should include sample transportation delays, cycle time, and
multiplex time. Wireless devices should include the update rate and trigger level.
Step response models use an open-loop gain, total-loop dead time, and a primary (and possibly a secondary) time constant. The process gain is a steady-state gain for self-regulating processes. The process gain is an integrating process gain for integrating and
runaway processes. The inputs and outputs of the step response model are deviation
variables. The input is a change in the manipulated variable and the output is the change
in the process variable (PV). The models identified by proportional-integral-derivative
(PID) tuning and model predictive control (MPC) identification software take into
account the controller scale ranges of the manipulated and process variables and include
the effect of valve or variable speed drive and measurement dynamics. As a result, the
process gain identified is really an open-loop gain that is the product of the valve or
VSD gain, process gain, and measurement gain. The open-loop gain is dimensionless
for self-regulating processes and has units of inverse seconds (1/sec) for integrating
processes. Correspondingly, the process dead time is actually a total-loop dead time
including the delays from digital components and analyzers, and equivalent dead time
from small lags in the automation system. While the primary (largest) and secondary
(second largest) time constants are normally in the process for composition and
temperature control, they can be in the automation system for flow, level, and pressure
control and for fouled sensors in all types of loops. Tieback models can be enhanced to
use step response models that include the dynamics of the automation system.
Standard tieback models pass the PID output through a multiplier block for the open-loop gain and a filter block for the primary time constant to create the PV input. The tieback inputs and outputs are typically in engineering units. Simple enhancements to this setup enable step response models, such as those shown in Figures 17-2a and b. These models can be used to provide a dynamic fidelity that is better than what can be achieved by a first principle model whose parameters have not been adjusted based on test runs. Not shown are the limits to prevent values from exceeding scale ranges.
The enhancement to the input of the standard tieback model is to subtract the normal
operating value of the manipulated variable (%MVo) from the new value of the
manipulated variable (%MVn) to create a deviation variable (∆%MV) that is the change
in the manipulated variable. The enhancement to the output is to add the normal
operating value of the process variable (%PVo) to the deviation variable (∆%PV) to
provide the new value of the process variable (%PVn). A dead-time block for the total-loop dead time (θo) and a filter block for the secondary time constant (τs) are inserted to
provide all the parameters for a second-order plus dead-time step response model for
self-regulating processes. If the manipulated variable equals its normal operating point
plus correction for the disturbance variable, the process variable will settle out to equal
its normal operating point. The input and output biases enable the setting of normal
operating points and the further enhancement of the tieback to use a step response model
for higher-fidelity simulations and linear dynamic estimators.
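As a minimal sketch (in Python; the discrete-time form, function name, and parameter values are assumptions, not the book’s implementation, and the secondary lag is omitted for brevity), the enhanced tieback of Figure 17-2a might look like this:

```python
from collections import deque

def make_fopdt(Ko, theta, tau, dt, mv0, pv0):
    """Ko: open-loop gain (%/%), theta: total-loop dead time (s),
    tau: primary time constant (s), dt: execution interval (s),
    mv0, pv0: normal operating points in % of scale."""
    delay = deque([0.0] * max(1, round(theta / dt)))  # dead-time block
    pv_dev = [0.0]                                    # deviation state

    def step(mv_pct):
        delay.append(mv_pct - mv0)        # subtract input bias -> delta MV
        delayed = delay.popleft()
        a = dt / (tau + dt)               # filter block (first-order lag)
        pv_dev[0] += a * (Ko * delayed - pv_dev[0])
        return pv0 + pv_dev[0]            # add output bias -> %PVn
    return step

model = make_fopdt(Ko=1.5, theta=4.0, tau=20.0, dt=1.0, mv0=50.0, pv0=40.0)
for _ in range(300):
    pv = model(60.0)    # a +10% MV step settles PV near 40 + 1.5*10 = 55%
```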
The total dead time in control loops includes the pure delays and equivalent dead time
from small lags in the valve, process, measurement, and controller. Total dead time and primary time constant vary from a few seconds in flow and liquid pressure loops to
minutes in vessel temperature and composition loops to hours in column temperature
and composition loops. If sensor sensitivity is good, level loops have a dead time of a
few seconds when manipulating a flow. Level loop dead time increases to minutes when
manipulating heat transfer to change level by vaporization. Integrating process gains for
level and batch temperature are generally very slow (e.g., 0.000001%/sec/%) but can be
very fast for gas pressure particularly when the controlled variable scale is specified in
fractional inches of water column (e.g., 1%/sec/%).
For integrating and runaway processes, an integrator block is substituted for the filter
block for the primary time constant. For integrating process models, the normal
operating point of the manipulated variable that is the negative bias magnitude must be
sufficiently greater than zero to provide negative as well as positive changes in the
process variable. This bias represents a process load. To achieve a new target or set
point, the manipulated variable must be temporarily changed to be different from the
load. When the process variable settles out at the new operating point, the manipulated
variable returns to be equal to the load plus correction for the disturbance variable.
Dynamic simulations for SAT, OTS, and PCI use a virtual or actual version of the control system configuration and graphics, including historian and advanced control tools, interfaced to a dynamic model running in an application station or personal computer. Using an actual or virtual control system, rather than an emulated (imitated) one, is necessary to allow the operators, process engineers, automation engineers, and technicians to use the same graphics, trends, and configuration as the actual installation.
This fidelity to the actual installation is essential. The emulation of a PID block is
problematic because of the numerous and powerful proprietary features, such as anti-reset windup and external-reset feedback. An actual control system is often used in SAT
to include the testing of input and output channels. A virtual system is preferred for OTS
and PCI to free up the control system hardware and enable speed-up and portability of
the control system.
The fidelity of steady-state models is the error between the final values of modeled and
measured compositions, densities, flows, and temperatures. The fidelity of a dynamic
model is judged not only by final values but also the time response. For PID control, the
dead time and maximum excursion rate in the time frame of feedback correction is most
important. For a PID tuned for good disturbance rejection, the time frame is 4 dead
times.
First principle models can be sped up to be faster than real time by increasing the
integration step size and the kinetic rates. The effect of these factors is multiplicative.
Dynamic bioreactor models are run 1,000 times faster than real time by increasing the kinetics by a factor of 100 and the integration step size by a factor of 10. The result is a
simulation batch cycle time of about 30 minutes rather than an actual batch time of 2
weeks. An increase in kinetics requires a proportional increase in the flows associated
with the kinetics. The process time constants will be decreased by the speed-up factor.
Dead times should be proportionally decreased so that the time constant to dead-time
ratio is about the same so that the decrease in virtual PID gain is only proportional to the
increase in manipulated flow scale span. The PID rate and reset time should be
proportionally decreased with the total-loop dead time.
Linear Dynamic Estimators
Step response models can be used as linear dynamic estimators, as shown in Figure 17-3, by using the process variable’s steady-state value for self-regulating processes (Figure
17-2a) and the rate of change after multiplication by the total-loop dead time for
integrating processes (Figure 17-2b) as a future value after conversion from percent of
scale to engineering units. A portion of the error between the model output that includes
dead time and time constants (the new PV value) and an at-line or off-line analyzer
result (the measured PV value) is used to bias the linear dynamic estimator output (the
future PV value), similar to the correction done in MPC for self-regulating processes.
The model dynamics provide synchronization with lab results and must include the
effect of sample time, cycle time, and analysis time. The linear dynamic estimator can
be extended to include the step response models of several process inputs. MPC
identification software can readily provide these models. Adaptive tuner software can
provide the models by setting up dummy loops and generating tests from the step
changes in the dummy controller manual or remote output. In the use of these step
response models, the user must be aware—despite the display of variables in
engineering units—that these models are internally using values in percent-of-scale
ranges because the controller algorithms are executed based on percent values of inputs
and outputs. It is important to note that the dynamic estimator, shown in Figure 17-3, is
using variables in engineering units, whereas the variables internally used in PID and
MPC algorithms are in percent and thus include the effect of scales on the open-loop
gain.
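As a minimal sketch (in Python; the single-input model, variable values, and the fractional bias update are assumptions analogous to the MPC-style correction described above), the feedback correction might look like this:

```python
def lde_correct(model_pv, analyzer_pv, bias, fraction=0.3):
    """Fold a portion of the model-versus-analyzer error into a bias,
    analogous to the feedback correction done in MPC."""
    return bias + fraction * (analyzer_pv - model_pv)

bias = 0.0
# model_pv must be the model output synchronized (delayed) to the
# analyzer's sample, cycle, and analysis times:
bias = lde_correct(model_pv=54.2, analyzer_pv=55.0, bias=bias)
future_pv = 55.8 + bias    # dead-time-free future value plus bias
```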
Multivariate Statistical Process Control
Multivariate statistical process control (MSPC), also known as data analytics, uses
principal component analysis (PCA) to provide a small set of uncorrelated principal
components, called latent variables, from a linear combination of a large set of possibly
correlated process inputs. Consider a three-dimensional (3-D) plot of process output
data versus three process inputs, as shown in Figure 17-4. The first latent variable (PC1)
is a line through the data in the direction of maximum variability. The second latent
variable (PC2) is a line perpendicular (orthogonal) to the first in the direction of second
greatest variability. The data projected on this new plane is a “Scores” plot. While this
example is for three process inputs reduced to two latent variables, MSPC can reduce
hundreds of process inputs into a few principal components and still capture a
significant amount of the variability in the data set.
If the data points are connected in the proper time sequence, they form a “Worm” plot,
as shown in Figure 17-5, where the head of the worm is the most recent data point
(PV1n). Outliers, such as the data point at scan n-x (PV1n-x), are screened out as
extraneous values of a process variable, possibly because of a bad lab analysis or
sample. When the data points are batch endpoints, the plot captures and predicts
abnormal batch behavior. In Figure 17-5, the sequence of data points indicates that the
process is headed out of the inner circle of good batches.
A partial least squares (PLS) estimator predicts a controlled variable based on a linear
combination of the latent variables that minimizes the sum of the squared errors in the
model output. To synchronize the predicted with the observed process variable, each
process input is delayed by a time approximately equal to the sum of the dead time and
primary time constant for continuous processes. Synchronization is not needed for batch
process endpoint prediction.
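A minimal sketch (in Python, assuming scikit-learn is available; the data are synthetic and the names illustrative) of the PCA reduction and PLS estimator described above:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))          # 30 possibly correlated inputs
y = X[:, :3] @ np.array([0.5, -1.2, 0.8]) + rng.normal(scale=0.1, size=200)

scores = PCA(n_components=2).fit_transform(X)   # latent variables PC1, PC2

pls = PLSRegression(n_components=2).fit(X, y)   # PLS estimator
y_hat = pls.predict(X)                          # predicted controlled variable
```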
Artificial Neural Networks
An artificial neural network (ANN) consists of a series of nodes in hidden layers where
each node is a nonlinear sigmoidal function to mimic the brain’s behavior. The input to
each node in the first hidden layer is the summation of the process inputs biased and
multiplied by their respective weighting factors, as shown in Figure 17-6. The outputs
from nodes in a layer are the inputs to the nodes in a subsequent layer. The predicted
controlled variable is the summation of the outputs of the nodes from the last layer. The
weights of each node are automatically adjusted by software to minimize the error
between the predicted (PV1*) and measured process variables in the training data set.
The synchronization method for an ANN is similar to that for a PLS estimator, although
an ANN may use multiple instances of the same input with accordingly set delays to
simulate the exponential time response from a time constant.
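A minimal sketch (in Python with NumPy; layer sizes and weights are illustrative, and the training step that adjusts weights is omitted) of the forward pass just described:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """layers: list of (W, b); each node applies a sigmoid to a
    weighted, biased sum of the previous layer's outputs."""
    for W, b in layers:
        x = sigmoid(W @ x + b)       # weighted sum + bias -> sigmoid node
    return float(np.sum(x))          # predicted PV1*: sum of last layer

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(5, 3)), rng.normal(size=5)),   # hidden layer 1
          (rng.normal(size=(4, 5)), rng.normal(size=4))]   # hidden layer 2
pv1_star = forward(np.array([0.2, 0.7, 0.1]), layers)
```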
First Principle Models
First principle models use equations that obey the laws of physics, such as the
conservation of quantities of mass, energy, charge, and momentum. The accumulation
rate of a given physical quantity is the rate of the quantity entering minus the rate of the
quantity exiting the volume, including conversion, evaporation, condensation, and
crystallization rates. Process variables are computed from the accumulation, equipment
dimensions, physical properties, and equations of state. The mass balance for level and
pressure is calculated by integrating the mass flows entering and exiting the liquid and
gas phase, respectively. Temperature is computed from an energy balance and pH from a
charge balance.
Industry tends to use ordinary differential equations for dynamic first principle models
where the physical quantity of interest is assumed to be evenly distributed throughout
the volume. Profiles and transportation delays are modeled by the breaking of a process
volume, such as the tube side of a heat exchanger, into several interconnected volumes.
This lumped parameter method avoids using partial differential equations and the
associated boundary value problems.
In steady-state models, physical attributes, such as composition and temperature, of a
stream are set by an iterative solution proceeding from input to output streams, or vice
versa, to converge to a zero rate of accumulation of a quantity, such as the mass of a
chemical component or energy, within the volume. In dynamic models, the rate of
accumulation is nonzero and is integrated. However, most dynamic models have steadystate models and thus iterative solutions for pressure-flow interrelationship of liquid
streams because the integrations step sizes required for the millisecond momentum
balance response time are too small. Special dedicated software is required to use
momentum balances to study hydraulics, hammer, and surge. Steady-state models
generally do not have a pressure-flow solver because it makes convergence of the
overall model too difficult and lengthy. Consequently, a stream flow must be directly set
in such models, because a change in valve position or operating point on a pump curve
does not change flow.
Figure 17-7 is a simple example of a dynamic and steady-state model used to compute
level. In the dynamic model, the rate of accumulation of liquid mass (∆MLn/∆t) is the net of the mass flows into the volume (F1n and F2n) and out of the volume (F3n). The rate of accumulation
multiplied by the integration step size (∆t) and added to the last accumulation (∆MLn-1)
gives the new accumulation (∆MLn). The liquid level (LLn) is the present accumulation
of mass divided by the product of the liquid density (ρL) and cross-sectional area (Av) of
the vessel. In the steady-state model, the liquid mass (MLn) and level (LLn) are constant because the rate of accumulation is zero. Also, the mass flow out of the vessel (F3n) is not a function of the pressure drop, the maximum flow coefficient of the control valve, and the level controller output, but is set equal to the sum of the mass flows into the vessel (F1n + F2n).
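A minimal sketch (in Python; the dimensions, density, and flows are illustrative) of the dynamic level calculation in Figure 17-7, with the steady-state case shown for contrast:

```python
rho_L = 1000.0     # liquid density, kg/m^3
A_v   = 2.0        # vessel cross-sectional area, m^2
dt    = 1.0        # integration step size, s
F1, F2 = 3.0, 1.0  # mass flows into the vessel, kg/s

# Dynamic model: integrate the nonzero rate of accumulation.
M_L = 4000.0       # initial liquid mass, kg
for _ in range(60):
    F3 = 3.5                          # mass flow out, kg/s (from valve model)
    M_L += (F1 + F2 - F3) * dt        # new accumulation
    level = M_L / (rho_L * A_v)       # level from mass, density, and area

# Steady-state model: accumulation is zero, so outflow is set directly.
F3_steady = F1 + F2                   # level stays constant
```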
Dynamic models run in real time if the time between successive executions of the model
(execution time) is equal to the integration step size. A dynamic model will run faster
than real time if the execution time is less than the integration step size. For the model to
be synchronized with the control system, both must start at the same point in time and
run at the same real-time factor, which is difficult if the model and control system are in
separate software packages. Frequently, complex dynamic models will slow down
because of a loss of free time during an upset when the dynamic response of the model
and control system is of greatest interest.
Capabilities and Limitations
A linear dynamic estimator (LDE) and MSPC use linear models. Both require that there
be no correlations between process outputs. However, the MSPC is designed via PCA to
handle correlated process inputs. An LDE, by definition, can accurately model process
outputs with significant time constants and an integrating response. However, present LDE software tends to have a practical limit of 50 process inputs, and most LDEs use fewer than 5 process inputs, whereas MSPC and ANN software are designed to handle hundreds of inputs.
An ANN excels at interpolation of nonlinear relationships. However, extrapolation
beyond the training set can lead to extremely erroneous process outputs because of the
exponential nature of the response. Also, a large number of inputs and layers can lead to
a bumpy response.
An LDE requires testing the process to accurately identify the dynamic response. Both
MSPC and ANN advertise that you can develop models by just dumping historical data.
Generally, the relationships between whole values rather than changes in the inputs and
outputs are identified. Consequently, these models are at risk of not identifying the
actual process gain at operating conditions. An MSPC and ANN also generally require
the process to be self-regulating, meaning output variables reach steady states
determined by measured process inputs. Nonstationary behavior (shifting process
outputs) from an unmeasured disturbance or an integrating response is better handled by
identification and feedback correction methods employed by an LDE.
First principle models can inherently handle all types of nonlinearities, correlations, and
nonself-regulation, and show the compositions and conditions of streams. However, the
physical property data may be missing for components, requiring the user to construct
theoretical compounds. Also, these models tend to focus on process relationships and
omit transportation and mixing delays, thermal lags, non-ideal control valve behavior,
and sensor lags. Consequently, while first principle models potentially show nonlinear
interrelationships better than experimental models, the errors in the loop dead times and time constants are larger than desired unless variable dead time, analyzer, wireless,
resolution, backlash-stiction, rate limiting, and filter blocks are added to address the
dynamics from the equipment, piping, and automation system.
First principle models offer the significant opportunity to explore new operating regions,
investigate abnormal situations, and provide online displays of the composition of all
streams and every conceivable indicator of process performance. There are considerable
opportunities to increase both process and automation system knowledge by
incorporation in a virtual plant leading to many process control improvement
opportunities.
An LDE requires step changes five times larger than the noise band for each process
input with at least two steps held for the time to steady state, which is the dead time plus
four time constants.
An MSPC and ANN should have at least five data points, preferably at different values
significantly beyond the noise level, for each process input. A hundred process inputs
would require at least 500 data points. A feedback correction from an online or
laboratory measurement can be added to an MSPC and ANN, similar to an LDE. If
laboratory measurement is used, the synchronizing delay must be based on the time
between when the sample was taken and the analysis entered. In most cases, the total
number and frequency of lab samples is too low.
MSPC and ANN models that rely totally on historical data, rather than a design of
experiments, are particularly at risk of identifying a process input as a cause, when it is
really an effect or a coincidence. For example, a rooster crowing at dawn is not the
cause of the sunrise, and the dark background of enemy tanks at night is not an indicator
that the tanks are a legitimate target. Each relationship should be verified that it makes
sense, based on process principles. A sure sign of a significantly weighted extraneous
input is an MSPC or ANN model that initially looks good but is inaccurate for a slight
change in process or equipment conditions.
The LDE, MSPC, and ANN all offer future values of controlled variables by using the
model output without the dynamics needed for synchronization with the process
measurement for model development and feedback correction. The synchronization for
an LDE is more accurate than for an MSPC or ANN because it includes time constants
as well as the total loop time delay.
Frequently the process gain, dead time, and time constant of controlled variables, such
as temperature and composition, are inversely related to flow rate. Consequently, an
experimental model is only valid for limited changes in production rate. For large
changes in throughput, different models should be developed and switched if the process
gain, dead time, and time constant cannot be computed and updated in the model.
Steady-state first principle models offer projected steady states. However, it may take 30
minutes or more for a large steady-state model to converge. Dynamic first principle
models can run faster than real time to rapidly show the dynamic response into the
future, including the response as it moves between steady states. Dynamic models are not limited to self-regulating processes and can show integrating and runaway responses. The step size should be less than one-fifth of the smallest time constant to avoid numerical instability, so faster-than-real-time execution is preferably achieved by a reduction in the execution time rather than an increase in the integration step size. The fastest real-time multiple is the product of two ratios: the largest stable integration step size divided by the original step size, and the original execution time divided by the new execution time, where the new execution time is the calculation time (the original execution time minus the wait time).
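A short worked example (with illustrative numbers) of the fastest real-time multiple:

```python
dt_original   = 0.1    # original integration step size, s
dt_max_stable = 0.5    # largest stable step size, s
exec_original = 0.1    # original execution time per step, s
calc_time     = 0.02   # calculation time = execution time - wait time, s

real_time_multiple = (dt_max_stable / dt_original) * (exec_original / calc_time)
print(real_time_multiple)   # 5 * 5 = 25x faster than real time
```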
Dynamic models can be run in a computer with a downloaded configuration and
displays of the actual automation system to form a virtual plant. A virtual plant provides
inherent synchronization and integration of the model and the automation system,
eliminates the need for emulation of the control strategy and the operator interface, and
enables migration of actual configurations and operator graphics.
Steady-state first principle models are limited to continuous processes and steady states.
Thus, the first principle model needs to be dynamic to predict the process outputs for
chemical reaction or biochemical cell kinetics, behavior during product or grade
transitions, and for batch and non-self-regulating processes. Parameters, such as heat
exchanger coefficients, must be corrected within the models based on the mismatch
between manipulated variables (flows) in the process model and the actual plant. For
steady-state models, this is done by a reconciliation step where the process model is
solved for the model parameter. For online dynamic models in a virtual plant
synchronized with actual plant, a model predictive controller has been shown to be
effective in automatically adjusting model parameters.
Only dynamic first principle models are capable of simulating valve resolution (stick-slip) and deadband (backlash). However, most to date do not include this dynamic
behavior of control valves and consequently will not show the associated limit cycle or
dead time.
Often, the compositions of raw materials are not measured comprehensively enough or
frequently enough. Most of the process variability not introduced by oscillations from
the automation system is typically caused by raw materials.
It is important to distinguish fidelity based on the purpose. In general, process models
for control system optimization require the most fidelity. Process models for
configuration testing require the least fidelity. Operator training for familiarization with
displays can be accomplished by low fidelity models. However, process training, control
strategy development and prototyping, and abnormal situation management require at
least medium fidelity models. It is difficult to achieve medium fidelity in the modeling
of start-ups and shutdowns because many first principle models will become unstable or
crash for fluid volumes approaching zero.
Process Control Improvement
Using models for operator training is recognized as essential. What is not realized is that many, if not most, operator errors could have been prevented by better operator interface and alarm management systems, state-based control, control systems that stay in the highest mode and prevent activation of safety instrumented systems (SISs), continual training
and improvement from knowledge gained, and better two-way communication between
operators and everyone responsible for the system integrity and performance. The
virtual plant provides this and many process control improvement opportunities.
Inferential measurements by LDE, MSPC, ANN, and first principle models can provide
the composition measurements that are most important and most often missing. The
virtual plant can be used to develop and implement these measurements, as well as the
associated, more effective and reliable control strategies.
Process control engineers can use a virtual plant to increase process capacity and
efficiency. The virtual plant offers flexible and fast exploring, discovering, prototyping, testing, justifying, deploying, training, commissioning, maintaining,
troubleshooting, auditing, continuous improvement, and showing the “before” and
“after” benefits of solutions via online metrics. The virtual plant can provide knowledge
for alarm management and operator interface improvements, cause-and-effect
relationship identification, interaction and resonance response, valve and sensor
response requirements, process safety stewardship, control system and SIS
requirements, validation and support, equipment performance, batch control, and
procedure automation (state-based control) for start-ups, transitions, shutdowns and
abnormal operation, advanced process control, and more optimum operating points.
Costs and Benefits
Process models provide process knowledge, composition measurements, and more
optimum operating points whose benefits generally pay for the cost of the LDE, MSPC,
and ANN software in less than a year. The cost of comprehensive, first-principle
modeling software with physical property packages is generally greater, but that software can also be used for SAS and OTS in addition to more extensive PCI.
The process knowledge needed to implement first principle models is greater but, correspondingly, the process knowledge gained is more extensive and deeper, providing significant process understanding. Some LDE, MSPC, and ANN software requires little, if any, outside support after the first application, but it poses significant risks of erroneous results when the process understanding needed to verify cause-and-effect relationships is missing. Consequently, there is considerable synergy to be gained from having both experimental and first principle models in terms of mutual improvements.
First principle models presently require outside support or internal simulation experts
and a total engineering cost that generally exceeds the software cost. All models require
an ongoing yearly maintenance cost that is about 10–20% of the initial installed cost, or
else the benefits will steadily diminish and eventually disappear. However, first principle models in virtual plants can greatly increase process performance from the synergy of operational, process, and automation system knowledge, yielding benefits each year that are much greater than the total cost of the models.
The total cost of an LDE, MSPC, and ANN process model is generally less than the
installed cost of a field analyzer with a sample system. However, the cost of improving
the accuracy of lab analysis and increasing the frequency of lab samples must be
considered. Often overlooked are the benefits from reducing the effect of noise, dead
time, and failures in existing analyzers, and taking advantage of the composition
information in density measurements from Coriolis meters.
Further Information
Mansy, Michael M., Gregory K. McMillan, and Mark S. Sowell. “Step into the Virtual
Plant.” Chemical Engineering Progress 98, no. 2 (February 2002): 56+.
McMillan, Gregory K. Advances in Reactor Measurement and Control. Research
Triangle Park, NC: ISA (International Society of Automation), 2015.
McMillan, Gregory K., and Hunter Vegas. 101 Tips for a Successful Automation Career.
Research Triangle Park, NC: ISA (International Society of Automation), 2013.
McMillan, Gregory K., and Mark S. Sowell. “Virtual Plant Provides Real Insights.” Chemical Processing (October 2008).
McMillan, Gregory K., and Martin E. Berutti. “Virtual Plant Virtuosity.” Control
(August 2017).
McMillan, Gregory K., and Robert A. Cameron. Models Unleashed: Applications of the
Virtual Plant and Model Predictive Control – A Pocket Guide. Research Triangle
Park, NC: ISA (International Society of Automation), 2004.
About the Author
Gregory K. McMillan is a retired Senior Fellow from Solutia Inc. and an ISA Fellow.
He received the ISA Kermit Fischer Environmental Award for pH control in 1991 and
Control magazine’s Engineer of the Year award for the process industry in 1994. He was
inducted into Control magazine’s Process Automation Hall of Fame in 2001; honored as
one of InTech magazine’s most influential innovators in 2003; and presented with the
ISA Life Achievement Award in 2010. McMillan earned a BS in engineering physics
from Kansas University in 1969 and an MS in electrical engineering (control theory)
from Missouri University of Science and Technology in 1976.
18
Advanced Process Control
By Gregory K. McMillan
Fundamentals
In advanced process control, process knowledge by way of process models is used to
make the control system more intelligent. The process modeling topic (Chapter 17)
shows how quantitative process models can provide process knowledge and inferential
measurements, such as stream compositions, that can be less expensive, faster, and more
reliable than the measurements from field analyzers. The quantitative models from
Chapter 17 are used in this chapter to provide better tuning settings, set points, and
models and algorithms for feedback and feedforward control.
Advanced PID Control
A fuzzy logic controller (FLC) is not detailed here because it has recently been shown that proportional-integral-derivative (PID) tuning and various options can make a PID perform as well as or better than an FLC. There are some niche applications for FLC, such as mineral processing, due to indeterminate process dynamics and missing measurements.
For unmeasured disturbances, the PID has proven to provide near-optimal control
minimizing the peak and integrated errors. The improvement over other control
algorithms is most noticeable for processes that do not achieve a steady state in the time
frame of the PID response (e.g., 4 dead times). Processes with a large time constant,
known as lag dominant or near-integrating, with a true integrating response (e.g., batch,
level, or gas pressure), or runaway response (e.g., highly exothermic reactors), need a
more aggressive feedback correction by a PID to deal with the fact that the process
response tends to ramp or even accelerate. For these processes, a high PID gain,
derivative action, and overdriving the manipulated variable (flow) past the final resting
value are essential for good control and safe operation.
If disturbances can be measured, feedforward signals whose magnitude and timing are
accurate to within 10% can provide a 10:1 reduction in errors by preemptive correction
and coordination of flows. When the process controller directly manipulates a control
valve, the feedforward can be integrated into the PID controller via a feedforward option
in the PID. The control valve must have a linear installed characteristic or a signal
characterizer must be used to provide linearization and a PID output in percent flow.
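As an illustration, a piecewise-linear signal characterizer can be realized with a simple interpolation table. The sketch below is a minimal Python example; the installed-characteristic data points are hypothetical and would come from field tests of the actual valve.

```python
import numpy as np

# Hypothetical installed characteristic from field tests: percent flow
# achieved at each percent valve position for an equal-percentage valve.
valve_pos_pct = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
flow_pct = np.array([0, 2, 5, 10, 17, 27, 40, 56, 74, 88, 100])

def characterize(pid_out_pct_flow):
    """Invert the installed characteristic so the PID works in a linear
    percent-flow domain; the returned valve position (%) is what is
    actually sent to the valve."""
    return float(np.interp(pid_out_pct_flow, flow_pct, valve_pos_pct))

print(characterize(50.0))  # valve position (%) that delivers 50% flow
```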
When a primary process controller manipulates a secondary flow loop set point (SP),
implementing a flow feedforward is best done via a ratio and bias/gain stations to ratio a
follower flow to a leader flow. Figure 18-1 shows the ratio control setup for volumes
with mixing as seen in agitated vessels and in columns due to boiling and reflux. For
these volumes, the process control of composition, pH, and temperature for continuous as well as batch operations does not have a steady state in the PID response time frame.
Increases in feed flow have an offsetting effect on PID tuning by decreasing the process
gain and time constant. Correction of the ratio set point (multiplying factor) by the
process controller would introduce nonlinearity. Here, the feedback correction is best
done by means of a bias. The bias correction can be gradually minimized by an adaptive
integral-only controller slowly correcting the ratio set point when the ratio station is in
the cascade (remote set-point) mode. The operator can see the desired versus the current
ratio and take over control of the ratio set point by putting the ratio station in the
automatic (local set-point) mode. This feature is particularly important for the start-up of
distillation columns before the column has reached operating conditions (before
temperature is representative of composition). A typical example is steam to feed flow
and distillate to feed flow ratio control for a distillation column. Ratio control
operability and visibility are important for many unit operations.
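A minimal sketch of such a ratio station, with a bias trimmed by the primary controller and a slow integral-only adaptation of the ratio set point, might look like the following (the class and method names are illustrative, not vendor function blocks):

```python
class RatioStation:
    """Ratio station with bias feedback correction (a sketch):
        follower_sp = ratio_sp * leader_flow + bias
    The primary (e.g., temperature) PID trims the bias; a slow integral-only
    adaptation gradually transfers persistent bias into the ratio set point
    while the station is in the cascade (remote set-point) mode."""
    def __init__(self, ratio_sp, adapt_time_s):
        self.ratio_sp = ratio_sp          # desired follower/leader ratio
        self.bias = 0.0                   # feedback trim from primary PID
        self.adapt_time_s = adapt_time_s  # slow I-only adaptation time

    def follower_sp(self, leader_flow):
        return self.ratio_sp * leader_flow + self.bias

    def set_bias(self, bias):
        self.bias = bias                  # written by the primary PID output

    def adapt(self, leader_flow, dt_s):
        """Move a small fraction of the bias into the ratio set point; the
        follower set point is unchanged, so the correction is bumpless."""
        if leader_flow > 0.0:
            shift = (dt_s / self.adapt_time_s) * (self.bias / leader_flow)
            self.ratio_sp += shift
            self.bias -= shift * leader_flow
```

Because the adaptation moves the bias into the ratio set point without changing the follower set point, the operator-visible ratio converges to the true operating ratio without bumping the process.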
The correction for a disturbance must arrive at the same point in the process and at the
same time as the disturbance with a magnitude and sign to cancel out the effect of the
disturbance. The ratio control system provides the correction with proper magnitude and
sign. The proper timing of the correction is achieved by dynamic compensation of the
feedforward via dead-time and lead-lag blocks. It is very important that the correction
does not arrive too soon or too late, thereby creating a response in the opposite direction
of the disturbance response, confusing the PID, and making control worse with
feedforward.
In the literature, the source of the feedforward is sometimes depicted as a wild flow
(meaning the flow cannot be manipulated for control in the unit operation it goes to).
Here we take a more general view of the leader flow as any chosen flow to a unit
operation that is unregulated or is varied to set production rate. Often, the main feed
flow is the leader flow. For reactors, the leader is the primary reactant feed. The
follower flow is a secondary reactant feed. Fed-batch and continuous control strategies
may optimize the leader flow. The follower flow must change according to a corrected
ratio of follower to leader flow to maintain desired reactant concentrations. For
distillation columns, the feed flow is the leader, and the steam flow and distillate or
reflux flow are the followers.
The feedforward dead time is set equal to the leader dead time minus the follower dead
time in the path to the common point in the process. The lead time for the follower
feedforward flow signal is set equal to the lag time (time constant) in the path of the
follower flow to the common point in the process. The lag time for the follower
feedforward signal is set equal to the lag time in the path of the leader flow to the
common point in the process. If the lag time in the follower and the lag time in the
leader paths are similar, the lead-lag cancels out and the timing correction simplifies to a
dead-time correction. For ratio control of reactant flows, a filtered leader flow set point
is used as the feedforward signal with the filter time set larger than the slowest reactant
flow closed-loop time constant. This setup provides nearly identical timing of changes
in reactant flows eliminating the need for dynamic compensation. This is critical to
eliminate unbalances in the stoichiometry that can cause poor conversion or side
reactions.
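The dynamic compensation described above can be sketched as a dead-time block feeding a lead-lag filter. The discrete implementation below (backward-Euler, plain Python) is illustrative only; the numerical path dead times and lags in the example are hypothetical.

```python
from collections import deque

class LeadLagDeadTime:
    """Dead-time block feeding a lead-lag filter (lead*s + 1)/(lag*s + 1),
    discretized with backward Euler. A minimal sketch, not vendor code."""
    def __init__(self, dead_time, lead, lag, dt, x0=0.0):
        self.delay = deque([x0] * max(1, round(dead_time / dt)))
        self.lead, self.lag, self.dt = lead, lag, dt
        self.x_prev, self.y_prev = x0, x0

    def update(self, x):
        self.delay.append(x)
        xd = self.delay.popleft()         # dead-time-delayed input
        y = (self.lag * self.y_prev + (self.lead + self.dt) * xd
             - self.lead * self.x_prev) / (self.lag + self.dt)
        self.x_prev, self.y_prev = xd, y
        return y

# Per the rules above, with hypothetical paths: leader dead time 30 s and
# lag 60 s, follower dead time 10 s and lag 20 s, executed each second:
ff = LeadLagDeadTime(dead_time=30 - 10, lead=20, lag=60, dt=1.0)
```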
The tuning for maximum disturbance rejection will generally cause excessive overshoot
and movement of the manipulated flow for set-point changes. The use of a set-point
filter equal to the PID reset time or a PID structure of integral action on error with
proportional and derivative action on the process variable (I on E, PD on PV) will
minimize the overshoot and abruptness in the PID output for a set-point change.
However, the approach to set point can be very slow. To get to set point faster, a lead
time equal to 1/4 the filter time (lag time) is applied to the set point, or a two-degrees-of-freedom (2-DOF) structure is used with the set-point weights (beta and gamma) set equal
to 0.25. Note that the PID form considered here is the International Society of
Automation (ISA) standard form, and the derivative mode is assumed to have a built-in
filter whose filter time is about 1/8 the rate time setting to prevent spikes in the PID
output from derivative action.
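A minimal sketch of the ISA standard form with the 2-DOF set-point weighting and built-in derivative filter described above is shown below, assuming, for simplicity, no output limiting or anti-reset windup:

```python
class ISAStandardPID:
    """ISA standard-form PID (a minimal sketch): output = Kc*(ep + integral
    + Td*d(ed)/dt), with 2-DOF set-point weights beta (proportional) and
    gamma (derivative) and a derivative filter of ~1/8 the rate time.
    Kc = controller gain, Ti = reset time (s), Td = rate time (s)."""
    def __init__(self, Kc, Ti, Td, dt, beta=0.25, gamma=0.25):
        self.Kc, self.Ti, self.Td, self.dt = Kc, Ti, Td, dt
        self.beta, self.gamma = beta, gamma
        self.integral = 0.0      # (1/Ti) * integral of (SP - PV)
        self.deriv = 0.0         # filtered derivative of weighted error
        self.ed_prev = 0.0

    def update(self, sp, pv):
        ep = self.beta * sp - pv             # weighted proportional error
        self.integral += (sp - pv) * self.dt / self.Ti
        ed = self.gamma * sp - pv            # weighted derivative error
        tf = self.Td / 8.0                   # built-in derivative filter
        raw = (ed - self.ed_prev) / self.dt
        self.deriv += (self.dt / (tf + self.dt)) * (raw - self.deriv)
        self.ed_prev = ed
        return self.Kc * (ep + self.integral + self.Td * self.deriv)
```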
A key PID feature for minimizing oscillations and disruption to other loops from
aggressive PID action is external reset feedback, also known as a dynamic reset limit,
made possible by the positive feedback implementation of integral action. External reset
feedback of the actual manipulated variable, whether it be a valve stroke or a secondary
loop process variable (PV), will prevent the PID output from changing faster than the
PV of the manipulated variable responds. The update of PV used for external reset
feedback must be fast enough to reflect what the valve or secondary loop is actually
doing. This can be a problem for valves.
Simply turning on external reset feedback with a properly connected PV of the
manipulated variable will prevent oscillations from the slow slew rate of large valves,
valve backlash, and violation of the cascade rule where the secondary loop is not five
times faster than the primary loop. External reset feedback allows the use of set-point up
and down rate limits on analog outputs to valves and secondary loops to provide
directional move suppression (DMS). Oscillations from unnecessary crossings of the
split-range point, which is the biggest point of discontinuity, can be reduced by DMS in
the direction of the crossing of the split-range point when movement is not needed to
deal with an abnormal condition. DMS can be used to provide fast-opening and slow-closing surge valves to protect against surge and reduce overreaction and the disruption
to downstream users upon recovery. DMS also adds important capability to valve
position control (VPC) for simple optimization as described in the next section. DMS
enables a gradual optimization that is less disruptive to other loops with a fast getaway
for abnormal conditions to prevent the “running out of valve” (i.e., the valve position
used by the VPC becoming too far open or closed). Also, the suspension of execution of
the PID when there is no analyzer update can provide an enhanced PID that does not
oscillate for an increase in wireless update time or analyzer cycle time.
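The positive feedback implementation of integral action lends itself to a compact sketch. In the illustrative PI controller below, integral action comes from filtering the external reset signal through a first-order lag with the reset time; feeding back the readback of the actual manipulated variable, rather than the controller's own output, yields external reset feedback:

```python
class PIExternalReset:
    """PI controller with integral action implemented as positive feedback:
    a first-order filter, with the reset time Ti as its time constant, acts
    on the external reset signal. Feeding back the readback of the actual
    manipulated variable (valve position or secondary-loop PV) instead of
    the controller's own output gives external reset feedback: the output
    cannot move faster than the manipulated variable actually responds.
    A minimal sketch with illustrative names."""
    def __init__(self, Kc, Ti, dt, f0=0.0):
        self.Kc, self.Ti, self.dt = Kc, Ti, dt
        self.f = f0                       # positive-feedback filter state

    def update(self, sp, pv, ext_reset):
        # ext_reset: what the valve or secondary loop is actually doing
        self.f += (self.dt / (self.Ti + self.dt)) * (ext_reset - self.f)
        return self.Kc * (sp - pv) + self.f
```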
Valve Position Controllers
A simple configuration change to add a valve position controller (VPC) can optimize a
process set point by pushing the affected control valve position (the VPC controlled
variable) to its maximum or minimum throttle position (the VPC set point). A common
example is where the lowest output of three separate VPCs setting the leader reactant
flow maximizes production rate by pushing jacket coolant, overhead condenser coolant,
and vent control valves to their maximum throttle position. Traditionally, a VPC has been tuned as an integral-only controller with the integral time 10 times larger than the process residence time or the process controller’s closed-loop time constant or arrest time to minimize interactions between the feed flow loop manipulated by the VPC and the affected process loop. However, this can lead to too slow a recovery during abnormal operation, leading to running out of valve in the process loop. To prevent this,
proportional action is added with gain scheduling to the VPC. An easier and more
comprehensive fix is DMS, which does not require retuning. The VPC is simply tuned
for the abnormal condition and move suppression is added in the direction of
optimization. Using external reset feedback can also prevent oscillations from backlash
in the process loop valve from being translated to oscillations in the optimized set point
(e.g., feed flow).
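A minimal sketch of an integral-only VPC with directional move suppression follows; the set-point and rate-limit values are hypothetical, and a real implementation would add output limits and mode handling:

```python
class ValvePositionController:
    """Integral-only valve position controller (VPC) with directional
    move suppression (DMS): a rate limit applies only to moves in the
    optimization direction, so recovery from an abnormal condition
    (running out of valve) is fast. A sketch with hypothetical values."""
    def __init__(self, Ti, dt, vpc_sp=80.0, out0=50.0,
                 opt_rate_limit=0.1, opt_direction=+1):
        self.Ti, self.dt = Ti, dt
        self.vpc_sp = vpc_sp       # maximum desirable throttle position, %
        self.out = out0            # optimized set point (e.g., feed flow SP)
        self.opt_rate = opt_rate_limit  # allowed move per execution
        self.dir = opt_direction   # +1 if optimization increases the output

    def update(self, valve_pos_pct):
        move = (self.dt / self.Ti) * (self.vpc_sp - valve_pos_pct) * self.dir
        if move * self.dir > 0:    # optimization direction: suppress
            move = self.dir * min(abs(move), self.opt_rate)
        self.out += move           # getaway direction is unsuppressed
        return self.out
```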
Figure 18-2 shows how a VPC can be used to minimize compressor pressure by pushing
the furthest open user flow control valve (reactor feed valve) to its maximum throttle
position. This strategy can also be used to save energy by minimizing boiler pressure
and maximizing chiller temperature.
Also shown in Figure 18-2 is the computation of a derivative of a key process variable
that is in this case compressor suction flow. A high rate of change indicates a potential
or actual crossing of the surge curve. Preemptive action is taken by an open-loop backup
to quickly open the surge valve and keep it open until the system has stabilized before
returning manipulation of surge valve to the PID. A similar strategy is used to prevent
violation of environmental pH limits in the optimization of a reagent flow. Missing from
Figure 18-2 is a feedforward of reactor feed flows to the output of the surge controller.
This additional capability is particularly important to prevent disruption of other users
and compressor surge upon shutdown of a unit operation (reactor) and consequential
rapid closure of a user valve.
Computing a PV rate of change can be simply done by using a dead-time block to create
an old PV. This rate of change can be used for control of the slope of a batch or start-up
profile. A future PV by multiplication of the PV rate of change by the dead time and
addition to current PV can be used for prediction of batch end points and in trend plots
to increase understanding of where a process is going.
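The dead-time-block technique for the rate of change and future PV can be sketched in a few lines (a simple illustration; a real implementation would also filter noise):

```python
from collections import deque

def make_rate_and_future_pv(dead_time_s, dt_s):
    """Rate of change from an 'old PV' created by a dead-time block, plus a
    future PV projected one dead time ahead (sketch)."""
    n = max(1, round(dead_time_s / dt_s))
    old = deque([None] * n)

    def update(pv):
        old.append(pv)
        pv_old = old.popleft()
        if pv_old is None:                     # delay line not yet full
            return 0.0, pv
        rate = (pv - pv_old) / (n * dt_s)      # PV rate of change
        future_pv = pv + rate * (n * dt_s)     # one dead time ahead
        return rate, future_pv

    return update
```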
Maximizing Efficiency
Wherever there is an adjustable utility source, there is an opportunity to use a VPC to
make the use of the utility more efficient. The valve position of the process PID that is a
user of the utility is the PV for the VPC. The process PID output is used as an implied
valve position. The set point (SP) of the VPC is a maximum desirable throttle
(optimum) position of the process PID valve, which is a constraint to less energy use.
The VPC output is the cascade SP of a utility PID. The most frequent configuration
involves selecting the furthest open valve of process loops setting the utility flow to a
unit operation. The VPC maximizes the valve positions of process loops throttling
utility flows to minimize the pressure from boilers, compressors, and pumps; maximize
the temperature from cooling towers and chillers; and minimize the temperature from
heaters.
To reduce compressor energy use, the widest open valve of gas feed loops for a parallel
train of reactors is the PV for a VPC whose output adjusts the pressure SP of the
compressor resulting in a more optimum speed. The VPC lowers the compressor
pressure SP until one of the feed valves reaches the maximum throttle position. For
liquid feeds, the VPC would lower pump speed to reduce energy use. For a boiler, the
VPC lowers the boiler pressure SP to force the furthest open steam control valves in a
parallel train of columns or vessels to a maximum throttle position. The control valve
could be the output of a steam flow controller providing boil-up via a reboiler or a
reactor temperature controller providing heat-up via a jacket. For a chiller or cooling
tower, the VPC raises the supply temperature to force the furthest open valve of a user
loop to a maximum throttle position. For heaters, the VPC lowers the supply
temperature.
For improving the efficiency of nearly identical parallel trains and loops, a high signal
selector to choose the furthest open valve can be used as the input to a single VPC.
While there is no such thing as exactly identical equipment, the dynamics can be similar
enough regardless of which valve is selected to enable using a single VPC. If the unit
operations or process loops are quite different, a VPC is dedicated to each process loop
valve. A high and low signal selector of VPC outputs is used to lower supply pressure
and raise supply temperature, respectively. These VPCs are called override controllers.
When there are different cost utilities, a VPC can maximize the use of the least
expensive utility (e.g., gas or liquid wastes). The VPC output decreases the SP of the
more expensive utility (e.g., natural gas or fuel oil) flow loop to push open the valve for
the less expensive utility. When there is a waste reagent and a purchased reagent, a VPC
can maximize the position of the waste reagent by reducing the flow SP of the
purchased reagent. For several stages of neutralization, a VPC can minimize the pH SP
of the first stage for acidic waste until the final stage pH controller reaches a minimum
throttle position. For basic waste, the VPC would maximize the first stage SP.
Maximizing Production
When the production rate of a unit operation needs to be maximized, a VPC monitoring each valve throttled by a process PID is used to increase the feed rate. Because the process loops are quite different, a VPC is dedicated to each process PID, and the lowest output of the override VPCs is used to set the feed rate.
To maximize the production rate of a reactor with process loops for pressure control,
and jacket and condenser temperature, the lowest output of a VPC is used to prevent the
vent valve, or cooling water valve to the condenser or jacket, from going too far open.
An additional override PID can use radar to monitor foam level to prevent carryover into the vent system. Alternatively, the VPC controllers can raise the reactor temperature
SP to a permissible limit increasing an exothermic reaction rate until a VPC says enough
is enough. The high SP limit prevents undesirable reactions or excessive heat release.
For a column, the feed SP is the lowest output of individual VPCs responsible for preventing flow, level, temperature, and pressure control valves from running out of valve. Additional override controllers may be added to the mix to prevent flooding by keeping the differential pressure from getting too high across a key section of the column. Alternatively, the VPC controllers can be used to lower the column pressure SP to a
permissible limit increasing distillation rate. Table 18-1 gives examples of the many
VPC opportunities.
Model Predictive Control
Model predictive control (MPC) uses incremental models of the process where the
change in a controlled (CV) or constraint variable (AV) for a change in the manipulated
(MV) or disturbance variable (DV) is predicted. The initial values of the controlled,
constraint, manipulated, and disturbance variables are set to match those in the plant
when the MPC is in the initialization mode. The MPC can be run in manual and the
trajectories monitored. While in manual, a back-calculated signal from the downstream
function blocks is used so the manipulated variables track the appropriate set points.
A simplified review of the functionality of the MPC helps provide a better understanding that is important for the later discussion of its capabilities and limitations. Figure
18-3 shows the response of a change in the controlled variable to a step change in each
of two manipulated variables at time zero. If the step change in the manipulated
variables was twice as large, the individual responses of the controlled variable would
be predicted to be twice as large. The bottom plot shows the linear combination of the
two responses. Nonlinearities and interdependencies can cause this principle of linear superposition of responses to be inaccurate.
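The superposition of step responses can be illustrated with a small numeric sketch; the step-response coefficients below are hypothetical:

```python
import numpy as np

# Hypothetical unit step-response coefficients of the controlled variable
# for each of two manipulated variables (sampled at the MPC execution rate).
S1 = np.array([0.0, 0.1, 0.3, 0.6, 0.8, 0.9, 1.0, 1.0])   # CV per unit MV1
S2 = np.array([0.0, 0.0, 0.2, 0.5, 0.9, 1.2, 1.4, 1.5])   # CV per unit MV2

def predict_cv(dmv1, dmv2):
    """Predicted CV trajectory from simultaneous step changes in MV1 and
    MV2 at time zero, by linear superposition of scaled step responses."""
    return dmv1 * S1 + dmv2 * S2

# Doubling a step doubles its predicted contribution:
print(predict_cv(2.0, 0.0))   # = 2 * S1
print(predict_cv(1.0, 1.0))   # linear combination of both responses
```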
Any errors in the modeled versus actual process response show up as an error between the predicted value and the actual value of the controlled variable, as shown in the upper plot
of Figure 18-4. A portion of this error is then used to bias the process vector as shown in
the middle plot. The MPC algorithm then calculates a series of moves in the
manipulated variables that will provide a control vector that is the mirror image of the
process vector about the set point, as shown in the bottom plot of Figure 18-4. If there
are no nonlinearities, load upsets, or model mismatch, the predicted response and its
mirror image should cancel out with the controlled variable ending up at its set point.
How quickly the controlled variable reaches set point depends on the process dead time,
time constant, move suppression, move size limit, and number of moves set for the
manipulated variables.
The first move in the manipulated variable is actually the linear summation of all the
first moves based on controlled, disturbance, and constraint variables. Only the first
move is executed because the whole algorithm is re-evaluated in the next controller execution.
As an example of an emerging MPC application, consider a continuous reactor. The process objectives in this case are to maximize production rate and minimize losses of reactant and product in the overhead system. Online first principle estimators are first developed and commissioned to provide online concentrations of the reactant in the overheads and
product in the reactor that match periodic lab samples. The MPC setup uses these
concentrations as controlled variables, the condenser temperature set point and reactant
ratio as manipulated variables, the reactor jacket and condenser coolant valve positions
as constraint variables, and the reactor temperature and feed flow as optimization
variables. The reactor temperature and reactant feed rate are maximized to maximize the
production rate, if the projected coolant valve positions are below their high limit. The
MPC setup is shown in Figure 18-5 with the relative trajectories for each pairing of a
controlled and constraint variable with a manipulated variable.
MPC models must be able to provide a reasonably accurate time response of the change
in each process output (controlled or constraint variable) for a change in each process
input (manipulated or disturbance variable). Any control loops that use these MPC
variables must be in manual while the process is tested; otherwise, it is difficult, if not impossible, to separate the response of the controller algorithms and tuning from the process. The smallest MPC that meets the process objectives should be sought, to minimize the number of loops in manual and the test time.
The first process test makes a simple step change in each MV and DV and is commonly
known as a bump test. This initial test provides an estimate of the model parameters, as
well as an evaluation of the step size and time to steady state. The parameters provide an
estimate of the condition number of the matrix that is critical for evaluating the variables
selected.
The next test is either a longer series of steps, each held for the longest time to steady state, or a pseudo random binary sequence (PRBS). The PRBS test is favored because it excites more of the frequencies of the process and, when combined with a noise model, can eliminate the effects of noise and load upsets. In a PRBS test, successive steps alternate direction as in the bump test, but the time between steps is random (a coin toss). However, one or more of the steps must be held
to steady state and the minimum time between steps, called the flip time, must be larger
than a fraction of the time lag. Theoretically, the flip time could be as small as 1/8 of the
time lag, but in practice, it has been found that industrial processes generally require flip
times larger than 1/2 of the time lag.
The number of flips and consequently the duration of the PRBS test are increased for
processes with extensive noise and unmeasured upsets. A normal PRBS test time is
about 10 times the longest time to steady state multiplied by the number of process
inputs. For example, a typical PRBS test time would be 4 hours for a process with four
manipulated variables and a maximum 98% response time (T98) of 6 minutes. For
distillation columns, the test can easily span several shifts and is susceptible to
interruption by abnormal operation. While the PRBS test can theoretically make moves
in all of the manipulated variables, PRBS tests are often broken up into individual tests
for each manipulated variable to reduce the risk from an interruption and to make the
identification process easier. If more than one manipulated variable is moved, it is
important that the moves not be correlated. The PRBS sequence is designed to provide
random step durations and uncorrelated moves to ensure the software will identify the
process rather than the operator.
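A PRBS signal with the properties described above can be sketched as follows; the step size, flip time, and duration are hypothetical, and the sketch does not enforce the requirement that at least one step be held to steady state:

```python
import random

def prbs(step_size, flip_time_s, duration_s, dt_s, seed=1):
    """PRBS test signal (sketch): at each flip-time boundary a coin toss
    decides whether the signal reverses direction, so every step change is
    a reversal and hold times are random multiples of the flip time. The
    flip time should exceed about half the process time lag."""
    rng = random.Random(seed)
    per_flip = max(1, round(flip_time_s / dt_s))
    level, out = step_size, []
    for k in range(int(duration_s / dt_s)):
        if k > 0 and k % per_flip == 0 and rng.random() < 0.5:
            level = -level                 # coin toss: reverse direction
        out.append(level)
    return out

# Example: ±2% steps, 60-s flip time (time lag ~120 s), 1-h test, 1-s scan.
signal = prbs(step_size=2.0, flip_time_s=60.0, duration_s=3600.0, dt_s=1.0)
```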
Sometimes even the most sophisticated software gets confused and can cause gross
errors in parameters and even a response in the wrong direction. The engineer should
estimate the process gain, time delay, and time lag from the simple bump test and verify
that the model direction and parameter estimates are consistent with these observations
and process fundamentals. The manually estimated values can be used when the
software has failed to find a model that fits the data. The rule is “if you can see the
model, and it makes sense, use it.”
Real-Time Optimization
If a simple maximization of feed or minimization of utility or reagent flow is needed, a
controlled variable that is the set point of the manipulated flow is added to the matrix.
The set point of the flow loop is ramped toward its limit until there is a projected
violation of a constraint or an excessive error of a controlled variable. While the primary
implementation has been for flow, it can also be used for the maximization of other
manipulated variables.
It is important to realize that the optimum always lies at the intersection of constraints.
This can be best visualized by looking at the plot of the lines for the minimum and
maximum values of controlled, constraint, and manipulated variables plotted on a
common axis of the manipulated variables as shown in Figure 18-6.
If the best targets for controlled variables have a fixed value or if the same intersection
is always the optimum one in Figure 18-6, the targets can be manually set based on the
process knowledge gained from development and operation of the MPC and real-time
optimization (RTO). In this case, the linear program (LP) or RTO can be run in the
advisory mode. If the optimum targets of controlled variables move to different
intersections based on costs, price, or product mix, then a LP can continuously find the
new targets.
If the lines for the variables plotted in Figure 18-6 shift from changes in process or
equipment conditions and cause the location of intersections to vary, then a high-fidelity
process model is needed to find the optimum. Steady-state simulations are run for
continuous processes. The model is first reconciled to better match the plant by solving
for model parameters, such as heat transfer coefficients and column tray efficiencies.
The model is then converged to find the optimums. When the plant is not at a steady state, this procedure is repeated several times and the results are averaged. For batch operations, and where kinetics and unsteady operation are important, dynamic models
whose model parameters have been adapted by the use of an MPC and a virtual plant (as
exemplified for a bioreactor in Figure 18-7) can be run faster than real time to find
optimums.
Any controlled variable whose set point should be optimized is a prime candidate for an
MPC, because the MPC excels at responding to set-point changes and handling
constraints and interactions. The best results of real-time optimization are achieved in
multivariable control when the set points are sent to an MPC rather than PID controllers.
Capabilities and Limitations
All model-based control is based on a simplification of the process dynamics. Figure 18-8, a block diagram of a loop with a more realistic understanding of the model parameters, reveals the potential problems in the implementation of adaptive and model predictive control.
The process gain is really an open-loop or static gain that is the product of the gains
associated with the manipulated variable, the process variable, and controlled variable.
For a manipulated variable that is a control valve, the gain is the slope of the installed
characteristic at a given valve position for a perfect valve. Valve deadband from backlash and resolution limits from stiction reduce the gain to a degree that is dependent on the direction and size of the change in controller output. A slip larger than the stick will increase the valve gain. Deadband will result in a continuous cycle in integrating processes, such as level, and in cascade control systems where both controllers have integral action. Stick-slip will cause a limit cycle in all control systems. Excessive deadband and stick-slip have been the primary cause of the failure of adaptive controllers.
For the more important process variables, such as temperature and composition, the
process gain is a nonlinear function of the ratio of the manipulated flow to the feed flow
and is thus inversely proportional to feed flow. Because control algorithms use inputs
and outputs in percent, the open-loop gain is also directly and inversely proportional to
the manipulated and controlled variable spans, respectively.
The commonly used term process time constant is essentially the largest time constant in the loop (the open-loop time constant); it does not necessarily have to be in the process but can be in the valve, measurement, or controller rather than in the process. Control valve
time constants from large actuators are extremely difficult to predict and depend on the
size of the change in controller output. Process time constants may be interactive lags
that depend on differences in temperature and composition or are derived from
residence times that are inversely proportional to throughput. Temperature sensor and
electrode time constants are greatly affected by velocities, process conditions, sensor
location, and sensor construction.
The commonly used term process dead time is really a total loop dead time that is the
sum of all the pure delays from the prestroke dead time of the actuator, the dead time
from valve backlash and stiction, process and sensor transportation delays that are
inversely proportional to flow, unpredictable process delays from non-ideal mixing, and
the execution time of digital devices and algorithms. All time constants smaller than the
largest time constant add an equivalent dead time as a portion of the small time constant
that gets larger as its size gets smaller compared to the largest time constant.
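One common way to quantify this lumping of small time constants into dead time is Skogestad's half rule, used here purely as an illustration (it is not necessarily the approximation intended by the author): half of the second-largest time constant, plus all smaller time constants, is added to the pure delays, while the largest time constant is retained as the dominant lag.

```python
def effective_dead_time(pure_delays, time_constants):
    """Approximate total loop dead time (sketch) using Skogestad's 'half
    rule': half of the second-largest time constant plus all smaller time
    constants are lumped into the dead time; the largest time constant is
    kept as the loop's dominant lag."""
    taus = sorted(time_constants, reverse=True)
    dead_time = sum(pure_delays)
    if len(taus) > 1:
        dead_time += 0.5 * taus[1] + sum(taus[2:])
    dominant_lag = taus[0] if taus else 0.0
    return dead_time, dominant_lag

# Example: 2 s of transport/actuator delays; lags of 100, 8, 3, and 1 s.
print(effective_dead_time([1.0, 1.0], [100.0, 8.0, 3.0, 1.0]))  # (10.0, 100.0)
```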
Adaptive controllers require that the process be excited by a known change to identify
the process gain, dead time, and time constant. When the controller is in manual,
changes in valve position made by the operator trigger the identification of the process
model. When the controller is in automatic, small pulses automatically injected into the
controller output or changes in the set point initiate the identification. The models reportedly developed from closed-loop control operation without any known excitation are really a combination of the process and the controller. Studies have shown that it is not
feasible to reliably extract the complete process model from the combined model of the
controller and process for unknown disturbances and quiescent operation.
Unless the process exhibits or mimics an integrating or runaway response, an adaptive
controller whose model is identified from excitations will wait for the time to steady
state to find the process gain, dead time, and time constant. For a self-regulating
process, which is a process that will go to steady state, the time to reach steady state is
the total loop dead time plus 4 time constants. For processes with a large time constant,
the time required for adaptation is long. In the case where the time constant is much
larger than the dead time, specifying the process as a near-integrator enables the
identification to be completed in about 4 dead times, which could easily be an order of
magnitude faster. This is particularly important for batch operations because there is
often no steady state.
Pattern recognition controllers must wait for several damped oscillations, each of which
is at least 4 or more times the total loop dead time. Thus, for processes where the dead
time is very large, the identification is slow. Noise and limit cycles from non-ideal
valves can lead to erroneous results. Integrators and runaway responses have windows of allowable gain, where too low as well as too high a controller gain causes oscillations; this can result in a downward spiral in the adjusted PID gain because the literature only discusses oscillations caused by a PID gain that is too high.
Figure 18-9 shows how an MPC has a view of the trajectory of each controlled and
constraint variable from changes in the manipulated and disturbance variables.
In contrast, a PID controller only sees the current value and the rate of change of its
controlled variable. The addition of a dead-time compensator only extends its view of
the future to the end of the dead time. However, besides the integral (reset) mode, it has a derivative (rate) mode that provides some anticipation based on the slope of the response of the controlled variable and a proportional (gain) mode that results in immediate and
abrupt action. Feedforward and decoupling can be added, but the addition of these
signals to controller output is again based on current values, has no projected future
effect, and is designed to achieve the goal of returning a single controlled variable back
to its set point.
These fundamental differences between the MPC and the PID are a key to their relative
advantages for applications. The MPC offers performance advantages to meet process
objectives and deal with interactions. Because it also computes the trajectories of
constrained variables and disturbance variables and has built-in capabilities for move
suppression and the maximization or minimization of a manipulated variable, it is well
suited to multivariable control problems and optimization.
The PID algorithm assumes nothing about the future and is tuned to provide immediate
action based on change and rate of change and a driving action via reset to eliminate
offset. The PID offers performance advantages for runaway and nonlinear responses and
unmeasured, and hence unknown, load disturbances where the degree and speed of the
change in the process variable is the essential clue. The key to whether a loop should be left as a PID controller is the degree to which the proportional and derivative modes are needed.
For well-tuned controllers with large gain settings (> 10) or rate settings (> 60 seconds),
it may be inadvisable to move to MPC. Such settings are frequently seen in loops for
tight column and reactor temperature, pressure, and level control. PID controllers thrive
on the smooth and gradual response of a large time constant (low integrator gain) and
can achieve unmeasured load disturbance rejection that is hard to duplicate.
The analog-to-digital converter (A/D) chatter and resolution limits of large scale ranges of temperature inputs brought in through distributed control system (DCS) cards, rather than via dedicated smart transmitters, severely reduce the amount of rate action that a PID can use without creating valve dither. The low-frequency noise from the scatter of
an analyzer reading also prohibits the full use of rate action. Process variable filters can
help if judiciously set, based on the DCS module execution time and the analysis update
time. An MPC is less sensitive to measurement noise and sensor resolution because it
looks at the error over a time horizon and does not compute a derivative.
Valve deadband (backlash) and resolution (stick-slip) are a problem for both the PID and the MPC. In the MPC, an increase in the minimum move size limit to just less than the resolution will help reduce the dead time from valve deadband and resolution but will not eliminate the limit cycles.
In general, there is a tradeoff between performance (the minimum peak and integrated
error in the controlled variable) and robustness (the maximum allowable unknown
change in the process gain, dead time, or time constant). Higher performance
corresponds to lower robustness.
An increase in the process dead time of 50% can cause damped oscillations in an
aggressively tuned PID, but a decrease in process dead time of 50% can cause growing
oscillations in an MPC or PID with dead time compensation (PIDx) with default tuning
that initially has a better performance than the PID.
An MPC or PIDx is more sensitive to a decrease than an increase in dead time. A
decrease in process dead time rapidly leads to growing oscillations that are much faster
than the ultimate period. An increase in dead time shows up as much slower oscillations
with a superimposed high-frequency limit cycle. A PID goes unstable for an increase in
process dead time. A decrease in process dead time for a PID just translates to lost
opportunity associated with a greater than optimal controller reset time and a smaller
than optimal controller gain. An enhanced PID can handle increases in dead time from
analyzers.
For a single controlled and manipulated variable, MPC shows the greatest improvement
over PID for a process where the dynamics are fixed and move suppression is greatly
reduced. However, MPC is more sensitive to an unknown change in dead time.
For measured disturbances, the MPC generally has a better dynamic disturbance model
than a PID controller with feedforward control, primarily because of the difficulty in
properly identifying the feedforward lead-lag times. Often the feedforward dynamic
compensation for PID controllers is omitted or tuned by trial and error.
For constraints, the MPC anticipates a future violation by looking at the final value of a
trajectory versus the limit. MPC can simultaneously handle multiple constraints. PID
override controllers, however, handle constraints one at a time through the low or high
signal selection of PID controller outputs.
For interactions, the MPC is much better than a PID controller. The addition of
decoupling to a PID is generally just based on steady-state gains. However, the benefits
of the MPC over detuned or decoupled PID controllers deteriorate as the condition
number of the matrix increases.
The steady-state gains in the 2 x 2 matrix in Equation 18-1 show that each manipulated
variable has about the same effect on the controlled variables. The inputs to the process
are linearly related. The determinant is nearly zero and provides a warning that MPC is
not a viable solution.
(18-1)
The steady-state gains of a controlled variable for each manipulated variable in Equation
18-2 are not equal but exhibit a ratio. The outputs of the process are linearly related.
Such systems are called stiff because the controlled variables move together. The system
lacks the flexibility to move them independently to achieve their respective set points.
Again, the determinant is nearly zero and provides a warning that MPC is not a viable
solution.
(18-2)
The steady-state gains for the first manipulated variable (MV1) are several orders of
magnitude larger than for the second manipulated variable (MV2) in Equation 18-3.
Essentially, there is just one manipulated variable MV1 because the effect of MV2 is
negligible in comparison. Unfortunately, the determinant is 0.9, which is far enough
above zero to provide a false sense of security. The condition number of the matrix
provides a more universal indication of a potential problem than either the determinant
or relative gain matrix. A higher condition number indicates a greater problem. For
Equation 18-3, the condition number exceeds 10,000.
(18-3)
The condition number should be calculated by the software and reviewed before an
MPC is commissioned. The matrix can be visually inspected for indications of possible
MPC performance problems by looking for gains in a column with the same sign and
size, gains that differ by an order of magnitude or more, and gains in a row that are a
ratio of gains in another row. Very high process gains may cause the change in the MV
to be too close to the deadband and resolution limits of a control valve and very low
process gains may cause an MV to hit its output limit.
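This screening can be automated with a few lines of linear algebra. The matrices below are hypothetical stand-ins for Equations 18-1 through 18-3, constructed only to exhibit the patterns described in the text; the book's actual values are not reproduced here:

```python
import numpy as np

# Hypothetical 2x2 steady-state gain matrices (rows = CVs, columns = MVs).
inputs_related = np.array([[1.00, 1.01],    # columns nearly identical:
                           [1.00, 1.00]])   # MVs have about the same effect
outputs_related = np.array([[1.0, 2.000],   # rows are a ratio: a stiff
                            [2.0, 4.001]])  # system, CVs move together
mv2_negligible = np.array([[100.0, 0.009],  # MV1 gains orders of magnitude
                           [1.0, 0.009]])   # larger; det ~0.9 is misleading

for name, G in [("inputs related", inputs_related),
                ("outputs related", outputs_related),
                ("MV2 negligible", mv2_negligible)]:
    print(f"{name:>16}: det = {np.linalg.det(G):7.3f}, "
          f"condition number = {np.linalg.cond(G):9.1f}")
```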
Costs and Benefits
The cost of MPC software varies from $10K to $100K, depending on the number of
manipulated variables. The cost of high-fidelity process modeling software for real-time
optimization varies from $20K to $200K. The installed cost of MPC and RTO varies
from about 2 to 20 times the cost of the software depending on the condition of the plant
and the knowledge and complexity of the process and its disturbances.
Process tests and model identification reveal measurements that are missing or non-repeatable and control valves that are sloppy or improperly sized. Simple preliminary bump tests should be conducted to provide project estimates of the cost of upgrades and testing time.
Often a plant is running beyond nameplate capacity or at conditions and products never
intended. An MPC or RTO applied to a plant that is continually rocked by unmeasured disturbances, or where abnormal situations are the norm, requires a huge amount of time for testing and commissioning.
The proper use of advanced control can reduce the variability in a key concentration or
quality measurement. A reduction in variability is essential to the minimization of
product that is downgraded, recycled, returned, or scrapped. Less obvious is the product
given away in terms of extra purity or quantity in anticipation of variability. Other
benefits from a reduction in variability often manifest themselves as a minimization of
fuel, reactant, reagent, reflux, steam, coolant, recycle, or purge flow and a more
optimum choice of set points. Significant benefits are derived from the improvements
made to the basic regulatory control system identified during testing. New benefits in
the area of abnormal situation management are being explored from monitoring the
adaptive control models as indicators of changes in the instrumentation, valve, and
equipment.
The benefits for MPC generally range from 1% to 4% of the cost of goods for
continuous processes with an average of around 2%. The benefits of MPC for fed-batch
processes are potentially 10 times larger because the manipulated variables are constant
or sequenced despite varying conditions as the batch progresses. Other advanced control technologies deliver significantly smaller benefits on average. RTO has had the most spectacular failures but also the greatest future potential.
MPC Best Practices
The following list of best practices is offered as guidance and is not intended to cover all
aspects.
1. Establish a user company infrastructure to make the benefits consistent.
2. Develop corporate standards to historize and report key performance indicators
(KPI).
3. Screen and eliminate outliers and bad inputs and review results before reporting
KPI.
4. Train operators in the use and value of the MPC for normal and abnormal
operation.
5. The installation must be maintainable without the developer.
6. Improve field instrumentation and valves and tune regulatory controllers before
MPC pre-tests.
7. Eliminate oscillations from overly aggressive PID controller tuning that excite
nonlinearities.
8. Realize that changes even in PID loops not manipulated by the MPC can affect
the MPC models.
9. Use secondary flow loops so that MPC manipulates a flow set point rather than a
valve position to isolate valve nonlinearities from the MPC.
10. Use secondary jacket/coil temperature loops so MPC manipulates a temperature
set point rather than a coolant or steam flow to isolate jacket/coil nonlinearities
from MPC.
11. Use flow ratio control in the regulatory system so that the MPC corrects a flow
ratio instead of using flow as a disturbance variable in the MPC.
12. Generally, avoid replacing regulatory loops with MPC if the PID execution time
must be less than 1 second or the PID gain is greater than 10 to deal with
unmeasured disturbances.
13. Use inferential measurements (e.g., the linear dynamic estimators from Chapter
17) to provide a faster, smoother, and more reliable composition measurement.
14. Bias the inferential measurement prediction by a fraction of the error between
the inferential measurement and an analyzer after synchronization and
eliminating noise and outliers.
15. Eliminate data historian compression and filters to get raw data.
16. Conduct tests near constraint limits as well as at normal operating conditions.
17. Use pre-tests (bump tests) to get step sizes and time horizons.
18. Step size should be at least five times deadband, stick-slip, resolution limit, and
noise.
19. Get at least 20 data points in the shortest time horizon.
20. Use a near-integrator approximation to shorten the time horizon if optimization is not affected.
21. Get meaningful significant movement in the manipulated variables at varying
step durations.
22. Make sure steady-state process gains are accurate for analysis, prediction, and
optimization.
23. Use engineering knowledge and available models or simulators to confirm or
modify gains.
24. Combine reaction and separation into the same MPC when the separation section
limits reaction system performance.
25. Use singular value decomposition (SVD) and linear program (LP) cost
calculation tools to build and implement large MPC applications.
26. Reformulate the MPC to eliminate interrelationships between process input
variables as seen by similar process gains in a column of the matrix.
27. Reformulate the MPC to eliminate interrelationships between process output
variables as seen by process gains in one column of the matrix having the same
ratio to gains in another column.
28. Make sure the MPC is in sync and consistent with targets from planning and
scheduling people.
29. For small changes in dynamics, modify gains and dead times online.
30. For major changes in dynamics, retest using automated testing software.
Further Information
Kane, Les A., ed. Advanced Process Control and Information Systems for the Process
Industries. Houston, TX: Gulf Publishing, 1999.
McMillan, Gregory K. Advances in Reactor Measurement and Control. Research
Triangle Park, NC: ISA (International Society of Automation), 2015.
———. Good Tuning: A Pocket Guide. 4th ed. Research Triangle Park, NC: ISA (International Society of Automation), 2015.
McMillan, Gregory K., and Robert A. Cameron. Models Unleashed: Applications of the
Virtual Plant and Model Predictive Control – A Pocket Guide. Research Triangle
Park, NC: ISA (International Society of Automation), 2004.
About the Author
Gregory K. McMillan is a retired Senior Fellow from Solutia Inc. and an ISA Fellow.
He received the ISA Kermit Fischer Environmental Award for pH control in 1991 and
Control magazine’s Engineer of the Year award for the process industry in 1994. He was
inducted into Control magazine’s Process Automation Hall of Fame in 2001; honored as
one of InTech magazine’s most influential innovators in 2003; and presented with the
ISA Life Achievement Award in 2010. McMillan earned a BS in engineering physics
from Kansas University in 1969 and an MS in electrical engineering (control theory) from Missouri University of Science and Technology in 1976.
VI
Operator Interaction
Operator Training
Operator training continues to increase in importance as systems become more
complex, and the operator is expected to do more and more. It sometimes seems that the more we automate, the more important it is for the operator to understand what to do when the automation system does not function as designed. This topic ties closely with
the modeling topic because simulated plants allow more rigorous operator training.
Operator Interface: Human-Machine Interface (HMI)
Software
Operator interfaces, data management, and other types of software are now basic topics
for automation professionals, and they fit in this category better than anywhere else.
Packaged automation software that is open with respect to Open Platform
Communications (OPC) covers a significant portion of the needs of automation
professionals; however, custom software is still needed in some cases. That custom
software must be carefully designed and programmed to perform well and be easily
maintained.
Alarm Management
Alarm management has become a very important topic in the safety area. The press
continues to report plant incidents caused by poorly designed alarms, alarm flooding,
and alarms being bypassed. Every automation professional should understand the basic
concepts of this topic.
19
Operator Training
By Bridget A. Fitzpatrick
Introduction
Advances in process control and safety system technology enable dramatic
improvements in process stability and overall performance. With fewer upsets, operators
tend to make fewer adjustments to the process. Additionally, as the overall level of
automation increases, there is less human intervention required. With less human
intervention, there is less “hands on” learning.
However, even the best technology fails to capture the operators’ knowledge of the real-time constraints and complex interactions between systems. The console operator
remains integral to safe, efficient, and cost-effective operation. Operator training
manages operator skills, knowledge, and behaviors.
Optimizing overall operations performance is a complex undertaking that includes a
review of process design, safety system design, the level of automation, staffing design,
shift schedules, and individual job design across the entire operations team. This chapter
focuses on control room operator training.
Evolution of Training
In early control rooms, panel-board operator training was traditionally accomplished
through a progression from field operator to control room operator. This progression
was commonly accomplished through on-the-job training (OJT) where an experienced
operator actively mentored the student worker. As process and automation technology
has advanced, these early methods have been augmented with a mix of training
methods. These methods and the advantages and disadvantages of each will be
discussed in the following sections.
The Training Process
A successful training program is based on understanding that training is not a single-pass program but an ongoing process. This is complicated by continual changes in both
human resources and the process itself. A successful program requires support for initial
training and qualification, training on changes to the process and related systems, and
periodic refresher training. Developing and maintaining operator skills, knowledge, and
behavior is central to operational excellence.
Training Process Steps
The key steps of the training process include setting learning objectives (functional
requirements), training design, materials and methods testing, metrics selection, training
delivery, assessment, and continual improvement.
Learning objectives define the expected learning outcomes or identified needs for
changes to student skills, knowledge, or behaviors. This applies for each segment of
training, as well as for the overall training program.
Training design includes design work to define the best training delivery methods to be
used to meet the functional requirements, schedule, and budget.
Materials and methods testing refines the design and includes a “dry run” testing phase
to ensure that materials are effective at meeting functional requirements on a small
scale. For a new program, testing all major methods on a small scale is recommended to
ensure that the technologies in use and the training staff are executing successfully.
Metrics selection is important to ensure that a baseline of student performance is
available prior to training and to ensure that all personnel have a clear understanding of
expectations.
Training delivery is the execution phase of the training, which should include continual
feedback, such as instructor-to-student, student-to-instructor, and peer feedback in the
student and instructor populations.
The assessment phase includes evaluating the success of both the student learning and
the training program. Student learning assessment can be formal or informal. It can be
performed internally or by a third party. The proper method depends on the nature of the
subject. Assessment of the training program requires feedback from the participants on
training relevance, style, presentation, and perceived learning. This feedback can be
gathered by anonymous post-training questionnaires or follow-up discussions.
For the training program’s continuous improvement, it is recommended to include an
improvement phase to refine and adjust content and work processes. If training
materials are largely electronic or prepared in small batches, continual improvement is
not hampered by cost concerns.
Role of the Trainer
Depending on the scope of the objectives, staffing may range from limited part-time
support to multiple dedicated trainers.
As with any role, there are skills, knowledge, and behaviors required for the trainer role.
Experienced operators commonly progress into the training role. This progression
leverages the operations skills and knowledge and may include familiarity with the
student population. For training success, it is also important to consider other skills,
including presentation and public speaking, listening skills, meeting facilitation, records
management, computer skills, and coaching/mentoring skills.
If the training role does not include the requirement to remain a certified operator, then
active efforts to maintain the trainer's knowledge are important. It is common
to employ a “train the trainer” approach, where new process operations or changes to
existing process operations are explained to the trainer and then the trainer delivers this
training to the operators. In this model, the learning skills of the trainer are critical.
Training Topics
Training topics for operators span a broad range, including:
• Safety training – Perhaps the most ubiquitous training content in the industry is
that of safety training. Much of this is required through regulation, and the
content is continually assessed and updated. Training requirements are also set by
a variety of automation standards, including ISA-84, ISA-18.2, and ISA-101. The
format is generally a mix of classroom, computer-based training (CBT), and
hands-on (e.g., live fire training) methods.
• Generic process equipment training – Traditionally, operators began their
career as field helpers and progressed into field operators and from there into a
control room operator role. As such, they were trained hands-on to understand the
operations and, to some extent, the maintenance of different process equipment.
This commonly would include pumps, compressors, heat exchangers, and so on.
As the efficiency with which these types of equipment were operated and the
underlying complexity of these devices increased, it became more critical that
staff members have a deeper understanding of how to operate and troubleshoot
these basic unit operations. Training on this topic helps develop and maintain
troubleshooting skills. Formal training may be more important in cases where little or
no field experience is developed.
• Instrumentation training – The evolution of instrumentation technology has also
increased the need for staff to understand all common types of instrumentation
for improved operations performance.
• Control training – Training on common control theory is important for
operators. This includes the basic concepts of proportional-integral-derivative
(PID) control, ratio control, and cascade control; a minimal PID sketch follows
this list. Overview training on advanced process control (APC) concepts and
interaction is also important when APC is used.
• System training – Training on the control system is critical because this is the
main means of interacting with the process. This includes general system training,
training specifically on the alarm system (see requirements in ISA-18.2), and
training on the standard and custom displays in the human-machine interface
(HMI). Where APC is used, specific interactions on the HMI and any other
software packages that manage the HMI are also important considerations.
• Unit-specific training – Unit-specific training will include details on the general
process design (i.e., the process flow diagram [PFD], mass, and energy balance)
and the related integrity operating windows for key variables. Training will
include a review of related safety studies. Specific training will be delivered for
operator response to alarms, including specific expectations for all critical alarms.
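To make the control training item concrete, the following is a minimal sketch of a discrete PID loop in Python. The gains, sample time, and the crude process response are illustrative assumptions, not values from this chapter.

```python
# Minimal discrete PID controller sketch; gains and timing are assumed
# for illustration only.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Output is the sum of proportional, integral, and derivative terms.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a crude first-order process toward a set point of 50.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=1.0)
pv = 0.0
for _ in range(20):
    out = pid.update(setpoint=50.0, measurement=pv)
    pv += 0.1 * (out - pv)  # simplistic process response, illustration only
```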
Nature of Adult Learning
There is an old adage, generally attributed to either Confucius or Xunzi: "What I hear, I
forget. What I see, I remember. What I do, I understand." Some versions of the adage
include percentage effectiveness figures for reading, seeing, doing, and teaching. While
the adage rings true, it is not based on research and it misrepresents the complexity of
learning.
Knowles suggested four principles that apply to adult learning:
1. Adults need to be involved in the planning and evaluation of their instruction.
2. Experience (including mistakes) provides the basis for the learning activities.
3. Adults are most interested in learning subjects that have immediate relevance
and impact to their job or personal life.
4. Adult learning is problem-centered rather than content-oriented (Kearsley 2010).
Key learning skills to be supported include creativity, curiosity, collaboration,
communication, and problem solving (Jenkins 2009). These skills also apply to effective
operations.
Given the impact of student-learning skills, multiple approaches may be required in
order to engage and support the student base.
Training Delivery Methods
To be most effective, training delivery methods must be tailored to align with the
instructor, student, content, and the related learning targets. In selecting methods, it is
important to understand that while the subject matter may lend itself conceptually to one
or two methods, the student population will respond to these methods across a spectrum
of performance. As such, a key consideration for training effectiveness includes
individual student learning assessment. Often the most overlooked aspect of training
success is the aptitude, continued assessment, and development of the instructor.
For operator training, it is important to note that the common and best practice training
methods have evolved over time as both available technology and the nature of
operations have changed. There are also related pragmatic initial and life-cycle costs for
the training materials and labor.
Training methods should be dictated by an assessment of the best method to meet
training functional requirements. The training delivery methods considered here include:
• “Book” learning—self study
• On-the-job training (OJT)
• Formal apprenticeship
• External certifications/degrees
• Classroom training (instructor-led)
• Computer-based training (CBT)
• Scenarios (paper-based)
• Simulation
• Expert systems
“Book” Learning—Self-Study
Books or reference materials, either hard copy or electronic, can be provided on a variety
of topics. This may include:
• Process technology manuals with engineering details, including design ranges
• Operating manuals with detailed discussions of operating windows with both
target and normal operating limits
• Equipment manuals and related engineering drawings
• Standard operating procedures
• Emergency operating procedures
In self-study, students review the material in a self-paced progression. This method is
likely included in most training programs and, at a minimum, provides familiarity with
reference material for later use.
Learning assessment is commonly through practice tests during training, followed by a
test or instructor interviews after completing the self-study.
Self-study is more effective for general awareness training or incremental learning on a
topic where the student has already demonstrated proficiency.
Advantages:
1. Training is self-paced, allowing the student to focus on areas of weakness and
proceed through areas of established knowledge.
2. Familiarity with reference material that may later be used for troubleshooting can
be useful.

Disadvantages:
1. Retention may be limited if the only interaction and reinforcement is reading the
information.
2. Materials can be expensive to generate and maintain over the life cycle of the
facility.
On-the-Job Training
On-the-job training (OJT) allows for two general types of training:
1. Real-time mentoring of a student by a more experienced operator. This allows
for a student-paced initial training method. OJT provides specific task instruction
by verbal review of requirements, demonstration, and discussion, followed by
supervised execution by the student.
2. Training on changes in operations, where the change is explained and potentially
demonstrated to all control room operators before commissioning. Learning
assessment should be included.
For initial operator training, it is unlikely that the student will observe a broad range of
abnormal or emergency conditions. It is also unlikely that the mentor/instructor will
have time to dissect and explain activities during upsets/abnormal conditions.
Advantages:
1. Training is customized for the unit and for specific student needs.
2. High attention to student-learning assessment.
3. Improved instructor performance; teaching others solidifies their understanding of
the content.

Disadvantages:
1. Higher labor cost for one-on-one or one-to-few instruction.
2. Limited number of instructors ideally suited to teaching. Quality of the training is
dependent upon the instructor.
3. Training across all modes of normal operation may not be feasible.
4. Training for abnormal and emergency conditions may be impossible.
5. If OJT is on shift, normal workload will interrupt the training and extend the
schedule.
6. If OJT is on shift, the instructor may be distracted, resulting in upset to
operations.
Formal Apprenticeship
A formal apprenticeship is similar to OJT but it has a more defined curriculum and
certification. An apprentice must be able to demonstrate mastery of all required skills
and knowledge before being allowed to graduate to journeyman status. This is
documented through testing and certification processes. Journeymen provide the on-the-job training, while adult education centers and community colleges typically provide the
classroom training. In many locales, formal apprenticeship programs are regulated by
governmental agencies that also set standards and provide services.
Advantages:
1. Certified skill levels for the student.
2. Motivated students that complete the program.
3. Effective method to establish basic knowledge in process equipment and
instrumentation and control skills.

Disadvantages:
1. Where apprentices are trained outside of the organization, unit-specific details are
limited.
2. Training is generic to the industry unless managed or influenced by the operating
company directly.
3. Entry-level resources may be more expensive.
External Programs with Certification
External training programs are common. These include 1- to 2-year degree programs.
In cases where the industry has participated in curriculum development, these programs
have been quite effective. The curriculum is commonly related to either instrumentation
and control or generic process knowledge for a given industry.
Advantages:
1. Certified skill level for the student.
2. Motivated students that complete the program.
3. External programs can be an effective way to establish basic knowledge in
process equipment and instrumentation and control skills.
4. Cost structure may be attractive where expertise in training these topics is not
maintained at the site.

Disadvantages:
1. Training is generic to the industry unless managed by the operating company
directly.
2. Entry-level resources may be more expensive.
Classroom Training (Instructor-Led)
Classroom training is commonly used in conjunction with OJT. The classroom setting
has limited interruptions and fewer distractions. Methods in the classroom environment
include lecture and varied types of discussions.
Lectures are effective for overview training or improving behavior. Lectures may
include varied multimedia materials to enhance student attention. This format allows for
training in limited detail to a large group in a short period of time. Students can be
overwhelmed with excessive information in lecture format without reinforcement with
more interactive methods.
Q&A and group discussion sessions engage the students more directly, allowing for
clarifying questions, which enhance learning and keep the students focused. Student
interaction validates learning and provides insight to the trainer on areas that require
additional lecture or alternate training methods.
Discussions are more effective at training for procedural or complex topics because they
allow the students to process the training material incrementally with clarifications and
insights from other students. A discussion of scenarios is included below.
Advantages:
1. Training is customized for each job position.
2. Peer-student interaction improves understanding.
3. Learning is actively observed.
4. Non-instructor perspectives are shared, which can enhance learning.
5. Preassigned roles for follow-up discussion segments can improve student
attention.

Disadvantages:
1. A higher student-to-instructor ratio can lower the ability of the instructor to
assess understanding.
2. Students are largely passive for lectures. Attention may be hard to sustain.
3. A variety of delivery methods will be required to ensure student attentiveness.
4. Can be costly to generate and maintain large training programs.
5. Quality of the training is dependent on the instructor.
Computer-Based Training
Computer-based training (CBT) includes the use of technology to provide training
materials electronically. This may include a lecture delivered from a location remote to
the students. Alternately, electronic materials may be provided for self-paced study.
Advanced CBT methods enable interaction with the instructor and other students (this is
often done asynchronously).
Advantages:
1. Training is delivered consistently.
2. Learning assessment is consistent.
3. Software methods can detect areas of student weakness and provide additional
content in these areas.

Disadvantages:
1. Asynchronous interaction can lower the ability of the instructor to change
methods or correct student misunderstanding.
2. Effectiveness is limited by the performance of the underlying software and
related hardware.
Scenarios (Paper-Based)
Facilitated discussions of possible “What-If” scenarios or reviews of actual facility
upsets are an effective method to refine the required skills, knowledge, and behaviors.
This is an extension of classroom training.
Review of actual process upsets can be an effective method of development or training
on lessons learned. Stepping students through the troubleshooting and decision points
helps refine their understanding of both the process and general troubleshooting skills.
Reviewing the upsets offline may identify creative causes and innovative workarounds.
Scenarios can be discussed in a variety of formats to optimize engagement and specific
student training targets. Options include:
• Discuss the event to generate a consensus best response or series of responses.
Review related operating procedures for accuracy and completeness as a team.
• Review actual incidents with a replay of actual facility process information,
alarms, and operator actions. Review the key actions and critique response.
Advantages:
1. Customized to match facility areas of concern (e.g., loss of power, instrument air,
or steam).
2. Institutional knowledge of the best response to the scenario is captured and
refined.
3. Well-facilitated sessions result in broad student engagement.
4. Abnormal conditions are reviewed in detail in a safe setting.
5. Storytelling on the active topic and related events strengthens operator
troubleshooting skills.

Disadvantages:
1. A small number of students may dominate the discussion.
2. The student group may not be effective at understanding or resolving the
scenario.
3. Students may not reach consensus.
4. The instructor must be able to respond to ideas outside the planned curriculum.
5. The team may have negative learning with incorrect conclusions.
Simulation
Simulation uses software and hardware that emulates the process and allows for the
students to monitor the process, follow procedures, and engage in near real-time
decision-making in a training environment. To be effective, the simulator must match, as
closely as practical, the physical and psychological demands of the control room.
Operator training simulators (OTSs) are effective for new operator training, refresher
training, and specific training for abnormal and emergency operations. The inclusion of
the simulator allows for validation of scenarios and evaluations of ad-hoc student
responses during the training process.
It is important to understand the levels of fidelity that are available and their uses. A
simple “tie back” model loops outputs back to inputs with some time delay, or filtering,
to achieve the simplest directional response form of simulation. This can be useful for
system checkout and for giving the operator hands-on experience with the HMI,
especially if the process has very simple dynamics. Operators often soon recognize the
limits of this type of simulation and lose confidence in its ability to respond as the real
process would.
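As a rough sketch of such a tie-back model, the loop below feeds the controller output back to the simulated process value through a first-order lag (the time delay or filtering mentioned above). The function name and time constant are assumptions for illustration, not from this chapter.

```python
# "Tie back" simulation sketch: the output is looped back to the input
# through a first-order lag, giving a simple directional response.
def tie_back_step(pv, controller_output, tau=10.0, dt=1.0):
    """Advance the simulated process value one step toward the output."""
    alpha = dt / (tau + dt)  # first-order filter coefficient
    return pv + alpha * (controller_output - pv)

pv = 25.0
for _ in range(5):
    pv = tie_back_step(pv, controller_output=60.0)  # pv drifts toward 60
```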
Higher fidelity models that include both mass and energy balances and process reaction
dynamics more closely represent the real system. When used for training in upset
conditions and emergency response, they provide operator confidence in both their
abilities and the response of the actual system under those conditions. Where models
already exist for other uses, the investment to make them useful for training may be
small compared to the benefits.
Advantages:
1. Unit-specific training.
2. Training on the process response offline with no risk of process upset.
3. Flexibility to test all student responses without detailed instructor preparation.

Disadvantages:
1. Higher cost for higher fidelity.
2. Negative training with lower-fidelity models, resulting in operators learning an
incorrect response profile.
3. Loss of operator confidence in the model with unexpected responses.
Scenarios with Expert Guidance
Scenario discussions with expert guidance can be applied to both paper- and simulator-based systems. This method requires that engineering experts provide the “correct
engineered” response to the scenario to be used as guidance for the training session. The
method requires breaking the scenario into decision points where the students make a
consensus decision and then compare their answer to the expert advice. Each decision is
discussed in detail to ensure that the instructor understands the group logic and the
students understand the advantages of the expert recommendations.
Advantages:
1. The ability for unit-specific training.
2. Training on the process response offline with no risk of process upset.
3. Expert guidance cross-checks the accuracy of the instructor and/or simulator
results.
4. Decision points in the scenarios focus the team on key learning targets.
5. Learning can be observed in the team setting.
6. Collaboration in troubleshooting improves skills and behaviors.

Disadvantages:
1. Scenario design impacts the options available in training.
2. Expert advice must be generated.
3. Students may not agree with expert recommendations.
4. Can be time intensive to maintain with changes in the facility.
Summary
The console operator remains integral to safe, efficient, and cost-effective operation.
Operator training manages operator skills, knowledge, and behaviors.
For operator training, common and best practice training methods have evolved over
time as both available technology and the nature of operations have changed. To be most
effective, training delivery methods must be tailored to meet training functional
requirements.
Further Information
ANSI/IACET 1-2018 (R9.13.17) Standard for Continuing Education and Training.
American National Standards Institute (ANSI) and the International Association
for Continuing Education and Training (IACET).
Blanchard, P. N. Training Delivery Methods. 2017. Accessed 22 March 2018.
http://www.referenceforbusiness.com/management/Tr-Z/Training-Delivery-Methods.html#ixzz4wrUDqDw3.
Carey, Benedict. How We Learn: The Surprising Truth About When, Where, and Why It
Happens. New York: Random House Publishing Group, 2014.
Driscoll, M. P. Psychology of Learning for Instruction. 3rd ed. Boston: Pearson
Education, Inc., 2005.
Gonzalez, D. C. The Art of Mental Training: A Guide to Performance Excellence.
GonzoLane Media, 2013.
Grazer, B. A. Curious Mind: The Secret to a Bigger Life. New York: Simon & Schuster,
2015.
Illeris, K. Contemporary Theories of Learning: Learning Theorists ... In Their Own
Words. New York: Taylor and Francis. Kindle Edition, 2009.
Jenkins, H. Confronting the Challenges of Participatory Culture: Media Education for
the 21st Century. The John D. and Catherine T. MacArthur Foundation Reports on
Digital Media and Learning. Cambridge, MA: The MIT Press, 2009.
Kearsley, G. Andragogy (M. Knowles). The Theory into Practice Database. 2010.
Accessed 22 March 2018. http://www.instructionaldesign.org/about.html.
Klein, G. Streetlights and Shadows: Searching for the Keys to Adaptive Decision
Making. Cambridge, MA: The MIT Press, 2009.
Knowles, M. The Adult Learner: A Neglected Species. 3rd ed. Houston, TX: Gulf
Publishing, 1984.
Leonard, D. C. Learning Theories: A to Z. Santa Barbara: Greenwood Publishing, 2002.
Pink, D. H. Drive: The Surprising Truth About What Motivates Us. New York: Penguin
Publishing Group, 2011.
Silberman, M. L., Biech, E. Active Training: A Handbook of Techniques, Designs, Case
Examples, and Tips. New Jersey: John Wiley & Sons, Inc., 2015.
Strobhar, D. A. Human Factors in Process Plant Operation. New York: Momentum
Press, 2013.
About the Author
Bridget Fitzpatrick is the process automation authority for the Automation and Control
organization within Wood. She holds an MBA in technology management from the
University of Phoenix and an SB in chemical engineering from the Massachusetts
Institute of Technology.
20
Effective Operator Interfaces
By Bill Hollifield
Introduction and History
The human-machine interface (HMI) is the collection of monitors, graphic displays,
keyboards, switches, and other technologies used by the operator to monitor and interact
with a modern control system (typically a distributed control system [DCS] or
supervisory control and data acquisition system [SCADA]). The design of the HMI
plays a vital role in determining the operator’s ability to effectively manage the process,
particularly in detecting and responding to abnormal situations. The primary issues with
modern process HMIs are the design and content of the process graphics displayed to
the operator.
As part of the changeover to digital control systems in the 1980s and 1990s, control
engineers were given a new task for which they were ill prepared. The new control
systems included the capability to display real-time process control graphics for the
operator on cathode ray tube (CRT) screens. However, the screens were blank and it was
the purchaser’s responsibility to come up with graphic depictions for the operator to use
to control the process.
Mostly for convenience, and in the absence of a better idea, the process was depicted as
a piping and instrumentation drawing or diagram (P&ID) view covered in
live numbers (Figure 20-1). Later versions added distracting 3D depictions, additional
colors, and animation. The P&ID is a process design tool that was never intended to be
used as an HMI, and such depictions are now known to be a suboptimal design for the
purposes of overall monitoring and control of a process. However, significant inertia
associated with HMI change has resulted in such depictions remaining commonplace.
Poor graphics designed over 20 years ago have often been migrated, rather than
improved, even as the underlying control systems were upgraded or replaced multiple
times.
For many years, there were no available guidelines as to what constituted a “good” HMI
for control purposes. During this time, poorly designed HMIs have been cited as
significant contributing factors to major accidents. The principles for designing effective
process graphics are now available, and many industrial companies have graphic
improvement efforts underway.
An effective HMI has many advantages, including significantly improved operator
situation awareness; increased process surveillance; better abnormal situation detection
and response; and reduced training time for new operators.
Basic Principles for an Effective HMI
This chapter provides an overview of effective practices for the creation of an improved
process control HMI. The principles apply to modern, screen-based control systems and
to any type of process (e.g., petrochemical, refining, power generation, pharmaceutical,
mining). Application of these principles will significantly improve the operator’s ability
to detect and successfully resolve abnormal situations. This chapter’s topics include:
• Appropriate and consistent use of color
• The display of information rather than raw data
• Depiction of alarms
• Use of embedded trends
• Implementation of a graphic hierarchy
• Embedded information in context
• Performance improvements from better HMIs
• The HMI development work process
Issues with Color Coding
Many existing graphics are highly colorful, and usually even a cursory review of a
system’s graphics will likely uncover many inconsistencies in the use of color. It is well
known that many people have difficulty differentiating a variety of colors and color
combinations, with red-green, yellow-green, and white-cyan as the most common.
People also do not detect color change in peripheral vision very well, and control room
consoles are often laid out with several screens spread horizontally.
To accommodate these facts, the most important and primary principle for color is:
Color is not used as the sole differentiator of an important condition or status.
Most graphics throughout the world violate this principle. Important information on the
screen should be coded redundantly via methods besides color. As an example, consider
Figure 20-2, which shows the usual red-green coding of pump status on the left, with
redundant grayscale coding incorporating brightness on the right. An object brighter
than the graphic background is ON, darker is OFF (think of a light bulb inside them). A
status word is placed next to the object. There is no mistaking these differences, even by
persons having color detection problems.
Note that the printing of this book in grayscale rather than color places additional
burdens on the reader in understanding some of these principles and actually illustrates
the need for redundant coding. Descriptions of the figures are intended to compensate
for this.
Color Palette and Principles
In order to use color effectively, a color palette should be prepared, tested in the control
room environment, then documented and used. The palette should contain a limited
number of easily distinguishable colors, and there should be consistent and specific uses
for each color. The palette is usually part of an HMI Philosophy and Style Guide
document discussed later.
Copyright © 2018. International Society of Automation (ISA). All rights reserved.
Bright colors are used to bring or draw attention to abnormal situations rather than
normal ones. Graphics depicting the operation running normally should not be covered
in brightly saturated colors, such as bright red or green pumps, equipment, valves, and
so forth. It is common to find existing graphics covered in bright red objects even when
the process is running normally, and even when red is also used as an alarm color on
those same graphics.
Gray backgrounds for graphics are preferred. This poses the least potential problems
with color combination issues. Designs that are basically color-neutral are also
preferred. When combined with modern LCD (rather than CRT) displays, this enables
the reintroduction of bright lighting to the control room. A darkened control room
promotes drowsiness, particularly for shift workers. Control rooms were darkened many
years ago, often because of severe glare and reflection issues with CRT displays using
bright colors on dark backgrounds. Those were the only possible graphic combinations
when digital control systems were originally introduced, and inertia has played a
significant role since then.
Attempts to color code a process line with its contents are usually unworkable for most
processes. A preferred method is to use consistent labeling along with alternative line
thicknesses based on a line’s importance on a particular graphic. Use labeling
judiciously; a process graphic is not an instruction manual nor is it a puzzle to be
deciphered.
Depiction of Alarms on Graphics
Alarmed conditions should stand out clearly on a process graphic. When colors are
chosen to be associated with alarms, those colors should not be used for non-alarm-related depiction purposes. Many of the traditional methods for depicting an alarm are
ineffective at being visually prominent and also violate the primary principle of color
use.
Figure 20-3 shows three usual, but poor, methods in which color alone is the
distinguishing factor of the alarm’s importance. The fourth method shown complies with
the primary principle and uses a redundantly coded alarm indicator element that appears
next to a value or condition in alarm. The indicator flashes while the alarm is
unacknowledged (one of the very few proper uses of animation) and ceases flashing
after acknowledgement, but remains visible as long as the alarm condition is in effect.
The indicator’s colors, shapes, and markings are associated with the priority of the
alarm. A unique sound for each priority should also be annunciated when a new alarm
occurs.
Unlike color, object movement (such as flashing) is readily detected in peripheral vision,
and a new alarm needs to be noticed. When properly implemented, such indicators
visually stand out and are easily detected, even on a complex graphic.
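The flash-until-acknowledged behavior just described can be summarized as a small state machine. This sketch and its names are an illustrative reading of the principle, not any vendor's implementation:

```python
from enum import Enum, auto

class AlarmIndicator(Enum):
    HIDDEN = auto()    # no alarm condition in effect
    FLASHING = auto()  # alarm active and not yet acknowledged
    STEADY = auto()    # alarm acknowledged but condition still in effect

def indicator_state(alarm_active: bool, acknowledged: bool) -> AlarmIndicator:
    if not alarm_active:
        return AlarmIndicator.HIDDEN
    if not acknowledged:
        return AlarmIndicator.FLASHING  # movement is detected even in peripheral vision
    return AlarmIndicator.STEADY        # remains visible while the condition persists
```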
A very effective practice is to provide direct access from an alarm depicted on a graphic
to the information about that alarm (e.g., causes, consequences, corrective actions)
typically stored in a master alarm database. Ideally a right-click or similar action on the
alarm calls up a faceplate with the relevant information. There are a variety of ways to
accomplish such a linkage.
Display of Information Rather Than Raw Data
It is typical for an operator to have dozens of displayable graphics, each covered with
dozens to hundreds of numeric process values. Most graphics provide little to no context
coding of these raw numbers. The cognitive process for monitoring such a graphic is a
difficult one. The operator must observe each number and compare it to a memorized
mental map of what values constitute normal or abnormal conditions.
Additionally, there are usually combinations of different process values that represent
burgeoning abnormal situations. The building of a mental model containing this
information is a complex and lengthy process taking months to years. The proficiency of
an operator may depend on the number and type of costly process upsets that they have
personally experienced and then added to their mental maps.
The situation awareness of an operator can be significantly improved when graphics are
designed to display not just the numbers, but to also provide contextual information as
to whether the process is running normally or not. One method of supplying desirable
context is in the use of properly coded analog indication.
In Figure 20-4, much money has been spent on the process instrumentation, but this
common depiction of the readings provides no clue as to whether
the process is running well or poorly. Interpretation of the column temperature profile
requires a very experienced operator.
By contrast, Figure 20-5 shows coded analog representations of values. The “normal”
range of each is shown inside the analog indicator. In the rightmost element, this is
highlighted with a dotted line. On an actual graphic, this normal region would have a
subtle color-coding (e.g., pale blue or green) relative to the background gray, rather than
the dotted line shown here for grayscale printing purposes. Any configured alarm
ranges are also shown and the alarm range changes color when the alarm is active, along
with the appearance of the redundantly coded alarm indicator element.
Some measurements are inputs to automated interlock actions, and these are also shown
clearly instead of expecting them to be memorized by the operator. The proximity of the
process value to abnormal, alarm, and interlocked ranges is clearly depicted.
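To illustrate how such an indicator turns a raw number into context, the sketch below classifies a process value against normal, alarm, and interlock ranges. All thresholds and names are hypothetical:

```python
def classify_value(pv, normal=(40.0, 60.0), alarm=(20.0, 80.0),
                   interlock=(10.0, 90.0)):
    """Map a raw process value to the context an analog indicator depicts."""
    if pv < interlock[0] or pv > interlock[1]:
        return "INTERLOCK"  # region where an automated action is triggered
    if pv < alarm[0] or pv > alarm[1]:
        return "ALARM"      # alarm range changes color; alarm element appears
    if normal[0] <= pv <= normal[1]:
        return "NORMAL"     # inside the subtly shaded normal band
    return "ABNORMAL"       # outside normal, not yet in alarm; worth a look

print(classify_value(72.0))  # -> "ABNORMAL"
```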
The indicators provide much-needed context. Much of the desirable mental model
becomes embedded into the graphic. An operator can scan dozens of such indicators in a
few seconds and immediately spot any that are abnormal and deserve further
investigation. This makes training easier and facilitates abnormal situation detection
even before an alarm occurs, which is highly desirable. The newest, least experienced
operator can easily spot an abnormal temperature profile. The display of analog values
that include context promotes process surveillance and improved overall situation
awareness.
Such analog indicators are best used when placed in easily scanned groups, rather than
scattered around a P&ID type of depiction. Figure 20-6 shows analog indicators with
additional elements (e.g., set point, mode, output) to depict a proportional-integral-derivative (PID) controller. Analog position feedback of the final control element in a
loop is becoming increasingly common, and the depiction shown makes it easy for the
operator to spot a mismatch between the commanded and actual position. Such
mismatches can be alarmed.
It is common to show vessel levels in ways that use large splotches of bright colors. A
combination trend with analog range depiction shows much more useful information in
the same amount of space.
Space precludes the inclusion of dozens of additional comparisons of conventional
depictions versus designs that are more informative and effective. See the references
section.
Embedded Trends
It is common but surprising to enter a control room, examine dozens of screens, and not
find a single trend depiction. Every graphic generally contains one or two values that
would be much better understood if presented as trends. However, the graphics rarely
incorporate them.
One reason for this lack is the assumption that the operators can and will create any
trends as needed, using trending tools supplied with the control system. In actual
practice, it can take 10 to 20 clicks and data inputs to create a properly scaled, timed,
and usable trend, and such trends often do not persist if the displayed graphic is
changed. Trending-on-demand is often a frustrating process that takes too much operator
time when handling an abnormal situation.
The benefit of trends should not be dependent on an individual operator’s skill level.
Trends should be embedded in the graphics and appear whenever the graphic is called
up, immediately showing proper range and history. This is usually possible, but it is a
graphic capability that is often not utilized. Trends should incorporate elements that
depict both the normal and abnormal ranges for the trended value. There are a variety of
ways to accomplish this as shown in Figure 20-7.
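One way to have trends appear with proper range and history whenever the graphic is called up is to store the trend configuration with the graphic itself. This dataclass sketch, including the tag name and field choices, is an assumption for illustration:

```python
from dataclasses import dataclass

@dataclass
class EmbeddedTrend:
    """Trend pre-configured in a graphic, so no on-demand setup is needed."""
    tag: str            # process value to trend
    span_minutes: int   # history shown when the graphic appears
    y_min: float        # properly scaled range, not autoscaled
    y_max: float
    normal_band: tuple  # (low, high) normal range drawn on the trend

# Hypothetical reactor temperature trend, scaled and banded in advance.
reactor_temp = EmbeddedTrend("TI-101", span_minutes=60,
                             y_min=100.0, y_max=200.0,
                             normal_band=(140.0, 160.0))
```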
Graphic Hierarchy
Graphics should be designed in a hierarchy that enables the operator to access
progressive exposure of detail as needed. Graphics designed from a stack of P&IDs will
not have this; they will be “flat,” similar to a computer hard disk with only one folder
for all the files. Such a structure does not provide for optimum situation awareness and
control. A four-level hierarchy is recommended.
Level 1 – Process Overview Graphic
An overview graphic shows the key performance indicators of the entire system under
the operator’s span of control—the big picture. It provides clear indication of the current
performance of the operation. The most important parameters utilize trends. It is
designed to be easy to scan and detect any abnormal conditions. Status of major
equipment is shown. Alarms are easily seen.
Figure 20-8 is an example overview graphic from a large coal-fired power plant. This
graphic was used in a proof test of advanced HMI concepts conducted by the Electric
Power Research Institute (EPRI, see references) and was highly rated for providing
overall situation awareness. It is a common practice that overview graphics do not
incorporate actual control functionality (such as faceplate call-up), thus providing more
screen space for monitoring elements. Overview graphics are often depicted on larger
off-console wall monitors.
In the test, the operators found this overview graphic to be far more useful than the
existing “typical” graphics in providing overall situation awareness and useful in
detecting burgeoning abnormal situations.
Level 2 – Process Unit Control Graphic
A single operator’s span of control is usually made up of several smaller, significantly
instrumented unit operations. Examples include a single reactor, a pipeline segment, a
distillation train, a furnace, and a compressor. A Level 2 graphic should exist for each of
these separate unit operations. Typically, the Level 2 breakdown consists of 10 to 20
different graphics.
Figure 20-9 is an example of a Level 2 graphic for a reactor. It is specifically designed
such that 90+ percent of the control interactions needed for effective monitoring and
control of the reactor can be accomplished from this single graphic. The primary control
loops are accessible. Important parameters are trended. Interlock status is clearly shown.
Production plan versus actual is depicted. Analog indicators provide context. Important
abnormal situation command buttons are available. Navigation to the most likely
graphics is provided. Details of reactor subsystems are provided at Level 3.
A special note about faceplates: The paradigm of most control system graphic
implementations is that the selection of a measurement or object on the screen calls up a
standardized faceplate for that point type. Manipulation is actually made through the
faceplate element. This is a very workable paradigm. It is desirable that the faceplates
appear in an area of the graphic reserved for them, rather than on top of and
obscuring the main depiction. The Level 2 example shows this reserved area in the
upper right portion of the graphic.
Level 3 – Process Unit Detail Graphic
Level 3 graphics provide the detail about a single piece of equipment. These are used for
a detailed diagnosis of problems. They show all the instruments and include the highly
detailed interlock status. A P&ID schematic type of depiction is often the basis for a
Level 3 graphic. Figure 20-10 is an example of a Level 3 graphic of a compressor. It is
still possible and desirable to include analog indication and trends even at Level 3.
Most of the existing graphics in the world can be considered as “improvable” Level 3
graphics. Significant HMI improvement can be accomplished inexpensively by the
introduction of new Level 1 and 2 graphics, and the more gradual alteration and
improvement of the existing Level 3 screens. This will introduce inconsistencies
between the new and the old, but most existing old-style graphic implementations
already have significant inconsistencies that the operators have learned to deal with.
Level 4 – Support and Diagnostic Graphics
Level 4 graphics provide the most detail of subsystems, individual sensors, or
components. They show the most detailed possible diagnostic or miscellaneous
information. A point detail or logic detail graphic is a typical example. The dividing line
between Level 3 and Level 4 graphics can be somewhat indefinite. It is good to
incorporate access to useful external knowledge based on process context, such as
operating procedures, into the operator’s HMI.
Abnormal Situation Response Graphics
Many process operations have a variety of known abnormal situations that can occur.
These are events like loss of feed, heat, cooling, and compression. The proper operator
response to such situations is often difficult and stressful, and, if not done correctly, can
result in a more significant and avoidable upset. In many cases operators are expected to
use the same existing P&ID-type of graphics to handle such situations, and often the
information needed is spread out across many of those.
Operator response to such abnormal situations can often be greatly improved by the
creation of special purpose Level 2 graphics, specifically designed to contain every item
needed by the operator to handle certain abnormal situations.
Other Graphic Principles
Besides those discussed in detail, some additional recommendations are contained in
Table 20-1.
Expected Performance Improvements
The principles discussed in this chapter, as compared to traditional P&ID-style,
number-covered graphics, have been tested and proven in at least two published
studies involving actual operators and full-capability process simulators. The first was in
March 2006, concerning a study at a Nova Chemicals ethylene plant. The second was a
test conducted in 2009 by the EPRI. See the reference section for the reports. In these
tests, operators performed significant abnormal situation detection and response tasks,
using both existing “traditional” graphics and graphics created in accordance with the
new principles. This chapter’s author participated in the EPRI test.
In the EPRI test, new Level 1 and 2 graphics were created. Operators had many years of
experience with the existing graphics, which had been unchanged for more than a
decade. But with only one hour of practice use, the new graphics were proven to be
significantly better in assisting the operator in:
• Maintaining situational awareness
• Recognizing abnormal situations
• Recognizing equipment malfunctions
• Dealing with abnormal situations
• Embedding knowledge into the control system
Operators rated the Level 1 overview screen (Figure 20-8) highly, agreeing that it
provided continuously useful “big picture” situation awareness. Positive comments were
also received on the use of analog depictions, alarm depictions, and embedded trends.
There were consistent positive comments on how “obvious” the new HMI made the
various process situations. Values moving towards a unit trip were clearly shown and
noticed.
The operators commented that an HMI like this would enable faster and more effective
training of new operations personnel. The best summary quote was this:
“Once you got used to these new graphics, going back to the old ones would be
hell.”
The HMI Development Work Process
There is a proven, straightforward methodology for the development of an effective
HMI, one that follows proper principles and is based on process objectives rather than
P&IDs.
Step 1: Adopt a comprehensive HMI Philosophy and Style Guide. This is a detailed
document containing the proper principles for creating and implementing an effective
HMI. The style guide portion provides details and functional descriptions for objects
and layouts that implement the intent of the philosophical principles within the
capabilities of a specific control system. Most HMIs were created and altered for many
years without the benefit of such a document.
Step 2: For existing systems, assess and benchmark existing graphics against the HMI
Philosophy and Style Guide. Create a gap analysis.
Step 3: Create the Level 2 breakdown structure. For each portion of the process to be
controlled by a Level 2 graphic, determine the specific performance and goal objectives
for that area. These are such factors as:
• Safety parameters and limits
• Production rate
• Energy usage and efficiency
• Run length
• Equipment health
• Environmental (e.g., emission controls and limits)
• Production cost
• Product quality
• Reliability
It is important to document these, along with their goals, normal ranges, and target
values. This is rarely done and is one reason for the current poor state of most HMIs.
Step 4: Perform task analysis to determine the specific measurements and control
manipulations needed to effectively monitor the process and achieve the performance
and goal objectives from Step 3. The answer determines the content of each Level 2
graphic. The Level 1 graphic is an overall distillation of the Level 2 graphics. Level 3
graphics have the full details of the subparts of the Level 2 structure.
Step 5: Design the graphics using the design principles in the HMI philosophy and
elements from the style guide to address the identified tasks. Appropriate reviews and
refinements are included in this step. Each graphic is designed to give clear guidance as
to whether the process is running well or poorly.
Step 6: Install, commission, and provide training on the new HMI additions or
replacements.
Step 7: Control, maintain, and periodically reassess the HMI performance.
The ISA-101 HMI Standard
In July 2015, the International Society of Automation (ISA) published the ANSI/ISA-101.01-2015 Human Machine Interfaces for Process Automation Systems standard. ISA-101 is a relatively short document containing consistent definitions of various aspects of
an HMI. It contains generic principles of good HMI design, such as “the HMI should be
consistent and intuitive” and “colors chosen should be distinguishable by the operators.”
ISA-101 uses the life-cycle approach to HMI development and operations. It is
mandatory to have an HMI philosophy, style guide, and object library (called System
Standards). Methods to create these documents are discussed, but there are no examples.
It is also mandatory to place changes in the HMI under Management of Change (MOC)
procedures, similar to those that govern other changes in the plant and the control
system. User training for the HMI is the only other mandatory requirement.
ISA-101 provides brief descriptions of several methods for interacting with an HMI,
such as data entry in fields, entering and showing numbers, using faceplates, and use of
navigation menus and buttons. It contains a brief section on the determination of the
tasks that a user will accomplish when using the HMI and how those tasks feed the HMI
design process. There are no examples of proper and improper human factors design and
no details, such as appropriate color palettes or element depictions. ISA-101 mentions
the concept of display hierarchy, but contains no detailed design guidance or detailed
examples, such as those shown in the “Graphic Hierarchy” section in this chapter.
Conclusion
Sophisticated, capable, computer-based control systems are currently operated via
ineffective and problematic HMIs, which were created without adequate knowledge. In
many cases, guidelines did not exist at the time the graphics were created and inertia has
kept those graphics in commission for two or more decades.
The functionality and effectiveness of these systems can be greatly enhanced if graphics
are redesigned in accordance with effective principles. Significantly better operator
situation awareness and abnormal situation detection and response can be achieved.
Further Information
EPRI (Electric Power Research Institute). Operator HMI Case Study: The Evaluation of
Existing “Traditional” Operator Graphics vs. High Performance Graphics in a
Coal Fired Power Plant Simulator. EPRI Product ID 1017637. Charlotte, NC:
Electric Power Research Institute, 2009. (Note that the full and lengthy EPRI report
is restricted to EPRI member companies. A condensed version with many figures
and detailed results is in the white paper mentioned previously.)
Errington, J., D. Reising, and K. Harris. “ASM Outperforms Traditional Interface.”
Chemical Processing (March 2006). https://www.chemicalprocessing.com.
Hollifield, B., D. Oliver, I. Nimmo, and E. Habibi. The High Performance HMI
Handbook. Houston, TX: PAS, 2008. (All figures in this chapter are courtesy of this
source and subsequent papers by PAS.)
Hollifield, B., and H. Perez. High Performance HMI™ Graphics to Maximize Operator
Effectiveness. Version 3.0. White paper, Houston, TX: PAS, 2015. (Available free
from PAS.com.)
About the Author
Bill R. Hollifield is the principal consultant responsible for the PAS work processes and
intellectual property in the areas of both alarm management and high-performance HMI.
He is a member of the International Society of Automation ISA18 Instrument Signals
and Alarms committee, the ISA101 Human-Machine Interface committee, the American
Petroleum Institute’s API RP-1167 Alarm Management Recommended Practice
committee, and the Engineering Equipment and Materials Users Association (EEMUA)
Industry Review Group. Hollifield was made an ISA Fellow in 2014.
Hollifield has multi-company, international experience in all aspects of alarm
management and HMI development. He has 26 years of experience in the petrochemical
industry in engineering and operations, and an additional 14 years in alarm management
and HMI software and service provision for the petrochemical, power generation,
pipeline, and mining industries.
Hollifield is co-author of The Alarm Management Handbook, The High Performance
HMI Handbook, and The Electric Power Research Institute (EPRI) Guideline on Alarm
Management. He has authored several papers on alarm management and HMI, and he is
a regular presenter on these topics in such venues as ISA, API, and EPRI symposiums.
He has a BS in mechanical engineering from Louisiana Tech University and an MBA
from the University of Houston.
21
Alarm Management
By Nicholas P. Sands
Introduction
The term alarm management refers to the processes and practices for determining,
documenting, designing, monitoring, and maintaining alarms from process automation
and safety systems. The objective of alarm management is to provide the operators with
a system that gives them an indication at the right time, to take the right action, to
prevent an undesired consequence. The ideal alarm system is a blank banner or screen
that only has an alarm for the operator when an abnormal condition occurs and clears to
a blank banner or screen when the right action is taken to return to normal conditions.
Most alarm systems do not work that way in practice. The common problems of alarm
management are well documented. The common solutions to those common problems
are also well documented in ANSI/ISA-18.2, Management of Alarm Systems for the
Process Industries, and the related technical reports. This chapter will describe the
activities of alarm management following the alarm management life cycle, how to get
started on the journey of alarm management, and which activities solve which of the
common problems. The last section discusses safety alarms.
Alarm Management Life Cycle
The alarm management life cycle was developed as a framework to guide alarm
management activities and map them to other frameworks like the phases of a project. A
goal of the life-cycle approach to alarm management is continuous improvement, as the
life-cycle activities continue for the life of the facility. The alarm management life cycle
is shown in Figure 21-1 [1].
Philosophy
A key activity in the alarm management life cycle is the development of an alarm
management philosophy: a document that establishes the principles and procedures to
consistently manage an alarm system over time. The philosophy does not specify the
details of any one alarm, but defines each of the key processes used to manage alarm
systems: rationalization, design, training, monitoring, management of change, and audit.
Alarm system improvement projects can be implemented without a philosophy, but the
systems tend to drift back toward the previous performance. Maintaining an effective
alarm system requires the operational discipline to follow the practices in the alarm
philosophy.
The philosophy includes definitions. Of those definitions, the most important is the one
for alarm:
...audible and/or visible means of indicating to the operator an equipment
malfunction, process deviation, or abnormal condition requiring a timely
response [1].
This definition clarifies that alarms are indications
• that may be audible, visible, or both,
• of abnormal conditions and not normal conditions,
• to the operator,
• that require a response, and
• that are timely.
Much of alarm management is the effort to apply this definition.
The alarm philosophy includes many other sections to provide guidance, including:
• Roles and responsibilities
• Alarm class definitions
• Alarm prioritization methods
• Basic alarm design guidance
• Advanced alarm design methods
• Implementation of alarms
• Alarm performance metrics and targets
• Management of change
• Audit
A list of the required and recommended contents of an alarm philosophy can be found in
ISA-18.2 [1] and ISA technical report ISA-TR18.2.1-2018 [2].
Identification
Identification is the activity that generates a list of potential alarms using some of the
various methods that address undesired consequences. For safety consequences, a
process hazard analysis (PHA) or hazard and operability study (HAZOP) might be used.
For quality or reliability, a failure modes and effects analysis (FMEA) might be used.
For environmental consequences, a compliance or permit review might be used. For
existing sites, the list of existing alarms is usually incorporated.
For best results, the participants in the identification methods should be trained on alarm
rationalization and should document basic information for each potential alarm: the
consequence, corrective action, and probable cause.
Rationalization
Rationalization is the process of examining one potential alarm at a time against the
criteria in the definition of alarm. The product of rationalization is a set of consistent,
well-documented alarms in the master alarm database. The documentation supports both
the design process and operator training.
Rationalization begins with alarm objective analysis, determining the rationale for the
alarm. This information may have been captured in identification:
• Consequence of inaction or operator preventable consequence
• Corrective action
• Probable cause
The key to this step is correctly capturing the consequence. This can be viewed two
different ways but should have the same result. The consequence of inaction is what
results if the operator does not respond to the alarm. The operator preventable
consequence is what the operator can prevent by taking the corrective action. Either
way, the consequence prevented by automatic functions like interlocks is not included.
This relates to the definition of alarm, as the alarms are for the operator. If there is no
operator, there are no alarms. If the condition is normal, the consequence is minimal, or
there is no corrective action the operator can take, the alarm should be rejected.
Another important criterion is the action required. Acknowledging the alarm does not
count as an action, since acknowledgment alone would justify every alarm. The operator action should prevent, or
mitigate, the consequence and should usually return the alarm condition to normal. The
corrective action often relates directly to the probable cause.
The next step is set-point determination. For this step, it helps to document the:
• Basis for the alarm set point
• Normal operating range
• Allowable response time or time to respond
The basis is the reason to set the alarm set point at one value or another, like the freezing
temperature of a liquid. The normal operating range is where the condition is not in
alarm. The allowable response time is the time the operator has from the alarm to
prevent the consequence. This is usually determined by the process dynamics, and not
chosen based on how fast the operator should respond. It is usually estimated in time
ranges, like 3–10 minutes. With this information, the key is to determine if the operator
has enough time to take the corrective action. If not, the alarm set point can be moved to
allow more time to respond.
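As a rough sketch of this check, the time available can be estimated from the alarm set point, the consequence threshold, and a worst-case rate of change; the tags and numbers below are illustrative assumptions, not values from any standard.

```python
# Illustrative set-point check: does the operator have enough time to act?
# The variable, thresholds, and ramp rate are hypothetical assumptions.
def time_available(setpoint, consequence_threshold, ramp_rate):
    """Minutes from alarm annunciation until the consequence threshold
    is reached, assuming a constant worst-case rate of change."""
    return (consequence_threshold - setpoint) / ramp_rate

# Level alarm at 80%, overflow at 95%, worst-case rise of 3% per minute.
available = time_available(setpoint=80.0, consequence_threshold=95.0,
                           ramp_rate=3.0)
required = 10.0  # estimated time the operator needs for the corrective action

if available < required:
    print(f"Only {available:.1f} min available; consider moving the set point.")
else:
    print(f"{available:.1f} min available; a timely response is possible.")
```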
Alarm classification is the next step in rationalization. Classification precedes alarm
prioritization because there may be rules in the alarm philosophy that set priority by
class, for example, using the highest priority for safety alarms. Alarm classes are
defined in the alarm philosophy. Class is merely a grouping based on common
requirements, which is more efficient than listing each requirement for each alarm.
These requirements are often verified during audits. These requirements typically
include:
• Training requirements
• Testing requirements
• Monitoring and reporting requirements
• Investigation requirements
• Record retention requirements
The next step is prioritization. The alarm priority is an indication of the urgency to
respond for the operator. Priority is of value when there is more than one active alarm.
For priority to be meaningful, it must be assigned consistently and must be based on the
operator preventable consequence. In the past, priority was often based on an
engineering view that escalated priority as conditions progressed away from the normal
operating range, regardless of automatic actions. Prioritization usually uses:
• Allowable response time or time to respond
• Consequence severity
The consequence severity is a rating based on a table from the alarm philosophy, like the
example in Table 21-1 [3].
Alarm priority is assigned using a matrix from the alarm philosophy, which usually
includes the consequence severity and the time to respond. Table 21-2 is an example
from ISA-TR18.2.2-2016 [3].
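A minimal sketch of such a matrix lookup follows; the severity categories, response-time bands, and priorities are placeholders, not the actual values of Table 21-2.

```python
# Illustrative priority matrix keyed by consequence severity and the
# allowable-response-time band; values are placeholders, not Table 21-2.
PRIORITY_MATRIX = {
    ("minor",  "long"):  "low",    ("minor",  "short"): "medium",
    ("major",  "long"):  "medium", ("major",  "short"): "high",
    ("severe", "long"):  "high",   ("severe", "short"): "urgent",
}

def assign_priority(severity: str, minutes_to_respond: float) -> str:
    band = "short" if minutes_to_respond < 10 else "long"
    return PRIORITY_MATRIX[(severity, band)]

print(assign_priority("major", minutes_to_respond=5))   # high
print(assign_priority("minor", minutes_to_respond=30))  # low
```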
All of the information from rationalization is captured in the master alarm database.
This information is used in detailed design.
Detailed Design
The design phase utilizes the rationalized alarms and design guidance in the philosophy
to complete basic alarm design, advanced alarm design, and the human-machine
interface (HMI) design for alarms. Design guidance, along with typical configurations,
is often documented in an annex to the alarm philosophy. As systems change, the
guidance should be updated to reflect the features and limitations of the control system.
The guidance on basic configuration may include default settings for alarm deadbands,
delays, alarm practices for redundant transmitters, timing periods for discrete valves,
alarm practices for motor control logic, and the methods for handling alarms on bad
signal values. Many alarm system problems can be eliminated with good basic
configuration practices. ISA-TR18.2.3-2015 provides additional guidance [4].
The guidance on the HMI may include alarm priority definitions, alarm color codes,
alarm tones, alarm groups, alarm summary configuration, and graphic symbols for alarm
states. Alarm functions are only one part of the HMI, so it is important that these
requirements fit into the overall HMI philosophy as described in ANSI/ISA-101.01-2015 [5].
A common component of the HMI design guide is a table of alarm priorities, alarm
colors, and alarm tones. Some systems have the capability to show shapes or letters next
to alarms. This is a useful technique for assisting color-blind operators in recognizing
alarm priorities.
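As a sketch, such a table reduces to a lookup from priority to presentation; the colors, tones, and shapes below are placeholders for whatever the site's HMI design guide specifies.

```python
# Illustrative priority-to-presentation table; shapes give color-blind
# operators a redundant cue. All values are placeholders.
ALARM_PRESENTATION = {
    "urgent": {"color": "red",    "tone": "fast_beep", "shape": "▲"},
    "high":   {"color": "orange", "tone": "beep",      "shape": "■"},
    "medium": {"color": "yellow", "tone": "chime",     "shape": "●"},
    "low":    {"color": "blue",   "tone": "none",      "shape": "◆"},
}

style = ALARM_PRESENTATION["high"]
print(f"Annunciate with color={style['color']}, shape={style['shape']}")
```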
Beyond the basic configuration and HMI design, there are many techniques to reduce
the alarm load on the operator and improve the clarity of the alarm messages. These
techniques range from first-out alarming to state-based alarming to expert systems for
fault diagnosis. The techniques used should be defined in the alarm philosophy, along
with the implementation practices in the design guide. Some advanced alarming (e.g.,
suppressing low-temperature alarms when the process is shut down) is usually needed to
achieve the alarm performance targets.
There are many methods for advanced alarming, which often vary with the control
system and change over time as new functions are introduced. The common challenge is
maintaining the advanced alarming design over time. For this reason, the advanced
alarming should be well documented [6].
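A minimal sketch of state-based suppression, using the shutdown example above, might look like the following; the tags and state names are hypothetical, and a real implementation would live in the control system, not in a script.

```python
# Minimal state-based suppression sketch; tags and states are hypothetical.
# Rule from the text: suppress low-temperature alarms during shutdown.
SUPPRESSION_RULES = {
    # alarm tag: process states in which the alarm is suppressed
    "TI-101.LOW_TEMP": {"SHUTDOWN"},
    "FI-205.LOW_FLOW": {"SHUTDOWN", "STARTUP"},
}

def should_annunciate(alarm_tag: str, process_state: str) -> bool:
    """True if the alarm should be presented to the operator."""
    return process_state not in SUPPRESSION_RULES.get(alarm_tag, set())

print(should_annunciate("TI-101.LOW_TEMP", "SHUTDOWN"))  # False: suppressed
print(should_annunciate("TI-101.LOW_TEMP", "RUNNING"))   # True: annunciated
```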
Implementation
The implementation stage of the life cycle is the transition from design to operation. The
main tasks are procedure development, training, and testing.
Training is one of the most essential steps in developing an alarm system. Since an
alarm exists only to notify the operator to take an action, the operator must know the
corresponding action for each alarm, as defined in the alarm rationalization. A program
should be in place to train operators on these actions. Documentation on all alarms,
sometimes called alarm response procedures, should be easily accessible to the
operator. Beyond the alarm-specific training, the operator should be trained on the alarm
philosophy and the HMI design. A complete training program includes initial training
and periodic refresher training.
Additional procedures on alarm shelving, if used, and taking an alarm out of service
should be developed and personnel should be trained on the procedures.
Testing for alarms is usually class-dependent. Some alarms require end-to-end testing
and others just verification of alarm set points and priorities.
Operation
In the operation stage of the alarm life cycle, the alarm performs its function as
designed, indicating to the operator that it is the right time to take the right action to
avoid an undesired consequence. The main activities of this stage are refresher training
for the operator and use of the shelving and out-of-service procedures.
The ANSI/ISA-18.2 standard does not use the words disable, inhibit, hide, or similar
terms, which might be used in different ways in different control systems. It does
describe different types of alarm suppression (i.e., hiding the alarm annunciation from
the operator, based on the engineering or administrative controls for the suppression).
These suppression states are captured in the alarm state transition diagram shown in Figure 21-2 [1].
Maintenance
In the maintenance stage of the life cycle, the alarm does not perform its designed
function, but is out of service for testing, repair, replacement, or another reason. The
out-of-service procedure is the explicit transition of an alarm from operation to
maintenance and return-to-service is the transition back to operation.
Monitoring and Assessment
Monitoring is the tracking of all the transitions in Figure 21-2. This data can be
consolidated into performance and diagnostic metrics. The performance metrics can be
assessed against the targets in the alarm philosophy. If the performance does not meet
the targets, the related diagnostic metrics can usually point to specific alarms to be
reviewed for changes.
Monitoring the alarm system performance is a critical step in alarm management. Since
each alarm requires operator action for success, overloading the operator reduces the
effectiveness of the alarm system. Instrument problems, controller performance issues,
and changing operating conditions will cause the performance of the alarm system to
degrade over time. Monitoring and taking action to address bad actors can maintain a
system at the desired level of performance. Table 21-3 shows the recommended
performance metrics and targets from ANSI/ISA-18.2-2016 [1]. A more detailed
discussion of metrics can be found in ISA-TR18.2.5-2012 [7].
The alarm philosophy should define report frequencies, metrics, and thresholds for
action. The performance metrics are usually calculated per operator position or operator
console. Common measurements include:
• Average alarm rate, such as total number of alarms per operator per hour
• Time > 10 alarms per 10 minutes, or time in flood
• Stale alarms
• Annunciated alarm priority distribution
Measurement tools allow reporting of the metrics at different frequencies. Typically,
there are daily reports to personnel responsible to take action, and weekly or monthly
reports to management. The type of data reported varies, depending on the control
system or safety system and the measurement tool. One of the most useful reports to
create is one that lists the top 10 most frequent alarms. This can clearly highlight the
alarms with issues.
It is recommended to set a target and an action limit for each performance metric, where
better than the target is good, worse than the action limit is bad, and between the limits
indicates improvement is possible. This allows a green-yellow-red stoplight indication
for management.
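A sketch of these calculations, combining a bad-actor ranking with the stoplight rating, is shown below; the log data and thresholds are illustrative, and the real targets belong in the alarm philosophy.

```python
from collections import Counter

# Illustrative alarm-metric calculations; data and thresholds are placeholders.
def stoplight(value, target, action_limit):
    """Green if better than the target, red if worse than the action limit."""
    if value <= target:
        return "green"
    return "red" if value > action_limit else "yellow"

# Alarm log as (tag, timestamp-in-hours) pairs over a 24-hour period.
alarm_log = [("TI-101.HI", 0.2), ("FI-205.LO", 0.4), ("TI-101.HI", 1.1),
             ("TI-101.HI", 2.5), ("PI-310.HI", 3.0)]

rate = len(alarm_log) / 24.0  # average annunciated alarms per hour
print(f"Rate: {rate:.2f}/h ->", stoplight(rate, target=1.0, action_limit=2.0))

# Most frequent alarms: a quick way to surface bad actors.
print(Counter(tag for tag, _ in alarm_log).most_common(10))
```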
Management of Change
Alarm monitoring and other activities will drive changes to alarms and the alarm system.
These changes should be approved through a management of change process that
includes the activities of updating rationalization, design, and implementation.
Usually there are one or more management of change processes already established for
process safety management (PSM) or current good manufacturing practices (cGMP),
which would encompass changes for alarms. The alarm philosophy will define the
change processes and the steps necessary to change alarms.
Audit
The audit stage of the life cycle represents a benchmark and audit activity to drive
execution improvements and updates to the alarm philosophy. The benchmark is a
comparison of execution and performance against criteria like those in ANSI/ISA-18.2-2016. An audit is a periodic check of execution and performance against the alarm
philosophy and site procedures. It is recommended to include an interview with the
operators in any benchmark or audit [7].
Getting Started
The best way to start the alarm management journey depends on whether it is a new
or existing facility. The alarm management life cycle has three official starting points:
1. Philosophy
2. Monitoring and Assessment
3. Audit
New Facilities
The alarm philosophy is the recommended starting point for new facilities. A
philosophy should be developed or adopted early in the project, and the life cycle used
to identify project activities and deliverables. A new facility will only start up with a
good alarm system if the activities above are included in the project schedule and
budget. There are technical reports on applying alarm management to batch and discrete
processes, ISA-TR18.2.6-2012 [8], and to packaged systems, ISA-TR18.2.7-2017 [9].
Existing Facilities
Existing facilities may start with an alarm philosophy, but it is common to start with
monitoring and assessment or a benchmark. The advantage of starting with monitoring
is that alarm load can be quantified and problem alarms, sometimes called bad actors,
can be identified and addressed. This usually allows the plant team to see the problem
and that progress can be made, which can make it easier to secure funding for alarm
management work. A benchmark can serve the same purpose, highlighting gaps in
training and procedures. The benchmark leads to the development of the alarm
management philosophy.
Alarms for Safety
Safety alarms require more attention than general alarms, though the meaning of the
term safety alarm is different in different industries. Within the ANSI/ISA-18.2
standard, a set of requirements is called out for Highly Managed Alarms, such as
safety alarms. These requirements include training with documentation, testing with
documentation, and control of suppression. These requirements are repeated in
ANSI/ISA-84.91.01-2012 [10].
When alarms are used as protection layers, the performance of the individual alarm and
the alarm system should be considered. Safety alarms should not be allowed to become
nuisance alarms. If the alarm system does not perform well, no alarm in the system
should be considered reliable enough to use as a layer of protection.
References
1. ANSI/ISA-18.2-2016. Management of Alarm Systems for the Process Industries.
Research Triangle Park, NC: ISA (International Society of Automation).
2. ISA-TR18.2.1-2018. Alarm Philosophy. Research Triangle Park, NC: ISA
(International Society of Automation).
3. ISA-TR18.2.2-2016. Alarm Identification and Rationalization. Research
Triangle Park, NC: ISA (International Society of Automation).
4. ISA-TR18.2.3-2015. Basic Alarm Design. Research Triangle Park, NC: ISA
(International Society of Automation).
5. ANSI/ISA-101.01-2015. Human Machine Interfaces for Process Automation
Systems. Research Triangle Park, NC: ISA (International Society of
Automation).
6. ISA-TR18.2.4-2012. Enhanced and Advanced Alarm Methods. Research
Triangle Park, NC: ISA (International Society of Automation).
7. ISA-TR18.2.5-2012. Alarm System Monitoring, Assessment, and Auditing.
Research Triangle Park, NC: ISA (International Society of Automation).
8. ISA-TR18.2.6-2012. Alarm Systems for Batch and Discrete Processes. Research
Triangle Park, NC: ISA (International Society of Automation).
9. ISA-TR18.2.7-2017. Alarm Management When Utilizing Packaged Systems.
Research Triangle Park, NC: ISA (International Society of Automation).
10. ANSI/ISA-84.91.01-2012. Identification and Mechanical Integrity of Safety
Controls, Alarms, and Interlocks in the Process Industry. Research Triangle
Park, NC: ISA (International Society of Automation).
About the Author
Nicholas P. Sands, PE, CAP, ISA Fellow, is currently a senior manufacturing
technology fellow with more than 28 years at DuPont, working in a variety of
automation roles at several different businesses and plants. He has helped develop
company standards and best practices in the areas of automation competency, safety
instrumented systems, alarm management, and process safety.
Sands has been involved with ISA for more than 25 years, working on standards
committees—including ISA18, ISA101, ISA84, and ISA105—as well as training
courses, the ISA Certified Automation Professional (CAP) certification, and section and
division events. His path to automation started when he earned a BS in chemical
engineering from Virginia Tech.
VII
Safety
HAZOP Studies
Hazard and operability studies (HAZOPs), also termed HAZOP analyses or just HAZOPs,
are systematic team reviews of process operations to determine what can go wrong and
to identify where existing safeguards are inadequate and risk-reduction actions are
needed. HAZOP studies are often required to comply with regulatory requirements, such
as the U.S. Occupational Safety and Health Administration’s (OSHA’s) Process Safety
Management Standard (29 CFR 1910.119), and are also used as the first step in
determining the required safety integrity level (SIL) for safety instrumented functions
(SIFs) to meet a company’s predetermined risk tolerance criteria.
Safety Life Cycle
A basic knowledge of reliability is fundamental to the concepts of safety and safety
instrumented systems. Process safety and safety instrumented systems (SISs) are
increasingly important topics. Safety is important in all industries, especially in large
industrial processes, such as petroleum refineries, chemicals and petrochemicals, pulp
and paper mills, and food and pharmaceutical manufacturing. Even in areas where the
materials being handled are not inherently hazardous, personnel safety and property
loss are important concerns. SIS is simple in concept but requires a lot of engineering to
apply well.
Reliability
In the field of reliability engineering, the primary metrics employed include reliability,
unreliability, availability, unavailability, and mean time to failure (MTTF). Failure
modes must also be considered, for example, when verifying a safety instrumented
function (SIF).
22
HAZOP Studies
By Robert W. Johnson
Application
Hazard and operability studies (HAZOPs), also termed HAZOP analyses or just
HAZOPs, are systematic team reviews of process operations to determine what can go
wrong and to identify where existing safeguards are inadequate and risk-reduction
actions are needed. HAZOP Studies are typically performed on process operations
involving hazardous materials and energies. They are conducted as one element of
managing process risks, and are often performed to comply with regulatory
requirements such as the U.S. Occupational Safety and Health Administration’s
(OSHA’s) Process Safety Management Standard (29 CFR 1910.119). HAZOP Studies
are also used as the first step in determining the required safety integrity level (SIL) for
safety instrumented functions (SIFs) to meet a company’s predetermined risk tolerance
criteria in compliance with IEC 61511, Functional Safety: Safety Instrumented Systems
for the Process Industry Sector. An international standard, IEC 61882, is also available
that addresses various applications of HAZOP Studies.
Planning and Preparation
HAZOP Studies require significant planning and preparation, starting with a
determination of which company standards and regulatory requirements need to be met
by the study. The study scope must also be precisely determined, including not only the
physical boundaries but also the operational modes to be studied (continuous operation,
start-up/shutdown, etc.) and the consequences of interest (e.g., safety, health, and
environmental impacts only, or operability issues as well). HAZOP Studies may be
performed in less detail at the early design stages of a new facility, but are generally
reserved for the final design stage or for operating facilities.
HAZOP Studies are usually conducted as team reviews, with persons having operating
experience and engineering expertise being essential to the team. Depending on the
process to be studied, other backgrounds may also need to be represented on the team
for a thorough review, such as instrumentation and controls, maintenance, and process
safety. Study teams have one person designated as the facilitator, or team leader, who is
knowledgeable in the HAZOP Study methodology and who directs the team discussions.
Another person is designated as the scribe and is responsible for study documentation.
Companies often require a minimum level of training and experience for study
facilitators. To be successful, management must commit to providing trained resources
to facilitate the HAZOP Study and to making resources available to address the findings
and recommendations in a timely manner.
A facilitator or coordinator needs to ensure all necessary meeting arrangements are
made, including reserving a suitable meeting room with minimum distractions and
arranging any necessary equipment. For a thorough and accurate study to be conducted,
the review team will need to have ready access to up-to-date operating procedures and
process safety information, including such items as safety data sheets, piping and
instrumentation diagrams, equipment data, materials of construction, established
operating limits, emergency relief system design and design basis, and information on
safety systems and their functions.
Nodes and Design Intents
The first step in the HAZOP Study is to divide the review scope into nodes or process
segments. Adjacent study nodes will generally have different relevant process
parameters, with typical study nodes being vessels (with parameters of importance such
as level, composition, pressure, temperature, mixing, and residence time) and transfer
lines (with parameters such as source and destination locations, flow rate, composition,
pressure, and temperature).
Nodes are generally studied in the same direction as the normal process flow. The
HAZOP Study team begins the analysis of each node by determining and documenting
its design intent, which defines the boundaries of “normal operation” for the node. This
is a key step in the HAZOP Study methodology because the premise of the HAZOP
approach is that loss events occur only when the facility deviates from normal operation
(i.e., during abnormal situations).
The design intent should identify the equipment associated with the node including
source and destination locations, the intended function(s) of the equipment, relevant
parameters and their limits of safe operation, and the process materials involved
including their composition limits. An example of a design intent for a chemical reactor
might be to:
Contain and control the complete reaction of 1,000 kg of 30% A and 750 kg
of 98% B in EP-7 by providing mixing and external cooling to maintain
470–500°C for 2 hours, while venting off-gases to maintain < 100 kPa
gauge pressure.
For procedure-based operations such as unloading or process start-up, the established
operating procedure or batch procedure is an integral part of what defines “normal
operation.”
Scenario Development: Continuous Operations
Figure 22-1 illustrates how the HAZOP Study methodology interfaces with a typical
incident sequence to develop possible incident scenarios associated with a study node.
Terminology in this figure is consistent with the definitions in the Guidelines for Hazard
Evaluation Procedures, Third Edition, published by the American Institute of Chemical
Engineers – Center for Chemical Process Safety (AIChE-CCPS).
The typical incident sequence starts with the initiating cause, which is the event that
marks a transition from normal operation to an abnormal situation or deviation. If a
preventive safeguard such as an operator response to an alarm or a safety instrumented
function is successful, the process will be brought back to normal operation or to a safe
state such as unit shutdown. However, if the preventive safeguards are inadequate or do
not function as intended, a loss event such as a toxic release or a bursting vessel
explosion may result. Mitigative safeguards such as emergency response actions can
reduce the impacts of the loss event.
The HAZOP Study starts by applying a set of guide words to the design intent
(described in the preceding section) to develop meaningful deviations from the design
intent. This can be done either one guide word at a time, or one parameter (such as flow
or temperature) at a time.
Once a meaningful deviation is identified, the team brainstorms what could cause the
deviation, then what possible consequences could develop as a result of the deviation,
then what safeguards could intervene to keep the consequences from being realized.
Each unique cause-consequence pair, with its associated safeguards, is a different
scenario. Facilitators use various approaches to ensure a thorough yet efficient
identification of all significant scenarios, such as identifying only local causes (i.e., only
those initiating causes associated with the node being studied), investigating global
consequences (anywhere, anytime), and ensuring that incident sequences are taken all
the way through to the loss event consequences.
Scenario Development: Procedure-Based Operations
Procedure-based operations are those process operations that manually or automatically
follow a time-dependent sequence of steps to accomplish a specific task or make a
particular product. Examples of procedure-based operations are tank truck unloading;
product loadout into railcars; start-up and shutdown of continuous operations;
transitioning from one production mode to another; sampling procedures; and batchwise physical processing, chemical production, and waste treatment. Many companies
have discovered the importance of studying the procedure-based aspects of their
operations in detail, with the realization that most serious process incidents have
occurred during stepwise, batch, nonroutine, or other transient operational modes.
Scenario development for procedure-based operations starts with the steps of an
established operating procedure or batch sequence, and then uses the HAZOP guide
words to identify meaningful deviations from each step or group of steps. For example,
combining the guide word “NONE” with a procedural step to close a drain valve would
lead to identifying the deviation of the drain valve not being closed, such as due to an
operator skipping the step.
Note:
Section 9.1 of the AIChE-CCPS guidelines describes an alternative two-guide-word approach that can be used to identify potential incident scenarios
associated with procedure-based operations.
Determining the Adequacy of Safeguards
After possible incident scenarios are identified, the HAZOP Study team evaluates each
scenario having consequences of concern to determine whether the safeguards that are
currently built into the process (or process design, for a new facility) are adequate to
reduce risks to a tolerable level. Some companies perform this evaluation as each
scenario is identified and documented; other companies wait until all scenarios are
identified.
A range of approaches is used by HAZOP Study teams to determine the adequacy of
safeguards, from a purely qualitative judgment, to the use of risk matrices (as described
below), to order-of-magnitude quantitative assessments (AIChE-CCPS guidelines,
Chapter 7). However, all approaches fundamentally are evaluating the level of risk
posed by each scenario and deciding whether the risk is adequately controlled or
whether it is above or below a predetermined action threshold.
Scenario risk is a function of the likelihood of occurrence and the severity of
consequences of the scenario loss event. According to the glossary in the AIChE-CCPS
guidelines, a loss event is defined as:
The point of time in an abnormal situation when an irreversible physical
event occurs that has the potential for loss and harm impacts. Examples
include the release of a hazardous material, ignition of flammable vapors or
an ignitable dust cloud, and the overpressurization rupture of a tank or
vessel. An incident might involve more than one loss event, such as a
flammable liquid spill (first loss event) followed by the ignition of a flash
fire and pool fire (second loss event) that heats up an adjacent vessel and its
contents to the point of rupture (third loss event).
In the multiple loss event example in the definition above, each of the three loss events
would pose a different level of risk and each can thus be evaluated as a separate HAZOP
Study scenario.
Figure 22-2 illustrates that the likelihood of occurrence of the loss event (i.e., the loss
event frequency) is a function of the initiating cause frequency reduced by the
effectiveness of all preventive safeguards taken together that would keep the loss event
from being realized, given that the initiating cause has occurred. The severity of
consequences is the loss event impact, which is generally assessed in terms of human
health effects and environmental damage. The assessed severity of consequences may
include property damage and other business impacts as well. The scenario risk is then
the product of the loss event frequency and severity.
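This relationship can be sketched with illustrative numbers: the loss event frequency is the initiating cause frequency multiplied by the probability that each preventive safeguard fails on demand, and the scenario risk pairs that frequency with the loss event severity.

```python
# Sketch of the Figure 22-2 relationship; all numbers are illustrative.
def loss_event_frequency(cause_freq_per_yr, safeguard_pfds):
    """Initiating cause frequency reduced by each preventive safeguard,
    where PFD is the probability the safeguard fails on demand."""
    freq = cause_freq_per_yr
    for pfd in safeguard_pfds:
        freq *= pfd
    return freq

# Cause of 0.1/yr, two safeguards failing 1-in-10 and 1-in-100 demands.
freq = loss_event_frequency(0.1, [0.1, 0.01])
print(f"Loss event frequency: {freq:.0e}/yr")  # 1e-04/yr, paired with severity
```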
Companies typically define a threshold level of scenario risk. Below this threshold,
safeguards are considered to be adequate and no further risk-reduction actions are
considered to be warranted. This threshold can either be numerical, or, more commonly,
can be shown in the form of a risk matrix that has frequency and severity on the x and y
axes. The company’s risk criteria or risk matrix may also define a high-risk region
where greater urgency or priority is placed on reducing the risk. The risk criteria or risk
matrix provides the HAZOP Study team with an aid for determining where risk
reduction is required.
Recording and Reporting
HAZOP Study results in the form of possible incident scenarios are generally recorded
in a tabular format, with different columns for the various elements of the HAZOP
scenarios. In the book HAZOP: Guide to Best Practice, Third Edition, Crawley et al.
give the minimum set of columns as Deviation, Cause, Consequence, and Action.
Current usage generally documents the Safeguards in an additional separate column, as
well as the Guide Word and/or Parameter. The table might also include documentation
of the likelihood, severity, risk, and action priority. Table 22-1, adapted from the AIChE-CCPS guidelines, shows typical HAZOP scenario documentation with the numbers in
the Actions column referring to a separate tabulation of the action items (not shown).
Johnson (2008, 2010) shows how the basic HAZOP Study can be extended by obtaining
a risk magnitude for each scenario by adding cause frequency and consequence severity
magnitudes, then subtracting safeguards effectiveness magnitudes. By considering the
safeguards specifically as independent protection layers (IPLs) and conditional
modifiers, as shown in Table 22-2, the resulting HAZOP Study has the features of a
Layer of Protection Analysis (LOPA), which is useful for determining the required
safety integrity level (SIL) to meet a company’s risk tolerance criteria for complying
with IEC 61511.
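That order-of-magnitude arithmetic can be sketched as follows, treating each magnitude as a base-10 exponent; the category values are illustrative, not those of Table 22-2.

```python
# Order-of-magnitude risk sketch: magnitudes are base-10 exponents, so
# adding and subtracting them multiplies and divides the underlying values.
def risk_magnitude(cause_freq_mag, severity_mag, ipl_mags):
    """Cause frequency magnitude + consequence severity magnitude
    - the effectiveness magnitudes of IPLs and conditional modifiers."""
    return cause_freq_mag + severity_mag - sum(ipl_mags)

# Cause at magnitude -1 (0.1/yr), severity category 4, IPLs worth 2 and 1
# orders of magnitude of risk reduction. Values are illustrative.
print(risk_magnitude(-1, 4, [2, 1]))  # 0: compare against the action threshold
```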
Computerized aids are commercially available for documenting HAZOP Studies
(AIChE-CCPS guidelines, Appendix D). The final HAZOP Study report consists of
documentation that gives, as a minimum, the study scope, team members and
attendance, study nodes, HAZOP scenarios, and the review team’s findings and
recommendations (action items). The documentation may also include a process
description, an inventory of the process hazards associated with the study scope, a
listing of the process safety information on which the study was based, any auxiliary
studies or reference to such studies (e.g., for human factors, facility siting, or inherent
safety), and a copy of marked-up piping and instrumentation diagrams (P&IDs) that
show what equipment was included in each node.
The timely addressing of the HAZOP Study action items, and documentation of their
resolution, is the responsibility of the owner/operator of the facility. HAZOP Studies are
not generally kept as evergreen documents (always immediately updated whenever a
change is made to the equipment or operation of the facility). They are typically updated
or revalidated on a regular basis, at a frequency determined by company practice and
regulatory requirements.
Further Information
AIChE-CCPS (American Institute of Chemical Engineers – Center for Chemical
Process Safety). Guidelines for Hazard Evaluation Procedures, Third Edition, New
York: AIChE-CCPS, 2008.
AIChE (American Institute of Chemical Engineers). Professional and technical training
courses. “HAZOP Studies and Other PHA Techniques for Process Safety and Risk
Management.” New York: AIChE. www.aiche.org/academy.
Crawley, F., M. Preston, and B. Tyler. HAZOP: Guide to Best Practice. 3rd ed. Rugby,
UK: Institution of Chemical Engineers, 2015.
IEC 61511 Series. Functional Safety – Safety Instrumented Systems for the Process
Industry Sector. Geneva 20 – Switzerland: IEC (International Electrotechnical
Commission).
IEC 61882. Hazard and Operability Studies (HAZOP Studies) – Application Guide.
Geneva 20 – Switzerland: IEC (International Electrotechnical Commission).
Johnson, R. W. “Beyond-Compliance Uses of HAZOP/LOPA Studies.” Journal of Loss
Prevention in the Process Industries 23, no. 6 (November 2010): 727–733.
——— “Interfacing HAZOP Studies with SIL Determinations using Exponential
Frequency and Severity Categories.” ISA Safety Symposium, Calgary, Alberta,
April 2008.
About the Author
Robert W. Johnson is president of the Unwin Company process risk management
consultancy. Johnson, a process safety specialist since 1978, has authored six books and
numerous technical articles and publications on process safety topics. He is a fellow of
the American Institute of Chemical Engineers (AIChE) and past chair of the AIChE
Safety and Health Division. Johnson lectures on HAZOP Studies and other process
safety topics for the AIChE continuing education program and has taught process safety
at the university level. He has a BS and MS in chemical engineering from Purdue
University. Johnson can be contacted at Unwin Company, 1920 Northwest Boulevard,
Suite 201, Columbus, Ohio 43212 USA; +1 614 486-2245; rjohnson@unwin-co.com.
23
Safety Instrumented Systems in the
Process Industries
By Paul Gruhn, PE, CFSE
Introduction
Safety instrumented systems (SISs) are one means of maintaining the safety of process
plants. These systems monitor a plant for potentially unsafe conditions and bring the
equipment, or the process, to a safe state if certain conditions are violated. Today’s SIS
standards are performance-based, not prescriptive. In other words, they do not mandate
technologies, levels of redundancy, test intervals, or system logic. Essentially, they state,
“the greater the level of risk, the better the safety systems needed to control it.”
Hindsight is easy. Everyone always has 20/20 hindsight. Foresight, however, is a bit
more difficult. Foresight is required with today’s large, high-risk systems. We simply
cannot afford to design large petrochemical plants by trial and error. The risks are too
great to learn that way. We have to try to prevent certain accidents, no matter how
remote the possibility, even if they have not yet happened. This is the subject of system
safety.
There are a number of methods for evaluating risk. There are also a variety of methods
for equating risk to the performance required of a safety system. The overall design of a
safety instrumented system (SIS) is not a simple, straightforward matter. The total
engineering knowledge and skills required are often beyond that of any single person.
An understanding is required of the process, operations, instrumentation, control
systems, and hazard analysis. This typically calls for the interaction of a
multidisciplined team.
Experience has shown that a detailed, systematic, methodical, well-documented design
process or methodology is necessary in the design of SISs. This is the intent of the
safety life cycle, as shown in Figure 23-1.
The intent of the life cycle is to leave a documented, auditable trail, and to make sure
that nothing is neglected or falls between the inevitable cracks within every
organization. Each phase or step of the life cycle can be defined in terms of its
objectives, inputs (requirements to complete that phase), and outputs (the documentation
produced). These steps and their objectives, along with the input and output
documentation required to perform them, are briefly summarized below. The steps are
described in more detail later in this chapter.
Hazard and Risk Assessment
Objectives: To determine the hazards and hazardous events of the process and
associated equipment, the sequence of events leading to various hazardous events, the
process risks associated with each hazardous event, the requirements for risk reduction,
and the safety functions required to achieve the necessary risk reduction.
Inputs: Process design, layout, staffing arrangements, and safety targets.
Outputs: A description of the hazards, the required safety function(s), and the
associated risk reduction of each safety function.
Allocation of Safety Functions to Protection Layers
Objectives: To allocate safety functions to protection layers, and to determine the
required safety integrity level (SIL) for each safety instrumented function (SIF).
Inputs: A description of the required safety instrumented function(s) and associated
safety integrity requirements.
Outputs: A description of the allocation of safety requirements.
Safety Requirements Specification
Objectives: To specify the requirements for each SIS, in terms of the required safety
instrumented functions and their associated safety integrity levels, in order to achieve
the required functional safety.
Inputs: A description of the allocation of safety requirements.
Outputs: SIS safety requirements; software safety requirements.
Design and Engineering
Objectives: To design the SIS to meet the requirements for safety instrumented
functions and safety integrity.
Inputs: SIS safety requirements and software safety requirements.
Outputs: Design of the SIS in conformance with the SIS safety requirements; plans for
the SIS integration test.
Installation, Commissioning, and Validation
Objectives: To install and test the SIS, and to validate that it meets the specifications
(functions and performance).
Inputs: SIS design, integration test plan, safety requirements, and validation plan.
Outputs: A fully functional SIS in conformance with design and integration tests, as
well as the results of the installation, commissioning, and validation activities.
Operations and Maintenance
Objectives: To ensure that the functional safety of the SIS is maintained during
operation and maintenance.
Inputs: SIS requirements, SIS design, operation, and maintenance plan.
Outputs: Results of the operation and maintenance activities.
Modification
Objectives: To make corrections, enhancements, or changes to the SIS to ensure that the
required safety integrity level is maintained.
Inputs: Revised SIS safety requirements.
Outputs: Results of the SIS modification.
Decommissioning
Objectives: To ensure the proper review and organization of decommissioning
activities, and to ensure that the required safety functions remain appropriate.
Inputs: As-built safety requirements and process information.
Outputs: Safety functions placed out of service.
Verification (of All Steps)
Objectives: To test and evaluate the outputs of a given life-cycle phase to ensure the
correctness and consistency with respect to the products and standards provided as
inputs to that phase.
Inputs: Verification plan for each phase.
Outputs: Verification results for each phase.
Assessments (of All Steps)
Objectives: To investigate and arrive at a judgment as to the functional safety achieved
by the SIS.
Inputs: Safety assessment plan and SIS safety requirements.
Outputs: Results of the SIS functional safety assessments.
Hazard and Risk Analysis
One of the goals of process plant design is to have a facility that is inherently safe.
Trevor Kletz, one of the pillars of the process safety community, has said many times,
“What you don’t have, can’t leak.” Hopefully, the process design can eliminate many
hazards, for example, by avoiding unnecessary storage of intermediate products and by
using safer catalysts. One of the first steps in designing a safety system is developing an
understanding of the hazards and risks associated with the process.
Hazard analysis consists of identifying the hazards and hazardous events. There are
numerous techniques that can be used (e.g., a hazard and operability study [HAZOP], a
what-if, a fault tree, and checklists). Techniques such as checklists are useful for well-known processes where there is a large amount of accumulated knowledge. The
accumulated knowledge can be condensed into a checklist of items that needs to be
considered during the design phase. Other techniques, such as HAZOP or what-if, are
more useful for processes that have less accumulated knowledge. These techniques are
more systematic in their approach and typically require a multidisciplined team. They
typically require the detailed review of design drawings, and they ask a series of
questions intended to stimulate the team into thinking about potential problems and
what might cause them; for example: What if the flow is too high, too low, reversed,
etc.? What might cause such a condition?
Risk assessment consists of ranking the risk of the hazardous events that have been
identified in the hazard analysis. Risk is a function of the frequency or probability of an
event and the severity or consequences of the event. Risks may affect personnel,
production, capital equipment, the environment, company image, etc. Risk assessment
may be either qualitative or quantitative. Qualitative assessments subjectively rank the
risks from low to high. Quantitative assessments attempt to assign numerical factors to
the risk, such as death or accident rates and the actual size of a release. These studies are
not the sole responsibility of the instrument or control system engineer. Obviously,
several other disciplines are required to perform these assessments, such as safety,
operations, maintenance, process, mechanical design, and electrical.
Allocation of Safety Functions to Protective Layers
Figure 23-2 shows an example of multiple independent protection layers that may be
used in a plant. Various industry standards either mandate or strongly suggest that safety
systems be completely separate and independent from control systems. Each layer helps
reduce the overall level of risk. The inner layers help prevent a hazardous event (e.g., an
explosion due to an overpressure condition) from occurring; they are referred to as
protection layers. The outer layers are used to lessen the consequences of a hazardous
event once it has occurred; they are referred to as mitigation layers.
Risk is a measure of the frequency and severity of an event. Figure 23-3 is a graphical
way of representing the risk reduction that each layer provides. Let us consider an
example of an explosion causing multiple fatalities. The vertical line on the right side of
the figure represents the frequency of an initiating event, such as an operator closing or
opening the wrong valve, which could cause the hazardous event if left unchecked (i.e.,
no other safety layer reacted). Let us also assume that our corporate safety target (i.e.,
the tolerable level of risk, shown as the vertical line on the left side of the figure) for
such an event is 1/100,000 per year. (Determining such targets is a significant subject all
unto itself and is beyond the scope of this chapter.) The basic process control system
(BPCS) maintains process variables within safe boundaries and, therefore, provides a
level of protection (i.e., the control system would detect a change in flow or pressure
and could respond). Standards state that one should not claim more than a risk reduction
factor of 10 for the BPCS. If there are alarms separate from the control system—and
assuming the operators have enough time to respond and have procedures to follow—
one might assume a risk reduction factor of 10 for the operators (i.e., they might be able
to detect if someone else in the field closed the wrong valve). If a relief valve could also
prevent the overpressure condition, failure rates and test intervals could be used to
calculate their risk reduction factor (also a significant subject all unto itself and beyond
the scope of this chapter). Let us assume a risk reduction factor of 100 for the relief
valves.
Without an SIS, the level of overall risk is shown in Table 23-1:
Without a safety system, the example above does not meet the corporate risk target of 1/
100,000. However, adding a safety system that provides a level of risk reduction of at
least 10 will result in meeting the corporate risk target. As shown in Table 23-2 in the
next section, this falls into the safety integrity level (SIL) 1 range. This is an example of
Layer of Protection Analysis (LOPA), which is one of several techniques for
determining the performance required of a safety system.
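The arithmetic of this example can be sketched as below; the initiating event frequency of 1 per year is an assumption added here for concreteness, while the risk reduction factors and target follow the text.

```python
# LOPA arithmetic for the example above; initiating frequency is assumed.
initiating_freq = 1.0        # /yr, assumed frequency of the operator error
rrf_layers = [10, 10, 100]   # BPCS, operator response to alarms, relief valve
target = 1e-5                # corporate target: 1/100,000 per year

mitigated = initiating_freq
for rrf in rrf_layers:
    mitigated /= rrf         # each layer divides the frequency by its RRF

required_sis_rrf = mitigated / target
print(f"Without SIS: {mitigated:.0e}/yr; SIS needs RRF >= {required_sis_rrf:.0f}")
# Without SIS: 1e-04/yr; SIS needs RRF >= 10 -> the SIL 1 range
```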
If the risks associated with a hazardous event can be prevented or mitigated with
something other than instrumentation—which is complex, is expensive, requires
maintenance, and is prone to failure—so much the better. For example, a dike is a
simple and reliable device that can easily contain a liquid spill. KISS (keep it simple,
stupid) should be an overriding theme.
Determine Safety Integrity Levels
For all safety functions assigned to instrumentation (i.e., safety instrumented functions
[SIFs]), the level of performance required needs to be determined. The standards refer to
this as the safety integrity level (SIL). This continues to be a difficult step for many
organizations. Note that the SIL is not directly a measure of process risk, but rather a
measure of the performance required of a single safety instrumented function (SIF) to
control the individual hazardous event to an acceptable level. The
standards describe a variety of techniques on how safety integrity levels can be
determined. This chapter will not attempt to summarize that material beyond the brief
LOPA example given above.
Tables in the standards then show the performance requirements for each integrity level.
Table 23-2 lists the performance requirements for low-demand-mode systems, which are
the most common in the process industries. This shows how the standards are
performance-oriented and not prescriptive (i.e., they do not mandate technologies, levels
of redundancy, or test intervals).
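In low-demand mode, each SIL corresponds to a decade of average probability of failure on demand (PFDavg): SIL 1 spans 0.1 to 0.01, down to SIL 4 at 0.0001 to 0.00001. A simple lookup sketch:

```python
# Low-demand-mode SIL bands by PFDavg (each SIL is one decade).
def sil_for_pfd_avg(pfd_avg: float) -> int:
    """Return the SIL whose PFDavg band contains the value (0 if none)."""
    if 1e-5 <= pfd_avg < 1e-4:
        return 4
    if 1e-4 <= pfd_avg < 1e-3:
        return 3
    if 1e-3 <= pfd_avg < 1e-2:
        return 2
    if 1e-2 <= pfd_avg < 1e-1:
        return 1
    return 0  # outside the defined low-demand bands

print(sil_for_pfd_avg(5e-3))  # 2: an RRF of 200 falls in the SIL 2 band
```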
Develop the Safety Requirements Specification
The next step consists of developing the safety requirements specification (SRS). This
consists of documenting the I/O (input/output) requirements, functional logic, the SIL of
each safety function, and a variety of other design issues (bypasses, resets, speed of
response, etc.). This will naturally vary for each system. There is no general, across-the-board recommendation that can be made. One simple example might be: “If temperature
sensor TT2301 exceeds 410 degrees, then close valves XV5301 and XV5302. This
function must respond within 3 seconds and needs to meet SIL 2.” It may also be
beneficial to list reliability requirements if nuisance trips are a concern. For example,
many different systems may be designed to meet SIL 2 requirements, but each will have
a different nuisance trip performance. Considering the costs associated with lost
production downtime, as well as safety concerns, this may be an important issue. In
addition, one should include all operating conditions of the process, from startup
through shutdown, as well as maintenance. One may find that certain logic conditions
conflict during different operating modes of the process.
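As a sketch, the example requirement reduces to trip logic like the following; the tag names and set point come from the text, everything else is hypothetical scaffolding, and the 3-second response time and SIL 2 target would be verified separately, not inside the logic.

```python
# Sketch of the example SIF; tags and set point follow the SRS wording.
HIGH_TEMP_TRIP = 410.0  # trip set point for TT2301

def sif_high_temperature(tt2301_value: float) -> dict:
    """If TT2301 exceeds the trip point, command XV5301 and XV5302 closed."""
    trip = tt2301_value > HIGH_TEMP_TRIP
    return {"XV5301": "close" if trip else "hold",
            "XV5302": "close" if trip else "hold"}

print(sif_high_temperature(415.0))  # {'XV5301': 'close', 'XV5302': 'close'}
```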
The system will be programmed and tested according to the logic determined during this
step. If an error is made here, it will carry through for the rest of the design. It will not
matter how redundant the system is or how often the system is manually tested; it
simply will not work properly when required. These are referred to as systematic or
functional failures.
SIS Design and Engineering
Any proposed conceptual design (i.e., a proposed implementation) must be analyzed to
determine whether it meets the functional and performance requirements. Initially, one
needs to select a technology, configuration, test interval, and so on. This pertains to the
field devices, as well as the logic solver. Factors to consider are overall size, budget,
complexity, speed of response, communication requirements, interface requirements,
method of implementing bypasses, testing, and so on. One can then perform a simple
quantitative analysis (i.e., calculate the average probability of failure on demand
[PFDavg] of each safety instrumented function) to determine if the proposed function
meets the performance requirements. The intent is to evaluate the system before one
specifies the solution. Just as it is better to perform a HAZOP before you build the plant
rather than afterwards, it is better to analyze the proposed safety system before you
specify, build, and install it. The reason for both is simple. It is cheaper, faster, and
easier to redesign on paper. The topic of system modeling/analysis is described in
greater detail in the references listed at the end of this chapter.
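As a sketch of such a calculation under common simplifying assumptions (a single device, constant dangerous undetected failure rate, no diagnostics or common cause), the PFDavg of a 1oo1 function is approximately the failure rate times half the test interval:

```python
# Simplified 1oo1 PFDavg estimate: lambda_DU * TI / 2, where lambda_DU is
# the dangerous undetected failure rate and TI the manual test interval.
def pfd_avg_1oo1(lambda_du_per_hr: float, test_interval_hr: float) -> float:
    return lambda_du_per_hr * test_interval_hr / 2.0

# Illustrative: 2e-6 dangerous undetected failures/hr, tested yearly.
pfd = pfd_avg_1oo1(2e-6, 8760)
print(f"PFDavg = {pfd:.1e}, RRF = {1 / pfd:.0f}")  # ~8.8e-03 -> SIL 2 band
```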
Detail design involves the actual documentation and fabrication of the system. Once a
design has been chosen, the system must be engineered and built following strict and
conservative procedures. This is the only realistic method we know of for preventing
design and implementation errors. The process requires thorough documentation that
serves as an auditable trail that someone else may follow for independent verification
purposes. It is difficult to catch one’s own mistakes.
After the system is constructed, the hardware and software should be fully tested at the
integrator’s facility. Any changes that may be required will be easier to implement at the
factory rather than the installation site.
Installation, Commissioning, and Validation
It is important to ensure that the system is installed and started up according to the
design requirements, and that it performs according to the safety requirements
specification. The entire system must be checked, this time including the field devices.
There should be detailed installation, commissioning, and testing documents outlining
each procedure to be carried out. Completed tests should be signed off in writing to
document that every function has been checked and has passed all tests satisfactorily.
Operations and Maintenance
Not all faults are self-revealing. Therefore, every SIS must be periodically tested and
maintained. This is necessary to make certain that the system will respond properly to
an actual demand. The frequency of inspection and testing will have been determined
earlier in the life cycle (i.e., system modeling/analysis). All testing must be documented.
This will enable an audit to determine if the initial assumptions made during the design
(failure rates, failure modes, test intervals, diagnostic coverage, etc.) are valid based on
actual experience.
Modifications
As process conditions change, it may be necessary to make modifications to the safety
system. All proposed changes require returning to the appropriate phase of the life cycle
in order to review the impact of the change. A change that may be considered minor by
one individual may actually have a major impact to the overall process. This can only be
determined if the change is documented and thoroughly reviewed by a qualified team.
Hindsight has shown that many accidents have been caused by this lack of review.
Changes that are made must be thoroughly tested.
System Technologies
Logic Systems
There are various technologies available for use in safety systems—pneumatic,
electromechanical relays, solid state, and software-based. There is no overall “best”
system; rather, each has advantages and disadvantages. The decision over which system
may be best suited for an application will depend on many factors, such as budget, size,
level of risk, flexibility (i.e., ease of making changes), maintenance, interface and
communication requirements, and security.
Pneumatic systems are most suitable for small applications where there are concerns
over simplicity, intrinsic safety, and lack of available electrical power.
Relay systems are fairly simple, relatively inexpensive to purchase, and immune to most
forms of electromagnetic and radio-frequency interference (EMI/RFI), and they can be built
for many different voltage ranges. They generally do not incorporate any form of
interface or communications. Changes to logic require manually changing both physical
wiring and documentation. In general, relay systems are used for relatively small
applications.
Solid-state systems (hardwired systems that are designed to replace relays, yet do not
incorporate software) are relatively dated, but also available. Several of these systems
were built specifically for safety applications and include features for testing, bypasses,
and communications. Logic changes still require manually changing both wiring and
documentation. These systems have fallen out of favor with many due to their high cost,
along with the acceptance of software-based systems.
Software-based systems, generally industrial programmable logic controllers (PLCs),
offer software flexibility, self-documentation, communications, and higher-level
interfaces. Unfortunately, many general-purpose systems were not designed specifically
for safety and do not offer features required for more critical applications (such as
effective self-diagnostics). However, certain specialized single, dual, and triplicated
systems were developed for applications that are more critical and have become firmly
established in the process industries. These systems offer extensive diagnostics and
better fault tolerance schemes, and are often referred to as safety PLCs.
Field Devices
In the process industries, more hardware faults occur in the peripheral equipment—that
is, the measuring instruments/transmitters and the control valves—than in the logic
system itself.
Table 23-3 is reproduced from the 2016 version of IEC 61511. It shows the minimum
hardware fault-tolerance requirement that field devices must meet to achieve each safety
integrity level.
Low demand is considered to be less than once a year. High demand is greater than once
a year. Continuous mode is frequent/continual demands (i.e., critical control functions
with no backup safety function). A minimum hardware fault tolerance of X means that X
+ 1 dangerous failures would result in a loss of the safety function. In other words, a
fault tolerance of 0 refers to a simplex (nonredundant) configuration (i.e., a single
failure would cause a loss of the safety function). A fault tolerance of 1 refers to a 1oo2
(one out of two) or 2oo3 (two out of three) configuration. The table is essentially the
same as the one in the 2003 version of the standard, with the assumption that devices are
selected based on prior use. The point of the table is to remind people that simply using
a logic solver certified for use in SIL 3 will not provide a SIL 3 function or system all on
its own. Field devices have a major impact on overall system performance. The table
can be verified with simple calculations.
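As a rough illustration of such calculations, the sketch below applies the widely published simplified PFDavg approximations for simplex (1oo1), 1oo2, and 2oo3 voting, of the kind found in ISA-TR84.00.02-style derivations. It ignores common cause, diagnostics, and repair time, and the 50-year MTTF and one-year test interval are illustrative assumptions only, not values from the table.

```python
# Simplified PFDavg approximations for common voting architectures,
# neglecting common cause, diagnostics, and repair time (illustrative only).

def pfd_avg(arch: str, lam_d: float, ti: float) -> float:
    """lam_d: dangerous failure rate (per year); ti: test interval (years)."""
    x = lam_d * ti
    if arch == "1oo1":
        return x / 2       # simplex: lambda * TI / 2
    if arch == "1oo2":
        return x ** 2 / 3  # one out of two
    if arch == "2oo3":
        return x ** 2      # two out of three
    raise ValueError(f"unknown architecture: {arch}")

lam_d = 1 / 50  # illustrative 50-year dangerous MTTF
ti = 1.0        # illustrative one-year manual test interval
for arch in ("1oo1", "1oo2", "2oo3"):
    pfd = pfd_avg(arch, lam_d, ti)
    print(f"{arch}: PFDavg = {pfd:.2e}, RRF = {1 / pfd:.0f}")
```

Even on this crude model, a simplex field device tested yearly sits right at the SIL 1/SIL 2 boundary (PFDavg of 0.01), no matter how highly certified the logic solver is.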
Sensors
Sensors are used to measure process variables, such as temperature, pressure, flow, and
level. They may consist of simple pneumatic or electric switches that change state when
a set point is reached, or they may contain pneumatic or electric analog transmitters that
give a variable output in relation to the strength or level of the process variable.
Sensors, like any other devices, may fail in a number of different ways. They may cause
nuisance trips (i.e., they respond without any change of input signal). They may also fail
to respond to an actual change of input condition. While these are the two failure modes
of most concern for safety systems, there are additional failure modes as well, such as
leaking, erratic output, and responding at an incorrect level.
Most safety systems are designed to be fail-safe. This usually means that the safety
system makes the process or the equipment revert to a safe state when power is lost,
which usually means stopping production. (Nuisance trips should be avoided for safety
reasons as well, since start-up and shutdown operations are usually associated with the
highest levels of risk.) Thought must be given to how the sensors should respond in
order to be fail-safe.
Copyright © 2018. International Society of Automation (ISA). All rights reserved.
Final Elements
Final elements generally have the highest failure rates of any components in the system.
They are mechanical devices and subject to harsh process conditions. Safety shutoff
valves also suffer from the fact that they usually remain in a single position and are not
activated for long periods of time, except for testing. One of the most common failure
modes of a valve is being stuck or frozen in place. Valves should be fail-safe upon loss
of power, which usually entails the use of a spring-loaded actuator.
Solenoids are one of the most critical components of final elements. It is important to
use a good industrial grade solenoid valve. The valve must be able to withstand high
temperatures, including the heat generated by the coil itself when energized
continuously.
System Analysis
What is suitable for use in SIL 1, SIL 2, and SIL 3 applications? (SIL 4 is defined in ISA
84/IEC 61511, but users are referred to IEC 61508 because such systems should be
extremely rare in the process industry.) Which technology, which level of redundancy, and what manual test interval (including for the field devices) are questions that need to be answered. The answers are not as intuitively obvious as they may seem. Dual is not always better than simplex, and triple is not always better than dual.
We do not design nuclear power plants or aircraft by gut feel or intuition. As engineers,
we must rely on quantitative evaluations as the basis for our judgments. Quantitative
analysis may be imprecise and imperfect, but it is a valuable exercise for the following
reasons:
• It provides an early indication of a system’s potential to meet the design
requirements.
• It enables one to determine the weak link in the system (and fix it, if necessary).
In order to predict the performance of a system, one needs the performance data from all
the components. Information is available from user records, vendor records, military-style predictions, and commercially available databases in different industries.
When modeling the performance of a safety system, one needs to consider two failure
modes:
• Safe failures – These result in nuisance trips and lost production. Common terms used to describe this mode of performance are mean time between failure spurious (MTBFspurious) and nuisance trip rate.
• Dangerous failures – These result in hidden failures where the system will not respond when required. Common terms used to quantify performance in this mode are probability of failure on demand (PFD) and risk reduction factor (RRF), which is 1/PFD.
Note that safety integrity levels refer only to dangerous system performance. There is no relationship between safe and dangerous system performance. A SIL 4 system may produce a nuisance trip every month, just as a SIL 1 system may produce a nuisance trip only once in 100 years. Knowing the performance in one mode tells you nothing about the performance in the other.
There are multiple modeling techniques used to analyze and predict safety system
performance, the most common methods being reliability block diagrams, algebraic
equations, fault trees, Markov models, and Monte Carlo simulation. Each method has its
pros and cons. No method is more “right” or “wrong” than any other. They are all
simplifications and can account for different factors. Using such techniques, one can
model different technologies, levels of redundancy, test intervals, and field-device
configurations. One can model systems using a hand calculator, or develop spreadsheets
or stand-alone programs to automate and simplify the task. (A small Monte Carlo sketch appears after the table notes below.) Table 23-4 is an example of a “cookbook” that one could develop using any of the modeling techniques.
Table 23-4 Notes:
Such tables are by their very nature oversimplifications. It is not possible to show the
impact of all design features (failure rates, failure-mode splits, diagnostic levels,
quantities, manual test intervals, common-cause factors, proof-test coverage, impact of
bypassing, etc.) in a single table. Users are urged to perform their own analysis in order
to justify their design decisions. The above table should be considered an example only,
based on the following assumptions:
1. Separate logic systems are assumed for safety applications. Safety functions
should not be performed solely within the BPCS.
2. One sensor and one final element are assumed. Field devices are assumed to
have a mean time between failure (MTBF) in both failure modes (safe and
dangerous) of 50 years.
3. Simplex (nonredundant) transmitters are assumed to have 30% diagnostics; fault
tolerant transmitters with comparison have greater than 95% diagnostics.
4. Transmitters with comparison means the control transmitter is compared with the safety transmitter; 90% diagnostics are assumed.
5. Dumb valves offer no self-diagnostics; smart valves (e.g., automated partial
stroking valves) are assumed to offer 80% diagnostics.
6. When considering solid-state logic systems, only solid-state systems specifically
built for safety applications should be considered. These systems are either
inherently fail-safe (like relays) or they offer extensive self-diagnostics.
7. General-purpose PLCs are not appropriate beyond SIL 1 applications. They do
not offer effective enough diagnostic levels to meet the higher performance
requirements. Check with your vendors for further details.
8. One-year manual testing is assumed for all devices. (More frequent testing
would offer higher levels of safety performance.)
9. Fault tolerant configurations are assumed to be either 1oo2 or 2oo3. (1oo2, or
“one out of two,” means there are two devices and either one can trip/shut down
the system.) The electrical equivalent of 1oo2 is two closed and energized
switches wired in series and connected to a load. 1oo2 configurations are safe, at
the expense of more nuisance trips. 2oo2 configurations are less safe than
simplex and should only be used if it can be documented that they meet the
overall safety requirements.
10. The above table does not categorize the nuisance trip performance of any of the
systems.
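As one illustration of the modeling techniques mentioned above, the following minimal Monte Carlo sketch estimates the PFDavg of a 1oo2 sensor pair, assuming perfect periodic proof testing. The failure rate and test interval are illustrative assumptions, not values from the table.

```python
import random

def simulate_pfd_1oo2(lam_d: float, ti: float, trials: int = 1_000_000) -> float:
    """Estimate PFDavg for a 1oo2 pair: draw exponential dangerous-failure
    times for each channel and check whether both channels have failed
    by the time a demand arrives at a random point in the test interval."""
    failures = 0
    for _ in range(trials):
        t1 = random.expovariate(lam_d)
        t2 = random.expovariate(lam_d)
        demand = random.uniform(0.0, ti)
        if t1 < demand and t2 < demand:  # neither channel can respond
            failures += 1
    return failures / trials

lam_d, ti = 1 / 50, 1.0  # illustrative: 50-year MTTF, 1-year test interval
print(f"1oo2 PFDavg ~ {simulate_pfd_1oo2(lam_d, ti):.1e}")  # ~(lam_d*ti)**2 / 3
```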
Key Points
• Follow the steps defined in the safety-design life cycle.
• If you cannot define it, you cannot control it.
• Justify and document all your decisions (i.e., leave an auditable trail).
• The goal is to have an inherently safe process (i.e., one in which you do not even
need an SIS).
• Do not put all of your eggs in one basket (i.e., have multiple, independent safety
layers).
• The SIS should be fail-safe and/or fault-tolerant.
• Analyze the problem before you specify the solution.
• All systems must be tested periodically.
• Never leave points in bypass during normal operation!
Rules of Thumb
• Maximize diagnostics. (This is the most critical factor in safety performance.)
• Any indication is better than no indication (transmitters have advantages over
switches, systems should provide indications even when signals are in bypass,
etc.).
• Minimize potential common-cause problems.
• General-purpose PLCs are not suitable for use beyond SIL 1.
• When possible, use independently approved and/or certified components/systems
(exida, TÜV, etc.).
Further Information
ANSI/ISA-84.00.01-2004 (IEC 61511 Mod). Functional Safety: Safety Instrumented Systems
for the Process Industry Sector. Research Triangle Park, NC: ISA (International
Society of Automation).
Chiles, James R. Inviting Disaster. New York: Harper Business, 2001. ISBN 0-06-662081-3.
“Energized by Safety: At Conoco, Putting Safety First Puts Profits First Too.” Continental magazine (February 2002): 49–51.
Goble, William M. Control System Safety Evaluation and Reliability. Research Triangle
Park, NC: ISA (International Society of Automation), 1998. ISBN 1-55617-636-8.
Guidelines for Chemical Process Quantitative Risk Analysis. New York: Center for
Chemical Process Safety (CCPS) of the AIChE (American Institute of Chemical
Engineers), 1989. ISBN 0-8169-0402-2.
Guidelines for Hazard Evaluation Procedures. New York: Center for Chemical Process
Safety (CCPS) of the AIChE (American Institute of Chemical Engineers), 1992.
ISBN 0-8169-0491-X.
Guidelines for Safe Automation of Chemical Processes. New York: Center for Chemical
Process Safety (CCPS) of the AIChE (American Institute of Chemical Engineers),
1993. ISBN 0-8169-0554-1.
Gruhn, P., and H. Cheddie. Safety Instrumented Systems: Design, Analysis, and
Justification. Research Triangle Park, NC: ISA (International Society of
Automation), 2006. ISBN 1-55617-956-1.
IEC 61508:2010. Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems. Geneva, Switzerland: IEC (International Electrotechnical Commission).
ISA-TR84.00.02-2002-Parts 1-5. Safety Instrumented Functions (SIF) Safety Integrity
Level (SIL) Evaluation Techniques Package. Research Triangle Park, NC: ISA
(International Society of Automation).
Kletz, Trevor A. What Went Wrong? Case Histories of Process Plant Disasters. 3rd ed.
Houston, TX: Gulf Publishing Co., 1994. ISBN 0-88415-0-5.
Layer of Protection Analysis. New York: Center for Chemical Process Safety (CCPS) of the AIChE (American Institute of Chemical Engineers), 2001. ISBN 0-8169-0811-7.
Leveson, Nancy G. Safeware—System Safety and Computers. Reading, MA: Addison-Wesley, 1995. ISBN 0-201-11972-2.
Marszal, Edward M., and Dr. Eric W. Scharpf. Safety Integrity Level Selection:
Systematic Methods Including Layer of Protection Analysis. Research Triangle
Park, NC: ISA (International Society of Automation), 2002.
Perrow, Charles. Normal Accidents. Princeton, NJ: Princeton University Press, 1999.
ISBN 0-691-00412-9.
About the Author
Paul Gruhn is a global functional safety consultant with aeSolutions in Houston, Texas.
Gruhn—an ISA member for more than 25 years—is an ISA Life Fellow, co-chair of the
ISA84 standard committee (on safety instrumented systems), the developer and
instructor of ISA courses on safety systems, the author of two ISA textbooks, and the
developer of the first commercial, safety-system software modeling program. Gruhn has
a BS in mechanical engineering from Illinois Institute of Technology, is a licensed
Professional Engineer (PE) in Texas, and is both a Certified Functional Safety Expert
(CFSE) and an ISA84 Safety Instrumented Systems Expert.
24
Reliability
By William Goble
Introduction
There are several common metrics used within the field of reliability engineering.
Primary ones include reliability, unreliability, availability, unavailability, and mean time
to failure (MTTF). However, when different failure modes are considered, as they are
when doing safety instrumented function (SIF) verification, then new metrics are
needed. These include probability of failing safely (PFS), probability of failure on demand (PFD), probability of failure on demand average (PFDavg), mean time to failure spurious (MTTFspurious), and mean time to dangerous failure (MTTFD).
Measurements of Successful Operation: No Repair
Probability of success – This is often defined as the probability that a system will
perform its intended function when needed and when operated within its specified
limits. The phrase at the end of the last sentence tells the user of the equipment that the
published failure rates apply only when the system is not abused or otherwise operated
outside of its specified limits.
Using the rules of reliability engineering, one can calculate the probability of successful
operation for a particular set of circumstances. Depending on the circumstances, that
probability is called reliability or availability (or, on occasion, some other name).
Reliability – A measure of successful operation for a specified interval of time.
Reliability, R(t), is defined as the probability that a system will perform its intended
function when required to do so if operated within its specified limits for a specified
operating time interval (Billinton 1983). The definition includes five important aspects:
1. The system’s intended function must be known.
2. When the system is required to function must be judged.
3. Satisfactory performance must be determined.
4. The specified design limits must be known.
5. An operating time interval is specified.
Consider a newly manufactured and successfully tested component. It operates properly
when put into service (T = 0). As the operating time interval (T) increases, it becomes
less likely that the component will remain successful. Since the component will
eventually fail, the probability of success for an infinite time interval is zero. Thus, all
reliability functions start at a probability of one and decrease to a probability of zero
(Figure 24-1).
Reliability is a function of the operating time interval. A statement such as “system
reliability is 0.95” is meaningless because the time interval is not known. The statement
“the reliability equals 0.98 for a mission time of 100 hours” makes perfect sense.
A reliability function can be derived directly from probability theory. Assume the
probability of successful operation for a 1-hour time interval is 0.999. What is the
probability of successful operation for a 2-hour time interval? The system will be
successful only if it is successful for both the first hour and the second hour. Therefore,
the 2-hour probability of success equals:
0.999 • 0.999 = 0.998
(24-1)
The analysis can be continued for longer time intervals. For each time interval, the
probability can be calculated by the equation:
P(t) = 0.999^t
(24-2)
Figure 24-2 shows a plot of probability versus operating time using this equation. The
plot is a reliability function.
Reliability is a metric originally developed to determine the probability of successful
operation for a specific “mission time.” For example, if a flight time is 10 hours, a
logical question is, “What is the probability of successful operation for the entire
flight?” The answer would be the reliability for the 10-hour duration. It is generally a
measurement applicable to situations where online repair is not possible, like an
unmanned space flight or an airborne aircraft. Unreliability is the complement of
reliability. It is defined as the probability of failure during a specific mission time.
Mean time to failure (MTTF) – One of the most widely used reliability parameters is
the MTTF. It has been formally defined as the “expected value” of the random variable
time to fail, T. Unfortunately, the metric has evolved into a confusing number. MTTF
has been misused and misunderstood. It has been misinterpreted as “guaranteed
minimum life.”
Formulas for MTTF are derived and often used for products during the useful life
period. This method excludes wear-out failures. Ask an experienced plant engineer, “What is the MTTF of a pressure transmitter?” and the answer might be “35 years,” factoring in wear-out. That engineer, shown a specified MTTF of 300 years, would likely think that whoever calculated the number should come out to the plant for a few years and see the real world.
Generally, the term MTTF is defined during the useful life of a device. “End of life”
failures are generally not included in the number.
Constant failure rate – When a constant failure rate is assumed (which is valid during
the useful life of a device), then the relationship between reliability, unreliability, and
MTTF are straightforward. If the failure rate is constant then:
λ(t) = λ
(24-3)
For that assumption, it can be shown that:
R(t) = e^(–λt)
(24-4)
F(t) = 1 – e^(–λt)
(24-5)
MTTF = 1/λ
(24-6)
And:
λ = 1/MTTF
Figure 24-3 shows the reliability and unreliability functions for a constant failure rate of
0.001 failures per hour. Note the plot for reliability looks the same as Figure 24-2, which
shows the probability of successful operation given a probability of success for 1 hour
of 0.999. It can be shown that a constant probability of success is equivalent to an
exponential probability of success distribution as a function of operating time interval.
Useful Approximations
Mathematically, it can be shown that certain functions can be approximated by a series
of other functions. For all values of x, it can be shown as in Equation 24-7:
e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + . . .
(24-7)
For a sufficiently small value of x, the exponential can be approximated with:
e^x ≈ 1 + x
Substituting –λt for x:
e^(–λt) ≈ 1 – λt
Thus, there is an approximation for unreliability when λt is sufficiently small:
F(t) ≈ λt
(24-8)
Remember, this is only an approximation and not a fundamental equation. Often the notation for unreliability is PF (probability of failure), and the equation is shown as:
PF(t) ≈ λt
(24-9)
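A quick numerical check, using an illustrative constant failure rate of 0.001 failures per hour, shows where the λt approximation is reasonable and where it breaks down:

```python
import math

lam = 0.001  # illustrative constant failure rate, failures per hour
for t in (10, 100, 1000, 5000):
    exact = 1 - math.exp(-lam * t)  # F(t) from Equation 24-5
    approx = lam * t                # the approximation of Equation 24-8
    print(f"t = {t:5d} h: F(t) = {exact:.4f}, lambda*t = {approx:.4f}")
```

At λt = 0.01 the two values agree to within about half a percent; by λt = 1 the approximation overstates the unreliability by more than 50%.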
Measurements of Successful Operation: Repairable Systems
The reliability metric requires that a system be successful for an interval of time. While
this probability is a valuable metric for situations where a system cannot be repaired
during a mission, something different is needed for an industrial process control system
where repairs can be made—often with the process operating.
Mean time to restore (MTTR) – MTTR is the “expected value” of the random variable
referred to as restore time (or time to repair). The definition includes the time required
to detect that a failure has occurred, as well as the time required to make a repair once
the failure has been detected and identified. Like MTTF, MTTR is an average value.
MTTR is the average time required to move from unsuccessful operation to successful
operation.
In the past, the acronym MTTR stood for mean time to repair. The term was changed in
IEC 61508 because of confusion as to what was included. Some thought that mean time
to repair included only the actual repair time. Others interpreted the term to include both
time to detect a failure (diagnostic time) and actual repair time. The term mean dead
time (MDT), commonly used in some parts of the world, means the same as MTTR.
MTTR is a term created to include both diagnostic detection time and actual repair time.
Of course, when actually estimating MTTR, one must include time to detect, recognize,
and identify the failure; time to obtain spare parts; time for repair team personnel to
respond; actual time to do the repair; time to document all activities; and time to get the
equipment back in operation.
Reliability engineers often assume that the probability of repair is an exponentially
distributed function, in which case the “restore rate” is a constant. The lowercase Greek
letter mu is used to represent restore rate by convention. The equation for restore rate is:
µ = 1/MTTR
(24-10)
Restore times can be difficult to estimate. This is especially true when periodic activities
are involved. Imagine the situation where a failure in the safety instrumented system
(SIS) is not noticed until a periodic inspection and test is done. The failure may occur
right before the inspection and test, in which case the detection time might be near zero.
On the other hand, it may occur right after the inspection and test, in which case the
detection time may get as large as the inspection period.
In such cases, it is probably best to model repair probability as a periodic function, not
as a constant (Bukowski 2001). This is discussed later in the section “Average
Unavailability with Periodic Inspection and Test.”
Mean time between failures (MTBF) – MTBF is defined as the “average time period
of a failure/repair cycle” (Goble 2010). It includes time to failure, any time required to
detect the failure, and actual repair time. This implies that a component has failed and it
has been successfully repaired. For a simple repairable component,
MTBF = MTTF + MTTR
(24-11)
The MTBF term can also be confusing. Since MTTR is usually much smaller than MTTF, MTBF is approximately equal to MTTF. The term MTBF is often substituted for MTTF and applied to both repairable and non-repairable systems. Because of this confusion, the term MTBF is rarely used in recent times.
Availability – The reliability measurement was not sufficiently useful for engineers who
needed to know the average chance of success of a system when repairs are possible.
Another measure of system success for repairable systems was needed; that metric is
availability. Availability is defined as the probability that a device is successful at time t
when needed and operated within specified limits. No operating time interval is directly
involved. If a system is operating successfully, it is available. It does not matter whether
it has failed in the past and has been repaired or has been operating continuously from
startup without any failures. Availability is a measure of “uptime” in a system, unit, or
module.
Availability and reliability are different metrics. Reliability is always a function of
failure rates and operating time intervals. Availability is a function of failure rates and
repair rates. While instantaneous availability will vary during the operating time
interval, this is due to changes in failure probabilities and repair situations. Often
availability is calculated as an average over a long operating time interval. This is
referred to as steady-state availability.
In some systems, especially SISs, the repair situation is not constant. In SISs, the
situation occurs when failures are discovered and repaired during a periodic inspection
and test. For these systems, steady-state availability is NOT a good measure of system
success. Instead, average availability is calculated for the operating time interval
between inspections. (Note: This is not the same measurement as steady-state
availability.)
Unavailability – This is a measure of failure used primarily for repairable systems. It is
defined as the probability that a device is not successful (is failed) at time t. Different
metrics can be calculated, including steady-state unavailability and average
unavailability, over an operating time interval. Unavailability is the complement of
availability; therefore,
U(t) = 1 – A(t)
(24-12)
Steady-State Availability – Traditionally, reliability engineers have assumed a constant
repair rate. When this is done, probability models can be solved for steady state or
average probability of successful operation. The metric can be useful, but it has
relevance only for certain classes of problems with random restoration characteristics.
(Note: Steady-state availability solutions are not suitable for systems where failures are
detected with periodic proof test inspections.)
Figure 24-4 shows a Markov probability model of a single component with a single
failure mode. This model can be solved for steady-state availability and steady-state
unavailability.
A = MTTF / (MTTF + MTTR)
(24-13)
U = MTTR / (MTTF + MTTR)
(24-14)
When the Markov model in Figure 24-4 is solved for availability as a function of the
operating time interval, the result is shown in Figure 24-5, labeled A(t). It can be seen
that the availability reaches a steady state after some period of time.
Figure 24-6 shows a plot of unavailability versus unreliability. These plots are
complementary to those shown in Figure 24-5.
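As a brief worked example of Equations 24-13 and 24-14, assuming an illustrative 50-year MTTF and an 8-hour mean time to restore:

```python
HOURS_PER_YEAR = 8760

mttf = 50 * HOURS_PER_YEAR  # illustrative 50-year MTTF, in hours
mttr = 8.0                  # illustrative 8-hour MTTR

A = mttf / (mttf + mttr)    # Equation 24-13
U = mttr / (mttf + mttr)    # Equation 24-14
print(f"A = {A:.6f}, U = {U:.1e}")  # A = 0.999982, U = 1.8e-05
```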
Average Unavailability with Periodic Inspection and Test
In low-demand SIS applications with periodic inspection and test, the restore rate is
NOT constant nor is it random. For failures not detected until a periodic inspection and
test, the restore rate is zero until the time of the test. If it is discovered the system is
operating successfully, then the probability of failure is set to zero. If it is discovered the
system has a failure, it is repaired. In both cases, the restore rate is high for a brief
period of time. Dr. Julia V. Bukowski has described this situation and proposed
modeling perfect test and repair as a periodic impulse function (Bukowski 2001).
Figure 24-7 shows a plot of probability of failure in this situation. This can be compared
with unavailability calculated with the constant restore rate model as a function of
operating time. With the constant restore model, the unavailability reaches a steady-state
value. This value is clearly different from the result that would be obtained by averaging
the unavailability calculated using a periodic restore period.
It is often assumed that periodic inspection and test will detect all failed components and
the system will be renewed to perfect condition. Therefore, the unreliability function is
suitable for the problem. A mission time equal to the time between periodic inspection
and test is used. In SIS applications, the objective is to identify a model for the
probability that a system will fail when a dangerous condition occurs. This dangerous
condition is called a demand.
Our objective, then, is to calculate the probability of failure on demand. If the system is
operating in an environment where demands are infrequent (e.g., once per 10 years) and
independent from system proof tests, then an average of the unreliability function will
provide the average probability of failure. This, by definition, is an “unavailability
function” because repair is allowed. (Note: This averaging technique is not valid when
demands are more frequent. Special modeling techniques are needed in that case.)
As an example, consider the single component unreliability function given in Equation
24-5.
F(t) = 1 – e^(–λt)
This can be approximated as explained previously with Equation 24-8.
F(t) ≈ λt
The average can be obtained by using the expected value equation:
PFavg = (1/t) ∫[0,t] F(u) du
(24-15)
with the result being an approximation equation:
PFavg = λt/2
(24-16)
For a single component (nonredundant) or a single channel system with perfect test and
repair, the approximation is shown in Figure 24-8.
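The approximation can be checked numerically. The sketch below averages the exact unreliability of Equation 24-5 over an illustrative one-year test interval and compares the result with λt/2:

```python
import math

lam = 1 / (50 * 8760)  # illustrative failure rate per hour (50-year MTTF)
ti = 8760              # one-year proof-test interval, in hours

# Midpoint-rule average of F(t) = 1 - exp(-lam*t) over the test interval
n = 10_000
avg = sum(1 - math.exp(-lam * (k + 0.5) * ti / n) for k in range(n)) / n
print(f"numerical average of F(t): {avg:.5f}")           # ~0.00993
print(f"lam * ti / 2:              {lam * ti / 2:.5f}")  # 0.01000
```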
Periodic Restoration and Imperfect Testing
It is quite unrealistic to assume that inspection and testing processes will detect all
failures. In the worst case, assume that testing is not done. In that situation, what is the
mission time? If the equipment is used for the life of an industrial facility, plant life is
the mission time. Probability of failure would be modeled with the unreliability function
using the plant life as the time interval.
If the equipment is required to operate only on demand, and the demand is independent
of system failure, then the unreliability function can be averaged as explained in the
preceding section.
When only some failures are detected during the periodic inspection and test, then the
average probability of failure can be calculated using an equation that combines the two
types of failures—those detected by the test and those undetected by the test. One must
estimate the percentage of failures detected by the test to make this split (Van Beurden
2018, Chapter 12). The equation would be:
PFavg = CPT λ TI/2 + (1 – CPT) λ LT/2
(24-17)
where
λ = the failure rate
CPT = the percentage of failures detected by the proof test
TI = the periodic test interval
LT = the lifetime of the process unit
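A minimal sketch of Equation 24-17, with illustrative values for the coverage, test interval, and plant lifetime (none of these values come from the text):

```python
def pfd_avg_imperfect_test(lam: float, c_pt: float, ti: float, lt: float) -> float:
    """Equation 24-17: failures found by the proof test are renewed every TI;
    the remainder persist until the end of the process unit's lifetime LT."""
    return c_pt * lam * ti / 2 + (1 - c_pt) * lam * lt / 2

lam = 1 / 50        # illustrative failure rate, per year
c_pt = 0.9          # illustrative 90% proof-test coverage
ti, lt = 1.0, 20.0  # yearly test; 20-year plant lifetime (illustrative)
print(f"PFDavg = {pfd_avg_imperfect_test(lam, c_pt, ti, lt):.3f}")  # 0.029
```

Note that even with 90% coverage, the failures the proof test misses dominate the result (0.020 of the 0.029).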
Equipment Failure Modes
Instrumentation equipment can fail in different ways. We call these failure modes.
Consider a two-wire pressure transmitter. This instrument is designed to provide a 4–20
mA electrical current signal in proportion to the pressure input. A detailed failure modes, effects, and diagnostic analysis (Goble 1999) of several of these devices reveals failure modes such as frozen output, current to upper limit, current to lower limit, diagnostic failure, communications failure, and drifting/erratic output. These
instrument failures can be classified into failure mode categories when the application is
known.
If a single transmitter (no redundancy) were connected to a safety programmable logic
controller (PLC) programmed to trip when the current goes up (high trip), then the
instrument failure modes could be classified as shown in Table 24-1.
Consider the possible failure modes of a PLC with a digital input and a digital output,
both in a de-energize-to-trip (logic 0) design. The PLC failure modes can be categorized
relative to the safety function as shown in Table 24-2.
Final element components will also fail and, again, the specific failure modes of the
components can be classified into relevant failure modes depending on the application.
It is important to know if a valve will open or close when it trips. Table 24-3 shows an
example failure mode classification based on a close-to-trip configuration.
It should be noted that the above failure mode categories apply to an individual
instrument and may not apply to the set of equipment that performs an SIF, as the
equipment set may contain redundancy. It should also be made clear that the above
listings are not intended to be comprehensive or representative of all component types.
Fail-Safe
Most practitioners define fail-safe for an instrument as “a failure that causes a ‘false or
spurious’ trip of a safety instrumented function unless that trip is prevented by the
architecture of the safety instrumented function.” Many formal definitions, including
IEC 61508:2010, define it as “a failure that causes the system to go to a safe state or
increases the probability of going to a safe state.” This definition is useful at the system
level and includes many cases where redundant architectures are used. However, it also
includes failures of automatic diagnostic components, which have a very different impact on the probability of a false trip.
IEC 61508:2000 uses the definition of “failure [that] does not have the potential to put
the safety-related system in a hazardous or fail-to-function state.” This definition
includes many failures that do not cause a false trip under any circumstances and is
quite different from the definition practitioners need to calculate the false-trip
probability.
Fail-Danger
Many practitioners define fail-danger as “a failure that prevents a safety instrumented
function from performing its automatic protection function.” Variations of this definition
exist in standards. IEC 61508 provides a similar definition that reads “failure that has
the potential to put the safety-related system in a hazardous or fail-to-function state.”
The definition from IEC 61508 goes on to add a note: “Whether or not the potential is
realized may depend on the channel architecture of the system; in systems with multiple
channels to improve safety, a dangerous hardware failure is less likely to lead to the
overall dangerous or fail-to-function state.” The note from IEC 61508 recognizes that a
definition for a piece of equipment may not have the same meaning at the SIF level or
the system level.
Annunciation
Some practitioners recognize that certain failures within equipment used in an SIF
prevent the automatic diagnostics from operating correctly. When reliability models are built, many account for the automatic diagnostics’ ability to reduce the probability of failure. When these diagnostics stop working, the probability of a dangerous failure or a false trip increases. These effects may not be significant, but unless they are modeled, their magnitude is not known.
An annunciation failure is therefore defined as “a failure that prevents automatic
diagnostics from detecting or annunciating that a failure has occurred inside the
equipment” (Goble 2010). Note the failure may be within the equipment that fails or
inside an external piece of equipment designed for automatic diagnostics. These failures
would be classified as fail-safe in the definition provided in IEC 61508:2000.
No Effect
Some failures within a piece of equipment have no effect on the safety instrumented
function nor do they cause a false trip or prevent automatic diagnostics from working.
Some functionality performed by the equipment is impaired but that functionality is not
needed. These may simply be called no effect failures. They are typically not used in
any reliability model intended to obtain the probability of a false trip or of a fail-danger.
Detected/Undetected
Failure modes can be further classified as detected or undetected by automatic
diagnostics performed somewhere in the SIS.
Safety Instrumented Function Modeling of Failure Modes
When evaluating SIF safety integrity, an engineer must examine more than the
probability of successful operation. The probability of each failure mode of the system must be calculated individually. The normal metrics of reliability, availability, and MTTF only
suggest a measure of success. Additional metrics to measure safety integrity include
PFD, PFDavg, MTTFD, and risk reduction factor (RRF). Other related terms are
MTTFspurious and PFS.
PFS/PFD
There is a probability that a safety instrumented function will fail and cause a
spurious/false trip of the process. This is called the probability of failing safely (PFS).
There is also a probability that a safety instrumented function will fail such that it
cannot respond to a potentially dangerous condition. This is called the probability of
failure on demand (PFD).
PFDavg
PFD average (PFDavg) is a term used to describe the average probability of failure on
demand. PFD will vary as a function of the operating time interval of the equipment. It
will not reach a steady-state value if any periodic inspection, test, and repair are done.
Therefore, the average value of PFD over a period of time can be a useful metric if it is
assumed that the potentially dangerous condition (also called a hazard) is independent
from equipment failures in the SIF.
The assumption of independence between hazards and SIF failures seems very realistic.
(Note: If control functions and safety functions are performed by the same equipment,
the assumption may not be valid! Detailed analysis must be done to ensure safety in
such situations, and it is best to avoid such designs completely.) When hazards and
equipment are independent, it is realized that a hazard may come at any time. Therefore,
international standards have specified that PFDavg is an appropriate metric for
measuring the effectiveness of an SIF.
PFDavg is defined as the arithmetic mean over a defined time interval. For situations
where a safety instrumented function is periodically inspected and tested, the test
interval is the correct time period. Therefore:
PFDavg = (1/TI) ∫[0,TI] PFD(t) dt
(24-18)
This definition is used to obtain numerical results in several of the system-modeling
techniques. In a discrete-time Markov model using numerical solution techniques, a
direct average of the time-dependent numerical values will provide the most accurate
answer. When analytical equations for PFD are obtained using a fault tree, the above
equation can be used to obtain equations for PFDavg.
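A minimal discrete-time sketch of that numerical-averaging idea follows, for a single component with an illustrative failure rate and test interval; a full Markov model would add further states for safe, detected, and undetected failures:

```python
lam_d = 1 / (50 * 8760)  # illustrative dangerous failure rate, per hour
ti = 8760                # proof-test interval, in hours

# One-hour time steps: accumulate the probability of sitting in the
# failed-dangerous state, then average it over the test interval.
p_fail = 0.0
pfd_sum = 0.0
for _ in range(ti):
    pfd_sum += p_fail
    p_fail += (1.0 - p_fail) * lam_d  # transition OK -> failed-dangerous

print(f"PFDavg = {pfd_sum / ti:.4f}")  # ~0.0099, close to lam_d * ti / 2
```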
It has become recognized that at least nine variables may impact a PFDavg calculation
depending on the application (Van Beurden 2016). It is important that realistic analysis
be used for safety design processes.
Redundancy
There are applications where the reliability or safety integrity of a single instrument is
not sufficient. In these cases, more than one instrument is used in a design. Some
arrangements of the instruments are designed to provide higher reliability (typically to
protect against a single “safe” failure). Other arrangements of instruments are designed
to provide higher safety integrity (typically to protect against a single “dangerous”
failure). There are also arrangements that are designed to provide both high reliability
and high safety integrity. When multiple instruments are wired (or configured) to
provide redundancy to protect against one or more failure modes, these arrangements
are known as architectures. A listing of some common architectures is shown in Table
24-4. These architectures are described in detail in Chapter 14 of Control System Safety
Evaluation and Reliability, Third Edition (Goble 2010).
The naming convention stands for X out of Y, where Y is the number of equipment sets
in the design and X is the number of equipment sets needed to perform the function. In
some advanced architecture names, the term D is added to designate a switch that is
controlled by diagnostics to reconfigure the equipment if a failure is detected in one
equipment set.
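The voting logic behind the naming convention can be stated compactly. This is only a sketch of the XooY semantics, not of the failure behavior of any particular architecture:

```python
def trips(votes: list[bool], x: int) -> bool:
    """XooY voting: the function trips when at least X of the Y channel
    votes (True = vote to trip) agree."""
    return sum(votes) >= x

print(trips([True, False], x=1))         # 1oo2: either channel trips -> True
print(trips([True, False, False], x=2))  # 2oo3: one vote is not enough -> False
print(trips([True, True, False], x=2))   # 2oo3: two votes trip -> True
```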
Conclusions
General system availability as well as the dangerous failure mode metric, PFDavg, are
dependent on variables (Van Beurden 2016) as described above. These include failure
rates, proof testing intervals, proof test coverage, proof test duration, automatic
diagnostics, redundancy, and operational/maintenance capability. It is important that
realistic parameters be used and that all relevant parameters be included in any
calculation.
Further Information
Billinton, R., and Allan, R. N. Reliability Evaluation of Engineering Systems: Concepts
and Techniques. New York: Plenum Press, 1983.
Bukowski, J. V. “Modeling and Analyzing the Effects of Periodic Inspection on the
Performance of Safety-Critical Systems.” IEEE Transactions on Reliability 50, no.
3 (2001).
Goble, W. M. Control System Safety Evaluation and Reliability. 3rd ed. Research
Triangle Park, NC: ISA (International Society of Automation), 2010.
Goble, W. M., and Brombacher, A. C. “Using a Failure Modes, Effects and Diagnostic
Analysis (FMEDA) to Measure Diagnostic Coverage in Programmable Electronic
Systems.” Reliability Engineering and System Safety 66, no. 2 (November 1999).
IEC 61508:2010 Ed. 2.0. Functional Safety of Electrical/Electronic/Programmable
Electronic Safety-Related Systems. Geneva, Switzerland: IEC (International
Electrotechnical Commission).
IEC 61511:2016 Ed. 2.0. Functional Safety – Safety Instrumented Systems for the Process Industry Sector. Geneva, Switzerland: IEC (International Electrotechnical Commission).
Van Beurden, I., and Goble W. M. Safety Instrumented System Design: Techniques and
Design Verification. Research Triangle Park, NC: ISA (International Society of
Automation), 2018.
——— The Key Variables Needed for PFDavg Calculation. White paper. Sellersville,
PA: exida, 2016. www.exida.com.
About the Author
William M. Goble, PhD, is currently managing director of exida.com, a knowledge
company that provides ANSI-accredited functional safety and cybersecurity
certification, failure data research, system consulting, training, and support for safety-critical and high-availability process automation. He has more than 40 years of
experience in control systems product development, engineering management,
marketing, training, and consulting. Goble has a BSEE from the Pennsylvania State
University, an MSEE from Villanova, and a PhD from Eindhoven University of
Technology in reliability engineering. He is a registered professional engineer in the
state of Pennsylvania and a Certified Functional Safety Expert (CFSE). He is an ISA
Fellow and an author of several ISA books.
VIII
Network Communications
Analog Communications
This chapter provides an overview of the history of analog communications, from direct mechanical devices to the present digital networks, and it puts the reasons for many of the resulting analog communications standards into context through examples of typical installations.
Wireless
Wireless solutions can dramatically reduce the cost of adding measurement points,
making it feasible to include measurements that were not practical with traditional
wired solutions. This chapter provides an overview of the principal field-level sensor
networks and items that must be considered for their design and implementation.
Cybersecurity
Integrating systems and communications is now fundamental to automation. While some
who work in a specific area of automation may have been able to avoid a good
understanding of these topics, that isolation is rapidly coming to an end. With the rapid
convergence of information technology (IT) and operations technology (OT), network
security is a critical element in an automation professional’s repertoire.
Many IT-based tools may solve the integration issue; however, they usually do not deal
with the unique real-time and security issues in automation, and they often ignore the
plant-floor issues. As a result, no topic is hotter today than network security—including the Internet. Automation professionals working on any type of integration must pay attention to the security of their systems.
25
Analog Communications
By Richard H. Caro
The earliest process control instruments were mechanical devices in which the sensor
was directly coupled to the control mechanism, which in turn was directly coupled to
the control valve. Usually, a dial indicator was provided to enable the process variable
value to be read. These devices are still being used today and are called self-actuating
controllers or often just regulators. These mechanical controllers often take advantage
of a physical property of some fluid to operate the final control element. For example, a
fluid-filled system can take advantage of the thermal expansion of the fluid to both
sense temperature and operate a control valve. Likewise, process pressure changes can
be channeled mechanically or through filled systems to operate a control valve. Such
controllers are proportional controllers with some gain adjustment available through
mechanical linkages or some other mechanical advantage. We now know that they can
exhibit some offset error.
While self-actuating controllers (see Figure 25-1) are usually low-cost devices, it was
quickly recognized that it would be easier and safer for the process operator to monitor
and control processes if there was an indication of the process variable in a more
convenient and protected place. Therefore, a need was established to communicate the
process variable from the sensor that remained in the field to a remote operator panel.
The mechanism created for this communication was air pressure over the range 3–15
psi. This is called pneumatic transmission. Applications in countries using the metric
system required the pressure in standard international units to be 20–100 kPa, which is
very close to the same pressures as 3–15 psi. The value of using 3 psi (or 20 kPa) rather than zero is to provide a live zero, so that failure of the instrument air supply can be detected. The value selected for 100% is 15 psi (or 100 kPa) because it is well below the nominal pressure of the air supply.
However, the operator still had to go to the field to change the set point of the controller.
The solution was to build the controller into the display unit mounted at the operator
panel using pneumatic computing relays. The panel-mounted controller could be more
easily serviced than if it was in the field. The controller output was in the 3–15 psi air
pressure range and piped to a control valve that was, by necessity, mounted on the
process piping in the field. The control valve was operated by a pneumatic actuator or
force motor using higher-pressure air for operation. Once the pneumatic controller was
created, innovative suppliers soon were able to add integral and derivative control to the
original proportional control in order to make the control more responsive and to correct
for offset error. Additionally, pneumatic valve positioners were created to provide
simple feedback control for control valve position. A pneumatic control loop is
illustrated in Figure 25-2.
Thousands of pneumatic instruments, controllers, and control valves have remained in
use more than 50 years after the commercialization of electronic signal transmission and
well into the digital signal transmission age. However, except for a few processes in the
manufacture of extremely hazardous gases and liquids, such as ether, there has been no
growth in pneumatic instrumentation and signal transmission. Many pneumatic process
control systems are being modernized to electronic signal transmission or directly to
digital data transmission and control.
While pneumatic data transmission and control proved to be highly reliable, it is
relatively expensive to interconnect sensors, controllers, and final control elements with
leak-free tubing. Frequent maintenance is required to repair tubing, to clean instruments
containing entrained oil from air compressors, and to remove silica gel from air driers.
In the 1960s, it was decided that the replacement for pneumatic signal transmission was
to be a small analog direct current (DC) signal, which could be used over considerable
distances on a single pair of small gauge wiring without amplification. While most
supplier companies agreed that the range of 4–20 mA was probably the best, one
supplier persisted in its demand for 10–50 mA because its equipment could not be
powered from the base 4 mA signal. The first ANSI/ISA S50.1-1972 standard was for
4–20 mA DC with an alternative at 10–50 mA. Eventually, that one supplier changed
technologies and accepted 4–20 mA DC analog signal communications. The alternative
for 10–50 mA was removed for the 1982 edition of this standard.
The reason that 4 mA was selected for the low end of the transmission range was to
provide the minimal electrical power necessary to energize the field instrument. Also,
providing a “live zero” that is different from zero mA proves that the field instrument is
operating and provides a small range in which to indicate a malfunction of the field
instrument. The upper range value of 20 mA was selected because it perpetuated the
tradition of five times the base value from 3–15 psi pneumatic transmission. There is no
standard meaning for signals outside the 4–20 mA range, although some manufacturers
have used such signals for diagnostic purposes.
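The scaling implied by the live zero is straightforward. The sketch below converts a loop current to percent of span and treats a current well below 4 mA as a probable fault; the 3.6 mA threshold is an illustrative assumption, since the standard assigns no meaning outside 4–20 mA:

```python
def current_to_percent(ma: float) -> float:
    """Convert a 4-20 mA signal to percent of span; the 4 mA live zero
    means a reading near 0 mA indicates a fault, not a 0% reading."""
    if ma < 3.6:  # illustrative fault threshold (an assumption, not the standard)
        raise ValueError(f"{ma} mA is below the live zero: probable fault")
    return (ma - 4.0) / 16.0 * 100.0

print(current_to_percent(4.0))   # 0.0   (% of span)
print(current_to_percent(12.0))  # 50.0
print(current_to_percent(20.0))  # 100.0
```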
One reason for selecting a current-based signal is that sufficient electrical power (4 mA •
24 V = 96 mW) to energize the sensor can be delivered over the same pair of wires as
the signal. Use of two wires for both the signal and power reduces the cost of
installation. Some field instruments require too much electrical energy to be powered
from the signal transmission line and are said to be “self-powered,” meaning that they
are powered from a source other than the 4–20 mA transmission line. Another reason for
using a current-based signal is that current is unaffected by the resistance (length or
diameter) of the connecting wire. A voltage-based signal would vary with the length and
gauge of the connecting wire. A typical electronic control loop is illustrated in Figure
25-3.
Although the transmitted signal is a 4–20 mA analog current, the control valve is most
often operated by high-pressure pneumatic air because it is the most economic and
responsive technology to move the position of the control valve. This requires that the
4–20 mA output from the controller be used to modulate the high-pressure air driving
the control valve actuator. A device called an I/P converter may be required to convert
from 4–20 mA to 3–15 psi (or 20–100 kPa). The output of the I/P converter is connected
to a pneumatic valve positioner. However, more often the conversion takes place in an
electronic control valve positioner that uses feedback from the control valve itself and
modulates the high-pressure pneumatic air to achieve the position required by the
controller based on its 4–20 mA output.
The 4–20 mA signal is achieved by the field transmitter or the controller acting as a
current regulator or variable resistor in the circuit. The two-wire loop passing from the
DC power source through the field transmitter can therefore have a maximum total
resistance such that the total voltage drop cannot exceed that of the DC power source—
nominally 24 V. One of the voltage drops occurs across the 250 ohm resistor connected
across the input terminals of a controller, or the field wiring terminals of a digital control
system analog input point. Other instruments may also be wired in series connection in
the same two-wire current loop as long as the total loop resistance does not exceed
approximately 800 ohms. The wiring of a loop-powered field transmitter is illustrated in
Figure 25-4.
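A quick budget check for such a loop, with an illustrative 50 Ω of wiring resistance added to the 250 Ω input resistor:

```python
supply = 24.0    # nominal loop supply, volts
i_max = 0.020    # full-scale loop current, amps
r_input = 250.0  # controller input resistor, ohms
r_wiring = 50.0  # illustrative wiring resistance, ohms

v_drop = i_max * (r_input + r_wiring)  # drop across the resistances at 20 mA
v_left = supply - v_drop               # voltage left for the transmitter
print(f"{v_drop:.1f} V dropped, {v_left:.1f} V left at the transmitter")
# 6.0 V dropped, 18.0 V left; at the ~800-ohm limit the drop reaches 16 V,
# which is why the total loop resistance must be bounded.
```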
More than 25 years after the work began to develop a digital data transmission standard,
4–20 mA DC still dominates the process control market for both new and revamped
installations because it now serves as the primary signal transmission method for the
Highway Addressable Remote Transducer (HART) protocol. HART uses its one 4–20
mA analog transmission channel for the primary variable, usually the process variable
measurement value, and transmits all other data on its digital signal channels carried on
the same wires as the 4–20 mA analog signal.
Analog electronic signal transmission remains the fastest way to transmit a measured
variable to a controller because it is a continuous signal. This is especially true when the
measurement mechanism itself continuously modulates the output current, as in force-motor-driven devices. However, even in more modern field transmitters that use
inherent digital transducers and digital-to-analog converters, delays to produce the
analog signal are very small compared to process dynamics; consequently, the resulting
signals are virtually continuous and certainly at a higher update rate than the associated
control system. Continuous measurement, transmission, and analog electronic
controllers are not affected by the signal aliasing errors that can occur in sampled data
digital transmission and control.
The design of process manufacturing plants is usually documented on process piping
and instrumentation diagrams (P&IDs), which attempt to show the points at which the
process variable (PV) is measured, the points at which control valves are located, and
the interconnection of instruments and the control system. The documentation symbols
for the instruments and control valves, and the P&ID graphic representations for the
instrumentation connections are covered elsewhere in this book. All the P&ID symbols
are standardized in ANSI/ISA-5.01.
Further Information
ANSI/ISA-5.01-2009. Instrumentation Symbols and Identification. Research Triangle
Park, NC: ISA (International Society of Automation).
ANSI/ISA-50.00.01-1975 (R2017). Compatibility of Analog Signals for Electronic
Industrial Process Instruments. Research Triangle Park, NC: ISA (International
Society of Automation).
About the Author
Richard H. (Dick) Caro is CEO of CMC Associates, a business strategy and
professional services firm in Arlington, Mass. Prior to CMC, he was vice president of
the ARC Advisory Group in Dedham, Mass. He is the chairman of ISA50 and formerly
the convener of the IEC (International Electrotechnical Commission) Fieldbus Standards
Committees. Before joining ARC, Caro held the position of senior manager with Arthur
D. Little, Inc. in Cambridge, Mass., was a founder of Autech Data Systems, and was
director of marketing at ModComp. In the 1970s, The Foxboro Company employed
Caro in both development and marketing positions. He holds a BS and MS in chemical
engineering and an MBA. He holds the rank of ISA Fellow and is a Certified
Automation Professional. Caro was named to the Process Automation Hall of Fame in
2005. He has published three books on automation networks including Wireless
Networks for Industrial Automation.
26
Wireless Transmitters
By Richard H. Caro
Summary
Process control instrumentation has already begun the transition from bus wiring as in
FOUNDATION Fieldbus and PROFIBUS, to wireless. Many wireless applications are now
appearing using both ISA100 Wireless and WirelessHART, although not yet in critical
control loops. As experience is gained, user confidence improves; and as microprocessors gain speed and consume less energy, it appears that wireless process control instrumentation will eventually become mainstream.
Introduction to Wireless
Most instrument engineers would like to incorporate measurement transmitters into
processes without the associated complexity and cost of installing and maintaining
interconnecting wiring to a host system or a distributed control system (DCS). Wireless
solutions can dramatically reduce the cost of adding measurement points, making it
feasible to include measurements that were not practical with traditional wired solutions.
With a wired plant, every individual wire run must be engineered, designed, and
documented. Every termination must be specified and drawn so that installation
technicians can perform the proper connections. Even FOUNDATION Fieldbus, in
which individual point terminations are not important, must be drawn in detail since
installation technicians are not permitted to make random connection decisions.
Wireless has no terminations for data transmission, although it is sometimes necessary to wire-connect to a power source. The physical location of a wireless instrument is often very important, and a detachable antenna may need to be designed and installed separately.
Maintenance of instrumentation wiring within a plant involves ongoing costs often
related to corrosion of terminations and damage from weather, construction, and other
accidental sources. Wireless has a clear advantage since there are no wiring terminations
and there is little likelihood of damage to the communications path from construction
and accidental sources. However, there are temporary sources of interference, such as large mobile equipment blocking line-of-sight communications and random electrical noise from equipment such as an arc welder.
Wireless Network Infrastructure
Traditional distributed control systems (DCSs) use direct point-to-point wiring between
field instruments and their input/output (I/O) points present on an analog input or output
multiplexer card. There is no network for the I/O. If HART digital signals are present,
they are either ignored, read occasionally with a handheld terminal, or routed to or from
the DCS through the multiplexer card. If the field instrumentation is based on
FOUNDATION Fieldbus, PROFIBUS-PA, or EtherNet/IP, then a network is required to
channel the data between the field instruments and to and from the DCS. Likewise, if
the field instruments are based on digital wireless technology such as ISA100 Wireless
or WirelessHART, then a network is required to channel the data between the field
instruments and to and from the DCS.
The nature of wired field networks is discussed in Chapter 8. The elements of the wireless portion of field networks are often referred to as the wireless network infrastructure. Unlike wired networks, in which every signal is conducted by wire to its intended destination, wireless messages can fail to be delivered when the signals encounter obstacles or interference, or when they are not powerful enough to survive the distances involved. The wireless network infrastructure includes the following:
• Formation of mesh networks in which intermediate devices store and forward
signals to overcome obstacles and lengthen reception distances
• Redundant or resilient signal paths so that messages are delivered along alternative routes for reliability (illustrated in the sketch following this list)
• Use of frequency shifting so that error recovery does not use the same frequency
as failed messages
• Directional antennas to avoid interference and to lengthen reception distances
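These mesh and redundant-path behaviors can be illustrated with a short sketch. The following Python fragment is illustrative only: the device names and topology are hypothetical, and real network managers weigh link quality rather than simple hop counts. It finds a primary route from a field transmitter to the gateway, then a backup route that avoids the primary route's intermediate hops:

from collections import deque

# Hypothetical mesh: each device lists the neighbors it can hear.
mesh = {
    "TT-101":   ["PT-102", "router-A"],
    "PT-102":   ["TT-101", "router-A", "router-B"],
    "router-A": ["TT-101", "PT-102", "gateway"],
    "router-B": ["PT-102", "gateway"],
    "gateway":  ["router-A", "router-B"],
}

def shortest_path(graph, src, dst):
    """Breadth-first search for a hop-minimal route from src to dst."""
    seen, queue = {src}, deque([[src]])
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

primary = shortest_path(mesh, "TT-101", "gateway")
# Drop the primary route's intermediate hops to force an alternative.
blocked = set(primary[1:-1])
trimmed = {n: [m for m in nbrs if m not in blocked]
           for n, nbrs in mesh.items() if n not in blocked}
backup = shortest_path(trimmed, "TT-101", "gateway")
print("primary:", " -> ".join(primary))
print("backup: ", " -> ".join(backup))

If a router on the primary route fails or is blocked by equipment, the network manager can deliver the same message along the backup route with no action by the instrument technician.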
Wireless field networks always terminate in a gateway that may connect to a data
acquisition or control system with direct wired connections, with a wired network, or
with a plant-level wireless network. The cost of a wireless field network must always include the gateway, which is usually combined with the network manager that controls the wireless network performance. The gateway almost always includes the wireless network security manager as well. Note that the incremental cost of adding wireless network field instruments does not include any additional network infrastructure devices.
ISM Band
Wireless interconnection relies on the radio frequency spectrum, a limited and crowded
resource in which frequency bands are allocated by local/country governments.
Governmental organizations in most nations have established license-free radio bands,
the most significant of which is the industrial, scientific, and medical (ISM) band
centered at 2.4 GHz. This band is widely used for cordless telephones, home and office
wireless networks, and wireless process control instrumentation. It is also used by
microwave ovens, which are often located in field control rooms (microwave leakage
may show up as interference on wireless networks).
Although the 2.4 GHz ISM band is crowded, protocols designed for it provide many
ways to avoid interference and to recover from blocked messages. The blessing of the
ISM band is that the broad availability of ISM components and systems leads to lower
cost products. The curse of the ISM band is the resulting complexity of the protocol
needed to assure reliable end-to-end communications. There are additional ISM bands at
433 MHz, 868–915 MHz, 5.8 GHz, and 57–66 GHz.
Need for Standards
While field transmitters of process information are free-standing devices, the
information they provide must be sent to other devices to participate in the overall
control scheme. Likewise, the field instrument must be configured or programmed to do
its assigned task. Wired and wireless field instruments and the devices to which they
connect must “speak” the same language. Both the physical connection and the semantic
content of the communications must be the same for this to occur. In communications
language, we refer to this as the protocol. To ensure that the same protocol is followed
by both the field transmitter and the connected receiver, standards must be established.
The standards for field instrumentation have usually originated with the International
Society of Automation (ISA). However, to assure worldwide commonality,
instrumentation standards must be established and maintained by a worldwide standards body. The body responsible for both wired and wireless communications standards for industrial process control is the International Electrotechnical Commission (IEC), headquartered in Geneva, Switzerland. With such standards, field transmitters
designed and manufactured by any vendor for use in any country will be able to
communicate with devices from any other vendor who designs according to the
requirements of the standard.
Wired communications standards must ensure that the electrical characteristics are firmly established and that the format of the messages is organized in a way that they can be understood and used by the receiver. Wireless communications standards
must also specify the format of the messages and their organization, but must properly
apply the laws of radio physics and statutory law. Not only must the radio (electrical
interface) be powerful enough to cover the required distance, but the same radio channel
or carrier frequency must be used at both the transmitting and receiving ends.
Additionally, energy conservation for wireless instrumentation is achieved by turning
the transmitter and the receiver off most of the time; they awaken only to transmit or
receive data. Careful coordination of the awake cycle is necessary to communicate.
Requirements for radio channel compatibility and energy conservation are key elements
in the standards to which wireless field transmitters are designed.
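The energy saved by this coordination can be estimated with simple arithmetic. The sketch below uses hypothetical current-draw figures (real values vary by radio chipset) to compare a duty-cycled transmitter with one whose radio is always on:

# Hypothetical radio characteristics; actual figures vary by chipset.
ACTIVE_MA = 20.0     # current while the radio is awake (mA)
SLEEP_MA = 0.002     # deep-sleep current (mA)

def average_current_ma(slot_s, slots_per_update, update_period_s):
    """Average draw for a radio that wakes only for its assigned slots."""
    duty = (slot_s * slots_per_update) / update_period_s
    return ACTIVE_MA * duty + SLEEP_MA * (1.0 - duty)

# Two 10 ms slots (transmit, then listen for acknowledgement) every 10 s.
frugal = average_current_ma(0.010, 2, 10.0)
print(f"duty-cycled: {frugal:.3f} mA; always on: {ACTIVE_MA} mA")
# About 0.042 mA versus 20 mA, nearly a 500-fold reduction.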
Limitations of Wireless
Process plants contain many pieces of process equipment fabricated from steel, mounted in industrial buildings that are also made of steel. In fact, the phrase “canyons of steel” is often used to describe the radio environment of the process plant. Buildings and equipment made from steel, the size of the plant, and
external noise all affect the ability of wireless devices to communicate in the following
ways:
• Steel reflects radio signals in many directions, which may cause them to arrive at the destination at different times; this is called multipath interference.
• The wireless transmitter may not be in a direct line-of-sight with the device with
which it must communicate.
• Often process plants are very large, requiring signal paths that may be longer than
those that are physically achievable by the type of wireless communications being
used.
• There may be sources of interference or noise produced by outside forces both
incidental and covert.
The wireless protocol must be designed to overcome these challenges of distance,
multipath interference, and other radio frequency (RF) sources.
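The distance challenge can be made concrete with the free-space path loss formula, which gives the minimum loss a signal suffers even with a clear line of sight; reflections and obstructions in the canyons of steel only add to it. A small Python sketch with illustrative numbers:

import math

def fspl_db(distance_m, freq_hz):
    """Free-space path loss: 20*log10(d) + 20*log10(f) - 147.55 dB."""
    return 20 * math.log10(distance_m) + 20 * math.log10(freq_hz) - 147.55

for d in (10, 100, 1000):
    print(f"{d:5d} m at 2.4 GHz: {fspl_db(d, 2.4e9):6.1f} dB")
# Roughly 60, 80, and 100 dB: every tenfold increase in distance costs
# another 20 dB of the link budget before any obstruction losses.

This is why a protocol designed for 100 m links cannot simply be willed across a 1 km plant; mesh repeaters or directional antennas must make up the difference.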
Powering Wireless Field Instruments
Wireless field instruments may be battery powered, powered from a source of local
electricity, or “self-powered” from a power generator source. The convenience of installing devices where there are no sources of electrical power is one of the principal reasons to invest in the additional cost of wireless field instrumentation. Even so, electrical power is often found nearby from which the wireless instrument may be powered while the communications connection remains wireless. This is the case when the process instrument itself is already installed but is connected to a control system by means of an analog signal that also supplies instrument power (e.g., 4–20 mA).
Adapters are available to convert process measurement data from such instruments to
data on wireless networks. Often, such adapters may themselves be powered from the
same source as the wired field instrument.
Several manufacturers are producing “energy harvesting” devices to attach to wireless
instruments in order to transform them into self-powered devices. The electrical power
is produced by using a local source of light (solar), vibration, thermal energy, or air
pressure to generate enough electrical energy to power the instrument. Often, a primary
battery is used to back up the harvester during times when the scavenged source of
power is not available. Solar cells obviously depend upon daylight, but may also be
energized from local sources of artificial lighting. Most process plants have vibration
from fluid pumping, and have high temperature equipment from which thermal energy
can be harvested. Finally, most process operations have ordinary compressed air used
for maintenance purposes that is conveniently piped to areas where instrumentation is
installed. Note that power-harvesting devices rely only on non-rechargeable (primary) batteries for backup when the harvested energy is not available. IEC 62830 is
a standard that defines the attachment fitting common to all energy-harvesting and
primary battery-powered devices used for wireless communications in industrial
measurement and control. IEC 60086 sets the international standard for primary
batteries.
Most wireless instruments are designed to use primary (non-rechargeable) batteries, which means that such instruments must be very frugal in their use of electricity. In most cases, the instruments are designed to operate on replaceable cells with a replacement cycle of no less than 5 years.
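The arithmetic behind such battery life is straightforward. The sketch below combines a hypothetical primary-cell capacity with the duty-cycled average draw estimated earlier in this chapter:

# Hypothetical: a compact lithium primary cell of 2.4 Ah; large lithium
# thionyl chloride cells offer several times this capacity.
CELL_MAH = 2400.0
AVG_DRAW_MA = 0.042     # duty-cycled average from the earlier sketch

hours = CELL_MAH / AVG_DRAW_MA
print(f"estimated battery life: {hours / (24 * 365):.1f} years")  # ~6.5

Halving the update period roughly halves the battery life, which is why wireless transmitters report on change or at the slowest update rate the application can tolerate.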
Interference and Other Problems
Since most of the currently available wireless transmitters are designed to operate in the
2.4 GHz ISM band, they are subject to interference from Wi-Fi (IEEE 802.11) operating
in that same band. The protocols using IEEE 802.15.4 are all designed for this band and
use a variety of methods to avoid interference and to recover messages that are not
delivered due to interference. The three process control protocols discussed below (ISA100 Wireless, WirelessHART, and WIA-PA) all use channel hopping within the 2.4 GHz
band to avoid interference and to overcome multipath effects.
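Channel hopping itself is easy to sketch. In the illustrative Python fragment below, all devices derive the same pseudo-random hop table from a shared network key, so transmitter and receiver land on the same channel in every time slot; the table-building scheme here is invented for illustration and is not any protocol's actual algorithm:

import random

# IEEE 802.15.4 defines 16 channels (numbered 11-26) at 2.4 GHz.
CHANNELS = list(range(11, 27))

def build_hop_table(network_key, blacklist=()):
    """Pseudo-random hop sequence shared by every device on the network."""
    usable = [ch for ch in CHANNELS if ch not in blacklist]
    rng = random.Random(network_key)  # same key yields the same table
    rng.shuffle(usable)
    return usable

def channel_for_slot(table, slot_number):
    return table[slot_number % len(table)]

# Channels that suffer persistent interference can be blacklisted.
table = build_hop_table(network_key=0x1234, blacklist=(25, 26))
for slot in range(4):
    print("slot", slot, "uses channel", channel_for_slot(table, slot))

Because each retry occurs in a later slot, it automatically lands on a different frequency, which is how error recovery avoids the channel that failed.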
A specific requirement for radio communications in Europe has been issued by the European Committee for Electrotechnical Standardization (CENELEC, the electrotechnical standards authority for the European Union): all telecommunications standards shall provide a “Listen Before Talk” protocol. While this has no direct effect on international standards, all newly approved IEC standards have specifically implemented this requirement.
A few users have objected to the use of wireless instrumentation for any critical
application because of the ability to intercept wireless signals and to jam the entire
wireless network with a powerful radio outside the plant fence. These are threats to
wireless networks that are not applicable to wired networks. Capitulating to the fear of jamming is irrational and hardly a reason to reject a technology that has so many benefits. Even so, it can be shown that even in the event of a broadband jammer, the
frequency-hopping and direct-sequence spread spectrum used by all three process
control wireless networks will prevent total signal loss.
It is possible to assemble a very powerful jamming broadband transmitter to operate in
the 2.4 GHz band in a covert attack on plant wireless networks. Such a jammer has been
shown to disrupt Wi-Fi communications but only when it is located inside the range of a
Wi-Fi network using omnidirectional antennas; recovery is possible when the network is
switched to channels away from the center of the band. While the jammer may send its
signals over the full 2.4 GHz band, more than 60% of its power is confined to the center
frequencies, with much less power at the highest and lowest frequencies. Process control
networks use at least 15 channels scattered across the 2.4 GHz band; they are designed
to extract correlated data from the ever-present white noise and to reject channels that
generate excessive retries due to noise interference. This is not to say that a covert and
illegal jammer has no effect on industrial wireless communications, just that the design
of the protocol—using both direct-sequence and frequency-hopping spread spectrum
technologies—is specifically designed to mitigate the threat of covert jamming.
All wireless networks based on IEEE 802.15.4 use AES 128-bit encryption on all messages, at all times, to provide secure messaging. Additionally, the three process
control protocols assure privacy because only devices authorized in advance are
permitted to send messages. WirelessHART authenticates new devices only by direct
attachment to a HART handheld device such that a random signal from outside that
network is rejected. WIA-PA also authenticates new network devices when they are
attached to a network configuration device. ISA100 Wireless authenticates over the air,
but only devices that are pre-registered for network membership and preconfigured with
an out-of-band (infrared) configuration device. Furthermore, ISA100 Wireless may use
256-bit encryption to validate the new network member. These security measures exceed typical industry practice and are widely accepted as “wired-equivalent” security.
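The style of message protection described above can be sketched with standard tools. The fragment below uses AES-128 in an authenticated mode (AES-GCM from the third-party Python cryptography package); the IEEE 802.15.4-based protocols actually use the related AES-CCM* mode, so this illustrates the idea rather than the wire format:

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)    # shared 128-bit network key
aead = AESGCM(key)

nonce = os.urandom(12)                       # must never repeat per key
reading = b"PT-102 pressure=413.2 kPa"
ciphertext = aead.encrypt(nonce, reading, associated_data=b"net-7")

# Decryption verifies integrity at the same time: a tampered message
# raises cryptography.exceptions.InvalidTag instead of being delivered.
assert aead.decrypt(nonce, ciphertext, associated_data=b"net-7") == reading

Encryption alone does not provide the authorization described above; the join and provisioning procedures are what keep unapproved devices off the network.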
ISA100 Wireless
ANSI/ISA-100.11a-2011 was developed to be the preferred network protocol for
industrial wireless communications. Specifically, ISA100 Wireless was designed to
fulfill all the communications requirements for FOUNDATION Fieldbus, if it is to be
implemented on a wireless network.1 This requires direct peer-to-peer messaging and
time synchronization to ±1.0 ms.
ANSI/ISA-100.11a-2011 is also identified as IEC 62734. The ISA100 Wireless Compliance Institute (WCI) is the industry organization responsible for testing new equipment and validating it for standards conformance. ISA100 Wireless registered products are listed on the WCI website: http://www.isa100wci.org/End-UserResources/Product-Portfolio.aspx.2 The current standard is based on the use of IEEE
802.15.4-2006 radios using 128-bit encryption, direct-sequence spread spectrum, and a
basic slot time of 10 ms. Additions to the base protocol are as follows:
• Configurable slot times for network efficiency
• Less than 100 ms message latency
• Peer-peer messaging to support “smart” field devices
• Over-the-air provisioning (initialization) not requiring a terminal device
• Hopping among the 16 channels according to a configurable hopping table
• Mesh communications using two or more routes
• Backbone network to reduce the depth of the mesh when necessary
• Local nodes use IEEE EUI-64 addressing; externally, nodes are addressed with IPv6 networking per the Internet standards RFC 6282 and RFC 6775 (6LoWPAN - IPv6 over low-power wireless personal area networks)
• End-to-end message acknowledgement using UDP/IP (User Datagram Protocol/
Internet Protocol)
• Application layer compatible with IEC 61804 (EDDL or Electronic Device
Description Language), including HART
• Capable of tunneling other wireless protocols
• Duocast messaging, where every message is simultaneously sent to two neighbors
in the mesh to improve reliability
The protocol for ANSI/ISA-100.11a-2011 served as a model for the development of IEEE 802.15.4e-2011, the most recent version of that standard. As IEEE 802.15.4 radios move to that latest standard, ISA100 Wireless instruments will already be able to use them.
ISA100 Wireless has been implemented by several of the major suppliers of process
control instrumentation and is the core of their wireless strategy. Most varieties of
process instruments are available with ISA100 Wireless including HART adapters. The
ISA100.15 subcommittee has developed the technology to be used for a backhaul
network, the network used to connect the ISA100 Gateway with other network gateways
and applications. For process control applications, the current conclusion is to use the FOUNDATION Fieldbus High Speed Ethernet (HSE) protocol over whatever IP network is appropriate for the speed and distance, such as Ethernet or Wi-Fi.
While early users of ISA100 Wireless have concentrated on monitoring applications, it
is anticipated that their experience will lead to applications in feedback loop control.
Since the architecture of ISA100 Wireless supports FOUNDATION Fieldbus, it has
been predicted by some members of the ISA100 standards committee that ISA100
Wireless will serve as the core technology for a wireless version of FOUNDATION
Fieldbus when microprocessors with suitable low energy requirements become
available.
WirelessHART
WirelessHART was designed by the HART Communication Foundation (HCF) to
specifically address the needs of process measurement and control applications. It is a
wireless extension to the HART protocol that is designed to be backwards compatible
with previous versions of HART. The design minimizes the impact of installing a
wireless network for those companies currently using HART instrumentation.
WirelessHART provides a wireless network connection to existing HART transmitters
installed without a digital connection to a control system. WirelessHART is defined by
the IEC 62591 standard. HCF conducts conformance testing and interchangeability validation for WirelessHART instruments. Registered devices are listed on its website:
http://www.hartcommproduct.com/inventory2/index.php?
action=listcat&search=search...&tec=2&cat=&mem=&x=24&y=15.3
Unfortunately, WirelessHART is not compatible with ISA100 Wireless. The two networks may coexist with each other in the same plant area with recoverable interactions; however, interoperation, or the ability for devices on one network to communicate directly with those on the other network, is not possible, although such messages can be passed through a gateway that has access to both networks (a dual-function gateway).
WirelessHART uses the IEEE 802.15.4-2006 radio with AES 128-bit encryption, direct-sequence spread spectrum, channel hopping, and a fixed slot time of 10 ms.
WirelessHART is well supported by current chip suppliers. Features of the
WirelessHART protocol include:
• Hopping among 15 channels according to a pseudo-random hopping table
• Low-latency, high-reliability mesh communications using two or more routes
• A proprietary network layer with IEEE EUI-64-bit addresses
• A proprietary transport layer with end-to-end message acknowledgement
• An application layer consisting of all HART 7 commands plus unique wireless
commands
• HART 7 compatible handheld devices used to provision/initialize field
instruments
• Field-proven performance
WirelessHART has been very popular among early wireless users who can purchase
instruments and HART adapters from several instrumentation suppliers. Since
WirelessHART instruments were the first to market, they have built an installed base
greater than that of ISA100 Wireless.
WIA-PA
WIA-PA was developed by a Chinese consortium initially formed by Chongqing
University. It is very similar to both ISA100 and WirelessHART, but it has small yet
significant differences. Like ISA100 and WirelessHART, WIA-PA is based on the use of
the IEEE 802.15.4-2006 radio, including 128-bit AES encryption. The slot time is
adjustable, but no default appears in the standard. ISA100 modifies the medium access
control (MAC) sublayer of the data link layer specified by IEEE 802.15.4-2006, while
WirelessHART and WIA-PA do not. WIA-PA provides channel hopping among the 16
channels approved for the 2.4 GHz spectrum in China using its own hopping table that
is not specified in the standard. The local address conforms to IEEE EUI-64, but there is
no IP addressing and no services. The network layer supports the formation of a mesh
network similar to WirelessHART, but it is unlike ISA100 since there is no duocast.
There is no transport layer. The application layer supports an object model, but with no
specific object form. Tunneling is not supported.
At this writing, there are no known commercial suppliers of WIA-PA outside of China.
WIA-FA
WIA-FA is a protocol designed for factory automation that is under development in an
IEC SC65 standards committee. While this protocol is similar to WIA-PA, it is based on
the Wi-Fi physical layer (IEEE 802.11) using only the 2.4 GHz band. It is likely to
change from the initial committee draft now that it has been subjected to international
ballot and comment (2016).
ZigBee
ZigBee is a specification defined by the ZigBee Alliance consortium. There are several
versions of the specification, but all of them use the IEEE 802.15.4-2006 standard radios
with 128-bit encryption. Since ZigBee is based on the IEEE 802.15.4-2006 standard, the
application must select one of the 16 available channels. ZigBee is widely used in commercial applications but not in industrial applications, except for a few specialized process applications. The following are the specialized forms of ZigBee:
• ZigBee PRO – This is a mesh network optimized for low power consumption
and large networks.
• ZigBee Mesh networking – This is a simpler protocol for small networks.
• ZigBee RF4CE – This is an interoperable specification intended for simple
consumer products needing two-way communications at low cost; there is no
meshing.
Note that the meshing specification used for ZigBee and ZigBee PRO is not the same as
that used by IEEE 802.15.4e-2011, WirelessHART, or ISA100 Wireless.
ZigBee has been very successful in applications such as automatic meter reading for gas, electricity, and water utilities and in early heating, ventilating, and air-conditioning (HVAC) applications, and it is a leading candidate for use in Smart Grid applications. In these applications, the emphasis is on long battery life and low cost.
ZigBee has found several applications in process control including a system to read
valve positions. Many users have trial ZigBee applications in factory automation.
Other Wireless Technologies
Wi-Fi and WiGig
Wi-Fi is the ubiquitous wireless network technology found in homes, offices, and
industrial plants. It is informally known as wireless Ethernet since it is generally
completely integrated with installed Ethernet networks. Over the years, Wi-Fi has improved from the slow IEEE 802.11b standard, with a nominal data rate of 11 Mbps, to today’s IEEE 802.11ac standard, operating at about 500 Mbps with up to eight bonded channels at 5 GHz. Not only does IEEE 802.11ac have a high throughput, but it
uses multiple-input multiple-output (MIMO) to enhance performance by using
multipath signals.
Wi-Fi is not well suited for communications with low-powered field instrumentation, at least not until a low-power version is created. However, wireless field networks using IEEE 802.15.4 often require a high-speed wireless link in the field to keep the depth of their mesh shallow and thereby reduce latency. Wi-Fi is configured into at least one supplier’s architecture for just this
purpose when the field network is used to gather closed loop process control data. In
such an architecture, there are field access points for ISA100 Wireless or
WirelessHART, and the Wi-Fi network links those access points to a gateway. IEEE
802.11ac with MIMO has been very successful, since the multipath reflections from
process equipment and building steel are used to enhance the transmitted signal.
Many factory automation projects are now using Wi-Fi networks to unite remote I/O
units that support EtherNet/IP, Modbus/TCP, PROFINET, PowerLink, EtherCAT,
SERCOS III, or CC Link IE, all of which are specified to use 100/1000 Ethernet. The
Wi-Fi becomes simply a part of the physical layer joining remote I/O with the appropriate PLC units. Reliability of Wi-Fi has not been an issue.
The Wi-Fi Alliance is the industry consortium responsible for developing new versions
of Wi-Fi and preparing certification tests to validate interoperability. This organization
recently combined with the WiGig Alliance to administer a developing technology
operating in the 60 GHz ISM band. This technology is expected to find use in the
broadband commercial and residential markets. While the high frequency can limit
application to short distances, the short wavelength allows the use of small, highly
directional, planar and phased array antennas for point-to-point data links over longer
distances.
DASH7
Alternative technologies for low power radio continue to be explored for use in
industrial automation. DASH7 is based on the ISO/IEC 18000-7 standard for data transmission in the 433 MHz ISM band. This band is also used by some longer-range RF tags. The maximum data rate for DASH7 is 200 kbps, only slightly slower than that of IEEE 802.15.4-based networks at 250 kbps, but only a 28 kbps net data rate is actually claimed. However, DASH7 has a nominal range of about 1 km, compared with
IEEE 802.15.4 radios at about 100 m. DASH7 defines tag-to-tag communications that
can be used for field networking or meshing. The other appealing feature of DASH7 is
the very low energy consumption necessary for long battery life.
Currently, there are no commercial or industrial applications for DASH7.
Global System for Mobile Communications
Global System for Mobile Communications (GSM) is the most popular network for cellular telephony in the world. The GSM technology uses time division multiple access (TDMA) across two different frequency channels, one for each direction of data flow. The channels available for telephony in North America are different from those used elsewhere in the world. Use of GSM for data transmission is a second-generation (2G) service with a very slow data rate. GSM telephones use power very wisely and have remarkably long battery life.
One of the applications for GSM modems is data collection from supervisory control
and data acquisition (SCADA) system remote terminal units (RTUs).
Code Division Multiple Access
Code division multiple access (CDMA) is a cellular telephony technology used in North
America, parts of Japan, and many countries in Asia, South America, the Caribbean, and
Central America. CDMA efficiently uses the limited telephony channels by packetizing
voice and only transmitting when new data is being sent. CDMA is very conservative in
the use of battery energy.
Long Term Evolution
In the effort to speed up the use of telephone-dedicated channels for high-speed data
communications, the latest (2016) leader is Long Term Evolution (LTE). It now appears
that LTE has replaced Worldwide Interoperability for Microwave Access (WiMAX) technology in the effort to achieve gigabit download speeds for 4G networks. LTE
chips are designed to conserve energy in order to achieve long battery life. Currently, no
applications for LTE exist in the industrial market, other than voice and conventional
data use.
Z-Wave
Z-Wave is a low-cost, low-power, low-data-rate wireless network operating in the 900
MHz ISM band. The primary application for which Z-Wave was intended is home
automation. This was one of the intended markets for ZigBee, but it appears that the slow data rate and short message length of Z-Wave, using frequency-shift keying (FSK), give it a better application future in home automation.
While no industrial applications are yet planned for Z-Wave, it appears that this low-cost, long-battery-life technology may be applicable for remote discrete I/O connections.
Further Information
Caro, Dick. Wireless Networks for Industrial Automation. 4th ed. Research Triangle
Park, NC: ISA (International Society of Automation).
ANSI/ISA-100.11a-2011. Wireless Systems for Industrial Automation: Process Control
and Related Applications. Research Triangle Park, NC: ISA (International Society
of Automation).
About the Author
Richard H. (Dick) Caro is CEO of CMC Associates, a business strategy and
professional services firm in Arlington, Mass. Prior to CMC, he was vice president of
the ARC Advisory Group in Dedham, Mass. He is the chairman of ISA SP50 and formerly the convener of IEC (International Electrotechnical Commission) Fieldbus Standards Committees. Before joining ARC, Caro held the position of senior manager
with Arthur D. Little, Inc. in Cambridge, Mass., was a founder of Autech Data Systems,
and was director of marketing at ModComp. In the 1970s, The Foxboro Company
employed Dick in both development and marketing positions. He holds a BS and MS in
chemical engineering and an MBA. He holds the rank of ISA Fellow and is a Certified
Automation Professional. In 2005 Dick was named to the Process Automation Hall of
Fame. He has published three books on automation networks including Wireless
Networks for Industrial Automation.
1. Currently, FOUNDATION Fieldbus computations require too much energy to be implemented in a battery-operated wireless node.
2. Accessed 27 April 2016.
3. Accessed 27 April 2016.
27
Cybersecurity
By Eric C. Cosman
Introduction
What is the current situation with respect to cybersecurity, and what are the trends?
Cybersecurity is a popularly used term for the protection of computer and
communications systems from electronic attack. Also referred to as information
security, this mature discipline is evolving rapidly to address changing threats.
Although long applied to computers and networks used for basic information processing
and business needs, more recently attention has also been focused on the protection of
industrial systems.1 These systems are a combination of personnel, hardware, and
software that can affect or influence the safe, secure, and reliable operation of an
industrial process.
This shift has resulted in the creation of something of a hybrid discipline, bringing
together elements of cybersecurity, process automation, and process safety. This
combination is referred to as industrial systems cybersecurity. This is a rapidly evolving
field, as evidenced by the increasing focus from a variety of communities ranging from
security researchers to control engineers and policy makers.
This chapter gives a general introduction to the subject, along with references to other
sources of more detailed information.
To appreciate the nature of the challenge fully, it is first necessary to understand the
current situation and trends. This leads to an overview of some of the basic concepts that
are the foundation of any cybersecurity program, followed by a discussion of the
similarities and differences between securing industrial systems and typical information
systems. There are several fundamental concepts that are specific to industrial systems
cybersecurity, and some basic steps are necessary for addressing industrial systems
cybersecurity in a particular situation.
Current Situation
Industrial systems are typically employed to monitor, report on, and control the
operation of a variety of different industrial processes. Quite often, these processes
involve a combination of equipment and materials where the consequences of failure
range from serious to severe. As a result, the routine operation of these processes
consists of managing risk.
Risk is generally defined as the combination or product of threat, vulnerability, and consequence. Increased integration of industrial systems with communication
networks and general business systems has contributed to these systems becoming a
more attractive target for attack, thus increasing the threat component. Organizations are
increasingly sharing information between business and industrial systems, and partners
in one business venture may be competitors in another.
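As a toy illustration of this definition, the sketch below scores each component on an invented 1-to-5 scale and takes the product; it makes visible how integration raises risk through the threat term even when vulnerability and consequence are unchanged:

def risk_score(threat, vulnerability, consequence):
    """Risk as the product of its three components, each scored 1 to 5."""
    return threat * vulnerability * consequence

# Hypothetical: the same process before and after its control system is
# integrated with the business network, raising only the threat score.
isolated = risk_score(threat=1, vulnerability=3, consequence=5)
integrated = risk_score(threat=4, vulnerability=3, consequence=5)
print(isolated, integrated)   # 15 versus 60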
External threats are not the only concern. Knowledgeable insiders with malicious intent
or even an innocent unintended act can pose a serious security risk. Additionally,
industrial systems are often integrated with other business systems. Modifying or testing
operational systems has led to unintended effects on system operations. Personnel from outside the control systems area increasingly perform security testing on the systems, increasing the number and severity of these effects. Combining all these factors, it is easy to see that the potential for someone to gain unauthorized or damaging access to an industrial process is not trivial.
Even without considering the possibility of deliberate attack, these systems are
increasingly vulnerable to becoming collateral damage in the face of a nonspecific
attack, such as the release of malicious software (viruses, worms, Trojan horses, etc.).
The vulnerability of industrial systems has changed as a result of the increased use of
commodity technology, such as operating systems and network components. However, a
full understanding of the level of risk is only possible after considering the consequence
element. The consequence of failure or compromise of industrial systems has long been
well understood by those who operate these processes.
Loss of trade secrets and interruption in the flow of information are not the only
consequences of a security breach. Industrial systems commonly connect directly to physical equipment, so the potential loss of production capacity or product,
environmental damage, regulatory violation, compromise to operational safety, or even
personal injury are far more serious consequences. These may have ramifications
beyond the targeted organization; they may damage the infrastructure of the host
location, region, or nation.
The identification and analysis of the cyber elements of risk, as well as the
determination of the best response, is the focus of a comprehensive cybersecurity
management system (CSMS). A thorough understanding of all three risk components is
typically only possible by taking a multidisciplinary approach, drawing on skills and
experience in areas ranging from information security to process and control
engineering.
While integrated with and complementary to programs used to maintain the security of
business information systems and the physical assets, the industrial system’s response
acknowledges and addresses characteristics and constraints unique to the industrial
environment.
Trends
The situation with respect to cybersecurity continues to evolve. There are several trends
that contribute to the increased emphasis on the security of industrial systems,
including:
• Increased attention is being paid to the protection of industrial processes,
particularly those that are considered to be part of the critical infrastructure.
• Businesses have reported more unauthorized attempts (either intentional or
unintentional) to access electronic information each year than in the previous
year.
• New and improved tools have been developed to automate attacks. These are
commonly available on the Internet. The sources of external threat from the use of these tools now include cyber criminals and cyberterrorists, who may have more resources and knowledge to attack an industrial system.
• Changing business models in the industrial sector have led to a more complex
situation with respect to the number of organizations and groups contributing to
the security of industrial systems. These practices must be taken into account
when developing security for these systems.
• The focus on unauthorized access has broadened from amateur attackers or
disgruntled employees to deliberate criminal or terrorist activities aimed at
impacting large groups and facilities.
These and other trends have contributed to an increased level of risk associated with the
design and operation of industrial systems. At the same time, electronic security of
industrial systems has become a more significant and widely acknowledged concern.
This shift requires more structured guidelines and procedures to define electronic
security applicable to industrial systems, as well as the respective connectivity to other
systems.
General Security Concepts
Do common security concepts and principles also apply to industrial systems security?
There has been considerable discussion and debate on the question of whether industrial
system cybersecurity is somehow “different” from that of general business systems. A
more constructive approach is to start with an overview of some of the general concepts that form the basis of virtually any cybersecurity program, and then build on these concepts by looking at those aspects that differentiate industrial systems from general business systems.
Management System
Regardless of the scope of application, any robust and sustainable cybersecurity
program must balance the needs and constraints in three broad areas: people, processes,
and technology. Each of these areas contributes to the security of systems, and each
must be addressed as part of a complete management system, regardless of whether the
focus is on information or industrial systems security.
People-related weaknesses can diminish the effectiveness of technology. These include a lack of necessary training or relevant experience, as well as insufficient attention paid to inappropriate behavior.
Strong processes can often help to overcome potential vulnerabilities in a security
product, while poor implementation can render good technologies ineffective.
Finally, technology is necessary to accomplish the desired goals, whether they are
related to system functionality or operational performance.
Figure 27-1 shows how the three aspects described above come together in the form of a
management system that allows a structured and measured approach to establishing,
implementing, operating, monitoring, reviewing, maintaining, and improving
cybersecurity.
An organization must identify and manage many activities in order to function
effectively. Any activity that uses resources and is managed in order to enable the transformation of inputs into outputs can be considered a process. Often the output from one process directly becomes the input to the next.
The application of a system of processes within an organization, together with the
identification and interactions of these processes, and their management, can be referred
to as a process approach, which encourages its users to emphasize the importance of:
• Understanding an organization’s cybersecurity requirements and the need to
establish policy and objectives for cybersecurity
• Implementing and operating controls to manage an organization’s cybersecurity
risks relative to the context of overall business risks
• Monitoring and reviewing the performance and effectiveness of the industrial
system’s security management system (SMS)
• Regularly improving based on objective measurements
Program Maturity
Establishing a basis for continual improvement requires that there first be some
assessment of program effectiveness. One commonly used method is to apply a Maturity
Model,2 which allows various aspects of the program to be assessed in a qualitative
fashion.
A mature security program integrates all aspects of cybersecurity, incorporating desktop and business computing systems with industrial automation and control systems. The development of a program must recognize that there are steps and milestones in achieving this maturity.
A model such as this may be applied to a wide variety of requirements for the system in
question. It is intended that capabilities will evolve to higher levels over time as
proficiency is gained in meeting the requirements.
Table 27-1 illustrates the application of maturity levels to industrial control systems
(ICSs), and a comparison to Capability Maturity Model Integration for Services
(CMMI-SVC).
Improvement Model
The need for continual improvement can be described in the context of a simple plan-do-check-act (PDCA) model, which is applied to structure all processes. Figure 27-2 illustrates how an industrial automation and control systems security management system (IACS-SMS)3 takes the security requirements and expectations of the interested parties as input and produces outcomes that meet those requirements and expectations.
Each phase in the above model is described briefly in Table 27-2.
Common Principles
There are several common principles that may be employed as part of virtually any
security program, regardless of the nature of application. Several of these are of
particular relevance in the industrial systems context.
• Least privilege – Each user or system module must be able to access only the
information and resources that are necessary for its legitimate purpose.
• Defense in depth – Employ multiple techniques to help mitigate the risk of one component of the defense being compromised or circumvented (illustrated in the sketch below).
• Threat-risk assessment – Assets are subject to risks. These risks are in turn
minimized through the use of countermeasures, which are applied to address
vulnerabilities that are used or exploited by various threats.
Each of these will be touched on in more detail in subsequent sections.
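A back-of-the-envelope calculation shows the appeal of defense in depth. The layer names and probabilities below are invented, and real attacks are rarely independent across layers, so treat this as intuition rather than analysis:

# Hypothetical probability that an attacker defeats each layer.
layers = {
    "perimeter firewall": 0.10,
    "network segmentation": 0.20,
    "host hardening": 0.30,
    "application authentication": 0.10,
}

p_all_fail = 1.0
for name, p in layers.items():
    p_all_fail *= p
print(f"chance every layer is defeated: {p_all_fail:.4%}")  # 0.0600%

No single layer here is strong, yet the stack is difficult to penetrate, which is exactly the argument for not relying on any one countermeasure.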
Industrial Systems Security
What makes ICS security different from “normal” security?
With a solid foundation composed of the general concepts of information security, it is
possible to move on to additional concepts that are in fact somewhat different in the
context of industrial systems. It is a solid understanding of these concepts and their
implications that is critical to the development of a comprehensive, effective, and
sustainable cybersecurity response in this environment.
Safety and Security
A good place to start is with the important link between cybersecurity and process
safety. Many industrial systems are connected to physical equipment, and it is this connection that makes the treatment of these systems different. To varying degrees, and depending on
the nature of the physical environment, failure or compromise of this equipment can
have serious consequences, ranging from adverse environmental impact to injury or
death. It is for this reason that the overriding objective in these industrial systems is to
ensure that the underlying physical process operates safely. Ineffective or nonexistent
cybersecurity presents a potential means by which this objective can be compromised.
Security Life Cycle
In order to be effective over the long term, the security program applied to an industrial
system must consider all phases of that system’s life cycle. This perspective is
particularly important in this context because of the often long operational life of
industrial systems and processes. All significant decisions must be made with a long-term perspective, given that the underlying system may be in place for decades.
The security-level life cycle is focused on the security level of a portion of the industrial
system over time. It should not be confused with the life-cycle phases of the actual
physical assets comprising the industrial system. Although there are many overlapping and complementary activities associated with the asset life cycle and the security-level life cycle, they each have different trigger points to move from one phase to another. A
change to a physical asset may trigger a set of security-level activities or a change in
security vulnerabilities, but it is also possible that changes to the threat environment
could result in changes to the configuration of one or more asset components.
There are several views of the security life cycle. One of these is illustrated in Figure
27-3.
Reference Model
With an understanding of the importance of security to safety, the next step is to
consider the nature of the system to be secured. In most cases, this begins with the
selection or development of a reference model that can be used to represent the basic
system functionality in generic terms. This approach is well established in the
development of technology standards and practices. One model that addresses the
industrial systems domain has been derived from earlier models appearing in related
industry standards, which are in turn based on the Purdue Reference Model.4 This model
is shown in Figure 27-4 below.
The primary focus of industrial systems security is on the lower three levels of this model. While the security of systems at Level 4 is typically well addressed by a general business security program, the nature of systems at Levels 1 through 3 means that they require specific attention.
System Definition
The reference model shown in Figure 27-4 provides the context or backdrop for
defining the specific boundaries of the security system. This can be a complex activity
because these boundaries can be described in a variety of terms, and the results are often
not consistent.
Perspectives that must be considered include:
• Functionality included – The scope of the security system can be described in
terms of the range of functionality within an organization’s information and
automation systems. This functionality is typically described in terms of one or
more models. Industrial automation and control includes the supervisory control components typically found in process industries, as well as supervisory control and data acquisition (SCADA) systems that are commonly found in other critical and noncritical infrastructure industries.
• Systems and interfaces – It is also possible to describe the scope in terms of
connectivity to associated systems. The range of industrial systems includes those
that can affect or influence the safe, secure, and reliable operation of industrial
processes. They include, but are not limited to:
○ Industrial systems and their associated communications networks, including
distributed control systems (DCSs); programmable logic controllers (PLCs);
remote terminal units (RTUs); intelligent electronic devices; SCADA systems;
networked electronic sensing and control, metering, and custody transfer
systems; and monitoring and diagnostic systems. In this context, industrial
systems include basic process control system and safety instrumented system
(SIS) functions, whether they are physically separate or integrated.
○ Associated systems at Level 3 or below of the reference model. Examples
include advanced or multivariable control, online optimizers, dedicated
equipment monitors, graphical interfaces, process historians, manufacturing
execution systems, pipeline leak detection systems, work management, outage
management, and energy management systems.
○ Associated internal, human, network, software, machine, or device interfaces
used to provide control, safety, manufacturing, or remote operations
functionality to continuous, batch, discrete, and other processes.
• Activity-based criteria – The ANSI/ISA-95.00.03 standard5 defines a set of
criteria for defining activities associated with manufacturing operations. A similar
list has been developed for determining the scope of industrial systems security. A
system should be considered to be within this scope if the activity it performs is
necessary for any of the following:
○ Predictable operation of the process
○ Process or personnel safety
○ Process reliability or availability
○ Process efficiency
○ Process operability
○ Product quality
○ Environmental protection
○ Compliance with relevant regulations
○ Product sales or custody transfer affecting or influencing industrial processes
• Asset-based criteria – Industrial systems security may include those systems in
assets that meet any of several criteria or whose security is essential to the
protection of other assets that meet these criteria. Such assets may:
○ Be necessary to maintain the economic value of a manufacturing or operating process
○ Perform a function necessary to operation of a manufacturing or operating process
○ Represent intellectual property of a manufacturing or operating process
○ Be necessary to operate and maintain security for a manufacturing or operating process
○ Be necessary to protect personnel, contractors, and visitors involved in a manufacturing or operating process
○ Be necessary to protect the environment
○ Be necessary to protect the public from events caused by a manufacturing or operating process
○ Fulfill a legal requirement, especially for security purposes of a manufacturing or operating process
○ Be needed for disaster recovery
○ Be needed for logging security events
This range of coverage includes systems whose compromise could result in the
endangerment of public or employee health or safety; loss of public confidence;
violation of regulatory requirements; loss or invalidation of proprietary or confidential
information; environmental contamination; economic loss; or impact on an entity or on
local or national security.
• Consequence-based criteria – During all phases of the system’s life cycle,
cybersecurity risk assessments must include a determination of what could go
wrong to disrupt operations, where this could occur, the likelihood that a cyber
attack could initiate such a disruption, and the consequences that could result.
The output from this determination will include sufficient information to help in
the identification and selection of relevant security properties.
Security Zones
For all but the simplest of situations, it is impractical or even impossible to consider an
entire industrial system as having a single common set of security requirements and
performance levels. Differences can be addressed by using the concept of a security
“zone,” or an area under protection. A security zone is a logical or physical grouping of
physical, informational, and application assets sharing common security requirements.
Some systems are included in the security zone and all others are outside the zone.
There can also be zones within zones, or subzones, that provide layered security, giving
defense in depth and addressing multiple levels of security requirements. Defense in
depth can also be accomplished by assigning different properties to security zones.
A security zone has a border, which is the boundary between included and excluded
components. The concept of a zone implies the need to access the assets in a zone from
both within and without. This defines the communication and access required to allow
information and people to move within and between the security zones. Zones may be
considered as trusted or untrusted.
Security zones can be defined in either a physical sense (i.e., a physical zone) or in a
logical manner (i.e., a virtual zone). Physical zones are defined by grouping assets by
physical location. In this type of zone, it is easy to determine which assets are within
each zone. Virtual zones are defined by grouping assets, or parts of physical assets, into
security zones based on functionality or other characteristics, rather than the actual
location of the assets.
When defining a security zone, the first step is to assess the security requirements or
goals in order to determine if a particular asset should be considered within the zone or
outside the zone. The security requirements can be broken down into the following
types:
• Communications access – For a group of assets within a security border, there is
also typically access to assets outside the security zone. This access can be in
many forms, including physical movement of assets (products) and people
(employees and vendors) or electronic communication with entities outside the
security zone.
Remote communication is the transfer of information to and from entities that are
not in proximity to each other. Remote access is defined as communication with
assets that are outside the perimeter of the security zone being addressed. Local
access is usually considered communication between assets within a single
security zone.
• Physical access and proximity – Physical security zones are used to limit access
to a particular area because all the systems in that area require the same level of
trust of their human operators, maintainers, and developers. This does not
preclude having a higher-level physical security zone embedded within a lower-level physical security zone, or a higher-level communication access zone within a lower-level physical security zone. For physical zones, locks on doors or other
physical means protect against unauthorized access. The boundary is the wall or
cabinet that restricts access. Physical zones should have physical boundaries
commensurate with the level of security desired and aligned with other asset
security plans.
One example of a physical security zone is a typical manufacturing plant.
Authorized people are allowed into the plant by an authorizing agent (e.g.,
security guard or ID), and unauthorized people are restricted from entering by the
same authorizing agent or by physical barriers.
Assets that are within the security border are those that must be protected to a
given security level, or to adhere to a specific policy. All devices that are within
the border must share the same minimum security-level requirements. In other words, they must be protected to meet the same security policy. Protection mechanisms can differ depending on the asset being protected.
Assets that are outside the security zone are, by definition, at a different security
level. They are not protected to the same security level and cannot be trusted to
the same security level or policy.
Security Conduits
Information must flow into, out of, and within a security zone. Even in a non-networked
system, some communication exists (e.g., intermittent connection of programming
devices to create and maintain the systems). This is accomplished using a special type of
security zone: a communications conduit.
A conduit is a type of security zone that groups communications that can be logically
organized into a grouping of information flows within and external to a zone. It can be a single service (e.g., a single Ethernet network) or it can be made up of multiple data carriers (e.g., multiple network cables and direct physical accesses). As with zones, it
can be made of both physical and logical constructs. Conduits may connect entities
within a zone or may connect different zones.
As with zones, conduits may be either trusted or untrusted. Conduits that do not cross
zone boundaries are typically trusted by the communicating processes within the zone.
Trusted conduits crossing zone boundaries must use an end-to-end secure process.
Untrusted conduits are those that are not at the same level of security as the zone
endpoint. In this case, the security of the communication becomes the responsibility of
the individual channel.
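The zone and conduit concepts lend themselves to a simple data-structure sketch. The Python fragment below is a minimal illustration (the names and security levels are hypothetical): a conduit that stays within one zone inherits that zone's trust, while one that crosses a zone boundary is trusted only if it uses an end-to-end secure process:

from dataclasses import dataclass

@dataclass
class Zone:
    name: str
    security_level: int      # target level; higher means more protected

@dataclass
class Conduit:
    name: str
    src: Zone
    dst: Zone
    end_to_end_secure: bool = False

    def is_trusted(self) -> bool:
        # Within one zone the conduit inherits the zone's trust; across
        # a boundary it must provide end-to-end security itself.
        return self.src is self.dst or self.end_to_end_secure

control = Zone("basic control", security_level=3)
business = Zone("business LAN", security_level=1)

feed = Conduit("historian feed", src=control, dst=business)
print(feed.is_trusted())        # False: an untrusted conduit
feed.end_to_end_secure = True
print(feed.is_trusted())        # True: trusted across the boundary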
Figures 27-5 and 27-6 depict examples of zone and conduit definitions for the process
and manufacturing environments, respectively.
Foundational Requirements
The ISA-62443 and IEC 62443 series of standards describe a small set of foundational
requirements (FRs) that encompass and, at times, help organize more specific detailed
requirements, as well as the actions required for a security program. These requirements
are:
• FR 1. Identification and Authentication Control (IAC) – Based on the target
security level, the industrial control system (ICS) shall provide the necessary
capabilities to reliably identify and authenticate all users (i.e., humans, software
processes, and devices) attempting to access it.
Asset owners will have to develop a list of all valid users (i.e., humans, software
processes, and devices) and to determine the required level of IAC protection for
each zone. The goal of IAC is to protect the ICS from unauthenticated access by
verifying the identity of any user requesting access to the ICS before activating
the communication. Recommendations and guidelines should include
mechanisms that will operate in mixed modes. For example, some zones and
individual ICS components require strong IAC, such as authentication
mechanisms, and others do not.
• FR 2. Use Control (UC) – Based on the target security level, the ICS shall
provide the necessary capabilities to enforce the assigned privileges of an
authenticated user (i.e., human, software process, or device) to perform the
requested action on the system or assets and monitor the use of these privileges.
Once the user is identified and authenticated, the control system has to restrict the
allowed actions to the authorized use of the control system. Asset owners and
system integrators will have to assign to each user (i.e., human, software process,
or device) a group or role, and so on, with the privileges defining the authorized
use of the industrial control systems. The goal of UC is to protect against
unauthorized actions on ICS resources by verifying that the necessary privileges
have been granted before allowing a user to perform the actions. Examples of
actions are reading or writing data, downloading programs, and setting
configurations. Recommendations and guidelines should include mechanisms that
will operate in mixed modes. For example, some ICS resources require strong use
control protection, such as restrictive privileges, and others do not. By extension,
use control requirements need to be extended to data at rest. User privileges may
vary based on time-of-day/date, location, and means by which access is made.
• FR 3. System Integrity (SI) – Based on the target security level, the ICS shall
provide the necessary capabilities to ensure integrity and prevent unauthorized
manipulation.
Industrial control systems will often go through multiple testing cycles (unit
testing, factory acceptance testing [FAT], site acceptance testing [SAT],
certification, commissioning, etc.) to establish that the systems will perform as
intended even before they begin production. Once operational, asset owners are
responsible for maintaining the integrity of the industrial control systems. Using
their risk assessment methodology, asset owners may assign different levels of
integrity protection to different systems, communication channels, and
information in their industrial control systems. The integrity of physical assets
should be maintained in both operational and non-operational states, such as
during production, when in storage, or during a maintenance shutdown. The
integrity of logical assets should be maintained while in transit and at rest, such as
being transmitted over a network or when residing in a data repository.
• FR 4. Data Confidentiality (DC) – Based on the target security level, the ICS
shall provide the necessary capabilities to ensure the confidentiality of
information on communication channels and in data repositories to prevent
unauthorized disclosure.
Some control system-generated information, whether at rest or in transit, is of a
confidential or sensitive nature. This implies that some communication channels
and data-stores require protection against eavesdropping and unauthorized access.
• FR 5. Restricted Data Flow (RDF) – Based on the target security level, the ICS
shall provide the necessary capabilities to segment the control system via zones
and conduits to limit the unnecessary flow of data.
Using their risk assessment methodology, asset owners need to identify the
necessary information flow restrictions and thus, by extension, determine the
configuration of the conduits used to deliver this information. Derived
prescriptive recommendations and guidelines should include mechanisms that
range from disconnecting control system networks from business or public
networks to using unidirectional gateways, stateful firewalls, and demilitarized
zones (DMZs) to manage the flow of information.
• FR 6. Timely Response to Events (TRE) – Based on the target security level,
the ICS shall provide the necessary capabilities to respond to security violations
by notifying the proper authority, reporting needed evidence of the violation, and
taking timely corrective action when incidents are discovered.
Using their risk assessment methodology, asset owners should establish security
policies and procedures, and proper lines of communication and control needed to
respond to security violations. Derived prescriptive recommendations and
guidelines should include mechanisms that collect, report, preserve, and
automatically correlate the forensic evidence to ensure timely corrective action.
The use of monitoring tools and techniques should not adversely affect the
operational performance of the control system.
• FR 7. Resource Availability (RA) – Based on the target security level, the ICS
shall provide the necessary capabilities to ensure the availability of the control
system against the degradation or denial of essential services.
The objective is to ensure that the control system is resilient against various types
of denial of service events. This includes the partial or total unavailability of
system functionality at various levels. In particular, security incidents in the
control system should not affect SIS or other safety-related functions.
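Because every detailed requirement rolls up under one of these seven FRs, a zone's security posture can be summarized as one value per FR. The sketch below is a minimal illustration of that bookkeeping; the enum and the example target levels are hypothetical, not normative values from the standard.

```python
from enum import Enum

class FR(Enum):
    IAC = "Identification and Authentication Control"
    UC  = "Use Control"
    SI  = "System Integrity"
    DC  = "Data Confidentiality"
    RDF = "Restricted Data Flow"
    TRE = "Timely Response to Events"
    RA  = "Resource Availability"

# Illustrative per-FR target levels for one control-system zone.
target = {FR.IAC: 3, FR.UC: 3, FR.SI: 3, FR.DC: 2,
          FR.RDF: 3, FR.TRE: 2, FR.RA: 3}

for fr, level in target.items():
    print(f"FR {fr.name:>3} ({fr.value}): target security level {level}")
```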
Security Levels
Safety systems have used the concept of safety integrity levels (SILs) for almost two
decades. This allows the safety integrity capability of a component or the SIL of a
deployed system to be represented by a single number that defines a protection factor
required to ensure the health and safety of people or the environment based on the
probability of failure for that component or system. The process to determine the
required protection factor for a safety system, while complex, is manageable since the
probability of a component or system failure due to random hardware failures can be
measured in quantitative terms. The overall risk can be calculated based on the
consequences that those failures could potentially have on health, safety, and the
environment (HSE).
Security systems have much broader applications, a much broader set of consequences,
and a much broader set of possible circumstances leading up to a possible event.
Security systems protect HSE, but they are also meant to protect the process itself,
company-proprietary information, public confidence, and national security, among other things, in situations where the cause is not a random hardware failure at all. In some
cases, it may be a well-meaning employee that makes a mistake, and in other cases it
may be a devious attacker bent on causing an event and hiding the evidence. The
increased complexity of security systems makes compressing the protection factor down
to a single number much more difficult.
Security levels provide a qualitative approach to addressing security for a zone. As a
qualitative method, security level definition has applicability for comparing and
managing the security of zones within an organization. It is applicable to both end-user
companies and vendors of industrial systems and security products, and may be used to
select industrial systems devices and countermeasures to be used within a zone and to
identify and compare security of zones in different organizations across industry
segments.
Security levels have been broken down into three different types:
1. A target security level is the desired level of security for a particular system.
This is usually determined by performing a risk assessment on a system and
determining that it needs a particular level of security to ensure its correct
operation.
2. An achieved security level is the actual level of security for a particular system.
Achieved security levels are measured after a system design is available or when
a system is in place. They are used to establish that a security system meets the
goals that were originally set out in the target security levels.
3. Capability security levels are the security levels that components or systems can
provide when properly configured. These levels state that a particular system or
component is capable of meeting the target security levels without additional
compensating controls, when properly configured and integrated.
While related, these types address different aspects and phases of the security life cycle. The design team would first develop the target security level necessary for a particular system. They would then design the system to meet those targets, usually in an iterative process in which the achieved security levels of the proposed design are measured and compared to the target security levels after each iteration. As part of that design process, the designers would select systems and components with the necessary capability security levels to meet the target security-level requirements or, where such systems and components are not available, complement the available ones with compensating security controls.
After the system went into operation, the actual security levels would be measured as
the achieved security levels and then compared to the target levels.
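That compare-and-close-the-gap iteration is straightforward to express in code. A minimal sketch, assuming per-FR level vectors as in the previous example (all values are illustrative):

```python
# Target levels come from the risk assessment; achieved levels are
# measured on the proposed design. Both are per-FR and illustrative.
target   = {"IAC": 3, "UC": 3, "SI": 3, "DC": 2, "RDF": 3, "TRE": 2, "RA": 3}
achieved = {"IAC": 3, "UC": 2, "SI": 3, "DC": 2, "RDF": 2, "TRE": 2, "RA": 3}

gaps = {fr: target[fr] - achieved[fr]
        for fr in target if achieved[fr] < target[fr]}

if gaps:
    for fr, shortfall in gaps.items():
        print(f"{fr}: short by {shortfall} level(s); add compensating "
              "controls or higher-capability components, then re-measure")
else:
    print("Design meets the target security levels.")
```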
Four different security levels have been defined (1, 2, 3, and 4), each with an increasing
level of security. Every security level defines security requirements or achievements for
systems and products, but there is no requirement that they be applied. The language
used for each of the security levels includes terms like casual, coincidental, simple,
sophisticated, and extended. The following sections will provide some guidance on how
to differentiate between the security levels.
Security Level 1: Protection Against Casual or Coincidental Violation
Casual or coincidental violations of security are usually through the lax application of
security policies. These can be caused by well-meaning employees just as easily as they
can be by an outside threat. Many of these violations will be security program-related
and will be handled by enforcing policies and procedures.
A simple example would be an operator who is able to change a set point on the
engineering station in the control system zone to a value outside the limits determined by the engineering staff. The system did not enforce the proper
authentication and use control restrictions to disallow the change by the operator.
Another example would be a password being sent in clear text over the conduit between
the control system zone and the plant network, allowing a network engineer to view the
password while troubleshooting the system. The system did not enforce proper data
confidentiality to protect the password. A third example would be an engineer who
means to access the PLC in Industrial Network #1 but actually accesses the PLC in
Industrial Network #2. The system did not enforce the proper restriction of data flow
preventing the engineer from accessing the wrong system.
Security Level 2: Protection Against Intentional Violation Using Simple
Means with Low Resources, Generic Skills, and Low Motivation
Simple means do not require much knowledge on the part of the attacker. The attacker
does not need detailed knowledge of security, the domain, or the particular system under
attack. The attack vectors are well known and there may be automated tools for aiding
the attacker. Such tools are also designed to attack a wide range of systems instead of
targeting a specific system, so an attacker does not need a significant level of motivation
or resources at hand.
An example would be a virus that infects the email server and spreads to the engineering
workstations in the plant network because the server and workstations both use the same
general-purpose operating system. Another example would be an attacker who
downloads an exploit for a publicly known vulnerability from the Internet and then uses
it to compromise a web server in the enterprise network. The attacker then uses the web
server as a pivot point in an attack against other systems in the enterprise network, as
well as the industrial network. A third example would be an operator who views a
website on the human-machine interface (HMI) located in Industrial Network #1 which
downloads a Trojan that opens a hole in the routers and firewalls to the Internet.
Security Level 3: Protection Against Intentional Violation Using
Sophisticated Means with Moderate Resources, System Specific Skills,
and Moderate Motivation
Sophisticated means require advanced security knowledge, advanced domain knowledge, advanced knowledge of the target system, or any combination of these. An
attacker going after a Security Level 3 system will likely be using attack vectors that
have been customized for the specific target system. The attacker may use exploits in
operating systems that are not well known, weaknesses in industrial protocols, specific
information about a particular target to violate the security of the system, or other means
that require a greater motivation, skill, and knowledge than are required for Security
Level 1 or 2.
An example of sophisticated means could be password or key-cracking tools based on
hash tables. These tools are available for download, but applying them takes knowledge
of the system (such as the hash of a password to crack). Another example would be an
attacker who gains access to the safety PLC through the Modbus conduit after gaining
access to the control PLC through a vulnerability in the Ethernet controller. A third
example would be an attacker who gains access to the data historian by using a brute-force attack through the industrial or enterprise DMZ firewall initiated from the
enterprise wireless network.
Security Level 4: Protection Against Intentional Violation Using
Sophisticated Means with Extended Resources, System Specific Skills,
and High Motivation
Security Level 3 and Security Level 4 are very similar in that they both involve using
sophisticated means to violate the security requirements of the system. The difference
comes from the attacker being even more motivated and having extended resources at
their disposal. These may involve high-performance computing resources, large
numbers of computers, or extended periods of time.
An example of sophisticated means with extended resources would be using supercomputers or computer clusters to conduct brute-force password cracking using large
hash tables. Another example would be a botnet used to attack a system employing
multiple attack vectors at once. A third example would be a structured crime
organization that has the motivation and resources to spend weeks attempting to analyze
a system and develop custom “zero-day” exploits.
Standards and Practices
“Help is on the way.”
As the hybrid discipline of industrial automation and control systems security evolves,
so do the standards and practices related to its application.
International standards are the joint responsibility of the International Society of
Automation (ISA) and the International Electrotechnical Commission (IEC). Several
standards are available as part of the ISA-62443 or IEC 62443 series, with more under
development.
In addition, the ISA Security Compliance Institute6 manages the ISASecure™ program,
which recognizes and promotes cyber-secure products and practices for industrial
automation suppliers and operational sites.
As the standards and certifications evolve and gain acceptance, practical guidance and
assistance will become increasingly available. Such assistance is typically available
through trade associations, industry groups, and private consultants.
Further Information
Byres, Eric, and John Cusimano. Seven Steps to ICS and SCADA Security. Tofino
Security and exida Consulting LLC, 2012.
Krutz, Ronald L. Securing SCADA Systems. Indianapolis, IN: Wiley Publishing, Inc., 2006.
Langner, Ralph. Robust Control System Networks. New York: Momentum Press, 2012.
Macaulay, Tyson, and Bryan Singer. Cybersecurity for Industrial Control Systems. Boca
Raton, FL: CRC Press, 2012.
U.S. Department of Energy. Twenty-One Steps to Improve Cybersecurity of SCADA
Networks. http://energy.gov/oe/downloads/21-steps-improve-cyber-security-scada-networks.
Weiss, Joseph. Protecting Industrial Control Systems from Electronic Threats. New
York: Momentum Press, 2010.
About the Author
Eric C. Cosman is a former operations IT consultant with The Dow Chemical
Company. In that role, his responsibilities included system architecture definition and
design, technology management, and integration planning for manufacturing systems.
He has presented and published papers on various topics related to the management and
development of information systems for process manufacturing. Cosman contributes to
various standards committees, industry focus groups, and advisory panels. He has been
a contributor to the work of the ISA95 committee, served as the co-chairman of the
ISA99 committee on industrial automation systems security, and served as the vice president for standards and practices at ISA. Cosman sponsored a chemical sector
cybersecurity program team that focused on industrial control systems cybersecurity,
and he was one of the authors of the Chemical Sector Cybersecurity Strategy for the
United States.
1. Many terms are used to describe these systems. The ISA-62443 series of standards uses the more formal and expansive term industrial automation and control systems (IACS).
2. One such model is summarized in Table 27-1. The maturity levels are based on the CMMI-SVC model, defined in CMMI® for Services, Version 1.3, November 2010 (CMU/SEI-2010-TR-034, ESC-TR-2010-034).
3. IACS-SMS is a term used in the ISA-62443 series of standards.
4. http://www.pera.net/
5. http://en.wikipedia.org/wiki/ANSI/ISA-95
6. http://www.isasecure.org/
IX
Maintenance
Maintenance Principles
Maintenance, long-term support, and system management take a lot of work to do well.
The difference in cost and effectiveness between a good maintenance operation and a
poor one is easily a factor of two and may be much more. Automation professionals
must understand this area so that their designs can effectively deal with life-cycle cost.
Troubleshooting Techniques
Automation professionals who only work on engineering projects in the office and leave
the field work to others may not realize the tremendous amount of work required to get a
system operating. Construction staff and plant technicians are doing more and more of
the checkout, system testing, and start-up work today, which makes it more important
that automation professionals understand these topics.
Asset Management
Asset management systems are processing and enabling information systems that support managing an organization’s assets, both physical (tangible) assets and non-physical (intangible) assets. Asset management is a systematic process of cost-effectively developing, operating, maintaining, upgrading, and disposing of assets. Due
to the number of elements involved, asset management is data and information
intensive. Using all the information available from various assets will improve asset
utilization at a lower total cost, which is the goal of asset management programs.
28
Maintenance, Long-Term Support, and
System Management
By Joseph D. Patton, Jr.
Maintenance Is Big Business
Maintenance is a challenging mix of art and science, where both economics and
emotions have roles. Please note that serviceability and supportability parallel
maintainability, and maintenance and service are similar for our purposes.
Maintainability (i.e., serviceability or supportability) is the discipline of designing and
producing equipment so it can be maintained. Maintenance and service include
performing all actions necessary to restore durable equipment to, or keep it in, specified
operation condition.
There are several forces changing the maintenance business. One is the technological
change of electronics and optics doing what once required physical mechanics.
Computers are guiding activities and interrelationships between processes instead of
humans turning dials and pulling levers. Remote diagnostics using the Internet reduce
the number of site visits and help improve the probability of the right technician coming
with the right part. Many repairs can be handled by the equipment operator or local
personnel. Robots are performing many tasks that once required humans. Many parts,
such as electronic circuit boards, cannot be easily repaired and must be replaced in the
field and sent out for possible repair and recycling. Fast delivery of needed parts and
reverse logistics are being emphasized to reduce inventories, reuse items, reduce
environmental impact, and save costs. Life-cycle costs and profits are being analyzed to
consider production effects, improve system availability, reduce maintenance, repair,
and operating (MRO) costs, and improve overall costs and profits. Change is
continuous!
Organizations that design, produce, and support their own equipment, often on lease,
have a vested interest in good maintainability. On the other hand, many companies,
especially those with sophisticated high-technology products, have either gone bankrupt
or sold out to a larger corporation when they became unable to maintain their creations.
Then, of course, there are many organizations such as automobile service centers,
computer repair shops, and many factory maintenance departments that have little, if
any, say in the design of equipment they will later be called on to support. While the
power of these affected organizations is somewhat limited by their inability to do more
than refuse to carry or use the product line, their complaints generally result in at least
modifications and improvements to the next generation of products.
Maintenance is big business. Gartner estimates hardware maintenance and support is
$120 billion per year and growing 5.36% annually. The Northwestern University
Chemical Process Design Open Textbook places maintenance costs at 6% of fixed
capital investment. U.S. Bancorp estimates that spending on spare parts costs $700
billion in the United States alone, which is 8% of the gross domestic product.
Service Technicians
Typically, maintenance people once had extensive experience with fixing things and
were oriented toward repair instead of preventive maintenance. In the past, many
technicians were not accustomed to using external information to guide their work.
Maintenance mechanics or technicians often focused on specific equipment, usually at a
single facility, which limited the broader perspective developed from working with
similar situations at many other installations.
Today, service technicians are also called field engineers (FEs), customer engineers
(CEs), customer service engineers (CSEs), customer service representatives (CSRs), and
similar titles. This document will use the terms “technicians” or “techs.” In a sense,
service technicians must “fix” both equipment and customer employees. There are many
situations today where technicians can solve problems over the telephone by having a
cooperative customer download a software patch or perform an adjustment. The major
shift today is toward remote diagnostics and self-repairs via Internet software fixes,
YouTube guidance of procedures, supplier’s websites, and call centers to guide the end
user or technician.
Service can be used both to protect and to promote. Protective service ensures that
equipment and all company assets are well maintained and give the best performance of
which they are capable. Protective maintenance goals for a technician may include the
following:
• Install equipment properly
• Teach the customer how to use the equipment capability effectively
• Provide functions that customers are unable to supply themselves
• Maintain quality on installed equipment
• Gain experience on servicing needs
• Investigate customer problems and rapidly solve them to the customer’s
satisfaction
• Preserve the end value of the product and extend its useful life
• Observe competitive activity
• Gain technical feedback to correct problems
Service techs are becoming company representatives who emphasize customer contact
skills instead of being solely technical experts. In addition, the business of maintenance
service is becoming much more dependent on having the correct part. A concurrent
trend is customer demand and service level agreements (SLAs) that require fast
restoration of equipment to good operating condition. This is especially true with
computer servers, communications equipment, medical scanners, sophisticated
manufacturing devices, and similar equipment that affects many people or even
threatens lives when it fails.
Accurate and timely data reporting and analysis is important. When focusing on
completing a given job, most technicians prefer to take a part right away, get equipment
up and running, and enter the related data later. Fortunately, cell phones, hand-held
devices, bar code readers and portable computers with friendly application software
facilitate reporting. Returning defective or excess parts may be a lower priority, and
techs may cache personal supplies of parts if company supply is unreliable. However,
there are many situations where on-site, real-time data entry and validation can be
shown to gain accurate information for future improvement. As a result, a challenge of
maintenance management is to develop technology that stimulates and supports
maintenance realities.
Big Picture View
Enterprise asset management (EAM) is a buzzword for the big picture. There are good
software applications available to help manage MRO activities. However, most data are
concentrated on a single facility, and even on single points in time, rather than covering
the life cycle of equipment and facilities. As Figure 28-1 illustrates, the initial cost of
equipment is probably far exceeded by the cost to keep it operating and productive over
its life cycle. Many maintenance events occur so infrequently in a facility that years
must pass before enough data is available to determine trends and, by then, the
equipment is probably obsolete or at least changed. Looking at a larger group of
facilities and equipment leads to more data points and more rapid detection of trends
and formation of solutions.
Interfacing computerized information on failure rates and repair histories with human
resources (HR) information on technician skill levels and availability, pending
engineering changes, procurement parts availability, production schedules, and financial
impacts can greatly improve guidance to maintenance operations. Then, if we can
involve all plants of a corporation, or even all similar products used by other companies,
the populations become large enough to provide effective, timely information.
Optimizing the three major maintenance components of people, parts, and information, shown in Figure 28-2, is essential to achieving effective, timely information.
Historically, the two main maintenance costs have been labor and materials (people and
parts). Labor costs are increasing. This means organizations must give priority to reducing the frequency, time, and skill level of maintenance work, and thereby the cost of labor. The costs of parts are also increasing. A specific capability probably costs less, but integrating multiple capabilities into a single part raises its cost and makes that replaceable unit more critical. A third leg is becoming important to product development and
support: information as generally provided by software on computer and
communications systems. Digital electronic and optical technologies are measurably
increasing equipment capabilities while reducing both costs and failure rates. Achieving
that reduction is vital. Results are seen in the fact that a service technician, who a few
years ago could support about 100 personal computers, can now support several
thousand. Major gains can be made in relating economic improvements to
maintainability efforts. Data gathered by Patton Consultants shows a payoff of 50:1; that
is, a benefit of $50 prevention value for each $1 invested in maintainability.
No Need Is Best
Everything will fail sometime—electrical, electronic, hydraulic, mechanical, nuclear,
optical, and especially biological systems. People spend considerable effort, money, and
time trying to fix things faster.
However, the best answer is to avoid having to make a repair at all. To quote Ben
Franklin, “An ounce of prevention is worth a pound of cure.” The failure-free item that
never wears out has yet to be produced. Perhaps someday it will be, but meanwhile, we
must replace burned-out light bulbs, repair punctured car tires, overhaul jet engines, and
correct elusive electronic discrepancies in computers.
A desirable long-range life-cycle objective is to achieve very low equipment failure
rates and require replacement of only consumables and the parts that wear during
extended use, which can be replaced on a condition-monitored predictive basis.
Reliability (R) and maintainability (M) interact to form availability (A), which may be
defined as the probability that equipment will be in operating condition at any point in
time. Three main types of availability are: inherent availability, achieved availability,
and operational availability. Service management is not particularly interested in
inherent availability, which assumes an ideal support environment without any
preventive maintenance, logistics, or administrative downtime. In other words, inherent
availability is the pure laboratory availability as viewed by design engineering.
Achieved availability also assumes an ideal support environment with everything
available. Operational availability is what counts in the maintenance tech’s mind, since
it considers a “real-world” operating environment.
The most important parameter in determining availability is failure rate, as a product
needs corrective action only if it fails. The main service objective for reliability is mean
time between failure (MTBF), with “time” stated in the units most meaningful for the
product. Those units could include:
• Time – Hours, days, weeks, etc.
• Distance – Miles, kilometers, knots, etc.
• Events – Cycles, gallons, impressions, landings, etc.
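These definitions lend themselves to two simple calculations. Under the usual formulas, inherent availability is estimated from MTBF and mean time to repair (MTTR), while operational availability divides actual uptime by total time. A brief sketch with illustrative numbers:

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    # Ai = MTBF / (MTBF + MTTR): the design-engineering view, assuming
    # an ideal support environment and corrective repair time only.
    return mtbf_hours / (mtbf_hours + mttr_hours)

def operational_availability(uptime_hours: float, total_hours: float) -> float:
    # Ao = uptime / total time: the "real-world" view, absorbing
    # preventive maintenance, logistics, and administrative downtime.
    return uptime_hours / total_hours

print(f"Ai = {inherent_availability(2000, 4):.4f}")        # illustrative MTBF/MTTR
print(f"Ao = {operational_availability(8738, 8760):.4f}")  # illustrative hours
```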
It is important to realize equipment failures caused by customer use should be
anticipated in the design. Coffee spilling in a keyboard, a necklace dropping into a
printer, and panicked pushing of buttons by frustrated users add more calls for help.
Operating concerns by inexperienced users often result in more than half the calls to a
service organization. What the customer perceives as a failure may vary from technical
definitions, but customer concerns must still be addressed by the business.
For operations where downtime is not critical, the need for a highly responsive
maintenance organization is not critical. However, for manufacturing operations where
the process performance is directly related to the performance of the automation
systems, or any other part of the process, downtime can be directly related to the
revenue-generation potential of the plant. Under these conditions, response time
represents revenue to the plant itself. Thus, plants that would have revenue generation
capacity of $10,000 worth of product per hour, operating in a 24-hour day, 7-day week
environment, would be losing approximately $240,000 of revenue for every day that the
plant is shut down. A 24-hour response time for plants of this type would be completely
unsatisfactory. On the other hand, if a manufacturing plant that operates on a batch basis
has no immediate need to complete the batch because of the scheduling of other
products, then a 24-hour response time may be acceptable.
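The downtime economics above reduce to a one-line calculation. A quick sketch using the hourly figure assumed in the text:

```python
revenue_per_hour = 10_000          # $/hour, the plant figure from the text

for downtime_hours in (2, 4, 24):
    lost = downtime_hours * revenue_per_hour
    print(f"{downtime_hours:>2} h down -> ${lost:,} lost revenue")
# 24 h down -> $240,000, matching the daily loss cited above
```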
A typical automobile, for example, gives more utility at lower relative cost than did cars
of even a few years ago; however, it must still be maintained. Cars once required
frequent spark plug changes and carburetor adjustments, but fuel injection has replaced
carburetion. A simple injector cleaning eliminates the several floats, valves, and gaskets
of older carburetors, with fewer failures and superior performance. Computer-related
failures that used to occur weekly are now reduced to units of years.
Service level agreements (SLAs) increasingly require that equipment be restored to
good operation the same day service is requested, and often specify 4 hours, 2 hours, or
even faster repair. Essential equipment may cause great hardship physically and
financially if it is down for long periods of time.
For example, a production line of an integrated circuit fabrication facility can lose
$100,000 per hour of shutdown. A Tenet Healthcare hospital reports that a magnetic resonance imaging (MRI) scanner that cannot operate costs $4,000 per hour in lost
revenue and even more if human life is at risk. Failure of the central computer of a
metropolitan telephone system can cause an entire city to grind to a stop until it is fixed.
Fortunately, reliability and the availability (uptime) of equipment are improving, which
means there are fewer failures. However, when failures do occur, the support solutions
are often complex.
Evolution of Maintenance
Maintenance technology has also been rapidly changing during recent years. The idea
that fixed-interval preventive maintenance is right for all equipment has given way to
the reliability-based methods of on-condition and condition monitoring. The process of
maintenance is illustrated in Figure 28-3.
Many defective parts are now discarded rather than being maintained at organizational
or even intermediate levels. The multilevel system of maintenance is evolving into a
simplified system of more user participation, local first- and second-level maintenance,
and backup direct from a third-party or original equipment manufacturer (OEM) service
organization. Expert systems and artificial intelligence are being developed to help
diagnostics and to predict the need for preventive maintenance. Parts are often supplied
directly from vendors at the time of need, so maintenance organizations need not invest
hard money in large stocks of parts.
Automatic Analysis of Device Performance
There is increased focus and resource deployment to design durable products for
serviceability. Durable equipment is designed and built once, but it must be maintained
for years. With design cycles of 6 months to 3 years (and shrinking), and with product lives
ranging from about 3 years for computers through 40+ years for hospital sterilizers,
alarm systems, and even some airplanes, the initial investment in maintainability will
either bless or haunt an organization for many years. If a company profits by servicing
equipment it produced, good design will produce high return on investment in user
satisfaction, repeat sales, less burden for the service force, and increased long-term
profits. In many corporations, service generates as much revenue as product sales do,
and the profit from service is usually greater. Products must be designed right the first
time. That is where maintainability that enables condition monitoring and on-condition
maintenance becomes effective.
Instruments that measure equipment characteristics are being directly connected to the
maintenance computer. Microprocessors and sensors allow vibration readings, pressure
differentials, temperatures, and other nondestructive test (NDT) data to be recorded and
analyzed. Historically these readings primarily activated alarm annunciators or recorders
that were individually analyzed. Today, there are automated control systems in use that
can signal the need for more careful inspection and preventive maintenance. These
devices are currently cost effective for high-value equipment such as turbines and
compressors. Progress is being made in this area of intelligent device management so all
kinds of electrical, electronic, hydraulic, mechanical, and optical equipment can “call
home” if they begin to experience deficiencies. Trend analysis for condition monitoring
is assisted by accurate, timely computer records.
Capabilities should also be designed into computer programs to indicate any other active
work orders that should be done on equipment at the same time. Modifications, for
example, can be held until other work is going to be done and can be accomplished most
efficiently at the same time as the equipment is down for other maintenance activities. A
variation on the same theme is to ensure emergency work orders will check to see if any
preventive maintenance work orders might be done at the same time. Accomplishing all
work at one period of downtime is usually more effective than doing smaller tasks on
several occasions.
Products that can “call home” and identify the need to replace a degrading part before
failure bring major advantages to both the customer and support organization. There are,
however, economic trade-offs regarding the effort involved versus the benefit to be
derived. For example, the economics may not justify extensive communication
connections for such devices as “smart” refrigerators. However, business devices that
affect multiple people need intelligent device management (IDM) with remote
monitoring to alert the service function to a pending need, hopefully before equipment
becomes inoperable. The ability to “know before you go” is a major help for field
technicians, so they have the right part and are prepared with knowledge of what to
expect.
It is important to recognize the difference between response time and restore time.
Response time is the time in minutes from notification that service is required until a
technician arrives on the scene. Restore time adds the minutes necessary to fix the
equipment. Service contracts historically specified only response time, but now usually
specify restore time. Response is action. Restore is results (see Figure 28-4).
The challenge is that, to restore operation, the technician often needs a specific part.
Many field replaceable units (FRUs) are expensive and not often required. Therefore,
unless good diagnostics identifies the need for a specific part, techs may arrive at the
customer location and then determine they need a part they do not have. Diagnostics is
the most time-consuming portion of a service call. Technicians going to a call with a 4-hour restore requirement will often consume an hour or more to complete the present
assignment and travel to the new customer. Diagnostics adds even more time, so the
techs could easily consume 2 hours of the 4 available before even knowing what part is
needed. Acquiring parts quickly then becomes very important. The value of information
is increasing. Information is replacing inventory. Knowing in an accurate, timely way
that a part was used allows a company to automatically initiate resupply to the
authorized stocking site, even to the extent of informing the producer who will supply
the warehouse with the next required part.
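In practice, “information replacing inventory” amounts to an event-driven resupply rule: each recorded part usage can immediately trigger replenishment of the authorized stocking site. A minimal sketch; the part numbers, stock levels, and order function are hypothetical.

```python
# Minimum and on-hand stock per part at an authorized stocking site
# (illustrative data only).
min_level = {"PSU-240": 2, "IO-CARD-8": 4}
on_hand   = {"PSU-240": 2, "IO-CARD-8": 4}

def place_order(part: str, qty: int) -> None:
    # In practice this would notify the warehouse, TPL, or producer.
    print(f"Resupply order: {qty} x {part}")

def record_usage(part: str) -> None:
    """Log a part consumption and trigger resupply if below minimum."""
    on_hand[part] -= 1
    if on_hand[part] < min_level[part]:
        place_order(part, min_level[part] - on_hand[part])

record_usage("PSU-240")   # stock drops below minimum, so an order is placed
```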
An organization can fix a considerable amount of equipment the next day without undue difficulty. A
required part can be delivered overnight from a central warehouse that stocks at least
one of every part that may be required. Overnight transportation can be provided at
relatively low cost with excellent handling, so orders shipped as late as midnight in
Louisville, Ky. or Memphis, Tenn. can be delivered as early as 6:00 a.m. in major
metropolitan areas. Obviously, those are “best case” conditions. There are many
locations around the world where a technician must travel hours in desolate country to
get to the broken equipment. That technician must have all necessary parts and,
therefore, will take all possible parts or acquire them en route.
Service parts is a confidence business. If technicians have confidence the system will
supply the parts they need when and where they need them, then techs will minimize
their cache of service parts. If confidence is low, techs will develop their own stock of
parts, will order two parts when only one is needed, and will retain intermittent problem
parts.
Parts required 24/365 can be shared through either third-party logistics companies
(TPLs) or intelligent lockers instead of being carried by the several individual
technicians who might provide the same coverage. Handoffs from the stock-keeping
facility to the courier to the technician can be facilitated by intelligent lockers. Today,
most orders are transmitted to the company warehouse or TPL location that picks and
packs the ordered part for shipment and notifies the courier. Then the courier must
locate the technician, who often has to drop what he or she is doing, go to meet the
courier, and sign for the part. Avoid arrangements that allow the courier to leave parts at
a receiving dock or reception desk, because they often disappear before the technician
arrives.
Intelligent lockers can facilitate the handoff procedures at both ends of the delivery
process. The receiving personnel can put parts in the intelligent locker and immediately
notify the courier or technician by page, cell phone, fax, e-mail, or other method that the
part is ready for pick up. The receiver can then retrieve parts at his or her convenience,
and the access code provides assurance that the correct person gets the part.
A single vendor can manage one-to-many intelligent lockers to provide parts to many
users. For example, Grainger or The Home Depot could intelligently control sales of
expensive, prone-to-shrink tools and accessories by placing these items in intelligent
lockers outside their stores where the ordered items can be picked up at any hour. Public
mode allows many users to place their items in intelligent lockers for access by
designated purchasers. Vendors could arrange space as required so a single courier
“milk run” could deliver parts for technicians from several companies to pick up when
convenient. This “controlled automat” use is sure to excite couriers themselves, as well
as entrepreneurs who could use the capability around the clock for delivery of airline
tickets, laptop computer drop-off and return, equipment rental and return, and many
similar activities.
Installation parts for communications networks, smart buildings, security centers, and
plant instrumentation are high potential items for intelligent lockers. These cabinets can
be mounted on a truck, train, or plane and located at the point of temporary need.
Communications can be by wired telephone or data, and wireless cell, dedicated or
pager frequencies so there are few limits on locations. Installations tend to be chaotic,
without configuration management, and with parts taken but not recorded. Intelligent
lockers can improve these and many other control and information shortages.
Physical control is one thing, but information control is as important. Many technicians
do not like to be slowed down with administration. Part numbers, usage, transfers, and
similar matters may be forgotten in the rush of helping customers. Information provided
automatically by the activities involving intelligent lockers should greatly improve parts
tracking, reordering, return validation, configuration management, repair planning,
pickup efficiency, and invoicing.
Production Losses from Equipment Malfunction
In-plant service performance is primarily directed at supporting the plant operations. Because most equipment failures in a plant represent production loss, measuring the loss that results from inaccurate or improper service is a key element of measuring the service operation. Because other parameters can also affect production loss, the true performance of the service function can be assessed only by comparing production losses caused by equipment malfunction with production losses caused by other variables, such as operator error, poor engineering, or random failures. By maintaining
long-term records of such data, companies can visualize the success of the service
department by noting the percent of the total production loss that results from
inadequate or improper service. The production loss attributable to maintenance also
represents a specific performance measure of the generic element of accuracy in
problem definition. Effective preventive maintenance (PM) is a fundamental support for
high operational availability.
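That assessment is ultimately a ratio kept over long-term records. A short sketch with illustrative loss figures:

```python
# Production losses over a reporting period, by cause (illustrative units).
losses = {
    "equipment malfunction (service-related)": 120,
    "operator error": 80,
    "poor engineering": 40,
    "random failures": 60,
}

total = sum(losses.values())
service_share = losses["equipment malfunction (service-related)"] / total
print(f"Service-attributable share of production loss: {service_share:.1%}")
```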
PM means all actions intended to keep durable equipment in good operating condition and to avoid failures. New technology has improved equipment quality, reliability, and dependability through fault tolerance, redundant components, self-adjustments, and replacement of hydraulic and mechanical components with more
reliable electronic and optical operations. However, many components can still wear
out, corrode, become punctured, vibrate excessively, become overheated by friction or
dirt, or even be damaged by humans. For these problems, a good PM program will
preclude failures, enable improved uptime, and reduce expenses.
Success is often a matter of degree. Costs in terms of money and effort to be invested
now must be evaluated against future gains. This means the time-value of money must
be considered along with business priorities for short-term versus long-term success.
Over time, the computerized maintenance management system must gather data, which
must then be analyzed to assist with accurate decisions. The proper balance between
preventive and corrective maintenance that will achieve minimal downtime and costs
can be tenuous.
PM can prevent failures from happening at inconvenient times, can sense when a failure
is about to occur and fix it before it causes damage, and can often preserve capital
investments by keeping equipment operating for years as well as the day it was
installed. Predictive maintenance is considered here to be a branch of preventive
maintenance.
Inept PM, however, can cause problems. Humans are not perfect. Whenever any
equipment is touched, it is exposed to potential damage. Parts costs increase if
components are replaced prematurely. Unless the PM function is presented positively,
customers may perceive PM activity as, “that machine is broken again.” A PM program
requires an initial investment of time, materials, people, and money. Payoff comes later.
While there is little question that a good PM program will have a high return on
investment, many people are reluctant to pay now if the return is not immediate. That
challenge is particularly pronounced in a poor economy, where companies want a fast return on their expenditures. PM is the epitome of “pay me now, or pay me later.” The
PM advantage is that you will pay less now to do planned work when production is not
pushing, versus having very expensive emergency repairs that may be required under
disruptive conditions, halting production and losing revenue. Good PM saves money
over a product’s life cycle.
In addition to economics, emotions play a prominent role in preventive maintenance. It
is a human reality that perceptions often receive more attention than do facts. A good
computerized information system is necessary to provide the facts and interpretation
that guide PM tasks and intervals. PM is a dynamic process. It must support variations
in equipment, environment, materials, personnel, production schedules, use, wear,
available time, and financial budgets. All these variables impact the how, when, where,
and who of PM.
Technology provides the tools for us to use, and management provides the direction for
their use. Both are necessary for success. These ideas are equally applicable to
equipment and facility maintenance and to field service in commerce, government,
military, and industry.
The foundation for preventive maintenance information is equipment records. All
equipment and maintenance records should be in electronic databases. The benefits
obtained from computerizing maintenance records are much greater than the relatively
small cost. There should be a current data file for every significant piece of equipment,
both fixed and movable.
The equipment database provides information for many purposes beyond PM and
includes considerations for configuration management, documentation, employee skill
requirements, energy consumption, financials, new equipment design, parts
requirements, procurement, safety, and warranty recovery. Essential data items are
shown in Table 28-1.
The data for new equipment should be entered into the computer database when the
equipment is procured. The original purchase order and shipping documents can be the
source, with other data elements added as they are fixed. It is important to remember
there are three stages of configuration:
1. As-designed
2. As-built
3. As-maintained
The as-maintained database is the major challenge to keep continually current. The
master equipment data should be updated as an intuitive and real-time element of the
maintenance system. If pieces of paper are used, they are often forgotten or damaged,
and the data may not get into the single master location on the computer. Part number
revisions are especially necessary so the correct part can be rapidly ordered if needed. A
characteristic of good information systems is that data should only need to be entered
once, and all related data fields will be automatically updated. Many maintenance
applications today are web-based so they can be accessed from anywhere a computer (or
personal digital assistant [PDA], tablet, or enabled cell phone) can connect to the
Internet.
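One convenient arrangement is a single master record per equipment item with separate as-designed, as-built, and as-maintained views, where closing a work order is the event that updates the as-maintained view. A minimal sketch; the field and tag names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class EquipmentRecord:
    tag: str
    as_designed: dict = field(default_factory=dict)
    as_built: dict = field(default_factory=dict)
    as_maintained: dict = field(default_factory=dict)

    def close_work_order(self, changes: dict) -> None:
        """Apply work-order changes so as-maintained stays current."""
        self.as_maintained.update(changes)
        self.as_maintained["last_updated"] = date.today().isoformat()

pump = EquipmentRecord("P-101",
                       as_designed={"impeller": "rev A"},
                       as_built={"impeller": "rev A"},
                       as_maintained={"impeller": "rev A"})
pump.close_work_order({"impeller": "rev B"})   # part number revision
print(pump.as_maintained)
```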
Computers are only one component of the information system capability. Electronic
tablets, mobile phones, two-way pagers, voice recognition, bar codes, and other
technologies are coming to the maintenance teams, often with wireless communications.
A relatively small investment in data-entry technology can gain immediate reporting,
faster response to discovered problems, accurate numbers gathered on the site, less
travel, knowledge of what parts are in stock to repair deficiencies, and many other
benefits.
It is important that the inspection or PM data be easily changeable. The computer
program should accomplish as much as possible automatically. Many systems record the
actual odometer reading at every fuel stop, end of shift, and other maintenance, so meter
reading can be continually brought up to date. Other equipment viewed less often can
have PM scheduled more on predicted dates. Meter information can be divided by the
number of days to continually update the use per day, which then updates the next due
date. When an inspection or PM is done and the work order closed, these data
automatically revise the date last done, which in turn revises the date next due.
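The meter-based scheduling rule just described divides accumulated usage by elapsed days and projects the next due date from the remaining interval. A compact sketch with hypothetical readings:

```python
from datetime import date, timedelta

pm_interval = 1_000                 # operating hours between PMs (assumed)
last_pm_date, last_pm_meter = date(2018, 1, 2), 12_400
today, meter_today = date(2018, 3, 3), 13_000

elapsed_days = (today - last_pm_date).days                     # 60 days
use_per_day = (meter_today - last_pm_meter) / elapsed_days     # 10.0 h/day
hours_remaining = pm_interval - (meter_today - last_pm_meter)  # 400 h
next_due = today + timedelta(days=hours_remaining / use_per_day)
print(f"Usage {use_per_day:.1f} h/day; next PM due about {next_due}")
```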
Most maintenance procedures are now online, so they are available anytime via an
electronic screen. Paperwork is kept to a minimum due to the ability for the inspector or
customer to sign to acknowledge completion and/or input comments directly using a
touch-sensitive screen. Single-point control over procedures via a central database is a
big help, especially on critical equipment. When the work order is closed out, the related
information should be entered automatically onto history records for later analysis.
Sharing the data electronically with other factories, manufacturers, and consultants can
allow “big data” to be analyzed effectively and often results in improvements that can
be shared with all users.
Safety inspections and legally required checks can be maintained in computer records
for most organizations without any need to retain paper copies. If an organization must
maintain those paper records for some legal reason, then they should be microfilmed or
kept as electronic images rather than in bulky paper form.
Humans are still more effective than computers at tasks that are complex and are not
repeated. Computers are a major aid to humans when tasks require accurate historical
information and are frequently repeated. Computer power and intelligent software
greatly enhance the ability to accurately plan, schedule, and control maintenance.
Performance Metrics and Benchmarks
The heart of any management system is establishing the objectives that must be met.
Once managers determine the objectives, then plans, budgets, and other parts of the
management process can be brought into play. Too often service management fails to
take the time to establish clear objectives and operates without a plan. Because service
may contribute a majority of a company’s revenues and profits, that can be a very
expensive mistake.
Objectives should be:
• Written
• Understandable
• Challenging
• Achievable
• Measurable
Each company must develop its own performance measures. Useful performance
measures, often referred to as benchmarks or key performance indicators (KPIs),
include the following:
Asset Measures—Equipment, Parts, and Tools
(Note that parts turnaround may be divided into the technician’s days to return the defective part and the repair time once the decision is made to repair it.)
Cost Measures
Equipment Measures
Preventive Measures
Human Measures
[The individual measure definitions under each category heading, including operational availability (AO), appear as tables in the original text.]
Example Calculation:
The most important measure for production equipment support is operational availability (AO), which we also term uptime. This is the “real world” measure of what percent of time equipment is available for production. In the following example, we evaluate an item of automation equipment for 1 year, which is 365 days × 24 hours per day = 8,760 total possible “up” hours. Our equipment gets preventive maintenance for 1 hour every month (12 hours per year) plus additional quarterly PM of another hour each quarter (4 more hours per year). There was one failure that resulted in 6 hours of downtime. Thus, total downtime for all maintenance was 12 + 4 + 6 = 22 hours, giving AO = (8,760 - 22) / 8,760 = 99.75% uptime.
Operational availability is therefore (8,760 - 22) / 8,760, or about 99.75%. That would
be considered acceptable performance in most operations, especially if the PM work can
be done at times that will not interfere with production. The maintenance challenge is
to avoid failures that adversely affect production operations.
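The arithmetic is easily scripted. Here is a minimal sketch in Python using the figures from the example above:

    # Operational availability (AO) for the example above
    total_hours = 365 * 24             # 8,760 possible "up" hours per year
    downtime = 12 + 4 + 6              # monthly PM + quarterly PM + one failure
    availability = (total_hours - downtime) / total_hours
    print(f"AO = {availability:.2%}")  # AO = 99.75%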
Automation professionals should consider life-cycle cost when designing or acquiring
an automation system. Design guidelines for supportability include:
1. Minimize the need for maintenance by:
Lifetime components
High reliability
Fault-tolerant design
Broad wear tolerances
Stable designs with clear yes/no indications
2. Access:
Openings of adequate size
Fasteners few and easy to operate
Adequate illumination
Workspace for large hands
Entry without moving heavy equipment
Frequent maintenance areas have best access
Ability to work on any FRU (field replaceable unit) without disturbing others
3. Adjustments:
Positive success indication
No interaction effects
Factory/warranty adjustments sealed
Center zero and increase clockwise
Fine adjustments with large movements
Protection against accidental movement
Automatic compensation for drift and wear
Control limits
Issued drawings show field adjustments and tolerances
Routine adjustment controls and measurement points in one service area
4. Cables:
Fabricated in removable sections
Each wire identified
Avoids pinches, sharp bends, and abrasions
Adequate clamping
Long enough to remove connected components for test
Spare wires at least 10% of total used
Wiring provisions for all accessories and proposed changes
5. Connectors:
Quick disconnect
Keyed alignment
Spare pins
Plugs cold, receptacles hot
No possible misconnection
Moisture prevention, if needed
Spacing provided for work area and to avoid shorts
Labeled; same color marks at related ends
6. Covers and panels:
Sealed against foreign objects
Independently removable with staggered hinges
Practical material finishes and colors
Moves considered—castors, handles, and rigidity
Related controls together
Withstand pushing, sitting, strapping and move stress
Easily removed and replaced
No protruding handles or knobs except on control panel
7. Consumables:
Need detected before completely expended
Automatic shutoff to avoid overflow
Toxic exposure under thresholds
8. Diagnostics:
Every fault detected and isolated
Troubleshooting cannot damage
Self-tests preferred
Go/no go indications
Isolation to field replaceable unit
Never more than two signals observed simultaneously
Condition monitoring on all major inputs and outputs
Ability for partial operation of critical assemblies
9. Environment—equipment protected from:
Hot and cold temperatures
High and low humidity
Airborne contaminants
Liquids
Corrosives
Pressure
Electrical static, surges, and transients
10. Fasteners and hardware:
Few in number
Single turn
Captive
Fit multiple common tools
Non-clog
Common metric/imperial thread
11. Lubrication:
Disassembly not required
Need detectable before damage
Sealed bearings and motors
12. Operations tasks:
Positive feedback
Related controls together
Decisions logical
Self-guiding
Checklists built-in
Fail-safe
13. Packaging:
Stacking components avoided
Ease of access guides replacement
Functional groups
Hot items high and outside near vents
Improper installation impossible
Plug-in replaceable components
14. Parts and components:
Labeled with part number and revision level
Able to replace knobs and buttons without having to replace the entire unit
Delicate parts protected
Stored on equipment if user replaceable
Standard, common, proven
Not vulnerable to excessive heat
Mean time between maintenance known
Wear-in/wear-out considered
15. Personnel involvement:
Weight for portable items 35 lb. (16 kg.) maximum
Lowest ability expected to do all tasks
Male or female
Clothing considered
Single-person tasks
16. Refurbish, rejuvenate, and rebuild:
Materials and labels resist anticipated solvents and water
Drain holes
Configuration record easy to see and understand
Aluminum avoided in cosmetic areas
17. Safety:
Interlocks
Electrical shutoff near equipment
Circuit breaker and fuses adequately and properly sized
Protection from high voltages
Corners and edges round
Protrusions eliminated
Electrical grounding or double insulation
Warning labels
Hot areas shielded and labeled
Controls not near hazards
Bleeder and current limiting resistors on power supplies
Guards on moving parts
Hot leads not exposed
Hazardous substances not emitted
Radiation given special considerations
18. Test points:
Functionally grouped
Clearly labeled
Accessible with common test equipment
Illuminated
Protected from physical damage
Close to applicable adjustment or control
Extender boards or cables
19. Tools and test equipment:
Standardized
Minimum number
Special tools built into equipment
Metric compatible
20. Transport and storage:
Integrated moving devices, if service needs to move
Captive fluids and powders
Delivery and removal methods practical
Components with short life easily removed
Ship ready to use
The preferred rules for modern maintenance are to regard safety as paramount,
emphasize predictive prevention, repair any defect or malfunction, and, if the system
works well, strive to make it work better.
Further Information
Patton, Joseph D., Jr. Maintainability & Maintenance Management. 4th ed. Research
Triangle Park, NC: ISA (International Society of Automation), 2005.
——— Preventive Maintenance. 3rd ed. Research Triangle Park, NC: ISA
(International Society of Automation), 2004.
Patton, Joseph D., Jr., and Roy J. Steele. Service Parts Handbook. 2nd ed. Research
Triangle Park, NC: ISA (International Society of Automation), 2003.
Patton, Joseph D., Jr., and William H. Bleuel. After the Sale: How to Manage Product
Service for Customer Satisfaction and Profit. New York: The Solomon Press, 2000.
Author’s note: With the Internet available to easily search for publications and
professional societies, using a search engine with keywords will be more effective than
a printed list of references. An Internet search on specific topics will be continually up
to date, whereas materials in a book can only be current as of the time of printing.
Searching with words like maintenance, preventive maintenance, reliability, uptime
(which finds better references than does the word availability), maintainability,
supportability, service management, and maintenance automation will bring forth
considerable information, from which you can select what you want.
About the Author
Joseph D. Patton, Jr. is retired. He was the founder and chairman (for more than 35
years) of Patton Consultants, Inc., which advised management on product service,
logistics, and support systems. Before founding Patton Consultants in 1976, Patton was
an officer in the Regular Army and spent 11 years with Xerox Corp. He has authored
more than 200 published articles and 8 books. He earned a BS degree from the
Pennsylvania State University and an MBA in marketing from the University of
Rochester. He is a registered Professional Engineer (PE) in Quality Engineering and a
Fellow of both ASQ, the American Society for Quality, and SOLE, The International
Society of Logistics. He is a Certified Professional Logistician (CPL), Certified Quality
Engineer (CQE), Certified Reliability Engineer (CRE), and a senior member of ISA.
29
Troubleshooting Techniques
By William L. Mostia, Jr.
Introduction
Troubleshooting can be defined as the method used to determine why something is not
working properly or is not providing an expected result. Troubleshooting methods can
be applied to physical as well as nonphysical problems. As with many practical skills, it
is an art but it also has an analytical or scientific basis. As such, basic troubleshooting is
a trainable skill, while advanced troubleshooting is based on experience, developed
skills, information, and a bit of art. While the discussion here centers on troubleshooting
instrumentation and control systems, the basic principles apply to broader classes of
problems.
Troubleshooting normally begins with identifying that a problem exists and needs to be
solved. The first steps typically involve applying a logical/analytical framework.
Logical/Analytical Troubleshooting Framework
A framework is an underlying structure. Logical frameworks provide the basis for structured
methods to troubleshoot problems. However, following a step-by-step method without
first thinking through the problem is often ineffective; we also need to couple logical
procedures with analytical thinking. To analyze information and determine how to
proceed, we must combine logical deduction and induction with a knowledge of the
system, then sort through the information we have gathered regarding the problem.
Often, a logical/analytical framework does not produce the solution to a troubleshooting
problem in just one pass. We usually have several iterations, which cause us to return to
a previous step in the framework and go forward again. Thus, we can systematically
eliminate possible solutions to our problem until we find the true solution.
Specific Troubleshooting Frameworks
Specific troubleshooting frameworks have been developed that apply to a particular
instrument, class of instruments, system, or problem domain. For example, frameworks
might be developed for a particular brand of analyzer, for all types of transmitters, for
pressure control systems, or for grounding problems. When these match up with your
system, you have a distinct starting point for troubleshooting.
Figure 29-1 is an example of a specific troubleshooting framework (also called a
flowchart or tree) for transmitters.
Generic Logical/Analytical Frameworks
Since we do not always have a specific structured framework available, we need a more
general or generic framework that will apply to a broad class of problems. Figure 29-2
depicts this type of framework as a flowchart.
The framework shown in Figure 29-2, while efficient, leaves out some important
safety-related tasks and company procedural requirements associated with the
troubleshooting process. Because troubleshooting actions often involve online systems,
energized systems, and moving parts, they increase the safety risk to the
troubleshooter, so a few important points should be made here:
• Always make sure that what you are doing is safe for you, your fellow workers,
and your facility.
• Follow company procedures.
• Get the proper permits and follow their requirements.
• Always communicate your actions to the operator in charge and other involved
people.
• Never insert any part of your body into a location where there is the potential for
hazards.
• Never take unnecessary risks. The life you save may be your own!
The Seven-Step Troubleshooting Procedure
The following is a description of the seven-step procedure illustrated in Figure 29-2,
which provides a generic, structured approach to troubleshooting.
Step 1: Define the Problem
You cannot solve a problem if you do not know what the problem is. The problem
definition is the starting point. Get it wrong, and you will stray off the path to the
solution to the problem. The key to defining the problem is communication.
When defining the problem, listen carefully and allow the person(s) reporting the
problem to provide a complete report of the problem as they see it. The art of listening is
a key element of troubleshooting. After listening carefully, ask clear and concise
questions. All information has a subjective aspect to it. When trying to identify a
problem, you must strip away the subjective elements and get to the meat of the
situation.
Avoid high-level technical terms or “technobabble.” A troubleshooter must be able to
speak the “language” of the person reporting the problem. This means understanding the
process, the plant physical layout, instrument locations, process functions as they are
known in the plant, and the “dialect” of abbreviations, slang, and technical words
commonly used in the plant. Some of this is generic to process plants in general, while
some is specific to the plant in question.
Step 2: Collect Additional Information Regarding the Problem
Once a problem has been defined, collect additional information. This step may overlap
Step 1 and, for simple problems, these two steps may even be the same. For complex or
sophisticated problems, however, collecting information is typically a more distinct
stage.
Develop a strategy or plan of action for collecting information. This plan should include
determining where in the system you will begin to collect information, what sources will
be used, and how the information will be organized. Information gathering typically
moves from general to specific, though there may be several iterations of this. In other
words, continue working to narrow down the problem domain.
Symptoms
The information gathered typically consists of symptoms: what is wrong with the
system, as well as what is working properly. Primary symptoms are directly related to
the cause of the problem at hand. Secondary symptoms are downstream effects; that is,
they do not result directly from what is causing the problem. Differentiating between
primary and secondary symptoms can be the key to localizing the cause of the problem.
Interviews and Collecting Information
A large part of information gathering will typically be in the form of interviews with the
person(s) who reported the problem and with any other people who may have relevant
information. People skills are important here: good communication skills, the use of
tact, and nonjudgmental and objective approaches can be key to getting useful
information.
Then, review the instrument or system’s performance from the control system’s
faceplates, trend recorders, summaries, operating logs, alarm logs, recorders, and system
self-diagnostics. System drawings and documentation can also provide additional
information.
Inspection
Next, inspect the instrument(s) that are suspected of being faulty or other local
instruments (such as pressure gauges, temperature gauges, sight glasses, and local
indicators) to see if there are any other indications that might shed light on the matter.
History
Historical records can provide useful information. The facility’s maintenance
management system (MMS) may contain information regarding the failed system or
ones like it. Also, check with others who have worked on the instrument or system.
Beyond the Obvious
If there are no obvious answers, testing may be in order. Plan the testing to ensure it is
done safely and it obtains the information needed with a minimum of intrusion. When
testing by manipulating the system, plan to test or manipulate only one variable at a
time. Altering more than one variable might solve the problem, but you will be unable
to identify what fixed the problem. Always make the minimum manipulation necessary
to obtain the desired information. This minimizes the potential upset to the process.
Step 3: Analyze the Information
Once the information is collected, you must analyze it to see if there is enough data to
propose a solution. Begin by organizing the collected information. Then, analyze the
problem by reviewing what you already know and the new information you have
gathered, connecting causes and effects, exploring causal chains, applying IF-THEN and
IF-THEN NOT logic, applying the process of elimination, and applying other relevant
analytical or logical methods.
Case-Based Reasoning
Probably the first analytical technique that you will use is past experience. If you have
seen this situation or case before, then you know a possible solution. Note that we say,
“a possible solution” because similar symptoms sometimes have different causes and,
hence, different solutions.
“Similar To” Analysis
Compare the system you are working on to similar systems you have worked on in the
past. For example, a pressure transmitter, a differential pressure transmitter, and a
differential pressure level transmitter are similar instruments. Different programmable
logic controller (PLC) brands often have considerable similarities. RS-485
communication links are similar, even on very different source and destination
instruments. Similar instruments and systems operate on the same basic principles and
have potentially similar problems and solutions.
“What, Where, When” Analysis
This type of analysis resembles the Twenty Questions game. Ask questions about what
the gathered information may tell you. These are questions such as:
• What is working?
• What is not working? Has it ever worked?
• What is a cause of an effect (symptom) and what is not?
• What has changed?
• What has not changed?
• Where does the problem occur?
• Where does it not occur?
• When did the problem occur?
• When did it not occur?
• What has been done to troubleshoot so far?
• What do I not know?
• What physical properties or principles must be true?
Patterns
Symptoms can sometimes be complex, and they can be distributed over time. Looking
for patterns in symptom actions, in lack of actions, or in time of occurrence can
sometimes help in symptom analysis.
Basic Principles
Apply basic scientific principles to analyze the data: electrical current can only flow
in certain ways, Ohm’s and Kirchhoff’s laws always hold, and mass and energy always
balance. Likewise, apply the physical and material properties relevant to the process,
such as state of matter, boiling point, and corrosive properties.
The Manual
When in doubt, read the manual! It may have information on circuits, system analysis,
or troubleshooting that can lead to a solution. It may also provide voltage, current, or
indicator readings; test points; and analytical procedures. Manuals often have
troubleshooting tables or charts to assist you.
Logical Methods
You will need a logical approach to make this iterative procedure successful. Several
approaches, such as the linear approach and the divide-and-conquer method, may apply.
The linear or walk-through approach is a step-by-step process (illustrated in Figure
29-3) that you follow to test points throughout a system. The first step is to decide on an
entry point. If the entry point tests good, then test the next point downstream in a linear
signal path. If this point tests good, then you choose the next point downstream of the
previous test point, and so on. Conversely, if the entry point is found to be bad, choose
the next entry point upstream and begin the process again. As you move downstream or
upstream, each step narrows down the possibilities. Any branches must be tested at the
first likely point downstream on the branch.
The divide-and-conquer method is a general approach (illustrated in Figure 29-4). You
choose a likely point or, commonly, the midpoint of the system, and test it. If it tests
bad, then the upstream section of the system contains the faulty part; if it tests good, the
downstream section contains the faulty part. Divide the section of the system (upstream
or downstream) that contains the faulty part into two parts and test the section at the
dividing point. Determine whether the faulty part is upstream or downstream of the test
point and continue dividing the sections and testing until the cause of the problem has
been found.
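Viewed as code, divide-and-conquer is a binary search over an ordered signal path. The following minimal Python sketch assumes a hypothetical tests_good() field check and an illustrative loop; it is an idea sketch, not a field procedure:

    def find_fault(points, tests_good):
        # points are ordered upstream to downstream; a good test means the
        # fault lies downstream, a bad test means the fault is at or
        # upstream of the test point
        lo, hi = 0, len(points) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if tests_good(points[mid]):
                lo = mid + 1
            else:
                hi = mid
        return points[lo]

    loop = ["transmitter", "junction box", "marshalling cabinet",
            "I/O card", "controller"]
    print(find_fault(loop, lambda p: p in ("transmitter", "junction box")))
    # -> "marshalling cabinet"

Each test halves the remaining section, so even a long loop is localized in a handful of measurements.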
Step 4: Determine Sufficiency of Information
When gathering information, how do you know that you have enough? Can you
determine a cause and propose a solution to solve the problem? If the answer is yes, this
is a decision point for moving on to the next step of proposing a solution. If the answer
is no, return to Step 2, “Collect Additional Information Regarding the Problem.”
Step 5: Propose a Solution
When you believe you have determined the cause of the problem, propose a solution. In
fact, you may propose several solutions based on your analysis. The proposed solution
will usually be to remove and replace (or repair) a defective part. In some cases,
however, the proposal may not offer complete certainty of solving the problem and will
have to be tested. If there are several possible solutions, propose them in the order of
their probability of success. If the probability is roughly equal, or other operational
limitations come into play, you can use other criteria to determine the order of the
solutions. You might propose solutions in the order of the easiest to the most difficult. In
cases where there may be cost penalties (labor costs, cost of consumable parts, lost
production, etc.) associated with trying various solutions, you may propose trying the
least costly viable option.
Do not try several solutions at once. This is called the shotgun approach and it will
typically only confuse the issue. Management will sometimes push for a shotgun
approach due to time or operational constraints, but you should resist it. With a little
analytical work, you may be able to solve the problem and meet management
constraints at a lower cost. With the shotgun approach, you may find that you do not
know what fixed the problem and it will be more costly both immediately and in the
long term; if you do not know what fixed the problem, you may be doomed to repeat it.
Also, do not rush to a compromise solution proposed by a “committee.” Consider the
well-known “Trip to Abilene” story, illustrating the “group think” concept that is the
opposite of synergy. In the story, some people are considering going to Abilene, though
none of them really wants to go. They end up in Abilene, however, because everyone in
the group thinks that everyone else wants to go to Abilene. This sometimes occurs in
troubleshooting when a committee gets together to “assist” the troubleshooter and the
committee gets sidetracked by a trip to Abilene.
Step 6: Test the Proposed Solution
Once a solution, or a combination of solutions, has been proposed, it must be tested to
determine if the problem analysis is correct.
Specific versus General Solutions
Be careful of specific solutions to more general problems. At this step, you must
determine if the solution needed is more general than the specific one proposed. In most
cases, a specific solution will be repairing or replacing the defective instrument, but that
may not solve the problem. What if replacing an instrument only results in the new
instrument becoming defective? Suppose an instrument with very long signal lines
sustains damage from lightning transients. The specific solution would be replacing the
instrument; the general solution might be to install transient protection on the instrument
as well.
The Iterative Process
If the proposed and tested solution is not the correct one, then return to Step 3, “Analyze
the Information.” Where might you have gone astray? If you find the mistake, then
propose another solution. If you cannot identify the mistake, return to Step 2, “Collect
Additional Information Regarding the Problem.” It is time to gather more information
that will lead you to the real solution.
Step 7: The Repair
In the repair step, implement the proposed solution. In some cases, testing a solution
results in a repair, as in replacing a transmitter, which both tests the solution and repairs
the problem. Even in this case, there will generally be additional work to be done to
complete the repair, such as tagging the repaired/new equipment, updating the database,
and updating maintenance records. Document the repair so future troubleshooting is
made easier. If the system is a safety instrumented system (SIS) or an independent
protection layer (IPL), additional information will have to be documented, such as the
as-found and as-left condition of the repair, the mode of failure (e.g., safe or dangerous),
and other information pertinent to a safety/critical system failure.
Vendor Assistance: Advantages and Pitfalls
Sometimes it is necessary to involve vendors or manufacturers in troubleshooting, either
directly, by the Internet, or by phone. A field service person, or the in-house service
personnel, can be helpful (and can provide a learning experience), but some are quick to
blame other parts of the system (not their own) when they cannot find anything wrong—
in some cases before they have even checked out their own system. When trying to
solve a problem, do not let vendors or manufacturers off the hook just because they say
it is not their equipment. Ask questions and make them justify their position. Be very
careful to follow your company’s cybersecurity procedures before you allow a vendor or
other people remote access to your systems via the Internet or by phone modem. While
remote access can sometimes be beneficial in troubleshooting complex problems, the
cybersecurity risk generally outweighs the benefit of giving vendors or manufacturers
remote access to your systems. You should also be careful that vendor
field maintenance personnel do not gain computer access to your systems. Note that any
system that is password protected should not use the default password. Better to be safe
than sorry regarding cybersecurity threats.
Other Troubleshooting Methods
There are other troubleshooting methods that complement the logical/analytical
framework. Some of these are substitution, fault insertion, remove-and-conquer, circle-the-wagons, complex-to-simple, trapping, consultation, intuition, and out-of-the-box
thinking.
The Substitution Method
This method substitutes a known good component for a suspected bad component. For
modularized systems or those with easily replaceable components, substitution may
reveal the component that is the cause of the problem. First, define the problem, gather
information, and analyze the information just as you do in the generic troubleshooting
framework. Then, select a likely replacement candidate and substitute a known good
component for it. By substituting components until the problem is found, the
substitution method may find problems even where there is no likely candidate or only a
vague area of suspicion. One potential problem with substitution is that a higher-level
cause can damage the replacement component as soon as you install it, or a transient
(such as lightning) may have caused the failure and your fix may only be temporary
(e.g., until the next lightning strike). Using this method can raise the overall
maintenance cost due to the cost of extra modules.
The Fault Insertion Method
Sometimes you can insert a fault instead of a known good signal or value to determine
how the system responds. For example, when a software communication interface is
periodically locking up, you may suspect that the interface is not responding to an
input/output (I/O) timeout properly. You can test this by inserting a fault—an I/O
timeout.
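As a sketch of the idea, the following Python fragment (the FaultyIO wrapper and the two-second stall are illustrative assumptions) injects an I/O timeout so you can observe whether the interface recovers or locks up:

    import time

    class FaultyIO:
        # Wraps a working I/O channel but injects a timeout on demand
        def __init__(self, real_io, inject_on_call=3):
            self.real_io = real_io
            self.calls = 0
            self.inject_on_call = inject_on_call

        def poll(self):
            self.calls += 1
            if self.calls == self.inject_on_call:
                time.sleep(2.0)  # simulate the stalled response
                raise TimeoutError("injected I/O timeout")
            return self.real_io.poll()

If the interface handles the injected TimeoutError gracefully, the lockup lies elsewhere; if it hangs, its timeout handling is the likely suspect.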
The Remove-and-Conquer Method
For loosely coupled systems that have multiple independent devices, removing the
devices one at a time may help you find certain types of problems. For example, if a
communication link with 10 independent devices talking to a computer is not
communicating properly, you might remove the devices one at a time until the defective
device is found. Once the defective device has been detected and repaired, the removed
devices should be reinstalled one at a time to see if any other problems occur.
The remove-and-conquer technique is particularly useful when a communication system
has been put together incorrectly or it exceeds system design specifications. For
example, there might be too many devices on a communication link, cables that are too
long, cable mismatches, wrong cable installation, impedance mismatches, or too many
repeaters. In these situations, sections of the communication system can be disconnected
to determine what is causing the problem.
A similar technique called add-back-and-conquer works in the reverse. You remove all
the devices and add them back one by one until you find the cause of the problem.
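In code form, remove-and-conquer is a linear elimination. A minimal Python sketch of one interpretation, in which each device is tested and replaced in turn (link_ok is a hypothetical stand-in for an actual link health check):

    def remove_and_conquer(devices, link_ok):
        # Remove devices one at a time; the removal that restores the
        # link identifies the likely culprit
        for dev in list(devices):
            devices.remove(dev)
            if link_ok(devices):
                return dev       # culprit found; reinstall the rest one at a time
            devices.append(dev)  # not the culprit; put it back
        return None              # no single device is at fault

Note the put-it-back step: only one device is changed at a time, consistent with the one-variable-at-a-time testing advice earlier in this chapter.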
The Circle-the-Wagons Method
When you believe the cause of a problem is external to the device or system, try the
circle-the-wagons technique. Draw an imaginary circle or boundary around the device
or system. Determine what interfaces (such as signals, power, grounding,
environmental, and electromagnetic interference [EMI]) cross the circle, and then isolate
and test each boundary crossing. Often this is just a mental exercise that helps you think
about external influences, which then leads to a solution. Figures 29-5 and 29-6
illustrate this concept.
A Trapping We Shall Go
Sometimes an event that triggers or causes the problem is not alarmed, it is a transient,
or it happens so fast the system cannot catch it. This is somewhat like having a mouse in
your house. You generally cannot see it, but you can see what it has done.
How do you catch the mouse? You set a trap. In sophisticated systems, you may have
the ability to set additional alarms or identify trends to help track down the cause of the
problem. For less sophisticated systems, you may have to use external test equipment or
build a trap. If software is involved, you may have to build software traps that involve
additional logic or code to detect the transient or bug.
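A software trap can be as simple as a watchdog that records the circumstances the instant the suspect condition occurs. A minimal Python sketch (the signal source, limits, and polling period are illustrative assumptions):

    import logging, time

    logging.basicConfig(filename="trap.log", level=logging.INFO)

    def trap(read_signal, low, high, period=0.1, duration=3600):
        # Poll the signal and log a timestamped record whenever it strays
        # out of bounds, catching transients too brief to raise an alarm
        end = time.time() + duration
        while time.time() < end:
            value = read_signal()
            if not low <= value <= high:
                logging.info("trap sprung: value=%.3f", value)
            time.sleep(period)

Like the mousetrap, the trap does the waiting for you and preserves the evidence for later analysis.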
The Complex-to-Simple Method
Many control loops and systems may have different levels of operation or complexity
with varying levels of sophistication. One troubleshooting method is to break systems
down from complex to simple. This involves identifying the simple parts that function
to make the whole. Once you identify the simplest nonfunctioning “part,” you can
evaluate that part or, if necessary, you can start at a simple known good part and
“rebuild” the system until you find the problem.
Consultation
Consultation, also known as the third head technique, means you use a third person,
perhaps someone from another department or an outside consultant, with advanced
knowledge about the system or the principles for troubleshooting the problem. This
person may not solve the problem, but they may ask questions that make the cause
apparent or that spark fresh ideas. This process allows you to stand back during the
discussions, which sometimes can help you distinguish the trees from the forest. The
key is to know when you have reached the limitations of your investigation and need
some additional help or insight.
Intuition
Intuition can be a powerful tool. What many people would call “troubleshooting
intuition” certainly improves with experience. During troubleshooting or problem
solving, threads of thought in your consciousness or subconsciousness may develop, one
of which may lead you to the solution. The more experience you have, the more threads
will develop during the troubleshooting process. Can you cultivate intuition? Experience
suggests that you can, but success varies from person to person and from technique to
technique. Find what works for you.
Out-of-the-Box Thinking
Difficult problems may require approaches beyond normal or traditional troubleshooting
methods. The term out-of-the-box thinking was a buzzword for organizational
consultants during the 1990s. Out-of-the-box thinking means approaching a problem
from a new perspective, not being limited to the usual ways of thinking about it. The
problem in using this approach is that our troubleshooting “perspective” is generally
developed along pragmatic lines; that is, it is based on what has worked before, and
changing it can sometimes be difficult.
How can you practice out-of-the-box thinking? How can you shift your perspective to
find another way to solve the problem? Here are some questions that may help:
• Is there some other way to look at the problem?
• Can the problem be divided up in a different way?
• Can different principles be used to analyze the problem?
• Can analyzing what works rather than what does not work help to solve the
problem?
• Can a different starting point be used to analyze the problem?
• Are you looking at too small a piece of the puzzle? Too big?
• Could any of the information on which the analysis is based be in error,
misinterpreted, or looked at in a different way?
• Can the problem be conceptualized differently?
• Is there another “box” that has similarities that might provide a different
perspective?
Summary
While troubleshooting is an art, it is also based on scientific principles, and it is a skill
that can be taught and developed with quality experience. An organized, logical
approach to troubleshooting is necessary to be successful and can be provided by
following a logical framework, supplemented by generalized techniques such as
substitution, remove-and-conquer, circle-the-wagons, and out-of-the-box thinking.
Further Information
Goettsche, L.D., ed. Maintenance of Instruments & Systems. 2nd ed. Practical Guides
for Measurement and Control series. Research Triangle Park, NC: ISA
(International Society of Automation), 2005.
Mostia, William L., Jr., PE. “The Art of Troubleshooting.” Control IX, no. 2 (February
1996): 65–69.
——— Troubleshooting: A Technician’s Guide. Research Triangle Park, NC: ISA
(International Society of Automation), 2006.
About the Author
William (Bill) L. Mostia, PE, has more than 35 years of experience in instrumentation,
controls, safety, and electrical areas. He is the principal engineer at WLM Engineering
Co., a consulting firm in the instrument, electrical, and safety areas. Mostia has worked for
Amoco, Texas Eastman, Dow Chemical Co., SIS-TECH Solutions, and exida; and he
has been an independent consultant in instrument, electrical, and safety areas. He
graduated from Texas A&M University with a BS in electrical engineering. He is a
professional engineer registered in the state of Texas, a certified Functional Safety (FS)
Engineer by TUV Rheinland, and an ISA Fellow. Mostia is an active member of ISA,
and he serves on the ISA84 and ISA91 standards committees, as well as various ISA12
standards committees. He has published more than 75 articles and papers, and a book on
troubleshooting; he has also been a contributor to several books on instrumentation.
30
Asset Management
By Herman Storey and Ian Verhappen, PE, CAP
Asset management, broadly defined, refers to any system that monitors and maintains
things of value to an entity or group. It may apply to both tangible assets (something
you can touch) and to intangible assets, such as human capital, intellectual property,
goodwill, and/or financial assets. Asset management is a systematic process of cost-effectively developing, operating, maintaining, upgrading, and disposing of assets.
Asset management systems are the processing and enabling information systems that
support management of an organization’s assets, both physical (tangible) assets and
nonphysical (intangible) assets.
Due to the number of elements involved, asset management is data- and information-intensive.
What you expect the asset to do is known as the function of the asset. An important part
of asset management is preserving the asset’s ability to perform its function as long as
required. Maintenance is how an asset’s function is preserved.
Maintenance usually costs money, as it consumes time and effort. Not doing
maintenance has consequences. Failure of some assets to function can be expensive,
harm both people and the environment, and stop the business from running, while
failure of other assets may be less serious. As a result, one of the important first steps in
any asset management program is understanding the importance of your assets to your
operations, as well as the likelihood and consequence of their failure, so you can take
more effective mitigating actions to preserve their functions. This exercise is commonly
referred to as criticality ranking. Knowing the criticality of your assets makes it easier
to determine the appropriate techniques or strategies to manage those assets.
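One common form of criticality ranking scores each asset’s likelihood and consequence of failure and multiplies them. A minimal Python sketch (the 1-to-5 scales and the example assets are illustrative, not from this chapter):

    # Criticality = likelihood of failure x consequence of failure
    assets = {
        "boiler feed pump":  (3, 5),   # (likelihood, consequence), each 1-5
        "cooling tower fan": (2, 3),
        "office HVAC unit":  (4, 1),
    }

    ranked = sorted(assets, key=lambda a: assets[a][0] * assets[a][1],
                    reverse=True)
    for name in ranked:
        likelihood, consequence = assets[name]
        print(f"{name}: criticality {likelihood * consequence}")
    # Highest-ranked assets warrant the most rigorous management strategy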
Managing all this data requires using computer-based systems and increasingly
integrating information technology (IT) and operational technology (OT) systems.
Industry analyst Gartner confirms this increased integration, predicting that OT will be
used to intelligently feed predictive data into enterprise asset management (EAM) IT
systems in the near future. This alerts the asset manager to potential failures, allowing
effective intervention before the asset fails.
This integration promises significant improvement in asset performance and availability
to operate in the near future.
Several organizations already offer asset performance management systems that straddle
IT and OT, thus providing more sophistication in how these systems can be used to
manage assets.
Additional guidance on how to implement asset management systems is available
through recently released, and currently under development, international standards. The
International Organization for Standardization (ISO) has developed a series of documents similar
in nature to the ISO 9000 series on quality and ISO 14000 series on environmental
stewardship. The ISO 55000 series of three separate voluntary asset management
standards was officially released 15 January 2014.
1. ISO 55000, Asset Management – Overview, Principles, and Terminology
2. ISO 55001, Asset Management – Management Systems – Requirements
3. ISO 55002, Asset Management – Management Systems – Guidelines for the
Application of ISO 55001
Like ISO 9000 and ISO 14000, ISO 55000 provides a generic conceptual framework—it
can be applied in any industry or context.
The requirements of the ISO 55000 standards are straightforward.
An organization (such as a company, plant, mine, or school board) has a portfolio of
assets. Those assets are intended (somehow) to deliver on part of the organization’s
objectives. The asset management system creates the link from corporate policies and
objectives, through a number of interacting elements to establish policy (i.e., rules),
asset management objectives, and processes through which to achieve them. Asset
management itself is the activity of executing on that set of processes to realize value
(as the organization defines it) from those assets.
Policies lead to objectives, which require a series of activities in order to achieve them.
Like the objectives, the resulting plans must be aligned and consistent with the rest of
the asset management system, including the various activities, resources, and financing.
Similarly, risk management for assets must be considered in the organization’s overall
risk management approach and contingency planning.
Asset management does not exist in a vacuum. Cooperation and collaboration with other
functional areas will be required to effectively manage and execute the asset
management system. Resources are needed to establish, implement, maintain, and
continually improve the asset management system itself, and collaboration outside of
the asset management organization or functional area will be required to answer
questions such as:
• What is the best maintenance program?
• What is the ideal age at which to replace the assets?
• What should we replace them with?
• How can we best utilize new technologies?
• What risks are created when the asset fails?
• What can we do to identify, manage, and mitigate those risks?
These challenges and questions must be answered; a high price will be paid if they are ignored.
Good asset management, as described in industry standards, gives us a framework to
answer the above questions and deal with the associated challenges.
NAMUR NE 129 provides a view of maintenance processes that is complementary to
ISO 55000 and the concept of intelligent device management (IDM). NAMUR NE 129
(plant asset management) states that asset management tasks encompass all phases of
the plant life cycle, ranging from planning, engineering, procurement, and construction
to dismantling the plant.
The International Standard ISO 14224 provides a comprehensive basis for the collection
of reliability and maintenance (RM) data in a standard format for equipment in all
facilities and operations within the petroleum, natural gas, and petrochemical industries
during the operational life cycle of equipment.
Asset Management and Intelligent Devices
The International Society of Automation (ISA) and the International Electrotechnical
Commission (IEC) are collaborating on a proposed standard, currently in draft-review
stages, to be named IEC 63082/ISA-108, Intelligent Device Management, to help
facilitate integrating the OT and IT realms by effectively using the information available
from the sensors and controllers in facilities.
As instruments evolve to become truly intelligent, they transmit more data digitally,
delivering more benefits to users along with the potential for simpler deployment and
operation. The proliferation of intelligent devices has resulted in an increase in volume
and complexity of data requiring standards for identifying errors, diagnostic codes, and
critical configuration parameters.
A successful measurement is built on the proper type of instrument technology correctly
installed in the right application. Aside from input from its direct sensors, a non-intelligent device cannot perceive any other process information. Intelligent devices
have diagnostics that can detect faults in the installation or problems with the
application, each of which could compromise measurement quality and/or reliability.
Intelligent devices also have the ability to communicate this information over a network
to other intelligent devices. Smart instruments can also respond to inquiries or push
condition information to the balance of the automation system.
Many of the same technologies used in enterprise business systems, such as Ethernet
and open OT systems, have been adapted to automation platforms. As a result, many of
the same security and safety concerns found in these systems must be addressed before
connecting critical process devices to non-process control networks.
Combining all these advances will result in better integration of intelligent devices into
the automation and information systems of the future. This will make it practical for
users to realize all the advantages that intelligent devices offer: better process control,
higher efficiency, lower energy use, reduced downtime, and higher-quality products.
Established maintenance practices, such as run to failure, time-based inspection and
testing, and unplanned demand maintenance associated with traditional devices, were
sufficiently effective, given the limitations of the device’s technology. In general,
traditional maintenance practices were applied in the following contexts:
• Run to failure – Used to manage failure modes that were sudden, hidden, or
deemed to be of low impact, such that their occurrence would not affect reliable
process operations
• Time-based inspection and testing – Used to manage failure modes that were
both gradual and predictable, such as loss of mechanical integrity of devices
• Demand maintenance – Used to manage failure modes that were either sudden
or gradual but unpredictable
Most intelligent devices contain configuration data and diagnostics that can be used to
optimize maintenance practices. In many cases, the promise of intelligent devices in the
facility remains unrealized. This is not so much a technology issue as a lack of
management understanding of IDM value, as well as insufficient skilled personnel and
work processes. This lack of understanding results in less than optimum risk
management and unrealized benefits of device intelligence.
With the implementation of intelligent devices, IDM, and diagnostic maintenance,
significant benefits can be realized. The benefits come from:
• The ability of intelligent devices to detect and compensate for environmental
conditions, thus improving accuracy
• The ability to detect and compensate for faults that might not be detected by
traditional inspection and testing
Figure 30-1 provides a conceptual illustration of the enhanced diagnostic capability.
Associated new IDM-based work processes also provide opportunities to improve:
• Data management through having access to all the information related to not only
the process but also the integrity of the resulting signals, the device configuration
and settings, as well as each stage of the signal processing steps
• Worker knowledge and knowledge management by capturing all changes made to
the individual devices and resulting analysis, which help develop best practices
and optimize work processes
• Maintenance work processes, through better understanding of the root cause of
a device’s deterioration and the ability to focus activities on the source of the
deterioration in signal integrity
• Understanding of the impact of faults on the process under control
With the implementation of diagnostic messaging, routine scheduled testing is
unnecessary for intelligent devices and inspection procedures can be simplified. Tools
and processes that use the device’s built-in diagnostics can improve testing and
inspection, thus improving overall facility risk management.
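As an illustration of how diagnostic messaging can displace routine testing, the following Python sketch (the diagnostic codes, device tag, and CMMS interface are hypothetical) routes a device diagnostic directly into a maintenance work request:

    class SimpleCMMS:
        # Stand-in for a computerized maintenance management system
        def __init__(self):
            self.work_requests = []

        def create_work_request(self, tag, fault, priority):
            self.work_requests.append(
                {"tag": tag, "fault": fault, "priority": priority})

    def on_diagnostic(tag, code, cmms):
        # The device reports the fault itself; no scheduled test is
        # needed to find it
        priority = {"SENSOR_DRIFT": "low",
                    "ELECTRONICS_FAULT": "high"}.get(code, "medium")
        cmms.create_work_request(tag, code, priority)

    cmms = SimpleCMMS()
    on_diagnostic("FT-204", "SENSOR_DRIFT", cmms)
    print(cmms.work_requests)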
As indicated in the text above, asset management at the enterprise level encompasses all of the enterprise’s assets.
Figure 30-2 represents the relationship between IDM and intelligent devices. In the
context of asset management in a unified modeling language (UML) class diagram, an
intelligent device is one form of asset the IDM manages following the principles of asset
management.
Figure 30-3 shows how an IDM program translates objectives provided by higher levels
of the organization, such as business management processes, into requirements of
technology and resources and then passes them to IDM work processes. Similarly, the
IDM program translates performance history recorded by work processes into key
performance indicators (KPIs) and passes them to higher levels of the organization.
Implementation methods for enterprise business management are nontechnical and best
left to operating companies. Many enterprise business management stakeholders can
have an impact on the success of IDM. Most enterprise business management processes
are created for larger purposes and are not structured to directly manage IDM. A
technical and business focus for IDM, as well as a mid-level management structure, can
provide valuable coordination between the IDM work processes and higher-level
management.
IDM programs are implemented and supported via IDM work processes. The IDM
program requires a well-defined functional structure, but the structure can be tailored to
the enterprise’s needs.
One of the main driving forces for program-level activity is the fast rate of change in
software used in intelligent devices. Software drives much faster change in devices than
the natural change in technology or application requirements for those devices.
Program-level activities can improve the efficiency of managing those changes by
managing the change once per device type (i.e., a specific model from a particular
manufacturer) for an enterprise rather than managing the change in each application at
each facility.
The supporting functions for IDM can be expressed by the various applications,
subsystems, and intelligent devices that can comprise the IDM. Intelligent devices are
the fundamental building block of IDM. The diagnostics and other information available
from intelligent devices can be integrated with these associated functions in the facility
to implement IDM work processes and advanced maintenance processes. A true IDM
does not need to incorporate all the elements included here, but the intelligent devices
are a necessary element of the system. Figure 30-4 shows an overview of a
representative sample of the supporting functions for IDM.
IDM supporting functions range from real time or semi-real time to planned activities.
Information from intelligent devices is provided in real time or close to real time,
depending on the types of devices used. This real-time information can then be used to
generate an alarm for the operator in real time or for operator human-machine interface
(HMI) applications. In the transactional world, intelligent device information is
provided to the Computerized Maintenance Management System (CMMS).
Primary functional elements of the IDM physical architecture include:
• Intelligent devices
• Field maintenance tools
• Control hardware and software
• Simulation and optimization tools
• Historian
• Asset management tools
• Reporting and analysis tools
• Alarm management tools
• Operator HMI
• Engineering and configuration tools
• CMMS
• Design tools
• Planning and scheduling tools
Field maintenance tools include portable tools and field workstations, which fall into
three general types:
• Laptop workstations
• Hand-held tools
• Remote wireless clients
Asset management tools have functions, such as the collection and analysis of
information from intelligent devices, with the purpose of enacting various levels of
preventive, predictive, and proactive maintenance processes. Asset management tools
give maintenance people a real-time view into what is happening with their field device
assets, but they can also connect to other intelligent assets in the facility, including
control hardware and software, intelligent drives and motor controls, rotating machinery,
and even electrical assets. Asset management tools can be used to manage overall
facility maintenance processes and can use information collected from other functional
domains, such as device management tools.
Traditional asset management practices apply to devices and systems used for
automation. However, the widespread use of intelligent devices brings requirements for
new techniques and standards for managing this new class of devices.
Further Information
Many standards efforts are being undertaken in the process industries that revolve
around plant assets and devices, but none of these efforts addresses IDM for
maintenance and operations.
The ISO PC 251 effort is part of the ISO 55000 asset management standard. ISO 55000
primarily focuses on the inspection and test side of asset management. ISO 55000 does
not address the requirements of IDM.
The Institute of Asset Management (IAM) is a UK-based (physical) asset management
association. In 2004, IAM, through the British Standards Institution (BSI), published
Publicly Available Specifications (PAS) for asset management. These include PAS
55-1:2008, Part 1: Specification for the Optimized Management of Physical Assets, and
PAS 55-2:2008, Part 2: Guidelines for the Application of PAS 55-1. However, the PAS
standards’ primary focus is not on instrumentation and again leans heavily toward the
inspect and test side of the business. These standards tend to be heavily focused on
physical assets, not the devices that control them.
IEC TR 61804-6:2012 Function Blocks (FB) for Process Control – Electronic Device
Description Language (EDDL) – Part 6: Meeting the Requirements for Integrating
Fieldbus Devices in Engineering Tools for Field Devices defines the plant asset
management system as a system to achieve processing equipment optimization,
machinery health monitoring, and device management.
NAMUR NE 129 recommendation, Requirements of Online Plant Asset Management
Systems, does a very good job at outlining the purpose of plant asset management
systems and their place in the world of plant asset health.
About the Authors
Herman Storey is an independent automation consultant and the chief technology
officer for Herman Storey Consulting, LLC. He has been active for many years in
standards development with ISA and other organizations, including the FieldComm
Group. Storey is co-chair of the ISA100 Wireless Systems for Automation and ISA108
Intelligent Device Management committees.
After a 42-year career, Storey retired from Shell Global Solutions in 2009. At that time,
he was a senior consultant in the Process Automation, Control, and Optimization group.
His role was the principal technology expert for Instrumented Systems Architecture and
the subject-matter expert for distributed control systems technology. Storey graduated
from Louisiana Tech with a BSEE.
Ian Verhappen, PE, CAP, and ISA Fellow, has worked in all three aspects of the
automation industry: end user, supplier, and engineering consultant. After approximately
25 years as an end user in the hydrocarbon industry (where he was responsible for
analyzer support, as well as integration of intelligent devices in the facilities),
Verhappen moved to a supplier company as director of digital networks. For the past 5+
years, he has been working for engineering firms as a consultant.
In addition to being a regular trade journal columnist, Verhappen has been active in ISA
and IEC standards for many years, including serving a term as vice president of the ISA
Standards and Practices (S&P) Board. He is presently the convener of IEC SC65E
WG10 Intelligent Device Management, a member of the SC65E WG2 (List of
Properties), and managing director of several ISA standards including ISA-108.
X
Factory Automation
Mechatronics
Many engineering products of the last 30 years possess integrated mechanical,
electrical, and computer systems. Mechatronics has evolved significantly by taking
advantage of embedded computers and supporting information technologies and
software advances. The result has been the introduction of many new intelligent
products into the marketplace, along with the associated practices, described in this
chapter, that help ensure successful implementation.
Motion Control
Motion control of machines and processes compares the desired position to the actual
position and takes whatever corrective action is necessary to bring them into agreement.
Initially, machine tools were the major beneficiary of this automation. Today, packaging,
material handling, food and beverage processing, and other industries that use
machines with movable members are enjoying the benefits of motion control.
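At its core, the compare-and-correct cycle described above is a feedback loop. A minimal Python sketch of one proportional position-control step (the gain and setpoint are illustrative assumptions):

    def motion_control_step(desired_pos, actual_pos, kp=0.5):
        # Compare desired position to actual position and command a
        # corrective move proportional to the error
        error = desired_pos - actual_pos
        return kp * error

    # Example: a single axis converging on a 100 mm setpoint
    position = 0.0
    for _ in range(20):
        position += motion_control_step(100.0, position)
    print(round(position, 4))   # very close to 100.0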
Vision Systems
A vision system is a perception system used for monitoring, detecting, identifying,
recognizing, and gauging that provides local information useful for measurement and
control. The systems consist of several separate or integrated components including
cameras and lenses, illumination sources, mounting and mechanical fixturing hardware,
computational or electronic processing hardware, input/output (I/O) connectivity and
cabling electrical hardware, and, most importantly, the software that performs the visual
sensing and provides useful information to the measurement or control system.
Building Automation
This chapter provides insight into the industry that automates large buildings. Each
large building has a custom-designed heating, ventilating, and air-conditioning
(HVAC) system to which automated controls are applied. Access control, security,
fire, life safety, lighting control, and other building systems are also automated as part
of the building automation system.
31
Mechatronics
By Robert H. Bishop
Basic Definitions
Modern engineering design has naturally evolved into a process that we can describe in
a mechatronics framework. Since the term was first coined in the 1970s, mechatronics
has evolved significantly by taking advantage of embedded computers and supporting
information technologies and software advances. The result has been the introduction of
many new intelligent products into the marketplace. But what exactly is mechatronics?
Mechatronics was originally defined by the Yaskawa Electric Company in trademark
application documents [1]:
The word, Mechatronics, is composed of “mecha” from mechanism and
“tronics” from electronics. In other words, technologies and developed
products will be incorporating electronics more and more into mechanisms,
intimately and organically, and making it impossible to tell where one ends
and the other begins.
The definition of mechatronics evolved after Yaskawa suggested the original
definition. One of the most often quoted definitions comes from Harashima, Tomizuka,
and Fukuda [2]. In their words, mechatronics is defined as:
The synergistic integration of mechanical engineering, with electronics and
intelligent computer control in the design and manufacturing of industrial
products and processes.
Other definitions include:
• Auslander and Kempf [3]
Mechatronics is the application of complex decision-making to the operation of
physical systems.
• Shetty and Kolk [4]
Mechatronics is a methodology used for the optimal design of electromechanical
products.
• Bolton [5]
A mechatronic system is not just a marriage of electrical and mechanical systems
and is more than just a control system; it is a complete integration of all of them.
These definitions of mechatronics express various aspects of mechatronics, yet each
definition alone fails to capture the entirety of the subject. Despite continuing efforts to
define mechatronics, to classify mechatronic products, and to develop a standard
mechatronics curriculum, agreement on “what mechatronics is” eludes us. Even without
a definitive description of mechatronics, engineers understand the essence of the
philosophy of mechatronics from the definitions given above and from their own
personal experiences.
Mechatronics is not a new concept for design engineers. Countless engineering products
possess integrated mechanical, electrical, and computer systems, yet many design
engineers were never formally educated or trained in mechatronics. Indeed, many so-called mechatronics programs in the United States are actually programs embedded
within the mechanical engineering curriculum as minors or concentrations [6]. However,
outside of the United States, for example in Korea and Japan, mechatronics was
introduced in 4-year curricula about 25 years ago. Modern concurrent engineering
design practices, now formally viewed as an element of mechatronics, are natural design
processes. From an educational perspective, the study of mechatronics provides a
mechanism for scholars interested in understanding and explaining the engineering
design process to define, classify, organize, and integrate the many aspects of product
design into a coherent package. As the historical divisions between mechanical,
electrical, biomedical, aerospace, chemical, civil, and computer engineering give way to
more multidisciplinary structures, mechatronics can provide a roadmap for engineering
students studying within the traditional structure of most engineering colleges. In fact,
undergraduate and graduate courses in mechatronic engineering are now offered in
many universities. Peer-reviewed journals are being published and conferences
dedicated to mechatronics are very popular. However, mechatronics is not just a topic
for investigative studies by academicians; it is a way of life in modern engineering
practice. The widespread adoption of the microprocessor in the early 1980s, coupled
with demands for ever-higher performance-to-cost ratios, changed the nature of engineering
design. The number of new products being developed at the intersection of traditional
disciplines of engineering, computer science, and the natural sciences is expanding.
New developments in these traditional disciplines are being absorbed into mechatronics
design. The ongoing information technology revolution, advances in wireless
communication, smart sensor design, the Internet of Things, and embedded systems
engineering ensure that mechatronics will continue to evolve.
Key Elements of Mechatronics
The study of mechatronic systems can be divided into the following areas of specialty:
• Physical system modeling
• Sensors and actuators
• Signals and systems
• Computers and logic systems
• Software and data acquisition
The key elements of mechatronics are illustrated in Figure 31-1. As the field of
mechatronics continues to mature, the list of relevant topics associated with the area will
most certainly expand and evolve. The extent to which mechatronics reaches into
various engineering disciplines is revealed by the constituent key elements comprising
mechatronics.
Physical System Modeling
Central to mechatronics is the integration of physical systems (e.g., mechanical and
electrical systems) utilizing various sensors and actuators connected to computers and
software. The connections are illustrated in Figure 31-2. In the design process, it is
necessary to represent the physical world utilizing mathematical models; hence,
physical system modeling is essential to the mechatronic design process. Fundamental
principles of science and engineering (such as the dynamical principles of Newton,
Maxwell, and Kirchhoff) are employed in the development of physical system models of
mechanical and dynamical systems, electrical systems, electromechanical systems, fluid
systems, thermodynamic systems, and micro-electromechanical (MEMS) systems. The
mathematical models might include the external physical environment for simulation
and verification of the control system design, and it is likely that the mathematical
models also include representations of the various sensors and actuators. An
autonomous rover, for example, might utilize an inertial measurement unit for
navigation purposes, and the software hosted on the computer aboard the vehicle would
utilize a model of the inertial measurement unit to compute the expected acceleration
measurements as part of the predictive function in the rover trajectory control.
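As a minimal sketch of such a physical system model, the following Python fragment applies Newton's second law to a mass-spring-damper and integrates the resulting equation of motion numerically. The parameter values are illustrative assumptions, not values from this chapter.

# Physical system model from Newton's second law: m*x'' = -c*x' - k*x
# Parameter values are illustrative only.
m, c, k = 1.0, 0.5, 20.0     # mass (kg), damping (N*s/m), stiffness (N/m)
dt = 0.001                   # integration step (s)

x, v = 0.1, 0.0              # initial displacement (m) and velocity (m/s)
for _ in range(5000):        # simulate 5 s of free response
    a = (-c * v - k * x) / m # acceleration from the force balance
    v += a * dt              # semi-implicit Euler integration
    x += v * dt
print(f"displacement after 5 s: {x:.5f} m")  # oscillation decays toward zero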
Sensors and Actuators
Mechatronic systems utilize sensors to measure the external environment and actuators
to manipulate the physical systems to achieve the desired goals. We classify sensors by
their measurement goals. Sensors can measure linear motion, rotational motion,
acceleration, force, torque, pressure, flow rates, temperature, range and proximity, light
intensity, and images. Furthermore, sensors can be classified as either analog or digital,
and as passive or active. A cadmium sulfide cell that measures light intensity is an
example of an analog sensor. A digital camera is a common digital sensor. When
remotely sensing the Earth, we can use passive sensors that measure energy that is
naturally available when the sun is illuminating the Earth. For example, a digital camera
might serve as a passive sensor for remote sensing. On the other hand, we can also use
an active sensor that provides its own energy source for illumination. An example of an
active sensor for remote sensing is the synthetic aperture radar. Sensors are becoming
increasingly smaller, lighter, and, in many instances, less expensive. The trend to
microscales and nanoscales supports the continued evolution of mechatronic system
design to smaller and smaller scales.
Actuators are following the same trends as sensors in reduced size and cost. Actuators
can be classified according to the nature of their energy: electromechanical, electrical,
electromagnetic, piezoelectric, hydraulic, and pneumatic. Furthermore, we can classify
actuators as binary or continuous. For example, a relay is a binary actuator and a stepper
motor is a continuous actuator. Examples of electrical actuators include diodes,
thyristors, and solid-state relays. Examples of electromechanical actuators include
motors, such as direct current (DC) motors, alternating current (AC) motors, and stepper
motors. Examples of electromagnetic actuators include solenoids and electromagnetic
relays. As new smart material actuators continue to evolve, we can expect advanced
shape memory alloy actuators and magnetostrictive actuators. As the trend to smaller
actuators continues, we can also look forward to a larger selection of microactuators and
nanoactuators.
When working with sensors and actuators, one must necessarily be concerned with the
fundamentals of time and frequency. Three types of time and frequency information can
be conveyed: clock time or time of day (the time an event occurred), time interval
(duration between events), and frequency (rate of a repetitive event). In mechatronic
systems, we often need to time tag and synchronize events, such as time measurements
that are obtained from sensors or the time at which an actuator must be activated. The
accuracy of time and frequency standards has improved by many orders of magnitude
over the past decades, allowing further advances in mechatronics by reducing timing
uncertainties in sensing and actuation.
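The following Python sketch illustrates the three kinds of timing information in software terms: each sample receives a clock-time tag, intervals are computed between consecutive events, and frequency is derived as the rate of the repetitive event. The read_sensor function is a hypothetical stand-in for a real measurement.

import time

def read_sensor():
    return 0.0   # hypothetical stand-in for a real sensor read

samples = []
for _ in range(5):
    t_wall = time.time()         # clock time: when the event occurred
    t_mono = time.monotonic()    # monotonic time: safe for measuring intervals
    samples.append((t_wall, t_mono, read_sensor()))
    time.sleep(0.01)             # nominal 10 ms between samples

# Time interval: duration between consecutive events.
intervals = [b[1] - a[1] for a, b in zip(samples, samples[1:])]
mean_dt = sum(intervals) / len(intervals)
# Frequency: rate of the repetitive event.
print(f"mean interval: {mean_dt*1e3:.2f} ms, "
      f"event frequency: {1.0/mean_dt:.1f} Hz")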
In addition to timing and frequency, the issues of sensor and actuator characteristics
must be considered. The characteristics of interest include range, resolution, sensitivity,
errors (calibration, loading, and sensitivity), repeatability, linearity and accuracy,
impedance, friction, eccentricity, backlash, saturation, deadband, input, and frequency
response.
Signals and Systems
An important step in the design process is to accurately represent the system with
mathematical models. What are the mathematical models used for? They are employed
in designing and analyzing appropriate control systems. The application of system
theory and control is central to mechatronic system design. The relevance of control
system design to mechatronic systems extends from classical frequency domain design
using linear, time-invariant, single-input single-output (SISO) system models, to modern
multiple-input multiple-output (MIMO) state space methods (again assuming linear,
time-invariant models), to nonlinear, time-varying methods. Classical design methods
use transfer function models in conjunction with root locus methods and frequency-response methods, such as Bode, Nyquist, and Nichols. Although the transfer function
approach generally employs SISO models, it is possible to perform MIMO analysis in
the frequency domain. Modern state-space analysis and design techniques can be readily
applied to SISO and MIMO systems. Pole placement techniques are often used to design
the state-space controllers. Optimal control methods, such as linear quadratic regulators
(LQRs), have been discussed in the literature since the 1960s and are in common use
today. Robust optimal control design strategies, such as H2 and H∞, are applicable to
mechatronic systems, especially in situations where there is considerable uncertainty in
the plant and disturbance models. Other common design methodologies include fuzzy
control, adaptive control, and nonlinear control (using Lyapunov methods and feedback
linearization).
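As a small worked example of the state-space approach, consider a double integrator (position x1, velocity x2) regulated by full state feedback u = -K x. Matching the closed-loop characteristic polynomial s^2 + k2*s + k1 to desired poles at -2 ± 2j gives k1 = 8 and k2 = 4; the Python sketch below simulates the result. The plant and pole locations are illustrative choices, not taken from this chapter.

# Full state feedback for a double integrator: x1' = x2, x2' = u.
# With u = -k1*x1 - k2*x2 the closed loop is s^2 + k2*s + k1;
# desired poles at -2 +/- 2j give s^2 + 4s + 8, so k1 = 8, k2 = 4.
k1, k2 = 8.0, 4.0
dt = 0.001

x1, x2 = 1.0, 0.0            # initial state: offset position, at rest
for _ in range(5000):        # 5 s of closed-loop response
    u = -k1 * x1 - k2 * x2   # state-feedback control law
    x1 += x2 * dt            # Euler integration of the plant
    x2 += u * dt
print(f"state after 5 s: x1={x1:.5f}, x2={x2:.5f}")  # both decay toward zero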
Computers and Logic Systems
Once the control system is designed, it must be implemented on the mechatronic
system. This is the stage in the design process where issues of software and computer
hardware, logic systems, and data acquisition take center stage. The development of the
microcomputer, and associated information technologies and software, has impacted the
field of mechatronics and has led to a whole new generation of consumer products.
The computer is used to monitor and/or control processes and operations in a
mechatronic system. In this mode, the computer is generally an embedded computer
hosting an operating system (often real time) running codes and algorithms that process
input measurement data (from a data acquisition system) to prescribe outputs that drive
actuators to achieve the desired closed-loop behavior. Embedded computers allow us to
introduce intelligence into the mechatronic systems. Unfortunately, nearly half of all
embedded system designs are late or never make it to market, and about one-third
fail once they are deployed. One of the challenges facing designers in this arena is
related to the complexity of the algorithms and codes that are being implemented on the
embedded computers. One of the newer (and effective) approaches to addressing the
coding complexity issue is using graphical system design that blends intuitive graphical
programming and commercial off-the-shelf hardware to design, prototype, and deploy
embedded systems.
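A minimal skeleton of such an embedded control loop is sketched below in Python: read a measurement from the data acquisition layer, compute the control law, and drive the actuator at a fixed period. The I/O function names are hypothetical placeholders for a real hardware interface, and a production implementation would normally run on a real-time operating system rather than in Python.

import time

PERIOD = 0.01   # 100 Hz control period (illustrative)

def read_measurement():      # hypothetical DAQ read
    return 0.0

def control_law(y, setpoint=1.0, kp=2.0):
    return kp * (setpoint - y)   # e.g., a simple proportional law

def write_actuator(u):       # hypothetical actuator output
    pass

next_tick = time.monotonic()
for _ in range(1000):                  # a real system would loop indefinitely
    y = read_measurement()             # input from the data acquisition system
    u = control_law(y)                 # algorithm hosted on the embedded computer
    write_actuator(u)                  # drive the actuator
    next_tick += PERIOD                # fixed-rate scheduling without drift
    time.sleep(max(0.0, next_tick - time.monotonic()))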
The computer also plays a central role in the design phase where engineers use design
and analysis software (off-the-shelf or special purpose) to design, validate, and verify
the expected performance of the mechatronic system. Sometimes this is accomplished
completely in simulation, but more often the computer is but one component of a test
procedure that encompasses hardware-in-the-loop simulations and other laboratory
investigations.
Data Acquisition and Software
The collection of signals from the sensors is known as data acquisition (DAQ). The
DAQ system collects and digitizes the sensor signals for use in the computer-controlled
environment of the mechatronic system. The DAQ system can also be used to generate
signals for control purposes. A DAQ system includes various computer plug-in DAQ
devices, signal conditioning (e.g., linearization and scaling), and a suite of software. The
software suite controls the DAQ system by acquiring the raw data, analyzing the data,
and presenting the results.
Sensors are typically not connected directly to a plug-in DAQ device in the computer
because the measured physical signals are often low voltage and susceptible to noise,
thus they require some type of signal conditioning. In other words, the signals are
appropriately modified (e.g., amplified and filtered) before the plug-in DAQ device
converts them to digital information. One example of signal conditioning is linearization
where the voltage levels from the sensors or transducers are linearized so that the
voltages can be scaled to measure physical phenomena.
Two main hardware elements of the DAQ system are the analog-to-digital converter
(ADC) and the digital-to-analog converter (DAC). The ADC is an electronic device
(often an integrated circuit) that converts an analog voltage to a digital number.
Similarly, the DAC is an electronic device that converts a digital number to an analog
voltage or current.
The transfer of data to or from a computer system involving communication channels
and DAQ interfaces is referred to as input/output, or I/O. The I/O can be either digital
I/O or analog I/O. There are several key questions that arise when considering analog
signal inputs and DAQ. For example, you need to know the signal magnitude limits. It is
also important to know how fast the signal varies with time. The four parameters of
concern are (1) resolution, (2) device range, (3) signal input range, and (4) sampling
rate. The resolution of the ADC is measured in bits. For example, an ADC with 16 bits
has a higher resolution (and thus a higher degree of accuracy) than a 12-bit ADC. The
device range is the minimum and maximum analog signal levels that the ADC can
digitize. The signal input range is the range that is specified as the maximum and
minimum voltages of the analog input signals. A signal range that includes both positive
and negative values (e.g., −5 V to 5 V) is known as bipolar. A signal range that is
always positive (e.g., 0 V to 10 V) is unipolar. The sampling rate is the rate at which the
DAQ device samples an incoming signal.
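A short worked example ties these four parameters together. Assuming (for illustration) a 16-bit ADC with a bipolar device range of −5 V to 5 V, one code step spans 10 V / 2^16 ≈ 153 µV; the Python sketch below converts a raw ADC code back to a voltage on that range.

# Converting a raw ADC code to a voltage for a bipolar device range.
# The resolution and range below are illustrative assumptions.
BITS = 16                     # resolution: 16-bit ADC
V_MIN, V_MAX = -5.0, 5.0      # device range: bipolar -5 V to 5 V

lsb = (V_MAX - V_MIN) / 2**BITS        # one code step, about 152.6 microvolts
def code_to_volts(code):
    """Map an unsigned ADC code (0 .. 2**BITS - 1) onto the device range."""
    return V_MIN + code * lsb

print(f"LSB size: {lsb*1e6:.1f} uV")
print(f"code 32768 -> {code_to_volts(32768):.4f} V")  # mid-scale, i.e., 0 V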
The Modern Automobile as a Mechatronic Product
The evolution of modern mechatronics is reflected in the development of the modern
automobile. Until the 1960s, the radio was the only significant electronics in an
automobile. Today, the automobile is a comprehensive mechatronic system. For
example, before the introduction of sensors and microcontrollers, a mechanical
distributor was used to select the specific spark plug to fire when the fuel-air mixture
was compressed. The timing of the ignition was the control variable. Modeling of the
combustion process showed that for increased fuel efficiency there existed an optimal
time when the fuel should be ignited depending on the load, speed, and other
measurable quantities. As a result of efforts to increase fuel efficiency, the electronic
ignition system was one of the first mechatronic systems to be introduced in the
automobile. The electronic ignition system consists of crankshaft position, camshaft
position, airflow rate, throttle position, and rate-of-throttle position change sensors,
along with a dedicated microcontroller to determine the timing of the spark plug firings.
The mechanical distributor is now a thing of the past. Other mechatronic additions to the
modern automobile include the antilock brake system (ABS), the traction control system
(TCS), and the vehicle dynamics control (VDC) system.
Modern automobiles typically use combinations of 8-, 16-, and 32-bit processors to
implement the control systems. The microcontroller has onboard memory, digital and
analog inputs, analog-to-digital and digital-to-analog converters, pulse width
modulation, timer functions (such as event counting and pulse width measurement),
prioritized inputs, and, in some cases, digital signal processing. Typically, the 32-bit
processor is used for engine management, transmission control, and airbags; the 16-bit
processor is used for the ABS, TCS, VDC, instrument cluster, and air-conditioning
systems; and the 8-bit processor is used for seat control, mirror control, and window lift
systems. By 2017, some luxury automobiles were employing over 150 onboard
microprocessors—and the trend of increasing the use of microprocessors continues [7].
And what about software? Modern automobiles have millions of lines of code with
conventional autos hosting up to 10 million lines of code and high-end luxury sedans
hosting nearly 100 million lines of code [8]. With this many lines of code and the
growing connectivity of the automobile to the Internet, the issue of cybersecurity is
rapidly becoming a subject of great interest, as it has been demonstrated that hacking
of automobile subsystems is possible [9].
Automobile makers are searching for high-tech features that will differentiate their
vehicles from others. It is estimated that 60% of the cost of a car is associated with
automotive electronic systems and that the global automotive electronics market size
will reach $300 billion by 2020 [10]. New applications of mechatronic systems in the
automotive world include driverless and connected automobiles, safety enhancements,
emission reduction, and other features, including intelligent cruise control and brake-by-wire systems that eliminate the hydraulics. An upcoming trend is to bring the computer
into the automobile passenger compartment with, for example, split screen monitors that
facilitate front seat passenger entertainment (such as watching a movie) while the driver
navigates using GPS and mapping technology [11]. As the number of automobiles in the
world increases, stricter emission standards are inevitable. Mechatronic products will
likely contribute to meeting the challenges in emission control by 