CSE466 Syllabus

Safety
 Examples
 Terms and Concepts
 Safety Architectures
 Safe Design Process
 Software Specific Stuff
 Sources
 Doing Hard Time by Bruce Powell Douglass, which references Safeware by Nancy Leveson
CSE 466 – Fall 2000 - Introduction - 1
What is a Safe System?
[Figure: block diagram with a brake pedal, pedal sensor, processor, bus, brake w/ local controller, and engine w/ local controller]
Is it safe?
What does “safe” mean?
How can we make it safe?
Terms and Concepts
 Reliability of component i can be expressed as the probability that
component i is still functioning at some time t.
$P_i(t)$ = probability that component i is operational at time t
[Figure: $P_i(t)$ plotted against time, with an initial burn-in period]
Low failure rate means nearly constant probability
1/(failure rate) = MTBF
 Is system reliability $P_s(t) = \prod_i P_i(t)$?
 Assuming that all components have the same component reliability, is a
system w/ fewer components always more reliable?
 Does component failure ⇒ system failure?
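The questions above can be explored numerically. The following is a minimal sketch, not part of the original slides. It assumes a constant failure rate (an exponential model, consistent with 1/(failure rate) = MTBF) and independent component failures. The function names and the MTBF value are hypothetical.

```python
import math


def component_reliability(failure_rate, t):
    """P_i(t) under an assumed constant failure rate (exponential model)."""
    return math.exp(-failure_rate * t)


def series_reliability(reliabilities):
    """P_s(t) when every component must work: the product of the P_i(t)."""
    product = 1.0
    for p in reliabilities:
        product *= p
    return product


def parallel_reliability(reliabilities):
    """P_s(t) when any one working component is enough (redundancy)."""
    all_fail = 1.0
    for p in reliabilities:
        all_fail *= (1.0 - p)
    return 1.0 - all_fail


mtbf_hours = 10_000.0            # hypothetical MTBF
rate = 1.0 / mtbf_hours          # failure rate = 1 / MTBF
p = component_reliability(rate, t=1_000.0)

series3 = series_reliability([p, p, p])      # all three must work
parallel3 = parallel_reliability([p, p, p])  # any one of three suffices
```

Under these assumptions the product formula holds only when every component is required. Three identical components in series are less reliable than one, while three in parallel are more reliable than one, so a system with fewer components is not always more reliable, and a component failure need not be a system failure.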
A Safe System
 A system is safe if its deployment involves assuming an acceptable amount
of risk… acceptable to whom?
 Risk factors
 Probability of something bad happening
 Consequences of something bad happening (Severity)
 Example
 Airplane Travel – high severity, low probability
 Electric shock from battery powered devices – high probability, low severity
[Figure: probability vs. severity plot with a safe zone and a danger zone. Plotted examples: mp3 player, Desktop PC?, airplane, Nuclear power plant. Annotation: we don’t all have the same risk tolerance!]
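One way to make the two risk factors concrete is to combine them into a single number and compare it with a tolerance. The sketch below assumes risk = probability × severity, a common convention that the slides do not state. Every number, name, and scale in it is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    probability: float  # hypothetical chance of a mishap per use
    severity: float     # hypothetical cost of a mishap, arbitrary units


def risk(activity):
    """Assumed combination rule: risk = probability * severity."""
    return activity.probability * activity.severity


def is_acceptable(activity, tolerance):
    """Whether the risk falls inside one person's 'safe zone'."""
    return risk(activity) <= tolerance


# High severity, low probability
airplane = Activity("airplane travel", probability=1e-6, severity=1000.0)
# High probability, low severity
shock = Activity("battery shock", probability=0.1, severity=0.01)

relaxed = 0.01     # hypothetical tolerance of one stakeholder
strict = 0.0005    # hypothetical tolerance of a more cautious one

relaxed_view = [is_acceptable(a, relaxed) for a in (airplane, shock)]
strict_view = [is_acceptable(a, strict) for a in (airplane, shock)]
```

With these made-up values both activities are acceptable to the relaxed stakeholder and unacceptable to the strict one, which is the "acceptable to whom?" point: the boundary of the safe zone depends on who sets the tolerance.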
More Precise Terminology
 Accident or Mishap: (unintended) Damage to property or harm to persons.
 Release of Energy
 Release of Toxins
 Interference with life support functions
 Damage to a credit rating (Does this fit the definition?)
 Hazard: A state of the system that will inevitably lead to an accident or
mishap.
 Hydrogen leak in fuel cell system
 Backup battery dead on ventilator
 Supplying misleading information to safety personnel or control systems.
This is the desktop PC nightmare scenario: bad information.
 Alarm bell broken
Faults
 A fault is an “unsatisfactory system condition or state”. A fault is not
necessarily a hazard. In fact, assessments of safety are based on the notion
of fault tolerance.
 Main H2 Valve stuck, back up H2 valve working
 Systemic faults
 Design Errors (includes process errors such as failure to test or failure to
apply a safety design process)
 Faults due to software bugs are systemic
 Security breach
 Random Faults
 Random events that can cause permanent or temporary damage to the
system. Includes EMI and radiation, component failure, power supply
problems, wear and tear.
Component v. System
 Reliability is a component issue
 Safety and Availability are system issues
 A system can be safe even if it is unreliable!
 If a system has lots of redundancy, the likelihood of a component failure (a
fault) increases, but the safety and availability of that system may
increase too.
 Safety and Availability are different and sometimes at odds. Safety may
require the shutdown of a system that may still be able to perform its
function.
 A backup system that can fully operate a nuclear power plant might
always shut it down in the event of failure of the primary system.
 The plant could remain available, but it is unsafe to continue operation
Single Fault Tolerance (for safety)
 The existence of any single fault does not result in a hazard
 Single fault tolerant systems are generally considered to be safe, but more
stringent requirements may apply to high risk cases…airplanes, power
plants, etc.
[Figure: Main H2 Valve Control and Backup H2 Valve Control connected by a watchdog protocol. Annotation: assume perfectly reliable valves.]
If the handshake fails, then either one or both can shut off the gas supply. Is this a single fault tolerant system?
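The watchdog handshake between the two valve controllers can be sketched as a small simulation. This is an illustration under stated assumptions, not the design from the slides. The class name, the timeout value, the heartbeat scheme, and the series arrangement of the valves are all hypothetical, and the valves themselves are assumed perfectly reliable.

```python
class ValveController:
    """One H2 valve controller that watches its peer's heartbeat."""

    def __init__(self, name, timeout):
        self.name = name
        self.timeout = timeout          # longest allowed silence from the peer
        self.last_peer_heartbeat = 0.0  # time the peer was last heard from
        self.alive = True               # False models a crashed controller
        self.valve_open = True

    def step(self, now, peer):
        """Run one control cycle at time `now`."""
        if not self.alive:
            return  # a crashed controller sends nothing and checks nothing
        peer.last_peer_heartbeat = now  # send our heartbeat to the peer
        if now - self.last_peer_heartbeat > self.timeout:
            self.valve_open = False     # handshake failed: shut off the gas


main = ValveController("main", timeout=3.0)
backup = ValveController("backup", timeout=3.0)
shutoff_time = None

for t in range(1, 11):
    if t == 5:
        main.alive = False  # injected single fault: main controller crashes
    main.step(float(t), backup)
    backup.step(float(t), main)
    # Assumed: valves in series, so closing either one stops the gas.
    gas_flowing = main.valve_open and backup.valve_open
    if not gas_flowing and shutoff_time is None:
        shutoff_time = t
```

In this run the main controller stops sending heartbeats, and the backup closes its valve once the silence exceeds the timeout. The sketch covers only one injected fault. It does not model common mode failures such as a shared clock or power supply, so it cannot by itself answer whether the design is single fault tolerant.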
Is This?
[Figure: Main H2 Valve Control and Backup H2 Valve Control with a watchdog handshake, annotated with common mode failures.]
Now Safe?
[Figure: Main H2 Valve Control and Backup H2 Valve Control with a watchdog handshake.]
•Separate Clock Source
•Power Fail-Safe (non-latching) Valves
Does it ever end?
Time is a Factor
 The TÜV Fault Assessment Flow Chart
 T1: Fault tolerance time of the first failure
 T2: Time after which a second fault is likely, based on MTBF data
 Captures time, and the notion of “latent faults”
 Safety requires that Ttest < T1 < T2
[Figure: TÜV fault assessment flow chart. Nodes: First Fault; Hazard after T1? (yes/no); Fault Detected After T2? (yes/no); 2nd Fault; hazard? (yes/no). Outcomes: System Unsafe, System Safe.]
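The timing requirement Ttest < T1 < T2 can be checked mechanically. The sketch below is a minimal illustration. It assumes Ttest is the time the system needs to test for and detect the first fault, which the slide does not define. The function name, messages, and numbers are hypothetical.

```python
def check_timing(t_test, t1, t2):
    """Return a list of violations of Ttest < T1 < T2 (empty means OK).

    t_test: assumed time needed to detect the first fault
    t1:     fault tolerance time of the first failure
    t2:     time after which a second fault is likely
    """
    problems = []
    if not t_test < t1:
        problems.append("first fault may not be detected within T1")
    if not t1 < t2:
        problems.append("second fault becomes likely within T1")
    return problems


# Hypothetical values, all in seconds
ok = check_timing(t_test=0.5, t1=2.0, t2=3600.0)
slow_test = check_timing(t_test=5.0, t1=2.0, t2=3600.0)
early_second_fault = check_timing(t_test=0.5, t1=2.0, t2=1.0)
```

An empty list means both inequalities hold. Each message names the inequality that failed, so a slow test and an early second fault are reported separately.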
Latent Faults
 Any fault that is not detectable by the system during operation must be assumed
to be present (probability of 1), so it doesn’t count as the single fault in a single fault tolerance assessment
[Figure: Main H2 Valve Control and Backup H2 Valve Control with a watchdog handshake. Annotations: stuck valves could be latent if the controllers cannot check their state; may as well assume that they are stuck!]
 Detection might not mean diagnosis. If the system can detect a secondary effect
of device failure before a hazard arises, then this could be considered safe
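The effect of latent faults on a single fault tolerance assessment can be sketched by enumeration. The model below is hypothetical and far simpler than a real analysis. It assumes four components, and that the gas can be shut off if either controller and its own valve both work. Latent faults are treated as already present before the single additional fault is considered.

```python
COMPONENTS = ["main_ctrl", "backup_ctrl", "main_valve", "backup_valve"]


def can_shut_off(failed):
    """Assumed model: one working controller plus its working valve suffices."""
    main_path = "main_ctrl" not in failed and "main_valve" not in failed
    backup_path = "backup_ctrl" not in failed and "backup_valve" not in failed
    return main_path or backup_path


def single_fault_tolerant(latent=()):
    """True if no single extra fault, on top of the latent ones, is a hazard."""
    latent = set(latent)
    if not can_shut_off(latent):
        return False  # the latent faults alone already create the hazard
    return all(
        can_shut_off(latent | {component})
        for component in COMPONENTS
        if component not in latent
    )


no_latent = single_fault_tolerant()
one_stuck_valve = single_fault_tolerant(latent={"main_valve"})
both_valves_stuck = single_fault_tolerant(latent={"main_valve", "backup_valve"})
```

In this model the system tolerates any single fault when every fault is detectable. Once a stuck main valve is treated as latent, a single backup controller fault is enough to lose the ability to shut off the gas, so the assessment fails. If both valves are assumed stuck, the hazard exists without any further fault.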
Design for Safety
1. Hazard Identification and Fault Tree Analysis, FMEA
2. Risk Assessment
3. Define Safety Measures
4. Create Safe Requirements
5. Implement Safety
6. Assure Safety Process
7. Test, Test, Test, Test, Test
1. Hazard Identification – Ventilator Example
[Slide annotation: Human in Loop]
Mishap             | Severity | Tolerance Time | Fault Example                          | Likelihood | Detection Time | Mechanism                                            | Exposure Time
Hypoventilation    | Severe   | 5 min.         | Vent Fails – No pressure in reservoir. | Rare       | 30sec          | Indep. pressure sensor w/ alarm                      | 40sec
                   |          |                | Esophageal intubation                  | Medium     | 30sec          | CO2 sensor alarm                                     | 40sec
                   |          |                | User misattaches breathing hoses       | never      | N/A            | Different mechanical fittings for intake and exhaust | N/A
Overpressurization | Severe   | 0.05sec        | Release valve failure                  | Rare       | 0.01sec        | Secondary valve opens                                | 0.01sec
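The times in the ventilator hazard analysis can be compared programmatically. The sketch below assumes the acceptance rule that exposure time must be shorter than tolerance time, which the slide does not state. It also assumes that the esophageal intubation fault belongs to the hypoventilation mishap and shares its 5 minute tolerance time. The tuple layout and names are hypothetical, and the row with N/A times is left out.

```python
# (mishap, fault, tolerance_s, detection_s, exposure_s), times in seconds
ROWS = [
    ("Hypoventilation", "Vent fails, no pressure in reservoir", 300.0, 30.0, 40.0),
    ("Hypoventilation", "Esophageal intubation", 300.0, 30.0, 40.0),
    ("Overpressurization", "Release valve failure", 0.05, 0.01, 0.01),
]


def exposure_within_tolerance(tolerance_s, exposure_s):
    """Assumed rule: the hazard must be removed before the tolerance time."""
    return exposure_s < tolerance_s


results = {
    fault: exposure_within_tolerance(tolerance_s, exposure_s)
    for _mishap, fault, tolerance_s, _detection_s, exposure_s in ROWS
}

# Margin left over after the exposure time, per fault
margins = {
    fault: tolerance_s - exposure_s
    for _mishap, fault, tolerance_s, _detection_s, exposure_s in ROWS
}
```

Under the assumed rule every listed fault is handled within its tolerance time. The margins differ widely, from minutes for hypoventilation to hundredths of a second for overpressurization, which is why the latter relies on an automatic secondary valve rather than an alarm.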