Why do we need something different?

advertisement
Why do we need something different?
• Fast pace of technological change
• Reduced ability to learn from experience
• Changing nature of accidents
• New types of hazards
• Increasing complexity and coupling
• Decreasing tolerance for single accidents
• Difficulty in selecting priorities and making tradeoffs
• More complex relationships between humans and
automation
• Changing regulatory and public views of safety
Assumptions Underlying What We Do
1. Safety is increased by increasing system or
component reliability. If components or systems do
not fail, then accidents will not occur.
High reliability is neither necessary nor sufficient for
safety.
2. Accidents are caused by chains of directly related
events. We can understand accidents and assess
risk by looking at the chain of events leading to the
loss.
Accidents are complex processes involving the
entire socio-technical system. Traditional eventchain models cannot describe this process
adequately.
Assumptions Underlying What We Do
3. Probabilistic risk assessment based on event chains
is the best way to assess and communicate safety
and risk information.
Risk and safety may be best understood and
communicated in ways other than probabilistic risk
analysis.
4. Most accidents are caused by operator error.
Rewarding safe behavior and punishing unsafe
behavior will eliminate or reduce accidents
significantly.
Operator error is a product of the environment in
which it occurs. To reduce operator “error” we must
change the environment in which the operator works.
Assumptions Underlying What We Do
5. Highly reliable software is safe.
Highly reliable software is not necessarily safe.
Increasing software reliability will have only minimal
impact on safety.
Software is simply design abstracted from its
physical realization.
Black Box Testing
Test data derived solely from specifiation (i.e,
without knowledge of internal structure of program).
Need to test every possible input
x := y * 2
if x = 5 then y := 3
(Since black box, only way to be sure to detect
this is to try envery input condition)
Valid inputs up to max size of machine ( not astronomical)
Also all invalid input (e.g., testing Ada compiler requires all
valid and invalid programs)
If program has "memory", need to test all possible unique
valid and invalid sequences.
So for most programs, exhaustive input testing
is impractical.
White Box Testing
Derive test data by examining program's logic.
Exhaustic path testing: Two flaws
1) Number of unique paths through program is astronomical.
5
20
19
+ 5
18
+ 5
+ ... + 5 = 1014
= 100 trillion
If could develop/execute/verify one
test cases every five minutes = 1 billion years
If had magic test processor that could
develop/execute/evaluate one test per
msec = 3170 years.
White Box Testing (2)
2) Could test every path and program may still have erroes!
Does not guarantee program matches specification,
i.e., wrong program.
Missing paths: would not detect absence of necessary paths
Could still have data-sensitivity errors.
e.g. program has to compare two numbers for convergence
if (A - B) < epsilon ...
is wrong because should compare to abs(A - B)
Detection of this error dependent on values used for A
and B and would not necessarily be found by executing
every path through program.
Assumptions Underlying What We Do
6. Major accidents occur from the chance
simultaneous occurrence of random events.
Systems tend to migrate toward states of higher risk.
Such migration is predictable and can be prevented
by appropriate system design or detected during
operations using leading indicators or increasing
risk.
7. Assigning blame is necessary to learn from and
prevent accidents or incidents.
Blame is the enemy of safety. Focus should be on
understanding how the system behavior as a whole
contributed to the loss and not on who or what to
blame for it.
MIT OpenCourseWare
http://ocw.mit.edu
16.863J / ESD.863J System Safety
Spring 2011
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Download