Engineering Judgement
Martyn Thomas
Visiting Professor of Software Engineering
Oxford University Computing Laboratory
martyn@thomas-associates.co.uk
“When I hear the words ‘engineering judgement’, I know they are just going to
make up numbers.”
Richard Feynman, 1988.
ASCSC 2004 Brisbane Workshop
The argument in brief
• Almost all safety-related systems have target failure
probabilities (pfh) below 10^-5 per hour.
• Assuring such a pfh would require evidence that is
rarely available at the time of certification.
• Assessors therefore rely on their engineering
judgement. In effect, they make up numbers.
• Accepting that this is inevitable, we need to make
radical changes in the way we develop and maintain
systems, and certify them.
Safety Integrity Levels
IEC 61508: high demand or continuous mode of operation

Safety integrity level    Probability of a dangerous failure per hour
4                         10^-9 to 10^-8
3                         10^-8 to 10^-7
2                         10^-7 to 10^-6
1                         10^-6 to 10^-5

Even SIL 1 is beyond reasonable assurance by testing: it would take more than
10 years under operational conditions, with no failures and no modifications.
What sense does it make to attempt to distinguish single
factors of 10 in this way? Do we really know so much
about the effect of different development methods on
product failure rates? Of course not!
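The "10+ years" figure is simple arithmetic. The sketch below is illustrative only: it takes the upper bound of each SIL band and converts the matching failure-free operating time, 1/pfh hours, into years of continuous operation. The names `SIL_UPPER_PFH` and `years_of_operation` are hypothetical, and a year is assumed to be 8,760 hours.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: continuous operation, non-leap year

# Upper bound of each IEC 61508 high-demand band (dangerous failures per hour).
SIL_UPPER_PFH = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}


def years_of_operation(pfh: float) -> float:
    """Years of continuous operation that amount to 1/pfh hours."""
    return (1.0 / pfh) / HOURS_PER_YEAR


for sil, pfh in SIL_UPPER_PFH.items():
    print(f"SIL {sil}: {years_of_operation(pfh):,.1f} years")
```

For SIL 1 this gives about 11.4 years; for SIL 4 it gives more than 11,000 years, which is why distinguishing the bands by observation is not practical.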
What would provide adequate
evidence for a pfh of 10^-5?
• Sufficient operational measurements
• Proof of correct implementation of a correct
specification
What do we actually use?
• Testing
• Process-based evidence
• Compliance with standards
Sufficient Operational
Measurements
• For a pfh of 10^-n, at least 10^n hours without
unsafe failure or modification.
• Such criteria are used for ETOPS
certification of aircraft engines
• Such an approach is impractical for most
safety-related transport systems
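How much confidence the 10^n-hour rule buys can be estimated under a simplifying assumption that is not in the slides: failures arrive at a constant rate (an exponential model). Under that assumption, t failure-free hours support the claim "rate at most λ" with confidence 1 − exp(−λt). The function names below are hypothetical.

```python
import math


def confidence_after(hours: float, target_pfh: float) -> float:
    """Confidence that the failure rate is at most target_pfh after
    `hours` of failure-free operation (assumes a constant failure rate)."""
    return 1.0 - math.exp(-target_pfh * hours)


def failure_free_hours(target_pfh: float, confidence: float) -> float:
    """Failure-free hours needed to claim target_pfh at the given confidence."""
    return -math.log(1.0 - confidence) / target_pfh


if __name__ == "__main__":
    print(confidence_after(1e5, 1e-5))      # confidence from 10^5 clean hours
    print(failure_free_hours(1e-5, 0.99))   # hours needed for 99% confidence
```

On this model, 10^5 failure-free hours support a pfh of 10^-5 at only about 63% confidence, and 99% confidence needs roughly 4.6 × 10^5 hours, which is more than 50 years of continuous operation.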
Proof of Correctness
• Proof is an important form of verification. It can
show that a system meets its specification, but
provides no absolute information about the
probability of unsafe failure.
• It is very difficult to prove that all possible
unsafe system states have been considered.
• Full formal proof is very expensive.
What do we actually do?
• Testing
• Process-based evidence
• Compliance with standards
Testing
• What can testing tell us?
– If the tests were statistically representative of the
operation, then sufficient tests would show pfh.
– If a mathematical analysis had established
equivalence classes, then testing a member of each
class would allow an inductive proof that there
could be no failures.
– How the system behaves on the tests.
– … nothing else 
Process-based evidence
• Good processes do not guarantee safe products
– but poor processes almost guarantee unsafe ones
• Good processes are essential if you need to
trust their output (e.g. version control).
• The output from a good process may provide
useful evidence.
– For example, if you can trust a proof process, the
proof may tell you something about the system’s
properties
Compliance with standards
“The nice thing about standards is that there are so
many to choose from …”
Andrew Tanenbaum
• Standards result from negotiation in
committee, often with strong vested interest
from industry.
– It would be surprising if they represented best practice
– … and astonishing if they led to radical improvements
• Much effort goes into meeting standards
that would be better spent improving safety.
An aside on SIL 0
• If your safety argument allows the use of
components with pfh > 10^-5 then IEC 61508
assumes that normal industrial software will be
good enough. That is absurd.
– Little industrial/commercial software has an MTBF
approaching one year…
– nor does it come with a safety analysis, or failure
history …
• I believe that all safety-related software should be
developed to higher standards than almost all
industrial software has been to date.
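To put the MTBF remark in numbers, the sketch below compares an MTBF of one year with the 10^-5 threshold. It assumes a constant failure rate, so that the rate is 1/MTBF, and a year of 8,760 operating hours; both are assumptions made here for illustration, and the names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: continuous operation


def failure_rate_per_hour(mtbf_hours: float) -> float:
    """Failure rate implied by an MTBF, assuming a constant failure rate."""
    return 1.0 / mtbf_hours


rate = failure_rate_per_hour(HOURS_PER_YEAR)  # MTBF of one year
ratio = rate / 1e-5                           # relative to the 10^-5 threshold
print(f"{rate:.2e} failures/hour, {ratio:.1f}x the threshold")
```

Even software with a one-year MTBF fails roughly eleven times more often than 10^-5 per hour. This counts all failures rather than only dangerous ones, so it is a coarse comparison.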
An aside about maintenance
• In principle, any system change invalidates all the
operational history of that system
– unless you can prove that the change has some restricted impact
(which, typically, you cannot)
• So should all the original assurance activities be repeated?
– Obviously, yes, although some of the outputs may be reusable.
• Does this happen? Not in my experience.
• It seems likely that we shall see an increasing number of
incidents caused by defects introduced in maintenance.
Safety Assurance:
the state of practice
• There is insufficient empirical evidence to justify even
the pfh associated with SIL 1, to 99% confidence.
• Development methods and tools in common use are
too informal to support reasoning about correctness.
• So most attention is given to process issues and
conformance with standards, despite the very weak
causal link with safety.
• We usually get away with it because people are very
careful and try very hard (and very expensively).
• It seems unlikely that this approach will scale up.
We are like the barber-surgeons of earlier ages, who
prided themselves on the sharpness of their knives
and the speed with which they dispatched their duty -
either shaving a beard or amputating a limb.
Imagine the dismay with which they greeted some
ivory-towered academic who told them that the
practice of surgery should be based on a long and
detailed study of human anatomy, on familiarity with
surgical procedures pioneered by great doctors of the
past, and that it should be carried out only in a
strictly controlled bug-free environment, far removed
from the hair and dust of the normal barber’s shop.
(Professor Sir Tony Hoare 1984)
A possible future
• Greater rigour with minimal innovation
• Minimal defect construction
• Maintenance as the central activity
• Licensing of independent safety assessors
• New-generation Safe COTS components
• Regulation to drive radical change
Greater rigour
with minimal innovation
• Our systems are among the most complex
ever attempted. We must adopt the power of
mathematics to master that complexity.
• A good scientist is a person with original ideas. A
good engineer is a person who makes a design
that works with as few original ideas as possible.
There are no prima donnas in engineering.
Freeman Dyson 2001.
Minimal defect construction
• Dijkstra observed in 1972 that most of the cost in
developing software came from the effort required
to remove the defects.
• Praxis’ Correct by Construction methods are
delivering <0.04 defects/KLoC with a productivity
of >25 LoC/person-day.
• That should become the benchmark for
professional work in safety-related systems. If
your methods do not deliver such high quality at
such low costs, change to CbC.
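As a rough illustration of what the quoted figures imply, the sketch below applies them to a hypothetical 100 KLoC system. The system size and the names are assumptions. The two rates are the bounds quoted above, so the results are upper bounds on both residual defects and effort.

```python
DEFECTS_PER_KLOC = 0.04   # quoted upper bound on delivered defect density
LOC_PER_PERSON_DAY = 25   # quoted lower bound on productivity


def project_estimate(size_kloc: float) -> tuple[float, float]:
    """Return (max expected defects, max person-days) for a system of size_kloc."""
    defects = size_kloc * DEFECTS_PER_KLOC
    person_days = size_kloc * 1000 / LOC_PER_PERSON_DAY
    return defects, person_days


defects, person_days = project_estimate(100)  # hypothetical 100 KLoC system
print(f"fewer than {defects:.0f} defects, fewer than {person_days:.0f} person-days")
```

At those rates, a system of that size would ship with fewer than 4 defects for fewer than 4,000 person-days of effort.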
Maintenance as the central
activity
• A successful system will spend far more
time being used and maintained than being
developed.
• Our development methods and tools, and
our assessment and certification protocols,
should focus on safe and cost-effective
maintenance.
Licensing of independent safety
assessors
• Even with far better methods and tools,
safety assessment and certification will
continue to depend on judgement.
• We need to enforce standards of
competence (education, training and
experience) for the people whom society
trusts to take such decisions.
New-generation
Safe COTS components
• Most COTS components have not been
developed to be highly dependable and do
not come with the evidence needed to allow
adequate safety assessment.
• We could redevelop the entire suite of core
COTS components for a few $B.
• This would be a worthwhile focus for
international engineering collaboration.
Conclusion
• Current practices cannot be justified: they are unsafe
and/or too expensive. (Either way, not ALARP).
• Radical change must be created: progress is too slow.
• Software engineers need competence in mathematics
(discrete and continuous) and statistics. Core curriculum.
• All safety-related systems should be formally specified and
developed using fully-defined languages supported by
powerful static analysis tools. Not C or C++.
• Safety assessment should be based on the best practicable
evidence, evaluated by a licensed assessor.
• Core COTS components must be re-implemented properly
- or avoided.