Software Dependability Assessment:
A Reality or A Dream?
Karama Kanoun
The Sixth IEEE International Conference on Software Security and Reliability, SERE,
Washington D.C., USA, 20-22 June, 2012
1
[Figure: evolution of system availability (from 99% to 99.9999%) between 1950 and 2010, with complexity and economic pressure as driving factors, and the arrival of the Internet and cell phones. From J. Gray, 'Dependability in the Internet era']

Availability | Outage duration/yr
0.999999     | 32 s
0.99999      | 5 min 15 s
0.9999       | 52 min 34 s
0.999        | 8 h 46 min
0.99         | 3 d 16 h
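The outage durations follow directly from the availability levels; here is a minimal sketch of the arithmetic (assuming a 365-day year, which reproduces the table's figures up to rounding):

```python
# Yearly outage duration implied by a steady-state availability level
# (assuming a 365-day year; results match the table above up to rounding).

SECONDS_PER_YEAR = 365 * 24 * 3600

def outage_per_year(availability: float) -> str:
    downtime = (1.0 - availability) * SECONDS_PER_YEAR   # seconds of outage per year
    d, rem = divmod(downtime, 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    parts = [(int(d), "d"), (int(h), "h"), (int(m), "min"), (round(s), "s")]
    return " ".join(f"{v}{u}" for v, u in parts if v)

for a in (0.999999, 0.99999, 0.9999, 0.999, 0.99):
    print(f"{a}: {outage_per_year(a)}")
# 0.999999: 32s | 0.99999: 5min 15s | 0.9999: 52min 34s
# 0.999: 8h 45min 36s | 0.99: 3d 15h 36min
```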
2

Examples of Historical Failures
[Figure: failures classified by cause (physical, development, interaction), by impacted dependability property (availability/reliability, safety, confidentiality), and by extent (localized, distributed)]
 June 1980: False alerts at the North American Air Defence (NORAD)
 June 1985 - Jan. 1987: Excessive radiotherapy doses (Therac-25)
 Aug. 1986 - 1987: The "Wily Hacker" penetrates several tens of sensitive computing facilities
 15 January 1990: 9-hour outage of the long-distance phone network in the USA
 February 1991: Scud missed by a Patriot (Iraq, Gulf War)
 Nov. 1992: Crash of the communication system of the London Ambulance Service
 26-27 June 1993: Authorization denials of credit card operations, France
 4 June 1996: Failure of Ariane 5 first flight
 Feb. 2000: Distributed denial of service on large web sites: Yahoo, eBay, ...
 Aug. 2003: Electricity blackout in the USA and Canada
 Oct. 2006: 83,000 email addresses, credit card info, and banking transaction files stolen in the UK
 Aug. 2008: Air traffic control computing system failure (USA)
 Sep. 2009: Third service interruption of Gmail, Google's email service, during year 2009
3
Accidental faults: number of failures [consequences and outage durations depend upon the application]

Faults             | Dedicated computing systems | Controlled systems
                   | Rank | Proportion           | Rank | Proportion
Physical internal  |  3   | ~10%                 |  2   | 15-20%
Physical external  |  3   | ~10%                 |  2   | 15-20%
Human interactions |  2   | ~20%                 |  1   | 40-50%
Development        |  1   | ~60%                 |  2   | 15-20%

(Dedicated computing systems: e.g., transaction processing, electronic switching, Internet backend servers.
Controlled systems: e.g., civil airplanes, phone network, Internet frontend servers.)
4
Global Information Security Survey 2004 — Ernst & Young
Loss of availability: top ten incidents
Percentage of respondents that indicated the following incidents resulted in an unexpected or unscheduled outage of their critical business:
[Bar chart, percentages ranging from 0% to 80%]
• Hardware failures
• Major virus, Trojan horse, or Internet worms
• Telecommunications failure
• Software failure
• Third-party failure, e.g., service provider
• System capacity issues
• Operational errors, e.g., wrong software loaded
• Infrastructure failure, e.g., fire, blackout
• Former or current employee misconduct
• Distributed Denial of Service (DDoS) attacks
Non-malicious: 76%   Malicious: 24%
5
Why Software Dependability Assessment?
 User / customer
• Confidence in the product
• Acceptable failure rate
 Developer / Supplier
• During production
 Reduce # faults (zero defect)
 Optimize development
 Increase operational dependability
• During operation
 Maintenance planning
• Long term
 Improve software dependability of next generations
6
Approaches to Software Dependability Assessment
 Assessment based on software characteristics
• Language, complexity metrics, application domain, …
 Assessment based on measurements
• Observation of the software behavior
 Assessment based on controlled experiments
• Ad hoc vs. standardized benchmarking
 Assessment of the production process
• Maturity models
7
Outline of the Presentation
 Assessment based on software characteristics
• Language, complexity metrics, application domain, …
 Assessment based on measurements
• Observation of the software behavior
 Assessment based on controlled experiments
• Ad hoc vs. standardized benchmarking
 Assessment of the production process
• Maturity models
8
Software Dependability Assessment — Difficulties
 Corrections → non-repetitive process
 No relationship between failures and corrections
 Continuous evolution of usage profile
• According to the development phase
• Within a given phase
 Overselling of early reliability “growth” models
 Judgement on quality of the software developers
 What is software dependability → dependability measures?
 Number of faults, fault density, complexity?
 MTTF, failure intensity, failure rate?
9
Dependability Measures?
Static measures: complexity metrics, number of faults, fault density, ...
+ Usage profile & environment
→ Dynamic measures, characterizing the occurrence of failures and corrections: failure intensity, failure rate, MTTF, restart time, recovery time, availability, ...
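One standard relation among these dynamic measures, added here as a minimal sketch (the relation itself is textbook material, not taken from the slides): steady-state availability follows from MTTF and MTTR.

```python
# Steady-state availability from MTTF and MTTR, using the standard relation
# A = MTTF / (MTTF + MTTR). The numbers below are illustrative, not from the slides.

def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    return mttf_hours / (mttf_hours + mttr_hours)

# e.g. an MTTF of 10,000 h with an MTTR of 1 h gives roughly "four nines"
print(steady_state_availability(10_000, 1))   # 0.99990...
```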
10
Number of Faults vs MTTF
Percentage of faults and corresponding MTTF (published by IBM)

MTTF (years) | Product 1 |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9
5000         |   34.2    | 34.3 | 33.7 | 34.2 | 34.2 | 32.0 | 34.0 | 31.9 | 31.2
1580         |   28.8    | 28.0 | 28.5 | 28.5 | 28.5 | 28.2 | 28.5 | 27.1 | 27.6
500          |   17.8    | 18.2 | 18.0 | 18.7 | 18.4 | 20.1 | 18.5 | 18.4 | 20.4
158          |   10.3    |  9.7 |  8.7 | 11.9 |  9.4 | 11.5 |  9.9 | 11.1 | 12.8
50           |    5.0    |  4.5 |  6.5 |  4.4 |  4.4 |  5.0 |  4.5 |  6.5 |  5.6
15.8         |    2.1    |  3.2 |  2.8 |  2.0 |  2.9 |  2.1 |  2.7 |  2.7 |  1.9
5            |    1.2    |  1.5 |  1.4 |  0.3 |  1.4 |  0.8 |  1.4 |  1.4 |  0.5   (1.58 y ≤ MTTF ≤ 5 y)
1.58         |    0.7    |  0.7 |  0.4 |  0.1 |  0.7 |  0.3 |  0.6 |  1.1 |  0.0   (MTTF ≤ 1.58 y)
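The table's point, that the bulk of the faults would almost never be activated in operation, can be made concrete with a small illustrative computation (mine, not the slides'): weighting each fault class of Product 1 by 1/MTTF shows which classes dominate the failure intensity.

```python
# Relative contribution of each fault class to the overall failure intensity,
# using Product 1's percentages from the table above. Each class is weighted
# by 1/MTTF (a fault with a 5000-year MTTF is activated far less often than
# one with a 1.58-year MTTF). Illustrative sketch, not taken from the slides.

mttf_years = [5000, 1580, 500, 158, 50, 15.8, 5, 1.58]
pct_product1 = [34.2, 28.8, 17.8, 10.3, 5.0, 2.1, 1.2, 0.7]  # % of faults per class

weights = [p / m for p, m in zip(pct_product1, mttf_years)]
total = sum(weights)
for m, p, w in zip(mttf_years, pct_product1, weights):
    print(f"MTTF {m:>7} y: {p:>5.1f}% of faults -> {100 * w / total:4.1f}% of failure intensity")
# The ~2% of faults with MTTF <= 5 years account for well over half of the
# failure intensity, while the ~34% with MTTF = 5000 years contribute less than 1%.
```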
11
Assessment Based on Measurements
Data collection: times to failures / # failures, failure impact, failure origin, corrections
Data processing: descriptive statistics, correlations, trend analysis, modelling / prediction (non-stationary processes, stochastic models, model validation)
Outputs: failure modes, trend evolution, MTTF / failure rate, MTTR, availability
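To make the data-collection-to-outputs chain concrete, here is a minimal sketch with invented failure timestamps (none of these numbers come from the slides):

```python
# From collected failure times to a few of the outputs listed above:
# interfailure times, a simple MTTF estimate, and failure counts per interval.
# The timestamps are invented for illustration; units are hours of operation.

failure_times = [12, 40, 95, 130, 210, 305, 420, 555, 700, 870]

interfailure = [t2 - t1 for t1, t2 in zip(failure_times, failure_times[1:])]
mttf = sum(interfailure) / len(interfailure)          # mean time to failure
print("interfailure times:", interfailure)
print("MTTF estimate:", mttf, "h")

# A crude failure intensity: number of failures per 200-hour window.
window = 200
intensity = {}
for t in failure_times:
    intensity[t // window] = intensity.get(t // window, 0) + 1
print("failures per 200 h window:", dict(sorted(intensity.items())))
```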
12
Why Trend Analysis?
[Figure: failure intensity vs. time within a software version, with corrections producing successive intermediate versions Vi,1, Vi,2, ..., Vi,k]
13
Why Trend Analysis?
[Figure: failure intensity vs. time, with corrections producing successive versions Vi,1, ..., Vi,k, then changes (usage profile, environment, specifications, ...) leading to versions Vi+1,1, ..., Vi+1,4]
14
Example: Electronic Switching System
[Figure: failure intensity and number of systems in service over 31 months, spanning the validation and operation phases]
15
Electronic Switching System (Cont.)
[Figure: observed cumulative number of failures (up to about 220) over 31 months, spanning the validation and operation phases]
16
Electronic Switching System (Cont.)
 Hyperexponential model application → maintenance planning
[Figure: observed cumulative number of failures over 31 months (validation and operation), together with the retrodictive and predictive assessments of the hyperexponential model]
Observed # failures [20-32] = 33
Predicted # failures [21-32] = 37
17
Electronic Switching System (Cont.)
Failure intensity and failure rate in operation (for an average system)
[Figure: observed failure intensity over months 17-31 and the intensity estimated by the hyperexponential model]
Residual failure rate: 5.7 10^-5 /h
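A sketch of the kind of model named on these slides: the two-stage hyperexponential failure-rate form below is assumed from the literature around this work, not reproduced from the study itself; the parameter values are illustrative, except that the limiting rate is set to the residual failure rate quoted above (5.7 10^-5 /h).

```python
# Sketch of a two-parameter hyperexponential failure-rate model of the kind
# referred to on this slide: the failure rate decreases from an initial value
# towards a non-zero residual rate zeta_inf (reliability growth that levels off).
# The exact parametrisation of the original study is not given here; omega and
# zeta_sup are made-up values, zeta_inf is set to the slide's residual rate.

import math

def failure_rate(t, omega, zeta_sup, zeta_inf):
    """Hyperexponential failure rate lambda(t), converging to zeta_inf."""
    num = omega * zeta_sup * math.exp(-zeta_sup * t) + (1 - omega) * zeta_inf * math.exp(-zeta_inf * t)
    den = omega * math.exp(-zeta_sup * t) + (1 - omega) * math.exp(-zeta_inf * t)
    return num / den

def mean_failures(t, omega, zeta_sup, zeta_inf):
    """Expected cumulative number of failures: integral of the failure rate."""
    return -math.log(omega * math.exp(-zeta_sup * t) + (1 - omega) * math.exp(-zeta_inf * t))

omega, zeta_sup, zeta_inf = 0.3, 1e-3, 5.7e-5   # per hour; illustrative only
for hours in (0, 5_000, 20_000, 100_000):
    print(hours, failure_rate(hours, omega, zeta_sup, zeta_inf))
# The rate starts near omega*zeta_sup + (1-omega)*zeta_inf and tends to
# zeta_inf, which plays the role of the residual failure rate (5.7e-5/h here).
```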
18
Electronic Switching System (Cont.)
Failure intensity of the software components
[Figure: observed failure intensity and hyperexponential model fit over months 17-31, for each of the four components: Telephony, Defense, Interface, and Management]
19
Electronic Switching System (Cont.)
Failure intensity and failure rate in operation (for an average system)

Component  | Residual failure rate
Telephony  | 1.2 10^-6 /h
Defense    | 1.4 10^-5 /h
Interface  | 2.9 10^-5 /h
Management | 8.5 10^-6 /h
Sum        | 5.3 10^-5 /h

[Figure: observed failure intensity over months 17-31, intensity estimated by the hyperexponential model, and sum of the component intensities estimated by the model]
Residual failure rate (whole software): 5.7 10^-5 /h
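The "Sum" row is plain arithmetic on the component rates; a quick check (rates copied from the table above):

```python
# Residual failure rates of the four software components (per hour), from the
# table above; their sum is close to the 5.7e-5/h estimated for the whole software.
component_rates = {
    "Telephony":  1.2e-6,
    "Defense":    1.4e-5,
    "Interface":  2.9e-5,
    "Management": 8.5e-6,
}
print(sum(component_rates.values()))   # 5.27e-5 /h, i.e. about 5.3e-5 /h as on the slide
```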
20
Other Example: Operating System in Operation
Data = time to failure during operation
[Figure: u(i) vs. number of failures (1 to 181), oscillating around zero between about -2 and +2; trend evolution = stable dependability]
[Figure: mean time to failure vs. number of failures, values roughly between 100,000 and 300,000]
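Assuming u(i) denotes the Laplace trend factor commonly used for this kind of trend analysis (oscillation around zero, roughly within ±2, indicating stable reliability), here is a minimal sketch of its computation from interfailure times; the data are invented.

```python
# Laplace trend factor computed from successive interfailure times.
# Negative values suggest reliability growth, positive values reliability
# decrease, and oscillation around zero (roughly within [-2, +2]) a stable
# behaviour, as in the plot described above. This is a generic sketch; the
# exact u(i) used in the original study is assumed, not reproduced, here.

import math

def laplace_factor(interfailure_times):
    """u(n) for failure-truncated data: theta_1 .. theta_n between failures."""
    n = len(interfailure_times)
    if n < 2:
        raise ValueError("need at least two interfailure times")
    cum, total = [], 0.0
    for theta in interfailure_times:
        total += theta
        cum.append(total)          # cum[j-1] = time of the j-th failure
    t_n = cum[-1]
    mean_t = sum(cum[:-1]) / (n - 1)
    return (mean_t - t_n / 2) / (t_n * math.sqrt(1.0 / (12 * (n - 1))))

# Hypothetical data: roughly constant interfailure times -> u stays small.
times = [120, 95, 130, 110, 105, 125, 98, 115]
print([round(laplace_factor(times[:k]), 2) for k in range(2, len(times) + 1)])
```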
21
Validity of Results
Early Validation: trend analysis → development follow-up
End of Validation: trend analysis + assessment (• operational profile? • enough data?)
Operation: trend analysis + assessment → high relevance

Assessment limits: 10^-3/h to 10^-4/h
Examples:
E10-B (Alcatel ESS): 1400 systems, 3 years; λ = 5 10^-6/h, λc = 10^-7/h
Nuclear I&C systems: 8000 systems, 4 years; λ: 3 10^-7/h → 10^-7/h, λc = 4 10^-8/h
22
Research Gaps
 Applicability to new classes of systems
• Service-oriented systems
• Adaptive and dynamic software systems → on-line assessment
 Industry involvement
• Confidentiality of real-life data
• Cost (perceptible overhead, invisible immediate benefits)
 Case of Off-The-Shelf software components?
 Applicability to safety critical systems
• During development
 Accumulation of experience → software process improvement → assessment of the software process
23
Off-The-Shelf software components —
Dependability Benchmarking
 No information available from software development
 Evaluation based on controlled experimentation: ad hoc vs. standard
 Dependability benchmarking: evaluation of dependability measures / features in a non-ambiguous way → comparison
 Properties: reproducibility, repeatability, portability, representativeness, acceptable cost
24
Benchmarks of Operating Systems
[Figure: which OS (Linux, Mac, Windows) for my computer system?]
 Limited knowledge: functional description
 Limited accessibility and observability
 Black-box approach → robustness benchmark
25
Robustness Benchmarks
[Figure: application issuing calls to the operating system through its API, with device drivers and hardware below; faults injected at the API level]
Faults = corrupted system calls; observations = OS outcomes
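A generic sketch of what one robustness-benchmark experiment looks like (this illustrates the principle only, not the harness actually used for the results that follow): issue a call through the OS API with a corrupted parameter and classify the outcome.

```python
# Generic sketch of a robustness-benchmark experiment: call an OS interface
# with one corrupted parameter and classify the outcome. The real benchmarks
# behind these slides used a dedicated harness on the Windows and Linux APIs,
# which is not reproduced here.

import os

CORRUPTED_FDS = [-1, 10**6, 0x7FFFFFFF]   # invalid or out-of-range descriptors

def one_experiment(fd):
    """Issue os.read() with a corrupted file descriptor and record the outcome."""
    try:
        os.read(fd, 16)
        return "no error reported"                     # call unexpectedly succeeded
    except OSError as exc:
        return f"error code returned ({exc.errno})"    # robust handling
    except Exception as exc:                           # any other behaviour
        return f"exception: {type(exc).__name__}"
    # hangs and kernel crashes would be detected by an external watchdog,
    # which this sketch does not implement

results = {fd: one_experiment(fd) for fd in CORRUPTED_FDS}
print(results)
```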
26
OS Response Time
[Figure: OS response time, on a 0-700 µs scale, for Windows (NT4, 2000, XP, NT4 Server, 2000 Server, 2003 Server) and Linux (2.2.26, 2.4.5, 2.4.26, 2.6.6), without corruption and in the presence of corrupted system calls]
27
Mean Restart Time
[Figure: mean restart time (0-120 seconds) for Windows (NT4, 2000, XP, NT4 Server, 2000 Server, 2003 Server) and Linux (2.2.26, 2.4.5, 2.4.26, 2.6.6), without corruption and in the presence of corrupted system calls]
28
Detailed Restart Time
[Figure: restart time (seconds) for each experiment, for Windows XP (with occasional long restarts annotated "check disk") and Linux 2.2.26]
29
More on Windows family
Impact of application state after failure
[Figure: restart time (seconds) per experiment for Windows NT4, 2000, and XP]
30
Benchmark Characteristics and Limitations
 A benchmark should not replace software test and validation
 Non-intrusiveness → robustness benchmarks (faults injected outside the benchmark target)
 Make use of available inputs and outputs → impact on measures
 Balance between cost and degree of confidence
 # dependability benchmark measures >> # performance benchmark measures
31
Maturity of Dependability Benchmarks
Dependability benchmarks      | Performance benchmarks
• Infancy                     | • Mature domain
• Isolated work               | • Cooperative work
• Not explicitly addressed    | • Integrated to system development
• Acceptability?              | • Accepted by all actors for competitive system comparison
• "Ad hoc" benchmarks         | • "Competition" benchmarks
Maturity??
32
Software Process Improvement (SPI)
[Diagram: data collection (measurements, controlled experiments, data related to similar projects) → data processing, guided by the objectives of the analysis → measures → feedback to the software development process, capitalizing on experience]
33
Examples of Benefits from SPI Programs
 AT&T(quality program):
Customer reported problems (maintenance program) divided by 10
System test interval divided by 2
New product introduction interval divided by 3
 Fujitsu (concurrent development process):
Release cycle reduction = 75%
 Motorola (Arlington Heights), mix of methods:
Fault density reduction = 50% within 3.5 years
 Raytheon (Electronic Systems), CMM:
Rework cost divided by 2 after two years of experience
Productivity increase = 190%
Product quality: multiplied by 4
34
Cost
[Figure: total cost vs. dependability, with and without process improvement approaches; cost components: basic development cost, cost of dependability, cost of scrap / rework]
Process improvement → Dependability improvement & cost reduction!
35
36
Software Dependability Assessment:
A Reality or A Dream?
Karama Kanoun
The Sixth IEEE International Conference on Software Security and Reliability, SERE,
Washington D.C., USA, 20-22 June, 2012
37