Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains
Hichem Boudali (1), Pepijn Crouzen (2), and Mariëlle Stoelinga (1)
(1) Formal Methods and Tools group, CS, University of Twente, NL.
(2) Dependable Systems and Software group, CS, Saarland University, Germany
May 9, 2008
IPA Lentedagen, Rhenen
Introduction: Dependability
Dependability: The trustworthiness of a computer system such that reliance can justifiably be placed upon the service it delivers.
Reliability: The probability that a computer system does not fail within a given time bound.
Introduction: Formal dependability
• Continuous-time Markov chains (CTMC)
• States and Markovian transitions
• Probability of traversing a λ-transition within t time-units is: $1 - e^{-\lambda t}$
• Tools: Reachability analysis (among others)
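As a small numeric illustration of the transition probability formula, the sketch below evaluates it for one rate and time bound. The function name `transition_probability` and the sample values are illustrative assumptions, not part of the talk.

```python
import math

def transition_probability(rate: float, t: float) -> float:
    """Probability that an exponential delay with the given rate
    elapses within t time units, i.e. 1 - exp(-rate * t)."""
    if rate < 0 or t < 0:
        raise ValueError("rate and t must be non-negative")
    # -expm1(-x) computes 1 - exp(-x) accurately, also for small x.
    return -math.expm1(-rate * t)

# Hypothetical example: rate 0.5 per hour, time bound 2 hours.
p = transition_probability(0.5, 2.0)
```

The probability is 0 at t = 0 and approaches 1 as t grows, which is why a time bound is needed for reachability questions to be meaningful.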
[Figure: example CTMC with transitions labeled λ and μ]
Introduction: CTMC characteristics
• CTMCs describe probability distributions (phase-type distributions)
• Phase-type distributions can approximate any distribution on the non-negative reals arbitrarily closely
• Goal: Find a CTMC which describes the probability of system failure within t time-units (i.e. the unreliability of the system)
• Problem: Difficult to find the CTMC that models a large system
Introduction: Engineering dependability
• Fault Trees (1960's)
• Graphical
• Easy to use
• Syntax:
  ◦ Basic events
  ◦ Gates
• Semantics: logical formula
• Problem: Not expressive enough
[Figure: example fault tree with top event "Workstation fails", an OR gate, an AND gate, and basic events "CPU fails", "Mem1 fails", "Mem1 fails"]
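The logical-formula semantics of a static fault tree can be illustrated on the workstation example. This is a minimal sketch, assuming the tree reads "the workstation fails if the CPU fails OR both memory units fail" (inferred from the figure labels) and that basic events fail independently. All function and parameter names are hypothetical.

```python
def workstation_fails(cpu_failed: bool, mem_a_failed: bool, mem_b_failed: bool) -> bool:
    """Top event as a logical formula: CPU OR (memory A AND memory B)."""
    return cpu_failed or (mem_a_failed and mem_b_failed)

def workstation_failure_probability(p_cpu: float, p_mem_a: float, p_mem_b: float) -> float:
    """Probability of the top event, assuming independent basic events."""
    p_and = p_mem_a * p_mem_b                     # AND gate: both inputs failed
    return 1.0 - (1.0 - p_cpu) * (1.0 - p_and)    # OR gate: at least one input failed

# Hypothetical basic-event failure probabilities.
p_top = workstation_failure_probability(0.1, 0.5, 0.5)
```

The formula has no notion of order or of spares taking over, which is the expressiveness gap that dynamic fault trees address.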
Introduction: Engineering dependability
• Dynamic Fault Trees (1992)
  ◦ Extension of classic fault trees
  ◦ Additions:
    ▪ Use of spares
    ▪ Dependencies
    ▪ Order-based failure
  ◦ Tools:
    ▪ Convert to CTMC
[Figure: example DFT with labels "System failure", OR, SPARE, S, P1, P2]
But… DFT Drawbacks
• Scalability
• Ambiguous syntax and semantics
• Lack of modularity:
  ◦ Dynamic modules cannot be reused
  ◦ Restrictions on spares and dependencies
• Existing analysis technique is hard to extend or modify
Outline
• Case study: FTPP system
  ◦ DFT approach
• Formalizing DFTs
  ◦ DFT semantics in I/O-IMCs
  ◦ Deep compositionality
• Extending the DFT formalism
• Conclusion
  ◦ Future work
Case study: FTPP
• 16 processors divided into 4 groups
• 4 network elements connect the processors
• Per group 2 processors must be operational
• Different configurations are possible
[Figure: FTPP architecture with network elements NE1 to NE4, each attached to four processors labeled A, B, C, D]
Case study: FTPP (continued)
• Dynamic redundancy management is possible
• How reliable is each configuration?
[Figure: FTPP configuration with processors labeled A, B, C and spares S (formerly D) around network elements NE1 to NE4]
FTPP DFT
[Figure: DFT of the FTPP system. "System Failure" is an OR over "Group 1 Failure" to "Group 4 Failure"; each group failure is a 2/3 gate over processors A, B, C with spares S; FDEP gates tie the processors to network elements NE1 to NE4]
Existing DFT analysis [Dugan et al. 1992]
• For static fault trees binary decision diagrams can be used!
• Otherwise: Convert the DFT into a CTMC.
• Analyze CTMC using standard solution techniques.
Example: AND-gate C over basic events A (failure rate 0.2 f/h) and B (failure rate 0.4 f/h)
• Starting state: A is operational, B is operational
• With rate 0.2 to "A has failed, B is operational"; with rate 0.4 to "A is operational, B has failed"
• From "A has failed, B is operational" with rate 0.4, and from "A is operational, B has failed" with rate 0.2, to "A has failed, B has failed"
• Pr(A fails in T hours) = $1 - e^{-0.2 \cdot T}$
• A's mean time to failure = 1/0.2 = 5 hours
• Unreliability = Prob[reaching "A has failed, B has failed" in time T]
But… State space explosion: CTMC grows exponentially, FTPP difficult to analyze
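The AND-gate example can be checked numerically. This is a minimal sketch, assuming uniformization as the solution technique (the talk only says "standard solution techniques") and a hypothetical mission time of T = 10 hours. Helper names are illustrative. Because A and B fail independently here, the result should agree with the closed form $(1 - e^{-0.2 T})(1 - e^{-0.4 T})$.

```python
import math

def transient_distribution(rates, initial, t, eps=1e-12):
    """Transient state probabilities of a CTMC at time t via uniformization.

    rates:   dict mapping (source, target) state indices to a transition rate
    initial: list with the initial probability of each state
    """
    n = len(initial)
    exit_rate = [0.0] * n
    for (src, _dst), rate in rates.items():
        exit_rate[src] += rate
    q = max(exit_rate)                      # uniformization rate
    if q == 0.0 or t == 0.0:
        return list(initial)

    def step(vec):
        # One step of the uniformized discrete-time chain P = I + Q/q.
        out = [vec[s] * (1.0 - exit_rate[s] / q) for s in range(n)]
        for (src, dst), rate in rates.items():
            out[dst] += vec[src] * rate / q
        return out

    weight = math.exp(-q * t)               # Poisson probability of 0 jumps
    vec = list(initial)
    result = [weight * p for p in vec]
    covered, k = weight, 0
    while covered < 1.0 - eps and k < 100_000:
        k += 1
        vec = step(vec)
        weight *= q * t / k                 # Poisson probability of k jumps
        covered += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result

# States: 0 = A and B operational, 1 = A failed, 2 = B failed, 3 = both failed.
RATE_A, RATE_B = 0.2, 0.4                   # failures per hour, from the slide
and_gate_ctmc = {(0, 1): RATE_A, (0, 2): RATE_B, (1, 3): RATE_B, (2, 3): RATE_A}

T = 10.0                                    # hypothetical mission time in hours
unreliability = transient_distribution(and_gate_ctmc, [1.0, 0.0, 0.0, 0.0], T)[3]
closed_form = (1 - math.exp(-RATE_A * T)) * (1 - math.exp(-RATE_B * T))
```

Two basic events already give four states; n independent basic events give up to 2^n, which is the exponential growth the slide warns about.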
FTPP Results
Analysis method | Max number of states | Max number of transitions | Unreliability (T=10)
Standard        | 32757                | 426826                    | 2.55479 · 10^-8
Compositional   | 1325                 | 14153                     | 2.55479 · 10^-8
What's behind it?
• Model local behavior
• We need compositional Markov chains
• Input/Output Interactive Markov Chains (I/O-IMC): combination of LTS and CTMC, with I/O automata features
  ◦ Markovian transitions (CTMC)
  ◦ Interactive transitions (LTS)
  ◦ Action signature (IOA)
    ▪ ? - Input actions
    ▪ ! - Output actions
    ▪ ; - Internal actions
[Figure: I/O-IMC for a basic event, with a Markovian transition of rate λ followed by the output action failed!]
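One possible way to represent an I/O-IMC and the basic-event model in code is sketched below. The class `IOIMC`, its field names, the state names and the helper `basic_event` are all hypothetical. The output is named f(A), following the notation of the later slides, rather than "failed".

```python
from dataclasses import dataclass, field

@dataclass
class IOIMC:
    """Hypothetical container for an I/O-IMC; field names are illustrative."""
    initial: str
    inputs: set = field(default_factory=set)       # actions written a?
    outputs: set = field(default_factory=set)      # actions written a!
    internals: set = field(default_factory=set)    # actions written a;
    interactive: set = field(default_factory=set)  # (source, action, target)
    markovian: list = field(default_factory=list)  # (source, rate, target)

    def states(self) -> set:
        """All states mentioned by the initial state or any transition."""
        found = {self.initial}
        for src, _label, dst in list(self.interactive) + self.markovian:
            found.update((src, dst))
        return found

def basic_event(name: str, rate: float) -> IOIMC:
    """Basic event: wait an exponential delay, then announce the failure."""
    signal = f"f({name})"
    return IOIMC(
        initial="operational",
        outputs={signal},
        markovian=[("operational", rate, "failing")],
        interactive={("failing", signal, "failed")},
    )

event_a = basic_event("A", 0.2)   # failure rate of A taken from the AND-gate example
```

Markovian transitions are kept in a list rather than a set so that two parallel transitions with the same rate are not merged by accident.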
Input/Output Interactive Markov Chains
• Properties of IMCs:
  ◦ Combines stochastic behavior and interactive behavior orthogonally
  ◦ CSP-style synchronization + interleaving semantics
  ◦ Maximal progress for internal transitions
• Properties of I/O-IMCs:
  ◦ Unique outputs
  ◦ Input enabledness
  ◦ Outputs cannot be blocked!
  ◦ Maximal progress for output transitions
[Figure: a state offering both an internal transition τ and a Markovian transition λ]
DFT semantics
DFT gate to I/O-IMC
[Figure: I/O-IMCs of DFT gates with input actions f(A)? and f(B)? and output action f(C)!]
What is deep compositionality?
• Semantics of a DFT arises naturally as composition of the semantics of its building blocks
• But: This may lead to huge models.
[Figure: "Group 1 Failure" (2/3 gate over A, B, C with spare S) as a building block with signals f(G1) and f(NE1), …, f(NE4)]
Why use deep compositionality?
• Formally define semantics
• Many useful techniques
  ◦ Combining models: Composition
  ◦ Refining models: Hiding
  ◦ Minimizing models: Bisimulation
  ◦ Reusing models: Renaming
  → Combat state-space explosion
• Well supported by CADP toolset (VASY/INRIA)
Compositional Aggregation
[Figure: workflow. Translation → Composition + Abstraction → Aggregation (minimization) → Repeat → Aggregated system CTMC (CTMDP) → Analysis → Result: System failure probability]
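The control flow of the workflow can be sketched as a simple loop. This is only a skeleton under stated assumptions: the operations `compose`, `hide`, `minimize` and `size` are passed in as callables, the left-to-right composition order is an arbitrary choice, and the toy stand-ins at the bottom (models reduced to a state count, minimization pretending to halve it) are purely illustrative.

```python
def compositional_aggregation(models, compose, hide, minimize, size):
    """Fold the models into one, minimizing after every composition.

    Returns the final model and the largest intermediate size encountered."""
    models = list(models)
    if not models:
        raise ValueError("need at least one model")
    current = minimize(models[0])
    peak = size(current)
    for model in models[1:]:
        composed = hide(compose(current, model))   # composition + abstraction
        peak = max(peak, size(composed))
        current = minimize(composed)               # aggregation
    return current, peak

# Toy stand-ins: a "model" is just its number of states.
final, peak = compositional_aggregation(
    models=[4, 3, 5],
    compose=lambda a, b: a * b,            # product state space
    hide=lambda m: m,                      # hiding does not change the size
    minimize=lambda m: max(1, m // 2),     # pretend minimization halves the model
    size=lambda m: m,
)
```

In this toy run the largest intermediate model has 15 states, whereas composing everything first would give 4 · 3 · 5 = 60; the peak size is the quantity the "Max number of states" column in the results refers to.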
Compositional Aggregation: Example
[Figure: AND-gate C with inputs f(A)?, f(B)? and output f(C)!; basic event A with failure rate 0.2 f/h and output f(A)!; basic event B with failure rate 0.4 f/h and output f(B)!]
Compositional Aggregation: Parallel Composition
• C (states 1 to 5): Inputs: f(A)? and f(B)?; Outputs: f(C)!
• A (states 1 to 3): Inputs: none; Outputs: f(A)!
• C||A: Synchronize on f(A)
[Figure: C||A with states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3, 5||3, Markovian transitions with rate 0.2, and interactive transitions f(A)!, f(B)?, f(C)!]
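The composition C||A can be reproduced with a small reachability-based sketch. The representation (a named tuple with hypothetical field names) and the function names are assumptions, as is the exact shape of the AND-gate model, which is reconstructed from the state labels in the figure. Input-enabledness is handled by treating a missing input transition as an implicit self-loop.

```python
from collections import namedtuple

# Hypothetical representation: interactive is a set of (src, action, dst),
# markovian is a list of (src, rate, dst).
IOIMC = namedtuple("IOIMC", "initial inputs outputs internals interactive markovian")

def actions(m):
    return m.inputs | m.outputs | m.internals

def successors(m, state, action):
    """Targets of `action` from `state`; an input without an explicit
    transition is an implicit self-loop (input-enabledness)."""
    targets = {d for (s, a, d) in m.interactive if s == state and a == action}
    if not targets and action in m.inputs:
        targets = {state}
    return targets

def states(m):
    found = {m.initial}
    for src, _label, dst in list(m.interactive) + m.markovian:
        found.update((src, dst))
    return found

def compose(p, q):
    """Parallel composition: synchronize on shared actions, interleave the rest."""
    pa, qa = actions(p), actions(q)
    assert not (p.outputs & q.outputs), "outputs must be unique"
    assert not (p.internals & qa) and not (q.internals & pa), "internals must be private"
    outputs = p.outputs | q.outputs
    inputs = (p.inputs | q.inputs) - outputs
    initial = (p.initial, q.initial)
    interactive, markovian = set(), []
    seen, todo = {initial}, [initial]
    while todo:
        state = todo.pop()
        s, t = state
        moves = []
        for action in sorted(pa | qa):
            p_next = successors(p, s, action) if action in pa else {s}
            q_next = successors(q, t, action) if action in qa else {t}
            for target in [(s2, t2) for s2 in p_next for t2 in q_next]:
                if target == state and action in inputs:
                    continue                       # keep input self-loops implicit
                interactive.add((state, action, target))
                moves.append(target)
        for (src, rate, dst) in p.markovian:       # Markovian transitions interleave
            if src == s:
                markovian.append((state, rate, (dst, t)))
                moves.append((dst, t))
        for (src, rate, dst) in q.markovian:
            if src == t:
                markovian.append((state, rate, (s, dst)))
                moves.append((s, dst))
        for target in moves:
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return IOIMC(initial, inputs, outputs, p.internals | q.internals, interactive, markovian)

and_gate = IOIMC(1, {"f(A)", "f(B)"}, {"f(C)"}, set(),
                 {(1, "f(A)", 2), (1, "f(B)", 3), (2, "f(B)", 4),
                  (3, "f(A)", 4), (4, "f(C)", 5)}, [])
event_a = IOIMC(1, set(), {"f(A)"}, set(), {(2, "f(A)", 3)}, [(1, 0.2, 2)])
c_a = compose(and_gate, event_a)
```

With these inputs the reachable states of `c_a` are the seven pairs listed for C||A, f(A) has become an output of the composite, and f(B) remains its only input.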
Compositional Aggregation: Abstraction (hiding)
• Abstraction (hiding): Makes signal internal
• In C||A the output f(A)! becomes the internal action f(A);
[Figure: C||A after hiding f(A), with states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3, 5||3]
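Hiding only changes the action signature, which a few lines can illustrate. This is a minimal sketch: the function name `hide` and the set-based representation are hypothetical, and restricting hiding to output actions is an assumption of the sketch.

```python
def hide(outputs, internals, interactive, hidden):
    """Turn the output actions in `hidden` into internal actions.

    Transitions keep their action name; only the signature changes, which is
    what the slide writes as f(A)! becoming f(A);."""
    hidden = set(hidden)
    if not hidden <= set(outputs):
        raise ValueError("only output actions can be hidden")
    return set(outputs) - hidden, set(internals) | hidden, set(interactive)

# Signature of C||A before hiding; state pairs follow the slide's labels.
transitions = {((1, 2), "f(A)", (2, 3)), ((3, 2), "f(A)", (4, 3)),
               ((4, 3), "f(C)", (5, 3))}
new_outputs, new_internals, new_transitions = hide(
    {"f(A)", "f(C)"}, set(), transitions, {"f(A)"})
```

Once f(A) is internal no other component can synchronize on it, so it should only be hidden after every model that listens to f(A) has been composed in.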
Compositional Aggregation: Aggregation (weak bisimulation)
• Aggregation: Finding a smaller model equivalent (behaviorally) to the original
• Weak bisimulation: Disregard internal steps
[Figure: C||A with internal f(A); steps, states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3, 5||3, before aggregation]
Compositional Aggregation: Example (continued)
[Figure: aggregated C||A (states 1 to 5, rate 0.2, actions f(B)? and f(C)!) composed with B (states 1 to 3, rate 0.4, action f(B)!) into C||A||B, with rates 0.2 and 0.4 and actions f(B)! and f(C)!]
Compositional Aggregation: Example (continued)
[Figure: final aggregated C||A||B, with Markovian transitions of rates 0.2 and 0.4 followed by the output f(C)!]
DFT extensions
• Extensions:
  ◦ Inhibition
  ◦ Repair-policies
  ◦ Complex spares
  ◦ Complex dependencies
  ◦ …
  (callouts: DSN07; Free!)
• Adding extensions in the compositional framework is easy:
  ◦ Modify translation of DFT building blocks
  ◦ Compositional aggregation algorithm is unaltered
Extension: Repair
[Figure: I/O-IMC of basic event A with repair (rates λ and µ, outputs f(A)! and r(A)!) and I/O-IMC of AND-gate C with repair (inputs f(A)?, f(B)?, r(A)?, r(B)?; outputs f(C)! and r(C)!)]
Conclusion: How we tackled drawbacks
• State-space explosion. → Compositional Aggregation
• Ambiguous syntax and semantics. → DAG; Formal translation; I/O-IMC
• Lack of modularity:
  ◦ Dynamic modules cannot be reused. → Renaming!
  ◦ Restrictions on spares and dependencies. → Lifted!
• Existing analysis technique is hard to extend and/or modify. → Extensions at the lowest level
Future work
• Fully automated tool (CORAL)
• More aggressive state reduction
  ◦ Recent work: specialized acyclic algorithm
• Apply deep compositionality to more advanced engineering formalisms! (see Boudali et al., DSN08)
• Extend DFT formalism
  ◦ Repair
  ◦ Failure modes
  ◦ Non-exponential failure distributions
  ◦ Sophisticated dependencies
The end!
Questions?