A Scalable Approach to Architectural-Level Reliability Prediction
Leslie Cheung
Joint work with Leana Golubchik and Nenad Medvidovic
Motivation
• Many design decisions are made early in the software development process
– These decisions affect software quality
• Need to assess software quality early
– If problems are discovered later (e.g., after implementation), they may be costly to address
Motivation
• In this talk, we focus on assessing software reliability using architectural models
– Reliability: the fraction of time that the system operates correctly
– Architectural models: describe system structure, behavior, and interactions
Case Study: MIDAS
• Goal: measure room temperature and adjust it to a user-specified threshold by turning the AC on/off
• Sensor: measures the temperature and sends the measured data to a Gateway
• Gateway: aggregates and translates the data and sends it to a Hub
• Hub: determines whether it should turn the AC on or off
• AC: controls the air conditioner
• GUI: displays the current temperature and lets users change thresholds
[Diagram: Sensor1 and Sensor2 connect to the Gateway, which connects to the Hub; the Hub connects to the GUI and the AC]
Motivation
[Diagram: MIDAS components — GUI, Sensor1, Sensor2, Gateway, Hub, AC]
• Existing approaches for concurrent systems keep track of the states of all components
• MIDAS example
– State: (Sensor1, Sensor2, Gateway, Hub, GUI, AC)
– e.g., (Taking Measurements, idle, idle, idle, Processing User Request, idle)
– A failure changes that component's entry, e.g., (Failed!, idle, idle, idle, Processing User Request, idle)
• The state space explodes as the system grows
– e.g., 2 Gateways with 10 Sensors each: >5000 states
– 100s of Sensors and Gateways? The models are too big to solve
The SHARP Framework
• SHARP: Scalable, Hierarchical, Architectural-Level Reliability Prediction framework
• Idea: generate part of the system model at a time by leveraging use-case scenarios
– Solving many smaller models is more efficient than solving one huge model
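The scalability argument can be sketched with back-of-the-envelope state counts. This is purely illustrative, not SHARP's actual models: it assumes a flat model tracks every component's state simultaneously (so its state space is the product of per-component state counts), while a scenario submodel only tracks the few components that scenario involves. All component and state counts below are hypothetical.

```python
# Illustrative state-count comparison (hypothetical numbers, not MIDAS's
# actual models): a flat model's state space is the product of component
# state counts; scenario submodels are solved separately and stay small.

def flat_states(component_states):
    """State count of one flat model over all components."""
    n = 1
    for s in component_states:
        n *= s
    return n

def scenario_states(scenarios):
    """Total states across per-scenario submodels, each solved on its own."""
    return sum(flat_states(sc) for sc in scenarios)

# Hypothetical setup: 4 states per component, 25 components in total.
components = [4] * 25
print(flat_states(components))           # 4**25, about 1.1e15 states

# Three scenario submodels, each involving only a few components,
# e.g., (Sensor, Gateway, Hub), (GUI, Hub), (Hub, AC):
scenarios = [[4, 4, 4], [4, 4], [4, 4]]
print(scenario_states(scenarios))        # 96 states
```

The flat model is astronomically larger even though each individual number is tiny, which is why solving many small submodels wins.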
MIDAS Use-Case Scenarios
• MIDAS example: three scenarios
– Scenario 1: Sensor Measurement (Sensor, Gateway, Hub): E1: SensorMeasurement, E2: GWMeasurement, E3: HubAckGW, E4: GWAckSensor
– Scenario 2: GUI Request (GUI, Hub): E5: GUIRequest, E6: GUIAck
– Scenario 3: Control AC (Hub, AC): E7: ChangeACTemp
The SHARP Framework
• Modeling concurrency: instances of scenarios may run simultaneously
• MIDAS example
– Processing a GUI request while processing sensor measurements: the Sensor Measurement and GUI Request scenarios run simultaneously
– Multiple sensors: multiple instances of the Sensor Measurement scenario
The SHARP Framework
1. Generate and solve submodels according to the system's use-case scenarios
2. Generate and solve a coarser-level model for system reliability
– Describe what happens when multiple instances of scenarios are running
– Make use of results from the submodels
The SHARP Framework
[Diagram: each scenario's sequence diagram is turned into a submodel and solved, yielding per-scenario results: (m1, R1) for Sensor Measurement, (m2, R2) for GUI Request, and (m3, R3) for Control AC]
The SHARP Framework
1. Generate and solve submodels according to the system's use-case scenarios
2. Generate and solve a coarser-level model for system reliability
– Describe the number of active instances of each scenario
– Make use of results from the submodels
The SHARP Framework
[Diagram: the submodel results (m1, R1), (m2, R2), and (m3, R3) feed the coarser-level model, which yields the system reliability R]
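The composition step can be sketched as follows. This is a deliberate simplification, not SHARP's actual coarser-level model: it assumes each scenario submodel yields a per-instance reliability R_i and a number of active instances m_i, and that scenario instances fail independently. All numbers are hypothetical.

```python
# Illustrative composition of submodel results into a system reliability
# (a simplification, not SHARP's actual coarser-level model): assumes
# independent scenario instances, so the system survives only if every
# active instance of every scenario succeeds.

def system_reliability(submodels):
    """submodels: list of (m_i, R_i) pairs, one per scenario, where m_i is
    the number of active instances and R_i the per-instance reliability."""
    r = 1.0
    for m, R in submodels:
        r *= R ** m
    return r

# Hypothetical results for the three MIDAS scenarios:
# (Sensor Measurement: 10 instances, GUI Request: 1, Control AC: 1)
results = [(10, 0.999), (1, 0.995), (1, 0.998)]
print(round(system_reliability(results), 4))  # 0.9831
```

The real coarser-level model also captures how the number of active instances evolves over time; this sketch only shows how the submodel outputs plug into a system-level figure.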
Evaluation
• Goals
– Show that SHARP is more scalable than a flat model derived from existing approaches
– Show that SHARP is accurate, using results from the flat model as "ground truth"
• Experiments
– Computational cost in practice
– Sensitivity analysis
Computational Cost in Practice
• Example: MIDAS system, varying the number of Sensor components (x-axis)
• y-axis: number of operations needed to solve the model
Sensitivity Analysis
• We are primarily interested in what-if analysis
– Is Architecture A "better" than Architecture B?
• but not in questions such as
– Will my system's reliability be greater than 90%?
– What is the probability that I can run my system for 100 hours without a failure?
• Focusing on trends is meaningful at the architectural level
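A what-if comparison of this kind can be sketched with toy models. Both architectures and all numbers below are hypothetical; the point is that the design decision rests on the trend across failure rates, not on any absolute reliability value.

```python
# Purely illustrative what-if sweep: compare two hypothetical sensor
# configurations by how their (toy) reliability degrades as the Sensor
# failure probability grows.

def arch_a(p_fail):
    # Hypothetical Architecture A: 10 sensors, all must work.
    return (1 - p_fail) ** 10

def arch_b(p_fail):
    # Hypothetical Architecture B: 12 sensors, tolerates any single
    # failure (at least 11 of 12 must work).
    q = 1 - p_fail
    return q ** 12 + 12 * p_fail * q ** 11

for p in (0.001, 0.01, 0.05):
    print(p, arch_a(p), arch_b(p))
# Across the whole sweep, B stays above A: the redundant design is
# "better" regardless of the exact failure rate, which is the kind of
# trend-level conclusion architectural analysis targets.
```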
Sensitivity Analysis
• "Ground truth": results from the flat model
• Vary the Sensor failure rate
Conclusions
• Assessing software quality early is desirable
• Scalability is a major challenge in reliability prediction of concurrent systems using architectural models
• We address this challenge in SHARP by leveraging a system's use-case scenarios
• Future work: contention modeling
– Work thus far assumes no contention
– However, concurrency introduces contention
Defects
• Architectural: mismatches between architectural models
– e.g., an interaction protocol mismatch between 2 components
• System: limitations of components
– e.g., a Sensor has limited power
• Allow system designers to evaluate how much reliability will improve if defects are addressed
– Cost