Software Reliability
ECE-355 Tutorial
Jie Lian
ECE355 Fall 2004

Outline
• Part I: Software Reliability Model
  – Musa’s Basic Model
  – Musa/Okumoto Logarithmic Model
• Part II: Control Flow Graph

Definition of Software Reliability
• Reliability is usually defined in terms of a statistical measure for the failure-free operation of a software system.
• Software reliability is a measure of the probability of a software failure occurring.
• Two terms related to software reliability:
  – Fault: a defect in the software, e.g. a bug in the code, which may cause a failure.
  – Failure: a deviation of the program's observed behavior from the required behavior.

Parameters of Software Reliability
• Average total number of failures $\mu(t)$: the average refers to n independent instantiations of identical software.
• Failure intensity $\lambda(t)$: number of failures per time unit, the derivative of $\mu(t)$.
• Mean Time To Failure (MTTF): $\mathrm{MTTF}(t) = \frac{1}{\lambda(t)}$
• t may denote elapsed execution, calendar, or machine clock time.

Importance of Software Reliability
• In safety-critical systems, certain failures are fatal. This requires pushing reliability to very high levels at very high cost (code redundancy, hardware redundancy, recovery blocks, n-version programming, …).
• In non-safety-critical systems, a certain failure rate is usually tolerable.
  – This is a question of quality of service.
  – Which failure rate is tolerable is mainly a question of customer acceptance (e.g. a customer lifts the receiver and receives neither fast busy nor dial tone once every 10/10000 calls?).
• We will only talk about non-safety-critical systems.

Software Reliability Growth Model (SRG)
• Purpose of SRG models: they rely on the observation of failure occurrences and try to predict future failure behavior.
• Two SRG models are covered here (approximately 40 models exist in total):
  – Musa’s basic (linear) model
  – Musa/Okumoto logarithmic model

Basic Assumptions of Musa’s Model
• Faults are independent and distributed with a constant rate of encounter.
• Instruction types are well mixed, and the execution time between failures is large compared to the instruction execution time.
• The test space covers the use space (tests are selected from a complete set of use input sets).
• The set of inputs for each run is selected randomly.
• All failures are observed (implied by definition).
• The fault causing a failure is corrected immediately; otherwise, reoccurrence of that failure is not counted.

Musa’s Basic Model
• Assumption: the decrement in failure intensity per failure experienced is constant.
• Consequence: failure intensity is a function of the average number of failures experienced at any given point in time (= failure probability).

  $\lambda(\mu) = \lambda_0 \left(1 - \frac{\mu}{v_0}\right)$

  – $\lambda(\mu)$: failure intensity.
  – $\lambda_0$: initial failure intensity at the start of execution.
  – $\mu$: average total number of failures at a given point in time.
  – $v_0$: total number of failures over infinite time.

Example 1
• Assume that we are at some point in time, t time units into the life cycle of a software system after it has been deployed.
• Assume the program will experience 100 failures over infinite execution time. During the t time units so far, 50 failures have been observed (and counted). The initial failure intensity was 10 failures per CPU hour.
• Compute the current (at t) failure intensity:

  $\lambda(\mu) = \lambda_0 \left(1 - \frac{\mu}{v_0}\right)$

  $\lambda(50) = 10 \left(1 - \frac{50}{100}\right) = 5 \ \frac{\text{failures}}{\text{CPU hour}}$

Musa/Okumoto Logarithmic Model
• The decrement per encountered failure decreases:

  $\lambda(\mu) = \lambda_0 e^{-\theta \mu}$

  $\theta$: failure intensity decay parameter.
• Example 2
  – $\lambda_0$ = 10 failures per CPU hour.
  – $\theta$ = 0.02 / failure.
  – 50 failures have been experienced ($\mu$ = 50).
  – Current failure intensity: $\lambda(50) = 10 e^{-0.02 \cdot 50} = 10 e^{-1} \approx 3.68$

Model Extension (1)
• Average total number of counted experienced failures ($\mu$) as a function of the elapsed execution time ($\tau$).
• For the basic model:
  $\mu(\tau) = v_0 \left(1 - e^{-\frac{\lambda_0}{v_0} \tau}\right)$
• For the logarithmic model:
  $\mu(\tau) = \frac{1}{\theta} \ln(\lambda_0 \theta \tau + 1)$

Example 3 (Basic Model)
• $\lambda_0$ = 10 [failures/CPU hour].
• $v_0$ = 100 (number of failures over infinite execution time).
  $\mu(\tau) = v_0 \left(1 - e^{-\frac{\lambda_0}{v_0} \tau}\right)$
• $\tau$ = 10 CPU hours:
  $\mu(10) = 100 \left(1 - e^{-\frac{10}{100} \cdot 10}\right) = 100 \left(1 - e^{-1}\right) \approx 63$ failures
• $\tau$ = 100 CPU hours:
  $\mu(100) = 100 \left(1 - e^{-\frac{10}{100} \cdot 100}\right) = 100 \left(1 - e^{-10}\right) \approx 100$ failures

Example 4 (Logarithmic Model)
• $\lambda_0$ = 10 [failures/CPU hour].
• $\theta$ = 0.02 / failure.
  $\mu(\tau) = \frac{1}{\theta} \ln(\lambda_0 \theta \tau + 1)$
• $\tau$ = 10 CPU hours:
  $\mu(10) = \frac{1}{0.02} \ln(10 \cdot 0.02 \cdot 10 + 1) \approx 55$ (63 in basic model)
• $\tau$ = 100 CPU hours:
  $\mu(100) = \frac{1}{0.02} \ln(10 \cdot 0.02 \cdot 100 + 1) \approx 152$ (100 in basic model)

Model Extension (2)
• Failure intensity as a function of execution time.
• For the basic model:
  $\lambda(\tau) = \lambda_0 e^{-\frac{\lambda_0}{v_0} \tau}$
• For the logarithmic Poisson model:
  $\lambda(\tau) = \frac{\lambda_0}{\lambda_0 \theta \tau + 1}$

Example 5 (Basic Model)
• $\lambda_0$ = 10 [failures/CPU hour].
• $v_0$ = 100 (number of failures over infinite execution time).
  $\lambda(\tau) = \lambda_0 e^{-\frac{\lambda_0}{v_0} \tau}$
• $\tau$ = 10 CPU hours:
  $\lambda(10) = 10 e^{-\frac{10}{100} \cdot 10} = 10 e^{-1} \approx 3.68 \ \frac{\text{failures}}{\text{CPU hour}}$
• $\tau$ = 100 CPU hours:
  $\lambda(100) = 10 e^{-\frac{10}{100} \cdot 100} = 10 e^{-10} \approx 0.000454 \ \frac{\text{failures}}{\text{CPU hour}}$

Example 6 (Logarithmic Model)
• $\lambda_0$ = 10 [failures/CPU hour].
• $\theta$ = 0.02 / failure.
  $\lambda(\tau) = \frac{\lambda_0}{\lambda_0 \theta \tau + 1}$
• $\tau$ = 10 CPU hours:
  $\lambda(10) = \frac{10}{10 \cdot 0.02 \cdot 10 + 1} \approx 3.33 \ \frac{\text{failures}}{\text{CPU hour}}$ (3.68 in basic model)
• $\tau$ = 100 CPU hours:
  $\lambda(100) = \frac{10}{10 \cdot 0.02 \cdot 100 + 1} = \frac{10}{21} \approx 0.476 \ \frac{\text{failures}}{\text{CPU hour}}$ (0.000454 in basic model)

Model Discussion
• Comparison of the basic and logarithmic models:
  – The basic model assumes that a failure intensity of 0 can be reached; the logarithmic model assumes the failure intensity only converges to 0.
  – The basic model assumes a finite number of failures in the system; the logarithmic model assumes an infinite number.
• Parameter estimation is a major problem: $\lambda_0$, $\theta$, and $v_0$. They are usually obtained from:
  – system test,
  – observation of the operational system,
  – comparison with values from similar projects.

Part II: Control Flow Graph (CFG)
• A graph representation of a set of statements is called a flow graph or control flow graph.
• Nodes in the flow graph represent computations, and the edges represent the flow of control.
• A basic block is a sequence of consecutive three-address statements in which flow of control enters at the beginning and leaves at the end, without halt or possibility of branching except at the end.
• A CFG consists of a set of basic blocks.

Three-Address Statements
• Assignment statements of the form x := y op z or x := op z, where op is a binary or unary arithmetic or logical operation.
• Copy statements x := y, where the value of y is assigned to x.
• Unconditional jump goto L: execution jumps to the statement labeled L.
• Conditional jump if x relop y goto L.
• Indexed assignments of the form x := y[i] and x[i] := y.
• Address and pointer assignments of the form x := &y, x := *y, and *x := y.
• Procedure calls: param x, call p, n, and return y, where the returned value y is optional.
  For a procedure call p(x1, x2, …, xn), the transformed three-address statements are:
      param x1
      param x2
      …
      param xn
      call p, n

Partition into Basic Blocks
• Input: a sequence of three-address statements.
• Output: a list of basic blocks, with each three-address statement in exactly one block.
• Method:
  1. Determine the leaders (the first statements of basic blocks) by three rules:
     i. The first statement is a leader.
     ii. Any statement that is the target of a conditional or unconditional goto is a leader.
     iii. Any statement that immediately follows a goto or conditional goto statement is a leader.
  2. For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program.

Example
    I = 1; TI = TV = 0; sum = 0;
    DO WHILE (v[I] <> -999 and TI < 1) {
        TI++;
        IF (v[I] >= min and v[I] <= max) {
            TV++;
            sum = sum + v[I];
        }
        I++;
    }
    IF (TV > 0) av = sum/TV;
    ELSE av = -999;

We do not strictly follow the transformation from source code to three-address statements. Note that each statement with a label is a leader.
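The partitioning method can be sketched in Python. This is a minimal illustration, not code from the tutorial: the names `find_leaders` and `partition`, and the encoding of each statement as a (text, jump target) pair, are assumptions made here. The input is the example's thirteen labeled three-address statements.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: (statement text, 1-based jump target or None).
Stmt = Tuple[str, Optional[int]]

EXAMPLE: List[Stmt] = [
    ("I = 1; TI = TV = 0; sum = 0", None),   # 1
    ("IF (v[I] == -999) GOTO 10", 10),       # 2
    ("IF (TI >= 1) GOTO 10", 10),            # 3
    ("TI++", None),                          # 4
    ("IF (v[I] < min) GOTO 8", 8),           # 5
    ("IF (v[I] > max) GOTO 8", 8),           # 6
    ("TV++; sum = sum + v[I]", None),        # 7
    ("I++", None),                           # 8
    ("GOTO 2", 2),                           # 9
    ("IF (TV <= 0) GOTO 12", 12),            # 10
    ("av = sum/TV; GOTO 13", 13),            # 11
    ("av = -999", None),                     # 12
    ("...", None),                           # 13
]


def find_leaders(stmts: List[Stmt]) -> List[int]:
    """Return the 1-based numbers of all leaders, in ascending order."""
    leaders = set()
    if stmts:
        leaders.add(1)                       # rule i: first statement
    for number, (_, target) in enumerate(stmts, start=1):
        if target is not None:
            leaders.add(target)              # rule ii: target of a goto
            if number < len(stmts):
                leaders.add(number + 1)      # rule iii: statement after a goto
    return sorted(leaders)


def partition(stmts: List[Stmt]) -> List[List[int]]:
    """Group statement numbers into basic blocks, one block per leader."""
    leaders = find_leaders(stmts)
    bounds = leaders + [len(stmts) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


print(find_leaders(EXAMPLE))
print(partition(EXAMPLE))
```

Applied strictly, the three rules merge statements 4–5 and 8–9 into single blocks, giving 11 blocks. The example instead treats every labeled statement as a leader, which yields 13 nodes.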
Three-address statements for the example:

     1  I = 1; TI = TV = 0; sum = 0;
     2  IF (v[I] == -999) GOTO 10
     3  IF (TI >= 1) GOTO 10
     4  TI++;
     5  IF (v[I] < min) GOTO 8
     6  IF (v[I] > max) GOTO 8
     7  TV++; sum = sum + v[I];
     8  I++;
     9  GOTO 2
    10  IF (TV <= 0) GOTO 12
    11  av = sum/TV; GOTO 13
    12  av = -999;
    13  …

[Annotations on the listing: Basic Block, While loop, IF/ELSE]

Transformation from Basic Blocks to CFG
[Figure: control flow graph of the listing above, with one node per labeled statement (1–13), predicate nodes marked, and regions R1–R6 labeled, including the outer region.]

Cyclomatic Complexity
• McCabe’s cyclomatic complexity:
  – V(G) = E – N + 2, where E is the number of edges and N is the number of nodes.
  – V(G) = p + 1, where p is the number of predicate (decision) nodes.
  – V(G) = number of regions (areas surrounded by nodes/edges).
• V(G) is an upper bound on the number of independent paths.
  – Independent path: a path with at least one new node/edge.
• Example (the CFG above):
  – V(G) = E – N + 2 = 17 – 13 + 2 = 6
  – V(G) = p + 1 = 5 + 1 = 6
  – V(G) = number of regions = 6
• Advantage: the number of test cases is proportional to the program size.
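The first two formulas for V(G) can be checked on the example with a short Python sketch. The edge list is an assumption: it is read off the GOTO targets and fall-through order of the example's three-address statements rather than copied from the figure, and the function names are illustrative only.

```python
from collections import Counter
from typing import List, Tuple

Edge = Tuple[int, int]

# Assumed edges of the example CFG (nodes are statement numbers 1..13).
EDGES: List[Edge] = [
    (1, 2),
    (2, 3), (2, 10),      # loop test on v[I]
    (3, 4), (3, 10),      # loop test on TI
    (4, 5),
    (5, 6), (5, 8),       # v[I] < min
    (6, 7), (6, 8),       # v[I] > max
    (7, 8),
    (8, 9),
    (9, 2),               # back edge of the while loop
    (10, 11), (10, 12),   # IF / ELSE
    (11, 13), (12, 13),
]


def cyclomatic_complexity(edges: List[Edge]) -> int:
    """V(G) = E - N + 2, valid for a single connected flow graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


def predicate_nodes(edges: List[Edge]) -> List[int]:
    """Nodes with more than one outgoing edge (decision nodes)."""
    out_degree = Counter(src for src, _ in edges)
    return sorted(node for node, degree in out_degree.items() if degree > 1)


print(cyclomatic_complexity(EDGES))        # E - N + 2
print(len(predicate_nodes(EDGES)) + 1)     # p + 1
```

Both calls agree with the slide's figures (17 edges, 13 nodes, 5 predicate nodes). Counting regions would additionally need a planar drawing of the graph, so it is left out of the sketch.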