
Software Engineering MODULE 3

Coding: The objective of the coding phase is to transform the design of a system into code in a
high-level language and then to unit test this code. The programmers adhere to a standard and
well-defined style of coding which they call their coding standard. The main advantages of
adhering to a standard style of coding are as follows:
• A coding standard gives a uniform appearance to the code written by different engineers.
• It facilitates code understanding.
• It promotes good programming practices.
For implementing our design into code, we require a good high-level language. A programming
language should have the following features:
Characteristics of a Programming Language
Readability: A good high-level language will allow programs to be written in some ways
that resemble a quite-English description of the underlying algorithms. If care is taken,
the coding may be done in a way that is essentially self-documenting.
Portability: High-level languages, being essentially machine independent, make it possible
to develop portable software.
Generality: Most high-level languages allow the writing of a wide variety of programs,
thus relieving the programmer of the need to become expert in many diverse languages.
Brevity: A language should have the ability to implement an algorithm with a small amount
of code. Programs expressed in high-level languages are often considerably shorter than
their low-level equivalents.
Error checking: Being human, a programmer is likely to make many mistakes in the
development of a computer program. Many high-level languages enforce a great deal of
error checking both at compile-time and at run-time.
Cost: The ultimate cost of a programming language is a function of many of its
characteristics.
VSSUT, Burla
Familiar notation: A language should have familiar notation, so it can be understood by
most of the programmers.
Quick translation: It should admit quick translation.
Efficiency: It should permit the generation of efficient object code.
Modularity: It is desirable that programs can be developed in the language as a
collection of separately compiled modules, with appropriate mechanisms for ensuring
self-consistency between these modules.
Widely available: Language should be widely available and it should be possible to
provide translators for all the major machines and for all the major operating systems.
A coding standard lists several rules to be followed during coding, such as the way variables are
to be named, the way the code is to be laid out, error return conventions, etc.
Coding standards and guidelines
Good software development organizations usually develop their own coding standards and
guidelines depending on what best suits their organization and the type of products they develop.
The following are some representative coding standards.
1. Rules for limiting the use of globals: These rules list what types of data can be declared
global and what cannot.
2. Contents of the headers preceding the code of different modules: The information
contained in the headers of different modules should be standard for an organization. The
exact format in which the header information is organized in the header can also be
specified. The following are some standard header data:
• Name of the module.
• Date on which the module was created.
• Author’s name.
• Modification history.
• Synopsis of the module.
• Different functions supported, along with their input/output parameters.
• Global variables accessed/modified by the module.
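To make the standard concrete, a header in this style might look as follows. The module name, dates, and author are hypothetical, and a small function is included only so that the fragment is complete:

```c
/*
 * Module       : temperature  (hypothetical example module)
 * Created on   : 10-Jan-2015
 * Author       : A. Programmer
 * Modification history :
 *      12-Jan-2015  corrected rounding in celsius_to_fahrenheit()
 * Synopsis     : simple temperature conversion utilities
 * Functions    : celsius_to_fahrenheit(int c) -- returns degrees F
 * Global variables accessed/modified : none
 */
int celsius_to_fahrenheit(int c)
{
    return (c * 9) / 5 + 32;
}
```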
3. Naming conventions for global variables, local variables, and constant identifiers: A
possible naming convention can be that global variable names always start with a capital
letter, local variable names are made of small letters, and constant names are always
capital letters.
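A small C sketch of such a convention (all names here are invented for illustration):

```c
#define MAX_RETRY_COUNT 3            /* constant: all capital letters */

int Error_count = 0;                 /* global: starts with a capital letter */

int record_errors(int new_errors)    /* local variables: small letters only */
{
    int updated_count = Error_count + new_errors;
    Error_count = updated_count;
    return updated_count;
}
```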
4. Error return conventions and exception handling mechanisms: The way error
conditions are reported and handled by different functions in a program should be
standard within an organization. For example, all functions while encountering an
error condition should either return a 0 or 1 consistently.
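A minimal sketch of such a convention, assuming the organization settles on 0 for success and 1 for failure. The function and its behaviour are hypothetical:

```c
#include <stdio.h>

/* Every function in the module follows one convention:
   return 0 on success and 1 on failure.                */
int parse_count(const char *text, int *out)
{
    if (text == NULL || sscanf(text, "%d", out) != 1)
        return 1;                    /* failure */
    return 0;                        /* success */
}

/* Convenience wrapper used for demonstration. */
int parse_ok(const char *text)
{
    int value;
    return parse_count(text, &value);
}
```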
The following are some representative coding guidelines recommended by many software
development organizations.
1. Do not use a coding style that is too clever or too difficult to understand: Code
should be easy to understand. Many inexperienced engineers actually take pride in
writing cryptic and incomprehensible code. Clever coding can obscure the meaning of
the code and hamper understanding. It also makes maintenance difficult.
2. Avoid obscure side effects: The side effects of a function call include modification of
parameters passed by reference, modification of global variables, and I/O operations. An
obscure side effect is one that is not obvious from a casual examination of the code.
Obscure side effects make it difficult to understand a piece of code. For example, if a
global variable is changed obscurely in a called module or some file I/O is performed
which is difficult to infer from the function’s name and header information, it becomes
difficult for anybody trying to understand the code.
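The difference can be sketched in C. The function compute_average below is hypothetical; its name suggests a pure computation, but it silently updates a global:

```c
int Grand_total = 0;   /* global variable */

/* Obscure: nothing at the call site reveals that this function
   also modifies the global Grand_total.                        */
int compute_average(int a, int b)
{
    Grand_total = Grand_total + a + b;   /* hidden modification */
    return (a + b) / 2;
}

/* Better: the computation has no side effect at all. */
int average(int a, int b)
{
    return (a + b) / 2;
}
```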
3. Do not use an identifier for multiple purposes: Programmers often use the same
identifier to denote several temporary entities. For example, some programmers use a
temporary loop variable also for computing and storing the final result. The rationale
usually given by these programmers for such multiple uses of variables is memory
efficiency, e.g. three variables use up three memory locations, whereas the same variable
used in three different ways uses just one memory location. However, there are several
things wrong with this approach and hence it should be avoided. Some of the problems
caused by the use of variables for multiple purposes are as follows:
 Each variable should be given a descriptive name indicating its purpose. This is
not possible if an identifier is used for multiple purposes. Use of a variable for
multiple purposes can lead to confusion and make it difficult for somebody trying to
read and understand the code.
 Use of variables for multiple purposes usually makes future enhancements more
difficult.
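A small C sketch of the problem and its remedy (the functions are invented for illustration):

```c
/* Bad: the identifier temp serves two purposes -- first as the
   loop counter and later as the holder of the final result.    */
int sum_bad(const int values[], int n)
{
    int temp;
    int total = 0;
    for (temp = 0; temp < n; temp++)
        total += values[temp];
    temp = total;                /* temp reused for the result */
    return temp;
}

/* Better: each variable has one descriptive purpose. */
int sum_good(const int values[], int n)
{
    int index;
    int total = 0;
    for (index = 0; index < n; index++)
        total += values[index];
    return total;
}
```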
4. The code should be well-documented: As a rule of thumb, there must be at least one
comment line on the average for every three source lines.
5. The length of any function should not exceed 10 source lines: A function that is very
lengthy is usually very difficult to understand, as it probably carries out many different
functions. For the same reason, lengthy functions are likely to have a disproportionately
larger number of bugs.
6. Do not use goto statements: Use of goto statements makes a program unstructured and
very difficult to understand.
Code Review
Code review for a module is carried out after the module is successfully compiled and all the
syntax errors have been eliminated. Code reviews are extremely cost-effective strategies for
reducing coding errors and producing high quality code. Normally, two types of reviews are
carried out on the code of a module. These two types of code review techniques are code
inspection and code walk through.
Code Walk Throughs
Code walk through is an informal code analysis technique. In this technique, after a module has
been coded and successfully compiled, and all syntax errors have been eliminated, a few
members of the development team are given the code a few days before the walk through
meeting to read and understand the code. Each member selects some test cases and simulates
execution of the code by hand (i.e. traces execution through each statement and function call).
The main objectives
of the walk through are to discover the algorithmic and logical errors in the code. The members
note down their findings to discuss these in a walk through meeting where the coder of the
module is present. Even though a code walk through is an informal analysis technique, several
guidelines have evolved over the years for making this naïve but useful analysis technique more
effective. Of course, these guidelines are based on personal experience, common sense, and
several subjective factors. Therefore, these guidelines should be considered as examples rather
than accepted as rules to be applied dogmatically. Some of these guidelines are the following:
 The team performing the code walk through should neither be too big nor too small.
Ideally, it should consist of three to seven members.
 Discussion should focus on the discovery of errors and not on how to fix the discovered
errors.
 In order to foster cooperation and to avoid the feeling among engineers that they are
being evaluated in the code walk through meeting, managers should not attend the walk
through meetings.
Code Inspection
In contrast to code walk through, the aim of code inspection is to discover some common types
of errors caused by oversight and improper programming. In other words, during code
inspection the code is examined for the presence of certain kinds of errors, in contrast to the hand
simulation of code execution done in code walk throughs. For instance, consider the classical
error of writing a procedure that modifies a formal parameter while the calling routine calls that
procedure with a constant actual parameter. It is more likely that such an error will be discovered
by looking for these kinds of mistakes in the code, rather than by simply hand simulating
execution of the procedure. In addition to the commonly made errors, adherence to coding
standards is also checked during code inspection. Good software development companies collect
statistics regarding different types of errors commonly committed by their engineers and identify
the type of errors most frequently committed. Such a list of commonly committed errors can be
used during code inspection to look out for possible errors.
Following is a list of some classical programming errors which can be checked during code
inspection:
 Use of uninitialized variables.
 Jumps into loops.
 Non-terminating loops.
 Incompatible assignments.
 Array indices out of bounds.
 Improper storage allocation and deallocation.
 Mismatches between actual and formal parameters in procedure calls.
 Use of incorrect logical operators or incorrect precedence among operators.
 Improper modification of loop variables.
 Comparison of equality of floating point variables, etc.
Clean Room Testing
Clean room testing was pioneered by IBM. This type of testing relies heavily on walk throughs,
inspection, and formal verification. The programmers are not allowed to test any of their code by
executing the code other than doing some syntax testing using a compiler. The software
development philosophy is based on avoiding software defects by using a rigorous inspection
process. The objective of this approach is zero-defect software. The name ‘clean room’ was
derived from the analogy with semi-conductor fabrication units. In these units (clean rooms),
defects are avoided by manufacturing in ultra-clean atmosphere. In this kind of development,
inspections to check the consistency of the components with their specifications have replaced
unit testing of components.
This technique reportedly produces documentation and code that is more reliable and
maintainable than other development methods relying heavily on code execution-based testing.
The clean room approach to software development is based on five characteristics:
Formal specification: The software to be developed is formally specified. A state
transition model which shows system responses to stimuli is used to express the
specification.
Incremental development: The software is partitioned into increments which are
developed and validated separately using the clean room process. These increments are
specified, with customer input, at an early stage in the process.
Structured programming: Only a limited number of control and data abstraction
constructs are used. The program development process is a process of stepwise refinement
of the specification.
Static verification: The developed software is statically verified using rigorous software
inspections. There is no unit or module testing process for code components
Statistical testing of the system: The integrated software increment is tested statistically
to determine its reliability. These statistical tests are based on the operational profile
which is developed in parallel with the system specification. The main problem with this
approach is that testing effort is increased as walk throughs, inspection, and verification
are time-consuming.
Software Documentation
When various kinds of software products are developed, not only the executable files and the
source code but also various kinds of documents, such as the users’ manual, software
requirements specification (SRS) document, design documents, test documents, installation
manual, etc., are developed as part of the software engineering process. All these documents
are a vital part of good software development practice. Good documents are very useful and
serve the following purposes:
o Good documents enhance understandability and maintainability of a software
product. They reduce the effort and time required for maintenance.
o User documents help the users in effectively using the system.
o Good documents help in effectively handling the manpower turnover problem.
Even when an engineer leaves the organization, and a new engineer comes in, he
can build up the required knowledge easily.
o Production of good documents helps the manager in effectively tracking the
progress of the project. The project manager knows that measurable progress is
achieved if a piece of work is done and the required documents have been
produced and reviewed.
Different types of software documents can broadly be classified into the following:
• Internal documentation
• External documentation
Internal documentation is the code comprehension features provided as part of the source code
itself. Internal documentation is provided through appropriate module headers and comments
embedded in the source code. Internal documentation is also provided through useful variable
names, module and function headers, code indentation, code structuring, use of enumerated types
and constant identifiers, use of user-defined data types, etc. Careful experiments suggest that,
out of all types of internal documentation, meaningful variable names are the most useful in
understanding the code. This is of course in contrast to the common expectation that code
commenting would be the most useful. The research finding is obviously true when comments
are written without thought. For example, the following style of code commenting does not in
any way help in understanding the code.
a = 10; /* a made 10 */
But even when code is carefully commented, meaningful variable names still are more helpful in
understanding a piece of code. Good software development organizations usually ensure good
internal documentation by appropriately formulating their coding standards and coding
guidelines.
External documentation is provided through various types of supporting documents such as
users’ manual, software requirements specification document, design document, test documents,
etc. A systematic software development style ensures that all these documents are produced in an
orderly fashion.
Program Testing
Testing a program consists of providing the program with a set of test inputs (or test cases) and
observing if the program behaves as expected. If the program fails to behave as expected, then
the conditions under which failure occurs are noted for later debugging and correction.
Some commonly used terms associated with testing are:
 Failure: This is a manifestation of an error (or defect or bug). But, the mere presence of
an error may not necessarily lead to a failure.
 Test case: This is the triplet [I,S,O], where I is the data input to the system, S is the state
of the system at which the data is input, and O is the expected output of the system.
 Test suite: This is the set of all test cases with which a given software product is to be
tested.
Aim of Testing
The aim of the testing process is to identify all defects existing in a software product. However,
for most practical systems, even after satisfactorily carrying out the testing phase, it is not
possible to guarantee that the software is error free. This is because of the fact that the input data
domain of most software products is very large. It is not practical to test the software
exhaustively with respect to each value that the input data may assume. Even with this practical
limitation of the testing process, the importance of testing should not be underestimated. It must
be remembered that testing does expose many defects existing in a software product. Thus
testing provides a practical way of reducing defects in a system and increasing the users’
confidence in a developed system.
Verification Vs Validation
Verification is the process of determining whether the output of one phase of software
development conforms to that of its previous phase, whereas validation is the process of
determining whether a fully developed system conforms to its requirements specification. Thus
while verification is concerned with phase containment of errors, the aim of validation is that the
final product be error free.
Design of Test Cases
Exhaustive testing of almost any non-trivial system is impractical due to the fact that the domain
of input data values to most practical software systems is either extremely large or infinite.
Therefore, we must design an optimal test suite that is of reasonable size and can uncover as
many errors existing in the system as possible. Actually, if test cases are selected randomly,
many of these randomly selected test cases do not contribute to the significance of the test suite,
i.e. they do not detect any additional defects not already being detected by other test cases in the
suite. Thus, the number of random test cases in a test suite is, in general, not an indication of the
effectiveness of the testing. In other words, testing a system using a large collection of test cases
that are selected at random does not guarantee that all (or even most) of the errors in the system
will be uncovered. Consider the following example code segment which finds the greater of two
integer values x and y. This code segment has a simple programming error.
if (x > y)
    max = x;
else
    max = x;
For the above code segment, the test suite, {(x=3,y=2);(x=2,y=3)} can detect the error, whereas a
larger test suite {(x=3,y=2);(x=4,y=3);(x=5,y=1)} does not detect the error. So, it would be
incorrect to say that a larger test suite would always detect more errors than a smaller one, unless
of course the larger test suite has also been carefully designed. This implies that the test suite
should be carefully designed rather than picked randomly. Therefore, systematic approaches
should be followed to design an optimal test suite. In an optimal test suite, each test case is
designed to detect different errors.
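The argument can be checked mechanically: wrapping the faulty segment as a function and comparing it against a correct reference shows which test cases expose the error. This is only an illustrative sketch:

```c
/* The faulty code segment from the text, wrapped as a function.
   The else branch wrongly assigns x instead of y.               */
int faulty_max(int x, int y)
{
    int max;
    if (x > y)
        max = x;
    else
        max = x;   /* error: should be max = y */
    return max;
}

/* Reference implementation for comparison. */
int correct_max(int x, int y)
{
    return (x > y) ? x : y;
}

/* Returns 1 if the test case (x, y) exposes the error. */
int exposes_error(int x, int y)
{
    return faulty_max(x, y) != correct_max(x, y);
}
```

Only the case with x < y exposes the error, which is why the small suite {(3,2); (2,3)} succeeds while the larger suite {(3,2); (4,3); (5,1)} fails.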
Functional Testing Vs. Structural Testing
In the black-box testing approach, test cases are designed using only the functional specification
of the software, i.e. without any knowledge of the internal structure of the software. For this
reason, black-box testing is known as functional testing. On the other hand, in the white-box
testing approach, designing test cases requires thorough knowledge about the internal structure
of software, and therefore the white-box testing is called structural testing.
Testing in the large vs. testing in the small
Software products are normally tested first at the individual component (or unit) level. This is
referred to as testing in the small. After testing all the components individually, the components
are slowly integrated and tested at each level of integration (integration testing). Finally, the fully
integrated system is tested (called system testing). Integration and system testing are known as
testing in the large.
Unit Testing
Unit testing is undertaken after a module has been coded and successfully reviewed. Unit testing
(or module testing) is the testing of different units (or modules) of a system in isolation.
In order to test a single module, a complete environment is needed to provide all that is necessary
for execution of the module. That is, besides the module under test itself, the following are
needed in order to be able to test the module:
• The procedures belonging to other modules that the module under test calls.
• Nonlocal data structures that the module accesses.
• A procedure to call the functions of the module under test with appropriate parameters.
The modules required to provide the necessary environment (which either call or are called by
the module under test) are usually not available until they too have been unit tested. For this
reason, stubs and drivers are designed to provide the complete environment for a module. The
role of stub and driver
modules is pictorially shown in fig. 19.1. A stub procedure is a dummy procedure that has the
same I/O parameters as the given procedure but has a highly simplified behavior. For example, a
stub procedure may produce the expected behavior using a simple table lookup mechanism. A
driver module contains the nonlocal data structures accessed by the module under test, and would
also have the code to call the different functions of the module with appropriate parameters.
Fig. 19.1: Unit testing with the help of driver and stub modules
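The arrangement can be sketched in C; the module under test, the stub, and the driver below are all hypothetical:

```c
/* Hypothetical module under test: adjusts a temperature reading
   using an offset supplied by another, not-yet-tested module.   */
int lookup_offset(int zone);              /* implemented by a stub */

int adjusted_temperature(int celsius, int zone)
{
    return celsius + lookup_offset(zone);
}

/* Stub: same I/O parameters as the real lookup_offset, but a
   highly simplified behaviour -- a simple table lookup.         */
int lookup_offset(int zone)
{
    static const int offset_table[] = { 0, 2, 5 };
    if (zone >= 0 && zone <= 2)
        return offset_table[zone];
    return 0;
}

/* Driver: calls the function under test with chosen parameters. */
int run_driver(void)
{
    return adjusted_temperature(20, 1);   /* uses stubbed offset 2 */
}
```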
Black Box Testing
In black-box testing, test cases are designed from an examination of the input/output values
only and no knowledge of design or code is required. The following are the two main approaches
to designing black box test cases.
• Equivalence class partitioning
• Boundary value analysis
Equivalence Class Partitioning
In this approach, the domain of input values to a program is partitioned into a set of equivalence
classes. This partitioning is done such that the behavior of the program is similar for every input
data belonging to the same equivalence class. The main idea behind defining the equivalence
classes is that testing the code with any one value belonging to an equivalence class is as good as
testing the software with any other value belonging to that equivalence class. Equivalence classes
for a software can be designed by examining the input data and output data. The following are
some general guidelines for designing the equivalence classes:
1. If the input data values to a system can be specified by a range of values, then one
valid and two invalid equivalence classes should be defined.
2. If the input data assumes values from a set of discrete members of some domain,
then one equivalence class for valid input values and another equivalence class for
invalid input values should be defined.
Example 1: For a software that computes the square root of an input integer which can assume
values in the range of 0 to 5000, there are three equivalence classes: The set of negative integers,
the set of integers in the range 0 to 5000, and the set of integers larger than 5000. Therefore, the
test cases must include representatives for each of the three equivalence classes and a possible
test set can be: {-5,500,6000}.
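A sketch of how the three classes exercise a routine; int_sqrt below is a hypothetical implementation of the function under test, which signals invalid inputs by returning -1:

```c
/* Hypothetical routine under test: integer square root for inputs
   in the valid range 0..5000; returns -1 for invalid inputs.     */
int int_sqrt(int n)
{
    int root = 0;
    if (n < 0 || n > 5000)
        return -1;                    /* both invalid classes */
    while ((root + 1) * (root + 1) <= n)
        root++;
    return root;
}
```

The test set {-5, 500, 6000} picks one representative from each equivalence class.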
Example 2: Design the black-box test suite for the following program. The program computes
the intersection point of two straight lines and displays the result. It reads two integer pairs (m1,
c1) and (m2, c2) defining the two straight lines of the form y=mx + c.
The equivalence classes are the following:
• Parallel lines (m1=m2, c1≠c2)
• Intersecting lines (m1≠m2)
• Coincident lines (m1=m2, c1=c2)
Now, selecting one representative value from each equivalence class, the test suite {(2, 2), (2, 5)},
{(5, 5), (7, 7)}, and {(10, 10), (10, 10)} is obtained, where each test case is a pair of (m, c)
values for the two lines.
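The three equivalence classes can be captured by a small classification routine; this is an illustrative sketch, not part of the program under test:

```c
enum line_relation { INTERSECTING, PARALLEL, COINCIDENT };

/* Classifies the two lines y = m1*x + c1 and y = m2*x + c2
   into one of the three equivalence classes.               */
enum line_relation classify_lines(int m1, int c1, int m2, int c2)
{
    if (m1 != m2)
        return INTERSECTING;
    if (c1 != c2)
        return PARALLEL;
    return COINCIDENT;
}
```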
Boundary Value Analysis
A type of programming error frequently occurs at the boundaries of different equivalence classes
of inputs. The reason behind such errors might purely be due to psychological factors.
Programmers often fail to see the special processing required by the input values that lie at the
boundary of the different equivalence classes. For example, programmers may improperly use <
instead of <=, or conversely <= for <. Boundary value analysis leads to selection of test cases at
the boundaries of the different equivalence classes.
Example: For a function that computes the square root of integer values in the range 0 to
5000, the test cases must include the following values: {0, -1, 5000, 5001}.
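A small sketch of why boundary values matter: an interior test value cannot distinguish the buggy predicate from the correct one, while the boundary values can. Both functions are invented for illustration:

```c
/* A boundary error of the kind described: < used where <= was
   intended. The function should accept the range 0..5000.     */
int in_range_buggy(int n)
{
    return (n > 0 && n < 5000);   /* error: rejects 0 and 5000 */
}

int in_range_correct(int n)
{
    return (n >= 0 && n <= 5000);
}
```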
White-Box Testing
One white-box testing strategy is said to be stronger than another if it detects all the types of
errors detected by the other strategy and additionally detects some more types of errors. When
two testing strategies each detect some types of errors that the other does not, they are called
complementary. The concepts of stronger and complementary testing are schematically
illustrated in fig. 20.1.
Fig. 20.1: Stronger and complementary testing strategies
Statement Coverage
The statement coverage strategy aims to design test cases so that every statement in a program is
executed at least once. The principal idea governing the statement coverage strategy is that
unless a statement is executed, it is very hard to determine if an error exists in that statement.
Unless a statement is executed, it is very difficult to observe whether it causes failure due to
some illegal memory access, wrong result computation, etc. However, executing some statement
once and observing that it behaves properly for that input value is no guarantee that it will
behave correctly for all input values. In the following, the design of test cases using the
statement coverage strategy is shown.
Example: Consider the Euclid’s GCD computation algorithm:
int compute_gcd(int x, int y)
{
1   while (x != y) {
2       if (x > y)
3           x = x - y;
4       else y = y - x;
5   }
6   return x;
}
By choosing the test set {(x=3, y=3), (x=4, y=3), (x=3, y=4)}, we can exercise the program such
that all statements are executed at least once.
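The numbered algorithm above, written as plain compilable C, can be exercised with the test set from the text: (x=3, y=3) skips the loop body, (x=4, y=3) takes the then branch, and (x=3, y=4) takes the else branch, so every statement runs at least once:

```c
/* Euclid's GCD algorithm by repeated subtraction. */
int compute_gcd(int x, int y)
{
    while (x != y) {
        if (x > y)
            x = x - y;      /* then branch */
        else
            y = y - x;      /* else branch */
    }
    return x;
}
```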
Branch Coverage
In the branch coverage-based testing strategy, test cases are designed to make each branch
condition to assume true and false values in turn. Branch testing is also known as edge testing as
in this testing scheme, each edge of a program’s control flow graph is traversed at least once.
It is obvious that branch testing guarantees statement coverage and thus is a stronger testing
strategy compared to the statement coverage-based testing. For Euclid’s GCD computation
algorithm, the test cases for branch coverage can be {(x=3, y=3), (x=3, y=2), (x=4, y=3), (x=3,
y=4)}.
Condition Coverage
In this structural testing, test cases are designed to make each component of a composite
conditional expression to assume both true and false values. For example, in the conditional
expression ((c1.and.c2).or.c3), the components c1, c2 and c3 are each made to assume both true
and false values. Branch testing is probably the simplest condition testing strategy where only
the compound conditions appearing in the different branch statements are made to assume the
true and false values. Thus, condition testing is a stronger testing strategy than branch testing and
branch testing is stronger testing strategy than the statement coverage-based testing. For a
composite conditional expression of n components, for condition coverage, 2ⁿ test cases are
required. Thus, for condition coverage, the number of test cases increases exponentially with the
number of component conditions. Therefore, a condition coverage-based testing technique is
practical only if n (the number of conditions) is small.
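A small sketch for the example condition ((c1 && c2) || c3); the helper names are invented. Condition coverage for its three components needs all 2³ = 8 truth-value combinations:

```c
/* The composite condition from the text. */
int composite(int c1, int c2, int c3)
{
    return (c1 && c2) || c3;
}

/* Condition coverage needs every combination of component
   truth values: 2^n test cases for n components.          */
int condition_coverage_cases(int n)
{
    return 1 << n;               /* 2^n */
}
```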
Path Coverage
The path coverage-based testing strategy requires us to design test cases such that all linearly
independent paths in the program are executed at least once. A linearly independent path can be
defined in terms of the control flow graph (CFG) of a program.
Control Flow Graph (CFG)
A control flow graph describes the sequence in which the different instructions of a program get
executed. In other words, a control flow graph describes how the control flows through the
program. In order to draw the control flow graph of a program, all the statements of a program
must be numbered first. The different numbered statements serve as nodes of the control flow
graph (as shown in fig. 20.2). An edge from one node to another node exists if the execution of
the statement representing the first node can result in the transfer of control to the other node.
The CFG for any program can be easily drawn by knowing how to represent the sequence,
selection, and iteration type of statements in the CFG. After all, a program is made up from these
types of statements. Fig. 20.2 summarizes how the CFG for these three types of statements can
be drawn. It is important to note that for the iteration type of constructs such as the while
construct, the loop condition is tested only at the beginning of the loop and therefore the control
flow from the last statement of the loop is always to the top of the loop. Using these basic ideas,
the CFG of Euclid’s GCD computation algorithm can be drawn as shown in fig. 20.3.
b = a*2-1;
Fig. 20.2 (a): CFG for sequence constructs
if (a > b)
    c = 3;
else
    c = 5;
Fig. 20.2 (b): CFG for selection constructs
Iteration:
while (a>b)
b=b -1;
c = a+b;
Fig. 20.2 (c): CFG for the iteration type of constructs
EUCLID’S GCD Computation Algorithm
Fig. 20.3: Control flow diagram
A path through a program is a node and edge sequence from the starting node to a terminal node
of the control flow graph of a program. There can be more than one terminal node in a program.
Writing test cases to cover all the paths of a typical program is impractical. For this reason, the
path-coverage testing does not require coverage of all paths but only coverage of linearly
independent paths.
Linearly independent path
A linearly independent path is any path through the program that introduces at least one new
edge that is not included in any other linearly independent path. If a path has one new node
compared to all other linearly independent paths, then the path is also linearly independent,
because any path having a new node automatically has a new edge. Thus, a path that is a
sub-path of another path is not considered to be a linearly independent path.
Control Flow Graph
In order to understand the path coverage-based testing strategy, it is very much necessary to
understand the control flow graph (CFG) of a program. Control flow graph (CFG) of a program has
been discussed earlier.
Linearly Independent Path
The path-coverage testing does not require coverage of all paths but only coverage of linearly
independent paths. Linearly independent paths have been discussed earlier.
Cyclomatic Complexity
For more complicated programs it is not easy to determine the number of independent paths of the
program. McCabe’s cyclomatic complexity defines an upper bound for the number of linearly
independent paths through a program. Also, the McCabe’s cyclomatic complexity is very simple to
compute. Thus, the McCabe’s cyclomatic complexity metric provides a practical way of determining
the maximum number of linearly independent paths in a program. Though the McCabe’s metric does
not directly identify the linearly independent paths, it informs approximately how many paths to
look for.
There are three different ways to compute the cyclomatic complexity. The answers computed by the
three methods are guaranteed to agree.
Method 1:
Given a control flow graph G of a program, the cyclomatic complexity V(G) can be
computed as:
V(G) = E – N + 2
where N is the number of nodes of the control flow graph and E is the number of edges in the
control flow graph.
For the CFG of example shown in fig. 20.3, E=7 and N=6. Therefore, the cyclomatic
complexity = 7-6+2 = 3.
Method 2:
An alternative way of computing the cyclomatic complexity of a program from an inspection
of its control flow graph is as follows:
V(G) = Total number of bounded areas + 1
In the program’s control flow graph G, any region enclosed by nodes and edges is called
a bounded area. This is an easy way to determine the McCabe’s cyclomatic complexity.
But, what if the graph G is not planar, i.e. however you draw the graph, two or more edges
intersect? Actually, it can be shown that structured programs always yield planar graphs. But,
the presence of GOTOs can easily add intersecting edges. Therefore, for non-structured
programs, this way of computing the McCabe’s cyclomatic complexity cannot be used.
The number of bounded areas increases with the number of decision paths and loops.
Therefore, the McCabe’s metric provides a quantitative measure of testing difficulty and the
ultimate reliability. For the CFG example shown in fig. 20.3, from a visual examination of
the CFG the number of bounded areas is 2. Therefore the cyclomatic complexity, computing
with this method is also 2+1 = 3. This method provides a very easy way of computing the
cyclomatic complexity of CFGs, just from a visual examination of the CFG. On the other
hand, the first method of computing the cyclomatic complexity is more amenable to automation, i.e. it can be
easily coded into a program which can be used to determine the cyclomatic complexities of
arbitrary CFGs.
Method 3:
The cyclomatic complexity of a program can also be easily computed by computing the
number of decision statements in the program. If N is the number of decision statements of a
program, then the McCabe’s metric is equal to N + 1.
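Method 1 is easy to automate. The following is a minimal sketch (the CFG representation and node labels are invented for illustration) that computes V(G) = E – N + 2 from a control flow graph given as a list of directed edges:

```python
# Hypothetical sketch: compute McCabe's cyclomatic complexity from a CFG
# given as a list of directed edges (pairs of node labels).

def cyclomatic_complexity(edges):
    """V(G) = E - N + 2, where E = number of edges, N = number of nodes."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2

# An example CFG with 6 nodes and 7 edges, matching the figures discussed
# above: V(G) = 7 - 6 + 2 = 3.
cfg = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 2), (5, 6)]
print(cyclomatic_complexity(cfg))  # 3
```

A tool built along these lines only needs the edge list of the CFG, which is why Method 1 is the one usually coded into programs that measure arbitrary CFGs.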
Data Flow-Based Testing
Data flow-based testing method selects test paths of a program according to the locations of the
definitions and uses of different variables in a program.
For a statement numbered S, let
DEF(S) = {X | statement S contains a definition of X}, and
USES(S) = {X | statement S contains a use of X}
For the statement S: a = b + c; DEF(S) = {a} and USES(S) = {b, c}. The definition of variable X at
statement S is said to be live at statement S1, if there exists a path from statement S to statement S1
which does not contain any definition of X.
The definition-use chain (or DU chain) of a variable X is of form [X, S, S1], where S and S1 are
statement numbers, such that X ∈ DEF(S) and X ∈ USES(S1), and the definition of X in the
statement S is live at statement S1. One simple data flow testing strategy is to require that every DU
chain be covered at least once. Data flow testing strategies are useful for selecting test paths of a
program containing nested if and loop statements.
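The DEF/USE analysis above can be illustrated with a toy sketch. This deliberately handles only simple assignment statements in straight-line code (an assumption made for brevity; real data flow testing tools handle branches and loops), so a definition is live as long as there is no intervening redefinition:

```python
# Illustrative sketch (not a real tool): compute DEF and USE sets for simple
# assignment statements "x = a + b", then list the DU chains [X, S, S1]
# that hold along straight-line execution.

import re

def def_use(stmt):
    lhs, rhs = stmt.split("=")
    defs = {lhs.strip()}
    uses = set(re.findall(r"[a-z]\w*", rhs))
    return defs, uses

def du_chains(stmts):
    chains = []
    for s, stmt in enumerate(stmts, start=1):
        d, _ = def_use(stmt)
        for x in d:
            for s1 in range(s + 1, len(stmts) + 1):
                d1, u1 = def_use(stmts[s1 - 1])
                if x in u1:
                    chains.append((x, s, s1))
                if x in d1:       # redefined: the definition is no longer live
                    break
    return chains

program = ["a = b + c", "d = a + b", "a = d + c", "e = a + d"]
print(du_chains(program))
# [('a', 1, 2), ('d', 2, 3), ('d', 2, 4), ('a', 3, 4)]
```

Note that no chain [a, 1, 4] is reported: the definition of a at statement 1 is killed by the redefinition at statement 3, exactly as the liveness condition requires.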
Mutation Testing
In mutation testing, the software is first tested by using an initial test suite built up from the different
white box testing strategies. After the initial testing is complete, mutation testing is taken up. The
idea behind mutation testing is to make a few arbitrary changes to a program at a time. Each time the
program is changed, it is called a mutated program and the change effected is called a mutant. A
mutated program is tested against the full test suite of the program. If there exists at least one test
case in the test suite for which a mutant gives an incorrect result, then the mutant is said to be dead. If
a mutant remains alive even after all the test cases have been exhausted, the test data is enhanced to
kill the mutant. The process of generation and killing of mutants can be automated by predefining a
set of primitive changes that can be applied to the program. These primitive changes can be
alterations such as changing an arithmetic operator, changing the value of a constant, changing a data
type, etc. A major disadvantage of the mutation-based testing approach is that it is computationally
very expensive, since a large number of possible mutants can be generated.
Since mutation testing generates a large number of mutants and requires us to check each mutant
with the full test suite, it is not suitable for manual testing. Mutation testing should be used in
conjunction with a testing tool that can run all the test cases automatically.
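A minimal sketch of the mutation idea follows, assuming a single primitive change (replacing an arithmetic operator) and a tiny invented test suite; real mutation tools apply many predefined operators automatically:

```python
# Hypothetical sketch of operator-swap mutation: generate mutants of a small
# function's source by replacing the '*' operator, then run the test suite
# against each mutant to see whether the suite kills it.

src = "def area(w, h):\n    return w * h\n"

tests = [((3, 4), 12), ((1, 5), 5)]   # (args, expected) pairs

def run_suite(source):
    ns = {}
    exec(source, ns)
    return all(ns["area"](*args) == expected for args, expected in tests)

mutants = []
for i, ch in enumerate(src):
    if ch == "*":                     # primitive change: '*' -> '+'
        mutants.append(src[:i] + "+" + src[i + 1:])

for m in mutants:
    # a mutant is "dead" if at least one test case fails on it
    print("dead" if not run_suite(m) else "alive")
```

Here the single mutant (w + h) is killed by the test case area(3, 4) == 12, since 3 + 4 = 7. If a mutant stayed alive, the test suite would be enhanced with a new case that distinguishes it from the original program.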
Need for Debugging
Once errors are identified in a program code, it is necessary to first identify the precise program
statements responsible for the errors and then to fix them. Identifying errors in a program code
and then fixing them is known as debugging.
Debugging Approaches
The following are some of the approaches popularly adopted by programmers for debugging.
Brute Force Method:
This is the most common method of debugging but is the least efficient method. In this
approach, the program is loaded with print statements to print the intermediate values
with the hope that some of the printed values will help to identify the statement in error.
This approach becomes more systematic with the use of a symbolic debugger (also called
a source code debugger), because values of different variables can be easily checked and
break points and watch points can be easily set to test the values of variables effortlessly.
Backtracking Method:
This is also a fairly common approach. In this approach, beginning from the statement at
which an error symptom has been observed, the source code is traced backwards until the
error is discovered. Unfortunately, as the number of source lines to be traced back
increases, the number of potential backward paths increases and may become
unmanageably large thus limiting the use of this approach.
Cause Elimination Method:
In this approach, a list of causes which could possibly have contributed to the error
symptom is developed and tests are conducted to eliminate each. A related technique of
identification of the error from the error symptom is the software fault tree analysis.
Program Slicing:
This technique is similar to back tracking. Here the search space is reduced by defining
slices. A slice of a program for a particular variable at a particular statement is the set of
source lines preceding this statement that can influence the value of that variable.
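For illustration (an invented example, not from the original text), consider slicing for the variable z at the final statement. Only the statements that can influence the value of z belong to the slice, so a debugger examining a wrong value of z can ignore the rest:

```python
a = 5          # in the slice: defines a, on which z depends
b = 10         # NOT in the slice of z
c = a + 1      # in the slice: defines c from a
d = b * 2      # NOT in the slice of z
z = c + a      # slicing criterion: variable z at this statement
print(z)       # prints 11; the slice is only lines 1, 3 and 5
```

The search space during debugging shrinks from six statements to three, which is the whole point of the technique.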
Debugging Guidelines
Debugging is often carried out by programmers based on their ingenuity. The following are some
general guidelines for effective debugging:
 Many times debugging requires a thorough understanding of the program design. Trying
to debug based on a partial understanding of the system design and implementation may
require an inordinate amount of effort to be put into debugging even simple problems.
 Debugging may sometimes even require full redesign of the system. In such cases, a
common mistake that novice programmers often make is attempting not to fix the error
but its symptoms.
 One must beware of the possibility that an error correction may introduce new errors.
Therefore after every round of error-fixing, regression testing must be carried out.
Program Analysis Tools
A program analysis tool means an automated tool that takes the source code or the executable
code of a program as input and produces reports regarding several important characteristics of
the program, such as its size, complexity, adequacy of commenting, adherence to programming
standards, etc. We can classify these into two broad categories of program analysis tools:
Static Analysis tools
Dynamic Analysis tools
Static program analysis tools
A static analysis tool assesses and computes various characteristics of a software product
without executing it. Typically, static analysis tools analyze
some structural representation of a program to arrive at certain analytical conclusions, e.g. that
some structural properties hold. The structural properties that are usually analyzed are:
 Whether the coding standards have been adhered to.
 Certain programming errors such as uninitialized variables and mismatch
between actual and formal parameters, variables that are declared but never
used are also checked.
Code walk throughs and code inspections might be considered as static analysis methods. But,
the term static program analysis is used to denote automated analysis tools. So, a compiler can be
considered to be a static program analysis tool.
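One of the structural checks mentioned above, flagging variables that are declared (assigned) but never used, can be sketched without executing the analyzed program. The sketch below uses Python's ast module and a deliberately simplified notion of "use" (any load of the name); production static analyzers are far more thorough:

```python
# Minimal sketch of one static analysis check: find variables that are
# assigned but never read, purely by inspecting a structural representation
# (the abstract syntax tree) of the program under analysis.

import ast

code = """
x = 1
y = 2
print(x)
"""

tree = ast.parse(code)
assigned, used = set(), set()
for node in ast.walk(tree):
    if isinstance(node, ast.Name):
        if isinstance(node.ctx, ast.Store):
            assigned.add(node.id)
        elif isinstance(node.ctx, ast.Load):
            used.add(node.id)

print(sorted(assigned - used))  # variables assigned but never used: ['y']
```

Nothing in `code` is ever run; the conclusion comes entirely from the program's structure, which is what distinguishes static from dynamic analysis.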
Dynamic program analysis tools - Dynamic program analysis techniques require the program to
be executed and its actual behavior recorded. A dynamic analyzer usually instruments the code
(i.e. adds additional statements in the source code to collect program execution traces). The
instrumented code when executed allows us to record the behavior of the software for different
test cases. After the software has been tested with its full test suite and its behavior recorded, the
dynamic analysis tool carries out a post-execution analysis and produces reports which describe
the structural coverage that has been achieved by the complete test suite for the program. For
example, the post-execution dynamic analysis report might provide data on the extent of statement,
branch, and path coverage achieved.
Normally the dynamic analysis results are reported in the form of a histogram or a pie chart to
describe the structural coverage achieved for different modules of the program. The output of a
dynamic analysis tool can be stored and printed easily and provides evidence that thorough
testing has been done. The dynamic analysis results indicate the extent of testing performed in white-box
mode. If the testing coverage is not satisfactory, more test cases can be designed and added to the
test suite. Further, dynamic analysis results can help to eliminate redundant test cases from the
test suite.
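The instrumentation idea can be sketched as follows, using Python's sys.settrace hook as a stand-in for inserting probe statements into the source (an assumption for brevity; real dynamic analyzers such as coverage tools instrument the code itself and aggregate results over the whole test suite):

```python
# Sketch of the core idea behind a dynamic analyzer: record which lines of
# a function actually execute for a given test case, giving a statement
# coverage trace that a reporting step could summarize.

import sys

executed = set()

def tracer(frame, event, arg):
    if event == "line":
        executed.add(frame.f_lineno)   # record each executed line number
    return tracer

def classify(n):
    if n > 0:
        return "positive"
    return "non-positive"

sys.settrace(tracer)
classify(5)            # this test case never reaches the final return
sys.settrace(None)

print(len(executed))   # 2 distinct lines of classify were executed
```

Because the test case classify(5) exercises only the "positive" branch, the trace exposes the uncovered statement, which is exactly the kind of gap a coverage report would flag so that another test case can be added.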
The primary objective of integration testing is to test the module interfaces, i.e. to check that there
are no errors in parameter passing when one module invokes another module. During integration
testing, different modules of a system are integrated in a planned manner using an integration
plan. The integration plan specifies the steps and the order in which modules are combined to
realize the full system. After each integration step, the partially integrated system is tested. An
important factor that guides the integration plan is the module dependency graph. The structure
chart (or module dependency graph) denotes the order in which different modules call each
other. By examining the structure chart the integration plan can be developed.
Integration test approaches
There are four types of integration testing approaches. Any one (or a mixture) of the following
approaches can be used to develop the integration test plan. Those approaches are the following:
Big bang approach
Bottom-up approach
Top-down approach
Mixed (sandwiched) approach
Big-Bang Integration Testing
It is the simplest integration testing approach, where all the modules making up a system are
integrated in a single step. In simple words, all the modules of the system are simply put together
and tested. However, this technique is practicable only for very small systems. The main
problem with this approach is that once an error is found during the integration testing, it is very
difficult to localize the error as the error may potentially belong to any of the modules being
integrated. Therefore, errors reported during big-bang integration testing are very
expensive to fix.
Bottom-Up Integration Testing
In bottom-up testing, each subsystem is tested separately and then the full system is tested. A
subsystem might consist of many modules which communicate among each other through
well-defined interfaces. The primary purpose of testing each subsystem is to test the interfaces among
various modules making up the subsystem. Both control and data interfaces are tested. The test
cases must be carefully chosen to exercise the interfaces in all possible manners. Large software
systems normally require several levels of subsystem testing; lower-level subsystems are
successively combined to form higher-level subsystems. A principal advantage of bottom-up
integration testing is that several disjoint subsystems can be tested simultaneously. In a pure
bottom-up testing no stubs are required, only test-drivers are required. A disadvantage of
bottom-up testing is the complexity that occurs when the system is made up of a large number of small
subsystems. The extreme case corresponds to the big-bang approach.
Top-Down Integration Testing
Top-down integration testing starts with the main routine and one or two subordinate routines in
the system. After the top-level ‘skeleton’ has been tested, the immediate subordinate routines of the
‘skeleton’ are combined with it and tested. Top-down integration testing approach requires the
use of program stubs to simulate the effect of lower-level routines that are called by the routines
under test. A pure top-down integration does not require any driver routines. A disadvantage of
the top-down integration testing approach is that in the absence of lower-level routines, many
times it may become difficult to exercise the top-level routines in the desired manner since the
lower-level routines perform several low-level functions such as I/O.
Mixed Integration Testing
A mixed (also called sandwiched) integration testing follows a combination of top-down and
bottom-up testing approaches. In top-down approach, testing can start only after the top-level
modules have been coded and unit tested. Similarly, bottom-up testing can start only after the
bottom level modules are ready. The mixed approach overcomes this shortcoming of the
top-down and bottom-up approaches. In the mixed testing approach, testing can start as and when
modules become available. Therefore, this is one of the most commonly used integration testing approaches.
Phased Vs. Incremental Testing
The different integration testing strategies are either phased or incremental. A comparison of
these two strategies is as follows:
o In incremental integration testing, only one new module is added to the partial
system each time.
o In phased integration, a group of related modules are added to the partial system
each time.
Phased integration requires a smaller number of integration steps compared to the incremental
integration approach. However, when failures are detected, it is easier to debug the system in the
incremental testing approach since it is known that the error is caused by addition of a single
module. In fact, big bang testing is a degenerate case of the phased integration testing approach.
System testing
System tests are designed to validate a fully developed system to assure that it meets its
requirements. There are essentially three main kinds of system testing:
 Alpha Testing. Alpha testing refers to the system testing carried out by the test team
within the developing organization.
 Beta testing. Beta testing is the system testing performed by a select group of friendly
customers.
 Acceptance Testing. Acceptance testing is the system testing performed by the customer
to determine whether he should accept the delivery of the system.
In each of the above types of tests, various kinds of test cases are designed by referring to the
SRS document. Broadly, these tests can be classified into functionality and performance tests.
The functionality test tests the functionality of the software to check whether it satisfies the
functional requirements as documented in the SRS document. The performance test tests the
conformance of the system with the nonfunctional requirements of the system.
Performance Testing
Performance testing is carried out to check whether the system meets the non-functional
requirements identified in the SRS document. There are several types of performance testing;
nine of them are discussed below. The types of performance testing to be carried
out on a system depend on the different non-functional requirements of the system documented
in the SRS document. All performance tests can be considered as black-box tests.
• Stress testing
• Volume testing
• Configuration testing
• Compatibility testing
• Regression testing
• Recovery testing
• Maintenance testing
• Documentation testing
• Usability testing
Stress Testing -Stress testing is also known as endurance testing. Stress testing
evaluates system performance when it is stressed for short periods of time. Stress tests are
black box tests which are designed to impose a range of abnormal and even illegal input
conditions so as to stress the capabilities of the software. Input data volume, input data
rate, processing time, utilization of memory, etc. are tested beyond the designed capacity.
For example, suppose an operating system is supposed to support 15 multiprogrammed
jobs; the system is stressed by attempting to run more than 15 jobs simultaneously. A real-time
system might be tested to determine the effect of simultaneous arrival of several
high-priority interrupts.
Stress testing is especially important for systems that usually operate below the maximum
capacity but are severely stressed at some peak demand hours. For example, if the
non-functional requirement specification states that the response time should not be more than
20 secs per transaction when 60 concurrent users are working, then during the stress
testing the response time is checked with 60 users working simultaneously.
Volume Testing - It is especially important to check whether the data structures (arrays,
queues, stacks, etc.) have been designed to successfully handle extraordinary situations. For
example, a compiler might be tested to check whether the symbol table overflows when a
very large program is compiled.
Configuration Testing - This is used to analyze system behavior in various hardware
and software configurations specified in the requirements. Sometimes systems are built in
variable configurations for different users. For instance, we might define a minimal
system to serve a single user, and other extension configurations to serve additional users.
The system is configured in each of the required configurations and it is checked if the
system behaves correctly in all required configurations.
Compatibility Testing -This type of testing is required when the system interfaces with
other types of systems. Compatibility testing aims to check whether the interface functions
perform as required. For instance, if the system needs to communicate with a large
database system to retrieve information, compatibility testing is required to test the speed
and accuracy of data retrieval.
Regression Testing - This type of testing is required when the system being tested is an
upgrade of an already existing system to fix some bugs or enhance functionality,
performance, etc. Regression testing is the practice of running an old test suite after each
change to the system or after each bug fix to ensure that no new bug has been introduced
due to the change or the bug fix. However, if only a few statements are changed, then the
entire test suite need not be run - only those test cases that test the functions that are
likely to be affected by the change need to be run.
Recovery Testing -Recovery testing tests the response of the system to the presence of
faults, or loss of power, devices, services, data, etc. The system is subjected to the loss of
the mentioned resources (as applicable and discussed in the SRS document) and it is
checked if the system recovers satisfactorily. For example, the printer can be
disconnected to check if the system hangs. Or, the power may be shut down to check the
extent of data loss and corruption.
Maintenance Testing- This testing addresses the diagnostic programs, and other
procedures that are required to be developed to help maintenance of the system. It is
verified that the artifacts exist and they perform properly.
Documentation Testing- It is checked that the required user manual, maintenance
manuals, and technical manuals exist and are consistent. If the requirements specify the
types of audience for which a specific manual should be designed, then the manual is
checked for compliance.
Usability Testing- Usability testing concerns checking the user interface to see if it
meets all user requirements concerning the user interface. During usability testing, the
display screens, report formats, and other aspects relating to the user interface
requirements are tested.
Error Seeding
Sometimes the customer might specify the maximum number of allowable errors that may be
present in the delivered system. These are often expressed in terms of maximum number of
allowable errors per line of source code. Error seeding can be used to estimate the number of
residual errors in a system. Error seeding, as the name implies, seeds the code with some known
errors. In other words, some artificial errors are introduced into the program. The
number of these seeded errors detected in the course of the standard testing procedure is
determined. These values in conjunction with the number of unseeded errors detected can be
used to predict:
• The number of errors remaining in the product.
• The effectiveness of the testing strategy.
Let N be the total number of defects in the system and let n of these defects be found by testing.
Let S be the total number of seeded defects, and let s of these defects be found during testing.
Assuming that seeded and unseeded defects are equally likely to be detected,
n/N = s/S
N = S × n/s
Defects still remaining after testing = N – n = n × (S – s)/s
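A small worked example of these formulas, with invented numbers:

```python
# Worked example of the error-seeding estimate. The figures are invented
# for illustration: S = 100 errors seeded, s = 50 of them found by testing,
# together with n = 20 unseeded (real) errors found.

S, s, n = 100, 50, 20

N = S * n / s                    # estimated total real defects: N = S * n / s
remaining = n * (S - s) / s      # defects still remaining = N - n

print(N, remaining)              # 40.0 20.0
```

Since only half the seeded errors were caught, the estimate assumes testing also caught only half the real errors: 20 found implies about 40 in total, leaving roughly 20 undiscovered.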
Error seeding works satisfactorily only if the kind of seeded errors matches closely with the kind
of defects that actually exist. However, it is difficult to predict the types of errors that exist in
software. To some extent, the different categories of errors that remain can be estimated to a first
approximation by analyzing historical data of similar projects. Due to the shortcoming that the
types of seeded errors should match closely with the types of errors actually existing in the code,
error seeding is useful only to a moderate extent.
Regression Testing
Regression testing does not belong to either unit test, integration test, or system testing. Instead,
it is a separate dimension to these three forms of testing. The functionality of regression testing
has been discussed earlier.
Necessity of Software Maintenance
Software maintenance is becoming an important activity of a large number of software
organizations. This is no surprise, given the rate of hardware obsolescence, the immortality of a
software product per se, and the demand of the user community to see the existing software
products run on newer platforms, run in newer environments, and/or with enhanced features.
When the hardware platform is changed, and a software product performs some low-level
functions, maintenance is necessary. Also, whenever the support environment of a software
product changes, the software product requires rework to cope with the newer interface. For
instance, a software product may need to be maintained when the operating system changes.
Thus, every software product continues to evolve after its development through maintenance
efforts. Therefore it can be stated that software maintenance is needed to correct errors, enhance
features, port the software to new platforms, etc.
Types of software maintenance
There are basically three types of software maintenance. These are:
 Corrective: Corrective maintenance of a software product is necessary to rectify the bugs
observed while the system is in use.
 Adaptive: A software product might need maintenance when the customers need the
product to run on new platforms, on new operating systems, or when they need the
product to interface with new hardware or software.
 Perfective: A software product needs maintenance to support the new features that users
want it to support, to change different functionalities of the system according to customer
demands, or to enhance the performance of the system.
Problems associated with software maintenance
Software maintenance work typically is much more expensive than what it should be and takes
more time than required. In software organizations, maintenance work is mostly carried out
using ad hoc techniques. The primary reason is that software maintenance is one of the most
neglected areas of software engineering. Even though software maintenance is fast becoming an
important area of work for many companies as the software products of yester years age, still
software maintenance is mostly being carried out as fire-fighting operations, rather than through
systematic and planned activities.
Software maintenance has a very poor image in industry. Therefore, an organization often cannot
employ bright engineers to carry out maintenance work. Even though maintenance suffers from a
poor image, the work involved is often more challenging than development work. During
maintenance it is necessary to thoroughly understand someone else’s work and then carry out the
required modifications and extensions.
Another problem associated with maintenance work is that the majority of software products
needing maintenance are legacy products.
Software Reverse Engineering
Software reverse engineering is the process of recovering the design and the requirements
specification of a product from an analysis of its code. The purpose of reverse engineering is to
facilitate maintenance work by improving the understandability of a system and to produce the
necessary documents for a legacy system. Reverse engineering is becoming important, since
legacy software products lack proper documentation, and are highly unstructured. Even
well-designed products become legacy software as their structure degrades through a series of
maintenance efforts.
The first stage of reverse engineering usually focuses on carrying out cosmetic changes to the
code to improve its readability, structure, and understandability, without changing its
functionality. A process model for reverse engineering has been shown in fig. 24.1. A program
can be reformatted using any of several available pretty-printer programs which lay out the
program neatly. Many legacy software products with complex control structures and poorly chosen
variable names are difficult to comprehend. Assigning meaningful variable names is important
because meaningful variable names are the most helpful thing in code documentation. All
variables, data structures, and functions should be assigned meaningful names wherever possible.
Complex nested conditionals in the program can be replaced by simpler conditional statements
or whenever appropriate by case statements.
Fig. 24.1: A process model for reverse engineering
After the cosmetic changes have been carried out on a legacy software the process of extracting
the code, design, and the requirements specification can begin. These activities are schematically
shown in fig. 24.2. In order to extract the design, a full understanding of the code is needed.
Some automatic tools can be used to derive the data flow and control flow diagram from the
code. The structure chart (module invocation sequence and data interchange among modules)
should also be extracted. The SRS document can be written once the full code has been
thoroughly understood and the design extracted.
Fig. 24.2: Extraction of the code, design, and requirements specification
Legacy software products
It is prudent to define a legacy system as any software system that is hard to maintain. The
typical problems associated with legacy systems are poor documentation, unstructured (spaghetti
code with ugly control structure), and lack of personnel knowledgeable in the product. Many of
the legacy systems were developed a long time ago. However, it is possible that a recently developed
system having poor design and documentation may also be considered a legacy system.
The activities involved in a software maintenance project are not unique and depend on several
factors such as:
• the extent of modification to the product required
• the resources available to the maintenance team
• the conditions of the existing product (e.g., how structured it is, how well documented it
is, etc.)
• the expected project risks, etc.
When the changes needed to a software product are minor and straightforward, the code can be
directly modified and the changes appropriately reflected in all the documents. But more
elaborate activities are required when the required changes are not so trivial. Usually, for
complex maintenance projects for legacy systems, the software process can be represented by a
reverse engineering cycle followed by a forward engineering cycle with an emphasis on as much
reuse as possible from the existing code and other documents.
Two broad categories of process models for software maintenance can be proposed. The first
model is preferred for projects involving small reworks where the code is changed directly and
the changes are reflected in the relevant documents later. This maintenance process is graphically
presented in fig. 25.1. In this approach, the project starts by gathering the requirements for
changes. The requirements are next analyzed to formulate the strategies to be adopted for code
change. At this stage, the association of at least a few members of the original development team
goes a long way in reducing the cycle time, especially for projects involving unstructured and
inadequately documented code. The availability of a working old system to the maintenance
engineers at the maintenance site greatly facilitates the task of the maintenance team as they get a
good insight into the working of the old system and also can compare the working of their
modified system with the old system. Also, debugging of the reengineered system becomes
easier as the program traces of both the systems can be compared to localize the bugs.
Fig. 25.1: Maintenance process model 1
The second process model for software maintenance is preferred for projects where the amount
of rework required is significant. This approach can be represented by a reverse engineering
cycle followed by a forward engineering cycle. Such an approach is also known as software
reengineering. This process model is depicted in fig. 25.2. The reverse engineering cycle is
required for legacy products. During the reverse engineering, the old code is analyzed
(abstracted) to extract the module specifications. The module specifications are then analyzed to
produce the design. The design is analyzed (abstracted) to produce the original requirements
specification. The change requests are then applied to this requirements specification to arrive at
the new requirements specification. At the design, module specification, and coding stages,
substantial reuse is made from the reverse engineered products. An important advantage of this approach is
that it produces a more structured design compared to what the original product had, produces
good documentation, and very often results in increased efficiency. The efficiency improvements
are brought about by a more efficient design. However, this approach is more costly than the first
approach. An empirical study indicates that process 1 is preferable when the amount of rework is
no more than 15%. Besides the amount of rework, several other factors might affect the decision
regarding using process model 1 over process model 2:
 Reengineering might be preferable for products which exhibit a high failure rate.
 Reengineering might also be preferable for legacy products having poor design
and code structure.
Fig. 25.2: Maintenance process model 2
Software Reengineering
Software reengineering is a combination of two consecutive processes i.e. software reverse
engineering and software forward engineering as shown in the fig. 25.2.
Estimation of approximate maintenance cost
It is well known that maintenance efforts require about 60% of the total life cycle cost for a
typical software product. However, maintenance costs vary widely from one application domain
to another. For embedded systems, the maintenance cost can be as much as 2 to 4 times the
development cost.
Boehm [1981] proposed a formula for estimating maintenance costs as part of his COCOMO
cost estimation model. Boehm’s maintenance cost estimation is made in terms of a quantity
called the Annual Change Traffic (ACT). Boehm defined ACT as the fraction of a software
product’s source instructions which undergo change during a typical year either through addition
or deletion.
ACT = (KLOCadded + KLOCdeleted) / KLOCtotal
where, KLOCadded is the total kilo lines of source code added during maintenance,
KLOCdeleted is the total kilo lines of source code deleted during maintenance, and
KLOCtotal is the total size of the product in kilo lines of code.
Thus, the code that is changed should be counted in both the code added and the code deleted.
The annual change traffic (ACT) is multiplied with the total development cost to arrive at the
maintenance cost:
maintenance cost = ACT × development cost.
Most maintenance cost estimation models, however, yield only approximate results because they
do not take into account several factors such as experience level of the engineers, and familiarity
of the engineers with the product, hardware requirements, software complexity, etc.
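As a hedged sketch of Boehm's estimate (the KLOC figures, development cost, and function names below are invented for illustration), the computation is straightforward:

```python
# Minimal sketch of Boehm's COCOMO maintenance cost estimate.
# All inputs below are made-up example values, not real project data.

def annual_change_traffic(kloc_added, kloc_deleted, kloc_total):
    """ACT: fraction of the code base that changes in a typical year."""
    return (kloc_added + kloc_deleted) / kloc_total

def maintenance_cost(act, development_cost):
    """Annual maintenance cost = ACT x development cost."""
    return act * development_cost

# Suppose 5 KLOC were added and 3 KLOC deleted in a 100 KLOC product:
act = annual_change_traffic(kloc_added=5.0, kloc_deleted=3.0, kloc_total=100.0)
print(act)                               # 0.08
print(maintenance_cost(act, 1_000_000))  # ≈ 80000.0 per year
```

As the text notes, such an estimate is only approximate: it ignores engineer experience, product familiarity, and software complexity.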
Repeatable vs. non-repeatable software development organization
A repeatable software development organization is one in which the software development
process is person-independent. In a non-repeatable software development organization, a
software development project becomes successful primarily due to the initiative, effort,
brilliance, or enthusiasm displayed by certain individuals. Thus, in a non-repeatable software development organization, the chances of successful completion of a software project depend to a great extent on the team members.
Software Reliability
Reliability of a software product essentially denotes its trustworthiness or dependability.
Alternatively, reliability of a software product can also be defined as the probability of the
product working “correctly” over a given period of time.
It is obvious that a software product having a large number of defects is unreliable. It is also
clear that the reliability of a system improves, if the number of defects in it is reduced. However,
there is no simple relationship between the observed system reliability and the number of latent
defects in the system. For example, removing errors from parts of a software which are rarely
executed makes little difference to the perceived reliability of the product. It has been
experimentally observed by analyzing the behavior of a large number of programs that 90% of
the execution time of a typical program is spent in executing only 10% of the instructions in the
program. These most-used 10% of instructions are often called the core of the program. The remaining 90% of the program statements are called non-core and are executed for only 10% of the total
execution time. It therefore may not be very surprising to note that removing 60% of the product defects from the least used parts of a system would typically lead to only about 3% improvement in the
product reliability. It is clear that the quantity by which the overall reliability of a program
improves due to the correction of a single error depends on how frequently the corresponding
instruction is executed.
Thus, reliability of a product depends not only on the number of latent errors but also on the
exact location of the errors. Apart from this, reliability also depends upon how the product is
used, i.e. on its execution profile. If the input data to the system is selected such that only the
“correctly” implemented functions are executed, none of the errors will be exposed and the
perceived reliability of the product will be high. On the other hand, if the input data is selected
such that only those functions which contain errors are invoked, the perceived reliability of the
system will be very low.
Reasons for software reliability being difficult to measure
The reasons why software reliability is difficult to measure can be summarized as follows:
 The reliability improvement due to fixing a single bug depends on where the bug is
located in the code.
 The perceived reliability of a software product is highly observer-dependent.
 The reliability of a product keeps changing as errors are detected and fixed.
 Hardware reliability and software reliability differ fundamentally in nature.
The reliability behaviors of hardware and software are very different: most hardware failures are due to component
wear and tear. A logic gate may be stuck at 1 or 0, or a resistor might short circuit. To fix
hardware faults, one has to either replace or repair the failed part. On the other hand, a software
product would continue to fail until the error is tracked down and either the design or the code is
changed. For this reason, when a hardware is repaired its reliability is maintained at the level that
existed before the failure occurred; whereas when a software failure is repaired, the reliability
may either increase or decrease (reliability may decrease if a bug introduces new errors). To put
this fact in a different perspective, hardware reliability study is concerned with stability (for
example, inter-failure times remain constant). On the other hand, software reliability study aims
at reliability growth (i.e. inter-failure times increase). The change of failure rate over the product
lifetime for a typical hardware and a software product are sketched in fig. 26.1. For hardware
products, it can be observed that failure rate is high initially but decreases as the faulty
components are identified and removed. The system then enters its useful life. After some time
(called product life time) the components wear out, and the failure rate increases. This gives the
plot of hardware failure rate over time its characteristic "bath tub" shape. On the other hand, for software the failure rate is at its highest during integration and test. As the system is tested,
more and more errors are identified and removed resulting in reduced failure rate. This error
removal continues at a slower pace during the useful life of the product. As the software
becomes obsolete, no error correction occurs and the failure rate remains unchanged.
(a) Hardware product
(b) Software product
Fig. 26.1: Change in failure rate of a product
Reliability Metrics
The reliability requirements for different categories of software products may be different. For
this reason, it is necessary that the level of reliability required for a software product should be
specified in the SRS (software requirements specification) document. In order to be able to do
this, some metrics are needed to quantitatively express the reliability of a software product. A
good reliability measure should be observer-independent, so that different people can agree on the
degree of reliability a system has. For example, there are precise techniques for measuring
performance, which would result in obtaining the same performance value irrespective of who is
carrying out the performance measurement. However, in practice, it is very difficult to formulate
a precise reliability measurement technique. The next best option is to have measures that correlate with reliability. There are six reliability metrics which can be used to quantify the reliability of
software products.
 Rate of occurrence of failure (ROCOF)- ROCOF measures the frequency of
occurrence of unexpected behavior (i.e. failures). ROCOF measure of a software product
can be obtained by observing the behavior of a software product in operation over a
specified time interval and then recording the total number of failures occurring during
the interval.
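As an illustrative sketch (the failure timestamps and observation window below are assumed example values), ROCOF is simply the failure count divided by the length of the observation interval:

```python
# Hypothetical sketch: ROCOF from failure timestamps logged during
# an observation window. All data here is invented for illustration.

def rocof(failure_times, interval_hours):
    """Rate of occurrence of failure: failures per hour of operation."""
    return len(failure_times) / interval_hours

# Suppose 4 failures were logged over a 200-hour observation period:
failures = [12.5, 80.0, 150.3, 190.7]  # hours at which failures occurred
print(rocof(failures, 200.0))          # 0.02 failures per hour
```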
 Mean Time To Failure (MTTF) - MTTF is the average time between two successive
failures, observed over a large number of failures. To measure MTTF, we can record the
failure data for n failures. Let the failures occur at the time instants t1, t2, …, tn. Then,
MTTF can be calculated as the average of the inter-failure times:
MTTF = Σ(i=1 to n−1) (t(i+1) − ti) / (n − 1)
It is important to note that only run time is considered in the time measurements, i.e. the
time for which the system is down to fix the error, the boot time, etc are not taken into
account in the time measurements and the clock is stopped at these times.
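A minimal sketch of this calculation (the failure instants below are assumed run-time values, not real data):

```python
# Illustrative sketch: MTTF as the average gap between successive
# failure instants t1, t2, ..., tn, measured on the run-time clock.

def mttf(failure_instants):
    """Mean time to failure from a sorted list of n failure instants."""
    gaps = [b - a for a, b in zip(failure_instants, failure_instants[1:])]
    return sum(gaps) / len(gaps)

# Failures observed at these run-time hours:
t = [10.0, 30.0, 65.0, 90.0]
print(mttf(t))   # (20 + 35 + 25) / 3 ≈ 26.67 hours
```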
 Mean Time To Repair (MTTR) - Once a failure occurs, some time is required to fix the
error. MTTR measures the average time it takes to track the errors causing the failure and
to fix them.
 Mean Time Between Failures (MTBF) - MTTF and MTTR can be combined to get the MTBF metric: MTBF = MTTF + MTTR. Thus, an MTBF of 300 hours indicates that once a failure occurs, the next failure is expected after 300 hours. In this case, time
measurements are real time and not the execution time as in MTTF.
 Probability of Failure on Demand (POFOD) - Unlike the other metrics discussed, this
metric does not explicitly involve time measurements. POFOD measures the likelihood
of the system failing when a service request is made. For example, a POFOD of 0.001
would mean that 1 out of every 1000 service requests would result in a failure.
 Availability - Availability of a system is a measure of how likely the system is to be available for use over a given period of time. This metric not only considers the number
of failures occurring during a time interval, but also takes into account the repair time
(down time) of a system when a failure occurs. This metric is important for systems such
as telecommunication systems, and operating systems, which are supposed to be never
down and where repair and restart time are significant and loss of service during that time
is important.
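These last three metrics can be sketched together as follows (all numbers are assumed example values, and the function names are illustrative, not a standard API):

```python
# Illustrative sketch of MTBF, POFOD, and availability.
# All inputs are invented example values.

def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures = MTTF + MTTR."""
    return mttf_hours + mttr_hours

def pofod(failed_requests, total_requests):
    """Probability of failure on demand: failing requests / all requests."""
    return failed_requests / total_requests

def availability(mttf_hours, mttr_hours):
    """Fraction of real time the system is up (includes repair time)."""
    return mttf_hours / (mttf_hours + mttr_hours)

print(mtbf(290.0, 10.0))          # 300.0 hours
print(pofod(1, 1000))             # 0.001 (1 failure per 1000 requests)
print(availability(290.0, 10.0))  # ≈ 0.967, i.e. about 96.7% uptime
```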
Classification of software failures
A possible classification of failures of software products into five different types is as follows:
 Transient- Transient failures occur only for certain input values while invoking a
function of the system.
 Permanent- Permanent failures occur for all input values while invoking a function of
the system.
 Recoverable- When recoverable failures occur, the system recovers with or without
operator intervention.
 Unrecoverable- In unrecoverable failures, the system may need to be restarted.
 Cosmetic- These classes of failures cause only minor irritations, and do not lead to
incorrect results. An example of a cosmetic failure is the case where the mouse button has
to be clicked twice instead of once to invoke a given function through the graphical user
interface.
Reliability Growth Modelling
A reliability growth model is a mathematical model of how software reliability improves as
errors are detected and repaired. A reliability growth model can be used to predict when (or if at
all) a particular level of reliability is likely to be attained. Thus, reliability growth modeling can
be used to determine when to stop testing to attain a given reliability level. Although several
different reliability growth models have been proposed, in this text we will discuss only two very
simple reliability growth models.
Jelinski and Moranda Model -The simplest reliability growth model is a step function model
where it is assumed that the reliability increases by a constant increment each time an error is
detected and repaired. Such a model is shown in fig. 27.1. However, this simple model of
reliability which implicitly assumes that all errors contribute equally to reliability growth, is
highly unrealistic since it is already known that correction of different types of errors contribute
differently to reliability growth.
Fig. 27.1: Step function model of reliability growth
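The step-function view can be sketched in a few lines (the initial reliability, increment, and function name below are assumed values for illustration, not part of the original model's notation):

```python
# Minimal sketch of a Jelinski-Moranda-style step-function model:
# each repair bumps reliability by a fixed increment (assumed values),
# capped at a ceiling of 1.0.

def step_reliability(initial, increment, repairs_done, cap=1.0):
    """Reliability after a given number of repairs, clipped at the cap."""
    return min(initial + increment * repairs_done, cap)

for n in range(5):
    # roughly 0.70, 0.75, 0.80, 0.85, 0.90
    print(n, step_reliability(0.70, 0.05, n))
```

The constant-increment assumption is exactly what makes this model unrealistic, as the text points out: in practice different error corrections contribute very differently to reliability growth.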
Littlewood and Verrall's Model - This model allows for negative reliability growth to reflect the
fact that when a repair is carried out, it may introduce additional errors. It also models the fact
that as errors are repaired, the average improvement in reliability per repair decreases (Fig. 27.2).
It treats an error's contribution to reliability improvement as an independent random variable
having a Gamma distribution. This distribution models the fact that error corrections with large
contributions to reliability growth are removed first. This represents the diminishing returns obtained as testing continues.
Fig. 27.2: Random-step function model of reliability growth
Statistical Testing
Statistical testing is a testing process whose objective is to determine the reliability of software
products rather than discovering errors. Test cases are designed for statistical testing with an
entirely different objective than those of conventional testing.
Operation profile
Different categories of users may use a software for different purposes. For example, a Librarian
might use the library automation software to create member records, add books to the library,
etc. whereas a library member might use the software to query about the availability of a book,
or to issue and return books. Formally, the operation profile of a software can be defined as the
probability distribution of the input of an average user. If the input is divided into a number of
classes {Ci}, the probability value of a class represents the probability of an average user
selecting his next input from that class. Thus, the operation profile assigns a probability value Pi to each input
class Ci.
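As a hedged sketch, an operation profile can be used directly to draw test inputs: the class names and probabilities below are invented for the library example above, purely for illustration.

```python
# Illustrative sketch: sampling test input classes according to an
# operation profile {Ci: Pi}. Classes and probabilities are made up.
import random

profile = {
    "issue_book":  0.50,
    "return_book": 0.30,
    "query_book":  0.15,
    "add_member":  0.05,
}

def next_input_class(rng=random):
    """Select the next input class Ci with probability Pi."""
    classes, probs = zip(*profile.items())
    return rng.choices(classes, weights=probs, k=1)[0]

random.seed(42)
sample = [next_input_class() for _ in range(1000)]
# Observed frequencies should roughly match the profile,
# e.g. around 500 of the 1000 draws should be "issue_book".
```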
Steps in statistical testing
Statistical testing allows one to concentrate on testing those parts of the system that are most
likely to be used. The first step of statistical testing is to determine the operation profile of the
software. The next step is to generate a set of test data corresponding to the determined operation
profile. The third step is to apply the test cases to the software and record the time between each
failure. After a statistically significant number of failures have been observed, the reliability can
be computed.
Advantages and disadvantages of statistical testing
Statistical testing allows one to concentrate on testing parts of the system that are most likely to
be used. Therefore, it results in a system that the users perceive to be more reliable (than it actually is!).
Reliability estimation using statistical testing is more accurate compared to those of other
methods such as ROCOF, POFOD etc. But it is not easy to perform statistical testing properly.
There is no simple and repeatable way of defining operation profiles. Also, it is very
cumbersome to generate test cases for statistical testing because the number of test cases with
which the system is to be tested should be statistically significant.