Chapter 11: Testing
According to the authors, “Testing is the process of finding differences between the
behavior specified by the system models and the observed behavior of the system.” In
effect, the main purpose of testing is to break the system. During the development
phases (such as system design, object design, etc.) our main goal was to build a system
that satisfies the requirements and specifications of the client. Having completed
building the system, we now need to break it. The idea is that if we cannot break the
system, then it is very likely that we did a good job; that is, the system has met the
requirements and specifications provided by the client. Also, note that tests should be
carried out by qualified persons who are somewhat familiar with the whole system. It is
better, however, that a tester not be a developer. Furthermore, a tester should be
familiar with testing techniques.
There are different types of tests used at different “stages” of the application. For example,
there are:
1. Unit tests.
2. Structural tests.
3. Functional tests.
4. Performance tests.
Some techniques to conduct tests that fall under quality control techniques are:
1. Fault Avoidance Techniques: Try to prevent the occurrence of errors and failures
by finding faults in the system before it is released. They include:
(a) Development methodologies: provide techniques that reduce fault introduction
into the system models and code. These include unambiguous representation of
requirements, minimizing coupling and maximizing coherence, use of data
abstraction and encapsulation, capture of the rationale for maintenance,
and early definition of subsystem interfaces.
(b) Configuration management: avoids faults caused by undisciplined change in the
system. Other developers need to be notified when changes are made, so that the
parts of the system that may be affected by such changes can be updated.
(c) Verification: attempts to find faults before any execution of the system. It
assumes that preconditions are true and verifies that postconditions are met.
(d) Review: a manual inspection of the system without actually executing it.
There are two types – walkthrough and inspection. A walkthrough goes through
the code line by line, trying to identify errors in the code. An inspection, which is
similar to a walkthrough, checks the code against the requirements, checks the
algorithm for efficiency, and checks the comments to see if they are accurate.
The developer is not present during an inspection.
2. Fault Detection Techniques: Attempt to find faults in the system during
development and, in some cases, after release. These techniques do not attempt to
recover from the fault – an example is the “black box” found on an aircraft, which
records the information needed to determine the reasons for a crash. There are two
types of fault detection techniques:
(a) Debugging: the debugger moves through a number of states, finally arriving at an
error and hence being able to fix it. It is finding an error in what is regarded as an
unplanned way. There are two types of debugging: correctness debugging, which
finds deviations between the observed behavior and the specified functional
requirements, and performance debugging, which finds deviations between the
observed behavior and the nonfunctional requirements, such as response time.
(b) Testing: finding an error in a planned way. Note – a successful test is a test that
was able to find errors. The developer tries to find errors before delivery of the
system. The idea is to choose test data that is likely to expose failures of the
system. Testing activities include:
(i) Unit Testing: tries to find faults in participating objects and/or subsystems
with respect to the use cases from the use case model.
(ii) Integration Testing: the activity of finding faults when testing the
individually tested components together. Structure testing is the
culmination of integration testing, whereby all of the components
are tested together.
(iii) System Testing: tests all of the components together (seen as a single
system) with respect to functional requirements, design goals, etc. It includes:
• Functional Testing: tests the requirements from the RAD
(Requirements Analysis Document) and, if available, from the user
manual.
• Performance Testing: checks the nonfunctional requirements and
additional design goals. Done by developers.
• Acceptance Testing and Installation Testing: check the requirements
against the project agreement. Done by the client with support from
the developers.
3. Fault Tolerance Techniques:
There may be cases in which we are unable to prevent errors. If so, we must be able
to recover from any such errors. Fault tolerance techniques are critical in
highly reliable systems, for example a life-saving unit or the space shuttle (which has
five onboard computers providing modular redundancy), etc.
Testing concepts:
1. A component: a part of the system that can be isolated for testing – could be an
object, a group of objects or one or more subsystems.
2. A fault: bug or defect – a design or coding mistake that may cause abnormal
component behavior.
3. An error: manifestation of a fault during the execution of the system.
4. A failure: deviation between the specification of a component and its behavior. This
is triggered by one or more errors.
5. A test case: a set of inputs and expected results that exercises a component with the
purpose of causing failures and detecting faults.
6. A test stub: a partial implementation of a component on which the tested component
depends.
7. A test driver: a partial implementation of a component that depends on the tested
component.
8. A correction: a change to a component – repair a fault.
Faults, errors and failures:
As noted, a fault could be the result of bad coding or design. The figure on page 443
(figure 11.3) shows an example of a fault: the workmen may have miscommunicated,
resulting in train tracks that do not align. In actual software development, this sort of
thing can also happen. The programmers may be divided into groups, each group
responsible for a subsystem. Due to lack of communication, the subsystems cannot be
integrated properly, yet each subsystem works in its own right.
A fault manifests itself as an error only when the piece of code is executed. As in the
example given, the track is tested via a use case: a train is run on it, which leads to a
derailment (a failure of the system).
Test cases:
A test case has 5 attributes:
Attribute      Description
Name           Name of the test case
Location       Full path name of the executable
Input          Input data or commands
Oracle         Expected test results against which the output of the test is compared
Log            Output produced by the test
The name of a test case should reflect the component that is tested; thus it should include
part of the component's name. The location describes where the test case is found,
that is, the pathname or URL, as the case may be, of the executable program and its
inputs. The input describes the set of inputs to be used, entered either by the tester or
by a test driver. The expected behavior (output) is described by the oracle, and the log is
a correlation of the observed behavior with the expected behavior over various runs.
Testing must be done in some sort of sequence. That is, if a test case is dependent upon
the result of another test, then that other test should be completed first. This may seem
trivial for small applications, but when applications consist of millions of lines of code
and involve dozens of programmers, it becomes very important indeed. Coordination is
the key.
Test cases are classified into “black-box” and “white-box” tests depending upon which
aspect of the system model is tested. Black-box tests deal with the input/output behavior
of the component, not with its structure or internals. White-box tests focus on the
internal structure, making sure that every state in the dynamic model of the object and
every interaction among the objects is tested. Unit testing involves both black-box and
white-box testing: it tests the input/output behavior as well as the structural and dynamic
aspects of the component.
Black-box Testing:
Black-box testing focuses on the functional requirements of the software. It allows the
tester to use inputs that would test all functional requirements. It attempts to find errors in
the following categories:
(a) Incorrect or missing functions.
(b) Interface errors.
(c) Errors in data structures or external data base access.
(d) Performance errors and
(e) Initialization and termination errors.
Black-box testing is deferred until later in the software development process, unlike
white-box testing, which is done earlier. Some questions that are used to guide a
black-box test are:
1. How is functional validity tested?
2. What classes of input will make good test cases?
3. Is the system particularly sensitive to certain input values?
4. How are the boundaries of a data class isolated?
5. What data rates and data volumes can the system tolerate?
6. What effect will specific combinations of data have on system operation?
The first step in black-box testing is to understand the objects that are modeled in the
software and the relationships that connect these objects. Once this has been done, the
next step is to define a series of tests that verify that “all objects have the expected
relationships to one another.” To do this, the software engineer creates a graph
with a collection of nodes (representing the objects), links (edges, representing the
relationships) and node weights (representing the properties of a node). A link can
be directed (the relationship holds in one direction), bidirectional (it holds in both
directions) or parallel (a number of different relationships hold between two nodes).
The figure shows that a menu select on new file generates a document window. The node
weight of document window provides a list of attributes that are expected when the
window is generated. The link weight indicates that the window must be generated in less
than 1 second. An undirected link establishes a symmetric relationship between new file
menu select and document text, and parallel links exist between document window and
document text. A hypothetical sketch of one such link test follows.
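For illustration only, here is a minimal sketch of a test for the directed link “menu select:
new file → document window”, checking both the node weight (expected attributes of the
window) and the link weight (generation in under 1 second). The FileMenu and
DocumentWindow classes are invented stand-ins for this sketch; they are not part of the notes.

    // Hypothetical sketch of one graph-based black-box test case.
    // FileMenu and DocumentWindow are invented stand-ins.
    class DocumentWindow {
        boolean active = true;           // node weight: window has the focus
        String documentText = "";        // node weight: contains document text
    }

    class FileMenu {
        DocumentWindow selectNewFile() {
            return new DocumentWindow(); // generates the target node
        }
    }

    public class NewFileLinkTest {
        public static void main(String[] args) {
            FileMenu menu = new FileMenu();
            long start = System.currentTimeMillis();
            DocumentWindow window = menu.selectNewFile();   // traverse the link
            long elapsed = System.currentTimeMillis() - start;

            // Node weight: the attributes expected of the new window.
            if (window == null || !window.active || window.documentText == null) {
                System.out.println("FAIL: window attributes not as expected");
            }
            // Link weight: generation must take less than 1 second.
            if (elapsed >= 1000) {
                System.out.println("FAIL: window took " + elapsed + " ms to generate");
            }
            System.out.println("link test finished in " + elapsed + " ms");
        }
    }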
White-box testing:
This test uses the control structure of the program to design the test cases. Tests derived
for white-box testing:
1. Guarantee that all independent paths within a module have been exercised at least
once.
2. Exercise all logical decisions on their true and false sides.
3. Execute all loops at their boundaries and within their operational bounds.
4. Exercise internal data structures to assure their validity.
One technique used for white-box testing is basis-path testing. The test case
designer derives a logical complexity measure of a design and uses this measure as a
guide for defining a basis set of execution paths; the measure is sketched below. Test
cases derived to exercise the basis set are guaranteed to execute every statement in the
program at least once during testing.
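The notes do not spell the measure out, but it is the cyclomatic complexity mentioned
again under path testing below. Its standard definition, for a flow graph G with E edges,
N nodes and P predicate (decision) nodes, is:

    V(G) = E - N + 2      or equivalently      V(G) = P + 1

V(G) gives the number of independent paths in the basis set, and hence the number of
test cases required.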
But a flow graph of the application must first be derived. Standard sub-graphs give the
flow-graph representation of each kind of code statement: sequence, if, while, do-until,
and case. These sub-graphs are used to convert a program into a flow graph; each
sub-graph represents one of the coding constructs used in the program. The final graph
is then scanned for independent paths, and each independent path requires a separate
test case. In this way, no line of code within the program escapes testing.
An example: check the attached diagrams.
Test stubs and drivers:
When we want to test single components of a system, we need to separate that component
from the rest of the system. We do this by creating stubs to represent the related parts to
the system and drivers to carry out the test. The stubs actually are used to simulate the
parts of the system that are called by the component that is under test. So a stub may
consist of values, etc. that may be required by the component that is tested. Such values
would actually provide the test for the component. Note: a test stub should simulate the
5
component that it is substituting as close as possible, else the tested component may not
be adequately tested. Thus, it is some times even better to use the actual component that
would have been simulated by the test stub, to carry out the test.
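As a minimal sketch (all names here are invented for illustration), suppose we want to
test a Billing component that normally calls a TaxService. A stub stands in for the
TaxService, and a driver calls the tested component and compares its output with the oracle:

    // Hypothetical sketch: testing a Billing component in isolation.
    // TaxService, TaxServiceStub, Billing and BillingDriver are invented names.
    interface TaxService {
        double taxFor(double amount);      // the dependency of the tested component
    }

    // Test stub: partial implementation of the component Billing depends on.
    class TaxServiceStub implements TaxService {
        public double taxFor(double amount) {
            return amount * 0.25;          // canned value instead of the real computation
        }
    }

    // The component under test.
    class Billing {
        private final TaxService taxService;
        Billing(TaxService taxService) { this.taxService = taxService; }
        double total(double amount) {
            return amount + taxService.taxFor(amount);
        }
    }

    // Test driver: partial implementation of a component that depends on Billing.
    public class BillingDriver {
        public static void main(String[] args) {
            Billing billing = new Billing(new TaxServiceStub());
            double observed = billing.total(100.0);
            double expected = 125.0;       // oracle
            System.out.println(observed == expected ? "PASS" : "FAIL: got " + observed);
        }
    }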
Corrections:
Once a problem has been found, then corrections are made. Corrections could be simple
and applied to only the component that is under test or they could be more involved
requiring changes to an entire class or subsystem. There may be cases whereby entire
subsystems need to be redesigned, which may eventually introduce new errors. The
authors suggested several techniques that can be used to track and handle any new faults
that may evolve:
1. Problem tracking: once documentation of the entire process of finding errors and
correction is kept, then it is easy to revise those portions of the system with the intent
of finding faults.
2. Regression testing: re-execution of all prior tests after a change.
3. Rationale maintenance: justifying the changes that are made with the requirements
of the subsystem.
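One common way to mechanize regression testing is to keep all prior unit tests in a
suite that is re-run after every change. A minimal sketch, assuming JUnit 4; the test
class names listed are hypothetical stand-ins for existing tests:

    // Hypothetical regression suite: re-runs all prior tests after a change.
    // BillingTest and TaxServiceTest are invented names for existing test classes.
    import org.junit.runner.RunWith;
    import org.junit.runners.Suite;

    @RunWith(Suite.class)
    @Suite.SuiteClasses({
        BillingTest.class,
        TaxServiceTest.class
    })
    public class RegressionSuite {
        // Running this suite after each correction re-executes every prior test,
        // catching faults that the change may have introduced elsewhere.
    }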
Testing activities:
1. Inspecting Components:
Inspections find faults in a component by reviewing its source code. It could be done
before or after the unit test. Fagan suggested a five-step method for inspection:
(a) Overview: The author of the component briefly presents the purpose and scope of
the component and the goal of the inspection.
(b) Preparation: The reviewers become familiar with the implementation of the
component.
(c) Inspection meeting: A reader paraphrases the source code of the component (reads
each statement and explains what it should do), and the inspection team raises
issues with the component. A moderator keeps the meeting on track.
(d) Rework: The author revises the component.
(e) Follow-up: The moderator checks the quality of the rework and determines whether
the component needs to be re-inspected.
2. Unit Testing:
This focuses on the building blocks of the system, that is, the objects and subsystems.
There are 3 main reasons for unit testing:
(a) It reduces the complexity of the overall test activity, enabling us to focus on
smaller units at a time.
(b) It makes it easier to pinpoint and correct faults.
(c) It allows parallelism in testing, i.e., each component can be tested independently
of the others.
There are many unit testing techniques:
Equivalence testing: a black-box testing technique that minimizes the number of test
cases. The possible inputs are partitioned into equivalence classes, and a test case is
selected for each class; only one member of an equivalence class needs to be tested.
The test consists of two steps: identification of the equivalence classes and selection
of the test inputs. To identify equivalence classes we use three criteria:
• Coverage: Every possible input belongs to one of the equivalence classes.
• Disjointedness: No input belongs to more than one equivalence class.
• Representation: If the execution demonstrates an error when a particular member
of an equivalence class is used, then the same error should appear if any other
member of the class is used.
For each equivalence class, two pieces of data are used – a typical input and an invalid
input. In the example given, for the method that returns the number of days in a month,
3 equivalence classes were found for the month parameter – months with 31 days, months
with 30 days, and February with either 28 or 29 days. Invalid inputs would be non-positive
integers and integers bigger than 12. Two equivalence classes were found for the year
parameter – leap years and non-leap years. Negative integers are invalid for year.
Together, this yields 6 equivalence classes that need to be tested – table 11.2,
page 455. A sketch of the method and one test input per class follows.
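The following is a minimal sketch of such a method and of picking one representative
input per equivalence class. The name getNumDaysInMonth matches the book's example,
but the exact signature and the chosen representatives are assumptions:

    // Minimal sketch of the book's getNumDaysInMonth example (signature assumed).
    public class MonthDays {
        public static int getNumDaysInMonth(int month, int year) {
            if (month < 1 || month > 12 || year < 0) {
                throw new IllegalArgumentException("invalid month or year");
            }
            if (month == 4 || month == 6 || month == 9 || month == 11) {
                return 30;                             // 30-day months
            }
            if (month == 2) {
                boolean leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
                return leap ? 29 : 28;                 // February
            }
            return 31;                                 // 31-day months
        }

        public static void main(String[] args) {
            // One representative input per equivalence class (representatives assumed):
            check(1, 1901, 31);   // 31-day month, non-leap year
            check(1, 1904, 31);   // 31-day month, leap year
            check(4, 1901, 30);   // 30-day month, non-leap year
            check(4, 1904, 30);   // 30-day month, leap year
            check(2, 1901, 28);   // February, non-leap year
            check(2, 1904, 29);   // February, leap year
        }

        static void check(int month, int year, int expected) {
            int observed = getNumDaysInMonth(month, year);
            System.out.println((observed == expected ? "PASS " : "FAIL ")
                    + month + "/" + year + " -> " + observed);
        }
    }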
Boundary testing: focuses on the conditions at the boundaries of the equivalence classes.
The testing requires that elements be selected from the “edges” of the equivalence
classes. In the above example: generally, years that are multiples of 4 are leap years,
but years that are multiples of 100 are not leap years unless they are also multiples of
400. For example, 2000 is a leap year but 1900 was not, even though it is a multiple of
both 100 and 4. Hence, both 1900 and 2000 are good boundary cases for year, and 0 and
13 would be good boundary cases for month; a few of these are exercised in the sketch
below.
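A minimal sketch of exercising these boundary cases, reusing the MonthDays sketch
above (the expected values follow from the calendar rules; the harness itself is assumed):

    // Boundary cases for the getNumDaysInMonth sketch above.
    public class BoundaryCases {
        public static void main(String[] args) {
            // Year boundaries around the leap-year rule:
            System.out.println(MonthDays.getNumDaysInMonth(2, 2000)); // 29: multiple of 400
            System.out.println(MonthDays.getNumDaysInMonth(2, 1900)); // 28: multiple of 100, not 400
            // Month boundaries just outside the valid range 1..12:
            try { MonthDays.getNumDaysInMonth(0, 2000); }
            catch (IllegalArgumentException e) { System.out.println("0 rejected"); }
            try { MonthDays.getNumDaysInMonth(13, 2000); }
            catch (IllegalArgumentException e) { System.out.println("13 rejected"); }
        }
    }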
Path Testing: a white-box testing technique that identifies faults in the
implementation of the component. The idea behind path testing is that each line of
code is exercised at least once, so that if a fault exists on any of the paths tested,
it will be found. To carry out this test, a flow graph of the source must be developed.
In the case of the example method, the flow graph on page 456 was developed. Note that
only decisions were taken into consideration – there are no looping structures in this
method. There are five if statements in the code (page 457), represented by the diamond
shapes, and the activities are represented by the rounded rectangles, of which there are
7 – two for the exceptions and five for the ifs and their elses. The table on page 458
(table 11.4) shows the test cases and the paths – note there are 6 test cases, indicating
6 paths.
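That path count agrees with the cyclomatic complexity measure introduced earlier: with
P = 5 decision nodes and no loops,

    V(G) = P + 1 = 5 + 1 = 6

so a basis set of six independent paths, and hence six test cases, is expected.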
Note that even though path testing can be used in OO languages, it was developed
specifically for imperative languages; thus polymorphism, for example, will require
more test cases than could be computed from the cyclomatic complexity formula. Also,
because OO methods are shorter, fewer control faults may be uncovered: OO relies more
heavily on inheritance, and thus the tests require the involvement of a larger number
of objects. Note also two things. Because path testing is heavily dependent on the
structure of the program, the problem of a value such as 1900 not being a leap year was
not found – the test exercised only the modulo of the year by 4, not also by 100 and
400. Furthermore, none of the path tests was able to pick up that August was missing
from the set of months consisting of 31 days.
State-based testing: a technique that focuses on OO systems. The test compares the
resulting state of the system with the expected state. We derive test cases from the
statechart diagram for the class. Similarly to equivalence testing, for each state a
representative set of stimuli is derived, and the attributes of the class are tested
after each stimulus is applied.
The example on page 460 shows the watch from chapter 2 being tested. The states tested
are MeasureTime and SetTime. State-based testing is a difficult method and is still not
fully developed as a testing method; owing to these difficulties, it is expected to be
automated, which would make it easier to use. A minimal sketch of the idea follows.
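As a minimal sketch of the idea (the Watch class below is an invented stand-in for the
book's watch example, reduced to the two states the notes mention):

    // Hypothetical sketch of state-based testing for a two-state watch.
    // The Watch class and its button event are invented for illustration.
    public class Watch {
        enum State { MEASURE_TIME, SET_TIME }
        private State state = State.MEASURE_TIME;

        void pressLeftButton() {        // stimulus: toggles between the two states
            state = (state == State.MEASURE_TIME) ? State.SET_TIME : State.MEASURE_TIME;
        }
        State getState() { return state; }

        public static void main(String[] args) {
            Watch watch = new Watch();
            // Apply a stimulus in each state and compare the resulting state
            // with the state expected from the statechart.
            watch.pressLeftButton();
            System.out.println(watch.getState() == State.SET_TIME
                    ? "PASS: MeasureTime -> SetTime" : "FAIL");
            watch.pressLeftButton();
            System.out.println(watch.getState() == State.MEASURE_TIME
                    ? "PASS: SetTime -> MeasureTime" : "FAIL");
        }
    }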
3. Integration testing:
Once unit testing has been successful, it is time to integrate the units into larger
components – classes and/or subsystems, or larger subsystems. Integration testing is
supposed to detect faults that the unit tests did not find. Some of these faults lie
in the interfaces used to integrate the smaller objects into bigger subsystems. The
idea is to start small – integrate two objects first and test them; then, if no faults
occur, add another object, then another, etc.
The key to easier and perhaps more successful integration testing is the ordering of
the components. A number of strategies have been developed (based on the assumption
that the system components are hierarchical in their relationship to each other) that
determine the order:
• Big bang testing
• Bottom-up testing
• Top-down testing
• Sandwich testing
Big bang testing: assumes that all components are first tested individually and then
put together and tested as a whole. This can be expensive, because if a fault is found
during the big test, it is difficult to locate and fix, especially for huge programs.
Also, interface failures may be difficult to distinguish from component failures.
Bottom-up testing: all components of the bottom layer are tested individually and
then integrated with the layer above. This is continued until the entire system is
tested. Note: when two components at the same level are tested together, it is known
as a double test (three components tested together is a triple test, and four a
quadruple test). Test drivers are used to simulate the components of the upper layers
that have not yet been tested.
Top-down testing: the reverse of the bottom-up test, i.e., the components of the top
layers are tested first, and the lower layers are progressively integrated.
Both of these tests have their advantages and disadvantages, and which is chosen
actually depends upon the tester/developer. In the case of the bottom-up test, an
advantage is that interface faults can be more easily found. The disadvantage is that
the interface components are tested last; these may be some of the more important
components, and if faults are found in them, they may require many of the lower
components to be revised, which of course means that a large number of components have
to be retested.
The advantage of the top-down test is that all interface components are tested first;
hence if faults are found, corrections can be made to lower components before those are
even tested. The disadvantage is that the development of test stubs can be time
consuming and also error prone, because a large number of stubs is required. Figure 356
shows how both of these tests are implemented.
Sandwich testing: combines the top-down and bottom-up tests, trying to make use
"of the best of both strategies". The idea is to re-map the subsystem decomposition
into three layers – a middle (target) layer, a layer above and a layer below. The
components in the bottom and top layers are used as is (no stubs are written). The
middle layer is the focus, and the other two layers are tested in parallel. A major
disadvantage is that the methods in the target layer are not tested properly, if at
all. The modified sandwich test corrects this problem, but of course more stubs and
drivers are required. Nevertheless, the modified sandwich test is shorter than either
the bottom-up or the top-down test.
4. System Testing:
Once unit and integration tests are completed, then the system needs to be tested to
make sure that the system meets both functional and nonfunctional requirements.
System testing includes:
(a) Functional Testing: test of functional requirements from use cases.
(b) Performance Testing: test of nonfunctional requirements.
(c) Pilot Testing: tests of common functionality among a selected group of end users.
(d) Acceptance Testing: usability, functional and performance tests done in the
developers' environment by the customer against the acceptance agreement.
(e) Installation Testing: usability, functional and performance tests done in the
customer's environment by the customer against the acceptance agreement.
Functional Testing:
Also called requirements testing, functional testing tries to find differences between
the functional requirements and the system. A black-box testing method is used, i.e.,
boundary conditions are tested. The test cases are derived from the use case model.
Figures 11.24 and 11.25 give an example using the use case PurchaseTicket. Note the
features that are likely to fail and that are actually tested (page 360).
Performance Testing:
Attempts to find differences between the design goals selected during system design and
the system. It may include:
• Stress testing: checks whether the system can respond to many simultaneous requests
(a sketch follows after this list).
• Volume testing: attempts to find faults associated with large amounts of data, such
as static limits imposed by the data structures, etc.
• Security testing: attempts to find security faults in the system. Few systematic
methods exist to carry out such tests; usually this is done by teams of individuals
who try to break into the system using their experience and knowledge.
• Timing tests: attempt to find behavior that violates timing constraints described
by the nonfunctional requirements.
• Recovery tests: evaluate the ability of the system to recover from errors, such as
hardware failures, etc.
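A minimal stress-test sketch, assuming the system under test is reachable through a
single method call; the handleRequest stand-in below is invented for illustration:

    // Hypothetical stress test: fire many simultaneous requests at the system
    // and count the failures. handleRequest is an invented stand-in.
    import java.util.concurrent.*;
    import java.util.concurrent.atomic.AtomicInteger;

    public class StressTest {
        static void handleRequest(int id) {
            // Stand-in for a call into the system under test.
            if (id < 0) throw new IllegalStateException("unexpected input");
        }

        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(50);
            AtomicInteger failures = new AtomicInteger();
            for (int i = 0; i < 10_000; i++) {
                final int id = i;
                pool.submit(() -> {
                    try { handleRequest(id); }
                    catch (Exception e) { failures.incrementAndGet(); }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            System.out.println("failures under load: " + failures.get());
        }
    }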
If all of these tests are completed without finding any errors, the system is said to
be validated.
Pilot Testing:
If the software is developed to be placed on the market, a group of people is invited
to test it and give their feedback. If, on the other hand, it is developed for a
particular client, a group of users is chosen to test the system. They pretend that the
system has been installed on a permanent basis and test it as thoroughly as possible,
without any guidelines for the tests. An alpha test is a test carried out by users in
the developers' environment. A beta test is carried out in the users' environment by a
limited number of users. Beta tests are much more common, especially with the use of
the Internet.
Acceptance Testing:
Three ways in which the client evaluates a system during acceptance testing:
• Benchmark test: a set of test cases is prepared that represents typical conditions
under which the system will operate.
• Competitor test: used when a new system is replacing an old system; the two are
tested against each other.
• Shadow test: the new and old systems are run in parallel and their outputs compared
(a sketch follows).
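A minimal shadow-test sketch: run the same inputs through the old and the new system
and report any output that differs. Both "systems" below are invented stand-in functions:

    // Hypothetical shadow test: old and new systems run in parallel on the
    // same inputs; any differing output is flagged. Both functions are stand-ins.
    import java.util.List;
    import java.util.function.IntUnaryOperator;

    public class ShadowTest {
        public static void main(String[] args) {
            IntUnaryOperator oldSystem = x -> x * 2;   // stand-in for the old system
            IntUnaryOperator newSystem = x -> x + x;   // stand-in for the new system
            List<Integer> inputs = List.of(0, 1, -5, 1000);
            for (int in : inputs) {
                int oldOut = oldSystem.applyAsInt(in);
                int newOut = newSystem.applyAsInt(in);
                if (oldOut != newOut) {
                    System.out.printf("MISMATCH for %d: old=%d new=%d%n", in, oldOut, newOut);
                }
            }
            System.out.println("shadow run complete");
        }
    }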
If all is well, the customer accepts the system. If not, the developers are notified of
what is wrong, and they make the modifications, deletions or additions specified by the
client.
Installation Testing:
After acceptance, the system is installed in the client's environment, and the
installation test is carried out to make sure that the system is properly installed and
all of the requirements are met.