1. Introduction

advertisement
Table of Contents
1. Introduction .................................................................... 3
1.1. Software Testing: ..................................................................................................... 5
1.2. The Testing Levels:.................................................................................................. 5
1.3. Why Object Oriented Testing is complicated: ......................................................... 5
2. Complications in Object Oriented Testing...................... 6
2.1. Encapsulation and Information Hiding: ................................................................... 6
2.2. Polymorphism and Dynamic Binding:..................................................................... 8
2.3. Inheritance: ............................................................................................................ 10
2.4. Genericity:.............................................................................................................. 13
3. Testing Techniques ....................................................... 15
4. Tools Discussion .......................................................... 18
???? ................................................................................... 18
5. Conclusion .................................................................... 18
6. References .................................................................... 18
Preface
In this report, we will investigate the effects of object oriented approach on software
testing and how to carry out successful testing. We will evaluate several techniques and
tools for testing object oriented software, proposed in diverse research articles.
The report mainly consists of following sections:
Introduction- contains the software testing theory, testing levels and an approach to test
object oriented software.
Complications in object oriented testing- addresses the obstacles in object oriented
testing. This part enlists the problems in sub-section which include the description of the
problem followed by example and proposed solution.
Testing Techniques- this section is more general which describes the techniques for
testing object oriented software to overcome the problems mentioned in the previous
section as well as to remove other complications.
An overview of testing tools- this section is about testing tools which contains the
integral properties of a good software test tool for object oriented software.
After the classification of tools, we conclude the report with some suggestions for
exertion.
1. Introduction
1.1. Purpose:
The purpose of this project is to investigate the complications introduced by the powerful
new features of object oriented languages and how to deal with those problems. The
project is research based which covers the research articles on aspects of object oriented
languages. The research is focused on the concerns by discussing what testing an object
oriented system means and presents several techniques for doing so.
The research articles have been chosen in view to the comprehensiveness of the problems
they mention. We are considering more general problems but specific to object oriented
approach, hence the common problems for traditional software and object oriented
software are not covered in this report.
1.2. Motivation:
Software is the key to success for small scale to the largest scale organizations. The
companies are buying more software to fulfill their business requirements. And their first
and foremost goal to buy software is, solution oriented, easy to use, fast and bug free.
Furthermore robustness, reliability and flexibility are other key features in a good
software solution. Software developing companies are working hard to achieve these
goals. The methodology used to achieve all these is to test the software at every stage of
its development. The development of the test strategy starts with the software
specification and goes on with the development of the software.
Object oriented technology is becoming more and more popular due to its advantages in
improving the productivity and reliability in software development. On the other hand
object oriented systems present new and different problems with respect to traditional
structured programs, due to its different approach. Significant research has been carried
out in object oriented analysis, design and programming languages, but relatively a little
attention has been paid towards the testing of object oriented systems.
1.3. Methodology:
This research project spans the articles of different authors that mention the tricky
situations in the testing of object oriented software. We also looked at the testing books
but most of them were not directly addressing the techniques to overcome the problems
occurred by object oriented approach. The research articles were found at ITU library’s
databases, i.e., ACM, CiteSeer, DADS and Springer.
1.4. Software Testing:
The authors present the following comprehensive IEEE definition of testing: "Testing is
the process of executing a program with the intention to yield measurable errors. It
requires that there be an oracle to determine whether or not the program has functioned as
required, with comparison of performance against a defined specification." They also
present the following definition of the testing process: "The process of exercising the
routines provided by an object with the goal of uncovering errors in the implementation
of the routines or the state of the object or both."
1.5. The Testing Levels:
The introduction of object-oriented programming has created new types of testing, as
well as utilizing the traditional types. Smith and Robson discuss four levels:
1) The algorithm level, which involves the churning of the routine (method) on some
data.
2) The class level which, is concerned with the interaction between the attributes and
methods for a particular class.
3) The cluster level, which considers interactions between a groups of co-operating
classes.
4) The system level, which involves testing an entire system in the aggregate.
Of these four levels, the class and cluster levels are specific to the object oriented
paradigms, while the algorithm and system levels resemble traditional unit and system
test. Jorgensen and Erickson segment the testing process into three levels: unit,
integration, and system. They agree with Smith and Robson that unit testing begins at the
method level. They view system testing as a testing with threads (sequence of normal
usage steps) and interaction between threads. This is a more defined view of Smith and
Robson's system level.
1.6. Why Object Oriented Testing is complicated:
While it is still possible to apply traditional testing techniques to object oriented
problems, the process of testing object oriented software is more difficult than the
traditional approach, since programs are not executed in a sequential manner. Object
oriented components can be combined in an arbitrary order, thus defining test cases,
becomes a search for the order of routines that causes an error.
The state-based nature of object oriented systems can have a negative effect on testing.
This is because the state of the object is not defined solely by state values in general, but
through associations with other objects. Additionally, objects can be compared by
equality and identity, which adds a level of confusion concerning what needs to be
verified.
2. Complications in Object Oriented Testing
Since new problems are related to object oriented specific characteristics, their
identification requires an analysis of the features provided by object oriented languages.
We describe the critical features in object oriented testing in the following sub-sections.
Each sub-section describes the problem first, and then examples followed by proposed
solutions.
2.1. Encapsulation and Information Hiding:
Information hiding refers to the likelihood for the programmer to specify whether one
feature, encapsulated into some module, is visible to the client modules or not; it allows
for clearly distinguishing between the module interface and its implementation.
In conventional procedural programming, the basic component is the subroutine and the
testing method for such component is input/output based. While in object-oriented
programming the basic component is represented by a class, composed of a data structure
and a set of operations. Objects are run-time instances of classes. The data structure
defines the state of the object which is modified by the class operations (methods). In this
case, correctness of an operation is based not only on the input/output relation, but also
on the initial and resulting state of the object. Furthermore, in general, the data structure
is not directly accessible, but can only be accessed using the class’s public operations.
Encapsulation and information hiding make it hard for the tester to check what happens
inside an object during testing. Due to data abstraction there is no visibility of the insight
of objects. Thus it is impossible to directly examine their state. Encapsulation implies the
converse of visibility, which in the worst case means that objects can be more difficult, or
even impossible to test.
Encapsulation and information hiding raise the following main problems:
1. Problems in identifying which is the basic component to test, and how to select test
data for exercising it.
2. The state of an object could be inaccessible.
3. The private state is observable only through class methods (thus relying on the tested
software)
The possibility of defining classes which cannot be instantiated, e.g., abstract classes,
generic classes, and interfaces, introduces additional problems related to their non
straightforward testability.
Figure: An example of information hiding
Above figure illustrates an example of information hiding. The attribute status is not
accessible, and the behavior of checkPressure is strongly dependent on it.
We find two proposed approaches in literatures for testing object oriented programs in
the presence of encapsulation and information hiding.
Breaking encapsulation: It can be achieved either by exploiting features of the language
(e.g., the C++ friend construct or the Ada child unit) or instrumenting the code. This
approach allows for inspection of private parts of a class. The drawback in this case is the
intrusive character of the approach. An example of this approach can be found in [55].
Equivalence scenarios: This technique is based on the definition of pairs of sequences of
method invocations. Such pairs are augmented with a tag specifying whether the two
sequences are supposed to leave the object in the same state or not. In this way it is
possible to verify the consistence of the object state by comparison of the resulting states
instead of directly inspecting the object private parts. In the presence of algebraic
specifications this kind of testing can be automated. The advantage of this approach is
that it is less intrusive than the one based on the breaking of encapsulation. However, it is
still interfering, since the analyst needs to augment the class under test with a method for
comparing the state of its instances. The main drawback of this technique is that it allows
for functional testing only. Moreover, the fault hypothesis is non-specific: different kind
of faults may lead to this kind of failure and many possible faults may not be caught by
this kind of testing.
Equivalence scenarios have been introduced in [29]. Another application of this approach
can be found in [85]. In this case, when testing a class, states are identified by
partitioning data member domains. Then, interactions between methods and state of the
object are investigated. The goal is to identify faults resulting in either the transition to an
undefined state, or the reaching of a wrong state, or the incorrectly remaining in a state.
2.2. Polymorphism and Dynamic Binding:
The term polymorphism refers to the capability for a program entity to dynamically
change its type at run-time. This introduces the possibility of defining polymorphic
references (i.e., references that can be bound to objects of different types). In the
languages we consider, the type of the referred object must belong to a type hierarchy.
For example, in C++ or Java a reference to an object of type A can be assigned an object
of any type B as long as B is either a inheritor of A or A itself.
A feature closely related to polymorphism is dynamic binding. In traditional procedural
programming languages, procedure calls are bound statically, i.e., the code associated to
a call is known at link time. In the presence of polymorphism, the code invoked as a
consequence of a message invocation on a reference depends on the dynamic type of the
reference itself and is in general impossible to identify it statically. In addition, a message
sent to a reference can be parametric, and parameters can also be polymorphic references.
Void foo(Shape polygon) {
...
area=polygon.area();
...
}
Figure : A simple example of polymorphism
Above figure shows a simple Java example of method invocation on a polymorphic
object. In the proposed example, it is impossible to say at compile time which
implementation of the method area will be actually executed.
Late binding can lead to messages being sent to the wrong object. Overriding changes the
semantics of a method and can fool the clients of the class it belongs to. Since subclassing is not inherently sub-typing, dynamic binding on an erroneous hierarchical
chain, can produce undesirable results. Moreover, even when the hierarchy is well
formed, errors are still possible, since the correctness of a redefined method is not
guaranteed by the correctness of the super class method.
Polymorphism and late binding introduce decision concerns in program-based testing due
to their dynamic nature. To gain confidence in code containing method calls on
polymorphic entities, all the possible bindings should be exercised, but the exhaustive
testing of all possible combinations of bindings may be impractical.
Figure: An example of polymorphic invocation
Above figure illustrates a method invocation (represented in a message sending fashion),
where both the sender and the receiver of the message are polymorphic entities. In
addition, the message has two parameters, at their turn polymorphic. In such a case, the
number of possible combinations (type of the sender, type of the receiver and type of
parameters) is combinatorial. Moreover, the different objects may behave differently
depending on their state, and this leads to a further explosion of the number of test cases
to be generated. In such a situation, a technique is needed, which allows for selecting
satisfactory test cases. The trade-off here is between the possible infeasibility of the
approach and its incompleteness.
Further problems arise when classes to be tested belong to a library. Classes built to be
used in one specific system can be tested by restricting the set of possible combinations
to the ones identifiable analyzing the code. Re-usable classes need a higher degree of
polymorphic coverage, because such classes will be used in different and sometimes
unpredictable contexts.
The problems introduced by polymorphism can be summarized as follows:
- Program based testing in the presence of polymorphism may become infeasible, due to
the combinatorial number of cases to be tested.
- New definitions of coverage are required to cope with the testing of operations on a
polymorphic object.
- The creation of test sets to cover all possible calls to a polymorphic operation can not be
achieved with traditional approaches.
- The presence of polymorphic parameters introduces additional problems for the creation
of test cases.
2.3. Inheritance:
Inheritance is probably the most powerful feature provided by object oriented languages.
Classes in object-oriented systems are usually organized in a hierarchy originated by the
inheritance relationship. In the languages considered in this work, sub-classes can
override inherited method and add new methods not present in their super class.
Inheritance, when conceived as a mechanism for code reuse, raises the following issues:
It is necessary to test whether a subclass specific constructor (i.e.,
the method in charge for initializing the class) is correctly invoking the constructor of the
parent class.
Initialization problems:
Semantic mismatch: In
the case of sub-typing inheritance we are considering, methods in
the sub-classes may have a different semantics and thus they may need different test
suites.
The problem here is to know whether we can trust features
of classes we inherit from, or we need to re-test derived classes from scratch. An
optimistic view claims that only little or even no test is needed for classes derived from
thoroughly tested classes.
Opportunity for test reduction:
A deeper and more realistic approach argues that methods of derived classes need to be
re-tested in the new context. An inherited method can behave incorrectly due to either the
derived class having redefined members in an inappropriate way or the method itself
invoking a method redefined in the subclass.
We want to know if it is possible to use the same test cases
generated for the base class during the testing of the derived class. If this is not possible,
we should at least be able to find a way of partially reusing such test cases.
Re-use of test cases:
We should test whether the inheritance is truly expressing an “isa” relationship or we are just in the presence of code reuse. This issue is in some way
related to the misleading interpretation of inheritance as a way of both reusing code and
defining subtypes, and can lead to other problems.
Inheritance correctness:
Testing of abstract classes: Abstract
classes can not be instantiated and thus can not be
thoroughly tested. Only classes derived from abstract classes can be actually tested, but
errors can be present also in the super (abstract) class.
Figure : An
example of inheritance in C++
Above figure shows an example of inheritance. Questions which may arise looking at the
example are whether we should retest the method Circle::moveTo() and whether is it
possible to reuse test sets created for the class Shape to test the class Circle.
Different approaches have been proposed in literature for coping with the testing
problems introduced by inheritance. Basic approaches assume that inherited code needs
only minimal testing. As an example, Fiedler [32] states that methods provided by a
parent class (which has already been tested) do not require heavy testing. This viewpoint
is usually shared by practitioners, committed more to quick results than to rigorous
approaches.
Techniques addressing the problem from a more theoretical viewpoint have also been
proposed. They can be divided in two main classes:
1. Approaches based on the flattening of classes
2. Approaches based on incremental testing
Flattening-based approaches perform testing of sub-classes as if every inherited feature
had been defined in the sub-class itself (i.e., flattening the hierarchy tree). The advantage
of these approaches is related to the possibility of reusing test cases previously defined
for the super class (es) for the testing of subclasses (adding new test cases when needed
due to the addition of new features to the subclass). Redundancy is the price to be paid
when following such approach.
All features are re-tested in any case without any further consideration. Examples of
flattening-based approaches can be found in the work of Fletcher and Sajeev [33], and
Smith and Robson [78]. Fletcher and Sajeev present a technique that, besides flattening
the class structure, reuses specifications of the parent classes. Smith and Robson present a
technique based on the flattening of subclasses performed avoiding to test “unaffected”
methods, i.e., inherited methods that are not redefined and neither do not invoke
redefined methods, nor did use redefine attributes.
Incremental testing approaches are based on the idea that both re-testing all inherited
features and not re-testing any inherited features are wrong approaches for opposite
reasons. Only a subset of inherited features needs to be re-tested in the new context of the
subclass. The approaches differ in the way this subset is identified.
An algorithm for selecting the methods that need to be re-tested in subclasses is presented
by Cheatham and Mellinger [18]. A similar, but more rigorous approach is presented by
Harrold, McGregor, and Fitzpatrick [39]. The approach is based on the definition of a
testing history for each class under test, the construction of a call graph to represent
intra/inter class interactions, and the classification of the members of a derived class.
Class members are classified according to two criteria:
- The first criterion distinguishes added, redefined, and inherited members;
- The second criterion classifies class members according to their kind of relations with
other members belonging to the same class.
Based on the computed information, an algorithm is presented. The algorithm allows for
identifying, for each subclass, which members have to be re-tested, which test cases can
be re-used, and which attributes require new test cases.
2.4. Genericity:
Most traditional procedural programming languages do not support generic modules.
Supporting genericity means providing constructs allowing the programmer to define
abstract data types in a parametric way (i.e., to define ADTs as templates, which need to
be instantiated before being used). To instantiate such generic ADTs the programmer
needs to specify the parameters the class is referring to. Genericity is considered here
because, although not strictly an object-oriented feature, it is present in most objectoriented languages. Moreover, genericity is a key concept for the construction of reusable
component libraries. For instance, both the C++ Standard Template Library and Booch
components for C++, Ada and Eiffel are strongly based on genericity.
Figure : An example of genericity in C++
A generic class could not be tested without being instantiated specifying its parameters.
In order to test a generic class it is necessary to chose one (or more) type to instantiate the
generic class with and then to test this instance(s).
Above figure shows a template class representing a vector defined in C++. The main
problem in this case is related to the assumptions that can be made on the types used to
instantiate the class when testing the method sort (int and complex in the example).
In detail, the following topics must be addressed when testing in the presence of generic
classes: parametric classes must be instantiated to be tested, no assumptions can be made
on the parameters that will be used for instantiation, and “trusted” classes are needed to
be used as parameters. Such classes can be seen as a particular kind of “stubs”, and a
strategy is needed which allows for testing reused generic components.
The problem of testing generic classes is addressed by Overbeck, which shows the
necessity for instantiating generic classes to test them, and investigate the opportunities
for reducing testing on subsequent instantiations of generic classes. Overbeck reaches the
following conclusions:
Once instantiated, classes can be tested in the same way non-generic classes are; each
new instantiation needs to be tested only with respect to how the class acting as a
parameter is used; interactions between clients of the generic class and the class itself
have to be retested for each new instantiation.
3. Testing Techniques
This section addresses techniques discussed by several authors. The majority of the
research is centered around the class testing, cluster testing, and successful test case
generation techniques. Smith and Robson discuss a framework for class testing called
FOOT (Framework for Object-Oriented Testing). This framework partitions testing into
various groups and guides the process with test strategies or by manual human
interaction. The testing concentrates on interroutine and intraroutine errors, using the
following guidance strategies:
1) Minimal - provide a small set of features that allow the framework to interact with the
object that is being tested.
2) Exhaustive - exercise all legal variations of routines for a given object (This may not
be practical for a large object).
3) Tester Guided - carry out random testing by a human being.
4) Inheritance -flatten the hierarchy and test only the routines not already tested in the
super classes.
5) Memory - ensure languages with "garbage collection" properly dispose heap objects.
6) Data Flow - monitor the construction and destruction of an object.
7) Identity - search for state transitions that give the same outcomes regardless of the
operation order.
8) Set and Examine – set attributes to values, and observe the outcome as the object's
state changes.
Smith and Robson
Smith and Robson agree with test cases being defined for each class, but they differ in
saying that the subclass should not test attributes from the parent class.
Harold et al.
Harold et al. presents a technique that takes advantage of the hierarchical nature of
classes, utilizing information from the super classes to test related groups of classes. They
begin with testing base classes, since these classes are at the top of the inheritance
hierarchy. Each routine (method, member function, etc.) is tested individually; then
interactions between routines of the same class are verified to insure they yield no errors.
This is similar to Smith and Robson's intraroutine and interroutine testing. Harold et al.
then associate a testing history with each attribute it tests. When another class inherits
from this base class, the testing history for the base class is passed along with the
attributes and routines. This scheme drives the test procedures, since it indicates which
cases are needed to test the newly derived class.
Harold et al. feel that the additional testing is minimal, since the test cases can be easily
derived from the parent class. The actual test cases have a traditional flavor, in that they
use both specification-based (black box) and program-based (white box) techniques. The
authors suggest following unit testing procedures such as stubbing nonexistence methods
and providing drivers to exercise other class methods.
Obviously, the attributes without a test history are defined within the new class, so a test
suite needs to be defined for them. Harold et al. suggest that attributes from previous
classes be retested in the subclass with respect to their interaction with attributes within
the new class.
D'Souza and LeBlanc
D'Souza and LeBlanc present a testing technique based on examining pointers to objects,
to see if any unwanted aliasing, is occurring. They remark that most techniques verify the
objects value, and cannot directly determine if an object has an alias. When looking for
duplicate pointer references, the tester needs to be able to examine the name space to
determine if attributes share a state. This technique can be used with routine or cluster
testing without causing additional executions. This is because the technique "watches"
the attributes as the system executes along a path chosen by the tester. When aliasing
occurs the tester is notified and can determine at the point of execution if the additional
reference should exist. These authors state and prove that this technique is direct,
efficient, and as good as sequence testing when used in conjunction with routine and
cluster testing.
D'Souza and LeBlanc also discuss a testing tool designed to allow a tester to examine
attribute state. This tool is illustrated in the Eiffel OO language, and provides a test class
that contains an instance variable to the class being tested. A table is created containing
the path name (root.z.instvar), an object ID, and the dynamic type of the object that is
being referenced with the path name. The table is searched for duplicate object ID entries.
Any duplicates indicate that the object is referenced by more than one pointer, and the
entry is printed for examination. D'Souza and LeBlanc concede that this technique was
only tested on a small system (less than 50 classes) and that a large amount of alias data
was created even for their small system. Before this technique can be used on a larger
system, a pruning algorithm needs to be added, to limit the alias output. Parrish et al.[7]
present a technique for testing OO systems that is based entirely on generating test cases
from the class implementation. They feel this is an advantage in the current software
development environment, because many systems are developed without a formal
specification for each class. Their technique also differs from traditional methods in that
its generation of cases is not random but systematic. The technique involves making
every message a node, and a directed edge between nodes represents the possibility that
one routine might be invoked followed by a second routine.
These authors discuss two specific formal specification methods that can be applied to
their technique: algebraic (axioms) and model-based (require and ensure clauses). They
acknowledge that there is a problem generating infeasible test case paths, and that the
generated cases are not guaranteed to be better than manual testing. They suggest
supplementing the generated cases with manual cases, especially when testing critical
classes.
McGregor and Korson
McGregor and Korson discuss a high-level view of testing OO systems within the entire
software development cycle. They begin by testing the development process and the
supporting documents. This involves verifying and updating the iterative development
process with each iteration to improve effectiveness. These authors discuss testing
analysis and design models, and identify three attributes common to all modules that
required verification: correctness, completeness, and consistency.
A correct model is one whose set of analysis entities is semantically correct with respect
to the problem domain. A complete model is declared so if its entities accurately
represent the knowledge being modeled well enough to satisfy the goal of the current
iteration. A consistent model is one whose relationships among model entities are well
designed and cohesive in nature.
The authors feel that test cases should be created to test these aspects during each
iteration, and that test reviews should be held to insure that the cases exercise the model
at a detailed level. The actual design testing is done via instance diagrams that show the
network of resulting objects. McGregor and Korson state that the actual verification
should be done with class/unit, then cluster, followed by system testing. They also
discuss partitioning the test suite into three "flavors": functional, structural, and
interaction. This allows the tester to execute cases designed with a different emphasis,
and diversifies the testing to insure a broader range of coverage. This partitioning
resembles the Smith and Robson method, but divides cases at a higher level of
abstraction.
Murphy et al.
Murphy et al.discuss verifying a telecommunications troubleshooting tool (TRACS)
using class testing, cluster testing, and system testing techniques. The TRACS system
was developed over several releases, and the authors augmented their testing philosophies
to improve coverage with each release. The initial release was tested solely with cluster
and system testing methods, since the authors felt that class testing was too costly and
time-consuming. The system test cases were designed using functional and nonfunctional
system requirements. Each test case consisted of a series of steps that a user would
probably select and an expected outcome for comparison. Clusters were developed by
grouping classes that had similar functions or that worked together to accomplish a
particular goal. The cluster testing was functional and dynamic in nature, and was
executed over the entire system. This meant that drivers were needed to simulate
incomplete classes. The least-dependant cluster was tested first, then this cluster was used
in testing the next cluster until the entire system had been built and tested. This method
provided an implicit integration test due to its layered approach. After the initial release
was completed, the authors noticed that some clusters were beginning to grow too large;
hence, were not being tested adequately. Also, the iterative nature of the development
cycle had caused an elimination of code reviews. Defects were hiding within the layers of
cluster testing and class hierarchies, and the cost of fixing these problems increased
exponentially with time. These two problems resulted in a large number of defects that
were eventually discovered in system test. The authors decided that the cost of fixing
these problems outweighed the cost of executing class testing, so a class testing strategy
was incorporated along with an automated testing tool. The cluster testing was also
improved by creating a test plan based on the analysis and design for a cluster of classes.
The plan included:
1) any prerequisite cluster testing
2) a description of the functionality being tested
3) special configurations that should be tested
4) all classes contained in this cluster, and whether the class has been class tested or not
5) a set of cluster test cases including a description, execution steps, and expected
outcome Once the case is created, an automated test tool is used to execute the case and
record the outputs.
The authors' class testing technique utilized three levels:
Compilation, Walkthroughs, and Dynamic Testing.
Compilation was used for static type-checking to remove class incompatibilities, and for
compiler-generated warnings such as unused attributes.
The walkthroughs were performed by peer groups using state-reporting methods to probe
the code. The code inspectors also looked for missing function, unnecessary code, and
violations of coding standards.
Dynamic testing was accomplished via the same testing tool used for cluster testing. A
suite of cases is sent to the tool, and the tool executes them and reports the results.
Inheritance is leveraged in that test cases defined for a superclass are inherited and
executed unless they are overridden by the current class. This is similar to the techniques
discussed by Smith and Robson and by Harold et al., except that Smith and Robson's
method does not execute tests that have been previously run. This testing technique
reduced the number of non-system test defects found at system test from 70 to one, and
reduced development time by 50 percent.
4. Tools Discussion
????
5. Conclusion
6. References
Download