International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 5 - Nov 2013
A Modified Genetic Algorithm for Software Testing
Anjali Kapoor#1, Mohit Kumar#2
#1 Research Scholar, CSE, RIMT-IET (PTU), Mandi Gobindgarh, India
#2 Associate Professor, CSE, RIMT-IET (PTU), Mandi Gobindgarh, India
Abstract
The success of software depends largely on the quality as well as
the quantity of its testing. The testing phase can consume a lot of
resources if it is not planned properly. Automatic test case
generation can save some of those resources and, at the same time,
speed up testing. Genetic Algorithms (GAs) are widely used for such
automatic generation, but none of the existing methods handles
equality predicates effectively, as explained in Section IV. This
paper proposes a new method that speeds up automatic test case
generation. Our initial study indicates a remarkable reduction in
the time needed to generate test cases for equality predicates,
while most test case generation for non-equality predicates is
unaffected, as reported in Section V. The results also indicate
improved test coverage compared with the existing GA method. A
comparison of existing GA-based techniques with the proposed
technique shows that the new method is more efficient overall and
thus more desirable. The proposed concept can be made more sound
and reliable by evaluating it on more complex and lengthier
benchmark programs.
I. Testing Background
Quality software has to undergo a complex, labour-intensive,
time-consuming and costly test process before the user, developer
and manager can be confident that it is a reliable product with a
high probability of being error free. The testing process comprises
a set of activities in which inputs are first selected and the
software under test is then executed with these inputs to identify
faults. The next steps are the isolation and resolution of these
faults. Even a small amount of automation in the testing process
can save significant resources attributed to testing. Generation of
apt test cases is crucial to effective and efficient testing. Test
cases can be generated in several ways: automated, semi-automated
or manual. Automated test case generation saves a lot of resources
and does not carry human biases, but it is less intelligent. Manual
test cases, on the other hand, embody human intelligence but take
much effort and time. The last decade saw a number of research
attempts at automating test case generation [Korel1990]
[Wegener2001] [Sthamer1996] [McMinn2004], specifically using
heuristic approaches to fulfil the objectives of this activity.
Several soft computing techniques such as genetic algorithms
[Lin2001] [Berndt2003] [Miller2006], simulated annealing
[Tracy1998], tabu search [Diaz2003,2007] and ant colony
optimization [Mahanti2006] have been successfully employed in this
direction. Test cases are generated with a predefined test
objective in view.
II. Automatic Generation of Test Cases and Genetic Algorithm
Test case generation is an NP-hard problem, which makes it a good
fit for Genetic Algorithms (GAs): robust, adaptive algorithms for
searching non-linear search spaces. A GA is a population-based
search algorithm whose success in achieving its objectives depends
largely on the definition of the fitness function. The fitness
function evaluates candidate solutions against some criterion that
is used to judge the suitability of one solution compared with
others. Two candidate solutions in a search space may differ in
several ways and yet exhibit similar characteristics with respect
to some criterion. A better fitness function exposes diversity in
the solutions rather than similarity, and this diversity leads to a
better solution in fewer iterations. Without it, the GA converges
early to a non-optimal solution, a problem in test case generation
cited by several researchers [Wegener2001] [Lin2001]. In
identifying test cases, the fitness of a candidate solution may be
defined in several alternative ways. It may depend on the
percentage of components covered in testing, on a reward or penalty
for the inclusion or exclusion of a desired path in the program
execution produced by the solution (candidate test case), or on
some other criterion.
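The population-based search described above can be illustrated with a minimal sketch. It is not the GA used in this paper: the function name run_ga, the operators (truncation selection, one-point crossover, single-gene mutation, elitism) and all parameter values are assumptions chosen only to show how a penalty-style fitness function drives the search.

```python
import random

def run_ga(fitness, n_vars, lo, hi, pop_size=20, generations=50, seed=0):
    """Minimise `fitness` over integer vectors. Returns (best, history).

    Hypothetical sketch: a fitness of 0 means every constraint is satisfied.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(lo, hi) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        if history[-1] == 0:           # a test case for the target was found
            break
        parents = pop[: pop_size // 2]  # keep the fitter half as parents
        children = [pop[0][:]]          # elitism: the best candidate survives
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randint(1, n_vars - 1) if n_vars > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            i = rng.randrange(n_vars)
            if rng.random() < 0.3:       # occasional single-gene mutation
                child[i] = rng.randint(lo, hi)
            children.append(child)
        pop = children
    return min(pop, key=fitness), history
```

Because the best candidate is copied unchanged into the next generation, the recorded best fitness never gets worse from one generation to the next; how quickly it reaches zero depends entirely on the fitness function supplied.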
III. The Methodology
GAs can be employed to generate test cases using both functional
and structural testing techniques. Functional testing, also called
black box testing, uses functional knowledge of the program and
focuses on its input and output domains to identify test cases.
Structural testing, also called white box testing (more
appropriately, glass box testing), considers the internal structure
of the program and generates test cases using its flow graph.
Structural testing is further divided into two categories: control
flow based and data flow based. In control flow based structural
testing, only the control flow of the program is analyzed. Data
flow testing also uses control flow graph paths or sub-paths to
ascertain the association between the definition of a variable and
its computation or predicate uses. This analysis can be static or
dynamic. Unlike dynamic analysis, static analysis does not require
actual execution of the program. Another issue in testing is
setting a limit on the coverage of program components by test
cases. A test data adequacy criterion is a set of rules used to
determine whether sufficient testing has been performed.
Researchers have defined several criteria and ranked them by
strength [Rapps1982] [Rapps1985] [Frankl1988]. The all-paths
criterion subsumes all other criteria and is the strongest; it is
the criterion we have used in our experiments.
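As an illustration of the all-paths criterion, the sketch below enumerates every entry-to-exit path of a small acyclic control flow graph and reports path coverage as a percentage. The graph, the node names and the helper functions are hypothetical; a program with loops would additionally need a bound on loop iterations, which is not handled here.

```python
def all_paths(cfg, entry, exit_node):
    """Enumerate every entry-to-exit path in an acyclic control flow graph."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if node == exit_node:
            paths.append(path)
            return
        for succ in cfg.get(node, []):
            walk(succ, path)

    walk(entry, [])
    return paths

def path_coverage(executed, paths):
    """Percentage of the enumerated paths exercised by the executed paths."""
    covered = {tuple(p) for p in executed if p in paths}
    return 100.0 * len(covered) / len(paths)

# Hypothetical graph: two nested decisions (c1, c2) leading to three outcomes.
cfg = {
    "entry": ["c1"],
    "c1": ["a", "c2"],
    "c2": ["b", "c"],
    "a": ["exit"],
    "b": ["exit"],
    "c": ["exit"],
}
paths = all_paths(cfg, "entry", "exit")
```

Under this criterion a test suite is adequate only when path_coverage reaches 100, that is, when every enumerated path has been executed by at least one test case.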
In symbolic execution, a testing path is identified from the
control flow graph of the program. A valid test case is then sought
that executes that particular path by satisfying all of the Boolean
expressions on it. All such expressions (formally known as
predicates) on the path are concatenated to generate a hybrid
predicate. During concatenation, special consideration has to be
given to the dependence of variables on program processing, as it
affects the execution criterion of the remaining path. In our
method, we use the symbolic execution technique of static
structural testing together with a GA to identify test cases. The
GA generates a population of candidate solutions, which are
evaluated using a fitness function. A candidate solution consists
of all the input variables, with values assigned to them to
construct a test case. To evaluate the fitness of a candidate, all
the constraints of a particular path are atomized and each is
evaluated in turn using the current candidate solution. An atomic
predicate is one that contains only one operator. If it is
satisfied, no penalty is imposed on the candidate solution;
otherwise the candidate solution is penalized with the values shown
in Table 1. This method has already been used by several
researchers [Korel1990] [Tracy1998] [Wegener2001].
Atomized predicate    Penalty to be imposed in case
                      predicate is not satisfied
A < B                 A – B
A <= B                A – B + ζ
A > B                 B – A
A >= B                B – A + ζ
A = B                 abs(A – B)
A ≠ B                 ζ – abs(A – B)
Table 1. Penalties for unsatisfied atomized predicates
A and B are operands, and ζ is the smallest constant of the
operands' universal domain. For integers it is 1; for real values
it can be 0.1 or 0.01, depending on the accuracy needed in the
solution.
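The penalty scheme of Table 1 can be sketched as follows. The names ZETA, PENALTY_RULES, penalty, value and path_fitness are hypothetical, and summing the penalties of a path's atomic predicates is an assumption: the text states only that each predicate is evaluated in turn. The sketch reproduces the table as printed, so a failed strict inequality with A equal to B receives a penalty of zero.

```python
import operator

ZETA = 1  # smallest step for integer operands (0.1 or 0.01 for reals)

# relation -> (satisfaction test, penalty when not satisfied), as in Table 1
PENALTY_RULES = {
    "<":  (operator.lt, lambda a, b: a - b),
    "<=": (operator.le, lambda a, b: a - b + ZETA),
    ">":  (operator.gt, lambda a, b: b - a),
    ">=": (operator.ge, lambda a, b: b - a + ZETA),
    "==": (operator.eq, lambda a, b: abs(a - b)),
    "!=": (operator.ne, lambda a, b: ZETA - abs(a - b)),
}

def penalty(a, op, b):
    """Penalty for one atomic predicate `a op b`; 0 if it is satisfied."""
    satisfied, cost = PENALTY_RULES[op]
    return 0 if satisfied(a, b) else cost(a, b)

def value(operand, candidate):
    """An operand is either an input variable name or a numeric constant."""
    return candidate[operand] if isinstance(operand, str) else operand

def path_fitness(candidate, predicates):
    """Total penalty of a candidate over the atomized predicates of a path.

    Assumption: penalties are summed. A result of 0 means the candidate
    satisfies every predicate and therefore executes the path.
    """
    return sum(
        penalty(value(left, candidate), op, value(right, candidate))
        for left, op, right in predicates
    )

# Hypothetical path requiring three equal inputs.
equal_sides = [("a", "==", "b"), ("b", "==", "c")]
example = path_fitness({"a": 5, "b": 7, "c": 7}, equal_sides)
```

In the example, only the first predicate fails, so the candidate is charged abs(5 – 7) and nothing for the second predicate.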
IV. Forced Constraint Satisfaction
While evaluating fitness, two categories of atomic predicates
(constraints) can be identified: easily satisfiable and hard to
satisfy. The former category includes the inequality conditions
A<B, A<=B, A>B, A>=B and A≠B, while the latter includes only the
equality predicate A==B, which is not easily satisfied by a GA. A
predicate is said to be satisfied if it evaluates to true. A
predicate is easily satisfiable if there are enough combinations of
inputs which can be easily identified by the GA program in order to
execute [...]
We propose a change in the fitness function of the GA. The fitness
function in existing GA techniques does not change the structure of
a solution; it only evaluates the candidate solution according to
the procedure described in Section III. In our method, we are
concerned only with atomic predicates involving an equality
condition. When an equality atomic predicate fails to evaluate to
true, the proposed fitness function does not penalize the candidate
solution straight away, as it does for inequality predicates.
Instead, it forcibly assigns the value of one variable (operand) to
the other and charges the candidate solution a small overall
penalty of ζ, so that the candidate is evaluated again in the next
iteration of the GA to ensure that the assignment has not violated
predicates that were already satisfied. This is called forced
constraint satisfaction (FCS). Now consider the effect of the new
fitness function on the probability of generating test cases that
satisfy equality predicates. If we employ the FCS method in the
above example, the probability of generating a test case that
satisfies the equality predicate increases from a meagre 0.2 to 1.
Thus it becomes very easy to generate test cases for path
evaluation in fewer generations, and this is corroborated by our
experimental results.
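A minimal sketch of the forced constraint satisfaction step, under stated assumptions: the text does not say which operand is overwritten, so copying the right operand into the left one is a choice made here, and the names fcs_fitness and INEQUALITY are hypothetical. Predicates are triples of two input variable names and a relation.

```python
ZETA = 1  # smallest step of the input domain (1 for integers)

# Table 1 rules for non-equality relations: (satisfied?, penalty if not)
INEQUALITY = {
    "<":  (lambda a, b: a < b,  lambda a, b: a - b),
    "<=": (lambda a, b: a <= b, lambda a, b: a - b + ZETA),
    ">":  (lambda a, b: a > b,  lambda a, b: b - a),
    ">=": (lambda a, b: a >= b, lambda a, b: b - a + ZETA),
    "!=": (lambda a, b: a != b, lambda a, b: ZETA - abs(a - b)),
}

def fcs_fitness(candidate, predicates):
    """Return (penalty, repaired_candidate) for one evaluation pass.

    A failed equality is repaired by assignment and charged only ZETA,
    which keeps the penalty non-zero so the repaired candidate is
    evaluated again in the next generation.
    """
    repaired = dict(candidate)  # the input candidate is left untouched
    total = 0
    for left, op, right in predicates:
        a, b = repaired[left], repaired[right]
        if op == "==":
            if a != b:
                repaired[left] = b  # assumption: left operand is overwritten
                total += ZETA
        else:
            satisfied, cost = INEQUALITY[op]
            if not satisfied(a, b):
                total += cost(a, b)
    return total, repaired

# Hypothetical path requiring three equal inputs.
preds = [("a", "==", "b"), ("b", "==", "c")]
first_penalty, first = fcs_fitness({"a": 5, "b": 7, "c": 9}, preds)
second_penalty, second = fcs_fitness(first, preds)
third_penalty, third = fcs_fitness(second, preds)
```

The repeated calls show why the small penalty matters: repairing the second equality in the first pass breaks the first one again, and only a later re-evaluation confirms that all predicates hold at once.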
V. Results
To evaluate our method, we experimented on two standard benchmark
programs regularly used in testing research: the triangle
classifier and the line-rectangle clipping program. The former
classifies the type of a triangle from the lengths of its three
sides, entered as inputs. The latter identifies whether a line cuts
a rectangle, lies completely outside it, or lies completely inside
it. This program takes eight inputs in total: four for the
coordinates of the rectangle and four to define the line. The
control flow graphs of the two programs were constructed manually
from the respective source code, and all paths were identified for
both programs. For each path a hybrid predicate was formed
manually, which in turn becomes an input to the GA fitness function
used to evaluate candidate solutions. For comparison, test cases
were also generated randomly. Test cases were generated using the
GA with and without the FCS method. Inputs were drawn from two
different domains: a very large one with a size of the order of
10^11 and a small one with a size of the order of 10^4. Preliminary
experimental results are shown in Table 2 and Table 3. In the
larger input domain, the random test generator achieves less path
coverage than in the small domain. In the larger domain, even the
existing GA method fails to provide 100% coverage, while our method
provides 100% coverage without failing a single time. These results
show that the proposed change in the fitness function of the GA
drastically reduces the number of test cases that must be generated
and increases coverage, without loss of the generality and
simplicity of the GA program.
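To see why random generation struggles on equality-guarded paths in a large domain, consider the hedged sketch below. The classifier is a hypothetical stand-in: the source code of the benchmark used in the experiments, and its exact path structure, are not given here, and outcomes are used as a rough proxy for paths.

```python
import random

def classify(a, b, c):
    """Hypothetical triangle classifier over three integer side lengths."""
    if a <= 0 or b <= 0 or c <= 0 or a + b <= c or b + c <= a or a + c <= b:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

def random_outcomes(limit, trials, seed=0):
    """Outcomes reached by `trials` random inputs drawn from [-limit, limit]."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(trials):
        sides = (rng.randint(-limit, limit) for _ in range(3))
        seen.add(classify(*sides))
    return seen

large_domain = random_outcomes(10**11, 1000)
```

With a domain of N possible values per input, a random test satisfies a == b with probability about 1/N and a == b == c with probability about 1/N^2, so the set collected from the large domain is very unlikely to contain the equilateral outcome.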
Generation Method         Random Test             GA Test generation      GA Test generation
                          Generation              without FCS             with FCS
                          Avg. test   Coverage    Avg. test   Coverage    Avg. test   Coverage
                          cases/path  %           cases/path  %           cases/path  %
Triangle classifier       17147       42.85%      4649        87.14%      40          100%
Line-rectangle clipping   14159       53%         5203        78.82%      162         100%
(Avg. test cases/path = average number of test cases generated per path)
Table 2. Comparison of test case generation methods with input
domain range from -10^11 to +10^11
[Figure: Rectangle Classifier with Range between -10^11 and +10^11. Axes: % Test Coverage, Average Test Cases Generated, Path No. Series: % coverage and average test cases for random test case generation, GA test case generation without FCS, and GA test case generation with FCS.]
Generation Method         Random Test             GA Test generation      GA Test generation
                          Generation              without FCS             with FCS
                          Avg. test   Coverage    Avg. test   Coverage    Avg. test   Coverage
                          cases/path  %           cases/path  %           cases/path  %
Triangle classifier       10493       47.86%      427         100%        35          100%
Line-rectangle clipping   13901       55%         2028        99.65%      145         100%
(Avg. test cases/path = average number of test cases generated per path)
Table 3. Comparison of test case generation methods with input
domain range from -10^4 to +10^4
[Figure: Triangle Classifier with Range between -10^11 and +10^11. Axes: % Test Coverage, Average Test Cases Generated, Path No. Series: % coverage and average test cases for random test case generation, GA test case generation without FCS, and GA test case generation with FCS.]
ISSN: 2231-5381
Conclusion
In this paper a modified approach to software testing has been
proposed, which uses a genetic algorithm for automatic test case
generation. The weakness of the traditional GA has been removed by
forcing the equality constraint with the help of a revised fitness
function. The proposed approach has been evaluated and compared
with existing techniques on two widely accepted problems, the
triangle classifier problem and the line-rectangle clipping
problem. Our initial study indicates a significant improvement in
the automation of structural testing, and the proposed method can
be quite useful to software managers in improving the quality of
software within limited time and resources. However, the method
needs to be evaluated further on larger programs in order to make
it more reliable.
References:
[1] [Rapps1982] S. Rapps and E. J. Weyuker. Data flow
analysis techniques for test data selection. In
Software Engineering 6th International Conference.
IEEE Computer Society Press, 1982.
[2] [Rapps1985] S. Rapps and E. J. Weyuker. Selecting
software test data using data flow information. IEEE
Transactions on Software Engineering 11(4):367–375,
April 1985.
[3] [Frankl1988] Frankl, P. G. and Weyuker, E. J. “An
Applicable Family of Data Flow Testing Criteria”,
IEEE Transactions on Software Engineering, Vol. 14,
No. 10, pp. 1483-1498, 1988.
[4] [Myers1978] G. J. Myers. A controlled experiment
in program testing and code walkthrough and
inspection.
[5] [Natafos1988] Ntafos, S. C. “A comparison of some
structural testing strategies”. IEEE Transactions on
Software Engineering 14:868-874, 1988.
[6] [Beizer1990] Beizer B. Software testing techniques.
2nd ed., Dreamtech publication New Delhi. 1990.
[7] [Watkins1990] Watkins AL. “The automatic
generation of test data using genetic algorithms”. The
fourth software quality conference, vol. 2, p. 300–09.
1995
[8] [Sthamer1996] Sthamer H, “The automatic
generation of software test data using genetic
algorithms”. Ph.D. thesis, University of Glamorgan,
Pontypridd, Wales, UK, April 1996.
[9] [Jones1996] Jones B, Sthamer H, Eyres D.
Automatic structural testing using genetic algorithms.
Software Engineering Journal, 11(5), 299–306, 1996
[10] [Pargas1999] Pargas R, Harrold MJ, Peck R. Test-data
generation using genetic algorithms. Journal of
Software Testing, Verification and Reliability, 9(4),
263–82. 1999
[11] [Wegener2001] Wegener J, Baresel A, Sthamer H.,
“Evolutionary test environment for automatic
structural testing”. Information and Software
Technology 43, 841–54, 2001.
[12] [Diaz2003] Díaz E, Tuya J, Blanco R. “Automated
software testing using a metaheuristic technique
based on tabu search”. In: The 18th IEEE
international conference on automated software
engineering, p. 310–3, 2003
[13] [Lin2001] Lin J-C, Yeh P-L. Automatic test data
generation for path testing using GAs. Information
Sciences 131, 47–64, 2001.
[14] [Berndt2003] Berndt D., Fisher J., Johnson L.,
Pinglikar J., and Watkins A., “Breeding Software
Test Cases with Genetic Algorithms”, IEEE
Proceedings of the 36th Hawaii International
Conference on System Sciences (HICSS’03).
[15] [Korel1990] B. Korel, “Automated software test data
generation”, IEEE Transactions on Software
Engineering, 16(8), 870-879, 1990.
[16] [Miller2006] James Miller, Marek Reformat,
Howard Zhang “Automatic test data generation using
genetic algorithm and program dependence graphs”,
Information and Software Technology, 48, 586–605,
2006.
[17] [McMinn2004] McMinn P., “Search-based Software
Test Data Generation: A Survey”, Software Testing,
Verification and Reliability, 14(2), pp. 105-156, June
2004.
[18] [Mahanti2006] Mahanti P K, Banerjee S,
“Automated Testing in Software Engineering: Using
Ant Colony and Self-Regulated Swarms”,
Proceeding of Modelling and Simulation, 2006.
[19] [Tracy1998] Tracey N., Clark J., Mander K., and
McDermid J., “An automated framework for
structural test-data generation” Proceedings of the
International Conference on Automated Software
Engineering, 285-288, 1998.
http://www.ijettjournal.org