Metamorphic testing - SEAS - University of Pennsylvania

Applications of
Metamorphic Testing
Chris Murphy
University of Pennsylvania
cdmurphy@cis.upenn.edu
November 17, 2011
About Me

Lecturer, University of Pennsylvania
PhD, Computer Science, Columbia University, 2010
Advisor: Prof. Gail Kaiser
Research: software testing, CS education
Seven years' experience in the software development industry
Problem:
When testing a piece of software,
how can we know that we’ve created
enough test cases?
Problem:
When testing a piece of software,
how can we create more test cases?
Solution:
We use properties of the software to
create new test cases from existing ones
(particularly those that have not failed).
Result:
This approach, known as metamorphic
testing, is more effective at testing certain
types of software than other approaches.
Today's Talk




What is metamorphic testing?
How is metamorphic testing used to find bugs in
software?
How can metamorphic testing be applied to
applications that do not have test oracles?
What are the open research questions related
to metamorphic testing?
UPenn CIS 573 Software Engineering

Graduate-level software engineering course

Just over 100 students

Focuses on software maintenance issues:
– Testing
– Formal verification
– Debugging
– Fault localization
– Refactoring
It's your first day of work as a
software engineer at BloobleSoft.
Your boss gives you 6,000 lines of
code and a specification and says
“find the bugs”.
Where would you start?
[Diagram: Specification and Code feed a Test Case Generation Strategy, which produces Test Cases]
You've created 4,837 test cases.
All of them pass.
How do you know when you're
done creating test cases?
[Diagram labels: Testing Requirements, Measurable Adequacy Criteria, Test Cases, Coverage Level, Desired Adequacy Level, Acceptable?]
Your test cases are achieving
100% coverage.
None of them have
found any bugs.
Are those test cases useful?
Maybe those test cases can be
used to create new test cases.
The more test cases, the better.
Right?
This is the idea behind
“metamorphic testing”.
[Chen et al., HKUST TR CS-98-01, 1998]
A
simple example
(really, really, really, really)
Let's say you're testing a
cosine function.
(I know, I know...)
You have a test case {45º, 0.7071}, i.e.
cos(45º) = 0.7071
How could we use this test case
to create new test cases?
We know that the cosine function
exhibits certain properties.
That is, if we make certain changes
to the input, we can predict the
effect on the output.
These are referred to as
“metamorphic properties”.
What are the metamorphic properties
of the cosine function?
cos(x + 360º) = cos(x)
That is, if we add 360 to the input,
the output should not change.
cos(x - 360º) = cos(x)
cos(x + 180º) = -1 * cos(x)
Given our original test case {45°, 0.7071},
we can create three follow-on test cases.
Property: cos(x + 360º) = cos(x)
Input: 45º + 360º = 405º
Output: cos(45º) = 0.7071
Property: cos(x - 360º) = cos(x)
Input: 45º - 360º = -315º
Output: cos(45º) = 0.7071
Property: cos(x + 180º) = -1 * cos(x)
Input: 45º + 180º = 225º
Output: -1 * cos(45º) = -0.7071
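The three follow-on test cases can be written as executable checks. This is a minimal sketch, not part of the original talk: the class and method names (CosineFollowOn, cosDegrees, followOnsPass) and the tolerance are assumptions chosen for illustration. The tolerance matches the four decimal places used in the test case.

```java
class CosineFollowOn {
    // Assumed tolerance: the slide's expected value is given to four decimal places.
    static final double TOLERANCE = 1e-4;

    // Cosine of an angle given in degrees.
    static double cosDegrees(double degrees) {
        return Math.cos(Math.toRadians(degrees));
    }

    // True when the two values agree within TOLERANCE.
    static boolean close(double a, double b) {
        return Math.abs(a - b) <= TOLERANCE;
    }

    // Checks the three follow-on test cases derived from the initial case {x, expected}.
    static boolean followOnsPass(double x, double expected) {
        return close(cosDegrees(x + 360), expected)    // cos(x + 360) = cos(x)
            && close(cosDegrees(x - 360), expected)    // cos(x - 360) = cos(x)
            && close(cosDegrees(x + 180), -expected);  // cos(x + 180) = -cos(x)
    }

    public static void main(String[] args) {
        System.out.println(followOnsPass(45, 0.7071));
    }
}
```

Note that none of the follow-on checks needs a new hand-computed expected value; each one is derived from the output of the initial test case.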
Initial test case: {x, f(x)}
Follow-on test case: {t(x), f(t(x))}
[Diagram: f maps x to f(x); t transforms x into t(x), and g transforms f(x) into g(f(x)); f applied to t(x) gives f(t(x)) = g(f(x))]
A metamorphic property of a function f
is a pair of functions (t, g)
such that f(t(x)) = g(f(x))
for all inputs x
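The definition translates fairly directly into code. The sketch below is illustrative only: the name holdsFor and the caller-supplied equality test are assumptions, and it checks the property for one input x rather than proving it for all inputs.

```java
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.function.UnaryOperator;

class MetamorphicProperty {
    // Returns true when f(t(x)) equals g(f(x)) for the given input x,
    // using the caller-supplied equality test for outputs.
    static <X, Y> boolean holdsFor(Function<X, Y> f, UnaryOperator<X> t,
                                   UnaryOperator<Y> g, X x, BiPredicate<Y, Y> same) {
        Y followOnActual = f.apply(t.apply(x));    // f(t(x))
        Y followOnExpected = g.apply(f.apply(x));  // g(f(x))
        return same.test(followOnActual, followOnExpected);
    }

    public static void main(String[] args) {
        Function<Double, Double> cos = deg -> Math.cos(Math.toRadians(deg));
        BiPredicate<Double, Double> close = (a, b) -> Math.abs(a - b) < 1e-9;
        // Property from the cosine example: t adds 180 degrees, g negates the output.
        System.out.println(holdsFor(cos, deg -> deg + 180, y -> -y, 45.0, close));
    }
}
```

Passing the equality test as a parameter is a design choice for the sketch: floating-point outputs usually need a tolerance, while integer or string outputs can be compared exactly.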
but wait, isn’t that the same as…
Program invariants: -1 ≤ cos(x) ≤ 1
Describe legal ranges/values of a function,
but not how it should react when the
input is changed.
Algebraic properties: cos²(x) = 1 – sin²(x)
Describe the relationships between
multiple functions, but not a single function.
simple categories of properties
Initial test case: sum(a, b, c, d, e, f) = s
Permute: sum(c, e, b, a, f, d) = s
Add: sum(a+2, b+2, c+2, d+2, e+2, f+2) = s+12
Multiply: sum(2a, 2b, 2c, 2d, 2e, 2f) = 2s
Include: sum(a, b, c, d, e, f, g) = s+g
Exclude: sum(a, b, c, d, e) = s-f
Initial test case #1: sum(a, b, c, d, e, f) = s
Initial test case #2: sum(g, h, i, j, k, l) = t
Compose: sum(a, b, c, d, e, f, g, h, i, j, k, l) = s+t
Combination of properties: sum(2h, 2d, 2a, 2k, 2e, 2g, 2i, 2c, 2l, 2f, 2b, 2j) = 2s+2t
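The sum-based properties above can be checked mechanically. The following sketch is an assumption-laden illustration rather than code from the talk: all names (SumProperties, allHold, and the helper methods) are hypothetical, reversal stands in for an arbitrary permutation, and inputs are assumed to be non-empty and small enough not to overflow.

```java
import java.util.Arrays;

class SumProperties {
    // Function under test: the sum of the array elements.
    static long sum(int[] values) {
        long total = 0;
        for (int v : values) total += v;
        return total;
    }

    // Permute: reversal is one simple permutation.
    static int[] reversed(int[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) out[i] = values[values.length - 1 - i];
        return out;
    }

    // Add: increase every element by c.
    static int[] plus(int[] values, int c) {
        return Arrays.stream(values).map(v -> v + c).toArray();
    }

    // Multiply: scale every element by k.
    static int[] times(int[] values, int k) {
        return Arrays.stream(values).map(v -> v * k).toArray();
    }

    // Include: append one extra element.
    static int[] including(int[] values, int extra) {
        int[] out = Arrays.copyOf(values, values.length + 1);
        out[values.length] = extra;
        return out;
    }

    // Exclude: drop the last element (input must be non-empty).
    static int[] withoutLast(int[] values) {
        return Arrays.copyOf(values, values.length - 1);
    }

    // Compose: concatenate two inputs.
    static int[] concat(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Checks each property against the outputs of the two initial test cases.
    static boolean allHold(int[] first, int[] second) {
        long s = sum(first);
        long t = sum(second);
        int n = first.length;
        return sum(reversed(first)) == s
            && sum(plus(first, 2)) == s + 2L * n
            && sum(times(first, 2)) == 2 * s
            && sum(including(first, 7)) == s + 7
            && sum(withoutLast(first)) == s - first[n - 1]
            && sum(concat(first, second)) == s + t
            && sum(times(reversed(concat(first, second)), 2)) == 2 * s + 2 * t;
    }
}
```

For a six-element input, the Add check expects s + 12, which matches the s+12 shown for adding 2 to each of a through f.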
Common Metamorphic Properties
• Additive: Increase (or decrease) numerical
values by a constant
• Multiplicative: Multiply numerical values by
a constant
• Permutative: Randomly permute the order
of elements in a set
• Invertive: Create the “opposite” of a set
• Inclusive: Add a new element to a set
• Exclusive: Remove an element from a set
• Compositional: Compose a set
[Murphy et al., SEKE’08]
Other Types of Properties
• Noise-based: include input values that will
not affect the output
• Semantically Equivalent: create inputs that
have the same “meaning” as the original
• Heuristic: create inputs that are “close” to
the original
• Statistical: create inputs that exhibit the
same statistical properties
one more example
Consider a function that takes a set
of Points (x-y coordinates) and calculates
the total distance from the first to the last,
via the rest.
What are that function’s metamorphic
properties?
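One possible set of answers, offered as a sketch rather than the speaker's own list: assuming straight-line (Euclidean) distance between consecutive points, reversing the order of the points should leave the total unchanged, translating every point by the same offset should leave it unchanged, and scaling every coordinate by a positive constant k should multiply it by k. All names below are hypothetical.

```java
class PathDistance {
    // Total Euclidean distance from the first point to the last, via the rest.
    // Each point is an {x, y} pair.
    static double totalDistance(double[][] points) {
        double total = 0;
        for (int i = 1; i < points.length; i++) {
            total += Math.hypot(points[i][0] - points[i - 1][0],
                                points[i][1] - points[i - 1][1]);
        }
        return total;
    }

    // Same points in reverse order.
    static double[][] reversed(double[][] points) {
        double[][] out = new double[points.length][];
        for (int i = 0; i < points.length; i++) out[i] = points[points.length - 1 - i];
        return out;
    }

    // Scales every coordinate by 'scale', then shifts by (dx, dy).
    static double[][] transformed(double[][] points, double scale, double dx, double dy) {
        double[][] out = new double[points.length][];
        for (int i = 0; i < points.length; i++) {
            out[i] = new double[] { scale * points[i][0] + dx, scale * points[i][1] + dy };
        }
        return out;
    }

    // Relative comparison, since the outputs are floating-point values.
    static boolean close(double a, double b) {
        return Math.abs(a - b) <= 1e-9 * Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
    }

    // Checks the three candidate properties for one input.
    static boolean candidatePropertiesHold(double[][] points) {
        double d = totalDistance(points);
        return close(totalDistance(reversed(points)), d)                      // reverse
            && close(totalDistance(transformed(points, 1.0, 10.0, -3.0)), d)  // translate
            && close(totalDistance(transformed(points, 2.0, 0.0, 0.0)), 2 * d); // scale by 2
    }
}
```

An arbitrary permutation of the points is not a valid property here, because the path visits the points in order; only the reversal preserves the total.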
Okay, I think I get it.
But does it really work?!?!
In order to find bugs…
1. The original test case must pass,
even though there is a bug.
2. The follow-on test case must fail.
But how can this be?!?!
/* Return the smallest value in the array */
int findMin(int A[]) {
    int min = A[0];
    for (int i = 1; i < A.length-1; i++) {
        if (A[i] < min) min = A[i];
    }
    return min;
}
Test case { {2, 1, 4, 3}, 1}
100% statement coverage!
100% branch coverage!
PASS!
Metamorphic property: If we permute the
input, the output remains the same.
Follow-on test case: { {4, 2, 3, 1}, 1} FAIL!
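The permutation check can be automated. In the sketch below, findMin is copied from the slide with its bug left in place (the loop bound skips the last element); the surrounding class and the permutationPropertyHolds helper are hypothetical, and rotation is used as a simple way to generate several permutations.

```java
class FindMinCheck {
    // Copied from the slide, bug included: the loop stops one element early.
    static int findMin(int[] a) {
        int min = a[0];
        for (int i = 1; i < a.length - 1; i++) {
            if (a[i] < min) min = a[i];
        }
        return min;
    }

    // Permutative property: the minimum must not depend on element order.
    // Rotates the input left one position at a time and compares each result
    // with the output of the initial test case.
    static boolean permutationPropertyHolds(int[] input) {
        int expected = findMin(input);
        int[] rotated = input.clone();
        for (int r = 1; r < input.length; r++) {
            int first = rotated[0];
            System.arraycopy(rotated, 1, rotated, 0, rotated.length - 1);
            rotated[rotated.length - 1] = first;
            if (findMin(rotated) != expected) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] initial = {2, 1, 4, 3};
        System.out.println(findMin(initial));                  // initial test case passes
        System.out.println(permutationPropertyHolds(initial)); // a rotation exposes the bug
    }
}
```

The follow-on input only reveals the fault when the smallest value lands in the last position, which is why trying several permutations is more useful than trying one.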
metamorphic testing in
the real world
Bioinformatics [Chen et al., BMC Bioinf., 2009]
Machine Learning [Xie et al., JSS, 2011]
Network Simulation [Chen et al., FTDS, 2009]
Computer Graphics [Guderlei et al., QSIC, 2007]
what types of applications
is metamorphic testing good for?
Applications that deal primarily with
numerical input and numerical output.
Applications that use graph-based
algorithms.
Compilers.
[Zhou et al., ISFST’04]
Applications that do not have test oracles.
[Diagram: Test Input → Program → Actual Output; Specification → Expected Output; Oracle compares Actual Output with Expected Output]
what if there is no oracle?
Machine Learning
Discrete Event Simulation
[Chart: Length of Stay versus Utilization. x-axis: number of beds; y-axes: units of time and percent utilization. Series: LOS, Doctor Utilization, Nurse Utilization, Triage Utilization, Clerk Utilization.]
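[Diagram: f maps x to f(x); t transforms x into t(x), and f maps t(x) to f(t(x)), the actual output; g transforms f(x) into g(f(x)), the expected output; the check is whether f(t(x)) = g(f(x))]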
If f(t(x)) = g(f(x)) that does not
mean that the output is correct.
But if f(t(x)) != g(f(x)) then
one (or both) must be incorrect.
example: RapidMiner
RapidMiner is a suite of machine learning
algorithms implemented in Java.
In its NaïveBayes implementation, a
confidence level c is reported whenever
it classifies an example e using a model
M created from a training data set T.
That is:
c = Classify(M(T), e)
We expect that if we modify T to include
an extra instance of e, then the confidence
level should double, since we are twice
as certain about the classification.
That is:
Classify(M(T+e), e) = 2 * Classify(M(T), e)
Our testing detected violations of
this property, thus revealing a bug.
[Murphy et al., ICST’09]
empirical study
Goal:
Show that metamorphic testing is more
effective than other techniques at finding
bugs in applications without test oracles.
Approach:
Use mutation analysis to insert faults
into the applications, and see how many
are detected using various techniques.
Application domains investigated:
1. Machine Learning (C4.5, MartiRank,
Support Vector Machines, PAYL)
2. Discrete Event Simulation (JSim)
3. Information Retrieval (Lucene)
4. Optimization (gaffitter)
Techniques investigated:
1. Metamorphic Testing
2. Runtime Assertion Checking
3. Partial Oracle
Experimental Results
[Bar chart: % of Mutants Killed by Partial Oracle, Runtime Assertion Checking, and Metamorphic Testing, shown for C4.5, MartiRank, SVM, PAYL, JSim, Lucene, gaffitter, and TOTAL]
[Murphy et al., ISSTA’09]
can we do better?
That experiment used application-level metamorphic properties.
What if we test at the function level, too?
And continuously conduct those tests
while the software is running?
This is known as
Metamorphic Runtime Checking.
[Murphy et al., TR CUCS-042-09, 2009]
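The idea of checking a function-level property on live arguments can be sketched as a wrapper around the function. This is only an illustration under stated assumptions: the mean function, the doubling property, and the violation counter are hypothetical, and the sketch does not reproduce the framework described in the cited report.

```java
class RuntimeCheckedMean {
    // Number of property violations observed so far (hypothetical bookkeeping).
    static int violations = 0;

    // Function under test: arithmetic mean.
    static double mean(double[] values) {
        double total = 0;
        for (double v : values) total += v;
        return total / values.length;
    }

    // Returns the normal result, but also checks a function-level property
    // on the arguments of this call: doubling every element should double the mean.
    static double checkedMean(double[] values) {
        double result = mean(values);

        double[] doubled = new double[values.length];
        for (int i = 0; i < values.length; i++) doubled[i] = 2 * values[i];
        double followOn = mean(doubled);

        // Record a violation instead of failing, so the application keeps running.
        if (Math.abs(followOn - 2 * result) > 1e-9 * Math.max(1.0, Math.abs(result))) {
            violations++;
        }
        return result;
    }
}
```

Callers use checkedMean in place of mean; every real invocation then doubles as an initial test case, at the cost of one extra execution of the function.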
Experimental Results
[Bar chart: results for Partial Oracle, Runtime Assertion Checking, Metamorphic Testing, and MT + MRC, shown for C4.5, MartiRank, SVM, PAYL, JSim, Lucene, gaffitter, and TOTAL]
research directions
When I run my test, I see that the
metamorphic property is violated.
Does that mean there's a bug?
Well, not necessarily....
How can we know whether the
metamorphic properties are sound?
I've used the guidelines to identify as
many metamorphic properties as I could.
Does that mean that's all of them?
Well, not necessarily....
How can we know whether the set of
metamorphic properties is complete?
Could we detect (likely) metamorphic
properties automatically?
I have a function for which I expect that,
if I double the input, the output should
be doubled.
Could I verify that property without
actually executing the code?
Well, probably....
Can metamorphic properties
be verified statically?
summary
Metamorphic testing is a method of
creating new test cases from existing ones.
It depends heavily on the software’s
metamorphic properties, which are
often numerical.
Metamorphic testing is particularly
effective at finding bugs in applications
that do not have test oracles.
thanks!