1 - Centre for Research in Evolution, Search & Testing

Tao Xie
North Carolina State University
In collaboration with Nikolai Tillmann, Peli de Halleux, Wolfram Schulte
@Microsoft Research and students @NCSU ASE

Software testing is important
 Software errors cost the U.S. economy about $59.5 billion each
year (0.6% of the GDP) [NIST 02]
 Improving testing infrastructure could save 1/3 cost [NIST 02]

Software testing is costly
 Accounts for as much as half of the total cost of software development
[Beizer 90]

Automated testing reduces manual testing effort
 Test execution: JUnit, NUnit, xUnit, etc.
 Test generation: Pex, AgitarOne, Parasoft Jtest, etc.
 Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.

Developer testing
 http://www.developertesting.com/
 Kent Beck’s 2004 talk on “Future of Developer Testing”
http://www.itconversations.com/shows/detail301.html

This talk focuses on tool automation in
developer testing (e.g., unit testing)
 Not system testing etc. conducted by testers
[Diagram: Test inputs + Program → Outputs; Outputs =? Expected Outputs (Test Oracles)]

Test Generation
 Generating high-quality test inputs (e.g., achieving high code
coverage)

Test Oracles
 Specifying high-quality test oracles (e.g., guarding against
various faults)

Three essential ingredients:
 Data
 Method Sequence
 Assertions
void Add() {
    int item = 3;
    var list = new List<int>();
    list.Add(item);
    var count = list.Count;
    Assert.AreEqual(1, count);
}
list.Add(3);

Which value matters?
 Bad choices cause incomplete test suites.
 Hard-coded values get stale when product code
changes.
 Why pick a value if it doesn’t matter?
[Tillmann&Schulte ESEC/FSE 05]


Parameterized Unit Test =
Unit Test with Parameters
Separation of concerns
 Data is generated by a tool
 Developer can focus on functional specification
void Add(List<int> list, int item) {
    var count = list.Count;
    list.Add(item);
    Assert.AreEqual(count + 1, list.Count);
}
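As a minimal sketch of this separation of concerns, the parameterized test can be run against a small table of inputs. The inputs below are hand-picked stand-ins for tool-generated data, and the class name AddPutDemo and the exception-based check are illustrative assumptions, not the API of Pex or of any test framework.

```csharp
using System;
using System.Collections.Generic;

static class AddPutDemo
{
    // The parameterized unit test: it should hold for any list and any item.
    public static void AddPut(List<int> list, int item)
    {
        int count = list.Count;
        list.Add(item);
        if (list.Count != count + 1)
            throw new InvalidOperationException("Add did not grow the list by one");
    }

    public static void Main()
    {
        // Hand-picked inputs (assumption): a tool would normally supply these.
        var inputs = new (int[] Initial, int Item)[]
        {
            (new int[0], 3),
            (new[] { 1, 2 }, 0),
            (new[] { 7 }, int.MinValue),
        };
        foreach (var (initial, item) in inputs)
            AddPut(new List<int>(initial), item);
        Console.WriteLine("AddPut held for all sample inputs");
    }
}
```

The test body never mentions a concrete value; only the driver does, which is the part a test generator would take over.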

A Parameterized Unit Test can be read as a
universally quantified, conditional axiom.
void ReadWrite(string name, string data) {
    Assume.IsTrue(name != null && data != null);
    Write(name, data);
    var readData = Read(name);
    Assert.AreEqual(data, readData);
}
∀ string name, string data:
    name ≠ null ∧ data ≠ null ⇒
    equals(ReadResource(name, WriteResource(name, data)), data)
Parameterized Unit Tests (PUTs) are commonly supported by
various test frameworks
 .NET: Supported by .NET test frameworks
 http://www.mbunit.com/
 http://www.nunit.org/
 …
 Java: Supported by JUnit 4.X
 http://www.junit.org/
Generating test inputs for PUTs is supported by tools
 .NET: Supported by Microsoft Research Pex
 http://research.microsoft.com/Pex/
 Java: Supported by Agitar AgitarOne
 http://www.agitar.com/

Human
 Expensive, incomplete, …

Brute Force
 Pairwise, predefined data, etc…

Random:
 Cheap, Fast
 “It passed a thousand tests” feeling

Dynamic Symbolic Execution: Pex, CUTE, EXE
 Automated white-box
 Not random – Constraint Solving
Code to generate inputs for:
void CoverMe(int[] a)
{
    if (a == null) return;
    if (a.Length > 0)
        if (a[0] == 1234567890)
            throw new Exception("bug");
}

Loop: Choose next path → Solve → Execute & Monitor
(the last observed condition is negated to obtain the next constraints to solve)

Constraints to solve                          Data      Observed constraints
(none)                                        null      a==null
a!=null                                       {}        a!=null && !(a.Length>0)
a!=null && a.Length>0                         {0}       a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890     {123…}    a!=null && a.Length>0 && a[0]==1234567890

Done: There is no path left.
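The walkthrough above can be replayed with a small sketch. No constraint solver is involved here: the four inputs are hard-coded from the table (with 1234567890 as the last array element), and PathOf is a hypothetical variant of CoverMe that returns a label for the path taken instead of throwing.

```csharp
using System;

static class CoverMeDemo
{
    // Same branching structure as CoverMe, but reports which path was executed.
    public static string PathOf(int[] a)
    {
        if (a == null) return "a==null";
        if (a.Length > 0)
        {
            if (a[0] == 1234567890)
                return "a!=null && a.Length>0 && a[0]==1234567890";
            return "a!=null && a.Length>0 && a[0]!=1234567890";
        }
        return "a!=null && !(a.Length>0)";
    }

    public static void Main()
    {
        // One input per row of the table; a DSE tool would derive each one
        // by solving the negated constraint of the previous run.
        int[][] inputs = { null, new int[0], new[] { 0 }, new[] { 1234567890 } };
        foreach (var input in inputs)
            Console.WriteLine(PathOf(input));
    }
}
```

Each input exercises a different path, so four runs are enough to cover every branch of the method.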

Loops
 Fitnex [Xie et al. DSN 09]

Generic API functions e.g., RegEx matching
IsMatch(s1,regex1)
 Reggae [Li et al. ASE 09-sp]

Method sequences
 MSeqGen [Thummalapenta et al. ESEC/FSE 09]

Environments e.g., file systems, network, db, …
 Parameterized Mock Objects [Marri AST 09]
Opportunities


Regression testing [Taneja et al. ICSE 09-nier]
Developer guidance (cooperative developer testing)

Applications


Test network app at Army division@Fort Hood, Texas
Test DB app of hand-held medical assistant device at FDA
Download counts (20 months, Feb. 2008 - Oct. 2009)
 Academic: 17,366
 Devlabs: 13,022
 Total: 30,388

There are decision procedures for individual path
conditions, but…
 Number of potential paths grows exponentially with
number of branches
 Reachable code not known initially
 Without guidance, same loop might be unfolded
forever
Fitnex search strategy
[Xie et al. DSN 09]
public bool TestLoop(int x, int[] y) {
    if (x == 90) {
        for (int i = 0; i < y.Length; i++)
            if (y[i] == 15)
                x++;
        if (x == 110)
            return true;
    }
    return false;
}
Test input: TestLoop(0, {0})
Path condition: !(x == 90)
 ↓
New path condition: (x == 90)
 ↓
New test input: TestLoop(90, {0})
Test input: TestLoop(90, {0})
Path condition: (x == 90) && !(y[0] == 15)
 ↓
New path condition: (x == 90) && (y[0] == 15)
 ↓
New test input: TestLoop(90, {15})
Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110)
 ↓
New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110)
 ↓
New test input: No solution!?
Test input: TestLoop(90, {15})
Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110)
 ↓
New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length)
 → Expand array size
Test input: TestLoop(90, {15})
We can have infinite paths! (both length and number)
Manual analysis → need at least 20 loop iterations to cover the target branch
Exploring all paths up to 20 loop iterations is practically infeasible: 2^20 paths
Test input: TestLoop(90, {15, 15})
Key observations: with respect to the coverage target,
 not all paths are equally promising for flipping nodes
 not all nodes are equally promising to flip
Our solution:
 Prefer to flip nodes on the most promising path
 Prefer to flip the most promising nodes on path
 Use fitness function as a proxy for promising
The fitness function (FF) computes a fitness value (distance between the
current state and the goal state)
 Search tries to minimize the fitness value
[Tracey et al. 98, Liu et al. 05, …]
Fitness function for the target branch (x == 110) in TestLoop: |110 – x|
Fitness function: |110 – x|

(x, y)                          Fitness value
(90, {0})                       20
(90, {15})                      19
(90, {15, 0})                   19
(90, {15, 15})                  18
(90, {15, 15, 0})               18
(90, {15, 15, 15})              17
(90, {15, 15, 15, 0})           17
(90, {15, 15, 15, 15})          16
(90, {15, 15, 15, 15, 0})       16
(90, {15, 15, 15, 15, 15})      15
…
Give preference to flip a node in paths with better fitness values.
We still need to address which node to flip on paths …
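The fitness values in the table can be recomputed with a short sketch. It assumes the fitness is |110 - x| evaluated where the branch x == 110 is reached; the Fitness helper and its null result for runs that never enter the outer branch are illustrative choices, not part of Pex.

```csharp
using System;

static class FitnessDemo
{
    // Executes the loop of TestLoop and returns |110 - x| as seen at "if (x == 110)".
    // Returns null when the outer branch "x == 90" is not taken (assumption).
    public static int? Fitness(int x, int[] y)
    {
        if (x != 90) return null;
        for (int i = 0; i < y.Length; i++)
            if (y[i] == 15)
                x++;
        return Math.Abs(110 - x);
    }

    public static void Main()
    {
        int[][] ys =
        {
            new[] { 0 },
            new[] { 15 },
            new[] { 15, 0 },
            new[] { 15, 15 },
        };
        foreach (var y in ys)
            Console.WriteLine("(90, {" + string.Join(", ", y) + "}) -> " + Fitness(90, y));
    }
}
```

Every additional 15 in the array lowers the value by one, which is the signal the search uses to prefer longer arrays of 15s.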
Fitness function: |110 – x|
Branch b1: i < y.Length
Branch b2: i >= y.Length
Branch b3: y[i] == 15
Branch b4: y[i] != 15

(x, y)                          Flip       Fitness value
(90, {0})                                  20
(90, {15})                      flip b4    19
(90, {15, 0})                   flip b2    19
(90, {15, 15})                  flip b4    18
(90, {15, 15, 0})               flip b2    18
(90, {15, 15, 15})              flip b4    17
(90, {15, 15, 15, 0})           flip b2    17
(90, {15, 15, 15, 15})          flip b4    16
(90, {15, 15, 15, 15, 0})       flip b2    16
(90, {15, 15, 15, 15, 15})      flip b4    15
…
• Flipping a branch node of b4 (b3) gives us an average fitness gain (loss) of 1 (-1)
• Flipping a branch node of b2 (b1) gives us an average fitness gain (loss) of 0 (0)
Let p be an already explored path, and n a node on that
path, with explored outgoing branch b.
 After (successfully) flipping n, we get path p’ that goes to
node n, and then continues with a different branch b’.
 Define fitness gains as follows, where F(.) is the fitness
value of a path.

 Set FGain(b) := F(p) – F(p’)
 Set FGain(b’) := F(p’) – F(p)

Compute the average fitness gain for each program branch
over time
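A minimal sketch of this bookkeeping follows. The FitnessGainTracker class and its method names are hypothetical; only the two FGain formulas come from the definition above.

```csharp
using System;
using System.Collections.Generic;

sealed class FitnessGainTracker
{
    // Per branch: sum of observed gains and number of observations.
    private readonly Dictionary<string, (double Sum, int Count)> gains =
        new Dictionary<string, (double Sum, int Count)>();

    // Record one successful flip: path p left the node via b, the new path p' via bPrime.
    public void RecordFlip(string b, double fitnessP, string bPrime, double fitnessPPrime)
    {
        Add(b, fitnessP - fitnessPPrime);       // FGain(b)  := F(p)  - F(p')
        Add(bPrime, fitnessPPrime - fitnessP);  // FGain(b') := F(p') - F(p)
    }

    private void Add(string branch, double gain)
    {
        gains.TryGetValue(branch, out var entry); // entry is (0, 0) if the branch is new
        gains[branch] = (entry.Sum + gain, entry.Count + 1);
    }

    // Average gain observed so far; 0 for branches never involved in a flip (assumption).
    public double AverageGain(string branch) =>
        gains.TryGetValue(branch, out var e) && e.Count > 0 ? e.Sum / e.Count : 0.0;
}

static class FitnessGainDemo
{
    public static void Main()
    {
        var tracker = new FitnessGainTracker();
        tracker.RecordFlip("b4", 20, "b3", 19); // (90,{0}) -> (90,{15})
        tracker.RecordFlip("b2", 19, "b1", 19); // (90,{15}) -> (90,{15,0})
        foreach (var b in new[] { "b1", "b2", "b3", "b4" })
            Console.WriteLine(b + ": " + tracker.AverageGain(b));
    }
}
```

With the two flips from the TestLoop example, b4 averages a gain of 1, b3 a loss of 1, and b1 and b2 stay at 0, matching the averages stated for that example.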
Pex: Automated White-Box Test Generation tool for
.NET, based on Dynamic Symbolic Execution
 Pex maintains global search frontier

 All discovered branch nodes are added to frontier
 Frontier may choose next branch node to flip
 Fully explored branch nodes are removed from frontier

Pex has a default search frontier
 Tries to create diversity across different coverage criteria, e.g.
statement coverage, branch coverage, stack traces, etc.
 Customizable: Other frontiers can be combined in a fair
round-robin scheme
We implemented a new search frontier “Fitnex”:
 Nodes to flip are prioritized by their composite fitness
value:
F(p_n) – FGain(b_n),
where
 p_n is the path of node n
 b_n is the explored outgoing branch of n
 Fitnex always picks the node with the lowest composite
fitness value to flip.
 To avoid local optima or biases, the fitness-guided
strategy is combined with Pex’s search strategies
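The prioritization rule can be sketched as follows. This is an illustration under stated assumptions, not Pex's frontier implementation: the Node type, its fields, and PickNext are hypothetical names, and ties are broken by insertion order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FitnexFrontierDemo
{
    // One frontier entry: a branch node n with F(p_n) and FGain(b_n).
    public sealed class Node
    {
        public Node(string id, double pathFitness, double branchGain)
        {
            Id = id;
            PathFitness = pathFitness;
            BranchGain = branchGain;
        }
        public string Id { get; }
        public double PathFitness { get; }
        public double BranchGain { get; }
        // Composite fitness value: F(p_n) - FGain(b_n)
        public double Composite => PathFitness - BranchGain;
    }

    // Picks the node with the lowest composite fitness value.
    public static Node PickNext(IEnumerable<Node> frontier) =>
        frontier.OrderBy(n => n.Composite).First();

    public static void Main()
    {
        var frontier = new List<Node>
        {
            new Node("n1: y[1]==15 on (90,{15,15}), explored b3", 18, -1), // composite 19
            new Node("n2: loop exit on (90,{15,15}), explored b2", 18, 0), // composite 18
            new Node("n3: y[2]!=15 on (90,{15,15,0}), explored b4", 18, 1), // composite 17
        };
        Console.WriteLine(PickNext(frontier).Id);
    }
}
```

All three nodes sit on paths with the same fitness value of 18; the branch gain alone decides that the node whose explored branch is b4 is flipped next.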
A collection of micro-benchmark programs routinely
used by the Pex developers to evaluate Pex’s
performance, extracted from real, complex C# programs
Ranging from string matching like
if (value.StartsWith("Hello") &&
value.EndsWith("World!") &&
value.Contains(" ")) { … }
to a small parser for a Pascal-like
language where the target is to create
a legal program.


Pex with the Fitnex strategy
Pex without the Fitnex strategy
 Pex’s previous default strategy

Random
 a strategy where branch nodes to flip are chosen
randomly in the already explored execution tree

Iterative Deepening
 a strategy where breadth-first search is performed
over the execution tree
#runs/iterations required to cover the target
Pex w/o Fitnex: avg. improvement of factor 1.9 over Random
Pex w/ Fitnex: avg. improvement of factor 5.2 over Random



Pex normally uses public methods to
configure non-public object fields
Heuristics built-in to deal with common types
User can help if needed
void Test(Foo foo) {
    if (foo.Value == 123) throw …
}
[PexFactoryMethod]
Foo Create(Bar bar) {
    return new Foo(bar);
}

A graph example from QuickGraph library
interface IGraph
{
    /* Adds given vertex to the graph */
    void AddVertex(IVertex v);
    /* Creates a new vertex and adds it to the graph */
    IVertex AddVertex();
    /* Adds an edge to the graph. Both vertices should
       already exist in the graph */
    IEdge AddEdge(IVertex v1, IVertex v2);
}

Desired object state for reaching targets 1 and 2:
graph object should contain vertices and edges
class SortAlgorithm
{
    IGraph graph;
    public SortAlgorithm(IGraph graph) {
        this.graph = graph;
    }
    public void Compute(IVertex s) {
        foreach (IVertex u in graph.Vertices)
        {
            // Target 1
        }
        foreach (IEdge e in graph.Edges)
        {
            // Target 2
        }
    }
}
method
sequence
Applying Randoop, a random testing approach that
constructs test inputs by randomly selecting method calls
Example sequence generated by Randoop
VertexAndEdgeProvider v0 = new VertexAndEdgeProvider();
Boolean v1 = false;
BidirectionalGraph v2 = new
    BidirectionalGraph((IVertexAndEdgeProvider)v0, v1);
IVertex v3 = v2.AddVertex();
IVertex v4 = v0.ProvideVertex();
IEdge v15 = v2.AddEdge(v3, v4);
// v4 is not in the graph, so the edge cannot be added to the graph.
Achieved 31.82% (7 of 22) branch coverage
Reason for low coverage: Not able to generate graph with
vertices and edges
Mine sequences from existing code bases
Reuse mined sequences for achieving desired object states
A mined sequence from an existing code base
VertexAndEdgeProvider v0;
bool bVal;
IGraph ag = new AdjacencyGraph(v0, bVal);
IVertex source = ag.AddVertex();
IVertex target = ag.AddVertex();
IVertex vertex3 = ag.AddVertex();
IEdge edg1 = ag.AddEdge(source, target);
IEdge edg2 = ag.AddEdge(target, vertex3);
IEdge edg3 = ag.AddEdge(source, vertex3);
// Graph object includes both vertices and edges
Use mined sequences to assist Randoop and Pex
Both Randoop and Pex achieved 86.40% (19 of 22) branch
coverage with assistance from MSeqGen
Existing codebases are often large and complete
analysis is expensive
  Search and analyze only relevant portions
Concrete values in mined sequences may be
different from desired values
  Replace concrete values with symbolic values
and use dynamic symbolic execution
Extracted sequences individually may not be
sufficient to achieve desired object states
  Combine extracted sequences to generate
new sequences
Problem: Existing code bases are often large and complete
analysis is expensive
Solution:
Use keyword search for identifying relevant method
bodies using target classes
Analyze only those relevant method bodies
Target classes:
System.Collections.Hashtable
QuickGraph.Algorithms.TSAlgorithm
Keywords: Hashtable, TSAlgorithm
Shortnames of target classes
are used as keywords
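The filtering step can be sketched as follows. This assumes that "keyword search" means a plain substring match of the short class names against method bodies; the helper names and the sample method bodies are made up for illustration and do not reflect how MSeqGen searches code bases.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class KeywordSearchDemo
{
    // "System.Collections.Hashtable" -> "Hashtable"
    public static string ShortName(string fullName) =>
        fullName.Substring(fullName.LastIndexOf('.') + 1);

    // Keeps only the methods whose body mentions at least one target-class short name.
    public static List<string> RelevantMethods(
        IEnumerable<(string Name, string Body)> methods,
        IEnumerable<string> targetClasses)
    {
        var keywords = targetClasses.Select(ShortName).ToList();
        return methods
            .Where(m => keywords.Any(k => m.Body.Contains(k)))
            .Select(m => m.Name)
            .ToList();
    }

    public static void Main()
    {
        var methods = new[]
        {
            (Name: "BuildIndex", Body: "var t = new Hashtable(); t.Add(k, v);"),
            (Name: "PrintBanner", Body: "Console.WriteLine(\"hello\");"),
            (Name: "SortGraph", Body: "var alg = new TSAlgorithm(g); alg.Compute();"),
        };
        var targets = new[]
        {
            "System.Collections.Hashtable",
            "QuickGraph.Algorithms.TSAlgorithm",
        };
        Console.WriteLine(string.Join(", ", RelevantMethods(methods, targets)));
    }
}
```

Only the method bodies that survive this filter would then be analyzed to extract sequences, which keeps the expensive analysis away from unrelated code.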
Problem: Concrete values in mined sequences are different
from desired values to achieve target states
Solution: Generalize sequences by replacing concrete values
with symbolic values
Method Under Test
class A {
    int f1 { set; get; }
    int f2 { set; get; }
    void CoverMe()
    {
        if (f1 != 10) return;
        if (f2 > 25)
            throw new Exception("bug");
    }
}
Mined Sequence for A
A obj = new A();
obj.setF1(14);
obj.setF2(-10);
obj.CoverMe();
Sequence cannot help in exposing bug since desired
values are f1=10 and f2>25
Replace concrete values 14 and -10 with symbolic values X1
and X2
Mined Sequence for A
A obj = new A();
obj.setF1(14);
obj.setF2(-10);
obj.CoverMe();
Generalized Sequence for A
int x1 = *, x2 = *;
A obj = new A();
obj.setF1(x1);
obj.setF2(x2);
obj.CoverMe();
Use DSE for generating desired values for X1 and X2
DSE explores CoverMe method and generates desired values
(X1 = 10 and X2 = 35)
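The effect of generalization can be shown with a small runnable sketch. Here the symbolic values become ordinary method parameters and the two value pairs are supplied by hand; in the approach described above, DSE would find the second pair. The property-based class A, the Sequence method, and ExposesBug are illustrative names.

```csharp
using System;

class A
{
    public int F1 { get; set; }
    public int F2 { get; set; }

    public void CoverMe()
    {
        if (F1 != 10) return;
        if (F2 > 25)
            throw new Exception("bug");
    }
}

static class GeneralizedSequenceDemo
{
    // Generalized sequence: the concrete values 14 and -10 are replaced by x1 and x2.
    public static void Sequence(int x1, int x2)
    {
        var obj = new A();
        obj.F1 = x1;
        obj.F2 = x2;
        obj.CoverMe();
    }

    // True if running the sequence with these values triggers the exception.
    public static bool ExposesBug(int x1, int x2)
    {
        try { Sequence(x1, x2); return false; }
        catch (Exception) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine(ExposesBug(14, -10)); // values from the mined sequence
        Console.WriteLine(ExposesBug(10, 35));  // values satisfying f1 == 10 and f2 > 25
    }
}
```

The mined values leave CoverMe at its first check, while the pair (10, 35) reaches the throw statement.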
 Randoop
Without assistance from MSeqGen: achieved 32% branch coverage
 → with assistance: achieved 86% branch coverage
 In evaluation, helps Randoop achieve 8.7% (maximum
20%) higher branch coverage

 Pex
Without assistance from MSeqGen: achieved 45% branch coverage
 → with assistance: achieved 86% branch coverage
 In evaluation, helps Pex achieve 17.4% (maximum
22.5%) higher branch coverage





Write assertions and Pex will try to break
them
Without assertions, Pex can only find
violations of runtime contracts causing
NullReferenceException,
IndexOutOfRangeException, etc.
Assertions leveraged in product and test code
Pex can leverage Code Contracts
Division of Labor

Test Generation
 Test inputs for PUT generated by tools (e.g., Pex)
 Fitnex: guided exploration of paths [DSN 09]
 MSeqGen: exploiting real-usage sequences [ESEC/FSE 09]

Test Oracles
 Assertions in PUT specified by developers
http://research.microsoft.com/pex
http://pexase.codeplex.com/
https://sites.google.com/site/asergrp/

http://research.microsoft.com/en-us/projects/contracts/

Library to state preconditions, postconditions,
invariants
Supported by two tools:

 Static Checker
 Rewriter: turns Code Contracts into runtime checks

Pex analyses the runtime checks
 Contracts act as Test Oracle


Pex may find counter examples for contracts
Missing Contracts may be suggested
Class invariant specification:
public class ArrayList {
    private Object[] _items;
    private int _size;
    ...
    [ContractInvariantMethod] // attribute comes with Contracts
    protected void Invariant() {
        Contract.Invariant(this._items != null);
        Contract.Invariant(this._size >= 0);
        Contract.Invariant(this._items.Length >= this._size);
    }
}
Unit test: while it is debatable what a ‘unit’ is, a ‘unit’
should be small.
 Integration test: exercises large portions of a system.


Observation: Integration tests are often “sold” as
unit tests
White-box test generation does not scale well to
integration test scenarios.
 Possible solution: Introduce abstraction layers, and
mock components not under test

AppendFormat(null, "{0} {1}!", "Hello", "World");
 → "Hello World!"
.NET Implementation:
public StringBuilder AppendFormat(
    IFormatProvider provider,
    char[] chars, params object[] args) {
    if (chars == null || args == null)
        throw new ArgumentNullException(…);
    int pos = 0;
    int len = chars.Length;
    char ch = '\x0';
    ICustomFormatter cf = null;
    if (provider != null)
        cf = (ICustomFormatter)provider.GetFormat(typeof(ICustomFormatter));
    …


Introduce a mock class which implements the interface.
Write assertions over expected inputs, provide concrete outputs
public class MFormatProvider : IFormatProvider {
public object GetFormat(Type formatType) {
Assert.IsTrue(formatType != null);
return new MCustomFormatter();
}
}

Problems:
 Costly to write detailed behavior by example
 How many and which mock objects do we need to write?


Introduce a mock class which implements the interface.
Let an oracle provide the behavior of the mock methods.
public class MFormatProvider : IFormatProvider {
public object GetFormat(Type formatType) {
…
object o = call.ChooseResult<object>();
return o;
}
}

Result: Relevant result values can be generated by white-box
test input generation tool, just as other test inputs can be
generated!

Chosen values can be shaped by assumptions
public class MFormatProvider : IFormatProvider {
public object GetFormat(Type formatType) {
…
object o = call.ChooseResult<object>();
PexAssume.IsTrue(o is ICustomFormatter);
return o;
}
}

(Note: Assertions and assumptions are “reversed” when
compared to parameterized unit tests.)
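A self-contained sketch of the idea follows. The choice mechanism is replaced by a plain Func<object> passed to the mock, standing in for the tool-provided call.ChooseResult; AssumptionViolatedException, UpperCaseFormatter, and MockDemo are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical marker for a violated assumption; a tool would discard such runs.
public sealed class AssumptionViolatedException : Exception { }

// A sample value the "oracle" might choose as the mock's result.
public sealed class UpperCaseFormatter : ICustomFormatter
{
    public string Format(string format, object arg, IFormatProvider formatProvider) =>
        arg == null ? "" : arg.ToString().ToUpperInvariant();
}

public sealed class MFormatProvider : IFormatProvider
{
    private readonly Func<object> chooseResult; // stands in for the oracle's choice

    public MFormatProvider(Func<object> chooseResult)
    {
        this.chooseResult = chooseResult;
    }

    public object GetFormat(Type formatType)
    {
        object o = chooseResult();
        // Assumption on the mock's OUTPUT: only custom formatters are of interest.
        if (!(o is ICustomFormatter))
            throw new AssumptionViolatedException();
        return o;
    }
}

static class MockDemo
{
    public static string Format(Func<object> choice) =>
        new StringBuilder()
            .AppendFormat(new MFormatProvider(choice), "{0} {1}!", "Hello", "World")
            .ToString();

    public static void Main()
    {
        Console.WriteLine(Format(() => new UpperCaseFormatter()));
        try { Format(() => "not a formatter"); }
        catch (AssumptionViolatedException) { Console.WriteLine("run discarded: assumption violated"); }
    }
}
```

The code under test (AppendFormat) stays unchanged; only the value returned by the mock varies between runs, so a test generator can treat it like any other input.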

Choices to build parameterized models
class PFileSystem : IFileSystem {
// cached choices
PexChosenIndexedValue<string,string> files;
string ReadFile(string name) {
var content = this.files[name];
if (content == null)
throw new FileNotFoundException();
return content;
}}
MSeqGen Evaluation
Subjects:
 QuickGraph
 Facebook
Research Questions:
 RQ1: Can our approach assist Randoop (random
testing tool) in achieving higher code coverage?
 RQ2: Can our approach assist Pex (DSE-based
testing tool) in achieving higher code coverage?
RQ1: Assisting Randoop
RQ2: Assisting Pex
 Legend:
 #c: number of classes
 P: branch coverage achieved by Pex
 P + M: branch coverage achieved by Pex and MSeqGen
void PexAssume.IsTrue(bool c) {
if (!c)
throw new AssumptionViolationException();
}
void PexAssert.IsTrue(bool c) {
if (!c)
throw new AssertionViolationException();
}


Assumptions and assertions induce branches
Executions which cause assumption violations are
ignored, not reported as errors or test cases
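How a test runner might treat the two exception kinds can be sketched as follows. The exception types and the Assume/Assert helpers are local stand-ins modeled on the snippet above, not the Pex classes themselves, and the Put method is a made-up parameterized test.

```csharp
using System;

public sealed class AssumptionViolationException : Exception { }
public sealed class AssertionViolationException : Exception { }

static class AssumeAssertDemo
{
    static void Assume(bool c) { if (!c) throw new AssumptionViolationException(); }
    static void Assert(bool c) { if (!c) throw new AssertionViolationException(); }

    // A parameterized test: for any non-empty array, the first element is not the magic value.
    public static void Put(int[] a)
    {
        Assume(a != null && a.Length > 0);
        Assert(a[0] != 1234567890);
    }

    // Classifies one execution the way the text describes.
    public static string Run(int[] a)
    {
        try { Put(a); return "passed"; }
        catch (AssumptionViolationException) { return "ignored"; } // not a test case, not an error
        catch (AssertionViolationException) { return "failed"; }   // reported as an error
    }

    public static void Main()
    {
        Console.WriteLine(Run(null));
        Console.WriteLine(Run(new[] { 0 }));
        Console.WriteLine(Run(new[] { 1234567890 }));
    }
}
```

Both helpers are just branches followed by a throw, which is why an exploration engine can target them like any other branch in the code.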

How to test this code?
(Actual code from .NET base class libraries)
[PexClass, TestClass]
[PexAllowedException(typeof(ArgumentNullException))]
[PexAllowedException(typeof(ArgumentException))]
[PexAllowedException(typeof(FormatException))]
[PexAllowedException(typeof(BadImageFormatException))]
[PexAllowedException(typeof(IOException))]
[PexAllowedException(typeof(NotSupportedException))]
public partial class ResourceReaderTest {
[PexMethod]
public unsafe void ReadEntries(byte[] data) {
PexAssume.IsTrue(data != null);
fixed (byte* p = data)
using (var stream = new UnmanagedMemoryStream(p, data.Length)) {
var reader = new ResourceReader(stream);
foreach (var entry in reader) { /* reading entries */ }
}
}
}

Exploration of constructor/mutator method
sequences

Testing with class invariants

Write class invariant as boolean-valued
parameterless method
 Refers to private fields
 Must be placed in implementation code

Exploration of valid states by setting
public/private fields
 May include states that are not reachable
Class invariant specification:
public class ArrayList {
    private Object[] _items;
    private int _size;
    ...
    [ContractInvariantMethod] // attribute comes with Contracts
    protected void Invariant() {
        Contract.Invariant(this._items != null);
        Contract.Invariant(this._size >= 0);
        Contract.Invariant(this._items.Length >= this._size);
    }
}
PUT:
[PexMethod]
public void ArrayListTest(ArrayList al, object o)
{
    int len = al.Count;
    al.Add(o);
    PexAssert.IsTrue(al[len] == o);
}
Generated Test:
[TestMethod]
public void Add01() {
    object[] os = new object[0];
    // create raw instance
    ArrayList arrayList =
        PexInvariant.CreateInstance<ArrayList>();
    // set private field via reflection
    PexInvariant.SetField<object[]>(arrayList, "_items", os);
    PexInvariant.SetField<int>(arrayList, "_size", 0);
    // invoke invariant method via reflection
    PexInvariant.CheckInvariant(arrayList);
    // call to PUT
    ArrayListTest(arrayList, null);
}