Z3: An Efficient SMT Solver

Pex: White Box Test Generation for .NET
Nikolai Tillmann, Microsoft Research
SMT 2008
A unit test is a small program with assertions.
void AddTest()
{
    HashSet<int> set = new HashSet<int>();
    set.Add(7);
    set.Add(3);
    Assert.IsTrue(set.Count == 2);
}
Many developers write such unit tests by hand. This involves
• determining a meaningful sequence of method calls,
• selecting exemplary argument values (the test inputs),
• stating assertions.
void AddSpec(int x, int y)
{
    HashSet<int> set = new HashSet<int>();
    set.Add(x);
    set.Add(y);
    Assert.AreEqual(x == y, set.Count == 1);
    Assert.AreEqual(x != y, set.Count == 2);
}
Parameterized Unit Tests are algebraic specifications!
Parameterized Unit Testing bridges the gap between
• Unit Testing, and
• the Design-By-Contract paradigm
Parameterized Unit Tests separate two concerns:
1) The specification of externally visible behavior (assertions)
2) The selection of internally relevant test inputs (coverage)
Test input generator
Pex starts from parameterized unit tests
Generated tests are emitted as traditional unit tests
Dynamic symbolic execution framework
Symbolic execution based on monitoring and re-execution
Whole-program, white-box code analysis
At the level of the .NET instructions (bytecode)
Support for “Java-like” programs as well as “unsafe” code
SMT-solver Z3 determines satisfying assignments for
constraint systems representing execution paths
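To illustrate "generated tests are emitted as traditional unit tests", the following minimal sketch shows what two emitted tests for AddSpec might look like. The test names AddSpec01 and AddSpec02, the chosen input values, and the Check helper (which stands in for the test framework's Assert.AreEqual) are assumptions made for this example, not actual Pex output.

```csharp
using System;
using System.Collections.Generic;

public static class AddSpecDemo
{
    // Parameterized unit test from the slides. A plain exception stands in
    // for the test framework's Assert.AreEqual (assumption of this sketch).
    public static void AddSpec(int x, int y)
    {
        var set = new HashSet<int>();
        set.Add(x);
        set.Add(y);
        Check(x == y, set.Count == 1);
        Check(x != y, set.Count == 2);
    }

    static void Check(bool expected, bool actual)
    {
        if (expected != actual)
            throw new InvalidOperationException("assertion failed");
    }

    // Hypothetical generated tests: parameterless methods that fix one
    // concrete input per case.
    public static void AddSpec01() { AddSpec(0, 0); }   // x == y
    public static void AddSpec02() { AddSpec(0, 1); }   // x != y

    public static void Main()
    {
        AddSpec01();
        AddSpec02();
        Console.WriteLine("both generated tests passed");
    }
}
```

Each emitted test is an ordinary parameterless method, so it can be run later without the generator.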
How to test this code?
(Real code from .NET base class libraries.)
Main challenge:
Making sure it does not crash
by writing many tests that cover the code
Possible test case, written by hand
Test input,
generated by Pex
Dynamic symbolic execution loop:
1) Initially, choose arbitrary test inputs
2) Run test and monitor the execution path
3) Record the path condition and add the path to the known paths
4) Choose an uncovered path
5) Build the constraint system for that path
6) Solve it to obtain new test inputs, then continue with step 2
Result: small test suite, high code coverage
Finds only real bugs
No false warnings
Example run:
Initial test inputs: a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 0; …
Recorded path condition: … ⋀ magicNum != 0x95673948
Constraint system for the uncovered path: … ⋀ magicNum == 0x95673948
Solved test inputs: a[0] = 206; a[1] = 202; a[2] = 239; a[3] = 190;
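The loop (choose inputs, run and monitor, record the path condition, flip an uncovered branch, solve, repeat) can be sketched in a few lines. This is a toy model under stated assumptions, not how Pex is implemented: the program under test (Run) is hypothetical and instrumented by hand, and a brute-force search over the range -100..100 stands in for the SMT solver.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DseSketch
{
    // One monitored branch: its id, its condition over the input, and the outcome.
    public sealed class Branch
    {
        public int Id; public Func<int, bool> Cond; public bool Taken;
    }

    // Hypothetical program under test; probe() plays the role of the monitor.
    public static List<Branch> Run(int x)
    {
        var path = new List<Branch>();
        Func<int, Func<int, bool>, bool> probe = (id, cond) =>
        {
            bool taken = cond(x);
            path.Add(new Branch { Id = id, Cond = cond, Taken = taken });
            return taken;
        };
        if (probe(0, v => v > 10))
        {
            if (probe(1, v => v % 7 == 3)) { /* deep branch */ }
        }
        return path;
    }

    static string Key(IEnumerable<Branch> path)
    {
        return string.Concat(path.Select(b => b.Id + (b.Taken ? "T;" : "F;")));
    }

    // Stand-in for the SMT solver: keep the prefix outcomes, negate the last one.
    static int? Solve(List<Branch> prefix, Branch flipped)
    {
        for (int v = -100; v <= 100; v++)
            if (prefix.All(b => b.Cond(v) == b.Taken) && flipped.Cond(v) != flipped.Taken)
                return v;
        return null;
    }

    public static List<int> Explore(int initial)
    {
        var known = new HashSet<string>();      // known paths
        var attempted = new HashSet<string>();  // flips already sent to the solver
        var pending = new Queue<int>();
        var tests = new List<int>();
        pending.Enqueue(initial);
        while (pending.Count > 0)
        {
            int input = pending.Dequeue();
            List<Branch> path = Run(input);               // run and monitor
            if (!known.Add(Key(path))) continue;          // nothing new covered
            tests.Add(input);
            for (int i = 0; i < path.Count; i++)          // choose uncovered paths
            {
                var prefix = path.Take(i).ToList();
                string target = Key(prefix) + path[i].Id + (path[i].Taken ? "F;" : "T;");
                bool covered = known.Any(k => k.StartsWith(target, StringComparison.Ordinal));
                if (covered || !attempted.Add(target)) continue;
                int? solution = Solve(prefix, path[i]);   // solve constraint system
                if (solution.HasValue) pending.Enqueue(solution.Value);
            }
        }
        return tests;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Explore(0)));   // prints "0, 11, 17"
    }
}
```

Starting from the arbitrary input 0, the sketch ends with three inputs, one per path of the toy program. Every input comes from an actual execution, which is the sense in which the approach reports no false warnings.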
Results in VS
Report: Coverage, path conditions
class Point { int X; int Y;
    public static int GetX(Point p) {
        if (p != null) return p.X;
        else return -1; } }
// Prologue: record concrete values, to have all information
// when this method is called with no proper context
ldtoken Point::GetX
call __Monitor::EnterMethod
brfalse L0
ldarg.0
call __Monitor::NextArgument<Point>
L0: .try {
  .try {
    // Calls will perform symbolic computation
    call __Monitor::LDARG_0
    ldarg.0
    call __Monitor::LDNULL
    ldnull
    call __Monitor::CEQ
    ceq
    // Calls to build path condition
    call __Monitor::BRTRUE
    brtrue L1
    call __Monitor::BranchFallthrough
    call __Monitor::LDARG_0
    ldarg.0
    ldtoken Point::X
    call __Monitor::LDFLD_REFERENCE
    ldfld Point::X
    call __Monitor::AtDereferenceFallthrough
    br L2
  L1:
    // Calls to build path condition
    call __Monitor::AtBranchTarget
    call __Monitor::LDC_I4_M1
    ldc.i4.m1
  L2:
    call __Monitor::RET
    stloc.0
    leave L4
  } catch NullReferenceException {
    call __Monitor::AtNullReferenceException
    rethrow
  }
  L4: leave L5
} finally {
  // Epilogue
  call __Monitor::LeaveMethod
  endfinally
}
L5: ldloc.0
ret
(The real C# compiler output is actually more complicated.)
…
Similar to representation of verification conditions in ESC/Java,
Spec#, …
Terms for
Primitive types (integers, floats, …): constants, unary and binary expressions
‘struct’ types: tuples
Instance fields of classes: mapping of references to values
Elements of arrays, memory accesses through pointers: mapping of integers to values
…
Goal: Efficient representation of evolving program states
Reduction of ground terms to constants
Sharing of syntactically equal sub-terms
BDDs over if-then-else terms to represent logical operations
Tries/Patricia trees to represent associative-commutative-with-unit operators
Normal form of polynomials
Update trees
Other simplification rules, e.g.
∀x. ceq(vtable(x, m1), m2) ⇒ ceq(objecttype(x), t)
where m2 overrides m1, and t is the sealed declaring type of m2
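Two of the listed ideas, reduction of ground terms to constants and sharing of syntactically equal sub-terms, can be illustrated with a small hash-consing term factory. This is only a sketch: the Term and TermFactory types are hypothetical, and a canonical argument order replaces the tries, BDDs and polynomial normal forms mentioned above.

```csharp
using System;
using System.Collections.Generic;

public sealed class Term
{
    public readonly int Id;
    public readonly string Op;      // "const", "var", or "add"
    public readonly long Value;     // used only when Op == "const"
    public readonly string Name;    // used only when Op == "var"
    public readonly Term Left, Right;

    internal Term(int id, string op, long value, string name, Term left, Term right)
    { Id = id; Op = op; Value = value; Name = name; Left = left; Right = right; }
}

public sealed class TermFactory
{
    // Every structurally distinct term is stored once and then shared.
    private readonly Dictionary<string, Term> interned = new Dictionary<string, Term>();

    private Term Intern(string key, string op, long value, string name, Term l, Term r)
    {
        Term t;
        if (!interned.TryGetValue(key, out t))
        {
            t = new Term(interned.Count, op, value, name, l, r);
            interned.Add(key, t);
        }
        return t;
    }

    public Term Const(long v) { return Intern("const:" + v, "const", v, null, null, null); }
    public Term Var(string n) { return Intern("var:" + n, "var", 0, n, null, null); }

    public Term Add(Term a, Term b)
    {
        // Ground terms reduce to constants.
        if (a.Op == "const" && b.Op == "const") return Const(a.Value + b.Value);
        // 0 is the unit of +.
        if (a.Op == "const" && a.Value == 0) return b;
        if (b.Op == "const" && b.Value == 0) return a;
        // Commutativity: order the arguments canonically before interning.
        if (a.Id > b.Id) { Term tmp = a; a = b; b = tmp; }
        return Intern("add:" + a.Id + ":" + b.Id, "add", 0, null, a, b);
    }
}

public static class TermDemo
{
    public static void Main()
    {
        var f = new TermFactory();
        Term x = f.Var("x"), y = f.Var("y");
        Console.WriteLine(ReferenceEquals(f.Add(x, y), f.Add(y, x)));  // True
        Console.WriteLine(f.Add(f.Const(2), f.Const(3)).Value);        // 5
        Console.WriteLine(ReferenceEquals(f.Add(x, f.Const(0)), x));   // True
    }
}
```

Because equal terms are the same object, equality checks on evolving program states reduce to reference comparisons.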
Problem:
Reachable code not known initially
No loop invariants, loops must be unfolded
Without guidance, symbolic execution may get stuck
unfolding the same loop forever
Solution:
Search strategies outside of SMT solver choose “next branch
to flip”
Fair choice between different strategies
Individual strategies based on program structure, including:
Fair choice of branch instructions
Fair choice of branch instructions + stack contexts
Fair choice of branch coverage
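One way to read "fair choice of branch instructions" is: among all candidate flips, prefer the branch instruction that has been flipped least often, so that a loop branch cannot starve the others. The sketch below models only that single strategy. The Candidate and FairBranchChooser types and the depth tie-breaker are assumptions of this example, and the fair choice between several strategies is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Candidate
{
    public int BranchId;   // static branch instruction in the program
    public int Depth;      // position in the recorded path where it would be flipped
}

public sealed class FairBranchChooser
{
    private readonly Dictionary<int, int> flips = new Dictionary<int, int>();

    private int Count(int id)
    {
        int n;
        return flips.TryGetValue(id, out n) ? n : 0;
    }

    // Removes and returns the next branch to flip from the frontier.
    public Candidate Choose(List<Candidate> frontier)
    {
        Candidate best = frontier
            .OrderBy(c => Count(c.BranchId))   // least-flipped instruction first
            .ThenBy(c => c.Depth)              // then the shallower occurrence
            .First();
        frontier.Remove(best);
        flips[best.BranchId] = Count(best.BranchId) + 1;
        return best;
    }
}

public static class FairChoiceDemo
{
    public static void Main()
    {
        // Branch 1 is a loop condition seen at three unfolding depths;
        // branch 2 occurs once, after the loop.
        var frontier = new List<Candidate>
        {
            new Candidate { BranchId = 1, Depth = 0 },
            new Candidate { BranchId = 1, Depth = 1 },
            new Candidate { BranchId = 1, Depth = 2 },
            new Candidate { BranchId = 2, Depth = 3 },
        };
        var chooser = new FairBranchChooser();
        var order = new List<int>();
        while (frontier.Count > 0) order.Add(chooser.Choose(frontier).BranchId);
        Console.WriteLine(string.Join(", ", order));   // prints "1, 2, 1, 1"
    }
}
```

A purely depth-ordered search would visit all occurrences of branch 1 before branch 2. Here branch 2 is picked right after the first flip of the loop branch.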
Independent constraint optimization + Constraint caching
(similar to EXE)
Idea: Related execution paths give rise to "similar"
constraint systems
Example: Consider x>y ⋀ z>0 vs. x>y ⋀ z<=0
If we already have a cached solution for a "similar"
constraint system, we can reuse it
x=1, y=0, z=1 is solution for x>y ⋀ z>0
we can obtain a solution for x>y ⋀ z<=0 by
reusing old solution of x>y: x=1, y=0
combining with solution of z<=0: z=0
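The reuse described above can be sketched as follows: split the constraint system into groups that share no variables, solve each group separately, and cache the result per group. The Constraint and CachingSolver types are hypothetical, and a brute-force search over the values -2..2 stands in for the real solver.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Constraint
{
    public string Text;                               // cache key, e.g. "x>y"
    public string[] Vars;                             // variables mentioned
    public Func<Dictionary<string, int>, bool> Holds; // evaluation under an assignment
}

public sealed class CachingSolver
{
    private readonly Dictionary<string, Dictionary<string, int>> cache =
        new Dictionary<string, Dictionary<string, int>>();
    public int SolverCalls;

    public Dictionary<string, int> Solve(List<Constraint> system)
    {
        var result = new Dictionary<string, int>();
        foreach (List<Constraint> group in Split(system))
        {
            string key = string.Join("&",
                group.Select(c => c.Text).OrderBy(t => t, StringComparer.Ordinal));
            Dictionary<string, int> part;
            if (!cache.TryGetValue(key, out part))
            {
                part = BruteForce(group);
                cache[key] = part;
            }
            if (part == null) return null;   // one unsatisfiable group suffices
            foreach (var kv in part) result[kv.Key] = kv.Value;
        }
        return result;
    }

    // Groups constraints so that different groups share no variable.
    private static List<List<Constraint>> Split(List<Constraint> system)
    {
        var groups = new List<List<Constraint>>();
        foreach (Constraint c in system)
        {
            var touching = groups
                .Where(g => g.Any(o => o.Vars.Intersect(c.Vars).Any())).ToList();
            var merged = new List<Constraint> { c };
            foreach (var g in touching) { merged.AddRange(g); groups.Remove(g); }
            groups.Add(merged);
        }
        return groups;
    }

    // Stand-in for the SMT solver: tries every assignment over [-2, 2].
    private Dictionary<string, int> BruteForce(List<Constraint> group)
    {
        SolverCalls++;
        string[] vars = group.SelectMany(c => c.Vars).Distinct().ToArray();
        var values = Enumerable.Repeat(-2, vars.Length).ToArray();
        while (true)
        {
            var a = new Dictionary<string, int>();
            for (int i = 0; i < vars.Length; i++) a[vars[i]] = values[i];
            if (group.All(c => c.Holds(a))) return a;
            int k = 0;
            while (k < values.Length && values[k] == 2) { values[k] = -2; k++; }
            if (k == values.Length) return null;
            values[k]++;
        }
    }
}

public static class CachingDemo
{
    public static void Main()
    {
        var xy = new Constraint { Text = "x>y", Vars = new[] { "x", "y" }, Holds = a => a["x"] > a["y"] };
        var zPos = new Constraint { Text = "z>0", Vars = new[] { "z" }, Holds = a => a["z"] > 0 };
        var zNonPos = new Constraint { Text = "z<=0", Vars = new[] { "z" }, Holds = a => a["z"] <= 0 };
        var solver = new CachingSolver();
        solver.Solve(new List<Constraint> { xy, zPos });
        solver.Solve(new List<Constraint> { xy, zNonPos });
        Console.WriteLine(solver.SolverCalls);   // prints 3: "x>y" was solved only once
    }
}
```

The second query reuses the cached assignment for x>y and only solves z<=0 anew, mirroring the example on the slide.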
Decision procedures for uninterpreted functions with
equalities, linear integer arithmetic, bitvector
arithmetic, arrays, tuples
Support for universal quantifiers
Used to model custom theories, e.g. .NET type system
Model generation
Models used as test inputs
Incremental solving
Push / Pop of contexts for model minimization
Programmatic API
For small constraint systems, text through pipes would add
huge overhead
Problem:
Pex can collect constraints over private fields,
constraint solver determines assignment for private
fields
How to bring object into desired state?
Private fields cannot be initialized freely, but only through
constructor and other methods
Approach taken by Pex:
Automatic selection of constructor and state-modifying
methods based on static code analysis
Exploration of constructor and methods to find nonexceptional paths
static class PexAssume {
    public static void IsTrue(bool c) {
        if (!c) throw new AssumptionViolationException();
    }
}
static class PexAssert {
    public static void IsTrue(bool c) {
        if (!c) throw new AssertionViolationException();
    }
}
Assumptions and assertions are explored just like all
other branches
Executions which cause assumption violations are
ignored, not reported as errors or test cases
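A minimal sketch of how executions might be classified under these rules. The two exception classes and the PexAssume/PexAssert stand-ins are defined locally so that the example is self-contained. DivSpec, IncSpec and Classify are hypothetical names, not part of Pex.

```csharp
using System;

public class AssumptionViolationException : Exception { }
public class AssertionViolationException : Exception { }

public static class PexAssume
{
    public static void IsTrue(bool c) { if (!c) throw new AssumptionViolationException(); }
}

public static class PexAssert
{
    public static void IsTrue(bool c) { if (!c) throw new AssertionViolationException(); }
}

public static class AssumeDemo
{
    // Hypothetical parameterized test: only meaningful for y != 0.
    public static void DivSpec(int x, int y)
    {
        PexAssume.IsTrue(y != 0);
        PexAssert.IsTrue((x / y) * y + (x % y) == x);
    }

    // Hypothetical parameterized test with a wrong assertion:
    // x + 1 wraps around for int.MaxValue.
    public static void IncSpec(int x)
    {
        PexAssert.IsTrue(unchecked(x + 1) > x);
    }

    // Classifies one execution the way the slide describes.
    public static string Classify(Action run)
    {
        try { run(); return "passing test"; }
        catch (AssumptionViolationException) { return "ignored"; }
        catch (AssertionViolationException) { return "failing test"; }
    }

    public static void Main()
    {
        Console.WriteLine(Classify(() => DivSpec(7, 2)));            // passing test
        Console.WriteLine(Classify(() => DivSpec(7, 0)));            // ignored
        Console.WriteLine(Classify(() => IncSpec(int.MaxValue)));    // failing test
    }
}
```

Both helpers are ordinary branches followed by a throw, so an exploration engine can treat them like any other condition in the code.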
AppendFormat(null, "{0} {1}!", "Hello", "World"); → "Hello World!"
.NET implementation:
public StringBuilder AppendFormat(
    IFormatProvider provider,
    char[] chars, params object[] args) {
    if (chars == null || args == null)
        throw new ArgumentNullException(…);
    int pos = 0;
    int len = chars.Length;
    char ch = '\x0';
    ICustomFormatter cf = null;
    if (provider != null)
        cf = (ICustomFormatter)provider.GetFormat(
            typeof(ICustomFormatter));
    …
Introduce a mock class which implements the interface.
Write assertions over expected inputs, provide concrete outputs
public class MFormatProvider : IFormatProvider {
    public object GetFormat(Type formatType) {
        Assert.IsTrue(formatType != null);
        return new MCustomFormatter();
    }
}
Problems:
Costly to write detailed behavior by example
How many and which mock objects do we need to write?
Introduce a mock class which implements the interface.
Let an oracle provide the behavior of the mock methods.
public class MFormatProvider : IFormatProvider {
    public object GetFormat(Type formatType) {
        …
        object o = call.ChooseResult<object>();
        return o;
    }
}
Result: Relevant result values can be generated by white-box test input
generation tool, just as other test inputs can be generated!
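The oracle idea can be sketched with a scripted choice source. Everything here except IFormatProvider and ICustomFormatter is hypothetical: IChoiceOracle and ScriptedOracle stand in for the choice mechanism elided on the slide, and Describe is a simplified consumer modeled on the AppendFormat fragment, not the real implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the oracle that supplies mock results.
public interface IChoiceOracle { T ChooseResult<T>(); }

// Replays a fixed list of choices; a test generator would compute them instead.
public sealed class ScriptedOracle : IChoiceOracle
{
    private readonly Queue<object> script;
    public ScriptedOracle(object[] choices) { script = new Queue<object>(choices); }
    public T ChooseResult<T>() { return (T)script.Dequeue(); }
}

public sealed class MFormatProvider : IFormatProvider
{
    private readonly IChoiceOracle call;
    public MFormatProvider(IChoiceOracle oracle) { call = oracle; }

    public object GetFormat(Type formatType)
    {
        object o = call.ChooseResult<object>();   // behavior decided by the oracle
        return o;
    }
}

public sealed class UpperFormatter : ICustomFormatter
{
    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        return Convert.ToString(arg).ToUpperInvariant();
    }
}

public static class MockDemo
{
    // Simplified code under test (assumption of this sketch).
    public static string Describe(IFormatProvider provider, object arg)
    {
        ICustomFormatter cf = null;
        if (provider != null)
            cf = (ICustomFormatter)provider.GetFormat(typeof(ICustomFormatter));
        return cf != null ? cf.Format(null, arg, provider) : Convert.ToString(arg);
    }

    public static void Main()
    {
        var returnsNull = new MFormatProvider(new ScriptedOracle(new object[] { null }));
        var returnsFormatter = new MFormatProvider(
            new ScriptedOracle(new object[] { new UpperFormatter() }));
        Console.WriteLine(Describe(returnsNull, "hi"));        // hi
        Console.WriteLine(Describe(returnsFormatter, "hi"));   // HI
    }
}
```

The two scripted choices drive Describe down different paths without a hand-written behavior for each case. That is the role the test input generator plays when it picks the oracle's results.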
We applied Pex on a core .NET component
Already extensively tested for several years
Assertions written by developers
>10,000 public methods
>100,000 basic blocks
Sandbox
Restriction of access to external resources (files, registry,
unsafe code, …)
10 machines (P4, 2 GHz, 2 GB RAM) running for 3 days
Exploration started from simple, generated parameterized unit
tests (one per public method); assertions embedded in code
Coverage achieved:
43% block coverage
36% arc coverage
Errors found:
A significant number of benign errors, e.g.
NullReferenceException, IndexOutOfRangeException, …
17 unique errors involving
violation of developer-written assertions,
exhaustion of memory,
other serious issues.
Automatically achieved coverage on selected classes for core .NET component

Classname                      Blocks  Hit   Arcs   Hit
A (mostly stateless methods)   >300    95%   >400   90%
B (mostly stateless methods)   >100    97%   >200   94%
C (stateful)                   >200    76%   >300   65%
D (parsing code)               >500    81%   >800   73%
E (numerical algorithm)        >400    71%   >600   67%
F (numerical algorithm)        >100    82%   >200   79%
G (numerical algorithm)        >100    98%   >100   97%
H (numerical algorithm)        >200    71%   >200   61%
I (numerical algorithm)        >200    97%   >300   96%
Assumption: Environment is deterministic
"Environment" includes all code that is not monitored, e.g.
native code, uninstrumented code
Pex prunes non-deterministic behavior
Assumption: Program is single-threaded
Potential solution: control and explore thread scheduling like
all other test inputs
Limitations of constraint solver
Z3 has no built-in theories for floating point arithmetic
approximation with rationals (linear arithmetic only)
Bounds on Z3's time and memory consumption
Goal: Test-input generation for programs with contracts
(preconditions, postconditions, invariants, etc.)
In Verisoft project, compiler generates Boogie or MSIL programs
from C code annotated with contracts
MSIL programs embed most contracts in executable form
These contracts are turned into constraints by Pex, which
performs a path-sensitive analysis
Challenge: Non-executable contracts
Quantifiers: may range over “all integers” or “all pointers”
Predicates for memory-safety: do not translate directly into machine-observable
behavior
Better scalability
More sophisticated search-frontiers (e.g. based on fitness
function that determines distance to target state)
Summarizing execution paths instead of exploring them
(TACAS'08)
Inference of likely invariants/contracts
(DySy, ICSE'08)
Dealing with multi-threaded programs
Controlling the scheduler
Systematically exploring all relevant thread interleavings
Race detection
Tom Ball et al. are building such analyses on the Pex framework
(ManagedChess)
Program model checkers
JPF, Kiasan/KUnit (Java), XRT (.NET)
Combining random testing and constraint solving
DART (C), CUTE (C), EXE (C),
jCUTE (Java),
SAGE (X86)
…
Parameterized Unit Tests separate two concerns
Specification of externally visible behavior
Selection of test inputs to cover internal behavior
Pex automates test input generation
Uses SMT-solver Z3
Dynamic Symbolic Execution platform for .NET
Used internally in Microsoft to test core .NET
components
Pex is publicly available for academic use.
http://research.microsoft.com/Pex
Most interesting programs are beyond the scope of static symbolic
execution.
Calls to external world
Unmanaged x86 code
Unsafe managed .NET code (with pointers)
Safe managed .NET code
Dynamic symbolic execution will systematically explore the conditions in
the code which the constraint solver understands.
And happily ignore everything else, e.g.
Calls to native code
Difficult constraints (e.g. precise semantics of floating point arithmetic)
Result: Under-approximation, which is appropriate for testing
When generating test inputs for any method, e.g.
DateTime ParseDateTime(string s) { … }
a regression test suite can be generated, where each test
asserts the observed behavior.
void ParseDateTimeTest132() {
    DateTime result = ParseDateTime("6/19/2008");
    Assert.IsTrue(result.ToString() == "06/19/2008");
}
XRT: Exploring Runtime
Interpreter for .NET programs
Static symbolic execution
Used Simplify to determine unsatisfiability of path constraints
Successful for self-contained programs
Used today on a large scale within Microsoft for quality
assurance purposes as the core of the model-based testing
tool “Spec Explorer 2007”.
Does not work well for real-world programs
All environment behavior must be modeled
Modeling of entire environment is often not feasible