Static Regression Testing

SymDiff: Leveraging Program
Verification for Comparing Programs
Shuvendu Lahiri
Research in Software Engineering (RiSE),
Microsoft Research, Redmond
Jointly with
Chris Hawblitzel (Microsoft Research, Redmond), Ming
Kawaguchi (UCSD), Henrique Rebelo (UFPE)
VSSE Workshop, 2012
Motivation
Ensuring compatibility
– Programmers spend a large fraction of their time
ensuring (read praying) compatibility after changes
• Does the refactoring change any observable behavior?
• How does the feature addition impact existing features?
• Does my bug-fix introduce a regression?
Microsoft Confidential
Compatibility: applications
f() { Print(foo);
g(); }
g() { ...
Print(foo); }
g() { ...
Print(foo);
Print(bar); }
• Bug fixes
• Refactoring
• Version control
• Library API changes
• New features
• Compilers
Compatibility: Microsoft
• Products
– Windows APIs (Win32, ntdll)
– Driver development kits
– .NET frameworks, Base class library
– Compilers (C#, JIT, …)
– …
Every developer/tester/auditor
• Windows updates
– Security patches
– Bug fixes
Problem
• Use static analysis to
– Improve the productivity of users trying to ensure compatibility across program changes
• Potential benefits
– Agility: fewer regressions, higher confidence in changes, smarter code review, ..
Challenge
• Equivalence checking is too strong a spec
– Most changes modify behavior
• Hard to formalize (separate expected changes
from unexpected changes)
– Refactoring → behaviors intact
– Bug fix → non-buggy behaviors intact
– Feature add → existing feature behaviors intact
– API change → ??
– Data change → ??
– Config changes → ??
– …
Challenge → Opportunity
• Hard to formalize (separate expected changes
from unexpected changes)
– Refactoring → behaviors intact
– Bug fix → non-buggy behaviors intact
– Feature add → existing feature behaviors intact
– …
• Highlight “unexpected” changes
Our approach
– Provide a tool for performing semantic diff (diff over
behaviors)
• Does the refactoring change any observable behavior?
• How does the feature addition impact existing features?
• Does my bug-fix introduce a regression?
Semantic Diff
What is SymDiff?
A framework to
– Leverage and extend program
verification for providing relative
correctness
Overview
• Demo
• Semantic diff
– Tool (in current form)
• An application
– Compiler compatibility
• Making SymDiff extensible with contracts
– Users can express “expected” changes
– Mutual summaries and relative termination
Demo
1. Eval (bug1)
2. Eval (func)
3. StringCopy (bug fix)
4. Recursive example
SymDiff tool
SymDiff
– Apply and extend program verification techniques towards comparing programs
– Current form: checks input/output partial equivalence [CAV ’12 tool paper]
SymDiff tool: language independent
S1 (C/.NET/x86/ARM) → Boogie P1
S2 (C/.NET/x86/ARM) → Boogie P2
P1, P2 → SymDiff (Boogie + Z3) → P1 = P2 or P1 ≠ P2
Works at Boogie intermediate language
Boogie
• Simple intermediate verification language
– [Barnett et al. FMCO’05]
• Commands
– x := E                     // assign
– havoc x                    // change x to an arbitrary value
– assert E                   // if E holds, skip; otherwise, go wrong
– assume E                   // if E holds, skip; otherwise, block
– S; T                       // execute S, then T
– goto L1, L2, …, Ln         // non-deterministic jump to labels
– call x := Foo(e1, e2, ..)  // procedure call
Boogie (contd.)
• Two types of expressions
– Scalars (bool, int, ref, ..)
– Arrays ([int]int, [ref]ref, …)
• Array expression sugar for SMT array theory
– x[i] := y  →  x := upd(x, i, y)
– y := x[i]  →  y := sel(x, i)
• Procedure calls sugar for modular specification
procedure Foo();
requires pre;
ensures post;
modifies V;
call Foo();  →  assert pre; havoc V; assume post;
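The array sugar rests on the two standard select/update axioms of the SMT array theory: reading the index just written returns the written value, and every other index is unchanged. A minimal Python sketch, assuming arrays are modeled as immutable dicts (the names `sel` and `upd` mirror the slide; the dict encoding and the default value are illustrative assumptions, not how Boogie or Z3 represent arrays):

```python
def upd(x, i, y):
    """Return a new array equal to x except that index i maps to y."""
    new = dict(x)          # copy, so the old array value is untouched
    new[i] = y
    return new

def sel(x, i, default=0):
    """Read index i of array x (unset indices read as `default` in this sketch)."""
    return x.get(i, default)

heap = {1: 5}
heap2 = upd(heap, 3, 42)   # desugaring of  heap[3] := 42
value = sel(heap2, 3)      # desugaring of  value := heap[3]
```

Treating `upd` as a pure function is what lets a verifier reason about the heap before and after a write as two separate values.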
Basic equivalence checking
void swap1(ref int x, ref int y){
int z = x;
x = y;
y = z;
}
void swap2(ref int x, ref int y){
x = x + y;
y = x - y;
x = x - y;
}
z0 == x0 &&
x1 == y0 &&
y1 == z0 &&
swap1.x == x1 && swap1.y == y1
&&
x1' == x0 + y0 &&
y1' == x1' - y0 &&
x2' == x1' - y1' &&
swap2.x == x2' && swap2.y == y1'
&&
~(swap1.x == swap2.x &&
swap1.y == swap2.y)
→ Z3 theorem prover
– UNSAT (Equivalent)
– SAT (Counterexample)
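The query above asks whether any input makes the two outputs differ. As a rough stand-in for the Z3 call, the sketch below searches a small finite range of integers for such an input. This is a bounded brute-force check, not the symbolic check SymDiff performs; `swap_broken` and `find_counterexample` are hypothetical names added for illustration.

```python
from itertools import product

def swap1(x, y):
    z = x
    x = y
    y = z
    return x, y

def swap2(x, y):
    x = x + y
    y = x - y
    x = x - y
    return x, y

def swap_broken(x, y):
    # Hypothetical faulty variant: the last subtraction is missing.
    x = x + y
    y = x - y
    return x, y

def find_counterexample(p1, p2, values):
    """Return an input on which p1 and p2 disagree, or None if none is found."""
    for x, y in product(values, repeat=2):
        if p1(x, y) != p2(x, y):
            return (x, y)
    return None

values = range(-8, 9)
equivalent = find_counterexample(swap1, swap2, values) is None   # "UNSAT" on this range
witness = find_counterexample(swap1, swap_broken, values)        # "SAT": a differing input
```

A `None` result only says no difference exists inside the searched range; the SMT encoding covers all inputs at once.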
Handling procedure calls
• Modular checking
– Assume “matched” callees are deterministic and
have the same I/O behaviors
– Modeled by uninterpreted functions [Necula ‘00,
…, Godlin & Strichman ‘08, …..]
• Addition of postcondition for Foo, Foo’
procedure Foo(x) returns (ret);
modifies g;
free ensures g == UF_Foo_g(x, old(g));
free ensures ret == UF_Foo_ret(x, old(g));
procedure Foo’(x) returns (ret);
modifies g;
free ensures g == UF_Foo_g(x, old(g));
free ensures ret == UF_Foo_ret(x, old(g));
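The effect of these postconditions can be imitated with a function whose values are unknown but fixed: equal arguments always give equal results. A minimal sketch, assuming symbolic string tokens stand in for the unknown values (the class `UninterpretedFunction` and the three callers are hypothetical, added only to illustrate the idea):

```python
import itertools

class UninterpretedFunction:
    """Unknown but deterministic function: same arguments, same (opaque) result."""
    def __init__(self, name):
        self.name = name
        self._table = {}
        self._fresh = itertools.count()

    def __call__(self, *args):
        if args not in self._table:
            # First time we see these arguments: invent a fresh opaque value.
            self._table[args] = f"{self.name}#{next(self._fresh)}"
        return self._table[args]

UF_Foo_ret = UninterpretedFunction("UF_Foo_ret")
UF_Foo_g = UninterpretedFunction("UF_Foo_g")

def foo_summary(x, g):
    """Summary shared by Foo and Foo': returns (ret, new value of g)."""
    return UF_Foo_ret(x, g), UF_Foo_g(x, g)

def caller_v1(x, g):
    return foo_summary(x, g)

def caller_v2(x, g):
    y = x                      # refactored: argument copied through a temporary
    return foo_summary(y, g)

def caller_v3(x, g):
    return foo_summary(x + 1, g)   # changed: calls the callee with a different argument
```

`caller_v1` and `caller_v2` agree without knowing anything about the body of Foo, while `caller_v3` yields different opaque values, so equivalence cannot be concluded for it.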
Modeling C/Java/C#/x86 → Boogie
• Separation of concerns
– Front end can be developed independently
– Quite a few already exist
• HAVOC/VCC for C, Spec#/BCT for .NET, ?? for Java, …
• Heap usually modeled by arrays
– x.f := y → Heap_f[x] := y
• Challenges
– Deterministic modeling of I/O, malloc, …..
– The entire heap is passed around
Application: Compiler compatibility
Compiler validation
[Diagram: Source compiled to ARM, ARM+opt, X86 and X86+opt, across compiler versions v1, v2, v3, v4]
Compatibility: x86 vs. x86 example
X86+opt, v2 vs. v3

G01:
    push ESI
    mov  ESI, EDX
G02:
    and  ESI, 255 254
    push ESI
    mov  EDX, 0x100000
    call WriteInternalFlag2(int,bool)
G03:
    pop  ESI
    ret

G01:
    mov  EAX, EDX
G02:
    and  EAX, 255
    push EAX
    mov  EDX, 0x100000
    call WriteInternalFlag2(int,bool)
__epilog:
    ret
Large x86 vs. ARM example
Beyond equivalence
Type of change                   | Check
Refactoring/Optimizations        | In1 = In2 ==> Out1’ = Out2’
Bug fix                          | In1 = In2 ==> (Fail1’ || Out1’ = Out2’)
Feature addition                 | In1 = In2 ==> (UnImplemented1’ || Out1’ = Out2’)
Performance optimization         | In1 = In2 ==> (Measure2’ <= Measure1’)
Differential assertion checking  | In1 = In2 ==> (Fail1’ || ~Fail2’)
(DAC) (see POPL’12 on “Interleaved Bugs ….”)
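Three of the checks in the table can be written as predicates over pairs of runs on equal inputs. The sketch below tests them over an explicit list of inputs, which is only a testing-style approximation of the static check. The helper `run`, the two `mean` versions and the input list are hypothetical, and "Fail" is modeled as a raised exception.

```python
def run(program, x):
    """Run program on x and return (failed, output)."""
    try:
        return False, program(x)
    except Exception:
        return True, None

def check_equivalence(p1, p2, inputs):
    """In1 = In2 ==> Out1' = Out2' (a failure counts as a distinct outcome)."""
    return all(run(p1, x) == run(p2, x) for x in inputs)

def check_bug_fix(p1, p2, inputs):
    """In1 = In2 ==> (Fail1' || Out1' = Out2')."""
    for x in inputs:
        fail1, out1 = run(p1, x)
        fail2, out2 = run(p2, x)
        if not (fail1 or (not fail2 and out1 == out2)):
            return False
    return True

def check_dac(p1, p2, inputs):
    """In1 = In2 ==> (Fail1' || ~Fail2'): the new version fails only where the old one did."""
    return all(run(p1, x)[0] or not run(p2, x)[0] for x in inputs)

def mean_v1(xs):
    return sum(xs) // len(xs)                  # fails on an empty sequence

def mean_v2(xs):
    return sum(xs) // len(xs) if xs else 0     # bug fix: empty sequence handled

inputs = [(), (1,), (2, 4), (1, 2, 3)]
```

On these inputs the two versions are not equivalent, yet the bug-fix and DAC checks both pass, which is the distinction the table draws.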
Contracts over two programs
• Need an extensible contract mechanism for
comparing two programs
– Generalization of pre/post conditions
• Why
– Allow users to express relative correctness
specifications (e.g. conditional equivalence)
– Automated methods may not always suffice (even for
equivalence checking)
• Challenge
– Should be able to leverage SMT-based program
verifiers
Mutual summaries
– An extensible framework for interprocedural
program comparison
• Prior work (mostly automated):
– Intraprocedural
• Translation validation [Pnueli et al. ‘98, Necula ‘00, Zuck
et al. ’05,…]
– Coarse interprocedural (only track equalities)
• Regression verification [Strichman et al. ‘08]
Mutual summaries
– [MSR-TR-2011-112]
• Mutual summaries (MS)
• Relative termination (RT)
• Dealing with loops and unstructured goto
Example: Feature addition
int f1(int x1){
a1 = A1[x1]; a2 = A2[x1];
if (Op[x1] == 0)
return Val[x1];
else if (Op[x1] == 1)
return f1(a1) + f1(a2);
else if (Op[x1] == 2)
return f1(a1) - f1(a2);
else
return 0;
}
int f2(int x2, bool isU){
a1 = A1[x2]; a2 = A2[x2];
if (Op[x2] == 0) return Val[x2];
else if (Op[x2] == 1){
if (isU) return uAdd(f2(a1, T), f2(a2, T));
else return f2(a1, F) + f2(a2, F);
}
else if (Op[x2] == 2){
if (isU) return uSub(f2(a1, T), f2(a2, T));
else
return f2(a1, F) - f2(a2, F);
}
else return 0;
}
Mutual summaries
void F1(int x1){
if(x1 < 100){
g1 := g1 + x1;
F1(x1 + 1);
}
}
void F2(int x2){
if(x2 < 100){
g2 := g2 + 2*x2;
F2(x2 + 1);
}
}
MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1’ <= g2’
• What is a mutual summary MS(F1, F2)?
– A formula over two copies of
• parameters, globals (g), returns and next state of globals (g’)
• What does a mutual summary MS(F1, F2)
mean?
– For any pre/post state pairs (s1,t1) of F1, and
(s2,t2) of F2, (s1,t1,s2,t2) satisfy MS(F1,F2)
Example
For f1 and f2 from the feature-addition example:
MS(f1, f2) = (x1 == x2 && !isU) ==> ret1 == ret2
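The mutual summary MS(f1, f2) for the feature-addition example can be exercised on a small expression table. The sketch below is only a test over one hypothetical table, not a proof. It assumes that `uAdd` and `uSub` are 32-bit unsigned (wraparound) operations, which the slides do not specify, and the table contents are made up for illustration.

```python
MASK = 2 ** 32

def uAdd(a, b):
    # Assumption: the "U" operators are 32-bit unsigned arithmetic.
    return (a + b) % MASK

def uSub(a, b):
    return (a - b) % MASK

# Hypothetical expression table: node i has operator Op[i] and operands A1[i], A2[i].
Op = [0, 0, 1, 2, 3, 2]
Val = [5, 7, 0, 0, 0, 0]
A1 = [0, 0, 0, 2, 0, 0]
A2 = [0, 0, 1, 1, 0, 1]

def f1(x1):
    a1, a2 = A1[x1], A2[x1]
    if Op[x1] == 0:
        return Val[x1]
    elif Op[x1] == 1:
        return f1(a1) + f1(a2)
    elif Op[x1] == 2:
        return f1(a1) - f1(a2)
    return 0

def f2(x2, isU):
    a1, a2 = A1[x2], A2[x2]
    if Op[x2] == 0:
        return Val[x2]
    elif Op[x2] == 1:
        return uAdd(f2(a1, True), f2(a2, True)) if isU else f2(a1, False) + f2(a2, False)
    elif Op[x2] == 2:
        return uSub(f2(a1, True), f2(a2, True)) if isU else f2(a1, False) - f2(a2, False)
    return 0

def ms_f1_f2(x1, x2, isU):
    """(x1 == x2 && !isU) ==> ret1 == ret2"""
    return not (x1 == x2 and not isU) or f1(x1) == f2(x2, isU)

nodes = range(len(Op))
summary_holds = all(ms_f1_f2(x1, x2, isU) for x1 in nodes for x2 in nodes for isU in (False, True))
```

With `isU` set, node 5 (5 minus 7) gives different results in the two versions, which is why the summary only constrains the `!isU` case.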
Checking mutual summaries
• Given F1, F2, MS(F1, F2), define the following
procedure:
void CheckMS_F1_F2(int x1, int x2){
inline F1(x1);
inline F2(x2);
assert MS(F1,F2);
}
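The CheckMS procedure can be mimicked by running both bodies and asserting the summary on the resulting pre/post states. A minimal sketch for the F1/F2 pair from the mutual-summaries slides, with the globals g1 and g2 passed explicitly as parameters (an encoding chosen here for testability) and a small hand-picked set of inputs instead of the symbolic check:

```python
def F1(x1, g1):
    if x1 < 100:
        g1 = g1 + x1
        g1 = F1(x1 + 1, g1)
    return g1

def F2(x2, g2):
    if x2 < 100:
        g2 = g2 + 2 * x2
        g2 = F2(x2 + 1, g2)
    return g2

def MS_F1_F2(x1, g1, g1_post, x2, g2, g2_post):
    """(x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1' <= g2'"""
    pre = x1 == x2 and g1 <= g2 and x1 >= 0
    return (not pre) or g1_post <= g2_post

def check_ms_F1_F2(x1, g1, x2, g2):
    """Counterpart of CheckMS_F1_F2: run F1, run F2, then test the summary."""
    return MS_F1_F2(x1, g1, F1(x1, g1), x2, g2, F2(x2, g2))

xs = (0, 1, 50, 99, 100, 101)
gs = range(-2, 3)
violations = [
    (x1, g1, x2, g2)
    for x1 in xs for g1 in gs for x2 in xs for g2 in gs
    if not check_ms_F1_F2(x1, g1, x2, g2)
]
```

The `x1 >= 0` conjunct matters: starting from x = -200 with g1 = g2 = 0, F1 ends with a larger global than F2, so the conclusion fails outside the stated precondition.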
Modular checking: Instrumentation
1. Add “summary relations” R_F1, and R_F2
void F1(int x1);
ensures R_F1(x1, old(g1)/g1, g1/g1’);
2. Use the summary relations to assume mutual
summaries at call sites:
axiom (forall x1, g1, g1’, x2, g2, g2’::
{R_F1(x1, g1, g1’), R_F2(x2, g2, g2’)}
(R_F1(x1, g1, g1’) && R_F2(x2, g2, g2’))
==>
MS_F1_F2(x1, g1, g1’, x2, g2, g2’)
);
Leveraging program verifiers
• Mutual Summary checking
– Encode using contracts (postconditions), axioms
– Verification condition generation (Boogie)
– Checking using SMT solver (Z3)
• Next steps
– Inferring the mutual summaries
Relative termination
• Specification relating the terminating
behaviors of P2 wrt P1
• Not just for proving termination
– Required for composing transformations
– MS1(f,f’) && MS2(f’,f’’) ==> (MS1 ∘ MS2)(f,f’’)
– E.g. P_Eq(f,f’) && P_Eq(f’,f’’) ==> P_Eq(f,f’’)
Relative termination condition
void F1(int x1){
if(x1 < 100){
g1 := g1 + x1;
F1(x1 + 1);
}
}
void F2(int x2){
if(x2 < 100){
g2 := g2 + 2*x2;
F2(x2 + 1);
}
}
RT(F1, F2): (x1 <= x2)
• What is a relative termination condition
RT(F1, F2)?
– A formula over two copies of
• parameters, globals (g)
• What does a relative termination condition
RT(F1, F2) mean?
– For a pair of input states (s1,s2), if F1 terminates on s1,
and (s1,s2) satisfies RT(F1,F2), then F2 terminates on s2
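Both F1 and F2 terminate on every integer input, so termination itself cannot be observed by running them. The sketch below uses a call budget as a proxy: whenever F1 finishes within a budget and RT(F1, F2) holds, F2 finishes within the same budget. This budgeted reading is an assumption made for illustration and is stronger than the definition above; `finishes_within` is a hypothetical helper that follows the control skeleton shared by F1 and F2.

```python
def finishes_within(x, budget):
    """True if the recursion 'if (x < 100) F(x + 1)' returns using at most `budget` calls."""
    calls = 0
    while True:
        calls += 1
        if calls > budget:
            return False       # budget exhausted before the recursion bottomed out
        if x < 100:
            x += 1             # models the recursive call F(x + 1)
        else:
            return True

def RT_F1_F2(x1, x2):
    return x1 <= x2

def rt_respected(xs, budgets):
    """F1 finishing within a budget, plus RT, implies F2 finishes within the same budget."""
    return all(
        finishes_within(x2, b)
        for x1 in xs for x2 in xs for b in budgets
        if RT_F1_F2(x1, x2) and finishes_within(x1, b)
    )

ok = rt_respected(range(90, 105), range(0, 20))
```

When RT is violated the implication can break: with x1 = 100 and x2 = 90, F1 returns after one call while F2 needs more.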
What about loops?
int Foo2() {
i = 0;
if (n > 0) {
t = g;
v = 3;
do2:
a[i] := v;
i := i + 1;
v := v + t;
While2: //FLABEL
if (i < n)
goto do2;
}
return i;
}
int Foo2() {
i = 0;
if (n > 0) {
t = g;
v = 3;
do2:
a[i] := v;
i := i + 1;
v := v + t;
return While2(i, t, v);
}
return i;
}
(int, int) While2(i2, t2, v2) {
i2' := i2;
v2' := v2;
if (i2' < n) {
a2[i2'] := v2';
i2' := i2' + 1;
v2' := v2' + t2;
return While2(i2', t2, v2');
}
return (i2', v2');
}
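The loop extraction can be sanity-checked by running the original do-while form against the version whose loop has become the tail-recursive procedure While2. A minimal sketch, assuming `n` and `g` are passed as parameters and the array `a` is a dict returned with the result (these encodings are choices made here, not part of the slides):

```python
def foo2_loop(n, g):
    a = {}
    i = 0
    if n > 0:
        t = g
        v = 3
        while True:            # do { ... } while (i < n)
            a[i] = v
            i += 1
            v += t
            if not i < n:
                break
    return i, a

def while2(i2, t2, v2, n, a2):
    """The loop from label While2 onward, as a tail-recursive procedure."""
    if i2 < n:
        a2[i2] = v2
        i2 += 1
        v2 += t2
        return while2(i2, t2, v2, n, a2)
    return i2, v2

def foo2_rec(n, g):
    a = {}
    i = 0
    if n > 0:
        t = g
        v = 3
        a[i] = v               # first iteration stays in the caller
        i += 1
        v += t
        i, v = while2(i, t, v, n, a)
    return i, a

same = all(foo2_loop(n, g) == foo2_rec(n, g) for n in range(-2, 20) for g in range(-3, 4))
```

Once the loop is a procedure, the same mutual-summary machinery used for recursive procedures applies to it.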
Unrolling optimizations
void F2(int i2)
{
if (i2 < n) {
a2[i2] = 1;
F2(i2+1);
return;
}
return;
}
void F3(int i3)
{
if (i3 + 1 < n) {
a3[i3] := 1;
a3[i3+1] := 1;
F3(i3+2);
return;
}
if (i3 < n)
a3[i3] := 1;
return;
}
MS(F2, F3) = (i2 == i3 && a2 == a3) ==> a2’ == a3’
Extra step
• Inline F2 once inside F2 to “match up” with F3
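The summary MS(F2, F3) can be tested directly by running both procedures from the same index and the same initial array. In this sketch `n` is passed as a parameter and arrays are dicts, both encodings assumed for illustration; the check is a bounded test rather than the inlining-based proof.

```python
def F2(i2, n, a2):
    if i2 < n:
        a2[i2] = 1
        F2(i2 + 1, n, a2)
        return
    return

def F3(i3, n, a3):
    if i3 + 1 < n:
        a3[i3] = 1             # two iterations of F2 per call
        a3[i3 + 1] = 1
        F3(i3 + 2, n, a3)
        return
    if i3 < n:
        a3[i3] = 1             # leftover iteration when the remaining count is odd
    return

def ms_F2_F3(i, n, initial):
    """(i2 == i3 && a2 == a3) ==> a2' == a3'"""
    a2, a3 = dict(initial), dict(initial)
    F2(i, n, a2)
    F3(i, n, a3)
    return a2 == a3

initial = {0: 7, 3: 9}
unrolling_ok = all(ms_F2_F3(i, n, initial) for i in range(0, 12) for n in range(0, 12))
```

Inlining F2 once gives it the same two-writes-per-call shape as F3, which is what lets the call sites be paired up in the modular check.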
Using mutual summaries
• Flow
1. Specify the FLABELS to remove loops and gotos
into procedures
2. Write mutual summaries for pairs of resulting
procedures
3. Specify the inlining limit (if needed)
Express translation validation proofs of
many compiler optimizations
– Copy propagation
– Constant propagation
– Common sub-expression elimination
– Partial redundancy elimination
– Loop invariant code hoisting
– Conditional speculation
– Speculation
– Software pipelining
– Loop unswitching
– Loop unrolling
– Loop peeling
– Loop splitting
– Loop alignment
– Loop interchange
– Loop reversal
– Loop skewing
– Loop fusion
– Loop distribution
Order of updates differ in two versions
[Kundu, Tatlock, Lerner ‘09]
A nice example that uses MS, RT
next: ref → ref;
data: ref → int;
void D(ref x){
data[x] := U(data[x]);
}
Recursive:
void A(ref x){
if(x != nil){
A(next[x]);
D(x);
}
}
Tail-recursive:
void B(ref x){
if(x != nil){
D(x);
B(next[x]);
}
}
Do-while:
void C(ref x){
ref i := x;
if(i != nil){
Do: D(i); i := next[i];
if (i != nil)
goto Do;
}
}
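The three traversals can be compared on a concrete acyclic list. In this sketch `nil` is `None`, the maps `next` and `data` are dicts, and `U` is an arbitrary hypothetical update function (the slides leave it unspecified); the check is a single test case, not the MS/RT proof.

```python
def U(v):
    # Hypothetical update; any deterministic function works for this comparison.
    return 2 * v + 1

def D(x, data):
    data[x] = U(data[x])

def A(x, nxt, data):           # recursive: update after the recursive call
    if x is not None:
        A(nxt[x], nxt, data)
        D(x, data)

def B(x, nxt, data):           # tail-recursive: update before the recursive call
    if x is not None:
        D(x, data)
        B(nxt[x], nxt, data)

def C(x, nxt, data):           # do-while loop
    i = x
    if i is not None:
        while True:
            D(i, data)
            i = nxt[i]
            if i is None:
                break

def apply_traversal(traverse, head, nxt, data):
    copy = dict(data)          # each version starts from the same initial heap
    traverse(head, nxt, copy)
    return copy

nxt = {"n0": "n1", "n1": "n2", "n2": None}
data = {"n0": 1, "n1": 2, "n2": 3}
results = [apply_traversal(t, "n0", nxt, data) for t in (A, B, C)]
```

A visits the nodes in the reverse order of B and C, yet each node is updated exactly once, so the final `data` maps coincide on an acyclic list.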
Overview
Demo
Semantic diff
– Tool (in current form)
An application
– Compiler compatibility
Making SymDiff extensible with contracts
– Mutual summaries and relative termination
– General contracts for comparing programs
In summary
• Checking compatibility (statically) is a huge
opportunity
– Both formalizing the problem
– Tools/techniques to solve it
• Likely to have impact on development cycle
– Existing static analysis tools have failed to do so cost-effectively, in spite of all the progress
• Combining with dynamic analysis
– To generate test cases when possible, or help testing
achieve higher differential coverage
Resources
• SymDiff website
http://research.microsoft.com/symdiff/
• Binary release soon!
– Contains C front end