DART and CUTE: Concolic Testing Koushik Sen University of California, Berkeley

advertisement
DART and CUTE:
Concolic Testing
Koushik Sen
University of California, Berkeley
Joint work with Gul Agha, Patrice Godefroid, Nils Klarlund,
Rupak Majumdar, Darko Marinov
7/17/2016
1
Today, QA is mostly testing
“50% of my company employees are testers,
and the rest spends 50% of their time
testing!”
Bill Gates 1995
2
Software Reliability Importance

Software is pervading every aspects of life


Software failures were estimated to cost the US economy
about $60 billion annually [NIST 2002]


Software everywhere
Improvements in software testing infrastructure may save onethird of this cost
Testing accounts for an estimated 50%-80% of the cost of
software development [Beizer 1990]
3
Big Picture
Safe Programming
Languages and Type
systems
Static Program
Analysis
Model
Checking
and
Theorem
Proving
Testing
Runtime
Monitoring
Dynamic Program
Analysis
Model Based Software
Development and Analysis
4
Big Picture
Safe Programming
Languages and Type
systems
Static Program
Analysis
Model
Checking
and
Theorem
Proving
Testing
Runtime
Monitoring
Dynamic Program
Analysis
Model Based Software
Development and Analysis
5
A Familiar Program: QuickSort
void quicksort (int[] a, int lo, int hi) {
int i=lo, j=hi, h;
int x=a[(lo+hi)/2];
// partition
do {
while (a[i]<x) i++;
while (a[j]>x) j--; if (i<=j) {
h=a[i];
a[i]=a[j];
a[j]=h;
i++;
j--;
}
} while (i<=j);
// recursion
if (lo<j) quicksort(a, lo, j);
if (i<hi) quicksort(a, i, hi);
}
6
A Familiar Program: QuickSort
void quicksort (int[] a, int lo, int hi) {
int i=lo, j=hi, h;
int x=a[(lo+hi)/2];

Test QuickSort


// partition
do {
while (a[i]<x) i++;
while (a[j]>x) j--; if (i<=j) {
h=a[i];
a[i]=a[j];
a[j]=h;
i++;
j--;
}
} while (i<=j);
// recursion
if (lo<j) quicksort(a, lo, j);
if (i<hi) quicksort(a, i, hi);




Create an array
Initialize the elements of
the array
Execute the program on
this array
How much confidence
do I have in this testing
method?
Is my test suite
*Complete*?
Can someone generate
a small and *Complete*
test suite for me?
}
7
Automated Test Generation

Studied since 70’s



King 76, Myers 79
30 years have passed, and yet no effective
solution
What Happened???
8
Automated Test Generation

Studied since 70’s



King 76, Myers 79
30 years have passed, and yet no effective
solution
What Happened???


Program-analysis techniques were expensive
Automated theorem proving and constraint solving
techniques were not efficient
9
Automated Test Generation

Studied since 70’s



30 years have passed, and yet no effective
solution
What Happened???



King 76, Myers 79
Program-analysis techniques were expensive
Automated theorem proving and constraint solving
techniques were not efficient
In the recent years we have seen remarkable
progress in static program-analysis and
constraint solving

SLAM, BLAST, ESP, Bandera, Saturn, MAGIC
10
Automated Test Generation

Studied since 70’s

King 76, Myers 79

Question:
Can
we
use
similar
30 years have passed, and yet no effective
techniques in Automated Testing?
solution

What Happened???



Program-analysis techniques were expensive
Automated theorem proving and constraint solving
techniques were not efficient
In the recent years we have seen remarkable
progress in static program-analysis and
constraint solving

SLAM, BLAST, ESP, Bandera, Saturn, MAGIC
11
Systematic Automated Testing
Static Analysis
Model-Checking
and
Formal Methods
Testing
Automated
Theorem Proving
Constraint
Solving
12
CUTE and DART

Combine random testing (concrete execution) and
symbolic testing (symbolic execution)
[PLDI’05, FSE’05, FASE’06, CAV’06,ISSTA’07,
ICSE’07]
Concrete + Symbolic = Concolic
13
Goal

Automated Unit Testing of real-world C and
Java Programs


Generate test inputs
Execute unit under test on generated test inputs


so that all reachable statements are executed
Any assertion violation gets caught
14
Goal

Automated Unit Testing of real-world C and
Java Programs


Generate test inputs
Execute unit under test on generated test inputs



so that all reachable statements are executed
Any assertion violation gets caught
Our Approach:

Explore all execution paths of an Unit for all
possible inputs

Exploring all execution paths ensure that all reachable
statements are executed
15
Execution Paths of a Program

Can be seen as a binary
tree with possibly infinite
depth




Computation tree
Each node represents the
execution of a “if then else”
statement
Each edge represents the
execution of a sequence of
non-conditional statements
Each path in the tree
represents an equivalence
class of inputs
1
0
0
1
1
1
0
0
1
1
0
1
16
Example of Computation Tree
void test_me(int x, int y) {
if(2*x==y){
if(x != y+10){
printf(“I am fine here”);
} else {
printf(“I should not reach here”);
ERROR;
}
}
}
N
2*x==y
N
Y
x!=y+10
Y
ERROR
17
Concolic Testing: Finding Security and Safety Bugs
Divide by 0 Error
x = 3 / i;
Buffer Overflow
a[i] = 4;
18
Concolic Testing: Finding Security and Safety Bugs
Key: Add Checks Automatically and
Perform Concolic Testing
Divide by 0 Error
Buffer Overflow
if (i !=0)
x = 3 / i;
else
ERROR;
if (0<=i && i < a.length)
a[i] = 4;
else
ERROR;
19
Existing Approach I

Random testing



generate random inputs
execute the program on
generated inputs
Probability of reaching
an error can be
astronomically less
test_me(int x){
if(x==94389){
ERROR;
}
}
Probability of hitting
ERROR = 1/232
20
Existing Approach II

Symbolic Execution





use symbolic values for
input variables
execute the program
symbolically on symbolic
input values
collect symbolic path
constraints
use theorem prover to
check if a branch can be
taken
Does not scale for large
programs
test_me(int x){
if((x%10)*4!=17){
ERROR;
} else {
ERROR;
}
}
Symbolic execution will
say both branches are
reachable:
False positive
21
Existing Approach II

Symbolic Execution





use symbolic values for
input variables
execute the program
symbolically on symbolic
input values
collect symbolic path
constraints
use theorem prover to
check if a branch can be
taken
Does not scale for large
programs
test_me(int x){
if(bbox(x)!=17){
ERROR;
} else {
ERROR;
}
}
Symbolic execution will
say both branches are
reachable:
False positive
22
Concolic Testing Approach
int double (int v) {
return 2*v;

}

void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
Random Test Driver:

random value for
and y
x
Probability of reaching
ERROR is extremely
low
ERROR;
}
}
}
23
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
Symbolic
Execution
concrete
state
symbolic
state
x = 22, y = 7
x = x0, y = y0
path
condition
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
24
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
z = double (y);
if (z == x) {
x = 22, y = 7,
z = 14
x = x0, y = y0,
z = 2*y0
if (x > y+10) {
ERROR;
}
}
}
25
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
z = double (y);
2*y0 != x0
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
x = 22, y = 7,
z = 14
x = x0, y = y0,
z = 2*y0
26
Concolic Testing Approach
int double (int v) {
return 2*v;
}
void testme (int x, int y) {
z = double (y);
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Solve: 2*y0 == x0
Solution: x0 = 2, y0 = 1
2*y0 != x0
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
x = 22, y = 7,
z = 14
x = x0, y = y0,
z = 2*y0
27
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
x = 2, y = 1
x = x0, y = y0
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
28
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
z = double (y);
if (z == x) {
x = 2, y = 1,
z=2
x = x0, y = y0,
z = 2*y0
if (x > y+10) {
ERROR;
}
}
}
29
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
z = double (y);
2*y0 == x0
if (z == x) {
if (x > y+10) {
x = 2, y = 1,
z=2
x = x0, y = y0,
z = 2*y0
ERROR;
}
}
}
30
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
void testme (int x, int y) {
z = double (y);
2*y0 == x0
if (z == x) {
x0 · y0+10
if (x > y+10) {
ERROR;
}
}
}
x = 2, y = 1,
z=2
x = x0, y = y0,
z = 2*y0
31
Concolic Testing Approach
int double (int v) {
return 2*v;
}
void testme (int x, int y) {
z = double (y);
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Solve: (2*y0 == x0) Æ (x0 > y0 + 10)
Solution: x0 = 30, y0 = 15
2*y0 == x0
if (z == x) {
x0 · y0+10
if (x > y+10) {
ERROR;
}
}
}
x = 2, y = 1,
z=2
x = x0, y = y0,
z = 2*y0
32
Concolic Testing Approach
int double (int v) {
return 2*v;
}
Concrete
Execution
Symbolic
Execution
concrete
state
symbolic
state
x = 30, y = 15
x = x0, y = y0
path
condition
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
33
Concolic Testing Approach
int double (int v) {
return 2*v;
}
void testme (int x, int y) {
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Program Error
z = double (y);
2*y0 == x0
if (z == x) {
x0 > y0+10
if (x > y+10) {
ERROR;
x = 30, y = 15
x = x0, y = y0
}
}
}
34
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
35
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
36
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
37
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
38
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
39
Explicit Path (not State) Model Checking

Traverse all execution
paths one by one to
detect errors

assertion violations
program crash

uncaught exceptions


combine with valgrind
to discover memory
errors
T
F
F
T
T
T
F
F
T
T
F
T
40
Novelty : Simultaneous Concrete and Symbolic Execution
int foo (int v) {
return (v*v) % 50;
}
Concrete
Execution
Symbolic
Execution
concrete
state
symbolic
state
x = 22, y = 7
x = x0, y = y0
path
condition
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
41
Novelty : Simultaneous Concrete and Symbolic Execution
Concrete
Execution
int foo (int v) {
return (v*v) % 50;
}
void testme (int x, int y) {
z = foo (y);
if (z == x) {
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Solve: (y0*y0 )%50 == x0
Don’t know how to solve!
Stuck?
(y0*y0)%50 !=x0
if (x > y+10) {
ERROR;
}
}
}
x = 22, y = 7,
z = 49
x = x0, y = y0,
z = (y0 *y0)%50
42
Novelty : Simultaneous Concrete and Symbolic Execution
Concrete
Execution
concrete
state
void testme (int x, int y) {
z = foo (y);
if (z == x) {
Symbolic
Execution
symbolic
state
path
condition
Solve: foo (y0) == x0
Don’t know how to solve!
Stuck?
foo (y0) !=x0
if (x > y+10) {
ERROR;
}
}
}
x = 22, y = 7,
z = 49
x = x0, y = y0,
z = foo (y0)
43
Novelty : Simultaneous Concrete and Symbolic Execution
Concrete
Execution
int foo (int v) {
return (v*v) % 50;
}
void testme (int x, int y) {
z = foo (y);
if (z == x) {
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Solve: (y0*y0 )%50 == x0
Don’t know how to solve!
Not Stuck!
(y0*y0)%50 !=x0
Use concrete state
if (x > y+10) {
Replace y0 by 7 (sound)
ERROR;
}
}
}
x = 22, y = 7,
z = 49
x = x0, y = y0,
z = (y0 *y0)%50
44
Novelty : Simultaneous Concrete and Symbolic Execution
int foo (int v) {
return (v*v) % 50;
}
void testme (int x, int y) {
z = foo (y);
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Solve: 49 == x0
Solution : x0 = 49, y0 = 7
49 !=x0
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
x = 22, y = 7,
z = 48
x = x0, y = y0,
z = 49
45
Novelty : Simultaneous Concrete and Symbolic Execution
int foo (int v) {
return (v*v) % 50;
}
Concrete
Execution
Symbolic
Execution
concrete
state
symbolic
state
x = 49, y = 7
x = x0, y = y0
path
condition
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;
}
}
}
46
Novelty : Simultaneous Concrete and Symbolic Execution
int foo (int v) {
return (v*v) % 50;
}
void testme (int x, int y) {
Concrete
Execution
concrete
state
Symbolic
Execution
symbolic
state
path
condition
Program Error
z = foo (y);
2*y0 == x0
if (z == x) {
x0 > y0+10
if (x > y+10) {
ERROR;
}
x = 49, y = 7,
z = 49
x = x0, y = y0 ,
z = 49
}
}
47
Concolic Testing: A Middle Approach
Random
Testing
Symbolic
Testing
Concolic
Testing
+ Complex programs
+ Efficient
- Less coverage
+ No false positive
+ Complex programs
+/- Somewhat efficient
+ High coverage
+ No false positive
- Simple programs
- Not efficient
+ High coverage
- False positive
48
Implementations


DART and CUTE for C programs
jCUTE for Java programs


MSR has four implementations



Goto http://srl.cs.berkeley.edu/~ksen/ for CUTE
and jCUTE binaries
SAGE, PEX, YOGI, Vigilante
Similar tool: EXE at Stanford
Easiest way to use and to develop on top of
CUTE

Implement concolic testing yourself
49
Testing Data Structures
(joint work with Darko Marinov and
Gul Agha
50
Example
typedef struct cell {
int v;
struct cell *next;
} cell;
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
• Random Test Driver:
• random memory graph
reachable from p
• random value for x
• Probability of reaching abort( ) is
extremely low
51
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
int f(int v) {
return 2*v + 1;
}
concrete
state
symbolic
state
constraints
p
, x=236
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
p=p0, x=x0
NULL
52
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
, x=236
p=p0, x=x0
NULL
53
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
p
, x=236
p=p0, x=x0
NULL
54
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
!(p0!=NULL
)
p
, x=236
p=p0, x=x0
NULL
55
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
symbolic
state
state
solve: x0>0 and p0NULL
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
constraints
x0>0
p0=NULL
p
, x=236
p=p0, x=x0
NULL
56
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
symbolic
state
state
solve: x0>0 and p0NULL
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
x0=236, p0
constraints
NULL
634
x0>0
p0=NULL
p
, x=236
p=p0, x=x0
NULL
57
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
concrete
state
p
634
NULL
, x=236
Symbolic
Execution
symbolic
state
constraints
p=p0, x=x0,
p->v =v0,
p->next=n0
58
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
59
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
60
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
2x0+1v0
61
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
p0NULL
2x0+1v0
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
62
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
symbolic
state
state
solve: x0>0 and p0NULL
and 2x0+1=v0
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
constraints
x0>0
p0NULL
2x0+1v0
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
63
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
symbolic
state
state
solve: x0>0 and p0NULL
and 2x0+1=v0
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
x0=1, p0
constraints
NULL
3
x0>0
p0NULL
2x0+1v0
p
634
NULL
, x=236
p=p0, x=x0,
p->v =v0,
p->next=n0
64
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
3
Symbolic
Execution
concrete
state
symbolic
state
NULL
, x=1
p=p0, x=x0,
p->v =v0,
p->next=n0
constraints
65
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
66
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
67
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
2x0+1=v0
68
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
p0NULL
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
2x0+1=v0
n0p0
69
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
p0NULL
2x0+1=v0
n0p0
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
70
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
symbolic
state
constraints
solve: x0>0 and p0NULL
and 2x0+1=v0 and n0=p0
.
x0>0
p0NULL
2x0+1=v0
n0p0
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
71
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
Symbolic
Execution
symbolic
state
constraints
solve: x0>0 and p0NULL
and 2x0+1=v0 and n0=p0
x0=1, p0
3
x0>0
p0NULL
2x0+1=v0
n0p0
p
NULL
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
72
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
concrete
state
symbolic
state
, x=1
p=p0, x=x0,
p->v =v0,
p->next=n0
p
3
Symbolic
Execution
constraints
73
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
74
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
75
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
p
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
x0>0
p0NULL
2x0+1=v0
76
CUTE Approach
Concrete
Execution
typedef struct cell {
int v;
struct cell *next;
} cell;
concrete
state
Symbolic
Execution
symbolic
state
constraints
int f(int v) {
return 2*v + 1;
}
int testme(cell *p, int x) {
if (x > 0)
if (p != NULL)
if (f(x) == p->v)
if (p->next == p)
abort();
return 0;
}
x0>0
p0NULL
Program Error
p
, x=1
3
p=p0, x=x0,
p->v =v0,
p->next=n0
2x0+1=v0
n0=p0
77
CUTE in a Nutshell

Generate concrete inputs one by one
 each input leads program along a different path
79
CUTE in a Nutshell


Generate concrete inputs one by one
 each input leads program along a different path
On each input execute program both concretely and
symbolically
80
CUTE in a Nutshell


Generate concrete inputs one by one
 each input leads program along a different path
On each input execute program both concretely and
symbolically
 Both cooperate with each other
 concrete execution guides the symbolic execution
81
CUTE in a Nutshell


Generate concrete inputs one by one
 each input leads program along a different path
On each input execute program both concretely and
symbolically
 Both cooperate with each other
 concrete execution guides the symbolic execution
 concrete execution enables symbolic execution to
overcome incompleteness of theorem prover
 replace symbolic expressions by concrete values if
symbolic expressions become complex
 resolve aliases for pointer using concrete values
 handle arrays naturally
82
CUTE in a Nutshell


Generate concrete inputs one by one
 each input leads program along a different path
On each input execute program both concretely and
symbolically
 Both cooperate with each other
 concrete execution guides the symbolic execution
 concrete execution enables symbolic execution to
overcome incompleteness of theorem prover
 replace symbolic expressions by concrete values if
symbolic expressions become complex
 resolve aliases for pointer using concrete values
 handle arrays naturally
 symbolic execution helps to generate concrete input for
next execution
 increases coverage
83
Handling Pointer Casting and Arithmetic
struct foo {
int i;
char c;
};
void * memset(void *s,char c,size_t n){
for(int i=0;i<n;i++){
((char *)s)[i] = c; }
return s;
}
bar(struct foo *a){
if(a && a->c==1){
memset(a,0,sizeof(struct foo));
if(a->c!= 1) {
ERROR;
}
}
}


Reasoning about dynamic
data is easy
Due to limitation of alias
analysis “static analyzers”
cannot determine that
“a->c” has been rewritten

BLAST would infer that
the program is safe


practical
CUTE finds the error
84
Library Functions With Side-effects
struct foo {
int i;
char c;
};

bar(struct foo *a){
if(a && a->c==1){
memset(a,0,sizeof(struct foo));
if(a->c== 1) {
ERROR;
}
}
}


Reasoning about function
with side-effects
Sound static tool will give
false alarm
CUTE will not report the
error

because it cannot generate
a concrete input that will
make the program hit
ERROR.
85
Underlying Random Testing Helps
1 foobar(int x, int y){
2 if (x*x*x > 0){
3
if (x>0 && y==10){
4
ERROR;
5
}
6 } else {
7
if (x>0 && y==20){
8
ERROR;
9
}
10 }
11 }

static analysis based
model-checkers would
consider both branches



Symbolic execution



both ERROR statements
are reachable
false alarm
gets stuck at line number
2
or warn that both
ERRORs are reachable
CUTE finds the only
error
86
Data-structure Testing
Solving Data-structure Invariants
int isSortedSlist(slist * head) {
slist * cur, *tmp;
int i,j;
if (head == 0) return 1;
i=j=0;
for (cur = head; cur!=0; cur = cur->next){
i++;
j=1;
for (tmp = head; j<i; tmp = tmp->next){
j++;
if(cur==tmp) return 0;
}
}
for (cur = head; cur->next!=0; cur = cur>next){
if(cur->i > cur->_next->i) return 0;
}
return 1;
}
testme(slist *L,slist *e){
CUTE_assume(isSortedSlist(L));
sglib_slist_add(&L,e);
CUTE_assert(isSortedSlist(L));
}
87
Data-structure Testing
Generating Call Sequence
for (i=1; i<10; i++) {
CU_input(toss);
CU_input(e);
switch(toss){
case 2:
sglib_hashed_ilist_add_if_not_member(htab,e,&m);
break;
case 3:
sglib_hashed_ilist_delete_if_member(htab,e,&m);
break;
case 4:
sglib_hashed_ilist_delete(htab,e); break;
case 5:
sglib_hashed_ilist_is_member(htab,e); break;
case 6:
sglib_hashed_ilist_find_member(htab,e); break;
}
}
88
Instrumentation
89
Instrumented Program

Maintains a symbolic state


maps each memory location
used by the program to
symbolic expressions over
symbolic input variables
Maintains a path constraint

a sequence of constraints
C1, C2,…, Ck


such that Ci s are symbolic
constraints over symbolic
input variables
C1 Æ C2 Æ … Æ Ck holds
over the path currently
executed by the program
90
Instrumented Program

Maintains a symbolic state


maps each memory location
used by the program to
symbolic expressions over
symbolic input variables

Arithmetic Expressions
a1x1+…+anxn+c

Pointer expressions
x
Maintains a path constraint

a sequence of constraints
C1, C2,…, Ck


such that Ci s are symbolic
constraints over symbolic
input variables
C1 Æ C2 Æ … Æ Ck holds
over the path currently
executed by the program
91
Instrumented Program

Maintains a symbolic state


maps each memory location
used by the program to
symbolic expressions over
symbolic input variables

a1x1+…+anxn+c

a sequence of constraints
C1, C2,…, Ck


such that Ci s are symbolic
constraints over symbolic
input variables
C1 Æ C2 Æ … Æ Ck holds
over the path currently
executed by the program
Pointer expressions
x

Maintains a path constraint

Arithmetic Expressions
Arithmetic Constraints
a1x1+…+anxn ¸ 0

Pointer Constraints
x=y
x y
x=NULL
x NULL
92
Constraint Solving
Path constraint
C1Æ C2Æ… Æ Ck Æ… Cn
93
Constraint Solving
Path constraint
C1Æ C2Æ… Æ Ck Æ… Cn
Solve
C1Æ C2Æ… Æ : Ck
to generate next input
94
Constraint Solving
Path constraint
C1Æ C2Æ… Æ Ck Æ… Cn
Solve
C1Æ C2Æ… Æ : Ck
to generate next input
If Ck is pointer constraint then invoke pointer solver
Ci1 Æ Ci2 Æ … Æ : Ck where ij 2 [1..k-1] and Cij is
pointer constraint
If Ck is arithmetic constraint then invoke linear solver
Ci1Æ Ci2 Æ … Æ : Ck where ij 2 [1..k-1] and Cij is
arithmetic constraint
95
Optimizations

Fast un-satisfiability check
C1Æ C2Æ… Æ : Ck
96
Optimizations
Fast un-satisfiability check
C1Æ C2Æ… Æ : Ck
 Eliminate same constraints
(y>2) Æ (y>2)Æ :(y==4)
´
(y>2)Æ :(y==4)
that is
C1Æ C2Æ…Ci … Cj … Æ : Ck = C1Æ C2Æ… Ci … Æ : Ck
if Ci = Cj

97
Optimizations
Fast un-satisfiability check
C1Æ C2Æ… Æ : Ck
 Eliminate same constraints
(y>2) Æ (y>2)Æ :(y==4)
´
(y>2)Æ :(y==4)
that is
C1Æ C2Æ…Ci … Cj … Æ : Ck = C1Æ C2Æ… Ci … Æ : Ck
if Ci = Cj
 Eliminate non-dependent constraints
(x==1) Æ (y>2)Æ :(y==4)
´
(y>2)Æ :(y==4)

98
SGLIB: popular library for C data-structures


Used in Xrefactory a commercial tool for refactoring
C/C++ programs
Found two bugs in sglib 1.0.1



Bug 1:


reported them to authors
fixed in sglib 1.0.2
doubly-linked list library
 segmentation fault occurs when a non-zero length list is
concatenated with a zero-length list
 discovered in 140 iterations ( < 1second)
Bug 2:

hash-table
 an infinite loop in hash table is member function

193 iterations (1 second)
99
SGLIB: popular library for C data-structures
Name
Run time
in
seconds
# of
Iterations
# of
Branches
Explored
% Branch
Coverage
# of
Functions
Tested
# of Bugs
Found
Array Quick
Sort
2
732
43
97.73
2
0
Array Heap
Sort
4
1764
36
100.00
2
0
Linked List
2
570
100
96.15
12
0
Sorted List
2
1020
110
96.49
11
0
Doubly Linked
List
3
1317
224
99.12
17
1
Hash Table
1
193
46
85.19
8
1
2629
1,000,000
242
71.18
17
0
Red Black
Tree
100
Experiments: Needham-Schroeder Protocol

Tested a C implementation of a security
protocol (Needham-Schroeder) with a known
attack



600 lines of code
Took less than 13 seconds on a machine with
1.8GHz Xeon processor and 2 GB RAM to
discover middle-man attack
In contrast, a software model-checker
(VeriSoft) took 8 hours
101
Testing Data-structures of CUTE itself

Unit tested several non-standard datastructures implemented for the CUTE tool




cu_depend (used to determine dependency
during constraint solving using graph algorithm)
cu_linear (linear symbolic expressions)
cu_pointer (pointer symbolic expresiions)
Discovered a few memory leaks and a copule
of segmentation faults


these errors did not show up in other uses of
CUTE
for memory leaks we used CUTE in conjunction
with Valgrind
102
Limitations

Path Space of a Large Program is Huge
 Path Explosion Problem
Entire Computation
Tree
103
Limitations

Path Space of a Large Program is Huge
 Path Explosion Problem
Entire Computation
Tree
Explored by
Concolic Testing
104
Limitations

Path Space of a Large Program is Huge
 Path Explosion Problem
Entire Computation
Tree
Explored by
Concolic Testing
105
Hybrid Concolic Testing
(joing work with Rupak Majumdar)
106
Limitations: A Comparative View
Concolic: Broad, shallow
Random: Narrow, deep
107
Hybrid Concolic Testing

Interleave Random Testing and Concolic Testing to
increase coverage
Motivated by similar idea used in VLSI design validation:
Ganai et al. 1999, Ho et al. 2000
108
Hybrid Concolic Testing

Interleave Random Testing and Concolic Testing to
increase coverage
while (not required coverage) {
while (not saturation)
perform random testing;
Checkpoint;
while (not increase in coverage)
perform concolic testing;
Restore;
}
109
Hybrid Concolic Testing

Interleave Random Testing and Concolic Testing to
increase coverage
while (not required coverage) {
while (not saturation)
perform random testing;
Checkpoint;
while (not increase in coverage)
perform concolic testing;
Restore;
}
Deep, broad search
Hybrid Search
110
Hybrid Concolic Testing


4x more coverage than random testing
2x more coverage than concolic testing
111
Results
112
Results
113
Summary
Static Analysis
Program
Verification
Automated
Theorem Proving
Model-Checking
and
Formal Methods
Constraint
Solving
114
Summary
Static Analysis
Program
Verification
Model-Checking
and
Formal Methods
Testing
Automated
Theorem Proving
Constraint
Solving
115
References


1.
2.
3.
4.
5.
6.
7.
DART: Directed Automated Random Testing. Patrice
Godefroid, Nils Klarlund, and Koushik Sen (PLDI 2005)
CUTE: A Concolic Unit Testing Engine for C. Koushik Sen,
Darko Marinov, and Gul Agha (ESEC/FSE 2005)
"Dynamic Test Input Generation for Database Applications” Michael Emmi, Rupak Majumdar,
and Koushik Sen, International Symposium on Software Testing and Analysis (ISSTA'07), ACM,
2007.
"Hybrid Concolic Testing” Rupak Majumdar and Koushik Sen, 29th International Conference on
Software Engineering (ICSE'07), IEEE, 2007.
"A Race-Detection and Flipping Algorithm for Automated Testing of Multi-Threaded Programs”
Koushik Sen and Gul Agha, Haifa verification conference 2006 (HVC'06), Lecture Notes in
Computer Science, Springer, 2006.
"CUTE and jCUTE : Concolic Unit Testing and Explicit Path Model-Checking Tools” Koushik
Sen and Gul Agha, 18th International Conference on Computer Aided Verification (CAV'06),
Lecture Notes in Computer Science, Springer, 2006. (Tool Paper).
"Automated Systematic Testing of Open Distributed Programs” Koushik Sen and Gul Agha,
International Conference on Fundamental Approaches to Software Engineering (FASE'06)
(ETAPS'06 conference), Lecture Notes in Computer Science, Springer, 2006.
“EXE: Automatically Generating Inputs of Death,” Cristian Cadar, Vijay Ganesh, Peter M.
Pawlowski, David L. Dill, Dawson R. Engler, 13th ACM Conference on Computer and
Communications Security, 2006.
“Automatically generating malicious disks using symbolic execution,” Junfeng Yang, Can Sar,
Paul Twohey, Cristian Cadar, and Dawson Engler. IEEE Security and Privacy, 2006.
116
Download