Dynamic Symbolic Execution
(aka directed automated random testing, aka concolic execution)
Slides by Koushik Sen

Software Testing
• Testing accounts for 50% of software development cost
• Software failure costs USA $60 billion annually
  – Improvement in software testing infrastructure can save one-third of this cost
  "The economic impacts of inadequate infrastructure for software testing", NIST, May 2002
• Currently, software testing is mostly done manually

Simple C code
    /* named twice because double is a reserved word in C */
    int twice(int x) { return 2 * x; }

    void test_me(int x, int y) {
        int z = twice(x);
        if (z == y) {
            if (x != y + 10) {
                printf("I am fine here");
            } else {
                printf("I should not reach here");
                abort();
            }
        }
    }

Automated testing
• What do we need to do if we want to automatically test a piece of code (i.e., automatic unit testing)?
  – Figure out the environment of the application
    • What are the inputs?
    • What are the interfaces for interacting with other components?
  – Automatically generate an environment for the application
    • Automatically generate the values that come from the environment

Automatic Extraction of Interface
• Automatically determine (by code parsing)
  – inputs to the program
    • arguments to the entry function
  – variables whose value depends on the environment
    • external objects
  – function calls whose return value depends on the environment
    • external function calls
• For the simple C code
  – want to unit test the function test_me
  – int x and int y, passed as arguments to test_me, form the external environment

Automated Random Testing
• Generate a test driver automatically to simulate a random environment for the extracted interface
  – most general environment
  – C code
• Compile the program along with the test driver to create a closed executable
• Run the executable several times to see if an assertion is violated

Random test-driver
    /* named twice because double is a reserved word in C */
    int twice(int x) { return 2 * x; }

    void test_me(int x, int y) {
        int z = twice(x);
        if (z == y) {
            if (x != y + 10) {
                printf("I am fine here");
            } else {
                printf("I should not reach here");
                abort();
            }
        }
    }

    /* Random Test Driver */
    int main() {
        int tmp1 = randomInt();
        int tmp2 = randomInt();
        test_me(tmp1, tmp2);
        return 0;
    }
• Probability of reaching abort() is extremely low

Limitations
• Hard to hit the assertion violation with random values of x and y
  – there is an extremely low probability of hitting the assertion violation
• Can we do better?
  – Directed Automated Random Testing
    • White box testing

DART (Directed Automated Random Testing) Approach
    /* driver; twice and test_me as on the Random test-driver slide */
    int main() {
        int t1 = randomInt();
        int t2 = randomInt();
        test_me(t1, t2);
        return 0;
    }

Concrete execution and symbolic execution proceed side by side. The inputs get symbolic names t1 = m, t2 = n, so inside test_me the symbolic state is x = m, y = n, z = 2m.

Run 1 (random inputs t1 = 36, t2 = -7)
  concrete state:  x = 36, y = -7, z = 72
  symbolic state:  x = m, y = n, z = 2m
  constraints:     2m != n            (z == y is false)
  solve:           2m = n, giving m = 1, n = 2

Run 2 (t1 = 1, t2 = 2)
  concrete state:  x = 1, y = 2, z = 2
  symbolic state:  x = m, y = n, z = 2m
  constraints:     2m = n, m != n+10  (z == y is true, x != y+10 is true)
  solve:           2m = n and m = n+10, giving m = -10, n = -20

Run 3 (t1 = -10, t2 = -20)
  concrete state:  x = -10, y = -20, z = -20
  symbolic state:  x = m, y = n, z = 2m
  constraints:     2m = n, m = n+10
  result:          abort() is reached: Program Error

Execution tree explored by the three runs:
    z == y ?
      N: return                        (run 1)
      Y: x != y+10 ?
           Y: "I am fine here"         (run 2)
           N: Error, abort()           (run 3)

DART in a Nutshell
• Dynamically observe random execution and generate new test inputs to drive the next execution along an alternative path
  – do dynamic analysis on a random execution
  – collect symbolic constraints at branch points
  – negate one constraint at a branch point (say b)
  – call constraint solver to generate new test inputs
  – use the new test inputs for next execution to take alternative path at branch b
  – (check that branch b is indeed taken next)

More details
• Instrument the C program to do both
  – Concrete Execution
    • Actual Execution
  – Symbolic Execution and Lightweight theorem proving (path constraint solving)
    • Dynamic symbolic analysis
    • Interacts with concrete execution
• Instrumentation also checks whether the next execution matches the last prediction

Experiments
• Tested a C implementation of a security protocol (Needham-Schroeder) with a known attack
  – 406 lines of code
  – Took less than 26 minutes on a 2GHz machine to discover the man-in-the-middle attack
• In contrast, a software model-checker (VeriSoft) and a hand-written nondeterministic model of the attacker took hours to discover the attack

Larger Experiment
• oSIP (open-source session initiation protocol)
  – http://www.gnu.org/software/osip/osip.html
  – 30,000 lines of C code (version 2.0.9)
  – 600 externally visible functions
• Results
  – crashed 65% of the externally visible functions within 1000 iterations
  – no nullity check for pointers
• Focused on oSIP parser
  – can externally crash oSIP server
  – osip_message_parse(): pass a buffer of size 2.5 MB with no 0 or "|" character
  – tries to copy the packet to the stack using alloca(size)
    • this fails: returns NULL pointer
  – this NULL pointer is passed to another function
    • which does not check for nullity and crashes

Advantage of Dynamic Analysis over Static Analysis
    struct foo { int i; char c; };

    void bar(struct foo *a) {
        if (a->c == 0) {
            *((char *)a + sizeof(int)) = 1;
            if (a->c != 0) {
                abort();
            }
        }
    }
• Reasoning about dynamic data is easy
• Due to limitations of alias analysis, static analyzers cannot determine that a->c has been rewritten
  – Software model checker BLAST would infer that the program is safe
• DART finds the error

Further advantages
     1  void foobar(int x, int y){
     2    if (x*x*x > 0){
     3      if (x>0 && y==10){
     4        abort();
     5      }
     6    } else {
     7      if (x>0 && y==20){
     8        abort();
     9      }
    10    }
    11  }
• static analysis based model checkers would consider both branches
  – both abort() statements are reachable
  – false alarm
• Symbolic execution gets stuck at line number 2
• DART finds the only error

Simultaneous Symbolic & Concrete Execution
    void again_test_me(int x, int y) {
        int z = x*x*x + 3*x*x + 9;
        if (z != y) {
            printf("Good branch");
        } else {
            printf("Bad branch");
            abort();
        }
    }
• Let initially x = -3 and y = 7, generated by the random test driver
• concrete z = 9
• symbolic z = x*x*x + 3*x*x + 9
• take then branch with constraint x*x*x + 3*x*x + 9 != y
• solve x*x*x + 3*x*x + 9 = y to take else branch
• Don't know how to solve!! Stuck?
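The run above can be replayed with a small C sketch. This is only an illustration under stated assumptions: reaches_abort and concolic_step are hypothetical names that are not part of DART, and abort() is replaced by a boolean result so the run can be repeated. In place of solving the cubic constraint, the sketch substitutes the concrete value of z observed in the run, which is the fallback the slides turn to next.

```c
#include <assert.h>
#include <stdbool.h>

/* Instrumented copy of again_test_me (hypothetical helper):
 * returns true when the else branch, i.e. the abort() location, is taken. */
static bool reaches_abort(int x, int y) {
    int z = x * x * x + 3 * x * x + 9;
    return !(z != y);
}

/* One concolic step (hypothetical helper, not DART's real interface).
 * Runs on (x, *y). If the then branch was taken, the nonlinear symbolic
 * value of z is replaced by its concrete value c, so the path constraint
 * becomes the linear "c != y". Negating it gives y = c; x is kept fixed
 * because c is only valid for this x. Returns true if abort() was reached. */
bool concolic_step(int x, int *y) {
    assert(y != 0);
    if (reaches_abort(x, *y)) {
        return true;
    }
    int c = x * x * x + 3 * x * x + 9;  /* concrete z for this x */
    *y = c;                             /* solution of c == y */
    return false;
}
```

Starting from x = -3 and y = 7, the first call returns false and sets y to 9; a second call with the same x reports that the abort() location is reached.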
– NO : DART handles this smartly Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • Let initially x = -3 and y = 7 generated by random test-driver Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • • • Let initially x = -3 and y = 7 generated by random test-driver concrete z = 9 symbolic z = x*x*x + 3*x*x+9 – cannot handle symbolic value of z Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • • • Let initially x = -3 and y = 7 generated by random test-driver concrete z = 9 symbolic z = x*x*x + 3*x*x+9 – cannot handle symbolic value of z – make symbolic z = 9 (randomly chose a value) and proceed Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • • • • Let initially x = -3 and y = 7 generated by random test-driver concrete z = 9 symbolic z = x*x*x + 3*x*x+9 – cannot handle symbolic value of z – make symbolic z = 9 and proceed take then branch with constraint 9 != y Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • • • • • • Let initially x = -3 and y = 7 generated by random test-driver concrete z = 9 symbolic z = x*x*x + 3*x*x+9 – cannot handle symbolic value of z – make symbolic z = 9 and proceed take then branch with constraint 9 != y solve 9 = y to take else branch execute next run with x = -3 and y= 9 – got error (reaches abort) Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; 
if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } • • • • • • Replace symbolic expression by concrete value when symbolic expression becomes unmanageable (i.e. non-linear) Let initially x = -3 and y = 7 generated by random test-driver concrete z = 9 symbolic z = x*x*x + 3*x*x+9 – cannot handle symbolic value of z – make symbolic z = 9 and proceed take then branch with constraint 9 != y solve 9 = y to take else branch execute next run with x = -3 and y= 9 – got error (reaches abort) Simultaneous Symbolic & Concrete Execution void again_test_me(int x,int y){ z = x*x*x + 3*x*x + 9; if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } void again_test_me(int x,int y){ z = black_box_fun(x); if(z != y){ printf(“Good branch”); } else { printf(“Bad branch”); abort(); } } Discussion • In comparison to existing testing tools, DART is – light-weight – dynamic analysis (compare with static analysis) • ensures no false alarms – concrete execution and symbolic execution run simultaneously • symbolic execution consults concrete execution whenever dynamic analysis becomes intractable – real tool that works on real C programs • completely automatic • Software model-checkers using abstraction (SLAM, BLAST) – starts with an abstraction with more behaviors – gradually refines – static analysis approach – false alarms – DART: executes program systematically to explore feasible paths Another Name for this Approach: Concolic Execution • Combine concrete and symbolic execution for unit testing – Concrete + Symbolic = Concolic • Concolic Execution – Use concrete execution over a concrete input to guide symbolic execution – Concrete execution helps Symbolic execution to simplify complex and unmanageable symbolic expressions • by replacing symbolic values by concrete values • Achieves Scalability – Higher branch coverage than random testing – No false positives or scalability issue like in symbolic execution based testing CUTE: A concolic 
execution tool • CUTE: A Concolic Unit Testing Engine – For C and Java – Handle pointers • Can test data-structures • Can handle heap – Supports bounded depth search – Use static analysis to find branches that can lead to assertion violation • use this info to prune search space CUTE Case study: SGLIB • SGLIB: popular library for C data-structures • Used in Xrefactory a commercial tool for refactoring C/C++ programs • Found two bugs in sglib 1.0.1 – reported them to authors – fixed in sglib 1.0.2 • Bug 1: – doubly-linked list library • segmentation fault occurs when a non-zero length list is concatenated with a zero-length list • discovered in 140 iterations ( < 1second) • Bug 2: – hash-table • an infinite loop in hash table is-member function – 193 iterations (1 second) DART vs CUTE • DART handles only arithmetic constraints • CUTE – Supports C with • pointers, data-structures – Highly efficient constraint solver • 100 -1000 times faster – arithmetic, pointers – Provides Bounded Depth-First Search and Random Search strategies – Publicly available tool that works on ALL C programs Example typedef struct cell { int v; struct cell *next; } cell; int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } • Random Test Driver: random memory graph • reachable from p • random value for x Probability of reaching abort( ) is •extremely low CUTE Approach typedef struct cell { int v; struct cell *next; } cell; int f(int v) { return 2*v + 1; } Concrete Execution concrete state symbolic state p , x=236 int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution NULL p=p , x=x 0 0 constraints CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) 
== p->v) if (p->next == p) abort(); return 0; } p , x=236 NULL p=p , x=x 0 0 constraints CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p , x=236 NULL p=p , x=x 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 !(p !=NULL) 0 p , x=236 NULL p=p , x=x 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state symbolic state constraints solve: x >0 and p NULL 0 0 int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution x >0 0 p =NULL 0 p , x=236 NULL p=p , x=x 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; Symbolic Execution concrete state symbolic state constraints solve: x >0 and p NULL 0 0 int f(int v) { return 2*v + 1; } x =236, p 0 0 NULL 634 int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p =NULL 0 p , x=236 NULL p=p , x=x 0 0 CUTE Approach typedef struct cell { int v; struct cell *next; } cell; Concrete Execution concrete state Symbolic Execution symbolic state int f(int v) { return 2*v + 1; } p int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 constraints CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution 
symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 2x +1v 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p NULL 0 2x +1v 0 0 p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state symbolic state constraints solve: x >0 and p NULL and 2x +1=v 0 0 0 0 int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution x >0 0 p NULL 0 2x +1v 0 0 p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state symbolic state constraints solve: x >0 and p NULL and 2x +1=v 0 0 0 0 int f(int v) { return 2*v + 1; } int 
testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution x =1, p 0 0 NULL 3 x >0 0 p NULL 0 2x +1v 0 0 p NULL, x=236 634 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; int f(int v) { return 2*v + 1; } concrete state p NULL, x=1 int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } 3 Symbolic Execution symbolic state p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 constraints CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL, x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL, x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p NULL , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 2x +1=v 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p NULL 0 p NULL, 
x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 2x +1=v 0 0 n p 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p NULL 0 2x +1=v 0 0 n p 0 0 p NULL , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution symbolic state constraints solve: x >0 and p NULL and 2x +1=v 0 0 0 and0 n =p 0 0 . x >0 0 p NULL 0 2x +1=v 0 0 n p 0 0 p NULL , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } Symbolic Execution symbolic state constraints solve: x >0 and p NULL and 2x +1=v 0 0 0 and0 n =p 0 0 x =1, p 0 0 3 x >0 0 p NULL 0 2x +1=v 0 0 n p 0 0 p NULL , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; int f(int v) { return 2*v + 1; } concrete state p , x=1 int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } 3 Symbolic Execution symbolic state p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 constraints CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p , x=1 3 
p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } p , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 x >0 0 p NULL 0 2x +1=v 0 0 CUTE Approach Concrete Execution typedef struct cell { int v; struct cell *next; } cell; concrete state Symbolic Execution symbolic state constraints int f(int v) { return 2*v + 1; } int testme(cell *p, int x) { if (x > 0) if (p != NULL) if (f(x) == p->v) if (p->next == p) abort(); return 0; } x >0 0 p NULL 0 Program Error p , x=1 3 p=p , x=x , 0 =v 0 p->v , 0 p->next=n 0 2x +1=v 0 0 n =p 0 0 Explicit Path (not State) Model Checking • Traverse all execution paths one by one to detect errors – check for assertion violations – check for program crash – combine with valgrind to discover memory leaks – detect invariants 1 0 0 1 1 1 0 0 1 1 0 1 Explicit Path (not State) Model Checking • Traverse all execution paths one by one to detect errors – check for assertion violations – check for program crash – combine with valgrind to discover memory leaks – detect invariants 1 0 0 1 1 1 0 0 1 1 0 1 Explicit Path (not State) Model Checking • Traverse all execution paths one by one to detect errors – check for assertion violations – check for program crash – combine with valgrind to discover memory leaks – detect invariants 1 0 0 1 1 1 0 0 1 1 0 1 Explicit Path (not State) Model Checking • Traverse all execution 
  paths one by one to detect errors
  – check for assertion violations
  – check for program crash
  – combine with Valgrind to discover memory leaks
  – detect invariants
[Figure: tree of execution paths; each branch edge is labeled 0 or 1]

CUTE in a Nutshell
• Generate concrete inputs one by one
  – each input leads program along a different path
• On each input execute program both concretely and symbolically
  – Both cooperate with each other
    • concrete execution guides the symbolic execution
    • concrete
      execution enables symbolic execution to overcome incompleteness of the theorem prover
      – replace symbolic expressions by concrete values if symbolic expressions become complex
      – resolve pointer aliases using concrete values
      – handle arrays naturally
    • symbolic execution helps to generate concrete input for next execution
      – increases coverage

Testing Data-structures of CUTE itself
• Unit tested several non-standard data-structures implemented for the CUTE tool
  – cu_depend (used to determine dependency during constraint solving using graph algorithm)
  – cu_linear (linear symbolic expressions)
  – cu_pointer (pointer symbolic expressions)
• Discovered a few memory leaks and a couple of segmentation faults
  – these errors did not show up in other uses of CUTE
  – for memory leaks we used CUTE in conjunction with Valgrind

SGLIB: popular library for C data-structures
• Used in Xrefactory, a commercial tool for refactoring C/C++ programs
• Found two bugs in sglib 1.0.1
  – reported them to authors
  – fixed in sglib 1.0.2
• Bug 1:
  – doubly-linked list library
    • segmentation fault occurs when a non-zero length list is concatenated with a zero-length list
    • discovered in 140 iterations (< 1 second)
• Bug 2:
  – hash-table
    • an infinite loop in hash table is-member function
    • discovered in 193 iterations (1 second)

Summary
• “DART: Directed Automated Random Testing” by Patrice Godefroid, Nils Klarlund, and Koushik Sen (PLDI’05)
  – handles only arithmetic constraints
• CUTE
  – Supports C with
    pointers, data-structures
  – Highly efficient constraint solver
    • 100-1000 times faster
    • arithmetic, pointers
  – Provides Bounded Depth-First Search and Random Search strategies
  – Publicly available tool that works on ALL C programs
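
Worked sketch: the "CUTE in a Nutshell" loop on testme

The loop from "CUTE in a Nutshell" (run on a concrete input, record the path, negate the last branch, solve for the next input) can be sketched on the testme example. This is only an illustration under stated assumptions, not CUTE's implementation: `testme_traced`, `solve_prefix`, `run_search`, and the starting input x = -5 are hypothetical names and values introduced here; `solve_prefix` is a hand-written stand-in for the constraint solver that works only for this one function; and the error is reported through a flag instead of calling abort() so the search can report how many runs it needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct cell {
    int v;
    struct cell *next;
} cell;

int f(int v) { return 2 * v + 1; }

/* Instrumented copy of testme (hypothetical helper). Records each branch
   outcome in path[] (1 = taken, 0 = not taken), returns how many branches
   were executed, and sets *error instead of calling abort(). */
int testme_traced(cell *p, int x, int path[4], int *error) {
    int n = 0;
    *error = 0;
    if ((path[n++] = (x > 0)))
        if ((path[n++] = (p != NULL)))
            if ((path[n++] = (f(x) == p->v)))
                if ((path[n++] = (p->next == p)))
                    *error = 1;
    return n;
}

/* Hand-written stand-in for the constraint solver, valid only for this
   example. Builds an input that takes the first k branches of testme:
     k >= 1: x_0 > 0           -> x = 1
     k >= 2: p_0 != NULL       -> p points to a cell
     k >= 3: 2*x_0 + 1 == v_0  -> p->v = 2*x + 1
     k >= 4: n_0 == p_0        -> p->next = p                          */
void solve_prefix(int k, cell *storage, cell **p, int *x) {
    assert(k >= 1 && k <= 4);
    *x = 1;
    if (k >= 2) {
        storage->v = 0;
        storage->next = NULL;
        *p = storage;
    }
    if (k >= 3) storage->v = 2 * (*x) + 1;
    if (k >= 4) storage->next = storage;
}

/* Depth-first search over paths. Returns the number of the run that
   reached the error branch, or -1 if it was not reached in max_runs. */
int run_search(void) {
    const int max_runs = 8;
    cell storage;
    cell *p = NULL;
    int x = -5; /* arbitrary starting input (assumption) */

    for (int run = 1; run <= max_runs; run++) {
        int path[4];
        int error;
        int n = testme_traced(p, x, path, &error);

        printf("run %d: x=%d, p=%s, path=", run, x, p ? "cell" : "NULL");
        for (int i = 0; i < n; i++) printf("%d", path[i]);
        printf("%s\n", error ? " (error branch reached)" : "");

        if (error) return run;
        /* The last recorded branch was not taken; negating it means the
           next input must take all of the first n branches. */
        solve_prefix(n, &storage, &p, &x);
    }
    return -1;
}
```

Calling `run_search()` from a `main` function walks one new path per run: each run takes one more branch of testme than the previous one, and the error branch is reached on the fifth run, when the input satisfies x_0 > 0, p_0 ≠ NULL, 2*x_0 + 1 = v_0, and n_0 = p_0. A real tool derives these inputs from the collected constraints with a solver instead of a per-example function like `solve_prefix`.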