Enhancing Symbolic Execution with Veritesting
Thanassis Avgerinos, Alexandre Rebert, Sang Kil Cha and David Brumley
Carnegie Mellon University
ICSE 2014

Background: Symbolic Execution
• Use symbols to represent variables
    x = y + 1
    z = x * 2 + 3
• Concrete execution
    y = 1
    z = 7
• Symbolic execution
    y = in_y
    z = (in_y + 1) * 2 + 3

Background: Symbolic Execution (2)
    x = input()
    if(x > 0) y = x;
    else y = -x;
    z = y;
[Figure: CFG of the program: x = input(); branch x>0? (T: y=x, F: y = -x); join at z=y]
Test case generation: x > 0 → SMT solver → input

Background: Symbolic Execution (3)
    x = input()
    if(x > 0) y = x;
    else y = -x;
    z = y;
[Figure: CFG of the program: x = input(); branch x>0? (T: y=x, F: y = -x); join at z=y]
Program verification: z = |x|
z = ite(x>0, x, -x) → SMT solver → Valid?

Problem: Approaches
• Dynamic symbolic execution (DSE) - testing
  – Path-based formulas
  – Easy-to-solve
  – Hard-to-generate (path explosion)
• Static symbolic execution (SSE) - verification
  – Property-based formulas
  – Hard-to-solve (solver blowup)
  – Easy-to-generate
• Easy-to-generate and easy-to-solve?

Method: Veritesting
• Alternates between SSE and DSE
• Twice as many bugs
• Orders of magnitude more paths
• Higher code coverage
[Figure: execution alternating DSE, SSE, DSE, SSE, DSE]

Method: DSE w/o Veritesting
[Figure: DSE on the example CFG, with S <- Ø: x = input(); branch x>0? (T: y=x, F: y = -x); z=y; ...]

Method
• CFGRecovery
• CFGReduce
• StaticSymbolic
• Finalize

Method (1): CFGRecovery
• Generate a partial CFG
• (S) Symbolic branch
• (E) Any hard-to-handle inst
  – ret
  – syscall
  – unknown exit node

Method (2): CFGReduce
• Transition points
  – Immediate postdominator of entry node
  – Predecessors of Exit
• Unrolling loops
  – Switch to concrete value
  – User-defined bound

Method (3): StaticSymbolic
    if(x > 1) y = 1;
    else if(x < 42) y = 17;

Method (4): Finalize
• Create new executor
  – For each distinct transition point
• CFG accurate
  – Overestimation
  – Underestimation
• Incremental deployment
[Figure: the example CFG again: x = input(); branch x>0? ...]

Implementation: MergePoint

Evaluation
• Metrics
  – Number of bugs
  – Node coverage
  – Path coverage
• Benchmarks
  – GNU coreutils
  – BIN suite (1,023 programs)
  – Debian packages (33,248 programs)

Evaluation (1): Bug finding
• BIN: 63 + 85
• coreutils: 2 new bugs
• 9 years old, time zone parser in Gnulib

Evaluation (2): Node Coverage
• 27% more coverage than S2E on coreutils

Evaluation (3): Path Coverage
• Three estimations
  – Time to complete test
    • 46 programs, 73% faster
  – Multiplicity
    • For BIN, 1.4 × 10^290 (average), 1.8 × 10^12 (median)
    • For coreutils, 1.4 × 10^199 (average), 4.4 × 10^11 (median)
  – Fork rate
    • Reduce average by 65%
    • Reduce median by 44%

Evaluation (4): Debian benchmark

Conclusion
• Veritesting: enhances DSE with SSE
• MergePoint: infrastructure for testing programs
• Large-scale evaluation and results

Discussion
• Why is it faster?
  – SSE introduces overhead for formula solving
  – Reduces the number of duplicated paths
  – Benefits > cost
  – Insight into the SMT solver
• Exploit generation
• Other bugs

Thanks