Bebop: A Symbolic Model Checker for Boolean Programs

Thomas Ball
Sriram K. Rajamani
http://research.microsoft.com/slam/

Outline
  • Boolean Programs and Bebop
      • What?
      • Why?
      • Results
      • Demo
  • Semantics of Boolean Programs
  • Technical details of algorithm
  • Evaluation
  • Related Work

Boolean Programs: What
  • Model for representing abstractions of imperative programs in C, C#, Java, etc.
  • Features:
      • Boolean variables
      • Control flow: sequencing, conditionals, looping, GOTOs
      • Procedures
          • Call-by-value parameter passing
          • Recursion
      • Control non-determinism

Boolean Programs: Why
      bool x,y;
  [1] while (true) {
  [2]   if (x == y) {
  [3]     y = !x;
        } else {
  [4]     x = !x;
  [5]     y = !y;
        }
  [6]   if (?) break;
      }
  [7] if (x == y)
  [8]   assert (false);

  • Representation of program abstractions, a la Cousots
  • Each boolean variable represents a predicate:
      • (i < j)
      • (*p==i) && ((int) p == j)
      • (p ∈ T), where T is a recursive data type [Graf-Saidi]

Bebop - Results
  • Reachability in boolean programs reduced to context-free language reachability
  • Symbolic interprocedural dataflow analysis
      • Adaptation of [Reps-Horwitz-Sagiv, POPL'95] algorithm
  • Complexity of algorithm is O(E · 2^n)
      • E = size of interprocedural control flow graph
      • n = max. number of variables in the scope of any label

Bebop - Results
  • Admits control flow + variables
      • Existing pushdown model checkers don't use variables (encode variable values explicitly in state) [Esparza et al.]
  • Analyzes procedures separately
      • Exploits procedural abstraction + locality of variable scopes
  • Uses hybrid representation
      • Explicit representation of control flow graph, as in a compiler
      • Implicit representation of reachable states via BDDs
  • Generates hierarchical trace

Bebop Demo!

Outline
  • Boolean Programs and Bebop
  • Semantics of Boolean Programs
      • "Stackless" semantics using context-free grammar
  • Technical details of algorithm
  • Evaluation
  • Related Work

Stackless Semantics
  • State η = <p,Ω>
      • p = program counter
      • Ω = valuation to variables in scope at p
  • No stack!
  • Σ(B): finite alphabet over boolean program B
      • <call,p,Δ>: call (with return to p), Δ a valuation to Locals(p)
      • <ret,p,Δ>: return to p, Δ a valuation to Locals(p)
  • State transition: <p,Ω> --α--> <p',Ω'>
  • Call, α = <call,e,Δ>:
      <c: Pr(), Ω> --α--> <d: Proc Pr(), Ω'>
      Δ(x) = Ω(x), x in Locals(c)
      Ω'(g) = Ω(g), g a global
  • Return, α = <ret,e,Δ>:
      <r, Ω> --α--> <e, Ω'>
      Ω'(x) = Δ(x), x in Locals(c)
      Ω'(g) = Ω(g), g a global

Trace Semantics
  • Context-free grammar L(B) constrains allowable traces:
      M -> <call,q,Δ> M <ret,q,Δ>
      M -> M M
      M ->
  • η_0 --α_1--> η_1 --α_2--> … η_(m-1) --α_m--> η_m is a trajectory of B iff
      • η_i --α_(i+1)--> η_(i+1) is a state transition, for all i
      • α_1 α_2 … α_m ∈ L(B)

Outline
  • Boolean Programs and Bebop
  • Semantics of Boolean Programs
  • Technical details of reachability algorithm
      • Binary Decision Diagrams (BDDs)
      • Path edges
      • Summary edges
      • Example
  • Preliminary Evaluation
  • SLAM Project

Binary Decision Diagrams
  • Acyclic graph data structure for representing a boolean function (equivalently, a set of bit vectors)
  • F(x,y,z) = (x=y)
  [Figure: full decision tree for F over x, y, z with 0/1 leaves]

Hash Consing + Variable Elimination
  [Figure: the decision tree for F(x,y,z) = (x=y) reduced step by step, first sharing identical subgraphs, then removing the redundant tests on z]

Path Edges
  • <Ω_e,Ω_p> ∈ PE(p) iff
      • there exists an initialized trajectory ending in <e,Ω_e>, where e = entry(Proc(p))
      • there exists a trajectory from <e,Ω_e> to <p,Ω_p>
  • PE(p) is a set of pairs of valuations to the boolean variables in scope in Proc(p)
  • Can be represented with a BDD!
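The two reductions named on the BDD slides, and the claim that a set of valuation pairs fits in a BDD, can be illustrated with a minimal Python sketch. It is not Bebop's BDD package: the class `TinyBDD` and its methods are hypothetical names, the diagram is built by brute-force enumeration (fine only for a handful of variables), and the `^` in the path-edge formula is assumed to mean conjunction.

```python
from itertools import product


class TinyBDD:
    """Toy reduced ordered BDD; node ids 0 and 1 are the terminals."""

    def __init__(self, var_names):
        self.var_names = list(var_names)  # fixed variable order
        self.unique = {}                  # (level, low, high) -> node id
        self.nodes = {}                   # node id -> (level, low, high)

    def mk(self, level, low, high):
        if low == high:
            # Variable elimination: both branches agree, so skip the test.
            return low
        key = (level, low, high)
        if key not in self.unique:
            # Hash consing: structurally equal nodes share one id.
            node_id = len(self.nodes) + 2
            self.unique[key] = node_id
            self.nodes[node_id] = key
        return self.unique[key]

    def build(self, fn, level=0, env=()):
        """Shannon-expand a Python predicate along the variable order."""
        if level == len(self.var_names):
            return 1 if fn(*env) else 0
        low = self.build(fn, level + 1, env + (False,))
        high = self.build(fn, level + 1, env + (True,))
        return self.mk(level, low, high)

    def evaluate(self, root, assignment):
        """Follow one path from the root to a terminal."""
        node = root
        while node not in (0, 1):
            level, low, high = self.nodes[node]
            node = high if assignment[self.var_names[level]] else low
        return bool(node)

    def size(self, root):
        """Count internal nodes reachable from root."""
        seen, stack = set(), [root]
        while stack:
            node = stack.pop()
            if node in (0, 1) or node in seen:
                continue
            seen.add(node)
            _, low, high = self.nodes[node]
            stack.extend((low, high))
        return len(seen)


# F(x,y,z) = (x = y): the tests on z disappear after reduction.
bdd = TinyBDD(["x", "y", "z"])
f_root = bdd.build(lambda x, y, z: x == y)
f_size = bdd.size(f_root)

# A path-edge style relation over unprimed/primed variables,
# reading ^ as "and": (x'=x) and (y'=y) and (z' = x and y).
names = ["x", "x'", "y", "y'", "z", "z'"]
pe = TinyBDD(names)
pe_root = pe.build(
    lambda x, xp, y, yp, z, zp: xp == x and yp == y and zp == (x and y)
)
pe_pairs = sum(
    pe.evaluate(pe_root, dict(zip(names, bits)))
    for bits in product([False, True], repeat=len(names))
)
```

Here `f_size` counts the internal nodes left for F after both reductions, and `pe_pairs` counts how many (pre-state, post-state) pairs the relation contains. A production BDD library would build diagrams with memoized apply operations rather than enumerating assignments.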
Representing Path Edges with BDDs
  • Example PE(p) for boolean variables x, y and z:
      PE(p) = F(x,y,z,x',y',z') = (x'=x)^(y'=y)^(z'=x^y)
  • BDDs also used to represent transfer functions for statements:
      Transfer(z := x^y) = F(x,y,z,x',y',z') = (x'=x)^(y'=y)^(z'=x^y)

  Join(S,T) = { <Ω_1,Ω_2> | <Ω_1,Ω_J> ∈ S, <Ω_J,Ω_2> ∈ T }

  decl g;
  void main()                 1
  begin
    decl h;
    h := !g;                  g'=0^h'=1
    A(g,h);                   | g'=1^h'=0
    skip;
    A(g,h);
    skip;
    if (g) then
      R: skip;
    fi
  end

  void A(a1,a2)
  begin                       g=g'=0^a1=a1'=0^a2=a2'=1
    if (a1) then              | g=g'=1^a1=a1'=1^a2=a2'=0
      A(a2,a1);
      skip;
    else                      g=g'=0^a1=a1'=0^a2=a2'=1
      g := a2;                g=0^g'=1^a1=a1'=0^a2=a2'=1
    fi
  end

Summary Edges
  • <Ω_1,Ω_2> = Lift(<Ω_d,Ω_r>, Pr)
  [Figure: call site c: Pr() and return point e in the caller, linked by <Ω_1,Ω_2>; entry d: Proc Pr() and exit r in the callee, linked by <Ω_d,Ω_r>]
  • Ω_1(x) = Ω_2(x), x in Locals(c): locals don't change
  • Ω_1(g) = Ω_d(g) and Ω_r(g) = Ω_2(g), g global: propagation of global state

  decl g;
  void main()
  begin
    decl h;
    h := !g;
    A(g,h);
    skip;                     g=0^g'=1^h=h'=1
    A(g,h);
    skip;                     g=0^g'=1^h=h'=1
    if (g) then
      R: skip;
    fi
  end

  void A(a1,a2)
  begin
    if (a1) then
      A(a2,a1);               g=0^g'=1^a1=a1'=1^a2=a2'=0
      skip;
    else
      g := a2;                g=0^g'=1^a1=a1'=0^a2=a2'=1
    fi
  end

  decl g;
  void main()                 1
  begin
    decl h;
    h := !g;                  g'=0^h'=1
    A(g,h);                   | g'=1^h'=0
    skip;                     g=0^g'=1^h=h'=1
    A(g,h);
    skip;                     g'=h'=1
    if (g) then
      R: skip;
    fi
  end

  void A(a1,a2)
  begin                       g=g'=a1=a1'=a2=a2'=1
    if (a1) then
      A(a2,a1);               g=g'=a1=a1'=a2=a2'=1
      skip;
    else
      g := a2;
    fi
  end

Worklist Algorithm
  • while PE(v) has changed, for some v
      • Determine if any new path edges can be generated
  • New path edge comes from
      • Existing path edge + transfer function
      • Existing path edge + summary edge (transfer function for procedure calls)
  • New summary edges generated from path edges that reach exit vertex

Generating Error Traces
  • Partition reachable states into "rings"
      • A ring R at stmt S is numbered N iff there is a shortest trace of length N to S ending in a state in R
  • Hierarchical generation of error trace
      • Skip over or descend into called procedures

Outline
  • Boolean Programs and Bebop
  • Semantics of Boolean Programs
  • Technical details of algorithm
  • Preliminary Evaluation
      • Linear behavior if # vars
        in scope remains constant
      • Self application of Bebop
  • Related Work

  decl g;
  void main()
  begin
    level1();
    level1();
    if (!g) then
      reach: skip;
    else
      skip;
    fi
  end

  void level<i>()
  begin
    decl a,b,c;
    if (g) then
      while (!a|!b|!c) do
        if (!a) then
          a := 1;
        elsif (!b) then
          a,b := 0,1;
        elsif (!c) then
          a,b,c := 0,0,1;
        else
          skip;
        fi
      od
    else
      <stmt>; <stmt>;
    fi
    g := !g;
  end

  [Figure: "Peak Live BDD Nodes for T(N)"; series "Peak Space for T(N)"; x-axis N from 0 to 1000, y-axis from 0 to 250000]

Application: Analysis Validation
  • Live variable analysis (LVA)
      • A variable x is live at s if there is a path from s to a use of x (with no intervening def of x)
  • Used to optimize Bebop
      • Quantify out variables as soon as they become dead
  • How to check correctness of LVA?
  • Analysis validation
      • Create a boolean program to check results of LVA
      • Model check boolean program (w/out LVA)

Analysis Validation
  • Output of LVA: { (s,x) | x is dead at s }
  • Boolean program
      • Two variables per original program var x:
          • x_dead (initially 0)
          • x_defined (initially 0)
      • For each fact (s,x):
          x_dead, x_defined := 1, 0;
      • For each def of x:
          x_defined := 1;
      • For each use of x:
          if (x_dead && !x_defined) LVAError();
  • Query: is LVAError reachable?
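The instrumentation scheme on the slide above can be tried out with a small Python sketch. It substitutes a plain explicit-state breadth-first search for Bebop's symbolic model checking, and it rests on assumptions not stated in the slides: the function name `lva_error_reachable`, the control-flow graph encoding, and the reading that a fact (s,x) holds on entry to s, so the fact is applied before the uses of s are checked and before its defs take effect.

```python
from collections import deque


def lva_error_reachable(cfg, entry, dead_facts):
    """Return True if the instrumented program can reach LVAError.

    cfg maps a statement label to (uses, defs, successors).
    dead_facts is the LVA output: a set of (label, variable) pairs.
    """
    variables = sorted(
        {v for uses, defs, _ in cfg.values() for v in (*uses, *defs)}
    )
    index = {v: i for i, v in enumerate(variables)}
    all_false = (False,) * len(variables)

    # State = (label, x_dead flags, x_defined flags), all initially 0.
    start = (entry, all_false, all_false)
    seen = {start}
    work = deque([start])
    while work:
        label, dead, defined = work.popleft()
        dead, defined = list(dead), list(defined)
        uses, defs, successors = cfg[label]

        # For each fact (s,x): x_dead, x_defined := 1, 0
        for v in variables:
            if (label, v) in dead_facts:
                dead[index[v]] = True
                defined[index[v]] = False

        # For each use of x: if (x_dead && !x_defined) LVAError()
        for v in uses:
            if dead[index[v]] and not defined[index[v]]:
                return True

        # For each def of x: x_defined := 1
        for v in defs:
            defined[index[v]] = True

        for nxt in successors:
            state = (nxt, tuple(dead), tuple(defined))
            if state not in seen:
                seen.add(state)
                work.append(state)
    return False


# Hypothetical three-statement program: s1: x := ...; s2: y := x; s3: use y
cfg = {
    "s1": ((), ("x",), ("s2",)),
    "s2": (("x",), ("y",), ("s3",)),
    "s3": (("y",), (), ()),
}
correct_facts = {("s1", "x"), ("s1", "y"), ("s2", "y"), ("s3", "x")}
buggy_facts = correct_facts | {("s2", "x")}  # wrongly claims x is dead at s2

correct_ok = not lva_error_reachable(cfg, "s1", correct_facts)
bug_found = lva_error_reachable(cfg, "s1", buggy_facts)
```

With the correct facts no error state is reachable, while the extra fact ("s2", "x") is caught at the use of x in s2. The search terminates because the state space (label plus two booleans per variable) is finite.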
Results
  • Found subtle error in implementation of LVA
  • Was able to show colleague that there was another error, in his code
  • Analysis validation now part of regression test suite

Related Work
  • Pushdown Automata (PDA) decidability results
      • [Hopcroft-Ullman]
  • Model checking PDAs
      • [Bouajjani-Esparza-Maler]
      • [Esparza-Hansel-Rossmanith-Schwoon]
  • Model checking Hierarchical State Machines
      • [Alur, Grosu]
  • Interprocedural dataflow analysis
      • [Sharir-Pnueli]
      • [Steffen]
      • [Knoop-Steffen]
      • [Reps-Horwitz-Sagiv]

Related Work
  • Reps-Horwitz-Sagiv (RHS) algorithm
      • Handles IFDS problems
          • Interprocedural
          • Finite domain D
          • Distributive dataflow functions (MOP=MFP)
          • Subsets of D
      • Dataflow as CFL reachability over "exploded graph"
  • Our results
      • RHS algorithm can be reformulated as a traditional dataflow algorithm over the original control-flow graph with the same time/space complexity
      • Reformulated algorithm is easily lifted to powersets of D using BDDs
          • Arbitrary dataflow functions
          • Path-sensitive

Summary
  • Bebop: a model checker for boolean programs
      • Based on interprocedural dataflow analysis using BDDs
      • Exploits procedural abstraction
      • Admits many traditional compiler optimizations
      • Hierarchical trace generation + DHTML user interface
      • Release at end of year
  • SLAM project
      • Iteratively refine boolean program models of C programs
      • Use path simulation to discover relevant predicates (simcl)
      • Automated predicate abstraction (c2bp)

Software Productivity Tools
Microsoft Research
http://research.microsoft.com/slam/