Deciding Bit-Vector Arithmetic with Abstraction

Randal E. Bryant (CMU)
Daniel Kroening (ETH / Oxford)
Joël Ouaknine (Oxford)
Sanjit Seshia (Berkeley)
Ofer Strichman (Technion)
Bryan Brady (Berkeley)

(3 continents, 4 countries, 5 universities, 6 authors)

TACAS'07

D. Kroening, O. Strichman: Deciding Bit-Vector Arithmetic


Context: The Verifying Compiler

Major renewed interest in bit-vector formulas:
Hardware verification is not solved yet...
Software verification.

Tony Hoare, "The Verifying Compiler: a Grand Challenge for Computing Research"


Context: The Verifying Compiler

"A Program Verifier"

"A program verifier uses automated mathematical and logical reasoning to check the consistency of programs with their internal and external specifications."


Decision Procedures for System-Level Software

What kind of logic do we need for system-level software?

1. We need bit-vector logic, with bit-wise operators and arithmetic overflow.
2. We want to scale to large programs, so we must verify large formulas.
3. Examples of program analysis tools that generate bit-vector formulas:
   CBMC, SATABS
   F-Soft (NEC)
   SATURN (Stanford, Alex Aiken)
   EXE (Stanford, Dawson Engler, David Dill)
   Variants of those developed at IBM, Microsoft


Outline

1. Background
   Syntax
   Semantics
   Flattening Bit-Vector Logic
   Incremental flattening
2. Under- and over-approximation
3. Automatic refinement
4. Experimental results


Bit-Vector Logic: Syntax

formula : formula ∨ formula | ¬formula | atom
atom    : term rel term | Boolean-Identifier | term[ constant ]
rel     : = | <
term    : term op term | identifier | ∼ term | constant | atom?term:term | term[ constant : constant ] | ext( term )
op      : + | − | · | / | << | >> | & | | | ⊕ | ◦

∼ x: bit-wise negation of x
ext(x): sign- or zero-extension of x
x << d: left shift with distance d
x ◦ y: concatenation of x and y


Semantics

Danger!

(x − y > 0) ⇐⇒ (x > y)

Valid over R/N, but not over the bit-vectors. For example, with unsigned 8-bit vectors, x = 0 and y = 1 give x − y = 255 > 0 although x < y. (Many compilers have this sort of bug.)


Width and Encoding

The meaning depends on the width and encoding of the variables.

Typical encodings:

Binary encoding: $\langle x \rangle := \sum_{i=0}^{n-1} a_i \cdot 2^i$

Two's complement: $[x] := -2^{n-1} \cdot a_{n-1} + \sum_{i=0}^{n-2} a_i \cdot 2^i$

But maybe also fixed-point, floating-point, ...


Semantics

The meaning of the bit-wise operators is straightforward.

The meaning of the arithmetic operators is defined using modular arithmetic:

$a + b = c \iff \langle a \rangle + \langle b \rangle = \langle c \rangle \bmod 2^n$ (for n-bit binary bit-vectors a, b, c)

The meaning depends on the encoding for most operators.

Satisfiability is undecidable for an unbounded width. It is NP-complete otherwise.


A simple decision procedure

Transform Bit-Vector Logic to Propositional Logic.
This is the most commonly used decision procedure and is standard in industry.
Also called 'bit-blasting'.

Bit-Vector Flattening

1. Convert the propositional part.
2. Add a Boolean variable for each bit of each sub-expression (term).
3. Add a constraint for each sub-expression.


Bit-vector flattening

What constraints do we generate for a given term?

This is easy for the bit-wise operators.
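The remark about bit-wise operators can be made concrete with a small sketch. It flattens c = a & b into clauses, one Boolean variable per bit, and checks the encoding by brute force. This is an illustration only, not the authors' implementation: the function names (`flatten_and`, `satisfies`, `value`, `encoding_is_exact`), the 3-bit width, and the numbering of Boolean variables as signed integers are all assumptions made for this example.

```python
from itertools import product

def flatten_and(a_bits, b_bits, c_bits):
    """Clauses (lists of signed variable numbers) stating c_i <-> (a_i AND b_i)."""
    clauses = []
    for a, b, c in zip(a_bits, b_bits, c_bits):
        clauses.append([-c, a])        # c_i implies a_i
        clauses.append([-c, b])        # c_i implies b_i
        clauses.append([c, -a, -b])    # a_i and b_i together imply c_i
    return clauses

def satisfies(clauses, assignment):
    """True if every clause has at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def value(bits, assignment):
    """Unsigned value of a bit-vector, least significant bit first."""
    return sum(int(assignment[v]) << i for i, v in enumerate(bits))

# Hypothetical 3-bit example: Boolean variables 1..9, one per bit of a, b and c.
A_BITS, B_BITS, C_BITS = [1, 2, 3], [4, 5, 6], [7, 8, 9]
CLAUSES = flatten_and(A_BITS, B_BITS, C_BITS)

def encoding_is_exact():
    """Brute-force check: the clauses hold exactly when c == a & b."""
    for values in product([False, True], repeat=9):
        assignment = dict(zip(range(1, 10), values))
        a, b, c = (value(bits, assignment) for bits in (A_BITS, B_BITS, C_BITS))
        if satisfies(CLAUSES, assignment) != (c == (a & b)):
            return False
    return True

print(len(CLAUSES), encoding_is_exact())
```

Each bit contributes three clauses that depend only on that bit position, which is why bit-wise operators flatten cheaply: the clause count grows linearly with the width and no information flows between positions.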
Addition, subtraction: carry chain adders

[Figure: 8-bit carry-chain adder built from eight full adders (FA); inputs a7..a0 and b7..b0, carry-in i, sum outputs s7..s0, carry-out o]

Multiplication: simple quadratic multipliers

This works very well for many formulas.


Multipliers

Multipliers result in very hard formulas.

Example: a · b = c ∧ b · a ≠ c ∧ x < y ∧ x > y

CNF: about 11000 variables, unsolvable for current SAT solvers.

Similar problems with division, modulo.

Q: Why is this hard?
Q: How do we fix this?


Incremental flattening

[Flowchart]
1. Initialize $\varphi_f := \varphi_{sk}$, $F := \emptyset$.
2. Is $\varphi_f$ SAT? If no, return UNSAT.
3. If yes, compute $I$. If $I = \emptyset$, return SAT.
4. If $I \neq \emptyset$, choose $F' \subseteq (I \setminus F)$, set $F := F \cup F'$ and $\varphi_f := \varphi_f \wedge \mathit{Constraint}(F)$, then go back to step 2.

$\varphi_{sk}$: abstraction of ϕ with new term variables.
$F$: set of terms that are in the encoding.
$I$: set of terms that are inconsistent with the current assignment.


Incremental flattening

Idea: add the 'easy' parts of the formula first.
Only add the hard parts when needed.
Implemented by current solvers.
$\varphi_f$ only gets stronger, so use an incremental SAT solver.


Problems of incremental flattening

What if there are simply many 'easy' constraints? Many: >1 GB CNF.
What if you really need some 'hard' constraints?

New algorithm: an abstraction-refinement procedure.


An abstraction-refinement procedure

The suggested algorithm is potentially efficient when:

1. ϕ is satisfiable, and there exists a numerically 'small' solution.
2. ϕ is unsatisfiable, and a relatively small number of terms in this formula participate in the proof, i.e., the proof still holds after replacing the other terms with new inputs.


Under-approximation

Scenario #1: Formula is SAT.
Typically, small values are sufficient to satisfy the formula.
Small values = few bits.
→ Force some bits to 0.
In the case of signed values, use sign extension.
This is an under-approximation of ϕ.


Under-approximation

Let Var(ϕ) = $v_1, \ldots, v_n$.

Each under-approximation defines a frontier $\langle w_1, \ldots, w_n \rangle$, the allowed widths of the variables.

The under-approximation pushes the frontier:
Monotonically,
not uniformly,
guided by the over-approximation.


Over-approximation

Scenario #2: Formula is UNSAT.

Frequently, a small part of ϕ is sufficient to show that ϕ is UNSAT.
→ Remove atoms by replacing them with new variables.
This is an over-approximation of ϕ.


Under- and over-approximation

Q: Which bits do we set to zero for the under-approximation of ϕ?
Q: Which parts of the formula do we remove for the over-approximation of ϕ?

Idea: start with small formulas, then refine.


Over-approximation

Consider the DAG $D_\varphi$ corresponding to φ.

Every internal node represents one of:
A Boolean gate (∧, ∨, ...),
A predicate (=, >, ...),
An operator over bit-vector terms (∗, +, >>, ...).

Every leaf node represents one of:
A Boolean variable,
A bit-vector variable.


Over-approximation

Each internal node n in $D_\varphi$ is represented by an auxiliary variable (Tseitin style):
A Boolean auxiliary variable for the gates and predicates,
A bit-vector otherwise.

Hence, each such node is associated with a set of CNF clauses c(n).

Let C be the unsatisfiable core from the under-approximation.

Over-approximation: if n is Boolean and vars(c(n)) ∩ vars(C) = ∅, replace it with a new variable.
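The replacement rule above can be sketched as a short recursive traversal. This is a minimal illustration under stated assumptions, not the authors' code: the `Node` class, the `overapproximate` function, the `fresh_k` naming scheme, and the example formula with its variable names are hypothetical. Unlike an in-place replacement, the sketch returns a modified copy of the DAG, and it allows any number of children per node.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    is_boolean: bool                       # gate or predicate (True) vs. bit-vector term (False)
    clause_vars: frozenset = frozenset()   # vars(c(n)): variables in the node's CNF clauses
    children: list = field(default_factory=list)

def overapproximate(node, core_vars, fresh, memo=None):
    """Copy the DAG, replacing Boolean nodes whose clauses share no variable with the core."""
    memo = {} if memo is None else memo
    if id(node) in memo:                   # shared sub-DAGs are processed only once
        return memo[id(node)]
    if not node.children:                  # leaves are kept unchanged
        result = node
    elif node.is_boolean and not (node.clause_vars & core_vars):
        fresh.append(f"fresh_{len(fresh)}")
        result = Node(fresh[-1], True)     # new unconstrained Boolean variable
    else:
        result = Node(node.name, node.is_boolean, node.clause_vars,
                      [overapproximate(c, core_vars, fresh, memo) for c in node.children])
    memo[id(node)] = result
    return result

# Hypothetical formula: (x * y = z) AND (a < b) AND (b < a).
x, y, z, a, b = (Node(v, False) for v in "xyzab")
mul = Node("x*y", False, frozenset({"t_mul", "x", "y"}), [x, y])
eq = Node("x*y=z", True, frozenset({"p_eq", "t_mul", "z"}), [mul, z])
lt1 = Node("a<b", True, frozenset({"p_lt1", "a", "b"}), [a, b])
lt2 = Node("b<a", True, frozenset({"p_lt2", "a", "b"}), [b, a])
root = Node("and", True, frozenset({"p_root", "p_eq", "p_lt1", "p_lt2"}), [eq, lt1, lt2])

# Assumed unsat core: only the two contradictory comparisons take part in the proof.
core_vars = {"p_root", "p_lt1", "p_lt2", "a", "b"}
fresh = []
abstracted = overapproximate(root, core_vars, fresh)
print([child.name for child in abstracted.children], fresh)
```

In this example the multiplier disappears from the abstraction because the equality predicate above it is replaced by a fresh Boolean variable, while the two comparisons that make the formula unsatisfiable are kept.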


Over-approximation

procedure A(DAG $D_\varphi$, node n, unsat-core C)
    if n is a leaf then
        return;
    end if
    if n is Boolean and vars(c(n)) ∩ vars(C) = ∅ then
        Replace n in $D_\varphi$ with a new Boolean variable;
        return;
    end if
    A($D_\varphi$, left-child($D_\varphi$, n), C);
    A($D_\varphi$, right-child($D_\varphi$, n), C);
end procedure


Automatic refinement

[Flowchart]
1. Select small bv-sizes.
2. Is the under-approximation of ϕ SAT? If yes, ϕ is SAT.
3. If no, the solver returns a proof. Compute the over-approximation of ϕ from the proof.
4. Is the over-approximation of ϕ SAT? If no, ϕ is UNSAT.
5. If yes, the solver returns an assignment. Set new bv-sizes to cover the assignment and go back to step 2.


Over-approximation

Let α be a satisfying assignment of ϕ.
Let $w_i(\alpha)$ denote the width of variable i under α.
Let $F(\alpha) = \langle w_1, \ldots, w_n \rangle$ denote the frontier of α.

Let $F = \langle w_1, \ldots, w_n \rangle$ and $F' = \langle w'_1, \ldots, w'_n \rangle$ be two frontiers. We say that $F$ dominates $F'$ if $w_i \geq w'_i$ for all $1 \leq i \leq n$.

Theorem: For every α |= ϕ, $F(\alpha)$ dominates the current frontier $F$.


Experimental results

Benchmarks used:
Y86-*: CISC processor benchmarks
s-40-50: security vulnerabilities in C programs
Hardware verification benchmark from Intel
C-P*: system-level equivalence checking models from a CAD company
egt-5212: directed random testing of programs (SMT-COMP'06)

We compare to:
Bit-blasting
STP, Yices: winners of the previous two competitions


Experimental results

All run-times are in seconds.

                       Bit-Blasting            Refinement
Formula         Ans.   Enc.    SAT    Total    Enc.    SAT      Total    STP      Yices
Y86-std         UNSAT  17.91   *      *        23.51   987.91   1011.42  2083.73  *
Y86-btnft       UNSAT  17.79   *      *        26.15   1164.07  1190.22  err      *
s-40-50         SAT    6.00    33.46  39.46    106.32  10.45    116.77   12.96    65.51
BBB-32          SAT    37.09   29.98  67.07    19.91   1.74     21.65    38.45    183.30
rfunit flat-64  SAT    121.99  32.16  154.15   19.52   1.68     21.20    873.67   1312.00
C1-P1           SAT    2.68    45.19  47.87    2.61    0.58     3.19     err      err
C1-P2           UNSAT  0.44    *      *        2.24    2.12     4.36     *        *
C3-OP80         SAT    14.96   *      *        14.54   349.41   363.95   *        3242.43
egt-5212        UNSAT  0.064   0.003  0.067    0.163   0.001    0.164    0.018    0.009

Our refinement is always better than 'bit-blasting'.
Yices wins on 'easy' SMT-COMP'06-type benchmarks.
Refinement dominates on hard benchmarks.
s-40-50: large number of iterations, needs an incremental solver.


Conclusion and future work

Substantial improvement on hard bit-vector formulas.
Recently adopted by Synopsys for their tool Hector.
The idea is applicable to other logics, e.g., quantifier-free Presburger arithmetic.

Future work:
Incremental implementation
More forms of over/under-approximation
What about floating point?