Formal Verification of Infinite-State Systems Using Boolean Methods
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
FLoC '06

Main Ideas
  Infinite State Systems
    Greater power & generality than finite-state models
    Verified by extensions of finite-state model checking
    Must find balance between expressiveness of model & ability to automate
  Outline
    Why infinite state systems?
    UCLID modeling capabilities
    Verification methods
    Implementation
      Advances in SAT and decision procedures
    Prospects and challenges

Verification Example
  Task
    Verify that microprocessor correctly implements instruction set definition
    Even though heavily pipelined
  [Figure: Alpha 21264 Microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

Verification Challenges
  Sources of Complexity
    Lots of internal state
    Complex control logic
  Opportunities
    Most of the logic serves to store, select, and communicate data
  [Figure: Alpha 21264 Microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

Sources of Infinity
  Real-life computers are finite state
  Infinite-State Abstractions
    Traditional model for reasoning about programs
    Soundness depends on properties being verified
    Computer words
    Memory capacities
  [Figure: buffer with head and tail pointers and an in-use region]

Sources of Infinity (cont.)
  Finite, but unbounded
    Synchronization protocol that should work for arbitrary number of processes
      [Figure: processes P1, P2, …, PN]
      Verify for arbitrary N
    Circular buffer with fixed, but arbitrary capacity
      [Figure: buffer indexed 0 to Max-1, with head and tail pointers and an in-use region]
      Verify for arbitrary value of Max

Existing Automatic Verification Methods
  Simulators, model checkers, …
  All Operate at Bit Level
    State encoded as words and arrays of words
      Comprised of bits
    Must track how each bit of state gets updated
    [Figure: state model]
  Only Verify Single Instance of Design
    Fixed values for parameters
      Word size
      Buffer sizes
      Number of processes
  Some Work in Parameterized System Verification
    Exploit symmetries in system
    Limited applicability

What About Theorem Provers?
  Traditional Tool for Formal Verification
    Allow many forms of abstraction
  Hard to Use
    Lots of manual effort & expertise required
  Question:
    Can we incorporate some of these abstraction abilities into an automated tool?

UCLID
  Seshia, Lahiri, Bryant, CAV '02
  Term-Level Verification System
    Language for describing systems
      Inspired by CMU SMV
    Symbolic simulator
      Generates integer expressions describing system state after sequence of steps
    Decision procedure
      Determines validity of formulas
    Support for multiple verification techniques
  Available by Download
    http://www.cs.cmu.edu/~uclid

Data Abstraction #1: Bits → Integers
  [Figure: bits x0, x1, x2, …, xn-1 viewed as a single word x]
  View Data as Symbolic Words
    Arbitrary integers
      No assumptions about size or encoding
      Classic model for reasoning about software
    Can store in memories & registers

Abstracting Data Bits
  [Figure: control logic and data path, with combinational logic blocks Com. Log. 1 and Com. Log. 2]
  What do we do about logic functions?

Abstraction #2: Uninterpreted Functions
  [Figure: ALU replaced by function f]
  For any Block that Transforms or Evaluates Data:
    Replace with generic, unspecified function
    Only assumed property is functional consistency:
      a = x ∧ b = y ⇒ f(a, b) = f(x, y)

Abstracting Functions
  [Figure: control logic and data path, with combinational logic blocks replaced by functions F1 and F2]
  For Any Block that Transforms Data:
    Replace by uninterpreted function
    Ignore detailed functionality
    Conservative approximation of actual system

Abstraction #3: Modeling Memories as Mutable Functions
  Memory M Modeled as Function
    M(a): Value at location a
  Initially
    Arbitrary state
    Modeled by uninterpreted function m0

Effect of Memory Write Operation
  Writing Transforms Memory
    M′ = Write(M, wa, wd)
  Express with Lambda Notation
    M′ = λa . ITE(a = wa, wd, M(a))
  Reading from updated memory:
    Address wa will get wd
    Otherwise get what's already in M
  [Figure: multiplexer selecting wd when a = wa, and M(a) otherwise]

Comparison to Array Modeling
  Theory of Arrays
    Read operation: Read(M, a)
    Write operation: Write(M, wa, wd)
    Memory comparison predicate: M1 = M2
  Mutable Functions
    Function application: M(a)
    Lambda definition: λa . ITE(a = wa, wd, M(a))
    Content comparison: ∀a . M1(a) = M2(a)
      Only feasible for positive equality
    Not limited to one dimension
    Lambda definition allows other forms of updating

An Out-of-order Processor (OOO)
  [Figure: OOO block diagram. Program memory, PC (incr), decode, dispatch, Register Rename Unit (valid, tag, val), Reorder Buffer (head, tail), ALU (execute), result bus, retire. Reorder Buffer Fields: valid, value, src1valid, src1val, src1tag (1st operand), src2valid, src2val, src2tag (2nd operand), dest, op; result]
  Data Dependencies Resolved by Register Renaming
    Map register ID to instruction in reorder buffer that will generate register value
  Inorder Retirement Managed by Retirement Buffer
    FIFO buffer keeping pending instructions in program order

Access Modes for Reorder Buffer
  [Figure: reorder buffer with head and tail, connected to dispatch, retire, ALU (execute), and result bus]
  FIFO
    Insert when dispatch
    Remove when retire
  Directly Addressable
    Select particular entry for execution
    Retrieve result value from executed instruction
  Content Addressable
    Broadcast result to all entries with matching source tag
  Global
    Flush all queue entries when instruction at head causes exception

Underlying Logic
  Scalar Data Types
    Formulas (F): Boolean Expressions
      Control signals
    Terms (T): Integer Expressions
      Data values
  Functional Data Types
    Functions (Fun): Integer → Integer
      Immutable: Functional units
      Mutable: Memories
    Predicates (P): Integer → Boolean
      Immutable: Data-dependent control
      Mutable: Bit-level memories

CLU Logic
  Counter Arithmetic, Lambda Expressions and Uninterpreted Functions
  Terms (T)                  Integer Expressions
    ITE(F, T1, T2)             If-then-else
    Fun(T1, …, Tk)             Function application
    succ(T)                    Increment
    pred(T)                    Decrement
  Formulas (F)               Boolean Expressions
    ¬F, F1 ∧ F2, F1 ∨ F2       Boolean connectives
    T1 = T2                    Equation
    T1 < T2                    Inequality
    P(T1, …, Tk)               Predicate application
  To support pointer operations

CLU Logic (Cont.)
  Functions (Fun)            Integer → Integer
    f                          Uninterpreted function symbol
    λ x1, …, xk . T            Function definition
  Predicates (P)             Integer → Boolean
    p                          Uninterpreted predicate symbol
    λ x1, …, xk . F            Predicate definition

UCLID Decision Procedure
  Operation
    Series of transformations leading to propositional formula
    Except for lambda expansion, each has polynomial complexity
  [Figure: CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability]

System Model
  [Figure: present state and next state, with reset and arbitrary inputs]
  State Variable Types
    Boolean: Control signals
    Integer: Data, addresses
    Function: Memories, buffers
  System Operation
    Synchronous
      All state variables updated on each step of operation
    Interleaving
      One (set of) state variable(s) updated at a time
      Simulate in synchronous model with uninterpreted scheduling function

Outline
  Why Infinite State Systems?
  Modeling capabilities of UCLID
  Verification Methods
  Implementation
    Advances in SAT and decision procedures
  Prospects and Challenges

Verifying Safety Properties
  [Figure: present state, next state, reset, and arbitrary inputs; reset states, reachable states, and bad states]
  Prove: System will never reach bad state

Bounded Model Checking
  [Figure: reachable sets R1, R2, …, Rn growing from reset states; bad states]
  Repeatedly Perform Image Computations
    Set of all states reachable by one more state transition
  Easy to Implement
  Underapproximation of Reachable State Set
    But, typically catch most bugs with 8–10 steps

Implementing BMC
  [Figure: symbolic simulation from Reset through states X1, X2, …, Xn; is Bad satisfiable?]
  Construct verification condition formula for step n by symbolically simulating system for n cycles
  Check with decision procedure
  Do as many cycles as tractable

True Model Checking
  [Figure: reachable sets R1, R2, …, Rn growing from reset states; bad states]
  Reach Fixed-Point
    Rn = Rn+1 = Reachable
  Impractical for Term-Level Models
    Many systems never reach fixed point
      Can keep adding elements to buffer
    Convergence test undecidable
      Bryant, Lahiri, Seshia, CHARME '03

Inductive Invariant Checking
  [Figure: invariant I containing the reachable states and reset states, excluding bad states]
  Key Properties of System that Make it Operate Correctly
    Formulate as formula I
  Prove Inductive
    Holds initially: I(s0)
    Preserved by all state changes: I(s) ⇒ I(δ(i, s))

Verification Example: OOO
  [Figure: OOO block diagram. Program memory, PC (incr), decode, dispatch, Register Rename Unit (valid, tag, val), Reorder Buffer (head, tail), ALU (execute), result bus, retire. Reorder Buffer Fields: valid, value, src1valid, src1val, src1tag (1st operand), src2valid, src2val, src2tag (2nd operand), dest, op; result]

Verifying OOO
  Lahiri, Seshia, & Bryant, FMCAD 2002
  Goal
    Show that OOO implements Instruction Set Architecture (ISA) model
    For all possible execution sequences
  Challenge
    OOO holds partially executed instructions in reorder buffer
    States of two systems match
    only when reorder buffer flushed
  [Figure: ISA (Reg. File, PC) compared with OOO (Reg. File, PC, Reorder Buffer)]

Adding Shadow State
  McMillan, '98
  Arons & Pnueli, '99
  Provides Link Between ISA & OOO Models
    Additional info. in ROB
    Does not affect OOO behavior
    Generated when instruction dispatched
    Predict values of operands and result
      From ISA model
  [Figure: ISA (Reg. File, PC) linked to OOO (Reg. File, PC, Reorder Buffer)]

State Consistency Invariants
  Register rename unit & reorder buffer encode same information redundantly
    Rename Unit: Registers → Tags
    Reorder Buffer: Tags → Registers
  [Figure: dispatch into Register Rename Unit (valid, tag, val) and Reorder Buffer (head, tail). Reorder Buffer Fields: valid, value, src1valid, src1val, src1tag, src2valid, src2val, src2tag, dest, op]

State Consistency Invariant Examples
  Register Renaming invariants (2)
    Any mapped register should be in the ROB, and the destination register should match
      ∀r. ¬reg.valid(r) ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(reg.tag(r)) = r)
    For any ROB entry, the destination should have reg.valid as false and tag should be to this or later instruction
      ∀t in ROB. [¬reg.valid(rob.dest(t)) ∧ t ≤ reg.tag(rob.dest(t)) ∧ reg.tag(rob.dest(t)) < rob.tail]
  [Figure: Register Rename Unit (valid, tag, val) and Reorder Buffer Fields (valid, value, src1valid, src1val, src1tag, src2valid, src2val, src2tag, dest, op)]

Inductive Invariants
  Formulas I1, …, In
    Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
    I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n
  Overall Correctness
    Follows by induction on time
  Restricted form of invariants
    ∀x1∀x2…∀xk Φ(x1…xk)
      Φ(x1…xk) is a CLU formula without quantifiers
      x1…xk are integer variables free in Φ(x1…xk)
    Express properties that hold for all buffer indices, register IDs, etc.
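
The two proof obligations above (every invariant holds initially, and the conjunction of all invariants is preserved by every transition) can be illustrated on a small finite instance. The sketch below is not UCLID and does not use term-level reasoning: it assumes a hypothetical toy circular buffer with a fixed capacity MAX and checks the obligations by brute-force enumeration. The names successors, i1, i2, prop, and is_inductive are made up for this illustration. It shows why auxiliary invariants are needed: a property that is true of all reachable states may still fail the induction step on its own.

```python
from itertools import product

MAX = 4  # assumed fixed capacity; the slides target arbitrary Max


def successors(state):
    """All next states of a toy circular buffer (head, tail, count)."""
    head, tail, count = state
    nxt = [state]  # idle step
    if count < MAX:  # enqueue advances tail
        nxt.append((head, (tail + 1) % MAX, count + 1))
    if count > 0:  # dequeue advances head
        nxt.append(((head + 1) % MAX, tail, count - 1))
    return nxt


def i1(s):
    """Count stays within the capacity."""
    return 0 <= s[2] <= MAX


def i2(s):
    """Tail is determined by head and count."""
    return (s[0] + s[2]) % MAX == s[1]


def prop(s):
    """Target property: head == tail only when the buffer is empty or full."""
    return s[0] != s[1] or s[2] in (0, MAX)


INITIAL = [(0, 0, 0)]
# Every syntactically possible state, reachable or not: induction must
# consider all states that satisfy the invariants, not only reachable ones.
ALL_STATES = list(product(range(MAX), range(MAX), range(-1, MAX + 2)))


def is_inductive(invariants):
    """Check the base case and the induction step for a set of invariants."""
    base = all(inv(s0) for inv in invariants for s0 in INITIAL)
    step = all(
        inv(t)
        for s in ALL_STATES
        if all(i(s) for i in invariants)
        for t in successors(s)
        for inv in invariants
    )
    return base and step


print(is_inductive([prop]))          # False: prop alone is not inductive
print(is_inductive([i1, i2, prop]))  # True: strengthened with i1 and i2
```

The state (0, 1, 2) satisfies prop but is unreachable; dequeuing from it yields (1, 1, 1), which violates prop. Adding i2 rules that state out, which is the role the state consistency invariants play for the reorder buffer.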

Proving Invariants
  Proving invariants inductive requires quantifiers
    |= [∀x1∀x2…∀xk Φ(x1…xk)] ⇒ [∀y1∀y2…∀ym Ψ(y1…ym)]
  Prove unsatisfiability of formula
    ∀x1∀x2…∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym)
  Undecidable Problem
    In logic with uninterpreted functions and equality

Cooking with Invariants
  Ingredients: Predicates
    rob.head ≤ reg.tag(r)
    reg.valid(r)
    reg.tag(r) = t
    rob.dest(t) = r
  Recipe: Invariants
    ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)
  Result: Correctness

Automatic Recipe Generation
  [Figure: Ingredients → Recipe Creator → Result]
  Want Something More
    Given any set of ingredients
    Generate best recipe possible

Automatic Predicate Abstraction
  Graf & Saïdi, CAV '97
  Idea
    Given set of predicates p1(S), …, pk(S)
      Boolean formulas describing properties of system state
    View as abstraction mapping: States → {0,1}^k
    Defines abstract FSM over state set {0,1}^k
      Form of abstract interpretation
      Do reachability analysis similar to symbolic model checking
  Implementation
    Early ones had weak inference capabilities
      Call theorem prover or decision procedure to test each potential transition
    Recent ones make better use of symbolic encodings

Abstract State Space
  Abstraction
    Abstraction function maps concrete state s to abstract state t = p1(s), …, pk(s)
  Concretization
    Concretization function maps abstract state t to concrete states s
  [Figure: abstract states above concrete states, linked by the abstraction and concretization functions]

Abstract State Machine
  [Figure: concrete transition from s to s′ in the concrete system; abstracting gives abstract transition from t to t′ in the abstract system; concretizing maps back]
  Transitions in abstract system mirror those in concrete

P.A.
as Invariant Generator
  [Figure: abstract system (A) with reach sets R1, R2, …, Rn growing from reset states; concretizing gives invariant I containing the reset states of the concrete system (C)]
  Reach Fixed-Point on Abstract System
    Termination guaranteed, since finite state
  Equivalent to Computing Invariant for Concrete System
    Strongest possible invariant that can be expressed by formula over these predicates

Symbolic Formulation of Predicate Abstraction
  Lahiri, Bryant, Cook, CAV '03
  Task
    Predicates P = p1(S), …, pk(S)
    Compute set of legal abstract next states ψ′(B′) given current abstract states ψ(B)
      B, B′: Abstract current and next-state state variables
      ψ, ψ′: Boolean formulas
  [Figure: abstract transitions in the abstract system, from ψ(B) to ψ′(B′)]

Symbolic Formulation of P.A.
  Approach
    Create formula of form θ(S, B′)
      Possible combinations of current concrete state S and next abstract state B′
    θ(S, B′): ψ[P/B] ∧ B′ ⇔ P[δ/S]
  [Figure: ψ(B) in the abstract system is concretized to ψ[P/B] in the concrete system; B′ ⇔ P[δ/S] relates next abstract states to all predecessors. Labels: General, Concretize, All Predecessors]

Symbolic Formulation of P.A.
  Computing Next-State Set
    Compute ψ′(B′) = ∃S θ(S, B′)
    Requires quantifier elimination
  [Figure: as on the previous slide, with ∃S θ(S, B′) mapped back to ψ′(B′) in the abstract system]

Quantified Invariant Generation
  (Lahiri & Bryant, VMCAI 2004)
  User supplies predicates containing free variables
  Generate globally quantified invariant
  Example
    Predicates
      p1: reg.valid(r)
      p2: reg.tag(r) = t
      p3: rob.dest(t) = r
    An abstract state satisfying a Boolean formula over p1, p2, p3 corresponds to a concrete state satisfying the same formula over reg.valid(r), reg.tag(r) = t, rob.dest(t) = r under a single quantifier prefix ∀r,t[…], rather than the same combination of the separately quantified formulas ∀r[reg.valid(r)], ∀r,t[reg.tag(r) = t], ∀r,t[rob.dest(t) = r]

Outline
  Why Infinite State Systems?
  Modeling capabilities of UCLID
  Verification Methods
  Implementation
    Advances in SAT and decision procedures
  Prospects and Challenges

Decision Procedure Needs
  Bounded Model Checking
    Satisfiability of quantifier-free CLU formula
    Handled by decision procedure
  Invariant Checking
    Satisfiability of quantified CLU formula
    Undecidable
  Predicate Abstraction
    Eliminate quantifiers from CLU formula
  Role of Decision Procedure
    Apply in sound, but incomplete way

SAT-based Decision Procedures
  EAGER ENCODING
    Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable
  LAZY ENCODING
    Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver
      SAT Solver: unsatisfiable, or satisfying assignment passed to First-order Conjunctions SAT Checker
      First-order Conjunctions SAT Checker: satisfiable, or unsatisfiable with additional clause returned to SAT Solver

Recent Progress in SAT Solving
  [Figure: bar chart of run-time (sec.) by solver: Grasp (2000), zChaff (2001), BerkMin (2002), zChaff (2003-04), Siege (2004), SatELiteGTI (2005); bar labels 3600, 766, 147, 118, 81, 46]
  Driven by annual SAT competitions

UCLID Decision Procedure
  Eager approach
  [Figure: CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability]

Eager Encoding Characteristics
  [Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
  - Must encode all information about domain properties into Boolean formula
  - Some properties can give exponential blowup
  + Lets SAT solver do all of the work
  Good Approach for Some Domains
    Modern SAT solvers have remarkable capacity
    Good at extracting relevant portions out of very large formulas
    Learns about formula properties as search proceeds

DPLL(T)
  Ganzinger, Hagen, Nieuwenhuis, Oliveras, Tinelli, CAV '04
  Modular, Lazy Decision Procedure
    Modern SAT solver as control loop
    Theory-specific solver (for
    theory T) plugs in
  [Figure: Formula → DPLL Engine; DPLL Engine sends tentative partial solution to Theory Solver; Theory Solver returns OK / backtrack info.]
  Compared to Other Lazy Solvers
    Tighter coupling between DPLL engine & theory solver

DPLL(T) Example
  Formula:             g(a) = c ∧ [f(g(a)) ≠ f(c) ∨ g(a) = d] ∧ c ≠ d
  Propositional Form:  p1 ∧ [¬p2 ∨ p3] ∧ ¬p4
  Action                        Propositional   Theory
    Unit Propagate                p1              g(a) = c
    Theory Propagate              p2              f(g(a)) = f(c)
    Unit Propagate                p3              g(a) = d
    Theory Propagate (Failure)    p4              c = d
  T = Equality with Uninterpreted Functions (EUF)

Invariant Checking Revisited
  Prove Unsatisfiability of Formula
    ∀x1∀x2…∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym)
    General Form: ∀X Φ(X) ∧ ¬Ψ(Y)
  Quantifier Instantiation
    Generate expressions E1(Y), …, En(Y)
      Using terms that appear in Ψ
      View Φ as a set of axioms that apply to terms in Ψ
    Expand as Φ(E1(Y)) ∧ … ∧ Φ(En(Y)) ∧ ¬Ψ(Y)
    If unsatisfiable, then so is quantified formula
      Sound, but incomplete
  Trade-off
    Be clever about instantiation, or
    Instantiate many terms and rely on decision procedure capacity

Versions of the OOO Processor
  base
    Executes ALU instructions only
  exc
    Handles arithmetic exceptions
    Must flush reorder buffer
  exc/br
    Handles branches
    Predicts branch & speculatively executes along path
  exc/br/mem-simp
    Adds load & store instructions
    Store commits as instruction retires
  exc/br/mem
    Stores held in buffer
    Can commit later
    Loads must scan buffer for matching addresses

Comparative Verification Effort
  UCLID & Barcelona DPLL(T) decision procedures
  Measurements by Shuvendu Lahiri
  Person time shown cumulatively

                     base     exc      exc/br   exc/br/mem-simp   exc/br/mem
  Total Invariants   13       34       39       67                71
  UCLID time         54 s     236 s    403 s    1594 s            2200 s
  DPLL(T) time       1 s      4 s      7 s      85 s (one value for the last two versions; its column is not recoverable)
  Person time        2 days   7 days   9 days   24 days           34 days

Predicate Abstraction Revisited
  Formulate as Quantifier Elimination Problem
    Generate formula of form ψ′(B′) = ∃S θ(S, B′)
      S: Integer variables
  Solve by SAT Enumeration
    Find satisfying assignment to S and B′ for θ
    Record the assignment to B′ as disjunct in ψ′
    Reformulate θ as θ ∧ ¬(assignment to B′)
  Implementations
    UCLID
      Do eager
      translation, then run incremental SAT solver
    DPLL(T) [Lahiri, Nieuwenhuis, Oliveras, CAV '06]
      Modify DPLL engine to backtrack when it finds solution

Systems Verified with Predicate Abstraction
  Model                                    Predicates   Iterations   UCLID Time   DPLL(T) Time
  Out-Of-Order Execution Unit              25           9            921 s        36 s
  German's Cache Protocol                  16           9            34 s         1 s
  German's Protocol, unbounded channels    26           17           1,119 s      23 s
  Lamport's Bakery Algorithm               32           18           245 s        11 s
  Safety properties only

Why SAT Enumeration Works
  Model                                    Predicates   States    Fraction
  Out-Of-Order Execution Unit              25           10,728    3 × 10^-4
  German's Cache Protocol                  16           326       5 × 10^-3
  German's Protocol, unbounded channels    26           2,238     3 × 10^-5
  Lamport's Bakery Algorithm               32           426       1 × 10^-7
  Model with P predicates
    Number of abstract reachable states << 2^P

UCLID Counterexample Generation
  Counterexample trace showing value of each state variable on each step
    "Value" of a lambda is a set of argument/value pairs
  Important feature for tool users
  [Figure: forward path Symbolic Simulation → Lambda Expansion → Function & Predicate Elimination → Finite Instantiation → Boolean Satisfiability; return path Boolean Assignment → Integer Assignment → Partial Interpretation of UIFs → Partial Interp. of Lambdas → Trace]

Providing Counterexamples
  Bounded Checking
    Trace shows concrete failure case
    Could represent error in design, in model, or in specification
  Invariant Checking
    Failure could be due to weak invariant or insufficient quantifier instantiation
  Predicate Abstraction
    Does not provide useful counterexamples
    Generally yields abstract state "true"

Outline
  Why Infinite State Systems?
  Modeling capabilities of UCLID
  Verification Methods
  Implementation
    Advances in SAT and decision procedures
  Prospects and Challenges

Why Verification Tasks Feasible
  CLU Logic Fairly Simple
    Equality, uninterpreted functions, difference constraints
    Small model property
  "Deep" Reasoning Not Required
    Formulas large and messy, but straightforward
    Verifying systems that are designed to have constrained behaviors
    Only checking effect of a few cycles of system operation

Future Prospects
  Evaluation
    Demonstrated ability to verify complex, parameterized systems
  Predicate Abstraction Shows Promise
    Provides key automation advantage of model checking
  Successful Application to Program Verification
    Qadeer & Lahiri, POPL '06
    Generate loop invariants for list manipulation programs

Areas of Research
  Bit-Vector Decision Procedures
    True model for hardware & low-level software
    Automatically apply abstractions
      Abstract to symbolic terms whenever possible
  Other Types of Verification
    Liveness properties
      Abstraction must underapproximate concrete system
    Certified correctness
      Explanation-generating decision procedures
      Extension to predicate abstraction engine
    Counterexample generation
      Important for user
      Hard to provide as level of abstraction & automation is raised

Questions?
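
Backup: Memory Write as a Lambda (illustrative sketch)
  The memory-write slide models a write as building a new function from the old one, with the initial memory left uninterpreted. The short Python sketch below mirrors that idea with closures. It is only an illustration, not UCLID syntax: the names write and m0 are assumptions made for this example, and m0 is a deterministic stand-in for the uninterpreted initial memory.

```python
def write(mem, wa, wd):
    """Return a new memory: lambda a. ITE(a == wa, wd, mem(a)).

    The old memory `mem` is left unchanged, as in the functional model.
    """
    return lambda a: wd if a == wa else mem(a)


def m0(a):
    """Hypothetical stand-in for the uninterpreted initial memory.

    Any deterministic function would do: the only property relied on is
    functional consistency (equal addresses give equal values).
    """
    return ("m0", a)


m1 = write(m0, 5, "x")   # write "x" at address 5
m2 = write(m1, 7, "y")   # then write "y" at address 7

print(m2(5), m2(7), m2(9))   # x y ('m0', 9)
```

  Addresses are ordinary Python integers, so no word size or memory capacity is fixed, which matches the unbounded-integer view of data on the earlier abstraction slides.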