Formal Verification of Infinite-State Systems Using Boolean Methods

  Randal E. Bryant
  Carnegie Mellon University
  http://www.cs.cmu.edu/~bryant
  Contributions by graduate students: Sanjit Seshia, Shuvendu Lahiri

Outline

  Task
    Formally verify hardware and software systems
    Build on success in verifying finite models
  Infinite-State Models
    Need logic that is suitably expressive, yet remains reasonably tractable
  Verification Techniques
    Solve problems by mapping into propositional logic
    Proof engines can use powerful Boolean methods
    Different levels of automation and capacity

Truly Infinite-State Systems

  Systems where we want to model real-world values (temperature, speed, ...)
    Hybrid systems
    Very difficult to verify
    [Figure: air bag controller taking speedometer and accelerometer readings and producing a "Deploy!" signal]
  Systems with real-valued time constraints
    E.g., timed automata
    Somewhat easier to verify, since all clocks move at the same rate

Theoretically Infinite-State Systems

  Systems with unbounded buffers
    Even though we can't really build one
    [Figure: unbounded buffer with head and tail pointers delimiting the in-use region]

Arbitrarily Large Finite-State Systems

  Synchronization protocol that should work for arbitrary number of processes
    [Figure: processes P1, P2, ..., PN]
    Verify for arbitrary N
  Circular buffer with fixed, but arbitrary, capacity
    [Figure: circular buffer with indices 0 to Max-1, head and tail pointers, in-use region]
    Verify for arbitrary value of Max

Very Large Finite-State Systems

  Abstract 32-bit words as arbitrary integers
  View memories as having unbounded capacity
  [Figure: pipelined processor datapath with PC, instruction memory, register file, ALU, and IF/ID, ID/EX, EX/WB pipeline registers]

Example: HP/Compaq Alpha 21264

  Pipeline State
    Multiple caches
    Instruction queues
    Dynamically allocated registers
    Memory queue
    Many buffers between stages
  Verification Tasks
    Does it implement the Alpha ISA?
  Microprocessor Report, Oct.
28, 1996

Abstracting Data from Bits to Integers

  [Figure: bit-level signals x0, x1, x2, ..., xn-1 abstracted to a single term x]
  View Data as Symbolic "Terms"
    Arbitrary integers
      Verification proves correctness of design for all possible word sizes
    Can store in memories & registers
    Can select with multiplexors
      ITE: If-Then-Else operation
      ITE(p, x, y) yields x when p is true (T) and y when p is false (F)
  [Figure: multiplexor with control p and data inputs x, y, shown for p = T and p = F]

Abstraction Via Uninterpreted Functions

  [Figure: ALU block replaced by uninterpreted function f]
  For any Block that Transforms or Evaluates Data:
    Replace with generic, unspecified function
    Only assumed property is functional consistency:
      a = x ∧ b = y ⇒ f(a, b) = f(x, y)

Abstraction Via Uninterpreted Functions

  [Figure: pipeline datapath with the PC incrementer, ALU, and instruction memory replaced by functions F1, F2, F3]
  For any Block that Transforms or Evaluates Data:
    Replace with generic, unspecified function
    Also view instruction memory as function

EUF: Equality with Uninterp. Functs

  Decidable fragment of first-order logic
  Formulas (F): Boolean expressions
    ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
    T1 = T2                   Equation
    P(T1, …, Tk)              Predicate application
  Terms (T): Integer expressions
    ITE(F, T1, T2)            If-then-else
    Fun(T1, …, Tk)            Function application
  Functions (Fun): Integer → Integer
    f                         Uninterpreted function symbol
    Read, Write               Memory operations
  Predicates (P): Integer → Boolean
    p                         Uninterpreted predicate symbol

Decision Problem

  Logic of Equality with Uninterpreted Functions (EUF)
    Truth values
      Dashed lines
      Model control
      Logical connectives
      Equations
    Integer values
      Solid lines
      Model data
      Uninterpreted functions
      If-Then-Else operation
  [Figure: example formula graph over x0 and d0 with function f, equality tests, and ITE nodes producing e0 and e1]
  Task
    Determine whether formula is universally valid
      True for all interpretations of variables and function symbols

Finite Model Property for EUF

  [Figure: the same formula graph with the distinct term expressions x0, d0, f(x0), f(d0) marked]
  Observation
    Any formula has limited number of distinct expressions
    Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values

    Expression   Possible Values   Bit Encoding
    x0           {0}               0 0
    d0           {0,1}             0 b10
    f(x0)        {0,1,2}           b21 b20
    f(d0)        {0,1,2,3}         b31 b30
  For Each Expression
    Either equal to or distinct from each preceding expression
  Boolean Encoding
    Use Boolean values to encode integers over small range
    EUF formula can be translated into propositional logic
      Tautology iff original formula valid

An Out-of-order Processor (OOO)

  [Figure: OOO block diagram: PC with incrementer, program memory, decode, dispatch, register rename unit (valid, tag, val), reorder buffer with head and tail pointers, ALU execute, result bus, retire]
  Reorder Buffer Fields
    valid, value                   (result)
    src1valid, src1val, src1tag    (1st operand)
    src2valid, src2val, src2tag    (2nd operand)
    dest, op
  Data Dependencies Resolved by Register Renaming
    Map register ID to instruction in reorder buffer that will generate register value
  In-order Retirement Managed by Retirement Buffer
    FIFO buffer keeping pending instructions in program order

Access Modes for Reorder Buffer

  [Figure: reorder buffer with dispatch at tail, retire at head, ALU execute, and result bus]
  FIFO
    Insert when dispatch
    Remove when retire
  Content Addressable
    Broadcast result to all entries with matching source tag
  Directly Addressable
    Select particular entry for execution
    Retrieve result value from executed instruction
  Global
    Flush all queue entries when instruction at head causes exception

Required Logic

  Increased Expressive Power
    Model queue pointers
      Increment & decrement operations
      Relative ordering
    Ability to construct complex memory structures
      Not just set of fixed memory types
  Don't Go Too Far
    Want practical decision procedures
    Efficient reduction to propositional logic

EUF → CLU

  Terms (T)
    ITE(F, T1, T2)            If-then-else
    Fun(T1, …, Tk)            Function application
    succ(T)                   Increment
    pred(T)                   Decrement
  Formulas (F)
    ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
    T1 = T2                   Equation
    P(T1, …, Tk)              Predicate application
    T1 < T2                   Inequality

EUF → CLU (Cont.)

  Functions (Fun)
    f                         Uninterpreted function symbol
    Read, Write               Memory operations
    λ x1, …, xk . T           Function lambda expression
  Predicates (P)
    p                         Uninterpreted predicate symbol
    λ x1, …, xk . F           Predicate lambda expression
  Arguments can only be terms
  Lambdas are just mutable arrays

Modeling Memories with λ's

  Memory M Modeled as Function
    M(a): Value at location a
  Initially
    Arbitrary state
    Modeled by uninterpreted function m0
  Writing Transforms Memory
    M' = Write(M, wa, wd)
    M' = λ a . ITE(a = wa, wd, M(a))
    Future reads of address wa will get wd
  [Figure: memory M read at address a; written memory M' built from a comparison of a with wa and a multiplexor selecting wd or M(a)]

Modeling Unbounded FIFO Buffer

  Queue is Subrange of Infinite Sequence
    Q.head = h
      Index of oldest element
    Q.tail = t
      Index of insertion location
    Q.val = q
      Function mapping indices to values
      q(i) valid only when h ≤ i < t
  Initial State: Arbitrary Queue
    Q.head = h0, Q.tail = t0
      Impose constraint that h0 ≤ t0
    Q.val = q0
      Uninterpreted function
  [Figure: sequence ..., q(h-2), q(h-1), q(h), q(h+1), ..., q(t-2), q(t-1), q(t), q(t+1), ... with increasing indices; entries before head are already popped, entries from tail onward are not yet inserted]

Modeling FIFO Buffer (cont.)

  next[h] := ITE(operation = POP, succ(h), h)
  next[t] := ITE(operation = PUSH, succ(t), t)
  next[q] := λ(i). ITE((operation = PUSH & i = t), x, q(i))
  [Figure: queue before and after op = PUSH with input x: entry q(t) becomes x, tail moves to next[t], head stays at next[h] = h]

Systems of Identical Processes

  Each Process has k State Variables
  Each state variable represented as array
    Indexed by process Id
  [Figure: arrays sv1, sv2, ..., svk; the entries at index i form the state of process i]

Modeling System of Identical Processes

  On Each Step:
    Select arbitrary process index p
      As if chosen by nondeterministic scheduler
    Update state for selected process
  [Figure: state array indexed by p, a 0/1 inuse flag, and a state diagram over IDLE, TRYING, CRITICAL]
  next[state] := lambda(i) case
      i = p & state(i) = IDLE: TRYING
      i = p & state(i) = TRYING & inuse: TRYING
      i = p & state(i) = TRYING & !inuse: CRITICAL
      default: state(i)
    esac

Decision Procedure

  Operation
    Series of transformations leading to propositional formula
    Propositional formula checked with BDD or SAT tools
  Bryant, Lahiri, Seshia [CAV02]
  Flow
    CLU Formula
      → Lambda Expansion → λ-free Formula
      → Function & Predicate Elimination → Function-free Formula
      → Convert to Boolean Formula → Boolean Formula
      → Boolean Satisfiability

Finite Model Property for CLU

  [Figure: example formula relating x and y with succ(x) > pred(y); number lines show possible relative orderings of x, x+1, y-1, y, with sample interpretations x = 0, y = 3 and x = 2, y = 1]
  Observation
    Need to encode all possible relative orderings of expressions
    Each symbolic value has maximum range of increments & decrements
    Can use Boolean encodings of small integer ranges

Verifying OOO

  Lahiri, Seshia, & Bryant, FMCAD 2002
  Goal
    Show that OOO implements Instruction Set Architecture (ISA) model
    For all possible execution sequences
  Challenges
    No bound on program length
    OOO holds partially executed instructions in reorder buffer
    States of two systems match only when reorder buffer flushed
  [Figure: ISA state (Reg. File, PC) beside OOO state (Reg. File, PC, Reorder Buffer)]

Adding Shadow State

  McMillan, '98
  Arons & Pnueli, '99
  Provides Link Between ISA & OOO Models
  Additional entries in ROB
    Do not affect OOO behavior
    Generated when instruction dispatched
    Predict values of operands and result
      From ISA model
  [Figure: ISA state (Reg. File, PC) linked to OOO state (Reg. File, PC, Reorder Buffer)]

Adding Shadow Structures

  [Figure: OOO block diagram: PC with incrementer, program memory, decode, dispatch, register rename unit (valid, tag, val), reorder buffer with head and tail pointers, ALU execute, result bus, retire]
  Reorder Buffer Fields
    valid, value
    src1valid, src1val, src1tag
    src2valid, src2val, src2tag
    dest, op
  Shadow Fields
    shdw.value, shdw.src1val, shdw.src2val
    Updated directly from the ISA model during dispatch
      shdw.src1val[rob.tail] ← Rfisa(src1)
      shdw.src2val[rob.tail] ← Rfisa(src2)
      shdw.value[rob.tail] ← ALU(Rfisa(src1), Rfisa(src2), op)

Invariant Checking

  Formulas I1, …, In
    Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
    I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s') for any current state s and successor state s', for 1 ≤ j ≤ n
  Invariants for OOO (13)
    Refinement maps (2)
      Show relation between ISA and OOO models
    Shadow state (3)
      Shadow values correctly predict OOO values
    State consistency (8)
      Properties of OOO state that ensure proper operation
  Overall Correctness
    Follows by induction on time

Refinement Maps

  [Figure: OOO block diagram with reorder buffer fields and shadow fields shdw.value, shdw.src1val, shdw.src2val]
  Correspondence with a sequential ISA model
    OOO and ISA synchronized at dispatch
  For Register File Contents
    ∀r. reg.valid(r) ⇒ reg.val(r) = Rfisa(r)
  For Program Counter
    PCooo = PCisa

Shadow Invariants

  [Figure: OOO block diagram with reorder buffer fields and shadow fields shdw.value, shdw.src1val, shdw.src2val]
  1. ∀_rob t. rob.valid(t) ⇒ rob.value(t) = shdw.value(t)
  2. ∀_rob t. rob.src1valid(t) ⇒ rob.src1val(t) = shdw.src1val(t)
  3. ∀_rob t.
     rob.src2valid(t) ⇒ rob.src2val(t) = shdw.src2val(t)

State Consistency Invariants

  Tag Consistency invariants (2)
    Instructions only depend on instructions preceding in program order
  Register Renaming invariants (2)
    Tag in a rename unit should be in the ROB, and the destination register should match
      ∀r. ¬reg.valid(r) ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(reg.tag(r)) = r)
    For any entry, the destination should have reg.valid as false and tag should contain this or later instruction
      ∀_rob t. (¬reg.valid(rob.dest(t)) ∧ t ≤ reg.tag(rob.dest(t)) < rob.tail)

Quantified Invariants and Proofs

  Allowed Form
    ∀x1 ∀x2 … ∀xk Φ(x1 … xk)
      Φ(x1 … xk) is a CLU formula without quantifiers
      x1 … xk are integer variables free in Φ(x1 … xk)
  Proving these invariants requires quantifiers
    |= (∀x1 ∀x2 … ∀xk Φ(x1 … xk)) ⇒ ∀y1 ∀y2 … ∀ym Ψ(y1 … ym)
    Prove ∀x1 ∀x2 … ∀xk [Φ(x1 … xk) ∧ ¬Ψ(y1 … ym)] is not satisfiable
      Undecidable
  Automatic instantiation of x1 … xk with concrete terms
    Sound but incomplete method
    Reduce the quantified formula to a CLU formula
      Can use the decision procedure for CLU

Proving Invariants

  Proved Automatically
    Quantifier instantiation was sufficient in these cases
    Time spent = 54 s on 1.4 GHz machine
    Total effort = 2 person days
  Comparison
    Previous efforts using theorem provers took weeks of effort

Extending the OOO Processor

  base
    Executes ALU instructions only
  exc
    Handles arithmetic exceptions
    Must flush reorder buffer
  exc/br
    Handles branches
    Predicts branch & speculatively executes along path
  exc/br/mem-simp
    Adds load & store instructions
    Store commits as instruction retires
  exc/br/mem
    Stores held in buffer
      Can commit later
    Loads must scan buffer for matching addresses

Comparative Verification Effort

                           base     exc      exc/br   exc/br/mem-simp   exc/br/mem
    Total invariants       13       34       39       67                71
    Manually instantiate   0        0        0        4                 8
    UCLID time             54 s     236 s    403 s    1594 s            2200 s
    Person time            2 days   5 days   2 days   15 days           10 days

"I Just Want a Loaf of Bread"

  [Figure: Ingredients → Recipe → Result]

Cooking with Invariants

  Ingredients:
  Predicates
    rob.head ≤ reg.tag(r)
    reg.valid(r)
    reg.tag(r) = t
    rob.dest(t) = r
  Recipe: Invariants
    ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)
  Result: Correctness

Automatic Recipe Generation

  [Figure: Ingredients → Recipe Creator → Result]
  Want Something More
    Given any set of ingredients
    Generate best recipe possible

Automatic Predicate Abstraction

  Graf & Saïdi, CAV '97
  Idea
    Given set of predicates P1(s), …, Pk(s)
      Boolean formulas describing properties of system state
    View as abstraction mapping: States → {0,1}^k
    Defines abstract FSM over state set {0,1}^k
      Form of abstract interpretation
    Do reachability analysis similar to symbolic model checking
  Prior Implementations
    Very weak inference capabilities
      Call theorem prover or decision procedure to test each potential transition
    Little support for quantified predicates

Abstract State Space

  [Figure: the abstraction function maps a concrete state s to an abstract state t through the predicates P1(s), …, Pk(s); the concretization function maps an abstract state t back to the concrete states s it represents]

Abstract State Machine

  [Figure: an abstract transition t → t' is obtained by concretizing t, taking a concrete transition s → s', and abstracting s']
  Transitions in abstract system mirror those in concrete

Overapproximation by Abstract Model

  [Figure: abstract system above concrete system]
  Path in abstract state space may not correspond to one in concrete
    OK when verifying safety properties
    Possible false negatives, but no false positives

Predicate Abstraction Example

  State Space
    State variables: { x, y }
  Initial State
    { (2, 1) }
  Next State Behavior
    x' = -x
    y' = -y
  Verification Task
    Prove all bad states unreachable
  [Figure: x-y plane showing the initial state and the bad states]

Precise Analysis

  Reachable States
    { (2, 1), (-2, -1) }
  [Figure: x-y plane showing the reachable states and the bad states]

Predicates

  cx:3, cx:y, cy:0
    Use 3-valued predicates in this example
    Each predicate takes value L, E, or G (less, equal, greater)
  [Figure: x-y plane partitioned into regions by the three predicates]

Abstract Initial State

  cx:3 = L, cx:y = G, cy:0 = G
  Reached Set #0
    { LGG }

Step 1: Concretize Reached Set #0

  Reached Set #0
    { LGG }
  Concretize
    (Note loss of
    precision)
  [Figure: region of concrete states s with cx:3 = L, cx:y = G, cy:0 = G]

Compute Possible Successor States

  x' = -x
  y' = -y
  [Figure: concretized states s mapped by the concrete transition to successors s']

Abstract Newly Reached States

  cx:3 = L, cx:y = L, cy:0 = L
  [Figure: successors s' mapped back to abstract states]
  Reached Set #1
    { LLL, LGG }

Step 2: Concretize Reached Set #1

  Reached Set #1
    { LLL, LGG }
  Concretize
    (Note loss of precision)
  [Figure: region of concrete states s with cx:3 = L, cx:y = L, cy:0 = L]

Compute Possible Successor States

  x' = -x
  y' = -y
  [Figure: concretized states s mapped by the concrete transition to successors s']

Abstract Newly Reached States

  cx:3 = E or G, cx:y = G, cy:0 = G
  [Figure: successors s' mapped back to abstract states]
  Reached Set #2
    { LLL, LGG, EGG, GGG }

Final Reached State Set

  { LLL, LGG, EGG, GGG }
  [Figure: regions LLL, LGG, EGG, GGG and the bad states in the x-y plane]

Symbolic Formulation of Step 2

  Encode each 3-valued {L, E, G} predicate with 2 Boolean variables (l, g)
    l1: x < 3    g1: x > 3
    l2: x < y    g2: x > y
    l3: y < 0    g3: y > 0
  Reached Set #1
    { LLL, LGG }
  Represent state set as formula
    (l1 ∧ ¬g1 ∧ l2 ∧ ¬g2 ∧ l3 ∧ ¬g3) ∨ (l1 ∧ ¬g1 ∧ ¬l2 ∧ g2 ∧ ¬l3 ∧ g3)

Next-State Predicates

  Next State (x', y')
    x' = -x, y' = -y in terms of current state
  Get predicates l1', l2', l3', g1', g2', g3'
    Determine conditions under which predicates will hold in next state
    Express in terms of current state (x, y)

    Next-State Predicate   Condition                                  Matches
    l1'                    x' < 3,  i.e. -x < 3,  i.e. x > -3         (none)
    l2'                    x' < y', i.e. -x < -y, i.e. x > y          g2
    l3'                    y' < 0,  i.e. -y < 0,  i.e. y > 0          g3
    g1'                    x' > 3,  i.e. -x > 3,  i.e. x < -3         (none)
    g2'                    x' > y', i.e. -x > -y, i.e. x < y          l2
    g3'                    y' > 0,  i.e. -y > 0,  i.e. y < 0          l3

Consistency Constraints

  Eliminate impossible predicate combinations
    ¬(g1 ∧ g1')
    ¬(g1' ∧ l1')
    (g2 ∧ g3 ⇒ l1')
  In general, may need to introduce additional variables
    To express more complex transitivity constraints
  [Figure: current-state variables l1, g1, l2, g2, l3, g3 linked to next-state variables l1', g1', l2', g2', l3', g3']

Symbolic Form

  Formulation
    Express compatible combinations of current-state & next-state variables
    Quantify out current-state variables
    Gives formula over next-state variables l1', l2', l3', g1', g2', g3'

    ∃ l1, g1, l2, g2, l3, g3 .
      [ (l1 ∧ ¬g1 ∧ l2 ∧ ¬g2 ∧ l3 ∧ ¬g3) ∨ (l1 ∧ ¬g1 ∧ ¬l2 ∧ g2 ∧ ¬l3 ∧ g3) ]     (current state)
      ∧ ¬(g1 ∧ g1') ∧ ¬(g1' ∧ l1') ∧ (g2 ∧ g3 ⇒ l1')                           (consistency constraints)
      ∧ (l2' ⇔ g2) ∧ (g2' ⇔ l2) ∧ (l3' ⇔ g3) ∧ (g3' ⇔ l3)

Extracting Next-State Set

  Run SAT checker over
    formula
  Generate blocking clause for each newly generated state

    [ (l1 ∧ ¬g1 ∧ l2 ∧ ¬g2 ∧ l3 ∧ ¬g3) ∨ (l1 ∧ ¬g1 ∧ ¬l2 ∧ g2 ∧ ¬l3 ∧ g3) ]
    ∧ ¬(g1 ∧ g1') ∧ ¬(g1' ∧ l1') ∧ (g2 ∧ g3 ⇒ l1')
    ∧ (l2' ⇔ g2) ∧ (g2' ⇔ l2) ∧ (l3' ⇔ g3) ∧ (g3' ⇔ l3)
    ∧ (blocking clauses over l1', g1', l2', g2', l3', g3')

    l1 g1 l2 g2 l3 g3   l1' g1' l2' g2' l3' g3'   Next State
    1  0  1  0  1  0    0   0   0   1   0   1     EGG
    1  0  1  0  1  0    0   1   0   1   0   1     GGG
    1  0  1  0  1  0    1   0   0   1   0   1     LGG
    1  0  0  1  0  1    1   0   1   0   1   0     LLL

Quantified Invariant Generation

  User supplies predicates containing free variables
  Generate globally quantified invariant
  Example
    Predicates
      p1: reg.valid(r)
      p2: rob.dest(t) = r
      p3: reg.tag(r) = t
    Abstract state satisfying (p1 ∨ p2 ∨ p3) corresponds to concrete state satisfying
      ∀r,t [reg.valid(r) ∨ reg.tag(r) = t ∨ rob.dest(t) = r]
    rather than
      ∀r [reg.valid(r)] ∨ ∀r,t [reg.tag(r) = t] ∨ ∀r,t [rob.dest(t) = r]

Generating Quantified Invariants

  Use Quantifier Instantiation to Approximate During Concretization
    Causes even greater overapproximation
    Similar technique used by Flanagan & Qadeer, POPL '02

Systems Verified with Predicate Abstraction

    Model                                   Predicates   Iterations   CPU Time
    Out-Of-Order Execution Unit             25           9            2,613 s
    German's Cache Protocol                 21           9            122 s
    German's Protocol, unbounded channels   30           19           15,000 s
    Bounded Retransmission Buffer           22           9            11 s
    Lamport's Bakery Algorithm              24           24           5,211 s

  Very general models
    Unbounded processes, buffers, cache lines, …
  Safety properties only

Other Uses of UCLID Verifier

  Invariant Checking
    More complex version of OOO including speculative execution, exceptions, & buffered loads & stores
    Lahiri & Bryant, CAV 2003
  Predicate Abstraction
    Core algorithm used to generate weakest Boolean precondition for software model checking
    SLAM project at Microsoft
  Pipelined Processor Verification
    Verify checker processor from U.
    Michigan
      Model extracted directly from Verilog
    Bounded check of load-store unit from industrial microprocessor

Conclusions

  CLU is Useful Logic
    Expressive enough to model wide range of systems
      Systems with unbounded resources
      Abstract away most data operations
    Simple enough to be tractable
      Small domain property allows exploiting Boolean methods
  Predicate Abstraction is Powerful Tool
    Removes requirement to hand-generate invariants
    Benefits similar to model checking

Further Work

  Support for Proofs of Liveness
    Must make argument that progress being made
  Greater Automation
    Automatic generation of predicates
    More efficient implementation of predicate abstraction
  More Powerful Logic
    Linear arithmetic would be useful
    Potential blow-up when translate to Boolean formula
  Apply to Other Systems
    Software
    Network protocols
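
Sketch: Predicate Abstraction Example in Code

  The concretize / step / abstract loop from the predicate abstraction example (state variables x and y, initial state (2, 1), predicates cx:3, cx:y, cy:0) can be sketched in a few lines of Python. This is only an illustration, not how UCLID works: it assumes the next-state behavior is x' = -x, y' = -y, it assumes integer-valued state, and it replaces the SAT-based symbolic step with brute-force enumeration over a small finite grid. The function names and the grid bound are hypothetical choices made for this sketch.

```python
from itertools import product


def cmp3(a, b):
    """3-valued comparison: L (less), E (equal), G (greater)."""
    return "L" if a < b else ("E" if a == b else "G")


def abstract(state):
    """Map a concrete state (x, y) to the predicates cx:3, cx:y, cy:0."""
    x, y = state
    return cmp3(x, 3) + cmp3(x, y) + cmp3(y, 0)


def step(state):
    """Assumed next-state behavior: x' = -x, y' = -y."""
    x, y = state
    return (-x, -y)


def abstract_reachability(initial_states, bound=6):
    """Iterate concretize -> step -> abstract until no new abstract state appears.

    Concretization is approximated by enumerating the integer grid
    [-bound, bound]^2 (an assumption of this sketch; the grid is closed
    under negation, so successors never leave it).
    """
    grid = list(product(range(-bound, bound + 1), repeat=2))
    reached = {abstract(s) for s in initial_states}
    history = [set(reached)]  # reached set after each iteration
    while True:
        # Concretize: every grid state whose abstraction is already reached.
        concrete = [s for s in grid if abstract(s) in reached]
        # Take one concrete transition and abstract the successors.
        new = {abstract(step(s)) for s in concrete} - reached
        if not new:
            return reached, history
        reached |= new
        history.append(set(reached))


reached, history = abstract_reachability([(2, 1)])
for i, states in enumerate(history):
    print(f"Reached Set #{i}: {sorted(states)}")
```

  Under these assumptions the successive reached sets follow the same progression as the slides: { LGG }, then { LLL, LGG }, then { LLL, LGG, EGG, GGG }. The loss of precision is visible in the second iteration, where concretizing LLL admits many states besides (-2, -1).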