System Modeling and Verification with UCLID

  http://www.cs.cmu.edu/~uclid
  Randal E. Bryant
  http://www.cs.cmu.edu/~bryant
  Contributions by graduate students: Sanjit Seshia, Shuvendu Lahiri
  Memocode 2004

Applying Data Abstraction to Hardware Verification

  Idea
    - Abstract details of data encodings and operations
    - Keep control logic precise
  Applications
    - Verify overall correctness of system
    - Assuming individual functional units correct
  Technology
    - Use restricted subset of first-order logic
    - Implement efficient decision procedures
    - Multiple methods of performing verification

Example: HP/Compaq Alpha 21264

  Pipeline State
    - Multiple caches
    - Instruction queues
    - Dynamically allocated registers
    - Memory queue
    - Many buffers between stages
  Verification Tasks
    - Does it implement the Alpha ISA?
  [Figure: Alpha 21264 pipeline. Source: Microprocessor Report, Oct. 28, 1996]

Abstracting Data from Bits to Integers

  [Figure: bits $x_0, x_1, x_2, \ldots, x_{n-1}$ abstracted to a single term $x$]
  View Data as Symbolic "Terms"
    - Arbitrary integers
    - Verification proves correctness of design for all possible word sizes
    - Can store in memories & registers

Required Logic

  Scalar Data Types
    - Formulas (F): Boolean expressions
        - Control signals
    - Terms (T): Integer expressions
        - Data values
        - Arbitrary values from some infinite domain

Modeling Data Selection

  If-Then-Else Operation
    - Multiplexor
    - Allows control-dependent data flow
  [Figure: multiplexor with select p and inputs x (1) and y (0) producing ITE(p, x, y); p = T yields x, p = F yields y]

Abstraction Via Uninterpreted Functions

  [Figure: ALU block replaced by function f]
  For any Block that Transforms or Evaluates Data:
    - Replace with generic, unspecified function
    - Only assumed property is functional consistency:
        $a = x \land b = y \Rightarrow f(a, b) = f(x, y)$

Abstraction Via Uninterpreted Functions

  [Figure: pipeline datapath with IF/ID, ID/EX, and EX/WB registers, control, instruction memory, and register file; functional blocks replaced by uninterpreted functions]
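Functional consistency, the only property assumed of an uninterpreted function, can be illustrated with a small Python sketch. It is not UCLID's decision procedure: it merely samples random interpretations of `f` over a tiny domain, which can refute a claim but cannot prove one. All names (`make_uninterpreted`, `holds_everywhere`, and the two properties) are hypothetical and chosen for illustration.

```python
import itertools
import random

def make_uninterpreted(seed):
    """Return one random interpretation of an uninterpreted function."""
    rng = random.Random(seed)
    table = {}
    def f(*args):
        # Functional consistency: equal arguments always give equal results.
        if args not in table:
            table[args] = rng.randrange(1 << 16)
        return table[args]
    return f

def ite(p, x, y):
    """If-then-else (multiplexor) on data values."""
    return x if p else y

def holds_everywhere(prop, n_interps=50, domain=range(3)):
    """Check prop under sampled interpretations of f and all small inputs."""
    for seed in range(n_interps):
        f = make_uninterpreted(seed)
        for p in (False, True):
            for a, b, x, y in itertools.product(domain, repeat=4):
                if not prop(f, p, a, b, x, y):
                    return False
    return True

def mux_commutes(f, p, a, b, x, y):
    # Moving a multiplexor across a functional unit: valid for every f.
    return ite(p, f(a, b), f(x, y)) == f(ite(p, a, x), ite(p, b, y))

def commutative(f, p, a, b, x, y):
    # Not implied by functional consistency; some interpretation refutes it.
    return f(a, b) == f(b, a)
```

The first property survives every sampled interpretation because it follows from functional consistency alone; the second is refuted as soon as some interpretation assigns different values to `f(a, b)` and `f(b, a)`.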
  For any Block that Transforms or Evaluates Data:
    - Replace with generic, unspecified function
    - Also view instruction memory as function

Modeling Data-Dependent Control

  [Figure: branch logic with inputs Adata, Bdata, and Cond, replaced by predicate p producing "Branch?"]
  Model by Uninterpreted Predicate
    - Yields arbitrary Boolean value for each control + data combination
    - Produces same result when arguments match
    - Pipeline & reference model will branch under same conditions

Modeling Memories as Mutable Functions

  Memory M Modeled as Function
    - M(a): Value at location a
  Initially
    - Arbitrary state
    - Modeled by uninterpreted function $m_0$
  Writing Transforms Memory
    - $M' = \mathit{Write}(M, wa, wd)$
    - $\lambda a .\ \mathrm{ITE}(a = wa,\ wd,\ M(a))$
    - Future reads of address wa will get wd
  [Figure: memory read M(a); initial memory as function $m_0$; write implemented by a multiplexor that selects wd when a = wa]

Required Logic

  Scalar Data Types
    - Formulas (F): Boolean expressions
        - Control signals
    - Terms (T): Integer expressions
        - Data values
  Functional Data Types
    - Functions (Fun): Integer $\rightarrow$ Integer
        - Immutable: Functional units
        - Mutable: Memories
    - Predicates (P): Integer $\rightarrow$ Boolean
        - Immutable: Data-dependent control
        - Mutable: Bit-level memories

Modeling Unbounded FIFO Buffer

  Queue is Subrange of Infinite Sequence
    - Q.head = h: Index of oldest element
    - Q.tail = t: Index of insertion location
    - Q.val = q: Function mapping indices to values
    - q(i) valid only when $h \le i < t$
  Required Operations
    - Increment head & tail pointers
    - Compare head to tail (emptiness)
  [Figure: sequence with increasing indices: ..., q(h-2), q(h-1) already popped; q(h), q(h+1), ..., q(t-2), q(t-1) between head and tail; q(t), q(t+1), ... not yet inserted]

CLU Logic

  Counter arithmetic, Lambda expressions, and Uninterpreted functions
  Terms (T): Integer expressions
    - $\mathrm{ITE}(F, T_1, T_2)$: If-then-else
    - $\mathit{Fun}(T_1, \ldots, T_k)$: Function application
    - succ(T): Increment
    - pred(T): Decrement
  Formulas (F): Boolean expressions
    - $\lnot F$, $F_1 \land F_2$, $F_1 \lor F_2$: Boolean connectives
    - $T_1 = T_2$: Equation
    - $T_1 < T_2$: Inequality
    - $P(T_1, \ldots, T_k)$: Predicate application

CLU Logic (Cont.)
  Functions (Fun): Integer $\rightarrow$ Integer
    - $f$: Uninterpreted function symbol
    - $\lambda x_1, \ldots, x_k .\ T$: Function definition
  Predicates (P): Integer $\rightarrow$ Boolean
    - $p$: Uninterpreted predicate symbol
    - $\lambda x_1, \ldots, x_k .\ F$: Predicate definition

Decision Problem

  Logic of Equality with Uninterpreted Functions
    - Truth values
        - Dashed lines
        - Model control
        - Logical connectives
        - Equations
    - Integer values
        - Solid lines
        - Model data
        - Uninterpreted functions
        - If-Then-Else operation
  Task
    - Determine whether formula is universally valid
    - True for all interpretations of variables and function symbols
  [Figure: example formula as a graph over x0, d0, e0, e1, applications of f, equality tests, and ITE nodes]

Finite Model Property for EUF

  [Figure: the same formula graph, highlighting the term expressions x0, d0, f(x0), f(d0)]
  Observation
    - Any formula has limited number of distinct expressions
    - Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values

    Expression   Possible Values   Bit Encoding
    x0           {0}               0    0
    d0           {0,1}             0    b10
    f(x0)        {0,1,2}           b21  b20
    f(d0)        {0,1,2,3}         b31  b30

  For Each Expression
    - Either equal to or distinct from each preceding expression
  Boolean Encoding
    - Use Boolean values to encode integers over small range
    - EUF formula can be translated into propositional logic
    - Tautology iff original formula valid

Finite Model Property for CLU

  [Figure: possible relative orderings of x, x+1, y-1, and y for a formula over x, y, succ(x), and pred(y), including the assignments x = 0, y = 3 and x = 2, y = 1]
  Observation
    - Need to encode all possible relative orderings of expressions
    - Each symbolic value has maximum range of increments & decrements
    - Can use Boolean encodings of small integer ranges

UCLID Operation

  Operation
    - Series of transformations leading to propositional formula
    - Each has polynomial complexity
  Flow
    - file.ucl (Model + Specification)
    - Symbolic Simulation, yielding UCLID formula
    - Lambda Expansion, yielding $\lambda$-free formula
    - Function & Predicate Elimination, yielding term formula
    - Finite Instantiation, yielding Boolean formula
    - Boolean Satisfiability

UCLID Example

  DLX Pipeline
    - Single-issue, 5-stage pipeline
  [Figure: pipeline stages Fetch, Decode, Execute, Memory, and Write Back separated by pipeline registers fd, de, em, and mw (fields such as Instr, PC, Type, Arg1, Arg2, Target, Value, Data, Dest, Valid), with program state pPC, pRF, and pMem; legend distinguishes Boolean, term, and function state]

Writing & Reading Register File

  [Figure: Decode reads pRF at src1 and src2 of fd Instr into de Arg1 and Arg2; Write Back writes mw Data into pRF at mw Dest when mw Valid]

Writing Register File

    init[pRF] := rf0; (* Uninterpreted Function *)
    next[pRF] := Lambda(a) .
      case
        mw_Valid & (a = mw_Dest) : mw_Data;
        default : pRF(a);
      esac;

  [Figure: Write Back stage writing mw Data, Dest, Valid into pRF]

Reading Register File

    init[de_Arg1] := dea10; (* Initially arbitrary *)
    next[de_Arg1] := next[pRF](src1(fd_Instr));
    init[de_Arg2] := dea20; (* Initially arbitrary *)
    next[de_Arg2] := next[pRF](src2(fd_Instr));

    - Write-after-read semantics
  [Figure: Decode stage reading pRF through src1 and src2 of fd Instr into de Arg1 and Arg2]

Verifying Safety Properties

  [Figure: state machine with present state, next state, reset, and arbitrary inputs; state space showing reset states, reachable states, and bad states]
  Prove: System will never reach bad state

Bounded Model Checking

  [Figure: sets R1, R2, ..., Rn growing outward from the reset states toward the bad states]
  Repeatedly Perform Image Computations
    - Set of all states reachable by one more state transition
  Easy to Implement
  Underapproximation of Reachable State Set
    - But, typically catch most bugs with 8–10 steps

True Model Checking

  [Figure: sets R1, R2, ..., Rn growing from the reset states; bad states outside]
  Reach Fixed-Point
    - $R_n = R_{n+1} = \mathit{Reachable}$
  Impractical for Term-Level Models
    - Many systems never reach fixed point
        - Can keep adding elements to buffer
    - Convergence test undecidable

Invariant Checking

  [Figure: invariant set I containing the reset and reachable states and excluding the bad states]
  Key Properties of System that Make it Operate Correctly
    - Formulate as formula I
  Prove Inductive
    - Holds initially: $I(s_0)$
    - Preserved by all state changes: $I(s) \Rightarrow I(\delta(i, s))$

An Out-of-order Processor (OOO)

  [Figure: OOO block diagram with PC and incrementer, program memory, decode, dispatch, register rename unit (valid, tag, val), ALU, retire, and result bus]
  [Figure, continued: execute stage and reorder buffer with head and tail pointers]
  Reorder Buffer Fields
    - valid, value
    - 1st operand: src1valid, src1val, src1tag
    - 2nd operand: src2valid, src2val, src2tag
    - dest, op, result
  Data Dependencies Resolved by Register Renaming
    - Map register ID to instruction in reorder buffer that will generate register value
  Inorder Retirement Managed by Retirement Buffer
    - FIFO buffer keeping pending instructions in program order

Verifying OOO

  Lahiri, Seshia, & Bryant, FMCAD 2002
  Goal
    - Each step of OOO consistent with Instruction Set Architecture (ISA) model
  Challenges
    - OOO holds partially executed instructions in reorder buffer
    - States of two systems match only when reorder buffer flushed
  [Figure: ISA model (Reg. File, PC) beside OOO model (Reg. File, PC, Reorder Buffer)]

Adding Shadow State

  McMillan, '98; Arons & Pnueli, '99
  Provides Link Between ISA & OOO Models
    - Additional entries in ROB
        - Do not affect OOO behavior
    - Generated when instruction dispatched
    - Predict values of operands and result
        - From ISA model
  [Figure: ISA model (Reg. File, PC) linked to OOO model (Reg. File, PC, Reorder Buffer)]
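The shadow-state idea can be sketched in Python as a toy model: at dispatch, an ISA reference model predicts each instruction's result and stores it in the reorder-buffer entry; at retirement the prediction is compared with what the out-of-order machine computed. This is an illustrative simplification, not the UCLID model from the talk. The ALU is treated as an uninterpreted function via hash-consed terms, every instruction is a two-operand ALU operation, and all names (`OOO`, `term`, `run`) are assumptions made for this example.

```python
import random

_terms = {}

def term(*key):
    # Hash-consing: equal keys get equal ids, so applying an uninterpreted
    # function to equal arguments yields the same term.
    return _terms.setdefault(key, len(_terms))

def alu(a, b):
    return term("alu", a, b)

class OOO:
    def __init__(self, nregs):
        self.regs = {r: term("init", r) for r in range(nregs)}  # OOO register file
        self.isa = dict(self.regs)                   # ISA register file
        self.tag = {r: None for r in range(nregs)}   # rename unit: reg -> ROB index
        self.rob = {}                 # ROB is the subrange [head, tail) of a sequence
        self.head = self.tail = 0

    def dispatch(self, dest, src1, src2):
        def operand(r):
            t = self.tag[r]
            return ("val", self.regs[r]) if t is None else ("tag", t)
        shadow = alu(self.isa[src1], self.isa[src2])  # predicted by the ISA model
        self.isa[dest] = shadow
        self.rob[self.tail] = {"dest": dest, "ops": (operand(src1), operand(src2)),
                               "result": None, "shadow": shadow}
        self.tag[dest] = self.tail
        self.tail += 1

    def execute(self, t):
        entry, vals = self.rob[t], []
        for kind, x in entry["ops"]:
            v = x if kind == "val" else self.rob[x]["result"]
            if v is None:
                return                # producer has not executed yet
            vals.append(v)
        entry["result"] = alu(*vals)

    def retire(self):
        if self.head == self.tail or self.rob[self.head]["result"] is None:
            return
        e = self.rob[self.head]
        assert e["result"] == e["shadow"], "shadow value mispredicted"
        self.regs[e["dest"]] = e["result"]
        if self.tag[e["dest"]] == self.head:   # no later writer is pending
            self.tag[e["dest"]] = None
        self.head += 1

def run(steps, seed, nregs=4):
    """Random out-of-order schedule followed by a flush of the reorder buffer."""
    rng, m = random.Random(seed), OOO(nregs)
    for _ in range(steps):
        action = rng.choice(("dispatch", "execute", "retire"))
        if action == "dispatch":
            m.dispatch(*(rng.randrange(nregs) for _ in range(3)))
        elif action == "execute" and m.head < m.tail:
            m.execute(rng.randrange(m.head, m.tail))
        else:
            m.retire()
    while m.head < m.tail:            # flush: the head entry can always execute
        m.execute(m.head)
        m.retire()
    return m
```

Calling `run(300, seed=1)` exercises one random schedule; after the flush, the OOO and ISA register files coincide, which mirrors the observation that the two systems match only when the reorder buffer is empty. A simulation like this checks individual runs, whereas the invariants in the talk are proved for all runs.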
State Consistency Invariants

  Tag Consistency invariants (2)
    - Instructions only depend on instruction preceding in program order
  Register Renaming invariants (2)
    - Tag in a rename-unit should be in the ROB, and the destination register should match
        $\forall r .\ \lnot \mathit{reg.valid}(r) \Rightarrow (\mathit{rob.head} \le \mathit{reg.tag}(r) < \mathit{rob.tail} \land \mathit{rob.dest}(\mathit{reg.tag}(r)) = r)$
    - For any entry, the destination should have reg.valid as false and tag should contain this or later instruction
        $\forall_{rob}\, t .\ (\lnot \mathit{reg.valid}(\mathit{rob.dest}(t)) \land t \le \mathit{reg.tag}(\mathit{rob.dest}(t)) < \mathit{rob.tail})$

OOO Invariants

  Split into Formulas $I_1, \ldots, I_n$
    - $I_j(s_0)$ holds for any initial state $s_0$, for $1 \le j \le n$
    - $I_1(s) \land I_2(s) \land \ldots \land I_n(s) \Rightarrow I_j(s')$ for any current state $s$ and successor state $s'$, for $1 \le j \le n$
  Invariants for OOO (13)
    - Refinement maps (2)
        - Show relation between ISA and OOO models
    - Shadow state (3)
        - Shadow values correctly predict OOO values
    - State consistency (8)
        - Properties of OOO state that ensure proper operation
  Overall Correctness
    - Follows by induction on time

Proving OOO Invariants

  Proved Automatically
    - Time spent = 54 s on 1.4 GHz machine
    - Total effort = 2 person days
  Comparison
    - Previous efforts using theorem provers took weeks of effort

Extending the OOO Processor

  base
    - Executes ALU instructions only
  exc
    - Handles arithmetic exceptions
    - Must flush reorder buffer
  exc/br
    - Handles branches
    - Predicts branch & speculatively executes along path
  exc/br/mem-simp
    - Adds load & store instructions
    - Store commits as instruction retires
  exc/br/mem
    - Stores held in buffer
    - Can commit later
    - Loads must scan buffer for matching addresses

Comparative Verification Effort

                           base     exc      exc/br   exc/br/mem-simp   exc/br/mem
    Total invariants       13       34       39       67                71
    Manually instantiate   0        0        0        4                 8
    UCLID time             54 s     236 s    403 s    1594 s            2200 s
    Person time            2 days   5 days   2 days   15 days           10 days

"I Just Want a Loaf of Bread"

  [Figure: Ingredients, Recipe, Result]

Cooking with Invariants

  Ingredients:
  Predicates
    - $\mathit{rob.head} \le \mathit{reg.tag}(r)$
    - $\mathit{reg.valid}(r)$
    - $\mathit{reg.tag}(r) = t$
    - $\mathit{rob.dest}(t) = r$
  Recipe: Invariants
    - $\forall r, t .\ \lnot \mathit{reg.valid}(r) \land \mathit{reg.tag}(r) = t \Rightarrow (\mathit{rob.head} \le \mathit{reg.tag}(r) < \mathit{rob.tail} \land \mathit{rob.dest}(t) = r)$
  Result: Correctness

Automatic Recipe Generation

  [Figure: Ingredients fed to a Recipe Creator that produces the Recipe and Result]
  Want Something More
    - Given any set of ingredients
    - Generate best recipe possible

Automatic Predicate Abstraction

  Graf & Saïdi, CAV '97
  Idea
    - Given set of predicates $P_1(s), \ldots, P_k(s)$
        - Boolean formulas describing properties of system state
    - View as abstraction mapping: States $\rightarrow \{0,1\}^k$
    - Defines abstract FSM over state set $\{0,1\}^k$
        - Form of abstract interpretation
    - Do reachability analysis similar to symbolic model checking
  Prior Implementations
    - Very weak inference capabilities
        - Call theorem prover or decision procedure to test each potential transition
    - Little support for quantified predicates

Abstract State Space

  [Figure: the abstraction function maps a concrete state s to the abstract state t given by $P_1(s), \ldots, P_k(s)$; the concretization function maps an abstract state t back to the set of concrete states s]

Abstract State Machine

  [Figure: an abstract transition from s to t obtained by concretizing, taking a concrete transition, and abstracting]
    - Transitions in abstract system mirror those in concrete

Generating Concrete Invariant

  Reach Fixed-Point on Abstract System
    - Termination guaranteed, since finite state
  Equivalent to Computing Invariant for Concrete System
    - Strongest possible invariant that can be expressed by formula over these predicates
  [Figure: abstract reachable sets R1, R2, ..., Rn grown from the reset states to a fixed point A; concretizing A gives invariant I of the concrete system, containing its reset states]

Conventional Implementation of P.A.

  Basis
    - Abstract state sets described as formulas over Boolean variables $B = b_1, \ldots, b_k$
    - Current state given by formula over $(b_1, \ldots, b_k)$
    - Check whether candidate state $(b_1, \ldots, b_k)$ is successor
  [Figure: abstract system above concrete system; the current and candidate abstract states are concretized by substituting predicates for Boolean variables ([P/B]) to test whether an abstract transition exists]
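The conventional scheme can be sketched in Python on a toy system. This is a minimal illustration under loud assumptions: the concrete system (two counters that increment in lock step or reset), the bound `N`, and the two predicates are invented for the example, and brute-force enumeration of a bounded concrete domain stands in for the decision procedure that a real implementation would call.

```python
from itertools import product

N = 6  # assumed bound on the concrete domain, so that enumeration is possible

def successors(s):
    """Concrete transitions of the toy system."""
    x, y = s
    yield (0, 0)                      # reset both counters
    if x + 1 < N and y + 1 < N:
        yield (x + 1, y + 1)          # increment both in lock step

# Predicates P1(s), P2(s) defining the abstraction States -> {0,1}^k
PREDICATES = (lambda s: s[0] == s[1],
              lambda s: s[0] == 0)

def alpha(s):
    """Abstraction function: evaluate every predicate on concrete state s."""
    return tuple(p(s) for p in PREDICATES)

def abstract_post(a):
    """Abstract successors of a: concretize, take one step, abstract again."""
    return {alpha(t)
            for s in product(range(N), repeat=2) if alpha(s) == a
            for t in successors(s)}

def abstract_reach(initial_states):
    """Fixed point of abstract reachability; terminates since {0,1}^k is finite."""
    reached = {alpha(s) for s in initial_states}
    frontier = set(reached)
    while frontier:
        new = set().union(*(abstract_post(a) for a in frontier)) - reached
        reached |= new
        frontier = new
    return reached
```

Starting from `[(0, 0)]`, the fixed point contains only abstract states in which the first predicate is true, so the derived invariant for the toy system is x = y. Each call to `abstract_post` plays the role of the per-transition satisfiability checks described above.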
Drawbacks of Conventional Implementation

  [Figure: the same abstract transition check, reduced to a satisfiability test on the concretized current and next states]
  Very Slow
    - Guess at possible next state
    - Construct term-level formula and test for satisfiability
    - Possibly $2^k$ calls to decision procedure
  Can Only Handle Propositional Predicates
    - Cannot construct quantified invariants

Symbolic Approach to P.A.

  Lahiri, Bryant, Cook, CAV 2003
  Generate Quantified Formula Describing Next Abstract State Set
    - Current state given by formula over B
    - Generate formula over B describing all successors
    - Formula over (B, S, X): how to reach abstract state B via concrete states S and X
  [Figure: all abstract transitions of the abstract system computed at once]

Symbolic Approach (cont.)

  Transform into Quantified Boolean Formula
    - Formula of form Next(B) = $\exists S, X$ [formula over S, X, B]
        - S, X: Integer and function variables
        - B: Abstract state variables
    - Translate into Boolean formula of form $\exists A$ [formula over A, B]
        - A: Boolean variables encoding integer & function values
  Key Property
    - { B | term-level formula over (S, X, B) satisfiable } = { B | Boolean formula over (A, B) satisfiable }
    - Solve using either SAT enumeration or BDD quantification

Quantified Invariant Generation

    - User supplies predicates containing free variables
    - Generate globally quantified invariant
  Example
    - Predicates
        - p1: reg.valid(r)
        - p2: rob.dest(t) = r
        - p3: reg.tag(r) = t
    - An abstract state given by a Boolean combination of p1, p2, and p3 corresponds to a concrete state satisfying that combination of reg.valid(r), reg.tag(r) = t, and rob.dest(t) = r under a single quantifier $\forall r, t\,[\ldots]$
    - rather than a combination of the separately quantified predicates $\forall r\,[\mathit{reg.valid}(r)]$, $\forall r, t\,[\mathit{reg.tag}(r) = t]$, and $\forall r, t\,[\mathit{rob.dest}(t) = r]$

Systems Verified with Predicate Abstraction

    Model                                    Predicates   Iterations   CPU Time
    Out-Of-Order Execution Unit              25           9            2,613 s
    German's Cache Protocol                  21           9            122 s
    German's Protocol, unbounded channels    30           19           15,000 s
    Bounded Retransmission Buffer            22           9            11 s
    Lamport's Bakery Algorithm               24           24           5,211 s

  Very general models
    - Unbounded processes, buffers, cache lines, …
  Safety properties only

Challenge: Model Generation

    - How to generate
      term-level model
    - How to guarantee faithfulness to RTL description
  Comparison of Models
    - RTL
        - Abstracts functional elements from gate-level model
        - Synthesis allows automatic map to gate level
    - Bluespec
        - Abstracts synchronous timing to atomic transactions
        - Synthesize to RTL by operator scheduling
    - Term level
        - Abstracts bit-level data representations to words
        - Abstracts memories to mutable functions

Dimensions of Abstraction

  [Figure: two axes, Data and Temporal. Along Data: Gate, RTL, Term. Along Temporal: RTL to Bluespec, and Term to Term with Scheduler]
  Temporal & Data are Orthogonal Abstractions
    - Bluespec provides only temporal abstraction
    - UCLID language supports cycle-level timing
    - Can incorporate scheduler to model system operating with atomic transactions

Automatic Model Generation

  [Figure: the same Data and Temporal axes; abstractors ("Abstract words") map RTL to Term and Bluespec to Term with Scheduler]
  Task
    - Program to automatically generate term-level models
    - Replace functional units by uninterpreted functions
  Challenges
    - Legacy code not written with abstraction in mind
    - Hard to determine what to keep precise and what to extract
  Implementations
    - Andraus & Sakallah, DAC '04

Conclusions

  CLU is Useful Logic
    - Expressive enough to model wide range of systems
        - Systems with unbounded resources
        - Abstract away most data operations
    - Simple enough to be tractable
        - Small domain property allows exploiting Boolean methods
  Predicate Abstraction is Powerful Tool
    - Removes requirement to hand-generate invariants
    - Benefits similar to model checking

Further Work

  Support for Proofs of Liveness
    - Must make argument that progress being made
  Greater Automation
    - Automatic generation of predicates
    - More efficient implementation of predicate abstraction
  More Powerful Logic
    - Linear arithmetic would be useful
    - Potential blow-up when translate to Boolean formula
  Apply to Other Systems
    - Software
    - Network protocols