Symbolic, Word-Level Hardware Verification
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by graduate students: Sanjit Seshia, Shuvendu Lahiri

Outline
  Word-Level Abstraction of Hardware
    Abstract details of data
    While keeping detailed control and cycle-level timing
    Enables verification of entire system
  Automated Formal Verification
    Provide capabilities similar to model checking
    Automate via automatic predicate abstraction

Challenge: System-Level Verification
  Verification Task
    Does processor implement its ISA?
  Why is it Hard?
    Lots of internal state
    Complex control logic
    Complex functionality
  [Figure: Alpha 21264 Microprocessor. Microprocessor Report, Oct. 28, 1996]

Sources of Complexity
  State
    ISA: registers, memory
    Microarchitectural: caches, buffers, reservation stations
    Conceptually finite state, but practically unbounded
  Control
    Pipelines spread execution across multiple cycles
    Out-of-order execution modifies processing order
    Superscalar operation creates parallelism
    Control logic coordinates everything
    Resulting behavior matches that of sequential ISA model
  Functionality
    Arithmetic functions, instruction decoding

Existing Verification Methods
  Simulators, equivalence checkers, model checkers, …
  All Operate at Bit Level
    RTL model
    State encoded as words and arrays of words
    Composed of bits
  Most Operate at Cycle or Subcycle Level
    How each bit of state gets updated
  System Modeling Languages
    Abstract time up to transaction level
    Still view state as collection of bits

Word-Level Abstraction
  [Figure: control logic and data path containing combinational logic blocks Com. Log. 1 and Com. Log. 2]
  Data: Abstract details of form & functions
  Control: Keep at bit level
  Timing: Keep at cycle level

Data Abstraction #1: Bits → Integers
  [Figure: bits $x_0, x_1, x_2, \dots, x_{n-1}$ replaced by a single word $x$]
  View Data as Symbolic Words
    Arbitrary integers
    No assumptions about size or encoding
    Classic model for reasoning about software
    Can store in memories & registers

Modeling Data Selection
  If-Then-Else Operation
    Multiplexor
    Allows control-dependent data flow
  [Figure: multiplexor computing ITE(p, x, y); p = 1 selects x, p = 0 selects y]

Abstracting Data Bits
  [Figure: control logic and data path with combinational logic blocks Com. Log. 1 and Com. Log. 2 operating on abstracted data]
  What do we do about logic functions?

Abstraction #2: Uninterpreted Functions
  [Figure: ALU replaced by function f]
  For any Block that Transforms or Evaluates Data:
    Replace with generic, unspecified function
    Only assumed property is functional consistency:
      $a = x \land b = y \Rightarrow f(a, b) = f(x, y)$

Abstracting Functions
  [Figure: combinational logic blocks Com. Log. 1 and Com. Log. 2 replaced by uninterpreted functions F1 and F2]
  For Any Block that Transforms Data:
    Replace by uninterpreted function
    Ignore detailed functionality
    Conservative approximation of actual system

Modeling Data-Dependent Control
  [Figure: branch logic taking Adata and Bdata and producing "Branch?", modeled as condition predicate p]
  Model by Uninterpreted Predicate
    Yields arbitrary Boolean value for each control + data combination
    Produces same result when arguments match
    Pipeline & reference model will branch under same conditions

Abstraction #3: Modeling Memories as Mutable Functions
  Memory M Modeled as Function
    M(a): Value at location a
  Initially
    Arbitrary state
    Modeled by uninterpreted function $m_0$

Effect of Memory Write Operation
  Writing Transforms Memory
    $M' = \mathrm{Write}(M, wa, wd)$
    Reading from updated memory:
      Address wa will get wd
      Otherwise get what's already in M
  Express with Lambda Notation
    Notation for defining functions
    $M' = \lambda a.\ \mathrm{ITE}(a = wa,\ wd,\ M(a))$

Systems with Buffers
  [Figure: Circular Queue with indices 0 to Max-1 and Unbounded Buffer; in each, the entries between head and tail are "In Use"]
  Modeling Method
    Mutable function to describe buffer contents
    Integers to represent head & tail pointers

Some History
  Historically
    Standard model used for program verification
    Widely used with theorem-proving approaches to hardware verification
      E.g., Hunt '85
  Automated Approaches to Hardware Verification
    Burch & Dill, '95
      Tool for verifying pipelined microprocessors
      Implemented by form of symbolic simulation
    Continued application to pipelined processor verification

UCLID
  Seshia, Lahiri, Bryant, CAV '02
  Term-Level Verification System
    Language for describing systems
      Inspired by CMU SMV
    Symbolic simulator
      Generates integer expressions describing system state after sequence of steps
    Decision procedure
      Determines validity of formulas
    Support for multiple verification techniques
  Available by Download
    http://www.cs.cmu.edu/~uclid

Challenge: Model Generation
  How to generate term-level model
  How to guarantee faithfulness to RTL description
  Comparison of Models
    RTL
      Abstracts functional elements from gate-level model
      Synthesis allows automatic map to gate level
    Term level
      Abstracts bit-level data representations to words
      Abstracts memories to mutable functions
      No direct connection to synthesizable model

Generating Term-Level Model
  Manually Generate from RTL
    How do we know it is a valid abstraction?
    Hard to keep consistent with changing RTL
  Automatically Generate from RTL
    Andraus & Sakallah, DAC '04
    Must decide which signals to keep Boolean, which to abstract
    Confused by bit field extraction primitives of HDL
  Synthesize RTL from Word-Level Model
    Difficult to make efficient

Underlying Logic
  Existing Approaches to Formal Verification
    E.g., symbolic model checking
    State encoded as fixed set of bits
      Finite state system
      Amenable to Boolean methods (SAT, BDDs)
  Our Task
    State encoded with unbounded data types
      Arbitrary integers
      Functions over integers
    Must use decision procedures
      Determine validity of formula in some subset of first-order logic
      Adapt methods historically used by automated theorem provers

EUF: Equality with Uninterp. Functs
  Decidable fragment of first-order logic

  Formulas (F)                                       Boolean Expressions
    $\neg F$, $F_1 \land F_2$, $F_1 \lor F_2$          Boolean connectives
    $T_1 = T_2$                                        Equation
    $P(T_1, \dots, T_k)$                               Predicate application
  Terms (T)                                          Integer Expressions
    $\mathrm{ITE}(F, T_1, T_2)$                        If-then-else
    $Fun(T_1, \dots, T_k)$                             Function application
  Functions (Fun)                                    Integer → Integer
    $f$                                                Uninterpreted function symbol
    $\lambda x_1, \dots, x_k.\ T$                      Function lambda expression
  Predicates (P)                                     Integer → Boolean
    $p$                                                Uninterpreted predicate symbol

EUF Decision Problem
  Circuit Representation of Formula
    Truth Values
      Dashed lines
      Logical connectives
      Equations
    Integer Values
      Solid lines
      Uninterpreted functions
      If-Then-Else operation
  [Figure: circuit for formula F over Booleans e0, e1, integers x0, d0, function f, ITE operations, and equality comparators]
  Task
    Determine whether formula F is universally valid
    True for all interpretations of variables and function symbols
      » E.g., all values of integers x0 & d0, all Booleans e0 and e1, and all integer functions f

Finite Model Property for EUF
  [Figure: same circuit, with the distinct expressions x0, d0, f(x0), f(d0) marked]
  Observation
    Any formula has limited number of distinct expressions
    Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values

  Expression    Possible Values    Bit Encoding
  x0            {0}                0         0
  d0            {0,1}              0         $b_{10}$
  f(x0)         {0,1,2}            $b_{21}$  $b_{20}$
  f(d0)         {0,1,2,3}          $b_{31}$  $b_{30}$

  For Each Expression
    Either equal to or distinct from each preceding expression
  Boolean Encoding
    Use Boolean values to encode integers over small range
    EUF formula can be translated into propositional logic
      Logic circuit with multiplexors, comparators, logic gates
      Tautology iff original formula valid

UCLID Operation
  [Flow: file.ucl (Model + Specification) → Symbolic Simulation → UCLID Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability]
  Operation
    Series of transformations leading to propositional formula
    Except for lambda expansion, each has polynomial complexity

Verifying Safety Properties
  [Figure: state machine with Present State, Next State, arbitrary Inputs, and Reset; state space showing Reset States, Reachable States, and Bad States]
  State Machine Model
    State encoded as Booleans, integers, and functions
    Next state function expresses how updated on each step
  Prove: System will never reach bad state

Bounded Model Checking
  [Figure: Reset States, R1, R2, …, Rn growing inside the Reachable set, with Bad States outside]
  Repeatedly Perform Image Computations
    Set of all states reachable by one more state transition
  Easy to Implement
  Underapproximation of Reachable State Set
    But, typically catch most bugs with 8–10 steps

True Model Checking
  [Figure: Reset States, R1, R2, …, Rn and Bad States]
  Reach Fixed-Point
    $R_n = R_{n+1} = \text{Reachable}$
  Impractical for Term-Level Models
    Many systems never reach fixed point
      Can keep adding elements to buffer
    Convergence test undecidable

Inductive Invariant Checking
  [Figure: invariant I enclosing Reset States and Reachable States and excluding Bad States]
  Key Properties of System that Make it Operate Correctly
    Formulate as formula I
  Prove Inductive
    Holds initially: $I(s_0)$
    Preserved by all state changes: $I(s) \Rightarrow I(\delta(i, s))$

An Out-of-order Processor (OOO)
  [Figure: PC with incr, program memory, DECODE, dispatch, Register Rename Unit (valid, tag, val), Reorder Buffer (head, tail), ALU, execute, result bus, retire]
  Reorder Buffer Fields
    valid, value
    1st Operand: src1valid, src1val, src1tag
    2nd Operand: src2valid, src2val, src2tag
    dest, op, result
  Data Dependencies Resolved by Register Renaming
    Map register ID to instruction in reorder buffer that will generate register value
  Inorder Retirement Managed by Retirement Buffer
    FIFO buffer keeping pending instructions in program order

OOO Invariants
  Split into Formulas $I_1, \dots, I_n$
    $I_j(s_0)$ holds for any initial state $s_0$, for $1 \le j \le n$
    $I_1(s) \land I_2(s) \land \dots \land I_n(s) \Rightarrow I_j(s')$ for any current state $s$ and successor state $s'$, for $1 \le j \le n$
  Invariants for OOO (13)
    Refinement maps (2)
      Show relation between ISA and OOO models
    State consistency (8)
      Properties of OOO state that ensure proper operation
    Added state (3)
      Shadow values correctly predict OOO values
  Overall Correctness
    Follows by induction on time

State Consistency Invariant Examples
  Register Renaming invariants (2)
    Tag in a rename unit should be in the ROB, and the destination register should match
      $\forall r.\ \neg reg.valid(r) \Rightarrow (rob.head \le reg.tag(r) < rob.tail \ \land\ rob.dest(reg.tag(r)) = r)$
    For any entry, the destination should have reg.valid as false and tag should contain this or later instruction
      $\forall_{rob}\, t.\ (\neg reg.valid(rob.dest(t)) \ \land\ t \le reg.tag(rob.dest(t)) < rob.tail)$

Extending the OOO Processor
  base
    Executes ALU instructions only
  exc
    Handles arithmetic exceptions
    Must flush reorder buffer
  exc/br
    Handles branches
    Predicts branch & speculatively executes along path
  exc/br/mem-simp
    Adds load & store instructions
    Store commits as instruction retires
  exc/br/mem
    Stores held in buffer
      Can commit later
    Loads must scan buffer for matching addresses

Comparative Verification Effort

                        base     exc      exc/br   exc/br/mem-simp   exc/br/mem
  Total Invariants      13       34       39       67                71
  Manually instantiate  0        0        0        4                 8
  UCLID time            54 s     236 s    403 s    1594 s            2200 s
  Person time           2 days   7 days   9 days   24 days           34 days

  (Person time shown cumulatively)

"I Just Want a Loaf of Bread"
  [Figure: Ingredients → Recipe → Result]

Cooking with Invariants
  Ingredients: Predicates
    $rob.head \le reg.tag(r)$
    $reg.valid(r)$
    $reg.tag(r) = t$
    $rob.dest(t) = r$
  Recipe: Invariants
    $\forall r, t.\ \neg reg.valid(r) \land reg.tag(r) = t \Rightarrow (rob.head \le reg.tag(r) < rob.tail \ \land\ rob.dest(t) = r)$
  Result: Correctness

Automatic Recipe Generation
  [Figure: Ingredients → Recipe Creator → Result]
  Want Something More
    Given any set of ingredients
    Generate best recipe possible

Automatic Predicate Abstraction
  Graf & Saïdi, CAV '97
  Idea
    Given set of predicates $P_1(s), \dots, P_k(s)$
      Boolean formulas describing properties of system state
    View as abstraction mapping: States → $\{0,1\}^k$
    Defines abstract FSM over state set $\{0,1\}^k$
      Form of abstract interpretation
    Do reachability analysis similar to symbolic model checking
  Implementation
    Early ones had weak inference capabilities
      Call theorem prover or decision procedure to test each potential transition
    Recent ones make better use of symbolic encodings

Abstract State Space
  [Figure: Abstraction function maps concrete state s to abstract state t via $P_1(s), \dots, P_k(s)$; Concretization function maps abstract state t back to concrete states s]

Abstract State Machine
  [Figure: an abstract transition t → t' is obtained by concretizing t, taking a concrete transition s → s', and abstracting s']
  Transitions in abstract system mirror those in concrete

Generating Concrete Invariant
  [Figure: abstract system A with Reset States, R1, R2, …, Rn; concretized to invariant I of concrete system C]
  Reach Fixed-Point on Abstract System
    Termination guaranteed, since finite state
  Equivalent to Computing Invariant for Concrete System
    Strongest possible invariant that can be expressed by formula over these predicates

Predicate Abstraction Example
  State Space
    State variables: { x, y }
  Initial State
    { (2, 1) }
  Next State Behavior
    $x \leftarrow -x$
    $y \leftarrow -y$
  Verification Task
    Prove all bad states unreachable
  [Figure: x-y plane showing Initial State and Bad States]

Precise Analysis
  Reachable States
    $\{ (2, 1),\ (-2, -1) \}$
  [Figure: x-y plane showing Reachable States and Bad States]

Predicates
  $c_{x:3}$, $c_{x:y}$, $c_{y:0}$
  Use 3-valued predicates in this example
    Each predicate compares its two arguments and yields L (less), E (equal), or G (greater)
  [Figure: x-y plane partitioned into regions labeled L, E, G for each predicate]

Abstract Initial State
  $c_{x:3}$ = L, $c_{x:y}$ = G, $c_{y:0}$ = G
  Reached Set #0: { LGG }

Step 1: Concretize Reached Set #0
  Reached Set #0: { LGG }
  Concretize LGG: $c_{x:3}$ = L, $c_{x:y}$ = G, $c_{y:0}$ = G
  (Note loss of precision)

Compute Possible Successor States
  $x \leftarrow -x$, $y \leftarrow -y$
  [Figure: concretize, then apply concrete transition s → s']

Abstract Newly Reached States
  $c_{x:3}$ = L, $c_{x:y}$ = L, $c_{y:0}$ = L
  [Figure: concretize, concrete transition s → s', abstract]
  Reached Set #1: { LLL, LGG }

Step 2: Concretize Reached Set #1
  Reached Set #1: { LLL, LGG }
  Concretize LLL: $c_{x:3}$ = L, $c_{x:y}$ = L, $c_{y:0}$ = L
  (Note loss of precision)

Compute Possible Successor States
  $x \leftarrow -x$, $y \leftarrow -y$
  [Figure: concretize, then apply concrete transition s → s']

Abstract Newly Reached States
  $c_{x:3}$ = G or E, $c_{x:y}$ = G, $c_{y:0}$ = G
  [Figure: concretize, concrete transition s → s', abstract]
  Reached Set #2: { LLL, LGG, EGG, GGG }

Final Reached State Set
  { LLL, LGG, EGG, GGG }
  [Figure: regions LLL, LGG, EGG, GGG in the x-y plane, with Bad States shown separately]

Systems Verified with Predicate Abstraction

  Model                                   Predicates   Iterations   CPU Time
  Out-Of-Order Execution Unit             25           9            1,207 s
  German's Cache Protocol                 13           9            14 s
  German's Protocol, unbounded channels   24           17           427 s
  Bounded Retransmission Buffer           22           9            11 s
  Lamport's Bakery Algorithm              33           18           471 s

  Very general models
    Unbounded processes, buffers, cache lines, …
  Safety properties only

Predicate Abstraction Convergences
  Powerful method for generating & evaluating abstract model of system
  Applicable to variety of systems with different modeling levels

               Hardware                            Software
  Word-Level   UCLID                               SLAM
               Seshia, Lahiri, Bryant, CAV '02     Ball, Rajamani, SPIN '01
  Bit-Level    Clarke, Talupur, Wang, SAT '03      CBMC
                                                   Kroening, Clarke, ICCAD '04

Ongoing Research Areas
  Decision Procedures
    Expand class of logic
      Linear relations
    Improved encoding techniques
    Application to software & hardware verification
  Predicate Abstraction
    Improving efficiency
      Cost increases rapidly with number of predicates
    Automatic generation of predicates
      Based on property to be verified & system model
  Real-Life Application
    Closing gap with actual hardware models
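
Sketch: Memory Write as a Lambda Expression
  The "Effect of Memory Write Operation" slide models a write as a new function built with ITE. The Python sketch below illustrates that idea only; it is not UCLID code. The helper names (m0, ite, write) and the concrete contents returned by m0 are assumptions made for illustration, since the real initial memory is uninterpreted.

```python
from typing import Callable

Memory = Callable[[int], int]


def m0(a: int) -> int:
    """Stand-in for the uninterpreted initial memory (contents are hypothetical)."""
    return 1000 + a


def ite(p: bool, x: int, y: int) -> int:
    """If-then-else: behaves like a multiplexor selecting x when p holds, else y."""
    return x if p else y


def write(M: Memory, wa: int, wd: int) -> Memory:
    """Return the memory after writing wd at address wa.

    The old memory M is left untouched; the result is a new function,
    mirroring lambda a . ITE(a = wa, wd, M(a)).
    """
    return lambda a: ite(a == wa, wd, M(a))


# Two successive writes build a chain of ITE-guarded lookups.
M1 = write(m0, 5, 42)
M2 = write(M1, 7, 99)
```

  Reading M2 at address 5 or 7 returns the written data, and any other address falls through to m0, which is the behavior the slide describes for reads from an updated memory.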
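
Sketch: Checking EUF Validity over a Small Domain
  The "Finite Model Property for EUF" slide observes that only the equalities among a limited number of expressions matter. One consequence, assumed here without proof, is that a formula with n distinct integer-valued expressions can be checked over a domain of n values. The sketch below checks validity by brute-force enumeration; this is a deliberately naive substitute for the Boolean encoding and SAT step used by UCLID, and the names is_valid, functional_consistency, and injectivity are hypothetical.

```python
from itertools import product


def is_valid(formula, num_vars, domain_size):
    """Return True if formula holds for every interpretation over a small domain.

    An interpretation assigns a value to each integer variable and a
    lookup table to the single function symbol f.
    """
    domain = range(domain_size)
    for values in product(domain, repeat=num_vars):
        for table in product(domain, repeat=domain_size):
            def f(a, table=table):
                return table[a]
            if not formula(f, *values):
                return False  # found a falsifying interpretation
    return True


def functional_consistency(f, x, y):
    # x = y  implies  f(x) = f(y)
    return x != y or f(x) == f(y)


def injectivity(f, x, y):
    # f(x) = f(y)  implies  x = y  (not a property of uninterpreted functions)
    return f(x) != f(y) or x == y


# Four distinct expressions (x, y, f(x), f(y)), so a domain of 4 values is assumed to suffice.
consistency_valid = is_valid(functional_consistency, num_vars=2, domain_size=4)
injectivity_valid = is_valid(injectivity, num_vars=2, domain_size=4)
```

  Functional consistency comes out valid, while injectivity is refuted by a constant function applied to two different arguments. Enumeration grows exponentially with the number of expressions, which is why the slides translate the problem to propositional logic instead.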
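
Sketch: Abstract Reachability for the Predicate Abstraction Example
  The concretize / transition / abstract loop from the worked example can be sketched as follows. Several things here are assumptions: the next-state behavior is taken to negate both variables (this is consistent with the reached sets listed on the slides), L, E, and G are read as less, equal, and greater, and a finite grid of integer states (UNIVERSE) stands in for the decision procedure that a real tool would use to reason about all concrete states. All function names are hypothetical.

```python
def cmp3(a, b):
    """Three-valued comparison: 'L' if a < b, 'E' if a == b, 'G' if a > b."""
    return "L" if a < b else "E" if a == b else "G"


def alpha(state):
    """Abstraction function: evaluate the predicates x:3, x:y, y:0 on a state."""
    x, y = state
    return cmp3(x, 3) + cmp3(x, y) + cmp3(y, 0)


def step(state):
    """Assumed concrete transition: negate both state variables."""
    x, y = state
    return (-x, -y)


# Finite sample of concrete states, closed under step (an assumption
# replacing symbolic reasoning over all integers).
UNIVERSE = [(x, y) for x in range(-6, 7) for y in range(-6, 7)]


def abstract_reach(initial_states):
    """Iterate concretize -> concrete transition -> abstract until a fixed point."""
    reached = {alpha(s) for s in initial_states}
    while True:
        concretized = [s for s in UNIVERSE if alpha(s) in reached]
        new_reached = reached | {alpha(step(s)) for s in concretized}
        if new_reached == reached:
            return reached
        reached = new_reached


final_reached = abstract_reach([(2, 1)])
```

  Under these assumptions the loop passes through { LGG } and { LLL, LGG } before stabilizing at { LLL, LGG, EGG, GGG }, the same reached sets as on the slides. Termination follows because there are only 27 abstract states.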