Symbolic, Word-Level Hardware Verification Carnegie Mellon University Randal E. Bryant

Symbolic, Word-Level
Hardware Verification
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by graduate students:
Sanjit Seshia, Shuvendu Lahiri
Outline
Word-Level Abstraction of Hardware

Abstract details of data
 While keeping detailed control and cycle-level timing

Enables verification of entire system
Automated Formal Verification


Provide capabilities similar to model checking
Automate via automatic predicate abstraction
Challenge: System-Level Verification
Verification Task

Does processor implement its ISA?
Why is it Hard?

Lots of internal state
Complex control logic
Complex functionality
Alpha 21264 Microprocessor
Microprocessor Report, Oct. 28, 1996
Sources of Complexity
State



ISA: registers, memory
Microarchitectural: caches, buffers, reservation stations
Conceptually finite state, but practically unbounded
Control




Pipelines spread execution across multiple cycles
Out-of-order execution modifies processing order
Superscalar operation creates parallelism
Control logic coordinates everything
 Resulting behavior matches that of sequential ISA model
Functionality

Arithmetic functions, instruction decoding
Existing Verification Methods

Simulators, equivalence checkers, model checkers, …
All Operate at Bit Level

RTL model
 State encoded as words and arrays of words
 Comprised of bits
Most Operate at Cycle or Subcycle Level

How each bit of state gets updated
System Modeling Languages


Abstract time up to transaction level
Still view state as collection of bits
Word-Level Abstraction
[Figure: control logic and data path containing combinational logic blocks Com. Log. 1 and Com. Log. 2]
Data: Abstract details of form & functions
Control: Keep at bit level
Timing: Keep at cycle level
Data Abstraction #1: Bits → Integers
[Figure: bits x0, x1, x2, …, xn-1 viewed as a single word x]
View Data as Symbolic Words

Arbitrary integers
 No assumptions about size or encoding
 Classic model for reasoning about software

Can store in memories & registers
Modeling Data Selection
If-Then-Else Operation


Multiplexor
Allows control-dependent data flow
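[Figure: multiplexor with control p, data inputs x (selected by 1) and y (selected by 0), and output ITE(p, x, y); with p = 1 the output is x, with p = 0 the output is y]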
Abstracting Data Bits
[Figure: control logic and data path with data signals abstracted to words; combinational logic blocks Com. Log. 1 and Com. Log. 2 marked "?"]
What do we do about logic functions?
Abstraction #2:
Uninterpreted Functions
[Figure: ALU block replaced by function f]
For any Block that Transforms or Evaluates Data:


Replace with generic, unspecified function
Only assumed property is functional consistency:
a = x ∧ b = y ⇒ f (a, b) = f (x, y)
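Functional consistency can be illustrated with a small sketch. The class name UninterpretedFunction is hypothetical, and the sketch fixes one particular interpretation (a fresh value for every new argument tuple); a verifier must reason about all interpretations, which this does not do.

```python
import itertools


class UninterpretedFunction:
    """One arbitrary interpretation of a function symbol.

    The only guaranteed property is functional consistency:
    equal arguments always give equal results.
    """

    def __init__(self, name):
        self.name = name
        self._table = {}                 # argument tuple -> result
        self._fresh = itertools.count()  # source of fresh result values

    def __call__(self, *args):
        # Reuse the stored result when the same arguments reappear.
        if args not in self._table:
            self._table[args] = next(self._fresh)
        return self._table[args]


f = UninterpretedFunction("f")
a, b = 3, 7
x, y = 3, 7
consistent = f(a, b) == f(x, y)   # a = x and b = y, so results agree
```

Nothing else about f is assumed, so any property proved with f in place of the real block also holds for the real block.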
Abstracting Functions
[Figure: control logic and data path, with combinational logic blocks Com. Log. 1 and Com. Log. 2 replaced by functions F1 and F2]
For Any Block that Transforms Data:

Replace by uninterpreted function
Ignore detailed functionality
Conservative approximation of actual system
Modeling Data-Dependent Control
[Figure: branch logic computing Branch? from Cond, Adata, and Bdata, modeled by predicate p]
Model by Uninterpreted Predicate


Yields arbitrary Boolean value for each control + data combination
Produces same result when arguments match
 Pipeline & reference model will branch under same conditions
Abstraction #3: Modeling Memories
as Mutable Functions
Memory M Modeled as Function

M(a): Value at location a
Initially

Arbitrary state
Modeled by uninterpreted function m0
[Figure: memory M with address input a; initially M is the function m0]
Effect of Memory Write Operation
Writing Transforms Memory

M′ = Write(M, wa, wd)
Reading from updated memory:
 Address wa will get wd
 Otherwise get what’s already in M
[Figure: M′ built from M, a comparator testing a = wa, and a multiplexor selecting wd (1) or M(a) (0)]
Express with Lambda Notation
 Notation for defining functions
 M′ = λ a . ITE(a = wa, wd, M(a))
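The write operation can be sketched with closures. The names ite, write, and m0 below are illustrative stand-ins, and m0 returns a tagged tuple only to represent "whatever was initially stored"; it is not how UCLID represents memories.

```python
def ite(p, x, y):
    """If-then-else on values."""
    return x if p else y


def write(M, wa, wd):
    """Return the memory after writing wd at address wa.

    The result is a new function; the old memory M is unchanged.
    """
    return lambda a: ite(a == wa, wd, M(a))


def m0(a):
    """Arbitrary initial contents (hypothetical stand-in)."""
    return ("m0", a)


M1 = write(m0, 4, 99)   # M1 = Write(m0, 4, 99)
M2 = write(M1, 8, 7)    # M2 = Write(M1, 8, 7)
```

Reading M2 at address 4 gives 99, at address 8 gives 7, and at any other address falls through to m0.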
Systems with Buffers
[Figure: circular queue with locations 0 to Max-1, and unbounded buffer; in each, the entries between head and tail are in use]
Modeling Method

Mutable function to describe buffer contents
Integers to represent head & tail pointers
Some History
Historically


Standard model used for program verification
Widely used with theorem-proving approaches to hardware verification
 E.g., Hunt ’85
Automated Approaches to Hardware Verification

Burch & Dill, ’95
 Tool for verifying pipelined microprocessors
 Implemented by form of symbolic simulation

Continued application to pipelined processor verification
UCLID

Seshia, Lahiri, Bryant, CAV ‘02
Term-Level Verification System

Language for describing systems
 Inspired by CMU SMV

Symbolic simulator
 Generates integer expressions describing system state after sequence of steps

Decision procedure
 Determines validity of formulas

Support for multiple verification techniques
Available by Download
http://www.cs.cmu.edu/~uclid
Challenge: Model Generation


How to generate term-level model
How to guarantee faithfulness to RTL description
Comparison of Models

RTL
 Abstracts functional elements from gate-level model
 Synthesis allows automatic map to gate level

Term level
 Abstracts bit-level data representations to words
 Abstracts memories to mutable functions
 No direct connection to synthesizable model
Generating Term-Level Model
Manually Generate from RTL


How do we know it is a valid abstraction?
Hard to keep consistent with changing RTL
Automatically Generate from RTL


Andraus & Sakallah, DAC ‘04
Must decide which signals to keep Boolean, which to abstract
 Confused by bit field extraction primitives of HDL
Synthesize RTL from Word-Level Model

Difficult to make efficient
Underlying Logic
Existing Approaches to Formal Verification


E.g., symbolic model checking
State encoded as fixed set of bits
 Finite state system
 Amenable to Boolean methods (SAT, BDDs)
Our Task

State encoded with unbounded data types
 Arbitrary integers
 Functions over integers

Must use decision procedures
 Determine validity of formula in some subset of first-order logic
 Adapt methods historically used by automated theorem provers
EUF: Equality with Uninterpreted Functions

Decidable fragment of first order logic
Formulas (F ): Boolean Expressions
 ¬F, F1 ∧ F2, F1 ∨ F2    Boolean connectives
 T1 = T2                 Equation
 P (T1, …, Tk)           Predicate application
Terms (T ): Integer Expressions
 ITE(F, T1, T2)          If-then-else
 Fun (T1, …, Tk)         Function application
Functions (Fun): Integer → Integer
 f                       Uninterpreted function symbol
 λ x1, …, xk . T         Function lambda expression
Predicates (P): Integer → Boolean
 p                       Uninterpreted predicate symbol
EUF Decision Problem
Circuit Representation of Formula

Truth Values
 Dashed Lines
 Logical connectives
 Equations

Integer Values
 Solid lines
 Uninterpreted functions
 If-Then-Else operation
[Figure: circuit with Boolean inputs e0 and e1, integer inputs x0 and d0, function blocks f, multiplexors with T/F inputs, equality comparators, and output F]
Task

Determine whether formula F is universally valid
 True for all interpretations of variables and function symbols
 E.g., all values of integers x0 & d0, all Booleans e0 and e1, and all integer functions f
Finite Model Property for EUF
[Figure: the same circuit, with the integer signals x0, d0, f (x0), and f (d0) labeled]
Observation

Any formula has limited number of distinct expressions
Only property that matters is whether or not different terms are equal
Boolean Encoding of Integer Values
Expression   Possible Values   Bit Encoding
x0           {0}               0 0
d0           {0,1}             0 b10
f (x0)       {0,1,2}           b21 b20
f (d0)       {0,1,2,3}         b31 b30
For Each Expression

Either equal to or distinct from each preceding expression
Boolean Encoding

Use Boolean values to encode integers over small range
EUF formula can be translated into propositional logic
 Logic circuit with multiplexors, comparators, logic gates
 Tautology iff original formula valid
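The finite model property can be illustrated by brute force. This is a minimal sketch, not UCLID's encoding: it assumes a formula over two integer variables x, y and one unary function f, and it enumerates every interpretation over a domain with as many values as the formula has distinct terms (here 4: x, y, f(x), f(y)). The helper name is_valid is hypothetical.

```python
from itertools import product


def is_valid(formula, num_terms):
    """Check a formula over x, y, f under all small-domain interpretations.

    The domain {0, ..., num_terms - 1} is large enough to make every
    term distinct, so a falsifying interpretation, if one exists,
    shows up in this enumeration.
    """
    dom = range(num_terms)
    for x, y in product(dom, repeat=2):
        # Each table is one total function f: dom -> dom.
        for table in product(dom, repeat=num_terms):
            f = lambda a, t=table: t[a]
            if not formula(x, y, f):
                return False
    return True


# x = y  =>  f(x) = f(y): functional consistency.
consistency = lambda x, y, f: (x != y) or (f(x) == f(y))
# f(x) = f(y)  =>  x = y: holds only for injective f.
injectivity = lambda x, y, f: (f(x) != f(y)) or (x == y)

consistency_valid = is_valid(consistency, 4)
injectivity_valid = is_valid(injectivity, 4)
```

The enumeration is exponential in the number of terms; the Boolean encoding above exists to hand the same question to a satisfiability solver instead.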
UCLID Operation
Operation

Series of transformations leading to propositional formula
Except for lambda expansion, each has polynomial complexity
[Figure: flow from file.ucl (Model + Specification) through Symbolic Simulation to UCLID Formula, Lambda Expansion to λ-free Formula, Function & Predicate Elimination to Term Formula, Finite Instantiation to Boolean Formula, then Boolean Satisfiability]
Verifying Safety Properties
[Figure: state machine with present state, next state, inputs (arbitrary), and reset; state space showing reset states, reachable states, and bad states]
State Machine Model

State encoded as Booleans, integers, and functions
Next state function expresses how updated on each step
Prove: System will never reach bad state
Bounded Model Checking
[Figure: reset states and successive reachable sets R1, R2, …, Rn, shown with the bad states]
Repeatedly Perform Image Computations

Set of all states reachable by one more state transition
Easy to Implement
Underapproximation of Reachable State Set

But, typically catch most bugs with 8–10 steps
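Repeated image computation can be sketched on explicit state sets. This is a toy illustration with a hypothetical helper bounded_reach; a term-level tool works on symbolic formulas rather than enumerated states.

```python
def bounded_reach(reset, successors, is_bad, k):
    """Explore at most k transitions from the reset states.

    Returns the number of steps at which a bad state is first reached,
    or None if no bad state is found within the bound.
    """
    reached = set(reset)
    frontier = set(reset)
    for step in range(k + 1):
        if any(is_bad(s) for s in frontier):
            return step
        # Image computation: states reachable by one more transition.
        frontier = {t for s in frontier for t in successors(s)} - reached
        reached |= frontier
    return None


# Unbounded counter starting at 0; state 5 is bad.
succ = lambda s: {s + 1}
bad = lambda s: s == 5

shallow = bounded_reach({0}, succ, bad, k=3)   # bound too small to see the bug
deep = bounded_reach({0}, succ, bad, k=8)      # bug found at step 5
```

A result of None only says that no bad state lies within k steps, which is why the reached set is an underapproximation.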
True Model Checking

[Figure: reset states and reachable sets R1, R2, …, Rn, shown with the bad states]
Reach Fixed-Point

Rn = Rn+1 = Reachable
Impractical for Term-Level Models

Many systems never reach fixed point
 Can keep adding elements to buffer

Convergence test undecidable
Inductive Invariant Checking

[Figure: invariant I containing the reset states and reachable states, shown with the bad states]
Key Properties of System that Make it Operate Correctly

Formulate as formula I
Prove Inductive

Holds initially: I(s0)

Preserved by all state changes: I(s) ⇒ I(δ(i, s))
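The two proof obligations can be checked exhaustively on a toy system. The system below (a 4-bit counter that adds 0 or 2 on each step) and the helper is_inductive are assumptions made for illustration only.

```python
from itertools import product

STATES = range(16)   # all candidate states, reachable or not
INPUTS = (0, 1)      # arbitrary input on each step


def delta(i, s):
    """Next-state function: add 0 or 2, modulo 16."""
    return (s + 2 * i) % 16


def is_inductive(I, s0):
    """Check I(s0) and I(s) => I(delta(i, s)) for every state and input."""
    holds_initially = I(s0)
    preserved = all(
        I(delta(i, s)) for i, s in product(INPUTS, STATES) if I(s)
    )
    return holds_initially and preserved


even = lambda s: s % 2 == 0      # inductive
not_15 = lambda s: s != 15       # true of all reachable states, not inductive

even_ok = is_inductive(even, s0=0)
not_15_ok = is_inductive(not_15, s0=0)
```

The second candidate fails because the unreachable state 13 satisfies it but steps to 15; induction quantifies over all states satisfying I, so an invariant often has to be strengthened before it goes through.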
An Out-of-order Processor (OOO)
[Figure: out-of-order processor with PC (incr), program memory, decode, dispatch, register rename unit (valid, tag, val), reorder buffer (head, tail), execute (ALU), result bus, and retire]
Reorder Buffer Fields: valid, value, src1valid, src1val, src1tag (1st operand), src2valid, src2val, src2tag (2nd operand), dest, op, result
Data Dependencies Resolved by Register Renaming

Map register ID to instruction in reorder buffer that will generate register value
In-order Retirement Managed by Retirement Buffer

FIFO buffer keeping pending instructions in program order
OOO Invariants
Split into Formulas I1, …, In
 Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
 I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n

Invariants for OOO (13)

Refinement maps (2)
 Show relation between ISA and OOO models

State consistency (8)
 Properties of OOO state that ensure proper operation

Added state (3)
 Shadow values correctly predict OOO values
Overall Correctness

Follows by induction on time
State Consistency Invariant Examples
Register Renaming invariants (2)

Tag in a rename-unit should be in the ROB, and the destination register should match
 ∀r. ¬reg.valid(r) ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(reg.tag(r)) = r)

For any entry, the destination should have reg.valid as false and tag should contain this or later instruction
 ∀t ∈ rob. (¬reg.valid(rob.dest(t)) ∧ t ≤ reg.tag(rob.dest(t)) < rob.tail)
Extending the OOO Processor

base
 Executes ALU instructions only

exc
 Handles arithmetic exceptions
 Must flush reorder buffer

exc/br
 Handles branches
 Predicts branch & speculatively executes along path

exc/br/mem-simp
 Adds load & store instructions
 Store commits as instruction retires

exc/br/mem
 Stores held in buffer
 Can commit later
 Loads must scan buffer for matching addresses
Comparative Verification Effort
                       base     exc      exc / br   exc / br / mem-simp   exc / br / mem
Total Invariants       13       34       39         67                    71
Manually instantiate   0        0        0          4                     8
UCLID time             54 s     236 s    403 s      1594 s                2200 s
Person time            2 days   7 days   9 days     24 days               34 days
(Person time shown cumulatively)
“I Just Want a Loaf of Bread”
Ingredients
Recipe
Result
Cooking with Invariants
Ingredients: Predicates
 rob.head ≤ reg.tag(r)
 reg.valid(r)
 reg.tag(r) = t
 rob.dest(t) = r
Recipe: Invariants
 ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)
Result: Correctness
Automatic Recipe Generation
Ingredients
Recipe Creator
Result
Want Something More


Given any set of ingredients
Generate best recipe possible
Automatic Predicate Abstraction

Graf & Saïdi, CAV ‘97
Idea

Given set of predicates P1(s), …, Pk(s)
 Boolean formulas describing properties of system state


View as abstraction mapping: States → {0,1}^k
Defines abstract FSM over state set {0,1}^k
 Form of abstract interpretation
 Do reachability analysis similar to symbolic model checking
Implementation

Early ones had weak inference capabilities
 Call theorem prover or decision procedure to test each potential transition

Recent ones make better use of symbolic encodings
Abstract State Space
[Figure: the abstraction function maps concrete states s to abstract states t according to P1(s), …, Pk(s); the concretization function maps an abstract state t back to the concrete states s it represents]
Abstract State Machine
[Figure: an abstract transition from t to t′ is obtained by concretizing t to concrete states s, taking a concrete transition from s to s′, and abstracting s′ to t′]
Transitions in abstract system mirror those in concrete
Generating Concrete Invariant
[Figure: abstract system A with reachable sets R1, R2, …, Rn computed from the reset states; concretizing the fixed point gives invariant I in concrete system C]
Reach Fixed-Point on Abstract System

Termination guaranteed, since finite state
Equivalent to Computing Invariant for Concrete System

Strongest possible invariant that can be expressed by formula over these predicates
Predicate Abstraction Example
State Space

State variables: { x, y }
Initial State

{ (2, 1) }
Next State Behavior

x ← −x
y ← −y
Verification Task

Prove all bad states unreachable
[Figure: x-y plane showing the initial state and the bad states]
Precise Analysis
Reachable States

{ (2, 1), (−2, −1) }
[Figure: x-y plane showing the reachable states and the bad states]
Predicates
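cx:3 (compare x with 3), cx:y (compare x with y), cy:0 (compare y with 0)
Each takes value L (less), E (equal), or G (greater)
[Figure: x-y plane divided into regions labeled by predicate values]
Use 3-valued predicates in this example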
Abstract Initial State
cx:3 = L, cx:y = G, cy:0 = G
Reached Set #0
{ LGG }
Step 1: Concretize Reached Set #0
Reached Set #0
{ LGG }
[Figure: concretization of LGG (cx:3 = L, cx:y = G, cy:0 = G) as a region of concrete states s]
(Note loss of precision)
Compute Possible Successor States
x ← −x
y ← −y
[Figure: concretized states s mapped by the concrete transition to successor states s′]
Abstract Newly Reached States
cx:3 = L, cx:y = L, cy:0 = L
[Figure: successor states s′ of the concretized set abstract to LLL]
Reached Set #1
{ LLL, LGG }
Step 2: Concretize Reached Set #1
Reached Set #1
{ LLL, LGG }
[Figure: concretization of LLL (cx:3 = L, cx:y = L, cy:0 = L) as a region of concrete states s]
(Note loss of precision)
Compute Possible Successor States
x ← −x
y ← −y
[Figure: concretized states s mapped by the concrete transition to successor states s′]
Abstract Newly Reached States
cx:3 = E or G, cx:y = G, cy:0 = G
[Figure: successor states s′ abstract to the new abstract states EGG and GGG]
Reached Set #2
{ LLL, LGG, EGG, GGG }
Final Reached State Set
[Figure: regions LLL, LGG, EGG, and GGG of the x-y plane, shown with the bad states]
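The iteration in this example can be replayed by brute force. The sketch below assumes the next-state behavior negates both variables, which is consistent with the reached sets listed above, and it restricts x and y to a small integer range so that concretization can be done by enumeration. The names cmp3, alpha, step, and abstract_reach are hypothetical; a real tool would use a decision procedure instead of enumeration.

```python
from itertools import product


def cmp3(a, b):
    """3-valued comparison: L (less), E (equal), or G (greater)."""
    return "L" if a < b else ("E" if a == b else "G")


def alpha(s):
    """Abstraction: predicate values cx:3, cx:y, cy:0 for state (x, y)."""
    x, y = s
    return cmp3(x, 3) + cmp3(x, y) + cmp3(y, 0)


def step(s):
    """Assumed concrete transition: negate both variables."""
    x, y = s
    return (-x, -y)


def abstract_reach(init, domain):
    """Concretize, step, and abstract until the reached set stops growing."""
    reached = {alpha(s) for s in init}
    while True:
        concrete = [s for s in domain if alpha(s) in reached]
        new = reached | {alpha(step(s)) for s in concrete}
        if new == reached:
            return reached
        reached = new


BOUND = 8  # assumed finite range, symmetric so that step stays inside it
domain = list(product(range(-BOUND, BOUND + 1), repeat=2))
final = abstract_reach([(2, 1)], domain)
```

Under these assumptions the fixed point is { LLL, LGG, EGG, GGG }, matching the final reached state set, even though only two concrete states are actually reachable.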
Systems Verified with Predicate Abstraction
Model                                   Predicates   Iterations   CPU Time
Out-Of-Order Execution Unit             25           9            1,207s
German’s Cache Protocol                 13           9            14s
German’s Protocol, unbounded channels   24           17           427s
Bounded Retransmission Buffer           22           9            11s
Lamport’s Bakery Algorithm              33           18           471s

Very general models
 Unbounded processes, buffers, cache lines, …

Safety properties only
Predicate Abstraction Convergences


Powerful method for generating & evaluating abstract model of system
Applicable to variety of systems with different modeling levels
Word-Level
 Hardware: UCLID (Seshia, Lahiri, Bryant, CAV ’02)
 Software: SLAM (Ball, Rajamani, SPIN ’01)
Bit-Level
 Clarke, Talupur, Wang, SAT ’03
 CBMC (Kroening, Clarke, ICCAD ’04)
Ongoing Research Areas
Decision Procedures

Expand class of logic
 Linear relations


Improved encoding techniques
Application to software & hardware verification
Predicate Abstraction

Improving efficiency
 Cost increases rapidly with number of predicates

Automatic generation of predicates
 Based on property to be verified & system model
Real-Life Application

Closing gap with actual hardware models