Formal Verification of Infinite-State Systems Using Boolean Methods Carnegie Mellon University

Formal Verification
of Infinite-State Systems
Using Boolean Methods
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by graduate students:
Sanjit Seshia, Shuvendu Lahiri
Outline
Task


Formally verify hardware and software systems
Build on success in verifying finite models
Infinite-State Models

Need logic that is suitably expressive, yet remains
reasonably tractable
Verification Techniques

Solve problems by mapping into propositional logic
 Proof engines can use powerful Boolean methods

–2–
Different levels of automation and capacity
Truly Infinite-State Systems

Systems where we want to model real-world values
(temperature, speed, ...)
 Hybrid systems
 Very difficult to verify
[Figure: air bag controller that takes speedometer and accelerometer readings and issues a "Deploy!" signal]

Systems with real-valued time constraints
 E.g., timed automata
 Somewhat easier to verify, since all clocks move at same rate
–3–
Theoretically Infinite-State Systems

Systems with unbounded buffers
 Even though one cannot actually be built
[Figure: unbounded buffer whose in-use region lies between the head and tail pointers]
Arbitrarily Large Finite-State Systems
Synchronization protocol that should work for arbitrary number of processes
 Verify for arbitrary N
[Figure: processes P1, P2, …, PN]
Circular buffer with fixed, but arbitrary capacity
 Verify for arbitrary value of Max
[Figure: circular buffer indexed 0 to Max-1, with the in-use region between head and tail]
Very Large Finite-State Systems
Abstract 32-bit words as arbitrary integers
View memories as having unbounded capacity
[Figure: pipelined processor datapath with IF/ID, ID/EX, and EX/WB stages, instruction memory, register file, and ALU]
Example: HP/Compaq Alpha 21264
Pipeline State
 Multiple caches
 Instruction queues
 Dynamically allocated registers
 Memory queue
 Many buffers between stages
Verification Tasks
 Does it implement the Alpha ISA?
Microprocessor Report, Oct. 28, 1996
Abstracting Data from Bits to Integers
[Figure: bits x0, x1, x2, …, xn-1 abstracted as a single value x]
View Data as Symbolic “Terms”
Arbitrary integers
 Verification proves correctness of design for all possible word sizes
Can store in memories & registers
Can select with multiplexors
 ITE: If-Then-Else operation
[Figure: multiplexor with control p and data inputs x and y computing ITE(p, x, y); the output is x when p is T and y when p is F]
Abstraction Via Uninterpreted Functions
[Figure: ALU block replaced by function f]
For any Block that Transforms or Evaluates Data:
Replace with generic, unspecified function
Only assumed property is functional consistency:
 a = x ∧ b = y ⇒ f(a, b) = f(x, y)
Abstraction Via Uninterpreted Functions
[Figure: the pipeline datapath with its data-transforming blocks replaced by uninterpreted functions F1, F2, and F3]
For any Block that Transforms or Evaluates Data:
Replace with generic, unspecified function
Also view instruction memory as function
EUF: Equality with Uninterpreted Functions
Decidable fragment of first-order logic

Formulas (F)               Boolean Expressions
 ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
 T1 = T2                   Equation
 P(T1, …, Tk)              Predicate application
Terms (T)                  Integer Expressions
 ITE(F, T1, T2)            If-then-else
 Fun(T1, …, Tk)            Function application
Functions (Fun)            Integer → Integer
 f                         Uninterpreted function symbol
 Read, Write               Memory operations
Predicates (P)             Integer → Boolean
 p                         Uninterpreted predicate symbol
Decision Problem
Logic of Equality with Uninterpreted Functions (EUF)
Truth Values
 Dashed lines
 Model control
 Logical connectives
 Equations
Integer Values
 Solid lines
 Model data
 Uninterpreted functions
 If-Then-Else operation
[Figure: dataflow graph of an example formula over x0, d0, e0, e1, and function f]
Task
Determine whether formula is universally valid
 True for all interpretations of variables and function symbols
Finite Model Property for EUF
[Figure: dataflow graph of the example formula, with term expressions x0, d0, f(x0), and f(d0) highlighted]
Observation
Any formula has a limited number of distinct expressions
The only property that matters is whether or not different terms are equal
Boolean Encoding of Integer Values
Expression   Possible Values   Bit Encoding
x0           {0}               0    0
d0           {0,1}             0    b10
f (x0)       {0,1,2}           b21  b20
f (d0)       {0,1,2,3}         b31  b30
For Each Expression

Either equal to or distinct from each preceding expression
Boolean Encoding


Use Boolean values to encode integers over small range
EUF formula can be translated into propositional logic
 Tautology iff original formula valid
– 14 –
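The small-domain idea can be illustrated by brute force. The following is a minimal sketch, not the encoding used by the verifier: it assumes a single unary function symbol f, and it assumes that a domain with as many values as there are distinct term expressions (here x, y, f(x), f(y)) is enough. The names `is_valid`, `consistency`, and `converse` are hypothetical.

```python
from itertools import product

def is_valid(formula, num_vars, domain_size):
    """Check a formula under every interpretation over a finite domain.

    `formula(f, *values)` receives one interpretation of the unary
    function symbol (as a lookup function) and one value per variable.
    """
    domain = range(domain_size)
    # Every table f: domain -> domain is one interpretation of f.
    for table in product(domain, repeat=domain_size):
        f = lambda v, table=table: table[v]
        for values in product(domain, repeat=num_vars):
            if not formula(f, *values):
                return False  # found a falsifying interpretation
    return True

# Functional consistency: x = y implies f(x) = f(y).
consistency = lambda f, x, y: (x != y) or (f(x) == f(y))
# The converse is not valid: f may map distinct values to the same result.
converse = lambda f, x, y: (f(x) != f(y)) or (x == y)

print(is_valid(consistency, num_vars=2, domain_size=4))  # True
print(is_valid(converse, num_vars=2, domain_size=4))     # False
```

A propositional encoding replaces this explicit enumeration with Boolean variables for the bits of each expression, so that a BDD or SAT tool performs the search.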
An Out-of-order Processor (OOO)
incr
Program
memory
PC
result bus
valid tag val
D
E
C
O
D
E
dispatch
Register
Rename Unit
retire
ALU
execute
head
tail
Reorder
Buffer
valid
value
src1valid
src1val
src1tag
src2valid
src2val
src2tag
dest
op
result
1st
Operand
2nd
Operand
Reorder Buffer
Fields
Data Dependencies Resolved by Register Renaming

Map register ID to instruction in reorder buffer that will generate
register value
Inorder Retirement Managed by Retirement Buffer

– 15 –
FIFO buffer keeping pending instructions in program order
Access Modes for Reorder Buffer
Retire
Dispatch
result bus
ALU
execute
FIFO


head
tail
Content Addressable
Insert when dispatch
Remove when retire

Directly Addressable


– 16 –
Select particular entry for
execution
Retrieve result value from
executed instruction
Broadcast result to all
entries with matching
source tag
Global

Flush all queue entries when
instruction at head causes
exception
Required Logic
Increased Expressive Power

Model queue pointers
 Increment & decrement operations
 Relative ordering

Ability to construct complex memory structures
 Not just set of fixed memory types
Don’t Go Too Far


– 17 –
Want practical decision procedures
Efficient reduction to propositional logic
EUF ⇒ CLU
Terms (T)
 ITE(F, T1, T2)            If-then-else
 Fun(T1, …, Tk)            Function application
 succ(T)                   Increment
 pred(T)                   Decrement
Formulas (F)
 ¬F, F1 ∧ F2, F1 ∨ F2      Boolean connectives
 T1 = T2                   Equation
 P(T1, …, Tk)              Predicate application
 T1 < T2                   Inequality
EUF ⇒ CLU (Cont.)
Functions (Fun)
 f                         Uninterpreted function symbol
 Read, Write               Memory operations
 λ x1, …, xk . T           Function lambda expression
Predicates (P)
 p                         Uninterpreted predicate symbol
 λ x1, …, xk . F           Predicate lambda expression
• Arguments can only be terms
• Lambdas are just mutable arrays
Modeling Memories with λ’s
Memory M Modeled as Function
 M(a): Value at location a
Initially
 Arbitrary state
 Modeled by uninterpreted function m0
Writing Transforms Memory
 M′ = Write(M, wa, wd)
 λ a . ITE(a = wa, wd, M(a))
 Future reads of address wa will get wd
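The write operation can be mimicked with closures. This is a minimal sketch: `write` mirrors the lambda expression for a memory write, and `m0` is a hypothetical stand-in for the uninterpreted initial memory (any deterministic function of the address would serve).

```python
def write(mem, wa, wd):
    """Return the memory after writing wd at address wa.

    Mirrors  lambda a . ITE(a = wa, wd, M(a)).
    """
    return lambda a: wd if a == wa else mem(a)

# Hypothetical stand-in for the uninterpreted initial memory m0.
def m0(a):
    return ("m0", a)

m1 = write(m0, 10, "x")
m2 = write(m1, 20, "y")

print(m2(10))  # x
print(m2(20))  # y
print(m2(30))  # ('m0', 30)
```

Each write wraps the previous memory rather than mutating it, which is why the logic can treat a memory as an ordinary function-valued state variable.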
Modeling Unbounded FIFO Buffer
Queue is Subrange of Infinite Sequence
Q.head = h
 Index of oldest element
Q.tail = t
 Index of insertion location
Q.val = q
 Function mapping indices to values
 q(i) valid only when h ≤ i < t
Initial State: Arbitrary Queue
Q.head = h0, Q.tail = t0
 Impose constraint that h0 ≤ t0
Q.val = q0
 Uninterpreted function
[Figure: infinite sequence with increasing indices: …, q(h–2), q(h–1) (already popped); q(h) at head, q(h+1), …, q(t–2), q(t–1); q(t) at tail, q(t+1), … (not yet inserted)]
Modeling FIFO Buffer (cont.)
next[h] := ITE(operation = POP, succ(h), h)
next[t] := ITE(operation = PUSH, succ(t), t)
next[q] := λ(i). ITE((operation = PUSH & i = t), x, q(i))
[Figure: queue contents before and after a step with op = PUSH and input = x; entry q(t) becomes x, the tail advances to next[t], and the head is unchanged]
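The three next-state equations can be exercised directly. This is a minimal sketch using Python integers for indices; the class `Queue` and the helpers `step` and `contents` are hypothetical names, and, like the equations, the sketch does not guard against popping an empty queue.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Queue:
    head: int                     # index of oldest element
    tail: int                     # index of insertion location
    val: Callable[[int], object]  # q: maps indices to values

def step(q, operation, x=None):
    """One transition, following the next[h], next[t], next[q] equations."""
    next_head = q.head + 1 if operation == "POP" else q.head
    next_tail = q.tail + 1 if operation == "PUSH" else q.tail
    old_val, t = q.val, q.tail
    next_val = lambda i: x if (operation == "PUSH" and i == t) else old_val(i)
    return Queue(next_head, next_tail, next_val)

def contents(q):
    """Valid entries are exactly those with head <= i < tail."""
    return [q.val(i) for i in range(q.head, q.tail)]

# Arbitrary initial queue with h0 <= t0 and an arbitrary value function.
q = Queue(head=5, tail=5, val=lambda i: ("q0", i))
q = step(q, "PUSH", "a")
q = step(q, "PUSH", "b")
q = step(q, "POP")
print(contents(q))  # ['b']
```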
Systems of Identical Processes
Each Process has k State Variables
Each state variable represented as array
Indexed by process Id
[Figure: arrays sv1, sv2, …, svk; the entries at index i together form the state of process i]
Modeling System of Identical Processes
On Each Step:
Select arbitrary process index p
 As if chosen by nondeterministic scheduler
Update state for selected process
next[state] := lambda(i)
  case
    i = p & state(i) = IDLE:              TRYING
    i = p & state(i) = TRYING & inuse:    TRYING
    i = p & state(i) = TRYING & !inuse:   CRITICAL
    default:                              state(i)
  esac
[Figure: state array indexed by process, with selected index p and 0/1 flag inuse; states IDLE, TRYING, CRITICAL]
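The case expression can be read as a function that builds a new state array from the old one. This is a minimal sketch covering only the cases shown; how `inuse` itself is updated is not shown on the slide and is treated here as a given input. The name `next_state` is hypothetical.

```python
IDLE, TRYING, CRITICAL = "IDLE", "TRYING", "CRITICAL"

def next_state(state, p, inuse):
    """Return the updated state array after process p takes a step."""
    def updated(i):
        if i == p and state(i) == IDLE:
            return TRYING
        if i == p and state(i) == TRYING and inuse:
            return TRYING
        if i == p and state(i) == TRYING and not inuse:
            return CRITICAL
        return state(i)  # default: entry unchanged
    return updated

state = lambda i: IDLE                        # every process starts idle
state = next_state(state, p=3, inuse=False)   # process 3: IDLE -> TRYING
state = next_state(state, p=3, inuse=True)    # process 3 stays TRYING
state = next_state(state, p=3, inuse=False)   # process 3: TRYING -> CRITICAL
print(state(3), state(7))  # CRITICAL IDLE
```

Because the array is a function of the process index, nothing in the model fixes the number of processes.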
Decision Procedure
Bryant, Lahiri, Seshia [CAV02]
CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Function-free Formula → Convert to Boolean Formula → Boolean Formula → Boolean Satisfiability
Operation
Series of transformations leading to propositional formula
Propositional formula checked with BDD or SAT tools
Finite Model Property for CLU
x  y  succ(x) > pred(y)
[Figure: number lines showing the relative positions of x, x+1, y–1, and y for sample assignments such as x = 0, y = 3 and x = 2, y = 1]
Observation
Need to encode all possible relative orderings of expressions
Each symbolic value has maximum range of increments & decrements
Can use Boolean encodings of small integer ranges
Verifying OOO
Lahiri, Seshia, & Bryant, FMCAD 2002
Goal
Show that OOO implements Instruction Set Architecture (ISA) model
For all possible execution sequences
Challenges
No bound on program length
OOO holds partially executed instructions in reorder buffer
 States of two systems match only when reorder buffer flushed
[Figure: ISA model (Reg. File, PC) alongside OOO model (Reg. File, PC, Reorder Buffer)]
Adding Shadow State
McMillan, ‘98
Arons & Pnueli, ‘99
Provides Link Between ISA & OOO Models
Additional entries in ROB
 Do not affect OOO behavior
Generated when instruction dispatched
Predict values of operands and result
 From ISA model
[Figure: ISA model (Reg. File, PC) linked to OOO model (Reg. File, PC, Reorder Buffer)]
Adding Shadow Structures
[Figure: OOO block diagram (program memory, PC, decode, register rename unit, reorder buffer with head and tail, ALU, result bus) in which each reorder buffer entry is extended with shadow fields shdw.value, shdw.src1val, and shdw.src2val]
Shadow Fields
Updated directly from the ISA model during dispatch
 shdw.src1val[rob.tail] ← Rfisa(src1)
 shdw.src2val[rob.tail] ← Rfisa(src2)
 shdw.value[rob.tail] ← ALU(Rfisa(src1), Rfisa(src2), op)
Invariant Checking
Formulas I1, …, In
 Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
 I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n

Invariants for OOO (13)

Refinement maps (2)
 Show relation between ISA and OOO models

Shadow state (3)
 Shadow values correctly predict OOO values

State consistency (8)
 Properties of OOO state that ensure proper operation
Overall Correctness

– 30 –
Follows by induction on time
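The two proof obligations can be illustrated on a toy finite system by enumeration. This is a minimal sketch, not how the verifier works (which discharges the obligations symbolically): the circular-buffer model, the capacity `MAX`, and the names `check_inductive`, `ptr_ok`, and `full_or_empty` are all hypothetical.

```python
from itertools import product

def check_inductive(invariants, states, is_initial, successors):
    """Return True if the conjunction of the invariants is inductive."""
    conj = lambda s: all(inv(s) for inv in invariants)
    # Base case: every invariant holds in every initial state.
    base = all(conj(s) for s in states if is_initial(s))
    # Induction step: from ANY state satisfying all invariants
    # (reachable or not), every successor satisfies each invariant.
    step = all(conj(t) for s in states if conj(s) for t in successors(s))
    return base and step

MAX = 4
# Toy state: (head, tail, count) of a circular buffer of capacity MAX.
states = list(product(range(MAX), range(MAX), range(MAX + 1)))
is_initial = lambda s: s == (0, 0, 0)

def successors(s):
    head, tail, count = s
    if count < MAX:                       # push
        yield (head, (tail + 1) % MAX, count + 1)
    if count > 0:                         # pop
        yield ((head + 1) % MAX, tail, count - 1)

ptr_ok = lambda s: (s[0] + s[2]) % MAX == s[1]
full_or_empty = lambda s: s[0] != s[1] or s[2] in (0, MAX)

print(check_inductive([full_or_empty], states, is_initial, successors))          # False
print(check_inductive([ptr_ok, full_or_empty], states, is_initial, successors))  # True
```

The first check fails because `full_or_empty` alone admits unreachable states whose successors violate it; adding the auxiliary invariant `ptr_ok` makes the set inductive, which is the role the state consistency invariants play for OOO.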
Refinement Maps
[Figure: OOO block diagram with reorder buffer fields and shadow fields]
Correspondence with a sequential ISA model
OOO and ISA synchronized at dispatch
For Register File Contents
 ∀r. reg.valid(r) ⇒ reg.val(r) = Rfisa(r)
For Program Counter
 PCooo = PCisa
Shadow Invariants
[Figure: OOO block diagram with reorder buffer fields and shadow fields]
1. ∀rob t. rob.valid(t) ⇒ rob.value(t) = shdw.value(t)
2. ∀rob t. rob.src1valid(t) ⇒ rob.src1val(t) = shdw.src1val(t)
3. ∀rob t. rob.src2valid(t) ⇒ rob.src2val(t) = shdw.src2val(t)
State Consistency Invariants
Tag Consistency invariants (2)
Instructions only depend on instructions preceding them in program order
Register Renaming invariants (2)
Tag in a rename-unit should be in the ROB, and the destination register should match
 ∀r. ¬reg.valid(r) ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(reg.tag(r)) = r)
For any entry, the destination should have reg.valid as false and tag should contain this or later instruction
 ∀rob t. (¬reg.valid(rob.dest(t)) ∧ t ≤ reg.tag(rob.dest(t)) < rob.tail)
Quantified Invariants and Proofs
Allowed Form
 ∀x1∀x2…∀xk Φ(x1…xk)
 Φ(x1…xk) is a CLU formula without quantifiers
 x1…xk are integer variables free in Φ(x1…xk)
Proving these invariants requires quantifiers
 |= (∀x1∀x2…∀xk Φ(x1…xk)) ⇒ ∀y1∀y2…∀ym Ψ(y1…ym)
 Prove ∀x1∀x2…∀xk [Φ(x1…xk) ∧ ¬Ψ(y1…ym)] is not satisfiable
  Undecidable
Automatic instantiation of x1…xk with concrete terms
 Sound but incomplete method
 Reduce the quantified formula to a CLU formula
  Can use the decision procedure for CLU
Proving Invariants
Proved Automatically



Quantifier instantiation was sufficient in these cases
Time spent = 54s on 1.4GHz machine
Total effort = 2 person days
Comparison

– 36 –
Previous efforts using theorem provers took weeks of effort
Extending the OOO Processor

base
 Executes ALU instructions only

exc
 Handles arithmetic exceptions
 Must flush reorder buffer

exc/br
 Handles branches
 Predicts branch & speculatively executes along path

exc/br/mem-simp
 Adds load & store instructions
 Store commits as instruction retires

exc/br/mem
 Stores held in buffer
 Can commit later
 Loads must scan buffer for matching addresses
– 37 –
Comparative Verification Effort
                      base    exc     exc/br   exc/br/mem-simp   exc/br/mem
Total Invariants      13      34      39       67                71
Manually instantiate  0       0       0        4                 8
UCLID time            54 s    236 s   403 s    1594 s            2200 s
Person time           2 days  5 days  2 days   15 days           10 days
“I Just Want a Loaf of Bread”
Ingredients
Recipe
– 39 –
Result
Cooking with Invariants
Ingredients: Predicates
 rob.head ≤ reg.tag(r)
 reg.valid(r)
 reg.tag(r) = t
 rob.dest(t) = r
Recipe: Invariants
 ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)
Result: Correctness
Automatic Recipe Generation
Ingredients
Recipe Creator
Result
Want Something More


– 41 –
Given any set of ingredients
Generate best recipe possible
Automatic Predicate Abstraction

Graf & Saïdi, CAV ‘97
Idea

Given set of predicates P1(s), …, Pk(s)
 Boolean formulas describing properties of system state


View as abstraction mapping: States → {0,1}^k
Defines abstract FSM over state set {0,1}^k
 Form of abstract interpretation
 Do reachability analysis similar to symbolic model checking
Prior Implementations
Very weak inference capabilities
 Call theorem prover or decision procedure to test each potential transition
Little support for quantified predicates
Abstract State Space
[Figure: the abstraction function, defined by predicates P1(s), …, Pk(s), maps concrete states s and t to abstract states; the concretization function maps abstract states back to sets of concrete states]
Abstract State Machine
[Figure: an abstract transition is obtained by concretizing an abstract state, taking a concrete transition from s to t, and abstracting the result]
Transitions in abstract system mirror those in concrete
Overapproximation by Abstract
Model
Abstract
System
Concrete
System


Path in abstract state space may not correspond to one in
concrete
OK when verifying safety properties
 Possible false negatives, but no false positives
– 45 –
Predicate Abstraction Example
State Space
State variables: { x, y }
Initial State
 { (2, 1) }
Next State Behavior
 x′ ← −x
 y′ ← −y
Verification Task
Prove all bad states unreachable
[Figure: x-y plane showing the initial state and the bad-state region]
Precise Analysis
Reachable States
 { (2, 1), (−2, −1) }
[Figure: x-y plane showing the reachable states and the bad-state region]
Predicates
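cx:3 (compares x with 3), cx:y (compares x with y), cy:0 (compares y with 0)
Each predicate takes value L (less), E (equal), or G (greater)
Use 3-valued predicates in this example
[Figure: x-y plane partitioned into regions by the three predicates]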
Abstract Initial State
 cx:3 = L, cx:y = G, cy:0 = G
 Reached Set #0: { LGG }
Step 1: Concretize Reached Set #0
 Reached Set #0: { LGG } (Note loss of precision)
Compute Possible Successor States
 x′ ← −x, y′ ← −y
Abstract Newly Reached States
 cx:3 = L, cx:y = L, cy:0 = L
 Reached Set #1: { LLL, LGG }
Step 2: Concretize Reached Set #1
 Reached Set #1: { LLL, LGG } (Note loss of precision)
Compute Possible Successor States
 x′ ← −x, y′ ← −y
Abstract Newly Reached States
 cx:3 = E or G, cx:y = G, cy:0 = G
 Reached Set #2: { LLL, LGG, EGG, GGG }
Final Reached State Set
 { LLL, LGG, EGG, GGG }
[Figure: final reached abstract states LLL, LGG, EGG, and GGG shown against the bad-state region]
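The reached sets above can be reproduced with a short script. This is a minimal sketch under two stated assumptions: the next-state behavior is taken to be x′ = −x, y′ = −y (consistent with the reached sets on the slides), and concretization is approximated by sampling integer states in [−BOUND, BOUND] rather than being computed symbolically. All function names are hypothetical.

```python
from itertools import product

def cmp3(a, b):
    """Three-valued comparison: L (less), E (equal), G (greater)."""
    return "L" if a < b else "E" if a == b else "G"

def alpha(state):
    """Abstraction: concrete (x, y) -> values of cx:3, cx:y, cy:0."""
    x, y = state
    return cmp3(x, 3) + cmp3(x, y) + cmp3(y, 0)

def step(state):
    """Assumed next-state behavior: negate both variables."""
    x, y = state
    return (-x, -y)

BOUND = 6  # assumption: a finite integer grid stands in for the state space
CONCRETE = list(product(range(-BOUND, BOUND + 1), repeat=2))

def gamma(abstract_states):
    """Concretization: sampled concrete states matching an abstract state."""
    return [s for s in CONCRETE if alpha(s) in abstract_states]

def reach(initial):
    """Iterate concretize / step / abstract until no new abstract state."""
    reached = {alpha(initial)}
    history = [set(reached)]
    while True:
        new = {alpha(step(s)) for s in gamma(reached)} - reached
        if not new:
            return history
        reached |= new
        history.append(set(reached))

for i, r in enumerate(reach((2, 1))):
    print(f"Reached Set #{i}: {sorted(r)}")
# Reached Set #0: ['LGG']
# Reached Set #1: ['LGG', 'LLL']
# Reached Set #2: ['EGG', 'GGG', 'LGG', 'LLL']
```

The abstract analysis reaches four abstract states even though only two concrete states are reachable, which is the overapproximation noted in the slides.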
Symbolic Formulation of Step 2
Encode each 3-valued {L, E, G} predicate with 2 Boolean variables (l, g)
 l1: x < 3    g1: x > 3
 l2: x < y    g2: x > y
 l3: y < 0    g3: y > 0
Represent state set as formula
 Reached Set #1 { LLL, LGG } has concretized state set
 (l1 ∧ ¬g1 ∧ ¬l2 ∧ g2 ∧ ¬l3 ∧ g3)      [LGG]
 ∨ (l1 ∧ ¬g1 ∧ l2 ∧ ¬g2 ∧ l3 ∧ ¬g3)    [LLL]
Next-State Predicates
Next State (x′, y′)
Get predicates l1′, l2′, l3′, g1′, g2′, g3′
Determine conditions under which predicates will hold in next state
Express in terms of current state (x, y)

Next-State Predicate   Condition   With x′ = −x, y′ = −y   Current State   Matches
l1′                    x′ < 3      −x < 3                  x > −3          none
l2′                    x′ < y′     −x < −y                 x > y           g2
l3′                    y′ < 0      −y < 0                  y > 0           g3
g1′                    x′ > 3      −x > 3                  x < −3          none
g2′                    x′ > y′     −x > −y                 x < y           l2
g3′                    y′ > 0      −y > 0                  y < 0           l3
Consistency Constraints
Eliminate impossible predicate combinations
 (g1 ⇒ ¬g1′)
 (g1′ ⇒ ¬l1′)
 (g2 ∧ g3 ⇒ l1′)
In general, may need to introduce additional variables
 To express more complex transitivity constraints
[Figure: implication graphs relating the current-state and next-state predicate variables]
Symbolic Form
Formulation
Express compatible combinations of current-state & next-state variables
Quantify out current-state variables
Gives formula over next-state variables
 ∃ l1, l2, l3, g1, g2, g3
 [ (l1 ∧ ¬g1 ∧ ¬l2 ∧ g2 ∧ ¬l3 ∧ g3) ∨ (l1 ∧ ¬g1 ∧ l2 ∧ ¬g2 ∧ l3 ∧ ¬g3) ]    (current state)
 ∧ (g1 ⇒ ¬g1′) ∧ (g1′ ⇒ ¬l1′) ∧ (g2 ∧ g3 ⇒ l1′)    (consistency constraints)
 ∧ (l2′ ⇔ g2) ∧ (g2′ ⇔ l2)
 ∧ (l3′ ⇔ g3) ∧ (g3′ ⇔ l3)
Extracting Next-State Set
Run SAT checker over formula
Generate blocking clause for each newly generated state


[
(l1  g1  l2  g2  l3  g3)
 (l1  g1  l2  g2  l3  g3) ]
 (g1  g1)  (g1  l1)  (g2  g3  l1)
 l2  g2  g2 l2
 (l1  g1  l2  g2  l3  g3)
 l3  g3  g3  l3
– 61 –
l1 g1 l2 g2 l3 g3   l1′ g1′ l2′ g2′ l3′ g3′   Next State
1  0  1  0  1  0    0   0   0   1   0   1     EGG
1  0  1  0  1  0    0   1   0   1   0   1     GGG
1  0  1  0  1  0    1   0   0   1   0   1     LGG
1  0  0  1  0  1    1   0   1   0   1   0     LLL
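The enumeration loop can be sketched with a brute-force search standing in for the SAT checker. This is a minimal sketch: the consistency constraints and predicate matches in `transition` are a reading of the garbled slide formulas and should be treated as assumptions, and `solve`, `decode`, and the primed variable names are hypothetical.

```python
from itertools import product

CUR = ["l1", "g1", "l2", "g2", "l3", "g3"]
NXT = [v + "'" for v in CUR]

def transition(a):
    """Formula over current and next-state variables (a: name -> bool)."""
    lgg = (a["l1"] and not a["g1"] and not a["l2"] and a["g2"]
           and not a["l3"] and a["g3"])
    lll = (a["l1"] and not a["g1"] and a["l2"] and not a["g2"]
           and a["l3"] and not a["g3"])
    consistent = ((not a["g1"] or not a["g1'"])                  # g1 => not g1'
                  and (not a["g1'"] or not a["l1'"])             # g1' => not l1'
                  and (not (a["g2"] and a["g3"]) or a["l1'"]))   # g2 & g3 => l1'
    matches = (a["l2'"] == a["g2"] and a["g2'"] == a["l2"]
               and a["l3'"] == a["g3"] and a["g3'"] == a["l3"])
    return (lgg or lll) and consistent and matches

def solve(blocked):
    """Stand-in for a SAT call: a model whose next state is not blocked."""
    for bits in product([False, True], repeat=12):
        a = dict(zip(CUR + NXT, bits))
        nxt = tuple(a[v] for v in NXT)
        if nxt not in blocked and transition(a):
            return nxt
    return None

def decode(nxt):
    """Turn three (l, g) pairs back into an L/E/G string."""
    names = {(True, False): "L", (False, False): "E", (False, True): "G"}
    return "".join(names[nxt[i:i + 2]] for i in (0, 2, 4))

blocked = set()
while (model := solve(blocked)) is not None:
    blocked.add(model)  # blocking clause: exclude this next state
print(sorted(decode(m) for m in blocked))  # ['EGG', 'GGG', 'LGG', 'LLL']
```

Blocking only the next-state part of each model performs the existential quantification over current-state variables implicitly: each abstract successor is reported once, however many current states lead to it.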
Quantified Invariant Generation


User supplies predicates containing free variables
Generate globally quantified invariant
Example

Predicates
p1: reg.valid(r)
p2: rob.dest(t) = r
p3: reg.tag(r) = t

Abstract state satisfying (p1  p2  p3) corresponds to
concrete state satisfying
r,t[reg.valid(r)  reg.tag(r) = t
 rob.dest(t) = r]
rather than
r[reg.valid(r)]  r,t[reg.tag(r) = t] 
r,t[rob.dest(t) = r]
– 64 –
Generating Quantified Invariants
Use Quantifier Instantiation to Approximate ∀ During Concretization
Causes even greater overapproximation
Similar technique used by Flanagan & Qadeer, POPL ‘02
Systems Verified with Predicate Abstraction
Model                                   Predicates   Iterations   CPU Time
Out-Of-Order Execution Unit             25           9            2,613s
German’s Cache Protocol                 21           9            122s
German’s Protocol, unbounded channels   30           19           15,000s
Bounded Retransmission Buffer           22           9            11s
Lamport’s Bakery Algorithm              24           24           5,211s
Very general models
 Unbounded processes, buffers, cache lines, …
Safety properties only
Other Uses of UCLID Verifier
Invariant Checking

More complex version of OOO including speculative
execution, exceptions, & buffered loads & stores
 Lahiri & Bryant, CAV 2003
Predicate Abstraction

Core algorithm used to generate weakest Boolean
precondition for software model checking
 SLAM project at Microsoft
Pipelined Processor Verification

Verify checker processor from U. Michigan
 Model extracted directly from Verilog

– 67 –
Bounded check of load-store unit from industrial
microprocessor
Conclusions
CLU is Useful Logic

Expressive enough to model wide range of systems
 Systems with unbounded resources
 Abstract away most data operations

Simple enough to be tractable
 Small domain property allows exploiting Boolean methods
Predicate Abstraction is Powerful Tool


– 68 –
Removes requirement to hand-generate invariants
Benefits similar to model checking
Further Work
Support for Proofs of Liveness

Must make argument that progress being made
Greater Automation


Automatic generation of predicates
More efficient implementation of predicate abstraction
More Powerful Logic

Linear arithmetic would be useful
 Potential blow-up when translating to Boolean formula
Apply to Other Systems


– 69 –
Software
Network protocols