Formal Verification of Infinite-State Systems Using Boolean Methods Carnegie Mellon University

Formal Verification
of Infinite-State Systems
Using Boolean Methods
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by former graduate students:
Sanjit Seshia, Shuvendu Lahiri
Outline
Task


Formally verify abstract models of hardware and software
systems
Build on success in verifying finite models
Infinite-State Models

Need logic that is suitably expressive, yet remains
reasonably tractable
Verification Techniques

Solve problems by mapping into propositional logic
 Proof engines can use powerful Boolean methods

Different levels of automation and capacity
Theoretically Infinite-State Systems

Systems with unbounded buffers
 Even though can’t really build one
[Figure: unbounded buffer with head and tail pointers delimiting the "In Use" region]
Arbitrarily Large Finite-State Systems

Synchronization protocol that should work for arbitrary number of processes
  Verify for arbitrary N
[Figure: processes P1, P2, …, PN]

Circular buffer with fixed, but arbitrary capacity
  Verify for arbitrary value of Max
[Figure: circular buffer indexed 0 to Max-1, with head and tail pointers delimiting the "In Use" region]
Existing Automatic Verification
Methods

Simulators, model checkers, …
All Operate at Bit Level

State model
 State encoded as words and arrays of words
 Comprised of bits

Must track how each bit of state gets updated
Only Verify Single Instance of Design

Fixed values for parameters
 Word size
 Buffer sizes
 Number of processes
What About Theorem Provers?
Traditional Tool for Formal Verification

Allow many forms of abstraction
Hard to Use

Lots of manual effort & expertise required
Question:

Can we incorporate some of these abstraction abilities into
an automated tool?
Data Abstraction #1: Bits → Integers
[Figure: bits x0, x1, x2, …, xn-1 abstracted to a single symbolic word x]
View Data as Symbolic Words

Arbitrary integers
  No assumptions about size or encoding
  Classic model for reasoning about software

Can store in memories & registers
Abstracting Data Bits
[Figure: control logic above a data path; the combinational logic blocks "Com. Log. 1" and "Com. Log. 2" are each marked "?"]
What do we do about logic functions?
Abstraction #2: Uninterpreted Functions
[Figure: ALU block replaced by a generic function f]
For any Block that Transforms or Evaluates Data:

Replace with generic, unspecified function
Only assumed property is functional consistency:
a = x ∧ b = y ⇒ f(a, b) = f(x, y)
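A minimal Python sketch of functional consistency, the one property assumed of an uninterpreted function. The class name `UninterpretedFunction` and the string-valued results are illustrative assumptions, not part of UCLID. The sketch fixes a single interpretation in which distinct arguments give distinct results, whereas a proof has to hold for every interpretation.

```python
import itertools


class UninterpretedFunction:
    """Hypothetical stand-in for an unspecified block such as an ALU.

    Nothing is known about what the block computes. The only guarantee is
    functional consistency: equal arguments always give equal results.
    """

    def __init__(self, name):
        self.name = name
        self._results = {}               # argument tuple -> symbolic result
        self._fresh = itertools.count()  # source of fresh result names

    def __call__(self, *args):
        # The first use of an argument tuple allocates a fresh symbolic
        # value; later uses of the same tuple return that same value.
        if args not in self._results:
            self._results[args] = f"{self.name}_{next(self._fresh)}"
        return self._results[args]


f = UninterpretedFunction("f")
a, b, x, y = 3, 5, 3, 5
same = f(a, b) == f(x, y)        # arguments match, so results must match
different = f(a, b) == f(b, a)   # no guarantee for non-matching arguments
```

Here `same` is guaranteed by consistency, while the value of `different` is only a feature of this one interpretation.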
Abstracting Functions
[Figure: control logic above a data path whose combinational logic blocks are replaced by uninterpreted functions F1 and F2]
For Any Block that Transforms Data:

Replace by uninterpreted function
Ignore detailed functionality
Conservative approximation of actual system
Modeling Data-Dependent Control
[Figure: branch logic computing Cond ("Branch?") from Adata and Bdata, replaced by uninterpreted predicate p]
Model by Uninterpreted Predicate

Yields arbitrary Boolean value for each control + data combination
Produces same result when arguments match
Abstraction #3: Modeling Memories as Mutable Functions
Memory M Modeled as Function

M(a): Value at location a
[Figure: memory M read at address a; initially M is m0]
Initially

Arbitrary state
Modeled by uninterpreted function m0
Effect of Memory Write Operation
Writing Transforms Memory

M′ = Write(M, wa, wd)
[Figure: multiplexor that outputs wd when a = wa (select 1) and M(a) otherwise (select 0)]
Reading from updated memory M′(a):
  Address wa will get wd
  Otherwise get what’s already in M
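The write operation can be sketched in Python as a higher-order function: a memory is a function from addresses to values, and writing returns a new function. The names `write` and `m0` follow the slide, but the tuple-valued initial contents are an assumption made so that the example can run.

```python
def write(mem, wa, wd):
    """Return the memory that results from writing data wd at address wa."""
    def updated(a):
        # The multiplexor: select wd when a = wa, otherwise the old M(a).
        return wd if a == wa else mem(a)
    return updated


def m0(a):
    """Assumed initial contents: an arbitrary but fixed value per address."""
    return ("m0", a)


m1 = write(m0, 4, "x")   # memory after the first write
m2 = write(m1, 8, "y")   # memory after the second write
```

Reading `m2(4)` walks back through both writes and finds `"x"`. An address that was never written falls through to `m0`.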
Systems with Buffers
[Figure: unbounded buffer and circular queue (indexed 0 to Max-1), each with head and tail pointers delimiting the "In Use" region]
Modeling Method

Mutable function to describe buffer contents
Integers to represent head & tail pointers
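A rough Python sketch of this modeling method for the unbounded buffer: the contents are a function from index to value, and head and tail are plain integers. The function names and the convention that head = tail means "empty" are assumptions made for illustration.

```python
def empty_queue(initial_contents):
    """State is (contents, head, tail); contents maps an index to a value."""
    return (initial_contents, 0, 0)


def enqueue(state, value):
    contents, head, tail = state

    def new_contents(i):
        # Same shape as a memory write: override only the tail slot.
        return value if i == tail else contents(i)

    return (new_contents, head, tail + 1)   # tail advances by succ


def dequeue(state):
    contents, head, tail = state
    if head == tail:
        raise IndexError("queue is empty")
    return contents(head), (contents, head + 1, tail)   # head advances


q = empty_queue(lambda i: None)
q = enqueue(q, "a")
q = enqueue(q, "b")
first, q = dequeue(q)
second, q = dequeue(q)
```

Only increment and comparison are applied to the pointers, so nothing here depends on how large the buffer may grow.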
UCLID

Seshia, Lahiri, Bryant, CAV ‘02
Term-Level Verification System

Language for describing systems
 Inspired by CMU SMV

Symbolic simulator
 Generates integer expressions describing system state after
sequence of steps

Decision procedure
 Determines validity of formulas

Support for multiple verification techniques
Available by Download
http://www.cs.cmu.edu/~uclid
System Model
[Figure: state machine with Reset and (arbitrary) Inputs mapping Present State to Next State]
State Variable Types

Boolean
  Control signals

Integer
  Data, addresses

Function
  Memories, buffers
System Operation

Synchronous
 All state variables updated on each step of operation

Interleaving
 One (set of) state variable(s) updated at a time
 Simulate in synchronous model with uninterpreted scheduling
function
Modeling Example
DLX Pipeline

Single-issue, 5-stage pipeline
[Figure: pipeline stages Fetch, Decode, Execute, Memory, and Write Back, separated by pipeline registers pc, fd, de, em, and mw, with register file RF and memory Mem. Register fields shown include Instr, PC, pPC, Type, Arg1, Arg2, Target, Value, Data, Dest, Branch, and Valid. A legend distinguishes Boolean state, Integer state, and Function state.]
Writing & Reading Register File
[Figure: Decode reads RF at src1 and src2 of the fd Instr field to produce de Arg1 and de Arg2; Write Back writes mw Data to RF at mw Dest when mw Valid]
Writing Register File
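init[RF] := rf0; (* Uninterpreted Function *)
next[RF] := Lambda(a) .
  case
    mw_Valid & (a = mw_Dest) : mw_Data; (* Address being written gets new data *)
    default : RF(a);                    (* All other addresses keep old contents *)
  esac;
[Figure: Write Back stage writing the mw fields Data, Dest, and Valid into RF]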
Reading Register File
init[de_Arg1] := dea10; (* Initially arbitrary *)
next[de_Arg1] := next[RF](src1(fd_Instr));
init[de_Arg2] := dea20; (* Initially arbitrary *)
next[de_Arg2] := next[RF](src2(fd_Instr));

Write-before-read semantics
[Figure: Decode stage reading RF at src1 and src2 of fd Instr into de Arg1 and de Arg2]
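The write-before-read behavior can be illustrated with a small Python analogue of the register-file model. This is a sketch, not UCLID code: `next_rf` mirrors the `next[RF]` lambda, and the concrete register numbers and data are made up for the example.

```python
def next_rf(rf, mw_valid, mw_dest, mw_data):
    """Analogue of next[RF]: override rf at mw_dest when mw_valid holds."""
    def updated(a):
        if mw_valid and a == mw_dest:
            return mw_data
        return rf(a)
    return updated


def rf0(a):
    """Assumed arbitrary initial register contents."""
    return ("rf0", a)


# Write Back writes 99 to register 2 in the same step in which Decode
# reads register 2.
rf_next = next_rf(rf0, mw_valid=True, mw_dest=2, mw_data=99)
read_new = rf_next(2)   # read through next[RF]: sees the value being written
read_old = rf0(2)       # a read of the present RF would see the stale value
```

Reading through `next[RF]` rather than `RF` is what gives Decode the value that Write Back produces in the same cycle.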
Underlying Logic
Scalar Data Types

Formulas (F): Boolean Expressions
  Control signals

Terms (T): Integer Expressions
  Data values
Functional Data Types

Functions (Fun): Integer → Integer
  Immutable: Functional units
  Mutable: Memories

Predicates (P): Integer → Boolean
  Immutable: Data-dependent control
  Mutable: Bit-level memories
CLU Logic

Counter Arithmetic, Lambda Expressions and Uninterpreted Functions

Terms (T): Integer Expressions
  ITE(F, T1, T2)          If-then-else
  Fun(T1, …, Tk)          Function application
  succ(T)                 Increment
  pred(T)                 Decrement

Formulas (F): Boolean Expressions
  ¬F, F1 ∧ F2, F1 ∨ F2    Boolean connectives
  T1 = T2                 Equation
  T1 < T2                 Inequality
  P(T1, …, Tk)            Predicate application

succ, pred, and <: to support pointer operations
CLU Logic (Cont.)
Functions (Fun): Integer → Integer
  f                       Uninterpreted function symbol
  λ x1, …, xk . T         Function definition

Predicates (P): Integer → Boolean
  p                       Uninterpreted predicate symbol
  λ x1, …, xk . F         Predicate definition
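To make the grammar concrete, here is a small Python evaluator for CLU terms and formulas under one given interpretation. The nested-tuple encoding and the operator tags (`"var"`, `"app"`, `"papp"`, and so on) are assumptions chosen for this sketch. Function and predicate symbols are looked up in the environment as Python callables, which also covers lambda definitions.

```python
def eval_term(t, env):
    """Evaluate a CLU term (nested tuple) to an integer."""
    op = t[0]
    if op == "var":
        return env[t[1]]
    if op == "ite":                       # ITE(F, T1, T2)
        branch = t[2] if eval_formula(t[1], env) else t[3]
        return eval_term(branch, env)
    if op == "succ":
        return eval_term(t[1], env) + 1
    if op == "pred":
        return eval_term(t[1], env) - 1
    if op == "app":                       # Fun(T1, ..., Tk)
        return env[t[1]](*(eval_term(a, env) for a in t[2:]))
    raise ValueError(f"unknown term operator: {op}")


def eval_formula(f, env):
    """Evaluate a CLU formula (nested tuple) to a Boolean."""
    op = f[0]
    if op == "bvar":
        return env[f[1]]
    if op == "not":
        return not eval_formula(f[1], env)
    if op == "and":
        return eval_formula(f[1], env) and eval_formula(f[2], env)
    if op == "or":
        return eval_formula(f[1], env) or eval_formula(f[2], env)
    if op == "eq":
        return eval_term(f[1], env) == eval_term(f[2], env)
    if op == "lt":
        return eval_term(f[1], env) < eval_term(f[2], env)
    if op == "papp":                      # P(T1, ..., Tk)
        return env[f[1]](*(eval_term(a, env) for a in f[2:]))
    raise ValueError(f"unknown formula operator: {op}")


env = {"x": 1, "y": 2, "f": lambda v: v * 10}
roundtrip = eval_formula(("eq", ("pred", ("succ", ("var", "x"))), ("var", "x")), env)
chosen = eval_term(("ite", ("lt", ("var", "x"), ("var", "y")),
                    ("app", "f", ("var", "x")), ("var", "y")), env)
```

A decision procedure asks whether a formula holds under every such environment. This evaluator checks only one.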
Decision Problem
Circuit Representation of Formula
[Figure: circuit over inputs e1, e0, x0, and d0, built from applications of uninterpreted function f, if-then-else (T/F) selectors, equality comparators, and logic gates, producing the formula output F]

Truth Values
  Dashed lines
  Model control
  Logical connectives
  Equations

Integer Values
  Solid lines
  Model data
  Uninterpreted functions
  If-then-else operation

Task

Determine whether formula F is universally valid
  True for all interpretations of variables and function symbols
  Often expressed as (un)satisfiability problem
    » Prove that formula ¬F is not satisfiable

Finite Model Property
[Figure: the same circuit, with the distinct integer-valued expressions x0, d0, f(x0), and f(d0) called out]
Observation

Any formula has limited number of distinct expressions
Only property that matters is whether or not different terms are equal
Boolean Encoding of Integer Values
Expression   Possible Values   Bit Encoding
x0           {0}               0    0
d0           {0,1}             0    b10
f(x0)        {0,1,2}           b21  b20
f(d0)        {0,1,2,3}         b31  b30

For Each Expression

Either equal to or distinct from each preceding expression
Boolean Encoding

Use Boolean values to encode integers over small range
CLU formula can be translated into propositional logic
  Logic circuit with multiplexors, comparators, logic gates
  Tautology iff original formula valid
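The small-range idea can be tried out by brute force in Python. This sketch enumerates the value ranges from the table for x0, d0, f(x0), and f(d0), and discards assignments that break functional consistency. The two sample formulas are illustrative assumptions, and the sketch covers equalities only (no succ, pred, or <).

```python
from itertools import product


def is_valid(formula):
    """Check a formula over x0, d0, f(x0), f(d0) using small value ranges."""
    ranges = (range(1), range(2), range(3), range(4))
    for x0, d0, fx0, fd0 in product(*ranges):
        # Functional consistency: equal arguments force equal results.
        if x0 == d0 and fx0 != fd0:
            continue
        if not formula(x0, d0, fx0, fd0):
            return False      # found a falsifying interpretation
    return True


def implies(p, q):
    return (not p) or q


# x0 = d0  =>  f(x0) = f(d0): holds in every interpretation.
consistency = is_valid(lambda x0, d0, fx0, fd0: implies(x0 == d0, fx0 == fd0))
# f(x0) = f(d0)  =>  x0 = d0: fails when f maps distinct inputs together.
injectivity = is_valid(lambda x0, d0, fx0, fd0: implies(fx0 == fd0, x0 == d0))
```

A real translation would encode these ranges with Boolean variables and hand the resulting formula to a SAT solver instead of enumerating.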
Recent Progress in SAT Solving
[Figure: bar chart of run-time (sec.), on an axis from 0 to 3,000 and beyond, for SAT solvers dated 2000 through 2005 (Grasp, zChaff, BerkMin, Siege, SatELiteGTI); bar labels read 3600, 766, 147, 118, 81, and 46]
Verifying Safety Properties
[Figure: state machine with Reset and (arbitrary) Inputs mapping Present State to Next State; the reachable states grow out of the reset states and must not meet the bad states]
Prove: System will never reach bad state
Bounded Model Checking
[Figure: sets R1, R2, …, Rn growing outward from the reset states within the reachable states, alongside the bad states]
Repeatedly Perform Image Computations

Set of all states reachable by one more state transition
Easy to Implement
Underapproximation of Reachable State Set

But, typically catch most bugs with 8–10 steps
Implementing BMC
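[Figure: the reset state is passed through n copies of the next-state logic with inputs X1, X2, …, Xn; the resulting state S is tested against Bad, and the question is whether the result is satisfiable]

Construct verification condition formula for step n by symbolically simulating system for n cycles
Check with decision procedure
Do as many cycles as tractable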
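The loop structure of bounded model checking can be sketched in Python with explicit sets of states. This is a stand-in for illustration only: a term-level tool builds a formula by symbolic simulation and calls a decision procedure, whereas this sketch enumerates states. The counter system and all function names are assumptions.

```python
def bmc(reset_states, inputs, step, is_bad, max_steps):
    """Return the first step count at which a bad state appears, else None."""
    frontier = set(reset_states)
    for n in range(max_steps + 1):
        if any(is_bad(s) for s in frontier):
            return n                      # counterexample of length n
        # Image computation: every state reachable by one more transition.
        frontier = {step(i, s) for s in frontier for i in inputs}
    return None                           # no bug found within the bound


def step(i, count):
    """Toy system: a counter that adds the input (0 or 1) on each step."""
    return count + i


found_at = bmc({0}, [0, 1], step, lambda s: s == 3, max_steps=5)
too_shallow = bmc({0}, [0, 1], step, lambda s: s == 3, max_steps=2)
```

A result of `None` says only that no bad state is reachable within the bound, which is why the method underapproximates the reachable set.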
True Model Checking

[Figure: sets R1, R2, …, Rn growing outward from the reset states, alongside the bad states]
Reach Fixed-Point

Rn = Rn+1 = Reachable
Impractical for Term-Level Models

Many systems never reach fixed point
  Can keep adding elements to buffer
Convergence test undecidable
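A short Python sketch contrasts a system that reaches a fixed point with one that does not. Both toy counters and the `max_iterations` cutoff are assumptions introduced so that the second run terminates.

```python
def reachable(reset_states, inputs, step, max_iterations):
    """Grow the reached set until it stops changing.

    Returns (states, converged). The cutoff max_iterations is needed
    because some systems never converge.
    """
    reached = set(reset_states)
    for _ in range(max_iterations):
        image = {step(i, s) for s in reached for i in inputs}
        if image <= reached:              # R(n+1) = R(n): fixed point
            return reached, True
        reached |= image
    return reached, False


def bounded(i, count):
    """Buffer occupancy with capacity 3."""
    return min(count + 1, 3) if i == "push" else max(count - 1, 0)


def unbounded(i, count):
    """Buffer occupancy with no capacity limit."""
    return count + 1 if i == "push" else max(count - 1, 0)


finite_states, finite_done = reachable({0}, ["push", "pop"], bounded, 100)
_, infinite_done = reachable({0}, ["push", "pop"], unbounded, 50)
```

The unbounded counter gains one new state on every iteration, so no finite number of iterations suffices.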
Inductive Invariant Checking

[Figure: the set I contains the reset states and the reachable states and excludes the bad states]
Key Properties of System that Make it Operate Correctly

Formulate as formula I
Prove Inductive

Holds initially: I(s0)
Preserved by all state changes: I(s) ⇒ I(δ(i, s))
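The two proof obligations can be checked by enumeration on a toy system. In this Python sketch the buffer-occupancy counter, the candidate state range, and the two sample properties are all assumptions. A term-level tool would discharge the same obligations with a decision procedure instead of enumeration.

```python
MAX = 4


def step(i, count):
    """Toy next-state function: occupancy of a buffer with capacity MAX."""
    if i == "push" and count < MAX:
        return count + 1
    if i == "pop" and count > 0:
        return count - 1
    return count


def is_inductive(inv, reset_states, inputs, step, candidates):
    """Check the base case and the preservation step for inv."""
    holds_initially = all(inv(s0) for s0 in reset_states)
    # Preservation must hold from every state satisfying inv, including
    # states that are never actually reached.
    preserved = all(inv(step(i, s))
                    for s in candidates if inv(s)
                    for i in inputs)
    return holds_initially and preserved


candidates = range(-2, MAX + 3)
inputs = ["push", "pop"]
in_range = is_inductive(lambda c: 0 <= c <= MAX, [0], inputs, step, candidates)
not_minus_one = is_inductive(lambda c: c != -1, [0], inputs, step, candidates)
```

The property `c != -1` is true of every reachable state yet is not inductive, because the unreachable state -2 steps to -1. Strengthening it to `0 <= c <= MAX` makes the induction go through.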
An Out-of-order Processor (OOO)
[Figure: out-of-order processor with PC and incr, program memory, decode, register rename unit (valid, tag, val), reorder buffer with head and tail pointers, ALU, and result bus; instructions are dispatched, executed, and retired. Reorder buffer fields: valid, value, dest, op, result; 1st operand: src1valid, src1val, src1tag; 2nd operand: src2valid, src2val, src2tag]
Data Dependencies Resolved by Register Renaming

Map register ID to instruction in reorder buffer that will generate register value
Inorder Retirement Managed by Retirement Buffer

FIFO buffer keeping pending instructions in program order
Verifying OOO

Lahiri, Seshia, & Bryant, FMCAD 2002
Goal

Show that OOO implements Instruction Set Architecture (ISA) model
For all possible execution sequences
Challenge

OOO holds partially executed instructions in reorder buffer
  States of two systems match only when reorder buffer flushed
[Figure: ISA model (Reg. File, PC) beside OOO model (Reg. File, PC, Reorder Buffer)]
Adding Shadow State


McMillan, ‘98
Arons & Pnueli, ‘99
Provides Link Between ISA & OOO Models

Additional info. in ROB
  Does not affect OOO behavior

Generated when instruction dispatched
Predict values of operands and result
  From ISA model
[Figure: ISA model (Reg. File, PC) beside OOO model (Reg. File, PC, Reorder Buffer)]
Invariant Checking
Formulas I1, …, In
  Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
  I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n

Invariants for OOO (13)

Refinement maps (2)
  Show relation between ISA and OOO models

Shadow state (3)
  Shadow values correctly predict OOO values

State consistency (8)
  Properties of OOO state that ensure proper operation
Overall Correctness

Follows by induction on time
State Consistency Invariant Examples
Register Renaming invariants (2)

Any mapped register should be in the ROB, and the destination register should match
  ∀r. ¬reg.valid(r) ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(reg.tag(r)) = r)

For any ROB entry, the destination should have reg.valid as false, and its tag should point to this or a later instruction
  ∀ ROB entries t. (¬reg.valid(rob.dest(t)) ∧ t ≤ reg.tag(rob.dest(t)) < rob.tail)
Extending the OOO Processor

base
 Executes ALU instructions only

exc
 Handles arithmetic exceptions
 Must flush reorder buffer

exc/br
 Handles branches
 Predicts branch & speculatively executes along path

exc/br/mem-simp
 Adds load & store instructions
 Store commits as instruction retires

exc/br/mem
 Stores held in buffer
 Can commit later
 Loads must scan buffer for matching addresses
Comparative Verification Effort
                   base     exc      exc/br   exc/br/mem-simp   exc/br/mem
Total invariants   13       34       39       67                71
UCLID time         54 s     236 s    403 s    1594 s            2200 s
Person time        2 days   7 days   9 days   24 days           34 days
(Person time shown cumulatively)
“I Just Want a Loaf of Bread”
Ingredients
Recipe
Result
Cooking with Invariants
Ingredients: Predicates
  rob.head ≤ reg.tag(r)
  reg.valid(r)
  reg.tag(r) = t
  rob.dest(t) = r
Recipe: Invariants
  ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)
Result: Correctness
Automatic Recipe Generation
Ingredients
Recipe Creator
Result
Want Something More


Given any set of ingredients
Generate best recipe possible
Automatic Predicate Abstraction

Graf & Saïdi, CAV ‘97
Idea

Given set of predicates P1(s), …, Pk(s)
  Boolean formulas describing properties of system state

View as abstraction mapping: States → {0,1}^k
Defines abstract FSM over state set {0,1}^k
  Form of abstract interpretation
  Do reachability analysis similar to symbolic model checking
Implementation

Early ones had weak inference capabilities
  Call theorem prover or decision procedure to test each potential transition

Recent ones make better use of symbolic encodings
Abstract State Space
[Figure: the abstraction function α maps each concrete state s to an abstract state t, the vector of values P1(s), …, Pk(s); the concretization function γ maps an abstract state t back to the set of concrete states it represents]
Abstract State Machine
[Figure: an abstract transition from t to t′ corresponds to concretizing t (γ), taking a concrete transition from s to s′, and abstracting s′ (α)]
Transitions in abstract system mirror those in concrete
Generating Concrete Invariant
[Figure: in abstract system A, sets R1, R2, …, Rn grow from the reset states to a fixed point; concretizing (γ) that fixed point gives invariant I, which contains the reset states of concrete system C]
Reach Fixed-Point on Abstract System

Termination guaranteed, since finite state
Equivalent to Computing Invariant for Concrete System

Strongest possible invariant that can be expressed by formula over these predicates
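The whole pipeline (abstract, reach a fixed point, concretize) fits in a few lines of Python for a toy system. The concrete system here is a counter stepping by 2 modulo 100, and the two predicates are assumptions picked for the example. Enumerating a finite set of concrete states stands in for the decision procedure a real tool would use to compute abstract transitions.

```python
def abstract_reachability(universe, reset_states, step, predicates):
    """Return the reachable abstract states and the concretized invariant."""
    def alpha(s):
        # Abstraction: a concrete state maps to its vector of predicate values.
        return tuple(bool(p(s)) for p in predicates)

    # Abstract transition a -> b exists when some concrete s has
    # alpha(s) = a and alpha(step(s)) = b.
    transitions = {(alpha(s), alpha(step(s))) for s in universe}

    reached = {alpha(s) for s in reset_states}
    while True:
        new = {b for (a, b) in transitions if a in reached} - reached
        if not new:                       # fixed point: finite abstract space
            break
        reached |= new

    def invariant(s):
        # Concretization: s is allowed iff its abstract image was reached.
        return alpha(s) in reached

    return reached, invariant


universe = range(100)
step = lambda x: (x + 2) % 100
predicates = [lambda x: x % 2 == 0, lambda x: x < 50]
reached, invariant = abstract_reachability(universe, [0], step, predicates)
```

The result is the invariant "x is even". The predicate x < 50 drops out because both of its values are reachable.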
Quantified Invariant Generation
(Lahiri & Bryant, VMCAI 2004)
 User supplies predicates containing free variables
 Generate globally quantified invariant
Example

Predicates
p1: reg.valid(r)
p2: rob.dest(t) = r
p3: reg.tag(r) = t

Abstract state satisfying (p1  p2  p3) corresponds to
concrete state satisfying
r,t[reg.valid(r)  reg.tag(r) = t
 rob.dest(t) = r]
rather than
r[reg.valid(r)]  r,t[reg.tag(r) = t] 
r,t[rob.dest(t) = r]
Systems Verified with Predicate
Abstraction
Model                                   Predicates   Iterations   CPU Time
Out-Of-Order Execution Unit             25           9            1,207s
German’s Cache Protocol                 13           9            14s
German’s Protocol, unbounded channels   24           17           427s
Lamport’s Bakery Algorithm              33           18           471s

Safety properties only
Future Prospects
Evaluation

Demonstrated ability to verify complex, parameterized
systems
Predicate Abstraction Shows Promise

Provides key automation advantage of model checking
Successful Application to Program Verification


Qadeer & Lahiri, POPL ’06
Generate loop invariants for list manipulation programs
Automatic Predicate Discovery
Strength of Predicate Abstraction

If given the right set of predicates, PA will put them together into an invariant
Weakness


Gets nowhere without right set of predicates
Typical failure mode: Generate “true” as invariant
Challenges


Too many predicates will overwhelm PA engine
Our use of quantified invariants precludes counterexample-generated refinement techniques
Implementation of Predicate
Discovery
Lahiri & Bryant, CAV ’04
 Initially: Extract predicates from verification condition
 Iterate: Add new predicates by composing next-state
formulas
 With some heuristics thrown in
Experience


Can automatically generate invariants for real examples
~10X slower than for hand-selected predicates