SAT-Based Decision Procedures for Linear Arithmetic and Uninterpreted Functions

Randal E. Bryant

Carnegie Mellon University http://www.cs.cmu.edu/~bryant

Decision Procedures in Formal Verification

[Figure: RTL / source code + specification → abstraction → formal model + specification → verification → OK or Error. The verification step uses a decision procedure for a decidable fragment of first-order logic.]

Applications: Out-of-order, Pipelined Microprocessors; Cache Coherence Protocols; Device Drivers; Compiler Validation; …

– 2 –

SAT-based Decision Procedures

EAGER ENCODING
Input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable

LAZY ENCODING
Input formula → approximate Boolean encoder → Boolean formula → SAT solver → unsatisfiable, or satisfiable with a satisfying assignment
Satisfying assignment → first-order conjunctions SAT checker → satisfiable, or unsatisfiable with an additional clause added to the Boolean formula

– 3 –

Lazy Encoding Characteristics

[Figure: the first-order conjunctions SAT checker is a theory combiner over uninterpreted functions, linear arithmetic, bit vectors, …, theory N]

+ Can be extended to handle wide variety of theories
+ Clean & modular design
– Does not scale well
  • Number of calls to conjunction checker typically exponential in formula size
  • Each call independent: nothing learned in one call can be exploited by another

– 4 –
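A minimal sketch of the lazy loop, to show where the repeated checker calls come from. It is an illustration under stated assumptions, not the design of any particular tool: the SAT solver is replaced by brute-force enumeration, the only theory is equality between variables, and `lazy_check` and `equality_consistent` are hypothetical names. Each rejected assignment contributes only one blocking clause.

```python
from itertools import product


def equality_consistent(assignment):
    """Check a conjunction of equalities and disequalities with union-find.

    assignment maps an atom (a, b), read as "a = b", to True or False.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for (a, b), value in assignment.items():
        if value:
            parent[find(a)] = find(b)
    return all(find(a) != find(b)
               for (a, b), value in assignment.items() if not value)


def lazy_check(atoms, clauses):
    """atoms: list of (a, b) pairs; clauses: CNF over signed 1-based atom indices.

    Returns the verdict and the number of calls to the conjunction checker.
    """
    clauses = [list(c) for c in clauses]
    calls = 0
    while True:
        # Stand-in for the SAT solver: first assignment satisfying all clauses.
        model = next((bits for bits in product([False, True], repeat=len(atoms))
                      if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in c)
                             for c in clauses)), None)
        if model is None:
            return "unsatisfiable", calls
        calls += 1
        if equality_consistent(dict(zip(atoms, model))):
            return "satisfiable", calls
        # Additional clause: rule out exactly this Boolean assignment.
        clauses.append([-(i + 1) if bit else (i + 1)
                        for i, bit in enumerate(model)])


atoms = [("x", "y"), ("y", "z"), ("x", "z")]
print(lazy_check(atoms, [[1], [2], [-3]]))  # x = y, y = z, x != z
print(lazy_check(atoms, [[1], [2]]))        # x = y, y = z
```

On the second formula the enumeration first proposes an assignment that sets the unconstrained atom x = z to false, so the checker has to reject it before a consistent assignment is found.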

Eager Encoding Characteristics

[Figure: input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable]

– Must encode all information about domain properties into Boolean formula
– Some properties can give exponential blowup
+ Lets SAT solver do all of the work

Good Approach for Some Domains
• Modern SAT solvers have remarkable capacity
• Good at extracting relevant portions out of very large formulas
• Learns about formula properties as search proceeds
• Focus of this talk

– 5 –

Data and Function Abstraction

Bit-vectors to (unbounded) Integers
[Figure: bits x_0, x_1, x_2, …, x_{n-1} abstracted to a single integer x]

Functional units to Uninterpreted Functions
[Figure: ALU replaced by uninterpreted function f]
• a = x ∧ b = y ⇒ f(a,b) = f(x,y)

Common Operations
• If-then-else: ITE(p, x, y), a multiplexor that selects x when p is 1 and y when p is 0
• Test for equality: x = y

– 6 –

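Abstract Modeling of Microprocessor

[Figure: pipelined datapath with PC, Op, Rd, Ra, Rb, Imm, Adat, register file, ALU, pipeline registers IF/ID, ID/EX, EX/WB, control logic, and equality comparators; functional blocks are labeled as uninterpreted functions F1 and F2]

For any Block that Transforms or Evaluates Data:
• Replace with generic, unspecified function
• Also view instruction memory as function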

– 7 –

EUF: Equality with Uninterp. Functs
• Decidable fragment of first order logic

Formulas (F)                  Boolean Expressions
  ¬F, F_1 ∧ F_2, F_1 ∨ F_2      Boolean connectives
  T_1 = T_2                     Equation
  P(T_1, …, T_k)                Predicate application

Terms (T)                     Integer Expressions
  ITE(F, T_1, T_2)              If-then-else
  Fun(T_1, …, T_k)              Function application

Functions (Fun)               Integer → Integer
  f                             Uninterpreted function symbol
  Read, Write                   Memory operations

Predicates (P)                Integer → Boolean
  p                             Uninterpreted predicate symbol

– 8 –

EUF Decision Problem

Circuit Representation of Formula
• Truth Values
  • Dashed lines
  • Model control
  • Logical connectives
  • Equations
• Integer Values
  • Solid lines
  • Model data
  • Uninterpreted functions
  • If-Then-Else operation

[Figure: circuit for an example formula built from applications of f and equality comparators]

Task
• Determine whether formula F is universally valid
  • True for all interpretations of variables and function symbols
  • Often expressed as (un)satisfiability problem
    » Prove that formula ¬F is not satisfiable

– 9 –

Finite Model Property for EUF

[Figure: the example circuit, whose distinct terms are x_0, d_0, f(x_0), f(d_0)]

Observation
• Any formula has limited number of distinct expressions
• Only property that matters is whether or not different terms are equal

– 10 –

Boolean Encoding of Integer Values

Expression   Possible Values   Bit Encoding
x_0          {0}               0     0
d_0          {0,1}             0     b_10
f(x_0)       {0,1,2}           b_21  b_20
f(d_0)       {0,1,2,3}         b_31  b_30

For Each Expression
• Either equal to or distinct from each preceding expression

Boolean Encoding
• Use Boolean values to encode integers over small range
• EUF formula can be translated into propositional logic
  • Logic circuit with multiplexors, comparators, logic gates
  • Tautology iff original formula valid

– 11 –
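The "Possible Values" column can be tried out with a small sketch. It assumes a formula over variables and equations only (function applications already eliminated) and checks validity by enumeration instead of building a circuit; `valid_over_small_domain` and the two example formulas are hypothetical.

```python
from itertools import product


def valid_over_small_domain(variables, formula):
    """Check validity of an equality formula by small-domain enumeration.

    The i-th variable (counting from 0) ranges over {0, ..., i}: it is either
    equal to one of the preceding variables or distinct from all of them.
    """
    domains = [range(i + 1) for i in range(len(variables))]
    return all(formula(dict(zip(variables, values)))
               for values in product(*domains))


def transitivity(v):
    # (x = y and y = z) implies x = z
    return not (v["x"] == v["y"] and v["y"] == v["z"]) or v["x"] == v["z"]


def some_pair_equal(v):
    # x = y or y = z or x = z: fails when all three are distinct
    return v["x"] == v["y"] or v["y"] == v["z"] or v["x"] == v["z"]


print(valid_over_small_domain(["x", "y", "z"], transitivity))
print(valid_over_small_domain(["x", "y", "z"], some_pair_equal))
```

Only 1 · 2 · 3 = 6 assignments are examined for three variables, yet every way of making the variables equal or distinct is covered.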

Some History of EUF Decision Procedures

Ackermann, 1954
• Quantifier-free decision problem can be decided based on finite instantiations

Burch & Dill, CAV ’94
• Automatic decision procedure
  » Davis-Putnam enumeration
  » Congruence closure to enforce functional consistency

Boolean approaches
• Goel, et al, CAV ’98
  » Attempted with BDDs, but didn’t get good results
• Bryant, German, Velev, CAV ’99
  » Could verify microprocessor using BDDs
• Velev & Bryant, DAC 2001
  » Demonstrated power of modern SAT procedures

– 12 –

Exploiting Positive Equality

Bryant, German, Velev CAV ’99
• First successful use of Boolean methods for EUF

Positive Equality
• Equations that appear in unnegated form

Exploiting
• Can greatly reduce number of cases required to show validity
• Only need to consider maximally diverse interpretations
• Reduce number of Boolean variables in bit-level encoding

– 13 –

Diverse Interpretations: Illustration

Task
• Verify someone’s obscure code for 4X4 array transpose

    void trans(int a[4][4])
    {
        int t;
        for (t = 4; t < 15; t++)
            if (~t&2 || t&8 && ~t&1) {
                /* Only operations on array elements */
                int r = t&0x3;
                int c = t>>2;
                int val = a[r][c];
                a[r][c] = a[c][r];
                a[c][r] = val;
            }
    }

Observation
• Array elements altered only by copying one to another
• Just need to make sure right set of copies performed

– 14 –

Verifying Array Code

Test for trans4

a:
   0  1  2  3
   4  5  6  7
   8  9 10 11
  12 13 14 15

a’ = trans4(a):
   0  4  8 12
   1  5  9 13
   2  6 10 14
   3  7 11 15

Single Test Adequate
• Unique value for each possible source element
• “Maximally Diverse”
• If a’[r][c] = a[c][r], then must have copied proper value

– 15 –

Characteristics of Array Verification

Correctness Condition
a’[0][0] = a[0][0] ∧ a’[0][1] = a[1][0] ∧ a’[0][2] = a[2][0] ∧ … ∧ a’[3][2] = a[2][3] ∧ a’[3][3] = a[3][3]

Properties
• All equations are in positive form
• Worst case test is one that tends to make things unequal
• Maximally diverse interpretation: use as many different values as possible
• All maximally diverse interpretations isomorphic
• Only need to try one to prove all handled correctly

– 16 –
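The single maximally diverse test can be run directly. The sketch below is a Python port of the C routine from the transpose slide (the port and the variable names `original` and `ok` are not part of the talk); it fills the array with sixteen distinct values and checks the correctness condition a’[r][c] = a[c][r].

```python
def trans(a):
    """Python port of the C routine: swaps a[r][c] with a[c][r] in place."""
    for t in range(4, 15):
        # Same test as the C code: ~t&2 || t&8 && ~t&1
        if (~t & 2) or ((t & 8) and (~t & 1)):
            r, c = t & 0x3, t >> 2
            a[r][c], a[c][r] = a[c][r], a[r][c]


# Maximally diverse input: every source element holds a distinct value.
a = [[4 * r + c for c in range(4)] for r in range(4)]
original = [row[:] for row in a]

trans(a)

# Correctness condition: a'[r][c] = a[c][r] for all r, c.
ok = all(a[r][c] == original[c][r] for r in range(4) for c in range(4))
print(ok)
```

Because the routine only copies elements, passing this one test with distinct values shows that the right set of copies is performed for any array contents.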

Equations in Processor Verification

[Figure: pipelined datapath with PC, +4, instruction memory, Op, Rd, Ra, Rb, Imm, Adat, register file, ALU, pipeline registers IF/ID, ID/EX, EX/WB, control logic, and equality comparators]

Data Types               Equations
• Register Ids           Control stalling & forwarding
• Program Data           Only top-level verification condition
• Instruction Address    Only top-level verification condition

– 17 –

Exploiting Equation Structure

Positive Equations
• In top-level verification condition
• Can use maximally diverse interpretation

Negative Equations
• Pipeline control logic
• Between register IDs
• Operation depends on whether or not two IDs are equal
• Must use general encoding
  • Encode with Boolean variables
  • All possibilities of IDs that match and/or don’t match

– 18 –

Application of Positive Equality

[Figure: the example circuit evaluated under a single diverse interpretation, with fixed values x_0 = 5, d_0 = 6, f(x_0) = 7, f(d_0) = 8]

Observation
• All equations are positive in this formula
• Can consider single, diverse interpretation for terms

– 19 –

Function Elimination: Ackermann’s Method

Replace All Function Applications by Integer Variables
• Introduce new domain variable
• Enforce functional consistency by global constraints
• Unclear how to restrict evaluation to diverse interpretations

[Figure: applications f(x_1), f(x_2) replaced by variables vf_1, vf_2; the constraint x_1 = x_2 ⇒ vf_1 = vf_2 is combined with the formula F]

– 20 –
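A minimal sketch of the constraint generation in Ackermann's method, assuming a single unary function symbol f and terms written as strings. The helper `ackermann_constraints` and the output syntax are hypothetical.

```python
from itertools import combinations


def ackermann_constraints(args):
    """args: argument terms of the applications f(args[0]), f(args[1]), ...

    Returns the fresh variable names and one functional-consistency
    constraint per pair of applications.
    """
    fresh = [f"vf_{i + 1}" for i in range(len(args))]
    constraints = [
        f"({args[i]} = {args[j]}) => ({fresh[i]} = {fresh[j]})"
        for i, j in combinations(range(len(args)), 2)
    ]
    return fresh, constraints


fresh, constraints = ackermann_constraints(["x_1", "x_2", "x_3"])
for c in constraints:
    print(c)
```

The number of constraints grows quadratically with the number of applications of f, since every pair needs its own constraint.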

Function Elimination: ITE Method

General Technique
• Introduce new domain variable
• Nested ITE structure maintains functional consistency

[Figure: f(x_1) → vf_1; f(x_2) → ITE(x_2 = x_1, vf_1, vf_2); f(x_3) → ITE(x_3 = x_1, vf_1, ITE(x_3 = x_2, vf_2, vf_3))]

– 21 –

Generating Diverse Encoding

Replacing Application
• Use fixed values rather than variables
• Application results equal iff arguments equal

[Figure: the nested ITE structure for f(x_1), f(x_2), f(x_3), with the variables vf_1, vf_2, vf_3 replaced by the fixed values 5, 6, 7]

– 22 –
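The nested ITE construction can be sketched as a small term generator, with either fresh variables or fixed values at the leaves. This is an illustration only: terms are plain strings, a single unary function f is assumed, and `ite_replacements` is a hypothetical helper.

```python
def ite_replacements(args, values=None):
    """Replace each application f(args[i]) by a nested ITE term.

    values: leaf terms for the applications; defaults to fresh variables
    vf_1, vf_2, ...  Passing fixed constants gives the diverse encoding.
    """
    if values is None:
        values = [f"vf_{i + 1}" for i in range(len(args))]
    result = []
    for i, arg in enumerate(args):
        term = values[i]
        # Build from the innermost alternative outwards, so the comparison
        # with the earliest argument ends up as the outermost test.
        for j in reversed(range(i)):
            term = f"ITE({arg} = {args[j]}, {values[j]}, {term})"
        result.append(term)
    return result


general = ite_replacements(["x_1", "x_2", "x_3"])
diverse = ite_replacements(["x_1", "x_2", "x_3"], values=["5", "6", "7"])
print(general[2])
print(diverse[2])
```

In both variants the result for a later application equals the result of an earlier one whenever the arguments match, which is how functional consistency is maintained.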

Benefits of Positive Equality

Microprocessor Benchmarks (Velev & Bryant, JSC ’02)
• 1xDLX: Single issue, RISC processor
• 2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
• 9VLIW-BP: 9-way VLIW processor with branch prediction

Measurements
• Using BerkMin SAT solver

[Table: rows 1xDLX, 2xDLX-EX-BP, 9VLIW-BP, each in a buggy and a good version; columns "Using Pos. Eq." and "No Pos. Eq". Cell values in extraction order, positions not recoverable: 0.02, 2, 0.07, 4, 15, 10, 224, 229, 15, > 24hrs, > 24hrs, > 24hrs]

– 23 –

Revisiting Encoding Techniques

x = y ∧ y = z ∧ z ≠ x    Satisfiable?

Small Domain (SD)
  ⟨x_1 x_0⟩ = ⟨y_1 y_0⟩ ∧ ⟨y_1 y_0⟩ = ⟨z_1 z_0⟩ ∧ ⟨z_1 z_0⟩ ≠ ⟨x_1 x_0⟩
• Use bit-level encodings of bounded integers
• Implicitly encode properties of equality logic

Per-Constraint Encoding (EIJ)
  e_xy ∧ e_yz ∧ ¬e_xz
Transitivity Constraints
  e_xy ∧ e_yz ⇒ e_xz
  e_xy ∧ e_xz ⇒ e_yz
  e_yz ∧ e_xz ⇒ e_xy
• Introduce explicit Boolean variable for each equation
• Additional transitivity constraints to express properties of equality logic

– 24 –
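Both encodings of x = y ∧ y = z ∧ z ≠ x are small enough to check by enumeration. The sketch below is illustrative only (brute force stands in for the SAT solver, and the function names are hypothetical); it also shows what goes wrong if the transitivity constraints are left out.

```python
from itertools import product


def sat_small_domain():
    """Small domain: x, y, z are 2-bit values, equality compares the values."""
    return any(x == y and y == z and z != x
               for x, y, z in product(range(4), repeat=3))


def sat_per_constraint(with_transitivity=True):
    """Per-constraint: one Boolean variable per equation."""
    for e_xy, e_yz, e_xz in product([False, True], repeat=3):
        formula = e_xy and e_yz and not e_xz
        transitivity = ((not (e_xy and e_yz) or e_xz) and
                        (not (e_xy and e_xz) or e_yz) and
                        (not (e_yz and e_xz) or e_xy))
        if formula and (transitivity or not with_transitivity):
            return True
    return False


print(sat_small_domain())          # equality properties are implicit
print(sat_per_constraint())        # with transitivity constraints
print(sat_per_constraint(False))   # without them: spurious solution
```

Without the transitivity constraints the Boolean formula is satisfiable even though the original formula is not, so the constraints are needed for the encoding to be sound.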

Per-Constraint Encoding
• Introduced by Goel et al., CAV ’98
• Exploiting sparse structure by Bryant & Velev, CAV 2000

Procedure
• Initial formula F
  • Want to prove valid
  • Prove that ¬F is not satisfiable
• Replace each equation x = y by Boolean variable e_xy
  • Gives formula F_sat
• Generate formula expressing transitivity constraints
  • Gives formula F_trans
• Use SAT solver to show that F_sat ∧ F_trans not satisfiable

Motivation
• Provides SAT solver with more direct representation of underlying problem

– 25 –

Graph Interpretation of Transitivity

Transitivity Violation
• Cycle in graph
• Exactly one edge has e_{i,j} = false

[Figure: cycle in the equation graph with every edge labeled = except one]

– 26 –

Exploiting Chords

Chord
• Edge connecting two nonadjacent vertices in cycle

Property
• Sufficient to enforce transitivity constraints for all chord-free cycles
• If transitivity holds for all chord-free cycles, then holds for arbitrary cycles

– 27 –

Enumerating Chord-Free Cycles

Strategy
• Enumerate chord-free cycles in graph
• Each cycle of length k yields k transitivity constraints

Problem
• Potentially exponential number of chord-free cycles

[Figure: chain of segments 1, 2, …, k; the graph has 2^k + k chord-free cycles]

– 28 –
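The k constraints for one cycle can be written out mechanically: for each edge, if all the other edges on the cycle are true, that edge must be true as well. A minimal sketch, assuming distinct edge-variable names given as strings and `~` as a hypothetical negation prefix:

```python
def cycle_constraints(edges):
    """edges: names of the Boolean edge variables around one cycle.

    Returns one clause per edge. Each clause is a list of literals and reads
    "not all other edges are true, or this edge is true".
    """
    return [[f"~{e}" for e in edges if e != target] + [target]
            for target in edges]


clauses = cycle_constraints(["e_xy", "e_yz", "e_xz"])
for clause in clauses:
    print(" v ".join(clause))
```

For a triangle this yields the three transitivity constraints; a cycle of length k yields k clauses of k literals each.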

Adding Chords

Strategy
• Add edges to graph to reduce number of chord-free cycles

[Figure: chain of segments 1, 2, …, k; the original graph has 2^k + k chord-free cycles, the graph with added chords has 2k + 1 chord-free cycles]

Trade-Off
• Reduces formula size
• Increases number of relational variables

– 29 –

Chordal Graph

Definition
• Every cycle of length > 3 has a chord

Goal
• Add minimum number of edges to make graph chordal

Relation to Sparse Gaussian Elimination
• Choose pivot ordering that minimizes fill-in
• NP-hard
• Simple heuristics effective

– 30 –
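One simple heuristic of this kind is greedy minimum-degree elimination, sketched below. The talk does not say which heuristic was used, so the choice is an assumption, and `make_chordal` and `triangle_clauses` are hypothetical names. Eliminating a vertex connects its remaining neighbours pairwise (the fill-in edges are the added chords) and records the triangles it forms; each triangle then gives three transitivity constraints.

```python
from itertools import combinations


def make_chordal(vertices, edges):
    """Greedy minimum-degree elimination.

    Returns the edge set including fill-in edges, and the list of triangles
    of the resulting chordal graph.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    all_edges = {frozenset(e) for e in edges}
    triangles = []
    remaining = set(vertices)
    while remaining:
        # Pivot: remaining vertex with the fewest remaining neighbours.
        v = min(sorted(remaining), key=lambda u: len(adj[u] & remaining))
        nbrs = sorted(adj[v] & remaining)
        for a, b in combinations(nbrs, 2):
            if b not in adj[a]:  # fill-in edge, i.e. a new chord
                adj[a].add(b)
                adj[b].add(a)
                all_edges.add(frozenset((a, b)))
            triangles.append((v, a, b))
        remaining.remove(v)
    return all_edges, triangles


def triangle_clauses(triangles):
    """Three constraints per triangle: two edges true implies the third."""
    clauses = []
    for tri in triangles:
        e = [frozenset(pair) for pair in combinations(tri, 2)]
        for i in range(3):
            clauses.append(([e[j] for j in range(3) if j != i], e[i]))
    return clauses


# A 4-cycle has no chord; one fill-in edge splits it into two triangles.
all_edges, triangles = make_chordal([1, 2, 3, 4],
                                    [(1, 2), (2, 3), (3, 4), (4, 1)])
print(len(all_edges), len(triangles), len(triangle_clauses(triangles)))
```

The example mirrors the trade-off on the slides: one extra relational variable is introduced, and the constraints are then generated from triangles only.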

1xDLX-C Equation Structure

Vertices
• For each v_i
• 13 different register identifiers

Edges
• For each equation
• Control stalling and forwarding logic
• 27 relational variables
• Out of 78 possible

– 31 –

Adding Chordal Edges to 1xDLX-C

Original
• 27 relational variables
• 286 cycles
• 858 clauses

Augmented
• 33 relational variables
• 40 cycles
• 120 clauses

– 32 –

2xDLX-CCt Equation Structure

Equations
• Between 25 different register identifiers
• 143 relational variables
• Out of 300 possible

– 33 –

Adding Chordal Edges to 2xDLX-CCt

Original
• 143 relational variables
• 2,136 cycles
• 8,364 clauses

Augmented
• 193 relational variables
• 858 cycles
• 2,574 clauses

– 34 –

Choosing Encoding Method

Comparison
• Formula length n with m integer variables & function applications
• Worst-case complexity

                     Small Domain         Per-Constraint
Boolean Variables    O(m log m)           O(m^2)
Formula Size         O(n + m^2 log m)     O(n + m^3)

Per-Constraint Encoding Works Well in Practice
• Generates slightly larger formulas than small domain
• Better performance by SAT solver

– 35 –

Encoding Comparison

Benchmarks (Velev & Bryant, JSC ’02)
• Superscalar, out-of-order datapath
• 2–6 instructions issued in parallel

Measurements
• Using BerkMin SAT solver

Issue    Per-Constraint                Small Domain
Width    Vars     Clauses    Time      Vars    Clauses    Time
2          139      8,213      1.6       81      1,294      1.7
3          308     33,270       15      127      3,780       19
4          553     96,480       65      194      8,362       99
5          857    240,892      154      249     15,647      255
6        1,243    528,962    1,957      304     26,738    3,206

– 36 –

Extensions

Difference logic
• Predicates of form x ≤ y + C
• Original logic of UCLID
• Use integer variables to represent pointers into buffers
  • C = 1

Linear constraints
• Predicates of form a_1 x_1 + a_2 x_2 + … + a_n x_n ≤ b
• Used in applying UCLID to software verification and software security problems

– 37 –

Difference Logic

Predicates of form x ≤ y + C
• C generally a small integer

Encoding Methods
• Small domain
  • Range bound n · max |C|
• Per constraint encoding
  • Variables of form e_{x,y}^C
  • Can have exponential blowup in number of variables

Choosing Encoding Method
• Per constraint better, as long as it doesn’t blow up
• Predicting blowup
  • Successfully used classifier trained by machine learning (Seshia, Lahiri & Bryant, DAC ’03)

– 38 –

Linear Constraints
• Predicates of form a_1 x_1 + a_2 x_2 + … + a_n x_n ≤ b

Common Case
• All but k predicates are difference predicates
  • a_i = +1, a_j = –1, rest = 0
• Rest are sparse
  • At most w coefficients nonzero
  • Coefficient values small

  n        #variables
  w        max #non-zero terms
  k        #non-difference constraints
  b_max    max |constant|
  a_max    max |coefficient|

– 39 –

Linear Constraints

Small Domain Encoding (Seshia & Bryant, LICS ’04)
• Find value D such that only need to consider solutions with 0 ≤ x_i < D, for all i
• Bounds on D: (n + 2) · n · (b_max + 1) · (w · a_max)^k
• Encode as SAT problem with log(D) bits / integer variable
• Practical for real applications

  n        #variables
  w        max #non-zero terms
  k        #non-difference constraints
  b_max    max |constant|
  a_max    max |coefficient|

– 40 –
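A worked example of the bound, reading the slide as D = (n + 2) · n · (b_max + 1) · (w · a_max)^k. The parameter values below are made up for illustration and do not come from any benchmark in the talk; `solution_bound` and `bits_per_variable` are hypothetical helper names.

```python
def solution_bound(n, k, w, a_max, b_max):
    """D = (n + 2) * n * (b_max + 1) * (w * a_max) ** k, as read from the slide."""
    return (n + 2) * n * (b_max + 1) * (w * a_max) ** k


def bits_per_variable(d):
    """Bits needed to represent every value in the range 0 .. d - 1."""
    return max(1, (d - 1).bit_length())


# Hypothetical problem: 10 variables, constants up to 5, and 2 sparse
# non-difference constraints with at most 3 terms and coefficients up to 2.
d = solution_bound(n=10, k=2, w=3, a_max=2, b_max=5)
print(d, bits_per_variable(d))

# With difference constraints only (k = 0) the last factor disappears.
d0 = solution_bound(n=10, k=0, w=3, a_max=2, b_max=5)
print(d0, bits_per_variable(d0))
```

Since only log(D) bits are needed per variable, each additional non-difference constraint adds about log2(w · a_max) bits per integer variable.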

Some Lessons We’ve Learned

Preserve Boolean Structure
• Other approaches require collapsing to conjunctions of predicates

Exploit Problem Characteristics
• Sparseness
  • Tighten bounds and/or reduce number of constraints
• Polarity structure
  • Positive equality

Let SAT Solver Do the Work
• Eager encoding: provide sufficient set of constraints to prove / disprove formula
• SAT solvers are good at digesting large volume of information

– 41 –
