Automated Verification of Concurrent Linked Lists with Counters

Tuba Yavuz-Kahveci and Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara
{tuba,bultan}@cs.ucsb.edu
http://www.cs.ucsb.edu/~bultan/composite
General Problem

Concurrent programming is difficult and error prone
– Sequential programming: states of the variables
– Concurrent programming: states of the variables and the processes

Linked list manipulation is difficult and error prone
– States of the heap: possibly infinite

We would like to guarantee properties of a concurrent linked list
implementation
More Specific Problem

There has been work on verification of concurrent systems with
integer variables (and linear constraints)
– [Bultan, Gerber and Pugh, TOPLAS 99]
– [Delzanno and Podelski STTT01]
– Use widening based on earlier work of [Cousot and Halbwachs
POPL 77] on analyzing programs with integer variables

There has been work on verification of (concurrent) linked lists
– [Yahav POPL’01]

What can we do for concurrent systems:
– where both integer and heap variables influence the control flow,
– or the properties we wish to verify involve both integer and heap
variables?
Our Approach

Use symbolic verification techniques
– Use polyhedra to represent the states of the integer variables
– Use BDDs to represent the states of the boolean and enumerated
variables
– Use shape graphs to represent the states of the heap
– Use composite representation to combine them

Use forward-fixpoint computations to compute reachable states
– Truncated fixpoint computations can be used to detect errors
– Over-approximation techniques can be used to prove properties
• Polyhedra widening
• Summarization in shape graphs
Action Language Tool Set
[Figure: tool architecture. An Action Language specification of the concurrency component is read by the Action Language Parser and checked by the Action Language Verifier, which is built on the Composite Symbolic Library (Omega Library, CUDD Package, MONA). The Code Generator produces verified code (Java monitor classes).]
Outline

Specification of concurrent linked lists
– Action Language

Symbolic verification
– Composite representation

Approximation techniques
– Summarization
– Widening




Counting abstraction
Experimental results
Related Work
Conclusions
Action Language [Bultan ICSE00] [Yavuz-Kahveci, Bultan
ASE01]

A state based language
– Actions correspond to state changes

States correspond to valuations of variables
– Integer (possibly unbounded), heap, boolean and enumerated
variables
– Parameterized constants are allowed

Transition relation is defined using actions
– Atomic actions: Predicates on current and next state variables
– Action composition: synchronous (&) or asynchronous (|)

Modular
– Modules can have submodules

Properties to be verified
– Invariant(p) : p always holds
Composite Formulas: State Formulas

We use state formulas to express the properties we need to
check
– No primed variables in state formulas
– State formulas are boolean combinations (∧, ∨, ¬, ⇒, ⇔) of integer,
boolean and heap formulas
– Example: numItems>2 => top.next!=null, where numItems>2 is an integer formula and top.next!=null is a heap formula
State formulas

Boolean formulas
– Boolean variables and constants (true, false)
– Relational operators: =, ≠
– Boolean connectives (∧, ∨, ¬, ⇒, ⇔)

Integer formulas (linear arithmetic)
– Integer variables and constants
– Arithmetic operators: +, -, and * with a constant
– Relational operators: =, ≠, >, <, ≥, ≤
– Boolean connectives (∧, ∨, ¬, ⇒, ⇔)

Heap formulas
– Heap variable, heap-variable.selector, heap constant null
– Relational operators: =, ≠
– Boolean connectives (∧, ∨, ¬, ⇒, ⇔)
Composite Formulas: Transition Formulas

We use transition formulas to express the actions
– In transition formulas primed variables denote the next-state
values, unprimed variables denote the current-state values
– Example (pc and numItems are current state variables; top’, add’, numItems’, pc’ and mutex’ are next state variables):
pc=checknull and numItems=0 and top’=add and add’.next=null and
numItems’=1 and pc’=create and mutex’;
Transition Formulas

Transition formulas are in the form:
boolean-formula ∧ integer-formula ∧ heap-transition-formula

Heap transition formulas are in the form:
guard-formula ∧ update-formula

A guard formula is a conjunction of terms in the form:
id1 = id2
id1.f = id2
id1.f = id2.f
id1 = null
id1.f = null

id1 ≠ id2
id1.f ≠ id2
id1.f ≠ id2.f
id1 ≠ null
id1.f ≠ null
An update formula is a conjunction of terms in the form:
id’1 = id2
id’1.f = id2
id’1 = null
id’1= new
id’1 = id2.f
id’1.f = id2.f
id’1.f = null
id’1.f = new
Stack Example
Variable declarations define
the state space of the system
module main()
heap {next} top, add, get, newTop;
boolean mutex;
integer numItems;
Predicates defining
the initial states
initial: top=null and mutex and numItems=0;
module push()
Atomic actions: primed variables denote next-state variables
enumerated pc {create, checknull,updateTop};
initial: pc=create and add=null;
push1: pc=create and mutex and !mutex’ and add’=new and
pc’=checknull;
push2: pc=checknull and top=null and top’=add and add’.next=null
and numItems'=1 and pc’=create and mutex’;
push3: pc=checknull and top!=null and add’.next=top and
pc’=updateTop;
push4: pc=updateTop and top’=add and numItems’=numItems+1
and mutex’ and pc’=create;
push: push1 | push2 | push3 | push4;
endmodule
Transition relation of the
push module is defined as
asynchronous composition
of its atomic actions
Stack (Cont’d)
module pop()
enumerated pc {copyTopNext, getTop, updateTop};
initial: pc=copyTopNext and get=null and newTop=null;
pop1: pc=copyTopNext and mutex and top!=null and
newTop’=top.next and !mutex’ and pc’=getTop;
pop2: pc=getTop and get’=top and pc’=updateTop;
pop3: pc=updateTop and top’=newTop and mutex’
and numItems’=numItems-1 and pc’=copyTopNext;
pop: pop1 | pop2 | pop3;
endmodule
main: pop() | pop() | push() | push();
spec: invariant([numItems=0 => top=null])
spec: invariant([numItems>2 => top->next!=null])
endmodule
Invariants to be verified
Transition relation of main defined as
asynchronous composition of two pop and two
push processes
Stack (with integer guards)
module main()
heap {next} top, add, get, newTop;
boolean mutex;
integer numItems;
initial: top=null and mutex and numItems=0;
module push()
enumerated pc {create, checknull,updateTop};
initial: pc=create and add=null;
push1: pc=create and mutex and !mutex’ and add’=new and
pc’=checknull;
push2: pc=checknull and numItems=0 and top’=add and add’.next=null
and numItems’=1 and pc’=create and mutex’;
push3: pc=checknull and numItems>0 and add’.next=top and
pc’=updateTop;
push4: pc=updateTop and top’=add and numItems’=numItems+1 and
mutex’ and pc’=create;
push: push1 | push2 | push3 | push4;
endmodule
Symbolic Verification: Forward Fixpoint

Forward fixpoint for the reachable states can be computed by
iteratively manipulating symbolic representations
– We need forward-image (post-condition), union, and equivalence
check computations
ReachableStates(I: Set of initial states,
                T: Transition relation) {
  RS := I;
  repeat {
    RSold := RS;
    RS := RSold ∪ forwardImage(RSold, T);
  } until (RSold = RS)
}
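The fixpoint loop above can be sketched on an explicit-state toy system. This is only an illustration: it uses plain Python sets where the tool uses symbolic representations, and the names `forward_image`, `reachable_states` and the example transition table are assumptions made for this sketch.

```python
def forward_image(states, transitions):
    """Post-condition: every state reachable in one step from `states`."""
    return {t for s in states for t in transitions.get(s, ())}


def reachable_states(initial, transitions):
    """Least fixpoint of RS := RS ∪ forwardImage(RS, T), starting from I."""
    rs = set(initial)
    while True:
        rs_old = rs
        rs = rs_old | forward_image(rs_old, transitions)
        if rs == rs_old:  # equivalence check: nothing new was added
            return rs


# Hypothetical finite transition relation: state -> list of successor states.
transitions = {0: [1], 1: [2], 2: [0], 3: [4]}
print(sorted(reachable_states({0}, transitions)))
```

On a finite state space this loop always terminates; the symbolic version needs the same three operations (forward image, union, equivalence check) on sets that may be infinite.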
Symbolic Verification: Symbolic Representations

Use a symbolic representation for the sets of states
– A boolean logic formula (stored as a BDD) represents the sets of
states of the boolean variables:
pc=create ∧ mutex
– An arithmetic constraint (stored as polyhedra) represents the sets
of states of integer variables:
numItems>0
– Shape graphs are used to represent the states of the heap variables
and the heap
[Figure: a shape graph over the heap variables add and top]
Composite Representation

Each variable type is mapped to a symbolic representation type
– Boolean and enumerated types → BDD representation
– Integer variables → Polyhedra
– Heap variables → Shape graphs



Each conjunct in a transition formula operates on a single
symbolic representation
Composite representation: A disjunctive representation to
combine different symbolic representations
Union, equivalence check and forward-image computations are
performed on this disjunctive representation
Composite Representation

A composite representation A is a disjunction
$$A = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{t} a_{ij}$$
where
– n is the number of composite atoms in A
– t is the number of basic symbolic representations

Each composite atom is a conjunction
– Each conjunct corresponds to a different symbolic representation
Composite Representation: Example
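Each composite atom is a conjunction of a BDD, a list of polyhedra and a list of shape graphs; the composite representation is the disjunction of the atoms:
pc=create ∧ mutex ∧ numItems=2 ∧ [shape graph over add and top]
∨ pc=checkNull ∧ ¬mutex ∧ numItems=2 ∧ [shape graph over add and top]
∨ pc=updateTop ∧ ¬mutex ∧ numItems=2 ∧ [shape graph over add and top]
∨ pc=create ∧ mutex ∧ numItems=3 ∧ [shape graph over add and top]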
Composite Symbolic Library [Yavuz-Kahveci, Tuncer, Bultan
TACAS01], [Yavuz-Kahveci, Bultan STTT02]

Our library implements this approach using an object-oriented
design
– A common interface is used for each symbolic representation
– Easy to extend with new symbolic representations
– Enables polymorphic verification
– As a BDD library we use Colorado University Decision Diagram Package
(CUDD) [Somenzi et al]
– As an integer constraint manipulator we use Omega Library [Pugh et al]
– For encoding the states of the heap variables and the heap we use shape
graphs encoded as BDDs (using CUDD)
Composite Symbolic Library: Class Diagram
Symbolic (common interface)
  +union()
  +isSatisfiable()
  +isSubset()
  +forwardImage()
BoolSym (extends Symbolic)
  –representation: BDD
  +union()
HeapSym (extends Symbolic)
  –representation: list of ShapeGraph
  +union()
IntSym (extends Symbolic)
  –representation: list of Polyhedra
  +union()
CompSym (extends Symbolic)
  –representation: list of compAtom
  +union()
ShapeGraph
  –atom: *Symbolic
compAtom
  –atom: *Symbolic
External libraries: CUDD Library, OMEGA Library
Satisfiability Checking for Composite Representation
boolean isSatisfiable(CompSym A)
  for each compAtom a in A do
    if a is satisfiable then
      return true
  return false

boolean isSatisfiable(compAtom a)
  for each symbolic representation t do
    if a_t is not satisfiable then
      return false
  return true
[Figure: a composite representation is satisfiable if one atom or another is satisfiable; an atom is satisfiable if each of its components is satisfiable]
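The two checks can be sketched as follows. This is a minimal illustration, not the library's code: each component of a composite atom is modeled as an explicit Python set of values (a stand-in for a BDD, a list of polyhedra or a list of shape graphs), and the dictionary keys and function names are assumptions of this sketch.

```python
def atom_is_satisfiable(atom):
    """A composite atom is a conjunction: every component must be non-empty."""
    return all(len(component) > 0 for component in atom.values())


def composite_is_satisfiable(composite):
    """A composite representation is a disjunction: one satisfiable atom suffices."""
    return any(atom_is_satisfiable(atom) for atom in composite)


# Hypothetical atoms; the empty integer component makes the first one unsatisfiable.
unsat_atom = {"bool": {("create", True)}, "int": set(), "heap": {"top=null"}}
sat_atom = {"bool": {("checknull", False)}, "int": {0}, "heap": {"top=null"}}

print(composite_is_satisfiable([unsat_atom]))
print(composite_is_satisfiable([unsat_atom, sat_atom]))
```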
Forward Image for Composite Representation
CompSym forwardImage(CompSym A, transitionRelation R)
  CompSym C;
  for each compAtom a in A do
    for each atomic action r in R do
      insert forwardImage(a, r) into C
  return C
Forward Image for Composite Atom
compAtom forwardImage(compAtom a, atomic action r)
  for each symbolic representation type t do
    replace a_t by forwardImage(a_t, r_t)
  return a
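A small sketch of how the image computation distributes over the components. It assumes explicit sets in place of the symbolic representations and models each atomic action as one function per representation type; the `push4` encoding below is a hypothetical rendering of the stack's push4 action, with the heap state reduced to a (top, add) pair of node names.

```python
def atom_forward_image(atom, action):
    """Apply the action's component image to each component of the atom."""
    return {t: action[t](atom[t]) for t in atom}


def composite_forward_image(composite, actions):
    """Image of a disjunction: one result atom per (atom, atomic action) pair."""
    return [atom_forward_image(a, r) for a in composite for r in actions]


# Hypothetical encoding of push4:
#   pc=updateTop and top'=add and numItems'=numItems+1 and mutex' and pc'=create
push4 = {
    "bool": lambda s: {("create", True) for (pc, mutex) in s if pc == "updateTop"},
    "int": lambda s: {n + 1 for n in s},
    "heap": lambda s: {(add, add) for (top, add) in s},
}

atom = {"bool": {("updateTop", False)}, "int": {2}, "heap": {("n1", "n2")}}
print(composite_forward_image([atom], [push4]))
```

When a guard does not hold, the corresponding component of the result is empty, so the result atom is unsatisfiable and contributes no states.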
Forward-Image Computation: Example

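Composite atom before the action:
pc=updateTop ∧ ¬mutex ∧ numItems=2 ∧ [shape graph over add and top]
Atomic action, split by symbolic representation:
– BDD: pc=updateTop and pc’=create and mutex’
– Polyhedra: numItems’=numItems+1
– Shape graph: top’=add
Composite atom after the action:
pc=create ∧ mutex ∧ numItems=3 ∧ [shape graph in which top points to the node that add points to]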
Forward–Fixpoint Computation
(Repeatedly Applies Forward-Image)
Composite atoms generated by successive forward-image steps (each atom also contains a shape graph over add and top):
pc=create ∧ mutex ∧ numItems=0
pc=checkNull ∧ ¬mutex ∧ numItems=0
pc=create ∧ mutex ∧ numItems=1
pc=checkNull ∧ ¬mutex ∧ numItems=1
pc=updateTop ∧ ¬mutex ∧ numItems=1
pc=create ∧ mutex ∧ numItems=2
pc=checkNull ∧ ¬mutex ∧ numItems=2
pc=updateTop ∧ ¬mutex ∧ numItems=2
pc=create ∧ mutex ∧ numItems=3
...
Forward-Fixpoint does not Converge

We have two reasons for non-termination
– integer variables can increase without a bound
– the number of nodes in the shape graphs can increase without a
bound



The state space is infinite
Even if we ignore the heap variables, reachability is undecidable
when we have unbounded integer variables
So, we use conservative approximations
Conservative Approximations



To verify or falsify a property p
Compute a lower (RS⁻) or an upper (RS⁺) approximation to
the set of reachable states RS, so that RS⁻ ⊆ RS ⊆ RS⁺
There are three possibilities:
– RS⁺ ⊆ p: “The property is satisfied”
– RS⁻ ⊄ p, i.e., RS⁻ contains reachable states which violate the property: “The property is false”
– RS⁻ ⊆ p but RS⁺ ⊄ p: “I don’t know”
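The three outcomes can be sketched with explicit sets. This is only an illustration: `check_invariant` and its arguments are hypothetical names, and the property p is given as the set of states that satisfy it.

```python
def check_invariant(p, lower=None, upper=None):
    """Decide an invariant from a lower and/or upper approximation of the reachable states."""
    if upper is not None and upper <= p:
        # Every state in the over-approximation satisfies p, so every reachable state does.
        return "satisfied"
    if lower is not None and not lower <= p:
        # Some state that is definitely reachable violates p.
        return "false"
    return "don't know"


p = {0, 1, 2, 3}
print(check_invariant(p, upper={0, 1, 2}))
print(check_invariant(p, lower={0, 5}))
print(check_invariant(p, lower={0}, upper={0, 1, 9}))
```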
Computing Upper and Lower Bounds for Reachable
States

Truncated fixpoint computation
– To compute a lower bound for a least-fixpoint computation
– Stops after a fixed number of iterations

Widening
– To compute an upper bound for the least-fixpoint computation
– We use a generalization of the polyhedra widening operator by
[Cousot and Halbwachs POPL’77]

Summarization
– Generate heap nodes in the shape graphs which represent more
than one concrete node
– Materialization: we need to generate concrete nodes from the
summary nodes when needed
Summarization

The nodes mapped to a summary node form a chain
No heap variable points to any concrete node that is mapped to
a summary node
Each concrete node mapped to a summary node is pointed to
by only one pointer
During summarization, we also introduce an integer variable
which counts the number of concrete nodes mapped to a
summary node
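A minimal sketch of this kind of summarization on a singly linked list. It makes several assumptions that are not from the slides: the list is given as node names in `next` order (so every node has exactly one incoming pointer), `pointed` is the set of nodes referenced by heap variables, and only runs of at least two unreferenced nodes are collapsed.

```python
def summarize(chain, pointed):
    """Collapse each maximal run of unreferenced nodes into ("summary", count)."""
    result, run = [], []

    def flush():
        # Assumption of this sketch: a single unreferenced node stays concrete.
        if len(run) >= 2:
            result.append(("summary", len(run)))
        else:
            result.extend(run)
        run.clear()

    for node in chain:
        if node in pointed:
            flush()
            result.append(node)
        else:
            run.append(node)
    flush()
    return result


# Stack with three items where top and add both point to the first node.
print(summarize(["n3", "n2", "n1"], pointed={"n3"}))
```

The count stored with the summary node plays the role of the integer variable introduced during summarization.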
Summarization Example

Before summarization:
pc=create ∧ mutex ∧ numItems=3 ∧ [shape graph over add and top]
After summarization, it becomes:
pc=create ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph over add and top with a summary node]
summaryCount is a new integer variable representing the number of concrete nodes encoded by the summary node
Summarization



Summarization guarantees that the number of different shape
graphs that can be generated is finite
However, the summary counts can still increase without a bound
We use the polyhedral widening operation to force the fixpoint
computation to converge
Let’s Continue the Forward-fixpoint

Each atom also contains a shape graph over add and top:
pc=create ∧ mutex ∧ numItems=3 ∧ summaryCount=2
pc=checkNull ∧ ¬mutex ∧ numItems=3 ∧ summaryCount=2
pc=updateTop ∧ ¬mutex ∧ numItems=3 ∧ summaryCount=2
pc=create ∧ mutex ∧ numItems=4 ∧ summaryCount=2
We need to do summarization on the last atom
Summarization
Before summarization:
pc=create ∧ mutex ∧ numItems=4 ∧ summaryCount=2 ∧ [shape graph over add and top]
After summarization, it becomes:
pc=create ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph over add and top]
Simplification


After each fixpoint iteration we try to merge as many composite
atoms as possible
For example, the following composite atoms can be merged:
pc=create ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph over add and top]
pc=create ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph over add and top]
Simplification
(pc=create ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph over add and top])
∨ (pc=create ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph over add and top])
=
pc=create ∧ mutex ∧ ((numItems=4 ∧ summaryCount=3) ∨ (numItems=3 ∧ summaryCount=2)) ∧ [shape graph over add and top]
Simplification on the integer part

pc=create ∧ mutex ∧ ((numItems=4 ∧ summaryCount=3) ∨ (numItems=3 ∧ summaryCount=2)) ∧ [shape graph over add and top]
=
pc=create ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 4 ∧ [shape graph over add and top]
Widening


Forward-fixpoint computation still will not converge since
numItems and summaryCount keep increasing without a
bound
We use the widening operation:
– Given two composite atoms c1 and c2 in consecutive fixpoint
iterates, assume that
c1 = b1 ∧ i1 ∧ h1
c2 = b2 ∧ i2 ∧ h2
where b1 = b2 and h1 = h2 and i1 ⊆ i2
– Also assume that i1 is a single polyhedron (i.e. a conjunction of
arithmetic constraints) and i2 is also a single polyhedron
Widening

Then
– i1 ∇ i2 is defined as: all the constraints in i1 which are also satisfied
by i2
Replace i2 with i1 ∇ i2 in c2
This gives a majorizing sequence to the forward-fixpoint
computation
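The definition can be sketched for a small case. This is an illustration under stated assumptions, not the Omega Library implementation: a constraint is a pair `(coeffs, bound)` meaning sum(coeffs[v]·v) ≤ bound, and i2 is assumed to be a bounded polyhedron given by its vertices, so a linear constraint holds on all of i2 exactly when it holds at every vertex.

```python
def satisfies(point, constraint):
    """Check sum(coeffs[v] * point[v]) <= bound for one point."""
    coeffs, bound = constraint
    return sum(c * point[v] for v, c in coeffs.items()) <= bound


def widen(i1_constraints, i2_vertices):
    """i1 ∇ i2: keep the constraints of i1 that every point of i2 satisfies."""
    return [c for c in i1_constraints
            if all(satisfies(p, c) for p in i2_vertices)]


# i1: numItems = summaryCount + 1 (as two inequalities), 3 <= numItems, numItems <= 4
eq_le = ({"numItems": 1, "summaryCount": -1}, 1)
eq_ge = ({"numItems": -1, "summaryCount": 1}, -1)
lower = ({"numItems": -1}, -3)
upper = ({"numItems": 1}, 4)
i1 = [eq_le, eq_ge, lower, upper]

# i2: same line segment, but extended to numItems <= 5
i2_vertices = [{"numItems": 3, "summaryCount": 2},
               {"numItems": 5, "summaryCount": 4}]

print(widen(i1, i2_vertices))
```

Only the upper bound on numItems is dropped, which leaves numItems=summaryCount+1 ∧ 3 ≤ numItems.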
Widening Example
pc=create ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 4 ∧ [shape graph over add and top]
∇
pc=create ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 5 ∧ [shape graph over add and top]
=
pc=create ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ [shape graph over add and top]
Now, the forward-fixpoint converges
Dealing with Arbitrary Number of Processes

Use counting abstraction [Delzanno CAV’00]
– Create an integer variable for each local state of a process
– Each variable will count the number of processes in a particular
state

Local states of the processes have to be finite
– Shared variables of the monitor can be unbounded

Counting abstraction can be automated
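A small sketch of the idea, using the integer-guard variant of push and ignoring the heap. The tuple layout, the function names and the bound `max_items` are assumptions of this sketch; the bound only keeps the explicit exploration finite and is not part of the abstraction.

```python
def successors(state):
    """Counting-abstracted push actions: one counter per local state."""
    create, check, update, mutex, num_items = state
    if create > 0 and mutex:            # push1: create -> checknull
        yield (create - 1, check + 1, update, False, num_items)
    if check > 0 and num_items == 0:    # push2: checknull -> create
        yield (create + 1, check - 1, update, True, 1)
    if check > 0 and num_items > 0:     # push3: checknull -> updateTop
        yield (create, check - 1, update + 1, mutex, num_items)
    if update > 0:                      # push4: updateTop -> create
        yield (create + 1, check, update - 1, True, num_items + 1)


def explore(num_processes, max_items):
    """Explicit reachability from the initial state, cut off at max_items."""
    init = (num_processes, 0, 0, True, 0)
    seen, frontier = {init}, [init]
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            if nxt[4] <= max_items and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


states = explore(num_processes=3, max_items=5)
# The counters always sum to the number of processes.
print(all(c + k + u == 3 for (c, k, u, _, _) in states))
```

The same transition function works for any value of `num_processes`, which is what lets the symbolic verifier treat it as a parameterized constant.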
Stack After Counting Abstraction
module main()
heap top, add, get, newTop;
boolean mutex;
integer numItems;
// Variables for counting the number of processes in each state
integer CreateC, ChecknullC,UpdateTopC;
// Parameterized constant representing the number of processes
parameterized integer numProcesses;
// Initialize initial state counter to the number of processes.
// Initialize other states to 0.
initial: top=null and mutex and numItems=0 and
CreateC=numProcesses and ChecknullC=0 and
UpdateTopC=0;
restrict: numProcesses>0;
module push()
//enumerated pc {create, checknull,updateTop};
initial: add=null;
push1: CreateC>0 and mutex and !mutex' and add'=new and
CreateC'=CreateC-1 and ChecknullC'=ChecknullC+1;
push2: ChecknullC>0 and top=null and top'=add and add'->next=null
and numItems'=1 and ChecknullC'=ChecknullC-1 and
CreateC'=CreateC+1 and mutex';
push3: ChecknullC>0 and top!=null and add'->next=top
and ChecknullC'=ChecknullC-1 and UpdateTopC'=UpdateTopC+1;
push4: UpdateTopC>0 and top'=add and numItems'=numItems+1 and mutex'
and UpdateTopC'=UpdateTopC-1 and CreateC'=CreateC+1;
// When local state changes, decrement current local state
// counter and increment next local state counter
push: push1 | push2 | push3 | push4;
endmodule
Verified Properties
SPECIFICATION       VERIFIED INVARIANTS
Stack               top=null ⇒ numItems=0
                    top≠null ⇒ numItems≠0
                    numItems=2 ⇒ top.next≠null
Single Lock Queue   head=null ⇒ numItems=0
                    head≠null ⇒ numItems≠0
                    (head=tail ∧ head≠null) ⇒ numItems=1
                    head≠tail ⇒ numItems≠0
Two Lock Queue      numItems>1 ⇒ head≠tail
                    numItems>2 ⇒ head.next≠tail
Experimental Results - Verification Times
Each specification has two columns, one for its HC variant and one for its IC variant.
Number of
Processes   Queue    Queue    Stack    Stack    2Lock Queue   2Lock Queue
1P-1C       10.19    12.95    4.57     5.21     60.5          58.13
2P-2C       15.74    21.64    6.73     8.24     88.26         122.47
4P-4C       31.55    46.5     12.71    15.11    -             -
1P-PC       12.85    13.62    5.61     5.73     -             -
PP-1C       18.24    19.43    6.48     6.82     -             -
Related Work

There is a lot of work on shape analysis; I will just mention the ones
which directly influenced us:
– [Sagiv, Reps, Wilhelm TOPLAS’98], [Dor, Rodeh, Sagiv
SAS’00]

Verification of concurrent linked lists with an arbitrary number of processes
in [Yahav POPL’01]
[Lev-Ami, Reps, Sagiv, Wilhelm ISSTA 00] use 3-valued logic and
instrumentation predicates to verify properties that cannot be
expressed in our framework; however, our approach does not require
instrumentation predicates
Deutsch used integer constraint lattices to compute aliasing information
using symbolic access paths [Deutsch PLDI’94]
Use of BDDs goes back to symbolic model checking [McMillan’93] and
verification with arithmetic constraints goes back to [Cousot and
Halbwachs’77]
Conclusions and Future Work

One of the weaknesses of the summarization algorithm we used is
that it only works on singly linked lists
– We need to find a more general summarization algorithm which
counts the number of summary nodes

The implementation is not efficient; we are working on improving the
performance
Liveness properties?
– We would like to do full CTL model checking
– Need to implement the backward image computation
APPENDIX
Action Language Verifier


An infinite state symbolic model checker
Composite representation
– uses a disjunctive representation to combine different symbolic
representations

Computes fixpoints by manipulating formulas in composite
representation
– Heuristics to ensure convergence
• Widening & collapsing
• Loop closure
• Approximate reachable states
Readers Writers Monitor in Action Language
module main()
integer nr;
boolean busy;
restrict: nr>=0;
initial: nr=0 and !busy;
module Reader()
boolean reading;
initial: !reading;
rEnter: !reading and !busy and
nr’=nr+1 and reading’;
rExit: reading and !reading’ and nr’=nr-1;
Reader: rEnter | rExit;
endmodule
module Writer()
boolean writing;
initial: !writing;
wEnter: !writing and nr=0 and !busy and
busy’ and writing’;
wExit: writing and !writing’ and !busy’;
Writer: wEnter | wExit;
endmodule
main: Reader() | Reader() | Writer() | Writer();
spec: invariant([busy => nr=0])
endmodule
Action Language Verifier


An infinite state symbolic model checker
Uses composite symbolic representation to encode a system
defined by (S,I,R)
– S: set of states, I: set of initial states, R: transition relation

Maps each variable type to a symbolic representation type
– Maps boolean and enumerated types to BDD representation
– Maps integer type to arithmetic constraint representation

Uses a disjunctive representation to combine symbolic
representations
– Each disjunct is a conjunction of formulas represented by different
symbolic representations
Conjunctive Decomposition


Each composite atom is a conjunction
Each conjunct corresponds to a different symbolic
representation
– x: integer; y: boolean; h heap
– x>0 and x’=x+1 and y´y
• Conjunct x>0 and x´x+1 will be represented by arithmetic
constraints
• Conjunct y´y will be represented by a BDD
– Advantage: Image computations can be distributed over the
conjunction (i.e., over different symbolic representations).
BDDs





Efficient representation for boolean functions
Disjunction, conjunction complexity: at most quadratic
Negation complexity: constant
Equivalence checking complexity: constant or linear
Image computation complexity: can be exponential
Arithmetic Constraint-Based Verification

Can we use linear arithmetic constraints as a symbolic
representation?
– Required functionality
• Disjunction, conjunction, negation, equivalence checking, existential
variable elimination

Advantages:
– Arithmetic constraints can represent infinite sets
– Heuristics based on arithmetic constraints can be used to
accelerate fixpoint computations
• Widening, loop-closures
Linear Arithmetic Constraints



Disjunction complexity: linear
Conjunction complexity: quadratic
Negation complexity: can be exponential
– Because of the disjunctive representation

Equivalence checking complexity: can be exponential
– Uses existential variable elimination

Image computation complexity: can be exponential
– Uses existential variable elimination
Linear Arithmetic Constraints


Can be used to represent sets of valuations of unbounded
integers
Linear integer arithmetic formulas can be stored as a set of
polyhedra
$$F = \bigvee_{k} \bigwedge_{l} c_{kl}$$
where each $c_{kl}$ is a linear equality or inequality constraint and
each $\bigwedge_{l} c_{kl}$ is a polyhedron
A Linear Arithmetic Constraint Manipulator

Omega Library [Pugh et al.]
– Manipulates Presburger arithmetic formulas: First order theory of
integers without multiplication
– Equality and inequality constraints are not enough: Divisibility
constraints are also needed (a variable is divisible by a constant)



Existential variable elimination in Omega Library: Extension of
Fourier-Motzkin variable elimination to integers
Eliminating one variable from a conjunction of constraints may
double the number of constraints
Integer variables complicate the problem even further
Fourier-Motzkin Variable Elimination

Given two constraints $\beta \le bz$ and $az \le \alpha$ (with a, b > 0) we have
$a\beta \le abz \le b\alpha$

We can eliminate z as:
$\exists z.\ a\beta \le abz \le b\alpha$ if and only if
$a\beta \le b\alpha$
(real shadow)

Every upper and lower bound pair can generate a separate
constraint; the number of constraints can double for each
eliminated variable
Integers are More Complicated

If z is integer
$\exists z.\ a\beta \le abz \le b\alpha$ if $a\beta + (a - 1)(b - 1) \le b\alpha$
(dark shadow)

Remaining solutions can be characterized using periodicity
constraints in the following form:
$\exists z.\ \beta + i = bz$
Consider the constraints:
$\exists y.\ 0 \le 3y - x \le 7 \ \wedge\ 1 \le x - 2y \le 5$
We get the following bounds for y:
$2x \le 6y$
$3x - 15 \le 6y$
$6y \le 2x + 14$
$6y \le 3x - 3$
When we combine 2 lower bounds with 2 upper
bounds we get four constraints:
$0 \le 14$, $3 \le x$, $x \le 29$, $0 \le 12$
Result is: $3 \le x \le 29$
[Figure: the region bounded by x – 5 ≤ 2y, 2y ≤ x – 1, x ≤ 3y and 3y ≤ x + 7 in the (x, y) plane, with its real shadow and dark shadow on the x axis; x = 3 and x = 29 are marked]
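The example can be checked numerically. The sketch below is an illustration, not the Omega Library algorithm: it evaluates the real-shadow and dark-shadow conditions for each pair of bounds and compares them with a brute-force search for an integer y. The encoding of bounds as `(multiplier, (c, d))` pairs and all function names are assumptions of this sketch.

```python
from itertools import product

# Lower bounds on y, beta(x) <= b*y, stored as (b, (c, d)) with beta(x) = c*x + d.
LOWER = [(3, (1, 0)),    # x <= 3y
         (2, (1, -5))]   # x - 5 <= 2y
# Upper bounds on y, a*y <= alpha(x), stored as (a, (c, d)) with alpha(x) = c*x + d.
UPPER = [(3, (1, 7)),    # 3y <= x + 7
         (2, (1, -1))]   # 2y <= x - 1


def lin(f, x):
    c, d = f
    return c * x + d


def in_real_shadow(x):
    """a*beta <= b*alpha for every lower/upper pair."""
    return all(a * lin(beta, x) <= b * lin(alpha, x)
               for (b, beta), (a, alpha) in product(LOWER, UPPER))


def in_dark_shadow(x):
    """a*beta + (a-1)(b-1) <= b*alpha for every lower/upper pair."""
    return all(a * lin(beta, x) + (a - 1) * (b - 1) <= b * lin(alpha, x)
               for (b, beta), (a, alpha) in product(LOWER, UPPER))


def has_integer_solution(x):
    """Brute-force search for an integer y (range chosen large enough for this example)."""
    return any(all(lin(beta, x) <= b * y for b, beta in LOWER) and
               all(a * y <= lin(alpha, x) for a, alpha in UPPER)
               for y in range(-100, 101))


xs = range(-10, 41)
real = [x for x in xs if in_real_shadow(x)]
dark = [x for x in xs if in_dark_shadow(x)]
exact = [x for x in xs if has_integer_solution(x)]
print(real[0], real[-1], dark[0], dark[-1], exact)
```

The exact set lies between the dark shadow and the real shadow; the values outside the dark shadow are the ones that the periodicity constraints have to account for.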
Temporal Properties ≡ Fixpoints
Backward fixpoint for Invariant(p)
– Repeatedly apply backwardImage starting from ¬p
– The result is the set of states that can reach ¬p, i.e., states that violate Invariant(p)
– Initial states in this set are initial states that violate Invariant(p)
Forward fixpoint
– Repeatedly apply forward image starting from the initial states
– The result is the set of reachable states of the system
– Reachable states in ¬p are reachable states that violate p
Simplification Example
(y  z´ = z + 1)
 ((y  x)  z´ = z + 1)

(x  z´ = z + 1)
 ((x  y)  z´ > z)

((x  y)  (z´ = z + 1  z´ > z))

((x  y)  z´  z)

((x  y)  z´ > z)
Polymorphic Verifier
Symbolic TranSys::check(Node *f) {
  ...
  Symbolic s = check(f.left)
  case EX:
    s.backwardImage(transRelation)
  case EF:
    do
      snew = s
      sold = s
      snew.backwardImage(transRelation)
      s.union(snew)
    while not sold.isEqual(s)
  ...
}
Action Language Verifier is polymorphic
When there are no integer variables it becomes a BDD-based model checker
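The EF case above is a backward least fixpoint. A minimal explicit-state sketch, using Python sets in place of the `Symbolic` interface (the function names and the example transition table are assumptions of this sketch):

```python
def backward_image(states, transitions):
    """Pre-condition: states with at least one successor in `states`."""
    return {s for s, succs in transitions.items()
            if any(t in states for t in succs)}


def ef(target, transitions):
    """States from which some path reaches `target` (least fixpoint of s ∪ pre(s))."""
    s = set(target)
    while True:
        s_old = s
        s = s_old | backward_image(s_old, transitions)
        if s == s_old:
            return s


# Hypothetical transition relation: state -> list of successor states.
transitions = {0: [1], 1: [2], 2: [2], 3: [3]}
print(sorted(ef({2}, transitions)))
```

Because the loop only uses image, union and equality, the same code shape works for any representation that offers those three operations, which is the sense in which the verifier is polymorphic.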