Towards Satisfiability of Separation Logic with Integer Arithmetic

ABSTRACT
Decision procedures for satisfiability are important for determining
if some formula is either infeasible (to support entailment proving)
or has a feasible instance (to support failure tracing via counterexample). Recently, a decision procedure was proposed for a fragment of separation logic with shape-only inductive predicates by
Brotherston et al. at LICS 2014. To support automated verification,
we often need a more expressive fragment of separation logic with
other kinds of pure properties, such as size or set.
In this work, we consider the satisfiability problem for a separation
logic fragment comprising inductive predicates with Presburger
linear arithmetic. This extended logic, called SLA1, can handle
richer data structures with sortedness and size properties. We start
by proving that the satisfiability problem in the SLA1 fragment
is undecidable. We then identify a decidable fragment, named SLA2,
where each inductive predicate defines an eventually periodic set. We
prove that satisfiability is decidable in this subsystem by giving a
decision algorithm. We also propose a practical decision procedure
for satisfiability in this fragment. The essence of our procedure is
a mechanism to infer a precise invariant for each inductive predicate
that is equi-satisfiable to its recursive predicate for the SLA1 fragment. Our procedure is based on abstract interpretation, which may
compute an over-approximated invariant for predicates that do not belong to the SLA2 fragment. We use projection to first derive a precise shape-only invariant, before attempting an over-approximated
invariant for the numeric properties of the SLA1 fragment. For invariants that are computed from the non-SLA2 fragment, we can
still use an under-approximation check to determine whether each is in fact a
precise invariant. We prove the soundness of both these procedures,
and provide a prototype implementation to illustrate their feasibility.
Keywords
Combining Decision Procedure, Satisfiability, Separation Logic,
User-Defined Predicates.
1. INTRODUCTION
Separation logic is an extension of Hoare logic to model
states of heap manipulating programs. Its strength comes from the
new separation conjunction operator ∗. The formula κ1 ∗ κ2 specifies that the heap can be split into two disjoint regions in which
κ1 and κ2 hold respectively. Using this operator, we can define recursive predicates that succinctly describe fairly complex shapes of
data structures. With these predicates, we can then specify and verify correctness properties for heap manipulating programs. However, the expressiveness of separation logic comes with a price as
this logic is undecidable in general [1]. To have decidability, one
has to restrict the fragment of separation logic under investigation.
For example, most works so far have only considered a fixed set
of predicates based on the linked-list structure with only pointer
(dis)equality.
In this paper, we consider a more expressive fragment of separation logic for satisfiability. Our fragment includes the empty assertion, points-to predicates and arbitrary user-defined predicates to
model data structures. Moreover, we can use Presburger arithmetic
to describe useful properties of our data structures such as the size of
a list, the height of a tree and even sortedness. An example is illustrated below for a non-empty sorted linked-list predicate, with m as
its smallest value and n as its length.
data node{ int val; node next; } // data type declaration
pred sortll(root,n,m) ≡ root↦node(m, null) ∧ n=1
  ∨ ∃ q, m1 · root↦node(m, q) ∗ sortll(q, n−1, m1) ∧ m≤m1
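To make the reading of sortll concrete, the following is a minimal Python sketch that checks the predicate against one explicit heap. The encoding is an assumption made for illustration only: a heap is a dictionary from integer addresses to (val, next) pairs, None stands for null, and the function name is hypothetical. It is a model checker for a single concrete heap, not the satisfiability procedure of this paper.

```python
from typing import Dict, Optional, Tuple

# Assumed encoding: address -> (val, next); None plays the role of null.
Heap = Dict[int, Tuple[int, Optional[int]]]


def sortll(heap: Heap, root: Optional[int]) -> Optional[Tuple[int, int]]:
    """Return (n, m) if `heap` is exactly a non-empty sorted list from `root`.

    n is the length and m the smallest (first) value; None means the
    predicate does not hold for this heap.
    """
    remaining = dict(heap)          # cells not yet consumed by a points-to
    values = []
    cur = root
    while cur is not None:
        if cur not in remaining:    # dangling pointer or revisited cell
            return None
        val, nxt = remaining.pop(cur)
        if values and values[-1] > val:   # violates m <= m1
            return None
        values.append(val)
        cur = nxt
    if not values or remaining:     # empty list, or cells left over
        return None
    return len(values), values[0]
```

Each cell is consumed exactly once, which mirrors the disjointness enforced by ∗, and leftover cells make the check fail because the formula must describe the whole heap.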
Due to the presence of inductive predicates and an infinite integer domain, this fragment is actually undecidable (see Sec 2.5).
However, we have discovered a significant sub-fragment whose satisfiability is decidable. To support this fragment, we introduce a
new three-stage method to compute over-approximated predicate
invariants. Like [6], the first stage computes a precise shape-only
predicate invariant. The second stage computes a numeric predicate invariant that is either over-approximated or precise. The third
stage computes a combined predicate invariant, that is precise if the
numeric portion is precise. We show that such a precise predicate invariant in the combined domain is equi-satisfiable to its inductive
predicate. We summarize our key contributions as follows:
• We first show that satisfiability of separation logic (with inductive predicates) and Presburger arithmetic is undecidable.
• By restricting the logic to a fragment where each inductive predicate defines an eventually periodic set, we give a constructive proof that satisfiability is decidable.
• To support a working implementation of our decision procedure, we present a three-stage algorithm to compute precise
predicate invariants that are equi-satisfiable to their inductive
predicates.
• As a practical proof of concept, we have implemented this
satisfiability decision procedure for this fragment of separation logic within the existing HIP/SLEEK [8, 16] verification
infrastructure.
• In the Appendix, we also describe an alternative way to infer precise invariants. This utilizes our inference mechanism for over-approximation, and then uses a novel under-approximation check to confirm whether an over-approximating invariant is also precise. This check must ensure that each
non-false under-approximation is also non-empty (i.e. has
at least one satisfiable instance).
2. SEPARATION LOGIC WITH ARITHMETIC
2.1 Syntax
We start with a fragment of separation logic with Presburger
arithmetic. We call this fragment SLA1. It assumes a finite collection of type constructors Ptr, a set of predicate names P , a set
of (program and logical) variables Var, a set Loc of distinct heap
locations, a set of non-address values Val, with null ∈ Val and Val
∩ Loc = ∅.
Predicates          Pred ::= pred P1(v̄)≡Ψ1 ; ··· ; pred Pn(v̄)≡Ψn
Disj. SL            Ψ    ::= ∃v̄·(κ∧α∧φ) | Ψ1 ∨ Ψ2
Heap formula        κ    ::= emp | x↦c(v̄) | P(v̄) | ψ | κ1∗κ2
BAGA formula        ψ    ::= false | (B, α, φ) | ψ1∨ψ2
Ptr (Dis)Eq.        α    ::= true | v1=v2 | v=null | v1≠v2 | v≠null | α1∧α2
Presburger arith.   φ    ::= true | i | ∃v·φ | ∀v·φ | ¬φ | φ1∧φ2 | φ1∨φ2
Linear arith.       i    ::= a1=a2 | a1≤a2
                    a    ::= kint | v | kint×a | a1+a2 | −a | max(a1,a2) | min(a1,a2)
v, vi, x, y ∈ Var    kint ∈ Int    Pi, P ∈ P    c ∈ Ptr    v̄ ≡ v1, ..., vn
This logic fragment is quite expressive as it can use inductive
predicates to describe complex data structures, such as the nearly-balanced AVL tree (omitting sortedness):
data c2 { int val; c2 left; c2 right; }
pred avl(root,s,h) ≡ emp ∧ root=null ∧ s=0 ∧ h=0
  ∨ ∃l, r, s1, s2, h1, h2 · root↦c2(_, l, r) ∗ avl(l, s1, h1) ∗
    avl(r, s2, h2) ∧ s=s1+s2+1 ∧ h=1+max(h1, h2) ∧
    −1≤h1−h2≤1
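As a rough illustration of how the size and height parameters of avl are determined by a heap, here is a small Python sketch. The dictionary encoding of heaps (address to (val, left, right), with None as null) and the function name are assumptions introduced only for this example; the sketch evaluates the predicate on one given heap and does not decide satisfiability.

```python
from typing import Dict, Optional, Tuple

# Assumed encoding: address -> (val, left, right); None plays the role of null.
Heap = Dict[int, Tuple[int, Optional[int], Optional[int]]]


def avl(heap: Heap, root: Optional[int]) -> Optional[Tuple[int, int]]:
    """Return (s, h) if `heap` is exactly an avl tree rooted at `root`, else None."""
    used = set()  # cells already claimed by some points-to; enforces disjointness

    def go(p: Optional[int]) -> Optional[Tuple[int, int]]:
        if p is None:                       # base case: emp, s=0, h=0
            return (0, 0)
        if p not in heap or p in used:      # dangling pointer or shared cell
            return None
        used.add(p)
        _, left_ptr, right_ptr = heap[p]
        left = go(left_ptr)
        if left is None:
            return None
        right = go(right_ptr)
        if right is None:
            return None
        (s1, h1), (s2, h2) = left, right
        if not -1 <= h1 - h2 <= 1:          # balance condition
            return None
        return (s1 + s2 + 1, 1 + max(h1, h2))

    result = go(root)
    if result is None or used != set(heap):  # every cell must be consumed
        return None
    return result
```

For instance, a root with two leaf children yields (3, 2), while a left-leaning chain of three nodes is rejected by the balance condition.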
We use a special formula, called BAGA, to explicitly capture a bag
of addresses to denote the (heap and numeric) abstraction of each
inductive predicate. This BAGA form is itself expressed without any
user-defined predicates. The semantic denotation of this formula is
given in the next sub-section, while Sec 2.2 elaborates on how it
may be inferred from predicate definitions of separation logic.
2.2 BAGA Formula
To support the (un-)satisfiability problem for separation logic
with Presburger arithmetic, we will use a special formula, called
BAGA, to explicitly capture a BAG of Addresses to denote the (heap
and numeric) invariant of each inductive predicate. The syntax of
BAGA and the corresponding abstractions for separation logic formulas and predicates are:
BAGA inv.      ψ   ::= false | (B, α, φ) | ψ1∨ψ2
Disj abstr.    Ψ#  ::= ∃v̄·ψ# | Ψ1# ∨ Ψ2#
BAGA abstr.    ψ#  ::= (B, α, φ) | ψ1# ∗ ψ2# | P#(v̄)
Each basic component of BAGA is a triple with a multi-set of pointer
variables B, a pointer constraint α and an arithmetic constraint φ.
Compared to [6], we have now added an extra arithmetic constraint.
To support satisfiability, we will need to derive precise invariants
for our inductive predicates, but this is not always possible with
Presburger arithmetic. For example, we can derive ({root}, true, n>0)
as the precise invariant that is equi-satisfiable to sortll(root, n, m).
However, for avl(root, s, h), we could derive
({},root=null,h=0∧s=0)∨({root},true ,s≥h>0)
which is an over-approximated invariant lacking a relation between
s and h. Such over-approximation cannot decide satisfiability, but
is still helpful for some scenarios of unsatisfiability.
A BAGA formula is an abstraction of a separation logic formula.
We shall show how to compute an over-approximation for inductive
predicates and how to determine when they are precise. They can
be normalized via the following BAGA-specific rules.
({x} ⊎ B, α, φ) ∧ (α→x=null)        ⇔  false
({x, y} ⊎ B, α, φ) ∧ (α→x=y)        ⇔  false
(B, α, φ) ∧ (α∧φ → false)           ⇔  false
(B1, α1, φ1) ∗ (B2, α2, φ2)         ⇔  (B1⊎B2, α1∧α2, φ1∧φ2)
(B, α, φ1) ∨ (B, α, φ2)             ⇔  (B, α, φ1∨φ2)
∃x∪V · (B, α∧x=y, φ)                ⇔  ∃V · ([y/x]B, [y/x]α, φ)
∃x∪V · (B, α, φ) ∧ typeof(x)∈Ptr    ⇔  ∃V · (B−{x}, ∃x·α, φ)
∃x∪V · (B, α, φ) ∧ typeof(x)=Int    ⇔  ∃V · (B, α, ∃x·φ)
2.3 Semantics
The semantic relation for our logic relies on the following two
functions:
Heaps  =def  Loc ⇀fin (Ptr, Val ∪ Loc)
Stacks =def  Var → Val ∪ Loc
The semantic relation s,h |= Ψ requires the stack s and heap h to
satisfy the constraint Ψ where h ∈ Heaps, s ∈ Stacks, and Ψ is a separation logic formula. As the semantic relation for pure formula is
standard, we define the heap-related components as follows:
s, h |= ({v̄}, α, φ)     iff  s |= α∧φ and {s(v̄)}=dom(h)
s, h |= emp              iff  h={}
s, h |= v↦c(v̄)          iff  l=s(v), h={l → (c, s(v̄))}
s, h |= P(v̄)            iff  s, h |= Ψ, where P(v̄) ≡ Ψ
s, h |= κ1 ∗ κ2          iff  ∃h1, h2 · h1#h2 and h=h1·h2 and s, hi |= κi for i=1, 2
s, h |= ∃v̄·(κ∧α∧φ)     iff  ∃ν̄ · s[v̄↦ν̄], h |= κ and s[v̄↦ν̄] |= α∧φ
s, h |= Ψ1 ∨ Ψ2          iff  s, h |= Ψ1 or s, h |= Ψ2
Note dom(h) returns the domain of the heap h; {} is the empty
heap that is undefined everywhere; h1#h2 denotes that heaps h1
and h2 are disjoint, i.e. dom(h1) ∩ dom(h2) = ∅; h1·h2 denotes the
union of two disjoint heaps; s[v1↦ν1, .., vn↦νn] denotes the stack
defined as s except that s[v1↦ν1, .., vn↦νn](vi) = νi, for 1≤i≤n.
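The clause for κ1 ∗ κ2 quantifies over all ways of splitting h into two disjoint heaps. The following Python sketch spells this out for finite heaps by brute-force enumeration. The representation (heaps and stacks as dictionaries, formulas as Python functions taking a stack and a heap) and all function names are assumptions chosen for this illustration.

```python
from itertools import combinations


def splits(heap):
    """Yield every pair (h1, h2) with h1 # h2 and h1·h2 == heap."""
    addrs = list(heap)
    for r in range(len(addrs) + 1):
        for left in combinations(addrs, r):
            h1 = {a: heap[a] for a in left}
            h2 = {a: heap[a] for a in addrs if a not in h1}
            yield h1, h2


def emp(stack, heap):
    """s, h |= emp iff h is the empty heap."""
    return heap == {}


def points_to(var, ctor, args):
    """s, h |= var↦ctor(args) iff h is the single cell at s(var)."""
    def sat(stack, heap):
        return heap == {stack[var]: (ctor, tuple(stack[a] for a in args))}
    return sat


def star(k1, k2):
    """s, h |= k1 ∗ k2 iff some disjoint split satisfies k1 and k2."""
    def sat(stack, heap):
        return any(k1(stack, h1) and k2(stack, h2) for h1, h2 in splits(heap))
    return sat
```

A formula such as x↦node(y) ∗ x↦node(y) is unsatisfiable under this reading, since no split can place the cell at s(x) in both halves.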
Using this relation, we can semantically classify predicate invariants, as follows:
DEFINITION 1 (PREDICATE INVARIANTS).
Given a predicate P(v) ≡ Ψ and a BAGA formula ψ:
• ψ is an over-approximated invariant of P if, for all s and h, s,h |= Ψ implies ∃h′ · dom(h′)⊆dom(h) ∧ s,h′ |= ψ.
• ψ is a precise invariant of P if ψ is an over-approximated invariant of P and, for all s and h, s,h |= ψ implies ∃h′ · dom(h)⊆dom(h′) ∧ s,h′ |= Ψ.
The first definition essentially states that a formula ψ is an over-approximation if every satisfiable instance s,h |= Ψ is also an instance of ψ, up to a sub-heap h′ with dom(h′)⊆dom(h). The BAGA counterpart ψ may refer to the same or fewer heap locations than the original
formula Ψ. The second definition states that an over-approximated
formula ψ is precise when it is also a valid under-approximation. A
formula ψ is an under-approximation of Ψ if, for every instance s,h
that belongs to ψ, we are guaranteed an instance:
∃h′ · dom(h)⊆dom(h′)∧s,h′ |= Ψ
that satisfies the original predicate definition, Ψ. In other words, a
formula ψ is a precise invariant if it is equi-satisfiable with Ψ.
2.4 From Separation Logic to BAGA
To derive the invariant for each predicate P(v)≡Ψ, we shall build its
BAGA abstraction using P#(v)≡A[Ψ], where A is defined as follows:
A[emp]        =df  ({ }, true, true)
A[v↦c(_)]     =df  ({v}, true, true)
A[κ1∗κ2]      =df  A[κ1] ∗ A[κ2]
A[κ∧α∧φ]      =df  A[κ] ∗ ({ }, α, φ)
A[⋁ ∃v·∆]     =df  ⋁ ∃v·A[∆]
A[P(v)]       =df  P#(v)
Using the sortll predicate as our running example, we obtain:
sortll#(root,n,m) ≡ ({root}, true, n=1) ∨ ∃ q,m1 · ({root}, true, m≤m1) ∗ sortll#(q, n−1, m1)
Indeed, A is an equi-satisfiability reduction.
LEMMA 2.1. Given a base formula ∆ and ψ ≡ A[∆]. ∆ is
satisfiable iff there is a disjunct (B, α, φ) of ψ such that both α and
φ are satisfiable.
PROOF. Given a base formula ∆ and (B, α, φ)≡A[∆], we prove
(∀s, ∃h· s, h |= ∆ iff s |= α ∧ s |= φ) by induction on the number of ∗
in κ, i.e. size(κ).
Base case. size(κ)=0. Two cases:
• ∆≡emp∧α∧φ and A[∆]≡({}, α, φ). Trivial.
• ∆≡x↦c(vi)∧α∧φ and A[∆]≡({x}, α, φ). For all s such that
s(x)=l and l≠h(null), we have s |= x≠null. Thus,
s, h |= x↦c(vi)∧α∧φ ⇔ s |= α∧φ (a).
From (a) and under the assumption that heap and integer domains are disjoint, s |= x≠null∧π ⇔ s |= α∧s |= φ.
Induction case.
Assume that Lemma 2.1 is valid for all heaps κ with size(κ)=k≥0; we
now prove that Lemma 2.1 is also valid for heap size k′, where
k′=k+1.
Since k′≥1, there exist κ1 and κ2 such that size(κ1)≤k, size(κ2)≤k
and κ≡κ1∗κ2. Thus, we have
s, h |= κ1∗κ2∧π ⇔ ∃h1, h2 · size(h1)≤k, size(h2)≤k and h1#h2 and h=h1·h2 and
s, h1 |= κ1∧π and s, h2 |= κ2∧π (1)
h1#h2 ⇔ s |= ⋀{v1≠v2 | v1↦_(_) ∈ κ1 and v2↦_(_) ∈ κ2} (2)
s, h1 |= κ1∧π ⇔ s |= A[κ1] (by induction hypothesis) (3)
s, h2 |= κ2∧π ⇔ s |= A[κ2] (by induction hypothesis) (4)
From (2), (3), (4), and the semantics of ∧, we obtain
s |= ⋀{v1≠v2 | v1↦_(_) ∈ κ1 and v2↦_(_) ∈ κ2}∧A[κ1]∧A[κ2]
⇔ s |= A[κ1∗κ2] (definition of A over ∗ reduction) (5)
From (1), (5) and the semantics of ∧, we obtain
s |= A[κ1∗κ2]∧π ⇔ s |= A[κ1∗κ2∧π] (definition of A over κ∧π)
⇔ s |= A[κ∧π].
Therefore, Lemma 2.1 is proven for heap size k+1.
LEMMA 2.2. Given a predicate P(v)≡Ψ and ψ ≡ A[Ψ]. P(t)
is satisfiable iff there is a disjunct (B, α, φ) of ψ[v̄ := t̄] such that
both α and φ are satisfiable.
THEOREM 2.3. Given a formula Ψ and ψ ≡ A[Ψ]. Ψ is satisfiable iff there is a disjunct (B, α, φ) of ψ such that both α and φ
are satisfiable.
2.5 Undecidability
Without any restrictions on the shape and Presburger arithmetic
of inductive definitions for user-defined predicates, satisfiability
is undecidable in SLA1.
THEOREM 2.4. The satisfiability of a formula is undecidable in
SLA1.
Proof. Define
pred PW(x, y) ≡ emp ∧ (x = 0 ∧ y = 1)
  ∨ ∃x1, y1.(emp ∗ PW(x1, y1) ∧ x = x1 + 1 ∧ y = 2 × y1)
pred NPW(x, y) ≡ ∃z.(PW(x, z) ∧ y ≠ z)
Then the satisfiability of PW(n, m) is equivalent to m = 2^n and
the satisfiability of NPW(n, m) is equivalent to m ≠ 2^n for natural
numbers m, n.
Presburger arithmetic with the predicate y = 2^x is known to be
equivalent to Peano arithmetic. Hence we have some Σ^0_1 formula
F(x) in Presburger arithmetic with the predicate y = 2^x such that
F(n) is true iff the n-th Turing machine terminates, for every natural number n.
We can assume F(x) is in prenex normal form and its quantifier-free
part is in disjunctive normal form. Construct F1(x) from
F(x) by replacing every occurrence of the predicate ¬(y = 2^x) by
NPW(x, y). Then construct F2(x) from F1(x) by replacing every occurrence
of the predicate y = 2^x by PW(x, y). Since F2(x) has negations
only in front of arithmetical atomic formulas and it does not have
any universal quantifiers, we can transform F2(x) to an equivalent
formula Φ(x) in SLA1. Then the satisfiability of Φ(x) in SLA1
is equivalent to the truth of F(x) in Presburger arithmetic with the
predicate y = 2^x. Therefore Φ(n) is satisfiable in SLA1 iff the
n-th Turing machine terminates, for every natural number n. Consequently the satisfiability in SLA1 is undecidable. ✷
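The predicates PW and NPW used in the proof of Theorem 2.4 can be explored by bounded unfolding. The Python sketch below is only an illustration under stated assumptions: the function names and the explicit depth bound are hypothetical, and the inner predicate of NPW is read as PW. It enumerates the pairs derivable within a fixed number of unfoldings and shows that they are exactly the pairs (x, 2^x).

```python
def pw_instances(depth):
    """Pairs (x, y) derivable from PW using at most `depth` inductive unfoldings."""
    pairs = {(0, 1)}                       # base case: x = 0 ∧ y = 1
    for _ in range(depth):
        # inductive case: x = x1 + 1 ∧ y = 2 × y1 for a known pair (x1, y1)
        pairs |= {(x1 + 1, 2 * y1) for (x1, y1) in pairs}
    return pairs


def npw(x, y, depth):
    """NPW(x, y): some z with PW(x, z) and y != z, within the unfolding bound."""
    return any(px == x and z != y for (px, z) in pw_instances(depth))
```

Within the bound, npw(x, y, depth) holds precisely when y differs from 2^x, which is the property the reduction relies on.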
3. DECIDABLE FRAGMENT OF SEPARATION LOGIC & ARITHMETIC
This section presents a decidable subsystem of SLA1, and gives
an algorithm for the satisfiability. We call this subsystem SLA2.
SLA2 covers our sorted-list predicate sortll. We will prove the
decidability by reducing it to the decidability of the arithmetical
part by using the decision algorithm of the non-arithmetical part
given in [6]. First we will present a decidable fragment of Presburger arithmetic and inductive definitions. We call this fragment
DPI. Then we reduce the satisfiability in SLA2 to the decidability
in DPI.
3.1 Decidable fragment DPI of Presburger arithmetic with inductive definitions
We define a fragment DPI of Presburger arithmetic with inductive definitions. The idea is that we impose some restrictions on
the inductive definitions so that the inductive predicate defines an
eventually periodic set. Since the decidability proof of Presburger
arithmetic relies on the fact that a definable set is exactly an eventually periodic set, this restriction enables us to use the same proof
idea for its extension with inductive definitions.
Linear arithmetic terms a are the same as those in SLA1. We
extend linear arithmetic atomic formulas i in SLA1 by adding inductive predicates P (v̄). Presburger conjunctive formulas φ are
similarly defined from these atomic formulas. Presburger disjunctive formulas Φ are defined by Φ ::= ∃v̄.φ|Φ ∨ Φ. The inductive
definition is pred P (v̄) ≡ Φ. For simplicity, we assume only one
inductive predicate P, but this decidability result can be straightforwardly extended to more than one inductive predicate.
Let Φ be ⋁_j φj. We call φj a base case when P does not appear
in it, and we call it an induction case when P appears in it.
Let the arity of P be m.
Restriction 1. Our first condition is that the definition body of P
has only one induction case.
Let the induction case of the inductive definition pred P(x̄) be
∃z̄. i ∧ ⋀_{1≤l≤L} P(ā_l).
Note that P(x̄) denotes P(x1, ..., xm) and P(ā_l) denotes
P(a_l1, ..., a_lm).
Restriction 2. Our second condition is that the above i of the induction case is
⋀_{1≤j≤m} ij such that ij is one of the following:
(1) xj = f(z̄j) + c,
(2) xj ≥ f(z̄j) + c,
(3) xj ≤ f(z̄j) + c,
(4) a conjunction of the forms n xj = f(z̄j), n xj ≥ f(z̄j), and
n xj ≤ f(z̄j),
where c, n are some integer constants, n > 0, and
f(z̄j) is a combination of zj1, ..., zjL with max, min, defined by
f(z̄j) ::= zjl | max(f(z̄j), f(z̄j)) | min(f(z̄j), f(z̄j)).
Then we can let the inductive definition of P be
pred P(x̄) ≡ ⋁_k ∃z̄k.ik ∨ ∃z̄. ⋀_{1≤j≤m} ij ∧ ⋀_l P(z̄^l).
Example. The arithmetical part P of sortll is inductively defined by
pred P(n, m) ≡ n = 1 ∨ ∃q, m1.(P(n − 1, m1) ∧ m ≤ m1).
This satisfies the above conditions. Note that P(n − 1, m1) is an
abbreviation of ∃z(z = n − 1 ∧ P(z, m1)).
THEOREM 3.1. The truth of a given disjunctive formula Φ is
decidable in DPI.
Proof. Define a set S of integers to be eventually periodic if there
are some M ≥ 0, p1, p2 > 0 such that n ∈ S iff n + p1 ∈ S for
all n ≥ M, and n ∈ S iff n − p2 ∈ S for all n ≤ −M. Then we
call the set (M, p1, p2)-periodic.
Let
S = {(x1, ..., xm) | P(x1, ..., xm)},
Q = {(x1, ..., xm) | Φ0(x1, ..., xm)}
where Φ0 is the disjunction of all the base cases of the definition of P. We write Sj for {xj | P(x1, ..., xm)}, and Qj for
{xj | Φ0(x1, ..., xm)}. We have S = S1 × ... × Sm and Q =
Q1 × ... × Qm, since the j-th value xj depends on only the previous j-th value zj in the definition of P.
It is known that a set definable in Presburger arithmetic is exactly
an eventually periodic set. Hence each Qj is eventually periodic.
Let Qj be (M, p1, p2)-periodic.
In the case (1), we can show that Sj is (0, |c|, |c|)-periodic if
c ≠ 0, and (M, p1, p2)-periodic if c = 0.
In the case (2), we can show that Sj is Qj ∪ {x | x > (min Qj) + c}
if Qj is downward finite and c ≥ 0, and Sj is Z otherwise.
In the case (3), we can show that Sj is Qj ∪ {x | x < (max Qj) + c}
if Qj is upward finite and c ≤ 0, and Sj is Z otherwise.
In the case (4), we can show that Sj is also (M, p1, p2)-periodic.
Since the truth of P(n1, ..., nm) is equivalent to ⋀_{1≤j≤m} (nj ∈ Sj)
and each Sj is eventually periodic, we can decide the truth of
a given formula that contains P. ✷
The decision procedure for DPI is obtained by computing the
above M, p1, p2 according to the decidability proof.
3.2 Decidable fragment SLA2
We define a subsystem SLA2 of SLA1 as follows. The restrictions are essentially the same as those imposed on DPI. The idea
of the decision procedure is that since the arithmetical part and the
non-arithmetical part are independent, we can split the satisfiability
of a given formula into the satisfiability of its arithmetical part and
its non-arithmetical part. However, these two parts need to synchronize. Since the unfolding of the non-arithmetical part terminates within some k steps according to [6], we consider two cases:
one case when the number of unfoldings of the whole formula is less
than k, and the other case when it is equal to or greater than k. The
first case can be handled by a finite number of unfoldings. In the
second case, we can use the inductive predicate for the arithmetical
part. In order to realize this idea, we will take k to be the maximum
number of possible (B, α)'s instead of the number of iterations.
Our condition is that for the inductive definition pred P ≡ Φ,
the inductive definition pred P_N^# ≡ PN(A[Φ]) satisfies the condition of DPI.
THEOREM 3.2. The satisfiability of a given disjunctive formula
Φ is decidable in SLA2.
Proof. We will discuss only the case where P is an inductive
predicate since a general case can be proved by straightforwardly
extending the proof of this case. For simplicity, we assume P takes
one pointer variable and one numeric variable. For simplicity, we
also assume P appears exactly once in the induction case. The general case can be proved by straightforwardly extending the proof of
this case.
Let the inductive definition be
pred P(x, y) ≡ Φ0(x, y) ∨ Φ1(x, y),
where Φ0(x, y) is the disjunction of all the base cases and Φ1(x, y)
is the induction case.
We define P^{#(k)}(x, y) by
P^{#(0)}(x, y) ≡ A[Φ0(x, y)],
P^{#(k+1)}(x, y) ≡ A[Φ1(x, y)][P#(u, v) := P^{#(k)}(u, v)].
P^{#(k)}(x, y) is the k-times iteration of A[Φ1(x, y)] to A[Φ0(x, y)].
We define P_S^{#(k)}(x) as PS[P^{#(k)}(x, y)] and P_N^{#(k)}(y) as
PN[P^{#(k)}(x, y)]. We also define P_S^{#(≤k)}(x) as ⋁_{0≤j≤k} P_S^{#(j)}(x).
We define P_N^{#(>k)}(y) by
pred P_N^{#(>k)}(y) ≡ PN[P^{#(k+1)}(x, y)] ∨ PN[A[Φ1(x, y)]][P_N^# := P_N^{#(>k)}].
Suppose we are deciding the satisfiability of P(t, a) where t is a
pointer variable or null and a is an arithmetical term.
Let k be 2^n n(n + 1) where n is the number of pointer variable
arguments. It is the maximum number of possible (B, α)'s. In the
current simplified case, k = 4, since n = 1.
The decision algorithm is as follows: Return the satisfiability of
⋁_{0≤j≤k} P^{#(j)}(t, a) ∨ P_S^{#(k)}(t) ∧ P_N^{#(>k)}(a).
The algorithm for the non-arithmetical part is based on the same
idea as that in [6].
The satisfiability of P_N^{#(>k)}(a) is decided by Theorem 3.1.
We show its correctness as follows. Assume P(t, a) is satisfiable. We will show the algorithm returns yes. Then we have some
n such that P^{#(n)}(t, a) is satisfiable.
If n ≤ k, then the algorithm returns yes by the first disjunct.
Assume n > k. Then both P_S^{#(n)}(t) and P_N^{#(n)}(a) are satisfiable.
Since n > k, P_S^{#(k)}(t) is satisfiable. Since P_N^{#(n)}(a) implies
P_N^{#(>k)}(a), the algorithm returns yes.
For showing the other direction, assume the algorithm returns
yes for the input t, a. We will show P(t, a) is satisfiable.
We consider cases according to
⋁_{0≤j≤k} P^{#(j)}(t, a) ∨ P_S^{#(k)}(t) ∧ P_N^{#(>k)}(a).
If P^{#(j)}(t, a) is satisfiable for some 0 ≤ j ≤ k, then
P(t, a) is satisfiable.
Assume P_S^{#(k)}(t) ∧ P_N^{#(>k)}(a) is satisfiable. Then we have
some n > k such that P_N^{#(n)}(a) is satisfiable.
Since P_S^{#(k)}(t) is satisfiable and k is the maximum number of
possible (B, α)'s, we have some q < k such that P_S^{#(q)}(t) and
P_S^{#(q+1)}(t) have some common (B, α) and (B, α) is satisfiable.
Then P_S^{#(m)}(t) is satisfiable for all m ≥ q. Then P_S^{#(n)}(t) is
satisfiable. Therefore P^{#(n)}(t, a) is satisfiable, and hence P(t, a)
is satisfiable. ✷
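The notion of an (M, p1, p2)-periodic set from the proof of Theorem 3.1 admits a finite representation, which is what makes membership decidable. The Python sketch below is a minimal illustration under assumptions of our own: the class name and the choice to store membership only for the window −M−p2 < n < M+p1 are hypothetical and not taken from the paper. Membership of any other integer is obtained by shifting it into the window by a multiple of the relevant period.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PeriodicSet:
    """An (M, p1, p2)-periodic set of integers, stored through a finite window."""
    M: int
    p1: int
    p2: int
    window: FrozenSet[int]   # the members n with -M - p2 < n < M + p1

    def __contains__(self, n: int) -> bool:
        if n >= self.M + self.p1:
            # n ∈ S iff n - p1 ∈ S while n - p1 >= M: shift down into [M, M + p1)
            n = self.M + (n - self.M) % self.p1
        elif n <= -self.M - self.p2:
            # n ∈ S iff n + p2 ∈ S while n + p2 <= -M: shift up into (-M - p2, -M]
            n = -self.M - ((-self.M - n) % self.p2)
        return n in self.window
```

For example, the set {n | n > 0}, which is the numeric invariant of sortll in its length parameter, is (1, 1, 1)-periodic with window {1}, and the even natural numbers are (1, 2, 1)-periodic with window {0, 2}.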
Example. The algorithm goes for sortll as follows: We have
k = 4 since the number of pointer variable arguments is 1.
sortll^{#(0)}(root, n, m) = ({root}, true, n = 1),
sortll^{#(1)}(root, n, m) =
∃q, m1.({root}, true, m ≤ m1) ∗ ({q}, true, n − 1 = 1).
Since
sortll_S^{#(0)}(root) = ({root}, true),
sortll_S^{#(1)}(root) = ∃q, m1.({root}, true) ∗ ({q}, true)
and they are equivalent, sortll_S^{#(4)}(root) is equivalent to
({root}, true). Hence the satisfiability of sortll(root, n, m)
is decided by checking the satisfiability of
⋁_{0≤j≤4} sortll^{#(j)}(root, n, m) ∨ sortll_S^{#(4)}(root) ∧ sortll_N^{#(>4)}(n, m)
where
pred sortll_N^{#(>4)}(n, m) ≡ n = 6
  ∨ ∃q, m1.(sortll_N^{#(>4)}(n − 1, m1) ∧ m ≤ m1).
4. INFERRING BAGA INVARIANT
In the previous section, we have identified an arithmetic fragment with inductive predicates over Presburger formulas, named DPI,
for which it is always possible to constructively build a precise
(or equi-satisfiable) numeric invariant. In this section, we propose
a practical algorithm for inferring precise invariants. Our algorithm is based on abstract interpretation, which attempts to
infer an over-approximation for each inductive predicate. We use
this approach since it is a standard and practical way of deriving numeric invariants, which are typically over-approximations
of the inductive predicates themselves. Leveraging the periodic set property of the DPI class of formulas, we can further confirm
that equi-satisfiable numeric invariants can always be inferred for
this class of inductive numeric predicates. Through two projections, followed by a combination of shape and numeric domains,
we show that it is always possible for each heap-based inductive
predicate with the DPI property to have a precise BAGA invariant.
This outcome leads to a decidable satisfiability algorithm for this
class of formulas. Nevertheless, we also hope to go beyond this decidable fragment by proposing an under-approximation check (see
Sec A) which could confirm when some over-approximation is also
an under-approximation. This check provides a novel alternative to
confirm the presence of a precise (or equi-satisfiable) predicate invariant. As it is not strictly required by our current proposal on
a decidable satisfiability SLA2 fragment, we have placed it in the
Appendix for discussion and future consideration. The next subsection provides details on how we infer over-approximating numeric predicate invariants which are provably precise for the DPI
fragment.
4.1 Computing Over-Approximation
With the BAGA abstraction, we shall infer its invariant in three
stages. First, we compute the shape-only invariant, by removing
(via projection) non-pointer variables from its constraints and solving its shape-only abstraction using the SLSAT algorithm [6]. Second, we infer the numeric-only invariant by removing pointer variables from the set of constraints and then inferring its numerical
invariant using a fix-point computation based on a disjunctive abstract interpretation (using FixCalc [25]). In the third stage, we
combine via conjunction the two fixed points obtained from the
prior two steps as an initial invariant for the combined domains.
This combined invariant may be imprecise but can be refined by
unfolding its predicate's BAGA abstraction a small number of times.
We illustrate these three stages using the sortll predicate.
Stage 1. We first compute the shape-only abstraction by a projection
process PS:
PS[(B, α, φ)]    =df  (B, α)
PS[P#(v)]        =df  P_S^#(v_S)
PS[ψ1# ∗ ψ2#]    =df  PS[ψ1#] ∗ PS[ψ2#]
PS[⋁ ∃v·ψ#]      =df  ⋁ ∃v_S·PS[ψ#]
This essentially drops numeric constraints and parameters from
the BAGA abstraction. For each abstract predicate P#(v), we obtain a
shape-only predicate P_S^#(v_S) such that v_S = v if typeof(v)∈Ptr. We
then infer the invariant over the shape-only domain for PS(v_S) using
the algorithm in [6]. For our running example, we derive:
sortll_S^#(root) =df ({root}, true).
Stage 2. We next compute the numeric-only abstraction by another
projection PN:
PN[(B, α, φ)]    =df  φ
PN[P#(v)]        =df  P_N^#(v_N)
PN[ψ1# ∗ ψ2#]    =df  PN[ψ1#] ∧ PN[ψ2#]
PN[⋁ ∃v·ψ#]      =df  ⋁ ∃v_N·PN[ψ#]
Using the sortll predicate as our running example, we obtain:
sortll_N^#(n,m) ≡ n=1 ∨ ∃ m1 · m≤m1 ∧ sortll_N^#(n−1, m1)
This numerical abstraction is subjected to a fix-point computation yielding a numerical invariant: sortll_{N(fix)}^#(n,m) ≡ n>0.
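The two projections PS and PN of Stages 1 and 2 are structural recursions over the BAGA abstraction. The Python sketch below illustrates them on a small syntax tree. Everything about the encoding is an assumption made for this example: the class names, the use of strings for constraints, the explicit "ptr"/"int" tags on predicate arguments, and the tuple-shaped results. The case for ⋁ ∃ is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:                      # a triple (B, α, φ)
    bag: Tuple[str, ...]
    alpha: str
    phi: str


@dataclass(frozen=True)
class Pred:                      # an abstract predicate instance P#(v)
    name: str
    args: Tuple[Tuple[str, str], ...]   # (argument, "ptr" or "int")


@dataclass(frozen=True)
class Star:                      # ψ1# ∗ ψ2#
    left: "Baga"
    right: "Baga"


Baga = Union[Base, Pred, Star]


def proj_shape(psi: Baga):
    """PS: keep (B, α) and pointer arguments; ∗ stays ∗."""
    if isinstance(psi, Base):
        return (psi.bag, psi.alpha)
    if isinstance(psi, Pred):
        return (psi.name + "_S", tuple(v for v, t in psi.args if t == "ptr"))
    return ("*", proj_shape(psi.left), proj_shape(psi.right))


def proj_numeric(psi: Baga):
    """PN: keep φ and integer arguments; ∗ becomes ∧."""
    if isinstance(psi, Base):
        return psi.phi
    if isinstance(psi, Pred):
        return (psi.name + "_N", tuple(v for v, t in psi.args if t == "int"))
    return ("and", proj_numeric(psi.left), proj_numeric(psi.right))
```

Applied to the inductive branch of sortll#, proj_numeric yields the conjunction of m≤m1 with the numeric predicate instance on (n−1, m1), matching the body of the numeric abstraction shown in Stage 2.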
Stage 3.
We combine the two invariants before applying unfolding to obtain a more precise fix-point. In essence, this step
eliminates any model that is satisfiable in the separate domains
but is not in the combined domain. The initial combined invariant is
inv0(v) ≡ ⋁{(B, α, P_N^#(v_N)) | (B, α) ∈ P_S^#(v_S)}.
Steps for each (k-th) unfolding:
• Unfold the BAGA abstraction of the predicate P# by substituting recursive predicate instances with the prior invariant
invk−1(v);
• Normalize and combine the branches. As the invariant over the numerical domain covers all predicate branches, this step only
refines the third component of each disjunct of invk(v). Hence, the
number of disjuncts of every invk(v) is not changed.
• We check if the prior invariant is as strong as the new one:
invk−1(v) =⇒ invk(v). If this is so, we have reached a fixed
point for the combined domain. Implication over BAGA formulas is checked pairwise, namely: invk−1(v) =⇒ invk(v)
iff ∀(B,α,φi)∈invk−1(v) · ∃(B,α,φj)∈invk(v) · φi =⇒ φj.
For our running example, we derive
inv0(root,n,m) ≡ ({root}, true, n>0)
Unfolding gives:
inv1(root,n,m) ≡ ({root}, true, n=1) ∨ ({root}, true, n>1)
               ≡ ({root}, true, n=1∨n>1)
Since inv0 =⇒ inv1, we have detected inv0 as its combined fixed
point. Note that we only apply repeated unfolding for precise combined fix-points. For imprecise fix-points, we only unfold once, as
Stage 3 may loop otherwise.
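The pairwise implication check of Stage 3 can be sketched in Python as follows. This is an illustration under explicit assumptions: arithmetic constraints are modelled as Python functions over (n, m), bags as frozensets rather than multi-sets, and the implication between constraints is tested by enumeration over a small bounded range instead of by a Presburger solver. The names are hypothetical and the bounded test is not a decision procedure.

```python
from itertools import product

# Bounded test domain for n and m: an assumption standing in for a real solver.
RANGE = range(-3, 8)


def implies(phi_i, phi_j):
    """Bounded check of phi_i ⟹ phi_j over RANGE × RANGE."""
    return all(phi_j(n, m) for n, m in product(RANGE, RANGE) if phi_i(n, m))


def baga_implies(inv_prev, inv_next):
    """Each (B, α, φi) of inv_prev needs a (B, α, φj) of inv_next with φi ⟹ φj."""
    return all(
        any(b1 == b2 and a1 == a2 and implies(p1, p2) for (b2, a2, p2) in inv_next)
        for (b1, a1, p1) in inv_prev
    )


# Running example: inv0 and the normalized inv1 of sortll.
inv0 = [(frozenset({"root"}), "true", lambda n, m: n > 0)]
inv1 = [(frozenset({"root"}), "true", lambda n, m: n == 1 or n > 1)]
```

On the running example, baga_implies(inv0, inv1) holds, which corresponds to detecting inv0 as the combined fixed point.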
LEMMA 4.1 (COMBINED INVARIANT).
Split P#(v) = Ψ#[P#(w)] into two disjoint abstractions: P_S^#(v) =
Ψ_S^#[P_S^#(w)] and P_N^#(v) = Ψ_N^#[P_N^#(w)], for which fix-points
fixS(v) ⇐⇒ Ψ_S^#[fixS(w)] and Ψ_N^#[fixN(w)] =⇒ fixN(v) exist.
The conjunct of fix-points, namely fixC0(v) = fixS(v)∧fixN(v), is
an over-approximated fix-point Ψ#[fixC0(w)] =⇒ fixC0(v) of the
combined domain. Furthermore, repeated unfolding of the form
fixCn(v) = Ψ#[fixCn−1(w)] would derive an over-approximated
invariant for the combined domain.
The following soundness theorem and its supporting lemma are proved in
a similar way to the proof of Theorem 3.2 given in Sec 3.
LEMMA 4.2. Let P(v) be an inductive predicate and fix(w)
be a fixed point computed by the three stages above. If s,h |= P(v),
there exist a disjunct BAGA formula (B, α, φ) of fix(w) and s′ such
that s ⊆ s′, s′ |= α, s′ |= φ and s(B)⊆dom(h).
THEOREM 4.3 (SOUNDNESS). Let P(v) be an inductive predicate and fix(w) be a fixed point computed by the three stages
above. If fix(w)≡false, then P(v) is unsatisfiable.
If the pure fix-point in Stage 2 is precise, we eventually infer a precise fix-point.
THEOREM 4.4 (PRECISE INVARIANT).
Split P#(v) = Ψ#[P#(w)] into two disjoint abstractions: P_S^#(v) =
Ψ_S^#[P_S^#(w)] and P_N^#(v) = Ψ_N^#[P_N^#(w)], for which precise fix-points
fixS(v) ⇐⇒ Ψ_S^#[fixS(w)] and Ψ_N^#[fixN(w)] ⇐⇒ fixN(v) exist.
The conjunct of fix-points, namely fixC0(v) = fixS(v)∧fixN(v), is
a precise fix-point Ψ#[fixC0(w)] ⇐⇒ fixC0(v) of the combined
domain. Furthermore, repeated unfolding of the form fixCn(v) =
Ψ#[fixCn−1(w)] would eventually derive a precise invariant for
the combined domain.
Proof Sketch. We rely on the premise that the heap and integer abstract domains are disjoint and that their precise fix-points are computed independently. For a precise heap invariant, we have a finite set of heap configurations that is preserved by unfolding. For the numeric domain, we can use disjunction to combine the pure formulas of identical heap configurations without loss of information. Termination of unfolding follows from the facts that the set of heap configurations is finite and that the numeric fix-point is precise.
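As an illustration, the three stages can be traced by hand on a singly linked list with a size property. This is only a sketch: the predicate definition $\mathtt{ll}$ below is an assumed, standard definition that is not quoted from this section, and the existentially quantified $q$ is projected out of the base set by hand. The result agrees with the Singly llist row of Table 1.

```latex
\begin{align*}
\mathtt{ll}(root, n) &\equiv (root = null \wedge n = 0)
   \;\vee\; \exists q \cdot root \mapsto node(q) * \mathtt{ll}(q, n-1)
   && \text{(assumed definition)} \\
fix_S(root) &= (\{\}, root = null) \;\vee\; (\{root\}, true)
   && \text{(shape fix-point)} \\
fix_N(n) &= n \geq 0
   && \text{(numeric fix-point)} \\
fix_{C_0}(root, n) &= (\{\}, root = null, n \geq 0) \;\vee\; (\{root\}, true, n \geq 0)
   && \text{(conjunction)} \\
fix_{C_1}(root, n) &= (\{\}, root = null, n = 0) \;\vee\; (\{root\}, true, n - 1 \geq 0)
   && \text{(one unfolding)} \\
 &= (\{\}, root = null, n = 0) \;\vee\; (\{root\}, true, n > 0)
\end{align*}
```

Unfolding $fix_{C_1}$ once more reproduces the same formula, so in this small case a single unfolding already reaches a combined fix-point.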
Our over-approximating fixpoint computation yields a precise numeric invariant for the DPI fragment. For more complex formulas, we may still derive an over-approximating invariant. To check whether such invariants are in fact precise, we propose a procedure for an under-approximation check in App. A. To be sound, this procedure also checks that every non-false invariant has at least one satisfiable instance.
5. EXPERIMENTAL ASSESSMENT
We have implemented the procedures proposed in Sec. 4 and Appendix A and integrated them into HIP/SLEEK [8, 17, 16] for verifying heap-based programs. We use the Omega Calculator [26] to eliminate existential quantifiers, Z3 [12] as a back-end SMT solver, and FixCalc [25] to compute fixed points for the pure domain. The URL for our demo web-site is available on request for double-blind reasons.
In the rest of this section, we show the capability of our invariant inference and its application in program verification. The experiments were performed on a machine with an Intel i7-960 (3.2GHz) processor and 16 GB of RAM.
5.1 Invariant Inference
Using our proposed procedure, we have inferred precise invariants for a broad range of data structures. Some examples include
cyclic linked-list, list segment, linked-list with even size, binary
tree, and tree with linked leaves. In all these predicates, size n is
the pure property. The inferred invariants are shown in Table 1.
While the precise invariants for these common predicates are fairly straightforward, we can also infer precise invariants that are non-trivial. An example is the bndll predicate below, which describes a singly linked list in which every node value lies between a lower bound l and an upper bound u, and no two consecutive nodes have the same value.
pred bndll(root,p,l,u,n) ≡ root=null ∧ n=0
  ∨ ∃v, q · root↦node(v, q) ∗ bndll(q,v,l,u,n−1) ∧ l≤v≤u ∧ v≠p
Our inference system automatically derives the following precise
invariant for it.
({}, root=null, n=0) ∨
({root}, true , n=1∧l≤u∧(l≤p−1∨p<u)) ∨
({root}, true , n>1 ∧ l+1≤u ∧ (l≤p−1 ∨ p<u))
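The numeric part of this invariant can be cross-checked by brute force on small integer ranges. The sketch below is not part of our implementation: it ignores the heap, treats only the pure constraints of the bndll definition, and the function names and the chosen ranges are illustrative assumptions.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def bndll_sat(p, l, u, n):
    """True iff the pure constraints of bndll(root, p, l, u, n) have a model."""
    if n < 0:
        return False  # neither the base case nor the inductive case applies
    if n == 0:
        return True   # base case: root = null and n = 0
    # Inductive case: pick a node value v with l <= v <= u and v != p,
    # then the tail must satisfy bndll(q, v, l, u, n - 1).
    return any(v != p and bndll_sat(v, l, u, n - 1) for v in range(l, u + 1))


def invariant(p, l, u, n):
    """Numeric part of the three disjuncts of the inferred invariant."""
    side = (l <= p - 1) or (p < u)
    return (n == 0
            or (n == 1 and l <= u and side)
            or (n > 1 and l + 1 <= u and side))


def mismatches(lo=-2, hi=3, max_n=4):
    """All (p, l, u, n) in the sampled range where predicate and invariant differ."""
    rng = range(lo, hi + 1)
    return [(p, l, u, n)
            for p, l, u in product(rng, repeat=3)
            for n in range(-1, max_n + 1)
            if bndll_sat(p, l, u, n) != invariant(p, l, u, n)]


print(mismatches())
```

An empty list means that, on the sampled range, the invariant neither over- nor under-approximates the numeric behaviour of the predicate. This is a sanity check on a finite range, not a proof of precision.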
There are also examples where we have inferred over-approximated rather than precise invariants. These typically occur for predicates outside the SLA2 fragment. Besides AVL trees, we also cannot infer precise invariants for heap trees, complete trees and red-black trees.
5.2 Compositional Program Verification
We have integrated our proposed decision procedure into HIP/SLEEK [8, 17, 16] for deciding unsat queries. Our satisfiability solver is used to prune infeasible program states, discharge verification conditions with an empty heap in the RHS, and generate counterexamples that highlight real errors. The specification system in HIP/SLEEK is based on separation logic with user-defined predicates [8] and second-order predicates [16]. The Hoare-style forward reasoning engine computes a set of states in separation logic form. During this process, it prunes infeasible disjunctive program paths with unsat queries at branch statements (e.g. if and while statements). Additionally, this component generates entailment obligations to ensure the absence of memory errors (no null dereference, no double free and no memory leak), the validity of function calls and loops via compositional pre-/post-conditions, and that postconditions hold. As a compositional verification system, it reduces the correctness of a program to the validity of the verification conditions generated. For each entailment check (i.e. ∆a ⊢ ∆c), the entailment procedure SLEEK discharges the verification condition by searching for a matching on the heap such that the heap of the consequent is subsumed by the heap of the antecedent. Concretely, it manipulates the heaps of the antecedent and the consequent of the verification condition (with folding and matching) until the heap of the
consequent is empty, i.e. ∆ ⊢ emp∧πr. To support both safety proving and bug finding, SLEEK uses an error calculus [17] to classify an entailment with an empty heap in the consequent into a lattice value: unreachability ⊥ (when ∆ is unsatisfiable), safety √ (when ∆∧¬πr is unsatisfiable), must error ✵must (when ∆∧πr is unsatisfiable), and may error ✵may (otherwise). Since SLEEK verifies programs with sound abstraction only, it cannot confirm the satisfiability of ∆a; consequently, it cannot distinguish safety, must errors and may errors from unreachability. In particular, a reported must error may in fact be unreachable, in which case it is a false alarm.

| Data Structure | Predicate Invariant Inferred | Type |
|---|---|---|
| Singly llist | ({ }, root=null, n=0) ∨ ({root}, true, n>0) | Precise |
| Even llist | ({ }, root=null, n=0) ∨ ({root}, true, ∃i· i>0 ∧ n=2∗i) | Precise |
| Sorted llist | ({ }, root=null, n=0 ∧ sm<=lg) ∨ ({root}, true, n>0 ∧ sm<=lg) | Precise |
| Doubly llist | ({ }, root=null, n=0) ∨ ({root}, true, n>0) | Precise |
| CompleteT | ({ }, root=null, n=0) ∨ ({root}, true, n<=2∗nmin ∧ nmin+1<=n) ∨ ({root}, true, n<=2∗nmin−1 ∧ nmin<=n) | Over |
| Heap trees | ({ }, root=null, n=0 ∧ mx=0) ∨ ({root}, true, n>0 ∧ mx>=0) | Over |
| AVL | ({ }, root=null, m=0 ∧ n=0 ∧ bal=1) ∨ ({root}, true, m>=1 ∧ bal<=2 ∧ bal<=n ∧ bal>=2−n ∧ bal>=0) | Over |
| BST | ({ }, root=null, n=0) ∨ ({root}, true, n>0) | Precise |
| RBT | ({ }, root=null, n=0 ∧ bh=1 ∧ cl=0) ∨ ({root}, true, n>0 ∧ bh>0 ∧ cl=1) ∨ ({root}, true, n>0 ∧ bh>1 ∧ cl=0) | Over |
| rose-tree | ({ }, root=null, true) ∨ ({root}, true, true) | Precise |
| TLL | ({root}, root=ll, true) ∨ ({root, ll}, true, true) | Precise |
| Bubble | ({ }, root=null, n=0) ∨ ({root}, true, n>0 ∧ sm<bg) | Precise |
| Quick sort | ({ }, root=null, n=0) ∨ ({root}, true, n>0 ∧ sm<bg) | Precise |

Table 1: Invariants inferred for a benchmark of data structures

| Data Structure (pure props) | #queries | #unsat | #sat | #unknown | Time (s) |
|---|---|---|---|---|---|
| Singly llist (size) | 535 | 66 | 469 | 0 | 0.68 |
| Even llist (size) | 126 | 122 | 1 | 3 | 0.71 |
| Sorted llist (size, sorted) | 168 | 18 | 150 | 0 | 1.31 |
| Doubly llist (size) | 381 | 44 | 337 | 0 | 0.77 |
| CompleteT (size, minheight) | 321 | 25 | 0 | 296 | 3.22 |
| Heap trees (size, maxelem) | 386 | 38 | 0 | 348 | 3.23 |
| AVL (height, size, bal) | 539 | 25 | 0 | 514 | 6.19 |
| BST (height, size, sorted) | 260 | 29 | 231 | 0 | 1.85 |
| RBT (size, blackheight, color) | 1451 | 126 | 0 | 1325 | 10.81 |
| rose-tree | 55 | 6 | 49 | 0 | 0.11 |
| TLL | 107 | 13 | 94 | 0 | 0.28 |
| Bubble (size, sorted) | 208 | 19 | 189 | 0 | 0.61 |
| Quick sort (size, sorted) | 181 | 21 | 160 | 0 | 0.72 |

Table 2: Experimental Results on a Wide Range of Complex Data Structures
To solve this problem, we employ the proposed decision procedure as a new satisfiability procedure in the error calculus. When a must error is detected, we perform an additional satisfiability check of ∆a to determine its reachability. If ∆a is satisfiable, we proceed to invoke error explanation to identify the set of code statements related to the error [17]. With our new satisfiability procedure, we can confirm more true bugs and give better support for fixing program bugs.
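The classification into lattice values, together with the extra reachability check, can be sketched as follows. This is an illustrative sketch only: formulas are modelled as Python predicates, and satisfiability is decided by enumerating a small finite set of candidate states, which stands in for the decision procedure. The names classify and satisfiable are hypothetical and are not part of SLEEK.

```python
def satisfiable(formula, states):
    """Stand-in for the decision procedure: search a finite set of states."""
    return any(formula(s) for s in states)


def classify(delta, pi_r, states):
    """Classify the entailment  delta |- emp /\\ pi_r  into a lattice value."""
    if not satisfiable(delta, states):
        return "unreachable"   # delta is unsatisfiable
    if not satisfiable(lambda s: delta(s) and not pi_r(s), states):
        return "safe"          # delta /\ not pi_r is unsatisfiable
    if not satisfiable(lambda s: delta(s) and pi_r(s), states):
        return "must_error"    # delta /\ pi_r is unsatisfiable, delta reachable
    return "may_error"


# Candidate states: a single integer variable x (assumed toy domain).
STATES = [{"x": x} for x in range(-3, 4)]

print(classify(lambda s: s["x"] > 0, lambda s: s["x"] >= 0, STATES))
print(classify(lambda s: s["x"] > 0, lambda s: s["x"] < 0, STATES))
print(classify(lambda s: s["x"] > 0 and s["x"] < 0, lambda s: s["x"] < 0, STATES))
print(classify(lambda s: s["x"] >= 0, lambda s: s["x"] > 0, STATES))
```

Because unreachability is tested first, a must error is reported only for a satisfiable ∆, which is the role the additional satisfiability check plays in the text above.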
The experimental results are summarized in Table 2. The first column lists the data structures and their pure properties. Rose-trees are trees whose nodes may have a variable number of children, stored as doubly-linked lists. TLL denotes binary trees whose nodes point to their parents and whose leaf nodes are linked as a singly-linked list. The second column lists the total number of satisfiability queries sent to the decision procedure. The third, fourth and fifth columns show the number of unsat, sat and unknown queries, respectively. The last column reports the processing time (in seconds) for the queries of each data structure. As expected, since the Heap trees, Complete trees, AVL and RBT data structures are beyond the decidable fragment, our system can only answer unsat for their queries. For Even llist, although we can generate precise invariants, queries (generated from quantified specifications) with complex forms of quantifiers may fall outside the decidable fragment and are discharged with unknown outputs.
6. RELATED WORK AND CONCLUSION
Due to complications of negation and the need for frame inference in resource-oriented separation logic, there are two separate decision problems of interest, namely entailment and satisfiability. Most works have focused on the first problem. Calcagno et al. [7] first presented computability and complexity results for a basic separation logic fragment without recursive predicates. Since then, there have been initial proposals of decision procedures for satisfiability in a fragment of separation logic with hardwired linked lists and (dis)equality between pointer variables. Berdine et al. proposed the foundations and the first proof theory for such a decidable fragment [2, 3]. Subsequently, [9] and [22] use more efficient algorithms to prune infeasible branches, based on graph techniques [9] and superposition calculus [22]. [23] presents an efficient SMT-based decision procedure for separation logic with acyclic list segments with length, by combining an entailment checking algorithm for separation logic with decidable SMT theories. To support a more expressive fragment, [24] proposed a logic of graph reachability and stratified sets to capture the semantics of heap structures. For a fragment with general inductive predicates, [15] showed decidability (of both the satisfiability and entailment problems) for bounded-width tree-like data structures.
On satisfiability, Brotherston et al. [6] recently presented a decision procedure for satisfiability over a fragment that includes (arbitrary) recursive predicates with pointer (dis)equality. Our work is inspired by their proposal, as we also use fixed point computations to produce a set of models that is equi-satisfiable with the original formula. We have extended such a decision procedure to include predicates with Presburger invariants. Our work shows that precise invariants are key to deciding satisfiability and that we can use projection to derive precise invariants separately for the shape and integer domains.
In terms of combining abstract domains, we list three related
works. [14] proposed a logical product combination of abstract
interpretations (over pure domains) to gain better precision than
the reduced product combination [11, 10], with the same conditions imposed on individual theories as in the Nelson-Oppen combination for decision procedures [21]. [32] presented a framework
for the reduced product combination of memory abstractions (over
shape domains). Lastly, [13] combines heap with numeric abstract
domains. Unlike these past works, our fix-points are based on disjoint shape and pure domains which are then combined by an unfolding stage to derive precise combined invariants for satisfiability.
There have also been a number of related works concentrating
on decision procedures for data structures. [31] proposes a multiphase decision procedure for the quantifier-free theory of algebraic
data types with different fold functions, that can be used to prove
various properties of functional data structures. [29] employs a
fragment of separation logic with recursive definitions but no explicit quantification, to specify general properties of structure, data
and separation, which are converted into classical logic and then
handled by natural proof mechanisms with the help of decidable
SMT solvers. [18] defines a new logic allowing existential and universal quantifications over nodes and complex combination of data
and structural constraints and identifies decidable fragments of the
logic where the decision procedure can be implemented by combining an MSO decision procedure over trees and an SMT solver
for integer constraints. Other works, such as [4, 16, 19, 20, 27,
28], propose abstract interpretation based analyses to infer invariants over heap and data for heap-manipulating programs. These
analyses either make use of fixed/pre-built shape predicates (e.g.
[4, 19, 20]) or support user-defined predicates (e.g. [27, 28]) or
infer arbitrary (second-order) shape predicates (e.g. [16]). They
are also based mostly on over-approximation analyses, and do not
provide any decision procedure on satisfiability.
Conclusion. We have considered an expressive fragment of separation logic with Presburger arithmetic. Although satisfiability in this fragment is undecidable, we have shown that precise invariants can be guaranteed for the SLA2 fragment, which makes its satisfiability decidable.
7. REFERENCES
[1] T. Antonopoulos, N. Gorogiannis, C. Haase, M. Kanovich,
and J. Ouaknine. Foundations for decision problems in
separation logic with general inductive predicates. In
FoSSaCS, pages 411–425, 2014.
[2] J. Berdine, C. Calcagno, and P. W. O’Hearn. A Decidable
Fragment of Separation Logic. In FSTTCS. Springer-Verlag,
December 2004.
[3] J. Berdine, C. Calcagno, and P. W. O’Hearn. Symbolic
Execution with Separation Logic. In APLAS, volume 3780,
pages 52–68, November 2005.
[4] A. Bouajjani, C. Dragoi, C. Enea, A. Rezine, and
M. Sighireanu. Invariant synthesis for programs manipulating
lists with unbounded data. In CAV, pages 72–88, 2010.
[5] J. Brotherston, D. Distefano, and R. L. Petersen. Automated
cyclic entailment proofs in separation logic. In CADE, pages
131–146, 2011.
[6] J. Brotherston, C. Fuhs, J. A. Navarro Pérez, and
N. Gorogiannis. A decision procedure for satisfiability in
separation logic with inductive predicates. In CSL-LICS.
ACM, 2014.
[7] C. Calcagno, H. Yang, and P. W. O’Hearn. Computability
and complexity results for a spatial assertion language for
data structures. In FSTTCS, pages 108–119, 2001.
[8] W.N. Chin, C. David, H.H. Nguyen, and S. Qin. Automated
verification of shape, size and bag properties via user-defined
predicates in separation logic. SCP, 77(9):1006–1036, 2012.
[9] B. Cook, C. Haase, J. Ouaknine, M. Parkinson, and
J. Worrell. Tractable reasoning in a fragment of separation
logic. In CONCUR, volume 6901, pages 235–249. 2011.
[10] P. Cousot. Lecture Notes on Abstract Interpretation. 2005.
http://web.mit.edu/16.399/www/.
[11] P. Cousot and R. Cousot. Systematic design of program
analysis frameworks. In ACM POPL, San Antonio, Texas,
1979.
[12] L. M. de Moura and N. Bjørner. Z3: An Efficient SMT
Solver. In TACAS, 2008.
[13] P. Ferrara. Generic combination of heap and value analyses
in abstract interpretation. In VMCAI, volume 8318, pages
302–321. 2014.
[14] S. Gulwani and A. Tiwari. Combining abstract interpreters.
In ACM PLDI, pages 376–386, 2006.
[15] R. Iosif, A. Rogalewicz, and J. Simácek. The tree width of
separation logic with recursive definitions. In CADE, pages
21–38, 2013.
[16] QL. Le, C. Gherghina, S. Qin, and W.-N. Chin. Shape
analysis via second-order bi-abduction. In CAV, 2014.
[17] QL. Le, A. Sharma, F. Craciun, and W.-N. Chin. Towards
complete specifications with an error calculus. In NASA
Formal Methods, volume 7871, pages 291–306. Springer
Berlin Heidelberg, 2013.
[18] P. Madhusudan, G. Parlato, and X. Qiu.
Decidable logics combining heap structures and data. In
ACM POPL, pages 611–622, New York, NY, USA, 2011.
ACM.
[19] S. Magill, J. Berdine, E. M. Clarke, and B. Cook. Arithmetic
strengthening for shape analysis. In SAS, pages 419–436,
2007.
[20] B. McCloskey, T. Reps, and M. Sagiv. Statically inferring
complex heap, array, and numeric invariants. In SAS, pages
71–99, Berlin, Heidelberg, 2010. Springer-Verlag.
[21] G. Nelson and D. Oppen. Simplification by cooperating
decision procedures. ACM Trans. Program. Lang. Syst.,
1(2):245–257, October 1979.
[22] J. A. N. Pérez and A. Rybalchenko. Separation logic +
superposition calculus = heap theorem prover. In ACM PLDI,
pages 556–566. ACM, 2011.
[23] J. A. N. Pérez and A. Rybalchenko. Separation logic modulo
theories. In APLAS 2013, pages 90–106, 2013.
[24] R. Piskac, T. Wies, and D. Zufferey. Grasshopper: Complete
heap verification with mixed specifications. TACAS’14, 2014.
[25] C. Popeea and W.-N. Chin. Inferring disjunctive
postconditions. In ASIAN, pages 331–345, 2006.
[26] W. Pugh. The Omega Test: A fast practical integer
programming algorithm for dependence analysis.
Communications of the ACM, 8:102–114, 1992.
[27] S. Qin, G. He, C. Luo, W.-N. Chin, and X. Chen. Loop
invariant synthesis in a combined abstract domain. J. Symb.
Comput., 50:386–408, 2013.
[28] S. Qin, G. He, C. Luo, W.-N. Chin, and H. Yang.
Automatically refining partial specifications for
heap-manipulating programs. Sci. Comput. Program.,
82:56–76, 2014.
[29] X. Qiu, P. Garg, A. Ştefănescu, and P. Madhusudan. Natural
proofs for structure, data, and separation. In PLDI, pages
231–242, New York, NY, USA, 2013. ACM.
[30] G. Rosu and A. Stefanescu. Checking reachability using
matching logic. In ACM OOPSLA, pages 555–574, New
York, NY, USA, 2012. ACM.
[31] P. Suter, M. Dotta, and V. Kuncak. Decision procedures for
algebraic data types with abstractions. In ACM POPL, 2010.
[32] A. Toubhans, B.-Y. E. Chang, and X. Rival. Reduced product
combination of abstract domains for shapes. In VMCAI,
pages 375–395, 2013.
[33] M.-T. Trinh, QL. Le, C. David, and W.-N. Chin.
Bi-abduction with pure properties for specification inference.
In APLAS, pages 107–123. 2013.
APPENDIX
A. UNDER-APPROXIMATION CHECK
Given a pure formula π and a user-defined predicate $P^{\#}_N(v)$, we define the under-approximated invariant problem as verifying whether π is an under-approximated invariant of $P^{\#}_N$, i.e.
\[ \forall h, s.\;\; s, h \models \pi \implies \exists h'.\;\; s, h \circ h' \models P^{\#}_N(v) \]
The challenge is that both π and $P^{\#}_N(t)$ are sets of disjoint, unfolded and possibly infinite disjuncts. In essence, the problem requires a systematic procedure to (i) match a disjunct $\pi_i$ in the LHS with its partner $\Delta^{\#}_{N_i}$ in the RHS such that $\pi_i$ is an under-approximation of $\Delta^{\#}_{N_i}$, (ii) strengthen the LHS by excluding the middle, and (iii) strengthen the RHS by unfolding the predicate instances. The cyclic technique [5, 30] is a promising approach for inductive reasoning over inductive predicates, which looks up an induction hypothesis in historical proofs. As our under-approximated invariant check is quite special, deploying cyclic proof directly is not efficient.
In this section, we propose a verification procedure for the under-approximated invariant problem. Our verification provides inductive reasoning and a case-split mechanism to strengthen the verification process. The verification of an under-approximated invariant is formalized as
\[ (b, V):\; \pi \ll P^{\#}_N(v) \]
where (b, V) is a reset table data structure called the context of the proof search: b is the boolean refuted, which can be set any time the context is detected to be inconsistent; and V is a set of variables defining the under-approximation. The verification process is performed by systematically applying the rules in Fig. 1.
Starting from the goal (false, { }): $\pi \ll P^{\#}_N(v)$, we initially unfold the RHS (using the [UV−U] rule) and add $\pi \ll P^{\#}_N(v)$ as the induction hypothesis. The unfolding of $P^{\#}_N(v)$ enables inductive reasoning on the problem, i.e. the induction hypothesis can be applied to any predicate instance of $P^{\#}_N$ in the RHS. Inductive reasoning is implemented as follows. First, we encode the induction hypothesis above by setting π as the unique active under-approximated invariant of predicate $P^{\#}_N$. After that, the induction hypothesis is applied as presented in the [UV−I] rule. The soundness of inductive reasoning requires that a proof including the induction application [UV−I] is valid only if there exists a valid proof of its corresponding base case. In the [UV−I] rule, the auxiliary procedure eXPure($P^{\#}_N(v)$) substitutes the inductive predicate $P^{\#}_N(v)$ by its active under-approximated invariant.
Rule [UV−U] strengthens the RHS by unfolding one user-defined predicate assertion in the RHS via the procedure unfold(P(t), ∆), which unfolds once the predicate instance P(t) of the formula ∆. The steps are formalized as follows:
\[
\dfrac{
\begin{array}{c}
P(v) \equiv \bigvee_{i=1}^{n} (\exists w_i \cdot \pi_i) \qquad \text{fresh } w'_i \qquad \rho_i = [w'_i / w_i] \\[2pt]
\pi'_i = \pi_i[\rho_i] \qquad \rho_0 = [t / v] \qquad \pi''_i = \pi'_i[\rho_0]
\end{array}
}{
\mathtt{unfold}(\exists w_0 \cdot P(t) \wedge \pi_0,\; i) \;\leadsto\; \bigvee_{i=1}^{n} (\exists w_0 \cup w'_i \cdot \pi_0 \wedge \pi''_i)
}
\]
First, the function looks up the definition of P and refreshes its existentially quantified variables. Second, the formal parameters are substituted by the actual parameters. Finally, the substituted definition is combined with the residual formula, as in the RHS of $\leadsto$.
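The three steps of unfold can be sketched on formulas represented as plain strings. This is a simplified illustration, not the implementation in HIP/SLEEK: the string representation, the function names, the `&` connective and the encoding of the sortll definition (taken from Example 2 below) are all assumptions made for the sketch.

```python
import itertools
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def substitute(formula, mapping):
    """Simultaneously replace whole identifiers according to mapping."""
    return IDENT.sub(lambda m: mapping.get(m.group(0), m.group(0)), formula)


def unfold(definition, actuals, w0, pi0, fresh):
    """Unfold P(actuals) once inside  exists w0 . P(actuals) & pi0.

    definition = (formals, [(existential_vars, body), ...]).
    Returns one (quantified_vars, formula) pair per branch of the definition.
    """
    formals, branches = definition
    # rho_0 = [t/v]; non-atomic actuals are parenthesised to keep precedence.
    rho0 = {v: (t if IDENT.fullmatch(t) else f"({t})")
            for v, t in zip(formals, actuals)}
    result = []
    for ws, pi in branches:
        k = next(fresh)
        rho = {w: f"{w}_{k}" for w in ws}   # rho_i: fresh existential names
        pi1 = substitute(pi, rho)           # pi'_i  = pi_i[rho_i]
        pi2 = substitute(pi1, rho0)         # pi''_i = pi'_i[rho_0]
        result.append((list(w0) + list(rho.values()), f"{pi0} & {pi2}"))
    return result


# Assumed encoding of sortll#_N(n, m) == n = 1  \/  exists m1 . sortll#_N(n-1, m1) & m <= m1
SORTLL_N = (("n", "m"),
            [((), "n = 1"),
             (("m1",), "sortll_N(n-1, m1) & m <= m1")])

for quantified, formula in unfold(SORTLL_N, ("n", "m"), (), "true", itertools.count(1)):
    print(quantified, formula)
```

Passing the counter explicitly keeps fresh names distinct across successive unfoldings, which matters when the actual parameters already mention an existential variable such as m1.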
[UV−D] is a terminal rule. It handles (b, V): πa ≪ πc where both πa and πc are conjunctions that do not include any inductive predicates. It checks the implication in the pure theory T, i.e. integer arithmetic. If the implication is not valid, the boolean refuted is set to true. Such a refuted context causes the proof search to backtrack and consider a different case. A naive approach to backtracking is to go back down to the nearest [UV−RO] application and try another disjunct in the RHS. In the rest of this section, we present a systematic backtracking approach based on an entailment and inference procedure. In more detail, we implement an entailment procedure, called SeA, as the implication check in the theory T: [V] πa ∧ ? ⊢ πc. In particular, this procedure is able to perform logical abduction over the selected variables V [33]. Selective abduction is the problem of finding missing hypotheses, i.e. ?, over a set of variables in a logical inference task. Concretely, if πa does not logically imply πc in the above implication, our selective abduction infers the simplest and most general explanation πI over the variables V such that πa ∧ πI ⟹ πc and πa ∧ πI ⇏ false. The explanation is used to guide the case split of the backtracking search strategy.
Search Strategy. We focus on the two rules, unfolding and induction application, that can be applied to inductive predicates. Unfolding a predicate provides more precise disjuncts, but may blow up the proof search. Applying induction helps to soundly decide an infinite set of disjuncts, but the resulting under-approximation may be too coarse and produce false alarms. Our strategy gives higher priority to applying induction over predicate instances, in the hope of finding a proof as early as possible. When the boolean refuted is set, we first backtrack on the under-approximation and then strengthen the verification by unfolding the RHS or case splitting the LHS, in the hope of reducing the false alarms as much as possible. In the following, we describe the strategy with conflict analysis for unfolding and explanation for case split.
\[
\textsf{[UV−CS]}\;\dfrac{(b, V):\; (\pi_a \wedge \pi_I) \vee (\pi_a \wedge \neg\pi_I) \ll \pi_c}{(b, V):\; \pi_a \ll \pi_c}
\qquad
\textsf{[UV−D]}\;\dfrac{[V]\;\; \pi_a \wedge {?} \vdash \pi_c}{(b, V):\; \pi_a \ll \pi_c}
\]
\[
\textsf{[UV−U]}\;\dfrac{\bigvee_{i=1}^{n} \Delta_i \equiv \mathtt{unfold}(P(t),\, P(t) \wedge \pi) \qquad (b, V):\; \psi \ll \bigvee_{i=1}^{n} \Delta_i}{(b, V):\; \psi \ll P(t) \wedge \pi}
\qquad
\textsf{[UV−I]}\;\dfrac{(b, V \cup v):\; \pi_a \ll \mathtt{eXPure}(P^{\#}_N(t)) \wedge \pi}{(b, V):\; \pi_a \ll P^{\#}_N(t) \wedge \pi}
\]
\[
\textsf{[UV−LO]}\;\dfrac{(b, V):\; \pi_1 \ll \pi \qquad (b, V):\; \pi_2 \ll \pi}{(b, V):\; (\pi_1 \vee \pi_2) \ll \pi}
\qquad
\textsf{[UV−RO]}\;\dfrac{(b, V):\; \pi \ll \pi_i \qquad i \in \{1, 2\}}{(b, V):\; \pi \ll (\pi_1 \vee \pi_2)}
\]
Figure 1: Inference Rules for Under-Approximation Check.

Let ⟹T denote the implication check in the background theory T. Given an antecedent πa and a consequent πc in the theory T, and a set of variables V, the procedure SeA, i.e. [V] πa ∧ ? ⊢ πc, is formalized by the following four rules:
[REFUTED]  πa ⟹T false
[VALID]  πa ⟹T πc
[CONFL]  πa ⟹T ¬πc
[EXPL]  infer [FV(πa)] πa ⟹T πc
In the first two rules, [REFUTED] first checks for inconsistency of the LHS and sets the boolean refuted to true, while [VALID] proves validity and completes the proof search. In the last two rules, the implication is not proven. Instead of giving up by setting the boolean refuted to true, we show how to strengthen the verification with conflict analysis or explanation inference.
Conflict Analysis. In the [CONFL] rule, SeA detects a conflict between the LHS and the RHS and analyzes it to find whether the variables defining the under-approximation contribute to the conflict. This set of variables is computed by Vf = V ∩ FV(πc). Our proof system then backtracks until it reaches the earliest under-approximation of a variable in Vf, rolls back the induction application on the predicate instance, and unfolds the predicate instance to obtain a more precise RHS.
Explanation Inference. In the [EXPL] rule, if neither πa ⟹T πc nor πa ⟹T ¬πc holds, the procedure can neither discharge nor refute the implication. This is the case when either πa is too weak to be an under-approximation of the given πc or πc has not been sufficiently unfolded. To strengthen πa, we use abduction to infer a right cut over the free variables of πa to exclude the middle. This inference is performed over the free variables of πa and produces an explanation πI. Our proof system then backtracks until it reaches the earliest under-approximation of a variable in FV(πI), i.e. (b, V′): $\pi_a \ll P^{\#}_N(v) \wedge \pi$ with v ∩ FV(πI) ≠ { }, and applies a case split on the LHS as (b, V′): $(\pi_a \wedge \pi_I) \vee (\pi_a \wedge \neg\pi_I) \ll P^{\#}_N(v) \wedge \pi$ (as presented in the [UV−CS] rule).
THEOREM A.1 (CORRECTNESS). Given an inductive predicate $P^{\#}_N(v)$ and a pure formula π such that (false, { }): $\pi \ll P^{\#}_N(v)$ is derivable, s, h |= π implies s, h |= $P^{\#}_N(v)$.
SeA introduces abduction, backtracking and case split, as the following examples illustrate.
Example 1. We verify that n>0 is an under-approximated invariant of the $\mathtt{sortll}^{\#}_N$ predicate. The proof is as follows, read from the goal at the bottom to the leaf at the top. (Each rule applied in the proof search is annotated, without the prefix UV−, next to the judgment it concludes.)
\[
\begin{array}{cl}
[n]\;\; n>0 \wedge {?} \vdash n-1>0 & \\ \hline
n>0 \ll \exists m_1 \cdot n-1>0 & \mathrm{D} \\ \hline
(a)\;\; n>0 \ll \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) & \mathrm{I} \\ \hline
n>0 \ll (n=1) \vee (\exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1)) & \mathrm{RO2} \\ \hline
n>0 \ll \mathtt{sortll}^{\#}_N(n, m) & \mathrm{U}
\end{array}
\]
In the above proof, since the LHS of the implication at the top implies neither the RHS nor ¬RHS, our SeA system applies the [EXPL] rule to infer the explanation πI = n>1, then backtracks to (a) and does a case split on the LHS as follows.
\[
\begin{array}{cl}
\cdots \qquad\qquad\qquad \cdots & \\ \hline
(b)\;\; n>0 \wedge n>1 \ll \pi_c \qquad\quad (c)\;\; n>0 \wedge \neg(n>1) \ll \pi_c & \mathrm{I} \\ \hline
(n>0 \wedge n>1) \vee (n>0 \wedge \neg(n>1)) \ll \pi_c & \mathrm{LO} \\ \hline
(a)\;\; n>0 \ll \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) & \mathrm{CS}
\end{array}
\]
In the above proof, $\pi_c \equiv \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1)$. The proof search for (b) is similar to that for (a) above. The following is the proof search for (c).
\[
\begin{array}{cl}
[n]\;\; n>0 \wedge \neg(n>1) \wedge {?} \vdash n-1>0 & \\ \hline
n>0 \wedge \neg(n>1) \ll \exists m_1 \cdot n-1>0 & \mathrm{D} \\ \hline
(c)\;\; n>0 \wedge \neg(n>1) \ll \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) & \mathrm{I}
\end{array}
\]
At the top of the above proof, we apply the [CONFL] rule to analyse the conflict over the variable n, then backtrack to (c) and unfold the $\mathtt{sortll}^{\#}_N$ instance. These steps are presented as follows.
\[
\begin{array}{cl}
[n]\;\; r \neq null \wedge n>0 \wedge \neg(n>1) \wedge {?} \vdash n=1 & \\ \hline
n>0 \wedge \neg(n>1) \ll n=1 & \mathrm{D} \\ \hline
(c)\;\; n>0 \wedge \neg(n>1) \ll n=1 \vee \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) & \mathrm{RO1}
\end{array}
\]
Therefore, n>0 is an under-approximated invariant of the predicate $\mathtt{sortll}^{\#}_N$.
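The four outcomes of SeA and the abduction step used in Example 1 can be mimicked by brute force. The sketch below is an assumption-laden stand-in for the solver-based procedure: formulas range over the single variable n, the integers are replaced by a small finite domain, and abduction only searches explanations of the form n > c. The names sea and abduce_lower_bound are hypothetical.

```python
DOMAIN = range(-10, 11)  # finite stand-in for the integers (assumption)


def sea(pi_a, pi_c, domain=DOMAIN):
    """Return which of the four SeA rules applies to  pi_a |- pi_c."""
    models = [n for n in domain if pi_a(n)]
    if not models:
        return "REFUTED"            # pi_a implies false
    if all(pi_c(n) for n in models):
        return "VALID"              # pi_a implies pi_c
    if not any(pi_c(n) for n in models):
        return "CONFL"              # pi_a implies not pi_c
    return "EXPL"                   # neither: an explanation is needed


def abduce_lower_bound(pi_a, pi_c, domain=DOMAIN):
    """Weakest explanation n > c such that pi_a & n > c implies pi_c
    and pi_a & n > c is satisfiable; returns c, or None if there is none."""
    for c in domain:                # ascending c: weakest candidate first
        models = [n for n in domain if pi_a(n) and n > c]
        if models and all(pi_c(n) for n in models):
            return c
    return None


print(sea(lambda n: n > 0, lambda n: n - 1 > 0))                   # leaf of proof (a)
print(abduce_lower_bound(lambda n: n > 0, lambda n: n - 1 > 0))    # explanation n > c
print(sea(lambda n: n > 0 and not n > 1, lambda n: n - 1 > 0))     # leaf of proof (c)
print(sea(lambda n: n > 0 and not n > 1, lambda n: n == 1))        # after unfolding
```

On this toy domain the calls reproduce the sequence of Example 1: an explanation is requested at the leaf of (a), the abduced cut is n > 1, branch (c) first runs into a conflict, and the check becomes valid once the base case n = 1 is exposed by unfolding.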
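Example 2. To illustrate that inductive reasoning requires a base case, we consider the check true ≪ $\mathtt{sortll}^{\#}_N(n, m)$. Initially, the predicate in the RHS is unfolded as:
\[ true \ll n=1 \vee \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) \wedge m \leq m_1 \]
The proof for the base case is as follows.
\[
\begin{array}{cl}
[n]\;\; true \wedge {?} \vdash n=1 & \\ \hline
true \ll n=1 & \mathrm{D} \\ \hline
true \ll n=1 \vee \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) \wedge m \leq m_1 & \mathrm{RO1}
\end{array}
\]
The case split (1a) on n=1 yields two branches:
\[
\begin{array}{cl}
n=1 \ll \mathtt{sortll}^{\#}_N(n, m) \qquad\quad \neg(n=1) \ll \mathtt{sortll}^{\#}_N(n, m) & \\ \hline
n=1 \vee \neg(n=1) \ll \mathtt{sortll}^{\#}_N(n, m) & \mathrm{CS}
\end{array}
\]
The check of the left-hand branch is trivial. The check (1b) of the right-hand branch is as follows.
\[
\begin{array}{cl}
\cdots & \\
[n]\;\; \neg(n=1) \wedge {?} \vdash \neg(n-1=1) & \\ \hline
\neg(n=1) \ll \neg(n-1=1) & \mathrm{D} \\ \hline
\neg(n=1) \ll \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) \wedge m \leq m_1 & \mathrm{I} \\ \hline
\neg(n=1) \ll n=1 \vee \exists m_1 \cdot \mathtt{sortll}^{\#}_N(n-1, m_1) \wedge m \leq m_1 & \mathrm{RO2}
\end{array}
\]
SeA then requests another abduction step, and this process may not terminate. This shows that our inference system is sound: it never concludes that true is an under-approximated invariant of $\mathtt{sortll}^{\#}_N(n, m)$.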
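The outcome of both examples can be confirmed semantically on a bounded range of n. The sketch below is a hedged illustration rather than our procedure: it computes the least fixed point of the numeric abstraction by Kleene iteration over a finite domain, and it drops the constraint m ≤ m1 on the assumption that a suitable m1 always exists over the integers. All names are illustrative.

```python
def sortll_n_models(domain):
    """Least fixed point of  S = {n | n = 1 or n - 1 in S},  restricted to domain."""
    current = set()
    while True:
        nxt = {n for n in domain if n == 1 or (n - 1) in current}
        if nxt == current:
            return current
        current = nxt


def counterexample(pi, models, domain):
    """First value in domain that satisfies pi but is not a model, or None."""
    return next((n for n in domain if pi(n) and n not in models), None)


DOMAIN = range(-5, 20)
MODELS = sortll_n_models(DOMAIN)

print(counterexample(lambda n: n > 0, MODELS, DOMAIN))  # None: n > 0 is an under-approximation
print(counterexample(lambda n: True, MODELS, DOMAIN))   # -5: true is not
```

On this range, every n > 0 is a model of the abstraction, whereas true admits values such as n = -5 that no unfolding can reach, which matches the verdicts of Example 1 and Example 2.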