From Under-approximations
to Over-approximations and
Back
Complementary material
By Yuri Meshman
yurime@cs.technion.ac.il
Example
Foo(int n):
1.  i=0,x=0;
2.  while (i<n)
3.    if (i <= 2)
4.      x = 0;
      else
5.      x = i;
6.    i = i + 1;
7.  if (x < 0)
8.    ERROR
9.  return;
Consider the code example above.
The ERROR label is not reachable, and we want to prove that with predicate abstraction.
First step: find all the reachable locations.
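As a quick concrete sanity check (not a proof), Foo can be ported to Python and run over a range of inputs. The port, the function name `foo`, and the input range of -50 to 50 are assumptions made for illustration only.

```python
def foo(n: int) -> bool:
    """Python port of Foo; returns True if the ERROR line would be reached."""
    i, x = 0, 0            # line 1
    while i < n:           # line 2
        if i <= 2:         # line 3
            x = 0          # line 4
        else:
            x = i          # line 5
        i = i + 1          # line 6
    return x < 0           # lines 7-8: ERROR is reached iff x < 0


# Inputs in the tested range for which ERROR is reached.
error_inputs = [n for n in range(-50, 51) if foo(n)]
print(error_inputs)  # prints []
```

Testing only covers the inputs tried; the rest of the slides show how to prove the property for every n.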
ARG Definition
We want to build an abstract reachability graph (ARG) for Foo.
ARG: $A = (V, E, v_{en}, \nu, \tau, \psi, \sqsubseteq, \sqsubseteq_t)$
[Figure: ARG for Foo with nodes v1, v2, v3, v4, v5, v6, v2', v7, v3', v8, v9.]
In the ARG tuple, $(V, E, v_{en})$ is a directed acyclic graph.
$\nu$ is a map from nodes to control locations (several nodes can map to the same pc).
In the graph example, $v_i$ maps to control reaching line i of the code. Apostrophes are used to distinguish different nodes mapped to the same revisited line (e.g. $v_2$, $v_2'$).
$\tau$ is a map from edges (E) to actions (instructions) of the program.
In the graph example, $\tau(v_1, v_2)$ = “i=0,x=0;”.
$\psi$ is a map from nodes (V) to formulas over program variables.
Option 1 in the graph example: all labels are true, so the graph represents only the reachable locations.
[Figure: the ARG with every node v1 ... v9 labeled {true}.]
Option 2 for $\psi$ in the graph example: general formulas over the variables, abstracting the variable values that reach each location.
[Figure: the ARG with refined labels; the labels shown are {true}, {x ≥ 0}, {x ≥ 0 ∧ i ≥ 0} (at v5) and {false} (at the error node v8).]
$\sqsubseteq$ is an ancestor relation over the nodes.
It is used to define the fixed point and covered vertices. If $v_2'$ is covered by $v_2$, we don't need to explore more iterations of the loop.
In the graph example, $v_2'$ is covered by $v_2$ if:
1. $v_2 \sqsubseteq v_2'$,
2. $v_2'$ is dominated by $v_2$ (all paths from $v_{en} = v_1$ to $v_2'$ pass through $v_2$),
3. $\nu(v_2) = \nu(v_2')$ (same code line),
4. $\psi(v_2') \Rightarrow \psi(v_2)$ (the label of $v_2'$ is subsumed by the label of $v_2$).
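The four covering conditions can be sketched as a small check over an explicit graph. This is a minimal illustration under stated assumptions: the class `ARG`, its field names, the node name `v2p` (standing for v2'), treating the ancestor relation as plain graph reachability, and representing a label as a set of conjuncts (so implication becomes a purely syntactic subset test) are all simplifications introduced here, not the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class ARG:
    entry: str    # v_en
    edges: dict   # E as adjacency lists: node -> list of successors
    loc: dict     # nu: node -> control location
    label: dict   # psi: node -> frozenset of conjuncts (empty set = true)

    def reachable(self, start, removed=None):
        """Nodes reachable from start without passing through `removed`."""
        seen, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v == removed or v in seen:
                continue
            seen.add(v)
            stack.extend(self.edges.get(v, []))
        return seen

    def is_covered(self, v, u):
        """Check whether node v is covered by node u."""
        ancestor = u != v and v in self.reachable(u)                 # condition 1
        dominated = v not in self.reachable(self.entry, removed=u)  # condition 2
        same_loc = self.loc[u] == self.loc[v]                       # condition 3
        subsumed = self.label[u] <= self.label[v]                   # condition 4 (syntactic)
        return ancestor and dominated and same_loc and subsumed


nodes = ["v1", "v2", "v3", "v4", "v5", "v6", "v2p", "v7"]
arg = ARG(
    entry="v1",
    edges={"v1": ["v2"], "v2": ["v3", "v7"], "v3": ["v4", "v5"],
           "v4": ["v6"], "v5": ["v6"], "v6": ["v2p"], "v2p": ["v7"]},
    loc={"v1": 1, "v2": 2, "v3": 3, "v4": 4, "v5": 5, "v6": 6, "v2p": 2, "v7": 7},
    label={v: frozenset() for v in nodes},  # every node labeled true
)
print(arg.is_covered("v2p", "v2"))  # prints True
```

A real tool would decide condition 4 with a solver query rather than a subset test; the subset test is sound only for labels that are conjunctions of the same predicates.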
$\sqsubseteq_t$ is a fixed linearization of the topological order.
It gives us the order in which to traverse the graph.
In the graph example (one option):
$v_2' \sqsubseteq_t v_6 \sqsubseteq_t v_4 \sqsubseteq_t v_5 \sqsubseteq_t v_3 \sqsubseteq_t v_2$.
Post operator in abstract
interpretation:
Post operator:
Given:
- An abstract state u
- An operation (an instruction from the code)
- An abstraction level (such as a set of predicates)
Returns:
The abstraction of the successor state.
Definition:
Post(u,v) = $\phi$ such that:
$\psi(u) \wedge \tau(u, v) \Rightarrow \phi'$
where $\psi(u)$ is the abstraction of state u, and $\tau(u, v)$ is the instruction from the code together with its interpretation under the abstraction.
Example
Assume you have predicates P1: (i<n) and P2: (i<=n).
You want to know their values after “i=i+1” (P1', P2') on an abstract edge (u,v).
- If P1 was true before “i=i+1”, we don't know P1', but we know that P2' will be true.
- If P1 was false, then i>=n held before “i=i+1”, so both P1 and P2 will be false after it.
- And so on.
In summary, for “i=i+1” on an abstract edge (u,v):
- P1' = if ¬P1 then F, else unknown
- P2' = if P1 then T, else if ¬P1 then F, else unknown
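The two rules for “i=i+1” can be written as a small three-valued function and checked against concrete values. This is a sketch under assumptions: the name `post_increment` and the encoding of "unknown" as `None` are chosen here for illustration, and the function follows the slide's rules literally (it does not use P2 to sharpen the result when P1 is unknown).

```python
from typing import Optional, Tuple

Tri = Optional[bool]  # True, False, or None for "unknown"


def post_increment(p1: Tri, p2: Tri) -> Tuple[Tri, Tri]:
    """Abstract effect of i = i + 1 on P1: (i < n) and P2: (i <= n)."""
    # P1': if i >= n held before, then i + 1 >= n holds after; otherwise unknown.
    p1_after = False if p1 is False else None
    # P2': i < n before gives i + 1 <= n; i >= n before gives i + 1 > n.
    if p1 is True:
        p2_after = True
    elif p1 is False:
        p2_after = False
    else:
        p2_after = None
    return p1_after, p2_after


print(post_increment(True, True))    # prints (None, True)
print(post_increment(False, None))   # prints (False, False)

# Soundness check on concrete values: a definite answer must match reality.
for i in range(-3, 4):
    for n in range(-3, 4):
        a1, a2 = post_increment(i < n, i <= n)
        assert a1 is None or a1 == (i + 1 < n)
        assert a2 is None or a2 == (i + 1 <= n)
```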
Post operator
run example
Predicates: P1: (i<n), P2: (i<=n)
[Figure: ARG nodes v1 {true} and v2 {true}.]
The transition from v1 to v2 doesn't change the predicates:
Post(v1,v2) = true
The transition from v2 to v3 sets both predicates to true:
Post(v2,v3) = P1 ∧ P2
so v3 is labeled {i < n ∧ i ≤ n}.
The transition from v3 to v4 or from v3 to v5 doesn't change the predicates, so v4 and v5 are also labeled {i < n ∧ i ≤ n}.
The same holds for the transitions from v4 to v6 and from v5 to v6, so their join at v6 is again {i < n ∧ i ≤ n}.
The transition from v6 to v2' executes “i=i+1” and is handled as previously discussed:
- P1' = if ¬P1 then F, else unknown
- P2' = if P1 then T, else if (¬P1 ∧ P2) then F, else unknown
Since P1 holds at v6, v2' is labeled {i ≤ n}.
Under approximation
driven verification:
For UD (under-approximation driven verification), the Post operator always returns true, and we will see refinement using interpolants.
An initial node $v_1$ is created and given the label true.
$v_1$ has a single successor $v_2$, which we explore next.
As previously mentioned, the Post operator returns true for $v_2$.
$v_2$ has two possible successors, $v_3$ and $v_7$; we continue by exploring $v_3$ for now.
The Post operator returns true for $v_3$ as well. The exploration continues in that fashion until it finishes the loop iteration and reaches the beginning of the loop a second time, at a node $v_2'$.
$v_2'$ has two children: $v_3'$, which indicates a second iteration of the loop, and $v_7$, which indicates exiting the loop after one or more iterations.
The label of $v_2'$ is subsumed by that of $v_2$, meaning that exploring $v_3'$ will not provide new information, and its label would be the same as that of $v_3$.
This is indicated by the black arrow from $v_2'$ to $v_2$.
After all the paths have been explored, the label of the error node $v_8$ is not false.
So we want to check:
1. whether there is a concrete counterpart to either of the 2 abstract paths $v_1 \to \cdots \to v_8$;
2. if the error is not reachable, use interpolants to find new labels that capture why those paths are infeasible.
We describe next how this Counterexample-Guided Abstraction Refinement (CEGAR) phase is done.
[Figure: the fully explored ARG, nodes v1 ... v9, all labeled {true}.]
Building a formula for
CEGAR
We ignore all nodes and edges irrelevant to the abstract paths to the error node.
We also add a boolean variable for each node; for convenience it gets the name of the node.
Intuitively, if $v_1, v_2, v_3, v_4, v_6, v_2', v_7$ and $v_8$ are all true, then this path is feasible under concrete execution.
Next, we add formulas for the edges, similar to the way it is done for Bounded Model Checking.
[Figure: the ARG restricted to nodes v1, v2, v3, v4, v5, v6, v2', v7, v8.]
We use Static Single Assignment (SSA) form.
Definition:
A program is in SSA form if an assignment to each variable appears at most once in its syntax.
Therefore we rename variables for which assignments appear more than once:
“x” will be $x_0$ at lines 1 to 3, will become $x_1$ at line 4, $x_2$ at line 5, etc.
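The renaming idea can be sketched for a single straight-line path. The helper `to_ssa` and its input format, a list of (target, expression) pairs, are hypothetical and introduced only for illustration; the sketch does not handle branches, where the slides introduce merge variables for the two sides of the if.

```python
import re


def to_ssa(assignments):
    """Rename a straight-line list of (target, expression) pairs into SSA form."""
    version = {}   # variable -> index of its latest definition
    result = []
    for target, expr in assignments:
        # Every use refers to the latest version defined so far;
        # variables never assigned on the path (such as n) keep their name.
        def rename(match):
            name = match.group(0)
            return f"{name}{version[name]}" if name in version else name

        rhs = re.sub(r"[A-Za-z_]\w*", rename, expr)
        # Every definition gets a fresh subscript.
        version[target] = version.get(target, -1) + 1
        result.append((f"{target}{version[target]}", rhs))
    return result


# Assignments along the path through lines 1, 4 and 6 of Foo.
path = [("i", "0"), ("x", "0"), ("x", "0"), ("i", "i + 1")]
for lhs, rhs in to_ssa(path):
    print(f"{lhs} = {rhs}")
# prints:
# i0 = 0
# x0 = 0
# x1 = 0
# i1 = i0 + 1
```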
“6. i = i + 1;” translates to a formula on the edge $(v_6, v_2')$:
$encode(v_6, v_2') = (i_1 = i_0 + 1)$
We use path formulas to capture error executions in the ARG:
$\mu_6 : v_6 \Rightarrow (encode(v_6, v_2') \wedge v_2')$
meaning that if $v_6$ is reached, then the edge $(v_6, v_2')$ is taken and $v_2'$ is reached.
To avoid name conflicts, each time a variable appears on the left side of an assignment it receives a new subscript (this is SSA), as in $encode(v_6, v_2')$.
For the graph example we get:
$\mu_1 : v_1 \Rightarrow (i_0 = 0 \wedge x_0 = 0 \wedge v_2)$
$\mu_2 : v_2 \Rightarrow ((i_0 < n \wedge v_3) \vee (i_0 \geq n \wedge x_4 = x_0 \wedge v_7))$
$\mu_3 : v_3 \Rightarrow ((i_0 \leq 2 \wedge v_4) \vee (i_0 > 2 \wedge v_5))$
$\mu_4 : v_4 \Rightarrow (x_1 = 0 \wedge x_3 = x_1 \wedge v_6)$
$\mu_5 : v_5 \Rightarrow (x_2 = i_0 \wedge x_3 = x_2 \wedge v_6)$
$\mu_6 : v_6 \Rightarrow (i_1 = i_0 + 1 \wedge v_2')$
$\mu_2' : v_2' \Rightarrow (i_1 \geq n \wedge x_4 = x_3 \wedge v_7)$
$\mu_7 : v_7 \Rightarrow (x_4 < 0 \wedge v_8)$
The formula
$v_1 \wedge \mu_1 \wedge \mu_2 \wedge \mu_3 \wedge \mu_4 \wedge \mu_5 \wedge \mu_6 \wedge \mu_2' \wedge \mu_7$
is UNSAT.
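To see why the conjunction is unsatisfiable, one can follow the implications starting from $v_1$. The case split below is a hand derivation from the path formulas $\mu_1, \dots, \mu_7$ of the graph example, written out here as a sketch; a solver reaches the same verdict automatically.

```latex
\begin{align*}
v_1 \wedge \mu_1 &\Rightarrow i_0 = 0 \wedge x_0 = 0 \wedge v_2 \\
\mu_2,\ \text{exit branch} &\Rightarrow x_4 = x_0 = 0 \wedge v_7 \\
\mu_2,\ \text{loop branch} &\Rightarrow v_3 \\
\mu_3 \text{ with } i_0 = 0 \leq 2 &\Rightarrow v_4 \\
\mu_4 &\Rightarrow x_3 = x_1 = 0 \wedge v_6 \\
\mu_6 &\Rightarrow v_2' \\
\mu_2' &\Rightarrow x_4 = x_3 = 0 \wedge v_7 \\
\text{in both branches, } \mu_7 &\Rightarrow x_4 < 0, \text{ contradicting } x_4 = 0
\end{align*}
```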
Solving the formula for
CEGAR
Definition
An interpolant for $A \wedge B$ (which is UNSAT) is $I = Int(A, B)$ such that:
1. $A \Rightarrow I$
2. $I \wedge B$ is UNSAT
3. $I$ is over the variables common to $A$ and $B$.
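The three conditions can be checked by brute force on a toy instance. Everything in this sketch is an assumption for illustration: the formulas `A` and `B` are a simplified stand-in for the example (A keeps only the facts about $x_3$ and $x_4$), the candidate `interp` is $x_4 \geq 0$, and the check ranges over a small finite domain, so it illustrates the definition rather than proving anything.

```python
from itertools import product

DOMAIN = range(-5, 6)  # small finite range, for illustration only


def A(x3, x4):
    # Simplified stand-in: x3 is 0 or some i > 2, and x4 copies x3.
    return (x3 == 0 or x3 > 2) and x4 == x3


def B(x4):
    # The error condition.
    return x4 < 0


def interp(x4):
    # Candidate interpolant; condition 3 holds by construction,
    # since x4 is the only variable shared by A and B.
    return x4 >= 0


# Condition 1: every model of A satisfies the interpolant.
a_implies_i = all(interp(x4) for x3, x4 in product(DOMAIN, DOMAIN) if A(x3, x4))
# Condition 2: the interpolant and B have no common model.
i_and_b_unsat = not any(interp(x4) and B(x4) for x4 in DOMAIN)

print(a_implies_i, i_and_b_unsat)  # prints True True
```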
Note: The following slides link to implementations of the formulas in iZ3 (for interpolants) and Z3 (for general formulas).
Following a link opens the online Z3 or iZ3 tool; pressing play on the opened page should compute the solutions.
We have:
$v_1 \wedge \mu_1 \wedge \mu_2 \wedge \mu_3 \wedge \mu_4 \wedge \mu_5 \wedge \mu_6 \wedge \mu_2' \wedge \mu_7$
is UNSAT.
To derive a new label for $v_7$ we can calculate an interpolant for
$B = \mu_7$ and
$A = v_1 \wedge \mu_1 \wedge \mu_2 \wedge \mu_3 \wedge \mu_4 \wedge \mu_5 \wedge \mu_6 \wedge \mu_2'$
http://rise4fun.com/iZ3/dzQ
We get:
$I_7 = Int(A, B) = v_7 \wedge x_4 \geq 0$
[Figure: the path ARG split at v7; the part up to v7 is marked A, the part from v7 to v8 is marked B, and $I_7$ is placed at v7.]
To derive a new label for $v_2'$ we can calculate an interpolant for
$B = \mu_2' \wedge \mu_7$ and
$A = v_1 \wedge \mu_1 \wedge \mu_2 \wedge \mu_3 \wedge \mu_4 \wedge \mu_5 \wedge \mu_6$
http://rise4fun.com/iZ3/5b
In that case we get (after transforming to NNF):
$I_2' = ((x_4 \geq 0 \vee x_4 \neq x_3) \wedge v_2') \vee (x_4 \geq 0 \wedge v_7)$
Informally, it means that either execution reaches $v_2'$ with $x_4 \geq 0$ or $x_4 \neq x_3$, or it reaches $v_7$ with $x_4 \geq 0$.
The resulting formula needs cleaning to get a label for $v_2'$.
[Figure: the path ARG split at v2'; the part up to v2' is marked A, the part from v2' to v8 is marked B, and $I_2'$ is placed at v2'.]
Cleaning the formula of
CEGAR
From
$I_2' = ((x_4 \geq 0 \vee x_4 \neq x_3) \wedge v_2') \vee (x_4 \geq 0 \wedge v_7)$
we want to extract for $v_2'$ the label $x_3 \geq 0$.
Why $x_3$?
If we return to the path formulas $\mu_1, \dots, \mu_7$ we got the interpolants from, each SSA variable is relevant for one node:
$x_0$ is relevant for $v_2$
$x_3$ is relevant for $v_2'$
$x_4$ is relevant for $v_7$
To extract the label $x_3 \geq 0$ for $v_2'$ from $I_2'$:
we universally quantify all the variables that are out of the scope of $v_2'$, in this case $x_4$;
and we universally quantify all node variables other than $v_2'$, in this case $v_7$.
To remove the $v_2'$ variable we set it to true.
http://rise4fun.com/Z3/d8km
And so we get $x_3 \geq 0$ (actually $x_3 > -1$).
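The cleaning step for $I_2'$ can be replayed by brute force over a small integer range: set $v_2'$ to true, then require the formula to hold for every $x_4$ and both values of $v_7$. The function names and the finite domain are assumptions made for this sketch; the slides use Z3 for the actual quantifier elimination.

```python
DOMAIN = range(-5, 6)  # small finite range, for illustration only


def interpolant(x3, x4, v2p, v7):
    """I_2' = ((x4 >= 0 or x4 != x3) and v2') or (x4 >= 0 and v7)."""
    return ((x4 >= 0 or x4 != x3) and v2p) or (x4 >= 0 and v7)


def clean_label(x3):
    """Set v2' to true, then quantify universally over x4 and v7."""
    return all(
        interpolant(x3, x4, True, v7)
        for x4 in DOMAIN
        for v7 in (False, True)
    )


# Values of x3 (within the range) that satisfy the cleaned label.
allowed = [x3 for x3 in DOMAIN if clean_label(x3)]
print(allowed)  # prints [0, 1, 2, 3, 4, 5]
```

On this range the cleaned label agrees with $x_3 \geq 0$: a negative $x_3$ fails when $x_4$ takes the same negative value.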
$CLEAN(I_i) \triangleq \forall \{x \mid x \in var(I_i) \wedge \neg inScope(x, u_i)\} \cdot \forall \{c_{u_j} \mid u_j \in V\} \cdot I_i[c_{u_i} \leftarrow T]$
where $var(I_i)$ is the set of variables of $I_i$ and $c_{u_i}$ is the boolean variable we added for node $u_i$ (so far both the node and its variable were written $v_i$).
$inScope(x, u_i)$ means that variable $x$ is relevant to node $u_i$, so the first quantifier in CLEAN ranges over the variables that are not.
Why do we quantify with $\forall$ the things we want to disappear?
For example we did:
$\forall v_7 . I_2' = \forall v_7 . ((x_4 \geq 0 \vee x_4 \neq x_3) \wedge v_2') \vee (x_4 \geq 0 \wedge v_7)$
We wanted the invariant that holds at node $v_2'$ regardless of whether $v_7$ was reachable or not.
So we look for a solution that works both when $v_7 = T$ (reachable) and when $v_7 = F$.
Theorem 3 (from the paper): Let $I'_k = CLEAN(I_k)$.
a. If k=1 then $I'_k \equiv true$, and if k=n then $I'_k \equiv false$.
b. For any two nodes $u_i, u_j \in V$ s.t. $(u_i, u_j) \in E$:
$I'_i \wedge encode(u_i, u_j) \Rightarrow I'_j$
where $encode(u_i, u_j)$ is the formula on the edge, as shown previously.
Back to Under approximation
driven verification:
After cleaning we get a new label for each node.
[Figure: the ARG with refined labels; the labels shown are {true}, {x ≥ 0}, {x ≥ 0 ∧ i ≥ 0} (at v5) and {false} (at the error node v8).]
If the label of $v_2'$ is no longer subsumed by the label of $v_2$, we continue to explore $v_3'$ and iterations 2, 3, etc. of the loop, with the Post operator returning true as the label of each new node.
In this case, the label of $v_2'$ is still subsumed by the label of $v_2$, so the algorithm terminates.
Over approximation driven
verification:
Assume we started with the Post operator returning true, and the refinement stage returned labels as described before.
We take the predicates it used, in this case $x \geq 0$ and $i \geq 0$, and recalculate the Post operator as described before.
Statement “i=0,x=0;” sets both predicates to true, and they stay true through the rest of the program.
[Figure: the ARG with v1 labeled {true}, the error node v8 labeled {false}, and every other node labeled {x ≥ 0 ∧ i ≥ 0}.]
UFO:
In this paper the authors start with UD and, after CEGAR, continue with the new Post operator they get.
Meaning, if the label of $v_2'$ were no longer subsumed by the label of $v_2$, they would continue exploring from $v_2'$ with the Post operator for $x \geq 0$, $i \geq 0$.
[Figure: the refined ARG from the UD run; one annotation reads “x ≥ 0 ∧ i ≥ 0 ?”.]
Boolean/Cartesian Predicate
Abstraction
Boolean Predicate Abstraction
Given predicates $p_1, p_2, \dots, p_n$ we represent states using boolean vectors $(b_1, b_2, \dots, b_n)$ where $b_i = true \leftrightarrow p_i = true$.
For example, $(p_1 \wedge p_2 \wedge p_3) \vee (\neg p_1 \wedge p_2 \wedge p_3) \vee (p_1 \wedge \neg p_2 \wedge p_3)$ is represented by the vectors $(T, T, T)$, $(F, T, T)$, $(T, F, T)$.
We will have $2^n$ possible states for each program counter location.
Cartesian Predicate Abstraction
We represent a cross product $P_1 \times P_2 \times \cdots \times P_n$.
At each location we store, separately for each predicate, whether it is true or false.
If the predicate can be both we store “∗”.
The formula $(p_1 \wedge p_2 \wedge p_3) \vee (\neg p_1 \wedge p_2 \wedge p_3) \vee (p_1 \wedge \neg p_2 \wedge p_3)$ is then represented by $(∗, ∗, T)$.
(Note that $(\neg p_1 \wedge \neg p_2)$ is now also part of the state.)
A more compact representation (compared to Boolean), but we lose precision.
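The loss of precision can be made concrete with a few lines of code. In this sketch the helper names `cartesian` and `concretize` are chosen for illustration, and the string "*" stands for the “∗” entry: the first function collapses a set of boolean vectors into one Cartesian tuple, and the second expands a tuple back into all the vectors it admits.

```python
from itertools import product


def cartesian(states):
    """Collapse a set of boolean vectors into one tuple over {True, False, '*'}."""
    return tuple(
        col[0] if all(b == col[0] for b in col) else "*"
        for col in zip(*states)
    )


def concretize(cart):
    """All boolean vectors admitted by a Cartesian tuple."""
    choices = [(True, False) if c == "*" else (c,) for c in cart]
    return set(product(*choices))


# The three vectors from the Boolean abstraction example.
boolean_states = {(True, True, True), (False, True, True), (True, False, True)}
cart = cartesian(boolean_states)
extra = concretize(cart) - boolean_states

print(cart)   # prints ('*', '*', True)
print(extra)  # prints {(False, False, True)}
```

The extra vector is exactly the state with both $p_1$ and $p_2$ false that the Cartesian representation adds.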
Results
• 105 programs in the benchmark
• Compared with Wolverine http://www.cprover.org/wolverine/
• 5 versions of UFO:
1. Pure UD, called ufoNo (Post returns true)
2. With Cartesian predicate abstraction, called ufoCP
3. With Boolean predicate abstraction, called ufoBP
4. Pure OD with Cartesian predicate abstraction, called CP
5. Pure OD with Boolean predicate abstraction, called BP
• Reports the number of instances solved, both for instances that should verify (#Safe) and for instances where an error should be discovered (#Unsafe).
• UFO performs much better than Wolverine.
• ufoCP performs significantly better than all other UFO configurations.
• The next slides take a closer look at the results per example, first for #Safe instances and then for #Unsafe.
• The benchmarks are token ring protocols and various handshaking protocols of SSH servers.
• The fastest time in each line is emphasized.
Results: a closer look
(Safe)
• The number of refinements goes down as you go down the predicate abstraction configurations.
• CP failed for all but 3 examples, so it wasn't included in the results.
• There is no one clear winner in terms of time. This can also be seen from the Unsafe results.
Results a closer look
(UnSafe)
FIN