
>> Nikolaj Bjorner: It's my pleasure to introduce Laura Kovács, who is visiting RiSE this week
from TU Vienna. And as of next month, she will be in [inaudible]. Anyway, I'm excited about
the topic. It will be on symbol elimination and invariant generation, on using automated theorem
proving and algebraic techniques for deriving properties of programs.
>> Laura Kovács: Thank you. It's RiSE here, and RiSE in Vienna too -- I mean,
in Austria it's called Rigorous Systems Engineering, so it's funny that I'm from RiSE and
visiting RiSE as well.
So the talk will be on symbol elimination and invariants. Essentially it's an overview of
some things that I did in my Ph.D. thesis and after my Ph.D. on automatically generating
program invariants.
And the thing is, I try to combine two techniques, one coming from computer
algebra; namely, using Gröbner basis computation and a bit of quantifier elimination. And
then, with the same ideas by which Gröbner bases are applied, I try to show you how you can
generate quantified invariants, but now using an automated theorem prover;
namely, a first-order theorem prover.
The overall method behind both techniques is called symbol elimination: you can view
symbol elimination as variable elimination using Gröbner basis computation, or as
symbol elimination where the symbols can even be function symbols and you eliminate the
symbols using a theorem prover.
So now, the types of properties that I will describe how to generate using symbol
elimination in various setups, I'm going to exemplify [inaudible]. I just consider this particular
program: you're given some array A, and then depending on whether this array at a
particular position is positive, you copy it into B, otherwise you copy it into C.
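As a quick sketch of the program she describes (a hypothetical reconstruction -- the name `partition` and the use of Python lists are my assumptions, not from the slides):

```python
# Hypothetical reconstruction of the example loop: walk through array A
# and partition its elements -- positive entries go into B, the rest into C.
def partition(A):
    B, C = [], []
    a = 0  # loop counter over A
    while a < len(A):
        if A[a] > 0:
            B.append(A[a])   # copy positive entries into B
        else:
            C.append(A[a])   # copy the rest into C
        a += 1
    return B, C
```

For instance, `partition([3, -1, 2, 0])` returns `([3, 2], [-1, 0])`.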
Okay. And then the question is: what are the valid properties of this particular
program at any iteration? You can come up with different invariants. One of
the simplest would be that
the number of elements in array B and the number of elements in array C, summed up, are
exactly the number of elements processed from A.
Or you can also say, since you start with B and C [inaudible] by 0 and then you only increase
them, they're going to stay positive.
And then you also have this: throughout the loop you know that the loop counter is
going to be less than or equal to this upper bound N, or it might be that you don't even enter
the loop in case N is negative. Okay? So then you have linear inequalities.
All this essentially allows you to reason about some scalar properties of the program --
say, about counting how many elements arrays B and C have. But now, if you
want to speak about the content of the arrays, then you look at what I said at
the very beginning: if an element of A is positive, you are going to copy it into B. So
any element in B is going to be a positive element. And then you can
express it using first-order quantification, saying that for every position from 0 up to the current
position in array B, the element of B at that position is positive. So then you get
universally quantified properties.
And in addition, you also know a property that is not just universally quantified:
every element in B is coming from some array element in A. And now here
you would have an existential quantifier. You don't really know which one exactly, so you don't
want to be very specific, but you know that for every element in B there exists an element in A
such that the two are equal.
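The invariants just listed can be spot-checked at every iteration of such a loop; a minimal sketch, assuming a Python rendering of the example (the helper `check_invariants` is invented for illustration):

```python
# Check the discussed invariants at every iteration of the partitioning loop
# (a hypothetical rendering of the slide example):
#   scalar:        len(B) + len(C) == a   (elements processed so far)
#   universal:     for all i < len(B): B[i] > 0
#   forall-exists: for all i < len(B): exists j: B[i] == A[j]
def check_invariants(A):
    B, C = [], []
    for a, x in enumerate(A):
        assert len(B) + len(C) == a
        assert all(v > 0 for v in B)
        assert all(any(v == w for w in A) for v in B)
        (B if x > 0 else C).append(x)
    return True
```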
So now these are the types of invariants that I would be interested in generating automatically.
And if you [inaudible] investigate these particular properties, then you see that they
have a different structure; in some way their complexity is different. The very first one
is a linear relation among program variables, but you can view it as being a special
case of a polynomial equality -- a polynomial of degree 1.
Further on, these are linear inequalities, and you can even have a disjunction
of linear inequalities. And then further, for those properties where you [inaudible] want to speak about
the array content, you're going to get quantified first-order properties.
So now, depending on which kind of property you want to generate, you're going to use symbol
elimination -- the method that I'm going to introduce -- either using symbolic computation;
namely, computer algebra techniques and algorithmic combinatorics, or you use theorem
proving, which is essentially a generalization of the symbolic computation approach. So
I'm going to show both techniques.
If you only apply symbolic computation techniques, you can generate quite powerful
scalar properties of the program. But you need something stronger to allow
you to also generate the quantified ones.
So now in the rest of the talk, okay, so I will just first give you the main idea of symbol
elimination and in both cases I'm going to show you how exactly symbol elimination is used to
generate these invariants automatically.
So the idea -- why symbol elimination? It's just a name; the idea was essentially already there,
but we call it symbol elimination. And I'm going to use colors. Okay? From now on,
whenever I speak of a particular example, that example is colored in green. That's
essentially the program example that you as a user are going to see.
So assume that you're given a program; I color every symbol in this program
in green. And I [inaudible] interested to generate [inaudible] invariants automatically.
And since you only see green, you
want to have most of these assertions in green.
And now the thing is, if you only want to reason in the program language,
then you don't have much expressivity. For instance, if you speak about a program loop, you
want to speak of a particular iteration -- what happens at a particular iteration of the loop.
Okay? And that iteration variable is not part of your loop
language.
So what we essentially are doing is, starting with the green language, we try to extend
this green language with some extra symbols that give us more power to reason
about the program. Such an extra symbol could for instance be a loop counter, which
essentially tells us at what iteration exactly something is happening in the program. Another symbol
can be, as I'm going to present later, an array update predicate, which essentially tells you at
which iteration an array is updated, and by which value.
So as soon as you start with your input language, which is essentially the program
language defined by its symbols, you extend this green language with these extra symbols,
and now you have a more powerful language in which you can use any
method you like to generate some program properties.
Now, the thing is that as soon as you generate these program properties, since you're
working in this extended language, you're not [inaudible] any more to only get program
properties which contain only green symbols. So, for instance, you might have a program
property that tells you explicitly what's happening at a particular iteration of the loop.
So these program properties are going to use both green and red symbols. And your
aim will be: from this set of program properties, how can you generate some
consequences, or some other properties, that are again just green, and can you output these
green assertions as loop invariants? So the last step is to
eliminate the red symbols. And this is why we call it symbol elimination.
And now, what exactly are we using in the process of symbol elimination? You can use
essentially different methods to generate valid program properties. And
the approach that we are using in our framework is to use some algebraic techniques;
namely, [inaudible] combinatorics -- so automatically solving the recurrence equations of
programs -- and some very lightweight analysis of when exactly arrays are updated.
I'm going to present this a bit later. Then, in order to eliminate the red symbols,
it depends on which framework you are in: if you only have program properties that are
polynomial equations, then you can use Gröbner basis computation to eliminate these
variables.
If you have polynomial inequalities as well, then you can use quantifier elimination
by [inaudible] decomposition. And you would get again just a green formula.
But if, in addition, these program properties don't only contain
scalars but also some function symbols or even predicate symbols, then what
we are proposing is: use a theorem prover that can eliminate these
function symbols and predicate symbols by deriving the assertions as logical
consequences of the program properties.
So this is the general idea, and for each setting I show how we can generate
loop invariants automatically. If you only use symbolic computation, then you can generate quite
powerful polynomial properties, but I will show what the restrictions on the
program structure are.
And if, on top of symbolic computation, you also use theorem
proving, then that allows you to derive essentially even forall-exists quantified properties. So
not only universally quantified properties, but properties that contain
quantifier alternations.
So let me now go to one of the main parts; namely, how
exactly these polynomial invariants can be generated using symbolic computation.
Let me give another example which does a bit more arithmetic than just summing up,
so at least you do some [inaudible] multiplication.
The thing is, assume you start with two variables, X and Y. You initialize them, X
being 1 and Y being 0, and then throughout the loop, independently of the loop condition,
we are going to multiply X by 2, and Y we divide by 2 and increment by 1.
Okay, so I do this using symbolic computation. What are the valid
properties of this loop? So: valid polynomial equality properties among X and Y such that they
are true at an arbitrary iteration of the loop.
Clearly, again, you can derive some trivial polynomial
inequality relations: since you know that X is 1 and you only multiply it by 2, clearly X and
Y both are going to stay nonnegative. But the question is, can you derive some invariant that relates the
value of X and the value of Y and is true at every iteration of the loop?
For this particular program, one polynomial invariant would be this polynomial
relation: you multiply X by Y, subtract 2 times X, and add 2, and that gives you
0. I mean, if you don't believe it, you can apply verification condition
generation and see that this particular assertion is true at the beginning and is preserved by every
iteration, so it is indeed a loop invariant.
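The claimed invariant X*Y - 2*X + 2 = 0 can be checked numerically; a small sketch using exact rationals, since Y is divided by 2:

```python
from fractions import Fraction

# Confirm the claimed invariant x*y - 2*x + 2 == 0 for the loop
#   x := 2*x;  y := y/2 + 1   with x0 = 1, y0 = 0.
# Exact rational arithmetic keeps the halving of y precise.
x, y = Fraction(1), Fraction(0)
for n in range(50):
    assert x * y - 2 * x + 2 == 0   # holds at every iteration
    x, y = 2 * x, y / 2 + 1         # simultaneous loop-body update
```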
But before showing why it is a loop invariant, there is one other
claim: it's not just any loop invariant. It clearly relates the values of X and Y in a nontrivial
manner. And what is important is that for this particular loop, we can also prove that any other
polynomial invariant that holds for this particular loop, with the loop condition ignored,
is going to be a logical consequence of this invariant.
So essentially we derive [inaudible] all polynomial invariants, by giving a finite
representation of the set of polynomial invariants of the program.
So how does this work? Let me exemplify the method. The idea of symbol
elimination: we start with the green language, and it's too weak for us to reason about
what exactly is going on, so we try to extend it by an extra symbol. One simple extension
would be to introduce a new variable N which essentially
stands for the loop counter. And we know that every loop is executed a nonnegative number of
times, so you know that N is greater than or equal to 0.
And this extra symbol is going to be red, because it's only for
our own reasoning.
So now we can examine what exactly is going on in the loop body by
using this loop counter to express properties at a particular iteration. Take
X := 2 times X. Okay? This is in the programming language.
Now, if you map it into algebra, it will tell you that the value of X at iteration
N plus 1 is computed by taking the value of X at iteration N and
multiplying it by 2. And this is the type of notation that we are using: if I write X with
superscript N plus 1, it denotes the value of X at iteration N plus 1.
So I introduce this loop counter N, and from now on I'm going to view the variables as
functions of the loop counter. Similarly for Y: the value of
Y at iteration N plus 1 is 1 over 2 times the value of Y at iteration N, plus 1. So I get this
equation.
As soon as you have these equations, then from the algebraic point of view we are quite
happy, in the sense that this is what people would call recurrence
equations. And why is this nice? Because some of these recurrence
equations can be solved: you can compute the value of X and Y at an arbitrary iteration N
such that it only depends on N and the initial value of X; you
don't have to go through the previous values of X.
So this is why we say: I solve this recurrence relation and compute the [inaudible]
closed forms of the variables, which only depend on N. You don't have to compute the value of X at
iteration 5 in order to get the value of X at iteration 6; [inaudible] you can just apply this
closed-form representation by substituting N [inaudible].
And similarly for Y: you apply some methods from recurrence solving, so
algorithmic combinatorics, and you get the closed-form representation of Y, which only
depends on N and the initial value of Y.
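For this example the closed forms work out to x_n = 2^n and y_n = 2 - 2*(1/2)^n (my own derivation from the recurrences above, with the given initial values); they can be checked against direct iteration:

```python
from fractions import Fraction

# Check the closed forms against direct iteration of the loop body
#   x_{n+1} = 2*x_n,  y_{n+1} = y_n/2 + 1,  with x_0 = 1, y_0 = 0:
#   x_n = 2**n,   y_n = 2 - 2*(1/2)**n
x, y = Fraction(1), Fraction(0)
for n in range(30):
    assert x == 2 ** n
    assert y == 2 - 2 * Fraction(1, 2) ** n
    x, y = 2 * x, y / 2 + 1
```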
And this is fine up to the point where, essentially, what
you have here is an inductive property, because it only depends on the loop iteration:
independently of whether it's N or N plus 1, this particular closed form is going to
[inaudible] for N plus 1, N plus 2, and so on. And you can prove [inaudible] that if X at
iteration N has this value, then X at iteration N plus 1 is going to have the value with N
substituted by N plus 1.
So in a sense these are already inductive properties of the program. The only
thing is that your initial language was green, and now we have green and
red symbols, whereas in an application you want to get
loop invariants which only use green symbols.
So the idea is that we can get the green properties described by this set of equations by
eliminating the red symbols.
And now here's the problem: in some sense you can eliminate, but
these properties are not [inaudible] polynomial properties. Here you don't even
have a polynomial property; you have an exponential expression that uses N:
you have the exponential sequence 1 over 2 to the power N -- sorry, this is also not a polynomial,
[inaudible] an exponential sequence in the power N.
So what can you do with this? You want to eliminate the exponential
sequences such that you automatically derive a relation only among X and
Y. And what we are doing is, for such nonpolynomial expressions, namely
for exponential sequences, we are computing the so-called algebraic dependencies among these
exponential sequences. An algebraic dependency among exponential sequences, in this
case 2 to the power N and 2 to the power minus N, is a polynomial which describes the relation
between them for an arbitrary value of N. So independently of whether N equals 1, 2, or 3,
the particular polynomial relation should hold.
And in this case, this would be the polynomial relation: if you take 2 to the power N,
multiply it by 2 to the power minus N, and subtract 1, you get a polynomial which is equal to 0.
And then sometimes you're fine, because if you take this relation,
then again you're back to polynomial relations.
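For this example the dependency is 2^N * 2^(-N) - 1 = 0; a trivial spot-check:

```python
from fractions import Fraction

# The two exponential sequences of the running example, a_n = 2**n and
# b_n = (1/2)**n, satisfy the algebraic dependency a*b - 1 == 0 for every n:
# a polynomial relation that holds at an arbitrary iteration.
for n in range(40):
    a = Fraction(2) ** n
    b = Fraction(1, 2) ** n
    assert a * b - 1 == 0
```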
And, as I'm going to show on a slide, these algebraic dependencies are not
arbitrary algebraic dependencies. It's nice that these algebraic dependencies are
polynomial dependencies. So again you can aim to generate a minimal set of algebraic
dependencies -- all the algebraic dependencies that are relevant for the particular program.
So the last step of symbol elimination: now everything is polynomial, so we try to get
rid of the red symbols. What we are doing is, we say the polynomial
invariant is allowed to use the final value of X at iteration N --
this function value we denote simply by X, and likewise the final value of Y we denote by Y. And
then we just eliminate the red variables, N among them. For the elimination, we use Gröbner basis
computation with an ordering in which the red symbols are the
first symbols to be eliminated.
And what is important: Gröbner bases are quite nice to use here because you
have a complete characterization of the inductive properties of this particular loop over X and Y.
Then you ask: what are the logical consequences of these properties that
only contain green variables? And Gröbner bases allow you not only to derive any
logical consequence, but to derive all logical consequences, in the sense of a minimal set of
logical consequences such that any other logical consequence can be derived from this Gröbner
basis, from the set of Gröbner basis polynomials.
So in a sense, what we get here is essentially the polynomial invariant ideal: we get a
finite basis of polynomials that generates the polynomial invariant ideal. So if
there is another polynomial invariant for this particular loop, it is a member of the
invariant ideal generated by this polynomial.
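For this tiny example, the elimination step can be imitated by hand substitution instead of an actual Gröbner basis computation (a sketch of the idea, not the real procedure): from x = a, y = 2 - 2b and the dependency a*b = 1 (where a = 2^n, b = 2^(-n)), substituting a = x and b = (2 - y)/2 gives x*(2 - y)/2 - 1 = 0, i.e. x*y - 2x + 2 = 0, the generator of the invariant ideal.

```python
from fractions import Fraction

# Spot-check the elimination by substitution: from
#   x = a,  y = 2 - 2*b,  a*b = 1   (a = 2**n, b = 2**(-n))
# substituting a and b away yields x*y - 2*x + 2 = 0.
for n in range(25):
    a, b = Fraction(2) ** n, Fraction(1, 2) ** n
    x, y = a, 2 - 2 * b
    assert x * (2 - y) / 2 - 1 == 0   # the substituted dependency
    assert x * y - 2 * x + 2 == 0     # the resulting invariant
```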
>>: I notice that the initial values of X and Y -- X0 and Y0 -- are green.
>> Laura Kovács: Yeah.
>>: Does that mean that those sometimes show up in your --
>> Laura Kovács: Yeah. So I skipped this part. The initial values -- I can use them.
Essentially, when variables become functions of the loop counter N, there are only two
values which I consider can go into the loop invariant: the initial value and
the final value. So this is why I write X at iteration 0 as a green variable -- a green
function value. So it can go into the loop invariant.
And essentially, when we do Gröbner basis computation -- so this is very simplified; this
is already a simplified formula -- the Gröbner basis polynomial is going to
contain X0 and Y0, and I just replaced them here to have a simple formula.
And now, where exactly can this approach work? You cannot use it for
everything, and these are the constraints that we are imposing. One
thing: we are considering loops with a sign [inaudible] condition, so you can have
[inaudible] branching in the loop body.
And these are the remaining constraints: this is one structural constraint, and some other
constraints are imposed such that we have polynomial assignments in the loop body, but in this
step here, we need to solve them. Okay?
The other thing is, we say what kind of polynomial
assignments can be handled such that we can always compute their closed form. The
recurrence equations can be arbitrarily complex, and for some classes of recurrence equations
there is no guarantee that a closed-form representation exists.
So essentially [inaudible] the constraints imposed on the polynomial assignments
[inaudible] always give a complete method to compute the closed-form representation. And for these
polynomial assignments, C-finite means, for instance, that the update is a linear recurrence
in the program variable: you don't have multiplication of two program variables, but you
can multiply a program variable by a constant. So by C-finite
I can have updates where Y depends linearly on different variables, but I cannot have an update
where Y is updated by -- let's see.
In the recurrence update of Y, Y cannot depend polynomially on another variable. It can depend
on other variables only in a restricted way: the recursive update of X can depend on two
other variables that are used in a multiplication. This one would be C-finite; [inaudible]
C-finite, and this one would be non-C-finite.
So the type of assignments that we can handle is essentially a generalization of such polynomial
updates.
And what is nice: although it's a restriction, the nice thing is that if you have such
assignments yielding recurrence equations, you know that for such assignments you can
always compute the closed-form representation. So you have a complete way to always solve the
recurrences -- for such assignments you can always get such a representation.
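One elementary way to see why C-finite (linear, constant-coefficient) updates always admit closed forms: n iterations of an affine update are the n-th power of a single matrix, which can then be solved in closed form. A sketch with an invented update x := 2x + 3y + 1; y := y + 2 (my example, not from the talk):

```python
# C-finite updates are linear with constant coefficients, so one loop
# iteration is one affine map, and n iterations are its n-th power.
# Homogeneous form of  x := 2*x + 3*y + 1;  y := y + 2 :
#   [x']   [2 3 1] [x]
#   [y'] = [0 1 2] [y]
#   [1 ]   [0 0 1] [1]
M = [[2, 3, 1],
     [0, 1, 2],
     [0, 0, 1]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_pow(M, n):
    # binary exponentiation of a 3x3 matrix
    R = [[int(i == j) for j in range(3)] for i in range(3)]
    while n:
        if n & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        n >>= 1
    return R

def values_at(n, x0=1, y0=0):
    # value of (x, y) after n iterations, without iterating n times
    P = mat_pow(M, n)
    return (P[0][0] * x0 + P[0][1] * y0 + P[0][2],
            P[1][0] * x0 + P[1][1] * y0 + P[1][2])
```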
>>: [inaudible] what is the [inaudible]?
>> Laura Kovács: Yeah. So now I come to this. [inaudible] we essentially
ignore all tests in the program, which means we handle programs as being
nondeterministic. So essentially, if you have a program
statement 'if condition B then execute statement S1, otherwise execute statement
S2', in our notation we say you have a nondeterministic choice between S1 and S2. The bar
just says you nondeterministically execute either S1 or S2.
>>: Can you compute the recurrences for all possible [inaudible]?
>> Laura Kovács: Yeah. I will show later how exactly it works -- so far I only showed the
example without conditionals.
But it's important that the method is only complete if you ignore all the tests. So you
generate polynomial invariants of a particular program, but of a program with multiple
paths where the paths can be executed nondeterministically. So you have no preference for which
test is true or false.
And for the loop, again, by ignoring the loop condition, you say you just take an
arbitrary execution of S; this is how I denote it. And if you have a loop with conditionals,
then we just ignore all conditions: in the loop body you have different paths,
and depending on the conditions you would
execute S1 or ... or SK. By ignoring all conditions, you say you have a
nondeterministic choice among S1, ..., SK, and you take them arbitrarily many times.
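The effect of treating tests as nondeterministic choice can be sketched by enumerating all branch sequences and checking a candidate invariant on every path (the counters a, b, c abstract the array example; the enumeration itself is my illustration, not the talk's method):

```python
from itertools import product

# The loop body is a nondeterministic choice  S1: b := b + 1 | S2: c := c + 1
# (the two branches of the array example, abstracted to counters).
# A candidate invariant a == b + c must survive every interleaving.
def invariant_on_all_paths(iterations):
    for choices in product((0, 1), repeat=iterations):  # all 2**n paths
        a = b = c = 0
        for pick in choices:
            if pick == 0:
                b += 1
            else:
                c += 1
            a += 1
            if a != b + c:
                return False
    return True
```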
So I'll just introduce this notation also for later so I don't have to write everything out.
And then the method: we generate these invariants
automatically by using algebraic techniques, namely recurrence solving and Gröbner basis
computation.
Another important thing, although I didn't say it: even though the program
might have integer data types, we consider them to be rationals. Again, something on the
algebraic level, so that we can apply Gröbner basis computation.
And now, the programs that satisfy these particular constraints are called P-solvable loops;
namely, they are loops that will give you polynomial properties.
Essentially, what we are aiming to derive is the
set of all polynomial relations over the program variables such that, if I apply our notation to
this program, the polynomial relation is true for the initial values, and
after an arbitrary execution of nondeterministic choices between S1 and SK the
polynomial relation also holds at the end.
And then a nice thing: although there can be infinitely many
polynomial invariants, since we want to do it algorithmically, we have a nice
feature of Gröbner bases that allows us to compute the polynomial invariants
algorithmically. So you only have to work with a finite representation.
I'm going to show later that the method is sound, but, what is important, it's also complete.
So essentially for this set of programs, the P-solvable ones, we are always computing the
polynomial invariant ideal.
Now, the implementation: it's implemented on top of Mathematica. But now, as for
completeness -- I'm going to present why exactly it's complete. The
easier case is when you have a loop without conditionals. When the loop only
contains a sequence of assignments, you can do what I presented previously: take a
loop counter, get the recurrence equations of the loop, solve them, and get the polynomial
invariant ideal.
But the problem is, what exactly happens if you have loops with
nondeterministic choices among S1 and SK? You can compute the recurrences for each of them,
for each of S1 to SK, but then you have interleavings.
So how can we ensure completeness? Therefore I'm going to show first why it's complete for
loops with only assignments, and in the next step I'm going to show why it's complete
also for loops with [inaudible] conditionals.
In the case where you have only assignments, there is no branching in
the loop body. Okay? Then essentially S is just a sequence of assignments that
updates some variables X1, ..., Xm of the loop. So again we
introduce this loop counter N and compute the closed-form representation of each Xi at an
arbitrary iteration N. For a P-solvable loop, this closed-form representation is going
to be a polynomial relation in the loop counter N, and you might have some
exponential sequences.
The point is that if a loop is P-solvable, you [inaudible] always get such a
closed-form representation; namely, the closed form of each variable is going to be a linear
combination of polynomials in the loop counter and exponential sequences in the loop counter,
plus initial values.
And as soon as you [inaudible], you're already fine, because in the
third step you compute the algebraic dependencies among these exponential sequences.
Again, you are aiming to derive polynomial relations among the exponential sequences:
you derive all polynomials such that the polynomial
in N and the exponential sequences is 0 for every value of N.
For instance, as I already showed, the running example contains these two exponential
sequences, 2 to the power N and 2 to the power minus N. Again, their product
minus 1 is going to generate the ideal of all algebraic dependencies between these two
exponential sequences.
Another example: if you take 2 to the power N and 4 to the power N, then you take the first
exponential sequence to the power 2, minus the other one, and you get 0.
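Another trivial spot-check, for the second dependency (2^N)^2 - 4^N = 0:

```python
# For a_n = 2**n and b_n = 4**n the algebraic dependency is
# a**2 - b == 0, holding for every n.
for n in range(40):
    a, b = 2 ** n, 4 ** n
    assert a * a - b == 0
```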
And what is nice -- sorry, I don't really show how you compute it; we use again
Gröbner basis computation to compute the set of minimal polynomials [inaudible].
But the nice thing is that we don't restrict ourselves to [inaudible]; theoretically it's nice --
at least I found it nice. They don't have to be nice numbers; they don't have to be rational
numbers. You can essentially take algebraic numbers.
So you can also compute algebraic dependencies among sequences where you have, for instance,
the square root of 5. [inaudible] in practice that would not happen, but in theory you can compute
algebraic dependencies among algebraic sequences. One textbook example
would be when you compute Fibonacci numbers: the closed-form representation of
Fibonacci numbers contains the square root of 5. And the thing is, you can apply
this approach to a program with a Fibonacci number computation, and you would derive its
polynomial invariants.
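Binet's closed form for the Fibonacci numbers indeed involves the algebraic number sqrt(5), yet produces integers; a float-based sketch (the rounding hides the irrational arithmetic, so this is illustrative only):

```python
from math import sqrt

# Binet's formula: F_n = (phi**n - psi**n) / sqrt(5),
# with phi, psi = (1 +/- sqrt(5)) / 2 -- algebraic, irrational numbers,
# yet every F_n is an integer.
def fib_closed(n):
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round((phi ** n - psi ** n) / sqrt(5))

def fib_iter(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```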
So now, as soon as you have the closed-form representations of
the program variables and you have the algebraic dependencies among the exponential
sequences, you use computer algebra, which tells you how to compute the
polynomial invariant ideal, namely all polynomial relations among the program variables. You
already know the value of each program variable at iteration N. So take these
polynomials X1 minus Q1 up to Xm minus Qm; they generate the polynomial
properties of the program, but with the loop counter.
Now you want to eliminate the loop counter and the exponential sequences, so you add the
ideal of algebraic dependencies to this ideal, and you get a bigger ideal. And then you are
only interested in generating the polynomials from this
ideal that contain only the variables X1 up to Xm. So you intersect.
From the point of view of algebraic computation: here you use an ideal that generates
the polynomial relations of the program in the program variables and the loop counter; then you
have the ideal of algebraic dependencies; then you have the sum of two ideals, which again you
can compute by Gröbner basis computation; and then you have an intersection of ideals, which
you can compute by Gröbner basis computation.
So the sum and the intersection can both be represented finitely by Gröbner bases.
And the nice thing of this being the polynomial invariant ideal is that if you take any
polynomial from this ideal, you know it's going to hold at the beginning of the loop, when
you take the initial values, and it's going to hold after every iteration of the loop.
Now, with this terminology, if I just revisit the example given at the beginning:
first I'm interested in generating these closed-form representations, so I get the
closed-form representations of X and Y. In the next step I have some algebraic
dependencies, so I introduce two new variables, A and B, representing the exponential sequences,
and this is essentially the algebraic dependency among the exponential sequences.
So then I take the invariant ideal [inaudible] in the extended loop language -- the
ideal together with the algebraic dependencies -- and I intersect it with the polynomials in X
and Y. And I get only the polynomial X times Y minus 2X plus 2 as the
generator of the polynomial invariant ideal.
So I have -- for this case you have completeness from Gröbner Basis computation. Essentially
the Gröbner Basis computation allows you to check all polynomial invariants over the program
in the green language. Yeah.
>>: So for intersection do you use ordering on the symbols [inaudible]?
>> Laura Kovács: Yeah. So --
>>: [inaudible].
>> Laura Kovács: No. Here you don't need it. I mean, you can take any ordering between X and
Y. So there is no preference whether X is higher in the order than Y. But the requirement is
that A and B have to be bigger than X and Y.
>>: So you compute the ideal for [inaudible]?
>> Laura Kovács: Yeah.
>>: [inaudible] making A and B greater than X and Y?
>> Laura Kovács: Yes.
>>: And that gives your polynomial identity so you throw away -- so the intersection is just you
throw away the [inaudible]?
>> Laura Kovács: [inaudible] so you can -- so you can throw away if you have the Gröbner
Basis representation.
>>: Okay.
>> Laura Kovács: Yes. And then so essentially you compute the Gröbner Basis -- okay. You
have the Gröbner Basis for this ideal and you have the Gröbner Basis for this ideal. Then you
sum: you compute the Gröbner Basis of the sum of the ideals and then you throw away
everything which is not in X and Y.
>>: [inaudible] intersection.
>> Laura Kovács: Yes.
>>: And if you did that --
>> Laura Kovács: It's not the general intersection. So it's an intersection with respect to
elimination.
>>: Okay. And if you did not, or -- suppose you ordered X above A and B, then your
intersection operation would be --
>> Laura Kovács: No, it will still be fine. Because essentially here, in the intersection
operation, I'm saying that I'm intersecting. So this ideal is going to be an ideal over the
rationals in the variables X, Y, A and B. Okay. So as soon as I'm intersecting here, I'm saying
that I'm keeping only the variables X and Y.
>>: So the question is: are you intersecting on the ideal or on the basis?
>>: On the basis, right?
>> Laura Kovács: Yeah, I'm intersecting on the basis.
>>: Yeah, I mean, basically [inaudible] on the basis, but the effect is --
>> Laura Kovács: Yeah.
>>: Yeah, I mean, the way I was understanding it is that you intersect -- you compute this
basis, but it's part of the property of the ordering in order to get --
>> Laura Kovács: No, not -- so here you --
>>: -- the basis in a suitable form such that --
>> Laura Kovács: No, here you don't need to. Because essentially -- so here, right, as soon as
you add this -- as soon as you have the red symbols as well, then you have polynomials in
the red symbols as well. I mean, you can have different [inaudible]. I mean, you
can have different bases. So if you fix the ordering among X, Y, A and B
differently, then you might get a different basis here. But automatically the result is going to
be only polynomials in X and Y.
>>: But you're going to get a different basis, but you're not necessarily going to get that
identity in the basis.
>> Laura Kovács: No. Right. But I might get a different identity. But what is important is that
among Gröbner Bases -- so among the different bases -- there is a relation.
>>: But then your intersection operation would be more difficult.
>> Laura Kovács: Right. Yeah.
>>: Is that correct?
>> Laura Kovács: Yes, yes. So this is one reason why you would put A and B [inaudible] bigger
than X and Y. But you can pick any order. So you can play with different orderings. And then
that will give you different Gröbner Bases. So it might be that if you fix a
different ordering here, then you get a Gröbner Basis which contains two polynomials or three
polynomials, and then you would say, okay, for that particular Gröbner Basis you can go for a
normal form representation which will boil down to this one.
Okay.
>>: Okay. So a more basic question: it would be legitimate to have X0 and Y0 in the
invariant, right?
>> Laura Kovács: To have X0 and Y0 in the invariant. Yes.
>>: I mean, it's [inaudible] invariance in that way.
>> Laura Kovács: Yeah.
>>: So --
>> Laura Kovács: Right. But then you would say, okay, [inaudible] you would be
interested in generating an invariant not in X and Y but in A and B. So, yeah, it's
up to you to decide which variables you want -- might want to compute.
So the point: after this -- [inaudible] elimination ideal, if I don't intersect and I don't fix
in advance which are the invariants and which are the properties I'm
interested in, so over which variables, then I would compute all polynomials in the extended
language. And then it's up to you how to fix it: whether you want X and Y, or you want X, or
you can also say, okay, you only want invariants in X.
>>: Okay. So stupid question. Wouldn't you just put X at the bottom of the
elimination order and just -- and then compute all the polynomials -- compute a basis for all
the polynomials in X? I guess I'm saying, why are X0 and Y0 not -- why aren't they red?
>> Laura Kovács: But you could. But if you put X and Y to be red, then essentially you lose the
inductive argument of X and Y.
>>: [inaudible] and Y0. If we don't want them, why aren't they red?
>> Laura Kovács: Sorry --
>>: Initial values.
>>: Initial values X -- X0 and Y0.
>> Laura Kovács: Oh, oh. Yeah. Okay. So to be very specific, okay, I should write here what
are the initial values. Okay? I'm intersecting X, Y and the initial values. Yeah. Okay.
>>: Intersect with respect to X0 being 1.
>> Laura Kovács: Yeah. So essentially -- after this particular point, okay, and this
particular point, I'm going to generate polynomials also in the initial values. So I would
consider the initial values as symbolic variables.
So you work over the field extended by X0 and Y0, with variables X and Y. Okay. So
essentially here you would get a polynomial in X0, Y0, and then you substitute the initial
values. Yes.
And there are different ways to do it. But the way we consider it: the initial
values, we consider that they should go into the invariant, and therefore essentially we
take field extensions here. So essentially we take the initial values as being part of the field,
and we are interested in X and Y.
>>: Well, in general you don't [inaudible].
>> Laura Kovács: Yeah. So if you don't use initial values, so if you don't know the concrete
values of the variables, then the polynomial invariant here would contain the initial values.
And then, yeah, you can use substitution -- so, to be very concrete, what you get
here, you will always get a polynomial with the initial values. And that polynomial can be
further simplified if you know the concrete ones.
Now, the thing is -- so this was for the simple case when you only had assignments.
The thing is now, if you have multiple paths in the program, so if you have these [inaudible]
conditionals, then how do you ensure that you have completeness? Since paths can
interleave and each path can be executed with the same probability, how can you combine the
various recurrences and ensure that the polynomial invariant ideal is going to contain all
polynomial invariants of the loop with multiple paths?
So the first observation is -- all right, a very simple observation. If you have a
loop which contains, let's say, K branches, then that particular loop is P-solvable if and only
if each inner loop -- so if I take each path separately as a simple loop -- is P-solvable.
And why is that? Essentially it's because the composition of polynomial
relations still gives you a polynomial relation.
Okay. And now essentially what we are interested in is to generate the polynomial invariant
ideal such that it contains all polynomials that are true at the
beginning of the loop, and that after arbitrary executions of the nondeterministic choices of
the paths are going to be true as well.
>>: [inaudible] since you're ignoring the conditions, you have -- this program has more
behaviors [inaudible] program.
>> Laura Kovács: Yeah.
>>: So how can you [inaudible]?
>> Laura Kovács: So let me finish. Sorry. [inaudible] the completeness with respect to the fact
that I ignore the conditions. Yeah.
>>: I see. Not to the original.
>> Laura Kovács: Yeah. Right. So my loop -- so I'm only speaking of --
>>: Sure.
>> Laura Kovács: So even if your program contains conditions, it
can easily happen that the invariants that I'm getting are true invariants for
the program that ignores conditions, but they are no longer true invariants as soon as you put
the conditions inside. So it might be, for instance, that you write a loop with two paths, and
one of the paths is never going to be entered. Okay? But this particular technique is going to
give you loop invariants that are true for each path. So you might -- essentially might be more --
>>: [inaudible].
>> Laura Kovács: Yes.
>>: [inaudible].
>> Laura Kovács: But essentially for P-solvable [inaudible].
>>: Now you're saying, okay, assuming there are no conditions.
>> Laura Kovács: Yeah. So it's completeness only with respect to P-solvability, which means
all tests are ignored. And I just use this notation which essentially says, okay, the arbitrary
execution of the paths S1 up to SK is equivalent to taking any
permutation of S1 up to SK and executing it arbitrarily many times.
And now, the idea of computing the polynomial invariant ideal is that
we try to make use of the fact that if you had only a loop with assignments, we
already know how to compute the invariants. So if you take a concrete
sequence of simple loops, then you know how to compute the polynomial representation of
S1 up to SK; then you just merge the polynomials together and you get a polynomial invariant
ideal for this particular sequence.
Now, the thing is, this is only one particular execution, and you might not cover everything.
But as soon as you have this loop sequence, what you do in the next step -- if you take
all permutations of these sequences, then at least you cover all
possible executions in one step of S1 up to SK.
So what I'm saying is: take all possible permutations of these sequences, compute the
polynomial invariant ideals, and then intersect them. But still the problem is that you want to
have this arbitrarily many times. So there is still a gap: how can we go from this polynomial
invariant ideal to the polynomial invariant ideal of the nondeterministic loop with multiple
paths?
And then, how we are doing it is just by playing around with paths. What we are doing: okay,
we compute the polynomial invariant ideal of all permutations of S1 up to
SK. Then we compute the polynomial invariant ideal of all permutations of S1 up to SK and
one more. And we just take longer and longer sequences of paths. So you take
K plus 1 loop sequences, K plus 2 loop sequences, K plus 3, and so on.
And ultimately, if this algorithm is going to terminate, then it terminates essentially when
there is a number N such that N sequences of loops and N prime
sequences of loops give you the same ideal; then you have reached a fixed point, which is the
polynomial invariant ideal.
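The intersection of the per-permutation invariant ideals can itself be computed with Gröbner bases, via the standard trick of eliminating an auxiliary variable t from t·I + (1-t)·J. A minimal sketch in Python with sympy; the ideals in the example are toy ones, not taken from the talk:

```python
from sympy import symbols, groebner

def ideal_intersection(I, J, gens):
    """Generators of I ∩ J via the classic t-trick:
    I ∩ J = (t*I + (1-t)*J) ∩ k[gens], eliminating t with a lex Gröbner basis."""
    t = symbols('_t')
    G = groebner([t * f for f in I] + [(1 - t) * g for g in J],
                 t, *gens, order='lex')
    # Basis elements not mentioning t generate the intersection ideal.
    return [p for p in G.exprs if t not in p.free_symbols]

x, y = symbols('x y')
# Toy example: <x> ∩ <y> = <x*y>
print(ideal_intersection([x], [y], [x, y]))
```

The fixed-point loop from the talk would repeatedly intersect such ideals for longer and longer path sequences and stop once the ideal no longer changes.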
So what you're having here is a descending chain of polynomial invariant ideals. And
now the problem is, if you were to work with arbitrary polynomial ideals, you might have
here a nonterminating sequence.
And, for instance, this would be one classical example: the ideal of X to the N would be
included in the ideal of X to the N minus 1, would be included in the ideal of X, would be
included in the ideal of 1. And I can extend it further with X to the N plus 1 and so on. And
this is going to give a nonterminating one.
So this particular property which is given here -- it's not a property that
holds for arbitrary ideals. What we did is that for this particular class of
P-solvable loops, you can [inaudible] -- P-solvable loops give you a special shape of
polynomial invariants, so of invariant ideals, that allows you to prove that it terminates.
And the idea is -- sorry, I don't give the proof. But the idea is
that you work with some normal representation of invariants. Then you take the minimal
decomposition into prime ideals, and you prove that at each iteration the dimension of the
prime ideals is decreasing. Now, since the dimension is a nonnegative integer, it cannot
decrease infinitely many times. So at one point it has to terminate.
So for this particular example -- it's again a textbook example which has some
conditions, but now it's coming back to the question: these conditions in our setup would be
ignored. So essentially you have a nondeterministic choice between the two branches,
taken arbitrarily many times. And now the question is: what are the
polynomial relations for this particular program? So if you apply the approach, then you
get a polynomial invariant ideal that contains five properties. So again I started with the
initial values.
And then essentially, like this [inaudible] the part on polynomial invariants: the method that I
discussed so far is sound, and it's complete for P-solvable loops, namely for loops where all
conditions are ignored and the [inaudible] constraints are fulfilled.
So it generates all polynomial invariants by solving recurrences, and there we use methods
of symbolic summation, or people would say methods from algorithmic combinatorics. Then
we compute the algebraic dependencies among exponential sequences. We eliminate
nonprogram symbols, so we eliminate loop counters and exponential sequences in the loop
counter. And we compute various intersections of ideals.
And the nice thing -- why can we do it algorithmically? Because we have the Gröbner Basis
representation. So we only have to work with a finite representation of the ideal.
So in this setup -- okay. So [inaudible] symbol elimination would be just Gröbner Basis
variable elimination. And there are some simple extensions: you can also generate polynomial
inequality invariants by using the loop condition. If the loop condition is linear, you can
formulate the problem that at the last iteration of the loop the loop condition is true and
at the next iteration it would already be violated. So then you could apply quantifier
elimination.
And you can do something very simple: if you use this setup and add some read-only and
write-only arrays, then you can generate quantified properties of arrays, but the arrays would
be uninterpreted functions. So these are simple extensions and they're not that big
results, so I don't present them.
But what I present next is what you can do further, like really getting more complicated.
So when you start looking into the array content, how can you get -- so using the methods of
symbolic computation you can generate some polynomial properties among the scalars.
But what can you do further in order to get some properties over the array content? And this
is where we go to theorem proving. And I revisit the very first example, which essentially
says: partition an input array A -- if an element is positive, you copy it into B, and if it's
negative, you copy it into C.
So if you take this particular example, okay, after execution of the loop, you would have all
positive elements of A copied into B and all negative elements copied into C.
And now, I mean, for the relation that A equals B plus C, you could use the previous approach
to generate this polynomial invariant. But to generate something more, really about the
content of the array -- for instance, you would [inaudible] the property that each of the array
elements of B is nonnegative and equal to one of the array elements in A. Now, if you write it
down, then you have a quantifier alternation: the universal quantifier over
B tells you that its elements are nonnegative, and the existential tells you that there is one
element in A which is equal to it.
Here you can do the same for C. Or, for instance, you might be interested in saying
that every element in array B which was untouched by the program remains unchanged. And
for that you would only need universally quantified invariants. You don't have to have
alternations.
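As a sketch, the two kinds of quantified invariants for the partition example could be written as first-order formulas roughly as follows; the counter names b and n are assumptions about the notation, not taken verbatim from the talk:

```latex
% Every element copied into B is nonnegative and occurs in A
% (note the forall/exists quantifier alternation):
\forall p\, \big( 0 \le p < b \;\rightarrow\; B[p] \ge 0 \;\wedge\;
                  \exists q\, (0 \le q < n \,\wedge\, B[p] = A[q]) \big)

% Elements of B not touched by the loop keep their initial value B_0
% (universally quantified only, no alternation):
\forall p\, \big( p \ge b \;\rightarrow\; B[p] = B_0[p] \big)
```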
So how we are going to do it: let me recall the general picture of symbol elimination, and
afterwards I'm going to instantiate it for theorem proving. So, again, we start
with the green language. We are going to extend the green language with some extra symbols
such as the loop counter, and in the extended language we generate some loop properties, and
we are going to make use of the polynomial invariant generation part to get the scalar
invariants. And then we eliminate the symbols in order to get invariants.
So now -- if you would only extend the loop language by the loop counter, then essentially
you can only derive properties over the loop counter in the extended language. So what I am
saying is that you need some more extensions. You need to add
some extra predicate or function symbols that allow you to reason about intermediate
properties of the array updates. And the symbols we are considering are
essentially two predicates that are saying that array V is updated at iteration I at position P by
value X. So these are extra predicates we are adding to our language. And then we take
these predicates and instantiate them. So we investigate our program, and for each
array in that particular program we write down the condition whether the array B is updated
at iteration I, position P.
So array B is updated, okay, at iteration I and position P. It means that I is the iteration
variable, so it's a loop iteration, so it's greater than zero and bounded by the loop counter N.
And the position P where the array is updated is given by the value of b at iteration I. And
also you have to ensure that you enter that particular branch, namely that the array A at
position a is positive.
So in the first step of extending the language, we know that these are the predicates --
so currently we are extending our loop language only with these predicates. We take these
predicates, and for each array in the loop, we write down what it means, so what the
definition is.
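As a sketch, the instantiated update predicate for array B in the partition loop might read as follows; the symbols i, p, b^{(i)}, a^{(i)} and n are assumptions about the talk's notation:

```latex
\mathit{upd}(B, i, p) \;\leftrightarrow\;
   0 \le i < n \;\wedge\; p = b^{(i)} \;\wedge\; A[a^{(i)}] \ge 0
% i ranges over loop iterations, bounded by the loop counter n;
% p must equal the value of the scalar b at iteration i;
% and the guard of the branch updating B must hold: A[a^{(i)}] >= 0.
```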
And so, in this extended language, in the next step we are trying
to collect loop properties. So one thing would be to generate polynomial scalar properties
using the method I presented before -- or maybe you have other methods -- which essentially
would allow you to generate that A is B plus C and that A, B, and C are all positive.
Then, by some lightweight analysis, which we call monotonicity properties of scalars, we
observe that the scalar a is always monotonically updated, even
strictly, so then you can write down that for every iteration the value of a at iteration I plus 1
is strictly greater than the value of a at iteration I.
Then, okay, we have two properties of our update predicates. These are
the properties we consider. One is saying: if the array B was never updated at
position P -- so throughout the entire iterations I, the array B was not updated at position P --
then it means that the final value of the array B at position P is equal to the initial value of
the array B at position P. So you're doing [inaudible] particular problem.
Or, if you updated the array B at iteration I and there was no further iteration that updated
the array B at the same position P, then the final value of the array B at position P
remains unchanged.
And the final thing is, okay, we just take the translation into guarded assignments. So we
represent the program in a guarded assignment representation. So these are examples of
properties that we automatically generate in this second step.
And the thing is, okay, if you start investigating these properties -- so now I come back to the
coloring: the reason that you have green and red is that if you
investigate these properties, we have properties with both green and also red symbols.
And now the thing is, some of these red symbols -- essentially [inaudible] these red
symbols, they are function symbols or predicate symbols.
So you cannot just apply Gröbner Basis computation saying, okay, eliminate some scalar
variables. You need to do something more. And that something more, if you know what it is,
then it would allow you to generate the green invariants.
So how we do it: the idea is that we know that this set of properties are valid
loop properties. Since in the second step you only do some sound program analysis, all these
properties are going to be valid loop properties of the program.
So now the idea: try to generate consequences of this set of properties. Since this set of
properties are valid loop properties, every consequence is going to be a valid loop property as
well. But if you're lucky, then in the process of generating consequences, you might hit
properties which only contain green symbols, and those are natural candidates for invariants.
So those are invariants [inaudible] not at issue.
And now, for this process of generating them, we are using a theorem prover.
Essentially what the saturation process of the theorem prover does is really generating
consequences of this set of input properties.
Now, since the set of input properties are valid loop properties, this saturation can be infinite --
okay, you're never going to derive a contradiction, but you're going to derive logical
consequences of your input set of properties.
So now, in order to make it work -- what is important: in the process
of saturation, you're only interested in the green ones. So you would only output the green
properties as invariants. So the thing is, how can you force a theorem
prover so that it would at least generate some properties that only contain green
symbols? And the idea is a bit similar to Gröbner Basis computation, where the red symbols
were forced to be eliminated first. So here you make the ordering of the saturation theorem
prover such that the red symbols are eliminated first.
And so -- this method is also implemented in a first-order theorem prover, so let me describe
the challenges that we had to face. One thing was that we had to start
reasoning both with quantifiers and theories. So it was not only just first-order logic; we
really had a theory of integers, a theory of rationals -- real theories. And the other
thing: how to make the elimination step so that the red symbols get eliminated first and
you only get invariants in the green language.
So for the first step we said, okay, let me try to reason both with quantifiers and theories.
What we essentially do now in the implementation: we take sound [inaudible] of the particular
theory. So, for instance, these were [inaudible] sound axioms of greater-or-equal and plus.
And now the important thing: these would be universally quantified axioms over X
and Y. These are theory axioms. So even though this is not coming as an
input with your program, these are theory axioms which you are supposed to know from your
mathematical knowledge. So you can color these axioms as green -- they are green formulas.
Namely, even if you don't see it in the input program, if you would
get, for all X, X plus 1 is greater than X, you would know the property from
integer arithmetic. That's it.
So these don't contain any red symbols. But the next step was more interesting: how do we
tackle the problem of eliminating symbols? And again we are
reusing [inaudible] from the Gröbner Basis step. So every loop variable in the [inaudible]
language becomes a function of the iteration counter.
Now, we know that the loop invariants are allowed to contain the initial value and the final
value of each program variable. So essentially what we say: for every loop variable, we
consider only these two values as green, everything else as red. So we consider the initial
value and the final value as green; they are the target symbols.
And then the question: what are the green symbols in the theorem proving step? So green
is every symbol that is either a target, so either an initial or final value of a variable; or an
interpreted symbol, because it can be a theory symbol; or it can be a Skolem function
introduced by the theorem prover.
And then you make the term ordering of the theorem prover such that you make the red
symbols really big [inaudible] compared to the green ones. So essentially you
force the theorem prover to make inferences where red symbols are eliminated. And
whenever there is a formula which contains only green ones, then you output it as a candidate
invariant.
Except one thing which I didn't write down. So one condition on the invariant is that it
contains only green symbols, but it also has to contain a Skolem function or a target
variable. We are not interested in outputting logical consequences of theory axioms. So if
you have a green formula that contains only interpreted symbols, we
would not output it, because it's just a consequence of the theory.
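This output condition can be sketched as a simple filter. The symbol sets below are hypothetical placeholders, not the talk's actual classification:

```python
# Hypothetical symbol classification for one loop; in practice the analyzer
# computes these sets, they are not hard-coded like this.
THEORY = {"+", "*", "<=", "0", "1"}               # interpreted (green) theory symbols
TARGETS = {"x_init", "x_fin", "y_init", "y_fin"}  # initial/final values (green targets)
SKOLEMS = {"sk0"}                                 # Skolem functions introduced by the prover
GREEN = THEORY | TARGETS | SKOLEMS

def is_candidate_invariant(symbols_of_clause):
    """A derived clause is output as a candidate invariant iff all of its
    symbols are green AND it mentions a target or Skolem symbol, so that
    pure theory tautologies are not reported."""
    only_green = symbols_of_clause <= GREEN
    mentions_target = bool(symbols_of_clause & (TARGETS | SKOLEMS))
    return only_green and mentions_target

print(is_candidate_invariant({"+", "x_fin", "0"}))  # green with a target: accepted
print(is_candidate_invariant({"+", "0", "1"}))      # pure theory consequence: rejected
print(is_candidate_invariant({"+", "n", "x_fin"}))  # contains a red symbol n: rejected
```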
So then, with this approach, the idea is: use any program analysis method to
generate some valid properties in the extended loop language. What we do so far is
generate some polynomial scalar properties, monotonicity properties, and array update
properties; then on this set of properties we run a saturation-based theorem prover to
generate properties as consequences of your original set of properties.
And as soon as one of these properties becomes green, then you derive an invariant, by doing
the symbol elimination step in the theorem prover.
Now, the implementation is in Vampire, which is a first-order theorem prover. So now,
essentially, what you can do in Vampire: you can still use Vampire
just as a prover, but now you can also use the program analysis features of Vampire. In the
first step you have the program analyzer, which implements a simple scalar analysis and
polynomial invariant generation. You do some path analysis in order to translate programs
into guarded assignments, and also to derive some array properties using the update
predicates and the update predicate properties.
So then, essentially, this is what happens if you run Vampire in the
program analysis mode: first you get the extended language and derive properties in this
extended language. And when you then go to symbol elimination, you have the two
steps, namely first [inaudible] theories -- so Vampire would automatically load a theory, the
axioms of integers and arrays -- and make a restricted saturation where we make the red
symbols bigger than the green ones. And then you get some green invariants.
And the thing is, since when you do saturation this process can be infinite, you might get
too many consequences. So what we said: from this big set of consequences, this big set of
invariants, can you try to generate something similar to the Gröbner Basis case -- namely a
minimal set of invariants that are interesting? And this is what we say: after you generate the
invariants, try to remove the consequences, try to come up with a relevant, minimal set of
properties that you would output as loop invariants.
And this consequence remover -- essentially the idea is: you are given a set of invariants S
that you obtained from symbol elimination, and from this set of invariants, you try to compute
a smaller set of invariants S prime, essentially by removing all those properties from S that
are logically implied by the other properties.
Now, if you want to solve this problem, essentially, again, you face running the prover,
and essentially you need to prove that first-order properties are implied by other first-order
properties. So the only thing we do so far in removing consequences is we
just impose some time limit in Vampire -- let's say 20 seconds -- in which Vampire
is allowed to try to prove each property in this particular set.
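The pruning loop described here can be sketched as follows. The `entails` callback stands in for a prover call (e.g. Vampire with a time limit) and is a hypothetical parameter; a toy propositional entailment is used only for illustration:

```python
def remove_consequences(invariants, entails):
    """Greedily drop every invariant that is entailed by the remaining ones.
    `entails(premises, goal)` abstracts a prover call with a time limit."""
    kept = list(invariants)
    for inv in list(kept):
        rest = [p for p in kept if p is not inv]
        if rest and entails(rest, inv):
            kept.remove(inv)
    return kept

# Toy illustration: invariants as frozensets of atomic facts; the conjunction of
# the premises entails a goal iff the goal's atoms all occur among the premises.
def toy_entails(premises, goal):
    return goal <= set().union(*premises)

invs = [frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]
print(remove_consequences(invs, toy_entails))
```

With a real prover, `entails` would time out on hard implications, so the result is minimal only relative to the time limit, matching the 20-second budget mentioned above.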
Now, what's interesting is that essentially consequence elimination, in conjunction with
various strategies or theories, eliminates quite a lot of redundancy. So if you
would run only symbol elimination in Vampire for, let's say, the partition problem,
then even in one second you would generate 166 invariants. So,
yeah, it's quite a lot. And even among these 166, when you start to investigate, you
observe quite a lot of redundancy.
So in the current setup of Vampire, if you combine symbol elimination with the
consequence remover, then you get a significant decrease: only 38
invariants are reported as the minimal set of invariants within the particular time limit.
And this is where we stop now. But as soon as you start
investigating this problem -- so our experiments show that when you do
the consequence removal, you run Vampire with respect to theories, using theories. And as
soon as you run Vampire without theory loading, it might be that some properties would
be proved as consequences of other properties. So there is much more to be done here
in the consequence remover.
So even 38 -- okay, you might say 38 invariants might be too many, so
maybe you can use it in 210, so there's something that one would be interested to prove. I
skip this because, I'm sorry, I have to finish now.
So, the idea of symbol elimination. The name symbol elimination came from the fact that
we now use a first-order theorem prover to generate invariants, but the idea of symbol
elimination in different setups was already there, depending on whether you use it in
computer algebra for polynomial equalities or inequalities.
But in general, the general idea of symbol elimination is that you are given a green language
of a loop; then you try to express loop properties containing extra symbols, so you first
extend the language with extra symbols, and in this extended language you generate
properties that contain both green and red symbols. You can use various techniques to get
these properties.
Now, as soon as you have this set of valid properties, you know that every consequence of
these mixed-colored properties is going to be a valid property as well, but not necessarily an
invariant, because it might contain a red symbol.
So in order to get rid of the red symbols, you do symbol elimination. And then, depending on your application, if you have first-order properties you run a theorem prover, but if you only have polynomial properties, then you run a decision procedure such as Gröbner basis computation. And the nice thing is that every green [inaudible] that you derive is going to be a loop invariant, and these invariants -- depending on which kind of symbolic computation or theorem proving you use -- you can view them as being consequences of symbol-eliminating inferences.
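[As a concrete sketch of the polynomial case described above -- my own illustrative example, not one from the talk: for a loop with updates `x := x + 1; y := y + 2*x - 1` starting from `x = y = 0`, the closed forms after n iterations are x = n and y = n². Treating the loop counter n as the "red" symbol to eliminate, a Gröbner basis under an elimination ordering yields the n-free invariant y = x²:]

```python
from sympy import symbols, groebner

n, x, y = symbols('n x y')

# Closed forms of the loop variables after n iterations of
#   x := x + 1;  y := y + 2*x - 1   (from x = y = 0):
# x = n and y = n**2, encoded as polynomials that vanish on reachable states.
closed_forms = [x - n, y - n**2]

# Lexicographic order with n listed first acts as an elimination order:
# basis elements free of n hold at every iteration, i.e. they are invariants.
G = groebner(closed_forms, n, x, y, order='lex')
invariants = [p for p in G.exprs if n not in p.free_symbols]
print(invariants)  # each p = 0 is a loop invariant; here y = x**2
```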
Now, so far I only used one example and only two colors: green was the nice one that you can keep, and red was the one that you have to eliminate. As a further use of symbol elimination, you can try to adapt the approach to interpolation, but then you would have three colors: the red and blue symbols are the ones that you don't want to keep, and the green ones are the ones that you would like to keep.
So then you would do symbol elimination with respect to two colors. And the idea would be that you generate interpolants: you generate some properties which are green, and they have the property of Craig interpolation.
We now have this also in Vampire. You can use symbol elimination for interpolation from a colored proof of a special kind, which I will not go into in detail here.
And the next thing which would be interesting -- I wrote it as ongoing, and if you have ideas it would be nice to discuss them -- is that in all these applications of symbol elimination, with Gröbner basis computation or theorem proving, the invariants were generated independently of what the property to be proved is. So we were trying to be complete with respect to the technique but not with respect to proving the program property.
So now the idea would be: can we use symbol elimination in different ways in order to be more property driven? Assume you are given some post-condition and you are interested in generating invariants that would imply the post-condition. Then you don't have to be complete with respect to the technique, but complete with respect to the property to be proved. So if you have any suggestions or ideas, I would be happy to discuss them.
>> Nikolaj Bjorner: Thank you. Questions?
[applause]
>>: Are there some programs that you've tried this on where you think it'd produce interesting
invariants?
>> Laura Kovács: In this question, what does interesting mean?
>>: I mean, you didn't expect -- do some odd -- give some --
>> Laura Kovács: Yeah --
>>: [inaudible] about the program?
>> Laura Kovács: I mean, even for this example. There are many -- essentially all the examples we tried, like some numeric ones. If I only see this example, it's not really necessary that I come up immediately with this particular property. And another one --
>>: But, I mean, in that example, that's kind of the nice way to write the recurrence that you're extracting.
>> Laura Kovács: Yeah.
>>: As a programmer, I may have written it in exponential notation in the first place and then related it to the value at each iteration.
>> Laura Kovács: Yeah, but still from the values -- still, after enumerating all values of X and Y at a particular iteration, there is a way from the values to come up -- okay. You can just enumerate: for each iteration these are the values, but you still have to come up with a relation that holds between X and Y at any iteration.
>>: [inaudible].
>> Laura Kovács: I mean, so on this thing it's --
>>: [inaudible] you consider it surprising when a computer program computes in ways that you can see the nature of what this [inaudible].
>>: You can see it. So what I mean is: if I give it a loop, am I just going to get a trivial invariant because I expand all the paths -- an invariant that says X is an integer, or X is a number? Or do I get something more specific?
>>: No, you probably get a lot of surprises, because in some [inaudible] table which says how many invariants were generated -- if you look at these invariants, you will have no idea what these quantified ones mean. So in a way they are surprising, [inaudible] very useful ones.
>>: Yeah. Maybe that's --
>> Laura Kovács: I mean, I also like this example -- also for this particular example, sure: if I write down these invariants, then as soon as you read them, clearly it's not that surprising. All right.
But, for instance, this is not a very surprising invariant. The thing is, for this particular program, until we started to study this problem, you would only speak of universally quantified properties. You would never speak of what the relation is. These values don't come from just anywhere; they really come from a relation between array B and array A. So sometimes they were surprising.
>>: So the existential in that formula, now, that came from one of your original [inaudible] symbols that appeared in the --
>> Laura Kovács: Yes.
>>: So -- right. Because basically that comes out of this axiom that says if the array was updated, there was some iteration in which it was updated, and that's why the [inaudible].
>> Laura Kovács: Yeah.
>>: So, but, you know, another way to do symbol elimination is to just put a quantifier on the formula. If I have some term I don't like, I can replace it by an existential.
>> Laura Kovács: Yeah.
>>: And so -- and I guess, you know, even that's sort of -- if you think about how you prove relative completeness of [inaudible] logics, you just say, well, here's my program, and I can describe the reachable states of my program essentially by throwing lots of existential quantifiers on there.
So even with your polynomial examples, once you have those recurrences, I could have just put an existential quantifier on N. You know what I'm saying? So I could say at any time there exists a value of N, the loop counter, such that N is greater than or equal to zero, blah, blah, blah.
So there's some question of what the useful form of the invariant is. Because I know I can always write the invariant: if you give me arithmetic, I can write the exact set of reachable states of my loop. So what exactly is the criterion for a good invariant?
>> Laura Kovács: Yeah. I mean, that's the thing. We don't have -- [inaudible] yeah, what is interesting, what is a good invariant. But, for instance, in the polynomial case, clearly you can put existential quantifiers in front, but it will still be -- if you can come up with a quantifier-free formula instead of having nested quantifiers, then I would consider that better.
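[To make this contrast concrete, a small illustrative example of my own, not one from the talk: for the loop `x := x + 1; y := y + 2*x - 1` started from `x = y = 0`, the reachable states can be described with or without a quantifier over the loop counter n:]

```latex
% Quantified form: n is the loop counter, an extra ("red") symbol.
\exists n.\; n \ge 0 \,\wedge\, x = n \,\wedge\, y = n^2
% Equivalent quantifier-free form, obtained by eliminating n:
\quad\Longleftrightarrow\quad
x \ge 0 \,\wedge\, y = x^2
```

[Both formulas describe exactly the same set of states; the quantifier-free form on the right is what elimination produces, and it is easier for decision procedures to consume.]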
>>: Okay. Somehow fewer quantifiers seems better.
>> Laura Kovács: Yeah.
>>: Right? Because somehow it makes it easier to reason with that invariant.
>>: You mean like in the formal proof?
>>: What's that?
>>: Fewer quantifiers are useful, like, in the formal proof?
>>: Well, yeah -- we already mentioned the problem of making the invariants useful, right, in terms of [inaudible] implying your property. But I can always write down the strongest invariant of my program, right? If you give me arithmetic. But that's not useful for me, because I can't reason with it. And so somehow this seems to be getting at that question of what a useful invariant is, and you're saying symbol elimination is somehow making the invariant more useful in some way.
>> Laura Kovács: But it's making at least -- so I'm not saying more useful; it's giving you a way to get some invariants. And now the question -- what we're trying to do in the next step -- is whether these invariants are useful with respect to the post-condition, with respect to what you want to prove. So essentially --
>>: [inaudible] though, because once you got your recurrences, I could have just put an existential quantifier on, right, and got the strongest invariant that you could infer, right? So the question is why the pretty one that you wrote as a polynomial is better, right, because the two are equivalent, right, and so --
>> Laura Kovács: Yeah.
>>: Sorry. There's a typo in the formula. It should have A sub I.
>> Laura Kovács: Yeah.
>>: Because otherwise it just eliminates the I. So the point is that there you can [inaudible].
>>: [inaudible].
>> Laura Kovács: No, no, it's true. Here it should be I because it's -- I don't think [inaudible].
>>: And it's sort of the point that you can eliminate I and get an equivalent formula there, which
you could do in the polynomial cases [inaudible].
>>: I is a bound variable here.
>>: Right. But in the vocabulary of --
>>: [inaudible].
>>: Can't eliminate I.
>>: But essentially I think the observation is that actually this [inaudible] I wasn't reduced to this [inaudible], okay, so the invariants from Vampire are reduced [inaudible].
[multiple people speaking at once]
>>: As you correctly pointed out. But the nice thing is that if you would prove something about this, you could actually use the invariant with this [inaudible] function itself, okay; you could prove that this is the sum, and therefore you avoid the quantifiers.
>>: Even if you wanted to prove something with it, yes, you would have Skolemized it.
>>: Right, exactly. You can use [inaudible], and therefore you have something that is at least good for the solvers, because you don't have quantifiers; you have [inaudible]. How useful that is really depends on the application.
>>: So maybe I'll ask the same question differently. For the class of programs where you compute these recurrences -- I think maybe that's a better thing to ask -- what do you envision as useful instances that live in this class, where you can compute these recurrences and you have this termination property?
>> Laura Kovács: So what kind of --
>>: Yeah.
>> Laura Kovács: What kind of program? So any program, essentially, that has assignments in which you recursively change a variable by linear updates.
>>: Right. But did you come across examples of such programs that weren't so artificial -- in terms of applications?
>> Laura Kovács: Right. So, in examples which you just take from some big piece of code, you would mostly get plus one or multiplication by two. You would not really get big linear recurrences.
>>: I can think of one --
>> Laura Kovács: But you can build it up. So you can take this approach: you take something over Y, something over Z, then build a more complicated one and go further.
And then -- sorry. One application that we were using in combination with arrays was for matrices, where you were supposed to derive various shapes of matrices -- whether a matrix is upper triangular. So there you have linear relations among the dimensions of the matrices and how you are updating them.
>>: And another place where there might be some interesting programs is software controllers, which are going to do things like this: a PID controller is going to multiply elements of an array by a constant and add them together in a big loop, and then you may want to know, does it blow up, does it [inaudible] blow up.
>> Laura Kovács: Right. And applications like numeric computations where you compute special numbers -- it depends what the application domain is. But there, in numeric computation, you would encode how you compute, and -- like [inaudible], maybe it's artificial -- but you would have different operations on the program variables computing that particular special number. So just plus -- incrementing by one or by two -- might not be sufficient. You might take multiplication by a scalar and add it to another quantity.
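[Affine updates like `x := 2*x + 1` are exactly the kind of recurrence these methods solve in closed form. A minimal illustrative sketch, using sympy's recurrence solver rather than the tooling discussed in the talk:]

```python
from sympy import Function, rsolve, symbols, simplify

n = symbols('n', integer=True)
x = Function('x')

# Loop body  x := 2*x + 1  starting from x = 0 gives the recurrence
#   x(n+1) = 2*x(n) + 1,  x(0) = 0.
closed = rsolve(x(n + 1) - 2 * x(n) - 1, x(n), {x(0): 0})
print(closed)  # closed form of x after n iterations

# Sanity check against a direct simulation of the loop.
v = 0
for k in range(8):
    assert closed.subs(n, k) == v
    v = 2 * v + 1
```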
>>: [inaudible].
>> Nikolaj Bjorner: I think it's time to thank [inaudible].
[applause]