
>> Leonardo de Moura: Hello. It is my great pleasure to introduce Maria Paola Bonacina. She is visiting us for five weeks. She has worked for many years on automated reasoning.
Today she will talk about how the superposition calculus yields decision procedures. And she was very kind: in the next weeks, she is going to give more lectures about the technical details. If you are interested, you are welcome to come.
>> Maria Paola Bonacina: Thank you very much.
Thank you, Leonardo and Nikolaj, for the invitation. Thank you all for coming today and for having me here.
As Leonardo said, this is the first talk, where I'm going to give an overview of how one can apply, at least in some cases, generic theorem proving to design decision procedures for satisfiability modulo theories (SMT) problems.
If there is interest, in the next weeks I will be willing to give other talks, or some more informal meetings where we can discuss more freely, to go into the details of the definitions and theorems in the papers behind this presentation.
So for today, this is a general overview, so we get a feeling of what has been done and of what may be investigated next. I start with some motivation.
Then I'll enumerate some of the major reasoning methods that have been studied, and summarize what seem to be their strengths, because different methods are good for different things and none of them is good for all.
Then I'll give an overview of the results we have obtained, myself and many others, on applying generic theorem proving to satisfiability modulo theories problems. That will include termination results showing that we can use what is a semidecision procedure for first-order logic as a decision procedure for T-satisfiability problems.
We gave a general theorem showing modularity of termination for combinations of theories. "Modularity" means that if we have termination on each theory, we also have termination on the union.
Then I shall recall some experiments we conducted, actually a few years ago now, with the generic theorem prover named E, the work of Stephan Schulz, on T-satisfiability problems. And then I'll overview more recent work on a decomposition approach that is meant to unite somehow the strengths of first-order theorem provers with those of SMT solvers, and to get a little bit of the best of both worlds to solve harder problems. There will be some open discussion at the end.
So let's start with some motivation. We have software everywhere and we would like it to be reliable; it is more than a wish, it is a need. But it is a difficult goal for many reasons. Software can be artful. Software is, in many ways, a work of art. It contains the creativity, the insight, the ingenuity of people; and that makes it difficult to create.
It can be complex. It can be huge; we all know that. It can be very varied: different pieces of software, different applications, different contexts.
It can be old, and maybe it comes without documentation, so that it is harder to deal with.
And even now that there are so many studies in informatics, there is the possibility that software may reflect some "natural" laws of computing -- I use quotes because "natural" is a heavy attribute -- laws that surface across different computing models and computing formalisms, and that may also contribute to making software such an intriguing, complex, interesting and also difficult-to-work-with work of art.
Furthermore, I'm talking about software -- and here we are in a software reliability research group -- but we should not forget that the problems of hardware verification are not solved, not entirely at least. And the border between software and hardware is somewhat blurred.
It is also evolving. It is not something defined once and for all. There is migration of functionality from software to hardware, and there are also many approaches that describe hardware in ways that make it look like software at a certain level of abstraction.
So automated reasoning may contribute not only to reasoning about software but also to reasoning about hardware. But let's stick to software.
There are many approaches to software reliability. You know most of them; most of you are experts in some of these, so you know them better than I do.
There is testing with automated test-case generation. There is the design of programming systems that enable the programmer to produce better software to begin with. There are program analyzers that analyze programs after they have been written, or maybe during the process of developing successive versions.
Analyses may be static or dynamic, and they use different technologies, from types to abstract interpretation.
There is software model checking. We all know that model checking has obtained great results in hardware verification, but it has been applied also to software. And, interestingly for this talk, software model checkers make distinguished use of theorem proving, of reasoning techniques.
For instance, in bounded model checking, the problem is reduced to deciding the satisfiability of a formula. In counterexample-guided abstraction refinement, theorem proving comes in because the problem of deciding whether an error in the abstract program corresponds to an error in the concrete program is formulated as a satisfiability problem. So if we get "yes, it is satisfiable," we know that the error that we found in the abstraction also appears in the concrete program.
If we get "no, unsatisfiable," then we can use the proof to refine the abstraction.
And I'm leaving things out, because there is certainly more that I haven't cited. But the common point I would like to make in this talk is that a variety of these approaches, a variety of these technologies, could make use of reasoning about software.
What could reasoning about software be about? It can help find and remove bugs. Less modestly, it can prove a program free of bugs of a certain kind. And, more ambitiously, it can prove the program correct, that is, free of all bugs.
Now, systems that reason about software may be varied again, but they typically have a common architecture with a front-end, which is the interface where we model the program and somehow compile the information coming from the program into formulas that can be given to the back-end, where we have the reasoning engine that ultimately solves the problem -- what we can think of as a theorem prover. For instance, in your group you have the theorem prover Z3, developed by Leonardo and Nikolaj, and that here would be the reasoning engine, the back-end behind another system, which could be your HAVOC or other program verification systems in the front.
The focus of this talk will be on the reasoning engine. I'm from theorem proving, so I'm going to discuss the reasoning technologies and some possibilities they make available. When we talk about automated reasoning, we usually mean two major tasks, defined formally. One is theorem proving: finding a proof for a conjecture and thereby showing that it is a theorem. The other is model building: building a model, which is often a counterexample for a conjecture.
So let's see these a little more formally. We shall assume, as I said, to have some way to go from programs to formulas. We are not concerned with that in this talk, because we focus on the reasoning engine; so let us assume it is addressed somewhere else.
What we are concerned with is a reasoning engine that starts with a formula. A formula may typically have the form H implies C, where H represents a bunch of assumptions and C represents some conjecture.
What we want to do is determine whether this H implies C is valid or, equivalently, whether C is a logical consequence of H. Or, again, thinking refutationally, whether H ∪ {¬C} is unsatisfiable. And this would be achieved by giving a refutation of H ∪ {¬C}, which shows that it is unsatisfiable and, therefore, is a proof of the validity of H implies C, or of the validity of C. This is the task of theorem proving.
Dually, we may have the answer that H ∪ {¬C} is satisfiable, which means H implies C is not valid.
And this is done by giving a model, a model of H ∪ {¬C}, which would be a countermodel of C, of the positive conjecture. And this is what is done by model building.
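Stating the two dual tasks in notation (just a transcription of what was said):

```latex
% Theorem proving: refute H together with the negated conjecture.
% Model building: satisfy H together with the negated conjecture.
\[
  H \models C
  \quad\Longleftrightarrow\quad
  H \cup \{\neg C\} \ \text{is unsatisfiable.}
\]
```

A refutation of H ∪ {¬C} is a proof of C from H; a model of H ∪ {¬C} is a countermodel of C.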
Now, what do we have in formulas? In formulas, we have various ingredients, beginning with propositional logic, with the usual connectives and propositional variables. We shall have equality, positive or negated, with so-called uninterpreted constant and function symbols, say a, b, c and f, g, h. "Uninterpreted" means that they are free: they could be interpreted in different ways in different structures.
Then we have built-in theories, which means we have restricted attention to certain interpretations and, therefore, the symbols are no longer free; they are interpreted in a certain way. They include theories of data structures, such as lists and records, and recursive data structures that we can think of as a generalization of lists. Just as in lists we have a constructor, cons, and two selectors, car and cdr, or head and tail, in a recursive data structure we have a general constructor and, say, K selectors. Then we have arrays, records, bit-vectors. We have arithmetic, because problems come with numbers, with integer or real types.
Finally, we have the whole of first-order logic, which brings in the quantifiers and free predicate symbols, or relation symbols.
Now, depending on which language, which theory we select, our validity problem, or the dual satisfiability problem, may be decidable or semidecidable. If we place ourselves in the general setting of first-order logic, we shall have a semidecidable problem and, therefore, the best we can get is a semidecision procedure.
If we select a decidable fragment, we can have a decision procedure. Several of the first-order theories I mentioned before have the property that if we restrict ourselves to the quantifier-free fragment -- that is, we consider only ground formulas, formulas without variables in the logical sense -- we have a decidable fragment. Therefore, we can have decision procedures.
In the literature, people talk about T-decision procedures for decision procedures that decide the satisfiability of ground formulas in a theory T. A ground formula, without loss of generality, can be reduced to a set of ground clauses; call it S.
Typically, people talk about a T-satisfiability procedure for a decision procedure that decides the satisfiability of a conjunction of ground unit clauses, or a conjunction of ground literals.
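As a small illustrative example (mine, not from the talk): in the theory of lists, with free constants a, x, y, a T-satisfiability problem is a conjunction of ground literals such as

```latex
% Illustrative T-satisfiability problem in the theory of lists:
\[
  S \;=\; \{\, \mathit{car}(x) \approx a,\;\; \mathit{cdr}(x) \approx y,\;\; \mathit{cons}(a, y) \not\approx x \,\}
\]
```

which is unsatisfiable modulo a presentation of lists that contains the axiom cons(car(v), cdr(v)) ≈ v.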
So what do we want these reasoning procedures to be like?
We would like them to be expressive, so that they handle all the ingredients -- for instance, all the theories -- that appear in the formulas. We would like them to be sound and complete, so that they give neither false negatives -- saying "there is no bug," not because there really is no bug but because the proof is wrong -- nor false positives -- saying "yes, there is a bug" when that is not true, just because the system cannot find the proof that there is none.
We want them to be efficient, because each formula they will be dealing with will typically represent only some subtask of the general verification task we are after.
We want them to be scalable, because practical problems generate huge formulas.
We would like them to produce proofs, so that we can check the proof and manipulate it. For instance, remember what I mentioned before about software model checking, where one can use the proof of unsatisfiability to refine an abstraction. So being able to work on the proof is useful.
And we would like them also to produce models because, as I said before, a model represents a counterexample, which is often what we are really after. We would like a counterexample that helps us to find the bug; a counterexample to correctness is often the key to finding the bug. So these are the desiderata for reasoning procedures.
Now, automated reasoning puts at our disposal a variety of reasoning methods that are good for different things. Many of them are probably known to you and are already implemented here. Let's start with the Davis-Putnam-Logemann-Loveland procedure, DPLL, which is typically used for satisfiability problems in propositional logic. It is very strong for its ability to do case analysis and, therefore, break apart large formulas by splitting on the truth values of propositional variables.
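To make the case-analysis idea concrete, here is a minimal sketch of DPLL, assuming clauses encoded as lists of signed integers (no learning, no watched literals -- just splitting plus unit propagation):

```python
from typing import Dict, List, Optional

Clause = List[int]  # a clause is a list of nonzero ints; -3 means "not x3"

def dpll(clauses: List[Clause],
         assignment: Dict[int, bool]) -> Optional[Dict[int, bool]]:
    """Return a satisfying assignment, or None if unsatisfiable."""
    simplified = []
    for clause in clauses:
        kept, satisfied = [], False
        for lit in clause:
            var, val = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == val:
                    satisfied = True
                    break
            else:
                kept.append(lit)
        if satisfied:
            continue
        if not kept:
            return None          # empty clause: conflict
        simplified.append(kept)
    if not simplified:
        return assignment        # every clause satisfied
    for clause in simplified:    # unit propagation
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})
    var = abs(simplified[0][0])  # case split: DPLL's point of strength
    return (dpll(simplified, {**assignment, var: True})
            or dpll(simplified, {**assignment, var: False}))
```

For instance, dpll([[1, 2], [-1, 2], [-2, 3]], {}) returns an assignment making all three clauses true.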
We have the congruence closure algorithm, in short CC, which is typically used to reason about ground equations. Congruence closure means: think about a graph; if all the children of two vertices are in the relation, then the two parent vertices, having the same label, are also in the relation. And we can use that to reason about equality, because equality is a congruence.
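A naive sketch of ground congruence closure, assuming terms represented as nested tuples over constant strings; real implementations index the parent terms instead of scanning all pairs, but the congruence-propagation idea is the same:

```python
from itertools import combinations

def congruence_closure(equations, terms):
    """equations: list of pairs of terms asserted equal.
    terms: all subterms of interest (constants are strings,
    compound terms are tuples like ('f', 'a')).
    Returns find, mapping each term to its class representative."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t

    def union(s, t):
        parent[find(s)] = find(t)

    for s, t in equations:
        union(s, t)
    changed = True
    while changed:                # propagate congruence to parent terms
        changed = False
        for s, t in combinations(terms, 2):
            if (isinstance(s, tuple) and isinstance(t, tuple)
                    and s[0] == t[0] and len(s) == len(t)
                    and find(s) != find(t)
                    and all(find(a) == find(b)
                            for a, b in zip(s[1:], t[1:]))):
                union(s, t)       # same label, congruent children: merge
                changed = True
    return find
```

With terms = {'a', 'b', ('f', 'a'), ('f', 'b')} and equations = [('a', 'b')], the returned find maps ('f', 'a') and ('f', 'b') to the same representative: a ≈ b entails f(a) ≈ f(b).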
Then there are specialized theory solvers, for instance the simplex method for linear arithmetic.
These can be combined in the so-called DPLL(T) framework; that is, we have the DPLL procedure for propositional logic, with an integrated procedure for a theory T -- for instance, the congruence closure procedure for EUF, where EUF stands for equality with uninterpreted function symbols.
Next, we can combine different theory solvers using different methods and obtain what are called DPLL-based SMT solvers. There are different ways to do combination of theories. One is the Nelson-Oppen method. One is delayed theory combination, which essentially has the SAT solver do the combination. And one is model-based theory combination, which presumes that each solver builds its own model and can use it to drive the combination.
But there are more. For instance, there is rewriting, also known as simplification, which brings orderings into the picture. The main ingredient is that we assume a well-founded ordering on terms, and this gives us a notion of normal form: an expression or a term being reduced to a normal form that cannot be rewritten further.
What is important is that rewriting brings in matching, that is, the ability to match a target term against a rewrite rule in the presence of variables. So we no longer reason about ground equations only, as in congruence closure; with rewriting and matching, we can work with variables.
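A small sketch of matching and rewriting to normal form, assuming variables are strings prefixed with '?' and the rules are oriented by some reduction ordering, so that rewriting terminates:

```python
def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def match(pattern, target, subst):
    """One-sided matching: extend subst so that pattern[subst] == target."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == target else None
        return {**subst, pattern: target}
    if (isinstance(pattern, tuple) and isinstance(target, tuple)
            and pattern[0] == target[0] and len(pattern) == len(target)):
        for p, t in zip(pattern[1:], target[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == target else None

def apply_subst(subst, t):
    if is_var(t):
        return subst.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(subst, a) for a in t[1:])
    return t

def rewrite(term, rules):
    """Innermost rewriting with rules [(lhs, rhs), ...] until no rule applies."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(rewrite(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        s = match(lhs, term, {})
        if s is not None:
            return rewrite(apply_subst(s, rhs), rules)
    return term
```

For instance, with the rule car(cons(?x, ?y)) → ?x, that is, rules = [(('car', ('cons', '?x', '?y')), '?x')], rewrite(('car', ('cons', 'a', 'b')), rules) returns 'a'.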
Then there is resolution. Resolution brings in the ability to deduce clauses from clauses. It is considered a synthetic reasoning method because it synthesizes clauses from clauses. And its major point of strength is unification. If you consider resolution in propositional logic, it is not of much use; DPLL does much better.
Resolution comes usefully into the picture when you go up to first-order logic, when you have universally quantified variables, because resolution enables you to instantiate quantified variables by using unification.
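And here is a sketch of syntactic unification with occurs check, in the same term representation as above; the substitution is kept in triangular form and resolved lazily by walk:

```python
def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    t = walk(t, subst)
    return t == v or (isinstance(t, tuple)
                      and any(occurs(v, a, subst) for a in t[1:]))

def unify(s, t, subst):
    """Most general unifier of s and t under subst, or None."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return unify(t, s, subst)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None
```

For instance, unify(('p', '?x', 'b'), ('p', 'a', '?y'), {}) returns {'?x': 'a', '?y': 'b'}, while unify('?x', ('f', '?x'), {}) fails on the occurs check.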
Now, matching and unification can be generalized to E-matching and E-unification, which means we can do the matching or the unification operation modulo a set of equations. And this can be done also in the congruence closure algorithm, to instantiate universally quantified variables using the ground equalities in the congruence closure graph as the equalities in the set E.
Then we have theorem proving methods that work by instance generation. Essentially, they implement the Herbrand theorem by trying to generate an unsatisfiable set of ground instances from a general set of clauses.
Then there is the whole family of tableau-based methods that, in contrast with resolution, can be seen as analytic methods, because the way they work is by analytic deduction, that is, by decomposing formulas into subformulas. And they can also be used to show unsatisfiability by so-called model elimination: viewing the tableau as a survey of all possible models and eliminating them all to obtain a proof.
Now, rewriting can be combined with superposition, which is an inference rule for deducing equations from equations, equations with variables. This can be done in the context of so-called Knuth-Bendix completion, which can be used to semidecide problems in equational theories.
Finally, when we put it all together -- resolution, rewriting, superposition and paramodulation, which are very similar -- we get the full-fledged first-order theorem prover that can deduce clauses, including clauses with equations and universally quantified variables, from other clauses with equations.
Now, this is a really large choice of tools, of reasoning methods, that we can pick from to build our decision procedures. Empirically, people have got a sense of what these various methods are good for. DPLL is good for SAT problems, especially because it can break apart large non-Horn clauses. Congruence closure is good for reasoning about ground equations. Theory solvers are very good for reasoning about specific theories like linear arithmetic. DPLL-based SMT solvers are really strong on ground SMT problems, uniting the strengths of all the previous ingredients.
Now, when we look at the other side and we see rewriting and, for instance, Knuth-Bendix completion, these are good for reasoning about nonground equations with universally quantified variables. Likewise, resolution is good for reasoning about nonground first-order clauses, especially Horn clauses. And similarly, when we combine resolution, paramodulation and rewriting, we get methods that are sufficiently good for reasoning about general nonground first-order clauses with equality.
But, again, these methods are not as strong as DPLL at breaking apart very large formulas, and they seem to excel especially at Horn problems.
Now, this was the overview of general theorem proving. Let's focus a little more on what we are going to use in this talk. Let us assume to have an inference system for first-order logic with equality -- let's say a rewriting-based one, not so much because I want to exclude the others, but because all the problems we're interested in include equality, and rewriting has proved to be probably one of the best methods for dealing with equality. Tableau-based systems or instance generation systems, when they want to work with equality, also bring in some sort of rewriting and superposition.
So let's assume to have one of those.
An inference system is nondeterministic in nature. So, in order to have a working method, a so-called theorem-proving strategy, we shall need to combine it with some control, usually called a search plan, with typical terminology from AI. So we get a strategy: inference system plus search plan.
And if we have a refutationally complete inference system -- that is, one that ensures that whenever the input is unsatisfiable a refutation exists -- and a fair search plan -- that is, one that does not neglect any necessary step, and therefore ensures that if there are refutations, one will be found -- we shall have, all together, a complete strategy. There are many of these. This is not our problem in this talk. We shall assume one and see what we can do with it for SMT problems.
Now, the main idea of the approach I'm going to describe is the following, and it is very simple. If we can show that the first-order inference system is guaranteed to terminate on any T-satisfiability problem for some theory T, then any complete theorem-proving strategy based on that inference system is, in itself, a decision procedure. It is already sound and complete, because it is a complete strategy for first-order logic. If we add termination, we go from having a semidecision procedure for general first-order logic to having a decision procedure for the specific class of problems, the T-satisfiability problems, where we show termination.
Now, a few things to remember. If we use a generic theorem prover, we are not going to have the theory built in, so to speak, into the algorithm. So we shall give a presentation, an axiomatization, of the theory -- a bunch of clauses describing the theory, restricting the interpretations of the symbols of the theory -- as part of the input. So the input will have the form T ∪ S, where S is, say, a set of ground unit clauses and T is a presentation of the theory.
Of course, in practice we usually have more than one theory, as we saw before. So in this approach, the combination of theories will start by giving as input a union of the presentations of all the theories we need.
Also, notice that the border between T and S is somewhat flexible, because if someone comes with a problem that contains a formula with universally quantified variables, we can migrate it from S to T and say it is part of the presentation of the theory. If you have a system where theories are built in and you have a new formula with quantifiers coming in, you have to deal with it as part of the problem.
If, instead, you use a first-order theorem prover, where everything goes into the input anyway, you can see it as part of the theory. So there is an additional flexibility there.
>>: I have a question.
>> Maria Paola Bonacina: Please.
>> You said that the theory has to be somehow given as a set of axioms?
>> Maria Paola Bonacina: Yes.
>>: How does one figure out whether those axioms capture the theory?
>> Maria Paola Bonacina: Well, for theories such as those I listed before -- lists, arrays, records and so on -- presentations are known. I could give some examples if there is interest. For instance --
>>: So that is a problem that the user has to somehow solve? That is not to be addressed by this theorem prover, right?
>> Maria Paola Bonacina: Yes, that's correct. The theorem prover assumes to have, as part of the input, a description of the theory, so that has to come with the problem, yes.
Other questions? Just feel free to interrupt any time, okay?
Now, what would be the advantages if we could do this?
Well, we do have a sound and complete system, so we have complete strategies. We have advantages in terms of expressivity, because we would have the full power of first-order logic with equality and, in particular, native quantifier reasoning. For combinations of theories, provided we can show termination -- and we shall see that with the modularity result -- we just give the union of the presentations as part of the input. And we have, as I said, some flexibility in drawing the line between theory and problem.
Now, we can use existing theorem provers off the shelf, so to speak, or almost. Proof generation will already be there by default, because these inference systems typically generate a proof if they find the input unsatisfiable.
Model generation is not as easy, but we have a starting point. If we demonstrate that the inference system is guaranteed to terminate on these T ∪ S problems, then it terminates regardless of whether it finds a proof or not. If it finds a proof, the output will be the proof. If it doesn't find a proof, the output will be a satisfiable set which has certain properties -- we say it is saturated -- which can form the basis for building a model, because it is satisfiable and saturated, which means it contains a lot of information about what is true in that theory.
Now, so far so good. But what about termination? We are starting from what is a semidecision procedure for first-order logic with equality, so termination is by no means free. Here I enumerate some of our results, which I'm going to outline in the second half of the talk.
Termination results: we proved that a fairly standard first-order inference system is guaranteed to terminate on T-satisfiability problems in theories of data structures, so that it is a T-satisfiability procedure. And in some cases, when the theory allows it, this can be done with polynomial time complexity.
Then I'll give a result about combinations of theories, what I mentioned already before as modularity of termination. "Modularity" means that if I have termination on T1, T2, T3, each taken separately, I also have termination on the union, on the combined problem.
Then I will give some experimental evidence, which is, however, limited to problems on ground literals, so the axioms plus a conjunction of ground literals.
Then I shall discuss the problem of generalizing this approach from so-called T-satisfiability problems, where we have a conjunction of ground literals, a conjunction of ground unit clauses, to T-decision problems, where we have a conjunction of arbitrary ground clauses.
And a way to do that is the so-called decomposition approach, which is a way to decompose the problem and then submit it to a system which works as a pipeline of a first-order prover and an SMT solver. The first-order prover is invoked first to preprocess the problem and do, intuitively, as much theory reasoning and as much reasoning on universally quantified variables as possible, hopefully generating a ground problem; then it feeds its output to the SMT solver, an engine like your Z3, for instance. Or else one can, of course, decide to pass something directly to the SMT solver. So we have a bunch of sufficient conditions to show how to do this while preserving, of course, the satisfiability of the problem and, therefore, the soundness of the whole transformation.
Okay. So let's now go, in the second half, a little more into the details of what we have. For the termination results, we use a specific inference system which is called SP -- SP for superposition -- specific but, in fact, quite standard.
It is one of those systems that are implemented in the most commonly used first-order theorem provers; they all implement these kinds of inference rules. So what is in there? There is resolution. There is superposition. There is factoring. There is simplification by rewriting. All very standard inference rules, nothing terribly special or unique to this system.
An important ingredient is that this inference system assumes an ordering on terms, literals and clauses. Why do I mention this? I said I wouldn't go into technical details, and indeed I will not; I mention it because the ordering plays a role in the proof of termination.
What kind of ordering? It is a complete simplification ordering. "Complete" means it is total on ground terms: if we have two terms without variables, we can always tell which is greater. "Simplification ordering" means it has a set of nice structural properties that are very intuitive; for example, a term will always be strictly greater than any of its strict subterms. That's very natural: if you have a tree, you expect it to be greater than each of its strict subtrees.
And there are monotonicity and stability properties that are also very natural, about the structure of terms and the application of substitutions. There are many such orderings. They are implemented. They are well-known. We can assume one.
Most important, as a consequence of the nice properties I mentioned, these orderings are well-founded. I said the ordering will be important for the proof of termination. Why? Because in order to prove termination, we shall need to prove that the inference system generates only a finite number of clauses -- more precisely, that the set of clauses that persist, that don't get deleted by rewriting, say, is finite.
And the ordering will play a role there, because it excludes many inferences that we would otherwise do.
In particular, we shall also assume that this ordering is such that t is greater than c for all ground compound terms t and constants c. This is also very natural. We can just impose a precedence where all the function symbols are greater than all the constant symbols, and then define the ordering recursively and easily get this property.
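Summarizing, in notation, the requirements on the ordering > just described:

```latex
\begin{itemize}
  \item Totality on ground terms: for ground $s \neq t$, either $s > t$ or $t > s$.
  \item Subterm property: $t > s$ whenever $s$ is a strict subterm of $t$.
  \item Monotonicity: $s > t$ implies $f(\dots s \dots) > f(\dots t \dots)$.
  \item Stability: $s > t$ implies $s\sigma > t\sigma$ for every substitution $\sigma$.
  \item Well-foundedness: there is no infinite descending chain $t_1 > t_2 > \dots$
  \item The extra assumption used here: $t > c$ for every ground compound term $t$ and constant $c$.
\end{itemize}
```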
Now, this inference system, equipped with this ordering and a fair search plan, will give us a complete theorem-proving strategy. The form of our theorems is the following -- so this is not one special theorem; it is a template, a schema of a theorem. We show that a complete strategy is a T-satisfiability procedure. How do we show it? By showing that a complete strategy is guaranteed to terminate on an input given by the axioms for T and a set of ground literals on the signature of T.
We proved this kind of theorem for all the theories listed here. So we have the theory of lists; we considered different presentations of the theory of lists. We can have non-empty, possibly cyclic lists; we can have possibly empty, possibly cyclic lists. The difference between non-empty and possibly empty is whether you have nil or not in the signature.
"Possibly cyclic" means that these presentations don't include axioms to exclude models where an equation like car(x) = x is satisfiable. So we don't worry about cycles.
We proved it for arrays with or without extensionality. Extensionality is the axiom saying that two arrays are equal if all their locations are equal.
Records with or without extensionality; fragments of linear arithmetic: integer offsets, and integer offsets modulo K. "Integer offsets" means the theory where you have predecessor and successor. Very simple.
And also general recursive data structures with one constructor and K selectors. If you take K equal to one, you get integer offsets as a subcase, with predecessor as the constructor and successor as the selector.
Please.
>>: (Inaudible)
>> Maria Paola Bonacina: Integer offsets modulo K means modulo K, so you do not have all the integers; you have only the integers from zero to K minus one.
So in integer offsets you actually have an infinite bunch of axioms -- we shall see how to deal with that -- that say: the successor of X is different from X, the successor of the successor of X is different from X, and so on.
If you do integer offsets modulo K, you pick a K and you say the successor of X is different from X, the successor of the successor of X is different from X, but the successor applied K times to X is equal to X. So you close the cycle and you only have K values, okay?
For recursive data structures, if you pick K equal to one, you get back integer offsets. If you pick K equal to two, you get lists. But recursive data structures usually come with axioms saying that the structure is acyclic, that is, we cannot satisfy an equation like car(x) = x. So there are acyclicity axioms there. So we shall have acyclic lists, in contrast with those I listed before.
Anyway, a whole bunch of theories. Here are the integer offsets that you were asking about before. I mention it because this may look like an undoable problem for a theorem prover that expects a presentation as input, because it has infinitely many axioms. These are the ones I mentioned before: for all X, the successor of the predecessor of X is equal to X; the predecessor of the successor of X is equal to X; and, for all X and for all I greater than zero, the successor applied I times to X is different from X.
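In notation, writing s for successor, p for predecessor and s^i for i-fold application:

```latex
\begin{align*}
  &\forall x.\; s(p(x)) \approx x \qquad \forall x.\; p(s(x)) \approx x \\
  &\forall x.\; s^{i}(x) \not\approx x \ \text{ for all } i > 0
    && \text{(integer offsets: infinitely many axioms)} \\
  &\forall x.\; s^{i}(x) \not\approx x \ \text{ for } 0 < i < k, \qquad
    \forall x.\; s^{k}(x) \approx x
    && \text{(integer offsets modulo } k)
\end{align*}
```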
So what we do is give a problem reduction. Basically, we proved that it is sufficient to give as input finitely many of these axioms. How many? It is sufficient to go up to N, where N is, in the worst case, the number of occurrences of successor and predecessor in the set of ground unit literals.
It could be simply the number of constants for which the input defines the successor; but in the worst case, if each equation introduces a new constant, it will be equal to the number of occurrences of the function symbols successor and predecessor.
>>: (Inaudible).
>> Maria Paola Bonacina: I'm assuming that S is a set of ground literals, yes.
Big S is a set of ground literals, yes.
Next. So we proved all these theorems about termination on each of these theories. Actually, they are not all that difficult, because they are all based on an analysis of the inferences with respect to the theories and the ground literals: we show that only clauses of a certain kind can be generated under the ordering, and then there are finitely many of them, because we have a finite signature and because of how the inference system is organized.
But one says: well, it is not enough to prove termination for each theory; you also need to combine the theories. What if I have to prove termination for each combination? That would be ugly. But that's not the case, because we gave a modularity result. "Modularity of termination" means: if I can prove that the strategy terminates on T-satisfiability problems in Theory 1, in Theory 2, in Theory 3, each taken separately, then I can guarantee it will terminate also on the union and, therefore, on any union of any of these theories.
This requires two hypotheses. One is very standard for all combination methods: it requires that the theories do not share function symbols. Constants can be shared; functions, no. Why is that? Well, because when we prove the modularity result, we want to prevent unleashing infinitely many steps across theories. So if we do not have shared function symbols, we prevent paramodulation steps from a compound term in one theory into a term of the other theory.
And then we shall assume that the theories are variable-inactive. This is a technical but very simple condition. It prevents paramodulation from variables, and it is satisfied by all equational theories without trivial models. And if you have a theory with a trivial model, you can just add something like "there exist X and Y with X different from Y" to exclude the trivial model. It is satisfied by all Horn theories without trivial models.
And, basically, with this property we are in the realm of the so-called stably infinite theories, which are also those considered by the Nelson-Oppen combination method. It is a technical condition, but it is very simple to satisfy. And it is satisfied by all the theories we saw before.
Yes?
>>: The generalizations beyond nonshared function symbols -- do they still have (inaudible)?
>> Maria Paola Bonacina: Yes, there have been some generalizations of the Nelson-Oppen method that allow shared function symbols. You are thinking about the work by (inaudible). Yes.
>>: (Inaudible). There are a finite number. Does your result extend to that --
>> Maria Paola Bonacina: Whether we can lift the modularity theorem also to those cases, I don't know yet, but it is quite possible. It is something to be investigated next, certainly.
So let's see the shape of the modularity theorem. If the theories do not share function symbols and are variable-inactive, and the strategy is a T-satisfiability procedure for each of them, then it is a T-satisfiability procedure also for the union of the theories and, therefore, for any of their combinations.
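In schematic form, this is an informal rendering of the theorem's statement:

```latex
% T_1, ..., T_n pairwise share no function symbols and are variable-inactive;
% a fair SP-strategy terminates on T_i \cup S_i for each i taken separately.
\[
  \big(\forall i.\ \text{SP terminates on } T_i \cup S_i\big)
  \;\Longrightarrow\;
  \text{SP terminates on } (T_1 \cup \dots \cup T_n) \cup S
\]
```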
Now, up to this point we were working on T-satisfiability problems. So we conducted some experiments. This was a few years ago now, so we used the systems that were available back then: Version 0.82 of E, Version 1 of CVC and Version 1.1.0 of CVC Lite.
What was interesting with those experiments, at least, was that -- maybe we were a bit influenced by the SAT community -- we wanted to try synthetic parametric benchmarks to test scalability. We tried both satisfiable and unsatisfiable instances. We tried combinations of theories, and we tried sets of literals from the UCLID system. You see here some of the results. For instance, these benchmarks were called (inaudible), with parameter N. These are unsatisfiable instances. It was a problem obtained from the theory of arrays with extensionality.
It turned out to be fairly hard for the systems, at least back then. None could terminate for N greater than or equal to 10. But you can see that the theorem prover, which is the one with the white circles, could do better than the systems with theories built in, because it could take advantage of (inaudible).
Here you see the satisfiable instances of the same family of problems, and they are much easier. If you look at the runtimes on the Y axis, you will see that these are much smaller. And the theorem prover did well here, too, which is interesting, because the general expectation is that theorem provers, since they aim at finding proofs, should be good at unsatisfiable problems but maybe not so good at satisfiable problems. Once we show that the theorem prover is a decision procedure, this isn't necessarily true.
This is another family of problems on arrays where, let's see, here we have two curves overlapping: CVC and E had essentially the same performance. On the unsatisfiable instances, the E theorem prover did a little better.
This is yet another family of parametric synthetic benchmarks in the theory of arrays, again unsatisfiable instances. And here, too, the theorem prover did well; using a different ordering, a Knuth-Bendix ordering, it could do them in nearly constant time.
This was the benchmark for a circular queue. We modeled the queue by making up a record which has an array to hold the queue elements and two indices to mark the start and the end, in the array, of the portion used by the queue -- the first and last element of the array which is used by the queue.
Here, too, the theorem prover could scale reasonably well with the integers modulo K.
And, finally, it could very easily do the problems from the UCLID set. So they were (inaudible), but all could be done in a very short time.
But, again, most problems are not really only T-satisfiability problems. They come with general clauses; they come with disjunctions. So we do not have just sets of ground literals; we have sets of ground clauses. So we want to go from assuming that S is a conjunction of ground unit clauses to having S as a conjunction of ground clauses.
We proved a theorem that says that if the theory is variable-inactive and the SP strategy is guaranteed to terminate on problems made of ground literals, then it will also terminate on problems made of ground clauses.
This was obtained through this T-decision scheme. Assume T ∪ S is the input: T is the theory, and S is a set of ground clauses. There is a preliminary step that involves essentially only flattening. A subset S1 of ground unit clauses is fed, together with the presentation of the theory, to the theorem prover, which generates a finite limit, call it F. This is then reunited with a set S2, which contains only strictly flat ground nonunit clauses, that is, disjunctions of equalities and inequalities between constants. Through flattening, we can essentially reduce any set of clauses to S1 ∪ S2, where S1 is a bunch of units and S2 is a bunch of disjunctions of this kind.
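A sketch of the flattening step, assuming terms as nested tuples and literals as triples (lhs, rhs, positive); fresh constants, assumed not to clash with the input signature, name every nested subterm, so the nonunit residue S2 contains only (dis)equalities between constants:

```python
import itertools

def flatten(clauses):
    """clauses: lists of literals (lhs, rhs, positive).
    Returns (defs, flat_clauses): defs are new unit equations
    f(c1, ..., cn) = c defining fresh constants; flat_clauses
    contain only (dis)equalities between constants."""
    fresh = (f'c{i}' for i in itertools.count())
    defs, names = [], {}      # names: flattened subterm -> fresh constant

    def name_of(t):
        if isinstance(t, str):    # already a constant
            return t
        flat = (t[0],) + tuple(name_of(a) for a in t[1:])
        if flat not in names:
            names[flat] = next(fresh)
            defs.append((flat, names[flat]))   # e.g. ('f', 'c0') = 'c1'
        return names[flat]

    flat_clauses = [[(name_of(l), name_of(r), pos) for l, r, pos in clause]
                    for clause in clauses]
    return defs, flat_clauses
```

For instance, flattening the unit clause f(g(a)) ≈ b, that is, flatten([[(('f', ('g', 'a')), 'b', True)]]), yields the definitions g(a) ≈ c0 and f(c0) ≈ c1 plus the flat literal c1 ≈ b.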
So we put them back together; we apply the strategy again; and we can show that this second application -- given that the first one terminates, which we assume -- will also terminate, because we have an analysis of the possible inferences on clauses made of disjunctions of strictly flat equalities and inequalities, that is, equalities and inequalities between constants. And the theory is variable-inactive, so we don't paramodulate from variables. It follows, essentially, that the number of paramodulation steps is bounded by the number of constants; since we have finitely many of those, we shall have termination.
Yes?
>>: (Inaudible).
>> Maria Paola Bonacina: Yes.
>>: (Inaudible).
>> Maria Paola Bonacina: Yes.
>>: So you are essentially performing a completion?
>> Maria Paola Bonacina: Yes, in some sense. This could terminate and find unsatisfiability right here, if it happens that the unit part alone is already responsible for the unsatisfiability. Or else it will generate a saturated set -- finite, satisfiable -- so it has, in some sense, compiled a part of the problem into a saturated set, which then gets reunited with the piece we left out before. And if we give it to the same strategy again, it terminates.
>>: (Inaudible) wouldn't do this kind of staging within --
>> Maria Paola Bonacina: One single run, you say? No.
>>: It is not typical. Can good SP strategies be guided to do this kind of ordering -- first using the units and then only (inaudible) and saturating?
>> Maria Paola Bonacina: Yeah, but it's not hard to do. I mean, the prover can preprocess the input, be invoked on part of it, and then restart again on the output of the first run together with the part of the input that was not yet preprocessed. It is not hard to implement in an existing theorem prover.
>>: (Inaudible). Is that something you can do specifically?
>> Maria Paola Bonacina: I don't know. I mean, "common" is an empirical statement. I don't know whether it is common or not. It doesn't sound so uncommon to me, because these kinds of inference systems, ever since Knuth-Bendix completion, have always had a sort of double life. I mean, you can see them as semidecision procedures that go after a proof, or you can see them as inference systems that generate something called a canonical system.
Now, in many cases the second interpretation is not useful, because it doesn't terminate: in most cases, saturated systems are infinite. But here we are in a framework where we have termination to begin with, so we get a finite thing here.
So it is not so unusual, also from a conceptual point of view, to think of these inference systems not only as proof procedures that go after a proof but also as inference systems that do a sort of completion. This is what happens there: you can think of this as a completion, and then of finding the proof in the completed set together with the rest of the problem, the nonunit part.
>>: The standard selection strategies probably don't consider that literal ordering (inaudible).
>> Maria Paola Bonacina: Oh, you are worried about the selection strategy -- about how the selection strategy in the prover could keep these quiet while working on these, and activate these clauses only after this is done?
>>: (Inaudible).
>> Maria Paola Bonacina: Okay. I don't think it is too hard. Okay.
Other questions?
>>: (Inaudible).
>> Maria Paola Bonacina: Yes. Indeed, earlier I thought that this was the point of his question: do we do it all at once, all together. Yes, of course. This was just the way that we found it easy to prove termination, so that we could reuse here all the termination results for the ground unit clauses we had before, and then only prove termination for the saturated set and the flat ground disjunctions.
So this is also probably a better answer to his question: this is not necessarily how you're supposed to implement it, because, since the strategies are fair, we can have the work done in one single run. This is just the way we proved the termination.
Sorry, I should have answered my first implied reading of your question, which was actually his question. Anyway...
But we didn't really think about implementing this one, because we tried a few experiments and we found that resolution-based provers such as E do not handle very large disjunctions efficiently, because essentially they work by resolution, and resolution tends to duplicate literals: it takes two clauses and builds a new clause by inheriting most of the literals.
So if you have very large disjunctions, even with good search plans, even with selection strategies, it gets really hard to handle those huge clauses that get generated. So we started thinking about something else:
how we could somehow combine the strengths of the first-order prover with those of an SMT solver, which is based on the Davis-Putnam-Logemann-Loveland procedure and therefore can handle the ground nonunit clauses with DPLL.
So we generalized the notion of dividing unit clauses from nonunit clauses into a more general, more elegant, if you wish, decomposition principle. We take a problem and we decompose it into a so-called definitional part and an operational part. "Definitional" means we are going to have the axioms of the theory and flat ground unit clauses, for instance those that define function symbols. And the operational part will certainly contain all the nonunit clauses. So it can boil down to what we had before, but it can also be more general.
Then we shall do, indeed, theory compilation, that is, an application of the theorem prover as a sort of completion engine, to do as much theory reasoning as possible upfront and as much reasoning on nonground equations as possible upfront.
Then we shall take the output of this and whatever we left out of the original input, unite them again, and give them to an SMT solver, such as your Z3. The important thing is that we proved that from the saturated set output by the first preprocessing step we can get rid of the axioms, which, of course, we want to do, because the SMT solver doesn't know what to do with the axioms; it has the theories built in. So we want to show that the theorem prover can do enough completion work to have, in the saturated set, enough knowledge about the theory that the theory axioms are no longer needed. Essentially, we proved that the saturated set can entail everything that could have been entailed by the theory axioms, had they still been there.
So we gave a bunch of sufficient conditions to prove that this transformation preserves satisfiability. This is how it works; we call it T-decision by stages. We start with the problem, which is T, the theory, or a combination of theories, because it all lifts to combinations, and S, a set of ground clauses, not necessarily unit. We decompose it.
Now, the crucial part of the decomposition is to make sure that S1 only contains ground units; but it doesn't have to contain them all. We can actually send some ground units down here, if we prefer, provided that all the clauses up here are ground units and we put them together with the theory. And we apply the SP strategy as a compiler, as a preprocessor, to generate -- ah, there is a glitch in this picture: this S should have been closer to the (inaudible). Never mind; I didn't notice it before.
Anyway, we apply it to generate a saturated system, a saturated set. Then we throw out the theory axioms, which at this point we no longer need, because we want to go toward an SMT solver. And what we have here, together with what we held back before, we give to the SMT solver. And we showed that this works for a bunch of theories like arrays, records, integer offsets.
And in some cases, the output here is ground. So, except for the axioms of the theory, the saturated output is ground, which means we actually get a reduction from the problem in the theory to a problem in the theory of equality with uninterpreted function symbols. So we managed to compile away all the information about the theory, and here we get a ground set that we can just give to an SMT solver.
And, on the other hand, assume that we have some theory, like linear arithmetic, that the theorem prover does not handle well, so we prefer to give arithmetic to the SMT solver -- fine. If we have arithmetic in the problem, we just don't say anything about arithmetic to the prover: we keep it here and give it directly to the SMT solver. The same holds for bit-vectors: bit-vectors are also a theory that is handled much better by an SMT solver than by a general theorem prover. So whatever we don't want to bother giving to the prover, we keep here and send down to the SMT solver.
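Schematically, the pipeline looks like the sketch below; saturate, is_ground and SMTSolver are hypothetical stand-ins, not the API of E, Z3 or any actual system:

```python
def t_decide_by_stages(theory_axioms, ground_units, held_back):
    """T-decision by stages (sketch): a superposition prover acts as a
    preprocessor/compiler, then an SMT solver handles the residue."""
    # Stage 1: saturate the theory presentation with the ground unit part.
    saturated = saturate(theory_axioms + ground_units)
    if saturated == 'unsat':
        return 'unsat'            # the unit part alone was contradictory
    # Stage 2: drop the theory axioms; keep what the prover derived.
    # The key result: the saturated set entails everything the axioms
    # would have entailed, so they are no longer needed downstream.
    residue = [c for c in saturated if c not in theory_axioms]
    # In the cases proved (arrays, records, integer offsets) the residue
    # is ground, so the problem has been reduced to ground EUF.
    assert all(is_ground(c) for c in residue)
    # Stage 3: the SMT solver gets the residue plus whatever we never
    # showed to the prover at all (say, arithmetic or bit-vectors).
    return SMTSolver().check(residue + held_back)
```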
To sum it up a little bit: what we have is a bunch of termination results showing that we can design T-satisfiability procedures based on generic reasoning. We have a modularity theorem for combinations of theories. We ran some experiments; they are a little bit outdated now, but back then they showed that the theorem prover, against some expectations, was not behind the SMT solvers of the day.
We generalized to T-decision problems, with clauses and not just literals. And one of the ways to do the generalization is this notion of decision by stages, where we pipeline the prover and the SMT solver in such a way that the prover acts as a preprocessor for the SMT solver. That is nice because you can do it once and for all and then forget about the prover and just work with the SMT solver. And it is useful especially if we have problems with quantifiers, because you can deal with those as part of the theory in the theorem prover and then not have to guess how to instantiate the universally quantified variables in the SMT solver.
Some current and future work. Well, I'm looking for more termination results, for more decision procedures, for more powerful decision procedures, and for experiments with T-decision problems, not just T-satisfiability problems with literals. There is always the issue of how to find good search plans for these problems, especially taking into account that the search plans that come with a theorem prover taken off the shelf are often conceived for different search problems. Those problems typically have an infinite search space, and they need narrow, deep searches to find what is often a relatively small proof in the potentially infinite search space of a semidecidable problem.
In decidable problems, often the search space and the search behavior are different. We need not go as deep -- a shallow search is sufficient -- but, on the other hand, we need to go wider and be more exhaustive in checking things. So there are search problems there to be investigated.
Then I am interested in integration with model building because, as I said, once you have a finite saturated set, it is the basis for building a model, but it is not yet a model as we would like it. So there are a lot of issues and problems about model representation and model extraction.
Here is a bunch of references. These are three journal papers: one has appeared; one is to appear, from February or April, in the (inaudible) journal on logic and computation; and one we just submitted. Together they contain essentially all of the work I surveyed quickly here. So there you will find all the definitions and theorems and proofs for those results.
And, of course, I want to thank a whole bunch of people I had the pleasure to work with on this topic (listing names), and probably more that I at least discussed with, although they were not co-authors.
Finally, I'm here for five weeks, as Leonardo said, and I'm looking for more friends to work with, including post-docs and students, and for more problems, applications, theories to try and things to think about.
So thank you very much for your attention and for having me here.
(Applause.)
Yes?
>>: In your last diagram, where you split the things, trying to harvest the theories: the S1, I think you said, was only -- only unit clauses are possible?
>> Maria Paola Bonacina: Yes.
>>: If I think of the problems I would feed to the theorem prover, at the outermost level they would correspond to a bunch of disjunctions corresponding to the paths of the program. So that means I would feed it hardly any unit clauses at all. Does that mean it buys me nothing?
>> Maria Paola Bonacina: No, no, it doesn't mean that, because -- let's go back a little bit. We have to see what happens in this decomposition stage.
Okay. So your disjunctions will be up there in the S sector. They will go first through the decomposition. The decomposition will need, among other things, to flatten your problem, that is, to make sure that no term has depth greater than one.
Are your problems already very flat, or do you have nested terms?
>>: Yeah, they're nested.
>> Maria Paola Bonacina: They are nested terms, okay. So, to do the flattening, new constant symbols will be introduced, and new unit equalities will be generated that define function applications and constants as other constants -- new symbols that come in to represent intermediate values, so to speak.
So this part here will then contain disjunctions that are disjunctions of equalities and inequalities only between constants. Then you have only things like A equals B, or C different from D, and so on -- only constants.
Everything else will have gone here in the form of unit clauses that may not have been in the original formulation of your problem but will have been generated by the decomposition.
>>: It is like the vocabulary?
>> Maria Paola Bonacina: Yes, exactly, exactly.
Please.
>>: So you can have convex theories, where a solver that checks satisfiability of literals (inaudible). So you never generate nonunit clauses?
>> Maria Paola Bonacina: And?
>>: That seems to be fairly different from our sort of (inaudible) array.
>> Maria Paola Bonacina: Yes, yes, that's true. Well, you see, regarding the issue of convexity and nonconvexity: as I said, for the modularity result on termination we assume variable-inactive theories, so we are within stably infinite but not necessarily convex. So we are more general than convex, but not more general than stably infinite, for now; we shall see whether we can go beyond that.
In general, let's say that that affects the modularity of termination; but, in general, with a theorem prover you don't have, in principle, restrictions. I mean, take somebody like Viktor Kuncak: he was at MIT, he graduated with Martin Rinard, and now he is in Lausanne. He also uses generic theorem provers for his reasoning about Java programs, and he doesn't even worry about termination. He just uses the prover and sees how it goes, or else sets empirical limits -- don't go beyond a certain depth in the proof or in the length of clauses -- like you do in Z3 with the (inaudible), controlling termination by empirical means. That can always be done.
So there are also people who do that. We worried about termination because we wanted to make the case that we can have decision procedures, so that they can be embedded in something bigger and one doesn't have to worry: will it terminate? Won't it terminate? Do I have to set a parameter to make it terminate if not?
So I think having decision procedures is important. But if you have a generic theorem prover, nothing prevents you from trying it on nonconvex, non-stably-infinite, non-whatever you have, and seeing if you can still get good enough results for your experiment, or whether you can still get termination even if you don't have, maybe, the variable-inactivity condition.
>>: (Inaudible). Quite intriguing, to do the completion of the units, even in the case that you have non(inaudible).
>> Maria Paola Bonacina: Well, but then there will be creation of nonunit clauses in here, because, say, you have the theory of arrays in here, which is not convex. So here there are the axioms of the theory of arrays, which are not unit clauses, which have universally quantified variables, and they will interact with the units here and generate nonunit stuff. So the output will not be all units.
>>: Oh.
>> Maria Paola Bonacina: Okay, okay. No, the completion part doesn't buy us that much in this case, but it does buy us something.
More questions or comments?
(Applause.)