>> Leonardo de Moura: Hello. It is my great pleasure to introduce Maria Paola Bonacina. She is visiting us for five weeks. She has worked for many years on automated reasoning. Today she will talk about how the superposition calculus yields decision procedures. And she was very kind: in the next weeks she is going to give more lectures about the technical details. If you are interested, you are welcome to come.
>> Maria Paola Bonacina: Thank you very much. Thank you, Leonardo, and (inaudible) and Nikolaj for the invitation. Thank you all for coming today and for having me here. As Leonardo said, this is the first talk, where I'm going to give an overview of how one can possibly apply -- at least in some ways -- generic theorem proving to design decision procedures for satisfiability modulo theories problems. If there is interest in the next weeks, I will be willing to give other talks or some more informal (inaudible) so we can discuss more freely and go into the details, if there is interest, of the (inaudible) theorems in the papers behind this presentation. So for today, this is the general overview, so we get a feeling of what has been done and of what may be investigated next. I start with some motivation. Then I'll enumerate some of the major reasoning methods that have been studied and some of what seem to be their strengths, because different methods are good for different things and none of them is good for all. Then I'll give an overview of the results we have obtained, myself and many others, on applying generic theorem proving to satisfiability modulo theories problems. That will include termination results showing that we can use what is a semidecision procedure for first-order logic as a decision procedure for these satisfiability problems. We gave a general theorem showing modularity of termination for combinations of theories. "Modularity" means that if we have termination on each theory, we also have termination on the union. Then I shall recall some experiments we conducted, actually a few years ago now, with a generic theorem prover named E, the work of Stephan Schulz, on satisfiability problems. And then I'll overview more recent work on a decomposition approach that is meant to unite somehow the strengths of first-order theorem provers with those of SMT solvers and try to get a little bit of the best of both worlds on these problems, and some discussion -- some open discussion -- at the end. So let's start with some motivation. Well, we (inaudible) software everywhere and we would like it to be reliable; it is more than a wish, it is a need. But it is a difficult goal for many reasons. Software can be artful. Software is, in many ways, a work of art. It contains the creativity, the insight, the ingenuity of people, and that makes it difficult to treat. It can be complex. It can be huge; we all know that. It can be very varied: different pieces of software, different applications, different contexts. It can be old, and maybe it comes without (inaudible), so that it is harder to deal with it. And even now that there are so many studies in informatics, there is the possibility that software may reflect some "natural" -- I have quotes there because "natural" is a heavy attribute -- but still some natural laws of computing that surface in different computing models and computing formalisms, and that may also contribute to make software such an intriguing, complex, interesting and also difficult-to-work-with work of art.
Furthermore, I'm talking about software -- and here we are in a software reliability research group -- but we should not forget that the problems of hardware verification are not solved, not solved entirely at least. And the border between software and hardware is somewhat blurred. It is also evolving; it is not something defined once and for all. There is migration of functionality from software to hardware, and there are also many approaches that describe hardware in such a way that it looks like software at a certain level of abstraction. So reasoning (inaudible) may contribute not only to reasoning about software but also to reasoning about hardware. But let's stick to software. There are many approaches to software reliability. You know most of them; most of you are experts in some of this, so you know them better than I do. There is testing with automated test case generation. There is the design of programming systems that enable the programmer to produce better software to begin with. There are program analyzers that analyze programs after they have been written, or maybe during the process of developing successive versions. Analyses may be static or dynamic, and they use different technologies, from types to abstract interpretation. There is software model checking. We all know that model checking has obtained great results in hardware verification, but it has been applied also to software. And, interestingly for this talk, software model checkers make distinguished use of theorem proving, of reasoning techniques. For instance, in bounded model checking, the problem is reduced to deciding the satisfiability of a formula. In counterexample-guided abstraction refinement, theorem proving comes in because the problem of deciding whether an error in the abstract program corresponds to an error in the concrete program is formulated as a satisfiability problem. So if we get "Yes, it is satisfiable," we know that the error we found in the abstraction also appears in the concrete program. If we get "No, unsatisfiable," then we can use the proof to refine the abstraction. And I'm leaving things out, because there is certainly more that I haven't cited. But the common point I would like to make in this talk is that a variety of these approaches, a variety of these technologies, could make use of reasoning about software. What could reasoning about software be about? It can help find and remove bugs. A bit less modestly, it can prove a program free of bugs of a certain kind. And, more ambitiously, it can prove the program correct, that is, free of all bugs. Now, systems that reason about software may be varied again, but they typically have a common architecture with a front-end, which is the interface where we model the program and compile somehow the information coming from the program into formulas that can be given to the back-end, where we have the reasoning engine that ultimately solves the problem -- what we can think of as a theorem prover. For instance, in your group you have the theorem prover Z3, developed by Leonardo and Nikolaj, and that here would be the reasoning engine, the back-end behind another system, which could be your HAVOC or other specialized front-end systems. The focus of this talk will be on the reasoning engine. Since I come from theorem proving, I'm going to discuss the reasoning technologies and some possibilities they make available.
When we talk about automated reasoning, we usually mean two major tasks, defined formally. One is theorem proving: finding a proof for a conjecture and thereby showing that it is a theorem. The other is model building: building a model, which is often a counterexample for a conjecture. So let's see these a little more formally. We shall assume, as I said, to have some way to go from programs to formulas. We are not concerned with that in this talk because we focus on the reasoning engine, so let us assume it is addressed somewhere else. What we are concerned with is a reasoning engine that starts with a formula. A formula may typically have the form H implies C, where H represents a bunch of assumptions and C represents some conjecture. What we want to do is determine whether this H implies C is valid or, equivalently, whether C is a logical consequence of H. Or, again, thinking refutationally, whether H union not-C is unsatisfiable. And this would be achieved by giving a refutation of H union not-C, which shows that it is unsatisfiable and, therefore, is a proof of the validity of H implies C (inaudible) or the validity of C. This is the task of theorem proving. Dually, we may have the answer that H union not-C is satisfiable, which means H implies C is not valid. And this is done by giving a model, a model of H union not-C, which would be a countermodel of C, of the positive conjecture. And this is what is done by model building. Now, what do we have in formulas? In formulas, we have various ingredients, beginning with propositional logic with the usual connectives and propositional variables. We shall have equality, positive or negated, with so-called uninterpreted symbols: constant symbols a, b, c, function symbols f, g, h. "Uninterpreted" means that they are free; they could be interpreted in different ways in different structures. Then we have (inaudible) theories, which means we have fixed the attention on certain interpretations and, therefore, symbols are not free; they are interpreted in a certain way. They include theories of data structures such as lists, records, data structures that we can think of as generalizations of lists. Just like in lists we have a constructor, cons, and two selectors, car and cdr, or head and tail, in a general data structure we have a constructor and, say, K selectors. Then we have arrays, records, bit-vectors. We have (inaudible) because problems come with numbers, with integer or real types. Finally, we have the whole of first-order logic, which brings in the quantifiers and free predicate symbols, or relation symbols. Now, depending on which language, which theory we select, our validity problem, or the dual satisfiability problem, may be decidable or semidecidable. If we place ourselves in the general setting of first-order logic, we shall have a semidecidable problem and, therefore, the best we can get is a semidecision procedure. If we select a decidable fragment, we can have a decision procedure. Several of the first-order theories I mentioned before have the property that if we restrict ourselves to the quantifier-free fragment, that is, we consider only ground formulas, formulas without variables in the logical sense, we may have a decidable fragment; therefore, we can have decision procedures. In the literature, people talk about T-decision procedures for decision procedures that decide the satisfiability of ground formulas in a theory T.
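As an aside (not part of the spoken talk), the refutational reformulation just described, and one example of the kind of theory presentation mentioned -- the usual axioms of the theory of lists with constructor cons and selectors car and cdr -- can be written out as follows; these are standard textbook formulations:

$H \models C \quad\Longleftrightarrow\quad H \cup \{\lnot C\} \text{ is unsatisfiable}$

Theory of (non-empty) lists:
$\forall x\,\forall y.\ \mathit{car}(\mathit{cons}(x,y)) = x$
$\forall x\,\forall y.\ \mathit{cdr}(\mathit{cons}(x,y)) = y$
$\forall x.\ \mathit{cons}(\mathit{car}(x),\mathit{cdr}(x)) = x$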
Ground formulas, without loss of generality, can be reduced to sets of ground clauses; call such a set S. Typically, people talk about a T-satisfiability procedure for a decision procedure that decides the satisfiability of a conjunction of ground unit clauses, or a conjunction of ground literals. So what do we want these reasoning procedures to be like? We would like them to be expressive, so that they handle all the ingredients, for instance all the theories, that appear in the formula. We would like them to be sound and complete, so that they give neither false negatives -- saying "there is no bug" not because there really is no bug but because the proof is wrong -- nor false positives -- saying "yes, there is a bug" when that is not true, just because the system cannot find the proof that there is none. We want them to be efficient, because each formula they will be dealing with will typically represent only some part of the overall verification task we are after. We want them to be scalable, because practical problems generate huge formulas. We would like them to produce proofs, so that we can check the proof and manipulate it. For instance, remember what I mentioned before about software model checking, where one can use the proof of unsatisfiability to refine an abstraction; so being able to work on the proof is useful. And we would like them also to produce models because, as I said before, the model represents a counterexample, which is often what we are really after. We would like a counterexample that helps us find the bug; a counterexample to correctness is often what helps us find the bug. So these are all the (inaudible) for reasoning procedures. Now, (inaudible) puts at our disposal a variety of reasoning methods that are good for different things. Many of them are probably known to you and are already implemented here. Let's start with the Davis-Putnam-Logemann-Loveland procedure, DPLL, which is typically used for satisfiability problems in propositional logic. It is very strong for its ability to do case analysis and, therefore, break apart large formulas, distinguishing the various cases. We have the congruence closure algorithm, in short CC, which is typically used to reason about ground equations. Congruence closure means -- think about a graph -- that if all the children of two vertices are in the relation, then the two parent vertices, having the same label, are also in the relation. And we can use that to reason about equality because equality is a congruence. Then there are specialized theory solvers, for instance the simplex method for linear arithmetic. These can be combined in the so-called DPLL(T) framework; that is, we have the (inaudible) procedure for propositional logic with an incorporated procedure for a theory, for instance the congruence closure procedure for equality. EUF stands for equality with uninterpreted function symbols. Next, we can combine different theory solvers using different methods and obtain what are called DPLL-based SMT solvers. There are different ways to do combination of theories. One is the (inaudible) method. One is delayed theory combination, which essentially has the DPLL solver do the combination. Another is model-based theory combination, which presumes that each solver builds its own model and can use it to drive the combination. But there are more.
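To make the congruence closure idea mentioned above concrete, here is a minimal sketch (an editorial illustration, not the algorithm of E, CVC, Z3 or any particular tool; terms are represented as nested tuples, and the class and helper names are invented for this example):

    # Minimal congruence closure over ground terms such as ("f", ("a",)).
    class CongruenceClosure:
        def __init__(self):
            self.parent = {}   # union-find parent pointers, keyed by term
            self.uses = {}     # class representative -> terms using it as an argument

        def find(self, t):
            while self.parent[t] != t:
                self.parent[t] = self.parent[self.parent[t]]  # path halving
                t = self.parent[t]
            return t

        def _congruent(self, u, v):
            return (u[0] == v[0] and len(u) == len(v)
                    and all(self.find(a) == self.find(b) for a, b in zip(u[1:], v[1:])))

        def _add(self, t):
            if t in self.parent:
                return
            self.parent[t] = t
            self.uses[t] = set()
            for arg in t[1:]:
                self._add(arg)
                self.uses[self.find(arg)].add(t)
            # if some already known term is congruent to t, merge the two classes
            for u in list(self.parent):
                if u != t and self._congruent(t, u):
                    self.merge(t, u)
                    break

        def merge(self, s, t):
            self._add(s)
            self._add(t)
            rs, rt = self.find(s), self.find(t)
            if rs == rt:
                return
            self.parent[rs] = rt                      # union the two classes
            pending = self.uses[rs] | self.uses[rt]   # parents whose arguments may now be equal
            self.uses[rt] = pending
            for u in list(pending):                   # propagate congruences upward
                for v in list(pending):
                    if u != v and self.find(u) != self.find(v) and self._congruent(u, v):
                        self.merge(u, v)

        def equal(self, s, t):
            self._add(s)
            self._add(t)
            return self.find(s) == self.find(t)

    # Usage: from a = b and f(a) = c it follows that f(b) = c.
    # cc = CongruenceClosure()
    # cc.merge(("a",), ("b",))
    # cc.merge(("f", ("a",)), ("c",))
    # cc.equal(("f", ("b",)), ("c",))   # -> True, by congruence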
For instance, there is rewriting, also known as simplification, which brings ordering into the picture. The main ingredient is that we assume a well-founded ordering on terms, and this gives us a notion of normal form, of an expression or a term being reduced to a normal form. What is important is that rewriting brings in matching, that is, the ability to match a target with a (inaudible) in the presence of variables. So we no longer have reasoning about ground equations only, as in congruence closure; with rewriting and matching we can work with variables. Then there is resolution. Resolution brings in the ability to deduce clauses from clauses. So it is considered a synthetic reasoning method, because it synthesizes clauses from clauses. And its major point of strength is unification. If you consider resolution in propositional logic, it is not of much use; DPLL does much better. Resolution comes usefully into the picture when you go up to first-order logic, when you have universally quantified variables, because resolution enables you to instantiate quantified variables by using unification. Now, matching and unification can be generalized to E-matching and E-unification, which means we can do the matching or the unification operation modulo a set of equations. And this can be done also in the congruence closure algorithm, to instantiate (inaudible) universally quantified variables using the ground equalities in the congruence closure graph as the set of equations E. Then we have theorem-proving methods that work by instance generation. Essentially, they implement the Herbrand theorem by trying to generate an unsatisfiable set of ground clauses from a general set of clauses. Then there is the whole family of tableau-based methods, which, in contrast with resolution, can be seen as analytic methods, because the way they work is by (inaudible) deduction, that is, by decomposing formulas into subformulas. And they can also be used to show unsatisfiability by so-called model elimination, viewing the tableau as a survey of all possible models and eliminating them all to obtain a proof. Now, (inaudible) can be combined with superposition, which is an inference rule for deducing equations from equations, equations with variables. This can be done in the context of so-called Knuth-Bendix completion, which can be used to semidecide problems in equational theories. Finally, when we put it all together, and we put together resolution, rewriting, superposition and paramodulation -- they are very similar -- we get the full-fledged first-order theorem prover that can deduce clauses, of course with equations and with universally quantified variables, from other clauses with equations. Now, this is a really large choice of tools, of reasoning methods, that we can pick from to build our decision procedures. Empirically, people have come to have a sense of what these various methods are good for. DPLL is good for SAT problems, especially because it can break apart large non-Horn clauses. Congruence closure is good for reasoning about ground equations. Theory solvers are very good for reasoning about special theories like (inaudible). DPLL-based SMT solvers are really strong on ground SMT problems, uniting the strengths of all the previous ingredients. Now, when we look at the other side and we see rewriting (inaudible) and, for instance, Knuth-Bendix completion, these are good for reasoning about non-ground equations with universally quantified variables.
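As an illustration of the matching ingredient just described (an editorial sketch, not code from any prover; terms are nested tuples, and strings starting with an uppercase letter stand for variables):

    # Syntactic matching of a pattern (with variables) against a ground term,
    # and one rewrite step at the root of a term.
    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def match(pattern, target, subst=None):
        """Return a substitution s with apply_subst(pattern, s) == target, or None."""
        subst = dict(subst or {})
        if is_var(pattern):
            if pattern in subst:
                return subst if subst[pattern] == target else None
            subst[pattern] = target
            return subst
        if is_var(target) or pattern[0] != target[0] or len(pattern) != len(target):
            return None
        for p_arg, t_arg in zip(pattern[1:], target[1:]):
            subst = match(p_arg, t_arg, subst)
            if subst is None:
                return None
        return subst

    def apply_subst(t, subst):
        if is_var(t):
            return subst.get(t, t)
        return (t[0],) + tuple(apply_subst(arg, subst) for arg in t[1:])

    def rewrite_once(term, lhs, rhs):
        """Rewrite `term` at the root with the rule lhs -> rhs, if it matches."""
        s = match(lhs, term)
        return apply_subst(rhs, s) if s is not None else term

    # Example: the rule car(cons(X, Y)) -> X applied to car(cons(a, b)) gives a.
    # rewrite_once(("car", ("cons", ("a",), ("b",))),
    #              ("car", ("cons", "X", "Y")), "X")   # -> ("a",)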
Likewise, resolution is good for reasoning about non-ground first-order clauses, especially Horn clauses. And likewise, when we combine resolution, paramodulation and rewriting, we get methods that are sufficiently good for reasoning about general non-ground first-order clauses with equality. But, again, these methods are not as strong as DPLL at breaking apart very large formulas, and they seem to excel especially at Horn problems. Now, this was the overview of general theorem proving. Let's focus a little bit more on what we are going to use in this talk. Let us assume we have an inference system for first-order logic with equality, and let's say a rewriting-based one -- not so much because I want to exclude the others, but because all the problems we're interested in include equality, and rewriting has proved to be probably one of the best methods for dealing with equality; tableau-based systems or instance-generation systems, when they want to work with equality, also bring in some sort of rewriting and superposition. So let's assume we have one of those. An inference system is non-deterministic in nature. So in order to have a working method, a so-called theorem-proving strategy, we shall need to combine it with some control, usually called a search plan, with typical terminology from AI. So a strategy is an inference system plus a search plan. And if we have a refutationally complete inference system -- that is, one that ensures that whenever the input is unsatisfiable, a proof can be generated -- and a fair search plan -- that is, one that does not neglect any necessary step and therefore ensures that if there are refutations, one can be found -- we shall have altogether a complete strategy. There are many of these. This is not our problem in this talk; we shall assume one and see what we can do with it for SMT problems. Now, the main idea of the approach I'm going to describe is the following, and it is very simple. If we can show that the first-order inference system is guaranteed to terminate on any T-satisfiability problem for some theory T, then any complete theorem-proving strategy based on that inference system is, in itself, a decision procedure. It is already sound and complete because it is a strategy for first-order logic; if we add termination, we go from having a semidecision procedure for general first-order logic to having a decision procedure for the specific class of problems, the T-satisfiability problems, where we show termination. Now, a few things to remember. If we use a generic theorem prover, we are not going to have the theory built in, so to speak, in the algorithm; so we shall give a presentation, an axiomatization of the theory -- a bunch of clauses describing the theory, in the form of formulas restricting the interpretation of the symbols in the theory -- as part of the input. So the input will have the form T union S, where S is, say, a set of ground unit clauses, and T is a presentation of the theory. Of course, in problems we usually have more than one theory, as we saw before. So in this approach the combination of theories will start by giving as input a union of the presentations of all the theories we need. Also, notice that the border between T and S is soft, flexible, because if someone comes with a problem that contains a formula with universally quantified variables, we can migrate it from S to T and say it is part of the presentation of the theory.
So if you have a system where theories are built in and you have a new formula with quantifiers coming in, you have to deal with it as part of the problem. If, instead, you use a first-order theorem prover, where everything goes into the input anyway, you can see it as part of the theory. So there is an additional flexibility there.
>>: I have a question.
>> Maria Paola Bonacina: Please.
>>: You said that the theory has to be somehow given as a set of axioms?
>> Maria Paola Bonacina: Yes.
>>: How does one figure out the axioms that go with the theory?
>> Maria Paola Bonacina: Well, for those theories such as those I listed before -- lists, arrays, records and so on -- the presentations are (inaudible) known. I could give some examples if there is interest. For instance --
>>: So that is a problem that the user has to somehow solve? That is not addressed by this approach, right?
>> Maria Paola Bonacina: Yeah, that's correct. The theorem prover assumes to have, as part of the input, a description of the theory that we come with in the problem, yes. Other questions? Just feel free to interrupt any time, okay? Now, what would be the advantages if we could do this? Well, we do have a sound and complete inference system, so we have complete strategies. We have advantages in terms of expressivity, because we would have the full power of first-order logic with equality and, in particular, native quantifier reasoning. For combinations of theories, provided we can show termination -- and we shall see that with the modularity result -- we just give the union of the presentations as part of the input. And we have, as I said, some flexibility in drawing the line between theory and problem. Now, we can use existing theorem provers off the shelf, so to speak, or almost; proof generation will already be there by default, because these inference systems typically generate a proof if they find the input unsatisfiable. Model generation is not as easy, but we have a starting point. If we demonstrate that the inference system is guaranteed to terminate on these T union S problems, it terminates regardless of whether it finds a proof or not: if it finds a proof, the output will be the proof; if it doesn't find a proof, the output will be a satisfiable set which has certain properties -- we say it is saturated -- which can form the basis for building a model, because it is satisfiable and saturated, which means it contains a lot of information about what is true in that theory. Now, so far so good. But what about termination? We are starting from what is a semidecision procedure for first-order logic with equality, so termination is by no means given. Here I enumerate some of our results, which I'm going to outline in the second half of the talk. Termination results: we proved that a fairly standard first-order inference system is guaranteed to terminate on satisfiability problems in theories of data structures, so that it is a satisfiability procedure. And in some cases, when the theory allows it, this can be done with polynomial time complexity. Then I give a result about combination of theories, what I mentioned already before as modularity of termination. "Modularity" means that if I have termination on T1, T2, T3, each taken separately, I also have termination on the union, on the combined problem. Then I will give some experimental evidence, which is however limited to T-satisfiability problems, so the axioms and a conjunction of ground literals.
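Coming back to the earlier question about where the axioms come from: as one standard example (added here editorially, not read from the slides), the theory of arrays with extensionality is usually presented by the axioms

$\forall a\,\forall i\,\forall v.\ \mathit{select}(\mathit{store}(a,i,v), i) = v$
$\forall a\,\forall i\,\forall j\,\forall v.\ i \neq j \rightarrow \mathit{select}(\mathit{store}(a,i,v), j) = \mathit{select}(a, j)$
$\forall a\,\forall b.\ (\forall i.\ \mathit{select}(a,i) = \mathit{select}(b,i)) \rightarrow a = b$

where the last axiom is extensionality and is dropped for the non-extensional variant.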
Then I shall discuss the problem of generalizing this approach from so-called T-satisfiability problems, where we have a conjunction of ground literals, a conjunction of ground unit clauses, to T-decision problems, where we have a conjunction of ground clauses. And a way to do that is the so-called decomposition approach, which is a way to decompose the problem and then submit it to a system that works as a pipeline of a first-order prover and an SMT solver, so that the first-order prover is invoked first to preprocess the problem and do, intuitively, as much theory reasoning and as much reasoning on universally quantified variables as possible, hopefully generating a ground problem, and then feed its output to the SMT solver, an engine like your Z3, for instance. Or else one can, of course, decide to pass something directly to the SMT solver. So we have a bunch of sufficient conditions to show how to do this while preserving, of course, the satisfiability of the problem and therefore the soundness of the whole transformation. Okay. So let's now go, in the second half, a little bit more into the details of what we have. Termination results: we use a specific rewriting-based inference system which is called SP -- SP from superposition -- specific but, in fact, quite standard. It is one of those systems that are implemented in the most commonly used first-order theorem provers; even the (inaudible) implement this kind of inference rules. So what is in there? There is resolution. There is superposition. There is factoring. There is simplification by rewriting. All very standard inference rules, nothing terribly special or unique to this system. An important ingredient is that this inference system must assume an ordering on terms, literals and clauses. Why do I mention this? I said I wouldn't go into technical details, and indeed I will not; I mention it because the ordering plays a role in the proof of termination. What kind of ordering? It is a complete simplification ordering. "Complete" means it is total on ground terms: if we have two terms without variables, we can always tell which is greater. "Simplification ordering" means it has a set of nice structural properties that are very intuitive, like a term will always be strictly greater than any of its strict subterms. That's very natural: if you have a tree, you expect it to be greater than each of its strict subtrees. And there are monotonicity and stability properties that are also very natural about the structure of terms and the application of substitutions. There are many such orderings. They are implemented, they are well known; we can assume one. And, most important, as a consequence of the nice properties I mentioned, these orderings are well-founded. I said these orderings would be important for the proof of termination. Why? Because in order to prove termination, we shall need to prove that the inference system generates only a finite number of clauses -- more precisely, that the set of clauses that persist, that don't get deleted by rewriting, say, is finite. And the ordering will play a role there because it will exclude many inferences that we would otherwise do. In particular, we shall also assume that this ordering is such that t is greater than c for all ground compound terms t and constants c. This is also very natural: we can just impose a precedence where all the function symbols are greater than all constant symbols and then define the ordering recursively and easily get (inaudible).
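Spelled out (standard definitions, added editorially), a complete simplification ordering $\succ$ on terms satisfies:

$t \rhd s \;\Rightarrow\; t \succ s$ (subterm property: a term is greater than its strict subterms)
$s \succ t \;\Rightarrow\; u[s] \succ u[t]$ (monotonicity with respect to contexts)
$s \succ t \;\Rightarrow\; s\sigma \succ t\sigma$ (stability under substitutions)
totality on ground terms,

which together imply well-foundedness; and here, in addition, $t \succ c$ for every ground compound term $t$ and constant $c$.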
Now, this inference system, equipped with this ordering and a fair search plan, will give us a complete theorem-proving strategy. The form of our theorems is the following. This is not one special theorem; it is a template, a schema of a theorem. We show that a complete strategy is a T-satisfiability procedure. How do we show it? By showing that a complete strategy is guaranteed to terminate on an input given by the axioms of T and a set of ground literals on the signature of T. We proved this kind of theorem for all the theories listed here. So we have the theory of lists; we considered different presentations of the theory of lists. We can have non-empty, possibly cyclic lists; we can have possibly empty, possibly cyclic lists. The difference between non-empty and possibly empty is whether you have (inaudible) or not in the signature. "Possibly cyclic" means that these presentations don't include axioms to exclude models where an equation like a selector applied to X being equal to X is satisfiable; so we don't worry about cycles. We proved it for arrays with or without extensionality. Extensionality is the axiom that says that two arrays are equal if they are equal at all locations. Records with or without extensionality, fragments of linear arithmetic, integer offsets, and integer offsets modulo K. "Integer offsets" means the theory where you have predecessor and successor -- very simple. And also general recursive data structures with one constructor and K selectors. If you take K equal to one, you get integer offsets as a subcase, with predecessor as the constructor and successor as the selector. Please.
>>: (Inaudible)
>> Maria Paola Bonacina: Integer offsets modulo K means that you do not have all the integers but only the integers from zero to K minus 1. In integer offsets you actually have an infinite bunch of axioms -- we shall see how to deal with that -- that say: the successor of X is different from X, the successor of the successor of X is different from X, and so on. If you do integer offsets modulo K, you pick a K and you say the successor of X is different from X, the successor of the successor of X is different from X, but the successor of X taken K times is equal to X. So you close the cycle and you only have K values, okay? Recursive data structures: if you pick K equal to one, you get back integer offsets; if you pick K equal to two, you get lists. But recursive data structures usually come with axioms saying that the structure is acyclic, that is, we cannot satisfy an equation like a selector applied to X being equal to X. So there are acyclicity axioms there, and we shall have acyclic lists, in contrast with those I listed before. Anyway, a whole bunch of theories. Here are the integer offsets that you were asking about before. I mention them because this may look like an unmanageable problem for a theorem prover that expects a presentation as input, because the theory has infinitely many axioms. These are the ones I mentioned before: for all X, the successor of the predecessor of X is equal to X; the predecessor of the successor of X is equal to X; and for all X and for all I greater than zero, the successor applied I times to X is different from X. So what we do is give a problem reduction. Basically, we proved that it is sufficient to give as input finitely many of these axioms. How many? It is sufficient to go up to N, where N, in the worst case, is the number of occurrences of successor and predecessor in the set of ground literals.
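Written out (a LaTeX rendering of what was just said), the presentation of integer offsets, with successor $s$ and predecessor $p$, and the modulo-$K$ variant are:

Integer offsets:
$\forall x.\ s(p(x)) = x \qquad \forall x.\ p(s(x)) = x \qquad \forall x.\ s^{i}(x) \neq x \ \text{ for all } i > 0$ (an infinite scheme of acyclicity axioms)

Integer offsets modulo $K$:
$\forall x.\ s^{i}(x) \neq x \ \text{ for } 0 < i < K, \qquad \forall x.\ s^{K}(x) = x$

and the reduction just described says it suffices to include the acyclicity instances up to $i \leq N$, with $N$ bounded by the number of occurrences of $s$ and $p$ in the ground literals.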
It could be simply the number of constants for which the input defines the successor; but in the worst case, if each equation introduces a new constant, it will be equal to the number of occurrences of the function symbols successor and predecessor.
>>: (Inaudible).
>> Maria Paola Bonacina: I'm assuming that S is a set of ground literals, yes. Big S is a set of ground literals, yes. Next. So we proved all these theorems about termination on each of these theories. Actually, they are not all that difficult, because they are all based on an analysis of the inferences with respect to the theories and the ground literals, showing that only clauses of a certain kind can be generated under the ordering, and then that they are finitely many because we have a finite signature and the ordering-based organization of the inference system. But one says, well, it is not enough to prove termination for each theory; you also need to combine the theories. What if I have to prove termination for each combination? That would be ugly. But that's not the case, because we gave a modularity result. "Modularity of termination" means that if I can prove that the strategy terminates on T-satisfiability problems -- if it terminates on satisfiability problems in theory 1, in theory 2, in theory 3, each taken separately -- then I can guarantee that it will terminate also on the union and, therefore, on any union of any of these theories. This requires two hypotheses. One is very standard for all combination methods: it requires that the theories do not share function symbols. Constants can be shared; functions, no. Why is that? Well, because when we prove the modularity result, we want to prevent unleashing infinitely many steps across theories. So if we do not have shared function symbols, we prevent all the paramodulation steps from a compound term in one theory into a term in the other theory. And then we shall assume that the theories are variable-inactive. This is a technical but very simple condition. It prevents paramodulation from variables, and it is satisfied by all equational theories without trivial models. And if you have a theory with a trivial model, you can just add something like "exists X, exists Y, X different from Y" to exclude the trivial model. It is satisfied by Horn theories without trivial models. And, basically, with this property we are in the realm of the so-called stably infinite theories, which are also those considered by the Nelson-Oppen combination method. It is a technical condition, but it is very simple to satisfy, and it is satisfied by all the theories we saw before. Yes?
>>: The generalizations, the non-shared function symbols, they still have (inaudible).
>> Maria Paola Bonacina: Yes, there have been some generalizations of the Nelson-Oppen method (inaudible) with shared function symbols. You are thinking about the work by (inaudible). Yes.
>>: (Inaudible). There are a finite number. Do they carry over to that --
>> Maria Paola Bonacina: Whether the modularity theorem lifts also to those cases, I don't know yet, but it is quite possible. It is something to be investigated next, certainly. So let's see the shape of the modularity theorem. If the theories do not share function symbols and are variable-inactive, and the strategy is a (inaudible) procedure for each of them, then it is a procedure also for the union of the theories and, therefore, for any of the combinations. Now, up to this point we were working on T-satisfiability problems. So we conducted some experiments.
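For reference, the shape of the modularity theorem just stated can be paraphrased (editorially) as:

$\text{If } T_1, \dots, T_n \text{ pairwise share no function symbols, each } T_i \text{ is variable-inactive, and}$
$\text{the strategy is a } T_i\text{-satisfiability procedure for each } i\text{, then it is a } (T_1 \cup \dots \cup T_n)\text{-satisfiability procedure.}$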
This was a few years ago now, so we used the systems that were available back then: version 0.82 of E, version 1 of CVC, and version 1.1.0 of CVC Lite. What was interesting with those experiments, at least, was that -- maybe we were a bit influenced by the SAT community -- we wanted to try synthetic parametric benchmarks to test scalability. We tried both satisfiable and unsatisfiable instances, we tried combinations of theories, and we tried sets of literals from the UCLID system. You see here some of the results. For instance, these benchmarks were called (inaudible), with parameter N. These are unsatisfiable instances; it was a problem in the theory of arrays with extensionality. It turned out to be fairly hard for the systems, at least back then: none could terminate for N greater than or equal to 10. But you can see that the theorem prover, which is the one with the white circle, could do better than the systems with the theories built in, because it could take advantage of (inaudible). Here are the satisfiable instances of the same family of problems, but they are much easier. If you look at the runtime on the Y axis, you will see that these are much smaller, and the theorem prover did well here too, which is interesting, because the general (inaudible) is that theorem provers, since they aim at finding proofs, should be good at unsatisfiable problems but maybe not so good at satisfiable problems. Once we show that the theorem prover is a decision procedure, this isn't necessarily true. This is another family of problems on arrays where, let's see, we have two curves overlapping, so CVC and E had essentially the same performance; on the unsatisfiable instances the E theorem prover did a little better. This is yet another family of parametric synthetic benchmarks in the theory of arrays, again unsatisfiable instances. And here, too, the theorem prover did well, and using a different ordering, a Knuth-Bendix ordering, it could do it in nearly constant time. This was the benchmark for a circular queue. We modeled the queue by making up a record which has an array to hold the queue elements and then two indices to mark the start and the end of the queue within the array, that is, the first and last elements of the array which are used by the queue. Here too the theorem prover could scale reasonably well with the integers modulo K. And, finally, it could do very easily the problems from the UCLID set. They were (inaudible), but all could be done in a very short time. But, again, most problems are not really only satisfiability problems. They come with general clauses; they come with disjunctions. So we do not just have sets of ground literals, we have sets of ground clauses. So we want to go from assuming that S is a conjunction of ground unit clauses to having S as a conjunction of ground clauses. We proved a theorem that says that if the theory is variable-inactive (inaudible) and the SP strategy is guaranteed to terminate on problems made of ground literals, then it will also terminate on problems made of ground clauses. This was obtained through this T-decision scheme: assume T union S is the input, where T is the theory and S is a set of ground clauses. There is a preliminary process that involves essentially only flattening.
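As a small worked example of that flattening (an editorial illustration using the array signature): a ground clause such as

$\mathit{select}(\mathit{store}(a,i,v), j) = w \;\lor\; i = j$

is flattened by introducing fresh constants $c_1, c_2$ into the unit definitions

$\mathit{store}(a,i,v) = c_1, \qquad \mathit{select}(c_1, j) = c_2,$

which go into the unit part $S_1$, and the strictly flat non-unit clause

$c_2 = w \;\lor\; i = j,$

which goes into $S_2$.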
So a subset S1 of ground unit clauses is fed, together with the presentation of the theory, to the theorem prover, which generates a finite limit, say F, which is then reunited with a set S2 which contains only strictly flat ground non-unit clauses, that is, disjunctions of equalities and disequalities between constants. Through flattening we can essentially reduce any set of clauses to S1 union S2, where S1 is a bunch of units and S2 is a bunch of disjunctions of this kind. So we put them back together, we apply the strategy again, and we can show that if the first application terminates, which we assume, this second one will terminate also, because we have an analysis of the possible inferences on clauses made of disjunctions of strictly flat equalities and disequalities, that is, again, equalities and disequalities between constants. And the theory is variable-inactive, so we don't paramodulate from variables. It follows, essentially, that the number of paramodulation steps is bounded by the number of constants; since we have finitely many of those, we shall have termination. Yes?
>>: (Inaudible).
>> Maria Paola Bonacina: Yes.
>>: (Inaudible).
>> Maria Paola Bonacina: Yes.
>>: So you are essentially performing a completion?
>> Maria Paola Bonacina: Yes, in some sense. This could terminate and find unsatisfiability right here, if it happens that the unit part alone is already responsible for the unsatisfiability. Or else it will generate a saturated set -- finite, satisfiable (inaudible) -- so it has in some sense compiled a part of the problem into a saturated set, which then gets reunited with the piece we left out before. And if we give it to the same strategy again, it terminates.
>>: (Inaudible) wouldn't do this kind of staging within --
>> Maria Paola Bonacina: One single run, you say? No.
>>: It is not typical. Can good SP strategies be guided to do this kind of ordering, of first using the units and then only (inaudible) once it saturates?
>> Maria Paola Bonacina: Yeah, it's not hard to do. I mean, the prover can be invoked on part of the input (inaudible) and then restarted on the output of the first run together with the part of the input that was not yet processed. It is not hard to implement in an existing theorem prover.
>>: (Inaudible). Is that something that is commonly done?
>> Maria Paola Bonacina: I don't know. I mean, "common" is an empirical statement; I don't know whether it is common or not. It doesn't sound so uncommon to me, because these kinds of inference systems, ever since Knuth-Bendix completion, have always had a sort of double life. You can see them as a semidecision procedure to go after a proof, or you can see them as inference systems that generate something canonical. Now, in many cases the second interpretation is not useful because it doesn't terminate; in most cases saturated sets are infinite. But here we are in a framework where we have termination to begin with, so we get a finite thing here. It is not so unusual, also from a conceptual point of view, to think of these inference systems not only as proof procedures to go after a proof but also as inference systems to do a sort of completion. So this is what happens there: you can think of this as a completion, and then finding the proof from the completed set together with the rest of the problem, the non-unit part.
>>: The standard selection strategies probably don't consider that literal selection (inaudible).
>> Maria Paola Bonacina: Oh, you are worried about the selection strategy -- about how the selection strategy in the prover could keep these quiet while working on these, and activate these clauses only after this is done?
>>: (Inaudible).
>> Maria Paola Bonacina: Okay. I don't think it is too hard. Okay. Other questions?
>>: (Inaudible).
>> Maria Paola Bonacina: Yes. Indeed, earlier I thought that this was the point -- I thought this was his question, whether we do it once, all together. Yes, of course. This was just the way that we found easiest to prove termination, so that we could reuse here all the termination results for the ground unit clauses we had before, and then only prove termination from the saturated set and the flat ground disjunctions. So this is not how -- and this is probably also a better answer to his question -- this is not necessarily how you're supposed to implement it, because, since the strategy is fair, we can have the work done in one single run. This is just the way we proved the termination. Sorry, I should have answered my first implied reading of your question, which was actually his question. Anyway... But we didn't really think about implementing this one, because we tried a few experiments and we found that provers such as E do not really handle very large disjunctions efficiently, because essentially they work by resolution, and resolution tends to duplicate literals: it takes two clauses and builds a new clause by inheriting most of the literals. So if you have very large disjunctions, even with tuned search plans, even with selection strategies, it gets really hard to handle those huge clauses that get generated. So we started thinking about something else: how we could somehow combine the strengths of the first-order prover with those of an SMT solver, which is based on the Davis-Putnam-Logemann-Loveland procedure and therefore can handle the ground non-unit clauses with DPLL. So we generalized the notion of dividing unit clauses from non-unit clauses into a more general, more elegant, if you wish, decomposition principle. We take a problem and we decompose it into a so-called definitional and an operational part. "Definitional" means we are going to have the axioms of the theory and flat ground unit clauses, for instance those defining function symbols. And the operational part will certainly contain all the non-unit clauses, again. So it can boil down to what we had before, but it can also be more general. Then we shall do, indeed, theory compilation, that is, an application of the theorem prover as a (inaudible) completion engine to do as much theory reasoning as possible upfront and as much reasoning on non-ground equations as possible upfront. Then we shall take the output of this and whatever we left out of the original input, unite them again, and give them to an SMT solver, such as your Z3. The important thing is that we proved that from the saturated set output by the first preprocessing step we can get rid of the axioms, which, of course, we want to do because the SMT solver doesn't know what to do with the axioms; it has the theories (inaudible). So we want to show that the theorem prover can do enough completion work to have, in the saturated set, enough knowledge about the theory that the theory axioms are no longer needed. Essentially, we have proved that the saturated set can entail everything that could have been entailed by the theory axioms, had they still been there.
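Before the stage-by-stage description that follows, here is a minimal sketch of the overall pipeline as I read it (Python pseudocode added editorially; decompose, saturate_with_SP, contains_empty_clause, drop_theory_axioms and smt_check are hypothetical names, not the API of any existing tool):

    # Hypothetical sketch of "T-decision by stages": a first-order prover acts as
    # a preprocessor/compiler, and a DPLL-based SMT solver finishes the job.
    def t_decide_by_stages(theory_axioms, ground_clauses):
        # 1. Decompose: S1 must contain only ground unit clauses (after flattening);
        #    everything else, in particular non-unit ground clauses, goes to S2.
        s1, s2 = decompose(ground_clauses)

        # 2. Theory compilation: saturate T union S1 with the superposition-based
        #    inference system SP.  If the empty clause is derived, we are done.
        saturated = saturate_with_SP(theory_axioms | s1)
        if contains_empty_clause(saturated):
            return "unsat"

        # 3. Drop the theory axioms: by the results discussed in the talk, for the
        #    theories considered the saturated set entails whatever the axioms
        #    would still contribute, and in some cases what remains is ground.
        residue = drop_theory_axioms(saturated, theory_axioms)

        # 4. Hand the residue plus the deferred clauses to the SMT solver.
        return smt_check(residue | s2)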
So we gave a bunch of sufficient conditions to prove that this transformation preserves satisfiability. This is how it works; we call it T-decision by stages. We start with the problem, which is T, the theory or combination of theories, because it all lifts to combinations, and S, a set of ground clauses, not necessarily unit. We decompose it. Now, the crucial part of the decomposition is to make sure that S1 only contains ground units, but it doesn't have to contain them all; we can actually send some ground units here, if we prefer, provided that everything here is a ground unit, and we put them together with the theory. And we apply the SP strategy as a compiler, as a preprocessor, to generate -- there I get a (inaudible) this batch; this S should have been closer to the (inaudible); never mind, I didn't notice before -- anyway, we apply it to generate a saturated system, a saturated set. Then we throw out the theory axioms, which at this point we no longer need, because we want to go toward an SMT solver. And what we have here, together with what we had set aside before, we give to the SMT solver. And we showed that this works for a bunch of theories like arrays, records, integer offsets. And in some cases the output here is ground; so, except for the axioms of the theory, the saturated output is ground, which means we actually do a reduction from the problem in the theory to a problem in the theory of equality with uninterpreted function symbols. So we managed to compile away all the information about the theory, and here we get a ground set that we can just give to an SMT solver. On the other hand, assume that we have some theory like linear arithmetic, which the theorem prover (inaudible), so we prefer to give arithmetic to the SMT solver -- fine. If we have arithmetic in the problem, we just don't say anything about arithmetic to the prover; we keep it here and we just give it directly to the SMT solver. The same goes for bit-vectors: bit-vectors are also a theory that is handled much better by an SMT solver than by general theorem provers. So whatever we don't want to bother giving to the prover, we keep here and send down to the SMT solver. To sum it up a little bit, what we have is a bunch of termination results to show that we can design T-satisfiability procedures based on generic reasoning. We have a modularity theorem for combinations of theories. We ran some experiments; now they are a little bit outdated, but back then they showed that the theorem prover against some (inaudible) was not behind the SMT solvers of the day. We generalized this to T-decision problems, with clauses and not just literals. And one of the ways to do the generalization is this notion of decision by stages, where we pipeline the prover and the SMT solver in such a way that the prover acts as a preprocessor for the SMT solver. That is nice because you can do it once and for all and then forget about the prover and just work with the SMT solver. But it is useful especially if we have problems with quantifiers, because you can deal with those as part of the theory in the theorem prover and then not have to guess how to instantiate the universally quantified variables in the SMT solver. Some current and future work. Well, I'm looking for more termination results, for more decision procedures, for more powerful decision procedures, for experiments with decision problems and not just satisfiability problems with literals.
There is always the issue of how to find good search plans for these problems, especially keeping in mind that the search plans that come with a theorem prover we take off the shelf are often conceived for different search problems. Those problems typically have an infinite search space, and they need narrow, deep searches to find what is often a relatively small proof in a potentially infinite search space of a semidecidable problem. In decidable problems, often the search space and the search behavior are different: we need not go as deep, a shallow search is sufficient, but on the other hand we need to go wider and be more exhaustive in checking things. So there are search problems there to be investigated. Then I am interested in integration with model building because, as I said, once you have a finite saturated set, it is the basis for building a model, but it is not yet a model as we would like it. So there are a lot of issues and problems about model representation and model extraction. Here is a bunch of references. These are three journal papers: one appeared, one is to appear (February/April) in the (inaudible) journal on logic and computation, and one we just submitted that essentially contains all of the work I surveyed quickly here. There you will find all the definitions and theorems and proofs for those results. And, of course, I want to thank a whole bunch of people I had the pleasure to work with on this topic (listing names), and probably more that I at least discussed with, although they were not co-authors. Finally, I'm here for five weeks, as Leonardo said, and I'm looking for more friends to work with, including post-docs and students, and more problems, applications, theories to try and things to think about. So thank you very much for your attention and for having me here. (Applause.) Yes?
>>: In your last diagram, where you split things in two, trying to harvest the theories, the S1, I think you said, was only -- only units, is that right?
>> Maria Paola Bonacina: Yes.
>>: I think of the problems I would feed to the theorem prover: at the outermost level there would be a bunch of disjunctions corresponding to the paths of the program. So that means I would feed it hardly any unit clauses at all. Does that mean it buys me nothing?
>> Maria Paola Bonacina: No, no, it doesn't mean that, because -- let's go back a little bit. We have to see what happens in this decomposition stage. Okay. So your disjunctions will be up there in the S part. They will go first through the decomposition. The decomposition will need, among other things, to flatten your problem, that is, to make sure that no term has depth greater than one. Are your problems already very flat, or do you have nested terms?
>>: Yeah, they're nested.
>> Maria Paola Bonacina: They are nested terms, okay. So to do the flattening, new constant symbols will be introduced and new unit equalities will be generated to define function applications as constants -- new symbols that come in to represent intermediate values, so to speak. So this guy here will then contain disjunctions that are disjunctions of equalities and disequalities only between constants. So then you have only things like A equal to B, or C different from D, and so on -- only constants. Everything else will have gone here in the form of unit clauses that may not have been in the original formulation of your problem but will have been generated by the decomposition.
>>: It is like the vocabulary?
>> Maria Paola Bonacina: Yes, exactly, exactly. Please.
>>: So you can have non-convex theories, where one can check satisfiability of literals (inaudible), so you never generate non-unit clauses?
>> Maria Paola Bonacina: And?
>>: That seems to be fairly different from our sort of (inaudible) array.
>> Maria Paola Bonacina: Yes, yes, that's true. Yes, that's true. Well, you see, the issue of convexity and non-convexity: as I said for the modularity result on termination, we assume variable-inactive theories, so we are stably infinite but not necessarily convex. So we are more general than convex but not more general than stably infinite, for now; we can see whether to go beyond that. In general, let's say that that affects the modularity of termination; but in general, with a theorem prover you don't have, in principle, restrictions. I mean, take somebody like Viktor Kuncak. He was at MIT, he graduated with Martin Rinard, and now he is in Lausanne. He also uses generic theorem provers for his reasoning about Java programs, and he doesn't even worry about termination. He just uses the prover and sees how it goes, or else sets empirical limits -- don't go beyond a certain depth in the proof or in the length of clauses -- like you do in Z3 with the (inaudible), controlling termination by empirical means. That can always be done. So there are also people -- we worried about termination because we wanted to make the case that we can have decision procedures, so that they can be embedded in something bigger and one doesn't have to worry: will it terminate, won't it terminate, do I have to set a parameter to make it terminate if not? So I think having decision procedures is important. But if you have a generic theorem prover, nothing prevents you from trying it on non-convex, non-stably-infinite, non-whatever you have, and seeing if you can still get good enough results for your experiment, or whether you can still get termination even if you don't have, maybe, the variable-inactivity condition.
>>: (Inaudible). Quite intriguing, to do the units, the saturation of the units, even in the case that you have non-(inaudible).
>> Maria Paola Bonacina: Well, but then there will be creation of non-unit clauses in here, because say that you have the theory of arrays in here, which is not convex. So here there are the axioms of the theory of arrays, which are not unit clauses, which have universally quantified variables, and those will interact with the units here and generate non-unit stuff. So the output will not be all units.
>>: Oh.
>> Maria Paola Bonacina: Okay, okay. No, it doesn't buy us so much, the completion part; but it does buy us something. More questions or comments? (Applause.) 1:12:56