>> Francesco Logozzo: So, the last one. Good afternoon everyone. It does not happen every day to introduce the author of a paper that changed your life, but that is what is happening to me today. I am glad to introduce Patrick Cousot, the author of the paper on abstract interpretation at POPL in 1977 that changed my life, really changed my life, because I was an undergraduate at Pisa and I read his paper and said: really cool, that's what I want to do in my life. Then I moved to Paris, did my PhD, found my wife there, [laughter] then moved to Redmond doing the same thing. My kids were born there. I work for Microsoft, so I can afford a lot of bikes [laughter] so, you know, yep, it is a really big honor and, yeah, one is broken; you know the sad story, let's forget about it and let's talk about abstract interpretation and SMT solvers today. Thank you. >> Patrick Cousot: Hello. Thank you. So this paper has three authors, and the other one is Laurent Mauborgne [inaudible] in Madrid. And none of us knows about SMT solvers, so we tried. We have two references that we refactored into... I promised to say "refactoring", so I had to find a way of saying refactoring... so we refactored these two papers into a journal paper, which is submitted, and I present a small part of it. So on one side you have algebraic abstractions, which are used in abstract interpretation for the analysis or verification of systems. The idea is that when you have properties and specifications, you abstract them into an algebraic lattice, and in practice you encode these lattices into [inaudible] on which you have [inaudible] algorithms. So the analysis is fully automatic: the system properties are computed by approximating fixpoints, and they are made out of algebraic transformers which are built upon primitives that are operations on these lattices; and in general you use several abstractions that you combine with a reduced product. We will see what it is. Then you have another bunch of methods which are used in deductive verification, and I call it proof-theoretic or logical abstraction. There, the system properties and specifications are expressed as first-order formulas using theories, so you have a universal encoding of properties, and you have one algorithm, or a few algorithms, that manipulate this universal representation, so most of the work is done once and for all. That is a great advantage. The problem is that it is only partly automatic, because the system properties are either provided manually by the end-user and automatically checked by some automatic system, or you cannot prove them: very complex problems require invariants that are difficult to find. For example, when we analyze [inaudible] which are used [inaudible] in a single block, the invariant is 1 GB, so no one can write it, or even read it. And what is interesting is that you can have several theories and you can combine them with Nelson-Oppen, which was born about the same time as abstract interpretation. One objection is that we only know the abstract interpretation part, so we can only explain everything in terms of abstract interpretation, not the contrary. That is why we will show that proof-theoretic and logical abstraction are just a particular case of algebraic abstraction, and in fact this is what mathematicians do.
Then we will show that the Nelson-Oppen procedure is a particular case of reduced product, and when you understand that, it becomes very easy to bridge the two worlds. You can have analyzers that work simultaneously: on one side with algebraic abstractions, on the other side with logical abstractions, and you use the Nelson-Oppen reduced product as the tie between the two. So that is one form of convergence between the two approaches. Now we go to the technical details. When you have a language, you have to have the semantics and the syntax. My syntax is very simple. You have variables. You have constants. You have function symbols of some arity. You have terms, and you have predicates, and I distinguish atomic formulae, which have no quantifiers. Then you have program expressions that return either a term or a predicate, and you also have clauses in a simple conjunctive normal form, conjunctions of atomic formulae, which are sometimes used in [inaudible]. And programs, I don't care really; I just assume there are some assignments and some form of tests or guards or something like that. Then you go to the classical notion of interpretation. It is classical; I just use different notation. An interpretation is a pair of a set of values, to assign values to variables, and something here which provides the semantics of the function symbols. So each function symbol becomes a function that I use, and predicates, I interpret them as maps from values of variables to Booleans. Then you introduce the notion of environment, which maps variables to their values, and I assume that you have a mechanism that can evaluate an atomic formula when you give it an interpretation and an environment. And also you can evaluate a term in some interpretation and some environment; so these are the classical definitions that you find in any book on logic. >>: [inaudible] fixed point, right? [inaudible] this big union? >> Patrick Cousot: This is an interesting question, so we will come back to it [laughter] later. When you consider the semantics of programs, you have the same notions as in logic: you have a standard interpretation, as described previously. And here I will give the answer to your question: I give the semantics in the form of post-fixpoints, because when you go to logic there may be no infinite union or infinite intersection, and so the least fixpoint may not exist. In that case you cannot define the semantics as a logical formula. But you may find infinitely many formulas that are post-fixpoints, and you can use them. Think of the example of Hoare logic: if your logic is not expressive enough, you may not be able to express the strongest invariant of a loop, but you may be able to make a proof of correctness by using an invariant that is not the strongest. So if your semantics gives you all of the possible invariants that you can use for that loop, the strongest may not be there, but by knowing all of the invariants you know how you can do the proof, or that you cannot. That is why I do that. So I assume that I have observables that depend on the interpretation; an observable can be a set of traces, execution trees, states, sets of states and things like that. Then I define properties to be sets of observables; so, for example, the property "even" is the set zero, two, four, six, and so on.
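To make this setup concrete, here is a minimal OCaml sketch of the syntax and the evaluation mechanism just described; the representation choices (OCaml ints as the set of values, strings as symbols, the standard interpretation at the end) are illustrative assumptions, not anything from the talk.

```ocaml
type term =
  | Var of string                 (* variables *)
  | Const of int                  (* constants *)
  | App of string * term list     (* function symbol of some arity *)

type atomic = Pred of string * term list   (* quantifier-free atomic formula *)

(* An interpretation: a meaning for each function and predicate symbol.
   Here OCaml ints stand for the set of values. *)
type interpretation = {
  fn   : string -> int list -> int;
  pred : string -> int list -> bool;
}

type environment = string -> int  (* maps variables to their values *)

let rec eval_term (i : interpretation) (env : environment) : term -> int =
  function
  | Var x -> env x
  | Const c -> c
  | App (f, args) -> i.fn f (List.map (eval_term i env) args)

let eval_atomic (i : interpretation) (env : environment)
    (Pred (p, args) : atomic) : bool =
  i.pred p (List.map (eval_term i env) args)

(* Example use: a (partial) standard interpretation of + and >. *)
let std = {
  fn   = (fun f args -> match f, args with "+", [a; b] -> a + b | _ -> 0);
  pred = (fun p args -> match p, args with ">", [a; b] -> a > b | _ -> false);
}
let _ = eval_atomic std (fun _ -> 1)
    (Pred (">", [App ("+", [Var "x"; Const 1]); Const 0]))   (* true *)
```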
And I associate a transformer to the program; this is a syntactic process, defined by induction on the syntax of the program, that explains how I build the transformer; for Hoare logic it would be the verification condition. Then I define the semantics as the set of post-fixpoints, that is, in the case of logic, the set of invariants that satisfy the verification condition. And in case you have a least fixpoint, it will be one of them. Here I have a little example: this program starts at one and then increments. I interpret predicates on the integers, and the least fixpoint will be X strictly positive. But if I don't have the symbol "greater than" in my logic, it will be X is one or X is two or X is three or X is four, and the infinite formula is not part of my logic, so without the symbol I cannot express the least fixpoint. No problem: I can use post-fixpoints. Or people add a new symbol, but by adding new symbols you make your logic more complicated, and in the end you must add least fixpoints, and then the thing is undecidable, so all the benefit of having solvers is lost. At least complete solvers. So that's it. When you have done that, you see that properties have a complete lattice structure for inclusion, which is logical implication. You have false, true, conjunction and disjunction, and at the level of interpretations I am in [inaudible] theory. Wow, someone is willing to do something on this computer, or what? So I will give you my password. Now you know some information: you know the number of characters [laughter]. That is a leak of information, which is very important, in fact. So they should fill it up to the end with [laughter]... I don't understand it; don't do it. So, I think there is repetition here. You have the structure; post-fixpoints exist because I assume the transformer is increasing, that is [inaudible]. More importantly, the transformer is built out of primitives. Here I show the one for assignment: it says that after the assignment, the possible values are the same as you had before the assignment, except that you assign a new value to X, which is obtained by evaluating the expression in the interpretation. And for tests it is the same: you check whether the test is true or false, and you keep the environment if it is true. And what is satisfying the property? Among the invariants that I have in my semantics, there is one that implies the property that I want to prove. So by going to post-fixpoints you have a semantics which looks a bit more complicated but which is more generic. Now, that is not the end of the story. The problem is that a program does not have one semantics; it has many, many semantics. Usually when people do verification, they consider only the mathematical semantics, and they don't consider the implementation semantics, so it is an interesting problem to have these comparisons. For example, you might say: I will do an analysis which compares the execution to the mathematical definition and says when they differ. Or you have an interesting analysis saying: if I interpret my program in the reals and then in floats, there is a difference between the values that I can obtain. The [inaudible] so there are two [inaudible] in that [inaudible] variation. Here is a program which is very interesting, because if you run it you will get 100, but if you compute with reals, you will get six. And if you use any approximation of the reals, you will get 100.
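As a hedged reconstruction of the transformer behind this little example (the slide itself is not in the transcript; the loop body is assumed to be a plain increment):

$$F(P) \;=\; \{1\} \cup \{x + 1 \mid x \in P\}, \qquad \operatorname{lfp} F \;=\; \bigcup_{n \ge 0} F^n(\emptyset) \;=\; \{1, 2, 3, \ldots\} \;=\; (x \ge 1).$$

Without the symbol $\ge$, this least fixpoint is only the infinite disjunction $x = 1 \vee x = 2 \vee x = 3 \vee \cdots$, which is not a formula of the logic, whereas weaker post-fixpoints $P$ satisfying $F(P) \subseteq P$ may still be expressible and suffice for a proof.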
That is because six is a fixed point which is repulsive, and 100 is a fixed point which is attractive, so everything goes to 100 except the one computation which is exactly on the reals. So if you run the program, see 100, and ask a prover: please prove that it is 100; it will not, because it is six in the reals. Or I ask the prover to prove that it is six, the prover will do it, and then I do the run and it says 100. But six and 100 are almost the same, depending on the scale you are considering [laughter]. Another example: if you take floats, you have four possible interpretations depending on the rounding mode, and these give completely different results. So if you give a semantics, you must give the four semantics and then tell which one you use. In some cases you know which case you are in, because the machine can be set to have one of these semantics. But in some other cases you don't know, so, for example, you may do an analysis that is valid for all four semantics, whose conclusions hold whichever of the semantics you have chosen. And the fact that you have many semantics and that you ignore some of them is an abstract interpretation. So in fact, when you say "I do a mathematical proof of my program", you do an abstract interpretation which consists in ignoring some semantics, and for those semantics you know nothing. You can formalize it by a Galois connection, but it does not solve any problem. So we go to these multiple interpretations, and we will have them everywhere in what follows. Now we go to the details of abstract interpretation. Everything is built on the abstract domain, which is the set of properties you are interested in. You have an order, which represents the logical implication. You have false, true, [inaudible]. You have widening, narrowing, and you have the transformers that abstract the concrete transformers: forward and backward assignment, and tests. Then, to make an analysis, you reuse this abstract domain, you define your transformer by induction on the syntax of the program, this transformer uses the primitives that you have included in your abstract domain, and the abstract semantics will be either the least fixpoint, or the set of post-fixpoints in case the least fixpoint does not exist. So you see that in the abstract, things are structured the same as in the concrete. Then you need a correspondence between the abstract and the concrete, and we do that with a concretization function. So this explains that this is the abstract version of implication, this is the abstract version of union, and essentially, when you apply the concretization to all abstract values, you get a subset of the properties. So abstraction consists just in considering the concrete properties, selecting a subset, and saying: I will stay within this subset in all of my computations. When I have an operation that stays inside the image of gamma, I lose no information. But if I have an operation that goes from the image of gamma to a property outside of it, I will over-approximate by an element of the image of gamma. That is where I lose information. And then we have to change the notion of soundness, because we have post-fixpoints, but it is a bit trivial. It is sound, which says: if I have done a proof in the abstract, that is, if I have some abstract semantics that implies what I want to prove, then I will be able to redo it in the concrete.
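Here is a minimal OCaml sketch of the abstract-domain interface just listed; it reuses the term and atomic types from the earlier sketch, and the names are illustrative assumptions, not the interface of any particular analyzer.

```ocaml
(* Sketch of an abstract domain: an order abstracting implication,
   false/true, join, widening/narrowing, and abstract transformers
   for the primitive commands. *)
module type ABSTRACT_DOMAIN = sig
  type t                                  (* machine representation of properties *)
  val leq    : t -> t -> bool             (* abstract implication *)
  val bottom : t                          (* false *)
  val top    : t                          (* true *)
  val join   : t -> t -> t                (* abstract disjunction *)
  val widen  : t -> t -> t                (* over-approximates join, enforces termination *)
  val narrow : t -> t -> t                (* improves a post-fixpoint *)
  val assign : string -> term -> t -> t   (* forward assignment x := e *)
  val test   : atomic -> t -> t           (* filtering by a guard *)
  val concretize : t -> string            (* gamma, as a printable description here *)
end
```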
So I will be able to [inaudible] in the [inaudible] concretization of the abstract value. And it is complete in the other direction: if I have a proof in the concrete, I can find some way to do it in the abstract. So this generalizes the classical notion with least fixpoints. Then you have sufficient conditions for soundness, excuse me, and for completeness also. Here is one: it says, essentially, that when I make a transformation in the abstract and go to the concrete, it is an over-approximation of the concrete transformation that I would have done. This condition, which works for fixpoints, will also work for the post-fixpoint semantics, so the generalization changes nothing. And to prove it, you just prove it for the primitives that you have in your abstract domain, and then you prove it by induction on the structure of the program, because this F is defined inductively, so the proof is really trivial to do. I also mention that in case you have a best abstraction, you have Galois connections, but in many implementations you don't have one, and when you go to logic you don't. Another point in abstract interpretation is widening, so I have to explain that. A widening is an operation that takes two elements and gives something that is an over-approximation of both elements. You see, if you take two elements in the lattice, there is a best approximation, which is the join, but sometimes we have no join, or no efficient way to compute the join exactly, so we use a widening just to over-approximate the join. We call that an over-approximating widening; that is one purpose: you can use widening to replace joins when you cannot compute them. The other use of widening is to enforce convergence when you have fixpoint computations. When you have a fixpoint computation, you iterate, and the suggestion here is that each time you do a widening, which is a mysterious operation that makes the iterates go faster: it over-approximates the iterates, with the essential property that the iteration ultimately terminates. In fact, the definition is just what you want: you want an operation which enforces over-approximation; it says here that the result will be an over-approximation of both arguments. In practice, most widenings are used for both over-approximation and termination, but sometimes this is not the case. So here are the iterates: you see, I start from bottom; I stop when I have a post-fixpoint; and otherwise I take the previous iterate and the next iterate and I do a widening. You can also do widenings with respect to all previous iterates; there are many variants. The theorem says that when you do that, you will reach a post-fixpoint in finitely many iterations, if the widening is both over-approximating and terminating. Now, when you implement, you essentially first define your representation, that is, a data structure for representing properties. Then you implement programs to compute the implication, the join, [inaudible]. If you have different abstract domains, they have different representations, and so you need some form of communication between the two. The advantage of this approach is that it is very efficient, because the algorithms that you use are specific to the kind of properties that you are manipulating: if you have an interval, it is a pair of integers; there is nothing more [inaudible] machine. But it requires expertise; I've seen not so many people able to do it very well.
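To illustrate, here is a minimal OCaml sketch of the classical interval widening and of this iteration scheme (start from bottom, stop on a post-fixpoint, otherwise widen the previous iterate with the next one); the transformer at the end encodes the incrementing loop used earlier, an assumption for illustration.

```ocaml
type itv = Bot | Itv of float * float     (* floats, so we have infinities *)

let leq a b = match a, b with
  | Bot, _ -> true
  | _, Bot -> false
  | Itv (l1, u1), Itv (l2, u2) -> l2 <= l1 && u1 <= u2

(* Classical interval widening: an unstable bound jumps to infinity,
   so strictly increasing chains of widened iterates are finite. *)
let widen a b = match a, b with
  | Bot, x | x, Bot -> x
  | Itv (l1, u1), Itv (l2, u2) ->
      Itv ((if l2 < l1 then neg_infinity else l1),
           (if u2 > u1 then infinity else u1))

(* Iterate an abstract transformer f with widening up to a post-fixpoint. *)
let lfp_with_widening (f : itv -> itv) : itv =
  let rec iter x =
    let fx = f x in
    if leq fx x then x                    (* f x <= x: post-fixpoint reached *)
    else iter (widen x fx)
  in
  iter Bot

let join a b = match a, b with
  | Bot, x | x, Bot -> x
  | Itv (l1, u1), Itv (l2, u2) -> Itv (min l1 l2, max u1 u2)

(* x := 1; while true do x := x + 1, i.e. F(P) = [1,1] join (P + 1).
   The exact iterates [1,1], [1,2], [1,3], ... never stabilize;
   widening jumps to [1, +oo], which is a post-fixpoint. *)
let f p = join (Itv (1., 1.))
    (match p with Bot -> Bot | Itv (l, u) -> Itv (l +. 1., u +. 1.))

let () = match lfp_with_widening f with
  | Itv (l, u) -> Printf.printf "[%g, %g]\n" l u    (* prints [1, inf] *)
  | Bot -> print_endline "bottom"
```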
So there is no problem. Why, yes? >>: What is the reduced product? >> Patrick Cousot: I will answer your question [laughter]; I have a bunch of slides to explain that, yes. It is trivial: it is a conjunction, a conjunction of properties, but you have to do it in the abstract and efficiently, and we will see. So now I come back to logic, and I introduce quantified formulas, classical. I just add equality, and then I write the free variables of psi in this way, which is not conventional. Then you have a satisfaction relation, which says psi is true for the assignment of variables given by this environment in this interpretation, and the interpretation of equality is equality, in the end. That's it. And because we have seen previously that a program may have multiple interpretations, I will work with respect to many interpretations. So I take a set of interpretations, and I have to extend the meaning of a formula to a set of interpretations; it is very, very simple. I say: you give me an interpretation, I give you the environments that make the formula true for this interpretation. So it is a map from interpretations to the environments that make the formula satisfied. Now we have a problem: how are we going to represent a set of interpretations? In fact, you can represent it by a theory, because a theory is a set of sentences, that is, formulas with no free variables. And the models of the theory are interpretations, so they form a set of interpretations; in general a theory has many interpretations. And because the sentences of the theory have no free variables, I can say "there exists an environment" or "for all environments", because I don't use it; these two formulations are exactly equivalent. So when you refer to a theory, you have a semantics which has many interpretations, and these interpretations are the models of the theory. This slide is for me, because I always forget; it is my slide to remember. A decidable theory: you can decide whether or not a formula belongs to the theory by an algorithm which is effective and terminating. A deductive theory is such that if you have a formula in the theory and it implies another one, this other one is also in the theory. A satisfiable theory has at least one model. And a theory is complete if, when you take any sentence, either it or its negation is in the theory. I hope that this is correct, but I never remember this, so I revise before going on [laughter]. Validity: for all models of the theory, the formula is true. Satisfiability modulo theory, SMT: that is, for all models, all the... no, no: there exists a model of the theory such that the formula can be shown to be true for this particular interpretation. And now, that is the secret of SMT solvers: when the theory is decidable, you can check satisfiability by the decision algorithm, and if the theory is decidable and complete, then you can also use this algorithm to check validity. The only problem I see is that many SMT solvers put some restrictions on the quantifiers in the formulas that they can handle. If I understand well, if the quantifier is outside, it is okay; if you have nested and nested and nested quantifiers, it is not so okay. We have to live with this. So, I claim that when you use theories and so on, you do abstraction, and I have to show why: it fits in the framework of abstract interpretation. Here it is.
You take the set of first-order formulas of the logic as your abstract domain, you take a theory, and then you define an abstract domain with these formulas. The order will be implication, which is given by universally quantifying all of the free variables; it is essentially logical implication, except that you quantify here. Then you have false and true, [inaudible] disjunction and conjunction as part of this definition here. You don't have widening and narrowing, but you may be able to define some. And the transformers, you will see, are also trivial. So it completely fits in the previous framework, except that the lattice that you get is not complete, because you are missing limits of iterates, or limits of chains; you cannot make infinite unions and things like that, and so you have no Galois connection. This means that there are choices that you have to make when you abstract, and there are in general no best choices: by changing the way you make your abstraction of the function here, for example, you will change the result, and you have no way to get the best possible result, because you cannot express it in your formulas. The concretization is just that you take a formula and you check, for all models of the theory, whether the formula is satisfied or not; and so you understand perfectly which semantics you give to your formula with respect to the theory. The abstract semantics is very simple: you define it in terms of post-fixpoints, with a transformer which is based on these primitive transformers; I hope that I have given them... yeah, I have given them here. So you see how we operate on formulas of first-order logic. For example, for the assignment you say: there exists an old value. You introduce this quantifier, you substitute the new variable in the predicate, so it holds before, you evaluate the term on the old value, and you give the result to the variable to which you made the assignment. For the backward transformer, it is interesting to see that you don't have the introduction of an existential quantifier; that is why many people go backward when they do the [inaudible] method, because there is no quantifier to eliminate, whereas here you get one. So I thought I had found the solution, but no, because as soon as you compute fixpoints you will get the quantifier back, which says: either I am at the first iteration, or at the second, or at the third, so you get an existential over the iteration where I am, and the quantifier that you had here will reappear there, so you have gained nothing; except if the user gives you the invariant, in which case with one iteration you have no quantifier. For the test, it is just a conjunction that you add to the formula, and if that does not work, you can over-approximate, for example [inaudible]. So the implementation is very nice. There is a universal representation of abstract properties, so you have no algorithm to design. All of the operations of the lattice already exist on formulas, so you have nothing to do. For the implication you use an SMT solver, so you have nothing to do. The transformers are purely syntactic, so you have nothing to do. So the implementation is really trivial, and that is really nice. Moreover, you can prove once and for all that the syntactic transformers are correct; you do it once and for all. The only problem is: when you are given a concrete property, how are you going to abstract it?
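Here is a small OCaml sketch of that syntactic assignment transformer on formulas, post(x := e, psi) = exists x'. psi[x'/x] /\ x = e[x'/x], reusing the term type from the first sketch; treating atoms as opaque strings, the naive priming for freshness, and the entails stub standing for an SMT call are all simplifying assumptions.

```ocaml
type formula =
  | Atom of string                      (* opaque atomic formula, e.g. "x > 0" *)
  | And of formula * formula
  | Or of formula * formula
  | Eq of term * term
  | Exists of string * formula

(* Substitution of a variable inside terms, then inside formulas. *)
let rec subst_term x by = function
  | Var y when y = x -> Var by
  | App (f, args) -> App (f, List.map (subst_term x by) args)
  | t -> t

let rec subst x by = function
  | Atom _ as a -> a                    (* simplification: atoms kept opaque *)
  | And (p, q) -> And (subst x by p, subst x by q)
  | Or (p, q) -> Or (subst x by p, subst x by q)
  | Eq (s, t) -> Eq (subst_term x by s, subst_term x by t)
  | Exists (y, p) -> if y = x then Exists (y, p) else Exists (y, subst x by p)

(* Purely syntactic forward assignment: nothing to design, as said in the talk.
   x ^ "'" is a naive fresh name for the old value of x. *)
let assign (x : string) (e : term) (psi : formula) : formula =
  let x' = x ^ "'" in
  Exists (x', And (subst x x' psi, Eq (Var x, subst_term x x' e)))

let join p q = Or (p, q)                (* disjunction is the exact join *)

(* Abstract implication would be discharged by an SMT solver; assumed stub. *)
let entails (_p : formula) (_q : formula) : bool =
  failwith "assumed: check validity of p => q with an SMT solver"
```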
For example, I give you the property "is a prime number", and you have to write a formula that says it is a prime number. It is not a trivial task. You can say it is two or three or five or seven, and then... [laughter]. So the facility that you had before, you pay for it when you have to do this translation. And the other thing is that there is no widening, so you have to define one. I thought I would do this one for Ken, but he is not here. It is a pity, because I wanted to provoke him on the [laughter]… >>: It is recorded. >> Patrick Cousot: What? >>: It is recorded, so if you want to… [inaudible] and we will remember [laughter] >> Patrick Cousot: And so here I have the slide for Ken. What you can do is choose a subset of your domain; you can choose a finite sublattice, for example. If you choose a finite sublattice, then when the iterates do not go fast enough, you jump into the finite sublattice somewhere, and you cannot do that forever, because the lattice is finite or satisfies the ascending chain condition. That is a very simple way of defining widenings. For intervals we can use thresholds, the powers of 2, for example, or the powers of 10, because engineers use the powers of 2, minus or plus 1; so if we are lucky, by jumping to those there is some chance that things will stabilize. So we can use that. Another one: I think Craig interpolation is some kind of widening, the one we call a bounded widening, where you know a bound beyond which you are sure that things are wrong. Here, if you know the specification, you have a bound, and so you can find something in between, which you get syntactically by Craig interpolation. The problem is that it does not enforce convergence, because when you iterate it can always go farther, but you can fall back on the previous solution to get a terminating widening. The first time I saw Craig interpolation, I said: but isn't that a widening? And he had an additional transparency saying "it is not" [laughter]. That is why I wanted to show that. So, reduced product. At least one person is interested; it is not time to sleep [laughter]. Now that you know it is so easy, you can sleep on this one, because it is not yet the reduced product: you see, we start with a version that does not work [laughter]. You take many abstract domains, but finitely many only, because if you have infinitely many it is [inaudible] difficult to implement; [inaudible] I cannot prove. For each of them you take a concretization, and then you take just the product: the Cartesian product, with the implication componentwise. And the meaning is the conjunction of the information given by the components. So, for example, you can have one analysis that uses signs and another that uses [inaudible], and you make the two analyses, and they do not interact. Now, the reduced product is almost the same. You have a finite number of abstract domains, you have a concretization for each of them, the thing you consider is the Cartesian product, but you make a reduction: you say two elements are equivalent when they have the same meaning, and then you quotient the Cartesian product by this equivalence; so you change the abstract domain, putting together all properties that are equivalent.
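A small OCaml sketch of that threshold widening, reusing the interval type from before; the particular threshold set (powers of 2 up to 2^15, each plus or minus 1) is an illustrative assumption.

```ocaml
(* "Jump into a finite sublattice": an unstable bound jumps to the next
   threshold instead of straight to infinity; once thresholds are exhausted
   we fall back to infinity, so the chain stays finite and widening terminates. *)
let thresholds =
  let pow2 = List.init 16 (fun i -> 2. ** float_of_int i) in
  List.sort_uniq compare
    (List.concat_map (fun p -> [p -. 1.; p; p +. 1.]) pow2)

let next_threshold_above u =
  match List.find_opt (fun t -> t >= u) thresholds with
  | Some t -> t
  | None -> infinity

let prev_threshold_below l =
  match List.find_opt (fun t -> t <= l) (List.rev thresholds) with
  | Some t -> t
  | None -> neg_infinity

let widen_thresholds a b = match a, b with
  | Bot, x | x, Bot -> x
  | Itv (l1, u1), Itv (l2, u2) ->
      Itv ((if l2 < l1 then prev_threshold_below l2 else l1),
           (if u2 > u1 then next_threshold_above u2 else u1))
```

With the incrementing-loop transformer from before, this widening climbs through the thresholds (2, 3, 4, 5, 7, 8, ...) instead of jumping straight to infinity, so a bound such as "engineers use powers of 2 plus or minus 1" has a chance to stabilize the iterates early.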
Mathematically it is a triviality; the problem is that algorithmically it is really difficult to compute this thing, because the equivalence classes here are infinite, and there is no way of having an algorithm that will give you this definition. But it is a good definition. So what we do in practice is approximate it by doing pairwise reductions that we iterate; we will come to this. The Cartesian product is useful as the basic implementation, the reduced product is useful as the best that we can do mathematically, and this is a compromise where we make a reduction, but not the most precise one. So let's see an example of reduction. I have one analysis which is intervals, and another one which is simple congruences, and because it is 2 modulo 4 between these two bounds, it can be 2, it cannot be 6, and it cannot be less than 1, so I know now that it is 2; and because it is 2, it is 2 modulo 0. You see that each of the two abstract domains has been reduced by the information I have from the other one, because I have the interaction between the interval and the congruence. Now, a reduction is something that goes to something smaller, and we could always go to false; nothing is smaller than false. We don't want that; we want something that preserves the meaning. So if I have a concretization, some function that defines the meaning of an element of the abstract in the concrete, and I have a reduction, I say that the reduction is meaning-preserving when, applying it to an abstract property, the meaning remains the same in the concrete: whether I compute this or I compute that, I get exactly the same. Meaning preservation just means they have the same meaning in the concrete, although in the abstract they have different representations. And it is a reduction if the result is smaller, that is, I improve for the order on this component here and I improve for the order on the congruence here. Why should I do that? I have a little example. Here, one analysis says X is positive or zero and the other says X is odd. If I then have a test saying X is negative or zero, I get here X is zero, and odd; in fact, knowing that X is negative or zero does not prevent the number from being odd, so I get nothing there. Now, if I do the reduction, I say: because it is odd, it cannot be zero, so it is greater than zero. After this reduction I analyze this, and I get false, because X cannot be less than or equal to zero; it is strictly positive. And here I get nothing; but I make the reduction, and now I get false on both sides; I should have written false here. So I have proved that this code is unreachable, whereas there I could not prove it with the two separate analyses. So, although the meaning is the same, and these two expressions mean exactly the same thing in the concrete, because of the way that I transfer information through the transformers, the result is not the same with this one and with this one in the abstract. So I have an interest in being always as precise as possible in the abstract. So, when I have a reduction, and I assume I have a poset, a concretization, a reduction in the abstract, I can iterate it. I start from [inaudible], and to do one more iteration I just apply the reduction to the previous iterate, and I may have to do that forever; so when I pass to the limit, I take the intersection of all the reductions.
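Returning to the odd/positive example, here is a minimal OCaml sketch of one such pairwise reduction, between the interval type from before and a tiny parity domain; integer-valued variables are assumed, and the cases covered are deliberately the bare minimum.

```ocaml
type parity = PBot | Even | Odd | PTop

(* Each direction must preserve the meaning; only the representation shrinks. *)
let reduce (i : itv) (p : parity) : itv * parity =
  match i, p with
  | Bot, _ | _, PBot -> Bot, PBot                 (* propagate emptiness *)
  | Itv (l, u), Odd when l = 0. ->
      (* odd => x <> 0, so a zero lower bound is improved to 1 *)
      if u < 1. then Bot, PBot else Itv (1., u), Odd
  | Itv (l, u), PTop when l = u && Float.is_integer l ->
      (* a singleton interval fixes the parity *)
      Itv (l, u), (if Float.rem l 2. = 0. then Even else Odd)
  | _ -> i, p                                     (* no improvement found *)

(* E.g. reduce (Itv (0., 10.)) Odd = (Itv (1., 10.), Odd): the two domains
   have exchanged information, as in the talk's interval/congruence example
   where "2 modulo 4 within the bounds" reduces the interval to the constant 2. *)
```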
Because, for example, we have an example where you say: i is greater than zero, then greater than one, then greater than two, then greater than three, and you go on forever, and the limit is false. So you have to take infinite intersections sometimes, and the iterates can be unbounded. It may happen that this limit is not well defined, because the infinite intersection may not exist in the abstract; but then, if you stop at any point in the iteration, you get something better than previously: each step improves. So if the limit does not exist, you stop before. Another problem is that if you do finitely many iterations of meaning-preserving reductions, the result is meaning-preserving and more precise; but if you do an infinite iteration of meaning-preserving reductions, it may not preserve the meaning, which is a bit strange. Here is a stupid example: my concrete has two elements, and my abstract is completely stupid, because I have infinitely many elements in the abstract to [inaudible] this concrete. When I iterate my reduction on this one, you see, I improve in the abstract, smaller and smaller, but if I take the intersection, I go there, and there I no longer have the same meaning. This shows that you have to take care when you pass to the limit. So, wow, there are only two Greek letters, rho and gamma, so it is not so difficult. I will explain [laughter]. It is a big definition for something trivial. It is very difficult to make a reduction on many abstract domains at the same time, so the idea is to do it two by two. I have finitely many abstract domains; for each of them I have a concretization into the concrete domain, and I have reductions two by two. The reduction rho-i-j takes an element in domain i and an element in domain j and reduces them, [inaudible] an element in each of the two domains. I assume that the two-by-two reduction really returns something smaller than the original, and I also assume that it preserves the meaning. Up to now, it is many symbols to say something simple. Now I extend the two-by-two reduction to vectors, and it is very trivial to reduce a vector: I take the two elements, I pick them here, I reduce them, and I put them back in the vector, not changing the other elements. So essentially the pairwise reduction takes two components and leaves the others unchanged. Then I combine them: I take the reductions two by two for all possible pairs in my vector of abstract domains, so I get an operator that applies these two-by-two reductions, and I just compose them. And then I iterate: I take the limit of this operator, which does the two-by-two reductions, considering all possibilities and going on forever. Yeah. So let's go to the result. It says that when you iterate and you pass to the limit, any iterate is more precise than any pairwise reduction, and it is more precise than the original. That's what we want. And we also have that it is meaning-preserving, that is, all of these will be in the class of all equivalent representations of the property. The problem is that they may not be the smallest one in this class. So I have proved that my reductions are correct, but I have not proved that they are the best possible; and in fact they are not. In general, the pairwise reduction is not as precise as the reduced product, but we have sufficient conditions for having the best one; these are many Greek letters, so you can read the paper to see them.
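A tiny sketch of that iteration, specialized to the interval/parity pair of the previous sketch (with a single pair, composing over all pairs degenerates to the one reduction); as said in the talk, termination of such an iteration is not guaranteed in general, although it is for this particular reduce.

```ocaml
(* Iterate a composed pairwise-reduction step until nothing changes. *)
let rec iterate_reduction step (x : itv * parity) : itv * parity =
  let x' = step x in
  if x' = x then x else iterate_reduction step x'

let reduced_pair i p = iterate_reduction (fun (i, p) -> reduce i p) (i, p)
```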
And here is a counterexample. The concrete: my properties are the subsets of {A, B, C}. My first abstraction is the set where I can only say false, it is A, or I don't know. In the second I can say false, it is A or B, or I don't know. And in the third I can say false, it is A or C, or I don't know. Now I take this property: in the first abstract domain I say [inaudible]. In the second, I say it is A or B. In the third, it is A or C. If it is A or B, and A or C, then it is A. So here I should have A, and I can express it, because I have A in my domain; so I have improved this one. Here I cannot express A; I can only say A or B, so I say A or B. And the same for the last one: the intersection would be A, and the approximation of A here is A or C, so I say A or C. Now take the two-by-two reductions: if you reduce these two, you get nothing. If you reduce these two, you get nothing. If you reduce these two, you get A, but you have to re-approximate the result in the domain, so you get back A or B; and here you get A, and you have to re-approximate in the domain, and you get A or C. So you see that any two-by-two reduction reduces nothing, so if I iterate, it will not reduce, and I will not reach the best solution, which would have been this one. To get it, you have to consider all three at the same time; and if there are N, you have to consider all N at the same time. So the reduction, to be really optimal, must take all abstract domains, and if you go two by two, you are less precise in general, but always correct. And it is more efficient to implement, because whenever you introduce a new abstract domain, it is easy to write a reduction with some of the previous ones for which it is useful. So now I have to show that Nelson-Oppen is doing that. I will start with an example, and stay at the level of the example. I take this formula: psi is X equals A or X equals B, and F of X is different from F of A, and F of X is different from F of B. And I have two theories: one where I have A and B, and the other where I have F. What Nelson-Oppen does in the first phase is purification, they call it: they transform my formula into a conjunction of two formulas, each entirely in one theory. This one is in the theory with only A and B, and this one is in the theory with only F. And to do that, as I already told you, you introduce auxiliary variables; here they are Y and Z, which are shared between the two formulas, psi-1 and psi-2. And now I ask the solver: do you have a solution for this one? Then I ask: do you have a solution for this one? And from these two questions I must decide whether the whole formula is satisfiable or not. So the first phase is purification. We have not seen that yet in abstract interpretation, so I don't speak about it now; I will speak about it later. The second phase: from each formula I infer equalities or disequalities between the shared variables. For the equalities that I get from the first: X is A and Y is A, so I have X equals Y; and also X is B and Z is B, so I have X equals Z. And from the second: if X were equal to Y, these would be equal, so it is not possible to have X equals Y; and the same here, I cannot have X equals Z, because this would be false. Then you put this back into the two formulas: now I have X equals Y or X equals Z here, plus X is different from Y and X is different from Z, so I get false. In general I will not get that in one shot; I have to iterate.
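As a reconstruction of that purification step (the slide formulas are not in the transcript, so the exact split below is an assumption consistent with what is said):

\begin{align*}
\psi &\;\equiv\; (x = a \lor x = b) \;\land\; f(x) \neq f(a) \;\land\; f(x) \neq f(b) \\
\psi_1 &\;\equiv\; (x = y \lor x = z) \;\land\; y = a \;\land\; z = b && \text{(theory of } a, b\text{)} \\
\psi_2 &\;\equiv\; f(x) \neq f(y) \;\land\; f(x) \neq f(z) && \text{(theory of } f\text{)}
\end{align*}

From $\psi_2$, by congruence $x = y \Rightarrow f(x) = f(y)$, so $x \neq y$, and likewise $x \neq z$; pushing these disequalities back into $\psi_1$ leaves $(x = y \lor x = z) \land x \neq y \land x \neq z$, which is false, so $\psi$ is unsatisfiable.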
That is, after adding equalities or disequalities I get more of them, which I push back into the other formula, and that gives more, which I push back into the other, and each time I ask the solver whether it can conclude. I have since learned that this is not at all how the solvers do it, but [laughter] when you [inaudible], that is what you understand. Then there are the hypotheses of Nelson-Oppen, because the conclusions would be wrong otherwise. The theories must be disjoint: that means that the symbols A and B are not in the theory of F. This one is something which says that it passes to the limit correctly. And this one says that equalities and disequalities are not enough in general; you would need conjunctions of disjunctions of equalities and things like that, but if the theory has nice properties, you just have to propagate equalities or disequalities. If you have all of these hypotheses, then the procedure terminates, because there are finitely many variables, so you have finitely many equalities and disequalities that you can propagate, so it must terminate. It is sound, that is, when you propagate information, the concretization stays the same, so it is meaning-preserving, as I was saying. And it is complete: if the formula is satisfiable, it will always succeed. I have also seen in some papers that theorem provers use the same technique, although they are not complete. So why do you have these hypotheses? Essentially, to get completeness. If you don't have this one, then you will not propagate enough information, but the information you do propagate will be [inaudible]. If you don't have this one; this one is essentially about passing to the limit correctly, so I don't care. And the disjointness of the theory signatures is there to say that I cannot propagate anything else than equalities or disequalities. In my interval and congruence example, I was propagating more than equalities and disequalities; I was propagating information on values, in fact. And this condition ensures that propagating equalities and disequalities will be sufficient to conclude. So you see that if I eliminate all of these restrictions, I keep termination and I keep soundness, and I abandon completeness. But in static analysis nobody cares about completeness, because we solve an undecidable problem, so you have no complete solution anyway. So that was what I explained. Ah, I forgot: yesterday evening I added this, because I looked at the [inaudible] and I know you have won the competition. Not in all categories, but they are really the best; congratulations. The point is that the provers have to be complete, or otherwise they will not win the competition. So I understand that if you want to win the competition, when the answer is yes, you must say yes, and when it is no, you must say no. But otherwise we don't care, because since everything is undecidable, you can abandon all of these hypotheses, and what is nice is that, if I understand well, you have nothing to change in your SMT solver: you just use it with theories that do not satisfy the hypotheses at all, and it will give something which is still correct. >>: [inaudible] >> Patrick Cousot: What? >>: For [inaudible] arithmetic it is incomplete. >> Patrick Cousot: Ahh. You are already incomplete, so I am worried for nothing, because I [laughter] I just [inaudible] you said it was very important and [inaudible] people want to know why it does not work.
I am happy to see that you're on the right side [laughter] [inaudible] of completeness [inaudible]. So now I have to show that this really is a reduced product. And the problem is that you have more than values, because in the previous setting I was assigning values to variables, but when you go to static analysis, you don't just do that; you do more. The example is quaternions, the generalization of complex numbers used to locate positions in space. And you have a normalization, with the norm of the quaternion, which is this expression; if you make an analysis, this norm must be one most of the time, except when it is not, and then we normalize. And if you analyze A, B, C, D separately, you have no chance to prove that this is going to be one. So what we do is add something that names this expression: we give this value to a variable, say D equals that, and whenever we modify A or B or C or D, we see the influence on the denominator; and when we need these values, we know that they will be in the value of the denominator. So it is like the auxiliary variables I told you about, but it is done in a [inaudible]. In fact, we just decompose the fraction by putting the denominator in a variable and assigning it to D, and D is always assigned the same formula, so we can make a special analysis for it. So in fact it is nothing but auxiliary variables to which we assign a subexpression that we have somewhere in the program, because we want to observe this subexpression. And this first phase of purification is just this: name subterms and keep track of their values. So this is purification, and you know it perfectly. Then, you see, in Nelson-Oppen you take the theories pairwise, formulas and theories, and you propagate the equalities or disequalities of one into the other. And the trick is that this is the only reduction that you can do, because of the disjointness: since the theories are disjoint, they cannot share properties of values. So information on values that you have in one, like being positive, cannot influence any other one, because no other one can speak of being positive; the theories are disjoint. But if you have, for example, parity and intervals or something like that, then the signatures are not disjoint, because they share plus, and so it is forbidden. But then you must propagate more than just equalities and disequalities. Manuel? >>: So does the disjointness, do you require that for soundness, or… >> Patrick Cousot: No. In my opinion, I have reread the proofs, and I really have the impression that it is for completeness. >>: [inaudible] also [inaudible]. >> Patrick Cousot: Yes. But you see, in practice, if you don't make the reductions that I have shown, you will get very poor results. So what they would have to do, if they share symbols, is look at what common information you can express in your theory and propagate it to the other, which makes things more complicated. But here we are. So now we can use this to combine the two kinds of analyzers, and from my discussion with Nikolai a few days ago, I understand that it is a bit different from what you are doing; but I don't know exactly what you are doing, so maybe it is the same. My idea is the following: on one side you have an abstract interpreter which has traditional algebraic domains, for example [inaudible], what you want. And here you have a logical abstract domain with several theories.
And you already have this reduction on one side, because Nelson-Oppen will do it for you; and on the other side you already have these reductions, because analyzers usually have some form of pairwise reduction. So what you have to do now is to write the reductions between the two. For example, if I stick to equalities, on this side you can collect all of the equalities that you can find in these domains and just add them as a conjunction to the formula here. So you inject all of the equalities that you have learned on that side, and then Nelson-Oppen will take them into account. And in the other direction, if you have only equalities, in general the domains can express equalities; it is very rare that they cannot, so you can propagate back from the logical side too. With just this, I think you have a minimal implementation: you just transfer equalities, and it fits perfectly the classical scheme. You still have to find a widening, but if you want to be extremely simple, you stabilize on this side and then you forget about this one [laughter]. Or, a bit less stupid: you do one more iteration, and whatever changed, you put it to top. So that's it. You lose completeness, but the actual reduction is really simple, and the end-user will not need to provide inductive invariants if you do the widening on the left side; otherwise it will be trivial. One thing which is nice is that the interaction with the [inaudible] you can formalize rather easily; when you have abstract domains, sometimes it is used to communicate [inaudible] facts, and it is not that easy. So I think this would solve one problem we have. Also, if you want to introduce a new abstraction, if you introduce it on that side, it costs less; it costs you nothing, because, as I showed, everything is already implemented. And you pay Nikolai for improving his [laughter], for introducing the new theory, but it is not your work: you are doing your static analyzer, so it is not your business, and you can complain to him if it does not work. Whereas on this side, you have to do it yourself, and if it does not work, it is your fault. So: I have no implementation, and I have a student who I hope will be able to do something to experiment with this. That is the good news; it is great. The bad news is that I am not sure that we solve all of the problems in this way. We get more expressivity; I am not sure we get more efficiency; and I am even less sure that we get reproducibility. That is a bit long here: I looked in the dictionary for the spelling, but it was not the proper word; it is some other one. Because, if I understand, when you run the SMT solver two times on the same formula, you don't get the same answer in the same time. >>: [inaudible] [laughter]. >> Patrick Cousot: [inaudible]. So you see, with the [inaudible] abstract analyzer, [inaudible] the analysis of the code of [inaudible] is 6054 hours; it goes to 36 on two or three machines. If you tell them that sometimes it will be 36 and other times it will be 200, they will not like it. We need, for static analysis, a system that behaves the same way in all circumstances. >>: [inaudible] the size of the [inaudible] [laughter]. >>: [inaudible] take exponential time in the size of the [inaudible] every single time. [laughter]. >> Patrick Cousot: So maybe it should be almost [inaudible], but that is where we are. And I have an announcement of a past event before the conclusion.
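Here is a sketch of that minimal bridge, reusing the formula type and the assumed entails stub from the logical-domain sketch; how each algebraic domain produces its equalities, and which candidate equalities to test, are left as inputs, since they are domain-specific.

```ocaml
(* Minimal bridge: conjoin equalities learned on the algebraic side onto
   the logical side's formula, and in the other direction keep only the
   candidate variable equalities that the (assumed, SMT-backed) entailment
   check validates.  Both directions exchange equalities only, as in the talk. *)
let inject_equalities (eqs : (term * term) list) (psi : formula) : formula =
  List.fold_left (fun acc (s, t) -> And (acc, Eq (s, t))) psi eqs

let extract_equalities (candidates : (term * term) list) (psi : formula) =
  (* entails is the assumed SMT stub from the earlier sketch *)
  List.filter (fun (s, t) -> entails psi (Eq (s, t))) candidates
```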
We had a seminar at ENS by Leopold Haller, who is a young student, and he explained that DPLL is an abstract interpretation, and I was convinced, because that was the first time I probably understood DPLL [laughter]. So it may be that there is another connection between abstract interpretation and SMT solvers, because SAT solvers are different, but not so different, from SMT solvers. So there might be other connections that might be interesting to explore. Thank you. [applause] >>: I have a question, I guess. So I understand the main issue is finding abstract domains where the signatures are non-disjoint, and so in the decision procedure integration [inaudible] can you combine theories of non-disjoint signatures… >> Patrick Cousot: Yes? >>: And are there other cases where you can't complete the integration, such as the [inaudible] concrete [inaudible], that even though the signatures are not disjoint, they are not guaranteed? >> Patrick Cousot: You know, I am thinking that you can have a theory of intervals, which is not good for you but for us it is nice, and the theory of congruences that I showed. Because they share plus, presently it is forbidden, because they share a symbol in the signature; but when you make the analysis, it is very easy to transfer an expression to one side or the other. It is no problem; it is the same expression in both cases. >>: [inaudible] converging and not complete. >> Patrick Cousot: Yes. And it will be not complete, but… >>: It will not be complete? >> Patrick Cousot: No, it will not be... with the present state of the art, I think it is not complete, because you have to transfer more than equalities. I don't believe the [inaudible] is… >>: [inaudible] integration [inaudible] and intervals [inaudible]. >> Patrick Cousot: The complete integration would transfer… >>: [inaudible] reduce [inaudible] the bases… >> Patrick Cousot: Yes. We have one. The one we have is complete: we have an algorithm, which was done by Granger years ago, and you get exactly the right reduction, because it is simple and finite. If you try to reduce, for example, [inaudible] with some kind of [inaudible] plus linear congruences, a linear expression equal to a constant modulo a constant, and to intersect that with the [inaudible] would be really difficult. And if you have a domain with [inaudible], to make the reduction with [inaudible] is not so easy either. So you have difficult [inaudible] and [inaudible] which may be complete reductions, but they are so [inaudible] that you don't want to do that. So in the simple cases we have completeness, and in the complicated cases we do an approximation. For example, in [inaudible] we don't do all of the pairwise reductions: there is an order among the abstract domains, and we do the reductions in some order, because we know which reductions are useful. Between pointers and [inaudible], it would just transfer zero, null; we don't care, so we don't do the reduction. >> Francesco Logozzo: Any other questions? Okay, I think we are done, so thank you again. [applause]