>> Nikolaj Bjorner: It's my great pleasure to welcome Arie Gurfinkel from CMU and, in a fitting tribute to Curiosity landing on Mars, he's going to talk about UFOs and state space explosion, so take it away. >> Arie Gurfinkel: This is joint work with Aws, here in big letters, and Yi Li and Marsha Chechik. He is in big letters because he did a lot of the work and also because he's right over there [laughter], so if you have questions that you think of after I'm gone, he is the one to talk to. I give no warranty, so for any mistakes I make you can't sue my employer. I have to do it; I'm sorry [laughter]. So let me start. What this work is about is automated software analysis. I'm not going to spend a lot of time motivating why this is interesting; I just take the position that this works. From my perspective an automated analysis is this box. It takes a program and it tells us whether the program is correct or incorrect. There are two major approaches to building this box. One was pioneered by Ed Clarke, Allen Emerson and Joseph Sifakis, which is now called software model checking with predicate abstraction. Another way to build this box is through something called abstract interpretation, which was pioneered by Cousot and Cousot at about the same time. The reason why I'm showing you these pictures is that my work mostly sits in between these two realms, doing software model checking applications with abstract interpretation or abstract interpretation with some software model checking. This particular work fits more here and the follow-up to it fits more here. If you are interested in it I can talk about it later. I get questioned why this picture is this way, and I just couldn't find a picture of Radhia. That's the picture she has on her webpage. And there are no other [laughter], there are no other pictures if you go to Google Images and look for her. Here's the outline.
I'll talk about the way we classify approaches to software model checking as over- versus under-approximation driven, and the key word here is driven. Both approaches use partial models and things like that, but the distinction is what is primary and what is secondary. I'll talk about UFO, which is our way of combining the two, and then I will talk about the exploration strategy and the refinement strategy, which are two important steps there, and then I will conclude. I don't know if I have to say this, but this talk will be very informal, so if you have questions please stop and ask them. Don't wait until the end. Here's a picture of something called CEGAR, counterexample guided abstraction refinement. We will say it's an over-approximation driven approach and this is how it works. We start with a program. We start with some abstract domain, some initial set of predicates, compute an invariant, check whether the invariant is safe. If it is, we're done. If the invariant is not safe we look for an abstract counterexample. Use an SMT solver to check whether it's feasible. If it is, we have a counterexample and we are done. If not, we are going to go and somehow refine our post operator, get more predicates using either interpolation or weakest precondition, and repeat this process. So we will say that this is an over-approximation driven approach because a lot of the work is done through this over-approximating part: we do all of this work just to get the right abstract domain, and then a lot of the analysis is done in this abstract computation. I'll give you an example just so that you'll see exactly what I mean. We have this program on the left-hand side and it has an error location and we're trying to check whether the error is reachable. So the first question is, is it reachable or not, assuming no overflow [inaudible]. >>: The question [inaudible]. >> Arie Gurfinkel: If there's no overflow then it is unreachable, so let's see how the CEGAR approach will work.
The first thing we do is build an initial abstraction where, for example, we take no predicates, just the control flow of the program, and it will look something like this, where the stars mean nondeterministic choice. We then build a model out of this thing, which looks something like this, and then we ask whether error, which is now at 5, is reachable. Is it reachable here? >>: Yes. >> Arie Gurfinkel: It is reachable. We have a counterexample; now we have to see whether it's a real counterexample, so we go into our feasibility check. We map this counterexample back to the original program. And we say okay, from 1 to 2 that's okay. From 2 to 4 is a problem. So from 1 to 2 the program actually goes from location one to location two. It doesn't go from location two to location four, and the reason is we need this predicate. This is what blocks that transition. So we take this predicate and add it to the list of predicates, building the abstraction where now we will have a Boolean variable B that represents this predicate, and then everything that can be abstracted by B is abstracted as precisely as we can. For example, the loop condition is just B, and this statement says if y was less than or equal to 2 before we subtract one then it remains so, otherwise it is nondeterministic. We build a model and now, in this model, is 5 reachable from 1? >>: Um, yeah? >> Arie Gurfinkel: I may be losing some people [laughter], but there is no path to 5; and so what we have just proven is that the error is in fact unreachable and we have an abstract proof of it. So this is the CEGAR approach. We'll say that it's over-approximation driven because it orients the reasoning toward building the right over-approximation, doing reachability over it, and then concluding that the program is safe or there is a counterexample. Here is another approach which we will call under-approximation driven.
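The CEGAR loop just walked through can be sketched in code. This is only an illustrative toy, not the tool from the talk: the program (y starts at 1, the loop decrements it, and the error requires y > 2), the finite ranges, and the "interpolant oracle" that hands back the predicate y <= 2 are all invented stand-ins; a real refiner derives the predicate from an SMT infeasibility proof.

```python
from collections import deque
from itertools import product

# Toy program: locations 1..5, error at 5, variable y over a small range.
LOCS, YS, ERR = range(1, 6), range(-2, 6), 5

def concrete_succs(loc, y):
    if loc == 1:
        return [(2, 1)]                       # y := 1
    if loc == 2:
        succs = [(2, y - 1)] if y > min(YS) else [(2, y)]  # loop: y := y - 1
        if y > 2:
            succs.append((4, y))              # exit toward the error, guard y > 2
        return succs
    if loc == 4:
        return [(ERR, y)]
    return []

def abstractify(preds, loc, y):
    return (loc, tuple(p(y) for p in preds))

def abstract_cex(preds):
    # Existential abstraction by brute-force enumeration (an SMT solver does
    # this job in a real tool), then BFS for an abstract path to the error.
    edges = {}
    for loc, y in product(LOCS, YS):
        a = abstractify(preds, loc, y)
        for nl, ny in concrete_succs(loc, y):
            edges.setdefault(a, set()).add(abstractify(preds, nl, ny))
    init = abstractify(preds, 1, 1)
    parent, frontier = {init: None}, deque([init])
    while frontier:
        s = frontier.popleft()
        if s[0] == ERR:
            path = []
            while s is not None:
                path.append(s); s = parent[s]
            return path[::-1]
        for t in sorted(edges.get(s, ())):
            if t not in parent:
                parent[t] = s; frontier.append(t)
    return None

def feasible(path):
    # Replay the abstract counterexample's locations on the concrete program.
    states = {(1, 1)}
    for loc, _ in path[1:]:
        states = {(nl, ny) for l, y in states
                  for nl, ny in concrete_succs(l, y) if nl == loc}
        if not states:
            return False
    return True

# Hypothetical interpolant oracle: we hand CEGAR the predicate from the talk.
ORACLE = [lambda y: y <= 2]

def cegar():
    preds = []
    while True:
        cex = abstract_cex(preds)
        if cex is None:
            return "safe", len(preds)
        if feasible(cex):
            return "unsafe", len(preds)
        preds.append(ORACLE[len(preds)])      # refine and re-explore
```

With no predicates the abstract model admits the spurious path 1, 2, 4, 5; one refinement with y <= 2 blocks the 2-to-4 transition and the error becomes unreachable, exactly as in the walkthrough.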
An instance of it is Impact, or lazy abstraction with interpolation, and this is how it works. Instead of spending any time generating the [inaudible] model or abstract counterexample, it says: generate just arbitrary paths to errors from the control flow graph. Once you have a bunch of these paths, check whether they are feasible. You can do it using an SMT solver. If there is at least one feasible path then you have a counterexample and you are done. If not, you try to explain in a uniform way why all of the paths are safe. You can use interpolation or weakest precondition for that. Once you have the explanation as to why these particular paths that you picked are safe, you check whether this happens to be an inductive invariant. If it is, then you are done. You have just proven that the program is safe. If not, then there is some path that you haven't looked at in your program that is safe for a different reason, or may be unsafe. So you go and pick more paths, check if they are feasible, and keep going on and on. So here the approach is under-approximation driven in the sense that a lot of the reasoning is in this part where you deal with a bunch of concrete executions and you are trying to decide whether they are feasible or not, or generalize things from them. So let's see how it works on exactly the same program as before. I haven't changed anything. The error is still unreachable. What we are going to do is first pick a path; any path through the control flow graph will do. We can pick this one and go 1, 2, 4, 5. We then check whether this path is feasible and of course it isn't. And then we get a proof as to why it isn't feasible. The proof here is written as annotations, where each annotation represents an over-approximation of the set of states reachable from the top. Each pair of annotations together with an edge forms a Hoare triple.
So you have to substitute the edge with the appropriate action and that will form a Hoare triple. Here we will see this proof tells us that five is unreachable. The over-approximation of reachable states is false. This proof is not yet inductive because we have this location two, the loop head at 2, and we haven't seen it twice yet, so we're going to look for another path, maybe the one that goes through the loop once, gets out of the loop and then goes in to there. We check whether this path is feasible. It isn't. How do we know it isn't feasible? >>: [inaudible]. >> Arie Gurfinkel: The program is safe. We already know that, so there cannot be a feasible counterexample. So we use, for example, interpolation to generate the proof that explains why this counterexample isn't feasible, and we get something like this that says: initially we start with everything, and then y is less than 2 after we subtract 1. Y is less than 2, we go into the loop. We try to get out and we can't, so there's nothing reachable at 4 in this case and nothing is reachable at 5. So now we also see that the annotation here for 2 is subsumed by the annotation here at 2. So we in fact have an inductive proof of safety: if you look a little bit through these sort of Hoare annotations, you can now apply the Hoare rule for loops and get, using the inductive invariant [inaudible]. That's an example of an under-approximation driven approach. Yes? >>: [inaudible] under approximation then you don't explore the feasible paths. You're doing something else here. >> Arie Gurfinkel: Again? >>: If it is an under approximation and you just explore the under approximations… >> Arie Gurfinkel: So I picked a bunch of possible executions through the control flow graph without using any abstraction or anything. >>: [inaudible]. >> Arie Gurfinkel: No. I don't know if they are feasible or not. >>: [inaudible] approximation? >> Arie Gurfinkel: Because they deal with concrete executions.
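The inductive proof just described can be checked mechanically: every edge must form a valid Hoare triple between its annotations. The annotations and the toy control-flow graph below are invented stand-ins for the slide's program, and enumeration over a small range stands in for the SMT validity checks.

```python
# Candidate annotations per location, as predicates over y. These mirror
# the talk's proof: y <= 2 at the loop head, nothing reachable at 4 or 5.
ANNOT = {1: lambda y: True,    # entry: anything
         2: lambda y: y <= 2,  # loop head: the interpolant from the talk
         4: lambda y: False,   # nothing reaches 4...
         5: lambda y: False}   # ...so nothing reaches the error

# Edges as (src, dst, guard, update): y := 1, loop body y := y - 1, and an
# exit toward the error guarded by y > 2.
EDGES = [(1, 2, lambda y: True,  lambda y: 1),
         (2, 2, lambda y: True,  lambda y: y - 1),
         (2, 4, lambda y: y > 2, lambda y: y),
         (4, 5, lambda y: True,  lambda y: y)]

def hoare_ok(ys=range(-10, 11)):
    # Every edge must be a valid Hoare triple {ANNOT[src]} edge {ANNOT[dst]}.
    # Since ANNOT[5] is False, validity of all triples proves safety, and
    # preservation of the loop-head annotation along the back edge (2, 2)
    # is exactly the subsumption/inductiveness check from the talk.
    for src, dst, guard, update in EDGES:
        for y in ys:
            if ANNOT[src](y) and guard(y) and not ANNOT[dst](update(y)):
                return False
    return True
```

Weakening the loop-head annotation to True makes the triple over the edge from 2 to 4 fail, which is why the first, non-inductive proof had to be extended with another path.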
So I actually write the formula that represents the exact executions, and then I check: if the formula is false, then it represents no executions. If the formula is something else, then it represents some executions. So I deal with concrete executions. If I find a counterexample, it is a concrete counterexample. But I deal with it symbolically, and so in this case I can write a symbolic expression which represents what seems to be many executions but really represents nothing. But the meaning of it is what it represents, and the meaning is an under-approximation with respect to the program. >>: [inaudible] execution points going from 1 to 5? [inaudible]. >> Arie Gurfinkel: We can debate the terminology. You could say that I am working on [inaudible] of a control flow graph and then unrolling, somehow over-approximating what the program does, and I'm thinking I am working on concrete executions and I'm just using symbolic techniques to represent them. It really is just a terminology difference. If you won't allow that step for me then I can't say that there is over and under and that we combined the two, and then the story doesn't work as well. I'm going to show you the techniques that we have, at a much higher level than the example that you saw. I want to show you the things that you saw, but at a higher level, where we don't look at the details but just look at how the exploration happens, where the over-approximating part is and where the under-approximating part is. So we're going to look at a skeleton of the program, and I'm going to show you the over-approximating approach in sort of high-level pictures and then the under-approximating approach in other pictures, and then it should all make sense in the next slide. This is what happens in the over-approximating approach. We go through these explore-refine-explore phases, where we build the model, refine counterexamples, build more models, and the explore phase will look like this.
We start with a control flow graph node, and initially it's labeled by whatever the initial condition is, so blue means that it's labeled by some formula over the predicates. We then make a step, so we find a successor, or pick a successor. The successor is not yet labeled; we just picked it. We then compute the abstract transformer, so we get the label for that successor, a formula over the predicates. And we keep going: we pick a successor, we label it, until we find a successor that corresponds to the error location. In which case we will say okay, we have a counterexample, this one. We have to go through refinement, so we have to analyze it and see if it is infeasible, and if it is, get a new predicate. After we go through the refinement step, we go back to the very beginning. We don't always have to go to the very top, but it is simpler if we go to the very beginning. We now have more predicates and repeat our exploration. Again, we start at 1; we go to 2. We label 2. We go to error, and if our refinement was good enough then this time the error becomes unreachable. That means we label it with something, but that something will be false. We then go find another successor of 2, so here it's in the loop, so 2 is a successor of itself, and then we check whether the annotation here is subsumed by the annotation here, whether we have seen all of the reachable states. If so, we terminate. If not we keep exploring until we get a new error, a new refinement and so on. Is that sort of clear? In the under-approximating approach we also have exactly the same steps, explore, refine, explore, except what they do is slightly different. The explore step just picks a path from the control flow graph. We haven't labeled it and don't know if it's feasible or not. We just pick the path, so everything is labeled by [inaudible]. We then check whether it's feasible, and if it's not we use interpolants in order to get a proof, or compute an over-approximation of the reachable states along this path.
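The blue-label step, computing the abstract transformer over a predicate set, can be sketched as the best transformer done by explicit enumeration over a small finite range. This is only an illustration; a real implementation answers the same questions with SMT queries rather than enumeration.

```python
# Best abstract transformer over a predicate set, by enumeration.
def alpha(preds, ys):
    # Abstract a set of concrete values into the predicate valuations it hits.
    return {tuple(p(y) for p in preds) for y in ys}

def gamma(preds, elem, universe):
    # Concretize: all values consistent with some valuation in the element.
    return {y for y in universe if tuple(p(y) for p in preds) in elem}

def abstract_post(preds, elem, update, universe):
    # Best transformer: concretize, apply the statement, re-abstract.
    return alpha(preds, {update(y) for y in gamma(preds, elem, universe)})
```

For example, with the single predicate y <= 2, a loop-head label where the predicate holds, and the body y := y - 1, the post is again the same label; it is subsumed by the old one, which is the termination check described above.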
I'll use orange color to indicate that we have labels. They are also formulas over program variables, but they come from interpolants. And of course if this path is infeasible the error state will be labeled by false. Then we pick another path, label that path, and keep going on like this until, again, as here, we get that something at the bottom is covered by something at the top; something at the bottom has a label that is subsumed by something at the top. >>: So initially these things don't really look very different, because at the beginning of something like SLAM, you only have a control flow graph, and all the model checker is doing is exactly like picking just a path arbitrarily to error, right? >> Arie Gurfinkel: Yes. >>: Because initially the abstract transformer is just the identity? >> Arie Gurfinkel: Yes, absolutely. So the two things are really, really similar, and I'm trying to argue that they are very, very similar and they should be combined, and I'm going to show you exactly how. The difference, from what you mentioned, for example, is what you do after. In the under-approximating based approach you take the proof and you extract labels from the proof. In the predicate abstraction approach you don't do this. You take the proof and you extract predicates from the proof. You forget the actual proof that you have, and then you go back to your over-approximating stuff and you see whether this new predicate is good enough to rule out the same problem happening. >>: So what that means is that the over-approximating technique actually computes stronger predicates than the under-approximating technique? >> Arie Gurfinkel: Uh-huh. It could be. >>: Because if you take the interpolants and then you abstract predicates and then you run predicate abstraction, predicate abstraction will compute something at least as strong as those interpolants. Blue labels will be at least as strong as your orange labels.
>> Arie Gurfinkel: Modulo [inaudible], but if you take the stronger… >>: Assuming you take the most didactic version of predicate abstraction that says compute the strongest Boolean combination, compute the best post condition using the predicates, right? In fact, if it doesn't compute something strong you're probably in trouble… >> Arie Gurfinkel: Absolutely. >>: So really, with the over-approximating technique, the blue labels are actually stronger than the orange labels. >> Arie Gurfinkel: Yes, because the orange labels are only good enough to rule out those counterexamples, whereas the blue ones may be good enough to rule out future counterexamples that look similar. >>: So the point about under approximation, in other words, is that those interpolants that you are computing come from under approximations of the program's behavior, because they come from just partial behaviors of the program, and yet they are really neither under approximations nor over approximations, are they? >> Arie Gurfinkel: The labels themselves? >>: The orange labels. [inaudible]. [multiple speakers]. >>: Though they are using generally just traces, infeasible traces, to get their… >> Arie Gurfinkel: In principle, the orange labels come bottom-up. You have something very concrete and you generalize it into these labels. Whereas here they sort of come top-down. >>: Well, okay, is BLAST an over-approximating or an under-approximating technique? >> Arie Gurfinkel: Can you hold this thought for a second? I'm going to get to the picture of our approach, and then I can show you how you get all of the points in the spectrum by specializing various parts of this algorithm, and then I can show you exactly where BLAST fits into this picture. I think I'm quite close to that part. The first thing, so this is maybe the discussion that we just had to some extent: if you look at these two techniques we can try to put them into a space. This is an idealized picture.
They don't necessarily look like this. We have the number of refinements here and the cost of exploration, the cost of building the labels, here. In the over-approximation driven approach the cost of exploration is high. We put a lot of effort into computing the predicate abstraction. In an under-approximation driven method the cost of exploration is very low, but the number of refinements can be very high, because we don't compute labels which are strong. We sort of have these two parts, where one has maybe a small number of refinements but a very expensive cost of exploration, so if you get the right predicates and you are lucky you terminate in a few iterations. And here you may need a lot of refinements to combat the fact that your labels are too general, too weak. This doesn't reflect performance; it's not the case that, for example, fewer refinements means that you [inaudible]. It's not clear how this affects performance, but what we would like to have is an algorithm that can be in between these two. Not necessarily an algorithm that can be at any point, but an algorithm that is easy to control, maybe with a knob [inaudible] to squeeze between doing more of this or more of this. This is what this work is. So this is UFO. This is a picture that you should all be familiar with, because this is the infamous state space explosion, which is a big problem in model checking at least. The UFO algorithm has this shape. On one hand it's an interpolation based, under-approximation driven algorithm. It uses orange labels. And on the other hand it's a predicate abstraction based algorithm that uses blue labels. On top of this algorithm, one of the things that sort of makes it work is a novel interpolation based refinement that makes the whole thing work together. So what I'm going to talk about next is the algorithm at a high level, and then I'll concentrate the talk more on the refinement part.
This is how UFO looks in a nutshell, so this is exactly the same picture you saw before, with blue labels and orange labels. What I'm trying to convey is that we will combine things, and so what you should see is one picture that has both blue labels and orange labels. If you see that, then that means we have combined them. It's going to use exactly the same steps as before: explore, refine, explore. First we start in exactly the same way as predicate abstraction, so we find some location L, we label it, and we then go and see what the successors are. This happens to have two successors, an error successor and some inner one. We pick the inner successor. We keep going on, expanding and expanding, and then we hit a loop head of something. We find the successor of L that is also the error location. We check whether we've seen all of the reachable states. If so, we go and compute the label for the error location, and now we have a possible counterexample. So now we're going to go to the refinement stage. So far it was exactly like predicate abstraction. The refinement phase will be different. What we will do in the refinement stage is take this whole unrolled graph and prove absence of counterexamples in it. Not one path at a time but the whole graph at once, and from this whole graph at once we are going to get a proof that explains why there are no counterexamples in this graph, again, all at once. And this is going to give us the orange labels. What can happen here? The orange labels will be in some sense strengthening the blue labels. Why strengthening? Because here error was reachable and here it is not, so it got stronger. But it's not necessarily a uniform strengthening. It's not uniform in the sense that the over-approximation of reachable states at this location can now be bigger than the one at that location.
So what we're going to do now for the next exploration phase is find all of the back edges which are broken, which are no longer inductive, and then we can restart exploration from that point on, possibly with a new predicate that we found. That's it. Let's see how we get the other algorithms. Say we want to get an under-approximation driven algorithm. What we would do is say: here, whenever you compute the blue labels, always use the most imprecise post that you can. Use zero predicates, always. If you always use zero predicates here, you will just unroll every loop once, get interpolants to rule things out, and see if you have converged. If you haven't converged you unroll every loop once again, check now whether this bigger graph has a counterexample. If it doesn't, build the labels that show it doesn't, check whether they are inductive, and keep going on and on. Now say we want it the other way. Say we want to get BLAST or something similar, just the predicate abstraction algorithm. Well, we're going to use all the predicates that we can. Every time we do more proofs and more interpolants, we are going to extract the predicates and use them in the next iteration, but also, why should we restart the exploration from this point? We can restart the exploration from any point here or any point above it. In fact, after refinement, after we have proven absence of a counterexample and we've gotten these orange labels, we can restart the exploration always from the top, effectively killing the orange labels and just using the predicates from them. That will be a predicate abstraction based algorithm. How do we control this interplay between over- and under-approximation driven? If we want it to be more under-approximation driven, then we throw away some of the predicates. We decide which predicates we don't want and throw them out, and the more we throw away the more under-approximation driven we become, because the more imprecise the post [inaudible].
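The interplay just described can be sketched as a driver loop. This is only a schematic with invented names, not the paper's pseudocode: the explore and refine phases are passed in as callables, and the stubs used below exist purely to exercise the control flow.

```python
def ufo(abstract_explore, refine_whole_graph, max_iters=50):
    # Schematic UFO-style driver. abstract_explore(preds) builds blue labels
    # with the current predicates and returns ("safe", None) or a candidate
    # counterexample graph; refine_whole_graph(graph) checks the whole
    # unrolled graph at once and returns either a real counterexample or the
    # new predicates extracted from the orange labels. Keeping fewer of those
    # predicates makes the loop more under-approximation driven; keeping all
    # of them and restarting from the top makes it behave like predicate
    # abstraction.
    preds = []
    for _ in range(max_iters):
        verdict, graph = abstract_explore(preds)
        if verdict == "safe":
            return "safe"
        cex, new_preds = refine_whole_graph(graph)
        if cex is not None:
            return "unsafe"
        preds = preds + new_preds
    return "unknown"
```

With a stub explorer that succeeds once it has any predicate, the loop does one whole-graph refinement and then proves safety; a refiner that returns a real counterexample makes it report unsafe instead.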
If we want it to be more over-approximation driven, then the higher up in the tree we restart. So those are our two knobs. Does that answer your question about the [inaudible]? >>: No. [inaudible] later. It doesn't really relate to… >> Arie Gurfinkel: You asked me where BLAST is and how it fits into it. So the way… >>: [inaudible] if you took a sort of very didactic version of BLAST, what BLAST would say is: well, when you get a bad path, then you are going to do predicate abstraction along that path to update the labels. So the extreme version of BLAST is the extreme lazy one, where you are saying do predicate abstraction along the path but don't propagate those predicates along other paths. Do the refining very locally. That would be the extreme lazy version of BLAST, which you wouldn't want to use necessarily. But if you think about it, that version of BLAST is actually sort of a special case of Impact, because all it's saying is: I'm going to get the interpolants by, instead of taking them directly out of my SMT solver, getting the predicates out of the SMT solver and doing predicate abstraction along that path, and that will give me a stronger… >> Arie Gurfinkel: So like starting higher up? >>: That would give me a stronger set of interpolants, exactly. So what I'm saying is: is BLAST then an over-approximation technique or an under-approximation technique? I would think of it as a special case of Impact taken to the extreme. >> Arie Gurfinkel: I would say that it is an over-approximation driven technique, because the main engine for reasoning is the abstract post of predicate abstraction. So you do a lot of things so that you can apply the abstract post and then… >>: In the end isn't it just computing an interpolant for a path? >> Arie Gurfinkel: It's computing a set of reachable states. It's not really what it is computing, because at the end of the day anything that you compute would be an interpolant for this problem.
It's a question, to me, of how it is computing it. Is it computing it by looking at executions and abstracting and generalizing a proof, or is it computing it by doing something else, by… >>: [inaudible] post conditions, isn't that a proof? >> Arie Gurfinkel: Of course, at the end of the day what it computes is a proof. So this would be… >>: [inaudible] an interpolant from a proof? >>: I think the distinction is: do you take your abstract post and iterate it to an abstract fixed point, thus computing an entire over approximation? And so the blue closed thing is that, right? Now eventually some other technique might do that, but to me over approximation generally means not just computing the abstract post of predicate abstraction, but iterating it to a fixed point. >>: Yeah, but ultimately we get a fixed point from all of these techniques. >>: Yes, but it's a question of whether you are doing it eagerly. >> Arie Gurfinkel: Fair enough. >>: So is BLAST predicate abstraction or not? [laughter]. >> Arie Gurfinkel: If you step far away from it, all that any of them are doing is computing a proof, a [inaudible] style proof of [inaudible] of the program, no matter which way you do it. >>: [inaudible] in the first iteration of predicate abstraction do you have an under approximation or an over approximation? >>: It's an over. >> Arie Gurfinkel: At some point you iterate the post and so you… >>: [inaudible] you looked at only a subset of the control set. >>: No, no. [inaudible] full reachable fixed point on the entire [inaudible]. >>: Let's say that you're doing predicate abstraction, so you have your, just [inaudible]… >>: Think about doing abstract interpretation in general. You are iterating the abstract post. After two steps of the abstract post, do you have an under approximation or an over approximation?
>> Arie Gurfinkel: That's what I don't want; that's a discussion that I was trying to prevent in the beginning, and I didn't succeed. When you look at these techniques and you stop the technique at any point and you say, I have an object in the middle of this technique, what is this object? The safe thing to say is that this object is a partial model of something. It could be an over approximation, an under approximation, but it is a partial model. It describes some information that you have so far. >>: [inaudible] some information about some subset of behaviors… >> Arie Gurfinkel: It is. >>: So therefore it is both an over approximation and an under approximation. >> Arie Gurfinkel: It is. >>: If it's bounded, right, if it's bounded it's under, but it could be with respect to an abstract post which… >>: [inaudible]. >>: Over approximation or under approximation. >> Arie Gurfinkel: Absolutely, but what I'm trying to promote here is that the stress is on the word driven. I'm saying some techniques are driven by the over-approximating way of thinking and some techniques are driven by the under-approximating way of thinking. If you pick any of these techniques and stop in the middle, then of course what it has is a partial model which is neither an over approximation nor an under approximation. But if you look at the way I was, maybe I won't go back right now, but the way I was presenting the two approaches, the CEGAR approach and the under-approximation driven approach, then I would claim what makes it driven is what the most computationally intensive box is, whether the other boxes work towards some box or whether they just work on their own. >>: But I could explain this another way. You are either computing stronger interpolants or weaker interpolants. You are computing stronger interpolants if you use more predicates and you are computing weaker interpolants if you are using fewer predicates.
So you are just dialing the knob between strong and weak interpolation, and if you go stronger you are putting… >> Arie Gurfinkel: Solving a fixed point problem. >>: But you are solving a fixed point problem in every case. >> Arie Gurfinkel: No. The blue part I run until I get the fixed point. The orange part I just run on a bounded problem that I generated through the blue part, so it's known as… >>: Yeah, right, so that's right. There you are right. That is something else, because there you are saying that… >> Arie Gurfinkel: The blue box generates an inductive invariant that is maybe not safe. The orange one generates a safe set of reachable states that may not be inductive. So that's the interplay. In one case you say what's important. In abstract interpretation what you say is important is to generate an inductive invariant. Whether it's safe or not, that's secondary, so you prove your invariant inductive; you don't prove that your invariant [inaudible] is safe. Whereas in an under-approximation driven approach, they sort of say: I want to prove that this bounded version of the program is safe, and then I hope that this will become inductive, so that's… >>: That makes sense and that is the difference. I'm not sure I understand why one is over approximating and one is under approximating, but it is a difference. One is more eager than the other, the one that is going to a fixed point, and that's definitely right. There are cases where that is what you want to do. You want to have some kind of abstraction where you go to a fixed point. >> Arie Gurfinkel: Again, the terminology is something that makes sense to us, and it seems to make a nice story where you have over and under and they come from two sides and you combine them, but perhaps that's bad terminology and something else could be more effective.
>>: [inaudible] the one on the left is really guided by [inaudible] and the other one is really guided by some concrete examples and then [inaudible], and so the one on the left is looking for a proof. >> Arie Gurfinkel: But it's a proof over the approximation. The one on the left never looks at the actual program. It looks at sort of an abstract post, over your predicates. >>: So the proof can deal with abstractions. Whereas the one on the right is really looking for concrete examples, so you have to do concrete execution. >>: But you could also think of the difference as being essentially ways that you [inaudible]: you are saying I am being more eager and I am taking a different set of rules and going towards a fixed point, like a saturation proof. >>: Uh-huh. >>: So the point is taken. >> Arie Gurfinkel: Okay. Here is the algorithm if you really want to know. I don't really expect anybody to read it, but if you want to you can. I just wanted to point out that there is this explore part, which is the function that expands the control flow graph, and there is the refine part, which is this one line and is the more interesting part from my end, so that's the one I'll talk about for most of the rest of the talk. Just for the explore part: whenever I show this slide people ask why you go to an error state first, or what the heuristic for unrolling the control flow graph is. What we decided to do was use a weak topological ordering. Is there anybody here that doesn't know what that is? Ah, okay. A weak topological ordering is a topological ordering of the nodes of the graph that also indicates roughly where each loop starts. For this graph, this is the weak topological ordering: it is described as a sequence of nodes together with brackets, where an opening bracket indicates the loop head and the closing bracket tells us where the loop terminates.
The reason why this is an interesting way to look at this problem is that if you want to compute a set of reachable states via abstract interpretation, instead of using a work list you could do it in a deterministic fashion where you go in topological order, iterating each inner loop until it converges before going to the next one. Here, for example, the exploration would be: you go one, two, three, four and you iterate that inner loop until it converges, then you go five, six and again two until this converges, and then you go on to seven, so you would end up doing all the inner loops before you finish an outer loop. >>: [inaudible] actual loop [inaudible]. >> Arie Gurfinkel: But in abstract interpretation it has this name, weak topological ordering, and the difference is that it also applies if you don't have reducible graphs, so you can order any graph in a weak topological ordering. The reason why we picked this is because it gives a deterministic exploration, so it makes managing things in the algorithm a lot easier, but it doesn't have to be the best exploration strategy. If you don't have any more questions I will go on to refinement. Refinement in our case is based on Craig interpolation. That is a picture of Craig if you haven't seen him before. The Craig interpolation theorem basically says that if you have two formulas A and B such that A and B is unsatisfiable, then there exists an interpolant i that is implied by A, is unsatisfiable together with B, and is over the variables common to A and B. This is one way to write it for sort of model checking applications where you think about A and B being unsatisfiable. In the original theorem it was A implies B and you want to find something in the middle, i, so that you can apply the [inaudible]. We know that we can build Craig interpolants effectively from resolution proofs of unsatisfiability and it's well-known that we can use them to over-approximate sets of reachable states. Let's see how it's used in model checking. 
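To make the interpolant conditions concrete, here is a tiny brute-force check on a made-up propositional example (the formulas A, B and the candidate interpolant are invented for illustration; real interpolants are extracted from the solver's resolution proof, e.g. by MathSAT, not found by enumeration):

```python
from itertools import product

# A ∧ B is unsatisfiable; the only shared variable is 'b', and the
# claimed interpolant i = b mentions nothing else.
A = lambda m: m['a'] and m['b']
B = lambda m: (not m['b']) and m['c']
i = lambda m: m['b']

# All 2^3 assignments over the variables a, b, c.
models = [dict(zip('abc', bits)) for bits in product([False, True], repeat=3)]

a_implies_i = all(i(m) for m in models if A(m))          # A implies i
i_and_b_unsat = not any(i(m) and B(m) for m in models)   # i ∧ B is unsat
```

Both checks succeed, and since i only uses the shared variable, it is a valid Craig interpolant for this pair.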
I'm just going to show you an example of how it is used to over-approximate reachable states and then I'll show you how we use it. If we write Ri for the formula of the ith step of the transition relation, then we take as the A part of our interpolation problem the set of initial states together with the steps R1 up to RN, and we take as B the set of bad states. If A and B is unsatisfiable, that means that there is no bad execution with N plus 1 steps, and then the interpolant is an over-approximation of the states reachable in N steps that does not contain any bad states. Is that sort of clear? Just from the definition of interpolation, it will be only over the variables which are common to the bad states and the last step, and it will be an over-approximation of A, so it will over-approximate the set of reachable states. So this is the picture. A is the set of reachable states. The interpolant is an over-approximation that is inconsistent with B. What we actually are using here is, note, that this is how you would use interpolation to find the set of reachable states. What we want to do is, given a path in the program, build a Hoare-style proof that this path is infeasible, so we will need more than one interpolant, and there is a concept called an interpolation sequence, or path interpolation, that does this. It says something like this. Given a sequence of formulas Ai such that the conjunction of these formulas is unsatisfiable, an interpolation sequence is a sequence of formulas i1 to i(n-1) such that each ik is an interpolant of the corresponding prefix and suffix, and if I pick any interpolant, which represents a prefix up to k, and I take the k plus one step of my transition system, then it is going to imply the next interpolant in the sequence. So that is basically the Hoare-style proof rule that says: this is the precondition, I do this step, and I get into the post-condition. So pictorially this is how it looks. 
We have formulas A0 to A6 that represent the transition relation. We are going to get our interpolants i0 to i5; each one over-approximates the corresponding prefix, and then we want to know that A0 implies i0, that A1 and i0 imply i1, that A2 and i1 imply i2, and so on. So if Ai is the transition relation of step i, then the interpolant sequence will be a Hoare-style proof of safety of that trace. This is the principle that, for example, was used by [inaudible] in order to use annotation of its [inaudible]. Questions? It seems like you are thinking of something and so I didn't know if you had a question. >>: I'm not thinking about that. I'm thinking about something else, I'm sorry. I will try to pay more attention [laughter]. >> Arie Gurfinkel: Maybe I said something that was not quite… >>: Sorry. >> Arie Gurfinkel: So this is the part where it gets interesting. Using path interpolation we can get rid of a single path, but we have a DAG, and so we want to get rid of all of the paths in the DAG at once, and to do this we introduce a new concept that we call DAG interpolation, as opposed to path interpolation. This is what it is. The input to the problem is a DAG together with each edge annotated by a formula, in such a way that along any path the conjunction of the edge formulas is unsatisfiable. What we want to get as the outcome is an interpolant i that labels each node of the DAG with a formula that satisfies the following two conditions. First, if you pick any node, say this one, then its formula has to be an interpolant of any prefix to it and any suffix from it. For example, here I have a single prefix but it has multiple suffixes, and I want i2 to be an interpolant for each of the suffixes. 
The second thing that we want is that if we pick the formula of a node, say i2, and the formula of any outgoing edge, then together they should imply the interpolant of the next node. Here is an example. i2 has to be an interpolant of pi one and pi eight because of the path one through seven. It also has to be an interpolant of pi one and two, three, six and seven because of that path, and so on. And we also want this local rule to be true: any interpolant together with the formula of an outgoing edge implies the interpolant on the next node. >>: So [inaudible] and six. >> Arie Gurfinkel: And i1 can be true and i7 can be false, yes if… >>: [inaudible]. >> Arie Gurfinkel: Yes. That's right. So the question is how do you compute a DAG interpolant? You could try to break the DAG into a tree, compute an interpolant for every path, and then maybe join the interpolants that came from nodes that were joined in the DAG, or something like this. That would not be efficient, because this would be linear in the number of paths through the DAG, and that can be exponential in the size of the DAG. So you somehow want to solve this problem in an efficient way, and by efficient I mean linear in the size of the DAG. This is what we propose. What we actually do is sort of not quite a happy end for us, because we reduce the DAG interpolation problem to a sequence interpolation problem with a step that I will talk about, this clean step, that actually ends up being quite expensive, and this is something that we are still working on, trying to figure out whether there is a better way to do it. But let me first explain to you what is happening. The way that we reduce it to a sequence interpolation problem is through the following steps. First we encode the entire problem, the graph and the annotations on edges, by a sequence of formulas A0 to AN. 
I'll show you how this is done, in such a way that every satisfying assignment of this formula corresponds to a path through the graph. We then get the sequence interpolants of these formulas, so I want, for example, an interpolant between A0 and the rest, then an interpolant between A0, A1 and the rest, and so on. This gives us almost what we want; I'll show you that it also includes some variables that we don't want to be in our interpolants. Then I go through this clean process that gets rid of those variables. Let me show you a step-by-step picture. The encode process is really simple. If you think about this problem for a minute, this is probably the first thing that you're going to come up with. The idea is to introduce Boolean variables, one Boolean variable for each node in the graph, use the Boolean variables to encode the structure of the graph, and then add constraints saying that if you have taken an edge then the condition on that edge has to be true. So if you look, this is sort of what it looks like. It says initially V1 has to be true, because we start at one, and then if one is true then two has to be true and its condition has to be true. At two I have two successors: either I go to three and then pi 2 has to be true, or I go to 7 and pi 8 has to be true. But what I also am going to do is order this encoding according to a topological order of the nodes as I go. The topological order simply means that whenever I am at a node I have seen all of its predecessors. Now I'm going to take sequence interpolants between this point, this point, this point, this point and this point. Just to show you what this means I am going to do an example of one interpolant in the sequence and try to explain what this formula actually says. Say we are going to pick this position, A3. 
Say we are going to get i4, which is the fourth element in our sequence. If we cut the constraints here, what happens? If you think visually, what is being represented? The constraints up here describe this part of the DAG. They have all of the paths together with all of the edge labels. The constraints which are left are these things. Effectively, drawing a line in the constraints here is like taking the DAG, finding a node in it, and then cutting it in such a way that everything which is topologically under it, so either reachable from it or incomparable to it, goes to the bottom and everything else goes to the top, and of course there are some edges that cross this boundary. Crossing this boundary means that this edge uses a variable, say V7, and the variable V7 is used here as well. What I am interested in is getting a label for 4, but what will the interpolant be? The interpolant will be an over-approximation of this formula that is inconsistent with that formula, described using the common variables. It will actually be an over-approximation of all of those guys. It will be the set of reachable states at 4, at 5 and at 7. It over-approximates what you could get at 7 from here, at 3 from here, at 4 from here and at 5 from here. This is a variable that I want to somehow get rid of. This is our clean step, getting rid of these variables, and it's very ugly and very expensive, so instead of looking at the formula, let me try to explain what it means. If you look at this formula here, what does the interpolant look like? Ideally it would be a clean formula that says V7 is true and some constraint about 7, or V4 is true and some constraint about 4, or V5 is true and some constraint about 5. If it had been a nice clean formula like that, you would just throw away the 7 and 5 parts and you would just keep the 4. 
But that's fairly easy to do even if the formula is not in this nice state, because we can just substitute the [inaudible] of the 4 for the 7 and the 5 and simplify the formula. Another thing that we are going to get is that any variable that, say, only [inaudible] in this location could, according to interpolation, appear in here, and it does. It often appears inside here in the form of some [inaudible]. It can't appear there in any meaningful way, because this has to over-approximate the set of states reachable here. This variable is not [inaudible], cannot be constrained, but it can appear in [inaudible] in very strange ways like this. The only way to get rid of it that is actually sound is to universally quantify it out, and this is what the clean step is: substitute some variables by true, substitute some variables by false, and then whatever you cannot get rid of, quantify it out. And this is where the big bottleneck of this procedure is, even though we can apply various heuristics to try to find [inaudible] and get rid of them in a better way than by just giving it to a quantifier elimination procedure. At the end of the day that's what we do. We go to Z3 and we ask it to quantify out whatever is left, and in practice, if the algorithm ever gets to this point, that more or less means that we lost, so if we ever need to go to Z3 and ask for a quantifier elimination and the problem is not solved very, very quickly, then this typically means that it will become gradually really, really slow in this iteration or the next one and then it will not converge. Just to summarize the whole procedure: we start with a DAG where edges are labeled by formulas, we encode the DAG using these extra variables, we get a sequence interpolant in order to get the initial guess of the labels, and then we use quantifier elimination to make the labels what we want. 
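The encode step of that pipeline can be sketched as follows (a hedged illustration, not the UFO code: the DAG, the pi formula names, and the string representation of formulas are all made up; the real thing builds SMT terms). One Boolean variable per node, one constraint per node, emitted in topological order; sequence interpolants are then taken between consecutive constraints, and the node variables that leak into them are what the clean step removes:

```python
# Hypothetical DAG: node -> list of (successor, edge-formula name).
dag = {1: [(2, 'pi1')],
       2: [(3, 'pi2'), (7, 'pi8')],
       3: [(7, 'pi6')],
       7: []}

def topo(dag):
    """Kahn's algorithm: every node appears after all its predecessors."""
    indeg = {n: 0 for n in dag}
    for n in dag:
        for s, _ in dag[n]:
            indeg[s] += 1
    ready = [n for n in dag if indeg[n] == 0]
    out = []
    while ready:
        n = ready.pop()
        out.append(n)
        for s, _ in dag[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return out

def encode(dag, entry):
    """One formula per node: if we are at a node, we take one of its
    outgoing edges and that edge's condition holds."""
    seq = [f"v{entry}"]                    # execution starts at the entry
    for n in topo(dag):
        if dag[n]:
            rhs = " | ".join(f"(v{s} & {pi})" for s, pi in dag[n])
            seq.append(f"v{n} -> {rhs}")
    return seq

constraints = encode(dag, 1)
# Sequence interpolants would then be computed at each cut of this list,
# e.g. between constraints[:2] and constraints[2:].
```

A satisfying assignment to the conjunction of these constraints picks out a path through the DAG, which is exactly the property the reduction needs.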
We use a single call to an SMT solver to do the refinement, but we do have the quantifier elimination step. In principle, I know how to phrase it: we have a technique now that doesn't use quantifier elimination but requires more SMT calls, and in principle one can prove that you don't need the additional SMT calls. That is, all of the information that you need is in the initial proof, but in order to do this you need to sort of open up the interpolating procedure and start dealing with it, that is, change the interpolating procedure itself, whereas so far we have been using MathSAT as a black box, sort of feeding things into it and then extracting them and massaging them ourselves. Does this appear fine and natural? We start with blue labels. We mix in orange labels. We continue with the blue and orange labels. The more orange labels you have, the more under-approximation driven you are; the more blue labels you have, the more over-approximation driven; and you control your algorithm by having more or fewer of those labels. >>: Can you say a little bit about the heuristics that you use for making those decisions? >> Arie Gurfinkel: Right now, we don't put a lot of effort into it. Right now the heuristic that we use is that we use orange everywhere and then blue as [inaudible]. >>: And when you compute the predicates… >> Arie Gurfinkel: We don't, we do exactly this. >>: You're saying that you had some knob that you can… >> Arie Gurfinkel: You could restart exploration from the last place that you left off that got broken, or restart it from other places, and we haven't tried that. We always restart exploration from the very top. >>: But so when you use predicates to compute your blue labels here, how do you decide which predicates you actually want to use? >> Arie Gurfinkel: We use all predicates from the predicate abstraction, but then we use Boolean predicate abstraction. 
Not Boolean, Cartesian predicate abstraction as opposed to Boolean, so something that's inexpensive to compute. But we have really been concentrating not on these heuristics yet, but on extending this further into other abstract domains. I'll mention a little bit about that. What I've told you about is UFO sort of as an algorithm, but there is also another side of this, which is UFO as a framework. This whole thing is built on top of LLVM as a front end. It's publicly available, including source code, so we would like people who are interested in this, maybe who are interested in different heuristics, to try them out and help us. In terms of the architecture it is fairly straightforward. We start with an LLVM front end. We use the LLVM optimizer in order to massage the program to make it simple for verification. For example, we don't deal with memory at all. We only deal with integer programs, but the LLVM optimizer does a pretty good job at inlining lots of things, not just inlining functions, but also lowering memory accesses into registers whenever it can. And then we build a cut point graph. It wasn't apparent here, but we always work with a representation of the program where a single edge is a longest loop-free path instead of just a single statement, and we call this a cut point graph. It's a control flow graph representation. Then we build the abstract reachability graph, and while this is happening you can pick your refinement strategy, which is one of several ways to use this DAG interpolation procedure. You get to pick what abstract domain you want, and we support predicate abstraction and also some of the abstract interpretation domains, and then the expansion strategy, which is how far back and forth you want to go. And we use Z3 and MathSAT together. We use Z3 mostly for satisfiability and for simplifying the problem, and then, when the problem is simple enough, we use MathSAT to generate the interpolants. >>: What does it mean to simplify the problem? 
>> Arie Gurfinkel: To extract an unsat core. >>: [inaudible]. >> Arie Gurfinkel: We found out that as the problems get larger, using MathSAT to directly extract interpolants is not feasible, so what we do is we add lots of assumptions, meaning [inaudible], build a minimal unsatisfiable subset, and then take that part and give it to MathSAT to get the interpolants. >>: Do you try to make it, so you don't necessarily get the minimal [inaudible]? >> Arie Gurfinkel: What we actually do right now is, we have several heuristics implemented which we don't use, but what we actually do right now is we do it three times. We put assumptions everywhere, check with the assumptions, and when it comes back we remove the assumptions which became false, and do this again, three times, and then… >>: Do you remember anything about how much more you get doing it three times? >> Arie Gurfinkel: No, but I can give you the… I don't remember offhand. I'm sorry. Aws tried it. I should really talk to him. So you try a few numbers, and three stuck. >>: [inaudible]. [laughter]. >> Arie Gurfinkel: Okay. I just want to show you some numbers just to see whether this thing is working or not. The takeaway point is that combining things is good, but we don't have a silver bullet, so we don't know a way of combining things that always works. What I'm going to show you is experimental results for the following configurations. We have something we call UD, which means that you never have blue labels; you always use [inaudible], the most precise one. We have OD, which means you always restart from the very top, so you compute the orange labels and then you take the predicates and restart from the very top, and the idea is to use Cartesian predicate abstraction or Boolean predicate abstraction. Again, the abstraction is always taken over a loop-free program, so even the process of generating a predicate abstraction is quite expensive. 
It's not on a single, simple statement. And then the two combined ones: UFO combined with Cartesian and with Boolean. What we've validated this on was the software competition benchmarks from last year. We didn't compete, because this work was done [inaudible] and the deadline for the competition and the work were at the same time and we didn't make it. We also have some of the Pacemaker benchmarks that Aws has worked on before. Just to compare it with an existing tool, the only one we have compared it against is Wolverine, but since then we've looked at, for example, what C [inaudible] is doing; they were in the competition and we do quite well compared to them. So here are the total numbers. They are not very meaningful, but people often like to see what happens in total. Cartesian predicate abstraction seems to be the best, in that it takes less time and solves more instances. But the benchmark set is very uneven, so it's very hard to look at the global number and know what's a good technique or not. If we go deeper and look at some of the examples, you have probably seen these examples before. They are in the competition from prior work. This is the token ring protocol and these are the SSH server examples from BLAST that Marsha Chechik has done a while back, which encode various handshaking protocol verifications. And here we've got, as promised: as you go down the predicate abstraction part, the number of refinements goes down, so the more predicate abstraction you do, the fewer refinements you have. It doesn't mean that your time gets smaller, but at least we have that. Then in terms of time it's sort of a little bit all over the place. Sometimes, like here, it's good to not do any predicate abstraction, because predicate abstraction just takes too long and so you die in the first iterations. Sometimes it's really, really good. 
Sometimes you just get the right predicates and you terminate very, very quickly, whereas simply unrolling things and hoping that the interpolants will converge doesn't work. >>: [inaudible] do you use forced covering? >> Arie Gurfinkel: We didn't use forced covering at all. >>: It's tricky when you [inaudible]. >>: Yeah, okay. I'm not sure how you would apply it. >> Arie Gurfinkel: Forced covering is a heuristic that says: if I have something that is not inductive, check what happens if there is an inductive loop invariant on the side. >>: [inaudible] induction to Impact, and for the path-based Impact it's very important, but I don't really understand whether it makes sense here. >> Arie Gurfinkel: Our intuition was that the predicate abstraction stuff would do this for us, that it would generate strong invariants… >>: Right. I'm just asking about the one without predicate abstraction; Impact by itself is really too lazy, and if you wanted to get it to converge on loops you would have to go a little bit more eager. >> Arie Gurfinkel: We don't do that at all. And then this is a look at the unsafe examples, so here of course is what you would expect again. The number of refinements follows the same pattern, more refinements here and fewer here, but it's hard to see in this table because it doesn't solve a lot. What you see here is that if you have unsafe examples, where there is a counterexample, then unrolling is great. You can just unroll, give it to an SMT solver and that's it. You don't waste any time building something that you know will not work. So the qualitative part is that we know that the solvers that we have are useful, that you could configure the system in such a way that it would solve a particular example faster. The not-so-good news, which is [inaudible] for the future, is that we don't have a silver bullet that says do this configuration, or how to infer it from a problem, or any of those things. So these are more or less the observations. 
>>: [inaudible] switch between… >> Arie Gurfinkel: If we find time. What we find is sort of the best strategy, looking at the numbers: it seems like you should try a couple of iterations being very lazy and then gradually bring in predicate abstraction, because the cases where you do really, really badly are the ones with a really clear counterexample within very few iterations; there you are much better off unrolling and confirming it before spending time building your set of reachable states. So there is a lot of related work to this. These are just some of the tools. Of course Ken has his Impact and Impact2 tools. There is a tool Wolverine by Weissenbacher that implements the Impact algorithm with bit-level precision. Compared to it we do a lot better, but that doesn't mean anything because we are not bit-level precise; we use a different solver, so that doesn't tell you anything about [inaudible] techniques. There is also a paper that came out this year called Ultimate, which is Impact with large block encoding, which is in some sense very similar to an instance of our system, but they don't have the predicate abstraction, and the refinement that they do is still based on each individual path as opposed to a whole DAG. We've also looked into an interprocedural version of this, so we have this tool called Whale that uses interpolants, but in a slightly different way, to try to get procedure summaries. We haven't yet combined the two together, but that is on our to-do list, and there are several small sort of technical issues because the interpolation problems that we solve are slightly different, but they should play well together. There is also work coming out from Natasha Sharygina and her group on summarization, using interpolation to get function summaries. Their goal is not really to try to prove programs, but rather to try to prove that two versions of a program are equivalent to some extent, so they call it upgrade checking. 
It means that if you take a program and you change some of the functions, you want to reuse the previous proof to prove that the new version, given this change, is still safe. The difference is that they don't look for the inductive generalization part; they only do single interpolation problems. So this was at the beginning of the year. We collected this list at the beginning of the year and there is more recent work that's also related. In this year's [inaudible] there was a software model checking with IC3 paper by Cimatti and Griggio, and it is really very similar, to a large extent, to what we do, in the sense of reusing the orange labels. That is really the key intuition there. I don't quite agree that they've extended IC3 to software, but at least they bring in the orange labels and they also show it's quite effective. There is work by Ken on Duality, which does the interprocedural version of this problem, and Nikolaj has generalized property directed reachability, which sort of opens all this up; one way I view it is it says let's not just get interpolants from the proof, let's force the solver to give us the interpolants we want. And there are several other papers here. There is a space here because they are related but not as closely related. There is work by [inaudible] on solving recursion-free Horn clauses, and this is really, really similar to the DAG interpolation problem that we have, so it is actually kind of a generalization of it, but unfortunately I've tried reading this paper and I don't understand it. If anybody here can explain it to me that would be great, because maybe they've already solved all the problems. 
And then there is this whole other line of research, and I put one paper here by Sinha, which comes from sort of a program analysis perspective but is really solving the same problem. From a program analysis perspective they would say: if I want to reason about a program with lots of functions, I can inline all of them, but that blows up, so I want to have some sort of strategy for how to inline functions, and maybe use the solver to guide me how to inline back and forth. They have a paper at [inaudible] this year where they propose that particular strategy and it seems interesting, but they do it completely outside of the solver, from a program analysis perspective. You build bounded programs and then you feed them to the solver, the solver gives you results, and based on that you decide how to unroll the bounded program. >>: [inaudible] similar to [inaudible] you're doing. Was that that paper? [inaudible] reachability modulo theories? >> Arie Gurfinkel: That's not that paper. That was another paper that could've been included in this list. Yes? >>: [inaudible] description. >> Arie Gurfinkel: Yes. They had something that they called stratified inlining. >>: Yes. >> Arie Gurfinkel: So the idea is that you first leave functions uninterpreted and, based on satisfying assignments, inline further. The alternate part says that you don't have to start from main; start from close to your assertion and then go out, and also use interpolants to learn summaries of functions that you have already seen and use those instead of the actual bodies if you can. If I'm missing something, let me know. I'm sure I'm missing a lot of more recent work. If you're interested, and this is sort of towards the end, if you're interested in what you were served, this is sort of the nutritional label. That's what you get in the box. 
There is some abstract interpretation, which I have talked about, and there is some verification condition generation, some CEGAR, some interpolation. It's all mixed together. You apply it, typically for one second, and it may work. >>: [inaudible]. [laughter]. >> Arie Gurfinkel: You make it [inaudible]. Check it out. Talk to Aws. Look at the papers. Just to conclude, sort of a past, present and future. We started this work with this tool called Whale, where we really were interested in extending interpolants to interprocedural analysis. We built a tool. It worked. But we couldn't find lots of problems for it, because there aren't that many problems with recursion that we could get our hands on, but what we invented in there was this concept of state and transition interpolants, which I think is still a quite interesting concept to explore further. We moved to some other domains and said okay, if we don't have recursive programs we still have the dealing with DAGs and with cut point graphs, and we said let's look at the software competition benchmarks and what we can do with them, and this is where UFO came from. After working on that, and looking at where predicate abstraction helps and where it doesn't help, we said we should really bring abstract interpretation into this picture as well. So now we have this new work that will appear in SAS; it's called Vinta. Vinta stands for, what does Vinta stand for? >>: Verification with interpolation and abstract interpretation. >> Arie Gurfinkel: And maybe Aws can at some point tell you more about it. I found out, and it was a big surprise, what Vinta means; the Vinta that I knew is a boat from the Philippines. >>: A what? >> Arie Gurfinkel: A boat. So you have a whale inside the ocean, a boat on top, and UFO… It's a colorful boat with kind of a sail. But this is not the logo, because apparently in [inaudible], which is an Indian dialect, Vinta means novelty or surprise [laughter]. 
So if you go to any translation tool and you say translate novelty to [inaudible], you get this. This is their script, and apparently that means Vinta. So what Vinta adds to the picture is using abstract interpretation, which means we need to figure out how widening fits into the picture, and I have an interesting story there. It also made our interpolation problem so much harder, just because abstract interpretation would unroll the control flow graph much more, and it made us sort of rethink how we were using information. If you look here, we generate orange labels and then blue labels and orange labels, and we never mix the two kinds of labels together. That forced us to think about how to do this, and we have this new abstract-interpretation-guided way to do DAG interpolation where basically we take the inductive but not safe invariants and bring them into the interpolation problem, so that interpolation does not look for the reason why the program is safe from scratch, but instead repairs the inductive invariants that are not safe into ones that are safe but maybe lose inductiveness, and that was a lot more effective in that space. That's it. [laughter].