>> Shaz Qadeer: It's my pleasure to welcome Damien Zufferey again to Microsoft Research. Some of you might remember him as an intern about a year ago when he started the P Project at MSR. He's visiting us again. He'll be here today and tomorrow. He just finished his PhD in Tom Henzinger's group at IST Austria, and he's starting a postdoc in Martin Rinard's group at MIT in the fall. He has a variety of interests in formal methods, language design and reliable systems. And today, he is going to tell us about some work he did automating separation logic using automated theorem proving. >> Damien Zufferey: Thank you for the introduction. It's always a pleasure to be here visiting MSR. So this is joint work with Ruzica Piskac, who is actually visiting MSR these days and is around for the week, and Thomas Wies at NYU. Okay, so what's the motivation? Separation logic is pretty nice for succinctly expressing very complex invariants of heap structures, and it has a lot of good features. For instance, spatial conjunctions allow you to easily reason about aliasing: which elements are in some subpart of the heap and which others are in another part of the heap, to make sure that those are different. There are inductive predicates like lists and trees which allow you to easily define those structures, or the shape of the heap. And then, there is also the frame rule. So the frame rule is an extension of Hoare logic for separation logic to also reason about [inaudible] frame, so that when you call a method only the part of the heap which is touched by the method will be changed and the rest is not changed. Now what is not that nice, on the other hand, is that even for decidable fragments separation logic tends to use specialized provers, which means they are hard to integrate into a much larger framework. And it's also not clear within those fragments how you combine different theories. 
So if you want to reasoning with integers because you have pointer arithmetic then we like to bring it all down to some -- particularly what we'll try to do is reduce some decidable fragment of separation logic down to some first-order theory and preserve [inaudible] really cheap. Maybe just the example is some very simple concatenation which takes two lists and returns a single list. The first part would be the specification, so we have pre- and postconditions and an invariant. We assume that all our programs are [inaudible]. We don't do any invariant generation but just see how to go from this part and that part which is expressing separation logic between the fragments of singly-linked lists down to firstorder [inaudible]-like program. So that was then and also for the specification. And what that means is here we have this inductive predicate which defines lists and the sequence of nodes and here Node A going to null -- Go from A to null -- and here, the same for B to null. And here in the middle we have the stars so the separating conjunctions. So [inaudible] means that the whole heap can be -- so the part of the heap which is touched by this method can be divided into two parts which are disjointed and one contains one list, the other containing the other list. And here the definition of list segments. The last node like here, null, is not part of the allocated heap. And again for the invariant, so we have two list segments and the separating conjunctions seeing that those two are disjointed. And what we can see is here the list segment is not included as part of the footprint of the last node. Here curr is not considered as part of this list but curr is part of the second list. Okay. And then now for the frame inference. So when you start executing your programming and you get to the loop. 
The loop has an invariant; we need to check that as a precondition, which basically says that this loop will modify two list segments, one going from A to curr and the other going from curr to null. So we look [inaudible] in our heap. Okay, we have a list segment from A to null, which is good, but then we also have this list segment from B to null which is not going to be touched by the loop. Here basically we have the requirements of the precondition and then we have the invariant that we want to prove, but this is not the whole heap, so here we need to find the frame, basically figure out that this part is actually the frame. Okay, so those were the nice features of separation logic. And what are our contributions? Well, we reduce -- while preserving the [inaudible] satisfiability -- from separation logic, so the fragment of singly-linked lists, to a decidable first-order theory. So then, we can fit into the SMT framework. Then, we can do satisfiability, entailment, frame inference and abduction, all using automated first-order provers. And then, we will also be able to fit the decidable first-order theory that we'll show later into the [inaudible] framework and combine it with other theories. And finally at the end we'll speak a little bit about the implementation in the GRASShopper tool. First we'll just go over the translation and then the second part, implementation. First, what is the decidable fragment of separation logic we are looking at? It was introduced by Josh Berdine [inaudible] in 2004. It's separation logic for singly-linked lists. Basically a formula can be [inaudible] this equality of variables means that the [inaudible] on the heap X points to the [inaudible] Y list segment. I will show later how they are defined. And also the separating conjunctions of two heaps. 
That was the original fragment and here we'll extend that will some Boolean structure on top of that, so on top of this separation link formula you can imply negation and conjunctions. >>: How do you interpret sigma? When you interpret sigma do you imagine that there is a variable called heap? >> Damien Zufferey: Yeah, I guess that will come here. So you mean when you interpret this? So, yeah, that's the semantics of how we interpret this. We have something which is equivalent to the standard definition but slightly different in the sense that we have both some interpretation of the heap which we'll consider as total so far for variables and then a subset of the heap which defines over which subpart of the heap the formula should apply. So we call that the footprint. >>: So A is a map? >> Damien Zufferey: Yes, exactly. So it's a bit more general because -- So here for the moment we'll just consider one pointer which is called H. So here A will be the map giving the interpretation to H, so taking cell [inaudible} and returning the nodes that are pointed by this. Yes, in that case it would be a total map. >>: From end to end? >> Damien Zufferey: What we call nodes or memory cell to memory cell. So we just [inaudible] disjoint. We need to combine [inaudible]. We want to like merge integers and -- We'll speak about this later. So here basically what it says is that when you are speaking about equality [inaudible] so interpreting A. So the [inaudible] check whether this is true or not. And the footprint of those two formulas is [inaudible], so here we use the [inaudible] interpretation of separation logic formula [inaudible] Formula H is defining precisely a subpart of the heap so you cannot have anything on the side. So everything must be [inaudible]... >>: I still don't understand what -- Okay, what is X superscript A? >> Damien Zufferey: Oh, that's the interpretation of X in A. So the value that A is giving to X. >>: But A is [inaudible]... 
>> Damien Zufferey: So, okay, A is giving the interpretation both to variables and to the function H. >>: I see. >> Damien Zufferey: So in a sense H is a map and for the variables it's just giving them a value. >>: Subset X is the subset of the domain of A? [inaudible] >> Damien Zufferey: It will be actually -- It will be both because in some sense the interpretation of the heap is total, and here it will be [inaudible] -- it's the union of the two so it's subset over the whole heap. So it's just saying which part of the heap you are touching. So you can touch part of the range and part of the domain of H. Yes? >>: So the normal A's there should be curly A's? >> Damien Zufferey: [inaudible] >>: Bullets 2 and 3. >> Damien Zufferey: Oh. >>: There is no distinction... >> Damien Zufferey: So here? Yeah, okay so I'm sorry, that's... >>: They're all curly A's? They're all curly... >> Damien Zufferey: Yeah, they're all curly. >>: Okay. >> Damien Zufferey: And okay and the point two -- Well, you start looking into the interpretation to find the value of the map which -- Here we call it H and the footprint will be just the node X as in the classical separation logic. Then for separating -- Question? >>: Yes. So for this view, the mapping is from the current to the next node. Is that what... >> Damien Zufferey: Yes. So here will -- Yeah, the current to the next node. So here we'll denote that by function H but then later we can have like multiple pointer [inaudible] structures. But, yeah, it's just the pointer to the next node. >>: So for any last element [inaudible] list, that would be null or --? Because [inaudible]... >> Damien Zufferey: Not necessarily. We'll see on the next slide how the list is defined. >>: Okay. >> Damien Zufferey: At lease like these segments where you give both the start of the list and the last node of the list. >>: What is this? >> Damien Zufferey: Oh, that you will substitute -- So here when [inaudible] H1 in the interpretation -- Well, okay. 
Actually X -- Sorry, a typo. So actually here instead of X you put U1. Yeah, actually that was copied from -- Yeah, so forget the bit here. Just think of that. So when we did the subpart of the heap of the formula H1, instead of X use U1 just for the footprint. Yeah, that was a typo. Basically when you evaluate it to subparts first you have this existential quantification of guessing the separation somewhere and then, you evaluate -- they have to be disjoined. And, okay. Yeah, actually we can also have interpretation of X and A. Sorry I should clean that up. Basically you guess two subparts of the heap which has to cover the whole heap, so the whole X, and that has to be empty so disjoined. The intersection is empty. And then, when you evaluate the subpart you will just instead of X use U1 and U2. Okay, yeah that was a mistake. Then for the list: so for the list basically we consider only finite lists so no infinite lists. So basically a list segment going from X to Y corresponds to this. First you guess the length of the list and then you say, okay, a list of length zero just says that the two nodes are the same and the footprint of an empty list is empty. And then there is inductive definition that says that going from X to Y there exists a node -- Yeah, so there is a node U such that -- Okay, I'm very sorry. So basically you say that the list -- so you have X points to Z so Z is the successor of X. And, oh, yeah. So it introduces a new variable Z which has a value U which is a node. So in the interpretation you pick one node and then, you say you have X points to Z. And then, you have a list of length N so just one list. And [inaudible] this inductive step and then you need to force inequality of variables that's given by the separating conjunctions. And then, the interpretation of the Boolean connective just that the conjunctions basically is taking the conjunctions. 
And here what is the difference between -- But here there is no separation so there is no need for guessing two parts of the heap over which to interpret; they are both interpreted over the same part of the heap. And for the negation basically here we just leave the negation as not the interpretation of the formula. >>: What's the role of Z in the third line? >> Damien Zufferey: Z is supposed to -- So it's just -- Okay, I think it was a mismatch between variables and the interpretation of variables. So... >>: Where is the boundary? >> Damien Zufferey: Yeah, I think -- No, actually -- Okay, basically Z should be bound with U. >>: [inaudible] variable. >> Damien Zufferey: Here we have the -- We know that there is a successor which is different and this successor has to be node. So [inaudible] H would be bound also here. So you guess a variable in the heap which would be quantified then you give it a value and interpretation. And then you just say that Z should be a successor of X. >>: So it's basically the intermediate. >> Damien Zufferey: Yeah, exactly. That's the one-step successor. >>: Right. Which is going to be existentially part of [inaudible]. >> Damien Zufferey: Yeah, it should be bound here. >>: Yes. >> Damien Zufferey: Yeah. >>: So U is also -- But where is... >> Damien Zufferey: U is just to give an interpretation of the X, sorry. So just say that's it's the... >>: Well there's some H, yeah. >> Damien Zufferey: So it's just to say that's it's there in the interpretation.... >>: So it's essentially saying there exists an edge such that X goes to Z, Z goes to some other U. Right. Okay. >> Damien Zufferey: Yeah, there is an edge to the next successor which has interpretation and then recursion. So there is... >>: I want to say that U is in that LS of N or Y or is that just understood? >> Damien Zufferey: No, it's just we wanted to say that some -- Actually here instead of U we could have like Z of A, so the interpretation of Z in the --. >>: Yes. 
>> Damien Zufferey: So the role of U is just to give a meaning of Z. >>: Okay. >>: So I think when you write there the A with the Z goes to U, it seems to me that you're not intending a substitution; you're trying to say that Z is the thing that already maps a U. Because you don't want to change the... >>: That's not a substitution. That's an edge [inaudible]. Right? That's Z is... >> Damien Zufferey: So you just want to [inaudible] say that in this interpretation... >>: It is... >> Damien Zufferey: ...you just take the... >>: ...[inaudible] that there's such an edge, right? I mean... >> Damien Zufferey: Yeah, it just gives you... >>: [inaudible] >> Damien Zufferey: You just say that -- Yeah, so the property of the interpretation being [inaudible], you know, that there is a U being mapped to the successor. And just, you want to name it Z. >>: Oh, it is a substitution. Oh, I'm confused now. >> Damien Zufferey: So the idea is that the interpretation is totals [inaudible] if the function successor is defined. So someone named that successor Z, so we'll just in the interpretation give a name to the successor. >>: Why is X not equal to Y kind of at the metalevel but Z not equal to X at the formula level? >> Damien Zufferey: Okay, here in the -- Yeah actually here that should be -- Yeah, that's missing. Because it is somewhat actually implied by the separating conjunction. But let me think if there is some detail. >>: [inaudible] >> Damien Zufferey: So basically here if N -- Okay, N should be greater than. Yeah, sorry. >>: Because when you choose Z, you're choosing a Z that's not in a domain of A. Is that correct? >> Damien Zufferey: Actually we want to introduce a fresh name to be sure there is no clash of names. So in a sense, yes, Z is fresh but it cross points to a limit which [inaudible] is already existing in A. >>: Okay. >> Damien Zufferey: So it's just introducing a fresh name so it won't clash. >>: So you don't say Z superscript A because you can't [inaudible]. 
Is that right? >> Damien Zufferey: We know at that point Z superscript A is U. But yeah I think, okay, that should be -- the way of doing it would be to [inaudible] question of quantifying Z and saying Z is not the same so you get a fresh Z. Okay, yeah. Okay. And now on to the graph reachability part. So here we have a few terms: terms can be nodes, type of node. And apply H to this successor function then we can have equality of our terms. Then the reachability predicate: we'll explain later the reachability predicate. And stratified set. So that's the [inaudible] fragment. This part has been introduced but this part is -- Sorry. Okay, maybe I should just explain that more clearly. So the idea is that we'll lift our separation logic formula into these two parts. So the part that was A will become related to this graph reachability part which will be giving the prediction to the successor functions to terms, nodes in the heap. And then, since we'll not be able to have this total interpretation of every node we'll have partial interpretation of the heap and use this reachability predicate to [inaudible] for all the nodes that are there in between [inaudible] a list segment. You don't want -- You want to abstract for every single node in the list. That will be the total interpretation. And following the pointers along the list from one term to the other. So that's the reachability predicate. >>: What is H? >> Damien Zufferey: H, that's the successor function. So basically... >>: Is there just a single H? Or there could be lots of H's? >> Damien Zufferey: For the moment we'll have just a single one. At the end I will talk about an extension where you can have multiple ones. But here the idea is H -- What is this term? So here say that you can, from this term to this term -- So you can reach this term to this term without -- Maybe I should go to the next slide for the semantics. Okay. Say that you can reach this term without seeing this term in the bus. 
So that says we can reach this from this and from this you can reach this, and you won't see that term along the path. >>: Just like [inaudible] under the arrow. >>: Yeah. >> Damien Zufferey: Oh, okay. >>: Yes. >> Damien Zufferey: And for the set we have stratified sets. So classical set operations and also set comprehension. There here in the set comprehension we have some [inaudible] restriction that will needed for decidability that you cannot have the term which is bound in the comprehension occurring below the successor function into the definition of the set. And then, we can have set equality and [inaudible]. And then, on top of that we have kind of a [inaudible] Boolean combination of those two parts. Okay. So maybe just -Yeah, those parts were about this theory of functional graph reachability. Types of the elements in here are nodes. Here we consider just one successor function which is H, but you can have a finite number, a fixed number of H if you want to solve for the theory. And then, we have predicate symbols like here that's [inaudible] reachability so going for A to B without seeing C. And, okay, stratified set theory will have elements of nodes and sets. And function symbols: empty sets which is zero and intersection, union and deference. And that's predicate [inaudible]. Okay so maybe I should have put that slide before. Speaking about functional graphing reachability, we assume that this successor is really a function so any node is mapped to another single node. And why do we want these to be functions and why a graph and not just functions? Because the interpretation is really a function that you can follow and here we want to take a reasonable transitive closure by this reachability predicate. So we are interested in the path and not just the direct successor. And here, okay, so this means that from T1 you can reach term T2 following a path along the function H without seeing term T3. 
And we'll also use this as a shortcut -- So we'll use [inaudible] reachability as a shortcut for saying that. You don't see the last element because in the definition -- We'll see later -- going from X to X without seeing X is true because you don't need to follow a path; you are directly in the node. And we also use another shortcut which is between X and Y, so it's defining the set of all the elements between -- all the Z that can be reached from X following H without seeing Y, so all the nodes in between and especially excluding the last one. Okay, what does that mean? So let's say that you have three nodes: X, Y and Z. And here you don't have to -- you want to reason about the path, so when I'm making these dashed nodes that means there can be many nodes in between in the heap. So here you have a path from X to Y and a path from Y to Z. And Z [inaudible] so it's pointing back to itself. And we have three graphs. And then, here you have a path from X to Y and then, Y has some other path but it never goes to Z. And here X, Y and Z. And these may be looping back to somewhere between X and Y. Now we just check the values of this predicate. So for instance, going from X to Y without seeing Z is true in all of them because there is always this segment. Now going from X to Z without seeing Y is false in all of them because going from X to Z you always need to go through Y; there is a single successor so you cannot have another path. Then going from Y to Z without seeing X is true in the first two and here it's not true because there is no path. Here again there is no path. And [inaudible]? Oh, here you have a path pointing back to somewhere in between, so you don't necessarily need to know which node it's pointing to, to have a path going here. So those paths can merge somewhere without knowing exactly the position of the merge. And then, in the end -- Z to X -- Okay. Again there is no path to X. 
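The reachability predicate and the "between" shortcut described here can be made concrete with a small sketch. It assumes the heap is a Python dict mapping every node to its single successor; the names `reach_without` and `between` are made up for illustration and are not taken from GRASShopper. The example graph is the first one from the talk, with each dashed path collapsed to a single edge.

```python
def reach_without(h, x, y, z):
    """Return True if y is reachable from x along h without stepping out of z.

    h is a dict giving every node exactly one successor (a total function).
    Zero steps are allowed, so reach_without(h, x, x, x) is True.
    """
    seen = set()
    cur = x
    while cur not in seen:
        if cur == y:      # reached the target
            return True
        if cur == z:      # the path would have to continue through z
            return False
        seen.add(cur)
        cur = h[cur]
    return False          # ran into a cycle that never reaches y


def between(h, x, y):
    """All nodes reachable from x without seeing y, excluding y itself."""
    return {n for n in h if n != y and reach_without(h, x, n, y)}


# x -> y -> z, and z points back to itself.
h = {"x": "y", "y": "z", "z": "z"}

x_to_y_without_z = reach_without(h, "x", "y", "z")  # the x..y segment exists
x_to_z_without_y = reach_without(h, "x", "z", "y")  # must pass through y
z_to_x_without_y = reach_without(h, "z", "x", "y")  # no path back to x
segment = between(h, "x", "z")                      # x and y, but not z
```

Because every node has exactly one successor, a single walk from `x` is enough; no search over alternative paths is needed, which is the point made above about there being no other path from X to Z.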
Okay, so the usual way, which has been done a few times, is to say that we translate separation logic to first-order logic using two different parts. The first part is to translate the structure of the heap. And that was, in our semantic definitions, the interpretation of the variables in terms of cells in the heap and the interpretation of the successor function. And then, to have sets to encode the footprint of the formula. But here we'll do something slightly different because, for instance, we want to have [inaudible] dealing with [inaudible] that will need negation on top of the separation logic formula. So if we just use that transformation we will lose decidability. >>: I've got a question. If you didn't have those lists inside your separation logic fragment, is there still a challenge? Is the challenge there because of star or because of lists? >> Damien Zufferey: So it would be because of the star. So the list will map fairly directly to this theory of graphs. But then, the star will be linked mostly to the footprint. So we saw that at some point there is this existential over the two subparts of the heap. So when you take negation, they should become universal. >>: Okay. >> Damien Zufferey: And that's really bad. >>: Why is that bad? That's first-order logic. >> Damien Zufferey: Yeah, but we want to preserve decidability -- So you have this theory of lists which is decidable. So if you remove lists, you're also decidable and you want to preserve decidability. So you are mapping to these sets, first-order sets, which are existentially quantified. As long as you existentially quantify, you are easily decidable. And then, you take the negation for that [inaudible] and change those existential [inaudible] to universal ones. And here you might lose decidability, at least in the case of lists. Yeah, if you have this direct translation, you lose decidability here. Otherwise, we have a slightly more complex translation to preserve decidability. 
>>: Would it be fair to say that if I did not care about efficiency and decidability and what not, that the only goal was to convert to first-order logic, is this trivial? >> Damien Zufferey: Yeah, because you just take the semantic definition... >>: Okay. >> Damien Zufferey: ...[inaudible]. But here the problem is that when you [inaudible] we'll need to [inaudible] a part of the heap and when you take the negation for all possible partitions of the heap, [inaudible]. I mean the critical part will be this for all partitions of the heap that will be intrinsic by negation. So actually we have a slightly different translation that says that this part which is the structures, so it's still an interpretation of the node and the successor function but it also has some information about how the different subparts of the heap are connected so which parts are disjointed and which parts have a union and which parts are empty. And then, here instead of all constraints about the footprint we just give the definition. So we'll see what our definition is. So that's just a leaf term in the translation. So the translation should come here. We'll go recursively over the formula and generally these two part, one part about the definition of this set. So here the translation means the structures of X equals Y over the footprint set big Y. So here it says that we have equality and the definition of Y here that should be the empty set. For not equal, again, the same. Now for the translation of X going to Y. Okay, we see that the successor is Y but we say then the definition of the part of the heap over this formula is interpreted as X. Now for the list segment from X to Y. So basically we have a map directly to this path from X to Y. And here we have the part of the heap, the footprint of this formula is basically all the nodes between X and Y including X, excluding Y. And now comes the tricky part -- well, one of the first tricky parts. Okay, we will recurse over the two subformulas. 
And for that we introduce two fresh sets that will be then defined into the translation, into G1, G2. And then, we'll just part this as part of the definition of the set. We have introduced this fresh set but we still need to define this Y, so the definition of Y will come here. So it's the union of the two subheaps. And then when you're done, the question of disjointness will be put on the part of other structures. The structures of the two subformulas are here. And then we have also to say that those two subheaps are disjoined. So here one of the critical parts is to separate the definition of Y from the constraints over Y. So that was for the part of the separation logic formula. Now for the Boolean structures on top of that. You have contraction negation and when you get to the end of the Boolean structures, we have the separation logic formula. Here we look at the fresh variables and call the structural translation we saw in the previous slide. And here we need to push to again define X. So X will be defined as equal to Y. But again we need to keep -- So Y is defined here and this we don't touch because now the part is a negation. But what does negation mean? Well, we just [inaudible] translation and here we only negate the constraints of the other structures. We do not touch the definition of the set. And that's the second critical part. Okay then for the conjunctions it's just the intersection of the structure and the definition. And for the top level, just take the union of both. But here we had these two critical parts. Here the negation is [inaudible] formula. And previously we saw that here we somewhat split the Y into the definition and the constraints over the set. >>: Damien, just to check my understanding: the constraints at H means you can represent the tree that knows its children, right? If you represented let's say a binary degree where the node knows its two children, [inaudible]. >> Damien Zufferey: So nothing is fragmented. 
>>: [inaudible] parent but not... >> Damien Zufferey: Yeah, this fragment you can do -- That's one extension we want because here basically you'll see that the thing we need to reason about this segment is something in the [inaudible] theory that directly matches and that is decidable. So as an extension, we could extend that to a decision procedure. So a first-order tier we reason over tree and then we would be able [inaudible] tree predicate. But that's not something that we have implemented yet. But then if you have this, you can use exactly the same idea and generalize the translation to get -- The trick that you need [inaudible] the Tr in the supporting conjunction and in the negation. So they are not in the definition of the predicate. So we just need to find a theory that somewhat matches a decidable first-order theory that matches the predicate, the separation logic predicate that you want to reason about. So I come to the translation so we have this acyclic list which is non-empty so from X to Y and Y to Z. And then, we have this additional constraint that Z is not equal to X. And then, how does that translate? Well the first part is this. So here we have the two parts, the structures, the definition of the set. Then for the pointer it's kind of the same, so just a successor in part of the heap. List segment which [inaudible] predicate and the between. And now, okay, [inaudible] yeah. For the separating conjunctions we have one part which is the structural constraints that say that those parts are disjoined and the other part is saying that Y1 is union of the two. For other separating conjunctions it's the same. And then at the top level we have the definition of X equal to Y1. So now what becomes interesting is when we have negation on top of that. Here that's exactly the same as we had before, so I'm just ignoring the negation now. And now there's just this inner part and I'm going to apply the negation. 
So this part did not change at all, but this part on the other hand you just apply negation as you would expect. And here is the negation [inaudible]. And the question is now why do we need to do that? So here in G we are introducing this variable Y, and they are existentially quantified. And we need to keep them existentially quantified. And the problem is that, yeah, if we had negation they might slip to universal. So what's the observation that allows us to do this interesting -- to apply only negation on the F part? Well, we see that this Y1: before they are defined only as set comprehension or union which means that in any given interpretation of the heap you can find the part G will always be satisfiable. And actually it's stronger than that. What we can prove is that if you do this type semantic the formula that we have reasoned over and interpreted over the whole heap, there is actually one single assignment to this Y variable that makes the definition true. And now what's interesting is if this existential quantifier has a unique solution, well, we can slip from existential to universal while preserving equivalence. So here basically those two formulas are equivalent because for all solutions or existing solutions if we know that when their solution is a single one, it's the same. And here what happened at the negation actually is that we applied negation just on that term and [inaudible] the universal to stay in this nice case and just have negation over F. So that's actually the core of the translation in proving completeness in [inaudible] is that observation. So we really need to almost split the definition and make sure that those subsets are unique, those subparts of the heap are unique. And then, when we take negation we can just preserve the quantification that's existential. Okay so now just maybe quickly now that we have this translation, what do we do with this formula? 
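The step from existential to universal quantification can be written out explicitly. The notation here is an assumption made for illustration: F stands for the structural constraints, G for the footprint definitions, and \bar{Y} for the fresh set variables introduced by the translation. This is a sketch of the argument as stated in the talk, not the formulation used in the paper.

```latex
% If the definitions G pin down exactly one value for the set variables,
% "some solution satisfies F" and "every solution satisfies F" coincide:
\Bigl(\exists!\,\bar{Y}.\; G(\bar{Y})\Bigr)
\;\Longrightarrow\;
\Bigl(
  \exists \bar{Y}.\,\bigl(G(\bar{Y}) \wedge F(\bar{Y})\bigr)
  \;\Longleftrightarrow\;
  \forall \bar{Y}.\,\bigl(G(\bar{Y}) \rightarrow F(\bar{Y})\bigr)
\Bigr)

% Negating the universal form gives back an existential formula
% in which only F is negated and G is untouched:
\neg\,\forall \bar{Y}.\,\bigl(G(\bar{Y}) \rightarrow F(\bar{Y})\bigr)
\;\Longleftrightarrow\;
\exists \bar{Y}.\,\bigl(G(\bar{Y}) \wedge \neg F(\bar{Y})\bigr)
```

Chaining the two lines shows why negation can be applied to F alone: the negated formula stays existentially quantified over the set variables.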
So this formula over this graph reachability and stratified sets has two parts: the part about sets and the part about graph reachability. So we'll just speak quickly about the part about sets. So what happened with the sets? Okay, you just put everything normal form. For every set that is not equal you can replace them by fresh variable that is in the symmetric difference. Then, we can eliminate set comprehension. So each time in some context that you have a set comprehension you just introduce a fresh set variable and some axiom, so universals calls that say that you define the membership in the set X as the value of the predicate R. And then, we need to eliminate those universals so that's now why we need a stratified set. And what is nice about this theory is that this is a local theory. So what does that mean? It means that we just go over the formula and collect all [inaudible] terms of sort nodes so all the terms that do not contain any quantified variables and then, we just expand. So for every [inaudible] term we generate a conjunction of the calls. And then basically we have a quantifier-free formula in the end. And since the theory of stratified set is local, this instantiation is complete. So that's the part about sets. So let's say that we have this formula. After applying the translation, we get this so nothing special. But then we say, "Okay, we just collect all the [inaudible] terms," so it'd be Y, Z and -- Oh, here also need to introduce the value -- Oh no, the value here. Yeah. So we have this Y, Z and the value and we just instantiate over the two closest and get this larger formula. And then, from this large formula we can just revert it into a simpler form. And then what we'll get is that set [inaudible] is the universe, this part over -- So W is in the universe which is in S so this has to be true. 
And then, we'll need to combine with the reachability part which we'll see on the next slide which says, "Okay, if from the value you can reach Y and from Y you can reach Z, well, by transitivity of following the path, you'll be able to reach Z from Y." And that's something you should not do. So here we'll get the contradiction. Okay, so we saw how to get rid of the sets but now how to get rid of this reachability part. Actually it's the same idea. Again, this theory of graph reachability is a local theory but here now we'll apply the same thing but [inaudible] just one of these few definitions [inaudible] to expand. Actually we have a whole set of axioms to expand. We'll not go into details but if you want the details you can check this POPL paper by Thomas and Nishant. But what's the intuition behind this [inaudible]? So that's what I was explaining before; if we look at the complete heap, in particular in the total interpretation, then we have nodes and pointers and everything. But obviously this might be of some arbitrary size so we cannot reason about the whole heap. So what we'll do is replace those paths by this reachability predicate. So here basically we go from a total interpretation of the heap to a partial interpretation, and we can prove that this partial interpretation actually embeds in the total heap if and only if there is an interpretation of the total heap that satisfies the formula. That's what is shown about those axioms, this embedding between the partial interpretation and the total interpretation [inaudible] preserves satisfiability. And, yeah, so that's the key part. And also then for decision we have all these axioms which are also proved to be local. So again we just go over the formula, collect all ground terms of type node, instantiate all the axioms and get a set of constraints which are satisfiable. Okay, so that was the first part. 
So we have seen how we can reduce this separation logic of linked lists to this graph reachability and stratified sets. And since we have negation, on top of satisfiability we can also do the entailment check. That was one of the motivations so that we have a uniform way of dealing with them. But now there is also this question of frame inference and abduction -- or abduction is called sometimes anti-frame inference -- so that if you have a formula F interpreted and formulas A and B, you need sometimes to find a frame. You have the precondition of [inaudible], you have the current state and you need to find -- B will give you what will be touched by the [inaudible] you are calling. But you need to figure out which is the part of the heap which is not going to be modified. And then, there can be also the reverse problem of finding the part here that you might need to add to the current state of the heap for satisfiability. So for that we'll need the inverse translation from GRASS to SLL. And just for the sake of simplicity, we'll assume the formula that we have in GRASS corresponds to a formula obtained by this translation. We can lift it to any general formula in GRASS. But just to make it simpler now I've just assumed that it comes from the translation. And then, we'll need a model-generating solver, for instance Z3. In this step we'll get those partial interpretations from the solver. So we push formula F through the solver and get a partial interpretation. And then, from this partial interpretation we'll just lift the formula to separation logic, then push a blocking clause for that partial interpretation to the solver and go to the next partial interpretation and basically [inaudible] until the formula is no longer satisfiable. And then, we have the complete frame. So actually this translation is mostly of theoretical interest and we'll see later that we can actually avoid that part when we do analysis of programs. 
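The enumeration loop described above (get a model, lift it, block it, ask again) can be sketched in plain Python. This is an illustration under strong assumptions: a brute-force search over Boolean variables stands in for the model-generating solver, total assignments stand in for partial interpretations, and the names `solve`, `enumerate_models`, `x_eq_y`, and `y_eq_z` are all hypothetical.

```python
from itertools import product

def solve(variables, constraints):
    """Toy stand-in for a model-generating solver: return the first
    assignment that satisfies every constraint, or None if there is none."""
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(constraint(model) for constraint in constraints):
            return model
    return None

def enumerate_models(variables, formula):
    """Collect all models of `formula` by repeatedly blocking the last one."""
    constraints = [formula]
    models = []
    while True:
        model = solve(variables, constraints)
        if model is None:
            return models  # unsatisfiable: the enumeration is complete
        models.append(model)
        # Blocking clause: rule out exactly the model just found.
        blocked = dict(model)
        constraints.append(lambda m, blocked=blocked: m != blocked)

# Hypothetical pure constraints over two equalities between variables.
variables = ["x_eq_y", "y_eq_z"]
formula = lambda m: m["x_eq_y"] or m["y_eq_z"]
models = enumerate_models(variables, formula)
```

In the setting of the talk each model would be lifted back to a separation logic formula before being blocked, and the disjunction of the lifted formulas would form the frame. The number of models can grow exponentially with the number of equalities, which matches the blow-up mentioned later in the talk.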
We don't need to go back from first-order to separation logic to deal with the frame. And how do we lift partial interpretation to separation logic? So from the partial interpretation, we'll extract the successor function which is a mix of the reachability predicate and the successor as a function H which will be the next successor. So if the function H is defined on that node then it will be the application of H. And if H is not defined, we might have just the reachability predicate and not the concrete value. In that case we just take the first node that satisfies the reachability predicate as successor. We also extract the part about what's called a pure part, so the part about equality or inequality of variables and just lift everything. So nothing really special. For instance what's the use case? So assume that we have a list segment and then we are in a branch that was testing that X is not equal to Z. And then, we are just calling that we want to free the head of the list. And the precondition: we'll say that X has to point to Y so basically X is allocated. So when it's translated we get the constraints coming from the if branch and the rest coming from the assumption. If we push that then we get two partial interpretations that say that, okay, X points to Z and Z might be equal to Y. So X points to Y and Y might be equal to Z. And here the set Z will be in the constraints as defined as the difference of the footprint of the precondition so the frame. And also there might be X points to Y and there might be a list segment going from Y to Z. So the list here might be either of length one or more than one. Okay, then we can just lift the translation, so we have the pure part so the equality and inequality variables. And then here when we lift, the list is empty so just Z equal to Y. And then, [inaudible] this part can be lifted to list. And the whole translation says that we have a list from Y to Z, and X is neither Y nor Z. Yeah. 
So this applies [inaudible] and actually we'll see later how to bypass this. Now a motivation was we want to combine with other theories, so we have proven that this combination is actually stably infinite with respect to the sort node. So we can then [inaudible] function symbols. But then what about an extension, given that here we have just H, just one function symbol? We also lift this to many different function symbols [inaudible] successor to represent structures when you have multiple pointers. And you can also do read and write over this field, this H when there are many of them. And actually we can combine with the other theories so that when you modify one field, the other fields are not changed. Then we want to do something more interesting. We want to combine this translation in a way that is not purely disjoint in the [inaudible] part but we also want to express invariant [inaudible] data and constraints in the list. So here we have this part which -- Here basically the theories are not really disjoint but they become inter-linked. So we cannot do that in the [inaudible] case but we do that only if the constraints over the data is again some kind of local theory. So the intuition of being local is, say, that you can define your theory with a finite set of axioms and just prove that if you instantiate your axioms over the [inaudible] that you currently have, then for your formula that is sufficient. And for instance we can express sorted lists. So there is the first part which is the exact translation of lists but then we have this additional clause that says -- Oops. Sorry. Oh yeah, it says that for two variables in Y -- So Y being the footprint of the list -- if Z can reach Y then the data of Z is less than the data of Y. And here we can again use the definition of Y into the part and we get this universally quantified clause which is local, so here we can have these kinds of theory combinations. We can have sorted lists and more complex data structures. 
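The sortedness clause described above can be checked on a concrete heap with a short Python sketch. The names `reachable` and `is_sorted_segment` are hypothetical, the heap is a dictionary from node to successor, and reachability is taken to be reflexive, so the comparison is written as `<=`. That last choice is an assumption; the talk only says "less than".

```python
def reachable(succ, start, target):
    """True if `target` is reached from `start` by following `succ`
    zero or more times. `None` plays the role of null."""
    seen = set()
    node = start
    while node is not None and node not in seen:
        if node == target:
            return True
        seen.add(node)
        node = succ.get(node)
    return node == target  # only true when both are None

def is_sorted_segment(succ, data, footprint):
    """Check: for all z, y in the footprint, reach(z, y) => data[z] <= data[y]."""
    return all(
        data[z] <= data[y]
        for z in footprint
        for y in footprint
        if reachable(succ, z, y)
    )

# A three-node list a -> b -> c -> null with increasing data.
succ = {"a": "b", "b": "c", "c": None}
sorted_data = {"a": 1, "b": 2, "c": 3}
unsorted_data = {"a": 1, "b": 5, "c": 3}
footprint = {"a", "b", "c"}

ok = is_sorted_segment(succ, sorted_data, footprint)
not_ok = is_sorted_segment(succ, unsorted_data, footprint)
```

The clause only quantifies over nodes in the footprint, which is what keeps it in the shape that local instantiation over ground terms can handle.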
Then there was the question of trees. So for the moment we don't have an implemented decision procedure for trees. If you have a decision procedure for trees, you can do it. But for instance we can do doubly-linked lists because doubly-linked lists actually are normal lists over which you add additional structural constraints to make sure that the previous pointer is what you expect. And now the implementation. So, yeah basically the idea is that you start with a program with this separation logic specification and apply some reduction step and get a program that looks like [inaudible] Boogie [inaudible] specification. Okay so here we'll see examples for how it works on these concrete examples. But the idea is basically you first replace part of the [inaudible] structures, like if, with choose and assume. Then for uniformity we replace loops by tail-recursive methods so that we only need to deal with recursion. Then comes the interesting part which is the translation, and also we'll need to modify the formula to add those variables corresponding to the heap. So that will be something we'll see in the example. Then we can also do memory access checking that [inaudible] safe and do the frame inference. Well, the last part is just --. Let's look at this example. Okay do we have a mouse? Yeah, I don't have the right [inaudible] so hopefully this knows about it. Basically the idea is that we start with this program. So first we define the structures. Also what we want to work on is automatically translating that kind of inductive definition into this first-order logic [inaudible] into the tool these parts. So the translation is hard-coded and we're working on making this automatic. Okay so this [inaudible] separating conjunction. And then we have this procedure -- That's the one I showed before -- and that's the input. So now what does it look like after the first step? Okay, so the definition part did not change. 
Now we just replace the if's by a nondeterministic statement that assumes the condition or assumes the negation of the condition. New variables are introduced by havoc statement. As we'd expect now we still have a loop. And oh, yeah. And the updates to the field are replaced by some updates like over the array theory and, like, we consider next as the field is encoded [inaudible] as an array. >>: You said that you introduced new variables but [inaudible] statement it looks like... >> Damien Zufferey: Okay, yeah, they are already declared. But the first time they are used, which correspond to the -- Thank you. Here they correspond to these statements. Here we'll put [inaudible] but we don't know the value so that's the havoc statement. >>: I have a question about that. In your predicate lseg it looks like you have between the X equals Y or X is not equal to Y separating conjunction. But then, you have an X in there. >> Damien Zufferey: Oh, yeah. Because here we have also -- So I generally would say that X points to Z and the list segment is ZY. So here we just consider this as a memory cell allocated but without introducing the Z because, otherwise, we need to introduce a quantifier. >>: I see. So it's the separation logic that provides that X and then the arrow and then like underscore... >> Damien Zufferey: Underscore, yeah, but we want to avoid introducing this existential successor and here we call the list segment -- So here we start already in the definition. >>: So it's like the active predicate that you [inaudible]. >> Damien Zufferey: Yeah, exactly. So now it will start becoming interesting. Okay. So here now we start rewriting the program. Let's say, okay, the data structures: the next is just a field pointing to the location. And then, this list segment is into two parts. One part is the structural constraints and the other part is the footprint. 
And then we again start to expand the precondition which then becomes basically a call to this -- So it's introducing now here the existentially quantified sets that are the footprint. And it's just basically forcing the constraints that we have, so the union for instance that this one is [inaudible] of the two. And then we have the leaves so the domain of the list segment. And then, the structural constraints that the intersection of the two is empty. So what you would expect for this part. Then, we do the same. So here we are in -- Yeah, so here we have [inaudible]. And it starts to be really big. And, okay, then another part that was changed that -- Also now before we had that loop inside the concat. So the concat actually -- The loop is lifted into this omitted which has pre- and post-condition corresponding to the loop invariant. And then, if we -- And then the loop that was in the program before corresponds to a call to the [inaudible]. But also a more dramatic change is that now we see that there are these closed variables which are called Alloc of type Set which are passed as arguments of the [inaudible]. And there are also some other sets which are returned. So the idea is that we don't assume that the prover that we are reducing to has any kind of knowledge about the heap. So basically we'll just do the frame inference and everything as a closed variable, the part of the heap over which the [inaudible] is supposed to operate. And then, we put some constraints about how to [inaudible] the heap. So here we'll have this call, that's the returned parameters. And here Alloc basically was the [inaudible] which corresponded to the argument that you received when the [inaudible] was called. So it's carried then over. And then, we have the update after the call which says that, "Okay, here --" Okay, for the moment those two arguments are equal. 
Basically we'd say that the final part of the heap is the original heap; you just remove what was passed as the frame which is returned here. And then, you add the new part which is here. So the [inaudible] returned -- [inaudible] in that translation the [inaudible] returned the footprint that was at the call, so that was [inaudible]. And the footprint after. So basically the difference between the two contains the modification. And then, whichever was not in that part -- so in the part that was passed -- is not changed and will be part of the frame. And we need one more part because this is just updating the footprint but it doesn't tell you anything about these function pointers. And here we'll have this magic predicate that I'll explain. So we want to say basically that that's the whole heap or the heap of the callee, the part of the heap that you are modifying. And then, we want to say some equality about those pointers because if they were modified then the [inaudible] but if they were not modified they should be preserved. So that would be the frame predicate. And here now we can maybe explain how we did the frame. So the frame -- Yeah, so the frame now is still a work in progress but the idea is that we want to avoid this frame inference. Before I showed you that, you know, we can actually reconstruct this frame using this inverse translation, but in practice it does not work because there is this exponential blow-up, mostly from case splitting between equality and inequality of these variables. So what we are trying to do is just have an axiomatic definition of the frame rule and then prove that we can get some completeness result of this axiomatization and also some decidability. Okay. What do we want the frame to say? That any path along this successor edge that does go through the frame is unchanged. But for that what we'll need is to add what you call this entry point. 
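The footprint bookkeeping described at the start of this passage can be written down with ordinary Python sets. This is a sketch under assumptions, not the encoding used by the tool: the function names are hypothetical, the allocated set and footprints are explicit sets of node names, and the successor field is a dictionary.

```python
def frame_of(alloc, footprint_before):
    """The frame: allocated nodes that were not handed to the callee."""
    return alloc - footprint_before

def alloc_after_call(alloc, footprint_before, footprint_after):
    """Remove what was passed to the callee, then add what it returned."""
    return (alloc - footprint_before) | footprint_after

def successor_preserved(succ_before, succ_after, frame):
    """Frame condition on the field: every node in the frame keeps its successor."""
    return all(succ_before.get(n) == succ_after.get(n) for n in frame)

# Caller state: two disjoint lists a -> b and c -> d.
alloc = {"a", "b", "c", "d"}
succ_before = {"a": "b", "b": None, "c": "d", "d": None}

# Hypothetical callee that receives {a, b} and appends a fresh node e.
footprint_before = {"a", "b"}
footprint_after = {"a", "b", "e"}
succ_after = {"a": "b", "b": "e", "e": None, "c": "d", "d": None}

frame = frame_of(alloc, footprint_before)
new_alloc = alloc_after_call(alloc, footprint_before, footprint_after)
preserved = successor_preserved(succ_before, succ_after, frame)
```

Passing the sets explicitly mirrors the point made in the talk that the target prover is not assumed to know anything about the heap, so the footprint has to travel as an ordinary value.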
So the entry point, given one successor function, into the set big X from X: basically the first node that enters -- when you follow the H from X, it's the first node entering big X. And, otherwise, it's X itself if you don't enter. And then, okay. What is the frame doing now? We'll say for any node which is part of the callee but not of the part that the call is modifying, well, the successor function does not change. So the prime version is the same for all variables. Then we need to reason to say the same about the path. So we say that for any node which is in the frame -- Okay, there are two cases. One will be to say that in the case that there is no path from U to the big X, EP will be the same because there is no entry point of X. Then anything that you know about reachability is not modified because you can never go in this -- You're always in the frame so you can never go in the modified part. So, reachability is preserved. And then, the more complicated part is, say that, well, there is a path from X to Y and you never see the entry point, so basically it's a path that never goes to the part that is modified. And again here the reachability information is preserved. And now the question is, well, we have these entry points, where are they coming from and how many of them do we need? Again the idea is, you know, try to use these [inaudible] of local theory. But here it's not a local theory any more because if you just take the axiom that we had before and just [inaudible] over ground terms, well, it does not work because in the partial interpretation that you have you have only [inaudible] path but you don't have like the node which is just at the border between the two regions of the heap. So here we have what we call [inaudible] local theory. You take a ground term, apply some function over that and generate new terms. 
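The entry point function introduced above can be made concrete on an explicit heap. The following Python sketch uses a hypothetical name `entry_point`, represents the successor function as a dictionary, and treats `None` as null; it follows the informal definition given in the talk (first node of the region reached from x, or x itself if the region is never entered).

```python
def entry_point(succ, region, x):
    """First node of `region` met when following `succ` from x (x included).
    Returns x itself if the path from x never enters `region`."""
    seen = set()
    node = x
    while node is not None and node not in seen:
        if node in region:
            return node
        seen.add(node)
        node = succ.get(node)
    return x

# Heap with a list a -> b -> c -> d -> null and an unrelated node e.
succ = {"a": "b", "b": "c", "c": "d", "d": None, "e": None}
region = {"c", "d"}  # stands for the callee footprint, "big X" in the talk

ep_a = entry_point(succ, region, "a")  # path a, b, c enters the region at c
ep_c = entry_point(succ, region, "c")  # already inside the region
ep_e = entry_point(succ, region, "e")  # never reaches the region
```

The border node `c` is exactly the kind of term that may be missing from a partial interpretation, which is why the talk adds the entry point of each ground term before instantiating the axioms.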
And then over this whole set of terms which will be the [inaudible] already have plus just simply adding ep to those, you can just apply the axioms and get completeness. And here what's the definition of this entry point? I'd say that basically for any X to -- X can reach the entry point but it's just following the H [inaudible]. Here let's say that if the entry point is small X and in the part modified then basically this function is idempotent. So if you apply the entry point to the entry point, it's the same node. And then, just a reachability question that says that -- Oh, yeah -- for any Y which is in the footprint of the callee, if you follow X to the entry point of X, you'll never see further than the first node in the callee footprint. And because this entry point is idempotent, we can prove that -- Well, we're still working on it to make the formalization a bit nicer. But the idea is that you can prove that again you have [inaudible] local theory so you can just take the ground terms, add the entry point, instantiate the axioms and get a completeness result. The idea is to say that if a node X cannot reach the footprint of the callee then the entry point is the node itself. If you can reach the footprint of the callee then it's the first node inside. And in the partial interpretation [inaudible] so that's why we need this type to introduce the node. And then, if you are already inside the callee footprint then the entry point is the node itself. Okay, so some experimental results that we're dealing with: basic list examples so concatenation, copy, filter, free, and in different versions, singly-linked lists or doubly-linked lists. For the singly-linked list the basic case is just a loop. And then, we have a recursive case so we can compare this automatic translation of loops to recursive functions, how much additional work -- Actually in [inaudible] it's actually better in terms of the verification queries that you have to do than the manual encoding. 
And then, we have benchmarks over sorted lists. Then, sorting algorithms: so this one requires reasoning about the data. But here the data are interpreted [inaudible] and then we can also do some slightly more involved reasoning that you want to take one list which is sorted, just [inaudible] and prove that it's sorted, take two sorted lists of the same length, do sum over all the elements, get a third list and prove that the third list is still sorted. And, yeah. So generally most of these more complex examples show that most of the time of the solver is actually spent dealing with the frame. So, yeah, that's part of the future work, how to do better with the frame. Okay, maybe I'll show it quickly before the future work. This is related work. So this separation logic fragment was introduced by Josh Berdine [inaudible] et al 2004. At that point the decision procedure was basically a set of [inaudible] which was proved to be complete. The idea was either you consider the case where the list is empty or you went full twice [inaudible]. And if you can derive a contradiction within these two cases then you are good. And then, [inaudible] Cook's [inaudible] came up with a polynomial-time algorithm which is based on a graph and graph embedding for the entailment check. Then, the translation of separation logic to first-order logic is also already existing. People have been looking at [inaudible] which are decidable but without inductive predicates because you need this additional first-order theory which is decidable and matches the predicate you want to reason about. Or people have gone to a very general translation but then they don't care about decidability so we can just embed the semantics of separation logic into first order. Okay then there are also alternatives to separation logic when you want to reason about the frames. So mostly [inaudible] dynamic frames and region logic but I guess people here are very familiar with those works. 
And also people that are trying to study the connection between separation logic, these frame rules and [inaudible] into these other logics. What's the equivalence? Then for the [inaudible] theory of graph and stratified sets. There is also some previous work from first-order theory over reachability, how to decide efficiently this logic. Yeah? >>: There's also a work of [inaudible]. The study of lists is pretty extensive in the context of shape analysis. >>: [inaudible] >>: [inaudible]. I can give you some references. >> Damien Zufferey: Yeah [inaudible]. Yeah, there is also... >>: They looked at whether [inaudible] with its support for canonical abstraction as well as SMT to [inaudible] predicate [inaudible]. >> Damien Zufferey: Yeah, there is also a group [inaudible] that uses separation logic and also in combination with some other theory at some point. But they are looking at more like invariant generation so they don't really care about completeness. >>: Yeah, that's right. I mean that's the same thing as shape analysis. >> Damien Zufferey: So I think here that maybe the critical difference is that we wanted to preserve decidability and that's why we had all this complication with this negation and this [inaudible] avoiding the universal. So we want more examples using these combinations of theories, more complex structures. And I told you this dealing with the frame for the moment is still very preliminary. So in theory you can prove that it works but in practice this tends to be an extremely expensive query to the solver so we want to try some simpler schema. The problem right now is that when we go to the translation we directly expand the predicate with a definition which is much larger and introduces [inaudible] so case splits. But sometimes, for instance in the case of the merge sort, you can actually do the frame inference just by semantic matching because the predicates are too close. 
But when you expand everything, it just becomes a huge formula and the solver [inaudible] solving those. So we want to try some hybrid scheme where you first try not to expand the formula and you find a frame without expanding your formula. That's related to some work by Matthew Parkinson on separation logic and abstraction so [inaudible]. Then the question of trees sorted because there are many examples of sorted trees and binary trees and also maybe how [inaudible]. And we're also looking at modularity because there is already some work to see how it [inaudible] framework that you have these generic data types and how you deal with those generic data types. Thank you for your attention. [applause] >> Shaz Qadeer: More questions? >>: You showed us before those numbers there. How do they compare to doing the same programs in Chalice? >> Damien Zufferey: That's a very good question because at the point when we did these performance numbers basically the translation steps that I showed you were not done explicitly but everything was encoded and done in the tool. So we did not have those intermediate steps so we cannot really [inaudible] something and try it in Chalice. Now that we have these intermediate steps, we can actually try them. Actually the goal right now would be to say that now we have this -- instead of having both the translation and the solving part in our tool, we want to first have the translation and encoding, then print something that Boogie can understand and then, this decision procedure over graph reachability and implement this as two-way plugin to Z3 and see if we can just get rid of that. And then, we have a much more meaningful comparison with other tools. >>: So why is Chalice -- Why [inaudible]? Why not compare with [inaudible]? Why is Chalice a more reasonable... >>: Well if you do something like separation logic there are few... >>: Ah, I see. >>: That's closer to the Chalice system. >>: Okay. 
>>: I mean but Chalice also does fractional permissions which your decision procedure does not deal with, right? >> Damien Zufferey: No. No, that's [inaudible]. >>: So I mean the fact that you have something that's decidable and Chalice is not decidable... >> Damien Zufferey: In practice that does not mean anything, I mean, in terms of performance. >>: [inaudible] -- I mean do you think there are programs that you would be able to do that Chalice wouldn't just because you're decidable? I mean Chalice also translates to [inaudible] so you're directly not going through separation logic. >> Damien Zufferey: That's a good point. I guess if you program [inaudible] my guess is that the solver will be able to give you a model and say that's the bug. Now I don't know if you have found some case in Chalice that at some point the solver just times out or diverges. >>: I guess the high-level question you are really asking is why we should focus on decidability given that when we try to solve new problems, there will always be some [inaudible] outside the decidable realm. What is the practical advantage that we get? >>: [inaudible] >>: By focusing on decidable cores? >>: That would be a big question. Another big question is, the fact that you're starting with separation logic means that you have to deal with things like the existential that you get when you get to star, that you have to figure out how to split the heap essentially and do all the framing and the frame [inaudible]. Whereas, in Chalice you don't have to at all. I mean, this computes those things. Just like [inaudible] computes things and core logic. I mean, you need the intermediate steps when you do sequential composition. So I guess a different question is you're choosing to start with separation logic which is more complicated, but maybe if you had started with something that's simpler and can accomplish the same things like Chalice then maybe you could apply decision procedures there. 
>> Damien Zufferey: Okay, so let's say that there was one motivation of choosing separation logic also solves this question of how we can combine with other theories. So how we can get to the outside of this [inaudible] code and combine with other provers. This slide is about how we combine [inaudible] within the [inaudible] fragment. How do we also combine with -- For instance here we have this question of how do we combine with [inaudible] -- [inaudible] because when you have constraints that are interacting within the reachability. I mean that was a question that came to our mind by looking at separation logic formulas. So if you change to another framework, those questions disappear and one part of the motivation of this work kind of disappeared. So that was the example we are looking at. And then, the question of decidability. I mean, that's a very good question because in practice [inaudible] gives you better results, but I think there is also this paper from the group in [inaudible], an [inaudible] 2009 paper that also does some kind of prediction where they don't care about completeness. And then, at the end of the paper you have -- Well, one big part of the paper is that, well, if we just do this encoding and send it to Z3, it does not work. And then, they do some very complex optimization of the axioms of triggers to make it work. So here maybe one of the reasons in trying to go for a complete fragment is that if you really want performance you will need to go down to this business of [inaudible] axioms, putting triggers and making sure that it's fast. But here basically even in the case where it's not fast, we know that it will not diverge. It will at some point return an answer. So we can somewhat kind of bypass this kind of messing around step by trying a more kind of clear framework. I don't know how much time -- We tried at some point to just have this axiom [inaudible] and just sending it to Z3. 
We are doing this local step and seeing how Z3 was dealing with it. And most of the examples for which it is working, it is faster. But for some examples, we just diverge if you just send the axiom [inaudible]. So that was the motivation of this small work of decidability [inaudible] so that you can maybe deal in a way that is not [inaudible] with the axioms but will at some point give you an answer. So we had this problem of divergence that we encountered. Did I cover all the question? >>: Is the fact that the theory is local -- Maybe another way to say it is, it seems like that should help you write axioms in Z3 so that it expands them maybe equally [inaudible] but you know that under some [inaudible] expanding. >> Damien Zufferey: We tried but not quite. So the problem is that if you look at the axiom and you just supply the axiom, you generate new terms. But the local theory tells you that you don't need to apply the axiom over these new terms. And we have not found a way of [inaudible] encoding that. I think there is some related work to this local theory actually that was printed at [inaudible] by [inaudible] and his group, what they call natural proofs. They've also tried to somewhere just apply the axioms once and see if you can find a proof. But again also the difference is that they, again, are not looking for completeness but trying to apply the axiom a few times and if you find a proof, you know, it's proof, otherwise. Here by putting that into this local framework, we know exactly when we need to instantiate the axioms and when to stop. >>: I would conjecture about the question. So the reason things get undecidable is because quantifiers in the SMT context. The only -- I mean, I think people have tried to investigate decidable fragments that can handle quantifiers but these fragments are not robust. They're very finicky. It's very easy to slide out of them. The only robust thing that I'm aware of that can handle quantifiers is basically instantiation. 
And [inaudible] matching is a particular [inaudible] that was implemented and simply is a particular thing that --. So my feeling is that as long as we are stuck with [inaudible] matching and there is no breakthrough like model theoretic breakthrough in handling quantifiers, we are going to focus in on decidable fragments and so on as [inaudible] help us. Because what happens is that you hide things inside theories but there's a little bit of stuff outside in quantifiers and that stuff is not visible that instantiation there. And then, you get incompleteness or all sorts of problems. So I think we need some breakthrough in how to handle quantifiers before all this focusing on decidability, all these arguments which are basically essentially model theoretic arguments [inaudible] proofs. >> Damien Zufferey: Well, I would rather [inaudible] if you have some real breakthrough in [inaudible] quantifier, you might not need this anymore. But this is some kind of thing... >>: I don't think that these breakthroughs just happen. They depend on these kinds of things, these kinds of breakthroughs. >> Damien Zufferey: In that sense, yes. >>: Yeah. >> Shaz Qadeer: Okay. >> Damien Zufferey: Thank you. >> Shaz Qadeer: [inaudible] [applause]