>> Tom Ball: I'm Tom Ball, and it's my distinct pleasure to welcome Byron Cook from Microsoft Research in Cambridge. Byron has a long history working with us in -- from many years ago, and we're pleased to have him here today talking about temporal property verification, revisited. So take it away. >> Byron Cook: Thanks very much. So this is joint work with my Ph.D. student, Eric, who's now finishing his Ph.D. and going to -- he was a Ph.D. student at Cambridge University and he's going to NYU as a postdoc. And this is work that he and I did over a couple of years. And every six months or so Moshe Vardi visits us and we would tell him about what we were doing, and he would show us how we were wrong. So he would -- so we would reset what we were doing according to what Moshe said. The goal of this work is where -- basically we wanted to revisit how temporal properties are proved of systems. And in particular we wanted to focus on programs. And we wanted to basically revisit the old wisdom and look to see if there were maybe new ways we could do things that people hadn't considered before. And this is based on two papers. So there's a paper that appeared at this year's CAV, and then a paper that appeared at the last POPL, so the talk is based on these two papers, so you can go read them if you're interested. So we're going to start with a little history lesson. And I'm going to sort of open up some old wounds. So in our culture there are some people who -- there are some sort of schools of thought about temporal logics. And actually the people involved have heated debates about this, and in fact they don't like each other very much. But we're going to revisit this area. So the area is the difference between linear-time and branching-time logics. And I'm going to say this: that we started off with a goal to prove linear-time logics, but it turns out basically that the high level of this talk is that branching-time logics are going to be a fantastic way to prove linear-time logic properties in linear time. So linear time: the most popular linear-time logic is called LTL. That's used by a number of tools. And from what I understand from Moshe, PSL is a logic that has a linear-time-like semantics. And then on the other side you have logics like CTL or Sugar. And I'll try and explain this in a moment, but actually Hoare logic is something like a branching-time logic. Okay. So I know that in hardware temporal logics are much more popular than they are in software, so here temporal logics are not as well known as they would be at Intel Research, but Hoare logic should be familiar to all of us. Okay. So what I'm going to do is I'm going to first explain the difference between branching-time and linear-time logic. Okay. So in branching-time logic the idea is that a program or a system is represented -- the meaning of it is represented as a tree of all of its executions. So you have the initial state and then you go to some other state. And then here, what this represents is from this state we could either go to this state or to that state. So maybe there's some user input or there's some nondeterminism, maybe it's a thread switch or something like that. So different events could happen, and you could either go here or go here. And those choices are represented in the tree. Okay.
So when you're trying to prove a property of one of these systems, typically what you do is you think of the specification and the implementation, and then you try and build relationships between the specification and the implementation in the tree between them. So you're saying things like, oh, the one system here could make a transition to this state or to this state, and over in the other system is there some analogous transition such that certain properties are respected. Okay. So this is branching time, and often it's called a state-based formalism. So what we do is we talk about states, we talk about certain representations of symbolic states, and then we say, oh, from this state we could go to this other state, and then we reason about those transitions between states and how that works over in the other system that we're trying to compare our system to. So that's the deal with branching time. And linear time, what we're considering is traces or sequences of states. So the meaning of a system in linear time is all of the sequences of states that the -- all the executions of the system. All right. So then when you're -- and for convenience, all -- usually all -- we consider only infinite traces. So if you have a potentially finite execution through your system, then the final state then is just repeated forever. So then what we're doing in this world is we're saying if we're trying to compare two systems, then we're just using set theory to talk about, you know, is this set of traces a subset of this set of traces. Okay. So that's the difference. And, I mean, they don't seem all that different from the outset, but when you begin to look in the details, there are a few very subtle differences that cause a lot of trouble. But I wanted to generalize and to say that the wisdom is that LTL is really powerful, but it's considered hard to prove. Whereas branching-time logics are considered easier to prove, but they're less powerful. So what I mean by that, so I had an experience when I was at Intel Research. I saw -- at the time when I was there people were using branching-time temporal logics. And their tools were extremely fast. But then a great deal of time was spent arguing about what had been proved. So you would prove a property, but then no one could agree on whether or not what had been proved was what people had intended to prove. Whereas in the linear-time logics, it's usually quite clear that you've proved what you were hoping to prove. So the subtleties -- there are fewer subtleties in LTL, but proving the properties in LTL is more difficult. Okay. So let me say just a little bit more formally how -- let me give you an example. Okay. So imagine we have a property like this, FG. So this is saying in the future it becomes true and stays true that the value of X is true. So imagine that X is a Boolean variable. And we want to prove this property up here; that the program together with its initial states respects that property. So eventually X becomes true and then it stays true through the rest of the execution of the program. And the traditional approach is something called the automata-theoretic approach. It comes from Vardi and Wolper. That traditional approach is to do the following. You take the -- you consider the traces of the program -- so that's the language of the program, the set of traces that the program could execute basically -- and then you take the property and negate it and then ask what is that language.
And then you intersect and show that you are left with nothing. So that's what I meant by we're using basically just set theory to reason about programs and the system. Alternatively in the branching-time world it's much more prescriptive. This is the moral equivalent of the same property. So this is saying on all paths in the future you reach a state such that from all paths from that state forever X is true. And if you look at the proof rule for this property, really what it's saying is find a set of states S that you eventually reach and then from that set of states S prove that X is true and stays true. Okay. So what's nice about branching time is that you really can decompose based on the property. You take the property and you say, oh, it's an AF, that looks like termination; AG, that looks like safety. And it basically gives you a recipe for how to go about the proof, right, so we have termination and safety. Whereas in the linear-time case you really can't do that. So in a traditional approach really what this turns out to be is you end up proving basically something called fair termination. So after you dot all the Is and cross all the Ts, really what this problem turns out to be is that your -- so this really turns into a set of fairness constraints and then your program is a set of infinite traces, and what you're really trying to do is to prove that all of the fair executions of this program are actually not allowed and, thus, that's why the set is empty. So if you look at the -- like, I've written an implementation based on this approach, and really what it is is Terminator together with a little bit of extra mechanism for considering fair termination, and then you can turn all of this into a question of fair termination. Okay. So that's the traditional approach. Okay. So let me show you an example that highlights the difference between those two approaches. So this is a program that sets some variable to true, and then this is while a nondeterministic choice, so while -- every time we hit this it could be true or could be false. We skip and come back around or else we set X to false, set it back to true, and then run forever. So the question here is does X eventually become true and stay that way. So I leave that to you, the audience. Someone have an answer? I'm going to pick on someone. >>: True. >> Byron Cook: Okay. >>: Yes. >> Byron Cook: Yes? >>: L4. >> Byron Cook: What's that? >>: L4 sets it to true and then you're good. >> Byron Cook: Great. Yep. >>: [inaudible] >> Byron Cook: What's that? >>: [inaudible] number of terminates [inaudible]. >> Byron Cook: That's right. Exactly. Yeah, that's right. Great. So basically you think in LTL. So in LTL the property is valid, right? And it's exactly for that reason. You basically -- you think of all the traces, right? So one trace is the trace where you stay in this loop forever, and in that case X is true. And the other traces are the cases where you leave this loop. So you run through this loop N times and then come here, it becomes false, but then it becomes true and so on, so that's fantastic. In CTL, actually, this property is not valid. Okay? And the reason is because when you look at the tree structure of the program really what this property is saying is can you find some frontier, some set of states that you eventually reach, right, such that from that set of states on X is true and stays true and do we eventually reach that frontier.
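[For reference, a rough C rendering of the example just described; nondet() and the labels in the comments are illustrative, not from the talk:

    #include <stdlib.h>

    static int nondet(void) { return rand() & 1; }   /* models the nondeterministic choice */

    int main(void) {
        int x = 1;               /* x := true                                     */
        while (nondet()) {       /* the first loop: stay here as long as you like */
            ;                    /* skip, and come back around                    */
        }
        x = 0;                   /* otherwise: x := false ...                     */
        x = 1;                   /* ... then x := true again                      */
        while (x) { ; }          /* and run forever, with x staying true          */
        return 0;
    }

Every trace either stays in the first loop forever with x true, or leaves it, dips to false once, and then keeps x true forever -- so FG x holds in LTL, even though, as discussed next, the corresponding CTL property fails.]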
And you'll find that it's really hard to find actually in this case. I'll bring the code back. So you're trying to describe a state or a set of states, so that has to be evaluations of the program counter and the variables of which there's only one, and you need to write down a set of states -- if you're trying to prove this property, you need to try and write down the set of states that you eventually reach, and from that set of states X is true and stays true. Okay. And the reason you won't be able to find it is that when we go into the inner loop, the first loop, L1 to L2 -- so here I've drawn the tree a little bit more. So here we go to the left when we stay in the first loop and we go to the right when we leave the first loop and then we hang out in the second loop forever, right? But it's this run that gets us into trouble. Because at any point down that line we could pop up and set X to false. So this is what a counterexample would look like when represented as a finite graph. It's saying, hey, you could go to L1 and stay in the first loop as long as you want, but at any point you could set X to false. And for that reason we can't find a set of states that you eventually reach such that from that set of states X is true and stays true. Okay. Great. Ah. But there's this really amazing paper by Abadi and Lamport that comes from 1991, and they give us a recipe for something we can do. So what Abadi and Lamport say -- I'm going to paraphrase now; so they didn't quite say it in this language, but I'm going to sort of put it in the current context of what we're talking about -- they say, hey, if you want to prove an LTL property using CTL approaches, there always exists a history variable and a prophecy variable -- I'll tell you what those are in a moment -- that you can put into the system. That system is trace equivalent -- i.e., any LTL property that holds of the original system will hold of the new system and vice versa -- such that now the CTL approach will work. Okay. So I'll show you what that might look like. So here's a variant of the program where I've introduced a new variable called rho. And rho is initialized to any natural number or to some special value. And in the loop, inside the loop -- here I'm using indentation to represent code inside the loop -- inside the loop, we basically check if rho is not equal to zero, and then we decrement rho. And rho -- decrementing bottom is equal to bottom. Okay. And then when we leave the loop, then we check to see if rho is equal to zero. So what's happening here is that rho -- rho is chosen ahead of time to say how many times you're going to go through this loop. And if it's some natural number that's chosen, then that's the number of times you're going to go through the loop. And if bottom is chosen, then you're going to stay in the loop forever. And then -- and this is a trick that's quite well known here, but we're basically introducing values and then using assumes to prune out the executions we find inconvenient. So now for every trace in the original program there exists an equivalent trace in this program, if you project out rho. So imagine that we're looking at the execution where you stay here forever, well, then I could choose bottom. Yeah? And imagine the execution where we go through this loop K times, then I could just choose rho to be K.
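[Again for reference, a rough C sketch of the rho-instrumented variant being described; encoding bottom as -1 and modelling assume() by exiting are choices made only for this illustration:

    #include <stdlib.h>

    static int nondet(void) { return rand() & 1; }
    static void assume(int c) { if (!c) exit(0); }    /* prune inconvenient executions */

    int main(void) {
        int x = 1;
        int rho = nondet() ? -1 : rand();   /* any natural number, or bottom (-1)       */

        while (nondet()) {
            assume(rho != 0);               /* may only stay in the loop if rho != 0    */
            if (rho > 0) rho = rho - 1;     /* decrement; decrementing bottom is bottom */
        }
        assume(rho == 0);                   /* may only leave the loop once rho hits 0  */

        x = 0;
        x = 1;
        while (x) { ; }                     /* run forever with x true                  */
        return 0;
    }
]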
And then on every iteration of the loop, it will be K is not equal to 0, K minus 1 is not equal to zero, et cetera, until K does become -- until rho does become 0, and then this will be false. So this will force us then to go out of the loop. Okay. And what's neat about this program is that -- well, I think I've already said that. What's neat about this program is it does respect the CTL property. And here I'll show you what the S is. So the S -- the set of states that we eventually reach such that from that set of states X is true and stays true -- is the following set: the set of states such that either rho is bottom or the program counter is equal to L4. Okay. So that's pretty neat. Great. Okay. And then the trick -- the thing they didn't talk about is how to find rho. Okay. So that's basically the approach here. So this ends now the history lesson. So the traditional -- let me recap. The traditional approach if you want to prove LTL is you use the Vardi-Wolper approach, you do this sort of automata-theoretic construction, build this program that's kind of -- has the automaton running in the program, and then you prove fair termination. And what we've done is looked back in the archives and found this other approach that says, oh, what you should instead do is introduce a prophecy variable. And now what I'm going to show you is a technique for synthesizing prophecy variables on demand. Okay? Oh. Oh, and there's one other thing I have to do. It turns out that CTL fell into such disfavor that no one really knew how to do CTL for infinite state programs. Okay? So we also have to sort that out. So I'll tell you -- that's basically -- the talk is in two parts. So what I'm going to do is first show you how we can do abstraction to state-based reasoning and then infer prophecy variables when we need them. Okay? And that's via something we call decision predicates, and I'll tell you about that in a moment. And then the second bit is that all of this was predicated on the existence of a fast branching-time or state-based technique for proving CTL-like properties of programs, and actually that wasn't known. So what we've again -- by the way, I'm a very -- well, as Tom will know, I'm a very lazy person, right, so rather than building a whole new tool what I've instead done is discovered a way to reduce this problem into something that we already know how to do. So actually what we do is we reduce the question of branching-time temporal property verification to an interprocedural program analysis question. And then I can just run a tool like SLAM on it, SLAM together with some Terminator-like techniques. And then to do that I have to do some clever encoding tricks. Okay. So that's where we're going to go. So this is the POPL paper. And basically this approach assumes that you -- that CTL will often suffice. And experimentally this turns out to be true. So the traditional automata-theoretic approach says we're going to, in the construction, track all of the nondeterminism that's important. And when I say that, what I'm talking about is it turns out that there's no difference between ACTL and LTL if your system is deterministic. It's only nondeterminism that gets you into trouble. All right? So -- and when you use CTL to prove an LTL property, when that succeeds, then it's going to turn -- that means that the nondeterminism in the system in that case didn't get you into trouble.
So our strategy is going to be try to use a CTL approach and then find the problematic nondeterminism when it occurs, remove it, and then basically symbolically predeterminize the program a little bit, partially determinize using prophecy variables, and then push on. And here I'll show you the experimental results first. So here we've taken examples from PostgreSQL, Apache and some fragments from operating systems, primarily bits and pieces of device drivers. And here are the shapes of the properties we're trying to prove, where the Ps and Qs are atomic formulae, things like did you release the spinlock or did you acquire the spinlock, and so on. And then here are the two competing approaches. So this is the traditional approach. This comes from my POPL '07 paper. And you'll see that we have a number of timeouts. And these timeouts are after like four hours. So it really means it's gone to lunch. Yeah? And then this approach is the new one. So this is based on decision predicates, which I'm going to tell you about, and we only get one timeout and the runtimes are often faster. There's a few cases where it's slower, and I'll tell you about why that's true. Okay. So let me now -- so here's the procedure. Okay. So we're trying to prove an LTL property. We have M, which is the system. We have the property in LTL. Those are the two parameters. And then the first thing we're going to do is we're going to approximate the property in CTL. So this is pretty straightforward. We basically just walk over the property. Whenever we see a G we put an A in front of it. Whenever we see an F we put an A in front of it. And we assume that the atomic formulae are closed under negation and therefore you can push negations all the way to the bottom. So the problem -- you run into problems in CTL because the negation of AF can mean something that we can't represent. I need an E, which I haven't told you about and we don't support. So we have to push all the negations down to the bottom first. But after we've done that, then we can just walk over the LTL formula and write a morally equivalent CTL formula. And then we know -- and we've proved this, but I think it exists in the literature anyway -- that if you prove something that's been approximated as we've done with CTL, then that means it holds in LTL. Okay. So this is the set of decision predicates. I haven't told you what they are yet, but you don't need to know yet because we start off with the empty set. And then this is a procedure that uses the set of decision predicates to partially symbolically determinize the system. And when the set is empty actually you just get back the same system. So in this case we're just taking the same system. I'll tell you about determinizing more in a moment. The next thing we do is that we try and prove the property in CTL, and I'll tell you about that in a little bit. When that succeeds, we're done. And when we fail, then this is where things get interesting. So here we're getting a counterexample back. It's in CTL. Counterexamples in CTL are trees. Counterexamples in LTL are sequences. And now we have to do something smart. So, (a), we don't know if it's a real counterexample, and it's not even clear how that counterexample relates to a counterexample that might exist in the LTL property, and so now we have a bit of work ahead of us. Okay. So I'm going to tell you about a procedure that does refinement. And if new decision predicates are found, then we add them into the set of decision predicates, otherwise we're done.
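[A small illustrative sketch of the approximation step just described: walk a negation-normal-form LTL formula and put an A in front of every temporal operator. The AST and the names here are invented for illustration, not taken from the tool:

    #include <stdio.h>

    typedef enum { ATOM, AND, OR, OP_F, OP_G } Kind;
    typedef struct Formula { Kind kind; const char *atom; struct Formula *l, *r; } Formula;

    /* print the universal CTL approximation of an LTL formula in negation normal form */
    static void print_actl(const Formula *f) {
        switch (f->kind) {
        case ATOM: printf("%s", f->atom); break;
        case AND:  printf("("); print_actl(f->l); printf(" && "); print_actl(f->r); printf(")"); break;
        case OR:   printf("("); print_actl(f->l); printf(" || "); print_actl(f->r); printf(")"); break;
        case OP_F: printf("AF "); print_actl(f->l); break;   /* F phi becomes AF phi */
        case OP_G: printf("AG "); print_actl(f->l); break;   /* G phi becomes AG phi */
        }
    }

    int main(void) {
        /* F G x -- the property from the earlier example -- becomes AF AG x */
        Formula x   = { ATOM, "x", 0, 0 };
        Formula gx  = { OP_G, 0, &x, 0 };
        Formula fgx = { OP_F, 0, &gx, 0 };
        print_actl(&fgx);
        printf("\n");
        return 0;
    }
]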
Okay. And here's the deal, here's the connection between CTL counterexamples and LTL. If there exists a single trace through the system such that it's not inconsistent with each of the paths through the tree, then it's a real counterexample. Right? So then if you can find that in the tree, then the tree represents actually a single path. Let me say it more operationally. If you start walking that tree, then every time you see nondeterminism, you just conjoin. So you basically zip the tree. If that's satisfiable, then it's a real counterexample. Otherwise it's not -- in LTL. Okay. So that's basically what we do, is we walk this counterexample. So this is going to look very familiar to Tom, because basically we're doing something like we did in Newton in SLAM. We're going to walk this counterexample. We have the counterexample here that's been given to us by the CTL model checker in terms of locations in the program, and we have the program. And what we're going to do is we're going to symbolically walk that counterexample. But the trick is that when we hit nondeterminism we're going to take both branches simultaneously. Okay. So here we are at L1. And the counterexample says you have to go to L2 and L3 at the same time. All right? So now we're at L2 and L3 at the same time. And the program -- the conjunction of program counter equal to L2 and program counter equal to L3 is unsatisfiable, and so for that reason this counterexample is spurious with respect to LTL. Okay. So this then is the crux. This is where then we get this idea of a decision predicate. So the distinction is that from program counter equal L1 we either went to L3 or to not L3. We could say that alternatively: you went to L2 or not L2. Yeah? And so the decision predicate that we get is a pair. And I'll just take it from the paper. A decision predicate is a pair of first-order logic formulae A and B where like the -- where -- >>: [inaudible] >> Byron Cook: Yeah, I'm very lazy. So A tells you when you care. A is the presupposition. Yeah? And then B is the distinction, the thing that distinguishes the nondeterminism. So A and B distinguish between a transition from A to B -- in the pre-state and the post-state -- and A to not B. And then we're going to use this base -- this is connected to prophecy variables, and I'm going to tell you about this in a moment. Okay? Great. And to -- >>: The decision predicate is just about -- it's always just about the program? >> Byron Cook: No, it can be about other things, but it turns out often it's about program counters. >>: Oh, because it's just unsatisfiable while -- >> Byron Cook: Yeah. >>: For a formula. >> Byron Cook: Yeah. >>: Then -- >> Byron Cook: I can write a little example. It actually has to be over other variables, but you -- yeah. >>: [inaudible] okay. >> Byron Cook: It turns out that -- so, I mean, I think, as most of you know, I worked on Terminator, and in Terminator we used Farkas' lemma-based approaches with an SMT solver to find ranking functions, and it turns out a very similar technique can work here. So we're just asking along the constraints that we've walked does there exist a way of expressing a set of states that distinguishes between the one transition and the other transition here. I think I have a characterization of that in a couple slides. Yeah.
So our synthesis procedure basically takes two transitions, the left and the right, and we're saying can you find an A and a B such that in the meaning of A and B prime you can distinguish that, you can distinguish between the two relations. And we use Farkas' lemma. It's not very important. There's various approaches you could probably use. Okay. So that's that. Okay. So now let me tell you about determinization. Okay. So here is the crucial bit. I've been telling you about prophecy variables, I've told you about decision predicates, and here's where the two connect. So now -- so the prophecy variable rho, what it's doing is it's tracking the number of transitions from A to B that will be taken until you see an A-to-not-B transition. So if rho is 10, then you're going to take 10 A-to-B transitions, and on the 11th transition you're going to take an A to not B. And if you choose bottom, then you're always going to go A to B. You'll never go A to not B. Okay. So here's what determinize looks like. So determinize is taking a -- it's a system and the decision predicates, and we're going to return a new system. And the new system's state space is going to be like the old state space together with, for every decision predicate pair, we're going to have a natural number lifted. And then the initial states will be like the old initial states but their initial configuration could be any possible valuation of the vector of natural -- lifted natural numbers. And then this is the crucial bit. So the transition relation says, hey, you can take a transition from a state to a new state if in the underlying system you could take a transition and for every pair in the set of decision predicates one of four cases holds. So the first case is -- remember I've been saying A says when you care, right? So here I'll go to this case first. This is the case where you don't care. So A is not true of S, in which case nothing changes. Okay. But let's look at the more interesting cases where we care. So now we're in a state where we care. So S -- the A predicate holds of S and rho is equal to bottom. What rho equal to bottom means is you're all -- whenever you see A you're always going to go to B forever. So, thus, we say you have to go to B and rho stays at bottom. Here's another case. A is true but rho is some positive natural number, some nonzero natural number. Okay. So we're forced to go to a B state, but we're going to decrement rho. And then the last case is this is the one where we trigger. We're in the A state. Rho is equal to zero. So now we're going to go to a not B state, and then this resets rho, so we redecide now for the next time around how many times we're going to take A-to-B transitions. >>: [inaudible] >> Byron Cook: Yes. Yeah. Nondeterminism -- in my world, nondeterminism is free. >>: But you want -- >>: [inaudible] >> Byron Cook: Well, okay. But it's -- the thing is that the nondeterminism used to show up in the -- just the -- the arrows in the tree, whereas now it's showing up at earlier places. So in the initialization you'll see now a very wide set of possible choices. And then now when you take this transition to a B, you make that decision ahead of time. So basically we're moving the choice earlier. >>: It's like hoisting. >> Byron Cook: I guess so. Yeah. >>: [inaudible] >> Byron Cook: Yeah. That's right. I mean, yeah, I think maybe -- when I said we partially symbolically determinized, that may not be the perfect intuition, because -- >>: [inaudible] is the next state [inaudible].
>> Byron Cook: Yes. >>: Is it going to be the same nondeterministic value -- >> Byron Cook: No, no, no. It's crucial that it could make any choice. >>: At any given step. So it's not -- >> Byron Cook: No, but it's only here where you do that. >>: Yes. >> Byron Cook: Maybe we'll talk about it offline. You look unhappy, but I don't know how to make you happy. If I draw this -- I mean, if I write this down in text, then this is what you get. This is basically what we had before. So the -- and this is where it's -- this is the code you're a little bit unhappy with. It basically -- in this case it doesn't matter, but sometimes it does, that when you have triggered this case then you have to reassign. >>: You're saying rho equals rather than assigns. >> Byron Cook: Assigns, sorry. The equals is -- sorry. This should be assignment. >>: Okay. So I guess how do you know -- okay. So the thing that I'm having a little trouble with is how do those statements get associated with program counters? Like how -- I mean, how do you know where they go? >> Byron Cook: So this is actually -- really what we're doing is this up here. And I'm trying -- I'm just trying to show you textually what that might look like. >>: Can you go back? >> Byron Cook: Yeah. >>: So here -- I see. So they're -- they're right. They're attached to these states [inaudible]. >> Byron Cook: Yeah. So if these predicates are always over program counters, then I can go -- one way to implement that is to go into the program and drop some code there. >>: Okay. Are they sometimes on transitions? >> Byron Cook: They could be on tran- -- they could be on just other -- not on transitions, but on other -- other like -- >>: Predicates? So they [inaudible]? >> Byron Cook: Yeah. >>: Okay. >> Byron Cook: Okay. So then -- well, then that's that. Okay. So that's the strategy for inferring prophecy variables. So now I'm going to tell you about the branching-time approach. So this is based on the CAV paper with Eric and me and Moshe. I already said that. I'll come back to that. Okay. So here what I'm going to do is I'm going to encode backtracking, reasoning about eventualities, all using recursion, for simplicity. So if I want to prove this CTL property holds of the system, then what I'm doing is I'm reducing it to another question which is basically a program analysis question. So I construct a new program that is constructed by walking over the property. And if SLAM were to say that this program could only ever return true, then the original property holds. With the condition that I need to find this Q. And I'll tell you about the Q in a little bit. It's an extra condition. So basically we need to try and find a Q such that then with a tool like SLAM we could prove that this program could only ever return true. And if that condition holds, then the original property holds. And that's an if and only if, almost. Okay. It's -- there's a -- it's morally if and only if, but there's a really interesting case where it doesn't hold, and I'll tell you about that. So it's not quite if and only if -- it's sound but not complete. And I'll tell you about that in a moment. Okay. Yeah. So here's the transformation. So we're going to take the program, the property, Q -- so Q is -- you'll see what Q is. But Q just kind of comes from the sky, so we don't have to worry about it too much. And then the transition relation and the current state. And this is going to return bool. And what we do is we walk over the structure of the property. So I'll give you an easy case first.
So this, by the way, will look very simple, but there's some subtleties in there. So in the simple case imagine that the property is a conjunction. Then what we do is we say, oh, okay, from our current state, call this procedure recursively. So call the checker, let's say it's called the checker, on the subproperty together with the current state, and if that says the property is false, then return false. Otherwise, return the check -- the result of the checker on the second property, on the second piece of the conjunction. Okay. So that's the easy case. Now I'm going to raise a difficult issue. Okay? The difficult issue is I said that this -- if this can't return false, then the property holds. But one way not to return false is to never return. And this is a program. So one difficulty would be that what if this subchecker just never returned. So that's a problem. So what we actually have to do in this encoding is to make it such that from every state the thing could return. Okay. So that's one of the subtleties. So let's look at a more complex example. So this is AG. So we're saying that on every step of the system the subproperty holds. Okay. So what we do is we basically build a little interpreter. So let me read the last two lines here. So while true, so forever, run the transition system one step and then copy its state over. Okay. But on every iteration of that, we're going to use the recursively defined subchecker to check to see if the property holds. And if it doesn't hold, then we can return false. And then this is the sort of crucial bit. At any point we need to be able to return true. Okay. And the reason is that we need to be able to return true such that if the other conjunct or other disjunct or something else doesn't hold, we need to be able to return true such that the checker -- the other checker -- can go off and find that it doesn't hold. So we're really looking at all finite runs and then reasoning about the infinite runs in that way. So it's kind of like coinduction. So let's look at AF. So in AF this is -- if you know Terminator, then this should be familiar. This is basically -- so basically -- actually, let me see where I have jumped ahead of myself. Let's go to F. Yeah. I said all that. Amazing. AF false is termination. Right? So actually if I substitute this subproperty for false, then I get -- this is exactly the Terminator program reduction that we've done before. Okay. So what's happening here is we initialize some variable to false, and then, like before, we just run the system. Okay. So in AF we -- on every iteration we take a step of the transition relation and get a new state, update it, and then the checker says, hey, on every state, check to see if the subproperty holds. If it does, then we've reached AF. Right? We know we've now reached that frontier, so then just return true. The problem is: what if we don't? Yeah? And in the case where we don't, then we don't really know what to do because we're talking about now. It's hard to look into the future. So imagine we've run the program for a while and we've hit some state and the subproperty doesn't hold now. What we're going to do is we're going to continue to allow the program to run to any other state, but we're going to record the state maybe, and later we're going to check to see if this state S and this state S prime are in some relationship which shows that we're making progress. Okay. This is the Terminator approach.
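[A minimal sketch, with invented names, of the two cases described so far -- conjunction and AG. State, step() and the sub-checkers stand in for whatever the real construction would generate from the system and the subformulae; the AF case, with its progress measure, is what Byron describes next:

    #include <stdlib.h>

    typedef struct { int pc; int x; } State;             /* toy state: a location and one variable */

    static int nondet(void) { return rand() & 1; }
    static State step(State s) {                          /* one (toy) transition of the system    */
        if (nondet()) s.pc = s.pc + 1; else s.x = !s.x;
        return s;
    }
    static int check_phi1(State s) { return s.x == 1; }   /* checker for a subproperty phi1        */
    static int check_phi2(State s) { return s.pc >= 0; }  /* checker for a subproperty phi2        */

    /* phi1 /\ phi2: if the first checker says false, fail; otherwise defer to the second checker */
    static int check_and(State s) {
        if (!check_phi1(s)) return 0;
        return check_phi2(s);
    }

    /* AG phi1: a little interpreter that runs the system, checking phi1 at every step,
       but which can also nondeterministically return true after any finite prefix so
       that the other checkers get a chance to find their own failures                  */
    static int check_AG(State s) {
        while (1) {
            if (!check_phi1(s)) return 0;   /* phi1 fails now, so AG phi1 is false */
            if (nondet()) return 1;         /* maybe stop here                     */
            s = step(s);                    /* otherwise take one more step        */
        }
    }

    int main(void) {
        State s0 = { 0, 1 };
        return (check_and(s0) && check_AG(s0)) ? 0 : 1;   /* just exercises the checkers */
    }
]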
So here we say if -- and this variable dup is initialized to 0, and when we copy the state, then we set it to 1. So we say if dup is false and nondeterministic choice, so maybe we do the copying now, then grab it and grab the state. And then if dup is true, then we just ask is the old state and the current state -- is it in Q, and Q is the progress measure. And I don't want to go into details, but if you know about Terminator, then you know what I mean. There's a way of describing progress that's quite nice. So it's saying this transition from S to S prime, we actually know that we're making progress. And we can now -- it turns out that we have techniques for finding Qs. Okay. So that's the approach there. Great. Uh-huh. All right. So then, hey, let me -- yeah, I have a bit of time. So let's look at an example. >>: Byron? >> Byron Cook: Yeah? >>: [inaudible] logic program [inaudible] logic program semantics to say when you have things [inaudible]. >> Byron Cook: I would imagine so [inaudible] benefit. I mean -- yeah, I mean, I guess what you're saying is that I'm doing -- I'm doing kind of things you do in logic where there's nondeterminism everywhere, and I'm saying, oh, and now I make a next step but I don't know which one exactly. >>: Because the next thing you have to do is that, okay, you have also an interpreter [inaudible]. >> Byron Cook: I mean, in a sense, if you look at the programs that are often written in the internals of SLAM, say, they kind of look like that, because nondeterminism is free. So you make nondeterministic transitions. You use assume. >>: Oh. >> Byron Cook: You use assume. Right? So this program may be very difficult to execute, but it's easy to prove. >>: So what do you mean by nondeterminism is free? >> Byron Cook: I mean it's not free but it's already paid for. Right? So in the tools you have inside SLAM or, I think, Boogie, you have nondeterministic goto, so you can either go this way or go that way. And then the proof just has to hold over all the possible choices. And then you usually have X gets nondeterministic choice. And then you have assume. So an assume is -- like, you can think of it as like backtracking. Like you can try and execute these programs by making a choice, maybe it's not the right choice, and then if you ever hit an assume which fails, then go back to a previous choice and try to figure out which one it was and take the other branch. And that's actually internally really what the programs look like in -- I mean, we never run them; we only prove them. Right? And so in the proof it's easy because you're just considering -- you just have to consider all the choices. And so this program I would never run but I prove. >>: Right. >> Byron Cook: Yeah. Okay. So let's look at this example. So it's AG, X gets 1, then eventually X gets 0. So this is like you acquire the lock, you eventually release the lock and -- of this program. So X gets 1, then we go through some loop some number of times and then release it. Okay. Now I'm going to get really practical on you, and I'm not actually going to do this transformation because what I know is that I have the code for R. So really what I'm going to do is I'm going to take these transitions and put them in the code. This is just an implementation strategy. So what I can do is build a program -- here's the main -- that calls the subchecker. So basically I'm going to define a procedure for each of these subcheckers and then just assert that it can't return false.
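[A heavily simplified, hypothetical C sketch of the kind of instrumented program this produces for the lock example, roughly in the spirit of the simplified version shown on the next slides; the loop bound, the variable names, and the progress measure (the loop counter strictly decreasing) are all invented for illustration:

    #include <assert.h>
    #include <stdlib.h>

    static int nondet(void) { return rand() & 1; }

    int main(void) {
        int x, n;
        int dup = 0, saved_n = 0;     /* AF bookkeeping: "copied" flag and the snapshot */

        x = 1;                        /* acquire: the AG obligation AF(x == 0) starts   */
        n = rand() % 1000;            /* "we go through some loop some number of times" */

        while (n > 0) {               /* the loop between acquire and release           */
            if (dup) {
                /* progress check against the snapshot; if this could fail, SLAM reports
                   a counterexample and the refinement loop looks for a better measure  */
                assert(n < saved_n);
            } else if (nondet()) {
                saved_n = n;          /* maybe take the snapshot now                    */
                dup = 1;
            }
            n = n - 1;
        }

        x = 0;                        /* release: AF(x == 0) is reached                 */
        return x;                     /* 0 here: the obligation has been discharged     */
    }
]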
So then this is the checker for AG for this, for this whole property. And basically on every line of the program I'm basically doing the transformation for AG. And what it says is, hey, if the property doesn't hold, the subproperty, so that's X not equal 1 or AF X equal 0, then return false. Otherwise maybe return true, otherwise keep going. So if ever there's a way for the property to fail, then one of these subprocedure calls will fail and then we'll return it. Okay. And then we can look at the code for this guy, and that's here, I believe, or here. And so it just says, oh, call with a disjunction. So call the one case, call the other case. And then this is the AF case. So here basically at every line we're just doing the thing we did inside the loop and the interpreter. Right? We're saying if the subproperty holds, then return true. Otherwise, if we've already copied the state, then check to see if the states are in some progress measure. If they're not, then return false. Otherwise, if we haven't copied already, maybe we copy now, do the copying, and then at any point maybe return true; that way this basically implements backtracking. Okay. But then actually I can actually do quite a bit to simplify this, because you can use like static analysis to discover a lot of cases you don't need to check. And actually what I'm doing now in my implementation I'm writing now is I'm using like a Ken McMillan sort of lazy interpolation-style unrolling and I'm only building these checkers really when they're needed. But you could do a bit of work statically to get it pared down. So this is the result after I've done a bunch of simplification. And you're left with a program that doesn't look too bad. So if this program -- if we can find a Q, I think it shows up as M here, but if we can find an argument for termination such that a tool like SLAM could prove that this assert never fails, then the original property holds. Oh. Oh, oh, right. And this is the crucial thing I was telling you about. It's a moral if and only if. Why is it a moral if and only if? Why isn't it a real if and only if? I'm making one assumption. I'm making the assumption that checking that two states are included in M, that that's decidable. Right? So you could imagine writing -- implementing this actually as a program and you might not know if this inclusion check will ever answer. And not answering is a disaster because that assert will never fail. So I have to know that when I'm doing my encoding, when I'm asking, hey, are two states in M, that that check needs to be -- you need to be able to do that in a finite amount of time. And in practice it's no problem because this all turns into an SMT check. But in theory you could imagine something more complicated. >>: [inaudible] >> Byron Cook: Exactly. Okay. I'm going to skip all that. Great. So to wrap all this up, this is just the Terminator approach again. We -- so we're trying to prove this ACTL property. What we're going to do is we're going to initialize Q. Those are the progress measures. And we're going to start with a set of basically no measures. And if this succeeds, then we're done. Right? So now we try and -- we say, hey, could this program here, could it ever return false. If not, then we're done. Otherwise, get back the counterexample, look for lasso fragments in the counterexample.
And if you know Terminator, you know what I'm talking about; and if you don't know Terminator, it's probably too late and I won't be able to explain it to you in the five minutes remaining, so just bear with me. And we look for a ranking function that's a witness to well-foundedness when we can find one. And then we add that into the progress measures. So we build a new relation which says, hey, the ranking function we found went down, that's the new progress measure, we add it in as a Q, and then we go back and try and check this again. Okay. And if you do AF false, then this is Terminator. Okay. Right. And that's pretty much that. So, I mean, in summary, we've dug in the archives, found old papers, figured out new ways to solve this old problem and then automated some of the parts that were unknown. So how do you do prophecy variable synthesis and how do you do branching-time reasoning for infinite state programs. And that's what I've told you about. Let me tell you about a few limitations, and then I'm going to show you the experimental results again, because there's something interesting to look at now. So some limitations. This works abysmally for finite state systems. Okay. And the reason is that every prophecy variable I introduce in finite-state land would have to -- what's the -- what number of values would it have to represent? It would have to represent the diameter of the system. And then if I had two prophecy variables, the second one has to be over the diameter of the system with the first one. So things get really bad really quickly. But in the infinite-state tools, infinity is free, right -- it's already paid for. So we can introduce infinite state variables. It's no problem, because techniques like predicate abstraction or interpolation are designed to work around that problem. Whereas in a classic finite state encoding, that would be really bad news. I'm working on techniques to make that better, but just as I've told you today, this approach would work very badly for finite state systems. The other thing is we use Farkas' lemma, so this really limits us to linear -- decision predicates that can be expressed over linear arithmetic. So if you, for example, were to go to a finite state system, that might be a bad choice. And then Abadi and Lamport tell us that there always exists a prophecy variable. But what we're doing is we're looking for a finite set of prophecy variables that range over effectively the natural numbers. And it's not clear if there always exists a finite number that suffice, and it's really not clear that we always would find that. Yes? >>: So these prophecy variables are always natural numbers? >> Byron Cook: Yeah. >>: [inaudible] and the only operations you do are decrements and resets? >> Byron Cook: Yeah. >>: So even if you had a finite state system that had these natural [inaudible] -- >> Byron Cook: Yep. >>: [inaudible]?
Let's do some safety proving and then do some termination proving. Instead we do this massive automata-theoretic construction and then just throw everything into this big bucket and then prove basically fair termination. Turns out that's quite expensive. Right? Yes, it works. It's kind of like hiring an attorney to drive you to the airport. Like you'll get to the airport, but it's really expensive. And so this -- I mean, this tool works in theory, but it times out a lot. So what we should see -- now, why are these times faster? Okay. So I've -- I'm drawing your attention to a couple things. One thing is this is the size of the termination argument. So it's the number of measures we had to introduce or the number of times we had to go back and try and rediscover a new ranking function and put that in. And what you'll see is that the number of measures is lower. Right? And think of computing a measure as an excruciatingly expensive procedure. It's like you have to basically reset everything and start all your work all over again. That's one reason it's faster. And another thing I wanted to highlight is that this here, this is the number of decision predicates we had to infer during these proofs. And so what we had hoped was, and what's turned out to be true in these cases at least, is that the number of decision predicates you need is quite small. And in cases where the number is high, then you'll see that we're actually slower than the traditional approach. So this is bearing out the thing that I predicted; that if the nondeterminism that you're trying to track is not crucial, then we will be faster. But really if there's a lot of nondeterminism that needs to be tracked, probably it's just better to go with a traditional approach, and that's what happened. So I can write examples that are extremely bad for our approach that are really good for this approach, and that involves lots of nested temporal operators or examples with lots of very complicated nondeterminism, all of which is crucial. So that's the way I can trick my tool. Okay. So that's that. Thanks very much. So you might look at the papers if you're interested in the details. I can also take questions. Thanks very much. >> Tom Ball: Thanks, Byron. [applause] >> Tom Ball: Any more questions? >>: Can you comment on -- so what if there is a bug basically at the end, that basically is some path that failed? So in the previous slide you had some Xs on the last column which means that the property was not satisfied; is that correct? >> Byron Cook: That's right. Yeah. >>: I see. So does the new -- basically your new decision procedure [inaudible] what's the -- is there any implication about, for instance, making sure that it's not a spurious counterexample? I mean, you already touched a little bit, but from a high-level bit basically, was the high-level -- >> Byron Cook: In this -- so in this graph X means counter -- it actually means counterexample, because we looked at them all. But it doesn't mean we proved that they're counterexamples. So it means we failed to prove correctness and then we handed something back. We know that there's -- I mean, we've removed the spurious ones in CTL, but we don't know if they're spurious in LTL. But since doing this now my tool actually disproves also. >>: [inaudible] experiment all the Xs that you have at the end you already had with fair termination before, right? >> Byron Cook: Yeah, yeah, yeah. >>: In a way you can -- >> Byron Cook: Yeah, that's right. >>: In general.
I mean, I don't know in general -- >> Byron Cook: I mean, we have an approach for disproving also when we get -- so what we do is when we get a counterexample, what we suspect is a counterexample, then we try and prove its [inaudible] by trying to synthesize a recurrent set and so on. But there of course -- I mean, we're assuming -- there's a number of assumptions we're making like linear arithmetic and so on. So, I mean, the tool is much more focused on proving rather than disproving. But we have some approaches in place. So in all of my regression tests now in T2, which is my new termination prover, there's one case where we can't disprove termination. In all the other cases we've proved nontermination. And I think that will also work in this approach too. But I haven't tried yet. Yes? Any more questions? >> Tom Ball: Okay. >> Byron Cook: Okay. Thanks. [applause]