>> Shaz Qadeer: Okay. Welcome everybody. It's my pleasure to introduce Alastair Donaldson. He's visiting us today and tomorrow from Oxford University. Alastair got his PhD from the University of Glasgow and then he worked for a few years in industry at Codeplay Software working on multicore compilation. After that he decided to come back to academia. He's now a postdoctoral fellow at the University of Oxford. And he's going to tell us today about some work he's been doing on verification of concurrent programs. >> Alastair Donaldson: Thank you very much for the introduction. During the talk please feel free to ask questions and interrupt me. I'm sure that's what you do anyway. So, yeah. That's what I like. So this is work on applying predicate abstraction to replicated concurrent programs and trying to do that efficiently by taking advantage of symmetry. And it's joint work with Daniel Kroening, Alexander Kaiser and mostly Thomas Wahl at the University of Oxford. So as you very well know, one of the major formal methods success stories has been the SLAM project: taking a load of really well understood formal verification and theorem-proving techniques, combining them with novel techniques for making this work on real C programs, and building a tool that can be used to verify device drivers. But despite that success, I think it's fair to say that there's not been very much progress on applying predicate abstraction to shared-variable concurrent programs. And the reason for this is state-space explosion. So abstraction may be more expensive if you have multiple threads, and also the verification of a concurrent Boolean program becomes intractable as the number of threads increases. So in this work we've been contributing to this situation by designing a scalable predicate abstraction and CEGAR-based model checker which is geared towards verifying replicated C programs.
And we achieve scalability by exploiting the replicated structure of these programs using symmetry reduction, and in particular building on recent work by my collaborators, Thomas Wahl and Daniel Kroening, and also Gerard Basler, who was an intern here some years ago, on doing symmetry reduction for symbolic model checking using a technique called symbolic counter abstraction. So I'm not going to tell you about symbolic counter abstraction during this presentation, but that's the technology that makes this work well, as well as the novel abstraction technique that I'll show you. >>: Counter [inaudible]. >> Alastair Donaldson: Sorry? >>: What does counter -- >> Alastair Donaldson: Oh, counter abstraction. Well, I'm going to -- I will explain that later. >>: [inaudible] counter that comes up or -- >> Alastair Donaldson: So this is -- it's this thing I showed you on the board where you count the number of processes in each state. >>: Okay. >> Alastair Donaldson: So -- >>: [inaudible]. >>: [inaudible]. >>: It's counting abstraction. Okay. >> Alastair Donaldson: Yeah. Well, it's -- I guess maybe that would be a better name. In the literature it's referred to as counter abstraction. And it's an exact method. So despite being an abstraction it doesn't lose precision. And I'll discuss that a bit more later. So these are the sort of programs we check -- you don't need to read this program in too much detail. But what I mean by a replicated program is we have some loop that launches an unknown number of threads, and we say that this number is going to be bounded. So our model checker would just stop if the program launches more than five threads, if you say the bound is five. And then all threads run the same program. This is an example of building a lock using test-and-set instructions. And I'll talk a little bit more about this example later on. So all threads are going to run the same program and you'll see there's no use of thread identity in the code for these threads.
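The slide itself isn't reproduced in this transcript, but a rough Python analogue of the kind of replicated test-and-set lock template being described might look like the following. The names and structure are invented for illustration, and since Python has no atomic test-and-set instruction, one is emulated with a guard lock. Note that every thread runs the same code and no thread identity is used:

```python
import threading

# Hypothetical sketch (not the speaker's actual slide) of a replicated
# template: a main routine launches a bounded number of identical threads,
# each of which builds a lock out of a test-and-set instruction.

NUM_THREADS = 5          # the bound on the number of threads
ITERATIONS = 500

flag = [0]               # shared: 0 = lock free, 1 = lock held
counter = [0]            # shared data protected by the lock
_guard = threading.Lock()

def test_and_set():
    """Emulated atomic instruction: set flag to 1, return its old value."""
    with _guard:
        old = flag[0]
        flag[0] = 1
        return old

def thread_body():       # identical for every thread; no thread IDs used
    for _ in range(ITERATIONS):
        while test_and_set() == 1:
            pass         # spin until the lock is acquired
        counter[0] += 1  # critical section
        flag[0] = 0      # release the lock

threads = [threading.Thread(target=thread_body) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter[0])        # 2500 if mutual exclusion held
```

The assertion-based checking the talk describes would verify properties of such a template for all interleavings, rather than observing one execution as this simulation does.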
So this is the sort of programs we are thinking about. So a little more specifically: in our model of computation we assume no recursion, and in this work we inline all procedures. And I'll talk later about ways we might lift these restrictions to some extent. So, yeah, sorry. >>: Okay. Why should we do symbolic [inaudible] concurrent programs and tools that do [inaudible]. >> Alastair Donaldson: So -- >>: [inaudible]. >> Alastair Donaldson: If your program has data like integers, then explicit-state exploration won't be able to provide you with full state-space coverage. >>: And then the interplay between data and concurrent [inaudible]. >> Alastair Donaldson: Well, I suppose that would depend on individual examples. But -- so I mean if I gave this to something like SPIN, unless you did a manual abstraction to explicitly remove all data, then SPIN would just get stuck exploring data value after data value after data value. It wouldn't necessarily find any useful bugs or show that no bugs exist. If there is lots of interplay between control flow and data values, then actually predicate abstraction probably wouldn't work very well either on these examples. The idea is predicate abstraction gets rid of the data problem. >>: [inaudible] SLAM is partially successful [inaudible] all these new techniques but also because it had device drivers as the motivating example. So I guess one question is -- once your [inaudible] example of this sort of [inaudible] system -- >> Alastair Donaldson: Yeah, so -- >>: Template. >> Alastair Donaldson: So the examples that we found our approach [inaudible] are actually lock-free data structures. So like this example of building a lock using atomic instructions. So we -- yeah, we can check that these assertions won't fail, and we don't care too much about the actual context in which this lock is being used, right?
>>: So the idea is that your main routine is really sort of a harness where you're simulating an unknown number of clients of essentially a passive library, a library that might do synchronization but it's not creating threads itself? >> Alastair Donaldson: Exactly. Well, the benchmarks that we've tried this technique on are exactly that situation. So if you had a larger concurrent program with many other threads doing different things, then you might need to extract something from that concurrent program for this technique to tell you something meaningful. >>: I see. >> Alastair Donaldson: Yeah. So okay. So we're assuming this sort of program structure. We inline all procedures. And then we have some restrictions on pointers. So within a thread we allow pointers between the variables of the thread. We allow pointers from a thread's local variables into the shared state. And we allow pointers within the shared state. But in this work we don't allow pointers from the shared state into a thread's local state. The reason being that this would break our asynchronous model of computation, where threads proceed by modifying their own local state and the shared state. If we allowed these kinds of pointers, then a thread could use such a pointer to directly manipulate another thread's state. And by barring these sorts of pointers, we also bar pointers between the local states of threads, because you could only get such a pointer by communicating it through the shared state. So we haven't found this restriction to be a problem in practice. This would correspond to giving away the address of a stack variable to the shared state, which some programmers will do, but it's, I would say, generally regarded as quite bad practice. >>: Since the local state is only stack state -- >> Alastair Donaldson: So in this work, yeah, local state is stack state. We would regard the heap as being shared state. >>: Specifically you [inaudible].
>> Alastair Donaldson: Yeah, you could allocate some of your private space. >>: [inaudible]. >> Alastair Donaldson: So you could allocate memory on the stack and use that in a thread-local way. So with our current tool, the tool would just assume that's shared state. There has been some interesting research on trying to do analysis of programs to determine which variables are shared and which variables are local, depending on the way they're used. >>: [inaudible] synchronous program you [inaudible] that's allocated [inaudible] thread to thread [inaudible]. >> Alastair Donaldson: So you pass something from one thread to another? >>: Yes. >> Alastair Donaldson: And then once you'd passed it, would you -- >>: [inaudible] to that thread. >> Alastair Donaldson: Then you would still regard it as part of the original thread's local state. >>: If it's local to [inaudible]. >> Alastair Donaldson: Okay. But once you pass it to another thread, you regard it as being local to that other thread. Yes. So I mean I'm not sure if you could model that directly in C and have our tool work successfully on it. But essentially that's not the problem we're trying to avoid. We're trying to avoid the problem of one thread being able to directly change the state of another thread. >>: But he can model it, right, because that will be allocated on the heap and -- >>: [inaudible] shared state. >>: It would be shared state. >> Alastair Donaldson: Yes. So we would just treat it as shared state. >>: Yeah. >> Alastair Donaldson: So, yeah, maybe after the presentation we could talk through an example that would do that and -- >>: [inaudible]. >> Alastair Donaldson: No, no. I'm delighted to. So, yeah. [laughter]. So Tom asked the question about what kind of examples are we looking at? And we have a class of examples where this method is useful. We would be very interested in trying to expand that class of examples.
And if you have ideas on that, that would be great. Okay. And then also we're assuming a strong memory model, and we're assuming statement-level granularity, so threads interleave at the statement level, which is obviously unrealistic. You can avoid this problem by preprocessing your input essentially into three-address code. The strong memory model problem is something we're looking forward to dealing with in future work. We have a new researcher, Jade Alglave, who is an expert in memory models, so I'm hoping to collaborate with her on extending this. >>: [inaudible]. >> Alastair Donaldson: Yeah. Okay. So now I'm going to do a very quick recap of Cartesian predicate abstraction. I'm not sure if this is necessary for this audience -- I wasn't sure how many people would be here -- but I'll go through it fairly quickly. So if we have a set of predicates phi 1 to phi M, what we're going to do is abstract the program. This is sequential predicate abstraction. We're going to abstract a program to produce a Boolean program where variables B1 up to BM track these predicates. By F of psi I denote the best approximation of the expression psi over our predicates. So F strengthens psi to the weakest thing we can express over the phi i such that F of psi implies psi. And what choose(A, B) means is: if A is true then 1, else if B is true then 0, else star, where star is the nondeterministic expression. And the effect of an assignment statement st on a predicate phi is abstracted as a choose between the weakest precondition for phi to hold after the statement has been executed, strengthened over our predicates, and the weakest precondition for not phi to hold, also strengthened over our predicates. So this is the best thing we can say about phi's new value with our current predicates, but considering phi in isolation from the other predicates.
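As a concrete illustration of these definitions, here is a small brute-force sketch in Python that computes the arguments of choose for the statement s := s + 1 against a single predicate phi: s != l, checking the implications over a small integer range instead of calling a theorem prover. Everything here is invented for illustration:

```python
from itertools import product

# Predicate phi: s != l, and the weakest preconditions of phi and not-phi
# under the assignment s := s + 1.
phi     = lambda s, l: s != l
not_phi = lambda s, l: s == l
wp_phi  = lambda s, l: s + 1 != l
wp_nphi = lambda s, l: not (s + 1 != l)

def implied_by(antecedent, consequent):
    """Does antecedent(s, l) imply consequent(s, l) on the sampled range?
    (A stand-in for the theorem prover calls used in real predicate
    abstraction.)"""
    return all(consequent(s, l)
               for s, l in product(range(-5, 6), repeat=2)
               if antecedent(s, l))

# F strengthens a formula to the weakest expression over {phi, not phi}
# that implies it; here we just test which candidates work.
print(implied_by(phi,     wp_phi))    # b  => wp(phi)?   False
print(implied_by(not_phi, wp_phi))    # !b => wp(phi)?   True
print(implied_by(phi,     wp_nphi))   # b  => wp(!phi)?  False
print(implied_by(not_phi, wp_nphi))   # !b => wp(!phi)?  False
# So F(wp(phi)) = !b and F(wp(!phi)) = false, giving the abstract update
#   b := choose(!b, false)
# i.e. if s == l before the increment, b definitely becomes 1; otherwise
# b gets a nondeterministic value.
```

This matches the update that appears in the two-thread example later in the talk.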
So we turn a statement into a parallel assignment to our Boolean variables, doing this choice for each predicate. Now, in this work we are not actually restricted to the Cartesian abstraction. We could have phrased this work in terms of existential abstraction generally, but our motivation is that we've written a paper on this work that we've sent to CAV, and we'd like this paper to be readable and compatible with the seminal work on predicate abstraction from 10 years ago which introduced the Cartesian abstraction. So we'd like someone to be able to read that paper and then read our paper, and for the notation from that paper to be enough to understand what we've done. Okay. So the goal of our work is: we have a template program P and an integer N, and what we want to do is to check that the parallel composition of N copies of P is correct. And by correct I mean that no assertions in the template can be violated. So one approach to doing this using existing techniques would be to build the program P^N directly, abstract it, and then check the abstraction. So let's think about how this would work. Suppose we had this very simple program here -- and I'm going to write my programs in this simple form where I define shared and local variables and then have some statements, rather than giving you whole fragments of C, for reasons of space on the slides. So this is a program with a shared variable S and a local variable L. We assert that they're different and increment S. So clearly for more than one thread this program is incorrect. And if we had this predicate that S and L are different, then what we could do is expand this program by multiplying out the threads. So we have the single shared variable and then we have two instantiations of the template. So we get a local variable L1, a local variable L2, and two different assertions. And we also would expand the predicate.
So we have a predicate that says S doesn't equal L1 and a predicate that says S doesn't equal L2. And then we could apply predicate abstraction directly to this program to get the program alpha(P^2). So this is the program with two threads. In practice we would use a Boolean variable B1 to track the predicate S not equal to L1, and a Boolean variable B2 to track the corresponding predicate for L2. And then we would get this parallel assignment corresponding to the updates to S. So if S is not equal to L1, then we don't know whether it will still not be equal to L1 after the assignment. But if they are equal, then they definitely won't be equal after the assignment. So we get 1 here. So does this make sense so far? It's a straightforward application of regular predicate abstraction. And then we could check this. So it would be easy to turn this program into a sequential Boolean program by simulating concurrency with nondeterminism. And we could use a model checker like Bebop, or we could use SMV with a suitable transformation from Boolean programs into the SMV language, to check this program for two threads. So we would find verification fails. >>: [inaudible]. >> Alastair Donaldson: So this thing of turning a Boolean program into an SMV program -- you just model the program counter as a variable, yeah, and then you just have separate variables for all the threads. So all the -- >>: So it would be basically a big gigantic -- >> Alastair Donaldson: Yeah, a big gigantic loop. >>: Loop. >> Alastair Donaldson: Monolithic loop. >>: [inaudible]. >> Alastair Donaldson: Yeah, yeah. >>: There's no recursion. >>: I see. No loop? >> Alastair Donaldson: Yeah. >>: Okay. >> Alastair Donaldson: Well, you could have loops but -- >>: You can have loops but all procedures [inaudible]. >> Alastair Donaldson: So in this [inaudible] we've -- >>: [inaudible]. >> Alastair Donaldson: Yeah. If you wanted to do smart things with procedure calls then this might be more tricky.
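To make the "simulating concurrency with nondeterminism" step concrete, here is a toy explicit-state search over the two-thread Boolean program just described: Boolean variables b1 and b2 track s != l1 and s != l2, and each thread asserts its own predicate and then performs the parallel choose-update for s := s + 1. This is only a sketch of what a back end like Bebop or SMV does symbolically, and it assumes both predicates hold initially:

```python
from collections import deque

def choose(a, b):
    """Semantics of choose(A, B): definitely 1, definitely 0, or star."""
    return {True} if a else ({False} if b else {True, False})

def successors(state):
    """One interleaved step of either thread; 'FAIL' marks an assertion
    violation in the abstract program."""
    pc1, pc2, b1, b2 = state
    pcs, bs = [pc1, pc2], (b1, b2)
    for i in (0, 1):                         # nondeterministic thread choice
        if pcs[i] == 0:                      # location 0: assert(b_i)
            if bs[i]:
                npcs = list(pcs); npcs[i] = 1
                yield (npcs[0], npcs[1], b1, b2)
            else:
                yield "FAIL"
        elif pcs[i] == 1:                    # location 1: abstraction of s := s + 1
            # the parallel assignment updates BOTH tracked predicates
            for nb1 in choose(not b1, False):
                for nb2 in choose(not b2, False):
                    npcs = list(pcs); npcs[i] = 2
                    yield (npcs[0], npcs[1], nb1, nb2)

seen, frontier = set(), deque([(0, 0, True, True)])
failed = False
while frontier:
    for nxt in successors(frontier.popleft()):
        if nxt == "FAIL":
            failed = True
        elif nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)
print("verification fails:", failed)
```

The search reaches a failing state (one thread's update can set the other thread's predicate to false, after which that thread's assertion fails), mirroring the abstract counterexample the talk describes.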
>>: So the counterexample is exactly that: the two threads update the shared variable? >> Alastair Donaldson: So in the original program, the two threads both update the shared variable S. Yeah. And that means that after one thread does the update, the assertion will fail for the other thread. Yeah? Okay. So -- >>: [inaudible]. >> Alastair Donaldson: Yeah. So I mean if both threads do the assertion first, then things will be okay. >>: So verification failed means counterexample -- >> Alastair Donaldson: Counterexample, yes. So here we would get a counterexample in the abstract program. And if we simulate this counterexample, we'll find it's genuine. >>: So it's a success? >> Alastair Donaldson: So it's a success, yes. Yes. Okay. So what are the pros and cons of this approach? The pros are it works in principle, right? So this is a way we can do verification of these kinds of programs, and we can use existing techniques essentially directly. But the problem is it doesn't scale, and we've got experiments later that show this. So there are two problems. One is the scalability of the abstraction. Suppose we've got K predicates over our template program P. Then potentially we're going to end up with a separate version of each predicate for every thread. So if you had a predicate just over shared variables then we wouldn't multiply that predicate for every thread. But for the predicate S not equal to L, we've got two predicates corresponding to that predicate. Right. So when we perform abstraction, even if we're abstracting thread 1, we have the predicates related to all the threads available to us when computing the abstraction. And in this example, we made use of that. So if I go back to the code -- yeah, here you can see that in thread 1 you're using both B1 and B2, which are the versions of the predicate for thread 1 and thread 2. Okay? So abstraction is expensive.
And if we multiply the number of predicates it becomes more expensive. And I guess a less important problem, but maybe still worth noting, is that if we had different values of the number of threads that we care about, then we would have to do different abstractions and check them separately. We can't do one abstraction and use it for multiple thread counts. >>: But given your assumptions you should be able to just have a template [inaudible]. >> Alastair Donaldson: So you mean do abstraction at the template level? >>: Well, I mean you have this one piece of source code that you know is going to be the main for all threads. >> Alastair Donaldson: Yeah. >>: Right. So the local variables, it's all parametric. >> Alastair Donaldson: Yeah. So but the thing is that if you multiply out the program and then multiply out the predicates, then you abstract a thread with respect to -- >>: This is where you do the explicit composition. >> Alastair Donaldson: So in this work, we're not doing parameterized model checking, right. We're not trying to check this for an [inaudible]; we're trying to check this for the program where up to some fixed number of threads are launched. >>: Oh, so you have a parallel composition that you actually perform to create the program? >> Alastair Donaldson: Yeah. That's not what I'm going to propose we actually do, but this is what we could do. >>: This is in the context of that. >>: Yes. Those assumptions. >> Alastair Donaldson: And then the other problem is that it's not feasible to check this program alpha(P^N) for large thread counts. So we get [inaudible] explosion because of concurrent thread interleavings. So we refer to this method as symmetry-oblivious predicate abstraction. So we're basically ignoring symmetry here. And what I'm going to show you next is a method that takes advantage of symmetry to do this in a better way. So, a potentially more natural approach:
Well, this template program P is not an executable program, but it's a program nevertheless. So what if we could abstract P directly at the level of the template, to get an abstraction alpha-prime(P)? I say alpha prime, not alpha, because the abstraction we do isn't going to be exactly the same as what we would have done on the previous slides. So this will hopefully be cheaper to compute, because we haven't blown up the number of predicates that we're abstracting over. What we'd like is to do this so that when we then take the parallel composition of the resulting Boolean program N times, we get something that overapproximates the parallel composition of the template N times. But because we're working at the template level, we should then be able to exploit recent techniques on model checking replicated Boolean programs that exploit symmetry to do the model checking efficiently. And in addition, we can then just abstract this program P once to get alpha-prime(P), and then we can try alpha-prime(P) with various thread counts. So we don't have to abstract a separate program for each thread count we're interested in. So I wouldn't be telling you about this unless the answer to this question was yes, you can do this. And this is what we call symmetry-aware predicate abstraction, and it's what I'll tell you about during the rest of the talk. So any questions up to this point? >>: [inaudible]. >> Alastair Donaldson: Yeah? >>: [inaudible] basic question. My understanding that you were doing bounded verification [inaudible]. [brief talking over]. >>: [inaudible] the loops, you unroll them? >> Alastair Donaldson: No, no, we don't unroll loops. >>: [inaudible]. >> Alastair Donaldson: In a Boolean program there can be loops, right. Yeah. So we're bounding the number of threads that get created, but we're not bounding the number of context switches. We're not bounding the depth that we search to. Yeah. But we don't have recursion. Okay.
So a quick overview of symmetry reduction, in case you're not familiar with it. So for verifying this replicated program, suppose N was 9, right, and let's just ignore shared state for the purposes of this example. Suppose we've checked some state where the threads are in this configuration: 1 and 2 are in state A; 3, 4, and 5 are in state B; et cetera. If we've checked that this state is safe, then because these threads are isomorphic, we don't need to check, for example, the state where the identities of processes 2 and 3 have been flipped. Because these states are permutations of one another, if we've checked one, we don't need to check the other. Okay. And similarly, we wouldn't need to check this state here where processes 6 and 7 have been flipped. Or this one, for example. And actually the symmetry-exploiting model checker that we use as a back end for our work, which I'm not going to give you details of, uses a technique called counter abstraction. So this whole equivalence class of states would be represented by this counter-abstract state here. So we say there are four processes in local state A, four in local state B, and one in local state C. It's called abstraction because we abstract away identity. But an important point about counter abstraction: so symmetry reduction can give you a very large reduction in the size of the state space you need to search. Symmetry reduction gives you a bisimilar quotient structure, so you're checking something bisimilar to the unreduced structure. And counter abstraction is a method of implementing symmetry reduction, and it also gives you bisimulation. So although we call it abstraction, we're not introducing any further overapproximation. So in this work we go from a program template to a Boolean program template. We expand that and then we model check it using counter abstraction.
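The saving that counter abstraction buys can be seen with a quick enumeration for the configuration in this example: nine isomorphic threads, each in one of three local states A, B, C (ignoring shared state). This is just a sanity check in Python, not how the back-end model checker actually represents states:

```python
from collections import Counter
from itertools import product
from math import comb

STATES = "ABC"   # local states A, B, C from the example
N = 9            # nine threads

# Identity-aware states: which thread is in which local state.
full = set(product(STATES, repeat=N))

# Counter-abstract states: only how many threads sit in each local state.
counted = {tuple(sorted(Counter(s).items())) for s in full}

print(len(full))     # 3**9 = 19683 states with thread identities
print(len(counted))  # multisets of size 9 over 3 states: C(11, 2) = 55
assert len(counted) == comb(N + len(STATES) - 1, len(STATES) - 1)
```

So 19,683 identity-aware states collapse to 55 counter states, and because each counter state represents a full permutation equivalence class, no precision is lost.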
We lose precision when we do the abstraction, but we don't lose more precision, because we're using this technique called counter abstraction in the model checking phase. I gave a rehearsal of this talk and someone raised that point, which I guess, because I'm so into symmetry, I didn't think about. Okay. So the idea is we're going to take this template program P, abstract it to get alpha-prime(P), and expand this, and we're going to do it in such a way that our expansion simulates the original program's expansion. But actually we're going to check the symmetry quotient of our expanded Boolean program, and because this is bisimilar, that's a sound thing to do. All right. So I nearly had a heart attack when I looked through the POPL proceedings and saw a paper with this title, Predicate Abstraction and Refinement for Verifying Multithreaded Programs. And I thought, oh, no, someone's doing something very similar to us. And then I read the paper -- and it's a very nice paper -- but it concludes with this very encouraging statement saying: another technique to fight state explosion is to factor out redundancy due to thread replication, as proposed in counter abstraction, implemented in the model checker BOOM -- which is our tool -- and we view these techniques as paramount in obtaining practical multithreaded verifiers. So, yeah, this heart attack turned into a good feeling of, yeah, other people think that this is a good thing to be working on. All right. So before I tell you about our approach, I just want to give you an indication of how much we had to change the CEGAR loop to make this work. So almost everything I'm going to tell you is related to computing the abstraction. We have a novel technique to do predicate abstraction at the template level. Then we had to adapt our BOOM model checker quite significantly into what we call B-BOOM, because it needs to perform broadcasts, as we'll see in what follows.
I'm not going to tell you the details of how we adapted BOOM into B-BOOM, but this is a significant piece of work. Checking feasibility of counterexamples required almost no modification. And at the moment our tool only refines the abstraction by adding predicates, so this is also very straightforward. And what we're working on now is constrain-style refinement, to make the abstraction with a given number of predicates more precise. And this is actually not so straightforward in our template-level setting, but on the plane here I think I figured out how to do it. It involves significant work but is absolutely doable. So this is the state of things -- well, this last bit isn't the state of things, but that's the state of things. Okay. So let's think about how we would do this template-level predicate abstraction. Let's consider this simple example here where we have a local variable L, a shared variable S, and a shared variable T -- which I thought I should remove because I don't use it in the example, but, sorry, we do have a predicate S is equal to T; that's why I need T. So we've got this predicate S is equal to T and a predicate L is equal to four. And what we want to do is to turn this into a Boolean program. So we can abstract the statements directly using Cartesian abstraction. And clearly we're going to need a Boolean to represent each of these predicates. So now the question is: because we're going to expand this to a concurrent Boolean program, the Boolean variables need to have a scope. So they need to be local or they need to be shared. So in this example, what do we want to do for these predicates? Well, I think it's pretty obvious that we would want the predicate S equals T, over shared variables, to be a shared variable, and we want the predicate just over local state, L equals four, to be represented by a local variable. Okay?
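Before the examples that follow, a tiny enumeration shows why the mixed case is delicate. With two threads, the instantiations (s != l1) and (s != l2) of a mixed predicate s != l can disagree in a reachable state, so a single shared Boolean cannot represent both, while a purely local Boolean would never be updated when the other thread writes s. The concrete values below are invented for illustration:

```python
# Hypothetical concrete state for the running example "assert s != l; s++",
# with shared s and per-thread locals l1, l2 (initial values invented).
s, l1, l2 = 0, 1, 2

assert s != l1 and s != l2   # the predicate holds for both threads initially

s += 1                       # thread 1 executes its increment of the shared s

p1, p2 = (s != l1), (s != l2)
print(p1, p2)                # False True: the two instantiations disagree,
                             # so one shared bit cannot track "s != l", and a
                             # local bit in thread 2 would be stale after
                             # thread 1's write to s.
assert p1 != p2
```

This is exactly the tension the next examples in the talk make precise: tracking a mixed predicate with a local variable misses communication, and tracking it with one shared variable applies thread-specific information to every thread.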
But what about this example here, where we have variables S and L? S is shared and L is local -- and I'll use that convention throughout the talk. And then we have a predicate -- yeah, this is the example we saw before that's incorrect for more than one thread. And we have this predicate S is not equal to L. So we build the Boolean program as I showed you before, except we're doing it just at the template level. And we have this Boolean variable that tracks the predicate S not equal to L. And now the question is: should we make this variable shared or local? And it's not immediately clear what we should do -- or it wasn't clear to me initially -- because this variable refers to both shared and local state. So what if we make it local? Well, the problem is that now, if you look at this Boolean program, for any number of threads we would say this program is correct, right, because this predicate is initially true and then we can't set it to false. Right? So if it's true, it will remain true. So this is not a sound thing to do, because we would deduce that our original program was correct, if we regarded this as a sound abstraction, when we know it's not. So clearly we cannot just represent these mixed predicates using local variables. And, yeah, in this example if we made this a shared variable, then we would correctly deduce that the Boolean program is wrong. Okay? So what if we instead decided to represent these mixed predicates using shared variables? Well, this is an example where that wouldn't work either. It's a bit more of an intricate example. We have a shared variable S, a shared Boolean flag F and a local integer L. And a thread can either go into this condition here, where if the flag is true we assert that S and L are different, okay? So if you imagine that one thread skips over here and performs this update, then S and L will be different -- sorry. Let me just think. S and L will be the same for any other threads, right? No.
I'm getting myself confused here. Yeah. >>: [inaudible]. >> Alastair Donaldson: Let me look at the abstraction. Okay. Basically I think I might have made a mistake in this example, but it's very easy to construct an example where, if you represent these mixed predicates using shared variables, then you get unsoundness in the other direction. So if you don't mind, I'm going to skip over this, because I don't want to spend ages figuring out the details. I think I've made a -- >>: [inaudible] you introduce the Cartesian abstraction introduces -- >> Alastair Donaldson: I want an example where, if I make this mixed predicate S not equal to L be shared, then I will incorrectly claim that the abstract program is correct when the concrete program is incorrect. I think this example does it, but I'm sorry, I prepared this [inaudible] the PPoPP conference over lunch, and, yeah, I'll maybe look at it after the talk, and if anyone's interested I'll go through the details. Okay. And in this example, declaring this predicate to be local would work if the example were correct. Okay. So the idea was to establish with these examples that neither of these strategies -- making a mixed predicate always local, or making a mixed predicate always shared -- works. Right. And now you might ask: well, okay, these examples were very contrived; if I did this in practice, would I maybe get reasonable results for benchmarks? So these are benchmarks that I'll tell you a bit about later. But what I want to show you here is that if we try this approach of declaring mixed predicates to always be local, or mixed predicates to always be shared, then we get very strange results. So here, what I'm saying is that unsafe means this is a buggy version of the example. So a correct verification result would be to say that the example is unsafe.
So if we declare mixed predicates as local variables, we frequently get the model checker telling us that these unsafe examples are safe, which is unsound. And if we instead declare mixed predicates as shared variables, then in one case here we find that the model checker tells us that the example is safe when it's not. By 'no difference found' what I mean is that we don't manage to get a conclusive verification result, and we can't find any further predicates to add using our predicate discovery strategy. And I say that this is an erroneous result because with the symmetry-aware approach we don't get a 'no difference found'; we actually get the right verification result. And a quick observation is that we'll never get told that a safe example is unsafe, right, because we get a counterexample which we have to simulate over the original program, and we're never going to find that a safe program has a counterexample. Yeah? >>: Suppose that you take the local state and you replicate it to make [inaudible] instead of having two local variables you have two shared variables. >> Alastair Donaldson: Yeah. >>: And you just arrange it in such a way that one thread only accesses the [inaudible] and the other one only accesses the other one. >>: Well, in that case all of your variables are shared -- >> Alastair Donaldson: Yeah. >>: So in that case you don't have a mixed predicate, so it's clear that all of the Booleans should be -- >> Alastair Donaldson: Should be shared. So that's precisely what we did in the first [inaudible] I showed you, where we expanded the threads out separately, yeah, treated the threads' local variables as being different variables, expanded the predicates to be distinct predicates, yeah. Then we do the abstraction on a thread-by-thread basis. Then everything is shared. And that's exactly what you proposed.
But we're not doing that at the template level, and therefore we can't exploit symmetry in our model checking. >>: But is that sound? >> Alastair Donaldson: Yes, that's the sound thing to do, yeah. >>: So, right. But how can the other one not be sound? I mean, if you're just pretending that things are more shared than they are, I guess I'd like to see the other example [inaudible]. >> Alastair Donaldson: Okay. Yeah. [brief talking over]. >>: Well, what's missing here is you haven't really defined your notion of the abstraction of an individual process. Because if you take this process, it has a fixed vocabulary of predicates as a [inaudible]. >> Alastair Donaldson: Yeah. >>: [inaudible] the program. A fixed set of globals that it knows about. Now that you start composing these guys together -- >> Alastair Donaldson: Yeah. >>: -- your global state now has N times that many global variables. If you have made the shared predicates global -- >> Alastair Donaldson: Okay. So -- >>: [inaudible] vocabulary has changed, right, between an instance of one process and now my composition, so you haven't said what happens if I execute process one: what happens to the global variables it doesn't know about? I mean, are they staying the same when process one transitions? >> Alastair Donaldson: So you're talking about the Boolean programs having these global variables, right? >>: You're saying you're locally abstracting one thread. >> Alastair Donaldson: Yeah. >>: To a Boolean program. That Boolean program has a vocabulary, has a set of local variables and a set of shared variables. >> Alastair Donaldson: Yeah. >>: But its set of shared variables is not the complete set of shared variables. >> Alastair Donaldson: So in this case, I'm proposing that it would be. So here, this would be the Boolean program for an arbitrary thread.
And this predicate here would just be a shared variable, and when you compose this program many times you only ever have two shared variables. So -- >>: So you could have one copy of the variable that stands for all. >> Alastair Donaldson: Yeah. >>: Even though, in fact, it's representing -- >> Alastair Donaldson: And that's why this doesn't work. [brief talking over]. >> Alastair Donaldson: And this is meant to be an example just to simulate that. Yeah. >>: Because you've got all these different local variables, so how can this one shared predicate -- >> Alastair Donaldson: Exactly. >>: [inaudible]. >> Alastair Donaldson: Or you need something else. [brief talking over]. >> Alastair Donaldson: So maybe I could now show you the solution rather than -- >>: [inaudible] or something -- >>: Or something -- >>: Okay. Okay. So that's -- >> Alastair Donaldson: So -- >>: All equals true, all equals false or [inaudible]. >> Alastair Donaldson: Yeah. So basically if you make these mixed predicates be local variables, then you don't communicate when you should communicate. And if you make them just be represented by a single shared variable, then a communication that's specific to one thread gets applied to all threads. So neither of these approaches makes sense. Okay. So now I'm going to show you how we can deal with mixed predicates in a sound way. But you might first ask, do we need them at all? Might it be possible to rewrite our program so we never have these mixed predicates? And it's very easy to construct an example where we do need mixed predicates. So this example is rather contrived. The idea is that only one thread will be able to get into this loop here. And this thread will increase these variables S and L and assert that they're equal, right. S is a shared variable, L is a local variable.
So clearly we want the mixed predicate S equals L to prove this program correct. And it can be shown that over a set of non-mixed predicates you won't be able to compute an invariant strong enough to show this program is correct if you assume unbounded integers; with machine integers you would need a very, very large number of predicates tracking the value of every possible integer. Okay. And then in practice, also, let's have a look at this example of building an acquire-lock operation using test-and-set. So in this example we do test-and-set on this lock variable to get back a condition, right? And if the condition is "locked", then we know that the lock was already held, so we do this exponential backoff and try again. All right. And what we want to assert is that once we have successfully acquired the lock, the condition should not be equal to "locked". And this is something we would represent very naturally by a mixed predicate: lock is a global variable, cond is a local variable. So these mixed predicates, not only do we need them in theory, but in practice they're useful for these sorts of examples. All right. So now I'm going to explain how we handle mixed predicates in a sound way in our symmetry-aware predicate abstraction technique. And I'm going to show you the technique assuming no pointers first. And then I'll show you how pointers can be slotted in. This makes the presentation much easier and doesn't lose anything. So suppose we have a program P and a set of predicates over the variables of P. We want to translate this into a Boolean program B, so that the n-thread Boolean program B^n overapproximates the n-thread program P^n. And we want Boolean variables B1 up to Bk, one per predicate, as usual. Our approach is to say that the Boolean variable Bi is a shared variable if and only if the predicate phi_i is shared. So if the predicate phi_i only refers to the shared state, then the Boolean variable is a shared variable.
Otherwise we're going to make the Boolean variable local. So in particular, we are going to track mixed predicates in local variables. So clearly we need something else up our sleeve, because I've shown you that that alone would be unsound. So let's suppose we have an assignment V := E, a predicate phi and an associated Boolean variable B, and we want to work out what the effect on this predicate should be. So first of all, if V doesn't occur in phi then the variable B won't change. Because remember, I'm considering no pointers here. Otherwise we need to update B in one of three ways. So suppose V and phi are both shared, so V is a shared variable and phi is a shared predicate. Then, for example, if our statement is incrementing S and we have a predicate S == 12, we would just update B according to standard predicate abstraction. So we would do this: B := B ? 0 : *. Okay. So because this predicate is shared, it's in a shared variable, it's in one place, this will be visible to all threads. So this is what we want. Okay? Now suppose that V is local and phi is either local or mixed. So V is a variable local to a thread, and phi is a predicate over just the local variables or over a combination of variables. And we know V occurs in this predicate phi. Then it's sufficient just to update the predicate for the thread that executed the update, because we've changed the truth of the predicate for that thread, but we clearly haven't changed the truth of the predicate for other threads, because we didn't do a shared update. So, for example, take the analogous example with L instead of S. If we had the predicate L == 12, then if this was true it will become false. If we had the predicate L equal to some shared variable S, if it was true for this thread it would become false. But clearly it's not going to become false for any other thread, or become true for any other thread.
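The case analysis just described can be sketched in a few lines. This is a minimal illustration with hypothetical helper names, for the pointer-free setting described here, not the tool's actual implementation:

```python
def classify(pred_vars, shared_vars):
    """Classify a predicate by the variables it mentions (no pointers)."""
    uses_shared = any(v in shared_vars for v in pred_vars)
    uses_local = any(v not in shared_vars for v in pred_vars)
    if uses_shared and uses_local:
        return "mixed"      # tracked in a thread-local Boolean
    return "shared" if uses_shared else "local"

def update_strategy(assigned_var, pred_vars, shared_vars):
    """Which kind of update does the abstraction emit for `assigned_var := e`?"""
    if assigned_var not in pred_vars:
        return "no-op"              # predicate unaffected (no pointers)
    kind = classify(pred_vars, shared_vars)
    if assigned_var in shared_vars and kind == "mixed":
        return "notify-all"         # the interesting case: broadcast needed
    if assigned_var in shared_vars:  # shared variable, shared predicate
        return "shared-update"      # one shared Boolean, visible to all threads
    return "local-update"           # only the active thread's Boolean changes
```

For instance, for `S++` with predicate `S == 12` this yields a shared update, while for `S++` with the mixed predicate `S == L` it yields a notify-all update.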
So the interesting case is when we have V being a shared variable and phi a mixed predicate. So phi is a predicate over shared and local variables, so its truth will be thread-dependent. And V is a shared variable. So by updating V, we are going to change the truth of this predicate potentially for every thread. But in the C program, in the high-level program, one thread is going to actually execute this statement, right? So something that would only change the shared state in the C program is going to change the state of many threads in the Boolean abstraction of this program. And we handle this using what we call a notify-all update. So if we had this S++ and a predicate that says L and S are equal, then what we do is, for the active thread we output a normal update to say that this thread's Boolean B representing this predicate gets updated in the usual way. But we also need to tell all the passive threads to update their local variable for this predicate appropriately. So we do this by introducing a new Boolean program construct called a broadcast. So let's think about what a regular update on a local variable looks like. Suppose thread i executes some update. Then this causes thread i's program counter to increase and thread i's local state to change. What I'm going to introduce is a broadcast update, which is a thread executing a statement that has no effect on the thread's own local state, but changes the state of all passive threads. I'll show you what this looks like and then I'll show you how we use it. So we use this syntax, square brackets around V, to mean this l-value is going to be changed in all passive threads but not in the active thread. So suppose we had this state here where the active thread is i, okay, so its program counter is going to change but its local state isn't going to change. And the local states of all the other threads are going to change if we do a broadcast. So V in brackets -- I'll call this passive V.
So passive V becomes a [inaudible] something, where V is a local variable. This causes the active thread to step forward where it is in the program, and the states of all passive threads to be updated. Yeah? >>: [inaudible]. >> Alastair Donaldson: This means some R value. So this is just the right-hand side of the assignment. >>: Can it refer to local variables at [inaudible]. >> Alastair Donaldson: Or should it not? We'll come to that. So, yes, the answer is -- in principle it could refer to the local variables of any threads. And we find it useful to allow it to refer to the variables of the active thread and the variables of the passive thread. Yeah. >>: So you're saying -- I'm confused. So why do you have this notion of the active thread not changing its local state? >> Alastair Donaldson: I think that will become clear when I show you an example in the next slide. >>: Okay. >> Alastair Donaldson: Okay. But tell me if it's not. So this broadcasts something to all passive threads. So -- >>: I'm sorry, but fundamentally each [inaudible]. >> Alastair Donaldson: Yeah. >>: [inaudible] understand exactly what [inaudible] passive threads are about to execute some statement, but they're all basically disabled, and you're going to tell them to update their predicate corresponding to phi. >> Alastair Donaldson: Yeah. So we're not going to tell them to update, we're just going to update it for them. >>: We're going to update? >> Alastair Donaldson: Yes, synchronously. So at once. You can imagine a loop that executes atomically and just changes the states of all the other threads. >>: Okay. >> Alastair Donaldson: Yes? >>: On your previous slide I think you had -- yes, notify-all updates and shared V. [inaudible] V is shared and then [inaudible]. >> Alastair Donaldson: Okay. So on that slide by V I mean a variable in the C program. >>: Yes.
>> Alastair Donaldson: And on this slide I mean a variable in the Boolean program. So I really should have said B, not V, here to make this clearer. Thanks for [inaudible]. >>: B? >> Alastair Donaldson: Yes, it could be V, but it would be clearer if I had said B, not V. So, yeah, here V is a local variable; on the previous slide V is shared. >>: Can we assume that for any given program the number of [inaudible] the dot, dot, dot is finite? >> Alastair Donaldson: The number of different values of the dot, dot, dot. In a Boolean program, yes, because -- well, I mean syntactically, no. >>: No, but the dot, dot, dot is an expression, not on Boolean variables. >> Alastair Donaldson: Okay. Well it -- on the variables -- >>: It's on R values, as you said. It's an R value from the original program. >> Alastair Donaldson: No, no. This is in the Boolean program we're going to do this. Yeah. >>: A Boolean value? >> Alastair Donaldson: Yes. So in the abstract program we're going to introduce this statement -- this is a Boolean program variable. And we're going to update it in all other threads using Boolean program variables; we'll come to where those come from. I think when I show you this in the context of the abstraction, hopefully it will be clear. So I'm going to introduce a bit of notation. So suppose phi is a predicate. If I put phi in square brackets, what I mean syntactically is a formula like phi, but with every local variable L replaced with L in square brackets. So this is phi in the context of a passive thread. So this is like another thread's take on phi. So for example, if we had the predicate S == L, if I put that in square brackets, I mean S equals passive L. >>: So now that's really mixed, because S is a C program variable and bracketed L is a Boolean -- >> Alastair Donaldson: No, no, no. This would be a predicate over the C program.
So S is a shared variable and L is a local variable in the C program, yeah. >>: So, okay. So -- okay. >> Alastair Donaldson: Yeah. So here phi refers to a predicate over the original program. >>: I know that. But I thought that the brackets -- >> Alastair Donaldson: So, yeah. So we're going to use the brackets on both levels. We're going to use the brackets at the C program level during our abstraction, and we're going to use the brackets in the Boolean program syntax to represent broadcasts. >>: Okay. Okay. So you're overloading the brackets. >> Alastair Donaldson: Yeah. >>: Okay. >>: But which L are you referring to? I mean, L is local to each one -- >> Alastair Donaldson: Yeah, yeah, yeah. So at the moment you can just think about this as a piece of syntax: we mean L in some passive thread. And then I hope it will become clear what I mean when I show you how we use it. So yeah, if we have an assignment V := E, a predicate phi and a variable B, then if V is shared and phi is mixed, we're going to do one of these notify-all updates. So what we generate is a parallel assignment. So we say that B -- this is the Boolean variable corresponding to phi in the active thread -- gets updated according to the usual predicate abstraction rules, and simultaneously B in every other thread gets updated, right, and this is where this phi in square brackets gets used. We're updating B in another thread because the truth of that thread's version of phi may be changed by this shared update. So we need to update it to reflect the weakest precondition for it to hold in this other thread after the assignment of V -- not in the other thread but in the active thread, right -- being updated with the R value E. >>: So why can't I just think of this [inaudible]. >> Alastair Donaldson: As what? >>: [inaudible]. It's sort of like.
>> Alastair Donaldson: Because we're not talking about a specific -- yeah. So I wanted to -- we could have used V underscore J. We could have written for all I, V underscore I. But we wanted something we could write in our programs. >>: I wonder if he just needs two vocabularies. >>: Yeah. >>: One to talk about the thread that's moving and one to talk about the thread that's being [inaudible]. >> Alastair Donaldson: Not the thread that's being affected, but any thread that's being affected, yes. >>: But the point is you're thinking about only pairwise interactions of these two thread-local states. Right? And then you're sort of taking the Cartesian product of all those pairwise interactions and you're saying okay, this thread that's moving interacts with this thread in the following way and that thread in the following way and that thread in the following way. Those interactions may be correlated, but we lose those correlations. >> Alastair Donaldson: Yeah. So maybe a clearer way of writing it would be something like saying B equals blah, and for all I, B_I equals blah, right? So we played with various different notations for this, but, yeah, maybe -- >>: Okay. >> Alastair Donaldson: Okay. And yes, these statements are actually done simultaneously. So let's look at this on an example. Suppose we have this assignment S := L, S is shared, L is local, and we have a mixed predicate that tracks whether S and L are equal. And there's a corresponding Boolean variable B. So we generate this, okay. So let's simplify it. So let's look at the top line. The weakest precondition for S to be equal to L after the assignment S := L is L and L being equal, which is true. Okay, so this turns into B := 1. Right? So clearly, if a thread says S := L, then for that thread S and L will be equal. So what about the other threads?
Well, this formula, S == L in a passive thread, means that the shared variable S is equal to the passive thread's L. And the weakest precondition for that to hold after the assignment is that the active and passive threads have the same value of L. Right? Okay. So now how do we compute this? So this relates to your question as to what the R value should be, right -- what we should allow on the right-hand side of this assignment, which threads we should be able to read from -- >>: [inaudible]. >> Alastair Donaldson: Yeah. So which predicates should we give to the F operator? So the obvious predicates we have at our disposal are the predicate phi, S == L, and the predicate passive phi, S equals passive L. So what we might try is saying, because we're updating a variable in a passive thread, let's just give the F operator the predicates of the passive thread. So in this case, what we would be trying to do is to strengthen L equals passive L just over the predicate S equals passive L. And it's hopefully clear that the best we can do here is false, right, and the same for the negation. So in this case we would say passive B becomes a [inaudible] everywhere. So we would say, we've updated the shared variable, so let's just kill all the information in other threads about that shared variable. And that would clearly be a sound thing to do. But perhaps not a very useful thing to do. So we can do better if we allow the F operator to range over predicates in the passive thread and predicates in the active thread. So in this case we would be computing the strengthening of L equals passive L, but over two predicates, S equals passive L and S == L. And now we can do significantly better, right? So we can say that L will certainly be equal to passive L if S equals passive L and S == L, okay? Because they're both equal to S.
And we can say that these things won't be equal if we have either S equal to passive L and S not equal to active L, or vice versa. So the strengthening of the expression is the conjunction of these predicates, and the strengthening of the negation is the exclusive or of the predicates. And this is a much more precise thing to say. So we say here B is a choice between these two things. And I wonder if maybe the interplay between applying the brackets operator to Boolean variables and applying the brackets operator to the C program variables is now a bit clearer. So during abstraction our abstraction engine is given these bracketed variables in the program. But what it will generate is something over passive and active predicates. Yeah? So I mean, I wonder if it might have been clearer to use two different notations -- maybe not to overload these things. I think they're kind of naturally related, and for me overloading works. >>: Could I just think of this as being sort of the ordinary, you know, Cartesian predicate abstraction of two processes, the active and the passive? I mean, can I just think of it that way and then say okay, now for my whole collection of processes I'll just take the conjunction of all those things from all of the [inaudible] for all of the active-passive variables? >> Alastair Donaldson: I don't think it's the same, because I think if you just took the Cartesian abstraction of a pair of processes you won't correctly simulate the effect of updating all processes. >>: You would be updating all the processes individually and then take the cross-product of all those updates. >> Alastair Donaldson: So if you considered each pair separately and then combined them? >>: Right. You pick one to be active and now consider all the pairs consisting of the active one and the passive one, right, predicate abstract and do a Cartesian abstraction going forward, all right.
And now you're going to wind up with states for each active successor state, and you're going to wind up with corresponding passive successor states for each of the other processes; take the Cartesian product of all that. Is that what you've got? >> Alastair Donaldson: I don't think I fully understand what you mean, but I think that you might get something equivalent to this thing of expanding the program and doing the most precise abstraction you can. >>: Well, this couldn't be the most precise -- >> Alastair Donaldson: No, absolutely. >>: [inaudible] because it would lose all the correlations between the passives. >> Alastair Donaldson: Yes, within the passives, yes. And it does do this. >>: You're capturing the correlation between active and passive for each passive individually. >> Alastair Donaldson: Yeah. Yeah. Okay. >>: The correlation between the passives is being lost. >> Alastair Donaldson: Yeah. >>: That seems to me the information that's being lost in this -- >> Alastair Donaldson: That is exactly the information that's being lost. So you can come up with examples. When I said we haven't implemented CONSTRAIN-style refinement in our tool yet, that's because you get counterexamples that say that this transition is infeasible because of information in other passive threads. Yeah. And we would need some notation and theory for including that in our Boolean language. Okay. So the answer is, I think so. But I'd like to talk more about it. So, yeah, this is the approach we take. And it may seem kind of arbitrary. This is saying let's update the passive thread considering no other threads, and here we're saying let's update the passive thread considering one of the other threads. We could consider two or three threads. But this seems very natural, because the active thread did the update, so hopefully the active thread's state is going to be useful in determining how we should update the passive thread's state. But it's not always enough.
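The worked example above can be checked mechanically. This is a small sketch (hypothetical names, not from the tool) of the notify-all update for S := L with the mixed predicate S == L: the active thread's Boolean becomes true, and each passive thread's Boolean is strengthened over its own copy of the predicate and the active thread's copy, falling back to a nondeterministic value when neither determines the answer. The assertions compare the abstract update against the concrete semantics for some sample values:

```python
NONDET = "*"  # the abstraction "loses" the value here (a choose/star)

def notify_all_update(b_active, b_passive_list):
    """Abstract effect of the active thread executing S := L on the
    Booleans tracking S == L in the active and passive threads."""
    new_active = True  # wp(S == L, S := L) is L == L, i.e. true
    new_passives = []
    for b_p in b_passive_list:
        if b_p and b_active:      # S==[L] and S==L before => L==[L] after
            new_passives.append(True)
        elif b_p != b_active:     # exactly one held before => L != [L] after
            new_passives.append(False)
        else:                     # neither held: cannot tell from these two
            new_passives.append(NONDET)
    return new_active, new_passives

# Sanity check against the concrete semantics for some concrete values.
S = 3
locals_ = [3, 3, 7, 9]          # thread 0 is active; threads 1..3 passive
b = [S == l for l in locals_]   # each thread's truth of phi: S == L
new_a, new_p = notify_all_update(b[0], b[1:])
S = locals_[0]                  # the concrete assignment S := L by thread 0
assert new_a == (S == locals_[0])
for b_abs, l in zip(new_p, locals_[1:]):
    # the abstract value must agree with the concrete truth, or be nondet
    assert b_abs == NONDET or b_abs == (S == l)
```

Note that the passive-only variant discussed above would have to return NONDET for every passive thread, which is sound but far less precise.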
All right. So the price here is that the F operator now computes over twice as many predicates in the worst case. So we give the operator the predicates of the active thread and the predicates of the passive thread. But it's not as bad as what I showed you before, where we have N times M predicates, where N is the number of threads. So given an assignment V := E, the way we do this in practice -- so I showed you how we update with respect to one predicate. In practice, what we do is compute the indices of those predicates for which we need to do a broadcast. So when we're figuring out the abstraction, we work out the indices of the predicates that are mixed and that have this variable V in them. So for all predicates we do the usual thing of simultaneously updating all the active thread's Boolean variables using standard predicate abstraction, and then for the predicates with these indices we do a broadcast. And this is one big parallel assignment. Yeah? Does that make sense? So this is just putting together what I showed you before. Okay. So what about pointers? Now I'll briefly show you how we can adapt this to take pointers into account. So I'm not going to tell you anything interesting about how to do concurrent alias analysis, which is a difficult problem. But let's assume we have an alias analysis procedure that's concurrency-aware, and let's also assume that it will reject the program conservatively if the situation of having a pointer from shared state to local state can occur. Because like I told you in the beginning, we restrict our attention to not consider those programs. Otherwise the procedure will yield a relation points-to_d, so we write X points-to_d Y if X may point to Y at program point d, right? >>: [inaudible] just treats all statements [inaudible]. >> Alastair Donaldson: Yeah, yeah.
>>: So fundamentally -- >> Alastair Donaldson: Maybe you don't -- >>: [inaudible] analyses, if they're imprecise enough, simulate the concurrency already, depending on which one you pick. If you want too much precision -- if you want a lot of precision, like full sensitivity, then you lose -- yeah, you lose the concurrency. But it turns out that if you're very imprecise, you simulate concurrency. >> Alastair Donaldson: Anyway. >>: Anyway. >> Alastair Donaldson: So we've done something very [inaudible] to make our analysis concurrent. And I don't know too much about concurrent analyses like [inaudible] analysis, for example, but we're hoping to look into that in the future. I don't know whether it scales. But let's suppose we have this black box. So the locations of a variable V we just say is the set {V}. Okay? Here I'm assuming that pointers point to variables. I'm not thinking about records and arrays. But with some tedious work this could be generalized to that situation. And we say the locations of a dereference are the variable you're dereferencing and anything it could point to. So the locations of an expression are anything you would need to read to evaluate the expression. And this can be defined recursively for compound expressions. And then we say that capital Loc of phi is the union of the locations of phi over all program points. So before, we said a predicate was shared if it only involved shared variables, and mixed if it involved both local and shared. So now we just generalize this. We say that a predicate is shared if its locations are a subset of the shared variables, and local or mixed otherwise, as before. So this is the obvious generalization to take pointers into account. And then to abstract assignments -- so this is our definition of loc, and we have a corresponding definition of targets. So the targets of a variable are just the variable, and the targets of a dereference are just what the variable can point to.
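These recursive definitions can be sketched directly. The representation here is hypothetical: expressions are tuples, ("var", v) for a variable and ("deref", v) for *v, pointers point only to variables (as assumed above), and the points-to relation pts comes from the external concurrency-aware alias analysis, treated as a black box mapping (program point, variable) to the set of variables it may point to:

```python
def loc(expr, point, pts):
    """Everything you might read to evaluate `expr` at `point`."""
    kind, v = expr
    if kind == "var":
        return {v}
    if kind == "deref":  # *v: read v itself, plus anything it may point to
        return {v} | set(pts.get((point, v), set()))
    raise ValueError(kind)

def targets(lval, point, pts):
    """Everything you might write by assigning through `lval` at `point`."""
    kind, v = lval
    if kind == "var":
        return {v}
    if kind == "deref":  # *v := ...: may write anything v points to
        return set(pts.get((point, v), set()))
    raise ValueError(kind)

def is_mixed(pred_locs, shared_vars):
    """Pointer-aware classification: mixed if the predicate's locations
    include both shared and local variables."""
    return bool(pred_locs & shared_vars) and bool(pred_locs - shared_vars)
```

For example, if p may point to {x, y} at point d, then loc(*p) at d is {p, x, y} while targets(*p) at d is {x, y}.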
So this captures what you could change by writing through this l-value. Okay? So now, when do we need to do a broadcast? Well, suppose we have an assignment psi := E, and I write psi because psi could be a variable or it could be a dereference. And we assume that you've preprocessed your program so that you don't have multiple dereferences as an l-value. So we have a predicate phi, and we want to know: do we need to broadcast to other threads regarding the predicate phi? Well, we need to do this when phi is a mixed predicate, using this pointer-aware definition of mixed, and when there is some variable V which is shared and belongs to the locations of phi at this program point and to the targets of psi. So a variable V which is relevant to phi at this program point, and which could be changed by assigning to psi at this program point. So I hope it's clear that with these new definitions, everything I showed you before now works for pointers. The challenge would of course be doing the alias analysis. So now I'll briefly tell you about how we've handled the rest of the CEGAR process. So we built a tool which we call SymSatAbs. This is based on Daniel Kroening's SatAbs model checker. So we've implemented this new predicate abstraction scheme in the model checker. And then to do the actual checking we've extended BOOM, which is this symmetry-capable Boolean program model checker. We've extended it to be able to do these broadcasts, which required some nontrivial modifications. And they're quite computationally expensive to perform. Which is why we've been careful in this work to try to work out as precisely as possible when you must do a broadcast, and not do a broadcast unless you have to. Because of the way this symbolic counter abstraction works, these broadcasts are quite expensive to realize. Simulation required only trivial modifications. And then there's refinement.
So with predicate discovery, if we discover a predicate by doing, say, weakest precondition on a counterexample trace, like L3 < S -- so the local variable of thread 3 is less than shared variable S -- then we simply add the generic predicate L < S to our set of predicates. So this adds the predicate for all the different threads. And we never get predicates like L1 < L2, because in our programs we can never compare local variables of different threads. So this is actually very straightforward. And then transition refinement, CONSTRAIN-style transition refinement, this is in progress. The problem is you may get a spurious transition between two abstract states, but these states will refer to very specific thread IDs. And then in your Boolean program you want to add a CONSTRAIN to rule out this transition. And doing that -- well, you can't add a CONSTRAIN about specific threads. If you did, you would destroy symmetry, which is the thing that makes it all scale. So we're working out a way to be able to write a generic CONSTRAIN, which will involve some quantification, involve extending our Boolean programming language a bit and extending the B-BOOM model checker accordingly. Okay. So I'll tell you briefly about experimental results. I'm nearly at an hour. Can I go on a little bit longer? Yeah. Okay. So these examples are mostly working on lock-free data structures. So we've got a lock-based and a CAS-based counter, a pseudorandom number generator, a lock-based and a CAS-based stack that supports concurrent pushes and pops. Then these concurrent lock implementation examples like I showed you earlier. And also some examples on finding the maximum element in an array. So relatively small but not completely trivial C programs, and we have simple properties that we're checking, specified as assertions in the source code. And for each version we've injected a bug. So we have a correct version and a buggy version.
The way we've evaluated this experimentally is by comparing our symmetry-aware approach against the symmetry-oblivious approach that I mentioned to you at the beginning of the talk, where we just expand the threads out. So what I'm claiming is that our approach gives you faster abstraction times, because you abstract over fewer predicates -- mostly just the predicates of one thread, and when you need to do a broadcast, the predicates of two threads, the active thread and some passive thread. And then model checking is faster because of the symmetry reduction. So mixed predicates were necessary for all of these examples except the lock-based pseudorandom number generator and the lock-based stack implementation. And the intuition here is that to build these lock-free versions you need to do these tests that check whether some local variable is still equal to a shared variable. And that's a good source of mixed predicates. So it's not really a surprise that in these versions where you imagine you have a lock as a primitive language construct, you don't necessarily need mixed predicates. >>: How do you model a lock? >> Alastair Donaldson: How do we model a lock in SatAbs? So we actually cheat. We actually have something called CPROVER atomic, right, which just makes this block atomic in this [inaudible]. So that's how we model a lock in these benchmarks. >>: Have you decided when you're [inaudible]. >> Alastair Donaldson: Then the code you want to be executed until [inaudible]. >>: [inaudible] I guess what I'm asking, do you model locks as a Boolean variable -- >> Alastair Donaldson: Yeah. >>: As a Boolean variable. Okay. One more question. When you say you have a CAS-based stack -- >> Alastair Donaldson: Yeah. >>: Were the CAS operations just used to implement the lock, or were they being used to, you know, flip or [inaudible]. >> Alastair Donaldson: So I'm not actually sure. This comes from an open source IBM implementation.
And our student, Alex Kaiser, he worked on this. I believe it's not just implementing a lock on the stack, I believe it's an actual lock-free stack. I haven't looked into the details. But you can -- I know where it is online. And we can look at it. So this stack example was written in Java. And he ported it to C, which is what our model checker works on. Tantalizingly he found a bug in the IBM implementation, but untantalizingly, the bug manifests with only one thread. So not very interesting for us to say we found this bug. >>: [inaudible]. >> Alastair Donaldson: So it's from this book, Concurrency Design Patterns. Yeah. So ->>: [inaudible]. >> Alastair Donaldson: Yeah. So we experiment on a three gigahertz Intel Xeon. We have a timeout of one hour. And we do the Cartesian abstraction with a maximum cube length of 3. So in blue I've indicated which method performs best. So you can see here we have the number of predicates required for the symmetry-oblivious method, which you can see grows with the number of threads. The number of predicates required by the symmetry-aware method is independent of the number of threads, because this is the number of, like, generic predicates. Then we have the time taken for abstraction in the symmetry-oblivious and symmetry-aware approaches respectively. And the time taken for model checking. So here model checking is performed with B-BOOM, our extension of the BOOM tool. And here we took the best time between SMV and BOOM with no symmetry reduction. So you can see that we get significant speedups on many of the examples we can verify. So we show the largest thread count here that we can verify with each approach. So in this example, we showed a mixture of thread counts. But 10 was the largest thread count we could verify without symmetry, whereas we could go up to 20 but not further with symmetry. So we can check interesting thread counts, though obviously beyond a certain point checking more threads isn't that interesting.
So the important thing is, for some of these cases we could only check very small thread counts without exploiting symmetry. But we can check larger thread counts with symmetry. Are there any questions about the experiments? >>: So it seems that there are now two differences between symmetry oblivious and symmetry aware. And one of them is that you're doing this active-passive abstraction of the transition relation in symmetry aware but not symmetry oblivious. >> Alastair Donaldson: Yes, symmetry aware computes a more precise abstraction. Which is why it takes longer. >>: Yeah. And the second one is of course that in the symmetry aware you are exploiting symmetry. >> Alastair Donaldson: Yeah. >>: But even in -- even without using -- even without using your active-passive approximation. >> Alastair Donaldson: Yeah. >>: The system is still symmetric. >> Alastair Donaldson: Yeah. >>: And you could still apply symmetry reduction techniques even without that. >> Alastair Donaldson: Okay. So we have a little bit. So with this symmetry-oblivious approach, if you expand the threads and then abstract thread one to get a Boolean program, then you know the Boolean program for thread two is going to be the same, but you just switch their IDs. Right? So what we did with these experiments is we actually didn't implement that, but we divided the abstraction time by the number of threads to be fair. Because it would be possible to compute the precise abstraction for one thread and then generate the abstract versions for the other threads, though that would require some very tedious implementation work. So ->>: Correct. But you could also use sort of the traditional symmetry reductions, you know, on the version that is still exact and doesn't use your active-passive ->> Alastair Donaldson: We end up with this ->>: [inaudible]. >> Alastair Donaldson: Okay. Yeah.
So the trouble is that we aren't aware of a way of doing that in practice, like a model checker that can take a set of processes and do symbolic symmetry reduction. So, yeah, we could give this to something like Spin, say, and do symmetry reduction, or Murphi. But then that would be hopeless because of the problem of nondeterministic Boolean variables, right? So with a large number of Boolean variables the state space would be ->>: So you're saying ->> Alastair Donaldson: Infeasible. >>: So you're saying there's no symbolic implementation of that symmetry ->> Alastair Donaldson: Yeah. >>: I'm not saying that symmetry reduction would necessarily work well in that ->> Alastair Donaldson: But it's something we ->>: I'm trying to say that there's sort of two [inaudible] everyone was saying was going on here, right. >> Alastair Donaldson: Yes. So one is computing an efficient abstraction ->>: Right, one is the abstraction you're performing and the other is the symmetry quotient? >> Alastair Donaldson: Yes. >>: Which could in principle be applied to both, although you don't have the tools to apply them [inaudible]. >> Alastair Donaldson: Right. >>: So you're saying that I should really multiply, like, the abstraction -- like the 13 on the first row -- I should really multiply it by six to get the true time? >> Alastair Donaldson: Well, to get the true time that we actually spent. So we left this running, like, over a week or whatever. And that's how long it actually took, because we just abstracted thread one and then we abstracted thread two and thread three. But if I had sat down for like two days or something or less -- or maybe three hours, I don't know how long -- I could implement a tool that could take the abstraction from one thread and produce, just using syntactic changes, abstractions for the other threads. >>: Okay.
>> Alastair Donaldson: So it would be pretty unfair of us to report that multiplied time, given that this is like an implementation trick that we just haven't had time to do ourselves. So we had to -- well, we used a timeout longer than one hour for those examples: we multiplied the timeout for abstraction by the number of threads. >>: [inaudible]. >> Alastair Donaldson: Because ->>: Okay. >> Alastair Donaldson: Yeah. I showed you these comparisons of the unsound approaches before, showing they're no good, and I just wanted to show you that the symmetry-aware approach is good. It gives you the right results in all cases. Okay. So future work -- some of it's in progress. So this CONSTRAIN-style refinement is important, because while we don't have this, we had to give quite a list of predicates manually for our examples. So if we give fewer predicates, then SatAbs can't find any more useful predicates and the abstraction isn't precise enough. So I'm looking forward to solving that problem. The abstraction time is very high because we don't take advantage of procedural information, right, so when we're abstracting one procedure, we're considering predicates from all other procedures. But the challenge with concurrency, which I would love to talk to you guys about, is that when you're abstracting over these passive threads, they could be in any procedure. So you don't know where they are. Yeah. So I don't know if there are, like, any cool analyses to deduce that two procedures can't be mutually occupied, or, you know, I don't know if that really happens in practice in interesting concurrent programs. We've got a very crude concurrent alias analysis. It would be nice to get a smarter one to deal with heap-manipulating programs. There's this large-block encoding that's been successful in the Blast model checker.
It would be good to try that, but with concurrency suddenly it seems less effective, right, because you can't just merge together a block of statements if some shared-variable access is in the block of statements. So we would need to do this large-block encoding much more conservatively than it can be applied in sequential programs. As I said, we're going to look at trying to analyze programs with weak memory models. And I'm not so interested in parameterized verification personally. I think, you know, for me showing that something works up to an interesting thread count is kind of enough. But my colleagues are very interested in parameterized verification and have techniques from CAV last year which in principle could be applied directly in our current setting, to use this cutoff detection method to show that once you've checked up to a certain number of threads, any larger thread count will be safe. For some technical reasons we can't yet apply that. Like, in theory there's no reason why we shouldn't apply it. But there are some technical reasons, which I don't fully understand, why we can't yet. But, yeah, this is something that they're definitely going to look into. Finally, just to summarize -- well, maybe I don't need to belabor the point ->>: I'm sorry. Because when you do the counter abstraction with acceleration for certain types of parameterized systems, finite state, you can -- I mean [inaudible] results, right, counter abstraction ->> Alastair Donaldson: Yeah. >>: Acceleration, right? >> Alastair Donaldson: So this is [inaudible] right? >>: Yeah. >> Alastair Donaldson: Yeah. But that's extremely expensive, [inaudible] exponential. And at CAV last year, my colleagues have a more efficient technique where you check your system for thread count after thread count after thread count, and in successive checks you look for something -- I'm not quite sure what it is, but it's something that tells you that's enough, you don't need to go any further. Right?
But implementing it, the trick -- the problem is implementing it symbolically is nontrivial. So they have an explicit state algorithm for doing it. And I think they're working on a simpler version but don't yet have it. Okay. So in summary, instead of expanding P and abstracting the resulting program, which would be very expensive both to abstract in jack we're abstracting at the template level, which is scalable, expanding this, but checking it by exploiting symmetry and because symmetry reduction gives you a bisimulation, then this is a sound thing to do. We publish this as a technical report as well as having submitted the paper to CAV. And, yeah, I would love to talk to you all about it now or you can e-mail me if you're interesting in getting hold of the tools and trying it. [applause]