>> Rustan Leino: Good morning, everyone. Welcome to this morning's talk, one of them. I'm pleased to introduce Aquinas Hobor, who is a Ph.D. student with Andrew Appel at Princeton. He's coming here by way of all kinds of corners of the earth: Singapore, Bangalore, even New Jersey. And he's going to tell us about his Ph.D. work, which he's in the process of wrapping up.
>> Aquinas Hobor: Good morning. This is joint work with Andrew and Francesco Zappa Nardelli at INRIA.
So why are proofs about code hard? Math proofs are usually of this form, whereas proofs about real programs usually look somewhat more like this. But there are always some exceptions and usually some underlying assumptions.
And the real point is there's a lot of detail. So we have two choices. The first choice is we can sort of isolate the core idea, try to prove something about it on paper, and then hope that the implementation doesn't make too many mistakes. Of course, it does make some mistakes.
And the second choice is we prove something about the actual code. The problem is that proving things about code by hand is really hard. So the solution is machine-checked proof: use a computer to check your proofs. There are a lot of systems out there. We use a system called Coq, which is out of France.
Ideally the computer is going to actually help do some of the proof as well. And an issue starts to emerge, which is that these proofs are very large and very time-consuming to write. And so proof engineering becomes an issue, just like software engineering becomes an issue in a large software system. So what's the goal of oracle semantics? We want to add concurrency to a large system in a provably correct, modular way. Usually the systems already exist in some kind of sequential form.
We start with a compiler for sequential code, or some proof about some sequential thing. We'd like to reuse the code and, in particular, the machine-checked proofs. Usually the proofs are actually larger than the code.
We'd like to be able to reuse that whenever we can. And the key is going to be isolating the sequential and concurrent reasoning from each other. So this is the project we applied the technique to: the CompCert project by Xavier Leroy. Basically, there's a sequential source program in a language called Cminor. One way to think about Cminor is that it's the top-most intermediate representation in a compiler.
It's so high-level that you could imagine actually writing a program in it. You feed the program into the CompCert compiler and out the bottom comes a target program in PowerPC assembly. The innovative thing about this compiler, when it was done in 2006, was that the source and the target both have an operational semantics, separately specified in Coq. And there's a correctness proof which says that any time you put a source program into the compiler, the operational semantics of the source is equivalent to the operational semantics of the target.
In other words, the compiler hasn't introduced any bugs. So a question that might occur is: what's the relationship between your source program and the operational semantics? In particular, this is the work that Appel and Blazy did. The observation is that Xavier found it most convenient to relate operational semantics at the source and the target level, but frequently when you have a source program you'd like an axiomatic semantics, like a separation logic, to reason about your program.
So now, in addition to writing the program, the user is going to provide a program verification with respect to the separation logic. Appel and Blazy designed the separation logic and a soundness proof which connects the axiomatic semantics to the operational semantics, and then you have an end-to-end result, because you understand what the semantics are at the source level and the compiler guarantees that the semantics are translated to the target level.
So we'd like to move to concurrency, and all kinds of things have to change. We're going to have a concurrent source program and a concurrency-aware compiler, which is going to produce PowerPC with concurrency, too.
We have concurrent semantics at the source level and target level. We have a concurrent separation logic for the axiomatic semantics. And the soundness proof and the compiler correctness proof both have to be updated to be concurrency-aware. Just to review for a second what this would mean: we'd have a machine-checked, end-to-end proof of a concurrent program that was actually executing on the machine. So you would know that the actual bits the chip was reading have the behavior that you specified, and it would be machine-checked end to end.
Okay. So I'm going to go through this one component at a time. Now, the source program, of course, is provided by the user. What we do is add a few new statements to the language of Cminor to allow him to program in a concurrent way.
We add five new statements: lock, unlock, make lock, free lock, and fork. This supports the Pthreads-style concurrency model that programmers are familiar with. Make lock and free lock are bookkeeping instructions. Lock, unlock, and fork do the standard thing. The unusual thing here is that make lock takes a resource invariant R. And R describes what resources a thread is going to acquire when it locks a lock and release when it unlocks.
So here's a little example program. We use make lock to indicate that we're turning L from a regular piece of data into a lock, and we provide an invariant there that says that X points to an even number. It's in blue in a different font because the point is that it's a predicate, as opposed to code. Our system just lets you intermix them like that.
And then we're going to fork, the child will be passed L, and then here we're going to lock L, fiddle with X a little bit, and unlock L. And of course we'd like to know that the child isn't going to be upset whether it runs here or whether it runs there.
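To make this concrete, here is a minimal Coq sketch of the one unusual idea, that make lock carries a resource invariant as a logical predicate rather than as code. The statement type and all the names here are illustrative, not the actual Concurrent Cminor definitions:

    (* Illustrative sketch only: a toy statement type where Makelock
       carries a predicate on the guarded value. *)
    Definition addr := nat.
    Definition inv := nat -> Prop.    (* a resource invariant, as a predicate *)

    Inductive stmt : Type :=
    | Skip : stmt
    | Seq : stmt -> stmt -> stmt
    | Store : addr -> nat -> stmt
    | Makelock : addr -> inv -> stmt  (* the invariant is logic, not code *)
    | Freelock : addr -> stmt
    | Lock : addr -> stmt
    | Unlock : addr -> stmt
    | Fork : stmt -> stmt.

    (* The example: make l a lock whose invariant says the value is even,
       fork a child that is passed l, then lock l, touch x, unlock l. *)
    Definition example (l x : addr) (child : stmt) : stmt :=
      Seq (Makelock l (fun v => exists k, v = 2 * k))
          (Seq (Fork child)
               (Seq (Lock l) (Seq (Store x 0) (Unlock l)))).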
So, again, we want to combine sequential and concurrent features. Here are the sequential features, all these control flow statements. I haven't talked much about what's inside Cminor, but there's a large variety of control flow constructs. We've added a couple of concurrency features, and we want to combine them. At the language level, that's the easiest thing in the world: it's just syntax, and they play nicely together.
So moving from there over to the far side: the concurrent operational semantics is really where the heart of the work is. The goal is to design a semantics that makes the various proofs about that semantics easier and gives us the separation that we want.
So here are some different kinds of reasoning: sequential separation logic soundness and concurrent separation logic soundness, sequential compiler correctness and concurrent compiler correctness, and other kinds of things. We'd like to add some magic so we can basically reason over here independently of what's going on over there, and reason over there independently of what's going on over here.
So how are we going to do it? The first thing is we're only going to consider well-synchronized programs. In other words, we're going to consider programs that are data-race-free, that have Dijkstra semaphores, that kind of thing.
And our operational semantics is actually going to get stuck on ill-synchronized programs. So we're going to start with a small-step operational semantics for Cminor, essentially. There's a step relation: it takes a sigma and a kappa to another sigma and kappa. Sigma is a state: rho is the local variables and m is the memory. Kappa is the code, the control stack.
We're going to add a new thing to the state: w. w is a world. It's like memory in that it's a map from addresses to something. But unlike memory, which is a map from addresses to values, a world is a map from addresses to something called ownerships.
So there are a variety of ownerships. Here are some basic ones. The first kind is None; that means we don't own the address in question. The second kind is Val; that means we own it and it's data, so we can read it and we can write it. The third kind is Lock. A Lock has to come with an invariant; where the invariant comes from in the semantics is the make lock instruction. What make lock does is turn a Val into a Lock and add the invariant there. And the key point is that our world-aware semantics is going to get stuck if you use memory without ownership.
So if you try to read where you have None, or if you try to lock a Val, or anything like that, you're just going to get stuck.
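As a rough Coq sketch of worlds and ownerships, continuing the sketch above, and with one important caveat: in the real development a lock's invariant is itself a predicate on worlds, and breaking that circularity is exactly what the modal model discussed later is for, so here the invariant type is left abstract:

    (* Illustrative sketch: ownerships, with the invariant type abstract. *)
    Parameter Inv : Type.   (* stands in for "predicate on worlds" *)

    Inductive ownership : Type :=
    | ONone : ownership            (* not ours: any access gets stuck *)
    | OVal  : ownership            (* data we own: reads and writes allowed *)
    | OLock : Inv -> ownership.    (* a lock, carrying its resource invariant *)

    Definition world := addr -> ownership.   (* addr as in the earlier sketch *)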
What we're going to do is enforce that each thread has its own world, and all the worlds are disjoint. In other words, at any time only one thread is going to be able to access any given address.
Another thing is that sequential instructions, and the proofs about them, don't know about the existence of Lock. They know that there are some kinds of ownerships; they know about some of them; they know there could be others. And so, for example, when you're writing to a memory address, you check to see whether it's Val. You don't check to see that it's not Lock.
Okay. What does the concurrent step relation look like? We have four components. We have omega, which is a scheduler; that's just going to be a list of the threads that are going to execute. Once we've picked a scheduler, everything is deterministic. We're going to quantify on the outside over all schedulers, so we can handle any schedule you might want to throw at it.
Then we have a list of threads. A thread is just some local variables, which could be registers at the low level or just variables at a higher level; a world, which says what pieces of memory it has ownership of; and some code. Then there's one memory for the whole machine. And then there's another piece called a global world.
Basically everything is owned by something at all times. So if you unlock a lock and you give up
some resources, those resources have to go to someone. If there's no one ready to take them,
they get put into the global world.
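Continuing the Coq sketch, the shape of the concurrent machine state might be rendered like this, reusing the types from the earlier sketches; again, the names are illustrative rather than the actual development's:

    (* One thread: locals rho, its private world, and its control kappa. *)
    Record thread := {
      th_locals : nat -> nat;
      th_world  : world;
      th_ctl    : stmt
    }.

    (* The whole machine: a scheduler omega, the threads, one shared
       memory, and the global world that owns whatever no thread holds. *)
    Record cmachine := {
      cm_sched  : list nat;        (* thread ids, in the order they run *)
      cm_thds   : list thread;
      cm_mem    : addr -> nat;
      cm_global : world
    }.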
So when you want to execute a sequential instruction, what you do is use the world-aware sequential step relation. You just call it like a procedure, and you don't have to worry about what's going on inside.
When you want to execute a concurrent instruction, you actually have to do some work. So, for example, the concurrent step relation will update memory at a lock to flip the bit from zero to one and back. It will maintain the thread list, in general, but in particular for fork. And the third thing it does is transfer world around, typically between threads and the global pool.
At unlock you transfer world to the global pool, whereas at lock you transfer world from the global pool. And we have a picture.
So this is time going down, different threads going across, and the global world there off on the side. Thread B is going to unlock L, and W_R is going to be the unique world that satisfies L's lock invariant. I'll explain a little bit how that can be, but there is one. So after doing that, W_R has been transferred from thread B to the global world.
Sometime later, thread A is going to lock L, which is going to transfer W_R from the global world to thread A. And what this means is that memory that used to be visible and changeable by thread B is now visible to thread A. So B has communicated with A.
So this is how processes communicate. I thought I would show an actual rule here. This is the unlock rule, and it's going to illustrate a little bit about how we avoid race conditions. First of all, at the head of the scheduler we're executing thread i, and the thread associated with i is currently at an unlock instruction, and its world is W.
We're splitting W into two pieces, W' and W_lock. Then we're looking up and discovering that L is a lock with invariant P; that's what the squiggly line means, a lock with an invariant. And then we're going to check that P holds on W_lock.
So, in other words, a property of anything you can put inside a lock is that there is a unique subworld on which it holds. That way W_lock is unique. And there's a strange thing here, which is that this semantics actually isn't constructive; you couldn't actually run it. And the reason is we're providing a classical logic to the users, and the user is going to be able to write down whatever he wants to write down.
He's going to write down some complicated thing in his make lock instruction, and over here we have to actually test whether or not that thing holds. And so we can't. Fortunately, we don't actually have to run our semantics; what we have to do is compile it. And having a nonconstructive semantics doesn't prevent us from compiling it at all, or prevent us from reasoning about cases where it wouldn't get stuck, where we can prove that the test would actually pass.
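The "unique subworld" condition just mentioned is the standard notion of a precise predicate, which can be sketched in Coq like this, with the disjoint combination of worlds left abstract:

    (* join w1 w2 w: w splits into disjoint pieces w1 and w2. *)
    Parameter join : world -> world -> world -> Prop.

    (* P is precise: within any world w there is at most one subworld
       satisfying P. This is what makes W_lock unique in the unlock rule. *)
    Definition precise (P : world -> Prop) : Prop :=
      forall w w1 w2 w1' w2',
        join w1 w2 w -> join w1' w2' w ->
        P w1 -> P w1' -> w1 = w1'.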
Another unusual feature is that we interleave only very rarely. In general we only interleave when we reach a concurrent instruction, like a lock or unlock. If we have a long sequence of sequential instructions, we'll execute them all without doing any interleaving at all. And the reason we can get away with this is that we have well-synchronized programs. The whole reason Dijkstra introduced semaphores was so that you could reason in this style without having to worry about each interleaving.
So now we want to reason about concurrent execution. And there's an issue, which is that most of the time concurrent programs are actually executing sequential code. And sequential features are usually hard enough to reason about; I illustrated that there's a whole bunch of sequential control flow, and those constructs can get quite nasty to reason about.
And Cminor doesn't even have things like exceptions, which can get nastier still. One thing we really don't want is the extra complexity of concurrency when we're reasoning about sequential code. So in the middle of returning from your function call, you don't want to worry about a context switch.
So the idea is: why don't we just pretend it's sequential? Here's a little snippet of code: two sequential instructions, followed by a concurrent instruction, and then another sequential instruction. We'd like to reason about this. But when we try to use a sequential way of looking at things, we just get stuck: we're able to make progress until we hit the lock, and then at the lock we can't make any progress.
So our idea is to augment our relation with another feature: an oracle. The oracle is going to be constant over sequential steps, and sigma one and kappa one on the left are the same as sigma one and kappa one on the right.
So, in other words, for a sequential instruction, if you know how it behaves over here, you know exactly how it behaves over there. That allows you to recover all kinds of sequential reasoning that you might do.
When you get to a lock, what's going to happen is you consult this oracle, and it will tell you what the result will be, what memory will look like after you finish this operation. Afterwards you just continue: the next one is another sequential instruction, the oracle again is unchanged, and you can just continue to reason about this sequence.
So how do we do this? An oracle has a number of components: it's got a scheduler, it's got a bunch of other threads, and it's got a global world. What it's doing is simulating running all of the other threads until the scheduler returns control.
Again, this is nonconstructive; this is a classical kind of thing. You could reduce the halting problem to it. But again, what we're trying to do is compile things, not actually run them.
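The shape of the oracular step relation can be sketched in Coq as follows, continuing the earlier sketches. The point is only the structure: sequential steps ignore and preserve the oracle, while concurrent steps consult it. The underlying relations are left abstract, and all names are illustrative:

    (* The oracle packages everything outside the current thread. *)
    Record oracle := {
      or_sched  : list nat;
      or_others : list thread;
      or_global : world
    }.

    Definition state := (world * (addr -> nat))%type.   (* sigma, simplified *)

    Parameter seq_step      : state -> stmt -> state -> stmt -> Prop.
    Parameter is_concurrent : stmt -> Prop.
    Parameter consult       : oracle -> state -> stmt ->
                              oracle -> state -> stmt -> Prop.

    Inductive ostep : oracle -> state -> stmt ->
                      oracle -> state -> stmt -> Prop :=
    | ostep_seq : forall o s k s' k',
        seq_step s k s' k' ->
        ostep o s k o s' k'             (* the oracle is untouched *)
    | ostep_conc : forall o s k o' s' k',
        is_concurrent k ->
        consult o s k o' s' k' ->       (* the oracle says what happens *)
        ostep o s k o' s' k'.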
>>: You call it an oracle, and that makes me think it's going to do something good for you. But it's choosing demonically, is that right?
>> Aquinas Hobor: No, it does do something good. It tells you: after you finish this lock, this is what memory is going to look like.
>>: Right, but suppose that your program works only if memory looks a particular way.
>> Aquinas Hobor: You're going to have to show that for any oracle, for any other set of threads that obeys your locking discipline.
>>: So the oracle doesn't always return it at the same time that the program would actually --
>> Aquinas Hobor: Well, in fact, you won't be able to, for example, prove a Hoare triple, because the oracle will force you into a bad state.
>>: The methods.
>> Aquinas Hobor: I suppose. I think of it as a force for good.
Okay. So we need a connection between reasoning in this oracular style and reasoning on the concurrent machine, which is roughly as follows: if a thread executes in a certain way on the oracular machine, then it executes in the same way on the concurrent machine. That means you can reason about each thread independently, and then you can combine your oracular understanding of how each one executes into an understanding of how the entire machine executes.
Okay. So now, the axiomatic semantics. This is intended for a general audience, so I'll go very fast here. We have Hoare triples: precondition, command, postcondition. In our setting these pre- and postconditions are just predicates on state. So here's our Coq definition.
Hoare logic is a set of axioms for deriving valid triples. Here's the sequence rule. Something to point out is that this rule is usually false in concurrent settings, because you can have a context switch between the first instruction and the second instruction, and so the postcondition guaranteed by the first instruction won't actually be true by the time you get to the second instruction.
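In the spirit of the Coq definition on the slide, a sketch: assertions are predicates on state, a triple is then just another proposition, and the sequence rule can be stated directly. The triple itself is left abstract here, and the rule is stated as an axiom purely for illustration; in the development it is a proved lemma:

    Definition assert := state -> Prop.

    (* The meaning of a triple is defined semantically in the development;
       here it is left abstract. *)
    Parameter triple : assert -> stmt -> assert -> Prop.

    (* The sequence rule. In our setting it survives concurrency because
       sequential steps are never interrupted by a context switch. *)
    Axiom seq_rule : forall (P Q R : assert) (c1 c2 : stmt),
      triple P c1 Q -> triple Q c2 R -> triple P (Seq c1 c2) R.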
A big problem with Hoare logic is dealing with pointers. So if we know that X points to zero and Y points to zero and we update X, the problem is that it's not quite clear what the postcondition is: X could be one and Y could be zero, or they could both be one, because they're aliased. So Reynolds and O'Hearn introduced the separating conjunction, which says that the left half and the right half hold on disjoint pieces of memory, and so afterwards you know that X and Y are not aliased.
And the nice thing about this is you can have more local reasoning for programs with pointers: if you know some Hoare triple {P} C {Q}, you can add a frame to the left and the right, and your Hoare triple will still hold. Obviously this doesn't hold with ordinary conjunction, because of aliasing concerns.
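A sketch of the separating conjunction and the frame rule over the simplified state above, with join the abstract disjoint combination from before; the frame rule is again stated as an axiom only for illustration:

    (* P * Q: the world splits into disjoint pieces satisfying P and Q. *)
    Definition sepcon (P Q : assert) : assert :=
      fun s =>
        exists w1 w2,
          join w1 w2 (fst s) /\ P (w1, snd s) /\ Q (w2, snd s).

    (* The frame rule: extra disjoint resources F ride along untouched.
       The analogous rule with plain conjunction fails due to aliasing. *)
    Axiom frame_rule : forall (P Q F : assert) (c : stmt),
      triple P c Q -> triple (sepcon P F) c (sepcon Q F).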
So, concurrent separation logic. The original concurrent separation logic was by Peter O'Hearn; this is a more advanced version. It includes sequential separation logic as a proper subset; that's sequential separation logic sitting inside concurrent separation logic. And it also has extensions to deal with concurrency. For example, there's a new predicate: you can write that L is a lock with invariant R. We saw it in the operational semantics rules. And we have various triples for reasoning about these lock and unlock statements.
So, for example, if we know L is a lock with invariant R and we lock L, then afterwards we know L is a lock with invariant R and we also know R. There are several ways of formulating an unlock rule, but this one is just the reverse of the lock rule: if we know L is a lock with invariant R and we know R, then after the unlock we've lost R.
And the beautiful thing about this is that programs verified in concurrent separation logic are well synchronized, which means their operational semantics will be well defined.
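The lock and unlock rules can be sketched in the same style. Here is_lock l R stands for "l is a lock with invariant R"; note that the real unlock rule guards R with the "later" modality discussed below, which this sketch omits:

    Parameter is_lock : addr -> assert -> assert.

    (* Locking acquires the lock's resource invariant... *)
    Axiom lock_rule : forall (l : addr) (R : assert),
      triple (is_lock l R) (Lock l) (sepcon (is_lock l R) R).

    (* ...and unlocking surrenders it: afterwards we have lost R. *)
    Axiom unlock_rule : forall (l : addr) (R : assert),
      triple (sepcon (is_lock l R) R) (Unlock l) (is_lock l R).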
I just thought I would show very briefly (I've gotten rid of the global world here) how thread A would prove the lock rule. It's got some frame F, and also L is a lock with invariant R. Afterwards it also knows R, and here's how, by combining the lock rule and the frame rule, you can prove this triple.
Now I just want to talk about the program verification. Here's the example program, and I'm going to verify this little piece right here. There's some frame, of course, and then we know L is a lock such that X points to an even number. We lock L. And then we know at this program point not only everything we knew before, but also that right now X points to an even number. That means we can now reason about this instruction. Now, of course, X points to an odd number. We increment again; this, of course, is equivalent to this with a slightly larger value of Y.
And that lets us prove our unlock rule, because now we know the lock invariant, so we can give it up with the unlock rule.
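The outline of that verification, written with assertions between the commands; "even x" abbreviates "x points to an even number", F is the frame, and the notation is schematic rather than the actual Coq syntax:

    (*  { F  *  is_lock l (even x) }
        lock l
        { F  *  is_lock l (even x)  *  even x }
        [x] := [x] + 1
        { F  *  is_lock l (even x)  *  odd x }
        [x] := [x] + 1
        { F  *  is_lock l (even x)  *  even x }
        unlock l
        { F  *  is_lock l (even x) }              *)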
Putting it all together, it looks like this, and then here's the whole program. And there are some lessons. The first lesson is that there are a whole lot of details that start coming in; actually, some have been omitted in this presentation.
And what this really leads to is why machine checking is very important. This process has actually been carried out for some larger example programs in our system in Coq, so fully machine checked.
Another point is that we had to do all of this by hand here, and machine generation would be really helpful. Something that would actually generate all or most of this stuff would be really terrific, and I'll talk a little bit more about that at the end.
So now, the soundness proof. This is the connection between the axiomatic semantics and the operational semantics. What are some difficulties? One difficulty is that invariants need to refer to other invariants. So, for example, here's something where the next value in memory points to an even number: this is a lock that's guarding the next value in memory, which must point to an even number.
Here are two locks like this. And once you create something as cool as this, what you really want to do is tell your friends about it, and what that really means is you need to put it in another lock. So here's another lock guarding a pointer, and if you grab this lock, what you know is that the pointer points to an object of that second kind.
And the first thing to say is that nested invariants like this are in general difficult to model: they're a self-referential property, which can be very nasty. So one thing you might say is, well, why don't we just be first order about it? The problem is that most programs you can imagine actually have locks that refer to other locks all over the place. That's how communication gets started. That's how you pass function pointers around: function pointers you want to pass locks to, locks you want to put inside function pointers, function pointers you want to put inside locks. Any way you do it, to get communication started you really need some kind of higher-order setting.
So what do we do to solve this nested invariant mess? What we did is define a modal logic. Basically, when a logical proposition says when and where another logical proposition holds, you have a modal logic. We have a modal operator "later", written with a triangle; you may have noticed the triangle in the unlock rule.
Later P means that P is going to hold at all times strictly in the future; it might not hold right now. And the key is that this avoids circularity, because right now we may not know much, but in the future we'll know something, so things can become well defined. And it gives us a nice induction principle.
In addition to talking about time, we also talk about space, with the separation operator, and there's an operator "fashionably", which we write with a circle. We have a lot of fun with these operators; we have lots of them, and they're great. Fashionably P means P holds on all worlds of equal age, which means the only thing you're allowed to assume about the world part of P is its age.
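A drastically simplified Coq sketch of these two operators, where a world is paired with a natural-number age and "strictly in the future" just means "at a strictly smaller age"; the real model, discussed next, is considerably richer:

    Definition aworld := (nat * world)%type.   (* (age, world) *)

    (* later P: P holds at every strictly smaller age of this world. *)
    Definition later (P : aworld -> Prop) : aworld -> Prop :=
      fun aw => forall n, n < fst aw -> P (n, snd aw).

    (* fashionably P: P holds on every world of the same age, so P may
       depend only on the age. *)
    Definition fashionably (P : aworld -> Prop) : aworld -> Prop :=
      fun aw => forall w, P (fst aw, w).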
So then we need to construct a model for our modal logic. We use the "very modal model" of Appel, Melliès, Richards, and Vouillon, which was presented at POPL 2007. We had to extend it a bit, because we have spatial properties as well as modal properties, so we did that. And a point we've discovered is that semantic models tend to scale better in large systems; that's part of why we built up this semantic system.
Sort of an engineering point: our invariants are semantic, as I mentioned, and they're shallowly embedded in Coq. What this means is that they're easy to use and reason about in the theorem prover; we get to use the same tactics at the Coq level and at the invariant level. We also don't have to reason about binders. There's this huge POPLmark challenge, and people are submitting their work about, well, maybe you should use locally nameless terms, or this, or that.
We don't have to deal with that at all: we use the binders that Coq provides. They're easy to deal with, they're simple, and it really simplifies all the engineering work.
There's another modeling difficulty. As I showed in the example program, we actually embed our assertions directly into program syntax. What this means is that the definition of syntax has to depend on the definition of predicates, naturally.
On the other hand, one type of predicate we'd like to be able to specify is that something is a function: F is a function with precondition P and postcondition Q. That's useful for forking, and also if you want to put a function inside a lock.
This is actually a predicate about function behavior, and function behavior, of course, is defined in terms of program syntax. So there's a very nasty circularity here, where program syntax depends on predicates but there's a predicate that depends on program syntax.
We solve this in a way similar to the previous circularity: we actually define our Hoare triple using our modal logic, so our Hoare triple just becomes a predicate on state like any other predicate. One hypothesis is that we may be able to reason about things like a JIT or other self-modifying code with this model as well. That's sort of a little bonus.
I thought I would just put it up there for a second. This is our actual definition of our Hoare triple. I'm putting it up both because it's nasty and because we actually have to prove all of our Hoare rules sound. If we didn't have the shallow embedding, proving something about something this large would be a real nightmare, because we would constantly be worrying in our proofs about shifting and variable renaming and all that kind of mess, and we'd have to have special tactics at the invariant level. It would just be a horrible mess.
So the fact that we're able to define something at the invariant level and then actually reason about it in any kind of reasonable way is an illustration that the shallow embedding is a big win.
Okay. Now we need to prove our Hoare rules sound. And the key is that we're going to prove things relative to the oracular machine instead of relative to the concurrent machine. We have all of these machine-checked proofs by Appel and Blazy, which had been written before, and the great thing about proving things relative to the oracular machine is that we're able to basically reuse all of those proofs. So proofs that had taken two people six months to write, one person was able to push through the machine checker in about a week.
And as I pointed out, the sequence rule, for example, is usually false when you try to prove it in a concurrent setting. Not only is it true in our setting, it's so true that the previous proof was able to be reused. So that was very nice.
We also have to prove the axioms that are unique to concurrent separation logic, like the axioms about lock. Obviously we can't recycle any proofs for those, because nobody had done them before we got to them.
But when we were doing them, at least we were able to ignore all the difficulties of sequential control flow. When we were reasoning about lock, we got to ignore the fact that the actual concurrent machine had all this complicated sequential control flow, and we were able to focus just on the concurrent behavior.
So this is probably a good time to talk about the machine-checked proofs. We ended up developing a slogan, which was that, as expected, it took longer than expected. We have about 62,000 lines of proof; I think when Rustan saw this last it was about 58,000, so we've been working hard. There's a wonderful quote by Xavier Leroy here, which is very true, thank God: building these things is very addictive, in a video game kind of way.
>>: (Inaudible).
>> Aquinas Hobor: Maybe. Maybe. There's -- I wish. I would even consider paying them. But it is fun, which is good. So what's the status? All the definitions are done in Coq. All the sequential rules are done. The concurrent separation logic rules are all done except for the unlock rule; we've got a little piece left. That's not actually the hard rule; the difficult rule is actually lock, which is done.
So there's a little bit of bookkeeping left to do at the end there.
There's also the connection between the oracular reasoning and the concurrent reasoning; I currently estimate it's about 90 percent done. As expected, it might take longer than expected, but we're pretty confident about where things are.
So I want to talk briefly about future work. We have to modify this compiler to be concurrency-aware. Now, this is an optimizing compiler, but it's not heavily optimizing. And basically we think the compiler can translate lock and unlock as function calls; all the intermediate levels of the compiler understand function calls. So it's our hope that the compiler may need no modification, or only very minor modification, because it doesn't do the kind of aggressive optimizations that would be difficult for concurrency.
There's one issue one might worry about, which is that these make lock constructions actually have a predicate inside them. So one worry would be that the compiler has to look inside and modify this predicate as it compiles down through the intermediate levels.
We've actually very carefully designed our predicates so that they look only at memory and at the world. The world is, of course, a virtual thing, so that's fine; the compiler is not going to change that at all. And the CompCert compiler has been carefully designed so that each intermediate memory level is preserved. So the compiler shouldn't even have to look at the make lock statement. In fact, as far as the compiler is concerned, the predicate is just going to be unit, which actually turns out to be useful when you do extraction to take your compiler from Gallina, which is the language it's built in, built into Coq, and turn it into OCaml so you have reasonable performance.
So then, the operational semantics at the target level: there's going to be an oracular machine down here, an oracle step down here. One additional feature you have to worry about is so-called weak memory models, which is to say that real processors don't interleave the way we do at all. They actually don't interleave at all; they have out-of-order execution with all kinds of constraints, and while sequentially you can't actually tell this is going on, concurrently you can.
We think that for well-synchronized programs our interleaving model should be fine, but this is something that we have to prove. Then, of course, the compiler correctness proof has to be updated for concurrency. And it's our hope that we'll be able to reuse the vast majority of it, just like we were able to reuse all the sequential rules from Appel and Blazy. So that's future work, but we're reasonably confident: the compiler, of course, will be modified very little, so hopefully the associated correctness proofs won't have to be modified very much.
So, papers and related work. Peter O'Hearn did the original concurrent separation logic, and Stephen Brooks had the original soundness proof. The original concurrent separation logic didn't have functions or anything like fork; it had a different concurrency primitive.
And it didn't have locks that could refer to other locks, so locks were static, fixed at the beginning of the program, and global to all threads. You didn't have to worry about this sort of self-referential behavior and things like that. But anyway, Stephen Brooks did the original soundness proof. This is our paper.
This is work done by people in Cambridge. Alexey Gotsman is a graduate student at the University of Cambridge, and Josh Berdine, Byron Cook, and others work at MSR Cambridge. They came up with a similar concurrent separation logic. It's roughly isomorphic to ours, which I think is an indication that the logic we came up with is the natural way to extend concurrent separation logic to Pthreads-style concurrency.
Now, ours is more powerful in a variety of ways, and we have a machine-checked proof. But anyway, that's definitely good work.
Okay, so then there's some talk about semantic models. This is the model for the modal logic; this has appeared now. What we discovered engineering-wise was that we would get to a point in the proof where things were very nasty, and the solution was to stop, back out, and try to express the problem at the level of the modal logic. Maybe we have to define new operators, like fashionably, or lately, or all these different kinds of things, and prove things about those operators, reformulate the problem at the level of the modal logic, and reason there. Engineering-wise, that was the right technique, and it made the job a lot easier, so we talk about our experience there.
This is another paper where we explain how semantic methods tend to work and how we found them useful; this just got accepted at PCC. The sequential separation logic was in TPHOLs last year. And then, at the very end, I said that at the moment we're assuming the user will actually write all the verification pieces themselves.
Alexey Gotsman and company at Cambridge developed an initial tool that can try to infer lock invariants and provide these verifications. It works in a simplified setting; for example, I think all the locks have to be declared in the main thread, so you have a fixed number of locks.
But it's great work, and definitely something that helps the user generate that material is a big win.
There's also a senior thesis at Princeton that takes Alexey's algorithm and implements it in Coq, and proves that when it reaches a fixed point, it actually yields a proof in concurrent separation logic, and there's a verifier. So there are various people working in that direction.
A related thing I've done, which I can actually talk about here, you know, the only place around that I can: I worked at the Center for Software Excellence, a few buildings over, when Manuvir Das was still running things, and developed an annotation checker to find concurrency bugs in Windows.
They don't actually tell me what the story is these days, but I've been told it's running over the Windows code base, so the techniques I developed for that are scalable, and you can use them on fairly large pieces of software.
So that's the Concurrent Cminor project, and I wonder if anybody has any questions.
(Applause).
>>: So would you give me a rough breakdown of your code size and complexity? You gave that (inaudible) but you didn't give it for -- if you could give us the definitions versus the size of the proof.
>> Aquinas Hobor: The definitions are actually not so simple. But, let's see. So I have 62,000 lines. Now, I should point out that it's in development. So one of the things that you do is you write for a while and then you say, okay, this is too ugly for words, I have to go back; so it's not linear.
And some of it has by now gotten to a pretty polished state, and some of it is pretty rough. All of that said, the sequential separation logic soundness rules are about 25,000 lines. So that's one lemma for each of those Hoare rules, plus definitions.
>>: Do you know how many Hoare rules?
>> Aquinas Hobor: I don't know; I'd have to count. You know, the truth is, like I said, it required very little modification, and I actually had the person who had written it; he spent a week and did it. I haven't looked at them almost at all. In some sense, bad me, but in another sense it really shows the isolation is very strong. There's a bunch, though.
So Cminor has various kinds of nasty control flow. For example, you can break out of loops multiple levels: say you have nested loops, you can break out two levels immediately. And function call and return, ifs, loops, all kinds of stuff.
We don't have, for example, exceptions. But, I don't know, maybe there are 25 rules, 30 rules. These proofs get remarkably long; even just reasoning fully sequentially, they get remarkably long remarkably fast. So that's, let's say, 25,000 lines. And that was kind of cool.
Then there are about 15,000 or 20,000 lines that are basically borrowed from Xavier Leroy: the definition of sequential Cminor, the definitions of the memory model, and related library-type stuff.
So that gives us about 40,000. So I would say the actual concurrency stuff is only about 20,000. That's about 5,000 lines for the actual Hoare rules, of which most are not that bad. The lock rule might be 2,500 lines, but unlock, fork, make lock, and free lock are relatively short.
And then the remaining piece is the definitions of all the machinery. That's probably 10,000 lines, five to ten, because we have three machines: the sequential machine, the concurrent machine, and the oracular machine. It's a real system, so the definitions are larger than in a toy system. I don't know, it might be five. And then the remaining piece is relating the oracular execution to the concurrent execution, and maybe that's five to ten.
So that's rough, but that would be my guess. In other words, the way I think about it is that the actual fully concurrent piece is about 10 percent of it. All of the machinery required to get it up and executing sequentially is more than 50 percent. The machinery required to get it executing concurrently might be another 15 percent, and then the connection might be the rest.
>>: So different threads can share locks?
>> Aquinas Hobor: Yeah. Something I didn't illustrate, because you have to hide some detail somewhere, is that we actually have fractional permissions.
So, speaking loosely again, we have fractional permissions for data, where we can have a situation where two threads can read from the same data as long as neither of them is writing.
And where you really need it is for locks, so that multiple threads can have pieces of a lock. And, of course, you can't free a lock unless you get 100 percent of it back, because you wouldn't want to free a lock that somebody still has a piece of. You only need epsilon to actually lock or unlock; you need 100 percent to free or make.
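A sketch of the fractional-permission idea, using rationals as shares purely for simplicity; the actual development uses a more structured share model:

    Require Import QArith.

    Definition share := Q.
    Definition full  : share := 1%Q.

    (* Any positive share of data lets you read; writing needs it all. *)
    Definition can_read  (s : share) : Prop := (0 < s)%Q.
    Definition can_write (s : share) : Prop := (s == full)%Q.

    (* For locks: epsilon suffices to lock or unlock, but making or
       freeing a lock requires the full share. *)
    Definition can_lock (s : share) : Prop := (0 < s)%Q.
    Definition can_free (s : share) : Prop := (s == full)%Q.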
>>: Is that part of the operational semantics or not?
>>: So have you also used these methods to prove some programs?
>> Aquinas Hobor: We have proved some programs; I mentioned that a little bit at the end.
>>: The proofs were done manually in Coq as well?
>> Aquinas Hobor: So in the extended version of our paper, which doesn't have a (inaudible), I've got to do that, I give an example program. That one was too complicated for Alexey's algorithm to actually do. For programs that Alexey's algorithm actually can handle, there's an automatic piece, and that's been run on those subsets of programs. That's definitely -- I mean, there's a lot of work to be done there to get those tools up to where you could take -- I didn't think my example program was all that hard, but the tool kind of died. It ran forever, which eventually meant that it was killed. But yes, there have been actual example programs, fully machine checked. Other questions? Well, thank you again.
(Applause)