>> Tom Ball: It's my pleasure today to have two speakers. We have Ohad and Guy, and they're going to SPLASH this week to give talks at OOPSLA, the technical track of SPLASH. And they're giving us a little preview today. Ohad Shacham is a Ph.D. student at Tel Aviv University under the supervision of Mooly Sagiv, who is here visiting, and Eran Yahav. His Ph.D. dissertation addresses questions of checking atomicity of concurrent collection operations. And prior to his thesis he worked at the IBM Haifa Research Lab on hardware verification. Guy Golan-Gueta, who is sitting in the audience here, is a Ph.D. student also at Tel Aviv University, also under the supervision of Mooly and Eran. And he's interested in various aspects of concurrency. In the past he was a software architect on several software projects and was responsible for the design and development of many high-performance critical systems. So we welcome them here today. Ohad will get started, and then we'll hear from Guy.

>> Ohad Shacham: Okay. Thank you very much, Tom. So I'll present the work "Testing Atomicity of Composed Concurrent Operations," joint work with Nathan Bronson, Alex Aiken, Mooly Sagiv, Martin Vechev and Eran Yahav. So writing concurrent data structures is hard. Modern programming languages provide us with a library of concurrent data structures, and we can use this library through an interface, where each one of the operations of the interface is atomic. The problem is that in many cases this interface is not enough, and the user needs to write a new operation that composes a few operations of the interface. We know that each one of these operations is atomic; however, we don't know whether the composed operation is atomic. So the question is: how can we test atomicity of this composed operation? So here we have an example of a bug we found in Apache Tomcat. In Tomcat version 5 they have an attribute map, ATTR, that maps from the name of an attribute to the attribute's object. They allocated a sequential hash map, and they also have a function removeAttribute that gets a name, the name of an attribute. What it does is first take the global lock on ATTR, then check whether the attribute is inside the collection. In case it is, it gets the attribute's object, removes it from the collection and returns it; otherwise, it just returns null. So the invariant that this function tries to maintain is that removeAttribute returns the value it removed from the collection, or null. What they did in Tomcat version 6 is say: okay, we have a concurrent hash map, a concurrent collection, so let's allocate a concurrent hash map, and since we know that each one of these operations is atomic, just remove this lock. And, of course, this breaks the invariant. Let's see an example of an execution that shows that the invariant does not hold. We start running removeAttribute with an input string A, and val is assigned null. Then we have a concurrent operation by a different thread that does a put with the same input key A and an object O. Then removeAttribute continues running. It checks whether A is inside the collection, and of course it is, because it was added here. So it enters the branch, and it gets the object O that was added here.
And before we reach the remove operation, another thread comes and does a concurrent remove operation with the same input key A, and then removeAttribute continues running. Here it tries to remove A, but it fails, because A was already removed here. So it continues running and returns, and the value it returns is the value that was read here, which is O. So actually this run didn't remove anything, yet it returned O, as if this is the value that it removed. Of course, this violates the invariant. So by atomicity in this work I mean linearizability. By linearizability I mean: given a concurrent execution, like the one you can see here, we say this execution is linearizable if there is a sequential execution built from these operations, removeAttribute, put and remove, such that the result of each operation in the concurrent execution is equivalent to the result of its corresponding operation in the sequential execution. If I'm looking at this concurrent execution, then, as we said before, removeAttribute returns O. You can see that put here returns null, because this is the first operation and the collection is empty. And remove returns O, because this is the value that was added here. So what I want to do now is check whether there exists a sequential execution such that every operation in the concurrent execution has the same return value as in the sequential execution. So actually we have here three options. In the first one, we first do a put, then a remove, and at the end run removeAttribute. You can see that put, as before, returns null because it is the first operation. Remove returns O because it runs just after the put. However, when removeAttribute runs here, you can see that at this point the collection is empty, because we added the key A and then removed it. So it runs and just returns null, unlike here. So this is not an equivalent sequential execution. Here we first run removeAttribute, and again it returns null, because it is the first operation and the collection is empty. So this is not an equivalent one either. And here we first run put and then removeAttribute. You can see removeAttribute here returns O, as it should. However, the remove that we run at the end returns null, because removeAttribute already removed the key A; therefore the collection is empty and the operation returns null. Okay. Therefore, this concurrent execution is nonlinearizable: we didn't find any sequential execution that is equivalent to this concurrent execution. Okay. So what we want to do here is test the linearizability of this composed operation. First we want to test the atomicity, where by atomicity we mean linearizability. >>: On the previous slide, sorry: when you say sequential, you really mean an interleaving. Because when I think of sequential, I think of the first two columns. In the third one you actually have some concurrency, right, in the sense that you're interleaving the execution, the full execution. But it's sequential in the sense that you execute the methods completely. It still has some aspect of concurrency. >> Ohad Shacham: I mean at the granularity of operations: I have one operation, and then another one, and then another one. >>: So operations are not interleaved. >> Ohad Shacham: Operations, no, but threads can be preempted. >>: But threads can be. >> Ohad Shacham: Yes.
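(For reference, the Tomcat composed operation discussed above follows roughly this pattern; this is a simplified sketch with illustrative names, not the actual Tomcat source:)

    import java.util.concurrent.ConcurrentHashMap;

    class AttributeHolder {
        // Tomcat 6 style: each map operation is atomic, but the composition is not.
        private final ConcurrentHashMap<String, Object> attr =
                new ConcurrentHashMap<String, Object>();

        // Intended invariant: return the value this call removed, or null.
        Object removeAttribute(String name) {
            Object val = null;
            if (attr.containsKey(name)) { // another thread can interleave here...
                val = attr.get(name);     // ...and here...
                attr.remove(name);        // ...so this remove can fail silently
            }
            return val;                   // may return a value this call never removed
        }
    }

(In the Tomcat 5 version the same body was additionally wrapped in synchronized (attr) { ... }, which restores atomicity at the cost of a global lock.)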
>>: What happens to your definition if remove doesn't tell you whether it succeeds or what it removes? >> Ohad Shacham: You need the operations to return values for the definition, yeah. Okay. So what we want to do is test linearizability of this composed operation. Looking at the traces we saw before, it looked very easy to understand why we have a violation. However, reconstructing this in Tomcat is extremely challenging. Tomcat is a huge program; we have a large state space, a large number of traces. And this violation occurs only because we have this environment remove operation that works on the key A and occurs between this get operation and this remove operation, which work on the same key. Right, we read the value here, and before we remove it someone else came and removed it. And this thread interleaving is very, very rare, because we need the program to somehow do a remove operation between these two, and all three of them should work on the same key. It's very, very hard to reconstruct this execution. So our solution has three phases. In the first phase we use modularity: we actually chop this composed operation out of the program and we test it in an environment that does arbitrary collection operations. Of course, this is an abstraction. It generates simple traces, but it may of course generate false alarms, violations which are not feasible in the client. Modularity also lets us control the environment: now that we control the environment, we can do any operation we want in the environment, and we can direct it and do some partial order reduction. The second thing is that, after generating so many traces, we use the linearizability of the base collection. We have a guarantee from the library that each one of the operations is linearizable; therefore, we can execute the operations of the collection sequentially. We do not need to overlap them, because we know they're linearizable. This restricts some of the traces that we generate here. And the last thing is that we use the influence specification of the library, which is similar to non-commutativity, and we use this information to further restrict the traces that we generate; later I'll show how we use it and what we mean. Okay. So first, modular checking: as I said, we take this composed operation, we chop it out of the program, and we test it in an environment that does operations on the collection; between any two operations of the composed operation we can do any environment operation we want. This is, as I said, an abstraction; it may generate false alarms. However, we argue that fixing the violations we find, even those that are not feasible in the client, makes the code resilient to future changes, because if, for example, a bug is currently not feasible, then two days from now someone can remove a lock elsewhere in the program or add some remove operation, and this bug can pop up. And later on you'll see this is something which is backed up by our experience: some of the bugs we found were not feasible in the client, but even so, when we reported them, the developers acknowledged them as bugs and fixed them.
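(A minimal sketch of this modular setup, with a hypothetical influence-directed environment for a map; the real tool's interfaces differ:)

    import java.util.Random;
    import java.util.concurrent.ConcurrentHashMap;

    class InfluencingEnvironment {
        private final ConcurrentHashMap<String, Object> map;
        private final Random rnd = new Random();

        InfluencingEnvironment(ConcurrentHashMap<String, Object> map) {
            this.map = map;
        }

        // Called between any two collection operations of the composed
        // operation; performs only operations that can change the result
        // of the next operation on 'key', per the map's influence spec.
        void maybeInfluence(String key) {
            if (!rnd.nextBoolean()) return;            // environment may decline
            if (map.containsKey(key)) map.remove(key); // make a later get/remove fail
            else map.put(key, new Object());           // make containsKey return true
        }
    }

(Because each environment operation is itself linearizable, such a driver can run them sequentially between the composed operation's steps; it never needs to actually overlap them.)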
>>: So you should have some sort of theorem, at least -- maybe theorem is too strong -- but you would hope that the sort of bugs you find here are a superset of what would happen in the application, given that your modular environment abstraction is sound; that you get a bound on the set of errors due to misuse of the -- >> Ohad Shacham: Yes. And later on I'll talk about a way we use it to prove things. But -- yeah. Okay. So first, as I said, we do modular checking: we want to test the linearizability of this composed operation in a modular fashion. So what we could do is just run the composed operation in a testing harness: we run removeAttribute with some random inputs against an environment doing random operations. But there might be environment operations that don't influence the composed operation at all, so it's running and running and running and it doesn't find any bugs -- whoa. Okay, help me with the computer. There are so many traces that we can explore here. And when it eventually finds a bug, we look at the trace -- ignore this one, it shouldn't be in here -- and we can see, as I said before, that the reason we found this bug is that this environment remove operation influenced the result of this remove operation: because the environment remove removed the key A first, this remove operation failed, and therefore we discovered the violation. So what we actually do is further restrict the environment to do only operations that will influence the next collection operation of the composed operation. Here, for example, before containsKey, the environment can do a put that will make containsKey return true instead of false; here, a remove will make this remove fail; and so on. I'm restricting the environment -- this is a partial order reduction, sort of a partial order reduction. >>: What's your definition of influence? >> Ohad Shacham: As in here, and for -- >>: Formally, I mean. How do you -- you're not analyzing statically, so you're doing a dynamic reduction? >> Ohad Shacham: Dynamically. When the program is running, between any two operations I can have an environment operation. So when I get to this step, I know my state -- I know the state of the collection -- and I know the semantics: I have the influence specification of the library. I assume someone provided it to me. >>: I see. So you know you should, say, use the same key. >> Ohad Shacham: Exactly. Because it's a map, you know you should use the same key. And if in the current state I know that the collection is not empty and has the key A inside, I'll do a remove; otherwise, if it is empty, I can do a put. This happens dynamically, and it depends on the state of the collection at each moment. Okay. So, the running example: here we have removeAttribute, as we saw until now; here we have the execution itself, and here you can see the code that we're currently executing. We start running removeAttribute with the key A, and we assign null to val. Before this containsKey, we ask the environment: do you want to do an operation that will influence this containsKey? This happens dynamically. In this case the environment decides that it wants to, and the collection is currently empty, so the environment does a put operation.
It inserts the key A and the value O into the collection. Then removeAttribute continues running, and now containsKey returns true, because A was just added to the collection here. So we continue running; we enter the branch. And before this get, again we ask the environment: do you want to do an operation that will influence this get? In this case the environment decides that it doesn't want to. So we continue running; we do a get that reads the value O and assigns it to val. Before the remove, we ask the environment: do you want to do an operation that will influence the remove? And the environment does a remove with the same key, which of course influences the result of this remove. We continue running, the remove fails, and at the end the operation returns O. Now, as we saw before, we go and check all the possible sequential executions and see whether there exists an equivalent one. If not, then we report a violation. Okay. Just one thing: as you can see, when we get this trace from our technique, the trace is very, very concise. It's not a huge trace of a program that runs many threads, where it is very hard to analyze the bug; it's very easy. We have the composed operation and, once in a while, some small environment operation. So for all the traces we got, it's very easy to see why the bug occurs and to analyze them. Okay. So we implemented this technique in a tool named COLT. It gets as input the program and a library specification, which go to the composed operation extraction. This is a simple static analysis that just looks for composed operations, and it shows the user these candidate composed operations. The reason is that in many cases the composed operation is built inside a function, but in some cases, instead of writing a function that implements the composed operation, the user just writes the composed operation inside a large method. So in some of the cases the user needs to manually extract the composed operation. The composed operation then goes, via instrumentation, together with a driver that runs the function and an environment that uses the influence specification. Afterwards we run it, and we either get a nonlinearizability result or a timeout, because it's a testing tool. Okay. So let's see our benchmarks. As I said, we used this simple static analysis to extract the composed operations. In 90 percent of the cases the composed operation was part of a large method, so we manually needed to chop it out and rewrite it. We extracted 112 composed operations out of 55 applications, all of them real composed operations from real applications: Apache Tomcat, Cassandra, and so on. For each one of the applications we extracted all the composed operations and analyzed all of them, and we didn't find any additional composed operations. We also tried to use Google Code Search, Koders and other search engines to see if we could find any other composed operations; this is all we could find. After we had these 112 composed operations, we said: we don't know whether these are linearizable. >>: This doesn't seem like a lot. It seems like two per app. >> Ohad Shacham: Hmm? >>: You only found two composed operations per application on average. >> Ohad Shacham: On average. >>: Is that just because these apps were well written and they don't really use composed things at all, or is the library interface powerful enough that you need composed atomic operations in only a few places?
>> Ohad Shacham: All of these are actually using the same libraries. They're all concurrent hash maps, so they're the same libraries. >>: I don't really understand how you figured out whether or not there were other composed operations that you missed, because you have to know the semantics of the code in order to determine whether something is a composed operation, right? >> Ohad Shacham: No, actually: you look for a method that performs a few operations of the library. >>: A single method -- >> Ohad Shacham: A single method. >>: [inaudible]. >>: I mean, there could be bugs in the code where composed operations are hidden because they use other helper methods. You wouldn't miss those, is what you're saying? >> Ohad Shacham: We tried to find such cases, and we didn't find any. But there might be a case that we missed. Most of these composed operations actually didn't call additional methods during the run. But there might be such a case, yeah, that's true. Okay. So after finding these 112 composed operations we said, okay, let's see whether they are linearizable or not. So we ran COLT. It said that 59 are nonlinearizable, and for each one of them it terminated within a second or so and gave a trace and everything. And 53 just timed out after a while and didn't say anything. So we said: okay, we know that 59 are reported nonlinearizable, but are they really nonlinearizable, or are they nonlinearizable only due to our open environment, the abstraction we did? We checked it out, and we saw that 17 are nonlinearizable only due to our environment, and 42 are nonlinearizable in the clients themselves. For each one of them we wrote a fix, and we reported them to the developers of the applications; many of them were acknowledged as violations and have already been fixed. >>: How do you determine this? Did you actually find a real counterexample in the actual code? Or by looking at -- >> Ohad Shacham: For some we managed to construct one, but not for most of them. In many cases our fix was picked up by the developers. And in some of the cases where we were not sure, we looked at whether there exists an operation in the client that can occur and create the violation. But in most of the cases we just got an acknowledgment from the developers. Okay. So afterwards we said: for 53 we got a timeout result, so let's see whether these are linearizable or not. We checked them manually. We saw that 27 of them are linearizable, and 26 are not encapsulated: they don't have only the collection itself, they also have some other global variables, and these variables control the values that the composed operation passes to the collection during the run. If we augment the environment so that instead of doing only collection operations it also changes the values of these globals, then we can generate traces that show these are nonlinearizable as well. So overall, out of the 112 composed operations, we actually found that 85 are nonlinearizable. >>: How many of the nonlinearizable cases did the developers say: it's nonlinearizable, but that's okay? >> Ohad Shacham: None. They either -- I guess the ones that didn't respond. >>: Okay. >> Ohad Shacham: I guess this was the case.
The ones that responded -- many of them said thanks and acknowledged there was a bug. >>: So nobody had some other -- >> Ohad Shacham: No. In Cassandra, for example, they responded saying: oh, this bug is currently not feasible, because the remove can happen only when the program has terminated, so it's not a problem. However, we'll fix it, because we plan to make changes and acknowledge that the bug can pop up in the future. But I guess they wanted to respond, yeah. So 85 of these are nonlinearizable and 27 are linearizable. Then we said: okay, we found many bugs, many real violations, and it was very easy to detect them, and we only used the influence specification of the library. For example, if we had some branch which depends on a specific value -- check whether the value is equal to 42, then go and do some violation -- this is something that we would miss. But looking at these composed operations, we saw that they are very generic. They get some input and use it only as a key in the collection; they do not branch on the input. And when they get a return value from a collection operation, the only branch they take is on whether some key is inside the collection or not; or, likewise, they check whether a value is equal to null or different from null. They never check for some specific value. So we defined the notion of data independence. Informally, a composed operation is data independent if the only global is the collection itself, if the input is used only as the key in the collection operations, and if the branches in the composed operation are based only on the results of the collection operations and only check whether the key is inside the collection or not. And we use this notion to show that verifying linearizability of these data independent composed operations is decidable when the local state is bounded. The reason is that if I know that a composed operation is data independent, then I know that if a violation exists for a certain key, then a violation exists for every key. So I actually need to check only a single input key; I'm doing a small model reduction, using only a single input key. And since I have a single input key and I know the operation is data independent, I know that the composed operation will add at most one value to the collection: the value can depend at most on this single input key. So the composed operation can insert one key with only one value into the collection. And looking at the influence specification, I know that an operation that influences an operation of the composed operation on the single input key will use that same key; and when I want to do an influence, I need to write some different value -- but any different value will do, because by data independence the operation will not branch on this value. So this bounds the number of elements that I can have in the map: either this key with this value, or this key with the value added by the environment, or the map is empty. Okay. So we do a small model reduction, and we explore all the possible executions using one input key and an influencing environment that uses a single input value. Okay. And then, after we did that, we said: let's go back to our composed operations and see how many of them are really data independent.
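(To illustrate the distinction in schematic Java -- these are illustrative patterns, not specific benchmarks from the study:)

    import java.util.concurrent.ConcurrentHashMap;

    class DataIndependenceExamples {
        // Data independent: the input is used only as a key, and the only
        // branch tests null vs. non-null.
        static Object putIfAbsentLike(ConcurrentHashMap<String, Object> map,
                                      String key, Object value) {
            Object old = map.get(key);
            if (old == null) {       // branch only on presence
                map.put(key, value); // key used only as a key
                return value;
            }
            return old;
        }

        // Data dependent: branches on a specific value (the "42" case above),
        // so the single-key reduction does not apply.
        static boolean isAnswer(ConcurrentHashMap<String, Object> map, String key) {
            return Integer.valueOf(42).equals(map.get(key));
        }
    }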
Here SCM means that they're data independent -- single concurrent map; they can be verified using a single key. FCM is an extended class that can be verified using a fixed set of keys. For example, if I have an input and at the beginning I have "if key equals null do this, otherwise do that," then I can check it using the input null plus one other input. And these are the data dependent ones. You can see that out of the 105 composed operations a lot of them are data dependent, so at first we were not quite happy. But then we looked at these data dependent operations, and we saw that 60 of them have globals: they're not encapsulated. As I said before, these globals control the values passed by the composed operation to the collection, so these are nonlinearizable. We also have four that are nonlinearizable and one which is linearizable; this is one that we missed. So overall we ran the tool, we were able to verify all these data independent composed operations, and for the ones that are buggy we afterwards manually fixed them and were able to verify the fixed versions as correct as well. So this is the extended flow of the tool. As before, the user provides the program and the library specification to the composed operation extraction, which generates candidate composed operations and shows them to the user. Then a composed operation can be verified to be data independent. We didn't implement this part, but we have very simple syntactic rules that most of these composed operations satisfy -- very, very simple; they don't have any aliasing or anything like that. In case the operation is data independent, a model is generated -- currently manually -- and we run SPIN and get either a linearizability result or a nonlinearizability result. And in case it is not data independent, as before, we have the flow that just runs testing and tries to find bugs. Okay. So the overall result of the tool: it finds 42 violations which are nonlinearizable in the client; 17 which are nonlinearizable only in an open environment; 26 which have globals and are therefore nonlinearizable; and, using the small model reduction I described, it was able to prove 26 that are linearizable, plus the one we missed. So to summarize: writing concurrent data structures is hard, and we saw that employing an atomic library in an application is not enough by itself. What we do here is use modularity, linearizability, influence specifications and data independence in order to find violations or prove linearizability of these composed operations. And we found this valuable, because we identified important bugs, and for each one of them we provide a trace that not only shows the bug but also explains the bug. As I said before, these traces are very concise; it's very easy to understand them and afterwards fix the composed operation accordingly. Otherwise it would have been very hard to find these bugs by running the program. And we can also prove linearizability of composed operations with this simple, efficient technique. That's it. Thank you. [applause]

>> Guy Golan-Gueta: I present the work "Automatic Fine-Grain Locking Using Shape Properties," which is joint work with Nathan Bronson, Alex Aiken, Mooly Sagiv and Eran Yahav. Concurrent data structures are widely used in many software systems, and in this work we deal with automatic synchronization for concurrent data structures. A simple way to implement synchronization for concurrent data structures is to use coarse-grained locking.
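A common example is a single lock that protects the entire data structure; for illustration, a minimal sketch of such coarse-grained locking on a linked list (illustrative code, not from the talk):

    class CoarseList {
        private static class Node { int key; Node next; }
        private final Object lock = new Object(); // one lock for the whole structure
        private Node head;

        boolean contains(int key) {
            synchronized (lock) {                 // every operation serializes here
                for (Node n = head; n != null; n = n.next)
                    if (n.key == key) return true;
                return false;
            }
        }

        void addFirst(int key) {
            synchronized (lock) {
                Node n = new Node();
                n.key = key;
                n.next = head;
                head = n;
            }
        }
    }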
The good thing about such synchronization is that it is easy to implement and understand such locking, but the bad thing is that it provides very limited concurrency and is usually not efficient enough for many applications. Another way is to use fine-grained locking. Fine-grained locking usually provides a high degree of concurrency, but the problem with fine-grained locking is that it is very hard to understand and implement such locking. So in this work, what we want to do, and did, is to automatically add fine-grained locking to data structure code that has no synchronization. And we want to be able to handle recursive data structures, like recursive trees and recursive lists. So our goal is to take code without any synchronization and automatically create equivalent code with fine-grained locking. There are many ways to create fine-grained locking, so in this work what we want to do is create fine-grained locking in which each object has its own lock. So if this is the data structure and N1 is an object, we want N1 to have its own lock, and all the other objects -- we want them to have their own locks as well. We also want a lock to be held only when necessary. So if, for example, a thread holds some locks, we want it to be able to release locks as soon as possible, so other threads will also be able to simultaneously work on the same data structure. And by doing this we want many threads to work on the same data structure together. So in this work we show a method to add such fine-grained locking by using a simple source-to-source translation. The method uses only a simple static analysis, and because part of the method is dynamic, the method is able to handle cases that are usually hard for static analysis. The main idea of the method is to rely on the shape of the shared memory for the synchronization. So the method itself is not applicable in every case: it is applicable when the shared shape can be seen as a dynamic forest, a forest that dynamically changes. So if we are given code -- it may be complicated code -- the method doesn't really need to understand the details of the code. The method relies on the fact that at the beginning of each operation of the data structure, the shape of the data structure is a forest. Because we rely only on the beginning of each operation, during the operation the shape can change: it can be an arbitrary shape, it can contain a cycle, as long as at the beginning of each operation the shape is a forest. In this case, this is the code of a binary search tree, so this is a tree; a tree is a forest, so this is okay. So the work has two parts. We first show a new locking protocol, which we call domination locking. It is a locking protocol for synchronization on dynamic heaps, and it is a generalization of several known protocols, such as hand-over-hand locking and dynamic locking. In the second part we use this protocol: we show our method to add fine-grained locking by enforcing this protocol, and we show the method is able to handle intricate cases; for example, the method is able to add effective fine-grained locking to some implementations of balanced trees. So I start with the domination locking protocol. In this protocol, what we want to do is leverage the fact that in well-typed programs, like Java programs, there is a restricted way to access objects.
So if, for example, a thread's stack only has pointers to the root of the data structure, to N1, then if it wants to access N3 it has to access N2 first; we want to use this. And to do this we distinguish between two types of objects. The first type is exposed objects and the second type is hidden objects. Exposed objects are the roots of the data structures, and when an operation begins, it may only point to exposed objects. Hidden objects are the other objects; they may be reachable from the exposed objects. So we want to use this, and we want to use the idea of domination. We say, in the definition of the protocol, that a thread T dominates an object u if all the paths from exposed objects to u contain an object that is locked by thread T. So in this example we have two exposed objects, E1 and E2, and the thread has a lock on H4; because of that, we know T dominates H4 and H5: all paths from exposed objects to these two objects pass through the locked object, so these two objects are protected by this lock. We also know that T does not dominate H3, because there is a path to H3 without any object that is locked by thread T. The protocol itself has three rules. The first rule is needed to protect the accesses to objects, and it says a thread can access an object only when it holds the object's lock. So if this is the data structure, then thread T has to acquire the lock in order to access E1. The second rule says that a thread can lock a hidden object u only when it dominates u. So in this example thread T can lock H2, because it dominates H2. The domination locking protocol allows early unlock: if thread T wants to release the lock on E1, it is able to release it, so other threads will be able to access the same data structure. Also, the protocol can handle cycles: if, for example, we have a cycle, as in this case, then thread T is able to lock H3, because it dominates H3. The protocol also allows the heap graph to change dynamically, as long as the rules are satisfied; it is okay that during the work of thread T, thread T changes the graph, creates new objects and changes the pointers of the heap graph. The third rule of the domination locking protocol is needed for the exposed objects, and for this part we use a simple variant of two-phase locking that avoids deadlocks. So if we have this kind of data structure, we use the variant of two-phase locking for the exposed objects, and for the hidden objects we use the first rules of the domination locking protocol. What we show in the work is that the domination locking protocol guarantees atomicity of the operations if the operations follow the protocol. But we also show that it is only needed to consider sequential executions: if we know that every sequential execution -- every execution with a single thread -- satisfies the protocol, then we know that all the operations are atomic, are linearizable. So if we have code, it is enough to think about the sequential executions if we want to use the protocol to enforce atomicity. So we have a protocol, and we know it's enough to think about sequential executions; but the code may still be complicated, and now we want a way to automatically enforce the protocol. We could do it manually, of course, but in the second part we show how to do it automatically. So how do we do this? For this we have our automatic method. As I said at the beginning, the method works only in some cases: it works if we know that the shape of the data structure is a dynamic forest.
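(To make the protocol concrete: a hand-written sketch of a lookup on a binary tree that follows the domination locking rules using hand-over-hand locking; illustrative code, not the output of the tool:)

    import java.util.concurrent.locks.ReentrantLock;

    class DominationLockedTree {
        static class Node {
            final ReentrantLock lock = new ReentrantLock();
            int key; Object value; Node left, right;
        }
        private final Node root; // the single exposed object

        DominationLockedTree(Node root) { this.root = root; }

        Object lookup(int key) {
            root.lock.lock();                // rule 1: lock an object before accessing it
            Node cur = root;
            while (cur != null) {
                if (key == cur.key) {
                    Object v = cur.value;
                    cur.lock.unlock();
                    return v;
                }
                Node next = (key < cur.key) ? cur.left : cur.right;
                if (next != null)
                    next.lock.lock();        // rule 2: next is dominated via cur's lock
                cur.lock.unlock();           // early unlock frees the path for other threads
                cur = next;
            }
            return null;                     // key not found; all locks already released
        }
    }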
The definition of a dynamic forest is that in every sequential execution -- every execution with a single thread -- the shape is a forest at the beginning and the end of each operation of the data structure. So a data structure that is composed of two lists, for example, this kind of data structure, is okay: this is a forest. It is okay for such a data structure to have an operation that changes the graph, for example one that moves H3 from list A to list B. Such an operation violates the forest shape -- during the operation this is not a forest -- but this is okay for a forest-based data structure, as long as at the end of each such operation the shape is again a forest. So how does the method work? The method works in two steps. In the first step it adds code that collects runtime information, and in the second step it adds locking that uses this runtime information. In the first step, the idea is to add two reference counters, similar to garbage collection reference counters. We add a stack reference counter that counts the number of references from private memory -- from the stack -- to the object, and we add a heap reference counter that counts the number of incoming pointers from heap objects to the object. And we add to the code all the relevant things that manage the reference counters. So if this is our state, for example: H4 has a pointer from the stack, so its stack reference counter is 1, and its heap reference counter is 0, because there is no pointer to it from a heap object. And here, H2 has two pointers from heap objects, so its heap reference counter is 2, and its stack reference counter is 0, because there is no pointer to it from the stack. How do we use the reference counters? We do a very simple thing: we lock an object when we see that its stack counter becomes positive; as long as an object is pointed to by a thread, the thread holds the lock of the object. And we unlock an object when its stack counter becomes 0 and the object itself is not part of a violation of the forest shape; we want to handle violations of the forest, and we can detect them easily by looking at the heap counter. And there is one more thing: we may have an operation that works on a few objects whose stack counters are already one at the beginning. We want to lock them without leading to deadlock, so what we do is lock them in a fixed order; we can, for example, take the addresses and lock them according to the order of the addresses. We can identify these objects at the beginning of the operation. If we follow these two steps, then we create a data structure with fine-grained locking that follows domination locking. In the work we wanted to evaluate the method by adding fine-grained locking to several data structures. We did it for two balanced search trees: a treap, which is a search tree that uses randomization, and a balanced tree that uses top-down rebalancing without any parent pointers. We also did it for a self-adjusting heap, a skew heap, and for a specialized data structure used in an Apriori data-mining application, the Barsante [phonetic] algorithm, and we evaluated the results in the context of the application itself. According to the runtime experiments, the results were quite good: we show good scalability, although we could use many optimizations, because here we used very simple locking; we could optimize the locking in many, many ways.
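(As a rough illustration of the instrumentation step, the per-object state the translation maintains might look as follows; this is a hand-written sketch, not the generated code, which also rewrites every pointer and local-variable update to call these hooks:)

    import java.util.concurrent.locks.ReentrantLock;

    class TrackedNode {
        final ReentrantLock lock = new ReentrantLock();
        int stackCount; // references from the running operation's local variables
        int heapCount;  // incoming pointers from other heap objects

        void onStackRefAdded() {
            if (stackCount == 0) lock.lock(); // lock when the stack counter turns positive
            stackCount++;
        }

        void onStackRefRemoved() {
            stackCount--;
            // unlock only when no local reference remains and the object is
            // not part of a forest violation (at most one incoming heap pointer)
            if (stackCount == 0 && heapCount <= 1) lock.unlock();
        }

        void onHeapRefAdded()   { heapCount++; }
        void onHeapRefRemoved() { heapCount--; }
    }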
We didn't do it; we just tested the basic method, and even without optimizations the results seemed to be good. In some cases we compared our locking to manual fine-grained locking. Here we have a graph that compares manually written fine-grained locking to the automatically generated fine-grained locking, and we also have the graph for the heap: here we have the single lock, and here we have manual fine-grained locking versus automatic locking. The results were almost equivalent -- not in every case, but in many cases we saw that the manual and the automatic locking are the same. In one case we found hand-over-hand locking in the original application, the Apriori application, and we compared the original hand-over-hand locking to our automatic method; we saw that the two methods are mostly equivalent, but in one case here our automatic method provides better throughput. It was surprising, but this was the result. Okay. In summary: in this work we show a new fine-grained locking protocol for dynamic heaps, and we show an application of the protocol, which is an automatic way to add fine-grained locking by relying on the shape of the data structure, and we show a preliminary performance evaluation of this implementation. And thank you. [applause]. >> Tom Ball: Questions? >>: So do you do a shape analysis of the method, assuming that it's a forest -- do you do some static analysis, or do you just do a runtime check to see whether it's a forest? >> Guy Golan-Gueta: I could use a runtime check, but I didn't do it. What I did is: the programmer says this is a forest-based data structure, and then I apply the method. >>: So there's no shape analysis really going on here, no static shape analysis? >> Guy Golan-Gueta: No. >>: So basically I assert, and then the analysis just believes it's going to be a forest? >> Guy Golan-Gueta: Yes. >>: I see. But you could have some dynamic check -- you said you could check dynamically the heap reference count. >> Guy Golan-Gueta: I could check dynamically that whenever I see an object its heap reference counter is at most one, and so I could identify violations of the forest. But I didn't do it, because I knew that the data structures are forest-based. >>: Okay. >>: So the instrumentation you have to add to keep track of the stack references and the heap references -- is that restricted to the data structure you actually are doing this on, so you don't have to go and instrument everything? >> Guy Golan-Gueta: Yes, it's restricted to the structure. It has to be restricted to the structure, because I don't want to be affected by external pointers; I want to ignore them. >>: So do you check, for example, that the internal nodes don't escape into the -- >> Guy Golan-Gueta: Yes, I assume that all the objects are encapsulated in the structure. >>: You just assume, okay. I see. >> Tom Ball: Okay. Thank you, gentlemen. [applause]