>> Tom Ball: It's my pleasure today to have two speakers. We have Ohad and Guy, and they're going to OOPSLA this week to give talks in the technical track of SPLASH. They're giving us a little preview today. Ohad Shacham is a Ph.D. student at Tel Aviv University under the supervision of Mooly Sagiv, who is here visiting, and Eran Yahav. His Ph.D. dissertation addresses questions of checking atomicity of concurrent collection operations. Prior to his thesis he worked at the IBM Haifa Research Group on hardware verification.
Guy Golan-Gueta, who is sitting in the audience here, is a Ph.D. student, also at Tel Aviv University, also under the supervision of Mooly and Eran. He's interested in various aspects of concurrency. In the past he was a software architect on several software projects and was responsible for the design and development of many high-performance critical systems. So we welcome them here today. Ohad will get started, and then we'll hear from Guy.
>> Ohad Shacham: Okay. Thank you very much, Tom. Okay. So I'll present the work Testing Atomicity of Composed Concurrent Operations, joint work with Nathan Bronson, Alex Aiken, Mooly Sagiv, Martin Vechev and Eran Yahav.
So writing concurrent data structures is hard. So we look at modern programming languages that provide us with a library of concurrent data structures, and we can use this library through an interface, where each one of the operations of the interface is atomic.
The problem is that in many cases this interface is not enough, and the user needs to write a new operation that composes a few operations of the interface. We know that each one of these operations is atomic.
However, we don't know whether the composed operation is atomic. So the question is: how can we test atomicity of this composed operation? So here we have an example of a bug we found in Apache Tomcat.
In Tomcat Version 5 they have an attribute map, ATTR, that maps the name of an attribute to the attribute's object. They allocated a sequential hash map, and they also have a function removeAttribute that gets a name, the name of an attribute. What it does is: it first takes the global lock on ATTR, then it checks whether the attribute is inside the collection. In case it is, it gets the attribute object, removes it from the collection and returns it.
Otherwise, it just returns null. So the invariant that this function tries to maintain is that removeAttribute returns the value it removes from the collection, or null.
So what they did in Tomcat Version 6: they said, okay, we have a concurrent hash map, a concurrent collection. So let's allocate a concurrent hash map; we know that each one of these operations is atomic, so let's just remove this lock.
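Here is a minimal Java sketch of the pattern just described; the names only approximate Tomcat's code, so treat it as an illustration rather than the actual source:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class AttributeStore {
        // Tomcat 5 style: sequential map, a global lock makes
        // the check-then-act sequence atomic.
        private final Map<String, Object> attr5 = new HashMap<>();

        Object removeAttribute5(String name) {
            synchronized (attr5) {
                Object value = null;
                if (attr5.containsKey(name)) {
                    value = attr5.get(name);
                    attr5.remove(name);
                }
                return value;
            }
        }

        // Tomcat 6 style: concurrent map, global lock dropped.
        private final Map<String, Object> attr6 = new ConcurrentHashMap<>();

        Object removeAttribute6(String name) {
            Object value = null;
            if (attr6.containsKey(name)) {   // each call is atomic...
                value = attr6.get(name);     // ...but another thread can
                attr6.remove(name);          // interleave between the calls
            }
            return value;
        }
    }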
Okay. And, of course, this breaks the invariant. Let's see an example of an execution that shows that the invariant does not hold. We start running removeAttribute with an input string A, and val is assigned null. Then we have some concurrent execution by a different thread that goes and does a put operation with the same input key A and an object O. Then removeAttribute continues running. It checks whether A is inside the collection, and of course it is, because it was added here. So it enters the branch and gets the object O that was added here. And before we reach the remove operation, another thread comes and does a concurrent remove operation with the same input key A. Then removeAttribute continues running. Here it tries to remove A, but it fails, because A was already removed here. So it continues running, returns, and the value it returns is the value that was read here, which is O. So actually, in this run the function didn't remove anything, yet it returned O, as if this is the value that was removed.
Of course this violates the invariant. So by atomicity in this work I mean linearizability. And by linearizability I mean: given a concurrent execution, like the one you can see here, we say this execution is linearizable if there is a sequential execution built from these operations, removeAttribute, put and remove, such that the result of each operation in the concurrent execution is equivalent to the result of its corresponding operation in the sequential execution.
And if I'm looking at this concurrent execution, then as we said before, removeAttribute returns O. You can see that put here returns null, because this is the first operation and the collection is empty; therefore it returns null. And remove returns O, because this is the value that was added here. So what I want to do now is check whether there exists a sequential execution such that every operation in the concurrent execution has the same return value as in the sequential execution. We actually have three options here. The first one: first do the put, then the remove, and at the end run removeAttribute. You can see that put, as before, returns null because it is the first operation.
Remove returns O because it runs just after the put. However, when removeAttribute runs here, you can see that at this point the collection is empty, because I added the key A and then removed it.
So it runs and just returns null, unlike here. So this is not an equivalent sequential execution. Here we first run removeAttribute, and again it returns null, because it runs first and the collection is empty. So it's not an equivalent one either.
And here we first run put and then removeAttribute. You can see that removeAttribute here returns O, as here. However, the remove that we run at the end returns null, because removeAttribute already removed the key A. Therefore the collection is empty and the operation returns null. Okay. Therefore, this concurrent execution is nonlinearizable: we didn't find any sequential execution that is equivalent to this concurrent execution.
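To make the check concrete, here is a hedged Java sketch of exactly this enumeration: it replays each candidate serial order of the three operations on a fresh map and compares the return values against the observed concurrent results (removeAttribute returned O, put returned null, remove returned O). The operation names are mine, for illustration:

    import java.util.HashMap;
    import java.util.Map;

    public class LinearizabilityCheck {
        // Replays one serial order and reports whether it reproduces the
        // observed results of the concurrent execution.
        static boolean matches(String[] order) {
            Map<String, Object> map = new HashMap<>();
            Object o = new Object();
            Object putRes = null, remRes = null, remAttrRes = null;
            for (String op : order) {
                switch (op) {
                    case "put":    putRes = map.put("A", o); break;
                    case "remove": remRes = map.remove("A"); break;
                    case "removeAttr":
                        Object val = null;
                        if (map.containsKey("A")) {
                            val = map.get("A");
                            map.remove("A");
                        }
                        remAttrRes = val;
                        break;
                }
            }
            return putRes == null && remRes == o && remAttrRes == o;
        }

        public static void main(String[] args) {
            // The three serial orders consistent with real-time order:
            // put happens before remove; removeAttr overlaps both.
            String[][] orders = {
                {"put", "remove", "removeAttr"},
                {"removeAttr", "put", "remove"},
                {"put", "removeAttr", "remove"},
            };
            boolean ok = false;
            for (String[] order : orders) ok |= matches(order);
            System.out.println(ok ? "linearizable" : "nonlinearizable");
        }
    }

Running this prints "nonlinearizable", matching the argument above: no serial order reproduces all three observed return values.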
Okay. So what we want to do here is test the linearizability of this composed operation. First we want to test the atomicity, where by atomicity we mean linearizability.
>>: On the previous slide, sorry: when you say sequential, you really mean one interleaving. Because when I think of sequential, I think of the first two columns. In the third one you actually have some concurrency, right, in the sense that you're interleaving the full executions. But it's sequential in the sense that you execute the methods completely. It still has some aspect of concurrency.
>> Ohad Shacham: I mean at the granularity of the operations: I have one operation, then another one, and then another one.
>>: So operations are not interleaved.
>> Ohad Shacham: Operations, no -- but threads can be preempted between operations.
>>: But threads --
>> Ohad Shacham: Yes.
>>: What happens to your definition if remove doesn't tell you whether it succeeds or what it removes?
>> Ohad Shacham: Then the definition refers to whatever return values the operations do provide. Yeah. Okay. So what we want to do is test linearizability of this composed operation. Okay. So looking at the trace we saw before, it looks very easy to understand why we have a violation.
However, reconstructing this in Tomcat is extremely challenging. Okay. Tomcat is a huge program; we have a large number of threads and a large number of possible interleavings. And this violation occurs only because we have this remove operation that works on the key A and occurs between this get operation and this remove operation that works on the same key. Right, we read the value here, and before we remove it, someone else came and removed it.
Okay. And this thread interleaving is very, very rare, because we need the program to do a remove operation between these two, and all three of them should work on the same key. It's very, very hard to reconstruct this interleaving.
So our solution has three phases. In the first phase we use modularity: we actually chop this composed operation out of the program and we test it in an environment that does arbitrary collection operations. Of course, this is an abstraction, and it generates simpler traces. It may, of course, generate false alarms: violations which are not feasible in the client. The modularity also lets us control the environment: now that we control the environment, we can do any operation we want in the environment, and we can direct it and do some partial order reduction.
The second thing: after we have generated so many traces, we use the linearizability of the base collection. We have a guarantee from the library that each one of the operations is linearizable. Therefore, we can execute the operations of the collection sequentially; we do not need to overlap them, because we know they're linearizable. So this restricts some of the traces that we generated here. And the last thing: we use the influence specification of the library, which is similar to non-commutativity, and we use this information to further restrict the traces that we generate. I'll show how we use it and what we mean.
Okay. So first, modular checking. As I said, we take this composed operation, we chop it out of the program, and we test it in an environment that does operations on the collection: between any two operations of the composed operation, we can do any environment operation we want.
And this, as I said, is an abstraction; it may generate false alarms. However, we argue that fixing the violations we find, even if they are not feasible in the client, can make the code resilient to future changes: if, for example, this bug is currently not feasible, then two days from now someone can remove a lock elsewhere in the program or add some remove operation, and this bug can pop up.
And later on you'll see that this is backed up by our user experience: some of the bugs we found were not feasible in the client, but even so, when we reported them to the developers, they acknowledged them as bugs and fixed them.
>>: So you should have some sort of theorem -- at least, maybe theorem is too strong -- but you would hope that the sort of bugs you find here are a superset of what would happen in the application, given your modular environment abstraction, right? That you get a bound on the set of errors due to misuse of the --
>> Ohad Shacham: Yes. This is the idea, and later on I'll talk about a way we use it to prove things. But -- yeah. Okay. So first, as I said, we do modular checking. What we want to do is test the linearizability of this composed operation in a modular fashion.
So actually what we can do is just run the program in a testing mode. We run removeAttribute with some random inputs. There might be environment operations that don't influence the composed operation at all, so it's running and running and running and it doesn't find any bugs, because it's still generating -- whoa. Okay. Help me with the computer. There are so many traces that we can explore here, and at the end, when it eventually finds a bug, we look at the trace -- ignore this one, it shouldn't be in here. We can look at this, and we can see, as I said before, that the reason we found this bug is that this remove operation influenced the result of that remove operation. Because the environment's remove operation removed the key A first, the composed operation's remove failed to remove A, and therefore we discovered the violation.
So what we actually do is further restrict the environment to do only operations that will influence the next operation of the composed operation. Okay. Here, for example, before containsKey, I can do a put that will make containsKey return true instead of false. Here, a remove will make this remove fail, and so on. I'm restricting the environment; this is a sort of partial order reduction.
>>: What's your definition of influence?
>> Ohad Shacham: As in here, and for --
>>: Formally, I mean. How do you -- you're not analyzing statically, so you're doing like a dynamic reduction?
>> Ohad Shacham: Dynamically. When the program is running, between each pair of operations I can have an environment operation. So when I get to this step, I know my state: I know the state of the collection. And I know the semantics; I have the influence specification of the library.
I assume someone provided it to me.
>>: I see. So you know you should, like, use the same key.
>> Ohad Shacham: Exactly. Because it's a map, you know you should use the same key. And if in the current state I know that the collection is not empty and has the key A inside, I'll do a remove. Otherwise, if it's empty, I can do a put. It's done dynamically, and it depends on the state of the collection at each moment.
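As a rough illustration, here is a hedged Java sketch of such an influence-directed environment step. It assumes the influence specification for a map boils down to "use the same key, and pick the operation that changes the next call's result"; all names here are mine, not the tool's actual API:

    import java.util.Map;
    import java.util.Random;

    class InfluenceEnvironment {
        private final Map<String, Object> map;
        private final Random rnd = new Random();

        InfluenceEnvironment(Map<String, Object> map) { this.map = map; }

        // Called between two steps of the composed operation. 'nextOp' names
        // the composed operation's next collection call; 'key' is its key.
        void maybeInfluence(String nextOp, String key) {
            if (rnd.nextBoolean()) return;      // nondeterministically skip
            switch (nextOp) {
                case "containsKey":             // flip the key's presence to
                case "get":                     // change the upcoming result
                case "remove":
                    if (map.containsKey(key)) map.remove(key);
                    else map.put(key, new Object());
                    break;
                default:                        // no influencing op known
                    break;
            }
        }
    }

In the Tomcat trace, such an environment does the put right before containsKey and the remove right before the composed remove, reproducing the violation in a handful of steps instead of waiting for a rare interleaving.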
Okay. So, the running example. Here we have removeAttribute, as we saw until now. We have here the execution itself, and here you can see the code that we're currently executing. We start running removeAttribute with the key A, and we assign null to val. And before we run this containsKey, we ask the environment: do you want to do an operation that will influence this containsKey? This is something that happens dynamically. In this case, the environment decides that it wants to, and the collection is currently empty, so the environment does a put operation: it enters the key A and the value O into the collection. Then removeAttribute continues running, and now containsKey returns true, because A was already added to the collection here.
So we continue running; we enter the branch. And before this get, again we ask the environment: do you want to do an operation that will influence this get? In this case, the environment decides that it doesn't want to. So we continue running; we do a get that reads the value O and assigns it to val. Before the remove, we ask the environment: do you want to do an operation that will influence this remove? And the environment does a remove with the same key, which of course influences the result of this remove. We continue running; the remove fails; and at the end it returns O.
Now, as we saw before, we go and check all the possible sequential executions and see whether there exists an equivalent one. If not, then we report a violation. Okay. So just one thing: as you can see, when we got this trace from our technique, the trace is very, very concise.
It's not a large trace of a program that runs many threads, where it's very hard to analyze the bug; it's very, very easy. We have the composed operation and, once in a while, some small environment operation. So for all the traces we get, it's very easy to see why the bug occurs and to analyze them.
Okay. So we implemented this technique in a tool named COLT. It gets as input the program and an influence specification, which go to the composed operation extraction. This is a simple static analysis that just looks for composed operations. It reports back to the user the candidate composed operations, and the reason is that in many cases the composed operation is built inside a function, but in some cases the user, instead of writing a function that implements the composed operation, just writes the composed operation inline in a large method. So in some of the cases, the user needs to manually extract the composed operation. Then it is instrumented, together with a driver that runs the function and a driver that implements the influence specification, the environment.
Afterwards we run it, and we either get a nonlinearizability result or a timeout, because it's a testing tool. Okay. So let's see our benchmarks. As I said, we use this simple static analysis to extract the composed operations. In 90 percent of the cases the composed operation was part of a large method, so we needed to manually chop it out and write it.
We extracted 112 composed operations out of 55 applications, all of them real composed operations from real applications: Apache Tomcat, Cassandra, and so on. For each one of the applications we extracted all the composed operations and we analyzed all of them; we didn't find any additional composed operations. We also tried to use Google Code Search, Koders and other search engines to see if we could find any other composed operations; this is all we could find. After we have these 112 composed operations, we say: we don't know whether these are linearizable.
>>: This doesn't seem like a lot. It seems like two per app.
>> Ohad Shacham: Hmm?
>>: You only found two composed operations per application on average.
>> Ohad Shacham: On average.
>>: Is that just because these apps were well written and they don't really use composed operations at all, or is the library just powerful enough that it already has the atomic operations you need?
>> Ohad Shacham: All of these are actually using the same libraries: they're all concurrent hash maps. So they're the same libraries.
>>: I don't really understand how you figured out whether there were other composed operations that you missed, because you have to know the semantics of the code in order to determine whether some things are composed, right?
>> Ohad Shacham: No, actually: you see whether there is an operation that tries to access a few operations of the library.
>>: Within a single method --
>> Ohad Shacham: A single method.
>>: [inaudible].
>>: I mean, there could be composed operations hidden in the code because they use other helper methods -- you wouldn't miss those, is what you're saying?
>> Ohad Shacham: We tried to find those; we didn't find any. But there might be a case that we missed. We also tried to find these kinds, and we didn't. Most of these composed operations actually didn't call additional methods during their run. But there might be such a case, yeah, that's true.
And okay. So after finding these 112 composed operations, we said, okay, let's see whether they are linearizable or not. So we ran COLT. It said that 59 are nonlinearizable, and for each one of them it terminated in a second or so and gave a trace and everything. And 53 just timed out after a while and didn't say anything. So we said, okay, we know that 59 are nonlinearizable -- but are they really nonlinearizable, or are they nonlinearizable only due to our open environment, the abstraction we did? We checked it out, and we saw that 17 are nonlinearizable only due to our environment, and 42 are nonlinearizable in the clients themselves. For each one of them we wrote a fix and reported it to the developers of the application, and many of them acknowledged these as violations and fixed them.
>>: How do you determine this? Did you actually find a real counterexample in the actual code? Or by looking at --
>> Ohad Shacham: For some we managed to construct one, but not for most of them. In many cases we got a fix from the developers. And in some of the cases where we were not sure, we looked at whether there exists an operation that can actually occur and create the violation itself. But in most of the cases we just got an acknowledgment from the developers.
Okay. So afterwards we said: okay, for 53 we got a timeout result, so let's see whether these are linearizable or not. We checked them manually. We saw that 27 of them are linearizable -- we managed to prove them linearizable -- and 26 are not encapsulated, which means they don't have only the collection itself: they also have some other global variables.
These variables control the values that the composed operation passes to the collection during the run. If we augment the environment so that, instead of doing only operations on the collection, it also changes the values of these globals, then we can generate traces that show these are nonlinearizable as well.
So overall, out of the 112 composed operations, we actually found that 85 are nonlinearizable.
>>: How many of the nonlinearizable cases did the developer say: it's nonlinearizable, but that's okay?
>> Ohad Shacham: None. They either -- I guess the ones that didn't respond.
>>: Okay.
>> Ohad Shacham: I guess this was the case. Of the ones that responded, many said thanks and acknowledged there was a bug.
>>: So nobody had some other --
>> Ohad Shacham: No. In Cassandra, for example, they responded to say: oh, this bug is currently not feasible, because the remove can happen only when the program has terminated, and then it's not a problem anymore. However, we'll fix it, because we plan to make changes, and they acknowledged that bugs can pop up in the future. But I guess they wanted to respond, yeah.
So 85 of these are nonlinearizable and 27 are linearizable. Then we said: okay, we found many bugs, many real violations, and it was very easy to detect them, and we only used the influence specification of the library.
For example, if we have some branch which depends on a value -- check whether the value is equal to 42 and, if so, go and do some violation -- this is something that we would miss. But then, looking at these composed operations, we saw that they are very generic. I mean, they get some input; they use it as a key in the collection; they do not have any branch on the input.
And also, when they get some return value from a collection operation, the only branch they do is on whether some key is inside the collection or not. Also for remove -- again, they check whether the value is equal to null or different from null; they don't check some specific value.
So we defined the notion of data independence. Informally, data independence means a composed operation is data independent if its only global is the collection itself, if the input is used only as the key in the collection operations, and if the branches in the composed operation are based only on the results of the collection operations and only check whether the key is inside the collection or not. We use this notion to show that verifying linearizability of these data-independent composed operations is decidable when the local state is bounded. And the reason is that if I know that a composed operation is data independent, then I know that if a violation exists for a single key, then a violation exists for every key.
So actually I need to check only a single input key: I'm doing a small model reduction, and I use only a single input key. And since there is a single input key and I know that the operation is data independent, I basically know that the composed operation will add at most one value to the collection, because the value can depend at most on this single input key.
So the composed operation can insert one key with only one value into the collection. And if I'm looking at the influence specification, I know that an operation that influences an operation of the composed operation, which has this single input key, will use the same single input key. And when I want to do an influence, I need to write some different value -- but any different value will do, because by data independence the operation will not branch on this value.
So actually this bounds the number of elements that I can have in the map: I either have this key with this value, or this key with the value added by the environment, or the map is empty. Okay. So we did this small model reduction, and we explore all the possible executions using one input key and an influencing environment that uses a single input value.
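For concreteness, here is a hedged Java illustration of the distinction (the operations are mine, not from the benchmarks). The first composed operation is data independent: it touches no global other than the map, uses its argument only as a key, and branches only on presence. The second is data dependent, because it branches on a specific value:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class DataIndependenceExamples {
        private final Map<String, Object> map = new ConcurrentHashMap<>();

        // Data independent: a violation with key "A" implies a violation
        // with any key, so checking a single key suffices.
        Object putIfAbsentComposed(String key, Object value) {
            Object existing = map.get(key);
            if (existing == null) {      // branch only on presence/null
                map.put(key, value);     // environment may interleave here
                return value;
            }
            return existing;
        }

        // NOT data independent: branches on the specific value 42.
        void dataDependent(String key) {
            Object v = map.get(key);
            if (Integer.valueOf(42).equals(v)) {
                map.remove(key);
            }
        }
    }

With one input key and one distinct environment value, the map can only be empty, hold the key with the operation's value, or hold the key with the environment's value, so the state space the checker must explore is finite.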
Okay. And then afterwards, after we did that, we said: let's go back to our composed operations and see how many of them are really data independent. Here SCM means that they're data independent -- single concurrent map -- and can be verified using a single key. FCM is an extended class of this that can be verified using a fixed set of keys: for example, if I have an input and at the beginning I have "if key equals null do this, otherwise do that", then I can check it using a null input and any other input.
And these are the data dependent ones. You can see that out of the 112 composed operations a lot of them are data dependent. So at first we were not quite happy. But then we looked at these data-dependent ones and we saw that 60 of them have globals: they're not encapsulated. As I said before, these globals control the values passed by the composed operation to the collection, so these are nonlinearizable. We also have four that are nonlinearizable and one which is linearizable; this is one that we missed. So overall, we ran the tool, and we were able to verify all these data-independent composed operations; for the ones that are buggy, we manually fixed them afterwards and were able to verify them as correct as well.
So this is the flow of the extended tool. As before, the user provides the program to the composed operation extraction, which generates candidate composed operations and reports them to the user. Then a composed operation can be verified to be data independent. We didn't implement this part, but we have very simple syntactic rules that most of these composed operations satisfy -- very, very simple; they don't have any aliasing or anything like that. In case they're data independent, they're reported to the user, then a model is manually generated and we run SPIN, and we get either a linearizability result or a nonlinearizability result. In case it's not data independent, as before, we have the flow that just runs testing and tries to find bugs.
Okay. So the overall results of the tool: it found 42 violations which are nonlinearizable in the client; 17 that are nonlinearizable only in an open environment; 26 that have globals and are therefore nonlinearizable; and it was able to prove, using the small model reduction that I described, 26 that are linearizable, and one we missed.
So to summarize: writing concurrent data structures is hard, and we also saw that composing the atomic operations of a library is error-prone. So what we do here: we use modularity, linearizability, the influence specification and data independence in order to find violations or prove linearizability of these composed operations. And we found this effective, because we identified important bugs, and for each one of them we provide a trace that not only shows the bug but also explains the bug.
As I said before, these traces are very, very concise; it's very, very easy to understand them and afterwards fix the composed operation accordingly, whereas otherwise it is very hard to find these bugs when running the program.
And we also prove linearizability of composed operations, and this is a simple, efficient technique. That's it. Thank you. [applause]
>> Guy Golan-Gueta: I present the work Automatic Fine-Grain Locking Using Shape Properties, which is joint work with Nathan Bronson, Alex Aiken, Mooly Sagiv and Eran Yahav.
Concurrent data structures. Concurrent data structures are widely used in many software systems, and in this work we deal with automatic synchronization for concurrent data structures.
A simple way to implement synchronization for concurrent data structures is to use coarse-grained locking. A common example is a single lock that protects the entire data structure. The good thing about such synchronization is that it is easy to implement and understand such locking, but the bad thing is that it provides very limited concurrency and usually is not efficient enough for many applications.
Another way is to use fine-grained locking. Fine-grained locking usually provides a high degree of concurrency, but the problem with fine-grained locking is that it is very hard to understand and implement such locking.
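To illustrate the contrast, here is a hedged Java sketch for a sorted linked list (my own example, not from the talk): the coarse-grained version serializes everything behind one lock, while the fine-grained, hand-over-hand version holds at most two node locks at a time, so threads can traverse concurrently:

    import java.util.concurrent.locks.ReentrantLock;

    class LockedList {
        static class Node {
            int key; Node next;
            final ReentrantLock lock = new ReentrantLock(); // per-node lock
            Node(int key, Node next) { this.key = key; this.next = next; }
        }

        private final Node head = new Node(Integer.MIN_VALUE, null);
        private final ReentrantLock globalLock = new ReentrantLock();

        // Coarse-grained: one lock protects the entire structure.
        boolean containsCoarse(int key) {
            globalLock.lock();
            try {
                for (Node n = head.next; n != null; n = n.next)
                    if (n.key == key) return true;
                return false;
            } finally { globalLock.unlock(); }
        }

        // Fine-grained (hand-over-hand): release a node's lock only after
        // acquiring the next node's lock.
        boolean containsFine(int key) {
            Node prev = head;
            prev.lock.lock();
            Node curr = prev.next;
            while (curr != null) {
                curr.lock.lock();
                prev.lock.unlock();
                if (curr.key == key) {
                    curr.lock.unlock();
                    return true;
                }
                prev = curr;
                curr = curr.next;
            }
            prev.lock.unlock();
            return false;
        }
    }

The hand-over-hand version is also a simple instance of the domination idea described later in the talk: a node is locked before it is accessed, and it stays protected by the lock already held on its predecessor.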
So in this work, what we wanted to do, and what we did, is to automatically add fine-grained locking to a data structure without synchronization. And we want to be able to handle recursive data structures like recursive trees and recursive lists.
So our goal is to take code without any synchronization and automatically create equivalent code with fine-grained locking. There are many ways to create fine-grained locking, so in this work what we want to do is create fine-grained locking in which each object has its own lock. So if this is the data structure and N1 is an object, we want N1 to have its own lock, and we want all the other objects to have their own locks. We also want a lock to be held only when necessary: if, for example, thread T holds some locks, we want it to be able to release locks as soon as possible, so other threads will also be able to simultaneously work on the same data structure.
And by doing this, we want many threads to work on the same data structure together. So in this work we show a method to add such fine-grained locking by using a simple source-to-source translation. The method uses a simple static analysis, and part of the method is dynamic; because of that, the method is able to handle cases that are usually hard for static analysis.
The main idea of the method is to rely on the shape of the shared memory for the synchronization. So the method itself is not applicable in every case: it is applicable when the shared shape can be seen as a dynamic forest, a forest that is dynamically changing. So if we are given code, it may be complicated code; the method doesn't really need to understand the details of the code. The method relies on the fact that at the beginning of each operation of the data structure, the shape of the data structure is a forest. Because we rely only on the beginning of each operation, during the operation the shape can change: it can be an arbitrary shape, it can have a cycle, as long as at the beginning of each operation the shape is a forest. In this case, this is the code of a binary search tree, so this is a tree; a tree is a forest, so this is okay.
So in the work we have two parts. We first show a new locking protocol, which we call domination locking; it is a locking protocol for synchronization on dynamic heaps, and it is a generalization of several known protocols, such as hand-over-hand locking and dynamic locking. In the second part we use this protocol: we show our method to add fine-grained locking by enforcing this protocol, and we show the method is able to handle challenging cases. For example, the method is able to add effective fine-grained locking to some implementations of balanced search trees.
So I'll start with the domination locking protocol. In this protocol, what we want to do is leverage the fact that in well-typed languages like Java, there is a restricted way to access objects. If, for example, a thread's stack only has pointers to the root of the data structure, to N1, then if it wants to access N3 it has to access N2 first. So we want to use this. And to do this, we distinguish between two types of objects: the first type is exposed objects, and the second type is hidden objects. Exposed objects are the roots of the data structures, and when an operation begins, it may only point to exposed objects. Hidden objects are the other objects; they may be reachable from the exposed objects.
So we want to use this, and we want to use the idea of domination. We say in the definition of the protocol that a thread T dominates an object u if all the paths from exposed objects to u include an object that is locked by thread T. So in this example we have two exposed objects, E1 and E2, and the thread has a lock on H4; because of that, we know that T dominates H4 and H5: all paths from the exposed objects to these objects are protected by this lock.
We also know that T does not dominate H3, because there is a path to H3 without any object that is locked by thread T.
The protocol itself has three rules. The first rule is needed to protect the access to objects, and it says that a thread can access an object only when it holds the object's lock. So if this is the data structure, then thread T has to lock E1 in order to access E1. The second rule says that a thread can lock a hidden object u only when it dominates u. So in this example, thread T can lock H2 because it dominates H2.
The domination locking protocol allows early unlock. So if thread T wants to release the lock on E1, it is able to release the lock, so other threads will be able to access the same data structure.
Also, the protocol can handle cycles. So if, for example, we have a cycle, as in this case, then thread T is able to lock H3 because it dominates H3.
Also, the protocol allows the heap graph to change dynamically, as long as the rules are satisfied. So it is okay that during the work of thread T, thread T changes the graph: creates new objects and changes the pointers of the heap graph.
The third rule of the domination locking protocol is needed for the exposed objects, and for this part we use a simple variant of two-phase locking that avoids deadlocks. So if we have this kind of data structure, we use the variant of two-phase locking for the exposed objects, and for the hidden objects we use the first rules of the domination locking protocol.
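Here is a hedged sketch of the three rules, written as runtime checks for a single thread (this encoding is mine, for illustration; it is not the paper's formalization):

    import java.util.Set;

    // 'Obj' stands for a heap object with its own lock; 'locked' is the set
    // of objects whose locks this thread currently holds.
    class DominationLockingRules {
        interface Obj { boolean isExposed(); }

        // Rule 1: a thread may access an object only while holding its lock.
        static void checkAccess(Obj o, Set<Obj> locked) {
            assert locked.contains(o) : "accessed object without its lock";
        }

        // Rule 2: a thread may lock a hidden object only if it dominates it,
        // i.e., every path from an exposed object to it crosses an object
        // the thread has locked.
        static void checkLockHidden(Obj o, boolean dominated) {
            assert o.isExposed() || dominated
                : "locked a hidden object that is not dominated";
        }

        // Rule 3: exposed objects follow a deadlock-avoiding two-phase
        // locking variant, e.g., acquired in a fixed global order, only
        // during the growing phase (before any unlock).
        static void checkLockExposed(boolean growingPhase,
                                     long lastAcquiredOrder, long order) {
            assert growingPhase && order > lastAcquiredOrder
                : "exposed locks must be acquired in order, before any unlock";
        }
    }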
What we show in the work is that the domination locking protocol guarantees atomicity of the operations if the operations follow the protocol. But we also show that it is only needed to consider sequential executions.
So if we know that every sequential execution -- every execution with a single thread -- satisfies the protocol, then we know that all the operations are atomic, are linearizable. So if we have code, it is enough to think about the sequential executions if we want to use the protocol to enforce atomicity.
So we have a protocol; we know it's enough to think about sequential executions; but the code may still be complicated, and now we want a way to automatically enforce the protocol. We could do it manually, of course, but in the second part we show how to do it automatically. So how do we do this? Okay. For this we have our automatic method. And as I said at the beginning, the method works only in some cases: it works if we know that the shape of the data structure is a dynamic forest. And the definition is that in every sequential execution -- every execution with a single thread -- the shape is a forest at the beginning and the end of each operation of the data structure.
So if we have a data structure that is composed of two lists, for example, this kind of data structure, it is okay: this is a forest. It is okay for such a data structure to have an operation that changes the graph, for example one that moves H3 from list A to list B. Such an operation violates the forest shape: during the operation, this is not a forest. But this is okay for a forest-based data structure, as long as at the end of each such operation the shape is a forest.
So how does the method work? The method works in two steps. In the first step it adds code that collects runtime information, and in the second step it adds locking that uses the runtime information. In the first step, the idea is to add two reference counters, similar to garbage collection reference counters. We add a stack reference counter that counts the number of references from private memory, from the stack, to the object, and we add a heap reference counter that counts the number of incoming pointers from heap objects to the current object, and we add the code that manages these reference counters.
So if this is our state, for example: H4 has a pointer from the stack, so its stack reference counter is one, and its heap reference counter is 0, because there is no pointer from a heap object. And H2 has two pointers from heap objects, so its heap reference counter is 2 and its stack reference counter is 0, because there is no pointer from the stack.
So how do we use the reference counters? What we do is a very simple thing. We lock an object when we see that its stack counter becomes positive: as long as an object is pointed to by a thread, the thread acquires the lock of the object. And we unlock an object when its stack counter becomes 0 and the object itself is not part of a forest violation; we want to handle violations of the forest shape, and we can detect them easily by looking at the heap counter. And we have another thing: we may have an operation that works on a few objects, and at the beginning the stack counter of each of them is one. We want to lock them without leading to deadlock, so what we do is lock them in a fixed order: we can, for example, take the addresses and lock them according to the order of the addresses. We can identify these objects statically at the beginning of the operation. So if we follow these two steps, then we create a data structure, with fine-grained locking, that follows the domination locking protocol.
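A hedged Java sketch of the instrumentation just described, written by hand the way the source-to-source translation might emit it (field and method names are mine):

    import java.util.concurrent.locks.ReentrantLock;

    // One instrumented node of a forest-based data structure.
    class Node {
        final ReentrantLock lock = new ReentrantLock();
        int stackCount;   // references from a thread's local variables
        int heapCount;    // incoming pointers from other heap objects
        Node left, right;

        // Emitted when a local variable starts pointing at this node:
        // first stack reference -> acquire the node's lock.
        void stackAcquire() {
            if (stackCount++ == 0) lock.lock();
        }

        // Emitted when a local variable stops pointing at this node:
        // no stack references left, and no forest violation (at most one
        // incoming heap pointer) -> release the lock. A full translation
        // would also re-check this condition when heapCount drops.
        void stackRelease() {
            if (--stackCount == 0 && heapCount <= 1) lock.unlock();
        }

        // Emitted around heap-pointer updates such as 'x.left = y'.
        static void setLeft(Node x, Node y) {
            if (x.left != null) x.left.heapCount--;
            x.left = y;
            if (y != null) y.heapCount++;
        }
    }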
the work we wanted to understand, want to evaluate the method by adding fine unlocking
to several data structures. And we did it for two balance sub streets for heap which is
sub street which uses run domination, randomization, and we added fine unlocking to
heap lock tree. We top-down rebalancing without any parent repointers, and we also did
it for self-adjusting heap, sku heap and to specialize the data structures we did -- and we
in a priori application in Barsante [phonetic] algorithm and we evaluate the results in the
context of the application itself. And according to runtime experiments, the results were
quite good. We show a good scalability. Although we could use many optimizations
because here we used very simple locking, we could optimize the locking in many, many
ways. We didn't do it. We just tested the few method, and without randomization the
results seemed to be good. In some cases we evaluated our locking. We compared the
locking to manual fine grained locking. Here we have a graph that compares the
manually fine-grained locking to automatically fine-grained locking, and we also have to
heap, this is the graph we elected to heap. We have single lock and here we have
manual fine-grained locking versus automatic locking. The results were almost
equivalent.
Not in all cases -- that wasn't great -- but in many cases we saw that the manual and the automatic locking are the same. In one case we had hand-over-hand locking in the original application, in the Apriori application, and we compared the original hand-over-hand locking to our automatic method, and we saw that the two methods are mostly equivalent, but in one case here our automatic method provides better throughput. It was surprising, but this was the result.
Okay. To summarize: in this work we showed a new fine-grained locking protocol for dynamic heaps, and we showed an application of the protocol, which is an automatic way to add fine-grained locking by relying on the shape of the data structure, and we showed a preliminary performance evaluation of this implementation.
And thank you.
[applause].
>> Tom Ball: Questions?
>>: So do you do a shape analysis of the method, assuming that it's a forest -- do you do some static analysis, or do you just do a runtime check to see whether it's a forest?
>> Guy Golan-Gueta: I could use a runtime check, but I didn't do it. What I did is: the programmer says this is a forest-based data structure, and then I apply the method.
>>: So there's no shape analysis really going on here, static shape analysis?
>> Guy Golan-Gueta: No.
>>: So basically I assert and then the analysis sort of just believes it's going to be a
forest?
>> Guy Golan-Gueta: Yes.
>>: I see. But you do have some dynamic check -- you said you checked dynamically that the heap reference count is --
>> Guy Golan-Gueta: I could check dynamically that the first time I see an object, its heap reference counter is one, and so I could identify violations of the forest. But I didn't do it, because I knew that the data structures are forest-based.
>>: Okay.
>>: So the instrumentation you have to add to keep track of the stack references and the heap references -- is that restricted to the data structure you're actually doing this on, so you don't have to go and instrument everything?
>> Guy Golan-Gueta: Yes, it's restricted to the structure. It has to be restricted to the structure, because I don't want to be affected by external pointers; I want to ignore them.
>>: So do you check, for example, that the internal nodes don't escape into the --
>> Guy Golan-Gueta: Yes, I assume that all the objects are encapsulated in the structure.
>>: You just assume, okay. I see.
>> Tom Ball: Okay. Thank you, gentlemen.
[applause]