>>Doug Burger: It's my pleasure to introduce Oriana, who has had quite a journey to
get here apparently. 25 hours from Europe to here, which is longer than normal. But
I'm very pleased to have her here.
She got her Ph.D. from the University of Helsinki and is now working as a researcher at
ETH Zurich and is going to talk to us about some work that she's been doing.
>>Oriana Riva: Thank you very much. And thank you for inviting me here today. So
good morning to all of you.
So today I'm going to talk about Anzere, which is a personal system for data
management. And through Anzere I will give you a sample of my vision of personal
cloud computing.
So let me start first with the story behind this and what I have been doing in the last two
years.
So if we look at the notion of personal computer and how it has been changing over
time for users, we can identify at least two major changes that have fundamentally
changed this notion.
The first one is the rise of smartphones. These devices are not just communication
tools; they also integrate reasonably powerful computation capabilities and sensors,
and they provide various communication technologies. So they can actually be used
even independently of the [inaudible] in so-called self-organizing [inaudible]. And for a
large majority of users, today the primary personal computer is a mobile phone.
There is a second change that is modifying again the notion of personal computer, and
this is happening right now, and I believe it's going to be increasingly evident in the
future. And this is the rise of cloud computing, utility computing, and trends like this,
where now even mobile devices have the capability to acquire on the fly virtual machines
as well as raw storage resources from providers like [inaudible] to [inaudible] and so
forth.
So in my postdoc research at ETH in the last two and a half years I have been following
this evolution through various projects. I have been working on distributed systems for
the support of personal applications, ranging from small-scale ad hoc networks to cloud
computing infrastructures. And in doing this, my focus has been on mobile phone
platforms, understanding which new applications can be enabled on these platforms
and what are the challenges involved in programming them and making them part of
larger systems.
So to give you very quickly some samples of this research, at the beginning I was
exploring the usage of mobile phones independently of the [inaudible] in very
small-scale ad hoc networks, and the focus was particularly on supporting social
networking applications. So we built a platform running on mobile phones for
programming social networking applications running in wireless networks using just
WiFi ad hoc communication. And we built services like presence services, voice over IP,
video, and games like Quake.
Then I started working on cloud computing and moved towards analyzing the interaction
between the phone and the cloud in a client-server setup. And with this project,
AlfredO, the main focus is on understanding how we can turn mobile phones into a
universal interface to access cloud applications.
And the main goal of this project is to build a platform that allows for a single
usage model where you can acquire on the fly the client side of an application running
in the cloud, configure it, interact with this application, and then discard it after the
interaction.
And the advantage is that the application is organized in a modular fashion so that
dynamically several tiers of the application can be distributed between the client and the
server side in a flexible and customizable way.
But I then moved forward from this, to an even farther point in the evolution of the
notion of personal computer, with my vision of personal cloud computing. And this is
what I'm going to talk about today.
So if I go back to the question, what's the equivalent of the personal computer in the
age of cloud and mobile computing, my answer to this is that this single thing is now a
collection that spans physically owned devices like your phone, your home PC, your
office PC, your laptop, as well as a dynamic collection of virtual machines and storage
resources that can be acquired dynamically. And this is what I call the personal cloud.
And what I'm seeking to achieve is a new model for programming phone applications
where the phone is going to act as a controller of your personal distributed application
running across this personal cloud. So a simple case of execution could be you have
the application interface running on the phone, you have your data store on the home
server, and then some of the computation tasks may be parallelized on cloud VMs.
So in the Rhizoma project we are investigating the challenge of building the software
platform that can allow for the programming of personal cloud applications. And by we,
I mean, from the faculty side, Timothy Roscoe, and two Ph.D. students, Qin Yin and
Ercan Ucan, whom I'm supervising with [inaudible], and a variable number of masters
students. Dejan and Robert were advised by me in the last six months.
So to achieve this challenge of building this platform, these are the three main focus
areas on which we are working. So, first of all, providing routing in personal clouds. In
doing this, one of the main challenges is to deal with the heterogeneity of the devices
involved. And the key insight of the research we're doing in this area is to use concepts
from declarative networking to represent the heterogeneity of this environment and to
reason about it through a declarative description of your routing requests.
The second aspect we are working on is how to support self-managing computing
applications in personal clouds. So the goal is to allow applications to start running on
the personal cloud and then self-maintain depending on device failures, changes in the
load, changes in the requirements, so that the application itself can self-manage,
acquire extra resources, release others, and migrate code between devices.
And the third aspect is about the data of the application, and this is the focus of my talk
today with Anzere.
So Anzere seeks to address a problem that you may have encountered in your
everyday life. So let's look at this scenario. So a user has some personal devices like
this, and sometimes he takes photos with his camera. Whenever he gets back home or
when he remembers, he copies these photos to his own PC. Sometimes he takes
photos with his phone, and whenever he's in proximity of a more powerful device he
moves the photos to this other device. Or sometimes he just takes photos and forgets
about replicating them.
So this is a very common pattern for many users, and this is true for photos, but also for
many other personal data like music collections, contact information, documents. And it
is common that users have to deal with a data distribution of this type. And behind this
data distribution there is a very complicated and intricate set of requirements that users
try to enforce.
For example, I want to make sure that when I leave for my next trip, my camera does
not have photos on it, so my memory card is empty; or if I have data on my mobile
devices, I want to have a backup on a fixed device; or I want to make sure that my
private items do not get stored on a public PC.
The problem is that in enforcing these requirements, users today just use an ad hoc
and sort of manual approach. So the goal of this work is to try to make this more
automatic and much easier for the user himself.
So a possible solution to deal with personal data replication is to use online service
providers, Facebook, Google, [inaudible]. They are very popular, and they definitely
offer many advantages, but they also come with serious drawbacks like loss of privacy
and control, lock in with a specific provider as well as vulnerability to provider failures
and insolvency.
So instead what we want to achieve is a personal system for managing data, and this
system must be able to preserve this growing body of personal data and make the data
available according to various user preferences. For example, recently downloaded
music on a device I carry, at least two distributed copies of my photos, and so forth.
So in the rest of my talk I'm going to refer to this setup as a personal storage network,
and I will address the user preferences through replication policies that the user can
express to control the data distribution.
The challenges involved in building this system are different from the focus of typical
online centralized solutions. So the focus is not on throughput or scale but is more on
policy flexibility -- how flexible are the policies I can support -- how to handle the
heterogeneity of the device ensemble, and how to cope with the changing set of devices
that can fail, can move around; but also think of the case where you buy a new phone,
and this phone has to be integrated in your personal storage network. And then we
have to deal with limited storage resources, again, for example, on phones. And one
of the major requirements that users want to have guaranteed is data durability.
So I will show how we address these challenges in the rest of the talk.
This is a very hot topic at the moment in the research community, but in industry as well.
So, first of all, there are several synchronization tools that can be used: Live Mesh,
Dropbox. Probably the closest systems to Anzere are systems for content-based partial
replication. Cimbiosys from Microsoft and Perspective from CMU are two examples.
The main idea of these systems is to provide an interface where users can specify
content-based filters. For example, I want [inaudible] music on my mobile phone. The
filter gets associated with a specific device, and then there is research on device
transparency, partial replication, flexible consistency.
So in Anzere we look at these existing techniques and we try to fill the gap between
these systems. So in particular, what Anzere does is to expand the expressivity of the
replication policies that can be supported, but at the same time it does not sacrifice the
tractability of the policy evaluation and its scalability. And I'll show this in a
moment.
So to summarize Anzere: Anzere is a system for policy-based replication in personal
storage networks and replicates data according to flexibly-specified user policies. It
reacts to devices that enter -- yes?
>>: So can you just tell me what the substrate is for locations? Are we assuming that
the cloud is, like, conceptually [inaudible] policy server and all the clients are
[inaudible]? How is the data replicated? What's the mechanism?
>>Oriana Riva: Okay. I think I'm going to answer this in the rest of the talk, but to give
you a very quick answer now, for us, the cloud is just something where you can acquire
a virtual machine and you will copy there some code that can run or some data that is
going to be stored there. So it's acquired and then it's part of our overlay. And it's an
overlay that is -- it's basically a peer-to-peer network, if you want to put it like that.
But there is some centralization in the system. But I will get back to that with
the architecture. Is that okay?
>>: Okay.
>>Oriana Riva: So it autonomously reacts to devices that enter and leave the
system. It can dynamically acquire cloud resources if there is an advantage in
integrating them in the personal storage network. It ensures data durability, scales to a
very large number of data items, and minimizes the cost of policy enforcement.
I'm now going to start going into the details of the system and how it has been built,
starting from the very basic concept of how policies are defined and built. So, like
systems such as Cimbiosys and Perspective,
Anzere supports device-neutral policies, which means that policies are specified based
on device predicates rather than names, so a logical predicate that refers to a device.
So instead of saying something like replicate photos to my home server, Anzere
supports something like ensure at least one copy of every photo exists on a fixed server
I own.
Or another example would be: make items modified in the last hour accessible at no more
than one minute latency from the phone. So this is a predicate that defines the device
rather than using a specific name.
And the advantage of doing this is that these kinds of policies can work across a
changing set of devices, because they are not bound to a specific device and can
also automatically apply to new devices. So if I go back to the case I mentioned before,
I buy a new phone; well, this phone will be automatically integrated in the system and
policies will automatically start working for this device.
And potentially these kinds of policies can even be reused by other users.
Yes?
>>: So one of the interesting things about policy-based systems is that it's really easy to
write a [inaudible].
>>Oriana Riva: Yes. That's a very good point. I'll get to that in two slides.
So, first of all, let me introduce how the policies are implemented and then I'll get to
what you asked.
So we use a policy stratification, which means, first of all, the system automatically
collects data item metadata and device metadata. So these are just key-value
[inaudible] that describe the data items stored in the system and the devices available.
For example, I will have something like type, size, modified date, tags that say if the
data is private or public, and so forth.
And this is automatically gathered from the applications.
Then based on this metadata, item and device predicates are defined. For example, a
photo item predicate returns all the items that have type JPEG, and likewise for device
predicates. And you can think that these item and device predicates are then made
available in a built-in library [inaudible]. So now the user can come and specify policies.
So the policy is basically an item predicate, a device predicate, and a relation that must
hold between the two. So a photo item should be on any device -- repany. So, at least
one replica.
So you see this stratification that starts from the metadata predicates and then the
policy.
Now, in Anzere all this is implemented -- expressed -- in logic programming, in Prolog.
So these are basically Prolog facts and these are inference rules. And the user is
working at this level when specifying policies.
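To make the stratification concrete, here is a minimal Prolog sketch of what the three layers might look like; the fact and predicate names are illustrative, not Anzere's actual schema.

    % Layer 1: metadata facts, collected automatically from applications.
    item_meta(img_042, type, jpeg).
    item_meta(img_042, size, 2048000).
    item_meta(img_042, tag, private).
    device_meta(home_server, kind, fixed).
    device_meta(phone, kind, mobile).

    % Layer 2: item and device predicates defined over the metadata.
    photo_item(I) :- item_meta(I, type, jpeg).
    fixed_device(D) :- device_meta(D, kind, fixed).

    % Layer 3: a policy = item predicate + device predicate + relation.
    % "Ensure at least one copy of every photo exists on a fixed device."
    policy(photo_item, fixed_device, repany).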
So why logic programming? As has been shown in previous systems, logic
programming allows you to unify information from a heterogeneous set of sources and
makes the system less bound to a specific schema. So it becomes easier to evolve the
system: to integrate new metadata, new content that appears in the system, and new
devices.
So by doing this, the advantage is that the expressivity of the policy language can be
expanded autonomously. So here is another example of a more complex policy, where
basically we have again an item predicate that will return all the items that have
been modified in the last hour. Then I have a device predicate that gives all the devices
that are close to this mobile phone, within 60 milliseconds. And then there is the relation
repany. So make at least one copy of it.
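As a rough sketch, this more complex policy might read as follows in the same style; latency_ms/3 is an assumed sensor fact, not a confirmed Anzere predicate.

    % Items modified within the last hour (3600 seconds).
    recent_item(I) :-
        item_meta(I, modified, T),
        get_time(Now),
        Now - T =< 3600.

    % Devices within 60 milliseconds of the phone.
    near_phone(D) :-
        latency_ms(phone, D, L),   % assumed fact fed by the network sensors
        L =< 60.

    % "Keep at least one copy of every recent item close to the phone."
    policy(recent_item, near_phone, repany).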
Yes?
>>: How diverse do you expect these policies to be? Can common users just share
some normal policies and deal with it, or are you expecting people to have very specific
policies?
>>Oriana Riva: The way it works in the system is that in a built-in library you
will have device predicates and item predicates, and then based on that, you can
combine them. But these device predicates and item predicates can also be extended
and maintained by a developer.
So I expect the user may have, you know, some fixed number of policies that may last
for years and that are pretty simple -- make two copies of this item -- but then I think
that there are also situations where a user may specify even a temporary policy, right?
Like something like this, where I say my documents tagged in a specific way and modified
in the last two hours have to be copied to this device, because maybe I'm leaving for a
trip and I want to get a backup. Or I will actually show you in the experiments an
example of a temporary policy.
So I think there are both things. So there is the boring side but also the more exciting
compositions.
You had another question?
>>: I'm curious about constraints that have -- a lot of them have variability. Like the
[inaudible] on my phone changes a lot even when I'm walking around MSR. Right now
my phone has no internet, it doesn't have great [inaudible]. Seems like really interesting
things to try and work into the system. Is that something you handle or do you kind of
assume --
>>Oriana Riva: You mean the variability?
>>: Uh-huh.
>>Oriana Riva: Yes. Well, that comes from the networking layer of Anzere. So in the
current implementation it's very simple. We try to make sure that when you detect an
event, it's a permanent change to which the system has to react. But you can then
build in any sort of logic in how you describe these events. And the most important
thing is to detect them, represent them, and then have a reaction to them when they
are there. But, yeah, the system can become very unstable.
Okay. So, now, this is good. We can express a lot of policies. And now I go back to
your question. But this actually creates work, because we then have to process them.
So in processing them, we use constraint logic programming. So we basically specify
policy enforcement as an optimization problem. And the constraints of this problem are
given by the policies themselves -- you say, I want two replicas of items -- or are
imposed by the device constraints; for example, keep two gigabytes of free memory on
the mobile phone.
Then there is an objective function that comes from, for example, I want to minimize the
bandwidth needed to transfer data in order to enforce the policies, or I want to maximize
data accessibility. So I have constraints, I have an objective function. This translates
into a constraint optimization problem. And the advantage of CLP is that it offers a
natural way to specify these constraints, and it's basically a constraint satisfaction
problem that applies very well to these kinds of situations.
But to be more concrete about how this works, let me give you some insights on the
actual CLP execution.
So I'll start with the basic approach. So the model behind the CLP execution is a matrix
model where we have a bunch of devices available in the overlay -- yes?
>>: [inaudible].
>>Oriana Riva: Yeah, that's very interesting. So I've never thought about it, actually.
But you can build this with the current framework of the system. It all hinges on the item
metadata that you specify. So I could have a metadata field that says number of accesses
per second or per hour, whatever. So I can just have that metadata. And then there will
be a device -- sorry, an item predicate that is going to say frequently accessed item, and
you give as an input the frequency of access. And in this way you are defining which
items have been frequently accessed, and then a policy can be built on top of that.
So this is exactly the advantage of logic programming. You can just specify one more
parameter. So you just define one of these metadata here, it's going to be added, so it's
going to be available as a fact in the system. Then an inference rule has to be added
that just [inaudible] about these, and this will be in the built-in library, and then a
policy can use that. So this is the flexibility that logic programming gives you.
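A minimal sketch of such an extension, with hypothetical names:

    % New metadata fact, gathered by the sensors.
    item_meta(img_042, accesses_per_hour, 12).

    % New inference rule added to the built-in library.
    frequently_accessed(I, MinRate) :-
        item_meta(I, accesses_per_hour, R),
        R >= MinRate.

A policy can then combine frequently_accessed with any device predicate, exactly as with the built-in predicates.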
Did I answer your question?
>>: Okay.
>>Oriana Riva: So there is this matrix model: devices and all the data items in the
system.
Now, the solution to the problem is the value that each cell takes: it could be a 1, which
means store the item there, or a 0, do not store it there. In order to find a solution,
constraints are imposed. So, constraints coming from replication policies -- make at least
two copies of item 4 -- and constraints that come from device-specific features, like I
want 2 gigabytes of free memory on my phone. So by imposing these constraints, some
values of these cells will be already determined and some will still be undecided.
So there is, then, a second set of constraints that has to be imposed in order to support
the optimization part of the program. So imagine you have a current data
distribution that says this is where my data is stored, and you have policies that have to
be imposed. In order to impose those policies, you need to pay a cost, which will be the
bandwidth that has to be used to transfer data so that the new configuration satisfies the
policies.
So constraints are imposed in order to minimize that cost, so that the optimal data
distribution can be found -- the one that has the minimum cost of moving between these
two matrices. And once the matrix has been decided, an execution plan will be output
that consists of copy and delete actions. For example, copy item 2 from the phone to the
home PC, delete item 2 on the phone. And then the system will enforce these actions.
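A plausible sketch of how the plan could be derived by diffing the two placements; old_placement/2 and new_placement/2 are assumed helpers enumerating (Item, Device) replica pairs, not Anzere's actual interface.

    % An item newly placed on a device becomes a copy action.
    action(copy(Item, Dev)) :-
        new_placement(Item, Dev),
        \+ old_placement(Item, Dev).

    % An item dropped from a device becomes a delete action.
    action(delete(Item, Dev)) :-
        old_placement(Item, Dev),
        \+ new_placement(Item, Dev).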
Yes?
>>: Can you give me a sense of what the cost is of providing this global
mapping of objects to devices? How does it [inaudible] the number of objects? If I have
100,000 objects in my system, isn't it quite slow to calculate that?
>>Oriana Riva: Here it is. Yes. The answer is yes. A lot.
So this is the CLP execution time, and this is the number of items. This was done with
three different data sets generated from different random seeds. And the brute force is
the algorithm I've just shown, which reasons item by item.
With 25 items, in hours it could not complete. Not only that, but you see it's also very
dependent on the data sets. There are sweet spots that differ depending on the data
sets. So this is a problem -- yes?
>>: So I'm a little confused about what software you're running here. Are you running
Prolog and then your own constraint solver, or are you running a particular constraint
language?
>>Oriana Riva: So we are using Prolog, and we are using the ECLiPSe solver. And
how this runs will become clearer when I show the architecture. There is one node in the
system that holds the knowledge base and the CLP solver, runs it from time to time,
decides which are the actions that have to be taken, and dispatches the actions to the
devices. But I'll show the architecture later.
>>: Can you go back to the previous slide?
>>Oriana Riva: Yes.
>>: So this formulation doesn't take into account the cost of doing the operations, the
copy costs or anything else like that?
>>Oriana Riva: No, it does not.
>>: So is there any way to take that into account [inaudible]?
>>Oriana Riva: Oh, I see what you mean.
>>: Maybe the tradeoffs would be different if you --
>>Oriana Riva: Definitely. That's a good point. I mean, it should be done, and also the
ability to maybe aggregate actions and -- yes, we don't reason about that yet. To be
really honest, we do one thing in the system that is, again, another advantage of
Prolog, which is simply to order the devices. Just by ordering them in the way we specify
them: if I have the option between copying -- for example, I have data that is stored on
the phone and my office PC, and I need to make a third copy in the cloud -- the
system will always choose the office PC to copy the data from instead of choosing the
phone.
So there is this implicit, if you want, optimization just based on the priority of devices,
but it's not very detailed, as you probably mean. I think it could be interesting to do that
and aggregate actions and things like that. Especially, we had situations where we got
something like thousands of actions for executing one goal. So that kind of
optimization becomes very important.
Yes?
>>: [inaudible].
>>Oriana Riva: There is no solution. There is no solution, but then it depends on the
reason why it can't be satisfied. So if you had a policy that was something like make
photos available on my home server and my home server is not reachable, that policy
can't be satisfied.
>>: [inaudible].
>>Oriana Riva: Well, then if we can -- so if you know a priori that that policy can't be
satisfied because the device -- for example, one server is not reachable or because the
phone does not have connectivity or -- if you know a priori, that policy is just excluded
from the set. So this won't make the CLP fail. But --
>>: [inaudible].
>>Oriana Riva: If you don't know it, it's going to give no solution because there is no
solution to the problem.
And one way to deal with that is that if there is no solution, what you could start doing is
to start solving a subset of the policies and see if at least a subset of those policies can
be satisfied. But if there is no solution, the CLP would simply return you no solution.
>>: [inaudible].
>>Oriana Riva: There is feedback, yes. So the user is informed when a policy can't
be satisfied. So the user basically specifies policies and constraints, and gets back
feedback. So from the feedback, the user will know what has not been enforced.
Yes?
>>: Where is the policy [inaudible] done? I sense that you don't want to tell us yet,
but --
>>Oriana Riva: It's in the CLP. So it's one node that is elected as the leader and runs
the --
>>: [inaudible].
>>Oriana Riva: Sorry?
>>: Is that the cloud?
>>Oriana Riva: It can be. It's one device that is elected as a leader. So it can be your
office PC, but if there is no office PC -- it won't be the phone. So we don't run that on
the phone.
>>: But it has to be imbued with global knowledge of all of the [inaudible].
>>Oriana Riva: The knowledge base has to be stored there. But I'll show later, really,
because then it's also distributed --
>>: So the user has some [inaudible] device and forever that device is going to be used
as the knowledge base and the policy --
>>Oriana Riva: Forever if it does not fail. If it fails, there will be a new -- the election
protocol will elect a new leader.
Yes?
>>: Can you tell me a little bit about why you chose to do a global policy rather than a
device-based policy? I mean, a [inaudible] policies and other storage systems have
looked at personal storage networks [inaudible] usually per client. I mean, I can see a
tradeoff here if you have a lot of churn in your system, you're adding new devices
quickly, this might be better because [inaudible] policy, but I'm curious as to why you
chose this approach instead of something that's just [inaudible].
>>Oriana Riva: Next slide [laughter]. You are always a slide ahead of me [laughter].
No, sorry.
Well, the quick answer is that so far centralization has not been a problem, in the sense
that the set of devices is still small, and I'll show now that we can still scale. So that's
the quick answer.
But then I have a lot of future work on what you're saying, because in a multi-user
model you need more distribution.
>>: My worry is that in a device model, I could change the policies per device just
[inaudible] changing, but in a global model I have to change my policy per data item. If
I'm creating a lot of content, every time I create a new item I have to evaluate globally
whether that item needs to be replicated across different devices [inaudible].
>>Oriana Riva: I'm not sure I understand. Why do you need to specify a policy for
every new item you create? Because, also, items are addressed by the item predicate.
So if you say recent items, the moment a new item is created, it is part of that class.
>>: But --
>>Oriana Riva: So it's device independent, but it's also item dependent, this policy --
>>: [inaudible] I don't know whether or not it should exist on these other devices until
you look at the schedule.
>>: So you're worried about reevaluating the same policy or not.
>>: Yes.
>>: You talk about --
>>Oriana Riva: I think we should talk about this offline. I'm not sure I'm understanding --
or maybe it will become clearer now with the explanation of equivalence classes.
So basically this basic approach does not scale. So we introduced item
equivalence classes. And the purpose of doing this is to reduce the problem space so
that the CLP solver can converge to a solution quicker.
Now, it's important to understand that the equivalence classes are generated by looking
at the item predicates in the set of active policies. So assume that there is just one policy,
and this is the item predicate that is used. The equivalence classes are given by the
permutations: music item, non-music item. And all items can be grouped under these two
equivalence classes. Or if you had two item predicates in the active policies, then you
would have 4 equivalence classes: recent music items, non-recent music items, and so
forth.
So by generating the equivalence classes, all items can be grouped according to
them, so in the previous matrix model, instead of the items, the CLP program will
reason about equivalence classes. And this makes the system scale much better.
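One plausible way to compute an item's class, as a sketch: the class is simply the vector of truth values of the item predicates that appear in the active policies.

    class_of(Item, Preds, Class) :-
        maplist(pred_bit(Item), Preds, Class).

    pred_bit(Item, Pred, 1) :- call(Pred, Item), !.
    pred_bit(_, _, 0).

    % With the predicates sketched earlier, class_of(img_042,
    % [photo_item, recent_item], C) gives C = [1, 0] for a photo not
    % modified in the last hour; all items with the same vector can be
    % placed identically by the solver.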
So this is the previous graph, and this is the new execution time with the equivalence
classes algorithm. And even on a larger scale, you can see that even when processing
10,000 items, the solution is found in less than 10 seconds. So with equivalence
classes, it can scale very well with the number of items, and we got what we wanted,
because now the system scales with the complexity of the policies, since it depends on
the equivalence classes.
Yes?
>>: But if you have a rule saying I can't have more than 2 gigabytes on this device, now
you can't have equivalence classes because any two files that have different sizes are
no longer equivalent. So you now -- just having that policy, I care about the aggregate
storage requirements of the set of items means that you can no longer consider any two
items equivalent unless they're exactly the same size.
>>Oriana Riva: So the way you implement that kind of constraint is by defining a
predicate that says item stored on device or item not stored. So that's the
equivalence class that you use. And then you reason about it in the same way. So it's,
again, a property, right? The device is a property; whether the item is stored on a
device is a property. So in that way you can still define an equivalence class that will
group all the items that have the property of being stored on device X or on device Y.
>>: How would you choose the subset items that you put on device [inaudible].
>>Oriana Riva: So one thing -- so the only case that I have found so far where
equivalence classes may not be efficient is the case where I'm in this particular situation
where I need to make a copy of the items, and they can't fit on my mobile phone or they
can't fit on one unique device. So I would like to split those items that belong to one
equivalence class in two, so that then I can copy.
But, yes, that's the one case that I don't yet manage to handle. But I think that's
actually pretty rare, because if you think you have a cloud machine or -- it becomes
pretty rare, the case where you need to split your items. But for the rest, you can
always reduce things to an equivalence class.
Yes?
>>: So my understanding is Prolog or the CLP type of solvers are pretty poor in terms
of handling [inaudible]. If you specify something with time and say I care about things
between this time of the day and that time of the day, or every Monday, or something like
that, it's fairly hard -- I think this is similar to the [inaudible].
>>Oriana Riva: So --
>>: [inaudible] have the experience in terms of specifying things [inaudible].
>>Oriana Riva: Yes, we did that. And, actually, it's in the previous example. I'm
thinking, because so far I did not have a problem with time, so I'm thinking where there
could be -- sorry, it was actually in the example -- where there could be a limitation in
doing that.
So here, for example, this recent item predicate is defined by having the modified date,
which is the metadata of the item, and then this modified-within is a predicate -- it's an
inference [inaudible] in Prolog that gets the current time and computes whether it's
before or after the --
>>: [inaudible].
>>Oriana Riva: Well, every time. So the CLP solver runs periodically. So whenever it
runs, it will get the fresh time and will evaluate.
>>: [inaudible].
>>Oriana Riva: Yes. That's absolutely correct. So at the moment the CLP solver is
invoked on a periodic basis. We're working on making it more event driven, but for the
moment it's a periodic trigger.
>>: [inaudible] when the next time the CLP solver can run. I think this is similar to the --
>>Oriana Riva: Well, we have investigated the tradeoff of do I run CLP every 30
seconds or every 10 minutes, right? Obviously if you run it on a shorter interval, then
you can more easily follow what is the system load. It's a tradeoff. But I don't see any
problem in making it event driven, where whenever there is an event or there is a large
number of items that have to be processed, then the Prolog solver will be invoked.
That's how the system should go.
>>: We can take this offline.
>>Oriana Riva: Okay. Thank you.
Yes?
>>: Can you go to your last slide which had the graph on it [inaudible].
>>Oriana Riva: Yes.
>>: So I have 10 policies and 12 equivalence classes, and if my computation now, the
evaluation, is based on the classes of policies, why does it grow constantly -- the
number of items should not make a difference.
>>Oriana Riva: Very good point, yes. So the reason why there is this increase is that
there is a last phase where we have to expand the matrix in order to generate the
actions, and that scales with the number of items.
So at some point the solution is given and actions have to be output, and that has to be
on a per-item basis. So the matrix has to be expanded.
>>: So that's good, because that means if we have to move something, we don't have
to move an entire equivalence class.
>>Oriana Riva: No. I mean, you reason by equivalence classes, but the actions are
defined on a per-item basis.
>>: And then how -- so suppose I just have something where I've already done this, and I
introduce one new object. It shouldn't take very long to reevaluate everything just for --
in other words, I would look for an incremental change in performance as opposed to
reevaluating everything for 100,000 objects. Can I just incrementally reevaluate based
on the introduction of one new object -- it falls into an equivalence class, do the
calculation, and I don't have to really make big changes. I just have a sense there should
be something --
>>Oriana Riva: So what you mean is that instead of reevaluating all items every time,
you should look at the difference? This is your -- yes. Absolutely. Yes, you're right.
Yes.
So there are a lot of optimizations that we haven't done yet. You can also reduce the
number of active policies, because maybe you have new items, but they do not relate to
all policies. So you can reduce -- yes, there is absolutely a lot that can be done to make
it even faster, yes.
Yes? Then I need to go.
>>Doug Burger: Maybe we could just take the questions afterwards so we can get a
little more of the talk.
>>Oriana Riva: Okay. Thank you. Sorry.
So now we have built a basic framework that allows you to specify metadata and
variables and optimize over the data items that are given. And so with this basic
framework it becomes possible to start introducing new variables and reason about
them. And one of these variables we investigated is acquirable resources, which means
to factor the decision of acquiring a cloud machine or raw storage into the policy
process itself.
So the CLP program evaluates the states that the system can achieve when a new
device is added, specifically a cloud virtual machine, which means having an extra
column in that matrix that I have shown. And we need to consider also price
constraints: I don't want to spend more than X per month on renting cloud resources.
And the actions that are output now will also be acquire and release actions for virtual
machines.
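Continuing the earlier constraint sketch, the acquirable VM is just one more 0/1 decision variable -- an extra column with a price constraint attached; the rent and budget figures are hypothetical.

    :- use_module(library(clpfd)).

    plan_with_cloud([Phone, Office, CloudVM]) :-
        [Phone, Office, CloudVM] ins 0..1,
        sum([Phone, Office, CloudVM], #>=, 2),  % policy: at least two replicas
        CloudVM * 30 #=< 50,                    % rent (30/month) must fit the budget (50)
        labeling([], [Phone, Office, CloudVM]).

If the solution sets CloudVM to 1 where it was 0, the plan additionally emits an acquire action for the virtual machine, and a release action in the opposite case.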
And so now the picture is complete, and this is the personal cloud.
So I now want to give you a brief overview of the architecture of the system. So this is
the architecture that runs on every node in the system, with the exception of the CLP
solver and the knowledge base. So this is used for optimization, and all that I have
presented so far basically runs here.
The knowledge base is also only on one single node, but in order to avoid a single
point of failure, it is also replicated -- policies and metadata are fully replicated across
the devices so that the state can be re-established in case the node holding the
knowledge base fails.
Now, this CLP -- I have shown how we use it for data replication, but actually in the
[inaudible] it is used also for networking optimization and application computation
optimization.
Yes.
>>: When you say the knowledge base is fully replicated, you mean on the well
provisioned machines. It's not replicated on each [inaudible].
>>Oriana Riva: So it depends. So policies and item metadata are replicated
everywhere. They don't -- they are really a few megabytes. It's not a problem.
And then there is information like, for example, sensor information that is collected
that describes the state of the system and device properties. And this is replicated
only among a selected subset of potential coordinators, potential nodes that could
become the next leader in the system.
So in order to build the knowledge base, there are a bunch of sensors that run on every
device, and they collect information like which network interfaces are in use, what's the
device's CPU usage, what's the latency between any pair of devices.
And also item metadata are collected in the same way, so that the CLP can know which
items are available and where they are currently stored, plus there are external
monitoring services for PlanetLab and EC2. Then the application will submit the
replication policies, the CLP is run, and it outputs copy and delete actions that are
enforced by the data replication subsystem.
Now, in building the data replication subsystem, we used well-known techniques in data
replication. In particular, for partial replication, we have a simplified implementation of
[inaudible]. We then use an interface similar to existing systems for flexible consistency,
which allows us to have a continuous range of consistency values, which basically
means that every item in the system can be replicated with the weaker guarantees of
[inaudible] consistency up to the stronger guarantees of sequential consistency. And to
enforce sequential consistency the Paxos consensus protocol is invoked. So every
device is part of the Paxos consensus.
Then we have actuators that are in charge of going and acquiring or releasing virtual
machines. And, finally, there is the overlay network where all of this runs.
So the overlay takes care of leader election, membership management, failure
detection, and, as I said at the beginning, it implements concepts from declarative
networking to specify routing requests again as a CLP optimization program, which
allows us to deal with heterogeneous network paths, failures, or situations like you
mentioned before: do I have a path to an unstable device.
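In the spirit of declarative networking, such routing requests can be written as logic rules over link facts; this sketch uses illustrative names and assumes an acyclic link set for brevity.

    % link(From, To, LatencyMs): facts fed by the overlay's sensors.
    link(phone, office_pc, 40).
    link(office_pc, ec2_eu, 15).

    path(X, Y, C) :- link(X, Y, C).
    path(X, Z, C) :-
        link(X, Y, C1),
        path(Y, Z, C2),
        C is C1 + C2.

    % ?- path(phone, ec2_eu, C).   gives C = 55, via the office PC.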
Okay. So now a few evaluation results. So we ran trials of the system using a real
home and an office network plus virtual machines from EC2 and PlanetLab. And we
investigated the policy scalability. So I have shown this before, how the system
can scale to a large number of items.
In terms of memory consumption, [inaudible] is okay. The upper bound, for example, for
reasoning about 10,000 items was about 64 megabytes, which is more than reasonable
on a modern device.
Then we evaluated the reactivity of the system. So in this first test we produced in the
system a dramatic event: the home server crashes. The home server was storing
something like 1200 items. It crashes. The system will have to reason about how to
basically re-establish the replication requirements in the system, and it decides to
acquire an EC2 machine in order to create a [inaudible] replica and then start copying all
the data items that are available in the system, so that the replica that the old server
does not provide anymore can be re-established.
So this is to show how acquirable resources can help the system to stabilize and
decrease the level of vulnerability, and [inaudible] can autonomously go and reason
about: I need an extra resource to satisfy the policies. But, also, reactivity can be used
to satisfy -- to react to mobility.
So imagine this scenario. So I'm in Europe in my office network, and from my laptop I'm
downloading photos from my office PC with a certain access time. And then I go to
Seattle, and as soon as I connect, I again go on to download my music
from -- my photos from my office PC, with, of course, an increased time since I'm
remote. Now, if a policy like the following is imposed, that says picture items modified
within the last day should be replicated to at least one device that is fixed and is close
to the laptop, within one hundred milliseconds from the laptop, then the moment my
laptop reconnects, the policy is enforced and the CLP will acquire an extra machine from
the cloud and start copying the photos, so that my access time will be greatly reduced by
the time all the photos have been copied.
So this goes back to your previous question. Our policies are [inaudible] and dynamic.
This is an example of a policy that you could establish just for a temporary time and
then delete from the system once I'm back in Europe. And acquirable resources can
also be used for improved performance.
Now, if I'm okay with -- yes?
>>: That brings up an interesting question. Suppose I didn't want to write that rule
myself. Could the developer, the provider of the system, sort of write a meta rule that
says, you know, any time --
>>Oriana Riva: [inaudible] sure. Why not? I mean, one thing could be that policies
could be associated to really what you are doing. You could have home policies,
holiday policies. I think that's possible. Why not? Yeah.
So if I'm okay with the time, I have a one-minute demo. It's just to -- it's nothing fancy,
but it shows how the system works and gives you an insight.
So here is the overlay that we use for most of our tests. So here, for example, I can
access my office PC storing three photos, then I have my mobile phone, this is my
home server also with a few photos. I made the system empty for clarity. So that's the
home server. And then the laptop and then there are some -- a PlanetLab machine and
two EC2 machines. Obviously all these devices were in Switzerland, but for your
visualization I hard-coded the latitude and longitude. So that's the EC2.
So now I can go in the system and I can select a picture, and I decide to submit it to
the system. So I'm running this at the moment on this red node, which is also the
coordinator, the node that runs the CLP in the system. So the picture is going to be
now available only on this node.
Now I go and I specify a policy. So this is a basic GUI we built to quickly specify them,
but of course you're not limited to that GUI to build them. So here I say a picture,
repany, home, and fixed. So these are item and device predicates.
So I save the policy, and this is what it looks like. So the picture item: make at least one
copy on the home device that is also fixed. And I can also go and see which are the
current policies that are present. So, actually, I have another policy that says that
pictures should have at least three [inaudible] on any device.
So now you see that the policy has been replicated to all devices in the overlay. The
CLP runs periodically, so now it has run, and it has output two actions: copy the new
photo from IKA [phonetic] desk, my office PC, to the home server; and the other
action is copy the photo to EC2 Europe.
So I said that I want three replicas of these photos, so one is the office PC, one is the
EC2 Europe, and one is indeed the home server. And so now you can see that the four
photos are now available there.
So then I do another thing. I go and I take the policies, and these are the current
policies I have. And now I specify an additional policy that says pictures that are private --
don't replicate them, replicate none to the cloud. So then I specify the name of the
policy, I load it, and so this is what it looks like.
So this is the current set of policies that is available. So what is happening now? I
had three copies, and one of them happened to be on the cloud because that's what
the CLP output. So by imposing this, if there are photos that are private, they have to be
deleted from the cloud, but I still need to have my third replica. So, indeed, what you
see is that there will be a copy action that will copy the photo, this time to the laptop, so
that I still have three replicas; plus there will be a delete that is going to go and delete
from EC2 Europe this photo, which was indeed a private photo. And so now you can see
there that it has been deleted.
So this is again the coordinator, the home server, as well as four photos. The laptop
has this new photo, and the last one is the EC2 that now has three photos.
Oh, I forgot to make this first screen. Sorry.
Okay. Oops.
So now to go back to the talk and to conclude, so I have presented Anzere, which is a
personal system for data management, and it supports policy-based replication and it
can scale to a large number of items thanks to the introduction of equivalence classes.
Now, this is one piece of the full system we are building, the personal cloud computer.
It's the storage part.
Now, what will be next? So, first of all, we have to work on the application support.
One problem there is how to partition an application across the personal cloud in terms
of code and data. And also, once it is partitioned and you can distribute pieces, you
need to have some coordination across the layers of data, storage, and network,
especially if there are bits of optimization going on at each of these layers. I don't want
to pay more than X for a 3G connection, I want my photo to be stored there, and I want
my application to interact with this and finish within a minute. I mean, these kinds of
interactions.
And then a question that is open is can these be used by existing applications? What's
the [inaudible] of taking an existing application and applying it to this?
Then if we go to multiple applications, coordination across them becomes also a
very hard issue. The risk is that this becomes a set of autonomous systems that react
to each other; you have control loop issues, and you then have unexpected interactions.
So there needs to be coordination across the applications.
And then to go back to your initial question, so far we have considered a single-user
model, and it scales well. We haven't found problems in having this centralized
architecture. But we are considering a multi-user model where the personal clouds of
the users will start interacting, data is shared, you may even share data with online
services. So then distributed policy decisions become important, which does not mean
having a distributed CLP, but it does mean starting to make decisions more locally to a
device. And this would be very interesting.
And then to really conclude: cloud computing infrastructures today provide us an
almost unlimited amount of computation and storage resources. And on the other side
we have mobile phones that for a large majority of users are today the primary personal
computer; they are already very popular, and they are going to act as an entry point
to the cloud as well.
So with Rhizoma I plan to bring together these two worlds and make the phone a
controller of personal cloud applications that run across this personal cloud.
So I conclude here, and if you have more questions or questions I haven't answered,
I'm happy to take them.
Thank you.
[applause].
>>Doug Burger: Okay. Any questions?
>>Oriana Riva: Yes? You had a question, I think, before.
>>: Yes, I was curious about -- I want to push a little bit more on this idea of item
classes instead of individual items. If we're getting -- I'm wondering if there are points in
the operation space where [inaudible] where I had an image and all my images get
copied from one [inaudible] to another and then I add -- I delete an image and they all
move back. This could be kind of happening rapidly because I'm unable to reason
about items individually.
>>Oriana Riva: Yes. So it's a good point. So how stable is this, right? Because now
you reason by equivalence classes. Now, in your specific example I think that -- so one
possibility is to say you can have equivalence classes and you can start sort of
specifying them at a finer granularity, so that you also avoid the case I was mentioning
where I have to copy them and they can't all be stored.
But I think that in other cases, if you delete an item, it means that that item shouldn't be
stored anymore on your local device, but on the other hand, you want to have a certain
number of replicas of that item. So the system will look at that single item, making
sure that there is a replica of it. So that won't bring instability.
I think the instability may be more when you start having policies that change very
frequently and then it becomes harder to optimize and coordinate between one change
to the other. I think that's one -- but I haven't really investigated so much. As I said,
there are a lot of optimizations that could make the system much more efficient.
Also, one thing I'm considering is starting to look at what is the load that the user
generates, because if you think -- as a user, if you produce photos every 10 seconds,
you're taking photos with your phone and you want to have them replicated, it's very
unlikely that at the same time you will produce 100 music files and 10 documents and --
you know, so this means that you have sort of a partitioned workload that comes in, and
the policies related to those specific items should be activated at that time. So this will
make the system scale much better and probably give more stability. But these are
things that I still have to investigate, so I don't have a proof of that.
Yes?
>>: Totally other random topic, but one thing that I would hope that you'd think about
at some point as you continue this work further is thinking about how to automatically
learn policies or learn labels. So the reason I say things like that is -- I manage my own
pictures, right? I have a very zero-one thing. I can only treat them as all the pictures. I
have no way of dealing with any finer granularity of pictures of [inaudible] versus pictures
of when I was seven. They're all in the same place just because it's too hard for me to try
and separate them out. You might think that, you know, some sort of machine learning
guide can be put on top of this, and I should be able to say, well, look, you know, even
though you have these pictures and you treat them all the same, you haven't
looked at these pictures in ten years. And while you want them to exist, we can let them
sit over on that old slow hard drive that we never even cared about.
>>Oriana Riva: Sure. Yeah. I think that would be interesting. At the moment,
[inaudible] Exif tool, so we extract metadata. If we have a user that is very careful and
has tagged all these photos, then we can do very cool things in reasoning about it. So I
think that if there were some machine learning system that allows me to have richer
metadata, then, yeah, the framework is there for reasoning about it. Yeah, sure. That
would be cool, too.
Yes?
>>: I was wondering how things [inaudible].
>>Oriana Riva: Thank you for asking this question.
So I'm a big fan of the consistency model we have in the system. And so with photos
and music files, after all, you don't change them that much because, I mean --
>>: Never change them.
>>Oriana Riva: Exactly. But --
>>: [inaudible].
>>Oriana Riva: Unless you change the [inaudible], but, still, the file won't change. But if
you start having documents and stuff like that, so then it becomes very interesting.
So I did not go into the details of this part of the system, but I mentioned that we use
[inaudible] as the basic partial replication, and on top of that, we have arbitrary
consistency. So this means that every time an item is added to the system, you can
specify at which level of consistency the item should be maintained.
So you can say strong consistency, which means that any update you do on that item
will be propagated to all devices that have subscribed to it, which means all devices
that are interested in it -- they have a policy that says you should have that item.
Or you could specify a much higher value, so weaker consistency. You increase the
level. And so then you can have even 10,000 updates that don't get replicated.
And at the user level, this is what we call the [inaudible]: if you saw in the
demo, I had that little window where I put -- there was a 2, and I put a 0 as the level of
consistency. So if it's 0, it means strong consistency. If I put a 10, it means I can have
10 outstanding writes on this item until I run consensus and I make sure it's consistent
again.
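As a sketch, that bound can be read as a simple rule, with a hypothetical item:

    % consistency_bound(Item, Bound): 0 means strong consistency
    % (consensus on every write); N allows N outstanding writes.
    consistency_bound(doc_7, 10).

    must_run_consensus(Item, Outstanding) :-
        consistency_bound(Item, Bound),
        Outstanding >= Bound.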
But I think that it is independent of the replication policies. So when I do a copy action,
I make sure that subscriptions among devices are set correctly, and then, if that item
was a strongly consistent replicated item, the consistency layer will work on top of it.
But that's independent of the replication policies. So once you've got the item, you are
part of the consensus for that item.
>>: [inaudible].
>>Oriana Riva: Yes.
>>: So consistency could change your tradeoffs, your [inaudible].
>>Oriana Riva: Yes. Yes. Definitely. But what I meant is that's not controlled by the
policies now. It's controlled by the user. So when I create an item, I specify a level of
consistency; I can even specify a bound on the number of invalidations. So there are
two parameters that control the consistency: the [inaudible] and the sending bound.
The sending bound tells me how many outstanding invalidations I can have before I send
them.
So these two parameters are specified by the user. They are not part of the policies.
Yes, you're right. It's on the user's side when I add an item. But it would be interesting
to then have also a logic that allows you to optimize on that, right? Optimize on which
level the user has specified and what's the optimal distribution of data that can
guarantee that consistency level with a minimal usage of resources.
Did I answer your question?
>>: [inaudible].
>>Oriana Riva: Okay.
>>Doug Burger: Actually, why don't we end it here.
[applause]