>> Roger Barga: Good afternoon, everybody. And welcome to our panel on the
cloud applications of the future. A few notes first before we begin, and then I'll do
the introductions for our panelists. We have no break between our panel and the
next session, so I am going to be ruthless on time, and we'll move forward right
into the next session after the panel wraps up.
A few words about the panel. When thinking about options for panels, we
wanted to be consistent with the theme of this workshop, and that is looking at the
future of the cloud, in particular the applications that the cloud could enable. And
I challenged our panelists here to actually think about the devices, the way they
interact with the cloud and how that would change over time and not focus on
VMs or performance or some of the hard but relatively mundane things we have
to wrestle with over the next couple of years but to look out into the future. And I
would encourage you, as the audience, to engage with the panelists, trying
to understand their thinking, trying to understand the research challenges to
actually make their vision a reality.
And I was very pleased that all the panelists I invited accepted. I chose people I
thought were creative and very forward looking thinkers. So I'm
going to make some introductions here before we get started.
Paul Watson, from the University of Newcastle. Bertrand Meyer from ETH
Zurich. Marty Humphrey from the University of Virginia. Savas Parastatidis from
Microsoft. And Rob Gillen from Oak Ridge National Lab.
Two of our presenters will be using slides; the other three will just be speaking.
I've asked them to talk for five to seven minutes, and then we'll open it up for
questions and discussion from the audience.
So with that, I'll turn it over to Paul. And we can turn the projector on any time.
>> Paul Watson: Okay. So what I wanted to do is to use some examples of
work that's in progress at Newcastle to try to illustrate some points about where I
see the potential for cloud applications in the future, but also the issues that we
really need to address if we're to use the cloud for this new class of
applications that we're seeing.
So I've been working on data for 20, 25 years. And the main change I've seen
over the last two or three years has been sensor based data, realtime data. So
five or six years ago when we started e-Science at Newcastle people would
come along and they had spreadsheets or some database and they wanted to
manipulate the information in them. Now they often come along with some
sensor based information and they want to know what to do with it.
And you can see some examples there, just from projects that we're working on,
which range from systems where, as you drive along, a system in the car
monitors everything that's going on, to environmental sensors, which of course
are really revolutionizing that area of science, to more and more medical
experiments taking advantage of lightweight medical sensors to do realtime
monitoring of patients.
And in the transport area we've got all these things which allow you to locate cars
and people. And of course RFIDs, typically in the logistics area.
And you've also got people as well. You can think of people now as generating
realtime information; they're sensors, with their tweeting, their use of e-mail
and texting and so on. And the thing I came to realize a little while ago is that you
can also see software as a service as an example of a new sort of sensor. Because
as people are typing, carrying out actions using these applications, all the
information is available in realtime to the application, if only it could do something
useful with it. And that's the trick with all of these systems. You've got all this
realtime information, and how can you use it?
I'm going to use two specific examples to illustrate the issues. So these are from
a project in Newcastle called SIDE, Social Inclusion Through the Digital
Economy. So the idea here is to use -- look at using technologies, new
technologies to help those people in society who traditionally don't benefit from
technology. So older people, disabled people and some sections of youth who are
excluded, who are on the margins of society. And there are several strands, but
one's on transport.
And this is our electric car. So we've got five of these that travel around
Newcastle. And I must admit my heart sank when Dan Reed in his talk this
morning started talking about exactly what I was going to talk about, which was
monitoring where the car was, monitoring the charge, and giving you
information about how to navigate to a charger.
But in this particular project we're also looking at realtime telemetry to help older
drivers. So the idea is if you've got this information from the engine and you can
process it in realtime, there's all sorts of interesting things you can do with it. So
this is just a realtime trace of somebody driving around Newcastle. You can see
the sorts of information you can get in realtime, including things like not just the
distance but the brake pedal being pressed and how many times that happens. We
get an event any time anybody does that, any time they press the accelerator, any
time they change gear.
And you can do two things. One is long-term trends for older drivers. If you do
surveys in the UK and ask older people what their main concern is, the main
thing is mobility. So keeping mobile is really important. And as people age they
start to run into problems. One of the things that you can detect with these sorts
of systems is cognitive or sensory impairment, so for example vision problems.
And one way to do that is if you find from these telemetry systems that you get
acceleration and then braking, acceleration, braking. That often means that
people's eyesight is starting to deteriorate and they're getting very close to the
car in front before they notice it, and then they're slamming the brake on,
accelerating again, and slamming the brake on. So you can start to detect that.
But you might think, well, you could do that not in realtime; you could just collect
information in the car and over a period of time you might detect this. But actually
another real problem is to do with drugs. Older people go to the doctor, who gives
them some drugs for a particular ailment, and to the surprise of the medic and the
older person it really impairs their driving ability; it has an effect on their vision
or their ability to navigate the car.
And you can detect this in realtime if you can monitor the information coming
from the engine and start to see the trends which mean that they're not really
driving particularly well. And you can do things like suggest that they try a
different drug, or you can think about putting assistance into the car to help
them, for example distance sensors so that they get a warning when they're
within a particular range of the car in front.
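As a concrete illustration of the kind of realtime pattern matching described here, the sketch below flags rapid alternation between accelerator and brake in a telemetry stream. The event fields, window size, and threshold are hypothetical, not the SIDE project's actual schema.

```python
# Minimal sketch: flag rapid accelerate/brake oscillation in a telemetry stream.
# Event names, fields and thresholds are hypothetical, not the real system's.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PedalEvent:
    timestamp: float   # seconds since the start of the trip
    pedal: str         # "accelerator" or "brake"


def oscillation_alerts(events: Iterable[PedalEvent],
                       window_s: float = 30.0,
                       min_switches: int = 6) -> List[float]:
    """Return timestamps where the driver switched between accelerator and
    brake at least `min_switches` times within the last `window_s` seconds."""
    switches: List[float] = []   # times at which the active pedal changed
    alerts: List[float] = []
    last_pedal = None
    for ev in events:
        if last_pedal is not None and ev.pedal != last_pedal:
            switches.append(ev.timestamp)
            # keep only switches inside the sliding window
            switches = [t for t in switches if ev.timestamp - t <= window_s]
            if len(switches) >= min_switches:
                alerts.append(ev.timestamp)
        last_pedal = ev.pedal
    return alerts
```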
So you've got this basic system. As an example you want to send information
from the telemetry up somewhere, you want to do pattern matching, analysis and
reaction which then feeds back down to the car but also to other people, some
medics, opticians, people who can provide assistive technologies. And in order
to do this properly you also need to integrate lots of information. So it would be
nice, for example, if you had some information about weather conditions because
that also affects driving.
So combining this information in realtime allows you to do more than you
can with just one source of information.
The final example is this one which is about the connected homes. So can we
use these new pervasive technologies to allow older people to stay in their home
for longer because from the point of view of older people they would like to stay
in their own homes as long as possible. From the point of view of the state it's
actually very expensive to support older people in sheltered accommodation. So
this is an example of a kitchen in Newcastle which is all wired up with
sensors. The floors have pressure sensors on them so you know exactly where
people are, and there are RFID tags in all of the utensils and all of the ingredients
that somebody might use in their kitchen. And there are cameras as well which we can
use to get [inaudible] about what people are doing in the kitchen. And if I go to
the next slide, which is somebody preparing food, you can see on the back wall
information about the menu, and you might notice that the handles of the
utensils are quite thick; this is because we've got remote controllers with
accelerometers in them. So all the time, as people are using these utensils,
it knows if you pick the spoon up, it knows whether you're stirring with it or
whether you're ladling with it and so on. So there's all this activity recognition
going on all the time.
And the idea is that if you have all this information from all these sensors, then you
can work out what task somebody is doing. And if, for example, you have
somebody with dementia, so they have problems with short-term memory and
they get stuck in the middle of a task, then you can prompt them. So basically
you build mathematical models of activities, you trace from the sensor information
where they are in that activity, and then you give an audible prompt or a visual
prompt when they get stuck, to go and carry out the next action.
So for example, if they're making a cup of tea they might get stuck at the point
where they need to go and fill the kettle up with water so you tell them that they
need to do that. And again you get this sort of sensor information going
somewhere to be processed and fed back down, in terms of advice to the
individual but also to other people. For example, if someone normally uses the
kitchen by 9:00 in the morning and they haven't, you might want to alert neighbors
or friends so that somebody can go and have a look, just to check everything's okay.
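To make the prompting idea concrete, here is a minimal sketch that models a task as an ordered list of recognisable steps and issues a prompt when the expected step is not observed within a timeout. The step names, the timeout, and the recogniser interface are hypothetical; the real system uses richer activity models built from the sensor data.

```python
# Minimal sketch of the prompting idea: a task is an ordered list of steps,
# each recognised from sensor events; if the expected step is not observed
# within a timeout, a prompt is issued. Step names and timeout are hypothetical.
import time

TEA_TASK = ["pick_up_kettle", "fill_kettle", "boil_kettle",
            "add_teabag", "pour_water"]


def monitor_task(steps, latest_action, timeout_s=60):
    """latest_action: callable returning the most recently recognised action
    (a string) or None. Yields a prompt for each step the person seems stuck on."""
    for step in steps:
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            if latest_action() == step:
                break                 # step observed, move on to the next one
            time.sleep(1)
        else:
            # timeout expired without seeing the step: prompt the person
            yield "Prompt: next, please " + step.replace("_", " ")
```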
So we have an architecture for this. We've been building systems to do this sort
of thing for three or four years. And we take the information and we filter it. We
do some realtime analysis and we've got historic warehouses so that we can use
machine learning techniques to look for key patterns. And we run this on our own
servers using traditional database techniques.
But the question is: could and should we move these to clouds? And if you
think about that, then why might we want to move it to clouds? I think this is
one of the key reasons. You can have very thin clients. You can see the
car is just a thin client which is just sending information up. It doesn't have to do
any processing. All of those applications can be done in the cloud, so you get
simpler devices, you need lower energy, and you've got much more flexibility.
So for example, you can add new applications and improve applications all the
time. So if you look at the car, for example, so as I said, when Dan mentioned
the navigation to a charger I was rather deflated this morning but actually it
illustrates the point that using the same sensors in the car, you can do the
navigation to a charger, you can also give assistance to older drivers in the way
that I've described, you can do congestion avoidance, and people are interested
in things like dynamic congestion charging to try to influence the way in which
drivers move on roads in order to reduce congestion. And you can do all that
just with applications. You can do all that with applications in the cloud. You
don't have to install anything new in the car itself.
The same with the kitchen. We've actually -- we're not just doing work on helping
people with dementia, we've also done some work with exactly the same kitchen
on teaching cookery to young kids. We have interactive cookery books which
give you advice as you're going along, for people like me who aren't very good
at it. And language learning as well. So task based language learning through
cookery is actually something our education department's really very interested
in.
Then you've got scalability. At the moment we've got five cars. But how do
we get up to 50 million cars, if you wanted to cover all of the cars in the UK, with
the servers that we have at the moment? And the same goes for the machine
learning algorithms which take all of this data and try to look for these patterns so
that they can provide good advice back to the individual in the car or in the kitchen.
So those are the potential benefits of clouds. There also are some challenges.
So I think I'll stop with this last slide so the question is can clouds meet these
needs. What about the latency, the bandwidth issues, the charges that you get
for sending information in and out of clouds, does that kill you compared to doing
things more locally in the car, in the kitchen? And finally are we all going to go
and invent our own infrastructure to do this separately, or is there some sort of
common middleware for clouds to process this realtime information? Is there a
Hadoop-type middleware, for example, for realtime clouds? Okay. Thank
you.
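Paul's closing question, whether there is a Hadoop-style middleware for realtime clouds, can be made concrete with a sketch of the kind of primitive such middleware might offer: a keyed, sliding-window aggregation over an event stream. The API below is hypothetical, not an existing framework.

```python
# Minimal sketch of a primitive a "Hadoop for realtime" middleware might offer:
# a keyed, sliding-window aggregation over an event stream. Hypothetical API.
from collections import defaultdict, deque


class SlidingWindowCount:
    """Count events per key over the last `window_s` seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.events = defaultdict(deque)   # key -> deque of timestamps

    def add(self, key: str, timestamp: float) -> int:
        q = self.events[key]
        q.append(timestamp)
        while q and timestamp - q[0] > self.window_s:
            q.popleft()                    # expire events outside the window
        return len(q)


# Usage: count brake events per car over the last minute.
counts = SlidingWindowCount(window_s=60.0)
n = counts.add("car-42", timestamp=12.5)
```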
>> Roger Barga: Thank you, Paul. And I'll introduce the next speaker, Bertrand
Meyer from ETH.
>> Bertrand Meyer: Okay. I appreciate very much the opportunity to be on this
panel. I came here basically to learn, so it's a great honor to be
suddenly promoted to expert. On the other hand it's not something that would
intimidate me in the sense that I've learned long ago that the best way to learn
about a subject is to write and publish a book about it. So I guess being on a
panel about it is the second best way to learn about a subject.
Also, don't pay too much attention to these slides. I'm going to reveal the great
secret: they were all prepared in the past 15 minutes. I mean, 20 minutes ago
there were no slides. So they are basically there to help me remember
what I wanted to say.
So what I want to talk about briefly is what the cloud changes with respect to the
application of science with a special emphasis on my field, which is software
engineering. And this is also an opportunity to react to some of the points that
were made in the fascinating keynotes that we've had over the past couple of
days.
So the message that we're getting is clear, the cloud changes everything.
Research will move from experiment-based to data-based. I think all three -- all
three of the keynote speakers so far have said this in remarkably similar terms.
And I would say it's a very compelling message. And in particular Lazowska
made a strong point that computational science, which is our way of interacting
with other disciplines will move from simulation to analysis of very large amounts
of data.
And I also perceived this message in this morning's keynote by Dan Reed: that
inductive approaches will get a new boost, inductive approaches meaning
approaches that work up from the existing data, as we computer scientists would
say bottom up, rather than top down, that is to say deductive. And this, as I
mentioned this morning in a question to Dan, I'm not so sure about.
Clearly there's going to be a number of benefits for science if the world of science
indeed moves in the direction that the speakers unanimously describe. We're
going to be able to do science on a much bigger scale, much more realistic than
what we have done before. We are going to have more realistic scenarios than
those afforded by simulation. I'm thinking in particular of issues like climate
change, which are very controversial today but very critical to the survival of the
world.
We're going to have more surprises because when you get amounts of data that
are larger by several orders of magnitude than what we've been used to, it's
inevitable that we're going to get lots of surprises of things that we did not expect.
Also, the cloud, not automatically but naturally favors more openness. And we've
heard in particular about these astronomy experiments in which initially the
astronomers were a bit reluctant to put their data in the public domain and then
realized that everyone would benefit, because the overall pie would grow
so much that everyone's slice would be bigger.
And this too is, in my opinion, a paradigm for many things to come, in particular in
software engineering, on which I will comment in a moment.
Also an important aspect of science, of good science is reproducibility. By the
way, this is one of the areas where computer science is traditionally very bad,
experimental software engineering has been very bad at reproducibility. And
putting everything on the cloud, everything meaning programs and data cannot
but be extremely favorable to reproducibility.
So these are some of the reasons to be enthusiastic. And I think you agree that
there is a certain feeling of enthusiasm throughout this meeting and the
presentations we have had.
On the other hand, there are a number of issues. And I don't think there are too many
people in this group who are in software engineering. The few of us who are
hard core researchers in software engineering kind of meet in the back corridors
and look at each other and exchange glances that basically mean: do these
people know what they're getting into, right? So everyone knows about problems
of security, for example. But everyone seems also to assume that these things
are going to work.
Well, maybe. But if we are increasing the scale of everything, we are also
increasing the scale of problems. And I don't want to get too personal, but when
preparing my talk, the talk that is coming in the next session, I actually
crashed PowerPoint doing something completely ordinary on a top of the line
workstation with a brand-new version of Windows 7 with all the latest updates. So
you know, before we can get right the program that's going to be connected to
your pacemaker so that it calls the ambulance if there's something wrong with
your heart, before we get that one right, perhaps we must learn how to get
PowerPoint right. So this was my bringing coals to Newcastle moment.
So you know the issues of reliability, of security, the issues of scalability are not
going away. And someone at some point should worry about the software
engineering aspects.
Well, this is kind of my little comment on the issue of inductive versus deductive.
These guys, somewhere between 1930 and 1950, convinced us, convinced most
of the world, that the way good scientists work, at least in modern science, is not
so much inductive; it's not that you look at the data and suddenly get the right
idea. It's actually that you have an idea, right or wrong, and you try it out.
And it seems to me that the case they made is as compelling now as it was then,
and it is very dangerous to imagine that just by analyzing zillions of terabytes of
data you are going to get good science. So take an example from one of the
research topics that I'm exploring. There's a very interesting activity at
the moment in programming methodology, which is to infer program invariants, in
particular loop invariants, right? So the problem, contrary to what one might think, is
not to find loop invariants. That's very easy. You want them, I'll give them to you
automatically. The problem is to find good loop invariants, right?
Because there's this kind of cliche, almost joke, example in the field, which is
that the invariant you are going to find is that your height is always greater
than your shoe size. It's the typical bad mathematical property: it's absolutely
accurate, and it's absolutely uninteresting.
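A small worked example of the point being made here, with a trivially true invariant next to one strong enough to be useful; the loop and the invariants are illustrative only.

```python
# Illustration: for the same loop, a trivially true invariant versus one strong
# enough to prove the postcondition at loop exit.
def sum_list(xs):
    total, i = 0, 0
    while i < len(xs):
        # Trivial invariant (always true, proves nothing useful):
        #   i >= 0
        # Useful invariant (at exit it implies total == sum(xs)):
        #   total == sum(xs[:i]) and 0 <= i <= len(xs)
        total += xs[i]
        i += 1
    return total


assert sum_list([1, 2, 3]) == 6
```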
And this is the kind of thing that you risk getting if you just analyze
the data and see what's there. To take another example, I'm
pretty sure that the people from Safeway who analyze our buying patterns, or
people from Amazon for that matter, pretty much know what they're looking for.
It's not that they just scan their zillions of data points and find out what's there.
So to come to the second and last part of this short overview: I talked about
software engineering for the cloud; now what about the cloud for software
engineering? It seems to me that people in software engineering have so far been
rather shy in taking advantage of huge computing resources, cloud based
or not. And it's a pity because there are lots of things that we can do.
Now, one of the things that has happened in software engineering in recent years
is the explosion of a field which used to be terrible and which is not so bad
anymore. It isn't good yet but is getting much better, a little better. It's empirical
software engineering. Empirical software engineering used to be terrible
because we -- because we had bad data or no data and experiments were not
reproducible. And then suddenly this whole field has become exciting. And
when I say exciting let me qualify this. I mean our stuff is not sexy, right? We
don't have smart cars, we don't have intelligent pacemakers. We don't have
ambient kitchens. And our idea of multidisciplinary work in software engineering is
when someone who does symbolic execution talks to someone who does testing,
or, you know, someone who does model checking talks to someone who does
abstract interpretation. That is how interdisciplinary we are. So exciting needs to
be put in context. But for me it's exciting.
Suddenly we have these huge repositories of data going back five, 10, 15 years,
sometimes more, from Apache, from Eclipse, and also Microsoft has made a lot of
data available to researchers, and suddenly we can seriously talk about software
as an artifact, as an object of empirical, objective study. And this has given a new
impetus to this old and boring field of empirical software engineering. And this is
one of the areas where having much more computing power and much more
storage, and having cloud based architectures, can make a big difference.
Also, applications like model checking, and in particular SAT solving, could make
much more use of cloud based architectures. Proofs as well; although putting your
code on the cloud is not going to turn an undecidable problem into a tractable one,
there are still lots of things we can do by taking advantage of the cloud. And then
the ultimate parallel problem is testing.
And here is a platform for some of the work that we're doing. There's an article in
a recent issue of IEEE Computer where we described this AutoTest tool that
takes classes and tests them completely automatically by basically trying
everything. It sounds silly but actually it works very nicely. If you have enough
computing power it finds bugs, which is the only significant criterion for testing.
So this is the kind of thing for which we can use this kind of cloud based
architecture. The more the better. And I think we're going to see much more of
this.
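For readers unfamiliar with this style of testing, here is a minimal random-testing sketch in the same spirit. AutoTest itself works on Eiffel classes with contracts; this Python stand-in only illustrates the idea of generating random inputs and checking a postcondition.

```python
# Minimal sketch of contract-style random testing (AutoTest works on Eiffel
# classes with contracts; this is only an illustration of the idea).
import random


def test_randomly(function, argument_generator, oracle, trials=10_000):
    """Call `function` on random inputs; return inputs whose result violates
    the `oracle` (a postcondition check) or raises an unexpected exception."""
    failures = []
    for _ in range(trials):
        args = argument_generator()
        try:
            result = function(*args)
            if not oracle(args, result):
                failures.append(args)
        except Exception:
            failures.append(args)          # unexpected exception counts as a bug
    return failures


# Example: a buggy absolute-value function caught by its postcondition.
buggy_abs = lambda x: x if x > 0 else x    # forgets to negate negative inputs
gen = lambda: (random.randint(-100, 100),)
post = lambda args, result: result >= 0 and result in (args[0], -args[0])
print(test_randomly(buggy_abs, gen, post, trials=1000)[:3])
```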
And to finish, this is a plug for the next talk. So come in an hour or so,
45 minutes or so, to room 1927, where I'll be describing a project that we have
which is another almost ideal application of cloud based computing, that is to say
a development environment where the program, actually all the artifacts of the
software project, are on the cloud. It's completely ridiculous that each developer
should have his or her own copy and then we run into these idiotic problems
reconciling everything.
And, yeah, so as a conclusion, although I'm not supposed to give a conclusion here,
it's more a start for discussion: there are lots of issues associated with software
engineering and the cloud, but software engineering is absolutely fundamental
for the cloud, and the cloud can bring new dimensions to software engineering.
>> Roger Barga: Thank you, Bertrand. We can go ahead and shut the projector
off now, guys. And I'll invite our next speaker Marty Humphrey to go ahead and
talk now.
>> Marty Humphrey: So when Roger told me about the panel, and I heard my
old friends Paul and Savas were going to be on it, I was like, excellent, I'm sure I'll
learn a lot. And then he goes, no, no, he wants you to serve on the panel. So
things are a little more challenging when you serve on the panel, because, as with
Paul, the morning keynote in particular I thought really summed up what I
thought the next application should be, which is essentially that I want my
applications to deliver the right information to me at the right time, in the right
format, and in the right context, meaning on my particular device, wherever I am,
or my laptop or whatever.
And Dan's application scenario where the phone doesn't call him unless he
wants to pick up, you know, I thought that was right on. That's what we want.
And so my requirements for the next application are in some sense the application
I've been wanting for a while. Whether it now appears in the cloud or not, well,
that's somewhat orthogonal. But those are the applications that I think we really
need to be looking for.
And so then I started thinking about what I thought were the somewhat less
mundane reasons why these applications might not appear so quickly. You
know, it's the problems that I think the community should be looking at and
solving. And so I came up with my top five.
The first thing that I think we really need to look into as a community is this notion of
push technology. It's interesting. One of my students told me that Amazon just
came out with something this week which looks really intriguing. Can it be used
in this context, and on other platforms, as a means to push information to us? I think
that would be really interesting to look into further.
Just like with Paul, I want the realtime cloud. I want the realtime access to data.
I want the realtime compute. That would be number 2 that I think we as a
community need to develop. Number 3, related to that, is that I want dynamic
scalability on the order of seconds. We know right now it's on the order of
minutes, on the order of 10 minutes. I want to scale up and crank back down on
the order of seconds. I'm not sure when that's going to happen.
Number 4 is I think that the cloud boundary is too thick. And what I mean by that
is you tend to be in the cloud or out of the cloud. You know kind of where you
are, you know where the data is. You know in particular what cloud you're
operating in. You know where things are. I want the barrier between clouds to
break down. I want the barrier between cloud and non-cloud to break down as
well.
And then the last one, it may just be my particular pet peeve, in fact I've had
some conversations with various people here and they've told me I'm completely
wrong with this one and I respect that, but the last one is I want to stop thinking
about the economics of the cloud. I want to stop having to consider whether or
not the cloud is economically feasible. I want to not worry about moving data into
the cloud because it's off hours and it won't cost me that much. I don't want to
keep having to go through this in my mind, is this worth $6, is this worth $7? I
want to stop doing that. I don't know when that's going to go away. I don't know
whether it's just going to be a given that it's economically feasible. I'm not quite
there yet. And so those are the five things that I think we as a community should
be looking at.
>> Savas Parastatidis: So thank you very much for the invitation to the panel,
Roger. It's very difficult to follow, you know, all the keynotes and then what my
co-panelists have said, because they've covered most of the space.
And where I would like to concentrate is again the future, and the future
applications, the types of applications that I see the cloud enabling. When I
say the cloud, it's just another name for what we've been talking about all these
years around services. We've called it Web services and endpoints, and we've
called it grid. The main thing that we should really be considering is the type of
applications that we want to enable.
I'm with Marty; I don't know who told you you are wrong, but I'm with you on the
cost. I believe that very soon, in the next few years, perhaps a decade, we're going
to see the virtualization aspects of the cloud become so commoditized, so cheap,
that companies are going to effectively offer additional services on top and offer
the infrastructure for free. The costs are going down so fast that at some point the
infrastructure is going to be free. And that's where I see the value coming: with
companies like Amazon innovating on the services side all the time, and they're
going to be making money out of those services rather than the actual hosting of
virtual machines.
The interesting thing then becomes how do you combine those services to build
more interesting applications? Now the services become the middleware, the
middleware that spans the entire planet, and our applications have to be aware of
that. We now have applications that will be built at a global scale. Our way of
developing those applications will have to accommodate that.
Where that leads us is to an ecosystem of services and applications that make it
possible to bring islands of data together. Because everything just sits out there,
it's easier to combine. It's easier to process. It's cheap to process. It's cheap to
transfer and so on.
And this is where I believe the future value of applications is going to
be. Some of you know I'm a huge fan of semantics, knowledge
representation technologies, and knowledge processing, and I believe that going
into the future that's going to be the fundamental value of applications that are
hosted in the cloud.
How do we combine data in interesting ways without having to download it and
process it locally, and then perhaps sell it back to the community? Everything is out
there. Companies are actually starting to try to attract customers to their
infrastructure by offering access to data in a faster way.
Amazon has started it, and, as was said earlier, I believe you saw a
presentation from one of our technologists, [inaudible], and I think this is
something companies put out there to attract people to their platforms.
But the interesting applications are built on top.
Another discussion that I had two days ago was about this concept of building
Foursquare-type applications for data. It's great to use maps and match up the
Foursquare notes that people put up for various places. So you can go [inaudible]
on top of Lake Washington; someone had left a comment saying do not
try to cross the bridge when there are high winds in Seattle. And now that note is
always there for someone to go and see.
Imagine if you can start combining information in a way that adds value for the
consumer. So imagine someone posting a note about a concert that took place at
a stadium, a note saying I was here. You could have an ecosystem of data that is
combined to infer that that note was about the particular concert that took place
there, because we have both spatial and temporal data from around the world to
do that. You can do this for science. You can annotate data and make the data
available.
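A minimal sketch of the kind of spatio-temporal matching described here: link a geotagged, timestamped note to a known event by closeness in space and time. The event data, distance threshold, and time window are made up for illustration.

```python
# Minimal sketch: match a geotagged, timestamped note ("I was here") to a known
# event by closeness in space and time. Thresholds and data are illustrative.
from math import radians, sin, cos, asin, sqrt


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))


def match_note_to_event(note, events, max_km=0.5, max_hours=6):
    """Return events close to the note in both space and time.
    `note` and each event are dicts with "lat", "lon", and "time" (epoch seconds)."""
    return [e for e in events
            if distance_km(note["lat"], note["lon"], e["lat"], e["lon"]) <= max_km
            and abs(note["time"] - e["time"]) <= max_hours * 3600]
```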
Everything that we've been talking about, the provenance of information, can be out
there. And yes, of course, you access all that information from a variety of
devices.
And so to me, the value of the cloud is that it can enable these types of
applications: applications of information, of knowledge, and the sharing of those.
I'll stop there.
>> Rob Gillen: So I've been spending the last week or so, since Roger asked,
thinking about this, and my background's probably a little different than most in
that I'm not a classically trained scientist. I came from about 10 years in
industry before I ended up at Oak Ridge.
And as I've been chewing through this, I came up with two examples I'm going
to use to start my answer to the question of what I think the cloud enables.
The first is Lego Mindstorms. Anybody know what Lego Mindstorms are?
Okay. I love Lego Mindstorms. In case you don't know what it is, it's
essentially a toy that allows kids of, well, frankly any age, I'm learning, to
follow sort of a Lego paradigm and play with robotics. There's
very little presupposition about what your knowledge is.
If you can plug things together, if you can follow pictures, you can build this little
robot. But it's not just a toy, in that it has sort of this extensibility pattern:
beyond the ability to build it, to program the motors, and to see things work within
the Lego construct, you also have the ability to talk to more real-world robotics
programming environments like Microsoft Robotics Studio. So it's sort of, I hate to
put it this way, a gateway drug to robotics, right? It's that thing that can get you
hooked and pull you into something. But the barrier to entry is really small.
The second example: I had the privilege, sort of on a fluke this weekend, of
running into my high school science teacher, who happened to be passing through
town; we ran into each other at a common event. This guy's name is Randy White,
and he's a very quiet, unassuming guy. But he's passionate about his technology
and he's very good at transference, right? He's very good at helping people
understand, or develop a desire to learn more. And one of the phrases he used
time and time again, and it's probably the only thing I remember from his physics
class, is start with what you know.
He would present this big problem, or at least what seemed to me at the time to
be a big problem, to the class, and he would say start with what you know.
Solve that problem. And the premise being that most of the major problems we
need to solve are nothing more than conglomerations of smaller problems, an
aggregation of smaller problems. And if you take incremental steps to
solve those problems, you can turn around one day and see that you've solved
something pretty significant.
So now if you're sitting there wondering what that has to do with the topic du
jour, well, let's try to tie it together.
I view cloud computing in the context of some of the cost models that we
mentioned earlier. While you're right that it would be nice to not have to think
about that, I think we'll eventually get to the spot where it's like buying a cup
of coffee: you don't think about the fact that your cup of Starbucks costs four
bucks, you just buy it, right? It becomes sort of a cost of doing business.
But I think cloud computing, in the paradigms we see with both Amazon and
Microsoft right now, that price model opens up opportunities and accessibility;
it makes large scale accessible to groups of people who would never have had
access previously. I think that is amazing, right?
Randy White taught our class, and I grew up in a high school where we had a total
of 23 people in our graduating class. So a really small high school. There's not a
chance in the world that we would ever have had a computing lab that we could do
this work with. We didn't have access to clusters. We didn't know what that was.
Yet now, with cloud computing, Randy can stand up a lab to actually demonstrate
an experiment, to do computational physics, for pennies or dollars, right? He can
do a 20-node cluster for a couple of bucks for a class period.
We saw the example during one of the keynotes the other day where cloud
computing enabled them to do this rather significant testing that they
would not have been able to do before.
And I guess to me the promise is, you know, I work at this massive computing
center, and we've had a lot of talks at the collegiate level and the research
center level, but I'd like to challenge us to look beyond that, or below that,
depending on how you view the world. Let's challenge ourselves to look at ways of
exposing computational thinking, computational processing, and massively parallel
thinking, though we certainly wouldn't call it that because they wouldn't get that,
to high schoolers, to students at that educational level. How do we provide tools so
that teachers such as Randy White, who are simply trying to express physics in
computational ideas, can do so with richness, with connectivity to massive data
sets?
I was talking to a couple of the guys in the back from Microsoft Research earlier.
I said I had the privilege of not being trained as a classical scientist and
learning that you're not supposed to share data until you've published. And I'd
like to see us publish data early and often. Get that
data out. Get larger collections of people consuming the data. You know, we
saw the talk earlier about Dallas. And one of the key things I think Dallas does,
and there's still work that needs to be done there, is to provide a commonality of
access to data. So you don't necessarily have to be a domain scientist to access
that data. You can be someone who may be a scientist, may be a science
teacher, and have obviously a certain level of intelligence, but you can pull
together data from disparate disciplines and do something different, do
something to open the eyes of that next generation of computational
scientists. I think that's where a real challenge lies, and I'd like to see sort of the
community step up and address that area.
>> Roger Barga: Thank you, Rob.
At this point, I'd like to take questions from the audience. And I'll bring a
microphone to those of you. Just raise your hands and I'll bring it out to you so
we can ask our panelists questions.
>>: Thank you. My name is Rodrigo. And I'd like to propose a theme for the
panelists, and the theme is about waste and cost. Two of the panelists mentioned
that they would like to see a day when the cloud doesn't have any cost, so that
it would be accessible to all.
>>: Wait. That's what he said. Let me clarify. I don't want it to be free. I don't
need it to be free. But I don't want to worry about doing the math to figure out if I
should do it locally or somewhere else or whatever. I'm more than willing to pay.
He's not. [laughter].
>>: No, I know. Just let me finish the question first. Okay. So the question
is the following. We're comfortable with our desktops: when we're spending cycles
or doing things with a certain overhead, it doesn't hurt so much. But in the cloud,
when we introduce overhead into cloud applications, the cluster is misused. Isn't
the cloud targeting environments where we should be especially concerned with
performance? Because when we're spending cycles, because of overhead in the
technology, we're preventing other people from using those resources, and that
maps directly to natural resources.
>>: So first I need to clarify. I'm not suggesting that everything is going
to be free; it's just that, when it comes to cost, the companies will give the
infrastructure for free, the platform for free, because they are going to see value
from offering additional services. And of course we will have to pay for those
services.
Even though [inaudible] is not to think about cost and the infrastructure and
platform, I think when it comes to the environment, the aggregation of resources
in data centers built with efficiency and economics in mind is better than having
thousands and thousands of computers doing processing all around the world.
And you mentioned, you know, the performance issue. Again, the current
examples of how cloud computing resources are being used, whether they are
internal to an organization or whether they are offered to the public, show that
from an economics perspective, a power consumption perspective, I think
sometimes it's better to wait and to have scale rather than trying to do things
fast and get an answer sooner.
And again, you know, Google's infrastructure [inaudible] demonstrated
exactly that. People are willing to wait if it is economically better to do so.
>> Roger Barga: Thanks, Savas.
>>: On the economics, the cloud has, it seems to me, a rather important
consequence, which is to go from a purchase model to a rental model. And in the
case of software engineering this could have a major effect on the evolution of
the field, because it could get us out of the quandary we've been in for
years. The quandary that I'm referring to is the difficulty, not to say sometimes
impossibility, of selling software tools, of selling software development tools.
Now, I'm exaggerating a little bit, because it's not impossible, but it is hard; for
example, it's difficult to sell reusable components because you get caught in this
dilemma of either charging too much up front or not recouping your investment.
And of course this has been compounded by the whole emergence of free
software, and I don't want to go into very politically charged debates here, but the
general assumption that software, in particular software tools, should be free has
certainly complicated the picture.
And what cloud based tools, a typical example being Salesforce, show is
that it seems to be much easier to make customers accept a rental based model,
because it's a little money for a really long time as opposed to a lot of money
for a short time, and psychologically that's much easier. And this might provide a
new impetus for people who want to advance software engineering; it might
enable them to find an economic model that supports progress, as opposed to the
model that we have had for the past 20 years, which actually has, in my opinion,
hindered progress.
>>: Hi there. [inaudible] from the University of Denver. I just want to say that I
think you're very right about, you know, thinking about cost. This is kind of a
problem, I guess, when you look at how these services are brought to the market
right now. You're not constrained as a user anymore by the infrastructure that's
there. You have the choice of different types of architectures, different types of
pricing models, so using reserved instances, spot instances, on demand instances.
And this is a kind of complexity, I guess, that a lot of users don't want to think
about, and that should be tackled with, let's say, brokering systems that operate
at the level of an entire organization, for example, and can optimize this entire
portfolio.
I think we can learn a lot also from fields of electricity markets, for example, how
brokering is done there, to take away this complexity. And this also has a lot to
do, I guess, with modeling, performance modeling, to kind of automate this
allocation process. So I don't know if there are more ideas on the panel in this
respect, what could be done there, what the key questions are. But I think it's a
very important research topic.
>>: [inaudible] before we came here about exactly this topic: there is space
for middleware out there that will make it easier for software vendors to
build applications that can make those types of decisions, based on modeling,
based on information that comes dynamically from cloud providers, based on an
understanding of how the application behaves, the communication, the interaction
patterns, the use of storage. And perhaps you can even think of a marketplace
where you can have the deal of the day and you can utilize some computational
power and so on.
>>: I think what's difficult with this problem is that basically you have to solve all
these issues that have been around in distributed computing for tens of
years: how you map workloads to infrastructure efficiently. And if you really want
to minimize cost, you need to solve all of that alongside all the difficulty of
trading efficiently. So I think it's a really complex and difficult problem, certainly
to solve optimally.
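As a toy illustration of the brokering idea being discussed, the sketch below compares hypothetical pricing options for an estimated workload and picks the cheapest. The prices, the interruption penalty, and the workload model are illustrative, not any provider's actual rates.

```python
# Toy broker: compare hypothetical pricing options for an estimated workload
# and pick the cheapest. All numbers are made up for illustration.
def cheapest_option(node_hours: float, deadline_flexible: bool):
    # assume a 20% rerun penalty on spot capacity if interruptions would hurt
    spot_penalty = 1.0 if deadline_flexible else 1.2
    options = {
        "on_demand": 0.12 * node_hours,
        "reserved": 50.0 + 0.05 * node_hours,   # upfront fee plus lower hourly rate
        "spot": 0.04 * node_hours * spot_penalty,
    }
    return min(options.items(), key=lambda kv: kv[1])


print(cheapest_option(node_hours=800, deadline_flexible=True))   # ('spot', 32.0)
```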
>>: So one thing that I think is important to bear in mind is that we actually cost
something per hour as well. You may be charged 10 cents per CPU hour to
use a cloud, but actually most of us in this room are paid, I don't know,
a thousand times that per hour, or something like that. Even grad
students are paid significantly more than 10 cents an hour. And so unless you are
going to produce a code that you're going to run again and again and again, you
really do have to make sure that you're not spending too much time optimizing
something, wasting your own time doing that, when it would be better, if you took
everything into account, just to run it and then move on to the next thing.
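The trade-off just described reduces to simple arithmetic; a back-of-the-envelope sketch, with illustrative numbers only.

```python
# Back-of-the-envelope check: optimizing only pays off if the compute it saves
# over all future runs exceeds the cost of the person's time. Numbers are
# illustrative, not real salaries or prices.
def optimization_pays_off(hours_spent_optimizing, hourly_pay,
                          cpu_hours_saved_per_run, cpu_hour_price, runs):
    human_cost = hours_spent_optimizing * hourly_pay
    compute_saved = cpu_hours_saved_per_run * cpu_hour_price * runs
    return compute_saved > human_cost


# A week of tuning at $50/h only pays off here after many repeated runs.
print(optimization_pays_off(40, 50.0, 100, 0.10, runs=10))    # False
print(optimization_pays_off(40, 50.0, 100, 0.10, runs=500))   # True
```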
>>: I agree completely, but I'm thinking about when the cloud infrastructure
becomes a real utility, you know, something that people rely on every day like
they rely on electricity, so that it supports their business. Then you can think
about workloads that are, you know, run often on this infrastructure, and you can
learn about how they operate and how you want to map them. And I think it's
largely unexplored at the moment.
>>: Just to throw a kink in the argument here, I think the point over here is
crucial. So I came from industry; I did a lot of software consulting, so it was
always about billable hours. You're always thinking about how much am I costing
that client, how much are my efficiencies and so forth. I think that's very
important.
I also think, and this might be slight heresy in this room, that we need to think
about how long it's taking us to solve a given problem for a science domain, not
how perfect our computer science can be. Because, frankly, I at least view
computer science as a means to an end, not an end in and of itself, right? We're
doing what we're doing, we're coming up with algorithms and means of solving
problems, so that we can solve actual problems, right? They're tools or building
mechanisms to do that. And this may sound funny coming from a place where
people spend, you know, years working on squeezing every last drop of
performance out of MPI, but there's a balance, right? One of the keynoters had a
phrase about reducing the time to insight. And that's the key, right? If a scientist
poses a question, a domain scientist, be it biology or astrophysics or whatever, if
they pose a question, we should be looking at how we can help them answer that
question as quickly and accurately as possible; not necessarily in terms of
compute time but in terms of wall clock time, from the time they pose the question
to the time they get their answer. Because frankly I don't care if it takes an order
of magnitude longer to run in the cloud; if they don't have to wait in the queue for
six weeks, then that's a win, right?
>>: Finally, just very quickly, to add to Paul's point about human resources and
cost, I think it's great. I think another thing to consider is scalability; not the
scalability of the infrastructure but the scale and complexity of what you are trying
to optimize. A question you can ask is: would it make sense economically for
Facebook's data processing infrastructure to be outsourced to cloud utility
computing, effectively? And could you provide the middleware to demonstrate to
them that their operational expenses are going to be minimized by using some
performance-modeling-related middleware?
We are very far away from that happening.
>>: Okay. I had a question about applications like the smart kitchen and the
uploading of data from the vehicle. What would be the security implications of
that data, which I consider to be very sensitive data about each person?
>>: Yes. So that is an issue in the project. We have a whole strand of work on
trust. And it's interesting how trust differs between older people and younger
people. You get younger people who are prepared to put everything on Facebook,
every picture of them at a party at university; speaking as a parent who has got
two children at university, I find out what they're doing by looking at the
photographs on Facebook, and my kids and lots of kids seem very happy to reveal
lots about their private lives almost in public, whereas older people are in general
much more circumspect about this. So you have to be very careful about what
you're doing.
For example, on one of the slides I put up I deliberately showed medical records
being kept away from the cloud. The idea is that you would integrate some
information with medical records in order to decide, say, what drug to put a
person on given a particular driving behavior, but you might do that off the cloud,
in the private IT system of the medics, rather than in the cloud, if that was an
issue, and it often is.
I think for some of us who are starting to look at these things in other countries
as well, the regional aspects of it really vary. So with this idea that you upload
data into a cloud: we in the university have particular ethical constraints that we
have to meet about what we do with, for example, patient data; then my country,
the UK, has particular constraints as well about what it's allowed to do; then
you've got to look at the small print of the cloud provider for the region the data's
going into. And so it becomes very, very complex. And whether you could take the
lead from the idea that you could make decisions based on cost automatically,
whether you could make decisions based on what data's allowed to go out to a
particular cloud, and which particular cloud it's allowed to go to, based on some
automatic matchmaking between the small print of the cloud provider and the
ethical requirements on the data, is something I know some people are looking at.
But it's a very difficult problem that you do have to consider.
>>: If I can just throw two points out there. And this is coming from someone who
works in a classified environment with nuclear stuff. This argument comes up all
the time. Two semi-provocative points. First, there is a certain security through
obscurity, which I understand is not real security, but you have to think it through:
if you're blended with billions of other records, finding that data in someone else's
cloud is going to be a lot harder than finding it in, say, a hospital's database.
Secondly, people tend to be, and I don't know your case, but in general people
tend to be arrogant about the way they protect their data with respect to how they
think cloud providers protect their data. In general the cloud providers are going
to protect your data much, much better than most data centers that I've ever
seen.
>> Roger Barga: Thank you, Rob. All right. So it's now time for us to wrap up
the panel. I'd like to thank you, the audience, for your questions and attention.
Join me in thanking the panelists.
[applause].
>>: Thank you.