>>: Welcome, everybody, to the session, cloud futures, our last session today. And it's
my pleasure to introduce to you Alexandru Iosup, who is from the Delft University of
Technology. He's going to talk to us on cloud computing support for massively social
gaming. Thank you.
Alexandru Iosup: Okay. Thank you for the introduction.
So before I start with the actual speech, let me mention that this work is done in
cooperation with a wonderful group of people from the University of Delft, from the Vrije
Universiteit in Amsterdam, the Netherlands, from University of Innsbruck, Austria, and
from Polytechnic University of Bucharest in Romania.
I'm trying to postpone delving into the subject a bit more, and I hope I will be given a few
minutes to talk about this excerpt from the presentation. We've been asked to provide
some tips on how we think cloud computing could help our work. Well, I'm a computer
scientist, so I'm going to give you two tips on how I think that clouds could help
computer science now.
The first is, I think, that a service allowing us to do heartbeat-based resource
allocation would be great. And let me give you a sample scenario. We are now running a
project that we call BTWorld, in which we try to collect global information about
the complete BitTorrent network around the world. This is hundreds of millions of
people sharing files in a very large file-sharing network. The main characteristic of such
a project is that you would like to get information continuously and you would preferably
not miss anything.
So, of course, we've got a nice data center, we've got some extra power. The problem
is that at some point, well, parts of our city, the city of Delft, and in particular the
complete campus of the university, just went into a blackout. So we experienced about
a day of failure. That is, no data could be collected during this period. It would have
been extremely helpful if we could have had a cloud service that, upon not
receiving any kind of "we are okay, we are running" message from our servers, could
have started the same service in the cloud automatically. And we definitely would
have been willing to pay extra just to have this kind of nice backup.
Of course, we could have installed our own extra heartbeats that try to check the
system, but our extra heartbeats, which actually existed, were in the same network. So
we could not imagine that the whole campus would just go into a blackout and everything
would go down.
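The heartbeat-based failover service sketched above can be illustrated in a few lines of Python. The timeout value, the `start_backup` callback, and the injectable clock are illustrative assumptions, not details from the talk; a real deployment would wire the callback to a cloud provider's API and, crucially, run the watchdog outside the monitored network.

```python
import time

class HeartbeatWatchdog:
    """Starts a backup service in the cloud when no heartbeat arrives
    within `timeout` seconds. The watchdog must run OUTSIDE the monitored
    network, or a campus-wide blackout takes it down with the servers."""

    def __init__(self, timeout, start_backup, clock=time.time):
        self.timeout = timeout            # seconds of silence before failover
        self.start_backup = start_backup  # e.g. launch a cloud VM (assumed hook)
        self.clock = clock                # injectable for testing
        self.last_beat = clock()
        self.failed_over = False

    def beat(self):
        """Called whenever a heartbeat message arrives from the data center."""
        self.last_beat = self.clock()

    def check(self):
        """Poll periodically; trigger the cloud backup exactly once on silence."""
        if not self.failed_over and self.clock() - self.last_beat > self.timeout:
            self.failed_over = True
            self.start_backup()
        return self.failed_over
```

A monitoring loop would call `beat()` on every incoming message and `check()` on a timer; the one-shot flag prevents launching a second backup on every subsequent poll.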
Okay. The second idea goes in the direction of data, data set storage for computer
science, or data set storage in the cloud. We've heard about the very large data
sets that exist in other domains. Well, computer science also has quite large
data sets. And here you can see a graph where on the horizontal axis you see when
different data sets were created for different communities within computer science, and
on the vertical axis you see the different sizes of these data sets.
And I'm talking about the Grid Workloads Archive, which is the largest collection of grid
traces in the world. Established in 2006, it had just under one terabyte -- sorry, one
gigabyte of information.
Then a year ago we built the Failure Trace Archive, which stores failure and resource
availability information for a large number of distributed systems and contains about 10
gigabytes of information.
We are currently building the Peer-to-Peer Trace Archive, so just consider peer-to-peer
systems, about 20 or 30 of them, and then you take the traces. Then you start going to
really large numbers of samples and large data, about 100 gigabytes. The gaming
trace archive goes in the direction of 1 terabyte, and currently with the BTWorld Project
that I've just explained what it does -- collects information from around the world for
BitTorrent -- we are collecting about 1 terabyte per year.
We would definitely like to share this information with thousands of researchers. This is
why we are building these archives. It's pretty difficult to actually maintain them. So the
cloud could help with free storage, and I'm sure people would pay for processing such data
sets in the clouds.
Okay. So without postponing, I'm going to talk today a bit about what massively social
games are. I'm going to present a bit our mission. It was very nice seeing so many
slides with missions and goals and how we hope to achieve them and so forth. So it's
nice that our presentation has such a slide, too.
Then I'm going to talk a bit about challenges for massively social gaming and how we
can convert these challenges into opportunities for cloud computing. I'm going to give
an example of what we did in practice to address these challenges and talk a bit about
our experience with the clouds for our applications and outside our applications -- we
did a lot of performance testing of clouds -- and I will conclude.
So massively social games are a popular growing market. Massively social games are
a nice internet application with tens of millions of users. If you look on the graph on the
left and on the horizontal axis, you will see the growth of the number of subscribed
players over time. You will see that this number of players is growing exponentially.
Currently the number of players that are subscribed, so really paying for the playing
service, is around 25 million, but you have to consider that there are about 150 million
that are actually active players, not all of them paying for this service. This makes the
gaming industry on par with other entertainment industries such as music and video and
movies.
But what's in a name? What's a massively social game? Well, massively social games
are online games with a massive number of players, so let's say over 100,000, for which
social interaction actually helps or improves the game play experience.
And in practice they consist of three things. A virtual world. This is what the game
actually simulates. In this virtual world, players can explore, do things, learn how to do
things or share experience. They can socialize and, more importantly, they can
compete.
There exists content. And game content can be of many different types. You can have
graphical contents or different types of objects, you can have maps, objects can be
distributed around these maps to create an illusion of a real world. When you distribute
objects across maps, you will probably try to add some semantical meaning to this
distribution, and this is what puzzles actually mean. So there is some relationship
between the objects and perhaps the way players interact with them, and players have
to solve specific challenges related to these abstract relationships within objects, and
this turns into more puzzles and chains of puzzles, which are actually quests. And
there's also culture. So content is definitely a part of culture.
And the third part is game analytics. So once we have a virtual world that operates and
has a massive number of players, we have to make sure that this virtual world can
operate safely and that it can continue operating and can continue developing. So with
game analytics you want to collect player stats and relationships and try to do
something useful with this for the game.
So now the famous mission slides. I hope I can read, because I definitely did not
memorize these things. I have here a very small version of the slide.
So our mission in terms of massively social gaming is to enable the development,
deployment and operation of massively social gaming for small businesses and for
amateur game developers.
Now, in terms of strategy, of course we would like to be the first to identify various types
of opportunities that occur in the context of massively social gaming, but the real
strategy is to design and build fully fledged, fully working cloud-based massively social
games and, while doing that, trying to uncover the real fundamental operations and way
of running such nice applications.
And we want, of course, to have a nice impact, so we would like to run
multidisciplinary -- oh, thank you. Now I have a bigger version of the slide. We would
want to have multidisciplinary and multi-institutional international teams with which to
work on trying to achieve such an ambitious mission. And we would like to educate
academics about the interesting world that is massively social gaming and also use
massively social gaming for academic education. We started around the end of 2009.
Now, about the cloud, I think it's pretty obvious what we would like to achieve. We
would like to explore the capabilities of cloud computing to support real applications with
massive social impact, and massively social gaming is such an application.
I guess one of the main things that I would like to stress in our strategy is that we will try
not to run out of hyperbole when describing cloud projects or cloud prospects before the
clouds actually took a foothold on the market. So we've been hyping cloud already
quite a lot. I think we are running out of hyperbole.
One of the key parts of the strategy is also understanding the capabilities of the cloud
paradigm by developing and deploying and running really real, full-fledged applications.
We started doing a bit of cloud computing research in early 2008, and we intensified this
activity in 2009.
So what are the challenges? Well, if you remember what goes inside the massively
social game, that is a virtual world, some content, and then some idea of players that
are somehow synthesized into analytics, then you can convert it easily into challenges.
So what is the virtual world simulation challenge? Well, we know how to do simulations.
The problem is how to do a simulation on such a platform as a cloud or on different
types of platforms.
What you would want from such a platform is the ability to scale very quickly. Let me
give you a few figures: reaching a million players in just a few days, reaching 10
million players in just a few months. These are the kinds of target requirements that you
would want to achieve.
And if you think about the clouds, then I guess there's a clear match between massively
social gaming and clouds. You want to scale very quickly and
seamlessly. That's what clouds know how to do. Apparently you can do this, according
to our research.
Now, in terms of the content problem, what you would want to do is to generate
content automatically for millions of players, completely automatically, again, on demand.
So somehow this content has to be generated only when needed. You don't want to
generate tons of content, which is expensive, if your players will not demand it or will
just quit the game. And you want the content to be balanced, diverse, and fresh, to
meet the players' expectations, and to ensure some form of fairness in the competitive
world that is a massively social game.
So we've had some nice results with this as well. What I'm going to focus on today is
the idea of solving the analytics problem of massively social gaming, or massively
multiplayer online gaming, that is, analyzing the behavior of millions of players in time.
And this kind of challenge raises questions in terms of data mining, in terms of data
access rights, security, and authentication, and in terms of how you balance the tradeoff
between cost and accuracy. All these challenges match very, very closely what we've
heard until now about what clouds can do and what we think that good research
questions could be in terms of cloud applications.
And for all the challenges that I've been mentioning so far, one of the key, let's say,
requirements for actually having these challenges solved is to ensure that you can
reduce the upfront cost of the specific solution and reduce the operational cost of the
specific solution and also keep the response times low and keep the system scalable.
All these I think match very well what we are trying to do with clouds.
So let me give you a practical example. Consider the subpart of game analytics where
the analytics are performed by a third party. So you've got a game, you've got a
population of tens of millions of players. What you want to do is to build smaller
communities outside the, let's say, control of the game operator, and these smaller
communities, which have hundreds of thousands of players, may require different things.
For instance, they may want to learn what their progress in the game is, they may want
to learn what their group's progress in the game is, they may want to find out whom they
should play with so as to increase their chances of performing well in the game, they
may want to get a sort of news digest of who played well in the game and how that
player did it, and they may want to get automatically generated videos of how other players
enjoy the game and play it.
So what you want to do is to build a continuous analytics platform for massively social
gaming in which the data are provided by a game operator but taken outside the game
operator's platform and analyzed by a third party. And you want to analyze raw and
derivative massively social gaming data such that important events are not lost.
So you've got millions of users, you've got a dynamic size of relevant data, you've got
different types of requests at different times of the day, you've got peaks of players that
really join your community and try to get information about what happens, and you've
got users with very different requirements.
Some users would want to see only very simple types of information every week, some
sort of summary of how they progressed in the game. They would not be willing to pay
too much for this. Some other players really want interactive content, and they will be
willing to pay for it. The only problem is that state-of-the-art third-party analysis only
scales into the tens of thousands of players.
What we did is that we built the CAMEO framework for doing continuous analytics for
massively social gaming. I guess the key part in this framework is that we are able and
willing to use clouds, that is, on-demand, paid-for, guaranteed resources, for sparse or
excess load.
And I will give you an example of what we managed to do with this CAMEO framework.
We took one of the major games on the market. It's RuneScape. It currently holds the
world's record in terms of number of open accounts. Over 135 million players have
opened accounts with this game. And we've collected a data set of over 3 million
players, all the players that are recorded in the high scores. So all the best players that
remain in the game for long enough to become top players.
If you remember, one of the major problems with state-of-the-art analysis in terms of
third-party continuous analytics for massively social gaming was that we were talking about
tens of thousands. Here we are talking already about a cloud-based operation that was
able to collect 3 million. So orders of magnitude more players, more information
collected over a cloud.
What this data collection enabled us to do was to also investigate the level of
achievement of different players. And we've observed that the number of mid- and high-
level players is pretty significant.
What does this mean? Well, if you think about traditional ways to generate content for games,
then you will notice that most game content is generated for very beginner types
of players. I'm not even going to talk about balanced content. I'm not going to talk
about player-customized content. I'm just going to talk about the level for which the
content is addressed.
The reason is very simple. It's difficult to generate content that really stresses the minds
of the top players. This requires a lot of human input in the current solutions.
So this allowed us to identify a new content generation challenge. How do you
generate content automatically or not for mid and high level players? I'm not going to
go into more details about how we solved this challenge or addressed this challenge for
small puzzles, but we have a nice paper on the topic.
Another thing that you can get out of using this platform is the idea of cost. How much
does it cost to do continuous analytics for a specific game? Well, this was one of the
largest games on the market, so it wasn't just any game. And it cost less than 500
per month. Now, game operators don't usually measure things in total cost, but rather in
cost per player per month, so you can see here a cost that I can't even pronounce. It
has quite a few zeros.
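As a back-of-the-envelope illustration only (the talk quotes a total of less than 500 per month, currency unstated, and a data set of about 3 million players), the per-player cost works out roughly as follows:

```python
# Illustrative figures from the talk: ~500 (currency units) per month total,
# ~3 million players tracked in the RuneScape data set.
total_cost_per_month = 500.0
players = 3_000_000

cost_per_player_per_month = total_cost_per_month / players
# On the order of 0.0002 per player per month -- "quite a few zeros".
print(f"{cost_per_player_per_month:.6f}")
```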
Let me give you another example of what you can do in terms of continuous analytics.
We took the largest free online bridge community, which is Bridge Base Online. It has
about 1 million players. What's nice about it is that top-ranked players, real-world,
world-cup, and professional players, occasionally join this network. We collected
a data set of about 100,000 players. And -- thank you -- and tried to investigate what
are the social relationships between players.
And in terms of Bridge, a social relationship is formed between players that play
together. Bridge is a game with four players paired together in two pairs of two players
at a table, and this pairing relationship is something that gets built over time and
becomes a real social relationship.
One of the things that is a sort of constant in social networking around the world is
that a social network of over 100 people cannot usually continue to function. It
disintegrates into some sort of loose connectivity that doesn't persist.
Well, we found that for this gaming social network formed between about 9,000 players
out of those 100,000, this large online social group can coordinate. So those players
stuck to each other and continued to play. We were also able to investigate and identify
different types of player behavior that are really beneficial for games.
Okay. How about lessons learned about the clouds? Well, I think there are two main
pictures of a cloud. On the left part of this slide you will see, well, the path to
abundance. So the cloud's seen as really the path to success. You get on-demand
capacity. It's cheap for short-term tasks. It's absolutely fantastic for web applications.
It has great support for web crawls, especially through the elastic IPs. It has great
database operations, great input/output. Everything seems perfect.
But there is another, darker reality, and this nice cloud can turn into a real tornado or a
cyclone, as I happened to depict here. The problem is that the performance of scientific
applications is absolutely awful. So you can check some results in our previous work.
You will get, I don't know, 10, 15 percent of what you would expect out of such a platform,
compared to what you would get if you had this platform installed in your own department
with the best possible interconnection setup and so forth. That's quite low.
You don't have only this problem. You also have a problem of performance variability.
So let's assume you start provisioning your cloud resources based on some
assumption, some performance model. That one will change quite considerably over time,
and you will get -- okay, so let's skip the low performance for scientific computing. I
guess two or three of the keynote speakers have already also mentioned
very poor performance for scientific computing.
What they didn't mention is this cloud performance variability here. You see a fine little
sample of an investigation that we've done for all the different operations of all the
different services provided by two types of clouds: the infrastructure-as-a-service cloud
Amazon Web Services and the platform-as-a-service cloud Google App Engine.
And what we did is we really investigated what's the performance delivered every two
minutes for a period of about a year and a half. Here you see the results for just one of
the years in the middle of the sampling interval. And what you will see most noticeably
is that the median performance fluctuates quite a lot. This is the median performance
for one of the operations in terms of transferring data.
I'm not even going to talk about the outliers and other statistical properties that show
that this particular operation has variable performance. If over the year the performance
varies quite a lot, the median performance varies quite a lot, then you've got to do
something about the cloud provisioning process.
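One simple way to expose this kind of drift, sketched here as an assumed summarization step rather than the study's actual analysis pipeline, is to bucket the two-minute probes by month and compare the medians:

```python
from statistics import median

def monthly_medians(samples):
    """samples: list of (month_index, value) performance measurements,
    e.g. data-transfer throughput probed every two minutes.
    Returns the median per month, so drift over the year becomes visible."""
    by_month = {}
    for month, value in samples:
        by_month.setdefault(month, []).append(value)
    return {m: median(values) for m, values in sorted(by_month.items())}
```

If the per-month medians differ substantially, a provisioning model calibrated in one month will mispredict in another, which is exactly the argument being made here.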
Okay. So to conclude, massively social games are a million-user -- well, I would say even
100-million-user -- multi-billion, perhaps even tens-of-billions market now. Such games
consist of content, world simulation and analytics. The current technology involves
upfront payment, cost and scalability problems, and most importantly, it makes players
unhappy. This is mostly related to delays in the world simulation and to the quality
of the current content, both of which, I can argue -- and if you ask me, I will try to
argue it -- are based on the analytics part.
Our vision is that scalability and automation can be achieved by integrating massively
social gaming into clouds. Our ongoing work has different frameworks and platform
improvements for reaching this vision. And I think we can achieve a future in which
happy players are matched with happy cloud operators.
So this is it. Thank you.
>>: Thank you very much. You have questions?
[applause]
>>: Yes, over here.
>>: That performance that you observed, are you constantly monitoring this? And
have you commented on this performance, these performance problems with, say, the
providers? And what was their response?
Alexandru Iosup: Yes. So the short answer is yes, we are constantly monitoring this.
We're actually using two different data sources, so we're not measuring ourselves
continuously, or we haven't done this for the past two years. We used cloud-status
data, and we used the data published by the cloud providers themselves.
As to what causes this performance, I guess, well, the cloud providers themselves told
you in their presentations a bit ago, that basically, well, in terms of the type of
performance that you're getting, the platform itself is not designed for those types of
workloads, and I think one of the presenters, [inaudible], really showed that, for
instance, the networking interconnects were partially to be blamed. There are other
performance problems. I'm not sure that that solves -- answers every part of the
performance issue.
In terms of performance variability and variation over time, over long periods of time,
well, these systems are really under development. Also, their own operators recognized
that the systems are under development. So, yeah, I would expect changes. This is, in
a sense, bad news for somebody who would think that, okay, you know, we can take
the platform and really use it long term or just try to provision the capacity on the spot.
That might not be easy to do. You may want to run a bit, understand what's the current
performance in the network or in the cloud, and then do something about it.
>>: Other questions? There's one more here.
>>: In terms of a cloud characterization of the kind of [inaudible] analytics that you
talked about [inaudible]?
Alexandru Iosup: So definitely you will have different tasks that have to synchronize in
terms of data collection. So unlike other data processing sequences, you definitely
need to collect your own data. This changes the game quite a lot. You can do a lot
of optimizations in terms of what you can collect and compute versus
only what you can compute.
In terms of what kind of computation, many of the computations, especially those going
in the social network analysis part, exceed the capacity of one machine. So those ones
will need to run across different machines and be synchronized.
In terms of other types of operations, many of the things you will want to do as
a bag of tasks, a large-scale set of tasks. So you will want, for
instance, to do the same analysis for a large number of players, and how you actually
split the job -- whether it's many tasks, each processing multiple players, or a
single task doing everything -- yeah, depends a bit on the type of
computation.
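The bag-of-tasks splitting just described can be sketched as follows; the chunk size is a tuning knob, and this is an illustrative fragment, not code from the CAMEO framework:

```python
def split_into_tasks(players, players_per_task):
    """Split a per-player analysis over millions of players into a bag of
    independent tasks. Each task handles `players_per_task` players and can
    run on any machine; the granularity trades scheduling overhead against
    parallelism, which is the tradeoff mentioned in the answer above."""
    return [players[i:i + players_per_task]
            for i in range(0, len(players), players_per_task)]
```

With three million players and, say, a thousand players per task, this yields three thousand independent tasks that a cloud scheduler can place freely.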
>>: [inaudible].
>>: Okay. But I think we'd better get the answer for that offline. It sounds like an
interesting question.
Alexandru Iosup: Just when I was trying to get -- [laughter].
>>: Okay. Well, thank you very, very much.
[applause].
>>: Okay. And now we have our next speaker that we've got to get hooked up first.
So it's my great pleasure now to introduce as our second speaker Bertrand Meyer from
ETH Zurich, and he probably needs no introduction except I must say one thing, that he
is an absolute expert on using the tablet PC, as you're all going to see in his talk today.
Bertrand Meyer: Well, I think I'm the only person in the world to -- I mean, someone has
to use the great Microsoft technology, so I feel kind of -- well, actually it's great. I don't
understand why not everyone else is using it.
So I'm going to talk about the last commit and the end of configuration management. I
hate configuration management. I think it's a bane and a shame and we should replace it
with something better.
So the actual -- the more general talk is about a tool that we've developed which -- I
should mention who is "we." The prime mover behind this effort is Martin Nordio.
Doctor Martin Nordio. I think he got his Ph.D. certificate yesterday, something like that,
at ETH, and he's been developing this as a kind of [inaudible] works project with the
help of a master's student at Hanoi University of Technology who is collaborating with
us as part of a project -- or of a course project, actually, that I'll mention briefly at the
end.
So what they've built is a cloud-based interactive -- or integrated -- development
environment, IDE, or software development environment, which is a version of our
EiffelStudio development environment, and it makes it possible to have a shared
repository for a software project.
And of course the most important part is code, but really when we're talking about
software artifacts, it's more than code. It's code, it's design, it's analysis, it's
documents, everything that a software project might need. It allows direct manipulation by all project
members and developers as well as managers, and it allows unobtrusive configuration
management. I'm going to explain what I mean by this, but in fact we did not invent
anything, we just applied techniques that are now prevalent, ubiquitous, thanks to tools
like Wikipedia, for example, or Google Docs.
So the first part about configuration management is that I should make full disclosure,
and I hope you'll appreciate the discretion of the plug for my latest book. This is a great
book, and I hope you'll all rush and buy it. It's my introductory programming
textbook, which is based on several years of teaching introductory programming at ETH.
There's a software engineering introduction, a chapter which is an introduction,
beyond programming, to software engineering, and it [inaudible] about configuration
management, saying this is one of the most important things that the student can
remember, can learn: configuration management is one of the principal best practices
of modern software engineering, which every project, large or small, should apply.
So, now, I know this is being recorded, but I have to make a private confession. This is
my outing. I don't practice what I preach. I hate configuration management. Actually, I
hate the tools. I've tried all of them. I've tried Windows, [inaudible], whatever it's called.
I've tried to use Eclipse just to do configuration management. I've tried Tortoise. It's a
catastrophe. It never works for me, so it's kind of a running joke in my group that I send
an email to an assistant and he checks the stuff into the configuration management
database. And partly it's my fault, but I don't think it's entirely my fault, because after all,
I am a techie, I like tools, and I can learn tools, but this is just too much -- how can I
describe it -- Twentieth Century.
So this is the Twentieth Century configuration management process, which, of course, quite
seriously, is very important. And, of course, what I'm going to argue is for better
configuration management techniques. I've also seen people, students, for example,
who say we don't need no stupid configuration management, and you can imagine what
the results are, right? So let's put this in context.
But the problem with configuration management as it's practiced today is that you have
this kind of process where you take some piece of [inaudible] artifact, you do an update or
check-out, then you start working on it very hard -- yeah, edit, let's say in a
broad sense -- then you check it back in. So it looks nice when you see it like this, except
that, of course, in the meantime you have ten of your colleagues who have also done
some updates and commits, and then it's a total mess. You have to reconcile the
changes in yours, and this is really the bane of practical software development.
And there is a better way. Okay? There is a better way, which, without too much
fanfare, has been introduced in the past 10 years, 10 or 15 years or so, and which we
all now apply. So one example of this is Google Docs. Provocation has its limits, so I'm
not going to use Google Docs here; I'm going to use a wiki.
So let's see. Let's do some search for this guy who has a Wikipedia -- oh, no, it's doing
this to me again [laughter]. No, I logged in two minutes ago. Okay. This doesn't count
in my 20 minutes. You ask [inaudible]. This doesn't count in my 20 minutes.
As guest, and then [inaudible]. Okay. So I'm not responsible for this page, although
now I'm going to be, so let's see, edit. Yeah. Edit this page. And when was the guy
born? Well, okay, let's say 2010. Now, this is very dangerous, right? I'm hacking
Wikipedia in full view of everyone and on Microsoft premises using Microsoft network
[laughter].
This is going to get you into trouble, Judith, I think.
Well, illustrating. Okay. And that's it. I was born in 2010, right? Well, okay, there's a
history page. Okay. Let's go like this.
So this is my configuration management, right? Every time I make a change, there's my
guardian angel who -- now, I pity the guy with 2 million people who are looking me up
right now. This is going to be really a shock for them, but I hope they'll recover
[laughter].
So I'll go back to this page and I say edit again. Where is it? I lost it. Edit this page,
yeah? And now -- yeah, and now I'm editing it and -- I should leave some trace so that
it not -- so that -- okay, and it's back, I guess to my real birth date, which I hate to admit,
but it is correct.
So this is configuration management as it should be, right? You just do things. You just
do your thing, you change, and, of course, you mess up because we all mess up, and
then if you need to go back to a previous version, well, you realize that you've had your
guardian angel watching over you and keeping track of what you're doing. And there's
really no reason we should have this prehistoric, medieval approach of checking out,
working, and checking in.
So there is a form of configuration management that we should be using, and not just
for things like Wikipedia or Google Docs.
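The contrast with check-out/check-in can be made concrete with a toy sketch. This is an illustration of the wiki-style model, not of any actual Wikipedia or EiffelStudio internals: every edit automatically becomes a new revision, and reverting simply restores an old one.

```python
class VersionedArtifact:
    """Wiki-style versioning: every edit is saved as a new revision
    automatically, with no explicit check-out/commit step, and any old
    revision can be restored -- the 'guardian angel' keeps every version."""

    def __init__(self, text=""):
        self.history = [text]          # revision 0 is the initial state

    def edit(self, new_text):
        self.history.append(new_text)  # each save is a new revision

    def current(self):
        return self.history[-1]

    def revert(self, revision):
        # Restoring an old revision is itself recorded as a new edit,
        # so nothing in the history is ever lost.
        self.history.append(self.history[revision])
        return self.current()
```

This mirrors the live demo: a bad edit (the fake birth year) is harmless because the history page always allows going back.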
Now, more generally, the picture that we have today, if we are a programmer or
software developer -- and here I'm coming to the core of the presentation, if you like.
[inaudible] gave an excellent introduction or excellent setup for this talk yesterday when
he said at the end of his talk that software development is the ultimate application of
cloud computing. I'm paraphrasing, but I think quite close to it. So that's where we're going.
So today we're not using cloud computing. In fact, I also have this running live here.
This is EiffelStudio, right? It's my project, and I'm developing it, and, of course, if I'm
using a configuration management system, at some point, through some hook in the
environment or outside of it, I'm going to commit my changes, and then, as in the
previous scheme, I'm going to have to reconcile them with those of anyone who might
be working on the same elements.
So this is inside EiffelStudio; it could be Eclipse, it could be Visual Studio -- except that
it wouldn't be as nice -- but the concepts are applicable to any modern development
environment.
Now, the reality of software development is that it's more like this, right? There's a
customer somewhere, there's a requirements team somewhere else, some part of the
development in Bangalore or Shanghai and some part somewhere else, and they all
need to work on the same base. That is the context.
And in fact I made this picture by just tweaking a real picture, which is the development
team for EiffelStudio -- partly academic, but mostly industry, from Eiffel Software. So
this is our team: Santa Barbara, Florida, France, Zurich, Moscow and Shanghai.
The times are those at which we have our weekly meeting. You can imagine the
difficulty -- I mean, I insist very much on having a weekly meeting, one hour, but just
finding the common time is not that easy.
The part more directly relevant to this particular discussion, however, is that there are all
kinds of people -- all kinds of locations where people are working on the same project,
and this is the rule today. Maybe not with as many sites as we have, but it's very
common both in commercial and in open source development, and we need ideas that
are devised specifically for this kind of development -- and today's ideas are not.
The issues, of course, are many and difficult. As a programmer, I need to make sure
that I use the correct version of everyone else's modules -- correct in the sense of the
latest one. There is nothing worse than working to an API and then realizing that it's not
the right one. I don't want to step on someone else's toes, and you don't want them to
step on my toes. I want to know what others are doing at all times.
Now, if I'm a manager, I want to get a true picture of the development state. This is one
of the most difficult issues today, because people may have checked out a piece of
code and they might keep it checked out for a week, for two weeks. So you look at
what you have in the repository, and it does not give the manager the correct picture of
what [inaudible].
You really want something much more accurate and instantaneous. You want to allow
different developers to work concurrently on the same software elements, with some
precautions. We want to make sure that everyone uses the same version of the base
modules, the ones that are supposed to be baselined, fixed and completed.
You want to avoid regression errors, which are terrible. You want to be able to recreate
at any time any previous version of the system, because some customer might want it or
you want to be able to reproduce a bug. And, for most programmers and managers, we
want to avoid configuration errors, which are a waste of time. We want to avoid
conflicts in new modules. And, of course, I think most of us would like to produce
reliable software unless maybe we are hoping for a good maintenance contract. But
that's a side point.
So from this view of today, what we're trying to provide with CloudStudio is a different
view: a cloud-based solution to which all developers can refer and which allows
everyone to work on the same base.
So here's a little demo -- I'm sorry, I hope it hasn't logged me out again. Okay.
So this is CloudStudio; I'm going to log in. It's cloudstudio.ech.ch.
Okay, I have a number of projects. The point is not that it's particularly fancy or
powerful; the point is that it's going to look reasonably like EiffelStudio.
So I'm going to open this project. That leaves me with what Martin told me I should do
for the demo, and I'm going to change something. Actually, if I may -- so, of course, the
demo effect is -- okay.
So I'm not going to do something very powerful, but let's replace 50 by 60 --
courageous as I am, you know, live demos are not usually a very good idea in talks.
And it starts a compilation, and, yeah, it has parsed the whole thing. And, well, I
could also introduce an error, you know, and -- yeah.
So basically I can do what I would do with the normal version of EiffelStudio, normal
meaning the non-distributed version. So here I've introduced an error. So you see the
idea. It's a very simple idea, right? It's like Google Docs for software development, in a
way.
So the principles are: there's a shared repository for the software project -- I mentioned
this already -- and direct manipulation by project members, direct manipulation in the
core sense introduced by [inaudible] many years ago: what you see is what you get.
What you work on is the actual project. Nothing would prevent you from making a local
copy, but you should not, as a rule.
The manager gets an instantaneous, accurate picture of the state of the development,
and configuration management is unobtrusive; that is to say, you don't do commits and
updates, you just do your thing. If you're a programmer, what you want to do is to
program and to make changes, and then you want to have someone watch over you.
A comment about methodology -- this is recent stuff for me. Obviously many people
have known about this for a long time; I discovered it fairly recently, and I think it's very
interesting. There's a move from review-then-commit to commit-then-review. There are
some interesting articles -- thanks -- in the literature, particularly about Apache and how
around 1998, I think, or a little later, they moved from RTC, review then commit, to
CTR, commit then review. That is to say -- for those of you who know, for example,
things like software transactional memory, it's the same idea, except it's applied to
people rather than software. You do your thing and then you check that it is okay.
There's this old saying, or at least it's very popular in this country, that it's easier to
apologize than to ask for permission. Well, that's exactly the point, and it's a fairly
general tendency in software development today. It's a trend which you can see in
many different areas of software engineering.
And we are transitioning to this at Eiffel Software. However, what we see missing in
other approaches is tools, because you want to know, for example, as a manager, what
has been committed but not reviewed. You want to know what has been committed and
reviewed negatively, things like that. So you need a very strong tool infrastructure, and
CloudStudio -- sorry, I have to justify my use of a tablet PC at last; I know this is what
you came for, and you have been waiting for me to draw something on the screen, so
finally I've vindicated Judith's comment -- CloudStudio, a cloud-based environment, is
the ideal context for such tools that enable communication.
So configuration management is going to be the way I describe it: no need for explicit
update and commit. By default, changes are immediately reflected in a shared
repository. What does "immediately" mean? By default, a successful compile, okay?
This can be parameterized.
Conflict detection is optimistic. That is to say, you let the conflict happen and then, as
early as possible -- you know, fail fast -- provide a mechanism for reconciliation. The
history is managed automatically, and, of course, you can go back to any earlier
version. You can define explicit, named versions. So you can still do a commit, but a
commit simply means: next time I compile, this is going to be version 6.2. That's all a
commit means -- nothing more explicit than that -- but, of course, you want to be able
once in a while to give a name.
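To make the model concrete, here is a minimal, hypothetical sketch in Python (my illustration, not CloudStudio's actual implementation) of version management driven by successful compiles, where a commit is nothing more than a name attached to the next snapshot:

```python
import hashlib

class QuietRepository:
    """Toy sketch of unobtrusive configuration management: every
    successful compile produces a new version automatically, and a
    'commit' merely names the next snapshot."""

    def __init__(self):
        self.history = []          # list of (version_id, name, files)
        self.pending_name = None   # label requested via commit()

    def commit(self, name):
        # A commit is just a label for the next successful compile.
        self.pending_name = name

    def on_successful_compile(self, files):
        # Called by the IDE's compile hook; snapshots the project.
        snapshot = dict(files)
        version_id = hashlib.sha1(
            repr(sorted(snapshot.items())).encode()).hexdigest()[:8]
        self.history.append((version_id, self.pending_name, snapshot))
        self.pending_name = None
        return version_id

    def rollback(self, steps=1):
        # Any earlier version can be restored at will.
        return self.history[-1 - steps][2]

repo = QuietRepository()
repo.on_successful_compile({"app.e": "v1"})
repo.commit("6.2")                      # names the *next* snapshot
repo.on_successful_compile({"app.e": "v2"})
assert repo.history[1][1] == "6.2"
assert repo.rollback() == {"app.e": "v1"}
```

The point of the sketch is that the programmer never performs an explicit check-in; the history accumulates as a side effect of compiling.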
One of the problems with the Google Docs scheme is that the granularity, if you have
looked at it, is far too small. Every time you add a comma, there's a new version. So
that needs to be improved.
So we have many challenges still facing us. We want to make sure that we have the
same performance as with the local version -- and performance, of course, has been a
recurring theme in this workshop. We want to provide the same level of user interface
quality as the traditional version. And, of course, we're basically like anyone else,
relying on JavaScript and the like, but we think we can achieve it.
We want to support branching. We want to find the right level of granularity. We want
to enforce discipline through management tools. Something that is important: we want
to have a kind of global environment, including communication. So we want to be able
to introduce some sort of -- I'd love to be able to cite Microsoft technologies here, but
I'm going to cite Skype, okay? We'd like to package Skype or WebEx, this kind of tool,
within the environment without reinventing it, because communication is fundamental to
software development.
And we want to apply this to teaching. So this is an occasion for another plug, of
course, Distributed and Outsourced Software Engineering. This is a course that we
teach in which the project is done in cooperation by students in many different
universities. And, actually, that is where this student from Hanoi came into the picture.
Hanoi University of Technology is participating in this course, so the students work in
teams that have two students from one place and two students from another place
typically, which is an exciting experience for everyone.
Okay. The general context -- I'm almost done -- just to preach a little bit to finish, is this
fundamental idea of seamless development, which is actually illustrated here by
EiffelStudio: this idea that you don't have an analysis tool and an IDE, it's all together,
and you can go back and forth between text and graphics. And more generally, we
want to have everything together: analysis, design, implementation, maintenance,
verification, project management, and computation -- sorry, communication. I think I
don't have time to go into this.
And as a summary -- probably the first time in my life I finish a talk on time, although I
still haven't finished, so I should wait before I boast -- Ed Lazowska said on Thursday
that software development is the ultimate cloud application, and I think he's right, and
I'm very surprised that we as a community have not seen this yet, have not done more
in this area, so we're proud to be going among the first, I think, into this area.
I think this is the way people are going to develop in the future.
Our approach is to leverage the cloud to provide a modern cloud-based integrated
development environment -- and leave commit and update for your dog, if you don't like
your dog.
Thank you very much.
[applause].
>>: Questions? At the back there.
>>: Are you using the cloud for compilation as well as for storing the programs?
Bertrand Meyer: At the moment, yes. This is, as you've probably guessed, still in its
early stages, so we might develop more savvy strategies for moving between -- I mean,
doing more on the client side -- but at the moment we're just playing the cloud game
fully and doing everything on the server.
>>: [inaudible].
Bertrand Meyer: Yeah.
>>: Okay. Here?
>>: Like Wikipedia, branching can be very useful in software development so that you
don't [inaudible].
Bertrand Meyer: Right. So branching is something -- I mean, if branching could be
removed from the human experience -- or the human race -- we'd be happier. So the
full answer is I don't know. The good news is that -- there's a comment which was on a
slide that I didn't mention, which is that this is, after all, a layer on top of a traditional
configuration management system; in our case, at the moment, Subversion. So
anything that you could do before you can still do, right? And that's about the only
answer I'm able to give.
I'd like to know how to handle branching. Branching is a tragedy, and if we could avoid
it, it would be much better. So the short answer is you can do branching, but only the
way we have been doing branching for ages; we don't have any new ideas there.
>>: Alexander?
>>: So I don't know many things about software development, but it seems to me that
you're assuming that the developers are all of equal skill and similar styles of
programming. Do you think that there is an impact if the programmers have very
different styles -- one prefers working with bigger chunks of code versus the others --
and, the other part of the question, what happens if the coders have really different
skills?
Bertrand Meyer: Yeah, that's a very good question. We haven't devoted -- so I think
you're wrong. You do know a lot about software development [laughter]. Yeah, we
have not explored this issue very much. I think what we need to do is to build into the
tools ideas that have turned out to be extremely successful, in particular in open source
developments. So it's clear that in successful open source projects, you have a number
of concentric circles, so to speak, with the most novice developers at the outermost
level, and then you know the [inaudible] of this world or [inaudible] at the center.
And so at the moment we have not built anything of that kind into the tool because we're
working on the assumption indeed of teams like ours at EiffelStudio which are of the
pure kind that you're describing and similar teams in customer organizations.
So on the one hand, we need to develop some mechanism to support what you are
describing. On the other hand, I would like to keep such mechanisms light, if you like,
because I believe very much -- and I'm a new convert to this, I must admit -- in this
trend that I described of, you know, mess up, then apologize. Or fail fast, which is the
same thing.
I think with a little bit of boldness -- as long as you're not directly changing, say, the
running version of the [inaudible] control system of the United States -- within some
limits, within reasonable boundaries, this idea of letting people be creative and then, if
they mess up, having them pay the price -- I think it has a lot to be said for it.
>>: Yes, at the back.
>>: The question is [inaudible].
Bertrand Meyer: It's too early to tell. But I think I answered this partly by mentioning
the methodological implications behind this model.
>>: [inaudible].
Bertrand Meyer: Yeah, the main methodological implication, I think -- and I think I
already answered your question indirectly -- is this general thought of letting people be
bold and correcting errors after the fact rather than putting in all kinds of barriers. You
know, it's
like with Wikipedia. I don't know about you, but the first time I heard about Wikipedia --
this idea that anyone can edit the entry on George Washington from some internet cafe
in Bangkok -- I thought, this is crazy, it's not going to work. And in fact, it does work, to
some extent. So it's kind of the same general trend here of saying, well, there are going
to be mix-ups, there are going to be problems, but in the end, things will kind of work
out.
>>: [inaudible].
Bertrand Meyer: I think the answer does not fit in the time that's allotted to me.
>>: All right. Well, we'll take one more question.
>>: Do you have a third one [laughter]?
>>: Perfect.
>>: [inaudible].
Bertrand Meyer: Well, two things: it's much lighter, and also it integrates everything.
It's really the development environment, not just the communication and management
environment.
>>: [inaudible].
Bertrand Meyer: You can describe it as Jazz for the rest of us, right -- for the poor
among us. But more fundamentally, I think the main distinction is the close integration
of absolutely everything into a single IDE. You know, there's a general trend in the
evolution of software engineering of adding tools and of saying that management is key.
And this is the response of the hacker, saying programming is what matters most, so
let's take a programming environment and make it handle everything, as opposed to the
more top-down approach.
>>: Okay. Well, let's thank Bertrand very much.
[applause].
>>: Okay. So it's my great pleasure to introduce in the last session here today Karin
Breitman from PUC Rio and friends, and they're going to be talking to us on -- on Cloud
TV. That's going to be an easy one to remember [laughter].
Karin Breitman: Hi. Thank you everyone for staying with us until the last session of
today.
So our agenda for today is a little bit tight. We're going to talk about the complete
disintegration of TV and why we think television is going to die in the next few years,
why the cloud is going to be able to catch everything, and we're going to show you two
different architectures that deal with these different problems in the cloud and why
those are killer applications.
We'll wrap it up in the end. Each of the boys will show you one of the applications, then
I'll come back and talk to you about what is essentially our cloud research -- what we're
doing today, the architecture work we're putting in -- and I'll wrap it up with a very quick
thought about the future: because the event is Cloud Futures, what are we going to be
doing during the next five years?
Because we're doing it very quickly, I just invite everyone who's interested in the topic
to go to Computer -- Jim has put out a beautiful April issue all about the cloud, and we
have one of the articles, describing exactly what we're doing.
But back to TV. Well, TV started in the '50s with a very static model. We had the whole
family sitting around the TV set watching it, and we had very distinct roles. On the top
we have the producers, who would produce the shows -- Johnny Carson, for instance --
which were distributed to broadcasters, who were responsible for getting them aired
and into your homes, and consumers just sat there, enjoyed their Sunday night TV and
watched.
Well, this changed radically. As you've been to Dan Reed's talk this morning, I'm sure,
he talked about, you know, how many computers do we have? I don't think it's about
computers; rather, it's about displays. So we started with the movies, and then moved
to TVs that were, you know, inside our own private spaces, and then to PCs.
But now we live in an era where we own private displays. We all do. I have four in my
backpack over there. I see many people with computers. I'm sure you have cell
phones and God knows what else, right? We're running this presentation from a display
that is not even a computer. What is it?
So this is the rule now, and it imposes a huge amount of work on what we have as TV,
because it all crumbled down. We don't have TV now. Our TV connects to the internet,
and on the internet, on my computer, I can watch TV, and I can watch TV on my mobile
phone. And we have to process all this information into a gazillion different formats.
So what's going on today is that we just mingled everything. So whoever is producing a
show now can distribute it -- and has to distribute it -- in dozens of formats, and I mean
dozens. We've been talking about biodiversity and computer science all through the
week, but what about the biodiversity we have in mobile devices right here? You know,
iPhones, Nokia phones, Motorola phones, all sorts of different computers and stuff. So
we have to produce media and transcode this media to all of these formats.
And this is what I was going to talk about: our killer solution for this, and why the cloud
has to be used to process media into this myriad of formats.
Then on the other side you have to distribute it to consumers, and now consumers are
producing media. I'm sure everybody saw the Haiti coverage. We turn on the news
and there's more and more video: user-generated content, content that was produced
by people and submitted to the networks. But how much of that can we take? How
much of that can we process, receive and see? So we need a lot of computational
power to deal with it.
So, well, we'll just begin by showing you the two architectures.
Rafael Pereira: So before I start talking about the split-and-merge architecture for
distributed video encoding, I think it's important to understand why we need to
compress video and why this task is so complex.
If we take, for example, 45 minutes of full HD video -- video with this profile -- it would
require a data rate of about 177 megabits per second. And it's not possible to deliver
this content over the internet with current user connections.
If I want to get access to this content with my cell phone through the 3G network, I just
can't do it, because the rate is too high. So we need to reduce this data rate, and to do
this, we basically need to remove some redundancy, such as temporal or spatial
redundancy.
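As a quick back-of-the-envelope check of the figures above (the 177 Mbps source rate is from the talk; the mobile target rate below is my assumption):

```python
# Back-of-the-envelope arithmetic for the data rates mentioned above.
source_rate_mbps = 177          # full HD source rate quoted in the talk
duration_s = 45 * 60            # 45 minutes of video

total_gb = source_rate_mbps * duration_s / 8 / 1000   # Mbit -> GB
print(f"45 min at {source_rate_mbps} Mbps = {total_gb:.1f} GB")   # ~59.7 GB

# A good 3G connection delivers at most a few Mbps, so the source has to
# be compressed by roughly two orders of magnitude:
target_rate_mbps = 1.5          # assumed mobile streaming rate
print(f"compression needed: ~{source_rate_mbps / target_rate_mbps:.0f}x")
```

Roughly 60 GB for 45 minutes is why neither home connections nor 3G can carry the source as-is.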
So here's a diagram of a video compression process, basically, and it's just to show you
that it's quite complex. There is a lot of math that we need to do, and for each frame we
need to do a lot of operations. In a video with, for example, 30 or 40 frames per
second, that's a lot of information to process.
And in some cases, such as breaking news, we don't have time -- we need to process
this content very fast. So we need to reduce this processing time, to make this kind of
content available as fast as we can.
Basically, when we deal with huge data sets, I think that MapReduce is the most
popular paradigm for parallel processing. And it's very good for some applications, but
in the specific case of video processing it does not fit very well, because, for video, we
have different algorithms that we need to use for video processing and audio
processing, and there are a lot of correlations between subsequent frames. So
MapReduce is not the best choice for this parallel and distributed processing of video.
So what we propose is a split-process-and-merge architecture, where we take the
original video, split it into several chunks, split out the audio stream also, and then
process all these chunks and the audio in a distributed environment [inaudible]. And
then we merge the results of this processing together into the encoded video.
And to do this split, process and merge: basically, for the split, we need to split the
video into several chunks, each with a specific quantity of frames, and basically we do
not break the file. In this case we are using a time-shifting technique where we just
mark each chunk's start and end. It's very good because we don't need to rewrite the
container of the video.
One important remark is that the chunk size must always be greater than the GOP, the
group of pictures -- that is, the interval between key frames. If the chunk size is smaller
than the group of pictures, we will lose compression efficiency, because we will insert a
key frame where it's not necessary.
And we can use decision-making algorithms to help us discover the optimal chunk size.
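The split step described above can be sketched as follows (a hypothetical helper of mine, not the speakers' code): chunks are expressed as frame ranges and aligned to GOP boundaries, so every chunk starts on a key frame and no extra key frames need to be inserted.

```python
def split_into_chunks(total_frames, chunk_frames, gop_frames):
    """Return (start, end) frame ranges for each chunk.

    The chunk length is clamped to at least one GOP and rounded down to
    a GOP boundary, so every chunk starts on a key frame and no extra
    key frames have to be inserted (which would hurt compression).
    """
    if chunk_frames < gop_frames:
        chunk_frames = gop_frames               # chunk must cover >= 1 GOP
    chunk_frames -= chunk_frames % gop_frames   # align to GOP boundary
    return [(start, min(start + chunk_frames, total_frames))
            for start in range(0, total_frames, chunk_frames)]

# 45 min at 30 fps, 2-second GOPs (60 frames), ~1000-frame chunks:
chunks = split_into_chunks(total_frames=45 * 60 * 30,
                           chunk_frames=1000, gop_frames=60)
print(len(chunks), chunks[0])   # 1000 is rounded down to 960 frames
```

A "decision-making algorithm" in the talk's sense would choose `chunk_frames` based on the number of available worker nodes and the content duration.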
For the process, basically what we do is the video compression itself and the audio
compression, in an independent way, and it's important that this allows us to choose
between different encoding profiles and techniques. For example, if I need to generate
a video for the web, I will use one specific data rate; for mobile networks I will use
[inaudible]. It's important to have the possibility to choose between different encoding
profiles.
And one important thing that we need to take care of is dynamic resource provisioning
in the cloud. Content of different durations will give us different numbers of chunks, so
we can just scale our architecture up or down, adding or removing nodes, to do this
dynamic resource provisioning and process all chunks simultaneously.
And, finally, we have to merge the contents. Here we need to reorder all the video
chunks and merge them, then we do the audio merge with the fully encoded video and
the audio remixing, and finally we rewrite the file container.
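Taken together, the split, process and merge steps can be sketched as ffmpeg-style command lines (ffmpeg and the options used here are real; the file names, bitrates and the choice of ffmpeg itself are my assumptions, since the talk does not name an encoder):

```python
# Sketch of the split/process/merge driver; commands are printed, not run.
def split_cmd(src, start_s, dur_s, idx):
    # Stream copy (-c copy) marks chunk boundaries without re-encoding,
    # i.e. the "time shifting" trick: the container is not rewritten here.
    return (f"ffmpeg -ss {start_s} -t {dur_s} -i {src} "
            f"-an -c copy chunk{idx:03d}.ts")

def encode_cmd(idx):
    # Each worker node re-encodes one chunk independently (H.264 here).
    return (f"ffmpeg -i chunk{idx:03d}.ts -c:v libx264 -b:v 1500k "
            f"out{idx:03d}.ts")

def merge_cmd(n_chunks):
    # Reorder and concatenate the encoded chunks, remix the audio,
    # and rewrite the final container.
    parts = "|".join(f"out{i:03d}.ts" for i in range(n_chunks))
    return f'ffmpeg -i "concat:{parts}" -i audio.aac -c copy final.mp4'

chunk_s = 30
duration_s = 120
n = duration_s // chunk_s
for i in range(n):
    print(split_cmd("source.mp4", i * chunk_s, chunk_s, i))
    print(encode_cmd(i))          # in reality dispatched to worker nodes
print(merge_cmd(n))
```

In the real architecture the `encode_cmd` invocations are what gets farmed out to the worker nodes; the split and merge stay cheap because they only copy streams.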
So if we want to apply this architecture on Amazon, for example, using the Amazon
Web Services platform, it looks like this. We will have several EC2 instances as our
worker and master nodes. We can use S3 for content storage and the Relational
Database Service for state persistence.
We're also using CloudCrowd, which is a Ruby framework for parallel and distributed
task processing. And after some tests, we got these results.
And basically what we can see is the comparison between the traditional process and
the split-and-merge architecture: as the content duration increases, using the traditional
process the total encoding duration also increases. And that's not good, because if you
want to deliver the Super Bowl game on the internet, for example, it will take a very
long time using the traditional process. So it's not too good.
Using split and merge, we have an almost constant encoding duration, and we can do
this just by adding more worker nodes to the architecture.
So let's do a simple cost analysis: if I want to use the Amazon platform, how much does
it cost, for example, to produce 500 minutes of video in a single day? The first step: I
need to submit the source -- for example, DV at 25 megabits per second -- to EC2, and
it will cost something around $10. Then I need to encode all the content, send it to S3,
store the source video and the encoded video in S3, and then get the encoded video
back.
At the end of this process, we will spend around $60 a day, or $20,000 a year, which is
cheaper than a single server.
And we can process the whole Super Bowl match in two minutes, against 12 hours with
the traditional process, and for just a dollar. And we can reduce that: if you use Spot
Requests from Amazon, it can come down to less than 50 cents.
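The cost figures can be sanity-checked with simple arithmetic (the 500 minutes, the 25 Mbps source and the $60-a-day total are from the talk; the derived numbers below just make the proportions explicit):

```python
# Sanity check of the talk's cost figures.
minutes_per_day = 500
source_mbps = 25                                 # DV source rate from the talk

# Daily volume of source video uploaded to EC2:
source_gb = minutes_per_day * 60 * source_mbps / 8 / 1000
print(f"source volume: ~{source_gb:.0f} GB/day")          # ~94 GB/day

daily_cost = 60                                  # dollars, figure from the talk
print(f"yearly: ~${daily_cost * 365:,}")         # ~$21,900, the quoted ~$20k
print(f"~${daily_cost / minutes_per_day:.2f} per encoded minute")
```

So the quoted $20,000 a year follows directly from the $60-a-day figure, at roughly 12 cents per encoded minute.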
So Marcello will show you now an architecture for submission systems.
Marcello Azambuja: So now I'm going to describe another architecture that leverages
cloud computing, for open submission systems.
One of the things that Karin was talking about was the popularization of photo and
video cameras. A lot of content is being produced by users -- what is called
user-generated content -- and sent to the internet, where media conglomerates can use
it, or websites such as YouTube, which have a huge audience and host a lot of content.
Usually it's very hard to estimate how much content you are going to receive in a
system like this, so it's really a guesstimate. You don't usually know how many users
will like the service, or how much infrastructure will be needed. That is really an area
where cloud computing can help you solve this kind of infrastructure problem.
So the proposed system was actually used in a real project for Big Brother Brazil, which
is a reality show that runs in Brazil. This is a disclaimer: Rafael and I also work at
globo.com, which is a media conglomerate in Brazil, and Globo has the ownership of
the Big Brother reality show in Brazil. It's a huge success in Brazil -- don't ask me why
[laughter]. We have an audience of 80 million people simultaneously watching the TV
show on open TV, and the internet has a very close relationship with the TV show and
the TV network, because users have to vote. For those who don't know how the reality
show works, 16 random people are selected to stay in a house for three months without
any contact with the outside world, and every week one of them gets out of the house,
and the selection process of who's going out of the house is done on the internet, on a
website.
Before the reality show begins, we have a selection process where users have to apply
to be in the reality show. Don't ask me why -- I don't know why so many people want to
be in the reality show. Maybe to be on television. But we receive a lot of applications.
And we decided to move the process of receiving applications: we used to receive
videotapes by snail mail, and then we decided to receive videos over the internet.
First we thought of using YouTube or Vmail or other video sites, but we had a lot of
problems with copyrighted content. We needed to have the exclusive rights for these
videos, so we had to develop this infrastructure ourselves.
So the first big problem was: how many videos are we going to receive? We used to
receive between 20,000 and 30,000 by snail mail. If we opened that up to the internet,
we didn't have a clue how many videos we would receive. So using the cloud allowed
us to focus on solving the problem without getting too worried about infrastructure.
So, for instance, this is the website that we used to receive the videos. Basically the
user had to fill in a form and upload the content.
So the basic infrastructure we would need: we would store the user information in a
database -- for instance, we used MySQL. We had to store the videos. On the storage
end we did a lot of work, because we received videos in very different kinds of formats
and codecs, and we had to transcode all these videos to a standard format so that they
could be watched by the production team of the reality show.
So what we did is we just replaced the local storage and the local transcoding farm with
a cloud solution. For instance, we used Amazon Web Services. Right now we have the
Azure tokens; we can use [inaudible] service as well.
So basically we have a local system that sends the video files to a storage system -- for
instance, S3. EC2 instances do all the transcoding, and we use SQS to queue which
kinds of jobs need to be done. For this system specifically, the transcoding command
line is run on the cloud, and then, when we have all the videos transcoded, we can offer
them to the producers of the program to select which candidates will be selected.
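The flow above -- upload to storage, enqueue a job, workers pull and transcode -- can be sketched with an in-process queue standing in for SQS (the real system used S3, SQS and EC2; everything named here is a stand-in of mine):

```python
import queue, threading

# In-process stand-in for the S3 + SQS + EC2 worker pipeline described above.
storage = {}                 # plays the role of S3
jobs = queue.Queue()         # plays the role of SQS

def submit(video_id, raw_bytes):
    storage[video_id] = raw_bytes          # "upload to S3"
    jobs.put(video_id)                     # "enqueue a transcoding job in SQS"

def worker():
    # Each EC2 instance would run a loop like this, pulling jobs from the
    # queue and transcoding the video to the standard format.
    while True:
        video_id = jobs.get()
        if video_id is None:               # poison pill: shut the worker down
            break
        storage[video_id + ".std"] = b"transcoded:" + storage[video_id]
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for i in range(10):
    submit(f"vid{i}", b"rawdata")
jobs.join()                                # wait until every job is transcoded
for _ in threads:                          # stop the workers
    jobs.put(None)
for t in threads:
    t.join()
print(sum(1 for k in storage if k.endswith(".std")))
```

The design point is that scaling is just the number of worker loops: on EC2 you add or remove instances instead of threads, and the queue absorbs whatever submission volume arrives.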
So what we got, just to show: we didn't need to worry about upfront investment or how it
would scale. We didn't need to worry about how many servers we would need to buy,
or about administering that infrastructure, and the cost is pretty low. For 100,000 videos
it would cost $500, so if you do the math, that's less than one cent per video, and you
don't have the upfront payment. In Brazil, when you buy new servers, you have to
import them; they take almost 40 days to ship to Brazil, so it takes a huge amount of
time not only to plan the infrastructure but even to buy it. So it was a really interesting
case of what cloud computing could add to this project, saving us from worrying about
the size of the infrastructure we needed.
Karin Breitman: Okay. So we very quickly showed a couple of cloud applications and
things that we're doing today. What we have today in the landscape is the following:
on one side we have cloud providers that are offering cloud infrastructures -- Microsoft
Azure, Rackspace -- and on the other side we have people who want to use a cloud,
but, hell, it is difficult. It is really difficult. You think configuration management is
difficult? Try to see what these guys are doing with, you know, a terminal. It's really
hard to set up.
So research today, and what we've seen over the last two days, is just trying to bridge
this gap. Now, to wrap it up: I'm a software engineer, and this is the research we're
doing. This is going to be good for the next five years. We need to help these guys
provide solutions and make cloud solutions work. But we have to think -- and if there's
something we can give, as researchers, it is some insight -- about what is going to
happen in five years.
Well, in five years this is going to blow up. This is a cloud bubble in the sense that, you
know, people will start to forget how much money they saved by using the cloud and
start thinking, you know, I don't want these architectures, I don't want to rewrite my code
to go to the cloud, I don't want to -- this is error prone, this generates a lot of overhead.
Why can't I build cloud systems to begin with? Why can't I think about cloud
systems earlier on? Why don't I think about them at the requirements stage, and why
don't I think about the models I need to develop these systems?
And we have to go back to basics, to principles in software engineering like the things
that Barry Boehm said. You know what? It's more expensive if you treat bugs later
on. And just doing that at the architectural level will not do. We have to do it before.
And this is what real research is all about, and this is what we software engineers have
to be thinking about and doing in terms of cloud.
And, well, if the cloud is going to be even a tenth of what it's cracked up to be, this is the
grand challenge for software engineering. We have to be hands-on, we have to call
everybody. We have to go and say: Barry, come on, work with us. Come on, Capers
Jones, let's get the guys from [inaudible] who build systems, let's get everybody
together.
Bertrand, forget about configuration management. Let's try to think about how we
specify systems to begin with, so that we can develop them in a rational way.
Methods, tools, and techniques -- which have been our mantra for, I don't know, 20 to
30 years.
And we'll leave it all at that, and that's all we had to show today.
>>: Brilliant.
[applause].
>>: Well, I've chaired a couple of sessions and I haven't asked a question, so I want to
ask the first question, if I may, while you're thinking of yours.
When one moves to the cloud, as you've done so successfully with real-life applications
like this, what is the requirement on the use of the internet? Because that's a question I
was asked down in Africa: when people wanted to switch from a supercomputer to the
cloud, they said, oops, but we don't actually have the internet capacity.
Karin Breitman: We don't have it yet. We're abstracting away from that. And this is a very
relevant question for Brazil, because broadband is really a problem. We're abstracting
from it in the sense that we're thinking on a horizon of two or three years -- we cannot
advance software engineering research that quickly. So by the time we have solutions,
the infrastructure will have caught up. That's like what Knuth said: you know what?
Don't optimize your computer or your algorithm, buy new computers. By the time we're
done with the research, we'll have a newer infrastructure, and I think this is going to
happen in the next two to five years. Optical cables -- we're renewing them systematically.
>>: So you currently have the problem.
Karin Breitman: We do, definitely. So by using Amazon or Azure -- and I think
Azure -- [inaudible] is promising me that they are going to have cloud edges in Brazil
and South America, because none of the big providers has them so far. They don't have it.
So uploading those videos to a cloud provider is not good today, and Microsoft could go
forward and just start doing that, geographically distributed. We have continental
dimensions, and having your cloud provider offer, you know, machines in strategic
geographical locations would make a huge difference for these applications.
>>: Okay. So now over to you.
>>: I was wondering in this [inaudible] and managed approach what kind of code
[inaudible] you can support and what [inaudible] are left out.
Rafael Pereira: So we are using the encoder FFmpeg, so it supports a range of -- we
actually are using [inaudible] where we're encoding, and the input file is
MJPEG, high definition.
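[Editor's note: a hedged sketch of what such an FFmpeg invocation might look like, built here as an argument list in Python. The codec, scaling filter, and file names are standard FFmpeg options chosen for illustration; the exact settings of the production pipeline were not stated in the talk.]

```python
def ffmpeg_command(src, dst, vcodec="libx264", height=720):
    """Build an FFmpeg invocation that transcodes an MJPEG source
    to an H.264 output scaled to the given height (width kept proportional)."""
    return [
        "ffmpeg",
        "-i", src,                    # high-definition MJPEG input
        "-c:v", vcodec,               # target video codec
        "-vf", f"scale=-2:{height}",  # downscale; -2 keeps width divisible by 2
        dst,
    ]

cmd = ffmpeg_command("candidate-001.avi", "candidate-001.mp4")
print(" ".join(cmd))
# To actually run it (requires FFmpeg installed):
# subprocess.run(cmd, check=True)
```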
Marcello Azambuja: But since we're using third-party projects to do the actual
transcoding, it doesn't matter that much. What is important is that in the input video
you are able to identify the key frames, because not [inaudible].
>>: Yes?
>>: Excuse me. You are talking about encoding via the cloud. Is it possible to think
about decoding via the cloud if necessary?
Rafael Pereira: Well, I would say the main problem with decoding is that when
you're decoding content, it usually means that you're going to reproduce this
content on a video screen, and then the overhead of decoding it in the cloud and
transferring it to the video screen wouldn't be worth it right now, due to
bandwidth problems. To encode, you usually don't have that time constraint. You can
usually do it even offline. So, for instance, with these videos for the selection process, we
could send them to the cloud and have them transcoded there even if it took more time, and it
wouldn't be such a problem. But if you're decoding, you probably want to display it on a
screen, and once you have the decompressed video, transferring it from the cloud to your
screen becomes a huge problem.
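[Editor's note: rough, illustrative arithmetic makes the point -- raw decoded video needs orders of magnitude more bandwidth than the compressed stream it came from. The resolution, frame rate, and bitrate below are assumptions for illustration, not figures from the talk.]

```python
# Decoded 1080p video at 30 fps with 24-bit color, in megabits per second.
width, height, fps, bytes_per_pixel = 1920, 1080, 30, 3
raw_mbps = width * height * bytes_per_pixel * fps * 8 / 1e6

# A plausible H.264 bitrate for 1080p streaming (assumed).
compressed_mbps = 8

ratio = raw_mbps / compressed_mbps
print(f"raw: {raw_mbps:.0f} Mbit/s vs compressed: {compressed_mbps} Mbit/s "
      f"(~{ratio:.0f}x more bandwidth)")
```

This is why encoding in the cloud works (compressed in, compressed out) while decoding in the cloud does not: the decoded result is far too large to send back over the link to the viewer's screen.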
>>: Okay. Well, I think that leaves us all ten minutes to go and get coffee and cakes,
which was my objective [laughter], and then we have to assemble over there for the
closing session in the big room. Thank you very much.
[applause]