>> Dah Ming Chiu: Thank you. I'm very glad to visit Microsoft and thank you,
Jin, for inviting me.
So in today's talk I will cover a couple of topics. One is straight from the paper I will be talking about, which is a collaboration with PPLive on their P2P VoD system. And then after that I will introduce some of the other work we have been doing.
In the academic world people want to study more of the modeling: the capacity of P2P networks, the algorithms, and so on. So I will talk a little bit about that if there is time.
So the case for P2P VoD I think was already well established last year at the SIGCOMM conference, in fact by people sitting here: you wrote a good paper studying several VoD systems, including Microsoft's, and analyzed how you might use P2P technology to significantly reduce the server loading and make it more scalable. You call it peer-assisted VoD; I think P2P VoD means the same thing.
So the key challenge is clear. We have had P2P streaming systems working for quite a few years. I think that started originally from the CoolStreaming paper in INFOCOM several years ago. Since then several platforms have been built, and people already feel that P2P streaming is mature: we know how to do it and it can be done quite well.
But P2P VoD is a different story. For P2P streaming you have a lot of users viewing the same content at the same time, so intuitively it is easy to make them help each other share the content they are watching. They have more or less similar content in their buffers and can easily relieve the server by serving each other.
For P2P VoD you can have many peers viewing different movies, and there can be many movies. And even when they are viewing the same movie, they could be looking at different parts of it, so how to make them help each other is a much more challenging problem.
So how to build such a system, and whether it can be built: I think the paper last year was more of a paper study. It described how you might use prefetching for the peers to fetch the content, and under optimal conditions you can cut the server load down to around 10%. I think you predicted that through the analysis.
So the question is how to build a system that will actually deliver that. Apparently, while you were working on that paper, the people doing P2P-based streaming had already started building these kinds of systems. In the case of PPLive, they built such a system maybe last summer, tested it, and by last fall they had deployed it.
It reached a scale of several hundred thousand subscribers and many thousands of simultaneous users. I don't want to quote exact numbers because I don't know exactly what the scale is, but that is roughly the scale that was reached.
Yeah?
>> Question: This number is much smaller than the Live streaming.
>> Dah Ming Chiu: Yes.
>> Question: I mean, but essentially, I think VoD is a very popular form of service.
>> Dah Ming Chiu: Right.
>> Question: And that is from my personal value the ones that VoD is here.
>> Dah Ming Chiu: Right.
>> Question: They use that function. (Inaudible)
>> Dah Ming Chiu: Okay. So I think part of the reason is that when they first started the deployment they were trying to be cautious. They started it in a low-profile way: they just created a button on their streaming client, so some people may not have noticed the service.
Partly I am just conjecturing here, because in our collaboration with PPLive I was not there running their system, nor was I involved in building it. Mainly we were working with them to bring out the insights of the system design and, through the measurement study, some of the user behavior and how to measure the system to make sure it works well.
So to answer your question, my conjecture is that they were probably trying to be cautious, because P2P VoD actually does take a lot more server load. If you try to grow very fast, the performance may not be so good. Okay.
The other thing they can control in P2P VoD is the server loading and how well you do, by limiting the number of movies. If you reduce the catalog to only the popular movies you can probably reduce the load initially, but then you may lose some users if you offer only a few popular movies.
So they are going through a period of building up. This is not surprising compared to their P2P streaming, which is very mature. For P2P streaming the server loading is probably 1% or less; the server doesn't need to do much. But for P2P VoD, initially when they deployed it, as I will mention later on, the server loading was 20 to 30% compared to what the server would serve by itself. Only earlier this year did they cut it down to about the 10% level. Okay.
So they are still going through this, and I think they are still trying to optimize and make it work better. But what happened is they actually did succeed in demonstrating that a relatively large-scale system can be built and made to work. Okay.
Also, through the measurements we can see that the system delivers reasonable user satisfaction, and we will look at how to measure that. Obviously they also do some subjective measurement, by having friendly users look at the video and judge whether it delivers reasonable quality.
So this is just a little animation to show the point I made earlier: for streaming, essentially all the users are synchronized, looking at more or less the same video at the same time, whereas for P2P VoD you can have different users looking at different movies or different parts of a movie. That's why the problem is more challenging.
So what is the secret in their system? I don't know whether it is how everyone builds it, but the way they make it work is that, in addition to making all the peers share what is sitting in their memory buffer, they also make the peers contribute some storage. Okay. This is a significant difference from P2P streaming.
So as you watch movies, you leave some of the movie in your storage. When you are online you may be watching one movie, but if another user wants to watch a movie that is sitting in your storage, you can also serve that user. That is how they manage to make P2P VoD work.
Each peer contributes about 1 gigabyte of hard disk, so the key problem of P2P VoD is how to manage this storage so that the users leave the right content on their hard disks. Then, when another user comes online, there is sufficient content sitting on the disks of different online peers, and they can relieve the video server. That is the challenging problem. This is new; it is in addition to whatever you already have to do in P2P streaming.
Some of the other things I observed in building this system: in contrast to file-sharing protocols like BitTorrent, this system that PPLive built is really more like a distributed system. They are not letting the peers control so many things. In fact, if a peer wants to watch a movie they have to stay online; they have to essentially contribute.
Okay. Even if they are not actively contributing, they have to show that they are still connected, through heartbeat messages and so on. If they do not, they cannot watch. So this, I think, also solves the problem of free riding.
I think there are many other, less technical factors that are important to making such a system successful. For example, working with ISPs: I know they work quite closely with ISPs, and in fact ISPs are happy to help them host some content in many situations. Obviously you also need good content to draw the eyeballs and get commercials and so on, which I think is very important for the success of such systems.
So the problem we are looking at is really a problem of how to do content replication. What they are doing is what is known as multiple-video replication. In traditional P2P streaming essentially everyone is watching the same thing, so there is only one video you are concerned with, and there is a tracker that keeps track of all the peers watching that video.
In their system each peer is storing multiple movies, so the tracker system has to know, for a given movie, which peers are storing that movie, and provide that information. So you need to build a tracker system that works at that level, and you also need to make sure the I/O system is fast enough to bring a movie into memory when you need to serve other peers.
So the replication of content is at two levels. One is the movie level: you need to store multiple movies to serve other peers. The other is the chunk level: once you have a movie and are serving it to another peer, you record which chunks of that movie you have in a bitmap. This is just like traditional P2P streaming; there is no big secret there.
In their case the size of a chunk is about 2 megabytes, so to represent a movie the bitmap has a resolution of about 100 bits. So here's -- okay.
>> Question: So does the tracker know which chunks a peer has, or does the tracker just know that the peer has something of this movie (inaudible)?
>> Dah Ming Chiu: That's a good question. Actually, later on I will clarify this point, because the tracker does keep track of which chunks a peer has. Let's see -- sorry, I know the tracker keeps track of which chunks a peer has, but I've forgotten why it has to do that.
But for streaming, you mainly find out which chunks your peers have through gossiping, because that is more up to date. The tracker does keep a lot of statistics, though.
So the movies are divided into different units. You have what is called a chunk, which is the bigger unit of about 2 megabytes; this is the unit used for advertising, the unit of the bitmap you exchange with your neighbors. Then you have what is called a piece, which is the minimum viewing unit, about 16 kilobytes. And then you have the sub-piece, which is about one kilobyte, I think; this is the unit used for transmission.
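As a rough illustration of this hierarchy, here is a small Python sketch using the approximate sizes quoted in the talk (the speaker's ballpark figures, not exact protocol constants):

```python
# Approximate data-unit hierarchy as described in the talk.
CHUNK_SIZE     = 2 * 1024 * 1024   # ~2 MB: unit of storage and of bitmap advertisement
PIECE_SIZE     = 16 * 1024         # ~16 KB: minimum viewing (playback) unit
SUB_PIECE_SIZE = 1 * 1024          # ~1 KB: unit actually requested and transmitted

PIECES_PER_CHUNK     = CHUNK_SIZE // PIECE_SIZE       # 128
SUB_PIECES_PER_PIECE = PIECE_SIZE // SUB_PIECE_SIZE   # 16

def bitmap_bits(movie_size_bytes):
    """Bits needed to advertise which chunks of a movie a peer holds;
    a ~200 MB movie needs a bitmap of roughly 100 bits."""
    return -(-movie_size_bytes // CHUNK_SIZE)  # ceiling division

print(bitmap_bits(200 * 1024 * 1024))  # -> 100
```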
So you need to schedule transmissions from different neighbors; I will come back to this point later in the algorithms.
So there are three important algorithms. One is the piece selection algorithm; this is the streaming part, where you work with different neighbors to decide which pieces to get from which neighbor. The second is the replication algorithm, which is what I mentioned earlier. This is the important new aspect of P2P VoD: deciding which movies to store or keep on your hard disk.
This is like managing a cache. Okay.
To do this, the tracker collects information from different peers and then gives the peers a quantity that is essentially a supply-to-demand ratio. It tells you which movies are in supply and which movies are in demand, and given this information you can decide better which movie to store. Okay.
The third algorithm, which is very important, is the transmission scheduling algorithm. In their system, when a peer is trying to get a chunk, it actually tries to get it from several peers at once, so it needs to schedule how much to request from each neighbor to achieve load balancing.
This is actually a very tricky algorithm. If you don't do it right, you will not make good use of the uplinks of your neighbors.
So these three algorithms are all interesting and worthy of study as research. In their system they just built something ad hoc; they tried different ways of doing it and saw how well each worked.
In terms of piece selection, this part is similar to P2P streaming. We all know there are essentially two kinds of algorithms for pulling data. By the way, their system is more or less a mesh-based system, not a tree-based system, so the peers basically pull the content from neighbors.
One approach is sequential pulling: you are a little bit short-sighted and try to get what you need for playback, so the most urgent content gets the highest priority and you fetch it first.
The other algorithm is called rarest first. The name comes from BitTorrent; essentially you try to get the freshest content first, whatever the server has just sent out. This strategy helps propagate content and provides scalability.
And a third algorithm is called anchor based: you select certain anchor points in the video, randomly pick an anchor point, and then fetch sequentially from that anchor point. This achieves two things. One is that if users tend to drag to a certain area of the movie, you can use an anchor point to approximate that position. Essentially, the user tries to move their viewing to a certain point; you cannot jump exactly to that point, but you can at least move to an anchor point close to it, which is a slightly cheating way of simulating random seeking.
The other thing anchor points help with is that they give different peers a way to collect different parts of the movie, so that they can help each other. So that is also how anchor points can be used. In PPLive's system they actually use a mixture of these strategies. In fact they experimented with the anchor-point approach and found that, at least in the currently deployed system, they don't need it, because the users are not jumping around that much. You will see from the measurement data that this is the case.
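As a sketch, the three pull strategies just described might look roughly like this in Python; the piece indices, the `rarity` map (how many neighbors hold each piece), and the anchor list are illustrative assumptions, not PPLive's actual interfaces:

```python
import random

def sequential(missing, playback_pos):
    """Sequential/greedy: fetch the missing piece closest to the playback point."""
    ahead = [p for p in missing if p >= playback_pos]
    return min(ahead) if ahead else None

def rarest_first(missing, rarity):
    """Rarest first: fetch the missing piece held by the fewest neighbors
    (for live content this tends to be the freshest piece)."""
    return min(missing, key=lambda p: rarity.get(p, 0)) if missing else None

def anchor_based(missing, anchors):
    """Anchor based: jump to a random anchor point and fetch sequentially
    from there, approximating random seeks at coarse granularity."""
    start = random.choice(anchors)
    ahead = [p for p in missing if p >= start]
    return min(ahead) if ahead else (min(missing) if missing else None)
```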
So the second class of algorithms that is very important in the overall system design is the replication algorithm. In their system they are not doing any prefetching, as in the SIGCOMM paper last year; essentially they use the cache to store movies for other peers to use.
So the algorithm is basically a cache replacement algorithm. The user watches some movie and afterwards can simply decide to store it, but after a while the cache fills up. With one gigabyte, and each one-to-two-hour movie at the resolution they provide being about 200 megabytes or so, you can store four or five full-length movies; if you store only fractions of movies, you can store more.
So the question is which movies to keep in your cache. You could use a traditional cache management algorithm such as least recently used or least frequently used, but in the system they built they did a lot of experimentation and found that a weight-based approach is better than the traditional ones. Okay.
The weighting factor is based on two factors. One is how complete the movie is: some users may just be browsing, watching part of the movie and then going away, which is not so useful; if you watched almost the entire movie, that is more useful. So the fraction of the movie you have is one factor. The other factor is the availability-to-demand ratio. This is something the tracker collects from all the peers: which movies they are watching and how many peers are storing each movie. The ratio is computed from that, and you can get this information from the tracker before you make the decision. These two things multiplied together give the weight. For all the movies you look at this weight and then decide which one to throw away.
Once you start throwing away a movie, you throw away the whole thing; you don't try to keep partial movies. So that's the algorithm.
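A minimal sketch of such a weight-based eviction, assuming a "worth keeping" score that combines completeness with scarcity (demand relative to availability); the exact combination and sign convention are my assumption, since the talk only says the two factors are multiplied:

```python
def keep_weight(fraction_stored, availability, demand):
    """Hypothetical 'worth keeping' score: more complete and more scarce
    (low availability relative to demand) means more worth keeping."""
    atd = availability / max(demand, 1)        # availability-to-demand ratio
    return fraction_stored / max(atd, 1e-6)

def evict_until_fits(cache, needed_bytes, tracker_stats):
    """Evict whole movies (never partial ones) in increasing weight order
    until `needed_bytes` of space is freed.
    `cache`: movie_id -> (bytes_stored, fraction_stored)
    `tracker_stats`: movie_id -> (availability, demand), reported by the tracker."""
    freed = 0
    for movie_id in sorted(cache, key=lambda m: keep_weight(
            cache[m][1], *tracker_stats.get(m, (1, 1)))):
        if freed >= needed_bytes:
            break
        freed += cache[movie_id][0]
        del cache[movie_id]
    return freed
```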
The third algorithm, which I think is very important in their system and in many systems, P2P streaming or P2P VoD, is the transmission strategy. You are trying to get a chunk or a piece that many other peers have, and you can ask any one of them, or simultaneously ask multiple peers.
So how do you do this? Again, in their case they have a somewhat ad hoc way of doing it; they experimented many times. But this is something I am actually studying with my students, trying to model it more formally as an algorithm.
So the idea is this: you have all these peers out there, and each peer is receiving requests from many other peers, so they can be overloaded. You want to more or less use up all the peers' uplink to offload the server. How do you schedule this? If a peer goes to just one neighbor for the content, it is very risky, because that neighbor may be gone, and then you wait a long time, time out, and get nothing.
So the strategy they are using is to go to multiple peers simultaneously, even for the same piece or chunk: you ask different peers for different sub-pieces, rather than asking several peers for the same sub-piece, and then dynamically adjust how much to ask from each neighbor you are requesting from. This algorithm reminds me of work I did in the past on TCP congestion control; it is very similar. How do you adjust the window size and the timeout to make good use of the whole set of neighbors you have identified?
The algorithm here is more complicated than TCP congestion control, because you are working with multiple destinations, not just one, and that is very challenging. So this is something we are actively working on as a research problem.
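A toy sketch of the kind of scheduling being described: each missing sub-piece is assigned to exactly one neighbor, with faster neighbors getting proportionally more requests. The rate estimate (for example an EWMA of recent throughput) and all the names are illustrative assumptions, not PPLive's patented scheme:

```python
def split_requests(sub_pieces, neighbors, est_rate):
    """Assign each missing sub-piece to one neighbor, in proportion to that
    neighbor's estimated serving rate, so no sub-piece is requested twice."""
    rates = [max(est_rate.get(n, 0.0), 1e-6) for n in neighbors]
    total = sum(rates)
    assignment, i = {}, 0
    for n, r in zip(neighbors, rates):
        share = int(round(len(sub_pieces) * r / total))
        assignment[n] = sub_pieces[i:i + share]
        i += share
    if i < len(sub_pieces):                      # leftovers from rounding
        fastest = max(neighbors, key=lambda n: est_rate.get(n, 0.0))
        assignment[fastest] = assignment[fastest] + sub_pieces[i:]
    return assignment
```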
>> Question: Let me ask some question about this basically.
>> Dah Ming Chiu: Okay.
>> Question: The transmission strategy.
>> Dah Ming Chiu: Okay.
>> Question: When you talk about the receiving strategy, right?
>> Dah Ming Chiu: Uh-huh.
>> Question: But on the sender side --
>> Dah Ming Chiu: Uh-huh.
>> Question: What if it is trying to accommodate different requests? I mean, I may have, let's say, 10 or 20 peers asking me for content --
>> Dah Ming Chiu: That's true. That's true.
>> Question: -- rerouting them, or accommodating for (inaudible), or I mean considering for an ISP --
>> Dah Ming Chiu: That's true. I understand. Unfortunately, with the PPLive system I didn't get enough detail from them. In fact, they have a patent on this; at the time we were writing this paper they were applying for the patent, so we didn't get into the details. But as I study this problem I know the question you ask is very relevant, because the sender probably does not want to hold the requests from all users, since that would increase the delay it sees. There are many variations to this.
The sender probably wants to hold only certain requests, and that will determine how the users set the timeout and so on. This is an algorithm that can be quite complicated in reality, which is why it is quite interesting. I know everyone building a P2P streaming system has to work on this, and it affects the real efficiency of the system as well as the user-perceived performance.
>> Question: The unique thing is basically the comparison to congestion control. I think that is a great direction, although I don't know that anyone has solved it yet: basically, how we can combine a P2P congestion control algorithm into this framework.
>> Dah Ming Chiu: Yeah, what we need is actually a new kind of congestion control. I think that by the time somebody works out the exact model, this is going to receive a lot of interest, probably surpassing traditional congestion control. Okay.
So in the next part I will look at some of their measurement data. In this area we look at several things. First, user behavior: how much of a movie users view, whether they move around, how often users come into the system, and so on.
The second part is about replication: how to measure supply and demand. I touched on this earlier, so we will see some real data.
The third part is how to measure user satisfaction. This is a very important problem, and in fact it is also something we are actively studying as a research problem. If you want to deliver a P2P VoD or streaming system, as a platform or as a content provider, you had better care about user satisfaction.
In the IPTV world people study this very seriously. They have a term for it, QoE, Quality of Experience, and the whole issue is how to measure some simple parameters and then predict user satisfaction. You are just broadcasting some content; you don't know whether the users are happy or not, and you cannot afford to ask them one by one. If they are unhappy and you are the operator, you are in big trouble, because they are going to phone in, and that's just a lot of problems. So you want simple ways to predict and monitor how the whole system is doing. We will show some results in this area as well.
And in the last part I show some results about what kind of uplink bandwidth users have, whether they are behind firewalls, and some other statistics.
So the data we got from PPLive to do this study is based on traces, and you can think of each trace as essentially a sequence of records. Each record, in a simplified form, contains a user I.D., a unique I.D. identifying the user; a movie I.D. identifying which video the user is watching; and then a start time, an end time, and a stop position.
So each viewing record could cover just part of a movie; if you jump to another point in the movie, that starts a new viewing record. Okay.
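For concreteness, one trace record as described above might be represented like this; the field names are illustrative, not PPLive's actual log schema:

```python
from dataclasses import dataclass

@dataclass
class ViewingRecord:
    """One entry of the simplified viewing trace described in the talk."""
    user_id: str       # unique I.D. identifying the user
    movie_id: str      # which video the user was watching
    start_time: float  # when this contiguous viewing segment started
    end_time: float    # when it ended
    position: int      # where viewing stopped; a jump starts a new record
```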
>> Question: -- so they are collected on the client side and then uploaded to the server?
>> Dah Ming Chiu: Yes, these logs are collected by some kind of log server; maybe the tracker is doing that job or something, and the peers periodically send messages to it. Okay. Yeah.
Collecting this kind of information, as I said, is also an interesting problem. You probably cannot afford to collect everything; you are probably doing some kind of sampling, and how do you design it so that you collect as much information correctly as possible? So that is another challenge.
So after we got some traces from them, we looked at a bunch of movies; they are all similar in some sense, and here are three typical ones. You can see they are each around one to two hours long, and we show here how many chunks there are and how many viewing records we collected. We use the fact that if a viewing record starts from the beginning of the movie, that more or less identifies a unique viewer.
Actually the user I.D. can also identify the viewers, but in the first version of their software they did not have a unique viewer I.D., so we had to rely on the viewing position to identify viewers. And you can see the average number of chunks per viewing record is not that high, just one, two, or three chunks.
One thing that is very interesting is how much the user views: are users mostly viewing the whole movie, or are they doing a lot of browsing? As you can see, the average fraction of a movie that users view looks very low, just a few percent. But if you look at the distribution, what you find is that a lot of the time the users are browsing: they look at the first few minutes of a movie, and if it is not interesting they go to the next movie, and so on.
So a lot of the viewing records are short, but a significant number of people actually finish viewing the entire movie. For example, for the movie "To Here" you can see there are a few thousand users who saw the entire movie, because there is a big jump at that particular point here.
This figure shows whether there are specific points that people tend to jump to. It shows that for most of these viewing records people start from the beginning, which is the significant starting point here, and after that there is no specific point in the movie that people tend to jump to; they just jump randomly to different points. So that is what this figure shows.
>> Question: (Inaudible) -- interface doesn't really give you any kind of
chaptering information, right? You get into the DVD menus.
>> Dah Ming Chiu: Um --
>> Question: -- obviously you just have the slider bar.
>> Dah Ming Chiu: Yeah, they are just using the slider bar, I think. I think that is why there is no specific place people tend to jump to. If there were chapters, maybe that would be different. If there's --
>> Question: So you could cause the jumps to be more predictable (inaudible) change in (inaudible) in a way they probably like.
>> Dah Ming Chiu: That's true. That's true.
>> Question: (Talking over each other) -- the anchoring algorithm may become even more interesting with chapters.
>> Dah Ming Chiu: Right. Right. Exactly. Exactly. In their system I think
there's no chapter.
>> Question: You can experiment and then try and force anchoring points.
>> Dah Ming Chiu: You had a question? Okay. So this one shows how long the peers tend to stay in the system. This is very important: if they tend to stay longer, they can help the server more. We looked at several days of trace records, and this shows, for each day, the distribution of users staying for anything from a few minutes up to more than two hours. It is more or less flat, and the encouraging thing is that users tend to stay for 15 minutes or more, so they can actually provide some help.
>> Question: So I mean this is the number of unique views, right?
>> Dah Ming Chiu: Yeah.
>> Question: They basically are in --
>> Dah Ming Chiu: Right.
>> Question: -- the system. And essentially you have something like 300 falling into this per day.
>> Dah Ming Chiu: Right. Right. Right. This is -- I think this data may be --
>> Question: Something around Christmas time.
>> Dah Ming Chiu: Christmas time. Right, right. This is actually the data that was originally submitted to SIGCOMM. We later adjusted some of the numbers when we sent in the final version, because some additional data came in after the paper was accepted. Yeah.
>> Question: You mean the (inaudible) -- describing this --
>> Dah Ming Chiu: No, no, we had some data which was measured in May of 2008 or so, for different things. You will see a table later on.
>> Question: Higher basically (inaudible) -- saw something.
>> Dah Ming Chiu: Um, no, I didn't get any newer data on the number of users. That is probably quite difficult; I don't know whether the numbers are significantly higher or not. That is more like marketing. I don't have the latest information on that.
>> Question: I think basically, I mean, so look at this VoD, right?
>> Dah Ming Chiu: Mmm.
>> Question: In the (inaudible) column, a number of users watch a bit or part of the movie and they don't finish.
>> Dah Ming Chiu: Right.
>> Question: How can we interpret this behavior? Presumably you have seen at least 20 or 30% of the users watch something like 15 minutes to an hour; they don't just (inaudible) the movie, but they don't finish watching it either.
>> Dah Ming Chiu: Right, right. I think that is interesting. For a VoD system, maybe a lot of the time people are just browsing; they just want to kill time. I'm just guessing, I don't know. They look at some movie, are not interested, and it is easy to go to the next one. This may be typical behavior, or it may just be the behavior in China; I don't know. If you have better quality and better content, maybe you get different behavior. So using this as a measure of behavior is somewhat dangerous, since it depends on what content you have there. So...
>> Question: One thing of interest: when you terminate the application, does it stop, or does it stay in the system as a service that hangs around and still serves other peers?
>> Dah Ming Chiu: Of the user, right?
>> Question: Yes. When you actually terminate it, does it (inaudible) continue serving peers?
>> Dah Ming Chiu: This part I don't know. I think the software could be designed to do either one.
>> Question: (Talking over each other) -- quit.
>> Question: I'm sorry.
>> Question: I think they actually quit.
>> Question: They actually quit.
>> Question: Even so, I know there is a bit of a performance difference --
>> Question: Okay.
>> Question: -- when this application is running.
>> Question: Uh-huh.
>> Question: Versus not running, with regard to other applications.
>> Question: All right.
>> Question: Whether I'm doing web browsing.
>> Question: So my question really is: your results here for user behavior are stronger if the (inaudible) actually quits, right?
>> Dah Ming Chiu: Yeah.
>> Question: Than if, when you quit, it just sneakily goes off and still does stuff in the background, in which case...
>> Dah Ming Chiu: No, what I'm showing you in this case is how long users stay in the system watching, because this is what is logged. The time they actually hang around afterwards, during which they may still be helping, I don't know about that part.
>> Question: Okay.
>> Dah Ming Chiu: This is how much they are actually watching, and during that time you can be sure they are helping. Okay. So --
>> Question: This is not dead time?
>> Dah Ming Chiu: Yeah.
>> Question: And this is the VoD --
>> Dah Ming Chiu: The VoD -- (talking over each other) -- and the other thing, on the right, is that we can see there are definitely rush hours or prime time in a day. At certain times of day you have a lot more users than at other times. This is again for the whole week, and you can see that for them it is like lunchtime there are a lot of people, and in the evening, and then not much after midnight or so.
>> Question: Question.
>> Dah Ming Chiu: Yeah?
>> Question: I mean, what is the vertical axis? Doesn't that fall (inaudible), I mean --
>> Dah Ming Chiu: The number of concurrent users watching a particular movie. So these are just --
>> Question: So 200 to 250?
>> Dah Ming Chiu: Yeah.
>> Question: Okay. And the total number of users is on the order of 300K?
>> Dah Ming Chiu: Right.
>> Question: I think they have something like 300 channels or something like
that, right?
>> Dah Ming Chiu: Yeah, yeah. They have something like 100 or 200 popular movies. I think they adjust that number; sometimes they have 500 movies or whatever. By the way, I must admit I'm not an avid user of their system. I've probably tried it with my students, watching part of a movie once, so I'm not too familiar with a lot of the real system myself.
>> Question: These statistics are only available for --
>> Dah Ming Chiu: Huh?
>> Question: -- inside (inaudible). Normal users outside, without PPLive's cooperation, are not going to get this, because this is based on (inaudible). Here people try to crawl that data, right? (Inaudible) --
>> Question: You can get something, but not as accurate --
>> Dah Ming Chiu: Not as accurate.
>> Question: No. So I think this data was released by the company, rather than your students collecting it themselves; that would be impossible.
>> Dah Ming Chiu: (Inaudible) definitely. This is given to us by the (inaudible).
Yes, yes.
The other thing is the measurement that helps with the replication job. At the movie level we can look at the supply: how many peers, among all the peers who are viewing some other movie, are storing the three movies we are interested in. You can see that out of roughly 200,000 users you sometimes have several thousand users storing the movies we are looking at, and sometimes a few hundred. And then you look at which chunks --
>> Question: Let me ask you a question.
>> Dah Ming Chiu: Okay.
>> Question: I mean, I assume the data is similar to the previous slide. I mean, you are studying --
>> Dah Ming Chiu: Yeah, yeah. This is from the same data set, the same traces.
>> Question: I mean, we notice some basic issue (inaudible) look at the
(inaudible).
>> Dah Ming Chiu: Uh-huh.
>> Question: -- (inaudible) increased by something like (inaudible). You have
something like 5 (inaudible).
>> Dah Ming Chiu: Uh-huh. Uh-huh.
>> Question: Supposedly this should mean you have at least 5,000 more users watching this. But if you look at the previous slide, the top is something like 300.
>> Dah Ming Chiu: Uh-huh.
>> Question: How do you -- I mean, why is this discrepancy?
>> Dah Ming Chiu: Well, I think this number must depend on the resolution. I mean, if you are looking at any particular --
>> Question: I mean, if it is a lot dependent on the resolution, it means you have (inaudible) the content (inaudible).
>> Dah Ming Chiu: Right. Right. I think there are certainly a lot of users who are browsing, so in this case they could be storing just a small fraction of the movie. As long as they are storing even one chunk of the movie, they show up in this figure.
>> Question: (Inaudible) -- versions of the movie (inaudible), right?
>> Dah Ming Chiu: Right. You can be storing just a portion of the movie, and as far as the tracker is concerned you are still storing the movie. Yeah.
So the second figure shows you which chunks actually get stored. You can see that the first few minutes get stored a lot more; however, a sufficient number of copies of all the chunks are there. Essentially every chunk is covered by about 30% of the peers storing that movie.
>> Question: If a movie is a very popular movie, right --
>> Dah Ming Chiu: Uh-huh.
>> Question: -- at least the number of users (inaudible).
>> Dah Ming Chiu: Yeah.
>> Question: I mean, after the 71st(phonetic) chunk you don't have any users storing that chunk.
>> Dah Ming Chiu: This figure looks like that because these movies are different lengths.
>> Question: Oh.
>> Dah Ming Chiu: This movie is much shorter, so it always gets stored to the end. Yeah.
>> Question: Okay.
>> Dah Ming Chiu: Okay. So this one is called the availability-to-demand ratio. It is computed by dividing how many people are storing the movie by how many people are watching it. Let's see -- I'm not sure whether it is demand-to-availability or availability-to-demand; anyway, it is trying to capture the ratio of people storing the movie versus people watching the movie. Okay.
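Whichever way around it is named, the quantity is just the ratio of the two counts the tracker collects, roughly:

```latex
\mathrm{ATD}(m) \;=\; \frac{\#\{\text{peers currently storing movie } m\}}{\#\{\text{peers currently watching movie } m\}}
```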
>> Question: All the study is basically on the storage side, basically where I have cached a portion of the movie versus, I mean --
>> Dah Ming Chiu: Right.
>> Question: Users asking for that.
>> Dah Ming Chiu: Right.
>> Question: I thought in a peer (inaudible) it is bandwidth which is more important than storage, right? Let's say I allow the user to cache (inaudible) gigabytes --
>> Dah Ming Chiu: Uh-huh.
>> Question: I can cache something like 10 (inaudible). I think even today, when they cache (inaudible), they can cache five minutes. So the amount of storage is basically not that small.
>> Dah Ming Chiu: Uh-huh.
>> Question: (Inaudible) -- storage in that (inaudible).
>> Dah Ming Chiu: Right.
>> Question: More important piece here is how much bandwidth is available.
So ->> Dah Ming Chiu: Well, I think the bandwidth part is similar to P2P streaming.
I think that the new problem for P2P VoD is the storage. How can you manage
the storage so that when you have a user coming in to watch a particular movie,
at the same time there may not be other people watching the same movie, but
you want to make sure that there are other people storing that same movie that
these peers want to watch.
>> Question: We can talk about this offline --
>> Dah Ming Chiu: Yeah, yeah, yeah.
>> Question: -- but I don't think the storage is that critical.
>> Dah Ming Chiu: Okay. Okay. So the next problem, as I said, is the measurement of user satisfaction. This is mainly measured in terms of a quantity called fluency, which is the percentage of time you are actually viewing out of the total time, including the time spent buffering, frozen, and so on.
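In other words, for a viewing session the fluency is roughly:

```latex
\mathrm{fluency} \;=\; \frac{T_{\text{viewing}}}{T_{\text{viewing}} + T_{\text{buffering}} + T_{\text{frozen}}}
```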
This information is sent from every peer to the logging server at the end of the viewing. The left figure shows how many reports you get; you can see that when there are more users you get more reports. The right figure is the interesting one: it shows the fluency distribution you actually get. You want them all to be up around here; then it is good. You can see a sufficient number of users are happy, which is when the fluency is in the 0.9-to-1 range, a high fluency. But this kind of measurement is always bimodal: you have a bunch of users who have a hard time even starting; they probably do a lot of buffering and so on and then just give up, which is why you have some users with very low fluency. And then for the rest of the users you get this kind of increasing curve. So this is a very typical measurement of satisfaction.
>> Question: (Inaudible) -- viewing time plus buffering and freezes.
>> Dah Ming Chiu: Pardon?
>> Question: Fluency is viewing time divided by total time.
>> Dah Ming Chiu: Total time, right?
>> Question: By total time, you don't mean the length of the movie, right?
>> Dah Ming Chiu: No, no, no.
>> Question: You mean viewing time plus buffering plus --
>> Dah Ming Chiu: Yeah, exactly, exactly. Yeah, yeah.
>> Question: -- so a 50% fluency ratio --
>> Question: Right.
>> Question: -- meaning, during that time, say you watch a 100-minute movie: 80% of the time you are viewing, and the other -- (talking over each other) --
>> Dah Ming Chiu: You are watching a commercial or something. They make you watch commercials sometimes when you are -- yeah, yeah.
>> Question: This actually is pretty bad performance. I mean --
>> Dah Ming Chiu: Yeah, yeah.
>> Question: Yeah, I would think the number of interruptions would be crucial, too, because being interrupted every three frames for one frame would be much worse than one big gap at the beginning.
>> Dah Ming Chiu: Right.
>> Question: I get the impression if that's the case, some of its competitors are
doing better in the VoD case than this one.
>> Dah Ming Chiu: I must say their performance is not that great, but probably passable. Again, this is just a snapshot of performance at a particular point in the deployment. The whole idea of the paper is not to present an exact benchmark of their performance; it is more about describing the design issues, what the important problems are, what to measure, and how to measure it. Those are the things I'm trying to convey here. Okay.
>> Question: This is -- (inaudible) --
>> Dah Ming Chiu: Yeah.
>> Question: And --
>> Dah Ming Chiu: Right. This is (inaudible).
>> Question: Something like May timeframe?
>> Dah Ming Chiu: The May timeframe? No, I don't think I have that. This may not be related to the other figures; it is just another snapshot, looking at the server and how it is delivering during the day: what its CPU usage is, its memory usage, and what kind of hardware they are using for the server. So this is not that interesting.
Now this may be interesting to you. Huh?
>> Question: A question on this number. Does that include both the tracker and the content server? So, I mean --
>> Dah Ming Chiu: No, I think they have different servers for the tracker and for providing the source.
>> Question: So this is just the content --
>> Dah Ming Chiu: Yeah, yeah, yeah.
>> Question: -- server.
>> Question: Will they exhibit?
>> Question: I think it's actually interesting, because it shows that the server is pretty much maxed out at about 70% utilization. If you're actually running a server, you don't want it to hit 70%; it's pretty much maxed out. And it is hitting that at these two times, the peaks that you see.
>> Dah Ming Chiu: Mmm.
>> Question: Corresponding to maximum upload rate.
>> Dah Ming Chiu: Yeah. Yeah. Whatever you can make out of this, yeah. So this is some new data collected after the paper was accepted. One of the reviewers said, why don't you show us the typical uplink contributions from different peers, and the downlink, and so on. So this was measured in May of 2008. You can see the distribution of what the peers are contributing and how much the server is contributing. In this one the server is actually doing very well; it is only 8% or something, and the rest is the distribution of different kinds of peer contribution. So I think this is kind of interesting.
>> Question: (Inaudible) -- so the download rates of the peers have a (inaudible) distribution, right? The playback rate is something like 360 kilobits per second.
>> Dah Ming Chiu: Right.
>> Question: For this movie, right?
>> Dah Ming Chiu: Right.
>> Question: You have peers downloading -- you have something like 10% of peers downloading above 600 (inaudible).
>> Dah Ming Chiu: I don't know why is that, sorry. That's what I don't know.
>> Question: Because that would usually indicate --
>> Dah Ming Chiu: That's a very small percentage, right?
>> Question: 10% of the (inaudible).
>> Dah Ming Chiu: Yeah. Yeah.
>> Question: And if you look at 360 to 600, depending on how many of the peers are actually close to 360, you may have a lot of peers --
>> Dah Ming Chiu: Yeah. I think this one should maybe be broken down into finer granularity. If it is closer to 360, it could be that those peers are seeing a lot of losses, or they just have to do a lot of retransmission or whatever. That is possible as well. Yeah.
So then the issue is how to measure the server loading. It turns out they define the server loading during prime time. The prime time is determined by looking at the time of day and taking the busiest two or three hours. Outside prime time you don't care so much: the server is already deployed, and if it carries a higher percentage of the traffic that is okay. It is during prime time that you want to make sure the server is not loaded too much. So that is how it is defined.
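My reading of that definition, stated as a formula (an interpretation, not a quote from the paper):

```latex
\text{server loading} \;\approx\; \frac{\text{bytes uploaded by the servers during prime time}}{\text{total bytes downloaded by all viewers during prime time}}
```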
And as I mentioned earlier, for P2P streaming they told me they can achieve very low server loading, maybe less than 1%; I forget the exact number. For P2P VoD, when the paper was written the server loading was 20 to 30 percent, and by now it is around 10%. Yes?
>> Question: A lot of servers to go through --
>> Dah Ming Chiu: Yeah.
>> Question: Are they geographically distributed, and do they dynamically provision servers --
>> Dah Ming Chiu: That is what I don't know; I don't know the details. I know they have probably placed servers in different ISP networks. They actually work with the ISPs to decide where to put the servers, and an ISP may even provide them a place to put a server that has high uplink bandwidth and these kinds of things.
>> Question: -- (inaudible) manage that (inaudible) --
>> Dah Ming Chiu: This slide has some information about NAT traversal: about 80% of the nodes are behind NAT boxes, and there are maybe three kinds of NAT boxes. They use something like the STUN protocol to measure that.
So for the concluding remarks, the main message, as I said earlier, is that this is a systems paper: we are looking at a relatively large-scale P2P VoD deployment at this stage, and later on we will probably see even larger scale. We look at the design and the insights we get from the deployment in the PPLive case, we identify important research problems to study, and we discuss how to do measurement: what metrics to measure and how to measure them, both for the replication algorithm and for user satisfaction. And that's it. And I think --
>> Question: Let me ask a question about when you track this data.
>> Dah Ming Chiu: Okay.
>> Question: They tell you about the core of this system (inaudible). It seems to me that the current system would have issues if you really wanted to put all VoD movies onto it. What I mean is, this may be okay for, let's say, 100 to 500 popular movies. But let's say you (inaudible) website into a VoD service.
>> Dah Ming Chiu: Uh-huh.
>> Question: Then the number of movies may be pretty large.
>> Dah Ming Chiu: Right.
>> Question: Look at the current algorithms, basically need to track each
chunk, where are the peers holding these chunks, right?
>> Dah Ming Chiu: No, no, no. The tracker is only responsible, I think, for telling peers, if you come in and want to watch a particular channel or movie, which other peers have stored that movie; it does not work at the chunk level.
>> Question: So it's basically on the movie level, not the chunk level.
>> Dah Ming Chiu: Yeah, yeah. The chunk-level information is probably just additional; they are not necessarily keeping it up to date.
>> Question: Okay.
>> Dah Ming Chiu: At the chunk level they use gossip.
>> Question: Okay. Okay. Okay.
>> Dah Ming Chiu: Yeah. The tracker works at the movie level.
>> Question: So the (inaudible) actually, beyond the server, the pieces are basically at the chunk --
>> Dah Ming Chiu: Right, right. Because you exchange the bitmap, yeah.
>> Question: -- (inaudible) a difficult case to handle, but a common one the UI brings out, is people who watch a movie and fast-forward, so they actually only want one frame out of every chunk, or you know --
>> Dah Ming Chiu: Yeah, they don't have this feature; they don't support it. This may be difficult, yeah.
>> Question: Well, that works. (Laughter)
>> Dah Ming Chiu: They only allow you to jump to a particular point, not to fast-forward. Yeah.
>> Question: Okay. Let me ask one question. I've heard some speculation that the PPLive system performs well because it is in effect subsidized by a lot of users with large open pipes, like people in universities. All right. Having --
>> Dah Ming Chiu: I think this is true for all the P2P systems. I think especially
if you are in China, a lot of the ADSL users, they don't have a lot of uplink
bandwidth.
>> Question: -- the measurements could either support or disprove this particular speculation. I know it's speculation, but once it is mentioned, I say quantify it.
>> Dah Ming Chiu: So from this one I think you can see some --
>> Question: If you look at this study, I mean, the number of these kinds of peers in the (inaudible).
>> Question: Yep.
>> Question: So the peers (inaudible), I mean --
>> Question: 60%.
>> Dah Ming Chiu: About 50%, having less than the playback rate, roughly.
>> Question: Yeah. I --
>> Dah Ming Chiu: It's not --
>> Question: Okay. Great.
>> Question: According to live streaming designs --
>> Dah Ming Chiu: Uh-huh.
>> Question: -- in which they talk (inaudible) Sitcom Ultra(phonetic) --
>> Dah Ming Chiu: Uh-huh.
>> Question: -- they mainly do use (inaudible) peers.
>> Dah Ming Chiu: Right.
>> Question: University peers. Those peers' bandwidth is actually much higher than the others' (inaudible). It is almost like 100 megabits.
>> Question: So interesting question is what happens when university network
administrators decide this is not going to work and turn it off.
>> Dah Ming Chiu: Yeah. I think for this, I mean --
>> Question: (Inaudible) study, you need to inject those studies (inaudible). (Laughter) And also there's a difference when there is a (inaudible) downloading, because of NAT traversal (inaudible) or something --
>> Question: Usually for the university network we (inaudible) basically (inaudible). They don't have the problem of NAT traversal (inaudible).
>> Question: Well, in any case it's (inaudible). Yes. Okay. Many of the
universities can (inaudible).
>> Dah Ming Chiu: So do you think we should still go through part two quickly -- maybe about 25? It is more about the modeling work we are doing; probably some things you already know.
>> Question: We have something like 15 minutes.
>> Dah Ming Chiu: 15 minutes; I'll quickly flip through them. Okay. So there is also great academic interest in this P2P area. The talk I just gave was more about building the system.
In the academic community people are studying basically two kinds of questions. One is: what is the limit, in terms of what you can theoretically do? The second question is: how do you model the algorithms that can achieve these limits?
For people working in this field, the secret, as you already know, is that for P2P to work you essentially have to use multiple trees to distribute the information. You can build these trees with a tree-based method or with a sort of mesh, but essentially you provide multiple paths from the server to the other peers.
>> Question: That's basically (inaudible) -- like in live streaming versus VoD?
>> Dah Ming Chiu: This is streaming. Live streaming. Yeah, yeah. So this is different from when people were doing multicast; there they were focusing on efficiency rather than maximum throughput. If you don't make two assumptions, the capacity limit is a complex problem, because you have to study how to pack different trees into a given physical network. But people typically make these two assumptions. One is what is called the uplink sharing problem; this was introduced in Mundinger's thesis in 2005. Essentially you assume the network is not a bottleneck, so only the uplinks of the peers are the bottleneck. You also make a fluid assumption, which is to divide the content into as many small pieces as possible. In the limit you can derive the result that the maximum throughput you can achieve, and in fact pretty much attain, is the minimum of the server's uplink and the total uplink, including the server's, divided by the number of receivers.
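Written out, the uplink-sharing bound being described is (with $u_s$ the server's uplink, $u_1,\dots,u_n$ the peers' uplinks, and $n$ receivers):

```latex
r_{\max} \;=\; \min\!\left( u_s,\; \frac{u_s + \sum_{i=1}^{n} u_i}{n} \right)
```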
Some other useful results in the theoretical literature use a stochastic model of the peer population, the kind of modeling done around 2004. I also want to mention work we did in 2006 studying, in the theoretical setting, the tradeoff between throughput and fairness of contribution: different designs achieve different levels of throughput.
So for the question of the theoretical capacity limit, there is already a rich literature. But what is beginning to happen is that more and more papers focus on modeling the algorithms themselves, because we see a lot of really successful deployments, such as PPLive's, and we would like a more rigorous study of the algorithms.
For example, basically all the algorithms people are studying are for the mesh network case, which is probably the more challenging case; the tree case is more predictable, so the algorithms tend to be simpler. One important result concerns the question of push-based versus pull-based algorithms. There is a nice paper on this topic by Sanghavi, Hajek, and Massoulié, in the Transactions on Information Theory. Essentially they compare the pull type of algorithm with push. The insight is that push is important in the beginning, for the fresh chunks the server is sending out, because it helps the scalability of the distribution. When there is already a significant number of copies of a piece out there among the peers, then the pull method is better. So that paper is very nice in discussing these issues.
We also did a paper in ICNP 2007 modeling the streaming case; the other work I mentioned was more about file downloading, looking at maximum throughput, delay, and so on. In the ICNP 2007 paper we created a model to compute continuity. The intuition is that you would think the greedy algorithm, fetching everything sequentially, is the right thing to do, but it turns out that if you want the system to scale it is important to do rarest first. There is a model of this in the paper. The insight is very similar to the previous work I just mentioned, because rarest first is essentially very similar to push: if every peer, when pulling, selects rarest first, it is like pushing the new pieces first, giving priority to the new pieces.
The important result in this paper is that we show a mixed strategy is actually the best: with a mixed strategy you can do well in both dimensions. So this is our model. You have n peers, and the server pushes to the different peers. After every time slot everything shifts over like a sliding window and you play back one piece. The simplifying assumptions, of course, are that you have a set of peers viewing at the same time, with the same playback buffer size, and a bunch of other assumptions. But under these assumptions you can compute the continuity, which is the probability of having a particular piece at the time you need to play it back. Okay.
And you can compute this for different piece selection algorithms. This is showing the
sliding window: each peer's buffer is like a sliding window, and in each time slot each
peer randomly selects a neighbor and gets a piece. Which piece you get is determined by
the piece selection algorithm, and you can compute, for each time slot, the probability
that a particular piece is obtained from your neighbor. Essentially, the probability that
buffer position i+1 is filled is equal to the probability that you had the content in
buffer position i in the previous time slot, plus the probability that you get that piece
in the current time slot.
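Written out in symbols (my notation, not necessarily the paper's), with p(i) the
steady-state probability that buffer position i is filled, position 1 the newest slot and
position n the playback position, that statement is the recursion

    \[
      p(i+1) \;=\; p(i) + q(i), \qquad i = 1, \dots, n-1,
    \]

where q(i) is the probability that the piece for position i is downloaded during the
current slot; q(i) is the part that depends on the piece selection strategy, and the
continuity is p(n). If one further assumes that the server pushes each new piece to a
single peer chosen uniformly at random among the N peers, the boundary condition is
p(1) = 1/N.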
Through the model you can get each of these quantities, and then it becomes a matter of
solving a difference equation to get the continuity. Using this method we can study the
greedy algorithm and rarest first. This illustrates the two: greedy gets the piece that is
closest to playback, and rarest first gets the newest piece, and you can set up the
difference equation for each case. How you arrive at the difference equation is shown in
the paper, and then we solve these equations. I won't go through the derivation, but you
can get this kind of numerical result comparing the two piece selection strategies.
And you can see that in this case rarest first is doing better. The focus is on the
probability p(i) when i equals 40, which is the playback position. Then we show that we
can set up the same thing for the mixed strategy, which again reduces to a difference
equation, and you can study the mixed strategy. After we published the paper, we were
actually able to prove theoretically that the mixed strategy is always better than both
greedy and rarest first. The mixed strategy basically says: we use part of the buffer for
rarest first and the other part for greedy.
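To make the comparison concrete, here is a small Monte Carlo sketch of the synchronized
sliding-window model described above, written in Python. It is an illustrative
approximation rather than the paper's difference-equation analysis: the parameter values,
the rule that the server pushes the newest piece to one random peer per slot, and my
reading of the mixed strategy (rarest first within the newest M positions, greedy
otherwise) are assumptions of the sketch, so only the qualitative comparison of the three
strategies is meaningful, not the absolute numbers.

    # Monte Carlo sketch of the synchronized sliding-window model (an
    # illustrative approximation, not the paper's exact analysis).
    # Position 0 is the newest buffer slot, position BUF-1 is playback.
    import random

    N = 500          # peers, all playing back in lockstep (simplifying assumption)
    BUF = 40         # buffer length; continuity is measured at position BUF-1
    SLOTS = 400      # simulated slots
    WARMUP = 100     # slots ignored while the buffers fill up

    def choose(strategy, M, mine, theirs):
        """Pick the buffer position to request from the neighbour, or None."""
        wanted = [i for i in range(BUF) if theirs[i] and not mine[i]]
        if not wanted:
            return None
        if strategy == "greedy":
            return max(wanted)                    # missing piece closest to playback
        if strategy == "rarest":
            return min(wanted)                    # newest missing piece, as in the talk
        fresh = [i for i in wanted if i < M]      # "mixed": rarest first in the newest
        return min(fresh) if fresh else max(wanted)   # M positions, otherwise greedy

    def continuity(strategy, M=10, seed=1):       # M=10 is an arbitrary split point
        rng = random.Random(seed)
        bufs = [[False] * BUF for _ in range(N)]
        hits = total = 0
        for t in range(SLOTS):
            for b in bufs:                        # playback, then sliding-window shift
                if t > WARMUP:
                    hits += b[-1]
                    total += 1
                b.pop()
                b.insert(0, False)
            bufs[rng.randrange(N)][0] = True      # server pushes newest piece to one peer
            snapshot = [b[:] for b in bufs]       # pulls in a slot happen "simultaneously"
            for p in range(N):
                q = rng.randrange(N)
                if q != p:
                    pos = choose(strategy, M, snapshot[p], snapshot[q])
                    if pos is not None:
                        bufs[p][pos] = True
        return hits / total

    for s in ("greedy", "rarest", "mixed"):
        print(s, round(continuity(s), 3))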
This is a closer look at how these three strategies do over a period of time. It is not
very easy to see, but the continuity of the mixed strategy is almost one.
With the mixed strategy you have to decide how big a portion of the buffer to use for
rarest first versus greedy, which is this parameter M. So how do you set this parameter?
You don't want to have to set it by hand, because the right value depends on the
population size. You can actually let it adapt: just take any M in the middle and then try
to make sure this p(M) achieves some target probability, say 0.3. Because the system is
not very sensitive to what you pick for the target probability, by doing this you can
adapt to the right M when the population size changes.
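A minimal sketch of that adaptation rule, under my own assumptions about the details (the
step size and the bounds are mine; only the 0.3 example target comes from the talk, and
the direction of the adjustment relies on the model's property that the fill probability
grows toward the playback position):

    # Hedged sketch of the adaptation rule: keep the measured fill probability
    # at the split point M near a target value (the exact target is not critical).
    # Assuming the fill probability increases toward the playback end of the
    # buffer, if p(M) is below target we move M deeper, and vice versa.
    TARGET = 0.3      # example target from the talk

    def adapt_M(M, p_at_M, buf_len):
        """One adaptation step for the rarest-first / greedy split point M."""
        if p_at_M < TARGET and M < buf_len - 2:
            M += 1    # boundary too empty: enlarge the rarest-first region
        elif p_at_M > TARGET and M > 2:
            M -= 1    # boundary comfortably full: shrink it
        return M

    # hypothetical usage, called periodically by each peer:
    # M = adapt_M(M, fill_prob_at(M), BUF)

In use, a peer would call this periodically with a smoothed estimate of the fill
probability it observes at position M; the fill_prob_at routine in the comment is a
hypothetical stand-in for that measurement.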
In more recent work we are continuing along this line. We made a lot of assumptions in
that model, so we are trying to relax them: letting different peers have unsynchronized
playback, modeling different kinds of start-up algorithms, as well as the piece selection.
So that's one area where we're doing work. The second area, which I already mentioned, is
to come up with a model, like a future generation of congestion control for P2P, for how
to do this transmission scheduling.
The third area we're working on is ISP-friendly content distribution. So these are the
three areas.
So, the concluding remarks; I think I'm wrapping up just in time. There are so many
algorithms and variations to study that this area is still very fertile for researchers.
And maybe there needs to be some kind of common simulation platform, something like ns
but for P2P. We are actually thinking about doing something in this area as well, if there
are enough students.
As I said, this resource allocation problem is very interesting compared to the congestion
control we have been studying for a long time. So these are just some thoughts.
Yeah?
>> Question: Has anybody looked at sort of blurring the line between this and peer-to-peer
file sharing, where I can say what it is that I expect to be watching and the application
can proactively fetch it and try to get ahead to avoid stalls? It would seem like one of
the benefits to the network is that it would encourage people to leave these applications
running whenever their machine is idle.
>> Dah Ming Chiu: That's a very interesting question. Yeah. I think you are saying
essentially that maybe you don't need to do streaming at all; you just download the file
through file sharing.
>> Question: Well, it seems like you need the on-demand stuff, but it seems like
there's also the application of people that just want to download the full movie.
>> Dah Ming Chiu: Right.
>> Question: And it seems like by mixing the two you might do better than you can with
either of the approaches separately.
>> Dah Ming Chiu: Yeah. Yeah. Interesting thought. I --
>> Question: The thing is, while the performance differs, I mean, acquiring a movie is
usually a different (inaudible). You want the application to have an idea (inaudible),
basically, I mean, to allow the user to be comfortable using the application. My
observation is it's quite informal in China. Actually the (inaudible) here (inaudible) is
actually peer-to-peer communication. A lot of things are shared (inaudible) -- so the
majority (inaudible) download.
>> Dah Ming Chiu: Uh-huh.
>> Question: And then these peer-to-peer streaming applications (inaudible) -- live
streaming path. (Inaudible) basically streaming (inaudible) opens up something like 420
channels and the viewer can watch each channel. Each of the channels usually has something
like (inaudible). (Inaudible) in terms of the time they are watching this movie.
(Inaudible) actually very light (inaudible) --
>> Dah Ming Chiu: Uh-huh.
>> Question: That's basically the live streaming. More recently we are starting to offer a
video-on-demand service. That's a new service being offered.
>> Question: Right. Yeah, it seems like if you put the multiple services on the same
server you might get some benefits: file sharing would get a lot slower at peak
video-on-demand times (inaudible). The interesting thing is that you basically sort of
balance them.
>> Question: Right.
>> Question: And I think (inaudible) even today isn't working that well. (Inaudible)
actually using the live streaming service (inaudible) simply because, I mean --
>> Question: Well, the things you are trying to do on the machine at the same time.
>> Question: It's basically sucking all the bandwidth (inaudible).
>> Question: Right. But if you're about to go to dinner, you'd be happy to let it
get ahead while you're gone.
>> Question: Yes.
>> Dah Ming Chiu: Any other questions?
>> Question: I want to go back to the PPLive replication strategy.
>> Dah Ming Chiu: Yes.
>> Question: So is there a limitation of the replication strategy particular to the fact
that you're serving movies, which are fairly large content, as opposed to things like
YouTube videos, which might be something like 30 seconds long? Do you know to what degree
it depends on the length of the movie, and also on the fact that you need all the data for
file replication and file (inaudible), whereas for movies you don't (inaudible)? Do you
have any sense of how the (inaudible)?
>> Dah Ming Chiu: Interesting. I think from the user behavior we see, a lot of the users
were actually browsing. So I think at least the browsing part is similar to the short
(inaudible). In contrast to that system, if you want to do something like a YouTube
system, I think the challenge is more that there are so many videos, versus having only a
few hundred (inaudible). So the design, I think, would be quite different. There is a
tracker (inaudible). So maybe you could think about doing something that helps you scale
up or discover different things. That is very important.
>> Jin Li: Any other questions? Thank you. Very, very interesting questions.
(Applause) --