
Running header (leave for the typesetter)
Modeling of Hybrid CDN-P2P for Full HD video downloading
with realistic demand distribution
Salishev S.I., Senior Lecturer, Department of Computer Science,
St. Petersburg State University, sergey.i.salishev@gmail.com
Shein R.E., student, Department of System Programming,
St. Petersburg State University, marso.des@gmail.com
Introduction
With the rising popularity of Full High Definition (HD) video, the problem of
delivering such content to end users online has emerged. The existing
Internet infrastructure is physically unable to stream Full HD video on
demand to a significant number of users even in well-developed areas, not to
mention rural areas and developing countries. The only feasible solution is
therefore content downloading.
Even for content downloading, network bandwidth remains a problem. To
overcome it, a content delivery network (CDN) is usually employed, which
distributes the payload across different nodes and network segments. The
number of films currently exceeds 100K and is growing exponentially over time.
Due to the long-tail property of the demand distribution [2], the number of the
most popular items is growing at the same rate. A large number of popular items
reduces the efficiency of caching and multicasting, as most neighboring users
watch different content. The CDN capacity should therefore grow exponentially
with the demand.
Another major problem is protecting the content from redistribution.
Unsecured digitally downloaded content can be redistributed with as little
effort as one click, as opposed to the long process of ripping a Blu-ray Disc
(BD). The content should therefore be copy-protected. Digital Rights Management
(DRM) is the standard copy-protection technique for downloaded video. On the
other hand, DRM hinders the user experience, as it restricts the ways of using
the content to below the level considered fair use. Also, DRM-protected content
is always decrypted before showing, so it is susceptible to attacks on the
chain of trust, i.e. hacking the playback pipeline after decoding. All known
DRM schemes, including HDCP, have already been broken by hardware or software
means.
Due to the bandwidth and copy-protection problems, there is no Full HD-quality
online video service that could compete in popularity with physical BD sales.
In contrast, physical DVD sales are under pressure from online services
like Netflix, Hulu, and Amazon. Given the multibillion scale of the video
market, developing such a service may be a promising business opportunity.
P2P is similar to a CDN except for two properties. First, peers are not
reliable sources: they can appear and disappear at random, leading to Quality
of Service (QoS) degradation, which is a problem for video streaming scenarios.
Second, peers are not trusted to own non-copy-protected content. The content
copy-protection problem for P2P is yet to be solved.
P2P effectively solves the bandwidth problem, as it scales up with the number
of users, with the cost borne by the users themselves. For the downloading
scenario, constant QoS is a secondary concern, since in case of low QoS a user
can watch the video offline after downloading. The efficiency of P2P is
supported by recent Internet traffic analysis attributing about 30% of total
Internet traffic to P2P [3]. P2P is also future-proof for new content types
such as HDR video, retina-resolution video, and multi-view 3D video.
While DRM can be adapted to P2P networks, it is not well suited to them,
as P2P is based not only on Internet networking but also on social networking
between users, which presumes hardware neutrality and user convenience. The
solution is to embed a digital fingerprint in each copy of the video using
video watermarking and to use traitor tracing to discourage users from
redistribution.
In this paper we consider a Hybrid CDN-P2P system for video downloading.
This usage model differs from the streaming used by the existing Adobe Live
Video P2P system and the system considered by LaFortune et al. [4]. As the user
can watch the content offline, QoS requirements are easier to meet: the system
only needs to guarantee that the download completes in reasonable time,
regardless of the order of blocks, whereas video streaming needs to guarantee
constant throughput for sequential blocks. In contrast to streaming, a
downloading system can provide reasonable service even for low-bandwidth users
and allows better load balancing, as the video is persistently stored on
clients.
The contribution of this paper is as follows:
We analyze the data from the Demonoid torrent tracker [5] and show that it is
better fitted by a Stretched Exponential (SE) distribution (complementary to
Weibull) than by a Zipf distribution, which supports the data of Guo et al. [6]
on demand distribution in commercial VoD systems and for user-generated video
content. This suggests that the SE distribution is universal for video demand,
independent of content source and method of delivery.
We compare the data from Demonoid with IMDB [7] and demonstrate good
correlation between the user demand distribution and the vote distribution
within one year. This allows us to predict the demand distribution from
publicly available popularity statistics, independently of the method of video
delivery.
We implement an agent-based model of a P2P-assisted CDN for video
downloading using the bit-torrent protocol with the Demonoid demand
distribution. Simulation shows a 94% reduction in CDN traffic, which exceeds
the 75% reduction reported by LaFortune et al. [4] for a Hybrid CDN-P2P video
streaming model. The main difference of our model is that we drop all QoS
requirements needed for streaming and only implement the QoS requirements
embedded in the bit-torrent protocol, which guarantee timely download
completion.
Data sets and demand analysis
For this analysis we gathered a snapshot of Demonoid user statistics on
13.10.2011 and a snapshot of IMDB votes on 26.10.2011.
We considered a CDN assisted by a P2P network used for content
downloading. Our hypothesis is that such a network would behave similarly to
existing bit-torrent networks. We aimed to assess the impact of P2P assistance
and of the downloading strategy on CDN data traffic.
Our first goal was to accurately study user behavior. We used
Demonoid statistics to analyze the actual demand in P2P networks (Fig. 1). The
Demonoid data was fitted with a Zipf distribution, common for Web content, and
with a Stretched Exponential distribution. SE captures the demand distribution
better (R² = 0.99) than Zipf (R² = 0.77), with the scale parameter close to
that reported by Guo et al. [6] for commercial VoD systems.
[Figure 1 plot: log-log axes, PDF versus rank. Legend: demonoid;
Zipf (parameter 0.58), R² = 0.7664; SE (parameter 3.1, c = 0.36), R² = 0.9927]
Figure 1. "Demonoid" downloaders in category "movie"
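The comparison of the two fits can be reproduced along the following lines.
This is a minimal sketch, not the fitting procedure used for Fig. 1. It assumes
the rank form of the SE law, count^c = b - a*log(rank), fits both models by
least squares, and compares R² in log space. The synthetic counts, the search
grid for c, and the helper names (r_squared, fit_zipf, fit_se) are illustrative
assumptions.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against data y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_zipf(rank, count):
    """Least-squares fit of log(count) = log(C) - alpha * log(rank)."""
    log_rank, log_count = np.log(rank), np.log(count)
    slope, intercept = np.polyfit(log_rank, log_count, 1)
    return -slope, r_squared(log_count, intercept + slope * log_rank)


def fit_se(rank, count, c_grid=None):
    """Grid search over c; for each c fit count**c = b - a * log(rank)."""
    if c_grid is None:
        c_grid = np.linspace(0.05, 1.0, 96)  # assumed search range, step 0.01
    log_rank, log_count = np.log(rank), np.log(count)
    best = None
    for c in c_grid:
        slope, intercept = np.polyfit(log_rank, count ** c, 1)
        powered = intercept + slope * log_rank
        if np.any(powered <= 0):  # model would predict non-positive demand
            continue
        r2 = r_squared(log_count, np.log(powered) / c)
        if best is None or r2 > best[2]:
            best = (c, -slope, r2)
    return best  # (c, a, R^2 in log space)


# Synthetic demand that follows the SE rank law exactly (illustrative values).
rank = np.arange(1, 10001, dtype=float)
count = (10.0 - 1.0 * np.log(rank)) ** (1.0 / 0.36)

alpha, r2_zipf = fit_zipf(rank, count)
c, a, r2_se = fit_se(rank, count)
print(f"Zipf: alpha={alpha:.2f}, R^2={r2_zipf:.4f}")
print(f"SE:   c={c:.2f}, a={a:.2f}, R^2={r2_se:.4f}")
```

Both R² values are computed on log(count) so that the two models are compared
on the same scale. With real tracker data, rank and count would be replaced by
the ranked downloader counts.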
A tracker statistics snapshot in the category ‘Movies’ by the end of 2011,
containing 33K unique film names with a non-zero number of downloaders,
shows that 13.7% of films serve 86.3% of the users, which agrees with the
statistics of Tan et al. [8]. Note that for niche markets the ratio can
be substantially higher, e.g. 28/72 for “noir” films (Fig. 2).
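A split such as 13.7/86.3 can be computed from per-title downloader counts as
sketched below. The definition used here is an assumption: the point at which
the share of top titles and the share of demand they serve add up to 100%. The
function name pareto_split and the toy catalogue are illustrative only.

```python
import numpy as np


def pareto_split(downloaders):
    """Return (item_share, demand_share) at the point where the top
    item_share of titles accounts for about 1 - item_share of the demand."""
    counts = np.sort(np.asarray(downloaders, dtype=float))[::-1]
    demand_share = np.cumsum(counts) / counts.sum()
    item_share = np.arange(1, counts.size + 1) / counts.size
    # First position at which the two shares add up to at least 100%.
    k = int(np.argmax(item_share + demand_share >= 1.0 - 1e-12))
    return item_share[k], demand_share[k]


# Toy catalogue: 20 popular titles with 4 downloaders each and
# 80 niche titles with 0.25 each (illustrative numbers only).
items, demand = pareto_split([4.0] * 20 + [0.25] * 80)
print(f"{items:.1%} of titles serve {demand:.1%} of the demand")
```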
Figure 2. Pareto number per category
As we are interested in easier and more reliable prediction of user demand,
we investigated the correlation of the number of torrent downloaders with
popularity on ratings sites. We observed that within one year there is a
high correlation (Pearson’s r = 0.99) between the ranked number of votes on
IMDB and the ranked number of torrent downloaders on the overlapping subset of
movies (Fig. 3). This suggests that both values are generated by similar
processes.
Figure 3. "IMDB" vs. "Demonoid" demand on overlapping subset (ranked), 2010 year
The correlation between votes and demand probability for the same film is
much weaker (Fig. 4), with r = 0.84. This means that the vote distribution
predicts only the shape of the demand distribution, not the demand probability
for a specific film. This may be explained by the snapshot nature of the
Demonoid data and by differences in audience between Demonoid and IMDB. The
overall behavior across all years is substantially different, as torrent
popularity decreases exponentially over time, which does not happen on ratings
sites.
Figure 4. "IMDB" vs. "Demonoid" demand on overlapping subset, 2010 year
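The difference between the two correlations can be illustrated as follows.
This is a sketch under the assumption that "ranked" means each series is
sorted by its own rank before Pearson's r is computed, while the per-film
correlation pairs the two values of the same film. The synthetic data and the
function names are illustrative, not the Demonoid or IMDB snapshots.

```python
import numpy as np


def paired_r(votes, downloaders):
    """Pearson's r between the two values of the same film."""
    return float(np.corrcoef(votes, downloaders)[0, 1])


def ranked_r(votes, downloaders):
    """Pearson's r after sorting each series by its own rank."""
    v = np.sort(votes)[::-1]
    d = np.sort(downloaders)[::-1]
    return float(np.corrcoef(v, d)[0, 1])


# Synthetic overlapping subset: demand follows votes up to per-film noise.
rng = np.random.default_rng(0)
votes = rng.lognormal(mean=6.0, sigma=1.5, size=2000)
downloaders = votes * rng.lognormal(mean=-2.0, sigma=0.8, size=2000)

print(f"ranked r = {ranked_r(votes, downloaders):.3f}")
print(f"paired r = {paired_r(votes, downloaders):.3f}")
```

Sorting both series can only increase the sum of products while leaving the
means and variances unchanged, so the ranked r is never lower than the paired
r. This is consistent with votes predicting the shape of the demand
distribution better than the demand for a specific film.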
The usual model for video demand probability is a Zipf distribution,
which is commonly observed in citation ratings and word frequencies in natural
languages. However, it has been noted that the actual video demand distribution
differs [6][9]: it shows a limited fetch for top-rated items and an exponential
cutoff for low-rated items, so it is better fitted by a Stretched Exponential
distribution, which is supported by our data. Currently there is no clear
explanation of the process generating this distribution.
P2P simulations
To investigate the impact of changing the system architecture, we
implemented a behavioral simulator of a P2P network based on the ROSS
framework [10]. We used a two-layer network model consisting of a star-like
backbone and local area networks (LANs), with the CDN modeled as a super-peer
(Fig. 5). We implemented a simulation of the bit-torrent protocol for
communication between nodes. As we considered only the downloading usage
model, we implemented none of the QoS requirements needed for streaming, only
the QoS requirements of the bit-torrent protocol, which guarantee download
completion in reasonable time. In a simulation with 10K peers we observed a
94% reduction in CDN traffic. This is higher than the 75% reduction
reported by LaFortune et al. [4] for a model of Hybrid CDN-P2P video
streaming and is on par with the numbers for large-scale P2P software
distribution [11]. We achieve a better result due to the weaker QoS
requirements and the larger swarm of uploaders in the downloading scenario.
The result suggests that a P2P-assisted CDN can be profitable even when
operating in niche markets, i.e. without top-rated movies.
Figure 5. Star network topology
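For intuition about where the traffic reduction comes from, the following toy
model gives an idealized upper bound. It is not the ROSS-based simulator: it
ignores the bit-torrent protocol, bandwidth, topology, and peer churn, and
assumes that a film is fetched from the CDN only by the first peer that
requests it and from other peers afterwards. The demand law, its parameters,
the catalogue size, and the function names are illustrative assumptions.

```python
import numpy as np


def cdn_traffic_reduction(requests):
    """Share of downloads served by peers if every film leaves the CDN once."""
    seeded = set()
    cdn_downloads = 0
    for film in requests:
        if film not in seeded:  # nobody holds the film yet: the CDN serves it
            seeded.add(film)
            cdn_downloads += 1
    return 1.0 - cdn_downloads / len(requests)


def sample_requests(n_requests, n_films, rng):
    """Draw film ids from a rank-based demand law (assumed SE-like shape)."""
    rank = np.arange(1, n_films + 1, dtype=float)
    weight = (10.0 - np.log(rank)) ** (1.0 / 0.36)  # illustrative parameters
    return rng.choice(n_films, size=n_requests, p=weight / weight.sum())


rng = np.random.default_rng(0)
requests = sample_requests(n_requests=10_000, n_films=1_000, rng=rng)
print(f"idealized CDN traffic reduction: {cdn_traffic_reduction(requests):.1%}")
```

In this model the reduction depends only on the ratio of distinct requested
films to total requests, so it grows with the swarm size per film. A
protocol-level simulation is needed to account for peers leaving and for
limited upload bandwidth.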
Larger-scale simulations based on the GPS P2P simulator [13] were performed to
eliminate potential mistakes in the bit-torrent stack implementation. For these
simulations we considered a similar star-topology network model with 200 LANs
of 512 peers each, with the CDN and the tracker as super-peers. Due to
performance issues in the simulator's core, only a brief interval of time was
feasible for simulation: after approximately 45-48 seconds of simulated time
the simulator's core practically stalls because the event queue grows too
large. The data obtained show that, upon stabilizing, the CDN traffic
reduction is 94% or more.
Integrated copy-protection
The bit-torrent protocol does not provide copy protection. Copy
protection is a critical requirement of content owners for any practical
implementation of a video distribution system. To address this problem, we
consider a combined solution that includes a centralized billing system
(Fig. 6).
[Figure 6 diagram. Labels: P2P Network, Peer, CDN, Authority, Common Stream,
Private Stream]
Figure 6. System with copy-protection layer
The content is divided into a public and a private layer. The public layer is
distributed through the P2P-assisted CDN, while the private layer is
distributed through a centralized system and is unique per user. The public
layer is “encrypted”, making the content useless to the user without the
private layer. This can be achieved by modifying the semantic elements of the
video stream so that it either cannot be decoded or decodes with severe
artifacts. After downloading, both layers are jointly “decrypted” to provide a
usable copy. The private layer can be as small as 1% of the data, which does
not substantially hinder system performance. This approach is compatible with
both DRM and fingerprinting copy-protection schemes.
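The two-layer idea can be sketched at the byte level as follows. This is only
an illustration of the size argument and of the joint reconstruction. A real
scheme would operate on semantic elements of the video stream, and would make
the private layer unique per user, neither of which is modeled here. The
function names and the stride of 100 bytes (about 1% of the data) are
assumptions.

```python
def split_layers(data: bytes, stride: int = 100):
    """Move every stride-th byte into a private layer and blank it publicly."""
    private = data[::stride]
    public = bytearray(data)
    public[::stride] = bytes(len(private))  # zero out the removed bytes
    return bytes(public), private


def merge_layers(public: bytes, private: bytes, stride: int = 100) -> bytes:
    """Restore the original stream from the public and private layers."""
    restored = bytearray(public)
    restored[::stride] = private
    return bytes(restored)


video = bytes(range(1, 256)) * 40  # stand-in for an encoded video stream
public, private = split_layers(video)
assert merge_layers(public, private) == video
print(f"private layer: {len(private) / len(video):.2%} of the data")
```

The public layer has the same size as the original and can be shared freely
among peers, while only the small private layer has to come from the central
authority.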
Conclusions
The correlation between Demonoid demand and IMDB votes allows us to predict
user demand from publicly available statistics. We use the Demonoid demand
probability distribution to simulate a Hybrid CDN-P2P video distribution system
for downloading. Simulations with different tools show a 94% traffic reduction
for the CDN. This demonstrates that a Hybrid CDN-P2P system for video
downloading can effectively solve the bandwidth shortage for Full HD video
distribution. Paired with the copy protection suggested above and some
motivation for peers to keep and upload stored video to the system, the
approach can dramatically improve speed and reduce maintenance costs of video
distribution services. Most likely, the model can be adapted to other
categories of bulk downloads beyond video, such as video games.
References
[1] C. Anderson, “The Long Tail: Why the Future of Business is Selling Less of
More,” Hyperion, 2006.
[2] S. Goel, A. Broder, E. Gabrilovich, B. Pang, “Anatomy of the long tail:
ordinary people with extraordinary tastes,” In Proceedings of the third ACM
international conference on Web search and data mining (WSDM '10). ACM, New
York, NY, USA, 2010, pp.201-210.
[3] C. Labovitz, “The Other 50% of the Internet,” 54th North America Network
Operators’ Group Meeting (NANOG54), February 2012.
[4] R. LaFortune, C.D. Carothers, W.D. Smith, J. Czechowski, W. Xi,
“Simulating Large-Scale P2P Assisted Video Streaming,” 42nd Hawaii
International Conference on System Sciences (HICSS '09), Jan. 2009.
[5] “Demonoid torrent tracker,” http://demonoid.me
[6] L. Guo, E. Tan, S. Chen, Z. Xiao, X. Zhang, “Does internet media traffic
really follow Zipf-like distribution?” In Proceedings of the 2007 ACM
SIGMETRICS international conference on Measurement and modeling of computer
systems (SIGMETRICS '07). ACM, New York, NY, USA, 2007, pp.359-360.
[7] “International Movie Database,” http://imdb.com
[8] T. F. Tan and S. Netessine, “Is Tom Cruise threatened? Using Netflix Prize
data to examine the long tail of electronic commerce,” Working Paper, 2009.
[9] M. Cha, H. Kwak, P. Rodriguez, Y. Ahn, S. Moon, “Analyzing the video
popularity characteristics of large-scale user generated content systems,”
IEEE/ACM Trans. Netw. 17, 5, October 2009, pp.1357-1370.
[10] C. D. Carothers, D. Bauer, S. Pearce, “ROSS: A high-performance,
low-memory, modular Time Warp system,” Journal of parallel and distributed
computing (Elsevier) 62 (11), 2002, pp.1648–1669.
[11] C. Huang, A. Wang, J. Li, K.W. Ross, “Understanding hybrid CDN-P2P: why
limelight needs its own Red Swoosh,” In Proceedings of the 18th International
Workshop on Network and Operating Systems Support for Digital Audio and Video
(NOSSDAV '08). ACM, New York, NY, USA, 2008, pp.75-80.
[12] Z. Dekun, N. Prigent, J. Bloom, “Compressed video stream watermarking for
peer-to-peer based content distribution network,” IEEE International
Conference on Multimedia and Expo (ICME 2009), June 28, 2009, pp.1390-1393.
[13] GPS peer-to-peer simulator, http://www.cs.binghamton.edu/~nael/gps/