Challenges, Design and Analysis of a Large-scale P2P-VoD System
Yan Huang, Tom Z. J. Fu, Dah-Ming Chiu, John C. S. Lui and Cheng Huang
2008. 10. 6. SeungHo Lee

Outline
P2P overview
An architecture of a P2P-VoD system
Performance metrics
Measurement results and analysis
Future works

P2P Overview
Advantages of P2P
• Users help each other so that the server load is significantly reduced.
• P2P increases robustness in case of failures by replicating data over multiple peers.
P2P services
• P2P file downloading : BitTorrent and eMule
• P2P live streaming : Coolstreaming and PPLive
• P2P video-on-demand (P2P-VoD) : PPLive
– Like P2P streaming systems, P2P-VoD systems also deliver the content by streaming, but peers can watch different parts of a video at the same time.
– P2P-VoD systems require each user to contribute a small amount of storage (usually 1GB) instead of only the playback buffer in memory as in the P2P streaming systems.

[Ref] P2P Protocols and Applications
Network or Protocol | Use | Applications
BitTorrent | File sharing / Software distribution / Media distribution | ABC, AllPeers, Vuze (formerly Azureus), BitComet, BitLord, BitTornado, BitTorrent, Burst!, Deluge, FlashGet, G3 Torrent, Halite, KTorrent, LimeWire, MLDonkey, Opera, Panthera, QTorrent, rTorrent, Shareaza, TorrentFlux, Transmission, Tribler, µTorrent, Thunder
eDonkey | File sharing | aMule, eDonkey2000 (discontinued), eMule, eMule Plus, FlashGet, iMesh, Jubster, lMule, MLDonkey, Morpheus, Panthera, Pruna, Shareaza, xMule
Gnutella | File sharing | Acquisition, BearShare, Cabos, FilesWire, FrostWire, Gnucleus, Grokster, gtk-gnutella, iMesh, Kiwi Alpha, LimeWire, MLDonkey, Morpheus, MP3 Rocket, Panthera, Poisoned, Shareaza, Swapper, XoloX
Napster | File sharing | Napigator, Napster
P2PTV | Video stream / File sharing | TVUPlayer, Joost, CoolStreaming, Cybersky-TV, TVants, PPLive, LiveStation

P2P-VoD system
Major components
• Peers
• Servers : the source of content
• Trackers : help peers connect to other peers to share
the same content
• A bootstrap server : helps peers find a suitable tracker and performs other bootstrapping functions
• Other servers
– log servers : log significant events for data measurement
– transit servers : help peers behind NAT boxes

Segment sizes
How to divide a video into multiple pieces
• A small segment size gives more flexibility to schedule which piece should be uploaded from which neighboring peer.
• The larger the segment size, the smaller the overhead.
– Header overhead
– Bitmap overhead
– Protocol overhead
• The video player expects a certain minimum size for a piece of content to be viewable.
Segmentation of a movie in PPLive’s VoD system

Replication Strategy
Goal
• To make the chunks as available to the user population as possible, so as to meet users’ viewing demand without incurring excessive additional overhead
Considerations
• Whether to allow multiple movies to be cached
– Multiple movie cache (MVC) / single movie cache (SVC)
• Whether to pre-fetch or not
• Which chunk/movie to remove when the disk cache is full
– Least recently used (LRU) / least frequently used (LFU)

Content Discovery
Content advertising and look-up methods
• Trackers
– Used to keep track of which peers replicate a given movie
– As soon as a user starts watching a movie, the peer informs its tracker that it is replicating that movie.
– When a peer wants to start watching a movie, it goes to the tracker to find out which other peers have that movie.
• Gossip method
– Peers discover where chunks are located by the gossip method.
– This cuts down on the reliance on the tracker, and makes the system more robust.
• DHT
– Used to automatically assign movies to trackers to achieve some level of load balancing.

Piece Selection
Which piece to download first
• Sequential : select the piece that is closest to what is needed for the video playback
• Rarest first : selecting the rarest piece helps speed up the spread of pieces, and hence indirectly helps streaming quality.
• Anchor-based : when a user tries to jump to a particular location in the movie and the piece for that location is missing, the closest anchor point is used instead.

Transmission Strategy
Goals
• Maximize the downloading rate
• Minimize the overheads
Strategies (by levels of aggressiveness)
• A peer can send a request for the same content to multiple neighbors simultaneously.
• A peer can request different content from multiple neighbors simultaneously (PPLive’s choice).
– For a playback rate of 500Kbps, 8-20 neighbors works best. More than this number can still improve the achieved rate, but at the expense of a heavy duplication rate.
• A peer can work with one neighbor at a time.

Other Design Issues
NAT and firewalls
• Discovering different types of NAT boxes
• Pacing the upload rate and request rate
Content authentication
• Chunk level authentication
– Some pieces may be polluted and cause a poor viewing experience locally at a peer.
– If a peer detects that a chunk is bad, it discards the chunk.
• Piece level authentication

What to measure
User behavior
• includes the user arrival patterns and how long users stay watching a movie
• used to improve the design of the replication strategy
External performance metrics
• includes user satisfaction and server load
• used to measure the system performance perceived externally
Health of replication
• measures how well a P2P-VoD system is replicating content
• used to infer how well an important component of the system is doing

User Behavior
MVR (movie viewing record)

User Satisfaction
Simple fluency
• measures the fraction of time a user spends watching a movie out of the total time the user spends waiting for and watching that movie
R(m, i) : the set of all MVRs for a given movie m and user i
n(m, i) : the number of MVRs in R(m, i)
r : one of the MVRs in R(m, i)

User Satisfaction (cont’)
User satisfaction index
• considers the quality of the delivery of the content
r(Q) : a grade for the average viewing quality for an MVR r

Health of
Replication
Three levels
• Movie level
– The number of active peers who have advertised storing chunks of that movie
– The information that the tracker collects about movies
• Weighted movie level
– Considers the fraction of chunks a peer has in computing the index
• Chunk bitmap level
– The number of copies of each chunk of a movie stored by peers
– Various other statistics can be computed: the average number of copies of a chunk in a movie, the minimum number of chunks, the variance of the number of chunks.

Statistics on video objects
Overall statistics of the three typical movies

Statistics on user behavior (1)
Interarrival time distribution of viewers

Statistics on user behavior (2)
View duration distribution

Statistics on user behavior (3)
Start position distribution

Health index of Movies (1)
Number of peers that own the movie

Health index of Movies (2)
Average owning ratios for different chunks

Health index of Movies (3)
Chunk availability and chunk demand

Health index of Movies (4)
The available to demand ratios

User Satisfaction Index (1)
Generating fluency index
• The computation of F(m, i) is carried out by the client software.
• The client software reports all MVRs and the fluency F(m, i) to the log server whenever a “stop-watching” event occurs.
– The STOP button is pressed
– Another movie/programme is selected
– The user turns off the P2P-VoD software

User Satisfaction Index (2)
The number of fluency records
• A good indicator of the number of viewers of the movie

User Satisfaction Index (3)
The distribution of fluency index

Future works
Further research in P2P-VoD systems
• How to design a highly scalable P2P-VoD system to support millions of simultaneous users
• How to perform dynamic movie replication, replacement, and scheduling so as to reduce the workload at the content servers
• How to quantify various replication strategies so as to guarantee a high health index
• How to select proper chunk and piece transmission strategies so as to improve the viewing quality
• How to accurately measure and quantify the user satisfaction level
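
Example: simple fluency
The simple fluency measure described under User Satisfaction (time spent watching divided by time spent waiting plus watching) can be sketched in a few lines. This is only an illustration under assumptions: the `MVR` class and its fields `watch_seconds` and `wait_seconds` are hypothetical and are not taken from the PPLive client, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MVR:
    """Hypothetical movie viewing record for one (movie, user) pair."""
    watch_seconds: float  # time spent actually watching (assumed field)
    wait_seconds: float   # time spent waiting, e.g. buffering (assumed field)


def fluency(records: Iterable[MVR]) -> float:
    """Fraction of time spent watching out of total time waiting plus watching."""
    records = list(records)
    watching = sum(r.watch_seconds for r in records)
    total = watching + sum(r.wait_seconds for r in records)
    # With no recorded time the ratio is undefined; 0.0 is returned by convention here.
    return watching / total if total > 0 else 0.0


# R(m, i): all MVRs of one user i for one movie m (made-up numbers).
r_mi = [
    MVR(watch_seconds=540, wait_seconds=60),
    MVR(watch_seconds=360, wait_seconds=40),
]
print(fluency(r_mi))
```

Summing over all records before dividing weights long viewing sessions more heavily than short ones; averaging per-record ratios would be a different measure.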
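
Example: chunk bitmap level health
The chunk bitmap level of the health index counts how many peers store each chunk of a movie. The sketch below is an illustration under assumptions: each peer's bitmap is modelled as a list of 0/1 flags with one entry per chunk, the summary statistics are read as the mean, minimum, and variance of the per-chunk copy counts, and the function name `chunk_health` is hypothetical.

```python
from statistics import mean, pvariance
from typing import Dict, List


def chunk_health(bitmaps: List[List[int]]) -> Dict[str, object]:
    """Summarise how many peers store each chunk of one movie.

    bitmaps[p][c] is 1 if peer p stores chunk c, else 0 (assumed layout).
    """
    if not bitmaps or not bitmaps[0]:
        raise ValueError("need at least one peer bitmap with at least one chunk")
    n_chunks = len(bitmaps[0])
    if any(len(b) != n_chunks for b in bitmaps):
        raise ValueError("all bitmaps must cover the same number of chunks")
    # Column sums: number of copies of each chunk across all peers.
    copies = [sum(b[c] for b in bitmaps) for c in range(n_chunks)]
    return {
        "copies_per_chunk": copies,
        "average": mean(copies),
        "minimum": min(copies),
        "variance": pvariance(copies),  # population variance over chunks
    }


# Three peers, four chunks (made-up data).
peers = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
print(chunk_health(peers))
```

A minimum of zero flags a chunk that no peer currently stores, so any request for it would presumably have to be served by the content server.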