P2P Streaming
Amit Lichtenberg

Based on:
- Hui Zhang, "Opportunities and Challenges of Peer-to-Peer Internet Video Broadcast", http://esm.cs.cmu.edu/technology/papers/Hui-P2PBroadcastr.pdf
- Yong Liu, Yang Guo and Chao Liang, "A survey on peer-to-peer video streaming systems", Peer-to-Peer Netw Appl (2008) 1:18-28, http://www.springerlink.com/index/C62114G6G4863T32.pdf

What is Video Streaming?
Source video (TV, sports, concerts, distance education) is available at some server.
Users wish to watch the video:
◦ In real time (?)
◦ Simultaneously (?)
◦ At high quality
◦ Over the internet (for free)
Many users:
◦ AOL: Live 8 concert (2005) – 175,000 viewers
◦ CBS: NCAA tournament (2006) – 268,000 viewers

What is Video Streaming? (cont)
Problem:
◦ Minimum video bandwidth: 400 Kbps
◦ X viewers
◦ Required server/network bandwidth: 400X Kbps
◦ CBS/NCAA (~270K viewers) needs ~100 Gbps of server/network bandwidth
◦ Infeasible!

Agenda
Approaches for live video streaming.
P2P live video broadcast:
◦ Differences from other P2P applications
◦ Overlay design issues and approaches: tree based, data driven / mesh based
◦ Challenges and open issues
◦ Deployment status
As time permits: P2P VOD streaming
◦ Similar, yet different

Internet Multicast Architectures
Terminology:
◦ Broadcast: send data to everyone
◦ Multicast: send data to a subgroup of users
◦ We will use both interchangeably, although we generally mean the latter.
Where to implement?
◦ Network layer (IP)? Efficient, but costly.
◦ Application layer? Cheap, but how efficient?

IP Multicast
Implemented at the network layer (IP).
Introduces open, dynamic IP groups.
Easiest at the network level (it knows the network topology best).
Problems:
◦ Hurts the routers' "stateless" architecture
◦ Scaling, complexity
◦ Only best effort – unclear how to use it from a TCP point of view
◦ Requires replacing infrastructure – no incentives

Non-Router Based Architectures
Move the multicast functionality to end systems (application level), even though this increases bandwidth demands significantly.
◦ Penalties: packet duplication, latency
◦ Objective: keep the penalties to a minimum
Infrastructure-centric architecture:
◦ CDNs (Content Distribution Networks)
P2P architectures.

P2P Streaming
How do you feel about a low upload cap and a high download cap?
Motivation: each new user adds download demand, but also contributes some upload bandwidth – scalability.
However, users come and go at will.
◦ Key: function, scale and self-organize in a highly dynamic environment.

P2P Streaming vs. Other P2P Applications
Single, dedicated, known source
◦ e.g. NBC
Large scale
◦ Tens, sometimes hundreds of thousands of simultaneous viewers
Performance-demanding
◦ Hundreds of Kbps and more
Real time
◦ Timely and continuous streaming
Quality
◦ Can be adjusted according to demand and bandwidth resources
◦ Nevertheless, a minimal (high) quality is a must

Design Issues
Organize the users into an overlay for disseminating the video stream:
◦ Efficiency: high bandwidth, low latencies; a startup delay of a few seconds can be tolerated
◦ Scalability, load balancing
◦ Self-organization
◦ Per-node bandwidth constraints: quality of the received video vs. the amount of bandwidth required
◦ System considerations (later on…)

Approaches for Overlay Construction
Many approaches, basically divided into two classes:
◦ Tree based
◦ Data driven / mesh based

Tree Based Overlay
Organize peers into a tight, predefined structure (usually a tree).
Packets are sent down the tree from parent to children.
Push based – always send (push) received data to descendants (see the sketch below).
Optimize the structure.
Take care of nodes joining and leaving.
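To make the push-based idea concrete, here is a minimal Python sketch of data being pushed down a tree overlay. It is an illustration only, not ESM's actual protocol; the names (TreeNode, add_child, receive) are invented for this example.

    # Minimal sketch of push-based dissemination in a tree overlay.
    # Names are illustrative, not taken from any real system.

    class TreeNode:
        def __init__(self, name):
            self.name = name
            self.children = []   # peers that chose this node as their parent

        def add_child(self, child):
            self.children.append(child)

        def receive(self, segment):
            # "Play" the segment locally, then push it to every descendant by
            # forwarding it to each child, which recurses further down the tree.
            print(f"{self.name} got segment {segment}")
            for child in self.children:
                child.receive(segment)

    # Example tree: source -> {A, B}, A -> {C}
    source, a, b, c = TreeNode("source"), TreeNode("A"), TreeNode("B"), TreeNode("C")
    source.add_child(a); source.add_child(b); a.add_child(c)
    source.receive(1)   # segment 1 reaches every peer through its parent

The point of the sketch: a node never asks for data; whatever arrives from the parent is immediately forwarded, which is why the structure itself must be optimized and repaired when nodes join or leave.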
Example Tree-Based Approach: ESM
End System Multicast (*)
Construct a tree rooted at the source.
Packets are transmitted down the tree from parent to children.
Self-organizing, performance-aware, distributed.
(*) Y.-H. Chu, S. G. Rao, and H. Zhang, "A case for end system multicast", in Proc. ACM SIGMETRICS'00, June 2000.

ESM: Group Management
Each node maintains info about a small random subset of members, plus info about the path from the source to itself.
New node:
◦ Contacts the server and asks for a random list of members
◦ Selects one of them as its parent (how?)
Gossip-like protocol to learn more:
◦ Node A picks a random node B and shares its info with it.
◦ Timestamps

ESM: Membership Dynamics
When a node leaves:
◦ Graceful leaving: continue forwarding data for a short period to allow adjustment
◦ Door slamming: detected via timestamps
◦ Members send control packets to indicate liveness

ESM: Adaptation
Maintain throughput info over a time frame.
If throughput << source rate, select a new parent.
DT: Detection Time
◦ How long should a node wait before abandoning its parent?
◦ Default DT = 5 seconds
◦ Affected by TCP/TFRC congestion control (slow start)
◦ May need to be adjusted at runtime

ESM: Parent Selection
Node A joins the broadcast / replaces its parent.
A chooses and probes a random subset of known members.
◦ Prefer unexamined / low-delay members
Node B answers with:
◦ Its current performance (throughput)
◦ Whether it is degree-saturated or not
◦ Whether it is a descendant of A or not
The probe also lets A measure the RTT from A to B.

ESM: Parent Selection (cont)
Wait timeout: ~1 second.
Eliminate saturated and descendant nodes.
◦ Static degree per node
Evaluate the performance (throughput, delay) of each candidate B.
Switch to B if the potential throughput is better, or the same but with smaller delay.
◦ Helps cluster close nodes

ESM Tree – example (figures)

Tree Based Overlay – Problems
A node that leaves disrupts delivery to all of its descendants.
The majority (~n/2) of the nodes are leaves, and their outgoing bandwidth is not utilized.
Introduce: multi-trees.

Multi-Trees
At the source:
◦ Encode the stream into sub-streams
◦ Distribute each sub-stream along a different tree
At the receiver:
◦ Quality ≈ number of received sub-streams

Multi-Tree: Example
S – source rate.
Server: distributes a stream of rate S/2 over each tree.
Each peer: receives S/2 from each tree.
Peer 0: donates its upload bandwidth in the left subtree, and is a leaf in the right one.

Multi-Tree: Pros and Cons
Advantages:
◦ A node is not completely disrupted by the failure of an ancestor
◦ The potential bandwidth of all nodes can be utilized, as long as each node is an interior (non-leaf) node in at least one subtree
Disadvantages:
◦ Requires special encoding/decoding at the server/user
◦ The code must allow degraded quality when only a subset of the trees is received
◦ Currently uncommon

Data Driven / Mesh Based Overlay
Do not construct and maintain an explicit structure for data delivery.
Use the availability of data to guide the data flow.
◦ Similar to BitTorrent techniques
Nodes form "meshes" of partners in which they exchange data.

Data Driven Overlay: Naïve Approach
Use a "gossip algorithm":
◦ Each node selects a random group of members and sends all the data to them.
◦ The other nodes continue in the same way.
Push based: push all available data onwards.
Problems:
◦ Redundant data
◦ Startup and transmission delay

Data Driven Overlay: Pull Based
Nodes maintain a set of partners and periodically exchange data availability info.
A node may then retrieve ("pull") missing data from its partners (a sketch follows below).
Crucial difference from BitTorrent:
◦ Segments must be obtained before their playback deadline!
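Below is a minimal, hypothetical sketch of the pull-based idea: partners advertise which segments they hold, and a node decides which missing segment to request from which partner within its window. This is not the actual algorithm of any particular system (CoolStreaming, described next, uses its own heuristics); the function name, the data layout and the partner-selection rule are all assumptions made for illustration.

    # Sketch of pull-based exchange: peers advertise availability and a node
    # pulls missing segments, starting from the one closest to playback.
    # Names and the selection rule are illustrative only.

    def pick_requests(my_segments, partner_maps, playback_pos, window=120):
        """Decide which segment to request from which partner.

        my_segments:  set of segment numbers this node already holds
        partner_maps: dict partner -> set of segment numbers that partner holds
        playback_pos: next segment the player needs (its deadline is closest)
        """
        requests = {}
        for seg in range(playback_pos, playback_pos + window):
            if seg in my_segments:
                continue                      # already have it
            holders = [p for p, bm in partner_maps.items() if seg in bm]
            if holders:
                # naive choice: the least-loaded partner; real schedulers also
                # weigh partner bandwidth and how close each deadline is
                partner = min(holders, key=lambda p: len(requests.get(p, [])))
                requests.setdefault(partner, []).append(seg)
        return requests

    reqs = pick_requests({10, 11}, {"A": {12, 13}, "B": {12, 14}}, playback_pos=10)
    print(reqs)   # {'A': [12, 13], 'B': [14]}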
Example Data-Driven Approach: CoolStreaming (*)
A CoolStreaming node has three components:
◦ Membership Manager: helps maintain a partial view of the other nodes
◦ Partnership Manager: establishes and maintains partnerships with other nodes
◦ Scheduler: schedules the transmission of video data
(*) X. Zhang, J. Liu, B. Li and T.-S. P. Yum, "DONet/CoolStreaming: A data-driven overlay network for live media streaming", in Proc. INFOCOM'05, Miami, FL, USA, March 2005.

CoolStreaming: Group and Partner Management
New node:
◦ Contacts the server to obtain a set of partner candidates.
Each node maintains a partial subset of the other participants.
Employs SCAM
◦ Scalable Gossip Membership protocol
◦ Scalable, lightweight, uniform partial view

CoolStreaming: Overlay
No formal structure!
The video stream is divided into segments.
Nodes hold Buffer Maps (BM)
◦ Availability of each segment
Exchange BMs with partners.
Pull segments from partners as needed.

CoolStreaming: Scheduling
Goal: timely and continuous segment delivery
◦ As opposed to file download!
Uses a sliding-window technique:
◦ 1-second segments
◦ 120 segments per window
◦ BM = bit string of 120 bits
◦ Segment sequence numbers

CoolStreaming vs. BitTorrent: Buffer Snapshots (figure)
◦ BitTorrent: the buffer map covers the whole file
◦ CoolStreaming: a sliding window around the playback point

CoolStreaming: Scheduling (cont)
Need a scheduling algorithm to decide which segment to fetch from which partner.
◦ Simple round-robin?
Two constraints:
◦ Playback deadline: if it cannot be met, at least keep the number of missing segments at a minimum
◦ Heterogeneous streaming bandwidth of the partners
A variation of the parallel machine scheduling problem
◦ NP-hard
In reality, simple heuristics are used.

CoolStreaming: Failure Recovery and Partnership Refinement
Departure (graceful or crash):
◦ Detected by idle time
◦ Recovery: re-schedule the missing segments using the BM info
Periodically establish new partnerships with randomly selected local members:
◦ Helps maintain a stable number of partners
◦ Explores partners of better quality

CoolStreaming – Buffer Map example (figure)

Technical Challenges and Open Issues
Tree based vs. data driven
◦ Neither completely overcomes the challenges of the dynamic environment.
◦ Data driven: latency-overhead tradeoff.
◦ Tree based: inherent instability, bandwidth underutilization.
◦ Can there be an efficient hybrid?
   Chunkyspread: split the stream into segments and send them over separate, not necessarily disjoint trees, on top of a neighboring graph – simpler tree construction and maintenance, plus efficiency.
   A more explicit "treebone" approach: use stable nodes to construct the tree backbone.

Technical Challenges and Open Issues
Incentives and fairness
◦ Users' cooperation: a small set of nodes is required to donate 10-35 times more upload bandwidth, while a node can avoid contributing by always acting as a leaf in the tree construction.
◦ Need an incentive mechanism
   Challenging, due to the real-time requirement.
   Popularity (as in BitTorrent) is not enough: a session lasts only a short period of time, and a slow download rate means the user will simply leave.
   Tit-for-tat is ineffective: users are all after the same portion of the data.

Technical Challenges and Open Issues
Bandwidth-scarce regimes
◦ Basic requirement: ∑(upload bw) ≥ ∑(received bw)
◦ Nodes behind DSL/cable can receive a lot, but cannot donate much.
◦ Introduce the Resource Index (RI), defined for a particular source rate (a small illustration follows below):
   RI < 1: not all participants can receive the full source rate
   RI = 1: the system is saturated
   RI > 1: fewer constraints, better overlay trees can be constructed
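As a small illustration of the Resource Index, read it as the ratio between total upload supply (peers plus server) and the bandwidth needed to give every receiver the full source rate, which matches the ∑(upload bw) ≥ ∑(received bw) requirement above. The numbers below are invented for the example.

    # Illustrative Resource Index (RI) calculation: total upload supply divided
    # by total demand (source rate times number of receivers). Made-up numbers.

    def resource_index(peer_upload_kbps, server_upload_kbps, source_rate_kbps):
        supply = sum(peer_upload_kbps) + server_upload_kbps
        demand = source_rate_kbps * len(peer_upload_kbps)
        return supply / demand

    # Three DSL-like peers that can each upload only 128 Kbps, source rate 400 Kbps:
    print(resource_index([128, 128, 128], server_upload_kbps=400, source_rate_kbps=400))
    # ~0.65 -> RI < 1: not everyone can receive the full source rate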
Technical Challenges and Open Issues
Peer dynamics and flash crowds
◦ Flash crowd: a large increase in the number of users joining the broadcast in a short period of time.
◦ Opposite situation: a massive number of users leaving the broadcast.
◦ Need to adjust quickly: otherwise users will leave in the first case, or lose quality in the second.

Technical Challenges and Open Issues
Heterogeneous receivers
◦ Download bandwidth differs: dial-up vs. DSL
◦ Supporting a single bandwidth isn't enough.
◦ ESM: the video is encoded at multiple bitrates and sent in parallel – the video quality is adjusted to each receiver's capabilities.
◦ Requires new coding techniques: layered coding, multiple description coding.
   Not yet deployed in mainstream media players.

Technical Challenges and Open Issues
Implementation issues
◦ NATs, firewalls
◦ Transport protocol
◦ Startup delay and buffer interaction
   Separate streaming engine and playback engine

P2P Streaming – Deployment Status
CoolStreaming:
◦ More than 50,000 concurrent users, more than 25,000 in one channel at peak times.
◦ In the table (not shown): from 20:26 Mar 10 to 2:14 Mar 14 (GMT), ~78 hours in total.

P2P Streaming – Deployment Status (cont)
Most users reported a satisfactory viewing experience.
The current internet has enough available bandwidth to support TV-quality streaming (300-450 Kbps).
Higher user participation -> even better statistical results!

P2P VOD Streaming
Watch
◦ any point of the video
◦ at any time
More flexibility for the user
◦ "Watch whatever you want, whenever you want!"
Key to IPTV.

P2P VOD
A large number of users may be watching the same video, asynchronously with each other.
Tree based overlay: synchronous!
◦ Peers receive the content in the order sent by the server.
Mesh based:
◦ Segments arrive in random order – but must arrive before their playback deadline
◦ High throughput

Tree Based VOD
Group users into sessions based on their arrival times.
Define a threshold T
◦ Users that arrive close in time, within T of each other, constitute a session.
Together with the server, they form a multicast tree
◦ The "base tree"
The server streams the video over the base tree
◦ As in P2P live streaming.

Tree Based VOD (cont)
A new client joins a session (or forms a new session):
◦ It joins the base tree and retrieves the base stream from it.
◦ It obtains a patch – the initial portion of the video that it has missed.
   Available at the server or at other users who have already cached it.
◦ Users provide two services:
   Base stream forwarding
   Patch serving
◦ Users are synchronous within the base tree; the asynchrony requirement is addressed via the patches.

Tree Based VOD: example (figure)

Cache and Relay
Each peer buffers a sliding window of video content around its current playback point.
It serves other users whose viewing point is within this moving window by forwarding the stream.
This forms a tree ("cluster") of peers with different playback points.

Cache and Relay: example (10-minute cache, figures)

Tree Problems
Same as in P2P live streaming:
◦ Construct the tree
◦ Maintain it
◦ Handle peer churn
In cache-and-relay:
◦ An additional constraint on adequate parents (see the sketch below)
◦ A directory service to locate parents
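To close, a hypothetical sketch of the adequate-parent constraint in cache-and-relay: a peer can act as a parent for a newcomer only if the newcomer's requested playback point still falls inside the window the peer currently caches. The directory layout, the 10-minute window and all names are assumptions made for illustration, not part of a specific system.

    # Illustrative cache-and-relay parent lookup: a newcomer at playback point t
    # needs a parent whose cached sliding window still contains t.

    CACHE_MINUTES = 10   # assumed: each peer caches the last 10 minutes it played

    def find_parent(directory, t):
        """directory: dict peer -> current playback point (in minutes).
        A peer caches (playback_point - CACHE_MINUTES, playback_point], so it is
        an adequate parent iff it is ahead of t by at most CACHE_MINUTES."""
        candidates = [p for p, pos in directory.items()
                      if 0 <= pos - t <= CACHE_MINUTES]
        # naive choice: pick the parent closest ahead of t
        return min(candidates, key=lambda p: directory[p] - t) if candidates else None

    peers = {"A": 35, "B": 42, "C": 90}   # current playback points (minutes)
    print(find_parent(peers, t=33))       # 'A' (B is 9 min ahead, also adequate; C is too far)
    print(find_parent(peers, t=20))       # None: no peer caches this point, fall back to the server

This is exactly where the directory service comes in: without it, a newcomer has no way to discover which peers currently hold its playback point in their cache.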