TDDD36: Peer-to-peer

Client-server architecture
server:
❍ always-on host
❍ permanent IP address
❍ server farms for scaling
clients:
❍ communicate with server
❍ may be intermittently connected
❍ may have dynamic IP addresses
❍ do not communicate directly with each other

Pure P2P architecture
❒ no always-on server
❒ arbitrary end systems directly communicate
❒ peers are intermittently connected and change IP addresses
❒ Topics:
❍ File sharing
❍ File distribution
❍ Searching for information
❍ Case studies: BitTorrent and Skype

Hybrid of client-server and P2P
Skype
❍ voice-over-IP P2P application
❍ centralized server: finding address of remote party
❍ client-client connection: direct (not through server)
Instant messaging
❍ chatting between two users is P2P
❍ centralized service: client presence detection/location
• user registers its IP address with central server when it comes online
• user contacts central server to find IP addresses of buddies

P2P file sharing
Example
❒ Alice runs P2P client application on her notebook computer
❒ intermittently connects to Internet; gets new IP address for each connection
❒ asks for “Hey Jude”
❒ application displays other peers that have a copy of Hey Jude.
❒ Alice chooses one of the peers, Bob.
❒ file is copied from Bob’s PC to Alice’s notebook: HTTP
❒ while Alice downloads, other users are uploading from Alice.
❒ Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!

P2P: centralized directory
original “Napster” design
1) when peer connects, it informs central server:
❍ IP address
❍ content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
[Figure: peers, including Alice and Bob, around a centralized directory server; arrows numbered 1 to 3 for the steps above; the file transfer is peer-peer]

P2P: problems with centralized directory
❒ single point of failure
❒ performance bottleneck
❒ copyright infringement: “target” of lawsuit is obvious
file transfer is decentralized, but locating content is highly centralized

Query flooding: Gnutella
❒ fully distributed
❍ no central server
❒ public domain protocol
❒ many Gnutella clients implementing protocol
overlay network: graph
❒ edge between peer X and Y if there’s a TCP connection
❒ all active peers and edges form overlay net
❒ edge: virtual (not physical) link
❒ given peer typically connected with < 10 overlay neighbors

Gnutella: protocol
❒ Query message sent over existing TCP connections
❒ peers forward Query message
❒ QueryHit sent over reverse path
File transfer: HTTP
Scalability: limited scope flooding
[Figure: Query messages spreading over the overlay and QueryHit messages returning along the reverse path]

Gnutella: Peer joining
1. joining peer Alice must find another peer in Gnutella network: use list of candidate peers
2. Alice sequentially attempts TCP connections with candidate peers until connection setup with Bob
3. Flooding: Alice sends Ping message to Bob; Bob forwards Ping message to his overlay neighbors (who then forward to their neighbors…)
❒ peers receiving Ping message respond to Alice with Pong message
4. Alice receives many Pong messages, and can then set up additional TCP connections

Hierarchical Overlay
❒ between centralized index, query flooding approaches
❒ each peer is either a group leader or assigned to a group leader.
❍ TCP connection between peer and its group leader.
❍ TCP connections between some pairs of group leaders.
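
Looking back at the Gnutella slides, limited-scope query flooding can be sketched roughly as below. This is only an illustration: the function name `flood_query`, the `ttl` hop limit, and the "forward each Query only once" rule are assumptions made for the sketch, not details given on the slides.

```python
from collections import deque


def flood_query(overlay, files, origin, wanted, ttl):
    """Flood a Query for `wanted` from `origin`, at most `ttl` overlay hops.

    overlay: dict peer -> list of overlay neighbors (one TCP connection each)
    files:   dict peer -> set of file names that peer shares
    Returns a dict mapping each peer with a hit to the reverse path
    (hit peer first, origin last) that its QueryHit would travel.
    """
    parent = {origin: None}          # who first forwarded the Query to each peer
    queue = deque([(origin, 0)])
    hits = {}
    while queue:
        peer, hops = queue.popleft()
        if peer != origin and wanted in files.get(peer, set()):
            # QueryHit goes back over the path the Query arrived on.
            path = [peer]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            hits[peer] = path
        if hops == ttl:
            continue                 # scope limit reached: do not forward further
        for neighbor in overlay.get(peer, []):
            if neighbor not in parent:   # assumed duplicate suppression
                parent[neighbor] = peer
                queue.append((neighbor, hops + 1))
    return hits


# Hypothetical five-peer overlay for illustration.
overlay = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}
files = {"dave": {"Hey Jude"}, "erin": {"Hey Jude"}}

print(flood_query(overlay, files, "alice", "Hey Jude", ttl=2))
```

With `ttl=2` only Dave answers; raising the limit to 3 also reaches Erin, which shows the trade-off behind limited-scope flooding: a smaller scope means fewer messages but possibly missed content.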
Hierarchical Overlay (continued)
❒ group leader tracks content in its children
[Figure: hierarchical overlay; legend: ordinary peer, group-leader peer, neighboring relationships in overlay network]

Distributed Hash Table (DHT)
❒ DHT = distributed P2P database
❒ Database has (key, value) pairs;
❍ key: social security number; value: human name
❍ key: content type; value: IP address
❒ Peers query DB with key
❍ DB returns values that match the key
❒ Peers can also insert (key, value) pairs

DHT Identifiers
❒ Assign integer identifier to each peer in range $[0, 2^n - 1]$.
❍ Each identifier can be represented by n bits.
❒ Require each key to be an integer in same range.
❒ To get integer keys, hash original key.
❍ e.g., key = h(“Led Zeppelin IV”)
❍ This is why they call it a distributed “hash” table

How to assign keys to peers?
❒ Central issue:
❍ Assigning (key, value) pairs to peers.
❒ Rule: assign key to the peer that has the closest ID.
❒ Convention in lecture: closest is the immediate successor of the key.
❒ Ex: n=4; peers: 1,3,4,5,8,10,12,14;
❍ key = 13, then successor peer = 14
❍ key = 15, then successor peer = 1

Circular DHT (1)
❒ Each peer only aware of immediate successor and predecessor.
❒ “Overlay network”
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]

Circular DHT (2)
❒ Define closest as closest successor
❒ O(N) messages on avg to resolve query, when there are N peers
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; the query “Who’s responsible for key 1110?” is passed from successor to successor until peer 1111 answers “I am”]

Circular DHT with Shortcuts
❒ Each peer keeps track of IP addresses of predecessor, successor, short cuts.
❒ Reduced from 6 to 2 messages.
❒ Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15 with shortcut links; query “Who’s responsible for key 1110?”]

Peer Churn
• To handle peer churn, require each peer to know the IP address of its two successors.
[Figure: ring of peers 1, 3, 4, 5, 8, 10, 12, 15]
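
The successor rule, the ring lookup without shortcuts, and the two-successor bookkeeping can be sketched as below. The function names are hypothetical, a key equal to a peer ID is assumed to belong to that peer, and the query is assumed to start at peer 3, a choice that reproduces the 6-message count quoted for the ring without shortcuts.

```python
from bisect import bisect_left


def successor(key, peers):
    """Peer responsible for `key`: the first peer ID >= key, wrapping to the smallest."""
    ring = sorted(peers)
    return ring[bisect_left(ring, key) % len(ring)]


def ring_lookup(start, key, peers):
    """Pass a query from successor to successor until it reaches the responsible peer.

    Returns (responsible peer, number of messages forwarded). The target is
    computed globally here only to keep the simulation short; in a real DHT
    each peer decides locally whether to answer or forward.
    """
    ring = sorted(peers)
    target = successor(key, ring)
    i = ring.index(start)
    messages = 0
    while ring[i] != target:
        i = (i + 1) % len(ring)
        messages += 1
    return target, messages


def two_successors(peer, peers):
    """The two successors a peer keeps track of to survive churn."""
    ring = sorted(peers)
    i = ring.index(peer)
    return ring[(i + 1) % len(ring)], ring[(i + 2) % len(ring)]


example_peers = [1, 3, 4, 5, 8, 10, 12, 14]      # peers from the n = 4 example
print(successor(13, example_peers))               # 14
print(successor(15, example_peers))               # 1

figure_peers = [1, 3, 4, 5, 8, 10, 12, 15]       # peers shown on the ring figures
print(ring_lookup(3, 0b1110, figure_peers))       # (15, 6)
print(two_successors(4, figure_peers))            # (5, 8)
```

The lookup cost grows with the number of peers between the asker and the responsible peer, which is where the O(N) average comes from and why shortcuts pay off.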
Peer Churn (continued)
• Each peer periodically pings its two successors to see if they are still alive.
❒ Peer 5 abruptly leaves
❒ Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor.
❒ What if peer 13 wants to join?

P2P Case study: Skype
❒ inherently P2P: pairs of users communicate.
❒ proprietary application-layer protocol (inferred via reverse engineering)
❒ hierarchical overlay with Supernodes (SNs)
❒ Index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), Supernodes (SN), and the Skype login server]

Peers as relays
❒ Problem when both Alice and Bob are behind “NATs”.
❍ NAT prevents an outside peer from initiating a call to an inside peer
❒ Solution:
❍ Using Alice’s and Bob’s SNs, Relay is chosen
❍ Each peer initiates session with relay.
❍ Peers can now communicate through NATs via relay

Scalable Content Delivery
Motivation
❒ Use of Internet for content delivery is massive … and becoming more so (e.g., recent projection that by 2013, 90% of all IP traffic will be video content)
❒ How to make scalable and efficient?
❍ Here: Cost efficiency (e.g., energy efficiency) ...
❒ Variety of approaches: broadcast/multicast, batching, replication/caching (e.g. CDNs), P2P, peer-assisted, …
❒ In these slides (only examples of a few peer-assisted techniques):
❍ BitTorrent (peer-to-peer)
❍ Peer-assisted streaming

File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
$u_s$: server upload bandwidth
$u_i$: peer i upload bandwidth
$d_i$: peer i download bandwidth
$F$: file size
[Figure: server with upload rate $u_s$ and N peers with upload rates $u_1 \dots u_N$ and download rates $d_1 \dots d_N$, connected by a network with abundant bandwidth]

File distribution time: server-client
❒ server sequentially sends N copies:
❍ $NF/u_s$ time
❒ client i takes $F/d_i$ time to download
Time to distribute F to N clients using client/server approach:
$d_{cs} = \max\{ NF/u_s,\; F/\min_i d_i \}$
increases linearly in N (for large N)

File distribution time: P2P
❒ server must send one copy: $F/u_s$ time
❒ client i takes $F/d_i$ time to download
❒ NF bits must be downloaded (aggregate)
❒ fastest possible upload rate: $u_s + \sum_i u_i$
$d_{P2P} = \max\{ F/u_s,\; F/\min_i d_i,\; NF/(u_s + \sum_i u_i) \}$

Server-client vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
[Figure: Minimum Distribution Time (y axis, 0 to 3.5) versus N (x axis, 0 to 35), with one curve for P2P and one for Client-Server]

File distribution: BitTorrent
❒ P2P file distribution
❍ torrent: group of peers exchanging chunks of a file
❍ tracker: tracks peers participating in torrent
[Figure: a peer obtains a list of peers from the tracker, then trades chunks with peers in the torrent]

BitTorrent-like systems
❒ File split into many smaller pieces
❒ Pieces are downloaded from both seeds and downloaders
❒ Distribution paths are dynamically determined
❍ Based on data availability
[Figure: torrent with x downloaders and y seeds; arrivals enter as downloaders, downloaders become seeds after the download time, seeds depart after the seed residence time]

Background
Peer discovery in BitTorrent
❒ Torrent file
❍ “announce” URL
❒ Tracker
❍ Register torrent file
❍ Maintain state information
❒ Peers
❍ Obtain torrent file
❍ Announce
❍ Report status
❍ Peer exchange (PEX)
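
The two distribution-time bounds from the file distribution slides can be evaluated numerically with the example parameters ($F/u$ = 1 hour, $u_s = 10u$). The sketch below picks $d_{min} = u_s$, the smallest value the example allows, and assumes every peer uploads at the same rate $u$; the function names are hypothetical.

```python
def d_client_server(n, file_size, u_s, d_min):
    """Lower bound on client-server distribution time: max{NF/u_s, F/d_min}."""
    return max(n * file_size / u_s, file_size / d_min)


def d_p2p(n, file_size, u_s, d_min, peer_uploads):
    """Lower bound on P2P distribution time: max{F/u_s, F/d_min, NF/(u_s + sum u_i)}."""
    return max(
        file_size / u_s,
        file_size / d_min,
        n * file_size / (u_s + sum(peer_uploads)),
    )


# Units chosen so that F/u = 1 hour: u = 1 file per hour, F = 1 file.
u = 1.0
F = 1.0
u_s = 10 * u
d_min = u_s          # assumption: the slide only requires d_min >= u_s

for n in (5, 10, 20, 30):
    cs = d_client_server(n, F, u_s, d_min)
    p2p = d_p2p(n, F, u_s, d_min, [u] * n)
    print(f"N={n:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

For N = 30 this gives 3 hours for client-server and 0.75 hours for P2P: the client-server bound keeps growing linearly, while the P2P bound N/(10 + N) stays below one hour because every new peer also adds upload capacity.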
Peer discovery in BitTorrent (continued)
❒ Issues
❍ Central point of failure
❍ Tracker load
Swarm = Torrent

Background
Multi-tracked torrents
❒ Torrent file
❍ “announce-list” URLs
❒ Trackers
❍ Register torrent file
❍ Maintain state information
❒ Peers
❍ Obtain torrent file
❍ Choose one tracker at random
❍ Announce
❍ Report status
❍ Peer exchange (PEX)
❒ Issue
❍ Multiple smaller swarms
Swarm ⊆ Torrent

Download using BitTorrent
Background: Incentive mechanism
❒ Establish connections to large set of peers
❍ At each time, only upload to a small (changing) set of peers
❒ Rate-based tit-for-tat policy
❍ Downloaders give upload preference to the downloaders that provide the highest download rates
[Figure: highest download rates: pick top four; optimistic unchoke: pick one at random]

Download using BitTorrent
Background: Piece selection
❒ Rarest first piece selection policy
❍ Achieves high piece diversity
❒ Request pieces that
❍ the uploader has;
❍ the downloader is interested in (wants); and
❍ are the rarest among this set of pieces
[Figure: pieces 1 … k … K held by Peer 1, Peer 2, …, Peer N, with the number of copies of each piece in the neighbor set]

Live Streaming using BT-like systems
[Figure: pieces uploaded to and downloaded from the Internet fill the media player queue/buffer; a buffer window marks the part of the queue in use]
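
Returning to the two BitTorrent background slides on incentives and piece selection: the rate-based tit-for-tat choice (top four plus one optimistic unchoke) and rarest-first selection can be sketched as below. Function names and data layout are hypothetical, and breaking rarity ties by lowest piece number is an assumption made to keep the output deterministic.

```python
import random
from collections import Counter


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Pick the peers to upload to.

    download_rates: dict peer -> rate at which that peer currently sends data to us.
    Returns (peers with the highest rates, one randomly chosen other peer or None).
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    regular = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None   # optimistic unchoke
    return regular, optimistic


def rarest_first(my_pieces, uploader_pieces, neighbor_pieces):
    """Pick the next piece to request from one uploader.

    Candidates are pieces the uploader has and we still want; among those,
    choose the one held by the fewest neighbors.
    """
    candidates = uploader_pieces - my_pieces
    if not candidates:
        return None
    copies = Counter(p for pieces in neighbor_pieces.values() for p in pieces)
    return min(candidates, key=lambda p: (copies[p], p))


rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}
regular, optimistic = choose_unchoked(rates)
print(regular, optimistic)        # top four by rate, plus "e" or "f" at random

neighbors = {"u": {1, 2, 3}, "v": {1, 2}, "w": {2}}
print(rarest_first({1}, neighbors["u"], neighbors))   # 3
```

The random optimistic slot is what lets a newly arrived peer with nothing to offer obtain its first pieces and begin reciprocating.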
Live Streaming using BT-like systems (continued)
❒ Live streaming (e.g., CoolStreaming)
❍ All peers at roughly the same play/download position
• High bandwidth peers can easily contribute more …
❍ (relatively) Small buffer window
• Within which pieces are exchanged

Peer-assisted VoD streaming
Some research questions ...
❒ Can BitTorrent-like protocols provide scalable on-demand streaming?
❒ How sensitive is the performance to the application configuration parameters?
❍ Piece selection policy
❍ Peer selection policy
❍ Upload/download bandwidth
❒ What is the user-perceived performance?
❍ Start-up delay
❍ Probability of disrupted playback
ACM SIGMETRICS 2008; IFIP Networking 2007; IFIP Networking 2009

A final note …
❒ If you are interested in one (or more) topic(s) in this course, please do not hesitate to contact me ...
❒ There are always lots of interesting problems to work on (for an exjobb, i.e. a thesis project, for example)!!!
❒ You can also find out more about my research here: www.ida.liu.se/~nikca/