INF 123 SW ARCH, DIST SYS & INTEROP
LECTURE 12
Prof. Crista Lopes

Objectives
- Understanding of Peer-to-Peer architectures
- Solid knowledge of well-known P2P systems

Peer-to-Peer
[Figure: Node 1, Node 2]
- State and behavior are distributed among peers, which can act as either clients or servers.
- Peers: independent components, each with its own state and control thread.
- Connectors: network protocols, often custom.
- Data elements: network messages.
- Topology: network (may have redundant connections between peers); can vary arbitrarily and dynamically.
- Supports decentralized computing, with flow of control and resources distributed among peers.
- Highly robust in the face of failure of any given node.
- Scalable in terms of access to resources and computing power.
- But be careful with the protocol!

Issues to consider in P2P
- Locating resources
- Retrieving resources

Napster
The system that made P2P (in)famous

Case Study: Napster
[Figure: “The Napster”]
- Resource localization was centralized
- Resource retrieval was P2P
- Protocol: custom over TCP/IP (Spec)

Notification of song: [Client -> Napster]
    "<filename>" <md5> <size> <bitrate> <frequency> <time>
Example:
    "C:\random band - random song.mp3" b92870e0d41bc8e698cf2f0a1ddfeac7 443332 128 44100 60

Search query: [Client -> Napster]
    [FILENAME CONTAINS "artist name"] MAX_RESULTS <max> [FILENAME CONTAINS "song"] [LINESPEED <compare> <link-type>] [BITRATE <compare> "<br>"] [FREQ <compare> "<freq>"] [WMA-FILE] [LOCAL_ONLY]
Example:
    FILENAME CONTAINS "random" MAX_RESULTS 75 FILENAME CONTAINS "song" BITRATE "AT LEAST" "128"

Query response: [Napster -> Client]
    "<filename>" <md5> <size> <bitrate> <frequency> <length> <nick> <ip> <link-type> [weight]
Example:
    "C:\random band - random song.mp3" 7d733c1e7419674744768db71bff8bcd 2558199 128 44100 159 lefty 3437166285 4

Retrieving song (no firewall): [Client -> Client]
    GET <nick> "<filename>"
Example:
    lefty "C:\random band - random song.mp3"

Retrieval response: [Client -> Client]
    <nick> <ip> <port> "<filename>" <md5> <linespeed>    (if file exists)
    <nick> "<filename>"                                  (if file doesn’t exist)
Example:
    lefty 4877911892 6699 "C:\random band - random song.mp3" 10fe9e623b1962da85eea61df7ac1f69 3

Retrieving song (firewall): [Client -> Napster -> Client -> Client]
    SEND <nick> "<filename>"
Example:
    lefty "C:\random band - random song.mp3"

Napster’s Achilles’ Heel
- “The Napster” central server
- Single point of failure
- Shutdown mandated by court order
- Without the central server, the peers were useless

Gnutella
The textbook P2P architecture

Case Study: Gnutella
- Resource localization is decentralized
- Gnutella is, essentially, a decentralized search system
- Resource retrieval is P2P
- Protocols: custom over TCP/IP + HTTP (Spec)

Node discovery is done through out-of-band channels:
- List shipped with software
- IRC
- Mailing lists
Only 1 neighbor node is needed to connect to the node network.
Gnutella nodes = “servents”

Search: flooding (originally)

Connection to peer:
    GNUTELLA CONNECT/<protocol version string>\n\n
Response:
    GNUTELLA OK\n\n

Gnutella Protocol Descriptor (header fields, in order):
    Descriptor ID | Payload Descriptor | TTL | Hops | Payload Length

Payload Descriptor values:
    Code    Type
    0x00    Ping
    0x01    Pong
    0x80    Query
    0x81    QueryHit
    0x40    Push

Ping messages
- Used to discover other servents

Pong messages
- Responses to Ping messages
- May be cached; a receiver may send many Pong messages in response to one Ping request
- Payload: [figure]

Query messages
- Used to find resources
- Payload: [figure]

QueryHit messages
- Responses to Query messages
- Payload: [figure]

[Normal] File download is done via HTTP GET
Request:
    GET /get/<File Index>/<File Name>/ HTTP/1.0\r\n
    Connection: Keep-Alive\r\n
    Range: bytes=0-\r\n
    User-Agent: Gnutella\r\n
    \r\n
Response:
    HTTP 200 OK\r\n
    Server: Gnutella\r\n
    Content-type: application/binary\r\n
    Content-length: 4356789\r\n
    \r\n
    <file data>
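
To make the Gnutella descriptor header concrete, here is a minimal Python sketch that packs and unpacks the five header fields listed above. The slides give the field names and the payload descriptor codes but not the field widths; the layout used here (16-byte Descriptor ID, 1 byte each for Payload Descriptor, TTL and Hops, and a 4-byte little-endian Payload Length, 23 bytes in total) is taken from the Gnutella v0.4 protocol document and should be read as an assumption. The helper names `pack_header` and `unpack_header` are hypothetical.

```python
import struct
import uuid

# Payload descriptor codes, as listed in the lecture's table.
PAYLOAD_TYPES = {0x00: "Ping", 0x01: "Pong", 0x40: "Push",
                 0x80: "Query", 0x81: "QueryHit"}

# Assumed layout (Gnutella v0.4, not stated on the slides):
# 16-byte ID, three 1-byte fields, 4-byte little-endian length.
HEADER = struct.Struct("<16sBBBI")


def pack_header(descriptor_id, payload_code, ttl, hops, payload_length):
    """Serialize the five header fields into 23 bytes."""
    return HEADER.pack(descriptor_id, payload_code, ttl, hops, payload_length)


def unpack_header(data):
    """Parse the first 23 bytes of a descriptor into a dict."""
    descriptor_id, code, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "id": descriptor_id,
        "type": PAYLOAD_TYPES.get(code, "Unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }


# Usage: a Query header with TTL 7 announcing a 42-byte payload.
raw = pack_header(uuid.uuid4().bytes, 0x80, 7, 0, 42)
print(len(raw), unpack_header(raw)["type"])
```

A fixed-size header like this lets a servent read exactly 23 bytes, learn the payload length, and then read the payload, which is why the length field sits in the header rather than in the payload.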
Routing Protocol rules:
- Pong messages may only be sent along the same path as the corresponding Pings.
- QueryHit messages may only be sent along the same path as the corresponding Queries.
- Push messages may only be sent along the same path that carried the incoming QueryHits.
- A servent will forward incoming Ping and Query messages to all of its directly connected servents, except the one that delivered the incoming Ping or Query.
- A servent will decrement a descriptor header’s TTL field, and increment its Hops field, before it forwards the descriptor to any directly connected servent. If, after decrementing the header’s TTL field, the TTL field is found to be zero, the descriptor is not forwarded along any connection.
- A servent receiving a message with the same Payload Descriptor and Descriptor ID as one it has received before should avoid forwarding the message to any connected servent.

[Figure: Gnutella Ping/Pong routing]
[Figure: Gnutella Query/QueryHit/Push routing]

Skype
Hybrid P2P and Client-Server
Proprietary protocols, not much documentation

Case Study: Skype
- A mixed client-server and peer-to-peer architecture addresses the discovery problem.
- Replication and distribution of the directories, in the form of supernodes, address the scalability and robustness problems encountered in Napster.
- Promotion of ordinary peers to supernodes based upon network and processing capabilities addresses another aspect of system performance: “not just any peer” is relied upon for important services.
- A proprietary protocol employing encryption provides privacy for calls that are relayed through supernode intermediaries.
- Restriction of participants to clients issued by Skype, and making those clients highly resistant to inspection or modification, prevents malicious clients from entering the network.
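
Returning to the Gnutella routing rules listed earlier in this lecture, the following Python sketch simulates them in memory: flooding of Ping/Query to every neighbor except the sender, the TTL/Hops update before each forward, duplicate suppression, and replies travelling back along the reverse path. It is an illustration under simplifying assumptions, not Gnutella code: there is no networking, Push is not modeled, and the names `Descriptor`, `Servent`, `next_hop`, `originate` and `reply`, as well as the default reply TTL of 7, are hypothetical.

```python
from dataclasses import dataclass, replace

PING, PONG, QUERY, QUERY_HIT = 0x00, 0x01, 0x80, 0x81
REPLY_TO = {PONG: PING, QUERY_HIT: QUERY}  # reply type -> request type it answers


@dataclass(frozen=True)
class Descriptor:
    descriptor_id: str
    payload_type: int
    ttl: int
    hops: int = 0


def next_hop(msg):
    """Decrement TTL and increment Hops; None means 'do not forward'."""
    if msg.ttl - 1 <= 0:
        return None
    return replace(msg, ttl=msg.ttl - 1, hops=msg.hops + 1)


class Servent:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.came_from = {}  # (request type, descriptor id) -> delivering neighbor
        self.inbox = []      # descriptors accepted by this servent

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def originate(self, msg):
        # Remember our own request so that replies stop here.
        self.came_from[(msg.payload_type, msg.descriptor_id)] = None
        for n in self.neighbors:
            n.receive(msg, self)

    def reply(self, request, reply_type, ttl=7):
        # A reply starts its journey at the neighbor that delivered the request.
        back = self.came_from[(request.payload_type, request.descriptor_id)]
        back.receive(Descriptor(request.descriptor_id, reply_type, ttl), self)

    def receive(self, msg, sender):
        if msg.payload_type in REPLY_TO:
            self._route_back(msg)
        elif msg.payload_type in (PING, QUERY):
            self._flood(msg, sender)

    def _flood(self, msg, sender):
        key = (msg.payload_type, msg.descriptor_id)
        if key in self.came_from:
            return  # duplicate: same payload descriptor and descriptor ID
        self.came_from[key] = sender
        self.inbox.append(msg)
        fwd = next_hop(msg)
        if fwd is None:
            return  # TTL reached zero: not forwarded on any connection
        for n in self.neighbors:
            if n is not sender:  # everyone except the delivering servent
                n.receive(fwd, self)

    def _route_back(self, msg):
        key = (REPLY_TO[msg.payload_type], msg.descriptor_id)
        if key not in self.came_from:
            return  # never saw the request: no path to follow
        previous = self.came_from[key]
        if previous is None:
            self.inbox.append(msg)  # we originated the request
            return
        fwd = next_hop(msg)
        if fwd is not None:
            previous.receive(fwd, self)


# Usage: a triangle A-B-C with D attached to C; A pings with TTL 3.
a, b, c, d = (Servent(n) for n in "ABCD")
a.connect(b); b.connect(c); a.connect(c); c.connect(d)
a.originate(Descriptor("id-1", PING, ttl=3))
d.reply(d.inbox[0], PONG)  # the Pong retraces D -> C -> B -> A
```

The `came_from` table does double duty: it is the duplicate filter for flooded requests and the routing table for replies, which is what lets replies find their way back without any servent knowing the originator's address.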
Summary
- Understanding of Peer-to-Peer architectures
    - node discovery
    - resource retrieval
    - how to deal with firewalls
- Solid knowledge of well-known P2P systems
    - Napster
    - Gnutella
    - Skype (briefly)