CS 34701: Large-Scale Networked Systems Professor: Ian Foster TA: Adriana Iamnitchi http://dsl.cs.uchicago.edu/Courses/cs347-2002/ CS 34701 Course Goals Primary – Gain deep understanding of fundamental issues that effect design of large-scale networked systems – Map primary contemporary research themes – Gain experience in network research Secondary – By studying a set of outstanding papers, build knowledge of how to present research – Learn how to read papers & evaluate ideas How the Class Works Research papers – Prior to each class, we all read and evaluate two research papers – During each class, we discuss those papers Project – One-page project description by 2nd week – Five-page project summary by 5th week – 10-20 final paper by 9th week – Project presentations: 9th and 10th weeks. Paper Review & Discussion Everyone reads two papers per class and submits an evaluation (see below) We discuss (not present) papers in class – A team of 2-3 leads each discussion – The leading team submits discussion plan before class, submits “master critique” and summarizes discussion at the beginning of following class Look over schedule between now & Friday, when we will allocate discussants Evaluations You must submit evaluations of papers – Email them by 6pm the day before Answer a set of standard questions 1. State the main contribution of the paper 2. Critique the main contribution 3. What are the three strongest and/or most interesting ideas in the paper? 4. Three most striking weaknesses in the paper? 5. Three questions to ask the authors? 6. Detail an interesting extension to the work not mentioned in the future work section. 7. Optional comments on the paper that you’d like to see discussed in class. What I’ll Assume You Know Basic Internet architecture – IP, TCP, DNS, HTTP Basic principles of distributed computing – Asynchrony (cannot distinguish between communication failures and latency) – Partial global state knowledge (cannot know everything correctly) – Failures happen. In very large systems, even rare failures happen often If there are things that don’t make sense, ask! Large-Scale Networked Systems Internet-connected networks with a large number of components, spanning multiple DNS domains (usually WAN) Designed to solve specific problems: – Content distribution – Cycle sharing – File sharing – Sensor data fusion – Distributed data analysis –… Example: Gnutella Peer-to-peer file sharing system – File sharing: goal is to enable publication and access to files – P2P: no central servers; all clients also act as servers and are equivalent (more or less) Issues – Scaling to very large numbers of nodes – Properties: bootstrapping, reliability, cost, anonymity, security, freeloading, … Gnutella Protocol Overview P2P file sharing application on top of an overlay network: – Nodes maintain open TCP connections. – Messages are broadcasted (flooded) or back-propagated. Protocol: Membership Query File download Broadcast (Flooding) Backpropagated PING PONG QUERY QUERY HIT Node to node GET, PUSH Gnutella search mechanism Steps: 1. Node 2 initiates search for file A 7 1 A 4 2 6 3 5 Gnutella search mechanism A Steps: 1. Node 2 initiates search for file A 2. Sends message to all neighbors 7 1 4 2 3 A 6 A 5 Gnutella search mechanism A A Steps: 1. Node 2 initiates search for file A 2. Sends message to all neighbors 3. Neighbors forward message 7 1 4 2 6 3 A 5 A Gnutella search mechanism A:7 A 7 1 4 2 6 3 A:5 A 5 A Steps: 1. Node 2 initiates search for A 2. Sends message to all neighbors 3. Neighbors forward message 4. Nodes that have file A initiate a reply message Gnutella search mechanism 7 1 4 2 3 A:7 A:5 A 6 A 5 Steps: 1. Node 2 initiates search for A 2. Sends message to all neighbors 3. Neighbors forward message 4. Nodes that have file A initiate a reply message 5. Query reply message is backpropagated Gnutella search mechanism 7 1 A:7 2 4 A:5 6 3 5 Steps: 1. Node 2 initiates search for A 2. Sends message to all neighbors 3. Neighbors forward message 4. Nodes that have file A initiate a reply message 5. Query reply message is backpropagated 6. Node 2 gets replies Gnutella search mechanism download A 1 7 4 2 6 3 5 Steps: 1. Node 2 initiates search for A 2. Sends message to all neighbors 3. Neighbors forward message 4. Nodes that have file A initiate a reply message 5. Query reply message is backpropagated 6. Node 2 gets replies 7. File download Tools for network exploration Eavesdropper - modified node inserted into the network to log traffic. Crawler - connects to all active nodes and uses the membership protocol to discover graph topology. Parallel crawling. Graph analysis tools high-volume offline computations. Network growth Number of nodes in the largest network component ('000) 50 Gnutella Network Growth . 40 High user interest: 30 Users tolerate high latency, low quality results. Better resources: 20 10 05/12/01 05/16/01 05/22/01 05/24/01 05/29/01 02/27/01 03/01/01 03/05/01 03/09/01 03/13/01 03/16/01 03/19/01 03/22/01 03/24/01 11/20/00 11/21/00 11/25/00 11/28/00 - DSL and cable modem nodes grew from 24% to 41% over 6 months. Open architecture / open-source environment: Competing implementations, Lower overhead network traffic, improved resource utilization, better structure, Recently, two-level structure. Growth invariants Percent of node pairs (%) 1. Graph connectivity: 3.4 links per node on average. 2. Path length distribution: node-to-node distance maintains similar distributions. 50% 40% 30% 20% 10% 0% 1 2 3 4 5 6 7 8 9 10 11 12 Node-to-node shortest path (hops) Avg. node-to-node distance grew 25% while the network grew 50 times over 6 months. Random graph theory predicts about 75% increase. Is Gnutella a power-law network? Power-law networks: the number of nodes N with exactly L links is proportional to L-k N ~ L-k Num. of nodes (log scale) 10000 Examples: The Internet, In/out links to/from HTML pages, Citations network, US power grid, Social networks. November 2000 1000 100 10 1 1 10 100 Number of links (log scale) Implication: High tolerance to random node failure but low reliability when facing an ‘intelligent’ adversary Is Gnutella a power-law network? Later, larger networks display a bimodal distribution. Implications: High tolerance to random node failures preserved Increased reliability 10000 when facing an attack. 1000 Number of nodes (log scale) May 2001 100 10 1 1 10 100 Number of links (log scale) Traffic analysis Message Frequency 25 . 20 Ping Push Query Other 15 10 5 353 321 289 257 225 193 161 129 97 65 33 - 1 messages per secod 6-8 kbps per link over any connection. Traffic structure changed over time. minute Total generated traffic 1Gbps (or 330TB/month)! – Note that this estimate excludes actual file transfers – Q: Does it matter? – Compare to 15,000TB/month estimated in US Internet backbone (Dec. 2000). Reasoning: and PING messages are flooded. They form more than 90% of generated traffic predominant TTL=7 >95% of nodes are less than 7 hops away measured traffic at each link about 6 to 8kbs network with 50k nodes and 170k links QUERY Topology mismatch The overlay network topology doesn’t match the underlying Internet infrastructure topology! 40% of all nodes are in the 10 largest Autonomous Systems (AS). Only 2-4% of all TCP connections link nodes within the same AS. Largely ‘random wiring’. Entropy experiment gives similar results. Course Topics Internet Architecture and Design Principles Flat Pricing vs. Prioritized Traffic Internet Measurements Availability in Wide-Area Patterns in Real Networks Modeling the Internet Topology Internet Services: DNS Web Caching, Content Distribution Networks Overlay Networks Peer-to-Peer systems Computational Grids Security Issues Sensor Nets Wireless Networks XML SOAP and Web Services Course Topics Internet Design Principles – How do I deliver Internet services: end-to-end vs. within the network? Flat Pricing vs. Prioritized Traffic – How do I determine which traffic to pass over the Internet? Internet Measurements – What does the Internet really look like? Course Topics Availability in Wide-Area – How reliable is the Internet? Patterns in Real Networks – What does Internet traffic look like? Modeling the Internet Topology – How can I construct realistic models of Internet structure? Course Topics Internet Services: DNS – How well does DNS work? Web Caching, Content Distribution Networks – How do we optimize Web content mgmt? Overlay Networks – Improving routing performance Course Topics Peer-to-Peer systems – Gnutella, etc., etc. Computational Grids – Globus, etc. Security Issues – Authorization, etc. Circulatory Net Course Topics Sensor Nets – How do I structure & program networks of lightweight devices? Wireless Networks – How do I route in ad hoc networks? XML SOAP and Web Services – What are Web services anyway? Disaster Response Projects Literature surveys, real implementations, analytical evaluations Can be performed individually or in a team of two Your project ideas appreciated (to be discussed before proposal due date) Primary goal is to do something interesting and to do it well Example Project Gnutella network analysis – Develop a “crawler” that traverses network, collects membership & connectivity info – Analyze structure – Characterize structure See, e.g.: – Mapping the Gnutella Network: Properties of LargeScale Peer-to-Peer Systems and Implications for System Design, M. Ripeanu, I. Foster, A. Iamnitchi, in IEEE Internet Computing Journal, vol. 6(1), 2002 Project Ideas http://dsl.cs.uchicago.edu/Courses/cs3472002/cs347_projects.htm Gnutella network measurements – Topology discovery for 500K nodes – Structural analysis with 500K nodes – Study impact of overlay networks – Etc. Project Ideas Overlay networks: build unstructured or semistructured self-organizing overlays optimizing different cost functions: – Topology-aware: map onto physical infrastructure – Usage-aware: map onto usage patterns Analysis of Sloan Digital Sky Survey logs to explore access patterns – What files are accessed how often – What community usage patterns emerge? – How can we exploit these in content distribution networks? Project Ideas Compare qualitatively and analytically current filelocation solutions (CAN, Chord, Gnutella, Napster, etc.) in the context of scientific file-sharing collaborations. – Evaluate sharing patterns based on real usage traces in a scientific collaboration – Use these patterns to evaluate benefits/drawbacks and propose better alternatives Expand existing simulator to evaluate request forwarding techniques for resource location in grid environments For More Information Contact me – Ian Foster, foster@cs.uchicago.edu – Email or set up a meeting Contact Anda, our TA – Adriana Iamnitchi, anda@cs.uchicago.edu Monitor the class web page – http://dsl.cs.uchicago.edu/Courses/cs347-2002/ Next 2 Classes Friday: – Discuss: > J. Saltzer, D. Reed, and D. Clark, End-to-end Arguments in System Design. ACM Transactions on Computer Systems, Vol. 2, No. 4, pp. 195-206, 1984. > D. Clark and M. Blumenthal, Rethinking the design of the Internet: The end to end arguments vs. the brave new world, Workshop on Policy Implications of End-to-End. December 1, 2001. – Leading group: Ian + 2 volunteers (who?) Wednesday: – Leading Group: Anda + 1-2 volunteers (who?)