Advanced Computer Networks (COS 561)
Jennifer Rexford
http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Tuesdays/Thursdays 1:30pm-2:50pm

Focus of the Course: Network Architecture
• Network architecture
  – Definition and placement of functions
  – Types of nodes, and information they exchange
  – Not measuring or redesigning individual protocols
• Revisiting the functions inside the network
  – Naming and addressing
  – Routing and forwarding
  – Virtualization and programmability
• To address critical challenges
  – Performance, scalability, security, manageability, …
  – Interactive applications, content services, …

Goals of the Course
• Understand the Internet architecture
  – Reading and discussing the classic papers
  – Considering the strengths and limitations
• Critically study new architectural alternatives
  – Reading and discussing recent research papers
  – Considering how well they address the limitations, and emerging challenges
• Create and evaluate new architectural ideas
  – Learning tools for experimental systems research
  – Completing a systems-oriented research project

Reading Research Papers
• Classic papers
  – For the first few weeks of the course
  – Lectures (quickly) reviewing today’s architecture
• Recent research papers
  – Emphasis on new architectural ideas
  – Some full-length papers with thorough evaluation (e.g., from SIGCOMM and NSDI conferences)
  – And some short papers selling a new idea (e.g., from HotNets and other workshops)
  – Lectures reviewing the related limitations of today’s architecture

In-Class Discussion
• Big part of each class is devoted to discussion
  – Focused on the research papers
  – Everyone is expected to participate. (Really!)
• To prepare for the discussions
  – Critically read the assigned paper(s)
  – Consider how you would summarize:
    • The main idea and contributions
    • Strengths of the paper
    • Weaknesses of the paper
    • Directions for future work
  – No need to submit a formal written review

Homework Assignments
• Learning tools for experimental systems work
  – Routers: Click and Quagga
  – Evaluation facilities: Emulab
  – Measurement data: RouteViews and Netflow
• Three assignments
  – Two in the first half of the course
  – One in the second half
• You can work in pairs on the assignments
  – Each person should complete their own write-up
  – Include the name of the person you worked with

Course Project
• Research project
  – Capstone for the semester
  – Design and evaluate a new networking idea
  – Can work in groups, if you like
• Due dates
  – Before fall break: short proposal (1-2 pages)
    • Must discuss the topic with me ahead of time
  – Dean’s date: final report (10 two-column pages)
    • Paper format listed on the course Web site
  – During exam period: oral presentations
    • To be scheduled later…

Grading Breakdown
• Class participation: 30%
• Homework assignments: 30%
  – That is, 10% for each assignment
• Course project: 40%
  – Includes both the report and presentation
• Students auditing the class
  – Do not need to complete the homework assignments and course project…

For You To Do (See Class Web Site for Details)
• Join the class mailing list
  – For follow-up discussions and pointers
  – For questions about the homework assignments
• Create an account on Emulab
  – Needed for the first homework assignment
  – Requires time for approval, so do it right away
• Read the “how to read a paper” tips
  – Two short write-ups linked from today’s class
  – To help you read the research papers efficiently
• Start reading assignment for Tuesday’s class

The Internet: The Good, The Bad, and The Ugly

What is the Internet?
The Internet is the worldwide, publicly accessible network of interconnected computer networks that transmit data by packet switching using the standard Internet Protocol (IP). It is a "network of networks" that consists of millions of smaller domestic, academic, business, and government networks, which together carry various information and services, such as electronic mail, online chat, file transfer, and the interlinked Web pages and other documents of the World Wide Web.
http://en.wikipedia.org/wiki/Internet

The Internet: A Remarkable Story
• Tremendous success
  – A research experiment that truly escaped from the lab
• The brilliance of under-specifying
  – Best-effort packet-delivery service
  – Key functionality at programmable end hosts
• Enabled massive growth and innovation
  – Ease of adding hosts & links, & new technologies
  – Ease of adding new services (Web, P2P, VoIP, …)

Idea #1: Functionality at the (Programmable) Edge of the Network

Telephone Network: Dumb Edge, Smart Core
• Dumb phones
  – Dial a number
  – Speak and listen
• Smart switches
  – Set up and tear down a circuit
  – Forward audio along the path
• Limited services
  – Audio
  – Later, fax, caller-id, …
• A monopoly for a long time

Internet: Smart Edge, Dumb Core
• End-to-End Principle
  – Whenever possible, communications protocol operations should be defined to occur at the end-points of a communications system.
• Programmability
  – With programmable end hosts, new network services can be added at any time, by anyone.
• And then end hosts became powerful and ubiquitous…
Programmability
• Architectural decision with profound effects
  – Where you place programmability in the system determines who gets to innovate
  – And what kinds of innovations can happen
• Today’s Internet
  – Programmable hosts → innovation in applications
  – Non-programmable routers → more control by standards bodies, router vendors, and carriers
• Democratizing Innovation
  – Interesting book by Eric von Hippel
  – http://web.mit.edu/evhippel/www/democ1.htm

Idea #2: Best-Effort Packet Switching

Internet Protocol (IP) Packet Switching
• Like the postal system
  – Divide information into letters
  – Stick them in envelopes
  – Deliver them independently
  – And sometimes they get there
• What’s in an IP packet?
  – The data you want to send
  – A header with the “from” and “to” addresses

Why Packets?
• Packets can be delivered by most anything
  – Serial link, fiber optic link, coaxial cable, wireless, birds
• Data traffic is bursty
  – Logging in to remote machines, exchanging e-mail
• Don’t waste bandwidth
  – No traffic exchanged during idle periods
• Better to allow multiplexing
  – Different transfers share access to same links

Best-Effort Packet-Delivery Service
• Best-effort delivery
  – Packets may be lost
  – Packets may be corrupted
  – Packets may be delivered out of order
[Figure: source and destination hosts connected through an IP network]

Why Best-Effort?
• Simpler network
  – No error detection and correction
  – Don’t remember from one packet to next
  – Don’t reserve bandwidth and memory
  – Transient disruptions are okay during failover
• … but, applications do want efficient, accurate transfer of data in order, in a timely fashion
• Fortunately, the end hosts take care of that!
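As a rough illustration of how end hosts can build in-order, reliable transfer on top of best-effort delivery, here is a minimal Python sketch. The names `best_effort_send` and `reliable_transfer`, the loss rate, and the round limit are all hypothetical. The sketch also assumes the sender somehow knows which packets are still missing; real protocols such as TCP learn this through acknowledgments and timers, which are omitted here.

```python
import random


def best_effort_send(packets, rng, loss_rate=0.3):
    """Model a best-effort network: drop some packets, reorder the rest."""
    delivered = [p for p in packets if rng.random() >= loss_rate]
    rng.shuffle(delivered)  # successive packets may arrive out of order
    return delivered


def reliable_transfer(chunks, rng, loss_rate=0.3, max_rounds=50):
    """End-host logic: number the chunks, resend what is missing, reorder."""
    received = {}  # sequence number -> data, kept at the receiver
    for _ in range(max_rounds):
        # Assumption: the sender knows exactly which chunks are still missing.
        missing = [(seq, data) for seq, data in enumerate(chunks)
                   if seq not in received]
        if not missing:
            break
        for seq, data in best_effort_send(missing, rng, loss_rate):
            received[seq] = data
    if len(received) < len(chunks):
        raise RuntimeError("transfer did not complete within max_rounds")
    # The receiver puts the data back in order using the sequence numbers.
    return [received[seq] for seq in range(len(chunks))]


if __name__ == "__main__":
    rng = random.Random(561)  # fixed seed so runs are repeatable
    message = ["best", "effort", "is", "enough"]
    print(reliable_transfer(message, rng))
```

The network function keeps no state between calls and does nothing about loss; all of the bookkeeping (sequence numbers, retransmission, reordering) lives in the end-host function, which is the division of labor the slides describe.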
End Host Can Take Care of Requirements
• No error detection or correction
  – Higher-level protocol can provide error checking
• Successive packets may not follow same path
  – No problem as long as packets reach destination
• Packets can be delivered out-of-order
  – Receiver can put packets back in order (if needed)
• Packets may be lost or arbitrarily delayed
  – Sender can send the packets again (if desired)
• No reaction to congestion, beyond “drop”
  – Sender can slow down in response to loss or delay

Idea #3: Layering and the IP Hourglass Model

Layering: A Modular Approach
• Sub-divide the problem
  – Each layer relies on services from layer below
  – Each layer exports services to layer above
• Interface between layers defines interaction
  – Hides implementation details
  – Layers can change without disturbing other layers
[Figure: layer stack, top to bottom: Application; Application-to-application channels; Host-to-host connectivity; Link hardware]

The Narrow Waist of IP
[Figure: The Hourglass Model. Applications: FTP, HTTP, NV, TFTP; transport: TCP, UDP; waist: IP; data link and physical: NET1, NET2, …, NETn]
The waist facilitates interoperability

Above and Below the Waist
• IP over anything
  – Internetworking protocol that runs on anything
  – Accommodate innovation in link technology
  – … and heterogeneity throughout the network
• Anything over IP
  – Variety of transport protocols can be built
  – Though, in practice, mainly just TCP and UDP
    • TCP: ordered, reliable stream of bytes
    • UDP: simple (unreliable) message delivery
  – And any applications on top of that

End-to-End IP
[Figure: two hosts connected through two routers. The hosts exchange an HTTP message, carried in a TCP segment, carried in IP packets. On each hop the IP packet travels inside a link-layer frame (Ethernet frame, SONET frame, Ethernet frame) between matching Ethernet or SONET interfaces.]

Idea #4: Decentralized Control

Benefits of Decentralization: Scalability
• Scalability
  – Limit amount of state, and frequency of updates
• Addressing
  – Internet routers only need to know how to reach blocks of addresses
    (e.g., 12.0.0.0/8)
• Routing
  – Link failure in one network is typically not visible in another
• Naming
  – Look-up of www.cnn.com doesn’t go to same server as look-up of www.princeton.edu

Benefits of Decentralization: Autonomy
• Autonomy
  – Allow different parties to manage different parts of the system, and apply their own policies
• Addressing
  – ARIN delegates address space to AT&T, who delegates smaller blocks to its customers
• Routing
  – AT&T controls flow of traffic through its backbone
• Naming
  – CNN controls addresses for www.cnn.com

Problems Lurking

Challenges Tied to Early Design Decisions
• Power of programmable end hosts
  – Easy to spoof IP addresses, e-mail addresses, …
  – Incentives for users to violate congestion control
  – Malicious users launching Denial-of-Service attacks
• Best-effort packet-delivery service
  – Inefficient in high-loss environments (wireless)
  – Poor performance for interactive applications
  – Expensive per-packet handling on high-speed links

Challenges Tied to Early Design Decisions
• Layering and the IP narrow waist
  – Low efficiency due to many layers of headers
  – Poor visibility into underlying shared risks
  – Complex network management due to multiple interconnected protocols and systems
• Decentralized control
  – Hierarchical addressing makes mobility difficult, and requires careful configuration
  – Autonomy makes measurement (and troubleshooting and accountability) hard
  – Autonomy makes protocol changes difficult

Recurring Challenges
• Security
  – Weak notions of identity that are easy to spoof
  – Protocols that rely on good behavior
  – Incomplete or non-existent registries, keys, …
• Mobility and disconnected operation
  – Hierarchical addressing closely tied with routing
  – Presumption that hosts are connected
• Network management
  – Many coupled, decentralized control loops
  – Limited visibility across layers and networks
• Application performance requirements
  – Real-time, interactive applications
  – Throughput-sensitive vs.
    delay-sensitive

Internet is Not Standing Still
• Partial solutions to these problems
  – Often as “add ons” or “extensions”
  – Hampered by need to be backwards compatible, and work when only partially deployed
  – Rather than complete architectural solutions
• Solutions create problems of their own
  – Violations of architectural assumptions
  – Unexpected interactions with applications
  – Adding complexity to an already complex system

Example: Middleboxes
• Middleboxes are intermediaries
  – Interposed in-between the communicating hosts
  – Often without knowledge of one or both parties
• Examples
  – Network address translators
  – Firewalls
  – Traffic shapers
  – Intrusion detection systems
  – Transparent Web proxy caches
  – Application accelerators

Middleboxes Address Practical Challenges
• Host mobility
  – Relaying traffic to a host in motion
• IP address depletion
  – Allowing multiple hosts to share a single address
• Security concerns
  – Discarding suspicious or unwanted packets
  – Detecting suspicious traffic
• Performance concerns
  – Controlling how link bandwidth is allocated
  – Storing popular content near the clients

Middleboxes Violate Network-Layer Principles
• Globally unique identifiers
  – Each node has a unique, fixed IP address
  – … reachable from everyone and everywhere
• Simple packet forwarding
  – Network nodes simply forward packets
  – … rather than modifying or filtering them
[Figure: source and destination hosts connected through an IP network]

Two Views of Middleboxes
• An abomination
  – Violation of layering
  – Cause confusion in reasoning about the network
  – Responsible for many subtle bugs
• A practical necessity
  – Solving real and pressing problems
  – Needs that are not likely to go away
• Would they arise in any edge-empowered network, even if redesigned from scratch?
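To make the address-sharing middlebox concrete, the following is a simplified sketch of a network address translator in Python. The class `Nat`, its method names, the tuple layout `(src_ip, src_port, dst_ip, dst_port)`, and the port-allocation scheme are assumptions made for illustration, and the addresses come from ranges reserved for documentation. Real NATs also track the transport protocol and expire idle mappings, among other things.

```python
class Nat:
    """Toy NAT: many private hosts share one public IP address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.to_public = {}   # (private_ip, private_port) -> public_port
        self.to_private = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of a packet leaving the private network."""
        key = (src_ip, src_port)
        if key not in self.to_public:
            self.to_public[key] = self.next_port
            self.to_private[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.to_public[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the destination of a reply, or drop unknown traffic."""
        mapping = self.to_private.get(dst_port)
        if dst_ip != self.public_ip or mapping is None:
            return None  # no mapping: the packet is filtered, not forwarded
        private_ip, private_port = mapping
        return (src_ip, src_port, private_ip, private_port)


if __name__ == "__main__":
    nat = Nat("203.0.113.1")
    # Two private hosts contact the same server from the same source port.
    print(nat.outbound("10.0.0.2", 5000, "198.51.100.7", 80))
    print(nat.outbound("10.0.0.3", 5000, "198.51.100.7", 80))
```

Both principles from the slides are violated here: the private hosts have no globally reachable address of their own, and the box modifies headers and filters unsolicited inbound packets instead of simply forwarding them.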
Clean-Slate Network Architecture
• Clean-slate architecture
  – Without constraints of today’s artifacts
  – To have a stronger intellectual foundation
  – And move beyond the incremental fixes
• Still, some constraints inevitably remain
  – Ignore today’s artifacts, but not necessarily all reality
• Such as…
  – Resource limitations (CPU, memory, bandwidth)
  – Time delays between nodes
  – Independent economic entities
  – Malicious parties
  – The need to evolve over time

Conclusions
• Internet architecture is a huge success
  – Functionality at programmable edge nodes
  – Best-effort packet-delivery service
  – Layering and the IP hourglass model
  – Decentralized control of the global system
• These very features are causing problems
  – Security, mobility, manageability, performance, reliability, …
• Rethinking the network architecture
  – For a strong intellectual foundation
  – And long-term improvements to the Internet