Building a Strong Foundation for a Future Internet
Jennifer Rexford
Princeton University
http://www.cs.princeton.edu/~jrex/talks/stoc08.ppt

The Internet: A Remarkable Story
• Tremendous success
  – A research experiment that truly escaped from the lab
• The brilliance of under-specifying
  – Best-effort packet-delivery service
  – Key functionality at programmable end hosts
• Enabled massive growth and innovation
  – Ease of adding hosts and links, and new technologies
  – Ease of adding new services (Web, P2P, VoIP, …)

Rethinking the Network Architecture
• But, the Internet is showing signs of age
  – Security, mobility, availability, manageability, …
• Challenges rooted in early design decisions
  – Weak notions of identity, tying address to location, …
  – Not a simple matter of redesigning a single protocol
• Revisiting the definition and placement of function
  – What are the types of nodes in the system?
  – What are their powers and limitations?
  – What information do they exchange?
The “computational model” of the (future) Internet.

Clean-Slate Network Architecture
• Clean-slate architecture
  – Without constraints of today’s artifacts
  – To have a stronger intellectual foundation
  – And move beyond the incremental fixes
• Still, some constraints inevitably remain
  – Ignore today’s artifacts, but not necessarily all reality
• Such as…
  – Resource limitations (CPU, memory, bandwidth)
  – Time delays between nodes
  – Independent economic entities
  – Malicious parties
  – The need to evolve over time

A Big Research Challenge
[Figure: three goals arranged around a “?”]
  – Evolvable Protocols (under-specified, programmable)
  – Autonomy (autonomous parties, with different economic objectives)
  – Global Properties (stability, scalability, reliability, security, manageability, …)
Can we have all three? Under what conditions?

A Real Need for a Theory of Networks
• Formal definitions of network architecture
  – Can the theory community do for network architecture what it did for, e.g., cryptography and machine learning?
• Programmability
  – What are good programming models that strike the right balance between flexibility and restraint?
• Incentives
  – How much should we rely on economic incentives to ensure key system properties?
• System properties
  – What are the fundamental trade-offs and bounds?

Example: Wide-Area Internet Routing
• Seemingly a simple matter
  – Computing paths on graphs
• Many, many design goals
  – Global connectivity
  – Flexible local policies
  – Fast recovery from changes
  – Good end-to-end paths
  – Low protocol overhead
  – Security, scalability, …
  – <your wish list here>
• Perhaps we cannot satisfy all of these goals
  – No matter how hard we try…

Four Example Problems in Routing
• Policy-based interdomain routing
  – Programmable routing policies in each network
  – While ensuring global stability, efficiency, …
  – #1: Can economic incentives ensure global stability?
  – #2: How should a distributed network realize its policy?
• End-to-end traffic management
  – Adapting the flow of traffic over each path
  – While ensuring good aggregate performance
  – #3: What should hosts, routers, and operators do?
  – #4: How to support diverse application requirements?
Getting a distributed set of nodes to do the right thing.

Policy-Based Interdomain Routing
+ $$$ = ???

What is an Internet?
• A “network of networks”
  – Networks run by different institutions
• Autonomous System (AS)
  – Collection of routers run by a single institution
  – With a clearly defined routing policy
• ASes have different goals
  – Different views of which paths are good
• Interdomain routing is what reconciles those views
  – To compute end-to-end paths through the Internet
Wonderful problem setting for game theory and mechanism design

Autonomous Systems (ASes)
[Figure: ASes 1 through 7, with a Web server and a client; path: 6, 5, 4, 3, 2, 1]
Around 30,000 ASes today…

Border Gateway Protocol (BGP)
• ASes exchange reachability information
  – Destination: block of IP addresses
  – AS path: sequence of ASes along the path
• Policies “programmed” by network operators
  – Path selection: which path to use?
  – Path export: which neighbors to tell?
[Figure: ASes 1, 2, 3 and destination d; announcements “I can reach d” and “I can reach d via AS 1”; data traffic]

Stable Paths Problem (SPP) Model
• Model of routing policy
  – Each AS has a ranking of the permissible paths
• Model of path selection
  – Pick the highest-ranked path consistent with neighbors
• Flexibility is not free
  – Global system may not converge to a stable assignment
  – Depending on the way the ASes rank their paths
[Figure: nodes 1, 2, 3 and destination d, each node labeled with its ranked paths: node 1: 12d, 1d; node 2: 23d, 2d; node 3: 31d, 3d]
http://portal.acm.org/citation.cfm?id=508332

Ways to Achieve Global Stability
• Detect conflicting rankings of paths?
  – Computationally intractable (NP-hard)
  – Requires global coordination
• Restrict the policy programming languages?
  – In what way? How to require this globally?
  – What if the world should change, and the protocol can’t?
• Rely on economic incentives?
  – Policies typically driven by business relationships
  – E.g., customer-provider and peer-peer relationships
  – Sufficient conditions to guarantee unique, stable solution

Bilateral Business Relationships
• Provider-Customer
  – Customer pays provider for access to the Internet
• Peer-Peer
  – Peers carry traffic between their respective customers
[Figure: ASes 1 through 8 and destination d, with provider-customer and peer-peer edges]
Valid paths: “1 2 d”, “7 d”, “6 4 3 d”, and “8 5 d”
Invalid paths: “5 8 d”, “6 5 d”, and “1 4 3 d”

Act Locally, Prove Globally
• Global topology
  – Provider-customer relationship graph is acyclic
  – Peer-peer relationships between any pairs of ASes
• Route export
  – Do not export routes learned from a peer or provider
  – … to another peer or provider
• Route selection
  – Prefer routes through customers
  – … over routes through peers and providers
• Guaranteed to converge to unique, stable solution
http://www.cs.princeton.edu/~jrex/papers/sigmetrics00.long.pdf

Rough Sketch of the Proof
• Two phases
  – Walking up the customer-provider hierarchy
  – Walking down the provider-customer hierarchy
[Figure: ASes 1 through 8 and destination d, with provider-customer and peer-peer edges]

Trade-offs Between Assumptions
• Three kinds of assumptions
  – Route export, route selection, and global topology
  – Relax one assumption, need to tighten the other two
• Are these assumptions reasonable?
  – Could business practices change over time?
• What if nodes are dishonest about their choices?
  – See Levin-Schapira-Zohar paper in the next session!
• What if the protocol changes?
  – What if the protocol allows multiple paths?
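
As a rough illustration of the route-export and route-selection guidelines from “Act Locally, Prove Globally”, the following Python sketch checks whether a path could have been announced hop by hop. The AS numbers, the relationship table REL, and all function names are hypothetical; they are not taken from the figure on the slides, and the sketch ignores the topology assumption (acyclic provider-customer graph) and the convergence argument itself.

```python
from enum import Enum


class Rel(Enum):
    """How a neighbor relates to the local AS."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def may_export(learned_from: Rel, export_to: Rel) -> bool:
    """Export guideline: a route learned from a peer or provider is passed
    only to customers; a route learned from a customer goes to everyone."""
    return learned_from is Rel.CUSTOMER or export_to is Rel.CUSTOMER


def preference(learned_from: Rel) -> int:
    """Selection guideline: customer routes rank above peer/provider routes."""
    return 1 if learned_from is Rel.CUSTOMER else 0


def path_is_exportable(path, rel) -> bool:
    """Check every intermediate AS on `path` (source first, destination last).

    rel[(a, b)] says what neighbor b is to AS a. Announcements travel from
    the destination toward the source, so AS path[i] learns the route from
    path[i + 1] and must be willing to export it to path[i - 1].
    """
    for i in range(1, len(path) - 1):
        here, toward_source, toward_dest = path[i], path[i - 1], path[i + 1]
        learned_from = rel[(here, toward_dest)]
        export_to = rel[(here, toward_source)]
        if not may_export(learned_from, export_to):
            return False
    return True


# Hypothetical topology (an assumption for this example only):
# AS 1 is the provider of AS 2, AS 3 is the provider of AS 4,
# and AS 3 peers with both AS 1 and AS 5.
REL = {
    (1, 2): Rel.CUSTOMER, (2, 1): Rel.PROVIDER,
    (3, 4): Rel.CUSTOMER, (4, 3): Rel.PROVIDER,
    (1, 3): Rel.PEER, (3, 1): Rel.PEER,
    (3, 5): Rel.PEER, (5, 3): Rel.PEER,
}

if __name__ == "__main__":
    # Up to a provider, across one peer link, down to a customer.
    print(path_is_exportable([2, 1, 3, 4], REL))
    # AS 3 would have to re-export a peer-learned route to another peer.
    print(path_is_exportable([1, 3, 5], REL))
```

Each check uses only an AS's own relationships with its two neighbors on the path, which is the sense in which the guidelines let every AS “act locally”.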

An Incomplete Understanding…
• Desirable global properties
  – Convergence to a unique route assignment
  – Fast convergence after topology changes
  – Honesty in route announcements and packet forwarding
• And how they relate to
  – Topology, policies, path verification, revenue models, …
• And most known results are “sufficient” conditions
  – Lacking complete characterization of the trade-off space
• With basic questions about economic incentives
  – When are they enough? What else do we need?
  – Where do the economic issues really belong?

Ensuring an AS Realizes its Policy
• How should the nodes inside an AS behave?
  – To correctly realize the AS’s routing policy
  – To satisfy the expectations of neighboring ASes
  – To minimize protocol overhead within the AS
• Different problem than interdomain routing
  – Not about reconciling (possibly conflicting) policies
  – But instead about correctly realizing a single policy

The Route Assignment Problem
[Figure: routes R = { r1, r2, r3, …, rn }; route assignment (based on policy) picks e1, e2, …, en from R, with e3 = rn as the example; data traffic]

Propagating Information Within the AS
[Figure: the same route assignment (based on policy) over R = { r1, r2, …, rn }, with internal nodes A and B; e1, e2, …, en from R, with e3 = rn]
What information do A and B need to propagate?

An Incomplete Understanding…
• How to define and model an AS
  – To design and analyze interdomain routing
  – … without regard to the intra-AS details
• How to propagate routing information within an AS
  – So the routers can realize the policy “correctly”
  – … without introducing excessive overhead
• What are the overhead-flexibility trade-offs?
  – How much information must the routers exchange
  – … and how does it depend on the programming model
• How should an AS express (“program”) its policy?

End-to-End Traffic Management

Traffic Management Today
• How much traffic should traverse each path?
[Figure: Operator: Traffic Engineering; End hosts: Congestion Control; Routers: Routing Protocols]

Models and Algorithms for Each Part
• End hosts: congestion control
  – Maximizing aggregate utility over all users
  – Additive increase, multiplicative decrease
• Routers: routing protocols
  – Minimizing path cost as sum of link weights
  – Bellman-Ford and Dijkstra’s algorithms
• Operators: traffic engineering
  – Minimizing load on the network links
  – Local-search algorithms for tuning link weights
But, is the whole more than the sum of its parts?

Shortcoming of Today’s Architecture
• Ignores protocol interactions
  – Congestion control assumes routing is fixed
  – Traffic engineering assumes traffic is inelastic
• Inefficiency of traffic engineering
  – Tuning link weights in shortest-path routing
  – Cannot achieve optimal flow, and is NP-hard
  – … and is typically performed on long timescale
• Only limited use of multiple paths
  – Missed opportunity for better performance
What would a clean-slate redesign look like?

Distributed Traffic Management Problem
• Should have a clearly-stated problem
  – Variables: source rates and path-splitting ratios
  – Constraints: link load staying below capacity
  – Objectives: performance and robustness
\[ \max \; \sum_i U_i(x_i) \;-\; \sum_l f(u_l) \]
  – First term: aggregate utility (as a function of the source rate x_i, over all users i)
  – Second term: aggregate congestion cost (as a function of utilization, u_l, over all links l)
http://www.cs.princeton.edu/~jrex/papers/conext07.pdf

Distributed Traffic Management Problem
• Solutions with well-understood properties
  – Optimality, convergence, low overhead, …
• Distributed load-balancing algorithms
  – Feedback about network conditions
  – Adaptation of the sending rate on each path
[Figure: edge nodes and routers; links labeled s]
  – Edge nodes: Update path rates z; Rate limit incoming traffic
  – Routers: Set up multiple paths; Measure link load; Update link prices s

An Incomplete Understanding…
• Promising initial results
  – Using optimization theory, game theory, control theory…
• Simple tuning of the system
  – Algorithms that are robust across a range of settings?
  – Self-tuning load-balancing algorithms?
• Trade-offs in the number of paths
  – How many paths are really necessary?
  – How should these paths be computed?
• Implicit vs. explicit feedback from the links
  – Can edge nodes adapt based on path-level metrics?
  – Robustness to adversaries trying to bias measurements?

Supporting Multiple Classes of Traffic

Different Strokes for Different Folks
• Applications have different requirements
  – High throughput: bulk file transfers
  – Low delay/jitter: VoIP and gaming
• Could design protocols for each traffic class
  – Using application-specific objective functions
• But, how should these applications co-exist?
  – Multiple customized traffic-management protocols
  – On a shared underlying network
  – To maximize the aggregate utility of the users

Virtualization to the Rescue
• Multiple customized architectures in parallel
  – Multiple virtual nodes on a single physical node
  – Isolation of resources, like CPU and bandwidth
  – Programmability for customizing each “virtual network”

An Incomplete Understanding
• How important are customized architectures?
  – Quantifying the inefficiencies of “one size fits all”
  – Understanding gains and overheads of customization
• How to balance isolation and efficiency?
  – Allowing multiple architectures to run in parallel
  – Without requiring static resource partitioning
• How to support other application requirements?
  – Security/privacy, scalability trade-offs, …
  – With appropriate support in the underlying substrate
• What kind of programming model on the nodes?
  – To enable creation of new networked services
  – Without compromising efficiency, security, …

Virtualization for Economic Refactoring
Today’s Internet: Competing ISPs with different goals must coordinate
Virtualized Internet: Single service provider controls end-to-end path
• Infrastructure providers: Own routers, links, data centers
• Service providers: Offer end-to-end services to users
Economics play out vertically on a coarser timescale.
http://www.cs.princeton.edu/~jrex/papers/cabo-short.pdf

Conclusions
• These are just a few examples
  – In the context of wide-area Internet routing
• Meant to illustrate a larger question
  – Programmability, incentives, and global properties
• And importance of theoretical disciplines
  – In putting network architecture on a sound foundation
• Great opportunities for interdisciplinary research
  – Grappling with problem formulations and solutions
• And for significant practical impact
  – Adding clarity to our understanding of today’s Internet
  – And leading to a future Internet worthy of society’s trust

Acknowledgments
• My research group and collaborators
  – Including the problems mentioned in this talk
  – See pointers to references at end of slides
• Colleagues in the theoretical CS community
  – Especially at AT&T Research and Princeton
  – Forcing greater precision in problem formulation
• Particularly Joan Feigenbaum
  – For great discussions on network architecture
  – … and advice and feedback about this talk
http://www.cs.princeton.edu/~jrex/talks/stoc08.ppt

Thank you!

References: Policy-Based Routing
• T. Griffin, B. Shepherd, and G. Wilfong, “The stable paths problem and interdomain routing”
  – http://portal.acm.org/citation.cfm?id=508332
• L. Gao and J. Rexford, “Stable Internet routing without global coordination”
  – http://www.cs.princeton.edu/~jrex/papers/sigmetrics00.long.pdf
• T. Griffin and J. Sobrinho, “Metarouting”
  – http://www.sigcomm.org/sigcomm2005/paper-GriSob.pdf
• H. Levin, M. Schapira, and A. Zohar, “Interdomain routing and games”
  – In the next session here at STOC’08!

References: Dynamic Traffic Management
• A. Elwalid, C. Jin, S. Low, and I. Widjaja, “MATE: MPLS Adaptive Traffic Engineering”
  – http://ieeexplore.ieee.org/iel5/7321/19795/00916625.pdf
• S. Kandula, D. Katabi, B. Davie, and A. Charny, “Walking the tightrope: Responsive yet stable traffic engineering”
  – http://nms.lcs.mit.edu/papers/index.php?detail=128
• S. Fischer, N. Kammenhuber, and A. Feldmann, “Dynamic traffic engineering based on Wardrop routing policies”
  – http://adetti.iscte.pt/events/CONEXT06/Conext06_Proceedings/papers/f58.pdf
• J. He, M. Suchara, M. Bresler, J. Rexford, and M. Chiang, “Rethinking Internet traffic management: From optimization theory to a practical protocol”
  – http://www.cs.princeton.edu/~jrex/papers/conext07.pdf

References: Network Virtualization
• J. Turner and D. Taylor, “Diversifying the Internet”
  – http://www.arl.wustl.edu/~jst/pubs/globecom05divNet.pdf
• T. Anderson, L. Peterson, S. Shenker, and J. Turner, “Overcoming the Internet impasse through virtualization”
  – http://geni.net/GDD/GDD-05-01.pdf
• N. Feamster, L. Gao, and J. Rexford, “How to lease the Internet in your spare time”
  – http://www.cs.princeton.edu/~jrex/papers/cabo-short.pdf

Backup Slides on Monitoring in an Adversarial Setting

Importance of Monitoring in Routing
• Path-quality monitoring
  – Identify performance problems along a path
  – E.g., to drive load-balancing decisions
  – E.g., to decide what service provider to hire
  – E.g., to identify violations of service-level agreements
• Presence of adversaries
  – Drop, delay, add, or tamper with any packets
  – Bias measurements by preferentially treating probes
  – Out of malice or greed
[Figure: path from Alice to Bob, marked “?”]

Many Variations on the Problem
• Threat model
  – Drop/delay packets only?
  – Also tamper or add?
• Goal of the measurement
  – Loss/delay estimates vs. alarm on a threshold?
  – One-way path measurements vs. round-trip?
  – Verification that packets follow advertised path?
  – Identification of the links/nodes responsible for problem?
• Practical constraints
  – (Not) allowing modification of packets
  – (A)symmetric forward and reverse paths
  – (No) clock synchronization between Alice and Bob
  – Large number of Alice-Bob pairs

An Incomplete Understanding…
• Promising initial results
  – Fundamental bounds
  – Secure sampling
  – Secure sketches
• Challenges remain
  – Complete characterization of the design space
• Larger architectural questions
  – Role of security in routing
    Why should we care if packets follow the (BGP) advertised path?
    Is routing’s only job to provide (multiple) end-to-end paths?
  – Role of monitoring in routing
    Can path-quality monitoring drive load-balancing decisions?
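
The backup slides name “secure sampling” without giving details. The Python sketch below shows one possible reading, under the assumption that Alice and Bob share a secret key and pick the measured packets with a keyed hash, so that an adversary on the path cannot tell which packets count as probes and treat them preferentially. The function names, the key, and the sampling rate are hypothetical, and the sketch is not presented as the scheme behind the results mentioned on the slide.

```python
import hashlib
import hmac


def is_sampled(key: bytes, packet: bytes, rate: float) -> bool:
    """Decide whether a packet is a probe, using a keyed hash of its bytes.

    Without `key`, the sampled subset should look like an arbitrary subset,
    so preferential treatment of probes is hard (assumption of this sketch).
    """
    digest = hmac.new(key, packet, hashlib.sha256).digest()
    value = int.from_bytes(digest[:8], "big")  # 64-bit value from the digest
    return value < int(rate * 2**64)


def estimate_loss(key: bytes, sent, received, rate: float):
    """Estimate the loss rate from the sampled packets only.

    `sent` is what Alice transmitted and `received` is what Bob saw.
    Returns None when no packet was sampled.
    """
    sampled = {p for p in sent if is_sampled(key, p, rate)}
    if not sampled:
        return None
    delivered = set(received)
    lost = sum(1 for p in sampled if p not in delivered)
    return lost / len(sampled)


if __name__ == "__main__":
    key = b"hypothetical-shared-key"
    sent = [f"packet-{i}".encode() for i in range(1000)]
    received = [p for i, p in enumerate(sent) if i % 10 != 0]  # drop every 10th
    print(estimate_loss(key, sent, received, rate=0.2))
```

This covers only the drop case of the threat model; detecting added or tampered packets, or running without shared state between Alice and Bob, would need more than this sketch provides.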