Internet Routing (COS 598A)
Today: Intradomain Topology
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm

Outline
• Router architecture
  – Line cards
  – Switching fabric
  – Router processor
• Network topology
  – From hub-and-spoke to backbones
  – Customer connecting to providers
• Measuring the topology
  – Traceroute probes from many vantage points
  – Associating an IP address with an AS
• Discussion of the papers

What is a Router?
• A computer with…
  – Multiple interfaces
  – Implementing routing protocols
  – Packet forwarding
• Wide range of variations of routers
  – Small LinkSys device in a home network
  – Linux-based PC running router software
  – Million-dollar high-end routers with large chassis
• … and links
  – Serial line
  – Ethernet
  – Packet-over-SONET

Network Components
• Links: fibers, coaxial cable
• Interfaces: Ethernet card, wireless card
• Switches/routers: large router, telephone switch

Inside a High-End Router
[Figure: a processor and line cards attached to a switching fabric]

Router Components: Line Cards
• Interfacing
  – Physical link
  – Switching fabric
• Packet handling
  – Buffer management
  – Link scheduling
  – Packet filtering (ACLs)
  – Packet forwarding (FIB)
  – Rate-limiting
  – Packet marking
  – Measurement
[Figure: line card with receive and transmit paths and a FIB, between the link and the switch]

Router Components: Switching Fabric
• Deliver packet inside the router
  – From incoming interface to outgoing interface
  – A small network in and of itself
• Must operate very quickly
  – Multiple packets going to same outgoing interface
  – Switch scheduling to match inputs to outputs
• Implementation techniques
  – Bus, crossbar, interconnection network, …
  – Running at a faster speed (e.g., 2X) than links
  – Dividing variable-length packets into cells

Router Components: Router Processor
• So-called “Loopback” interface
  – IP address of the CPU on the router
• Control-plane software
  – Implementation of the routing protocols
  – Creation of forwarding table for the line cards
• Interface to network administrators
  – Command-line interface for configuration
  – Transmission of measurement statistics
• Handling of special data packets
  – Packets with IP options enabled
  – Packets with expired Time-To-Live field

Network Topology

Hub-and-Spoke Topology
• Single hub node
  – Common in enterprise networks
  – Main location and satellite sites
  – Simple design and trivial routing
• Problems
  – Single point of failure
  – Bandwidth limitations
  – High delay between sites
  – Costs to backhaul to hub

Simple Alternatives to Hub-and-Spoke
• Dual hub-and-spoke
  – Higher reliability
  – Higher cost
  – Good building block
• Levels of hierarchy
  – Reduce backhaul cost
  – Aggregate the bandwidth
  – Shorter site-to-site delay
…

Backbone Networks
• Backbone networks
  – Multiple Points-of-Presence (PoPs)
  – Lots of communication between PoPs
  – Need to accommodate diverse traffic demands
  – Need to limit propagation delay
[Figure: Abilene Internet2 backbone]

Points-of-Presence (PoPs)
• Inter-PoP links
  – Long distances
  – High bandwidth
• Intra-PoP links
  – Short cables between racks or floors
  – Aggregated bandwidth
• Links to other networks
  – Wide range of media and bandwidth
[Figure: a PoP with inter-PoP links, intra-PoP links, and links to other networks]

Deciding Where to Locate Nodes and Links
• Placing Points-of-Presence (PoPs)
  – Large population of potential customers
  – Other providers or exchange points
  – Cost and availability of real-estate
  – Mostly in major metropolitan areas
• Placing links between PoPs
  – Already fiber in the ground
  – Needed to limit propagation delay
  – Needed to handle the traffic load

Customer Connecting to a Provider
[Figure: four ways a customer connects to a provider: 1 access link; 2 access routers; 2 access links; 2 access PoPs]

Multi-Homing: Two or More Providers
• Motivations for multi-homing
  – Extra reliability, survive single ISP failure
  – Financial leverage through competition
  – Better performance by selecting better path
  – Gaming the 95th-percentile billing model
[Figure: a customer connected to Provider 1 and Provider 2]

Measuring the Topology

Motivation for Measuring the Topology
• Business analysis
  – Comparisons with competitors
  – Selecting a provider or peer
• Scientific curiosity
  – Treating data networks like an organism
  – Understand structure and evolution of Internet
• Input to research studies
  – Network design, routing protocols, …
• Interesting research problem in its own right
  – How to measure/infer the topology

Basic Idea: Measure from Many Angles
[Figure: probes sent from Source 1 and Source 2]

Where to Get Sources and Destinations?
• Source machines
  – Get accounts in many places
      • Good to have a lot of friends
  – Use an infrastructure like PlanetLab
      • Good to have friends who have lots of friends
  – Use public traceroute servers (nicely)
      • http://www.traceroute.org
• Destination addresses
  – Walk through the IP address space
      • One (or a few) IP addresses per prefix
  – Learn destination prefixes from public BGP tables
      • http://www.route-views.org

Traceroute: Measuring the Forwarding Path
• Time-To-Live field in IP packet header
  – Source sends a packet with a TTL of n
  – Each router along the path decrements the TTL
  – “TTL exceeded” sent when TTL reaches 0
• Traceroute tool exploits this TTL behavior
[Figure: source sends probes with TTL=1 and TTL=2 toward the destination; routers reply with “time exceeded”]
Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message

Example Traceroute Output (Berkeley to CNN)
Hop  IP address        DNS name
 1   169.229.62.1      inr-daedalus-0.CS.Berkeley.EDU
 2   169.229.59.225    soda-cr-1-1-soda-br-6-2
 3   128.32.255.169    vlan242.inr-202-doecev.Berkeley.EDU
 4   128.32.0.249      gigE6-0-0.inr-666-doecev.Berkeley.EDU
 5   128.32.0.66       qsv-juniper--ucb-gw.calren2.net
 6   209.247.159.109   POS1-0.hsipaccess1.SanJose1.Level3.net
 7   *                 ?
 8   64.159.1.46       ?
 9   209.247.9.170     pos8-0.hsa2.Atlanta2.Level3.net
10   66.185.138.33     pop2-atm-P0-2.atdn.net
11   *                 ?
12   66.185.136.17     pop1-atl-P4-0.atdn.net
13   64.236.16.52      www4.cnn.com
Notes: “*” means no response from the router; “?” means no name resolution.

Problems with Traceroute
• Missing responses
  – Routers might not send “Time-Exceeded”
  – Firewalls may drop the probe packets
  – “Time-Exceeded” reply may be dropped
• Misleading responses
  – Probes taken while the path is changing
  – Name not in DNS, or DNS entry misconfigured
• Mapping IP addresses
  – Mapping interfaces to a common router
  – Mapping interface/router to Autonomous System
• Angry operators who think this is an attack

Map Traceroute Hops to ASes
Traceroute output: (hop number, IP), with the AS for each hop
Hop  IP address        AS        Network
 1   169.229.62.1      AS25      Berkeley
 2   169.229.59.225    AS25      Berkeley
 3   128.32.255.169    AS25      Berkeley
 4   128.32.0.249      AS25      Berkeley
 5   128.32.0.66       AS11423   Calren
 6   209.247.159.109   AS3356    Level3
 7   *                 AS3356    Level3
 8   64.159.1.46       AS3356    Level3
 9   209.247.9.170     AS3356    Level3
10   66.185.138.33     AS1668    AOL
11   *                 AS1668    AOL
12   66.185.136.17     AS1668    AOL
13   64.236.16.52      AS5662    CNN
Need accurate IP-to-AS mappings (for network equipment).

Candidate Ways to Get IP-to-AS Mapping
• Routing address registry
  – Voluntary public registry such as whois.radb.net
  – Used by prtraceroute and “NANOG traceroute”
  – Incomplete and quite out-of-date
      • Mergers, acquisitions, delegation to customers
• Origin AS in BGP paths
  – Public BGP routing tables such as RouteViews
  – Used to translate traceroute data to an AS graph
  – Incomplete and inaccurate… but usually right
      • Multiple Origin ASes (MOAS), no mapping, wrong mapping

Example: BGP Table (“show ip bgp” at RouteViews)
   Network          Next Hop        Metric LocPrf Weight Path
*  3.0.0.0/8        205.215.45.50                      0 4006 701 80 i
*                   167.142.3.6                        0 5056 701 80 i
*                   157.22.9.7                         0 715 1 701 80 i
*                   195.219.96.239                     0 8297 6453 701 80 i
*                   195.211.29.254                     0 5409 6667 6427 3356 701 80 i
*>                  12.127.0.249                       0 7018 701 80 i
*                   213.200.87.254     929             0 3257 701 80 i
*  9.184.112.0/20   205.215.45.50                      0 4006 6461 3786 i
*                   195.66.225.254                     0 5459 6461 3786 i
*>                  203.62.248.4                       0 1221 3786 i
*                   167.142.3.6                        0 5056 6461 6461 3786 i
*                   195.219.96.239                     0 8297 6461 3786 i
*                   195.211.29.254                     0 5409 6461 3786 i
AS 80 is General Electric, AS 701 is UUNET, AS 7018 is AT&T, AS 3786 is DACOM (Korea), AS 1221 is Telstra

Refining Initial IP-to-AS Mapping
• Start with initial IP-to-AS mapping
  – Mapping from BGP tables is usually correct
  – Good starting point for computing the mapping
• Collect many BGP and traceroute paths
  – Signaling and forwarding AS path usually match
  – Good way to identify mistakes in IP-to-AS map
• Successively refine the IP-to-AS mapping
  – Find add/change/delete that makes big difference
  – Base these “edits” on operational realities
http://www.cs.princeton.edu/~jrex/papers/sigcomm03.pdf
http://www.cs.princeton.edu/~jrex/papers/infocom04.pdf

Extra AS due to Internet eXchange Points
• IXP: shared place where providers meet
  – E.g., Mae-East, Mae-West, PAIX
  – Large number of fan-in and fan-out ASes
[Figure: traceroute AS path compared with BGP AS path among ASes A through G]
Ignore the extra traceroute AS hop that has high fan-in and fan-out

Extra AS due to Sibling ASes
• Sibling: organizations with multiple ASes
  – E.g., Sprint AS 1239 and AS 1791
  – One AS numbers its equipment with addresses of another
[Figure: traceroute AS path compared with BGP AS path among ASes A through H]
Merge sibling ASes: they “belong together” as if they were one AS.
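
The mapping steps above (take the origin AS from BGP paths, translate traceroute hops to ASes, then drop IXP hops and merge siblings) can be sketched roughly as follows. This is an illustrative sketch, not the algorithm from the papers: the function names are made up for this example, the IXP set is assumed to be already known (the slides identify IXPs by high fan-in and fan-out, which is not implemented here), and the prefixes 198.51.100.0/24 and 203.0.113.0/24 and AS 64500 are documentation values chosen as placeholders. Only 3.0.0.0/8, 9.184.112.0/20, and the Sprint sibling pair come from the slides.

```python
import ipaddress


def origin_as(as_path):
    """Return the origin AS: the last AS number in a 'show ip bgp' path string."""
    numbers = [tok for tok in as_path.split() if tok.isdigit()]
    return int(numbers[-1]) if numbers else None


def build_ip_to_as(routes):
    """Build a list of (network, origin AS) pairs from (prefix, AS path) pairs."""
    table = []
    for prefix, as_path in routes:
        asn = origin_as(as_path)
        if asn is not None:
            table.append((ipaddress.ip_network(prefix), asn))
    return table


def lookup(table, ip):
    """Longest-prefix match: return the AS of the most specific covering prefix."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return None if best is None else best[1]


def traceroute_as_path(table, hops, siblings=None, ixps=()):
    """Translate traceroute hops to an AS path.

    hops: IP strings, with None for a hop that did not respond ("*").
    siblings: maps an AS number to the AS it should be merged into.
    ixps: AS numbers to ignore (assumed to be identified beforehand).
    """
    siblings = siblings or {}
    path = []
    for ip in hops:
        if ip is None:
            continue  # unresponsive hop
        asn = lookup(table, ip)
        if asn is None or asn in ixps:
            continue  # unmapped address or IXP hop
        asn = siblings.get(asn, asn)
        if not path or path[-1] != asn:
            path.append(asn)  # collapse consecutive hops in the same AS
    return path


ROUTES = [
    ("3.0.0.0/8", "7018 701 80 i"),      # from the BGP table slide
    ("9.184.112.0/20", "1221 3786 i"),   # from the BGP table slide
    ("198.51.100.0/24", "64500 i"),      # hypothetical IXP prefix and AS
    ("203.0.113.0/24", "1791 i"),        # hypothetical prefix originated by AS 1791
]
TABLE = build_ip_to_as(ROUTES)
HOPS = ["3.1.1.1", None, "198.51.100.7", "203.0.113.9", "9.184.112.5"]

raw = traceroute_as_path(TABLE, HOPS)
refined = traceroute_as_path(TABLE, HOPS, siblings={1791: 1239}, ixps={64500})
print(raw)      # [80, 64500, 1791, 3786]
print(refined)  # [80, 1239, 3786]
```

The linear scan in `lookup` keeps the sketch short; a real table with hundreds of thousands of prefixes would more likely use a trie.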

Unannounced Infrastructure Addresses
• C does not announce part of its address space in BGP (e.g., 12.1.2.0/24)
[Figure: ASes A, B, and C with prefix 12.0.0.0/8; the path labels are garbled in extraction]
Fix the IP-to-AS map to associate 12.1.2.0/24 with C

Improving the IP-to-AS Mapping
• Algorithm for modifying the IP-to-AS map
  – Small number of rules for modifying the map
  – Making small changes that make a big difference
• Results of the algorithm
  – Changes about 2.9% of mappings
  – Much better agreement (95%) with BGP AS paths
• Validation
  – AT&T router configuration data
  – Whois queries to verify sibling ASes
  – List of known Internet eXchange Points

Exploring the Remaining Mismatches
• Route aggregation
  – BGP path: B C
  – Traceroute path: B C D
  – Traceroute AS path longer in 20% of mismatches
  – Different paths for destinations in same prefix
  [Figure: ASes B, C, D, and E]
• Interface numbering at AS boundaries
  – BGP path: B C D
  – Traceroute path: B D
  – Boundary links numbered from one AS
  – Verified cases where AT&T (AS 7018) is involved
  [Figure: ASes B, C, and D]

Discussion of the Two Papers
• Measuring ISP topologies with RocketFuel
  – Measure judiciously
  – First view of ISP topologies
  – PoP structure, inter-PoP graphs, peering, …
  – Good? Bad? What areas for future work?
• First-principles of router-level topology
  – Explain the high variability in router degree
  – Technological limits on switching capacity
  – Many low-speed links at edge, few large in core
  – High variability at edge due to economics
  – Good? Bad? What areas for future work?
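
Returning to the unannounced-address fix from earlier in this part of the lecture: one way to see why a single map edit can matter is to measure how often traceroute-derived AS paths agree with BGP AS paths before and after the edit. The sketch below is a toy illustration under stated assumptions, not the algorithm from the papers. The AS labels A, B, and C and the prefixes 12.0.0.0/8 and 12.1.2.0/24 follow the slide; assigning 12.0.0.0/8 to A, the prefixes for B and C (documentation ranges), the sample paths, and all function names are hypothetical.

```python
import ipaddress


def to_as(ip_to_as, ip):
    """Longest-prefix match of an IP address against a {prefix: AS label} map."""
    addr = ipaddress.ip_address(ip)
    matches = []
    for prefix, label in ip_to_as.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, label))
    return max(matches)[1] if matches else None


def as_path(ip_to_as, hops):
    """Map hops to AS labels and collapse consecutive repeats."""
    path = []
    for ip in hops:
        label = to_as(ip_to_as, ip)
        if label is not None and (not path or path[-1] != label):
            path.append(label)
    return path


def agreement(ip_to_as, samples):
    """Fraction of samples whose traceroute AS path equals the BGP AS path."""
    same = sum(1 for hops, bgp in samples if as_path(ip_to_as, hops) == bgp)
    return same / len(samples)


# Initial map taken from announced prefixes (B and C prefixes are hypothetical).
initial = {
    "12.0.0.0/8": "A",
    "198.51.100.0/24": "B",
    "192.0.2.0/24": "C",
}

# Each sample: (traceroute hop addresses, BGP AS path for the same destination).
# 12.1.2.1 stands for a router in C numbered from space that C does not announce.
samples = [
    (["198.51.100.1", "12.1.2.1", "192.0.2.9"], ["B", "C"]),
    (["12.0.0.1", "12.1.2.1", "192.0.2.9"], ["A", "C"]),
]

# The edit from the slide: associate 12.1.2.0/24 with C.
fixed = dict(initial)
fixed["12.1.2.0/24"] = "C"

print(agreement(initial, samples))  # 0.5
print(agreement(fixed, samples))    # 1.0
```

With the initial map, the probe from B appears to cross A (path B A C) even though BGP says B C; the more specific entry removes the spurious hop.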

Some Project Ideas
• Accuracy of router-level mapping
  – Apply traceroute to map out the Abilene network
  – Use PlanetLab nodes for many vantage points
  – Verify against the actual topology of the network
• Influence of inaccuracy in router-level maps
  – Characterize the types of inaccuracy that arise
  – Determine the influence on key graph metrics
  – Identify ways to limit the effects of inaccuracy
• Design better router support for measurement
  – To support topology discovery, troubleshooting, …
  – Be cognizant of need to be efficient, not used for attacks, not reveal too-sensitive information, etc.

Reading for Thursday: AS-Level Topology
• Two papers, and one video
  – “Toward capturing representative AS-level Internet topologies”
  – “Interconnection, peering, and settlements”
  – NANOG video on evolution of Internet peering
• One-page review of first paper (hard-copy)
  – Brief summary of the paper
  – Reasons to accept the paper
  – Reasons to reject the paper
  – Three suggestions for future research directions
• Optional reading
  – Should computer scientists experiment more?