Overview: Routing

advertisement

Interdomain Routing

Jennifer Rexford

Advanced Computer Networks http://www.cs.princeton.edu/courses/archive/fall08/cos561/

Tuesdays/Thursdays 1:30pm-2:50pm

Outline

• Interdomain routing

– Autonomous Systems (ASes)

• Path-vector routing

– Loop detection and convergence

– Flexible path selection

• Business relationships

– Customer-provider and peer-peer

– Hierarchy from tier-1 ASes to stubs

• Border Gateway Protocol (BGP)

– Announcements and withdrawals

– Import and export policies

• Discussion of Paxson96 and Labovitz97

Interdomain Routing: Between Networks

• AS-level topology

– Nodes are Autonomous Systems (ASes)

– Destinations are prefixes (e.g., 12.0.0.0/8)

– Edges are links and business relationships

4

3

2

1

Client

7

5

6

Web server

AS Numbers (ASNs)

ASNs are 16 bit values.

64512 through 65535 are “private”

Currently around 30,000 in use.

• Level 3: 1

• MIT: 3

• Harvard: 11

• Yale: 29

• Princeton: 88

• AT&T: 7018, 6341, 5074, …

• UUNET: 701, 702, 284, 12199, …

• Sprint: 1239, 1240, 6211, 6242, …

• …

ASNs represent units of routing policy

Challenges for Interdomain Routing

• Scale

– Prefixes: 250,000, and growing

– ASes: 30,000, and growing

– Routers: at least in the millions…

• Privacy

– ASes don’t want to divulge internal topologies

– … or their business relationships with neighbors

• Policy

– No Internet-wide notion of a link cost metric

– Need control over where you send traffic

– … and who can send traffic through you

Policy-Based Path-Vector Routing

Path-Vector Routing

• Extension of distance-vector routing

– Support flexible routing policies

– Reduce convergence time (avoid count-to-infinity)

• Key idea: advertise the entire path

– Distance vector: send distance metric per dest d

– Path vector: send the entire path for each dest d

3

“d: path (2,1)” data traffic

2

“d: path (1)” data traffic

1 d

Faster Loop Detection

• Node can easily detect a loop

– Look for its own node identifier in the path

– E.g., node 1 sees itself in the path “3, 2, 1”

• Node can simply discard paths with loops

– E.g., node 1 simply discards the advertisement

3

“d: path (2,1)”

2

“d: path (3,2,1)”

“d: path (1)”

1

Flexible Policies

• Each node can apply local policies

– Path selection: Which path to use?

– Path export: Whether to advertise the path?

• Examples

– Node 2 may prefer the path “2, 3, 1” over “2, 1”

– Node 1 may not let node 3 hear the path “1, 2”

2

3 2

3

1 1

Business Relationships

Business Relationships

• Neighboring ASes have business contracts

– How much traffic to carry

– Which destinations to reach

– How much money to pay

• Common business relationships

– Customer-provider

• E.g., Princeton is a customer of USLEC

• E.g., MIT is a customer of Level3

– Peer-peer

• E.g., UUNET is a peer of Sprint

• E.g., Harvard is a peer of Harvard Business School

Customer-Provider Relationship

• Customer needs to be reachable from everyone

– Provider tells all neighbors how to reach the customer

• Customer does not want to provide transit service

– Customer does not let its providers route through it

Traffic to the customer Traffic from the customer d provider announcements provider traffic customer d customer

Multi-Homing: Two or More Providers

• Motivations for multi-homing

– Extra reliability, survive single ISP failure

– Financial leverage through competition

– Better performance by selecting better path

– Gaming the 95 th -percentile billing model

Provider 1 Provider 2

Princeton Example

• Internet: customer of USLEC and Patriot

• Research universities/labs: customer of Internet2

• Local non-profits: provider for several non-profits

Patriot USLEC Internet2

Peer-Peer Relationship

• Peers exchange traffic between customers

– AS exports only customer routes to a peer

– AS exports a peer’s routes only to its customers

– Often the relationship is settlement-free (i.e., no $$$)

Traffic to/from the peer and its customers peer announcements traffic d peer

AS Structure: Tier-1 Providers

• Tier-1 provider

– Has no upstream provider of its own

– Typically has a national or international backbone

• Top of the Internet hierarchy of ~10 ASes

– AOL, AT&T, Global Crossing, Level3, UUNET, NTT, Qwest,

SAVVIS (formerly Cable & Wireless), and Sprint

– Full peer-peer connections between tier-1 providers

AS Structure: Other ASes

• Other providers

– Provide transit service to downstream customers

– … but, need at least one provider of their own

– Typically have national or regional scope

– Includes several thousand ASes

• Stub ASes

– Do not provide transit service to others

– Connect to one or more upstream providers

– Includes vast majority (e.g., 85-90%) of the ASes

Border Gateway Protocol

Border Gateway Protocol

• Prefix-based path-vector protocol

• Policy-based routing based on AS Paths

• Evolved during the past 18 years

• 1989 : BGP-1 [RFC 1105], replacement for EGP

• 1990 : BGP-2 [RFC 1163]

• 1991 : BGP-3 [RFC 1267]

• 1995 : BGP-4 [RFC 1771], support for CIDR

• 2006 : BGP-4 [RFC 4271], update

BGP Operations

Establish session on

TCP port 179

AS1

BGP session

Exchange all active routes

Exchange incremental updates

AS2

While connection is ALIVE exchange route UPDATE messages

Incremental Protocol

• A node learns multiple paths to destination

– Stores all of the routes in a routing table

– Applies policy to select a single active route

– … and may advertise the route to its neighbors

• Incremental updates

– Announcement

• Upon selecting a new active route, add node id to path

• … and (optionally) advertise to each neighbor

– Withdrawal

• If the active route is no longer available

• … send a withdrawal message to the neighbors

128.112.0.0/16

AS Path = 1239 7018 88

AS 1239

Sprint

128.112.0.0/16

AS Path = 88

AS 88

Princeton

128.112.0.0/16

Prefix Originated

ASPATH Attribute

128.112.0.0/16

AS Path = 1755 1239 7018 88

AS 1129

Global Access

AS 1755

Ebone

128.112.0.0/16

AS Path = 1129 1755 1239 7018 88

128.112.0.0/16

AS Path = 7018 88

AS 12654

RIPE NCC

RIS project

AS7018

AT&T

128.112.0.0/16

AS Path = 7018 88

128.112.0.0/16

AS Path = 3549 7018 88

AS 3549

Global Crossing

BGP Policy: Applying Policy to Routes

• Import policy

– Filter unwanted routes from neighbor

• E.g. prefix that your customer doesn’t own

– Manipulate attributes to influence path selection

• E.g., assign local preference to favored routes

• Export policy

– Filter routes you don’t want to tell your neighbor

• E.g., don’t tell a peer a route learned from other peer

– Manipulate attributes to control what they see

• E.g., make a path look artificially longer than it is

BGP Policy: Influencing Decisions

Receive

BGP

Updates

Open ended programming.

Constrained only by vendor configuration language

Apply Policy = filter routes & tweak attributes

Based on

Attribute

Values

Best

Routes

Apply Policy = filter routes & tweak attributes

Transmit

BGP

Updates

Apply Import

Policies

Best Route

Selection

Best Route

Table

Apply Export

Policies

Install forwarding

Entries for best

Routes.

IP Forwarding Table

Import Policy: Local Preference

• Favor one path over another

– Override the influence of AS path length

– Apply local policies to prefer a path

• Example: prefer customer over peer

Local-pref = 90

AT&T

Sprint

Local-pref = 100

Tier-2

Tier-3 Yale

Import Policy: Filtering

• Discard some route announcements

– Detect configuration mistakes and attacks

• Examples on session to a customer

– Discard route if prefix not owned by the customer

– Discard route that contains other large ISP in AS path

Patriot

Princeton

128.112.0.0/16

USLEC

Export Policy: Filtering

• Discard some route announcements

– Limit propagation of routing information

• Examples

– Don’t announce routes from one peer to another

UUNET AT&T

Sprint

Export Policy: Filtering

• Discard some route announcements

– Limit propagation of routing information

• Examples

– Don’t announce routes for network-management hosts or the underlying routers themselves

USLEC network operator

Princeton

Export Policy: Attribute Manipulation

• Modify attributes of the active route

– To influence the way other ASes behave

• Example: AS prepending

– Artificially inflate the AS path length seen by others

– To convince some ASes to send traffic another way

Sprint

Patriot

88 88

Princeton

128.112.0.0/16

88

USLEC

BGP Policy Configuration

• Routing policy languages are vendor-specific

– Not part of the BGP protocol specification

– Different languages for Cisco, Juniper, etc.

• Still, all languages have some key features

– Policy as a list of clauses

– Each clause matches on route attributes

– … and either discards or modifies matching routes

• Configuration often done by human operators

– Implementing the policies of their AS

– Biz relationships, traffic engineering, security, …

Measuring Internet Routing

Motivations for Measuring the Routing System

• Characterizing the Internet

– Internet path properties

– Demands on Internet routers

– Routing convergence

• Improving Internet health

– Protocol design problems

– Protocol implementation problems

– Configuration errors or attacks

• Operating a network

– Detecting and diagnosing routing problems

– Traffic shifts, routing attacks, flaky equipment, …

Techniques for Measuring Internet Routing

• Active probing

– Inject probes along path through the data plane

– E.g., using traceroute

• Passive route monitoring

– Capture control-plane messages between routers

– E.g., using tcpdump or a software router

– E.g., dumping the routing table on a router

• Injecting network events

– Cause failure/recovery at planned time and place

– E.g., BGP route beacon, or planned maintenance

Internet Routing is Hard to Measure

• Nobody knows the Internet topology

– No central registry of the AS-level graph

– Little public information about intra-AS topologies

• Deploying monitoring infrastructure is hard

– Forwarding: active probes of end-to-end paths

– Routing: passive monitoring of routing messages

• Many measurement challenges

– Network conditions vary by location

– Network conditions change over time

– One-way measurements are hard to collect

– Controlled experiments are hard to do

Two Papers for Today

• Both early measurement studies of routing

– Initially appeared at SIGCOMM’96 and ’97

– Both won the “best student paper” award 

• And recently won the SIGCOMM “test of time” award!

– Early glimpses into the health of Internet routing

– Early wave of papers on Internet measurement

• Differences in emphasis

– Paxson96: end-to-end active probing to measure the characteristics of the data plane

– Labovitz97: passive monitoring of BGP update messages from several ISPs to characterize

(in)stability of the interdomain routing system

Backup Slides Summarizing

Paxson96 and Labovitz97

Active Measurement: Traceroute

• Time-To-Live field in IP packet header

– Source sends a packet with a TTL of n

– Each router along the path decrements the TTL

– “TTL exceeded” sent when TTL reaches 0

• Traceroute tool exploits this TTL behavior

Time exceeded

TTL=1 source

TTL=2 destination

Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message

Paxson Study: Forwarding Loops

• Forwarding loop

– Packet returns to same router multiple times

• May cause traceroute to show a loop

– If loop lasted long enough

– So many packets traverse the loopy path

• Traceroute may reveal false loops

– Path change that leads to a longer path

– Causing later probe packets to hit same nodes

• Heuristic solution

– Require traceroute to return same path 3 times

Paxson Study: Causes of Loops

• Transient vs. persistent

– Transient: routing-protocol convergence

– Persistent: likely configuration problem

• Challenges

– Appropriate time boundary between the two?

– What about flaky equipment going up and down?

– Determining the cause of persistent loops?

• Causes of persistent loops

– E.g., misconfiguration

12.1.2.0/24 0.0.0.0/0

Paxson Study: Path Fluttering

• Rapid changes between paths

– Multiple paths between a pair of hosts

– Load balancing policies inside the network

• Packet-based load balancing

– Round-robin or random

– Multiple paths for packets in a single flow

• Flow-based load balancing

– Hash of some fields in the packet header

– E.g., IP addresses, port numbers, etc.

– To keep packets in a flow on one path

Paxson Study: Routing Stability

• Route prevalence

– Likelihood of observing a particular route

– Relatively easy to measure with sound sampling

– Poisson arrivals see time averages (PASTA)

– Most host pairs have a dominant route

• Route persistence

– How long a route endures before a change

– Much harder to measure through active probes

– Look for cases of multiple observations

– Typical host pair has path persistence of a week

• Hot-potato routing

Customer B

Paxson Study: Route Asymmetry

Provider B

• Other causes

– Asymmetric link weights in intradomain routing

– Cold-potato routing, where AS requests traffic enter at particular place multiple peering points

Provider A

Early-exit routing

• Consequences

– Lots of asymmetry

– One-way delay is not necessarily half of the round-trip time

Customer A

Labovitz Study: Surprising Statistics

• BGP is an incremental protocol

– In theory, no update messages in steady state

• Two kinds of update messages

– Announcement: advertising a new route

– Withdrawal: withdrawing an old route

• Study saw an alarming number of updates

– At the time, Internet had around 45,000 prefixes

– Routers were exchanging 3-6 million updates/day

– Sometimes as high as 30 million in a day

• Placing a very high load on the routers

Labovitz Study: Classifying Update Messages

• Analyze update messages

– For each (prefix, peer) tuple

– Classify the kinds of routing changes

• Forwarding instability

– WADiff: explicit withdraw, replaced by alternate

– AADiff: implict withdraw, replaced by alternate

• Pathological

– WADup: explicit withdraw, and then reanounced

– AADup: duplicate announcement

– WWDup: duplicate withdrawal

Labovitz Study: Duplicate Withdrawals

• Time-space trade-off in router implementation

– Common system building technique

– Trade one resource for another

– Can have surprising side effects

• The gory details

– Ideally, you should not send a withdrawal if you never sent a neighbor a corresponding announcement

– Requires remembering what update message you sent to each neighbor

– Easier to just send everyone a withdrawal when your route goes away

Labovitz Study: Practical Impact

• “Stateless BGP” is compliant with the standard

– But, it forces other routers to handle more load

– So that you don’t have to maintain state

– Arguably very unfair, and bad for global Internet

• One router vendor was largely at fault

– Router vendor modified its implementation

– ISPs then deployed the updated software

Labovitz Study: Still Hard to Diagnose Problems

• Despite having very detailed view into BGP

– Some pathologies were very hard to diagnose

• Possible causes

– Flaky equipment

– Synchronization of BGP timers

– Interaction between BGP and intradomain routing

– Policy oscillation

• Topics addressed more in subsequent work

Download