
Internet Routing (COS 598A)
Today: Non-Convergence: Policy Conflicts
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm
Outline
• Stable Paths Problem
– The problem BGP is solving
– Abstract model for BGP
– Translating reality into SPP
• Conflicting routing policies
– Examples of policy conflicts
– Difficulty of identifying conflicts
• Guaranteeing convergence
– Guidelines based on business relationships
– Provable convergence without global control
• Recent work and a project idea
What Problem Does a Routing Protocol Solve?
• Most do shortest-path routing
– Shortest hop count
• Distance vector routing (e.g., RIP)
– Shortest path as sum of link weights
• Link-state routing (e.g., OSPF and IS-IS)
• Policy makes BGP more complicated
– An AS might not tell a neighbor about a path
• E.g., Sprint can’t reach UUNET through AT&T
– An AS might prefer one path over a shorter one
• E.g., ISP prefers to send traffic through a customer
What is a good model for BGP?
Could Use A Simulation Model
• Simulate the message passing
– Advertisements and withdrawals
– Message format
– Timers
• Simulate the routing policy on each session
– Filter certain route advertisements
– Manipulate the attributes of others
• Simulate the decision process
– Each router applying all the steps per prefix
Feasible, but tedious and ill-suited for formal arguments
Stable Paths Problem (SPP) Instance
• Node
– BGP-speaking router
– Node 0 is destination
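• Edge
– BGP adjacency
• Permitted paths
– Set of routes to 0 at each node
– Ranking of the paths
[Figure: example SPP instance on nodes 0 to 5. Permitted paths, listed from most preferred to least preferred: node 1: 130, 10; node 2: 210, 20; node 3: 30; node 4: 420, 430; node 5: 5210.]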
A Solution to a Stable Paths Problem
• Solution
– Path assignment per node
– Can be the “null” path
• If node u has path uwP
– {u,w} is an edge in the graph
– Node w is assigned path wP
• Each node is assigned
– The highest ranked path consistent with the assignment of its neighbors
[Figure: the example SPP instance, with ranked permitted paths node 1: 130, 10; node 2: 210, 20; node 3: 30; node 4: 420, 430; node 5: 5210.]
A solution need not represent a shortest path tree, or a spanning tree.
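The solution conditions can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes paths are encoded as tuples of node ids ending at destination 0, that `()` stands for the null path, and that the path lists read from the example figure are in rank order. The names `PERMITTED`, `best_consistent_path`, and `is_solution` are illustrative.

```python
# Ranked permitted paths (most preferred first), as read from the example figure.
PERMITTED = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
    5: [(5, 2, 1, 0)],
}

def best_consistent_path(node, assignment, permitted):
    """Highest-ranked path uwP at `node` such that neighbor w is assigned wP."""
    for path in permitted[node]:
        next_hop, rest = path[1], path[1:]
        # A direct path to the destination is always consistent.
        if next_hop == 0 or assignment.get(next_hop) == rest:
            return path
    return ()  # no permitted path is consistent: the null path

def is_solution(assignment, permitted):
    """True if every node holds its best path consistent with its neighbors."""
    return all(
        assignment.get(node, ()) == best_consistent_path(node, assignment, permitted)
        for node in permitted
    )

candidate = {1: (1, 3, 0), 2: (2, 0), 3: (3, 0), 4: (4, 2, 0), 5: ()}
print(is_solution(candidate, PERMITTED))  # True
```

In this candidate, node 1 uses the longer path 130 and node 5 holds the null path, so the assignment is neither a shortest path tree nor a spanning tree.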
Translating a Real Configuration into SPP
• Permitted paths at a node
– Composition of export policies at other nodes
[Figure: node 0 exports its route to node 2; node 1 exports “1 0” to node 2; node 2 (permitted paths 210, 20) exports “2 1 0” but not “2 0”, leaving 5210 as the permitted path at node 5.]
• Ranking of paths at a node
– Import policies at the node
– Rank in terms of BGP decision process (i.e., local preference, AS path length, origin type, MED, …)
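As a rough illustration of how a ranking could be derived from route attributes, the sketch below sorts candidate routes by a simplified decision key. It is an assumption-laden simplification: the `Route` class and its default values are hypothetical, only four steps of the decision process are modeled, and MED is compared across all routes, whereas BGP normally compares MED only among routes from the same neighboring AS.

```python
from dataclasses import dataclass

# Lower value = more preferred origin type.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

@dataclass(frozen=True)
class Route:
    path: tuple            # e.g. (2, 1, 0): node 2 via node 1 to destination 0
    local_pref: int = 100  # hypothetical default
    origin: str = "igp"
    med: int = 0

def decision_key(route):
    """Higher local preference, then shorter path, then origin, then lower MED."""
    return (-route.local_pref, len(route.path), ORIGIN_RANK[route.origin], route.med)

def spp_ranking(routes):
    """Ranked list of paths (most preferred first) for one SPP node."""
    return [r.path for r in sorted(routes, key=decision_key)]

# An import policy at node 2 that raises local preference for routes via node 1.
node2_routes = [Route((2, 0)), Route((2, 1, 0), local_pref=200)]
print(spp_ranking(node2_routes))  # [(2, 1, 0), (2, 0)]
```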
An SPP May Have Multiple Solutions
[Figure: nodes 0, 1, and 2 with ranked permitted paths node 1: 120, 10; node 2: 210, 20. Three panels: the instance, “First solution”, and “Second solution”.]
An SPP May Have No Solution
[Figure: node labels 0, 1, 2, 3, 4; ranked permitted paths node 1: 130, 10; node 2: 210, 20; node 3: 320, 30.]
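For instances this small, the set of solutions can be enumerated by brute force. The sketch below is illustrative only: it tries every combination of permitted paths (plus the null path `()`) and keeps the stable ones. `TWO_SOLUTIONS` encodes the two-node instance from the previous slide, and `NO_SOLUTION` encodes only nodes 1 to 3 of this slide, which is an assumption about the figure. Exhaustive search is exponential in the number of nodes.

```python
from itertools import product

def best_consistent_path(node, assignment, permitted):
    """Highest-ranked path at `node` consistent with its next hop's assignment."""
    for path in permitted[node]:
        if path[1] == 0 or assignment.get(path[1]) == path[1:]:
            return path
    return ()

def solutions(permitted):
    """All stable path assignments, found by exhaustive search."""
    nodes = sorted(permitted)
    choices = [permitted[n] + [()] for n in nodes]
    found = []
    for combo in product(*choices):
        assignment = dict(zip(nodes, combo))
        if all(assignment[n] == best_consistent_path(n, assignment, permitted)
               for n in nodes):
            found.append(assignment)
    return found

TWO_SOLUTIONS = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
NO_SOLUTION = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}

print(len(solutions(TWO_SOLUTIONS)))  # 2
print(len(solutions(NO_SOLUTION)))    # 0
```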
Stable System Unstable After Failure
BGP is not robust: it is not guaranteed to recover from network failures.
Becomes a BAD GADGET if link (4, 0) goes down.
[Figure: ranked permitted paths node 1: 130, 10; node 2: 210, 20; node 3: 3420, 30; node 4: 40, 420, 430.]
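The effect of the failure can be seen by simulating the path selection dynamics. The sketch below makes several assumptions: nodes are activated in a fixed round-robin order, all nodes start with the null path, and a repeated state after a full sweep is taken as evidence of oscillation under that particular schedule. Real BGP timing is asynchronous, so this is only one possible activation sequence. All function names are illustrative.

```python
PERMITTED = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 4, 2, 0), (3, 0)],
    4: [(4, 0), (4, 2, 0), (4, 3, 0)],
}

def best_consistent_path(node, assignment, permitted):
    for path in permitted[node]:
        if path[1] == 0 or assignment.get(path[1]) == path[1:]:
            return path
    return ()

def without_link(permitted, u, v):
    """Drop every permitted path that traverses the link {u, v}."""
    def uses(path):
        return any({a, b} == {u, v} for a, b in zip(path, path[1:]))
    return {n: [p for p in paths if not uses(p)] for n, paths in permitted.items()}

def run_round_robin(permitted, max_sweeps=50):
    """Activate nodes in id order until a sweep changes nothing or a state repeats."""
    assignment = {n: () for n in permitted}
    seen = set()
    for _ in range(max_sweeps):
        changed = False
        for node in sorted(permitted):
            best = best_consistent_path(node, assignment, permitted)
            if best != assignment[node]:
                assignment[node] = best
                changed = True
        if not changed:
            return "converged", assignment
        state = tuple(sorted(assignment.items()))
        if state in seen:
            return "oscillates", assignment
        seen.add(state)
    return "undecided", assignment

print(run_round_robin(PERMITTED)[0])                      # converged
print(run_round_robin(without_link(PERMITTED, 4, 0))[0])  # oscillates
```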
Strawman Solution Doesn’t Work
• Create a global Internet routing registry
– Store the AS-level graph
– Store all routing policies
– But, ASes may be unwilling to divulge
• Check for conflicting policies
– Analyze the global system and identify conflicts
– Contact the affected ASes to resolve them
– But, checking is an NP-complete problem
– … and, a safe system may be unsafe after failure
Goal: sufficient condition for convergence with local control
Guaranteeing Convergence
Think Globally, Act Locally
• Key features of a good solution
– Flexibility: allow diverse local policies for each AS
– Privacy: do not force ASes to divulge their policies
– Backwards-compatibility: no changes to BGP
– Guarantees: convergence even if system changes
• Restrictions based on AS relationships
– Path selection rules: which route you prefer
– Export policies: who you tell about your route
– AS graph structure: who is connected to whom
Customer-Provider Relationship
• Customer pays provider for Internet access
– Provider exports customer’s routes to everybody
– Customer exports provider’s routes only to its own (downstream) customers
[Figure: two panels, “Traffic to the customer” and “Traffic from the customer”, each showing a provider, a customer, and a destination d, with advertisements flowing one way and traffic flowing the other.]
Peer-Peer Relationship
• Peers exchange traffic between customers
– AS exports only customer routes to a peer
– AS exports a peer’s routes only to its customers
[Figure: “Traffic to/from the peer and its customers”, showing two peers and a destination d, with advertisements flowing one way and traffic flowing the other.]
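The customer-provider and peer-peer export rules can be summarized in a single predicate. This is a minimal sketch under the assumption that each neighbor is classified as exactly one of customer, peer, or provider; the function name `should_export` is illustrative.

```python
CLASSES = ("customer", "peer", "provider")

def should_export(learned_from, send_to):
    """Decide whether a route learned from one neighbor class is exported to another.

    Both arguments name the neighbor's relationship to this AS.
    """
    if learned_from == "customer":
        return True               # customer routes are exported to everybody
    return send_to == "customer"  # peer and provider routes go only to customers

for learned_from in CLASSES:
    allowed = [c for c in CLASSES if should_export(learned_from, c)]
    print(f"{learned_from} route -> export to {allowed}")
```

Under these rules an AS never carries traffic between two of its peers or providers.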
Hierarchical AS Relationships
• Provider-customer graph is directed & acyclic
– If u is a customer of v and v is a customer of w
– … then w is not a customer of u
[Figure: chain of three ASes in which u is a customer of v and v is a customer of w.]
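Whether a given set of customer-provider relationships is hierarchical can be checked with a depth-first search for cycles. The sketch below assumes the relationships are available as a dictionary mapping each AS to the set of its providers; `find_provider_cycle` is an illustrative name.

```python
def find_provider_cycle(providers_of):
    """Return a list of ASes forming a customer-provider cycle, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(asn):
        color[asn] = GREY
        stack.append(asn)
        for provider in providers_of.get(asn, ()):
            state = color.get(provider, WHITE)
            if state == GREY:  # back edge: provider is already on the current chain
                return stack[stack.index(provider):] + [provider]
            if state == WHITE:
                cycle = visit(provider)
                if cycle:
                    return cycle
        stack.pop()
        color[asn] = BLACK
        return None

    for asn in list(providers_of):
        if color.get(asn, WHITE) == WHITE:
            cycle = visit(asn)
            if cycle:
                return cycle
    return None

hierarchy = {"u": {"v"}, "v": {"w"}, "w": set()}
print(find_provider_cycle(hierarchy))                      # None
print(find_provider_cycle({**hierarchy, "w": {"u"}}))      # ['u', 'v', 'w', 'u']
```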
Local Path Selection Rules
• Classify routes based on next-hop AS
– Customer routes, peer routes, and provider routes
• Rank routes based on classification
– Prefer customer routes over peer/provider routes
• Allow any ranking of routes within a class
– E.g., rank one customer route higher than another
– Gives network operators the flexibility they need
• Consistent with traffic engineering practices
– Customers pay for service, and providers are paid
– Peer relationship based on balanced traffic load
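The ranking guideline constrains only the order between classes, not the order within a class. A minimal sketch of a checker, assuming each path is a tuple whose second element is the next-hop AS and that `relationship` (a hypothetical mapping) classifies each neighbor:

```python
def obeys_ranking_guideline(ranked_paths, relationship):
    """True if every customer route is ranked above every peer or provider route.

    ranked_paths: paths listed most preferred first.
    relationship: next-hop AS -> "customer", "peer", or "provider".
    """
    seen_non_customer = False
    for path in ranked_paths:
        if relationship[path[1]] == "customer":
            if seen_non_customer:
                return False  # a customer route ranked below a peer/provider route
        else:
            seen_non_customer = True
    return True

# Hypothetical AS 9 with customers 1 and 2, peer 3, and provider 4.
relationship = {1: "customer", 2: "customer", 3: "peer", 4: "provider"}
print(obeys_ranking_guideline([(9, 2, 0), (9, 1, 0), (9, 4, 0), (9, 3, 0)], relationship))
print(obeys_ranking_guideline([(9, 3, 0), (9, 1, 0)], relationship))
```

The first ranking passes even though it orders the two customer routes and the peer and provider routes arbitrarily; the second fails because a peer route outranks a customer route.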
Two Interpretations
• System is stable because ASes act like this
– High-level argument
• Export and topology assumptions are reasonable
• Path selection rule matches with financial incentives
– Empirical results
• BGP routes for popular destinations stable for ~10 days
• Most instability from a few flapping destinations
• ASes should follow rules for system stability
– Encourage operators to obey these guidelines
– … and provide ways to verify the configuration
– Need to consider more complex relationships
Playing One Condition Off Against Another
• All three conditions are important
– Path ranking, export policy, and graph structure
• Allowing more flexibility in ranking routes
– Allow same preference for peer and customer routes
– Never choose a peer route over a shorter customer route
• … at the expense of stricter AS graph assumptions
– Hierarchical provider-customer relationship (as before)
– No private peering with (direct or indirect) providers
Extension to Backup Relationships
• Backups: liberal export and ranking policies
– The motivation is increased reliability
– …but ironically it may cause routing instability!
[Figure: two panels. “Backup Provider”: a primary provider and a backup provider, with a backup path used after a failure. “Peer-Peer Backup [RFC 1998]”: a provider and a peer, with a backup path used after a failure.]
Backup Path Needs Global Significance
[Figure: nodes 0, 1, 2, 3, and 4.]
• Peer-backup relationship between 0 and 1
– Adds backup paths (2,1,0), (3,1,0), …
• When link {2,0} fails…
– Node 2 prefers (2,3,1,0) through a peer over the backup path (2,1,0)
– Leads to the “bad gadget” example
Backup Paths: Keeping Count of Backup Edges
• Solution
– Prefer routes with fewest backup links
– Then, break ties by preferring customer routes
• Mechanism
– Tag BGP route advertisement with a counter
– Increment the count as you cross a backup edge
[Figure: nodes 0, 1, 2, 3, and 4; node 2’s paths 20, 210, 2310, and 2410, with the labels “No backup”, “One backup, customer”, and “One backup, peer”.]
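The counter mechanism can be sketched as a route attribute that is incremented on backup edges and compared first during selection. In the sketch below, the `Route` class, the `receive` helper, and the class labels given to node 2's routes are assumptions for illustration; only the two-step preference (fewest backup links, then customer routes) comes from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    path: tuple
    backup_count: int    # number of backup edges crossed so far
    next_hop_class: str  # "customer", "peer", or "provider"

def receive(route, receiver, via_backup_edge, next_hop_class):
    """Route as seen by `receiver`, bumping the counter on a backup edge."""
    count = route.backup_count + (1 if via_backup_edge else 0)
    return Route((receiver,) + route.path, count, next_hop_class)

def preference_key(route):
    """Fewest backup links first, then customer routes ahead of others."""
    return (route.backup_count, 0 if route.next_hop_class == "customer" else 1)

# Node 1 learns the destination's route over the backup edge {1, 0}.
at_node_1 = receive(Route((0,), 0, "customer"), 1, True, "peer")

# Hypothetical candidates at node 2.
direct = Route((2, 0), 0, "provider")
via_customer = receive(at_node_1, 2, False, "customer")
via_peer = Route((2, 3, 1, 0), 1, "peer")

print(min([via_peer, via_customer, direct], key=preference_key).path)  # (2, 0)
print(min([via_peer, via_customer], key=preference_key).path)          # (2, 1, 0)
```

Because the count travels with the advertisement, node 2 sees that both remaining routes cross one backup edge and falls back to the customer tie-break, rather than preferring the peer route.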
Recent Work
Recent Work: Relaxing Export Rules
• Goal: no restrictions on export and topology
– Allow an AS to decide whether to export
– Do not require hierarchical relationships
• Question
– How much do you have to restrict path ranking to have a guarantee that the system is safe?
• Answer
– Limited to shortest-path routing
• Implications
– Trade-off in safety, autonomy, & expressiveness
Recent work by Nick Feamster and Ramesh Johari
Recent Work: MED Oscillation (RFC 3345)
• MED is compared only when the next-hop AS is the same
• No total ordering at the leftmost router
– B > A: preferring smaller router-id
– C > B: preferring smaller MED attribute
– A > C: preferring eBGP-learned over iBGP
[Figure: AS 1 and AS 2, with an iBGP session; routes A: Id=2; B: Id=1, MED=20; C: MED=10.]
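The lack of a total ordering can be reproduced with a pairwise comparison that applies MED only between routes from the same neighboring AS. The sketch below rests on assumptions not stated on the slide: A is taken to come from AS 1 and B and C from AS 2, A and B are treated as eBGP-learned and C as iBGP-learned, and A's MED and C's router id are arbitrary placeholders that never decide a comparison.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    name: str
    neighbor_as: int
    med: int
    ebgp: bool
    router_id: int

def better(x, y):
    """Preferred route of the pair under a simplified three-step tie-break."""
    if x.neighbor_as == y.neighbor_as and x.med != y.med:
        return x if x.med < y.med else y  # MED only within the same neighbor AS
    if x.ebgp != y.ebgp:
        return x if x.ebgp else y         # eBGP-learned over iBGP-learned
    return x if x.router_id < y.router_id else y  # smaller router id

A = Route("A", neighbor_as=1, med=0, ebgp=True, router_id=2)
B = Route("B", neighbor_as=2, med=20, ebgp=True, router_id=1)
C = Route("C", neighbor_as=2, med=10, ebgp=False, router_id=3)

for x, y in [(A, B), (B, C), (A, C)]:
    print(f"{x.name} vs {y.name}: prefer {better(x, y).name}")
```

Each pair has a clear winner (B over A, C over B, A over C), yet the three results form a cycle, so the best route depends on which routes happen to be visible.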
Project Idea: Stable Paths Problem and Root-Cause Analysis
Project Idea: Root-Cause Analysis
• Root-cause analysis
– Identify location and cause of routing changes
– Inference from BGP protocol messages
• Active area of research
– Several proposed algorithms
– Limited accuracy in making inferences
• Research question
– Is the problem just very hard?
– Does the data not reveal enough information?
• Project idea: study using SPP
Project Idea, Continued
• Model root-cause analysis
– Start with an SPP instance
– Fail a link (or a node)
– See what path changes would occur
• What events might cause these changes?
[Figure: SPP instance with ranked permitted paths node 1: 120, 10; node 2: 2340, 20; node 3: 340, 320; node 4: 40.]
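The modeling steps above can be sketched on the instance from the figure. This is an illustrative experiment, not a root-cause algorithm: it assumes the path lists are read in rank order, solves each instance by brute force, and compares the stable assignment before and after a single link failure as seen from a chosen set of observer nodes. All function names are hypothetical.

```python
from itertools import product

PERMITTED = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 3, 4, 0), (2, 0)],
    3: [(3, 4, 0), (3, 2, 0)],
    4: [(4, 0)],
}

def best_consistent_path(node, assignment, permitted):
    for path in permitted[node]:
        if path[1] == 0 or assignment.get(path[1]) == path[1:]:
            return path
    return ()

def solve(permitted):
    """First stable assignment found by exhaustive search, or None."""
    nodes = sorted(permitted)
    for combo in product(*[permitted[n] + [()] for n in nodes]):
        assignment = dict(zip(nodes, combo))
        if all(assignment[n] == best_consistent_path(n, assignment, permitted)
               for n in nodes):
            return assignment
    return None

def without_link(permitted, u, v):
    def uses(path):
        return any({a, b} == {u, v} for a, b in zip(path, path[1:]))
    return {n: [p for p in paths if not uses(p)] for n, paths in permitted.items()}

def path_changes(permitted, link, observers):
    """Map each observer whose path changes to its (before, after) pair."""
    before = solve(permitted)
    after = solve(without_link(permitted, *link))
    if before is None or after is None:
        return None  # no stable assignment to compare
    return {n: (before[n], after[n]) for n in sorted(observers)
            if before[n] != after[n]}

for link in [(4, 0), (3, 4), (2, 3)]:
    print(link, path_changes(PERMITTED, link, observers={1, 2}))
```

With monitors only at nodes 1 and 2, the three failures produce identical observations, which illustrates why limited vantage points make the cause hard to infer.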
Questions
• Can you infer cause and location
– If you observe routing changes at all nodes
– If you observe only some of the nodes
• What if you make some assumptions
– E.g., policies based on business relationships
• Where would you place monitors?
– Best locations to place n monitors
– Minimum number of monitors you need
• What changes would you make to the routing protocol to make diagnosis easier?
Next Time: Hot-Potato Routing
• Two papers
– “Dynamics of Hot-Potato Routing in IP Networks”
– “TIE Breaking: Tunable Interdomain Egress Selection”
• NANOG video
– Covering material in the first paper
• In honor of spring break
– No written reviews
• Talk with me about your course project
– ... by Thursday March 24
– Final written report due Tuesday May 10