Controlling the Impact of BGP Policy Changes on IP Traffic Jennifer Rexford

IP Network Management and Performance
AT&T Labs – Research; Florham Park, NJ
http://www.research.att.com/~jrex/papers/tm011106.02.ps
Work with Nick Feamster & Jay Borkenhagen
Summary
• Why do interdomain traffic engineering?
– We are just gluttons for punishment…
– Operators need to control traffic on edge links
• Why work within BGP?
– BGP is what we have to work with…
– Hopefully we’ll get insight for improving BGP
• Our approach
– Model influence of BGP policy changes on traffic
– Find ways to minimize the overhead of changes
– Limit the impact of changes on neighboring ASes
– Evaluate ideas by using traffic and routing data
What is Traffic Engineering?
• Don’t IP networks manage themselves?
– TCP adapts sending rate to network congestion
– Routing protocols adapt to changes in topology
• Goals: user performance & network efficiency
– Add routers/links to increase network capacity
– Change routing to improve the flow of traffic
• Tuning the configuration of existing protocols
– Works today without deploying new protocols
– Avoids stability challenges of load-sensitive routing
– Allows operators to incorporate diverse constraints
– Gives insight to drive future changes to protocols
BGP Traffic Engineering?
• Limitations of intradomain traffic engineering
– Alleviating congestion on edge links
– Making use of new or upgraded edge links
– Influencing choice of end-to-end path
• Extra flexibility by allowing changes to BGP policies
– Direct traffic toward/from certain edge links
– Change the set of egress links for a destination
Impact of Routing on Traffic Flow
[Figure: Topology, BGP updates, and Routing configuration feed the Distributed routing protocols; together with the Offered traffic, these determine the Flow of traffic through the network]
Components of BGP
• BGP protocol
– Definition of how two BGP neighbors communicate
– Message formats, state machine, route attributes, etc.
– Standardized by the IETF
• BGP decision process
– Complex sequence of rules for selecting the best route
– De facto standard applied by router vendors
– Certain steps can be disabled or tuned by configuration
• Policy specification
– Flexible language for filtering and manipulating routes
– Indirectly affects the selection of the best route
– Varies across vendors, though constructs are similar
BGP Decision Process
• Highest local preference
– Set by import policies upon receiving advertisement
• Shortest AS path
– Included in the route advertisement
• Lowest origin type
– Included in advertisement or reset by import policy
• Smallest multiple exit discriminator
– Included in the advertisement or reset by import policy
• Smallest internal path cost
– Based on intradomain routing protocol (e.g., OSPF)
• Smallest next-hop router id
– Final tie-break
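The ordering above can be sketched as a sort key. This is a simplified illustration, not a router implementation: the `Route` record and its field names are assumptions made for the example, and vendor-specific steps (as well as the usual rule that MED is compared only among routes from the same neighboring AS) are ignored.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Route:
    # Hypothetical record; field names are illustrative, not from the talk.
    local_pref: int
    as_path: Tuple[int, ...]
    origin: int       # lower is better (0 = IGP, 1 = EGP, 2 = incomplete)
    med: int
    igp_cost: int     # intradomain path cost to the egress point
    router_id: int    # next-hop router ID, used as the final tie-break

def preference_key(r: Route):
    """Smaller key means more preferred, following the six steps in order."""
    return (-r.local_pref, len(r.as_path), r.origin, r.med, r.igp_cost, r.router_id)

def best_route(routes):
    return min(routes, key=preference_key)

# Made-up routes: the first two tie until the internal path cost step.
routes = [
    Route(100, (701, 80), 0, 0, 20, 2),
    Route(100, (1239, 80), 0, 0, 10, 1),
    Route(90, (80,), 0, 0, 5, 3),   # shortest path, but lower local-pref
]
best = best_route(routes)
```

Because local preference is compared first, the third route loses despite its shorter AS path, which is why import policy is the main lever for outbound traffic.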
Paths to a Destination Prefix
[Figure: two routes with the shortest AS path among the routes with max local-pref; each router selects the “closest” egress point based on OSPF weights]
Limitations of BGP for TE
• BGP protocol
– Distributed, policy-based path-vector protocol
– No capacity or traffic load information in attributes
• Commercial realities
– ASes may have different/inconsistent policy goals
– Change in best path may affect neighbor’s choices
– Commercial relationships impose some policy constraints
• BGP decision process
– Policies have only indirect influence on path selection
– Each router has a single “best” BGP route per prefix
– All “best routes” for a prefix must have same path length
– Strict hierarchy of rules in the decision process
Scoping the Problem
• Predictability
– Ensure the BGP decision process is deterministic
– Assume that BGP updates are (relatively) stable
• Outbound traffic (import policy, local preference)
– Easier to control how traffic leaves the network
– Cooperate with neighbor ASes for inbound traffic
• Limit overhead introduced by routing changes
– Minimize frequency of changes to routing policies
– Limit number of prefixes affected by changes
• Limit impact on how traffic enters the network
– Avoid new routes that might change neighbor’s mind
– Select route with same attributes, or at least path length
AT&T Data (June 1, 2001)
• Analysis of peering links
– Links connecting AT&T to other large providers
– Relatively small number of high-end routers
– Availability of both routing and traffic data
• BGP routing tables
– Log daily output of “show ip bgp” command
– Extract BGP advertisements for each destination prefix
– Focus only on routes learned directly from peers
• Cisco Netflow
– Collect continuous flow-level measurement
– Aggregate the traffic based on the destination prefix
– Focus only on outbound traffic headed to peers
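Aggregating flow records by destination prefix amounts to a longest-prefix match followed by a per-prefix sum. The sketch below assumes flows arrive as (destination address, byte count) pairs and uses a linear scan over the prefixes for clarity; the function name and input layout are hypothetical, and a trie would be the natural choice at the scale of a full routing table.

```python
import ipaddress
from collections import Counter

def aggregate_by_prefix(flows, prefixes):
    """Sum bytes per destination prefix, using the longest matching prefix."""
    nets = sorted((ipaddress.ip_network(p) for p in prefixes),
                  key=lambda n: n.prefixlen, reverse=True)
    totals = Counter()
    for dst, nbytes in flows:
        addr = ipaddress.ip_address(dst)
        for net in nets:            # most specific prefixes are tried first
            if addr in net:
                totals[str(net)] += nbytes
                break               # flows matching no prefix are dropped
    return totals

# Documentation address ranges, used purely as sample data.
prefixes = ["198.51.100.0/24", "198.51.100.128/25"]
flows = [("198.51.100.200", 500), ("198.51.100.5", 300), ("203.0.113.9", 50)]
totals = aggregate_by_prefix(flows, prefixes)
```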
Modeling Routing Choices
• Representation of BGP updates per prefix
– Consider all routes learned from neighboring ASes
– Separate into rows based on AS path length
– In each row, group routes by border router
– Order routes by other attributes (origin, MED, id)
• Example
– 3: cgcil01: {1239 2179 11485, 701 2179 11485},
la2ca01: {1239 12179 11485}
– 5: attga01: {3561 12179 12179 12179 11485},
la2ca01: {3561 12179 12179 12179 11485}
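The representation above can be built with two levels of grouping. This is a minimal sketch under stated assumptions: the input tuple layout is hypothetical, the router names and AS numbers are made up (documentation-range ASNs), and the origin, MED, and router ID values are placeholders.

```python
from collections import defaultdict

def routing_choice(routes):
    """Group routes for one prefix by AS path length, then by border router.

    routes: iterable of (border_router, as_path, origin, med, router_id).
    Within each router, paths are ordered by (origin, MED, router id).
    """
    rows = defaultdict(lambda: defaultdict(list))
    for router, as_path, origin, med, rid in routes:
        rows[len(as_path)][router].append(((origin, med, rid), tuple(as_path)))
    return {
        length: {
            router: [path for _, path in sorted(entries)]
            for router, entries in sorted(rows[length].items())
        }
        for length in sorted(rows)
    }

routes = [
    ("r1", (64500, 64510, 64511), 0, 0, 2),
    ("r1", (64501, 64510, 64511), 0, 0, 1),
    ("r2", (64500, 64510, 64511), 0, 0, 3),
    ("r2", (64502, 64510, 64510, 64510, 64511), 0, 0, 4),
]
choice = routing_choice(routes)
```

Prefixes that yield the same structure share a routing choice, so a policy can be written once for the whole group.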
Controlling the Scale
• Destination prefixes
– More than 90,000 destination prefixes
→ Don’t want to have per-prefix routing policies
– 1% of prefixes → 55% of traffic
→ Focus on the small number of heavy hitters
– Define routing policies for selected prefixes
• Routing choices
– About 27,000 unique “routing choices”
→ Help in reducing the scale of the problem
– 1% of routing choices → 70% of traffic
→ Focus on the very small number of “routing choices”
– Define routing policies on common attributes
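The heavy-hitter figures above are a concentration measure: rank the keys (prefixes or routing choices) by volume and ask what share of the total the top fraction carries. A minimal sketch, assuming traffic is available as a mapping from key to volume; the function name and the sample numbers are illustrative, not the AT&T data.

```python
def traffic_share_of_top(volumes, fraction):
    """Share of total traffic carried by the heaviest `fraction` of keys."""
    ranked = sorted(volumes.values(), reverse=True)
    k = max(1, int(len(ranked) * fraction))   # always count at least one key
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Made-up example: 99 light prefixes and one heavy hitter.
volumes = {f"p{i}": 1 for i in range(99)}
volumes["heavy"] = 101
share = traffic_share_of_top(volumes, 0.01)
```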
Avoid Impacting Downstream Neighbors
Will traffic volume change???
Predictable Routing Change
• Predictability
– Do not change the route sent to downstream neighbor
– Focus on prefixes where all “best” routes are identical
– Neighbors do not even receive a new BGP advertisement!
• Example application
– Multiple links to same peer, with one congested link
– Assign lower local-pref at that link for some prefixes
• Empirical results
– 83.5% of prefixes have shortest AS paths that are all
identical (same next-hop AS, same AS path, etc.)
– These prefixes are responsible for 45% of the traffic
– Plenty of scope to move traffic in a predictable fashion
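The predictability test can be sketched as a check on the shortest AS paths learned for a prefix. This is a simplification: it compares only the AS path, whereas the talk also requires the other attributes to match, and the function name and sample paths are assumptions for illustration.

```python
def is_predictable(as_paths):
    """True if every shortest-AS-path route for the prefix has the same AS path."""
    paths = [tuple(p) for p in as_paths]
    if not paths:
        return False
    shortest = min(len(p) for p in paths)
    return len({p for p in paths if len(p) == shortest}) == 1

# Same path learned at two links to one peer: moving traffic between
# the links does not change what downstream neighbors are told.
same_peer = [(64500, 64511), (64500, 64511), (64501, 64502, 64511)]
# Shortest paths via two different peers: the advertised path would change.
two_peers = [(64500, 64511), (64501, 64511)]
```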
Semi-Predictable Routing Change
• Semi-predictability
– Do not change the length of the AS path sent to neighbor
– Neighbors receive new advertisement with same length
– (Hopefully) they still make the same routing decision
• Example application
– Need to move some traffic from one peer to another
– Find prefixes with “best” paths via both neighbors
– Assign lower local-pref at some links for some prefixes
• Empirical results
– 10-15% of prefixes have shortest paths with 2 next hops
– These prefixes contribute 35-40% of the traffic
– Potential to move traffic in a semi-predictable fashion
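Finding candidates for a semi-predictable change means selecting prefixes whose shortest AS paths include routes via both peers. The sketch below assumes a mapping from prefix to the AS paths learned for it, with local preference equal across those routes; the names and the sample data (documentation prefixes and ASNs) are hypothetical.

```python
def movable_between(prefix_routes, peer_a, peer_b):
    """Prefixes whose shortest AS paths include routes via both peers."""
    movable = []
    for prefix, paths in prefix_routes.items():
        if not paths:
            continue
        shortest = min(len(p) for p in paths)
        next_hops = {p[0] for p in paths if len(p) == shortest}
        if peer_a in next_hops and peer_b in next_hops:
            movable.append(prefix)
    return sorted(movable)

prefix_routes = {
    "192.0.2.0/24": [(64500, 64511), (64501, 64511)],           # both peers, same length
    "198.51.100.0/24": [(64500, 64511), (64501, 64502, 64511)],  # peer 64501 is longer
    "203.0.113.0/24": [(64500, 64511)],                          # one peer only
}
candidates = movable_between(prefix_routes, 64500, 64501)
```

Lowering local-pref for a candidate prefix at the links to one peer shifts its traffic to the other peer without changing the AS path length advertised downstream.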
Influence of AS Path Length
• AS path length
– Plays a significant role in the BGP decision process
– All “best” routes must have the same AS path length
– 10% of prefixes have choices with different path lengths
• An idea: a more flexible approach
– Possible to disable consideration of AS path length, and
incorporate AS path length in local-pref assignment
– E.g., treat paths of length 3 and 4 as equally good
• AS prepending by other ASes
– Inflating AS path length by adding fake hops
– E.g., “701 80 80 80” instead of “701 80”
– 18% of routes had some form of AS prepending
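Both ideas above are easy to sketch. The prepending check looks for an AS repeated in consecutive positions; the local-pref helper is one hypothetical way to fold path length into local preference so that lengths 3 and 4 tie. The bucketing rule, base value, and function names are assumptions for illustration, not a scheme proposed in the talk.

```python
def has_prepending(as_path):
    """True if any AS appears in consecutive positions of the path."""
    return any(a == b for a, b in zip(as_path, as_path[1:]))

def local_pref_from_length(as_path, base=100):
    """Map path lengths to local-pref in buckets of two: 1-2, 3-4, 5-6, ...

    Paths in the same bucket get the same local-pref, so the decision
    process treats them as equally good once the AS-path-length step
    is disabled.
    """
    bucket = (len(as_path) - 1) // 2
    return base - bucket

prepended = (701, 80, 80, 80)
plain = (701, 80)
```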
Conclusions
• BGP traffic engineering
– Alleviate congestion on the edge links between ASes
– Configure BGP policies to control the flow of traffic
• Data analysis
– Extract routing choices from the BGP routing tables
– Collect prefix-level measurements of outgoing traffic
– Analyze traffic to help in scoping the BGP TE problem
• Ongoing work
– Toolkit for what-if questions about changes in BGP policies
– Heuristics for suggesting possible BGP policy changes
– Techniques for coordination between neighboring ASes
– Lightweight support for measurement in IP routers
(Shameless) Psamp Plug
• IETF activity on packet sampling
– Minimal functionality for packet-level measurement
– Tunable trade-offs between overhead and accuracy
– Measurement data for a variety of important applications
• Basic idea: parallel filter/sample banks
– Filter on header fields (src/dest, port #s, protocol)
– 1-out-of-N sampling (random, periodic, or hash)
– Record key IP and TCP/UDP header fields
– Send measurement record to a collection system
• Get involved!!!
– http://www.ietf.org/internet-drafts/draft-duffield-framework-papame-01.txt
– http://ops.ietf.org/lists/psamp/
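The filter/sample bank idea can be illustrated with a toy model. This is only a sketch of the four steps listed above, not the Psamp specification: the class, the packet dictionary layout, and the recorded field names are all assumptions, and only periodic 1-out-of-N sampling is shown.

```python
from itertools import count

KEY_FIELDS = ("src", "dst", "proto", "src_port", "dst_port")  # assumed record layout

def periodic_sampler(n):
    """Select the 1st, (n+1)th, (2n+1)th, ... packet offered to the sampler."""
    counter = count()
    return lambda pkt: next(counter) % n == 0

class FilterSampleBank:
    """One bank: header filter, then sampler, then export of a record."""

    def __init__(self, match, sampler, collector):
        self.match = match          # header fields that must match exactly
        self.sampler = sampler
        self.collector = collector  # a plain list stands in for the collection system

    def observe(self, pkt):
        # The sampler only sees packets that pass the filter.
        if all(pkt.get(k) == v for k, v in self.match.items()) and self.sampler(pkt):
            self.collector.append({k: pkt.get(k) for k in KEY_FIELDS})

collector = []
banks = [FilterSampleBank({"proto": 6, "dst_port": 80}, periodic_sampler(3), collector)]
for i in range(12):
    pkt = {"src": "192.0.2.1", "dst": "198.51.100.7", "proto": 6,
           "src_port": 1000 + i, "dst_port": 80 if i % 2 == 0 else 53}
    for bank in banks:              # banks run in parallel on every packet
        bank.observe(pkt)
sampled_ports = [r["src_port"] for r in collector]
```

Raising N lowers the export overhead at the cost of accuracy, which is the tunable trade-off mentioned above.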