Controlling the Impact of BGP Policy Changes on IP Traffic Jennifer Rexford

IP Network Management and Performance
AT&T Labs – Research; Florham Park, NJ
Work with Nick Feamster & Jay Borkenhagen
• Why do interdomain traffic engineering?
– We are just gluttons for punishment…
– Operators need to control traffic on edge links
• Why work within BGP?
– BGP is what we have to work with…
– Hopefully we’ll get insight for improving BGP
• Our approach
– Model influence of BGP policy changes on traffic
– Find ways to minimize the overhead of changes
– Limit the impact of changes on neighboring ASes
– Evaluate ideas by using traffic and routing data
What is Traffic Engineering?
• Don’t IP networks manage themselves?
– TCP adapts sending rate to network congestion
– Routing protocols adapt to changes in topology
• Goals: user performance & network efficiency
– Add routers/links to increase network capacity
– Change routing to improve the flow of traffic
• Tuning the configuration of existing protocols
– Works today without deploying new protocols
– Avoids stability challenges of load-sensitive routing
– Allows operators to incorporate diverse constraints
– Gives insight to drive future changes to protocols
BGP Traffic Engineering?
• Limitations of intradomain traffic engineering
– Alleviating congestion on edge links
– Making use of new or upgraded edge links
– Influencing choice of end-to-end path
• Extra flexibility by allowing changes to BGP policies
– Direct traffic toward/from certain edge links
– Change the set of egress links for a destination
Impact of Routing on Traffic Flow
[Figure: routing protocols and the resulting flow of traffic through the network]
Components of BGP
• BGP protocol
– Definition of how two BGP neighbors communicate
– Message formats, state machine, route attributes, etc.
– Standardized by the IETF
• BGP decision process
– Complex sequence of rules for selecting the best route
– De facto standard applied by router vendors
– Certain steps can be disabled or tuned by configuration
• Policy specification
– Flexible language for filtering and manipulating routes
– Indirectly affects the selection of the best route
– Varies across vendors, though constructs are similar
BGP Decision Process
• Highest local preference
– Set by import policies upon receiving advertisement
• Shortest AS path
– Included in the route advertisement
• Lowest origin type
– Included in advertisement or reset by import policy
• Smallest multiple exit discriminator
– Included in the advertisement or reset by import policy
• Smallest internal path cost
– Based on intradomain routing protocol (e.g., OSPF)
• Smallest next-hop router id
– Final tie-break
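The strict ordering of these steps can be sketched as a lexicographic comparison. This is a simplified illustration, not a vendor implementation: the `Route` fields and the integer `router_id` are assumptions made for the example, and real routers typically compare MED only among routes from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Route:
    # Hypothetical field names, chosen to mirror the steps on the slide.
    local_pref: int            # set by import policy; higher is better
    as_path: Tuple[int, ...]   # shorter is better
    origin: int                # 0 = IGP, 1 = EGP, 2 = incomplete; lower is better
    med: int                   # lower is better (compared globally here, a simplification)
    igp_cost: int              # intradomain path cost to the egress; lower is better
    router_id: int             # next-hop router id; lowest wins the final tie-break


def preference_key(route: Route):
    # Tuples compare element by element, which mirrors the strict
    # hierarchy of rules: a later step matters only if all earlier ones tie.
    return (-route.local_pref, len(route.as_path), route.origin,
            route.med, route.igp_cost, route.router_id)


def best_route(routes: Iterable[Route]) -> Route:
    return min(routes, key=preference_key)
```

Because the key is a tuple, lowering local preference on one route removes it from contention before AS path length is ever considered.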
Paths to a Destination Prefix
Two routes with shortest AS path among routes with max local-pref…
… each router selects “closest” egress point based on OSPF weights
Limitations of BGP for TE
• BGP protocol
– Distributed, policy-based path-vector protocol
– No capacity or traffic load information in attributes
• Commercial realities
– ASes may have different/inconsistent policy goals
– Change in best path may affect neighbor’s choices
– Commercial relationships impose some policy constraints
• BGP decision process
– Policies have only indirect influence on path selection
– Each router has a single “best” BGP route per prefix
– All “best routes” for a prefix must have same path length
– Strict hierarchy of rules in the decision process
Scoping the Problem
• Predictability
– Ensure the BGP decision process is deterministic
– Assume that BGP updates are (relatively) stable
• Outbound traffic (import policy, local preference)
– Easier to control how traffic leaves the network
– Cooperate with neighbor ASes for inbound traffic
• Limit overhead introduced by routing changes
– Minimize frequency of changes to routing policies
– Limit number of prefixes affected by changes
• Limit impact on how traffic enters the network
– Avoid new routes that might change neighbor’s mind
– Select route with same attributes, or at least path length
AT&T Data (June 1, 2001)
• Analysis of peering links
– Links connecting AT&T to other large providers
– Relatively small number of high-end routers
– Availability of both routing and traffic data
• BGP routing tables
– Log daily output of “show ip bgp” command
– Extract BGP advertisements for each destination prefix
– Focus only on routes learned directly from peers
• Cisco Netflow
– Collect continuous flow-level measurement
– Aggregate the traffic based on the destination prefix
– Focus only on outbound traffic headed to peers
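Aggregating flow records by destination prefix amounts to a longest-prefix match against the routing table. The sketch below assumes flows arrive as `(destination address, byte count)` pairs; the function name and input shape are illustrative, not the Netflow record format, and the linear scan is for clarity rather than speed.

```python
import ipaddress
from collections import Counter


def aggregate_by_prefix(flows, prefixes):
    """Sum flow bytes per destination prefix using longest-prefix match.

    flows:    iterable of (destination_address, byte_count) pairs (assumed shape)
    prefixes: iterable of prefix strings such as "192.0.2.0/24"
    """
    # Try the most specific prefixes first so the first hit is the longest match.
    networks = sorted((ipaddress.ip_network(p) for p in prefixes),
                      key=lambda net: net.prefixlen, reverse=True)
    volume = Counter()
    for destination, byte_count in flows:
        address = ipaddress.ip_address(destination)
        for network in networks:
            if address in network:
                volume[str(network)] += byte_count
                break  # flows matching no prefix are simply skipped
    return volume
```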
Modeling Routing Choices
• Representation of BGP updates per prefix
– Consider all routes learned from neighboring ASes
– Separate into rows based on AS path length
– In each row, group routes by border router
– Order routes by other attributes (origin, MED, id)
• Example
– 3: cgcil01: {1239 2179 11485, 701 2179 11485},
la2ca01: {1239 12179 11485}
– 5: attga01: {3561 12179 12179 12179 11485},
la2ca01: {3561 12179 12179 12179 11485}
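The representation above can be sketched as a nested mapping from AS path length to border router to routes. The function name and the `(border_router, as_path)` input shape are assumptions; within each router the sketch sorts by AS path only, as a stand-in for the ordering by origin, MED, and id, which is not modeled here.

```python
from collections import defaultdict


def routing_choice(routes):
    """Group one prefix's routes into rows by AS path length, then by border router.

    routes: iterable of (border_router, as_path) pairs (assumed shape)
    """
    rows = defaultdict(lambda: defaultdict(list))
    for border_router, as_path in routes:
        rows[len(as_path)][border_router].append(tuple(as_path))
    # Return plain dicts with rows and routers in a stable, sorted order.
    return {
        length: {router: sorted(paths) for router, paths in sorted(by_router.items())}
        for length, by_router in sorted(rows.items())
    }


# The example from the slide, entered as it appears there.
example = [
    ("cgcil01", (1239, 2179, 11485)),
    ("cgcil01", (701, 2179, 11485)),
    ("la2ca01", (1239, 12179, 11485)),
    ("attga01", (3561, 12179, 12179, 12179, 11485)),
    ("la2ca01", (3561, 12179, 12179, 12179, 11485)),
]
choice = routing_choice(example)
```

Two prefixes with the same resulting structure share a "routing choice", which is what allows policies to be defined per choice rather than per prefix.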
Controlling the Scale
• Destination prefixes
– More than 90,000 destination prefixes
  → Don’t want to have per-prefix routing policies
– 1% of prefixes → 55% of traffic
  → Focus on the small number of heavy hitters
– Define routing policies for selected prefixes
• Routing choices
– About 27,000 unique “routing choices”
  → Help in reducing the scale of the problem
– 1% of routing choices → 70% of traffic
  → Focus on the very small number of “routing choices”
– Define routing policies on common attributes
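The heavy-hitter figures above are shares of total traffic carried by the top fraction of keys (prefixes or routing choices). A minimal sketch of that calculation, assuming a mapping from key to traffic volume; the function name is hypothetical.

```python
import math


def top_share(volume_by_key, fraction):
    """Share of total traffic carried by the top `fraction` of keys.

    volume_by_key: mapping from prefix (or routing choice) to traffic volume
    fraction:      e.g. 0.01 for the top 1% of keys
    """
    volumes = sorted(volume_by_key.values(), reverse=True)
    total = sum(volumes)
    if total == 0:
        return 0.0
    # Always count at least one key, so tiny inputs still give a sensible answer.
    top_count = max(1, math.ceil(fraction * len(volumes)))
    return sum(volumes[:top_count]) / total
```

Running the same function once keyed by prefix and once keyed by routing choice gives the two comparisons made on the slide.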
Avoid Impacting Downstream Neighbors
Will traffic volume change???
Predictable Routing Change
• Predictability
– Do not change the route sent to downstream neighbor
– Focus on prefixes where all “best” routes are identical
– Neighbors do not even receive a new BGP advertisement!
• Example application
– Multiple links to same peer, with one congested link
– Assign lower local-pref at that link for some prefixes
• Empirical results
– 83.5% of prefixes have shortest AS paths that are all
identical (same next-hop AS, same AS path, etc.)
– These prefixes are responsible for 45% of the traffic
– Plenty of scope to move traffic in a predictable fashion
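The predictability test can be sketched as a check that every shortest-AS-path route for a prefix carries the same AS path. This is a simplification under stated assumptions: all routes are taken to have equal local preference, only the AS path is compared (not the other attributes), and the table shape and function names are hypothetical.

```python
def shortest_paths(routes):
    """Keep only the routes whose AS path has the minimum length.

    routes: non-empty list of (border_router, as_path) pairs (assumed shape)
    """
    shortest = min(len(as_path) for _, as_path in routes)
    return [(router, as_path) for router, as_path in routes
            if len(as_path) == shortest]


def is_predictable(routes):
    # Predictable: every candidate "best" route would advertise the same AS path,
    # so switching among them does not change what downstream neighbors see.
    return len({tuple(as_path) for _, as_path in shortest_paths(routes)}) == 1


def predictable_prefixes(table):
    """table: mapping from prefix to its list of (border_router, as_path) pairs."""
    return sorted(prefix for prefix, routes in table.items()
                  if is_predictable(routes))
```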
Semi-Predictable Routing Change
• Semi-predictability
– Do not change the length of the AS path sent to neighbor
– Neighbors receive new advertisement with same length
– (Hopefully) they still make the same routing decision
• Example application
– Need to move some traffic from one peer to another
– Find prefixes with “best” paths via both neighbors
– Assign lower local-pref at some links for some prefixes
• Empirical results
– 10-15% of prefixes have shortest paths with 2 next hops
– These prefixes contribute 35-40% of the traffic
– Potential to move traffic in a semi-predictable fashion
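Finding candidates for a semi-predictable move means finding prefixes whose shortest AS paths start at both peers. A rough sketch, assuming equal local preference across routes and a hypothetical table that maps each prefix to `(border_router, as_path)` pairs:

```python
def next_hop_ases(routes):
    """Set of next-hop ASes among the shortest-AS-path routes for one prefix.

    routes: non-empty list of (border_router, as_path) pairs (assumed shape)
    """
    shortest = min(len(as_path) for _, as_path in routes)
    return {as_path[0] for _, as_path in routes if len(as_path) == shortest}


def movable_between(table, from_as, to_as):
    """Prefixes whose equally short paths exist via both peers.

    For these prefixes, lowering local-pref on links to `from_as` shifts
    traffic to `to_as` without changing the advertised AS path length.
    """
    return sorted(prefix for prefix, routes in table.items()
                  if {from_as, to_as} <= next_hop_ases(routes))
```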
Influence of AS Path Length
• AS path length
– Plays a significant role in the BGP decision process
– All “best” routes must have the same AS path length
– 10% of prefixes have choices with different path lengths
• An idea: a more flexible approach
– Possible to disable consideration of AS path length, and
incorporate AS path length in local-pref assignment
– E.g., treat paths of length 3 and 4 as equally good
• AS prepending by other ASes
– Inflating AS path length by adding fake hops
– E.g., “701 80 80 80” instead of “701 80”
– 18% of routes had some form of AS prepending
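AS prepending shows up as consecutive repeats of the same AS number, so it can be detected by collapsing runs. A small sketch with hypothetical function names:

```python
from itertools import groupby


def strip_prepending(as_path):
    """Collapse consecutive repeats of the same AS number into one hop."""
    return [asn for asn, _ in groupby(as_path)]


def has_prepending(as_path):
    # A path is prepended if collapsing repeated hops makes it shorter.
    return len(strip_prepending(as_path)) < len(as_path)


# The slide's example: "701 80 80 80" instead of "701 80".
assert strip_prepending([701, 80, 80, 80]) == [701, 80]
```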
• BGP traffic engineering
– Alleviate congestion on the edge links between ASes
– Configure BGP policies to control the flow of traffic
• Data analysis
– Extract routing choices from the BGP routing tables
– Collect prefix-level measurements of outgoing traffic
– Analyze traffic to help in scoping the BGP TE problem
• Ongoing work
– Toolkit for what-if questions about changes in BGP policies
– Heuristics for suggesting possible BGP policy changes
– Techniques for coordination between neighboring ASes
– Lightweight support for measurement in IP routers
(Shameless) Psamp Plug
• IETF activity on packet sampling
– Minimal functionality for packet-level measurement
– Tunable trade-offs between overhead and accuracy
– Measurement data for a variety of important applications
• Basic idea: parallel filter/sample banks
– Filter on header fields (src/dest, port #s, protocol)
– 1-out-of-N sampling (random, periodic, or hash)
– Record key IP and TCP/UDP header fields
– Send measurement record to a collection system
• Get involved!!!
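The three 1-out-of-N selection styles named under the basic idea can be sketched as follows. This is only an illustration: packets are modeled as `bytes`, the function names are hypothetical, and CRC32 stands in for whatever hash function a real sampler would use.

```python
import random
import zlib


def periodic_sample(packets, n):
    """Keep every n-th packet, starting with the first."""
    return [packet for index, packet in enumerate(packets) if index % n == 0]


def random_sample(packets, n, rng):
    """Keep each packet independently with probability 1/n.

    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    return [packet for packet in packets if rng.randrange(n) == 0]


def hash_sample(packets, n):
    """Keep packets whose content hash falls in 1 of n buckets.

    Deterministic: the same packet is selected (or not) at every
    observation point that uses the same hash and n.
    """
    return [packet for packet in packets if zlib.crc32(packet) % n == 0]
```

Periodic and random sampling trade predictability of overhead against bias from periodic traffic, while hash-based selection lets different routers pick the same packets.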