Internet Routing (COS 598A) Jennifer Rexford Today: Router Software Tuesdays/Thursdays 11:00am-12:20pm

Internet Routing (COS 598A)
Today: Router Software
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm
Outline
• Continuing discussion from last class
– Proposals for removing routing from routers
– Feasibility, collecting data, computing paths, etc.
• BGP implementation
– Storage overhead
– CPU overhead
• Recent proposals
– Graceful restart to limit effects of resets
– Tunneling to limit hot-potato changes
– Computing routes for groups of routers
Proposal #1: Routing As a Service
• Goal: third parties pick end-to-end paths for
clients to satisfy diverse user objectives
• Forwarding infrastructure
– Basic routing (e.g., default routing)
– Primitives for inserting routes
• Route selector
– Aggregates network information
– Selects routes on behalf of clients
– Competes with other selectors for customers
• End host
– Queries route selector to set up paths
Proposal #2: Routing Control Platform
• Goal: Move beyond today’s artifacts, while
remaining compatible with the legacy routers
• Incentive compatibility: phased evolution
– Intelligent route reflector in a single AS
– Learning eBGP routes directly from neighbor ASes
– Interdomain routing between RCPs
• Backwards compatibility: internal BGP
– Using iBGP to “push” answers to the routers
– No need at all to change the legacy routers
– Keep message format and change decision rules
[Figure: RCPs in AS 1, AS 2, and AS 3; labels: eBGP, inter-AS protocol between RCPs, iBGP from each RCP to its routers, physical peering]
Proposal #3: Wafer-Thin Control Plane
• Goal: Refactor the data, control, and
management planes from scratch
• Management plane → Decision plane
– Operates on network-wide view and objectives
– Directly controls the data plane in real time
• Control plane → Discovery plane
– Responsible for providing the network-wide view
– Topology discovery, traffic measurement, etc.
• Data plane
– Queues, filters, and forwards data packets
– Accepts direct instruction from the decision plane
Simple routers that have no control-plane configuration
How Do These Differ From Overlays?
• Overlays: circumventing the underlay
– Host nodes throughout the network
– Logical links between the host nodes
– Active probes to observe the performance
– Direct packets through good intermediate nodes
• Routing services: controlling the underlay
– Servers collect data directly from the routers
– Servers compute forwarding tables for the routers
– Data packets do not go through the servers
– Like an overlay for managing the underlay
Maybe some combination of the two makes sense?
Practical Issues: Feasibility
• Fast reaction to failures
– Routers are closer to the failures
– Can a service react quickly enough?
• Scalability with network size
– State and computation grow with the topology
– Can a service manage a large network?
• Reliability?
– Service is now a point of failure
– Is simple replication enough?
• Security?
– Service is now a natural point of attack
– Easier (or harder) to protect than the routers?
Practical Issues: Collecting Measurement Data
• All three proposals make measurement a first-order part of running the network
• Routers have only two jobs
– Forward packets
– Collect measurement data
• What measurements?
– Topology discovery
– Traffic demands
– Performance statistics
– …?
Practical Issues: Path-Computation Algorithms
• Selecting routes should be easier
– Complete view of network topology and traffic
– Possibility of using centralized algorithms
– Direct control over forwarding tables
• …but what algorithms to use?
– Still need a separation of timescale, but how?
• Fast reaction to topological changes
• Semi-offline optimization of routing
• … and how to compute end-to-end paths?
– Policy-based path vector protocol?
– Publish/subscribe system?
– Something else?
Practical Issues: Solving Real Problems?
• Customer load-balancing
– Trading off load, performance, and cost
– Controlling inbound and outbound traffic
– Avoiding small subnets and BGP tweaks
• Preventing overloading router resources
– Minimum-sized forwarding table per router
– Minimum stretch while obeying memory limits
• Flexible end-to-end path selection
– Satisfy the goals of end users and providers
– Handle pricing/economics in the right way
Other Thoughts?
Router Software
Basic BGP Implementation
[Figure: for each neighbor i = 1..n, RIB-in-i → import policy → main RIB and decision process → export policy → RIB-out-i]
Storage Overhead: RIB-In
• Storing routes learned from each neighbor
– Before applying the import policy
• Advantages of keeping a RIB-In
– Verify receipt of routes that have been filtered
– Use as input to simulate import-policy changes
– Apply new policies directly on local RIB-In
• Alternatives to keeping a RIB-In
– Reset the session after any policy change
• Undesirable, unless policy changes are infrequent
– Route-refresh option to signal neighbor to resend
• Relatively new feature, so not universally supported
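A minimal Python sketch of the RIB-In advantages listed above: because the unfiltered routes are stored, a new import policy can be evaluated locally without resetting the session, and routes that were filtered remain visible. The route representation, both policy functions, and the AS numbers are illustrative assumptions, not any router's actual data model.

```python
from typing import Callable, Dict, Tuple

AsPath = Tuple[int, ...]
ImportPolicy = Callable[[str, AsPath], bool]   # True means "accept the route"

# RIB-In for one neighbor: every route received, before any filtering.
rib_in: Dict[str, AsPath] = {
    "10.0.0.0/8": (65001,),
    "192.0.2.0/24": (65001, 65002),
    "198.51.100.0/24": (65001, 65003, 65004),
}


def apply_import(rib: Dict[str, AsPath], policy: ImportPolicy) -> Dict[str, AsPath]:
    """Return the routes that survive the import policy."""
    return {p: path for p, path in rib.items() if policy(p, path)}


def old_policy(prefix: str, path: AsPath) -> bool:
    return len(path) <= 2            # hypothetical: reject long AS paths


def new_policy(prefix: str, path: AsPath) -> bool:
    return 65002 not in path         # hypothetical: reject paths through AS 65002


accepted_before = apply_import(rib_in, old_policy)
accepted_after = apply_import(rib_in, new_policy)   # no session reset needed
filtered_before = sorted(set(rib_in) - set(accepted_before))
print(sorted(accepted_after), filtered_before)
```

Without the stored RIB-In, the route filtered by the old policy would have to be re-learned from the neighbor (session reset or route-refresh) before the new policy could accept it.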
Storage Overhead: Main RIB
• Storing all candidate routes
– All routes after import processing
– Keep track of the best route for each prefix
• Advantages
– Necessary to store at least one copy of each route
– … since BGP is an incremental protocol
• Alternatives
– Store only the RIB-In for each neighbor
• Requires rerunning import policies for each decision
Storage Overhead: RIB-Out
• Storing routes sent to each neighbor
– After applying the export policy
• Advantages of keeping a RIB-Out
– Verify sending of route to the neighbor
– Compare routes to suppress unnecessary updates
• No update message if all attributes are the same
• No withdrawal message if there was no advertisement
• Alternative to keeping a RIB-Out
– Reapply export policy to recompute the route
• … or send some unnecessary update messages
– Single RIB-Out per export policy (peer groups)
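The update-suppression rules above can be sketched in a few lines of Python. The RibOut class, its method name, and the tuple used as a route are hypothetical; the point is only that remembering what was last sent lets the router skip redundant updates and withdrawals.

```python
from typing import Dict, Optional, Tuple

Route = Tuple[Tuple[int, ...], str]   # hypothetical route: (AS path, next hop)


class RibOut:
    """Routes most recently advertised to one neighbor, keyed by prefix."""

    def __init__(self) -> None:
        self.sent: Dict[str, Route] = {}

    def advertise(self, prefix: str, route: Optional[Route]) -> Optional[str]:
        """Return the message to send, or None if nothing needs to be sent."""
        previous = self.sent.get(prefix)
        if route is None:
            if previous is None:
                return None           # never advertised, so no withdrawal
            del self.sent[prefix]
            return "WITHDRAW"
        if route == previous:
            return None               # identical attributes, so no update
        self.sent[prefix] = route
        return "UPDATE"


rib_out = RibOut()
route = ((65001, 65002), "192.0.2.1")
messages = [
    rib_out.advertise("10.0.0.0/8", route),
    rib_out.advertise("10.0.0.0/8", route),   # duplicate is suppressed
    rib_out.advertise("10.0.0.0/8", None),
    rib_out.advertise("10.0.0.0/8", None),    # nothing left to withdraw
]
print(messages)
```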
BGP Peer Groups
• Group of BGP neighbors with same policies
– Avoid repetitive configuration
– Avoid reapplying the same policy
– Avoid duplicating the storage
• Example iBGP peer groups
– Route-reflector clients
– Route-reflector peers
• Example eBGP peer groups
– Customers
– Peers
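A rough sketch of the saving from peer groups, assuming a prefix string stands in for a full route and using a made-up export policy: the policy is evaluated once per group, and every member shares the same RIB-Out object instead of holding its own copy.

```python
from typing import Callable, Dict, List

Route = str                                  # a prefix stands in for a full route
ExportPolicy = Callable[[Route], bool]       # True means "advertise this route"

policy_runs = 0


def no_private_space(route: Route) -> bool:
    """Hypothetical export policy: do not advertise 10.0.0.0/8."""
    global policy_runs
    policy_runs += 1
    return route != "10.0.0.0/8"


def build_rib_outs(
    best_routes: List[Route],
    groups: Dict[str, List[str]],
    policies: Dict[str, ExportPolicy],
) -> Dict[str, List[Route]]:
    """Evaluate each group's policy once; every member points at the same list."""
    rib_out: Dict[str, List[Route]] = {}
    for group, members in groups.items():
        shared = [r for r in best_routes if policies[group](r)]
        for neighbor in members:
            rib_out[neighbor] = shared       # shared object, not a copy
    return rib_out


best = ["10.0.0.0/8", "192.0.2.0/24"]
groups = {"customers": ["c1", "c2", "c3"]}
rib_out = build_rib_outs(best, groups, {"customers": no_private_space})
print(policy_runs, rib_out["c1"] is rib_out["c3"])
```

With three customers and two routes, the policy runs twice rather than six times, and only one RIB-Out list is stored.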
CPU Overhead: New BGP Update Message
• When receiving a new BGP update
– Apply import policy and update the RIB
– Re-run the BGP decision process for this prefix
– If best route changes, apply export policies and
send update message to affected neighbors
• Running decision process
– Ideally, just compare with the best route
• Withdraw non-best route: no change
• Update non-best route: compare to current best
– But, BGP does not always form a total ordering
• MED attribute compared only for same next-hop AS
• Re-run decision process for deterministic outcome
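To illustrate why MED prevents a total ordering, here is a small Python sketch with a decision process reduced to two steps (MED, then IGP cost). The three routes and their attribute values are invented for the example; real BGP has more steps before and after these two.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    name: str
    neighbor_as: int   # AS the route was learned from
    med: int           # lower is better, comparable only within one neighbor AS
    igp_cost: int      # distance to the egress point, lower is better


def prefer(a: Route, b: Route) -> Route:
    """Pairwise comparison using only MED and IGP cost (simplified)."""
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def decide(routes: List[Route]) -> Route:
    """Full decision over the whole candidate set: MED step, then IGP step."""
    best_med: Dict[int, int] = {}
    for r in routes:
        best_med[r.neighbor_as] = min(best_med.get(r.neighbor_as, r.med), r.med)
    survivors = [r for r in routes if r.med == best_med[r.neighbor_as]]
    return min(survivors, key=lambda r: r.igp_cost)


a = Route("a", neighbor_as=1, med=10, igp_cost=5)
b = Route("b", neighbor_as=1, med=5, igp_cost=20)
c = Route("c", neighbor_as=2, med=0, igp_cost=10)

# Pairwise preferences form a cycle: b beats a, c beats b, a beats c.
cycle = (prefer(a, b).name, prefer(b, c).name, prefer(c, a).name)
best_all = decide([a, b, c]).name
best_without_b = decide([a, c]).name   # b, a non-best route, is withdrawn
print(cycle, best_all, best_without_b)
```

In this example, withdrawing the non-best route b changes the best route from c to a, so comparing only against the current best is not enough; the decision process has to be re-run over the full candidate set.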
CPU Overhead: Events that Amplify Work
• BGP session failure
– Must discard all routes learned from this neighbor
– … and run decision process for affected prefixes
• Policy change
– Must apply the new routing policy to all routes
learned from (or sent to) this neighbor
– … and run decision process for affected prefixes
• Intradomain change
– Must revisit BGP decision for affected prefixes
• Exclude routes with unreachable next-hop
• Prefer the route with the closest egress point
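A sketch of the session-failure case, assuming a toy main RIB keyed by prefix and a decision process reduced to "shortest AS path, then neighbor name". All names and AS numbers are hypothetical. One failure touches every prefix the neighbor had advertised, which is what amplifies the work.

```python
from typing import Dict, Optional, Set, Tuple

AsPath = Tuple[int, ...]
# Hypothetical main RIB: prefix -> {neighbor name -> AS path of its route}
Rib = Dict[str, Dict[str, AsPath]]


def best_neighbor(candidates: Dict[str, AsPath]) -> Optional[str]:
    """Simplified decision process: shortest AS path, then neighbor name."""
    if not candidates:
        return None
    return min(candidates, key=lambda n: (len(candidates[n]), n))


def session_down(rib: Rib, best: Dict[str, Optional[str]], neighbor: str) -> Set[str]:
    """Discard the neighbor's routes; return prefixes whose best route changed."""
    changed: Set[str] = set()
    for prefix, candidates in rib.items():
        if candidates.pop(neighbor, None) is None:
            continue                      # neighbor had no route for this prefix
        new_best = best_neighbor(candidates)
        if new_best != best[prefix]:
            best[prefix] = new_best
            changed.add(prefix)
    return changed


rib: Rib = {
    "10.0.0.0/8": {"n1": (65001, 65002), "n2": (65003, 65004, 65005)},
    "192.0.2.0/24": {"n1": (65001,)},
    "198.51.100.0/24": {"n1": (65001, 65002), "n2": (65003,)},
}
best = {prefix: best_neighbor(c) for prefix, c in rib.items()}
changed = session_down(rib, best, "n1")
print(sorted(changed), best)
```

Only the prefixes in `changed` need new export processing; the third prefix loses a candidate but keeps its best route.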
CPU Overhead: Deferring Heavy Jobs
• Event-driven approach
– Process most events as they occur
– Defer heavy-load items to background task
– Make sure these tasks can run soon
– Example: XORP handling session failures
• Timer-driven approach
– Periodic timer driving the operation
– Scan the data structures when the timer expires
– … and identify and perform any needed work
– Example: Cisco scan timer for IGP changes
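One way to picture the event-driven approach is a scheduler that handles each event immediately and then runs one small slice of deferred work. The Python sketch below is an illustration under that assumption only; the Scheduler class and the batch size are invented, and it does not describe how XORP or any other router is actually implemented.

```python
from collections import deque
from typing import Callable, Deque, Iterator, List

log: List[str] = []


class Scheduler:
    """Runs each event at once, then one slice of deferred background work."""

    def __init__(self) -> None:
        self.background: Deque[Iterator[None]] = deque()

    def defer(self, task: Iterator[None]) -> None:
        self.background.append(task)

    def run_background_slice(self) -> None:
        if not self.background:
            return
        try:
            next(self.background[0])         # run until the task yields
        except StopIteration:
            self.background.popleft()        # task finished

    def handle_event(self, handler: Callable[[], None]) -> None:
        handler()
        self.run_background_slice()


def delete_routes(prefixes: List[str], batch: int) -> Iterator[None]:
    """Heavy job (e.g. after a session failure), split into small batches."""
    for start in range(0, len(prefixes), batch):
        for prefix in prefixes[start:start + batch]:
            log.append(f"delete {prefix}")
        yield                                # let foreground events run


sched = Scheduler()
sched.defer(delete_routes(["p1", "p2", "p3"], batch=2))
sched.handle_event(lambda: log.append("update A"))
sched.handle_event(lambda: log.append("update B"))
print(log)
```

The deletions are interleaved with the two updates instead of blocking them, while still making steady progress.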
Reducing Overhead: Operational Practices
• Avoiding RIB-In storage
– Configuring router not to store RIB-In
– Convincing neighbors to support route-refresh
• Configuring peer groups
– Limiting the number of unique export policies
• … or limiting the number of these per router
– Putting all possible sessions in same peer group
• Selecting good timer settings
– Allow grouping of update messages
– Avoid false detection of session failures
Reducing the Effects of Session Failures
• Separating control from data
– Suppose a router’s BGP process fails
– … but the data plane is just fine
[Figure: two routers, each with a RIB above a FIB; data packets flow between the FIBs]
• When the neighbor’s BGP process fails
– Do not delete routes learned from neighbor
– Continue to forward data packets
• When the neighbor’s process restarts
– Refresh the neighbor by re-sending BGP routes
– Neighbor re-builds its RIB and goes back to normal
• BGP “Graceful Restart” mechanism
– New BGP capability for neighbors to negotiate
– Mark routes from the neighbor as “stale”
– Refresh by resending RIB-Out with End-of-RIB marker
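The stale-marking idea can be sketched from the point of view of the router whose neighbor restarted. The NeighborRoutes class and its method names are hypothetical, and timers and capability negotiation are left out.

```python
from typing import Dict, List, Set


class NeighborRoutes:
    """Routes learned from one neighbor, with a set of stale prefixes."""

    def __init__(self) -> None:
        self.routes: Dict[str, str] = {}     # prefix -> next hop
        self.stale: Set[str] = set()

    def learn(self, prefix: str, next_hop: str) -> None:
        self.routes[prefix] = next_hop
        self.stale.discard(prefix)           # a refreshed route is no longer stale

    def session_lost(self) -> None:
        """Keep forwarding on the old routes, but mark all of them stale."""
        self.stale = set(self.routes)

    def end_of_rib(self) -> List[str]:
        """The neighbor finished resending: drop routes it did not refresh."""
        removed = sorted(self.stale)
        for prefix in removed:
            del self.routes[prefix]
        self.stale.clear()
        return removed


peer = NeighborRoutes()
peer.learn("10.0.0.0/8", "192.0.2.1")
peer.learn("198.51.100.0/24", "192.0.2.1")
peer.session_lost()
usable_during_restart = sorted(peer.routes)  # nothing is deleted yet
peer.learn("10.0.0.0/8", "192.0.2.1")        # neighbor resends this route
removed = peer.end_of_rib()
print(usable_during_restart, removed)
```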
Reducing the Effects of IGP Changes
• Circumvent hot-potato routing
– Avoid small IGP changes leading to BGP changes
– … and avoid the software overhead on BGP
• Tunneling between edge routers
– Create tunnel from ingress to egress router
– Assign a weight to the tunnel (e.g., air miles)
– Tunnel weight does not depend on IGP path
[Figure: example topology with routers A through G, weighted links, and a destination dst]
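A small numeric sketch of the difference, using invented egress names and weights: with hot-potato routing the chosen egress follows the IGP distances, so a small IGP change flips the BGP decision, whereas a fixed tunnel weight leaves the decision unchanged.

```python
from typing import Dict


def closest_egress(cost: Dict[str, int]) -> str:
    """Pick the egress with the lowest cost, breaking ties by name."""
    return min(cost, key=lambda egress: (cost[egress], egress))


# Hypothetical IGP distances from one ingress router to two egress routers.
igp_before = {"egress1": 9, "egress2": 10}
igp_after = {"egress1": 11, "egress2": 10}    # one link weight went up by 2

# Hypothetical static tunnel weights (e.g., air miles), independent of the IGP.
tunnel_weight = {"egress1": 300, "egress2": 450}

hot_potato = (closest_egress(igp_before), closest_egress(igp_after))
with_tunnels = (closest_egress(tunnel_weight), closest_egress(tunnel_weight))
print(hot_potato, with_tunnels)
```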
Reducing Overhead for Groups of Routers
• Additional overhead in RCP-like approaches
– Computing routes on behalf of many routers
– Could lead to a linear increase in overhead
• Store a single copy of each BGP route
– One big global RIB for the network
– Plus, avoid repeating some of decision process
• Compute for groups of routers (e.g., PoP)
– One shared RIB-out for each group of routers
– Plus, avoid repeating the decision process
• Reduce the overhead of IGP changes
– E.g., by use of tunnel, as on previous slide
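A sketch of how a platform computing routes for many routers might split the work, assuming a decision process with just two steps: a router-independent step (shortest AS path) run once on the single global RIB, and a location-dependent step (closest egress) run once per group. The PoP names, egress names, and costs are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical global RIB entry for one prefix: (egress router, AS-path length)
Route = Tuple[str, int]


def shared_steps(routes: List[Route]) -> List[Route]:
    """Router-independent part of the decision: keep the shortest AS paths."""
    shortest = min(length for _, length in routes)
    return [r for r in routes if r[1] == shortest]


def per_group_step(survivors: List[Route], igp_cost: Dict[str, int]) -> str:
    """Group-dependent part: choose the closest egress among the survivors."""
    return min((egress for egress, _ in survivors),
               key=lambda e: (igp_cost[e], e))


routes = [("nyc", 2), ("sfo", 2), ("chi", 3)]
survivors = shared_steps(routes)              # computed once for the network

# One IGP cost table per group of routers (e.g., per PoP), not per router.
igp = {
    "pop-east": {"nyc": 1, "sfo": 9, "chi": 4},
    "pop-west": {"nyc": 9, "sfo": 1, "chi": 5},
}
choice = {pop: per_group_step(survivors, cost) for pop, cost in igp.items()}
print(choice)
```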
Conclusion
• Router software
– Very challenging systems problem
– New open-source software (Quagga, OpenBGPd)
• Improving scalability
– Scaling with # of routers, sessions, and prefixes
– Trading off memory and CPU resources
– Avoiding events that create excessive work
• Newly active research area
– Importance of control plane in network
performance, reliability, and security
– Creation of new platforms for router software
Next Time: BGP Security
• Two papers
– “Beware of BGP Attacks”
– “Secure Border Gateway Protocol (Secure-BGP)”
• Review only the second paper
– Summary
– Why accept
– Why reject
– Future work
• Optional NANOG video
– See the Web site later today…