
Dynamic routing – QoS routing
• Other approaches to QoS routing
• Traffic Engineering
• Practical Traffic Engineering
Other approaches
• Reduce the load from updates: Use more efficient
distribution methods
– Trees instead of flooding
– But setting up the tree has its own complexity and issues
• Handle inaccurate state information: Crankback
– If the path I try is not good (because of stale information), step back
and try a different one
– Path setup may take a long time now
• Reduce the load from updates:
– Introduce hierarchy (similar to IGP areas)
• Avoid updates altogether: Probing [Chen, Nahrstedt 1998]
– Send probes over multiple paths towards the destination
– The probes will collect information about the network conditions
– Have extra probe traffic now
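The crankback idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the 4-node topology, the `advertised` and `actual` bandwidth tables and the function names are hypothetical, and the sketch cranks back all the way to the source rather than to an intermediate node.

```python
def first_blocked_link(path, actual_bw, demand):
    """Signal hop by hop; return the first link that cannot admit the demand."""
    for link in zip(path, path[1:]):
        if actual_bw[link] < demand:
            return link
    return None


def setup_with_crankback(candidate_paths, advertised_bw, actual_bw, demand):
    """Try candidate paths in order, cranking back when real state disagrees."""
    blocked = set()   # links found to be unusable during this setup
    attempts = 0
    for path in candidate_paths:
        links = list(zip(path, path[1:]))
        # The source only knows the advertised (possibly stale) bandwidth.
        if any(l in blocked or advertised_bw[l] < demand for l in links):
            continue
        attempts += 1
        failed = first_blocked_link(path, actual_bw, demand)
        if failed is None:
            for l in links:
                actual_bw[l] -= demand   # reserve along the whole path
            return path, attempts
        blocked.add(failed)              # crank back and try a different path
    return None, attempts


# Hypothetical example: the advertisement for link A-B is stale.
advertised = {("A", "B"): 10, ("B", "D"): 10, ("A", "C"): 10, ("C", "D"): 10}
actual = {("A", "B"): 2, ("B", "D"): 10, ("A", "C"): 10, ("C", "D"): 10}
paths = [["A", "B", "D"], ["A", "C", "D"]]
chosen, tries = setup_with_crankback(paths, advertised, actual, demand=5)
print(chosen, tries)
```

The second attempt succeeds, which is the price mentioned above: every failed attempt adds to the path setup time.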
PNNI
• The QoS routing component of ATM
• Only standardized QoS routing protocol
– Gone now that ATM is gone
• Link state with Strict Hierarchy
– Recursively create multiple peer-groups
– Flooding only inside a peer-group
– Parent floods information to its descendants too
• Abstract nodes
– Summarize QoS characteristics of a whole peer-group
– Appears as a single node in the parent peer-group
• A route is signaled using source routing
– Source route is expanded as we enter a peer-group
• Crankback
– If signaling fails, back up to the entry point of the peer-group and
try another path
PNNI Routing Algorithm
• What is advertised
– Administrative weight
– Available bandwidth
– Loss rates
– Delay variation
• The routing algorithm was not specified in
the protocol specification
– Many proposals
Why is QoS routing “dead”?
• Partly because per-flow QoS is hard to achieve
– Int-serv lost to diff-serv
• Partly because it is very hard to have QoS in the
– All that we talked about is intra-domain
• “Applications did not need it” argument
– Applications never had the chance to use it, it never
• It became traffic engineering
– Different timescales
– Offline algorithms
– Traffic matrices
– Potentially different optimization objectives
– Still intra-domain though!
Traffic Engineering
• Given
– A traffic matrix
• Demands between any two endpoints in my network
• In practice demands between POPs
– The network
• Topology
• Link sizes
• Find
– How to arrange the offered traffic into the
network so as to optimize network
• What should I optimize?
• Do I really have the traffic matrix?
• How easy is it to do this optimization?
What should I optimize?
• In QoS routing I was trying to maximize the
traffic I could fit in the network
– One request at a time
• In TE I know all the traffic
– I can optimize some global routing metrics
– Minimize the overall cost of routing
• Depends on how I define the cost of a link as a
function of its load
• A common function is one that makes the cost
exponentially higher as the link approaches
• Minimizing this cost minimizes the load on each link
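A minimal sketch of such a link cost function, in Python. The breakpoints and slopes are an assumption, not something these slides specify: they follow a convex piecewise-linear shape often used in the IGP weight optimization literature (Fortz and Thorup), where the slope climbs steeply as utilization nears and passes 100%. The function names are hypothetical.

```python
# (utilization where the segment starts, slope of the cost on that segment)
# Assumed values; any convex, steeply increasing shape would serve the same purpose.
SEGMENTS = [(0.0, 1), (1 / 3, 3), (2 / 3, 10), (0.9, 70), (1.0, 500), (1.1, 5000)]


def link_cost(load, capacity):
    """Convex piecewise-linear cost of carrying `load` on a link of `capacity`."""
    cost = 0.0
    for i, (start, slope) in enumerate(SEGMENTS):
        end = SEGMENTS[i + 1][0] if i + 1 < len(SEGMENTS) else float("inf")
        lo, hi = start * capacity, end * capacity
        if load > lo:
            cost += slope * (min(load, hi) - lo)   # part of the load in this segment
    return cost


def routing_cost(loads, capacities):
    """Overall cost of routing: the sum of the per-link costs."""
    return sum(link_cost(loads[e], capacities[e]) for e in loads)


caps = {"e1": 100, "e2": 100}
balanced = routing_cost({"e1": 50, "e2": 50}, caps)
skewed = routing_cost({"e1": 95, "e2": 5}, caps)
print(balanced, skewed)
```

Both load vectors carry the same total traffic, but the skewed one costs several times more, so minimizing the sum pushes traffic away from links that are close to full.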
How to get the traffic matrix
• Traffic matrix:
– Volume of data between all pairs of ingress/egress points of my network
– Could be PoPs, customers, etc.
• Hard to get the traffic matrix data
– Packet counting is expensive
– Sometimes I can count only total packets, not packets per destination
– Even when I can count packets/destination I have to map
destinations to egress points
• Routing dependent
– Changes in routing can dramatically change the traffic matrix
• BGP hot-potato routing
• Need to estimate the traffic matrix
– Y is the table of link loads, A is the routing, X is the traffic matrix
– Y=A*X
– This is very under-determined, too many possible solutions
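A small Python illustration of why Y = A*X is under-determined. The three-node line topology and the numbers are hypothetical: two links are measured, but three demands are unknown, so different traffic matrices explain the same link loads.

```python
# Line topology A -> B -> C. Rows of the routing matrix are links,
# columns are demands (a 1 means the demand crosses that link).
demands = ["A->B", "B->C", "A->C"]
links = ["A-B", "B-C"]
A = [
    [1, 0, 1],  # link A-B carries A->B and A->C
    [0, 1, 1],  # link B-C carries B->C and A->C
]


def link_loads(routing, traffic):
    """Compute Y = A * X for a routing matrix and a demand vector."""
    return [sum(a * x for a, x in zip(row, traffic)) for row in routing]


x1 = [5, 5, 0]   # no end-to-end traffic at all
x2 = [3, 3, 2]   # some A->C traffic, less local traffic
y1 = link_loads(A, x1)
y2 = link_loads(A, x2)
print(y1, y2)
```

Both candidate matrices produce the same measurements. In a real network the gap is much wider: the number of demands grows with the square of the number of PoPs, while the number of links grows far more slowly.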
Traffic matrix estimation
• Active research area
– Probabilistic approaches
• Start from an estimate of the traffic matrix
• Assume some statistics for the traffic
– That may not be true; real traffic does not follow
these models closely
• Refine the estimates
– Choice models
• Model as each POP making a choice where to send its traffic
– Gravity models
• Traffic between POP a and POP b is
– Proportional to the volume of traffic leaving a
– Proportional to the volume of traffic entering b
– Inversely proportional to their distance
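The gravity model above can be sketched as follows in Python. The PoP names, volumes and distances are made up, and the final scaling step (making the estimates add up to the total traffic leaving all PoPs) is an assumption of this sketch, not part of the slide.

```python
def gravity_estimate(out_volume, in_volume, distance):
    """Estimate traffic a->b as proportional to out[a] * in[b] / distance[a][b]."""
    pops = list(out_volume)
    raw = {
        (a, b): out_volume[a] * in_volume[b] / distance[a][b]
        for a in pops
        for b in pops
        if a != b
    }
    # Assumed normalization: estimates sum to the total traffic leaving all PoPs.
    scale = sum(out_volume.values()) / sum(raw.values())
    return {pair: value * scale for pair, value in raw.items()}


# Hypothetical measurements for three PoPs.
out_vol = {"a": 60, "b": 30, "c": 10}
in_vol = {"a": 20, "b": 30, "c": 50}
dist = {"a": {"b": 1, "c": 2}, "b": {"a": 1, "c": 1}, "c": {"a": 2, "b": 1}}
estimate = gravity_estimate(out_vol, in_vol, dist)
for pair in sorted(estimate):
    print(pair, round(estimate[pair], 2))
```

Note that this simple form does not guarantee that the per-PoP row and column sums match the measured volumes; it only gives a starting estimate that other methods can refine.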
General Routing problem
• Network with N nodes and E edges
• Traffic matrix T for each pair in N x N
• Cost function C(e)
– Dependent on the load of edge e
• Find how to split traffic into flows to minimize
– Cost = Sum of C(e) for all e in E
• Can solve efficiently with linear programming (for a piecewise-linear convex cost)
– If I can split flows arbitrarily
• In IP networks
– Routing depends only indirectly on the link costs
• My algorithm should find link costs
– Cannot split flows arbitrarily
• May not have enough paths
• If I have ECMPs, flow is split equally among the multiple paths
– Destination based routing
• Traffic to the same destination will follow the same path
• With these constraints
– Problem of finding IGP link costs so as to minimize the
cost of routing is NP-hard
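To make the IP constraints concrete, here is a minimal Python sketch that turns a set of IGP link weights into link loads, using destination-based shortest-path forwarding with an equal split over ECMP next hops. The function names and the diamond topology are hypothetical, and the sketch assumes positive integer weights and that every source can reach its destination.

```python
import heapq
from collections import defaultdict


def dist_to(dest, weights):
    """Dijkstra on the reversed graph: shortest distance from each node to dest."""
    rev = defaultdict(list)
    for (u, v), w in weights.items():
        rev[v].append((u, w))
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue
        for u, w in rev[v]:
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


def ecmp_link_loads(weights, demands):
    """weights: {(u, v): igp_weight}; demands: {(src, dst): volume}."""
    loads = {link: 0.0 for link in weights}
    for dest in {d for _, d in demands}:
        dist = dist_to(dest, weights)
        inflow = defaultdict(float)
        for (s, d), vol in demands.items():
            if d == dest:
                inflow[s] += vol
        # Farthest nodes first, so transit traffic has arrived before a node forwards.
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dest or inflow[u] == 0:
                continue
            next_hops = [v for (a, v), w in weights.items()
                         if a == u and v in dist and w + dist[v] == dist[u]]
            share = inflow[u] / len(next_hops)   # equal split, nothing else allowed
            for v in next_hops:
                loads[(u, v)] += share
                inflow[v] += share
    return loads


# Hypothetical diamond: two equal-cost paths from A to D.
w_equal = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
w_skew = dict(w_equal)
w_skew[("A", "B")] = 2
demand = {("A", "D"): 10}
print(ecmp_link_loads(w_equal, demand))
print(ecmp_link_loads(w_skew, demand))
```

With equal weights the demand splits 5/5; raising one weight moves all 10 units to the other path. A 7/3 split cannot be expressed at all, which is the restriction that makes the weight-setting problem hard.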
Enter MPLS
• MPLS can approximate the flow splitting
– No destination routing anymore
– Can control exactly what traffic goes into an LSP
– And how this traffic is delivered to its destination
• This connection-oriented nature is what makes
MPLS (and ATM before) good for traffic engineering
– Of course there is some cost
• Full mesh of LSPs
• Higher administrative complexity
TE in practice
• I have a 3-level network
– Customer, aggregation and WAN routers
• Three approaches for TE
– IGP only
– Mostly IGP with the occasional LSP
• For unequal cost forwarding
• For temporarily repairing hot-spots
– Full mesh of LSPs
• Compute paths
– On-line
– Off-line
Pros and Cons
– Mostly manual process
– Error prone
– It is not easy to patch up a network problem with a
few LSPs
• And may cause other problems
– Scaling of the full mesh
• Can work at each of the 3 levels: WAN-level full mesh scales
OK, customer-router full mesh could be a problem
• With 100 customer routers I will have about 10,000 LSPs (100 × 99 = 9,900, one per direction)
• Can be more if I have separate LSPs for each Diff-Serv class
• Signaling overhead
• May hit the limit of LSPs in the transit routers
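The full-mesh scaling numbers above are simple arithmetic. A short Python sketch, assuming one unidirectional LSP per ordered pair of routers; the function name and the choice of 4 Diff-Serv classes are only illustrative.

```python
def full_mesh_lsps(routers, classes=1):
    """One unidirectional LSP per ordered router pair, per traffic class."""
    return routers * (routers - 1) * classes


print(full_mesh_lsps(100))      # 9900, roughly the 10,000 quoted above
print(full_mesh_lsps(100, 4))   # 39600 with 4 classes
print(full_mesh_lsps(10))       # 90 for a small core mesh
```

The count grows with the square of the number of routers, which is why a mesh among a small number of core routers stays manageable while a mesh among all customer routers may not.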
Off-line MPLS TE
• Compute best LSP paths for the whole network
– Signal them using RSVP
• When something changes
– Re-compute all the LSPs again
• Off-line allows for better control
– Compute best LSP paths for the whole network
• No oscillations
– Global view can optimize resource usage
• But cannot respond to sudden traffic changes
– Attacks
– Flash crowds
IP TE is not impossible
• Recent research has shown that it is possible
to achieve solutions that are very close to
the optimal using just IP
• I do this by picking the right IGP weights
for each link
– But as we said this problem is NP-hard
– Need to do a state space search in the link
weight state space
– With some tricks this is feasible
• For real networks this can get within a few
percent of the optimal flow-based routing!
IGP link weight space search
• Typical state search problem
– Can use variety of methods
• Steepest descent
• Tabu search
– Local vs. global minima
• Start from a set of link weights: state1
– Compute the cost of routing for this set
– This is expensive
• Need to route all the traffic and measure how much load each
link has in order to compute its cost
• Modify one or more weights -> state2
– Compute new routing cost
– Keep if new routing cost is better
• Continue until … ?
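The loop above can be sketched in Python as a plain descent. Two things are assumptions here: the stopping rule (a fixed iteration budget, since the slide leaves it open) and `toy_cost`, a cheap hypothetical stand-in for the expensive step of routing all the traffic and summing the link costs.

```python
import random


def local_search(weights, routing_cost, max_weight=20, iterations=200, seed=0):
    """Keep changing one link weight at a time; keep a change only if cost drops."""
    rng = random.Random(seed)
    current = dict(weights)                 # state1
    best_cost = routing_cost(current)       # the expensive evaluation
    for _ in range(iterations):             # assumed stop rule: fixed budget
        candidate = dict(current)           # state2: modify one weight
        link = rng.choice(sorted(candidate))
        candidate[link] = rng.randint(1, max_weight)
        cost = routing_cost(candidate)
        if cost < best_cost:                # keep only improvements
            current, best_cost = candidate, cost
    return current, best_cost


def toy_cost(w):
    # Hypothetical stand-in: two parallel paths A-B-D and A-C-D; the cost is
    # lowest when both paths have the same total weight, so ECMP can use both.
    return abs((w[("A", "B")] + w[("B", "D")]) - (w[("A", "C")] + w[("C", "D")]))


start = {("A", "B"): 1, ("A", "C"): 9, ("B", "D"): 4, ("C", "D"): 4}
final, final_cost = local_search(start, toy_cost)
print(toy_cost(start), final_cost, final)
```

Because only improvements are accepted, this version can stop in a local minimum; it never moves to a worse state to escape one.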
Some tricks to speed up search
• Avoid cycles
– Remember states that were visited before and do not evaluate them
– Need to do this efficiently
• Faster routing cost evaluation
– Consider the effects of only large flows
– Incremental SPFs
• Find out which links are the ones that have a large impact
on the cost and optimize for them only
• Adaptation
– Consider a dynamically sized “neighborhood” and explore it first
before moving on
• Neighborhood becomes smaller when I improve on the solution
• Larger when I do not improve on the solution
• Avoid local minima
– Essentially repeat search starting from random places in the state
search space
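Three of the tricks above (remembering visited states, an adaptive neighborhood, and random restarts) can be combined in one short Python sketch. All names and parameters are hypothetical, the neighborhood size is modeled simply as the number of weights changed per step, and `toy_cost` again stands in for the real routing cost evaluation.

```python
import random


def search_with_tricks(links, routing_cost, max_weight=20, restarts=3,
                       steps=100, seed=0):
    rng = random.Random(seed)
    visited = set()                       # hashes of states already evaluated
    best, best_cost = None, float("inf")
    for _ in range(restarts):             # restart from random places
        current = {l: rng.randint(1, max_weight) for l in links}
        current_cost = routing_cost(current)
        visited.add(hash(tuple(sorted(current.items()))))
        radius = 1                        # how many weights change per step
        for _ in range(steps):
            cand = dict(current)
            for l in rng.sample(links, min(radius, len(links))):
                cand[l] = rng.randint(1, max_weight)
            key = hash(tuple(sorted(cand.items())))
            if key in visited:
                continue                  # seen before: skip the evaluation
            visited.add(key)
            cost = routing_cost(cand)
            if cost < current_cost:
                current, current_cost = cand, cost
                radius = max(1, radius - 1)            # smaller when improving
            else:
                radius = min(len(links), radius + 1)   # larger when stuck
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost


def toy_cost(w):
    # Hypothetical stand-in for routing all traffic and summing link costs.
    return abs((w[("A", "B")] + w[("B", "D")]) - (w[("A", "C")] + w[("C", "D")]))


links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
best, best_cost = search_with_tricks(links, toy_cost)
print(best_cost, best)
```

Storing only a hash per state keeps the memory small, at the price that a rare hash collision could make the search skip a state it has not actually evaluated.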
When links fail?
• All this TE is good when links do not fail
• What happens in failures?
– Failures are handled by fast reroute
– Some minimal optimization when determining backup LSPs
– Global re-optimization if the failure lasts for too long
• IGP weight optimization
– It is possible to optimize weights to take into account single link failures
• Other approaches:
– Multi-topology traffic engineering
– Optimize the weights in the topologies that are used for protecting
from failures