Dynamic routing – QoS routing
• Other approaches to QoS routing
• Traffic Engineering
• Practical Traffic Engineering

Other approaches
• Reduce the load from updates: use more efficient distribution methods
  – Trees instead of flooding
  – But setting up the tree has its own complexity and issues
• Handle inaccurate state information: crankback
  – If the path I try is not good (because of stale information), step back and try a different one
  – Path setup may now take a long time
• Reduce the load from updates:
  – Introduce hierarchy (similar to IGP areas)
• Avoid updates altogether: probing [Chen, Nahrstedt 1998]
  – Send probes over multiple paths towards the destination
  – The probes collect information about the network conditions
  – The price is extra probe traffic

PNNI
• The QoS routing component of ATM
• The only standardized QoS routing protocol
  – Gone now that ATM is gone
• Link state with strict hierarchy
  – Recursively create multiple peer-groups
  – Flooding only inside a peer-group
  – Parent floods information to its descendants too
• Abstract nodes
  – Summarize the QoS characteristics of a whole peer-group
  – The peer-group appears as a single node in the parent peer-group
• A route is signaled using source routing
  – The source route is expanded as we enter a peer-group
• Crankback
  – If signaling fails, back up to the entry point of the peer-group and try another path

PNNI Routing Algorithm
• What is advertised
  – Administrative weight
  – Available bandwidth
  – Loss rates
  – Delay
  – Delay variation
• The routing algorithm was not specified in the protocol specification
  – Many proposals

Why is QoS routing “dead”?
• Partly because per-flow QoS is hard to achieve
  – Int-serv lost to diff-serv
• Partly because it is very hard to have QoS in the inter-domain
  – Everything we talked about is intra-domain
• “Applications did not need it” argument
  – Applications never had the chance to use it; it never worked
• It became traffic engineering
  – Different timescales
  – Offline algorithms
  – Traffic matrices
  – Potentially different optimization objectives
  – Still intra-domain though!

Traffic Engineering
• Given
  – A traffic matrix
    • Demands between any two endpoints in my network
    • In practice, demands between PoPs
  – The network
    • Topology
    • Link sizes
• Find
  – How to arrange the offered traffic in the network so as to optimize network performance

Problems
• What should I optimize?
• Do I really have the traffic matrix?
• How easy is it to do this optimization algorithmically?

What should I optimize?
• In QoS routing I was trying to maximize the traffic I could fit in the network
  – One request at a time
• In TE I know all the traffic
  – I can optimize some global routing metric
  – Minimize the overall cost of routing
    • Depends on how I define the cost of a link as a function of its load
    • A common choice is a function that makes the cost exponentially higher as the link approaches saturation
    • Minimizing the total cost keeps the load on each link away from saturation

How to get the traffic matrix
• Traffic matrix:
  – Volume of data between all pairs of ingress/egress points of my network
  – Could be PoPs, customers, etc.
• Hard to get the traffic matrix data
  – Packet counting is expensive
  – Sometimes only total packets are counted, not packets per destination
  – Even when I can count packets per destination, I have to map destinations to egress points
    • Routing dependent
  – Changes in routing can dramatically change the traffic matrix
    • BGP hot-potato routing
• Need to estimate the traffic matrix
  – Y is the vector of link loads, A is the routing matrix, X is the traffic matrix
  – Y = A*X
  – This is very under-determined: too many possible solutions

Traffic matrix estimation
• Active research area
  – Probabilistic approaches
    • Start from an estimate of the traffic matrix
    • Assume some statistics for the traffic
      – That may not necessarily be true; real traffic does not follow these models closely
    • Refine the estimates
  – Choice models
    • Model each PoP as making a choice of where to send its traffic
  – Gravity models
    • Traffic between PoP a and PoP b is
      – Proportional to the volume of traffic leaving a
      – Proportional to the volume of traffic entering b
      – Inversely proportional to their distance

General Routing problem
• Network with N nodes and E edges
• Traffic matrix T with a demand for each pair in N x N
• Cost function C(e)
  – Dependent on the load of edge e
• Find how to split traffic into flows to minimize
  – Cost = Sum of C(e) for all e in E
• Can be solved in polynomial time (e.g. with linear programming when C is piecewise linear)
  – If I can split flows arbitrarily

Unfortunately
• In IP networks
  – Routing depends only indirectly on the link costs
    • My algorithm should find link costs
  – Cannot split flows arbitrarily
    • May not have enough paths
    • If I have ECMPs, the flow is split equally among the multiple options
  – Destination-based routing
    • Traffic to the same destination will follow the same path
• With these constraints
  – The problem of finding IGP link costs so as to minimize the cost of routing is NP-hard

Enter MPLS
• MPLS can approximate the flow splitting properties
  – No destination routing anymore
  – Can control exactly what traffic goes into an LSP
  – And how this traffic is delivered to
its destination
• This connection-oriented nature is what makes MPLS (and ATM before it) good for traffic engineering
  – Of course there is some cost
    • Full mesh of LSPs
    • Higher administrative complexity

TE in practice
• I have a 3-level network
  – Customer, aggregation and WAN routers
• Three approaches for TE
  – IGP only
  – IGP+MPLS
    • Mostly IGP with the occasional LSP
    • For unequal-cost forwarding
    • For temporarily repairing hot-spots
  – MPLS
    • Full mesh of LSPs
    • Compute paths
      – On-line
      – Off-line

Pros and Cons
• IGP+MPLS
  – Mostly manual process
  – Error prone
  – It is not easy to patch up a network problem with a few LSPs
    • And it may cause other problems
• MPLS
  – Scaling of the full mesh
    • Can work at each of the 3 levels: a WAN-level full mesh scales OK, a customer-router full mesh could be a problem
    • With 100 customer routers I will have about 10,000 LSPs (100 x 99 unidirectional LSPs)
    • Can be more if I have separate LSPs for each Diff-Serv class
    • Signaling overhead
    • May hit the limit of LSPs in the transit routers

Off-line MPLS TE
• Compute best LSP paths for the whole network
  – Signal them using RSVP
• When something changes
  – Re-compute all the LSPs
• Off-line allows for better control
  – Compute best LSP paths for the whole network
    • No oscillations
  – Global view can optimize resource usage
• But cannot respond to sudden traffic changes
  – Attacks
  – Flash crowds

IP TE is not impossible
• Recent research has shown that it is possible to achieve solutions that are very close to the optimal using just IP
• I do this by picking the right IGP weights for each link
  – But as we said, this problem is NP-hard
  – Need to search the space of link weights
  – With some tricks this is feasible
    • For real networks this can get within a few percent of the optimal flow-based routing!

IGP link weight space search
• Typical state search problem
  – Can use a variety of methods
    • Steepest descent
    • Tabu search
  – Local vs.
global minima
• Start from a set of link weights, state1
  – Compute the cost of routing for this set
  – This is expensive
    • Need to route all the traffic and measure how much load each link has in order to compute its cost
• Modify one or more weights -> state2
  – Compute the new routing cost
  – Keep state2 if the new routing cost is better
• Continue until … ?

Some tricks to speed up search
• Avoid cycles
  – Remember states that were visited before and do not evaluate them again
  – Need to do this efficiently
• Faster routing cost evaluation
  – Consider the effects of only large flows
  – Incremental SPFs
• Find out which links have a large impact on the cost and optimize for them only
• Adaptation
  – Consider a dynamically sized “neighborhood” and explore it first before moving on
    • The neighborhood becomes smaller when I improve on the solution
    • Larger when I do not improve on the solution
• Avoid local minima
  – Essentially repeat the search starting from random places in the search space

When links fail?
• All this TE is good when links do not fail
• What happens in failures?
• MPLS TE
  – Failures are handled by fast reroute
  – Some minimal optimization when determining backup LSPs
  – Global re-optimization if the failure lasts for too long
• IGP weight optimization
  – It is possible to optimize weights to take into account single link failures
• Other approaches:
  – Multi-topology traffic engineering
  – Optimize the weights in the topologies that are used for protecting from failures
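
Example: IGP link weight search
The search loop from the "IGP link weight space search" slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular tool: the function names (`link_cost`, `link_loads`, `local_search`), the exponential cost with factor 8, the weight range, and the "change one random weight" move are all made up for the example. Routing follows shortest paths with equal ECMP splitting, and visited states are remembered so they are not evaluated twice.

```python
import heapq
import math
import random
from collections import defaultdict

INF = float("inf")

def link_cost(load, capacity):
    # Assumed cost shape: grows exponentially as utilization approaches 1.
    return math.exp(8.0 * load / capacity) - 1.0

def distances_to(dest, nodes, weights):
    """Dijkstra on reversed links: shortest distance from every node to dest."""
    rev = defaultdict(list)
    for (u, v), w in weights.items():
        rev[v].append((u, w))
    dist = {n: INF for n in nodes}
    dist[dest] = 0
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for u, w in rev[v]:
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist

def link_loads(nodes, weights, demands):
    """Route all demands on shortest paths, splitting equally over ECMP next hops."""
    loads = defaultdict(float)
    for dest in {t for _, t in demands}:
        dist = distances_to(dest, nodes, weights)
        inflow = defaultdict(float)
        for (s, t), volume in demands.items():
            if t == dest:
                inflow[s] += volume
        # Farthest nodes first, so a node's total inflow is known before it is split.
        for u in sorted(nodes, key=lambda n: -dist[n]):
            if u == dest or inflow[u] == 0 or dist[u] == INF:
                continue
            hops = [v for (a, v), w in weights.items()
                    if a == u and w + dist[v] == dist[u]]
            share = inflow[u] / len(hops)
            for v in hops:
                loads[(u, v)] += share
                inflow[v] += share
    return loads

def routing_cost(nodes, capacity, weights, demands):
    loads = link_loads(nodes, weights, demands)
    return sum(link_cost(loads[e], capacity[e]) for e in capacity)

def local_search(nodes, capacity, demands, iters=300, max_w=4, seed=0):
    rng = random.Random(seed)
    weights = {e: 1 for e in capacity}            # state1: all weights equal
    best = routing_cost(nodes, capacity, weights, demands)
    seen = {tuple(sorted(weights.items()))}       # remember visited states
    for _ in range(iters):
        cand = dict(weights)                      # state2: change one weight
        cand[rng.choice(list(capacity))] = rng.randint(1, max_w)
        key = tuple(sorted(cand.items()))
        if key in seen:
            continue
        seen.add(key)
        cost = routing_cost(nodes, capacity, cand, demands)
        if cost < best:                           # keep only improvements
            weights, best = cand, cost
    return weights, best

if __name__ == "__main__":
    nodes = ["A", "B", "C"]
    pairs = [("A", "B"), ("A", "C"), ("C", "B")]
    capacity = {e: 10.0 for u, v in pairs for e in ((u, v), (v, u))}
    demands = {("A", "B"): 12.0}                  # more than one link can carry
    print(local_search(nodes, capacity, demands))
```

In the triangle above, unit weights send all 12 units over the direct A-B link; raising the A-B weight to 2 makes the two-hop path equal cost, so ECMP splits the demand 6/6 and the routing cost drops. The sketch always accepts only improvements, so it can get stuck in a local minimum; restarting from random weights, as the slides suggest, is left out.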