TCOM 799: Algorithms in Networking
Instructor: Saswati Sarkar
Contact Info: 215 573 9071 (Phone), swati@ee.upenn.edu
Office Hours: 1-3 p.m., 360 MB
Course webpage: http://www.seas.upenn.edu/~swati/TCOM79901.htm

Topics covered in this course
• Routing algorithms
  – Point to point routing
  – Point to multipoint routing
• Flow control algorithms
  – Network flows
  – Optimization algorithms
• Resource scheduling algorithms
  – Matching
  – Integer linear programming

Direction of approach
• Basic algorithmic theory
• Recent networking papers

Grading
• Homework (20% of grade)
  – One set every month
• Paper criticism (20% of grade)
• Term project (60% of grade)

Today's Class
• Brief introduction
• Basics of algorithm analysis and design
• Point to point routing algorithms
  – Dijkstra's algorithm
  – Bellman-Ford algorithm
  – Overview of Internet routing

Next Class
• Floyd-Warshall algorithm
• Johnson's algorithm
• Networking papers on point to point routing

Keywords
• Networks
  – A group of users communicating among themselves
• Algorithm
  – A sequence of logical instructions
• Common problems in networking
  – Find a route for a message
  – Distribute the network resources among contending users

[Figure: routing from a source to a destination; resource allocation of bandwidth and buffer space]

• Need efficient algorithms to decide these
• Design an algorithm
• Prove its correctness
  – Prove that it attains its objective
• Analyze its complexity
  – How much time does it take to finish
  – How much storage does it require

• Run time is machine dependent
  – We will define it in terms of operations
• For i = 1 to N
  – ith num -> ith num + 1
  – ith num -> ith num * ith num + 1
• N memory accesses, 2N additions, N multiplications
• 4N operations
  – An operation takes a constant time

• Broad nature of complexity function
  – Logarithmic in input size (log N)
  – Linear in input size (N)
  – Polynomial in input size (N^k)
  – Exponential in input size (exp(N))
• c1 N and c2 N are both linear, N is linear, 100000 log(N) is logarithmic
• Constants do not matter

• We are interested in the algorithm performance for large inputs (asymptotic analysis)
  – All linear functions show the same nature of increase as the input size grows
  – A logarithmic function is smaller than a linear function for all sufficiently large input sizes, independent of the constants
• Get rid of the constants in complexity analysis!

Formal complexity analysis
• Refer to Supplement 1

Routing Algorithms
• Shortest path routing
• What is a shortest path?
  – Minimum number of hops?
  – Minimum distance?
• There is a weight associated with each link
  – Weight can be a measure of congestion in the link, propagation delay, etc.
• Weight of a path is the sum of the weights of all its links
• Shortest path is the minimum weight path

[Figure: two paths from source to destination; the link weights shown are 1, 1.5, 0.5 and 2.5]
Weight of path 1 = 2.5
Weight of path 2 = 3.0

Computation of shortest paths
• Enumerate all paths?
  – Exponential complexity
• Several polynomial complexity algorithms exist
  – Dijkstra's algorithm (greedy algorithm)
  – Bellman-Ford algorithm (distributed algorithm)
  – Floyd-Warshall algorithm (dynamic programming)
  – Johnson's algorithm

Dijkstra's algorithm
• Assumes a directed graph
• Given any node, finds the shortest path to every other node in the graph
• O(V log V + E)

• Let the source node be s
• Maintains a shortest path "estimate" d(v) for every vertex v, and the list of vertices whose shortest path is known
  – The "estimate" is what the algorithm currently believes to be the weight of the shortest path from s
• Initially the list of vertices whose shortest path is known is empty, and the estimates are infinity for all vertices except the source vertex itself.
• It holds that whenever the estimate d(v) is finite for a vertex v, there exists a path from the source s to v with weight d(v)
• It turns out that the estimate is accurate for the vertex with the minimum value of this estimate (among the vertices not yet in the known list)
  – Shortest path is known for this vertex (v)
• This vertex (v) is added to the list of vertices whose shortest path is known
• Shortest path estimates are updated for every vertex that has an edge from v and is not in this "known list"

Estimate Update Procedure
• Suppose vertex v has just been added to the list, and we are updating the estimate for vertex u
  – d(v) is the shortest path estimate for v, d(u) is the estimate for u
  – w(v, u) is the weight of the edge from v to u

d(u) ← min(d(u), d(v) + w(v, u))

Intuition behind the update procedure
• Assume that d(u) and d(v) are finite
• So there exists a path to v from s of weight d(v), (s, v1, v2, ..., v)
• Hence there exists a path from s to u, (s, v1, v2, ..., v, u), of weight d(v) + w(v, u)
• Also, there exists a path to u of weight d(u)
• So the shortest path to u cannot have weight more than either d(u) or d(v) + w(v, u)
• So we update the estimate to the minimum of the two
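
The estimate update step above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary `d` of estimates, the function name `update_estimate`, and the sample values are all hypothetical and do not come from the course supplements.

```python
import math

def update_estimate(d, v, u, w_vu):
    """Apply d(u) <- min(d(u), d(v) + w(v, u)); return True if d(u) improved."""
    candidate = d[v] + w_vu
    if candidate < d[u]:
        d[u] = candidate
        return True
    return False

# Hypothetical estimates: s is the source, v was just added to the known list.
d = {"s": 0.0, "v": 2.0, "u": math.inf}
update_estimate(d, "v", "u", 3.0)   # path through v has weight 2.0 + 3.0
update_estimate(d, "s", "u", 7.0)   # direct edge of weight 7.0 is no better
```

The estimate only ever decreases, and every finite value corresponds to the weight of an actual path, which is the property the correctness argument relies on.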
Notation
• Source vertex: s
• Set of vertices: V
• Set of edges: E
• Shortest path estimate of vertex v: d(v)
• Weight of edge (u, v): w(u, v)
• Set of edges originating from vertex v: Adj(v)
• Set of vertices whose shortest paths are known: S
• Q = V \ S
• Refer to Supplement 3 for the algorithm statement

Example
[Figure: step-by-step run of Dijkstra's algorithm from source s on an example weighted graph]

Algorithm Complexity
• Statement 1 is executed |V| times
• Statements 2 and 3 are executed once
• Loop at statement 4 is executed |V| times
• Every extract-min operation can be done in at most |V| operations
• Statement 4(a) therefore accounts for at most |V|^2 operations in total
• Statements 4(b) and 4(c) are executed |V| times each (total)
• Observe that statement 4(d) is executed |E| times
• So the overall complexity is O(|V|^2 + |E|), which is the same as O(|V|^2) because |E| <= |V|^2
• Using improved data structures, the complexity can be reduced
  – O((|V| + |E|) log |V|) using binary heaps
  – O(|V| log |V| + |E|) using Fibonacci heaps

Proof of Correctness
• Exercise:
  – Verify that whenever d(v) is finite, there is a path from source s to vertex v of weight d(v)

Assumptions
• Assume that source s is connected to every vertex in the graph, and all edge weights are finite. Also, assume that edge weights are positive.
• Let p(s, v) be the weight of the shortest path from s to v.
• We will show that the algorithm terminates with d(v) = p(s, v) for every vertex v

• We will first show that once a vertex v enters S, d(v) equals the shortest path weight from source s at all subsequent times.
  – Clearly this holds in step 1, as the source enters S in step 1, and d(s) = 0
  – Suppose this fails for the first time in step k > 1
• Thus the vertex u added in step k has d(u) > p(s, u)
• Consider the situation just before the insertion of u.
• Consider the true shortest path, p, from s to u.
• Since s is in S, and u is in Q, path p must jump from S to Q at some point.
[Figure: path p from s to u, crossing from S to Q along the edge (x, y)]

Let the jump have end point x in S, and y in Q (possibly s = x, and u = y).
We will argue that y and u are different vertices.
Since path p is the shortest path from s to u, the segment of path p between s and x is the shortest path from s to x, and the segment between s and y is the shortest path from s to y.

[Figure: the same path p, with the edge (x, y) of weight w(x, y) marked]

Weight of the segment between s and x is d(x)
• Since x is in S, d(x) is the weight of the shortest path to x
Weight of the segment between s and y is d(x) + w(x, y)
Thus, p(s, y) = d(x) + w(x, y)
Also, d(y) <= d(x) + w(x, y) = p(s, y), because the estimate of y was updated when x entered S
Since d(y) is the weight of some path to y, d(y) >= p(s, y); it follows that d(y) = p(s, y)
However, d(u) > p(s, u). So, u and y are different

[Figure: y lies on the shortest path from s to u]

Since y appears somewhere along the shortest path between s and u, but y and u are different, p(s, y) < p(s, u)
• This uses the fact that all edges have positive weight
Hence, d(y) = p(s, y) < p(s, u) < d(u)
Both y and u are in Q, and the algorithm picks the vertex in Q with the minimum estimate. So, u should not be chosen in this step, which is a contradiction.
So, whenever a vertex u is inducted in S, d(u) = p(s, u).
Once d(u) equals p(s, u) for any vertex, it cannot change any further (d(u) can only decrease or remain the same, and d(u) cannot fall below p(s, u)).
Since the algorithm terminates only when S = V, we are done!

• We have proved correctness only for edges with positive weight
  – CLR proves it for edges with nonnegative weight (Chapter 25)
• Shortcomings
  – Does not hold for edges with negative weight
  – Centralized algorithm

Exercise: This computation gives shortest path weights only. Modify this algorithm to generate shortest paths as well!
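
As a companion to the Dijkstra discussion above, here is a minimal Python sketch of the binary-heap variant with O((|V| + |E|) log |V|) running time. It is an illustration under assumptions, not the statement from Supplement 3: the adjacency-list dictionary, the function name `dijkstra`, and the sample graph are hypothetical, and the sketch skips stale heap entries (lazy deletion) instead of using a decrease-key operation. Like the slides, it computes shortest path weights only.

```python
import heapq
import math

def dijkstra(adj, s):
    """Return shortest path weights d(v) from s; assumes nonnegative edge weights.

    adj maps every vertex v to a list of (u, w) pairs, one per edge (v, u) of weight w.
    """
    d = {v: math.inf for v in adj}
    d[s] = 0.0
    known = set()            # the set S of vertices whose shortest path is known
    heap = [(0.0, s)]        # candidate (estimate, vertex) pairs
    while heap:
        dv, v = heapq.heappop(heap)
        if v in known:
            continue         # stale entry: v was already finalized
        known.add(v)
        for u, w in adj[v]:
            if u not in known and dv + w < d[u]:
                d[u] = dv + w                    # estimate update for u
                heapq.heappush(heap, (d[u], u))
    return d

# Hypothetical 4-vertex directed graph, not the example from the slides.
graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("c", 6.0)],
    "b": [("c", 1.0)],
    "c": [],
}
dist = dijkstra(graph, "s")
```

Every vertex, including one with no outgoing edges, must appear as a key of `adj` so that it receives an initial estimate of infinity.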
Bellman-Ford Algorithm
• Applies as long as there are no cycles of weight 0 or less
  – A negative-weight cycle makes shortest paths undefined; excluding zero-weight cycles as well ensures that every shortest path is simple
• Capable of fully distributed operation
• O(|V||E|) complexity
  – Slower than Dijkstra

Algorithm description
• Every node v maintains a shortest path weight estimate, d(v)
• These estimates are initialized to infinity for all vertices except the source s, for which d(s) = 0
• Every node repeatedly updates its shortest path estimate as follows

$$d(v) \leftarrow \min_{u \,:\, v \in \mathrm{Adj}(u)} \bigl(d(u) + w(u, v)\bigr)$$

[Figure: node v and its neighbors with their estimates and edge weights; the update gives d(v) = 2.6]

With synchronous updates at times t = 1, 2, ...:

$$d^{t}(v) = \min_{u \,:\, v \in \mathrm{Adj}(u)} \bigl(d^{t-1}(u) + w(u, v)\bigr)$$

Refer to Supplement 3 for the algorithm statement

Example
[Figure: step-by-step run of the Bellman-Ford algorithm from source s on an example weighted graph]

Complexity Analysis
• Initialization step takes |V| + 1 steps
• The loop in statement 3 is executed |V| times
• Each execution takes |E| steps
• Overall, there are |V| + 1 + |V||E| steps
  – O(|V||E|)

Proof that it works
• Assume that all vertices are reachable from source s.
  – Thus there is a shortest path to any vertex v from s.
• Assume that the graph has no cycles of weight 0 or less
  – So the shortest paths cannot have more than |V| - 1 edges.
• We will prove that at the termination of the Bellman-Ford algorithm, d(v) = p(s, v) for every vertex v.
• We will show that if there is a shortest path of k hops to a vertex v, then after the kth execution of the loop in statement 3, d(v) freezes at p(s, v)

We know the above holds for k = 0, as d(s) = p(s, s) = 0 at all times.
Let the above hold for 0, 1, ..., k. We will show that it holds for k + 1.
Consider a vertex u with a shortest path p of k + 1 hops.

[Figure: path p from s to u, with predecessor y and sub-path p1 from s to y]

Let vertex y be the predecessor of u on p, and let p1 be the segment of p from s to y.
Clearly p1 is a shortest path to y and it has k hops. So the weight of path p1 is p(s, y).
So, by the induction hypothesis, d(y) = p(s, y) after the kth iteration and at all subsequent times.
So by the estimate update procedure, d(u) <= d(y) + w(y, u) = p(s, y) + w(y, u) = weight of path p = p(s, u) after the (k+1)th iteration and at all subsequent times.

We have just shown that d(u) <= p(s, u) after the (k+1)th iteration.
Again, verify that as long as d(v) is finite, d(v) is the weight of some path to vertex v. Hence d(u) >= p(s, u) always.
Thus, d(u) = p(s, u) at all times after the (k+1)th iteration.

Features of this algorithm
• Note that a node needs information about its neighbors only!
• So we do not need a global processor.
• However, all nodes need to synchronize their clocks (compute at the same times, t = 1, 2, ...)
  – Difficult in practice

Asynchronous Bellman-Ford
• Every node periodically receives estimates (d(u)) from its neighbors
• Every node periodically computes its shortest path estimate based on its knowledge of the estimates of its neighbors

[Figure: node v and its neighbors with their estimates and edge weights]

$$d^{t}(v) = \min_{u \,:\, v \in \mathrm{Adj}(u)} \bigl(d^{\tau_{u,v}(t)}(u) + w(u, v)\bigr)$$

• Here, $\tau_{u,v}(t)$ is the time of the last computation at u that reaches node v before t.
• Here, $d^{\tau_{u,v}(t)}(u)$ is the estimate computed at node u at time $\tau_{u,v}(t)$.
• Note that $\tau_{u,v}(t)$ depends on u, v and t.
• The only requirement is that updates come infinitely often from each neighbor, i.e., for all edges (u, v),

$$\lim_{t \to \infty} \tau_{u,v}(t) = \infty$$

The asynchronous updates converge to the shortest path values starting from any initial condition (Bertsekas and Gallager, pp. 406-410).

Note that for both Dijkstra and Bellman-Ford we have discussed the computation of routes from a single source to all destinations.
We can obtain routes from all nodes to a single destination by using the above algorithms on a reversed graph (verify!).
Also, for Bellman-Ford we can obtain the routes from all nodes to a single destination s by initializing $d^{0}(s) = 0$ and $d^{0}(v) = \infty$ for all other vertices, and using the following updates:

$$d^{t}(v) = \min_{u \in \mathrm{Adj}(v)} \bigl(d^{\tau_{u,v}(t)}(u) + w(v, u)\bigr)$$

Now $d^{t}(v)$ is the distance to destination s from vertex v.
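
The synchronous Bellman-Ford iteration analysed above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the statement from Supplement 3: the edge-list representation, the function name `bellman_ford`, and the sample graph are hypothetical. Each round reads only the previous round's estimates, and each vertex keeps its own previous estimate if no incoming edge improves it.

```python
import math

def bellman_ford(vertices, edges, s):
    """Synchronous Bellman-Ford: |V| - 1 rounds, each reading the previous round's estimates.

    edges is a list of (u, v, w) triples for directed edges (u, v) of weight w.
    Assumes no negative-weight cycle is reachable from s.
    """
    d = {v: math.inf for v in vertices}
    d[s] = 0.0
    for _ in range(len(vertices) - 1):
        prev = dict(d)                      # estimates from round t - 1
        for u, v, w in edges:
            if prev[u] + w < d[v]:
                d[v] = prev[u] + w          # better path to v through u
        if d == prev:
            break                           # no estimate changed: converged
    return d

# Hypothetical graph with one negative edge but no negative cycle.
vertices = ["s", "a", "b", "c"]
edges = [("s", "a", 4.0), ("s", "b", 5.0), ("b", "a", -2.0), ("a", "c", 3.0)]
dist = bellman_ford(vertices, edges, "s")
```

In this sample the shortest path to c has three hops (s, b, a, c), so its estimate settles only in the third round, in line with the induction on the number of hops.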
Current Internet Routing
• Packet headers have the destination address.
• Routers have routing tables with the next hop link for every entry:

  Destination   Next hop
  a             l1
  b             l2

• When a packet arrives, a router checks the destination, locates it in the routing table, and forwards the packet to the corresponding next hop link

• Source Routing
  – The source puts the desired route in the packet, and the routers forward in accordance with the mentioned route.
• Routing tables are constructed as per shortest path routing
• Protocols for shortest path computation
  – Link state (Dijkstra)
  – Distance vector (Bellman-Ford)

Link State Protocols
• Every router maintains the entire topology of the network
• For this purpose, every router periodically broadcasts the status of the links to its neighbors to the entire network
• Whenever a router gets a message indicating that there is a topology change
  – It updates its topology record
  – It uses Dijkstra's algorithm to compute shortest paths to all destinations
• Example protocol: OSPF (Open Shortest Path First)
• Refer to "An Engineering Approach to Computer Networking", S. Keshav, Chapter 11

Distance Vector Protocols
• Every router maintains the distance and next link to every destination in the network (routing table).
• It sends this routing table to all of its neighbors

[Figure: node v and its neighbors with their estimates and edge weights]

• When a router gets an update, it recomputes the shortest paths to all the destinations as per the Bellman-Ford update

$$d^{t}(v, x) = \min_{u \in \mathrm{Adj}(v)} \bigl(d^{\tau_{u,v}(t)}(u, x) + w(v, u)\bigr)$$

Here, $d^{t}(v, x)$ is the distance from vertex v to destination x.
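
The distance vector recomputation above can be sketched in Python. This is a hedged illustration only: the function name `distance_vector_update`, the dictionary layout, and the sample costs are assumptions, and the sketch records the neighbor as the next hop rather than a link identifier. It does not model any particular protocol's message format or timers.

```python
import math

def distance_vector_update(v, link_cost, neighbor_vectors):
    """Recompute router v's table from the latest vectors heard from its neighbors.

    link_cost[u] is w(v, u) for each neighbor u; neighbor_vectors[u][x] is the
    last distance from u to destination x that u advertised to v.
    Returns {x: (distance, next_hop)}.
    """
    table = {v: (0.0, None)}
    destinations = {x for vec in neighbor_vectors.values() for x in vec} - {v}
    for x in destinations:
        best = (math.inf, None)
        for u, w in link_cost.items():
            # Distance to x if the packet is handed to neighbor u first.
            dist = neighbor_vectors.get(u, {}).get(x, math.inf) + w
            if dist < best[0]:
                best = (dist, u)
        table[x] = best
    return table

# Hypothetical router "v" with neighbors "a" and "b".
table = distance_vector_update(
    "v",
    link_cost={"a": 1.0, "b": 3.0},
    neighbor_vectors={
        "a": {"a": 0.0, "b": 1.5, "x": 5.0},
        "b": {"b": 0.0, "a": 1.5, "x": 2.0},
    },
)
```

In this sample, v reaches b more cheaply through a (1.0 + 1.5) than over its direct link of cost 3.0, while x is reached through b.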