CSE 331: Review (August 1, 2013)

Main Steps in Algorithm Design
- Problem statement: the real-world problem
- Problem definition: a precise mathematical definition
- Algorithm: "implementation," data structures
- Analysis: correctness / run time

Stable Matching Problem: the Gale-Shapley Algorithm

Stable Marriage Problem
- A set of men M and a set of women W, each person with preferences (a ranking of all potential spouses)
- Matching: a subset of M x W with no polygamy (each person appears in at most one pair)
- Perfect matching: everyone gets married
- Instability: a pair (m, w') not in the matching such that m prefers w' to his partner w, and w' prefers m to her partner m'
- Stable matching = perfect matching + no instability

Gale-Shapley Algorithm
    Initially all men and women are free
    While there exists a free woman who can propose          (at most n^2 iterations)
        Let w be such a woman and m the best man she has not yet proposed to
        w proposes to m
        If m is free
            (m, w) get engaged
        Else (m, w') are engaged                             (each proposal handled in O(1) time)
            If m prefers w' to w
                w remains free
            Else
                (m, w) get engaged and w' becomes free
    Output the engaged pairs as the final output

GS Algorithm: Firefly Edition
[Figure: example run with Mal, Wash, Simon and Inara, Zoe, Kaylee and their preference lists]

GS Outputs a Stable Matching
- Lemma 1: GS outputs a perfect matching S
- Lemma 2: S has no instability

Proof Technique du Jour: Proof by Contradiction
- Assume the negation of what you want to prove
- After some reasoning, derive a contradiction

Two Observations
- Obs 1: Once m is engaged, he only ever gets engaged to "better" women (by his preferences)
- Obs 2: If w proposes to m' before m (or never proposes to m at all), then she prefers m' to m

Proof of Lemma 2
- By contradiction: assume there is an instability (m, w'), i.e.,
  - m is matched to w but prefers w' to w, and
  - w' is matched to m' but prefers m to m'
- In particular, w' last proposed to m'

Contradiction by Case Analysis
- Case analysis on whether w' ever proposed to m
- Case 1: w' never proposed to m
  - By Obs 2, w' prefers m' to m, contradicting the assumption that w' prefers m to m'
- Case 2: w' had proposed to m
  - Case 2.1: m accepted the proposal from w'
    - m is finally engaged to w, so by Obs 1 m prefers w to w', a contradiction
  - Case 2.2: m rejected the proposal from w'
    - m was then engaged to some w'' whom he prefers to w'
    - m is finally engaged to w, so by Obs 1 he prefers w to w''
    - Hence m prefers w to w', a contradiction

Overall Structure of the Case Analysis
- Did w' propose to m? If so, did m accept that proposal?
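As a concrete companion to the pseudocode above, here is a minimal Python sketch of the proposal loop (women propose, as in this course's version). The data layout -- 0-indexed preference lists with |M| = |W| = n, complete rankings, and a precomputed rank table for the O(1) "m prefers w' to w" test -- is an assumption of this sketch, not something fixed by the slides.

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Women propose, men accept or trade up, as in the pseudocode above.

    men_prefs[m] / women_prefs[w]: rankings (most preferred first), assumed
    complete and of equal size n. Returns a dict mapping each man to the
    woman he ends up engaged to. (Assumed interfaces, not from the slides.)
    """
    n = len(men_prefs)
    # rank[m][w] = position of w in m's list, for O(1) preference comparisons
    rank = [{w: i for i, w in enumerate(prefs)} for prefs in men_prefs]
    next_proposal = [0] * n        # index into each woman's list of men
    partner_of_man = [None] * n    # current engagement of each man
    free_women = deque(range(n))

    while free_women:
        w = free_women.popleft()
        m = women_prefs[w][next_proposal[w]]   # best man w has not yet proposed to
        next_proposal[w] += 1
        if partner_of_man[m] is None:          # m is free: (m, w) get engaged
            partner_of_man[m] = w
        elif rank[m][w] < rank[m][partner_of_man[m]]:
            free_women.append(partner_of_man[m])   # m trades up; old partner is free
            partner_of_man[m] = w
        else:
            free_women.append(w)               # m prefers his partner; w stays free
    return {m: w for m, w in enumerate(partner_of_man)}
```

The deque gives O(1) access to some free woman, and each woman proposes at most n times, so the loop runs at most n^2 times, matching the bound noted on the slide.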
Graph Searching: BFS/DFS in O(m+n)

BFS Implementation (layer by layer)
    BFS(s)                                       (graph given as adjacency lists; CC is an array; each Li is a linked list)
        CC[s] = T and CC[w] = F for every w != s
        Set i = 0
        Set L0 = {s}
        While Li is not empty
            L(i+1) = empty
            For every u in Li
                For every edge (u, w)
                    If CC[w] = F then
                        CC[w] = T
                        Add w to L(i+1)
            i++
- The version in KT also computes a BFS tree.

An Illustration
[Figure: BFS layers on an 8-vertex example graph]

O(m+n) Implementation
    BFS(s)
        CC[s] = T and CC[w] = F for every w != s        O(n)
        Initialize Q = {s}                              O(1)
        While Q is not empty
            Delete the front element u of Q
            For every edge (u, w)                        repeated n_u times for vertex u
                If CC[w] = F then
                    CC[w] = T
                    Add w to the back of Q               O(1)
- Each vertex u is removed from Q at most once, and processing it costs O(n_u), where n_u is the number of neighbors of u. Summing over all vertices gives O(sum over u of n_u) = O(m), and together with the O(n) initialization the total is O(m + n).
- Since Q is a queue, this is BFS; DFS is the same algorithm with an explicit stack in place of the queue.

A DFS Run Using an Explicit Stack
[Figure: DFS run on the same 8-vertex example graph]

Topological Ordering
[Figure: run of the TopOrd algorithm on an example DAG]
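For reference, a minimal runnable version of the queue-based traversal above, in Python. The adjacency-list-as-dict representation and the function name are assumptions of this sketch, not the slides' code.

```python
from collections import deque

def bfs(adj, s):
    """Queue-based BFS from s; adj maps every vertex to a list of its neighbors.

    Returns the set of vertices reachable from s (the vertices with CC[] = T).
    Work is O(n) initialization plus O(sum of degrees) = O(m + n) in total.
    """
    visited = {s}            # plays the role of the CC array
    queue = deque([s])
    while queue:
        u = queue.popleft()  # delete the front element of Q
        for w in adj[u]:     # n_u iterations for vertex u
            if w not in visited:
                visited.add(w)
                queue.append(w)   # add w to the back of Q
    return visited
```

Replacing the deque with a list used as a stack (append and pop from the same end) turns the same loop into the explicit-stack DFS illustrated above.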
Greedy Algorithms

Interval Scheduling: Maximum Number of Intervals (Schedule by Finish Time)
- End-of-semester blues: you can only do one thing on any given day. What is the maximum number of tasks you can complete?
[Figure: tasks (term paper, party, exam study, 331 HW, project) laid out over Monday through Friday]

Schedule by Finish Time
    Sort the intervals so that f(i) <= f(i+1)                  O(n log n) time
    Build an array s[1..n] with s[i] = start time of i          O(n) time
    Set A to be the empty set
    While R is not empty                                        (R = the set of remaining requests)
        Choose i in R with the earliest finish time
        Add i to A
        Remove all requests that conflict with i from R         (do the removal on the fly)
    Return A* = A
- The final algorithm: order the tasks by their END time and greedily pick compatible ones.
- Proof of correctness uses "greedy stays ahead."

Scheduling to Minimize Lateness
- Now all tasks have to be scheduled. GOAL: minimize the maximum lateness.
[Figure: the same tasks over Monday through Friday, now all scheduled]

The Greedy Algorithm (earliest deadline first)
    (Assume the jobs are sorted by deadline: d1 <= d2 <= ... <= dn; s is the common start time)
    f = s
    For every i in 1..n
        Schedule job i from s(i) = f to f(i) = f + ti
        f = f + ti
- Proof of correctness uses an "exchange argument." We proved:
  - Any two schedules with 0 idle time and 0 inversions have the same maximum lateness.
  - The greedy schedule has 0 idle time and 0 inversions.
  - There is an optimal schedule with 0 idle time and 0 inversions.

Shortest Path in a Graph (Non-negative Edge Weights): Dijkstra's Algorithm

Shortest Path Problem
- Input: a directed graph G = (V, E), edge lengths l_e for every e in E, and a "start" vertex s in V
- Output: shortest paths from s to all nodes in V

Dijkstra's Shortest Path Algorithm
    d'(w) = min over edges e = (u, w) in E with u in R of ( d(u) + l_e )

    R = {s}, d(s) = 0
    While there is an x not in R with (u, x) in E for some u in R
        Pick the w that minimizes d'(w)
        Add w to R
        d(w) = d'(w)
[Figure: example run ending with d(s) = 0, d(u) = 1, d(w) = 2, d(x) = 2, d(y) = 3, d(z) = 4 and the corresponding shortest paths]

Dijkstra's Shortest Path Algorithm (formal)
    Input: directed G = (V, E), l_e >= 0, s in V
    S = {s}, d(s) = 0
    While there is a v not in S with (u, v) in E for some u in S      (at most n iterations)
        Pick the w that minimizes d'(w)                                (O(m) time)
        Add w to S
        d(w) = d'(w)
- The O(mn) time bound is trivial; an O(m log n) implementation is possible.
- We proved that d'(v) is the true shortest-path distance at the moment v is added to S.

Minimum Spanning Tree: Kruskal/Prim

Minimum Spanning Tree (MST)
- Input: a connected graph G = (V, E) with c_e > 0 for every e in E
- Output: a tree containing all of V that minimizes the sum of its edge costs

Kruskal's Algorithm (Joseph B. Kruskal)
    Input: G = (V, E), c_e > 0 for every e in E
    T = empty
    Sort the edges in increasing order of cost
    Consider the edges in sorted order
        If an edge can be added to T without creating a cycle, add it to T

Prim's Algorithm (Robert Prim) -- similar to Dijkstra's algorithm
    Input: G = (V, E), c_e > 0 for every e in E
    S = {s}, T = empty
    While S is not the same as V
        Among the edges e = (u, w) with u in S and w not in S, pick one with minimum cost
        Add w to S and e to T

Cut Property Lemma for MSTs
- Condition: S and V\S are both non-empty.
- The cheapest edge crossing the cut (S, V\S) is in every MST.
- Assumption: all edge costs are distinct.

Divide & Conquer: Sorting (Merge-Sort)

Sorting
- Given n numbers, order them from smallest to largest.
- Works for any set of elements on which there is a total order.

Mergesort Algorithm
    Input: a1, a2, ..., an;  Output: the numbers in sorted order
    MergeSort(a, n)
        If n = 1 return a1
        If n = 2 return min(a1, a2); max(a1, a2)
        aL = a1, ..., a_{n/2}
        aR = a_{n/2+1}, ..., an
        return MERGE( MergeSort(aL, n/2), MergeSort(aR, n/2) )
[Figure: example run on a list of eight numbers]

Correctness
- By induction on n; the inductive step follows from the correctness of MERGE.

Counting Inversions: Merge-Count

Mergesort-Count Algorithm
    Input: a1, a2, ..., an;  Output: the numbers in sorted order + the number of inversions
    MergeSortCount(a, n)
        If n = 1 return (0, a1)
        If n = 2 return (a1 > a2, min(a1, a2); max(a1, a2))
        aL = a1, ..., a_{n/2}
        aR = a_{n/2+1}, ..., an
        (cL, aL) = MergeSortCount(aL, n/2)
        (cR, aR) = MergeSortCount(aR, n/2)
        (c, a) = MERGE-COUNT(aL, aR)            O(n): counts the crossing inversions while merging
        return (c + cL + cR, a)
- Run time: T(2) = c and T(n) = 2T(n/2) + cn, which solves to O(n log n).
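The slides treat MERGE-COUNT as a black box; the Python sketch below shows one standard way to realize it (my illustration, not the slides' code): merge the two sorted halves, and every time an element of the right half is emitted, all elements still waiting in the left half form crossing inversions with it.

```python
def merge_count(left, right):
    """Merge two sorted lists and count crossing inversions, i.e. pairs
    (l, r) with l in `left`, r in `right`, and l > r. O(n) for n elements."""
    merged, crossing = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of `left`,
            # so each of those len(left) - i elements is a crossing inversion
            crossing += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return crossing, merged


def merge_sort_count(a):
    """Return (#inversions in a, sorted copy of a) -- MergeSortCount above."""
    if len(a) <= 1:
        return 0, list(a)
    mid = len(a) // 2
    c_left, left = merge_sort_count(a[:mid])
    c_right, right = merge_sort_count(a[mid:])
    c_cross, merged = merge_count(left, right)
    return c_left + c_right + c_cross, merged
```

For example, merge_sort_count([2, 4, 1, 3, 5]) returns (3, [1, 2, 3, 4, 5]), the three inversions being (2,1), (4,1), and (4,3).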
Closest Pair of Points

Closest Pair of Points Problem
- Input: n 2-D points P = {p1, ..., pn} with pi = (xi, yi), where d(pi, pj) = ( (xi - xj)^2 + (yi - yj)^2 )^(1/2)
- Output: the two points p and q in P that are closest

The Algorithm
    Sort P to get Px and Py                                            O(n log n)
    Closest-Pair(Px, Py)
        If n < 4 then find the closest pair by brute force
        Q is the first half of Px and R is the rest                    O(n)
        Compute Qx, Qy, Rx and Ry                                      O(n)
        (q0, q1) = Closest-Pair(Qx, Qy)
        (r0, r1) = Closest-Pair(Rx, Ry)
        delta = min( d(q0, q1), d(r0, r1) )
        S = points (x, y) in P with |x - x*| < delta                   O(n)   (x* = x-coordinate of the dividing line, i.e., of the rightmost point of Q)
        return Closest-in-box(S, (q0, q1), (r0, r1))                   assumed to be doable in O(n)
- Run time: T(<4) = c and T(n) = 2T(n/2) + cn, so the recursion takes O(n log n); with the initial sort the overall time is O(n log n) + T(n) = O(n log n).

Dynamic Programming: Weighted Interval Scheduling

Weighted Interval Scheduling
- Input: n jobs (si, ti, vi) -- an interval with value vi
- Output: a schedule S such that no two jobs in S conflict
- Goal: maximize the sum over i in S of vi
- Assume the jobs are sorted by their finish time; p(j) denotes the largest index i < j such that jobs i and j do not conflict.

A Recursive Algorithm
    Compute-Opt(j)
        If j = 0 then return 0
        return max { vj + Compute-Opt(p(j)), Compute-Opt(j-1) }
- The two recursive calls return OPT(p(j)) and OPT(j-1), so OPT(j) = max { vj + OPT(p(j)), OPT(j-1) }.
- Proof of correctness by induction on j (correct for j = 0).

Exponential Running Time
- With p(j) = j - 2, the recursion tree for Compute-Opt(5) re-solves the same subproblems over and over, even though there are only 5 distinct OPT values. (Formal proof: exercise.)
[Figure: recursion tree for OPT(5) with many repeated OPT(1), OPT(2), OPT(3) nodes]

Bounding the Number of Recursions (memoization)
    M-Compute-Opt(j)
        If j = 0 then return 0
        If M[j] is not null then return M[j]
        M[j] = max { vj + M-Compute-Opt(p(j)), M-Compute-Opt(j-1) }
        return M[j]
- Whenever a recursive call is made, an M value is assigned; at most n values of M can be assigned, so the running time is O(n) overall.

Property of OPT
- OPT(j) = max { vj + OPT(p(j)), OPT(j-1) }
- Given OPT(1), ..., OPT(j-1), one can compute OPT(j).

Recursion + Memory = Iteration
    Iterative-Compute-Opt
        M[0] = 0
        For j = 1, ..., n
            M[j] = max { vj + M[p(j)], M[j-1] }
- M[j] = OPT(j); O(n) running time.

Shortest Path in a Graph: Bellman-Ford

Shortest Path Problem
- Input: a (directed) graph G = (V, E) where every edge e has a cost ce (which can be negative), and a target t in V
- Output: a shortest path to t from every s
- If G contains a negative-cost cycle, path costs can be driven to negative infinity, so assume G has no negative cycle.
[Figure: a graph with a cycle of negative total cost (one edge of cost -1000), where the shortest s-t path cost is negative infinity]

Recurrence Relation
- OPT(i, u) = cost of the shortest path from u to t that uses at most i edges
- OPT(i, u) = min { OPT(i-1, u), min over (u, w) in E of ( c_{u,w} + OPT(i-1, w) ) }
  - the first term is a path that already uses at most i-1 edges; the second takes the best path through each neighbor of u
- A code sketch of filling in this table appears at the end of this review.

P vs NP

The P vs NP Question
- P: problems that can be solved by polynomial-time algorithms
- NP: problems that have a polynomial-time verifiable witness to the optimal solution
- Alternate definition of NP: guess a witness and verify it!
- Is P = NP?
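As promised above, here is a minimal Python sketch of filling in the Bellman-Ford table OPT(i, u) bottom-up, keeping only the previous row. The edge-list representation and the function name are assumptions of this sketch, not prescribed by the slides.

```python
import math

def bellman_ford(n, edges, t):
    """Shortest-path costs from every vertex to t, assuming no negative cycles.

    n     -- number of vertices, labeled 0 .. n-1
    edges -- list of (u, w, cost) triples for directed edges u -> w
    t     -- target vertex

    Implements OPT(i, u) = min( OPT(i-1, u),
                                min over edges (u, w) of cost(u, w) + OPT(i-1, w) ),
    with OPT(0, t) = 0 and OPT(0, u) = infinity for u != t.
    Runs in O(n * m) time, matching the recurrence above.
    """
    opt = [math.inf] * n
    opt[t] = 0
    for _ in range(n - 1):                      # paths with at most n-1 edges suffice
        prev = opt[:]                           # the row OPT(i-1, .)
        for u, w, cost in edges:
            opt[u] = min(opt[u], cost + prev[w])
    return opt
```

For example, bellman_ford(3, [(0, 1, 4), (1, 2, -2), (0, 2, 5)], 2) returns [2, -2, 0]: the cheapest route from vertex 0 to t = 2 goes through vertex 1.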