1 Good Facts to Know

1.1 Trees
Every tree is a graph, but not every graph is a tree. A connected graph that is not a tree has at least one cycle. A Hamiltonian cycle visits each vertex exactly once. Every tree is a bipartite graph and a median graph; every countable tree is a planar graph.
Height and nodes (complete binary tree): 2^h <= n <= 2^(h+1) - 1. Leaf nodes: l = (n + 1)/2. Nodes given l leaves: n = 2l - 1. Number of nodes in a perfect tree: 2^(h+1) - 1. Number of nodes on level d: 2^d. Recurrence relation for the number of nodes: a_n = a_{n-1}^2 + a_{n-1}(1 + sqrt(4a_{n-1} - 3)). The number of binary trees with n nodes is the Catalan number C_n. The successor of node x is the node with the smallest key greater than x->key, and it can be determined without comparing keys.

1.2 Heaps
Height = Θ(log n); basic operations are O(log n).
MaxHeapify(A, n): T(n) = T(largest) + Θ(1), where largest <= 2n/3 (last row half full is the worst case), so T(n) <= T(2n/3) + Θ(1) => T(n) = O(log n).
BuildMaxHeap: n * Σ_{h=0}^{⌊lg n⌋} h/2^h = O(n).

1.3 Max Heap Correctness
Initialisation: prior to the first iteration, i = ⌊n/2⌋; each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n is a leaf and thus the root of a trivial max-heap.
Maintenance: the children of node i are numbered higher than i, so they are both roots of max-heaps. MaxHeapify preserves the property that the nodes i, i+1, ..., n are all roots of max-heaps. Decrementing i in the for loop re-establishes the invariant for the next iteration.
Termination: at termination, i = 0. By the loop invariant each node is the root of a max-heap; in particular the first node is.

2.3 Hash Functions
Division: h(k) = k mod m, where m is a prime not too close to a power of 2 or 10.
Multiplication: h(k) = (A * k mod 2^w) >> (w - r), with 2^(w-1) < A < 2^w; multiplies k by A and extracts r bits of the product (w is the word size, table size m = 2^r).

2.4 Open Addressing
Open addressing: linear probing, quadratic probing, and double hashing.
Linear probing: interval between probes is fixed; best cache performance but sensitive to clustering.
Quadratic probing: interval between probes increases linearly.
Double hashing: more computation per probe.
No pointer/memory overhead, but wasted entries; a load factor of about 0.7 is recommended. Performance degrades roughly as α/(1 - α). Deletion needs to leave a marker so that later searches continue past the deleted slot.
Given uniform hashing, the expected number of probes in an unsuccessful search is at most 1/(1 - α), and in a successful search (1/α) log(1/(1 - α)). Load factor α = (number of keys stored)/(number of slots).
Uniform hashing: each key is equally likely to have any of the m! permutations as its probe sequence.

2.1 Direct Address Table
Represented by an array T[0 .. m - 1]; each slot corresponds to a key in U. If there is an element x with key k then T[k] holds a pointer to x; otherwise T[k] is empty. Delete = Θ(1), Search = Θ(1), Insert = Θ(1).

2.2 Chaining
Search, removal, and lookup are constant by amortized analysis, but otherwise the average case is O(α), where α is the load factor. The worst case is the number of entries in the table, if all keys hash to the same slot, for everything but Insert. Unsuccessful search: O(1) + O(α) = Θ(1 + α). Successful search: Θ(1 + α/2 - α/(2n)) = Θ(1 + α).

3.1 BuildMaxHeap
Calls MaxHeapify on each element in a bottom-up manner.
Correctness, loop invariant: at the start of each iteration of the for loop, each node i+1, i+2, ..., n is the root of a max-heap.
Initialisation: before the first iteration i = ⌊n/2⌋, and nodes ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n are leaves, hence roots of trivial max-heaps.
Maintenance: by the loop invariant, the subtrees at the children of node i are max-heaps, so MaxHeapify(i) makes node i the root of a max-heap; decrementing i re-establishes the loop invariant for the next iteration.
Cost of MaxHeapify is O(h); there are at most ⌈n/2^(h+1)⌉ nodes of height h, and the height of the heap is ⌊lg n⌋.
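A minimal sketch of MaxHeapify and BuildMaxHeap as analyzed in 1.2 and 3.1 above, assuming a 0-indexed Python list (the function and variable names are illustrative, not from the notes):

def max_heapify(A, i, n):
    # sift A[i] down; each recursive call descends into a subtree of size <= 2n/3,
    # matching T(n) <= T(2n/3) + Θ(1) = O(log n)
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < n and A[left] > A[largest]:
        largest = left
    if right < n and A[right] > A[largest]:
        largest = right
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)

def build_max_heap(A):
    # nodes n//2 .. n-1 are leaves, so start at the last internal node;
    # total work is n * sum_h h/2^h = O(n)
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(A, i, n)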
2.5 Universal Hashing
A family H of hash functions is universal if for every pair of distinct keys k != l, Pr[h(k) = h(l)] <= 1/m when h is drawn uniformly at random from H. With a universal family, the expected length of any chain is at most α, so the expected search time is Θ(1 + α) regardless of the input keys.

2 Hashing
Load factor: α = n/m, the number of keys stored divided by the number of slots.

3 Heaps
Tree-based structure: in a max-heap the largest element is at the root and for all nodes the parent is greater than or equal to the node; the opposite holds for a min-heap. Most operations run in O(lg n) time and the heap height is Θ(lg n). Max-heaps are used for sorting. The heap property is maintained by exchanging the value of an offending node with the larger value at its children, which may in turn leave the subtree at that child not a heap, so it must be recursively fixed. To sort: build a heap, then repeatedly swap the first and last elements and max-heapify at the root. Heapsort is O(n lg n) like merge sort but in place like insertion sort; it is built from BuildMaxHeap, O(n), and a for loop that runs n - 1 times calling MaxHeapify, O(lg n).
MaxHeapify(n) = T(largest) + Θ(1), largest <= 2n/3 (last row half full), T(n) <= T(2n/3) + Θ(1) => T(n) = O(lg n).

def heapsort(lst):
    for start in range((len(lst) - 2) // 2, -1, -1):
        siftdown(lst, start, len(lst) - 1)
    for end in range(len(lst) - 1, 0, -1):
        lst[end], lst[0] = lst[0], lst[end]
        siftdown(lst, 0, end - 1)
    return lst

def siftdown(lst, start, end):
    root = start
    while True:
        child = root * 2 + 1
        if child > end:
            break
        if child + 1 <= end and lst[child] < lst[child + 1]:
            child += 1
        if lst[root] < lst[child]:
            lst[root], lst[child] = lst[child], lst[root]
            root = child
        else:
            break

4 Disjoint Sets
map: f(x) = y. relation: R ⊆ {(a, b) : a, b ∈ S}; can be defined by a boolean matrix. Equivalence relation: i is equivalent to j if they belong to the same set.
reflexivity: ∀a ∈ S, (a, a) ∈ R
symmetry: ∀a, b ∈ S, (a, b) ∈ R → (b, a) ∈ R
transitivity: ∀a, b, c ∈ S, (a, b) ∈ R ∧ (b, c) ∈ R → (a, c) ∈ R
make-set makes a new set whose only member is x, assuming x is not in another set already; O(1) with the linked-list method by making a new linked list.
union takes two values and forms the union of their two sets; some implementations choose a particular representative. We must destroy the original sets to keep the sets disjoint.
find-set returns a pointer to the representative of the set containing x; O(1) with the linked-list method by following the pointer from x back to its set object and returning the member pointed to by head.

CONNECTED-COMPONENTS(G):
    for each vertex v in G.V
        make_set(v)
    for each edge (u,v) in G.E
        if find_set(u) != find_set(v)
            union(u,v)

SAME-COMPONENTS(u,v):
    return find_set(u) == find_set(v)

Find with path compression, O(log n):

def path_compression_find(i):
    # p is the parent array; point i directly at its root
    if p[i] == i:
        return i
    p[i] = path_compression_find(p[i])
    return p[i]
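Minimal sketches of the hash functions from sections 2.3 and 2.5 above; the concrete constants (m = 701, w = 32, r = 14, A, p) are illustrative choices, not values prescribed by the notes:

import random

def h_division(k, m=701):
    # division method: m is a prime not too close to a power of 2
    return k % m

def h_multiplication(k, w=32, r=14):
    # multiplication method: table size m = 2**r, word size w bits
    A = 2654435769  # odd constant with 2**(w-1) < A < 2**w (illustrative)
    return ((A * k) % (1 << w)) >> (w - r)

def make_universal_hash(m, p=2**31 - 1):
    # ((a*k + b) mod p) mod m with random a, b is a universal family
    # (assuming keys are smaller than the prime p):
    # for k != l, Pr[h(k) == h(l)] <= 1/m over the choice of a and b.
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m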
5 Trees

5.1 Binary Trees
n is the number of nodes, h the height, l the number of leaves.
h + 1 <= n <= 2^(h+1) - 1
l = (n + 1)/2
perfect tree: n = 2l - 1
balanced: h = ⌈log(n + 1)⌉
perfect: l = 2^h, n = 2^(h+1) - 1

5.2 Binary Search Trees
Search, Insert, and Delete are all Θ(h), where h is the height. Each node contains a key and two subtrees, which are both binary search trees. The left subtree is strictly less than the node, the right subtree strictly greater. The shape can degenerate through poor insertion order. Comparisons are required to traverse the tree. Balanced tree: Θ(log n); unbalanced tree: Θ(n). BST-sort is like heap sort but built from a binary search tree: Ω(n log n), O(n^2).

5.3 AVL Trees
A BST such that the heights of the two child subtrees of any node differ by at most one. Operations are O(log n) in the average and worst cases. Properties follow from the tree being balanced. Uses tree rotations to re-establish the balance property. AVL-sort brings the worst case down to O(n log n).

5.4 Red-Black Trees
A variation on BSTs to ensure balance: O(lg n) worst case for operations. Has an extra bit per node for colour, red or black. Every node is red or black, the root is black, nil leaves are black, if a node is red then its children are black, and for each node all paths from the node to descendant leaves contain the same number of black nodes. The black-height of x is the number of black nodes on a path from x down to a leaf, including the leaf.

6 Greedy Algorithms
The Bellman-Ford algorithm uses dynamic programming; Dijkstra's algorithm uses the greedy approach.

6.1 Scheduling Problem
Proof of optimality. Let S = {1, 2, ..., n} be the set of activities ordered by finish time, so activity 1 has the earliest finish time. Suppose A ⊆ S is an optimal solution, with the activities in A ordered by finish time. Suppose the first activity in A is k ≠ 1, that is, this optimal solution does not start with the "greedy choice." We want to show that there is another solution B that begins with the greedy choice, activity 1. Let B = (A \ {k}) ∪ {1}. Because f_1 <= f_k, the activities in B are disjoint, and since B has the same number of activities as A, i.e. |A| = |B|, B is also optimal.
Once the greedy choice is made, the problem reduces to finding an optimal solution for the subproblem. If A is an optimal solution to the original problem S, then A' = A \ {1} is an optimal solution to the activity-selection problem S' = {i ∈ S : s_i >= f_1}. Why? If we could find a solution B' to S' with more activities than A', adding activity 1 to B' would yield a solution B to S with more activities than A, contradicting the optimality of A.

6.2 Huffman Encoding

6.3 Dijkstra's Substructure
Optimal substructure: any subpath of a shortest path is itself a shortest path, and the shortest-path weight from u to v is at most the shortest-path weight from u to x plus the shortest-path weight from x to v (triangle inequality).

6.4 Differences Between Bellman-Ford and Dijkstra
The Bellman-Ford algorithm is a single-source shortest-path algorithm which allows negative edge weights and can detect negative cycles in a graph. Dijkstra's algorithm is also a single-source shortest-path algorithm, but all edge weights must be non-negative. As far as the total cost is concerned there is no difference when all edges have non-negative weight; however, Dijkstra's algorithm is usually used, since the typical implementation with a binary heap has Θ((|E| + |V|) log |V|) time complexity, while the Bellman-Ford algorithm has O(|V||E|) complexity. If there is more than one path of minimum cost, the actual path returned is implementation dependent (even for the same algorithm).

6.5 Making Change with Coins
The greedy algorithm gives as many of the highest-value coins as possible before moving on to coins of the next lower value, in decreasing order of value.
Correctness: for US coins this is optimal, since an optimal solution has at most 2 dimes (three dimes could be replaced by a quarter and a nickel), no nickel if there are 2 dimes (two dimes and a nickel could be replaced by a quarter), at most 1 nickel (two nickels could be replaced by a dime), and at most 4 pennies (five pennies could be replaced by a nickel).

6.6 Rental Car Problem
Let c(i, j) be the optimal cost of driving from agency i to agency j.

6.7 Lecture Hall Assignment Problem
Given a set of activities to schedule among lecture halls, schedule all the activities using as few lecture halls as possible. To determine which activity should use which lecture hall, the algorithm uses GREEDY-ACTIVITY-SELECTOR to calculate the activities in the first lecture hall. If some activities are yet to be scheduled, a new lecture hall is selected and GREEDY-ACTIVITY-SELECTOR is called again; this continues until all activities have been scheduled.
The algorithm can be shown to be correct and optimal. For a contradiction, assume the number of lecture halls is not optimal, that is, the algorithm allocates more halls than necessary. Then there exists a set of activities B which have been wrongly allocated: an activity b ∈ B allocated to hall H[i] should optimally have been allocated to H[k]. This implies that the activities for lecture hall H[k] were not allocated optimally, but GREEDY-ACTIVITY-SELECTOR produces the optimal set of activities for a particular lecture hall, a contradiction. In the worst case the number of lecture halls required is n. GREEDY-ACTIVITY-SELECTOR runs in Θ(n), so the running time of this algorithm is O(n^2).
Observe that choosing the activity of least duration will not always produce an optimal solution. For example, with the activities (3, 5), (6, 8), (1, 4), (4, 7), (7, 10), either (3, 5) or (6, 8) will be picked first, which will prevent the optimal solution (1, 4), (4, 7), (7, 10) from being found.
Also observe that choosing the activity with the least overlap will not always produce an optimal solution. For example, with the activities (0, 4), (4, 6), (6, 10), (0, 1), (1, 5), (5, 9), (9, 10), (0, 3), (0, 2), (7, 10), (8, 10), the one with the least overlap with other activities is (4, 6), so it will be picked first; but that prevents the optimal solution (0, 1), (1, 5), (5, 9), (9, 10) from being found.

7 Graphs

7.1 Edge Classification
Tree edge: an edge in the DFS forest (the edge by which a white vertex is discovered).
Back edge: an edge from a vertex to one of its ancestors in the DFS tree.
Forward edge: a non-tree edge from a vertex to one of its descendants.
Cross edge: any other edge (between vertices with no ancestor/descendant relation).
Light edge: a minimum-weight edge crossing a cut.
White vertex: unvisited by the search. Grey vertex: vertex in progress. Black vertex: DFS/other algorithm has finished processing the vertex.

7.2 Depth First Search

7.3 Breadth First Search

7.4 Parenthesis Theorem
For every two vertices u and v, exactly one of the following conditions holds: the intervals [s[u], f[u]] and [s[v], f[v]] are disjoint, or one interval contains the other (either s[u] < s[v] < f[v] < f[u] or s[v] < s[u] < f[u] < f[v]).
To prove the theorem it suffices to prove that if s[u] < s[v] < f[u] then s[u] < s[v] < f[v] < f[u] (and similarly if s[v] < s[u] < f[v] then s[v] < s[u] < f[u] < f[v]). So suppose that s[u] < s[v] < f[u]. In this case, at time s[v], when v is colored Gray (and pushed onto the stack), u is on the stack and has color Gray. Thus u is never added again to the stack, and therefore it can only become Black after this occurrence of v is taken out and v is colored Black. This means f[v] < f[u].
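A minimal DFS sketch that records discovery time s[v] and finishing time f[v], so the Parenthesis Theorem above can be checked directly on small graphs; the adjacency-dict representation and the names (visit, s, f) are illustrative assumptions, not from the notes:

WHITE, GRAY, BLACK = 0, 1, 2

def dfs_timestamps(graph):
    # graph: dict mapping each vertex to an iterable of its neighbours
    color = {v: WHITE for v in graph}
    s, f = {}, {}
    time = [0]  # mutable counter shared by the nested function

    def visit(u):
        time[0] += 1
        s[u] = time[0]          # discovery time: u turns Gray
        color[u] = GRAY
        for v in graph.get(u, ()):
            if color[v] == WHITE:
                visit(v)
        color[u] = BLACK
        time[0] += 1
        f[u] = time[0]          # finishing time: u turns Black

    for u in graph:
        if color[u] == WHITE:
            visit(u)
    return s, f

For any two vertices the intervals [s[u], f[u]] and [s[v], f[v]] returned here are either disjoint or nested, never partially overlapping.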
7.5 White-Path Theorem
In a DFS forest of a (directed or undirected) graph G, vertex v is a descendant of vertex u if and only if at time s[u] (just before u is colored Gray) there is a path from u to v that consists only of White vertices.
Proof. There are two directions to prove.
(forward) Suppose that v is a descendant of u. So there is a path in the tree from u to v (which is of course also a path in G). All vertices w on this path are also descendants of u, so by the corollary above they are colored Gray during the interval [s[u], f[u]]. In other words, at time s[u] they are all White.
(backward) Suppose that there is a White path from u to v at time s[u]. Let this path be v_0 = u, v_1, v_2, ..., v_{k-1}, v_k = v. To show that v is a descendant of u, we will show that all v_i (for 0 <= i <= k) are descendants of u. (Note that this path may not be in the DFS tree.) We prove this claim by induction on i.
Base case: i = 0, v_i = u, so the claim is obviously true.
Induction step: suppose that v_i is a descendant of u; we show that v_{i+1} is also a descendant of u. By the corollary above, this is equivalent to showing that s[u] < s[v_{i+1}] < f[v_{i+1}] < f[u], i.e. v_{i+1} is colored Gray during the interval [s[u], f[u]]. Since v_{i+1} is White at time s[u], we have s[u] < s[v_{i+1}]. Now, since v_{i+1} is a neighbor of v_i, v_{i+1} cannot stay White after v_i is colored Black; in other words, s[v_{i+1}] < f[v_i]. Applying the induction hypothesis (v_i is a descendant of u, so s[u] <= s[v_i] < f[v_i] <= f[u]), we obtain s[v_{i+1}] < f[u]. Thus s[u] < s[v_{i+1}] < f[v_{i+1}] < f[u] by the Parenthesis Theorem. QED.

7.6 Directed Acyclic Graph
Good for partial orders, since a > b ∧ b > c ⟹ a > c, but we can also have neither a > b nor b > a. A total ordering can always be made from a partial order. A DAG has no back edges.
Proof that a back edge means a cycle: suppose there is a back edge (u, v); then v is an ancestor of u in the depth-first forest, therefore there is a path from v to u, so v to u to v is a cycle.
Proof that a directed graph G is acyclic iff a DFS of G yields no back edges: given a cycle c in G, let v be the first vertex discovered in c and (u, v) the preceding edge in c. At time d[v] the vertices of c form a white path from v to u, so by the white-path theorem u is a descendant of v in the depth-first forest, and therefore (u, v) is a back edge.

7.6.1 Topological Sort
Performed on a DAG. It is a linear ordering of the vertices such that if (u, v) ∈ E then u appears somewhere before v.
1.
Call DFS to compute finishing times for all vertices; as each vertex finishes, insert it onto the front of a linked list; return the linked list of vertices. Time: Θ(V + E).
Proof of topological sort correctness: show that if (u, v) ∈ E then f[v] < f[u]. When we explore (u, v), consider the colors of u and v. u is grey. v cannot also be grey, because then v would be an ancestor of u and (u, v) a back edge, contradicting that the graph is a DAG. If v is white, it becomes a descendant of u and the parenthesis theorem gives f[v] < f[u]. If v is black, then v is already finished, therefore f[v] < f[u].
A topological sort (sometimes abbreviated topsort or toposort) or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.

from collections import deque

GRAY, BLACK = 0, 1

def topological(graph):
    order, enter, state = deque(), set(graph), {}

    def dfs(node):
        state[node] = GRAY
        for k in graph.get(node, ()):
            sk = state.get(k, None)
            if sk == GRAY:
                raise ValueError("cycle")
            if sk == BLACK:
                continue
            enter.discard(k)
            dfs(k)
        order.appendleft(node)
        state[node] = BLACK

    while enter:
        dfs(enter.pop())
    return order

7.7 Minimum Spanning Tree
Has |V| - 1 edges, has no cycles, and may not be unique. Can be computed by an algorithm that adds edges to a set A (starting from the vertices alone) such that every added edge is safe for A. Loop invariant: A is a subset of some MST. Initialisation: the empty set satisfies the invariant. Maintenance: only safe edges are added. Termination: all edges added to A are in an MST, so the spanning tree A must itself be an MST.
A cut partitions the vertices into two disjoint sets. A cut respects A if and only if no edge in A crosses the cut. A light edge for a cut is an edge of minimum weight among those crossing it. Safe-edge theorem: if A is a subset of some MST, the cut (S, V - S) respects A, and (u, v) is a light edge crossing (S, V - S), then (u, v) is safe for A.
Proof of the safe-edge theorem: let T be an MST that includes A. Case 1: the edge (u, v) is in T and we are done. Case 2: (u, v) is not in T, so T contains some edge (x, y) crossing the cut. Let T' = T - {(x, y)} ∪ {(u, v)}. Because (u, v) is light for the cut, w(u, v) <= w(x, y), thus w(T') = w(T) - w(x, y) + w(u, v) <= w(T); hence T' is also an MST, so (u, v) is safe for A.
Corollary: if (u, v) is a light edge connecting one connected component of (V, A) to another connected component of (V, A), then (u, v) is safe for A.

Disjoint-set helpers used by Kruskal's algorithm below:

parent = dict()
rank = dict()

def make_set(vertice):
    parent[vertice] = vertice
    rank[vertice] = 0

def find(vertice):
    if parent[vertice] != vertice:
        parent[vertice] = find(parent[vertice])
    return parent[vertice]

def union(vertice1, vertice2):
    root1 = find(vertice1)
    root2 = find(vertice2)
    if root1 != root2:
        if rank[root1] > rank[root2]:
            parent[root2] = root1
        else:
            parent[root1] = root2
            if rank[root1] == rank[root2]:
                rank[root2] += 1

7.7.1 Kruskal's Algorithm
The first for loop is O(V), sorting E is O(E lg E), and the second for loop performs O(E) find and union operations. Assuming union by rank and path compression, this is O((V + E) α(V)) + O(E lg E). Since G is connected, |E| >= |V| - 1, so this is O(E α(V)) + O(E lg E); α(|V|) = O(lg V) = O(lg E), therefore the total time is O(E lg E). Since |E| <= |V|^2, lg |E| = O(2 lg V) = O(lg V), giving O(E lg V) time.
Basic algorithm: sort all edges in increasing order by weight. Pick the smallest edge and check whether it forms a cycle with the spanning tree formed so far; if it forms no cycle, include it, otherwise discard it.
Repeat until (V - 1) edges are in the spanning tree.
Kruskal's algorithm starts with each vertex in its own component and repeatedly merges two components into one by choosing a light edge that connects them; it scans the set of edges in monotonically increasing order by weight and uses a disjoint-set data structure to determine whether an edge connects vertices in different components.

def kruskal(graph):
    for vertice in graph['vertices']:
        make_set(vertice)
    minimum_spanning_tree = set()
    edges = list(graph['edges'])
    edges.sort()
    for edge in edges:
        weight, vertice1, vertice2 = edge
        if find(vertice1) != find(vertice2):
            union(vertice1, vertice2)
            minimum_spanning_tree.add(edge)
    return minimum_spanning_tree

Proof. We are given a graph G = (V, E) with costs on the edges, and we want to find a spanning tree of minimum cost. We use Kruskal's algorithm, which sorts the edges in order of increasing cost and tries to add them in that order, leaving edges out only if they create a cycle with the previously selected edges. Let T = (V, F) be the spanning tree produced by Kruskal's algorithm, and let T* = (V, F*) be a minimum spanning tree. If T is not optimal then F* ≠ F, and there is an edge e ∈ F* such that e ∉ F. Then e creates a cycle C in the graph T + e, and at least one edge f of this cycle crosses the cut defined by T* - e. Furthermore, e ∉ F because we tried to add it after the rest of C was already in the tree; then e was the most expensive edge in C, and so cost(f) <= cost(e). If we add the edge f to the graph T* - e, we reconnect the graph and create a spanning tree. Also, cost(T* - e + f) = cost(T*) - cost(e) + cost(f) <= cost(T*), so we have created a new spanning tree of no more cost than T*, but with one more edge in common with T. We can do this for every edge that differs between T and T*, until we obtain the tree T, of no more cost than T*, contradicting that T was not optimal.

7.7.2 Prim's Algorithm
1) Create a set mstSet that keeps track of the vertices already included in the MST.
2) Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE; assign key value 0 to the first vertex so that it is picked first.
3) While mstSet does not include all vertices: a) pick a vertex u that is not in mstSet and has minimum key value; b) include u in mstSet; c) update the key values of all vertices adjacent to u: for every adjacent vertex v, if the weight of edge u-v is less than the previous key value of v, update v's key value to the weight of u-v.
A greedy algorithm that finds a minimum spanning tree for a connected weighted undirected graph. It builds on one tree, so A is always a tree; it starts from an arbitrary "root" r and at each step adds a light edge crossing the cut (V_A, V - V_A) to A, where V_A is the set of vertices that A is incident on.
Using different data structures for representing the graph and for finding the minimum-weight edge: adjacency matrix with linear search O(|V|^2); binary heap and adjacency list O((|V| + |E|) log |V|) = O(|E| log |V|); Fibonacci heap and adjacency list O(|E| + |V| log |V|).
In the method that uses binary heaps, the traversal is executed O(V + E) times (similar to BFS), and each traversal has an operation that takes O(log V) time, so the overall time complexity is O((E + V) log V) = O(E log V) (for a connected graph, V = O(E)).
Proof. Let P be a connected, weighted graph. At every iteration of Prim's algorithm, an edge must be found that connects a vertex in a subgraph to a vertex outside the subgraph. Since P is connected, there will always be a path to every vertex. The output Y of Prim's algorithm is a tree, because the edge and vertex added to tree Y are connected. Let Y1 be a minimum spanning tree of graph P. If Y1 = Y then Y is a minimum spanning tree.
Otherwise, let e be the first edge added during the construction of tree Y that is not in tree Y1, and let V be the set of vertices connected by the edges added before edge e. Then one endpoint of edge e is in the set V and the other is not. Since tree Y1 is a spanning tree of graph P, there is a path in tree Y1 joining the two endpoints. As one travels along the path, one must encounter an edge f joining a vertex in set V to one that is not in set V. Now, at the iteration when edge e was added to tree Y, edge f could also have been added, and it would have been added instead of edge e if its weight were less than e's; since edge f was not added, we conclude that w(f) >= w(e).
Let tree Y2 be the graph obtained by removing edge f from, and adding edge e to, tree Y1. It is easy to show that tree Y2 is connected, has the same number of edges as tree Y1, and the total weight of its edges is not larger than that of tree Y1; therefore it is also a minimum spanning tree of graph P, and it contains edge e and all the edges added before it during the construction of set V. Repeating the steps above, we eventually obtain a minimum spanning tree of graph P that is identical to tree Y. This shows Y is a minimum spanning tree.

7.8 Shortest Path Algorithms
Shortest paths are not necessarily unique. Single-source shortest paths: find shortest paths from a given source vertex to every vertex v ∈ V. A modified breadth-first search can be used to count the number of edge traversals needed to reach another vertex.

7.8.1 Dijkstra's Algorithm

7.8.2 Bellman-Ford Algorithm
Like Dijkstra's algorithm, Bellman-Ford is based on the principle of relaxation, in which an approximation to the correct distance is gradually replaced by more accurate values until eventually reaching the optimum solution. Dijkstra's algorithm greedily selects the minimum-weight node that has not yet been processed and performs this relaxation process on all of its outgoing edges; by contrast, the Bellman-Ford algorithm simply relaxes all the edges, and does this |V| - 1 times, where |V| is the number of vertices in the graph. In each of these repetitions, the number of vertices with correctly calculated distances grows, from which it follows that eventually all vertices have their correct distances. This method allows the Bellman-Ford algorithm to be applied to a wider class of inputs than Dijkstra. Bellman-Ford runs in O(|V| * |E|) time, where |V| and |E| are the number of vertices and edges respectively.
Correctness

7.8.3 Code
from pythonds.graphs import PriorityQueue, Graph, Vertex

def dijkstra(aGraph, start):
    pq = PriorityQueue()
    start.setDistance(0)
    pq.buildHeap([(v.getDistance(), v) for v in aGraph])
    while not pq.isEmpty():
        currentVert = pq.delMin()
        for nextVert in currentVert.getConnections():
            newDist = currentVert.getDistance() + currentVert.getWeight(nextVert)
            if newDist < nextVert.getDistance():
                nextVert.setDistance(newDist)
                nextVert.setPred(currentVert)
                pq.decreaseKey(nextVert, newDist)

## BELLMAN FORD
# Step 1: for each node prepare the destination and predecessor
def bel_initialize(graph, source):
    d = {}  # stands for destination (distance estimate)
    p = {}  # stands for predecessor
    for node in graph:
        d[node] = float('Inf')  # start by assuming the other nodes are very far away
        p[node] = None
    d[source] = 0  # for the source we know how to reach it
    return d, p

def bel_relax(node, neighbour, graph, d, p):
    if d[neighbour] > d[node] + graph[node][neighbour]:
        # record this lower distance
        d[neighbour] = d[node] + graph[node][neighbour]
        p[neighbour] = node

def bellman_ford(graph, source):
    d, p = bel_initialize(graph, source)
    for i in range(len(graph) - 1):  # run this until it converges
        for u in graph:
            for v in graph[u]:       # for each neighbour of u
                bel_relax(u, v, graph, d, p)
    # Step 3: check for negative-weight cycles
    for u in graph:
        for v in graph[u]:
            assert d[v] <= d[u] + graph[u][v]
    return d, p

8 Bipartite Graphs
A graph is bipartite iff it does not contain an odd cycle. Run DFS and build a DFS tree of the graph, then colour the vertices red and black by level; if all non-tree edges join vertices of different colours then the graph is bipartite.

8.1 Gale-Shapley
Correctness: the algorithm must terminate after at most n^2 iterations, because each time through the while loop a man proposes to a new woman and there are only n^2 possible proposals. All men and women get matched: if some man and some woman were both unmatched at the end, the woman can never have been proposed to, but the unmatched man must have proposed to everyone, a contradiction. The matching must be stable: suppose (m, w) were an unstable pair, with each preferring the other to their assigned partner. Either m never proposed to w, but men propose in decreasing order of preference, so m prefers his final partner to w, a contradiction; or m did propose to w, in which case w rejected him (then or later) for someone she prefers, so w prefers her final partner to m, again a contradiction.
All executions yield a man-optimal assignment, which is a stable matching. Suppose some man is paired with someone other than his best valid partner. Since men propose in decreasing order of preference, some man is rejected by a valid partner. Let Y be the first such man, and let A be the first valid woman that rejects him. Let S be a stable matching where A and Y are matched. In building the matching, when Y is rejected, A forms (or reaffirms) an engagement with a man, say Z, whom she prefers to Y. Let B be Z's partner in S. In building the matching, Z is not rejected by any valid partner at the point when Y is rejected by A (Y was the first), thus Z prefers A to B. But A prefers Z to Y. Thus A-Z is unstable in S, a contradiction.

def matchmaker():
    guysfree = guys[:]
    engaged = {}
    guyprefers2 = copy.deepcopy(guyprefers)
    galprefers2 = copy.deepcopy(galprefers)
    while guysfree:
        guy = guysfree.pop(0)
        guyslist = guyprefers2[guy]
        gal = guyslist.pop(0)
        fiance = engaged.get(gal)
        if not fiance:
            # She's free
            engaged[gal] = guy
            print(" %s and %s" % (guy, gal))
        else:
            # The bounder proposes to an engaged lass!
            galslist = galprefers2[gal]
            if galslist.index(fiance) > galslist.index(guy):
                # She prefers the new guy
                engaged[gal] = guy
                print(" %s dumped %s for %s" % (gal, fiance, guy))
                if guyprefers2[fiance]:
                    # Ex has more girls to try
                    guysfree.append(fiance)
            else:
                # She is faithful to her old fiance
                if guyslist:
                    # Look again
                    guysfree.append(guy)
    return engaged

9 Network Flow
Capacity constraint: ∀u, v ∈ V, f(u, v) <= c(u, v).
Skew symmetry: ∀u, v ∈ V, f(u, v) = -f(v, u).
Flow conservation: ∀u ∈ V - {s, t}, Σ_{v ∈ V} f(u, v) = 0.

9.1 Ford-Fulkerson Algorithm

class Edge(object):
    def __init__(self, u, v, w):
        self.source = u
        self.sink = v
        self.capacity = w
    def __repr__(self):
        return "%s->%s:%s" % (self.source, self.sink, self.capacity)

class FlowNetwork(object):
    def __init__(self):
        self.adj = {}
        self.flow = {}

    def add_vertex(self, vertex):
        self.adj[vertex] = []

    def get_edges(self, v):
        return self.adj[v]

    def add_edge(self, u, v, w=0):
        if u == v:
            raise ValueError("u == v")
        edge = Edge(u, v, w)
        redge = Edge(v, u, 0)
        edge.redge = redge
        redge.redge = edge
        self.adj[u].append(edge)
        self.adj[v].append(redge)
        self.flow[edge] = 0
        self.flow[redge] = 0

    def find_path(self, source, sink, path):
        if source == sink:
            return path
        for edge in self.get_edges(source):
            residual = edge.capacity - self.flow[edge]
            if residual > 0 and edge not in path:
                result = self.find_path(edge.sink, sink, path + [edge])
                if result != None:
                    return result

    def max_flow(self, source, sink):
        path = self.find_path(source, sink, [])
        while path != None:
            residuals = [edge.capacity - self.flow[edge] for edge in path]
            flow = min(residuals)
            for edge in path:
                self.flow[edge] += flow
                self.flow[edge.redge] -= flow
            path = self.find_path(source, sink, [])
        return sum(self.flow[edge] for edge in self.get_edges(source))

Finding a path from s to t takes O(|E|) by either BFS or DFS; the flow increases by at least 1 at each iteration, so the algorithm runs in O(E * f), where f is the maximum flow in the graph.
[Figure: example flow network ("Ford Fulkerson forever") with source s, vertices v1-v4, edges e1-e3, and sink t.]

9.2 Dijkstra's Algorithm
Does not allow negative weights. Uses a priority queue whose keys are shortest-path weights. Similar to Prim's algorithm, but computing d[v] using shortest-path weights as keys. At each step we make the greedy choice and pick the light edge.

9.3 Bellman-Ford
Allows negative-weight edges. Returns TRUE if no negative-weight cycles are reachable from s, FALSE otherwise; if it has not converged after |V(G)| - 1 iterations, then there cannot be a shortest-path tree, so there must be a negative-weight cycle. When the algorithm is used to find shortest paths, the existence of negative cycles is a problem, preventing the algorithm from finding a correct answer. However, since it terminates upon finding a negative cycle, the Bellman-Ford algorithm can be used for applications in which this is the target to be sought, for example in cycle-cancelling techniques in network flow analysis.
Proof. Lemma: after i repetitions of the for loop: if Distance(u) is not infinity, it is equal to the length of some path from s to u; and if there is a path from s to u with at most i edges, then Distance(u) is at most the length of the shortest path from s to u with at most i edges.
Proof. For the base case of the induction, consider i = 0 and the moment before the for loop is executed for the first time. Then, for the source vertex, source.distance = 0, which is correct. For other vertices u, u.distance = infinity, which is also correct because there is no path from source to u with 0 edges.
For the inductive case, we first prove the first part.
Consider a moment when a vertex's distance is updated by v.distance := u.distance + uv.weight. By the inductive assumption, u.distance is the length of some path from source to u. Then u.distance + uv.weight is the length of the path from source to v that follows the path from source to u and then goes to v.
For the second part, consider the shortest path from source to u with at most i edges. Let v be the last vertex before u on this path. Then the part of the path from source to v is the shortest path from source to v with at most i - 1 edges. By the inductive assumption, v.distance after i - 1 iterations is at most the length of this path. Therefore uv.weight + v.distance is at most the length of the path from s to u. In the i-th iteration, u.distance gets compared with uv.weight + v.distance, and is set equal to it if uv.weight + v.distance is smaller. Therefore, after i iterations, u.distance is at most the length of the shortest path from source to u that uses at most i edges.
If there are no negative-weight cycles, then every shortest path visits each vertex at most once, so at step 3 no further improvements can be made. Conversely, suppose no improvement can be made. Then for any cycle with vertices v[0], ..., v[k-1],
v[i].distance <= v[(i-1) mod k].distance + v[(i-1) mod k]v[i].weight.
Summing around the cycle, the v[i].distance terms and the v[(i-1) mod k].distance terms cancel, leaving
0 <= sum from i = 1 to k of v[(i-1) mod k]v[i].weight,
i.e., every cycle has nonnegative weight.

10 Dynamic Programming

10.1 Best Alignment
Levenshtein distance: the alignment requiring the minimal number of substitutions, insertions, and deletions is considered optimal.

11 Some Proofs

11.1 Minimum Spanning Trees
1. Show that if an edge (u, v) is the unique light edge crossing some cut of the connected, weighted undirected graph G, then (u, v) must be included in all minimum spanning trees of G.
Solution: Suppose for contradiction that there is some MST T that does not contain the light edge (u, v) crossing a cut C. Because T is a connected tree, there is some path in T connecting the vertices u and v, and therefore adding the edge (u, v) to T forms a cycle. Following this cycle, there must be at least one other edge in the cycle crossing the cut C. Removing this edge and keeping (u, v), we now have a tree with a lower total weight than the original T. Because T was assumed to be an MST, this is a contradiction. Therefore, any unique light edge of a cut must be part of every MST.
2. Suppose that we have a connected, weighted undirected graph G = (V, E) such that every cut of G has a unique light edge crossing the cut. Show that G has exactly one minimum spanning tree.
Solution: From the previous proof (proof 1 above) we know that for every cut of G, the unique light edge crossing the cut must be part of every MST of G. Consider the set T of edges containing the unique light edge crossing each cut C of G. By the previous part, we know that T is a subset of every MST. Now we can show that T must be equal to every MST. Because there is exactly one edge of T crossing every cut of G, the graph (V, T) must have exactly one connected component: if there were two unconnected components V1 and V - V1, then the cut (V1, V - V1) would not be crossed by any edge of T, which contradicts the definition of T. In addition, T cannot contain any cycles: the heaviest edge on a cycle in T cannot be the unique lightest edge in any cut. Then T itself must be a spanning tree of G.
Because T is a subset of all minimum spanning trees of G, it must be the unique MST of G, as any additional edge added to T would form a cycle, making it no longer a tree.
3. Given any connected undirected graph G with positive edge weights w, does there always exist a shortest-path tree S such that S is a minimum spanning tree of G?
Solution: counterexample. Draw a square with an X in it: the outside edges have weight 2 and the diagonal inner edges have weight 3. The minimum spanning trees are any three of the weight-2 edges, while the shortest-path tree from each node consists of exactly the three edges incident to that node (two sides and a diagonal), so no shortest-path tree is an MST.
4. Does there exist some connected undirected graph G with positive edge weights w such that G has a shortest-path tree S and a minimum spanning tree T that do not share any edges? Prove your answer.
Solution: Assuming that G has more than one vertex, there is no such graph. Given any source vertex s, consider the light edges leaving s, defined as the edges (s, v) of minimum weight (where v is some neighbor of s).
5. Lemma. Every MST of G must contain at least one of these light edges.
Proof: Suppose there is some MST T containing none of the light edges leaving s. Consider any light edge (s, u). There is some path from s to u in T, and therefore adding (s, u) to T forms a cycle. There must be some other edge (s, v) in the cycle such that w(s, u) < w(s, v), as we have assumed that no light edge leaving s is contained in T. Removing (s, v) from T and replacing it with (s, u) forms a spanning tree of smaller total weight than T, contradicting our assumption that T is an MST.
6. Lemma. All of the light edges leaving s must be contained in all shortest-path trees rooted at s.
Proof: If (s, u) is a light edge leaving s, then because G has positive edge weights, the single edge (s, u) must be the unique shortest path from s to u. Any other path from s to u must go through some edge (s, v), and by definition we know that w(s, v) >= w(s, u); therefore, all other paths from s to u must have strictly greater weight than the single edge (s, u).
These two lemmas together show that for any source node s, all shortest-path trees from s must share some edge with all MSTs of G, so there is no shortest-path tree S and MST T that share no edges.
7. Either G has some path of length at least k, or G has O(kn) edges.
Proof: Look at the longest path in the DFS tree. If it has length at least k, we're done. Otherwise, since each edge connects an ancestor and a descendant, we can bound the number of edges by counting the total number of ancestors of each descendant; if the longest path is shorter than k, each descendant has at most k - 1 ancestors, so there can be at most (k - 1)n edges.

11.2 Shortest Paths
Any subpath of a shortest path is a shortest path. Suppose some path p is the shortest path from u to v, and x and y lie on p, so p contains a path from x to y. Suppose there were a shorter path from x to y; then splicing it in would give a shorter path from u to v, contradicting the hypothesis that p is the shortest possible path.

11.3 Graphs
1. A graph is bipartite iff it does not contain an odd cycle. Proof sketch: in a bipartite graph every cycle alternates between the two sides and so has even length; conversely, if there is no odd cycle, 2-colour the vertices by the parity of their BFS level, and any edge joining two vertices of the same parity would close an odd cycle, so the colouring is proper.

11.4 DFS
Corollary. For v ≠ u, v is a descendant of u in a DFS tree if and only if s[u] < s[v] < f[v] < f[u].
Proof. This is an "if and only if" statement, so we must prove two directions.
(back): The condition s[u] < s[v] < f[v] < f[u] implies that v is added to the stack during the time u is Gray. When u is Gray, only descendants of u are added to the stack.
Therefore v is a descendant of u.
(forward): For this direction, it suffices to show that if u is the parent of v in the DFS tree, then s[u] < s[v] < f[v] < f[u]. Suppose that u is the parent of v; then u is the last vertex that causes v to be added to the stack, so at time s[u] v is White, i.e. s[u] < s[v]. Furthermore, u is not added to the stack after s[u], so when v is colored Black, u is still on the stack and has not yet been colored Black. Hence f[v] < f[u]. QED

11.5 BFS
Vertices in order: the proof that breadth-first search lists vertices level by level goes by induction on the level number. By the induction hypothesis, BFS lists all vertices at level k - 1 before those at level k. Therefore it will place into L all vertices at level k before all those of level k + 1, and therefore it lists those of level k before those of level k + 1.

11.6 Code

def dfs(graph, start):
    visited, stack = set(), [start]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.add(vertex)
            stack.extend(graph[vertex] - visited)
    return visited

def dfs_paths(graph, start, goal):
    stack = [(start, [start])]
    while stack:
        (vertex, path) = stack.pop()
        for next in graph[vertex] - set(path):
            if next == goal:
                yield path + [next]
            else:
                stack.append((next, path + [next]))

def bfs(graph, start):
    visited, queue = set(), [start]
    while queue:
        vertex = queue.pop(0)
        if vertex not in visited:
            visited.add(vertex)
            queue.extend(graph[vertex] - visited)
    return visited

12 Divide and Conquer

12.1 Merge Sort
Recurrence analysis:

def mergesort(arr):
    if len(arr) == 1:
        return arr
    m = len(arr) // 2
    l = mergesort(arr[:m])
    r = mergesort(arr[m:])
    if not len(l) or not len(r):
        return l or r
    result = []
    i = j = 0
    while len(result) < len(r) + len(l):
        if l[i] < r[j]:
            result.append(l[i])
            i += 1
        else:
            result.append(r[j])
            j += 1
        if i == len(l) or j == len(r):
            result.extend(l[i:] or r[j:])
            break
    return result

12.2 Karatsuba Multiplication

_CUTOFF = 1536  # threshold below which grade-school multiplication is used (value illustrative)

def multiply(x, y):
    if x.bit_length() <= _CUTOFF or y.bit_length() <= _CUTOFF:
        return x * y
    else:
        n = max(x.bit_length(), y.bit_length())
        half = (n + 32) // 64 * 32
        mask = (1 << half) - 1
        xlow = x & mask
        ylow = y & mask
        xhigh = x >> half
        yhigh = y >> half
        a = multiply(xhigh, yhigh)
        b = multiply(xlow + xhigh, ylow + yhigh)
        c = multiply(xlow, ylow)
        d = b - a - c
        return (((a << half) + d) << half) + c

Prim's algorithm (see 7.7.2; uses the same pythonds PriorityQueue as dijkstra above, plus sys.maxsize):

def prim(G, start):
    pq = PriorityQueue()
    for v in G:
        v.setDistance(sys.maxsize)
        v.setPred(None)
    start.setDistance(0)
    pq.buildHeap([(v.getDistance(), v) for v in G])
    while not pq.isEmpty():
        currentVert = pq.delMin()
        for nextVert in currentVert.getConnections():
            newCost = currentVert.getWeight(nextVert) + currentVert.getDistance()
            if nextVert in pq and newCost < nextVert.getDistance():
                nextVert.setPred(currentVert)
                nextVert.setDistance(newCost)
                pq.decreaseKey(nextVert, newCost)

12.3 Block Matrix Multiplication

12.4 Strassen's Algorithm
The Strassen algorithm, named after Volker Strassen, is an algorithm used for matrix multiplication. It is faster than the standard matrix multiplication algorithm and is useful in practice for large matrices, but would be slower than the fastest known algorithms for extremely large matrices. It assumes the matrices are square and the size is a power of two, with padding used if needed. This restriction allows the matrices to be split in half, recursively, until the limit of scalar multiplication is reached.
The number of additions and multiplications required in the Strassen algorithm can be calculated as follows: let f(n) be the number of operations for a 2^n x 2^n matrix.
Then by recursive application of the Strassen algorithm, we see that f(n) = 7f(n-1) + l*4^n, for some constant l that depends on the number of additions performed at each application of the algorithm. Hence f(n) = (7 + o(1))^n, i.e., the asymptotic complexity for multiplying matrices of size N = 2^n using the Strassen algorithm is O((7 + o(1))^n) = O(N^(log2 7 + o(1))).
The proof that Strassen's algorithm should exist is a simple dimension count (combined with a proof that the naive dimension count gives the correct answer). Consider the vector space of all bilinear maps C^n x C^n -> C^n; this is a vector space of dimension n^3 (in the case of matrix multiplication we have n = m^2, e.g. n = 4 for the 2 x 2 case). The set of bilinear maps of rank one, i.e., those computable in an algorithm using just one scalar multiplication, has dimension 3(n - 1) + 1, and the set of bilinear maps of rank at most r has dimension the minimum of r*3(n - 1) + r and n^3 for most values of n, r (and one can check that this is correct when r = 7, n = 4). Thus any bilinear map C^4 x C^4 -> C^4 has, with probability one, rank at most 7, and may always be approximated to arbitrary precision by a bilinear map of rank at most 7.

12.5 Dot Product
The grade-school dot product is optimal.

12.6 Fast Fourier
Given polynomials A(x) and B(x), adding them is O(n); multiplying them is O(n) in the point-value representation, but we need 2n + 1 points; evaluation using Lagrange's formula is O(n^2).

13 Amortized Analysis
Works well for online algorithms that can process input piece-by-piece in a serial fashion.

13.1 Aggregate Analysis
Determines an upper bound T(n) for a sequence of n operations and then calculates the amortized cost as T(n)/n.

13.2 Accounting Method
Determines the cost of each operation, combining its immediate execution cost and its influence on future operations. Usually short-running operations accumulate a debt of unfavourable state in small increments, while long-running operations decrease it drastically. The total amortized cost charged so far must always be greater than or equal to the total actual cost so far.

13.3 Potential Method
Like the accounting method, but overcharges operations early to compensate for undercharges later.

13.4 Infinite Binary Counter
Prove that any non-negative integer n can be represented as the sum of distinct powers of 2.
Proof: The base case n = 0 is trivial. For any n > 0, the inductive hypothesis implies that there is a set of distinct powers of 2 whose sum is n - 1. If we add 2^0 to this set, we obtain a multiset of powers of two whose sum is n, which might contain two copies of 2^0. Then, as long as there are two copies of any 2^i in the multiset, we remove them both and insert 2^(i+1) in their place. The sum of the elements of the multiset is unchanged by this replacement, because 2^(i+1) = 2^i + 2^i. Each iteration decreases the size of the multiset by 1, so the replacement process must eventually terminate. When it does terminate, we have a set of distinct powers of 2 whose sum is n.
Naive runtime: Now suppose we want to use INCREMENT to count from 0 to n. If we only use the worst-case running time for each increment, we get an upper bound of O(n log n) on the total running time.
Summation/aggregate method: the least significant bit B[0] flips every time, B[1] every other time, and in general B[i] flips every 2^i-th time. n increments flip bit B[i] ⌊n/2^i⌋ times, so the total number of bit flips is Σ_{i=0}^{⌊lg n⌋} ⌊n/2^i⌋ < Σ_{i=0}^{∞} n/2^i = 2n. Thus on average each call flips two bits and thus runs in constant time.
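A minimal sketch of the INCREMENT operation being analysed here, assuming the counter is stored as a Python list of bits with B[0] least significant (the representation is an assumption, not specified in the notes):

def increment(B):
    i = 0
    while i < len(B) and B[i] == 1:
        B[i] = 0          # each 1 -> 0 reset was prepaid when that bit was set
        i += 1
    if i == len(B):
        B.append(1)       # grow the "infinite" counter as needed
    else:
        B[i] = 1          # the single 0 -> 1 flip; charge it 2 dollars
    return B

Counting from 0 to n calls this n times; the aggregate bound above says the total number of bit flips is less than 2n.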
Accounting method: charge 2 dollars for setting a bit from 0 to 1; one dollar pays for the flip itself and the other is saved for flipping the bit back to 0 later, so we always have enough credit to pay for the next increment that touches that bit.

13.5 Queue
Earlier in the semester we saw a way of implementing a queue (FIFO) using two stacks (LIFO). Say that our stack has three operations, push, pop and empty, each with cost 1. We saw that a queue can be implemented as:
enqueue(x): push x onto stack1.
dequeue(): if stack2 is empty, pop the entire contents of stack1, pushing each element in turn onto stack2; now pop from stack2 and return the result.
We've seen earlier that this algorithm is correct; now we will consider the running time in more detail. A conventional worst-case analysis would establish that dequeue takes O(n) time, but this is clearly a weak bound for a sequence of operations, because very few dequeues actually take that long. Thus O(n^2) is not a very accurate characterization of the time needed for a sequence of n enqueue and dequeue operations, even though in the worst case an individual dequeue can take O(n) time. To simplify the amortized analysis, we will consider only the cost of the push and pop operations and not of checking whether stack2 is empty.
Aggregate method: each element is clearly pushed at most twice and popped at most twice, at most once from each stack. If an element is enqueued and never dequeued, then it is pushed at most twice and popped at most once. Thus the amortized cost of each enqueue is 3 and of each dequeue is 1.
Banker's method: each enqueue is charged $3. This covers the $2 cost of popping it from stack1 and pushing it onto stack2 if that ever needs to be done, plus $1 for the initial push onto stack1. The dequeue operations cost $1 to pop from stack2.
Note that the analysis in both cases seems to charge more for storing than removing, even though in the code it is the other way around. Amortized analysis bounds the overall sequence, which in this case depends on how much is stored in the data structure; it does not bound the individual operations.

14 Randomized Algorithms

14.1 Karger Contraction Algorithm
Pick an edge uniformly at random. Contract the edge by replacing its two endpoints with a single supernode, keeping parallel edges but deleting self-loops. Repeat until the graph has just two nodes. Return the cut (all nodes that were contracted into one side). By repeating this algorithm n^2 log n times with independent random choices, the probability of failing to find the min-cut is <= 1/n^2.
Proof for fastmincut (Karger-Stein): the probability of finding a specific cutset is P(n) = 1 - (1 - (1/2) P(⌈1 + n/√2⌉))^2, with solution P(n) = O(1/log n). The running time of fastmincut satisfies T(n) = 2T(⌈1 + n/√2⌉) + O(n^2), with solution T(n) = O(n^2 log n). To achieve error probability O(1/n) the algorithm can be repeated O(log n / P(n)) times, for an overall runtime of O(n^2 log^3 n).

14.2 Maximum 3-Satisfiability
MAX-3SAT is a problem in the computational complexity subfield of computer science. It generalises the Boolean satisfiability problem (SAT), which is a decision problem considered in complexity theory. It is defined as: given a 3-CNF formula φ (i.e., with at most 3 variables per clause), find an assignment that satisfies the largest number of clauses.
Approx-Max3SAT is 7/8-approximate. Proof: let the random variable Z_j be 1 if clause C_j is satisfied and 0 otherwise; the sum from j = 1 to k is the number of clauses satisfied.
The expected number of clauses satisfied is Σ_{j=1}^{k} Pr[C_j is satisfied] = 7k/8.
Corollary (lower bound on the number of satisfiable clauses): for any instance of 3-SAT, there exists a truth assignment that satisfies at least 7/8 of the clauses. Proof: a random variable is at least its expectation some of the time.
Corollary: any instance of 3-SAT with at most 7 clauses is satisfiable. Proof: follows from the lower bound on the number of satisfiable clauses.

16 Master Theorem
T(n) = aT(n/b) + f(n), where a >= 1 and b > 1 are constants and f(n) is asymptotically positive.
Case 1: if f(n) = O(n^(log_b a - ε)) for some ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: if f(n) = Θ(n^(log_b a) log^k n) with k >= 0, then T(n) = Θ(n^(log_b a) log^(k+1) n).
Case 3: if f(n) = Ω(n^(log_b a + ε)) for some ε > 0 and f(n) satisfies the regularity condition, then T(n) = Θ(f(n)). The regularity condition is a*f(n/b) <= c*f(n) for some c < 1 and all sufficiently large n.
Examples (the applicable case is given on the right):
T(n) = 3T(n/2) + n^2 = Θ(n^2) (case 3)
T(n) = 4T(n/2) + n^2 = Θ(n^2 log n) (case 2)
T(n) = T(n/2) + 2^n = Θ(2^n) (case 3)
T(n) = 2^n T(n/2) + n^n: does not apply (a is not constant)
T(n) = 16T(n/4) + n = Θ(n^2) (case 1)
T(n) = 2T(n/2) + n log n = Θ(n log^2 n) (case 2)
T(n) = 2T(n/2) + n/log n: does not apply (non-polynomial difference)
T(n) = 2T(n/4) + n^0.51 = Θ(n^0.51) (case 3)
T(n) = 0.5T(n/2) + 1/n: does not apply (a < 1)
T(n) = 16T(n/4) + n! = Θ(n!) (case 3)
T(n) = √2 T(n/2) + log n = Θ(√n) (case 1)
T(n) = 3T(n/3) + √n = Θ(n) (case 1)
T(n) = 4T(n/2) + cn = Θ(n^2) (case 1)
T(n) = T(n/2) + n(2 - cos n): does not apply (regularity condition violated)

18 Red-Black Trees
A red-black tree is a binary search tree with an extra bit of data per node, its color, which can be either red or black. The extra bit of storage ensures an approximately balanced tree by constraining how nodes are colored on any path from the root to a leaf. Thus, it is a data structure which is a type of self-balancing binary search tree.

16.1 Proof
Probably too long to go onto the exam. In addition to the requirements imposed on a binary search tree, the following must be satisfied by a red-black tree:
1. A node is either red or black.
2. The root is black. (This rule is sometimes omitted. Since the root can always be changed from red to black, but not necessarily vice versa, this rule has little effect on analysis.)
3. All leaves (NIL) are black. (All leaves are the same color as the root.)
4. Every red node must have two black child nodes.
5. Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes.

15 Common Recurrence Relations
T(n) = 2T(n/2) + n
     = 4[2T(n/8) + n/4] + 2n
     = 2^k T(n/2^k) + kn
     = 2^(log2 n) T(1) + (log2 n) * n = n + n log2 n,
so T(n) = O(n log n).

Recurrence | Algorithm | Big O
T(n/2) + Θ(1) | Binary Search | O(log n)
T(n-1) + Θ(1) | Sequential Search | O(n)
2T(n/2) + Θ(1) | Tree traversal | O(n)
T(n-1) + Θ(n) | Selection Sort (n^2 sorts) | O(n^2)
2T(n/2) + Θ(n) | Mergesort | O(n log n)
T(n-1) + T(0) + Θ(n) | Quicksort | O(n^2)

16.2 Height of Recursion Tree

17 Akra-Bazzi Theorem
Generalizes the master theorem to divide-and-conquer algorithms.
Given a_i > 0, 0 < b_i <= 1, functions h_i(n) = O(n/log^2 n) and g(n) = O(n^c), if the function T(n) satisfies T(n) = Σ_{i=1}^{k} a_i T(b_i n + h_i(n)) + g(n), then T(n) = Θ(n^p (1 + ∫_1^n g(u)/u^(p+1) du)), where p satisfies Σ_{i=1}^{k} a_i b_i^p = 1.
The examples below are, in order, randomized quicksort, deterministic selection, randomized search trees, and ham-sandwich trees:
T(n) = T(3n/4) + T(n/4) + n = Θ(n(1 + ∫_1^n du/u)) = O(n log n)
T(n) = T(n/5) + T(7n/10) + n = Θ(n^p (1 + Θ(n^(1-p)))) = Θ(n)
T(n) = (1/4)T(n/4) + (3/4)T(3n/4) + 1 = Θ(1 + ∫_1^n du/u) = Θ(log n)
T(n) = T(n/2) + T(n/4) + 1 = Θ(n^p (1 + Θ(1))) = O(n^(log φ))

Proof of the red-black tree height bound:
Lemma 1: any node x with height h(x) has black-height bh(x) >= h(x)/2. Proof: by property 4, at most h/2 nodes on the path from the node down to a leaf are red, hence at least h/2 are black.
Lemma 2: the subtree rooted at any node x contains >= 2^bh(x) - 1 internal nodes. Proof: by induction on the height of x. Base case: height h(x) = 0 implies x is a leaf, so bh(x) = 0, and the subtree has 2^0 - 1 = 0 internal nodes. Induction step: 1. Each child of x has height h(x) - 1 and black-height either bh(x) (if the child is red) or bh(x) - 1 (if the child is black). 2. By the induction hypothesis, each child subtree has >= 2^(bh(x)-1) - 1 internal nodes. 3. So the subtree rooted at x has >= 2(2^(bh(x)-1) - 1) + 1 = 2^bh(x) - 1 internal nodes (the +1 is for x itself).
Lemma 3: a red-black tree with n internal nodes has height at most 2 lg(n + 1). Proof: by Lemma 2, n >= 2^bh - 1; by Lemma 1, bh >= h/2; thus n >= 2^(h/2) - 1, which implies h <= 2 lg(n + 1).

18.1 Insertion

18.2 Code

class rbnode(object):
    def __init__(self, key):
        "Construct."
        self._key = key
        self._red = False
        self._left = None
        self._right = None
        self._p = None
    # read-only accessors; the tree code below uses node.key, node.red, etc.
    key = property(fget=lambda self: self._key)
    red = property(fget=lambda self: self._red)
    left = property(fget=lambda self: self._left)
    right = property(fget=lambda self: self._right)
    p = property(fget=lambda self: self._p)

class rbtree(object):
    def __init__(self, create_node=rbnode):
        self._nil = create_node(key=None)
        "Our nil node, used for all leaves."
        self._root = self.nil
        "The root of the tree."
        self._create_node = create_node
        "A callable that creates a node."

    root = property(fget=lambda self: self._root, doc="The tree's root node")
    nil = property(fget=lambda self: self._nil, doc="The tree's nil node")

    def search(self, key, x=None):
        if None == x:
            x = self.root
        while x != self.nil and key != x.key:
            if key < x.key:
                x = x.left
            else:
                x = x.right
        return x

    def minimum(self, x=None):
        if None == x:
            x = self.root
        while x.left != self.nil:
            x = x.left
        return x

    def maximum(self, x=None):
        if None == x:
            x = self.root
        while x.right != self.nil:
            x = x.right
        return x

    def insert_key(self, key):
        self.insert_node(self._create_node(key=key))

    def insert_node(self, z):
        y = self.nil
        x = self.root
        while x != self.nil:
            y = x
            if z.key < x.key:
                x = x.left
            else:
                x = x.right
        z._p = y
        if y == self.nil:
            self._root = z
        elif z.key < y.key:
            y._left = z
        else:
            y._right = z
        z._left = self.nil
        z._right = self.nil
        z._red = True
        self._insert_fixup(z)

    def _insert_fixup(self, z):
        while z.p.red:
            if z.p == z.p.p.left:
                y = z.p.p.right
                if y.red:
                    z.p._red = False
                    y._red = False
                    z.p.p._red = True
                    z = z.p.p
                else:
                    if z == z.p.right:
                        z = z.p
                        self._left_rotate(z)
                    z.p._red = False
                    z.p.p._red = True
                    self._right_rotate(z.p.p)
            else:
                y = z.p.p.left
                if y.red:
                    z.p._red = False
                    y._red = False
                    z.p.p._red = True
                    z = z.p.p
                else:
                    if z == z.p.left:
                        z = z.p
                        self._right_rotate(z)
                    z.p._red = False
                    z.p.p._red = True
                    self._left_rotate(z.p.p)
        self.root._red = False

    def _left_rotate(self, x):
        y = x.right
        x._right = y.left
        if y.left != self.nil:
            y.left._p = x
        y._p = x.p
        if x.p == self.nil:
            self._root = y
        elif x == x.p.left:
            x.p._left = y
        else:
            x.p._right = y
        y._left = x
        x._p = y

    def _right_rotate(self, y):
        x = y.left
        y._left = x.right
        if x.right != self.nil:
            x.right._p = y
        x._p = y.p
        if y.p == self.nil:
            self._root = x
        elif y == y.p.right:
            y.p._right = x
        else:
            y.p._left = x
        x._right = y
        y._p = x

    def check_invariants(self):
        def is_red_black_node(node):
            # check has _left and _right or neither
            if (node.left and not node.right) or (node.right and not node.left):
                return 0, False
            # check leaves are black
            if not node.left and not node.right and node.red:
                return 0, False
            # if node is red, check children are black
            if node.red and node.left and node.right:
                if node.left.red or node.right.red:
                    return 0, False
            # descend tree and check black counts are balanced
            if node.left and node.right:
                # check children's parents are correct
                if self.nil != node.left and node != node.left.p:
                    return 0, False
                if self.nil != node.right and node != node.right.p:
                    return 0, False
                # check children are ok
                left_counts, left_ok = is_red_black_node(node.left)
                if not left_ok:
                    return 0, False
                right_counts, right_ok = is_red_black_node(node.right)
                if not right_ok:
                    return 0, False
                # check children's counts are ok
                if left_counts != right_counts:
                    return 0, False
                return left_counts, True
            else:
                return 0, True
        num_black, is_ok = is_red_black_node(self.root)
        return is_ok and not self.root._red

19 Interval Scheduling

import collections

def schedule_weighted_intervals(I):
    # Use dynamic programming to schedule weighted intervals
    # sorting is O(n log n),
    # finding p[1..n] is O(n log n),
    # finding OPT[1..n] is O(n),
    # selecting is O(n)
    # whole operation is dominated by O(n log n)
    I.sort(lambda x, y: x.finish - y.finish)  # f_1 <= f_2 <= .. <= f_n
    p = compute_previous_intervals(I)
    # compute OPTs iteratively in O(n), here we use DP
    OPT = collections.defaultdict(int)
    OPT[-1] = 0
    OPT[0] = 0
    for j in xrange(1, len(I)):
        OPT[j] = max(I[j].weight + OPT[p[j]], OPT[j - 1])
    # given OPT and p, find actual solution intervals in O(n)
    O = []
    def compute_solution(j):
        if j >= 0:  # will halt on OPT[-1]
            if I[j].weight + OPT[p[j]] > OPT[j - 1]:
                O.append(I[j])
                compute_solution(p[j])
            else:
                compute_solution(j - 1)
    compute_solution(len(I) - 1)
    # resort, as our O is in reverse order (OPTIONAL)
    O.sort(lambda x, y: x.finish - y.finish)
    return O

Proof of optimality for weighted: to prove optimality, we just need to show that for all 1 <= i <= n, S_i contains the value of an optimal solution of the first i intervals. We do so using induction.
Proof. For i = 0, S_0 = 0 and it is optimal since no interval has been processed. Suppose the claim holds for S_j for all j < i, and consider S_i: either I_i was added to the solution or it wasn't.
def schedule_unweighted_intervals(I):
    # Use greedy algorithm to schedule unweighted intervals
    # sorting is O(n log n), selecting is O(n)
    # whole operation is dominated by O(n log n)
    I.sort(lambda x, y: x.finish - y.finish)  # f_1 <= f_2 <= .. <= f_n
    O = []
    finish = 0
    for i in I:
        if finish <= i.start:
            finish = i.finish
            O.append(i)
    return O

def compute_previous_intervals(I):
    # For every interval j, compute the rightmost mutually
    # compatible interval i, where i < j
    # I is a sorted list of Interval objects (sorted by finish time)
    # extract start and finish times
    start = [i.start for i in I]
    finish = [i.finish for i in I]
    p = []
    for j in xrange(len(I)):
        # rightmost interval with f_i <= s_j
        i = bisect.bisect_right(finish, start[j]) - 1
        p.append(i)
    return p

20 Insert into AVL Tree
Let z be the first unbalanced ancestor of the inserted node, y its child on the insertion path, and x its grandchild on that path.
1. Left-Left Case: x is the left child of y and y is the left child of z. Fix: right rotate z.
2. Left-Right Case: x is the right child of y and y is the left child of z. Fix: left rotate y, then right rotate z.
3. Right-Left Case: x is the left child of y and y is the right child of z. Fix: right rotate y, then left rotate z.
4. Right-Right Case: x is the right child of y and y is the right child of z. Fix: left rotate z.

21 Difference Between Prim's and Kruskal's Algorithm
Kruskal's builds a minimum spanning tree by adding one edge at a time. The next edge chosen is always the shortest (minimum-weight) edge that does NOT create a cycle. Prim's builds a minimum spanning tree by adding one vertex at a time. The next vertex to be added is always the one nearest to a vertex already in the tree.
In Prim's, you always keep a connected component, starting with a single vertex. You look at all edges from the current component to other vertices and find the smallest among them. You then add the neighbouring vertex to the component, increasing its size by 1. In N−1 steps, every vertex is merged into the current component if the graph is connected.
In Kruskal's, you do not keep one connected component but a forest. At each stage, you look at the globally smallest edge that does not create a cycle in the current forest. Such an edge necessarily merges two trees in the current forest into one. Since you start with N single-vertex trees, in N−1 steps they have all merged into one if the graph is connected.
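To make the comparison concrete, here is a compact sketch of Kruskal's algorithm; the (weight, u, v) edge format and the inline parent-array union-find are illustrative choices, not part of the notes above:

def kruskal_mst(n, edges):
    # edges: list of (weight, u, v) tuples, vertices numbered 0..n-1 (illustrative format)
    parent = list(range(n))

    def find(x):
        # follow parent pointers to the root, halving the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):        # consider edges globally, cheapest first
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge does not create a cycle
            parent[ru] = rv              # union: merge two trees of the forest
            mst.append((u, v, w))
    return mst

# Example: 4 vertices; the MST uses the edges of weight 1, 2 and 3 (total 6)
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 1, 3)]
assert sum(w for _, _, w in kruskal_mst(4, edges)) == 6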
22 Heap Sort

def heapsort( aList ):
    # convert aList to heap
    length = len( aList ) - 1
    leastParent = length // 2
    for i in range ( leastParent, -1, -1 ):
        moveDown( aList, i, length )
    # flatten heap into sorted array
    for i in range ( length, 0, -1 ):
        if aList[0] > aList[i]:
            swap( aList, 0, i )
            moveDown( aList, 0, i - 1 )

def moveDown( aList, first, last ):
    largest = 2 * first + 1
    while largest <= last:
        # right child exists and is larger than left child
        if ( largest < last ) and ( aList[largest] < aList[largest + 1] ):
            largest += 1
        # right child is larger than parent
        if aList[largest] > aList[first]:
            swap( aList, largest, first )
            # move down to largest child
            first = largest
            largest = 2 * first + 1
        else:
            return  # force exit

def swap( A, x, y ):
    tmp = A[x]
    A[x] = A[y]
    A[y] = tmp

23 Quick Sort
Randomized quicksort analysis: one can show that randomized quicksort has the desirable property that, for any input, it requires only O(n log n) expected time (averaged over all choices of pivots). There is also a combinatorial proof. To each execution of quicksort corresponds the following binary search tree (BST): the initial pivot is the root node; the pivot of the left half is the root of the left subtree, the pivot of the right half is the root of the right subtree, and so on. The number of comparisons in an execution of quicksort equals the number of comparisons made while constructing the BST by a sequence of insertions. So the average number of comparisons for randomized quicksort equals the average cost of constructing a BST when the inserted values (x_1, x_2, ..., x_n) form a random permutation. By linearity of expectation, the expected number of comparisons is Σ_i Σ_{j<i} Pr[c_{i,j}], where c_{i,j} is an indicator random variable for whether a comparison with x_j occurs while inserting x_i. Since the input is a random permutation, the probability that x_i is adjacent to x_j (among x_1, ..., x_j, x_i in sorted order, which is exactly when this comparison happens) is 2/(j + 1). Hence Σ_i Σ_{j<i} 2/(j + 1) = O(Σ_i log i) = O(n log n).

def quickSort(alist):
    quickSortHelper(alist, 0, len(alist) - 1)

def quickSortHelper(alist, first, last):
    if first < last:
        splitpoint = partition(alist, first, last)
        quickSortHelper(alist, first, splitpoint - 1)
        quickSortHelper(alist, splitpoint + 1, last)

def partition(alist, first, last):
    pivotvalue = alist[first]
    leftmark = first + 1
    rightmark = last
    done = False
    while not done:
        while leftmark <= rightmark and \
                alist[leftmark] <= pivotvalue:
            leftmark = leftmark + 1
        while alist[rightmark] >= pivotvalue and \
                rightmark >= leftmark:
            rightmark = rightmark - 1
        if rightmark < leftmark:
            done = True
        else:
            temp = alist[leftmark]
            alist[leftmark] = alist[rightmark]
            alist[rightmark] = temp
    temp = alist[first]
    alist[first] = alist[rightmark]
    alist[rightmark] = temp
    return rightmark
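Only the deterministic first-element-pivot version is listed above. A randomized variant, as assumed in the analysis, can be sketched by swapping a randomly chosen element into the pivot position and reusing the same partition(); the helper names here are illustrative:

import random

def randomizedQuickSort(alist):
    _randHelper(alist, 0, len(alist) - 1)

def _randHelper(alist, first, last):
    if first < last:
        # choose a random pivot, move it to the front, then reuse partition() above
        r = random.randint(first, last)
        alist[first], alist[r] = alist[r], alist[first]
        splitpoint = partition(alist, first, last)
        _randHelper(alist, first, splitpoint - 1)
        _randHelper(alist, splitpoint + 1, last)

# Example: expected O(n log n) comparisons regardless of input order
data = [9, 3, 7, 1, 8, 2, 5]
randomizedQuickSort(data)
assert data == [1, 2, 3, 5, 7, 8, 9]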
24 Quick Find and Union
Union-find cost model: when studying algorithms for union-find, we count the number of array accesses (the number of times an array entry is accessed, for read or write).
Definitions: the size of a tree is its number of nodes. The depth of a node in a tree is the number of links on the path from it to the root. The height of a tree is the maximum depth among its nodes.
Proposition: the quick-find algorithm uses one array access for each call to find() and between N+3 and 2N+1 array accesses for each call to union() that combines two components.
Proposition: the number of array accesses used by find() in quick-union is 1 plus twice the depth of the node corresponding to the given site. The number of array accesses used by union() and connected() is the cost of the two find() operations (plus 1 for union() if the given sites are in different trees).
Proposition: the depth of any node in a forest built by weighted quick-union for N sites is at most lg N.
Corollary: for weighted quick-union with N sites, the worst-case order of growth of the cost of find(), connected(), and union() is log N.

class UnionFind:
    # Union-find data structure.
    # Each UnionFind instance X maintains a family of disjoint sets of
    # hashable objects, supporting the following two methods:
    # - X[item] returns a name for the set containing the given item.
    #   Each set is named by an arbitrarily-chosen one of its members; as
    #   long as the set remains unchanged it will keep the same name. If
    #   the item is not yet part of a set in X, a new singleton set is created for it.
    # - X.union(item1, item2, ...) merges the sets containing each item
    #   into a single larger set. If any item is not yet part of a set
    #   in X, it is added to X as one of the members of the merged set.

    def __init__(self):
        # Create a new empty union-find structure.
        self.weights = {}
        self.parents = {}

    def __getitem__(self, object):
        # Find and return the name of the set containing the object.
        # check for previously unknown object
        if object not in self.parents:
            self.parents[object] = object
            self.weights[object] = 1
            return object
        # find path of objects leading to the root
        path = [object]
        root = self.parents[object]
        while root != path[-1]:
            path.append(root)
            root = self.parents[root]
        # compress the path and return
        for ancestor in path:
            self.parents[ancestor] = root
        return root

    def __iter__(self):
        # Iterate through all items ever found or unioned by this structure.
        return iter(self.parents)

    def union(self, *objects):
        # Find the sets containing the objects and merge them all.
        roots = [self[x] for x in objects]
        heaviest = max([(self.weights[r], r) for r in roots])[1]
        for r in roots:
            if r != heaviest:
                self.weights[heaviest] += self.weights[r]
                self.parents[r] = heaviest

25 Knapsack Problem
The knapsack problem or rucksack problem is a problem in combinatorial optimization: given a set of items, each with a mass and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items.
The decision problem form of the knapsack problem (can a value of at least V be achieved without exceeding the weight W?) is NP-complete, thus there is no algorithm that is both correct and fast (polynomial-time) on all cases, unless P = NP. While the decision problem is NP-complete, the optimization problem is NP-hard: its resolution is at least as difficult as the decision problem, and there is no known polynomial algorithm which can tell, given a solution, whether it is optimal (which would mean that there is no solution with a larger V, thus solving the NP-complete decision problem). There is a pseudo-polynomial time algorithm using dynamic programming (see the sketch below). There is a fully polynomial-time approximation scheme, which uses the pseudo-polynomial time algorithm as a subroutine. Many cases that arise in practice, and "random instances" from some distributions, can nonetheless be solved exactly.
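A minimal sketch of that pseudo-polynomial dynamic program for the 0-1 variant, assuming integer weights (function and variable names are illustrative):

def knapsack_01(items, W):
    # items: list of (weight, value) pairs with integer weights; W: capacity
    dp = [0] * (W + 1)                       # dp[w] = best value achievable with capacity w
    for weight, value in items:
        for w in range(W, weight - 1, -1):   # iterate downwards so each item is used at most once
            dp[w] = max(dp[w], dp[w - weight] + value)
    return dp[W]                             # O(nW) time, O(W) space

# Example: capacity 5, best choice is the items of weight 2 and 3, value 3 + 4 = 7
assert knapsack_01([(2, 3), (3, 4), (4, 5)], 5) == 7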
Another algorithm for 0-1 knapsack, sometimes called "meet-in-the-middle" due to parallels to a similarly named algorithm in cryptography, is exponential in the number of different items but may be preferable to the DP algorithm when W is large compared to n. In particular, if the w_i are nonnegative but not integers, we could still use the dynamic programming algorithm by scaling and rounding (i.e. using fixed-point arithmetic), but if the problem requires d fractional digits of precision to arrive at the correct answer, W will need to be scaled by 10^d, and the DP algorithm will require O(W * 10^d) space and O(nW * 10^d) time.

Meet-in-the-middle algorithm
input: a set of items with weights and values
output: the greatest combined value of a subset
partition the set {1, ..., n} into two sets A and B of approximately equal size
compute the weights and values of all subsets of each set
for each subset of A
    find the subset of B of greatest value such that the combined weight is less than W
keep track of the greatest combined value seen so far

26 Minimum cut
In graph theory, a minimum cut of a graph is a cut (a partition of the vertices of a graph into two disjoint subsets that are joined by at least one edge) whose cut set has the smallest number of edges (unweighted case) or smallest sum of weights possible. Several algorithms exist to find minimum cuts. For a graph G = (V, E), the problem can be reduced to 2|V| − 2 = O(|V|) maximum flow problems, equivalent to O(|V|) s-t cut problems by the max-flow min-cut theorem. Hao and Orlin [1] have shown an algorithm to compute these max-flow problems in time asymptotically equal to one max-flow computation, requiring O(|V||E| log(|V|^2 / |E|)) steps. Asymptotically faster algorithms exist for undirected graphs, though these do not necessarily extend to the directed case. A study by Chekuri et al. established experimental results with various algorithms.