COMP 482: Design and Analysis of Algorithms
Spring 2013
Lecture 4
Prof. Swarat Chaudhuri

Q3: Analyzing an algorithm

You have an array A with integer entries A[1], …, A[n].
Output a 2-D array B such that B[i,j] contains the sum A[i] + A[i+1] + … + A[j].

Here is an algorithm:

    For i = 1, 2, …, n {
        For j = i+1, i+2, …, n {
            Add up entries A[i] through A[j]
            Store result in B[i,j]
        }
    }

Obtain an upper bound and a lower bound for the running time of the algorithm.

The lower bound is the more interesting one

Consider the times during the execution of the algorithm when $i \le n/4$ and $j \ge 3n/4$. In these cases, $j - i + 1 \ge 3n/4 - n/4 + 1 > n/2$. Therefore, adding up the array entries A[i] through A[j] requires at least n/2 operations, since there are more than n/2 terms to add up.

How many times during the execution of the given algorithm do we encounter such cases? There are $(n/4)^2$ pairs (i, j) with $i \le n/4$ and $j \ge 3n/4$. The given algorithm enumerates all of them, and as shown above, it must perform at least n/2 operations for each such pair. Therefore, the algorithm must perform at least $\frac{n}{2} \cdot \left(\frac{n}{4}\right)^2 = \frac{n^3}{32}$ operations. This is $\Omega(n^3)$, as desired.

Graphs

3.1 Basic Definitions and Applications

Undirected Graphs

Undirected graph. G = (V, E)
V = nodes.
E = edges between pairs of nodes.
Captures pairwise relationship between objects.
Graph size parameters: n = |V|, m = |E|.

V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6, 7-8 }
n = 8
m = 11

Some Graph Applications

    Graph              Nodes                  Edges
    transportation     street intersections   highways
    communication      computers              fiber optic cables
    World Wide Web     web pages              hyperlinks
    social             people                 relationships
    food web           species                predator-prey
    software systems   functions              function calls
    scheduling         tasks                  precedence constraints
    circuits           gates                  wires

World Wide Web

Web graph.
Node: web page.
Edge: hyperlink from one page to another.
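
The definition G = (V, E) from the start of this section translates directly into data. The following is a minimal sketch, not part of the original slides: it writes down the 8-node example graph as an edge list, builds a neighbor table from it, and checks the size parameters n and m. The names `edges`, `build_adjacency`, and `adj` are illustrative assumptions. The edge list is the one consistent with m = 11 and with the adjacency matrix and adjacency list given later in the lecture, so it includes the edge 7-8.

```python
# Example graph from the "Undirected Graphs" slide, as an edge list.
# Assumption: edge (7, 8) is included, matching m = 11 and the lecture's
# adjacency matrix and adjacency list.
nodes = list(range(1, 9))
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
         (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]


def build_adjacency(nodes, edges):
    """Return a dict mapping each node to the sorted list of its neighbors."""
    adj = {u: [] for u in nodes}
    for u, v in edges:
        # An undirected edge is stored twice: once from each endpoint.
        adj[u].append(v)
        adj[v].append(u)
    return {u: sorted(nbrs) for u, nbrs in adj.items()}


adj = build_adjacency(nodes, edges)
n, m = len(nodes), len(edges)
degree_sum = sum(len(nbrs) for nbrs in adj.values())

print("n =", n, "m =", m)
print("sum of degrees =", degree_sum)
```

Because every edge is stored from both endpoints, the degrees add up to 2m, which is the fact the O(m + n) analysis of BFS relies on later.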
[Figure: web graph with nodes cnn.com, netscape.com, novell.com, cnnsi.com, timewarner.com, hbo.com, sorpranos.com]

9-11 Terrorist Network

Social network graph.
Node: people.
Edge: relationship between two people.
Reference: Valdis Krebs, http://www.firstmonday.org/issues/issue7_4/krebs

Ecological Food Web

Food web graph.
Node = species.
Edge = from prey to predator.
Reference: http://www.twingroves.district96.k12.il.us/Wetlands/Salamander/SalGraphics/salfoodweb.giff

Graph Representation: Adjacency Matrix

Adjacency matrix. n-by-n matrix with $A_{uv} = 1$ if (u, v) is an edge.
Two representations of each edge.
Space proportional to $n^2$.
Checking if (u, v) is an edge takes $\Theta(1)$ time.
Identifying all edges takes $\Theta(n^2)$ time.

      1 2 3 4 5 6 7 8
    1 0 1 1 0 0 0 0 0
    2 1 0 1 1 1 0 0 0
    3 1 1 0 0 1 0 1 1
    4 0 1 0 0 1 0 0 0
    5 0 1 1 1 0 1 0 0
    6 0 0 0 0 1 0 0 0
    7 0 0 1 0 0 0 0 1
    8 0 0 1 0 0 0 1 0

Graph Representation: Adjacency List

Adjacency list. Node indexed array of lists.
Two representations of each edge.
deg(u) = number of neighbors of u.
Space proportional to m + n.
Checking if (u, v) is an edge takes O(deg(u)) time.
Identifying all edges takes $\Theta(m + n)$ time.

    1: 2, 3
    2: 1, 3, 4, 5
    3: 1, 2, 5, 7, 8
    4: 2, 5
    5: 2, 3, 4, 6
    6: 5
    7: 3, 8
    8: 3, 7

Paths and Connectivity

Def. A path in an undirected graph G = (V, E) is a sequence P of nodes $v_1, v_2, \ldots, v_{k-1}, v_k$ with the property that each consecutive pair $v_i, v_{i+1}$ is joined by an edge in E.

Def. A path is simple if all nodes are distinct.

Def. An undirected graph is connected if for every pair of nodes u and v, there is a path between u and v.

Cycles

Def. A cycle is a path $v_1, v_2, \ldots, v_{k-1}, v_k$ in which $v_1 = v_k$, $k > 2$, and the first k-1 nodes are all distinct.

Example: cycle C = 1-2-4-5-3-1

Trees

Def. An undirected graph is a tree if it is connected and does not contain a cycle.

Theorem. Let G be an undirected graph on n nodes. Any two of the following statements imply the third.
– G is connected.
– G does not contain a cycle.
– G has n-1 edges.

Rooted Trees

Rooted tree.
Given a tree T, choose a root node r and orient each edge away from r.

Importance. Models hierarchical structure.

[Figure: a tree, and the same tree rooted at 1; labels: root r, parent of v, v, child of v]

Phylogeny Trees

Phylogeny trees. Describe evolutionary history of species.

GUI Containment Hierarchy

GUI containment hierarchy. Describe organization of GUI widgets.
Reference: http://java.sun.com/docs/books/tutorial/uiswing/overview/anatomy.html

3.2 Graph Traversal

Connectivity

s-t connectivity problem. Given two nodes s and t, is there a path between s and t?

s-t shortest path problem. Given two nodes s and t, what is the length of the shortest path between s and t?

Applications.
– Facebook.
– Maze traversal.
– Erdős number.
– Kevin Bacon number.
– Fewest number of hops in a communication network.

Breadth First Search

BFS intuition. Explore outward from s in all possible directions, adding nodes one "layer" at a time. Effect: find "shallow" paths to nodes.

[Figure: BFS layers s, $L_1$, $L_2$, …, $L_{n-1}$]

BFS algorithm.
– $L_0$ = { s }.
– $L_1$ = all neighbors of $L_0$.
– $L_2$ = all nodes that do not belong to $L_0$ or $L_1$, and that have an edge to a node in $L_1$.
– $L_{i+1}$ = all nodes that do not belong to an earlier layer, and that have an edge to a node in $L_i$.

Theorem. For each i, $L_i$ consists of all nodes at distance exactly i from s. There is a path from s to t iff t appears in some layer.

Implementing BFS

Q: What's a good way to implement the above algorithm?
A: Use a queue for the "frontier".

Breadth First Search

Property. Let T be a BFS tree of G = (V, E), and let (x, y) be an edge of G. Then the levels of x and y differ by at most 1.

[Figure: BFS tree with layers $L_0$, $L_1$, $L_2$, $L_3$]

Breadth First Search: Analysis

Theorem. The above implementation of BFS runs in O(m + n) time if the graph is given by its adjacency list representation.

Pf.
Easy to prove $O(n^2)$ running time:
– at most n lists $L_i$
– each node occurs on at most one list; for loop runs at most n times
– when we consider node u, there are at most n incident edges (u, v), and we spend O(1) processing each edge

Actually runs in O(m + n) time:
– when we consider node u, there are deg(u) incident edges (u, v)
– total time processing edges is $\sum_{u \in V} \deg(u) = 2m$, since each edge (u, v) is counted exactly twice in the sum: once in deg(u) and once in deg(v) ▪

Connected Component

Connected component. Find all nodes reachable from s.

Connected component containing node 1 = { 1, 2, 3, 4, 5, 6, 7, 8 }.

Q1: Finding connected components

Give an algorithm to find the set of all connected components of an undirected graph.

Connected Component

[Figure: set R containing s; an edge (u, v) with u inside R and v outside R; it's safe to add v]

Theorem. Upon termination, R is the connected component containing s.

BFS = explore in order of distance from s.

Q2: Flood Fill

Flood fill. Given lime green pixel in an image, change color of entire blob of neighboring lime pixels to blue.
Node: pixel.
Edge: two neighboring lime pixels.
Blob: connected component of lime pixels.

[Figure: recolor lime green blob to blue]

Depth-first search

DFS intuition. Explore outward from s along one path as far as possible, and backtrack when you cannot progress. Effect: find faraway nodes.

Use recursion:

    DFS(u):
        Mark u as "Explored" and add u to R
        For each edge (u, v) incident to u
            If v is not marked "Explored" then
                Recursively call DFS(v)

Depth-first search

Property. For a given recursive call DFS(u), all nodes marked "Explored" between the beginning and end of this recursive call are descendants of u in T.

Theorem.
Let T be a depth-first search tree, let x and y be nodes in T, and let (x, y) be an edge of G that is not an edge of T. Then one of x or y is an ancestor of the other.

Q3: BFS and DFS trees

We have a connected graph G = (V, E) and a specific vertex u. Suppose we compute a DFS tree rooted at u, and obtain a tree T that includes all nodes of G. Suppose we then compute a BFS tree rooted at u, and obtain the same tree T. Prove that G = T.

Answer

Suppose G has an edge e = {a, b} that does not belong to T.

As T is a DFS tree, one of the two ends must be an ancestor of the other; say a is an ancestor of b.

(*) Since T is a BFS tree, the distances of the two nodes from u in T can differ by at most one.

But if a is an ancestor of b, and (*) holds, then a must be the direct parent of b. This means that {a, b} is an edge in T. Contradiction.

Q4: Finding a cycle

Given a graph G, determine if it has a cycle. If so, the algorithm should output this cycle.

Answer:
Assume that G is connected; otherwise work on the connected components separately.
Run BFS from an arbitrary node s, and obtain a BFS tree T. If every edge of G appears in the tree, then G = T and there is no cycle.
Otherwise, there is an edge e = (v, w) that is in G but not in T. Consider the least common ancestor u of v and w in T. We get a cycle from edge e and the paths u-v and u-w in T.
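
The Q4 answer can be sketched in code. What follows is one possible reading of it, not the lecture's own implementation: BFS with a queue records each node's parent and depth in the BFS tree, the first non-tree edge (v, w) triggers a walk up from both endpoints to their least common ancestor, and the two tree paths plus the edge itself form the cycle. The function names `find_cycle` and `tree_cycle` are illustrative assumptions, and the graph is assumed to be simple (no self-loops or parallel edges) and given as a dict of neighbor lists.

```python
from collections import deque


def tree_cycle(v, w, parent, depth):
    """Cycle formed by non-tree edge (v, w) and the tree paths to their LCA."""
    path_v, path_w = [v], [w]
    a, b = v, w
    while a != b:
        # Move the deeper endpoint (or a, on ties) one step toward the root.
        if depth[a] >= depth[b]:
            a = parent[a]
            path_v.append(a)
        else:
            b = parent[b]
            path_w.append(b)
    # path_v ends at the LCA; walk back down path_w, then close with edge (w, v).
    return path_v + path_w[-2::-1] + [v]


def find_cycle(adj, s):
    """Return a cycle in the component of s as a node list whose first and
    last entries coincide, or None if that component is a tree.
    Assumes adj describes a simple undirected graph."""
    parent = {s: None}
    depth = {s: 0}
    queue = deque([s])  # the BFS "frontier"
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                # First time v is seen: (u, v) becomes a tree edge.
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
            elif parent[u] != v:
                # v was already discovered and is not u's parent: non-tree edge.
                return tree_cycle(u, v, parent, depth)
    return None


# The lecture's 8-node example graph as an adjacency list.
example = {1: [2, 3], 2: [1, 3, 4, 5], 3: [1, 2, 5, 7, 8], 4: [2, 5],
           5: [2, 3, 4, 6], 6: [5], 7: [3, 8], 8: [3, 7]}
print(find_cycle(example, 1))  # [2, 1, 3, 2]
```

Each node enters the queue once and each adjacency list is scanned once, so the search takes O(m + n) time, and the walk to the least common ancestor adds at most O(n). For a disconnected graph, the same function could be called once per connected component.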