1. Give and verify a linear time algorithm that takes two sequences of events (say encoded as lists of binary integers), and determines whether the first sequence of events is a subsequence of the second.

Problem formulation: Given two sequences $A$ and $B$ of lengths $m$ and $n$, respectively, test whether $A$ is a subsequence of $B$.

Idea: Use two pointers, $p_1$ for sequence $A$ and $p_2$ for sequence $B$. Compare $A[p_1]$ with $B[p_2]$:
If $A[p_1] = B[p_2]$, move both pointers forward: $p_1 = p_1 + 1$, $p_2 = p_2 + 1$.
If $A[p_1] \neq B[p_2]$, move only $p_2$ forward: $p_2 = p_2 + 1$.
In this way, $p_1$ traverses $A$ at most once and $p_2$ traverses $B$ at most once, so the time complexity is $O(m + n)$.

Algorithm: Start with $p_1 = p_2 = 1$ and repeat the comparison above while $p_1 \le m$ and $p_2 \le n$. Report that $A$ is a subsequence of $B$ if the loop ends with $p_1 > m$ (every event of $A$ was matched); otherwise report that it is not.

Verification: This is a greedy algorithm: each event of $A$ is matched to its earliest possible occurrence in $B$. If the algorithm answers yes, the matched positions are themselves a valid embedding of $A$ in $B$, so it remains to show that the greedy algorithm finds a feasible solution whenever one exists. We prove this by contradiction. Let $(g_1, g_2, \dots, g_r, g_{r+1}, \dots)$ be the positions in $B$ that the greedy algorithm matches to the events of $A$, and let $(s_1, s_2, \dots, s_r, s_{r+1}, \dots, s_m)$ be the positions used by a feasible solution, chosen so that $g_1 = s_1, g_2 = s_2, \dots, g_r = s_r$ for the largest possible value of $r$. Suppose $r < m$. Both $g_{r+1}$ and $s_{r+1}$ match the same event $A[r+1]$, and because the greedy algorithm takes the first occurrence after position $g_r = s_r$, it does find a match, with $g_{r+1} \le s_{r+1}$. Why not replace $s_{r+1}$ with $g_{r+1}$? After the replacement the solution is still feasible, since $s_r < g_{r+1} \le s_{r+1} < s_{r+2}$, and it now agrees with the greedy choices in $r + 1$ positions, which contradicts the maximality of $r$. Hence $r = m$: inductively, every choice of the solution can be replaced by the choice of the greedy algorithm without jeopardizing feasibility. In other words, the greedy algorithm always returns a feasible solution if there is one.

2. In lecture 3 we discussed the greedy cashier’s algorithm for making change of x units using the smallest number of coins. The cashier’s algorithm gives the customer one unit of the highest denomination coin of at most x units, say d units. Now repeat to make change of the remaining x - d units.
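As a hedged illustration of the cashier's rule just described, the Python sketch below hands out the largest coin that fits until nothing remains. The function names `cashier_change` and `min_coins` are assumptions made for this sketch, not part of the lecture; `min_coins` is a plain dynamic-programming baseline included only so that the greedy coin count can be compared against the true minimum.

```python
from typing import List


def cashier_change(x: int, denominations: List[int]) -> List[int]:
    """Greedy change-making: repeatedly give the largest coin of value at most x."""
    coins = []
    for d in sorted(denominations, reverse=True):
        # Hand out coin d for as long as it still fits in the remaining amount.
        while x >= d:
            coins.append(d)
            x -= d
    if x != 0:
        raise ValueError("exact change is not possible with these denominations")
    return coins


def min_coins(x: int, denominations: List[int]) -> float:
    """Fewest coins needed for amount x, by dynamic programming (baseline, not greedy)."""
    best = [0] + [float("inf")] * x
    for amount in range(1, x + 1):
        for d in denominations:
            if d <= amount and best[amount - d] + 1 < best[amount]:
                best[amount] = best[amount - d] + 1
    return best[x]
```

Comparing `len(cashier_change(x, coins))` with `min_coins(x, coins)` over a range of `x` is one way to search a coinage for counterexamples.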
For each of the following nations' coinage, establish whether or not this greedy algorithm always minimizes the number of coins returned in change. If so, prove it; if not, give a counterexample.

(a) MiddleEarth coinage, which includes coins for 1, 4, 5, 10, and 20.

No. Take $x = 8$.
Optimal solution: $8 = 4 + 4$, two coins.
Greedy algorithm: $8 = 5 + 1 + 1 + 1$, four coins.
So the greedy algorithm does not always minimize the number of coins.

(b) English coinage before decimalization, which consisted of half-crowns (30 pence), florins (24 pence), shillings (12 pence), sixpence (6 pence), threepence (3 pence), and pennies (1 penny).

No. Take $x = 48$.
Optimal solution: $48 = 24 + 24$, two coins.
Greedy algorithm: $48 = 30 + 12 + 6$, three coins.
So the greedy algorithm does not always minimize the number of coins returned.

(c) Martian coinage, where the available denominations are powers of some integer p > 1, i.e., 1, p, p^2, p^3, …, p^k.

Yes, the greedy algorithm always minimizes the number of coins returned. Call the coin of value $p^i$ coin $i$, and let $c_i$ be the number of copies of coin $i$ in a solution. Let arbitrary $x$ and $k$ satisfy
$p^k \le x < p^{k+1}$.
The greedy solution contains coin $k$. Suppose the greedy solution is not optimal; in other words, some optimal solution does not contain coin $k$. Then there exists a set of coefficients $(c_0, c_1, c_2, \dots, c_{k-1})$ such that
$x = c_0 \times p^0 + c_1 \times p^1 + c_2 \times p^2 + \dots + c_{k-1} \times p^{k-1}$.
This is impossible for any optimal solution, as the table below shows. An optimal solution uses at most $p - 1$ copies of each coin, because $p$ copies of coin $i$ could be replaced by a single copy of coin $i + 1$; the coins $0, 1, \dots, k-1$ are therefore worth at most $p^k - 1 < x$ in total.

| $k$ | Coin value | All optimal solutions must satisfy | Max value of coins $0, 1, \dots, k-1$ in any OPT |
|---|---|---|---|
| 0 | $p^0$ | $c_0 \le p - 1$ | - |
| 1 | $p^1$ | $c_1 \le p - 1$ | $(p - 1) \times p^0 = p - 1$ |
| 2 | $p^2$ | $c_2 \le p - 1$ | $(p - 1) \times p^1 + (p - 1) = (p - 1)(p^0 + p^1)$ |
| … | … | … | … |
| $k$ | $p^k$ | $c_k \le p - 1$ | $\sum_{i=0}^{k-1} (p - 1) \times p^i = p^k - 1$ |

Therefore we reject the hypothesis that the optimal solution does not contain coin $k$. The problem then reduces to making change for $x - p^k$, which, by induction, is also solved optimally by the greedy algorithm.

3.
The single-destination shortest path problem for a weighted directed graph is to find the shortest path from every vertex to a specified vertex v. Give and verify an efficient algorithm to solve the single-destination shortest paths problem.

Idea: First, reverse all edges; second, use Dijkstra’s algorithm to find the shortest paths.

Algorithm:
Reverse the direction of all edges, so that the destination vertex $v$ becomes the source. Maintain a set $S$ of explored nodes, for which we have determined the shortest-path distance $d(u)$ from $v$ to vertex $u$, and a set $P$ of unexplored nodes.
Initialize $S = \emptyset$, $P = V$, $d(v) = 0$, and $d(u) = \infty$ for every other vertex $u$.
while ($P \neq \emptyset$) do {
    select the node $x \in P$ that has the smallest $d(x)$ among all nodes in $P$
    add $x$ to $S$
    delete $x$ from $P$
    for each node $y \in P$ that is a neighbor of $x$ do {
        $d'(y) = d(x) + w(x, y)$, where $w(x, y)$ is the weight of the edge from $x$ to $y$
        if $d'(y) < d(y)$ then $d(y) = d'(y)$
    }
}

Verification:
Claim: when a node $u$ is added to $S$, $d(u)$ is the weight of a shortest path from $v$ to $u$ in the reversed graph. We prove this by induction on $|S|$.

$|S| = 1$: the only node in $S$ is $v$, and $d(v) = 0$ is clearly the shortest-path distance from $v$ to itself.

$|S| > 1$: let $u$ be the next node to be added to $S$. We need to prove that $d(u)$ is still the weight of a shortest path from $v$ to $u$. There are two possibilities for a path from $v$ to $u$:
1. The path contains only nodes in $S$ before it reaches $u$. In this situation the path weighs at least $d(u)$, since we update $d(u)$ every time we add a new node to $S$.
2. The path $p$ contains nodes in both $S$ and $P$ before it reaches $u$, as shown in the figure below. Let $y$ be the first node of $P$ on path $p$, let $(x, y)$ be the first edge in $p$ that leaves $S$, and let $p'$ be the subpath of $p$ from $v$ to $x$.
Because the weights are nonnegative, we can ignore the weight of the part of $p$ from $y$ to $u$, thus
$w(p) \ge w(p') + w(x, y)$.
By the induction hypothesis, $d(x)$ is the shortest-path distance from $v$ to $x$ for all $x$ in $S$, so
$w(p) \ge w(p') + w(x, y) \ge d(x) + w(x, y)$.
From the definition of $d(y)$, we have
$w(p) \ge w(p') + w(x, y) \ge d(x) + w(x, y) \ge d(y)$.
$u$ is the next node to be added to $S$, so $d(y) \ge d(u)$ and
$w(p) \ge w(p') + w(x, y) \ge d(x) + w(x, y) \ge d(y) \ge d(u)$.
To sum up, any path from $v$ to $u$ has weight at least $d(u)$. Because reversing a path of the reversed graph gives a path of the same weight in the original graph, we conclude that $d(u)$ is the shortest-path distance from $u$ to $v$ in the original graph.

The time complexity of this algorithm is $O(\Delta n \log n)$ when it is implemented with a binary heap, where $\Delta$ is the maximum out-degree among all vertices in the reversed graph and $n$ is the number of vertices.

4. Let G = (V,E) be an undirected weighted graph, and let T be the shortest-path spanning tree rooted at a vertex v.

(a) Consider the graph G* obtained by modifying all the edge weights in G by multiplying the weight by a constant factor c > 0. Is T still the shortest-path spanning tree in G* from v? Justify your answer.

Yes, $T$ is still the shortest-path spanning tree in $G^*$ from $v$. First, recall that a shortest-path spanning tree is a spanning tree in which the path from the root $v$ to every other node has minimum weight. In $G$, for an arbitrary vertex $u$, denote by $p$ the path from $v$ to $u$ in $T$ and by $p'$ any other path from $v$ to $u$. We have $w(p') \ge w(p)$ for every $p'$. In $G^*$, every edge weight has been multiplied by the constant factor $c$, so the weight of any path is $c$ times its weight in the original graph $G$:
$w^*(p') = c \times w(p')$, $w^*(p) = c \times w(p)$.
So $w^*(p') \ge w^*(p)$ as $c > 0$, which means that path $p$ still has the lowest weight in $G^*$. Since the vertex $u$ and the path $p'$ were chosen arbitrarily, we can conclude that $T$ is still the shortest-path spanning tree in $G^*$ from $v$.
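The scaling argument in (a) can be checked on a small example. The sketch below is an illustration under stated assumptions: the helper names `shortest_path_tree`, `undirected`, and `scale`, the five-edge graph, and the factor 2.5 are all made up for this example. It runs Dijkstra's algorithm with a binary heap (Python's `heapq`) on the graph and on its scaled copy, then compares the parent pointers of the two trees.

```python
import heapq
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]
Graph = Dict[str, List[Tuple[str, float]]]


def undirected(edges: List[Edge]) -> Graph:
    """Build an adjacency list in which every edge can be used in both directions."""
    graph: Graph = {}
    for a, b, w in edges:
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


def scale(edges: List[Edge], c: float) -> List[Edge]:
    """Multiply every edge weight by the constant factor c."""
    return [(a, b, c * w) for a, b, w in edges]


def shortest_path_tree(graph: Graph, source: str) -> Dict[str, str]:
    """Dijkstra with a binary heap; returns the tree parent of each reached vertex."""
    dist = {source: 0.0}
    parent: Dict[str, str] = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue  # stale heap entry left over from an earlier, longer estimate
        done.add(x)
        for y, w in graph[x]:
            candidate = d + w
            if y not in dist or candidate < dist[y]:
                dist[y] = candidate
                parent[y] = x
                heapq.heappush(heap, (candidate, y))
    return parent


# Hypothetical example graph, chosen so that all shortest paths are unique.
edges = [("v", "a", 1), ("a", "b", 1), ("v", "b", 3), ("b", "u", 2), ("a", "u", 4)]
tree = shortest_path_tree(undirected(edges), "v")
scaled_tree = shortest_path_tree(undirected(scale(edges, 2.5)), "v")
print(tree == scaled_tree)
```

The comparison is only meaningful when shortest paths are unique, as they are in this graph; with ties, two runs may return different but equally valid trees.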
(b) Consider the graph G+ obtained by modifying all the edge weights in G by adding to the weight a constant d > 0. Is T still the shortest-path spanning tree in G+ from v? Justify your answer.

No, $T$ is not necessarily the shortest-path spanning tree in $G^+$ from $v$. As in question (a), for an arbitrary vertex $u$, denote by $p$ the path from $v$ to $u$ in $T$ and by $p'$ another arbitrary path from $v$ to $u$. We have $w(p') \ge w(p)$. In $G^+$,
$w^+(p') = w(p') + m \times d$, $w^+(p) = w(p) + n \times d$,
where $m$ and $n$ are the numbers of edges on paths $p'$ and $p$, respectively. Since the relationship between $m$ and $n$ is not determined ($n$ could be much larger than $m$), the relationship between $w^+(p')$ and $w^+(p)$ cannot be determined. It could be that $w^+(p') < w^+(p)$.

Following is a counterexample. The shortest-path spanning tree is shown in red.
[Figure: counterexample graphs $G$ and $G^+$]
We can see that the shortest-path spanning trees in the two graphs are not the same. One concrete instance of this behavior: take vertices $v$, $a$, $b$ with edges $(v, a)$ and $(a, b)$ of weight 1 and edge $(v, b)$ of weight 3, and let $d = 2$. In $G$ the shortest path from $v$ to $b$ is $v, a, b$ with weight $2 < 3$; in $G^+$ that path has weight 6, while the direct edge $(v, b)$ has weight 5.

5. Suppose that you run both depth-first search and breadth-first search on a connected graph G, and they both return the same tree T. Prove that G=T, i.e., there are no additional edges in the graph.

Suppose $G \neq T$. First, $T$ has no edges that are not in $G$, since both searches build $T$ only from edges of $G$. Second, assume there is an edge $e$ in $G$ that is not in $T$. Then there is a cycle in $G$, formed by $e$ and the tree path between its endpoints. Denote the nodes of this cycle by $(c_1, c_2, \dots, c_k)$ with $k \ge 3$, where $c_1$ is connected to $c_2$ and $c_k$, consecutive nodes are joined by tree edges, and $e$ joins $c_k$ to $c_1$. Because $T$ is a depth-first search tree of an undirected graph, every non-tree edge joins a node to one of its ancestors, so we may take $c_1$ to be the ancestor: $c_1$ is the first accessed node of the cycle, and $c_2, \dots, c_k$ lie below it on a single branch of $T$.
Breadth-first search: $c_1$ is adjacent to both $c_2$ and $c_k$, so both are at most one level deeper than $c_1$ in $T$.
Depth-first search: only one of the two neighboring nodes, $c_2$, is explored at the next level; $c_k$ lies $k - 1 \ge 2$ levels deeper than $c_1$ in $T$.
This contradicts the assumption that both depth-first search and breadth-first search return the same tree. Therefore there are no additional edges in $G$, and $G = T$.
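The claim in problem 5 can be illustrated on two tiny graphs. The sketch below is only an illustration under assumptions: the function names `bfs_tree`, `dfs_tree`, and `tree_edges` and the two example graphs are hypothetical, and both searches visit neighbors in adjacency-list order. On a path (which is already a tree) the two search trees coincide; on a triangle (which has one extra edge) they differ.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set

Adjacency = Dict[int, List[int]]


def tree_edges(parent: Dict[int, Optional[int]]) -> Set[FrozenSet[int]]:
    """Turn parent pointers into a set of undirected tree edges."""
    return {frozenset((child, p)) for child, p in parent.items() if p is not None}


def bfs_tree(adj: Adjacency, root: int) -> Set[FrozenSet[int]]:
    """Edges of the breadth-first search tree rooted at root."""
    parent: Dict[int, Optional[int]] = {root: None}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    return tree_edges(parent)


def dfs_tree(adj: Adjacency, root: int) -> Set[FrozenSet[int]]:
    """Edges of the depth-first search tree rooted at root."""
    parent: Dict[int, Optional[int]] = {root: None}

    def visit(x: int) -> None:
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                visit(y)

    visit(root)
    return tree_edges(parent)


path = {1: [2], 2: [1, 3], 3: [2]}            # a tree: no extra edges
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}  # a cycle: one extra edge

print(bfs_tree(path, 1) == dfs_tree(path, 1))          # same tree
print(bfs_tree(triangle, 1) == dfs_tree(triangle, 1))  # different trees
```

In the triangle, breadth-first search attaches both 2 and 3 directly to the root, while depth-first search reaches 3 through 2, which is the depth mismatch used in the proof.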