Uploaded by Jing Chen

Stevens_CS 600 Advanced Algorithm Design and Implementation Final Exam_Answer

advertisement
Final Exam
CS 600
Instruction: Answer the following questions in this document or another document and
submit it in Canvas according to the Final Exam Procedure.
1.
(12 Points) Consider a connected communication network of routers that
form a free tree T. Assume the time-delay of a packet transfer from one
router to another is determined by multiplying a small, fixed constant by the
number of communication links between the two routers. Develop an
efficient algorithm, better than O(n3), that computes the maximum possible
time delay in the network T.
Answer:
Find the largest distance between source point S and all other vertices in a
weighted directed acyclic network For directed acyclic graphs, the longest path
problem has a linear time solution.
Find the topological order by initializing the distance to all vertices to negative
infinity and the distance to the source point to 0. The topological ordering of the
graphs represents the linear order of a graph.
When the topological order is found, all the vertices in IT are processed one by
one. Update the distance of each processed vertex’s neighbor using the current
vertex.
Algorithm: FindLongestPath()
Input: Tree roots
Output: maximum possible time delay
1. Initialize dist[] = {NINF, NINF,....} , dist[s] = 0 .
## s: source point
## NINF: negative infinity.
## Dist: longest distance from the source point to the other point.
2. Create a topological sequence contain all vertices.
3. Repeat the following process for each vertex u in the topology sequence
For each adjacent point v of u:
If (dist[v] < dist[u] + weight(u, v))
dist[v] = dist[u] + weight(u, v)
Time complexity: V and E are the symbol to represent the running time, like n.
1. topological sorting is O (V+E).
2. After finding the topological order, for each vertex, all adjacent points are
found by loop , and the time complexity is O (E).
3. Internal loop runs O (V+E) times.
4. Total time complexity of the algorithm is O (V+E).
2.
(12 Points) Suppose you are told that you have a goat and a wolf that need to
go from a node s to a node t, in a directed acyclic graph G. To avoid the wolf
eating the goat, their paths must never share an edge. Design an efficient
algorithm for finding two edge-disjoint paths in G, if such path exists, to
provide a way for the goat and the wolf to go from s to t without risk to the
goat.
Answer:
Consider the directed graph 'G with two vertices: source s and destination t. We
want to find a route that avoid them sharing any edges. In that case, we can
formula it as a maximum flow problem using the Ford-Fulkerson technique.
First, we consider the supplied source and destination as sources and sinks in a
flow network. Assign unit capacity to each edge. Second, we calculate the
maximum flow from source to sink using the Ford-Fulkerson procedure. Then the
maximum flow equals the number of edge-disjoint pathways.
Ford-Fulkerson Algorithm:
1. Initialize the flow total = 0
2. Repeat the following process until no path exists between s and t.
a. Run depth first search from the source vertex to find a flow path to the end
vertex.
b. Let f be the minimum capacity value on the path
c. Add f to flow_total
3. For each edge on the path
a. Decrease capacity from one side to another
b. Increase capacity of the edge from other side to back
3.
(12 Points) Consider a graph G and two distinct vertices, v and w in G.
Define HAMILTONIAN-PATH to be the problem of determining whether
there is a path that starts at v and ends at w and visits all the vertices of G
exactly once. Show that the HAMILTONIAN-PATH problem is NP-complete.
Answer:
We first demonstrate that this belongs to the NP class, and then identify aknown
NP-complete issue that can be reduced to a Hamiltonian route.
Suppose a graph G, and the Hamiltonian path is solved by non-deterministically
selecting the edges from G that should be included in the path. Next, we traverse
the path and ensure that each vertex is visited exactly once, which can be done in
polynomial time. Therefore, it belongs to the NP class.
For identifying an NP-complete issue that can be reduced to a Hamiltonian route.
We look for a Hamiltonian cycle in the graph, which is a route that start and finish
at the same vertex. We can reduce this to the Hamiltonian route, since the
Hamiltonian cycle is NP-complete.
We create a graph G0 from the graph G = (V,E) in which G has a Hamiltonian
cycle and G0 contains a Hamiltonian route. This is accomplished by selecting an
arbitrary vertex u in G and copying it along with all of its edges, u0. Then, in the
graph, add vertices v and v0 and connect v with u and v0 with u0. Thus, if we
start in v, follow the Hamiltonian cycle of G back to u0 instead of u, and
eventually end in v0, we get a Hamiltonian route in G0. The route of G0 have
both v and v0 endpoints, and it can be converted to a G cycle.
The path must have endpoints in u and u0 if we ignore v and v0. And removing
u0 causes a G cycle if we close the path back to u instead of u0. The construction
will fail if G is a single edge, so it must be handled separately. Therefore, we
demonstrated that G contains a Hamiltonian cycle only if and only if G0 contains
a Hamiltonian path, completing the proof that Hamiltonian Path is NP-complete.
4.
(12 Points) Suppose we are given an undirected graph G with positive
weights on its edges and asked to find a tour that visits the vertices of G
exactly once and returns to the start to minimize the cost of maximum-weight
edge in the tour. Assuming that the weights in G satisfy the triangle
inequality, design a polynomial-time 3-approximation algorithm for this
version of traveling salesperson problem.
Note that this version of TSP is different than the 2-approximation for
METRIC-TSP in Section 18.1, where G is assumed to be a complete graph.
Hint: Show that it is possible to turn an Euler-tour traversal, E, of an MST
for G into a tour visiting each vertex exactly once such that each edge of the
tour skips at most two vertices of E.
Answer:
Let S be the sequence of vertices v1 to vn.
Check the following requirements in this traveling salesperson problem:
1. Every edge traversed by adjacent vertices is an edge in G
2. every vertex in G is in V(vertices), and every node has been traversed.
To demonstrate that the TSP is NP-hard, we have to show that NP reduces TSP in
the polynomial time. We can prove it using the Hamiltonian Cycle. HC is NPhard and every problem in NP reduces to HC in polynomial time because every
HC is NP complete. Since the sum of two polynomials is still a polynomial, so if
we reduce HC to TSP in polynomial time, then we will show every vertex in NP
reduces to TSP in polynomial time.
We already demonstrated that given a graph G=(V,E), a simple cycle in G
existsthat travels each vertex precisely once in answer 3.
For a simple cycle with n vertices and n edges. We use the following algorithm to
decrease HC to TSP:
1. For G=(V,E), set all edge weights to 1 and let k = |V|=n
## n is the number of nodes in G.
2. The weight of edges that are outside the G at begining is 2
3. test TSP using the changed graph, if exists a Tour on G with a cost of at least k
4. If the answer of TSP is yes, the answer of HC is yes. If TSP is NO, HC is NO
As we mentioned before, reduction takes polynomial time, we will demonstrate
that HC solutions are in 1-1 correspondence with TSP solutions using the
reduction.
There are two possible answers to this question:
1. HC has one => TSP has a YES. Thus the simple cycle C exists that visits each
node precisely once, creating n edges to C. For a tour of weight n, each node
weighs 1 and k = n. So TSP has a YES answer,
2. HC has a NO = > TSP has a NO. Then there is no simple cycle C that visits
every node precisely once, creating n edges to C. Assume TSP's response is yes.
Then there's a tour that goes around every vertex once with a maximum weight of
k. Suppose there is a loop with a weight of n, then the loop needs to traverse each
node whose weight is 1 and k=n. This proves that edges exist in the HC graph.
This is where HC comes into play.
Thus, TSP is both NP and NP-Hard, and it is NP-Complete, as requirement.
5.
(12 Points) Suppose we have a Monte Carlo algorithm, A, and a deterministic
algorithm, B, for testing if the output of A is correct. How can we use A and
B to construct a Las Vegas algorithm? Also, if A succeeds with probability ½
and both A and B run O(n) time, what is the expected running time of the
Las Vegas algorithm that is produced?
Answer:
Las Vegas Algorithm: a randomized algorithm that always produces the correct
result but whose running time is determined by random events (probabilistic).
Monte Carlo Algorithm: a randomized algorithm that always has a deterministic
running time but whose output may be incorrect with some probability.
We will demonstrate that the estimated run-time is <=O(n log n):
First, suppose we have a (Monte Carlo) randomized algorithm called WeirdSort
that accepts an unsorted array I of elements as input and returns another array O.
The run time of algorithm is T(n), and the output O is always a permuted version
of the input I. In this case, we know that Pr[sorted output O] >= p.
We can summarize that the algorithm runs correct with probability at least p. The
most basic concept works: run the algorithm once, see if the result is sorted, and if
not, repeat (with fresh randomness, so that the outcome is independent of the
previous rounds). To execute the algorithm once will take T(n) time. To check if
the output is sorted in one "round" takes O(n) time.
Thus, each run of the algorithm succeeds is independent with a probability of at
least p. And the expected number of it before we succeeds is 1/p.
It takes O(n) time to run the algorithm once, and O(n) time to check if the output
is sorted. Each run succeeds with the probability of 1/2, so the expected running
time is O(nlogn) and the worst case is O(n2).
6.
(12 Points) Let S be a set of n intervals of the form [a, b], where a < b. Design
an efficient data structure that can answer, in O(log n +k) time, queries of
the form contains(x), which asks for an enumeration of all intervals in S that
contain x, where k is the number of such intervals. What is the space usage of
your data structure?
Hint: Think about reducing this to a two-dimensional problem.
Answer:
We can use the priority search tree to solve this problem. The input of the priority
search tree is (x1, x2, y1) and the output is a set of (x, y) that satisfy x1 ≤ x ≤ x2
and y1 ≤ y.
For this problem, we use all intervals in S as 2 dimensional items to build the
priority search tree. The input should be x and the output should be all intervals [a,
b] that contain x. We have x ≥ a and x ≤ b. So we should let y1 be x, x2 be x as
well, and let x1 be −∞. Then if the input of priority search tree is x, then the
output of it should be all intervals than contain x.
The priority search tree has n nodes and if the number of output is k, the running
time of this algorithm is the same as using PST search to answer three-sided range
queries, which is O(log n + k).
The space usage of the data structure is O(n), as shown in our text book.
7.
(10 Points) Given a set P of n points, design an efficient algorithm for
constructing a simple polygon whose vertices are the points of P.
Answer:
We can use Graham's scan algorithm to create a simple polygon whose vertices
are the points of P.
For the points in the range of (0, n-1):
1. We first locate the points at the y-bottom most coordinate. If two points with
the same y-coordinate value, select the one with the smaller x-coordinate value.
Make and place the lowest point A0 in the first position in the output hull.
2. In the counterclockwise sequence around points [0], sort the remaining n-1
points by polar angle. If two points have equal polar angles, place the closest
point at first.
3. Check if two or more points have the same angle. If yes, select the one that is
farthest away from A0. Let the array's new size be m.
4. Return, if we can use m to create a triangle.
5. Create an empty stack and add the points p0, p1, and p2 to it.
6. Identify the fourth vertex by processing the remaining points (which is the
minimum requirement for a polygon).
7. Remove points from the stack until it is not oriented counterclockwise. Thus,
we will have a point near to the stack's top, b point at the stack's top, and c push
points.
Time Complexity: The approach takes O(n log n) time for n input points.
8.
(10 Points) DNA strings are sometimes spliced into other DNA strings as a
product of re-combinant DNA processes. But DNA strings can be read in
what would be either the forward or backward direction for a standard
character string. Thus it is useful to be able to identify prefixes and their
reversals. Let T be a DNA text string of length n. Describe an O(n)-time
method for finding the longest prefix of T that is a substring of the reversal
of T.
Hint: Consider using a prefix trie.
Answer:
At first we should compute the reversal of the DNA text string T, and denote the
reversal as T’, which can be done in O(n) time. Then we build a suffix trie with T’
in O(n) time as shown in the text book. And we can use suffix trie to find the
longest prefix of T that is a substring of T’, which can be done in O(n) time in the
worst case.
Each node at most has 4 nodes because of the type of DNA is 4. For the first
character in T, we search it in the suffix trie, if it is in a node of the suffix trie, we
continue to find the next character in T starting from the current node. When we
compare all the characters in the current node and all of them match, we move to
it child node and repeat until there is no match any more. So in the worst case, the
running time is O(n).
So the total time of this algorithm is O(n).
9.
(8 Points) Solve the following linear program using the Simplex Method:
Maximize: 18 x1 + 12.5 x2
Subject to: x1 + x2 ≤ 20
x1 ≤ 12
x2 ≤ 16
x1 , x2 ≥ 0.
Answer:
We first convert it to slack form.
z = 18x1+12.5x2
x3 = 20−x1−x¬2
x4 = 12−x1
x5 = 16−x2
x1, x2, x3, x4, x5 > = 0
The variable x1 has a positive coefficient in the objective function. With respect
to x1, the second constraint is the most restrictive. The objective value is z = 0.
We can substitute x1 with x1 = 12 − x4. We get:
z = 216+12.5x2−18x4
x1 = 12−x4
x3 = 8−x2+x4
x5 = 16−x2
x1, x2, x3, x4, x5 > = 0
x2 has a positive coefficient in the new objective function, and now the first
constraint is the most restrictive. The objective value is z = 216. We replace it by
x2 = 8 − x3 + x4 and substitute x2 .
z = 316−12.5x3−5.5x4
x1 = 12−x4
x2 = 8−x3+x4
x5 = 8+x3−x4
x1, x2, x3, x4> = 0
all coefficients are negative now, so the optimal solution is x1 = 12 and x2 = 8.
Thus, the final objective value is z = 316.
Download