CSE 830: Design and Theory of Algorithms
Final Programming Assignment/Contest
Due Monday, April 29th 2002
The Problem:
The vertex cover problem takes as input a graph G = (V, E). The goal is to find the smallest subset of
vertices (i.e. the smallest V' contained in V) such that every edge in E is incident on at least one vertex in V'.
This is best understood through an example. Let G be the graph on V = {1, 2, 3, 4} consisting of the
edges E = {(1, 2), (1, 3), (1, 4), (3, 4)}. A minimum vertex cover consists of two vertices,
for example vertices 1 and 4 (vertices 1 and 3 work equally well).
Vertex cover arises when we seek a small subgraph that is representative of the graph. Finding a cover
is easy: you can simply take all the vertices. However, the smallest vertex cover can be much smaller than
the total number of vertices. Consider a star, the tree in which only one vertex has degree > 1: its
center alone covers every edge.
What is known about algorithms for vertex cover? The problem is NP-complete, meaning that it is
exceedingly unlikely that you will be able to find an algorithm with polynomial worst-case running
time. It remains NP-complete even for certain restricted graphs. However, since the goal of the
problem is to find a subset of V, a backtracking program which iterates through all 2^n possible subsets
of vertices and tests whether each covers all m edges gives an easy O(n m 2^n) algorithm. But the goal
of this assignment is to find as practically good an algorithm as possible.
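
As a baseline, the exhaustive search described above can be sketched as follows. This is only an illustration, not the intended solution: the names countBits, covers and bruteForceCover are made up for this sketch, and the 64-bit mask representation is an assumption that limits it to graphs with fewer than 64 vertices.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // endpoints numbered 1..n, as in the handout

// Number of set bits, i.e. the size of the subset encoded by mask.
int countBits(std::uint64_t mask) {
    int bits = 0;
    for (; mask != 0; mask &= mask - 1) ++bits;  // clears the lowest set bit
    return bits;
}

// True if every edge has an endpoint in the subset (bit v-1 <=> vertex v).
bool covers(std::uint64_t mask, const std::vector<Edge>& edges) {
    for (const Edge& e : edges) {
        bool hasFirst = ((mask >> (e.first - 1)) & 1) != 0;
        bool hasSecond = ((mask >> (e.second - 1)) & 1) != 0;
        if (!hasFirst && !hasSecond) return false;  // this edge is uncovered
    }
    return true;
}

// Tests all 2^n subsets and returns the bitmask of one smallest cover.
std::uint64_t bruteForceCover(int n, const std::vector<Edge>& edges) {
    assert(n >= 0 && n < 64);  // assumption: the subset fits in one 64-bit word
    const std::uint64_t limit = std::uint64_t{1} << n;
    std::uint64_t best = limit - 1;  // all n vertices always form a cover
    int bestSize = n;
    for (std::uint64_t mask = 0; mask < limit; ++mask) {
        int size = countBits(mask);
        if (size < bestSize && covers(mask, edges)) {
            best = mask;
            bestSize = size;
        }
    }
    return best;
}
```

On the four-vertex example from the problem statement, bruteForceCover(4, edges) returns a mask with two bits set. The loop is useful mainly as a correctness reference for small graphs.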
Input and Output:
There are a variety of data files available on the CSE unix system at ~torng/web/830/Contest/Public.
There is also a graph generator program available in that directory that you can use to generate test
graphs. Each graph is in a format such that the first two lines give the number of edges and vertices,
and the rest of the file consists of a pair of vertices per line representing an edge. Vertices are
numbered from 1 to n.
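
A reader for this format might look like the sketch below. It ASSUMES that the first line holds the number of edges and the second line the number of vertices; check the files in the Public directory before relying on that order. Graph, readGraph and readGraphFromString are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Graph {
    int n = 0;                               // number of vertices, numbered 1..n
    std::vector<std::pair<int, int>> edges;  // one entry per edge line
};

// ASSUMPTION: line 1 = number of edges, line 2 = number of vertices.
// Returns false if the input is truncated or a vertex is out of range.
bool readGraph(std::istream& in, Graph& g) {
    int m = 0;
    if (!(in >> m >> g.n) || m < 0 || g.n < 0) return false;
    g.edges.clear();
    for (int i = 0; i < m; ++i) {
        int u = 0, v = 0;
        if (!(in >> u >> v)) return false;
        if (u < 1 || u > g.n || v < 1 || v > g.n) return false;
        g.edges.emplace_back(u, v);
    }
    assert(g.edges.size() == static_cast<std::size_t>(m));  // sanity check
    return true;
}

// Convenience wrapper for trying the parser on an in-memory string.
bool readGraphFromString(const std::string& text, Graph& g) {
    std::istringstream in(text);
    return readGraph(in, g);
}
```

In a real program the stream would be a std::ifstream opened on the filename given on the command line.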
In a combinatorially explosive problem such as this, adding one to the problem size can multiply the
running time substantially (the brute-force search above doubles with each added vertex), so test on
the smaller files first.
Your program must be able to take a single argument: the name of the file containing the graph on which
to compute the minimum vertex cover. You may assume that this file contains a legal graph. Your program must
output the size of your vertex cover and then a list of the nodes in your cover. Do NOT include any
extra words in your output as this will confuse my correctness checker.
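
One possible way to produce output with no extra words is sketched below. The exact layout is an ASSUMPTION (size on the first line, then the vertices separated by spaces on the next line); the handout does not spell it out, so confirm it before submitting. formatCover is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// ASSUMED layout: the cover size on the first line, then the vertices
// on a second line separated by single spaces, and nothing else.
std::string formatCover(const std::vector<int>& cover) {
    std::ostringstream out;
    out << cover.size() << '\n';
    for (std::size_t i = 0; i < cover.size(); ++i) {
        assert(cover[i] >= 1);  // vertices are numbered from 1
        if (i > 0) out << ' ';
        out << cover[i];
    }
    out << '\n';
    return out.str();
}
```

The program would then write formatCover(cover) to standard output and print nothing else.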
Implementation:
You will be graded on how fast and clever your program is, not on style. Incorrect programs will
receive no credit. A program will be deemed incorrect if, for some input graph, the subset it outputs
is not a minimum vertex cover.
You must use C or C++ for uniformity/efficiency. The programs must be able to run under UNIX on
the CSE cluster.
Writing efficient programs is an iterative process. Build your first solution so you can throw it away,
and start it early enough to go through several iterations, especially if you feel your programming
background is weak. Use a simple backtracking approach first; don't get fancy until you have
something that works. It is possible to get a program working in about 100 lines.
Don't forget to use the code optimizer on your compiler when you make the final run, to get a free 20%
or so speedup. The system profiler tool, gprof, will help you tune your program.
What you turn in:
You are to turn in via handin a listing of your program, sample runs, and a README file containing a
brief description of your algorithm, any interesting optimizations you included in your program, the
largest test files your program could handle in one minute or less of wall clock time, and instructions
on how to compile your code. The top five self-reported times / largest sizes will be collected and
tested by me to determine the winner.
Other Details:
Everyone must do this question individually. The idea is to think about the problem from scratch and
on your own. If you do not completely understand the definition of minimum vertex cover, you don't
have the slightest chance of producing a working program. Don't be afraid to ask for clarification or
explanation!
What kind of approaches might you consider? Instead of testing every subset of V, you should be able
to develop a backtracking algorithm that prunes partial solutions (i.e. abandons a partial cover as soon
as no completion of it could improve on the best solution found so far). Can you order the
vertices so that the search is likely to proceed more quickly? Starting off with a good approximate
solution may help achieve faster cutoffs. There is room for cleverness in selecting data structures to
minimize the time needed to test whether a given set of vertices covers every edge of G.
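
To make these hints concrete, here is one minimal sketch of a pruned backtracking search; it is a starting point under stated assumptions, not a contest-grade solution. It branches on an uncovered edge (one of its two endpoints must be in any cover), seeds the bound with a simple approximate cover built by taking both endpoints of every edge not yet covered, and cuts off any partial cover that is already as large as the best one known. The names Search and minVertexCover are invented for this sketch, and no vertex ordering heuristic is applied.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // endpoints numbered 1..n

struct Search {
    const std::vector<Edge>& edges;
    std::vector<char> inCover;  // inCover[v] != 0 when v is in the partial cover
    std::vector<int> current;   // vertices chosen so far
    std::vector<int> best;      // smallest complete cover found so far

    void solve(std::size_t from) {
        // Prune: this partial cover can no longer beat the best one.
        if (current.size() >= best.size()) return;
        // Skip edges that the partial cover already handles.
        while (from < edges.size() &&
               (inCover[edges[from].first] || inCover[edges[from].second])) {
            ++from;
        }
        if (from == edges.size()) {  // every edge covered, strictly smaller
            best = current;
            return;
        }
        // The edge is uncovered, so one of its endpoints must be chosen.
        for (int v : {edges[from].first, edges[from].second}) {
            inCover[v] = 1;
            current.push_back(v);
            solve(from + 1);
            current.pop_back();
            inCover[v] = 0;
        }
    }
};

std::vector<int> minVertexCover(int n, const std::vector<Edge>& edges) {
    // Initial bound: take both endpoints of each edge that is still uncovered.
    std::vector<char> used(n + 1, 0);
    std::vector<int> approx;
    for (const Edge& e : edges) {
        assert(e.first >= 1 && e.first <= n && e.second >= 1 && e.second <= n);
        if (!used[e.first] && !used[e.second]) {
            used[e.first] = used[e.second] = 1;
            approx.push_back(e.first);
            if (e.second != e.first) approx.push_back(e.second);
        }
    }
    Search s{edges, std::vector<char>(n + 1, 0), {}, approx};
    s.solve(0);
    return s.best;
}
```

The initial cover is at most twice the optimum, since the edges that trigger a pick share no endpoints and any cover needs one vertex for each of them. Ordering the edges so that high-degree vertices are tried first, or using a stronger lower bound in the pruning test, are natural next refinements.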
I will also include a sample program on the CSE UNIX machines at
~torng/web/830/Contest/Public/graph.exe. This program takes the filename of the graph to solve and
reports only the size of the minimum cover; it will help you verify your solution. Remember that your
program must also report which vertices make up that cover.
Grading:
The assignment has a base value of 10 basic points. In order to get the 10 basic points, you must get
your program to solve random graph input instances correctly where the number of nodes in the graph
is at least 25.
Bonuses:
Grid graphs (which can be generated by the graph generator in the directory specified above) can be
solved more efficiently if you can recognize that you are dealing with a grid graph. A special case of a
grid graph is a path. You get 1 extra credit advanced point if you can handle paths of arbitrary size.
You get 2 extra credit advanced points if you can handle grids of arbitrary size.
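
One observation that may help with this bonus: paths and grids are bipartite, and a path or grid on n vertices has a matching with floor(n/2) edges (pair up consecutive vertices along a path that visits every vertex), so no cover can be smaller than floor(n/2). The smaller side of the two-coloring has exactly floor(n/2) vertices and touches every edge, so it is a minimum cover. The sketch below ASSUMES the caller has already established that the input is a connected path or grid; recognizing such graphs is not shown, and coverPathOrGrid is a name invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// ASSUMPTION: the graph is a connected path or grid on vertices 1..n.
// Two-colors it by breadth-first search and returns the smaller color class.
std::vector<int> coverPathOrGrid(int n,
                                 const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> adj(n + 1);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<int> color(n + 1, -1);  // -1 means not visited yet
    std::vector<int> side[2];           // vertices of each color
    std::queue<int> q;
    if (n >= 1) {
        color[1] = 0;
        q.push(1);
    }
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        side[color[u]].push_back(u);
        for (int w : adj[u]) {
            if (color[w] == -1) {
                color[w] = 1 - color[u];
                q.push(w);
            } else {
                assert(color[w] != color[u]);  // an odd cycle: not a path or grid
            }
        }
    }
    // Every vertex must have been reached, i.e. the graph is connected.
    assert(side[0].size() + side[1].size() == static_cast<std::size_t>(n));
    return side[0].size() <= side[1].size() ? side[0] : side[1];
}
```

This runs in time linear in the size of the graph, so the input size is limited only by how fast the file can be read. For bipartite graphs in general the smaller color class is still a cover but need not be a minimum one.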
The person whose program handles the largest input instances gets a bonus of 5 extra credit advanced
points. Bonuses will also be awarded to those whose programs exceed the mean/median value of the
class by at least the standard deviation.
Acknowledgements
Thanks to Charles Ofria for the contest idea and all the source code.
Good luck!