Lab Report – Problem 2

Researcher/affiliations: Jui Sun, CSC532, Student ID: 850291753

Abstract

This lab report addresses the following problem: decide whether a set A = {x | 0 < x < 100} of n randomly selected integers contains a subset that sums exactly to an integer value k. To solve it, I use an algorithm that generates all possible combinations (all possible subsets) and collects the solutions among them. I discuss the problems I met while generating the subsets, what the experimental data show, and which directions future work should focus on.

1. Introduction

Because this is an NP-complete problem, some background on NP-hardness is needed. A P problem is a problem that can be solved by a polynomial algorithm, that is, its solution time is polynomial in the input size. An NP problem is a problem that can be solved in polynomial time by a non-deterministic algorithm. A problem is NP-hard if every problem in NP can be reduced to it by a polynomial-time algorithm. If an NP-hard problem is itself an NP problem, it is called NP-complete. The algorithm I used to solve this problem searches the logic value table, which has 2^n rows, so the time complexity of the algorithm is O(2^n).

Informally, an NP-hard problem is a mathematical problem for which no shortcut or smart algorithm is known that would lead to a simple or rapid solution in every case. Instead, the known exact methods amount, in the worst case, to an exhaustive analysis in which an exponential number of possible outcomes are tested. An example of an NP-hard problem is the traveling salesman problem.

2. Background

The concept of NP-completeness was introduced in 1971 by Stephen Cook, who formalized the notion in his famous paper "The Complexity of Theorem Proving Procedures". Since then, researchers have tried to determine whether P = NP, and many attempts have been made to settle the P versus NP question. In a 2002 survey of 100 researchers, 61 believed that P ≠ NP and 22 were not sure.
When asked when the problem will be solved, 79 respondents gave a specific estimate:

P = NP will be resolved between    Respondents
2002-2009                          5
2010-2019                          12
2020-2029                          13
2030-2039                          10
2040-2049                          5
2050-2059                          12
2060-2069                          4
2070-2079                          0
2080-2089                          1
2090-2099                          0
2100-2110                          7
2100-2199                          0
2200-3000                          5
never                              5

The Clay Mathematics Institute is offering a reward of one million US dollars to anyone who produces a formal proof that P = NP or that P ≠ NP.

3. Experimental Design

To solve this problem, we need to compute the sum of the members of every subset and determine whether it equals the target number we set. The easiest way to represent all subsets is to generate a truth table over the set members. From the table it is clear that a set with n members has 2^n subsets (each member is either taken or not taken). In the logic value table, 1 means take and 0 means do not take. Because the table consists only of 1's and 0's, every possible combination can be regarded as a binary number from 0 to 2^n − 1. If all decimal numbers from 0 to 2^n − 1 are converted to binary, all possible subsets are generated, and all solutions to the problem can be found. The algorithm therefore focuses on how to convert a decimal number to a binary number.

The conversion is simple. Divide the decimal number by 2: writing the number as m = 2m' + b, the remainder b is the current binary digit (m is used here for the number being converted, to distinguish it from the set size n). Divide m' in the same way, and its remainder is the next digit. Continue this process until the quotient is 0. The digits collected this way are the binary representation of the decimal number.
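The repeated division by 2 described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the experiment: the function names `to_bits` and `subset_from_index` are hypothetical, and the loop runs for a fixed number of digits (one per set member) rather than stopping when the quotient reaches 0, so that every subset index yields the same number of bits.

```python
def to_bits(m, width):
    """Return the binary digits of m, least significant bit first.

    Hypothetical helper: repeatedly divide by 2 and keep the remainder.
    The result is padded with zeros up to `width` digits.
    """
    bits = []
    for _ in range(width):
        bits.append(m % 2)  # the remainder is the current binary digit
        m //= 2             # the quotient is divided again in the next step
    return bits


def subset_from_index(values, m):
    """Return the members of `values` whose digit in m is 1 (1 = take)."""
    bits = to_bits(m, len(values))
    return [v for v, b in zip(values, bits) if b == 1]
```

For example, the index 5 is 101 in binary, so `subset_from_index([3, 5, 7], 5)` takes the first and third members and skips the second.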
To implement it, declare a two-dimensional array Val to record the subset. Randomly select n numbers from 1 to 99 and save each of them in its own column. The two rows, 0 and 1, indicate whether each number is included or not; they are used to calculate the sum and to print the subset later in the process.

The main process uses a double for-loop. The outer loop runs over each number from 0 to 2^n − 1, and the inner loop converts that number to binary. Inside the inner loop, the conversion described above is used to obtain the subset elements. Each iteration adds to the running sum and uses a counter to record the chosen elements (for printing if necessary). An if-condition checks whether the sum already exceeds the target; if it does, the inner loop breaks and the next number is processed. After the conversion, the sum is compared with the target number, and if they are equal the subset is printed. After all loops have run, the printed subsets are the answer to the problem.

4. Findings

The total running time of this method was 14 hours. With the initial target of 1789, it was hard to find a subset that satisfies the problem, because the sum of all members did not even reach 1789. So I modified the target to 1589. Unfortunately, after more than 10 hours of running, the number of solution subsets exceeded 10000. So I changed the target to 1739 and added a print instruction right after the set is generated (to catch the case where the sum of all members is smaller than the target number). To test whether the design works, I also set n = 10 and the target k = 500. In that setting it is hard to get solution subsets; even when solutions exist, only 1 or 2 are generated.
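For reference, the double for-loop of Section 3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the program that produced the results in this report: the function name `find_subsets` is hypothetical, a plain list replaces the array Val, and the early break is only valid because all members are positive integers.

```python
def find_subsets(values, target):
    """Enumerate all 2^n subsets of `values`; return those summing to `target`.

    Assumes every value is a positive integer, so a running sum that
    exceeds the target can never come back down.
    """
    n = len(values)
    solutions = []
    for index in range(2 ** n):        # outer loop: one index per subset
        m, total, chosen = index, 0, []
        for i in range(n):             # inner loop: decode index bit by bit
            if m % 2 == 1:             # digit 1 means "take values[i]"
                total += values[i]
                chosen.append(values[i])
                if total > target:     # sum already too large: stop early
                    break
            m //= 2
        if total == target:
            solutions.append(chosen)
    return solutions
```

On a small set such as `[3, 5, 7, 8]` with target 15, the sketch returns the two subsets `[3, 5, 7]` and `[7, 8]`. The running time still grows as 2^n, so it is only practical for small n.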
The final run generated these 35 numbers:

34 46 20 50 86 80 11 71 15 47 7 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63 77

With k = 1739, I got 328 solutions:

subset1: 34 46 20 50 86 80 11 71 47 7 99 58 80 89 65 29 79 66 22 17 79 84 44 92 35 61 56 12 64 93 63
subset2: 34 46 20 50 86 80 71 47 7 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 64 93 63
subset3: 34 46 20 50 86 80 11 71 15 47 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 61 56 12 16 64 93 63
subset4: 34 46 50 86 80 11 71 15 47 7 99 58 80 89 11 65 29 79 66 17 79 84 44 92 35 61 56 12 16 64 93 63
subset5: 34 46 50 86 80 71 15 47 7 99 58 80 89 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63
…
subset325: 46 50 86 80 71 15 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63 77
subset326: 34 46 20 50 80 71 47 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63 77
subset327: 34 20 50 86 80 71 7 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63 77
subset328: 46 86 80 11 71 47 7 99 58 80 89 11 65 29 79 66 22 17 79 84 44 92 35 61 56 12 16 64 93 63 77

Because the set contains some repeated numbers, some subsets look the same. Looking at the results, I found that many elements appear in almost every solution.

If n >= 63, the algorithm I used terminates immediately after entering the conversion for-loop. The reason is that the bound of the outer for-loop, which I declared as a long integer, exceeds the range of the long type (because I set the bound to 2^n).

5. Conclusions

This problem can be seen as a job allocation problem: given a fixed amount of working time, evaluate the time each job costs and find the most effective way to schedule the jobs. Among the numerous solutions, some jobs evidently must be taken (as the findings show). There may also be no solution at all if only a few job options are available.
Using the LVTG method seems an easy and direct way to solve this problem, but since it is a type of exhaustive search, it is easy to see that the running time grows quickly as the number of set elements increases. Finding constraints or methods that decrease the search time will be the main goal in improving this algorithm.

6. Future work

To solve this problem faster, we need other models to speed up the algorithm. Parallel computing, probabilistic models, or other models may reduce the running time. As for the algorithm itself, a recursive algorithm could be used instead of the decimal-to-binary approach; the two solutions should be compared to find which algorithm is better. As for the study of NP-completeness, an algorithm whose time complexity can be expressed as a polynomial would help advance the discussion of the P versus NP problem.

7. References

Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani, “NP-complete problems”, Algorithms, pp. 232-247.
R.C.T. Lee, S.S. Tseng, R.C. Chang, Y.T. Tsai, “The Theory of NP-Completeness”, Introduction to the Design and Analysis of Algorithms, pp. 364-365.