COSC 413 Assignment 2: NP-completeness, due May 10, 2013, 25%

Question 1. [5 marks] Transform the following CNF SAT problem into the corresponding 3-clique problem and discuss two corresponding solutions.

F = (A + B’ + C)(A’ + B + C’)(A + B’ + C’), where X’ is the complement of X.

Question 2. [5 marks] Transform the following 4-clique problem into the corresponding 2-vertex cover problem and discuss all corresponding solutions.

[Figure: graph with vertices labelled 1, 2, 3, 4, 5, 6]

Question 3. [5 marks] Transform one of the vertex cover problems obtained above into the corresponding feedback edge set problem and discuss two corresponding solutions.

Question 4. [10 marks] Transform that vertex cover problem into the corresponding directed Hamilton circuit problem and discuss two corresponding solutions.

Question 5. [10 marks] Transform the 3-SAT problem in question 1 into the corresponding colorability problem, and discuss two corresponding solutions. Do not use the special colour for vertices that correspond to factors F1, F2, ...

Question 6. [10 marks] Transform the following colorability problem into the corresponding exact cover problem, and discuss two corresponding solutions.

[Figure: graph with labels 1, 2, 3, 4 and a, b, c, d, e]

Question 7. [5 marks] Transform the following exact cover problem into the corresponding knapsack problem, and discuss all corresponding solutions.

S1 = {1, 2, 3}, S2 = {2, 3, 4}, S3 = {1, 2, 4}, S4 = {4, 5}, S5 = {1, 5}

Question 8. [50 marks] Knapsack problem: Let a[i] and w[i] be the profit and weight of item i. The problem is to maximize the total profit of the items that can be put in a knapsack of limit b. More formally, obtain the binary vector x and the corresponding maximum sum

sum = max {x[1]a[1] + ... + x[n]a[n]}
subject to x[1]w[1] + ... + x[n]w[n] <= b

The purpose of this question is to experience a parallel algorithm for the knapsack problem. The attached program solves the problem based on binary tree search.

(1) [25 marks] The attached parallel program has a knapsack problem hard coded.
Give your own knapsack problem with at least two sizes of n between 30 and 40, and measure the computing time for various numbers of processors.

(2) [25 marks] The algorithm checks the feasibility of a possible solution only at the leaves. If we cut the branch in the search tree as soon as the partial weight sum exceeds the knapsack limit, we can save time. Also, if array a is sorted in descending order, the branch cut will be more effective. Try to speed up the attached program based on this idea or any other idea, and do the same performance measurements as in (1).

Basic idea of the parallel algorithm.

A sequential algorithm checks all 2^n binary strings for the solution. The parallel algorithm partitions the set of strings into several groups and assigns parallel processes to those groups.

x1   0 0 0 0   0 0 0 0   1 1 1 1   1 1 1 1
x2   0 0 0 0   1 1 1 1   0 0 0 0   1 1 1 1
x3   0 0 1 1   0 0 1 1   0 0 1 1   0 0 1 1
x4   0 1 0 1   0 1 0 1   0 1 0 1   0 1 0 1
        P0        P1        P2        P3

Each processor is responsible for one block of four strings: P0 up to the 4th string, P1 up to the 8th, P2 up to the 12th, and P3 up to the 16th.

Let processor Pi (i = 0, ..., 3) solve the sub-problem in which the first two bits are fixed to the binary expansion of i. The partial value sum and weight sum are adjusted accordingly. For example, P3 maximizes s = x[3]a[3]+x[4]a[4] subject to x[3]w[3]+x[4]w[4] <= b - (w[1]+w[2]). Then P3 reports s+a[1]+a[2] to the coordinating process, which finds the maximum of those four local solutions.

Present the source code used for the experiment.

In the next few pages, a basic source code and a load leveller file are given. Parallel programming is done in the MPI environment. A separate tutorial will be given. Another, more advanced code will be provided on Learn. You should stay within the framework of the provided program codes.

Acknowledgement.
We thank Blue Fern for offering us the use of Blue Gene/L for this assignment.

Example of knapsack problem

Suppose you have a knapsack that can hold at most 15 kg, but there are too many items. You will often encounter this situation when you pack for an overseas trip. You wish to maximize your enjoyment (or pleasure) by taking many useful items (such as a laptop, camera, clothes, etc.), but airlines usually have a strict weight limit and you need to decide carefully which items to take and which to leave out. A thief who has broken into a jewellery shop has the same problem: he wishes to maximize the profit by taking the items with the most bang for the buck. Let's consider the following example.

i         0   1   2   3   4   5   6   7
value    10   3   1   8   2   7   5   9
weight    3   6   9   5   7   1   4   2

The maximum weight is 15 kg. Let's represent a selection as a binary string. For example, if items 1, 3, 5, 7 are included, the binary string is 01010101.

The knapsack problem can be seen as a tree traversal. Starting from the root of the tree, if the 0-th item is included, we go to the right subtree; otherwise we go left. We append 1 to the current selection if an item is included, and append 0 otherwise. When items 0, 1 are included and 2, 3 are not, we come to the node 1100 [13,9], where 13 is the current cumulative value and 9 is the current cumulative weight. If the current cumulative weight already exceeds the maximum weight, we can cut the branch there. (e.g.
11011 [23,21] is cut) See the picture of the tree traversal on the next page.

// This works for the exact knapsack with 2^d cores
// This works for general knapsack
// Run with at least 2^d processes; process 0 waits for ranks 1 .. 2^d-1.
#include <stdio.h>
#include <mpi.h>

#define array_size 24   // number of items n

int a[100], x[100], y[100], w[100], b, s, n, z, i, j, weight;
int local_array[100];
int global_array[100];  // used by process with rank=0
int local_max;
int global_max;         // used by process with rank=0

// Prototypes: the functions are defined after main
int ipow(int x, int d);  // was named exp, which clashes with the C library
int max(int a, int b);
void knapsack(int rank, int d);
void binary(int rank, int i, int s, int weight);
void copy(int x[], int y[]);

int main(int argc, char *argv[]) {
    double t1, t2;
    int rank, size, d, j;
    MPI_Status status;
    int from;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    t1 = MPI_Wtime();
    d = 6;  // no. of processes 2^d = 64
    knapsack(rank, d);
    global_max = local_max;
    copy(local_array, global_array);
    t2 = MPI_Wtime();

    /* Other processes are sending their work to process 0 */
    /* Only ranks 1 .. 2^d-1 send, matching the receives below. */
    /* The arrays are 1-based, so array_size+1 entries are transferred. */
    if (rank != 0 && rank <= ipow(2, d) - 1) {
        MPI_Send(&local_max, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        MPI_Send(local_array, array_size + 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    /* process 0 is coordinating to get the maximum from other processes */
    if (rank == 0)
        for (from = 1; from <= ipow(2, d) - 1; from++) {
            // Process 0 receives local_max and local_array from process "from"
            MPI_Recv(&local_max, 1, MPI_INT, from, 0, MPI_COMM_WORLD, &status);
            MPI_Recv(local_array, array_size + 1, MPI_INT, from, 0, MPI_COMM_WORLD, &status);
            // printf("%d received from %d\n", local_max, from);
            if (local_max > global_max) {
                global_max = local_max;
                copy(local_array, global_array);
            }
        }

    /* Reporting solution by process 0 */
    if (rank == 0) {
        printf("local_max = %d \n", global_max);
        for (j = 1; j <= n; j++) printf("%d ", global_array[j]);
        printf("\ntime= %f\n", t2 - t1);
    }
    MPI_Finalize();
    return 0;
}

int ipow(int x, int d) {  // compute x^d
    int i;
    int y = 1;
    for (i = 1; i <= d; i++) y = y * x;
    return y;
}

int max(int a, int b) {
    if (a >= b) return a;
    else return b;
}

void knapsack(int rank, int d) {
    a[1]=112;  a[7]=32;   a[13]=113;  a[19]=121;
    a[2]=25;   a[8]=25;   a[14]=125;  a[20]=271;
    a[3]=141;  a[9]=45;   a[15]=45;   a[21]=411;
    a[4]=9;    a[10]=49;  a[16]=99;   a[22]=91;
    a[5]=17;   a[11]=19;  a[17]=171;  a[23]=117;
    a[6]=24;   a[12]=21;  a[18]=211;  a[24]=211;

    w[1]=112;  w[7]=32;   w[13]=113;  w[19]=121;
    w[2]=25;   w[8]=26;   w[14]=125;  w[20]=271;
    w[3]=41;   w[9]=45;   w[15]=45;   w[21]=41;
    w[4]=9;    w[10]=49;  w[16]=99;   w[22]=91;
    w[5]=17;   w[11]=91;  w[17]=171;  w[23]=117;
    w[6]=24;   w[12]=27;  w[18]=211;  w[24]=241;

    b = 141; n = array_size; s = 0; weight = 0;  // b is the knapsack limit

    /* Initialization for each process */
    z = rank; local_max = 0;
    /* y is the binary expression of process rank */
    for (j = 1; j <= d; j++) {
        if ((z / 2) * 2 == z) y[j] = 0; else y[j] = 1;
        z = z / 2;
    }
    /* Solution array is initialized at the prefix by array y */
    for (j = 1; j <= n; j++) x[j] = y[j];
    /* s and weight are initialized by the partial sums at prefix */
    for (j = 1; j <= d; j++) { s = s + y[j] * a[j]; weight = weight + y[j] * w[j]; }
    if (rank <= ipow(2, d) - 1) {
        binary(rank, d + 1, s, weight);
    }
}

/* Main recursive function for tree traversal */
void binary(int rank, int i, int s, int weight) {
    if (i <= n) {
        x[i] = 0; binary(rank, i + 1, s + x[i] * a[i], weight + x[i] * w[i]);
        x[i] = 1; binary(rank, i + 1, s + x[i] * a[i], weight + x[i] * w[i]);
    }
    else if (weight <= b) {  // feasibility is checked only at the leaves
        if (s > local_max) { local_max = s; copy(x, local_array); }
    }
}

void copy(int x[], int y[]) {  // copy x to y
    int i;
    for (i = 1; i <= n; i++) y[i] = x[i];
}

Load Leveller (similar to script in unix)

#!/bin/sh
# @ output = $(Executable).$(Cluster).out
# @ error = $(Executable).$(Cluster).err
# @ wall_clock_limit = 0:01:00
# @ notification = always
# @ job_type = bluegene
# @ bg_size = 1024
# @ group=bg01
# @ class=bg32_100
# @ account_no = bfcs00231
# @ queue
mpirun -np 1024 -verbose 0 -cwd $HOME/hello $HOME/hello/a.out

To use fewer cores, change the two occurrences of 1024 in the above.
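
The branch cut described in the knapsack example (eight items, 15 kg limit) can be tried out sequentially before moving to MPI. The following is a minimal sketch in plain C, not the attached parallel program: the names Search, visit, solve_knapsack and MAX_ITEMS are made up for this illustration, items are indexed from 0 as in the example table, and it assumes that all weights are non-negative, so that an overweight partial selection can never become feasible again.

```c
#include <assert.h>
#include <string.h>

#define MAX_ITEMS 32 /* assumed upper bound on n for this sketch */

/* Search state kept in one struct (a choice made for this sketch;
   the attached program uses global arrays instead). */
typedef struct {
    int n;                 /* number of items */
    int limit;             /* knapsack limit b */
    const int *value;      /* value[i] of item i, 0-based */
    const int *weight;     /* weight[i] of item i, assumed >= 0 */
    int x[MAX_ITEMS];      /* current selection, one bit per item */
    int best_x[MAX_ITEMS]; /* best feasible selection found so far */
    int best;              /* total value of best_x */
    long visited;          /* number of tree nodes visited */
} Search;

/* Visit the node where items 0..i-1 are decided and the partial sums
   are value_sum and weight_sum. */
static void visit(Search *s, int i, int value_sum, int weight_sum)
{
    s->visited++;
    if (weight_sum > s->limit)
        return; /* branch cut: no extension of this node is feasible */
    if (i == s->n) { /* leaf: a complete, feasible selection */
        if (value_sum > s->best) {
            s->best = value_sum;
            memcpy(s->best_x, s->x, sizeof(int) * (size_t)s->n);
        }
        return;
    }
    s->x[i] = 0; /* left subtree: item i is excluded */
    visit(s, i + 1, value_sum, weight_sum);
    s->x[i] = 1; /* right subtree: item i is included */
    visit(s, i + 1, value_sum + s->value[i], weight_sum + s->weight[i]);
}

/* Returns the best total value, writes the best selection to best_x
   (n entries) and the number of visited nodes to *visited. */
int solve_knapsack(const int *value, const int *weight, int n, int limit,
                   int *best_x, long *visited)
{
    Search s;
    assert(n >= 0 && n <= MAX_ITEMS && limit >= 0);
    memset(&s, 0, sizeof s); /* best = 0 is the empty selection */
    s.n = n;
    s.limit = limit;
    s.value = value;
    s.weight = weight;
    visit(&s, 0, 0, 0);
    memcpy(best_x, s.best_x, sizeof(int) * (size_t)n);
    *visited = s.visited;
    return s.best;
}
```

Called with the value and weight rows of the example table and a limit of 15, this search should report a best value of 39 for the selection 10010111 (items 0, 3, 5, 6, 7, total weight 15), while visiting fewer than the 511 nodes of the full tree. The same test on the partial weight sum is what part (2) of Question 8 suggests adding to the parallel program.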