MAKESPAN DISTRIBUTIONS IN FLOW SHOPS WITH MULTIPLE PROCESSORS

Wei Wang¹ and John L. Hunsucker
Department of Industrial Engineering
University of Houston, Houston, Texas 77204-4812, USA

Abstract

The makespan distributions in a flow shop with multiple processors (FSMP) are investigated in this study. The FSMP problem is characterized as the processing of n jobs through an m-stage flow shop, where there are one or more processors at each stage. Since the minimum makespan FSMP problem is known to be computationally explosive, it was necessary to investigate the distribution of makespans in FSMP in order to provide guidance to FSMP research. Results of this study may help gain a better understanding of the FSMP makespan problem.

Keywords: Scheduling, Flow shop, Makespan, Distribution, Heuristic

1. Introduction

This paper is concerned with the makespan distributions in flow shops with multiple processors (FSMP). Scheduling in an FSMP involves the sequencing of n jobs through m stages where there can be one or more identical processors at each stage. The makespan is the total elapsed time required to process all jobs in a given set. It is an important characteristic of a schedule, and the optimization objective of many studies is to minimize it.

It is known that the FSMP makespan problems are computationally explosive. This has caused researchers to turn to heuristic methods or computer simulation programs in the hope of finding near-optimal solutions to the FSMP problems. Without knowing the optimal solutions, a tool is needed to estimate the optimal solution and to evaluate the quality of various heuristics. Determination of a strong global lower bound has been shown empirically to be an effective tool to achieve this objective. Nevertheless, it is not the only method to evaluate the effectiveness of various heuristics. Investigation of the makespan distributions in FSMP can also help gain a better understanding of the makespan problem.
Hence, this study examines the makespan distributions in the FSMP environment, and its results reveal the statistical nature of the FSMP makespan problems.

2. Background Review

The pure flow shop scheduling problem with the makespan objective has been the focus of much research for the last several decades. In the pure flow shop environment, there is only one processor available at each stage throughout the entire flow shop. Johnson’s (1954) constructive algorithm for the two-stage pure flow shop makespan problem is a very important contribution to the makespan study because it solves the problem optimally. Under the tenet of Johnson’s algorithm, roughly defined as “start quick and finish quick”, other important heuristics were developed to solve the makespan problem in the pure flow shop with more than two stages. These significant heuristics are those developed by Palmer (1965), Campbell et al. (1970), Gupta (1971) and Dannenbring (1977). These four heuristics can quickly produce a sequence for the multiple-stage scheduling problem but will not always produce an optimal solution.

¹ Corresponding author.

The FSMP environment is a more complex class of flow shop than the pure flow shop. The FSMP makespan problem has been the focus of recent research efforts. Some significant work done on FSMP is reviewed here because it formed the foundation for this research. Starting from two-stage FSMPs, work has been done by Deal and Hunsucker (1991), Gupta and Tunc (1991), Lee and Vairaktarakis (1994), and Deal et al. (1994). Other research has been conducted on FSMPs with more than two stages. For instance, Brah et al. (1991a) performed studies on the mathematical modeling of FSMPs. Brah and Hunsucker (1991b) developed a branch and bound model for FSMPs. Hunsucker and Shah (1992) evaluated the priority rules in scheduling FSMPs.

Since the optimal solutions are unknown to researchers, a good makespan lower bound is needed in order to test a heuristic.
Santos (1993) developed a lower bound, the SHD lower bound, to evaluate the quality of various heuristics when the optimal makespan is unknown. Santos’ study showed that the lower bound is a strong indicator of the optimal makespan in the FSMP environment. Moreover, Santos et al. (1995b) developed a computer heuristic called FLOWMULT which can help solve the makespan problem in a timely manner.

3. Experimental Design

In order to better understand the FSMP makespan problems it is necessary to investigate the distribution of the makespans in FSMP. Hence, the primary objective of this study is to examine the statistical nature of the makespan distributions in the FSMP environment.

The investigation analyzes the results of all n! multiple-permutation (MP) schedules for each job configuration. A multiple-permutation schedule is a non-delay schedule in which the start order of jobs on the current stage is determined by the finish order of jobs in the previous stage. Ties are handled by using the start order on the previous stage. This new term, MP, applies only to the FSMP environment. Considering only MP schedules effectively reduces the solution space to n! different sequences. An MP schedule is generally close to an optimal schedule even if it is not optimal. Therefore, examining only the MP schedules does not significantly affect the quality of the study.

To fulfill the primary objective of this study, the following specific tasks are performed:

1. Investigate the distribution of makespans over all experimental problems.
2. Investigate the distribution of makespans in terms of the number of processors per stage.
3. Investigate the distribution of makespans in terms of the number of stages.
4. Investigate the distribution of makespans in terms of the number of jobs.

A computer program called AUTOMSCT is developed in this study.
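The MP schedule definition given in this section can be illustrated with a short sketch. This is not the AUTOMSCT program; it is a minimal Python illustration under stated assumptions. The names `mp_makespan`, `all_mp_makespans`, `times`, `processors` and `example_times` are hypothetical, and the rule that each job takes the earliest available processor at a stage is an assumption that the text does not spell out.

```python
import heapq
from itertools import permutations


def mp_makespan(order, times, processors):
    """Makespan of the MP schedule induced by a first-stage job order.

    times[j][s]   -- processing time of job j at stage s
    processors[s] -- number of identical processors at stage s
    Assumption: each job is assigned to the earliest available processor.
    """
    ready = {j: 0 for j in order}   # time each job becomes available
    queue = list(order)             # start order at the current stage
    for s, count in enumerate(processors):
        free = [0] * count          # min-heap of processor release times
        finished = []
        for pos, j in enumerate(queue):
            start = max(ready[j], heapq.heappop(free))
            end = start + times[j][s]
            heapq.heappush(free, end)
            ready[j] = end
            # Sort key: finish time first, then start order on this stage,
            # which is the tie-breaking rule described in the text.
            finished.append((end, pos, j))
        finished.sort()
        queue = [j for _, _, j in finished]
    return max(ready.values())


def all_mp_makespans(times, processors):
    """Enumerate the makespans of all n! MP schedules."""
    jobs = range(len(times))
    return {order: mp_makespan(order, times, processors)
            for order in permutations(jobs)}


# Hypothetical 3-job, 2-stage instance (not data from the study).
example_times = [[3, 2], [1, 4], [2, 1]]
print(mp_makespan((0, 1, 2), example_times, [2, 2]))  # prints 6
```

Because the queue at each stage is sorted by finish time on the previous stage, jobs start in the order in which they become available, which is one way to realize the non-delay property described above.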
AUTOMSCT enumerates the makespans of all MP schedules and counts the number of all possible makespans generated by each FSMP problem. AUTOMSCT is coded in Microsoft Quick Basic. The processing times of the jobs are generated from a uniform distribution over the integers from 1 to 99, exclusively. A random number generator in Quick Basic is used to produce these random numbers. The test was performed on a Dell Latitude C600 laptop computer with a Pentium III 750 MHz processor and 256 MB of memory.

The FSMP problem size examined in this study is relatively small because the computational time grows exponentially when the number of jobs increases. This limits the study to a maximum of seven jobs. For each job configuration, there are at least two stages and at most four stages. The number of identical processors at each stage ranges from 2 to N-1, where N is the number of jobs. Table 1 is the breakdown of all job configurations tested in the experiment. The table contains the number of jobs, the number of stages, the number of identical processors per stage and the number of test problems. For each test configuration, 30 test problems are run. Therefore, the test for the uniform FSMP environment contains 45 test configurations and 1350 problems.

Table 1: Problem Breakdown of the Uniform FSMP Configuration

Jobs  Stages  Processors/Stage  No. of Problems (per configuration)
3     2       2                 30
3     3       2                 30
3     4       2                 30
4     2       2, 3              30
4     3       2, 3              30
4     4       2, 3              30
5     2       2, 3, 4           30
5     3       2, 3, 4           30
5     4       2, 3, 4           30
6     2       2, 3, 4, 5        30
6     3       2, 3, 4, 5        30
6     4       2, 3, 4, 5        30
7     2       2, 3, 4, 5, 6     30
7     3       2, 3, 4, 5, 6     30
7     4       2, 3, 4, 5, 6     30

To better evaluate the makespan distributions, the makespan values are standardized using the following formula:

\[
\text{Relative Deviation} = \frac{\text{Makespan} - \text{Optimal MP Makespan}}{\text{Optimal MP Makespan}} \times 100\% \qquad (1)
\]

The optimal makespans of these experimental problems are unknown.
Therefore, the optimal multiple-permutation (MP) makespan for each problem is used for calculating the relative deviation. The optimal MP makespan is the minimum of all n! makespans produced by the AUTOMSCT program.

4. Experimental Results

4.1 Overall Results

The histogram in Figure 1 illustrates the makespan distributions of each problem configuration and the overall distribution. The makespan distribution of the 7-job configuration dominates the makespan distribution of all experimental problems, because even the sum of the numbers of schedules (296460) generated by all other job configurations is small compared with the number of schedules (2268000) generated by the 7-job configuration.

Table 2 shows a comparison of the overall performance of the 5 different job configurations. As shown by Table 2, the number of schedules increases factorially as the number of jobs (N) increases. The number of test problems differs from one job configuration to another. This is because a larger number of jobs allows more processors to be added to each stage and therefore increases the number of test problems for that job configuration. For each problem, there are n! schedules generated. The overall number of schedules tested for each job configuration varies significantly simply because the number of jobs and the number of problems vary significantly. For example, the 3-job configuration generates 540 schedules over all test problems and the 7-job configuration generates 2268000 schedules for all test problems. This again explains why the overall makespan distribution shows the same performance as the 7-job configuration.

According to Table 2, it seems that the percentage of optimal schedules decreases when the number of jobs increases. This needs more investigation because the comparison in Table 2 is too generic to reflect the differences between the job configurations tested.
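The schedule counts quoted above follow directly from the experimental design (2 to 4 stages, 2 to N-1 processors per stage, 30 problems per configuration, n! MP schedules per problem). The short sketch below reproduces that arithmetic; the function and constant names are hypothetical and serve only as an illustration.

```python
from math import factorial

PROBLEMS_PER_CONFIGURATION = 30   # stated in the experimental design
STAGE_OPTIONS = (2, 3, 4)         # two to four stages


def schedule_counts(job_counts=(3, 4, 5, 6, 7)):
    """Return {n: (number of problems, number of MP schedules)}."""
    counts = {}
    for n in job_counts:
        processor_options = range(2, n)          # 2 .. n-1 processors
        configurations = len(STAGE_OPTIONS) * len(processor_options)
        problems = configurations * PROBLEMS_PER_CONFIGURATION
        counts[n] = (problems, problems * factorial(n))
    return counts


counts = schedule_counts()
print(counts[3])   # prints (90, 540)
print(counts[7])   # prints (450, 2268000)
# Schedules generated by the 3- to 6-job configurations combined.
print(sum(counts[n][1] for n in (3, 4, 5, 6)))   # prints 296460
```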
Table 2 also shows that the worst relative deviation more than doubles when the number of jobs increases from 3 to 7. The worst case happens mostly when there are 2 processors at each stage. This suggests that increasing the number of processors at each stage will help improve the makespan performance. More investigation needs to be conducted in order to better understand the influence on the makespan when more processors are added to the FSMP.

In any job configuration, more than 50% of the schedules produce makespans that are within 10% of the optimal MP solution. Moreover, more than 99% of the time, a schedule will produce a makespan that is within 50% of the optimal MP value. This suggests that the probability that the makespan obtained from a random schedule is within 10% of the optimal is more than 50%, and the probability that it is within 50% of the optimal is more than 99%. This information indicates that it is quite difficult to get extremely bad makespan values (i.e. makespan values that are not within 50% of the optimal).

[Figure 1: Makespan Distributions for FSMP. Frequency (%) versus relative deviation from the optimal MP makespan, for the 3-, 4-, 5-, 6- and 7-job configurations and overall.]

Table 2: Comparison of Makespan Distributions

Schedule Performance                 3 job    4 job    5 job    6 job    7 job
1. N!                                6        24       120      720      5040
2. No. of test problems              90       180      270      360      450
3. No. of schedules                  540      4320     32400    259200   2268000
4. Percent of optimal schedules      43.33%   36.44%   30.69%   32.36%   29.21%
5. Worst relative deviation          40.53%   50.29%   64.63%   87.71%   85.71%
6. No. of stages (worst case)        2        3        4        2        2
7. No. of processors (worst case)    2        2        2        2        3
8. No. of worst cases                2        2        2        2        6
9. Within 10% of optimal             76.29%   67.04%   58.67%   57.61%   52.39%
10. Within 50% of optimal            100%     99.95%   99.68%   99.54%   99.06%

4.2 Makespan Distributions vs. Number of Processors

The previous section found that the number of processors per stage may play a pivotal role in the makespan distributions. Hence, the relationship between the number of processors at each stage and the makespan distributions is further investigated. As shown by Table 1, there are three different stage configurations, namely the two-stage, three-stage and four-stage configurations. Theoretically the stage configuration will not affect the makespan distributions. Thus the two-stage configuration is chosen to study the relationship between the number of processors and the makespan distribution. Since the number of processors per stage ranges from 2 to N-1 (N is the number of jobs), the 7-job configuration is selected because it offers the widest range of processor counts.

Figure 2 and Table 3 show the comparison of relative deviations between different processor configurations. The percentage of optimal MP schedules is about 75% when there are 6 (N-1) processors. The worst relative deviation is only about 40% when there are 6 (N-1) processors. The distributions of the relative deviations become visibly less normal as the number of processors per stage increases. Figure 3 shows that the percentage of the optimal MP makespans may fit an exponential distribution when the number of processors increases from 2 to 6 (N-1). Hence, a hypothesis is used in order to test the observations. The hypothesis is constructed as follows:

H0: For a 7-job, 2-stage FSMP problem, the number of optimal MP schedules in percentage follows the exponential distribution with mean equal to 0.8 when the number of processors per stage increases from 2 to N-1 processors.

H1: For a 7-job, 2-stage FSMP problem, the number of optimal MP schedules in percentage does not follow the exponential distribution with mean equal to 0.8 when the number of processors per stage increases from 2 to N-1 processors.
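The hypothesis above can be checked with a one-sample Kolmogorov-Smirnov statistic. The sketch below is one possible reading, not the authors' procedure (for which the paper refers to Wang (2001)): it assumes that the five percentages of optimal schedules reported in Table 3 are treated as the sample and compared with the exponential CDF F(x) = 1 - exp(-x/mean) with mean 0.8. The function names are hypothetical. Under that assumption the statistic rounds to 0.417, which agrees with the value of D reported in this section.

```python
import math


def exponential_cdf(x, mean):
    """CDF of the exponential distribution with the given mean."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0


def ks_statistic(sample, cdf):
    """One-sample K-S statistic: largest gap between the empirical
    step function and the hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Percent of optimal schedules for 2 to 6 processors (Table 3),
# expressed as fractions. Treating them as the sample is an assumption.
optimal_fraction = [0.0030, 0.0321, 0.1614, 0.4341, 0.7476]
d_statistic = ks_statistic(optimal_fraction,
                           lambda x: exponential_cdf(x, 0.8))
critical_value = 0.510   # K-S table value for n = 5 at the 0.1 level

print(round(d_statistic, 3))         # prints 0.417
print(d_statistic < critical_value)  # prints True: H0 is not rejected
```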
The Kolmogorov-Smirnov (K-S) Goodness of Fit Test is carried out in order to test the above hypothesis. The Kolmogorov-Smirnov test is used to decide if a sample of data comes from a specific distribution. Since the test statistic D (= 0.417) is less than the critical value D_{0.1,5} (= 0.510), the K-S test failed to reject the hypothesis H0 at the 0.1 significance level. Therefore, it was concluded that the number of optimal MP schedules in percentage follows the exponential distribution with mean equal to 0.8 when the number of processors per stage ranges from 2 to 6 (N-1). Details of this test may be found in Wang (2001).

[Figure 2: Makespan Distributions for 7-job, 2-stage and 2 to 6 processor FSMP. Percentage (%) versus relative deviation, with one series each for 2, 3, 4, 5 and 6 processors.]

Table 3: Makespan Distributions vs. No. of Processors – 7-job, 2-stage FSMP

Schedule Performance (7-job)       2-processor  3-processor  4-processor  5-processor  6-processor
1. N!                              5040         5040         5040         5040         5040
2. No. of test problems            30           30           30           30           30
3. No. of schedules                151200       151200       151200       151200       151200
4. Percent of optimal schedules    0.30%        3.21%        16.14%       43.41%       74.76%
5. Worst deviation                 67.41%       85.71%       61.33%       44.58%       39.29%
6. No. of worst cases              4            6            192          240          720
7. Within 10% of optimal           11.78%       21.97%       46.52%       71.74%       92.38%

[Figure 3: The MP Optimal Schedules – 7-job, 2-stage, 2 to 6 processors. Percentage (%) of optimal schedules versus number of processors: 0.30% (2), 3.21% (3), 16.14% (4), 43.41% (5), 74.76% (6).]

Similarly, the K-S test was carried out to test the observations of 2-stage FSMP problems that have 4, 5 and 6 jobs.

For the K-S test of the 6-job, 2-stage FSMP problem, the null hypothesis assumes that the mean of the exponential distribution equals 0.9. The test statistic D equals 0.435, which is less than the critical value D_{α,n} at the 0.1 significance level (D_{0.1,4} = 0.564). Hence, the K-S test failed to reject the null hypothesis, and the number of the best MP schedules in percentage was concluded to follow the exponential distribution with mean = 0.9 when the number of processors per stage ranges from 2 to 5.

For the K-S test of the 5-job, 2-stage FSMP problem, the null hypothesis assumes that the mean of the exponential distribution equals 1.0. The test statistic D equals 0.517, which is less than the critical value D_{α,n} at the 0.1 significance level (D_{0.1,3} = 0.642). Hence, the K-S test failed to reject the null hypothesis, and the number of the best MP schedules in percentage was concluded to follow the exponential distribution with mean = 1.0 when the number of processors per stage ranges from 2 to 4.

For the K-S test of the 4-job, 2-stage FSMP problem, the null hypothesis assumes that the mean of the exponential distribution equals 1.1. The test statistic D equals 0.611, which is less than the critical value D_{α,n} at the 0.1 significance level (D_{0.1,2} = 0.776). Hence, the K-S test failed to reject the null hypothesis, and the number of the best MP schedules in percentage was concluded to follow the exponential distribution with mean = 1.1 when the number of processors per stage ranges from 2 to 3.

Table 4 shows the means of the exponential distributions for the 4- to 7-job FSMP problems, in which there are 2 stages for each problem. A plot of the exponential distribution mean versus the number of jobs is shown in Figure 4. As can be seen, a straight line fits the data very well, which suggests that a direct linear relationship exists. A linear regression analysis was performed on the data. Table 5 shows the results of the linear regression analysis.
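The regression can be re-derived from the four exponential distribution means quoted above (1.1, 1.0, 0.9 and 0.8 for 4 to 7 jobs). The sketch below uses ordinary least squares written out by hand; the function name `fit_line` is hypothetical, and the code illustrates the calculation rather than the software used in the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    return intercept, slope, 1.0 - ss_residual / ss_total


jobs = [4, 5, 6, 7]
means = [1.1, 1.0, 0.9, 0.8]
b0, b1, r_squared = fit_line(jobs, means)
# Rounded for display; floating-point results differ from 1.5 and -0.1
# only by rounding error.
print(round(b0, 6), round(b1, 6), round(r_squared, 6))
```

Because the four points lie exactly on a line, the fit gives an intercept of 1.5 and a slope of -0.1 up to rounding error, with R^2 equal to 1. With only four data points, such a fit should be read as descriptive of these configurations.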
The coefficient of determination R^2 is 1, which means that a direct linear relationship exists between the number of jobs and the exponential distribution mean. Therefore, if the number of jobs is known, it might be possible to predict the exponential distribution mean. The equation of the fitted line is Y = 1.5 - 0.1 X1, where X1 is the number of jobs.

Table 4: The Mean of Exponential Distribution

        4-job   5-job   6-job   7-job
Mean    1.1     1.0     0.9     0.8

[Figure 4: Plot of the exponential distribution mean vs. the number of jobs (4-job to 7-job).]

Table 5: Linear Regression Analysis on the Exponential Distribution Mean

Model: Y = B0 + B1*X1

COEFFICIENT    SE              T
B0 = 1.5       7.242792E-08    2.071024E+07
B1 = -0.1      1.290478E-08    -7749064

N = 4    R^2 = 1    S = 2.885598E-08

In summary, this section investigated the relationship between the number of processors per stage and the makespan distributions. It may be concluded from the study that the number of optimal MP schedules increases exponentially as the number of processors per stage increases from 2 to N-1 (N is the number of jobs). This means that increasing the number of processors per stage will significantly improve the makespan performance. Moreover, the worst relative deviation from the optimal MP makespan decreases when the number of processors increases. This implies that it is more difficult to get extremely bad solutions when the number of processors per stage increases.

4.3 Makespan Distributions vs. Number of Stages

This section investigates the relationship between the makespan distributions and the number of stages. The 7-job, 2-processor FSMP problems were tested in this study. The number of stages ranges from 2 to 4. Figure 5 shows that the three different stage configurations follow similar makespan distributions.
To better compare the makespan distributions of these three different stage configurations, a hypothesis is used and tested. The hypothesis is:

H0: There is no difference in the makespan distributions of the 7-job, 2-stage, 2-processor FSMP problem, the 7-job, 3-stage, 2-processor FSMP problem and the 7-job, 4-stage, 2-processor FSMP problem.

H1: At least one of the makespan distributions is significantly different.

A one-way analysis of variance (ANOVA) was carried out in order to test the above hypothesis. The details of the ANOVA may be found in Wang (2001). The ANOVA test cannot reject the hypothesis H0. Therefore, it can be concluded from the test that there are no significant differences between the makespan distributions of FSMP problems that have the same number of jobs and processors but a different number of stages. This result implies that the makespan performance of an FSMP problem may be predicted using the makespan distributions of another FSMP problem with the same number of jobs and processors but a different number of stages.

[Figure 5: Makespan Distributions – 7-job, 2-processor and 2, 3, and 4 stages. Percentage (%) versus relative deviation, with one series each for the 2-stage, 3-stage and 4-stage configurations.]

4.4 Makespan Distributions vs. Number of Jobs

The relationship between the makespan distributions and the number of jobs is examined in this section. Figure 6 illustrates the makespan distributions when the number of jobs increases. It can be seen from Figure 6 that the makespan distributions are more spread out when the number of jobs increases. Also, more schedules have larger relative deviations when the number of jobs increases. Figure 7 shows the change in the number of optimal MP schedules when the number of jobs increases. A K-S test is carried out in order to test whether the change in the number of optimal MP schedules follows the exponential distribution.
The hypothesis is constructed as follows:

H0: For a 2-stage FSMP problem with 2 processors per stage, the change of the number of optimal MP schedules in percentage follows the exponential distribution with mean equal to 0.7 when the number of jobs increases from 3 to 7.

H1: For a 2-stage FSMP problem with 2 processors per stage, the change of the number of optimal MP schedules in percentage does not follow the exponential distribution with mean equal to 0.7 when the number of jobs increases from 3 to 7.

The K-S test failed to reject the null hypothesis H0 at the 0.01 significance level, and the number of optimal MP schedules in percentage was concluded to follow the exponential distribution with mean = 0.7. Details of this test may be found in Wang (2001). This means that the probability that a bad makespan is generated by a random schedule increases significantly when the number of jobs increases.

[Figure 6: Makespan Distributions vs. Number of jobs – 2-stage, 2-processor FSMP. Percentage (%) versus relative deviation, with one series each for 3, 4, 5, 6 and 7 jobs.]

[Figure 7: Optimal MP Schedules vs. Number of Jobs – 2-stage, 2-processor FSMP. Percentage (%) of optimal schedules versus number of jobs: 42.22% (3 job), 16.11% (4 job), 3.50% (5 job), 0.95% (6 job), 0.30% (7 job).]

5. Conclusions

This paper reveals the statistical nature of the makespan distributions in FSMP. The results of this research can be used as guidance for other FSMP research. There are several important findings regarding the makespan distributions.

First, the number of processors per stage is the determinant of the makespan distributions in FSMP. As the number of processors increases, the number of optimal MP schedules increases exponentially. Moreover, the makespan distributions are less spread out when there are more processors.
This implies that the probability that a fairly good makespan is generated by a random schedule increases significantly as more processors are added to the FSMP environment. The result about the number of processors also suggests that it is more meaningful to specify the number of processors at each stage when performing statistical studies in the FSMP environment.

Second, the number of stages in the FSMP problem does not affect the makespan distributions significantly. This result suggests that the number of stages is an insignificant variable in FSMP statistical studies. The results of statistical studies on a specific stage configuration may also apply to configurations with a different number of stages (with the same number of jobs and processors).

Third, when the number of processors per stage is the same, the number of optimal schedules decreases exponentially when the number of jobs increases. This means that the probability that a bad makespan is generated by a random schedule increases significantly when the number of jobs increases.

The NP-hard nature of the FSMP problem forces researchers towards the development of heuristic procedures that can, it is hoped, generate a good makespan solution in a timely manner. In practice, the optimality of the FSMP problem’s solution is normally sacrificed for shorter computational times. When dealing with medium or large sized FSMP problems, heuristics that generate near-optimal makespan solutions in less computational time will be more effective.

References

[1] Brah, S.A., Hunsucker, J.L. and Shah, J. (1991a). “Mathematical Modeling of Scheduling Problems”, Journal of Information & Optimization Sciences, Vol. 12, No. 1, 113-137.
[2] Brah, S.A., and Hunsucker, J.L. (1991b). “Branch and Bound Algorithm for the Flow Shop with Multiple Processors”, European Journal of Operational Research, Vol. 51, 88-99.
[3] Deal, D.E., and Hunsucker, J.L. (1991). “The Two-Stage Flowshop Scheduling Problem with M Machines at each Stage”, Journal of Information & Optimization Sciences, Vol. 12, 407-417.
[4] French, Simon. (1982). Sequencing and Scheduling: An Introduction to the Mathematics of Job-Shop, John Wiley & Sons, New York.
[5] Gupta, J.N.D. and Tunc, E.A. (1991). “Schedules for a Two-Stage Hybrid Flowshop with Parallel Machines at the Second Stage”, International Journal of Production Research, Vol. 29, 1489-1502.
[6] Hunsucker, J.L., and Santos, D.L. (1991). “The Effects of Adding an Additional Machine to a Flow Shop Environment”, Paper presented at the 16th Annual Technical Symposium of the American Institute of Aeronautics and Astronautics in Houston, TX.
[7] Hunsucker, J.L., and Shah, J.R. (1994). “Comparative Performance Analysis of Priority Rules in a Constrained Flow Shop with Multiple Processors Environment”, European Journal of Operational Research, Vol. 72, 102-114.
[8] Hunsucker, J.L., and Shah, J.R. (1992). “Performance of Priority Rules in a Due Date Flow Shop”, Omega, Vol. 20, 73-89.
[9] Lee, David Z. L. (2000). “Zero Wait Flow Shop with Multiple Processors with Heuristic, Algorithm, and Mathematical Concepts”, Unpublished Ph.D. Dissertation, University of Houston.
[10] Lee, C.-Y., and Vairaktarakis, G. L. (1994). “Minimizing Makespan in Hybrid Flowshops”, Operations Research Letters 16, pp. 149-58.
[11] Santos, D.L., Hunsucker, J.L., and Deal, D.E. (1995a). “Global Lower Bounds for Flow Shops with Multiple Processors”, European Journal of Operational Research, Vol. 80, 112-120.
[12] Santos, D.L., Hunsucker, J.L., and Deal, D.E. (1995b). “FLOWMULT: Permutation Sequences for Flow Shops with Multiple Processors”, Journal of Information and Optimization Sciences, Vol. 16, 351-366.
[13] Santos, D.L., Hunsucker, J.L., and Deal, D.E. (1996). “An Evaluation of Sequencing Heuristics in Flow Shops with Multiple Processors”, Computers and Engineering, Vol. 10, No. 4, 681-691.
[14] Santos, D.L., Hunsucker, J.L., and Deal, D.E. (2001). “On Makespan Improvement in Flowshops with Multiple Processors”, Production Planning and Control, Vol. 12, No. 3, 283-295.
[15] Schellhase, John C. (1996). “The Placement of an Additional Processor in a Flow Shop with Multiple Processor”, Unpublished Ph.D. Dissertation, University of Houston.
[16] Thornton, Henry W. (2000). “The Placement of Additional Storage in a Flow Shop with Multiple Processors and No Intermediate Storage”, Unpublished Ph.D. Dissertation, University of Houston.
[17] Wang, Wei. (2001). “A Study of Mean Flow Time and Makespan in Flow Shops with Multiple Processors”, Unpublished Ph.D. Dissertation, University of Houston.