International Journal of Application or Innovation in Engineering & Management (IJAIEM) Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 10, October 2013 ISSN 2319 - 4847 A hybrid algorithm for grid task scheduling problem AtenaShahkolaei1, Hamid Jazayeriy2 1 Department of computer engineering, Islamic Azad University, Science and Research Ayatollah Amoli branch, Amol, Iran 2 Babol Noshirvani University of Technology, Iran Abstract At the beginning the Grid computing and Grid technologies was used for scientific and technical jobs where computers were distributed geographically, connected through internet, are used in order to invent virtual supercomputers with high computing competency. The goal of inventing such a virtual supercomputer was solving complex scientific problems in less time than before. The goal of task scheduling in Grid systems is decreasing the total completion time of the all tasks with providing a timetable to perform all existing tasks. The problem of task scheduling in grid systems has large solution space and is considered as a kind of NP-Hard problems. In this paper a new method for solving task scheduling in Grid computing system is proposed by combining Genetic Algorithm with Tabu Search Algorithm. The goal of proposed algorithm is to combining the global search abilities of Genetic Algorithm with local search of Tabu Search, and creating a stable algorithm that highly guarantees the accessing of the global optima. The proposed algorithm is compared with other heuristic algorithms and experimental results showed that the proposed algorithm has high efficiency for solving task scheduling in Grid computing. Keywords:Computational Grids,ETC Matrix, Genetic Algorithm, Tabu Search. 1. INTRODUCTION Grid computing and its technologiescreates at first for technical and scientific activities in computers where geographicallydistributed, connected through internet, and used to createvirtual supercomputers with high computing abilities. The purpose of creating these virtual supercomputers is solving complex scientific problems in less time than before. Afterwards, Grid Computing was applied for helping and reaching improvements in meteorology, physics, medicine and other computing-intensive scientific fields. Examples of such works in large scale including optimization and data-intensive computing are studied in [1]–[7]. The current definition of Grid was introduced for first time by [8] in 1998 as following: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." Task scheduling in Grid Computing contains a set of tasks that should be run on a limited set of resources. The purpose of task scheduling in Grid systems is offering a time table for executing all tasks and decreasing the execution time of all tasks. The problem of task scheduling in grid systems has large solution space and is considered as a kind of NP-Hard problems [9], [10].In the past few years, many researches were performed for solving Grid Task Scheduling problems based on GA [11]–[14] and TS [11], [16]–[18].A hybrid GA (TS) algorithm for the problem of independent scheduling in computational grids presented in [15]. In their proposed algorithm the hybridization follows a low level approach in which GA is the main flow and TS is subordinated to it. The experimental evaluation showed that GA (TS) outperforms both GA and TS for small and medium size grid scenarios but achieves worse value for large size scenario. In this paper, the proposed algorithm utilize GA and TS together to maintain the global search abilities of Genetic Algorithm and local search of Tabu Search. In GA(TS) algorithm [15] the TS algorithm used instead of mutation operation, but in this paper, in the proposed algorithm the TS algorithm execute separately after Crossover and Mutation operations, to optimize some percent of individuals in GA population. 2. GRID TASK SCHEDULING The Grid Task Scheduling is including of a set of tasks that should run on limited set of resources. The purpose of task scheduling in Grid systems is offering a time tablefor executing all tasks, so that the execution time of all tasks minimized and the computational load of all system divided between all resources equally. For simulating grid environment, a static environment used that contained at×mmatrix called ETC (expected time to compute); the model offered by [11] is used. Any row in ETC matrix shows the estimated execution times for a given task on each machine. Thus, for an arbitrary taskti and an arbitrary machine mj the entryETC(ti ,mj)is the estimated execution time of running task i on machine j. In simulated model, the purpose of heuristic algorithm is minimizing the metatask execution time that is called themakespan. Consider that the earliest time machine can complete the execution of all the tasks that have previously been assigned to it, called Machine availability time. Thus, the completion time for a new task on machine will define by relation (1): Volume 2, Issue 10, October 2013 Page 343 International Journal of Application or Innovation in Engineering & Management (IJAIEM) Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 10, October 2013 ISSN 2319 - 4847 (1) ct ( t i , m j ) mat ( m j ) ETC (t i m j ) Where matis machine availability time for machinemjandctis the completion time for taskti on machinemj. With having the amount of for each task, the metatask execution time (makespan) will calculate by relations (2). makespan max i t m ct ( t i m j ) 0 j 0 (2) 3. PROPOSED ALGORITHM In this paper a new method for solving task scheduling in Grid computing system is proposed by combining Genetic Algorithm with Tabu Search Algorithm that will describe in this section. For displaying chromosomes a 1 × t vector is used[15], where t is equal by the number of all available tasks for processing. Each chromosome in the proposed algorithm shows apossible mappingfor all tasks of users. In the proposed algorithm parents are selected using therankingmethod, and the chosen operator for generating new children is single point crossover. After generating new children the mutation operator is used for making diversity in children population. Swap and Transfer are two types of methods that selected for mutation. The main purpose of these mutations is changing tasks between different machines to process the tasks with more suitable machines. In Population Optimizations step, some percentages of available chromosomes optimized by Tabu Search algorithm and added to the genetic population as new children. Tabu Search algorithm receive one genetic chromosome as base solution, and try to optimize the chromosome with its operators in some iterations. In the proposed Tabu Search algorithm, at first a solution will create in neighborhood of current solution. Then, the index of tasks that has a role in generating the new neighbor will locate in the tabu list, to forbidden these tasks to participate in generating new neighbors. In following, if the generated neighbor be better than the best solution produced so far, the generated neighbor will select as the best solution and the structures of memory will updated. This cycle of generating neighborhood will repeat in numbers that was determined before, and finally the Tabu Search algorithm will return its best solution to Genetic Algorithm. The proposed algorithm get terminated if no improvement observed in chromosomes in a predefine number of generation. Figure 1 shows the flowchart of the proposed algorithm. Figure1: Flowchart of the proposed algorithm 4. EXPERIMENTAL RESULTS To evaluate the performance of the proposed algorithm, it is compared with other heuristic algorithms in [11]. The proposed algorithm was implemented in C# 2010 and the tests were made on a Pentium IV 3.00GHz processor with 1.00 GB of RAM.For testing the proposed algorithm in simulated environment, 9 test data set based on different type of ETC matrices are designed, each of them consisting of 100 instances. They are labeled as Test_X_YZ_N where X is the type of ETC matrices(C - consistent, I – inconsistent and p means partially consistent), Y and Z indicate the task and machine heterogeneity. Their value could be H for maximum heterogeneity and L for the minimum heterogeneity. N is used to number the corresponding test. For example, if a test labeled as Test_C_HH_10, it means that, in this test a consistent matrices is used with maximum heterogeneity for tasks and machines.In test of the proposed algorithm, all test data sets are composed from 512 jobs and 16 machines. Various GA and TS options used are: Population size = 100 Crossover probability = 0.75 Volume 2, Issue 10, October 2013 Page 344 International Journal of Application or Innovation in Engineering & Management (IJAIEM) Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 10, October 2013 ISSN 2319 - 4847 Mutation probability = 0.25 Tabu Search neighborhood size = 20 tabu_tenure = 5 The proposed algorithm for any test performed 100 times and the average of fitness values is calculated by the proposed algorithm. Figure 2 shows the output of Test_C_HH_1 in Gant chart. This Chart is consisting of 16 rows and 512 colorful rectangles that any one of them is showing one of the 16 machines and 512 works in the simulated system. Figure2: Gant chart of Test_C_HH_1 In Figure 3 the chart of movements of best chromosome in genetic population is shown,that represents high speed of algorithm in achieving to optimal solution. In this chart tilt of curve will decrease during the time, and finally it’s stopped in local optima. Figure3: Fitness of the best chromosome in Genetic Population Table1 shows the result of several tests that obtained to optimal solutions. In Table 1, the first column presents test name, and other results such as the best iteration that optimal solution obtained, the total running time for the proposed algorithm, the time that optimal solution obtained, and fitness of the optimal solution are presented in other columns respectively. Table 1: The result of some tests Test Data Best Iteration Run Time Best time Best fitness Test_C_HH_ 1 40115 785 767 7234264 Test_I_HH_1 69897 1358 1339 3103133 Volume 2, Issue 10, October 2013 Page 345 International Journal of Application or Innovation in Engineering & Management (IJAIEM) Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 10, October 2013 ISSN 2319 - 4847 Test_P_HH_1 57390 1104 1086 4425099 Figure 4 shows the chart of average running time for the proposed algorithm and other heuristic algorithms [11]. In Figure 4 the average running time of proposed algorithm shown with red color. Figure4: Average run time of the proposed and other heuristic algorithms Figure 5 shows the comparison of makespan obtained by the proposed algorithm and other heuristic algorithms [11]. As you can see in Figure 5 the proposed algorithm obtained better optimal solution in comparison with other heuristic algorithms. Figure5: Comparison of makespan versus other algorithms in the literature 5. CONCLUSION In this paper a new method for solving task scheduling in Grid computing system is proposed by combining Genetic Algorithm with Tabu Search Algorithm. The goal of proposed algorithm is to minimizing the execution time of all tasks and dividing the computational load of the whole system between all resources equally. To evaluate the performance of the proposed algorithm, it is compared with other heuristic algorithms, and experimental results showed that the proposed algorithm has high performance for solving task scheduling in Grid computing. For future works, the effect of selecting other type of operations and parameters on the quality of solving problem in proposed algorithm could be examine. References [1] H. Casanova, and J. Dongarra,“Applying NetSolve's network-enabled server,”Computational Science &Engineering, Vol. 5, No. 3, pp. 57-67, 1998. [2] J. P. Goux, S. Kulkarni, J. Linderoth, and M. Yoder, “An enabling framework for master-worker applications on the computational grid,”In Proceedings Ninth IEEE International Symposium on High-Performance Distributed Computing, pp. 43-50, 2000. [3] S. J. Wright, “Solving optimization problems on computational grids,”Optima, 65(ANL/MCS/JA-38869), 2001. [4] J.Linderoth, and S.Wright, “Decomposition algorithms for stochastic programming on a computational grid,” Computational Optimization and Applications, Vol. 24,No. 2-3, pp. 207-250, 2003. [5] H. B.Newman, M. H.Ellisman, and J. A.Orcutt, “Data-intensive e-science frontier research,” Communications of the ACM, Vol. 46, No. 11, pp. 68-77,2003. Volume 2, Issue 10, October 2013 Page 346 International Journal of Application or Innovation in Engineering & Management (IJAIEM) Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com Volume 2, Issue 10, October 2013 ISSN 2319 - 4847 [6] F. Xhafa, C. Paniagua, L. Barolli, and S. Caballe, “A parallel grid-based implementation for real-time processing of event log data of collaborative applications,” International Journal of Web and Grid Services, Vol. 6, No. 2, pp. 124140, 2010. [7] M. D. Beynon, A. Sussman, U. Catalyurek, T. Kurc, and J. Saltz, “Performance optimization for data intensive grid applications,” In Proceedings Third IEEE Annual International Workshop on Active Middleware Services, pp. 97105, 2001. [8] I. Foster, C. Kesselman (Eds.), The Grid: Blueprint for a Future ComputingInfrastructure, Morgan Kaufmann, San Francisco, USA, 1999. [9] D. Fernández-Baca,“Allocating modules to processors in a distributed system,” In Proceedings of the IEEE Transactions on Software Engineering, Vol. 15, No. 11, pp. 1427-1436, 1989. [10] O. H.Ibarra, and C. E.Kim,“Heuristic algorithms for scheduling independent tasks on nonidentical processors,” Journal of the ACM (JACM), Vol. 24, No. 2, pp. 280-289, 1977. [11] T. D.Braun, H. J.Siegel,N.Beck, L. L.Bölöni, M.Maheswaran, A. I.Reuther, and R. F.Freund,“A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems,” Journal of Parallel and Distributed computing, Vol. 61, No. 6, pp. 810-837, 2001. [12] J.Carretero, F.Xhafa, and A.Abraham,“Genetic algorithm based schedulers for grid computing systems,” International Journal of Innovative Computing, Information and Control, Vol. 3, No. 6, pp. 1-19, 2007. [13] Y. Jia-bin, L.Jiao-min, and S. Zhen-yu,“Strategy for tasks scheduling in grid combined neighborhood search with improved adaptive genetic algorithm based on local convergence criterion,”In Proceedings of IEEE International Conference on Computer Science and Software Engineering, Vol. 3, pp. 9-13, 2008. [14] X. Zhang, and W. Zeng,“Grid Workflow Scheduling based on improved genetic algorithm,” In Proceedings IEEE International Conference on Computer Design and Applications (ICCDA), Vol. 5, pp. 270-273, 2010. [15] Z.Pooranian, M.Shojafar, R.Tavoli,M.Singhal, and A.Abraham,“A Hybrid Meta-Heuristic Algorithm for Job Scheduling on Computational Grids,”Inform J,Vol. 37, No. 2, pp. 157–164, 2013. [16] R.Subrata, A. Y.Zomaya, and B.Landfeldt,“Artificial life techniques for load balancing in computational grids,” Journal of Computer and System Sciences, Vol. 73,No. 8, pp. 1176-1190,2007. [17] SH. Benedict,V. Vasudevan,“Improving scheduling of scientific workflows using tabu search for computational grids,” Inf Technol J, Vol. 7, No. 1, pp. 91–97, 2008. [18] F.Xhafa, and A. Abraham,“Computational models and heuristic methods for Grid scheduling problems,”Future generation computer systems, Vol. 26, No. 4, pp. 608-621, 2010. AUTHOR AtenaShahkolaei received her B.S. in IT Engineering from University of Ghaemshahr, Iran and received her M.S. degree in Software Engineering from Science and Research University of Amol, Iran. She is a reader in Grid Computing, Cloud Computing and she has done a lot of researches about Grid Computing in recent years. Hamid Jazayeriy is a lecturer at the faculty of electrical and computer engineering, NIT, IRAN. He received his B.Sc. of computer engineering from university of Tehran in 1996 and his master of software engineering from university of Isfahan in 2000. In 2011, he received his Ph.D. from university Putra Malaysia, UPM. He is a reader in multi-agent systems, swarm intelligence and graph theory. Volume 2, Issue 10, October 2013 Page 347