Optimal Server Allocation in Reconfigurable Clusters with Multiple Job Types

J. Palmer, I. Mitrani
School of Computing Science, University of Newcastle, NE1 7RU
jennie.palmer@ncl.ac.uk, isi.mitrani@ncl.ac.uk

Outline

Introduction
The model
System State
Computation of the optimal policy
Experimental Results
Look-up tables
Policy Comparison
The Heuristic Policy
Simulation results
Conclusions

Introduction

In a Grid environment, heterogeneous clusters of servers provide a variety of services to widely distributed user communities
Users submit jobs without necessarily knowing or caring where they will be executed
[Figure: Users, Job Requests, Pool Manager, Cluster 1, Cluster 2, Cluster 3, ..., Cluster M]

The model

Demands (jobs) of M types are submitted to a pool of N servers
A configuration consists of dedicating $k_i$ of the servers to job type i, such that

$$\sum_{i=1}^{M} k_i = N$$

[Figure: N servers divided among M queues; queue i receives type i jobs at rate $\lambda_i$ and is served by $k_i$ servers with average service time $b_i$]

The model

Servers can be switched from type i to type j
What is a good policy for deciding dynamically when to reconfigure the system?
[Figure: the same N servers and M queues, with one server being switched between queues]

The model

Arrival rates: $\lambda_1, \lambda_2, \ldots, \lambda_M$
Average service times: $b_1, b_2, \ldots, b_M$
Offered load of type i: $\lambda_i b_i$
Holding costs (the cost of waiting): $c_1, c_2, \ldots, c_M$
Switching costs: $c_{i,j}$
Switching rates, indexed by (i, j)

System State

The system state is $S = (\mathbf{j}, \mathbf{k}, \mathbf{m})$, where

$$\mathbf{j} = (j_1, j_2, \ldots, j_M), \qquad \mathbf{k} = (k_1, k_2, \ldots, k_M), \qquad \mathbf{m} = (m_{i,j})_{i,j=1}^{M}$$

The system has been modelled by a continuous Markov process, the transition rates of which depend on the decisions taken in various states
A dynamic configuration policy must decide, for any given state S, whether to
i. Do nothing
ii. Initiate a switch from queue i to queue j

Computation of the optimal policy

Principles of dynamic programming have been used to solve the optimization problem

$$V(S) = \sum_{i=1}^{M} j_i c_i + \min_{d} \left[ c(d) + \sum_{S'} q_d(S, S') \, V(S') \right]$$

The optimal policy is specified by the action d which minimises the right-hand side
The computational complexity of determining the optimal switching policy is of the order

$$O\left( J^{M} N^{M-1} M(M-1)/2 \right)$$

Experimental Results: Look-up Tables

N = 2, M = 2
Optimal decisions have been stored in look-up tables which may then be referred to during simulations
[Figure: look-up table of optimal decisions with axes $j_1$ and $j_2$, each running from 0 to 10. Key: Do nothing; Switch 1 → 2; Switch 2 → 1]

Experimental Results: Look-up Tables

N = 3, M = 3, $j_1 = 0$
[Figure: look-up table of optimal decisions with axes $j_2$ and $j_3$, each running from 0 to 10. Key: Do nothing; Switch 1 → 2; Switch 1 → 3; Switch 2 → 3]

Policy Comparison

An exact characterisation of the optimal policy is unlikely
Instead, we formulate a heuristic which performs reasonably well and is easy to implement
Three policies compared in simulations:
i. Static: Assign servers in proportion to the holding cost and offered load for each type
ii. Heuristic: Attempts to balance the total holding costs of the job types
iii. Optimal: Use pre-computed tables of optimal decisions

The Heuristic Policy

Calculate the following for each of the M(M-1)/2 possible switches from queue a to queue b.
Find the maximum of all quantities calculated.
If strictly positive, this will be the most advantageous switch to take, so take the action corresponding to this switch.
Otherwise, do nothing.
[Equation: for each candidate switch from queue a to queue b, a quantity built from the destination-queue terms $\lambda_b$, $\min(k_b, j_b)$ and $c_b j_b$, the source-queue terms $\lambda_a$, $\min(k_a - 1, j_a)$ and $c_a j_a$, and a factor indexed by (a, b)]

Increasing number of servers, M = 2

Increasing loads, M = 2

Increasing number of servers, M = 3

[Figure: average cost against number of servers (3 to 6) for the static, optimal and heuristic policies]

Increasing loads, M = 3

[Figure: average cost against total load (2.6 to 3.6) for the static, optimal and heuristic policies]

Conclusions

A problem of interest in the area of distributed computing and dynamic Grid provision has been examined
The optimal reconfiguration policy can be computed and tabulated
For practical purposes, an easily implementable heuristic policy is available

Acknowledgment

This work was carried out as part of the collaborative project GridSHED, funded by North-East Regional e-Science Centre and BT
This project also aims to develop Grid middleware to demonstrate the legitimacy of our models, providing a basis for the development of commercially viable Grid hosting environments
Project web page: http://www.neresc.ac.uk/projects/GridSHED/
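
Sketch: selection step of the heuristic policy

The selection step described under The Heuristic Policy (compute a quantity for every candidate switch, take the maximum, act only if it is strictly positive) could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `choose_switch` and `toy_score` are hypothetical, the candidate set is assumed to be every ordered pair (a, b) with at least one server currently at queue a, and `toy_score` is a stand-in holding-cost difference rather than the quantity defined on the slide.

```python
from itertools import permutations
from typing import Callable, Optional, Sequence, Tuple

Switch = Tuple[int, int]  # (a, b): move one server from queue a to queue b


def choose_switch(
    k: Sequence[int],
    score: Callable[[int, int], float],
) -> Optional[Switch]:
    """Return the highest-scoring switch, or None for "do nothing".

    k[i] is the number of servers currently allocated to queue i.
    score(a, b) is supplied by the caller; it stands for the quantity
    the heuristic computes for a switch from queue a to queue b.
    """
    best: Optional[Switch] = None
    best_value = 0.0  # a switch is taken only if its score is strictly positive
    for a, b in permutations(range(len(k)), 2):
        if k[a] == 0:
            continue  # assumption: no server at queue a, so nothing to switch
        value = score(a, b)
        if value > best_value:
            best, best_value = (a, b), value
    return best


# Illustrative data (made up for this sketch, not taken from the experiments).
c = [1.0, 2.0, 1.0]  # holding costs c_i
j = [5, 1, 4]        # queue lengths j_i


def toy_score(a: int, b: int) -> float:
    # Stand-in only: difference between the current holding costs of the
    # destination and source queues. NOT the formula from the slide.
    return c[b] * j[b] - c[a] * j[a]


decision = choose_switch([1, 1, 1], toy_score)
```

Passing the scoring function as a parameter keeps the selection rule separate from the formula itself, so the same loop applies whichever quantity is plugged in. A return value of `None` corresponds to the "do nothing" action.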