Fall Oral Presentation Slides

Senior Design Project: Parallel Task Scheduling in Heterogeneous Computing Environments

Senior Design Presentation
Senior Design Students: Christopher Blandin and Dylan Machovec
Post-doctoral Scholar: Bhavesh Khemka
Faculty Advisor: H. J. Siegel

Outline
- motivation
- our system model
- problem statement
- existing work
- simulation details
- future work

Motivation
- High Performance Computing (HPC) is used by a wide variety of fields to solve challenging problems
  - physics simulations, oil and gas industry, climate modeling, computational biology, computational chemistry, and many more
- improving performance increases productivity in these fields
- we plan to improve system performance by designing novel scheduling techniques
- scheduling refers to the assignment and ordering of tasks to machines for execution

System Model – Definitions
- heterogeneity: differing execution characteristics
- homogeneity: the same execution characteristics
- oversubscribed: more tasks arriving than the system can execute immediately

System Model – Cluster Model
- clusters have multiple homogeneous nodes
- clusters are heterogeneous from each other
- nodes may have multiple multicore processors
- each node may only have one task running at a given time
  - avoids interference between tasks
- task assignments are done at node-level
- a task cannot be spread across two clusters

System Model – Workload Characteristics
- dynamically arriving tasks
- when a task arrives, scheduler obtains the following information:
  - arrival time
  - execution time
    - different times on different clusters (because of heterogeneity)
  - number of processing cores required
  - value function
- tasks are heterogeneous
- no pre-emption

System Model – Value Function
- each task has a value function
  - represents value of the task when it completes
  - value function may be different for each task
  - monotonically decreasing functions
- value functions can be fully described with four parameters
  - a constant starting value
  - after soft deadline, value decays linearly to a final value
  - after hard deadline, value drops to zero

Problem Statement
- we measure the performance of a scheduler in our environment as the sum of the value earned by completing tasks over a given amount of time
- goal of heuristics: maximize total sum of value earned over a given amount of time
  - improve performance of HPC systems
- main contribution
  - design, simulation, and analysis of resource allocation heuristics for task scheduling
  - heterogeneous HPC system with multiple clusters
  - tasks with associated value functions with soft and hard deadlines
  - each task executes in parallel over multiple cores

Mapping Event
- mapping event: when task assignment decision(s) are made
- trigger mapping event whenever:
  - a node becomes available, or
  - a task arrives
- during mapping event, all tasks that have not been reserved or have not started execution are considered mappable
- only makes task assignments that can start now
  - heuristic may or may not make reservations
[Figure: schedule diagram showing the unmapped tasks set and the tasks placed over time on the nodes of cluster 1 and cluster 2, with the current time marked]

Planned Heuristics
- four planned heuristics
  - EASY Backfilling
  - FCFS with Multiple Queues
  - Max-Max Value
  - Max-Max Value-Per-Resource
- submit to Metaheuristics International Conference (MIC 2015)
  - submission deadline: 2/6/15

Existing Work – Dr. Siegel’s Group
- focuses on utility of tasks
  - B. Khemka, R. Friese, L. D. Briceño, H. J. Siegel, A. A. Maciejewski, G. A. Koenig, C. Groer, G. Okonski, M. M. Hilton, R. Rambharos and S. Poole, “Utility Functions and Resource Management in an Oversubscribed Heterogeneous Computing Environment,” IEEE Transactions on Parallel and Distributed Systems, accepted 2014, to appear.
- another work that models stepped value functions
  - J-K Kim, S. Shivle, H. J. Siegel, A. A. Maciejewski, T. D. Braun, et al., “Dynamically Mapping Tasks with Priorities and Multiple Deadlines in a Heterogeneous Environment,” Journal of Parallel and Distributed Computing, vol. 67, no. 2, pp. 154-169, Feb. 2007.

Existing Work
- other parallel task scheduling techniques
  - EASY Backfilling
    - D. A. Lifka, “The ANL/IBM SP Scheduling System,” Proc. First Workshop Job Scheduling Strategies for Parallel Processing, pp. 295-303, 1995.
  - G. Sabin, R. Kettimuthu, A. Rajan and P. Sadayappan, “Scheduling of Parallel Jobs in a Heterogeneous Multi-Site Environment,” Job Scheduling Strategies for Parallel Processing, pp. 87-104, 2003.

Design of Parallel Simulator for Experiments
- extends existing serial simulator from Dr. Siegel’s group
  - modified to handle scheduling of parallel tasks
- created new modules
  - cluster class
    - has nodes within it
  - methods for obtaining parallel task information from workload trace
  - created a sleep task object to model idle time within each machine
- developed an algorithm to locate slots for parallel tasks within the area occupied by sleep tasks
- developed a method that picks the nodes that create the best packing (i.e., create the least future restrictions)

Workloads for Simulations
- will use Dr. Dror Feitelson’s Parallel Workload Trace to model the workload arrival
  - workload log from Curie Supercomputer in France (has 93,312 cores)
  - using last 10 months of data
- may use Downey’s model for execution time scaling

Future Work
- use simulator to implement and compare the planned heuristics
- running a post-mortem analysis
  - use a genetic algorithm to find a loose upper bound solution when we know in advance the arrival time and characteristics of all tasks
  - since scheduling is NP-hard it is hard to quantify the performance of heuristics
  - this analysis will give us a better metric to compare our results with

Thank You
Questions? Feedback?

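
The four-parameter value function from the System Model slides, and the "sum of value earned" performance measure from the Problem Statement, can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `ValueFunction`, `value_at`, and `total_value_earned` are hypothetical, time is assumed to be the task's completion time, and a task completing exactly at the hard deadline is assumed to still earn the final value. The slides do not specify these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueFunction:
    """Hypothetical container for the four parameters named on the slide."""
    start_value: float    # constant value earned up to the soft deadline
    final_value: float    # value reached at the hard deadline
    soft_deadline: float  # time after which the value starts to decay
    hard_deadline: float  # time after which the value is zero

    def value_at(self, completion_time: float) -> float:
        """Value earned if the task completes at completion_time."""
        if completion_time <= self.soft_deadline:
            return self.start_value
        if completion_time > self.hard_deadline:
            return 0.0
        # Linear decay from start_value to final_value between the deadlines.
        span = self.hard_deadline - self.soft_deadline
        fraction = (completion_time - self.soft_deadline) / span
        return self.start_value + fraction * (self.final_value - self.start_value)


def total_value_earned(completions):
    """Sum of value over (value_function, completion_time) pairs."""
    return sum(vf.value_at(t) for vf, t in completions)
```

For example, with a starting value of 10, a final value of 2, a soft deadline of 100, and a hard deadline of 200, a task that completes at time 150 earns a value of 6, and one that completes after time 200 earns nothing.
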
Back-up Slides

Packing Nodes Efficiently
- whenever an assignment is to be made, all heuristics pick the nodes that create the least amount of restrictions for future assignments
  - e.g., if task t8 needs 3 nodes, it will be assigned: n1, n2, n5
[Figure: schedule over time for nodes n1 to n5, with t8 placed on n1, n2, and n5 and the current time marked]

Heuristics – Overview
- EASY Backfilling
  - considers tasks in a first come first serve (FCFS) order
  - makes only one reservation for the first task that cannot fit on idle machines
  - backfills other tasks so that they do not delay the reservation
- FCFS with Multiple Queues
  - puts the tasks in three queues
  - takes 1, 4, and 8 tasks from the large, medium, and small queues respectively
  - assigns tasks if possible, and otherwise makes the earliest reservation for them
  - repeats until the queues are empty

Heuristics – Overview
- Max-Max Value
  - First phase: considering all tasks
    - for each task, determine the allocation choice that will earn it the highest value without delaying any place-holder task
    - if there are ties, pick the choice with the earlier completion time
  - Second phase: consider tasks from first phase
    - make assignment or a place-holder for the choice that earns the highest value
    - this assignment should not start execution after the start of the earliest place-holder task
  - repeat the two phases until no more tasks can be mapped
- Max-Max Value-Per-Resource
  - similar to Max-Max Value

Simulation Study
- to model real-world system environment
- experiments run on ISTeC Cray HPC System
- uses real workload traces as inputs
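
The EASY Backfilling steps listed in the Heuristics overview can be sketched for one mapping event on a single cluster of identical nodes. This is a simplified illustration, not the simulator's implementation: the names `Task` and `easy_backfill`, the representation of running tasks as (end time, node count) pairs, and the restriction to one cluster are all assumptions, and the sketch ignores value functions and the choice of which specific nodes to pack.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    nodes: int       # number of whole nodes the task needs
    runtime: float   # execution time on this (hypothetical) cluster


def easy_backfill(now, total_nodes, running, queue):
    """One mapping event of EASY backfilling on a single cluster.

    running: list of (end_time, nodes) pairs for tasks already executing.
    queue:   waiting tasks in FCFS (arrival) order.
    Returns (names of tasks started now, reservation), where reservation is
    (task name, reserved start time) or None if every task was started.
    """
    if any(t.nodes > total_nodes for t in queue):
        raise ValueError("a task needs more nodes than the cluster has")

    busy = list(running)
    free = total_nodes - sum(n for _, n in busy)
    pending = list(queue)
    started = []

    # Step 1: start tasks in FCFS order while the head of the queue fits.
    while pending and pending[0].nodes <= free:
        task = pending.pop(0)
        free -= task.nodes
        busy.append((now + task.runtime, task.nodes))
        started.append(task.name)
    if not pending:
        return started, None

    # Step 2: reserve the earliest time at which the blocked head task fits.
    head = pending.pop(0)
    available = free
    for end_time, n in sorted(busy):
        available += n
        if available >= head.nodes:
            shadow_time = end_time
            break
    spare = available - head.nodes  # nodes left over at the reserved start

    # Step 3: backfill later tasks only if they cannot delay the reservation.
    for task in pending:
        if task.nodes > free:
            continue
        if now + task.runtime <= shadow_time:
            free -= task.nodes  # finishes before the reservation starts
        elif task.nodes <= spare:
            free -= task.nodes  # runs past it, but only on spare nodes
            spare -= task.nodes
        else:
            continue
        started.append(task.name)
    return started, (head.name, shadow_time)
```

As a usage example, take a 6-node cluster at time 0 with one running task holding 3 nodes until time 10, and a queue of A (2 nodes, runtime 5), B (4 nodes, runtime 20), C (1 node, runtime 30), and D (1 node, runtime 2). A starts immediately, B receives the single reservation at time 10, C is backfilled onto a node that B will not need, and D waits because no node is idle.
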
