The Case for Tiny Tasks in Compute Clusters

Kay Ousterhout*, Aurojit Panda*, Joshua Rosen*, Shivaram Venkataraman*, Reynold Xin*, Sylvia Ratnasamy*, Scott Shenker*+, Ion Stoica*
* UC Berkeley, + ICSI

Setting
• Frameworks (MapReduce, Spark, Dryad, …) break each job into tasks
• Tasks run in slots on cluster machines
[Figure: a job's tasks scheduled across 4 slots over time]

Use smaller tasks!
[Figure: today's long tasks vs. tiny tasks on 4 slots over time]

Outline: Why? How? Where?

Problem: Skew and Stragglers
• A contended machine? Data skew?
[Figure: timeline on 4 slots; one straggling task delays job completion]

Benefit: Handling of Skew and Stragglers
• Tiny tasks spread skewed work evenly across slots
• As much as a 5.2x reduction in job completion time!
[Figure: today's tasks vs. tiny tasks on 4 slots over time]

Problem: Batch and Interactive Sharing
• Clusters are forced to trade off utilization and responsiveness!
[Figure: a high-priority interactive job waits behind low-priority batch tasks on 4 slots]

Benefit: Improved Sharing
• High-priority tasks are not subject to long wait times!
[Figure: today's tasks vs. tiny tasks on 4 slots over time]

Benefits: Recap
(1) Straggler mitigation — cf. Mantri (OSDI '10), Scarlett (EuroSys '11), SkewTune (SIGMOD '12), Dolly (NSDI '13), …
(2) Improved sharing — cf. Quincy (SOSP '09), Amoeba (SOCC '12), …

How?

Schedule task
• Scheduling requirements: high throughput (millions of decisions per second), low latency (milliseconds)
• Solution: distributed scheduling (e.g., Sparrow)

Launch task
• Use an existing thread pool to launch tasks
• Cache task binaries
• Task launch = RPC time (<1 ms)

Read input data
• Smallest efficient file block size: 8 MB
• Distribute metadata (à la Flat Datacenter Storage, OSDI '12)

Execute task (+ read data for the next task)
• Tons of tiny transfers!
• Framework enables optimizations, e.g., …

How low can you go?
• 8 MB disk block: read input data + execute task ≈ 100's of milliseconds

Where?
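The distributed-scheduling approach named above (Sparrow) places a job's tasks by probing a small random sample of workers, as the backup slides describe: place m tasks on the least loaded of d·m probed slaves. A minimal sketch of that batch-sampling idea, in Python; the in-memory queue model and function names are illustrative assumptions, not Sparrow's actual implementation:

```python
import random

def schedule_job(num_tasks, worker_queues, d=2):
    """Sketch of Sparrow-style batch sampling: probe d * m workers,
    then place the job's m tasks on the m least-loaded probed workers."""
    # Probe d * m distinct workers chosen uniformly at random
    # (capped at the cluster size).
    num_probes = min(d * num_tasks, len(worker_queues))
    probed = random.sample(range(len(worker_queues)), num_probes)
    # Sort probed workers by current queue length (the probe response).
    probed.sort(key=lambda w: worker_queues[w])
    # Place one task on each of the m least-loaded probed workers.
    placements = probed[:num_tasks]
    for w in placements:
        worker_queues[w] += 1
    return placements

# Example: 8 workers with pre-existing queue lengths; a 2-task job
# sends 4 probes (d = 2) and lands on the two shortest probed queues.
queues = [3, 0, 5, 1, 4, 0, 2, 6]
print(schedule_job(2, queues))
```

Probing only d·m workers (rather than querying every machine) is what lets many independent schedulers sustain millions of placement decisions per second while keeping per-job latency in the milliseconds.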
Splitting Jobs into Tiny Tasks
[Figure: an original job with map tasks 1..N feeding reduce tasks for keys K1..Kn, alongside its tiny-tasks equivalent where the reduce phase for a single key K1 is split across many tiny tasks]

Splitting Large Tasks
• Aggregation trees – work for functions that are associative and commutative
• Framework-managed temporary state store
• Ultimately, need to allow a small number of large tasks

Conclusion
• Tiny tasks mitigate stragglers + improve sharing
• Techniques: distributed scheduling, launching tasks in an existing thread pool, distributed file metadata, pipelined task execution
• Questions? Find me or Shivaram.

Backup Slides

Benefit of Eliminating Stragglers
• Based on a Facebook trace
• 5.2x at the 95th percentile!
[Figure: CDF of improvement in job completion time (1x–10x), broken down by original job size: 1-9 tasks, 10-99 tasks, 100+ tasks]

Why Not Preemption?
• Preemption only handles sharing (not stragglers)
• Task migration is time consuming
• Tiny tasks improve fault tolerance

Dremel/Drill/Impala
• Similar goals and challenges (supporting short tasks)
• Dremel statically assigns tablets to machines; rebalances if the query dispatcher notices that a machine is processing a tablet slowly (no standard straggler mitigation)
• Most jobs expected to be interactive

Scheduling Throughput
• 10,000 machines × 16 cores/machine, 100-millisecond tasks
• Over 1 million task scheduling decisions per second

Sparrow: Technique
• Place m tasks on the least loaded of d·m slaves
• Example: a job of m = 2 tasks sends 4 probes (d = 2) from its scheduler to randomly chosen slaves
• More at tinyurl.com/sparrow-scheduler

Sparrow: Performance on TPC-H Workload
• Within 12% of offline optimal; median queuing delay of 8 ms
• More at tinyurl.com/sparrow-scheduler
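The aggregation-tree approach from "Splitting Large Tasks" can be sketched as follows: a large reduce is replaced by levels of tiny tasks, each combining a small group of partial results. This is only correct when the combine function is associative and commutative, since the tree changes the grouping and order of operations. Function names and the fanout parameter are illustrative assumptions:

```python
from functools import reduce
import operator

def tree_aggregate(values, combine, fanout=4):
    """Aggregate values with a tree of tiny tasks instead of one
    large reduce task. Correct only when `combine` is associative
    and commutative, since the tree regroups and reorders inputs."""
    level = list(values)
    while len(level) > 1:
        # Each group of `fanout` partial results becomes one tiny
        # reduce task; the next level aggregates their outputs.
        level = [reduce(combine, level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]

# Example: summation is associative and commutative, so the tree
# produces the same result as a single large reduce task.
print(tree_aggregate(range(100), operator.add))  # → 4950
```

For a non-commutative combine (e.g., string concatenation with arbitrary regrouping across skewed partitions), the slide's other options apply: a framework-managed temporary state store, or simply permitting a small number of large tasks.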