Multi-core Real-Time Scheduling for Generalized Parallel Task Models

Abusayeed Saifullah, Kunal Agrawal, Chenyang Lu, Christopher Gill

Real-Time Systems on Multi-core
- Traditional multiprocessor scheduling
  - Focuses on inter-task parallelism
  - Mostly restricted to sequential task models
- Computation-intensive complex real-time tasks are growing
  - Video surveillance
  - Radar tracking
  - Hybrid real-time structural testing
- Multi-core processors provide an opportunity to schedule computation-intensive tasks in real-time
  - Most of the tasks exhibit intra-task parallelism
  - Real-time systems need to be developed to exploit intra-task parallelism

Parallel Task Model
- Synchronous task model
  - Parallel threads form a segment
  - Threads of each segment synchronize at the end of the segment
- [Figure: a task with five segments (Segment 1 to Segment 5). Each horizontal bar indicates a thread of execution (a sequence of instructions). The threads of Segment 1 synchronize at the end of Segment 1.]
- Lakshmanan et al. (RTSS ’10) have addressed a restricted synchronous model where
  - A task is an alternating sequence of parallel and sequential segments
  - All parallel segments have an equal number of threads
  - The total number of threads in each segment ≤ number of cores

Our Contributions
- We address a general synchronous parallel task model
  - Different segments may have different numbers of threads
  - Each segment can have an arbitrary number of threads
- Example: such tasks are generated by
  - Parallel for loops in OpenMP, CilkPlus
  - Barrier primitives in thread libraries
- This model is more portable
  - The same program can execute on machines with different numbers of cores

A Task Example

    void parallel_task(float *a, float *b, float *c, float *d)
    {
        int n = 7;
        int i = 0;
        /* first parallel segment: 7 parallel iterations */
        parallel_for (; i < n; i++)
            c[i] = a[i] + b[i];

        n = 4;
        i = 0;
        /* second parallel segment: 4 parallel iterations */
        parallel_for (; i < n; i++)
            d[i] = a[i] - b[i];
    }

Our Contributions (contd.)
- We propose a task decomposition for the general synchronous parallel task model
  - Decomposes each parallel task into a set of sequential subtasks
  - Subtasks are scheduled like traditional tasks
- Why decomposition?
  - We can exploit the rich literature on multiprocessor scheduling
  - The proposed decomposition ensures that if the decomposed tasks are schedulable, the original task set is also schedulable

Our Contributions (contd.)
- We analyze schedulability in terms of a processor speed augmentation bound
  - Speed augmentation bound ν for an algorithm A: if an optimal algorithm can schedule a synchronous parallel task set on unit-speed processor cores, then A can schedule the decomposed tasks on ν-speed processor cores
- We prove that the proposed decomposition requires a speed augmentation of at most
  - 4 for Global Earliest Deadline First (G-EDF) scheduling
  - 5 for Partitioned Deadline Monotonic (P-DM) scheduling

Overview of a Task Decomposition
- Each thread of the task becomes an individual task with
  - An intermediate subdeadline
  - A release offset to retain precedence relations in the original task
- Deadlines are assigned by distributing slack among segments
  - Deadline of a thread = execution requirement + assigned slack

Slack Distribution
- How much slack a segment demands depends on
  - Available slack of the task
  - Execution requirement of the segment
- Execution requirement of a segment is the product of
  - Total number of parallel threads in the segment, and
  - Execution requirement of each thread in the segment
- Larger execution requirement implies more demand for slack
  - In the figure, Segment 1 requires more slack than Segment 2

Slack Distribution (contd.)
- We use the following principle to distribute slack
  - All segments that receive slack will achieve an equal density
- Density of a task:
  $\text{density} = \dfrac{\text{execution requirement}}{\text{deadline}}$
- Density of a segment S:
  $\text{density}(S) = \dfrac{(\text{total threads in } S) \times (\text{exec. req. of a thread})}{\text{assigned deadline}}$
- Reasons to equalize the density among segments
  - Fairness: the deadline of each segment becomes proportional to its execution requirement
  - We can bound the density of the decomposed tasks
  - We can exploit existing density-based analyses for multiprocessors

Slack Distribution (contd.)
- Slack of each segment is determined by solving the equalities
  - Sum of subdeadlines = task deadline (total assigned slack = task slack)
  - Density of Segment 1 = density of Segment 2 = … and so on
- All threads in a segment have the same deadline and offset
  - Deadline = execution requirement of the thread + segment slack
  - Release offset = sum of the deadlines of the preceding segments

An Example of Task Decomposition
- Segment 1: deadline = 20, density = (5*4)/20 = 1
- Segment 2: deadline = 4, density = (2*2)/4 = 1
- Segment 3: deadline = 9, density = (3*3)/9 = 1
- Segment 4: deadline = 16, density = (4*4)/16 = 1
- Segment 5: deadline = 3, density = (1*3)/3 = 1
- All segments have an equal density!

Global EDF (G-EDF) Schedulability
- A sufficient condition for G-EDF scheduling on m unit-speed cores [Baruah RTSS ’07]
  $\delta_{sum} \le m - (m - 1)\,\delta_{max}$
  where $\delta_{sum}$ is the total density and $\delta_{max}$ is the maximum density
- A necessary condition for any task set under any scheduler
  $u_{sum} \le m$
  where $u_{sum}$ is the total utilization
- Using the density bounds for decomposed tasks
  - If the original task set is schedulable by any scheduler on m unit-speed cores, the decomposed tasks are schedulable under G-EDF on 4-speed cores

Partitioned DM (P-DM) Schedulability
- FBB-FFD (Fisher Baruah Baker – First-Fit Decreasing) is a well-known P-DM scheduler [ECRTS ’06]
- A sufficient condition for FBB-FFD scheduling on m unit-speed cores
  $\dfrac{\text{load} + u_{sum} - \delta_{max}}{1 - \delta_{max}} \le m$
  where load is the maximum cumulative execution requirement of tasks divided by the length of the time interval
- A necessary condition for any scheduler
  $u_{sum} \le m$
  where $u_{sum}$ is the total utilization
- Using load and density bounds for decomposed tasks
  - If the original task set is schedulable by any scheduler on m unit-speed cores, the decomposed tasks are FBB-FFD schedulable on 5-speed cores

Conclusion
- Multi-core processors provide opportunities to schedule computation-intensive tasks in real-time
- Real-time systems need to exploit intra-task parallelism
- We have addressed real-time scheduling for a generalized synchronous parallel task model
  - Different segments may have different numbers of threads
  - Each segment can have an arbitrary number of threads
- We have proposed a task decomposition that achieves
  - A processor-speed augmentation bound of 4 for Global EDF
  - A processor-speed augmentation bound of 5 for Partitioned DM
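
Worked sketch: equal-density slack distribution

The slack-distribution slides give two equalities (the subdeadlines sum to the task deadline, and all segments have equal density) but do not show the arithmetic. The Python sketch below is one way to work it through. It assumes every segment receives slack, which is a simplification, since the slides only say that segments "that receive slack" end up with equal density. Under that assumption each subdeadline is the task deadline scaled by the segment's share of the total execution requirement. The function name `decompose`, the dictionary keys, and the task deadline of 52 (the sum of the five subdeadlines in the example slide) are illustrative choices, not names taken from the slides.

```python
from fractions import Fraction


def decompose(segments, deadline):
    """Equal-density slack distribution (illustrative sketch).

    segments: list of (threads, exec_per_thread) pairs, in execution order.
    deadline: deadline of the whole parallel task.

    Assumption: every segment receives slack, so every segment ends up
    with the same density (total work / task deadline).
    """
    total_work = sum(m * e for m, e in segments)
    result = []
    offset = Fraction(0)
    for m, e in segments:
        work = m * e
        # Subdeadline proportional to the segment's execution requirement.
        subdeadline = Fraction(deadline) * work / total_work
        if subdeadline < e:
            # A thread cannot finish before its own execution requirement;
            # this simplified rule does not handle that case.
            raise ValueError("segment would receive negative slack")
        result.append({
            "threads": m,
            "exec": e,
            "subdeadline": subdeadline,       # shared by all threads of the segment
            "slack": subdeadline - e,         # deadline = exec + slack
            "offset": offset,                 # sum of preceding subdeadlines
            "density": Fraction(work) / subdeadline,
        })
        offset += subdeadline
    return result


if __name__ == "__main__":
    # (threads, exec per thread) read off the example slide: (5*4), (2*2), ...
    example = [(5, 4), (2, 2), (3, 3), (4, 4), (1, 3)]
    for number, seg in enumerate(decompose(example, 52), start=1):
        print(number, seg["subdeadline"], seg["offset"], seg["density"])
```

Run on the five segments of the example slide, this reproduces the subdeadlines 20, 4, 9, 16 and 3 with density 1 for every segment. Exact fractions are used so that the equal-density check is not affected by floating-point rounding.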
