ECRTS'13 July 12, 2013

advertisement
A Fully Preemptive
Multiprocessor Semaphore Protocol for
Latency-Sensitive Real-Time Applications
ECRTS’13
July 12, 2013
Björn B. Brandenburg
bbb@mpi-sws.org
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A Rhetorical Question
On uniprocessors, why do we use
the priority inheritance protocol (PIP)
or the priority ceiling protocol (PCP)
instead of simple non-preemptive sections?
AUTOSAR Non-Preemptive Critical Section:
SuspendAllInterrupts(…);
// critical section
ResumeAllInterrupts(…);
MPI-SWS
Brandenburg
2
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
RT 101: Preemptive Synchronization Matters
uniprocessor, non-preemptive critical sections
release
deadline
unrelated, latency-sensitive
high-priority task
less time-critical
lower-priority tasks
long lower-priority
critical section (CS)
time
MPI-SWS
Brandenburg
3
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
RT 101: Preemptive Synchronization Matters
Deadline miss due to latency increase!
uniprocessor, non-preemptive critical sections
release
deadline
unrelated, latency-sensitive
high-priority task
less time-critical
lower-priority tasks
long lower-priority
critical section (CS)
time
Long non-preemptive critical section.
MPI-SWS
Brandenburg
4
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
RT 101: Preemptive Synchronization Matters
uniprocessor, with PIP
release
deadline
unrelated, latency-sensitive
high-priority task
long lowerpriority CS
less time-critical
lower-priority tasks
time
MPI-SWS
Brandenburg
5
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
RT 101: Preemptive Synchronization Matters
Latency-sensitive task
isolated from unrelated critical section!
uniprocessor, with PIP
release
deadline
unrelated, latency-sensitive
high-priority task
long lowerpriority CS
less time-critical
lower-priority tasks
time
Lower-priority critical section: fully preemptive execution.
MPI-SWS
Brandenburg
6
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
The Multiprocessor Case
What if we host the same workload on a multiprocessor?
partitioned multiprocessor scheduling
unrelated, latency-sensitive
high-priority task
less time-critical
lower-priority tasks
(on same core)
time
MPI-SWS
Brandenburg
7
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
No existing
real-time
semaphore
protocol
The Multiprocessor Case
for partitioned or clustered scheduling
isolates high-priority tasks from unrelated CSs.
partitioned multiprocessor scheduling
release
deadline
unrelated, latency-sensitive
high-priority task
less time-critical
lower-priority tasks
(on same core)
long lower-priority
critical section (CS)
time
MPCP, FMLP,
MPI-SWS
+
FMLP ,
Brandenburg
OMLP, …
8
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
This Paper
Independence preservation formalizes the idea that
“tasks should never be delayed by unrelated critical sections.”
MPI-SWS
Brandenburg
9
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
This Paper
Independence preservation formalizes the idea that
“tasks should never be delayed by unrelated critical sections.”
Independence preservation is
impossible without (limited) job migrations.
MPI-SWS
Brandenburg
10
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
This Paper
Independence preservation formalizes the idea that
“tasks should never be delayed by unrelated critical sections.”
Independence preservation is
impossible without (limited) job migrations.
First independence-preserving semaphore protocol
for clustered/partitioned scheduling; the protocol also has
asymptotically optimal blocking bounds.
MPI-SWS
Brandenburg
11
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Clustered JLFP Scheduling
Job-Level Fixed-Priority Scheduling (JLFP)
c … number of processors per cluster
m … number of processors (total)
J4
J3
J2
J1
J4
J3
Q4
Core 4
L2
Cache
L2
Cache
J1
J3
J1
Core 1
Core 2
Core 3
L2
Cache
Q1
Q3
Core 3
J2
Q2
Q2
Core 2
J4
Q1
Q1
Core 1
J2
Core 4
L2
Cache
Core 1
Core 2
Core 3
L2
Cache
Core 4
L2
Cache
Main Memory
Main Memory
partitioned scheduling
clustered scheduling
global scheduling
c=1
1≤c≤m
c=m
MPI-SWS
Brandenburg
Main Memory
12
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
This talk: Partitioned Fixed-Priority (P-FP) Scheduling
Clustered JLFP Scheduling
Job-Level Fixed-Priority Scheduling (JLFP)
c … number of processors per cluster
m … number of processors (total)
J4
J3
J2
J1
J4
J3
Q4
Core 4
L2
Cache
L2
Cache
J1
J3
J1
Core 1
Core 2
Core 3
L2
Cache
Q1
Q3
Core 3
J2
Q2
Q2
Core 2
J4
Q1
Q1
Core 1
J2
Core 4
L2
Cache
Core 1
Core 2
Core 3
L2
Cache
Core 4
L2
Cache
Main Memory
Main Memory
partitioned scheduling
clustered scheduling
global scheduling
c=1
1≤c≤m
c=m
MPI-SWS
Brandenburg
Main Memory
13
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Clustered JLFP Scheduling
Job-Level Fixed-Priority Scheduling (JLFP)
c … number of processors per cluster
m … number of processors (total)
J4
J3
J2
J1
J4
J3
Q4
Core 4
L2
Cache
L2
Cache
J1
J3
J1
Core 1
Core 2
Core 3
L2
Cache
Q1
Q3
Core 3
J2
Q2
Q2
Core 2
J4
Q1
Q1
Core 1
J2
Core 4
L2
Cache
Main Memory
Main Memory
partitioned scheduling
clustered scheduling
Core 1
Core 2
Core 3
L2
Cache
Core 4
L2
Cache
Main Memory
global scheduling
c=1
1≤c≤m
c=m
Task model: implicit-deadline sporadic tasks
(choice of deadline constraint irrelevant to results)
MPI-SWS
Brandenburg
14
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
Binary Semaphores
in POSIX
pthread_mutex_lock(…)
// critical section
pthread_mutex_unlock(…)
A blocked task suspends
& yields the processor.
MPI-SWS
Brandenburg
15
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
Binary Semaphores
in POSIX
Priority Inversion
A job should be
scheduled, but is not.
pthread_mutex_lock(…)
// critical section
pthread_mutex_unlock(…)
A blocked task suspends
& yields the processor.
MPI-SWS
PI-Blocking: increase in
worst-case response time
due to priority inversions.
Brandenburg
16
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
Binary Semaphores
in POSIX
Priority Inversion
A job should be
scheduled, but is not.
pthread_mutex_lock(…)
// critical section
pthread_mutex_unlock(…)
A blocked task suspends
& yields the processor.
PI-Blocking: increase in
worst-case response time
due to priority inversions.
Goal: bounded pi-blocking.
Bounded in terms of critical section lengths only!
MPI-SWS
Brandenburg
17
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
Binary Semaphores
in POSIX
Priority Inversion
A job should be
scheduled, but is not.
pthread_mutex_lock(…)
// critical section
pthread_mutex_unlock(…)
A blocked task suspends
& yields the processor.
PI-Blocking: increase in
worst-case response time
due to priority inversions.
Assumptions
➡ Unnested critical sections.
➡ Suspension-oblivious schedulability analysis.
MPI-SWS
Brandenburg
18
Part 1
Avoiding Delays due to
Unrelated Critical Sections
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Independence Preservation
(specific to s-oblivious analysis)
“Tasks should never be delayed by unrelated critical sections.”
MPI-SWS
Brandenburg
20
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Independence Preservation
(specific to s-oblivious analysis)
Let bi,q denote the maximum pi-blocking incurred by
task Ti due to requests for resource q.
Let Ni,q denote the maximum number of times that any
job of Ti accesses resource q.
Under an independence-preserving locking protocol,
if Ni,q = 0, then bi,q = 0.
“You only pay for what you use.”
MPI-SWS
Brandenburg
21
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Independence Preservation
(specific to s-oblivious analysis)
Let bi,q denote the maximum pi-blocking incurred by
task Ti due to requests for resource q.
Let Ni,q denote the maximum number of times that any
job of Ti accesses resource q.
Under an independence-preserving locking protocol,
if Ni,q = 0, then bi,q = 0.
Isolation useful for:
latency-sensitive
(if no
delay
be tolerated) or
“You workloads
only pay for
what
youcan
use.”
if low-priority tasks contain unknown or untrusted critical sections.
MPI-SWS
Brandenburg
22
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
real-time
locking protocol
MPI-SWS
=
progress
mechanism
Brandenburg
+
queue
structure
23
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
real-time
locking protocol
=
progress
mechanism
+
queue
structure
Ensure that a lock holder is scheduled
(while waiting tasks incur pi-blocking).
How to order conflicting critical sections
(e.g., priority queue, FIFO queues).
MPI-SWS
Brandenburg
24
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
real-time
locking protocol
=
partitioned scheduling
priority boosting
MPI-SWS
progress
mechanism
+
clustered scheduling
priority donation
Brandenburg
queue
structure
global scheduling
priority inheritance
25
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Priority boosting and Priority Donation:
lock-holding jobs have higher priority than non-lock-holding jobs
Real-Time Semaphore Protocols
➔ effectively non-preemptive ➔ not independence preserving
real-time
locking protocol
=
partitioned scheduling
priority boosting
MPI-SWS
progress
mechanism
+
clustered scheduling
priority donation
Brandenburg
queue
structure
global scheduling
priority inheritance
26
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Real-Time Semaphore Protocols
real-time
locking protocol
=
partitioned scheduling
priority boosting
progress
mechanism
queue
structure
+
clustered scheduling
priority donation
global scheduling
priority inheritance
Existing independence-preserving locking protocols:
Global PIP, Global FMLP, Global OMLP, …
MPI-SWS
Brandenburg
27
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Observation
Independence preservation + bounded priority inversion
requires intra-cluster job migrations.
J2
J1
Q3
Q4
Core 3
Core 4
J1
J3
Core 1
Q2
Q2
Core 2
J4
Q1
Q1
Core 1
L2
Cache
MPI-SWS
J4
J3
J2
Core 2
Core 3
L2
Cache
L2
Cache
Core 4
L2
Cache
Main Memory
Main Memory
partitioned scheduling
clustered scheduling
Brandenburg
28
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Intra-cluster: (temporarily) execute jobs on
processors/clusters they have not been assigned to.
Observation
Independence preservation + bounded priority inversion
requires intra-cluster job migrations.
J2
J1
Q3
Q4
Core 3
Core 4
J1
J3
Core 1
Q2
Q2
Core 2
J4
Q1
Q1
Core 1
L2
Cache
MPI-SWS
J4
J3
J2
Core 2
Core 3
L2
Cache
L2
Cache
Core 4
L2
Cache
Main Memory
Main Memory
partitioned scheduling
clustered scheduling
Brandenburg
29
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
T1
time
MPI-SWS
Brandenburg
30
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
T1
time
T2 starts executing critical section…
MPI-SWS
Brandenburg
31
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
Job of T1 is released.
T3
What to do with in-progress critical section?
T2
T1
time
MPI-SWS
Brandenburg
32
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 1: priority boosting (=let T2 continue).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
critical section
T1
time
MPI-SWS
Brandenburg
33
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 1: priority boosting (=let T2 continue).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
CS
pi-blocked
critical section
T1
time
MPI-SWS
Brandenburg
34
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 1: priority boosting (=let T2 continue).
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
CS
pi-blocked
Core 1
Benefit: T3 incurs only bounded pi-blocking, meets deadline.
T2
critical section
T1
time
MPI-SWS
Brandenburg
35
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 1: priority boosting (=let T2 continue).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
CS
pi-blocked
critical section
T1
time
Problem: T1 misses its deadline.
MPI-SWS
Brandenburg
36
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
Job of T1 is released.
T3
What to do with in-progress critical section?
T2
T1
time
MPI-SWS
Brandenburg
37
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 2: independence preservation (= preempt T2).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
T1
time
MPI-SWS
Brandenburg
38
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 2: independence preservation (= preempt T2).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
T1
time
Independence preservation: T1 meets its deadline.
MPI-SWS
Brandenburg
39
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 2: independence preservation (= preempt T2).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
pi-blocked
CS
T2
T1
time
MPI-SWS
Brandenburg
40
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Case 2: independence preservation (= preempt T2).
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
pi-blocked
CS
Core 1
Problem: T3 incurs “unbounded” pi-blocking, misses deadline!
T2
T1
time
MPI-SWS
Brandenburg
41
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Partitioned Scheduling with Migrations?
MPI-SWS
Brandenburg
42
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Partitioned Scheduling with Migrations?
Partitioned By Necessity
J1
J2
J3
J4
Q1
Q2
Q3
Q4
Core 1
Local
Memory
Core 2
Local
Memory
Core 3
Local
Memory
Core
4
E.g., SoC with heterogeneous
cores (ARM, PowerPC, x86, MIPS).
Local
Memory
Shared Memory
migrations infeasible
for lack of technical capability
MPI-SWS
Brandenburg
43
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Partitioned Scheduling with Migrations?
Partitioned By Necessity
J1
J2
J3
J4
Q1
Q2
Q3
Q4
Core 1
Local
Memory
Core 2
Local
Memory
Core 3
Local
Memory
Core
4
E.g., SoC with heterogeneous
cores (ARM, PowerPC, x86, MIPS).
Local
Memory
Shared Memory
migrations infeasible
for lack of technical capability
➔ independence preservation and bounded priority inversion
impossible to achieve!
MPI-SWS
Brandenburg
44
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Partitioned Scheduling with Migrations?
Partitioned By Choice
Partitioned By Necessity
J1
J4
J3
Q4
Q1
Q2
Q3
Q4
Core 2
Core 3
Core
4
Core 1
Core 2
Core 3
Core 4
Local
Memory
Local
Memory
Local
Memory
L2
Cache
L2
Cache
Main Memory
Shared Memory
migrations disallowed
but technically feasible
migrations infeasible
for lack of technical capability
MPI-SWS
J2
J1
J4
Q3
Local
Memory
J3
Q2
Q1
Core 1
J2
Brandenburg
45
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Occasional migrations not desirable, but possible!
Partitioned Scheduling with Migrations?
(Focus of this work.)
Partitioned By Choice
Partitioned By Necessity
J1
J4
J3
Q4
Q1
Q2
Q3
Q4
Core 2
Core 3
Core
4
Core 1
Core 2
Core 3
Core 4
Local
Memory
Local
Memory
Local
Memory
L2
Cache
L2
Cache
Main Memory
Shared Memory
migrations disallowed
but technically feasible
migrations infeasible
for lack of technical capability
MPI-SWS
J2
J1
J4
Q3
Local
Memory
J3
Q2
Q1
Core 1
J2
Brandenburg
46
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
Job of T1 is released.
T3
What to do with in-progress critical section?
T2
T1
time
MPI-SWS
Brandenburg
47
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
1) Ensure independence preservation (= preempt T2).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
T2
T1
time
Independence preservation: T1 meets its deadline.
MPI-SWS
Brandenburg
48
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
2) Ensure bounded pi-blocking (= schedule T2).
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
pi-blocked
CS
T2
T1
time
Easy fix: migrate T2 when T3 suspends.
MPI-SWS
Brandenburg
49
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Temporarily move T3 to Core 2…
Core 1
Core 2
three tasks, two cores, one resource, P-FP scheduling
T3
pi-blocked
CS
T2
T1
time
Easy fix: migrate T2 when T3 suspends.
MPI-SWS
Brandenburg
50
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Example: Job Migration is Necessary
Core 1
Core 2
three
tasks,
one resource,
Benefit:
T3 two
incurscores,
only bounded
pi-blocking,P-FP
meetsscheduling
deadline.
T3
pi-blocked
CS
T2
T1
time
Easy fix: migrate T2 when T3 suspends.
MPI-SWS
Brandenburg
51
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Theorem
Under non-global scheduling (c ≠ m), it is impossible
for a semaphore protocol to simultaneously
(i) prevent unbounded pi-blocking,
(ii) be independence-preserving, and
(iii) avoid inter-cluster job migrations.
Pick any two…
MPI-SWS
Brandenburg
52
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Combinations of Properties
Under non-global scheduling (c ≠ m), it is impossible
for a semaphore protocol to simultaneously
(i) prevent unbounded pi-blocking,
(ii) be independence-preserving, and
(iii) avoid inter-cluster job migrations.
(i) & (iii)
➡ MPCP, Part. FMLP, FMLP+, OMLP, …
(ii) & (iii)
➡ Applying PIP to partitioned scheduling (not sound!)
(i) & (ii)
➡ no such protocol known!
MPI-SWS
Brandenburg
53
Part 2
Independence Preservation
+
Asymptotically Optimal
PI-Blocking
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
High-Level Overview
real-time
locking protocol
MPI-SWS
=
progress
mechanism
Brandenburg
+
queue
structure
55
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
High-Level Overview
real-time
locking protocol
=
progress
mechanism
Must be
independence-preserving.
MPI-SWS
Brandenburg
+
queue
structure
Must ensure
asymptotic optimality.
56
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
High-Level Overview
real-time
locking protocol
=
progress
mechanism
+
Must be
independence-preserving.
queue
structure
Must ensure
asymptotic optimality.
Adopt intuition from example:
when lock holder is preempted,
migrate to blocked task’s processor.
MPI-SWS
Brandenburg
57
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Migratory Priority Inheritance
classic priority inheritance
inherit priority of blocked jobs
MPI-SWS
Brandenburg
58
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Migratory Priority Inheritance
classic priority inheritance
inherit priority of blocked jobs
+
“cluster inheritance”
inherit eligibility to execute
on assigned clusters
from blocked jobs
MPI-SWS
Brandenburg
59
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Jobs remain fully preemptive even in critical sections.
Migratory
Priority
Inheritance
➔ enables independence preservation
classic priority inheritance
inherit priority of blocked jobs
+
“cluster inheritance”
inherit eligibility to execute
on assigned clusters
from blocked jobs
MPI-SWS
Brandenburg
60
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
High-Level Overview
real-time
locking protocol
=
progress
mechanism
Must be
independence-preserving.
MPI-SWS
Brandenburg
+
queue
structure
Must ensure
asymptotic optimality.
61
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
High-Level Overview
real-time
locking protocol
=
progress
mechanism
Must be
independence-preserving.
+
queue
structure
Must ensure
asymptotic optimality.
Resolve (most) contention within clusters:
use a multi-level queue.
MPI-SWS
Brandenburg
62
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A 3-Level FIFO/FIFO/PRIO Queue
one 3-level queue for each resource
Cluster 1
…
shared
resource
Cluster 2
Cluster K
MPI-SWS
Brandenburg
63
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A 3-Level FIFO/FIFO/PRIO Queue
one 3-level queue for each resource
Cluster 1
Cluster 2
priority
PQ2 queue
…
shared
resource
priority
PQ1 queue
Cluster K
MPI-SWS
Brandenburg
priority
PQK queue
64
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A 3-Level FIFO/FIFO/PRIO Queue
one 3-level queue for each resource
Cluster 1
priority
PQ1 queue
FIFO Queue FQ1
priority
PQ2 queue
FIFO Queue FQ2
…
shared
resource
Cluster 2
Cluster K
FIFO Queue FQK
MPI-SWS
Brandenburg
priority
PQK queue
65
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A
3-Level
FIFO/FIFO/PRIO
Queue
Bounded length: at most c jobs (in each cluster).
one 3-level
queue
for each resource
(c = number
of cores
in cluster)
Cluster 1
priority
PQ1 queue
FIFO Queue FQ1
priority
PQ2 queue
FIFO Queue FQ2
…
shared
resource
Cluster 2
Cluster K
FIFO Queue FQK
MPI-SWS
Brandenburg
priority
PQK queue
66
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Priority queue used only if more than c jobs contend.
A 3-Level FIFO/FIFO/PRIO
Queue
(c = number of cores in cluster)
one 3-level queue for each resource
Cluster 1
priority
PQ1 queue
FIFO Queue FQ1
priority
PQ2 queue
FIFO Queue FQ2
…
shared
resource
Cluster 2
Cluster K
FIFO Queue FQK
MPI-SWS
Brandenburg
priority
PQK queue
67
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
A 3-Level FIFO/FIFO/PRIO Queue
one 3-level queue for each resource
Cluster 1
priority
PQ1 queue
FIFO Queue FQ1
FIFO Queue GQ
Cluster 2
priority
PQ2 queue
…
FIFO Queue FQ2
Cluster K
FIFO Queue FQK
MPI-SWS
Brandenburg
priority
PQK queue
68
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Global FIFO Queue resolves inter-cluster contention.
A
3-Level
FIFO/FIFO/PRIO
Queue
Bounded length: at most K = m / c jobs (one per cluster).
one 3-level queue for each resource
(m = total number of cores, c = number of cores per cluster)
Cluster 1
priority
PQ1 queue
FIFO Queue FQ1
FIFO Queue GQ
Cluster 2
priority
PQ2 queue
…
FIFO Queue FQ2
Cluster K
FIFO Queue FQK
MPI-SWS
Brandenburg
priority
PQK queue
69
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
The O(m) Independence-Preserving
Locking Protocol (OMIP)
The
OMIP
=
migratory priority
inheritance
independence-preserving
MPI-SWS
Brandenburg
+
3-level F/F/P
queue
O(m) s-oblivious
pi-blocking
70
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
The O(m) Independence-Preserving
Locking Protocol (OMIP)
The
OMIP
=
migratory priority
inheritance
independence-preserving
+
3-level F/F/P
queue
O(m) s-oblivious
pi-blocking
Ω(m) lower bound on s-oblivious pi-blocking (— & Anderson, 2010)
➔ The OMIP ensures asymptotically optimal s-oblivious pi-blocking.
MPI-SWS
Brandenburg
71
Part 3
Evaluation
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Prototype Implementation
Linux Testbed for Multiprocessor Scheduling in Real-Time Systems
www.litmus-rt.org
3-level queues
➡ easy (reuse Linux wait queues)
➡ cheap compared to syscall
Migratory priority inheritance
➡ more tricky (need to avoid global locks)
➡ store bitmap of cores “offering” to
schedule lock holder in each lock
MPI-SWS
Brandenburg
73
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Times Experiment
on an 8-core, 2-Ghz Xeon X7550 System
Core 1
Core 8
T1
T8
T9
…
T16
T17
T24
T25
T32
Setup
➡ 4 tasks on each core (one independent & latency-sensitive)
➡ one shared resource
➡ max. critical section length: ~1ms
MPI-SWS
Brandenburg
74
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
One latency-sensitive,
independent task with
period = 1ms.
on an 8-core, 2-Ghz Xeon X7550 System
Response Times Experiment
Core 1
Core 8
T1
T8
T9
…
T16
T17
T24
T25
T32
Setup
➡ 4 tasks on each core (one independent & latency-sensitive)
➡ one shared resource
➡ max. critical section length: ~1ms
MPI-SWS
Brandenburg
75
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Times Experiment
Three tasks (on each core) with
periods 25 ms, 100 ms, and 1000 ms.
on an 8-core,
Xeontasks
X7550
System
Each 2-Ghz
job of these
locks
the resource once.
Core 1
Core 8
T1
T8
T9
…
T16
T17
T24
T25
T32
Setup
➡ 4 tasks on each core (one independent & latency-sensitive)
➡ one shared resource
➡ max. critical section length: ~1ms
MPI-SWS
Brandenburg
76
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Times Experiment
on an 8-core, 2-Ghz Xeon X7550 System
Core 1
Core 8
T1
T8
T9
…
T16
T17
T24
T25
T32
MPI-SWS
Three Configurations
➡ No locks (unsound!)
‣ no blocking (baseline)
➡ Clustered OMLP
‣ priority donation
➡ OMIP
‣ migratory priority inheritance
Experiment
➡ Measured response times with
sched_trace
➡ 30-minute traces
➡ more than 45 million jobs
Brandenburg
77
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF
Fraction of jobs with
response time at most X
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
50.00%$
40.00%$
30.00%$
20.00%$
10.00%$
NONE$CDF$
higher is better
OMIP$CDF$
OMLP$CDF$
increasing response time
0.00%$
0$
MPI-SWS
0.1$
0.2$
0.3$
0.4$ 0.5$ 0.6$ 0.7$
response)*me)(in)ms)))
Brandenburg
0.8$
0.9$
1$
78
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF of 1-ms Tasks
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
OMLP
50.00%$
NONE$CDF$
OMIP &
NONE
(overlapping)
40.00%$
30.00%$
OMIP$CDF$
OMLP$CDF$
20.00%$
10.00%$
0.00%$
0$
MPI-SWS
0.1$
0.2$
0.3$
0.4$ 0.5$ 0.6$ 0.7$
response)*me)(in)ms)))
Brandenburg
0.8$
0.9$
1$
79
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF of 1-ms Tasks
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
NONE$CDF$
50.00%$
With priority donation (or priority
boosting), ~20% of the jobs of the
1ms-tasks miss their deadline.
40.00%$
30.00%$
OMIP$CDF$
OMLP$CDF$
20.00%$
10.00%$
0.00%$
0$
MPI-SWS
0.1$
0.2$
0.3$
0.4$ 0.5$ 0.6$ 0.7$
response)*me)(in)ms)))
Brandenburg
0.8$
0.9$
1$
80
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF of 1-ms Tasks
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
NONE$CDF$
50.00%$
Response time distribution under OMIP
equivalent to case without locks.
40.00%$
30.00%$
20.00%$
OMIP$CDF$
OMLP$CDF$
(OMIP & NONE curves overlap)
10.00%$
0.00%$
0$
MPI-SWS
0.1$
0.2$
0.3$
0.4$ 0.5$ 0.6$ 0.7$
response)*me)(in)ms)))
Brandenburg
0.8$
0.9$
1$
81
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF of 100-ms Tasks
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
NONE$CDF$
50.00%$
OMIP$CDF$
40.00%$
OMLP$CDF$
30.00%$
20.00%$
10.00%$
0.00%$
0$
MPI-SWS
10$
20$
30$
40$
50$
60$
70$
response)*me)(in)ms))
Brandenburg
80$
90$
100$
82
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
OMLP
Response Time CDF of 100-ms Tasks
NONE
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
OMIP
70.00%$
60.00%$
NONE$CDF$
50.00%$
OMIP$CDF$
40.00%$
OMLP$CDF$
30.00%$
20.00%$
10.00%$
0.00%$
0$
MPI-SWS
10$
20$
30$
40$
50$
60$
70$
response)*me)(in)ms))
Brandenburg
80$
90$
100$
83
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Response Time CDF of 100-ms Tasks
100.00%$
90.00%$
P(response)*me))≤)X))
80.00%$
70.00%$
60.00%$
NONE$CDF$
50.00%$
OMIP$CDF$
40.00%$
OMIP: blocking shifted to lowerpriority (= later-deadline) jobs.
30.00%$
20.00%$
OMLP$CDF$
10.00%$
0.00%$
0$
MPI-SWS
10$
20$
30$
40$
50$
60$
70$
response)*me)(in)ms))
Brandenburg
80$
90$
100$
84
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Analytical Blocking/Latency Tradeoff
Large-scale schedulability experiments
➡ Varied #tasks, #cores, #resources, max. critical section lengths, etc.
➡ >150,000,000 task sets
➡ 678 schedulability plots, available in online appendix
MPI-SWS
Brandenburg
85
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Analytical Blocking/Latency Tradeoff
Large-scale schedulability experiments
➡ Varied #tasks, #cores, #resources, max. critical section lengths, etc.
➡ >150,000,000 task sets
➡ 678 schedulability plots, available in online appendix
In the presence of latency-sensitive tasks,
the OMIP is generally the only viable option.
MPI-SWS
Brandenburg
86
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Analytical Blocking/Latency Tradeoff
Large-scale schedulability experiments
➡ Varied #tasks, #cores, #resources, max. critical section lengths, etc.
➡ >150,000,000 task sets
➡ 678 schedulability plots, available in online appendix
In the presence of latency-sensitive tasks,
the OMIP is generally the only viable option.
Without latency-sensitive tasks, the OMIP
does not offer substantial improvements.
MPI-SWS
Brandenburg
87
Conclusion
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Summary
Independence preservation formalizes the idea that
“tasks should not be delayed by unrelated critical sections.”
Independence preservation is
impossible without (limited) job migrations.
The OMIP is the first independence-preserving
semaphore protocol for clustered scheduling. It ensures
asymptotically optimal s-oblivious pi-blocking.
MPI-SWS
Brandenburg
89
Future Work
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Budget
Overruns
Nesting
Suspension-Aware Analysis
MPI-SWS
Brandenburg
90
Thanks!
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
SchedCAT
Schedulability test
Collection And Toolkit
Linux Testbed for Multiprocessor Scheduling in Real-Time Systems
www.mpi-sws.org/~bbb/
projects/schedcat
www.litmus-rt.org
MPI-SWS
Brandenburg
Appendix
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Design Inspirations
Migrate to Blocked Task’s CPU
➡ “Local helping” in TU Dresden’s Fiasco/L4
‣ Hohmuth & Peter (2001)
➡ Multiprocessor bandwidth inheritance (MBWI)
‣ Faggioli, Lipari, & Cucinotta (2010)
Queue Design
➡ Intra-cluster queues adopted from global OMLP
‣ — & Anderson (2010)
➡ Inter-cluster queues similar to clustered OMLP
‣ — & Anderson (2011)
MPI-SWS
Brandenburg
93
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
100.00%$
100.00%$
90.00%$
90.00%$
80.00%$
80.00%$
70.00%$
60.00%$
NONE$CDF$
50.00%$
OMIP$CDF$
40.00%$
OMLP$CDF$
30.00%$
P(response)*me))≤)X))
P(response)*me))≤)X))
What about overheads?
70.00%$
60.00%$
10.00%$
10.00%$
0.00%$
0.00%$
0.2$
0.3$
0.4$ 0.5$ 0.6$ 0.7$
response)*me)(in)ms)))
0.8$
0.9$
OMLP$CDF$
30.00%$
20.00%$
0.1$
OMIP$CDF$
40.00%$
20.00%$
0$
NONE$CDF$
50.00%$
1$
0$
10$
20$
30$
40$
50$
60$
70$
response)*me)(in)ms))
80$
90$
100$
Aren’t job migrations expensive?
➡ response time experiments reflect all overheads in real system
➡ latency-sensitive tasks do not migrate, only lower-priority tasks do
➡ only working set of critical section migrates (likely small), not
entire task working set (likely much larger)
➡ the critical section would have been preempted anyway
MPI-SWS
Brandenburg
94
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Blocking Analysis
FIFO Queue GQ
m … number of processors (total)
MPI-SWS
FIFO Queue FQ
PQ
priority
queue
c … number of processors per cluster
Brandenburg
95
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Blocking Analysis
FIFO Queue GQ
FIFO Queue FQ
at most
K-1=m/c-1
queued jobs
at most
c-1
queued jobs
m … number of processors (total)
MPI-SWS
PQ
priority
queue
c … number of processors per cluster
Brandenburg
96
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
Blocking Analysis
FIFO Queue GQ
FIFO Queue FQ
at most
K-1=m/c-1
queued jobs
at most
c-1
queued jobs
at most
m/c-1
blocking CS
at most
(c - 1) · (m / c)
blocking CS
m … number of processors (total)
MPI-SWS
PQ
priority
queue
c … number of processors per cluster
Brandenburg
97
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
At most
m / c - 1 + (c - 1) · (m / c) = m - 1 = O(m)
blocking critical sections.
Blocking Analysis
FIFO Queue GQ
FIFO Queue FQ
at most
K-1=m/c-1
queued jobs
at most
c-1
queued jobs
at most
m/c-1
blocking CS
at most
(c - 1) · (m / c)
blocking CS
m … number of processors (total)
MPI-SWS
PQ
priority
queue
c … number of processors per cluster
Brandenburg
98
A fully preemptive multiprocessor semaphore protocol for latency-sensitive real-time applications
At most
m / c - 1 + (c - 1) · (m / c) = m - 1 = O(m)
blocking critical sections.
Blocking Analysis
FIFO Queue GQ
at most
K-1=m/c-1
queued jobs
FIFO Queue FQ
PQ
priority
queue
at most
c-1
queued jobs
Under s-oblivious analysis:
at most O(m) critical sections
cause pi-blocking.
at most
m/c-1
blocking CS
m … number of processors (total)
MPI-SWS
at most
(c - 1) · (m / c)
blocking CS
c … number of processors per cluster
Brandenburg
99
Download