Towards a User-Mode Approach to Partitioned
Scheduling in the seL4 Microkernel
Mikael Åsberg and Thomas Nolte
MRTC/Mälardalen University
Outline
● Introduction
● Problem formulation
● General solution
● Contribution summary
● Background
● Related work
● Implementation
● Results
● Conclusion
Introduction
● Increasing software complexity = increased
integration
Shift to software complexity
Problem formulation
● Increasing software complexity implies a higher
degree of software integration
● This can be observed in the automotive industry (AUTOSAR)
● More unpredictable "interference" (difficult to
analyze) when software components execute together
General solution
● Solution: separate software into "partitions"
● Seems to be the way to go in industry: ARINC 653,
PikeOS, etc.
● Advocated by researchers in our community –
Secure Embedded L4 (seL4)
● Partitioning facilitates:
● Certification/verification
● Reusability (clear interfaces)
● Controlled interference between applications
● Isolates CPU/memory overruns in each partition
Contribution summary
● An implementation of a one-level partition
scheduler in the seL4 microkernel user-space
● Why:
● seL4 (currently) lacks proper time partitioning
● It’s a difficult trade-off between flexibility and
performance
● Investigate the performance in user space
● It's flexible, but performance is potentially poor
● The results will reveal whether a "verified" user-space
scheduler is reasonable to develop for seL4
Background
● Partitioned/hierarchical scheduling
● seL4: a fully verified microkernel (machine-checked proof of ~9000 LOC)
[Figure: seL4 system architecture. Trusted components (device driver, real-time application, …) and untrusted components (Linux, device driver, …) run on top of the seL4 microkernel, which runs on the hardware.]
Related work
● Partitioned/hierarchical scheduler
implementations
● Wang et al. (1999), Linux
● Oikawa et al. (1999), Linux
● Kim et al. (2000), SPIRIT-uKernel
● Regehr et al. (2001), Windows 2000
● …VxWorks, uC/OS-II, FreeRTOS…
● Yang et al. (2011), L4/Fiasco
● Verified scheduler implementations
● Muller et al. (2002/2004), Bossa/Linux
● Ha et al. (2004), DEOS kernel
● Åsberg et al. (2011), VxWorks
Implementation (1/3)
● Time partitioning in seL4:
● Privileged mode:
● + Good performance (fast scheduling decisions)
● - Flexibility (re-verification of the kernel)
● User mode:
● + Flexibility (re-verify a user-space module)
● - Bad performance (extra overhead of scheduling
decisions)
How bad is it?
Implementation (2/3)
● Scheduling in seL4: fixed-priority preemptive scheduling (FPPS) and round robin
● Proposed scheduling: EDF with periodic partitions
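The proposed policy can be illustrated with a minimal, tick-driven sketch of EDF over periodic partitions. This is not the authors' implementation: the names `Partition`, `budget`, and `simulate_edf` are hypothetical, each tick stands for one scheduler-resolution interval, and each partition's deadline is assumed to be the end of its current period.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    period: int         # ticks between releases (assumed: deadline = end of period)
    budget: int         # ticks of CPU time granted per period (assumed parameter)
    remaining: int = 0  # budget left in the current period
    deadline: int = 0   # absolute deadline of the current period


def simulate_edf(partitions, ticks):
    """Return the partition name (or "idle") chosen at each tick."""
    trace = []
    for now in range(ticks):
        # Release: replenish the budget at the start of each period.
        for p in partitions:
            if now % p.period == 0:
                p.remaining = p.budget
                p.deadline = now + p.period
        # EDF: among partitions with budget left, pick the earliest deadline.
        ready = [p for p in partitions if p.remaining > 0]
        if ready:
            chosen = min(ready, key=lambda p: p.deadline)
            chosen.remaining -= 1
            trace.append(chosen.name)
        else:
            trace.append("idle")
    return trace


parts = [Partition("A", period=4, budget=1), Partition("B", period=6, budget=2)]
trace = simulate_edf(parts, 12)
print(" ".join(trace))
```

With these two example partitions, A runs first at tick 0 because its deadline (4) is earlier than B's (6); B then consumes its two budget ticks, and the processor idles until the next release.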
Implementation (3/3)
● Implementation details:
● The scheduler is implemented as a user-space
thread
● Highest priority (root thread)
● Triggered by periodic interrupts relayed from the
seL4 kernel
● Time between interrupts is the scheduler resolution
● Scheduled threads are activated/deactivated
using seL4 thread-management API functions
● Complexity O(1) for release and deadline queues
(using bitmaps)
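One common way to get constant-time queue operations from a bitmap is sketched below, under stated assumptions: each queue has a fixed number of slots, bit i is set when slot i is occupied, and the earliest entry is the lowest set bit. The class `BitmapQueue` and its methods are hypothetical and not taken from the implementation. In Python the integer is unbounded, so the O(1) bound really applies to a fixed-width machine word, where finding the lowest set bit is typically a single instruction.

```python
class BitmapQueue:
    """Fixed-size set of slots; bit i set means slot i is occupied."""

    def __init__(self, size):
        self.size = size
        self.bits = 0

    def insert(self, slot):
        if not 0 <= slot < self.size:
            raise ValueError("slot out of range")
        self.bits |= 1 << slot

    def remove(self, slot):
        self.bits &= ~(1 << slot)

    def first(self):
        """Return the lowest occupied slot, or None if the queue is empty."""
        if self.bits == 0:
            return None
        # bits & -bits isolates the lowest set bit; bit_length gives its index + 1.
        return (self.bits & -self.bits).bit_length() - 1


queue = BitmapQueue(32)
for slot in (17, 5, 3):
    queue.insert(slot)
earliest = queue.first()  # lowest occupied slot
```

The trade-off of this design is that slots must map to a bounded range, for example release times quantized to the scheduler resolution.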
Results (1/3)
● Hardware/software setup:
● seL4 kernel (version 1.1)
● Emulated seL4 on QEMU (version 0.13.91)
● QEMU settings: Intel 533 MHz Pentium3 (Katmai,
model 7, stepping 3)
● Time measurements using RDTSC (x86 register)
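Converting RDTSC cycle counts to time is simple arithmetic, sketched here assuming the time-stamp counter advances at the nominal 533 MHz clock rate (an emulator need not guarantee this). The helper names `cycles_to_us` and `us_to_cycles` are hypothetical.

```python
CPU_HZ = 533_000_000  # nominal clock rate of the emulated Pentium III (assumption: TSC ticks at this rate)


def cycles_to_us(cycles, cpu_hz=CPU_HZ):
    """Convert a TSC cycle count to microseconds."""
    return cycles * 1_000_000 / cpu_hz


def us_to_cycles(us, cpu_hz=CPU_HZ):
    """Convert microseconds to the nearest whole number of TSC cycles."""
    return round(us * cpu_hz / 1_000_000)


one_second_us = cycles_to_us(533_000_000)
```

A measured overhead is then the difference between two counter readings passed through `cycles_to_us`.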
Results (2/3)
● Average scheduler overhead with 2-9 partitions:
● Scheduler invocation: ~213 µs
● seL4 context switch: ~109 µs
● Comparison to related work:
[...] ARM1176 (416 MHz) processor (with L4/Fiasco). The comparisons are summarized in Table 1. It is difficult to draw any final conclusions from our measurements. The comparisons we have made relate to general system overheads in the seL4 and L4/Fiasco kernels. Based on this, the overhead of the PS (without rollback) does not seem overwhelming, i.e., this overhead is at least not orders of magnitude larger than general system overheads in the seL4 and L4/Fiasco kernels.
Table 1: Overhead comparison.
Measurement          Platform                      Time (µs)
PS (with rollback)   Intel P3 533 MHz (seL4)       346
PS                   Intel P3 533 MHz (seL4)       213
Context switch       Intel P3 533 MHz (seL4)       109
Set timer [40]       AMD 2 GHz (L4/Fiasco)         236
System call [8]      ARM-A8 800 MHz (seL4)         20
Int. delivery [8]    ARM-A8 800 MHz (seL4)         59/318
IPC [16]             ARM-11 416 MHz (L4/Fiasco)    35/54
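Only the three Intel P3 533 MHz (seL4) rows of Table 1 were measured on the same platform, so they are the only ones that can be compared directly. A small worked calculation of those ratios, using just the table values (the dictionary and the `ratio` helper are illustrative names):

```python
# Same-platform overheads from Table 1, in microseconds.
overhead_us = {
    "PS (with rollback)": 346,
    "PS": 213,
    "Context switch": 109,
}


def ratio(a, b):
    """How many times larger overhead a is than overhead b."""
    return overhead_us[a] / overhead_us[b]


ps_vs_cs = ratio("PS", "Context switch")
rollback_vs_cs = ratio("PS (with rollback)", "Context switch")
print(f"PS / context switch: {ps_vs_cs:.2f}")
print(f"PS (with rollback) / context switch: {rollback_vs_cs:.2f}")
```

The remaining rows come from different processors and kernels, so ratios against them are only indicative.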
Results (3/3)
[Figure 9: Execution trace of the PS (with rollback) and a context switch in seL4. Rows: task3, PS, cs, idle; horizontal axis from 0 to 40.]
[Figure 10: Execution trace of a set of threads scheduled by the PS (with rollback) in seL4. Rows: task1, task2, task3, PS, cs; horizontal axis from 0 to 95000.]
Conclusion
● Is the scheduler performance bad, i.e., is there too much
overhead?
● Well, it's not "much" larger than related
overheads, e.g., context switches, system calls,
interrupt latency, etc.
● Roughly 2x an seL4 context switch, 4x an IPC call, and 2.5x the interrupt
latency in a closed system (limitations on API calls)
● If further optimizations could squeeze down the
overhead a bit, it could be a promising
approach
● Future work:
● Optimize the implementation
● Perhaps develop a verified version
Thank you!