User-space I/O for μs-level storage devices

Anastasios Papagiannis, Giorgos Saloustros, Manolis Marazakis, and Angelos Bilas
Institute of Computer Science (ICS)
Foundation for Research and Technology – Hellas (FORTH)
Greece
Current Storage Technologies
DRAM: + Fast, + Byte-addressable, - Volatile
Disk / Flash: + Non-volatile, - Slow, - Block-addressable
Emerging Storage Technologies
NVM connectivity?
PCI-Express (10-100 μsec latency, DMA)
DIMM (nsec-scale latency, load/store interface, caching)
Linux I/O Path
I/O stack: Application → libC → VFS → File System → Block Layer → Device Drivers → Device
Software overheads become more pronounced with low-latency storage devices.
Software overheads in I/O Path
[Figure: request latency (log scale, µs) for Disk, Flash, and NVM, broken down into File System, Operating System, and Hardware components. Source: Moneta, MICRO'2010]
~ 20,000 instructions to issue and complete a 4 KB I/O request.
Our goal
How to design the I/O path to take full advantage of fast storage devices (NVM)?
Performance (latency, throughput) close to hardware spec’s
Scalable sharing
Strong protection
Outline of this talk
Motivation
Design
Protection features in modern server processors
Key-Value interface to storage
Evaluation
Conclusions
I/O Path: mostly in kernel-space (via system calls)
Hardware access in user-space is restricted, for security and isolation
How to allow applications to use hardware features?
Key insight from the Dune prototype (OSDI’2012)
"Use hardware-assisted virtualization to provide a process, rather than a machine abstraction, to safely and efficiently expose privileged hardware features to user programs while preserving standard OS abstractions."
Protection features in processors: privilege rings
Protection features in processors: virtualization
System calls vs. Hypercalls
[Diagram: syscall and hypercall paths from a process into the kernel — system call handlers in host mode, a system call interposer in guest mode — both running on the processor.]
Hypercalls are normally used by the kernel running in a VM (a minimal sketch of issuing one follows below)
VMEXIT + VMENTRY sequences (save/restore the state of host and guest)
Memory accesses: protected by EPT
Privileged state: managed/protected by VT-x
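As a rough illustration of the mechanism (not Iris's actual ABI), a hypercall can be issued from guest mode with the vmcall instruction. The register convention below (number in rax, one argument in rdi, result in rax) is an assumption made only for this sketch, and the process is assumed to already run in VMX non-root mode under a Dune-style module.

```c
/* Hedged sketch: issue a hypercall from guest mode via vmcall.
 * The register convention is an assumed ABI for illustration; the real
 * one is defined by the kernel module. Executing vmcall causes a VMEXIT
 * into the host-mode handler, which services the request and resumes
 * the guest (VMENTRY). */
static inline long hypercall1(long nr, long arg0)
{
    long ret;
    asm volatile("vmcall"
                 : "=a"(ret)
                 : "a"(nr), "D"(arg0)
                 : "memory");
    return ret;
}
```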
I/O path: From user-space to device (via hypercall)
[Diagram: an I/O request travels from the user-space application, via a hypercall handled by the Dune module, down to the device.]
I/O Interposer
Common I/O path: intercepts block accesses and serves them from the key-value store
User-space library (LD_PRELOAD), see the sketch below
Translates read/write system calls into key-value calls (get/put)
Allows running unmodified applications
Maps each file to a set of key-value pairs
Maintains in-memory metadata
Forwards the remaining I/O-related system calls to the Linux kernel
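To make the mechanism concrete, here is a minimal sketch of such an LD_PRELOAD interposer. The iris_get/iris_put/iris_owns_fd calls are hypothetical placeholders (not the library's real API), and the real interposer also tracks file offsets and per-file metadata and covers many more entry points (open, lseek, pread, ...).

```c
/* Minimal LD_PRELOAD interposer sketch (hypothetical iris_* helpers). */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <unistd.h>

extern ssize_t iris_get(int fd, void *buf, size_t count);        /* hypothetical */
extern ssize_t iris_put(int fd, const void *buf, size_t count);  /* hypothetical */
extern int     iris_owns_fd(int fd);                             /* hypothetical */

ssize_t read(int fd, void *buf, size_t count)
{
    static ssize_t (*real_read)(int, void *, size_t);
    if (!real_read)
        real_read = (ssize_t (*)(int, void *, size_t))dlsym(RTLD_NEXT, "read");
    if (iris_owns_fd(fd))
        return iris_get(fd, buf, count);   /* serve from the key-value store */
    return real_read(fd, buf, count);      /* forward to the Linux kernel */
}

ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);
    if (!real_write)
        real_write = (ssize_t (*)(int, const void *, size_t))dlsym(RTLD_NEXT, "write");
    if (iris_owns_fd(fd))
        return iris_put(fd, buf, count);   /* serve from the key-value store */
    return real_write(fd, buf, count);     /* forward to the Linux kernel */
}
```

Built as a shared object, such a library would be loaded with LD_PRELOAD (e.g. LD_PRELOAD=./libiris.so ./app, names hypothetical), so unmodified applications pick it up.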
Iris kernel
Protected access to the key-value store
Get/Put API for using the key-value store
Permission checks and updates (see the sketch below)
Leverages Intel VT-x to run in a more privileged domain than user-space applications
Enters kernel-space only for initialization and coarse-grain file operations
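The point is that permission checks run in this more privileged domain rather than in the untrusted application. The sketch below illustrates that idea only; the iris_cred/iris_file types and the kv_lookup helper are hypothetical, and the policy is deliberately simplified.

```c
/* Hedged sketch: Get path with a permission check in the Iris kernel.
 * iris_cred, iris_file, and kv_lookup are hypothetical placeholders. */
#include <stddef.h>

struct iris_cred { unsigned int uid; };
struct iris_file { unsigned int owner_uid; unsigned int mode; };

/* hypothetical lookup inside the storage engine */
const void *kv_lookup(const void *key, size_t key_len, size_t *val_len);

const void *iris_do_get(const struct iris_cred *cred, const struct iris_file *f,
                        const void *key, size_t key_len, size_t *val_len)
{
    /* simplified policy: owner or world-readable */
    if (cred->uid != f->owner_uid && !(f->mode & 0004))
        return NULL;                          /* permission denied */
    return kv_lookup(key, key_len, val_len);  /* serve the Get */
}
```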
Iris Key-Value Store
Storage engine (based on a Bε-tree)
Atomicity and reliability guarantees (CoW update protocol)
Direct access to storage hardware:
  DMA API (for PCIe-attached storage)
  Load/Store instructions (for memory bus-attached storage), see the sketch below
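On the load/store path, durability has to be enforced explicitly with cache-line flushes and store fences. The sketch below shows a copy-on-write style update under those assumptions; it is not Iris's actual Bε-tree update protocol, and clflush/sfence are used here only as lowest-common-denominator persistence primitives.

```c
/* Hedged sketch: CoW-style update persisted over the load/store path.
 * Assumes the pointers target a mapped NVM region and that
 * clflush + sfence are sufficient for durability on the platform. */
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>   /* _mm_clflush, _mm_sfence */

#define CACHELINE 64

static void nvm_persist(const void *addr, size_t len)
{
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    for (; p < (uintptr_t)addr + len; p += CACHELINE)
        _mm_clflush((const void *)p);     /* push dirty lines out to NVM */
    _mm_sfence();                         /* order flushes before later stores */
}

/* Write the new version out of place, persist it, then persist the
 * pointer switch; readers never observe a partially written value. */
void cow_update(void **slot, void *new_block, const void *val, size_t len)
{
    memcpy(new_block, val, len);
    nvm_persist(new_block, len);          /* new version is durable */
    *slot = new_block;                    /* atomic pointer-sized store on x86 */
    nvm_persist(slot, sizeof *slot);      /* make the switch itself durable */
}
```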
Evaluation: Testbed
2 x Intel Xeon E5620 processors (2x4 cores)
DRAM: 24 GB, of which 8 GB are used for PCM emulation (PMBD driver – MSST'2014)
Microbenchmark: FIO (a sketch of the resulting access pattern follows below)
  Block size: 512 bytes
  Queue depth: 1
  Direct I/O (bypassing the Linux page cache)
  I/O-issuing threads: 1-4
Comparison with EXT4 and XFS
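For concreteness, the sketch below reproduces the access pattern this FIO configuration generates: 512-byte random reads at queue depth 1 with O_DIRECT. The device path and request count are placeholders; the reported measurements were taken with FIO itself.

```c
/* Hedged sketch of the benchmark access pattern: 512 B random direct
 * reads, one outstanding request. "/dev/pmbd0" and the request count
 * are placeholders. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t bs = 512;
    const off_t dev_size = 8LL << 30;                    /* 8 GB emulated PCM */
    int fd = open("/dev/pmbd0", O_RDONLY | O_DIRECT);    /* placeholder device */
    void *buf;

    if (fd < 0 || posix_memalign(&buf, bs, bs) != 0)     /* O_DIRECT: aligned buffer */
        return 1;

    for (long i = 0; i < 1000000; i++) {
        off_t off = (off_t)(rand() % (dev_size / bs)) * bs;
        pread(fd, buf, bs, off);                         /* queue depth 1 */
    }

    free(buf);
    close(fd);
    return 0;
}
```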
Evaluation: single-thread IOPS performance
KIOPS   Read   Write
ext4     269     203
xfs      261     199
Iris     445     439
Iris: 1.7x read IOPS, 2.2x write IOPS
I/O latency on the order of a few microseconds (reads: 2.24 – 3.8 µs, writes: 2.27 – 5.02 µs)
Evaluation: Read and Write IOPS
Iris: 400 KIOPS per core (2x improvement)
Conclusions
I/O path providing direct access to fast storage devices
Key-value store for keeping data and metadata
Minimizing software overheads
Without sacrificing strong protection
Guarantees atomicity and reliability
Intel VT-x to provide a protected access interface
Encouraging preliminary evaluation results:
  400 KIOPS per core
  ~2x the IOPS of ext4 and xfs (random, small-sized requests)
Questions?
Manolis Marazakis
Institute of Computer Science, FORTH – Heraklion, Greece
E-mail: maraz@ics.forth.gr
Web: http://www.ics.forth.gr/carv
Acknowledgements: EU FETHPC Grant 671553
http://www.exanest.eu
https://twitter.com/exanest_h2020
Backup Slides
PMBD
Hybrid architecture:
  Physical: NVM DIMMs attached to the memory bus
  Logical: NVM exposed as a block device to the OS
Evaluation
[Figure: random read IOPS (4K requests, 1 outstanding request), in millions, vs. number of threads (1-128), comparing XFS, EXT4[j], EXT4[o], EXT4[w], and Tucana; annotated improvement: 2.4x.]
CARV Laboratory @ FORTH (Heraklion, Greece)