Willow: A User-­‐Programmable SSD Sudharsan Seshadri, Mark Gahagan, Sundaram Bhaskaran, Trevor Bunker, Arup De, Yanqin Jin, Yang Liu, and Steven Swanson Non-­‐VolaDle Systems Laboratory Computer Science and Engineering University of California, San Diego 1 PCIe-­‐PCM (2010) PCIe-­‐PCM (2015?) Bandwidth Rela;ve to disk 1000 640x à 2.2x/yr PCIe-­‐Flash (2012) 100 PCIe-­‐Flash (2007) 866à 2.3x/yr 10 Hard Drives (2006) 1 1 10 100 1/Latency Rela;ve To Disk 2 1000 Case Study: Programmable GPUs Hidden Programmability (Firmware) ParDal Programmability (Shaders) Full blown Programmability 3 Modern SSDs Hide Their Programmability Host SATA/NVMe Block Driver • Fixed interface – SATA or NVMe – Storage-­‐centric operaDons Read() Write() SATA/NVMe • Flexible hardware – MulD-­‐core processors – Complex firmware CPU CPU CPU CPU NV Memory SSD 4 Candidates for Near-Storage Compute • Data-­‐intensive computaDon – Database scans – Transcoding – AnalyDcs Modern SSD processors are inadequate • Data-­‐dependent accesses – e.g. pointer chasing Feasible on modern SSD processors • SemanDc extension – e.g. transacDons • Privileged execuDon – e.g. OS offload 5 OS Bypass [Zhang’12] OS Offload [Caulfield’12] [Bhaskaran’13] [Saxena’12] [Caulfield’12] Virtualization Caching Specialized SSDs Transaction Support Query Processing [Kang’13] [Coburn’13] [Prabhakaran’08] [Wang’14] [De’13] Novel IO Abstractions Key Value Store 6 [Do’13] [Kang’13] [Balakrishnan’12] [Huang’12] [Josephson’10] A Programmable SSD Should… • Provide a flexible interface – New arguments, semanDcs, and operaDons – Programmable in C (or something beher) • Enforce file system permissions • Allow execuDon of untrusted code • Allow mulDple specialized funcDons to coexist • Allow for reuse and sharing of funcDons between applicaDons • Allow applicaDons to invoke operaDons without a system call. • Be able to run trusted code – The OS can delegate operaDons to Willow – Untrusted applicaDons can to invoke them. 7 Willow System Overview Host Willow Application 4 GB/s Custom Kernel Code Willow Driver PCIe Ctrl PCIe ~2 GB/s Interconnect Interconnect Interconnect SPU SPU SPU Bridge Trusted Custom Firmware Code Emulated PCM 8 Trusted Custom Firmware Code Emulated PCM Trusted Custom Firmware Code Emulated PCM The Willow Processor Complex SPU 32 KB DMem Streamer Interface Streamer 9 Interconnect MIPS Pipeline • 125 MHz MIPS processor • 32 KB of D-­‐ and I-­‐mem • A bank of NVM • Network interface • High-­‐bandwidth Data Streamer 32 KB IMem Emulated PCM Willow Usage Model and SSD Apps • The programmer creates an “SSD App” • The kernel installs “SSDApps” for applicaDons App Module Interconnect Willow SSDApp CPU NVM 10 Interconnect Willow SSDApp CPU Streamer • CommunicaDon via RPCs • Host and SSD code can send and receive RPCs Kernel Streamer – The Willow-­‐resident code – A userspace library – A kernel module, if needed Process SSDApp Library NVM Trust and Protection Process Process Kernel Interconnect Interconnect NVM 11 Willow CPU Streamer Willow CPU Streamer • A file system sets protecDon policy • RPCs carry an unforgeable ProcessID • ExecuDon at SPUs is always on behalf of a ProcessID • The Willow driver installs access rights • Willow firmware checks permissions on access NVM Willow Case Studies • • • • • • Basic IO Standard Equipment Direct IO Caching TransacDon processing Key-­‐Value Store File Append w/o the file system 12 TransacDon AcceleraDon with MARS [SOSP’13] 13 Editable Atomic Writes in Willow LogWrite(bufA,addrA,lenA,logAddrA); LogWrite(bufB,addrB,lenB,logAddrB); LogWrite(bufC,addrC,lenC,logAddrC); Commit(); Host Memory Willow Metadata File A B A B C Log File C Data File 14 Performance Benefits On TPCB Transac;ons Per Second 25000 20000 1.5x 15000 ARIES-­‐DirectIO 10000 MARS MARS-­‐Willow 5000 0 1 2 4 Thread Count 15 8 16 Observations and Limitations • SSD App development is relaDvely easy • Composability of SSD Apps is very valuable • Striping data across SPUs increases complexity for some SSD Apps • Limited instrucDon and data storage at SPUs is a persistent challenge 16 The time is ripe for programmable storage • Fast NVMs increase storage flexibility and performance demands • ExisDng SSDs are already “sopware defined” • Numerous applicaDons already exist • Willow provides a clean, flexible interface – Smooth integraDon with exisDng sopware – Powerful enough for complex applicaDons – Preserves file system protecDons • Programmable storage can simplify and accelerate applicaDons 17 Thanks! 18