Storage Management in Virtualized Cloud Environments
Sankaran Sivathanu, Ling Liu, Mei Yiduo and Xing Pu
Student Workshop on Frontiers of Cloud Computing, IBM 2010

Talk Outline
• Introduction
• Measurement results & Observations
  – Data Placement & Provisioning
  – Workload Interference
  – Impacts of Virtualization
• Summary

Cloud & Virtualization
• Cloud Environment: Goals
  – Flexibility in resource configuration
  – Maximum resource utilization
  – Pay-per-use model
• Virtualization: Benefits
  – Resource consolidation
  – Re-structuring flexibility
  – Separate protection domains
• Virtualization serves as one of the basic foundations of cloud infrastructures

Fundamental Issues
• Cloud Service Providers (CSPs) vs. Customers
  – Customers purchase computing resources
  – CSPs provide virtual resources (VMs)
  – Customers perceive their resources as physical machines!
• Multiple VMs reside on a single physical host
  – Resource interference
  – End-user performance depends on other users
• End-users are unaware of where their data physically resides

Goals of our Measurement
• For cloud service providers
  – How to place data such that end-user performance is maximized?
  – How to co-locate workloads for least interference?
• For end-users
  – How to purchase resources in tune with requirements?
  – How to tune applications for maximum performance?
• General insights on storage I/O in virtualized environments

Benchmarks Used
• Postmark
  – Mail server workload
  – Create/Delete, Read/Append files
  – Parameters
    • File size
    • Number of files
    • Read/Write ratio
• Synthetic Workload
  – Sequential vs. random accesses
  – Zipf distribution

Data Provisioning & Placement

Disk Provisioning
Consider a 100 GB disk
Workload data footprint: ~150 MB
Case I / Case II
Partition size: 40 GB / 4 GB
Throughput: 2.1 MB/s / 1.4 MB/s
Performance difference: 33%

Where to place VM disk?
• Postmark benchmark
  – Read operation
• Cases:
  – Read from physical partitions in different zones
    • Based on LBNs
    • LBNs start from the inner zone and proceed towards the outer zones
  – Read from disk file (.vmdk)

Where to place multiple VM disks?
• Postmark benchmark
  – 2 instances (1 for each VM)
• Random reads
• Compare physical partitions placed in different zones
  – O -> Outer
  – I -> Inner
  – M -> Mid

Observations
• Customers should purchase storage based on workload requirements, not price
• Thin provisioning may be practiced
• Throughput-intensive VMs can be placed in outer disk zones
• Multiple VMs that may be accessed simultaneously should be co-located on disk
  – CSPs can monitor access patterns and move virtual disks accordingly

Workload Interference

CPU-Disk Interference
[Diagram: physical host running VM-1 and VM-2, with CPU and disk workloads]
Throughput: 23.4 MB/s vs. 27.6 MB/s
Performance difference: 15.3%

CPU-Disk Interference
• CPU allocation ratios have no effect on disk throughput across VMs
• A disk-intensive job performs better alongside a CPU-intensive job

CPU-Disk Interference
Reason?
Dynamic Frequency Scaling

CPU-Disk Interference
• CPU DFS is enabled in Linux by default
• Three ‘governors’ control the DFS policy
  – On-demand (default)
  – Performance
  – Power-save
• When one core is idle, the entire CPU is down-scaled because overall CPU utilization falls

Disk-Disk Interference
[Diagram: VM-1 and VM-2 on one physical host; V.Disk-1 and V.Disk-2 on a single physical disk]
• 1 instance of Postmark in each VM
• 65.3% more time taken compared to running Postmark in a single VM
• Overhead mainly attributed to disk seeks: accesses are no longer sequential

Disk-Disk Interference
[Diagram: VM-1 and VM-2 on one physical host; V.Disk-1 on Disk-1 and V.Disk-2 on Disk-2]
• VMs using separate physical disks
• 17.52% more time taken compared to running Postmark in a single VM
• Overhead attributed to contention in Dom-0’s queue structures

Disk-Disk Interference
• Postmark benchmark (reads)
• Cases:
  – Running in a single VM
  – 1 instance in each of two VMs
    • 2 VMs reading from virtual disks on the same physical disk
    • 2 VMs reading from virtual disks on different physical disks

Disk-Disk Interference
• I/O scheduling policy in Dom-0 has little effect
• The ‘Ideal’ case is the time taken when running Postmark in a single VM
• Other cases run 1 instance of Postmark in each of 2 VMs (separate physical disks)

Disk-Disk Interference
• Interference with respect to workload type
• Synthetic read workload
• VMs use separate physical disks
• Cases:
  – Mix of sequential versus random reads
• Sequential requests from both VMs flood the Dom-0 queue, causing contention

Observations
• CPU-intensive and disk-intensive workloads can be co-located for optimal performance and power
• Virtual disks that may be accessed simultaneously must be placed on separate physical disks
• I/O scheduling in Dom-0 has little effect on disk workload interference
• Two sequential workloads, when co-located, suffer in performance due to queue contention
• With separate disks, workload contention is generally minimal, except in the case of two sequential workloads

Impacts of Virtualization

Sequentiality
• Postmark benchmark (reads)
• Not much overhead is seen for random disk accesses
• VM overhead is masked by the larger disk overhead
• Overhead is felt more for sequential disk accesses

Block Size
• Postmark sequential reads
• Fixed overhead with every request
• As block size increases, the number of requests is reduced, hence overhead is reduced
• Efficient to read in larger blocks

Block Size with respect to Locality

Observations
• VM overhead is not felt in random workloads
  – amortized by disk seeks
• Extra layers of indirection are the reason for VM overhead
  – when block size is large, overhead is amortized
• Block size may be increased only if there is sufficient locality in access

Summary
• Storage purchased must depend on requirement, not price!
• It is better to place sequentially accessed streams in the outer disk zone
• Co-locate virtual disks that may be accessed simultaneously
• Co-locate a CPU-intensive task with a disk-intensive task for better power and performance
• Avoid co-locating two sequential workloads on a single physical machine
  – even when they go to separate physical disks!
• Read in large blocks only when there is locality in the workload

Questions
Contact: sankaran@gatech.edu
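
Backup: Block Size Cost Model (illustrative)
The block-size observations (a fixed overhead per request, amortized by larger blocks, but only when there is locality) can be illustrated with a simple cost model. This is a minimal sketch and not the measurement method used in the talk: the function name, the 0.1 ms per-request overhead, and the 50 MB/s disk bandwidth are all hypothetical values chosen for illustration.

```python
def useful_throughput_mb_s(block_kb, useful_kb, overhead_ms, bandwidth_mb_s):
    """Useful MB/s when every request pays a fixed overhead plus transfer time.

    useful_kb is how much of each block the application actually needs:
    with good locality it equals block_kb, with poor locality it is smaller.
    """
    transfer_s = (block_kb / 1024.0) / bandwidth_mb_s   # time to move the whole block
    overhead_s = overhead_ms / 1000.0                   # fixed cost paid once per request
    useful_mb = min(useful_kb, block_kb) / 1024.0       # data the application really wanted
    return useful_mb / (overhead_s + transfer_s)


if __name__ == "__main__":
    OVERHEAD_MS = 0.1       # hypothetical per-request virtualization overhead
    BANDWIDTH_MB_S = 50.0   # hypothetical sequential disk bandwidth
    for block_kb in (4, 16, 64, 256, 1024):
        # Good locality: the whole block is useful.
        good = useful_throughput_mb_s(block_kb, block_kb, OVERHEAD_MS, BANDWIDTH_MB_S)
        # Poor locality: only 4 KB of each block is useful.
        poor = useful_throughput_mb_s(block_kb, 4, OVERHEAD_MS, BANDWIDTH_MB_S)
        print(f"{block_kb:5d} KB  good locality: {good:6.2f} MB/s  "
              f"poor locality: {poor:6.2f} MB/s")
```

Under these assumptions, useful throughput rises with block size when the whole block is needed, and falls with block size when only a small part of each block is needed.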