Virtual Machine Disk Images Introspection and a bit more...
Vasily Tarasov (SBU)
Dean Hildebrand (IBM Almaden)
Renu Tewari (IBM Almaden)
Erez Zadok (SBU)
File system and Storage Lab (FSL)
Outline
• How it all started
• The idea of introspection
• A couple of results from a 1st prototype
• Future work
• Benchmarking, Filebench
Two important technologies
Virtual Machines (VMs)
- Consolidation of computational resources
- Flexible, efficient and scalable
- Hardware support
- Multiple solutions: VMware, KVM, Xen, ...
- The cloud way of delivering services

Network Attached Storage (NAS)
- Storage consolidation
- Scalable, manageable and efficient
- NFS/CIFS available on the majority of operating systems
- NAS sales jumped from $540M in 1998 to $5.1B in 2003
- IBM SONAS

Two technologies…
[Figure labels: Dean, VM, NAS]
…and they grow
[Figure labels: Dean, VM, NAS]
How do VM & NAS work together?
Can we make them work better?
[Figure labels: VM, IBM SONAS]
Typical Setup
[Figure: three hosts (VMware, KVM, Xen, ...), each running two virtual machines (VM n-1, VM n-2) and an NFS client; NFS servers 1 and 2; GPFS nodes 1 to 4, each with two storage devices (Storage n-1, Storage n-2)]
Datapath Decomposed
Legend: CA = caching, RA = read-ahead, RM = request mangling and scheduling
• VM Guest: Applications, Virtual File System, On-Disk File System, Block Layer, Controller Driver
• Host: Controller Emulator, NFS Client
• NETWORK
• NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver
[Figure: each layer of the datapath is annotated with the subset of CA, RA, and RM that it performs]
Collecting traces: setup
User-space workloads (run in the VM guest):
• Rand/Seq Read
• Rand/Seq Write
• Various I/O sizes
• Multi-file workloads
• Multi-process workloads
• Meta-data intensive
Testbed: VMware ESX4, NFS server, 1Gbps network
Trace points:
• Within-VM trace
• VSCSI layer trace
• Network trace
• Block layer trace
[Figure: the trace points placed along the datapath. VM Guest: Applications, Virtual File System, On-Disk File System, Block Layer, Controller Driver. Host: Controller Emulator, NFS Client. NETWORK. NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver]
Some interesting results
I/O sizes change
[Figure: the datapath (VM Guest, Host, NETWORK, NAS) annotated with the I/O sizes observed from top to bottom: 4MB, 4KB, 1MB, 128KB, 32KB, 256KB]
WIOV’11 - Revisiting the Storage Stack in Virtualized NAS Environments
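To put the size changes in perspective, the sketch below counts how many requests of each observed size it would take to carry one 4MB application I/O. It is only arithmetic on the sizes shown on the slide; the helper name `requests_needed` is hypothetical, and the sketch deliberately does not assign sizes to specific layers, since the slide text alone does not confirm that mapping.

```python
KB = 1024
MB = 1024 * KB

# I/O sizes as they appear on the slide, top to bottom.
# ASSUMPTION: the layer each size belongs to is not encoded here.
observed_sizes = {
    "4MB": 4 * MB,
    "4KB": 4 * KB,
    "1MB": 1 * MB,
    "128KB": 128 * KB,
    "32KB": 32 * KB,
    "256KB": 256 * KB,
}


def requests_needed(total_bytes, request_bytes):
    """Number of requests of request_bytes needed to move total_bytes."""
    return -(-total_bytes // request_bytes)  # ceiling division


# Requests needed to carry one 4MB application I/O at each observed size.
counts = {
    label: requests_needed(4 * MB, size)
    for label, size in observed_sizes.items()
}

for label, count in counts.items():
    print(f"{label:>6}: {count} request(s)")
```

The spread between 1 request and 1024 requests for the same payload is one way to see why the layers between the application and the NAS disks matter for performance.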
Meta-data Ops
• Update attributes
• List directories
• Creation/deletion
• Lookup
• Access permissions
• Link/Symlink operations
Non-VM case
# stat /foo/bar
sys_stat(/foo/bar)
NFS_GETATTR(foobar_fh)
VM case: meta-data ops reach the NAS as Data Ops
# stat /foo/bar
sys_stat(/foo/bar)
NFS_READ(dskimg_fh)
NFS_WRITE(dskimg_fh)
Come up with an idea
[Figure: a disk image file (Ext, NTFS, UFS, ...) on the NFS server. The server receives READ(dskfh, offset, len) and asks: what is located in this region (offset, size)?]
READ from:
• Inode
• Directory entry
• Data of specific file
• ...
Do intelligent things!
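A minimal sketch of the introspection idea, assuming the NFS server already holds a map from byte ranges of the disk image to guest file system objects. The names `Region`, `ImageLayout`, and `classify_read` are hypothetical, the example layout is invented for illustration, and regions are assumed not to overlap. Building such a map from real Ext, NTFS, or UFS metadata is the hard part and is not shown; this is not the prototype's implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int        # byte offset in the disk image where the region begins
    length: int       # region size in bytes
    kind: str         # e.g. "inode", "directory entry", "file data"
    detail: str = ""  # optional description, e.g. a guest path


class ImageLayout:
    """Hypothetical map from disk-image byte ranges to guest FS objects."""

    def __init__(self, regions):
        # Regions are assumed to be non-overlapping.
        self.regions = sorted(regions, key=lambda r: r.start)
        self.starts = [r.start for r in self.regions]

    def classify_read(self, offset, length):
        """Return the regions touched by READ(dskfh, offset, len)."""
        end = offset + length
        # Begin at the last region that starts at or before `offset`.
        i = max(bisect_right(self.starts, offset) - 1, 0)
        hits = []
        while i < len(self.regions) and self.regions[i].start < end:
            region = self.regions[i]
            if region.start + region.length > offset:
                hits.append(region)
            i += 1
        return hits


# Invented example layout (not taken from any real file system).
layout = ImageLayout([
    Region(0, 4096, "superblock"),
    Region(4096, 8192, "inode", "inode table"),
    Region(12288, 4096, "directory entry", "/foo"),
    Region(16384, 65536, "file data", "/foo/bar"),
])

# A READ of 8192 bytes at offset 8192 touches the inode table and a directory.
kinds = [r.kind for r in layout.classify_read(offset=8192, length=8192)]
print(kinds)
```

Once the server knows that a READ targets an inode or a directory entry rather than file data, it can pick a different caching or prefetching policy for that request, which is the kind of "intelligent thing" the slide refers to.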
Prototype Results: Find
80% improvement
[Figure: bar chart of find runtime (sec): Non-optimized 35, Optimized 7]
Prototype Results: Startup
2.6x faster: 130 sec down to 50 sec
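The headline numbers on the two prototype slides can be checked against the reported runtimes. The sketch below assumes that 35 sec and 7 sec are the non-optimized and optimized find runtimes, and that 130 sec and 50 sec are the startup times before and after optimization; the function names are hypothetical.

```python
def speedup(before_sec, after_sec):
    """How many times faster the optimized run is."""
    return before_sec / after_sec


def runtime_reduction_pct(before_sec, after_sec):
    """Percentage of the original runtime that was eliminated."""
    return 100.0 * (before_sec - after_sec) / before_sec


# ASSUMPTION: 35 -> 7 sec for find, 130 -> 50 sec for startup.
find_reduction = runtime_reduction_pct(35, 7)
find_speedup = speedup(35, 7)
startup_speedup = speedup(130, 50)
startup_reduction = runtime_reduction_pct(130, 50)

print(f"find: {find_reduction:.0f}% less runtime, {find_speedup:.1f}x faster")
print(f"startup: {startup_reduction:.0f}% less runtime, {startup_speedup:.1f}x faster")
```

Under these assumptions, the "80% improvement" for find is a reduction in runtime, while the startup result is quoted as a speedup factor, so the two slides use different metrics.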
Future work
• Solid implementation
• More efficient cache policies
• Optimizations on the write path
• Analysis of more complex workloads
Virtual Machine Disk Images Introspection, and a bit more...
A Recent Study Concluded that…
1. Much of what researchers conclude in their studies is misleading, exaggerated, or flat-out wrong
2. A new claim about a research finding is more likely to be false than true
3. Researchers tend to publish positive results more often than negative findings
4. Chances of being accepted to a conference are higher if the results are “more exciting”
2005-2008 study by J. Ioannidis
[Figure: options A to E: Medicine, Biology, Sociology, Computer Science, Physics]
HotOS’11: Benchmarking FS Benchmarking: It is Rocket Science
5/4/2011
Filebench
• Originally created by Sun Microsystems (RIP)
• Maintained by FSL
• Used in many papers
• Flexible: Workload Model Language – WML
• Portable: Linux, FreeBSD, Solaris, MacOS, Windows *
Filebench WML
define fileset name=myfileset,size=16kb,entries=1000
define process name=reader,instances=1
{
  thread name=readerthread,memsize=10m,instances=10
  {
    flowop read name=myread,filesetname=myfileset,iosize=2kb
  }
}
Filebench for Cloud Services
flowops:
• Reads
• Writes
• Creates
• Deletes
• +20 more sophisticated
[Figure labels: POSIX, NFS RPC, AFS RPC, Cloud]
Filebench for Virtualized Environments
define hypervisor name=hpv,type=esx3.1,instances=1
{
  define vm name=hpv,type=windows,instances=5
  {
    define process name=reader,instances=1
    {
      thread name=readerthread,memsize=10m,instances=10
      {
        flowop read name=myread1,filesetname=myfileset,…
      }
    }
  }
}
Thank you!