Flexibility, Manageability and Performance in a Grid Storage Appliance

John Bent, Venkateshwaran Venkataramani, Nick Leroy,
Alain Roy, Joseph Stanley, Andrea Arpaci-Dusseau, Remzi
Arpaci-Dusseau, and Miron Livny
University of Wisconsin
Two trends
Data sets
Performance
• Storage appliances address both trends.
Storage Appliances: + and -
Storage Appliances: Great for basic file service




Easy to manage: Plug in and it works
Good performance: Specialized just for I/O
Reliable and available too
Storage Appliances for the Grid: Mismatch?



Inflexible: Few, specific protocols (e.g., NFS)
Costly: 10x the cost of PC + a few disks
Difficult to integrate: Just one piece of the puzzle
A Solution: NeST

NeST: A Storage Appliance for the Grid

Flexible: Multiple simultaneous protocols
Virtual protocol layer
Low-cost: Use commodity machines
Dynamic adaptation
Grid-aware: Integrate with higher-level systems
Designed specifically for the Grid
Outline



Introduction
General architecture
Design goals





Flexibility
Low-cost
Grid-aware features
NeST in the Grid example
Conclusions
NeST: Protocol layer
[Figure: NeST architecture diagram. Labels: physical network layer; Chirp, HTTP, GridFTP, NFS; common protocol layer; dispatcher; storage manager; transfer manager; concurrency models; physical storage layer. Arrows: control flow, data flow.]
Virtualizes different protocols
Mediates access to network
NeST: Dispatcher
[Figure: NeST architecture diagram, repeated]
Mediates interaction between other components
Gathers information, advertises
NeST: Storage manager
[Figure: NeST architecture diagram, repeated]
Space management
Access control
Virtualizes physical storage
NeST: Transfer manager
[Figure: NeST architecture diagram, repeated]
Implements scheduling policies
Chooses concurrency model
Flexibility: Multiple protocols

Problem: How to support multiple protocols?
One approach: Just a Bunch of Servers (JBOS)

Problems with JBOS





Lack of control (scheduling)
Painful administration
No shared code
Larger memory footprint
[Figure: JBOS server made up of wu-ftpd, nfsd, and httpd]
NeST: Flexibility By Design

NeST: Integrate protocols and gain advantage


Implementation like VFS
Integration introduces new challenges



Different protocols allow different auth models
More expensive to add a new protocol
Less fault isolation
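A minimal sketch of the virtual protocol layer idea, assuming a VFS-like design in which each protocol handler translates its own wire request into one common request type. The names CommonRequest, parse_http, parse_ftp, HANDLERS, and dispatch are hypothetical and are not taken from NeST; only two protocols are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRequest:
    """Protocol-independent request (hypothetical, for illustration only)."""
    protocol: str
    op: str    # "get" or "put"
    path: str


def parse_http(line):
    # Example request line: "GET /data/a HTTP/1.0"
    method, path, _version = line.split()
    ops = {"GET": "get", "PUT": "put"}
    return CommonRequest("http", ops[method], path)


def parse_ftp(line):
    # Example command: "RETR /data/a" (download) or "STOR /data/a" (upload)
    command, path = line.split(maxsplit=1)
    ops = {"RETR": "get", "STOR": "put"}
    return CommonRequest("ftp", ops[command], path)


# One entry per protocol; everything below this table is shared code.
HANDLERS = {"http": parse_http, "ftp": parse_ftp}


def dispatch(protocol, line, storage):
    """Translate a wire request to the common form, then serve it."""
    request = HANDLERS[protocol](line)
    if request.op == "get":
        return storage[request.path]
    raise NotImplementedError(request.op)
```

In this sketch, adding a protocol means writing one parser and registering it in HANDLERS; the serving code is shared, which is the advantage claimed over JBOS.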
NeST vs JBOS
[Figure: bar charts of server bandwidth (MB/s), axis 0 to 35. Bar labels: Chirp, GridFTP, HTTP, NFS, Apache, linux nfsd, wu-ftpd, Total. Setup: Linux cluster, dual PIII, 1 GB RAM, Linux 2.2.19; each protocol: 4 clients, 10 MB files.]
• For each protocol, NeST is comparable to JBOS server.
Exerting scheduling control

Different scheduling policies




FCFS
Cache-aware [USENIX ’02]
Proportional share
Proportional share scheduling

Allows administrators to set protocol proportions


e.g. favor NFS
Very difficult in JBOS
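A sketch of how protocol proportions such as 1:2:1:1 could be enforced. The slides do not say which algorithm is used; stride scheduling is swapped in here as one well-known proportional-share technique. StrideScheduler, STRIDE1, and the ticket values are illustrative assumptions.

```python
import heapq

STRIDE1 = 1 << 20  # large constant divided by ticket count (assumed value)


class StrideScheduler:
    """Deterministic proportional-share scheduler (stride scheduling)."""

    def __init__(self, tickets):
        # tickets: mapping of protocol name -> positive integer share
        self._heap = []
        for order, (name, count) in enumerate(tickets.items()):
            stride = STRIDE1 // count
            # Heap entries: (pass value, tie-breaker, name, stride)
            heapq.heappush(self._heap, (stride, order, name, stride))

    def next(self):
        """Return the protocol whose request should be served next."""
        pass_value, order, name, stride = heapq.heappop(self._heap)
        # Advance the chosen protocol by its stride: more tickets means a
        # smaller stride, so that protocol is chosen more often.
        heapq.heappush(self._heap, (pass_value + stride, order, name, stride))
        return name
```

For example, StrideScheduler({"chirp": 1, "nfs": 2, "http": 1, "gridftp": 1}) favors NFS two to one over each other protocol.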
Proportional share
[Figure: server bandwidth (MB/s), axis 0 to 35, for each scheduling configuration: FCFS, 1:1:1:1, 1:2:1:1, 1:1:1:4. Setup: Linux cluster, dual PIII, 1 GB RAM, Linux 2.2.19; each protocol: 4 clients, 10 MB files.]
• In most cases, achieves Jain’s metric of fairness > 0.98 (1 is “fair”).
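For reference, Jain's fairness index over n allocations x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2); it equals 1 when all allocations are equal and 1/n when one party receives everything. The sketch below divides each protocol's bandwidth by its configured share first, which is an assumption about how the metric is applied to proportional-share runs. The sample numbers are made up, not measurements.

```python
def jain_index(throughputs, weights=None):
    """Jain's fairness index, optionally normalized by desired shares."""
    if weights is None:
        weights = [1.0] * len(throughputs)
    # Normalize so that a perfectly proportional allocation looks equal.
    x = [t / w for t, w in zip(throughputs, weights)]
    total = sum(x)
    squares = sum(v * v for v in x)
    return (total * total) / (len(x) * squares)
```

With a 1:2:1:1 configuration, hypothetical bandwidths of 5, 10, 5, and 5 MB/s score exactly 1.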
Low-Cost: New challenges

Desire: Run on arbitrary OS on arbitrary PC



Software-only, user-level storage appliance
Currently on Linux (release 0.9) and Solaris (beta)
Problem: Portable performance

Performance under load is platform / workload dependent



Threads or processes on some systems, events on others
May also be workload dependent (e.g. whether in cache)
NeST approach: Dynamic adaptivity



Simultaneously support multiple concurrency models
Monitor performance using each model
Bias towards better model over time
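The three steps above can be sketched as a selector that keeps a moving average of request time per concurrency model and picks models with probability inversely proportional to that average. AdaptiveSelector, the smoothing factor alpha, and the inverse-latency weighting are all assumptions for illustration; the slides do not specify the biasing rule.

```python
import random


class AdaptiveSelector:
    """Bias request dispatch toward the faster concurrency model."""

    def __init__(self, models, alpha=0.2, seed=None):
        self.avg = {m: None for m in models}  # moving average, seconds
        self.alpha = alpha                    # weight given to new samples
        self.rng = random.Random(seed)

    def record(self, model, seconds):
        """Fold one measured request time into the model's average."""
        old = self.avg[model]
        if old is None:
            self.avg[model] = seconds
        else:
            self.avg[model] = (1 - self.alpha) * old + self.alpha * seconds

    def weights(self):
        """Selection probability per model."""
        if any(a is None for a in self.avg.values()):
            # Until every model has been measured, try them uniformly.
            return {m: 1.0 / len(self.avg) for m in self.avg}
        inverse = {m: 1.0 / max(a, 1e-9) for m, a in self.avg.items()}
        total = sum(inverse.values())
        return {m: v / total for m, v in inverse.items()}

    def choose(self):
        """Pick the model to use for the next request."""
        w = self.weights()
        models = list(w)
        return self.rng.choices(models, weights=[w[m] for m in models])[0]
```

Because the slower model keeps a nonzero probability, it continues to be sampled, so the selector can shift back if the workload changes.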
Adaptive Concurrency
[Figure: two panels comparing Adaptive, Threads, and Events. Panels: Linux, 10 MB files; Solaris, 1K files. Axes: average time per request (sec) and average time per request (ms).]
• Dynamic adaptation approaches “ideal” without static information.
Grid-Aware Mechanisms

Basic functionality

Users and groups: Dynamic creation/deletion



Does not need administrative intervention
Access control: Generic AFS-style ACLs
Advanced functionality




QoS: Preferential scheduling
Advertises into global scheduling systems
Flexible protocol and authentication mechanisms
Self-cleaning storage guarantees: Lots
Storage guarantees: Lots

Characteristics of Lots:
Capacity: Total amount of data lot can store
Duration: Time for which data is guaranteed to exist
Set of files: Multiple files may co-exist within lot
Self-cleaning
Expired lots become “best-effort” lots
Lot management
Either default set created by administrator, OR use resource management protocol to create before usage
Implementation: File system quotas


Advantage: Integrates cleanly with local access methods
Disadvantage: Performance hit for large writes
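A minimal in-memory sketch of the lot abstraction: a capacity, a duration (stored as an expiry time), and a set of files, with expired lots treated as best-effort and reclaimable. This is not the quota-based implementation named on the slide; Lot, reclaim, and the field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Lot:
    capacity: int        # total bytes the lot can store
    expires_at: float    # time until which the data is guaranteed
    files: dict = field(default_factory=dict)  # file name -> size in bytes

    def used(self):
        return sum(self.files.values())

    def state(self, now):
        return "guaranteed" if now < self.expires_at else "best-effort"

    def add_file(self, name, size):
        # Replacing an existing file only counts the size difference.
        new_total = self.used() - self.files.get(name, 0) + size
        if new_total > self.capacity:
            raise ValueError("lot capacity exceeded")
        self.files[name] = size


def reclaim(lots, needed, now):
    """Free at least `needed` bytes, touching only best-effort lots."""
    freed = 0
    for lot in lots:
        if lot.state(now) != "best-effort":
            continue
        for name in list(lot.files):
            if freed >= needed:
                return freed
            freed += lot.files.pop(name)
    return freed
```

Data in an expired lot survives until space is actually needed, which is one reading of "self-cleaning" with best-effort semantics.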
NeST in the Grid
[Figure: diagram labels: Advertisement; Global Execution Manager; Linux NeST; Solaris NeST; compute nodes at the Home and Remote sites]
NeST in the Grid
[Figure: Global Execution Manager with Home and Remote sites; arrows numbered 1 to 6; GridFTP between the two NeSTs (N); NFS at step 4]
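1) Home submits jobs
2) Global execution manager reserves space
3) Global execution manager coordinates transfer
4) Global execution manager starts jobs
5) Global execution manager coordinates transfer
6) Global execution manager terminates space reservation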
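The six steps can be walked through with in-memory stand-ins. NestStub, transfer, and run_job are hypothetical names and do not reflect the real NeST or Condor interfaces; the transfer function stands in for the GridFTP copy, and the job reading the staged file stands in for local NFS access.

```python
class NestStub:
    """Hypothetical in-memory stand-in for a NeST server (not the real API)."""

    def __init__(self, name):
        self.name = name
        self.files = {}      # file name -> bytes
        self.reserved = {}   # lot id -> reserved capacity in bytes
        self._next_lot = 1

    def reserve(self, capacity):
        lot_id = self._next_lot
        self._next_lot += 1
        self.reserved[lot_id] = capacity
        return lot_id

    def release(self, lot_id):
        del self.reserved[lot_id]


def transfer(src, dst, name):
    """Stand-in for a transfer between two servers."""
    dst.files[name] = src.files[name]


def run_job(home, remote, job, input_name, output_name, capacity):
    """Steps 2 to 6; calling this function plays the role of step 1."""
    lot_id = remote.reserve(capacity)                          # 2) reserve space
    transfer(home, remote, input_name)                         # 3) stage input
    remote.files[output_name] = job(remote.files[input_name])  # 4) run job
    transfer(remote, home, output_name)                        # 5) stage output
    remote.release(lot_id)                                     # 6) end reservation
    return home.files[output_name]
```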
Conclusions

NeST: A storage appliance for the Grid
Gain manageability
Without sacrificing performance
Design goals:
Flexibility: Virtual protocol architecture
Low-cost: Adaptation mechanisms
Grid-aware: Space management
Current status: release 0.9 available
Future work

Hot deployable NeSTs, lot management extensions
Questions?
http://www.cs.wisc.edu/condor/nest