Secure Location-Independent Autonomic Storage Architectures GR/S44501/01 February 2004 - January 2007

Graham Kirby, Alan Dearle, Ron Morrison & Stuart Norcross
School of Computer Science, University of St Andrews
{graham, al, ron, stuart}@dcs.st-and.ac.uk
Project Aims

Desirable features of a data storage system
    unbounded capacity
    zero latency & cost
    total reliability
    location independence
    simple interface
    complete security
    complete historical archive
Aim: a storage architecture approximating above, focusing on:
    simple interface for end user (file system)
    abstracting over:
        user location
        physical devices
    provision of significant benefits with acceptable cost
EPSRC e-Science 26/3/04
Potential Benefits

Simplify user experience
    ‘home directory’ ubiquitously available, irrespective of:
        machines and disks
        physical location
        firewalls
    data highly durable
        no need for backup
    simple data sharing
        uniform global name space
Historical views
    data never over-written
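
The "data never over-written" idea can be illustrated with a minimal sketch. This is not the project's implementation: the class `AppendOnlyStore`, its methods and the example path are hypothetical names chosen for illustration, and everything is held in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AppendOnlyStore:
    """Hypothetical in-memory store in which a write never replaces earlier data."""

    # Maps a path to every version ever written to it, oldest first.
    versions: Dict[str, List[bytes]] = field(default_factory=dict)

    def write(self, path: str, data: bytes) -> int:
        """Append a new version and return its version number (starting at 0)."""
        history = self.versions.setdefault(path, [])
        history.append(data)
        return len(history) - 1

    def read(self, path: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific historical one if requested."""
        history = self.versions[path]
        return history[-1] if version is None else history[version]


store = AppendOnlyStore()
store.write("/home/user/notes.txt", b"first draft")
store.write("/home/user/notes.txt", b"second draft")
latest = store.read("/home/user/notes.txt")
original = store.read("/home/user/notes.txt", version=0)
```

Because earlier versions are retained rather than replaced, a historical view is simply a read at an older version number.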
Potential Hurdles to User Adoption

Speed and convenience must be close enough to that of a local disk
Users must be able to trust system
    not to allow inappropriate access to data by other users
    to be sufficiently reliable for serious evaluation
Need viable exit strategy
    may require that system can reproduce effects of user’s existing backup regime
        e.g. by maintaining a local copy of all data
Financial cost
Critical mass of nodes and users required
    envisaged architecture relies on autonomic management of large numbers of nodes
Storage overhead must be low enough
    incurred through replication of data
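
As a rough worked example of the storage overhead in question, the sketch below compares whole-file replication with an erasure code. The figures are illustrative only, the function names are hypothetical, and the erasure case assumes a code in which any k of the k + m fragments are enough to rebuild the data.

```python
def replication_overhead(copies: int) -> float:
    """Bytes stored per byte of user data when keeping `copies` full replicas."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Bytes stored per byte of user data for k data fragments plus m parity fragments."""
    return (data_fragments + parity_fragments) / data_fragments


# Both configurations tolerate the loss of any two storage nodes.
three_replicas = replication_overhead(3)   # 3 full copies -> 3.0x
four_plus_two = erasure_overhead(4, 2)     # any 4 of 6 fragments suffice -> 1.5x
```

Under these assumptions the erasure-coded layout stores half as many bytes as triple replication for the same tolerance of two lost nodes, at the price of extra computation and more nodes contacted per read.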
User Control

End users should deal only with very high-level configuration
    set broad goals regarding trade-offs (or ignore completely)
    task of autonomic management system to try to achieve these goals
Examples of trade-offs
    speed of reads and writes
    durability
        related to number and placement of replicas
    consistency
        both absolute & time to converge
        how long before updates to shared data are visible to others?
    resource consumption
        storage, bandwidth, computation
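
One way to picture such high-level configuration is sketched below. It is an assumption-laden illustration, not the project's design: the names `Level`, `UserGoals` and `target_replicas`, and the policy that turns goals into a replica count, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class UserGoals:
    """Broad user-facing preferences; every field has a default, so users may ignore them."""
    speed: Level = Level.MEDIUM
    durability: Level = Level.MEDIUM
    consistency: Level = Level.MEDIUM
    resource_thrift: Level = Level.MEDIUM


def target_replicas(goals: UserGoals) -> int:
    """Illustrative policy: higher durability adds replicas, high thrift removes one."""
    replicas = 1 + goals.durability.value  # 2, 3 or 4
    if goals.resource_thrift is Level.HIGH:
        replicas -= 1
    return max(replicas, 2)  # hypothetical floor: never fewer than two copies


default_target = target_replicas(UserGoals())
durable_target = target_replicas(UserGoals(durability=Level.HIGH))
thrifty_target = target_replicas(UserGoals(durability=Level.LOW, resource_thrift=Level.HIGH))
```

The user only ever sets (or ignores) the four coarse levels; translating them into replica counts and placement is left to the management layer.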
Control Example
Control and Feedback Example
Implementation Approach

File system interface
    SMB or NFS
Replication of files or fragments
    erasure-resilient encoding
Placement of data
    controlled explicitly
Routing to data
    abstracted by peer-to-peer overlay e.g. Tapestry
Probes & gauges to monitor state of system
    publish/subscribe infrastructure e.g. Siena
Autonomic management elements
    attempt to map user goals and probe events into suitable low-level actions
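
The mapping from user goals and probe events to low-level actions can be sketched as a small control loop. This is a minimal illustration under stated assumptions: `ProbeEvent`, `ReplicaGauge` and `plan_actions` are hypothetical names, a plain list stands in for events that would arrive over a publish/subscribe channel, and no Siena or Tapestry API is used.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProbeEvent:
    """A low-level observation reported by a probe."""
    file_id: str
    live_replicas: int


class ReplicaGauge:
    """Aggregates probe events into the latest replica count known for each file."""

    def __init__(self) -> None:
        self.live: Dict[str, int] = {}

    def observe(self, event: ProbeEvent) -> None:
        self.live[event.file_id] = event.live_replicas


def plan_actions(gauge: ReplicaGauge, target: int) -> List[str]:
    """Compare gauge readings with the goal and emit low-level repair actions."""
    actions = []
    for file_id, live in sorted(gauge.live.items()):
        if live < target:
            actions.append(f"create {target - live} replica(s) of {file_id}")
        elif live > target:
            actions.append(f"remove {live - target} replica(s) of {file_id}")
    return actions


gauge = ReplicaGauge()
for event in [ProbeEvent("a.txt", 3), ProbeEvent("b.txt", 1), ProbeEvent("a.txt", 2)]:
    gauge.observe(event)
actions = plan_actions(gauge, target=3)
```

Here the target of three replicas plays the role of a goal derived from the user's settings; the later event for `a.txt` supersedes the earlier one, so both files end up scheduled for repair.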
Challenges

Core distributed storage infrastructure
    appropriate replication mechanisms
Autonomic management
    low-level policies
    probe & gauge infrastructure
    high-level views for users
        synthesising views from low-level events
    heuristics for adapting low-level policies to achieve high-level goals
Evaluation
    simulation, local cluster, PlanetLab
    end-user adoption
Conclusions

Aim to design, implement and evaluate distributed storage system targeted at benefits to end-user
    very simple interface
    ubiquitously available
    highly durable
    append-only: historical views
Project details
    http://www-systems.dcs.st-and.ac.uk/asa/