CSAL: A CLOUD STORAGE ABSTRACTION LAYER TO ENABLE PORTABLE CLOUD APPLICATIONS
Zach Hill & Marty Humphrey
Dept. of Computer Science, University of Virginia
zjh5f@cs.virginia.edu
Dec. 2, 2010

A Cloud Application
[Diagram labels: User Requests, Front-End, Worker (x4), Queue Service, Table Service, Object/Blob Service]

Many clouds, many code versions
[Diagram labels: Front-End (x2), Worker (x3), Storage Services (x2): Blobs, Tables, Queues]

Single code version, many clouds
[Diagram labels: Front-End + CSAL (x2), Worker + CSAL (x3), Storage Services (x2): Blobs, Tables, Queues]

CSAL Overview
[Architecture diagram: Application on top of CSAL]
- BlobStore
  - Operations: createContainer, listContainers, deleteContainer, getBlob, putBlob, deleteBlob
  - Blob Namespace
  - S3 Plugin
- TableStore
  - Operations: createTable, deleteTable, Insert, Update, Delete, Query, getItem
  - Table Namespace
  - SimpleDB Plugin
- QueueStore
  - Operations: createQueue, deleteQueue, getMessage, putMessage, peekMessage, deleteMessage
  - Q Namespace
  - Azure Queue Plugin
- …
- Metadata Manager
- Service Manager
- Metadata Store: table service(s)

CSAL Namespaces
- One namespace for each abstraction type
- Metadata only for containers
  - Service endpoint, identifier, user credentials
- Each abstraction has an independent metadata store
- Metadata caching
  - Container ops are not very common
  - If data is stale, simply re-fetch and retry

CSAL Namespaces
1. Call getBlob(“X”, “foo”);
2. Lookup “X” metadata – cache first
3. Retrieve from table if not in cache
4. Use metadata to determine plugin to use
5. Access “foo” in “X” in Service1
[Diagram labels: CSAL, BlobStore, Metadata Manager, Cache, Plugin; containers X and Y; Service1 (holds “foo” in “X”), Service2, Service3]
Example metadata record: (“X”, http://service1.com, testusr, testKey,…)

CSAL Implementation
- Client-side Java library
  - e.g. BlobStore.getBlob(“Container”, “foo”);
- Metadata backing store in the cloud
- Currently supports Azure & AWS storage
  - Both SOAP and REST

Performance of CSAL
- Adding software layers isn’t free
- Compare CSAL to Azure’s and Amazon’s SDK APIs
- Set of micro-benchmarks to test operation latency
  - Container Ops and Data Ops
- Expect a slowdown for container ops due to metadata

Performance – Container Ops
[Figure: two bar charts, “CSAL in AWS” and “CSAL in Windows Azure”. Y-axis: Median Operation Time in Seconds (0 to 1.6). X-axis: Create Container, Create Table, Queue Create. Legend: CSAL Total Time, Native API, Core Op Time, Metadata Op Time.]
Note: Error bars indicate 1 standard deviation

Performance – Data Ops
[Figure: two bar charts, “CSAL in AWS” and “CSAL in Windows Azure”. Y-axis: Median Operation Time in Seconds (0 to 0.4). Legend: CSAL Total Time, Native API.]
Note: Error bars indicate 1 standard deviation

What about Standards?
- Standards Efforts
  - OCCI
  - OVF
  - Standards take time to develop and are resisted by vendors
- Multi-cloud APIs
  - SimpleCloud, jClouds, DeltaCloud, LibCloud
  - SAGA

Future work
- What if Cloud X doesn’t have tables/blobs/queues?
  - Map one abstraction onto another (e.g. a filesystem)
  - 3rd party services: HBase, HyperTable, Cassandra…
- Placing, replicating, and migrating data in real-time for performance and/or cost
- Real-world applications such as multi-cloud MR

Summary
- Application lock-in and portability are problems in clouds
- Standards are great, but don’t hold your breath just yet
- CSAL provides storage abstractions to make the application code itself portable with little performance impact for common data operations

Questions?
zjh5f@cs.virginia.edu
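
Appendix sketch: namespace lookup
The five-step lookup on the CSAL Namespaces slide, together with the "if data is stale, simply re-fetch and retry" rule, might look roughly like the following in Java. This is only an illustration under stated assumptions: the class and interface names (CsalLookupSketch, ContainerMetadata, MetadataTable, BlobPlugin, StaleMetadataException), the choice to key plugins by endpoint, the single retry, and returning blobs as String are all hypothetical and are not taken from the CSAL library. Only the getBlob(container, blob) call shape and the example record values come from the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: these types are illustrative, not the real CSAL API.
class CsalLookupSketch {

    /** Per-container metadata named on the slide: endpoint, identifier, credentials. */
    static final class ContainerMetadata {
        final String container, endpoint, user, key;
        ContainerMetadata(String container, String endpoint, String user, String key) {
            this.container = container;
            this.endpoint = endpoint;
            this.user = user;
            this.key = key;
        }
    }

    /** Stand-in for the cloud table service that backs the blob namespace. */
    interface MetadataTable {
        ContainerMetadata fetch(String container); // returns null if the container is unknown
    }

    /** Stand-in for a per-service storage plugin. */
    interface BlobPlugin {
        String getBlob(ContainerMetadata meta, String blobName) throws StaleMetadataException;
    }

    /** Raised by a plugin when the metadata it was given no longer matches the service. */
    static final class StaleMetadataException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    static final class BlobStore {
        private final Map<String, ContainerMetadata> cache = new HashMap<>();
        private final Map<String, BlobPlugin> pluginsByEndpoint;
        private final MetadataTable table;
        int tableFetches = 0; // exposed so the effect of caching can be observed

        BlobStore(MetadataTable table, Map<String, BlobPlugin> pluginsByEndpoint) {
            this.table = table;
            this.pluginsByEndpoint = pluginsByEndpoint;
        }

        /** Step 1: the application calls getBlob(container, blobName). */
        String getBlob(String container, String blobName) {
            ContainerMetadata meta = cache.get(container);   // step 2: cache first
            if (meta == null) {
                meta = refresh(container);                   // step 3: fall back to the table
            }
            try {
                return access(meta, blobName);               // steps 4 and 5
            } catch (StaleMetadataException stale) {
                try {
                    // Stale metadata: re-fetch from the table and retry once.
                    return access(refresh(container), blobName);
                } catch (StaleMetadataException again) {
                    throw new IllegalStateException("metadata still stale for " + container, again);
                }
            }
        }

        private ContainerMetadata refresh(String container) {
            tableFetches++;
            ContainerMetadata meta = table.fetch(container);
            if (meta == null) {
                throw new IllegalArgumentException("unknown container: " + container);
            }
            cache.put(container, meta);
            return meta;
        }

        private String access(ContainerMetadata meta, String blobName) throws StaleMetadataException {
            BlobPlugin plugin = pluginsByEndpoint.get(meta.endpoint); // step 4: pick the plugin
            if (plugin == null) {
                throw new IllegalStateException("no plugin for " + meta.endpoint);
            }
            return plugin.getBlob(meta, blobName);                    // step 5: access the blob
        }
    }

    public static void main(String[] args) {
        MetadataTable table = c -> "X".equals(c)
                ? new ContainerMetadata("X", "http://service1.com", "testusr", "testKey")
                : null;
        Map<String, BlobPlugin> plugins = new HashMap<>();
        plugins.put("http://service1.com",
                (meta, blob) -> "contents of " + blob + " in " + meta.container);

        BlobStore store = new BlobStore(table, plugins);
        System.out.println(store.getBlob("X", "foo"));
        System.out.println(store.getBlob("X", "foo")); // served using cached metadata
        System.out.println("table fetches: " + store.tableFetches);
    }
}
```

In this sketch the second getBlob call does not touch the metadata table, which matches the slide's reasoning that container operations are rare and data operations should pay little or no metadata cost. A real plugin would have to decide for itself which service errors indicate stale metadata.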