Limitless Storage, Boundless Opportunities
Technology Overview – January 2009

Cleversafe Mission
Enable the world to confidently store and distribute limitless data

Commercial Products
• Providing products and services to companies that are building and operating Dispersed Storage Networks

Dispersed Storage – Open Source Project
• Creating a genuine open standard and a broad technical community

Data Storage Growth
Traditional Data
• Documents
• Character & numerical databases
+ Additional, New Data
• Images – 500 KB per picture
• Audio – 5,000 KB per song
• Video – 5,000,000 KB per movie
Digital Content
• 85% of all data by 2012
• Growing 10x every 4 years
Source: IDC

Current High Availability Scenario
[Diagram: Server1 @ Location 1, Server2 @ Location 2 and Server3 @ Location 3, each with an Internet connection and a RAID3 controller, each holding the sample text "The quick brown fox jumps over the lazy brown dog." as blocks A1, A2, A3 plus parity]
• 300% Disk Storage Overhead + Tape Backup
  – Total bytes stored = 4x usable capacity
• 200% Bandwidth Overhead
  – Each node supports the full operational requirement
  – Total bandwidth required = 3x operational requirement
• Higher Cost
  – More Power
  – More Management
  – More Space
  – More Equipment
• More Security Risks

Digital Data Storage – An Antiquated Approach
Currently, Data Storage = Data Copies
• Not Secure – 200 major announced security breaches since 2004
• Not Private – Data copies are… data copies
• Not Long Term – Tied to hardware that doesn't last more than 5 years
• More Reliable = More Cost – Additional copies, synchronization traffic, high-cost hardware
• Not Scalable – Performance and management degrade as scale increases

Information Dispersal
With the emergence of broadband and modern microprocessors, Information Dispersal Algorithms (IDAs) can be used to store the world's data…
– Inherently secure
– Inherently private
– Inherently reliable
– Inherently long term
Similar mathematical methods are the basis of digital mobile telephony and the Internet
– Packet Switching, Reed-Solomon, Erasure Coding, Forward Error Correction, etc.

How Information Dispersal Works
36 example characters = 36 total bytes
Information Dispersal Algorithms – quick mathematical transformation
16 example slices = 58 total bytes
This slicing example has a 60% storage overhead
– Total bytes stored = 1.6x usable capacity
"Slices" are to data storage… what "packets" are to data communications.
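The slicing idea can be illustrated with a toy erasure code. The Python sketch below is not Cleversafe's algorithm: it is a minimal Reed-Solomon-style scheme using polynomial interpolation over the prime field GF(257), and the names `disperse`, `retrieve`, `width` and `threshold` are chosen here for illustration. It demonstrates only the reliability property (any `threshold` of the `width` slices rebuild the data). It applies no encryption, and the first `threshold` slices carry the raw bytes, so it says nothing about the privacy or security properties claimed in this overview.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def disperse(data: bytes, width: int, threshold: int):
    """Return `width` slices (lists of ints in 0..256).

    Each group of `threshold` bytes defines a polynomial whose value at
    x = j is byte j; slice x stores that polynomial's value at x.
    """
    if not 0 < threshold <= width <= P:
        raise ValueError("need 0 < threshold <= width <= 257")
    padded = data + bytes(-len(data) % threshold)  # zero-pad the last group
    slices = [[] for _ in range(width)]
    for start in range(0, len(padded), threshold):
        points = list(enumerate(padded[start:start + threshold]))
        for x in range(width):
            slices[x].append(_interpolate(points, x))
    return slices


def retrieve(available: dict, threshold: int, length: int) -> bytes:
    """Rebuild the first `length` bytes from any `threshold` slices.

    `available` maps slice index -> slice contents.
    """
    chosen = list(available.items())[:threshold]
    if len(chosen) < threshold:
        raise ValueError("not enough slices to rebuild the data")
    out = bytearray()
    for pos in range(len(chosen[0][1])):
        points = [(x, s[pos]) for x, s in chosen]
        out.extend(_interpolate(points, j) for j in range(threshold))
    return bytes(out[:length])


if __name__ == "__main__":
    text = b"The quick brown fox jumps over the lazy brown dog."
    slices = disperse(text, width=16, threshold=10)
    # Pretend six of the sixteen slices are unreachable.
    survivors = {i: slices[i] for i in (1, 3, 4, 6, 8, 9, 11, 12, 14, 15)}
    print(retrieve(survivors, threshold=10, length=len(text)))
```

Because slice values can reach 256, this toy stores them as integers rather than bytes; a production code would typically work in GF(256) so that every symbol fits in one byte.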
Slices provide inherently reliable, private, secure and long-term storage.

Dispersal versus Replication

Dispersal (Typical Configurations)
Width  Threshold  Nines of Reliability  Storage Overhead  Bandwidth Overhead  Access Choices
8      5          6                     60%               60%                 70
8      6          7                     33%               33%                 28
16     10         >16                   60%               60%                 8008
16     12         11                    33%               33%                 1820
32     24         >16                   33%               33%                 11 million
64     56         >16                   14%               14%                 214 million
[Slice Storage graphic: numbered slices 1 through width for each configuration; legend: source data size, storage overhead size]

Replication
Copies  Parity  Nines of Reliability  Storage Overhead  Bandwidth Overhead  Access Choices
2       No      5                     100%              100%                2
2       Yes     10                    167%              100%                2
3       No      7                     200%              200%                3
3       Yes     >16                   300%              200%                3
[Copies + Parity Storage graphic: one Copy block per copy, plus Parity blocks where used; legend: source data size, storage overhead size]

Data Storage with Information Dispersal
[Diagram labels: Object, dsNet Client, iSCSI, File, Accesser]
• Storage Overhead: 15-60%
• Bandwidth Needed: 15-60%
• Maximum Delivery: 16 at once
• Delivery choices: thousands

Data Retrieval with Information Dispersal
[Diagram labels: Object, dsNet Client, iSCSI, File, Accesser; four elements marked X]
• Storage Overhead: 15-60%
• Bandwidth Needed: 15-60%
• Maximum Delivery: 16 at once
• Delivery choices: thousands

Product – Appliance Components
Slicestor – Dispersed Storage server
• 4 TB "raw" capacity per 1U server
• Store, return and delete slices
• Unlimited vaults (similar to LUNs) per dsNet
• Deployable in a single rack or geographically distributed around the world
Accesser – Dispersed Storage router
• Slices, disperses and retrieves data to/from Slicestors
• Provides standard storage interfaces
• Ideal for digital content loading
• Deployable in redundant configurations
dsNet Client – Dispersed Storage client software
• Disperses and retrieves data to/from slice servers
• Approximately 3 MB of Java code
• Ideal for content distribution

Exabytes of Data Storage Require a Paradigm Shift
Scenario 1:
Exbibyte (~1,000,000 Terabytes) usable storage, 10 nines of reliability

Traditional Storage (Replication)
• 1 EiB usable + 25% parity
• 1 EiB replicated + 25% parity
• 1 EiB replicated + 25% parity
• Total raw storage ~ 3.75 EiB
• ~300% storage overhead
• Additional costs for replication SW

Dispersed Storage (Dispersal)
• 1 EiB usable + 33% dispersal overhead
• Total raw storage ~ 1.33 EiB
• ~33% storage overhead
• Built-in multi-site availability

dsNet – Standard Interfaces
• iSCSI: Acts like a hard drive
  – Works with any OS or file system
• WebDAV: Acts like a URL
  – Works with any browser
• Java client enables any device to access a dsNet
  – Media players, phones, set-top boxes, security cameras, sensors, etc.

Complete Storage Architecture
[Diagram labels: Client Layer; Network; Protocol Layer; Interface Layer; Vault Structure; Java SDK Named Object; Java SDK Simple Object; WebDAV; Web Service; dsFTP; REST API; iSCSI; SCSI; Object; File System; Block]

Project Overview
An Open Source Project with Commercial Backing

Dispersed Storage – an Open Source Project
• Hosted at www.cleversafe.org
• Includes the complete protocol and algorithms
• Incorporates and/or enhances additional open source software
  – Bouncy Castle – Cryptography
  – JSAP – Java Simple Argument Parser
  – Bzip2 – Data Compressor
  – Apache Commons – Logging, Statistics, basic Internet protocols
  – JUnit – Testing Framework
  – Log4j – Logging Utility
  – MINA – Network Application Framework
  – SLF4J – Simple Logging Façade for Java
  – SVNKit – Java Subversion library
  – Wrapper – Java Service Wrapper
  – ws-commons – Webservices Common Utilities
  – jSCSI – iSCSI Initiator

Open Source Complements Commercial
Commercial
• Cleversafe Dispersed Storage
• Commercial Capabilities – Services: Training, Certification, Support
• Internet Equipment Providers
• Additional Capabilities: Management, Reporting
• Commercial Capabilities – Products: Integrated hw/sw appliances, customized OS, additional hardware features, performance
• Contribute to standards efforts
Open Source
• Interoperability: Protocols, Standards, Open Source software
• Contribute to standards efforts

Thank You
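Supplementary arithmetic: the overhead figures in the dispersal-versus-replication comparison and the exbibyte scenario can be reproduced in a few lines of Python. The sketch below rests on assumptions not stated in the slides: that dispersal overhead is width/threshold minus 1, that "access choices" means the number of distinct threshold-sized subsets of slices, and that replicated raw storage is usable capacity times copies times (1 + parity fraction). The function names are illustrative only.

```python
from math import comb


def dispersal_overhead(width: int, threshold: int) -> float:
    """Extra storage as a fraction of usable data (assumed: width/threshold - 1)."""
    return width / threshold - 1


def access_choices(width: int, threshold: int) -> int:
    """Number of distinct threshold-sized slice subsets (assumed meaning)."""
    return comb(width, threshold)


def replication_raw(usable: float, copies: int, parity: float) -> float:
    """Raw storage when every copy also carries a parity fraction (assumed model)."""
    return usable * copies * (1 + parity)


def dispersal_raw(usable: float, width: int, threshold: int) -> float:
    """Raw storage needed to hold `usable` units under dispersal."""
    return usable * width / threshold


if __name__ == "__main__":
    for width, threshold in [(8, 5), (8, 6), (16, 10), (16, 12), (32, 24), (64, 56)]:
        print(f"{width}/{threshold}: "
              f"overhead {dispersal_overhead(width, threshold):.0%}, "
              f"choices {access_choices(width, threshold):,}")
    # Exbibyte scenario: three copies with 25% parity versus a 16/12 dispersal.
    print("replication raw EiB:", replication_raw(1, copies=3, parity=0.25))
    print("dispersal raw EiB:  ", round(dispersal_raw(1, 16, 12), 2))
```

Under these assumptions the computed overheads match all six dispersal rows, and the subset counts match the 8/6, 16/10 and 16/12 rows exactly and the 32/24 row to rounding. The 8/5 and 64/56 rows quote different figures, so the slides may count access choices differently for those configurations.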