RainStor and Dell DX: Online Structured Data Retention for Data Center Consolidation
Presentation for AFPOA, August 26th, 2011
Ramon.chen@rainstor.com – VP Product Management, RainStor
Craig_Warthen@dell.com – Product Marketing, Product Group, Storage
RainStor + Dell DX
Archival solution for Reduction, Retention & on-demand Retrieval of historical (semi-)structured data and big machine-generated data at 10x less TCO
Unified Platform for Data Consolidation
"Enterprise Information Archiving (EIA) will become a key infrastructure component by 2013, as the archiving of structured data and unstructured content into a single platform emerges. EIA products that support multiple information types are replacing stand-alone application-specific archiving products."
- Gartner, "Enterprise Information Archiving Transforms the Strategy and Approach for Archiving," Kenneth Chin and Sheila Childs, June 2010
DX Object Storage Platform
CONSOLIDATE: All your static data on one massively scalable repository with no complexity
STORE BOTH: Structured and unstructured
Active Transactional Data Becomes a Fraction of Total Storage Over Time
The majority becomes historical data over time, and all of it is historical once the application is no longer active
[Figure: transactional data over time. Axes: Data, Time. Labels: Active, Static, Application Performance, Cost $$$ and pain. Percentage labels: 10%, 30%, 70%, 90%, 100%.]
Automated Data Creation Changes the Mix
Machine-generated or human-generated data that is historical immediately
• SEC: store every trade and price in every market forever!
• Facebook: over 1 PB of log data
• Smart Grid: will generate over 1 EB of data in the US alone!
• US Telco: 750 TB of CDR data retained today
[Figure: machine-generated data over time. Axes: Data, Time. Labels: Application Performance, Costs $$$, Static 100% at each point shown.]
Current Technology Approaches for Long-Term Structured Data Retention
• RDBMS
• Warehouse / Store
• Dev, Test, DR: operational copies compound the storage problem
• Tape
Impact of Traditional or Absent Strategies
• More Data
• More Infrastructure
• More Resources
• Higher Costs
• More Risk
Business Challenges:
• Flat or reduced IT budgets
• Limited IT resources & need to focus on core business systems
• Increasing compliance retention periods compound data volume and management issues
• Limited access to larger data sets impacts ability to perform deeper analysis
Technical Challenges:
• Multiple terabytes of structured data in traditional RDBMSs or files
• Legacy systems are challenging to maintain and costly to manage, just for data access!
• Backup windows aren't being met
• Traditional RDBMS systems cannot ingest and store high-volume data
• Expert resources are required to provide significant care and feeding
Overburdened Resources
Online Data Retention Requires New Technology
• Transactional: OLTP
• Online Data Retention (OLDR): Static Machine-Generated Data (MGD)
• Analytical: OLAP
How We Do It
Reduce
• Size: Massive de-dupe, ~97% savings in storage
• Hardware: On low-cost Dell servers and DX Object Storage Platform
• Resources: Without specialist DBA support and with less storage management
Retain
• Preserved: Massive record volumes in original form
• Immutable: Tamper-proofed with audit trail and WORM
• Configurable: With retention & expiry policies
• Massively Scalable: With no complexity
• Long-Term Preservation: Optimized on object-based technology platform with metadata
Retrieve
• Standards: SQL & BI tools via ODBC/JDBC, HTTP
• Performant: Fast queries for large complex data sets
• Flexible: With schema evolution & point-in-time access
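The "~97% savings in storage" figure can be related to a compression ratio with simple arithmetic: a ratio of r:1 leaves 1/r of the original size, so the saving is 1 - 1/r. The sketch below is illustrative only; the function names are assumptions and are not part of any RainStor or Dell tooling.

```python
def savings_fraction(ratio: float) -> float:
    """Fraction of storage saved at a compression ratio of ratio:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / ratio


def ratio_for_savings(savings: float) -> float:
    """Compression ratio (as N in N:1) needed for a given saved fraction."""
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in [0, 1)")
    return 1.0 / (1.0 - savings)


if __name__ == "__main__":
    for r in (20, 25):
        print(f"{r}:1 compression -> {savings_fraction(r):.0%} saving")
    print(f"97% saving -> roughly {ratio_for_savings(0.97):.0f}:1")
```

By this arithmetic, the 20:1 and 25:1 ratios used in the TCO examples in this deck correspond to savings of 95% and 96%, and a 97% saving corresponds to roughly 33:1.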
Disruptive Technology
• Patented
• Data reduction through value and pattern de-duplication = highest rate of compression available
• Fast queries in stored format without re-inflation = access via SQL and ODBC/JDBC, any BI tool (e.g. Cognos, Business Objects)
[Figure: sample records illustrating de-duplication. Recoverable values: Peter, Smith, Pharmaceutical, $40,000; Paul, Brown, Finance, $35,000; John.]
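The value de-duplication idea above can be illustrated with a toy dictionary encoding: each distinct field value is stored once, and records become tuples of references to those values. This is only a minimal sketch of the general idea, not RainStor's patented method; the function names and the fields of the third ("John") record are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, ...]


def dedupe_values(records: List[Record]) -> Tuple[List[str], List[Tuple[int, ...]]]:
    """Store each distinct value once; encode records as tuples of value ids."""
    value_ids: Dict[str, int] = {}
    values: List[str] = []
    encoded: List[Tuple[int, ...]] = []
    for record in records:
        ids = []
        for field in record:
            if field not in value_ids:
                value_ids[field] = len(values)
                values.append(field)
            ids.append(value_ids[field])
        encoded.append(tuple(ids))
    return values, encoded


def restore(values: List[str], encoded: List[Tuple[int, ...]]) -> List[Record]:
    """Rebuild the original records from the value table."""
    return [tuple(values[i] for i in ids) for ids in encoded]


def count_matching(values: List[str], encoded: List[Tuple[int, ...]],
                   column: int, wanted: str) -> int:
    """Count matches by comparing ids, without rebuilding any record."""
    if wanted not in values:
        return 0
    wanted_id = values.index(wanted)
    return sum(1 for ids in encoded if ids[column] == wanted_id)


records = [
    ("Peter", "Smith", "Pharmaceutical", "$40,000"),
    ("Paul", "Brown", "Finance", "$35,000"),
    ("John", "Smith", "Finance", "$40,000"),  # hypothetical field values
]
values, encoded = dedupe_values(records)
finance_count = count_matching(values, encoded, column=2, wanted="Finance")
```

Here 12 field values reduce to 9 stored values, and the count of "Finance" records is answered from the encoded form, which loosely mirrors the claim of querying without re-inflation.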
Use case: Application Retirement
Confidential
Application Retirement
1000s of legacy apps using Oracle, SQL Server, mainframes, etc.
Retire apps and store data in an optimized repository
Search/Analytics: Same user searches & reports work. No changes needed.

Current Pains | Benefits of Dell Solution
Multiple environments, expensive expert IT resources | All legacy data on a single platform
High ongoing HW/SW & maintenance costs | Reduced maintenance = frees up budgets
Biz users still need occasional but rare access to data | Continued access to data = happy biz users
App Retirement TCO Example
300 legacy apps (250 GB each) = 75 TB
25:1 compression = 3 TB
Saving $5.7m/yr

Cost item | Current | Dell Solution
Storage | 75 TB * $20k/TB = $1.5m | 3 TB * $20k/TB = $60k
Servers | 300 (1 svr/app) * $10k/svr = $3m | 4 shared svrs * $10k/svr = $40k
Admin | 1 DBA ($200k) per 4 apps = $1.5m | 1 Admin ($200k) for entire solution
OPEX | $6m/yr | $300k/yr
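The arithmetic behind the application retirement example can be checked with a few lines of Python. This is a sketch that uses only the figures on the slide; variable names are illustrative. Note one assumption: the admin line is taken as the slide's stated $1.5m total, because 1 DBA at $200k per 4 apps across 300 apps would work out to 75 DBAs, or $15m, which does not match the slide's totals.

```python
K = 1_000
M = 1_000_000

apps = 300
gb_per_app = 250
raw_tb = apps * gb_per_app // 1_000      # 75 TB (decimal units)
compression_ratio = 25
stored_tb = raw_tb // compression_ratio  # 3 TB after 25:1 compression

current_opex = {
    "storage": raw_tb * 20 * K,   # $20k per TB
    "servers": apps * 10 * K,     # one $10k server per app
    "admin": 1_500_000,           # slide's stated line total (assumption, see note)
}
dell_opex = {
    "storage": stored_tb * 20 * K,
    "servers": 4 * 10 * K,        # 4 shared servers
    "admin": 200 * K,             # 1 admin for the entire solution
}

current_total = sum(current_opex.values())
dell_total = sum(dell_opex.values())
saving = current_total - dell_total

if __name__ == "__main__":
    print(f"Current OPEX: ${current_total / M:.1f}m/yr")
    print(f"Dell solution OPEX: ${dell_total / K:.0f}k/yr")
    print(f"Saving: ${saving / M:.1f}m/yr")
```

With these inputs the totals reproduce the slide's $6m, $300k, and $5.7m figures.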
Use case: Machine Generated Data Retention
Machine Generated Data Retention
Billions of human-activity or machine auto-generated records
Dell solution acts as primary repository
- Closed payments/transactions
- Logs, IP records
- Facilities management sensor data
- Manufacturing test, QC
Search: Ability to quickly access data, even as data continues to be ingested.

Current Pains | Benefits of Dell Solution
Daily volumes outstripping RDBMS capacity | Scalable ingestion, storage & query better than RDBMS
Strict compliance and query latency needs | Configurable expiry/purge, low-latency access
Significant $$$ spend to support growth | Lowest cost per retained TB (10x less)
MGD Retention TCO Example
20B WAP logs/day, retained over 3 months = 2 PB
20:1 compression = 100 TB
Saving $11m/year
Dell solution acts as primary repository
- WAP logs
Search: Ability to quickly access data, even as data continues to be ingested.

Cost item | Current | Dell Solution
Storage | 2 PB * $5k/TB = $10m | 100 TB * $5k/TB = $500k
Servers | 100 svrs * $10k/svr = $1m | 10 svrs * $10k/svr = $100k
Admin | 4 Admin ($200k) = $800k | 1 Admin = $200k
OPEX | $11.8m/yr | $800k/yr
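The machine-generated data example follows the same cost model (storage, servers, admin), so it can be expressed as a small reusable structure. This is a sketch under the slide's own figures, with 2 PB taken as 2,000 TB; the `Scenario` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Yearly OPEX inputs for one side of the comparison."""
    storage_tb: int
    cost_per_tb: int
    servers: int
    cost_per_server: int
    admins: int
    cost_per_admin: int

    def opex(self) -> int:
        return (self.storage_tb * self.cost_per_tb
                + self.servers * self.cost_per_server
                + self.admins * self.cost_per_admin)


RAW_TB = 2_000          # 2 PB in decimal units
COMPRESSION_RATIO = 20  # 20:1, as stated on the slide

current = Scenario(RAW_TB, 5_000, 100, 10_000, 4, 200_000)
dell = Scenario(RAW_TB // COMPRESSION_RATIO, 5_000, 10, 10_000, 1, 200_000)

saving = current.opex() - dell.opex()
cost_ratio = current.opex() / dell.opex()

if __name__ == "__main__":
    print(f"Current OPEX: ${current.opex():,}/yr")
    print(f"Dell solution OPEX: ${dell.opex():,}/yr")
    print(f"Saving: ${saving:,}/yr, ratio {cost_ratio:.2f}x")
```

These inputs reproduce the slide's $11.8m, $800k, and $11m figures; the resulting cost ratio of 14.75x is in line with the "10x less" claim.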
Retention & Compliance TCO Comparison
RainStor Cloud/Hosting Enabled
[Diagram labels: Dell DX OR Other Compute; Dell DX Object Storage]
Solution Overview
A solution optimized for long-term preservation that addresses “Big Data” storage problems and enables a better way to retire and archive legacy applications.
Integrated Solution Stack
• Integrated SW and HW for maximum optimization of solution
• Services practice
• Dell storage and servers
• Integrated specialized database
• Dell networking platforms
• Dell cloud infrastructure
Use Cases
• “Big Data” Machine Generated Data
• Application Retirement
• Application Archive (Future)
• Data Warehouse Archive (Future)
[Diagram: solution stack. Access: SQL & BI, Analytics. Data sources: Retired Apps, CDR, Smart Meter, Trade, Network Logs. Services Layer: Consulting, Implementation, Support. OLDR Layer. Storage Layer: DX Object Store.]
DX Object Storage Platform
Enterprise-class storage for fixed digital content
Solution approach to tiered storage and archive/content management utilizing a common platform
Manageability
• Non-disruptive and simple HW expansions, technology transitions, and retirement
• Self-managing, self-healing
• Easy to retrieve data, HTTP API
• Selectable WORM capabilities
Scalability
• Peer-scale architecture
• Scales to petabytes
• Near limitless number of files
• Can scale by as little as 1 node at a time
• X86-based modular architecture
TCO
• Reduce cost of data management by 50%
• X86-based arch.
• Power management features
• No backup infrastructure required
The Digital Object
[Diagram labels: Application; Doc; Object Address (UUID); Metadata; file data shown as a bit stream]
HTTP/1.1 200 OK
Date: Thu, 26 Jun 2008 21:26:34 GMT
Server: Object Cluster/2.2
Application-Name: MS Word
Create-Date: 2008-06-26 21:26:14.687000
System-Cluster: Internet Demo Cluster
System-Created: Thu, 26 Jun 2008 21:26:20 GMT
Content-Disposition: inline; filename=Sports %Segment%20626-08.doc
Content-Length: 8619354
Content-type: application/doc
lifepoint: [Thu, 03 Jul 2015 21:26:14 GMT] reps=2, deletable=True
lifepoint: [] delete
Replica-Count: 2
CUSTOM ELEMENTS
Dell Confidential - Restricted
• Object = Metadata + File Data
• Stored together for life of object
• Metadata:
– Rich, descriptive data about the data
– Context persisted over time
– Enables policy-based management
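Because the metadata travels with the object as HTTP-style headers, a client can read it with ordinary text handling. The sketch below parses a shortened copy of the sample response above into a dictionary; it makes no calls to the DX platform, the function name is hypothetical, and it does not attempt to interpret the lifepoint values.

```python
from typing import Dict, List

# Shortened copy of the sample response shown on the slide.
SAMPLE_RESPONSE = """HTTP/1.1 200 OK
Date: Thu, 26 Jun 2008 21:26:34 GMT
Server: Object Cluster/2.2
Application-Name: MS Word
Content-Length: 8619354
Content-type: application/doc
lifepoint: [Thu, 03 Jul 2015 21:26:14 GMT] reps=2, deletable=True
lifepoint: [] delete
Replica-Count: 2
"""


def parse_headers(raw: str) -> Dict[str, List[str]]:
    """Map lower-cased header names to their values, keeping repeats in order."""
    lines = raw.strip().splitlines()
    headers: Dict[str, List[str]] = {}
    for line in lines[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers


headers = parse_headers(SAMPLE_RESPONSE)

if __name__ == "__main__":
    for name, values in headers.items():
        print(name, "->", values)
```

Values are kept as lists because a header such as `lifepoint` can appear more than once in the same response.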
Simplified Expansion and Technology Refresh
Adding capacity is simple
• Rack & cable
• Power-up
• No config or provisioning
Refresh is just as easy
• Upgrade without interruption
• Retire node or volume
• Replicates data to another node or volume
Recovery & balancing is automatic
• Continuous data availability
• Load and capacity balanced
Enterprise Ready Solution
Import
Query
Key Benefits
Low TCO
Less Software, Hardware & People to maintain large data sets
Ease of Use
• Very low admin & no tuning needed
• Peer-scale architecture
• No provisioning or configuring
• Backupless environment
Compliant
• Configurable retention rules
• WORM options
• Auto disposition
Performance
• High ingestion rate
• Fast queries
• Linear scale-out performance
• Optimized
Massive Data Compression
Consolidated
One Database, One Platform
Scalable
• Massive scalability with no complexity
• Big Data volumes