Managing the Unimaginable: A Practical Approach to Petabyte Data Storage
Randy Cochran, Infrastructure Architect, IBM Corporation, hcochran@us.ibm.com
TOD-1366 - Information on Demand Infrastructure

Data Storage is Getting Out-of-Hand
Are storage demands starting to overpower you?

Most Research Firms Agree
“It is projected that just four years from now, the world’s information base will be doubling in size every 11 hours.” (“The toxic terabyte; How data-dumping threatens business efficiency”, Paul Coles, Tony Cox, Chris Mackey, and Simon Richardson, IBM Global Technology Services white paper, July 2006)
“Our two-year terabyte CAGR of 52% is 3ppt (percentage points) below rolling four quarter results of 55%.” ("Enterprise Hardware: 2007-08 storage forecast & views from CIOs", Richard Farmer and Neal Austria, Merrill Lynch Industry Overview, 03 January 2007)
“With a 2006–2011 CAGR nearing 60%, there is no lack in demand for storage…” ("Worldwide Disk Storage Systems 2007–2011 Forecast: Mature, But Still Growing and Changing", Research Report # IDC206662, Natalya Yezhkova, Electronics.ca Publications, May 2007)
“According to TheInfoPro…..the average installed capacity in Fortune 1000 organizations has jumped from 198 TB in early 2005 to 680 TB in October 2006. …..TIP found that capacity is doubling every 10 months.” (InfoStor Magazine, Kevin Komiega, October 19, 2006)

What’s Driving Petabyte Level Storage?
The “Perfect Storm”
• General increase in demand
• New digital data technologies
• More regulatory requirements
• Better protection from litigation
• Proliferation of sophisticated applications
• Disaster recovery plans
• Declining storage media costs
• A desire for greater storage efficiency
• Storage technical skills scarcity
• A growing understanding of retained data’s business value
According to IDC, between 2006 and 2010 the information added annually to the digital universe will increase more than sixfold, from 161 to 988 exabytes.

Just How Big is a Petabyte?
Data Storage Size Relationships

| Terminology | IEC Notation | Value | Bits | Bytes |
|---|---|---|---|---|
| Bit | bit | ---- | 1 | -- |
| Byte | B | ---- | 8 | 1 |
| Kilobyte | KB | 1024^1 = 2^10 | 8,192 | 1,024 |
| Megabyte | MB | 1024^2 = 2^20 | 8,388,608 | 1,048,576 |
| Gigabyte | GB | 1024^3 = 2^30 | 8,589,934,592 | 1,073,741,824 |
| Terabyte | TB | 1024^4 = 2^40 | 8,796,093,022,208 | 1,099,511,627,776 |
| Petabyte | PB | 1024^5 = 2^50 | 9,007,199,254,740,992 | 1,125,899,906,842,624 |
| Exabyte | EB | 1024^6 = 2^60 | 9,223,372,036,854,775,808 | 1,152,921,504,606,846,976 |
| Zettabyte | ZB | 1024^7 = 2^70 | 9,444,732,965,739,290,427,392 | 1,180,591,620,717,411,303,424 |
| Yottabyte | YB | 1024^8 = 2^80 | 9,671,406,556,917,033,397,649,408 | 1,208,925,819,614,629,174,706,176 |

Petabyte storage has been around for years; online petabyte storage has not.
“Ninety-two percent of new information is stored on magnetic media, primarily hard disks.” (“How Much Information 2003”, UC Berkeley's School of Information Management and Systems)

How Big is That in Human Terms?
Data Storage Size Relationships

| Terminology | Value | Example |
|---|---|---|
| Bit | Two | A "bit" is short for "binary digit" and can hold only two states, 0 or 1 |
| Byte | Eight | A unit of storage capable of holding a single alphanumeric character |
| Kilobyte | Thousand | One page of a book having around a thousand characters on it |
| Megabyte | Million | A medium resolution color photograph taken by a digital camera |
| Gigabyte | Billion | A gigabyte is equal to the contents of about 10 yards of books on a shelf |
| Terabyte | Trillion | A terabyte could hold 1,000 copies of the Encyclopedia Britannica |
| Petabyte | Quadrillion | Approximately 100 times the printed collection of the Library of Congress |
| Exabyte | Quintillion | Estimated as 1/5th of all words spoken since the beginning of history |
| Zettabyte | Sextillion | Estimated size of data storage for all computers in the world by 2010 |
| Yottabyte | Septillion | (Difficult to equate to a meaningful example) |

According to Britannica.com, the U.S. Library of Congress contains approximately 18 million books, 2.5 million recordings, 12 million photographs, 4.5 million maps, and more than 54 million manuscripts.

Why is Petabyte Storage a Challenge?
Areas Impacted by Petabyte Storage:
• Content and File Management
• Application & Database Characteristics
• Storage Management
• Architectural Design Strategy
• Performance and Capacity
• SAN Fabric Design
• Backup and Recovery Methods
• Security
• System Complexity
• Compliance with Regulatory Requirements
• Operational Policies and Processes
• Maintenance Requirements

Content and File Management

Management Starts With Data Classification
Data Classification Assumptions
• Not all data is created equal
• The business value of data changes over time
• Performance can be improved by re-allocating data to an optimized storage configuration
• The value of most business data is not fixed; it is expected to change over time
Understanding the business value of data is crucial in designing an effective data management strategy.
Which data has a greater value to the business: a client’s purchase record, or a memo about last year’s phone system upgrade?
Data Classification Example

| Data Classification | Percent of Active Data | Storage Class | Disk Drive Characteristics | Vendor Models | Relative Cost/GB | Typical Tier | Recovery Level Requirement |
|---|---|---|---|---|---|---|---|
| Life Critical / Business Critical | 10% - 15% | Enterprise Class | FC SCSI 15K or 10K RPM, low to medium capacity | IBM DS8300; EMC DMX1/2/3000; HDS Tagmastore | x45 | Tier 1 | Recovery in < 15 minutes |
| Business Important | 20% - 30% | Mid-Range | FC SCSI 10K RPM, medium to high capacity | IBM DS6800, DS4800; EMC DMX800; HDS 9970/80 | x25 | Tier 2 | Recovery in < 4 hours |
| Business Standard | 50% or more | Mid-Range | SATA, FATA, or SAS high capacity | IBM DS48000, DS4700; EMC CX300/500/700; HDS 9530 Thunder | x5 | Tier 3 | Recover in < 72 hours |
| Nearline / Reference | 50% or more | Low End | Optical or Tape Device | IBM DR550 or TS3500; EMC Centera; Hitachi WMS100; HP 7100ux Optical Jukebox | x3 | Tier 4 | Recover as soon as practical |
| Low Value / Archived | Inactive data | Archived | Archived data | Iron Mountain, internal archiving | x1 | Tier 5 | Recover when required |

There are no universally accepted standard definitions for Tier Levels.
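As a rough illustration of how the recovery-level column could drive tier selection, the sketch below picks the least expensive tier whose stated recovery time still meets a requirement. The thresholds come from the table; the function name `cheapest_tier`, the `archived` flag, and the convention of passing `None` for data with no fixed deadline are assumptions made for this example, not part of the presentation.

```python
from typing import Optional

# Recovery-time guarantees (in hours) for Tiers 1-3, taken from the table.
# Tiers 4 and 5 have no numeric guarantee in the table.
TIER_RECOVERY_HOURS = {1: 0.25, 2: 4.0, 3: 72.0}


def cheapest_tier(required_hours: Optional[float], archived: bool = False) -> int:
    """Return the highest-numbered (lowest cost) tier that meets the requirement.

    `archived` and the use of None are hypothetical conventions for this sketch.
    """
    if archived:
        return 5  # "Recover when required"
    if required_hours is None:
        return 4  # no fixed deadline: "Recover as soon as practical"
    candidates = [tier for tier, limit in TIER_RECOVERY_HOURS.items()
                  if limit <= required_hours]
    if not candidates:
        raise ValueError("no tier in the table recovers that quickly")
    return max(candidates)


if __name__ == "__main__":
    for need in (0.25, 2, 4, 100, None):
        print(need, "->", "Tier", cheapest_tier(need))
```

A requirement of two hours maps to Tier 1 here because Tier 2 only promises recovery within four hours; a real classification exercise would also weigh the business value of the data, not just its recovery deadline.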
Control Your File Content
• Implement file aging
• Set data retention periods
• Eliminate low value data
  • Clean out old backup files
  • Eliminate outdated information
  • Deploy de-duplication technology
  • Reduce storage of low value data
  • Locate and purge corrupt files
• Crack down on unauthorized storage usage
• Periodically review log files and archive or delete obsolete information

Application and Database Characteristics

Know Your Application and Database Needs
Know your application's needs
• User expectations
• Workload complexity
• Read or write intensity
• Sequential file usage
• IOPS dependence
• Stripe size optimization
• Throughput requirements
• Service prioritization
• Growth expectations
Don’t allow databases to “lock up” vast amounts of storage.

Applications Will Drive Storage Requirements
Application characteristics will drive storage decisions
• Value to the business
• Number of users
• Usage patterns
  • Steady
  • Bursty
  • Cyclical
  • Variable
• 7x24 or 9x5 access
• Domestic or global access
• Distributed or self-contained
• High or low security data
• Architectural constraints
Significant performance gains (or losses) can be achieved by matching requirements to storage characteristics.

Storage Management

Large Storage Systems Must Be Managed
• Information Lifecycle Management (ILM)
• Hierarchical Storage Management (HSM)
• Storage Resource Management (SRM)
• Storage Virtualization
"Enterprises can achieve better and more targeted utilization of resources by first establishing the value of their information assets and then using storage management software to execute the policies that define how resources are utilized." Noemi Greyzdorf, research manager, Storage Software, IDC

Information Lifecycle Management
“(ILM is) the process of managing business data throughout its lifecycle from conception until disposition across different storage media, within the constraints of the business process.” (courtesy of Veritas Corporation, Nov. 2004)
ILM is not a commercial product, but a complete set of products and processes for managing data from its initial inception to its final disposition.

Information Lifecycle Management
Information has business value
• Its value changes over time
• It ages at different rates
• It has a finite life-cycle
• As data ages, its performance needs change
Some information is subject to different security requirements, due to government regulatory or legal enforcements.
Outdated information has different disposal criteria.
ILM is a combination of processes and technologies that determine how information flows through a corporate environment.
It encompasses management of information from its creation until it becomes obsolete and is destroyed.

“Best Practices” for ILM Implementations
• Know exactly where information is stored
• Be able to retrieve information quickly and efficiently
• Limit access to only those who need to view data
• Create policies for managing and maintaining data
• Do not destroy important documents
• Avoid keeping multiple copies of the same data
• Retain information only until it is no longer useful
• Destroy outdated files on a regular basis
• Document all processes and keep them up-to-date

Hierarchical Storage Management
“HSM is a policy-based data storage management system that automatically moves data between high-cost and low-cost storage media, without requiring the knowledge or involvement of the user.” (courtesy of http://searchstorage.techtarget.com)
IBM has been involved in providing HSM solutions for over 30 years and offers a wide variety of products with automated data movement capabilities.

File Access Activity Over Time
[Chart: AIX File Accesses Per Day. Y-axis: Number of Accesses (0 to 1,800,000). X-axis: time since last access, in buckets from "<= 1 day" through "> 1-yr". Series: Last Accessed, with an exponential trend line. Callouts: files accessed between 2 and 6 months; files accessed within the last 2 months; files accessed beyond 6 months (callout percentages: 15%, 10%, 75%).]

Hierarchical Storage Management
[Diagram: storage pyramid divided 10% / 20% / 70%, with archive]
HSM Concepts
• Only 10%-15% of most data is actively accessed
• The business value of data changes over time
• Between 80% and 90% of all stored data is inactive
• High performance storage (FC disk) is expensive
• Lower performance media (tape, optical platters, and SATA disk) are comparatively inexpensive

Hierarchical Storage Management
[Diagram: storage tiers ranked by relative cost, from $$$$ down to $]
HSM Concepts (cont.)
• Enterprise class storage is not required for all data
• Policies can be set to establish the proper frequency for transitioning aging data to less expensive media
• HSM allows optimal utilization of expensive disk storage
• Low cost, high density disks consume fewer resources
• Overall storage system performance may improve

IBM Products with HSM Capabilities
• General Parallel File System (GPFS)
• IBM Content Manager for Multiplatforms
• Tivoli Storage Manager HSM for Windows
• Tivoli Storage Manager for Space Management (AIX)
• SAN File System (SFS)
• DFSMShsm (Mainframe)
• High Performance Storage System (HPSS)

Storage Resource Management
“Storage Resource Management (SRM) is the process of optimizing the efficiency and speed with which the available drive space is utilized in a storage area network (SAN). Functions of an SRM program include data storage, data collection, data backup, data recovery, SAN performance analysis, storage virtualization, storage provisioning, forecasting of future needs, maintenance of activity logs, user authentication, protection from hackers and worms, and management of network expansion. An SRM solution may be offered as a stand-alone product, or as part of an integrated program suite.” (Definition courtesy of http://searchstorage.techtarget.com)
IBM’s primary tool for Storage Resource Management is its TotalStorage Productivity Center suite of tools for disk, data, fabric, and replication.
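The HSM idea described above, moving aging data to less expensive media by policy, can be sketched in a few lines of Python. This is an illustration only, not how any of the IBM products listed work: the 2-month and 6-month boundaries follow the callouts on the file access chart (approximated here as 60 and 180 days), while the tier labels and the function names `tier_for_age` and `plan_migration` are hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 86400


def tier_for_age(days_since_access: float) -> str:
    """Map time since last access to a storage tier (labels are illustrative)."""
    if days_since_access <= 60:      # accessed within roughly the last 2 months
        return "high-performance disk"
    if days_since_access <= 180:     # accessed between roughly 2 and 6 months ago
        return "lower-cost disk"
    return "tape or optical"         # not accessed for more than 6 months


def plan_migration(paths, now=None):
    """Return {path: target tier} based on each file's last-access time."""
    now = time.time() if now is None else now
    plan = {}
    for path in paths:
        age_days = (now - os.stat(path).st_atime) / SECONDS_PER_DAY
        plan[path] = tier_for_age(age_days)
    return plan


if __name__ == "__main__":
    for path, tier in plan_migration(["."]).items():
        print(path, "->", tier)
```

Last-access timestamps are only as reliable as the file system's access-time settings, so a production policy would normally rely on the HSM product's own metadata rather than `os.stat`.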
Storage Resource Management Functions
• Deployment Management
• Compliance Management
• Event Management
• Performance Management
• Accounting Management
• Quota Management
• Asset Management
• Change Management
• Capacity Planning
• Operational Management
• Service Level Management
• Policy Management
• Security Management
• Automation
• Backup & Recovery
• HSM Operations
• Point-in-Time Copies
• Disaster Recovery
• Data Migration
• Data Archiving

Storage Virtualization
Virtualization: “The act of integrating one or more (back end) services or functions with additional (front end) functionality for the purpose of providing useful abstractions. Typically virtualization hides some of the back end complexity, or adds or integrates new functionality with existing back end services. Virtualization can be nested or applied to multiple layers of a system.” (Definition courtesy of http://www.snia.org/education/dictionary)
Virtualization allows most of the complexity of a storage infrastructure to be hidden from the user.

Virtualization Makes Storage One Large Pool
Virtualization Characteristics
• Makes storage configuration details invisible to the user
• Improves overall manageability of the system
• Aggregates isolated storage “islands” into a unified view
• Facilitates greater flexibility and scalability
• Optimizes utilization of storage capacity
• Provides the ability to move data on-the-fly
• Improves storage subsystem flexibility
• Allows rapid re-allocation of storage resources
• Improves performance by providing another layer of caching
• May provide additional functionality for the SAN

Architectural Design Strategy

Key Architectural Design Considerations
• Resource Consumption
• Storage Economics
• RAID Allocation
• Performance Objectives
• Other Design Issues
The integrity of the architectural design will determine the overall performance, stability, economic efficiency, manageability and future scalability of the system.

Power Consumption vs. Storage Capacity
Disk Power Consumption - Cost per Petabyte

| Disk Type | Disk Capacity in GB | Average Power per Disk in Watts | # of Disks per Petabyte | Total Power in KW | Total Power in KW/hr. | Cost per KW/hr. @ $.0874 (**) | Power Cost Per Year | Power Cost Over a 5-yr Period | Power Consumption Efficiency Index (Watts/GB) | BTUs per Disk | BTU/hr per Petabyte |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FC | 36.7 | 9.9 | 27,248 | 270.0 | 20.25 | $1.77 | $15,505 | $77,523 | 0.270 | 33.8 | 3,314,653,798 |
| FC | 73.4 | 9.3 | 13,624 | 126.8 | 9.51 | $0.83 | $7,283 | $36,417 | 0.127 | 31.7 | 1,557,078,474 |
| FC | 146.8 | 10.8 | 6,812 | 73.7 | 5.53 | $0.48 | $4,232 | $21,158 | 0.074 | 36.9 | 904,644,196 |
| FC | 300 | 13.8 | 3,333 | 46.0 | 3.45 | $0.30 | $2,642 | $13,209 | 0.046 | 47.1 | 564,777,840 |
| SATA | 250 | 9.3 | 4,000 | 37.2 | 2.79 | $0.24 | $2,136 | $10,680 | 0.037 | 31.7 | 456,667,200 |
| SATA | 320 | 9.3 | 3,125 | 29.1 | 2.18 | $0.19 | $1,669 | $8,344 | 0.029 | 31.7 | 356,771,250 |
| SATA | 400 | 9.3 | 2,500 | 23.3 | 1.74 | $0.15 | $1,335 | $6,675 | 0.023 | 31.7 | 285,417,000 |
| SATA | 500 | 9.3 | 2,000 | 18.6 | 1.40 | $0.12 | $1,068 | $5,340 | 0.019 | 31.7 | 228,333,600 |
| SATA | 750 | 9.3 | 1,333 | 12.4 | 0.93 | $0.08 | $712 | $3,560 | 0.012 | 31.7 | 152,222,400 |
| SATA | 1000 | 9.3 | 1,000 | 9.3 | 0.70 | $0.06 | $534 | $2,670 | 0.009 | 31.7 | 114,166,800 |

These disks all have very similar power consumption requirements, even though the largest one has about 27 times the capacity of the smallest one.
In addition, each disk will require approximately 0.4-0.6 watts of electrical power to cool each BTU of heat produced.
** National retail price of electricity per kWh from “Power, Cooling, Space Efficient Storage”, page 2, ESG white paper, Enterprise Strategy Group, July 2007.
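Several columns of the table above follow directly from disk capacity and per-disk wattage. The sketch below recomputes them under two assumptions that appear consistent with the table: a petabyte is treated as 1,000,000 GB, and heat is converted at the standard rate of about 3.412 BTU/hr per watt. The function name `per_petabyte` is illustrative, and the results only approximately match the slide, presumably because the displayed wattages are rounded. The cost columns are not recomputed here.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # standard conversion: 1 W is about 3.412 BTU/hr


def per_petabyte(capacity_gb, watts_per_disk, gb_per_pb=1_000_000):
    """Derive per-petabyte figures for one disk model (illustrative helper)."""
    disks = round(gb_per_pb / capacity_gb)
    return {
        "disks": disks,
        "total_kw": disks * watts_per_disk / 1000,
        "watts_per_gb": watts_per_disk / capacity_gb,
        "btu_hr_per_disk": watts_per_disk * BTU_PER_HOUR_PER_WATT,
    }


# (disk type, capacity in GB, average watts per disk) copied from the table
DISKS = [("FC", 36.7, 9.9), ("FC", 300, 13.8), ("SATA", 750, 9.3), ("SATA", 1000, 9.3)]

if __name__ == "__main__":
    for kind, capacity, watts in DISKS:
        r = per_petabyte(capacity, watts)
        print(f"{kind} {capacity} GB: {r['disks']:,} disks, "
              f"{r['total_kw']:.1f} kW, {r['watts_per_gb']:.3f} W/GB, "
              f"{r['btu_hr_per_disk']:.1f} BTU/hr per disk")
```

The comparison makes the slide's point concrete: wattage per disk barely changes across the range, so watts per gigabyte falls almost in proportion to capacity.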
Comparing Storage Subsystem Power Costs

Power Cost - Traditional Storage

| All on One Tier | Percent of Total Storage | 1000 TB of Storage | Storage Type | Storage Frames in KW | Disk Type | Disk Size in GB | Number of Disks | Total Disk Power in KW | Total Power in KW | Power Cost per Year |
|---|---|---|---|---|---|---|---|---|---|---|
| All Data | 100% | 1000 | (11) DS8300 | 143.0 | FC | 146 | 6850 | 74.1 | 217.1 | $166,219 |

Power Cost - Tiered Storage

| Tiered by Activity | Percent of Total Storage | 1000 TB of Storage | Storage Type | Storage Frames in KW | Disk Type | Disk Size in GB | Number of Disks | Total Disk Power in KW | Total Power in KW | Power Cost per Year |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequently accessed | 10% | 100 | (2) DS8300 | 26.0 | FC | 146 | 685 | 7.4 | 33.4 | $25,580 |
| Infrequently accessed | 20% | 200 | (3) DS4800 | 7.9 | FC | 300 | 667 | 9.2 | 17.1 | $13,097 |
| Seldom accessed | 70% | 700 | (9) DS4200 | 6 | SATA | 750 | 934 | 8.7 | 14.7 | $11,244 |
| Total | | | | | | | | | | $49,921 |

Significant power savings may be realized by redistributing data to the appropriate type and size of disk drive.

Comparing Storage Subsystem Cooling Costs

Cooling Cost - Traditional Storage

| All on One Tier | Percent of Total Storage | 1000 TB of Storage | Storage Type | Storage Frame Heat in BTUs | Disk Size in GB | Number of Disks | Individual Disk Heat in BTUs | Total Disk Heat in BTUs | Total System Heat in BTUs | Total Power for Cooling in KW | Cooling Cost per Year |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All Storage | 100% | 1000 | (11) DS8300 | 262801 | 146 | 6850 | 36.9 | 252765 | 515566 | 74 | $56,702 |

Cooling Cost - Tiered Storage

| Tiered by Activity | Percent of Total Storage | 1000 TB of Storage | Storage Type | Storage Frame Heat in BTUs | Disk Size in GB | Number of Disks | Individual Disk Heat in BTUs | Total Disk Heat in BTUs | Total System Heat in BTUs | Total Power for Cooling in KW | Cooling Cost per Year |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Frequently accessed | 10% | 100 | (2) DS8300 | 47782 | 146 | 685 | 36.9 | 25277 | 73059 | 21 | $16,389 |
| Infrequently accessed | 20% | 200 | (3) DS4800 | 2412 | 300 | 667 | 47.1 | 31416 | 33828 | 10 | $7,588 |
| Seldom accessed | 70% | 700 | (9) DS4200 | 13644 | 750 | 934 | 31.7 | 29608 | 43252 | 13 | $9,703 |
| Total | | | | | | | | 86300 | 150,138 | 44 | $33,680 |

Additional power savings may be realized from the reduced cooling requirements provided by high capacity, lower wattage disk drives.
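The yearly power figures in the tables above appear to be total kilowatts multiplied by the hours in a year and by the $0.0874 per kWh electricity price footnoted in the disk power table. The sketch below makes that arithmetic explicit; the function name `annual_cost` is illustrative, and the results differ from the slide by a few dollars, presumably because the displayed kilowatt values are rounded.

```python
HOURS_PER_YEAR = 8760        # 365 days x 24 hours
RATE_PER_KWH = 0.0874        # electricity price cited in the disk power table footnote


def annual_cost(kilowatts, rate=RATE_PER_KWH):
    """Yearly electricity cost for a constant load, in dollars."""
    return kilowatts * HOURS_PER_YEAR * rate


# Frame power + disk power in kW, copied from the power cost tables.
traditional_kw = 143.0 + 74.1
tiered_kw = [26.0 + 7.4, 7.9 + 9.2, 6 + 8.7]

traditional = annual_cost(traditional_kw)
tiered = sum(annual_cost(kw) for kw in tiered_kw)

if __name__ == "__main__":
    print(f"Traditional: ${traditional:,.0f} per year")
    print(f"Tiered:      ${tiered:,.0f} per year")
    print(f"Difference:  ${traditional - tiered:,.0f} per year")
```

The same function applies to the cooling tables if the "Total Power for Cooling in KW" column is used as the load.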
Comparing Storage Floor-Space Cost

| Scenario | Model | Number of Units | Unit Width in Inches | Unit Depth in Inches | Service Clearance Front Depth in Inches | Service Clearance Rear Depth in Inches | Footprint per Unit (sq. ft.) | Total Footprint (sq. ft.) | Typical Cost per sq. ft. per Month | Total Cost per Month | Total Cost per Year | Percent Difference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Traditional Storage | DS8300 | 11 | 104.1 | 46.6 | 48 | 30 | 90.08 | 990.83 | $65 | $64,404 | $772,847 | ---- |
| Tiered Storage | DS8300 | 2 | 104.1 | 46.6 | 48 | 30 | 90.08 | 180.15 | | | | |
| Tiered Storage | DS4800 | 3 | 25.4 | 43.3 | 36 | 36 | 20.34 | 61.01 | | | | |
| Tiered Storage | DS4200 | 9 | 25.4 | 43.3 | 36 | 36 | 20.34 | 183.04 | | | | |
| Tiered Storage | Total | | | | | | | 424.20 | $65 | $27,573 | $330,878 | 57% |

The DS4800 and DS4200 storage subsystems include the required number of disk expansion trays mounted in standard equipment racks.

How Do the Costs Add Up?
[Diagram: rack layouts for the traditional approach (everything on DS8300s) and the tiered storage approach (DS8300, DS4800, and DS4200s with SATA disk)]

| Cost | Traditional Approach | Tiered Storage Approach |
|---|---|---|
| Power Cost | $166,219 | $49,921 |
| Cooling Cost | $56,702 | $33,680 |
| Floor Space | $772,847 | $330,878 |
| Total | $995,768 | $414,479 |

Savings: $581,289 / yr.

A Look at Older Disk Subsystem Efficiency
100 TB of Storage

| Model | Frames | Capacity in TB | Largest FC Drives | Watts | BTUs/hr. | Width | Depth | Access Clearance Front | Access Clearance Rear | Total Sq. Ft. |
|---|---|---|---|---|---|---|---|---|---|---|
| ESS800 w/frame | 1 | 55.9 | 145.6 GB | 13,112 | 47,000 | 115.7 | 35.8 | 34 | 45 | 5672.06 |
| ESS800 w/frame | 1 | 55.9 | ---- | 13,112 | 47,000 | 115.7 | 35.8 | ---- | ---- | 4142.06 |
| Total | 2 | 111.8 | | 26224 | 94000 | 231.4 | 35.8 | 34 | 45 | 9814.12 |
| DS8300 | 1 | 115.2 | 300 GB | 7000 | 23,891 | 104.1 | 44.6 | 48 | 30 | 6082.86 |
| 2107-9AE | 1 | ---- | ---- | 6,000 | 20,478 | ---- | ---- | ---- | ---- | ---- |
| Total | 2 | 115.2 | ---- | 13000 | 44369 | 104.1 | 44.6 | | | 6082.86 |
| DS8300 Benefits | 100% | 103% | ---- | 50% | 53% | 45% | 125% | ---- | ---- | 38% |

Storing 100 TB of data on more modern storage subsystems results in 50% less power consumption, a 53% reduction in BTUs per hr., and a reduction in required floor space of 38%. In addition, a DS8300 system has over 7x the throughput of the ESS800.

Why is Tiered Storage Important?
• Maps data’s business value to disk characteristics
• Places data on storage appropriate to its usage
• Incorporates lower cost disks
• Reduces resource usage (power, cooling, etc.)
• Matches user access needs to storage characteristics
• Capitalizes on higher capacity disk drive technology
• Increases overall performance of the system

A Typical Tiered Storage Architecture
[Diagram: servers and SAN switches connected to four storage tiers]
• DS8300: Business Critical; High Performance / Very High Availability
• DS4800: Business Important; Good Performance / High Availability
• DS4200s with SATA Disk: Business Standard; Average Performance / Standard Availability
• TS3500 Tape Library: Reference / Historical; Near-line or Off-line
Normally a tiered storage strategy is based on data’s business value.

Choosing the Right Controller Frame
[Chart: Storage Controller Cost Comparison. Y-axis: List Price ($0 to $250,000). Bars: DS8300 Base System, DS6800 Controller, DS4800 Controller, DS4700 Controller, DS4200 Controller.]

Choosing the Right Disk Characteristics
• Tier 0: System Primary Memory (DRAM and Cache), Solid State Disk. Ultra High Performance.
• Tier 1: FC SCSI Disk. High Performance (15K RPM), or low capacity with high spindle count (10K RPM).
• Tier 2: FC or SAS SCSI Disk. Medium Performance SCSI or SAS (10K RPM), or high capacity SCSI or SAS (10K RPM).
• Tier 3: SATA Disk. High capacity, low cost SATA disk.
• Tier 4: UDO, DVD, MO, or Tape Library (Near-Line Media). High capacity, very low cost medium for long-term archiving.
Notes:
• Tiers 0 and 1 must be physically close to the client equipment for latency reasons. Tiers 2-4 can be more remote.
• Active data in Tiers 0 & 1 is normally less than 20% of the total data.
• Inactive data (80%) is not significantly affected by latency issues from asynchronous connections.
• Management and capacity planning for the entire tiered structure can be executed from any geographic location.

Comparing Disk Drive Attributes
[Chart: Cost-per-GB By Disk Drive Type. Y-axis: $0.00 to $30.00. Bars, left to right: 4GB FC 73.4GB 15K; 4GB FC 146.8GB 15K; 4GB FC 300GB 15K; 2GB FC 36.4GB/15K; 2GB FC 73.4GB/15K; 2GB FC 146.8GB/15K; 2GB FC 73.4GB/10K; 2GB FC 146.8GB/10K; 2GB FC 300GB/10K; 400 GB SATA II; 500 GB SATA II; 750 GB SATA II. Data labels, left to right: $24.51, $22.87, $22.87, $16.89, $16.89, $15.25, $13.51, $13.69, $10.66, $2.56, $2.56, $2.30.]

The Cost Impact of Adding Disk Trays
[Chart: Declining Cost-Per-GB (DS4800, in 4-disk increments). Y-axis: Cost per GB ($0.00 to $100.00). X-axis: Disk Trays (1 to 14).]
Note: Calculations based on 146 GB, 10K RPM drives.

Tiered Storage Design Pros and Cons
Advantages
• Lower initial purchase price
• Higher capacity per square foot
• Reduced power consumption
• Decreased requirement for cooling
• Increased equipment flexibility
• Potentially a higher performance solution
Disadvantages
• Inherently a more complex architecture
• Greater up-front effort to design and implement
• Requires advanced storage design skills and knowledge

RAID Selection Decision Drivers
Application or database characteristics
• Read/write mix
• Dependency on IOPS
RAID performance characteristics
• Appropriate RAID level
• Number of disks per array
• Stripe size
• Available bandwidth
Configuration rules and recommendations
Loss from data parity and hot sparing
Disk failure probability
RAID parity rebuild times

Loss from Mirroring, Striping, and Sparing

| RAID Configuration | Usable | Unusable | Loss |
|---|---|---|---|
| RAID10 (mirror plus stripe) | 468.8 TB | 531.2 TB | 53.1% |
| RAID1 (mirror only) | 468.8 TB | 531.2 TB | 53.1% |

Loss from RAID5 Parity and Sparing

| RAID Configuration | Usable | Unusable | Loss | Note |
|---|---|---|---|---|
| RAID5 - 3+P Array | 559.5 TB | 440.5 TB | 44.1% | The second tray has one 2+P array to allow for one spare drive per two trays. |
| RAID5 - 7+P Array | 791.5 TB | 208.5 TB | 20.9% | The second tray has one 6+P array to allow for one spare drive per two trays. |
| RAID5 - 14+P Array | 857.1 TB | 142.9 TB | 14.3% | Each tray has one spare drive. |

Other Architectural Considerations
• Compatibility
• High availability
• Architectural robustness
• Flexibility and scalability
• Stability of the technology
• Vendor’s financial standing
• Well defined product line roadmap
• Support for industry standards

Performance and Throughput

Storage Subsystem Performance Drivers
• Business objectives and user expectations
• Applications and database characteristics
• Server characteristics
• SAN fabric characteristics
• Storage controller characteristics
• Caching characteristics
• Configuration characteristics
• Disk latency characteristics
"We can't solve problems by using the same kind of thinking we used when we created them." Albert Einstein

Storage Performance Enhancers
• Data Mover: Reassigning data transfer tasks to a specialized “engine” reduces the workload on the host processing system.
• Search Engines: Systems dedicated to executing searches in vast amounts of stored data to satisfy specific requests.
• Directory Services: Stores and organizes information about resources and data objects.
• High Speed Interconnections: “Behind the scenes” networks dedicated to the transfer of large amounts of data.
• Autonomic Computing: The system must be able to reconfigure itself under varying and possibly unpredictable conditions.

Other Storage Performance Tips
Method and description:
RAID selection: Striping data across multiple disks to reduce the response latency of individual disks and increase the available bandwidth of the array.
Thin Provisioning: Allows applications and databases to be allocated more capacity than is physically reserved on the storage subsystem.
Parallel file systems: Provides individual file block striping across multiple disks to greatly improve I/O throughput and storage capacity.
Virtualization: Provides some caching of data, spreads I/O across multiple storage controller channels, and can be used as a "data mover" for remote data replication.
Increased spindle count: Deploying a large number of low capacity, high speed disks in a RAID array to increase the number of spindles and improve the array's IOPS capability.
High RPM disk drives: Implementing 15K RPM drives to reduce disk latency and increase the transfer rate.
Striping across trays: Building a RAID array by vertically striping it across multiple disk trays and storage controller channels.
Increasing cache size: Installing additional storage cache to increase the percentage of "read hits" from cache and avoid much slower read requests from disk.
Sequential file isolation: Isolating sequential write files on dedicated physical disks to minimize head seek time and disk rotational latency.
1:1 SAN port relationships: Assigning high performance servers and storage controllers to SAN fabric ports with a 1:1 relationship to ensure maximum bandwidth is always available.
Hot spot elimination: Monitoring disk storage for "hot spots" (over-worked array areas) and reallocating storage to eliminate the usage imbalance.
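
The "High RPM disk drives" tip can be made concrete with a little arithmetic. As a rough sketch that is not taken from the deck, average rotational latency can be approximated as the time for half a platter revolution, so it depends only on spindle speed. The function name is illustrative, and seek time and transfer rate are not modeled.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds (time for half a revolution)."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    # Compare common spindle speeds.
    for rpm in (7_200, 10_000, 15_000):
        print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under this approximation, moving from 10K to 15K RPM drives reduces the average rotational latency from 3 ms to 2 ms.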

SAN Fabric Network

SAN Fabric Overview

SAN Fabric is the interconnecting structure between associated servers and storage devices
Proper fabric design directly impacts:
  • Performance
  • Availability
  • Equipment cost
  • Manageability
  • Maintainability
The communications protocol can be Fibre Channel, Ethernet, or a combination of both
Ability of the fabric to scale is critical
Monitoring of SAN fabric traffic is a necessity

Designing a High Performance Fabric

Select an optimal SAN fabric topology
  • Mesh
  • Waterfall
  • Core / Edge
  • Fat tree
  • Butterfly
  • Torus
  • Hypercube
  • Hybrid
Ensure the design is driven by business application requirements
Keep it as simple as possible for manageability

Common SAN Fabric Examples

[Diagrams: Basic Dual Switch; Fabric Mesh; Fat Tree Fabric Architecture; High Availability Core-Edge Fabric. Each shows servers, directors or switches, and storage.]

SAN Fabric Design Considerations

SAN Fabric design issues:
  • Throughput requirements
  • Potential bottlenecks
  • Port speed / port count
  • Port subscription rate
  • Maximum hop count
  • Redundancy for High Availability
  • Modularity (flexibility/scalability)
  • Future upgradeability
  • Complexity vs. overall manageability
  • Isolation vs. unification
  • Wide Area Network interconnections
  • Power consumption and footprint
  • Component cost

Backup and Recovery

Backup and Recovery Challenges

Size of the backup window
Ability to recover files
The time to recover files
Integrity of the data backups
Required frequency of backups
Stored data retention period
  • Functional
  • Legal
Available bandwidth for backup resources
Media deterioration over time
Technological obsolescence

The Traditional Storage Backup Approach

[Diagram: 1.0 PB of storage backed up to a TS3500 Tape Library.]
192 of the newest LTO-4 tape drives, each running at the maximum native transfer rate of 120 MB/sec., would need at least 13 hours to back up 1.0 PB of data.
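
The 13-hour estimate can be checked with a short calculation. The sketch below assumes every drive streams at its full native rate for the whole job, with no compression, tape changes, or network limits, and it treats both the petabyte and the megabyte as binary units by default. The function and constant names are illustrative, not from the deck.

```python
def backup_hours(data_bytes: float, drives: int, mb_per_sec_per_drive: float,
                 bytes_per_mb: int = 1024 ** 2) -> float:
    """Hours needed to stream data_bytes across all drives at full native speed."""
    if drives <= 0 or mb_per_sec_per_drive <= 0:
        raise ValueError("drives and transfer rate must be positive")
    aggregate_bytes_per_sec = drives * mb_per_sec_per_drive * bytes_per_mb
    return data_bytes / aggregate_bytes_per_sec / 3600.0


PETABYTE = 1024 ** 5  # binary petabyte (assumption: 2^50 bytes)

if __name__ == "__main__":
    hours = backup_hours(PETABYTE, drives=192, mb_per_sec_per_drive=120)
    print(f"Ideal backup time for 1.0 PB: {hours:.1f} hours")
```

Depending on whether binary or decimal units are used for the petabyte and the megabyte, the ideal figure falls between roughly 12 and 13.6 hours, which is in line with the slide's estimate. Real backups would take longer once tape mounts and bandwidth contention are included.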

Fast Tape Drives can Saturate the Network

One LTO-4 tape drive run in native mode is 20% faster than the effective usable bandwidth of Gigabit Ethernet!
Four LTO-4 tape drives run in 2:1 compressed mode will take most of the usable bandwidth of 10Gbps Ethernet!

[Chart: Native Transfer Rate in MB/sec. by drive type, with a reference line for the effective transfer rate of Gigabit Ethernet. Y-axis: 0 to 120 MB/sec. Drive types: LTO-4, LTO-3, LTO-2, LTO-1, 1120 1/2", 3592 1/2", 3590 1/2", 9940B, 9940A, 9840C, 9840B, 9840A.]

Large Systems Backup Approaches

Point-in-Time Copies and Replication
  • Snapshots
      Most popular method of large storage backup
      Snapshots create an exact copy of the source data
      Once a bitmap is created, storage access can resume
      While copying, new data is written to both source and target
      Requires minimal downtime for production systems
  • Replication
      Replication creates a mirror image of data over distance
      May be synchronous (consistent) or asynchronous (lagging)
      Synchronous is distance-limited, asynchronous is not

Point-in-Time Copy

[Diagram: FlashCopy Backup and Cloning Process, shown for Mirror Side A and Mirror Side B. Labeled components: p5 P590 server LPARs with redundant 200MB/sec. HBAs; SVC Virtualization Engine (xSeries 366) with the FlashCopy bitmap; Fibre Channel switch modules; DS8300 storage; DS4800 storage with EXP710 trays; a link to the tape library. Process steps are numbered 1 through 6a/b.]
Bandwidth labels (repeated on each side):
  • 200MB/sec. x 4 ports x 4 clusters = 3.2 GB/sec. bandwidth
  • 200MB/sec. x 8 ports = 1.6 GB/sec. bandwidth
  • 200MB/sec. x 2 channel bandwidth (400MB/sec. usable)
  • Internal bandwidth: 400MB/sec. x 4 adapter ports = 1.6 GB/sec. (1.2 GB/sec. effective rate)
  • Internal bandwidth: 400MB/sec. x 32 adapter ports = 12.8 GB/sec. (6.4 GB/sec. usable)

Data Replication Structures

[Diagram (01/18/06): Hypothetical Backup & Recovery Approach. Service levels: Service Level 1 (Continuous Availability), Service Level 2 (Under 15-Minute Recovery), Service Class 3 (Under 4-Hour Recovery). Site labels: Primary Facility Data Center, Metro Area Storage Site, Remote Facility Storage Site, North Building, Bunker East, Bunker West; Local Site, Area Site (within 170 miles), Remote Site (anywhere in the world). Storage labels: Mainframe, Unix, and Wintel storage; tiered storage marked Mission Critical, Business Important, and Business Standard; disk arrays marked DS8300, DS6800, and DS4000 Series with SATA; FlashCopy storage. Link and function labels: Metro Mirror (synchronous), Global Mirror (asynchronous), check-points, hot failover, warm failover, on-line file recovery, TSM serverless and LAN-less backups, local tape backups, remote backups and archiving.]

NOTE: This is about a 50,000 ft. overview of one possible approach.
It is designed as a triple-redundant disaster recovery architecture: the primary structure could fail, the secondary structure could also suffer a disaster before the primary was restored, and the architecture would still support key business operations.

Other Large Systems Backup Approaches

Object-based Backups
  • Backs up only new blocks that have changed
  • Copies only files it has never seen before
  • Inserts pointers if the file exists somewhere else
  • Provides instant recoveries by presenting a mountable volume
Delta-Block Incremental Backups
  • Evaluates changed data by breaking a file down into discrete blocks
  • Performs a block-for-block comparison of a modified file with an existing file
  • When a difference is detected, extracts a copy of that block only
  • Usually copies a number of blocks, but not the entire file
Continuous Data Protection (CDP)
  • Copies blocks to the backup system as soon as they change
  • Stores data in a log that allows recovery from any point in time
  • Performs fast recoveries by restoring only blocks that have changed
  • Provides instant recoveries by presenting a mountable volume

Security Requirements

Impact of Petabyte Storage on Security

Traditional distributed access control techniques are designed for smaller systems with general or random workloads
Petabyte storage may service tens of thousands of clients and hundreds of storage devices
Storage design must be capable of supporting I/O patterns that are highly parallel and very bursty by nature
Security solutions must be kept highly scalable to keep up with storage growth patterns

Impact of Petabyte Storage on Security (Cont.)
Authentication and authorization requirements can dramatically impact server performance
Performance could be further reduced if data is encrypted
Traditional security protocols perform poorly because they do not scale well
The number of security operations is closely tied to the number of devices and requests

Regulatory Compliance

The Challenge Of Regulatory Compliance

What’s driving storage regulatory legislation?
  • Corporate fraud and illegal practices
  • Increased emphasis on the security of personal data
  • The threat of terrorist activities
  • The global impact of the Internet
  • Increased reliance on stored data for defense against litigation
  • Increased business dependence on electronic communications (e-mail, digital voice messaging, instant messaging, VoIP, etc.)

Regulatory Requirements Continue to Grow

According to the Storage Networking Industry Association (SNIA) there are over 20,000 regulations worldwide addressing the storage of data
The number of government regulatory requirements increases every year
  • There is little chance this upward trend will reverse itself in the future
  • Regulatory guidelines do not dictate how you should maintain your data, only what the expected outcome should be.
If you do business overseas, you must also be aware of the applicable foreign regulatory requirements

Common Regulatory Compliance Goals

Most regulatory requirements are based upon:
  • Security: Maintain data in a secure environment
  • Efficiency: Rapid location and retrieval of data
  • Legibility: Recovered documents must be in a readable format that is clear and concise
  • Authenticity: The recovered document must be verifiable as the original
  • Validation: Documentation process must be available for review by a neutral third party
Regulatory compliance becomes more challenging as storage subsystems grow in size and complexity.
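
One common way to support the Authenticity goal is to record a cryptographic digest when a document is archived and compare it when the document is recovered. The sketch below is not drawn from the deck. It assumes SHA-256 from Python's standard library, the function names are illustrative, and it says nothing about what any specific regulation requires.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the document contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(recovered: bytes, recorded_digest: str) -> bool:
    """True if the recovered document matches the digest recorded at archive time."""
    return sha256_digest(recovered) == recorded_digest


if __name__ == "__main__":
    original = b"Quarterly financial statement, final version"
    recorded = sha256_digest(original)           # stored alongside the archive copy
    print(is_authentic(original, recorded))      # unchanged document
    print(is_authentic(original + b" (edited)", recorded))  # altered document
```

For this check to be meaningful, the recorded digest would have to be stored separately from the document it protects.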

Regulatory Legislation Examples

Sarbanes-Oxley
HIPAA
USA Patriot Act
Gramm-Leach-Bliley Act
FRCP
CFR 240 17a(f)
NASD 3010 and 3110
21 CFR Part 11 (FDA)
DOD 5015.2
California Senate Bill 1386
Florida Sunshine Law
PCI
ISO 17799
CFR Title 18 (FERC)
E-SIGN
EU Directive 95/46/EC
Basel II
NARA GRS20
CFR Title 47, Part 42
NASD 2711/NYSE Rule 472
JCAHO
FPC 65 COOP compliance

Maintenance Requirements

Disk Drive Reliability Misconceptions

Actual disk failure rate is usually higher than published
  – Vendors indicate a 0.58% to 0.88% failure rate
  – Actual field usage suggests a 1.5% to 3.5% (or greater) failure rate
Field studies show no appreciable difference in reliability between SCSI, FC, and SATA drives
Heat and high duty cycles do not appear to have as detrimental an effect on disk life as once thought

Disk Drive Reliability Misconceptions (Cont.)

Infant mortality doesn’t appear to be a significant issue for newly installed disks
Disks exhibit a fairly linear rate of failure over time, which is contrary to the standard “bathtub” model
Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) diagnostics appear to predict only about 50% of all disk failures

Disk Failures Are an Expected Occurrence

[Chart: Projected Disk Failures per Month, comparing the Traditional (monolithic) and Tiered models at a 1% rate and a 3.5% rate. Y-axis: 0 to 30. Data labels: 20, 7, 6, 2.]
Using the disk count from our previous Traditional vs. Tiered models, it’s easy to see that disk failures will occur on a regular basis.
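
The projection behind a chart like this is simple to reproduce. The sketch below assumes the quoted percentages are annualized failure rates and that failures are spread evenly over the year. The disk count is a hypothetical value chosen for illustration, not the count from the deck's Traditional or Tiered models.

```python
def expected_failures_per_month(disk_count: int, annual_failure_rate: float) -> float:
    """Expected disk failures per month, assuming failures are spread evenly over a year."""
    if disk_count < 0 or not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("disk_count must be >= 0 and rate must be within [0, 1]")
    return disk_count * annual_failure_rate / 12.0


if __name__ == "__main__":
    HYPOTHETICAL_DISK_COUNT = 5_000  # assumption for illustration only
    for rate in (0.01, 0.035):
        per_month = expected_failures_per_month(HYPOTHETICAL_DISK_COUNT, rate)
        print(f"{rate:.1%} rate: about {per_month:.1f} failures per month")
```

Even at the lower rate, an installation of this size would expect to replace several disks every month, which is why sparing and rebuild times matter at petabyte scale.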

Other Considerations

Other Issues to Consider

Design for minimal human intervention
Maintain extensive monitoring of the environment
Exercise control over information propagation
Architect for maximum performance and throughput
Ensure robustness and high availability
Configure for scalability and flexibility
Develop well defined SLA objectives
Implement a structured support operation

Emerging Technologies

Emerging Technologies to Watch

Thin Provisioning
Data de-duplication
SAS Interface
InfiniBand
NPIV (N_Port ID Virtualization)
Large-platter storage technologies
2.5” disk technologies
Solid state disk drives
Virtualized file systems (e.g., ZFS, SOFS)
Grid Storage

Summary

Some Parting Thoughts

Online multi-petabyte storage is a reality
Data will double every two to three years
Storage media cost-per-GB will continue to decline
Storage operational management is a growing issue
Governmental regulations will increase over time
New technologies will demand additional storage
Experienced storage designers and administrators will become increasingly hard to find
Scarce data center resources (bandwidth, floor space, power, cooling, etc.) will become more expensive
A carefully designed architecture is the key to efficient storage operations

Putting It All Together

[Diagram (05/12/06): Multi-Petabyte Storage Infrastructure. Labeled components: Remote Data Sources; Redundant WAN Providers; Data Reduction Techniques; Network Data Compression; Long-Term Archive (Bunker Storage); Local Server Internal Backups; Storage Management; Linux Server GPFS Clusters; Event Reporting; HSM Metadata Database; Virtualization Clusters; Performance Management; Edge switches; InfiniBand Core; Data Movers; SAN Routers; Fabric Management; High Speed Search Engine; Primary Storage; Secondary Storage; Fast File Recovery; HSM Structure; Redundant Backup Capability; FlashCopy Storage; Backup & Recovery (Tape Library); Near-Line Storage (Tape Library); Synchronous Replication (DR Hot Site), within 300 KM; Asynchronous Replication (DR Warm Site), any distance.]
Captain Kirk: Scotty - We need more power!!!
Mr. Scott: Capn, I'm gi'in ya all I got, she can na take much more!

Questions?

Randy Cochran, Infrastructure Architect
IBM Global Technical Services - Cell: (630) 248-0660 - hcochran@us.ibm.com