Managing the Unimaginable:
A Practical Approach to Petabyte Data Storage
Randy Cochran, Infrastructure Architect, IBM Corporation, hcochran@us.ibm.com
TOD -1366 - Information on Demand Infrastructure
Data Storage is Getting Out-of-Hand
Are storage demands starting to overpower you?
1
Most Research Firms Agree

“It is projected that just four years from now, the world’s information base will
be doubling in size every 11 hours.”
(“The toxic terabyte; How data-dumping threatens business efficiency”, Paul Coles, Tony Cox, Chris Mackey, and Simon Richardson,
IBM Global Technology Services white paper, July 2006)

“Our two-year terabyte CAGR of 52% is 3ppt (percentage points) below rolling
four quarter results of 55%.”
("Enterprise Hardware: 2007-08 storage forecast & views from CIOs", Richard Farmer and Neal Austria, Merrill Lynch Industry Overview,
03 January 2007)

“With a 2006–2011 CAGR nearing 60%, there is no lack in demand for
storage…”
("Worldwide Disk Storage Systems 2007–2011 Forecast: Mature, But Still Growing and Changing", Research Report # IDC206662,
Natalya Yezhkova, Electronics.ca Publications, May 2007)

“According to TheInfoPro … the average installed capacity in Fortune 1000
organizations has jumped from 198 TB in early 2005 to 680 TB in October
2006. … TIP found that capacity is doubling every 10 months.”
(InfoStor Magazine, Kevin Komiega, October 19, 2006)
2
What’s Driving Petabyte Level Storage?
The “Perfect Storm”
• General increase in demand
• New digital data technologies
• More regulatory requirements
• Better protection from litigation
• Proliferation of sophisticated applications
• Disaster Recovery plans
• Declining storage media costs
• A desire for greater storage efficiency
• Storage technical skills scarcity
• A growing understanding of retained data’s business value
According to IDC, between 2006 and 2010 the information added annually to the digital
universe will increase more than sixfold, from 161 to 988 exabytes.
3
Just How Big is a Petabyte?
Data Storage Size Relationships

Terminology   IEC Notation   Value             Bytes                              Bits
Bit           bit            ----              ----                               1
Byte          B              ----              1                                  8
Kilobyte      KB             1024^1 = 2^10     1,024                              8,192
Megabyte      MB             1024^2 = 2^20     1,048,576                          8,388,608
Gigabyte      GB             1024^3 = 2^30     1,073,741,824                      8,589,934,592
Terabyte      TB             1024^4 = 2^40     1,099,511,627,776                  8,796,093,022,208
Petabyte      PB             1024^5 = 2^50     1,125,899,906,842,620              9,007,199,254,740,990
Exabyte       EB             1024^6 = 2^60     1,152,921,504,606,850,000          9,223,372,036,854,780,000
Zettabyte     ZB             1024^7 = 2^70     1,180,591,620,717,410,000,000      9,444,732,965,739,290,000,000
Yottabyte     YB             1024^8 = 2^80     1,208,925,819,614,630,000,000,000  9,671,406,556,917,030,000,000,000
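As a quick sanity check on the table above, here is a minimal Python sketch that regenerates the byte and bit counts from powers of two (the 1024-based math follows the slide; nothing product-specific is assumed):

```python
# Reproduce the binary (1024-based) size table from the slide.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

for power, unit in enumerate(UNITS, start=1):
    size_bytes = 1024 ** power          # same as 2 ** (10 * power)
    size_bits = size_bytes * 8
    print(f"1 {unit} = 2^{10 * power} = {size_bytes:,} bytes = {size_bits:,} bits")

# One petabyte, for example, is 2**50 = 1,125,899,906,842,624 bytes --
# roughly 1.13 quadrillion bytes.
```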
Petabyte storage has been around for years – online petabyte storage has not.
“Ninety-two percent of new information is stored on magnetic media, primarily
hard disks.”
“How Much Information 2003”, UC Berkeley's School of Information Management and Systems
4
How Big is That in Human Terms?
Data Storage Size Relationships

Terminology   Value         Human-scale example
Bit           Two           A "bit" is short for "binary digit" and can hold only two states, 0 or 1
Byte          Eight         A unit of storage capable of holding a single alphanumeric character
Kilobyte      Thousand      One page of a book holding around a thousand characters
Megabyte      Million       A medium-resolution color photograph taken by a digital camera
Gigabyte      Billion       Equal to the contents of about 10 yards of books on a shelf
Terabyte      Trillion      Could hold 1,000 copies of the Encyclopedia Britannica
Petabyte      Quadrillion   Approximately 100 times the printed collection of the Library of Congress
Exabyte       Quintillion   Estimated as one-fifth of all words spoken since the beginning of history
Zettabyte     Sextillion    Estimated size of data storage for all computers in the world by 2010
Yottabyte     Septillion    (Difficult to equate to a meaningful example)
According to Britannica.com the U.S. Library of Congress contains approximately
18 million books, 2.5 million recordings, 12 million photographs, 4.5 million maps,
and more than 54 million manuscripts.
5
Why is Petabyte Storage a Challenge?
 Areas Impacted by Petabyte Storage:
• Content and File Management
• Application & Database Characteristics
• Storage Management
• Architectural Design Strategy
• Performance and Capacity
• SAN Fabric Design
• Backup and Recovery Methods
• Security System Complexity
• Compliance with Regulatory Requirements
• Operational Policies and Processes
• Maintenance Requirements
6
Content and File Management
7
Management Starts With Data Classification
 Data Classification Assumptions
• Not all data is created equal
• The business value of data changes over time
• Performance can be improved by re-allocating data to an
optimized storage configuration
• The value of most business data is not fixed; it is expected to
change over time
 Understanding the business value of data is crucial to
designing an effective data management strategy
Which data has a greater value to the business - a client’s purchase
record, or a memo about last year’s phone system upgrade?
8
Data Classification Example
Tier 1 – Life Critical / Business Critical
  Active data: 10% - 15%
  Recovery requirement: recovery in < 15 minutes
  Storage class: Enterprise Class – FC SCSI, 15K or 10K RPM, low to medium capacity
  Vendor models: IBM DS8300; EMC DMX1/2/3000; HDS TagmaStore
  Relative cost/GB: x45

Tier 2 – Business Important
  Active data: 20% - 30%
  Recovery requirement: recovery in < 4 hours
  Storage class: Mid-Range – FC SCSI, 10K RPM, medium to high capacity
  Vendor models: IBM DS6800, DS4800; EMC DMX800; HDS 9970/80
  Relative cost/GB: x25

Tier 3 – Business Standard
  Active data: 50% or more
  Recovery requirement: recovery in < 72 hours
  Storage class: Mid-Range – SATA, FATA, or SAS, high capacity
  Vendor models: IBM DS4800, DS4700; EMC CX300/500/700; HDS 9530 Thunder
  Relative cost/GB: x5

Tier 4 – Nearline / Reference
  Active data: 50% or more is inactive data
  Recovery requirement: recover as soon as practical
  Storage class: Low End – optical or tape device
  Vendor models: IBM DR550 or TS3500; EMC Centera; Hitachi WMS100; HP 7100ux optical jukebox
  Relative cost/GB: x3

Tier 5 – Low Value / Archived
  Data: archived data
  Recovery requirement: recover when required
  Storage class: Archive – off-site or internal archiving
  Vendor models: Iron Mountain, internal archiving
  Relative cost/GB: x1
There are no universally accepted standard definitions for Tier Levels.
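To make the classification concrete, here is a minimal, hypothetical policy lookup in Python. The class names and tier assignments mirror the example table above; none of this is an IBM-defined standard:

```python
# Hypothetical mapping of data classification to a storage tier and
# recovery objective, mirroring the example table above.
TIER_POLICY = {
    "life_critical":      {"tier": 1, "recovery": "< 15 minutes", "media": "Enterprise FC, 15K/10K RPM"},
    "business_important": {"tier": 2, "recovery": "< 4 hours",    "media": "Mid-range FC, 10K RPM"},
    "business_standard":  {"tier": 3, "recovery": "< 72 hours",   "media": "SATA / FATA / SAS"},
    "nearline_reference": {"tier": 4, "recovery": "as soon as practical", "media": "optical or tape"},
    "low_value_archived": {"tier": 5, "recovery": "when required", "media": "archive service"},
}

def place(dataset_classification: str) -> dict:
    """Return the storage tier policy for a dataset's business classification."""
    return TIER_POLICY[dataset_classification]

print(place("business_standard"))   # -> tier 3, SATA-class storage, < 72-hour recovery
```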
9
Control Your File Content
 Implement file aging
 Set data retention periods
 Eliminate low value data
• Clean out old backup files
• Eliminate outdated information
• Deploy de-duplication technology
• Reduce storage of low value data
• Locate and purge corrupt files
 Crack down on unauthorized storage usage
 Periodically review log files and archive or
delete obsolete information
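As a sketch of the file-aging and retention ideas above, the following scan flags files that have not been accessed within a retention window. The path and threshold are hypothetical; a real policy would come from the data classification work:

```python
import os
import time

RETENTION_DAYS = 365          # hypothetical retention period for this share
SCAN_ROOT = "/data/projects"  # hypothetical file share to age out

def stale_files(root: str, days: int):
    """Yield files that have not been accessed within the retention period."""
    cutoff = time.time() - days * 86400
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    yield path
            except OSError:
                continue  # file disappeared or is unreadable; skip it

# Report candidates for archiving or deletion rather than deleting outright.
for path in stale_files(SCAN_ROOT, RETENTION_DAYS):
    print("candidate for archive/removal:", path)
```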
10
Application and Database Characteristics
11
Know Your Application and Database Needs
 Know your application’s needs
• User expectations
• Workload complexity
• Read or write intensity
• Sequential file usage
• IOPS dependence
• Stripe size optimization
• Throughput requirements
• Service prioritization
• Growth expectations
 Don’t allow databases to “lock up”
vast amounts of storage
12
Applications Will Drive Storage Requirements
 Application characteristics will drive storage decisions
• Value to the business
• Number of users
• Usage patterns
 Steady
 Bursty
 Cyclical
 Variable
• 7x24 or 9x5 access
• Domestic or global access
• Distributed or self-contained
• High or low security data
• Architectural constraints
 Significant performance gains (or losses) can be achieved
by matching requirements to storage characteristics
13
Storage Management
14
Large Storage Systems Must Be Managed
 Information Lifecycle Management (ILM)
 Hierarchical Storage Management (HSM)
 Storage Resource Management (SRM)
 Storage Virtualization
"Enterprises can achieve better and more targeted utilization of resources
by first establishing the value of their information assets and then using
storage management software to execute the policies that define how
resources are utilized."
Noemi Greyzdorf, research manager, Storage Software, IDC
15
Information Lifecycle Management
“(ILM is) the process of managing business data
throughout its lifecycle from conception until disposition
across different storage media, within the constraints of
the business process.”
(courtesy of Veritas Corporation, Nov. 2004)
ILM is not a commercial product, but a complete set of products and
processes for managing data from its inception to its final disposition.
16
Information Lifecycle Management
 Information has business value
• Its value changes over time
• It ages at different rates
• It has a finite life-cycle
• As data ages its performance needs change
 Some Information is subject to different security requirements,
due to government regulatory or legal enforcements
 Outdated information has different disposal criteria
 A combination of processes and technologies that determine
how information flows through a corporate environment
 Encompasses management of information from its creation until
it becomes obsolete and is destroyed
17
“Best Practices” for ILM Implementations
 Know exactly where information is stored
 Be able to retrieve information quickly and efficiently
 Limit access to only those who need to view data
 Create policies for managing and maintaining data
 Do not destroy important documents
 Avoid keeping multiple copies of the same data
 Retain information only until it is no longer useful
 Destroy outdated files on a regular basis
 Document all processes and keep them up-to-date
18
Hierarchical Storage Management
“HSM is a policy-based data storage management
system that automatically moves data between high-cost and low-cost
storage media, without requiring the knowledge or involvement of the user.”
(courtesy of http://searchstorage.techtarget.com)
IBM has been involved in providing HSM solutions for over 30 years and
offers a wide variety of products with automated data movement capabilities.
19
File Access Activity Over Time
[Chart: AIX file accesses per day, grouped by time since last access (from <= 1 day out to > 1 year). Roughly 10% of files were accessed within the last 2 months, 15% between 2 and 6 months ago, and 75% have not been accessed in more than 6 months; access frequency falls off roughly exponentially with file age.]
20
Hierarchical Storage Management
[Diagram: storage pyramid – roughly 10% of data is active, 20% is less active, and 70% is archive material.]
 HSM Concepts
• Only 10%-15% of most data is actively accessed
• The business value of data changes over time
• Between 80% and 90% of all stored data is inactive
• High performance storage (FC disks) is expensive
• Lower performance media (tape, optical platters, and SATA disk) are comparatively inexpensive
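A minimal sketch of the HSM idea described above: files that have not been touched recently get relocated to a cheaper tier. The tier paths and age thresholds are hypothetical, and real HSM products (TSM for Space Management, DFSMShsm, etc.) do this with stubs and policies rather than plain moves:

```python
import os
import shutil
import time

# Hypothetical age thresholds (days since last access) and target tiers.
TIERS = [
    (60,   "/tier1_fc"),      # accessed within 2 months: keep on fast FC disk
    (180,  "/tier3_sata"),    # 2-6 months: high-capacity SATA
    (None, "/tier4_archive"), # older than 6 months: archive pool
]

def choose_tier(path: str) -> str:
    """Pick a destination tier based on the file's last-access age."""
    age_days = (time.time() - os.stat(path).st_atime) / 86400
    for limit, tier in TIERS:
        if limit is None or age_days <= limit:
            return tier
    return TIERS[-1][1]

def migrate(path: str) -> None:
    """Move a file to its tier (illustrative only; no stub is left behind)."""
    dest_root = choose_tier(path)
    shutil.move(path, os.path.join(dest_root, os.path.basename(path)))
```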
21
Hierarchical Storage Management
[Diagram: relative cost per GB decreases from $$$$ at the top storage tier down to $ at the bottom.]
 HSM Concepts (cont.)
• Enterprise class storage is not required for all data
• Policies can be set to establish the proper frequency for
transitioning aging data to less expensive media
• HSM allows optimal utilization of expensive disk storage
• Low cost, high density disks consume fewer resources
• Overall storage system performance may improve
22
IBM Products with HSM Capabilities
 General Parallel File System (GPFS)
 IBM Content Manager for Multiplatforms
 Tivoli Storage Manager HSM for Windows
 Tivoli Storage Manager for Space Management (AIX)
 SAN File System (SFS)
 DFSMShsm (Mainframe)
 High Performance Storage System (HPSS)
23
Storage Resource Management
“Storage Resource Management (SRM) is the process of optimizing
the efficiency and speed with which the available drive space is
utilized in a storage area network (SAN). Functions of an SRM
program include data storage, data collection, data backup, data
recovery, SAN performance analysis, storage virtualization, storage
provisioning, forecasting of future needs, maintenance of activity logs,
user authentication, protection from hackers and worms, and
management of network expansion. An SRM solution may be offered
as a stand-alone product, or as part of an integrated program suite.”
(Definition Courtesy of http://searchstorage.techtarget.com)
IBM’s primary tool for Storage Resource Management is their TotalStorage
Productivity Center suite of tools for disk, data, fabric, and replication.
24
Storage Resource Management Functions
SRM functions span deployment management, compliance management, operational
management, and service level management, and typically include:
 Asset Management
 Change Management
 Capacity Planning
 Event Management
 Performance Management
 Accounting Management
 Quota Management
 Policy Management
 Security Management
 Automation
 Backup & Recovery
 HSM Operations
 Point-in-Time Copies
 Disaster Recovery
 Data Migration
 Data Archiving
25
Storage Virtualization
 Virtualization
“The act of integrating one or more (back end) services or
functions with additional (front end) functionality for the
purpose of providing useful abstractions. Typically
virtualization hides some of the back end complexity, or
adds or integrates new functionality with existing back end
services. Virtualization can be nested or applied to
multiple layers of a system.”
(Definition Courtesy of http://www.snia.org/education/dictionary)
Virtualization allows most of the complexity of a storage infrastructure to be
hidden from the user.
26
Virtualization Makes Storage One Large Pool
 Virtualization Characteristics
• Makes storage configuration details
invisible to the user
• Improves overall manageability of
the system
• Aggregates isolated storage
“islands” into a unified view
• Facilitates greater flexibility and
scalability
• Optimizes utilization of storage capacity
• Provides the ability to move data on-the-fly
• Improves storage subsystems flexibility
• Allows rapid re-allocation of storage resources
• Improves performance by providing another layer of caching
• May provide additional functionality for the SAN
27
Architectural Design Strategy
28
Key Architectural Design Considerations
 Resource Consumption
 Storage Economics
 RAID Allocation
 Performance Objectives
 Other Design Issues
The integrity of the architectural design will determine the overall
performance, stability, economic efficiency, manageability and future
scalability of the system.
29
Power Consumption vs. Storage Capacity
Disk Power Consumption – Cost per Petabyte

Disk   Capacity   Avg. Power     Disks per   Total Power   Total Power   **Cost per        Power Cost   Power Cost     Efficiency Index   BTUs       BTU/hr per
Type   in GB      per Disk (W)   Petabyte    in KW         in KW/hr.     KW/hr. @ $.0874   per Year     over 5 Years   (Watts/GB)         per Disk   Petabyte
FC     36.7       9.9            27,248      270.0         20.25         $1.77             $15,505      $77,523        0.270              33.8       3,314,653,798
FC     73.4       9.3            13,624      126.8         9.51          $0.83             $7,283       $36,417        0.127              31.7       1,557,078,474
FC     146.8      10.8           6,812       73.7          5.53          $0.48             $4,232       $21,158        0.074              36.9       904,644,196
FC     300        13.8           3,333       46.0          3.45          $0.30             $2,642       $13,209        0.046              47.1       564,777,840
SATA   250        9.3            4,000       37.2          2.79          $0.24             $2,136       $10,680        0.037              31.7       456,667,200
SATA   320        9.3            3,125       29.1          2.18          $0.19             $1,669       $8,344         0.029              31.7       356,771,250
SATA   400        9.3            2,500       23.3          1.74          $0.15             $1,335       $6,675         0.023              31.7       285,417,000
SATA   500        9.3            2,000       18.6          1.40          $0.12             $1,068       $5,340         0.019              31.7       228,333,600
SATA   750        9.3            1,333       12.4          0.93          $0.08             $712         $3,560         0.012              31.7       152,222,400
SATA   1000       9.3            1,000       9.3           0.70          $0.06             $534         $2,670         0.009              31.7       114,166,800
These disks all have very similar power requirements per drive, even though
the largest one has roughly 28 times the capacity of the smallest.
In addition, each disk will require approximately 0.4-0.6 watts of electrical power
to cool each BTU of heat produced.
** National retail price of electricity per KwH from “Power, Cooling, Space Efficient Storage”, page 2, ESG white paper, Enterprise Strategy Group, July. 2007.
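The first few columns of the table can be reproduced with a few lines of arithmetic. This sketch assumes a decimal petabyte (1,000,000 GB) and the usual ~3.412 BTU/hr per watt conversion; the cost columns on the slide involve additional assumptions (duty cycle, electricity rate) not spelled out there, so they are omitted:

```python
# Reproduce the disks-per-petabyte, total power, and heat columns of the table.
PETABYTE_GB = 1_000_000          # decimal petabyte, as used on the slide
BTU_PER_WATT_HR = 3.412          # standard watts -> BTU/hr conversion

def per_petabyte(capacity_gb: float, watts_per_disk: float) -> dict:
    disks = round(PETABYTE_GB / capacity_gb)
    total_kw = disks * watts_per_disk / 1000
    return {
        "disks_per_pb": disks,
        "total_power_kw": round(total_kw, 1),
        "btu_per_disk": round(watts_per_disk * BTU_PER_WATT_HR, 1),
        "watts_per_gb": round(watts_per_disk / capacity_gb, 3),
    }

print(per_petabyte(36.7, 9.9))   # ~27,248 disks, ~270 kW, ~33.8 BTU/hr, 0.270 W/GB
print(per_petabyte(750, 9.3))    # ~1,333 disks, ~12.4 kW, ~31.7 BTU/hr, 0.012 W/GB
```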
30
Comparing Storage Subsystem Power Costs
Power Cost – Traditional Storage (all on one tier)

                        Percent    1000 TB      Storage        Frames   Disk   Disk Size   Number     Disk Power   Total Power   Power Cost
                        of Total   of Storage   Type           in KW    Type   in GB       of Disks   in KW        in KW         per Year
All Data                100%       1000         (11) DS8300    143.0    FC     146         6850       74.1         217.1         $166,219

Power Cost – Tiered Storage (tiered by activity)

                        Percent    1000 TB      Storage        Frames   Disk   Disk Size   Number     Disk Power   Total Power   Power Cost
                        of Total   of Storage   Type           in KW    Type   in GB       of Disks   in KW        in KW         per Year
Frequently accessed     10%        100          (2) DS8300     26.0     FC     146         685        7.4          33.4          $25,580
Infrequently accessed   20%        200          (3) DS4800     7.9      FC     300         667        9.2          17.1          $13,097
Seldom accessed         70%        700          (9) DS4200     6.0      SATA   750         934        8.7          14.7          $11,244
                                                                                                                                  =======
                                                                                                                                  $49,921
Significant power savings may be realized by redistributing data to the appropriate
type and size of disk drive.
31
Comparing Storage Subsystem Cooling Costs
Cooling Cost – Traditional Storage (all on one tier)

                        Percent    1000 TB      Storage        Frame     Disk Size   Number     Per-Disk   Total Disk   Total System   Cooling      Cooling
                        of Total   of Storage   Type           Heat      in GB       of Disks   Heat       Heat         Heat           Power (KW)   Cost/Year
                                                               (BTUs)                           (BTUs)     (BTUs)       (BTUs)
All Storage             100%       1000         (11) DS8300    262,801   146         6850       36.9       252,765      515,566        74           $56,702

Cooling Cost – Tiered Storage (tiered by activity)

                        Percent    1000 TB      Storage        Frame     Disk Size   Number     Per-Disk   Total Disk   Total System   Cooling      Cooling
                        of Total   of Storage   Type           Heat      in GB       of Disks   Heat       Heat         Heat           Power (KW)   Cost/Year
                                                               (BTUs)                           (BTUs)     (BTUs)       (BTUs)
Frequently accessed     10%        100          (2) DS8300     47,782    146         685        36.9       25,277       73,059         21           $16,389
Infrequently accessed   20%        200          (3) DS4800     2,412     300         667        47.1       31,416       33,828         10           $7,588
Seldom accessed         70%        700          (9) DS4200     13,644    750         934        31.7       29,608       43,252         13           $9,703
                                                                                     Totals:                86,300      150,138        44           ======
                                                                                                                                                     $33,680
Additional power savings may be realized from the reduced cooling requirements
provided by high capacity, lower wattage disk drives.
32
Comparing Storage Floor-Space Cost
                                      Unit       Unit       Service Clearance      Typical       Total       Cost per
                          Number      Width in   Depth in   Front       Rear       Footprint     Footprint   sq. ft.     Total Cost   Total Cost   Percent
Scenario      Model       of Units    Inches     Inches     (Inches)    (Inches)   per Unit      (sq. ft.)   per Month   per Month    per Year     Difference
Traditional   DS8300      11          104.1      46.6       48          30         90.08         990.83      $65         $64,404      $772,847     ----
Storage

Tiered        DS8300       2          104.1      46.6       48          30         90.08         180.15
Storage       DS4800       3           25.4      43.3       36          36         20.34          61.01
              DS4200       9           25.4      43.3       36          36         20.34         183.04
                                                                                                  ======
                                                                                                  424.20      $65         $27,573      $330,878     57%
The DS4800 and DS4200 storage subsystems include the required number of disk
expansion trays mounted in standard equipment racks.
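The footprint figures above follow directly from the unit dimensions plus the front and rear service clearances; a quick sketch of that arithmetic (the $65 per square foot per month rate is the one quoted in the table):

```python
# Footprint = width x (depth + front clearance + rear clearance), converted to sq. ft.
def footprint_sqft(width_in, depth_in, front_in, rear_in):
    return width_in * (depth_in + front_in + rear_in) / 144.0

def annual_floor_cost(units, width_in, depth_in, front_in, rear_in, rate_per_sqft_month=65):
    total_sqft = units * footprint_sqft(width_in, depth_in, front_in, rear_in)
    return round(total_sqft, 2), round(total_sqft * rate_per_sqft_month * 12)

# Traditional: 11 DS8300 frames -> ~990.8 sq. ft., ~$772,800 per year
print(annual_floor_cost(11, 104.1, 46.6, 48, 30))

# Tiered: 2 DS8300 frames plus 12 DS4800/DS4200 racks -> ~424 sq. ft., ~$330,900 per year
sqft = 2 * footprint_sqft(104.1, 46.6, 48, 30) + 12 * footprint_sqft(25.4, 43.3, 36, 36)
print(round(sqft, 2), round(sqft * 65 * 12))
```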
33
How Do the Costs Add Up?
[Diagram: two rack layouts. The traditional approach places all 1.0 PB on DS8300 frames; the tiered approach spreads it across a DS8300, DS4800s, and DS4200s with SATA disk (plus DS4700, EXP810, and EXP420 expansion units), connected through Cisco MDS 9140 FC switches, Catalyst Ethernet switches, and System x3455 servers.]

Traditional approach (everything on DS8300s):
  Power Cost     $166,219
  Cooling Cost    $56,702
  Floor Space    $772,847
                 ========
                 $995,768

Tiered storage approach (DS8300 / DS4800 / DS4200s with SATA disk):
  Power Cost      $49,921
  Cooling Cost    $33,680
  Floor Space    $330,878
                 ========
                 $414,479

Savings: approximately $581,000 per year
34
A Look at Older Disk Subsystem Efficiency
100 TB of Storage

                     Frames   Capacity   Largest     Watts    BTUs/hr.   Width    Depth   Front   Rear   Total
Model                         in TB      FC Drives                                        Clr.    Clr.   Sq. Ft.
ESS800 w/frame         1       55.9      145.6 GB    13,112   47,000     115.7    35.8    34      45     5,672.06
ESS800 w/frame         1       55.9      145.6 GB    13,112   47,000     115.7    35.8    ----    ----   4,142.06
                     ====     =====                  ======   ======
Total                  2      111.8                  26,224   94,000                                     9,814.12

DS8300                 1      115.2      300 GB       7,000   23,891     104.1    44.6    48      30     6,082.86
2107-9AE expansion     1       ----      ----         6,000   20,478     ----     ----    ----    ----   ----
                     ====     =====                  ======   ======
Total                  2      115.2                  13,000   44,369                                     6,082.86

DS8300 Benefits:      100%     103%      ----          50%      53%       45%     125%    ----    ----    38%
Storing 100 TB of data on more modern storage subsystems results in 50% less
power consumption, a 53% reduction in BTUs per hr., and a reduction in required
floor space of 38%.
In addition, a DS8300 system has over 7x the throughput of the ESS800.
35
Why is Tiered Storage Important?
 Maps data’s business value
to disk characteristics
 Places data on storage
appropriate to its usage
 Incorporates lower cost disks
 Reduces resource usage
(power, cooling, etc.)
 Matches user access needs to storage characteristics
 Capitalizes on higher capacity disk drive technology
 Increases overall performance of the system
36
A Typical Tiered Storage Architecture
[Diagram: a typical tiered storage architecture. Business Critical data (high performance, very high availability) resides on DS8300s; Business Important data (good performance, high availability) on DS4800s; Business Standard data (average performance, standard availability) on DS4200s with SATA disk; and Reference / Historical data sits near-line or off-line in a TS3500 tape library. The tiers are interconnected through Cisco MDS 9140 FC switches, Catalyst Ethernet switches, System x3455 servers, and DS4700 / EXP810 / EXP420 expansion units.]

Normally a tiered storage strategy is based on data’s business value.
37
Choosing the Right Controller Frame
[Chart: storage controller list-price comparison (y-axis $0 to $250,000) for the DS8300 base system, DS6800 controller, DS4800 controller, DS4700 controller, and DS4200 controller.]
38
Choosing the Right Disk Characteristics
Notes:
 Tiers 0 and 1 must be physically close to the client equipment for latency reasons. Tiers 2 – 4 can be more remote.
 Active data in Tiers 0 & 1 is normally less than 20% of the total data.
 Inactive data (80%) is not significantly affected by latency issues from asynchronous connections.
 Management and capacity planning for the entire tiered structure can be executed from any geographic location.

Tier 0 – System primary memory (DRAM and cache) and solid state disk; ultra high performance
Tier 1 – FC SCSI disk; high performance (15K RPM) or low capacity, high spindle count (10K RPM)
Tier 2 – FC or SAS SCSI disk; medium performance (10K RPM) or high capacity (10K RPM)
Tier 3 – SATA disk; high capacity, low cost
Tier 4 – UDO, DVD, MO, or tape library (near-line media); high capacity, very low cost medium for long-term archiving
39
Comparing Disk Drive Attributes
[Chart: cost per GB by disk drive type. 4Gb and 2Gb Fibre Channel drives (36.4 GB – 300 GB, 10K and 15K RPM) range from roughly $10.66 to $24.51 per GB, while 400 GB, 500 GB, and 750 GB SATA II drives cost roughly $2.30 – $2.56 per GB.]
$0.00
40
The Cost Impact of Adding Disk Trays
[Chart: declining cost per GB for a DS4800 as disk trays are added in 4-disk increments. Cost per GB falls steadily (y-axis $0 – $100 per GB) as the controller price is amortized over 1 to 14 disk trays.]
Note: Calculations based on 146 GB, 10K RPM Drives
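The shape of that curve is simply controller-cost amortization. A sketch with hypothetical prices (the actual DS4800 controller, tray, and drive prices are not given on the slide, so these numbers are placeholders):

```python
# Hypothetical prices -- illustrative only, not IBM list prices.
CONTROLLER_PRICE = 90_000     # storage controller frame
TRAY_PRICE = 6_000            # one disk expansion tray (enclosure only)
DRIVE_PRICE = 800             # one 146 GB, 10K RPM drive
DRIVES_PER_TRAY = 16
DRIVE_GB = 146

def cost_per_gb(trays: int) -> float:
    """Total system cost divided by raw capacity for a given number of trays."""
    total_cost = CONTROLLER_PRICE + trays * (TRAY_PRICE + DRIVES_PER_TRAY * DRIVE_PRICE)
    total_gb = trays * DRIVES_PER_TRAY * DRIVE_GB
    return total_cost / total_gb

for trays in (1, 4, 8, 14):
    print(trays, "trays ->", round(cost_per_gb(trays), 2), "$/GB")
# The fixed controller cost dominates at low tray counts, so $/GB drops sharply
# at first and then flattens out -- the same shape as the chart above.
```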
41
Tiered Storage Design Pros and Cons
 Advantages
• Lower initial purchase price
• Higher capacity per square foot
• Reduced power consumption
• Decreased requirement for cooling
• Increased equipment flexibility
• Potentially a higher performance solution
 Disadvantages
• Inherently a more complex architecture
• Greater up-front effort to design and implement
• Requires advanced storage design skills and knowledge
42
RAID Selection Decision Drivers
 Application or Database characteristics
• Read/write mix
• Dependency on IOPS
 RAID Performance characteristics
• Appropriate RAID level
• Number of disks per array
• Stripe size
• Available bandwidth
 Configuration rules and recommendations
 Loss from data parity and hot sparing
 Disk failure probability
 RAID parity rebuild times
43
Loss from Mirroring, Striping, and Sparing
RAID10 (mirror plus stripe): of 1,000 TB raw, 468.8 TB is usable and 531.2 TB is lost to mirroring, striping overhead, and hot spares – a 53.1% loss.
RAID1 (mirror only): of 1,000 TB raw, 468.8 TB is usable and 531.2 TB is lost – also a 53.1% loss.
44
Loss from RAID5 Parity and Sparing
RAID5 – 3+P arrays: 559.5 TB usable, 440.5 TB lost to parity and sparing (44.1% loss).
  (The second tray of each pair holds one 2+P array to allow for one spare drive per two trays.)
RAID5 – 7+P arrays: 791.5 TB usable, 208.5 TB lost (20.9% loss).
  (The second tray of each pair holds one 6+P array to allow for one spare drive per two trays.)
RAID5 – 14+P arrays: 857.1 TB usable, 142.9 TB lost (14.3% loss).
  (Each tray holds one spare drive.)
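A rough sketch of how usable capacity numbers like these arise. It only shows the parity and mirroring arithmetic; the slide's figures are lower because they also account for the tray layouts, hot spares per tray pair, and formatted versus raw capacity, so do not expect an exact match:

```python
def raid5_usable_fraction(data_disks: int, spares_per_array: float = 0.5) -> float:
    """Fraction of raw capacity left after RAID5 parity and a share of hot spares.

    data_disks        -- N in an N+P array (e.g. 7 for a 7+P array)
    spares_per_array  -- average spare drives charged to each array (assumed)
    """
    total = data_disks + 1 + spares_per_array     # data + parity + share of spares
    return data_disks / total

def raid10_usable_fraction(spare_overhead: float = 0.06) -> float:
    """Mirroring halves capacity; spares and striping overhead take a bit more."""
    return 0.5 * (1 - spare_overhead)

raw_tb = 1000
print("RAID5 7+P :", round(raw_tb * raid5_usable_fraction(7), 1), "TB usable")
print("RAID5 14+P:", round(raw_tb * raid5_usable_fraction(14), 1), "TB usable")
print("RAID10    :", round(raw_tb * raid10_usable_fraction(), 1), "TB usable")
```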
45
Other Architectural Considerations
 Compatibility
 High availability
 Architectural robustness
 Flexibility and scalability
 Stability of the technology
 Vendor’s financial standing
 Well defined product line
roadmap
 Support for industry
standards
46
Performance and Throughput
47
Storage Subsystem Performance Drivers
 Business objectives and user expectations
 Applications and database characteristics
 Server characteristics
 SAN fabric characteristics
 Storage controller characteristics
 Caching characteristics
 Configuration characteristics
 Disk latency characteristics
"We can't solve problems by using the same kind of thinking we used
when we created them."
Albert Einstein
48
Storage Performance Enhancers
 Data Mover – Reassigning data transfer tasks to a specialized
“engine” reduces the workload on the host processing system.
 Search Engines – Systems dedicated to executing searches in
vast amounts of stored data to satisfy specific requests.
 Directory Services – Stores and organizes information about
resources and data objects.
 High Speed Interconnections – Dedicated “behind the scenes”
networks dedicated to the transfer of large amounts of data.
 Autonomic Computing – systems that can reconfigure
themselves under varying and possibly unpredictable conditions.
49
Other Storage Performance Tips
Method                       Description
RAID selection               Striping data across multiple disks to reduce the response latency of individual disks and increase the available bandwidth of the array.
Thin provisioning            Allows applications and databases to be allocated more capacity than is physically reserved on the storage subsystem.
Parallel file systems        Provide individual file block striping across multiple disks to greatly improve I/O throughput and storage capacity.
Virtualization               Provides some caching of data, spreads I/O across multiple storage controller channels, and can be used as a "data mover" for remote data replication.
Increased spindle count      Deploying a large number of low capacity, high speed disks in a RAID array to increase the number of spindles and improve the array's IOPS capability.
High RPM disk drives         Implementing 15K RPM drives to reduce the amount of disk latency and increase the transfer rate.
Striping across trays        Building a RAID array by vertically striping it across multiple disk trays and storage controller channels.
Increasing cache size        Installing additional storage cache to increase the percentage of "read hits" from cache and avoid much slower data read requests from disks.
Sequential file isolation    Isolate sequential write files on dedicated physical disks to minimize the amount of head seek time and disk rotational latency.
1:1 SAN port relationships   Assign high performance servers and storage controllers to SAN fabric ports with a 1:1 relationship to ensure maximum bandwidth is always available.
Hot spot elimination         Monitor disk storage for "hot spots" (over-worked array areas) and reallocate storage to eliminate the usage imbalance.
50
SAN Fabric Network
51
SAN Fabric Overview
 SAN Fabric is the interconnecting structure between
associated servers and storage devices
 Proper fabric design will directly impact:
• Performance
• Availability
• Equipment cost
• Manageability
• Maintainability
 Communications protocol can be either Fibre Channel,
Ethernet, or a combination of both
 Ability of the fabric to scale is critical
 Monitoring of SAN fabric traffic is a necessity
52
Designing a High Performance Fabric
 Select an optimal SAN fabric topology
• Mesh
• Waterfall
• Core / Edge
• Fat tree
• Butterfly
• Torus
• Hypercube
• Hybrid
 Ensure the design is driven by business application
requirements
 Keep it as simple as possible for manageability
53
Common SAN Fabric Examples
[Diagram: four common SAN fabric layouts connecting servers to storage – a basic dual-switch fabric, a fabric mesh, a fat tree fabric architecture, and a highly available core-edge fabric built around director-class switches.]
54
SAN Fabric Design Considerations
 SAN Fabric design issues:
• Throughput requirements
• Potential bottlenecks
• Port speed / port count
• Port subscription rate
• Maximum hop count
• Redundancy for high availability
• Modularity (flexibility/scalability)
• Future upgradeability
• Complexity vs. overall manageability
• Isolation vs. unification
• Wide Area Network interconnections
• Power consumption and footprint
• Component cost
55
Backup and Recovery
56
Backup and Recovery Challenges
 Size of the backup window
 Ability to recover files
 The time to recover files
 Integrity of the data backups
 Required frequency of backups
 Stored data retention period
• Functional
• Legal
 Available bandwidth for backup resources
 Media deterioration over time
 Technological obsolescence
57
The Traditional Storage Backup Approach
1.0 PB of Storage
TS3500 Tape
Library
192 of the newest LTO-4 tape drives running at a
maximum native transfer rate of 120 MB/sec would
need at least 13 hours to back up 1.0 PB of data.
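The 13-hour figure follows directly from the aggregate drive bandwidth; a quick check (treating 1.0 PB as 2^50 bytes, as the earlier size table does, and assuming the drives stream continuously):

```python
# Back-of-the-envelope backup window for 1.0 PB over 192 LTO-4 drives.
PETABYTE_BYTES = 2 ** 50            # ~1.126e15 bytes
DRIVES = 192
NATIVE_MB_PER_SEC = 120             # LTO-4 native transfer rate

aggregate_bytes_per_sec = DRIVES * NATIVE_MB_PER_SEC * 10**6
hours = PETABYTE_BYTES / aggregate_bytes_per_sec / 3600
print(round(hours, 1), "hours")     # ~13.6 hours of continuous streaming
```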
58
Fast Tape Drives can Saturate the Network
One LTO-4 tape drive run in native mode is 20% faster than the effective usable bandwidth of Gigabit Ethernet.
Four LTO-4 tape drives run in 2:1 compressed mode will take most of the usable bandwidth of 10 Gbps Ethernet.

[Chart: native transfer rate in MB/sec by drive type – LTO-4 (~120 MB/sec) sits above the effective transfer rate of Gigabit Ethernet (~100 MB/sec), followed by LTO-3, LTO-2, LTO-1, the 1120 and 3592 1/2" drives, 3590 1/2", 9940B, 9940A, 9840C, 9840B, and 9840A.]
59
Large Systems Backup Approaches
 Point-in-Time Copies and Replication
• Snapshots
 Most popular method of large storage backup
 Snapshots create an exact copy of the source data
 Once a bitmap is created, storage access can resume
 While copying, new data is written to both source and target
 Requires minimal downtime for production systems
• Replication
 Replication creates a mirror image of data over distance
 May be synchronous (consistent) or asynchronous (lagging)
 Synchronous is distance-limited, asynchronous is not
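A toy illustration of the bitmap-driven point-in-time copy described above, using one common copy-on-write variant of the idea. It works on a Python list of "blocks" and is a conceptual sketch, not a description of any product's internals:

```python
class PointInTimeCopy:
    """Toy bitmap-driven snapshot: blocks are copied to the target lazily,
    but a write to the source pushes the old block to the target first."""

    def __init__(self, source: list):
        self.source = source
        self.target = [None] * len(source)
        self.copied = [False] * len(source)   # the "bitmap": which blocks are on the target

    def write(self, index: int, data) -> None:
        if not self.copied[index]:            # preserve the point-in-time image first
            self.target[index] = self.source[index]
            self.copied[index] = True
        self.source[index] = data             # production I/O resumes immediately

    def read_snapshot(self, index: int):
        return self.target[index] if self.copied[index] else self.source[index]

volume = ["A", "B", "C"]
snap = PointInTimeCopy(volume)
snap.write(1, "B2")                           # source changes after the snapshot
print(snap.read_snapshot(1), volume[1])       # -> "B" (snapshot view) and "B2" (current)
```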
60
Point-in-Time Copy
[Diagram: a FlashCopy-based backup layout with two mirrored sides (Mirror Side A and Mirror Side B). p5/P590 server LPARs connect through redundant 200 MB/sec HBAs and Cisco MDS director switches to SVC virtualization engine clusters (xSeries 366 nodes), which front DS8300 and DS4800 storage with EXP710 expansion trays. Each side maintains a FlashCopy bitmap; an ACS-initiated FlashCopy backup and cloning process feeds the copies to a tape library. Annotated bandwidths include 200 MB/sec x 4 ports x 4 clusters (3.2 GB/sec), 200 MB/sec x 8 ports (1.6 GB/sec), and internal bandwidths of 400 MB/sec per adapter port across 4-port and 32-port configurations.]
61
Data Replication Structures
Hypothetical Backup & Recovery Approach

[Diagram: three service levels – Service Level 1 (continuous availability, hot failover), Service Level 2 (under 15-minute recovery, warm failover), and Service Level 3 (under 4-hour recovery, on-line file recovery and backups). The primary facility data center holds tiered mainframe, Unix, and Wintel storage (mission critical on DS8300, business important on DS6800, business standard on DS4000-series with SATA) plus FlashCopy storage and local tape backups. Synchronous Metro Mirror replicates to bunker storage sites in the metro area and an area site within roughly 170 miles; asynchronous Global Mirror replicates to a remote facility anywhere in the world; TSM server-less backups with checkpoints feed remote backups and archiving.]

NOTE: This is about a 50,000 ft. overview of one possible approach. It is designed as a
triple-redundant disaster recovery architecture, so the primary structure could sustain a
failure, the secondary structure could also sustain a disaster before the primary structure
was restored, and the overall structure would still continue to support key business operations.
62
Other Large Systems Backup Approaches
 Object-based Backups
• Backs up only new blocks that have changed
• Copies only files it has never seen before
• Inserts pointers if the file exists somewhere else
• Provides instant recoveries by presenting a mountable volume
 Delta-Block Incremental Backups
• Evaluates changed data by breaking a file down into discrete blocks
• Block-for-block comparison of a modified file with an existing file
• When a difference is detected it extracts a copy of that block only
• Usually copies a number of blocks, but not the entire file
 Continuous Data Protection (CDP)
• Copies blocks to the backup system as soon as they change
• Stores data in a log that allows recovery from any point in time
• Performs fast recoveries by restoring only blocks that have changed
• Provides instant recoveries by presenting a mountable volume
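A small sketch of the delta-block idea above: split a file into fixed-size blocks, hash each block, and back up only the blocks whose hashes differ from the previous run. The block size and choice of hash are illustrative, not tied to any particular backup product:

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, an arbitrary illustrative choice

def block_hashes(path: str) -> list:
    """Return one hash per fixed-size block of the file."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            hashes.append(hashlib.sha256(block).hexdigest())
    return hashes

def changed_blocks(path: str, previous_hashes: list) -> list:
    """Indices of blocks that differ from the last backup (new blocks included)."""
    current = block_hashes(path)
    return [i for i, h in enumerate(current)
            if i >= len(previous_hashes) or h != previous_hashes[i]]

# A delta-block backup would copy only the blocks returned here,
# rather than the whole modified file.
```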
63
Security Requirements
64
Impact of Petabyte Storage on Security
 Traditional distributed access control
techniques are designed for smaller
systems with general or random
workloads
 Petabyte storage may service tens of
thousands of clients and hundreds of
storage devices
 Storage design must be capable of
supporting I/O patterns that are highly
parallel and very bursty by nature
 Security solutions must be kept highly
scalable to keep up with storage
growth patterns
65
Impact of Petabyte Storage on Security (Cont.)
 Authentication and authorization
requirements can dramatically
impact server performance
 Performance could be further
reduced if data is encrypted
 Traditional security protocols perform poorly because
they do not scale well
 The number of security operations is closely tied to
the number of devices and requests
66
Regulatory Compliance
67
The Challenge Of Regulatory Compliance
 What’s driving storage regulatory
legislation?
• Corporate fraud and illegal practices
• Increased emphasis on the security
of personal data
• The threat of terrorist activities
• The global impact of the Internet
• Increased reliance on stored data
for defense against litigation
• Increased business dependence on
electronic communications (e-mail,
digital voice messaging, instant
messaging, VoIP, etc.)
68
Regulatory Requirements Continue to Grow
 According to the Storage Networking Industry
Association (SNIA) there are over 20,000 regulations
worldwide addressing the storage of data
 The number of government regulatory
requirements increases every year
• There is little chance this upward trend will
reverse itself in the future
• Regulatory guidelines do not dictate how you
should maintain your data, only what the
expected outcome should be.
 If you do business overseas, you must also be aware
of the applicable foreign regulatory requirements
69
Common Regulatory Compliance Goals
 Most regulatory requirements are based upon:
• Security: Maintain data in a secure environment
• Efficiency: Rapid location and retrieval of data
• Legibility: Recovered documents must be in a readable
format that is clear and concise
• Authenticity: The recovered document must be verifiable as
the original
• Validation: Documentation process must be available for
review by a neutral third party
Regulatory compliance becomes more challenging as storage
subsystems grow in size and complexity.
70
Regulatory Legislation Examples
 Sarbanes-Oxley
 HIPAA
 USA Patriot Act
 Gramm-Leach-Bliley Act
 FRCP
 CFR 240 17a(f)
 NASD 3010 and 3110
 21 CFR Part 11 (FDA)
 DOD 5015.2
 California Senate Bill 1386
 Florida Sunshine Law
 PCI
 ISO 17799
 CFR Title 18 (FERC)
 E-SIGN
 EU Directive 95/46/EC
 Basel II
 NARA GRS20
 CFR Title 47, Part 42
 NASD 2711 / NYSE Rule 472
 JCAHO
 FPC 65 COOP compliance
71
Maintenance Requirements
72
Disk Drive Reliability Misconceptions
 Actual disk failure rate is usually
higher than published
– Vendors indicate a .58% - .88% failure rate
– Actual field usage suggests a 1.5% - 3.5%
(or greater) failure rate
 Field studies show no appreciable
difference in reliability between
SCSI, FC, and SATA drives
 Heat and high duty cycles do not
appear to have as detrimental an
effect on disk life as once thought
73
Disk Drive Reliability Misconceptions (Cont.)
 Infant Mortality doesn’t appear to be a significant
issue for newly installed disks
 Disks exhibit a fairly linear rate of failure over time,
which is contrary to the standard “bathtub” model
 Self-Monitoring, Analysis and Reporting Technology
(S.M.A.R.T.) diagnostics appear to predict only about
50% of all disk failures
74
Disk Failures Are an Expected Occurrence
Projected Disk Failures per Month

                             At a 1% annual     At a 3.5% annual
                             failure rate       failure rate
Traditional (monolithic)     ~6 per month       ~20 per month
Tiered                       ~2 per month       ~7 per month
Using the disk count from our previous Traditional vs. Tiered models,
it’s easy to see that disk failures will occur on a regular basis.
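Those monthly figures are simply the disk counts from the earlier power comparison multiplied by an annualized failure rate; a sketch of the arithmetic:

```python
def expected_monthly_failures(disk_count: int, annual_failure_rate: float) -> float:
    """Expected drive failures per month for a population of disks."""
    return disk_count * annual_failure_rate / 12

TRADITIONAL_DISKS = 6850          # 11 DS8300 frames, 146 GB FC drives
TIERED_DISKS = 685 + 667 + 934    # DS8300 + DS4800 + DS4200 mix

for label, disks in (("Traditional", TRADITIONAL_DISKS), ("Tiered", TIERED_DISKS)):
    for afr in (0.01, 0.035):
        print(label, f"{afr:.1%} AFR:",
              round(expected_monthly_failures(disks, afr), 1), "per month")
# Traditional: ~5.7 and ~20.0 per month; Tiered: ~1.9 and ~6.7 per month.
```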
75
Other Considerations
76
Other Issues to Consider
 Design for minimal human intervention
 Maintain extensive monitoring of the environment
 Exercise control over information propagation
 Architect for maximum performance and throughput
 Ensure robustness and high availability
 Configure for scalability and flexibility
 Develop well defined SLA objectives
 Implement a structured support operation
77
Emerging Technologies
78
Emerging Technologies to Watch
 Thin Provisioning
 Data de-duplication
 SAS Interface
 InfiniBand
 NPIV (N_Port ID Virtualization)
 Large-platter storage technologies
 2.5” disk technologies
 Solid state disk drives
 Virtualized file systems (i.e.- ZFS, SOFS)
 Grid Storage
79
Summary
80
Some Parting Thoughts
 Online multi-petabyte storage is a reality
 Data will double every two to three years
 Storage media cost-per-GB will continue to decline
 Storage Operational management is a growing issue
 Governmental regulations will increase over time
 New technologies will demand additional storage
 Experienced storage designers and administrators will
grow increasingly harder to find
 Scarce data center resources (bandwidth, floor space,
power, cooling, etc.) will become more expensive
 A carefully designed architecture is the key to efficient
storage operations
81
Putting It All Together
Multi-Petabyte Storage Infrastructure

[Diagram: an end-to-end multi-petabyte storage infrastructure. Remote data sources feed the data center over redundant WAN providers, with network data compression and other data reduction techniques applied in transit. Linux server GPFS clusters and virtualization clusters sit in front of an InfiniBand core/edge fabric with data movers, SAN routers, and a high-speed search engine. Primary storage arrays are backed by an HSM structure with its metadata database, FlashCopy storage, near-line tape libraries, and a backup and recovery tape library, providing redundant backup capability, fast file recovery, and long-term archive (bunker storage). Synchronous replication feeds a DR hot site within 300 km; asynchronous replication feeds a DR warm site at any distance. Storage management, event reporting, performance management, and fabric management oversee the whole environment. (Captain Kirk: "Scotty – we need more power!" Mr. Scott: "Cap'n, I'm gi'in ya all I got, she can na take much more!")]
82
Questions?
Randy Cochran, Infrastructure Architect
IBM Global Technical Services - Cell: (630) 248-0660 - hcochran@us.ibm.com
83