Metron

Key Metrics for Effective
Storage Performance and
Capacity Reporting
Abstract
Capacity management for storage is difficult because of the many complex
and varied technologies in use. Given all of the options available for data
storage strategy, a clear understanding of the architecture is important in
identifying performance and capacity concerns. A technician looking at metrics on
a server is often seeing only the tip of a storage iceberg. Knowing which metrics
are important will depend on your objectives and storage architecture, but
response and space utilization will always be key to effectively managing storage.
Contents
• Storage Architecture
• Two distinct aspects of storage capacity
• Virtualization
• Key metrics from the host and backend storage view
• Reporting on what is most important
Space Capacity - History
Growth can result in increasing cost and complexity
Two Distinct Aspects of Storage Capacity and Performance
Storage Throughput
Storage Space
Response, IOPS
Space Capacity – Space Utilization
What does storage ‘Utilization’
mean in your environment?
Factors include: RAID/DR,
Raw/Configured, Host/SAN,
Backups, Compression, Etc...
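Because "utilization" changes with the denominator chosen, it can help to compute it at each level side by side. The sketch below is only an illustration: the field names (raw_tb, configured_tb, allocated_tb, used_tb) and the sample figures are hypothetical and do not come from any particular array.

```python
from dataclasses import dataclass


@dataclass
class SpaceView:
    """Capacity figures for one array, all in TB (hypothetical field names)."""
    raw_tb: float         # physical disk installed
    configured_tb: float  # usable space after RAID and spare overhead
    allocated_tb: float   # space presented to hosts
    used_tb: float        # space actually written by hosts


def utilization_report(v: SpaceView) -> dict:
    """Return 'utilization' percentages measured against different denominators."""
    return {
        "used_of_raw": round(100 * v.used_tb / v.raw_tb, 1),
        "used_of_configured": round(100 * v.used_tb / v.configured_tb, 1),
        "used_of_allocated": round(100 * v.used_tb / v.allocated_tb, 1),
        "allocated_of_configured": round(100 * v.allocated_tb / v.configured_tb, 1),
    }


if __name__ == "__main__":
    # Made-up sample values, chosen only to show how far the answers diverge.
    view = SpaceView(raw_tb=100.0, configured_tb=80.0, allocated_tb=60.0, used_tb=30.0)
    for name, pct in utilization_report(view).items():
        print(f"{name}: {pct}%")
```

With these sample figures the same array is 30% used against raw capacity but 50% used against allocated capacity, which is why a report should state which view it uses.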
Space Capacity – Proactive Visibility
Alarm on key metric trends instead of current threshold breaches to get in
front of problems before they happen.
Trending, forecasting, and exceptions.
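One way to alarm on a trend rather than on a current breach is to fit a line to recent samples and estimate how long until the threshold is reached. The following is a minimal sketch, assuming one percent-used sample per day; the function names, the 90% threshold, and the 30-day lead time are illustrative assumptions.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept (needs at least two distinct x values)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(daily_used_pct, threshold_pct=90.0):
    """Days after the last sample until the fitted trend reaches the threshold.

    Returns 0.0 if the fitted line is already at or above the threshold,
    and None if usage is flat or shrinking.
    """
    days = list(range(len(daily_used_pct)))
    slope, intercept = linear_fit(days, daily_used_pct)
    if slope <= 0:
        return None
    crossing_day = (threshold_pct - intercept) / slope
    return max(0.0, crossing_day - days[-1])


def trend_alarm(daily_used_pct, threshold_pct=90.0, lead_days=30):
    """True when the trend is projected to reach the threshold within lead_days."""
    remaining = days_until_threshold(daily_used_pct, threshold_pct)
    return remaining is not None and remaining <= lead_days


if __name__ == "__main__":
    samples = [70, 71, 72, 73, 74]  # hypothetical daily percent-used values
    print(days_until_threshold(samples), trend_alarm(samples))
```

A file system at 74% would not trip a 90% threshold alarm today, yet this trend check flags it because the fitted line reaches 90% in about 16 days.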
Space Capacity – Trending
Understand the limitations of linear regression when trending and
forecasting data.
Chart below has low correlation
Chart above has high correlation
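A simple guard against over-trusting a linear forecast is to check how well the line fits before using it. The sketch below computes the coefficient of determination (R squared) for a straight-line fit; the 0.8 cut-off and the sample series are assumptions for illustration, not recommended values.

```python
def r_squared(xs, ys):
    """R squared of a straight-line least-squares fit (square of Pearson's r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # Correlation is undefined for constant data; treat it as "no usable trend".
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def forecast_is_trustworthy(xs, ys, min_r_squared=0.8):
    """Assumed rule of thumb: only forecast when the linear fit explains most variance."""
    return r_squared(xs, ys) >= min_r_squared


if __name__ == "__main__":
    days = [0, 1, 2, 3, 4]
    steady_growth = [10, 12, 14, 16, 18]  # hypothetical, rises evenly
    erratic = [10, 30, 5, 28, 12]         # hypothetical, no clear direction
    print(forecast_is_trustworthy(days, steady_growth))
    print(forecast_is_trustworthy(days, erratic))
```

A low value does not mean there is no capacity risk; it means a straight line is a poor model for that data and the forecast should be treated with caution.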
Space Capacity – Showing Different Viewpoints
Business, Application, Host, Storage Array, Billing Tier
Space Capacity – Host Metrics
Metrics are typically available at the file
system, volume and logical disk views.
Key metrics for space capacity from the
host perspective are typically:
• Storage allocated to system (disks)
• Allocated but not configured (volumes)
• Space used or free (file systems)
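At the file system level, used and free space can be sampled with Python's standard library. This is a minimal sketch using shutil.disk_usage; the helper name filesystem_space and the choice of the current directory as the path are assumptions, and a real collector would loop over the mount points of interest.

```python
import shutil


def filesystem_space(path="."):
    """Return total, used and free bytes plus percent used for the file system holding path."""
    usage = shutil.disk_usage(path)
    pct_used = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "pct_used": round(pct_used, 1),
    }


if __name__ == "__main__":
    space = filesystem_space(".")
    print(f"{space['pct_used']}% used, {space['free'] / 1e9:.1f} GB free")
```

This covers only the file system view; storage allocated to the system but not yet configured into volumes has to come from other host tools.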
Space Capacity – Array Metrics
NetApp Aggregate
Key metrics for space capacity from the
array perspective depend on the
technology and how it is being used.
However, like the host view, total capacity
and space available are key metrics:
• Storage installed in arrays (disks)
• Configured but not allocated (aggregates)
• Space used or free (volumes)
Storage arrays can have many space related metrics at different levels
Virtual Environments and Clusters
Managing storage in clustered and/or virtual environments can be challenging
because it is shared among all hosts and virtual machines running on it.
Image Source: VMware.com
• Thin provisioning
• Storage viewed at many levels
• Could be different tiers allocated to the same cluster
• Overhead at various points
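Thin provisioning means the space promised to virtual machines can exceed the physical space behind it, so both figures are worth reporting. The sketch below is illustrative only; the function name, the input layout of (provisioned, used) pairs, and the sample sizes are hypothetical.

```python
def overcommit_summary(datastore_capacity_gb, vm_disks):
    """Summarize thin-provisioned disks on one shared datastore.

    vm_disks is a list of (provisioned_gb, used_gb) pairs, one per virtual disk.
    """
    provisioned = sum(p for p, _ in vm_disks)
    used = sum(u for _, u in vm_disks)
    return {
        "provisioned_gb": provisioned,
        "used_gb": used,
        # Above 1.0 means more space has been promised than physically exists.
        "overcommit_ratio": round(provisioned / datastore_capacity_gb, 2),
        "pct_physically_used": round(100 * used / datastore_capacity_gb, 1),
    }


if __name__ == "__main__":
    # Hypothetical datastore of 1000 GB shared by three thin-provisioned disks.
    disks = [(500, 200), (400, 150), (600, 250)]
    print(overcommit_summary(1000, disks))
```

In this made-up case the datastore is only 60% physically used but 1.5 times overcommitted, so growth inside the virtual machines can exhaust it without any new allocation.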
Storage Virtualization
Pooling physical storage from multiple sources into logical groupings
• Can be a centralized source for collecting data
http://www.networkmagazineindia.com/200207/vendor.shtml
A wide variety of techniques exist for virtualizing storage; be aware of the
implications for data collection and reporting.
Performance Capacity – Response Impacts
SAN or storage array performance problems can be identified at the host
or backend storage environment.
Response is the key metric for
performance evaluation
• Host I/O response
• Fabric or Network response
• Virtualization device response
• Array response
High response is typically caused
by insufficient throughput capacity
Performance Capacity – Host Metrics
Understand the limitations of certain host metrics
• Measured response is the best metric for
identifying trouble.
• Host utilization only shows busy time; it
does not show the capacity of the SAN.
• Physical I/O rate is an important measure of
throughput, and every disk has its limit.
• Queue length is a good indicator that a
limit has been reached somewhere.
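Queue length, I/O rate, and response are tied together by Little's Law (average requests in the system = throughput × average response time), a standard queueing result. The sketch below applies it as a cross-check between host metrics. It assumes the reported queue length counts all outstanding requests, including those in service, which varies by tool; the function names are hypothetical.

```python
def implied_response_ms(avg_outstanding_ios, iops):
    """Little's Law rearranged: response = outstanding requests / throughput, in ms."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return 1000.0 * avg_outstanding_ios / iops


def implied_outstanding_ios(iops, response_ms):
    """Little's Law: outstanding requests = throughput * response time."""
    return iops * response_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical sample: 2 requests outstanding on average at 400 I/Os per second.
    print(implied_response_ms(2.0, 400))      # 5.0 ms
    print(implied_outstanding_ios(400, 5.0))  # 2.0
```

If the measured response differs greatly from the implied value, the metrics are probably being collected at different layers or over different intervals.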
Performance Capacity – Host Metrics
100% host disk utilization can indicate high throughput, but ample
backend capacity might still be available, as was the case here.
Performance Capacity – Host Metrics
Queue lengths from the previous high utilization chart indicate that the load
may not currently be impacting response, but headroom is unknown.
Performance Capacity – Host Metrics
The I/O behind the previous high utilization chart is shown here;
combined throughput peaks are very high.
Performance Capacity – Host Metrics
Spikes in throughput typically correlate with queues and response for simple
disk configurations, as seen in the chart, but most disk configurations are not
simple anymore, which means these metrics often do not correlate.
Performance Capacity – Array Architecture
• Front End Processors
• Shared Cache
• Back End Processors
• Disk Storage
Performance Capacity – Array Metrics
Front end processors are typically the first to bottleneck. This
chart shows acceptable utilization.
Performance Capacity – Array Metrics
Find arrays doing the most work with throughput metrics.
[Chart: EMC-All Array IOPs. Intellimagic EMC Volume IO/sec for arrays EMC-000 through EMC-004, 10/19/2012 to 10/22/2012.]
Performance Capacity – Array Metrics
Aggregating and trending key metrics can be useful as shown here.
[Chart: EMC-Array Total IOPs Trend. IO/sec for EMC-000 between 20/10/2012 and 22/10/2012, extrapolated until 27/10/2012, 72 raw data points. Series: IO/sec, least squares fit, Max IOPs.]
Performance Capacity – Array Metrics
Knowing what is generating the IOPS can also be important
[Chart: EMC-Top 10 Volumes for All Array IOPs. Intellimagic EMC Volume IO/sec, 10/19/2012 to 10/22/2012. The ten volumes shown come from arrays EMC-000, EMC-001, EMC-003 and EMC-004.]
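A top-N ranking like the one charted for this slide can be produced from per-volume throughput samples. This is a minimal sketch, assuming average IOPS per volume has already been collected into a dictionary; the volume names and values are invented and are not the ones in the chart.

```python
import heapq


def top_volumes(iops_by_volume, n=10):
    """Return the n busiest volumes as (name, iops, percent of total IOPS)."""
    total = sum(iops_by_volume.values())
    top = heapq.nlargest(n, iops_by_volume.items(), key=lambda item: item[1])
    return [
        (name, iops, round(100 * iops / total, 1) if total else 0.0)
        for name, iops in top
    ]


if __name__ == "__main__":
    # Hypothetical per-volume averages.
    sample = {"vol-a": 120, "vol-b": 950, "vol-c": 430}
    for name, iops, share in top_volumes(sample, n=2):
        print(f"{name}: {iops} IO/sec ({share}% of total)")
```

Reporting each volume's share of the total helps show whether the load is concentrated on a few volumes or spread evenly.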
Performance Capacity – Storage Virtualization Metrics
Key metrics are also available from virtualization devices. This
chart shows the top 10 IBM SVC volumes for throughput.
[Chart: IBM SVC Top 10 Volumes. Intellimagic Volume Total op/sec for SVC-006; the ten volumes shown come from ranks rnk-0217, rnk-0218 and rnk-0229.]
Performance Capacity – Storage Virtualization Metrics
This is another example of aggregating and trending, although this
particular SVC data sample is not a good real world example.
[Chart: IBM SVC Volume IOPs Trend. Total op/sec for SVC-006,rnk-0229,vol-00451 between 18/10/2011 and 19/10/2011, extrapolated until 21/10/2011. Least squares fit y = 2010x + 914 with 90% upper and lower confidence limits and an alarm line, 97 raw data points.]
Performance Capacity – Storage Virtualization Metrics
Key metric for performance evaluation is
response.
Other metrics are important too, but are
typically used to avoid or troubleshoot high
response times.
Storage devices can have many performance metrics at different levels
Performance Capacity – Array Metrics
[Charts: Response for NetApp and EMC arrays]
Performance Capacity – Component Breakdown
Service time versus response time – different metrics
IO Response
The bar chart shows service times as blue and green, with queue
time represented as red and yellow.
Response is the combination of service and queue time.
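The distinction between service time and response time can be made concrete by splitting a measured response into its two parts. The sketch below is illustrative; the function name and the sample millisecond values are hypothetical.

```python
def response_breakdown(service_ms, queue_ms):
    """Combine service and queue time into response and show the queue's share."""
    response_ms = service_ms + queue_ms
    queue_pct = 100 * queue_ms / response_ms if response_ms else 0.0
    return {"response_ms": response_ms, "queue_pct": round(queue_pct, 1)}


if __name__ == "__main__":
    # Hypothetical samples: service time is steady while queue time grows under load.
    for service_ms, queue_ms in [(4.0, 0.0), (4.0, 4.0), (4.0, 12.0)]:
        print(service_ms, queue_ms, response_breakdown(service_ms, queue_ms))
```

In the last sample the device still services each I/O in 4 ms, yet response is 16 ms because three quarters of the time is spent waiting in the queue.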
Performance Capacity – Workload Profiles
Application type is important in estimating performance risk
Performance Capacity – Scorecards and Exceptions
Performance Capacity – Dashboards
At a glance view of important metrics for critical resources
Storage Key Metrics – Conclusions
• Knowledge of your storage architecture is critical
• Understand both storage space and throughput
• Consider all factors that affect storage space utilization
• Be aware of virtualization and clustering complexities
• Know key metrics and their limitations
• Start with key report types and areas that are most important
Thank you for attending
The End