
Understanding and Troubleshooting 3PAR Performance

Christophe Dubois / 3PAR Ninja team
20.03.2013
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Agenda
A discussion about performance of HP 3PAR StoreServ and basic troubleshooting techniques
Architecture Overview
RTFM: read the concept guide!
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02986000/c02986000.pdf
Or search for "3PAR concept guide" in your favourite search engine
Limits
When sizing for or troubleshooting performance, consider the following limits that apply to any 3PAR system:
• How many IOPS the controller nodes can sustain
• How many IOPS the physical disks can sustain
• Block size matters!
• How much read bandwidth the controller nodes can sustain
• How much write bandwidth the controller nodes can sustain
• Software limitation of the write cache algorithm
• If using AO, the IO locality profile
Front-end to back-end ratio
Depending on the type of IOs (read/write), whether they are random/sequential, the RAID type
and the IO size, the ratio between front-end and back-end will vary
Cache hit
• Read cache hit is a read IO to a portion of data that is already in cache
• Write cache hit IO is an IO to a portion of data that is in write cache, but has not been destaged to disk yet
Note that when doing sequential read IOs, the system will report very high read hit ratios (99%+)
because the pre-fetching algorithm puts the data in cache
Front-end to back-end ratio
How RAID write overhead is calculated
(The slide shows the RAID 1 mirror, RAID 5 single-parity and RAID 6 dual-parity stripe layouts alongside the following steps.)

RAID 1 – Writes
1. Write new data to 1st mirror (1 IOP)
2. Write new data to 2nd mirror (1 IOP)
Total IOPS per 1 new write = 2

RAID 5 – Writes
1. Read old data block (1 IOP)
2. Read old parity block (1 IOP)
3. Calculate new parity block (0 IOP)
4. Write new data block (1 IOP)
5. Write new parity block (1 IOP)
Total IOPS per 1 new write = 4

RAID 6 – Writes
1. Complicated process (two parity blocks are involved)
Total IOPS per 1 new write = 6.66
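As a quick aid for reasoning about these multipliers, here is a minimal Python sketch (my addition, not part of the original deck) that converts a random, small-block front-end IO profile into the back-end IOPS the disks must absorb, using the per-write figures above (RAID 1 = 2, RAID 5 = 4, RAID 6 = 6.66) and assuming read misses map 1:1 to back-end reads:

# Back-end IOPS estimate for a random, small-block workload.
WRITE_PENALTY = {"RAID1": 2.0, "RAID5": 4.0, "RAID6": 6.66}

def backend_iops(frontend_iops, write_fraction, raid_level, read_hit_ratio=0.0):
    """write_fraction: share of front-end IOs that are writes (0.0-1.0).
    read_hit_ratio: share of reads served from cache (they cause no back-end IO)."""
    reads = frontend_iops * (1.0 - write_fraction) * (1.0 - read_hit_ratio)
    writes = frontend_iops * write_fraction * WRITE_PENALTY[raid_level]
    return reads + writes

# Example: 10,000 front-end IOPS, 30% writes, RAID 5, 40% read cache hit ratio
print(backend_iops(10_000, 0.30, "RAID5", read_hit_ratio=0.40))   # ~16,200 back-end IOPS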
Front-end to back-end ratio
Depending on the type of IOs (read/write), whether they are random/sequential, the RAID type
and the IO size, the ratio between front-end and back-end will vary
RAID1
- Random read IO : 1 front-end IO = 1 back-end IO
- Sequential reads : 1 KiB/s of front-end = at least 1 KiB/s of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
  - Because of prefetching there will almost always be more KiB/s on the backend than on the front-end
- Random write IO : 1 front-end IO = 2 back-end IOs
- Sequential writes : 1 KiB/s of front-end = 2 KiB/s of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
Front-end to back-end ratio
RAID5
- Random read IO : 1 front-end IO = 1 back-end IO
- Sequential reads : 1 KiB/s of front-end = at least 1 KiB/s of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
  - Because of prefetching there will almost always be more KiB/s on the backend than on the front-end
- Random write IO : 1 front-end IO = 4 back-end IOs
- Sequential writes : 1 KiB/s of front-end = 1 KiB/s * (setsize / (setsize – 1)) of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
Front-end to back-end ratio
RAID6
- Random read IO : 1 front-end IO = 1 back-end IO
- Sequential reads : 1 KiB/s of front-end = 1 KiB/s of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
  - Because of prefetching there will almost always be more KiB/s on the backend than on the front-end
- Random write IO : 1 front-end IO = 6.66 back-end IOs
- Sequential writes : 1 KiB/s of front-end = 1 KiB/s * (setsize / (setsize – 2)) of back-end
  - Do not look at IOPS when doing sequential workloads, as the system will aggregate multiple IOs when going to the backend. Use KiB/s instead
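A minimal sketch (my addition) of the sequential-write ratios in the RAID 1, RAID 5 and RAID 6 slides above; setsize is the RAID set size of the CPG (e.g. 3+1 = 4, 6+2 = 8):

# Back-end bandwidth needed for full-stripe sequential writes, per the ratios above.
def seq_write_backend_kibps(frontend_kibps, raid_level, setsize=None):
    if raid_level == "RAID1":
        return frontend_kibps * 2.0                        # mirror doubles the writes
    if raid_level == "RAID5":
        return frontend_kibps * setsize / (setsize - 1)    # one parity member per stripe
    if raid_level == "RAID6":
        return frontend_kibps * setsize / (setsize - 2)    # two parity members per stripe
    raise ValueError("unknown RAID level")

# Example: 1 GiB/s of sequential writes onto RAID 6 with a set size of 8 (6+2)
print(seq_write_backend_kibps(1_048_576, "RAID6", setsize=8))   # ~1,398,101 KiB/s, about 1.33x the front-end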
Controller nodes IOPS limits
V-class
Approximately 120K backend IOPS per node pair
Note that with read IOs on V-class, the number does not scale linearly with the number of node
pairs. It does scale linearly for write IOs
T/F-class
Approximately 64K backend IOPS per node pair
These numbers do not include cache IOPS.
Cache IOPS are not characterised; however, during the SPC-1 benchmark on the V800 it is estimated that, on top of the 120K backend IOPS, each node pair was doing 37K cache IOPS.
Controller nodes bandwidth limits
The following limits are front-end:
V-class
• Reads : approximately 3250 MB/s per node pair
• Writes : 1500 MB/s if only 2 nodes, 2600 MB/s per node pair when using more than 2 nodes
T-class
• Reads : approximately 1400 MB/s per node pair
• Writes : 600 MB/s per node pair
F-class
• Reads : approximately 1300 MB/s per node pair
• Writes : 550 MB/s per node pair
V-class tested limits (no cache IOs) with RAID1
                          2-node V400    8-node V800
Random Reads IOPS             120700         365000
Random Writes IOPS             60700         246000
Random 50/50 RW IOPS           95000         335000
Sequential Reads MB/s           3250          14300
Sequential Writes MB/s          1630          11100

Data taken solely from full disk configuration results (480 or 1920 PDs)
Random IOPS: multiple threads doing 8K IO size
Sequential MB/s: 1 to 2 threads doing 256K IO size
V-class tested limits (no cache IOs) with RAID5
                          2-node V400    8-node V800
Random Reads IOPS             114000         345000
Random Writes IOPS             34300         140300
Random 50/50 RW IOPS           58500         235000
Sequential Reads MB/s           3390          14800
Sequential Writes MB/s          1550          10500

Data taken solely from full disk configuration results (480 or 1920 PDs)
Random IOPS: multiple threads doing 8K IO size
Sequential MB/s: 1 to 2 threads doing 256K IO size
V-class tested limits (no cache IOs) with RAID6
                          2-node V400    8-node V800
Random Reads IOPS             105000         340000
Random Writes IOPS             20000          77000
Random 50/50 RW IOPS           33000         137000
Sequential Reads MB/s           3000          12000
Sequential Writes MB/s          1250           6500
HP 3PAR StoreServ 7000 Performance
7200
             8KB Rand Read IOPS   8KB Rand Write IOPS   Seq. Read MB/sec   Seq. Write MB/sec
15K HDD      drive limited        drive limited                    2,500               1,200
SSD          150,000              75,000                           2,800               1,200

7400
             8KB Rand Read IOPS   8KB Rand Write IOPS   Seq. Read MB/sec   Seq. Write MB/sec
15K HDD      drive limited        drive limited                    4,800               2,400
SSD          320,000              160,000                          4,800               2,400

Drive Limited = the performance depends on the number of drives and the drive capabilities, and is not limited by the controllers or interconnects.
Physical Disks IOPS limits
Recommended Max drive IOPS
The following numbers are for small IOs (8-16KB)
15K RPM disks = 200 backend IOPS per PD recommended. Can do 250 reasonably well
10K RPM disks = 150-170 backend IOPS
7.2K RPM disks (NL) = 75 backend IOPS per PD recommended
SSD = It depends!
• On type of IO
• On type of RAID (yes, even for back-end IOs)
• But 3000 IOPS per disk is a safe assumption
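A rough sizing sketch (my own illustration, not the HP sizer) that combines the recommended per-PD IOPS above with the RAID write penalties from the earlier slides to estimate how many physical disks a small-block random workload needs:

import math

# Recommended back-end IOPS per PD (from this slide; 10K uses the midpoint of 150-170)
PD_IOPS = {"FC15K": 200, "FC10K": 160, "NL": 75, "SSD": 3000}
WRITE_PENALTY = {"RAID1": 2.0, "RAID5": 4.0, "RAID6": 6.66}

def pds_needed(frontend_iops, write_fraction, raid_level, drive_type):
    """Estimate the PD count for a small-block random workload (no cache hits assumed)."""
    backend = (frontend_iops * (1 - write_fraction)
               + frontend_iops * write_fraction * WRITE_PENALTY[raid_level])
    return math.ceil(backend / PD_IOPS[drive_type])

# Example: 20,000 front-end IOPS, 30% writes, RAID 5 on 15K FC drives
print(pds_needed(20_000, 0.30, "RAID5", "FC15K"))   # 190 PDs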
SSDs
SSD 100/200 GB – Max IOPS per drive

                RAID 1, Aligned   RAID 1, Unaligned   RAID 5, Aligned   RAID 5, Unaligned
100% Reads                 8000                7200              7500                7000
70% / 30%                  6000                4500              3300                1700
50% / 50%                  5000                4000              3000                1500
30% / 70%                  5000                4000              2800                1400
100% Writes                5000                4000              2800                1400

We recommend using a figure of 3000 IOPS when sizing with 100/200 GB SSDs.
T10 DIF and Performance Impact
What is impacted significantly by DIF?
Most IOs are not affected significantly by the introduction of DIF (e.g., reads, R5, R6, FC 15K, SSDs, sequential).
RAID1 random writes on 10K and NL PDs are affected UNLESS the IO size is a multiple of 16K AND the IO is aligned.
Alignment
- Usually ignored by 3PAR in the past; secondary performance impact (load on PDs)
- Increasingly an issue (e.g., DIF, SSDs)
Block size matters
All random IOPS numbers on 3PAR are given for small IOs: 8 KB.
When doing IOs larger than 8 KB, the number of backend IOs a system can sustain may drop off significantly as IO size increases.
For FC/NL disks, there is virtually no difference between 8 KB and 16 KB IOs.
Above 16KB, the number of IOs per PD degrades
For SSDs, since a cell is 8KB, any IO larger than 8KB will cause a performance degradation
Remember that when using large blocks, the bandwidth limitation can be reached faster than the
IOPS limitations!
FC disks : IOPS vs block size (empirical data)
With 32x 15K disks, RAID5 fully allocated VV
Maximum IOPS vs block size (8K = 100%), drop from 8KB in %:

         100% Reads   50% Read 50% Write   100% Writes
8KB            100%                 100%           100%
16KB            98%                  98%            98%
32KB            92%                  81%            63%
64KB            72%                  52%            47%

This graph is not 100% accurate, but is used to show a drop in IOPS when the block size is increased. Values may change 5-10%.
SSD disks : IOPS vs block size (empirical data)
With 16 x 100GB SSDs, RAID5 fully allocated VV
Maximum IOPS vs block size (8K = 100%), drop from 8KB in %:

         100% Reads   50% Read 50% Write   100% Writes
8KB            100%                 100%           100%
16KB / 32KB / 64KB : the individual cell values did not survive cleanly in this copy (recovered figures: 93%, 91%, 52%, 50%, 50%, 30%); the rules of thumb on the next slide summarise the SSD trend (roughly 90%, 50% and 30% of the 8KB figure for a 50/50 mix).

This graph is not 100% accurate, but is used to show a drop in IOPS when the block size is increased. Values may change 5-10%.
Block size matters
Performance drop off with block size increase
Rules of thumb (for a mixed workload 50% read, 50% write, 100% random)
         FC & NL Disks                SSD Disks
8KB      IOPS given by HP Sizer       IOPS given by HP Sizer
16KB     ~ same as 8KB block size     ~ 90% of 8KB block size
32KB     ~ 80% of 8KB throughput      ~ 50% of 8KB throughput
64KB     ~ 50% of 8KB throughput      ~ 30% of 8KB throughput
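Here is a small lookup sketch (my addition) that applies the rule-of-thumb derating factors above to an 8KB IOPS figure for a 50/50 random mix; linear interpolation between the listed block sizes is my own assumption, not something stated on the slide:

# Derate an 8KB IOPS figure (e.g. from the HP sizer) for larger block sizes, 50/50 mix.
DERATE = {
    "FC_NL": {8: 1.00, 16: 1.00, 32: 0.80, 64: 0.50},
    "SSD":   {8: 1.00, 16: 0.90, 32: 0.50, 64: 0.30},
}

def derated_iops(iops_8k, block_kb, media):
    points = sorted(DERATE[media].items())
    for (lo_kb, lo_f), (hi_kb, hi_f) in zip(points, points[1:]):
        if lo_kb <= block_kb <= hi_kb:
            frac = (block_kb - lo_kb) / (hi_kb - lo_kb)
            return iops_8k * (lo_f + frac * (hi_f - lo_f))
    raise ValueError("block size outside the 8-64 KB table")

# Example: 192x 15K PDs at 200 IOPS each = 38,400 back-end IOPS at 8KB
print(derated_iops(38_400, 32, "FC_NL"))   # ~30,700 at 32KB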
Difference between full VVs and thin VVs
While there is a difference in the number of backend IOs required for a front-end IO on a Thin VV compared to a full VV, this only applies to the first write
This is usually completely transparent to the user and the application, since the
system will acknowledge the IO to the host and write to disk afterwards
Most applications usually “prepare” new capacity before using it
After the first write, there is absolutely no difference between Thin and full VVs
Full VVs vs TPVVs after initial write
Workload is 100% random read, then 100% random write, then 50/50
(Chart: frontend IOs on the full VV and frontend IOs on the Thin VV, with the same amount of backend IOs in both cases.)
Snapshots
Impact of Snapshots
• A Snapshot is a point-in-time copy of a source Virtual Volume
− It can use any RAID level
− Reservation-less – it has no “background normalization” process
− Normalization occurs as a result of host writes to the source VV
• The Copy-on-Write can cause increased Host I/O Latencies
− Before “New” data can be written to disk, the “Old” data needs to be read off the disks and
copied to the snapshot space. This is called a Copy On [first] Write or COW.
• Copy-on-Writes cause additional backend disk IOs and increased Host I/O latency
As long as the system is not maxed out in terms of backend IOs, snapshots will have
a marginal impact
Without and with snapshot
(Chart: backend IO/s, frontend IO/s, read response time and write response time over time, with a marker at the point where the snapshot is created.)
3PAR Copy on Write Implementation
• 3PAR arrays perform Copy on Write
operations in the background,
allowing Hosts to see Electronic
(Cache) latencies
• Write Latencies CAN still increase if
there is sufficient activity to slow
down the cache flush rate
− Busy drives = Slower cache flush
rate
• Time estimates are for a system that has enough buffers available to support the workload
Sequence (Host ↔ 3PAR Cache ↔ 3PAR Flusher ↔ Backend Disks):
1. The host issues an 8kb SCSI Write; the data is stored into cache and Status Good is returned to the host at t=0.25ms.
2. In the background, the flusher reads the 16kb of old data from the disk drives (t=6ms).
3. The old data is written to SA/SD (snapshot) space and the flush update is recorded (t=4ms).
4. The host's 8kb of new data is written to its original location on disk (t=4ms).
3PAR StoreServ 7200 Virtual Copy Snapshot
Performance Summary
Base workload: 70/30 8kb

                    SS7200, 3.1.2 R1       SS7200, 3.1.2 R5       SS7200, 3.1.2 R6
                    IOPS / Avg latency     IOPS / Avg latency     IOPS / Avg latency
Base workload       11,500 / 4.1ms         9,000 / 5.2ms          8,000 / 6ms
1 Virtual Copy      11,000 / 4.2ms         9,000 / 5.2ms          8,000 / 6ms
2 Virtual Copies    10,900 / 4.3ms         8,900 / 5.3ms          8,000 / 6ms
4 Virtual Copies    10,200 / 4.6ms         8,100 / 5.5ms          7,600 / 6.2ms
8 Virtual Copies     8,500 / 5.6ms         8,200 / 5.8ms          7,300 / 6.5ms
3PAR Virtual Copy, Raid 1, 1 Snap
OLTP 70/30 8kb, q=48, 144x15k, SS7200, 1 snap
(Chart: average latency over time for the 11,500 IOPS baseline workload; the response-time axis runs 0–5 ms, with markers at the points where the snapshot is created and removed.)
3PAR Virtual Copy, Raid 1,8 SNAPS
OLTP 70/30 8kb, q=48, 144x15k, SS7200, 8 snaps
Baseline: 11,500 IOPS / 4ms average latency. (Chart: average latency over time; the response-time axis runs 0–12 ms, with markers at the points where the 8 snaps are created and removed.)
Couple of things to be aware of…
3PAR Cache and drive populations
                            7200 2-node      7400 2-node / 4-node
Control cache               16GB             16GB / 32GB
Data cache                  (4) 8GB          (8) 16GB / 32GB
Max drives per System       144 (120)        240 (216) / 480 (432)
Max drives per Node         72 (60)          120 (108)
Per Node Cache Buffers      262144 (4GB)     524288 (8GB)

Number of drives needed to reach the maximum number of Write Buffers:

                            7200 2-node                          7400 2-node / 4-node
15k (2400 buffers)          108 drives per Node Pair             216 drives per Node Pair
NL (1200 buffers)           Cannot maximize Write cache          Cannot maximize Write cache
                            with only NL Drives                  with only NL Drives
SSD (9600 buffers)          28 drives per Node Pair              56 drives per Node Pair

The table above estimates the number of disk drives that need to be distributed in a 3PAR StoreServ 7000 to ensure the maximum number of Write Buffers are available.
Caution must be used when cache size is taken into account to size for performance. Just because the array has 8GB, it does not mean a host workload will be able to utilize the full amount for writes.
With all NL Drives, you CANNOT allocate the maximum amount of Write Buffers.
This is important to understand for small-disk 3PAR StoreServ 7000 systems, which may suffer higher write response times than expected for the size of the array's cache.
MB/s per disk and Write cache flusher limits
Upon writing to the 3PAR array, the data will be put in write cache. Each 3PAR controller node
only allows a maximum number of pages for a given number and type of disk
When 85% of this maximum number of allowed cache pages is reached, the system will start delaying the acknowledgement of IOs in order to throttle down the hosts, until some cache pages have been freed by having their data de-staged to disk (a condition known as “delayed ack”)
This de-staging happens at a fixed rate that depends on the number and type of disks
The maximum write bandwidth of the hosts will be limited to the de-staging speed
MB/s per disk and Write cache flusher limits
While undocumented, the following rules of thumb can be used for small systems (less than 100 disks of a type). Note that this is empirical data
De-staging speed for FC 15K disks : 8-10 MB/s (front-end) per FC 15K PD
De-staging speed for NL disks : 4-6 MB/s (front-end) per NL PD
Note that when doing read IOs, this limit does not apply and much higher values can be reached
(35-40 MB/s per FC 15K PD)
Beware of the controller node limits when sizing or troubleshooting bandwidth related issues
Always use the Storage Optimizer when sizing for MB/s
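As an illustration (my own sketch, using the empirical per-PD de-staging rates above and the node-pair write bandwidth limits from the earlier slides), the sustained front-end write bandwidth of a small system can be estimated as the smaller of the disk de-staging capacity and the controller limit:

# Per-PD de-staging rates: midpoints of the 8-10 and 4-6 MB/s ranges above.
DESTAGE_MBPS = {"FC15K": 9, "NL": 5}
# Front-end write limits per node pair from the earlier slides
# (V-class: 2600 MB/s per pair with more than 2 nodes, 1500 MB/s for a 2-node system).
NODE_PAIR_WRITE_MBPS = {"V": 2600, "T": 600, "F": 550, "7400": 2400, "7200": 1200}

def max_sustained_write_mbps(pd_counts, platform, node_pairs=1):
    disk_limit = sum(DESTAGE_MBPS[t] * n for t, n in pd_counts.items())
    node_limit = NODE_PAIR_WRITE_MBPS[platform] * node_pairs
    return min(disk_limit, node_limit)

# Example: 64x FC 15K + 32x NL behind a single 7200 node pair
print(max_sustained_write_mbps({"FC15K": 64, "NL": 32}, "7200"))   # min(736, 1200) = 736 MB/s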
Delayed ack mode
“Delayed ack” is a behaviour of HP 3PAR systems when the cache gets filled faster than it can be
de-staged to disk (most likely because the physical disks are maxed out)
This is determined by the number of dirty cache pages for a type of disk exceeding 85% of the
allowed maximum
If the threshold is reached, the system will reduce the host IO rate by delaying the “ack” sent back
on host writes. Throttling is done to reduce the possibility of hitting max allowable dirty CMP limit
(cache full).
The host will see this behavior and naturally slow down the IO rate it sends to the InServ (extreme cases cause host IO timeouts and outages). If the system is continually in delayed-ack mode, the load needs to be lowered on the hosts, or additional nodes/disks are needed.
Delayed ack mode
The maximum number of cache pages is a function of the number of disks of each type that are
connected to a given node :
• SSD : 4800 pages per PD
• FC : 1200 pages per PD
• NL : 600 pages per PD
For example on a 4 node system with 32 SSDs, 256 FC disks and 64 NL disks (each node will see
16 SSDs, 128 FC and 32 NL):
• Per node : 76800 pages for SSDs, 153600 pages for FC, 19200 pages for NL
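A minimal sketch (my addition) of the per-node dirty-page arithmetic above, including the 85% delayed-ack threshold:

# Per-node cache page limits by PD type, from this slide.
PAGES_PER_PD = {"SSD": 4800, "FC": 1200, "NL": 600}

def cfc_max_per_node(pds_per_node):
    """pds_per_node: e.g. {"SSD": 16, "FC": 128, "NL": 32} for one node."""
    return {t: PAGES_PER_PD[t] * n for t, n in pds_per_node.items()}

limits = cfc_max_per_node({"SSD": 16, "FC": 128, "NL": 32})
print(limits)                                         # {'SSD': 76800, 'FC': 153600, 'NL': 19200}
print({t: int(v * 0.85) for t, v in limits.items()})  # delayed-ack thresholds (85% of CfcMax)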
Identifying delayed ack mode
The statcmp command shows, per node and per type of disk, the number of dirty pages (CfcDirty), the maximum number of pages allowed (CfcMax) and the number of delayed acks (DelAck):

Page Statistics
          ---------CfcDirty----------   ----------CfcMax-----------   ----------DelAck-----------
Node  FC_10KRPM FC_15KRPM    NL   SSD   FC_10KRPM FC_15KRPM    NL SSD  FC_10KRPM FC_15KRPM    NL SSD
   2          0     15997     2     0           0     19200 19200   0          0     53896 16301   0
   3          0     18103     1     0           0     19200 19200   0          0     95982 15092   0

- CfcDirty: current number of dirty pages for each node for this type of disk (instantaneous)
- CfcMax: maximum allowed pages per node for this type of disk
- DelAck: number of delayed acks. This counter is incremented whenever a delayed ack happens
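The following sketch (my addition) shows how those numbers can be interpreted: flag a node/disk-type combination whose dirty pages are at or above 85% of CfcMax, or whose DelAck counter grew between two samples. It works on numbers read off statcmp; it is not a parser for the exact column layout, which varies between releases.

# Flag delayed-ack conditions from statcmp figures.
def delayed_ack_risk(cfc_dirty, cfc_max, delack_prev=None, delack_now=None):
    flags = []
    if cfc_max and cfc_dirty / cfc_max >= 0.85:
        flags.append(f"dirty pages at {cfc_dirty / cfc_max:.0%} of CfcMax")
    if delack_prev is not None and delack_now is not None and delack_now > delack_prev:
        flags.append(f"DelAck grew by {delack_now - delack_prev}")
    return flags or ["OK"]

# Node 3, FC_15KRPM from the sample output above
print(delayed_ack_risk(18103, 19200))   # ['dirty pages at 94% of CfcMax']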
Reasons why the system may be in delayed ack mode
Delayed acks occur when the cache gets filled faster than it can be de-staged to disk.
Factors that contribute to delayed ack:
• PDs are maxed out → check status with statpd
• “servicemag” running with heavy IO already occurring
• The cache flushing limit might be reached for this type of disk
• Flusher speed bug
• Drastic change in host IO profile
• RemoteCopy sync destination array's disks maxed out (make sure to include it in your performance analysis)
Performance issues with single streamed, small
write IOs
On workloads using single streams (1 outstanding IO) of small write IOs, the performance of the
3PAR system might be lower than expected. This really matters for block sizes < 128KB
On InForm OS 3.1.1 the situation can sometimes be improved by disabling the “interrupt coalescing” feature of the front-end ports (the default is disabled in InForm OS 3.1.2).
While interrupt coalescing has a positive effect on most workloads (it offloads the CPUs on the 3PAR controller nodes), it can have a detrimental impact on this specific type of workload.
Experience with 3.1.2 is that it gives a 2x performance improvement for single-threaded sequential IO over 3.1.1.
Performance issues with single streamed, small
write IOs
Interrupt Coalescing (intcoal) is defined on front-end target ports
It should only be disabled on ports used by hosts that use this type of workload
To disable it use the following command on each port:
controlport intcoal disable <N:S:P>
The port will be reset (expect a short host IO interruption during the reset)
Write Request
① WRITE COMMAND
② XFER READY
③ DATA TRANSFER
④ STATUS
(Diagram: the exchange above takes place between the server and the storage array.)
The write from a server to a storage device uses a dual round-trip SCSI Write protocol to service the write request.
Response time of single write IOs with and without
interrupt coalescing
(Chart: response time of single write IOs for block sizes from 1KB to 512KB, with interrupt coalescing enabled versus disabled; with it disabled, the latency is roughly 50% lower.)
Measuring performance
Stat* vs Hist* commands
All of the previous objects have stat* commands (use “help stat” for complete list)
Stat* commands display average values between 2 iterations
Because the result is an average, a single anomalously long IO might be hidden by a large number of IOs with a good service time.
Prefer a short sampling interval (e.g. 15 seconds or less).
The hist* commands can be used to display buckets of response times and block sizes if required
Use “help hist” to see the list of hist* commands
6 Layers to Check
FRONT END – watch for high VLUN latencies and large IO sizes:
1) statvlun -rw -ni
2) statport -rw -ni -host

NODES – watch for large IO sizes, latencies and delayed acks:
3) statvv -rw -ni
4) statcmp (and statcmp -v)

BACK END – watch for heavy disk IO activity and slow PDs (high queue/latency):
5) statpd -rw -ni
6) statport -rw -ni -disk
Performance Case Studies
Case Study #1
Example of Debugging Performance Issues
Issue – Customer is seeing service time increase greatly, while IOPS stay the same.
They are running on 10K FC drives.
CLI “statport -host” output:
Host “iostat” output:
Case Study #1
Example of Debugging Performance Issues (cont.)
CLI “statpd” output:
Case Study #1
Example of Debugging Performance Issues (cont.)
If you remember earlier in the presentation, as IOPS increase, so does the service time.
However, here the IOPS have stopped increasing.
When a piece of hardware hits its IOPS maximum, additional IO gets queued on the device and has to wait, which adds to the service time.
From earlier slides, we know the following about PD IOPS limits:
- 7.2K NL = 75 IOPS
- 10K FC = 150 IOPS
- 15K FC = 200 IOPS
In our case, the PDs are running at 133% of maximum load and are the bottleneck.
Solution – Additional hardware (PDs) would be needed to reduce the backend load and reduce service times to the application(s).
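As a worked example of the “133% of maximum load” figure (the statpd screenshot is not reproduced in this text copy, so the observed 200 IOPS per PD below is illustrative):

# Utilization of 10K FC PDs against the 150 IOPS recommendation from the limits slide.
recommended_iops_10k = 150
observed_iops_per_pd = 200          # illustrative value, not from the original screenshot
print(f"{observed_iops_per_pd / recommended_iops_10k:.0%}")   # 133%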
Case Study #2
Example of Debugging Performance Issues
Issue – Customer is expecting higher IOPS, but not getting what was “advertised”.
They are running on 10K FC drives.
CLI “statport -host” output:
Case Study #2
Example of Debugging Performance Issues (cont.)
CLI “statpd” output:
Case Study #2
Example of Debugging Performance Issues (cont.)
• When 3PAR quotes performance limits for IOPS or throughput, it assumes a given IO size (typically 8KB for random workloads). If the customer workload diverges from that, the quoted limits will most likely not be achievable.
• Looking at the stats facing the host, we see the block size coming in to the InServ is 32KB, and the block size going to the disks is also 32KB.
• From earlier slides, we know that in order to hit the node and PD performance maximums:
  - Access pattern = RANDOM
  - Block size <= 16KB
• The block size is above the size we require to get the true maximum performance from the StoreServ. Because of this, they will only get approximately 75% of the max.
• The customer can either lower their application request size (if possible), or add additional PDs to sustain their desired IOPS number (taking into consideration the % drop with the larger block size).
Case Study #3
Example of Debugging Performance Issues (cont.)
Issue – Customer sees very high write service times for small IO sizes in “statvlun”, but “statvv” shows no problem with write service times.
CLI “statvlun” output:
56.1 ms
CLI “statvv” output:
0.1 ms
Case Study #3
Example of Debugging Performance Issues (cont.)
• Examination of the backend showed very high IO rates
• The large difference between “statvlun” and “statvv” can be an indication of delayed ack
• Next we will examine statcmp and look for possible
delayedAcks…
Case Study #3
Example of Debugging Performance Issues (cont.)
The page statistics of statcmp showed Delayed ACK is occurring on all nodes for FC &
NL drives.
Note: DelAck is a cumulative counter value.
CLI “statcmp” output:
Examine load on FC & NL drives to identify root cause. (I/O size, High Avg. IOPS
per drive, etc…). If applicable, check for RC related contention as well
Measuring performance
3PAR performance counters are available at many levels to help troubleshoot performance issues
Physical objects: physical disks, CPUs, FC ports, iSCSI ports, links (memory, PCI, ASIC-to-ASIC)
Logical objects: chunklets, logical disks, Virtual Volumes, VLUNs, cache, Remote Copy links and VVs
Common stat options
Some options are common to most stat* commands:
-ni : display only non-idle objects
-rw : display read and write stats separately. Output will have 3 lines per object: read (r), write (w) and total (t)
-iter <X> : only display X iterations. Default: loop continuously
-d <X> : specify an interval of X seconds between 2 iterations. Default: 2 seconds
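As an illustration of how these options compose in a scripted capture (my addition; the SSH invocation, host name and CLI user below are placeholders, not part of the deck):

# One 15-second sample of non-idle PDs, reads and writes split out, captured over SSH.
import subprocess

def capture(command="statpd -rw -ni -d 15 -iter 1", host="3paradm@my-array"):
    result = subprocess.run(["ssh", host, command],
                            capture_output=True, text=True, check=True)
    return result.stdout

# print(capture())   # uncomment when pointed at a real array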
statpd
Shows Physical disks stats :
• Current/Average/Max IOPS
• Current/Average/Max KB/s
• Current/Average service time
• Current/Average IO size
• Queue length
• Current/Average % idle
cli% statpd -devinfo
17:45:48 08/20/12  r/w  I/O per second    KBytes per sec     Svt ms    IOSz KB        Idle %
 ID Port                Cur  Avg   Max    Cur   Avg   Max    Cur Avg   Cur Avg  Qlen  Cur Avg
  0 2:2:1           t     0    0     0      0     0     0    0.0 0.0   0.0 0.0     0  100 100
  1 3:2:1           t     0    0     0      0     0     0    0.0 0.0   0.0 0.0     0  100 100
statpd
Statpd will show :
• Backend IOs caused by host IOs
• IOs caused by data movement, such as DO tunes, AO region moves
• IOs caused by clones
• IOs caused by disk rebuild
Statpd will not show :
• IOs caused by chunklet initialisation. The only way to see that chunklet initialisation is going on is to use “showpd -c”
statpd
Useful options:
• -devinfo : displays the type and speed of each disk
• -p -devtype FC/NL/SSD : display only FC/NL/SSD PDs
Things to look for:
• PDs that have too many IOPS (based on the recommended numbers, see “Limits”). Usually these PDs will also have a % idle < 20
• PDs of a given type that have significantly more/less IOPS than other PDs of the same type. Usually a sign that PDs are incorrectly balanced
• PDs with anomalous response times
statport
Statport will show the aggregated stats for all devices (disks or hosts) connected on a port
The totals reported by statport -host are the same as the totals of statvlun
The totals reported by statport -disk are the same as the totals of statpd
statport
Useful options:
• -host/-disk/-rcfc/-peer : displays only host/disk/RCFC/peer ports
Things to look for:
• Host ports that have a higher response time than others for the same hosts. Might indicate a problem on the fabric
• Host ports that have reached their maximum read/write bandwidth
• Host ports that are busy in terms of bandwidth, as this can increase the response time of IOs for hosts
Warning about statport stats for RCFC ports
Remote Copy synchronous uses a “posted read” method, which consists of the remote system constantly posting read IOs on the source system (384 per port)
When the source system has some data to send to the destination system, it doesn’t need to do a
write IO because there’s already a read IO pending from the destination.
This technique is used to save a round-trip between the 2 systems
Because of these posted reads, the average response time and queue of RCFC ports will always
be very high, and will actually decrease as more data is being replicated
• When no data is being sent, the average response time on the RCFC ports will be 60,000ms (these IOs have a timeout of 60s) and the queue length will be 384
• When replicating an average of 100 MB/s, the average response time will be 75ms
This is completely normal and no cause for concern.
To find the Remote Copy round-trip latency, use the statrcopy -hb command
statvlun
Statvlun is the highest level that can be measured, and the statistics reported will be the closest to what could be measured on the host.
Statvlun shows :
• All host IOs, including cache hits
Statvlun does not show :
• RAID overhead
• IOs caused by internal data copy/movement, such as clones, DO/AO tasks…
• IOs caused by disk rebuilds
• IOs caused by VAAI copy offload (XCOPY)
statvlun
Statvlun read service time :
• Excludes interrupt coalescing time
• Includes statvv read time
• Includes additional time spent dealing with the VLUN
Statvlun write service time :
• Excludes the first interrupt coalescing time
• Includes the time spent between telling the host it's OK to send data and the host actually sending the data. Because of this, if the host/HBA/link is busy, the statvlun time will increase but the problem will be at the host/SAN level!
• Includes the second interrupt coalescing time when the host sends data
• Includes the time spent writing data to cache + mirroring
• Includes delayed ack time
statvlun
Useful options :
• -vvsum : displays only 1 line per VV
• -hostsum : displays only 1 line per host
• -v <VV name> : displays only the VLUNs for the specified VV
Things to look for :
• High read/write response times
• Higher response times on some paths only
• Using -hostsum : has the host reached its max read/write bandwidth?
• Single-threaded workloads : will have a queue steadily at 1. Consider disabling interrupt coalescing
• Maximum host/HBA/VM queue length reached for a path/host
statvv
Statvv stats represent the IOs done by the array to the VV. They exclude all time spent
communicating with the host and all time spent at the FC/iSCSI level.
Statvv includes :
• Cache hits
• IOs caused by the pre-fetching during sequential read IOs. Because of this it is possible to
have more KB/s at the VV level than at the VLUN level
• (needs checking) IOs caused by VAAI copy offload (XCOPY)
• IOs caused by cloning operations
• IOs caused by Remote Copy
Things to look for :
• High write response times. Might indicate delayed ack
statcmp
Useful options :
• -v : shows read/write cache hit/miss stats per VV instead of per node
Things to look for :
• Delayed ack on a device type
• High LockBlock
statcpu
Things to look for :
• CPUs maxed out
statrcopy
Useful options :
• -hb : shows link heart-beat response time
Things to look for :
• Max write bandwidth reached on a link
• Higher heart-beat round-trip latency on a link than on the other with -hb.
Capturing performance data
Capturing performance
3PAR performance can be measured / captured using different tools:
- GUI : real-time performance graphs. No historical data. Small granularity (seconds). On demand only
- System Reporter : historical performance information. Minimum granularity = 1 min, default = 5 min. Continuous
- CLI : real-time performance stats and histograms (buckets). Small granularity (seconds). On demand only
- Service Processor / STaTS / “Perform” files : very large granularity (4 hours). Continuous
- Service Processor / STaTS / Performance Analyzer (Perfanal files) : small granularity (seconds). On demand only
Capturing a “Performance analyzer”
Connect to the Service Processor by pointing a browser to http://<IP_address_of_SP>
Login with the login “spvar” and password “HP3parvar” (SP 2.5.1 MU1 or later) or
“3parvar” (SP 2.5.1 or earlier)
Select “Support” on the left, then “Performance Analyzer”
Click “Select all” and enter the number of iterations to capture
For example, to capture 1 hour of data, enter 360 iterations of 10 seconds.
The default of 60 iterations of 10 seconds will correspond to at least 10 minutes of data.
Click “Launch Performance Analysis tool”
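The iteration arithmetic is simply the desired capture window divided by the 10-second interval; a trivial helper (my addition):

def iterations_needed(minutes, interval_seconds=10):
    # e.g. 60 minutes at 10-second iterations -> 360 iterations
    return minutes * 60 // interval_seconds

print(iterations_needed(60))   # 360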
Capturing a “Performance analyzer”
Once the performance capture is over, the files
will be uploaded automatically to the HP 3PAR
support center and can be downloaded from
STATs. (http://stwebprod.hp.com/)
If the service processor is not configured to
send data automatically, the file can be found
in /files/<3PAR serial number>/perf_analysis
Performance troubleshooting guide
Troubleshooting guide
What is the problem reported by the user?
Is this problem visible at the 3PAR level? If not, it might be a problem higher up the chain.
Poor response times on VLUNs:
Is the problem affecting only read IOs or only write IOs?
Is the problem visible on VVs?
High write service time on VLUNs and VVs -> Look for delayed ack with statcmp
Troubleshooting guide
What is the queue at the statvlun level?
• If the queue is steadily at 0 or 1, that’s typical of a single-threaded workload. Look for ways of
increasing the queue depth at the application level or the behaviour of the application
• If the queue is steadily at a value, for example 15/16 or 31/32, this indicates that the maximum queue length of the host/HBA/VM… has been reached → increase the host's HBA queue length
Troubleshooting guide
High service time on VLUNs but not on VVs :
• Try disabling interrupt coalescing
• Can be representative of a problem on the host. Statvlun includes some time spent on the
host for write IOs
• Have some host ports reached their max bandwidth?
• Are there some host ports that have a higher response time than the others?
• For all hosts? Might indicate a problem on the fabric/switch port/SFP…
Troubleshooting guide
Delayed ack is happening :
• Look for busy PDs
• Is the maximum write bandwidth of the system reached?
• If using RC synchronous, is the maximum RC bandwidth reached?
• If using RC synchronous, is there delayed ack happening on the remote system?
Thanks!