TR 1-470 Performance Analysis for Dummies
Lars Ejskjaer, Greg Ferguson

Agenda
Understanding performance analysis can help you sell
− Gain a better understanding of your customer's environment
− Grow awareness of possible system bottleneck areas
− Understand changes that can enhance your customer's overall satisfaction
Getting Started…
Simple Analysis (1, 2, 3)
Valuable Tools and Resources

The Customer Problem
"My storage is slow…"

Your Goal
Guide the customer to best practices
Recommend solutions
− Administrative
− Product
Provide advice in the context of their environment

What You Will Need To Know
An understanding of how the customer's environment was set up
The ability to identify missed best practices
− Administrative
− Performance
The ability to identify common performance issues

Getting Started

Easily Collect Performance Data - Perfstat
https://communities.netapp.com/docs/DOC-16209
Pro Tip: Perform multiple smaller iterations rather than one larger iteration for better visibility

Loading Data into LatX for Analysis
https://latx.netapp.com/

Find and View the Perfstat

NetApp Confidential – Limited Use

Finding Specific Data
Pro Tip:
− Use PRESTATS iteration 1 for configuration information
− Use POSTSTATS for performance measurements

Analysis Strategy
Understand Configuration
− Aggregates
− Volumes
− Look for configuration errors
Analyze Performance
− System
− Disk
− Flash
Make Recommendations

The Simple Analysis Part 1
Disk Configuration

Disk Types on the Controller
Prestats - sysconfig -r
Pro Tip: Common disk types today: SAS, BSAS, MSATA

RAID Groups
Poststats - statit
Pro Tip: Avoid unbalanced RAID groups in aggregates!
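The RAID group balance check above could be automated roughly as follows. This is only a minimal sketch: it assumes the disk count of each RAID group has already been read off the statit or sysconfig -r output by hand, and the function name, the input dictionary, and the tolerance of one disk are all hypothetical choices for illustration, not part of any NetApp tool.

```python
from typing import Dict, List


def find_unbalanced_raid_groups(disk_counts: Dict[str, int],
                                tolerance: int = 1) -> List[str]:
    """Return RAID groups that trail the largest group by more than `tolerance` disks.

    `disk_counts` maps a RAID group name to its number of disks.
    The tolerance is an illustrative assumption, not a NetApp guideline.
    """
    if not disk_counts:
        return []
    largest = max(disk_counts.values())
    return sorted(
        name for name, count in disk_counts.items()
        if largest - count > tolerance
    )


# Hypothetical aggregate: disk counts per RAID group, entered by hand.
example = {"rg0": 16, "rg1": 16, "rg2": 5}
print(find_unbalanced_raid_groups(example))  # ['rg2']
```

A group flagged this way is a candidate for the "add drives and reallocate" recommendation later in the deck; what counts as acceptable skew depends on the customer's aggregate layout.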
32-bit vs 64-bit Aggregates
Prestats - aggr status -v
Pro Tip:
− New features work best with (or require) 64-bit aggregates
− Using one aggregate type makes the customer's operations easier

Aggregate Utilization
Poststats - df -A -h
Pro Tip:
− An aggregate that is 80% full should be monitored
− Aggregates more than 90% full could see a performance impact

Aggregate Snapshot Copies
Prestats - df -A -h, snap status -A and aggr status -v
Pro Tip:
− Only SyncMirror and MetroCluster use aggregate Snapshot copies
− If they are not in use:
  - Remove the schedule
  - Release the space reservation

The Simple Analysis Part 2
Volumes

Volume Space Utilization
Poststats - df and lun stat -v all (or vol status -v)
Pro Tip: Databases such as Oracle initialize their data files, and some data is static, so an almost full volume can be fine

Deduplication
Poststats - stats perfstat_sis
Pro Tip: With savings below 6-8%, deduplication uses resources without a real effect. Unless the data is static, turn it off and save resources!

Deduplication Runtimes
Poststats - sis status -l
Pro Tip:
− Deduplication is a low-priority process, but it still occupies system resources
− Work to understand run times so that system scheduling can be smoothed for better performance

Misalignment
Poststats - nfsstat -d and lun stat -v all
Pro Tip:
− Understand what is in these files/LUNs
− Log files are often "misaligned"
− Verify that virtual machines are aligned

The Simple Analysis Part 3
System Performance

What Are Domains In Data ONTAP?
ONTAP breaks work into groups of processes called domains
ONTAP schedules work across CPU cores as it sees best
This can be seen in sysstat -M
Detailed analysis of this is an advanced topic

CPU Utilization
Poststats - sysstatM.out
Pro Tip:
− Average CPU utilization above 70% indicates a very busy controller
− CPU utilization is generally a poor indicator of performance
− Look at AVG CPU; single-CPU (thread) utilization is very informative

Writing Data to Disk - CP Type & Time
Poststats - sysstat_1sec.out
Pro Tip:
− Deferred back-to-back CPs (type #) are performance killers: data can't get to disk fast enough
− Investigate (ignoring the CPU utilization):
  - Misalignment
  - Overcommitted disks
− Solutions:
  - Move load to another controller/aggregate
  - Add disks/Flash Cache

Aggregate Disk Utilization
Poststats - statit
Pro Tip:
− Analyze performance impact based on drive utilization (by drive type)
  - SATA drives > 50% = busy
  - SAS drives > 60% = busy
− Statit will give clues about where to move load

Active Volumes
Perfsys Report
Pro Tip:
− Map volumes to aggregates (aggr status -v)
− Identify workloads to move

Flash Cache
Perfsys Report
Pro Tip:
− Verify that the system is benefiting from the use of Flash Cache
− Use PCS and perfstat to verify a customer's gains from Flash Cache

Latency in the Environment
Poststats - stats perfstat_cifs/_nfs/_fcp
Pro Tip:
− High latency = performance impact
− Latency requirements vary by application
− Analyze the workload and:
  - Add disk
  - Add Flash Cache
  - Upgrade the controllers

Review of Findings
We reviewed
− High disk I/Os
− Busy volumes
− Disk configuration issues
− Average CPU utilization
− Potential Flash Cache benefits
− Misalignments

Recommendations
Resolve misalignments
Consider moving busy volumes
Add drives and reallocate to even out RAID groups
Add Flash Cache
Upgrade controllers
Add more controllers within clustered Data ONTAP

Now YOU Can Better Understand System Performance
Don't be afraid: performance is no longer such a mystery!
Most importantly, monitor disk utilization!
Have FUN!

Important Resources
The Community site
− http://communities.netapp.com
ONTAP documentation (particularly the ONTAP command reference)
− http://support.netapp.com
LatX, your analysis tool
− http://latx.netapp.com

Complementary Sessions
TR-1-123 Sizing, Designing, and Presenting a NetApp Solution
TR-2-717 Using NetApp Tech Tools to Create Winning Proposals and Tech Refreshes
TR-2-315 A Field Guide to Sizing - Part 1
TR-3-317 A Field Guide to Sizing - Part 2

© 2013 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, and Go further, faster, are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.