Use Case: Extracting Performance Data from OnCommand Using APIs
Arda Oral - Professional Services Engineer
NetApp Confidential - Internal Use Only

Agenda
- Scope of Work
- Environment
- Performance Collection
- Implementation – The Theory
- Implementation – The Practice (Demonstration)
- SLA Thresholds
- Dashboard

Scope of Work
- The customer wants to retrieve and store performance data from all storage controllers (NetApp and other vendors) in a common "performance database".
- The customer defines SLAs for the performance values.
- SLA violations are to be imported into the database.
- A dashboard presents the SLA violations.
- The OnCommand "Performance Advisor" is responsible for data collection.
- Performance data is stored in an internal Sybase database.
- NMSDK APIs are used to access the OnCommand performance data.

Environment
- ~30 NetApp storage systems
- OnCommand 5 on a Windows 2008 server
- Oracle 10 database on AIX 5 (Performance DB)
[Diagram: a Windows 2008 host running OnCommand 5 and an AIX 5 host running NMSDK 4.1 and the Oracle Performance DB, communicating over http/https]

Performance Collection
- NetApp performance data is collected by the CounterManager (CM), which resides on the storage controller.
- CM groups data into objects, instances and counters.
- Data can be retrieved with the "stats" command on a storage controller.

stats list objects
    (aggregate, cifs, disk, lun, volume…)
stats list instances
    object name: aggregate, instance: aggr1
    object name: system, instance: system
    object name: volume, instance: vol0
stats list counters
    object name: aggregate, counter: user_reads
    object name: system, counter: cpu_busy
    object name: lun, counter: avg_latency

Implementation – The Theory
- Install NMSDK 4.1 on the AIX 5 server.
- Install the required Perl modules (SSL, LWP…).
- Check the NMSDK examples (basic, advanced):
  ../netapp-manageability-sdk4.1/src/sample/DataFabric_Manager/API_Sample_Code/advanced/Perl/perf_counters/
- Find the appropriate API: perf-get-counter-data
  ../netapp-manageability-sdk-4.1/doc/WebHelp/index.htm

Implementation – The Theory (cont. 1)
Elements of the perf-get-counter-data API shown on the slide:
- perf-get-counter-data
- start-time
- end-time
- sample-rate
- instance-counter-info
- object-name-or-id
- time-consolidation-method
- counter-info
- perf-object-counter
- object-type
- counter-name
[Diagram legend: each element is marked as API, Object, or string/int; the markings are not recoverable from the extracted text]

Implementation – The Theory (cont. 2)
Object/Instance/Counter      Value
start-time                   6h before now
end-time                     now
sample-rate                  5 minutes
object-name-or-id            storage controller
counter-name                 cpu_busy
object-type                  system
time-consolidation-method    average

Command on storage system: stats show -i 1 system:*:cpu_busy

SLA Thresholds
Counter        Threshold     Result
CPU_BUSY       > 90%         = SLA violation
Disk_BUSY      > 90%         = SLA violation
LUN Latency    > 20ms        = SLA violation
TARGET         Queue Full    = SLA violation

- If 10% of the collected counter data exceeds the SLA threshold, the storage system counter is flagged yellow.
- If 20% of the collected counter data exceeds the SLA threshold, the storage system counter is flagged red.

Dashboard (sample output)
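
The parameter values listed under "Implementation – The Theory (cont. 2)" can be turned into concrete numbers before the API call is made. The following is a minimal sketch only: the function names and the `controller` argument are hypothetical, and it assumes that the API takes times as epoch seconds and the sample rate in seconds, which should be checked against the NMSDK documentation. It does not call the NMSDK itself.

```python
from datetime import datetime, timedelta, timezone


def build_query_window(now, controller, lookback_hours=6, sample_minutes=5):
    """Collect the slide's perf-get-counter-data values in a plain dict.

    Assumption: times are epoch seconds and sample-rate is in seconds.
    `controller` is a hypothetical name or id of the storage controller.
    """
    start = now - timedelta(hours=lookback_hours)
    return {
        "start-time": int(start.timestamp()),      # 6h before now
        "end-time": int(now.timestamp()),          # now
        "sample-rate": sample_minutes * 60,        # 5 minutes
        "object-name-or-id": controller,           # storage controller
        "counter-name": "cpu_busy",
        "object-type": "system",
        "time-consolidation-method": "average",
    }


def expected_samples(params):
    """Number of consolidated samples the window should contain."""
    return (params["end-time"] - params["start-time"]) // params["sample-rate"]


if __name__ == "__main__":
    params = build_query_window(datetime.now(timezone.utc), "controller01")
    for key, value in params.items():
        print(f"{key}: {value}")
    print("expected samples:", expected_samples(params))
```

With a 6-hour window and a 5-minute sample rate, each counter should yield 72 data points per query, which is the population the SLA percentages are later computed over.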
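
The yellow/red rule from the "SLA Thresholds" slide can be sketched as a small classification function. This is an illustration under stated assumptions, not the implementation used in the project: the function names are hypothetical, "green" is an assumed label for the case the slide does not name, and the 10% and 20% limits are treated as inclusive ("at least"), which the slide does not specify.

```python
def violation_ratio(samples, threshold):
    """Fraction of samples that exceed the SLA threshold (0.0 if no data)."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)


def sla_flag(samples, threshold, yellow_at=0.10, red_at=0.20):
    """Classify one counter of one storage system.

    Assumption: limits are inclusive, and "green" is a made-up label
    for counters that stay below the yellow limit.
    """
    ratio = violation_ratio(samples, threshold)
    if ratio >= red_at:
        return "red"
    if ratio >= yellow_at:
        return "yellow"
    return "green"


if __name__ == "__main__":
    # 72 hypothetical cpu_busy samples (percent), 8 of them above 90%.
    cpu_busy = [95.0] * 8 + [40.0] * 64
    print(sla_flag(cpu_busy, threshold=90.0))
```

For the numeric counters the threshold is simply the slide value (90 for CPU_BUSY and Disk_BUSY in percent, 20 for LUN latency in ms). The TARGET "Queue Full" condition is not numeric and would need its own check.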