SEDS-Lite: Using Open Source Tools (R, BIRT and MySQL) to Report and Analyze Performance Data Igor Trubin, PhD http://www.itrubin.blogspot.com SCMG’11 Agenda o SEDS-Lite project: Open source based implementation of Statistical Exception Detection System - SEDS-Lite Introduction o Open Source and “free” tools for Capacity Management – review o Control Charts as a good test of using different tools for Capacity management - Power of Control Charts and IT-Chart Concept (Part 1) o R-system to built Control Charts (System Management by Exception: o o o o o o Power of Control Charts Mar 25, 2009) How To Build IT-Control Chart - Use the Excel Pivot Table! BIRT based Control Chart BIRT Data Cubes Usage for Performance Data Analysis Building IT-Control Chart by BIRT against Data from the MySQL Database EV-Control Chart (to build using BIRT) Simplified Exception Detection using BIRT: UCL=LCL : How many standard deviations do we use for Control Charting? Use ZERO! SCMG’11 2 SEDS-Lite Open source based implementation of Statistical Exception Detection System - SEDS-Lite Introduction SCMG’11 Open Source and “free” tools for Capacity Management Previous reviews: o MeasureIT - Issue 2.05 - Open Source Capacity & Performance Management Tools for Windows & Unix Systems (2004 - Brian Johnson ) o Capacity Planning and Performance Monitoring with Free Tools (for mspcmg 2011! - Adrian Cockcroft) The missed point is BI and Generic DB open source tools usage for Performance Analysis and Reporting o BIRT (Business Intelligence and Reporting Tools) o MySQL o NoSQL SCMG’11 4 Control Charts as a good test of using different tools for Capacity management What is the Control Chart? o Power of Control Charts and IT-Chart Concept (CMG’09 -’10) SCMG’11 5 R-system to built Control Charts CMG’09 workshop “ Power of Control Charts “, SCMG - Mar 25, 2009 o EXAMPLE3: Monthly Profile R download: http://www.r-project.org/ The input data is Unix File Space Utilization: R-script (published on my blog): (FYI: qcc: An R package for quality control charting : http://cran.rproject.org/web/packages/qcc/index.html ) SCMG’11 6 R-system to built Control Charts CMG’09 workshop “ Power of Control Charts “, SCMG - Mar 25, 2009 SCMG’11 7 BIRT based Control Chart What is BIRT – see BIRT Report Designer Tutorial Demo There is a tool that generates control charts against performance data using BIRT: The IT-Control Chart would give much more value for analyzing time stamped historical data. Is that possible to build using BIRT? SCMG’11 8 BIRT based IT-Control Chart: Live demo The case where the input data source is CSV file: But how to calculate that UCL and LCL numbers!? SCMG’11 9 How to calculate that UCL and LCL numbers!? Use EXCEL Pivot Table Weekhour calculation: Pivot Table creation (IT-Chart Builder) SCMG’11 Pivot Chart built 10 How to calculate that UCL and LCL numbers!? BIRT Data Cubes Usage for Performance Data Analysis Data behind IT Control chart is a Cube: Input data is raw stamped measures The result (in the form of BIRT report designer preview) is on the picture at left SCMG’11 11 How to calculate that UCL and LCL numbers!? BIRT Data Cubes Usage for Performance Data Analysis (1) The Reference set or base-line is based on the input raw data with some filtering and computed columns (weekday and weekhour) (2) the Actual data set is the same but having the different filter: (raw[“date”] Greater “2011-04-02”) SCMG’11 12 How to calculate that UCL and LCL numbers!? BIRT Data Cubes Usage for Performance Data Analysis (3) To combine both data sets for comparing base-line vs. actual, the “Data Set1” is built as a “Joint Data Set” by the following BIRT Query builder: (4) Then the Data Cube was built in the BIRT Data Cube Builder with the structure shown on the following screen: (5) The next step is building report starting with Cross Table (which is picked as an object from BIRT Report designer “Pallete”): The picture above shows also what fields are SCMG’11 chosen from Cube to Cross table. 13 How to calculate that UCL and LCL numbers!? BIRT Data Cubes Usage for Performance Data Analysis (6) The final step is dropping “Chart” object from “Palette” and adding UCL calculation using Expression Builder for additional Value (Y) Series: SCMG’11 14 BIRT Data Cubes Usage for Performance Data Analysis To see the result one needs just to run the report or to use a "preview' tab on the report designer window: SCMG’11 15 How to calculate that UCL and LCL numbers!? SQL script for MySQL Database (upload) (1) That raw data (the same) need to be uploaded to some table (CPUutil) in the MySQL schema (ServerMetric) by using the following script (sqlScriptToUploadCSVforSEDS.sql): SCMG’11 16 How to calculate that UCL and LCL numbers!? SQL script for MySQL Database (Calculate) (2) Then the output (result) data (ActualVsHistoric table) is built using the following script (sqlScriptToControlChartforSEDS.sql): SCMG’11 17 Building IT-Control Chart by BIRT against Data from the MySQL Database (3) In the BIRT, the connection to MySQL database is established (to MySQLti with schema ServerMetrics to table ActualVsHistorical): SCMG’11 18 Building IT-Control Chart by BIRT against Data from the MySQL Database (4) the chart is developed the same way like that was done in BIRT based Control Chart post (slide 14): SCMG’11 19 Building IT-Control Chart by BIRT against Data from the MySQL Database (5) In BIRT you can specify report parameters, that could be then a part of any constants including for filtering (to change a baseline or to provide server or metric names). (6) Finally the report should be run to get the following result, which is almost identical with the one built for BIRT based Control Chart post (slide 15) Lastly: R-system can be used to run the same SQL script against MySQL by using Package 'RMySQL‘!!!! SCMG’11 20 EV-Control Chart (to build using BIRT) EV meta-metric introduced in 2001 as a measure of anomaly severity. (The Exception Value Concept to Measure Magnitude of Systems Behavior Anomalies) In CMG’08 paper “Exception Based Modeling and Forecasting” That metric was plotted using Excel to explain how it could be used for a new trend starting point recognition: It is the difference (integral) between actual data and control limits. SCMG’11 21 EV-Control Chart (to build using BIRT) Plotting that meta-metric and/or two their components (EV+ and EV-) over time gives a valuable picture of system behavior. The BIRT example of the EV-chart against the same sample data: EV calculation SCMG’11 22 EV-Control Chart (to build using BIRT) The EXCEL example of the EV-chart against the same sample data: SCMG’11 23 SEDS-Lite: Exception Detection (EV-based) using BIRT Lite version: (UCL=LCL) Below is example of DB2-like SQL script to implement that. It is based on real script developed and successfully tested against real data for BIRT tool to get exceptional servers list: SCMG’11 24 References Igor Trubin: “Global and Application Levels Exception Detection System, Based on MASF Technique ”, Proceedings of the Computer Measurement Group, 2002. (http://www.cmg.org/measureit/shared/trubin_02.pdf) Linwood Merritt, Igor Trubin: “Disk Subsystem Capacity Management Based on Business Drivers I/O Performance Metrics and MASF”, Proceedings of the Computer Measurement Group, 2003. (http://regions.cmg.org/regions/ncacmg/downloads/june162004_session3.doc) Linwood Merritt, Igor Trubin: : “Mainframe Global and Workload Level Statistical Exception Detection System, Based on MASF”, Proceedings of the Computer Measurement Group, 2004. (http://www.cmg.org/membersonly/2004/papers/4179.pdf) Igor Trubin: “Capturing Workload Pathology by Statistical System”, Proceedings of the Computer Measurement Group, 2005. (http://www.cmg.org/membersonly/2005/papers/5016.pdf) Igor Trubin: “System Management by Exception, Measurement Group, 2006. (http://www.cmg.org/membersonly/2006/papers/6120.pdf) Exception Detection Part 6”, Proceedings of the Computer Igor Trubin: “System Management by Exception, Part Final”, Proceedings of the Computer Measurement Group, 2007.(http://regions.cmg.org/regions/scmg/fall_07/richmond/SEDSCMG2007_v4.pdf ) Igor Trubin: “Exception Based Modeling and Forecasting”, Proceedings of the Computer Measurement Group, 2008. (http://itrubin.blogspot.com/2008/08/exception-based-modeling-and.html ) Igor Trubin: “IT-Control Chart”, Proceedings of the Computer Measurement Group, 2010. (http://itrubin.blogspot.com/2010/07/my-new-cmg10-paper-it-control-charts.html ) SCMG’11 25 Questions? Everything is ready to build SEDS-Lite: Using Open Source Tools (R, BIRT and MySQL) Igor Trubin, PhD http://www.itrubin.blogspot.com SCMG’11 26