Introduction to Weka for Power System Stability Studies
Yongli Zhu, Dr. Kai Sun
University of Tennessee, Knoxville
Feb. 02, 2016

Outline
• 1. Introduction
• 2. Basic GUI
• 3. Data mining algorithms: Decision Tree (DT) and Support Vector Machine (SVM)
• 4. Application example: Eastern Interconnection system security assessment

Introduction
• What is Weka?
– Waikato Environment for Knowledge Analysis: a research tool for data mining originally developed by researchers at the University of Waikato, New Zealand.
– The weka is also a bird found only in New Zealand.

• Features
-- Compact size: around 20 MB
-- Easy to install: download the .exe file and click to install
-- Easy to use: friendly graphical user interface with a concise button layout
-- Database connectivity: through standard JDBC support, it can be applied to large-scale data analytics for industrial applications

• Features (cont'd)
-- Provides a wide range of preprocessing, training, testing and post-processing functions:
   49 data preprocessing tools
   76 classification/regression algorithms
   8 clustering algorithms
   3 algorithms for finding association rules
   15 attribute/subset evaluators + 10 search algorithms for attribute selection

Download and Install Weka
• Website: http://www.cs.waikato.ac.nz/~ml/weka/index.html
• Supports multiple platforms (written in Java):
– Windows, Mac OS X, and Linux

Basic GUI
• Four modules: Explorer, Experimenter, KnowledgeFlow, Simple CLI
– Explorer: interactive data pre-processing and analysis
– Experimenter: an experimental environment for batch-style data processing
– KnowledgeFlow: a process-flow interface operated by dragging and connecting components
– Simple CLI: command-line style operation
• In this tutorial, we will mainly focus on the "Explorer" module

• Explorer:
– Preprocess data
– Classification
– Clustering
– Association Rules
– Attribute Selection
– Data Visualization

Explorer (cont'd):
• Data can be imported from a file in various formats: ARFF, CSV, binary, etc.
• Data can also be read from a URL or from an SQL database (using JDBC).
• Pre-processing tools in Weka are called "filters". Weka contains filters for discretization, normalization, resampling, attribute selection, transforming and combining attributes, etc. (a scripted example of loading data and applying a filter follows below).
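The same import and filtering steps can also be scripted through Weka's Java API instead of the Explorer GUI. The following is only a minimal sketch: the file name "power_cases.csv" and the choice of the Normalize filter are illustrative assumptions, and any of the filters listed above could be substituted.

```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Normalize;

public class LoadAndFilterExample {
    public static void main(String[] args) throws Exception {
        // Load a data set; DataSource recognizes ARFF, CSV and other formats
        // by the file extension. "power_cases.csv" is a hypothetical file name.
        DataSource source = new DataSource("power_cases.csv");
        Instances data = source.getDataSet();

        // Use the last attribute as the class label (adjust if needed).
        data.setClassIndex(data.numAttributes() - 1);

        // Apply one of the pre-processing "filters" mentioned above,
        // e.g. normalize the numeric attributes to the [0, 1] range.
        Normalize normalize = new Normalize();
        normalize.setInputFormat(data);
        Instances filtered = Filter.useFilter(data, normalize);

        System.out.println("Instances: " + filtered.numInstances()
                + ", attributes: " + filtered.numAttributes());
    }
}
```

Compile and run it with weka.jar on the classpath (e.g. javac -cp weka.jar LoadAndFilterExample.java).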
Data mining algorithms
• What is data mining?
• Data mining: the transformation of large data sets into meaningful patterns and rules.
• Two types:
-- Supervised: learning from data samples with prior class information. The class information of the training data set is given (e.g. blue and red classes); the goal is to find a separating surface (an explicit function or a set of rules).
-- Unsupervised: learning from data samples without class information. The class information of the training data set is NOT given; the goal is to split the samples into different groups (no decision surface needs to be generated).

Data mining algorithm: Decision Tree (DT)
• In Weka, we mainly apply the so-called "J48" algorithm for DT.
• Main features:
1) Divides the sample space with perpendicular (axis-parallel) cuts
2) Uses probability and information theory (e.g. "entropy")
3) The resulting decision rules have a clear physical meaning (e.g. "Va > 10 kV and Ia < 1 A")
• Advantages: 1) Simple to implement 2) Fast training
• Disadvantages: 1) Hard to fit a complex decision boundary 2) Prone to overfitting

A DT classification example in Weka
[Figure: two-dimensional samples plotted in the x-y plane, with the axis-parallel split y = 1.5 found by the tree.]

Data mining algorithm: Support Vector Machine (SVM)
[Figure: maximum-margin separating hyperplane with weight vector w, margin ρ and support vectors.]
• Support Vector Machine (SVM) is essentially characterized by:
1) A decision function fully specified by a subset of the training samples, the so-called support vectors.
2) A solving process that involves Quadratic Programming (QP) techniques.

SVM mathematical model: quadratic programming

    min_{w,b,ξ}  (1/2)‖w‖² + C Σ_{i=1}^{N} ξ_i
    s.t.  y_i (w · x_i + b) ≥ 1 − ξ_i,  i = 1, 2, …, N
          ξ_i ≥ 0,  i = 1, 2, …, N

• Rigorous mathematical foundation: maximize the margin between the decision surfaces of the different classes.
• Multiple choices of kernel functions for classifier design.
• Higher accuracy for small sample sets.

• Advantages:
-- Theoretically grounded accuracy
-- Fewer overfitting problems than DT
• Disadvantages:
-- Higher computational burden than DT
-- Parameter selection relies heavily on personal experience or trial-and-error methods

Application Example: Eastern Interconnection System Security Assessment
• A DT classifier is designed to identify insecure operating status.
• N-1 contingencies are simulated on a list of HV transmission lines.
• 60 PMU measurements of bus angles and line flows are obtained for DT training.
• 2 nominal attributes are also obtained: the From Bus and To Bus numbers of the contingency line (the 3-phase fault is applied at the From Bus).
• 13079 secure cases + 2601 insecure cases = 15680 cases in total.

Topology of the Eastern Interconnection system

Attributes for training (i.e. data inputs)
• 60 PMU measurements serve as the training attributes:

  Attribute name   Physical meaning
  P_537_606        Line flow from bus 537 to bus 606
  P_301_233        Line flow from bus 301 to bus 233
  P_301_430        Line flow from bus 301 to bus 430
  P_233_235        Line flow from bus 233 to bus 235
  P_235_390        Line flow from bus 235 to bus 390
  ...              ...
  A_246_233        Angle difference between buses 246 and 233
  A_390_233        Angle difference between buses 390 and 233
  A_430_233        Angle difference between buses 430 and 233
  A_234_537        Angle difference between buses 234 and 537
  A_234_539        Angle difference between buses 234 and 539
  ...              ...

Step 1: Read the data set (file name "Data_EI.csv").

Step 2: Choose a combination of attributes for training.
• For example, we can choose all the attributes except "FB" and "OB" (the bus numbers).
• We also need to keep "SECU$", since it gives the a priori class information.
• Select "FB" and "OB" and click "Remove" to keep only the attributes we need.

Decision tree training
• Click "Classify". Under the "Test options" panel, choose "Percentage split" with 66% (i.e. about 1/3 of the original data is used for testing and 2/3 for training).
• In the attribute list, choose "(Nom) SECU$".
• Click the "Choose" button under "Classifier". In the drop-down list, choose "trees" --> "J48", then click the "Close" button below.
• Then double-click the text "J48 -C 0.25 -M 2". In the pop-up menu, choose "True" for the "unpruned" option.
• Click the "Start" button in the middle of the window. The training results are displayed under "Classifier output".
• 5331 instances (i.e. data samples) are used for testing; the remaining 10349 instances are used for training.
• Result interpretation: Accuracy = 86.6629% (i.e. error rate ≈ 13.34%). The same workflow can also be reproduced with the Java API, as sketched below.
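For reference, the GUI workflow above (removing FB and OB, training an unpruned J48 and evaluating it on a 66%/34% percentage split) can be approximated with Weka's Java API. This is a hedged sketch, not the exact procedure used in the slides: the attribute indices passed to the Remove filter and the assumption that SECU$ is the last column are placeholders that must be adapted to the actual layout of Data_EI.csv.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class DTTrainingExample {
    public static void main(String[] args) throws Exception {
        // Load the case data (CSV is detected by the file extension).
        Instances data = new DataSource("Data_EI.csv").getDataSet();

        // Drop the two nominal bus-number attributes; "1,2" is a placeholder --
        // replace it with the actual column positions of FB and OB.
        Remove remove = new Remove();
        remove.setAttributeIndices("1,2");
        remove.setInputFormat(data);
        data = Filter.useFilter(data, remove);

        // Use the security label (assumed to be the last column) as the class.
        data.setClassIndex(data.numAttributes() - 1);

        // 66% / 34% split, as in the Explorer "Percentage split" option.
        data.randomize(new Random(1));
        int trainSize = (int) Math.round(data.numInstances() * 0.66);
        Instances train = new Instances(data, 0, trainSize);
        Instances test  = new Instances(data, trainSize, data.numInstances() - trainSize);

        // Unpruned J48 tree, matching the "unpruned = True" option set in the GUI.
        J48 tree = new J48();
        tree.setUnpruned(true);
        tree.buildClassifier(train);

        // Evaluate on the held-out split and print summary and confusion matrix.
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(tree, test);
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toMatrixString());
    }
}
```

Evaluation.toMatrixString() prints the same kind of confusion matrix that the Explorer shows in the "Classifier output" panel, which is the basis of the interpretation below.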
• Result interpretation (cont'd):
-- 4277 instances are classified as "Secure" by the decision tree; 270 of them are actually "Insecure", i.e. misclassified.
-- 1054 instances are classified as "Insecure"; 441 of them are misclassified.
-- Other indices (e.g. TP rate, FP rate, ROC area) for evaluating classifier performance are described in the Weka manual.

• Tree view: right-click the item in the "Result list" and choose "Visualize tree".

Pruned tree
• The decision tree generated initially usually has many levels and leaves; in other words, its depth is so large that it can slow down decision making.
• To improve this, various pruning techniques can be applied.
• In Weka, we can simply go back to the "J48" option page from the previous step and set the "unpruned" option back to "False".
• After pruning, the number of leaves is reduced to 4, and only 3 comparisons are needed to reach a final decision.

Questions & Additional exercises (optional)
• 1. What is the geometrical meaning of each non-leaf node of the tree?
• 2. Try two other combinations of attributes:
1) Use any 3 different attributes
2) Use all the attributes
Re-do the DT training; compare and analyze the differences in the results.
• 3. What does "cross-validation" mean? (Search Google.) Under "Test options", choose "Cross-validation" with 10 folds and re-do the training using the same three attributes as in the previous example. Analyze and compare the differences in the results (a minimal API sketch for cross-validation follows below).
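For exercise 3, the 10-fold cross-validation can also be run programmatically. The sketch below is a minimal example under the same assumptions as before (Data_EI.csv on the working path, class label in the last column); the FB/OB removal step is omitted for brevity and a default, pruned J48 is used.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CrossValidationExample {
    public static void main(String[] args) throws Exception {
        // Load the same data set used in the percentage-split experiment.
        Instances data = new DataSource("Data_EI.csv").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);  // assumed class column

        // Default J48 is pruned (unpruned = false), as in the "Pruned tree" slide.
        J48 tree = new J48();

        // 10-fold cross-validation: the data is split into 10 folds, and each fold
        // is used once for testing while the other 9 folds are used for training.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(tree, data, 10, new Random(1));

        System.out.println(eval.toSummaryString("\n10-fold cross-validation results:\n", false));
        System.out.println(eval.toMatrixString());
    }
}
```

crossValidateModel() builds and evaluates the classifier once per fold, so the reported accuracy is aggregated over all 10 test folds rather than over a single held-out split.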