Introduction to Weka for power system
stability studies
Yongli Zhu
Dr. Kai Sun
University of Tennessee, Knoxville
Feb. 02, 2016
1
Outline
•1. Introduction
•2. Basic GUI
•3. Data mining algorithms: Decision Tree (DT) and Support Vector Machine (SVM)
•4. Application example: Eastern Interconnection system security assessment
2
Introduction
• What is Weka?
– Waikato Environment for Knowledge Analysis
– a data mining research tool originally developed by researchers at the University of
Waikato, New Zealand.
– Weka is also a bird found only on the islands of New Zealand.
3
•Features
-- compact size: around 20 MB
-- easy to install: download the .exe file and click to install
-- easy to use: friendly graphical user interface, concise button layout
-- database connectivity: through JDBC it can read from professional databases, so it
can be applied to large-scale data analytics in industrial applications (see the sketch below)
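As a hedged illustration of the JDBC-based database access mentioned above, the sketch below uses Weka's InstanceQuery class to pull a data set straight from a relational database. The JDBC URL, credentials, and table name are placeholders, not part of the original example, and the matching JDBC driver must be on the classpath.

```java
import weka.core.Instances;
import weka.experiment.InstanceQuery;

public class DatabaseLoadSketch {
    public static void main(String[] args) throws Exception {
        InstanceQuery query = new InstanceQuery();
        // Placeholder connection settings; replace with a real JDBC URL, user, and password.
        query.setDatabaseURL("jdbc:mysql://localhost:3306/pmu_archive");
        query.setUsername("weka_user");
        query.setPassword("secret");
        // Hypothetical table holding the measurement records.
        query.setQuery("SELECT * FROM measurements");
        Instances data = query.retrieveInstances();
        query.disconnectFromDatabase();
        System.out.println("Loaded " + data.numInstances() + " instances from the database.");
    }
}
```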
4
•Features (cont’d)
-- provides various preprocessing, training, testing, and post-processing
functions:
-- 49 data preprocessing tools
-- 76 classification/regression algorithms
-- 8 clustering algorithms
-- 3 algorithms for finding association rules
-- 15 attribute/subset evaluators + 10 search algorithms for
attribute selection
5
Download and Install Weka
•Website: http://www.cs.waikato.ac.nz/~ml/weka/index.html
•Supports multiple platforms (written in Java):
– Windows, Mac OS X, and Linux
6
Basic GUI
•Four modules: Explorer, Experimenter, KnowledgeFlow, Simple CLI
Explorer: exploring and pre-processing the data
Experimenter: experimental environment using batch-style data processing
KnowledgeFlow: new process-model interface using a drag-and-drop style
Simple CLI: command-line style operation
•In this tutorial, we will mainly focus on the “Explorer” module
7
•Explorer:
– Preprocess data
– Classification
– Clustering
– Association Rules
– Attribute Selection
– Data Visualization
8
Explorer (cont’d):
•Data can be imported from files in various formats: ARFF, CSV, binary, etc.
•Data can also be read from a URL or from an SQL database (using JDBC)
•Pre-processing tools in Weka are called "filters"
Weka contains filters for:
– discretization, normalization, resampling, attribute selection, transforming and
combining attributes, etc. (a programmatic sketch follows below)
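As a rough sketch of how these filters can also be applied programmatically (the GUI remains the focus of this tutorial), the example below normalizes all numeric attributes of a data set with the Normalize filter; the file name is a placeholder.

```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Normalize;

public class FilterSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; any ARFF/CSV data set readable by Weka will do.
        Instances raw = DataSource.read("data.arff");
        Normalize norm = new Normalize();   // scales every numeric attribute into [0, 1]
        norm.setInputFormat(raw);
        Instances normalized = Filter.useFilter(raw, norm);
        System.out.println(normalized.numAttributes() + " attributes after normalization.");
    }
}
```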
9
Data mining algorithm
•What is data mining?
•Data mining: the transformation of large data sets into meaningful patterns and rules.
•Two types:
-- supervised: learning from data samples with prior class information. The class information
of the training data set is given (e.g., blue and red classes); the goal is to find a separating
surface (an explicit function or a set of rules).
-- unsupervised: learning from data samples without class information. The class information
of the training data set is NOT given; the goal is to split the data into different groups
(no decision surface needs to be generated).
10
Data mining algorithm: Decision tree (DT)
•In Weka, we mainly apply the "J48" algorithm (Weka's implementation of C4.5) for DT
•Main Features:
1) Divides the sample space with axis-parallel (perpendicular) boundaries
2) Uses probability and information theory (e.g., "entropy")
3) The resulting decision rules have a clear physical meaning (e.g., "Va > 10 kV and Ia < 1 A"; see the sketch after this list)
•Advantage:
1) Simple to implement
2) Fast training
•Disadvantage:
1) Hard to fit complex decision boundaries
2) Prone to overfitting
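To make the "clear physical meaning" point above concrete, the fragment below is a minimal sketch of how a two-split decision path such as "Va > 10 kV and Ia < 1 A" reads as nested axis-parallel tests; the thresholds and class labels are purely illustrative, not taken from a trained tree.

```java
public class RuleSketch {
    // Hypothetical rule mirroring the "Va > 10 kV and Ia < 1 A" example above.
    static String classify(double vaKv, double iaAmp) {
        if (vaKv > 10.0) {        // first axis-parallel split, on the voltage attribute
            if (iaAmp < 1.0) {    // second axis-parallel split, on the current attribute
                return "Secure";
            }
        }
        return "Insecure";
    }

    public static void main(String[] args) {
        System.out.println(classify(12.0, 0.5)); // falls into the "Secure" leaf
    }
}
```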
11
A DT classification example in Weka
[Figure: scatter plot of the training samples in the x-y plane (both axes from 0 to 5), with the axis-parallel DT decision boundary at y = 1.5]
12
Data mining algorithm: Support Vector Machine (SVM)
[Figure: separating hyperplane with normal vector w, margin ρ, and samples x, x′ at distance r from the plane]
Support Vector Machine (SVM) is essentially characterized by:
1) A decision function fully specified by a subset of the training samples, the so-called support vectors.
2) A solving process that involves Quadratic Programming (QP) techniques.
13
SVM mathematical model: quadratic programming

\[
\min_{w,\,b,\,\xi}\ \ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i
\]
\[
\text{s.t.}\quad y_i\,(w\cdot x_i + b) \ge 1-\xi_i,\qquad \xi_i \ge 0,\qquad i = 1,2,\dots,N
\]
-- Rigorous mathematical foundation: maximizes the distance (margin) between the
decision surface and the samples of the different classes.
-- Multiple choices of kernel functions for classifier design.
-- Higher accuracy for small sample sets.
14
•Advantage:
-- Theoretically ensured accuracy
-- Fewer overfitting problems than DT
•Disadvantage:
-- Higher computational burden than DT
-- Selection of parameters relies heavily on personal experience or trial-and-error methods (a Weka training sketch follows below)
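For completeness, the sketch below shows one way to train an SVM in Weka through its SMO classifier with an RBF kernel; the file name, the value of C, and the kernel width gamma are illustrative assumptions and would normally be tuned by experience or trial and error, as noted above.

```java
import weka.classifiers.functions.SMO;
import weka.classifiers.functions.supportVector.RBFKernel;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SvmSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder data set; the last attribute is assumed to be the class label.
        Instances data = DataSource.read("data.arff");
        data.setClassIndex(data.numAttributes() - 1);

        SMO svm = new SMO();                 // Weka's SVM trained by sequential minimal optimization
        svm.setC(1.0);                       // illustrative penalty parameter C
        RBFKernel kernel = new RBFKernel();
        kernel.setGamma(0.01);               // illustrative kernel width
        svm.setKernel(kernel);

        svm.buildClassifier(data);
        // Predict the class of the first instance as a quick sanity check.
        double label = svm.classifyInstance(data.instance(0));
        System.out.println("Predicted class index: " + label);
    }
}
```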
15
Application Example: Eastern Interconnection System Security Assessment
A DT classifier is designed to identify insecure status.
Simulate N-1 contingencies on a list of HV transmission lines
60 PMU measurements on bus angles and line flows are obtained for DT training
2 nominal attributes are also obtained: the From Bus and To Bus numbers of the
contingency line (the 3-phase fault is applied at the From Bus)
13079 secure cases + 2601 insecure cases = 15680 cases in total
16
Topology of the Eastern Interconnection system
17
Attributes for training (i.e. data inputs)
60 PMU measurements as the training attributes
Attribute name   Physical meaning
P_537_606        Line flow from bus 537 to bus 606
P_301_233        Line flow from bus 301 to bus 233
P_301_430        Line flow from bus 301 to bus 430
P_233_235        Line flow from bus 233 to bus 235
P_235_390        Line flow from bus 235 to bus 390
...              ...
A_246_233        Angle difference between buses 246 and 233
A_390_233        Angle difference between buses 390 and 233
A_430_233        Angle difference between buses 430 and 233
A_234_537        Angle difference between buses 234 and 537
A_234_539        Angle difference between buses 234 and 539
...              ...
18
Step-1: read the data set: file name "Data_EI.csv" (a programmatic sketch is shown below)
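The same step can also be done in code; the sketch below is a minimal example of reading "Data_EI.csv" with Weka's CSVLoader, assuming the file sits in the working directory.

```java
import java.io.File;
import weka.core.Instances;
import weka.core.converters.CSVLoader;

public class LoadCsvSketch {
    public static void main(String[] args) throws Exception {
        CSVLoader loader = new CSVLoader();
        loader.setSource(new File("Data_EI.csv"));   // assumed to be in the working directory
        Instances data = loader.getDataSet();
        System.out.println(data.numInstances() + " instances, "
                + data.numAttributes() + " attributes loaded.");
    }
}
```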
19
Step-2: choose a combination of attributes for training
•For example, we can choose all the attributes except "FB" and "OB" (bus numbers);
•We also need to choose "SECU$", since it gives the a priori class information.
20
•Choose "FB" and "OB"
•Click "Remove" to keep only the attributes we need (see the sketch below)
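A rough programmatic equivalent of removing "FB" and "OB" is shown below using the Remove filter; the attribute indices "1,2" are placeholders and must be replaced by the actual column positions of FB and OB in Data_EI.csv.

```java
import java.io.File;
import weka.core.Instances;
import weka.core.converters.CSVLoader;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

public class RemoveAttributesSketch {
    public static void main(String[] args) throws Exception {
        CSVLoader loader = new CSVLoader();
        loader.setSource(new File("Data_EI.csv"));
        Instances data = loader.getDataSet();

        Remove remove = new Remove();
        remove.setAttributeIndices("1,2");   // placeholder positions of FB and OB (1-based)
        remove.setInputFormat(data);
        Instances reduced = Filter.useFilter(data, remove);
        System.out.println(reduced.numAttributes() + " attributes remain after removal.");
    }
}
```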
21
Decision tree training
•Click "Classify"; under the "Test options" panel, choose "Percentage split" and set it to 66%
(i.e., 2/3 of the original data is used for training and 1/3 for testing)
•In the attribute drop-down list, choose "(Nom) SECU$" as the class (a programmatic equivalent is sketched below)
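A sketch of the same set-up in code, assuming the data have already been loaded and FB/OB removed as above: the class attribute is set to "SECU$" and a 66/34 percentage split is made after shuffling. The random seed is an arbitrary choice.

```java
import java.util.Random;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SplitSketch {
    public static void main(String[] args) throws Exception {
        // DataSource also reads CSV files; "Data_EI.csv" is the file used in this example.
        Instances data = DataSource.read("Data_EI.csv");
        data.setClass(data.attribute("SECU$"));          // the a priori class label

        data.randomize(new Random(1));                   // shuffle before splitting
        int trainSize = (int) Math.round(data.numInstances() * 0.66);
        Instances train = new Instances(data, 0, trainSize);
        Instances test  = new Instances(data, trainSize, data.numInstances() - trainSize);
        System.out.println(train.numInstances() + " training / "
                + test.numInstances() + " testing instances.");
    }
}
```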
22
•Click the "Choose" button under "Classifier"
•In the drop-down list, choose "trees" --> "J48", then click the "Close" button below
23
•Then double-click the text "J48 -C 0.25 -M 2"
•In the pop-up menu, choose "True" for the "unpruned" option (see the sketch below)
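The same configuration in code, continuing from the training/testing split sketched earlier: J48 with the unpruned option switched on. The GUI default "-C 0.25 -M 2" corresponds to the confidence factor and minimum leaf size set below.

```java
import weka.classifiers.trees.J48;

// 'train' is the training set from the percentage-split sketch above.
J48 tree = new J48();
tree.setUnpruned(true);           // same as choosing "unpruned = True" in the GUI
tree.setConfidenceFactor(0.25f);  // the "-C 0.25" default (only used when pruning)
tree.setMinNumObj(2);             // the "-M 2" default
tree.buildClassifier(train);
System.out.println(tree);         // prints the learned tree in text form
```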
24
•Click the "Start" button
•The training results will be displayed in the "Classifier output" window
•5331 instances (i.e., data samples) are used for testing; the remaining 10349 instances
are used for training
•Result interpretation: Accuracy = 86.6629% (i.e., error rate ≈ 13.3%); see the evaluation sketch below
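A hedged sketch of how the same accuracy figure can be reproduced programmatically with Weka's Evaluation class, continuing from the split and the J48 model built in the earlier sketches.

```java
import weka.classifiers.Evaluation;

// 'train', 'test', and 'tree' come from the previous sketches.
Evaluation eval = new Evaluation(train);
eval.evaluateModel(tree, test);
System.out.println("Accuracy: " + eval.pctCorrect() + " %");
System.out.println(eval.toSummaryString());   // overall statistics
System.out.println(eval.toMatrixString());    // the confusion matrix discussed on the next slide
```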
25
•Result interpretation (cont'd):
•4277 instances are classified as "Secure" by the decision tree, of which 270 instances
are actually "Insecure", i.e., misclassified.
•1054 instances are classified as "Insecure", of which 441 are misclassified.
•Other indices (e.g., TP rate, FP rate, ROC area) for evaluating the classifier
performance can be found in the Weka manual.
26
•Tree view: right-click the item in the "Result list" and choose "Visualize tree"
27
Pruned tree
•The originally generated decision tree usually has many levels and leaves; in other
words, the decision depth is so large that it can slow down decision making.
•To improve this, various pruning techniques can be applied.
•In Weka, we can simply go back to the "J48" option page from the previous step and
set the "unpruned" option back to "False".
28
•After pruning, the number of leaves is reduced to 4, and only 3 comparisons are
needed to make a final decision.
29
Questions & Additional exercises (optional)
• 1. What is the geometrical meaning of each non-leaf node on the tree?
• 2. Try two other combinations of attributes:
1) Use any 3 different attributes
2) Use all the attributes
Re-do the DT training; compare and analyze the result differences.
• 3. What does "cross-validation" mean? (search Google)
Under "Test options", choose "Cross-validation [Folds: 10]" and re-do the training
using the same three attributes as in the previous example. Analyze and
compare the result differences. (A programmatic sketch follows below.)
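As a rough illustration of what the 10-fold cross-validation option does, the sketch below runs it programmatically on a J48 classifier; the file and class names assume the data set prepared in the earlier sketches.

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CrossValidationSketch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("Data_EI.csv");   // placeholder: the prepared data set
        data.setClass(data.attribute("SECU$"));

        Evaluation eval = new Evaluation(data);
        // 10-fold cross-validation: the data are split into 10 folds, and each fold is used
        // once for testing while the other 9 folds are used for training.
        eval.crossValidateModel(new J48(), data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
    }
}
```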
30