Anomaly Detection in Premise Energy Consumption Data

advertisement
Anomaly Detection in
Premise Energy
Consumption Data
Jason W Black (Energy Systems, Battelle)
Yi Zhang, Weiwei Chen (MSL, GE Global Research
Center)
CMU Electricity Conf
March 2012
Demand Response Estimation
• Critical X’s: weather,
Temperature, seasons, days of
week, devices, time0
• X’s are gathered from:
• AMI data
• Real time weather forecasts
• Historical data
DR Forecast
18,000
17,500
• Methods:
• Sampling Methods
• Categorization Techniques
• Empirical Methods
• Regression techniques
• Bayesian methods
• Artificial Intelligence –
Neural Nets, SVM,0
Load (MW)
17,000
16,500
16,000
15,500
Base Load Forecast
DR Actual
15,000
DR Forecast
14,500
13
14
15
16
17
18
Hour
19
20
21
22
23
24
• 2 Error Sources:
- Base Load Forecast
- Response Estimate
2/
GE Title or job number /
3/13/2012
Typical Household Load Patterns
Weekend with heating/AC
Weekday without heating/AC
Weekday with heating/AC
Anomaly
3/
GE Title or job number /
3/13/2012
Proposed Methods for Anomaly
Detection
Regression-Based Anomaly Detection
Distribution-Based Anomaly Detection
Attribute-Based Anomaly Detection
Data collections
and extraction
Which
method?
Entropy
calculation
Data preprocession
Distribution-based
method
Attribution generation
Regression-based
method
Attribute-based
method
Output the
anomalies
4/
GE Title or job number /
3/13/2012
Regression-based Anomaly Detection
5/
GE Title or job number /
3/13/2012
Distribution-based Anomaly Detection
In information theory, entropy is a measure of the uncertainty associated
with a random variable.
Normal Weekday
E(Abnormal) <
E(Normal)
High
Entrop
y
Low
Entrop
y
Abnormal
Weekday
6/
GE Title or job number /
3/13/2012
Attribute-based Anomaly Detection
Attributes generated from daily
consumption data are:
mean
Variance
Maximum
range (=maximum - minimum)
ratio (=maximum/minimum).
Log(Var)
Normal Weekday
High usage
mean and
peak
Vacation Days
Weekdays
Weekends
Low usage
mean and
peak
Vacation Weekday
Log(Mean)
7/
GE Title or job number /
3/13/2012
Raw Data
2 sets of raw data
Single customer data
Detailed usage
8 months
Time
1/1/2010
1/1/2010
1/1/2010
1/1/2010
Air
A/C Down A/C Up
Handler Air
Entertain
Condensi Condensi Downstair Handler Computer Dishwash Dryer(Gas ment
Microwav
ng Unit
ng Unit
s
Upstairs Room
er
)
Center
e
Other
Range
0:00
0
0
0
0
0.03
0
0
0.01
0
0.01
0:15
0
0
0
0.06
0.03
0
0
0.01
0
0.01
0:30
0
0
0
0.03
0
0
0.01
0
0
0:45
0
0
0
0.05
0.03
0
0
0.01
0
0.02
Pool
Pump
0
0
0
0
Indoor
Temperat Indoor
ure
Temperat
Refrigerat
Downstair ure
Weather
or
Main
Washer s
Upstairs Temp
0
0.06
0.1
0
281.97
283.64
53.6
0
0.01
0.1
0
282.07
283.53
53.6
0
0.02
0.05
0
282.01
283.64
53.6
0
0.07
0.18
0
282.14
283.71
53.6
9-customer data
Some information is missing
90 days
8/
GE Title or job number /
3/13/2012
Abnormal Days Detection
(use statistics, machine learning, etc. to detect days with abnormal behavior/responsiveness
in nonevent days or during DR events)
1.2
1
0.8
0.6
0.4
1
1.5
1
0.5
0.5
5 10 15 20
Usage(Kw)
2
1.5
1
0.8
0.6
0.4
0.2
5 10 15 20
4
3
2
1
5 1015 20
0.5
0.4
0.3
0.2
0.1
5 10 15 20
0.2
0.2
0.2
0.1
0.1
5 10 15 20
5 10 15 20
1
5 10 15 20
3
3
2
2
1
1
5 10 15 20
0.1
5 10 15 20
5 10 15 20
0.5
5 10 15 20
5
4
3
2
1
5 10 15 20
2.5
2
1.5
1
0.5
4
3
2
1
2
5 10 15 20
5 10 15 20
2
1.5
1
0.5
3
4
3
2
1
5 10 15 20
0.3
5 10 15 20
5 10 15 20
2
1.5
1
0.5
0.4
0.3
5 10 15 20
4
3
2
1
2
1.5
1
0.5
0.4
5 10 15 20
10
8
6
4
2
5 10 15 20
2
1.5
1
0.5
5 10 15 20
5 10 15 20
2
1.5
1
0.5
5 10 15 20
5 10 15 20
0.4
2
1.5
1
0.5
5 10 15 20
5 10 15 20
1.2
1
0.8
0.6
0.4
5
4
3
2
1
2
1.5
1
0.5
5 10 15 20
5 10 15 20
5 10 15 20
1
0.8
0.6
0.4
0.2
2
1.5
1
0.5
1
5 10 15 20
5 10 15 20
0.3
4
3
2
1
3
2
1
5 10 15 20
5 10 15 20
1.5
1
0.8
0.6
0.4
1
0.8
0.6
0.4
0.3
0.2
5 10 15 20
0.1
5 10 15 20
Time (hr)
9/
GE Title or job number /
3/13/2012
Simulation Results: 9-customer Data
10 /
GE Title or job number /
3/13/2012
Simulation Results: Single-customer Data
Detection Results
ROC Curve
11 /
GE Title or job number /
3/13/2012
Conclusions
Anomaly detection achievable in demand
response estimation.
Proposed Regression-based, distribution-based,
and attribute based methods can be used.
Data mining on each customer’s power
consumption profile is required for algorithm
selection and parameter tuning
12 /
GE Title or job number /
3/13/2012
Backup
13 /
GE Title or job number /
3/13/2012
Distribution-based Anomaly Detection cont.
14 /
GE Title or job number /
3/13/2012
Piecewise Regression
15 /
GE Title or job number /
3/13/2012
Attribute-based Anomaly Detection cont.
Algorithm:
1. Generate attributes from raw energy consumption
data
2. Run k-Means clustering n times using random start
point and select the best one output
3. Identify the cluster with the smallest mean usage as
anomaly days
16 /
GE Title or job number /
3/13/2012
Different Anomaly Detection Methodologies
Regression
based
Distribution based
Attribute based
Log(Var)
Vacation Days
Weekdays
Weekends
Log(Mean) 17 /
GE Title or job number /
3/13/2012
Demand Response (DR) Process
Estimation of future energy consumption
Estimation of future energy generation
No
Is DR needed?
Finish
Yes
Assign DR event to customers
DR estimation
Does reduction amount meet
needs?
No
Yes
Adjust DR event assignment
Sent DR event to customers
18 /
GE Title or job number /
3/13/2012
Download