Predicting Solar Generation from Weather Forecasts
Advisor: Professor Arye Nehorai
Chenlin Wu, Yuhan Lou
Department of Electrical and Systems Engineering
Background
• Smart grid: increasing the contribution of renewable energy in the grid
• Solar generation: intermittent and non-dispatchable

Goals
• Creating automatic prediction models
• Predicting future solar power intensity given weather forecasts

Kernel Trick for SVR
The kernel trick is a way of mapping observations from a general set S (the input space) into an inner product space V (a high-dimensional feature space):
Φ: ℝ^n → ℝ^m,  m ≫ n
The SVR prediction can then be written entirely in terms of kernel evaluations:
f(x) = Σᵢ (αᵢ − αᵢ*) k(xᵢ, x) + b,  where k(xᵢ, x) = ⟨φ(xᵢ), φ(x)⟩.
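As a concrete sketch, kernelized ε-insensitive SVR of exactly this form is available in scikit-learn. The toy data, C, ε, and γ below are illustrative stand-ins, not the poster's settings; in the poster's setting x would be the 9-metric weather vector.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 24, size=(200, 1))   # e.g. hour of day as a single feature
y = np.maximum(0, np.sin(np.pi * X[:, 0] / 24)) + 0.05 * rng.standard_normal(200)

# k(x_i, x) = exp(-gamma * ||x_i - x||^2) plays the role of <phi(x_i), phi(x)>
model = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=0.5).fit(X, y)
y_hat = model.predict(X)

# f(x) = sum_i (alpha_i - alpha_i*) k(x_i, x) + b:
# model.dual_coef_ holds the (alpha_i - alpha_i*) of the support vectors
print(model.dual_coef_.shape, float(np.mean((y - y_hat) ** 2)))
```

Only the support vectors (points outside the ε-tube) get nonzero coefficients, which is what keeps the kernel expansion sparse.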

Data Source
• NREL National Solar Radiation Database 1991-2010
• Hourly weather and solar intensity data for 20 years
• Station: ST LOUIS LAMBERT INT'L ARPT, MO
• Input: combination of 9 weather metrics
➢ Date, Time, Opaque Sky Cover, Dry-bulb Temperature, Dew-point Temperature, Relative Humidity, Station Pressure, Wind Speed, Liquid Precipitation Depth
• Output:
➢ Amount of solar radiation (Wh/m²) received in a collimated beam on a surface normal to the sun
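A sketch of assembling the 9-metric input matrix from hourly records. The column names and values below are invented stand-ins for illustration, not the actual NSRDB schema:

```python
import pandas as pd

# Hypothetical hourly records; real NSRDB files have their own field names.
records = pd.DataFrame({
    "date": ["2010-06-01"] * 3,
    "time": [10, 11, 12],                   # hour of day
    "opaque_sky_cover": [3, 5, 8],          # tenths
    "dry_bulb_temp": [24.1, 25.3, 26.0],    # deg C
    "dew_point_temp": [14.2, 14.5, 14.9],   # deg C
    "relative_humidity": [54, 51, 50],      # percent
    "station_pressure": [1012, 1011, 1011], # mbar
    "wind_speed": [3.1, 3.4, 2.9],          # m/s
    "liquid_precip_depth": [0.0, 0.0, 0.2], # mm
    "direct_normal_irradiance": [612, 688, 534],  # target, Wh/m^2
})

feature_cols = [c for c in records.columns if c != "direct_normal_irradiance"]
# Encode the date as day-of-year so every feature is numeric
X = records[feature_cols].assign(date=pd.to_datetime(records["date"]).dt.dayofyear)
y = records["direct_normal_irradiance"]
print(X.shape)  # one 9-metric row per hour
```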
Methods
• In our research, regression is used to learn a mapping from an input space of n-dimensional vectors to an output space of real-valued targets
• We apply different regression methods, including:
➢ Linear least squares regression
➢ Support vector regression (SVR) using multiple kernel functions
➢ Gaussian processes

Linear Model
y = f(X) = Xa + e
where y ∈ ℝ^n: measurements (solar intensity)
X ∈ ℝ^(n×(p+1)): each row is a p-dimensional input plus an intercept term
a ∈ ℝ^(p+1): unknown coefficients
e ∈ ℝ^n: random noise
Loss function (squared error): ‖y − ŷ‖² = ‖y − Xa‖²
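The least-squares fit of a, with an intercept column appended to X, can be sketched as follows (the data are synthetic and noise-free so the recovered coefficients are exact):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 3
X_raw = rng.standard_normal((n, p))
a_true = np.array([2.0, -1.0, 0.5, 4.0])      # p coefficients plus intercept
X = np.hstack([X_raw, np.ones((n, 1))])       # n x (p+1), last column = intercept
y = X @ a_true                                # noise-free for clarity

a_hat, *_ = np.linalg.lstsq(X, y, rcond=None) # minimizes ||y - X a||^2
print(a_hat)
```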
Support Vector Regression (SVR)
Given training data {(x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)}
Linear SVR model: f(x) = ⟨w, x⟩ + b = wᵀx + b

minimize (1/2)‖w‖² + C Σᵢ (ξᵢ + ξᵢ*)
subject to yᵢ − f(xᵢ) ≤ ε + ξᵢ
     f(xᵢ) − yᵢ ≤ ε + ξᵢ*
     ξᵢ, ξᵢ* ≥ 0

Loss function (ε-insensitive):
|ξ|_ε := 0 if |ξ| ≤ ε; |ξ| − ε otherwise.

At the optimum the weight vector is a combination of the (mapped) training inputs, w = Σᵢ (αᵢ − αᵢ*) φ(xᵢ), which is what the kernel trick exploits.

Principal Component Analysis (PCA)
• Some weather metrics correlate strongly
➢ Such as: temperature & time of the day
• Applying PCA to remove redundant information
The graph shows the MSE for different input dimensions. The feature set with 8 dimensions performs best, with the lowest test error. As long as we keep more than 5 principal components, the errors are lower than those of linear regression.
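A minimal PCA sketch via the SVD, with two artificially correlated columns standing in for temperature and time of day; the data and component count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
t = rng.uniform(0, 24, 300)
# Column 2 tracks column 1 (temperature follows time of day), plus two noise columns
F = np.column_stack([t,
                     10 + 0.8 * t + 0.1 * rng.standard_normal(300),
                     rng.standard_normal(300),
                     rng.standard_normal(300)])

Fc = F - F.mean(axis=0)                 # center each feature
U, s, Vt = np.linalg.svd(Fc, full_matrices=False)
k = 3
Z = Fc @ Vt[:k].T                       # n x k reduced feature set

explained = (s ** 2) / np.sum(s ** 2)   # variance share per principal component
print(Z.shape, explained.round(3))
```

Because the two correlated columns share one dominant direction, the first principal component captures most of the variance, so the redundant dimension can be dropped.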
Experiments
• Predictions are made by the proposed methods
• 20% of the data is used for training and 10% of the data is used for testing
• MSE is used to evaluate the result of the regression. The prediction errors of the three methods are:
➢ Linear regression: 215.7884
➢ SVR: 130.1537
➢ SPGP: 122.9167

Gaussian Processes (GP)
GP regression model:
yᵢ = f(xᵢ) + εᵢ, where the noise εᵢ ~ N(0, σ²)
• Assume a zero-mean GP prior distribution over inference functions f(·). In particular,
(f(x₁), …, f(xₙ)) ~ N(0, K),  K_pq = Cov(f(x_p), f(x_q)) = K(x_p, x_q)
• To make predictions y* at test points X*, where y* = f(X*) + ε*:
[f; f*] ~ N(0, [K(X, X), K(X, X*); K(X*, X), K(X*, X*)]),  [ε; ε*] ~ N(0, [σ²I, 0; 0, σ²I])
• It follows that p(y* | D, X*) = N(μ, Σ), where
μ = K(X*, X) [K(X, X) + σ²I]⁻¹ y
Σ = K(X*, X*) − K(X*, X) [K(X, X) + σ²I]⁻¹ K(X, X*)

The following figures show 24-hour predictions.
Sparse Pseudo-input GP (SPGP)
• GPs are prohibitive for large data sets due to the inversion of the covariance matrix.
• Consider a model parameterized by a pseudo data set D̄ of size m ≪ n, where n is the number of real data points.
• This reduces the training cost from O(n³) to O(m²n), and the prediction cost from O(n²) to O(m²).
Pseudo data set D̄: X̄ = {x̄ᵢ}, f̄ = {f̄ᵢ}, i = 1…m
SPGP regression:
Prior on pseudo targets: p(f̄ | X̄) = N(0, K_M)
Likelihood: p(y | x, X̄, f̄) = N(K_xᵀ K_M⁻¹ f̄, K_xx − K_xᵀ K_M⁻¹ K_x + σ²)
Posterior distribution over f̄:
p(f̄ | D, X̄) = N(K_M Q_M⁻¹ K_MN (Λ + σ²I)⁻¹ y, K_M Q_M⁻¹ K_M)
where Q_M = K_M + K_MN (Λ + σ²I)⁻¹ K_NM
Given a new input x*, the predictive distribution is
p(y* | D, x*) = ∫ df̄ p(y* | x*, X̄, f̄) p(f̄ | D, X̄) = N(μ*, Σ*)
where μ* = K*ᵀ Q_M⁻¹ K_MN (Λ + σ²I)⁻¹ y
Σ* = K** − K*ᵀ (K_M⁻¹ − Q_M⁻¹) K* + σ²

Predicting error (24-hour prediction):
➢ LR: 191.5258
➢ SVR: 93.2988
➢ GP: 90.2835
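The SPGP predictive mean above can be sketched directly in NumPy. As a sanity check only, the pseudo-inputs below are placed at all the training inputs, in which case Λ vanishes and the SPGP mean reduces to the full GP mean; a real SPGP run would use m ≪ n pseudo-inputs. The kernel and data are illustrative stand-ins:

```python
import numpy as np

def rbf(A, B, ell=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

rng = np.random.default_rng(4)
n, sigma2, jitter = 50, 1e-2, 1e-6
X = np.linspace(0, 5, n)[:, None]
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
Xs = np.linspace(0, 5, 25)[:, None]
Xbar = X.copy()                               # pseudo-inputs (here: all inputs)

KM = rbf(Xbar, Xbar) + jitter * np.eye(n)     # K_M (jitter for stability)
KNM = rbf(X, Xbar)                            # K_NM
V = np.linalg.solve(KM, KNM.T)                # K_M^{-1} K_MN
Lam = np.diag(rbf(X, X)) - np.einsum('ij,ji->i', KNM, V)  # Lambda (diagonal)
a = 1.0 / (Lam + sigma2)                      # (Lambda + sigma^2 I)^{-1} as a vector
QM = KM + (KNM.T * a) @ KNM                   # Q_M = K_M + K_MN (Lam + s2 I)^{-1} K_NM
Kstar = rbf(Xbar, Xs)                         # K_*
mu_spgp = Kstar.T @ np.linalg.solve(QM, KNM.T @ (a * y))  # predictive mean mu*

# Full GP mean for comparison
mu_full = rbf(Xs, X) @ np.linalg.solve(rbf(X, X) + sigma2 * np.eye(n), y)
print(float(np.max(np.abs(mu_spgp - mu_full))))
```

All the matrices inverted here are m×m (or diagonal), which is where the O(m²n) training cost comes from.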
Conclusions
• Using machine learning to automatically model the function that predicts solar generation from weather forecasts leads to acceptable results
• Gaussian processes achieved the lowest error among all the methods