Predicting Solar Generation from Weather Forecasts
Advisor: Professor Arye Nehorai
Chenlin Wu, Yuhan Lou
Department of Electrical and Systems Engineering

Background
- Smart grid: increasing the contribution of renewable energy in the grid
- Solar generation: intermittent and non-dispatchable

Goals
- Creating automatic prediction models
- Predicting future solar power intensity given weather forecasts

Data Source
- NREL National Solar Radiation Database, 1991-2010
- Hourly weather and solar intensity data for 20 years
- Station: ST LOUIS LAMBERT INT'L ARPT, MO
- Input (combination of 9 weather metrics): Date, Time, Opaque Sky Cover, Dry-bulb Temperature, Dew-point Temperature, Relative Humidity, Station Pressure, Wind Speed, Liquid Precipitation Depth
- Output: amount of solar radiation (Wh/m^2) received in a collimated beam on a surface normal to the sun

Methods
- In our research, regression is used to learn a mapping from an input space of multi-dimensional weather vectors to an output space of real-valued solar intensity targets
- We apply different regression methods, including:
  - Linear least squares regression
  - Support vector regression (SVR) using multiple kernel functions
  - Gaussian processes

Linear Model
$$y = f(X) = X\beta + \varepsilon$$
where
- $y \in \mathbb{R}^{n}$: measurements (solar intensity)
- $X \in \mathbb{R}^{n \times (p+1)}$: each row is a $p$-dimensional input (plus an intercept column)
- $\beta \in \mathbb{R}^{p+1}$: unknown coefficients
- $\varepsilon \in \mathbb{R}^{n}$: random noise
Loss function (squared error): $\|y - \hat{y}\|^2 = \|y - X\beta\|^2$

Support Vector Regression (SVR)
Given training data $\{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$, the linear SVR model is
$$f(x) = \langle w, x \rangle + b,$$
obtained by solving
$$\min_{w,\,b,\,\xi,\,\xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
$$\text{subject to} \quad y_i - f(x_i) \le \epsilon + \xi_i, \quad f(x_i) - y_i \le \epsilon + \xi_i^*, \quad \xi_i,\, \xi_i^* \ge 0$$
Loss function ($\epsilon$-insensitive):
$$|\xi|_\epsilon = \begin{cases} 0 & \text{if } |\xi| \le \epsilon \\ |\xi| - \epsilon & \text{otherwise} \end{cases}$$

Kernel Trick for SVR
The kernel trick is a way of mapping observations from a general set S (input space) into an inner product space V (high-dimensional feature space), $\Phi: S \to V$. The weight vector and the prediction then become
$$w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\,\Phi(x_i)$$
$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\,k(x_i, x) + b, \qquad \text{where } k(x_i, x) = \langle \Phi(x_i), \Phi(x) \rangle.$$

Gaussian Processes (GP)
GP regression model:
$$y_i = f(x_i) + \varepsilon_i, \qquad \text{where noise } \varepsilon_i \sim \mathcal{N}(0, \sigma^2)$$
- Assume a zero-mean GP prior distribution over inference functions $f(\cdot)$. In particular,
$$\big(f(x_1), \dots, f(x_n)\big) \sim \mathcal{N}(0, K), \qquad K_{ij} = \operatorname{Cov}\big(f(x_i), f(x_j)\big) = K(x_i, x_j)$$
- To make predictions $y_*$ at test points $X_*$, where $y_* = f(X_*) + \varepsilon_*$, the joint prior is
$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\!\left(0,\; \begin{bmatrix} K(X, X) + \sigma^2 I & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right)$$
It follows that $p(y_* \mid D, X_*) = \mathcal{N}(\mu, \Sigma)$, where
$$\mu = K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1} y$$
$$\Sigma = K(X_*, X_*) - K(X_*, X)\,\big[K(X, X) + \sigma^2 I\big]^{-1} K(X, X_*)$$

Principal Component Analysis (PCA)
- Some weather metrics correlate strongly, such as temperature and time of the day
- We apply PCA to remove this redundant information
- [Figure: test MSE versus number of input dimensions after PCA.] The feature set with 8 dimensions performs best, with the lowest test error, and as long as more than 5 principal components are kept the errors stay below those of linear regression

Experiments
- Predictions are made with the proposed methods
- 20% of the data is used for training and 10% for testing
- MSE is used to evaluate the regression results

Prediction errors (MSE) of the three methods:

Method              MSE
Linear regression   215.7884
SVR                 130.1537
GP                  122.9167
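To make the modeling and evaluation pipeline above concrete, the sketch below fits the three regression models with scikit-learn, adds an optional PCA step as in the dimensionality study, and scores everything with MSE. It is an illustrative sketch, not the code behind this poster: the arrays `X` and `y` are random placeholders for the 9 weather metrics and the measured solar intensity, and the hyper-parameters are assumed values.

```python
# Illustrative sketch only: a hedged stand-in for the LR / SVR / GP comparison
# described on this poster. Data and hyper-parameters are placeholders, not the
# values used in the actual experiments.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 9))   # placeholder for the 9 weather metrics
y = rng.normal(size=1000)        # placeholder for solar intensity (Wh/m^2)

# 20% of the data for training, 10% for testing, as in the experiments above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.2, test_size=0.1, random_state=0)

models = {
    "Linear regression": make_pipeline(StandardScaler(), LinearRegression()),
    "SVR (RBF kernel)": make_pipeline(
        StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1)),
    "Gaussian process": make_pipeline(
        StandardScaler(),
        GaussianProcessRegressor(kernel=RBF() + WhiteKernel())),
    # PCA-reduced SVR, keeping 8 principal components as in the PCA study.
    "SVR + PCA(8)": make_pipeline(
        StandardScaler(), PCA(n_components=8),
        SVR(kernel="rbf", C=10.0, epsilon=0.1)),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {mse:.4f}")
```

With the actual NREL features and solar intensity targets substituted for the random placeholders, the printed test MSEs would play the role of the error values reported in the table above.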
Sparse Pseudo-input GP (SPGP)
- GPs are prohibitive for large data sets because training requires inverting the $n \times n$ covariance matrix
- Consider a model parameterized by a pseudo data set $\bar{D}$ of size $m \ll n$, where $n$ is the number of real data points
- This reduces the training cost from $O(n^3)$ to $O(m^2 n)$ and the prediction cost from $O(n^2)$ to $O(m^2)$
- Pseudo data set $\bar{D}$: $\bar{X} = \{\bar{x}_m\}_{m=1,\dots,M}$, $\bar{f} = \{\bar{f}_m\}_{m=1,\dots,M}$

SPGP regression
Prior on pseudo targets:
$$p(\bar{f} \mid \bar{X}) = \mathcal{N}(0, K_M)$$
Likelihood:
$$p(y \mid X, \bar{X}, \bar{f}) = \mathcal{N}\big(K_{NM} K_M^{-1} \bar{f},\; \Lambda + \sigma^2 I\big), \qquad \Lambda = \operatorname{diag}\big(K_N - K_{NM} K_M^{-1} K_{MN}\big)$$
Posterior distribution over $\bar{f}$:
$$p(\bar{f} \mid D, \bar{X}) = \mathcal{N}\big(K_M Q_M^{-1} K_{MN} (\Lambda + \sigma^2 I)^{-1} y,\; K_M Q_M^{-1} K_M\big),$$
where $Q_M = K_M + K_{MN} (\Lambda + \sigma^2 I)^{-1} K_{NM}$.
Given a new input $x_*$, the predictive distribution is
$$p(y_* \mid x_*, D, \bar{X}) = \int p(y_* \mid x_*, \bar{X}, \bar{f})\, p(\bar{f} \mid D, \bar{X})\, d\bar{f} = \mathcal{N}(\mu_*, \Sigma_*),$$
where
$$\mu_* = K_*^{T} Q_M^{-1} K_{MN} (\Lambda + \sigma^2 I)^{-1} y$$
$$\Sigma_* = K_{**} - K_*^{T} \big(K_M^{-1} - Q_M^{-1}\big) K_* + \sigma^2$$

24-hour prediction:
[Figure: 24-hour prediction of solar intensity by linear regression, SVM regression, and SPGP.]

Predicting error (MSE):

                    LR          SVR         GP
Predicting error    191.5258    93.2988     90.2835

Conclusions
- Using machine learning to automatically model the mapping from weather forecasts to solar generation leads to acceptable results
- Gaussian processes achieved the lowest error among all the methods
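To show how the GP predictive equations on this poster translate into computation, the short NumPy sketch below evaluates the predictive mean and covariance. It is a minimal illustration under assumed choices (a squared-exponential kernel, a placeholder noise level, and random stand-in data), not the authors' implementation; an SPGP version would replace the full $K(X, X)$ with the low-rank pseudo-input approximation described above.

```python
# Minimal sketch of the GP predictive equations:
#   mu    = K(X*, X) [K(X, X) + sigma^2 I]^{-1} y
#   Sigma = K(X*, X*) - K(X*, X) [K(X, X) + sigma^2 I]^{-1} K(X, X*)
# Kernel choice, hyper-parameters, and data below are assumptions.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance matrix K(A, B)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dists / length_scale**2)

def gp_predict(X, y, X_star, noise_var=0.1):
    """Predictive mean and covariance of f* at the test inputs X_star."""
    K = rbf_kernel(X, X)                                  # K(X, X)
    K_s = rbf_kernel(X, X_star)                           # K(X, X*)
    K_ss = rbf_kernel(X_star, X_star)                     # K(X*, X*)
    L = np.linalg.cholesky(K + noise_var * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # [K + s^2 I]^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)                           # L^{-1} K(X, X*)
    cov = K_ss - v.T @ v
    return mean, cov

# Toy usage with random stand-ins for the 9 weather metrics and solar intensity.
rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(50, 9)), rng.normal(size=50)
X_test = rng.normal(size=(5, 9))
mu, Sigma = gp_predict(X_train, y_train, X_test)
print(mu.shape, Sigma.shape)   # (5,) (5, 5)
```

The Cholesky factorization is a standard, numerically stable way to apply $[K + \sigma^2 I]^{-1}$; for a large hourly data set it is exactly this $O(n^3)$ step that motivates the SPGP approximation.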