Statistical Tools for Solar Resource Forecasting
Vivek Vijay
IIT Jodhpur
Date: 16/12/2013

Outline
• Solar Resource Assessment
• Types of Data
• Regression Analysis – Modeling of Cross-Sectional Data
• Statistical Tests
• Dimensionality Reduction
• Time Series Forecasting
• Learning Algorithm - ANN

Solar Resource Assessment
Solar Resource Assessment (SRA) is a characterization of the solar irradiance available for energy conversion for a region or specific location over a historical time period of interest. Forecasting solar irradiance is an important first step toward predicting the performance of a solar-energy conversion system and ensuring stable operation of the electricity grid. PV plants are fairly linear in their conversion of solar power to electricity; that is, their overall conversion efficiency during operation typically changes by less than 20%. On the other hand, assessment of CSP production is more challenging due to the non-linear nature of the thermodynamic parameters.

Types of Data
• Cross-Sectional Data – multiple individuals at the same point in time
• Time Series Data – a single individual at multiple points in time
• Panel or Longitudinal Data – multiple individuals at multiple time periods

Regression Analysis
Problem – Estimation of global solar radiation from meteorological parameters (air temperature, relative humidity, etc.) and sunshine duration.
Angstrom-Prescott Model – a linear regression model relating the monthly average daily radiation at a particular location (𝐻) to the monthly average daily sunshine hours (𝑆):

𝐻/𝐻0 = 𝑎 + 𝑏 · (𝑆/𝑆𝑚𝑎𝑥)

𝐻0 (extraterrestrial radiation) and 𝑆𝑚𝑎𝑥 (maximum possible sunshine duration) can be obtained from other parameters.

Statistical Tests
• The accuracy of the estimated models must be judged by statistical indicators, such as
  • Correlation Coefficient
  • Mean Bias Error
  • Root Mean Square Error
  • Percentage Error
  • Coefficient of Determination

Dimensionality Reduction
The dimension of the data is the number of variables that are measured on each observation. When the dataset is high-dimensional, not all the measured variables are "important", and the analysis becomes computationally expensive. The removal of this "irrelevant" information is dimensionality reduction.
Given the 𝑝-dimensional random vector 𝑋 = (𝑥1, 𝑥2, …, 𝑥𝑝), the problem is to find a lower-dimensional representation of it, 𝑆 = (𝑠1, 𝑠2, …, 𝑠𝑘) with 𝑘 ≤ 𝑝, that captures the information in the original data according to some criterion.

Dimensionality Reduction
The techniques of dimensionality reduction are mainly classified into
(a) Linear (PCA, Factor Analysis, etc.)
(b) Non-linear (Kernel PCA, MDS, Isomap, etc.)
Linear techniques result in each of the 𝑘 ≤ 𝑝 components of the new variable being a linear combination of the original variables:
𝑠𝑖 = 𝑤𝑖,1 𝑥1 + ⋯ + 𝑤𝑖,𝑝 𝑥𝑝 ,  𝑖 = 1, …, 𝑘

Time Series Forecasting
• Linear Time Series Models (under stationarity)
  • Simple Autoregressive (AR) Models
  • Simple Moving Average (MA) Models
  • Mixed ARMA Models
  • Seasonal Models
• AR(1) model
  𝑥𝑡 = 𝜑0 + 𝜑1 𝑥𝑡−1 + 𝑎𝑡
  where {𝑎𝑡} is assumed to be a white noise series with mean zero and constant variance.

Some Measures
• Order Determination of AR
  • Partial Autocorrelation Function
  • AIC or BIC
• Parameter Estimation – Any AR(p) model is similar to a multiple regression model, so the least squares method can be used to estimate the parameters (see the sketch after this list).
• Goodness of Fit
  𝑅2 = 1 − 𝑅𝑆𝑆/𝑇𝑆𝑆
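As a brief illustration of the parameter-estimation and goodness-of-fit points above: a minimal sketch, assuming a synthetic series rather than data from the talk, of fitting the AR(1) model by least squares as an ordinary regression of 𝑥𝑡 on 𝑥𝑡−1, and of computing 𝑅2 = 1 − 𝑅𝑆𝑆/𝑇𝑆𝑆. All variable names and numeric settings below are illustrative assumptions.

```python
import numpy as np

# Synthetic AR(1)-like series, used purely for illustration (assumed, not from the talk).
rng = np.random.default_rng(0)
x = np.empty(500)
x[0] = 5.0
for t in range(1, x.size):
    x[t] = 2.0 + 0.6 * x[t - 1] + rng.normal(scale=0.5)   # "true" phi0 = 2.0, phi1 = 0.6

# AR(1) as a regression of x_t on (1, x_{t-1}): least-squares estimates of phi0, phi1.
y = x[1:]                                        # responses x_t
X = np.column_stack([np.ones(y.size), x[:-1]])   # design matrix: intercept and x_{t-1}
(phi0_hat, phi1_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

# Goodness of fit, R^2 = 1 - RSS/TSS, as on the "Some Measures" slide.
residuals = y - X @ np.array([phi0_hat, phi1_hat])
rss = np.sum(residuals ** 2)
tss = np.sum((y - y.mean()) ** 2)
print(f"phi0_hat = {phi0_hat:.3f}, phi1_hat = {phi1_hat:.3f}, R^2 = {1 - rss / tss:.3f}")
```

The same kind of fit, together with standard errors and the PACF/AIC/BIC diagnostics mentioned above for choosing the order p, is also available from statsmodels (e.g. its AutoReg model), if that dependency is acceptable.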
A Learning Algorithm - ANN
• Artificial Neural Networks – When the data is non-linear in nature, an ANN is a good methodology for forecasting. The gradient descent algorithm can be used to update the weights (a minimal sketch is given below).
• Issues
  • How many hidden neurons?
  • How many hidden layers?
  • Overestimation

Thank You
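Referring back to the ANN slide and its list of issues: below is a minimal sketch, under assumed settings, of a one-hidden-layer network trained by full-batch gradient descent for one-step-ahead forecasting, with a chronological hold-out split used to compare candidate hidden-layer sizes. The synthetic series, lag length, layer sizes, learning rate, and iteration count are all illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed, synthetic non-linear series standing in for measured irradiance.
t = np.arange(600, dtype=float)
series = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)

# One-step-ahead setup: predict series[j + lags] from the previous `lags` values.
lags = 5
n = series.size - lags
X = np.array([series[j:j + lags] for j in range(n)])
y = series[lags:]

# Chronological train/validation split; the validation error is one way to
# address "how many hidden neurons?" and to watch for over-fitting.
split = int(0.8 * n)
X_tr, y_tr, X_va, y_va = X[:split], y[:split], X[split:], y[split:]


def train_mlp(Xtr, ytr, hidden, lr=0.05, iters=3000, seed=0):
    """Fit a one-hidden-layer network (tanh units, linear output) by
    full-batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.normal(size=(Xtr.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = 0.1 * rng.normal(size=hidden)
    b2 = 0.0
    m = len(ytr)
    for _ in range(iters):
        A = np.tanh(Xtr @ W1 + b1)          # hidden activations
        pred = A @ W2 + b2                  # linear output
        err = pred - ytr
        # Backpropagated gradients of the MSE loss.
        g_out = 2.0 * err / m
        gW2 = A.T @ g_out
        gb2 = g_out.sum()
        gA = np.outer(g_out, W2) * (1.0 - A ** 2)
        gW1 = Xtr.T @ gA
        gb1 = gA.sum(axis=0)
        # Gradient-descent weight updates.
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2


def predict(params, Xnew):
    W1, b1, W2, b2 = params
    return np.tanh(Xnew @ W1 + b1) @ W2 + b2


# Compare a few candidate hidden-layer sizes on the validation RMSE.
for hidden in (2, 8, 32):
    params = train_mlp(X_tr, y_tr, hidden)
    rmse = np.sqrt(np.mean((predict(params, X_va) - y_va) ** 2))
    print(f"hidden = {hidden:2d}  validation RMSE = {rmse:.4f}")
```

In practice the validation (or cross-validation) error would guide the choice of the number of hidden neurons and layers, and early stopping or regularization would typically be used to limit over-fitting.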