Air Pollution Forecasting
Presented by Sunil Ojha
Air Pollution Research Group

Why we need a Forecast
• To meet the public's need for information
• To further reduce and prevent exposure to air pollutants
• To alert authorities, industry and the public to take measures for emission reduction
• To increase public support for structural measures

Forecasting Methods
• Qualitative Methods
• Quantitative Methods

Qualitative Methods
• Use the opinions of experts to subjectively predict future events
– Such methods are often used when historical data are scarce or not available at all
– These methods are often used to predict changes in the pattern of historical data

Quantitative Methods
• Involve the analysis of historical data to predict future values of a variable of interest
• Methods fall into two categories
– Univariate models
– Causal models

Quantitative Methods
Univariate models
– Predict future values of the variable of interest solely on the basis of the historical pattern of that variable
Causal models
– Predict future values of the variable of interest based on the relationship between that variable and other variables

Air Pollution Forecasting
• Understand the nature of the pollutant by determining
– How it forms
– When it forms
– How weather affects it

Statistical Modeling
• Involves determining the functional relationship between the input and output variables in a system
• The most common statistical method is linear regression, which has the general form: $Y = a + bx$

Statistical Modeling
Multiple Regression
– Used when there is more than one predictor (independent variable)
– Takes the general form $Y = a + bx_1 + cx_2 + dx_3$

Regression Analysis
• Fits a trend line using the least squares approach
• Goodness of fit of this line is summarized by the coefficient of determination ($R^2$)
• The higher the value of $R^2$, the better the fit

Establishing independent variables
• Used for identifying the significant variables
• A correlation matrix is formed
• High correlation between the independent variables should be avoided
• Significant variables are those having a high correlation with the dependent variable

Neural Network flowchart
[Figure: general structure of the feed-forward neural network, showing the input layer, hidden layer, and output]

Neural Network Modeling
Four-step procedure:
1. The data are divided into two subsets
– one for training the network
– one for validating and testing the network
2. The input data are then scaled from 0 to 1 (this is often done by the neural network program)

Neural Network Modeling
3. The architecture (or general layout) of the network is determined by:
– the number of input variables
– the number of hidden layers
– the number of neurons in each layer
– the type of network to be used

Neural Network Modeling
4.
The network is trained by determining the weights of the neuron inputs

The training process
• The process of estimating the optimum weights for the links is known as the ‘training’ or ‘learning’ process

Back Propagation Method
• Random values are assigned to the weights to start with
• Inputs are then propagated forward until they reach the output layer
• The error at the output is then used to correct the weights on the neurons that contributed most to the error

Summary of the training process
1. Select input variables
2. Select network architecture
3. Initialize weights
4. Apply inputs to network
5. Measure error
6. Back-propagate errors and adjust weights
7. Repeat steps 4–6 a large number of times until the network converges

Development of a neural network forecasting model for predicting ozone concentrations

Objective
To develop a neural network forecasting model for predicting ozone concentrations

Literature Review
• Surface temperature data have been used as a surrogate variable by various modelers
• Most of the models developed have not been tested on independent data sets
• Artificial neural network models perform better than statistical models when extreme values exist

Method
The time series of ozone concentration can be partitioned as:
$X(t) = e(t) + S(t) + W(t)$
where:
X(t) is the original time series, i.e., ozone concentration
e(t) is the long-term trend component
S(t) is the true seasonal variation
W(t) is the short-term variation

Method
• Ozone concentration fluctuates due to:
– variation in meteorological conditions
– changes in emissions of ozone precursor chemicals
• These two phenomena should be separated to get better insight into how ozone concentrations change with time

Method
Kolmogorov-Zurbenko ($KZ_{m,p}$) filter
– A low-pass filter produced by repeated iterations of a simple moving average (window length m, p iterations)
– Used to separate the deterministic portions (e and S) from the short-term variations
– The user determines the final low-pass filter

Method
Each iteration of the moving average is defined by
$$Y_i = \frac{1}{m} \sum_{j=-k}^{k} X_{i+j}$$
where $m = 2k + 1$

Data
• Hourly observations of ozone concentration were obtained from the Ohio EPA
• The monitor at Cincinnati has been considered
• Hourly observations of meteorological variables were obtained from NOAA

Model development
• Three ANN models have been developed
– Filtered LN(Oz) vs. filtered Temp data
– Filtered Temperature data vs. actual temp data
– Filtered LN(Oz) data vs. actual LN(Oz)
• A feed-forward network with the back-propagation algorithm has been used

Model Development
• Hourly observations from 8 am to 9 pm considered
• 1995-1997 data used for training the network
• 1998 data used for validating
• Model is tested using 1999 data

Correlation factors for different filters
[Figure: correlation factor versus iteration number; data points labeled (0,0), (200,1), (275,1), (365,1), (365,2), (365,3), (400,1), (400,2), (400,3), (400,4)]

Conclusions
• The model developed using filtered data gives better predictions
• The proposed model can be used only for the specific monitor
• The proposed model is a better predictor of hourly peak values
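
The Kolmogorov-Zurbenko filter from the Method slides (a simple moving average of window m = 2k + 1, repeated p times) can be illustrated with a short Python sketch. This is a minimal illustration, not the code used in the study: the function names `moving_average` and `kz_filter`, the synthetic series, and the values m = 15 and p = 3 are assumptions chosen for the example, and the window is simply truncated at the ends of the series because the slides do not say how edges were handled.

```python
import math
import random


def moving_average(x, k):
    """One pass of a centered moving average with window m = 2k + 1.

    Assumption: near the ends of the series the window is truncated
    to the samples that exist (edge handling is not given in the slides).
    """
    n = len(x)
    y = []
    for i in range(n):
        window = x[max(0, i - k):min(n, i + k + 1)]
        y.append(sum(window) / len(window))
    return y


def kz_filter(x, m, p):
    """Apply the moving average of odd window length m, p times in a row."""
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer (m = 2k + 1)")
    k = (m - 1) // 2
    y = [float(v) for v in x]
    for _ in range(p):
        y = moving_average(y, k)
    return y


# Synthetic stand-in for an ozone series: a slow seasonal cycle plus noise.
random.seed(0)
series = [
    50 + 10 * math.sin(2 * math.pi * t / 365) + random.gauss(0, 5)
    for t in range(1000)
]

# Illustrative parameters only; the study compared several (m, p) choices.
smooth = kz_filter(series, m=15, p=3)                  # estimate of e(t) + S(t)
short_term = [s - f for s, f in zip(series, smooth)]   # estimate of W(t)
```

Subtracting the filtered series from the original leaves an estimate of the short-term component W(t), which is the separation the Method slides describe.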