MAR 5621 Advanced Statistical Techniques
Summer 2003
Dr. Larry Winner

Chapter 11 – Simple Linear Regression

Types of Regression Models (Sec. 11-1)

Linear Regression:  Yi = β0 + β1·Xi + εi

Yi – Outcome of the dependent variable (response) for the ith experimental/sampling unit
Xi – Level of the independent (predictor) variable for the ith experimental/sampling unit
β0 + β1·Xi – Linear (systematic) relation between Yi and Xi (aka the conditional mean)
β0 – Mean of Y when X = 0 (Y-intercept)
β1 – Change in the mean of Y when X increases by 1 (slope)
εi – Random error term

Note that β0 and β1 are unknown parameters. We estimate them by the least squares method.

Polynomial (Nonlinear) Regression:  Yi = β0 + β1·Xi + β2·Xi² + εi

This model allows for a curvilinear (as opposed to straight-line) relation. Both linear and polynomial regression are susceptible to problems when predictions of Y are made outside the range of the X values used to fit the model. This is referred to as extrapolation.

Least Squares Estimation (Sec. 11-2)

1. Obtain a sample of n pairs (X1,Y1),…,(Xn,Yn).
2. Plot the Y values on the vertical (up/down) axis versus their corresponding X values on the horizontal (left/right) axis.
3. Choose the line Ŷi = b0 + b1·Xi that minimizes the sum of squared vertical distances from the observed values (Yi) to their fitted values (Ŷi). Note: SSE = Σ(Yi − Ŷi)², summed over i = 1,…,n.
4. b0 is the Y-intercept of the estimated regression equation.
5. b1 is the slope of the estimated regression equation.

Measures of Variation (Sec. 11-3)

Sums of Squares

Total sum of squares = Regression sum of squares + Error sum of squares
Total variation = Explained variation + Unexplained variation

Total sum of squares (Total Variation):        SST = Σ(Yi − Ȳ)²    dfT = n − 1
Regression sum of squares (Explained Variation):  SSR = Σ(Ŷi − Ȳ)²    dfR = 1
Error sum of squares (Unexplained Variation):     SSE = Σ(Yi − Ŷi)²   dfE = n − 2

Coefficients of Determination and Correlation

Coefficient of Determination: proportion of variation in Y "explained" by the regression on X.

  r² = explained variation / total variation = SSR/SST    0 ≤ r² ≤ 1

Coefficient of Correlation: measure of the direction and strength of the linear association between Y and X.

  r = sign(b1)·√(r²)    −1 ≤ r ≤ 1

Standard Error of the Estimate (Residual Standard Deviation)

Estimated standard deviation of the data around the regression line (estimates σ, where V(Yi) = V(εi) = σ²):

  SYX = √(SSE/(n − 2)) = √( Σ(Yi − Ŷi)² / (n − 2) )

Model Assumptions (Sec. 11-4)

Normally distributed errors
Homoscedasticity (constant error variance for Y at all levels of X)
Independent errors (usually checked when data are collected over time or space)

Residual Analysis (Sec. 11-5)

Residuals:  ei = Yi − Ŷi = Yi − (b0 + b1·Xi)

Plots (see prototype plots in book and in class):

Plot of ei vs Ŷi: can be used to check for a linear relation and constant variance.
  If the relation is nonlinear, a U-shaped pattern appears.
  If the error variance is non-constant, a funnel-shaped pattern appears.
  If the assumptions are met, a random cloud of points appears.

Plot of ei vs Xi: can be used to check for a linear relation and constant variance.
  If the relation is nonlinear, a U-shaped pattern appears.
  If the error variance is non-constant, a funnel-shaped pattern appears.
  If the assumptions are met, a random cloud of points appears.

Plot of ei vs i (see next section): can be used to check for independence when data are collected over time.

Histogram of ei: if the error distribution is normal, the histogram of residuals will be mound-shaped, centered around 0.
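Below is a minimal Python sketch of the computations in Secs. 11-2 through 11-5: the least squares slope and intercept, the sums of squares, r², SYX, and the residuals that would feed the plots above. The x and y values are made up purely so the code runs; any paired data could be substituted.

  # Least squares fit for simple linear regression on a small, hypothetical data set.
  import numpy as np

  x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
  y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
  n = len(y)

  # Slope and intercept that minimize SSE = sum of (y_i - yhat_i)^2
  b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
  b0 = y.mean() - b1 * x.mean()
  yhat = b0 + b1 * x
  e = y - yhat                        # residuals, for the plots of Sec. 11-5

  SST = np.sum((y - y.mean()) ** 2)   # total variation (df = n - 1)
  SSE = np.sum(e ** 2)                # unexplained variation (df = n - 2)
  SSR = SST - SSE                     # explained variation (df = 1)

  r2 = SSR / SST                      # coefficient of determination
  r = np.sign(b1) * np.sqrt(r2)       # coefficient of correlation
  s_yx = np.sqrt(SSE / (n - 2))       # standard error of the estimate

  print(b0, b1, r2, r, s_yx)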
Measuring Autocorrelation – The Durbin-Watson Test (Sec. 11-6)

Plot the residuals versus time order.
If the errors are independent, there will be no pattern (a random cloud centered at 0).
If they are not independent (dependent), expect errors that are close together in time to be similar (a distinct curved form, centered at 0).

Durbin-Watson Test

H0: Errors are independent (no autocorrelation among residuals)
HA: Errors are dependent (positive autocorrelation among residuals)

Test Statistic:  D = Σi=2..n (ei − e_{i-1})² / Σi=1..n ei²

Decision Rule (values of dL(k,n) and dU(k,n) are given in Table E.10, where k is the number of independent variables; k = 1 for simple regression):

If D ≥ dU(k,n), conclude H0 (independent errors).
If D ≤ dL(k,n), conclude HA (dependent errors).
If dL(k,n) < D < dU(k,n), withhold judgment (possibly need a longer series).

NOTE: This is a test that you "want" to conclude in favor of the null hypothesis. If you reject the null hypothesis in this test, a more complex model needs to be fit (this will be discussed in Chapter 13).

Inferences Concerning the Slope (Sec. 11-7)

t-test

Used to determine whether the population slope parameter (β1) is equal to a pre-determined value (often, but not necessarily, 0). Tests can be one-sided (pre-determined direction) or two-sided (either direction).

2-sided t-test:

  H0: β1 = β10    HA: β1 ≠ β10
  TS:  tobs = (b1 − β10)/Sb1    where  Sb1 = SYX / √( Σ(Xi − X̄)² )
  RR:  |tobs| ≥ t(α/2, n−2)

1-sided t-test (upper tail; reverse signs for the lower tail):

  H0: β1 ≤ β10    HA: β1 > β10
  TS:  tobs = (b1 − β10)/Sb1
  RR:  tobs ≥ t(α, n−2)

F-test (based on k independent variables)

A test based directly on the sums of squares that tests the specific hypothesis of whether the slope parameter is 0 (2-sided). The book describes the general case of k predictor variables; for simple linear regression, k = 1.

  H0: β1 = 0    HA: β1 ≠ 0
  TS:  Fobs = MSR/MSE = (SSR/k) / (SSE/(n − k − 1))
  RR:  Fobs ≥ F(α, k, n−k−1)

Analysis of Variance (based on k predictor variables)

Source      df        Sum of Squares   Mean Square         F
Regression  k         SSR              MSR = SSR/k         Fobs = MSR/MSE
Error       n−k−1     SSE              MSE = SSE/(n−k−1)
Total       n−1       SST

(1−α)100% Confidence Interval for the slope parameter β1:  b1 ± t(α/2, n−2)·Sb1

If the entire interval is positive, conclude β1 > 0 (positive association).
If the interval contains 0, conclude (do not reject) β1 = 0 (no association).
If the entire interval is negative, conclude β1 < 0 (negative association).

Estimation of the Mean and Prediction of Individual Outcomes (Sec. 11-8)

Estimating the Population Mean Response when X = Xi

  Parameter:  E[Y | X = Xi] = β0 + β1·Xi
  Point Estimate:  Ŷi = b0 + b1·Xi

(1−α)100% Confidence Interval for E[Y | X = Xi]:

  Ŷi ± t(α/2, n−2)·SYX·√hi    where  hi = 1/n + (Xi − X̄)² / Σ(Xi − X̄)²

Predicting a Future Outcome when X = Xi

  Future Outcome:  Yi = β0 + β1·Xi + εi
  Point Prediction:  Ŷi = b0 + b1·Xi

(1−α)100% Prediction Interval for Yi:

  Ŷi ± t(α/2, n−2)·SYX·√(1 + hi)    where  hi = 1/n + (Xi − X̄)² / Σ(Xi − X̄)²

(A small numeric sketch of the slope test and these intervals appears after Sec. 11-9.)

Problem Areas in Regression Analysis (Sec. 11-9)

Awareness of the model assumptions
Knowing methods of checking the model assumptions
Alternative methods when the model assumptions are not met
Not understanding the theoretical background of the problem

The most important means of identifying problems with the model assumptions is plots of the data and residuals. Often transformations of the dependent and/or independent variables can solve the problems; other, more complex methods may need to be used. See Section 11-5 for plots used to check assumptions and Exhibit 11.3 on pages 555-556 of the text.
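The sketch referenced above carries out the two-sided slope t-test of Sec. 11-7 and computes the confidence interval for the mean response and the prediction interval for an individual outcome from Sec. 11-8. The data are again invented, x_new = 3.5 is simply an illustrative value inside the data range, and scipy is assumed to be available for the t quantiles.

  # Slope t-test plus confidence and prediction intervals at an arbitrary X value.
  import numpy as np
  from scipy import stats

  x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
  y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
  n = len(y)

  b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
  b0 = y.mean() - b1 * x.mean()
  e = y - (b0 + b1 * x)
  s_yx = np.sqrt(np.sum(e ** 2) / (n - 2))

  ssxx = np.sum((x - x.mean()) ** 2)
  s_b1 = s_yx / np.sqrt(ssxx)                     # standard error of the slope

  # Two-sided t-test of H0: beta1 = 0
  t_obs = (b1 - 0.0) / s_b1
  p_val = 2 * (1 - stats.t.cdf(abs(t_obs), df=n - 2))

  # 95% intervals at x_new: CI for the mean of Y, PI for an individual Y
  x_new = 3.5
  h = 1.0 / n + (x_new - x.mean()) ** 2 / ssxx
  t_crit = stats.t.ppf(0.975, df=n - 2)
  y_hat = b0 + b1 * x_new
  ci = (y_hat - t_crit * s_yx * np.sqrt(h), y_hat + t_crit * s_yx * np.sqrt(h))
  pi = (y_hat - t_crit * s_yx * np.sqrt(1 + h), y_hat + t_crit * s_yx * np.sqrt(1 + h))

  print(t_obs, p_val, ci, pi)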
Computational Formulas for Simple Linear Regression (Sec. 11-10)

Normal Equations (obtained by minimizing SSE with calculus):

  ΣYi = n·b0 + b1·ΣXi
  ΣXiYi = b0·ΣXi + b1·ΣXi²

Computational Formula for the Slope, b1:

  b1 = SSXY / SSXX
  SSXY = Σ(Xi − X̄)(Yi − Ȳ) = ΣXiYi − (ΣXi)(ΣYi)/n
  SSXX = Σ(Xi − X̄)² = ΣXi² − (ΣXi)²/n

Computational Formula for the Y-intercept, b0:

  b0 = Ȳ − b1·X̄

Computational Formula for the Total Sum of Squares, SST:

  SST = Σ(Yi − Ȳ)² = ΣYi² − (ΣYi)²/n

Computational Formula for the Regression Sum of Squares, SSR:

  SSR = Σ(Ŷi − Ȳ)² = (SSXY)²/SSXX = b0·ΣYi + b1·ΣXiYi − (ΣYi)²/n

Computational Formula for the Error Sum of Squares, SSE:

  SSE = Σ(Yi − Ŷi)² = SST − SSR = ΣYi² − b0·ΣYi − b1·ΣXiYi

Computational Formula for the Standard Error of the Slope, Sb1:

  Sb1 = SYX / √SSXX = SYX / √( Σ(Xi − X̄)² )

Chapter 12 – Multiple Linear Regression

Multiple Regression Model with k Independent Variables (Sec. 12-1)

General Case: We have k variables, which we control or know in advance of the outcome, that are used to predict Y, the response (dependent variable). The k independent variables are labeled X1, X2,…,Xk. The levels of these variables for the ith case are labeled X1i,…,Xki. Note that simple linear regression is the special case where k = 1, so the methods used are basic extensions of what we have previously done.

  Yi = β0 + β1·X1i + … + βk·Xki + εi

where βj is the change in the mean of Y when variable Xj increases by 1 unit, holding the k − 1 remaining independent variables constant (the partial regression coefficient). This is also referred to as the slope of Y with respect to Xj, holding the other predictors constant.

Least Squares Fitted (Prediction) Equation (obtained by minimizing SSE):

  Ŷi = b0 + b1·X1i + … + bk·Xki

Coefficient of Multiple Determination

Proportion of variation in Y "explained" by the regression on the k independent variables:

  R² = r²Y·1,…,k = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = SSR/SST = 1 − SSE/SST

Adjusted R²

Used to compare models with different sets of independent variables in terms of predictive capability; it penalizes models with unnecessary or redundant predictors.

  Adj-R² = r²adj = 1 − (1 − r²Y·1,…,k)·(n − 1)/(n − k − 1) = 1 − (SSE/(n − k − 1)) / (SST/(n − 1))

Residual Analysis for Multiple Regression (Sec. 12-2)

Very similar to the case for simple regression. The only difference is that the residuals are also plotted versus each independent variable, with similar interpretations of the plots. A review of the plots and interpretations:

Residuals:  ei = Yi − Ŷi = Yi − (b0 + b1·X1i + … + bk·Xki)

Plots (see prototype plots in book and in class):

Plot of ei vs Ŷi: can be used to check for a linear relation and constant variance.
  If the relation is nonlinear, a U-shaped pattern appears.
  If the error variance is non-constant, a funnel-shaped pattern appears.
  If the assumptions are met, a random cloud of points appears.

Plot of ei vs Xji for each j: can be used to check for a linear relation with respect to Xj.
  If the relation is nonlinear, a U-shaped pattern appears.
  If the assumptions are met, a random cloud of points appears.

Plot of ei vs i: can be used to check for independence when data are collected over time.
  If the errors are dependent, a smooth pattern will appear.
  If the errors are independent, a random cloud of points appears.

Histogram of ei: if the error distribution is normal, the histogram of residuals will be mound-shaped, centered around 0.

A small sketch of fitting a two-predictor model and computing these quantities follows.
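The sketch below fits a k = 2 predictor model by least squares and computes R² and adjusted R² from Sec. 12-1, along with the residuals used for the plots of Sec. 12-2. The predictors and response are simulated purely so the code runs end to end.

  # Multiple regression by least squares, with R-squared and adjusted R-squared.
  import numpy as np

  rng = np.random.default_rng(0)
  n, k = 25, 2
  x1, x2 = rng.normal(size=(2, n))
  y = 5.0 + 1.5 * x1 - 2.0 * x2 + rng.normal(size=n)

  X = np.column_stack([np.ones(n), x1, x2])    # design matrix with intercept column
  b, *_ = np.linalg.lstsq(X, y, rcond=None)    # b0, b1, b2
  yhat = X @ b
  e = y - yhat                                 # residuals for the Sec. 12-2 plots

  sst = np.sum((y - y.mean()) ** 2)
  sse = np.sum(e ** 2)
  r2 = 1 - sse / sst
  adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))

  print(b, r2, adj_r2)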
F-test for the Overall Model (Sec. 12-3)

Used to test whether any of the independent variables are linearly associated with Y.

Analysis of Variance

  Total sum of squares:       SST = Σ(Yi − Ȳ)²    dfT = n − 1
  Regression sum of squares:  SSR = Σ(Ŷi − Ȳ)²    dfR = k
  Error sum of squares:       SSE = Σ(Yi − Ŷi)²   dfE = n − k − 1

Source      df        SS     MS                  F
Regression  k         SSR    MSR = SSR/k         Fobs = MSR/MSE
Error       n−k−1     SSE    MSE = SSE/(n−k−1)
Total       n−1       SST

F-test for the Overall Model

H0: β1 = β2 = … = βk = 0 (Y is not linearly associated with any of the independent variables)
HA: Not all βj = 0 (at least one of the independent variables is associated with Y)

  TS:  Fobs = MSR/MSE
  RR:  Fobs ≥ F(α, k, n−k−1)
  P-value: area in the F-distribution to the right of Fobs

Inferences Concerning Individual Regression Coefficients (Sec. 12-4)

Used to test or estimate the slope of Y with respect to Xj, after controlling for all other predictor variables.

t-test for βj

  H0: βj = βj0 (often, but not necessarily, 0)    HA: βj ≠ βj0
  TS:  tobs = (bj − βj0)/Sbj
  RR:  |tobs| ≥ t(α/2, n−k−1)
  P-value: twice the area in the t-distribution to the right of |tobs|

(1−α)100% Confidence Interval for βj:  bj ± t(α/2, n−k−1)·Sbj

If the entire interval is positive, conclude βj > 0 (positive association).
If the interval contains 0, conclude (do not reject) βj = 0 (no association).
If the entire interval is negative, conclude βj < 0 (negative association).

Testing Portions of the Model (Sec. 12-5, but different notation)

Consider 2 blocks of independent variables:
  Block A containing k − q independent variables
  Block B containing q independent variables

We wish to test whether the set of independent variables in block B can be dropped from the model containing both blocks; that is, whether the variables in block B add no predictive value to the variables in block A.

Notation:
  SSR(A&B) is the regression sum of squares for the model containing both blocks
  SSR(A) is the regression sum of squares for the model containing block A only
  MSE(A&B) is the error mean square for the model containing both blocks
  k is the number of predictors in the model containing both blocks
  q is the number of predictors in block B

H0: The β's in block B are all 0, given that the variables in block A have been included in the model
HA: Not all β's in block B are 0, given that the variables in block A have been included in the model

  TS:  Fobs = [ (SSR(A&B) − SSR(A)) / q ] / MSE(A&B) = [ SSR(B|A) / q ] / MSE(A&B) = MSR(B|A) / MSE(A&B)
  RR:  Fobs ≥ F(α, q, n−k−1)
  P-value: area in the F-distribution to the right of Fobs

Coefficient of Partial Determination (notation differs from the text)

Measures the fraction of variation in Y explained by Xj that is not explained by the other independent variables.

  r²Y2·1 = [SSR(X1,X2) − SSR(X1)] / [SST − SSR(X1)] = SSR(X2|X1) / [SST − SSR(X1)]

  r²Y3·12 = [SSR(X1,X2,X3) − SSR(X1,X2)] / [SST − SSR(X1,X2)] = SSR(X3|X1,X2) / [SST − SSR(X1,X2)]

  r²Yk·12…(k−1) = [SSR(X1,…,Xk) − SSR(X1,…,X(k−1))] / [SST − SSR(X1,…,X(k−1))] = SSR(Xk|X1,…,X(k−1)) / [SST − SSR(X1,…,X(k−1))]

Note that since the labels are arbitrary, these can be constructed for any of the independent variables.

Quadratic Regression Models (Sec. 12-6)

When the relation between Y and X is not linear, it can often be approximated by a quadratic model.

  Population Model:  Yi = β0 + β1·X1i + β2·X1i² + εi
  Fitted Model:  Ŷi = b0 + b1·X1i + b2·X1i²

Testing for any association (F-test):  H0: β1 = β2 = 0    HA: β1 and/or β2 ≠ 0
Testing for a quadratic effect (t-test):  H0: β2 = 0    HA: β2 ≠ 0

A small worked sketch follows.
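Here is a hedged sketch combining Secs. 12-5 and 12-6: a quadratic model is fit, and the partial F-test asks whether the squared term (block B, with q = 1) adds predictive value beyond the linear term (block A). The simulated data are given a genuine curvilinear relation so the test has something to find; scipy is assumed for the F-distribution tail area.

  # Partial F-test for dropping the quadratic term from a quadratic regression.
  import numpy as np
  from scipy import stats

  rng = np.random.default_rng(0)
  n = 30
  x = rng.uniform(0, 10, size=n)
  y = 2.0 + 1.0 * x + 0.4 * x ** 2 + rng.normal(scale=2.0, size=n)

  def ssr_sse(cols, y):
      """Least squares fit on the given predictor columns; return (SSR, SSE)."""
      Z = np.column_stack([np.ones(len(y))] + cols)
      b, *_ = np.linalg.lstsq(Z, y, rcond=None)
      sse = np.sum((y - Z @ b) ** 2)
      sst = np.sum((y - y.mean()) ** 2)
      return sst - sse, sse

  ssr_full, sse_full = ssr_sse([x, x ** 2], y)   # blocks A and B together (k = 2)
  ssr_lin, _ = ssr_sse([x], y)                   # block A (linear term) only
  k, q = 2, 1
  mse_full = sse_full / (n - k - 1)

  F_obs = ((ssr_full - ssr_lin) / q) / mse_full  # partial F for the quadratic term
  p_val = 1 - stats.f.cdf(F_obs, q, n - k - 1)
  print(F_obs, p_val)

With q = 1, this F statistic equals the square of the t-statistic for b2, so the F- and t-tests of the quadratic effect agree.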
Models with Dummy Variables (Sec. 12-7)

Dummy variables are used to include categorical variables in the model. If the variable has m levels, we include m − 1 dummy variables. The simplest case is a binary variable with 2 levels, such as gender. The dummy variable takes on the value 1 if the characteristic of interest is present, and 0 if it is absent.

Model with no interaction

Consider a model with one numeric independent variable (X1) and one dummy variable (X2). Then the models for the two groups (characteristic present and absent) have the following relationships with Y:

  Yi = β0 + β1·X1i + β2·X2i + εi    E(εi) = 0

  Characteristic present (X2i = 1):  E(Yi) = β0 + β1·X1i + β2(1) = (β0 + β2) + β1·X1i
  Characteristic absent (X2i = 0):   E(Yi) = β0 + β1·X1i + β2(0) = β0 + β1·X1i

Note that the two groups have different Y-intercepts, but the slopes with respect to X1 are equal.

Model with interaction

This model allows the slope of Y with respect to X1 to differ between the two levels of the categorical variable:

  Yi = β0 + β1·X1i + β2·X2i + β3·X1i·X2i + εi    E(εi) = 0

  X2i = 1:  E(Yi) = β0 + β1·X1i + β2(1) + β3·X1i(1) = (β0 + β2) + (β1 + β3)·X1i
  X2i = 0:  E(Yi) = β0 + β1·X1i + β2(0) + β3·X1i(0) = β0 + β1·X1i

Note that the two groups have different Y-intercepts and different slopes. More complex models can be fit with multiple numeric predictors and dummy variables. See the examples.

Transformations to Meet the Model Assumptions of Regression (Sec. 12-8)

There is a wide range of possible transformations for improving the model when the assumptions on the errors are violated. We will consider specifically two types of logarithmic transformations that work well in a wide range of problems. Other transformations can be useful in very specific situations.

Logarithmic Transformations

These transformations can often fix problems with nonlinearity, unequal variances, and non-normality of errors. We will consider base 10 logarithms and natural logarithms:

  X' = log10(X)  so that  X = 10^X'
  X' = ln(X)     so that  X = e^X'

Case 1: Multiplicative Model (with multiplicative nonnegative errors)

  Yi = β0 · X1i^β1 · X2i^β2 · εi

Taking base 10 logarithms of both sides gives the model:

  log10(Yi) = log10(β0 · X1i^β1 · X2i^β2 · εi) = log10(β0) + β1·log10(X1i) + β2·log10(X2i) + log10(εi)

Procedure: first take base 10 logarithms of Y, X1, and X2, then fit the linear regression model on the transformed variables. The relationship will be linear in the transformed variables.

Case 2: Exponential Model (with multiplicative nonnegative errors)

  Yi = e^(β0 + β1·X1i + β2·X2i) · εi

Taking natural logarithms of the Y values gives the model:

  ln(Yi) = β0 + β1·X1i + β2·X2i + ln(εi)

Multicollinearity (Sec. 12-9)

Problem: In observational studies and in models with polynomial terms, the independent variables may be highly correlated among themselves. Two classes of problems arise. Intuitively, the variables explain the same part of the variation in Y; this causes the partial regression coefficients of individual terms to not appear significant (since they are explaining the same variation). Mathematically, the standard errors of the estimates are inflated, causing wider confidence intervals for individual parameters and smaller t-statistics for testing whether individual partial regression coefficients are 0.

Variance Inflation Factor (VIF)

  VIFj = 1 / (1 − rj²)

where rj² is the coefficient of determination when variable Xj is regressed on the k − 1 remaining independent variables. There is a VIF for each independent variable. A variable is considered to be problematic if its VIF is 10.0 or larger. When multicollinearity is present, theory and common sense should be used to choose the best variables to keep in the model. A small sketch of the VIF computation follows.
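A minimal sketch of the VIF calculation: each predictor is regressed on the remaining predictors, and VIFj = 1/(1 − rj²). The three predictor columns are simulated, with x2 made nearly collinear with x1 so that those two VIFs exceed the 10.0 guideline.

  # Variance inflation factors for a small, simulated predictor matrix.
  import numpy as np

  rng = np.random.default_rng(1)
  n = 50
  x1 = rng.normal(size=n)
  x2 = x1 + 0.1 * rng.normal(size=n)     # nearly collinear with x1
  x3 = rng.normal(size=n)
  X = np.column_stack([x1, x2, x3])

  def vif(X, j):
      """VIF for column j of X, from the R^2 of X[:, j] regressed on the other columns."""
      others = np.delete(X, j, axis=1)
      Z = np.column_stack([np.ones(X.shape[0]), others])
      b, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
      resid = X[:, j] - Z @ b
      r2 = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
      return 1.0 / (1.0 - r2)

  print([round(vif(X, j), 1) for j in range(X.shape[1])])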
Complex methods such as principal components regression and ridge regression have been developed; we will not pursue them here, as they do not help in explaining which independent variables are associated with, and cause changes in, Y.

Model Building (Sec. 12-10)

A major goal of regression is to fit as simple a model as possible, in terms of the number of predictor (independent) variables, that still explains the variation in Y. That is, we wish to choose a parsimonious model (think of Ockham's razor). We consider two methods.

Stepwise Regression

This is a computational method that uses the following algorithm:

1. Choose a significance level (P-value) to enter the model (SLE) and a significance level to stay in the model (SLS). Some computer packages require that SLE ≤ SLS.
2. Fit all k simple regression models and choose the independent variable with the largest t-statistic (smallest P-value). Check that this P-value is less than SLE. If yes, keep this predictor in the model; otherwise stop (no model works).
3. Fit all k − 1 two-variable models (each containing the winning variable from step 2). Choose the model whose new variable has the lowest P-value. Check that this new variable has a P-value less than SLE. If no, stop and keep the model with only the winning variable from step 2. If yes, check that the variable from step 2 is still significant (its P-value is less than SLS); if not, drop the variable from step 2.
4. Fit all k − 2 three-variable models containing the winners from steps 2 and 3. Follow the process described in step 3, where both variables already in the model (from steps 2 and 3) are checked against SLS.
5. Continue until no new terms can enter the model by failing to meet the SLE criterion.

Best Subsets (Mallows' Cp Statistic)

Consider a situation with T − 1 potential independent variables (candidates). There are 2^(T−1) possible models containing subsets of the T − 1 candidates. Note that the full model containing all candidates actually has T parameters (including the intercept term).

1. Fit the full model and compute rT², its coefficient of determination.
2. For each model with k predictors, compute rk².
3. Compute  Cp = (1 − rk²)(n − T)/(1 − rT²) − (n − 2(k + 1))
4. Choose the model with the smallest k such that Cp is close to or below k + 1.

Chapter 13 – Time Series & Forecasting

Classical Multiplicative Model (Sec. 13-2)

Time series can generally be decomposed into components representing overall level, linear trend, cyclical patterns, seasonal shifts, and random irregularities. Series that are measured at the annual level cannot demonstrate seasonal patterns. In the multiplicative model, the effects of each component enter multiplicatively.

Multiplicative Model Observed at the Annual Level:  Yi = Ti · Ci · Ii
where the components are the trend, cyclical, and irregular effects.

Multiplicative Model Observed at the Seasonal Level:  Yi = Ti · Ci · Si · Ii
where the components are the trend, cyclical, seasonal, and irregular effects.

Smoothing a Time Series (Sec. 13-3)

Time series data are usually smoothed to reduce the impact of the random irregularities in individual data points.

Moving Averages

Centered moving averages are averages of the value at the current point and neighboring values (before and after that point). Due to the centering, the window length is typically an odd number. The 3- and 5-period moving averages at time point i are:

  MA(3)i = (Y_{i-1} + Yi + Y_{i+1}) / 3
  MA(5)i = (Y_{i-2} + Y_{i-1} + Yi + Y_{i+1} + Y_{i+2}) / 5

Clearly, there cannot be centered moving averages at the lower and upper ends of the series (see the sketch below).
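The sketch referenced above computes centered 3- and 5-period moving averages for a short, made-up series; positions at the ends of the series, where no centered average exists, are left as NaN.

  # Centered moving averages with an odd window length.
  import numpy as np

  y = np.array([12.0, 15.0, 11.0, 18.0, 20.0, 17.0, 23.0, 25.0])

  def centered_ma(y, window):
      """Centered moving average; the first and last (window // 2) positions stay NaN."""
      half = window // 2
      out = np.full(len(y), np.nan)
      for i in range(half, len(y) - half):
          out[i] = y[i - half:i + half + 1].mean()
      return out

  print(centered_ma(y, 3))
  print(centered_ma(y, 5))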
Exponential Smoothing

This method uses all previous data in a series to smooth it, with exponentially decreasing weights on observations further in the past:

  E1 = Y1
  Ei = W·Yi + (1 − W)·E_{i-1}    i = 2, 3, …    0 < W < 1

The exponentially smoothed value at the current time point can be used as the forecast for the next time point:

  Ŷ_{i+1} = Ei = W·Yi + (1 − W)·Ŷi

Least Squares Trend Fitting and Forecasting (Sec. 13-4)

We consider three trend models: linear, quadratic, and exponential. In each model the independent variable Xi is the time period of observation i. If the first observation is period 1, then Xi = i.

Linear Trend Model
  Population Model:  Yi = β0 + β1·Xi + εi
  Fitted Forecasting Equation:  Ŷi = b0 + b1·Xi

Quadratic Trend Model
  Population Model:  Yi = β0 + β1·Xi + β2·Xi² + εi
  Fitted Forecasting Equation:  Ŷi = b0 + b1·Xi + b2·Xi²

Exponential Trend Model
  Population Model:  Yi = β0·β1^Xi·εi    so that    log10(Yi) = log10(β0) + Xi·log10(β1) + log10(εi)
  Fitted Forecasting Equation:  log10(Ŷi) = b0 + b1·Xi    so that    Ŷi = (10^b0)·(10^b1)^Xi

This exponential model's forecasting equation is obtained by first fitting a simple linear regression of the logarithm of Y on X, then substituting the least squares estimates of the Y-intercept and the slope into the second equation.

Model Selection Based on Differences

  First Differences:  Yi − Y_{i-1}    i = 2,…,n
  Second Differences:  (Yi − Y_{i-1}) − (Y_{i-1} − Y_{i-2}) = Yi − 2Y_{i-1} + Y_{i-2}    i = 3,…,n
  Percent Differences:  [(Yi − Y_{i-1}) / Y_{i-1}]·100%    i = 2,…,n

If the first differences are approximately constant, the linear trend model is appropriate. If the second differences are approximately constant, the quadratic trend model is appropriate. If the percent differences are approximately constant, the exponential trend model is appropriate.

Autoregressive Models for Trend Fitting and Forecasting (Sec. 13-5)

Autoregressive models allow for the possibility that the mean response is a linear function of previous response(s).

1st Order Model
  Population Model:  Yi = A0 + A1·Y_{i-1} + δi
  Fitted Equation:  Ŷi = a0 + a1·Y_{i-1}

Pth Order Model
  Population Model:  Yi = A0 + A1·Y_{i-1} + … + AP·Y_{i-P} + δi
  Fitted Equation:  Ŷi = a0 + a1·Y_{i-1} + … + aP·Y_{i-P}

Significance Test for the Highest Order Term

  H0: AP = 0    HA: AP ≠ 0
  TS:  tobs = aP / SaP
  RR:  |tobs| ≥ t(α/2, n−2P−1)

This test can be used repeatedly to determine the order of the autoregressive model. You may start with a large value of P (say 4 or 5, depending on the length of the series), and keep fitting simpler models until the test for the last term is significant.

Fitted Pth Order Forecasting Equation

Start by fitting the Pth order autoregressive model on a dataset containing n observations. Then we can obtain forecasts into the future as follows:

  Ŷ_{n+1} = a0 + a1·Yn + a2·Y_{n-1} + … + aP·Y_{n-P+1}
  Ŷ_{n+2} = a0 + a1·Ŷ_{n+1} + a2·Yn + … + aP·Y_{n-P+2}
  Ŷ_{n+j} = a0 + a1·Ŷ_{n+j-1} + a2·Ŷ_{n+j-2} + … + aP·Ŷ_{n+j-P}

In the third equation (the general case), the values on the right-hand side are observed Y values whenever their time index is n or earlier, and previously forecasted values whenever their index is beyond n.

Choosing a Forecasting Model (Sec. 13-6)

To determine which of a set of potential forecasting models to use, first obtain the prediction errors (residuals), ei = Yi − Ŷi. Two widely used summary measures are the mean absolute deviation (MAD) and the mean square error (MSE):

  MAD = Σ|Yi − Ŷi| / n = Σ|ei| / n
  MSE = Σ(Yi − Ŷi)² / n = Σei² / n

Plots of the residuals versus time should form a random cloud, centered at 0. Linear, cyclical, and seasonal patterns may also appear, leading to fitting models that account for them. See the prototype plots on page 695 of the textbook. A small worked sketch combining an autoregressive fit with these measures follows.
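The sketch below, on an invented series, fits a first-order autoregressive model by regressing Yi on Y_{i-1}, summarizes the one-step fit with MAD and MSE, and then chains two forecasts beyond the end of the series (the second forecast uses the first, as in the Pth order forecasting equations).

  # First-order autoregressive fit, fit-quality measures, and chained forecasts.
  import numpy as np

  y = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0, 27.0, 26.0])

  # Regress Y_i on Y_{i-1}
  Z = np.column_stack([np.ones(len(y) - 1), y[:-1]])
  (a0, a1), *_ = np.linalg.lstsq(Z, y[1:], rcond=None)

  # One-step fitted values and their forecast errors
  fitted = a0 + a1 * y[:-1]
  e = y[1:] - fitted
  mad = np.mean(np.abs(e))
  mse = np.mean(e ** 2)

  # Forecasts beyond the series: use the observed y_n first, then chain on forecasts
  f1 = a0 + a1 * y[-1]
  f2 = a0 + a1 * f1

  print(a0, a1, mad, mse, f1, f2)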
Time Series Forecasting of Quarterly or Monthly Series (Sec. 13-7)

Consider the exponential trend model with quarterly or monthly dummy variables, depending on the frequency of the series.

Exponential Model with Quarterly Data

Create three dummy variables Q1, Q2, and Q3, where Qj = 1 if the observation occurred in quarter j, and 0 otherwise. Let Xi be the coded quarterly value (if the first quarter of the data is labeled 1, then Xi = i). The population model and its transformed linear model are:

  Yi = β0 · β1^Xi · β2^Q1 · β3^Q2 · β4^Q3 · εi

  log10(Yi) = log10(β0) + Xi·log10(β1) + Q1·log10(β2) + Q2·log10(β3) + Q3·log10(β4) + log10(εi)

Fit the multiple regression model with the base 10 logarithm of Yi as the response and Xi (which is usually i), Q1, Q2, and Q3 as the independent variables. The fitted model is:

  log10(Ŷi) = b0 + b1·Xi + b2·Q1 + b3·Q2 + b4·Q3    so that    Ŷi = 10^(b0 + b1·Xi + b2·Q1 + b3·Q2 + b4·Q3)

Exponential Model with Monthly Data

A similar approach is taken when the data are monthly. We create 11 dummy variables M1,…,M11, each of which takes on the value 1 if the observation falls in the month corresponding to that dummy variable. That is, if the observation is for January, M1 = 1 and all other Mj are 0. The population model and its transformed linear model are:

  Yi = β0 · β1^Xi · β2^M1 · β3^M2 · … · β12^M11 · εi

  log10(Yi) = log10(β0) + Xi·log10(β1) + M1·log10(β2) + M2·log10(β3) + … + M11·log10(β12) + log10(εi)

The fitted model is:

  log10(Ŷi) = b0 + b1·Xi + b2·M1 + b3·M2 + … + b12·M11    so that    Ŷi = 10^(b0 + b1·Xi + b2·M1 + b3·M2 + … + b12·M11)
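Finally, a hedged sketch of the quarterly version of this model: regress log10(Y) on the coded quarter X and the dummies Q1 through Q3, then back-transform with 10 raised to the fitted value. The three years of quarterly values are invented, and the forecast line assumes the next period is the first quarter of a new year.

  # Exponential trend model with quarterly dummy variables, fit on the log10 scale.
  import numpy as np

  y = np.array([100., 112., 125., 95., 110., 124., 138., 104., 121., 136., 152., 115.])
  n = len(y)
  x = np.arange(1, n + 1)                    # coded quarter: 1, 2, ..., n
  quarter = (np.arange(n) % 4) + 1           # 1, 2, 3, 4, 1, 2, ...

  Q1 = (quarter == 1).astype(float)
  Q2 = (quarter == 2).astype(float)
  Q3 = (quarter == 3).astype(float)

  Z = np.column_stack([np.ones(n), x, Q1, Q2, Q3])
  b, *_ = np.linalg.lstsq(Z, np.log10(y), rcond=None)

  y_fit = 10 ** (Z @ b)                      # fitted values on the original scale

  # One-quarter-ahead forecast (next period is quarter 1 of a new year)
  z_next = np.array([1.0, n + 1, 1.0, 0.0, 0.0])
  print(10 ** (z_next @ b))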