Sums of Squares SS(C. Total) = 203748.80 SS(Year) = 187300.39 SS(Year

Sums of Squares SS(C. Total) = 203748.80  SS(Year) = 187300.39    Year explains 91.9% SS(Year2|Year) = 16264.87  Year2 adds 8.0% 1 Sums of Squares SS(C. Total) = 203748.80 SS(Year) = 187300.39 91.9% SS(Year2|Year) = 16264.87 8.0% 2 Sums of Squares SS(C. Total) = 203748.80  SS(Year2) = 188977.13    Year2 explains 92.8% SS(Year|Year2) = 14588.13  Year adds 7.1% 3 Sums of Squares SS(C. Total) = 203748.80 SS(Year|Year2) = 14588.13 7.1% SS(Year2) = 188977.13 92.8% 4 Sums of Squares SS(C. Total) = 203748.80 SS(shared) = 172895.80 84.8% SS(Year|Year2) = 14588.13 7.1% SS(Year2|Year) = 16264.87 8.0% 5 Sums of Squares SS(C. Total) = 203748.80  SS(YearCtr) = 187300.39    YearCtr explains 91.9% SS(YearCtr2|YearCtr) = 16264.87  YearCtr2 adds 8.0% 6 Sums of Squares SS(C. Total) = 203748.80 SS(YearCtr) = 187300.39 91.9% SS(YearCtr2|YearCtr) = 16264.87 8.0% 7 Sums of Squares SS(C. Total) = 203748.80  SS(YearCtr2) = 16264.87    YearCtr2 explains 8.0% SS(YearCtr|YearCtr2) = 187300.39  YearCtr adds 91.9% 8 Sums of Squares SS(C. Total) = 203748.80 SS(YearCtr|YearCtr2) = 187300.39 91.9% SS(YearCtr2) = 16264.87 8.0% 9 Sums of Squares SS(C. Total) = 203748.80 SS(YearCtr|YearCtr2) = 187300.39 91.9% SS(shared) = 0.00 0.0% SS(YearCtr2|YearCtr) = 16264.87 8.0% 10 Effects of Centering Year2 shares over 85% of the explained variation with Year.  YearCtr2 shares none of the explained variation with YearCtr.  11 Why does this happen? The correlation between Year2 and Year is statistically significant, multicollinearity.  The correlation between YearCtr2 and YearCtr is zero, no linear relationship.  12 What about 1940 & 1950?    The predictions for 1940 and 1950 are much higher than the actual population values. Why? Can we add a term to the model that could account for this? 13 Dummy Variable A dummy of indicator variable can be used to identify individual or sets of values.  X = 1 if Year is 1940 or 1950  X = 0 otherwise  14 Quadratic with Dummy  Predicted Population = 75.467 + 1.368*(Year – 1900) + 0.0066577*(Year – 1900)2 – 8.947*X  Note that the other estimated slope coefficients are very close to those in the quadratic model. 15 Quadratic with Dummy  For 1940 and 1950, the prediction is lowered by 8.947 million. 16 Quadratic  1940 Actual = 132.165  Predicted = 139.426  Residual = –7.261   1950 Actual =151.326  Predicted =159.129  Residual = –7.803  17 Quadratic with Dummy  1940 Actual = 132.165  Predicted = 131.908  Residual = 0.257   1950 Actual =151.326  Predicted =151.583  Residual = –0.257  18 Change in  Quadratic: R2 =0.9991   99.91% explained variation Quadratic+Dummy: R2 =0.9998   2 R 99.98% explained variation Only a small increase. 19 Significant Improvement?  Dummy variable, X added to the quadratic model. t = –7.25, P-value < 0.001  Because the P-value is small, the dummy variable, X, adds significantly to the quadratic model.  20 Change in RMSE  Quadratic:   Quadratic + Dummy:   RMSE = 3.029 RMSE = 1.602 RMSE reduced quite a bit. 21 22 Plot of Residuals One might detect a up – down – up – down, wave.  Worst predictions are still within 3 or 4 million of the actual population.  Probably can’t do much better.  23 Residuals 24

Sums of Squares SS(C. Total) = 203748.80 SS(Year) = 187300.39 SS(Year

Related documents

Products

Support

Sums of Squares SS(C. Total) = 203748.80 SS(Year) = 187300.39 SS(Year

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib