POL 681: Standard Errors and Such

Run Regression:

. reg measwt reptwt if female==1

      Source |       SS       df       MS              Number of obs =     101
-------------+------------------------------           F(  1,    99) = 1024.54
       Model |  4334.88935     1  4334.88935           Prob > F      =  0.0000
    Residual |  418.873025    99  4.23104066           R-squared     =  0.9119
-------------+------------------------------           Adj R-squared =  0.9110
       Total |  4753.76238   100  47.5376238           Root MSE      =  2.0569

------------------------------------------------------------------------------
      measwt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      reptwt |   .9772242   .0305301    32.01   0.000     .9166458    1.037803
       _cons |   1.777503   1.744408     1.02   0.311     -1.68378    5.238787
------------------------------------------------------------------------------

The MSE is the variance of the residuals. It is given by the formula:

    MSE = SSResidual / (n - k - 1)

where n is the number of observations and k is the number of regressors. For these data (n = 101, k = 1), it is:

. display 418.873025/99
4.2310407

The standard error of the estimate (or RMSE) is given by the square root of this variance:

. display sqrt(418.873025/99)
2.0569494

To see where the standard error of b1 comes from, recall the estimator:

    se(b1) = RMSE / sqrt( sum of (x_i - xbar)^2 )

We can demonstrate where these quantities come from. First, generate descriptive statistics:

. summ reptwt if female==1, detail

                       Reported Weight
-------------------------------------------------------------
      Percentiles      Smallest
 1%           44             41
 5%           45             44
10%           50             44       Obs                 101
25%           53             45       Sum of Wgt.         101

50%           56                      Mean           56.74257
                        Largest       Std. Dev.      6.737438
75%           61             71
90%           65             75       Variance       45.39307
95%           68             75       Skewness       .4570697
99%           75             77       Kurtosis       3.639294

Now, generate the sum of squared deviations of X. This is the “numerator” of the variance of X, so it equals the variance times n - 1:

. display 45.39307*100
4539.307

Its square root is the denominator of the standard error:

. display sqrt(45.39307*100)
67.374379

The standard error (as estimated) is:

. display 2.0569494/67.374379
.03053014

which corresponds to our regression output.

Q: What happens as the variance in X decreases?

Now, let’s turn to the multiple regression setting. Let’s run the following model:

. reg prestige income educ

      Source |       SS       df       MS              Number of obs =      45
-------------+------------------------------           F(  2,    42) =  101.22
       Model |  36180.9458     2  18090.4729           Prob > F      =  0.0000
    Residual |  7506.69865    42   178.73092           R-squared     =  0.8282
-------------+------------------------------           Adj R-squared =  0.8200
       Total |  43687.6444    44   992.90101           Root MSE      =  13.369

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .5987328   .1196673     5.00   0.000     .3572343    .8402313
        educ |   .5458339   .0982526     5.56   0.000     .3475521    .7441158
       _cons |  -6.064663   4.271941    -1.42   0.163    -14.68579    2.556463
------------------------------------------------------------------------------

Let’s verify the RMSE estimate as being:

. display sqrt(7506.69865/42)
13.369028

In multiple regression, the estimated standard error of a slope coefficient is

    se(bj) = RMSE / sqrt( [sum of (x_ij - xbar_j)^2] * (1 - R2_j) )

where R2_j is the R-squared from the auxiliary regression of x_j on the other regressors. Taking this estimator into account, we can compute the “denominator” for the coefficient on educ.

. summ educ, detail

  Percent of males in occupation in 1950 who were high-school graduates
-------------------------------------------------------------
      Percentiles      Smallest
 1%            7              7
 5%           17             15
10%           20             17       Obs                  45
25%           26             19       Sum of Wgt.          45

50%           45                      Mean           52.55556
                        Largest       Std. Dev.      29.76083
75%           84             93
90%           92             97       Variance       885.7071
95%           97             98       Skewness       .2262167
99%          100            100       Kurtosis       1.451751

. display 885.7071*44
38971.112

This is our “variance” term: the sum of squared deviations of educ, which matches the Total SS in the auxiliary regression below. What about the auxiliary regression?

. reg educ income

      Source |       SS       df       MS              Number of obs =      45
-------------+------------------------------           F(  1,    43) =   47.51
       Model |  20456.6437     1  20456.6437           Prob > F      =  0.0000
    Residual |  18514.4674    43  430.569009           R-squared     =  0.5249
-------------+------------------------------           Adj R-squared =  0.5139
       Total |  38971.1111    44  885.707071           Root MSE      =   20.75

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .8824238   .1280211     6.89   0.000     .6242448    1.140603
       _cons |   15.61141   6.188362     2.52   0.015      3.13139    28.09143
------------------------------------------------------------------------------

The R-squared from this model is the auxiliary R-squared. Because there is only one other regressor, it is equivalent to the square of the correlation coefficient between educ and income:

. corr educ income
(obs=45)

             |     educ   income
-------------+------------------
        educ |   1.0000
      income |   0.7245   1.0000

. display .7245^2
.52490025

Putting the pieces together, the standard error is:

. display 13.369028/sqrt(38971.112*(1-.52490025))
.09825

which corresponds to the standard error on educ (.0982526) in our regression output.
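
As a cross-check outside Stata, the two hand calculations above can be reproduced in a few lines of Python. This is only a sketch: the helper name slope_se is hypothetical (it is not a Stata or library function), and every number is copied from the output above rather than computed from the raw data, so small rounding differences against Stata's reported standard errors are expected.

```python
import math


def slope_se(rmse, ssx, r2_aux=0.0):
    """Estimated standard error of a slope coefficient.

    rmse   : root mean squared error of the regression
    ssx    : sum of squared deviations of the regressor, Var(x) * (n - 1)
    r2_aux : R-squared from regressing x on the other regressors
             (0 when x is the only regressor)
    """
    return rmse / math.sqrt(ssx * (1.0 - r2_aux))


# Bivariate case: measwt on reptwt (women only). No other regressors.
rmse_weight = math.sqrt(418.873025 / 99)   # sqrt(SSResidual / (n - k - 1))
ssx_reptwt = 45.39307 * 100                # Var(reptwt) * (n - 1)
se_reptwt = slope_se(rmse_weight, ssx_reptwt)

# Multiple regression case: prestige on income and educ, coefficient on educ.
rmse_prestige = math.sqrt(7506.69865 / 42)
ssx_educ = 885.7071 * 44                   # Var(educ) * (n - 1)
r2_aux = 0.7245 ** 2                       # squared corr(educ, income)
se_educ = slope_se(rmse_prestige, ssx_educ, r2_aux)

# Illustration for the question about the variance of X:
# halve the spread in reptwt while holding the RMSE fixed.
se_less_spread = slope_se(rmse_weight, ssx_reptwt / 2)

print(round(se_reptwt, 5), round(se_educ, 5), round(se_less_spread, 5))
```

The first two printed values should agree with the Std. Err. column for reptwt and educ to about four or five decimal places. The third is larger than the first, since less variation in X leaves a smaller denominator and therefore a larger standard error.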