Econometrics--Econ 388
Spring 2012, Richard Butler
Final Exam

your name_________________________________________________

Section   Problem   Points Possible
I         1-20      3 points each
II        21        10 points
          22        10 points
          23        5 points
          24        5 points
          25        5 points
III       26        20 points
          27        20 points
          28        20 points
IV        29        20 points
          30        25 points

I. Define or explain the following terms:

1. binary variables
2. The prediction error for $Y_T$, i.e., the variance of a forecast value of y given a specific value of the regressor vector, $X_T$ (from $Y_T = X_T\hat{\beta} + \epsilon_T$)
3. formula for VIF test for collinearity
4. structural vs. reduced form parameters in simultaneous equations
5. dummy variable trap
6. endogenous variable
7. maximum likelihood estimation criterion
8. F-test
9. Goldfeld-Quandt test
10. null hypothesis
11. identification problem (in simultaneous equation models)
12. Lagrange multiplier test
13. least squares estimation criterion for fitting a linear regression
14. probit model
15. dynamically complete models
16. one-tailed hypothesis test
17. model corresponding to “prais y x1 x2 x3;” procedure in STATA
18. show that $\sum_{i=1}^{N} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{N} (y_i - \bar{y})\,x_i$
19. probability significance values (i.e., ‘p-values’)
20. central limit theorem

II. Some Concepts

21. Suppose we want to test the effect of the Romney “Only-True” stimulus package (“give all the money to Mitt, and he will spur GNP”), where the null hypothesis is that the Romney multiplier is one or greater,

    $H_0: \beta_1 \geq 1$

vs. the alternative hypothesis that the Romney multiplier is 0 (for each Mitt Buck spent, GDP doesn’t change at all),

    $H_a: \beta_1 = 0$

and we know (the standard error angel has visited us) that $\mathrm{S.E.}(\hat{\beta}_j) = 0.3$ in either case. The estimated equation, with the standard error in parentheses below the slope estimate, is

    $GNP_i = \beta_0 + 0.5\,(\text{Mitt Bucks})_i$
                       (0.3)

a) Employ the usual 95% confidence interval against a type I error, constructing the critical cutoff value for the appropriate one-tailed test assuming the null hypothesis ($H_0$) is true. What is that critical cutoff value for the 5 percent tail?
b) Can we reject the null hypothesis?

c) What is the size of the type II error given the critical cutoff value in (a), assuming that the alternative hypothesis is true? (see the table at the end of the test for Z-scores)

22. In a simultaneous equation system, it is arbitrary which one of the endogenous variables is placed on the left hand side of the equation (and which of the remaining endogenous variables are placed on the right hand side). Hence we could describe a system of 4 simultaneous equations abstractly by indicating the presence or absence of each variable in each equation (X indicates that the respective variable is included, 0 indicates that the respective variable is absent from that equation), using the following matrix table notation:

                Endo 1  Endo 2  Endo 3  Endo 4  Exo 1  Exo 2  Exo 3
    Equatn 1      X       0       X       X       X      0      0
    Equatn 2      X       X       X       0       0      X      X
    Equatn 3      0       0       X       0       X      0      0
    Equatn 4      X       0       X       X       0      X      0

Using the order condition for identification, which of the 4 equations in the system are identified? (answer without explanation gets zero points)

The next three questions consist of statements that are True, False, or Uncertain (Sometimes True). You are graded solely on the basis of your explanation in your answer.

23. “Let $\hat{X} = V(V'V)^{-1}V'X$, where V has the appropriate dimensions. Then $\hat{X}'X = \hat{X}'\hat{X}$.”

24. “In a linear regression model (either single or multiple), if the sample means of all the column variables of X that carry slope coefficients (that is, excluding the constant) are zero and the sample mean of Y is zero, then the intercept will be zero as well.”

25. “A first order autoregressive process, $y_t = \rho y_{t-1} + \epsilon_t$, is both stationary and weakly dependent if $\rho < 1$.”

III. Some Applications

26. For the following STATA output, indicate what sort of tests are made (2 tests are specifically programmed) and what they indicate:

    *. narr86     # times arrested, 1986
     . pcnv       proportion of prior convictions
     . avgsen     avg sentence length, mos.
     . tottime    time in prison since 18 (mos.)
     . ptime86    mos. in prison during 1986
     . qemp86     # quarters employed, 1986;

    . regress narr86 pcnv avgsen tottime ptime86 qemp86;
    ------------------------------------------------------------------------------
          narr86 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            pcnv |  -.1512246    .040855    -3.70   0.000    -.2313346   -.0711145
          avgsen |  -.0070487   .0124122    -0.57   0.570     -.031387    .0172897
         tottime |   .0120953   .0095768     1.26   0.207    -.0066833     .030874
         ptime86 |  -.0392585   .0089166    -4.40   0.000    -.0567425   -.0217745
          qemp86 |  -.1030909   .0103972    -9.92   0.000    -.1234782   -.0827037
           _cons |   .7060607   .0331524    21.30   0.000     .6410542    .7710671
    ------------------------------------------------------------------------------

    . test (avgsen=0) (tottime=0);
    **FIRST TEST TO BE EXPLAINED********;
           F(  2,  2719) =    2.03
                Prob > F =    0.1310

    . regress narr86 pcnv ptime86 qemp86;
    ------------------------------------------------------------------------------
          narr86 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            pcnv |  -.1499274   .0408653    -3.67   0.000    -.2300576   -.0697973
         ptime86 |  -.0344199    .008591    -4.01   0.000    -.0512655   -.0175744
          qemp86 |   -.104113   .0103877   -10.02   0.000    -.1244816   -.0837445
           _cons |   .7117715   .0330066    21.56   0.000      .647051     .776492
    ------------------------------------------------------------------------------

    . predict resids, residuals;
    . regress resids pcnv avgsen tottime ptime86 qemp86;
    ------------------------------------------------------------------------------
          resids |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            pcnv |  -.0012971    .040855    -0.03   0.975    -.0814072    .0788129
          avgsen |  -.0070487   .0124122    -0.57   0.570     -.031387    .0172897
         tottime |   .0120953   .0095768     1.26   0.207    -.0066833     .030874
         ptime86 |  -.0048386   .0089166    -0.54   0.587    -.0223226    .0126454
          qemp86 |   .0010221   .0103972     0.10   0.922    -.0193652    .0214093
           _cons |  -.0057108   .0331524    -0.17   0.863    -.0707173    .0592956
    ------------------------------------------------------------------------------

    ************ SECOND TEST TO BE EXPLAINED **************;
    . gen lm=e(N)*e(r2);
    . gen test=chi2tail(2,lm);
    . sum lm test;

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              lm |      2725    4.070729           0   4.070729   4.070729
            test |      2725    .1306328           0   .1306328   .1306328

27. Data from GPA2.RAW generated the regression below, where

    sat     = combined SAT score
    hsize   = size of the individual’s high school graduating class (in hundreds)
    hsizesq = square of hsize
    female  = 1 if female, 0 if male
    black   = 1 if black, 0 if non-black

    R-SQUARE = 0.0832     R-SQUARE ADJUSTED = 0.0823

    VARIABLE   ESTIMATED    STANDARD   T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
      NAME    COEFFICIENT    ERROR     4132 DF   P-VALUE    CORR.   COEFFICIENT    AT MEANS
    HSIZE       19.115       3.837      4.982     0.000     0.077     0.2381        0.0519
    HSIZESQ    -2.1894       0.5278    -4.148     0.000    -0.064    -0.1983       -0.0231
    FEMALE     -41.608       4.175     -9.967     0.000    -0.153    -0.1485       -0.0182
    BLACK      -139.29       9.097     -15.31     0.000    -0.232    -0.2285       -0.0075
    CONSTANT    1027.0       6.290      163.3     0.000     0.930     0.0000        0.9968

a) Should hsizesq be in the regression? In terms of optimal high school size, what does it imply?

b) What do the estimated coefficients on the female and black dummy variables indicate (both in magnitude and statistical importance)?

c) Is this one of those regressions where I should worry about interpreting the constant term? Why or why not?

III. Some Proofs

28.
Show whether there is simultaneous equation bias (right hand side regressors correlated with the error) in the following particular measurement error framework: the true model is

    $Y = X\beta + \gamma z + \epsilon$,

but the variable z (the true value) is measured with error when it is observed. Call this observed value z*, subject to the following measurement error relationship:

    $z = z^* + \nu$,

where $\nu$ is white noise (with the usual independent, zero mean distribution), uncorrelated with z* and $\epsilon$, so that $E(\nu \mid z^*) = 0$ (and $\epsilon$ is uncorrelated with X, z, and z*). Indicate whether or not there is “simultaneous equation” bias if Y is regressed on X and z* (as always, you are only graded on your explanation, not on your guess as to the right answer).

29. Derive $V(\hat{\beta})$, i.e., the variance-covariance matrix for $\hat{\beta} = (X'X)^{-1}X'Y$, given the usual model assumptions.

30. Under the model assumptions, prove that $s^2$ is an unbiased estimator of $\sigma^2$ for the OLS regression model, using all the necessary assumptions employed in the proof in class or in the book, and proving it in the general case using matrix algebra.
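
Worked arithmetic for the STATA listing in question 26: the last two `gen` commands there compute `lm=e(N)*e(r2)` and `test=chi2tail(2,lm)`. The sketch below only re-traces that arithmetic from the printed numbers; it is not part of the exam and does not interpret the test. Python is used purely as a calculator, and the variable names (`n_obs`, `lm_stat`, `implied_r2`, `p_value`) are hypothetical. It assumes that for exactly 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed; any other degrees of freedom would require one.

```python
import math

# Values copied from the STATA listing in question 26.
n_obs = 2725          # Obs column of the "sum lm test" table, i.e. e(N)
lm_stat = 4.070729    # reported value of lm = e(N)*e(r2)

# Since lm = N * R-squared, the R-squared of the residual regression
# is implied by the two printed numbers.
implied_r2 = lm_stat / n_obs

# Upper-tail chi-square probability with 2 degrees of freedom.
# Assumption: the closed form exp(-x/2) holds only for df = 2.
p_value = math.exp(-lm_stat / 2)

print(f"implied R-squared of residual regression: {implied_r2:.6f}")
print(f"chi-square(2) upper-tail probability:     {p_value:.7f}")
```

The second printed value can be compared with the `test` row of the listing (.1306328), and with the `Prob > F` reported after the `test` command (0.1310).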