Fina - BYU Department of Economics

Econometrics--Econ 388
Spring 2009, Richard Butler
Final Exam
your name_________________________________________________
Section   Problem   Points Possible
I         1-20      3 points each
II        21        20 points
          22        5 points
          23        5 points
          24        5 points
          25        5 points
III       26        20 points
          27        20 points
          28        20 points
IV        29        20 points
          30        20 points
I. Define or explain the following terms:
1. binary variables-
2. The prediction error for $Y_T$, i.e., the variance of a forecast value of y given a specific value of
the regressor vector, $X_T$ (from $Y_T = X_T\hat{\beta} + \epsilon_T$)-
3. formula for VIF test for collinearity--
4. structural vs. reduced form parameters in simultaneous equations-
5. dummy variable trap -
6. endogenous variable-
7. maximum likelihood estimation criteria-
8. F-test-
9. Goldfeld-Quandt test-
10. null hypothesis-
11. identification problem (in simultaneous equation models)-
12. Lagrange multiplier test--
13. least squares estimation criterion for fitting a linear regression-
14. probit model-
15. dynamically complete models -
16. one-tailed hypothesis test-
17. model corresponding to “prais y x1 x2 x3;” procedure in STATA --
18. show that $\sum_{i=1}^{N} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{N} (y_i - \bar{y}) x_i$ --
19. probability significance values (i.e., ‘p-values’)-
20. central limit theorem-
II. Some Concepts
21. To test whether larger schools are better places to learn math, 10th grade math performance
(math10) is regressed on the log of total compensation (ltotcomp), the log of staff (lstaff), and
the log of enrollment (lenroll). The null hypothesis is that the effect of enrollment is
non-negative (i.e., either positive or neutral) and the alternative hypothesis is that it is bad
for learning math. Under the null hypothesis, using a 95 percent confidence interval, what is the
size of the type II error when the alternative hypothesis is that the lenroll coefficient is
really -1.5 (and the standard error of the coefficient remains .6932)? Show your work; the tables
in the back of this exam may be useful.
      Source |       SS       df       MS              Number of obs =     408
-------------+------------------------------           F(  3,   404) =    9.42
       Model |  2930.03231     3  976.677437           Prob > F      =  0.0000
    Residual |  41887.1482   404   103.68106           R-squared     =  0.0654
-------------+------------------------------           Adj R-squared =  0.0584
       Total |  44817.1805   407  110.115923           Root MSE      =  10.182

------------------------------------------------------------------------------
      math10 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ltotcomp |   21.15498   4.055549     5.22   0.000     13.18237     29.1276
      lstaff |   3.979981   4.189659     0.95   0.343    -4.256274    12.21624
     lenroll |  -1.268042   .6932037    -1.83   0.068    -2.630778    .0946951
       _cons |  -207.6645   48.70311    -4.26   0.000    -303.4077   -111.9213
------------------------------------------------------------------------------
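
The arithmetic behind a type II error calculation of this kind can be sketched as follows. This is only an illustration under stated assumptions, not an official solution: it assumes a one-tailed test at the 5 percent level, approximates the t distribution with 404 degrees of freedom by the standard normal, and uses variable names (`se`, `beta_alt`, `cutoff`) that are chosen here for readability.

```python
from statistics import NormalDist

# Values taken from the question text and the regression output.
se = 0.6932        # standard error of the lenroll coefficient
beta_alt = -1.5    # coefficient value under the alternative hypothesis
alpha = 0.05       # ASSUMPTION: one-tailed test at the 5 percent level

std_normal = NormalDist()

# One-tailed test of H0: beta >= 0 against H1: beta < 0.
# ASSUMPTION: with 404 degrees of freedom, t is approximated by the normal.
z_crit = std_normal.inv_cdf(alpha)   # roughly -1.645
cutoff = z_crit * se                 # reject H0 when the estimate is below this

# Type II error: probability of NOT rejecting H0 when beta equals beta_alt.
type_2_error = 1.0 - std_normal.cdf((cutoff - beta_alt) / se)
power = 1.0 - type_2_error

print(f"rejection cutoff for the estimate: {cutoff:.4f}")
print(f"type II error: {type_2_error:.3f}, power: {power:.3f}")
```

Under a two-sided reading of "95 percent confidence interval" the critical value would differ, so the numbers printed here depend on the one-tailed assumption.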
22. My son has a pyramid-shaped die, with four sides numbered from 1 to 4. Let W be the random
variable corresponding to the number that is on the bottom side when the die is rolled. If the die
is fair, all outcomes are equally probable. Suppose instead that the die is not fair: the
probability of rolling a 1 is one fourth, the probability of rolling a 2 is one fourth, and the
probability of rolling a 4 is one half.
a. What is the expected value of the random variable W and what is the variance of W when the
die is not fair?
b. If we did not know whether the die was fair or not (i.e., whether each outcome was equally
probable), how could we test for that?
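
A minimal sketch of how the moments of a discrete random variable like W are computed is given below. It assumes that face 3 has probability zero, since the three stated probabilities already sum to one; the names `probs`, `mean`, and `variance` are illustrative only.

```python
from fractions import Fraction

# Probabilities for the unfair die as stated in the question.
# ASSUMPTION: face 3 has probability zero, because the stated
# probabilities for faces 1, 2, and 4 already sum to one.
probs = {
    1: Fraction(1, 4),
    2: Fraction(1, 4),
    3: Fraction(0),
    4: Fraction(1, 2),
}
assert sum(probs.values()) == 1

# E[W] is the probability-weighted sum of the outcomes.
mean = sum(w * p for w, p in probs.items())

# Var(W) is the probability-weighted sum of squared deviations from E[W].
variance = sum((w - mean) ** 2 * p for w, p in probs.items())

print(mean, variance)  # prints "11/4 27/16"
```

Exact fractions are used so that the result is not affected by floating-point rounding.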
The next three questions consist of statements that are True, False, or Uncertain (Sometimes
True). You are graded solely on the basis of the explanation in your answer.
23. “Let $\hat{X} = V(V'V)^{-1}V'X$ where V has the appropriate dimensions. Then $\hat{X}'X = \hat{X}'\hat{X}$.”
24. “In a linear regression model (either single or multiple), if the sample means of all the
columns of X associated with slope coefficients (that is, excluding the constant) are zero and
the sample mean of Y is zero, then the intercept will be zero as well.”
25. "A first order autoregressive process, $y_t = \rho y_{t-1} + \epsilon_t$, is both stationary and weakly
dependent if $\rho < 1$.”
III. Some Applications
26. Given the usual regression model $Y_t = X_t\beta + \epsilon_t$ where the population error terms follow a
second order moving average process: $\epsilon_t = \alpha_0 e_t + \alpha_1 e_{t-1} + \alpha_2 e_{t-2}$, where $e_t$ is a white noise
error term, independently distributed with zero mean and variance $\sigma^2$, then
a) derive the variance-covariance matrix for $\epsilon$, and
b) explain whether the errors are weakly dependent or not.
27. Write STATA programs to carry out the tests or estimate the models requested below,
assuming that the sample variables P and Q are endogenous, and that the exogenous variables
are A, B, C, D, E, and F.
a. $P_i = \beta_0 + \beta_1 Q_i + \beta_2 B_i + \beta_3 C_i + \beta_4 D_i + \epsilon_i$. Do a Hausman test for endogeneity of Q on the
right hand side of the equation.
b. For the same model as in (a), write out the STATA code to test for overidentifying restrictions
on the “extra” identifying variables.
c. For the same model as in (a), write out the STATA code to estimate the model in (a) by two
stage least squares.
IV. Some Proofs
28. Show whether there is simultaneous equation bias (right hand side regressors correlated with
the error) in the following particular measurement error framework:
the true model is
$Y = X\beta + \gamma z + \epsilon$,
but the variable z (the true value) is measured with error when it is observed. Call this observed
value z*, subject to the following measurement error relationship:
$z = z^* + \nu$
where $\nu$ is white noise (with the usual independent, zero mean distribution), uncorrelated with
z* and $\epsilon$ so that $E(\nu \mid z^*) = 0$ (and $\epsilon$ is uncorrelated with X, z, and z*). Indicate whether or
not there is “simultaneous equation” bias if Y is regressed on X and z* (as always, you are only
graded on your explanation, not on your guess as to the right answer).
29. Suppose that for the general linear regression model, $Y = X\beta + \epsilon$, the modeling
assumptions are satisfied, in particular, $\epsilon$ is normally distributed with mean 0 and variance
$\sigma^2 I$ where I is the n by n identity matrix. Prove that
$$s^2 = \frac{\sum_{i=1}^{n} \hat{\epsilon}_i^2}{n-k} = \frac{\hat{\epsilon}'\hat{\epsilon}}{n-k}$$
is a consistent estimator of $\sigma^2$. (You can assume that $\operatorname{plim}\left(\frac{X'X}{n}\right) = \Sigma$, a positive definite,
symmetric matrix for any size n. Also recall that $\hat{\epsilon}$ is the least squares residual,
$\hat{\epsilon} = y - X\hat{\beta} = (I - X(X'X)^{-1}X')y = (I - X(X'X)^{-1}X')\epsilon$.)
30. We have nine observations in total, three for each of three college educated women (the first
three observations are for Callie, each from a different year; the next three for Bella, and the last
three for Lizzy, each from a different year). Regress their wage rates (Y) on three dummy
variables (with no constant in the model), so that the vector of independent variables (the D
matrix) looks like
$$D = \begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 1 \\
0 & 0 & 1
\end{pmatrix}$$
a) What do the predicted wages look like, that is, what is $P_D Y = D(D'D)^{-1}D'Y = ?$
b) What do the residuals look like in this model with just 9 observations, that is, what is
$M_D Y = (I - P_D)Y = ?$
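
The two projections in this question can be explored numerically with a small sketch like the one below. The wage values in `Y` are hypothetical numbers invented for illustration (the exam gives none), and the helper functions `transpose` and `matmul` are written here only to keep the example free of external libraries. The inverse is computed by taking reciprocals of the diagonal, which works because D'D is diagonal for this particular dummy matrix.

```python
# HYPOTHETICAL wage data: three yearly observations each for
# Callie, Bella, and Lizzy (values invented for illustration only).
Y = [10.0, 12.0, 14.0, 20.0, 21.0, 25.0, 30.0, 33.0, 36.0]

# D has one dummy column per woman and three rows per woman.
D = [[1 if person == i // 3 else 0 for person in range(3)] for i in range(9)]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

Dt = transpose(D)
DtD = matmul(Dt, D)  # diagonal matrix holding the group sizes (3, 3, 3)

# D'D is diagonal for this dummy matrix, so its inverse is the
# reciprocal of each diagonal entry.
DtD_inv = [[1.0 / DtD[i][j] if i == j else 0.0 for j in range(3)]
           for i in range(3)]

P = matmul(matmul(D, DtD_inv), Dt)  # projection matrix P_D
fitted = [row[0] for row in matmul(P, [[y] for y in Y])]   # P_D Y
residuals = [y - f for y, f in zip(Y, fitted)]             # M_D Y

print(fitted)
print(residuals)
```

Comparing `fitted` with the average of each woman's three wages, and checking what the residuals sum to within each woman's block, is a quick way to verify an answer derived by hand.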