Given name:____________________ Family name:___________________ Student #:______________________

BUEC 333 FINAL
Multiple Choice (2 points each)
1) Which of the following is not an assumption of the CLNRM?
a) The errors are uniformly distributed
b) The model is correctly specified
c) The independent variables are exogenous
d) The errors have mean zero
e) The errors have constant variance
2) In the regression model ln Yi = β0 + β1 ln Xi + εi:
a) β1 measures the elasticity of Y with respect to X
b) β1 measures the elasticity of X with respect to Y
c) β1 measures the percentage change in Y for a one unit change in X
d) the marginal effect of X on Y is constant
e) none of the above
3) If two random variables X and Y are independent:
a) their joint distribution equals the product of their conditional distributions
b) the posterior distribution of X given Y equals the marginal distribution of X
c) E[XY] = E[X]E[Y]
d) b and c
e) none of the above
4) The power of a test is the probability that you:
a) reject the null when it is true
b) reject the null when it is false
c) fail to reject the null when it is false
d) fail to reject the null when it is true
e) none of the above
5) R-squared is:
a) the residual sum of squares as a fraction of the total variation in the independent variable
b) the explained sum of squares as a fraction of the total sum of squares
c) one minus the answer in a)
d) one minus the answer in b)
e) none of the above
6) In the linear regression model, the least squares estimator:
a) maximizes the value of R2
b) minimizes the sum of squared residuals
c) features the smallest possible sample variance
d) all of the above
e) only a) and b)
7) If q is an unbiased estimator of Q, then:
a) Q is the mean of the sampling variance of q
b) q is the mean of the sampling distribution of Q
c) Var[q] = Var[Q] / n where n = the sample size
d) q = Q
e) none of the above
8) In the Capital Asset Pricing Model (CAPM):
a) β measures the sensitivity of the expected return of a portfolio to systematic risk
b) β measures the sensitivity of the expected return of a portfolio to specific risk
c) β is greater than one
d) α is less than zero
e) R2 is meaningless
9) In the regression specification Yi = β0 + β1Xi + εi, which of the following is a justification for
including epsilon?
a) it accounts for potential non-linearity in the functional form
b) it captures the influence of all omitted explanatory variables
c) it incorporates measurement error in Y
d) it reflects randomness in outcomes
e) all of the above
10) Suppose [L(X), U(X)] is a 95% confidence interval for a population mean. Which of the following
is/are true?
a) Pr[L(X) ≤ X ≤ U(X)] = 0.90
b) Pr[L(X) ≤ µ ≤ U(X)] = 0.95
c) Pr[X ≤ L(X)] + Pr[U(X) ≤ X] = 0.05
d) a and c
e) none of the above
11) Omitting a constant term from our regression will likely lead to:
a) higher R2 , higher F stat, and biased estimates of the independent variables when β0 ≠ 0
b) higher R2, lower F stat, and biased estimates of the independent variables when β0 ≠ 0
c) higher R2, lower F stat, and unbiased estimates of the independent variables when β0 ≠ 0
d) higher R2, higher F stat, and unbiased estimates of the independent variables when β0 ≠ 0
e) none of the above
12) In order for an independent variable to be labelled “exogenous” which of the following must be true:
a) E(εi) = 0
b) Cov(Xi,εi) = 0
c) Cov(εi,εj) = 0
d) Var(εi) = σ2
e) none of the above
13) Pure serial correlation:
a) relates to the persistence of errors in the regression model
b) can be detected with the RESET test statistic
c) is caused by mis-specification of the regression model
d) b and c
e) all of the above
14) The sampling variance of the slope coefficient in the regression model with one independent variable:
a) will be smaller when there is more variation in X
b) will be smaller when there is more variation in ε
c) will be larger when there is less variation in ε
d) will be larger when there is more co-variation in ε and X
e) none of the above
15) Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the following
is/are true?
[1] the variance of Q is zero
[2] if q is an unbiased estimator of Q, then q = E(Q)
[3] if q is an unbiased estimator of Q, then Q is the mean of the sampling distribution of q
[4] a 95% confidence interval for q contains Q with 95% probability
a) 1 only
b) 3 only
c) 1 and 3
d) 1, 2, and 3
e) 1, 2, 3, and 4
16) Suppose the monthly demand for tomatoes (a perishable good) in a small town is random. With
probability 1/2, demand is 50; with probability 1/2, demand is 100. You are the only producer of tomatoes
in this town. Tomatoes sell for a fixed price of $1, cost $0.50 to produce, and can only be sold in the local
market. If you produce 60 tomatoes, your expected profit is:
a) $15
b) $35
c) $55
d) $75
e) none of the above
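As a cross-check of the arithmetic in question 16 (this sketch is not part of the exam; all values come from the question itself), the expected profit can be computed directly:

```python
# Hypothetical check of question 16: produce 60 tomatoes, price $1,
# unit cost $0.50; demand is 50 or 100 with probability 1/2 each,
# and unsold tomatoes perish (no salvage value).
price, unit_cost, produced = 1.0, 0.50, 60

expected_profit = 0.0
for demand, prob in [(50, 0.5), (100, 0.5)]:
    sold = min(demand, produced)          # can sell at most what was produced
    profit = price * sold - unit_cost * produced
    expected_profit += prob * profit

print(expected_profit)  # 0.5*(50-30) + 0.5*(60-30) = 25.0
```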
17) The OLS estimator is said to be unbiased when:
a) Assumptions 1 through 3 are satisfied
b) Assumptions 1 through 6 are satisfied
c) Assumptions 1 through 3 are satisfied and errors are normally distributed
d) Assumptions 1 through 6 are satisfied and errors are normally distributed
e) all of the above
18) The RESET test is designed to detect problems associated with:
a) specification error of an unknown form
b) heteroskedasticity
c) multicollinearity
d) serial correlation
e) none of the above
19) The Durbin-Watson test is only valid:
a) with models that include an intercept
b) with models that include a lagged dependent variable
c) with models displaying multiple orders of autocorrelation
d) all of the above
e) none of the above
20) The consequences of multicollinearity are that the OLS estimates:
a) will be biased while the standard errors will remain unaffected
b) will be biased while the standard errors will be smaller
c) will be unbiased while the standard errors will remain unaffected
d) will be unbiased while the standard errors will be smaller
e) none of the above
Short Answer #1 (10 points)
Suppose you have observations on a dependent variable, Y, and an independent variable, X.
a) Provide a plot of your X and Y values with a regression line through the points that would
indicate the presence of heteroskedasticity in the errors of the regression model.
b) Is your graph above indicative of a model with pure heteroskedasticity or impure
heteroskedasticity? Discuss.
c) Explain the consequences of using OLS estimation if the errors terms in the regression model
are heteroskedastic.
a) Something like the following should suffice
b) Technically speaking, there is no way of determining if it is impure or pure heteroskedasticity we are
dealing with here. For full credit, an answer should discuss the difference between the two and
why it matters.
c) There will be three consequences:
i) OLS estimates remain unbiased, but only if the problem is pure heteroskedasticity; OLS
estimates will be biased if the problem is impure heteroskedasticity brought about by
correlated omitted variables.
ii) Even if unbiased, the sampling variance of the OLS estimator is inflated.
iii) Because of ii), the estimated value of the sampling variance—and consequently, the calculated
standard error—is wrong.
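A minimal simulation, assuming a made-up data-generating process (not part of the answer key), illustrating consequence i): under pure heteroskedasticity the OLS slope estimator remains centred on the true value even though the usual standard-error formula is no longer reliable.

```python
# Simulate y = 2 + 1*x + eps where sd(eps_i) grows with x_i (pure
# heteroskedasticity), and check that OLS slope estimates average
# out to the true slope across repeated samples.
import random
random.seed(0)

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

slopes = []
for _ in range(2000):
    x = [random.uniform(0, 10) for _ in range(100)]
    # error standard deviation proportional to X_i
    y = [2.0 + 1.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / len(slopes)
print(round(mean_slope, 2))  # centred near the true slope of 1.0
```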
Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Short Answer #2 (10 points)
Consider the following regression model:
Yi = β0 + β1Xi + εi.
Suppose you have 101 observations and know the following summary statistics:
∑(Yi − Ȳ)² = 45
∑(Xi − X̄)² = 50
∑ei² = 15,000
Cov(Xi, Yi) = 50
a) In the most general terms possible, what is the expression for a (1-α)% confidence interval for
the unknown population slope parameter? (2 points)
b) Using a critical value of 2.5, numerically construct a (1-α)% confidence interval for the
unknown population slope parameter. (4 points)
c) What is the precise interpretation of the confidence interval given in b)? (4 points)
a) This is simply the following:
Pr[β̂1 − t*α/2 × s.e.(β̂1) ≤ β1 ≤ β̂1 + t*α/2 × s.e.(β̂1)] = 1 − α
b) First, we need expressions for beta-hat and its standard error…
β̂1 = Cov(X, Y) / Var(X) = 50 / [∑(Xi − X̄)² / (n − 1)] = 50 / (50/100) = 100
Var̂[β̂1] = [∑ei² / (n − k − 1)] / ∑(Xi − X̄)² = (15,000 / 99) / 50 ≈ 3
s.e.(β̂1) = √Var̂[β̂1] ≈ √3 ≈ 1.75
Pr[100 − 2.5 × 1.75 ≤ β1 ≤ 100 + 2.5 × 1.75] = 1 − α
Pr[100 − 4.375 ≤ β1 ≤ 100 + 4.375] = 1 − α
Pr[95.625 ≤ β1 ≤ 104.375] = 1 − α
The answer need not be as precise as the last expression; full marks for correctly deriving the value of
beta-hat-one and its standard error as the square root of 3, as long as they appear in the right place in
the confidence interval.
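The arithmetic in part b) can be reproduced in a few lines, using only the summary statistics supplied in the question:

```python
# Numeric check of Short Answer #2 b): n = 101, one regressor,
# sum of squared X deviations = 50, sum of squared residuals = 15,000,
# Cov(X, Y) = 50, critical value 2.5.
import math

n, k = 101, 1
sum_sq_x = 50          # sum of (X_i - Xbar)^2
sum_sq_resid = 15_000  # sum of e_i^2
cov_xy = 50            # Cov(X_i, Y_i)
t_crit = 2.5

# beta1-hat = Cov(X, Y) / Var(X), where Var(X) = sum_sq_x / (n - 1)
beta1_hat = cov_xy / (sum_sq_x / (n - 1))

# estimated sampling variance and standard error of beta1-hat
var_beta1 = (sum_sq_resid / (n - k - 1)) / sum_sq_x
se_beta1 = math.sqrt(var_beta1)

lower = beta1_hat - t_crit * se_beta1
upper = beta1_hat + t_crit * se_beta1
print(beta1_hat, round(se_beta1, 3), round(lower, 2), round(upper, 2))
```

Note the exact standard error is √(3.03) ≈ 1.741, so the exact interval is slightly tighter than the one obtained with the rounded value 1.75.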
c) In repeated sampling, (1-α)% of confidence intervals constructed in this fashion
will include the true value of the population parameter beta.
Short Answer #3 (10 points)
Consider the following regression model:
Yi = β0 + β1X1i + β2X2i + εi.
Suppose you forget to include the variable X2 in the regression you estimate.
a) Derive an expression for the omitted variable bias resulting from your estimation.
b) If you could not obtain data on X2, what can you do to eliminate or diminish the omitted
variable bias?
a) The true DGP is
Yi = β0 + β1X1i + β2X2i + εi.
Instead, we estimate
Yi = β0 + β1X1i + εi*, where εi* = β2X2i + εi
Thus, we can derive the bias in the following way
Yi = β0 + β1X1i + εi* ⇒ Ȳ = β0 + β1X̄1 + ε̄*
β̂1 = ∑(X1i − X̄1)(Yi − Ȳ) / ∑(X1i − X̄1)²
β̂1 = ∑(X1i − X̄1)(β0 + β1X1i + εi* − β0 − β1X̄1 − ε̄*) / ∑(X1i − X̄1)²
β̂1 = ∑(X1i − X̄1)(β1(X1i − X̄1) + εi* − ε̄*) / ∑(X1i − X̄1)²
β̂1 = β1∑(X1i − X̄1)² / ∑(X1i − X̄1)² + ∑(X1i − X̄1)(εi* − ε̄*) / ∑(X1i − X̄1)²
β̂1 = β1 + ∑(X1i − X̄1)(εi* − ε̄*) / ∑(X1i − X̄1)²
E(β̂1) = β1 + E[∑(X1i − X̄1)(εi* − ε̄*) / ∑(X1i − X̄1)²]
So, the last term on the RHS can be thought of as the bias arising from omitting X2.
Partial credit for simply providing this last expression rather than formally deriving it.
b) Unfortunately, there is not a whole lot we can do in circumstances like these. There is the “easy” way
out: just add the omitted variable into the model, but we presumably would have done this in the
first place if it were possible. We can also include a “proxy” for the omitted variable instead,
where the proxy is something highly correlated with the omitted variable.
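As a hypothetical illustration of part a) (this simulation is not part of the original solutions, and the DGP parameters are invented), the bias expression can be checked numerically: when X2 is correlated with X1 and omitted, the slope estimate is pushed away from β1 by roughly β2 times the slope of X2 on X1.

```python
# Simulate y = 1 + 2*x1 + 3*x2 + eps with x2 correlated with x1,
# then regress y on x1 alone and watch the slope estimate absorb
# the omitted variable's effect.
import random
random.seed(1)

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
            / sum((a - xbar) ** 2 for a in x))

beta0, beta1, beta2 = 1.0, 2.0, 3.0
slopes = []
for _ in range(500):
    x1 = [random.gauss(0, 1) for _ in range(200)]
    x2 = [0.8 * a + random.gauss(0, 0.6) for a in x1]  # corr with x1
    y = [beta0 + beta1 * a + beta2 * b + random.gauss(0, 1)
         for a, b in zip(x1, x2)]
    slopes.append(ols_slope(x1, y))  # regression that omits x2

mean_slope = sum(slopes) / len(slopes)
print(round(mean_slope, 2))  # near beta1 + beta2*0.8 = 4.4, not 2.0
```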
Short Answer #4 (10 points)
Consider the simple univariate regression model,
Yi = β0 + β1Xi + εi.
Demonstrate that the sample regression line passes through the sample mean of both X and Y.
We estimate the linear regression model Yi = β0 + β1Xi + εi as
Yi = β̂0 + β̂1Xi + ei
We also know that
Ŷi = β̂0 + β̂1Xi
and that
β̂0 = Ȳ − β̂1X̄
We can evaluate this second expression when Xi = X̄ to prove the statement above,
Ŷi = β̂0 + β̂1X̄
Ŷi = Ȳ − β̂1X̄ + β̂1X̄ = Ȳ
So, by construction, the estimated regression line always passes through
the sample means when using OLS.
(Alternatively, you could start with the expression for Yi, consider its sum, and proceed to evaluate its
mean value.)
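A quick numeric check of the result, using an arbitrary made-up data set (not from the exam):

```python
# Fit OLS by hand on a small sample and verify the fitted line
# evaluated at Xbar recovers Ybar.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
beta1_hat = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
beta0_hat = ybar - beta1_hat * xbar  # intercept formula from the proof

# evaluating the fitted line at Xbar should give Ybar
fitted_at_mean = beta0_hat + beta1_hat * xbar
print(abs(fitted_at_mean - ybar) < 1e-9)  # True
```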
Short Answer #5 (10 points)
Consider the following regression model:
Yi = β0 + β1X1i + β2X2i + εi.
a) State the underlying assumptions for the classical linear regression model given above.
b) Which of these assumptions are necessary for our estimator to be unbiased and which are
necessary for it to be efficient?
c) Graphically illustrate the following assumptions: E(εi) = 0, Cov(εi,εj) = 0, and Var(εi) = σ2
d) Sometimes, the seventh assumption related to the normality of ε is used which implies that the
β’s are also normally distributed. But when we estimate via OLS, we always arrive at a
single number and not a distribution of values. Explain why this is the case.
a) The regression model is: a) linear in the coefficients, b) correctly specified, and c) has an additive
error term.
The error term has zero population mean or E(εi) = 0.
All independent variables are uncorrelated with the error term, or Cov(Xi,εi) = 0 for each independent
variable Xi (we say there is no endogeneity).
Errors are uncorrelated across observations, or Cov(εi,εj) = 0 for two observations i and j (we say there
is no serial correlation).
The error term has a constant variance, or Var(εi) = σ2 for every i (we say there is no heteroskedasticity).
No independent variable is a perfect linear function of any other independent variable (we say there is no
perfect collinearity).
b) Of the assumptions listed above the first three are required for unbiasedness. Four through six are
necessary for the OLS estimator to be efficient.
c) Something along the lines of the following should suffice:
d) This reflects the fact that our OLS estimates come from one particular set of data (i.e., one sample of
observations). Thus, there is only one number attached to a particular estimate of our population
parameter of interest. We also expect to generate different results in OLS whenever we change the sample
(that is, when we have different observations with different values for our variables…). The result on the
normality of the betas under the seventh assumption reflects this fundamental fact: repeated random
sampling will result in a whole distribution of values for the estimates of the population parameter of
interest.
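The point in part d) can be illustrated with a simulation (a sketch under an assumed DGP, not part of the answer key): any one sample yields a single OLS number, but repeated sampling traces out the whole sampling distribution of the estimator.

```python
# Draw many independent samples from the same DGP; each sample gives
# one OLS slope estimate, and together they form a distribution
# centred near the true parameter.
import random
random.seed(42)

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
            / sum((a - xbar) ** 2 for a in x))

true_beta1 = 1.5
estimates = []
for _ in range(1000):  # 1000 independent samples
    x = [random.gauss(0, 1) for _ in range(50)]
    y = [0.5 + true_beta1 * xi + random.gauss(0, 1) for xi in x]
    estimates.append(ols_slope(x, y))

# each entry is a single number; the collection is the sampling
# distribution of the estimator
mean_est = sum(estimates) / len(estimates)
print(round(mean_est, 2))
```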
Short Answer #6 (10 points)
Consider the following set of results for a log-log specification of NHL salaries on two independent
variables, age and points.
For the following statistical tests, specify what the null hypothesis of the relevant test is and provide the
appropriate interpretation given the results above:
a) the t test associated with the independent variable “points” (use a critical value of 2.58)
b) the F test associated with “age” and “points” in combination (use a critical value of 4.61)
c) the RESET test using the F statistic (use a critical value of 6.64)
d) the Durbin-Watson test (use a lower critical value of 1.55 and upper critical value of 1.80)
a) H0: βPOINTS = 0 versus H1: βPOINTS ≠ 0
Since the test statistic of 21.92 is so much larger (in absolute value) than the critical value, it is unlikely
that the null is true, so we consequently reject it and regard the coefficient on points as being statistically
significant.
b) H0: β1 = β2 = ... = βk = 0 versus H1 : at least one βj ≠ 0, where j = 1, 2, ... , k
Since the test statistic of 315.90 is so much larger (in absolute value) than the critical value, it is unlikely
that the null is true, so we consequently reject it and consider that collectively our independent variables
are important in explaining the variation observed in our dependent variable…provided the errors are
normal!
c) The null hypothesis in this case is one of correct specification, in particular that of potential omitted
variables. Technically speaking, we are evaluating the joint significance of the coefficients for all the
powers (greater than one) of the predicted value of Y. Thus, the RESET test suggests that we fail to reject
the null hypothesis of having no missing variables. Unfortunately, it gives us no further indication of how
to deal with this problem.
d) The null hypothesis in this case is no positive autocorrelation. Thus, the Durbin-Watson test suggests
that we reject the null hypothesis of no positive autocorrelation, suggesting instead we very likely have
problems with serial correlation in this specification.
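As a sketch of the mechanics behind part d) (the residuals below are invented for illustration; the exam's actual regression output is not reproduced), the Durbin-Watson statistic and its decision rule can be computed as:

```python
# Durbin-Watson statistic: d = sum_{t=2..T}(e_t - e_{t-1})^2 / sum e_t^2.
# These residuals are made up; they trend slowly, so successive values
# are close together and d comes out far below 2.
residuals = [0.5, 0.7, 0.6, 0.9, 1.1, 0.8, 1.0, 1.2, 0.9, 1.1]

num = sum((residuals[t] - residuals[t - 1]) ** 2
          for t in range(1, len(residuals)))
den = sum(e ** 2 for e in residuals)
d = num / den
print(round(d, 3))  # well below 2: symptomatic of positive autocorrelation

# decision rule with the question's bounds d_L = 1.55, d_U = 1.80
if d < 1.55:
    print("reject H0: evidence of positive autocorrelation")
elif d > 1.80:
    print("fail to reject H0")
else:
    print("inconclusive")
```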
Useful Formulas:
µX = E(X) = ∑ pi xi
σX² = Var(X) = E[(X − µX)²] = ∑ (xi − µX)² pi
Pr(X = x) = ∑i Pr(X = x, Y = yi)
Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)
E(Y) = ∑i E(Y | X = xi) Pr(X = xi)
E(Y | X = x) = ∑i yi Pr(Y = yi | X = x)
Var(Y | X = x) = ∑i [yi − E(Y | X = x)]² Pr(Y = yi | X = x)
E(a + bX + cY) = a + bE(X) + cE(Y)
Var(a + bY) = b²Var(Y)
σXY = Cov(X, Y) = ∑i ∑j (xj − µX)(yi − µY) Pr(X = xj, Y = yi)
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2abCov(X, Y)
Corr(X, Y) = ρXY = Cov(X, Y) / √[Var(X)Var(Y)]
Cov(a + bX + cV, Y) = bCov(X, Y) + cCov(V, Y)
E(Y²) = Var(Y) + E(Y)²
E(XY) = Cov(X, Y) + E(X)E(Y)
X̄ ~ N(µ, σ²/n)
Z = (X̄ − µ) / (σ/√n)
t = (X̄ − µ) / (s/√n)
X̄ = (1/n) ∑ xi
sX² = [1/(n − 1)] ∑ (xi − x̄)²
sXY = [1/(n − 1)] ∑ (xi − x̄)(yi − ȳ)
rXY = sXY / (sX sY)
For the linear regression model Yi = β0 + β1Xi + εi:
β̂1 = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)² and β̂0 = Ȳ − β̂1X̄
Ŷi = β̂0 + β̂1X1i + β̂2X2i + … + β̂kXki
R² = ESS/TSS = (TSS − RSS)/TSS = 1 − RSS/TSS = 1 − ∑ei² / ∑(Yi − Ȳ)²
R̄² = 1 − [∑ei² / (n − k − 1)] / [∑(Yi − Ȳ)² / (n − 1)]
s² = ∑ei² / (n − k − 1), where E[s²] = σ²
Var̂[β̂1] = [∑ei² / (n − k − 1)] / ∑(Xi − X̄)²
t = (β̂1 − βH) / s.e.(β̂1) ~ tn−k−1
Z = (β̂j − βH) / √Var[β̂j] ~ N(0, 1)
Pr[β̂j − t*α/2 × s.e.(β̂j) ≤ βj ≤ β̂j + t*α/2 × s.e.(β̂j)] = 1 − α
d = ∑t=2..T (et − et−1)² / ∑t=1..T et² ≈ 2(1 − ρ)
F = (ESS/k) / [RSS/(n − k − 1)] = (ESS/RSS) × (n − k − 1)/k