
Given name:____________________
Student #:______________________
Family name:___________________
Section #:______________________
BUEC 333 MIDTERM
Multiple Choice (2 points each)
1.) Suppose you have a random sample of 10 observations from a normal distribution with mean = 10 and
variance = 2. The sample mean (x-bar) is 8 and the sample variance is 3. The sampling distribution of x-bar has
a.) mean 8 and variance 3
b.) mean 8 and variance 0.3
c.) mean 10 and variance 3
d.) mean 10 and variance 0.3
e.) none of the above
2.) The central limit theorem tells us that the sampling distribution of the sample mean:
a.) approaches normality as the sample size increases
b.) is always normal
c.) is always normal in large samples
d.) is normal in Monte Carlo simulations
e.) none of the above
3.) Suppose upon running a regression, EViews reports a value of the residual sum of squares as 1000 and
an R² of 0.80. What is the value of the explained sum of squares in this case?
a.) 444.44
b.) 800
c.) 1000
d.) 4000
e.) none of the above as it is incalculable
4.) In the linear regression model, the stochastic error term:
a.) measures the difference between the dependent variable and its predicted value
b.) measures the difference between the independent variable and its predicted value
c.) is unbiased
d.) a and c
e.) none of the above
5.) The distribution of X when Y is not known is called _____ distribution of X, and is written as _____.
These blanks are best filled with the following
a.) conditional, p(X)
b.) conditional, p(X|Y)
c.) marginal, p(X)
d.) marginal, p(X|Y)
e.) none of the above
6.) The significance level of a test is the probability that you:
a.) fail to reject the null when it is false
b.) fail to reject the null when it is true
c.) reject the null when it is false
d.) reject the null when it is true
e.) none of the above
7.) Which of the following is not an assumption of the CLRM?
a.) The model is correctly specified
b.) The independent variables are exogenous
c.) The errors are normally distributed
d.) The errors have mean zero
e.) The errors have constant variance
8.) Suppose you have the following information about the cdf of a random variable X, which takes one of
4 possible values:
Value of X    CDF
    1         0.25
    2         0.40
    3         0.75
    4
Which of the following is/are true?
a.) Pr(X = 2) = 0.4
b.) E(X) = 2.6
c.) Pr(X = 4) = 0.2
d.) all of the above
e.) none of the above
9.) The law of large numbers says that:
a.) the sample mean is a biased estimator of the population mean in small samples
b.) the sampling distribution of the sample mean approaches a normal distribution as the
sample size approaches infinity
c.) the behaviour of large populations is well approximated by the average
d.) the sample mean is an unbiased estimator of the population mean in large samples
e.) none of the above
10.) A negative covariance between X and Y means that whenever we obtain a value of X that is greater
than the mean of X
a.) we will have a greater than 50% chance of obtaining a corresponding value of Y which is
greater than the mean of Y
b.) we will have a greater than 50% chance of obtaining a corresponding value of Y which is
smaller than the mean of Y
c.) we will obtain a corresponding value of Y which is greater than the mean of Y
d.) we will obtain a corresponding value of Y which is smaller than the mean of Y
e.) none of the above
11.) The Gauss-Markov Theorem says that when the 6 classical assumptions are satisfied:
a.) The least squares estimator is unbiased
b.) The least squares estimator has the smallest variance of all linear estimators
c.) The least squares estimator has an approximately normal sampling distribution
d.) The least squares estimator is consistent
e.) None of the above
12.) Suppose a random variable X can take two possible values, zero or one, with equal probability.
Which of the following is/are true?
a.) E(X) = 0, Var(X) = 1
b.) E(X) = ½, Var(X) = ¼
c.) E(X) = ½, Var(X) = ½
d.) E(X) = 1, Var(X) = 1
e.) None of the above
13.) The OLS estimator of the variance of the slope coefficient in the regression model with one
independent variable:
a.) will be smaller when there is more variation in ei
b.) will be smaller when there are fewer observations
c.) will be smaller when there is less variation in X
d.) will be smaller when there are more independent variables
e.) none of the above
14.) Suppose you draw a random sample of n observations, X1, X2, …, Xn, from a population with
unknown mean μ. Which of the following estimators of μ is/are biased?
a.) the first observation you sample, X1
b.) X̄/2
c.) X̄ + s/√n
d.) b and c
e.) a, b, and c
15.) In the regression specification, Yi = β0 + β1Xi + εi, which of the following is a justification for
including epsilon?
a.) it accounts for potential non-linearity in the functional form
b.) it captures the influence of all omitted explanatory variables
c.) it incorporates measurement error in Y
d.) it reflects randomness in outcomes
e.) all of the above
Short Answer #1 (10 points – show your work!)
Consider the standard univariate regression model:
Yi = β0 + β1Xi + εi
Suppose you also know the following: β0 = 0. Derive the least squares estimator.
As always, we first have to define our residual as the difference between that which is observed and that
which is predicted by the regression. In this way, the residual is best thought of as a prediction error, that
is, something we would like to make as small as possible. And since β0 = 0, we have

ei = Yi − β̂1 Xi
Next, we need to define a minimization problem. Because our residuals will likely be both positive and
negative, simply considering their sum is unsatisfactory as these will tend to cancel one another out.
Additionally, minimizing the sum of residuals does not generally yield a unique answer. A better way
forward is to minimize the sum of the squared “prediction errors,” which yields a unique answer and
penalizes us more heavily for making big errors.
min over β̂1:   Σi ei²  =  Σi (Yi − β̂1 Xi)²  =  Σi (Yi² − 2β̂1 Xi Yi + β̂1² Xi²),   sums over i = 1, …, n
Now, we must take the derivative of the sum of squared residuals with respect to β̂1 and set it equal
to zero. This first-order condition establishes the value of β̂1 for which the sum of squared
residuals “bottoms out” and is, thus, minimized.
∂(Σi ei²)/∂β̂1  =  −2 Σi Yi Xi + 2β̂1 Σi Xi²  =  0
Finally, we must solve for the value of β̂1 consistent with this first-order condition, thus yielding
our least squares estimator.
−Σi Yi Xi + β̂1 Σi Xi²  =  0

β̂1 Σi Xi²  =  Σi Yi Xi

β̂1  =  Σi Yi Xi / Σi Xi²
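As a sanity check, the estimator can be verified numerically. The sketch below uses made-up data (the X and Y values are illustrative only, not from the exam) and confirms that the sum of squared residuals bottoms out at the closed-form value.

```python
# With beta_0 = 0, the derivation above gives beta1_hat = sum(Xi*Yi) / sum(Xi^2).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]  # hypothetical observations, roughly Y = 2X

# Closed-form estimator from the first-order condition
beta1_hat = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)

# Brute-force check: the sum of squared residuals should "bottom out" at beta1_hat
def ssr(b):
    return sum((y - b * x) ** 2 for x, y in zip(X, Y))

grid = [beta1_hat + step / 1000 for step in range(-50, 51)]
best = min(grid, key=ssr)
print(beta1_hat, best)  # the two coincide: SSR is minimized at beta1_hat
```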
Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Short Answer #2 (20 points – show your work!)
For a homework assignment on sampling you are asked to program a computer to do the following:
i.) Randomly draw 25 values from a standard normal distribution.
ii.) Multiply each of these values by 10 and add 5.
iii.) Take the average of these 25 values and call it A1.
iv.) Repeat this procedure to obtain 500 such averages and call them A1 through A500.
a.) What is your best guess as to the value of A1? Explain your answer.
b.) What is the value of the variance associated with A1? Explain your answer.
c.) If you were to compute the average of the 500 A values, what should it be approximately equal to?
d.) If you were to compute the sampling variance of these 500 A values, what should it be approximately
equal to?
a.) A standard normal random variable is one for which the expected value is zero and the variance is
equal to one, so we have Z ~ N(0,1), where Z is one of the draws in i.). But in ii.), we are applying the
following transformation: X = 5 + 10·Z. We know from our formula sheet that

E(a + bX + cY) = a + bE(X) + cE(Y).

It stands to reason then that our best guess (or expected value) of A1 = 5 + 10·E(Z) = 5.
b.) Likewise, we know from the formula sheet that

Var(a + bY) = b²·Var(Y)

It stands to reason then that the value of the variance associated with A1 = 10²·Var(Z) = 100.
c.) We know that

X̄ ~ N(μ, σ²/n)
That is, the sample mean should have an expected value equal to the mean in the population of our
random variable. In this case, this is nothing more than the value reported in a.) above. So, in this case,
the average of the averages should be approximately equal to 5.
d.) From part c.), we know that the sample mean has a sampling variance equal to the variance of the
underlying random variable divided by our sample size. We have calculated the numerator above as 100
and in applying n = 25, we find that the sampling variance should be approximately equal to 100/25 or 4.
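The procedure in i.)–iv.) can also be simulated directly; a minimal sketch using only the standard library (the seed and variable names are my own choices):

```python
# Monte Carlo sketch: 500 averages of 25 transformed standard normal draws.
import random
from statistics import mean, variance

random.seed(0)  # arbitrary seed for reproducibility

A = []
for _ in range(500):                        # iv.) repeat to get A1 ... A500
    draws = [5 + 10 * random.gauss(0, 1)    # i.)-ii.) X = 5 + 10*Z, Z ~ N(0,1)
             for _ in range(25)]
    A.append(mean(draws))                   # iii.) average the 25 values

print(mean(A))      # should be close to 5          (part c)
print(variance(A))  # should be close to 100/25 = 4 (part d)
```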
Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Short Answer #3 (20 points – show your work!)
In your first job as a forecaster, you have the following information on the joint probability of having an
expansion or a contraction in GDP with high, medium, or low inflation.
                    Expansion    Contraction
High inflation        0.40          0.05
Medium inflation      0.20          0.10
Low inflation                       0.20
a.) What is the probability of having low inflation during an expansion? Explain your answer.
b.) If you knew nothing about the state of the economy, what is the probability of having high inflation?
Explain your answer.
c.) If we are in a contraction, what would be the probability of high inflation? Explain your answer.
d.) In answering c.), what type of assumptions have you made about the independence of expansions and
inflation?
e.) If high inflation means 10%, medium inflation means 5%, and low inflation means 2%, what is the
expected value of inflation?
a.) The sum of probabilities must be one. That is, x + 0.40 + 0.20 + 0.20 + 0.10 + 0.05 = 1 implies
P(Expansion, Low inflation) = 0.05.
b.) 0.45. This is nothing more than the marginal probability of high inflation or the summation of 0.40
and 0.05.
c.) This relates to the conditional probability which we calculate as the joint probability of a contraction
and high inflation relative to the marginal probability of a contraction, or 0.05/0.35 ≈ 0.14.
d.) For the conditional to be equal to the ratio of the joint to the marginal, we are not assuming the two
variables are independent. We could exploit the fact that independence implies
Pr(Y = y | X = x) = Pr(Y = y) and Pr(Y = y, X = x) = Pr(Y = y) · Pr(X = x), but the expression for the
conditional probability we use does not require this.
e.) We form the expected value by weighting the outcomes (inflation rates) by their respective
probabilities (given by the marginal probabilities of high, medium, and low inflation). This is
equal to 0.10*0.45 + 0.05*0.30 + 0.02*0.25 = 0.045 + 0.015 + 0.005 = 0.065, or 6.5%.
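The calculations in a.)–e.) can be reproduced with a few lines of Python; the dictionary layout and variable names below are my own, while the table values come from the question.

```python
# Joint probability table: (state of economy, inflation level) -> probability.
joint = {
    ("expansion", "high"): 0.40,   ("contraction", "high"): 0.05,
    ("expansion", "medium"): 0.20, ("contraction", "medium"): 0.10,
    ("contraction", "low"): 0.20,
}

# a.) the missing cell: all probabilities must sum to one
joint[("expansion", "low")] = 1 - sum(joint.values())

# b.) marginal probability of high inflation
p_high = sum(p for (state, infl), p in joint.items() if infl == "high")

# c.) conditional probability: joint over the marginal of a contraction
p_contraction = sum(p for (state, infl), p in joint.items() if state == "contraction")
p_high_given_contraction = joint[("contraction", "high")] / p_contraction

# e.) expected inflation, weighting rates by their marginal probabilities
rates = {"high": 0.10, "medium": 0.05, "low": 0.02}
expected_inflation = sum(rates[infl] * p for (state, infl), p in joint.items())

print(round(joint[("expansion", "low")], 2),  # 0.05
      round(p_high, 2),                       # 0.45
      round(p_high_given_contraction, 2),     # 0.14
      round(expected_inflation, 3))           # 0.065
```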
Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Short Answer #4 (20 points – show your work!)
Suppose you want to explain incomes in Canada for workers in their prime working years. To answer this
question, you randomly sample 10,000 full-time workers age 30-50. Using these data, you estimate the
regression:
Yi = β0 + β1X1i + β2X2i + β3X3i + β4X4i + εi
Yi = average annual earnings in Canadian dollars of worker i
X1i = total work experience of worker i in years
X2i = work experience with current employer of worker i in years
X3i = amount of education of worker i in years
X4i = age of worker i in years
a.) What do you suppose would be the direction of the bias on the estimate of β1 if age were omitted from
the regression? Explain your answer.
b.) What do you suppose would be the direction of the bias on the estimate of β1 if education were
omitted from the regression? Explain your answer.
c.) Suppose you ran the regression including all four variables and generate the following results:
β̂0 = 23,465.53   β̂1 = 457.82   β̂2 = 128.93   β̂3 = 1277.88   β̂4 = 103.90
What does the value of 103.90 for the last coefficient mean?
d.) For the same regression, the 95% confidence interval for β1 is reported as (-88.27, 1003.91). Provide
the correct interpretation of this range.
e.) Suppose you want to estimate the increase in income expected by a worker staying with the same firm
for another year. Explain how you would do this.
a.) Age and experience probably both have positive coefficients and are undoubtedly positively
correlated. So the bias will be upward. That is, if we omit age, experience will get “credit” for itself and
the contribution of age.
b.) Education and experience probably both have positive coefficients but are not unambiguously related
to one another with respect to their correlation. For workers early in their career, they may be negatively
related, i.e., another year in school comes at the expense of a potential year of experience. For workers
late in their career, the experience continues to grow while their education likely remains constant. This
suggests that if we omit education, there may be a downward bias. (Answers may vary here)
c.) It means that, holding constant the variation in work experience (both measures) and education, we
expect income to increase by $103.90 for each additional year of a worker's age.
d.) There is a 95% probability that confidence intervals constructed in this fashion will include the true
value of the population parameter beta-one. (Interestingly enough, the fact that it extends into negative
values also suggests that we will fail to reject the null hypothesis at the 5% level that the true coefficient
is equal to zero, but this nugget of information is not necessary for answering the question).
e.) Both total work experience and work experience with current employer increase here, so the estimate
we want is the sum of the slope estimates on these two explanatory variables, or 457.82 + 128.93 =
$586.75. This is the cumulative predicted effect of staying on for an additional year with the same
employer.
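The omitted-variable logic in a.) can be illustrated with a small simulation. Everything below (the "true" coefficients, the correlation structure between age and experience, and the noise level) is assumed for illustration only and is not from the exam.

```python
# Sketch: when age raises income and is positively correlated with experience,
# regressing income on experience alone biases the experience slope upward.
import random
random.seed(1)  # arbitrary seed

n = 10_000
beta1, beta4 = 450.0, 100.0  # assumed "true" effects of experience and age

age = [random.uniform(30, 50) for _ in range(n)]
# experience is built to be positively correlated with age
exper = [0.6 * a - 15 + random.gauss(0, 3) for a in age]
income = [beta1 * x1 + beta4 * x4 + random.gauss(0, 5000)
          for x1, x4 in zip(exper, age)]

def slope(x, y):
    """OLS slope from the formula sheet: sum((x-xbar)(y-ybar)) / sum((x-xbar)^2)."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

b_short = slope(exper, income)
print(b_short)  # exceeds the true 450: experience gets "credit" for omitted age
```

Part b.) could be explored the same way by making the correlation between education and experience negative and observing that the short-regression slope falls below the true value.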
Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Useful Formulas:
E(X) = Σi pi xi

Var(X) = E[(X − μX)²] = Σi (xi − μX)² pi

Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)

Pr(X = x) = Σ(i=1..k) Pr(X = x, Y = yi)

E(Y) = Σ(i=1..m) E(Y | X = xi) Pr(X = xi)

E(Y | X = x) = Σ(i=1..k) yi Pr(Y = yi | X = x)

Var(Y | X = x) = Σ(i=1..k) [yi − E(Y | X = x)]² Pr(Y = yi | X = x)

E(a + bX + cY) = a + bE(X) + cE(Y)

Cov(X, Y) = Σ(i=1..k) Σ(j=1..m) (xj − μX)(yi − μY) Pr(X = xj, Y = yi)

Corr(X, Y) = ρXY = Cov(X, Y) / √[Var(X) Var(Y)]

Cov(a + bX + cV, Y) = b Cov(X, Y) + c Cov(V, Y)

E(XY) = Cov(X, Y) + E(X) E(Y)

Var(a + bY) = b² Var(Y)

Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)

E(Y²) = Var(Y) + [E(Y)]²

X̄ = (1/n) Σ(i=1..n) xi

s² = [1/(n − 1)] Σ(i=1..n) (xi − x̄)²

sXY = [1/(n − 1)] Σ(i=1..n) (xi − x̄)(yi − ȳ)

rXY = sXY / (sX sY)

X̄ ~ N(μ, σ²/n)

t = (X̄ − μ) / (s/√n)

For the linear regression model Yi = β0 + β1Xi + εi:
β̂1 = Σ(i=1..n) (Xi − X̄)(Yi − Ȳ) / Σ(i=1..n) (Xi − X̄)²   and   β̂0 = Ȳ − β̂1 X̄

Ŷi = β̂0 + β̂1 X1i + β̂2 X2i + … + β̂k Xki

R² = ESS / TSS = 1 − RSS / TSS,   where TSS = Σ(Yi − Ȳ)²,   RSS = Σ ei²,   ESS = TSS − RSS

R̄² = 1 − [Σ ei² / (n − k − 1)] / [Σ(Yi − Ȳ)² / (n − 1)]

s² = Σ ei² / (n − k − 1),   where E(s²) = σ²

Var(β̂1) = σ² / Σ(Xi − X̄)²

Z = (β̂j − βH) / √Var(β̂j) ~ N(0, 1)

t = (β̂1 − βH) / s.e.(β̂1) ~ t(n − k − 1)

Pr[β̂j − t*(α/2) s.e.(β̂j) ≤ βj ≤ β̂j + t*(α/2) s.e.(β̂j)] = 1 − α

F = (ESS / k) / [RSS / (n − k − 1)] = ESS (n − k − 1) / (RSS k)

d = Σ(t=2..T) (et − et−1)² / Σ(t=1..T) et² ≈ 2(1 − ρ)