Given name: ____________________ Family name: ____________________
Student #: ______________________ Section #: ______________________

BUEC 333 MIDTERM

Multiple Choice (2 points each)

1.) Suppose you have a random sample of 10 observations from a normal distribution with mean = 10 and variance = 2. The sample mean (x̄) is 8 and the sample variance is 3. The sampling distribution of x̄ has:
a.) mean 8 and variance 3
b.) mean 8 and variance 0.3
c.) mean 10 and variance 3
d.) mean 10 and variance 0.3
e.) none of the above

2.) The central limit theorem tells us that the sampling distribution of the sample mean:
a.) approaches normality as the sample size increases
b.) is always normal
c.) is always normal in large samples
d.) is normal in Monte Carlo simulations
e.) none of the above

3.) Suppose upon running a regression, EViews reports a value of the residual sum of squares of 1000 and an R² of 0.80. What is the value of the explained sum of squares in this case?
a.) 444.44
b.) 800
c.) 1000
d.) 4000
e.) none of the above, as it is incalculable

4.) In the linear regression model, the stochastic error term:
a.) measures the difference between the dependent variable and its predicted value
b.) measures the difference between the independent variable and its predicted value
c.) is unbiased
d.) a and c
e.) none of the above

5.) The distribution of X when Y is not known is called the _____ distribution of X, and is written as _____. These blanks are best filled with the following:
a.) conditional, p(X)
b.) conditional, p(X|Y)
c.) marginal, p(X)
d.) marginal, p(X|Y)
e.) none of the above

6.) The significance level of a test is the probability that you:
a.) fail to reject the null when it is false
b.) fail to reject the null when it is true
c.) reject the null when it is false
d.) reject the null when it is true
e.) none of the above

7.) Which of the following is not an assumption of the CLRM?
a.) The model is correctly specified
b.) The independent variables are exogenous
c.)
The errors are normally distributed
d.) The errors have mean zero
e.) The errors have constant variance

8.) Suppose you have the following information about the cdf of a random variable X, which takes one of 4 possible values:

Value of X   cdf
    1        0.25
    2        0.40
    3        0.75
    4

Which of the following is/are true?
a.) Pr(X = 2) = 0.4
b.) E(X) = 2.6
c.) Pr(X = 4) = 0.2
d.) all of the above
e.) none of the above

9.) The law of large numbers says that:
a.) the sample mean is a biased estimator of the population mean in small samples
b.) the sampling distribution of the sample mean approaches a normal distribution as the sample size approaches infinity
c.) the behaviour of large populations is well approximated by the average
d.) the sample mean is an unbiased estimator of the population mean in large samples
e.) none of the above

10.) A negative covariance between X and Y means that whenever we obtain a value of X that is greater than the mean of X:
a.) we will have a greater than 50% chance of obtaining a corresponding value of Y which is greater than the mean of Y
b.) we will have a greater than 50% chance of obtaining a corresponding value of Y which is smaller than the mean of Y
c.) we will obtain a corresponding value of Y which is greater than the mean of Y
d.) we will obtain a corresponding value of Y which is smaller than the mean of Y
e.) none of the above

11.) The Gauss-Markov Theorem says that when the 6 classical assumptions are satisfied:
a.) The least squares estimator is unbiased
b.) The least squares estimator has the smallest variance of all linear estimators
c.) The least squares estimator has an approximately normal sampling distribution
d.) The least squares estimator is consistent
e.) None of the above

12.) Suppose a random variable X can take two possible values, zero or one, with equal probability. Which of the following is/are true?
a.) E(X) = 0, Var(X) = 1
b.) E(X) = ½, Var(X) = ¼
c.) E(X) = ½, Var(X) = ½
d.) E(X) = 1, Var(X) = 1
e.) None of the above

13.)
The OLS estimator of the variance of the slope coefficient in the regression model with one independent variable:
a.) will be smaller when there is more variation in e_i
b.) will be smaller when there are fewer observations
c.) will be smaller when there is less variation in X
d.) will be smaller when there are more independent variables
e.) none of the above

14.) Suppose you draw a random sample of n observations, X1, X2, …, Xn, from a population with unknown mean μ. Which of the following estimators of μ is/are biased?
a.) the first observation you sample, X1
b.) X̄²
c.) X̄ + s/√n
d.) b and c
e.) a, b, and c

15.) In the regression specification Y_i = β_0 + β_1 X_i + ε_i, which of the following is a justification for including the error term ε?
a.) it accounts for potential non-linearity in the functional form
b.) it captures the influence of all omitted explanatory variables
c.) it incorporates measurement error in Y
d.) it reflects randomness in outcomes
e.) all of the above

Short Answer #1 (10 points – show your work!)

Consider the standard univariate regression model:

Y_i = β_0 + β_1 X_i + ε_i

Suppose you also know the following: β_0 = 0. Derive the least squares estimator.

As always, we first have to define our residual as the difference between that which is observed and that which is predicted by the regression. In this way, the residual is best thought of as a prediction error, that is, something we would like to make as small as possible. And since β_0 = 0, we have

e_i = Y_i − β̂_1 X_i

Next, we need to define a minimization problem. Because our residuals will likely be both positive and negative, simply considering their sum is unsatisfactory, as these will tend to cancel one another out. Additionally, minimizing the sum of residuals does not generally yield a unique answer. A better way forward is to minimize the sum of the squared "prediction errors," which will definitely yield a unique answer and which will penalize us for making big errors.
min over β̂_1:  Σ_{i=1}^n e_i² = Σ_{i=1}^n (Y_i − β̂_1 X_i)² = Σ_{i=1}^n (Y_i² − 2 β̂_1 X_i Y_i + β̂_1² X_i²)

Now, we must take the derivative of the sum of squared residuals with respect to β̂_1 and set it equal to zero. This first order condition establishes the value of β̂_1 for which the sum of squared residuals "bottoms out" and is, thus, minimized.

∂(Σ_{i=1}^n e_i²)/∂β̂_1 = −2 Σ_{i=1}^n Y_i X_i + 2 β̂_1 Σ_{i=1}^n X_i² = 0

Finally, we must solve for the value of β̂_1 consistent with this first order condition, thus yielding our least squares estimator:

−Σ_{i=1}^n Y_i X_i + β̂_1 Σ_{i=1}^n X_i² = 0

β̂_1 Σ_{i=1}^n X_i² = Σ_{i=1}^n Y_i X_i

β̂_1 = Σ_{i=1}^n Y_i X_i / Σ_{i=1}^n X_i²

Short Answer #2 (20 points – show your work!)

For a homework assignment on sampling you are asked to program a computer to do the following:
i.) Randomly draw 25 values from a standard normal distribution.
ii.) Multiply each of these values by 10 and add 5.
iii.) Take the average of these 25 values and call it A1.
iv.) Repeat this procedure to obtain 500 such averages and call them A1 through A500.

a.) What is your best guess as to the value of A1? Explain your answer.
b.) What is the value of the variance associated with A1? Explain your answer.
c.) If you were to compute the average of the 500 A values, what should it be approximately equal to?
d.) If you were to compute the sampling variance of these 500 A values, what should it be approximately equal to?

a.) A standard normal random variable is one for which the expected value is zero and the variance is equal to one, so we have Z ~ N(0, 1), where Z is one of the draws in i.). But in ii.), we are applying the following transformation: X = 5 + 10·Z. We know from our formula sheet that E(a + bX + cY) = a + bE(X) + cE(Y). It stands to reason then that our best guess (or expected value) of A1 = 5 + 10·E(Z) = 5.

b.)
Likewise, we know from the formula sheet that Var(a + bY) = b²·Var(Y). It stands to reason then that the value of the variance associated with A1 = 10²·Var(Z) = 100.

c.) We know that

X̄ ~ N(μ, σ²/n)

That is, the sample mean should have an expected value equal to the population mean of our random variable. In this case, this is nothing more than the value reported in a.) above. So, in this case, the average of the averages should be approximately equal to 5.

d.) From part c.), we know that the sample mean has a sampling variance equal to the variance of the underlying random variable divided by our sample size. We calculated the numerator above as 100, and applying n = 25, we find that the sampling variance should be approximately equal to 100/25, or 4.

Short Answer #3 (20 points – show your work!)

In your first job as a forecaster, you have the following information on the joint probability of having an expansion or a contraction in GDP with high, medium, or low inflation.

                   Expansion   Contraction
High inflation       0.40         0.05
Medium inflation     0.20         0.10
Low inflation                     0.20

a.) What is the probability of having low inflation during an expansion? Explain your answer.
b.) If you knew nothing about the state of the economy, what is the probability of having high inflation? Explain your answer.
c.) If we are in a contraction, what would be the probability of high inflation? Explain your answer.
d.) In answering c.), what type of assumptions have you made about the independence of expansions and inflation?
e.) If high inflation means 10%, medium inflation means 5%, and low inflation means 2%, what is the expected value of inflation?

a.) The sum of the probabilities must be one. That is, x + 0.40 + 0.20 + 0.20 + 0.10 + 0.05 = 1 implies P(Expansion, Low inflation) = 0.05.

b.) 0.45. This is nothing more than the marginal probability of high inflation, or the sum of 0.40 and 0.05.
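The bookkeeping in parts a.) and b.) can be checked with a short script. This is a sketch, not part of the exam: the array layout and the use of NumPy are my own choices.

```python
import numpy as np

# Joint probabilities Pr(inflation, state).
# Rows: high, medium, low inflation; columns: expansion, contraction.
# The low-inflation/expansion cell is the unknown x from part a).
joint = np.array([[0.40, 0.05],
                  [0.20, 0.10],
                  [np.nan, 0.20]])

x = 1.0 - np.nansum(joint)   # probabilities must sum to one, so x = 0.05
joint[2, 0] = x

p_high = joint[0].sum()      # marginal Pr(high inflation) = 0.40 + 0.05, part b)
```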
c.) This relates to the conditional probability, which we calculate as the joint probability of a contraction and high inflation divided by the marginal probability of a contraction, or 0.05/0.35 ≈ 0.14.

d.) For the conditional to be equal to the ratio of the joint to the marginal, we are not assuming the two variables are independent. We could exploit the fact that independence implies Pr(Y = y | X = x) = Pr(Y = y) and Pr(Y = y, X = x) = Pr(Y = y)·Pr(X = x), but the expression for the conditional probability we use does not require this.

e.) We form the expected value by weighting the value of outcomes (inflation rates) by their respective probabilities (given by the marginal probabilities for high, medium, and low inflation). This would be equal to 0.10·0.45 + 0.05·0.30 + 0.02·0.25 = 0.045 + 0.015 + 0.005 = 0.065, or 6.5%.

Short Answer #4 (20 points – show your work!)

Suppose you want to explain incomes in Canada for workers in their prime working years. To answer this question, you randomly sample 10,000 full-time workers age 30-50. Using these data, you estimate the regression:

Y_i = β_0 + β_1 X_1i + β_2 X_2i + β_3 X_3i + β_4 X_4i + ε_i

Y_i = average annual earnings in Canadian dollars of worker i
X_1i = total work experience of worker i in years
X_2i = work experience with current employer of worker i in years
X_3i = amount of education of worker i in years
X_4i = age of worker i in years

a.) What do you suppose would be the direction of the bias on the estimate of β_1 if age were omitted from the regression? Explain your answer.
b.) What do you suppose would be the direction of the bias on the estimate of β_1 if education were omitted from the regression? Explain your answer.
c.) Suppose you ran the regression including all four variables and generated the following results:

β̂_0 = 23,465.53
β̂_1 = 457.82
β̂_2 = 128.93
β̂_3 = 1,277.88
β̂_4 = 103.90

What does the value of 103.90 for the last coefficient mean?

d.)
For the same regression, the 95% confidence interval for β_1 is reported as (−88.27, 1003.91). Provide the correct interpretation of this range.

e.) Suppose you want to estimate the increase in income expected by a worker staying with the same firm for another year. Explain how you would do this.

a.) Age and experience probably both have positive coefficients and are undoubtedly positively correlated. So the bias will be upward. That is, if we omit age, experience will get "credit" for both itself and the contribution of age.

b.) Education and experience probably both have positive coefficients but are not unambiguously correlated with one another. For workers early in their careers, the two may be negatively related, i.e., another year in school comes at the expense of a potential year of experience. For workers late in their careers, experience continues to grow while education likely remains constant. This suggests that if we omit education, there may be a downward bias. (Answers may vary here.)

c.) It means that, holding constant the variation in work experience and education, we expect income to increase by $103.90 for each additional year of age on the part of workers.

d.) There is a 95% probability that confidence intervals constructed in this fashion will include the true value of the population parameter β_1. (Interestingly enough, the fact that the interval extends into negative values also tells us that we will fail to reject, at the 5% level, the null hypothesis that the true coefficient is equal to zero, but this nugget of information is not necessary for answering the question.)

e.) Both total work experience and work experience with the current employer increase here, so the estimate we want is the sum of the slope estimates on these two explanatory variables, or 457.82 + 128.93 = $586.75. This is the cumulative predicted effect of staying on for an additional year with the same employer.
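The interval in part d.) has the usual form β̂_1 ± t*·s.e.(β̂_1), so the implied standard error can be backed out from the reported endpoints. A minimal sketch, assuming a critical value of 1.96 (appropriate for a 95% interval with n − k − 1 = 9,995 degrees of freedom):

```python
b1 = 457.82                    # reported estimate of beta_1
lo, hi = -88.27, 1003.91       # reported 95% confidence interval
t_crit = 1.96                  # approx. 97.5th percentile of t with 9995 df

se = (hi - lo) / (2 * t_crit)  # implied s.e.(beta_1-hat), about 278.6
ci = (b1 - t_crit * se, b1 + t_crit * se)  # reproduces the reported interval
```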
Useful Formulas:

E(X) = Σ_i p_i x_i
σ_X² = Σ_i (x_i − μ_X)² p_i
Pr(X = x, Y = y) = Pr(Y = y | X = x) · Pr(X = x)
Pr(X = x) = Σ_{i=1}^k Pr(X = x, Y = y_i)
E(Y) = Σ_{i=1}^m E(Y | X = x_i) · Pr(X = x_i)
E(Y | X = x) = Σ_{i=1}^k y_i · Pr(Y = y_i | X = x)
Var(Y | X = x) = Σ_{i=1}^k [y_i − E(Y | X = x)]² · Pr(Y = y_i | X = x)
E(a + bX + cY) = a + bE(X) + cE(Y)
Cov(X, Y) = Σ_{i=1}^k Σ_{j=1}^m (x_j − μ_X)(y_i − μ_Y) · Pr(X = x_j, Y = y_i)
Corr(X, Y) = Cov(X, Y) / √(Var(X)·Var(Y)) = σ_XY / (σ_X σ_Y)
Cov(a + bX + cV, Y) = b·Cov(X, Y) + c·Cov(V, Y)
E(XY) = Cov(X, Y) + E(X)·E(Y)
Var(a + bY) = b²·Var(Y)
Var(aX + bY) = a²·Var(X) + b²·Var(Y) + 2ab·Cov(X, Y)
E(Y²) = Var(Y) + [E(Y)]²
Var(X) = E[(X − μ_X)²]
X̄ = (1/n) Σ_{i=1}^n x_i
s² = (1/(n − 1)) Σ_{i=1}^n (x_i − x̄)²
s_XY = (1/(n − 1)) Σ_{i=1}^n (x_i − x̄)(y_i − ȳ)
r_XY = s_XY / (s_X s_Y)
X̄ ~ N(μ, σ²/n)
Z = (X̄ − μ) / (σ/√n),  t = (X̄ − μ) / (s/√n)

For the linear regression model Y_i = β_0 + β_1 X_i + ε_i,

β̂_1 = Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)²  and  β̂_0 = Ȳ − β̂_1 X̄

Ŷ_i = β̂_0 + β̂_1 X_1i + β̂_2 X_2i + … + β̂_k X_ki
TSS = ESS + RSS
R² = ESS/TSS = 1 − RSS/TSS = 1 − Σ e_i² / Σ (Y_i − Ȳ)²
R̄² = 1 − [Σ e_i² / (n − k − 1)] / [Σ (Y_i − Ȳ)² / (n − 1)]
Var(β̂_1) = s² / Σ (X_i − X̄)², where s² = Σ e_i² / (n − k − 1) and E(s²) = σ²
Z = (β̂_j − β_H) / √Var(β̂_j) ~ N(0, 1)
t = (β̂_1 − β_H) / s.e.(β̂_1) ~ t_{n−k−1}
Pr[β̂_j − t*_{α/2}·s.e.(β̂_j) ≤ β_j ≤ β̂_j + t*_{α/2}·s.e.(β̂_j)] = 1 − α
F = (ESS/k) / (RSS/(n − k − 1)) = ESS·(n − k − 1) / (RSS·k)
d = Σ_{t=2}^T (e_t − e_{t−1})² / Σ_{t=1}^T e_t² ≈ 2(1 − ρ)
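The slope and intercept formulas for the one-regressor model above can be verified numerically. A sketch on simulated data; the use of NumPy, the seed, and np.polyfit as a cross-check are my own choices, not part of the exam:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(size=100)  # true beta_0 = 2, beta_1 = 3

# Formula-sheet estimators
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()

# Cross-check against NumPy's least-squares fit (returns slope, then intercept)
b1_ref, b0_ref = np.polyfit(x, y, 1)
```

With 100 observations, both estimates land close to the true values of 3 and 2, and the hand-rolled formulas agree with the library fit to numerical precision.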