Given name: ____________________  Student #: ______________________  Family name: ___________________

BUEC 333 FINAL

Multiple Choice (2 points each)

1) Which of the following is not an assumption of the CLNRM?
a) The errors are uniformly distributed
b) The model is correctly specified
c) The independent variables are exogenous
d) The errors have mean zero
e) The errors have constant variance

2) In the regression model ln Yi = β0 + β1 ln Xi + εi:
a) β1 measures the elasticity of Y with respect to X
b) β1 measures the elasticity of X with respect to Y
c) β1 measures the percentage change in Y for a one-unit change in X
d) the marginal effect of X on Y is constant
e) none of the above

3) If two random variables X and Y are independent:
a) their joint distribution equals the product of their conditional distributions
b) the posterior distribution of X given Y equals the marginal distribution of X
c) E[XY] = E[X]E[Y]
d) b and c
e) none of the above

4) The power of a test is the probability that you:
a) reject the null when it is true
b) reject the null when it is false
c) fail to reject the null when it is false
d) fail to reject the null when it is true
e) none of the above

5) R-squared is:
a) the residual sum of squares as a fraction of the total variation in the independent variable
b) the explained sum of squares as a fraction of the total sum of squares
c) one minus the answer in a)
d) one minus the answer in b)
e) none of the above

6) In the linear regression model, the least squares estimator:
a) maximizes the value of R2
b) minimizes the sum of squared residuals
c) features the smallest possible sample variance
d) all of the above
e) only a) and b)

7) If q is an unbiased estimator of Q, then:
a) Q is the mean of the sampling variance of q
b) q is the mean of the sampling distribution of Q
c) Var[q] = Var[Q] / n where n = the sample size
d) q = Q
e) none of the above

8) In the Capital Asset Pricing Model (CAPM):
a) β measures the sensitivity of the expected return of a portfolio to systematic risk
b) β measures the sensitivity of the expected return of a portfolio to specific risk
c) β is greater than one
d) α is less than zero
e) R2 is meaningless

9) In the regression specification Yi = β0 + β1Xi + εi, which of the following is a justification for including ε?
a) it accounts for potential non-linearity in the functional form
b) it captures the influence of all omitted explanatory variables
c) it incorporates measurement error in Y
d) it reflects randomness in outcomes
e) all of the above

10) Suppose [L(X), U(X)] is a 95% confidence interval for a population mean. Which of the following is/are true?
a) Pr[L(X) ≤ X ≤ U(X)] = 0.90
b) Pr[L(X) ≤ µ ≤ U(X)] = 0.95
c) Pr[X ≤ L(X)] + Pr[U(X) ≤ X] = 0.05
d) a and c
e) none of the above

11) Omitting a constant term from our regression will likely lead to:
a) higher R2, higher F stat, and biased estimates of the independent variables when β0 ≠ 0
b) higher R2, lower F stat, and biased estimates of the independent variables when β0 ≠ 0
c) higher R2, lower F stat, and unbiased estimates of the independent variables when β0 ≠ 0
d) higher R2, higher F stat, and unbiased estimates of the independent variables when β0 ≠ 0
e) none of the above

12) In order for an independent variable to be labelled “exogenous”, which of the following must be true:
a) E(εi) = 0
b) Cov(Xi, εi) = 0
c) Cov(εi, εj) = 0
d) Var(εi) = σ2
e) none of the above

13) Pure serial correlation:
a) relates to the persistence of errors in the regression model
b) can be detected with the RESET test statistic
c) is caused by mis-specification of the regression model
d) b and c
e) all of the above

14) The sampling variance of the slope coefficient in the regression model with one independent variable:
a) will be smaller when there is more variation in X
b) will be smaller when there is more variation in ε
c) will be larger when there is less variation in ε
d) will be larger when there is more co-variation in ε and X
e) none of the above

15) Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the following is/are true?
[1] the variance of Q is zero
[2] if q is an unbiased estimator of Q, then q = E(Q)
[3] if q is an unbiased estimator of Q, then Q is the mean of the sampling distribution of q
[4] a 95% confidence interval for q contains Q with 95% probability
a) 1 only
b) 3 only
c) 1 and 3
d) 1, 2, and 3
e) 1, 2, 3, and 4

16) Suppose the monthly demand for tomatoes (a perishable good) in a small town is random. With probability 1/2, demand is 50; with probability 1/2, demand is 100. You are the only producer of tomatoes in this town. Tomatoes sell for a fixed price of $1, cost $0.50 to produce, and can only be sold in the local market. If you produce 60 tomatoes, your expected profit is:
a) $15
b) $35
c) $55
d) $75
e) none of the above

17) The OLS estimator is said to be unbiased when:
a) Assumptions 1 through 3 are satisfied
b) Assumptions 1 through 6 are satisfied
c) Assumptions 1 through 3 are satisfied and errors are normally distributed
d) Assumptions 1 through 6 are satisfied and errors are normally distributed
e) all of the above

18) The RESET test is designed to detect problems associated with:
a) specification error of an unknown form
b) heteroskedasticity
c) multicollinearity
d) serial correlation
e) none of the above

19) The Durbin-Watson test is only valid:
a) with models that include an intercept
b) with models that include a lagged dependent variable
c) with models displaying multiple orders of autocorrelation
d) all of the above
e) none of the above

20) The consequences of multicollinearity are that the OLS estimates:
a) will be biased while the standard errors will remain unaffected
b) will be biased while the standard errors will be smaller
c) will be unbiased while the standard errors will remain unaffected
d) will be unbiased while the standard errors will be smaller
e) none of the above

Short Answer #1 (10 points)

Suppose you have observations on a dependent variable, Y, and an independent variable, X.
a) Provide a plot of your X and Y values with a regression line through the points that would indicate the presence of heteroskedasticity in the errors of the regression model.
b) Is your graph above indicative of a model with pure heteroskedasticity or impure heteroskedasticity? Discuss.
c) Explain the consequences of using OLS estimation if the error terms in the regression model are heteroskedastic.

a) Something like the following should suffice: [figure omitted: a scatter of points around a regression line whose spread around the line fans out as X increases].

b) Technically speaking, there is no way of determining whether it is impure or pure heteroskedasticity we are dealing with here. For full credit, an answer should discuss the difference between the two and why it matters.

c) There will be three consequences:
i) OLS estimates remain unbiased, but only if the problem is pure heteroskedasticity; OLS estimates will be biased if the problem is impure heteroskedasticity brought about by correlated omitted variables.
ii) Even if unbiased, the sampling variance of the OLS estimator is inflated.
iii) Because of ii), the estimated value of the sampling variance, and consequently the calculated standard error, is wrong.
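For intuition on c), the following small simulation (not part of the original exam; all parameter values are illustrative assumptions) shows the two key consequences under pure heteroskedasticity: the OLS slope stays centred on the truth, while the conventional homoskedastic standard-error formula mis-states the estimator's true sampling spread.

```python
# Illustrative sketch only: simulated data with error spread growing in X,
# estimated by OLS many times to trace out the sampling distribution.
import numpy as np

rng = np.random.default_rng(333)
n, reps = 200, 5000
beta0, beta1 = 2.0, 3.0            # assumed "true" coefficients
x = np.linspace(1, 10, n)

slopes, conv_se = [], []
for _ in range(reps):
    eps = rng.normal(0, 0.1 * x**2)          # pure heteroskedasticity: error s.d. rises with x
    y = beta0 + beta1 * x + eps
    xd = x - x.mean()
    b1 = (xd * (y - y.mean())).sum() / (xd**2).sum()
    e = y - (y.mean() - b1 * x.mean()) - b1 * x
    s2 = (e**2).sum() / (n - 2)              # conventional (homoskedastic) sigma^2 estimate
    conv_se.append(np.sqrt(s2 / (xd**2).sum()))
    slopes.append(b1)

print("mean of slope estimates:  ", np.mean(slopes))   # close to 3.0: unbiased
print("empirical sampling s.d.:  ", np.std(slopes))    # the true spread
print("average conventional s.e.:", np.mean(conv_se))  # understates the spread here
```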
Short Answer #2 (10 points)

Consider the following regression model: Yi = β0 + β1Xi + εi. Suppose you have 101 observations and know the following summary statistics:

$\sum_i (Y_i - \bar Y)^2 = 45$, $\sum_i (X_i - \bar X)^2 = 50$, $\sum_i e_i^2 = 15{,}000$, $\mathrm{Cov}(X_i, Y_i) = 50$

a) In the most general terms possible, what is the expression for a (1-α)% confidence interval for the unknown population slope parameter? (2 points)
b) Using a critical value of 2.5, numerically construct a (1-α)% confidence interval for the unknown population slope parameter. (4 points)
c) What is the precise interpretation of the confidence interval given in b)? (4 points)

a) This is simply the following:

$\Pr[\hat\beta_1 - t^*_{\alpha/2} \times \mathrm{s.e.}(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t^*_{\alpha/2} \times \mathrm{s.e.}(\hat\beta_1)] = 1 - \alpha$

b) First, we need expressions for β̂1 and its standard error:

$\hat\beta_1 = \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)} = \dfrac{50}{\sum_i (X_i - \bar X)^2 / (n-1)} = \dfrac{50}{50/100} = 100$

$\widehat{\mathrm{Var}}[\hat\beta_1] = \dfrac{\left(\sum_i e_i^2\right) / (n-k-1)}{\sum_i (X_i - \bar X)^2} = \dfrac{15{,}000/99}{50} \approx 3$

$\mathrm{s.e.}(\hat\beta_1) = \sqrt{\widehat{\mathrm{Var}}[\hat\beta_1]} \approx \sqrt{3} \approx 1.75$

$\Pr[100 - 2.5 \times 1.75 \le \beta_1 \le 100 + 2.5 \times 1.75] = 1 - \alpha$
$\Pr[100 - 4.375 \le \beta_1 \le 100 + 4.375] = 1 - \alpha$
$\Pr[95.625 \le \beta_1 \le 104.375] = 1 - \alpha$

The answer need not be as precise as the last expression; full marks for correctly deriving the value of β̂1 and its standard error as the square root of 3, as long as they appear in the right places in the confidence interval.

c) In the limit of repeated sampling, (1-α)% of confidence intervals constructed in this fashion will include the true value of the population parameter β1.
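As a quick numerical cross-check of b) (not part of the original exam), the arithmetic can be reproduced directly from the summary statistics given in the question:

```python
# Reproduces the Short Answer #2 calculation from the given summary statistics.
import math

n, k = 101, 1
cov_xy = 50.0        # Cov(X, Y)
ssx = 50.0           # sum of (X_i - Xbar)^2
sse = 15_000.0       # sum of squared residuals
t_crit = 2.5         # critical value given in part b)

beta1_hat = cov_xy / (ssx / (n - 1))      # 50 / 0.5 = 100
var_b1 = (sse / (n - k - 1)) / ssx        # (15000/99)/50 = 3.0303...
se_b1 = math.sqrt(var_b1)                 # 1.7408...; the key rounds this to 1.75

print(f"CI: [{beta1_hat - t_crit * se_b1:.3f}, {beta1_hat + t_crit * se_b1:.3f}]")
# [95.648, 104.352]; with the key's rounding, [95.625, 104.375]
```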
Short Answer #3 (10 points)

Consider the following regression model: Yi = β0 + β1X1 + β2X2 + εi. Suppose you forget to include the variable X2 in the regression you estimate.
a) Derive an expression for the omitted variable bias resulting from your estimation.
b) If you could not obtain data on X2, what can you do to eliminate or diminish the omitted variable bias?

a) The true DGP is $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$. Instead, we estimate $Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i^*$, where $\varepsilon_i^* = \beta_2 X_{2i} + \varepsilon_i$.

Thus, we can derive the bias in the following way (writing $X_i$ for $X_{1i}$):

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i^* \;\Rightarrow\; \bar Y = \beta_0 + \beta_1 \bar X + \bar\varepsilon^*$

$\hat\beta_1 = \dfrac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}$

$\hat\beta_1 = \dfrac{\sum_i (X_i - \bar X)(\beta_0 + \beta_1 X_i + \varepsilon_i^* - \beta_0 - \beta_1 \bar X - \bar\varepsilon^*)}{\sum_i (X_i - \bar X)^2}$

$\hat\beta_1 = \dfrac{\sum_i (X_i - \bar X)\left(\beta_1 (X_i - \bar X) + \varepsilon_i^* - \bar\varepsilon^*\right)}{\sum_i (X_i - \bar X)^2}$

$\hat\beta_1 = \beta_1 \dfrac{\sum_i (X_i - \bar X)^2}{\sum_i (X_i - \bar X)^2} + \dfrac{\sum_i (X_i - \bar X)(\varepsilon_i^* - \bar\varepsilon^*)}{\sum_i (X_i - \bar X)^2}$

$\hat\beta_1 = \beta_1 + \dfrac{\sum_i (X_i - \bar X)(\varepsilon_i^* - \bar\varepsilon^*)}{\sum_i (X_i - \bar X)^2}$

$E(\hat\beta_1) = \beta_1 + E\left(\dfrac{\sum_i (X_i - \bar X)(\varepsilon_i^* - \bar\varepsilon^*)}{\sum_i (X_i - \bar X)^2}\right)$

So, the last term on the RHS can be thought of as the bias arising from omitting X2: since $\varepsilon_i^* = \beta_2 X_{2i} + \varepsilon_i$, it is non-zero whenever X2 is correlated with X1. Partial credit for simply providing this last expression rather than formally deriving it.

b) Unfortunately, there is not a whole lot we can do in circumstances like these. There is the “easy” way out: just add the omitted variable into the model, but we presumably would have done this in the first place if it were possible. We can also include a “proxy” for the omitted variable instead, where the proxy is something highly correlated with the omitted variable.
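The following simulation sketch (not part of the original exam; all numbers are illustrative assumptions) makes the derived bias term concrete: with a correlated omitted X2, the estimated slope on X1 centres on β1 plus β2·Cov(X1, X2)/Var(X1) rather than on β1.

```python
# Illustrative sketch: omitted-variable bias when X2 is dropped from the model.
import numpy as np

rng = np.random.default_rng(42)
n, reps = 500, 2000
beta0, beta1, beta2 = 1.0, 2.0, 4.0   # assumed "true" coefficients

slopes = []
for _ in range(reps):
    x1 = rng.normal(0, 1, n)
    x2 = 0.5 * x1 + rng.normal(0, 1, n)   # omitted variable, correlated with x1
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(0, 1, n)
    xd = x1 - x1.mean()
    slopes.append((xd * (y - y.mean())).sum() / (xd**2).sum())  # short-regression slope

# Here Cov(x1, x2)/Var(x1) = 0.5, so the bias should be about beta2 * 0.5 = 2.
print("mean short-regression slope:", np.mean(slopes))  # ~4.0, not the true beta1 = 2.0
```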
Short Answer #4 (10 points)

Consider the simple univariate regression model Yi = β0 + β1Xi + εi. Demonstrate that the sample regression line passes through the sample mean of both X and Y.

We estimate the linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ as

$Y_i = \hat\beta_0 + \hat\beta_1 X_i + e_i$

We also know that

$\hat Y_i = \hat\beta_0 + \hat\beta_1 X_i$ and that $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$

We can evaluate this second expression when $X_i = \bar X$ to prove the statement above:

$\hat Y_i = \hat\beta_0 + \hat\beta_1 \bar X = \bar Y - \hat\beta_1 \bar X + \hat\beta_1 \bar X = \bar Y$

So, by construction, the estimated regression line always passes through the sample means when using OLS. (Alternatively, you could start with the expression for Yi, consider its sum, and proceed to evaluate its mean value.)

Short Answer #5 (10 points)

Consider the following regression model: Yi = β0 + β1X1 + β2X2 + εi.
a) State the underlying assumptions for the classical linear regression model given above.
b) Which of these assumptions are necessary for our estimator to be unbiased and which are necessary for it to be efficient?
c) Graphically illustrate the following assumptions: E(εi) = 0, Cov(εi, εj) = 0, and Var(εi) = σ2
d) Sometimes, the seventh assumption related to the normality of ε is used, which implies that the β's are also normally distributed. But when we estimate via OLS, we always arrive at a single number and not a distribution of values. Explain why this is the case.

a) The regression model: a) is linear in the coefficients, b) is correctly specified, and c) has an additive error term. The error term has zero population mean, or E(εi) = 0. All independent variables are uncorrelated with the error term, or Cov(Xi, εi) = 0 for each independent variable Xi (we say there is no endogeneity). Errors are uncorrelated across observations, or Cov(εi, εj) = 0 for any two observations i and j (we say there is no serial correlation). The error term has a constant variance, or Var(εi) = σ2 for every i (we say there is no heteroskedasticity). No independent variable is a perfect linear function of any other independent variable (we say there is no perfect collinearity).

b) Of the assumptions listed above, the first three are required for unbiasedness. Assumptions four through six are necessary for the OLS estimator to be efficient.

c) Something along the lines of the following should suffice: [figures omitted: sketches of errors centred on zero, errors uncorrelated across observations, and errors with constant spread].

d) This reflects the fact that our OLS estimates come from one particular set of data (i.e., one sample of observations). Thus, there is only one number attached to a particular estimate of our population parameter of interest. We also expect OLS to generate different results whenever we change the sample (that is, whenever we have different observations with different values for our variables). The result on the normality of the betas under the seventh assumption reflects this fundamental fact: repeated random sampling will result in a whole distribution of values for the estimates of the population parameter of interest.

Short Answer #6 (10 points)

Consider the following set of results for a log-log specification of NHL salaries on two independent variables, age and points. [Regression output omitted.] For the following statistical tests, specify what the null hypothesis of the relevant test is and provide the appropriate interpretation given the results above:
a) the t test associated with the independent variable “points” (use a critical value of 2.58)
b) the F test associated with “age” and “points” in combination (use a critical value of 4.61)
c) the RESET test using the F statistic (use a critical value of 6.64)
d) the Durbin-Watson test (use a lower critical value of 1.55 and an upper critical value of 1.80)

a) H0: βPOINTS = 0 versus H1: βPOINTS ≠ 0. Since the test statistic of 21.92 is so much larger (in absolute value) than the critical value, it is unlikely that the null is true, so we consequently reject it and regard the coefficient on points as being statistically significant.

b) H0: β1 = β2 = ... = βk = 0 versus H1: at least one βj ≠ 0, where j = 1, 2, ..., k. Since the test statistic of 315.90 is so much larger than the critical value, it is unlikely that the null is true, so we consequently reject it and consider that collectively our independent variables are important in explaining the variation observed in our dependent variable...provided the errors are normal!

c) The null hypothesis in this case is one of correct specification, in particular the absence of omitted variables. Technically speaking, we are evaluating the joint significance of the coefficients on all the powers (greater than one) of the predicted value of Y. Here, the RESET test suggests that we fail to reject the null hypothesis of having no missing variables. Note that even when the RESET test does reject, it gives no further indication of how to deal with the problem.

d) The null hypothesis in this case is no positive autocorrelation. The Durbin-Watson test suggests that we reject the null hypothesis of no positive autocorrelation, suggesting instead that we very likely have problems with serial correlation in this specification.
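To make part d) concrete, here is a small sketch (not part of the original exam; the residual series is simulated purely for illustration) of how the Durbin-Watson statistic is computed from regression residuals, together with the d ≈ 2(1 − ρ) approximation from the formula sheet below:

```python
# Illustrative sketch: the Durbin-Watson statistic on simulated AR(1) residuals.
import numpy as np

rng = np.random.default_rng(7)
T, rho = 200, 0.6                      # assumed sample size and AR(1) coefficient
e = np.zeros(T)
for t in range(1, T):
    e[t] = rho * e[t - 1] + rng.normal()   # positively autocorrelated residuals

d = ((e[1:] - e[:-1]) ** 2).sum() / (e ** 2).sum()
print("Durbin-Watson d:", d)            # well below 2, signalling positive autocorrelation
print("2 * (1 - rho): ", 2 * (1 - rho)) # the formula-sheet approximation: 0.8
```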
Useful Formulas:

$\mu_X = E(X) = \sum_{i=1}^{k} p_i x_i$

$\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i$

$\Pr(X = x) = \sum_{i=1}^{k} \Pr(X = x, Y = y_i)$

$\Pr(Y = y \mid X = x) = \dfrac{\Pr(X = x, Y = y)}{\Pr(X = x)}$

$E(Y) = \sum_{i=1}^{m} E(Y \mid X = x_i) \Pr(X = x_i)$

$E(Y \mid X = x) = \sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$

$\mathrm{Var}(Y \mid X = x) = \sum_{i=1}^{k} [y_i - E(Y \mid X = x)]^2 \Pr(Y = y_i \mid X = x)$

$E(a + bX + cY) = a + bE(X) + cE(Y)$

$\mathrm{Var}(a + bY) = b^2 \mathrm{Var}(Y)$

$\sigma_{XY} = \mathrm{Cov}(X, Y) = \sum_{i} \sum_{j} (x_j - \mu_X)(y_i - \mu_Y) \Pr(X = x_j, Y = y_i)$

$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$

$\mathrm{Corr}(X, Y) = \rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}$

$\mathrm{Cov}(a + bX + cV, Y) = b\,\mathrm{Cov}(X, Y) + c\,\mathrm{Cov}(V, Y)$

$E(Y^2) = \mathrm{Var}(Y) + [E(Y)]^2$

$E(XY) = \mathrm{Cov}(X, Y) + E(X)E(Y)$

$Z = \dfrac{\bar X - \mu}{\sigma / \sqrt{n}}$  $t = \dfrac{\bar X - \mu}{s / \sqrt{n}}$  $\bar X \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$

$\bar X = \dfrac{1}{n} \sum_{i=1}^{n} x_i$  $s_X^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar x)^2$  $s_{XY} = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)$  $r_{XY} = s_{XY} / (s_X s_Y)$

For the linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$:

$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$ and $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$

$\hat Y_i = \hat\beta_0 + \hat\beta_1 X_{1i} + \hat\beta_2 X_{2i} + \cdots + \hat\beta_k X_{ki}$

$R^2 = \dfrac{ESS}{TSS} = \dfrac{TSS - RSS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\sum_i e_i^2}{\sum_i (Y_i - \bar Y)^2}$

$\bar R^2 = 1 - \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (Y_i - \bar Y)^2 / (n - 1)}$

$s^2 = \dfrac{\sum_i e_i^2}{n - k - 1}$ where $E[s^2] = \sigma^2$

$\widehat{\mathrm{Var}}[\hat\beta_1] = \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (X_i - \bar X)^2}$

$t = \dfrac{\hat\beta_j - \beta_H}{\mathrm{s.e.}(\hat\beta_j)} \sim t_{n-k-1}$  $Z = \dfrac{\hat\beta_j - \beta_H}{\sqrt{\mathrm{Var}[\hat\beta_j]}} \sim N(0, 1)$

$\Pr[\hat\beta_j - t^*_{\alpha/2} \times \mathrm{s.e.}(\hat\beta_j) \le \beta_j \le \hat\beta_j + t^*_{\alpha/2} \times \mathrm{s.e.}(\hat\beta_j)] = 1 - \alpha$

$d = \dfrac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \approx 2(1 - \rho)$

$F = \dfrac{ESS / k}{RSS / (n - k - 1)} = \dfrac{ESS\,(n - k - 1)}{RSS\,k}$