Given name:____________________ Student #:______________________
Family name:___________________ Section #:______________________

BUEC 333 FINAL

Multiple Choice (2 points each)

1.) Suppose that in the simple linear regression model Yi = β0 + β1Xi + εi on 100 observations, you calculate that R² = 0.5, the sample covariance of X and Y is 10, and the sample variance of X is 15. Then the least squares estimator of β1 is:
a.) not calculable using the information given
b.) 1/3
c.) 1 / 3
d.) 3/2
e.) none of the above

2.) The Durbin-Watson test is only valid:
a.) with models that include an intercept
b.) with models that include a lagged dependent variable
c.) with models displaying multiple orders of autocorrelation
d.) all of the above
e.) none of the above

3.) Suppose you have a random sample of 10 observations from a normal distribution with mean = 10 and variance = 2. The sample mean (x-bar) is 8 and the sample variance is 3. The sampling distribution of x-bar has
a.) mean 8 and variance 3
b.) mean 8 and variance 0.3
c.) mean 10 and variance 0.2
d.) mean 10 and variance 2
e.) none of the above

4.) From a gravity model of trade, you estimate that Pr[-0.9828 ≤ β_distance ≤ -0.7982] = 95%. This allows you to state that:
a.) there is a 95% chance that all potential estimates of the coefficient on distance are in this range
b.) you can reject the null hypothesis that the true coefficient on distance is equal to zero at the 5% level of significance
c.) there is a 5% chance that some of the potential estimates of the coefficient on distance fall outside of this range
d.) all of the above
e.) none of the above

5.) Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the following is/are false?
[1] the variance of Q is zero
[2] if q is an unbiased estimator of Q, then q = Q
[3] if q is an unbiased estimator of Q, then q is the mean of the sampling distribution of Q
[4] a 95% confidence interval for q contains Q with 95% probability
a.) 2 only
b.) 3 only
c.) 2 and 3
d.) 2, 3, and 4
e.) 1, 2, 3, and 4

6.) In order for our independent variables to be labelled “exogenous”, which of the following must be true:
a.) E(εi) = 0
b.) Cov(Xi,εi) = 0
c.) Cov(εi,εj) = 0
d.) Var(εi) = σ²
e.) none of the above

7.) If correlated omitted independent variables are serially correlated, then:
a.) least squares coefficient estimates are biased
b.) GLS coefficient estimates are biased
c.) least squares standard errors are wrong
d.) ordinary least squares is not BLUE
e.) all of the above

8.) We saw the claim that the value of X1 is an unbiased estimator of the population mean because E(X1) = μ. Now consider the estimator (X1 + X2)*2. Is this another unbiased estimator of the population mean?
a.) answer depends on the underlying distribution of X
b.) this is a biased estimator of the population mean
c.) this is an unbiased estimator of the population mean
d.) there is insufficient information to answer this question
e.) none of the above

9.) To be useful for hypothesis testing, a test statistic must:
a.) be computable using sample data
b.) have a known sampling distribution when the null hypothesis is true
c.) have a known sampling distribution when the null hypothesis is false
d.) a and b only
e.) none of the above

10.) Adding an irrelevant explanatory variable that is uncorrelated with the other independent variables causes:
a.) bias and no change in variance
b.) bias and an increase in variance
c.) no bias and no change in variance
d.) no bias and an increase in variance
e.) none of the above

11.) A newspaper reports a poll estimating the proportion u of the adult population in favour of legalizing marijuana as 65%, but qualifies this result by saying that “this result is accurate within plus or minus 3 percentage points (19 times out of twenty).” What does this mean?
a.) the probability is 95% that u lies between 62% and 68%
b.) the probability is 95% that u is equal to 65%
c.) 95% of estimates calculated from samples of this size will lie between 62% and 68%
d.) not enough information
e.) none of the above

12.) Omitting a relevant explanatory variable that is uncorrelated with the other independent variables causes:
a.) no bias and no change in variance
b.) no bias and an increase in variance
c.) no bias and a decrease in variance
d.) bias
e.) none of the above

13.) The OLS estimator of the variance of the slope coefficient in the regression model with one independent variable:
a.) will be smaller when there is less variation in ei
b.) will be smaller when there are fewer observations
c.) will be smaller when there is less variation in X
d.) will be smaller when there are more independent variables
e.) none of the above

14.) The central limit theorem tells us that the sampling distribution of the sample mean:
a.) is always normal
b.) is always normal in large samples
c.) approaches normality as the sample size increases
d.) is normal in Monte Carlo simulations
e.) none of the above

15.) Suppose you compute a sample statistic q to estimate a population quantity Q. Which of the following is/are true?
[1] the variance of Q is zero
[2] if q is an unbiased estimator of Q, then q = Q
[3] if q is an unbiased estimator of Q, then q is the mean of the sampling distribution of Q
[4] a 95% confidence interval for q contains Q with 95% probability
a.) 1 only
b.) 2 only
c.) 2 and 3
d.) 2, 3, and 4
e.) 1, 2, 3, and 4

16.) If the covariance between two random variables X and Y is zero, then
a.) X and Y are independent
b.) knowing the value of X provides no information about the value of Y
c.) E(X) = E(Y) = 0
d.) a and b are true
e.) none of the above

17.) Given the equation for the F statistic, we can say that it is
a.) decreasing in R², decreasing in n, and decreasing in k
b.) increasing in R², increasing in n, and increasing in k
c.) decreasing in R², increasing in n, and decreasing in k
d.) increasing in R², increasing in n, and decreasing in k
e.) none of the above

18.) In the Capital Asset Pricing Model (CAPM),
a.) β measures the sensitivity of the expected return of a portfolio to systematic risk
b.) β measures the sensitivity of the expected return of a portfolio to specific risk
c.) β is greater than one
d.) α is less than zero
e.) R² is meaningless

19.) If a random variable X has a normal distribution with mean μ and variance σ², then:
a.) X takes positive values only
b.) (X − μ)/σ² has a standard normal distribution
c.) (X − μ)²/σ² has a chi-squared distribution with n degrees of freedom
d.) (X − μ)/(s/√n) has a t distribution with n-1 degrees of freedom
e.) none of the above

20.) Suppose the assumptions of the CLRM apply and you have used OLS to estimate a slope coefficient as 2.43. If the true value of this slope is 3.05, then the OLS estimator
a.) has bias of 0.62
b.) has bias of -0.62
c.) is unbiased
d.) not enough information
e.) none of the above

Short Answer #1 (10 points)

According to the Canada Revenue Agency, the average length of time for an individual to complete a CRA Income Tax Return is 10.53 hours with a standard deviation of 2.00 hours. The distribution of this variable, however, is unknown. Suppose we randomly sample 360 taxpayers.

a.) In words, explain what Xi equals.
b.) In words, explain what X-bar equals.
c.) Now, tell me how X-bar is distributed; that is, tell me the type of distribution and its parameters.
d.) Would you be surprised if the 360 taxpayers finished their Income Tax Return in an average of more than 12 hours? Explain why or why not in complete sentences.
e.) Would you be surprised if one taxpayer out of the 360 taxpayers finished his Income Tax Return in more than 12 hours? Explain why or why not in complete sentences.

Page intentionally left blank. Use this space for rough work or the continuation of an answer.
Short Answer #2 (10 points)

Suppose we have a linear regression model with one independent variable and no intercept:

Yi = βXi + εi

Suppose also that εi satisfies the six classical assumptions.

a.) Verbally, explain the steps necessary to derive the least squares estimator.
b.) Formally, derive a mathematical expression for this estimator given your answer in part a).

Short Answer #3 (10 points)

There are at least two different possible approaches to the problem of building a model of the costs of production of electric power. Model I hypothesizes that per-unit costs (C), as a function of the number of kilowatt-hours produced (Q), fall continually and smoothly as production increases, but at a decreasing rate. Model II hypothesizes that per-unit costs (C) decrease fairly steadily as production (Q) increases across plant types, but costs start at a higher level for hydroelectric plants than for other kinds of facilities.

a.) What functional form would you recommend for estimating Model I? Write out a specific equation.
b.) What functional form would you recommend for estimating Model II? Write out a specific equation.
c.) Would R² be a reasonable way to compare the overall fits of the two equations? Why or why not?

Short Answer #4 (10 points)

Consider the regression results below, where the dependent variable is the amount of time in minutes that individuals spend traveling from home to work. The sample consists of workers across Canada. It also contains information on individuals' earnings, years of schooling, age, sex, and place of birth.

Dependent variable: Canadian commuting times

Independent variables                    Coefficient    p-value
Total earnings in 2012 (in $1000s)           0.0225       0.000
Years of schooling                          -0.0344       0.162
Age                                          0.0183       0.007
Female                                      -3.1650       0.000
Africa                                       4.0390       0.000
Asia                                         1.1200       0.000
Australasia                                  1.2630       0.066
Europe                                      -0.5527       0.635
Latin America                                2.039        0.000
Intercept                                   27.64         0.000

R²                                           0.0081
Adjusted R²                                  0.0080
F-statistic                                104.2
p-value of F                                 0.000
Observations                               115089

a.) How many of the independent variables are statistically significant? Which ones are not?
b.) Does the whole set of independent variables have a reliable collective effect on the dependent variable? Explain your answer.
c.) Consider the values of the R² and adjusted R² of the regression. Tell me what these mean individually and collectively.
d.) Interpret the coefficient associated with the variable on total earnings in 2012. Do the sign and magnitude of this coefficient seem plausible? Why or why not?
e.) The sample for this regression includes individuals in many different metropolitan areas of Canada. Does this seem like a good idea? Why or why not? What would you suggest as an alternative?

Short Answer #5 (10 points)

Consider the regression results below, where the dependent variable is the natural log of annual earnings for single, childless men with high-school education or less. The sample consists of workers in Vancouver over the years from 2003 to 2012. It also contains information on individuals' age (as a set of dummy variables capturing a range of ages), their status as a “visible minority” (that is, whether or not they are Caucasian), their status as “Aboriginal” (that is, whether or not they are of First Nations origin), and whether or not they possess a “High School Degree”.

Dependent variable: natural log of annual earnings

                          (1)        (2)               (3)
Independent variables     OLS        OLS with          OLS with
                                     Newey-West SEs    Newey-West SEs
Age from 30-34            0.29       0.29              0.26
  standard error          0.16       0.13              0.14
  t-statistic             1.80       2.20              1.88
Age from 35-39            0.19       0.19              0.21
  standard error          0.18       0.17              0.17
  t-statistic             1.10       1.13              1.23
Age from 40-44            0.25       0.25              0.24
  standard error          0.16       0.15              0.15
  t-statistic             1.51       1.66              1.65
Age from 45-49            0.12       0.12              0.13
  standard error          0.17       0.17              0.17
  t-statistic             0.71       0.70              0.79
Age from 50-54            0.08       0.08              0.06
  standard error          0.17       0.18              0.18
  t-statistic             0.46       0.43              0.32
Age from 55-59            0.37       0.37              0.38
  standard error          0.19       0.20              0.21
  t-statistic             2.02       1.83              1.84
Age from 60-64            0.35       0.35              0.33
  standard error          0.30       0.26              0.27
  t-statistic             1.15       1.34              1.24
Visible minority          0.07       0.07              0.02
  standard error          0.19       0.19              0.09
  t-statistic             0.36       0.73              0.29
Aboriginal               -0.54      -0.54             -0.49
  standard error          0.20       0.27              0.26
  t-statistic            -2.74      -2.01             -1.88
High school degree                                     0.33
  standard error                                       3.09
  t-statistic                                          0.00
Intercept                10.18      10.18              9.96
  standard error          0.12       0.11              0.14
  t-statistic            84.44      89.22             73.62

R²                        0.03       0.03              0.05
F-statistic              89.21      89.21            108.98
DW statistic              0.98       1.78              1.78
p-value of F              0.000      0.000             0.000
Observations              4160       4160              4160

For a.) through e.), consider only the output in the first and second columns and assume that with 4160 observations, the t distribution is functionally the same as the standard normal distribution.

a.) Why are the coefficients the same, but the standard errors different, in the first and second columns?
b.) Which set of estimates do you think is more reliable? Explain.
c.) What is the test statistic for the hypothesis that Aboriginal and Caucasian men have the same earnings against the alternative that they do not? Can you reject this hypothesis? Hint: use the “rule of thumb” that 2.00 is a sufficiently large critical value.
d.) Do you reject the hypothesis that these two groups have the same log-earnings against the alternative that Aboriginal men have lower log-earnings?
e.) Are the R-squared values too low? Should we ignore these results?
f.) The third column reports the results for a regression just like that reported in the second column, but it adds a dummy variable equal to 1 if an individual has a high school diploma. Why is the coefficient on “Aboriginal” now smaller in absolute value than in the second column?

Short Answer #6 (10 points)

The first half of the course was dedicated to developing the least squares estimator. The rest of the course was dedicated to considering those instances when problems with the least squares estimator arise. Underlying the discussion, there were the six assumptions of the classical linear model.

a.) Name the six assumptions and explain what each of them means.
b.) Some of these assumptions are necessary for the OLS estimator to be unbiased. Some of these assumptions are necessary for the OLS estimator to be “best”. Explain the distinction between these two concepts.
c.) Indicate which of the six assumptions are necessary for the OLS estimator to be unbiased and which of the six assumptions are necessary for the OLS estimator to be “best”.
d.) In general, would you prefer your estimates to be biased but efficient or unbiased but not efficient? Explain your answer.
Useful Formulas:

$E(X) = \sum_{i=1}^{k} p_i x_i = \mu_X$

$Var(X) = E[(X - \mu_X)^2] = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i$

$\Pr(X = x) = \sum_{i=1}^{k} \Pr(X = x, Y = y_i)$

$\Pr(Y = y \mid X = x) = \dfrac{\Pr(X = x, Y = y)}{\Pr(X = x)}$

$E(Y) = \sum_{i=1}^{m} E(Y \mid X = x_i) \Pr(X = x_i)$

$E(Y \mid X = x) = \sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$

$Var(Y \mid X = x) = \sum_{i=1}^{k} [y_i - E(Y \mid X = x)]^2 \Pr(Y = y_i \mid X = x)$

$E(a + bX + cY) = a + bE(X) + cE(Y)$

$Cov(X, Y) = \sum_{i=1}^{k} \sum_{j=1}^{m} (x_j - \mu_X)(y_i - \mu_Y) \Pr(X = x_j, Y = y_i)$

$Corr(X, Y) = \rho_{XY} = \dfrac{Cov(X, Y)}{\sqrt{Var(X) Var(Y)}}$

$Var(a + bY) = b^2 Var(Y)$

$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y)$

$E(Y^2) = Var(Y) + [E(Y)]^2$

$Cov(a + bX + cV, Y) = b\,Cov(X, Y) + c\,Cov(V, Y)$

$E(XY) = Cov(X, Y) + E(X)E(Y)$

$\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$

$s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

$s_{XY} = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

$r_{XY} = s_{XY} / (s_X s_Y)$

$t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$

$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

For the linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$:

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$

$R^2 = \dfrac{ESS}{TSS} = \dfrac{TSS - RSS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\sum_i e_i^2}{\sum_i (Y_i - \bar{Y})^2}$

$\bar{R}^2 = 1 - \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (Y_i - \bar{Y})^2 / (n - 1)}$

$s^2 = \dfrac{\sum_i e_i^2}{n - k - 1}$ where $E[s^2] = \sigma^2$

$\widehat{Var}(\hat{\beta}_1) = \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (X_i - \bar{X})^2}$

$Z = \dfrac{\hat{\beta}_j - \beta_H}{\sqrt{Var[\hat{\beta}_j]}} \sim N(0, 1)$

$t = \dfrac{\hat{\beta}_1 - \beta_H}{s.e.(\hat{\beta}_1)} \sim t_{n-k-1}$

$\Pr[\hat{\beta}_j - t^*_{\alpha/2} \cdot s.e.(\hat{\beta}_j) \le \beta_j \le \hat{\beta}_j + t^*_{\alpha/2} \cdot s.e.(\hat{\beta}_j)] = 1 - \alpha$

$d = \dfrac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \approx 2(1 - \rho)$

$F = \dfrac{ESS / k}{RSS / (n - k - 1)} = \dfrac{ESS\,(n - k - 1)}{RSS \cdot k}$
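The regression formulas on this sheet can be checked numerically. The following is a minimal Python sketch, not part of the original exam: the function name `simple_ols_summary` and the five data points are hypothetical and chosen only for illustration. It applies the sheet's expressions for the slope, intercept, R², adjusted R², the standard error of the slope, the F statistic, and the Durbin-Watson d to a one-regressor model (k = 1).

```python
from math import sqrt


def simple_ols_summary(x, y):
    """Apply the formula-sheet expressions to a one-regressor model (k = 1).

    Hypothetical helper for illustration only.
    """
    n, k = len(x), 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Slope and intercept: beta1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residuals and sums of squares
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rss = sum(e * e for e in resid)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ess = tss - rss

    r2 = 1 - rss / tss
    adj_r2 = 1 - (rss / (n - k - 1)) / (tss / (n - 1))

    # s^2 = RSS / (n - k - 1); estimated Var(beta1) = s^2 / sum((Xi - Xbar)^2)
    s2 = rss / (n - k - 1)
    se_b1 = sqrt(s2 / sxx)

    # F = (ESS / k) / (RSS / (n - k - 1))
    f_stat = (ess / k) / (rss / (n - k - 1))

    # Durbin-Watson d = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss

    return {"b0": b0, "b1": b1, "r2": r2, "adj_r2": adj_r2,
            "se_b1": se_b1, "f": f_stat, "dw": dw}


# Made-up data, not taken from the exam
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
out = simple_ols_summary(x, y)
for name, value in out.items():
    print(f"{name}: {value:.4f}")
```

With a single regressor, the F statistic equals the square of the t statistic for the slope under the null that the slope is zero, which gives a quick consistency check on the two formulas.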
