Given name:____________________ Student #:______________________
Family name:___________________ Section #:______________________

BUEC 333 MIDTERM

Multiple Choice (2 points each)

1.) The Gauss-Markov Theorem says that when the 6 classical assumptions are satisfied:
a.) The least squares estimator is unbiased
b.) The least squares estimator has the smallest variance of all linear unbiased estimators
c.) The least squares estimator has an approximately normal sampling distribution
d.) The least squares estimator is consistent
e.) None of the above

2.) Which of the following is not a linear regression model:
a.) Yi X i X i2 i 2
b.) Yi cos( X i ) exp( X i ) i
c.) $\log(Y_i) = \beta_0 + \beta_1 \log(X_i) + \varepsilon_i$
d.) $Y_i = \beta_0 + \beta_1 \log(X_i) + \varepsilon_i$
e.) none of the above

3.) The distribution of X when Y is known is called the _____ distribution of X, and is written as _____. These blanks are best filled with the following:
a.) conditional, p(X)
b.) conditional, p(X|Y)
c.) marginal, p(X)
d.) marginal, p(X|Y)
e.) none of the above

4.) In the linear regression model, the degrees of freedom
a.) affects the precision of the coefficient estimates
b.) is equal to the number of observations (n) minus 1
c.) affects the value of the coefficient estimates
d.) all of the above
e.) none of the above

5.) The power of a test statistic should become larger as the
a.) probability of a type II error becomes smaller
b.) null becomes closer to being true
c.) significance level becomes larger
d.) sample size becomes larger
e.) none of the above

6.) The central limit theorem tells us that the sampling distribution of the sample mean:
a.) is always normal
b.) is always normal in large samples
c.) approaches normality as the sample size increases
d.) is normal in Monte Carlo simulations
e.) none of the above

7.) Suppose [L(X), U(X)] is a 90% confidence interval for a population mean. Which of the following is/are true?
a.) Pr L X U X 0.90
b.) Pr L X U X 0.90
c.) Pr L X Pr U X 0.10
d.) a and c
e.) none of the above

8.) The sampling variance of the slope coefficient in the regression model with one independent variable:
a.) will be smaller when there is less variation in ε
b.) will be larger when there is less variation in ε
c.) will be smaller when there is less variation in X
d.) will be larger when there is less co-variation in ε and X
e.) none of the above

9.) The central limit theorem tells us that the sampling distribution of the least squares regression coefficient:
a.) is always normal
b.) is always normal in large samples
c.) approaches a uniform distribution as the sample size increases
d.) is normal in Monte Carlo simulations
e.) none of the above

10.) In order for our independent variables to be labelled "exogenous", which of the following must be true:
a.) E(εi) = 0
b.) Cov(Yi,εi) = 0
c.) Cov(εi,εj) = 0
d.) Var(εi) = σ²
e.) none of the above

11.) Which of the following statements is false regarding the Central Limit Theorem:
a.) when the sample size is large, the mean of X-bar is approximately equal to the mean of X.
b.) when the sample size is large, X-bar is approximately normally distributed.
c.) when the sample size is large, the standard deviation of X-bar is approximately the same as the standard deviation of X.
d.) all of the above
e.) none of the above

12.) If the covariance between two random variables X and Y is zero then
a.) X and Y are independent
b.) Knowing the value of X provides no information about the value of Y
c.) E(X) = E(Y) = 0
d.) a and b are true
e.) none of the above

13.) If two random variables X and Y are independent,
a.) their joint distribution equals the product of their conditional distributions
b.) the conditional distribution of X given Y equals the joint distribution of X
c.) their covariance is zero
d.) a and c
e.) a, b, and c

14.) If a random variable X has a normal distribution with mean μ and variance σ² then:
a.) X takes positive values only
b.) $(X - \mu)/\sigma^2$ has a standard normal distribution
c.) $(\bar{X} - \mu)/(s/\sqrt{n})$ has a t distribution with n-1 degrees of freedom
d.) $(X - \mu)^2/\sigma^2$ has a chi-squared distribution with n degrees of freedom
e.) none of the above

15.) Suppose you want to test the following hypothesis at the 5% level of significance:
H0: μ = μ0
H1: μ ≠ μ0
Which of the following is/are true?
a.) the probability of a Type I error is 0.05
b.) the probability of a Type I error is 0.025
c.) the t statistic for this test has a t distribution with n degrees of freedom
d.) a and c
e.) b and c

Short Answer #1 (10 points – show your work!)

Consider the case of a uniformly distributed random variable where each outcome (1, 2, 3, 4) has an equal chance of occurring. It can be easily shown that the population mean and variance of this random variable are 2.50 and 1.25, respectively.

a.) Suppose that a random number generator provides the following sequence of numbers, 2-1-4-1. What are the mean and variance of this sample?
b.) What is the sampling distribution of the sample mean calculated above? Provide a verbal interpretation of the sampling distribution of the sample mean.
c.) Compute the value of the t-statistic for testing the null hypothesis that μ = 2.5. Hint: the square root of two is equal to 1.4 when rounded to the first decimal place.
d.) The critical value for a t distribution with 3 degrees of freedom and a 0.20 level of significance in the presence of a two-sided alternative is equal to 1.638. Can you reject the null hypothesis that μ = 2.5 at the 20% level of significance? What about at the 10% level of significance?

Page intentionally left blank. Use this space for rough work or the continuation of an answer.

Short Answer #2 (20 points – show your work!)

A researcher is using data for a sample of 1,000 male wage-earners to investigate the relationship between hourly wage rates, Yi (measured in dollars per hour), and length of work experience with a particular firm, Xi (measured in years).
Analysis of the data in Excel produces the following sample information:

$n = 1{,}000 \qquad \sum_i Y_i = 12{,}500 \qquad \sum_i X_i = 3{,}500$
$\mathrm{Cov}(X_i, Y_i) = 36 \qquad \mathrm{Var}(X_i) = 60 \qquad \mathrm{Var}(Y_i) = 60 \qquad \sum_i e_i^2 = 15{,}000$

Use the information above to answer the following questions. Show all formulas and calculations, using the following approximation: n ≈ (n - 1) ≈ (n - 2).

a.) What are the OLS estimates of the constant term (β0) and the slope coefficient (β1)?
b.) Interpret the estimate of the slope coefficient you calculated in part a.).
c.) Calculate an estimate of the variance of the error term in the population regression model.
d.) Calculate an estimate of the variance of the estimated slope coefficient.
e.) Compute the value of R² and briefly explain what the calculated value of R² means.

Short Answer #3 (20 points – show your work!)

Consider the standard univariate population regression model:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

Assume that all of the classical assumptions are satisfied. Show that the OLS estimator $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.

Hint: you can make use of the fact that $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ implies $\bar{Y} = \beta_0 + \beta_1 \bar{X} + \bar{\varepsilon}$.

Short Answer #4 (20 points – show your work!)

You wish to determine if the application of fertilizer and water affect plant growth. To that end, you run an experiment where you randomly apply different amounts of fertilizer and water to your hemp plants. You then use regression analysis to determine how they affect the yield of a hemp plant in grams. Fertilizer is measured in kilograms and ranges in value from 0.0 to 1.0, while water is measured in liters per week. The standard errors of the regression coefficients are reported in parentheses. You get the following results:

$\mathit{Yield}_i = \underset{(0.4)}{12.1} + \underset{(0.5)}{5.5} \cdot \mathit{Fertilizer}_i + \underset{(2.7)}{12} \cdot \mathit{Water}_i$

$n = 140 \qquad RSS = 1234 \qquad R^2 = 0.76$

a.) Do you think this type of analysis will give you an unbiased estimate of how much adding fertilizer increases your crop yield? Why or why not?
b.) How do you interpret the constant in this case? Explain why, in most instances, we ignore such results.
c.) How much does fertilizer increase plant growth?
d.) What is the regression's predicted yield for a plant exposed to 50 kilograms of fertilizer? Do you think this prediction is reliable? Why or why not?

Useful Formulas:

$E(X) = \sum_{i=1}^{k} p_i x_i$

$\mathrm{Var}(X) = E\left[(X - \mu_X)^2\right] = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i$

$\Pr(X = x) = \sum_{i=1}^{k} \Pr(X = x, Y = y_i)$

$\Pr(Y = y \mid X = x) = \dfrac{\Pr(X = x, Y = y)}{\Pr(X = x)}$

$E(Y \mid X = x) = \sum_{i=1}^{k} y_i \Pr(Y = y_i \mid X = x)$

$E(Y) = \sum_{i=1}^{m} E(Y \mid X = x_i) \Pr(X = x_i)$

$\mathrm{Var}(Y \mid X = x) = \sum_{i=1}^{k} \left[y_i - E(Y \mid X = x)\right]^2 \Pr(Y = y_i \mid X = x)$

$E(a + bX + cY) = a + bE(X) + cE(Y)$

$\mathrm{Cov}(X, Y) = \sum_{i=1}^{k} \sum_{j=1}^{m} (x_j - \mu_X)(y_i - \mu_Y) \Pr(X = x_j, Y = y_i)$

$\mathrm{Corr}(X, Y) = \rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$

$\mathrm{Cov}(a + bX + cV, Y) = b\,\mathrm{Cov}(X, Y) + c\,\mathrm{Cov}(V, Y)$

$E(XY) = \mathrm{Cov}(X, Y) + E(X)E(Y)$

$\mathrm{Var}(a + bY) = b^2 \mathrm{Var}(Y)$

$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$

$E(Y^2) = \mathrm{Var}(Y) + \left[E(Y)\right]^2$

$\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$

$s^2 = \dfrac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$

$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

$t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$

$s_{XY} = \dfrac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

$r_{XY} = s_{XY} / (s_X s_Y)$

For the linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$,

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$ & $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$

$R^2 = \dfrac{ESS}{TSS} = \dfrac{TSS - RSS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\sum_i e_i^2}{\sum_i (Y_i - \bar{Y})^2}$

$\bar{R}^2 = 1 - \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (Y_i - \bar{Y})^2 / (n - 1)}$

$s^2 = \dfrac{\sum_i e_i^2}{n - k - 1}$ where $E\left[s^2\right] = \sigma^2$

$\widehat{\mathrm{Var}}\left[\hat{\beta}_1\right] = \dfrac{\sum_i e_i^2 / (n - k - 1)}{\sum_i (X_i - \bar{X})^2}$

$Z = \dfrac{\hat{\beta}_j - \beta_H}{\sqrt{\mathrm{Var}[\hat{\beta}_j]}} \sim N(0, 1)$

$\Pr\left[\hat{\beta}_j - t^*_{\alpha/2}\, \mathrm{s.e.}(\hat{\beta}_j) \le \beta_j \le \hat{\beta}_j + t^*_{\alpha/2}\, \mathrm{s.e.}(\hat{\beta}_j)\right] = 1 - \alpha$

$t = \dfrac{\hat{\beta}_1 - \beta_H}{\mathrm{s.e.}(\hat{\beta}_1)} \sim t_{n - k - 1}$

$F = \dfrac{ESS / k}{RSS / (n - k - 1)} = \dfrac{ESS}{RSS} \cdot \dfrac{(n - k - 1)}{k}$

$d = \dfrac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \approx 2(1 - \rho)$
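
The one-regressor formulas on this sheet can be checked numerically. The following is a minimal Python sketch and not part of the exam: the function name `ols_summary` and the five data points are hypothetical, chosen only for illustration, and k = 1 is assumed throughout.

```python
# Illustrative sketch only: the data are hypothetical and are NOT taken
# from any question on this exam. One regressor is assumed (k = 1).

def ols_summary(x, y):
    """Apply the one-regressor formulas from the formula sheet."""
    n = len(x)
    k = 1  # number of slope coefficients (assumption: a single regressor)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products around the sample means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in residuals)        # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares

    s2 = rss / (n - k - 1)      # estimate of the error variance
    var_b1 = s2 / sxx           # estimated variance of the slope estimate
    r2 = 1 - rss / tss          # R-squared

    return {"b0": b0, "b1": b1, "s2": s2, "var_b1": var_b1, "r2": r2}


# Hypothetical data, used only to exercise the formulas
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
result = ols_summary(x, y)
print(result)
```

For these made-up points the slope is about 0.6, the intercept about 2.2, the error-variance estimate about 0.8, the estimated variance of the slope about 0.08, and R² about 0.6. The same steps apply when only summary statistics are given, since the sums of squares can be recovered from sample variances and covariances multiplied by n - 1.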