Example Questions for AAE637 Exam

1. In the 1970s, there was an increase in the number of statistical analyses focused on the factors of production in the aggregate U.S. manufacturing sector. Two of the more important pieces of research were undertaken by Berndt and Wood (1975, 1979)¹, who collected measures of aggregate U.S. manufacturing output as well as indices of the use of capital (K), labor (L), and other material (M) inputs over the 1947-1971 period. Although they used a different model in the above work, you have obtained their data and, as a first pass, you decide to estimate the following Cobb-Douglas production function:

(1.1) $Y = \alpha M^{\beta_1} K^{\beta_2} L^{\beta_3} \exp(\varepsilon)$

Table 1 provides a partial listing of the output you obtain from estimating the above model after you transform it in the usual manner to enable you to use CRM estimation techniques. Again assume that all CRM assumptions hold.

(a) (10 pts) In Table 1 there are 12 pieces of information not displayed. Using the information provided, provide estimates of the missing information.
(b) (5 pts) Let's refer to the naïve model of production as one where you assume that $\ln(Y_t)$ would equal its sample mean, $\overline{\ln(Y)}$, for all observations regardless of the level of input use. Undertake a formal hypothesis test that, when compared to this naïve model, the estimation of (1.1) generates a statistically significant increase in explanatory power.
(c) (5 pts) Test the null hypothesis that the sum of the materials (M) and labor (L) output elasticities equals 1.0. What are the null and alternate hypotheses and what is the result of your hypothesis test?

¹ Berndt, E.R. and D.O. Wood, 1975. "Technology, Prices and the Derived Demand for Energy", Review of Economics and Statistics, 57:3, August, 259-268; Berndt, E.R. and D.O. Wood, 1979. "Engineering and Econometric Interpretation of Energy-Capital Complementarity", American Economic Review, 69:3, June, 342-354.

Table 1. Summary of Regression Results

Regression Statistics
Rows, in order: No. of Obs, DF, SSE, TSS, R2, Adjusted R2, $\sigma^2_U$, $\sigma_U$
Values displayed, in order (all other rows are blank): 25, 0.997755, 0.014493

Variable         Estimate   Std. Error   T-Value
Intercept        6.8232                  48.82
LN(Materials)    0.8292     0.0777
LN(Capital)      0.0524
LN(Labor)        0.3414     0.1090       3.13

Partial Listing of the Elements of (X'X)^-1 (1st row and column missing)

             Materials   Capital   Labor
Materials    28.770
Capital      -6.767      7.456
Labor        -34.659     -0.690    56.609

2. Suppose you and your grandmother are both interested in how the sale price of a house is affected by its distance to the nearest WAL-MART. You both collect data and estimate regression models under all of the CRM assumptions. Your grandmother, having a strong econometrics background, estimates the following model:

(2.1) yi = β0 + β1 xi + εi

where
yi = sale price of the house, measured in dollars
xi = distance to the nearest WAL-MART, measured in miles

Using the same data, you estimate the following model:

(2.2) yi* = α0 + α1 xi* + υi

where
yi* = sale price of the house, measured in thousands of dollars
xi* = distance to the nearest WAL-MART, measured in kilometers (1 km = 0.62 miles)

Because you are a good econometrician, you observe that yi* = yi/1000 and xi* = xi/0.62.

(a) You want to compare your estimates. What is the mathematical relationship between β0 and α0? Between β1 and α1? Between V(εi) and V(υi)?
(b) Compare the estimate of the elasticity of price with respect to distance obtained from your regression model with the elasticity estimate obtained from your grandmother's model. Give some brief intuition behind this result.

3. The regression equation $Y = \beta_1 + \beta_2 X + \varepsilon$ was estimated using 80 cross-sectional observations on countries via classical regression techniques. To check for heteroskedasticity related to population, separate regressions were run for the 32 countries with the lowest populations and the 32 countries with the highest populations. The sum of squared residuals for the low-population countries was 240.
The sum of squared residuals for the high-population countries was 90.
(a) Compute unbiased estimates of the variance of the error term in the two subsamples.
(b) Given these results, which subsample appears to lie closer to the true regression line: the low-population countries or the high-population countries? Explain your answer.
(c) Test the null hypothesis of homoskedasticity against the (one-sided) alternative hypothesis that high-population countries have higher error variance, at 5% significance, using a Goldfeld-Quandt test. Give the value of the test statistic, its distribution under the null hypothesis, the critical point, and your conclusion (accept or reject the null hypothesis of homoskedasticity).
(d) Suppose you believe that heteroskedasticity is indeed present and that the variance of the error term is inversely proportional to population (i.e., $\mathrm{Var}(\varepsilon_i) = \theta / POP_i$, where $\theta$ is an unknown constant and $POP_i$ is the population of the ith country). Provide the formulas necessary to transform the data so as to ensure that the transformed error terms satisfy the classical assumptions of the linear model.
(e) Suppose the first observation in the raw data were as shown below:

Obs(i)   Xi   Yi   POPi
1        50   60   100

Use the formulas you gave in part (d) to compute the first observation of the transformed data.

4. Assume you have a sample of observed values of a random variable, yi (i = 1, …, T), which are distributed iid.
(a) Assume that yi has the following pdf, f(yi): f(y t )=β Xβt yβt . Find the maximum likelihood estimator given your sample of T observations.
(b) Assume that yi has the following pdf, f(yi): $f(y) = \lambda^{y} e^{-\lambda} / y!$, $y = 0, 1, 2, 3, \ldots$ That is, y is a Poisson random variable, which has the property that its mean and variance both equal the parameter λ. Find the maximum likelihood estimator of λ. Find the mean and variance of the maximum likelihood estimator of λ. Is this estimator consistent?

5.
You have been hired as a consultant by a firm that manages a large number of egg-laying operations across the U.S. They are interested in obtaining a better understanding of egg consumption patterns in the U.S. To that end, you collect monthly data and formulate the following model, where Egg_t represents the per capita number of eggs consumed in month t and Bacon_t the per capita pounds of bacon consumed:

Egg_t = α + β Bacon_t + ε_t

where α and β are unknown coefficients and ε_t is an error term, with ε ~ N(0, σ² I_T). You suspect there may be a complicating factor determining egg consumption: Easter, which occurs in April, increases the consumption of eggs. To recognize this effect you define a dummy variable, April_t, which is 1 if month t is April and 0 otherwise. To analyze consumption patterns, you are given the following set of results, which all come from the same data set consisting of n = 81 observations (standard errors of the estimated coefficients are in parentheses). Unfortunately, your GAUSS coding is not what it should be and some pieces of information were not printed out. Using the information that did print, you should be able to answer the following questions.

Regression Results from Alternative Models of U.S. Egg Consumption

Variable             Model I        Model II       Model III
Intercept            30.69 (4.46)   18.89 (XXX)    21.03 (3.76)
Bacon_t              2.15 (XXX)     2.15 (0.08)    2.08 (XXX)
April_t                             23.60 (XXX)    19.33 (5.32)
April_t × Bacon_t                                  0.14 (XXX)
RSS                  53,230         XXXXX          XXXXX
R2                   0.7519         0.9132         0.9140

a. Using Model III, sketch two regression lines showing the relationship between egg and bacon consumption, one for April and one for December. Make sure you explicitly identify the values of the slope and intercept for these two months.
b. Test at the 0.05 level the hypothesis that the intercept, the slope, or both are different in April than in other months. Write out the null and alternate hypotheses.
c. Test at the 0.05 level the hypothesis that the intercept is different in April, given that the slope is the same in April as in non-April months. Write out the null and alternate hypotheses.
d. Test at the 0.05 level the hypothesis that the slope changes in April, given that the intercept is different. Write out the null and alternate hypotheses.
e. Using the information provided for Model I, test at the 0.05 level the hypothesis that bacon consumption has a significant impact on egg consumption. Write out your null and alternate hypotheses. (Hint: What is your restricted model? Look at Question 1(c).)
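
Parts b through e of Question 5 each compare two nested models with the same dependent variable, so one way to approach them is the R2 form of the F statistic. The following is a minimal Python sketch, not an answer key. It assumes the results table reads as Model I (intercept and bacon), Model II (adds the April dummy), and Model III (adds the April × bacon interaction), with the three R2 values listed in that order. The function name `f_stat_from_r2` is hypothetical. Each statistic still has to be compared with the 5% critical value of the F distribution with (J, n − K) degrees of freedom, taken from a table.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n_obs, k_unrestricted, n_restrictions):
    """R-squared form of the F statistic for J linear restrictions.

    F = [(R2_U - R2_R) / J] / [(1 - R2_U) / (n - K)]
    Valid only when both models share the same dependent variable
    and both include an intercept (so they share the same TSS).
    """
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1.0 - r2_unrestricted) / (n_obs - k_unrestricted)
    return numerator / denominator


N_OBS = 81
# R2 values as reported in the results table (assumed column order I, II, III).
R2 = {"I": 0.7519, "II": 0.9132, "III": 0.9140}

# Part b: intercept and slope shifts jointly (Model III vs Model I), J = 2, K = 4.
f_b = f_stat_from_r2(R2["III"], R2["I"], N_OBS, 4, 2)
# Part c: intercept shift only (Model II vs Model I), J = 1, K = 3.
f_c = f_stat_from_r2(R2["II"], R2["I"], N_OBS, 3, 1)
# Part d: slope shift given an intercept shift (Model III vs Model II), J = 1, K = 4.
f_d = f_stat_from_r2(R2["III"], R2["II"], N_OBS, 4, 1)
# Part e: Model I vs an intercept-only model (whose R2 is 0), J = 1, K = 2.
f_e = f_stat_from_r2(R2["I"], 0.0, N_OBS, 2, 1)

for label, value in [("b", f_b), ("c", f_c), ("d", f_d), ("e", f_e)]:
    print(f"Part {label}: F = {value:.2f}")
```

Parts c, d, and e each involve a single restriction, so where a coefficient's standard error is available a two-sided t test on that coefficient gives the same conclusion (F equals the square of the t ratio).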