Senior Seminar
Assignment 4
Due April 2, 2007

Finding and Detecting Heteroskedasticity

The purpose of this exercise is to reinforce our data manipulation skills, to run simple regressions, and to learn to deal with heteroskedasticity. Be sure to include a constant term in all regressions.

1. Load the data file Manufactures, which I will send you. It contains data from the U.S. Census Bureau's Annual Survey of Manufactures on 455 industries in 1994 for the following variables: shipments (value of output shipped), materials (value of materials inputs used in production), newcap (expenditure on new capital by this industry), inventory (value of inventories held), managers (number of supervisory workers employed), and workers (number of production workers employed). The first four variables are measured in thousands of dollars.

A. Estimate the regression:

    shipments = f(materials, newcap, inventory, managers, workers)

If a firm employs one more manager, how much do shipments rise? If a firm employs one more production worker, how much do shipments rise? Does this match your intuition about the salaries earned by managers and production workers?

B. Perform the White test, with no cross terms, for heteroskedasticity of unknown form. How many right-hand-side variables does the test regression have? What is the test statistic, and what is the critical value? Does this test find heteroskedasticity or not?

C. Now perform the White test with cross terms. How many right-hand-side variables does the test regression have? What is the test statistic, and what is the critical value? Does this test find heteroskedasticity or not? Does it match your answer in part B?

D. Reestimate the equation, using heteroskedasticity-consistent standard errors. How much do the standard errors change? How many variables change their reported statistical significance as a result?

E. Let's assume that the heteroskedasticity depends upon materials.
What is the median value of the materials variable? Estimate the regression using only the data points where materials has a value less than 90% of the median value. How many observations are used in the regression? What is the sum of squared residuals? Now estimate the regression using only the data points where materials has a value greater than 110% of the median value. How many observations are used, and what is the sum of squared residuals?

F. Calculate the Goldfeld-Quandt test statistic. What value do you get? What is the critical value? Does this test detect heteroskedasticity or not?
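
As a rough illustration of the mechanics behind parts B, E, and F, here is a minimal Python sketch using only the standard library. It is not the required method for the assignment and does not use the Manufactures file: the data are synthetic, only two regressors are used to keep it short, and the helper names (ols, ssr) and the data-generating numbers are assumptions made for this example. The White statistic is computed as n times the R-squared of the auxiliary regression of squared residuals on the regressors and their squares; the Goldfeld-Quandt statistic is the ratio of the two subsample residual variances.

```python
import random
import statistics


def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, residuals)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    return beta, resid


def ssr(resid):
    """Sum of squared residuals."""
    return sum(e * e for e in resid)


# Synthetic data (an assumption for illustration): the error standard
# deviation grows with materials, so the errors are heteroskedastic.
random.seed(0)
n = 200
materials = [random.uniform(1, 10) for _ in range(n)]
workers = [random.uniform(1, 10) for _ in range(n)]
shipments = [10 + 2 * m + 3 * w + random.gauss(0, 0.5 * m)
             for m, w in zip(materials, workers)]

# Main regression, with a constant term
X = [[1.0, m, w] for m, w in zip(materials, workers)]
beta, resid = ols(X, shipments)

# White test, no cross terms: regress squared residuals on a constant,
# the regressors, and their squares; the statistic is n * R-squared.
e2 = [e * e for e in resid]
Z = [[1.0, m, w, m * m, w * w] for m, w in zip(materials, workers)]
_, aux_resid = ols(Z, e2)
mean_e2 = statistics.mean(e2)
r2 = 1 - ssr(aux_resid) / sum((v - mean_e2) ** 2 for v in e2)
white_stat = n * r2
white_df = len(Z[0]) - 1  # auxiliary regressors, excluding the constant

# Goldfeld-Quandt: fit the same regression separately on the observations
# below 90% and above 110% of the median of materials.
med = statistics.median(materials)
low = [(row, yi) for row, yi in zip(X, shipments) if row[1] < 0.9 * med]
high = [(row, yi) for row, yi in zip(X, shipments) if row[1] > 1.1 * med]
k = len(X[0])
_, res_low = ols([row for row, _ in low], [yi for _, yi in low])
_, res_high = ols([row for row, _ in high], [yi for _, yi in high])
df_low, df_high = len(low) - k, len(high) - k
gq_stat = (ssr(res_high) / df_high) / (ssr(res_low) / df_low)

print("White statistic:", white_stat, "with", white_df, "degrees of freedom")
print("Goldfeld-Quandt statistic:", gq_stat, "with", (df_high, df_low), "degrees of freedom")
```

The White statistic is compared with a chi-square critical value whose degrees of freedom equal the number of auxiliary regressors excluding the constant (4 in this sketch, for which the 5% critical value is 9.488). The Goldfeld-Quandt statistic is compared with an F critical value with (df_high, df_low) degrees of freedom, taken from a table or from your statistics package.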