ECN 4231: INTRODUCTION TO ECONOMETRICS
PRACTICE QUESTIONS

Question 1

You have data on the natural log of hourly wages (LWAGE); age, measured in years (AGE); a dummy variable (FEMALE) that takes the value 1 if female and 0 otherwise; the product of FEMALE and AGE (FEMALE*AGE); years of work in the current job (TENURE); and its square (TENURE2). You estimate the following regressions:

i. After how many years of work in the current job will the (log of) wages be maximised?
ii. Test the hypothesis that the coefficients on TENURE and TENURE2 are jointly significant in the model.
iii. What would be the effect on the OLS estimates in equation (1) of omitting the variable FEMALE from the regression?
iv. Outline how you would test the hypothesis that the specification of the variables on the right-hand side of (1) is correct.

Question 2

i. What do you understand by the term autocorrelation?
ii. What can cause it?
iii. What are the consequences for OLS estimation?
iv. Given the following information from a regression of the model

$\text{Investment}_t = \beta_0 + \beta_1 \text{Investment}_{t-1} + \beta_2 \text{GDP}_t + \beta_3 \text{Interest}_t + u_t$

test for the presence of first-order autocorrelation in the residuals.
v. Which of the test statistics do you prefer, and why?
vi. Outline the Feasible GLS solution to the problem of autocorrelation.

Question 3

You have quarterly data on the level of national income, $Y_t$, and consumption, $C_t$, for the period 1950-1999 and fit the following:

(1) An ordinary least squares (OLS) regression of $Y_t$ on $C_t$
(2) An OLS regression of $Y_t$, $C_t$ and C lagged by one year, $C_{t-1}$

The table gives the estimated regression coefficients. Standard errors are given in brackets.

i. What is autocorrelation, what might cause autocorrelation, and what are the consequences for OLS estimation?
ii. Interpret the estimates in column 1 and explain why you might be dissatisfied with the first equation.
iii. Has the specification in (2) solved the problem? Why might you be suspicious of this equation?
iv. Outline the form of a test for autocorrelation that is not affected by this problem.

Question 4

i. What do you understand by the term multicollinearity?
ii. What is the consequence of multicollinearity for OLS estimation?
iii. How would you detect the presence of multicollinearity?

Question 5

Suppose that Miami is interested in understanding what determines the probability that a student accepted for enrolment chooses to attend Miami. The econometric model they have estimated is as follows:

$\text{attend}_i = \beta_0 + \beta_1 \text{AID}_i + \beta_3 \text{ACT}_i + \beta_4 \text{Faminc}_i + u_i$

where the subscript i indexes student i, attend is a dummy variable that equals one if the student chooses to attend Miami, ACT is the student’s ACT score, and Faminc is the student’s reported family income.

Suppose Miami wants to know if financial aid (AID) has the same effect on the attendance decisions of families with different family incomes.

i. Write out a model that allows a test of whether there is a differential effect of AID depending on family income (e.g. $\text{attend}_i = \beta_0 + \beta_1 \text{AID}_i + \cdots$). What coefficient(s) in your model can be used to test whether the effect of AID varies with family income? If students with lower family income are more sensitive to the financial aid offer, what should be true about the coefficient(s) in your estimated model?

Question 6

Refer to this table:

Female: dummy variable that equals one if the student is female; 0 otherwise.
Resident: dummy variable that equals one if the student is a resident of Ohio; 0 otherwise.
High School GPA: the student’s grade point average in high school.
ACT Score: the student’s composite ACT score.

Use specification (1) to answer the questions below.

i. Construct a 95% confidence interval for the coefficient on the Ohio resident dummy variable.
ii. What is $R^2$ in specification (1)?
iii. Based on specification (1) in table 1, construct a t-statistic for the null hypothesis that Ohio residents and non-residents earn the same grades.
iv. Using specification (2), what is the predicted course grade for a non-resident female student who had a 3.0 high school GPA and a 25 on the ACT?
v. In specification (4), the number of semesters of college experience and its square are added to the regression. Use the information in the table to construct an F-statistic for the null hypothesis that the coefficients on experience and experience2 are jointly equal to zero.

For the next several questions, refer to table 2. This uses the same grade data as table 1 but adds dummy variables for the subject in which the grade was earned. Dummy variables are included for 5 subjects, and all other subject areas (i.e. those not among the 5 subjects listed) are lumped into a single category referred to as “other subjects”. The dummy for other subjects is the omitted dummy, so other subjects are the reference group.

ii. Calculate the t-statistic for the null hypothesis that grades in Economics are the same as those in the reference group (“other subjects”).
iii. If you were to test the null hypothesis that grades are equal across course subject areas (economics, finance, marketing, math, accounting, and “all other subjects”), what would be the precise test command you would use in Stata?

Question 7

Use the regressions below to answer the questions that follow. Note that the F-test provided at the top of each regression (e.g. 811.38 in the first regression) tests the null hypothesis that all coefficients, other than the intercept, are equal to zero.

In the regressions below, I provide estimates from a regression of daily minutes spent sleeping (sleep) on a person’s age, the square of age, a dummy variable for female, and a dummy variable indicating whether the person is employed. This regression was then followed by several additional regressions.

Regressions using ATUS data on minutes spent sleeping.

i. What is the predicted daily minutes of sleep for a male who is 20 years old and not employed?
ii. Holding age and sex constant, how many more or fewer minutes is an employed person predicted to sleep than a non-employed person?
iii. The LM statistic for the Breusch-Pagan test for heteroscedasticity is: ___. What can you conclude?
iv. The LM statistic for the simplified White test for heteroscedasticity is: ___. What can you conclude?

Question 8

An economics department at a large state university keeps track of its majors’ starting salaries. Does taking econometrics affect starting salary? Let SAL = salary in dollars, GPA = grade point average on a 4.0 scale, METRICS = 1 if the student took econometrics, and METRICS = 0 otherwise. Using the data file metrics.dat, which contains information on 50 recent graduates, we obtain the estimated regression:

ii. How would you modify the equation to see whether women had lower starting salaries than men?
iii. How would you modify the equation to see if the value of econometrics was the same for men and women?

Question 9

Suppose you collect data from a survey on wages, education, experience, and gender. In addition, you ask for information about marijuana usage. The original question is: “On how many separate occasions last month did you smoke marijuana?”

i. Write an equation that would allow you to estimate the effects of marijuana usage on wage, while controlling for other factors. You should be able to make statements such as, “Smoking marijuana five more times per month is estimated to change wage by x%.”
ii. Write a model that would allow you to test whether drug usage has different effects on wages for men and women. How would you test that there are no differences in the effects of drug usage for men and women?
iii. Suppose you think it is better to measure marijuana usage by putting people into one of four categories: nonuser, light user (1 to 5 times per month), moderate user (6 to 10 times per month), and heavy user (more than 10 times per month). Now, write a model that allows you to estimate the effects of marijuana usage on wage.
iv. Using the model in part (iii), explain in detail how to test the null hypothesis that marijuana usage has no effect on wage. Be very specific and include a careful listing of degrees of freedom.

Question 10

Using the data in GPA2.RAW, the following equation was estimated:

The variable sat is the combined SAT score, hsize is the size of the student’s high school graduating class, in hundreds, female is a gender dummy variable, and black is a race dummy variable equal to one for blacks and zero otherwise.

i. Is there strong evidence that hsize2 should be included in the model? From this equation, what is the optimal high school size?
ii. Holding hsize fixed, what is the estimated difference in SAT score between nonblack females and nonblack males? How statistically significant is this estimated difference?
iii. What is the estimated difference in SAT score between nonblack males and black males? Test the null hypothesis that there is no difference between their scores, against the alternative that there is a difference.
iv. What is the estimated difference in SAT score between black females and nonblack females? What would you need to do to test whether the difference is statistically significant?