ECO-2400: Econometrics
Assignment 3
Anisha Sharma
anisha.sharma@ashoka.edu
Spring 2023

Instructions

1. This assignment is due at 2359 hours on March 12, 2023.
2. Assignments are to be done individually. Plagiarism or copying will be dealt with seriously.
3. Upload your answers to Google Classroom in two parts. First, upload one pdf that contains the full set of answers, including whatever you have written in a word processor as well as extracts of results from Stata (these can be cut and pasted from the log file). Second, upload the full log file. If either file is missing, your assignment cannot be graded.
4. Both files (pdf and log) should be saved as “Firstname Lastname A3”.

PART A

To download the dataset required for this part, first install the bcuse command by typing “ssc install bcuse” into Stata. Once it is installed, you can load a dataset with “bcuse filename, clear”. Answer the following questions based on this dataset:

1. Open a log file to save your work and results.
2. The dataset we will use is htv.dta. You can access it by typing “bcuse htv, clear” (you will need a working internet connection). Open the dataset on your computer.
3. What is the range of the educ variable in the sample? What percentage of men completed 12th grade but no higher? Do the men or their parents have, on average, higher levels of education? (Use “summ” and “tab”.)
4. Estimate the regression model $\mathit{educ} = \beta_0 + \beta_1 \mathit{motheduc} + \beta_2 \mathit{fatheduc} + u$ by OLS. How much of the sample variation in educ is explained by parents’ education? Interpret the coefficient on motheduc and comment on its significance.
5. You have been told that the exclusion of ability could generate biased estimates of $\beta_1$ and $\beta_2$. What is the likely direction of the bias?
6. Add the variable abil (a measure of cognitive ability) to the regression from part (4). Does ability help to explain variations in education, even after controlling for parents’ education? Explain.
7. Now estimate an equation where abil appears in quadratic form: $\mathit{educ} = \beta_0 + \beta_1 \mathit{motheduc} + \beta_2 \mathit{fatheduc} + \beta_3 \mathit{abil} + \beta_4 \mathit{abil}^2 + u$. Using the estimates $\hat{\beta}_3$ and $\hat{\beta}_4$, find the value of abil, call it abil*, where educ is minimized, holding all other variables constant.
8. Test the null hypothesis that educ is linearly related to abil against the alternative that the relationship is quadratic.
9. Test $H_0: \beta_1 = \beta_2$ against a two-sided alternative. What is the p-value of the test?
10. Add the two college tuition variables (tuit17 and tuit18) to the regression from part (7) and determine whether they are jointly statistically significant.
11. What is the correlation between tuit17 and tuit18 (use “corr”)? Explain why using the average of the tuition over the two years might be preferred to adding each separately. What happens when you do use the average?
12. Do the findings for the average tuition variable make sense when interpreted causally? What might be going on?
13. How many different values are taken on by educ in the sample? Does educ have a continuous distribution?
14. Plot a histogram of educ with a normal distribution overlay. Does the distribution of educ appear normal? (Use “histogram”.)
15. Which of the CLM assumptions (if any) do you think is violated in the model? How would this violation affect the inference procedures you carried out above?
16. Report the results of the regressions in parts (4), (6), (7), (10) and (11) in a table using “esttab” or similar. You may need to install esttab, which is distributed as part of the estout package (type “ssc install estout”). Report estimated coefficients, standard errors, number of observations and the R-squared for each estimated equation. The exported table should be added to your answer file (.pdf).
17. Close the log file, save it, and upload it along with the pdf.

PART B

1. Complete all parts of question 11 from the end-of-chapter questions in Chapter 3 of the 5th edition of Wooldridge. [The question begins with: “Suppose that the population model determining y is”]. You can type your answers in Word or LaTeX if you know how to format equations, or you can write your answers by hand, scan them, and upload them as a pdf.
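
For the mechanical steps in Part A (opening and closing the log, installing packages, loading the data, and exporting a results table), a do-file skeleton might look like the sketch below. It is only one possible layout, not a required template. The table file name "results_table.rtf" and the stored-estimate label m4 are placeholders chosen for illustration, and only the regression from part (4) is filled in; the remaining regressions and tests are left for you to add.

```stata
* Sketch of a do-file skeleton for Part A (file names and labels are placeholders)
capture log close                       // close any log left open by an earlier run
log using "Firstname Lastname A3.log", replace text

* One-time installs from SSC (requires an internet connection)
ssc install bcuse, replace
ssc install estout, replace             // provides eststo and esttab

* Load the htv data
bcuse htv, clear

* Descriptive statistics for part (3)
summarize educ motheduc fatheduc
tabulate educ

* Regression from part (4), stored for the results table
eststo clear
eststo m4: regress educ motheduc fatheduc

* ... add the remaining regressions here, storing each with eststo ...

* Table with standard errors and R-squared for part (16);
* esttab reports the number of observations by default
esttab m4 using "results_table.rtf", se r2 replace

log close
```

Running the whole file with “do” rather than typing commands one at a time keeps the log complete and makes the results easy to reproduce.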