Econometrics Assignment 3

ECOM30002/90002 ECONOMETRICS 2
SEMESTER 1, 2023
ASSIGNMENT 3
Due date: 11:59 pm on Wednesday, 10 May, 2023

Information and instructions

• This assignment contributes 7.5% to the overall assessment. There are 2 questions with a total of 40 marks. Marks for each question are given in square brackets, and marks for each part of a question in parentheses.
• An assignment can be submitted individually or by a group of up to four students. Students form their own groups. Members of a group need not be enrolled in the same tutorial. All group members will receive the same mark for the assignment.
• Assignment groups must be registered through LMS by 10am May 8, even if the group members are the same as in previous assignments (students working individually need not register). Assignments have to be submitted electronically via LMS. Instructions for group registration and assignment submission have been announced on LMS.
• List all group members and their student ID numbers on the first page.
• Include your R-code for each question (either at the end of the document or as a separate file as part of the online submission). A penalty of 5 points per question is applied to submissions without R-code.
• Assignments must be submitted as a Portable Document Format (PDF) file. Be short, precise and clear in your answers and aim to stay within the 600 word limit. Do not present unnecessary graphs, tables or numbers, and do not discuss irrelevant matters. However, show your work and intuition: marks are given for explanations and not just for having the correct answer. Give precise numerical answers when possible and explain what units your answers are in. Regression output must be presented in clearly labelled equation or table form whenever either is requested as part of a question.
• Extension Policy: Late assignments will not be accepted.
Students with a genuine and documented reason for not completing an assignment, such as illness, can apply for special consideration to have the marks for that assignment re-weighted to the exam.

Special Consideration: Students who have been significantly affected by illness or other serious circumstances during the semester may be eligible to apply for Special Consideration. The following website contains detailed information relating to who can apply for Special Consideration and the process for making an application: http://students.unimelb.edu.au/admin/special.

Question 1 [20 marks]

Reconsider the simulation setting from Question 2 in Assignment 2, which considered OLS and 2SLS estimation of a causal equation when there is a possible omitted variable. In this assignment you will also consider asymptotic properties and a case where the potential IV can vary in its relevance. The set-up for the data generation below is the same as in Question 2 of Assignment 2. The specification for the data generating process is as follows:

• A causal equation for the dependent variable $Y_i$ is
  $$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \beta_3 X_{3,i} + U_i, \tag{1}$$
  in which the coefficient $\beta_1$ is of primary interest as an object for statistical inference.
• The explanatory variable $X_{1,i}$ is generated by an equation of the form
  $$X_{1,i} = \alpha_0 + \alpha_1 Z_{1,i} + \alpha_2 X_{2,i} + \alpha_3 X_{3,i} + V_i. \tag{2}$$
• The random variables $(Z_{1,i}, X_{2,i}, X_{3,i}, V_i)'$ are i.i.d. and have a multivariate standard normal distribution, i.e. $N(0, I_4)$, where $I_4$ is the 4 by 4 identity matrix.
• The disturbance $U_i$ is independent of $(Z_{1,i}, X_{2,i}, X_{3,i}, V_i)'$ and is i.i.d. standard normal (mean 0, variance 1).
• Throughout the assignment the population values $\beta_0 = 0$, $\beta_1 = 1$, $\beta_2 = 1$, $\alpha_0 = 0$ will be set. The other population parameters will be given various values in the questions below.
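The data generating process described in the bullets above can be sketched in R roughly as follows. This is only an illustration: the function name simulate_sample, its argument names and the seed are assumptions and do not come from the assignment.

```r
# Sketch of the data generating process in equations (1) and (2).
# simulate_sample() and its argument names are hypothetical.
simulate_sample <- function(n, beta3, alpha1, alpha2, alpha3,
                            beta0 = 0, beta1 = 1, beta2 = 1, alpha0 = 0) {
  # (Z1, X2, X3, V)' ~ N(0, I4): four independent standard normal draws
  z1 <- rnorm(n)
  x2 <- rnorm(n)
  x3 <- rnorm(n)
  v  <- rnorm(n)
  # U is standard normal and independent of the other variables
  u  <- rnorm(n)
  x1 <- alpha0 + alpha1 * z1 + alpha2 * x2 + alpha3 * x3 + v   # equation (2)
  y  <- beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3 + u       # equation (1)
  data.frame(y = y, x1 = x1, x2 = x2, x3 = x3, z1 = z1)
}

set.seed(1)  # arbitrary seed, used only for reproducibility
dt_sim <- simulate_sample(n = 200, beta3 = 1, alpha1 = 1, alpha2 = 0, alpha3 = 1)
head(dt_sim)
```

Keeping the fixed population values as default arguments leaves only the parameters that vary across questions to be supplied on each call.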
As in Assignment 2, the explanatory variable $X_{3,i}$ will be treated as an unobservable variable throughout the assignment (e.g. $X_{3,i}$ can be thought of as something like ability in a wage equation). Even though $Y_i$ will be generated from equation (1), which involves $X_{3,i}$, calculations of estimators will proceed without using $X_{3,i}$. In this sense $X_{3,i}$ is playing the role of an omitted causal variable that is (possibly) involved in the generation of values for $Y_i$ but is not included in the statistical analysis of the data.

(i) (6 marks) To start, set $\beta_3 = 1$, $\alpha_1 = 1$, $\alpha_2 = 0$, $\alpha_3 = 1$. For each sample size $n = 20, 200, 2000$, generate 1000 replications of samples of $X_{1,i}$ from (2) and $Y_i$ from (1). For each sample, carry out a 2SLS regression, then calculate and report in a table the following quantities:
• bias of the estimator for $\beta_1$
• standard deviation of the estimator for $\beta_1$
• coverage rate of the 95% confidence interval (i.e. the percentage of simulated confidence intervals that include $\beta_1$)
• average length of the confidence intervals (length is upper limit minus lower limit)

(ii) (2 marks) Do your simulation results from part (i) suggest that the 2SLS estimator is consistent or inconsistent for $\beta_1$? Briefly explain.

(iii) (2 marks) Imagine you were to redo the table of results you reported in part (i) but now using the OLS estimator instead of 2SLS. How would you expect the results on bias and coverage to change compared to those for 2SLS? (You are NOT asked to redo the analysis!)

(iv) (5 marks) Set $n = 200$ and consider the range of values for $\alpha_1$: $\alpha_1 = 0.8, 0.5, 0.3, 0.1, 0$. For these 5 cases, carry out the simulations of the same quantities/properties of the 2SLS estimator as in part (i), i.e. report bias and standard deviation as well as the coverage rate and length of the confidence interval for all 5 cases in a table. Provide a histogram of the estimates for $\beta_1$ from the simulations with $\alpha_1 = 0.8$ and $\alpha_1 = 0.3$, respectively.
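The four simulation quantities requested in parts (i) and (iv) could be computed along the following lines. This is a minimal sketch for a single design ($n = 200$, $\alpha_1 = 1$), not a prescribed solution. It makes several assumptions: the helper name tsls_beta1 is hypothetical, 2SLS is computed by matrix algebra in base R, standard errors assume homoskedasticity with an $n - 3$ degrees-of-freedom correction, and the confidence interval uses the normal critical value 1.96.

```r
# Hypothetical helper: just-identified 2SLS of y on (1, x1, x2),
# with z1 as the instrument for x1. Returns the estimate of beta1 and a 95% CI.
tsls_beta1 <- function(y, x1, x2, z1) {
  n <- length(y)
  X <- cbind(1, x1, x2)                 # regressors (x1 is endogenous)
  Z <- cbind(1, z1, x2)                 # instruments (z1 replaces x1)
  ZX_inv <- solve(crossprod(Z, X))      # (Z'X)^{-1}
  b <- ZX_inv %*% crossprod(Z, y)       # IV estimator (Z'X)^{-1} Z'y
  res <- y - X %*% b                    # residuals use the original regressors
  s2 <- sum(res^2) / (n - ncol(X))      # assumed df correction: n - 3
  # Homoskedastic variance: s2 * (Z'X)^{-1} (Z'Z) (X'Z)^{-1}
  V <- s2 * ZX_inv %*% crossprod(Z) %*% t(ZX_inv)
  se <- sqrt(V[2, 2])
  c(est = b[2], lower = b[2] - 1.96 * se, upper = b[2] + 1.96 * se)
}

set.seed(123)                           # arbitrary seed
n <- 200; reps <- 1000; beta1 <- 1
out <- t(replicate(reps, {
  z1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n); v <- rnorm(n); u <- rnorm(n)
  x1 <- 1 * z1 + 0 * x2 + 1 * x3 + v       # equation (2): alpha1 = 1, alpha2 = 0, alpha3 = 1
  y  <- beta1 * x1 + 1 * x2 + 1 * x3 + u   # equation (1): beta2 = 1, beta3 = 1
  tsls_beta1(y, x1, x2, z1)                # x3 is not used in estimation
}))

summary_row <- c(
  bias     = mean(out[, "est"]) - beta1,
  sd       = sd(out[, "est"]),
  coverage = 100 * mean(out[, "lower"] <= beta1 & beta1 <= out[, "upper"]),
  length   = mean(out[, "upper"] - out[, "lower"])
)
round(summary_row, 3)
hist(out[, "est"], main = "2SLS estimates of beta1", xlab = "estimate")
```

Wrapping the replication loop in a function of n and alpha1 would let the same code fill in every row of the requested tables.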
(v) (2 marks) Explain the results that you reported in part (iv) in terms of the validity/invalidity of $Z_{1,i}$ as an IV for $X_{1,i}$.

(vi) (3 marks) Consider the OLS estimator for the simulation setting with the values $\beta_3 = 0$, $\alpha_1 = 1$, $\alpha_3 = 1$, $n = 500$ and a range of values for $\alpha_2$: $\alpha_2 = 0, 2, 4, 6, 8$. The bias and standard deviation of the OLS estimator, together with the coverage rate and length of the confidence interval, are provided in the table in Figure 1.

$\alpha_2$   Bias     SD      Coverage (%)   Length   cor($X_{1,i}$, $X_{2,i}$)
0.0          −0.002   0.071   94.4           0.280    −0.000
2.0          −0.002   0.109   94.4           0.428     0.755
4.0          −0.004   0.179   94.4           0.706     0.917
6.0          −0.006   0.257   94.4           1.012     0.961
8.0          −0.008   0.337   94.4           1.326     0.977

Figure 1: OLS estimator properties from simulations for $\alpha_2 = (0, 2, 4, 6, 8)$.

Briefly explain what aspect/concept the simulation investigates. Briefly discuss the main conclusions that you can draw from the provided results for the properties of the OLS estimator (bias, standard deviation, coverage rate and confidence interval length).

Question 2 [20 marks]

For this question you will consider the US data on road fatalities and beer tax you have seen in lectures 13 and 14. The data are provided in the file “A3Fatalities.csv”, which contains the following variables:
• $Fatal_{i,t}$: number of road fatalities in state $i$ in year $t$
• $Pop_{i,t}$: population of state $i$ in year $t$
• $BeerTax_{i,t}$: tax on beer in state $i$ in year $t$ ($/case, 1988 prices)
• $FRate_{i,t}$: fatalities per 1000 of population in state $i$ in year $t$

Suppose that the true causal model is
$$FRate_{i,t} = \alpha_i + \beta_1 BeerTax_{i,t} + V_{i,t}$$

(i) (2 marks) Should we regress $FRate_{i,t}$ on $BeerTax_{i,t}$ to estimate $\beta_1$? Why or why not?

(ii) (4 marks) Write down the transformed model that can be estimated via OLS and makes use of all 7 years of data. Define all terms, including any additional variables that are required for the transformation of the fatalities and beer tax variables.
Briefly explain what the transformation achieves. (Make sure you have reviewed lecture 14 before you proceed.)

(iii) (6 marks) Next, create the within-state average of the fatalities rate and of the beer tax for each state and call them fratehat and beertaxhat. You can use the variable state_id in a loop to create these state mean variables. Provide histograms of fratehat and beertaxhat. Then create the transformed dependent and explanatory variables (call them fratetilde and beertaxtilde) that are required for the transformed model in part (ii). Report the original and transformed variables for the state of California in year 1982. Report the means and standard deviations of your transformed variables.

Hint: You can generate the transformed variables either in Excel (and then load the data into R) or by first loading the data set into R and undertaking the transformation in R. In R, the within-average variables and the transformed variables can be computed using a loop. It is useful to first generate a variable called state_list using the command state_list <- unique(dt$state). You can then compute the within-state average values by selecting the indices of the observations associated with a particular state using which(dt$state == state_list[i]), where i refers to a unique state, and using a loop to go through all the states.

(iv) (4 marks) Estimate the transformed model using OLS. Report the table of estimation results and interpret the estimate of $\beta_1$.

(v) (4 marks) Finally, estimate a one-way fixed effects model for the original data. Compare your results to those in part (iv) and briefly discuss. Are they what you expected?
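The loop described in the hint, and the link between parts (iv) and (v), can be illustrated on a small made-up panel. Everything in this sketch is an assumption for illustration: the data frame toy, its three states, the chosen coefficient of −0.5 and the seed are invented, and it does not use A3Fatalities.csv or reproduce its results.

```r
# Toy balanced panel: 3 hypothetical states observed over 7 years (made-up data)
set.seed(42)                                  # arbitrary seed
toy <- data.frame(state = rep(c("A", "B", "C"), each = 7),
                  year  = rep(1982:1988, times = 3),
                  stringsAsFactors = FALSE)
alpha_i <- c(A = 2, B = 1, C = 3)             # invented state fixed effects
fe <- unname(alpha_i[toy$state])
toy$beertax <- runif(nrow(toy), 0, 2) + fe    # tax correlated with the fixed effect
toy$frate   <- fe - 0.5 * toy$beertax + rnorm(nrow(toy), sd = 0.1)

# Within-state averages via a loop over the unique states, as in the hint
state_list <- unique(toy$state)
toy$fratehat   <- NA_real_
toy$beertaxhat <- NA_real_
for (i in seq_along(state_list)) {
  idx <- which(toy$state == state_list[i])    # rows belonging to state i
  toy$fratehat[idx]   <- mean(toy$frate[idx])
  toy$beertaxhat[idx] <- mean(toy$beertax[idx])
}

# Transformed (demeaned) variables
toy$fratetilde   <- toy$frate - toy$fratehat
toy$beertaxtilde <- toy$beertax - toy$beertaxhat

# OLS on the demeaned data versus OLS with a dummy for each state
within_fit <- lm(fratetilde ~ beertaxtilde - 1, data = toy)
dummy_fit  <- lm(frate ~ beertax + factor(state), data = toy)
c(within  = coef(within_fit)[["beertaxtilde"]],
  dummies = coef(dummy_fit)[["beertax"]])
```

The two slope estimates coincide, since demeaning by state removes the state-specific intercepts. The standard errors reported by lm() for the demeaned regression differ from those of the dummy-variable regression because the residual degrees of freedom do not account for the estimated state means.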
