Assignment 2

THE CHINESE UNIVERSITY OF HONG KONG, SHENZHEN
2022 - 2023 TERM 2
ECO 3121 Introductory Econometrics
ASSIGNMENT 2
TOPIC: Simple and multiple linear regression model.
INSTRUCTIONS:
• Please label clearly each answer with the appropriate question number and letter. Securely
staple all answer sheets together, and make certain that your name(s) and student number(s)
are printed clearly at the top of each answer sheet.
• Please use STATA to do Question 1, and report your STATA commands and results
together with your answers to the questions. Answers without supporting documents won’t
be marked.
• Hand-written answers must be legible. Illegible assignments will be returned unmarked.
• Please combine your answers with supporting documents into one Adobe PDF file and
submit.
DUE DATE: 5PM Thursday March 16, 2023
Please submit your work on Blackboard. Late submissions will receive a 0 with no excuses.
MARKING: Marks for each question are indicated in parentheses. Total marks for the assignment
equal 125. Marks are given for both content and presentation.
Question 1 (50 marks)
Data file: 3121A2.dta (3121A2.csv)
Data Description: A random sample of 526 employees drawn from the 1990 U.S. population of
all employed paid workers.
Variable Definitions:
𝑤𝑎𝑔𝑒𝑖 = the hourly wage rate of employee 𝑖, measured in dollars per hour;
𝑒𝑑𝑢𝑐𝑖 = the number of years of formal education completed by employee 𝑖;
𝑒𝑥𝑝𝑖 = the number of years of work experience accumulated by employee 𝑖;
𝑡𝑒𝑛𝑢𝑟𝑒𝑖 = the number of years of employee 𝑖 with the current employer.
Compute and present OLS estimates of the following population regression equation for the full
sample of 526 paid workers:
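$$\log(\mathit{wage}_i) = \beta_0 + \beta_1\,\mathit{educ}_i + \beta_2\,\mathit{exp}_i + \beta_3\,\mathit{educ}_i^2 + \beta_4\,\mathit{exp}_i^2 + \beta_5\,\mathit{tenure}_i + u_i \qquad (1)$$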
(16 marks)
(a) Compute and report OLS estimates of regression equation (1) for the full sample of 526 paid
workers. Present the estimation results, and report the OLS estimates 𝛽̂2 and 𝛽̂3 , the estimated
standard errors of 𝛽̂2 and 𝛽̂3 , the t-statistics for 𝛽̂2 and 𝛽̂3 , the standard error of the regression, and
the 𝑅².
(11 marks)
(b) Write the expression implied by equation (1) for the partial (ceteris paribus) marginal effect of
𝑒𝑥𝑝𝑖 on hourly wage. Use the OLS estimation results for equation (1) to test the proposition that
years of work experience has no effect on mean hourly wage for all values of 𝑒𝑥𝑝𝑖 . State the null
and alternative hypotheses, and show how the sample value of the test statistic is calculated (give
its formula). Report the sample value of the test statistic and its p-value. State the decision rule
you use, and the inference you would draw from the test. Does the sample evidence favor the
proposition?
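As a starting point for part (b), the setup can be sketched as follows. This is a hedged outline, not a model answer: the labels $SSR_r$ and $SSR_u$ (restricted and unrestricted residual sums of squares) are notation introduced here, and the degrees of freedom assume all 526 observations are used with the six coefficients of equation (1). Because the dependent variable is in logs, the derivative below is the proportional effect on the wage; the effect on the wage level is that expression times the wage, so both vanish for every value of experience only if both coefficients are zero.

```latex
\begin{align*}
\frac{\partial \log(\mathit{wage}_i)}{\partial \mathit{exp}_i}
  &= \beta_2 + 2\beta_4\,\mathit{exp}_i \\
H_0 &: \beta_2 = 0 \text{ and } \beta_4 = 0
  \qquad H_1 : \beta_2 \neq 0 \text{ and/or } \beta_4 \neq 0 \\
F &= \frac{(SSR_r - SSR_u)/2}{SSR_u/(526 - 6)} \sim F(2,\,520) \text{ under } H_0
\end{align*}
```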
(12 marks)
(c) Use the OLS estimation results for equation (1) to test an economist’s conjecture that the partial
(ceteris paribus) marginal effect on hourly wage of 𝑒𝑑𝑢𝑐𝑖 , years of formal education, is larger for
employees with high education level than for employees with low education level. State the null
and alternative hypotheses, and show how the sample value of the test statistic is calculated (give
its formula). Report the sample value of the test statistic and its p-value. State the appropriate
critical values of the null distribution of the test statistic for both the 5 percent and 1 percent
significance levels. State the decision rule you use, and the inference you would draw from the
test. Does the sample evidence favor the economist’s conjecture?
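One possible way to frame part (c) is sketched below. It assumes the conjecture is read as "the marginal effect of education rises with education", which under equation (1) depends only on the sign of the coefficient on squared education. With 520 degrees of freedom the one-tailed critical values are close to the standard normal ones (about 1.645 at 5 percent and 2.326 at 1 percent); exact values should be taken from a t table or from STATA.

```latex
\begin{align*}
\frac{\partial \log(\mathit{wage}_i)}{\partial \mathit{educ}_i}
  &= \beta_1 + 2\beta_3\,\mathit{educ}_i \\
H_0 &: \beta_3 = 0 \qquad H_1 : \beta_3 > 0 \\
t &= \frac{\hat\beta_3}{\mathrm{se}(\hat\beta_3)} \sim t(520) \text{ under } H_0
  \quad \text{(right-tailed test)}
\end{align*}
```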
(11 marks)
(d) Use the OLS estimation results for equation (1) to test the conjecture that years of formal
education has no marginal effect on mean hourly wage for employees with a bachelor's degree (16
years of formal education). State the null and alternative hypotheses, and show how the sample
value of the test statistic is calculated (give its formula). Report the sample value of the test statistic
and its p-value. State the decision rule you use, and the inference you would draw from the test.
Does the sample evidence favor the conjecture?
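For part (d), the restriction is a single linear combination of two coefficients, so one hedged way to set it up is a t-test whose standard error uses the estimated variances and covariance of the two estimates. The sketch below only restates that structure for 16 years of education; the variance and covariance terms would come from the estimated coefficient covariance matrix of equation (1).

```latex
\begin{align*}
\left.\frac{\partial \log(\mathit{wage}_i)}{\partial \mathit{educ}_i}\right|_{\mathit{educ}_i = 16}
  &= \beta_1 + 32\,\beta_3 \\
H_0 &: \beta_1 + 32\,\beta_3 = 0 \qquad H_1 : \beta_1 + 32\,\beta_3 \neq 0 \\
t &= \frac{\hat\beta_1 + 32\,\hat\beta_3}
  {\sqrt{\widehat{\mathrm{Var}}(\hat\beta_1) + 32^2\,\widehat{\mathrm{Var}}(\hat\beta_3)
   + 2\cdot 32\,\widehat{\mathrm{Cov}}(\hat\beta_1,\hat\beta_3)}}
  \sim t(520) \text{ under } H_0
\end{align*}
```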
Question 2 (45 marks)
You are investigating the relationship between the final exam grades of university students in an
introductory economics course and those students’ class attendance, as measured by the percentage
of classes each student attended during the term. You also have sample data on two additional
explanatory variables: each student’s cumulative GPA (Grade Point Average) prior to the term in
which the introductory economics course was taken; and each student’s score on a standardized
college entrance exam, the ACT exam. You have sample data for 680 students on the following
variables:
𝑓𝑖𝑛𝑎𝑙𝑝𝑐𝑡𝑖 = final exam grade of the 𝑖-th student, measured in percentage points;
𝑎𝑡𝑡𝑟𝑎𝑡𝑒𝑖 = percentage of classes attended by the 𝑖-th student during the term, measured in
percentage points;
𝐺𝑃𝐴𝑖 = Cumulative Grade Point Average (GPA) of the 𝑖-th student prior to the term in which
the introductory economics course was taken, measured out of 4.0;
𝐴𝐶𝑇𝑖 = ACT score of the 𝑖-th student on the ACT college entrance exam, measured in points.
Using the given sample data on 680 students, your trusty research assistant has estimated
regression equation (1) and obtained the following estimation results (with estimated standard
errors given in parentheses below the coefficient estimates):
$$\mathit{finalpct}_i = \beta_0 + \beta_1\,\mathit{attrate}_i + \beta_2\,\mathit{GPA}_i + \beta_3\,\mathit{GPA}_i^2 + \beta_4\,\mathit{ACT}_i + u_i \qquad (1)$$
𝛽̂0 = 55.564
(8.077)
𝛽̂1 = 0.071698
(0.0279)
𝛽̂2 = −18.6518
(5.5788)
𝛽̂3 = 4.4785
(1.0513)
𝛽̂4 = 0.9
(0.1336)
𝐶𝑜̂𝑣(𝛽̂1 , 𝛽̂2 ) = −0.024
𝐶𝑜̂𝑣(𝛽̂1 , 𝛽̂3 ) = 0.00203
𝐶𝑜̂𝑣(𝛽̂1 , 𝛽̂4 ) = 0.001284
𝐶𝑜̂𝑣(𝛽̂2 , 𝛽̂3 ) = −5.7846
𝐶𝑜̂𝑣(𝛽̂2 , 𝛽̂4 ) = 0.075667
𝐶𝑜̂𝑣(𝛽̂3 , 𝛽̂4 ) = −0.025335
𝑆𝑆𝑅 = 73218.11
𝑆𝑆𝑇 = 94137.17
N = 680
𝑆𝑆𝑅 is the Residual Sum-of-Squares and 𝑆𝑆𝑇 is the Total Sum-of-Squares for sample regression
equation (1). Sample size N = 680. 𝐶𝑜̂𝑣(𝛽̂𝑗 , 𝛽̂ℎ ) is the estimated covariance between coefficient
estimates 𝛽̂𝑗 and 𝛽̂ℎ . Estimated standard errors are given in parentheses below the coefficient
estimates 𝛽̂𝑗 (𝑗 = 0, 1, … , 4).
(15 marks)
(a) Compute a test of the proposition that students’ cumulative Grade Point Average has no effect
on students’ final exam grade. State the coefficient restrictions on regression equation (1) implied
by this proposition; that is, state the null hypothesis 𝐻0 and the alternative hypothesis 𝐻1 . Write
the restricted regression equation implied by the null hypothesis 𝐻0 . OLS estimation of this
restricted regression equation yields an 𝑅² = 0.1701. Use this information, together with the
results from OLS estimation of equation (1), to calculate the required test statistic. State the
decision rule you use, and the inference you would draw from the test at the 5 percent significance
level.
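The arithmetic behind part (a) can be sketched in a few lines of Python. This is an illustrative calculation under stated assumptions, not an official solution: it assumes the null hypothesis sets both GPA coefficients to zero (two restrictions) and uses the R-squared form of the F statistic, which applies because the restricted and unrestricted regressions share the same dependent variable. All variable names are made up for the example.

```python
# Figures reported in the assignment for unrestricted regression (1).
ssr_u = 73218.11   # residual sum of squares
sst = 94137.17     # total sum of squares
n_obs = 680        # sample size
k_slopes = 4       # slope coefficients in equation (1)

r2_restricted = 0.1701   # R-squared of the restricted regression (given)
q_restrictions = 2       # assumption: H0 is beta2 = 0 and beta3 = 0

# Unrestricted R-squared from the sums of squares.
r2_unrestricted = 1.0 - ssr_u / sst

# R-squared form of the F statistic for q linear restrictions.
df_denominator = n_obs - k_slopes - 1
f_gpa = ((r2_unrestricted - r2_restricted) / q_restrictions) / (
    (1.0 - r2_unrestricted) / df_denominator
)

print(f"Unrestricted R-squared: {r2_unrestricted:.4f}")
print(f"F statistic with ({q_restrictions}, {df_denominator}) df: {f_gpa:.2f}")
```

The resulting statistic would then be compared with the tabulated 5 percent critical value of the F distribution with 2 and 675 degrees of freedom.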
(15 marks)
(b) The average student in the course has a cumulative Grade Point Average of 2.5; that is, the
sample mean value of 𝐺𝑃𝐴𝑖 equals 2.5. Write the expression (or formula) for the marginal effect
on 𝑓𝑖𝑛𝑎𝑙𝑝𝑐𝑡𝑖 of 𝐺𝑃𝐴𝑖 implied by regression equation (1). Use the estimation results for regression
equation (1) to test the proposition that the marginal effect of cumulative GPA (𝐺𝑃𝐴𝑖 ) on students’
final exam grade (𝑓𝑖𝑛𝑎𝑙𝑝𝑐𝑡𝑖 ) equals zero for the average student whose GPA is 2.5. State the null
and alternative hypotheses, and show how you calculate the required test statistic. State the
inference you would draw from the test at the 5 percent significance level.
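A minimal sketch of the calculation in part (b) follows, assuming the marginal effect implied by equation (1) is 𝛽2 + 2𝛽3 𝐺𝑃𝐴𝑖 and that the test is a t-test on that linear combination. The function name and variable names are hypothetical; the numbers are the estimates, standard errors, and covariance reported above.

```python
import math

# Estimates and standard errors reported for equation (1).
b2_hat, se_b2 = -18.6518, 5.5788   # coefficient on GPA
b3_hat, se_b3 = 4.4785, 1.0513     # coefficient on GPA squared
cov_b2_b3 = -5.7846                # estimated Cov(b2_hat, b3_hat)


def gpa_marginal_effect_test(gpa_value):
    """Return (estimate, standard error, t statistic) for b2 + 2*b3*gpa_value."""
    weight = 2.0 * gpa_value
    estimate = b2_hat + weight * b3_hat
    # Var(b2 + w*b3) = Var(b2) + w^2 * Var(b3) + 2*w*Cov(b2, b3)
    variance = se_b2 ** 2 + weight ** 2 * se_b3 ** 2 + 2.0 * weight * cov_b2_b3
    std_error = math.sqrt(variance)
    return estimate, std_error, estimate / std_error


me_hat, me_se, me_t = gpa_marginal_effect_test(2.5)
print(f"Marginal effect at GPA 2.5: {me_hat:.4f}")
print(f"Standard error: {me_se:.4f}, t statistic: {me_t:.2f}")
```

The t statistic has N − 5 = 675 degrees of freedom under the null, so the two-tailed 5 percent critical value is close to the standard normal value of 1.96.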
(15 marks)
(c) Use the estimation results for regression equation (1) to test the joint significance of the slope
coefficient estimates 𝛽̂𝑗 (𝑗 = 1, … , 4) in regression equation (1). Perform the test at the 5 percent
significance level (i.e., for significance level 𝛼 = 0.05). State the null and alternative hypotheses,
and show how you calculate the required test statistic. State the inference you would draw from
the test at the 5 percent significance level.
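The overall significance test in part (c) can be sketched the same way. This is a hedged illustration that assumes the null hypothesis sets all four slope coefficients to zero, so the restricted model contains only an intercept and its residual sum of squares equals the total sum of squares. Variable names are invented for the example.

```python
# Sums of squares and sample size reported for equation (1).
ssr_full = 73218.11    # residual sum of squares of the full model
sst_total = 94137.17   # total sum of squares (SSR of intercept-only model)
n_students = 680
n_slopes = 4           # restrictions under H0: beta1 = ... = beta4 = 0

# Explained sum of squares is what the four regressors add.
sse_explained = sst_total - ssr_full

df_resid = n_students - n_slopes - 1
f_overall = (sse_explained / n_slopes) / (ssr_full / df_resid)

print(f"Overall F statistic with ({n_slopes}, {df_resid}) df: {f_overall:.2f}")
```

The statistic would be compared with the 5 percent critical value of the F distribution with 4 and 675 degrees of freedom.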
Question 3 (30 marks)
Consider the standard simple regression model 𝑦 = 𝛽0 + 𝛽1 𝑥 + 𝑢 under the Gauss-Markov
Assumptions SLR.1 through SLR.5. The usual OLS estimators 𝛽̂0 and 𝛽̂1 are unbiased for their
respective population parameters. Let 𝛽̃0 and 𝛽̃1 be the estimators of 𝛽0 and 𝛽1 obtained by
connecting the first two observations (𝑥1 , 𝑦1 ) and (𝑥2 , 𝑦2 ):
$$\tilde\beta_1 = \frac{y_2 - y_1}{x_2 - x_1}, \qquad \tilde\beta_0 = y_1 - \frac{y_2 - y_1}{x_2 - x_1}\,x_1$$
(15 marks)
(a) Prove that 𝛽̃0 and 𝛽̃1 are unbiased for 𝛽0 and 𝛽1 .
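One possible route for part (a) is outlined below as a hedged sketch rather than a complete proof. It assumes that the first two regressor values differ, and it uses the zero conditional mean assumption (SLR.4) together with random sampling to set the conditional expectation of each error to zero.

```latex
\begin{align*}
y_2 - y_1 &= \beta_1 (x_2 - x_1) + (u_2 - u_1)
  \;\Longrightarrow\;
  \tilde\beta_1 = \beta_1 + \frac{u_2 - u_1}{x_2 - x_1} \\
E(\tilde\beta_1 \mid x) &= \beta_1 + \frac{E(u_2 \mid x) - E(u_1 \mid x)}{x_2 - x_1} = \beta_1 \\
\tilde\beta_0 &= y_1 - \tilde\beta_1 x_1 = \beta_0 + \beta_1 x_1 + u_1 - \tilde\beta_1 x_1 \\
E(\tilde\beta_0 \mid x) &= \beta_0 + \beta_1 x_1 + 0 - \beta_1 x_1 = \beta_0
\end{align*}
```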
(15 marks)
(b) Find 𝑉𝑎𝑟(𝛽̃1 |𝑥) , and prove that OLS is more efficient. (Hint: prove that 𝑉𝑎𝑟(𝛽̃1 |𝑥) ≥
𝑉𝑎𝑟(𝛽̂1 |𝑥))
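A small Monte Carlo experiment can make the claims in Question 3 concrete, although it does not replace the proofs. The sketch below uses only the Python standard library. The design (ten fixed regressor values, true coefficients 1 and 2, standard normal errors, 5000 replications) is entirely an assumption chosen for illustration. Under that design, homoskedastic and uncorrelated errors imply a variance of 2σ²/(𝑥2 − 𝑥1)² = 2 for the two-point slope and σ²/SSTₓ ≈ 0.012 for the OLS slope.

```python
import random
import statistics


def simulate_slopes(n_reps=5000, seed=0):
    """Draw repeated samples and return (two-point slopes, OLS slopes)."""
    rng = random.Random(seed)
    x = [float(i) for i in range(1, 11)]     # fixed regressors (assumed design)
    beta0, beta1, sigma = 1.0, 2.0, 1.0      # assumed true parameters
    x_bar = statistics.mean(x)
    sst_x = sum((xi - x_bar) ** 2 for xi in x)

    two_point, ols = [], []
    for _ in range(n_reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        # Slope of the line through the first two observations.
        two_point.append((y[1] - y[0]) / (x[1] - x[0]))
        # Usual OLS slope estimator.
        y_bar = statistics.mean(y)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        ols.append(sxy / sst_x)
    return two_point, ols


tilde_draws, ols_draws = simulate_slopes()
print("Mean of two-point slope:", round(statistics.mean(tilde_draws), 3))
print("Mean of OLS slope:      ", round(statistics.mean(ols_draws), 3))
print("Variance of two-point slope:", round(statistics.variance(tilde_draws), 3))
print("Variance of OLS slope:      ", round(statistics.variance(ols_draws), 4))
```

Both sample means should sit near the true slope of 2, while the sampling variance of the two-point estimator should be far larger than that of OLS, which is the pattern parts (a) and (b) ask you to establish analytically.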