Uploaded by Han Chen

Assignment 1 with Answers

THE CHINESE UNIVERSITY OF HONG KONG, SHENZHEN
2022 - 2023 TERM 2
ECO 3121 Introductory Econometrics
ASSIGNMENT 1 ANSWERS
TOPIC: Simple linear regression model.
INSTRUCTIONS:
• Please label clearly each answer with the appropriate question number and letter. Securely
staple all answer sheets together, and make certain that your name(s) and student number(s)
are printed clearly at the top of each answer sheet.
• Please use STATA to do Question 1, and report your STATA commands and results
together with your answers to the questions.
• Hand-written answers must be legible. Illegible assignments will be returned unmarked.
• Please combine your answers with supporting documents into one Adobe PDF file and
submit.
DUE DATE: 5PM Friday February 24, 2023
Please submit your work on Blackboard. Late submissions will receive a 0 with no excuses.
MARKING: Marks for each question are indicated in parentheses. Total marks for the assignment
equal 90. Marks are given for both content and presentation.
Question 1 (25 marks)
Data file: 3121A1.dta (or 3121A1.csv)
Data Description: A random sample of 436 employees drawn from the 1976 U.S. population of
all employed paid workers.
Variable Definitions:
$wage_i$ = average hourly earnings of worker i in 1976, in dollars per hour.
$educ_i$ = years of formal education completed by worker i, in years.
$female_i$ = an indicator variable equal to 1 if worker i is female, and 0 if worker i is male.
(5 marks)
1. Compile a table of descriptive summary statistics for the sample data. The table should include
for each of the variables in the dataset: the sample mean, the sample standard deviation, the
minimum sample value, and the maximum sample value. How many females and how many
males are there in the sample?
(1 mark) per column in table, except Obs.
. sum wage educ female

    Variable |   Obs        Mean   Std. Dev.    Min    Max
-------------+---------------------------------------------
        wage |   436    6.051216   3.795647     .53     25
        educ |   436    12.67202   2.660956       0     18
      female |   436    .4380734   .4967202       0      1
. tab1 female, missing

-> tabulation of female

      female |   Freq.   Percent      Cum.
-------------+-----------------------------
           0 |     245     56.19     56.19
           1 |     191     43.81    100.00
-------------+-----------------------------
       Total |     436    100.00
Number of females in the sample = 191 (0.5 mark)
Number of males in the sample = 245 (0.5 mark)
(20 marks)
2. Compute and present OLS estimates of the following population regression equation for the
full sample of 436 paid workers:
$wage_i = \beta_0 + \beta_1 educ_i + u_i$    (1)
where 𝑢𝑖 is a random error term that is assumed to satisfy all the assumptions of the classical
linear regression model.
(5 marks)
a) Report the OLS coefficient estimates 𝛽̂0 and 𝛽̂1 computed by estimating population
regression equation (1).
. reg wage educ

      Source |         SS    df          MS      Number of obs =     436
-------------+--------------------------------   F(1, 434)     =   88.48
       Model | 1061.27825     1  1061.27825      Prob > F      =  0.0000
    Residual |   5205.739   434  11.9947903      R-squared     =  0.1693
-------------+--------------------------------   Adj R-squared =  0.1674
       Total | 6267.01726   435  14.4069362      Root MSE      =  3.4633

        wage |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
-------------+-----------------------------------------------------------
        educ |  .5869922  .0624042    9.41   0.000   .4643401   .7096443
       _cons |  -1.38716   .807995   -1.72   0.087   -2.97523   .2009096

$\hat{\beta}_0 = -1.38716$ (2.5 marks)
$\hat{\beta}_1 = 0.5869922$ (2.5 marks)
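The summary entries in the regression output are linked by simple identities, which can be checked by hand or with a short script. The following Python sketch (an illustration only, not part of the required STATA work; all variable names are made up here) recomputes R-squared, the F statistic, Root MSE and the t statistics from the sums of squares, coefficients and standard errors reported above.

```python
import math

# Values copied from the STATA regression output above.
ss_model, ss_resid, ss_total = 1061.27825, 5205.739, 6267.01726
df_model, df_resid = 1, 434
coef_educ, se_educ = 0.5869922, 0.0624042
coef_cons, se_cons = -1.38716, 0.807995

ms_model = ss_model / df_model   # mean square, model
ms_resid = ss_resid / df_resid   # mean square, residual

r_squared = ss_model / ss_total  # share of sample variation explained
f_stat = ms_model / ms_resid     # F(1, 434)
root_mse = math.sqrt(ms_resid)   # standard error of the regression
t_educ = coef_educ / se_educ     # t statistic for H0: beta1 = 0
t_cons = coef_cons / se_cons     # t statistic for H0: beta0 = 0

print(round(r_squared, 4), round(f_stat, 2), round(root_mse, 4))
print(round(t_educ, 2), round(t_cons, 2))
```

Up to rounding, these should agree with the corresponding entries in the STATA table.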
(5 marks)
b) Interpret the value of the slope coefficient estimate 𝛽̂1 ; i.e., explain in words what the
numerical value of 𝛽̂1 means.
(Answer must not be just a generic description of the slope coefficient estimate; it must
explicitly account for the units in which wage and educ are measured.)
wage is measured in dollars per hour; educ is measured in years.
Therefore, the estimate $\hat{\beta}_1 = 0.5870$ means that a 1-year increase in education is
associated with an increase in average hourly wages equal to 0.5870 dollars per
hour. (5 marks)
(5 marks)
c) Interpret the value of the intercept coefficient estimate $\hat{\beta}_0$; i.e., explain in words what the
numerical value of $\hat{\beta}_0$ means.
The estimate $\hat{\beta}_0 = -1.3872$ means that the estimated average (mean) hourly wage rate of workers
with zero years of education (educ = 0) equals -1.3872 dollars per hour. (5 marks)
(5 marks)
d) On a set of appropriately labeled coordinate axes, draw the estimated sample regression
function implied by OLS estimation of regression equation (1). That is, draw the graph of
the equation $\widehat{wage}_i = \hat{\beta}_0 + \hat{\beta}_1 educ_i$, compute the coordinates of the two points on it that
correspond to the values 12 and 16 of $educ_i$, and label these two points on your graph as A
and B respectively. (Note: you do not need to use STATA, or any software program, to
draw and label this graph.)
The two points have the following coordinates:
Point A: For $educ_i = 12$ years, the estimated mean of average hourly earnings equals:
$\widehat{wage}_i = \hat{\beta}_0 + \hat{\beta}_1 educ_i = -1.3872 + 0.5870(12) = 5.6568$ dollars per hour,
i.e., about 5.66 dollars per hour
(1 mark)
Point B: For $educ_i = 16$ years, the estimated mean of average hourly earnings equals:
$\widehat{wage}_i = \hat{\beta}_0 + \hat{\beta}_1 educ_i = -1.3872 + 0.5870(16) = 8.0048$ dollars per hour,
i.e., about 8.00 dollars per hour
(1 mark)
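The coordinates of A and B can be reproduced with a few lines of Python. This is only a sketch using the rounded estimates from the answer; the function name `fitted_wage` is hypothetical.

```python
# Rounded OLS estimates from regression (1), as used in the answer above.
b0, b1 = -1.3872, 0.5870

def fitted_wage(educ_years):
    """Estimated mean hourly wage (dollars per hour) at a given education level."""
    return b0 + b1 * educ_years

point_a = (12, fitted_wage(12))
point_b = (16, fitted_wage(16))
print(point_a, point_b)

# The vertical gap between B and A is 4 extra years of education times the slope.
gap = point_b[1] - point_a[1]
print(gap)
```

The gap between the two points, 4 times the slope estimate, is another way to read the slope: four more years of education are associated with about 2.35 dollars more per hour.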
Figure 1: Line graph of $\widehat{wage}_i = \hat{\beta}_0 + \hat{\beta}_1 educ_i = -1.3872 + 0.5870\, educ_i$, with points A and B marked on the line (horizontal axis: educ = years of education)
(3 marks) total: 2 marks for correct line graph; 1 mark for labeling points A and B
Question 2 (35 marks)
A researcher is using data for a sample of 88 houses sold in an urban area during a recent year to
investigate the relationship between house prices 𝑦𝑖 (measured in thousands of dollars) and house
size 𝑥𝑖 (measured in square meters). Preliminary analysis of the sample data produces the
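following sample information:
$n = 88$
$\sum_{i=1}^{n} y_i = 25{,}832.05$
$\sum_{i=1}^{n} x_i = 16{,}462.34$
$\sum_{i=1}^{n} x_i^2 = 3{,}329{,}789.6$
$\sum_{i=1}^{n} y_i^2 = 8{,}500{,}750.69$
$\sum_{i=1}^{n} x_i y_i = 5{,}209{,}990.7$
$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 377{,}534.76$
$\sum_{i=1}^{n} (y_i - \bar{y})^2 = 917{,}854.51$
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 250{,}144.32$
$\sum_{i=1}^{n} \hat{u}_i^2 = 348{,}053.43$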
Use the above sample information to answer all the following questions. Show explicitly all
formulas and calculations.
(12 marks)
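(a) Use the above information to compute OLS estimates of the intercept coefficient $\beta_0$ and the
slope coefficient $\beta_1$.
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{377{,}534.76}{250{,}144.32} = 1.509268 = 1.5093$$
(6 marks)
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{25{,}832.05}{88} = 293.546 \quad \text{and} \quad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{16{,}462.34}{88} = 187.072$$
Therefore
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 293.546 - 1.509268 \times 187.072 = 293.546 - 282.342 = 11.204$$
(6 marks)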
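The arithmetic in part (a) can be cross-checked with a short Python sketch. It uses only the sample sums given in the question; the variable names are illustrative.

```python
# Sample information given in Question 2.
n = 88
sum_y, sum_x = 25832.05, 16462.34
s_xy = 377534.76   # sum of (x_i - x_bar) * (y_i - y_bar)
s_xx = 250144.32   # sum of (x_i - x_bar)^2

y_bar = sum_y / n
x_bar = sum_x / n

beta1_hat = s_xy / s_xx                 # slope: thousands of dollars per square meter
beta0_hat = y_bar - beta1_hat * x_bar   # intercept: thousands of dollars

print(round(beta1_hat, 4), round(beta0_hat, 3))
```

Carrying full precision through the calculation, rather than rounding the sample means first, should give the same estimates to the number of digits reported above.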
(5 marks)
(b) Interpret the slope coefficient estimate you calculated in part (a) -- i.e., explain what the
numeric value you calculated for $\hat{\beta}_1$ means.
Note: $\hat{\beta}_1 = 1.5093$. $y_i$ is measured in thousands of dollars, and $x_i$ is measured in square
meters.
The estimate 1.5093 of $\hat{\beta}_1$ means that an increase (decrease) in house size of 1 square meter is
associated on average with an increase (decrease) in house price of 1.5093 thousand dollars,
or 1,509.3 dollars.
(6 marks)
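(c) Calculate an estimate of $\sigma^2$, the error variance.
$$\hat{\sigma}^2 = \frac{RSS}{n-2} = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n-2} = \frac{348{,}053.43}{88-2} = 4{,}047.133$$
where RSS is the residual sum of squares and $n - 2$ is the degrees of freedom after estimating two coefficients.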
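As a cross-check of part (c), the Python sketch below recomputes the error variance estimate. It also reports its square root, the standard error of the regression, which the question does not ask for and is shown only because it is in the same units as the house price (thousands of dollars). Variable names are illustrative.

```python
import math

n = 88
rss = 348053.43   # sum of squared OLS residuals, from the question
k = 2             # number of estimated coefficients: intercept and slope

sigma2_hat = rss / (n - k)       # unbiased estimator of the error variance
ser = math.sqrt(sigma2_hat)      # standard error of the regression (extra, not required)

print(round(sigma2_hat, 3), round(ser, 2))
```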
(6 marks)
(d) Compute the value of $R^2$, the coefficient of determination for the estimated OLS sample
regression equation. Briefly explain what the calculated value of $R^2$ means.
Here SST is the total sum of squares, SSE the explained sum of squares, and SSR the residual sum of squares.
$$SSE = SST - SSR = \sum_{i=1}^{n}(y_i - \bar{y})^2 - \sum_{i=1}^{n}\hat{u}_i^2 = 917{,}854.51 - 348{,}053.43 = 569{,}801.08$$
$$R^2 = \frac{SSE}{SST} = \frac{569{,}801.08}{917{,}854.51} = 0.6208$$
(4 marks)
Interpretation of $R^2 = 0.6208$: The value of 0.6208 indicates that 62.08 percent of the total
sample variation in house prices is attributable to, or explained by, the sample regression of house price on house size. (2 marks)
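The $R^2$ calculation can be verified in two ways, sketched below in Python. The first follows the answer directly. The second uses the fact that in a simple regression the explained sum of squares equals $S_{xy}^2 / S_{xx}$, which gives an independent check that the sample information is internally consistent. Variable names are illustrative.

```python
sst = 917854.51    # total sum of squares
ssr = 348053.43    # residual sum of squares
s_xy = 377534.76   # sum of (x_i - x_bar) * (y_i - y_bar)
s_xx = 250144.32   # sum of (x_i - x_bar)^2

sse = sst - ssr            # explained sum of squares
r_squared = sse / sst

# Cross-check: explained sum of squares computed from the slope formula.
sse_check = s_xy ** 2 / s_xx

print(round(r_squared, 4), round(sse, 2), round(sse_check, 2))
```

The two values of the explained sum of squares should agree up to the rounding in the reported sums.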
(6 marks)
(e) What are the values of $\sum_{i=1}^{n} \hat{u}_i$ and $\sum_{i=1}^{n} x_i \hat{u}_i$ for the sample regression equation you have
estimated? Explain briefly how you obtained your answer.
$\sum_{i=1}^{n} \hat{u}_i = 0$ (2 marks)
$\sum_{i=1}^{n} x_i \hat{u}_i = 0$ (2 marks)
These computational properties of the OLS sample regression equation follow from the first-order
conditions for the OLS coefficient estimators, which set exactly these two sums equal to zero. (2 marks)
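The two properties in part (e) hold for any OLS fit with an intercept, not just for this sample. The Python sketch below illustrates this on a small data set that is entirely made up for the purpose (it is not the house-price sample, which is not available here).

```python
# Hypothetical toy data, made up purely for illustration.
x = [50.0, 80.0, 100.0, 120.0, 150.0]
y = [90.0, 130.0, 170.0, 180.0, 240.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

sum_u = sum(residuals)                                  # first first-order condition
sum_xu = sum(xi * ui for xi, ui in zip(x, residuals))   # second first-order condition

# Both sums are zero up to floating-point error.
print(abs(sum_u) < 1e-9, abs(sum_xu) < 1e-6)
```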
Question 3 (30 marks)
Derive the Ordinary Least Squares (OLS) estimate for the simple linear regression model, i.e., 𝛽̂0
and 𝛽̂1 . Be very specific.
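Deriving the OLS estimates
The OLS estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ minimize the residual sum of squares
$$RSS(\hat{\beta}_0, \hat{\beta}_1) = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$
The first-order conditions (FOCs) for a minimum of the RSS function are obtained by setting the partial
derivatives equal to zero:
$$\frac{\partial RSS}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0, \qquad \frac{\partial RSS}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
Dividing both conditions by $-2n$, we can get:
$$n^{-1} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (1)$$
$$n^{-1} \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (2)$$
To solve the equations, pass the summation operator through equation (1):
$$\bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x} = 0$$
So
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
and plug this into equation (2) (and drop the division by n):
$$\sum_{i=1}^{n} x_i \left( y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) - \hat{\beta}_1 x_i \right) = 0$$
Simple algebra gives
$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})$$
Since $\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, if
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$$
we can write
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$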
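A numerical sanity check on the derivation, sketched in Python: on a small made-up data set, the closed-form estimates should give a smaller residual sum of squares than any nearby pair of coefficients. The data, the function name `rss` and the perturbation sizes are all assumptions chosen for illustration; this is a spot check, not a proof.

```python
# Hypothetical toy data, made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

def rss(b0, b1):
    """Residual sum of squares for a candidate intercept b0 and slope b1."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Closed-form OLS estimates from the derivation above.
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1_hat = s_xy / s_xx
b0_hat = y_bar - b1_hat * x_bar

# Moving either coefficient away from the closed-form solution should raise RSS.
steps = [-0.5, -0.1, 0.1, 0.5]
is_minimum = all(
    rss(b0_hat + d0, b1_hat + d1) > rss(b0_hat, b1_hat)
    for d0 in [0.0] + steps
    for d1 in [0.0] + steps
    if (d0, d1) != (0.0, 0.0)
)
print(b0_hat, b1_hat, is_minimum)
```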