Uploaded by Nguyen Thi Hang

Midterm Reviews

MID-TERM REVIEWS
PART A: INSTRUMENTAL VARIABLES
1.
What is the endogeneity problem? What causes endogeneity?
Endogeneity is a fairly common problem. It arises when individuals have a characteristic for which no data are
available (for example, intelligence or innate ability). That characteristic is absorbed into the random error u; if it
is correlated with the explanatory variable x, then x is an endogenous variable.
Cause: omitted variables, for example a variable measuring productivity (ability); give an example.
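The omitted-variable mechanism described above can be illustrated with a small simulation. This is only a sketch: the data-generating process, the coefficient values and the names ability, educ and wage are assumptions made for illustration, not data from the course.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

# Hypothetical data-generating process: ability is never observed.
ability = [rng.gauss(0, 1) for _ in range(n)]
educ = [0.8 * a + rng.gauss(0, 1) for a in ability]  # x is correlated with ability
# The true effect of educ is 1.0; ability ends up in the error term u.
wage = [1.0 * e + 1.0 * a + rng.gauss(0, 1) for e, a in zip(educ, ability)]

# OLS slope of wage on educ, ignoring ability.
ols_slope = cov(educ, wage) / cov(educ, educ)

# Large-sample bias: Cov(educ, ability) / Var(educ) = 0.8 / (0.8^2 + 1).
bias_in_theory = 0.8 / (0.8 ** 2 + 1)

print("OLS slope:", round(ols_slope, 3))
print("true effect plus theoretical bias:", round(1.0 + bias_in_theory, 3))
```

Because ability raises both educ and wage, the OLS slope settles well above the true value of 1.0 instead of converging to it.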
2.
What solutions are suggested when the model has an endogeneity problem?
- Use fixed effects
- Use instrumental variables
- The choice depends on the available data: with panel data, use fixed effects; without panel data, use
instrumental variables. It also depends on whether an instrument is available; proxy variables can be used,
depending on the data at hand.
3.
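What are the conditions for the variable Z to be an instrumental variable for the
endogenous variable?
- Instrument relevance: Z is correlated with the endogenous variable x, Cov(Z, x) ≠ 0. Example: motheduc is
correlated with educ.
- Instrument exogeneity: Z is uncorrelated with the error u, Cov(Z, u) = 0. Example: motheduc is uncorrelated with u
in the wage equation, because no one is paid a wage for their mother's education.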
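The two conditions can be checked on simulated data, where the instrument is exogenous by construction. The sketch below is an assumption-laden illustration (invented coefficients; z stands in for something like motheduc) of the simple IV estimator Cov(z, y) / Cov(z, x) next to OLS.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(1)
n = 50000

z = [rng.gauss(0, 1) for _ in range(n)]        # instrument (hypothetical, e.g. motheduc)
ability = [rng.gauss(0, 1) for _ in range(n)]  # unobserved, independent of z
# Relevance: educ depends on z. Endogeneity: educ also depends on ability.
educ = [0.5 * zi + 0.8 * ai + rng.gauss(0, 1) for zi, ai in zip(z, ability)]
# True effect of educ is 1.0; ability is part of the error term.
logwage = [1.0 * e + 1.0 * a + rng.gauss(0, 1) for e, a in zip(educ, ability)]

relevance = cov(z, educ)                  # should be clearly non-zero
beta_ols = cov(educ, logwage) / cov(educ, educ)
beta_iv = cov(z, logwage) / cov(z, educ)  # simple IV estimator

print("Cov(z, educ):", round(relevance, 3))
print("OLS:", round(beta_ols, 3), " IV:", round(beta_iv, 3))
```

Relevance can be checked in real data (a first-stage regression), whereas exogeneity holds here only because the simulation builds z independently of ability.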
4.
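When is the instrumental variable Z called a weak instrumental variable? Should
weak instrumental variables be used for estimation?
Z is a weak instrument when it is only weakly correlated with the endogenous explanatory variable. It should not be
used, because the resulting IV estimates are imprecise (inefficient).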
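Instrument strength is usually judged from the first-stage regression of the endogenous variable on the instrument. The sketch below computes the first-stage F statistic for a strong and a weak instrument on simulated data; the coefficients 0.5 and 0.02 are arbitrary assumptions. A common rule of thumb treats a first-stage F below about 10 as a sign of a weak instrument.

```python
import random

def first_stage_f(z, x):
    """F statistic for H0: slope = 0 in a simple regression of x on z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szz = sum((a - mean_z) ** 2 for a in z)
    sxx = sum((b - mean_x) ** 2 for b in x)
    szx = sum((a - mean_z) * (b - mean_x) for a, b in zip(z, x))
    r2 = szx ** 2 / (szz * sxx)      # R-squared of the first stage
    return r2 * (n - 2) / (1 - r2)   # with one regressor, F equals t squared

rng = random.Random(2)
n = 1000
z = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]

x_strong = [0.5 * zi + e for zi, e in zip(z, noise)]   # z moves x a lot
x_weak = [0.02 * zi + e for zi, e in zip(z, noise)]    # z barely moves x

f_strong = first_stage_f(z, x_strong)
f_weak = first_stage_f(z, x_weak)
print("first-stage F, strong instrument:", round(f_strong, 1))
print("first-stage F, weak instrument:", round(f_weak, 1))
```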
5.
Consider a simple model to estimate the effect of personal computer (𝑃𝐶) ownership on college
grade point average (GPA) for graduating seniors at a large public university:
𝐺𝑃𝐴 = 𝛽0 + 𝛽1𝑃𝐶 + 𝑢,
where 𝑃𝐶 is a binary variable indicating PC ownership.
a.
Why might 𝑃𝐶 ownership be correlated with u?
b.
Explain why PC is likely to be related to parents’ annual income. Does this mean parental
income is a good IV for PC? Why or why not?
c.
Suppose that, four years ago, the university gave grants to buy computers to roughly one-half
of the incoming students, and the students who received grants were randomly chosen. Carefully
explain how you would use this information to construct an instrumental variable for PC.
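One way to use the random grant is as a binary instrument: it shifts PC ownership but, being randomly assigned, should be unrelated to u. The sketch below uses made-up data (the 0.3 effect of PC, the ownership probabilities and the role of income are all assumptions) to compare the naive difference in mean GPA with the Wald IV estimate.

```python
import random

def mean(values):
    return sum(values) / len(values)

def diff_in_means(outcome, group):
    """Mean of outcome where group == 1 minus mean where group == 0."""
    ones = [o for o, g in zip(outcome, group) if g == 1]
    zeros = [o for o, g in zip(outcome, group) if g == 0]
    return mean(ones) - mean(zeros)

rng = random.Random(3)
n = 40000
grant, pc, gpa = [], [], []
for _ in range(n):
    g = 1 if rng.random() < 0.5 else 0           # randomly assigned grant
    income = rng.gauss(0, 1)                     # unobserved, part of u
    p_own = 0.1 + 0.5 * g + 0.3 * (income > 0)   # grant and income raise ownership
    owns = 1 if rng.random() < p_own else 0
    grade = 2.8 + 0.3 * owns + 0.3 * income + rng.gauss(0, 0.4)  # assumed true effect 0.3
    grant.append(g)
    pc.append(owns)
    gpa.append(grade)

naive = diff_in_means(gpa, pc)            # confounded by income
first_stage = diff_in_means(pc, grant)    # effect of grant on ownership
reduced_form = diff_in_means(gpa, grant)  # effect of grant on GPA
wald_iv = reduced_form / first_stage      # IV estimate of the PC effect

print("naive:", round(naive, 3), " IV:", round(wald_iv, 3))
```

The instrument is relevant only if the grant actually changed ownership, which is why the first-stage difference is computed explicitly.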
6.
In a recent article, Evans and Schwab (1995) studied the effects of attending a Catholic high
school on the probability of attending college. For concreteness, let college be a binary variable
equal to unity if a student attends college, and zero otherwise. Let CathHS be a binary variable
equal to one if the student attends a Catholic high school. A linear probability model is
𝑐𝑜𝑙𝑙𝑒𝑔𝑒 = 𝛽0 + 𝛽1𝐶𝑎𝑡ℎ𝐻𝑆 + 𝛽2𝑜𝑡ℎ𝑒𝑟𝑓𝑎𝑐𝑡𝑜𝑟𝑠 + 𝑢,
where the other factors include gender, race, family income, and parental education.
a)
Why might 𝐶𝑎𝑡ℎ𝐻𝑆 be correlated with u?
b)
Evans and Schwab have data on a standardized test score taken when each student was a
sophomore. What can be done with this variable to improve the ceteris paribus estimate of
attending a Catholic high school?
c)
Let 𝐶𝑎𝑡ℎ𝑅𝑒𝑙 be a binary variable equal to one if the student is Catholic. Discuss the two
requirements needed for this to be a valid IV for 𝐶𝑎𝑡ℎ𝐻𝑆 in the preceding equation. Which of these
can be tested?
d)
Not surprisingly, being Catholic has a significant positive effect on attending a Catholic
high school. Do you think 𝐶𝑎𝑡ℎ𝑅𝑒𝑙 is a convincing instrument for 𝐶𝑎𝑡ℎ𝐻𝑆?
7.
The data in FERTIL2.dta include, for women in Botswana during 1988, information on number
of children, years of education, age, and religious and economic status variables.
a)
Estimate the model
𝑐ℎ𝑖𝑙𝑑𝑟𝑒𝑛 = 𝛽0 + 𝛽1𝑒𝑑𝑢𝑐 + 𝛽2𝑎𝑔𝑒 + 𝛽3𝑎𝑔𝑒² + 𝑢
by OLS and interpret the estimates. In particular, holding 𝑎𝑔𝑒 fixed, what is the estimated effect of
another year of education on fertility? If 100 women receive another year of education, how many
fewer children are they expected to have?
b)
The variable 𝑓𝑟𝑠𝑡ℎ𝑎𝑙𝑓 is a dummy variable equal to one if the woman was born during the
first six months of the year. Assuming that 𝑓𝑟𝑠𝑡ℎ𝑎𝑙𝑓 is uncorrelated with the error term from part
(a), show that 𝑓𝑟𝑠𝑡ℎ𝑎𝑙𝑓 is a reasonable IV candidate for 𝑒𝑑𝑢𝑐. (Hint: You need to do a
regression.)
c)
Estimate the model from part (a) by using 𝑓𝑟𝑠𝑡ℎ𝑎𝑙𝑓 as an IV for 𝑒𝑑𝑢𝑐. Compare the
estimated effect of education with the OLS estimate from part (a).
d)
Add the binary variables 𝑒𝑙𝑒𝑐𝑡𝑟𝑖𝑐, 𝑡𝑣, and 𝑏𝑖𝑐𝑦𝑐𝑙𝑒 to the model and assume these are
exogenous. Estimate the equation by OLS and 2SLS and compare the estimated coefficients on
𝑒𝑑𝑢𝑐. Interpret the coefficient on 𝑡𝑣 and explain why television ownership has a negative effect on
fertility.
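The mechanics of 2SLS can be sketched without the actual FERTIL2 data. The example below uses simulated data that only borrows the variable names from the exercise; all coefficients are assumptions, the age controls are left out for brevity, and the printed numbers say nothing about the real dataset.

```python
import random

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(4)
n = 20000
frsthalf = [1 if rng.random() < 0.5 else 0 for _ in range(n)]  # instrument
taste = [rng.gauss(0, 1) for _ in range(n)]                    # unobserved
educ = [6.0 + 1.0 * z - 1.0 * t + rng.gauss(0, 1) for z, t in zip(frsthalf, taste)]
# Assumed true effect of educ on children is -0.2.
children = [4.0 - 0.2 * e + 0.5 * t + rng.gauss(0, 1) for e, t in zip(educ, taste)]

# Stage 1: regress the endogenous variable on the instrument, keep fitted values.
a1, b1 = ols(frsthalf, educ)
educ_hat = [a1 + b1 * z for z in frsthalf]

# Stage 2: regress the outcome on the fitted values.
_, beta_2sls = ols(educ_hat, children)

# With one instrument this equals the ratio of reduced form to first stage.
beta_ratio = ols(frsthalf, children)[1] / b1
_, beta_ols = ols(educ, children)

print("OLS:", round(beta_ols, 3), " 2SLS:", round(beta_2sls, 3))
```

Running the second stage by hand reproduces the 2SLS coefficient but not the correct standard errors, so in practice a dedicated IV routine should be used for inference.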
PART B: PANEL MODELS
1.
What are the advantages of the panel data model over the cross-sectional data model?
Panel data carry information along both the cross-sectional and the time dimension, can deal with endogeneity, and
allow richer analysis.
Cov(X, ci) = 0 => RE
Cov(X, ci) ≠ 0 => FE
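Why FE is preferred when Cov(X, ci) ≠ 0 can be seen in a small simulation: demeaning each unit's data (the within transformation) removes ci. The panel dimensions and coefficients below are assumptions chosen for illustration.

```python
import random

def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(5)
n_units, n_periods = 2000, 5
x_all, y_all, x_within, y_within = [], [], [], []

for _ in range(n_units):
    c_i = rng.gauss(0, 1)                                       # unobserved unit effect
    xs = [0.7 * c_i + rng.gauss(0, 1) for _ in range(n_periods)]  # Cov(x, c_i) != 0
    ys = [2.0 * x + c_i + rng.gauss(0, 1) for x in xs]          # true slope is 2.0
    x_bar = sum(xs) / n_periods
    y_bar = sum(ys) / n_periods
    x_all += xs
    y_all += ys
    # Within transformation: subtract unit means, which wipes out c_i.
    x_within += [x - x_bar for x in xs]
    y_within += [y - y_bar for y in ys]

pooled = slope(x_all, y_all)          # ignores c_i, so it is biased here
fixed_effects = slope(x_within, y_within)

print("pooled OLS:", round(pooled, 3), " FE:", round(fixed_effects, 3))
```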
2.
Compare the FE and RE models. What is the basis for choosing FE or RE to
estimate the model with panel data? Which test is used for selection?
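The usual selection test is the Hausman test, which compares the FE and RE estimates. For a single coefficient the statistic is H = (b_FE − b_RE)² / (Var_FE − Var_RE), asymptotically chi-squared with 1 degree of freedom under H0 (RE is consistent). The sketch below uses invented estimates and standard errors purely to show the arithmetic; hausman_one_coef is a hypothetical helper, not a library function.

```python
import math

def hausman_one_coef(b_fe, se_fe, b_re, se_re):
    """Hausman statistic and p-value for one coefficient (chi-squared, 1 df)."""
    var_diff = se_fe ** 2 - se_re ** 2
    if var_diff <= 0:
        raise ValueError("Var(FE) - Var(RE) must be positive for this formula")
    h = (b_fe - b_re) ** 2 / var_diff
    # For 1 degree of freedom, P(chi2 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value

# Hypothetical numbers, not taken from any output in these notes.
h_stat, p_val = hausman_one_coef(b_fe=0.50, se_fe=0.05, b_re=0.40, se_re=0.03)
print("H =", round(h_stat, 2), " p-value =", round(p_val, 4))
```

A small p-value rejects H0, which points to FE; a large p-value means RE cannot be rejected and is preferred for efficiency.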
3.
What is balanced panel data? What causes an unbalanced panel?
4.
The diagnostic test results for the regression model are reported as follows:
Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (2094)  =    2.1e+05
Prob>chi2    =     0.0000

What problem does the model have? How can it be fixed?
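This is a test for heteroskedasticity (non-constant error variance across groups). Because the p-value is very small,
we reject H0 of constant variance: the model has groupwise heteroskedasticity.
To correct for it, re-estimate the model with the robust option so that the standard errors are robust.
5.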
Use the data in JTRAIN to determine the effect of the job training grant on hours of job training
per employee. The basic model for the three years is
ℎ𝑟𝑠𝑒𝑚𝑝𝑖𝑡 = 𝛽0 + 𝛿1𝑑88𝑡 + 𝛿2𝑑89𝑡 + 𝛽1𝑔𝑟𝑎𝑛𝑡𝑖𝑡 + 𝛽2𝑔𝑟𝑎𝑛𝑡_1𝑖,𝑡 + 𝛽3 log(𝑒𝑚𝑝𝑙𝑜𝑦𝑖𝑡 ) + 𝑐𝑖 + 𝑢𝑖𝑡
a.
Estimate the equation using fixed effects. How many firms are used in the FE estimation?
How many total observations would be used if each firm had data on all variables (in particular,
hrsemp) for all three years?
The FE estimation uses 135 firms. If every firm had data on all variables for all three years, there would be
135 × 3 = 405 observations. Because of missing data, only 390 observations are used in the fixed
effects estimation.
b.
Interpret the coefficient on grant and comment on its significance.
The coefficient on grant means that if a firm received a grant for the current year, it trained each worker, on average,
34.2 hours more than it otherwise would have. This is a large effect, and because the t statistic is
also very large, the effect is highly statistically significant.
c.
Do larger firms provide their employees with more or less training, on average? How big
are the differences?
6.
Use the state-level data on murder rates and executions in MURDER for the following
exercise.
a.
Consider the unobserved effects model
𝑚𝑟𝑑𝑟𝑡𝑒𝑖𝑡 = 𝜇𝑡 + 𝛽1𝑒𝑥𝑒𝑐𝑖𝑡 + 𝛽2𝑢𝑛𝑒𝑚𝑖𝑡 + 𝑐𝑖 + 𝑢𝑖𝑡
where 𝜇𝑡 simply denotes different year intercepts and 𝑐𝑖 is the unobserved state effect. If past
executions of convicted murderers have a deterrent effect, what should be the sign of
𝛽1? What sign do you think 𝛽2 should have? Explain.
In the given unobserved effects model:
If past executions of convicted murderers have a deterrent effect, the sign of 𝛽1
should be negative. This is because prospective murderers would fear the punishment
that follows conviction, and this fear would reduce the murder rate.
The sign of 𝛽2 would be expected to be positive: with a higher unemployment rate,
the murder rate is expected to increase due to increased violence.
b.
Using just the years 1990 and 1993, estimate the equation from part (a) by pooled OLS.
Ignore the serial correlation problem in the composite errors. Do you find any evidence for a
deterrent effect?
The pooled OLS estimate on 𝑒𝑥𝑒𝑐 is positive, so there is no evidence of a deterrent
effect. For scale, the murder rate is relatively low: the average
murder rate in 1993 was about 8.7, so a reduction of one would be nontrivial.
7.
To evaluate the impact of FDI on business performance, we consider the following model
ln 𝑉𝐴𝑖𝑡 = 𝛽0 + 𝛽1 ln 𝐾𝑖𝑡 + 𝛽2 ln 𝐿𝑖𝑡 + 𝛽3𝐹𝐷𝐼𝑖𝑡 + 𝛽4𝑒𝑑𝑢𝑖𝑡 + 𝛽5𝑅𝐷𝑖𝑡 + 𝛽6𝑠𝑖𝑧𝑒𝑖𝑡 + 𝑐𝑖 + 𝑢𝑖𝑡
where ln 𝑉𝐴 is the natural logarithm of total value added; 𝐹𝐷𝐼 is a dummy variable, equal to 1 if
the enterprise has foreign direct investment, and zero otherwise; ln 𝐾 is the natural logarithm of
total capital; ln 𝐿 is the natural logarithm of total labor; 𝑒𝑑𝑢 is labor training cost/total labor; 𝑅𝐷 is the
total cost of research and development/total investment; 𝑠𝑖𝑧𝑒 is the size of the business, a categorical
variable with 4 categories (1-super small; 2-small; 3-medium; 4-large), with 𝑠𝑖𝑧𝑒_1 as the base
category.
The estimated results of the panel model are reported below.
Fixed-effects (within) regression               Number of obs      =   736,694
Group variable: ma_thue                         Number of groups   =   105,242

R-sq:  within  = 0.3026                         Obs per group: min =         7
       between = 0.8641                                        avg =       7.0
       overall = 0.7677                                        max =         7

                                                F(8,105241)        =  15766.30
corr(u_i, Xb)  = 0.4678                         Prob > F           =    0.0000

                          (Std. Err. adjusted for 105,242 clusters in ma_thue)
------------------------------------------------------------------------------
             |               Robust
      ln_VA_ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ln_K_ |    .152219   .0014121   107.80   0.000     .1494513    .1549867
       ln_L_ |   .7200339   .0031237   230.51   0.000     .7139116    .7261563
        fdi_ |   .2223424   .0356943     6.23   0.000     .1523822    .2923027
        edu_ |   .0001049   .0000345     3.04   0.002     .0000373    .0001725
        R&D_ |  -.0000101   .0000192    -0.53   0.599    -.0000477    .0000275
             |
       size_ |
          2  |    .033728   .0042856     7.87   0.000     .0253283    .0421277
          3  |  -.0277995   .0080698    -3.44   0.001    -.0436162   -.0119827
          4  |  -.0874293   .0115103    -7.60   0.000    -.1099893   -.0648694
             |
       _cons |   3.944356   .0112145   351.72   0.000     3.922376    3.966336
-------------+----------------------------------------------------------------
     sigma_u |  .67424384
     sigma_e |  .63714114
         rho |  .52827013   (fraction of variance due to u_i)
------------------------------------------------------------------------------
a.
How many observations are included in the data? Is the data balanced?
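There are 736,694 observations on 105,242 firms, and min = avg = max = 7 observations per group, so the panel is
balanced (105,242 × 7 = 736,694). If min < max, the data would be unbalanced: some units would be missing
observations for some years.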
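The balance check is simple arithmetic on the header of the output. A minimal sketch, using the figures reported above (the variable names are just labels chosen here):

```python
# Figures read from the header of the reported regression output.
n_obs = 736_694
n_groups = 105_242
obs_min, obs_max = 7, 7

# A panel is balanced when every group has the same number of periods,
# so the observation count must equal groups times periods.
is_balanced = (obs_min == obs_max) and (n_obs == n_groups * obs_max)

print("groups x periods =", n_groups * obs_max)
print("balanced:", is_balanced)
```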
b.
Is the above result estimated from the fixed effects model or the random effects model?
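The header of the output reads "Fixed-effects (within) regression", so this is the fixed effects model. If the header
did not state the model, look at the p-value of the Hausman test to see which model is appropriate.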
c.
Explain the meaning of the estimated coefficient of the variable 𝐹𝐷𝐼
𝐹𝐷𝐼 is a dummy variable. Its estimated coefficient is positive and statistically significant at the 1% level: holding
the other factors fixed, FDI enterprises perform better (have higher value added) than domestic enterprises.
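Because the dependent variable is in logs, the size of the FDI effect can be read as a percentage difference in value added. For a dummy variable the exact conversion is 100 × (exp(b) − 1), which is somewhat larger than the approximation 100 × b when b is not small. A short sketch using the reported coefficient and confidence interval:

```python
import math

# Coefficient and 95% confidence interval on fdi_ from the reported output.
b_fdi = 0.2223424
ci_low, ci_high = 0.1523822, 0.2923027

def pct_effect(b):
    """Exact percentage change in VA implied by a log-point coefficient b."""
    return 100 * (math.exp(b) - 1)

approx_pct = 100 * b_fdi            # log-point approximation
exact_pct = pct_effect(b_fdi)       # exact conversion
exact_ci = (pct_effect(ci_low), pct_effect(ci_high))

print("approximate effect: %.1f%%" % approx_pct)
print("exact effect: %.1f%%" % exact_pct)
print("exact 95%% CI: %.1f%% to %.1f%%" % exact_ci)
```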
d.
From the estimated coefficient of the variable 𝑒𝑑𝑢, how do you conclude about the impact
of spending on labor training on the performance of the enterprise?
Spending on labor training has a positive effect on output: the estimated coefficient is positive and statistically
significant (p = 0.002).
e.
From the estimated coefficient of the variable RD, how do you conclude about the impact
of spending on R&D on the performance of the business?
The estimated coefficient on 𝑅𝐷 is negative but not statistically significant (p = 0.599), so there is no evidence
that R&D spending affects output.
f.
Can it be concluded that enterprise size has a positive effect on firm performance? Why?
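No. The size variable has 4 categories: super small, small, medium and large, with super small as the base group.
Group 2 (small) has a positive, statistically significant coefficient, but groups 3 and 4 (medium and large) have
lower output than the base group under otherwise similar conditions. Group 4 performs worst, since its negative
coefficient is the largest in absolute value. Size therefore does not have a uniformly positive effect.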