ECO 2400 - 01, 02
Homework 2
Due Date: March 27 (11:59 PM)
About:
Homework 2 engages you with statistical inference and various functional forms by testing the relationship between firm profits and capital. We will be using data from a representative survey conducted by the National Sample Survey Organization (NSS) in the year
2010-11. The survey covers 300,000 unorganized enterprises in India. This homework draws
a random sample of establishments from the original survey. Before starting the homework,
it would be a good idea to familiarize yourselves with the variables which will be considered
in this exercise and their units.
Rules of Engagement:
Students will work in groups of 2, with each group submitting one homework. You are
required to submit the homework solutions as a Word document or PDF, together with
the Stata log file. Homework submitted without the Stata log file will automatically lose half of the points scored. Students are encouraged to collaborate
across groups, and also use existing online resources for assistance with their Stata coding.
Students can email the instructor and the teaching fellow with questions on the homework,
or use office hours for the same. However, the instructor and the teaching fellow will stop
responding to queries on the homework after March 25, 11:59 AM (60 hours prior to the
homework being due). Except for documented medical exigencies, no extensions will be
granted for the homework. The maximum possible score is 50.
1 Homework Assignment
1.1 Hypothesis Testing
1. Before we go to Stata, draw a figure of the t-distribution with 150 degrees of freedom.
Consider a one-sided hypothesis test first, where $H_0: \beta_j = \alpha_j$ and $H_a: \beta_j > \alpha_j$. On the
figure, shade the “rejection region” associated with the probability of a Type-I Error being 0.05. What is the corresponding t-statistic associated with the probability of a Type-I
Error being 0.05? For what values of $t_{\hat{\beta}_j}$ can you reject $H_0$ in favour of $H_a$ at the 5% level? [3]
2. Repeat the exercise in question 1 above for a 2-sided hypothesis test. Explain how a two-sided hypothesis test differs from a one-sided hypothesis test. Assume your $t_{\hat{\beta}_j} = 1.31$. How would you
re-state this t-statistic in terms of $\beta_j$, $\hat{\beta}_j$ and $\mathrm{se}(\hat{\beta}_j)$? [4]
3. Let’s start with the following specification:
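\[
\begin{aligned}
\ln(\text{gva}_i) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i + \beta_4\,\text{female}_i \\
&+ \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 + \epsilon_i
\end{aligned}
\tag{1}
\]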
4. Interpret the β1 coefficient and comment on both its statistical and “economic” significance. In your understanding, does capital stock have a large or small impact for these
enterprises? [2]
5. What is the 99% confidence interval for β1? Can you reject the null hypothesis that β1 = 0.08 at the 1% level using a 2-sided test? [2]
6. Can you reject the null hypothesis that β1 = 0.07 at the 10% level using a 2-sided test?
Can you reject the null hypothesis that β1 = 0.07 at the 10% level using a 1-sided test? Can
you reject the null hypothesis that β1 = 0.07 at the 5% level using a 2-sided test? [3]
7. Now augment equation (1) with contemporaneous rainfall and 5 lags of rainfall (std_arf, l1_std_arf,
l2_std_arf, l3_std_arf, l4_std_arf, l5_std_arf). std_arf and its lags are standardized
measures of annual rainfall incidence in the district.
After including the rainfall lags, how does the β1 coefficient change in terms of magnitude
and statistical significance? Interpret the coefficients on l2_std_arf and l3_std_arf. [1]
8. Conduct an F-test to determine whether you should keep the contemporaneous rainfall measure and the 5 rainfall lags in your specification (std_arf, l1_std_arf, l2_std_arf, l3_std_arf, l4_std_arf, l5_std_arf).
You need to report your numerator and denominator degrees of freedom, and the
residual sum of squares (or $R^2$) from the restricted and unrestricted specifications. [2]
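The arithmetic behind the F-test in question 8 can be sketched as follows. This is a minimal illustration in Python rather than Stata, assuming the usual homoskedastic F-statistic for exclusion restrictions; the function names and the numbers in the usage note are hypothetical and are not taken from the homework data. Here `q` is the number of excluded regressors (six rainfall variables in question 8) and `df_denom` is the residual degrees of freedom of the unrestricted model.

```python
def f_stat_from_rss(rss_restricted, rss_unrestricted, q, df_denom):
    """F-statistic for q exclusion restrictions, from residual sums of squares.

    F = ((RSS_r - RSS_ur) / q) / (RSS_ur / df_denom)
    """
    if q <= 0 or df_denom <= 0:
        raise ValueError("degrees of freedom must be positive")
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_denom)


def f_stat_from_r2(r2_restricted, r2_unrestricted, q, df_denom):
    """Equivalent R-squared form; valid when both models share the same outcome.

    F = ((R2_ur - R2_r) / q) / ((1 - R2_ur) / df_denom)
    """
    if q <= 0 or df_denom <= 0:
        raise ValueError("degrees of freedom must be positive")
    return ((r2_unrestricted - r2_restricted) / q) / ((1 - r2_unrestricted) / df_denom)
```

For example, with made-up values `f_stat_from_rss(120.0, 100.0, q=6, df_denom=1000)`, the statistic would then be compared with the critical value of the F-distribution with (6, 1000) degrees of freedom.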
1.2 Heterogeneity Tests
1. Let’s return to equation (1). Interpret the β3 and β5 coefficients. Are these coefficients
statistically significant? What does this indicate about how enterprise location and the social background of the enterprise owner affect enterprise profitability? [2]
2. Does the impact of capital (consider ln(capital)) vary by whether the enterprise operates
in a rural location? You need to a) write out the estimating equation b) interpret your coefficients of interest (coefficient magnitude and statistical significance) c) report the impact
of capital on profits for urban enterprises and d) report the impact of capital on profits for
rural enterprises. [4]
3. Does the impact of capital (consider ln(capital)) vary by whether the enterprise is owned
by an individual from a socially marginalized community? You need to a) write out the
estimating equation b) interpret your coefficients of interest (coefficient magnitude and statistical significance) c) report the impact of capital on profits for owners not from socially
marginalized backgrounds and d) report the impact of capital on profits for owners from
socially marginalized backgrounds. [3]
4. Now consider the following equation:
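\[
\begin{aligned}
\ln(\text{gva}_i) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i + \beta_4\,\text{female}_i \\
&+ \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 + \beta_8\,\text{BranchPC}_d + \epsilon_i
\end{aligned}
\tag{2}
\]
where $\text{BranchPC}_d$ is the number of bank branches per capita (branch10_pc) in district d,
where enterprise i is located.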
Estimate equation (2) using OLS. What is the interpretation of the coefficient β8? Is β8 statistically
significant? Can you comment on whether bank branches per capita has a “large”
or “small” impact on enterprise profitability (economic significance of β8)? [3]
5. Now consider the following equation:
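\[
\begin{aligned}
\ln(\text{gva}_i) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i \\
&+ \beta_4\,\text{female}_i + \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 \\
&+ \beta_8\,\text{BranchPC}_d + \beta_9 \ln(\text{capital}_i) \times \text{BranchPC}_d + \epsilon_i
\end{aligned}
\tag{3}
\]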
Estimate equation (3) using OLS. What is the interpretation of the β1 coefficient? What
is the interpretation of the β8 coefficient? What is the interpretation of the β9 coefficient?
Based on the β9 coefficient, what can you say about how the relationship between capital
and enterprise profitability varies with the presence of bank branches? [4]
6. Construct the dummy variables $\text{HighBranchPC}_d$ and $\text{VHighBranchPC}_d$, where a) $\text{HighBranchPC}_d = 1$
if $\text{BranchPC}_d > 68.1573$ and $\text{BranchPC}_d \le 93.84007$, and b) $\text{VHighBranchPC}_d = 1$ if
$\text{BranchPC}_d > 93.84007$ (each dummy equals 0 otherwise). The two thresholds are the 50th and 75th percentiles of the bank branch density distribution across all districts. Now estimate the following equation using OLS:
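\[
\begin{aligned}
\ln(\text{gva}_i) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i \\
&+ \beta_4\,\text{female}_i + \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 \\
&+ \beta_8\,\text{HighBranchPC}_d + \beta_9 \ln(\text{capital}_i) \times \text{HighBranchPC}_d \\
&+ \beta_{10}\,\text{VHighBranchPC}_d + \beta_{11} \ln(\text{capital}_i) \times \text{VHighBranchPC}_d + \epsilon_i
\end{aligned}
\tag{4}
\]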
What is the interpretation of the coefficient β1? What is the interpretation of the coefficients
β9 and β11? Test the null hypothesis β9 = β11 using a 2-sided test at the 5% level. Report
the standard error associated with the estimated difference β9 − β11.
Note: you are not permitted to use Stata’s test command for this test! [5]
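The standard error asked for in question 6 follows from the variance of a difference of two estimators, Var(a − b) = Var(a) + Var(b) − 2 Cov(a, b). The sketch below illustrates that arithmetic in Python; it is one possible manual route, not necessarily the one intended by the instructor, and all function names and numbers are hypothetical. In Stata, the variances and the covariance would have to be read from the estimated coefficient covariance matrix after the regression.

```python
import math


def se_of_difference(var_a, var_b, cov_ab):
    """Standard error of (a - b) given Var(a), Var(b) and Cov(a, b)."""
    variance = var_a + var_b - 2.0 * cov_ab
    if variance < 0:
        raise ValueError("implied variance is negative; check the inputs")
    return math.sqrt(variance)


def t_for_equality(est_a, est_b, var_a, var_b, cov_ab):
    """t-statistic for the null hypothesis a = b, i.e. a - b = 0."""
    return (est_a - est_b) / se_of_difference(var_a, var_b, cov_ab)
```

The resulting t-statistic is compared with the two-sided 5% critical value for the regression's residual degrees of freedom. An alternative that avoids the covariance term is to re-parameterize the regression so that the difference appears as a single coefficient.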
1.3 One Prediction!
1. Use equation (1) to predict the profitability of an enterprise if all the predictors equal their
sample means. Test the null hypothesis that your predicted value of the outcome ($\widehat{\ln(\text{gva})}$)
equals 9.5 at the 5% level using a 2-sided test. [3]
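The test in this question, like the coefficient tests in Section 1.1, reduces to comparing (estimate − null value) / standard error with a critical value. The sketch below shows that logic in Python using only the standard library. It assumes the residual degrees of freedom are large enough for the standard normal to approximate the t-distribution, and every name and number in it is a hypothetical placeholder rather than a result from the homework data. The one-sided version tests against the alternative that the parameter exceeds the null value.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=True):
    """Normal-approximation critical value for significance level alpha."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def t_statistic(estimate, se, null_value):
    """Distance of the estimate from the null value, in standard errors."""
    return (estimate - null_value) / se


def reject_null(estimate, se, null_value, alpha, two_sided=True):
    """True if the null is rejected at level alpha (one-sided: H_a is 'greater')."""
    t = t_statistic(estimate, se, null_value)
    c = critical_value(alpha, two_sided)
    return abs(t) > c if two_sided else t > c


def confidence_interval(estimate, se, level):
    """Two-sided confidence interval, e.g. level=0.99 for a 99% interval."""
    c = critical_value(1 - level, two_sided=True)
    return estimate - c * se, estimate + c * se
```

With a hypothetical estimate of 0.106, standard error 0.02 and null value 0.07, the t-statistic is about 1.8, so `reject_null` returns `True` for a two-sided test at the 10% level but `False` at the 5% level.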
1.4 Linear Probability Models
1. Estimate the following specification using OLS:
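\[
\begin{aligned}
\Pr(\text{AnyCredit}_i = 1) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i + \beta_4\,\text{female}_i \\
&+ \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 + \beta_8\,\text{BranchPC}_d + \epsilon_i
\end{aligned}
\tag{5}
\]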
Interpret the coefficients β4 and β8. Are they statistically significant at the 1% level? Comment on the economic significance of both coefficients. [3]
2. Use Stata’s predict command to obtain the fitted values from the OLS estimation of equation (5)
in question 1 of this section. What percentage of the predicted values is a) less than 0 or b) greater than 1? Based
on these percentages, how would you evaluate the reliability of the linear probability
model in this particular case? [2]
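3. Now, redefine $\text{HighBranchPC}_d$ to equal 1 if $\text{BranchPC}_d > 68.1573$ and 0 otherwise.
Estimate the following specification using OLS:
\[
\begin{aligned}
\Pr(\text{AnyCredit}_i = 1) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i \\
&+ \beta_4\,\text{female}_i + \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 \\
&+ \beta_8\,\text{HighBranchPC}_d + \beta_9\,\text{rural}_i \times \text{HighBranchPC}_d + \epsilon_i
\end{aligned}
\tag{6}
\]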
Interpret the coefficients β3, β8 and β9. Does the presence of bank branches affect whether
rural enterprises can access credit? [2]
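The check in question 2 of this section amounts to counting fitted probabilities that fall outside the unit interval. A minimal sketch of that count is shown below in Python, assuming the fitted values have already been exported as a plain list of numbers; the function name and the sample values are hypothetical.

```python
def share_outside_unit_interval(fitted):
    """Return (percent below 0, percent above 1) for a list of fitted values."""
    n = len(fitted)
    if n == 0:
        raise ValueError("no fitted values supplied")
    below = sum(1 for p in fitted if p < 0)   # impossible as probabilities
    above = sum(1 for p in fitted if p > 1)   # impossible as probabilities
    return 100.0 * below / n, 100.0 * above / n


# Hypothetical fitted values, for illustration only.
example = [-0.1, 0.2, 0.5, 1.2, 0.9, 1.0, 0.0, -0.3]
pct_below, pct_above = share_outside_unit_interval(example)
```

Values exactly equal to 0 or 1 are counted as valid probabilities here, matching the strict inequalities in the question.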
1.5 Extra Credit
1. Estimate the following specification using OLS:
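\[
\begin{aligned}
\Pr(\text{AnyCredit}_i = 1) ={}& \beta_0 + \beta_1 \ln(\text{capital}_i) + \beta_2 \ln(\text{wages}_i) + \beta_3\,\text{rural}_i \\
&+ \beta_4\,\text{female}_i + \beta_5\,\text{marginalized}_i + \beta_6\,\text{age}_i + \beta_7\,\text{age}_i^2 \\
&+ \beta_8\,\text{HighBranchPC}_d + \beta_9\,\text{marginalized}_i \times \text{HighBranchPC}_d \\
&+ \beta_{10}\,\text{rural}_i \times \text{HighBranchPC}_d + \beta_{11}\,\text{marginalized}_i \times \text{rural}_i \\
&+ \beta_{12}\,\text{marginalized}_i \times \text{rural}_i \times \text{HighBranchPC}_d + \epsilon_i
\end{aligned}
\tag{7}
\]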
Interpret the coefficients β8, β9, β10 and β12. Does the presence of bank branches affect
whether marginalized enterprise owners can access credit? Do rural enterprises owned by
individuals from marginalized communities have a higher likelihood of receiving credit if they
are located in a district with relatively high bank branch density? [5]
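One way to organise the interpretation is to difference equation (7) with respect to the branch dummy, holding the other regressors fixed. The sketch below is derived only from the terms that appear in equation (7); it is a reading aid under that specification, not a substitute for the interpretation the question asks for.

```latex
\begin{aligned}
&\Pr(\text{AnyCredit}_i = 1 \mid \text{HighBranchPC}_d = 1)
 - \Pr(\text{AnyCredit}_i = 1 \mid \text{HighBranchPC}_d = 0) \\
&\quad = \beta_8 + \beta_9\,\text{marginalized}_i + \beta_{10}\,\text{rural}_i
 + \beta_{12}\,\text{marginalized}_i \times \text{rural}_i \\[4pt]
&\text{urban, not marginalized:}\quad \beta_8 \\
&\text{urban, marginalized:}\quad \beta_8 + \beta_9 \\
&\text{rural, not marginalized:}\quad \beta_8 + \beta_{10} \\
&\text{rural, marginalized:}\quad \beta_8 + \beta_9 + \beta_{10} + \beta_{12}
\end{aligned}
```

Each row is the change in the predicted probability of credit access associated with being in a high-branch district for that group of enterprises.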
2 Variable Descriptions
1. Gross value addition (gva): measured as total revenues less total operating expenses, over
the past 30 days. This is a proxy for profits. Operating expenses do not cover wages.
2. Machinery and tools (tot_pm): measured as the stock value of machinery and tools owned
by the enterprise at the time of the survey.
3. Age (age): enterprise age, based on year of incorporation.
4. Wages (tot_wages): total wages paid.
5. Rural (rural): dummy equaling 1 if the enterprise is operating from a rural location.
6. Marginalized (marg_owner): dummy equaling 1 if the enterprise is owned by an individual
from a socially marginalized community (Dalit or Adivasi).
7. Female (marg_owner): dummy equaling 1 if the enterprise is owned by a female.
8. Branch Per Capita (branch10_pc): commercial bank branches per million individuals in
the district.
9. Rainfall (std_arf): standardized annual rainfall incidence in the district. The annual
rainfall variable is standardized for each district, based on the long-term (40 years) mean
and standard deviation of rainfall in the district.
Variables prefixed ln represent the natural log of the concerned variable.