Uploaded by alessiava2000

Econometrics

QUESTION 1:
Describe the data: which type of data do you have? Can you provide some preliminary
descriptions of the dependent variable lwage? Are all the variables characterised by both
individual and temporal variability?
The data in the dataset WAGE.dta are panel data, because they contain several cross-sectional units (workers) observed over several time periods.
The first command to run is xtset, which tells Stata that I am dealing with panel data, with id (the worker's identifying number) as the panel variable and year as the time variable.
Xtset: used to declare a dataset as a panel or longitudinal dataset. The basic syntax of the xtset command is as follows:
xtset entity_var time_var
• entity_var refers to the variable that uniquely identifies each entity or individual in the panel dataset.
• time_var refers to the variable that represents the time dimension in the panel dataset.
xtset id year
Panel variable: id (strongly balanced)
Time variable: year, 1980 to 1987
Delta: 1 unit
- The dataset is strongly balanced, which simplifies the analysis: each entity in the dataset has the same number of observations, one for every time period, so no entity-period is missing. Note that even if a dataset is not strongly balanced, it can still be used for panel data analysis. Various techniques and models, such as fixed effects or random effects models, can handle unbalanced panel data.
Xtdes provides a summary of the panel data characteristics, including the number of entities, time
periods, and the distribution of observations across entities and time.
xtdes
      id:  1, 2, ..., 545                                    n =        545
    year:  1980, 1981, ..., 1987                             T =          8
           Delta(year) = 1 unit
           Span(year)  = 8 periods
           (id*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         8       8       8         8         8       8       8

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+----------
      545    100.00  100.00 |  11111111
 ---------------------------+----------
      545    100.00         |  XXXXXXXX
- With xtdes I can see that the panel variable id takes 545 values (n=545) and the time variable takes 8 values (T=8). Therefore, the total number of observations is n x T = 4360. The small table at the bottom of this output confirms that the dataset is balanced: all 545 workers share the pattern 11111111, i.e. they are observed in every year.
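The balance check described above can also be reproduced by hand. The following is a minimal sketch, assuming WAGE.dta sits in the working directory and uses the variable names id and year shown in the output; T_i is a hypothetical helper variable introduced only for this illustration.

```stata
* Sketch: confirm that (id, year) identifies each row and that N = n x T.
* Assumes WAGE.dta is in the working directory.
use WAGE.dta, clear
xtset id year

isid id year                      // stops with an error if (id, year) is not a unique key
count                             // total number of rows, N
display "n x T = " 545*8          // 4360 for n = 545 and T = 8
bysort id: generate T_i = _N      // T_i (hypothetical name): observations per worker
summarize T_i                     // min = max = 8 in a strongly balanced panel
```

If the minimum and maximum of T_i differ, the panel is unbalanced and xtdes will show more than one pattern.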
Lwage as dependent variable: lwage is the logarithm of wages and is commonly used as a dependent variable in econometric models. The log transformation is often used to address issues such as skewness, heteroscedasticity, and non-linearity that can be present in wage data.
My dependent variable is lwage, which is the log transformation of the variable wage. Log transformations are useful because variables in logs are more likely to approach normal distributions and because the interpretation of coefficients in regressions is more convenient, since they can be read as percentages.
Xtsum lwage: used to summarize panel or longitudinal data. It provides descriptive statistics and
information about the structure of the panel dataset.
Variable         |      Mean   Std. dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
lwage    overall |  1.649147   .5326094  -3.579079    4.05186 |     N =    4360
         between |             .3907468   .3333435   3.174173 |     n =     545
         within  |             .3622636  -2.467201   3.204687 |     T =       8
With xtsum I obtain some general descriptive statistics about my dependent variable. The mean value is 1.65 and the variable takes values in a range from -3.58 to 4.05. In addition, the overall standard deviation (0.53) is split into the between component (0.39) and the within component (0.36). However, the within SD is not computed appropriately because the number of degrees of freedom is not correct, so the varanaeasy command is more convenient.
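The between and within components reported by xtsum can be rebuilt with a few built-in commands. This is only a sketch, assuming WAGE.dta is in the working directory; lwage_ibar, lwage_dev and first_row are hypothetical names introduced for the illustration, and the last line shows one possible degrees-of-freedom adjustment (dividing by NT - n rather than NT - 1), which is not necessarily the one used by varanaeasy.

```stata
* Sketch: between and within standard deviations of lwage by hand.
use WAGE.dta, clear
xtset id year

bysort id: egen double lwage_ibar = mean(lwage)   // worker-specific mean
generate double lwage_dev = lwage - lwage_ibar    // deviation from the worker's own mean
egen byte first_row = tag(id)                     // flags one row per worker

summarize lwage                      // overall SD
summarize lwage_ibar if first_row    // between SD: spread of the 545 worker means
summarize lwage_dev                  // within SD, with NT - 1 in the denominator

* Same within SD rescaled to use NT - n degrees of freedom (n = 545 workers)
display r(sd)*sqrt((r(N) - 1)/(r(N) - 545))
```

The first three summarize calls should match the overall, between and within lines of the xtsum output; the rescaled value is slightly larger than the within SD because n worker means have been estimated.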
QUESTION 2:
Estimate by POLS, RE, FE, FD the role of time dummies on lwage. Interpret the estimates.
I create the temporal dummies tau* using the command dtime and test their significance with testparm.
After dtime year I first have to run a regression of lwage on tau*, so that testparm can be used afterwards:
. dtime year
. reg lwage tau*
note: tau1987 omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =     4,360
-------------+----------------------------------   F(7, 4352)      =     50.54
       Model |  92.9668229         7  13.2809747   Prob > F        =    0.0000
    Residual |  1143.56282     4,352  .262767192   R-squared       =    0.0752
-------------+----------------------------------   Adj R-squared   =    0.0737
       Total |  1236.52964     4,359  .283672779   Root MSE        =    .51261

------------------------------------------------------------------------------
       lwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     tau1980 |  -.4730023   .0310529   -15.23   0.000    -.5338818   -.4121228
     tau1981 |  -.3536121   .0310529   -11.39   0.000    -.4144916   -.2927326
     tau1982 |  -.2948122   .0310529    -9.49   0.000    -.3556918   -.2339327
     tau1983 |  -.2472159   .0310529    -7.96   0.000    -.3080954   -.1863363
     tau1984 |  -.1761842   .0310529    -5.67   0.000    -.2370638   -.1153047
     tau1985 |   -.127069   .0310529    -4.09   0.000    -.1879485   -.0661895
     tau1986 |  -.0667606   .0310529    -2.15   0.032    -.1276401    -.005881
     tau1987 |          0  (omitted)
       _cons |   1.866479   .0219577    85.00   0.000     1.823431    1.909528
------------------------------------------------------------------------------
Then I implement testparm: it is commonly used after estimating a regression model to assess the joint significance of a group of coefficients or to compare nested models. It helps in understanding whether a set of variables or coefficients collectively has a statistically significant impact on the dependent variable.
. testparm tau*
 ( 1)  tau1980 = 0
 ( 2)  tau1981 = 0
 ( 3)  tau1982 = 0
 ( 4)  tau1983 = 0
 ( 5)  tau1984 = 0
 ( 6)  tau1985 = 0
 ( 7)  tau1986 = 0

       F(  7,  4352) =   50.54
            Prob > F =    0.0000
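The dtime command used here does not appear to be part of official Stata, so it may not be installed everywhere. As a hedged alternative, the same regression and joint test can be sketched with built-in factor-variable notation, assuming WAGE.dta is in the working directory; ib1987.year makes 1987 the base year, matching the dummy that was omitted in the output.

```stata
* Sketch: year dummies through factor variables instead of tau* dummies.
use WAGE.dta, clear
xtset id year

regress lwage ib1987.year     // one dummy per year, 1987 as the base category
testparm i.year               // joint F test that all year effects are zero
```

With the same sample, the coefficients and the F statistic with 7 numerator degrees of freedom should match the tau* version shown above.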
Time dummies are jointly significant, meaning that temporal heterogeneity characterizes this dataset.
I then start estimating the models.
-> POLS: pooled OLS is a commonly used regression method when there is no concern for unobserved heterogeneity or correlation within groups. It assumes that the relationship between the dependent variable and the independent variables is constant across all individuals or groups.
The model estimated by POLS suggests that all the temporal dummies are significant, since all the p-values are less than 0.05 (I chose this value as my significance level). The estimated coefficients are all negative because each one measures the difference in average lwage between that year and the omitted base year, 1987: wages were lower in every earlier year and rose steadily over the period. For example, the coefficient on tau1980 (-0.473) means that the average log wage in 1980 was 0.473 log points below its 1987 level, i.e. about 38% lower (exp(-0.473) - 1 ≈ -0.377); reading the coefficient directly as 47% overstates the gap.
Nevertheless, POLS is not able to capture heterogeneity adequately, so it is better to switch to FE, RE and FD models.
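Because lwage is in logs, a dummy coefficient b is a difference in log points, and the exact percentage difference is 100*(exp(b) - 1). The lines below are a small worked check using two coefficients copied from the POLS output above; nothing else is assumed.

```stata
* Sketch: convert log-point differences into percentage differences.
display 100*(exp(-.4730023) - 1)   // tau1980 vs 1987: roughly -37.7 percent
display 100*(exp(-.0667606) - 1)   // tau1986 vs 1987: roughly -6.5 percent
```

For small coefficients such as the one on tau1986 the approximation "coefficient x 100" is close; for large ones such as tau1980 it is not.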
The dummy variable regression gives us exactly the same estimates of the βj that we would obtain
from the regression on time-demeaned data, and the standard errors and other major statistics are
identical. Therefore, a FE estimator can be obtained.
-> FE: Fixed Effects estimation is particularly useful when there is concern about omitted variables or unobserved heterogeneity that may bias the results. By accounting for individual/group-specific effects, the Fixed Effects model helps to mitigate such biases and provide more reliable estimates of the relationship between the dependent and independent variables.
The estimated coefficients are the same even with the FE transformation, because no other explanatory variable is included in the model. All the taus are still significant, which gives even stronger evidence of temporal variability.
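To compare the pooled and fixed-effects estimates side by side, something like the following could be used. It is a sketch that assumes WAGE.dta is in the working directory and replaces the tau* dummies with factor-variable notation; pols and fe are arbitrary labels for the stored results.

```stata
* Sketch: POLS and FE with time dummies only, shown in one table.
use WAGE.dta, clear
xtset id year

regress lwage ib1987.year          // pooled OLS
estimates store pols
xtreg lwage ib1987.year, fe        // within (fixed-effects) estimator
estimates store fe

estimates table pols fe, b(%9.4f) se(%9.4f)
```

In a balanced panel with only time dummies the point estimates should coincide across the two columns, while the standard errors need not.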
-> RE: the Random Effects model treats the individual/group-specific effects as random draws that are uncorrelated with the regressors, and takes into account the error covariance structure that these effects induce within each individual. In Stata this is done with the "xtreg" command with the "re" option, which by default uses feasible GLS (maximum likelihood estimation is available through the "mle" option).
The Random Effects model is useful when the individual effects can be assumed to be uncorrelated with the explanatory variables: it uses both within and between variation, so it is more efficient than FE and can also estimate the coefficients of time-invariant regressors.
Even with random effects, as could easily have been predicted, all the estimated coefficients are the same, for the reasons mentioned above. The time dummies are still all significant, since their p-values are all smaller than 0.05.
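The two ways of estimating a random-effects model in Stata can be sketched as follows, assuming WAGE.dta is in the working directory and using factor-variable year dummies in place of tau*.

```stata
* Sketch: random effects by feasible GLS (default) and by maximum likelihood.
use WAGE.dta, clear
xtset id year

xtreg lwage ib1987.year, re        // feasible GLS random-effects estimator
xtreg lwage ib1987.year, mle       // maximum-likelihood random-effects estimator
```

The two estimators usually give very similar coefficients; they differ in how the variance components are estimated.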
-> FD: First Differences estimation, like Fixed Effects, is useful when there is concern about omitted variables or unobserved heterogeneity that may bias the results. Instead of demeaning, it subtracts from each observation the previous observation of the same individual, which removes the individual/group-specific fixed effect and helps to mitigate such biases.
In summary, a First Differences model identifies the effects of the independent variables on the dependent variable from period-to-period changes within each individual or group.
Here I implemented a first-differences model. As could have been predicted, the estimated coefficients here are different, because Stata computes the difference between consecutive observations of each variable.
The interpretation of the coefficients in a first-difference model focuses on the changes over time rather than the absolute levels. The model captures the within-entity changes by subtracting the lagged values from the current values, allowing for the assessment of the impact of changes in the independent variables on the changes in the dependent variable.
Here the signs of the coefficients change: all the values are positive, suggesting a positive (even though probably very weak) relationship between lwage and the time dummies. What differs from the other models is that here all time dummies except tau1980 are not statistically significant.
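One way to see what first differencing does with time dummies is to regress the differenced wage on a full set of year indicators. This is a sketch under stated assumptions, not the specification that produced the output discussed above: it assumes WAGE.dta is in the working directory, and d_lwage is a hypothetical variable name introduced here.

```stata
* Sketch: average year-to-year change in lwage from a first-difference regression.
use WAGE.dta, clear
xtset id year

generate double d_lwage = D.lwage     // lwage(t) - lwage(t-1); missing in 1980
* One coefficient per year 1981-1987 and no constant:
* each coefficient is the average change in lwage between t-1 and t.
regress d_lwage ibn.year if !missing(d_lwage), noconstant vce(cluster id)
```

In a balanced panel each coefficient equals the difference between two consecutive year effects in levels; for 1981, for example, -.3536121 - (-.4730023) = .1193902 using the POLS output above.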
QUESTION 3:
Add to the model the variables educ, black and hisp, re-estimate the model by POLS, RE, and FE, and test for heteroskedasticity and autocorrelation. Comment on the estimated results across the estimation methods.
By adding these 3 explanatory variables to the POLS model I obtain this output.
The coefficients provide several pieces of information. Education has a positive effect on wage: a one-unit increase in educ is associated with a wage about 7.7% higher. Being Hispanic has almost no effect on the wage, while, perhaps surprisingly, being black is associated with a wage about 12% lower.
The only variable that is not significant is hisp.
Hettest test : The hettest test, also known as the Breusch-Pagan test or Breusch-Pagan-Godfrey
test, is a statistical test used in econometrics to check for heteroskedasticity in regression
models. Heteroskedasticity occurs when the variance of the errors in a regression model is not
constant across all levels of the independent variables. In other words, the variability of the
residuals changes as the values of the independent variables change.
Whitetst test: the whitetst test, also known as the White test, is a statistical test used to detect heteroskedasticity in regression models, similar to the hettest test (Breusch-Pagan test). Both tests are designed to check for the presence of heteroskedasticity, which occurs when the variance of the residuals in a regression model is not constant across different levels of the independent variables.
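Both tests are available as built-in postestimation commands after regress, which avoids depending on the separately installed whitetst. The following is a sketch assuming WAGE.dta is in the working directory and that the POLS model contains educ, black, hisp and the year dummies.

```stata
* Sketch: heteroskedasticity tests after pooled OLS.
use WAGE.dta, clear

regress lwage educ black hisp ib1987.year
estat hettest                 // Breusch-Pagan / Cook-Weisberg test, H0: constant variance
estat imtest, white           // White's general test, H0: homoskedasticity
```

A small p-value in either test is evidence against constant error variance.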
If we estimate the same model by fixed effects, educ, black and hisp are omitted because of collinearity: these variables do not change over time for a given worker, so they are perfectly collinear with the individual fixed effects and are wiped out by the within transformation.
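Whether a regressor will be dropped by the within transformation can be checked before estimating anything. A minimal sketch, assuming WAGE.dta is in the working directory:

```stata
* Sketch: look for regressors with no variation over time.
use WAGE.dta, clear
xtset id year

xtsum educ black hisp     // a within standard deviation of 0 means the variable
                          // is constant for each worker and cannot be estimated by FE
```

Any variable whose within standard deviation is zero is absorbed by the individual effects.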
Xttest3: it is used to perform the Modified Wald test for groupwise heteroskedasticity in fixed effects regression models for panel data.
The Modified Wald test is a test for heteroskedasticity in the context of panel data with fixed effects (within-effects) regression models. It is used to determine whether the assumption of constant variance (homoskedasticity) of the error terms is violated in a fixed effects model.
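A possible way to run the test is sketched below. xttest3 is a user-written command, so the sketch assumes it has been installed (for example from SSC) and that WAGE.dta is in the working directory; the yr1 to yr8 dummies are hypothetical names created here with tabulate, used instead of tau* so that the example does not depend on dtime.

```stata
* Sketch: modified Wald test for groupwise heteroskedasticity after FE.
* ssc install xttest3          // user-written; run once if it is not installed
use WAGE.dta, clear
xtset id year

tabulate year, generate(yr)    // yr1 ... yr8: one dummy per year
xtreg lwage educ black hisp yr2-yr8, fe
xttest3                        // H0: sigma(i)^2 = sigma^2 for all i
```

The test is run immediately after xtreg, fe because it uses the stored estimation results.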
In both the Breusch-Pagan test and the White test, the null hypothesis is not rejected, so there is no evidence against homoskedasticity in the POLS model.
Since I also estimated my model by fixed effects, I implemented the modified Wald test for groupwise heteroskedasticity, whose null hypothesis is
H0: sigma(i)^2 = sigma^2 for all i
Here the null hypothesis is rejected, hence groupwise heteroskedasticity is present.
Finally, I estimated the same model by random effects. The coefficients are identical to those estimated by POLS. This is expected here: the panel is balanced and every regressor is either constant over time for each worker (educ, black, hisp) or common to all workers in a given year (the time dummies), so the GLS transformation leaves the point estimates unchanged. Even with RE, the variable hisp is not statistically significant.
Xtserial: the Wooldridge test for autocorrelation in panel data is a robust test that addresses the issue of autocorrelation in the residuals of fixed effects or random effects panel data models. It is designed to account for the panel structure and provides more reliable results when dealing with correlated errors over time for individual units.
With xtserial, I tested for autocorrelation using the Wooldridge test. The null hypothesis of no first-order autocorrelation is rejected, meaning that the residuals of the regression model are correlated across time periods. Autocorrelation is common in panel data.
The last thing I'd like to point out is that, since hisp turned out not to be statistically significant in both POLS and RE, and testparm confirms that it is not significant, I can omit this variable.
QUESTION 4:
Re-estimate the model with POLS, RE and FE, and cluster-robust standard errors. Compare the
standard errors among the estimation methods. What do you think about the comparison
between POLS and RE? And RE and FE?
First of all, after removing the variable hisp the coefficients for educ and black are almost unchanged.
The second thing to notice is that every remaining variable is statistically significant.
Estimating the models with cluster-robust standard errors is useful to deal with the heteroskedasticity and the autocorrelation found before.
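The three estimators with standard errors clustered on the worker can be sketched as follows, assuming WAGE.dta is in the working directory and using factor-variable year dummies in place of tau*; pols_cl, re_cl and fe_cl are arbitrary labels for the stored results.

```stata
* Sketch: POLS, RE and FE with cluster-robust standard errors.
use WAGE.dta, clear
xtset id year

regress lwage educ black ib1987.year, vce(cluster id)
estimates store pols_cl
xtreg lwage educ black ib1987.year, re vce(cluster id)
estimates store re_cl
xtreg lwage educ black ib1987.year, fe vce(cluster id)
estimates store fe_cl

estimates table pols_cl re_cl fe_cl, b(%9.4f) se(%9.4f)
```

Clustering on id allows the errors of the same worker to be heteroskedastic and correlated over time, while still assuming independence across workers.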
If we compare POLS and RE, we notice that the estimated parameters are exactly identical, and even the standard errors are identical. For instance, the standard error associated with educ is 0.0087876 both with POLS and RE. Only the confidence intervals change, in an almost imperceptible way, presumably because regress builds them from the t distribution while xtreg with the re option uses the normal distribution.
With the FE model, the collinearity problem is not solved by omitting hisp: educ and black are time-invariant and are still dropped. The standard errors of the time dummies are almost identical between FE and RE; for instance, the SE of tau1986 is 0.0259931 in FE and 0.0259871 in RE.
QUESTION 5:
Add to the model the variables union, married, experience and squared experience, re-estimate
by POLS, RE, and FE, without and with time dummies.
By adding the new regressors, the coefficients change. The coefficient for educ is still positive but higher (from 0.08 to 0.09), and that of black is still negative but larger in absolute value (from -0.12 to -0.14). This may be due to omitted variable bias in the previous specification.
Experience has a positive impact on wage (0.07), as does being married (0.11). Squared experience instead has a negative coefficient (-0.002), which indicates that the return to experience diminishes as experience grows, while being in a union has a strong impact (0.18). Here the only non-significant variables are some of the time dummies.
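The negative coefficient on squared experience can be read through the implied turning point of the experience profile. The derivation below is a sketch that uses the rounded POLS coefficients quoted in the text (0.07 and -0.002); the symbols \beta_1 and \beta_2 are introduced here for illustration, and the result is only as precise as the rounding.

```latex
\frac{\partial\,\text{lwage}}{\partial\,\text{exper}}
  = \beta_1 + 2\beta_2\,\text{exper},
\qquad
\text{exper}^{*} = -\frac{\beta_1}{2\beta_2}
  \approx \frac{0.07}{2 \times 0.002} = 17.5
```

Under these rounded values the marginal return to experience is positive but shrinking up to roughly 17.5 years of experience, which is the usual concave wage-experience profile.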
Without time dummies the output is more compact.
By implementing the model with fixed effects, there is still a collinearity problem with educ, black and exper. Being married has a much lower impact (0.05) than in the POLS model, and this difference is significant, since 0.11 lies outside the new confidence interval. Squared experience still has a low but negative impact on wage, and being in a union has a lower positive impact (from 0.18 to 0.08). Every remaining variable is significant.
By removing the time dummies, I solved the problem of collinearity at least for exper, presumably because experience grows by one unit each year and is therefore collinear with the time dummies once the individual effects are included. The signs of the coefficients are all the same and the sizes of the coefficients are similar to those of the model that included time dummies. Experience has a large positive impact on wage (0.12).
With random effects, the signs are the same and the coefficients are very similar in size to those of the POLS model. Some of the time dummies are still not significant.
By eliminating the time dummies, the coefficients are similar in size (almost identical) to those of the RE model with time dummies. The output is now easier to read.
The models I just analyzed use robust standard errors. I now estimate the FE and RE models with time dummies but without robust standard errors, so that I can implement the Hausman test.
QUESTION 6:
Test the consistency of RE for the model under QUESTION 5 with time dummies. Propose an
approach to consistently estimate ALL the variables and implement the consistency test for RE.
Hausman test: the Hausman test is an econometric test used to determine whether the fixed effects (FE) or the random effects (RE) model is more appropriate for panel data analysis. Under the null hypothesis the individual effects are uncorrelated with the regressors, so both estimators are consistent and RE is efficient; under the alternative only FE is consistent.
The null hypothesis is rejected, so the FE model is preferred.
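The sequence of commands behind the test can be sketched as follows. It assumes WAGE.dta is in the working directory and that the regressors are named educ, black, union, married and exper as in the question; squared experience is written with factor-variable notation (c.exper##c.exper) because its variable name is not shown in the text, and fe and re are arbitrary labels.

```stata
* Sketch: Hausman test of RE against FE, with conventional standard errors.
use WAGE.dta, clear
xtset id year

xtreg lwage educ black union married c.exper##c.exper ib1987.year, fe
estimates store fe
xtreg lwage educ black union married c.exper##c.exper ib1987.year, re
estimates store re

hausman fe re      // H0: difference in coefficients not systematic (RE consistent)
```

The consistent estimator (FE) is listed first and the efficient one (RE) second; only the coefficients estimated by both models enter the comparison.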
EXAM 2
INTRO:
Use in STATA the dataset: pwt91.dta
It contains the Penn World Table (PWT), which attempts to construct estimates of output, prices, and the like which are reasonably comparable across a large number of countries. See Summers and Heston (1991), “The Penn World Table (Mark 5): An Expanded Set of International Comparisons, 1950–1988,” Quarterly Journal of Economics 106, 327–68.
The variables to be used are:
cgdpo Output-side real GDP at current PPPs (in mil. 2011US$)
csh_c Share of household consumption at current PPPs
csh_i Share of gross capital formation at current PPPs
csh_g Share of government consumption at current PPPs
csh_x Share of merchandise exports at current PPPs
csh_m Share of merchandise imports at current PPPs
We would like to model the growth rate in the PWT real output per capita datum for country i and
year t as depending on the growth rate in country i’s volume of trade (as a fraction of real output)
in year t and also on the growth rates in the proportions of real output which country i allocates to
consumption, investment, and government spending in year t.
QUESTION 1:
Describe the data: which type of data do you have? How is the panel?
Construct the variables of the model. (Hints: g lcgdpo=ln(cgdpo), g y=d.lcgdpo, g
pippo=csh_x+csh_m, g lpippo=ln(pippo), g open=d.lpippo, drop pippo lpippo, g SHC=d.csh_c, g
SHI=d.csh_i, g SHG=d.csh_g)
How does the panel change if you consider the availability of the model’s variables?
The data in the dataset pwt91.dta are panel data, because it contains observations for multiple
countries (entities) over several years (time periods).
Firstly I use the describe command to check the variable names in the dataset to identify the
variables representing the country and year.
. describe

Contains data from /Users/alessiavanni/Desktop/Econometrics/pwt91.dta
 Observations:        12,376
    Variables:            52                  5 Apr 2019 22:30
--------------------------------------------------------------------------------
Variable       Storage  Display   Value
name           type     format    label            Variable label
--------------------------------------------------------------------------------
countrycode    str3     %9s
country        str34    %34s                       Country name
currency_unit  str29    %29s                       Currency unit
year           int      %8.0g
rgdpe          float    %9.0g                      Expenditure-side real GDP at chained PPPs (in mil. 2011US$)
rgdpo          float    %9.0g                      Output-side real GDP at chained PPPs (in mil. 2011US$)
pop            float    %9.0g                      Population (in millions)
emp            float    %9.0g                      Number of persons engaged (in millions)
avh            float    %9.0g                      Average annual hours worked by persons engaged (source: The Conference Board)
hc             float    %9.0g                    * Human capital index, see note hc
ccon           float    %9.0g                      Real consumption of households and government, at current PPPs (in mil. 2011US$)
cda            float    %9.0g                    * Real domestic absorption, see note cda
cgdpe          float    %9.0g                      Expenditure-side real GDP at current PPPs (in mil. 2011US$)
cgdpo          float    %9.0g                      Output-side real GDP at current PPPs (in mil. 2011US$)
cn             float    %9.0g                      Capital stock at current PPPs (in mil. 2011US$)
ck             float    %9.0g                      Capital services levels at current PPPs (USA=1)
ctfp           float    %9.0g                      TFP level at current PPPs (USA=1)
cwtfp          float    %9.0g                      Welfare-relevant TFP levels at current PPPs (USA=1)
rgdpna         float    %9.0g                      Real GDP at constant 2011 national prices (in mil. 2011US$)
rconna         float    %9.0g                      Real consumption at constant 2011 national prices (in mil. 2011US$)
rdana          float    %9.0g                      Real domestic absorption at constant 2011 national prices (in mil. 2011US$)
rnna           float    %9.0g                      Capital stock at constant 2011 national prices (in mil. 2011US$)
rkna           float    %9.0g                      Capital services at constant 2011 national prices (2011=1)
rtfpna         float    %9.0g                      TFP at constant national prices (2011=1)
rwtfpna        float    %9.0g                      Welfare-relevant TFP at constant national prices (2011=1)
labsh          float    %9.0g                      Share of labour compensation in GDP at current national prices
irr            float    %9.0g                      Real internal rate of return
delta          float    %9.0g                      Average depreciation rate of the capital stock
xr             float    %9.0g                      Exchange rate, national currency/USD (market+estimated)
pl_con         float    %9.0g                      Price level of CCON (PPP/XR), price level of USA GDPo in 2011=1
pl_da          float    %9.0g                      Price level of CDA (PPP/XR), price level of USA GDPo in 2011=1
pl_gdpo        float    %9.0g                      Price level of CGDPo (PPP/XR), price level of USA GDPo in 2011=1
i_cig          byte     %12.0g    i_cig_label    * 0/1/2, see note i_cig
i_xm           byte     %12.0g    i_xm_label     * 0/1/2, see note i_xm
i_xr           byte     %12.0g    i_xr_label       0/1: the exchange rate is market-based (0) or estimated (1)
i_outlier      byte     %8.0g     i_outlier_label
                                                 * 0/1, see note i_outlier
i_irr          byte     %17.0g    i_irr_label    * 0/1/2/3, see note i_irr
cor_exp        float    %9.0g                    * Correlation between expenditure shares, see note cor_exp
statcap        float    %9.0g                      Statistical capacity indicator (source: World Bank, developing countries only)
csh_c          float    %9.0g                      Share of household consumption at current PPPs
csh_i          float    %9.0g                      Share of gross capital formation at current PPPs
csh_g          float    %9.0g                      Share of government consumption at current PPPs
csh_x          float    %9.0g                      Share of merchandise exports at current PPPs
csh_m          float    %9.0g                      Share of merchandise imports at current PPPs
csh_r          float    %9.0g                      Share of residual trade and GDP statistical discrepancy at current PPPs
pl_c           float    %9.0g                      Price level of household consumption, price level of USA GDPo in 2011=1
pl_i           float    %9.0g                      Price level of capital formation, price level of USA GDPo in 2011=1
pl_g           float    %9.0g                      Price level of government consumption, price level of USA GDPo in 2011=1
pl_x           float    %9.0g                      Price level of exports, price level of USA GDPo in 2011=1
pl_m           float    %9.0g                      Price level of imports, price level of USA GDPo in 2011=1
pl_n           float    %9.0g                      Price level of the capital stock, price level of USA 2011=1
pl_k           float    %9.0g                      Price level of the capital services, price level of USA=1
                                                 * indicated variables have notes
--------------------------------------------------------------------------------
Sorted by: countrycode year
I then generate the new variables as asked.
g lcgdpo=ln(cgdpo)
(2,395 missing values generated)
. g y=d.lcgdpo
(2,579 missing values generated)
. g pippo=csh_x+csh_m
(2,391 missing values generated)
. g lpippo=ln(pippo)
(9,345 missing values generated)
. g open=d.lpippo
(9,826 missing values generated)
. drop pippo lpippo
. g SHC=d.csh_c
(2,573 missing values generated)
. g SHI=d.csh_i
(2,573 missing values generated)
. g SHG=d.csh_g
(2,573 missing values generated)
I then use the xtset command to inform Stata that I'm dealing with panel data, with id as the panel variable and year as the time variable.
. encode countrycode, generate (id)
. drop countrycode
. xtset id year
Panel variable: id (strongly balanced)
Time variable: year, 1950 to 2017
Delta: 1 unit
xtset reports that the dataset is strongly balanced: every country has a row for each year from 1950 to 2017. This concerns only the id-year structure of the file. The new variables contain many missing values (see the counts reported when they were generated), so the sample actually usable for the model is smaller than the full panel.
Another way to assess whether the panel is balanced is the command xtdes.
The output indicates that the dataset is a balanced panel with 182 entities (cross-sectional units) and 68 time periods (years): there is a row for every entity-year combination. This does not guarantee that every variable is observed in every row; individual variables can still have missing values.
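To see how the panel changes once the availability of the model's variables is taken into account, xtdescribe can be restricted to the rows where all of them are observed. The sketch below assumes pwt91.dta is in the working directory and rebuilds the variables following the hints in the question; the last two commands are the actual check.

```stata
* Sketch: panel structure of the sample usable for the growth model.
use pwt91.dta, clear
encode countrycode, generate(id)
xtset id year

generate lcgdpo = ln(cgdpo)
generate y      = D.lcgdpo
generate lopen  = ln(csh_x + csh_m)     // lopen: hypothetical name for ln(pippo)
generate open   = D.lopen
generate SHC    = D.csh_c
generate SHI    = D.csh_i
generate SHG    = D.csh_g

xtdescribe if !missing(y, open, SHC, SHI, SHG)   // patterns of the usable sample
count      if !missing(y, open, SHC, SHI, SHG)   // number of usable observations
```

The regression reported under QUESTION 2 uses 2,546 observations in 126 clusters, far fewer than the 12,376 rows of the file, so the usable panel is strongly unbalanced.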
The variable I describe here is lcgdpo, the log transformation of the variable cgdpo; its first difference y is the dependent variable of the model. Log transformations are useful because variables in logs are more likely to approach normal distributions and because the interpretation of coefficients in regressions is more convenient, since they can be read as percentages.
To continue describing the variable, we can use the command summarize.
The variable "lcgdpo" is a continuous numerical variable. The summary statistics provide insight into its distribution. The standard deviation is 2.272648, and the skewness close to zero (-0.0640123) suggests that the distribution is nearly symmetric, with a slightly longer left tail.
The median (approximately 10.2012) being close to the mean (approximately 10.22008) also supports the notion of the data being symmetrically distributed. However, the kurtosis (2.835745) is slightly below 3, the value for a normal distribution, which indicates slightly lighter tails than those of a normal distribution.
Another command to check skewness and kurtosis is sktest.
The low p-values for both tests (0.0091 and 0.0003) indicate strong evidence against the null hypothesis that "lcgdpo" follows a normal distribution. Therefore, we can reject the assumption of normality for this variable.
The joint test, which combines the skewness and kurtosis tests, also yields a low p-value of 0.0001, further confirming that "lcgdpo" does not conform to a normal distribution.
These results suggest that the distribution of "lcgdpo" departs significantly from normality, although with nearly 10,000 observations even small departures are detected.
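The statistics discussed above can be obtained with two built-in commands. This is a sketch assuming pwt91.dta is in the working directory; lcgdpo is rebuilt as in the hints.

```stata
* Sketch: moments and normality test for lcgdpo.
use pwt91.dta, clear
generate lcgdpo = ln(cgdpo)

summarize lcgdpo, detail
display "skewness = " r(skewness) "   kurtosis = " r(kurtosis)   // normal: 0 and 3
sktest lcgdpo                 // H0: skewness and kurtosis match a normal distribution
```

Comparing the kurtosis with 3 rather than with 0 matters, because summarize reports the raw kurtosis and not the excess kurtosis.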
Now it is time to run the iqr command.
The output provides a comprehensive summary of the distribution of the variable "lcgdpo" and the presence of potential outliers. The mean (10.22) and median (10.2) are relatively close, indicating a central tendency close to the middle of the data distribution.
The standard deviation (2.273) and pseudo standard deviation (2.311) provide measures of dispersion, and the IQR (3.118) gives a robust measure of the spread of the central 50% of the data.
The presence of mild outliers is indicated by the number of data points outside the inner fences. There are 23 potential mild outliers below the lower inner fence and 17 potential mild outliers above the upper inner fence. The percentage of mild outliers is relatively small, suggesting that the majority of data points are within the inner fences.
There are no potential severe outliers, as indicated by the absence of data points beyond the outer fences.
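The fences behind these counts can be computed by hand from the quartiles. The sketch below assumes pwt91.dta is in the working directory and uses the usual convention of 1.5 and 3 interquartile ranges for the inner and outer fences; whether the iqr command uses exactly the same quartile definition is an assumption.

```stata
* Sketch: inner and outer fences for lcgdpo.
use pwt91.dta, clear
generate lcgdpo = ln(cgdpo)

quietly summarize lcgdpo, detail
local q1  = r(p25)
local q3  = r(p75)
local iqr = `q3' - `q1'

count if lcgdpo < `q1' - 1.5*`iqr'                       // below the lower inner fence
count if lcgdpo > `q3' + 1.5*`iqr' & !missing(lcgdpo)    // above the upper inner fence
count if lcgdpo < `q1' - 3*`iqr'                         // below the lower outer fence
count if lcgdpo > `q3' + 3*`iqr' & !missing(lcgdpo)      // above the upper outer fence
```

The !missing() condition is needed on the upper side because Stata treats missing values as larger than any number.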
To verify what was just said, we can draw some graphs. The first one is the boxplot.
graph box lcgdpo, over(year, label(angle(90) labsize(vsmall))) marker(20, mlabel(id))
saving(box.gph, replace)
Now to verify the tail we can implement the histogram.
histogram lcgdpo, fraction normal bin(10) title("…") saving(histfre.gph, replace)
The histogram is roughly bell-shaped and approximately symmetric around the center, so the distribution looks close to normal, with the mean, median, and mode approximately equal, even though the formal sktest rejects exact normality.
Now, in order to investigate the variability, I can execute the command varanaeasy for the dependent variable.
. varanaeasy lcgdpo id year
___ variable lcgdpo ___
Statistics
  NT      9981
  Nmin    55        Navg    146.77941     Nmax    182
  Tmin    13        Tavg    54.840659     Tmax    68
Note: differences among statistics of individuals or/and time-periods --> the panel is unbalanced

Test of the significance of individual effects
  Fnum_i    Fden_i    F_i          Fpval_i
  181       9732      2012.2858    0.00
Test of the significance of time effects
  Fnum_t    Fden_t    F_t          Fpval_t
  67        9732      567.88435    0.00

Statistics: mean and variability (standard deviations)
  Total mean (x..)                               10.220082
  Total sd (xit-x..)                             2.2726484
  Between sd inter_id (xi.-x..)                  2.1368129
  Between sd inter_year (x.t-x..)                .70988751
  Within sd intra_id_year (xit-xi.-x.t+x..)      .36090439
  Within sd intra_id (xit-xi.)                   .79693976
  Within sd intra_year (xit-x.t)                 2.1679262
____________________________________________________________________________________________
Percentages of overall sum of squared dev. due to individuals, time, and residuals
Two-ways individuals & temporal
  % between inter_id (xi.-x..)/(xit-x..)                               87.926375
  % between inter_year (x.t-x..)/(xit-x..)                             9.6144393
  % within intra_id_year (xit-xi.-x.t+x..)/(xit-x..)                   2.4591859
- Focus on One-way individuals: intra_id+inter_id
  % within intra_id (xit-xi.)/(xit-x..)                                12.073625
  of which explained by between inter_year (%) (x.t-x..)/(xit-xi.)     79.631752
- Focus on One-way temporal: intra_year+inter_year
  % within intra_year (xit-x.t)/(xit-x..)                              90.385561
  of which explained by between inter_id (%) (xi.-x..)/(xit-x.t)       97.279227
____________________________________________________________________________________________
The output provides insight into the sources of variability in the variable "lcgdpo" in an unbalanced panel setting. The significant F-statistics for individual and time effects suggest that both play a crucial role in explaining the variation in "lcgdpo" across the dataset.
The decomposition of variability shows that most of the total variability (about 88%) is due to differences among individuals, followed by differences among time periods (about 10%). The low percentage of residual within-individual and within-time variation (about 2.5%) indicates that it contributes relatively little to the overall variability.
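The percentages in the varanaeasy output follow from a two-way split of each deviation from the overall mean. The block below is a sketch of that identity in the notation of the output labels, followed by an arithmetic check on the reported shares; treating the three components as adding up exactly is an assumption about how the command handles the unbalanced panel, although the reported figures do sum to 100.

```latex
x_{it}-\bar{x}_{..}
  = \underbrace{(\bar{x}_{i.}-\bar{x}_{..})}_{\text{inter\_id}}
  + \underbrace{(\bar{x}_{.t}-\bar{x}_{..})}_{\text{inter\_year}}
  + \underbrace{(x_{it}-\bar{x}_{i.}-\bar{x}_{.t}+\bar{x}_{..})}_{\text{intra\_id\_year}}

87.93\% + 9.61\% + 2.46\% = 100\%,
\qquad
9.61\% + 2.46\% = 12.07\% \;(\text{intra\_id}),
\qquad
\frac{9.61}{12.07} \approx 79.6\%,
\qquad
\frac{87.93}{90.39} \approx 97.3\%
```

So almost all of the variation that remains within a year is cross-country variation, and about four fifths of the variation within a country is common time variation.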
QUESTION 2:
Estimate, by POLS, RE, FE, and FD with cluster standard errors, the model yit as a function of
openit, SHCit SHIit SHGit time-dummies.
Interpret and comparatively discuss the estimates.
I create the temporal dummies tau* using the command dtime and test their significance with testparm, after running a regression of lcgdpo on tau*.
reg y open SHC SHI SHG tau*, vce(cluster id)
note: tau1950 omitted because of collinearity.
note: tau1955 omitted because of collinearity.
Linear regression                               Number of obs     =      2,546
                                                F(70, 125)        =      27.20
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3745
                                                Root MSE          =     .08176
(Std. err. adjusted for 126 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0083645 .002189 -3.82 0.000 -.0126968 -.0040322
SHC | -1.040524 .0804048 -12.94 0.000 -1.199655 -.8813933
SHI | -.1580099 .0942423 -1.68 0.096 -.3445271 .0285074
SHG | -.5822732 .2955081 -1.97 0.051 -1.16712 .0025741
tau1950 | 0 (omitted)
tau1951 | -.000719 .0164227 -0.04 0.965 -.0332216 .0317836
tau1952 | -.0414502 .0126615 -3.27 0.001 -.0665089 -.0163915
tau1953 | -.0215499 .0139741 -1.54 0.126 -.0492063 .0061066
tau1954 | -.0210786 .013646 -1.54 0.125 -.0480856 .0059285
tau1955 | 0 (omitted)
tau1956 | -.0412841 .0116236 -3.55 0.001 -.0642887 -.0182796
tau1957 | -.0273057 .0077197 -3.54 0.001 -.042584 -.0120273
tau1958 | -.038976 .0095905 -4.06 0.000 -.0579568 -.0199951
tau1959 | -.0367763 .0123355 -2.98 0.003 -.0611898 -.0123629
tau1960 | -.0167422 .0135499 -1.24 0.219 -.0435592 .0100749
tau1961 | -.0385565 .0127822 -3.02 0.003 -.0638542 -.0132589
tau1962 | -.0268461 .0251163 -1.07 0.287 -.0765544 .0228622
tau1963 | -.0044766 .0156646 -0.29 0.776 -.0354788 .0265255
tau1964 | -.0012946 .0157425 -0.08 0.935 -.032451 .0298618
tau1965 | -.0157602 .0149434 -1.05 0.294 -.045335 .0138146
tau1966 | -.0373597 .0143899 -2.60 0.011 -.0658391 -.0088803
tau1967 | -.0121991 .0103084 -1.18 0.239 -.0326007 .0082025
tau1968 | -.0265933 .0095316 -2.79 0.006 -.0454575 -.0077291
tau1969 | -.0191154 .0109927 -1.74 0.085 -.0408713 .0026405
tau1970 | -.0109004 .0137323 -0.79 0.429 -.0380784 .0162776
tau1971 | -.0201685 .01061 -1.90 0.060 -.041167 .0008299
tau1972 | -.0265155 .010157 -2.61 0.010 -.0466174 -.0064135
tau1973 | -.0110248 .0131538 -0.84 0.404 -.0370577 .0150081
tau1974 | .018153 .0190256 0.95 0.342 -.019501 .0558071
tau1975 | -.020317 .0212394 -0.96 0.341 -.0623523 .0217184
tau1976 | -.0058766 .015333 -0.38 0.702 -.0362224 .0244692
tau1977 | -.0101955 .0138402 -0.74 0.463 -.037587 .017196
tau1978 | -.035352 .0122536 -2.89 0.005 -.0596034 -.0111006
tau1979 | -.0216289 .0190386 -1.14 0.258 -.0593086 .0160508
tau1980 | -.0285228 .0128797 -2.21 0.029 -.0540134 -.0030323
tau1981 | -.0359005 .0146227 -2.46 0.015 -.0648406 -.0069605
tau1982 | -.0479356 .0101047 -4.74 0.000 -.067934 -.0279371
tau1983 | -.0419962 .0109605 -3.83 0.000 -.0636883 -.0203041
tau1984 | -.0474444 .0125193 -3.79 0.000 -.0722217 -.0226671
tau1985 | -.0374335 .010758 -3.48 0.001 -.0587248 -.0161422
tau1986 | -.0709505 .0295933 -2.40 0.018 -.1295195 -.0123816
tau1987 | -.0387469 .0145524 -2.66 0.009 -.0675478 -.0099459
tau1988 | -.0337385 .0115317 -2.93 0.004 -.0565611 -.0109158
tau1989 | -.0247548 .0140562 -1.76 0.081 -.0525738 .0030641
tau1990 | -.0346246 .0179094 -1.93 0.055 -.0700695 .0008203
tau1991 | -.0870706 .0155673 -5.59 0.000 -.1178801 -.056261
tau1992 | -.0496859 .0220317 -2.26 0.026 -.0932895 -.0060823
tau1993 | -.0962551 .0178052 -5.41 0.000 -.1314937 -.0610165
tau1994 | -.0928072 .019403 -4.78 0.000 -.1312082 -.0544063
tau1995 | -.0268686 .014666 -1.83 0.069 -.0558945 .0021572
tau1996 | -.0312689 .015057 -2.08 0.040 -.0610685 -.0014694
tau1997 | -.0207448 .0118321 -1.75 0.082 -.044162 .0026723
tau1998 | -.0476344 .0137596 -3.46 0.001 -.0748665 -.0204024
tau1999 | -.0237623 .0119642 -1.99 0.049 -.047441 -.0000837
tau2000 | .0134335 .0143082 0.94 0.350 -.0148842 .0417512
tau2001 | -.0224725 .0126559 -1.78 0.078 -.0475201 .002575
tau2002 | -.0238668 .0114618 -2.08 0.039 -.046551 -.0011825
tau2003 | -.0012138 .0137195 -0.09 0.930 -.0283665 .0259388
tau2004 | .0259278 .0145309 1.78 0.077 -.0028307 .0546863
tau2005 | .0537708 .0160055 3.36 0.001 .0220939 .0854477
tau2006 | .0115015 .012173 0.94 0.347 -.0125904 .0355934
tau2007 | .0306528 .0133389 2.30 0.023 .0042534 .0570522
tau2008 | .0070869 .0140963 0.50 0.616 -.0208113 .0349852
tau2009 | -.0727106 .018981 -3.83 0.000 -.1102762 -.0351449
tau2010 | .0106175 .0123861 0.86 0.393 -.0138962 .0351312
tau2011 | .0286362 .0130254 2.20 0.030 .0028572 .0544151
tau2012 | -.0288682 .0111041 -2.60 0.010 -.0508446 -.0068918
tau2013 | -.022187 .0098042 -2.26 0.025 -.0415907 -.0027833
tau2014 | -.0459122 .0099049 -4.64 0.000 -.0655152 -.0263091
tau2015 | -.0664595 .0142261 -4.67 0.000 -.0946148 -.0383043
tau2016 | -.0653652 .0139181 -4.70 0.000 -.0929108 -.0378195
tau2017 | -.0511581 .0099961 -5.12 0.000 -.0709417 -.0313745
_cons | .0673464 .0068231 9.87 0.000 .0538427 .0808501
------------------------------------------------------------------------------
Then I run testparm, which is commonly used after estimating a regression model to assess the joint
significance of a group of coefficients or to compare nested models. It helps to understand
whether a set of variables or coefficients has a statistically significant impact on the dependent
variable.
. testparm y open SHC SHI SHG tau*
( 1) open = 0
( 2) SHC = 0
( 3) SHI = 0
( 4) SHG = 0
( 5) tau1951 = 0
( 6) tau1952 = 0
( 7) tau1953 = 0
( 8) tau1954 = 0
( 9) tau1956 = 0
(10) tau1957 = 0
(11) tau1958 = 0
(12) tau1959 = 0
(13) tau1960 = 0
(14) tau1961 = 0
(15) tau1962 = 0
(16) tau1963 = 0
(17) tau1964 = 0
(18) tau1965 = 0
(19) tau1966 = 0
(20) tau1967 = 0
(21) tau1968 = 0
(22) tau1969 = 0
(23) tau1970 = 0
(24) tau1971 = 0
(25) tau1972 = 0
(26) tau1973 = 0
(27) tau1974 = 0
(28) tau1975 = 0
(29) tau1976 = 0
(30) tau1977 = 0
(31) tau1978 = 0
(32) tau1979 = 0
(33) tau1980 = 0
(34) tau1981 = 0
(35) tau1982 = 0
(36) tau1983 = 0
(37) tau1984 = 0
(38) tau1985 = 0
(39) tau1986 = 0
(40) tau1987 = 0
(41) tau1988 = 0
(42) tau1989 = 0
(43) tau1990 = 0
(44) tau1991 = 0
(45) tau1992 = 0
(46) tau1993 = 0
(47) tau1994 = 0
(48) tau1995 = 0
(49) tau1996 = 0
(50) tau1997 = 0
(51) tau1998 = 0
(52) tau1999 = 0
(53) tau2000 = 0
(54) tau2001 = 0
(55) tau2002 = 0
(56) tau2003 = 0
(57) tau2004 = 0
(58) tau2005 = 0
(59) tau2006 = 0
(60) tau2007 = 0
(61) tau2008 = 0
(62) tau2009 = 0
(63) tau2010 = 0
(64) tau2011 = 0
(65) tau2012 = 0
(66) tau2013 = 0
(67) tau2014 = 0
(68) tau2015 = 0
(69) tau2016 = 0
(70) tau2017 = 0
F( 70, 125) = 27.20
Prob > F = 0.0000
Based on the output, since the p-value is essentially zero (Prob > F = 0.0000), there is strong
evidence to reject the null hypothesis that all the coefficients of the specified variables (open,
SHC, SHI, SHG, and the tau dummies from 1951 to 2017) are zero simultaneously. In other words, at least one of
these variables has a statistically significant effect on the dependent variable in the regression
model.
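Because the list passed to testparm contains every regressor, the result coincides with the overall F statistic in the regression header (F(70, 125) = 27.20). To assess the time dummies on their own, the test can be restricted to tau*. The following is a minimal sketch, assuming the same variable names as above and, for the factor-variable version, that the time variable is called year:

```stata
* Pooled OLS with standard errors clustered by id (same model as in the text)
regress y open SHC SHI SHG tau*, vce(cluster id)

* Joint significance of the time dummies only
testparm tau*

* Equivalent specification with factor variables instead of tau* (assumes a variable named year)
regress y open SHC SHI SHG i.year, vce(cluster id)
testparm i.year
```

A rejection in this restricted test would support keeping the time dummies in the model; the result is not reported in the original output.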
. eststo POLStd: reg y open SHC SHI SHG tau*, cluster (id)
note: tau1950 omitted because of collinearity.
note: tau1955 omitted because of collinearity.
Linear regression
Number of obs = 2,546
F(70, 125) = 27.20
Prob > F = 0.0000
R-squared = 0.3745
Root MSE = .08176
(Std. err. adjusted for 126 clusters in id)
------------------------------------------------------------------------------
| Robust
y | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0083645 .002189 -3.82 0.000 -.0126968 -.0040322
SHC | -1.040524 .0804048 -12.94 0.000 -1.199655 -.8813933
SHI | -.1580099 .0942423 -1.68 0.096 -.3445271 .0285074
SHG | -.5822732 .2955081 -1.97 0.051 -1.16712 .0025741
tau1950 | 0 (omitted)
tau1951 | -.000719 .0164227 -0.04 0.965 -.0332216 .0317836
tau1952 | -.0414502 .0126615 -3.27 0.001 -.0665089 -.0163915
tau1953 | -.0215499 .0139741 -1.54 0.126 -.0492063 .0061066
tau1954 | -.0210786 .013646 -1.54 0.125 -.0480856 .0059285
tau1955 | 0 (omitted)
tau1956 | -.0412841 .0116236 -3.55 0.001 -.0642887 -.0182796
tau1957 | -.0273057 .0077197 -3.54 0.001 -.042584 -.0120273
tau1958 | -.038976 .0095905 -4.06 0.000 -.0579568 -.0199951
tau1959 | -.0367763 .0123355 -2.98 0.003 -.0611898 -.0123629
tau1960 | -.0167422 .0135499 -1.24 0.219 -.0435592 .0100749
tau1961 | -.0385565 .0127822 -3.02 0.003 -.0638542 -.0132589
tau1962 | -.0268461 .0251163 -1.07 0.287 -.0765544 .0228622
tau1963 | -.0044766 .0156646 -0.29 0.776 -.0354788 .0265255
tau1964 | -.0012946 .0157425 -0.08 0.935 -.032451 .0298618
tau1965 | -.0157602 .0149434 -1.05 0.294 -.045335 .0138146
tau1966 | -.0373597 .0143899 -2.60 0.011 -.0658391 -.0088803
tau1967 | -.0121991 .0103084 -1.18 0.239 -.0326007 .0082025
tau1968 | -.0265933 .0095316 -2.79 0.006 -.0454575 -.0077291
tau1969 | -.0191154 .0109927 -1.74 0.085 -.0408713 .0026405
tau1970 | -.0109004 .0137323 -0.79 0.429 -.0380784 .0162776
tau1971 | -.0201685 .01061 -1.90 0.060 -.041167 .0008299
tau1972 | -.0265155 .010157 -2.61 0.010 -.0466174 -.0064135
tau1973 | -.0110248 .0131538 -0.84 0.404 -.0370577 .0150081
tau1974 | .018153 .0190256 0.95 0.342 -.019501 .0558071
tau1975 | -.020317 .0212394 -0.96 0.341 -.0623523 .0217184
tau1976 | -.0058766 .015333 -0.38 0.702 -.0362224 .0244692
tau1977 | -.0101955 .0138402 -0.74 0.463 -.037587 .017196
tau1978 | -.035352 .0122536 -2.89 0.005 -.0596034 -.0111006
tau1979 | -.0216289 .0190386 -1.14 0.258 -.0593086 .0160508
tau1980 | -.0285228 .0128797 -2.21 0.029 -.0540134 -.0030323
tau1981 | -.0359005 .0146227 -2.46 0.015 -.0648406 -.0069605
tau1982 | -.0479356 .0101047 -4.74 0.000 -.067934 -.0279371
tau1983 | -.0419962 .0109605 -3.83 0.000 -.0636883 -.0203041
tau1984 | -.0474444 .0125193 -3.79 0.000 -.0722217 -.0226671
tau1985 | -.0374335 .010758 -3.48 0.001 -.0587248 -.0161422
tau1986 | -.0709505 .0295933 -2.40 0.018 -.1295195 -.0123816
tau1987 | -.0387469 .0145524 -2.66 0.009 -.0675478 -.0099459
tau1988 | -.0337385 .0115317 -2.93 0.004 -.0565611 -.0109158
tau1989 | -.0247548 .0140562 -1.76 0.081 -.0525738 .0030641
tau1990 | -.0346246 .0179094 -1.93 0.055 -.0700695 .0008203
tau1991 | -.0870706 .0155673 -5.59 0.000 -.1178801 -.056261
tau1992 | -.0496859 .0220317 -2.26 0.026 -.0932895 -.0060823
tau1993 | -.0962551 .0178052 -5.41 0.000 -.1314937 -.0610165
tau1994 | -.0928072 .019403 -4.78 0.000 -.1312082 -.0544063
tau1995 | -.0268686 .014666 -1.83 0.069 -.0558945 .0021572
tau1996 | -.0312689 .015057 -2.08 0.040 -.0610685 -.0014694
tau1997 | -.0207448 .0118321 -1.75 0.082 -.044162 .0026723
tau1998 | -.0476344 .0137596 -3.46 0.001 -.0748665 -.0204024
tau1999 | -.0237623 .0119642 -1.99 0.049 -.047441 -.0000837
tau2000 | .0134335 .0143082 0.94 0.350 -.0148842 .0417512
tau2001 | -.0224725 .0126559 -1.78 0.078 -.0475201 .002575
tau2002 | -.0238668 .0114618 -2.08 0.039 -.046551 -.0011825
tau2003 | -.0012138 .0137195 -0.09 0.930 -.0283665 .0259388
tau2004 | .0259278 .0145309 1.78 0.077 -.0028307 .0546863
tau2005 | .0537708 .0160055 3.36 0.001 .0220939 .0854477
tau2006 | .0115015 .012173 0.94 0.347 -.0125904 .0355934
tau2007 | .0306528 .0133389 2.30 0.023 .0042534 .0570522
tau2008 | .0070869 .0140963 0.50 0.616 -.0208113 .0349852
tau2009 | -.0727106 .018981 -3.83 0.000 -.1102762 -.0351449
tau2010 | .0106175 .0123861 0.86 0.393 -.0138962 .0351312
tau2011 | .0286362 .0130254 2.20 0.030 .0028572 .0544151
tau2012 | -.0288682 .0111041 -2.60 0.010 -.0508446 -.0068918
tau2013 | -.022187 .0098042 -2.26 0.025 -.0415907 -.0027833
tau2014 | -.0459122 .0099049 -4.64 0.000 -.0655152 -.0263091
tau2015 | -.0664595 .0142261 -4.67 0.000 -.0946148 -.0383043
tau2016 | -.0653652 .0139181 -4.70 0.000 -.0929108 -.0378195
tau2017 | -.0511581 .0099961 -5.12 0.000 -.0709417 -.0313745
_cons | .0673464 .0068231 9.87 0.000 .0538427 .0808501
------------------------------------------------------------------------------
The analysis indicates that the regression model is statistically significant (F(70, 125) = 27.20, p <
0.001), suggesting that at least one independent variable is associated with "y."
The variables "SHI" and "SHG" have p-values of 0.096 and 0.051, respectively. While "SHI" is not
statistically significant at the 0.05 level, "SHG" is borderline significant, indicating that further
investigation or additional data may be needed.
Among the independent variables, "open," "SHC," and several of the time dummies (tau1952, tau1956,
tau1957, etc.) are significant predictors of "y." Changes in these variables are associated with
changes in the outcome variable "y."
Furthermore, it is also important to comment on the signs of the coefficients, which indicate the
direction of the relationship between the independent variables and "y." For example, "open,"
"SHC," and several time dummy variables have negative coefficients, implying that an increase in
these variables is associated with a decrease in "y."
. eststo REtd: xtreg y open SHC SHI SHG tau*, re cluster (id)
note: tau1950 omitted because of collinearity.
note: tau2017 omitted because of collinearity.
Random-effects GLS regression
Number of obs = 2,546
Group variable: id
Number of groups = 126
R-squared:
Within = 0.3793
Between = 0.4666
Overall = 0.3743
Obs per group:
min = 1
avg = 20.2
max = 61
Wald chi2(70) = 1911.52
corr(u_i, X) = 0 (assumed)
Prob > chi2 = 0.0000
(Std. err. adjusted for 126 clusters in id)
------------------------------------------------------------------------------
| Robust
y | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0084195 .0021785 -3.86 0.000 -.0126894 -.0041497
SHC | -1.034171 .0802075 -12.89 0.000 -1.191375 -.8769675
SHI | -.1636893 .0936838 -1.75 0.081 -.3473061 .0199275
SHG | -.5836507 .2959536 -1.97 0.049 -1.163709 -.0035923
tau1950 | 0 (omitted)
tau1951 | .0560255 .0187883 2.98 0.003 .0192011 .0928499
tau1952 | .0152107 .0148631 1.02 0.306 -.0139204 .0443418
tau1953 | .0345916 .0153262 2.26 0.024 .0045528 .0646305
tau1954 | .0354683 .0143475 2.47 0.013 .0073478 .0635888
tau1955 | .0557456 .0104187 5.35 0.000 .0353253 .076166
tau1956 | .0132962 .0128362 1.04 0.300 -.0118623 .0384547
tau1957 | .0275226 .0101333 2.72 0.007 .0076617 .0473836
tau1958 | .0157673 .0117812 1.34 0.181 -.0073235 .0388581
tau1959 | .0184966 .0129177 1.43 0.152 -.0068216 .0438148
tau1960 | .0384303 .013546 2.84 0.005 .0118807 .0649799
tau1961 | .0144742 .0145978 0.99 0.321 -.014137 .0430854
tau1962 | .0258981 .0239606 1.08 0.280 -.0210638 .07286
tau1963 | .0483927 .017439 2.77 0.006 .0142129 .0825726
tau1964 | .0538533 .0163195 3.30 0.001 .0218677 .085839
tau1965 | .0392534 .0138461 2.83 0.005 .0121155 .0663914
tau1966 | .0170801 .0130404 1.31 0.190 -.0084785 .0426388
tau1967 | .0418227 .011384 3.67 0.000 .0195105 .0641349
tau1968 | .0281787 .0113538 2.48 0.013 .0059257 .0504317
tau1969 | .0356866 .0114226 3.12 0.002 .0132986 .0580746
tau1970 | .0442843 .013252 3.34 0.001 .0183109 .0702578
tau1971 | .0326433 .0087474 3.73 0.000 .0154987 .0497878
tau1972 | .0270749 .011202 2.42 0.016 .0051195 .0490304
tau1973 | .0432757 .0120457 3.59 0.000 .0196665 .0668848
tau1974 | .0716774 .018804 3.81 0.000 .0348223 .1085325
tau1975 | .0313378 .0210188 1.49 0.136 -.0098583 .0725338
tau1976 | .0467936 .0166956 2.80 0.005 .0140708 .0795165
tau1977 | .0415577 .0147237 2.82 0.005 .0126998 .0704156
tau1978 | .0161758 .0137248 1.18 0.239 -.0107243 .0430759
tau1979 | .0307324 .0189107 1.63 0.104 -.0063319 .0677967
tau1980 | .0236995 .0121136 1.96 0.050 -.0000428 .0474417
tau1981 | .0168793 .0130877 1.29 0.197 -.0087721 .0425308
tau1982 | .0044156 .0099799 0.44 0.658 -.0151446 .0239757
tau1983 | .0109827 .0115052 0.95 0.340 -.0115671 .0335326
tau1984 | .0057933 .0117573 0.49 0.622 -.0172506 .0288373
tau1985 | .0155364 .0096597 1.61 0.108 -.0033962 .034469
tau1986 | -.0181652 .0296962 -0.61 0.541 -.0763687 .0400383
tau1987 | .0140433 .0134263 1.05 0.296 -.0122718 .0403584
tau1988 | .0184285 .0102371 1.80 0.072 -.0016359 .0384928
tau1989 | .0270544 .0124034 2.18 0.029 .0027443 .0513646
tau1990 | .0166255 .0172661 0.96 0.336 -.0172155 .0504665
tau1991 | -.0329613 .015106 -2.18 0.029 -.0625686 -.0033541
tau1992 | .0045569 .0220158 0.21 0.836 -.0385932 .047707
tau1993 | -.0425689 .016957 -2.51 0.012 -.075804 -.0093337
tau1994 | -.0396799 .0182235 -2.18 0.029 -.0753974 -.0039625
tau1995 | .0261079 .0152414 1.71 0.087 -.0037646 .0559804
tau1996 | .0211294 .0148679 1.42 0.155 -.0080111 .05027
tau1997 | .032111 .0111289 2.89 0.004 .0102988 .0539231
tau1998 | .0052477 .0150653 0.35 0.728 -.0242796 .0347751
tau1999 | .0290796 .0111932 2.60 0.009 .0071413 .051018
tau2000 | .0649342 .0151996 4.27 0.000 .0351435 .0947249
tau2001 | .0288523 .011903 2.42 0.015 .0055228 .0521817
tau2002 | .0284242 .0110618 2.57 0.010 .0067436 .0501049
tau2003 | .0503054 .0135909 3.70 0.000 .0236677 .0769431
tau2004 | .0775762 .0162448 4.78 0.000 .045737 .1094155
tau2005 | .1049535 .0164697 6.37 0.000 .0726734 .1372335
tau2006 | .0626215 .0130465 4.80 0.000 .0370508 .0881922
tau2007 | .0822668 .0150836 5.45 0.000 .0527035 .1118302
tau2008 | .0592003 .0149401 3.96 0.000 .0299182 .0884823
tau2009 | -.0209721 .0166697 -1.26 0.208 -.0536441 .0117
tau2010 | .0625999 .0125272 5.00 0.000 .038047 .0871528
tau2011 | .0813039 .0141317 5.75 0.000 .0536062 .1090016
tau2012 | .0234265 .0102669 2.28 0.023 .0033037 .0435493
tau2013 | .0298135 .0102594 2.91 0.004 .0097053 .0499216
tau2014 | .0064025 .0076529 0.84 0.403 -.008597 .021402
tau2015 | -.0153834 .0141158 -1.09 0.276 -.0430499 .0122831
tau2016 | -.0145529 .0099083 -1.47 0.142 -.0339729 .0048671
tau2017 | 0 (omitted)
_cons | .0136945 .007032 1.95 0.051 -.000088 .0274771
-------------+----------------------------------------------------------------
sigma_u | .01232141
sigma_e | .07864704
rho | .0239566 (fraction of variance due to u_i)
------------------------------------------------------------------------------
The variables "open" and "SHC" have negative coefficients with p-values less than 0.001,
indicating that an increase in these variables is associated with a significant decrease in the
outcome variable "y."
The variable "SHG" also has a negative coefficient but with a p-value of 0.049, suggesting a
borderline significant relationship with "y."
Among the time dummy variables, some ("tau1951," "tau1953," "tau1954," "tau1955," and
others) have significant coefficients. They capture shifts in the outcome variable "y" in those
periods relative to the omitted periods.
Two time dummy variables, "tau1950" and "tau2017," have been omitted because of collinearity.
This is perfect collinearity, which Stata resolves automatically by dropping the redundant dummies.
It does not make the remaining estimates unreliable, but the time-dummy coefficients must be
read relative to the omitted periods.
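The omission of a time dummy follows from a simple identity. The notation here is introduced only for illustration and does not come from the output: \tau_{s,it} equals 1 when observation (i,t) falls in year s and 0 otherwise.

```latex
\sum_{s} \tau_{s,it} = 1 \qquad \text{for every observation } (i,t)
```

Since the full set of dummies adds up to the constant regressor, at least one dummy has to be dropped when a constant is included, and each remaining coefficient measures the shift relative to the omitted year.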
. eststo FEtd: xtreg y open SHC SHI SHG tau*, fe cluster (id)
note: tau1950 omitted because of collinearity.
note: tau2017 omitted because of collinearity.
Fixed-effects (within) regression
Number of obs = 2,546
Group variable: id
Number of groups = 126
R-squared:
Within = 0.3826
Between = 0.3090
Overall = 0.3695
Obs per group:
min = 1
avg = 20.2
max = 61
F(70, 125) = 32.10
corr(u_i, Xb) = 0.0405
Prob > F = 0.0000
(Std. err. adjusted for 126 clusters in id)
------------------------------------------------------------------------------
| Robust
y | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0085809 .0021707 -3.95 0.000 -.0128769 -.0042848
SHC | -1.014701 .0762565 -13.31 0.000 -1.165622 -.8637795
SHI | -.180844 .0912464 -1.98 0.050 -.3614321 -.000256
SHG | -.5470263 .2958913 -1.85 0.067 -1.132632 .0385793
tau1950 | 0 (omitted)
tau1951 | .0687901 .0206983 3.32 0.001 .0278255 .1097546
tau1952 | .027556 .0164641 1.67 0.097 -.0050284 .0601404
tau1953 | .0456739 .0162365 2.81 0.006 .0135399 .0778079
tau1954 | .0474158 .0148489 3.19 0.002 .0180279 .0768036
tau1955 | .0663794 .0121483 5.46 0.000 .0423365 .0904224
tau1956 | .0224906 .0139408 1.61 0.109 -.0050999 .0500812
tau1957 | .0373064 .0117258 3.18 0.002 .0140996 .0605131
tau1958 | .0247987 .012694 1.95 0.053 -.0003243 .0499218
tau1959 | .0284378 .0138713 2.05 0.042 .0009847 .0558909
tau1960 | .0495219 .0148365 3.34 0.001 .0201587 .0788851
tau1961 | .021236 .0160437 1.32 0.188 -.0105164 .0529884
tau1962 | .030989 .0245679 1.26 0.210 -.0176339 .0796119
tau1963 | .0547991 .0186832 2.93 0.004 .0178228 .0917755
tau1964 | .0653351 .0163413 4.00 0.000 .0329937 .0976765
tau1965 | .0492828 .0141492 3.48 0.001 .0212798 .0772858
tau1966 | .0256304 .0146 1.76 0.082 -.0032647 .0545255
tau1967 | .0469642 .0114306 4.11 0.000 .0243417 .0695868
tau1968 | .0371789 .0123456 3.01 0.003 .0127454 .0616124
tau1969 | .0490305 .0112167 4.37 0.000 .0268313 .0712297
tau1970 | .0572495 .0140691 4.07 0.000 .029405 .085094
tau1971 | .0391435 .0092096 4.25 0.000 .0209166 .0573705
tau1972 | .0361779 .0111877 3.23 0.002 .0140362 .0583197
tau1973 | .0565321 .0125702 4.50 0.000 .0316541 .0814101
tau1974 | .080732 .0188772 4.28 0.000 .0433716 .1180923
tau1975 | .0330608 .0211649 1.56 0.121 -.0088273 .0749488
tau1976 | .052649 .016704 3.15 0.002 .0195898 .0857082
tau1977 | .0453743 .0157272 2.89 0.005 .0142481 .0765005
tau1978 | .0192746 .014128 1.36 0.175 -.0086864 .0472355
tau1979 | .0369265 .0186182 1.98 0.050 .0000789 .0737742
tau1980 | .0280868 .0122923 2.28 0.024 .0037589 .0524147
tau1981 | .0219617 .0136594 1.61 0.110 -.005072 .0489953
tau1982 | .0086489 .0105155 0.82 0.412 -.0121627 .0294605
tau1983 | .0164845 .0118781 1.39 0.168 -.0070238 .0399927
tau1984 | .0118106 .0123843 0.95 0.342 -.0126994 .0363206
tau1985 | .0217226 .0101656 2.14 0.035 .0016036 .0418416
tau1986 | -.01409 .0318778 -0.44 0.659 -.07718 .0490001
tau1987 | .0209623 .0149022 1.41 0.162 -.0085311 .0504557
tau1988 | .0235234 .0115682 2.03 0.044 .0006286 .0464182
tau1989 | .0314781 .0128342 2.45 0.016 .0060776 .0568787
tau1990 | .0191012 .0161408 1.18 0.239 -.0128436 .0510459
tau1991 | -.0085615 .015565 -0.55 0.583 -.0393665 .0222436
tau1992 | .0417059 .0162633 2.56 0.012 .0095188 .073893
tau1993 | -.0261252 .0175193 -1.49 0.138 -.060798 .0085477
tau1994 | -.0317614 .0179008 -1.77 0.078 -.0671893 .0036666
tau1995 | .0333058 .0153262 2.17 0.032 .0029734 .0636383
tau1996 | .024061 .0146627 1.64 0.103 -.0049583 .0530803
tau1997 | .036407 .0121281 3.00 0.003 .0124039 .06041
tau1998 | .0106534 .0152902 0.70 0.487 -.0196077 .0409145
tau1999 | .0346912 .0111454 3.11 0.002 .012633 .0567493
tau2000 | .0675343 .0155675 4.34 0.000 .0367242 .0983443
tau2001 | .0310835 .0120298 2.58 0.011 .007275 .0548921
tau2002 | .0328948 .0111274 2.96 0.004 .0108722 .0549174
tau2003 | .0527311 .0140502 3.75 0.000 .024924 .0805381
tau2004 | .0797188 .0167397 4.76 0.000 .0465888 .1128488
tau2005 | .1067611 .0164782 6.48 0.000 .0741486 .1393735
tau2006 | .0640681 .013405 4.78 0.000 .037538 .0905983
tau2007 | .0841787 .0153411 5.49 0.000 .0538167 .1145407
tau2008 | .0615172 .0154037 3.99 0.000 .0310313 .0920031
tau2009 | -.0215049 .0161711 -1.33 0.186 -.0535096 .0104998
tau2010 | .0653787 .0128502 5.09 0.000 .0399465 .0908109
tau2011 | .084683 .0145556 5.82 0.000 .0558756 .1134904
tau2012 | .0255449 .0107843 2.37 0.019 .0042015 .0468883
tau2013 | .0318813 .0103625 3.08 0.003 .0113726 .0523901
tau2014 | .0091828 .0078085 1.18 0.242 -.0062712 .0246369
tau2015 | -.014889 .0142195 -1.05 0.297 -.0430311 .0132532
tau2016 | -.0156564 .0098051 -1.60 0.113 -.0350619 .0037491
tau2017 | 0 (omitted)
_cons | .0082206 .007451 1.10 0.272 -.0065259 .0229671
-------------+----------------------------------------------------------------
sigma_u | .06156902
sigma_e | .07864704
rho | .37998254 (fraction of variance due to u_i)
------------------------------------------------------------------------------
The results indicate that "open," "SHC," and "SHI" have statistically significant negative
associations with "y," suggesting that increases in these variables are related to a decrease in the
outcome ("SHI" sits exactly at the 5% threshold, with p = 0.050). On the other hand, "SHG" shows
a marginally significant negative relationship with "y" (p = 0.067, significant at the 10% level
but not at the 5% level).
The time dummy variables ("tau1951" to "tau2016") show how "y" shifts over
different time periods compared to the omitted periods. These coefficients capture
period-specific effects that are common to all individuals.
It's important to note that the model fits the data reasonably well, as indicated by the within
R-squared value of 0.3826. This value suggests that approximately 38.26% of the within-individual
variation in "y" is accounted for by the included variables after controlling for individual-specific effects.
The use of cluster-robust standard errors, with 126 clusters, addresses potential
correlation within each cluster and strengthens the validity of the inferential conclusions.
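To make the meaning of the within R-squared concrete, the fixed-effects estimator can be written as OLS on individual-demeaned data. The symbols below are introduced here for illustration and are not taken from the output: x_it is the vector of regressors, \lambda_t the time effect, \alpha_i the individual effect, and bars denote means over the periods observed for individual i.

```latex
y_{it} = x_{it}'\beta + \lambda_t + \alpha_i + \varepsilon_{it}
\quad\Longrightarrow\quad
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\lambda_t - \bar{\lambda}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)
```

The individual effect \alpha_i drops out of the demeaned equation, and the within R-squared of 0.3826 refers to the fit of this transformed regression.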
eststo FDtd: xtreg d.y d.open d.SHC d.SHI d.SHG d.tau*, fe cluster (id)
note: D.tau1950 omitted because of collinearity.
note: D.tau2016 omitted because of collinearity.
note: D.tau2017 omitted because of collinearity.
Fixed-effects (within) regression
Number of obs = 2,217
Group variable: id
Number of groups = 112
R-squared:
Within = 0.4156
Between = 0.6781
Overall = 0.4181
Obs per group:
min = 1
avg = 19.8
max = 59
F(69, 111) = 57.50
corr(u_i, Xb) = 0.0223
Prob > F = 0.0000
(Std. err. adjusted for 112 clusters in id)
------------------------------------------------------------------------------
| Robust
D.y | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0050665 .0018084 -2.80 0.006 -.0086499 -.0014831
SHC | -.9235197 .076643 -12.05 0.000 -1.075393 -.7716464
SHI | -.2178646 .0945435 -2.30 0.023 -.4052089 -.0305203
SHG | -.53763 .2875113 -1.87 0.064 -1.107353 .0320927
tau1950 | 0 (omitted)
tau1951 | .9656789 .6530335 1.48 0.142 -.3283505 2.259708
tau1952 | .9063255 .6454659 1.40 0.163 -.3727082 2.185359
tau1953 | .8983542 .6335281 1.42 0.159 -.3570241 2.153732
tau1954 | .8911149 .6242357 1.43 0.156 -.3458498 2.12808
tau1955 | .8933833 .6139076 1.46 0.148 -.3231157 2.109882
tau1956 | .8393476 .6045917 1.39 0.168 -.3586913 2.037386
tau1957 | .8401535 .5945928 1.41 0.160 -.3380718 2.018379
tau1958 | .8169015 .5846535 1.40 0.165 -.3416286 1.975431
tau1959 | .8090339 .5745205 1.41 0.162 -.3294167 1.947485
tau1960 | .8254694 .5644019 1.46 0.146 -.2929307 1.94387
tau1961 | .7990269 .5542906 1.44 0.152 -.2993369 1.897391
tau1962 | .7887823 .5456737 1.45 0.151 -.2925065 1.870071
tau1963 | .7986881 .5342759 1.49 0.138 -.2600153 1.857392
tau1964 | .7967504 .5252775 1.52 0.132 -.2441221 1.837623
tau1965 | .7685305 .5145383 1.49 0.138 -.2510615 1.788123
tau1966 | .7233136 .5071024 1.43 0.157 -.2815436 1.728171
tau1967 | .7337298 .4957871 1.48 0.142 -.2487054 1.716165
tau1968 | .7067708 .4875302 1.45 0.150 -.2593029 1.672844
tau1969 | .7018293 .4760629 1.47 0.143 -.2415211 1.64518
tau1970 | .6890599 .4670384 1.48 0.143 -.2364079 1.614528
tau1971 | .6559682 .4552804 1.44 0.152 -.2462004 1.558137
tau1972 | .6358276 .4462014 1.42 0.157 -.2483504 1.520006
tau1973 | .6379823 .4362096 1.46 0.146 -.2263961 1.502361
tau1974 | .6470415 .4266257 1.52 0.132 -.1983458 1.492429
tau1975 | .5783536 .4167581 1.39 0.168 -.2474804 1.404188
tau1976 | .5928201 .4040562 1.47 0.145 -.2078443 1.393485
tau1977 | .5649789 .3968766 1.42 0.157 -.2214585 1.351416
tau1978 | .528185 .3840578 1.38 0.172 -.232851 1.289221
tau1979 | .5418809 .3762218 1.44 0.153 -.2036278 1.28739
tau1980 | .509366 .3680448 1.38 0.169 -.2199392 1.238671
tau1981 | .4897621 .3584465 1.37 0.175 -.2205236 1.200048
tau1982 | .4566131 .3480409 1.31 0.192 -.2330531 1.146279
tau1983 | .4502948 .3363799 1.34 0.183 -.2162644 1.116854
tau1984 | .4483086 .3267229 1.37 0.173 -.1991146 1.095732
tau1985 | .440293 .316137 1.39 0.166 -.1861536 1.06674
tau1986 | .3873707 .2962333 1.31 0.194 -.1996352 .9743767
tau1987 | .4109031 .297625 1.38 0.170 -.1788608 1.000667
tau1988 | .4019093 .2868356 1.40 0.164 -.1664746 .9702932
tau1989 | .3974398 .2766449 1.44 0.154 -.1507506 .9456302
tau1990 | .3772265 .2679481 1.41 0.162 -.1537305 .9081835
tau1991 | .3309773 .2581287 1.28 0.202 -.180522 .8424765
tau1992 | .3714833 .2514711 1.48 0.142 -.1268234 .86979
tau1993 | .2857568 .2361398 1.21 0.229 -.1821701 .7536836
tau1994 | .2979967 .2267075 1.31 0.191 -.1512393 .7472327
tau1995 | .3394041 .2140888 1.59 0.116 -.0848271 .7636354
tau1996 | .3223874 .2062684 1.56 0.121 -.0863472 .7311219
tau1997 | .3235624 .1982579 1.63 0.106 -.0692988 .7164237
tau1998 | .2760696 .1846245 1.50 0.138 -.0897762 .6419154
tau1999 | .2865189 .1818889 1.58 0.118 -.0739061 .6469439
tau2000 | .309061 .1734379 1.78 0.077 -.0346178 .6527397
tau2001 | .2538412 .1626402 1.56 0.121 -.0684413 .5761236
tau2002 | .238623 .1524978 1.56 0.120 -.0635615 .5408076
tau2003 | .2524616 .1435842 1.76 0.081 -.03206 .5369832
tau2004 | .2744041 .1326543 2.07 0.041 .0115408 .5372674
tau2005 | .2828725 .1273485 2.22 0.028 .030523 .535222
tau2006 | .2256855 .1130003 2.00 0.048 .001768 .4496031
tau2007 | .2251986 .1047772 2.15 0.034 .0175756 .4328216
tau2008 | .1908656 .094237 2.03 0.045 .0041287 .3776025
tau2009 | .0806697 .0775979 1.04 0.301 -.0730957 .2344352
tau2010 | .1648394 .0713502 2.31 0.023 .0234542 .3062245
tau2011 | .1709737 .0630823 2.71 0.008 .0459718 .2959755
tau2012 | .0937018 .050785 1.85 0.068 -.0069321 .1943358
tau2013 | .0881952 .0373667 2.36 0.020 .0141505 .1622398
tau2014 | .0509645 .0287244 1.77 0.079 -.0059547 .1078838
tau2015 | .0102043 .0161083 0.63 0.528 -.0217154 .042124
tau2016 | 0 (omitted)
tau2017 | 0 (omitted)
|
_cons | .0138134 .0099002 1.40 0.166 -.0058045 .0334313
-------------+----------------------------------------------------------------
sigma_u | .03867263
sigma_e | .10280893
rho | .12395704 (fraction of variance due to u_i)
------------------------------------------------------------------------------
The output above reports the first-difference (FD) estimates with cluster-robust standard errors.
Because the differenced model is estimated with xtreg, fe, Stata labels it a fixed-effects (within) regression.
Three differenced time dummies, "D.tau1950," "D.tau2016," and "D.tau2017," were omitted because of
collinearity. The remaining time-dummy coefficients should therefore be read relative to the
omitted periods.
The coefficient of "open" is negative and statistically significant (p = 0.006), suggesting
that an increase in the variable "open" is associated with a decrease in the dependent variable "y."
Similarly, "SHC" (p < 0.001) and "SHI" (p = 0.023) exhibit negative and statistically significant
coefficients, implying that higher values of these variables are associated with lower values of "y."
However, it's worth noting that the coefficient of "SHG" is negative but marginally insignificant (p =
0.064), with a relatively large standard error. Hence, the relationship between "SHG" and "y" may
require further investigation and cautious interpretation.
Regarding the time dummy variables, most are not individually significant at the 5% level in this
specification. The exceptions are "tau2004" to "tau2008," "tau2010," "tau2011," and "tau2013,"
which have positive and significant coefficients.
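The FD command used here applies the within (fe) estimator to data that are already differenced, so an individual-specific mean is removed from the differences on top of the differencing itself. The textbook first-difference estimator is instead pooled OLS on the differenced variables. A minimal sketch of that alternative, assuming the panel has been declared with xtset id year so that the d. operator is available; the stored name FDols is hypothetical, and the estimates need not match those reported above:

```stata
* Assumes: xtset id year has already been run for this dataset
* First-difference estimator as pooled OLS on the differenced variables,
* with standard errors clustered by id
eststo FDols: regress d.y d.open d.SHC d.SHI d.SHG d.tau*, vce(cluster id)
```

Comparing FDols with the FDtd results would show how much the extra within transformation matters in this sample.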
QUESTION 3:
Test the consistency of RE for the model under QUESTION 2. How can you obtain BE estimates
with cluster standard errors?
The models I just analyzed use cluster-robust standard errors. I now estimate the FE and RE models
with time dummies but with conventional (non-robust) standard errors, because the hausman command
cannot be used after estimation with robust or cluster standard errors.
eststo REtd2: xtreg y open SHC SHI SHG i.year, re
eststo FEtd2: xtreg y open SHC SHI SHG i.year, fe
hausman FEtd2 REtd2
---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| FEtd2 REtd2 Difference Std. err.
-------------+----------------------------------------------------------------
open | -.0085809 -.0084195 -.0001614 .
SHC | -1.014701 -1.034171 .0194707 .
SHI | -.180844 -.1636893 -.0171547 .
SHG | -.5470263 -.5836507 .0366245 .
year |
1952 | -.041234 -.0408148 -.0004192 .
1953 | -.0231162 -.0214339 -.0016823 .
1954 | -.0213743 -.0205572 -.0008171 .
1955 | -.0024106 -.0002799 -.0021307 .
1956 | -.0462995 -.0427293 -.0035701 .
1957 | -.0314837 -.0285029 -.0029808 .
1958 | -.0439913 -.0402582 -.0037331 .
1959 | -.0403523 -.0375289 -.0028234 .
1960 | -.0192681 -.0175952 -.0016729 .
1961 | -.0475541 -.0415513 -.0060027 .
1962 | -.0378011 -.0301274 -.0076737 .
1963 | -.0139909 -.0076328 -.0063581 .
1964 | -.003455 -.0021722 -.0012828 .
1965 | -.0195073 -.0167721 -.0027352 .
1966 | -.0431596 -.0389454 -.0042143 .
1967 | -.0218258 -.0142028 -.007623 .0021414
1968 | -.0316112 -.0278468 -.0037644 .0011909
1969 | -.0197595 -.0203389 .0005794 .0024827
1970 | -.0115406 -.0117412 .0002006 .0009528
1971 | -.0296465 -.0233822 -.0062643 .0017889
1972 | -.0326121 -.0289506 -.0036615 .
1973 | -.012258 -.0127498 .0004919 .
1974 | .0119419 .0156518 -.0037099 .
1975 | -.0357293 -.0246878 -.0110415 .
1976 | -.0161411 -.0092319 -.0069092 .0016156
1977 | -.0234158 -.0144679 -.0089479 .0020483
1978 | -.0495155 -.0398497 -.0096658 .0016204
1979 | -.0318635 -.0252931 -.0065704 .
1980 | -.0407033 -.0323261 -.0083772 .
1981 | -.0468284 -.0391462 -.0076822 .
1982 | -.0601412 -.05161 -.0085312 .
1983 | -.0523056 -.0450428 -.0072628 .0006982
1984 | -.0569795 -.0502322 -.0067473 .0016602
1985 | -.0470675 -.0404891 -.0065783 .0018816
1986 | -.08288 -.0741907 -.0086893 .0026833
1987 | -.0478277 -.0419822 -.0058455 .0018793
1988 | -.0452666 -.0375971 -.0076696 .001933
1989 | -.0373119 -.0289711 -.0083408 .0017493
1990 | -.0496889 -.0394 -.0102889 .0016295
1991 | -.0773516 -.0889869 .0116353 .0031451
1992 | -.0270842 -.0514687 .0243845 .0046149
1993 | -.0949152 -.0985944 .0036791 .0021452
1994 | -.1005514 -.0957055 -.004846 .0019846
1995 | -.0354842 -.0299177 -.0055666 .0028839
1996 | -.0447291 -.0348961 -.009833 .0025791
1997 | -.0323831 -.0239146 -.0084685 .0019385
1998 | -.0581367 -.0507778 -.0073589 .0019192
1999 | -.0340989 -.0269459 -.007153 .0012805
2000 | -.0012558 .0089087 -.0101645 .0017237
2001 | -.0377065 -.0271733 -.0105333 .0019115
2002 | -.0358953 -.0276013 -.008294 .0015115
2003 | -.016059 -.0057201 -.0103389 .002135
2004 | .0109287 .0215507 -.010622 .0023556
2005 | .037971 .0489279 -.0109569 .0018474
2006 | -.0047219 .006596 -.0113179 .0018726
2007 | .0153886 .0262413 -.0108527 .0010517
2008 | -.0072729 .0031747 -.0104476 .0016673
2009 | -.090295 -.0769976 -.0132974 .0018306
2010 | -.0034114 .0065744 -.0099858 .
2011 | .015893 .0252784 -.0093854 .0014137
2012 | -.0432452 -.032599 -.0106461 .0015337
2013 | -.0369087 -.0262121 -.0106966 .0010709
2014 | -.0596072 -.049623 -.0099842 .0006237
2015 | -.083679 -.0714089 -.0122701 .00145
2016 | -.0844464 -.0705784 -.013868 .0021258
2017 | -.0687901 -.0560255 -.0127645 .0018561
------------------------------------------------------------------------------
b = Consistent under H0 and Ha; obtained from xtreg.
B = Inconsistent under Ha, efficient under H0; obtained from xtreg.
Test of H0: Difference in coefficients not systematic
chi2(70) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 102.06
Prob > chi2 = 0.0075
(V_b-V_B is not positive definite)
The Hausman statistic is asymptotically chi-square distributed, with degrees of freedom equal to the number of coefficients being compared: here 70, i.e. the 4 regressors plus 66 year dummies. The statistic is 102.06 with a p-value of 0.0075. Since the p-value is below the usual 5% significance level, we reject the null hypothesis (H0) that the differences between the FE and RE coefficients are not systematic. This suggests that the individual effects are correlated with the regressors, so the random effects estimator (REtd2) is inconsistent, while the fixed effects estimator (FEtd2) remains consistent and is the preferred model for the relationship between the dependent variable and the regressors. Note, however, that Stata warns that V_b-V_B is not positive definite, which is consistent with the missing standard errors in the last column, so the result should be read with some caution.
eststo BEtd: areg y open SHC SHI SHG i.year, absorb(id) cluster(year)
Linear regression, absorbing indicators            Number of obs     =  2,546
Absorbed variable: id                              No. of categories =    126
                                                   F(66, 66)         =      .
                                                   Prob > F          =      .
                                                   R-squared         = 0.4504
                                                   Adj R-squared     = 0.4048
                                                   Root MSE          = 0.0786
(Std. err. adjusted for 67 clusters in year)
------------------------------------------------------------------------------
|             Robust
y | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
open | -.0085809 .0026377 -3.25 0.002 -.0138471 -.0033146
SHC | -1.014701 .1047833 -9.68 0.000 -1.223907 -.805494
SHI | -.180844 .0871555 -2.07 0.042 -.3548556 -.0068325
SHG | -.5470263 .2190114 -2.50 0.015 -.9842967 -.1097558
|
year |
1952 | -.041234 .0024333 -16.95 0.000 -.0460922 -.0363759
1953 | -.0231162 .0040641 -5.69 0.000 -.0312305 -.0150018
1954 | -.0213743 .0034647 -6.17 0.000 -.0282917 -.0144568
1955 | -.0024106 .0039303 -0.61 0.542 -.0102577 .0054364
1956 | -.0462995 .0041188 -11.24 0.000 -.0545228 -.0380761
1957 | -.0314837 .0038272 -8.23 0.000 -.039125 -.0238424
1958 | -.0439913 .003796 -11.59 0.000 -.0515704 -.0364123
1959 | -.0403523 .0045571 -8.85 0.000 -.0494509 -.0312537
1960 | -.0192681 .0049356 -3.90 0.000 -.0291223 -.009414
1961 | -.0475541 .0046589 -10.21 0.000 -.0568558 -.0382523
1962 | -.0378011 .0044497 -8.50 0.000 -.0466852 -.028917
1963 | -.0139909 .0037203 -3.76 0.000 -.0214188 -.006563
1964 | -.003455 .004924 -0.70 0.485 -.0132861 .0063762
1965 | -.0195073 .0042494 -4.59 0.000 -.0279916 -.011023
1966 | -.0431596 .0044642 -9.67 0.000 -.0520728 -.0342465
1967 | -.0218258 .004397 -4.96 0.000 -.0306046 -.013047
1968 | -.0316112 .0041868 -7.55 0.000 -.0399705 -.0232519
1969 | -.0197595 .0043467 -4.55 0.000 -.0284381 -.011081
1970 | -.0115406 .0060019 -1.92 0.059 -.0235238 .0004427
1971 | -.0296465 .0044941 -6.60 0.000 -.0386192 -.0206739
1972 | -.0326121 .0044848 -7.27 0.000 -.0415663 -.0236579
1973 | -.012258 .0044918 -2.73 0.008 -.0212262 -.0032897
1974 | .0119419 .0048187 2.48 0.016 .002321 .0215627
1975 | -.0357293 .0056636 -6.31 0.000 -.0470371 -.0244215
1976 | -.0161411 .0047221 -3.42 0.001 -.0255691 -.0067131
1977 | -.0234158 .0046482 -5.04 0.000 -.0326961 -.0141354
1978 | -.0495155 .0045812 -10.81 0.000 -.0586621 -.0403689
1979 | -.0318635 .0058058 -5.49 0.000 -.0434552 -.0202719
1980 | -.0407033 .0047946 -8.49 0.000 -.0502759 -.0311306
1981 | -.0468284 .0048093 -9.74 0.000 -.0564305 -.0372263
1982 | -.0601412 .0045066 -13.35 0.000 -.0691389 -.0511435
1983 | -.0523056 .0051379 -10.18 0.000 -.0625638 -.0420474
1984 | -.0569795 .0046466 -12.26 0.000 -.0662566 -.0477023
1985 | -.0470675 .0043506 -10.82 0.000 -.0557537 -.0383812
1986 | -.08288 .0054101 -15.32 0.000 -.0936816 -.0720785
1987 | -.0478277 .0050398 -9.49 0.000 -.05789 -.0377655
1988 | -.0452666 .0042084 -10.76 0.000 -.053669 -.0368643
1989 | -.0373119 .0050286 -7.42 0.000 -.0473518 -.0272721
1990 | -.0496889 .0051525 -9.64 0.000 -.0599762 -.0394016
1991 | -.0773516 .0085558 -9.04 0.000 -.0944337 -.0602694
1992 | -.0270842 .0089328 -3.03 0.003 -.0449191 -.0092492
1993 | -.0949152 .0083768 -11.33 0.000 -.1116401 -.0781904
1994 | -.1005514 .0048761 -20.62 0.000 -.1102868 -.090816
1995 | -.0354842 .0059696 -5.94 0.000 -.0474028 -.0235656
1996 | -.0447291 .006611 -6.77 0.000 -.0579283 -.0315298
1997 | -.0323831 .0045959 -7.05 0.000 -.041559 -.0232072
1998 | -.0581367 .005783 -10.05 0.000 -.0696828 -.0465906
1999 | -.0340989 .0052231 -6.53 0.000 -.0445272 -.0236706
2000 | -.0012558 .005745 -0.22 0.828 -.0127261 .0102145
2001 | -.0377065 .0050019 -7.54 0.000 -.0476931 -.0277199
2002 | -.0358953 .0043558 -8.24 0.000 -.0445918 -.0271987
2003 | -.016059 .0056768 -2.83 0.006 -.027393 -.004725
2004 | .0109287 .0044617 2.45 0.017 .0020206 .0198369
2005 | .037971 .0052689 7.21 0.000 .0274514 .0484907
2006 | -.0047219 .0052111 -0.91 0.368 -.0151263 .0056824
2007 | .0153886 .0053978 2.85 0.006 .0046116 .0261657
2008 | -.0072729 .0055794 -1.30 0.197 -.0184125 .0038668
2009 | -.090295 .0067449 -13.39 0.000 -.1037616 -.0768284
2010 | -.0034114 .0048027 -0.71 0.480 -.0130004 .0061776
2011 | .015893 .0049501 3.21 0.002 .0060098 .0257761
2012 | -.0432452 .0045573 -9.49 0.000 -.0523441 -.0341463
2013 | -.0369087 .0046709 -7.90 0.000 -.0462344 -.0275831
2014 | -.0596072 .005059 -11.78 0.000 -.0697078 -.0495067
2015 | -.083679 .0066562 -12.57 0.000 -.0969686 -.0703895
2016 | -.0844464 .0054244 -15.57 0.000 -.0952766 -.0736163
2017 | -.0687901 .0046015 -14.95 0.000 -.0779772 -.0596029
|
_cons | .0770107 .0038696 19.90 0.000 .0692848 .0847366
------------------------------------------------------------------------------
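Note that absorbing id yields the within (FE) estimator: the coefficients above coincide with the FEtd2 column of the Hausman table, so this regression is not a between estimate. As far as I know, xtreg with the be option accepts only conventional, bootstrap or jackknife variance estimates, so a cluster-robust BE has to be built by hand. A minimal sketch, assuming the variable names used above and leaving out the time dummies for simplicity (the name BEcl is only illustrative):

```stata
* Sketch: between estimator as OLS on the group means, with robust SEs.
* After collapse there is one observation per id, so clustering on id
* is the same as using heteroskedasticity-robust standard errors.
preserve
collapse (mean) y open SHC SHI SHG, by(id)
eststo BEcl: regress y open SHC SHI SHG, vce(cluster id)
restore

* Alternative: keep xtreg, be and bootstrap the standard errors
* (the number of replications and the seed are arbitrary choices).
xtreg y open SHC SHI SHG, be vce(bootstrap, reps(200) seed(12345))
```

If some variables have missing values, collapse averages each variable over its own non-missing observations, so the point estimates can differ slightly from those of xtreg, be.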
QUESTION 4:
Suppose that the Great Recession has generated a change in the parameters. How can you test
this hypothesis by exploiting your preferred estimation method?
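To test whether the Great Recession has generated a change in the parameters of the model, I can create a dummy variable, "recession_dummy", that takes the value of 1 during the Great Recession and 0 in all other years. Let's assume that the Great Recession covers the period 2007-2013.
Entered on its own, the dummy would only shift the intercept, and it is collinear with the year dummies already in the model. To test for a change in the slope parameters, I interact the dummy with each regressor (open, SHC, SHI, SHG), re-estimate the model with my preferred method (FE, given the Hausman result) and test the joint significance of the interaction terms. Rejecting the null hypothesis that all the interaction coefficients are zero indicates that the parameters changed during the recession.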
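The test can be sketched as follows. This is only an illustration under stated assumptions: the panel is already xtset on id and year, the recession years are 2007-2013, the suffix _rec is a made-up naming choice, and clustering the standard errors on id is my assumption rather than something required by the question.

```stata
* Recession dummy: 1 in 2007-2013, 0 otherwise
gen byte recession_dummy = inrange(year, 2007, 2013)

* Interaction of the dummy with each regressor
foreach v of varlist open SHC SHI SHG {
    gen double `v'_rec = `v' * recession_dummy
}

* FE with year dummies; the dummy itself is left out because it is
* collinear with the year dummies
xtreg y open SHC SHI SHG open_rec SHC_rec SHI_rec SHG_rec i.year, ///
    fe vce(cluster id)

* H0: the slopes did not change during the recession
test open_rec SHC_rec SHI_rec SHG_rec
```

A small p-value for the joint test points to a change in the parameters; the individual interaction coefficients then show which slopes moved and in which direction.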
EXAM 3
Questions 1. Which type of data do you have? (Hint: consider infants clustered within
mothers). Describe the data.
1. The data have a panel-like (clustered) structure: the units are mothers (momid) and the repeated observations are their infants, so several birth weights are recorded for each mother. The index that plays the role of the time variable presumably orders the births of the same mother, so I can follow changes across successive births for the same mothers.
The first command I implement is describe.
The output tells me that the dataset contains 8,604 observations and a total of 24 variables. It also reports that the dataset is sorted by "momid", which suggests that the data are organized by the mothers' identifiers.
Then I implement the summarize command in order to have a summary of key
statistics for the variables in the dataset. This output offers valuable insights into
the distribution and characteristics of the dataset's variables. Notably,
'birwt' (child's birth weight) appears to have a mean of approximately 3469.93
grams and a standard deviation of approximately 527.14 grams, with values
ranging from 284 to 5642 grams. Other variables, such as 'mage' (mother's age)
and 'cigs' (number of cigarettes smoked per day), also exhibit variation.
To understand how smoking during pregnancy might affect birth weight, I use tabstat.
The 'tabstat' output provides a clear summary of the variable 'cigs' (number of
cigarettes smoked per day) categorized by smoking behavior during pregnancy
('smoke'). Notably, among nonsmokers, the mean number of cigarettes smoked
per day is zero, as expected. In contrast, among smokers, the mean number of
cigarettes per day is approximately 16.22. These statistics highlight a substantial
difference in smoking behavior between the two groups. The total mean for 'cigs'
across both groups is approximately 2.27, indicating an overall average number
of cigarettes per day in the dataset.
After recoding the momid variable into a progressive (consecutive) identifier, we can see the summary statistics for the variables momidNP and momid.
It’s now time to inform Stata that I’m dealing with panel data, so I implement
xtset.
xtset momid idx
Panel variable: momid (unbalanced)
Time variable: idx, 1 to 3
Delta: 1 unit
The panel is defined by the 'momid' variable (representing mothers), which is
marked as unbalanced, indicating that not all mothers have the same number of
observations. The 'idx' variable represents the time periods, ranging from 1 to 3.
The time intervals between observations are uniform, with a delta of 1 unit,
suggesting that each unit increment in 'idx' corresponds to a consistent time
interval.
The 'xtdes' output provides important information about the panel structure of the
dataset. It indicates that there are 3,978 unique mothers ('momid') in the dataset,
each observed for 2 or 3 time periods ('idx'), for a total of 8,604 observations. The
'Delta(idx)' of 1 unit suggests that each unit increment in 'idx' corresponds to a
consistent time interval, and the 'Span(idx)' of 3 periods shows the range of time
periods available.
The distribution of 'T_i' (the number of time periods each mother is observed)
indicates that most mothers (83.71%) are observed for 2 time periods, while the
remaining 16.29% are observed for 3 time periods. The 'Pattern' section shows
that mothers have a pattern of either '11.' (the mother is observed in the first
two time periods only) or '111' (the mother is observed in all three time periods).
The 'summ' (summary statistics) output provides a comprehensive overview of
the variable 'birwt,' which represents the child's birth weight in grams.
The summary reveals that birth weights range from a minimum of 284 grams to a
maximum of 5,642 grams, with a mean birth weight of approximately 3,469.93
grams. The data's spread is captured by the standard deviation of approximately
527.14 grams, indicating variability around the mean.
Percentiles are also provided, offering a more detailed view of the distribution.
For instance, the 25th percentile shows that 25% of the births had a weight of
3,147 grams or less, while the 75th percentile indicates that 75% of the births
weighed 3,799 grams or less.
Further statistical measures such as skewness and kurtosis provide insights into
the shape of the distribution. In this case, the negative skewness value (-0.332)
suggests a slightly left-skewed distribution, while the kurtosis value (4.632),
which is above the value of 3 of a normal distribution, indicates the presence of
fat tails in the distribution.
varanaeasy birwt momid idx
___ variable birwt ___
Statistics
  NT      8604
  Nmin    648
  Navg    2868
  Nmax    3978
  Tmin    2
  Tavg    2.1628959
  Tmax    3
Note: differences among statistics of individuals or/and time-periods --> the panel is unbalanced

Test of the significance of individual effects
  Fnum_i    Fden_i    F_i          Fpval_i
  3977      4624      3.0842791    0.00

Test of the significance of time effects
  Fnum_t    Fden_t    F_t          Fpval_t
  2         4624      25.669358    0.00

Statistics: mean and variability (standard deviations)
  Total mean (x..)                                 3469.9311
  Total sd (xit-x..)                               527.13941
  Between sd inter_momid (xi.-x..)                 448.65
  Between sd inter_idx (x.t-x..)                   35.521972
  Within sd intra_momid_idx (xit-xi.-x.t+x..)      375.47337
  Within sd intra_momid (xit-xi.)                  377.47037
  Within sd intra_idx (xit-x.t)                    526.40201
________________________________________________________________________________
Percentages of overall sum of squared dev. due to individuals, time, and residuals
Two-ways individuals & temporal
  % between inter_momid (xi.-x..)/(xit-x..)                             72.427861
  % between inter_idx (x.t-x..)/(xit-x..)                               .30276271
  % within intra_momid_idx (xit-xi.-x.t+x..)/(xit-x..)                  27.269376
- Focus on One-way individuals: intra_momid+inter_momid
  % within intra_momid (xit-xi.)/(xit-x..)                              27.572139
  of which explained by between inter_idx (%) (x.t-x..)/(xit-xi.)       1.0980748
- Focus on One-way temporal: intra_idx+inter_idx
  % within intra_idx (xit-x.t)/(xit-x..)                                99.697237
  of which explained by between inter_momid (%) (xi.-x..)/(xit-x.t)     72.647812
With varanaeasy we obtain insights into the variability and significance of individual and time
effects for the variable "birwt".
In the two-way decomposition, 72.43% of the variation in birth weight is attributed to differences
between individuals (mothers), a much smaller share (0.30%) to time effects, and the remaining
27.27% to the residual within-individual and within-time variation. In the one-way decomposition
by individual, 27.57% of the variation in birth weight is within-individual, and approximately 1.10%
of this within-individual variation is explained by the time effects. In the one-way decomposition by
time, within-time variation accounts for 99.70% of the total variation, and approximately 72.65% of
it is explained by between-individual differences. Both the individual and the time effects are
statistically significant (F = 3.08 and F = 25.67, with p-values of 0.00).
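The same between/within split can be checked with Stata's built-in xtsum, which does not require any user-written command. A minimal sketch, assuming the panel is declared with momid and idx as above:

```stata
* Overall, between and within standard deviations for the outcome
* and the regressors used in the following questions
xtset momid idx
xtsum birwt smoke mage black
```

A variable whose within standard deviation is zero does not vary across the births of the same mother, so its coefficient cannot be identified by the FE estimator.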
2. We would like to estimate birwt as a function of smoke, mage and black. Use
POLS, FE, RE, BE, CRE (for CRE use both HA1 and HA3 at p. 54 of the
theoretical note on static PDM). Comment on the results and the estimated
parameters. Can you consistently estimate the model by the RE?
eststo POLSa: reg birwt smoke mage black
      Source |       SS           df       MS      Number of obs   =     8,604
-------------+----------------------------------   F(3, 8600)      =    235.00
       Model |   181123614         3  60374538.1   Prob > F        =    0.0000
    Residual |  2.2094e+09     8,600   256912.01   R-squared       =    0.0758
-------------+----------------------------------   Adj R-squared   =    0.0754
       Total |  2.3906e+09     8,603  277875.962   Root MSE        =    506.86

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -296.0173   15.99786   -18.50   0.000     -327.377   -264.6577
        mage |   9.211306   1.021905     9.01   0.000     7.208126    11.21448
       black |  -258.8591   21.42488   -12.08   0.000     -300.857   -216.8612
       _cons |   3266.549   30.42223   107.37   0.000     3206.914    3326.184
------------------------------------------------------------------------------
The POLS model shows that smoking during pregnancy is associated with a significant decrease in
birthweight, with an estimated coefficient of -296.0173 (p < 0.001): since smoke is a dummy
variable, infants of mothers who smoked weigh on average about 296 grams less than infants of
non-smoking mothers. Mother's age also has a significant effect on birthweight, with a positive
coefficient of 9.2113 (p < 0.001), suggesting that each additional year of maternal age is associated
with an increase in birthweight of approximately 9.21 grams. Black, instead, is associated with a
significant decrease in birthweight, as indicated by the coefficient of -258.8591 (p < 0.001),
suggesting that Black mothers tend to have infants with lower birthweights compared to other racial groups.
. eststo FEa: xtreg birwt smoke mage black, fe
note: black omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0149                                         min =          2
     Between = 0.0441                                         avg =        2.2
     Overall = 0.0378                                         max =          3

                                                F(2, 4624)        =      34.95
corr(u_i, Xb) = -0.0808                         Prob > F          =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -105.6992   29.53343    -3.58   0.000    -163.5989   -47.79961
        mage |   23.11536   3.050865     7.58   0.000     17.13421    29.09652
       black |          0  (omitted)
       _cons |   2823.812   87.39633    32.31   0.000     2652.473     2995.15
-------------+----------------------------------------------------------------
     sigma_u |  442.92373
     sigma_e |  374.73056
         rho |  .58282488   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3977, 4624) = 2.86                  Prob > F = 0.0000
In the fixed effects model, smoking (smoke) remains significantly associated with a
decrease in birthweight, with an estimated coefficient of approximately -105.70 (p < 0.001) after
accounting for the mothers' fixed effects; the effect is much smaller in absolute value than in POLS.
Mother's age (mage) continues to show a significant positive association with birthweight, with an
estimated coefficient of approximately 23.12 (p < 0.001) after controlling for the fixed effects. The
variable black is omitted due to collinearity: it does not vary across the births of the same mother,
so it is absorbed by the fixed effects.
eststo REa: xtreg birwt smoke mage black, re
Random-effects GLS regression                   Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0081                                         min =          2
     Between = 0.0973                                         avg =        2.2
     Overall = 0.0749                                         max =          3

                                                Wald chi2(3)      =     460.38
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -249.0647   17.41837   -14.30   0.000    -283.2041   -214.9254
        mage |   10.70324   1.198604     8.93   0.000     8.354016    13.05246
       black |  -255.0322   26.65648    -9.57   0.000    -307.2779   -202.7864
       _cons |   3215.825   35.63581    90.24   0.000      3145.98     3285.67
-------------+----------------------------------------------------------------
     sigma_u |  341.66312
     sigma_e |  374.73056
         rho |  .45393994   (fraction of variance due to u_i)
------------------------------------------------------------------------------
The random effects model shows that smoking (smoke) continues to show a significant
negative association with birthweight, with an estimated coefficient of approximately -249.06 (p <
0.001), obtained by combining within-group and between-group variation.
Mother's age (mage) remains a significant predictor of birthweight, with a positive coefficient of
approximately 10.70 (p < 0.001).
The variable 'black' is significantly associated with a decrease in birthweight, with an estimated
coefficient of approximately -255.03 (p < 0.001). Unlike in the FE model it is not omitted, because
the RE estimator also exploits the variation between mothers.
eststo BEa: xtreg birwt smoke mage black, be
Between regression (regression on group means)  Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0054                                         min =          2
     Between = 0.1011                                         avg =        2.2
     Overall = 0.0751                                         max =          3

                                                F(3,3974)         =     148.99
sd(u_i + avg(e_i.)) = 427.9402                  Prob > F          =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -331.8269   21.51428   -15.42   0.000     -374.007   -289.6469
        mage |   7.373431   1.307413     5.64   0.000     4.810167    9.936694
       black |  -259.8512   26.68225    -9.74   0.000    -312.1634    -207.539
       _cons |   3322.507   38.95188    85.30   0.000     3246.139    3398.874
------------------------------------------------------------------------------
In the between groups regression analysis, smoking during pregnancy is significantly associated
with a decrease in group-level mean birthweight, with an estimated coefficient of approximately
-331.83 (p < 0.001). Mother's age exhibits a significant positive association with group-level mean
birthweight, with an estimated coefficient of approximately 7.37 (p < 0.001). Black, instead, is
significantly associated with a decrease in group-level mean birthweight, with an estimated
coefficient of approximately -259.85 (p < 0.001).
eststo CREa: xtreg birwt smoke smoke_idot mage mage_idot black, re theta
Random-effects GLS regression                   Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0149                                         min =          2
     Between = 0.1011                                         avg =        2.2
     Overall = 0.0801                                         max =          3

                                                Wald chi2(5)      =     521.95
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------- theta --------------------
   min       5%     median      95%      max
 0.3872    0.3872    0.3872    0.4650   0.4650

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -105.6992   29.51831    -3.58   0.000    -163.5541   -47.84441
  smoke_idot |  -227.2295    36.5382    -6.22   0.000     -298.843   -155.6159
        mage |   23.11536   3.049303     7.58   0.000     17.13884    29.09189
   mage_idot |   -15.5921   3.317603    -4.70   0.000    -22.09448   -9.089715
       black |  -260.0303   26.61759    -9.77   0.000    -312.1998   -207.8607
       _cons |   3318.902   38.94571    85.22   0.000     3242.569    3395.234
-------------+----------------------------------------------------------------
     sigma_u |  341.66312
     sigma_e |  374.73056
         rho |  .45393994   (fraction of variance due to u_i)
------------------------------------------------------------------------------
In the correlated random effects (CRE) regression, the mother-level means
'smoke_idot' and 'mage_idot' are added to the RE model. Smoking during pregnancy
is significantly associated with a decrease in birthweight, with an estimated
coefficient of approximately -105.70 (p < 0.001). The mother-level mean of
smoking, 'smoke_idot', also has a significant negative association with
birthweight, with an estimated coefficient of approximately -227.23 (p < 0.001).
Mother's age exhibits a significant positive association with birthweight, with
an estimated coefficient of approximately 23.12 (p < 0.001), while the
mother-level mean of age, 'mage_idot', is significantly negatively associated
with birthweight, with an estimated coefficient of approximately -15.59
(p < 0.001). Race (black) is significantly associated with a decrease in
birthweight, with an estimated coefficient of approximately -260.03 (p < 0.001).
Two points are worth noting. First, the coefficients on smoke and mage coincide
with the FE estimates (-105.70 and 23.12), while black, which FE cannot
identify, is still estimated. Second, the significance of the two mean terms
indicates that the individual effects are correlated with the regressors, which
is evidence against the consistency of RE.
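The CRE regression also gives a regression-based alternative to the Hausman test: under the RE assumptions the coefficients on the mother-level means are jointly zero. A minimal sketch, assuming the means are not already in memory (skip the two egen lines if they are) and with cluster-robust standard errors, which are my choice rather than part of the original command:

```stata
* Mother-level means of the time-varying regressors
bysort momid: egen smoke_idot = mean(smoke)
bysort momid: egen mage_idot  = mean(mage)

* CRE (Mundlak) regression with standard errors clustered by mother
xtreg birwt smoke smoke_idot mage mage_idot black, re vce(cluster momid)

* H0: the individual effects are uncorrelated with smoke and mage
test smoke_idot mage_idot
```

Rejecting H0 leads to the same conclusion as a significant Hausman statistic, and this version remains usable with robust standard errors.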
hausman REa FEa
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |      REa          FEa         Difference       Std. err.
-------------+----------------------------------------------------------------
       smoke |   -249.0647    -105.6992       -143.3655               .
        mage |    10.70324     23.11536       -12.41213               .
------------------------------------------------------------------------------
b = Consistent under H0 and Ha; obtained from xtreg.
B = Inconsistent under Ha, efficient under H0; obtained from xtreg.
Test of H0: Difference in coefficients not systematic
chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
        = -58.63
Warning: chi2 < 0 ==> model fitted on these data
         fails to meet the asymptotic assumptions
         of the Hausman test; see suest for a
         generalized test.
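The Hausman test is used to assess whether the Random-Effects (REa) model or the Fixed-Effects
(FEa) model is more appropriate for analyzing the data. Here, however, the test cannot be
interpreted: the command lists the estimates in the wrong order (hausman expects the
always-consistent estimator, FEa, first and the efficient one, REa, second), the resulting statistic is
negative (chi2(2) = -58.63) and no p-value is reported.
The two sets of estimates nevertheless differ markedly: the coefficient for smoking differs by
approximately 143.37 (-249.06 in RE against -105.70 in FE) and the coefficient for mother's age by
approximately 12.41 (10.70 against 23.12). Together with the significant coefficients on smoke_idot
and mage_idot in the CRE regression (z = -6.22 and z = -4.70), this indicates that the individual
effects are correlated with the regressors, so the RE estimator is inconsistent for these data.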
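A hedged sketch of how the test could be rerun with the arguments in the expected order. The sigmamore option bases both covariance matrices on the disturbance variance of the efficient estimator, which usually avoids a negative statistic; whether it does so here can only be checked by running it.

```stata
* Re-estimate both models with conventional standard errors
quietly xtreg birwt smoke mage black, fe
estimates store FEa
quietly xtreg birwt smoke mage black, re
estimates store REa

* Consistent estimator first, efficient estimator second
hausman FEa REa, sigmamore
```

Only smoke and mage enter the comparison, because black is not estimated by FE.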
3. Suppose that the effect of smoking differs according to the
mother’s age (hint: you need interactions). Estimate your new
model by FE, BE, CRE (both HA1 and HA3). Interpret the results.
eststo FEI: xtreg birwt smoke mage i.smoke#c.mage black, fe
note: black omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0150                                         min =          2
     Between = 0.0431                                         avg =        2.2
     Overall = 0.0370                                         max =          3

                                                F(3, 4623)        =      23.49
corr(u_i, Xb) = -0.0857                         Prob > F          =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -205.6625   134.0088    -1.53   0.125    -468.3836    57.05863
        mage |   22.75097   3.087986     7.37   0.000     16.69705     28.8049
             |
smoke#c.mage |
      Smoker |   3.794638   4.961931     0.76   0.444    -5.933114    13.52239
             |
       black |          0  (omitted)
       _cons |   2834.287   88.46711    32.04   0.000     2660.849    3007.725
-------------+----------------------------------------------------------------
     sigma_u |  443.36829
     sigma_e |  374.74738
         rho |   .5832908   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3977, 4623) = 2.86                  Prob > F = 0.0000
eststo BEI: xtreg birwt smoke mage i.smoke#c.mage black, be
Between regression (regression on group means)  Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0054                                         min =          2
     Between = 0.1012                                         avg =        2.2
     Overall = 0.0752                                         max =          3

                                                F(4,3973)         =     111.83
sd(u_i + avg(e_i.)) = 427.9712                  Prob > F          =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -265.2415   104.4722    -2.54   0.011    -470.0656   -60.41735
        mage |   7.781811   1.450075     5.37   0.000     4.938849    10.62477
             |
smoke#c.mage |
      Smoker |   -2.49421   3.829506    -0.65   0.515    -10.00219    5.013772
             |
       black |  -258.2448   26.79792    -9.64   0.000    -310.7838   -205.7059
       _cons |   3310.556   43.05996    76.88   0.000     3226.134    3394.978
------------------------------------------------------------------------------
eststo CREI: xtreg birwt smoke mage i.smoke#c.mage black, re
Random-effects GLS regression                   Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0081                                         min =          2
     Between = 0.0975                                         avg =        2.2
     Overall = 0.0750                                         max =          3

                                                Wald chi2(4)      =     460.58
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |  -208.7883   82.53216    -2.53   0.011    -370.5483    -47.0282
        mage |   10.93277   1.283739     8.52   0.000     8.416691    13.44886
             |
smoke#c.mage |
      Smoker |  -1.516633   3.038013    -0.50   0.618    -7.471029    4.437763
             |
       black |  -254.1116   26.72171    -9.51   0.000    -306.4851    -201.738
       _cons |   3209.129   38.07586    84.28   0.000     3134.501    3283.756
-------------+----------------------------------------------------------------
     sigma_u |  341.69323
     sigma_e |  374.74738
         rho |  .45396137   (fraction of variance due to u_i)
------------------------------------------------------------------------------
How can you evaluate the partial effect of smoking when
mother’s age is at the median value?
est restore FEI
(results FEI are active now)
. xtreg birwt i.smoke##c.mage, fe
Fixed-effects (within) regression               Number of obs     =      8,604
Group variable: momid                           Number of groups  =      3,978

R-squared:                                      Obs per group:
     Within  = 0.0147                                         min =          2
     Between = 0.0432                                         avg =        2.2
     Overall = 0.0371                                         max =          3

                                                F(3, 4623)        =      23.07
corr(u_i, Xb) = -0.0818                         Prob > F          =     0.0000

------------------------------------------------------------------------------
       birwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |
      Smoker |  -207.7048   134.0198    -1.55   0.121    -470.4476    55.03803
        mage |   22.43556   3.081158     7.28   0.000     16.39502     28.4761
             |
smoke#c.mage |
      Smoker |   3.872842   4.962322     0.78   0.435    -5.855679    13.60136
             |
       _cons |    2843.27   88.27702    32.21   0.000     2670.205    3016.335
-------------+----------------------------------------------------------------
     sigma_u |  443.17047
     sigma_e |  374.79789
         rho |  .58300833   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3977, 4623) = 2.86                  Prob > F = 0.0000
. margins, dydx(mage smoke)

Average marginal effects                                 Number of obs = 8,604
Model VCE: Conventional

Expression: Linear prediction, predict()
dy/dx wrt:  1.smoke mage

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       smoke |
      Smoker |  -96.96736   31.57848    -3.07   0.002      -158.86   -35.07468
        mage |   22.97751   3.052369     7.53   0.000     16.99497    28.96004
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
The margins command with dydx() reports average marginal effects, that is, effects averaged over
the ages observed in the sample. On average, infants born to mothers who smoke (Smoker) weigh
approximately 96.97 grams less than infants born to non-smoking mothers. This effect is statistically
significant (p = 0.002), indicating that smoking during pregnancy is associated with lower birth
weights. Each additional year of mother's age is associated with an average increase of
approximately 22.98 grams in birth weight; this effect is also statistically significant (p < 0.001).
Because of the interaction, the partial effect of smoking depends on age: from the FE estimates it is
-207.70 + 3.87 x mage. These are therefore average effects and not yet the effect at the median
age: to answer the question, mage has to be fixed at its median value instead of being averaged
over the sample.
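A minimal sketch of how the partial effect of smoking at the median age could be obtained, assuming the same FE specification with the factor-variable interaction. Both routes should give the same point estimate; the local macro med is only an illustrative name.

```stata
* Median of mother's age, stored before estimation
quietly summarize mage, detail
local med = r(p50)

quietly xtreg birwt i.smoke##c.mage, fe

* Route 1: margins with mage fixed at its median
margins, dydx(smoke) at((median) mage)

* Route 2: the same linear combination by hand,
* b[1.smoke] + median(mage) * b[1.smoke#c.mage]
lincom 1.smoke + `med'*1.smoke#c.mage
```

The at((median) mage) specification computes the median over the estimation sample, which here is the full sample of 8,604 observations.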
EXAM 3
The dataset was used by Bronwyn H. Hall and Robert E. Hall (1993) “The Value and Performance
of U.S. Corporations,” Brookings Papers on Economic Activity, 1, 1-50.
All values are nominal and in millions of dollars except where otherwise noted. Stocks are
end of year.
We want to estimate the simple investment model:
$I_{it} = b_1 Q_{i,t-1} + b_2 D_{i,t-1} + b_3 CF_{i,t-1} + b_4 R\&D_{i,t-1} + b_5 ADV_{i,t-1} + b_6 T_i + IND_i + \tau_t + (\mu_i + e_{it})$
where I is investment/assets, Q is market value/assets, D is long-term debt/assets, CF is
cash flow/assets, R&D is R&D/assets, ADV is advertising/assets, and T is a dummy
variable indicating if the corporation’s stock is traded on the NYSE or AMEX.
The regression also includes 19 dummy variables indicating the industry code (from
ardsic), and time dummies. Note that the error term could be composed by the
idiosyncratic shock and the individual heterogeneity. The shock eit is zero-mean,
supposed to be heteroskedastic and autocorrelated.
The model refers to Tobin's q theory of investment, which suggests that investment should
be predicted solely by Q (Tobin's Q). This theory predicts that the coefficient on Q should
be positive, and the other coefficients should be zero.
Theories of liquidity constraints suggest that the coefficient on D should be negative and
the coefficient on CF should be positive. The literature has recognized the intangible
capital aspects of advertising and R&D, which could be complementary or substitute for
investment.
QUESTION 1:
Note that the unit-time varying explanatory variables are lagged. Why? Explain.
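The unit-time-varying regressors (Q, D, CF, R&D and ADV) are lagged for two reasons.
First, investment in year t is decided on the basis of the information available at the end of
year t-1: stocks are measured at the end of the year, so the firm plans its investment knowing
the market value, debt and cash flow of the previous period.
Second, lagging mitigates simultaneity. Contemporaneous Q, debt and cash flow are likely to
be determined jointly with investment (for example, current investment affects the market
value of the firm and its need for debt), so they would be correlated with the shock e_{it}.
Lagged values are predetermined with respect to e_{it}, which reduces this reverse-causality
problem, although they are not necessarily strictly exogenous.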
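In Stata, lags can be taken with the L. time-series operator once the panel has been declared. The sketch below assumes that the variables in the file are contemporaneous values; if they are already stored as lagged values, this step is not needed. The name vala_lag is hypothetical.

```stata
* Sketch only: lagged regressors via the L. operator.
* Assumes the stored variables are contemporaneous and that ind* and tau* exist.
xtset cusip year

* L.x is the value of x in year t-1 for the same firm;
* it is missing when year t-1 is not observed for that firm.
reg inva L.vala L.debta L.cfa L.rnda L.adva nyseamex ind* tau*, vce(cluster cusip)

* The same lag stored as an explicit variable (hypothetical name)
gen vala_lag = L.vala
```

Using lags drops the first observed year of each firm, so the estimation sample is smaller than the raw dataset.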
QUESTION 2:
Describe the dataset: which type of data do you have?
describe
Contains data from /private/var/folders/xv/628t t14g17mjjllxlhmbm40000gn/T/com.microsoft.Outlook/Outlook Temp/hall_hall_1993[19].dta
 Observations:        27,566
    Variables:            23                  26 Jul 2023 19:43
--------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label          Variable label
--------------------------------------------------------------------------------
cusip           long    %10.0g                    Committee on Uniform Security Identification Procedures, firm code number, the r
year            int     %10.0g                    2-digit year of the data
pstar           double  %10.0g                    The PDV of dividends received on this firm's common stock in the future, discoun
z0              double  %10.0g                    the first term in the linearized expression for z (see Appendix A of the paper)
pricef          double  %10.0g                    the end of fiscal year actual price of common stock
divf            double  %10.0g                    the dividends paid during the past fiscal year
rnda            double  %10.0g                    rnd to assets ratio
adva            double  %10.0g                    advertising to assets ratio
fyr             byte    %10.0g                    the month of the fiscal year close (1-12)
ardsic          byte    %42.0g     ardsic_labels  Industry code
exityr          int     %10.0g                    the year the firm exited from the sample
inva            double  %10.0g                    investment to assets ratio
cfa             double  %10.0g                    cash flow to assets ratio
debta           double  %10.0g                    long term debt to assets ratio
sales           double  %10.0g                    sales during the year (millions $)
netcap          double  %10.0g                    book value of assets = P&E+inventories+other, adjusted for the effects of inflat
earnsh          double  %10.0g                    reported earnings per share
nyseamex        byte    %10.0g                    dummy = 1 if firm is traded on NYSE or AMEX
h0              double  %10.0g                    an instrument like that given in equation B-5 of the paper
h1              double  %10.0g                    an instrument like that given in equation B-5 of the paper
vala            double  %10.0g                    total market value to assets ratio (Tobin's Q)
oneper          str18   %18s                      one period rate of return, adjusted for non-diversifiable discounting
sharef          double  %10.0g                    shares of common stock outstanding (1000s)
--------------------------------------------------------------------------------
Sorted by: cusip year
xtset cusip year
Panel variable: cusip (unbalanced)
Time variable: year, 1960 to 1991
Delta: 1 unit
xtdes
   cusip:  32, 209, ..., 989845                          n =       1962
    year:  1960, 1961, ..., 1991                         T =         32
           Delta(year) = 1 unit
           Span(year)  = 32 periods
           (cusip*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%     50%     75%     95%     max
                         5       5       8      13      20      27      29

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+----------------------------------
      127      6.47    6.47 |  ............11111111111111111111
       77      3.92   10.40 |  ...11111111111111111111111111111
       44      2.24   12.64 |  ............1111111.............
       43      2.19   14.83 |  .............1111111111111111111
       41      2.09   16.92 |  ............11111...............
       41      2.09   19.01 |  ............111111..............
       38      1.94   20.95 |  ...........................11111
       33      1.68   22.63 |  ..........................111111
       33      1.68   24.31 |  ........................11111111
     1485     75.69  100.00 | (other patterns)
 ---------------------------+----------------------------------
     1962    100.00         |  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
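The output above describes an unbalanced panel: 1,962 firms observed between 1960 and 1991, for a total of 27,566 firm-year observations, with each firm present for between 5 and 29 years (median 13). As a possible complement, the sketch below splits the variability of each model variable into its between-firm and within-firm parts, which shows whether a regressor varies over time within a firm.

```stata
* Sketch only: overall, between and within variation of the model variables.
xtset cusip year
xtsum inva vala debta cfa rnda adva nyseamex

* A within standard deviation of zero means the variable is time-invariant,
* so its coefficient cannot be identified by FE or FD.
```

This matters for the choice of estimator, since FE and FD identify coefficients only from within-firm variation.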
QUESTION 3:
Estimate the model by POLS, FE, FD, BE, RE, CRE. In the light of the econometric
theory behind each estimation method, interpret and comparatively discuss the estimates
and select your preferred estimation method. Which theory do you think is valid on these
data? Which kind of endogeneity could you detect?
I_{it} = b_1 Q_{i,t-1} + b_2 D_{i,t-1} + b_3 CF_{i,t-1} + b_4 R&D_{i,t-1} + b_5 ADV_{i,t-1} + b_6 T_i + IND_i + τ_t + (μ_i + e_{it})
I = inva
Q = vala
D = debta
CF = cfa
R&D = rnda
ADV = adva
T = nyseamex
IND = ind*
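The industry and time effects can also be entered without generating dummy variables first. The sketch below uses Stata's factor-variable notation as an alternative to explicit ind* and tau* variables; the stub yd is a hypothetical name, chosen so that it does not clash with variables that may already exist.

```stata
* Sketch only: factor variables build the industry and year dummies on the fly.
reg inva vala debta cfa rnda adva nyseamex i.ardsic i.year, vce(cluster cusip)

* If explicit year dummies are preferred, tabulate can generate them.
* 'yd' is a hypothetical stub (yd1, yd2, ...).
tab year, gen(yd)
```

With i.ardsic and i.year Stata omits one base category of each automatically, which avoids the dummy-variable trap.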
tab ardsic, g(ind)
Dtime year
eststo POLS: reg inva vala debta cfa rnda adva nyseamex ind* tau*
eststo FE:xtreg inva vala debta cfa rnda adva nyseamex ind* tau*, fe
eststo FD: reg d.inva d.vala d.debta d.cfa d.rnda d.adva d.nyseamex d.ind* d.tau*
eststo BE: xtreg inva vala debta cfa rnda adva nyseamex ind* tau*, be
eststo RE: xtreg inva vala debta cfa rnda adva nyseamex ind* tau*, re theta
within cusip year inva vala debta cfa rnda adva nyseamex ind* if e(sample)
sort cusip year
list cusip year vala vala_idot debta debta_idot cfa cfa_idot rnda rnda_idot adva adva_idot ///
    nyseamex nyseamex_idot ind* ind*_idot in 1/20, noobs sepby(cusip)
eststo CRE: xtreg inva vala vala_idot debta debta_idot cfa cfa_idot rnda rnda_idot ///
    adva adva_idot nyseamex nyseamex_idot ind* ind*_idot tau*, re theta
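The CRE (Mundlak) regression adds the firm-level means of the time-varying regressors to the RE specification. The sketch below shows one way to build these means with built-in commands, assuming the within command used above is a course-provided tool that creates the *_idot variables. The *_mean names are hypothetical and kept distinct from the *_idot variables. The joint test on the means is a robust alternative to the Hausman test for choosing between RE and FE.

```stata
* Sketch only: Mundlak / CRE with built-in commands.
* Assumes ind* and tau* exist; the *_mean names are hypothetical.
quietly xtreg inva vala debta cfa rnda adva nyseamex ind* tau*, re

* Firm-level means of the time-varying regressors over the estimation sample
foreach v of varlist vala debta cfa rnda adva {
    bysort cusip: egen `v'_mean = mean(`v') if e(sample)
}

xtreg inva vala vala_mean debta debta_mean cfa cfa_mean rnda rnda_mean ///
    adva adva_mean nyseamex ind* tau*, re vce(cluster cusip)

* H0: the means are jointly zero, i.e. mu_i is uncorrelated with the regressors
test vala_mean debta_mean cfa_mean rnda_mean adva_mean

* Side-by-side comparison of the stored estimates (esttab is part of estout)
esttab POLS FE FD BE RE CRE, se
```

If the test rejects, the RE assumption fails and the FE (or CRE) estimates are to be preferred. In an unbalanced panel such as this one, the firm-level means of the time dummies would normally be added as well.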