Uploaded by ayyanali2003

Econ 221 Assignment 1
Ayyan Ali
455234833
1) logpgp95 has 64 observations, a mean of 8.062237, and a standard deviation of
1.043359.
Log GDP per capita (PPP adjusted) in 1995 (i.e., a measure of economic
development) takes a minimum value of 6.109248 and a maximum value of
10.21574. The minimum value of 6.109248 indicates that the country observed has
a much lower level of economic development than the other countries in the
dataset. This country may have weaker state institutions, may not be politically
stable, and may not have equitable access to economic resources. On the other
hand, the maximum value of 10.21574 suggests that this country may have strong
state institutions, be more politically stable, and have better access to economic
resources. The range of logpgp95 (about 4.1 log points) shows considerable variation
in economic development across countries, which should be considered when
making policies.
avexpr has 64 observations, a mean of 6.515625, and a standard deviation of
1.468647.
Average protection against expropriation risk between 1985–1995 (i.e., a measure of
property right protection) takes a minimum value of 3.5 and a maximum value of 10.
The minimum value of 3.5 suggests that the country observed has poorly protected
property rights. This can negatively impact a country's economic development, as
investors are more hesitant to invest in such countries. In the long run, reduced
investment can lead to lower economic growth. On the other hand, the maximum
value of 10 suggests that the country observed has very well-protected property
rights. This country may have favourable conditions for thriving businesses,
leading to increased investment. Increased investment can lead to higher economic
growth.
2) logpgp95: 95% confidence interval (7.801614, 8.32286)
● We are 95% confident that the population mean of logpgp95 lies within the
interval (7.801614, 8.32286). That is, if we were to repeat the sampling
process many times and compute a 95% confidence interval for each sample,
we would expect about 95% of those intervals to contain the true population
mean.
● The population mean is a fixed number, so it is not correct to say that it falls
inside this particular interval 95% of the time. Rather, the interval gives the
range of values for the population mean of logpgp95 that are plausible given
the sample.
avexpr: 95% confidence interval (6.148768, 6.882482)
● We are 95% confident that the population mean of avexpr lies within the
interval (6.148768, 6.882482). That is, if we were to repeat the sampling
process many times and compute a 95% confidence interval for each sample,
we would expect about 95% of those intervals to contain the true population
mean.
● The population mean is a fixed number, so it is not correct to say that it falls
inside this particular interval 95% of the time. Rather, the interval gives the
range of values for the population mean of avexpr that are plausible given
the sample.
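Both intervals can be checked by hand from the summary statistics in Question 1. The following is a minimal Python sketch, not the Stata command used in the assignment. The critical value 1.9983 is an assumed approximation to the 97.5th percentile of the t distribution with 63 degrees of freedom, and the name mean_ci is only illustrative.

```python
import math

N_OBS = 64        # observations reported for both variables
T_CRIT = 1.9983   # assumption: approximate t critical value, 63 df, 95% two-sided


def mean_ci(mean, sd, n=N_OBS, t_crit=T_CRIT):
    """Return (lower, upper) bounds of a two-sided CI for a sample mean."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # half-width of the interval
    return mean - margin, mean + margin


# Means and standard deviations as reported in Question 1.
logpgp95_ci = mean_ci(8.062237, 1.043359)
avexpr_ci = mean_ci(6.515625, 1.468647)

print("logpgp95:", logpgp95_ci)
print("avexpr:  ", avexpr_ci)
```

The bounds agree with the reported intervals to about three decimal places; the small remaining gap comes from rounding the critical value.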
3)
*The blue line represents kdensity logpgp95 when good_property_right=0
**The red line represents kdensity logpgp95 when good_property_right=1
The kdensity plot visualises the distribution of logpgp95 by displaying its estimated
density across its range, separately for each value of good_property_right.
The difference between the conditional densities suggests that there may be a
relationship between the protection of property rights and the economic development
of a country. For instance, the density of logpgp95 when good_property_right = 1 is
skewed to the left (it has a longer left tail), with most of its mass at higher
values. This indicates that when a country has better protection of property rights, it
may have higher economic development. On the other hand, the density of logpgp95
when good_property_right = 0 is skewed to the right, with most of its mass at
lower values. This suggests that countries with weaker protection of property rights
may have lower economic development.
A plausible explanation for the difference in the means of conditional densities could
be that better protection of property rights encourages increased investment. The
increased investment could lead to higher economic growth and development as
businesses can expand and grow freely throughout the economy, generating
employment opportunities and more disposable income for workers. On the other
hand, countries with weaker property rights protections can discourage investments
as investors may be hesitant to invest in countries with weak business environments.
This could lead to lower economic development.
4) The difference between the means is derived from the following:
Mean(logpgp95 when good_property_right = 0) - Mean(logpgp95 when
good_property_right = 1)
The reported difference in means is -1.204658, and the 95% confidence interval for
the difference is (-1.632122, -0.7771948). The t statistic for the test is -5.6334,
and the reported p-value is 0.000.
The null hypothesis is that the two conditional means are equal, and the alternative
hypothesis is that they differ. Since the p-value is less than the significance level
of 0.05, we reject the null hypothesis. The p-value says that, if the null hypothesis
were true, the probability of observing a t statistic at least as extreme as -5.6334
would be virtually zero. The 95% confidence interval also excludes zero. This means
that there is significant evidence in favour of the alternative hypothesis: the
mean of logpgp95 when good_property_right = 1 is statistically significantly different
from the mean of logpgp95 when good_property_right = 0.
The conditional mean for logpgp95 when good_property_right = 0 is 7.459908, while
the conditional mean for logpgp95 when good_property_right = 1 is 8.664566. To
calculate the percentage change in the conditional mean of logpgp95
when going from good_property_right=0 to good_property_right=1, we employ
the percentage change formula: ((8.664566 - 7.459908)/7.459908)*100. The
percentage change is approximately 16.15%. This suggests that the conditional mean
of logpgp95 is approximately 16.15% higher in countries with well-protected property
rights (good_property_right=1) than in countries with poor property rights
(good_property_right=0).
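The percentage change can be verified with a few lines of arithmetic. The Python sketch below uses only the two conditional means reported above; the variable names are illustrative. Because logpgp95 is measured in logs, it also converts the gap in mean logs into a ratio of GDP per capita levels (strictly, a ratio of geometric means), which is often easier to interpret than a percentage change in the log itself.

```python
import math

mean_weak = 7.459908    # mean of logpgp95 when good_property_right = 0
mean_strong = 8.664566  # mean of logpgp95 when good_property_right = 1

# Gap between the two conditional means, in log points.
gap = mean_strong - mean_weak

# Percentage change in the conditional mean of the logged variable.
pct_change_in_log_mean = gap / mean_weak * 100

# exp(gap) is the implied ratio of (geometric mean) GDP per capita levels.
gdp_ratio = math.exp(gap)

print(f"gap in log points:        {gap:.6f}")
print(f"percent change in mean:   {pct_change_in_log_mean:.2f}%")
print(f"implied ratio of levels:  {gdp_ratio:.2f}")
```

A gap of about 1.2 log points corresponds to GDP per capita that is roughly 3.3 times higher in the group with well-protected property rights.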
As a result, stronger property rights are associated with higher economic
development, as measured by log GDP per capita. Economies with stronger
property rights may be more attractive for investment, leading to higher
economic growth.
5)
The pattern in the scatter plot and the correlation coefficient between avexpr and
logpgp95 suggest a strong positive linear correlation between the two variables. The
scatter plot shows a positive, roughly linear association, with the points lying fairly
close to the fitted line. On average, as avexpr increases, logpgp95 increases
(not accounting for any outliers). This aligns with the findings in Q3 and Q4, which
suggest that the stronger the property rights in a country, the more likely it is to have
a higher log GDP per capita, i.e. a higher level of economic development.
6) logpgp95i = β0 + β1avexpri + ui
The coefficient estimate β0^ (the Y-intercept of the fitted line) is 4.660383 and has a
standard error of 0.4085062. The coefficient estimate β1^, the slope of the fitted
line, is 0.522107 and has a standard error of 0.061185. A one-unit increase in
avexpri is therefore associated, on average, with a predicted increase of 0.522107 in
logpgp95i. The slope is statistically significant as the p-value of β1^ is less than the
significance level of 0.05.
The standard deviation of avexpr is 1.468647, so a one standard deviation increase in
avexpr leads to a predicted change in logpgp95 of β1^ * SD(avexpr) = 0.522107 *
1.468647 = 0.766791. Treating SD(avexpr) as a fixed number, the standard error of
this predicted change is SE(β1^) * SD(avexpr) = 0.061185 * 1.468647 = 0.089859.
With this standard error, we can compute the 95% confidence interval for the
predicted change.
Lower Bound of the Interval: 0.766791 - 1.96*0.089859 = 0.590667
Upper Bound of the Interval: 0.766791 + 1.96*0.089859 = 0.942915
Hence, with 95% confidence, a one standard deviation increase in avexpr is
associated with a predicted increase in logpgp95 of between 0.590667 and 0.942915.
It is important to state that this standard error assumes homoskedasticity, which
means the error term has a constant variance.
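The effect of a one standard deviation increase is the slope scaled by SD(avexpr), and its standard error scales in the same way. The Python sketch below works through that arithmetic with the estimates reported in this answer. It assumes SD(avexpr) is treated as a known constant and uses the large-sample critical value 1.96; the variable names are illustrative.

```python
beta1_hat = 0.522107   # estimated slope on avexpr
se_beta1 = 0.061185    # homoskedasticity-only standard error of the slope
sd_avexpr = 1.468647   # sample standard deviation of avexpr
z_crit = 1.96          # large-sample 95% critical value

# Predicted change in logpgp95 for a one standard deviation increase in avexpr.
effect = beta1_hat * sd_avexpr

# Standard error of that change, treating sd_avexpr as a fixed number.
se_effect = se_beta1 * sd_avexpr

lower = effect - z_crit * se_effect
upper = effect + z_crit * se_effect

print(f"effect = {effect:.6f}, se = {se_effect:.6f}")
print(f"95% CI = ({lower:.6f}, {upper:.6f})")
```

The interval is centred on the scaled effect (about 0.767), not on the raw slope of 0.522107, because the quantity of interest is the change for a one standard deviation move in avexpr.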
7) Allowing the error term (ui) to be heteroskedastic, the coefficient estimate β0^ (the
Y-intercept of the fitted line) is 4.660383 and has a robust standard error of
0.3201315. The coefficient estimate β1^, the slope of the fitted line, is 0.522107 and
has a robust standard error of 0.0499225.
If the error term is heteroskedastic, the usual (homoskedasticity-only) standard error
is not valid, because its formula assumes that the error term has a constant
variance. This could lead to incorrect statistical inferences. The robust standard
error, however, is valid whether or not the errors are heteroskedastic. In this case,
the robust standard errors for both β0^ and β1^ are smaller than the usual standard
errors, so in this sample the usual formula overstates the standard errors. This need
not hold in general: robust standard errors can be larger or smaller than the usual
ones.
However, the coefficient estimates for β0^ and β1^ do not change, regardless of
whether the standard error is robust. Therefore, the predicted change in logpgp95
corresponding to a one standard deviation increase in avexpr remains unchanged;
only its standard error and confidence interval change.
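To see what switching standard errors does to inference on the slope, the two sets of reported numbers can be put side by side. This Python sketch is only an illustration using the large-sample critical value 1.96 (an assumption; Stata reports intervals based on the t distribution), and the helper name slope_summary is hypothetical.

```python
beta1_hat = 0.522107       # slope estimate, identical in both regressions
se_usual = 0.061185        # homoskedasticity-only standard error
se_robust = 0.0499225      # heteroskedasticity-robust standard error
z_crit = 1.96              # assumption: large-sample 95% critical value


def slope_summary(estimate, se, crit=z_crit):
    """Return the t statistic and 95% interval for a coefficient."""
    t_stat = estimate / se
    return t_stat, (estimate - crit * se, estimate + crit * se)


t_usual, ci_usual = slope_summary(beta1_hat, se_usual)
t_robust, ci_robust = slope_summary(beta1_hat, se_robust)

print(f"usual:  t = {t_usual:.3f}, CI = ({ci_usual[0]:.4f}, {ci_usual[1]:.4f})")
print(f"robust: t = {t_robust:.3f}, CI = ({ci_robust[0]:.4f}, {ci_robust[1]:.4f})")
```

The point estimate is the same in both rows; only the t statistic and the width of the interval differ.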
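8) The coefficient of avexpr remains unchanged as Stata omits the variable “colony”
due to collinearity. A dummy variable is unlikely to be perfectly collinear with avexpr
itself; the likely reason is that colony takes the same value for every country in the
sample, which makes it perfectly collinear with the constant term. Therefore, the
command “regress logpgp95 avexpr colony” gives the same estimates as “regress
logpgp95 avexpr.”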
9) Compared to the regression of logpgp95 on avexpr alone, the coefficient estimate
for avexpr slightly increases to 0.527621. The coefficient estimate for f_french is
-0.0703873. The variable f_brit is omitted due to collinearity: the three coloniser
dummies appear to sum to one for every country, so together they are perfectly
collinear with the constant term, and former British colonies become the base group.
This means that, holding avexpr constant, former French colonies have a predicted
logpgp95 that is 0.0703873 lower than that of former British colonies. The coefficient
estimate for f_other is 0.3066813. This means that, holding avexpr constant,
countries colonised by other countries have a predicted logpgp95 that is 0.3066813
higher than that of former British colonies. Furthermore, the Y-intercept is now
4.53184.
The coloniser's identity appears to be associated with log GDP per capita, as the
coefficients for former French and other colonies differ from each other and from the
British base group. Whether these differences are statistically significant depends on
the standard errors of the coefficients. The differences could be due to the different
approaches of colonisers to economic policies, consequently leading to different
levels of GDP per capita. It is important to note that the effect of being a former
British colony is not missing from the model: it is absorbed into the intercept, and the
other two coefficients are measured relative to it.
10) The assumption E(ui|avexpri) = 0 states that the conditional mean of the error
term ui is zero for every value of avexpri. That is, for any given value of avexpri, the
error term averages out to 0, which implies that ui is uncorrelated with avexpr.
However, if the assumption is violated and E(ui|avexpri) ≠ 0, the error term is
typically correlated with the independent variable, avexpr, leading to bias in the
regression coefficients and eventually to inaccurate statistical inferences.
The assumption could be violated if there is omitted variable bias. This arises when
an omitted variable both correlates with the independent variable, avexpr, and
affects logpgp95, so that it is part of the error term ui. The extent and the direction
of the bias depend on the signs and strengths of these two relationships. Let us
consider political stability to be an omitted variable in this case. If political stability
raises logpgp95 and is positively correlated with avexpri, the coefficient estimate of
avexpri will be biased upwards, because avexpri picks up part of the effect of political
stability. This means that our model overestimates the impact of a change in avexpri
(protection of property rights) on logpgp95 (economic development). On the other
hand, if political stability raises logpgp95 but is negatively correlated with avexpri,
there will be a downward bias, and our model underestimates the impact of a change
in avexpri on logpgp95.
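The direction of omitted variable bias can be illustrated with a small simulation. The Python sketch below is a hypothetical example, not an analysis of the assignment data: the true slope (0.5), the effect of the omitted variable (0.4), the strength of its link with the regressor (±0.6), the sample size, and the seed are all made-up values chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def short_regression_slope(link, n=20000, beta1=0.5, gamma=0.4, seed=1):
    """Simulate data, then regress y on x while leaving out the third variable."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        stability = rng.gauss(0, 1)                    # omitted variable
        rights = link * stability + rng.gauss(0, 1)    # regressor, tied to it via link
        income = beta1 * rights + gamma * stability + rng.gauss(0, 1)
        x.append(rights)
        y.append(income)
    return ols_slope(x, y)


slope_pos = short_regression_slope(link=0.6)    # omitted variable moves with x
slope_neg = short_regression_slope(link=-0.6)   # omitted variable moves against x

print("true slope: 0.5")
print("estimated slope, positive link:", round(slope_pos, 3))
print("estimated slope, negative link:", round(slope_neg, 3))
```

With a positive link the estimated slope lies above the true value of 0.5, and with a negative link it lies below it, matching the sign rule: the bias has the sign of the omitted variable's effect on the outcome multiplied by the sign of its correlation with the regressor.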
Another reason could be measurement error in the independent variable, avexpr. If
the data has not captured the correct values of avexpr, there will be bias in the
coefficient estimates. The extent and direction of the bias depend on the nature of
the measurement error. If the measurement error is purely random (unrelated to the
true value of avexpr and to ui), the coefficient estimate is biased towards zero,
leading to underestimation of the true impact of avexpr on logpgp95.
However, if the error is non-random, this could lead to either overestimating or
underestimating the true impact of avexpr on logpgp95, depending on the situation.
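The pull towards zero under purely random measurement error can also be shown by simulation. This Python sketch is again a hypothetical illustration: the true slope (0.5), the noise level, the sample size, and the seed are invented values, and nothing here is estimated from the assignment data.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(2)
n = 20000
true_slope = 0.5   # hypothetical true effect of the regressor
noise_sd = 1.0     # hypothetical spread of the random measurement error

x_true, x_observed, y = [], [], []
for _ in range(n):
    x_star = rng.gauss(0, 1)                        # correctly measured regressor
    outcome = true_slope * x_star + rng.gauss(0, 1)
    x_true.append(x_star)
    x_observed.append(x_star + rng.gauss(0, noise_sd))  # regressor with random error
    y.append(outcome)

slope_clean = ols_slope(x_true, y)       # uses the correctly measured regressor
slope_noisy = ols_slope(x_observed, y)   # uses the mismeasured regressor

print("slope with true regressor: ", round(slope_clean, 3))
print("slope with noisy regressor:", round(slope_noisy, 3))
```

With the noise variance equal to the variance of the true regressor, theory predicts the estimated slope shrinks to about half of the true value, which is what the simulation shows.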
11)
//////////////////////////////////////////////////////////
//
//   ECON221 Assignment 1
//   Code by Ayyan Ali
//   April 2023
//
//////////////////////////////////////////////////////////

log using "Z:\Desktop\ECON221 Assignment 1\Assignment 1.log", replace // start log file
cd "Z:\Desktop\ECON221 Assignment 1" // set working directory
/*~~~~~~~~~~~~~~~~~~*/
/*   Assignment 1   */
/*~~~~~~~~~~~~~~~~~~*/
/*import AJR_raw.csv*/
import delimited "AJR_raw.csv", clear
/*save as stata dta file*/
save "AJR_raw.dta", replace
/* detailed summary statistics for logpgp95*/
sum logpgp95, detail
/* detailed summary statistics for avexpr*/
sum avexpr, detail
/* 95% confidence interval*/
ci mean logpgp95
/* 95% confidence interval*/
ci mean avexpr
/* store the median of avexpr (6.477273) in r(p50) */
quietly summarize avexpr, detail
/* good_property_right=1 if avexpr is greater than the median, 0 otherwise */
/* observations with missing avexpr are left missing */
generate good_property_right = (avexpr > r(p50)) if !missing(avexpr)
/*plot density of logpgp95 when good_property_right==1*/
kdensity logpgp95 if good_property_right == 1
/*export kdensity graph of good_property_right==1 as png format*/
graph export "Z:\Desktop\ECON221 Assignment 1\Kdensity Graph 1.png", as(png) name("Graph") replace
/*plot density of logpgp95 when good_property_right==0*/
kdensity logpgp95 if good_property_right == 0
/*export kdensity graph of good_property_right==0 as png format*/
graph export "Z:\Desktop\ECON221 Assignment 1\Kdensity Graph 2.png", as(png) name("Graph") replace
/*plot both conditional densities of logpgp95 on one graph*/
twoway kdensity logpgp95 if good_property_right==0 || ///
    kdensity logpgp95 if good_property_right==1
/*export twoway kdensity graph of good_property_right as png format*/
graph export "Z:\Desktop\ECON221 Assignment 1\Twoway K Density Graph.png", as(png) name("Graph") replace
/*calculate mean of logpgp95 by good_property_right and then test the difference*/
ttest logpgp95, by(good_property_right)
/*scatter plot with fitted line*/
twoway scatter logpgp95 avexpr || lfit logpgp95 avexpr
/*export scatter plot with fitted line as png format*/
graph export "Z:\Desktop\ECON221 Assignment 1\Scatter Plot.png", as(png) name("Graph") replace
/*correlation coefficient*/
corr avexpr logpgp95
/*regression of logpgp95 on avexpr with homoskedasticity-only standard errors*/
regress logpgp95 avexpr
/*regression of logpgp95 on avexpr with heteroskedasticity-robust standard errors*/
regress logpgp95 avexpr, robust
/*regression logpgp95 on avexpr and colony assuming homoskedasticity*/
regress logpgp95 avexpr colony
/*save my work under a new name so as not to overwrite my raw data*/
save "AJR_raw_new.dta", replace
/*import colonyID.csv*/
import delimited "colonyID.csv", clear
/*merge with AJR_raw_new.dta using shortnam as identifier*/
merge 1:1 shortnam using AJR_raw_new
/*only keep the observations that appear in both datasets*/
keep if _merge==3
/*drop the auto-generated variable*/
drop _merge
/*regression logpgp95 on avexpr f_french f_brit f_other assuming homoskedasticity*/
regress logpgp95 avexpr f_french f_brit f_other
/*close the log file*/
log close