1. In the attached table are two variables (x... dependent and x the independent variable, i indicates the observation... Lab-session 4:

advertisement
Lab-session 4:
1. In the attached table are two variables (x and y) with 20 observations each, y is the
dependent and x the independent variable, i indicates the observation number. Please
estimate the linear relationship between the 2 variables: calculating the two parameters
alpha and beta by simple ordinary least square estimation of the model: y_i = alpha +
beta*x_i + epsilon_i
i
x
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
y
6
-1
3
-1
8
0
-3
1
4
7
2
1
7
-4
4
4
4
-1
-3
3
11
-5
5
-21
53
-11
2
10
14
29
23
-4
31
-10
30
30
22
-12
-28
25
2. Calculate the standard error of beta.
3. Calculate a two sided t-test for the significance of beta with a significance level of 5 %!
Note: the critical values for n=20 (n-2=18) are 2.101 and -2.101
4. Calculate the 95% confidence interval for beta.
5. Calculate the R-squared for the above model.
6. download the file garmit_lab.dta into your m-drive: this file contains a panel of 18 OECD
countries from 1961 till 1994. Backgroud-reading: Garrett-Mitchell paper on government
spending (sent around by e-mail)
7. open stata
8. change the working directory to your m drive:
cd "c:\...\..."
open the data file in stata:
use garmit_lab.dta
9. summarize the main characteristics of the interesting variables:
sum spend unem growthpc depratio left cdem trade lowwage fdi skand
10. estimate a basic model for government spending and interpret the regression table:
Interpret the regression results, coefficients, standard errors, confidence intervals, R², FTest
reg spend unem growthpc depratio left cdem trade lowwage fdi
11. now change the confidence levels (note the default setting is 95% levels)
set level 99
set level 90
estimate the same regression again, what has changed, how and why?
Set the level back to default:
set level 95
estimate the model again:
reg spend unem growthpc depratio left cdem trade lowwage fdi
12. Now run the same regression but estimate standardized coefficients – how to interpret
standardized coefficients? Why do we need standardized coefficients?
reg spend unem growthpc depratio left cdem trade lowwage fdi, beta
Heteroscedasticity and Omitted Variable Bias
13. calculate predicted values and residuals:
predict spend_hat
predict spend_resid, resid
14. Now create scatterplots for the residuals against some of the explanatory variables and the
country codes: what can you see? Omitted variable bias, heteroskedasticity?
twoway (scatter spend_resid unem)
twoway (scatter spend_resid cc)
twoway (scatter spend_resid growthpc)
15. Stata provides build in tests for omitted variable bias and heteroskedasticity: How to
interprete the test results?
estat hettest
estat ovtest
estat szroeter unem growthpc depratio left cdem trade lowwage fdi
16. What can we do about Heteroskedasticity and omitted variable bias?
reg spend unem growthpc depratio left cdem trade lowwage fdi skand
What can be observed, how to interpret the coefficient for "skand"?
17. Now do the same tests again for the new model and interpret the results:
predict spend_hat2
predict spend_resid2, resid
estat hettest
estat ovtest
twoway (scatter spend spend_hat2 if year==1984, mlabel(country))
twoway (scatter spend_resid2 cc)
twoway (scatter spend_resid2 unem)
estat szroeter unem growthpc depratio left cdem trade lowwage fdi skand
18. Just treating the standard errors: robust White SEs:
reg spend unem growthpc depratio left cdem trade lowwage fdi, robust
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(robust)
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(cluster cc)
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(bootstrap)
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(jacknife)
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(hc2)
reg spend unem growthpc depratio left cdem trade lowwage fdi, vce(hc3)
19. GLS: robust Huber-White Sandwich Estimator
xtgls spend unem growthpc depratio left cdem trade lowwage fdi, p(h)
open stata
change the working directory to your m drive:
cd "c:\...\..."
open the data file in stata:
use garmit_lab.dta
open log file
log using lab5.log
estimate a basic model for government spending and interpret the regression table:
reg spend unem growthpc depratio left cdem trade lowwage fdi
Multicollinearity:
calculate correlation coefficients for all explanatory variables, do you find problems of multicollinearity?
corr unem growthpc depratio left cdem trade lowwage fdi skand
What is the trade-off between multi-collinearity and omitted variable bias?
If there is complete multicollinearity we don't have to worry, STATA does the job:
Linear transformations of a variable are perfect multicollinear to the original variable:
gen unem2=2*unem+3
corr unem unem2
reg spend unem unem2 if year==1986
Was does Stata do in case of perfect multicollinearity?
Now look at two variables that are highly correlated but not multicollinear, what is the
problem?
gen unem3=2*unem+3*invnorm(uniform())
corr unem unem3
reg spend unem unem3 if year==1986
reg spend unem skand if year==1986
reg spend unem unem3 skand if year==1986
run the original regression again:
reg spend unem growthpc depratio left cdem trade lowwage fdi
Run a variance inflation factor test for higher order multicollinearity:
estat vif
estat vif, uncentered
How to interprete the vif, what does vif measure?
AV-plots (additional) variables are for multivariate models to show what ceteris paribus effect
a single independent variable has:
avplots
avplot unem
for linear OLS models the marginal effect of a single independent variable is equal for all
values of the independent variable:
mfx
mfx compute, at(mean)
mfx compute, at(median)
Autocorrelation:
now lets look at a single time-series, e.g. Germany (or UK) – what has changed, what is the
difference to the above models?
reg spend unem growthpc depratio left cdem trade lowwage fdi if
country=="Germany"
Do the same tests again: what can you observe?
estat hettest
estat ovtest
Now let's turn to another violation of Gauss-Markov assumptions: autocorrelation, serial
correlation and test for it: Durbin-Watson statistic and Breusch-Godfrey test, how to interprete
the results?
estat dwatson
estat bgodfrey
Or a simpler test: what can we observe?
predict spend_residger, resid
gen lagspend_residger=l.spend_residger
reg spend_residger lagspend_residger unem growthpc depratio left cdem trade
lowwage fdi if country == "Germany"
the simplest way to deal with serial correlation is to include the lagged values of the
dependent variable to the right hand side of the model – the LDV (lagged dependent variable;
BUT with including and excluding variables from the model and do some more Omitted
variable bias and heteroskedasticity tests…
Now do the same for the UK!
…
: opens another can of worms… We will talk about this in more detail when we talk about
time series models): How to interpret the coefficient of the LDV?
reg spend spendl unem growthpc depratio left cdem trade lowwage fdi if
country=="Germany"
And do the tests again:
estat dwatson
estat bgodfrey
Run a Prais-Winsten model:
prais spend unem growthpc depratio left cdem trade lowwage fdi if
country=="Germany"
Play a little around
Interpretatipon of a dummy variable and interaction effects:
reg spend skand if year==1986
reg spend unem skand
gen skand_unem=unem*skand
reg spend unem skand skand_unem
reg spend unem skand skand_unem growthpc depratio left cdem trade lowwage fdi
Interaction effects:
gen unem_trade=unem*trade
reg spend unem trade growthpc depratio left cdem lowwage fdi
reg spend unem trade unem_trade growthpc depratio left cdem lowwage fdi
sslope spend unem trade unem_trade growthpc depratio left cdem lowwage fdi,
i(unem trade unem_trade)
sslope spend unem trade unem_trade growthpc depratio left cdem lowwage fdi,
i(trade unem unem_trade)
graphical display of IA effects:
sslope spend unem trade unem_trade growthpc depratio left cdem lowwage fdi,
i(unem trade unem_trade) graph
sslope spend unem trade unem_trade growthpc depratio left cdem lowwage fdi,
i(trade unem unem_trade) graph
sum trade unem
Program marginal effects of IA effects:
capture drop MV-lower
reg spend unem trade unem_trade growthpc depratio left cdem lowwage fdi
generate MV=((_n-1)*10)
replace
MV=. if _n>17
matrix b=e(b)
matrix V=e(V)
scalar b1=b[1,1]
scalar b2=b[1,2]
scalar b3=b[1,3]
scalar varb1=V[1,1]
scalar varb2=V[2,2]
scalar varb3=V[3,3]
scalar covb1b3=V[1,3]
scalar covb2b3=V[2,3]
scalar list b1 b2 b3 varb1 varb2 varb3 covb1b3 covb2b3
gen conb=b1+b3*MV if _n<=17
gen conse=sqrt(varb1+varb3*(MV^2)+2*covb1b3*MV) if _n<=171
gen a=1.96*conse
gen upper=conb+a
gen lower=conb-a
graph twoway (line conb
MV, clwidth(medium) clcolor(blue)
clcolor(black))/*
*/(line upper MV, clpattern(dash) clwidth(thin)
clcolor(black))/*
*/(line lower MV, clpattern(dash) clwidth(thin)
clcolor(black)) , /*
*/ xlabel(0 20 40 60 80 100 120 140 160, labsize(2.5)) /*
*/ ylabel(-0.5 0 0.5 1 1.5 2,
labsize(2.5)) yscale(noline)
xscale(noline) legend(col(1) order(1 2) label(1 "Marginal Effect of
Unemployment on Spending") label(2 "95% Confidence Interval") /*
*/ label(3 " ")) yline(0, lcolor(black)) title("Marginal Effect
of Unemployment on Spending as Trade Openness changes", size(4))/*
*/ subtitle(" " "Dependent Variable: Government Spending" " ",
size(3)) xtitle( Trade Openness, size(3) ) xsca(titlegap(2))
ysca(titlegap(2)) /*
*/ ytitle("Marginal Effect of Unemployment", size(3))
scheme(s2mono) graphregion(fcolor(white))
graph export
m:\...\unem_trade1.eps, replace
translate @Graph m:\...\unem_trade1.wmf, replace
capture drop MV-lower
or much easier:
download grinter.ado from Fred Boehmke
net from http://myweb.uiowa.edu/fboehmke/stata/grinter
reg spend unem trade unem_trade growthpc depratio left cdem lowwage fdi
grinter unem, inter (unem_trade) const02(trade) depvar(spend) kdensity yline(0)
test for functional form:
reg spend unem growthpc depratio left cdem trade lowwage fdi
acprplot unem, mspline
acprplot left, mspline
etc…
gen ln_unem = ln(unem)
reg spend ln_unem growthpc depratio left cdem trade lowwage fdi
gen sqrt_unem=unem^2
reg spend unem sqrt_unem growthpc depratio left cdem trade lowwage fdi
what effect can we observe? Calculate the “turning point”!
gen sqrt_unem=unem^2
Outliers:
test for outliers:
reg spend unem growthpc depratio left cdem trade lowwage fdi
dotplot spend
symplot spend
rvfplot
lvr2plot
lvr2plot, ml(country)
dfbeta
Solution: jacknife and bootstrapping:
jacknife _b _se, eclass saving(name.dta): reg spend unem growthpc depratio left cdem
trade lowwage fdi skand
bootstrap _b _se, reps(1000) saving(name.dta): reg spend unem growthpc depratio left
cdem trade lowwage fdi skand
Download