EC338: Assignment 1
Adrian Pulchny
rm(list=ls())
library(pwr)       # pwr.2p2n.test() for the power calculation in Section B
library(psych)     # describe()
library(ggplot2)   # density plots
library(dplyr)     # data wrangling in Sections C and D
library(haven)     # read_dta()
library(dygraphs)  # interactive time-series plots
Section A
Question 1:
We want to show that $\hat{V}_{\text{const}} = \hat{V}_{\text{homosk}}$.

We start by showing that $\sum_{i=1}^{N}(W_i - \overline{W})^2 = N\overline{W}(1-\overline{W})$. Since $W_i$ is binary, $W_i^2 = W_i$ and $\sum_{i=1}^{N} W_i = N\overline{W}$, so

$$\sum_{i=1}^{N}(W_i - \overline{W})^2 = \sum_{i=1}^{N}\left(W_i^2 - 2W_i\overline{W} + \overline{W}^2\right) = N\overline{W} - 2\overline{W}\cdot N\overline{W} + N\overline{W}^2 = N\overline{W}(1-\overline{W}).$$

We now move to $\hat{V}_{\text{const}} = s^2\left(\tfrac{1}{N_c} + \tfrac{1}{N_t}\right)$. Plugging in

$$s^2 = \frac{1}{N-2}\left(s_c^2\,(N_c-1) + s_t^2\,(N_t-1)\right) = \frac{1}{N-2}\left(\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2\right).$$

We now focus on the second factor: $\frac{1}{N_c} + \frac{1}{N_t} = \frac{N}{N_c N_t}$.

Putting it all together results in

$$\hat{V}_{\text{const}} = \frac{\frac{1}{N-2}\left(\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2\right)}{\frac{N_c N_t}{N}}.$$

Now we can manipulate the denominator:

$$\frac{N_c N_t}{N} = N\,\frac{N_c}{N}\,\frac{N_t}{N} = N\overline{W}(1-\overline{W}) = \sum_{i=1}^{N}(W_i - \overline{W})^2,$$

so in the end

$$\hat{V}_{\text{const}} = \frac{\frac{1}{N-2}\left(\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2\right)}{\sum_{i=1}^{N}(W_i - \overline{W})^2}.$$

Now we have shown what $\hat{V}_{\text{const}}$ looks like. Let us switch to $\hat{V}_{\text{homosk}}$; because we have already shown that the denominator is the same, I will focus on the numerator. We plug in for the residual:

$$\frac{1}{N-2}\sum_{i=1}^{N}\hat{\epsilon}_i^{\,2} = \frac{1}{N-2}\sum_{i=1}^{N}\left(Y_i^{obs} - \overline{Y}^{obs} - \left(\overline{Y}_t^{\,obs} - \overline{Y}_c^{\,obs}\right)(W_i - \overline{W})\right)^2.$$

Using $\overline{Y}^{obs} = \overline{W}\,\overline{Y}_t^{\,obs} + (1-\overline{W})\,\overline{Y}_c^{\,obs}$, splitting the sum by treatment status and multiplying out, we obtain

$$\frac{1}{N-2}\left[\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{W}\,\overline{Y}_t^{\,obs} - (1-\overline{W})\,\overline{Y}_c^{\,obs} + \overline{W}\,\overline{Y}_t^{\,obs} - \overline{W}\,\overline{Y}_c^{\,obs}\right)^2 + \sum_{i:W_i=1}\left(Y_i^{obs} - \overline{W}\,\overline{Y}_t^{\,obs} - (1-\overline{W})\,\overline{Y}_c^{\,obs} - \overline{Y}_t^{\,obs} + \overline{W}\,\overline{Y}_t^{\,obs} + \overline{Y}_c^{\,obs} - \overline{W}\,\overline{Y}_c^{\,obs}\right)^2\right].$$

Many terms cancel and we are left with (here I include the denominator)

$$\hat{V}_{\text{homosk}} = \frac{\frac{1}{N-2}\left(\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2\right)}{\sum_{i=1}^{N}(W_i - \overline{W})^2},$$

which means that $\hat{V}_{\text{homosk}} = \hat{V}_{\text{const}}$.
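As a quick numerical sanity check (not part of the formal derivation), the equality can be verified in R on simulated data by computing both formulas directly; all names below are illustrative.

# Numerical check that V_const equals V_homosk (sketch on simulated data)
set.seed(1)
N  <- 200
W  <- rbinom(N, 1, 0.4)                     # binary treatment indicator
Y  <- 1 + 0.5 * W + rnorm(N)                # observed outcomes
Yc <- Y[W == 0]; Yt <- Y[W == 1]
Nc <- sum(W == 0); Nt <- sum(W == 1)
# V_const: pooled sample variance times (1/Nc + 1/Nt)
s2      <- (sum((Yc - mean(Yc))^2) + sum((Yt - mean(Yt))^2)) / (N - 2)
V_const <- s2 * (1 / Nc + 1 / Nt)
# V_homosk: homoskedastic OLS variance of the coefficient on W
eps      <- Y - mean(Y) - (mean(Yt) - mean(Yc)) * (W - mean(W))
V_homosk <- sum(eps^2) / (N - 2) / sum((W - mean(W))^2)
all.equal(V_const, V_homosk)                # TRUE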
Question 2
We want to show that $\hat{V}_{\text{neyman}} = \hat{V}_{\text{hetero}}$, where $\hat{V}_{\text{neyman}} = \frac{\tilde{s}_c^{\,2}}{N_c} + \frac{\tilde{s}_t^{\,2}}{N_t}$ and

$$\hat{V}_{\text{hetero}} = \frac{\sum_{i=1}^{N}\hat{\epsilon}_i^{\,2}\,(W_i - \overline{W})^2}{\left(\sum_{i=1}^{N}(W_i - \overline{W})^2\right)^2} = \frac{\sum_{i=1}^{N}\left(Y_i^{obs} - \hat{\alpha}^{OLS} - \hat{\beta}^{OLS} W_i\right)^2 (W_i - \overline{W})^2}{\left(\sum_{i=1}^{N}(W_i - \overline{W})^2\right)^2}.$$

The denominator can be written as

$$\left(\sum_{i=1}^{N}(W_i - \overline{W})^2\right)^2 = \left(N\overline{W}(1-\overline{W})\right)^2 = \left(N\,\frac{N_t}{N}\,\frac{N_c}{N}\right)^2 = \frac{N_t^2\,N_c^2}{N^2}.$$

We now plug in for the numerator and the denominator. Writing the residual as $\hat{\epsilon}_i = Y_i^{obs} - \overline{Y}^{obs} - (\overline{Y}_t^{\,obs} - \overline{Y}_c^{\,obs})(W_i - \overline{W})$ and using the same cancellations as in Question 1, control observations contribute $(Y_i^{obs} - \overline{Y}_c^{\,obs})^2$ with weight $(W_i - \overline{W})^2 = \overline{W}^2 = \left(\tfrac{N_t}{N}\right)^2$, and treated observations contribute $(Y_i^{obs} - \overline{Y}_t^{\,obs})^2$ with weight $(1-\overline{W})^2 = \left(\tfrac{N_c}{N}\right)^2$. Hence

$$\hat{V}_{\text{hetero}} = \frac{\left(\frac{N_t}{N}\right)^2\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \left(\frac{N_c}{N}\right)^2\sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2}{\frac{N_t^2\,N_c^2}{N^2}}.$$

The factors of $N$ cancel completely and the expression shortens to

$$\frac{1}{N_c^2}\sum_{i:W_i=0}\left(Y_i^{obs} - \overline{Y}_c^{\,obs}\right)^2 + \frac{1}{N_t^2}\sum_{i:W_i=1}\left(Y_i^{obs} - \overline{Y}_t^{\,obs}\right)^2,$$

which is equal to $\frac{\tilde{s}_c^{\,2}}{N_c} + \frac{\tilde{s}_t^{\,2}}{N_t}$ with $\tilde{s}_c^{\,2} = \frac{1}{N_c}\sum_{i:W_i=0}(Y_i^{obs} - \overline{Y}_c^{\,obs})^2$ and $\tilde{s}_t^{\,2} = \frac{1}{N_t}\sum_{i:W_i=1}(Y_i^{obs} - \overline{Y}_t^{\,obs})^2$. We have now shown that

$$\hat{V}_{\text{neyman}} = \hat{V}_{\text{hetero}}.$$
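Again purely as a sanity check (not part of the proof), a minimal R sketch on simulated heteroskedastic data comparing the robust OLS formula with the Neyman formula:

# Numerical check that V_neyman equals V_hetero (sketch on simulated data)
set.seed(2)
N  <- 200
W  <- rbinom(N, 1, 0.4)
Y  <- 1 + 0.5 * W + rnorm(N, sd = 1 + W)     # heteroskedastic outcomes
Yc <- Y[W == 0]; Yt <- Y[W == 1]
Nc <- sum(W == 0); Nt <- sum(W == 1)
# Neyman variance with within-group variances using 1/Nc and 1/Nt divisors
sc2      <- sum((Yc - mean(Yc))^2) / Nc
st2      <- sum((Yt - mean(Yt))^2) / Nt
V_neyman <- sc2 / Nc + st2 / Nt
# Heteroskedasticity-robust (HC0-type) variance of the OLS slope on W
eps      <- Y - mean(Y) - (mean(Yt) - mean(Yc)) * (W - mean(W))
V_hetero <- sum(eps^2 * (W - mean(W))^2) / (sum((W - mean(W))^2))^2
all.equal(V_neyman, V_hetero)                # TRUE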
Question 3
Because the outcome variances are the same in both groups under homoskedasticity, we can use a more precise estimator by pooling the variance and weighting it with $N_c$ and $N_t$. This is not the case with heterogeneous treatment effects: there we need the Neyman variance estimator, because the group variances differ and we cannot build a pooled variance as we did in the homogeneous case. How does this relate to the treatment effect? With a constant (homogeneous) treatment effect, every unit's outcome shifts by the same amount, so the variance of the outcome is the same in the control and treated group. With heterogeneous treatment effects, the variances in the control and treatment group generally differ, because the treatment effects themselves differ across units.
Section B
Our effect size is $\frac{\tau}{s} = 0.207$, i.e. $\tau = 0.207\,s$.

Using the formula we were given, the power of the two-sided test given a true effect $\tau_0$ is

$$P\!\left(Z < z_{\alpha/2} - \frac{\tau_0}{se(\hat{\tau})}\,\middle|\,\tau = \tau_0\right) + P\!\left(Z > z_{1-\alpha/2} - \frac{\tau_0}{se(\hat{\tau})}\,\middle|\,\tau = \tau_0\right) = 1 - P\!\left(Z < z_{1-\alpha/2} - \frac{\tau_0}{se(\hat{\tau})}\,\middle|\,\tau = \tau_0\right) + P\!\left(Z < z_{\alpha/2} - \frac{\tau_0}{se(\hat{\tau})}\,\middle|\,\tau = \tau_0\right).$$
If we assume that attrition is independent of treatment status, we get $N_t = 320$ and $N_c = 480$. We can now plug in all the values and compute the power. Here I use the standard normal distribution, and $\Phi$ denotes its CDF.
$$1 - \Phi\!\left(1.96 - \frac{0.207\,s}{\sqrt{s^2\left(\frac{1}{N_c} + \frac{1}{N_t}\right)}}\right) + \Phi\!\left(-1.96 - \frac{0.207\,s}{\sqrt{s^2\left(\frac{1}{N_c} + \frac{1}{N_t}\right)}}\right) = 1 - \Phi\!\left(1.96 - \frac{0.207}{\sqrt{\frac{1}{320} + \frac{1}{480}}}\right) + \Phi\!\left(-1.96 - \frac{0.207}{\sqrt{\frac{1}{320} + \frac{1}{480}}}\right)$$

$$= 1 - \Phi(-0.91) + \Phi(-4.83) = \Phi(0.91) + \Phi(-4.83) \approx 0.8186 + 0 \approx 0.82.$$
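The hand calculation can also be reproduced directly with the normal CDF in base R; this is a minimal sketch of the formula above, with $N_t = 320$ and $N_c = 480$ from the attrition assumption.

# Power of the two-sided test via the normal approximation (sketch)
Nt <- 320; Nc <- 480
z  <- 0.207 / sqrt(1 / Nc + 1 / Nt)             # tau0 / se(tau-hat); s cancels
1 - pnorm(1.96 - z) + pnorm(-1.96 - z)          # approximately 0.82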
We can also show this result using the R package pwr: we have a two-sided test with different sample sizes.
pwr.2p2n.test(h = 0.207, n1 = 320, n2 = 480, sig.level = 0.05, power = NULL, alternative = "two.sided")
## 
##      difference of proportion power calculation for binomial distribution (arcsine transformation) 
## 
##               h = 0.207
##              n1 = 320
##              n2 = 480
##       sig.level = 0.05
##           power = 0.818144
##     alternative = two.sided
## 
## NOTE: different sample sizes
Section C
Set observations
data_generation <- data.frame(matrix(ncol=0, nrow=1000))
Set parameters
y0 <- 1.2
y1 <- 0.015
y2 <- -0.02
y3 <- -0.01
set seed
set.seed(333)
Generate Covariates and error term
data_generation$error <- rnorm(1000,0,0.55)
data_generation$age <- floor(46*runif(1000)+20)
data_generation$female <- rbinom(1000, 1, 0.5 - (0.25*log(data_generation$age - 19))/log(46))
Generate Y0
data_generation$yi0 <- y0 + y1*data_generation$age + y2*data_generation$female + y3*data_generation$age*data_generation$female + data_generation$error
Create the heterogeneous treatment effect
data_generation$tau <- rnorm(1000, mean = 0.02 + 0.06*ifelse(data_generation$age > 43, 1, 0), sd = 0.01)
Treatment status
data_generation$treat <- rbinom(1000, 1, 0.25 + (0.5*log(data_generation$age - 19))/log(46))
Generate the observed outcome
data_generation$yobs <- data_generation$yi0 + data_generation$treat * data_generation$tau
Estimating the model 1000 times
set.seed(100)
mat_1 <- replicate(1000, {
df <- data.frame(matrix(ncol=0, nrow=1000));
df$error <- rnorm(1000,0,0.55);
df$age <- floor(46*runif(1000)+20);
df$female <- rbinom(1000, 1, 0.5 - (0.25*log(df$age - 19))/log(46));
df$yi0 <- y0 + y1*df$age + y2*df$female + y3*df$age * df$female + df$error;
df$tau <- rnorm(1000, mean = 0.02 + 0.06*ifelse(df$age > 43, 1, 0), sd = 0.01);
df$treat <- rbinom(1000, 1, 0.25 + (0.5*log(df$age - 19))/log(46));
df$yobs <- df$yi0 + df$treat * df$tau;
# Logit model for the propensity score, used to build inverse probability weights
lmodel <- glm(treat ~ age, family=binomial(logit), data= df);
prop = predict.glm(lmodel, newdata = df, type="response");
lambda = 1/(prop^df$treat * (1-prop)^(1-df$treat));
df$agesat <- as.factor(df$age);
reg1 <- lm(yobs ~ treat, data=df);
reg2 <- lm(yobs ~ treat + age + female + age*female, data=df);
reg3 <- lm(yobs ~ treat + agesat + 0, data=df);
reg4 <- lm(yobs ~ treat, weights = lambda, data=df);
reg5 <- lm(yobs ~ treat + agesat , weights = lambda, data=df);
coef <- c(reg1$coefficients[2], reg2$coefficients[2], reg3$coefficients[1],
          reg4$coefficients[2], reg5$coefficients[2])
}, simplify = "array")
mat_2 <- t(mat_1)
colnames(mat_2) <- c("beta1", "beta2", "beta3", "beta4", "beta5")
summary(mat_2)
##      beta1             beta2              beta3              beta4         
##  Min.   :0.01521   Min.   :-0.05270   Min.   :-0.07374   Min.   :-0.06858  
##  1st Qu.:0.10559   1st Qu.: 0.02056   1st Qu.: 0.01911   1st Qu.: 0.01934  
##  Median :0.13182   Median : 0.04734   Median : 0.04643   Median : 0.04695  
##  Mean   :0.13259   Mean   : 0.04645   Mean   : 0.04655   Mean   : 0.04644  
##  3rd Qu.:0.15817   3rd Qu.: 0.06977   3rd Qu.: 0.07328   3rd Qu.: 0.07178  
##  Max.   :0.26449   Max.   : 0.16964   Max.   : 0.16815   Max.   : 0.16847  
##      beta5          
##  Min.   :-0.07243  
##  1st Qu.: 0.02099  
##  Median : 0.04933  
##  Mean   : 0.04903  
##  3rd Qu.: 0.07591  
##  Max.   : 0.17550  
describe(mat_2)
##       vars    n mean   sd median trimmed  mad   min  max range skew kurtosis se
## beta1    1 1000 0.13 0.04   0.13    0.13 0.04  0.02 0.26  0.25 0.14    -0.01  0
## beta2    2 1000 0.05 0.04   0.05    0.05 0.04 -0.05 0.17  0.22 0.12    -0.05  0
## beta3    3 1000 0.05 0.04   0.05    0.05 0.04 -0.07 0.17  0.24 0.07    -0.17  0
## beta4    4 1000 0.05 0.04   0.05    0.05 0.04 -0.07 0.17  0.24 0.11    -0.11  0
## beta5    5 1000 0.05 0.04   0.05    0.05 0.04 -0.07 0.18  0.25 0.05    -0.16  0
ggplot(as.data.frame(mat_2), aes(x=beta1)) +
geom_density(aes(x=beta1, color="Beta1"), size = 1) +
geom_density(aes(x=beta2, color="Beta2"),size = 1) +
geom_density(aes(x=beta3, color="Beta3"), size = 1 ) +
geom_density(aes(x=beta4, color="Beta4"), size = 1 ) +
geom_density(aes(x=beta5, color="Beta5"), size = 1) +
geom_vline(xintercept=0.05) +
labs(title="kernel density plot")
[Figure: kernel density plot of the estimates beta1-beta5, with a vertical reference line at the true treatment effect 0.05.]
It can clearly be seen that something is wrong with Model 1: it overestimates the treatment effect, which was specified to be 0.05 but is on average 0.13 in this model. We have omitted variable bias (OVB) in Model 1. In this case we can determine where the OVB comes from, because the omitted variables (age and female) are observed; we only need to figure out which of them is responsible. OVB arises when two conditions are fulfilled: (1) the omitted variable is a determinant of $Y^{obs}$, meaning its coefficient is not zero; and (2) the omitted variable is correlated with the included regressor of interest, here the treatment indicator. If both conditions hold, we have OVB. We will now establish where the OVB comes from; a short sketch of the OVB decomposition is given below, followed by the full simulation.
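As a reminder of the mechanics, here is a minimal illustrative sketch (using the simulated data_generation data from above; it is not part of the required estimation) of the omitted variable bias decomposition: the short-regression coefficient equals the long-regression coefficient plus the coefficient of the omitted variable times the slope from regressing the omitted variable on treatment.

# OVB decomposition: beta_short = beta_long + gamma_age * delta(age on treat)
long  <- lm(yobs ~ treat + age, data = data_generation)   # "long" regression
short <- lm(yobs ~ treat,       data = data_generation)   # "short" regression
aux   <- lm(age ~ treat,        data = data_generation)   # auxiliary regression
bias  <- coef(long)["age"] * coef(aux)["treat"]
coef(short)["treat"]               # short-regression treatment coefficient
coef(long)["treat"] + bias         # identical, by the OVB formula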
set.seed(1000)
mat_11 <- replicate(1000, {
dff <- data.frame(matrix(ncol=0, nrow=1000));
dff$error <- rnorm(1000,0,0.55);
dff$age <- floor(46*runif(1000)+20);
dff$female <- rbinom(1000, 1, 0.5 - (0.25*log(dff$age - 19))/log(46));
dff$yi0 <- y0 + y1*dff$age + y2*dff$female + y3*dff$age * dff$female + dff$error;
dff$tau <- rnorm(1000, mean = 0.02 + 0.06*ifelse(dff$age > 43, 1, 0), sd = 0.01);
dff$treat <- rbinom(1000, 1, 0.25 + (0.5*log(dff$age - 19))/log(46));
dff$yobs <- dff$yi0 + dff$treat * dff$tau;
reg6 <- lm(yobs ~ treat, data=dff);
reg7 <- lm(yobs ~ treat + age, data=dff);
reg8 <- lm(yobs ~ treat + female, data=dff);
coef <- c(reg6$coefficients[2],reg7$coefficients[2], reg8$coefficients[2])
}, simplify = "array")
mat_22 <- t(mat_11)
colnames(mat_22) <- c("beta1", "beta2", "beta3")
summary(mat_22)
##      beta1             beta2              beta3        
##  Min.   :0.02103   Min.   :-0.07710   Min.   :0.02077  
##  1st Qu.:0.10646   1st Qu.: 0.01969   1st Qu.:0.09595  
##  Median :0.13395   Median : 0.04689   Median :0.12071  
##  Mean   :0.13359   Mean   : 0.04657   Mean   :0.12130  
##  3rd Qu.:0.16023   3rd Qu.: 0.07391   3rd Qu.:0.14738  
##  Max.   :0.25837   Max.   : 0.15132   Max.   :0.22218  
describe(mat_22)
##       vars    n mean   sd median trimmed  mad   min  max range  skew kurtosis se
## beta1    1 1000 0.13 0.04   0.13    0.13 0.04  0.02 0.26  0.24 -0.01    -0.27  0
## beta2    2 1000 0.05 0.04   0.05    0.05 0.04 -0.08 0.15  0.23 -0.02    -0.19  0
## beta3    3 1000 0.12 0.04   0.12    0.12 0.04  0.02 0.22  0.20 -0.05    -0.26  0
ggplot(as.data.frame(mat_22), aes(x=beta1)) +
geom_density(aes(x=beta1, color="Beta1"), size = 1) +
geom_density(aes(x=beta2, color="Beta2"),size = 1) +
geom_density(aes(x=beta3, color="Beta3"), size = 1 ) +
geom_vline(xintercept=0.05) +
labs(title="kernel density plot")
[Figure: kernel density plot of the estimates beta1-beta3, with a vertical reference line at the true treatment effect 0.05.]
We can see that including age as a covariate reduces the OVB substantially; the average treatment effect from beta2 is 0.04657, very close to the true treatment effect of 5%. The third regression is also interesting, because it shifts the distribution somewhat towards the true parameter of 0.05. The reason the second specification (with age) and the third (with female) differ so much is their different correlations with the treatment indicator. We can compute those:
cor(data_generation$treat, data_generation$age)
## [1] 0.1616059
cor(data_generation$treat, data_generation$female)
## [1] 0.007368619
cor(data_generation$yobs, data_generation$female)
## [1] -0.359865
cor(data_generation$yobs, data_generation$age)
## [1] 0.3136586
Age is clearly correlated with the treatment variable, which is logical because treatment assignment was defined as a function of age. For female, by contrast, there is essentially zero correlation with treat, even though it is a determinant of $Y^{obs}$. So it is age that causes the OVB, and we should include it in the model as a covariate. There is one case, however, in which we would not include the age variable: if age were itself an outcome of the model, that is, if the treatment variable were causal for age.

I will now go back and interpret the other four models (reg2 to reg5). The second regression matches the CEF of $Y_i^{obs}$, which is the correct model, and the plot shows that it yields an ATE of 5%. Model 3 is a saturated model; it is actually very similar to the second model but lacks the female variable, although much of the variation in female is picked up by the age dummies because the two are correlated. Model 4 is an interesting model because it uses inverse probability weights: we saw in Model 1 that not including age leads to OVB, while here we do not include age and yet have no OVB. The reason is that inverse probability weighting removes the confounding coming from the age-dependent assignment mechanism and therefore reduces the bias of the unweighted estimator; a quick balance check along these lines is sketched below. The last model is the saturated model with inverse probability weighting. As seen before, Model 4 already delivers an ATE of 5%, and adding the additional age dummies in Model 5 does not improve the estimation.
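To illustrate why the weighting in Models 4 and 5 removes the bias, here is a minimal balance-check sketch on the data_generation data (the logit propensity model mirrors the one used inside the simulation; weighted.mean is base R):

# Covariate balance in age before and after inverse probability weighting
pmod <- glm(treat ~ age, family = binomial(logit), data = data_generation)
p    <- predict(pmod, type = "response")
w    <- 1 / (p^data_generation$treat * (1 - p)^(1 - data_generation$treat))
tapply(data_generation$age, data_generation$treat, mean)   # unweighted: treated units are older
with(data_generation, c(
  control = weighted.mean(age[treat == 0], w[treat == 0]),
  treated = weighted.mean(age[treat == 1], w[treat == 1])
))                                                          # weighted: approximately balanced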
Section D
a)
The minimum wage raise in Ontario gives us a natural treated and control group. The treated group is the province of Ontario, which experienced a significant increase in the minimum wage; the control group consists of neighbouring provinces, or potentially other Canadian provinces. It is important to find a good control group that exhibits the same characteristics as Ontario. A first problem is the assignment to treatment, which is not determined randomly but targets low-skilled or seasonal workers who earn the minimum wage. This can lead to bias, because the treated group may not be directly comparable to the control group. Is SUTVA fulfilled? We need to ask whether there are spillover effects between the treated and control group. An increase in the minimum wage could create spillovers if low-skilled workers from neighbouring provinces moved to Ontario to profit from the increase. We also need to ask whether there are hidden variations in the treatment level. There is some variation, because liquor servers and workers under 18 received a different raise, but this was documented, so we do not have to worry about it. One further problem: the new minimum wage rates take effect on October 1 each year but are usually announced in April; in this case the increase was announced in June 2017. This opens the door to pre-emptive behaviour, for example business owners reacting to the upcoming October increase by laying off some workers already in the pre-treatment period, which could bias our treatment effect. Furthermore, we would ideally want a control group that is never treated, but every province received some increase in its minimum wage.
b)
Reading, filtering and grouping the data.
data = read_dta("lfs_2010_2019_ages1564_20per.dta")
df = data %>%
filter(agegrp >= 1, agegrp<3,
year >=2010, year<=2019,
province >=5, province<=7,
empstat >=1, empstat<=4)
dff <- df %>% group_by(province,empstat,year) %>% summarise(n = n())
dfg <- df %>% group_by(province,empstat,year,month) %>% summarise(n = n())
Let's compute the yearly employment rates and plot them as a time series.
# dff is ordered by province (5 = QUE, 6 = ONT, 7 = MAN), empstat (1-4) and year,
# so each province occupies 40 rows: 10 years for each of the 4 employment statuses.
employment_rates1 <- numeric(10)
for (i in 1:10){
  employment_rates1[i] <- (dff$n[i] + dff$n[10 + i]) /
    (dff$n[i] + dff$n[10 + i] + dff$n[20 + i] + dff$n[30 + i])
}
employment_rates2 <- numeric(10)
for (i in 1:10){
  employment_rates2[i] <- (dff$n[40 + i] + dff$n[50 + i]) /
    (dff$n[40 + i] + dff$n[50 + i] + dff$n[60 + i] + dff$n[70 + i])
}
employment_rates3 <- numeric(10)
for (i in 1:10){
  employment_rates3[i] <- (dff$n[80 + i] + dff$n[90 + i]) /
    (dff$n[80 + i] + dff$n[90 + i] + dff$n[100 + i] + dff$n[110 + i])
}
employment_ratesc <- data.frame(matrix(ncol=0, nrow=10))
employment_ratesc$QUE <- employment_rates1
employment_ratesc$ONT <- employment_rates2
employment_ratesc$MAN <- employment_rates3
empy <- ts(employment_ratesc, frequency = 1, start = 2010)
dygraph(empy) %>%
dyOptions(axisLineWidth = 2.0)
[Figure: yearly employment rates 2010-2019 for QUE, ONT and MAN (dygraph time series).]
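The same yearly rates can also be computed without manual index arithmetic; the following optional sketch uses a grouped dplyr summary (assuming, as above, that empstat 1 and 2 count as employed):

# Yearly employment rates by province via a grouped summary (sketch)
rates_yearly <- df %>%
  group_by(province, year) %>%
  summarise(emp_rate = mean(empstat <= 2), .groups = "drop")
head(rates_yearly)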
Now we will compute the monthly employment rates and plot them.
# dfg is ordered by province, empstat and year-month, so each province occupies
# 480 rows: 120 months for each of the 4 employment statuses.
employment_ratesque <- numeric(120)
for (i in 1:120){
  employment_ratesque[i] <- (dfg$n[i] + dfg$n[120 + i]) /
    (dfg$n[i] + dfg$n[120 + i] + dfg$n[240 + i] + dfg$n[360 + i])
}
employment_ratesont <- numeric(120)
for (i in 1:120){
  employment_ratesont[i] <- (dfg$n[480 + i] + dfg$n[600 + i]) /
    (dfg$n[480 + i] + dfg$n[600 + i] + dfg$n[720 + i] + dfg$n[840 + i])
}
employment_ratesman <- numeric(120)
for (i in 1:120){
  employment_ratesman[i] <- (dfg$n[960 + i] + dfg$n[1080 + i]) /
    (dfg$n[960 + i] + dfg$n[1080 + i] + dfg$n[1200 + i] + dfg$n[1320 + i])
}
monthly_emp_rates <- data.frame(matrix(ncol=0, nrow=120))
monthly_emp_rates$QUE <- employment_ratesque
monthly_emp_rates$ONT <- employment_ratesont
monthly_emp_rates$MAN <- employment_ratesman
empmonth <- ts(monthly_emp_rates, frequency = 12, start = c(2010,1))
dygraph(empmonth) %>%
dyOptions(axisLineWidth = 2.0)
[Figure: monthly employment rates 2010-2019 for QUE, ONT and MAN (dygraph time series).]
Looking at the first graph, we have to say that the parallel trends assumption does not really hold. From 2010 to 2013 it does not look bad, but from 2013 to 2014, and also from 2012 to 2013, Manitoba follows a different trend. For Quebec parallel trends does not hold particularly well either; especially in 2017, the year before the treatment, we see a large difference in trends. If we switch to the more detailed second graph, where we plot monthly employment rates, we can observe that both provinces follow trends similar to Ontario's. All three provinces show large upward spikes in employment rates, the first in May, the second in June and the third in July, followed by downward spikes in August and September. The other months do not look perfect, because there are some deviations, but they are not severe, so we can argue that parallel trends approximately holds; perfectly parallel trends are essentially impossible. Youth employment rises significantly in May, June and July, drops a little in August, and by September returns to the same level as in April. The spike in the employment rate is caused by the summer break in Canada, which starts in June. There is also a mid-winter break in Canada, but its timing varies (sometimes March, sometimes April) and there are no significant changes in employment due to it. The overall trend we observe is clearly positive: the employment rate grows slightly over the years.
c)
Here I decided to use Quebec as the control group. I think Quebec can be a good control group because, as we saw in the monthly employment graph, Quebec and Ontario follow a similar trend in employment. I also tried to match on specific characteristics such as population: Quebec is the second biggest province while Ontario is the biggest. Furthermore, Ontario has 917,741 square kilometres of land and Quebec has 1,356,128 square kilometres, and most of its population lives in urban areas. I decided against other control groups because, for example, Manitoba has a population of only around 1.3 million while still covering a large area; it consists largely of landscapes and forests with less economic activity and relies heavily on agriculture. It is also important to note that Quebec received a minimum wage increase in 2018 as well, which can make it a poor control group, because we would ideally want a never-treated control group, but none of the provinces fulfils this requirement. Manitoba's minimum wage increase was minimal, but in my opinion its characteristics are simply not suitable, because, as already said, it relies mostly on agriculture and tourism. As a covariate I included edugrp, because it is correlated with the outcome and the regressor and is not itself an outcome of the model. Furthermore, I decided to include the family type (efamtype), because it is also correlated with our outcome and the regressor.
dv <- data %>% filter(year >=2017, year<=2018,
province == 6 | province == 5,
agegrp >= 1, agegrp<3)
dv$yobs <- ifelse(dv$empstat<=2, 1, 0)
dv$ontariodummy <- ifelse(dv$province == 6, 1, 0)
dv$timedummy <- ifelse(dv$year == 2018,1,0)
dv$month <- as.factor(dv$month)
reg11 <- lm(yobs~ontariodummy*timedummy, data=dv)
reg22 <- lm(yobs~ontariodummy*timedummy + month , data=dv)
reg33 <- lm(yobs~ontariodummy*timedummy + month + edugrp + efamtype , data=dv)
cov(dv$efamtype, dv$ontariodummy*dv$timedummy)
## [1] 0.08276152
coef <- c(reg11$coefficients[4], reg22$coefficients[15], reg33$coefficients[17])
mat <- matrix(coef)
tmat <- t(mat)
colnames(tmat) <- c("beta1", "beta2", "beta3")
table(tmat)
## tmat
## -0.0456354130768801 -0.0456127081540193 -0.0420036810225033 
##                   1                   1                   1 
We can see that there is a clear difference once the covariates are included, especially family type, whose covariance with the treatment interaction is comparatively high. After including family type we see a noticeable decline in our beta, which may indicate a failure of (unconditional) parallel trends.
d)
db <- data %>% filter(year>=2014, year<=2019,
province == 6 | province == 5,
agegrp == 1 | agegrp == 2)
db$yobs <- ifelse(db$empstat<=2, 1, 0)
db$ontariodummy <- ifelse(db$province == 6,1,0)
db$dynamic1 <- ifelse(db$year == 2014, 1 ,0)
db$dynamic2 <- ifelse(db$year == 2015, 1, 0)
db$dynamic3 <- ifelse(db$year == 2016, 1, 0)
db$dynamic4 <- ifelse(db$year == 2018, 1, 0)
db$dynamic5 <- ifelse(db$year == 2019, 1, 0)
db$month <- as.factor(db$month)
reg100 <- lm(yobs ~ ontariodummy*(dynamic1 + dynamic2 + dynamic3 + dynamic4+ dynamic5) + month, data=db)
matdyn <- data.frame(matrix(ncol=0, nrow=6))
matdyn$year <- 2014:2019
reg100$coefficients
##           (Intercept)          ontariodummy              dynamic1 
##           0.541158416          -0.038948549          -0.004995016 
##              dynamic2              dynamic3              dynamic4 
##           0.012667454           0.010348675           0.046227427 
##              dynamic5                month2                month3 
##           0.054705267           0.001172655           0.009290986 
##                month4                month5                month6 
##           0.004596648           0.058852933           0.104667802 
##                month7                month8                month9 
##           0.140261242           0.118475538           0.029928940 
##               month10               month11               month12 
##           0.032917646           0.022443061           0.028307612 
## ontariodummy:dynamic1 ontariodummy:dynamic2 ontariodummy:dynamic3 
##          -0.009550772          -0.035321280          -0.021708430 
## ontariodummy:dynamic4 ontariodummy:dynamic5 
##          -0.045609421          -0.061449180 
matdyn$beta <- c(coef(reg100)[19], coef(reg100)[20], coef(reg100)[21], 0,
                 coef(reg100)[22], coef(reg100)[23])
matdyn$se   <- c(coef(summary(reg100))[19, 2], coef(summary(reg100))[20, 2],
                 coef(summary(reg100))[21, 2], 0,
                 coef(summary(reg100))[22, 2], coef(summary(reg100))[23, 2])
matdyn$conf <- qnorm(0.975)*matdyn$se
matdyn
##   year         beta         se       conf
## 1 2014 -0.009550772 0.01155037 0.02263830
## 2 2015 -0.035321280 0.01177449 0.02307757
## 3 2016 -0.021708430 0.01186172 0.02324855
## 4 2017  0.000000000 0.00000000 0.00000000
## 5 2018 -0.045609421 0.01209383 0.02370346
## 6 2019 -0.061449180 0.01200575 0.02353083
summary(reg100)
## 
## Call:
## lm(formula = yobs ~ ontariodummy * (dynamic1 + dynamic2 + dynamic3 + 
##     dynamic4 + dynamic5) + month, data = db)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7361 -0.5305  0.3542  0.4503  0.5204 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            0.541158   0.008591  62.990  < 2e-16 ***
## ontariodummy          -0.038949   0.008439  -4.615 3.93e-06 ***
## dynamic1              -0.004995   0.009235  -0.541 0.588576    
## dynamic2               0.012667   0.009385   1.350 0.177090    
## dynamic3               0.010349   0.009451   1.095 0.273523    
## dynamic4               0.046227   0.009635   4.798 1.61e-06 ***
## dynamic5               0.054705   0.009549   5.729 1.01e-08 ***
## month2                 0.001173   0.007955   0.147 0.882804    
## month3                 0.009291   0.007973   1.165 0.243928    
## month4                 0.004597   0.007947   0.578 0.562990    
## month5                 0.058853   0.007972   7.383 1.56e-13 ***
## month6                 0.104668   0.007968  13.136  < 2e-16 ***
## month7                 0.140261   0.007997  17.539  < 2e-16 ***
## month8                 0.118476   0.008012  14.787  < 2e-16 ***
## month9                 0.029929   0.007995   3.743 0.000182 ***
## month10                0.032918   0.007998   4.116 3.86e-05 ***
## month11                0.022443   0.007973   2.815 0.004878 ** 
## month12                0.028308   0.008013   3.533 0.000411 ***
## ontariodummy:dynamic1 -0.009551   0.011550  -0.827 0.408307    
## ontariodummy:dynamic2 -0.035321   0.011774  -3.000 0.002702 ** 
## ontariodummy:dynamic3 -0.021708   0.011862  -1.830 0.067235 .  
## ontariodummy:dynamic4 -0.045609   0.012094  -3.771 0.000163 ***
## ontariodummy:dynamic5 -0.061449   0.012006  -5.118 3.09e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4927 on 91018 degrees of freedom
## Multiple R-squared:  0.01399,    Adjusted R-squared:  0.01376 
## F-statistic: 58.72 on 22 and 91018 DF,  p-value: < 2.2e-16
ggplot(matdyn, aes(x = year, y = beta))+
geom_errorbar(aes(ymin = beta - conf, ymax = beta + conf), width = 0.2)+
geom_line(col="red")+
geom_point(col="blue")+
geom_vline(xintercept = 2017.5, linetype = "dotted")
[Figure: event-study coefficients (beta) with 95% confidence intervals by year, 2014-2019; dotted vertical line at 2017.5 marks the treatment.]
The policy impact seems reasonable. After such a large minimum wage increase we expect the employment rate to decline. In fact, when Ontario raised its minimum wage many disabled workers lost their jobs and many daycare centres closed, because the cost of childcare increased by 10.6%, which lowered the demand for childcare. Businesses that mostly employ low-skilled labour also had to shut down.¹

The parallel trends assumption does not hold here. The pre-treatment coefficients should be flat at zero if parallel trends held, and this is clearly not the case: although the confidence intervals in 2014 and 2016 include zero, the coefficient in 2015 clearly does not. In the second graph in b) it looked as if the parallel trends assumption could hold, because the differences between the treatment and control group appeared constant; both had spikes and dips at the same time and of similar magnitude. But is this really a failure of parallel trends, or is it pre-emptive behaviour? We can see that after the policy was announced in June 2017 the employment rate decreased noticeably, so there is very likely some pre-emptive behaviour involved from 2017 to 2018, which contaminates our estimate.
e)
Now, with monthly data, we need a new parallel trends assumption. We could use a conditional parallel trends assumption, where we assume parallel trends conditional on the calendar month; in practice we include month fixed effects to account for the seasonality.
f)
We could use the no pre-emptive behaviour (no-anticipation) condition from the lecture slides: $E[Y_{it} \mid D_i = 1, t = t_0 + j] = E[Y_{it}(0) \mid D_i = 1, t = t_0 + j]$ for pre-treatment periods ($j < 0$), i.e. treated units do not adjust their outcomes before the treatment actually starts.
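One way to probe this together with the pre-trends is a joint test that the pre-treatment interaction coefficients in reg100 are zero. A possible sketch uses linearHypothesis() from the car package (an additional package that is not loaded elsewhere in this assignment):

library(car)   # assumed available; provides linearHypothesis()
# Joint test that the 2014-2016 (pre-treatment) event-study coefficients are zero
linearHypothesis(reg100, c("ontariodummy:dynamic1 = 0",
                           "ontariodummy:dynamic2 = 0",
                           "ontariodummy:dynamic3 = 0"))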
g)
We could split the sample and use the older population, aged 24-60, as a control group. The minimum wage increase mostly affects employment among young, unskilled and seasonal workers, so the increase should not affect employment rates in the higher age groups. The remaining sample can be used to trace out the counterfactual, and at the end we would run a DiD estimation and could estimate the average treatment effect.
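A possible implementation sketch of this idea, reusing the coding conventions from b) and c) (the cut-off agegrp <= 2 for the young group follows the rest of the assignment; the assumption that older workers correspond to agegrp >= 3 would need to be checked against the LFS codebook):

# Within-Ontario DiD using older workers as the comparison group (sketch)
don <- data %>%
  filter(year >= 2017, year <= 2018, province == 6)
don$yobs      <- ifelse(don$empstat <= 2, 1, 0)
don$young     <- ifelse(don$agegrp <= 2, 1, 0)   # young = "treated" age group
don$timedummy <- ifelse(don$year == 2018, 1, 0)
reg_age <- lm(yobs ~ young * timedummy, data = don)
coef(reg_age)["young:timedummy"]                 # DiD estimate for youth employment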
h)
I am not fully convinced by the estimates in this exercise, because we have seen that parallel trends is questionable here, which leads to a biased estimator; but I think the outcome is still plausible, and I provide some arguments for that. The minimum wage raise was very large, and the article I cited states that employment of unskilled workers suffered heavily and that businesses that mostly employed low-skilled workers had to either close or raise their prices. Even if minimum wage increases had no impact on employment, they would still change the economy, as seen in the article, where prices for childcare and housekeeping rose by 10.6%. Furthermore, we have to question whether the findings of the existing minimum wage studies are really comparable with the outcomes in Ontario. Card and Krueger (1994) analyse New Jersey's minimum wage increase from 4.25 to 5.05 dollars, an 18.9% increase, and find no significant employment effects in the fast food industry. The Ontario increase was larger, about 20%. So do we think that the findings from Card and Krueger have external validity here? Probably not: Canadian provinces differ significantly from New Jersey, so we cannot generalise the findings of that study to our situation in Ontario. Even if my estimate is somewhat off because of the questionable parallel trends assumption, the true coefficient can still be negative.²
Section E
In this section I did not code the exercise in R, because R produced incorrect outputs for me; I therefore executed the code in Stata and provide only the arguments here. I hope this is okay.
In model 1, NFL receives a very high weight, which seems quite odd, because Newfoundland and Labrador is one of the smallest provinces in Canada, with financial services, oil and manufacturing as its important industries. We are essentially constructing a control group that consists mostly of NFL, which is not appropriate from economic intuition alone. Inspecting the graph, we can also see that it barely matches our treated group.

In the second model we get a weight of 1 on NB, which means the whole control group is just New Brunswick. That also seems unreasonable, since NB is similar to NFL: one of the smallest provinces in Canada, with a dominant primary sector. In conclusion, the second specification is poor, because we would not want to choose NB as the control group.

In the third model we get a very interesting graph, because the trend of our synthetic control group seems to match the trend of Ontario. We can also see the three weights on NFL, PEI and BC, provinces I would not have chosen as a control group myself, but together they seem to match the covariates well.

I still decided to stay with Quebec as my control group; the graphs did not match the trend of Ontario well enough to convince me to switch. In general, though, it can be a good idea to compute a synthetic control group and then use the weights to determine which province best matches Ontario's outcome or covariates; one can then choose that control group in the DiD setting. The synthetic control method should definitely be considered when there are problems with the parallel trends assumption: by weighting multiple units it generally produces a better pre-treatment fit, whereas the DiD estimate would be biased in this scenario because the parallel trends assumption does not hold. A rough illustration of how such weights can be computed is sketched below.
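To make the weighting idea concrete, here is a minimal, self-contained sketch (simulated pre-treatment series, not the Stata specification used for the actual exercise) that picks non-negative donor weights summing to one by minimising the pre-treatment mean squared error, using a softmax parametrisation and optim():

# Synthetic control weights as constrained least squares (illustrative sketch)
set.seed(42)
T0      <- 20                                    # pre-treatment periods
donors  <- matrix(rnorm(T0 * 4, mean = 0.55, sd = 0.02), ncol = 4,
                  dimnames = list(NULL, c("QUE", "MAN", "NB", "NFL")))
ontario <- 0.6 * donors[, "QUE"] + 0.4 * donors[, "MAN"] + rnorm(T0, sd = 0.005)
obj <- function(theta) {                         # softmax keeps weights >= 0 and summing to 1
  w <- exp(theta) / sum(exp(theta))
  mean((ontario - donors %*% w)^2)
}
fit <- optim(rep(0, ncol(donors)), obj, method = "BFGS")
w   <- exp(fit$par) / sum(exp(fit$par))
round(setNames(w, colnames(donors)), 2)          # most weight should fall on QUE and MAN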
¹ Matthew Lau, "Ontario's Minimum Wage Hike Has Been Disastrous, Especially for Disabled Workers," 2018.
² Card, D. and Krueger, A. B. (1994), "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania," American Economic Review 84(4), 772-793.