ESTIMATING THE DOSE-RESPONSE
FUNCTION THROUGH THE GLM APPROACH
Barbara Guardabascio, Marco Ventura
Italian National Institute of Statistics
7th June 2013, Potsdam
Outline of the talk
• motivations;
• literature references;
• our contribution to the topic;
• the econometrics of the dose-response;
• how to implement the dose-response;
• our programs;
• applications.
Motivations
• Main question:
How effective are public policy programs with continuous treatment exposure?
• Fundamental problem:
Treated individuals are self-selected: treatment is not randomly assigned.
• (Possible) solution:
Estimate a dose-response function.
Motivations
• What is a dose-response function?
It is the relationship between the treatment level and an outcome variable, e.g. birth weight, employment, bank debt, etc.
[Figure: two panels plotted against the treatment level (0 to 10). "Dose Response Function": E[year6(t)]. "Treatment Effect Function": E[year6(t+1)]-E[year6(t)]. Each panel shows the estimate with lower and upper confidence bounds at the 95% level. Dose-response function = linear prediction.]
Motivations
• How can we estimate a dose-response function?
It can be estimated by using the Generalized Propensity Score (GPS).
Literature references
1. Propensity Score for binary treatments:
Rosenbaum and Rubin (1983, 1984)
2. For categorical treatment variables:
Imbens (2000), Lechner (2001)
3. Generalized Propensity Score for continuous treatments:
Hirano and Imbens (2004), Imai and van Dyk (2004)
Our contribution
• Ad hoc programs are already available to Stata users (Bia and Mattei, 2008): gpscore.ado and doseresponse.ado.
However, these programs only allow for a Normal distribution of the treatment variable.
• We provide new programs that accommodate other, non-Normal distributions: gpscore2.ado and doseresponse2.ado.
The econometrics of the dose-response
• $\{Y_i(t)\}_{t \in \mathcal{T}}$: set of potential outcomes for unit $i$,
• where $\mathcal{T}$ is the set of potential treatments over $[t_0, t_1]$.
The econometrics of the dose-response
Suppose we have:
• $N$ individuals, $i = 1, \dots, N$;
• $X_i$: vector of pre-treatment covariates;
• $T_i$: level of treatment delivered;
• $Y_i(T_i)$: outcome corresponding to the treatment $T_i$.
The econometrics of the dose-response
• We want the average dose-response function
$\mu(t) = E[Y_i(t)]$
• Hirano and Imbens define the GPS as the conditional density of the actual treatment given the covariates:
$R = r(T, X)$, where $r(t, x) = f_{T \mid X}(t \mid x)$
The econometrics of the dose-response
• Balancing property:
$X \perp 1\{T = t\} \mid r(t, x)$
Within strata with the same $r(t, x)$, the probability that $T = t$ does not depend on $X$.
The econometrics of the dose-response
• If weak unconfoundedness holds we have
$Y(t) \perp T \mid X \quad \forall t \in \mathcal{T}$
This means that the GPS can be used to eliminate any bias associated with differences in the covariates and …
The econometrics of the dose-response
• The dose-response function can be computed as:
$\beta(t, r) = E[Y(t) \mid r(t, X) = r] = E[Y \mid T = t, R = r]$
$\mu(t) = E[\beta(t, r(t, X))]$
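The two equalities on this slide can be justified in a few lines. The sketch below assumes the notation of Hirano and Imbens (2004), writing $\beta(t, r)$ for the conditional expectation of the outcome and $\mu(t)$ for the dose-response function, and uses their result that treatment assignment is unconfounded given the GPS.

```latex
\begin{align*}
\beta(t, r) &= E[Y(t) \mid r(t, X) = r] \\
            &= E[Y(t) \mid T = t,\ r(T, X) = r]
               && \text{(unconfoundedness given the GPS)} \\
            &= E[Y \mid T = t,\ R = r]
               && \text{(since } Y = Y(t) \text{ when } T = t\text{)} \\[4pt]
E[\beta(t, r(t, X))] &= E\big[\, E[Y(t) \mid r(t, X)] \,\big]
             = E[Y(t)] = \mu(t)
               && \text{(iterated expectations)}
\end{align*}
```

Note that the averaging in the last line is over the score evaluated at the treatment level of interest, $r(t, X)$, not over the observed score $R = r(T, X)$.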
How to implement the GPS
• The dose-response function can be estimated in 3 steps.
FIRST STEP:
1. Regress $T_i$ on $X_i$ and take the conditional distribution of the treatment given the covariates, $T_i \mid X_i$.
How to implement the GPS
$f(T_i) \mid X_i \sim D(\beta' X_i, \sigma^2)$
where
• $f(\cdot)$ is a suitable transformation of $T$ (link);
• $D$ is a distribution of the exponential family;
• $\beta$ are the parameters to be estimated;
• $\sigma$ is the conditional SE of $T \mid X$.
How to implement the GPS
GPS: $\hat{R}_i = D(T_i, \hat{\beta}' X_i, \hat{\sigma}^2)$
1a. Test the balancing property
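As an illustration of the first step, here is a minimal Python sketch for one family/link pair, Poisson with a log link. It is not the gpscore2 implementation: the Newton-Raphson fitter, the single covariate, the function names and the data are all assumptions made for the example. For a discrete family the "density" evaluated at the observed treatment is a probability mass.

```python
import math

def fit_poisson_glm(x, t, iters=50, tol=1e-10):
    """Fit E[T | X = x] = exp(b0 + b1 * x) by Newton-Raphson.

    Hypothetical helper for illustration; one covariate plus intercept.
    """
    b0, b1 = math.log(sum(t) / len(t)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum of (t_i - mu_i) * (1, x_i)
        g0 = sum(ti - mi for ti, mi in zip(t, mu))
        g1 = sum((ti - mi) * xi for ti, mi, xi in zip(t, mu, x))
        # Fisher information: sum of mu_i * (1, x_i)(1, x_i)'
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu, computed on the log scale."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def gps(t_value, x_value, b0, b1):
    """r(t, x): fitted conditional probability of treatment level t given x."""
    return poisson_pmf(t_value, math.exp(b0 + b1 * x_value))

# Invented data: x is a covariate, t a count-valued treatment.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 2.5]
t = [1, 1, 2, 4, 6, 9, 2, 3]

b0, b1 = fit_poisson_glm(x, t)
# Estimated GPS: the fitted distribution evaluated at each observed treatment.
scores = [gps(ti, xi, b0, b1) for ti, xi in zip(t, x)]
```

The same `gps` function, evaluated at an arbitrary treatment level rather than the observed one, is what the third step needs.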
How to implement the GPS
SECOND STEP:
Model the conditional expectation $E[Y_i \mid T_i, R_i]$ as a function of $T_i$ and $R_i$:
$\beta(T_i, R_i) = E[Y_i \mid T_i, R_i] = \alpha_0 + \alpha_1 T_i + \alpha_2 T_i^2 + \alpha_3 R_i + \alpha_4 R_i^2 + \alpha_5 T_i R_i$
How to implement the GPS
THIRD STEP:
Estimate the dose-response function by averaging the estimated conditional expectation over the GPS at each level of the treatment we are interested in:
$\hat{\mu}(t) = \frac{1}{N} \sum_{i=1}^{N} \hat{\beta}(t, \hat{r}(t, X_i))$
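The second and third steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the doseresponse2 code: the second-step coefficients are taken as already estimated, the first-step GPS is a hypothetical Normal density, and all numerical values and function names are invented. The last function computes the treatment effect function shown in the figures, E[Y(t+delta)] - E[Y(t)].

```python
import math

def normal_gps(t, x, b0, b1, sigma):
    """Hypothetical first-step GPS: Normal density of t around b0 + b1 * x."""
    z = (t - (b0 + b1 * x)) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def beta_hat(t, r, alpha):
    """Second step: quadratic approximation of E[Y | T = t, R = r]."""
    a0, a1, a2, a3, a4, a5 = alpha
    return a0 + a1 * t + a2 * t * t + a3 * r + a4 * r * r + a5 * t * r

def dose_response(t, xs, alpha, gps):
    """Third step: average beta_hat(t, r(t, X_i)) over all units i."""
    return sum(beta_hat(t, gps(t, x), alpha) for x in xs) / len(xs)

def treatment_effect(t, delta, xs, alpha, gps):
    """Treatment effect function: mu(t + delta) - mu(t)."""
    return (dose_response(t + delta, xs, alpha, gps)
            - dose_response(t, xs, alpha, gps))

def gps_example(t, x):
    """GPS with invented first-step parameters, for illustration only."""
    return normal_gps(t, x, b0=1.0, b1=2.0, sigma=1.5)

# Invented covariate values and second-step coefficients (alpha_0 .. alpha_5).
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
alpha = (10.0, 2.0, -0.1, 5.0, -1.0, 0.3)

# Dose-response and treatment effect evaluated on a grid of treatment levels.
curve = [(t, dose_response(t, xs, alpha, gps_example)) for t in range(0, 11, 2)]
effects = [(t, treatment_effect(t, 1.0, xs, alpha, gps_example))
           for t in range(0, 10, 2)]
```

The GPS is re-evaluated at each grid value of t for every unit, so the average runs over r(t, X_i) rather than over the observed scores.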
How to implement the GPS
• Where is the novelty?
In the FIRST STEP:
• instead of ML estimation we use a GLM,
• that is, a distribution of the exponential family
• combined with a link function.
our programs
Link \ Distr.   Normal   Inv. Normal   Binomial   Poisson   Neg. Binomial   Gamma
Identity          X          X            X          X            X           X
Log               X          X            X          X            X           X
Logit                                     X
Probit                                    X
Cloglog                                   X
Power             X          X            X          X            X           X
Opower                                    X
Nbin                                                              X
Loglog                                    X
Logc                                      X
our programs
We have written two programs:
• doseresponse2.ado:
estimates the dose-response function and graphs the result.
It carries out steps 1, 2 and 3 of the previous slides by running 2 other programs.
our programs
• gpscore2.ado:
evaluates the GPS under 6 different distributional assumptions (step 1 of the previous slides).
• doseresponse_model.ado:
carries out step 2 of the previous slides.
our programs
doseresponse2 varlist , outcome(varname) t(varname)
family(string) link(string) gpscore(newvarname)
predict(newvarname) sigma(newvarname)
cutpoints(varname) nq_gps(#) index(string)
dose_response(newvarlist)
Options
t_transf(transformation) normal_test(test) normal_level(#)
test_varlist(varlist) test(type) flag(#) cmd(regression_cmd)
reg_type_t(string) reg_type_gps(string) interaction(#)
t_points(vector) npoints(#) delta(#) bootstrap(string)
filename(filename) boot_reps(#) analysis(string)
analysis_level(#) graph(filename) flag_b(#) opt_nb(string)
opt_b(varname) detail
our programs
gpscore2 varlist , t(varname) family(string) link(string)
gpscore(newvarname) predict(newvarname)
sigma(newvarname) cutpoints(varname) index(string)
nq_gps(#)
Options
t_transf(transformation) normal_test(test) normal_level(#)
test_varlist(varlist) test(type) flag_b(#) opt_nb(string)
opt_b(varname) detail
Application
Data set by Imbens, Rubin and Sacerdote (2001):
the winners of a lottery in Massachusetts.
• $T_i$: amount of the prize (treatment)
• $Y_i$: earnings 6 years after winning (outcome)
• $X_i$: age, gender, education, number of tickets bought, working status, earnings before winning (up to 6 years)
Application: flogit
Fractional data: flogit model.
Treatment: prize/max(prize)
Outcome: earnings 6 years after winning
family(binomial) link(logit)
Application: flogit
[Figure: two panels plotted against the treatment level (0 to .8). "Dose Response Function": E[year6(t)]. "Treatment Effect Function": E[year6(t+.1)]-E[year6(t)]. Each panel shows the estimate with lower and upper confidence bounds at the 95% level. Dose-response function = linear prediction.]
Application: count data
Count data: Poisson model.
Treatment: years of college + high school
Outcome: earnings 6 years after winning
family(poisson) link(log)
Application: count data
[Figure: two panels plotted against the treatment level (0 to 10). "Dose Response Function": E[year6(t)]. "Treatment Effect Function": E[year6(t+1)]-E[year6(t)]. Each panel shows the estimate with lower and upper confidence bounds at the 95% level. Dose-response function = linear prediction.]
Application: gamma distribution
Gamma distribution:
Treatment: age
Outcome: earnings 6 years after winning
family(gamma) link(log)
Application: gamma distribution
[Figure: two panels plotted against the treatment level (0 to 80). "Dose Response Function": E[year6(t)]. "Treatment Effect Function": E[year6(t+1)]-E[year6(t)]. Each panel shows the estimate with lower and upper confidence bounds at the 95% level. Dose-response function = linear prediction.]