Non-Experimental Evidence
E. Michael Foster
School of Public Health,
University of North Carolina &
Methodology Center,
Pennsylvania State University
& Conduct Problems
Prevention Research Group
Background
Data: Fast Track Project
Methods
– Why not regression?
– Propensity scores and matching
– Doubly robust estimation
Results
Youth with mental health problems are at greater risk of juvenile justice (JJ) involvement
Juvenile justice involvement may harm mental health
A variety of policy initiatives aim to link the juvenile justice system with the delivery of mental health services
Model programs exist that can reduce delinquency (e.g., Multisystemic Therapy, MST)
But what about the “real world”?
The answer is “it depends”.
Heckman and colleagues (1997+) identify several key factors
Are the covariates (for matching or adjusting) measured in the same way? With the same (good) reliability?
Are the different groups in the same “market” or site?
Are there unmeasured confounders?
10-year intervention project to prevent chronic conduct disorder in high-risk youth
Schools randomly assigned to intervention & control conditions
Community-level, school-level, family-level, child-level data
Parental report of mental health services (in-patient and out-patient)
3 cohorts in poor areas of 4 sites (3 urban, 1 rural)
High-risk youth:
– Multi-stage screening involving parents and teachers
– Generally top 20% in terms of combined risk
– Intervention group (n=445)
– Comparison group (n=446)
Randomly sampled youth (control schools) (n=308)
Work hard to avoid relying on linear regression, so as not to extrapolate across groups
Application
– Outcome: parental report of arrests in grades 9 or 10.*
– Predicted by service use in grades 6, 7 or 8
– Individuals matched based on characteristics in grade 6 and earlier
Problems with regression
[Figure: plot against x (axis ticks -10, 0, 10, 20) with control and tx series; repeated on two slides]
Propensity scores as an alternative
Avoid restrictions of the linear model in estimating both the propensity score and the outcome model
Careful checking of balance of covariates
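One common way to check covariate balance is the standardized mean difference, computed for each covariate before and after matching. The sketch below is illustrative only: the function name and the scores are hypothetical (not Fast Track data), and the slides do not say which balance statistic was actually used.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(treated, comparison):
    """Gap in group means divided by the pooled standard deviation.

    Assumes the covariate varies in at least one group (pooled SD > 0).
    """
    pooled_sd = sqrt((variance(treated) + variance(comparison)) / 2)
    return (mean(treated) - mean(comparison)) / pooled_sd

# Hypothetical behavior-problem scores, for illustration only
users = [3.0, 4.0, 5.0, 6.0]
nonusers_all = [1.0, 2.0, 3.0, 4.0]       # every non-user
nonusers_matched = [3.0, 4.0, 5.0, 5.0]   # non-users kept after matching

before = standardized_difference(users, nonusers_all)
after = standardized_difference(users, nonusers_matched)
print(f"before matching: {before:.2f}, after matching: {after:.2f}")
```

A value near zero after matching suggests the groups are comparable on that covariate; the check would be repeated for every covariate in the propensity model.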
Estimate propensity scores [P(used services)] using neural networks
– Problems in academic, social, peer and home domains (years 5 and 6)
– Family demographics (mother’s age at first birth and education, biological dad in household) (baseline)
Use the pscores to match individuals (rather than as a weight or covariate)
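To make the propensity score step concrete, here is a minimal sketch that estimates P(used services) from covariates. Note the substitution: the slides use neural networks, while this sketch swaps in a plain logistic regression fitted by gradient ascent so that it stays short and dependency-free. All names and data are hypothetical.

```python
from math import exp

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def linear_predictor(weights, row):
    # weights[0] is the intercept; the rest pair up with the covariates
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))

def fit_propensity(X, treated, lr=0.1, epochs=2000):
    """Fit P(treated = 1 | X) by gradient ascent on the logistic log-likelihood."""
    n, k = len(X), len(X[0])
    weights = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, t in zip(X, treated):
            err = t - sigmoid(linear_predictor(weights, row))
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

def propensity_scores(X, weights):
    return [sigmoid(linear_predictor(weights, row)) for row in X]

# Hypothetical covariates: [problem count, biological dad in household]
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0], [5.0, 0.0]]
used_services = [0, 1, 0, 0, 1, 1]

weights = fit_propensity(X, used_services)
pscores = propensity_scores(X, weights)
print([round(p, 2) for p in pscores])
```

The resulting scores are used only to find comparable non-users, not as regression weights or covariates, in line with the approach described above.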
Refine matching based on key variables
– Parent and teacher reports of behavior problems at baseline
– Parental report of police contact at year 7
– Diagnosis of conduct disorder at years 4 or 7
Exact matching required for key variables
– Race (black v. other)
– Gender
– Site
Matching done with replacement
(Better matching units used repeatedly.)
Non-matching units discarded
Finally, the matching covariates are also included as covariates in the analysis of outcomes (“doubly robust”)
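The matching steps above can be sketched as nearest-neighbor matching on the propensity score, with replacement, within cells defined by the exact-match variables. This is an illustration under assumptions, not the project's procedure: the caliper, the function names, and the records are hypothetical, and the refinement on behavior reports, police contact, and conduct disorder diagnosis is omitted for brevity.

```python
from collections import Counter

def match_with_replacement(users, nonusers, exact_keys, caliper=0.1):
    """Pair each user with the nearest-pscore non-user in the same exact-match cell.

    Returns (pairs, reuse_counts). Users with no non-user within the caliper
    are dropped; non-users never chosen are absent from reuse_counts.
    """
    pairs = []
    reuse_counts = Counter()
    for u in users:
        pool = [c for c in nonusers if all(c[k] == u[k] for k in exact_keys)]
        if not pool:
            continue  # no non-user shares this race/gender/site cell
        best = min(pool, key=lambda c: abs(c["pscore"] - u["pscore"]))
        if abs(best["pscore"] - u["pscore"]) <= caliper:
            pairs.append((u["id"], best["id"]))
            reuse_counts[best["id"]] += 1  # with replacement: may be reused
    return pairs, reuse_counts

def person(pid, race, gender, site, pscore):
    return {"id": pid, "race": race, "gender": gender, "site": site, "pscore": pscore}

# Hypothetical records, for illustration only
users = [
    person(1, "black", "M", "A", 0.60),
    person(2, "black", "M", "A", 0.62),
    person(3, "other", "F", "B", 0.40),
    person(4, "other", "F", "B", 0.90),
]
nonusers = [
    person(10, "black", "M", "A", 0.58),
    person(11, "black", "M", "A", 0.30),
    person(12, "other", "F", "B", 0.45),
    person(13, "other", "F", "A", 0.40),
]

pairs, reuse_counts = match_with_replacement(users, nonusers, ["race", "gender", "site"])
print(pairs)
print(dict(reuse_counts))
```

In this toy example one non-user serves as the match for two users, which is why the reuse counts are kept: they become weights in the outcome analysis.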
Basic Descriptives
Provide matched and adjusted comparisons
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        serv |       740    .3608108    .4805606          0          1
        diag |       740    .1675676    .3737344          0          1
      arrest |       740    .0662162    .2488278          0          1
Unadjusted Relationship Among Unmatched Cases
[Figure: bar chart by service use (Did not use services, Used services); values shown: 0.14, 0.03]
Unadjusted Relationship Among Unmatched Cases
[Figure: bar chart by conduct disorder diagnosis (No DX, CD DX at years 4 or 7); values shown: 0.76, 0.28]
270 non-users did not match a user
50 of the remaining 203 non-users were used multiple times (generally twice)
These individuals were weighted in subsequent analyses
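Weighting reused non-users can be sketched as a frequency-weighted mean, where each matched non-user counts once for every user they were matched to. The function name and all numbers below are hypothetical, and the slides do not specify the exact weighting formula used.

```python
def weighted_rate(outcomes, weights):
    """Weighted mean of a 0/1 outcome."""
    return sum(y * w for y, w in zip(outcomes, weights)) / sum(weights)

# Hypothetical matched sample, for illustration only
user_arrest = [0, 1, 0, 0, 1]      # five users, each counted once
nonuser_arrest = [0, 1, 0, 0]      # four distinct matched non-users
times_matched = [2, 1, 1, 1]       # the first non-user was matched twice

user_rate = weighted_rate(user_arrest, [1] * len(user_arrest))
nonuser_rate = weighted_rate(nonuser_arrest, times_matched)
difference = user_rate - nonuser_rate
print(f"users: {user_rate:.2f}, matched non-users: {nonuser_rate:.2f}, "
      f"difference: {difference:.2f}")
```

The weights sum to the number of matched users, so the comparison group is rescaled to mirror the user group.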
So, how did we do in balancing the covariates?
Female
Unadjusted, unmatched   | 0.041   0.037
Regression, unmatched   | 0.012   0.583
Matched, unadjusted     | 0.054   0.038
Matched, adjusted (DR)  | 0.030   0.302

Male
Unadjusted, unmatched   | 0.147   0.000
Regression, unmatched   | 0.041   0.187
Matched, unadjusted     | 0.135   0.000
Matched, adjusted (DR)  | 0.025   0.545
What else could we have measured better or at all?
Maybe what matters more than quantity of covariates is their quality.
Perhaps the outcome here is washed away by other forces
Perhaps a different outcome measure would show stronger effects
– Perhaps repeated or severe offenses (e.g., violent crimes against persons)
Perhaps not all mental health services are created equal
Maybe the results are true
We need to know more about the content of treatment.
Methodologically, doubly robust estimation appears beneficial