nola13-woodruff

Conditional Stereotype Logistic Regression
A new estimation command
Rob Woodruff
Battelle Memorial Institute, Health & Analytics
Email: woodruffr@battelle.org
Cynthia Ferre
Centers for Disease Control and Prevention
1
Overview
• What is it?
- Stereotype Logistic Regression
- Conditional on what?
• What’s it good for?
• Syntax and Examples
2
Constrained Multinomial Logistic Regression
• Multinomial Model
- Categorical Outcome Variable
- Vector of Explanatory Variables
- Related through the m logits:
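The equation for this slide did not survive extraction. A standard way to write the m baseline-category logits is sketched below; the symbols (outcome Y with categories 0..m, category 0 as the reference, covariate vector x of length p, intercepts alpha_j, slope vectors beta_j) are assumed notation, chosen to agree with the m(p+1) parameter count given later.

```latex
\log \frac{P(Y = j \mid x)}{P(Y = 0 \mid x)} = \alpha_j + \beta_j^{\top} x,
\qquad j = 1, \dots, m
```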
3
Constrained Multinomial (continued)
- The stereotype model imposes the constraints:
Note: The phi’s are scalar quantities
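The constraint equation is also missing from the extracted slide. In the usual stereotype formulation each slope vector is a scalar multiple of one common vector; the sketch below assumes the same notation as the multinomial logits, with phi_0 = 0 for the reference category and phi_m = 1 fixed for identifiability (an assumption that matches the m-1 free phi's counted on the next slide).

```latex
\beta_j = \phi_j \, \beta, \qquad j = 1, \dots, m,
\qquad\text{so that}\qquad
\log \frac{P(Y = j \mid x)}{P(Y = 0 \mid x)} = \alpha_j + \phi_j \, \beta^{\top} x,
\qquad \phi_0 = 0,\ \phi_m = 1
```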
4
It’s all about the phi’s
• Full multinomial has m(p+1) parameters
• Stereotype model has (m-1) + m + p = 2m-1+p parameters
(m-1 phi's, m intercepts, p slopes)
• The phi parameters give a way to quantify ordinality of the
outcome variable. If the estimated phi's are monotone across
the ordered outcome categories, then we have evidence of an
ordinal effect.
• Also allow tests of distinguishability of outcome categories
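The parameter counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are hypothetical, and the values m = 3 and p = 3 are taken from the preterm birth example later in the slides (three case categories, three covariates).

```python
def n_params_multinomial(m, p):
    """Full multinomial model: m logits, each with an intercept and p slopes."""
    return m * (p + 1)


def n_params_stereotype(m, p):
    """Stereotype model: m-1 free phi's, m intercepts, p common slopes."""
    return (m - 1) + m + p


# Illustrative values: 3 non-reference outcome categories, 3 covariates
m, p = 3, 3
full = n_params_multinomial(m, p)
stereo = n_params_stereotype(m, p)
print(full, stereo, full - stereo)
```

With these values the two models differ by 4 parameters, which appears to agree with the 4 degrees of freedom reported when cstereo compares the stereotype fit with the conditional multinomial fit.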
5
So what’s the condition?
• The multinomial and stereotype logistic regression models
are implemented in Stata by mlogit and slogit
• Both assume independent observations, which does not hold for
matched case-control data
• In a matched case-control study, only the matched groups
(strata, panels, clusters, etc.) are independent of one another
• For 1:M matching, condition on the stratum total of the outcome
variable and work with the conditional likelihood instead
Do I have to?
Why condition on this particular event?
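A minimal sketch of what conditioning buys, assuming a 1:M matched set with exactly one case (in category k) and M controls (reference category, phi = 0). Given that exactly one member of the set is a category-k case, the probability that it is the observed one is exp(phi_k * x_case'beta) divided by the sum of exp(phi_k * x_j'beta) over all members, so the intercepts and any stratum effects cancel. The function name and argument layout are hypothetical and are not taken from cstereo.

```python
import math


def stratum_cond_loglik(case_x, control_xs, beta, phi_case):
    """Conditional log-likelihood contribution of one 1:M matched set.

    Assumes one case with stereotype scale phi_case and controls in the
    reference category (phi = 0), so intercepts cancel out.
    """
    def score(x):
        # phi_k times the linear predictor x'beta
        return phi_case * sum(b * v for b, v in zip(beta, x))

    scores = [score(case_x)] + [score(x) for x in control_xs]
    top = max(scores)  # log-sum-exp for numerical stability
    log_denom = top + math.log(sum(math.exp(s - top) for s in scores))
    return scores[0] - log_denom


# With beta = 0 every member of a 1:2 set is equally likely to be the case
print(stratum_cond_loglik([1.0, 0.0], [[0.5, 1.0], [2.0, 1.0]], [0.0, 0.0], 1.0))
```

Summing this quantity over matched sets gives the conditional log likelihood under the stated assumptions; sets with no case contribute nothing, which is consistent with groups being dropped when all outcomes are the same.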
6
Conditional vs. Unconditional Likelihood
7
Conditional vs. Unconditional Likelihood
8
CSTEREO
cstereo command
Basic syntax:
. cstereo depvar indepvars [if] [in], group(varname) [options]
9
Example with Real Data:
Preterm Birth and Vitamin D
• 1:2 (some 1:1) Pooled, Matched Case-Control Study of 2,583
Mothers in 870 matched groups
• A case is defined as gestational age at delivery of <37 weeks:
outcome4=3 (<32 weeks), outcome4=2 (32-35 weeks), outcome4=1
(36 weeks), and outcome4=0 (control: 37+ weeks)
• Primary exposure variable of interest: vitamin D level,
ohd25_total: blood serum concentration of 25(OH)D in ng/ml
• Sample of other covariates measured:
edu = 0/1 indicator of post-high school education
vitamin = 0/1 indicator of vitamin use during pregnancy
10
Example Continued (nolog option):
. cstereo outcome4 ohd25_total edu vitamin, group(matchgroup) nolog
note: 77 groups (139 obs) dropped because of all positive or
all negative outcomes.
Log-Likelihood from Conditional Multinomial Model: -835.83679
Chi2 value on 4 degrees of freedom: 10.604861
P-value: .03138281
11
Example Continued:
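                                                Number of obs   =       2322
                                                Wald chi2(3)    =       1.85
Log likelihood = -841.13922                     Prob > chi2     =     0.6048

------------------------------------------------------------------------------
    outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
 ohd25_total |  -.0073684   .0144916    -0.51   0.611    -.0357714    .0210346
         edu |  -.4010391    .431587    -0.93   0.353    -1.246934    .4448559
     vitamin |   .1301369   .1954516     0.67   0.506    -.2529413    .5132151
-------------+----------------------------------------------------------------
phi1         |
       _cons |   .8764578   1.268331     0.69   0.490    -1.609424     3.36234
-------------+----------------------------------------------------------------
phi2         |
       _cons |   .9398113   1.206139     0.78   0.436    -1.424178      3.3038
------------------------------------------------------------------------------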
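The chi-square value and p-value printed by cstereo can be reproduced from the two log likelihoods in the output: -835.83679 for the conditional multinomial model and -841.13922 for the stereotype model. The sketch below assumes the reported statistic is the usual likelihood-ratio statistic, and it uses the closed form of the chi-square upper tail for even degrees of freedom so that only the standard library is needed. The function names are hypothetical.

```python
import math


def lr_statistic(ll_restricted, ll_full):
    """Likelihood-ratio statistic: twice the gap in log likelihoods."""
    return 2.0 * (ll_full - ll_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid for even df only:
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


ll_multinomial = -835.83679  # conditional multinomial model
ll_stereotype = -841.13922   # conditional stereotype model
chi2 = lr_statistic(ll_stereotype, ll_multinomial)
p_value = chi2_sf_even_df(chi2, 4)
print(round(chi2, 5), round(p_value, 5))
```

The result agrees with the reported 10.604861 and .03138281 to the precision of the printed log likelihoods.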
12
Interpretation of cstereo output:
• Estimated beta coefficient of ohd25_total = -0.0074 with
95% confidence interval (-0.0358, 0.0210)
• Odds ratio of being in the <32 weeks category compared to
control is exp(-0.0074) = 0.993 (0.965, 1.021)
• For the odds ratios of the 32-35 weeks and 36 week case
categories, we need the products of the parameters:
beta*phi2 (32-35 weeks) and beta*phi1 (36 weeks)
• For standard errors, use the delta method via nlcom
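For reference, the first-order delta method approximation for the variance of a product of two estimates is sketched below, with beta-hat the ohd25_total coefficient and phi-hat_j one of the estimated phi's. This is the textbook formula for a product; it is shown as background and is not taken from the cstereo documentation.

```latex
\widehat{\mathrm{Var}}\big(\hat\beta \hat\phi_j\big) \approx
\hat\phi_j^{2}\, \widehat{\mathrm{Var}}(\hat\beta)
+ \hat\beta^{2}\, \widehat{\mathrm{Var}}(\hat\phi_j)
+ 2\, \hat\beta \hat\phi_j\, \widehat{\mathrm{Cov}}(\hat\beta, \hat\phi_j)
```

The covariance term is why the standard error cannot be read off the coefficient table alone.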
13
Interpretation continued:
. nlcom [xb]ohd25_total*[phi2]_cons
_nl_1: [xb]ohd25_total*[phi2]_cons
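------------------------------------------------------------------------------
    outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |  -.0069249   .0072757    -0.95   0.341     -.021185    .0073351
------------------------------------------------------------------------------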
Exponentiating gives the odds ratio of being in the
32-35 weeks case category compared to controls:
0.993 with a 95% C.I. of (0.979, 1.007)
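The exponentiation step can be checked directly from the printed estimates. The sketch below hard-codes the coefficient and interval endpoints from the cstereo and nlcom output above; the helper name is hypothetical.

```python
import math


def odds_ratio(estimate, lower, upper):
    """Exponentiate a log odds ratio and its confidence limits."""
    return tuple(math.exp(v) for v in (estimate, lower, upper))


# <32 weeks vs. control: [xb]ohd25_total from the cstereo output
or_lt32 = odds_ratio(-0.0073684, -0.0357714, 0.0210346)

# 32-35 weeks vs. control: [xb]ohd25_total*[phi2]_cons from nlcom
or_32_35 = odds_ratio(-0.0069249, -0.021185, 0.0073351)

for label, values in (("<32 weeks", or_lt32), ("32-35 weeks", or_32_35)):
    print(label, [round(v, 3) for v in values])
```

Exponentiating the endpoints keeps the interval on the odds ratio scale asymmetric around the point estimate, which is the usual choice for log odds ratios.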
14
Constraints:
• Are the 36 week and 32-35 weeks case categories
distinguishable?
. constraint 1 [phi1]_cons=[phi2]_cons
. cstereo outcome4 ohd25_total edu vitamin, group(matchgroup) nolog constraints(1)
note: 77 groups (139 obs) dropped because of all positive or
all negative outcomes.
15
Constraint Output
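                                                Number of obs   =       2322
                                                Wald chi2(3)    =       1.73
Log likelihood =  -841.1454                     Prob > chi2     =     0.6293

 ( 1)  [phi1]_cons - [phi2]_cons = 0
------------------------------------------------------------------------------
    outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
 ohd25_total |  -.0068382    .013172    -0.52   0.604    -.0326548    .0189784
         edu |  -.3909924   .4289348    -0.91   0.362    -1.231689    .4497043
     vitamin |   .1294806   .1888154     0.69   0.493    -.2405908    .4995519
-------------+----------------------------------------------------------------
phi1         |
       _cons |   .9417836    1.24291     0.76   0.449    -1.494276    3.377843
-------------+----------------------------------------------------------------
phi2         |
       _cons |   .9417836    1.24291     0.76   0.449    -1.494276    3.377843
------------------------------------------------------------------------------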
16
Constraint Output
• The log-likelihood from the constrained model is -841.145
compared to -841.139 for the unconstrained stereotype model
• Difference of 0.006 gives a chi2 value of 0.012 on 1 degree
of freedom
• P-value = 0.91
• The unconstrained stereotype model does not fit significantly
better than the constrained model, so the two case categories
are indistinguishable
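The likelihood-ratio comparison on this slide can be reproduced from the two reported log likelihoods. A minimal sketch, using the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper tail is erfc(sqrt(x/2)):

```python
import math

ll_unconstrained = -841.13922  # stereotype model with phi1 and phi2 free
ll_constrained = -841.1454     # stereotype model with phi1 = phi2

# Likelihood-ratio statistic on 1 degree of freedom (one constraint)
chi2 = 2.0 * (ll_unconstrained - ll_constrained)

# Upper-tail probability of a chi-square(1) variable
p_value = math.erfc(math.sqrt(chi2 / 2.0))
print(round(chi2, 3), round(p_value, 2))
```

This gives a statistic of about 0.012 and a p-value of about 0.91, matching the figures above.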
17
Relationship to Other Models for
Ordered/Categorical Outcomes
• Constrained Multinomial
• Not as parsimonious as the proportional odds model (ologit),
but the proportional odds model is not valid under
outcome-dependent sampling
• Adjacent category model is (basically) a constrained
stereotype model. Also valid under outcome dependent
sampling
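To see why the adjacent category model is a constrained stereotype model, sum its adjacent-category logits up to category j. The notation below (intercepts alpha_k, common slope vector beta, reference category 0) is assumed for illustration.

```latex
\log \frac{P(Y = j \mid x)}{P(Y = j-1 \mid x)} = \alpha_j + \beta^{\top} x
\quad\Longrightarrow\quad
\log \frac{P(Y = j \mid x)}{P(Y = 0 \mid x)}
  = \sum_{k=1}^{j} \alpha_k + j \, \beta^{\top} x
```

The slope on the baseline-category scale is j times a common vector, so the model corresponds to a stereotype model with the phi's fixed at equally spaced values (phi_j = j/m when phi_m = 1 and the common slope is rescaled to m times beta).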
18
Limitations
• Convergence Issues
• Currently only a one dimensional stereotype model
• Cannot currently force an ordering on the stereotype
parameters
• Additional dependence structure
19