Correctly modeling CD4 cell count in Cox regression analysis of HIV-positive patients
Allison Dunning, M.S.
Research Biostatistician
Weill Cornell Medical College
Outline
• Background
• Motivation
• Methods
• Data Management
• Results
• Conclusion
Background
• Results from the primary open-label clinical
trial have previously been published in the
New England Journal of Medicine.
Background
• Results of the clinical trial have shown that starting
antiretroviral therapy earlier (‘Early’) rather than
waiting for onset of symptoms (‘Standard’) in HIV
patients significantly decreases mortality.
• Between 2005 and 2008 a total of 816 participants –
408 per group – were enrolled and followed.
• After the clinical trial was stopped, all participants
were immediately put on antiretroviral therapy.
• Researchers have continued to follow and collect data
on the 816 participants.
Motivation
• As a follow-up, researchers are interested in
determining whether ‘Early’ therapy significantly
delays the first tuberculosis (TB) diagnosis,
analyzed as time to first TB (TFTB) diagnosis.
• CD4 cell count has long been considered a
measure of overall health in HIV patients.
• Therefore investigators felt it was important to
adjust for CD4 cell count in the analysis of
TFTB diagnosis.
Motivation
• The problem arose of how best to adjust for CD4
cell count.
• Typically, the CD4 cell count recorded at the
beginning of the study, known as the baseline CD4
cell count, is used for analysis.
• Per protocol, CD4 cell counts were collected every
6 months for all participants.
• Investigators felt it was important to account for
changing CD4 cell counts, especially after therapy
initiation, in the analysis.
Motivation
• Our analysis was not aimed at predicting
survival, only at determining whether drug start
time was a predictor of TB diagnosis.
• To allow the survival analysis to account
for changing CD4 cell counts, we decided to
conduct a Cox Proportional Hazards
Regression analysis using a mixture of fixed
and time-dependent covariates.
What is a Time-Dependent Covariate?
• Time-dependent covariates are those that
may change in value over the study period.
• Most variables in survival analysis are
collected at one time point, typically at the
start of the study; these include demographic
and risk factor variables.
• Sometimes we may collect a lab variable or
risk factor that can vary over the study period.
Examples of Time-Dependent Variables
• Lab Values:
– Blood Pressure
• Most studies will only use blood pressure collected at the start
of the study, sometimes called baseline blood pressure.
• However, in theory, blood pressure could be collected at
multiple times during the study period.
• Risk Factors:
– Smoking Status
• Again, this can be collected only at the start of the study
(baseline), or it could be tracked over time.
• Some patients may quit smoking, start smoking, or quit and
relapse during the study period.
Fixed Covariates
• Fixed covariates are variables that stay
constant, or do not change, during the study
period.
• These are typically things like patient gender,
race/ethnicity, risk factors such as diabetes or
hypertension, etc.
• We as researchers must develop a method to
analyze time-to-event data while including both
fixed covariates and time-dependent
covariates.
Methods
• STATA 12.0 was used to fit two Cox
regression models to analyze the effect of ART
start time on TFTB.
• The first model included only baseline CD4 cell
count as a predictor.
• The second model treated CD4 cell count as
a time-varying predictor.
• Both models were adjusted for history of TB
diagnosis prior to the clinical trial and baseline BMI.
Methods
• Regular Cox Proportional Hazards Model:
– $\log h_i(t) = \alpha(t) + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$
– where $\alpha(t) = \log \lambda_0(t)$
• Proportional Hazards Model with time-varying
covariate:
– $\log h_i(t) = \alpha(t) + \beta_1 x_{i1} + \beta_2 x_{i2}(t)$
– where $\alpha(t) = \log \lambda_0(t)$
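As a short worked step (not on the original slides), the time-varying model above implies how its coefficient on the time-varying covariate can be read. Take two hypothetical participants i and j with the same fixed covariate value and compare them at the same time t. The baseline term α(t) cancels:

```latex
\begin{align*}
\log h_i(t) - \log h_j(t)
  &= \beta_2 \,\bigl[x_{i2}(t) - x_{j2}(t)\bigr], \\
\frac{h_i(t)}{h_j(t)}
  &= \exp\!\bigl\{\beta_2 \,[x_{i2}(t) - x_{j2}(t)]\bigr\}.
\end{align*}
```

Under this model, exp(β2) is the hazard ratio for a one-unit difference in the current value of the covariate at time t, rather than in its baseline value.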
Data Management
• Problems we encountered:
• Missing CD4 cell count
– Some patients missed a scheduled lab visit during the
study, therefore CD4 cell count was missing for one of
the six month intervals.
• Multiple CD4 cell counts within a six month
interval
– For various reasons, several patients visited the lab
multiple times within a six month interval, therefore
multiple CD4 cell counts were collected in the six
month time frame.
Data Management
• What we did – Missing Data:
– If only one interval was missing, the previous CD4 cell
count was used (last observation carried forward).
– If at least two consecutive intervals were missing, the
patient was excluded from the study; 13 patients in
total were excluded for this reason.
• What we did – Multiple Observations:
– The minimum CD4 cell count collected in the six
month interval was the value used in analysis for that
time frame.
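The two rules above could be sketched in Stata roughly as follows. This is a minimal illustration on made-up data, not the authors' actual code: the variable names `patid`, `interval`, and `cd4`, and the toy values, are assumptions. It also assumes that isolated single gaps are each carried forward, a case the slides do not address.

```stata
* Hypothetical long-format lab data: one row per lab draw.
* patid, interval (index of the six-month window) and cd4 are assumed names.
clear
input patid interval cd4
1 1 350
1 1 310
1 2 400
1 4 420
2 1 200
2 4 260
end

* Multiple draws within an interval: keep the minimum CD4 count.
collapse (min) cd4, by(patid interval)

* Make skipped intervals explicit as rows with missing cd4.
tsset patid interval
tsfill

* Length of the current run of consecutive missing intervals.
bysort patid (interval): gen run = missing(cd4)
by patid: replace run = run[_n-1] + 1 if missing(cd4) & _n > 1

* Exclude participants with two or more consecutive missing intervals.
by patid: egen maxrun = max(run)
drop if maxrun >= 2

* Single gaps: carry the last observed value forward.
bysort patid (interval): replace cd4 = cd4[_n-1] if missing(cd4) & _n > 1
```

In this toy example, participant 1 keeps four intervals (the gap at interval 3 takes the value from interval 2), while participant 2 is dropped because intervals 2 and 3 are both missing.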
. use "C:\Documents and Settings\ald2018\Desktop\STATA Conference 2013\JSM Abstra
. quietly tabulate rxcode, generate(grp)
. rename grp1 Early
. stset weeks_to_tb, failure(incident_tb==1)

     failure event:  incident_tb == 1
obs. time interval:  (0, weeks_to_tb]
 exit on or before:  failure

------------------------------------------------------------------------------
      760  total obs.
        0  exclusions
------------------------------------------------------------------------------
      760  obs. remaining, representing
       94  failures in single record/single failure data
 150407.6  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =  330.5714
Results – Regular Cox Regression
. stcox Early baseline_bmi history_tb baseline_cd4

         failure _d:  incident_tb == 1
   analysis time _t:  weeks_to_tb

Iteration 0:   log likelihood = -619.08645
Iteration 1:   log likelihood = -601.71285
Iteration 2:   log likelihood = -600.93058
Iteration 3:   log likelihood = -600.92881
Refining estimates:
Iteration 0:   log likelihood = -600.92881

Cox regression -- Breslow method for ties

No. of subjects =          773                     Number of obs   =       773
No. of failures =           96
Time at risk    =  153457.4286
                                                   LR chi2(4)      =     36.32
Log likelihood  =   -600.92881                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Early |   .4388986    .096901    -3.73   0.000     .2847305    .6765415
baseline_bmi |   .9067024   .0322029    -2.76   0.006     .8457326    .9720676
  history_tb |   2.163769   .5184032     3.22   0.001     1.352935    3.460546
baseline_cd4 |   .9972554   .0025424    -1.08   0.281     .9922849    1.002251
------------------------------------------------------------------------------
Results
• Regular Cox regression analysis showed that
‘Early’ therapy is associated with a significantly
lower hazard of a first TB diagnosis (hazard ratio
0.44, 95% CI 0.28 to 0.68), after adjustment for
previous TB diagnosis, baseline BMI, and baseline
CD4 cell count.
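As a reading aid, the columns of the regression output can be related to one another by hand. The arithmetic below uses the reported hazard ratio (.4388986) and standard error (.096901) for Early, and assumes the usual convention that the standard error shown for a hazard ratio is the coefficient's standard error multiplied by the hazard ratio:

```latex
\begin{align*}
\widehat{\beta} &= \ln(0.4388986) \approx -0.8235, \\
\operatorname{SE}(\widehat{\beta}) &\approx \frac{0.096901}{0.4388986} \approx 0.2208, \\
z &= \frac{-0.8235}{0.2208} \approx -3.73, \\
\text{95\% CI} &= \exp(-0.8235 \pm 1.96 \times 0.2208) \approx (0.285,\ 0.677).
\end{align*}
```

These values agree with the reported z statistic and confidence interval for Early.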
Data Management
• Data was collected with one row per
participant:
Data Management
• In STATA, using the reshape command, we
reformatted the dataset into long format (multiple
rows per participant) for analysis:
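A minimal sketch of such a reshape, on made-up data, is shown here. The wide-format variable names (`cd4_0`, `cd4_26`, `cd4_52` for CD4 counts at weeks 0, 26, and 52) and the toy values are assumptions for illustration. The variables `t1`, `t2`, `status`, `patid`, and `cd4_count` follow the names used in the stset and stcox calls on the slides.

```stata
* Hypothetical wide data: one row per participant.
clear
input patid Early weeks_to_tb incident_tb cd4_0 cd4_26 cd4_52
1 1 60 0 300 420 510
2 0 40 1 250 180 .
end

* One row per participant per CD4 measurement; t1 is the week of measurement.
reshape long cd4_, i(patid) j(t1)
rename cd4_ cd4_count

* Drop intervals that would begin at or after the end of follow-up.
drop if t1 >= weeks_to_tb

* Each row covers (t1, t2]; the last row ends at the event or censoring time.
bysort patid (t1): gen t2 = cond(_n < _N, t1[_n+1], weeks_to_tb)

* The event indicator is switched on only in a participant's final interval.
by patid: gen status = incident_tb * (_n == _N)

stset t2, id(patid) failure(status==1) time0(t1)
```

Dropping rows that start at or after the end of follow-up is one way to avoid records with entry on or after exit, which stset would otherwise exclude and flag.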
. quietly tabulate rxcode, generate(grp)
. rename grp1 Early
. stset t2, id(patid) failure(status==1) time0(t1)

                id:  patid
     failure event:  status == 1
obs. time interval:  (t1, t2]
 exit on or before:  failure

------------------------------------------------------------------------------
     5659  total obs.
      212  entry on or after exit (t1>t2)                       PROBABLE ERROR
       36  overlapping records (t2[_n-1]>t1)                    PROBABLE ERROR
------------------------------------------------------------------------------
     5411  obs. remaining, representing
      760  subjects
       92  failures in single failure-per-subject data
   148364  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =       329
Results – Cox Regression with time-dependent covariates
. stcox Early history_tb baseline_bmi cd4_count

         failure _d:  status == 1
   analysis time _t:  t2
                 id:  patid

Iteration 0:   log likelihood = -593.47212
Iteration 1:   log likelihood = -541.68367
Iteration 2:   log likelihood = -535.94519
Iteration 3:   log likelihood = -535.89227
Iteration 4:   log likelihood = -535.89226
Refining estimates:
Iteration 0:   log likelihood = -535.89226

Cox regression -- Breslow method for ties

No. of subjects =          760                     Number of obs   =      5411
No. of failures =           92
Time at risk    =       148364
                                                   LR chi2(4)      =    115.16
Log likelihood  =   -535.89226                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Early |   .8041764   .1881104    -0.93   0.351     .5084414    1.271926
  history_tb |   1.991573   .4914993     2.79   0.005     1.227803    3.230456
baseline_bmi |   .9376221   .0327094    -1.85   0.065     .8756554    1.003974
   cd4_count |   .9911868   .0010495    -8.36   0.000     .9891321    .9932458
------------------------------------------------------------------------------
Results
• When CD4 cell count is treated as a time-varying
predictor in Cox regression, ART start time is no
longer a significant predictor of TFTB (hazard
ratio 0.80, 95% CI 0.51 to 1.27; p = 0.351).
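The hazard ratio for cd4_count (.9911868) is per one-unit difference in the current CD4 count, which makes it look close to 1 despite the large z statistic. As an illustrative rescaling, assuming the count is recorded in cells per mm³ (the slides do not state the unit) and that the effect is log-linear, the hazard ratio per 100-cell difference is:

```latex
\begin{align*}
\mathrm{HR}_{100} &= 0.9911868^{100} = \exp\bigl(100 \times \ln 0.9911868\bigr) \approx 0.41, \\
\text{95\% CI} &\approx \bigl(0.9891321^{100},\ 0.9932458^{100}\bigr) \approx (0.34,\ 0.51).
\end{align*}
```

Read this way, a participant whose current CD4 count is 100 cells higher has roughly 0.41 times the hazard of a first TB diagnosis, holding the other covariates fixed.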
Conclusion
• Failing to adjust for the change in CD4 cell
counts over time led to reporting that ‘Early’
therapy significantly reduces risk of TB
diagnosis. When CD4 cell count is modeled as
time-varying, the effect becomes non-significant.
This result has substantial consequences for
treatment decision making.
Conclusion
• Our results suggest that TFTB
diagnosis in HIV-positive patients is not
associated with start time of ART when overall
patient health is considered.
• Further analysis is needed before we are
comfortable making this conclusion.
Looking Forward
• We are currently in the process of further
examining the relationship between CD4 cell
count and ART start.
• We are currently collecting data to examine time
from ART start to first TB diagnosis. For the Early
group these data do not change; for the Standard
group, however, this may have a significant
effect on the analysis.
Acknowledgements
• Daniel W. Fitzgerald, M.D.
• Sean Collins, M.D.
• Sandra H. Rua, Ph.D.
Thank You
ald2018@med.cornell.edu