Baby immune response and HIV infection

Group4 (Robert Szulkin, Alex Granqvist)
Maternal transmission of HIV is well documented: in Nduati et al. (JAMA 2000), nearly 40% of
breastfed children became HIV positive within 2 years of birth. Treating breastfeeding as the
exposure, the question we now address is whether the child’s immune response (as measured
by the peptide response) predicts the time to infection.



• ‘Exposed children’ are those who are breastfed.
• Their infection status is ascertained at monthly visits during follow-up.
• The peptide response is also measured at those visits. The peptides tested belong to 5 major
groups; first answer the question for all groups combined, and then check whether there is
any individual group effect.
The data are in 3 Stata files; in each, the infant is identified by IDNUM:
• BF_lastBF: whether the child was breastfed and, if so, the age in months at last
breastfeeding. Example of reading the Stata data into R:
> setwd('z:/classes-new/likelihood/MEB/HIV')
> library(foreign)
> BF = read.dta('BF_lastBF.dta')
> names(BF)
[1] "idnum"    "bfeeding" "lastbf"
> table(BF$bfeeding)

     no     yes stopped
    102     349       0
(NOTE: Analyse only the 349 breastfed children.)

• LAST_NEG_1ST_POS_DATES: dates of the last negative and first positive HIV tests
(the variable PROBLEM indicates where infection status was uncertain), and the delivery
date of the child, which allows us to compute age. Use library(date) to deal with date
variables. E.g.
> library(date)
> a=as.date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"))
> as.numeric(a)
[1]   0   1  90 211
> as.date(c('2001-12-31','2009-5-30'), order='ymd')
(That is, dates are represented as the number of days since 1 Jan 1960, which allows you to
compute survival times from calendar dates.)

• ELISPOT_TESTS: the Elispot test result (in ‘spot forming units’, SFU) for each peptide at
each visit. The file includes the following variables:
o SPECDAY, SPECMON, SPECYR: day month and year of the specimen
o VISIT: code for the age of infant at visit, DEL=delivery, M1=month 1, M2 =
month 2 etc.
o PEPTIDE: the exact sequence of the peptide tested
o PEPPROT: peptide group (5 levels): 1=env (gp120), 2=nef, 3=gag (p24/p17),
4=pol, 5=rev
o OBG: overall background
o OSFU: overall spot forming units
o OHIVSFU: overall HIV-specific SFU (‘Overall’ refers to a measurement made both by
eye and by machine; we take the eye value when available, and the machine value
otherwise.)
o CTL50, CTL100, CTL500: indicators of whether the peptide gives a positive
response under 3 different cut-off criteria
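The data-handling steps above (keeping the breastfed children, linking files on idnum, turning calendar dates into ages) can be sketched as follows. This is a minimal sketch on made-up rows, not the real files. Only idnum, bfeeding and lastbf come from the handout; deldate, lastneg and firstpos are hypothetical column names. Base R's Date class is swapped in for the date package so the block runs without extra packages, and censoring uninfected children at their last negative test is an assumption.

```r
# Toy stand-in for BF_lastBF.dta (column names as printed by names(BF)).
BF <- data.frame(
  idnum    = 1:4,
  bfeeding = factor(c("yes", "no", "yes", "yes"),
                    levels = c("no", "yes", "stopped")),
  lastbf   = c(6, NA, 12, 3)          # age in months at last breastfeeding
)

# Toy stand-in for the dates file; deldate, lastneg and firstpos are
# hypothetical names, and firstpos is NA for children never seen positive.
dates <- data.frame(
  idnum    = 1:4,
  deldate  = as.Date(c("2000-01-10", "2000-02-01", "2000-03-01", "2000-03-15")),
  lastneg  = as.Date(c("2000-04-10", "2000-05-01", "2000-06-01", "2000-09-15")),
  firstpos = as.Date(c("2000-05-10", NA, "2000-09-01", NA))
)

# Keep only the breastfed children, then link the two files on idnum.
dat <- merge(BF[BF$bfeeding == "yes", ], dates, by = "idnum")

# Dates are day counts, so subtraction gives durations in days.
dat$hiv   <- as.integer(!is.na(dat$firstpos))
age_neg   <- as.numeric(dat$lastneg - dat$deldate)   # age at last negative test
gap       <- as.numeric(dat$firstpos - dat$lastneg)  # NA if never positive

# Midpoint rule for infected children; last negative test otherwise
# (the censoring rule is an assumption of this sketch).
dat$age_days <- ifelse(dat$hiv == 1, age_neg + gap / 2, age_neg)

dat[, c("idnum", "hiv", "age_days")]
```

With the real files, the same pattern would apply after read.dta, using whatever the date columns are actually called.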
Questions
• The main predictor of interest is OHIVSFU or the ratio OHIVSFU/OBG, which is
best viewed on a log scale. Use a normal quantile plot to choose a reasonable
scale.
• Fit a simple Cox model, using only the first peptide test result as predictor. Take the
mid-point between the last-negative and first-positive test dates as the time of HIV
infection. (You can use the coxph function in the survival package in R.)
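The two steps above can be sketched on simulated data as follows. Nothing here comes from the real files: the variables ohivsfu, obg, last_neg and first_pos are simulated stand-ins, and censoring uninfected children at their last negative test is an assumption. Real counts may also contain zeros, whose log is -Inf, so they would need separate handling.

```r
library(survival)
set.seed(1)
n <- 150

# Simulated stand-ins for the first-visit OHIVSFU and OBG (positive, skewed).
ohivsfu <- rlnorm(n, meanlog = 3, sdlog = 1)
obg     <- rlnorm(n, meanlog = 1, sdlog = 0.5)
ratio   <- ohivsfu / obg

# Normal quantile plots on the raw and log scales, side by side.
op <- par(mfrow = c(1, 2))
qqnorm(ratio, main = "ratio");           qqline(ratio)
qqnorm(log(ratio), main = "log(ratio)"); qqline(log(ratio))
par(op)

# Simulated ages (months) at last negative and first positive test.
last_neg  <- runif(n, 1, 20)
first_pos <- last_neg + runif(n, 0.5, 3)
hiv       <- rbinom(n, 1, 0.4)             # 1 = infected, 0 = not infected

# Midpoint rule for infected children; censoring at the last negative test
# for the others is an assumption of this sketch.
time <- ifelse(hiv == 1, (last_neg + first_pos) / 2, last_neg)

# Simple Cox model with the first (log-scale) peptide value as predictor.
first <- data.frame(peptide = log(ratio), time = time, hiv = hiv)
fit1  <- coxph(Surv(time, hiv) ~ peptide, data = first)
summary(fit1)
```

The scale whose quantile plot is closest to a straight line is the natural candidate for the predictor.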
Your main problem now is to use all the repeated peptide values measured during follow-up.
The idea is to use a GEE-type extension of the Cox model, in which each measurement from
a child is paired with that child's survival information, so that the measurements form
repeated survival data. E.g. a child with 3 visits generates 3 correlated pseudo-subjects
(x1,y1,delta), (x2,y2,delta) and (x3,y3,delta), where xi is the peptide value at visit i and
yi is the survival information computed from the visits (consisting of the start and end of
follow-up dates). Delta is the final infection status; it has the same value for all 3
pseudo-subjects. (See the function Surv(.) in R. You specify the start and end date in this
function; see below.)
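One possible reading of this construction, sketched on made-up rows: each visit contributes one pseudo-subject whose follow-up starts at the age at that visit and ends at the child's end of follow-up. The column names age, start and end and all values are assumptions for illustration only.

```r
library(survival)

# Toy long-format visit data: one row per child-visit (values made up).
visits <- data.frame(
  idnum   = c(1, 1, 1, 2, 2),
  age     = c(1, 2, 3, 1, 2),            # age in months at the visit
  peptide = c(0.2, 0.5, 0.9, 1.1, 1.3)   # x1, x2, x3 for child 1, etc.
)

# One row per child: end of follow-up and final infection status (delta).
outcome <- data.frame(
  idnum = c(1, 2),
  end   = c(4.5, 6),
  hiv   = c(1, 0)
)

# Pair every visit with the child's survival information.
pseudo <- merge(visits, outcome, by = "idnum")
pseudo$start <- pseudo$age

# Keep only visits that precede the end of follow-up.
pseudo <- pseudo[pseudo$start < pseudo$end,
                 c("idnum", "start", "end", "hiv", "peptide")]

# The (start, end, delta) triples as a survival object.
with(pseudo, Surv(start, end, hiv))
```

Child 1 yields three rows that share hiv = 1 and child 2 yields two rows that share hiv = 0, so the correlation within a child is carried by idnum.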
• For independent subjects, starting from the Cox (partial) likelihood, derive the score
equation and the information matrix for the Cox regression model. Identify the score
equation as an estimating equation, and explain how it can be extended to GEE assuming
‘independent working variance’.
• Sketch what the sandwich formula for the robust variance looks like for the GEE in
this case.
• Run the coxph model in R:
coxph(Surv(start,end,hiv) ~ peptide + cluster(idnum))

The ‘cluster’ term identifies the repeated measures from the same child. The fit produces
the estimates and the robust standard errors of the GEE model.
• Compare the GEE results with those of the simple run that uses only the first peptide
measurement.
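The clustered fit and the comparison can be sketched on simulated data as follows. All values are simulated and the visit schedule (months 1, 2, 3) is an assumption. The block also rebuilds the robust variance by hand, using the usual sandwich form A^{-1} B A^{-1}, where A is the information matrix and B sums, over children, the outer products of each child's summed score residuals. In R the per-child dfbeta residuals D already include the A^{-1} factor, so the sandwich is D'D.

```r
library(survival)
set.seed(3)
n_child <- 80
n_visit <- 3

# One row per child: end of follow-up (months) and final infection status.
outcome <- data.frame(idnum = 1:n_child,
                      end   = runif(n_child, 4, 24),
                      hiv   = rbinom(n_child, 1, 0.4))

# One row per visit; visits at months 1, 2, 3, all before end of follow-up.
visits <- data.frame(idnum   = rep(1:n_child, each = n_visit),
                     start   = rep(1:n_visit, times = n_child),
                     peptide = rnorm(n_child * n_visit))
pseudo <- merge(visits, outcome, by = "idnum")

# GEE-type fit: all visits, robust variance clustered on child.
fit_gee <- coxph(Surv(start, end, hiv) ~ peptide + cluster(idnum),
                 data = pseudo)

# Sandwich by hand: D holds the dfbeta residuals summed within child, and
# D'D is compared with the robust variance stored in the fit.
D      <- residuals(fit_gee, type = "dfbeta", collapse = pseudo$idnum)
V_hand <- crossprod(D)
all.equal(as.numeric(V_hand), as.numeric(fit_gee$var))

# Simple fit: first visit only, time measured from birth, naive variance.
first <- pseudo[pseudo$start == 1, ]
fit1  <- coxph(Surv(end, hiv) ~ peptide, data = first)

# Side-by-side estimates and standard errors.
rbind(gee   = c(coef = unname(coef(fit_gee)), se = sqrt(fit_gee$var[1, 1])),
      first = c(coef = unname(coef(fit1)),    se = sqrt(fit1$var[1, 1])))
```

In the clustered fit, fit_gee$naive.var holds the model-based variance, so comparing it with fit_gee$var shows how much the within-child correlation changes the standard error.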