Thoughts on Simplifying the Estimation of HIV Incidence

John Hargrove, Alex Welte, Paul Mostert [and others]

Estimates of incident (new) cases are important in assessing changes in an epidemic, in identifying “hot spots” and in gauging the effects of interventions.

HIV incidence is most accurately estimated via longitudinal studies, but these are lengthy, expensive and logistically challenging. They do provide a “gold standard” against which to judge other estimates of HIV incidence.

An alternative way of estimating incidence, involving none of the disadvantages of a longitudinal study, would be to use a single chemical test that estimates the proportions of recent vs long-established HIV infections in cross-sectional surveys.

Idea: identify an HIV test where the measured outcome is not simply +/- but rather a graded response that increases steadily over a long period. One such assay is the BED-CEIA developed by the CDC.

[Figure: BED-CEIA assay result for a seroconverting client (case 23903G) taken from the ZVITAMBO study carried out in Zimbabwe. Axes: normalised OD vs days since last negative.]

[ZVITAMBO: 14,110 post-partum women followed up at 6 weeks, 3 months, then every 3 months to two years]

[Figure: theoretical graph of sqrt(OD-n) vs ln(t_i,j), the log time in days since last negative, showing the negative baseline (A), the selected OD cut-off (B), the window (W_i), the slope b1,i and the intercept b0,i.]

The idea is to calibrate the BED assay to estimate the “average” time [or “window”] taken for a person’s BED optical density [OD] to increase to a given OD cut-off.

In cross-sectional surveys the proportion of HIV-positive people with BED OD below the cut-off allows us to calculate the proportion of new infections, and thus the incidence.
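As a rough illustration of the calibration idea, the sketch below assumes a client's data follow a straight line, sqrt(OD) = b0 + b1 ln(t), and takes the window as the time between the fitted line crossing the negative baseline (A) and crossing the OD cut-off (B). The function name and the default values (cut-off 0.8 and baseline 0.0476, both quoted elsewhere in these slides) are assumptions made for this example; this is not necessarily the exact calculation used by the authors.

```python
import math


def window_days(b0, b1, cutoff=0.8, baseline=0.0476):
    """Window implied by the line sqrt(OD) = b0 + b1 * ln(t).

    Assumption for this sketch: the window runs from the time the fitted
    line crosses the baseline OD to the time it crosses the OD cut-off.
    """
    # Invert the fitted line: ln(t) = (sqrt(OD) - b0) / b1
    t_cutoff = math.exp((math.sqrt(cutoff) - b0) / b1)
    t_baseline = math.exp((math.sqrt(baseline) - b0) / b1)
    return t_cutoff - t_baseline


# Coefficients of the two fitted lines quoted later in the slides
print(round(window_days(-0.768, 0.334)))
print(round(window_days(-2.08, 0.53)))
```

With the two fitted lines quoted later in the slides (y = 0.334x - 0.768 and y = 0.53x - 2.08), this gives windows close to the 126 d and 196 d shown there, which suggests the assumed definition of the window is at least compatible with those figures.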
Estimation of the window period is thus central to the successful application of the BED.

Problem 1: Data from commercial seroconversion panels with accurately known times of seroconversion indicate a delay (~25 days) between seroconversion and the onset of the increase in BED optical density.

[Figure: BED ODn vs days since BED OD started to increase, showing the observed OD, the fitted line and its extrapolated portion; the seronegative and seropositive periods; the date of infection, the date of seroconversion and the extrapolated time when OD = baseline (baseline OD = 0.0476); the delays D1 and D2; and the window periods W and W'.]

[Figure: BED optical density vs days since last negative for clients with min OD < 0.8, max OD > 0.8, S > 2 and t < 90.]

Problem 2: There is considerable variability between clients in a real population. There is no prospect of using the BED to identify individual recent infections; the idea is only to estimate population incidence.

Problem 3: We often have limited follow-up. Of 353 seroconverters in ZVITAMBO, 167 produced only a single HIV-positive sample.

Samples per client (S):    1    2    3    4    5    6    7    8
Frequency:               167   89   35   21   24    8    8    1

Problem 4: The available data for a given client quite often do not span the OD cut-off. The proportion that fail to do so varies with the chosen cut-off. Failure to span increases the uncertainty in estimating the time at which the OD cut-off is crossed.

Problem 5: There is a large variation (27 to 656 days) in the time (t0) elapsing between the last negative and first positive HIV tests.
The degree of uncertainty in the timing of seroconversion increases with increasing t0.

[Figures: panels of BED optical density vs days since last negative, grouped by whether each client's OD values span the 0.8 cut-off (min < 0.8 and max > 0.8; max < 0.8; min > 0.8), by samples per client (S > 2 or S = 2) and by t (t < 90; 90 <= t < 120; t >= 120; for S = 2 also 120 < t < 182 and t >= 183).]

We need to consider how variation in samples per client, t0, and failure to span the cut-off affect our estimate of the window period.

How to approach the problem? A scatter-plot of the data?
This makes no use of the information in the trend for individual clients and ignores the fact that the sequential points for a given client are not independent.

[Figure: scatter-plot of BED optical density vs time since seroconversion.]

An alternative, which uses the trend in BED OD, is suggested by an approximately linear relationship between the square root of OD and the log of the time since the last negative HIV test (t).

[Figure A: square root of OD values vs log_e days since last negative test.]

This allows a regression approach that takes out the variance due to t and to differences between clients. It gives consistent results, in that the results are independent of whether we insist on a minimum of 3, 4 or 5 samples per client, and of the value of t0 between 75 and 180 days.

[Figures: window (days) vs OD cut-off for a minimum of 3, 4 and 5 samples per client; window (days) vs maximum days from last negative to first positive.]

Are we even using the right transformation? And should we be using the time of the last negative HIV test as the origin?

Try instead to make a preliminary estimate of the time when OD starts to increase by fitting a quadratic polynomial to the data. Then use this estimate as the origin.

[Figure: optical density vs days since last negative.]

[Figures: A, square root of OD values vs log_e days since last negative test; B, square root of OD values vs log_e estimated days since seroconversion; C, log_e OD values vs log_e estimated days since seroconversion. Clients shown: 11445X, 14557A, 15513K, 15801X, 16715D, 16853F, 17926A, 18101N, 20606K, 20674F, 21556F, 23903G, 23983A.]

This seems to suggest that the true relationship may actually be a power function. What if it really were?
What would we see if we plotted OD vs time since last negative?

Our problem is that we do not know when seroconversion occurred. We only know the time of the last HIV-negative test.

[Figure: optical density vs days since the function intersects the baseline level, with examples of times when HIV-negative tests might have been taken. True window: 173 days.]

And the greater the delay between the last negative and first positive tests, the greater the uncertainty.

[Figures: square root (OD) vs log_e (days since last negative) for two offsets. Offset = 0 days: y = 0.334x - 0.768, R2 = 0.976, window = 126 d. Offset = 100 days: y = 0.53x - 2.08, R2 = 1.00, window = 196 d.]

For zero offset the window is UNDER-estimated; for the 100-day offset it is OVER-estimated.

[Figure: estimated window vs offset (days), with the true window period marked.]

This approach to window estimation is clearly not optimal, since the window estimate changes with the timing of the last HIV-negative test. But can we do any better?

If OD increases as a power function, fit:

$$\mathrm{OD} = a\,(t - t_0)^{b}$$

or equivalently

$$\ln(\mathrm{OD}) = \ln(a) + b\,\ln(t - t_0)$$

where a and b are constants, t is the time since the last negative test and t0 is the estimated time of seroconversion. We use the data to estimate a, b and t0 by nonlinear regression.

For the generated data [without noise] this approach gives the correct window, regardless of the time of the last negative test. But for real data, in 40% of 61 cases the time of seroconversion was estimated to be before the time of the last negative test or after the time of the first positive. [Work in progress]

Turnbull survival analysis: a different approach suggested by Paul Mostert (Stellenbosch Statistics Department). This is a slightly more sophisticated variant of the Kaplan-Meier survival analysis.
It works on the basis that the (unknown) times of i) seroconversion and ii) reaching the OD cut-off each lie between two known times. The times of the two events are quantified using interval censoring.

[Figure: estimation of the HIV window period since seroconversion (SC) using Turnbull's algorithm; estimated exceeding probability vs window period (days). Turnbull window estimates by run: 1: All data (red; 183 d); 2: Excluding max OD < 0.8 (purple; 141 d); 3: Excluding min OD > 0.8 (green; 210 d); 4: Excluding 2 and 3 (blue; 163 d).]

The window length is estimated using a non-parametric survival technique which makes no assumptions about parametric models or underlying distributions. No interpolation is used to obtain either the time at which the BED OD reaches the 0.8 cut-off or the seroconversion time point. Only time points that define the interval boundaries were used, which means that where a woman had more than four time points they were not all fully utilised. However, as few as two time points per woman could be used in this estimation of window length.

Conclusion

There is still no general agreement on how best to estimate the window for methods like the BED. Fortunately most of those described seem to give fairly similar answers, though it is not clear to what extent this is happening by chance.
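The interval-censoring set-up behind the Turnbull analysis can be illustrated with a small sketch. It assumes that seroconversion lies between the last negative and first positive tests, and that the cut-off is crossed between the last sample below 0.8 and the first sample at or above it; the window for that client then lies between the two bounds returned. The function name, the data layout and the way the two intervals are combined are assumptions for illustration only, and the Turnbull estimation step itself is not shown.

```python
import math


def window_interval(visits, cutoff=0.8):
    """Bounds on one client's window from interval-censored event times.

    visits: list of (day, hiv_positive, od) tuples; od may be None for
    HIV-negative visits. Assumes at least one negative and one positive
    visit. Hypothetical layout, chosen for this sketch.
    """
    last_neg = max(day for day, pos, _ in visits if not pos)
    first_pos = min(day for day, pos, _ in visits if pos)
    positives = [(day, od) for day, pos, od in visits if pos]

    # First sample at or above the cut-off; if none, right-censored
    above = [day for day, od in positives if od >= cutoff]
    r_cut = min(above) if above else math.inf

    # Last sample still below the cut-off before that crossing;
    # if none, fall back to the last negative test
    below = [day for day, od in positives if od < cutoff and day < r_cut]
    l_cut = max(below) if below else last_neg

    # Seroconversion in (last_neg, first_pos], crossing in (l_cut, r_cut]
    lower = max(0, l_cut - first_pos)
    upper = r_cut - last_neg
    return lower, upper


client = [(0, False, None), (90, True, 0.3), (180, True, 0.7), (270, True, 1.2)]
print(window_interval(client))
```

Only the visits that bound each event affect the result, which mirrors the point above that additional time points are not fully utilised, while a client with just two usable time points still yields an interval.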