H-DEF User Guide

Table of Contents
Overview
Running CEPAC
Run setup
Run organization
Naming conventions
Loading CEPAC output into H-DEF
Specifying H-DEF inputs
Transmission rates
Transmissions / 100 person-years by viral load stratum
Acute infection settings
Transmission rate multiplier while failed on ART
Transmission rate adjustments to individual runs
Demography
Year at model start
Calculate vs. specify susceptible entrants
Baseline population distribution
Economic parameters
HIV test costs
Background costs
Other parameters
Saving/importing H-DEF inputs
Running H-DEF and saving model outputs
Strategy-sets
Understanding H-DEF results
Epidemic trends
Formulating an S-I model
Where is equilibrium?
Prevalence, incidence, and duration
The baseline state of the population
Transmission rates
Infections, incidence, and incidence-to-prevalence-ratio
Heterogeneous mixing
Long-term treatment outcomes within H-DEF
CEPAC equilibrium
Calibration
Appendices
CEPAC equilibrium “proof”
Overview
The H-DEF model (HIV Dynamic Epidemic Framework) is a deterministic model of HIV transmission that
interfaces with output from the CEPAC disease progression model. Using the H-DEF model involves four basic
steps:
1. Running CEPAC – a given CEPAC run cannot necessarily be incorporated into H-DEF; runs must be
specially tailored to work with H-DEF.
2. Loading CEPAC output into H-DEF.
3. Specifying H-DEF inputs.
4. Running H-DEF and saving model outputs.
Running CEPAC
There are a few special considerations when setting up CEPAC runs for use in H-DEF. Broadly, these can be
grouped into run setup, run organization, and naming conventions.
Run setup
CEPAC runs for use in H-DEF should always be undiscounted; any discounting of costs and QALMs will be
performed within H-DEF. If you attempt to load discounted output into H-DEF it will produce an error message.
Run organization
In a standard CEPAC analysis, each run of the CEPAC model represents a single strategy (treatment policy).
Thus, an analysis is often organized as shown in the schematic below; a unique folder is used for each scenario
(e.g. basecase or a specific sensitivity analysis), within which are the runs representing each of the strategies being
evaluated.
[Figure: Standard CEPAC analysis. Each scenario folder (s0 BASECASE, s1 CD4=300, s1 CD4=100) contains one CEPAC run per strategy (Standard of care, Enhanced treatment).]
When setting up CEPAC runs for use in H-DEF, this picture becomes somewhat more complicated. In H-DEF, it
generally requires more than one run to simulate a single treatment strategy – this is because it is necessary to
simulate HIV-infected individuals who may occupy a variety of different states at the start of the H-DEF analysis.
For example, as shown in the diagram below, individuals could be acutely-infected, chronically-infected but
undiagnosed, on 1st-line ART, on 2nd-line ART, etc. Separate CEPAC runs are used to model individuals starting
in each of these various baseline states – the same treatment policy will be applied in each run.
[Figure: H-DEF analysis. Each scenario folder (s0 BASECASE, s1 CD4=300, s1 CD4=100) contains one folder per strategy (Standard of care, Enhanced treatment); each strategy folder contains one CEPAC run per baseline state (Chronic, undetected; Acutely-infected; 1st-line ART; 2nd-line ART).]
To keep track of these runs and strategies, an additional layer of folders must be introduced to sort the CEPAC
runs used for an H-DEF analysis. As shown above, rather than using a single run to represent a strategy, now a
folder of runs is used to represent a strategy.
Naming conventions
It’s possible to load large numbers of scenarios, folders, and CEPAC runs into H-DEF simultaneously. To keep
them all organized within H-DEF, it’s important to adhere to certain naming conventions.
The various CEPAC runs within a folder should be given names beginning with a two-digit number, followed by
an underscore and a short description of what baseline state that run is intended to simulate. Following this, the
user is free to add in an indication of what round of runs it is, or any other information they’d like to have in the
file name. For the runs in the figure above, these names could be:
01_chronicUndet_R1.in
02_acute_R1.in
03_1stLine_R1.in
04_2ndLine_R1.in
It is crucial that these names and numbers be used consistently within every folder for every scenario simulated in
an analysis; for example, in the analysis above, a run whose name begins with “03” should always be a run in
which patients start on 1st-line ART. These numbers are H-DEF’s only way to tell apart these runs, and to know
which runs it should treat like 1st-line ART runs, and which to treat like acutely-infected runs.
If so desired, the user can append information about the strategy or the sensitivity analysis at the end of CEPAC
run names. As a general rule, though, this may not be worth the effort. The names of the folders containing any
run will always indicate which strategy and which sensitivity analysis it is simulating, and the full directory name
(containing this information) will always be given in H-DEF output.
Finally, it’s important to also be systematic when naming the strategy folders. As with runs, these should be
distinguished from one another using a two-digit number, followed by an underscore and a description of the
strategy. In the example analysis, the strategy folders could be named:
01_SOC
02_Enhanced treatment
The standard of care folder (that is, whichever strategy best represents the current state of treatment in the
modeled setting) should ALWAYS be given the number 01; the reason for this is described in the section on
running the H-DEF model. Other folders can be numbered however the user sees fit. You don’t strictly have to be
consistent with this numbering, but… why would you do that to yourself? Be consistent!
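As a minimal sketch, the naming conventions above could be checked before anything is loaded into H-DEF. The regular expression and the helper names (`has_numbered_prefix`, `check_strategy_folders`) are illustrative assumptions; they are not part of H-DEF or CEPAC.

```python
import re

# Assumed pattern: two digits, an underscore, then a non-empty description.
NAME_PATTERN = re.compile(r"^(\d{2})_(.+)$")


def has_numbered_prefix(name):
    """Return True if the name starts with a two-digit number and an underscore."""
    return NAME_PATTERN.match(name) is not None


def check_strategy_folders(folder_names):
    """Return a list of problems found in one scenario's strategy folder names."""
    problems = []
    for name in folder_names:
        if not has_numbered_prefix(name):
            problems.append(f"{name}: missing two-digit prefix and underscore")
    prefixes = [name[:2] for name in folder_names if has_numbered_prefix(name)]
    if "01" not in prefixes:
        problems.append("no folder numbered 01 (expected the standard of care)")
    if len(prefixes) != len(set(prefixes)):
        problems.append("two folders share the same number")
    return problems


run_names = ["01_chronicUndet_R1.in", "02_acute_R1.in",
             "03_1stLine_R1.in", "04_2ndLine_R1.in"]
print(all(has_numbered_prefix(n) for n in run_names))
print(check_strategy_folders(["01_SOC", "02_Enhanced treatment"]))
```

The check only confirms that a folder numbered 01 exists; whether that folder really holds the standard of care is something only the user can verify.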
Loading CEPAC output into H-DEF
After you’ve done your CEPAC runs and saved the output somewhere, you need to get that output into H-DEF.
To do this, first open up the H-DEF model and hit the button labeled “Edit CEPAC output identification”. A sub-menu will appear, which you can use to locate the output from your CEPAC runs, process those runs, and give
names to the individual runs within each folder.
First, you will need to specify what CEPAC model version you used, using the drop-down menu labeled “CEPAC
model version”. This is important because different model versions have different formats for their .out files. H-DEF needs to know what format it’s working with, so it knows where to look for the particular output it uses.
Next, you need to tell H-DEF where your CEPAC output is located on your hard drive/shared drive. To do this,
press the “Select CEPAC output location” button and select the folder containing all of the runs that you would
like to process. For example, let’s say that in the example above we wanted to process both s1 scenarios and the
base case. To do this, you would select the folder containing all of these scenarios (it’d probably be called “R1” or
“Runs” or something…). Next, in the input labeled “Runs per folder”, specify how many CEPAC runs are in each
strategy folder; in the example analysis, this would be 4.
At this point, you can press the button labeled “Find CEPAC output”. H-DEF will search through all of the
subdirectories of the folder you specified, and display the names of the run-containing folders (i.e. the lowest
folders) and output files in the grid at the bottom. If any folders do not contain the expected number of output
files, H-DEF will produce an error message.
Now is a good time to check that your output file names are consistent. The names will be sorted alphabetically,
so the run starting with “01” should always come first, followed by “02”, etc. Do all of the “01” runs have the
same name (e.g. 01_chronicUndet_R1.in)? Do all of the “02”, “03”, etc. runs have the same name? If not,
something has gone wrong, and you should look into your CEPAC runs to figure out what happened. If all the
runs do have the same names, well done!
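The consistency check described above can also be done outside H-DEF. The following is a rough sketch, assuming the run names for each strategy folder have already been collected into a dictionary; the function name `check_run_names` and the folder paths in the example are hypothetical.

```python
def check_run_names(runs_by_folder, runs_per_folder):
    """Compare the sorted run names in every folder against the first folder checked."""
    problems = []
    reference = None
    for folder, names in sorted(runs_by_folder.items()):
        ordered = sorted(names)  # alphabetical, so "01" comes first, then "02", etc.
        if len(ordered) != runs_per_folder:
            problems.append(
                f"{folder}: expected {runs_per_folder} runs, found {len(ordered)}")
            continue
        if reference is None:
            reference = ordered
        elif ordered != reference:
            problems.append(f"{folder}: run names differ from the first folder checked")
    return problems


expected = ["01_chronicUndet_R1.in", "02_acute_R1.in",
            "03_1stLine_R1.in", "04_2ndLine_R1.in"]
example = {
    "s0 BASECASE/01_SOC": list(expected),
    "s0 BASECASE/02_Enhanced treatment": list(expected),
}
print(check_run_names(example, runs_per_folder=4))
```

An empty list means every folder holds the expected number of runs and the names match across folders.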
Now that you’ve located the output for H-DEF, it needs to be processed – that is, H-DEF needs to extract all of
the relevant numbers from each CEPAC output file. Just to give a sense of what this means, H-DEF will need to
extract around 50,000 data points from each individual output file – so it can take a while. Before you start the
extraction process, the model will give you an estimate of just how long it’s going to take.
Depending on the state of your CEPAC output files, there are now three options that may be presented to you:
1. If all of the folders have the correct number of runs, and the model doesn’t find any pre-existing extracted
data, then you will see the message “NO DATA EXTRACTION FILE FOUND; SELECT “PROCESS
.OUT FILES” TO EXTRACT DATA”. If you hit the button, the model will begin processing all of the
.out files; when it’s done, the name of the file where it has stored the data will be displayed.
2. Alternately, it’s possible that a data extraction file already exists in the main folder you have selected. If
this extraction file matches the sub-folders and runs in your folder, then you will be given two options.
First you can reprocess the .out files, which will overwrite the existing data extraction file. Alternately,
you can just use the extraction file that is already in the folder, which will save you the time of having to
redo the extraction.
3. Finally, it’s possible that a data extraction file exists, but it doesn’t match the sub-folders and runs in your
folder. There’s no great reason why this should happen, but clearly something got messed up! You can
press the “Reprocess .out files” button to overwrite this old extraction file – it’s recommended that you
first take a look through your folders to try to understand what happened here.
While extracting the data from your output files, H-DEF will test to ensure that your runs were undiscounted. To
do this, it will compare two values: 1) the reported mean life-months, and 2) the mean life-months calculated by
summing the monthly survival output. If these are not identical, it indicates that a nonzero discount rate was used,
and H-DEF will produce an error message.
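The undiscounted-run test described above can be sketched as follows. This is an illustration rather than H-DEF's actual code: it assumes the monthly survival output is the proportion of the cohort alive in each month, and it uses a small numerical tolerance (an assumption; the text above says the values must be identical).

```python
import math


def looks_undiscounted(reported_mean_life_months, monthly_survival, tolerance=1e-6):
    """Compare reported mean life-months with the sum of monthly survival."""
    summed = sum(monthly_survival)
    return math.isclose(reported_mean_life_months, summed, abs_tol=tolerance)


# Hypothetical three-month cohort: everyone alive, then half, then a quarter.
survival = [1.0, 0.5, 0.25]
print(looks_undiscounted(1.75, survival))
print(looks_undiscounted(1.70, survival))
```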
Finally, this menu gives you the option to provide more meaningful names for the runs within each folder. That
is, if you’d like something more understandable than “01_chronicUndet_R1.in”, you can choose to label run 01
as “Chronically-infected, undetected”.
Specifying H-DEF inputs
H-DEF inputs are specified through a variety of sub-menus. The model’s main page will display a selection of
these inputs in various categories, along with a message indicating whether any errors have been found in the
inputs for each particular category. To edit the inputs in any category, simply click the button labeled “Edit
(category)”.
Transmission rates
The Transmission rates menu is where users specify inputs related to how frequently people with HIV transmit
the virus to others.
Transmissions / 100 person-years by viral load stratum
At the top, the user can specify transmission rates (expressed in transmissions/100 person-years with HIV) for the
seven viral load strata in CEPAC. Before being actually used in H-DEF, these numbers will be multiplied by a
calibration term that may increase or decrease all of them by some factor. Thus, their absolute values are less
important than their relative values.
Acute infection settings
Next, the user can specify the characteristics of acute infection. These inputs are intended to capture the fact that,
for a short period after the time of infection, HIV-infected individuals will have an extremely high viral load and
tend to be very infectious. The user can specify two factors related to this acute period: first, how long it lasts, in
months; and second, the multiplier applied to transmission rates during this time. For example, if a user specifies a
duration of 2 and multiplier of 10, then the transmission rates applied to patients in a given run will be 10-fold
higher during the first two months of that run. Note that this multiplier does not apply to individuals in the HVL
vLo bucket, as the only way to enter this bucket is to achieve suppression on ART.
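To make the duration and multiplier concrete, here is a small sketch of the example above (duration of 2, multiplier of 10). The function name and the rate values are hypothetical, months are assumed to be indexed from 0, and the sketch works on a single overall rate per month, so it ignores the HVL vLo exception.

```python
def acute_adjusted_rates(monthly_rates, duration_months, multiplier):
    """Multiply the rates in the first `duration_months` months by `multiplier`."""
    return [rate * multiplier if month < duration_months else rate
            for month, rate in enumerate(monthly_rates)]


# Hypothetical flat rate of 2 transmissions / 100 person-years in every month.
print(acute_adjusted_rates([2.0, 2.0, 2.0, 2.0], duration_months=2, multiplier=10))
```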
Transmission rate multiplier while failed on ART
On the right side, the user can specify a transmission rate multiplier for individuals who have failed ART, but
continue taking the drugs. This is intended to capture the fact that patients who remain on ART after failure tend
to maintain some degree of partial suppression; that is, while they no longer have an undetectable viral load, they
often do not rebound all the way to their set point viral load. Since CEPAC does not model this behavior, we use a
failed ART transmission rate multiplier below 1 to capture its effects on infectivity within H-DEF.
Transmission rate adjustments to individual runs
Finally, at the bottom, the user has the option to apply particular adjustments specifically to individual runs. First,
there are two inputs related to acute infection. For each run, the user can select yes/no for whether to apply the
acute infection transmission rate multiplier during the first few months of that run. The decision of whether to use
acute infection will depend on what state patients are in during the first few months of that run. Is the run intended
to simulate patients starting at the time of infection? If so, use the multiplier. If the run is intended to start at some
other state (e.g. on 2nd-line ART, chronically-infected but undiagnosed) then don’t use the multiplier.
Next, the user can specify whether to treat patients as in the “baseline acute” state at the start of an H-DEF
projection. The meaning of this term is a bit subtle, but hopefully the following diagram will clear things up!
Consider a scenario in which the acute infection phase lasts for three months. In the following diagram, the month
of HIV infection is denoted with an X and the acute phase with a blue arrow; the three groups of patients who are
in the acute phase at H-DEF model start are denoted with dark blue arrows:
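[Figure: timeline running from month -4 to month 3, with H-DEF model start at month 0. Each row marks a month of infection with an x, followed by its acute phase.]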
Of all the patients who start the model in the acute state, you can see that 100% of them are in the acute state in
month 0. In the next month, though, those who were infected in month -3 are no longer in the acute state; at this
point, two-thirds of those who were acutely-infected at baseline are still in the acute state (denoted by the orange
shading). In the next month, those who were infected in month -2 are no longer in the acute state; at this point,
only one-third are still acute. And finally, by month three none of the individuals who were acute at baseline are
still in the acute state.
So to simulate all of these “baseline acute” individuals with a single run, we’d need to take some of them out of
the acute state each month; the specific proportion we remove is 1/(the duration of acute infection); so for the
above example, we remove one-third of them each month.
Selecting Yes for “In baseline acute state at model start?” will tell H-DEF to do this – that is, it will remove 1/(the
duration of acute infection) of your patients from the acute state each month, until there are no more in the acute
state. So in general, this should be applied only to those individuals who are in the acute state in the first month of
your analysis (usually <5% of your initial HIV-infected population). Those who get infected in later months
typically will not be described as “baseline acutes”.
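The removal rule above can be written out as a short sketch. The function name is hypothetical, and exact fractions are used so the thirds in the example come out cleanly.

```python
from fractions import Fraction


def baseline_acute_fraction(month, acute_duration_months):
    """Fraction of the baseline acute group still acute in a given month.

    Month 0 is H-DEF model start; 1/(duration of acute infection) of the
    group leaves the acute state each month until none remain.
    """
    remaining = 1 - Fraction(month, acute_duration_months)
    return max(remaining, Fraction(0))


# Three-month acute phase, as in the example above.
print([str(baseline_acute_fraction(m, 3)) for m in range(5)])
# prints ['1', '2/3', '1/3', '0', '0']
```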
The last run-specific option in this menu is to “copy back” transmission rates. If you enter a month number
greater than 0 here, H-DEF will take the overall transmission rate calculated for that month of the run, and apply
it to all preceding months. Why would you want to do this? In some cases, the viral load distribution in the first
few months of a CEPAC run may not be exactly what you intend to simulate. For example, consider how you might
simulate a group of patients who are suppressed on first-line ART at the start of the H-DEF analysis. You’d like
to start your entire CEPAC cohort on first-line ART, and give it 100% efficacy; but even with this, it’s going to take
2-3 months before all of these individuals reach the suppressed viral load bucket. What you’re trying to model
though, is an entire cohort with suppressed viral load right from the get go. So to correct for this, what you can do
is just take the transmission rate calculated from month 3 (when everyone is suppressed) and apply it to the earlier
months.
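A minimal sketch of the “copy back” adjustment, assuming months are numbered from 0 so that month 3 is the fourth entry in the list. The function name and the rate values are hypothetical.

```python
def copy_back(monthly_rates, month):
    """Apply the rate from `month` to every preceding month."""
    if month <= 0:
        return list(monthly_rates)  # 0 means no copy-back is requested
    return [monthly_rates[month]] * month + list(monthly_rates[month:])


# Hypothetical rates that fall as the cohort becomes suppressed by month 3.
print(copy_back([5.0, 3.0, 1.0, 0.2, 0.2], month=3))
```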
Before moving on to other categories, it’s worth considering how these run-specific adjustments work together.
First, “baseline acute” should not be Yes unless “use acute multiplier” is also Yes; “baseline acute” is a
meaningless distinction if you’re not applying the acute phase multiplier. And second, you will likely not want to
“copy back” a transmission rate if you are also applying an acute phase multiplier – doing so could entirely
eliminate the increased transmission rate during acute infection.
Demography
The demography section allows the user to specify inputs related to how many uninfected individuals enter and
exit the model each month. The number of uninfecteds entering and exiting plays a large role in determining
population size, and thus most of the inputs in this section revolve around population size.
Year at model start
This input allows the user to specify what year (in real, calendar time) it should be when the H-DEF model is
initiated. Note that this input will not affect any of the model’s other calculations; it’s just used to provide a
calendar time in model output.
Calculate vs. specify susceptible entrants
The H-DEF model has two methods it can use to determine how many uninfected individuals should enter/exit
each month.
• One option (Specify susceptible entrants) is for the user to tell the model how many people should enter
  the susceptible pool in each month of the analysis. If you select this option, the “Desired susceptible
  entrants at end of month” table will become editable, and you can specify how many people should enter
  each month. Note that these numbers are intended to represent net entry/exit. That is, if you think 100
  susceptibles will die in month 10, and 150 will enter the population, then the net entry/exit is 50.
• The second option (Calculate susceptible entrants) is for the user to tell the model how big the population
  should be in each month, and let the model calculate how many entrants are required to achieve that
  population size. If you select this option, the “Desired population size at beginning of month” table will
  become editable, and you can enter the size you’d like your population to be at the beginning of each
  month of your analysis. For example, let’s say you think the population size should be 1,000 at the
  beginning of month 50. If there are 925 people (both with and without HIV) alive at the end of month 49,
  then the model will bring in 75 people.
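The arithmetic behind the two options can be sketched as follows. The function names are hypothetical, and the sketch does not say how H-DEF treats a negative result (a net exit), since that is not described here.

```python
def net_entry(entering, dying):
    """Net entry/exit for the 'Specify susceptible entrants' option."""
    return entering - dying


def entrants_needed(desired_population, alive_at_end_of_previous_month):
    """Entrants required for the 'Calculate susceptible entrants' option."""
    return desired_population - alive_at_end_of_previous_month


print(net_entry(entering=150, dying=100))   # prints 50
print(entrants_needed(1000, 925))           # prints 75
```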
Baseline population distribution
This sub-menu allows the user to specify how the HIV-infected population should be distributed among the
various runs in the first month of the H-DEF analysis. For example, if you specify an initial population size of
1,000 and an HIV prevalence of 20%, then there will be 200 people with HIV. If you specify that 50% of them
should be on first-line ART, then 100 people will enter the model on first-line ART, and their future outcomes
will be described by your “first-line ART” run. H-DEF will check that the proportions you enter here sum to 1; if
they do not, it will display an error message on the main menu.
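A short sketch of the example above, including the sum-to-1 check. The function name and the second run label are assumptions made for illustration; only the 1,000 population, 20% prevalence, and 50% on first-line ART come from the text.

```python
import math


def baseline_counts(population_size, hiv_prevalence, proportions):
    """Split the HIV-infected population among runs by baseline proportion."""
    total = sum(proportions.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"baseline proportions sum to {total}, not 1")
    infected = population_size * hiv_prevalence
    return {run: infected * share for run, share in proportions.items()}


counts = baseline_counts(1000, 0.20, {
    "1st-line ART": 0.5,
    "Chronic, undetected": 0.5,  # hypothetical remainder of the distribution
})
print(counts["1st-line ART"])
```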
Now is a good time to make an important distinction about these baseline states – they only refer to baseline! As
the H-DEF model runs, individuals who start off on first-line ART may transition to a variety of other states, such
as 2nd-line ART, or LTFU. The transitions between these states are governed entirely by how you set up your
CEPAC runs; there is no way within H-DEF for you to directly specify that, for example, 75% of your population
should be on 2nd-line ART after 5 years.
To belabor the point, consider again those 100 people who enter the model on 1st-line ART. Their future outcomes
are determined by a CEPAC run you’ve done simulating the outcomes of a cohort starting on 1st-line ART. And if
you start a CEPAC run with one million patients on 1st-line ART, they won’t all still be on 1st-line ART a year
later: perhaps 5% will have switched to 2nd-line, 10% will be lost to follow-up, and 5% will have died. And these
exact same outcomes will occur to the 100 people who start out on 1st-line ART in H-DEF: after 1 year, 5 will be
on 2nd-line ART, 10 will be lost, and 5 will be dead. But the 5 people who have switched to 2nd-line ART are still
being described by the same run they started in. They do not “jump” to your 2nd-line ART run; that run was done
solely to describe people who are on 2nd-line ART at baseline.
Economic parameters
This sub-menu allows the user to specify various parameters related to the valuation of costs and benefits
specifically within the H-DEF model.
HIV test costs
For undiagnosed HIV-infected individuals, CEPAC has a very sophisticated testing module that can be used to
describe how frequently they get HIV tests done, how long it takes them to enter care, etc. However, when you do
a run of 1,000,000 undiagnosed HIV-infected persons, you’re missing one thing that’s crucial to the valuation of
HIV testing programs: people without HIV! When using H-DEF, HIV-uninfected people should not be included
in CEPAC runs; they will be modeled entirely within H-DEF. Because of this, any HIV-related costs or QoL that
they accrue must be simulated within H-DEF.
In this section, the user can first specify the monthly probability that an uninfected person will get an HIV test. As
a general rule, this should be the same as the monthly testing probability within CEPAC. If you want to simulate a
“targeted” testing program, it’s possible that you may want this probability to be lower than that which you use in
CEPAC (that is, based on some marker such as age, risk group, etc. we may be able to test people who are likely
to actually have HIV more frequently than those who are likely not to have HIV).
Next, the user can specify the cost of an HIV test. This cost should capture all the costs related to a single testing
event in CEPAC (e.g. if confirmatory tests for false-positives are costed separately within CEPAC, then those
costs should be lumped into this single input in H-DEF). This input is simply multiplied by the monthly test
probability to determine an overall per-person monthly test cost for the HIV-uninfected population; this value is
multiplied by the susceptible population size to get the total cost of HIV testing in that group.
Finally, the user can specify a QoL multiplier that should be applied in the month of an HIV test. This is intended
to reflect any stress caused to HIV-negative individuals by waiting for their test results, receiving false-positive
results, etc. In each month then, the average QoL multiplier due to HIV testing is:
(1 – test probability)*1 + (test probability) * (test QoL)
Background costs
While the HIV-related healthcare costs of an uninfected person tend to be quite low, their overall healthcare costs
will not be zero. For this reason, the user can specify a monthly background cost that will be applied to all
individuals in the model each month; this cost can be specified separately for infected and susceptible individuals.
The choice of value for the “background cost” input will depend strongly on the perspective of your cost-effectiveness analysis. If you are taking the perspective of an organization that finances only HIV-related medical
care, then a person without HIV is essentially free from your perspective; the susceptible “background cost” could
potentially be zero. On the other hand, if you are taking a societal perspective or taking the perspective of a
national single payer-type organization, then people without HIV are most certainly not free! You’d want to find
some general estimate of average healthcare costs to use in your analysis. There are lots of options for finding this
data, but the World Bank might be a good place to start – for example, you can find data on per capita health
expenditure per year for most countries in the world over the past 15 years.
Other parameters
Finally, there are two additional economic parameters. First is the average QoL of a susceptible individual. This is
a multiplier that is applied to all susceptible individuals each month. This input is intended to capture the fact that
just because a person does not have HIV does not mean her health-related QoL is 1; to assume that susceptibles
have a QoL of 1 would overstate the benefits of preventing HIV transmissions (i.e. keeping people susceptible).
Second is the annual discount rate. This discount rate is applied to both costs and QALYs using the same formula
as is used in CEPAC. Of note, H-DEF will provide discounted output for all economic outcomes; for the main
outcomes (total costs and QALYs) it will also automatically provide undiscounted output.
Saving/importing H-DEF inputs
To simplify the process of storing and re-using H-DEF inputs, the user can save a file that records the values of all
inputs in H-DEF (e.g. transmission rates, demography, etc.). After clicking the “Save model inputs” button (or
selecting “Save model inputs” from the File menu) the user will be given the option to name their inputs file and
select where to save it.
At a later time, the user can select this saved file using the “Import model inputs” button (or File menu). Selecting
a file will populate all of the H-DEF inputs with the values in the saved file.
Running H-DEF and saving model outputs
To be updated.
Strategy-sets
Understanding H-DEF results
Though the preceding pages may suggest otherwise, H-DEF is about the simplest possible model for simulating
an HIV epidemic. Still, this doesn’t mean the results you get from H-DEF will always be intuitive! To fully
comprehend H-DEF’s results, you’ll need to understand a variety of different (and sometimes competing)
processes that go on within the model. This section will describe a few of them, and provide some tips for
synthesizing your understanding of all of the processes:
1. Epidemic trends
2. The population’s baseline state
3. Transmission rates
4. Heterogeneous mixing
5. Long-term treatment outcomes within CEPAC
Epidemic trends
At its heart, H-DEF is an S-I model. That is, it simulates HIV transmission by modeling the rate at which people
move from being in a susceptible state (S) to an infected state (I). So to begin to understand H-DEF, it’s useful to
first gain an understanding of an even simpler S-I model.
Formulating an S-I model
In the following section, we will denote by S the proportion of a population that does not have HIV, and we will
denote by I that proportion that is HIV-infected. So based on this framework, S + I should always be equal to 1.
With this simple setup, we can already calculate the prevalence of HIV: it’s I/(S + I), or just I. And with just S
and I, calculating the prevalence is about all we can do! To make things more interesting, let’s introduce another
term: the frequency with which susceptible individuals become infected. We will denote by ri the rate at which
susceptible people become infected. For example, let’s say ri is set to 0.01/month. If 90% of the population is in S
initially, what proportion of the population will transition to I in the first month? To calculate this, simply multiply S by ri (see footnote 1):
$$\text{proportion transitioning from } S \text{ to } I \text{ in first month} = S \times r_i = 90\% \times 0.01/\text{month} = 0.9\%$$
If 0.9% of the population transitions from S to I, we are left with 89.1% in S. Applying the same equation in the following
month gives:
$$\text{proportion transitioning from } S \text{ to } I \text{ in second month} = S \times r_i = 89.1\% \times 0.01/\text{month} = 0.891\%$$
If we continue applying this equation over the following months, we’ll see that fewer and fewer people transition
in each successive month. But as time goes on, nearly everyone gets infected:
Footnote 1: In truth, simply multiplying these values is not quite correct, due to the distinction between rates and probabilities. But for
the purposes of this exercise, this minor inaccuracy is irrelevant.
It doesn’t matter how small ri is: eventually nearly everyone will get infected. But this clearly isn’t what we see in
reality, so we must be missing something in our equations!
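If you'd rather not open a spreadsheet, the month-by-month arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simple S × ri update described in the text; the function name and the 600-month horizon are assumptions made for the example, not part of H-DEF.

```python
def simulate_si_no_death(s0=0.90, ri=0.01, months=600):
    """Return the susceptible proportion S at the start of each month.

    Each month, S * ri of the population moves from S to I (the simple
    multiplication used in the text; nobody ever leaves I).
    """
    s_values = [s0]
    for _ in range(months):
        new_infections = s_values[-1] * ri
        s_values.append(s_values[-1] - new_infections)
    return s_values


s = simulate_si_no_death()
print(f"infected in month 1: {s[0] - s[1]:.3%}")
print(f"infected in month 2: {s[1] - s[2]:.3%}")
print(f"prevalence after 600 months: {1 - s[-1]:.1%}")
```

Try a smaller ri with a longer horizon: the climb is slower, but S still drains toward zero.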
What we haven’t accounted for yet is this: people with HIV don’t live forever. Eventually HIV-infected
individuals die, and new susceptible individuals are born. So how can we add this into our equations? Right now,
here’s what our model looks like:
S --(ri)--> I
Let’s assume that HIV-infected people die at a rate rd. And further, we’ll assume that every time an HIV-infected
person dies, they are immediately replaced by a susceptible person being born. How could we add this assumption
to the visual depiction above?
S --(ri)--> I
I --(rd)--> S
rd works just like ri: to calculate how many people will leave I (by dying) and how many will enter S (by being
born), simply multiply rd by I.
Now, witness the firepower of this fully armed and operational S-I model! As before, we’ll set S to 90% initially
(that is, the prevalence is 10%), we’ll set ri to 0.01/month, and we’ll also set rd to 0.01/month. What happens?
Now that we’ve added a way for people with HIV to die, we see that no longer does everyone eventually get
infected. Instead, the model trends toward a stable equilibrium at some intermediate prevalence – in this case, that
equilibrium prevalence is 50%.
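Here is a minimal Python sketch of the same model with deaths added, assuming the monthly update described above (S × ri enter I, I × rd leave I and are replaced by births into S). The function name and the 600-month horizon are illustrative choices, not taken from the spreadsheet.

```python
def simulate_si(i0=0.10, ri=0.01, rd=0.01, months=600):
    """Return HIV prevalence (I) at the start of each month."""
    prevalence = [i0]
    for _ in range(months):
        i = prevalence[-1]
        s = 1.0 - i                  # S + I always equals 1
        entering = s * ri            # new infections
        leaving = i * rd             # deaths, replaced by susceptible births
        prevalence.append(i + entering - leaving)
    return prevalence


prevalence = simulate_si()
for month in (0, 12, 60, 120, 600):
    print(f"month {month:3d}: prevalence {prevalence[month]:.1%}")
```

With these inputs the printed prevalence climbs from 10% and settles at 50%, as in the text.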
Where is equilibrium?
The obvious next question is: why 50%? To figure this out, let’s try playing with some of the parameters in the
model. To do this yourself, open the file “Simple SI model ELR 12-20-13”, in the references folder.
Have you opened the file? You should open the file. This guide will refer to it repeatedly over the following
pages, so you may as well start now!
First, let’s see how the initial prevalence affects the equilibrium. So rather than setting I to 10% initially, let’s set
it to 80%:
The model still gets to an equilibrium of 50% prevalence. So our first conclusion is: it looks like the initial
prevalence doesn’t affect the equilibrium prevalence. What about ri and rd? Let’s start by doubling ri:
That changed things! Now the equilibrium prevalence is around 67%. So the second conclusion we can reach is: a
greater infection rate leads to greater equilibrium prevalence.
Now, what if we instead double rd?
Now, the equilibrium prevalence is around 33%. So the third conclusion we can draw is: a greater death rate leads
to a lower equilibrium prevalence.
The table below summarizes these trends:
Parameter                        Change in parameter    Effect on equilibrium prevalence
Initial prevalence               ↑                      no change
Initial prevalence               ↓                      no change
Infection rate                   ↑                      ↑
Infection rate                   ↓                      ↓
Death rate of infected people    ↑                      ↓
Death rate of infected people    ↓                      ↑
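The experiments behind this table can also be reproduced outside Excel. The sketch below reruns the four scenarios from the text; the function name and the long 2,400-month horizon (chosen only so each run has clearly settled) are assumptions of this example.

```python
def equilibrium_prevalence(i0, ri, rd, months=2400):
    """Iterate the S-I model with deaths and return the final prevalence."""
    i = i0
    for _ in range(months):
        s = 1.0 - i
        i = i + s * ri - i * rd
    return i


# (initial prevalence, ri per month, rd per month)
scenarios = {
    "base case": (0.10, 0.01, 0.01),
    "initial prevalence 80%": (0.80, 0.01, 0.01),
    "ri doubled": (0.10, 0.02, 0.01),
    "rd doubled": (0.10, 0.01, 0.02),
}
results = {name: equilibrium_prevalence(*p) for name, p in scenarios.items()}
for name, value in results.items():
    print(f"{name}: {value:.0%}")
```

The four runs settle at 50%, 50%, about 67%, and about 33%, matching the figures quoted above.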
Understanding these trends is critical to understanding the epidemic projections of dynamic transmission models.
But we can go even further than understanding these trends: we can quantify precisely how each parameter affects
equilibrium prevalence.
To do this, let’s first consider what exactly equilibrium means. Although HIV prevalence has stabilized at
equilibrium, equilibrium does not mean that the processes of infection and death have been “turned off”. These
processes are still ongoing – but equilibrium means that they have precisely balanced each other out.
Let’s think about this from the perspective of I. In each month, S × ri individuals are going to transition from S to
I; at the same time, I × rd individuals are going to transition from I to S (as we’ve formulated it, this is actually a
combination of deaths and births, but the math is the same). If S × ri > I × rd, then more people are going to enter
I than leave, and I will increase. If S × ri < I × rd then more people are going to leave I than enter, and I will
decrease. But when S × ri = I × rd, something special happens: exactly the same number of people are going to
enter and leave I, and I will be unchanged. To put it in other words, the prevalence of HIV will be unchanging –
that is, we’ll be at equilibrium!
What is the prevalence at this equilibrium? To find out, we can rearrange the above equation (we’ll use Seq and Ieq
to denote the proportion of susceptibles and infected at equilibrium):
$$\text{prevalence at equilibrium} = I_{eq} = \frac{S_{eq} \times r_i}{r_d}$$
From this equation, you can see all of the trends in the above table – think about what happens to Ieq as you
change ri or rd. And note that the initial values of S and I aren’t in the equation at all – they don’t affect
equilibrium.
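The equation still has Seq on its right-hand side. As a short worked step (not shown in the original spreadsheet), substituting Seq = 1 − Ieq, which follows from S + I = 1, gives the equilibrium prevalence directly in terms of the two rates:

```latex
\begin{aligned}
I_{eq} \times r_d &= (1 - I_{eq}) \times r_i \\
I_{eq} \times (r_i + r_d) &= r_i \\
I_{eq} &= \frac{r_i}{r_i + r_d}
\end{aligned}
```

Plugging in the examples above: ri = rd = 0.01 gives 0.01/0.02 = 50%; doubling ri gives 0.02/0.03 ≈ 67%; doubling rd gives 0.01/0.03 ≈ 33%.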
Prevalence, incidence, and duration
We can also use the above equation to derive a common epidemiologic rule of thumb: that is, “prevalence equals
incidence times duration.” Prevalence is easy to find in the above equation – it’s Ieq (and Seq is just one minus the
equilibrium prevalence). Incidence is just another word for infection rate, or ri. But what about duration?
Duration is used to mean the length of time that someone has an infectious disease. Since there’s no cure for HIV,
duration is equivalent to life expectancy with the disease – the only way to no longer have HIV is to die.
So what is life expectancy with HIV in our model? We know the death rate, rd – using this parameter, it can be
shown that the mean survival is 1/rd. And by rearranging, rd is just 1/(mean survival).
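One way to see where 1/rd comes from, sketched under the assumption that rd acts as a constant continuous-time death rate (an idealization; the monthly spreadsheet only approximates it):

```latex
\begin{aligned}
P(\text{still alive at time } t) &= e^{-r_d t} \\
\text{mean survival} &= \int_0^{\infty} e^{-r_d t} \, dt = \frac{1}{r_d}
\end{aligned}
```

With rd = 0.01/month, for example, mean survival is 100 months.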
Substituting all of these into the above equation, we get:
$$\text{prevalence} = \frac{(1 - \text{prevalence}) \times \text{incidence}}{1/\text{duration}} = (1 - \text{prevalence}) \times \text{incidence} \times \text{duration}$$
$$\frac{\text{prevalence}}{1 - \text{prevalence}} = \text{incidence} \times \text{duration}$$
So you may have noticed something above: it’s not really “prevalence equals incidence times duration”. In fact,
it’s “prevalence over one minus prevalence equals incidence times duration”; if you’re feeling pedantic, bring this
up whenever someone mentions the more common simplified rule of thumb – I’ve found it to be a good way of
making friends.
But in truth, the precise formula isn’t too important – it’s more important just to understand the relationships that
it highlights. For example, let’s say that giving HIV-infected people ART cuts HIV incidence in half, while
doubling the life expectancy of people with HIV. What will happen to prevalence? Based on the above equation,
we’d guess that prevalence is going to stay fairly steady.
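To check that guess with numbers, here is a small Python sketch that solves the relationship above for prevalence. The input values (an incidence of 0.01/month and a duration of 100 months) are hypothetical, chosen only to match the base-case rates used earlier.

```python
def prevalence_from(incidence, duration):
    """Solve prevalence / (1 - prevalence) = incidence * duration."""
    odds = incidence * duration
    return odds / (1.0 + odds)


# Hypothetical pre-ART values: 0.01 infections/month, 100 months of survival.
before_art = prevalence_from(incidence=0.01, duration=100)
# ART halves incidence and doubles life expectancy with HIV.
after_art = prevalence_from(incidence=0.005, duration=200)

print(f"before: {before_art:.1%}, after: {after_art:.1%}")
```

Because the product incidence × duration is unchanged, the equilibrium prevalence is unchanged too.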
If you’d like to learn more about this topic, there are two interesting papers in the references folder. The paper by
Freeman presents some more of the math relating incidence, prevalence, and duration, and the paper by White
uses estimates of these to come to some interesting conclusions about the HIV epidemic in the UK.
The baseline state of the population
You might think that at this point we’ve exhausted everything there is to learn from the simple S-I model above –
but epidemics are complex, and there’s still a lot more detail to dive into. So in this section, we’ll consider in a bit
more detail something we glossed over above: the population’s baseline state.
First, let’s turn to our “base case” epidemic from above: an initial prevalence of 10%, an infection rate of
0.01/month, and a death rate of 0.01/month. But this time, let’s look at how many people get infected each month:
Initially, we’re seeing about 0.009 infections per month. But as the epidemic progresses and HIV prevalence rises
toward equilibrium, the monthly number of infections declines, eventually leveling off around 0.005 infections
per month. If you think back to the equation we use to calculate the number of infections, this makes perfect
sense! As prevalence goes up, S goes down, and so S × ri (i.e. the monthly number of infections) also goes down.
If you were the health minister of Modelville, you might look at the blue curve above and think, “I’m doing a
great job! We’ve reduced the monthly number of new HIV infections in Modelville by almost 50%!” But you’d
be wrong – in truth, you’re just seeing the natural leveling off of the number of infections that can occur as an
epidemic reaches its equilibrium prevalence. It turns out you’ve actually been a wholly ineffectual health minister
– the infection rate in Modelville (ri) is unchanged.
Now, what if the population had started out at 50% prevalence (that is, at equilibrium)?
In this case, the monthly number of new infections would be unchanging over time. The important lesson here is
this: sometimes a change in HIV prevalence or the number of new HIV infections represents the impact (positive
or negative) of treatment policy or behavior change; but sometimes it’s just an epidemic playing out its natural
course.
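The two cases in this section can be compared side by side with a short Python sketch. It assumes the same monthly update as the simple S-I model above; the function name and 600-month horizon are illustrative.

```python
def monthly_new_infections(i0, ri=0.01, rd=0.01, months=600):
    """Return S * ri (new infections, as a share of the population) per month."""
    i = i0
    infections = []
    for _ in range(months):
        s = 1.0 - i
        new = s * ri
        infections.append(new)
        i = i + new - i * rd
    return infections


from_low = monthly_new_infections(i0=0.10)          # base case
from_equilibrium = monthly_new_infections(i0=0.50)  # already at equilibrium

print(f"start at 10%: {from_low[0]:.4f} -> {from_low[-1]:.4f}")
print(f"start at 50%: {from_equilibrium[0]:.4f} -> {from_equilibrium[-1]:.4f}")
```

In both runs ri is fixed at 0.01/month; only the starting point differs, which is exactly why the decline in the first run says nothing about policy.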
Transmission rates
If you spent any time playing with the simple S-I model in Excel, you may have noticed something strange: if you
set initial prevalence to 0, you still see lots of new people becoming infected. Which seems rather unrealistic –
after all, if prevalence is 0, then who is transmitting HIV to all the uninfected people?
This strange result highlights an issue with our simple S-I model: the infection rate is totally independent of the
number of people with HIV. To fix this, we’ll introduce one new wrinkle to our model: rather than thinking in
terms of infection rates (i.e. how frequently susceptibles acquire HIV) we’ll think in terms of transmission rates
(i.e. how frequently the infected people transmit HIV to others).
In this formulation, we will denote by rt the rate at which an infected person transmits HIV to susceptibles. By
multiplying this value by I, we can arrive at the overall transmission rate for the entire population of infected
people. For example, if I is 10% and rt is 0.02 transmissions/month, then the total transmission rate is
0.2%/month.
To determine the actual number of transmissions, we’ll then multiply this rate by the prevalence of susceptibles,
S. Why? We’re essentially saying, if infected people had sex exclusively with susceptible people, then we’d see
0.2% of our population acquire HIV in this month. But infected people aren’t going to have sex exclusively with
susceptibles – the likelihood that a randomly chosen sexual partner will be susceptible is S, or one minus the
prevalence. In the above example, this is 90%, so we expect to see 90% × 0.2% = 0.18% of the population acquire
HIV in the first month. Comparing this to our simpler model above, we have essentially replaced ri with I × rt:
S --(I × rt)--> I
I --(rd)--> S
So, let’s take this bad boy for a spin! First, consider what happens if I is set to zero; in this case I × rt will also be
zero, and no one will ever get infected – which is just the behavior we wanted to see.
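As a quick sanity check of the new formulation, the Python sketch below evaluates S × I × rt for the worked example in the text and for a population with no infected people. The function name is an assumption made for this example.

```python
def new_infections(s, i, rt):
    """New infections in one month, as a share of the population: S * I * rt."""
    return s * i * rt


worked_example = new_infections(s=0.90, i=0.10, rt=0.02)
no_one_infected = new_infections(s=1.00, i=0.00, rt=0.02)

print(f"S=90%, I=10%, rt=0.02: {worked_example:.2%} of the population")
print(f"S=100%, I=0%, rt=0.02: {no_one_infected:.2%} of the population")
```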
With this setup, we can now produce a realistic depiction of an epidemic’s growth from a low prevalence. Here’s
what happens when I is initially set to 1%, rt is 0.06/month, and rd is 0.01/month:
First, let’s consider the red curve – this is the characteristic S-shaped curve of epidemic growth. Prevalence
grows slowly for a while, accelerates rapidly in the middle, and then levels off towards some equilibrium value.
The blue curve provides the explanation for why we see the characteristic S-shape in the prevalence curve.
Initially, when prevalence is low, there are so few people with HIV that you don’t see very many new infections
happening. But the prevalence does gradually rise – and as it rises, there are now enough people with HIV to
begin transmitting to the susceptibles more rapidly, so the monthly number of infections begins to climb. But as
prevalence continues to increase, a new roadblock emerges – eventually, there aren’t enough susceptibles left to
get infected, and the number of new infections starts to fall again, leveling off as prevalence reaches its
equilibrium.
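The rise-and-fall pattern described here can be reproduced with a small Python sketch of the transmission-rate model. The function name and the 240-month horizon are assumptions of this example. The equilibrium it is compared against follows from the same balancing argument as before: S × I × rt = I × rd implies Seq = rd/rt, so Ieq = 1 − rd/rt.

```python
def simulate_transmission(i0=0.01, rt=0.06, rd=0.01, months=240):
    """Return (prevalence by month, new infections by month)."""
    prevalence = [i0]
    infections = []
    for _ in range(months):
        i = prevalence[-1]
        s = 1.0 - i
        new = s * i * rt             # infections now depend on I as well as S
        infections.append(new)
        prevalence.append(i + new - i * rd)
    return prevalence, infections


prevalence, infections = simulate_transmission()
peak_month = infections.index(max(infections))

print(f"new infections peak in month {peak_month}")
print(f"prevalence at month 240: {prevalence[-1]:.1%}")
print(f"1 - rd/rt = {1 - 0.01 / 0.06:.1%}")
```

The peak in new infections falls in the middle of the run, where neither infected nor susceptible people are scarce.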
Infections, incidence, and incidence-to-prevalence-ratio
Rather than looking at the absolute number of new infections, we can also look at the infection rate (a.k.a. the
incidence rate, as described above). We no longer have a term in our model called the infection rate, but we can
calculate what this term would be each month by dividing the number of new infections by the number of
susceptibles. Here’s what the infection rate looks like over time:
Let’s think back to your role as the naïve, ineffectual health minister of Modelville. If you were hired in month
96, you might look at the ongoing trend in the number of new infections, and think to yourself, “Starting from the
day of my inauguration, we’ve seen almost a 50% decline in the number of new infections in Modelville – I
deserve a raise! And a medal! Or maybe a trophy…”
Alternately, you might look at the HIV incidence rate (i.e. the infection rate), and think to yourself, “Starting from
my inauguration, there’s been a 70% increase in the HIV incidence rate – I should resign.”
In truth, neither conclusion is quite right. Let’s take a look at our model again, and think about which parameters
you – as health minister – might be able to impact:
S --(I × rt)--> I
I --(rd)--> S
As health minister, you can’t directly affect S or I – these will change based on changes in how frequently people
acquire HIV, and die from HIV. But by getting more people treated, or by promoting safer sex, you might be able
to affect rt and rd. Since we’re thinking about transmissions here, let’s focus on rt.
Neither of the two metrics above isolates rt: the number of new transmissions is S×I×rt, and the incidence rate (or
infection rate) is the number of new transmissions divided by S – that is, I×rt. So neither of these is quite right.
But you can’t just go out and measure rt – instead, we’ll have to take some of the things we can measure
(incidence, prevalence, new infections) and use them to calculate rt.
Fortunately, this is pretty easy! As shown above, the incidence rate can be calculated as I×rt; so to get rt alone, all
we have to do is divide by I (or the prevalence). This gives a new metric known as the incidence to prevalence
ratio, or IPR. Let’s see what happens when we calculate the IPR over time for the epidemic above:
IPR tells the true story: as health minister, you haven’t done anything to curb the epidemic, but you haven’t made
things worse either. In fact, you’ve had no impact at all.
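The three metrics can be computed together in a short Python sketch of the same epidemic (I starting at 1%, rt of 0.06/month, rd of 0.01/month). The function name, the tuple layout, and the 240-month horizon are assumptions of this example.

```python
def simulate_metrics(i0=0.01, rt=0.06, rd=0.01, months=240):
    """Return one (new infections, incidence rate, IPR) tuple per month."""
    i = i0
    rows = []
    for _ in range(months):
        s = 1.0 - i
        new = s * i * rt
        incidence = new / s          # new infections per susceptible
        ipr = incidence / i          # incidence-to-prevalence ratio
        rows.append((new, incidence, ipr))
        i = i + new - i * rd
    return rows


rows = simulate_metrics()
for month in (0, 96, 239):
    new, incidence, ipr = rows[month]
    print(f"month {month:3d}: new={new:.4f} incidence={incidence:.4f} IPR={ipr:.4f}")
```

New infections and incidence move in opposite directions after month 96, while the IPR column stays at rt throughout.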
Heterogeneous mixing
The model described in the preceding section is nice, and it captures many of the interesting features of epidemic
growth. There’s (at least!) one rather concerning assumption embedded in it though. When constructing our
formula for new infections, we made the following justification for including S in the formula: “the likelihood
that a randomly chosen sexual partner will be susceptible is S, or one minus the prevalence.” This may sound
alright, but note those two words in the middle – randomly chosen! Is this realistic? Well, we all certainly like to
think that we consider lots of factors when choosing a partner. So perhaps this isn’t the best assumption.
When modeling HIV transmission, the inaccuracy of randomly selected sexual partners becomes quite obvious. In
the US, for example, a large proportion of individuals with HIV are men who have sex with men (MSMs) and
injecting drug users (IDUs), with relatively fewer heterosexuals. Let’s assume 5% of the population is MSM.
Following the random selection assumption, we would be assuming that a given MSM has a 95% chance of
choosing a partner who is not an MSM! At the same time, a given heterosexual or IDU would have the same
chances of choosing a partner who is not an MSM – that is, the three groups wouldn’t actually be different!
So random partner selection clearly doesn’t make sense. Instead, we could try modeling the three groups as
completely separate from one another – the “Heterogeneous mixing” tab of the spreadsheet implements just this
idea.
This time, the same exact model used in the previous section is implemented for three different sub-groups. For
each of the sub-groups, you can specify how big it is (i.e. the proportion of the total population that it comprises),
the initial prevalence in the sub-group, and rt and rd for that sub-group. This model is initially set up to have a
small, very high-risk sub-group (Sub-group A, with rt of 0.16), an even smaller, moderate-risk sub-group
(Sub-group B, with rt of 0.04), and a large, low-risk sub-group (Sub-group C, with rt of 0.011).
You can see in the graphs how prevalence and monthly new infections vary over time for each of these
sub-groups. For A, there’s a very high number of new infections, and prevalence quickly rises to an equilibrium
above 90%; for B, prevalence rises more slowly, and reaches around 70% by month 240; and for C, prevalence and
new infections both remain extremely low over 240 months. Neither of the latter two groups reaches equilibrium
by month 240.
When we look at these trends for the total population, things get more interesting. Akin to A (the high-risk group),
you can see that prevalence rises rapidly in the first few years, but levels out after around 5 years. But unlike A,
the long-term prevalence does not approach 90%; instead, it levels off somewhere below 20%, much closer to C
than either A or B.
On a qualitative level, this is the most important effect of heterogeneous mixing: if there are high-risk subgroups,
prevalence can rise quickly initially, but it will level off at a lower prevalence than might be expected based on
the speed of the initial rise.
In H-DEF, the effect of heterogeneous mixing is simulated using the parameter J. A greater value of J simulates
an epidemic with more unevenly spread risk. A lower value of J simulates an epidemic that is closer to random
mixing.
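To get a feel for the separate-sub-groups idea without the spreadsheet, here is a rough Python sketch. The rt values come from the text; the sub-group sizes (10%, 5%, 85%), the initial prevalence of 1%, and the shared rd of 0.01/month are hypothetical placeholders, since the spreadsheet's actual settings are not listed in this guide. It does not use or represent H-DEF's J parameter.

```python
def prevalence_at(months, i0, rt, rd):
    """Run the transmission-rate model for one isolated sub-group."""
    i = i0
    for _ in range(months):
        i = i + (1.0 - i) * i * rt - i * rd
    return i


# name: (share of total population, rt). Shares are hypothetical.
subgroups = {"A": (0.10, 0.16), "B": (0.05, 0.04), "C": (0.85, 0.011)}
RD = 0.01   # assumed death rate per month, same for every sub-group
I0 = 0.01   # assumed initial prevalence in every sub-group

by_group = {name: prevalence_at(240, I0, rt, RD)
            for name, (_, rt) in subgroups.items()}
# Total prevalence is the size-weighted average of the sub-group prevalences.
total = sum(share * by_group[name] for name, (share, _) in subgroups.items())

for name, value in by_group.items():
    print(f"sub-group {name}: {value:.1%} at month 240")
print(f"total population: {total:.1%} at month 240")
```

Under these assumed sizes, the total is dominated by the large low-risk group, so it ends up far below the prevalence in A even though A drives the early rise.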
Long-term treatment outcomes within H-DEF
In truth, H-DEF is a rather boring model. It contains only a few parameters of its own, and it can’t make any
projections without using results derived from CEPAC – mostly, it is a tool used to extrapolate from CEPAC
results. Given this, the user should understand that the most important part of any H-DEF analysis is the CEPAC
runs – and when seeking to understand H-DEF’s epidemic projections, the user should always start with CEPAC.
This section will provide a few tips for analyzing and understanding the results of CEPAC runs, and the next
section will tie together CEPAC runs with all of the trends seen in the earlier epidemic modeling sections.
CEPAC equilibrium
Earlier sections described equilibrium prevalence as the point where new infections are balanced out by deaths of
persons with HIV. But when this point is reached, prevalence isn’t the only factor that is at equilibrium – all the
various states within CEPAC (on/off ART, diagnosed/undiagnosed, CD4 above/below 500, etc.) also reach an
equilibrium. When equilibrium prevalence is reached, these other states also reach a point of balance; for
example, at equilibrium, the number of people starting ART each month will be equal to the number stopping
ART. And as before, there’s an easy shortcut to calculate what each of these equilibria will be!
To determine the proportion of HIV-infected persons who will be on ART at equilibrium, you don’t need to use
H-DEF at all. Instead, open the CEPAC run you have done which started at the time of infection. Sum the total
number of person-months that were spent on ART, as well as the total number that were spent off ART – there are
a few ways to find these in the output file. (As a check, if you add up the on-ART and off-ART person-months,
and divide by your initial cohort size, you should get the cohort’s undiscounted life expectancy.)
The equilibrium proportion of your population on ART will be equal to the proportion of all person-months that
were spent on ART. For example, if 200 million PMs were spent on ART, and 100 million PMs were spent off
ART, then 67% of HIV-infected people will be on ART at equilibrium.
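The arithmetic is simple enough to do by hand, but here it is as a Python sketch, including the life-expectancy check mentioned above. The person-month totals are the ones from the example; the cohort size of 1,000,000 is a hypothetical value, and the function name is an assumption of this example, not part of CEPAC or H-DEF.

```python
def equilibrium_share(pm_in_state, pm_out_of_state):
    """Share of person-months spent in a state = its equilibrium proportion."""
    return pm_in_state / (pm_in_state + pm_out_of_state)


on_art_pm = 200_000_000    # person-months on ART, from the example above
off_art_pm = 100_000_000   # person-months off ART, from the example above
share_on_art = equilibrium_share(on_art_pm, off_art_pm)

cohort_size = 1_000_000    # hypothetical initial cohort size
life_expectancy_months = (on_art_pm + off_art_pm) / cohort_size

print(f"on ART at equilibrium: {share_on_art:.0%}")
print(f"undiscounted life expectancy check: {life_expectancy_months:.0f} months")
```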
This type of calculation works just as well for other states – you can use it to calculate the proportion suppressed
at equilibrium, diagnosed at equilibrium, and just about any other possible state within CEPAC. This is
tremendously useful for two main reasons:
1. It shows you where your epidemic is headed. If you calculate that the equilibrium ART use will be
67%, then regardless of what your baseline ART use is, you know that it’s going to trend toward 67%.
And to figure this out, you don’t have to do multiple CEPAC runs, figure out all the parameter values for
H-DEF, or run H-DEF at all; all you have to do is analyze a single CEPAC run.
2. It helps you equilibrate your model. In some analyses, you may want to simulate an epidemic that is
near equilibrium from the start of the analysis. By analyzing a single CEPAC run, you can get a very
detailed picture of just what that equilibrium will need to look like.
For example, by analyzing your CEPAC run, you might determine that, at equilibrium, 80% of patients
will be on ART, 10% will be LTFU, 5% will be unlinked, and 5% will be undiagnosed. However,
you may have some data showing that today only 50% of patients are on ART. There are a few
conclusions you could potentially draw from this:
a. Perhaps the epidemic is at equilibrium today, and more than 80% of patients are actually on ART
– the data showing 50% was wrong, or was applied too broadly.
b. Perhaps the epidemic is at equilibrium today, and 50% of patients are truly on ART. In this case,
too many PMs (80%) are being spent on ART in the CEPAC run, and certain CEPAC parameters
(diagnosis rate, ART efficacy, late failure rate, LTFU rate, etc.) should be adjusted to bring the
proportion of PMs spent on ART closer to 50%.
c. Perhaps 50% of patients are truly on ART today, and this proportion really will trend towards
80% in the future – in this case, the epidemic is not actually at equilibrium.
Calibration
To be updated.
Appendices
CEPAC equilibrium “proof”
Consider two mutually-exclusive states that exist in CEPAC, which we will call A and B. For example, these
states could be on/off ART.
We will define A(m) and B(m) to be the number of individuals in these states in month m of a CEPAC run
starting at the time of infection. For example, in month 60 of a 1,000,000-patient run, 900,000 individuals may
still be alive, of whom 600,000 are on ART and 300,000 are off; A(m) and B(m) would be 600,000 and 300,000
in this case.
We will define T(t) to be the number of new transmissions occurring in month t of an H-DEF analysis. Note the
distinction between t and m. t is calendar time, and is the same for all persons alive at any point; at time t,
however, m will differ between HIV-infected people depending on when they were infected. For example, when t
= 0, m can only equal 0. When t = 3, m will be 3 for persons infected in month 0, 2 for persons infected in month
1, 1 for persons infected in month 2, and 0 for persons infected in month 3.
We will define Atotal(t) and Btotal(t) to be the total number of people in states A and B in month t of an H-DEF
analysis. These can be calculated as follows:
$$A_{total}(t) = \sum_{m=0}^{t} T(t - m) \times A(m)$$
$$B_{total}(t) = \sum_{m=0}^{t} T(t - m) \times B(m)$$
Having calculated these, we can calculate the proportion of living individuals in state A (the prevalence of A) as:
$$p_A(t) = \frac{A_{total}(t)}{A_{total}(t) + B_{total}(t)}$$
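The sums above are easier to follow with a tiny worked case. The Python sketch below implements them directly; all input numbers are made up for illustration, and A(m) and B(m) are expressed per newly infected person rather than as cohort counts (an assumption of this example, since a common cohort-size factor cancels out of pA).

```python
def state_totals(transmissions, in_a, in_b, t):
    """Return (A_total(t), B_total(t), p_A(t)) using the sums above.

    transmissions[k] is T(k); in_a[m] and in_b[m] are A(m) and B(m).
    Months beyond the end of the CEPAC run contribute nothing.
    """
    last_m = min(t, len(in_a) - 1)
    a_total = sum(transmissions[t - m] * in_a[m] for m in range(last_m + 1))
    b_total = sum(transmissions[t - m] * in_b[m] for m in range(last_m + 1))
    return a_total, b_total, a_total / (a_total + b_total)


# Hypothetical inputs: three calendar months and a three-month "run".
T = [10, 20, 30]        # new transmissions in calendar months 0, 1, 2
A = [1.0, 0.5, 0.25]    # share in state A, by months since infection
B = [0.0, 0.4, 0.5]     # share in state B, by months since infection

a_total, b_total, p_a = state_totals(T, A, B, t=2)
print(f"A_total(2) = {a_total}, B_total(2) = {b_total}, p_A(2) = {p_a:.3f}")
```

By hand: A_total(2) = 30×1.0 + 20×0.5 + 10×0.25 = 42.5 and B_total(2) = 30×0 + 20×0.4 + 10×0.5 = 13.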
We will define equilibrium as the point where a few conditions are met:
1. The prevalence of HIV has been unchanged for the length of an entire CEPAC run (e.g. 1,000 months).
2. The monthly number of new infections is also unchanging.
3. The distribution of people among the various CEPAC states (e.g. on/off ART, etc.) is also unchanging.
Condition 2 is most important here – it means that, at equilibrium, we can replace T(t-m) in the above equations
with just T – a constant. We can now calculate the prevalence of A at time t (if the epidemic is at equilibrium at
time t) as:
$$p_A(t) = \frac{\sum_{m=0}^{t} T \times A(m)}{\sum_{m=0}^{t} T \times A(m) + \sum_{m=0}^{t} T \times B(m)}$$
Because T is a constant, it can be factored out of each sum:
$$p_A(t) = \frac{T \times \sum_{m=0}^{t} A(m)}{T \times \sum_{m=0}^{t} A(m) + T \times \sum_{m=0}^{t} B(m)}$$
$$p_A(t) = \frac{T \times \sum_{m=0}^{t} A(m)}{T \times \left[\sum_{m=0}^{t} A(m) + \sum_{m=0}^{t} B(m)\right]}$$
$$p_A(t) = \frac{\sum_{m=0}^{t} A(m)}{\sum_{m=0}^{t} A(m) + \sum_{m=0}^{t} B(m)}$$
Let’s consider what this last equation means: if you want to know what the prevalence of A will be at equilibrium,
you don’t have to run H-DEF until it reaches equilibrium. Instead, you just need to take the closed-cohort,
“from-infection” CEPAC run that you are using, and sum the number of people in states A and B across all months (i.e.
sum the person-months accrued in states A and B). The proportion accrued in state A will be equal to the
steady-state prevalence of state A.
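This result can be checked numerically with a short Python sketch. The A(m) and B(m) values below are invented for illustration (a five-month "run" in which everyone has died by the last month), as are the function name and the transmission series. The sketch also shows a growing epidemic for contrast, where condition 2 does not hold.

```python
def prevalence_of_a(transmissions, in_a, in_b, t):
    """p_A(t) from the convolution sums, truncated at the end of the run."""
    last_m = min(t, len(in_a) - 1)
    a_total = sum(transmissions[t - m] * in_a[m] for m in range(last_m + 1))
    b_total = sum(transmissions[t - m] * in_b[m] for m in range(last_m + 1))
    return a_total / (a_total + b_total)


# Hypothetical run: shares in A and B by months since infection.
A = [1.0, 0.8, 0.5, 0.2, 0.0]
B = [0.0, 0.1, 0.3, 0.4, 0.0]

shortcut = sum(A) / (sum(A) + sum(B))      # person-month share in state A

constant_T = [50.0] * 10                   # condition 2 holds
growing_T = [2.0 ** k for k in range(10)]  # condition 2 violated

at_equilibrium = prevalence_of_a(constant_T, A, B, t=9)
while_growing = prevalence_of_a(growing_T, A, B, t=9)

print(f"person-month shortcut: {shortcut:.3f}")
print(f"constant T:            {at_equilibrium:.3f}")
print(f"growing T:             {while_growing:.3f}")
```

With constant T the convolution matches the shortcut; with growing T, recently infected people are over-represented, so the prevalence of A is pulled toward its early-infection values.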
It’s important to recognize that the stringency of the three conditions defining equilibrium tends to limit this
equation’s usefulness as an exact calculation. 1,000 months is quite a long time, and no H-DEF analysis will
actually be reporting results that far out. However, the equation is still quite useful in two ways:
1. It reveals how the prevalence of A will trend over time from its initial value.
2. It gives a good approximation of the prevalence of A when equilibrium is approximately reached (i.e. HIV
prevalence, transmissions, and CEPAC state prevalence are all fairly stable, which can occur within a few
decades).