SUPPLEMENTARY MATERIALS

1. Equations governing birth terms in the model

The equations governing the birth terms, BXi and BYi, in the model are:

BX1 = X N 1 X 1 2 N1 2 N tot 2 2 (S1.1a)
BY1 = Y N 1 Y1 2 N 1 2 N tot 2 2 (S1.1b)
BX2 = 1 N tot X 2 N2 N 2 X2 X 1 2 2 N 3 + 2 X 3 N 1 2 (S1.1c)
BY2 = 1 N tot Y2 N 2 N 2 Y2 Y1 2 2 N 3 + 2 Y3 N 1 2 (S1.1d)
BX3 = 1 X2 N X 3 2 N3 N tot 2 2 (S1.1e)
BY3 = 1 Y2 N Y3 2 N 3 N tot 2 2 (S1.1f)

2. Time until generic modification under different conditions

2.1. Allowing for different modes of inheritance

We next considered the effects that different modes of inheritance might have on the rate of disappearance of the frail allele. We considered three different forms of inheritance. For the first, used above, the frail allele is partially dominant and the frail phenotype requires the inheritance of only one frail allele. Individuals homozygous for the frail allele are even more frail. The second has complete dominance of the frail allele. Individuals with one or two copies of the frail allele proceed to death in 5.5 years. The results of this mode of inheritance are almost identical to those of partial dominance, as the group homozygous for the frail allele is so small that its differential death rate does not significantly alter results. The third inheritance mode occurs when the frail allele is recessive: individuals require two copies of the allele in order to have the frail phenotype.

We assessed the effect of altering the mode of inheritance on the decrease in the frequency of the frail allele for similar prevalence levels. In all cases we assumed that initially the frail phenotype had a frequency of 10% in the population. This means that for simulations of the partially dominant and dominant inheritance mechanisms the frail allele has an initial prevalence of 5% in the population, while for the recessive inheritance mechanism it has an initial prevalence of 32%. The results are shown in Figure S2A.
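The initial allele frequencies quoted above can be reproduced with a short calculation. The following Python sketch is illustrative only: it assumes that, for the dominant and partially dominant cases, every frail individual is initially heterozygous (as in the Figure S1 caption), and that, for the recessive case, genotypes start in Hardy-Weinberg proportions. The Hardy-Weinberg assumption and the function names are ours, not taken from the paper.

```python
import math


def allele_freq_dominant(phenotype_freq):
    """Frail allele frequency when every frail individual is heterozygous.

    Assumption: each frail individual carries one frail allele out of two,
    so the allele frequency is half the phenotype frequency.
    """
    return phenotype_freq / 2.0


def allele_freq_recessive(phenotype_freq):
    """Frail allele frequency when only homozygotes show the phenotype.

    Assumption: Hardy-Weinberg proportions, so phenotype_freq = q**2.
    """
    return math.sqrt(phenotype_freq)


phenotype = 0.10  # 10% of the population has the frail phenotype
print(f"dominant or partially dominant: {allele_freq_dominant(phenotype):.3f}")
print(f"recessive: {allele_freq_recessive(phenotype):.3f}")
```

Under these assumptions the recessive case gives a frequency of about 0.316, which rounds to the 32% quoted in the text.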
Under a recessive inheritance mechanism it takes far longer for the frail allele to disappear from the population, as it is preserved in heterozygotes who do not have the frail phenotype. During a low prevalence infection with recessive inheritance the proportion of the frail allele in the population decreases only to 90% of its starting ratio after 500 years of infection, compared to 62% when partial dominance is the mode of inheritance. Similarly, in a high prevalence infection the proportion decreases to 59% of its starting ratio for recessive inheritance and to 4% for partial dominance after 500 years.

2.2. Allowing for different starting ratios

As a further consideration we assessed the effect that the initial proportion of frail individuals in the population could have on the decrease in their proportion during an HIV infection. For this section we once again assumed that inheritance of the frail allele was partially dominant. The results of our simulations are shown in Figure S2B. We find that the relative decrease in the proportion of the frail allele in the population is independent of the starting percentage of frail individuals, and any differences in the curves of Figure S2B can be explained by slightly different infection prevalences between the simulations.

3. Deriving expressions for the proportion of frail individuals using the simplified model

In this section we approximate the prevalence function, g(t), by a function made up of a series of straight lines to determine an even simpler expression for equation 6.5 of the main paper.
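A prevalence function built from straight lines can be sketched in code. The Python example below uses the low and high prevalence values listed in Table S3; the function and parameter names (`broken_stick`, `lag`, `rise_years`, `peak`) are illustrative and do not come from the paper.

```python
def broken_stick(t, lag, rise_years, peak):
    """Piecewise-linear prevalence.

    Zero until `lag`, rising linearly to `peak` over `rise_years`,
    and constant at `peak` afterwards.
    """
    if t <= lag:
        return 0.0
    if t <= lag + rise_years:
        return peak * (t - lag) / rise_years
    return peak


# Values read from Table S3: slopes 0.006 and 0.02 per year respectively.
LOW = dict(lag=10, rise_years=10, peak=0.06)
HIGH = dict(lag=9, rise_years=15, peak=0.3)

for year in (5, 15, 30):
    low = broken_stick(year, **LOW)
    high = broken_stick(year, **HIGH)
    print(f"t = {year:2d}  low = {low:.3f}  high = {high:.3f}")
```

Writing the function in terms of a lag, a rise time and a peak keeps the three phases (initial, rising, flat) explicit, which mirrors how the expressions in this section are derived phase by phase.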
We can find solutions to equation 6.2a and b from t = a to t =b as: t X1(t) = X1(a) e t ( s ) c ( s ) g ( s ) ds t g(s) a 1 g ( s ) a ds (S3.1a) t X2(t) = X2(a) e t ( s ) c ( s ) g ( s ) ds a (S3.1b) Equations S3.1a and b can be combined to give: X ( ) X (t ) log 2 = log 2 a − X 1 (t ) X 1 ( b ) g ( s) a 1 g ( s) t (S3.2) X (0) Subtracting log 2 from both sides of equation S3.2 and taking exponents gives: X 1 (0) (t) = (a) e t a g (s) 1 g ( s ) It remains to find a simplified expression for (S3.3) g (t ) which can be integrated from t = a to t =b. 1 g (t ) 3.1. Rising function For a rising prevalence we assume that g(t) rises from a value of 0 at t = a to a value of at time t = b. So we have g(t) = r(t) = (a) e = (a) e b a t a (t-a). We approximate (t a ) g (t ) by and have: (1 )( b a ) 1 g (t ) ( s a ) ds (1 )( b a ) ( t a ) 2 2 (1 )( b a ) (S3.4) (the superscript of r denotes that equation S3.4 is valid during the rising phase). 3.2. Flat function For a flat prevalence we assume that g(t) remains constant at a value of %. So we have g(t) = . Then g (t ) is given by and we have 1 g (t ) 1 f(t) = (a) e = (a) e t a 1 ds ( t a ) 1 (S3.5) (the superscript of f denotes that equation S3.5 is valid during the flat phase). 3.3. Decreasing function For a decreasing prevalence we assume that g(t) decreases from a prevalence of at time t = a to a prevalence of at time t = b and write g(t) = - approximated by d(t) 1 = (a) e - b a (t-a). Then g (t ) can be 1 g (t ) (t a ) and we have (1 )( b a ) ( s a ) ds t a 1 (1 )( ) b a = (a) e = (a) e ( t a ) ( t a )2 1 2 1 ( b a ) t 2 2 bt a b a ( b a ) 1 2 ( b a ) (S3.6) (the superscript of d denotes that equation S3.6 is valid during the decreasing phase). 3.4. 
Calculating expressions for a low and high prevalence infection For the low and high prevalence scenarios shown in Figure 2A we allow the prevalence of the disease, g(t), to be approximated by a broken stick function that begins to rise in prevalence at time t = lag after detection of HIV, rises to a prevalence of % over years and remains at this prevalence thereafter. Then g(t) takes the form: 0 t lag g(t) = (t lag ) lag t lag t lag (S3.7) During the initial phase, g(t) = 0, and (t) = 1. During the rising phase, from t = to t = we have (t) given by r(t) from equation S3.4. And so: ( t lag )2 (t) = e 2 1 for lag < t ≤ lag (S3.8) At t = lag (the end of the rising phase) (lag) = e 21 . During the flat phase, from t = lag onwards, (t) is given by f(t) from equation S3.5. ( t lag ) (t) = (lag) e e e 2 1 1 ( t lag ) e 1 t lag 2 1 for t > lag (S3.9) The expressions in equations S3.8 and S3.9 define the following expression for (t) for the low and high prevalence scenarios 1 t lag 2 ( t lag ) (t) = e 2 1 lag t lag t lag 2 e 1 t lag (S3.10) Equation S3.10 can be rearranged (making t the subject) to give an expression for T(), the time taken for the ratio of frail to normal susceptible individuals to drop below % of its starting ratio: 2 log (1 ) 1/ 2 lag T() = log ( 1 ) lag 2 e 2 1 (S3.11) otherwise 3.5. Calculating expressions for a disappearing infection For a disappearing epidemic like the one shown in Figure 2A we allow the prevalence of the disease to be approximated by a triangular function that begins to rise in prevalence at time t = lag after detection of HIV, rises to a prevalence of % over years and then decreases back to a prevalence of 0% over a further years. Therefore we have: 0 t lag (t lag ) lag t lag g(t) = (t ( lag )) lag t lag 0 t lag (S3.12) During the initial flat phase of the function (t) = 1. During the initial rising phase (t) is given by the expression in equation S3.8. At the end of the rising phase (t = lag ) once again we have (lag) = e 21 . 
During the decreasing phase (t) is given by the expression in equation S3.6. (t) = (lag) e e 2 1 = e e 2 t 2 ( lag ) t ( lag )( lag ) ( lag ) 1 2 ( b a ) 2 t 2 ( lag ) t ( lag )( lag ) ( lag ) 1 2 2 t 2 ( lag ) t ( lag )( lag ) lag 1 2 for lag < t ≤lag + (S3.13) At t =lag equation S3.13 evaluates to: (lag) = e (1 ) 2 (S3.14) Finally, for t > lag + (t) is unchanging as g(t)=0. The expressions given in equations S3.8, S3.13 and S3.14 can be combined to give an expression for (t) as: 1 t lag ( t lag ) 2 e 2 1 lag t lag 2 t 2 ( ) t ( )( ) lag lag lag lag = 2 e 1 lag t lag e (1 ) 2 t lag (S3.15) Equation S3.15 can be rearranged (making t the subject) to give an expression for T() for a disappearing prevalence infection: 1/ 2 2 log (1 ) lag 2 log (1 ) T() = lag e 2 (1 ) e 2 (1 ) e 2 (1 ) (S3.16) otherwise Note that when rearranging equation S3.13 we arrive at the expression t = lag 2 log (1 ) . We use the negative square root of this expression as corresponds to the smaller of the two values of t and so gives the first time at which a ratio of % is reached. The second time (given by taking the positive square root) actually occurs for t > lag + when equation S3.13 is no longer valid. From equations S3.15 and S3.16 one can observe that following a disappearing the long term 2 prevalence of the frail allele in the population is given by = e (1 ) , where is the maximum prevalence attained in the infection and + is the duration of the infection in the population. 3.6. Using the simplified expressions Figure S3 shows plots of (t) determined using equations S3.10 (for a low prevalence and high prevalence infection) and S3.15 (for a disappearing prevalence infection), the parameters in Table 1 of the main paper, and the expressions for g(t) given in Table S3 below. The plots determined using equations S3.10 and S3.15 are shown using dashed lines, and are compared to those generated using equation 6.5 of the main paper (and the prevalence curves in Figure 2A of the main paper). 
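The effect of the straight-line simplification during a rising phase can also be checked numerically. The Python sketch below assumes, following our reading of section 3.1, that g(t)/(1 - g(t)) is replaced by a straight line running from 0 to peak/(1 - peak) over the rising phase, and compares the integral of that line with a numerical integral of g/(1 - g) for a linearly rising g. The function names are hypothetical, and the peak values and rise times are those of Table S3.

```python
def rising_odds_numeric(peak, rise_years, steps=100_000):
    """Midpoint-rule integral of g/(1-g) over a rising phase.

    g rises linearly from 0 to `peak` over `rise_years`.
    """
    h = rise_years / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h              # midpoint of the k-th sub-interval
        g = peak * s / rise_years      # linear prevalence at time s
        total += g / (1.0 - g) * h
    return total


def rising_odds_straight_line(peak, rise_years):
    """Integral when g/(1-g) is replaced by a line from 0 to peak/(1-peak)."""
    return peak * rise_years / (2.0 * (1.0 - peak))


for label, peak, rise_years in (("low", 0.06, 10), ("high", 0.3, 15)):
    numeric = rising_odds_numeric(peak, rise_years)
    line = rising_odds_straight_line(peak, rise_years)
    print(f"{label}: numeric = {numeric:.4f}, straight line = {line:.4f}")
```

The two integrals differ more at the higher peak prevalence, because g/(1 - g) is more strongly curved there.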
We can see a good agreement between the simplified functions determined in these supplementary materials and the exact function from the main paper.

4. Using different time dependent parameters for β(t)c(t)

We have incorporated time dependent parameters into the model to parameterise the function β(t)c(t) (given in Table 2 of the main paper) and to achieve observed prevalence levels. However, the model is not sensitive to the choice of these parameters. Their purpose is to obtain the correct prevalence of HIV in the population by defining the function β(t)c(t). The selective pressure due to a given disease prevalence does not depend on the parameters for β(t)c(t) that are used. This is shown in Figure S4 below. In Figure S4, panel A shows six different parameter sets that define different functions for β(t)c(t). Panel B shows the almost identical model generated prevalence curves for these six parameter combinations. Panel C shows the relative proportion of the frail allele remaining in the population, compared to its starting proportion, predicted by the model for each of the six parameter combinations from panel A. We observe that the rate of disappearance of the frail allele is almost identical in each case, and therefore conclude that the speed of genetic selection depends on the prevalence of the disease in the population, rather than on the time dependent parameters chosen for β(t)c(t).

5. Allowing for assortative mating

To implement assortative mating we have introduced a constant that represents the probability that an infected and uninfected individual will mate. In the case of random mating, this constant equals 1. For assortative mating we assume that it lies strictly between 0 and 1. (Note that the constant must be greater than 0, as otherwise the disease will not spread in the population.) The introduction of this constant affects both the new infections term and the birth terms in Equation 4.1.
Since uninfected and infected individuals now only mate with probability , the new infections term of equation 4.1 must now be multiplied by , meaning that the new infections term is given by ’(t)c’(t)Xi Ytot (where ’(t)c’(t) is the new function N tot required to maintain the same disease prevalence in the population as with random mating). However, in order to maintain the same disease prevalence, we require that ’(t)c’(t) = (t)c(t)/(so that enough new infections are generated) and so the new infections term is given by ’(t)c’(t)Xi Ytot Y (t )c(t ) Ytot = (t)c(t) Xi = (t)c(t)Xi tot , which is identical to the term that N tot N tot N tot was used in the random mating case. The equations governing the number of births for the assortative mating model are: BX1 = X X Y 1 X 1 2 X 1 2 Y1 2 N tot 2 2 2 BY1 = 1 N tot Y Y1 2 2 BX2 = 1 N tot X 2 X 2 Y2 X 1 2 2 X 3 2 Y3 X Y X 1 2 Y1 2 2 2 X Y X + 2 X 3 X 1 2 Y1 2 2 2 2 (S5.1a) (S5.1b) (S5.1c) BY2 = 1 N tot Y2 X 2 Y2 Y1 2 2 X 3 2 Y3 X Y Y + 2 Y3 X 1 2 Y1 2 2 2 2 (S5.1d) BX3 = 1 X2 X Y X 3 2 X 3 2 Y3 N tot 2 2 2 (S5.1e) BY3 = 1 Y2 X Y Y3 2 X 3 2 Y3 N tot 2 2 2 (S5.1f) Equations S5.1a-f lead to a lower birth rate than in the random mating case when there are significant numbers of infected individuals. We have implemented a model with assortative mating by using the birth equations given in equations S5.1a-f. We have ensured that the disease prevalence in the population remains similar by slightly altering the function (t)c(t). The modified parameters that were used for (t)c(t) for different values of are given in Table S4 below. We find that a lower value of (more assortative mating) results in slower selection against the frail allele. These results are shown in Figure S5. 6. Including age structure in the model We assume that children take 15 years to reach sexual (and reproductive) maturity and that children below this age do not contract HIV by means other than perinatal transmission. We term these children as “juveniles”. 
Adults are both reproductive and exposed to HIV from the age of 15 – 49 years. This is in accordance with WHO and UNAIDS reports that record adult HIV prevalence in people aged 15-49 (UNAIDS, 2008) and also corresponds to the age range for which fertility data is given by the United Nations (Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat, 2007). We term people between 15 and 49 years of age “reproductive”. Adults over 49 years of age are assumed not to be reproductive and are assumed not to be exposed to HIV. This is in agreement with reports of HIV incidence rates by age (Williams et al., 2001, Hallett et al., 2008). We term such people “old”.

We allow the function xi(a,t) to represent the number of uninfected people of genotype i, age a present at time t, and yi(a,t) to represent the number of infected people of genotype i, age a present at time t. We then allow Xji(t) to represent the total number of uninfected juvenile individuals of genotype i, Xri(t) to represent the total number of uninfected reproductive individuals of genotype i, Xoi(t) to represent the total number of uninfected old individuals of genotype i, Yri(t) to represent the total number of reproductive age infected individuals of genotype i and Yoi(t) to represent the total number of old infected individuals of genotype i. These groups have intuitive definitions as:

$X_{ji}(t) = \int_{a=0}^{15} x_i(a,t)\,da$ (S6.1a)
$X_{ri}(t) = \int_{a=15}^{49} x_i(a,t)\,da$ (S6.1b)
$X_{oi}(t) = \int_{a=49}^{\infty} x_i(a,t)\,da$ (S6.1c)
$Y_{ri}(t) = \int_{a=15}^{49} y_i(a,t)\,da$ (S6.1d)
$Y_{oi}(t) = \int_{a=49}^{\infty} y_i(a,t)\,da$ (S6.1e)

Note that there are no infected individuals younger than 15 years as children are assumed not to become infected, and, as previously, we do not include the children who acquire disease through perinatal transmission in the model. We allow Nri(t) = Xri(t) + Yri(t) and, as before, $N_{rtot}(t) = \sum_{i=1}^{3} N_{ri}(t)$ and $Y_{rtot}(t) = \sum_{i=1}^{3} Y_{ri}(t)$.
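The age-class totals in equations S6.1a-e are integrals of an age density over fixed age bands, which can be illustrated numerically. The Python sketch below integrates a purely hypothetical age density, exp(-0.02 a), using the natural death rate of 0.02 from Table S1, with a simple trapezoidal rule. The density, the truncation of the "old" band at 400 years and all function names are assumptions made for illustration; they are not the model's solution.

```python
import math

NATURAL_DEATH_RATE = 0.02  # value from Table S1


def age_density(age):
    """Hypothetical density of uninfected people by age (illustration only)."""
    return math.exp(-NATURAL_DEATH_RATE * age)


def trapezoid(f, lo, hi, steps=1000):
    """Composite trapezoidal rule for the integral of f from lo to hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


juvenile = trapezoid(age_density, 0, 15)       # ages 0 to 15, as in S6.1a
reproductive = trapezoid(age_density, 15, 49)  # ages 15 to 49, as in S6.1b
old = trapezoid(age_density, 49, 400)          # ages above 49, truncated at 400

print(f"juvenile = {juvenile:.3f}")
print(f"reproductive = {reproductive:.3f}")
print(f"old = {old:.3f}")
```

For this density the three totals sum to roughly 1/0.02 = 50, which is a quick sanity check on the age bands.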
It remains for us to define how xi(a, t) and yi(a, t) change over time and as individuals age. For uninfected individuals we have: a 15 x ( a, t ) xi ( a, t ) xi ( a, t ) (it ) x ( a, t ) x ( a, t ) 15 a 49 i i a t xi ( a, t ) a 49 (S6.2a) And for infected individuals we have: a 15 0 yi (a, t ) yi (a, t ) (t ) xi (a, t ) ( ) yi (a, t ) 5 a 49 a t ( ) y ( a , t ) a 49 i (S6.2b) Where (t) = (t)c(t)(t) and (t) is a function that approximates the prevalence of infection in the reproductive section of the population. Note that we should really have (t) = Yrtot (t ) , however, N rtot (t ) including this term complicates both the analytical and computational solution to the partial differential equations (PDEs) of equations S6.2a and S6.2b. For this reason in the model definition we assume that (t) is known a priori, and use it as a driving function for the PDEs. In the model implementation section below we discuss how we ensure that the relationship (t) = Yrtot (t ) does N rtot (t ) in fact hold. 6.1.1. Initial Conditions of the Age-structured model The initial conditions for the PDEs are given by: xi(0,t) = B Xi BYi (S6.3a) yi(0,t) = 0 (S6.3b) The expressions for BXi and BYi are almost identical to those in the main paper, however they now depend only on reproductive individuals. Therefore we now have: 6.1.2. 
BX1 = X N 1 X r 1 r 2 N r1 r 2 2 N rtot 2 BY1 = 1 N rtot BX2 = X N N 1 X X r 1 r 2 r 2 N r 3 + r 2 X r 3 N r1 r 2 N rtot 2 2 2 2 (S6.4c) BY2 = Y N N 1 Y Y r1 r 2 r 2 N r 3 + r 2 Y r 3 N r1 r 2 N rtot 2 2 2 2 (S6.4d) BX3 = 1 X r2 N X r3 r2 N r3 N rtot 2 2 (S6.4e) BY3 = 1 N rtot (S6.4a) Y N Y r1 r 2 N r1 r 2 2 2 (S6.4b) Yr 2 N Yr 3 r 2 N r 3 2 2 (S6.4f) Delay Differential Equation Representation of the Age-structured Model We can combine equation sets S6.2 - S6.4 to give the following age structured delay differential equations: dX ji (t ) dt = B Xi BYi − xi(15, t) dX ri (t ) = xi(15, t) − xi(49, t) dt dX oi (t ) = xi(49, t) dt − − Xji(t) − (t)Xri(t) − Xri(t) Xoi(t) dYri (t ) = (t)Xri(t) − yi(49, t) dt (S6.5a) (S6.5b) (S6.5c) − (Yi(t) (S6.5d) dYoi (t ) = yi(49, t) dt − (Yi(t) (S6.5e) We can use the method of characteristics to derive expressions for xi(15, t), xi(49, t) and yi(49, t). We obtain the following expressions: xi(15, t) = x(0,t-15)e-15 (S6.6a) xi(49, t) = x(0,t-49)e-49 k(t) (S6.6b) yi(49, t) = x(0,t-49)e-49 mi(t) (S6.6c) The function k(t) represents the proportion of individuals who passed through the reproductive stage without becoming infected and is given by k(t) = e t ( z ) dz t ( 4915) . The function mi(t) represents the proportion of individuals of genotype i who have become infected during the reproductive stage, t but have not yet died from disease related causes. It is given by mi(t) = e ( q t ) i (t )e t ( z ) dz t ( 4915) dq t ( 4915) Combining substituting the expressions in equation sets S6.6 into equation set S6.5 gives the full delay differential equation expression for the age structured model: dX ji (t ) dt 6.1.3. 
= B Xi BYi − x(0,t-15)e-15 − Xji(t) (S6.7a) dX ri (t ) = x(0,t-15)e-15 − x(0,t-49)e-49 k(t)− (t)Xri(t) − Xri(t) dt (S6.7b) dX oi (t ) = x(0,t-49)e-49 k(t) − Xoi(t) dt (S6.7c) dYri (t ) = (t)Xri(t) − x(0,t-49)e-49 mi(t) − (Yi(t) dt (S6.7d) dYoi (t ) = x(0,t-49)e-49 mi(t) dt (S6.7e) − (Yi(t) Age-structured Model Implementation We can implement the age structured delay differential equation described in equation set S6.7 using the dde23 solver in Matlab®. Note that we must modify the birth rate parameter, , to ensure that the correct birth rate is achieved as births now depend only on the reproductive section of the population. The modified birth rate is given in Table S4 below. During the implementation we must ensure that the relationship (t) = Yrtot (t ) holds. N rtot (t ) As the aim of this model is to drive the demography from known prevalence curves, we begin with an initial estimate for (t) that is taken from published data. We estimate (t) at one year intervals, and interpolate between these points We call this estimate (t). We then define (t) = (t)c(t)(t), and solve equation set S6.7 numerically, setting (t) = (t). During the simulation we numerically evaluate the functions k(t) and mi(t) using Matlab®’s trapz and cumtrapz functions, using a spacing of 1 year. The results are not sensitive to this choice of spacing. From the first simulation, we extract the solution for the total number of individuals of reproductive age, and 1 (t ) . We also extract the solution for the total number of infected individuals of denote this by N rtot 1 rtot reproductive age, and denote this by Y (t ) . We then set (t) 1 Yrtot (t ) = and (t) = (t)c(t)(t), 1 N rtot (t ) and solve equation set S6.7 numerically again, this time with (t) = (t). From this solution we 2 2 (t ) and N rtot (t ) and use these to define 3 (t ) extract Yrtot allowing n 1 2 Yrtot (t ) . 
We proceed recursively, taking estimate n+1 of the prevalence function to be $Y^{n}_{rtot}(t) / N^{n}_{rtot}(t)$ and solving equation set S6.7 numerically with this estimate, until estimates n+1 and n differ in absolute value by less than a fixed tolerance for all t (S6.8).

Once the condition in equation S6.8 is achieved we use estimate n as the prevalence function. This ensures that the prevalence function is within the tolerance of $Y^{n}_{rtot}(t) / N^{n}_{rtot}(t)$ for all t, and so meets the requirement that it equals $Y_{rtot}(t) / N_{rtot}(t)$. For our simulations we set the tolerance to 10^-5.

6.1.4. Results of the Age-structured Model

We compared the results of the model presented in equation set 4.1 of the main paper with the age structured delay equation presented in equation set S6.7 above. In order to allow a comparison between the models we require the prevalence of HIV, which defines the force of infection, to be the same. Therefore, we ensured that $Y_{tot}(t) / N_{tot}(t)$ in the normal model was similar to $Y_{rtot}(t) / N_{rtot}(t)$ in the age structured model by slightly altering the function β(t)c(t). The modified parameters for β(t)c(t) are given in Table S4 below. The results of the model implementations are shown in Figure S6. We find that for matched prevalence curves (shown in panel A) the frail allele disappears from the population more slowly when the age structured model is used than when the original model is used. This means that the estimates provided using the original model are conservative estimates that present a lower bound on the time to disappearance of the frail allele. Since the population is indeed structured by age, the time to disappearance of the allele will be longer than our estimates.

7. Figures

Figure S1 Modelling results of a high prevalence (A), low prevalence (B) and disappearing (C) infection. For each simulation the frail phenotype was assumed to be partially dominant, and 10% of the population was initially heterozygous for the frail allele. Black curves give proportion of the normal phenotype (solid) and normal allele (dotted).
Grey curves show proportion of the frail phenotype (solid) and frail allele (dotted). Red curves show the prevalence of infection in the population. Simulation for the disappearing infection (panel C) is halted after 155 years, as after this time there are very few infected individuals remaining in the population.

Figure S2 Relative decreases in frequencies of the frail allele and phenotype in the years following an HIV epidemic for different inheritance mechanisms (A), and different initial percentages of frail individuals (B). Proportions are shown relative to the initial proportion of frail alleles within the population. (A) Relative frequency decrease of the frail allele in the population when different inheritance mechanisms are used to determine frailty. Blue lines show results for a low prevalence infection, red lines show results for a high prevalence infection. Thick solid lines are used to depict partially dominant inheritance, thin solid lines are used for recessive inheritance and dotted lines are used for dominant inheritance. Note that the dotted lines cannot be seen as they are almost identical to the solid lines for partial dominance. (B) Relative frequency decreases in the proportion of frail alleles during high prevalence (red) and low prevalence (blue) infections given different initial percentages of frail individuals in the population.

Figure S3 Relative frequency decreases in the proportion of frail susceptible individuals. Solid lines show estimates using equation 6.5 of the main paper and dotted lines show estimates using equations S3.10 and S3.15. Prevalence functions used to generate the dotted lines are given in Table S3.

Figure S4 Results of using different time varying parameters for β(t)c(t). (A) Different functions used for β(t)c(t). (B) Disease prevalence generated by the model in equations 4.1 using the functions in panel A. Legend is as for panel A.
(C) Relative frequency decrease in the frail allele predicted by the model in equations 4.1 using the functions in panel A. Legend is as for panel A.

Figure S5 Results of different mating strategies. Plot shows the relative frequency decrease of the frail allele in the population (compared to its starting proportion). Lower values of the mating probability constant (more assortative mating) result in slower selection against the frail allele.

Figure S6 Comparison of the original model and age structured model. (A) The prevalence of infection in the two model simulations is shown. We see that the prevalence of infection in the whole population in the original model is similar to the prevalence of infection in the reproductive section of the population in the age structured model. (B) The relative frequency decrease of the frail allele in the population (compared to its starting proportion) is shown for both the original (blue) and age structured model (green) for the prevalence curves shown in panel A. We see that the frail allele disappears more slowly when the age structured model is used (green curve sits above blue curve).

8. Tables

Param | Explanation | Value | Source / Reason
 | Healthy birth rate | 0.04 | Sensible rounding of data on crude birth rates in Sub-Saharan Africa 1980-2010 (Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat, 2009)
 | Natural death rate | 0.02 | Sensible rounding of the data on life expectancy at birth of 50 years for Sub-Saharan Africa 1980-2010 (Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat, 2007)
 | Death rate due to disease in homozygous normal group | .071 | Allows for an average lifetime of 11 years between HIV infection and AIDS onset
 | Death rate due to disease in heterozygous group | .162 | Allows for an average lifetime of 5.5 years between HIV infection and AIDS onset
 | Death rate due to disease in homozygous frail group | .313 | Allows for an average lifetime of 3 years between HIV infection and AIDS onset
y | Proportion of healthy births amongst infected mothers. The probability of perinatal transmission is one minus this proportion. | 0.7 (30% perinatal trans) | (UNAIDS, 2008) p121
 | Initial proportion of infected individuals in the population | 0.1% | Low starting prevalence that allows for realistic growth curves

Table S1 Parameter values used in all simulations in the paper.

Prevalence | Init frails % | Figure | 0 | end | 1 | 2 | Early R0 | Late R0
Low | 1% | S2B | 0.39 | 0.113 | 10 | 20 | 4.26 | 1.24
Low | 5% | S2B | 0.39 | 0.115 | 10 | 20 | 4.18 | 1.23
Low | 10% | 2B, 3A, S1A, S2 | 0.4 | 0.117 | 10 | 20 | 4.17 | 1.22
Low | 20% | S2B | 0.41 | 0.123 | 10 | 20 | 4.01 | 1.20
High | 1% | S2B | 0.39 | 0.123 | 19 | 25 | 4.26 | 1.34
High | 5% | S2B | 0.39 | 0.124 | 19 | 25 | 4.18 | 1.33
High | 10% | 2B, 3A, S1B, S2 | 0.4 | 0.124 | 19 | 25 | 4.17 | 1.29
High | 20% | S2B | 0.41 | 0.127 | 19 | 25 | 4.01 | 1.24
Disappearing | 10% | 2B, 3A, S1C, S2 | 0.55 | 0.02 | 10 | 20 | 5.73 | 0.21

Table S2 Parameters used for β(t)c(t) and associated R0 values for the simulations depicted in Figure 2B, Figure 3A, Figure S1 and Figure S2. The parameters for β(t)c(t) were chosen so that a good match to the appropriate prevalence curve (from Figure 2A) was obtained from the model. Early R0 values are determined using the value of β(t)c(t) in column 0; late R0 values are determined using the value of β(t)c(t) in column end.

Prevalence | g(t)
Low | 0 for t ≤ 10; 0.006(t - 10) for 10 < t ≤ 20; 0.06 for t > 20
High | 0 for t ≤ 9; 0.02(t - 9) for 9 < t ≤ 24; 0.3 for t > 24
Disappearing | 0 for t ≤ 11; .0333(t - 11) for 11 < t ≤ 18; .3 - .03(t - 18) for 18 < t ≤ 28; 0 for t > 28

Table S3 Prevalence functions used to generate dotted lines in Figure S3.

Simulation | 0 | end | 1 | 2 | Birth rate
Original model | .4 | .117 | 10 | 20 | .04
Time varying params 1 (original) | 0.4 | 0.117 | 10 | 20 | .04
Time varying params 2 | 0.377 | 0.117 | 16.2 | 16.2 | .04
Time varying params 3 | 0.542 | 0.117 | 0 | 20 | .04
Time varying params 4 | 0.458 | 0.117 | 5 | 20 | .04
Time varying params 5 | 0.36 | 0.117 | 10 | 25 | .04
Time varying params 6 | 0.4 | 0.117 | 5 | 25 | .04
Assortative mating, constant = 1 (original) | .4 | .117 | 10 | 20 | .04
Assortative mating, constant = 0.75 | .4 | .1161 | 10 | 20 | .04
Assortative mating, constant = .5 | .4 | .1153 | 10 | 20 | .04
Assortative mating, constant = 0.25 | .4 | .1145 | 10 | 20 | .04
Age structured | 0.407 | 0.1442 | 10 | 20 | .0981

Table S4 Parameters used for β(t)c(t) and for the birth rate for the simulations depicted in Figures S4, S5 and S6 of the supplementary materials.
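The iterative scheme of section 6.1.3, in which the prevalence estimate is repeatedly rebuilt from the model solution until successive estimates agree, can be sketched as a fixed-point iteration. The Python sketch below is a minimal illustration, not the authors' Matlab implementation: `solve_model` is a hypothetical stand-in for a numerical solve of equation set S6.7, and `toy_model` is an invented contraction used only so that the example runs. The tolerance of 1e-5 is the one stated in section 6.1.3.

```python
def iterate_prevalence(solve_model, first_estimate, tol=1e-5, max_iter=100):
    """Iterate the prevalence estimate until successive estimates agree.

    solve_model(estimate) must return two equal-length lists:
    infected and total reproductive-age individuals at each time point.
    """
    current = list(first_estimate)
    for _ in range(max_iter):
        infected, total = solve_model(current)
        updated = [y / n for y, n in zip(infected, total)]
        # Stopping rule: largest change over all time points below tol.
        if max(abs(u - c) for u, c in zip(updated, current)) < tol:
            return current
        current = updated
    raise RuntimeError("prevalence estimate did not converge")


def toy_model(prevalence):
    """Hypothetical stand-in for the model solve (illustration only)."""
    total = [1000.0] * len(prevalence)
    infected = [n * (0.5 * p + 0.05) for n, p in zip(total, prevalence)]
    return infected, total


result = iterate_prevalence(toy_model, [0.0, 0.05, 0.2])
print([round(p, 4) for p in result])
```

For this toy stand-in the self-consistent prevalence is 0.1 at every time point, so the returned estimates all lie close to 0.1. In a real implementation `solve_model` would wrap the delay differential equation solver, with the prevalence estimate interpolated between yearly points as described in section 6.1.3.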