SUPPLEMENTARY METHODS

Nicholas C Grassly, Christophe Fraser and Geoff P Garnett (2004) “Increasingly synchronised epidemics of syphilis across the USA are driven by host immunity”

Detailed description of data. Reported annual numbers and rates of syphilis and gonorrhoea cases treated in the major cities of the United States (population size >200,000) have been routinely collected since 1941. These data are maintained by the Centers for Disease Control and Prevention (CDC), with surveillance reports summarising levels and trends currently published annually [1]. We compiled reported cases and rates of primary and secondary syphilis and gonorrhoea for the period 1941 to 2002 for the 68 major cities. Cases prior to 1969 are recorded by fiscal year, after which calendar years are used. The reporting of both case numbers and rates (per 100,000 population) allows the denominator city size estimate used in the past to be calculated and compared to estimates based on the decennial census [2]. These calculated denominator city sizes show occasional sudden changes in contrast to the census estimates, which may be a result of changes in the area from which cases are reported, or simply the use of different sources for the estimates. It is not possible to distinguish these causes, and we therefore choose to analyse the published rates rather than re-calculate rates based on the city size estimates from the decennial census, since the published rates contain information about changes in the reporting area that would otherwise be lost.

Frequency analysis. Differencing the data prior to spectral analysis removes any long-term changes in the mean rate (non-stationarity) remaining after the linear transformation, resulting in a suppressed spectral density for low frequency ‘noise’ [3]. The spectrum of the differenced data, $S_\Delta$, is biased, and for annual data is related to the spectrum of the original untransformed data, $S$, by

$$S_\Delta(T) = S(T)\,[2 - 2\cos(2\pi/T)],$$

where $T$ is the period.
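The factor relating the two spectra can be evaluated directly. The following is a minimal Python sketch, not part of the original analysis: the function names `differencing_gain` and `empirical_gain` are hypothetical, and the numerical check uses a pure sinusoid sampled once per year rather than any of the surveillance data.

```python
import math


def differencing_gain(period):
    """Factor 2 - 2*cos(2*pi/T) by which first differencing scales the spectrum."""
    return 2.0 - 2.0 * math.cos(2.0 * math.pi / period)


def empirical_gain(period, cycles=20):
    """Ratio of mean squares of a differenced and an undifferenced sinusoid.

    Uses a whole number of cycles of an integer period (in years) so that
    the averages are exact up to rounding error.
    """
    n = period * cycles
    omega = 2.0 * math.pi / period
    series = [math.cos(omega * t) for t in range(n + 1)]
    diffs = [series[t] - series[t - 1] for t in range(1, n + 1)]
    mean_sq_orig = sum(v * v for v in series[1:]) / n
    mean_sq_diff = sum(d * d for d in diffs) / n
    return mean_sq_diff / mean_sq_orig


if __name__ == "__main__":
    # Illustrative periods in years; these are not values taken from the paper.
    for period in (2, 4, 8, 10, 20):
        print(period, round(differencing_gain(period), 4),
              round(empirical_gain(period), 4))
```

The two columns printed for each period should agree, which illustrates that the factor is simply the squared gain of the first-difference filter at that period.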
For the periods of interest, this bias is minimal, and the peak in the spectra for syphilis and gonorrhoea is the same for the differenced and untransformed data, in part reflecting limits in the resolution of the spectra due to the number of years of observations. In the paper we present spectra for the differenced data, where the spectral density for low frequency fluctuations is suppressed (any remaining power at low frequencies indicates non-stationarity in the differenced data). Tests for significant periodicity in each city spectrum were based on the spectra of 1000 random permutations of the differenced data for each city. Randomization removes any periodicity and is equivalent to the hypothesis of random noise in the differenced data. The resulting spectra are flat and similar to the distribution seen for white noise, which is approximated by the χ² distribution [4]. However, in this case the 95th percentile is somewhat higher for the bootstrapped distribution. A hypothesis of white noise in the differenced data is equivalent to a random walk in the untransformed data. Comparison of the observed spectrum with the distribution of spectra for the randomized data therefore allows the hypothesis of an aperiodic random walk to be tested.

Cross-correlation and city size. Typically, average correlation may be plotted against some measure of distance between locations to derive a correlogram [5]. In ecological studies the distance measure is often simply physical distance. However, in infectious disease epidemiology it is the contact of people with one another or with disease vectors that is important. Throughout the latter half of the twentieth century the spread of directly transmitted infections, including sexually transmitted infections, has been strongly hierarchical [6,7]. Disease spread is through the size hierarchy of cities, in the USA spreading rapidly among the largest cities such as New York and Los Angeles, and then more slowly to the smaller cities.
In some cases this may be followed by more local suburban diffusion. This pattern of disease spread closely maps to patterns of air travel [8] and is also likely to relate to cultural diffusion of behaviours that may relate to the transmission of disease. Different measures of distance between cities that reflect their position in the size hierarchy may be derived. However, a correlogram for syphilis based on such measures would be problematic since smaller cities typically have fewer case reports and greater sampling error. This makes it extremely difficult to test the significance of any observed correlation or synchronisation distance threshold. Instead we compare the average correlation of the 10 largest to the 10 smallest cities (based on their average rank in the size hierarchy over the period 1960-96). Significant correlation of the log-transformed rates of primary and secondary syphilis for each city pair within the two groups was tested using Pearson’s correlation coefficient. This was then repeated for the 10 largest cities but where the reported rates were resampled 10,000 times from a binomial distribution with mean equal to the reported rate and sample size equal to that for the smallest cities (the population size for each city for each bootstrap replicate was randomly sampled without replacement from the distribution of sizes of the smallest cities averaged over 1960-96). This gives the distribution of the number of significant pairwise correlations expected for the largest cities but in the presence of additional sampling error corresponding to that expected for the smallest cities. If this distribution excludes the number seen for the 10 smallest cities, then the largest and smallest cities can be said to have significantly different levels of cross-correlation in rates of reported syphilis over the period of interest.

Detailed analysis of SIRS dynamics.
The SIRS model describes the process of becoming infected (I), recovery to an immune class (R) and subsequent loss of immunity to return to the susceptible class (S). It extends what has been termed the ‘classic endemic model’ [9], the SIR model with demography (births and deaths), by allowing for loss of immunity. We define $X$, $Y$ and $Z$ as the number of susceptible, infected and recovered/immune individuals in a population of total size $N = X + Y + Z$. We use lower case $x$, $y$ and $z$ to denote the fraction of the population in each of these categories (i.e. $x = X/N$). The deterministic SIRS model is described by three ordinary differential equations

$$dx/dt = \mu - (\beta y + \mu)x + \gamma z \qquad (1)$$

$$dy/dt = \beta y x - (\nu + \mu)y \qquad (2)$$

$$dz/dt = \nu y - (\gamma + \mu)z \qquad (3)$$

where $\beta$ is the transmission parameter, $\nu$ the rate of recovery from infection, $\gamma$ the rate of loss of immunity and $\mu$ the rate of birth/death. This model can be re-parameterised in terms of the basic reproductive number $R_0 = \beta/(\nu + \mu)$, which gives the number of infections a single infectious individual would generate in an entirely susceptible population. At equilibrium $x^* = 1/R_0$ and $y^* = q(R_0 - 1)/R_0$, where $q = (\gamma + \mu)/(\gamma + \mu + \nu)$. Stability analysis using Taylor series expansion for small perturbations from $x^*$ and $y^*$ leads to a quadratic for the eigenvalues $\Lambda$ of the linearized system

$$\Lambda^2 + [qs + \gamma + \mu]\Lambda + (\gamma + \mu)s = 0 \qquad (4)$$

where $s = (\nu + \mu)(R_0 - 1)$ is the exponential rate of growth in incidence for an outbreak within a fully susceptible subpopulation. If the solution for the eigenvalues contains an imaginary part ($4s > (\gamma + \mu)[1 + s/(\gamma + \mu + \nu)]^2$) then the system shows damped oscillations towards the endemic equilibrium with characteristic damping time

$$T_D = 2/(qs + \gamma + \mu) \qquad (5)$$

and period $T$ given by the imaginary part

$$T = 2\pi \Big/ \sqrt{s(\gamma + \mu) - 0.25(qs + \gamma + \mu)^2} \qquad (6)$$

The stochastic version of the SIRS model given in equations (1)-(3) can be solved numerically by taking small time steps $\delta t$ such that the rates of each event (infection, death etc.) can be considered independent of one another.
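As a numerical illustration of equations (1)-(6), the sketch below evaluates the endemic equilibrium, damping time and period for one set of rates. It is a minimal example under stated assumptions: the function names `sirs_rhs` and `sirs_summary` are hypothetical, and the parameter values in the usage lines (rates per year) are chosen purely for illustration and are not estimates taken from the paper.

```python
import math


def sirs_rhs(x, y, z, beta, nu, gamma, mu):
    """Right-hand sides of equations (1)-(3) for the population fractions."""
    dx = mu - (beta * y + mu) * x + gamma * z
    dy = beta * y * x - (nu + mu) * y
    dz = nu * y - (gamma + mu) * z
    return dx, dy, dz


def sirs_summary(r0, nu, gamma, mu):
    """Equilibrium, damping time (eq. 5) and period (eq. 6) for R0 > 1."""
    if r0 <= 1.0:
        raise ValueError("the endemic equilibrium requires R0 > 1")
    beta = r0 * (nu + mu)
    q = (gamma + mu) / (gamma + mu + nu)
    s = (nu + mu) * (r0 - 1.0)
    x_eq = 1.0 / r0
    y_eq = q * (r0 - 1.0) / r0
    z_eq = 1.0 - x_eq - y_eq
    b = q * s + gamma + mu        # coefficient of Lambda in eq. (4)
    c = (gamma + mu) * s          # constant term in eq. (4)
    disc = c - 0.25 * b * b       # positive when the eigenvalues are complex
    period = 2.0 * math.pi / math.sqrt(disc) if disc > 0.0 else None
    return {"beta": beta, "q": q, "s": s, "x_eq": x_eq, "y_eq": y_eq,
            "z_eq": z_eq, "damping_time": 2.0 / b, "period": period}


if __name__ == "__main__":
    # Hypothetical values: R0 = 1.5, recovery 6/yr, immunity loss 0.1/yr, turnover 0.03/yr.
    summary = sirs_summary(r0=1.5, nu=6.0, gamma=0.1, mu=0.03)
    for name, value in summary.items():
        print(name, value)
```

When the condition for complex eigenvalues fails, `period` is returned as `None`, corresponding to a non-oscillatory return to equilibrium.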
The number of events of type $k$ that have rate $r_k$ in a population of size $N$ is then binomially distributed with mean $N r_k \delta t$. The seven possible events for the SIRS model and associated rates are given in Supplementary Table 1.

The marginal distribution of the number of infected individuals in the quasistationary state is approximately normal in the stochastic SIRS model for reasonably large $N$ [10]. The moments of this distribution can be found using a diffusion approximation. Assuming recovery from infection is rapid compared to the loss of immunity, the variance in the number of infected individuals is

$$\sigma_Y^2 = N(R_0 - 1)/R_0^2 \qquad (7)$$

A measure of the magnitude of this variability is the ratio of variance to mean number of infections, which is $1/(qR_0)$: thus low values of the basic reproduction number, such that $R_0 < 1/q$, result in an over-dispersed distribution of infections compared to the Poisson.

In sharp contrast to the deterministic case, the stochastic SIRS model results in sustained oscillations due to the continued perturbation of the system by random events [11] (Supplementary Figure 1a). The period of these oscillations is given by $T$ (equation 6), the amplitude by the variance derived above (equation 7), while the phase of these oscillations is subject to drift. Thus for a single population stochastic SIRS model, in the absence of random extinction, sustained oscillations can be observed with a period $T$ that is dependent on the parameters of the infection and population birth/death rates. As the rate of loss of immunity $\gamma \to 0$ the SIR model is recovered, in which case oscillations can still occur with the period given by equation (6). In contrast, allowing $\gamma \to \infty$ such that there is no longer any protective immunity results in the SIS model, and oscillations in prevalence do not occur.

Failure to develop protective immunity in all those infected can be described by allowing a fraction $\phi$ of individuals recovering from infection to directly re-enter the susceptible population.
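The binomial time-stepping scheme for the stochastic SIRS model, including a fraction recovering without immunity, might be sketched as follows. This is an illustrative sketch rather than the authors' implementation: Supplementary Table 1 is not reproduced here, so the list of seven events (birth, infection, recovery, loss of immunity, and death in each of the three classes) is an assumption, as are the capping of draws to keep class sizes non-negative, the starting state, the helper `binomial`, and all default parameter values.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by skipping geometric gaps between successes."""
    if n <= 0 or p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    log_q = math.log1p(-p)
    successes, position = 0, 0
    while True:
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        position += int(math.log(u) / log_q) + 1    # trials up to the next success
        if position > n:
            return successes
        successes += 1


def simulate_sirs(n, r0, nu, gamma, mu, phi=0.0, dt=0.01, years=50.0, seed=1):
    """Return a list of (X, Y, Z) counts, one entry per time step of length dt."""
    rng = random.Random(seed)
    beta = r0 * (nu + mu)
    q = (gamma + mu) / (gamma + mu + nu)
    # Assumed starting state: near the deterministic endemic equilibrium.
    X = round(n / r0)
    Y = max(1, round(n * q * (r0 - 1.0) / r0))
    Z = n - X - Y
    history = [(X, Y, Z)]
    for _ in range(int(round(years / dt))):
        x, y, z = X / n, Y / n, Z / n
        # Seven assumed event types, each Binomial(n, rate * dt); draws are
        # capped so that no class can become negative (an implementation choice).
        births = binomial(n, mu * dt, rng)
        infections = min(X, binomial(n, beta * x * y * dt, rng))
        deaths_s = min(X - infections, binomial(n, mu * x * dt, rng))
        recoveries = min(Y, binomial(n, nu * y * dt, rng))
        deaths_i = min(Y - recoveries, binomial(n, mu * y * dt, rng))
        waning = min(Z, binomial(n, gamma * z * dt, rng))
        deaths_r = min(Z - waning, binomial(n, mu * z * dt, rng))
        # A fraction phi of those recovering returns directly to the susceptibles.
        to_susceptible = binomial(recoveries, phi, rng)
        X += births + waning + to_susceptible - infections - deaths_s
        Y += infections - recoveries - deaths_i
        Z += recoveries - to_susceptible - waning - deaths_r
        history.append((X, Y, Z))
    return history


if __name__ == "__main__":
    # Hypothetical parameter values (rates per year), for illustration only.
    run = simulate_sirs(n=20000, r0=1.5, nu=6.0, gamma=0.1, mu=0.03, years=50.0)
    print(run[-1])
```

The series of infected counts from such a run could then be differenced and passed to a spectral estimate to look for sustained oscillations; because the scheme is stochastic, individual runs may also end in extinction of the infection.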
In the deterministic case the ordinary differential equations are modified such that the $\nu y$ term in equation (3) is replaced by $(1-\phi)\nu y$ and the term $+\phi\nu y$ is added to equation (1) describing the dynamics of susceptibles. In this case, as $\phi \to 1$ periodicity in oscillations gradually disappears, and with $\phi = 1$ only random noise is apparent. However, significant periodicity remains even for reasonably large $\phi$ (Supplementary Figure 1b).

Oscillatory dynamics in the SIRS model are robust to the addition of realistic model complexity. Inclusion of more realistic distributions for the infectious period still results in regular oscillations in prevalence, but with a larger amplitude than that predicted by the simple SIRS model [12]. Also, while the model population size $N$ will vary according to the number of individuals sufficiently sexually active to be at risk of infection, the predicted period of oscillations is independent of $N$. Perhaps more important are the heterogeneities in sexual activity and contact patterns within the population. However, SIR infection dynamics defined on a contact network show periodic oscillations for all but the lowest level of network disorder [13]. Furthermore, stratification of the population into different risk groups can be considered analogous to the geographic stratifications captured in metapopulation models, where oscillatory dynamics occur with synchronisation determined by levels of coupling [14].

References.

1. Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance, 2002. (U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, Atlanta, GA, 2003).
2. Gibson, C. Population of the 100 largest cities and other urban places in the United States: 1790-1990. Working Paper No. 27. (Population Division, US Census Bureau, Washington, D.C., 1998).
3. Bjørnstad, O. N., Champely, S., Stenseth, N. C. & Saitoh, T. Cyclicity and stability of grey-sided voles, Clethrionomys rufocanus, of Hokkaido: spectral and principal components analyses. Phil Trans R Soc Lond B 351, 867-875 (1996).
4. Chatfield, C. The analysis of time series: an introduction. 6th edition (Chapman & Hall/CRC, Boca Raton, 2003).
5. Bjørnstad, O. N., Ims, R. A. & Lambin, X. Spatial population dynamics: analyzing patterns and processes of population synchrony. Trends Ecol Evol 14, 427-432 (1999).
6. Cliff, A. D., Haggett, P. & Smallman-Raynor, M. Deciphering global epidemics: analytical approaches to the disease records of world cities, 1888-1912 (eds. Baker, A. R. H., Dennis, R. & Holdsworth, D.) (Cambridge University Press, Cambridge, 1998).
7. Wallace, R., Huang, Y. S., Gould, P. & Wallace, D. The hierarchical diffusion of AIDS and violent crime among US metropolitan regions: Inner-city decay, stochastic resonance and reversal of the mortality transition. Social Science & Medicine 44, 935-947 (1997).
8. US Dept of Transportation. Airline Origin and Destination Survey. (Bureau of Transportation Statistics, 2003).
9. Hethcote, H. W. The mathematics of infectious diseases. SIAM Rev 42, 599-653 (2000).
10. Nåsell, I. Stochastic models of some endemic infections. Math Biosci 179, 1-19 (2002).
11. Bailey, N. T. J. The mathematical theory of infectious diseases and its applications. 2nd edition. (Griffin, London, 1975).
12. Lloyd, A. L. Realistic distributions of infectious periods in epidemic models: Changing patterns of persistence and dynamics. Theor Popul Biol 60, 59-71 (2001).
13. Kuperman, M. & Abramson, G. Small world effect in an epidemiological model. Phys Rev Lett 86, 2909-2912 (2001).
14. Lloyd, A. L. & Jansen, V. A. A. Spatiotemporal dynamics of epidemics: synchrony in metapopulation models. Math Biosci 188, 1-16 (2004).