8th March 2013

Dear Editor
BMC Public Health

Re: MS:1694435383859048 “Forecast analysis of the incidence of Tuberculosis in the Province of Quebec”
By: Alex Klotz, Abdoulaye Harouna and Andrew F Smith

We have reviewed in detail the comments forwarded to us by the reviewers of our manuscript and have implemented the required and suggested revisions.

Both reviewers were concerned that the mechanism of re-infection in our model was not sufficiently explained. We have elaborated upon the role of this mechanism and further justified why we did not expect it to play a significant role. There was also concern over whether our compartmental equations would accurately model the progression of tuberculosis in Quebec. To address these points, we have better described the role of latency and included a more detailed discussion of potential interactions between the immigrant and Canadian-born populations. Concern was also raised over the long-term dynamics of the model. We have discussed this, but noted that the model does not realistically describe the population over such periods because the underlying demographic characteristics display long-term exponential growth.

Lastly, owing to time and space limitations, we regret that we were unable to implement the discretionary recommendations of Dr. Guo concerning supplemental mathematical analyses. We agree that they should be examined before this model is applied beyond the Quebec population for which it was originally intended, as the underlying demographic information will necessarily be different.

Below is a point-by-point discussion of the issues raised by the reviewers, each followed by our detailed reply. We thank the reviewers for their insightful comments.

With kind regards,
Alex Klotz, MSc., Abdoulaye Harouna, MSc., Andrew F Smith, PhD

Detailed Reply to Reviewers' Comments on our Manuscript

Reviewer 1: Heffernan

Major Comments:

1a.
To generate Figs 3-6 the model is fit to data and then the authors use the general trend to forecast the TB epidemic. I do not know why the subpopulations are treated independently of each other to generate Figs 4-6. It seems that these populations will have some mixing, thus a meta-population model or core group model would be better suited. If these populations are truly mixing independent of each other, then please provide data/references justifying this assumption.

1b. Page 13 – the authors state that the immigrant and Canadian-born populations largely live in the same cities. So transmission can occur between these populations. This is not included in the model (in the way that I understand that the model was fit to the data). Please justify the assumption of independence or revise the model to include transmission between populations.

The Inuit population is largely isolated in the North of the province. We initially treated the populations separately based on a paper by Haase et al. (now reference 10 of the paper), who found that transmission occurred within immigrant communities. Nevertheless, we subsequently modelled the immigrant and Canadian-born populations together, and plot the results in an inset of Figure 2. A paragraph elaborating more fully on the rationale underlying our modelling approach has been added to the discussion section on page 14.

2a. Why does the SELIR model not include relapse or reinfection?
2b. The model is revised later to include retransmission. What is retransmission? Please show the new model and new model diagram. Does retransmission not occur in immigrant populations too?
2c. 12 – if immigrant populations live close to one another, why is retransmission and/or reinfection not included in the model?

Reports on tuberculosis in Canada (PHAC 2000) show that in Quebec only 6% of cases occur in patients who have already had the disease; thus we did not initially include it in the main part of our model.
We investigated this for all sub-populations using assumptions similar to those stated for the Inuit case, but it did not affect the results by more than 1% over the periods of interest, much less than the error bounds. We did not include it on the graph, to reduce visual clutter. The flow chart (Fig 1) has been modified to include this reinfection possibility.

3. I do not understand how the Gaussian random number distributions were used or generated. I do not understand how these relate to Figs 3-6. Clarification is needed.

Regressions to historical data generate parameters as well as a confidence interval on these parameters. If the model is simulated with different values of these parameters, there are scenarios where the fastest-growing cases (with upper bounds on transmission parameters) reach a maximum and begin to decline before the lower-bound cases, so that our upper bound becomes lower than our lower bound, which does not make sense. Instead, we simulate a large number of populations, with the historical data as the starting point but with the fit parameters drawn from a Gaussian distribution with a mean equal to the best fit and a standard deviation equal to the confidence interval. At each point in time, there is a distribution in the number of new annual cases. We use this distribution over many populations to propagate the uncertainty in the initial fits. However, on re-reading our paper we no longer feel that Figure 2 is necessary, as it only serves to show that the model can fit historical data, which is shown in Figures 2-4 in the current version (4-6 in the old version).

4. Euler’s method with a time step of one month was used. Why was one month chosen? Why not use ODE solvers that are preprogrammed in MATLAB?

We also integrated the model using the Runge-Kutta method (ode45 in MATLAB); however, the results were virtually identical to those of the Euler method (Figure S1 below).
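The kind of comparison described here can be illustrated with a minimal Python sketch. It is not the MATLAB or Excel implementation used for the manuscript. The four-compartment toy model, its parameter values, and the initial state are all hypothetical placeholders, and a fixed-step classical fourth-order Runge-Kutta scheme stands in for MATLAB's adaptive ode45.

```python
# Toy SEIR-type model. Every number below is a hypothetical placeholder,
# NOT a fitted value from the manuscript.
BETA = 1.5    # transmission rate per infectious person per year (assumed)
SIGMA = 0.05  # progression rate from exposed to infectious, per year (assumed)
GAMMA = 2.0   # recovery rate, per year (assumed)
DT = 1.0 / 12.0  # one-month time step


def derivs(state):
    """Time derivatives of (susceptible, exposed, infectious, recovered)."""
    s, e, i, r = state
    new_infections = BETA * s * i / (s + e + i + r)
    return (-new_infections,
            new_infections - SIGMA * e,
            SIGMA * e - GAMMA * i,
            GAMMA * i)


def shifted(state, slope, h):
    """Return state + h * slope, component by component."""
    return tuple(x + h * k for x, k in zip(state, slope))


def euler_step(state, dt):
    """One forward-Euler step."""
    return shifted(state, derivs(state), dt)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step (fixed step size)."""
    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, dt / 2))
    k3 = derivs(shifted(state, k2, dt / 2))
    k4 = derivs(shifted(state, k3, dt))
    blend = tuple((a + 2 * b + 2 * c + d) / 6
                  for a, b, c, d in zip(k1, k2, k3, k4))
    return shifted(state, blend, dt)


def simulate(step, state, years, dt=DT):
    """Apply `step` repeatedly; return the list of states, one per time step."""
    trajectory = [state]
    for _ in range(round(years / dt)):
        state = step(state, dt)
        trajectory.append(state)
    return trajectory


START = (1_000_000.0, 2000.0, 50.0, 0.0)  # assumed initial S, E, I, R
euler_run = simulate(euler_step, START, years=50)
rk4_run = simulate(rk4_step, START, years=50)
# Largest gap in the infectious compartment between the two integrators.
max_gap = max(abs(a[2] - b[2]) for a, b in zip(euler_run, rk4_run))
print(f"largest Euler vs RK4 gap in infectious count: {max_gap:.4f}")
```

For a slowly varying trajectory such as this one, the gap between the two integrators should be small compared with the infectious count itself. The sketch says nothing about stiffer parameter regimes.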
The ode45 method would be more appropriate if the first derivative were changing sign very sharply (in the authors' experience, for example, at the local amplitude minimum of a non-linear oscillator). However, our populations were sufficiently smooth that this was not necessary. Furthermore, the model is designed to be run in the background of an Excel spreadsheet, which does not have the differential-equation-solving capabilities that are built into MATLAB. A one-month time step was used because it was small enough for the dynamics of the model to remain stable while still running in a reasonable amount of computational time.

[Figure S1 plot: Number of Infected versus Year; curves: Euler, Runge-Kutta 4-5]

Figure S1. Comparison of the same population simulated with both the Euler method and the Runge-Kutta method. N.B. The two curves overlap to the point of indistinguishability.

5a. Page 10 – the authors state that the number of cases will increase in the middle of the century. Please show this result.
5b. Page 12 – the immigrant cases will never vanish because there is a constant incoming rate in to the equation for the E class. Thus, this result is not surprising.

We have added this as an inset to the figure; we feel that expanding the entire graph (Figure 3 in the current paper) to this range would reduce resolution on the time-scales of immediate interest. It is of note (see Guo comment and response to point #4) that this increase is related to the total increase in populations; the per capita rate is expected to stabilize.

6. If the stochastic model is to be used in this study, results should be shown. Last two lines of page 13 – normal distribution… - I do not understand the motivation behind choosing this process. Please explain.

When we replace the "flow" in the model with a probabilistic process, if the number of transmissions is small then it is reasonable to model it by using a Poisson distribution.
Since a larger number of transmissions occur each year, the Poisson distribution can be replaced with a Gaussian distribution with appropriate parameters. Again, the change in the result is minimal and we have left it off the graph in the interest of reducing clutter.

7. Page 14 – the authors state that if new treatments came in to effect, then the model results would no longer be adequate. There are drug therapies for TB now. How do these affect the current model?

Treatments are built into the recovery parameters. New effective treatments would increase the rate of recovery, while new resistances would decrease it. If, during the course of the simulation, one of these parameters were to change, a cusp would occur in the prediction.

8. Parameter values – What are the realistic ranges for these parameter values? How is the model structure sensitive to the parameter ranges (sensitivity analysis)?

We can perform a Latin Hypercube sampling on these parameters to see the forecasted results and their sensitivity. Figure S2, for example, shows the distribution of results from a Latin Hypercube sampling of the initial exposed ratio and the fraction of exposed new immigrants for the immigrant population over 30 years. It leads to a distribution of 61±7 cases, similar to the variation expected from the propagated uncertainty in the transmission parameter. This analysis can be applied to other parameters as well.

[Figure S2 plot: Counts versus Cases]

Figure S2. Distribution in cases after 30 years from a Latin Hypercube sampling of 1000.

9. Why are there no error bars on Fig 3?

Figure 3 is merely a demonstration of one instance of the model rather than a robust prediction. Error bars can be easily added. In retrospect, we no longer think Figure 3 (previous version of the paper) is necessary or conveys useful information to the reader.

10.
Figure 6 – there is not enough information and discussion in the text about the differences between the two cases presented here.

This has been elaborated upon in the caption to Figure 4 and on page 11 in the revised version of the paper.

Minor Comments:

11. I suggest that figures 3-6 should be one figure with 4 subplots

This has been done. The 4 subplots have now been reduced to 3. We have included the separated versions in the text and the combined versions at the end of this document, and leave the choice of which to include to the editor.

Reviewer 2: Guo

1) The paper didn't clearly classify the difference between exposed and latent compartments as TB latency is variable in length. And it is not clear whether those in latent class will progress to active disease or not, though the probability is not negligible compared to those in exposed class [1].

This is discussed at the top of page 15. The latent category does not have an effect on disease dynamics per se (although it could be modified to do so). Rather, there is a cost associated with latent tuberculosis (of several hundred dollars) and this category is necessary for calculating the potential financial burden of the disease. Latent individuals can become infected through the optional re-infection route, although, as discussed in the paper, re-infection only plays a significant role for the Inuit population.

2) (the full question has been redacted for brevity).

Although it would be of interest to rigorously explore the dynamics of the model using such generalizations, we will not do so in this publication because this is beyond the limited scope of the current research question. A fuller discussion has been added on page 14, with reference to those papers, suggesting future investigations.

3) Reinfection route of the model for Inuit population is not shown and it is nice to have readers to know how they add this element to the TB dynamics in a small Inuit population.
Some reinfection could cause complex behaviors such as backward bifurcation. It is recommended to write it out.

This was discussed in the reply to Dr. Heffernan (see page 11 in the revised paper), and the route has been added to the flow chart in Fig 1.

4 & 5) Another point is the future trend of annual new TB infections of immigrant population when the time frame extended to a longer period, for example, 2100, 2500,or 3000. Does it stabilize or keeps increasing?

Over long times the number of cases increases but its ratio to the whole population stabilizes (Figure S3). However, over long times the current birth, immigration, and death rates lead to unrealistic exponential growth; we expect growth to become logistic before these times. See page 14 for additional clarification related to this point.

Figure S3. Long term predictions for the immigrant population.

Minor essential revisions:

The minor essential revisions have been implemented.

Figures S4 and S5. The three forecast graphs arranged horizontally and vertically as a single figure.
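
Supplementary sketch. The uncertainty propagation described in the reply to Reviewer 1, comment 3 can be illustrated roughly as follows. This is a minimal Python sketch under stated assumptions, not the code used for the manuscript. The two-compartment toy model, the function names `forecast_cases` and `propagate`, and every numerical value are hypothetical. The infectious count stands in for the number of new annual cases. As in the reply, the standard deviation of the sampled parameter is set equal to its fitted confidence interval.

```python
import random
import statistics


def forecast_cases(beta, years=30, dt=1.0 / 12.0, exposed0=2000.0,
                   infectious0=50.0, sigma=0.05, gamma=2.0):
    """Euler-integrate a toy exposed/infectious model with monthly steps.

    Returns the infectious count at the end of each year. The model and all
    default values are hypothetical stand-ins, not the manuscript's equations.
    """
    exposed, infectious = exposed0, infectious0
    yearly = []
    for _ in range(years):
        for _ in range(round(1.0 / dt)):
            # Both derivatives are evaluated before either state is updated.
            d_exposed = beta * infectious - sigma * exposed
            d_infectious = sigma * exposed - gamma * infectious
            exposed += dt * d_exposed
            infectious += dt * d_infectious
        yearly.append(infectious)
    return yearly


def propagate(beta_fit, beta_sd, n_runs=1000, years=30, seed=0):
    """Simulate many populations, each with its own Gaussian draw of beta.

    Returns the per-year mean and standard deviation across all runs.
    """
    rng = random.Random(seed)
    runs = [forecast_cases(rng.gauss(beta_fit, beta_sd), years)
            for _ in range(n_runs)]
    means = [statistics.mean(values) for values in zip(*runs)]
    sds = [statistics.stdev(values) for values in zip(*runs)]
    return means, sds


# Assumed best-fit value and confidence interval for the transmission rate.
means, sds = propagate(beta_fit=1.5, beta_sd=0.1, n_runs=500)
for year in (1, 10, 30):
    print(f"year {year:2d}: {means[year - 1]:.1f} +/- {sds[year - 1]:.1f}")
```

Because the spread is computed across simulated populations at each point in time, the resulting band cannot invert in the way that separately simulated upper-bound and lower-bound parameter sets can.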