Responses to Reviewers’ comments

Comments Referee 2b:

……

14. Methods: Sampling and data collection: The sampling strategy is not clear and this whole first paragraph needs to be rewritten with a clearer explanation of how the sampling was done in 2011. This would include an explanation of how schools were selected, i.e., was it random, or did using the list of schools included in the 1985-87 surveys mean it was purposive? How many of the 1985-87 schools were included in the 2011 survey? Did the 2011 survey include more schools (it looks that way from the maps)? In what way were the health-districts involved in the sampling strategy? In what way did the district’s size (population or land cover?), population density and ecological zones influence whether 1 or 2 schools were selected? Was 1 or 2 schools per health-district a large enough sample size? Was the sampling based on statistical calculations?

Responses: In the ‘Sampling and data collection’ section it is already highlighted that ‘In order to assess the current levels of infections and to compare the data with previous ones, the schools were selected using the list of villages and schools previously investigated in 1985-87’. It is also highlighted in the first paragraph of this section that in 1985-87 ‘A stratified random cluster sampling procedure, with the 5th grade of school children as the basic sampling unit was used in the mapping of schistosomiasis and STH in Cameroon’. This clearly explains the sampling strategy and addresses the reviewer’s comment. A new sentence was added to the manuscript to clarify that ‘In districts where the number of schools selected was fewer than in 1985-87, those schools with previously higher prevalence were selected in priority’.

Reviewer’s response: The reviewer feels that the clarification of ‘In districts where the number of schools selected was fewer than in 1985-87, those schools with previously higher prevalence were selected in priority’ helps to explain how the selection was done in 2011 from the original schools in the 1985-87 study, i.e. in each health district the 1 or 2 schools included in the original study were chosen, but where there were more than 1 or 2 schools, those that had the highest prevalence in the 1985-87 study were chosen from the total schools. It would still be good to define:
- Age of school-children in 5th grade
- How many of the 1985-87 schools were included in the 2011 survey

Reviewer’s 3rd response: the authors have still not addressed these points

Authors’ responses: Thanks for the comments. More revisions have been made to the relevant parts in the ‘Sampling and data collection’ section and additional data were added in Table 2. The age of school-children in 5th and 6th grade was defined, and the numbers of schools included in the surveys were added in Table 2.

24. Results: Schistosomiasis: Prevalence: fourth paragraph – this paragraph and analysis are confusing. In the Methods section it is explained that 5th grade children or those in upper classes were selected for sampling in the mapping exercise. Therefore, how are there children as young as 5, 6 and 7 years in the sample? This goes back to comment 15 – what is the age range of children in 5th grade? If there are children that were sampled below 5th grade and the other upper classes because enrolment in these classes was below the 50 children required for the sample, then surely this was infrequent and the number of 5, 6, 7 year olds and even 14 and 15 year olds sampled was very small? Thus, with only a small sample size, is it possible to do such analysis?
This kind of age distribution analysis is commonly done when a sample of 10 or more children is selected from each age-class in each school, which this study was not set up to do. If the authors feel that this paragraph and Figures 2 and 3 should remain in the paper (the reviewer feels there is no justification for them), then there needs to be more evidence of the sample sizes in each age-class and of why there were younger children, and indeed such a large age range, included when 5th grade and upper classes were the target.

Responses: The reviewer did not read the Methods carefully. It is clearly mentioned that: ‘Children were preferentially selected from the 5th grade, and then in other grades where the number of children in the 5th grade was fewer than 50’. Clearly, in some small villages there is only a single school and it is impossible to find 50 children in 5th grade. Therefore, children from the lower classes are sampled to complete the sample size. This is the standard protocol used for mapping of schistosomiasis and STH, and the sentence recalled above clearly explains that. Therefore, Figures 2 and 3 should remain in the paper for more clarity.

Reviewer’s response: The reviewer read the following in the Methods section: ‘In the current study, in each school, urine and stool samples collected from 50 children selected randomly in the upper classes, approximately half boys and half girls. Children were preferentially selected from the 5th grade, and then in other grades where the number of children in the 5th grade was fewer than 50’. These sentences state that children from the upper classes were selected, preferentially from the 5th grade, thereby leading the reader to believe only upper age groups would be selected. The 1990 paper (Ratard et al., 1990, ref 4) states that 5th grade students are 10 to 19 years of age. Thus it is surprising that there are so many children included in the study at younger ages.
The reviewer continues to suggest that it needs to be made clearer in the manuscript why there were younger children included and what the sample sizes were for each age-class. The reviewer is concerned that the sample sizes for the younger age groups will be too small to justify the analysis done on age-groups.

Reviewer’s 3rd response: the authors have still not addressed these points

Authors’ responses: The sentence quoted above by the Reviewer has now been revised in the Methods section. The age distribution in the relevant parts and Figures 2 and 3 have been removed as suggested. This applies to the following Comments 27, 29, 30 and 41, which are not copied here individually.

31. Results: Comparison of data: overall the reviewer is concerned with this analysis for three reasons:

(i) If the analysis of the data means that the prevalence and intensity of infection are ‘adjusted’ due to the sampling methodology and inclusion of sample weights, then can these values be fairly compared to data where this was not done, as the values for each are not the same?

Response: The data in this paper were collected using the same sampling strategy and survey method and analyzed using the same statistical method with adjustment and weighting, so they are fairly comparable.

Reviewer’s response: As the 1990 paper (Ratard et al., 1990, ref 4) does not mention that they adjusted their prevalence and intensity estimates, as the authors have described in this manuscript, the reader will have to trust in this being correct.

Authors’ responses: Thanks for the comments. Sorry we did not make this clear. In the 1985-87 survey, sampling was made “proportional to the school population”; therefore, no adjustment for the prevalence was needed. In the current survey, we adjusted the samples according to the actual population sizes of school age children. Therefore we believe the two sets of estimates are comparable.

(ii) The number of places sampled in the 1985-87 mapping exercise is clearly, from the maps, far fewer than for the 2011 mapping exercise, especially in the north-west and south-west regions, and thus the regional estimates of prevalence are likely to be underestimated or overestimated due to sampling bias.

Response: The number of survey sites may have been different between the two surveys, but the sampling strategy and target population were similar. Indeed, in multistage cluster sampling, more clusters would give a better estimate for the population. To minimize the bias, we have weighted the samples and adjusted the estimate for each region.

Reviewer’s response: The reviewer continues to be confused and thus feels there needs to be greater clarity on the comparison between 1985-87 data and 2011 data with regards to the following points:

1. Nowhere in the text or in Table 2 does it state how many schools were included in the analyses for each region in 1985-87 and how many in 2011. This needs to be stated.

Authors’ responses: The number of schools surveyed in 2011 was given at the beginning of the results. These are now shown in Table 2 together with those from 1985-87.

2. In Figure 5, it is clear that in 2011 there were more schools sampled in North-West and South-West regions than in 1985-87. However, it is not clear in the Methods section (Sampling and data collection) how the selection of these additional schools was done. The impression given is that only schools from the 1985-87 list were selected and, in some health districts, fewer schools than were on the 1985-87 list.

Authors’ responses: Thanks for the comments. The description is now revised to give better understanding. As described now, the 1985-87 survey was sampled based on provinces (regions) and divisions, and not every health district was surveyed, while in the current survey, sampling was made to include schools covering all health districts to give an even spatial coverage.
This may have caused underestimation of the prevalence as commented. This is now discussed in the discussion.

3. The reviewer still questions the validity of comparing the overall schistosomiasis and STH prevalence of each region between the two time points due to the large variation in sample sizes, i.e. in 1985-87 there is only 1 clear school, but perhaps a few on the border with Littoral region, from which the prevalence is estimated. This is compared to 17 schools in the 2011 survey on which the prevalence estimate is based. The reviewer does not feel that such a comparison is statistically robust.

Reviewer’s 3rd response: the authors have still not addressed these points

Authors’ responses: Thanks for the comments. As explained above, the 1985-87 sampling did not consider an even spatial distribution of the selection. Some districts were not surveyed according to the selection. But Littoral region had many more than 1 school and most of them were near the border areas. Due to the scale of the map, it’s difficult to see clearly.

(iii) The schistosomiasis prevalence 2011 data in Table 2 (first row, second column) may be incorrect and thus subsequent analysis is incorrect. It appears as if these data have been added together, e.g. the prevalence of S. haematobium is 3.2%, S. guineensis is 1.2% and S. mansoni is 3.0%, which is 7.4% when summed, the same value as the overall schistosomiasis prevalence (7.4%). However, if there is species co-infection in some of the schools, which there clearly is from the maps, then the overall schistosomiasis prevalence should be lower than the sum of the individual species. The same holds for the overall schistosomiasis prevalence values broken down by region. This means that the values in this column are incorrect and the whole of the schistosomiasis analysis in Table 2 and in the text is incorrect, and subsequently the discussion and the conclusions. If the reviewer is correct then the analysis will need to be redone.
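
The size of the effect at issue in comment (iii) can be illustrated with a short sketch. It uses the rounded species prevalences quoted in the comment and the 3999 children examined for both urine and stool, and it assumes that every co-infected child carries exactly two species. The function name and the co-infection counts are hypothetical, and the calculation is unweighted, so it does not reproduce the adjusted estimates in Table 2.

```python
def union_prevalence(species_prev_pct, n_examined, n_coinfected):
    """Overall prevalence (%) of infection with at least one species.

    Assumes every co-infected child carries exactly two species, so each
    such child is counted twice in the simple sum and is subtracted once.
    """
    summed = sum(species_prev_pct)
    overlap_pct = 100.0 * n_coinfected / n_examined
    return summed - overlap_pct


# Species prevalences as quoted in comment (iii); 3999 children were examined
# for both urine and stool. The co-infection counts below are hypothetical.
species = [3.2, 1.2, 3.0]
for n_coinfected in (0, 2, 20):
    overall = union_prevalence(species, 3999, n_coinfected)
    print(n_coinfected, round(overall, 2))
```

Under these assumptions, two co-infected children lower the overall figure by about 0.05 percentage points, which is at the limit of what a table rounded to one decimal place can show; about 20 co-infected children would be needed to lower it by half a percentage point.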

Response: We have checked the data again and the results are correct. The prevalence for individual species was relatively low. There were only a couple of children who were co-infected and this does not much affect the overall adjusted schistosomiasis prevalence. Therefore, it is unfortunate that the results look like a simple sum. This can be confirmed by the STH results, which showed clearly the co-infection effect.

Reviewer’s response: The reviewer remains dubious that, if there is any co-infection, even if only a couple of children, the overall prevalence and the prevalence by region would remain exactly the same as the simple sum. However, if this is the case then the authors need to address this in the Discussion for the reader.

Reviewer’s 3rd response: the authors have still not addressed these points

Authors’ responses: It is quite fascinating how the results could be interpreted like this by just looking. As we initially explained, the prevalence for each species was low and there were only a couple of co-infections. Given the sample size of about 4000, this couple of co-infections really would not do much to change the figure, and there is also the effect of rounding to 1 decimal place. The authors can confirm again that the results are correct. We have added more information on the samples used for calculating the overall schistosomiasis prevalence, which were those who were examined for both urine and stool samples (3999). Regarding the breakdown of prevalence by region, we could not imagine how this could be a problem! The population size and the sample size for each region were different, and regional figures were calculated using the data within each region, while the overall prevalence was adjusted according to the population sizes because the sampling was not according to PPS.

Comments Referee 3:

Overall I found this paper interesting and I certainly think that it should be published. The data are very useful for others involved in control measures.
However, I was asked specifically to look at the statistical approach adopted by the authors and I have some suggestions for improvements as well as some concerns.

Authors’ responses: Thanks for the comments.

1. P9. “..data were initially analyzed and cleaned up by……” What on earth does this mean? This really needs to be explained carefully because it suggests that you merely removed data that did not suit your purpose (which I am sure you did not!). So the data were quality controlled in respect of the following…… Explain how, and how they were manipulated.

Authors’ responses: Thanks for the comments. It was meant to say “cleaned up for entry errors”. The description has been revised.

2. If you are moving data from a database/spreadsheet such as Excel, you export it out of the database but import it into SPSS.

Authors’ responses: Thanks. This is now revised.

3. “intensity”. This can be a really confusing term. I hope that in your treatment of the quantitative data you actually also included the negatives (zero egg counts), because this is the only meaningful way to treat epidemiological data. If this is what you did, and I hope it is, then you should refer to this as abundance and not intensity. Intensity does not include zero counts and, apart from in studies of pathology, has no role to play in epidemiological studies. The rules for this were established in the following papers and should be followed by both parasite ecologists and epidemiologists:

Margolis L., Esch G.W., Holmes J.C., Kuris A.M. & Schad G.A. 1982. The use of ecological terms in parasitology (report of an ad hoc committee of The American Society of Parasitologists). Journal of Parasitology 68, 131-133

Bush A.O., Lafferty K.D., Lotz J.M. & Shostak A.W. 1997. Parasitology meets ecology on its own terms: Margolis et al., revisited. Journal of Parasitology 83, 575-583

Authors’ responses: Thanks for the comments and suggestions.
The “intensity of infection” is commonly used to represent indirectly the abundance of worm infection in humans, either in the positives only or in the population (both positives and negatives). The calculation here did include both positives and negatives, as we fully agree with the reviewer on this point in epidemiological terms on a population basis. We feel it’s not necessary to change the term “intensity”. The use of “abundance” instead of “intensity” would cause confusion in the NTD community.

4. Sample weighting. Again this is not explained. We need to know how you weighted and in what respect. The sentence that follows, “Sample weighting was applied for each district according to the ratio of the proportionally expected number of schools to be surveyed and the number of actually surveyed schools in each district.”, is insufficient to allow anyone to understand exactly what you did and repeat your treatment of the data. So please explain clearly and fully how you manipulated the data in respect of this weighting.

Authors’ responses: Thanks for the comments. It was indeed a mistake in description. More detail on weighting is now given.

5. P11 and elsewhere. The days when you could just cite (P>0.05) have long gone. For each statistical output you should refer to the test, give the test statistic, degrees of freedom and then the p value. So for example (Kruskal-Wallis test, H=50.0, n=23, 24, 25, P<0.001)

Authors’ responses: Thanks for the comments. We have checked the format in recent publications in BMC Infectious Diseases and made necessary revisions.

6. P 9. “95% confidence intervals (CIs) for prevalence were calculated using the Wilson score method without the continuity correction after adjusting for sample weighting”. What does this mean?

Authors’ responses: Thanks. More detail is now given.

7. Table 2. There is no such thing as P=0.000. In fact this means certainty! So there is no point in citing the test.
In reality of course you have written here an artefact from the computer stats output. Most stats packages fail to round off probability outputs, so when you get P=0.000, you should always convert this to P<0.001.

Authors’ responses: Thanks for the comments. These are now revised.

8. P15 use “reduction” and not “decrease”.

Authors’ responses: Thanks. This is now revised.
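
Comments 6 and 7 from Referee 3 concern the Wilson score interval without continuity correction and the reporting of very small P values. The following is a minimal sketch of both, assuming simple unweighted counts; the manuscript applies the interval after adjusting for sample weighting, which is not reproduced here. The function names and the example counts are illustrative and are not taken from the manuscript.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(positives, n, confidence=0.95):
    """Wilson score interval for a proportion, without continuity correction."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = positives / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point error at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def format_p(p_value):
    """Report a P value to three decimals, never as 'P=0.000'."""
    if p_value < 0.001:
        return "P<0.001"
    return f"P={p_value:.3f}"


# Hypothetical counts, not taken from the survey.
low, high = wilson_interval(5, 10)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
print(format_p(0.0), format_p(0.0312))
```

Unlike the simple normal approximation, the Wilson interval stays within 0 and 1 and remains informative when the observed prevalence is zero, which matters for low-prevalence districts.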