Title:
Estimating animal movement contacts between holdings of different production types.

Authors: Tom Lindström^a, Scott A. Sisson^b, Susanna Stenberg Lewerin^c and Uno Wennergren^a
a IFM Theory and Modelling, Linköping University, 581 83 Linköping, Sweden
b School of Mathematics and Statistics, University of New South Wales, Sydney 2052, Australia
c Department of Disease Control and Epidemiology, SVA, National Veterinary Institute, 751 89 Uppsala, Sweden

Corresponding Author:
Uno Wennergren
Tel: +46 13 28 16 66
Fax: +46 13 28 13 99
Email: unwen@ifm.liu.se
Correspondence address: See above.

Key words
Markov Chain Monte Carlo; Animal databases; Animal movements; Production types

Abstract
Animal movement poses a great risk for disease transmission between holdings. Heterogeneous contact patterns are known to influence the dynamics of disease transmission and should be included in modeling. Using pig movement data from Sweden as an example, we present a method for quantification of between-holding contact probabilities based on different production types. The data contained seven production types: Sow pool center, Sow pool satellite, Farrow-to-finish, Nucleus herd, Piglet producer, Multiplying herd and Fattening herd. The method also estimates how strongly each production type determines the contact pattern of holdings that have more than one type. The method is based on Bayesian analysis and uses data from central databases of animal movement. Holdings with different production types are estimated to vary in the number of contacts, in the types of holding they have contact with, and in the direction of the contacts.
Movements from Multiplying herds to Sow pool centers, Nucleus herds to other Nucleus herds, Sow pool centers to Sow pool satellites, Sow pool satellites to Sow pool centers and Nucleus herds to Multiplying herds were estimated to be most common relative to the abundance of the production types. We show with a simulation study that these contact patterns may also be expected to result in substantial differences in disease transmission via animal movements, depending on the index holding. Simulating transmission for a one-year period showed that the median number of infected holdings was 1 (i.e. only the index holding infected) if the infection started at a Fattening herd and 2161 if the infection started at a Nucleus herd. We conclude that it is valuable to include production types in models of disease transmission and the method presented in this paper may be used for such models when appropriate data is available. We also argue that keeping records of production types is of great value since it may be helpful in risk assessments.

1. Introduction
In the last decade several major outbreaks of contagious livestock diseases, such as foot and mouth disease (FMD) and classical swine fever (CSF), have occurred in Europe. In the effort to minimize the risks and extent of outbreaks, researchers are increasingly recognizing the impact of the contact structures that mediate transmission of infectious agents between holdings (e.g. Velthuis and Mourits 2007, Dubé et al. 2009, Nöremark et al. 2009, Vernon and Keeling 2009). Depending on the disease, the risk of different contact types may vary, but between-holding movements of animals are often regarded as the main risk factor for transmission of livestock disease (Fèvre et al. 2006, Ortiz-Pelaez et al. 2006, Rweyemamu et al. 2008).

Large scale experiments on contagious animal diseases are impossible for both financial and practical reasons.
Instead, risk assessment and contingency plans must be based on modeling scenarios, which can be based on previous outbreaks or even outbreaks from other countries. However, models and model parameterizations are not directly applicable from one country to another due to differences in animal population structure and density, production systems, contact networks etc. For instance, Robinson and Christley (2007) showed that livestock markets are expected to have great influence on the transmission of diseases in the UK, while Nöremark et al. (2009) state that such markets are rare in Sweden. Therefore animal movement patterns may be very different in different countries or regions. In addition, even outbreaks of the same disease in the same country may be very different, which is clearly illustrated by the fact that the more recent outbreak of FMD in the United Kingdom in 2007 followed a totally different course (DEFRA 2009) than the 2001 outbreak. Drawing conclusions from just one or a few outbreaks may therefore be problematic. The livestock industry is not static and the structure as well as the movement patterns may change, in particular after major outbreaks (Velthuis and Mourits 2007), which makes forecasting based on historic outbreak data difficult. Modeling based on observed contact structure is a good approach as it can be based on specific and (ideally) updated data of each country. Member states of the EU must keep databases on all livestock holdings and register movements of cattle and pigs. Therefore such data is often available for analysis and model parameterization. However, not all national databases contain the same details and this may limit the use for some applications in some countries.

Modeling may assume that all holdings have equal probability of contacts, but heterogeneity in both the number of contacts and between which holdings these occur has been recorded (Bigras-Poulin et al.
2006, Lindström et al. 2009, Nöremark et al. 2009) and such differences are known to affect the transmission dynamics (Mollison 1993, Keeling 2005, Moslonka-Lefebvre et al. 2009). One factor that may lead to non-random contact patterns is the production types of the holdings. By production type we mean the different types of specialized production systems, e.g. the separation of breeding animals and animals for slaughter on different holdings, which are usually applied in modern animal production. We may assume that animal movements are more common between some production types than others, and also that these may differ in the direction of the contact. EU requirements do not include details on production type, but the pig holding databases in some countries, e.g. Sweden, contain this information.

The scope of this paper is twofold. First, we present a method for quantification of the contact structure between holdings with different production types. Previous studies have shown differences in the contact pattern between holdings with different production types (Dickey et al. 2008, Ribbens et al. 2009) and our aim is to provide estimates that can be used in the modeling of such contacts. Commonly, parameterizations of models are based on expert opinions and questionnaires (Christensen et al. 2008, Dickey et al. 2008, Ribbens et al. 2009, Ward et al. 2009). Our method, based on Bayesian analysis, utilizes data from central databases in which animal movements and holding information are stored at the national level. It also handles the fact that holdings may have more than one production type by analyzing whether some production types may be dominant in determining the contacts between holdings. Bayesian analysis is a flexible tool that allows for parameterization of complex models.
Rather than just finding the maximum likelihood estimate of the model parameters, the analysis considers the data with parameter uncertainty included. This uncertainty may then be incorporated when the model is used for simulation or inference based on the parameters. We apply the method to data of pig movements in Sweden from one year.

Secondly, we also perform a simplistic simulation of disease transmission via the estimated contact pattern. The intention is not to investigate the dynamics of any specific disease. Rather we investigate how model predictions generally may change with inclusion of production types as estimated with the method presented. More specifically we investigate how the expected number of infected holdings differs depending on the production type of the first infected holding, based on direct transmission via animal movements only. Previous studies have shown that the type of index holding may have large impact on outbreak dynamics. For instance, Boklund et al. (2008) focused on the difference between herds with domesticated and wild boar populations and showed in a simulation study of CSF that the probability of an epidemic changed depending on the type of the index herd. Also, Ward et al. (2009) and Pineda-Krsh et al. (2010) simulated transmission of FMD between cattle and pig herds, respectively, and showed that the index herd affected both the size and duration of an outbreak.

2. Material and method
2.1 Data
Data were supplied by the Swedish Board of Agriculture. They contained pig holdings with their reported production types and movements reported from July 2005 until June 2006. Holdings are supposed to be removed from the database if they no longer have any pigs but it is known that this is not always the case. Inactive holdings were therefore removed as described in Nöremark et al. (2009) and we use the same data as was used in that study for spatial analysis of pig holdings.
This data has previously been analyzed for spatial (Lindström et al. 2009, Nöremark et al. 2009) and temporal (Nöremark et al. 2009) patterns. Little seasonal variation was observed for pig movements.

Movements of cattle and pigs, as well as information related to livestock holdings, are registered by the Swedish Board of Agriculture and stored in different databases. The database is designed for multiple purposes, including the basis of agricultural subsidies, contact with individual farmers and disease tracing. EU legislation requires that all member states keep databases including all holdings and register movements of cattle and pigs. The holdings with pigs are registered with a unique number, the production-place number (PPN), and the database contains information on postal address, species kept and approximate number of animals kept on the holding. The type of production is also included. Holdings that are geographically separated from each other should have different PPNs. Thus one farmer can have several holdings. The PPN system is also used for transport vehicles transporting cattle or pigs.

Movements of pigs are recorded at the group level, not for each pig. The reports include the date, the number of pigs, and the start and end holdings of the movements. The movements are reported by the farmer at the PPN sending the animals. Farmers should report within seven days after the event, either electronically or using a form sent by ordinary mail. Due to single reporting on group level, the movement reports could not be matched or checked with location, but earlier work on cattle data has shown that errors may be common in these data (Nöremark et al. 2009). A total of 3084 holdings and 20231 movements were included in the analysis. Data on production type are recorded at the first registration of an animal holding but there is no requirement for updating this information.
The Swedish pig industry has a pyramidal structure with transports predominantly going downward in the system. Permanent trade agreements between pig farmers are quite common. The different production types are defined as Nucleus herds (that produce breeding animals), Multiplying herds (that receive breeding animals and breed gilts for sale to piglet producers), Piglet producers, Fattening herds and Farrow-to-finish herds. Moreover, there is a special type of production system involving several holdings, namely Sow pools. These consist of a Sow pool center (where sows are covered or inseminated) and Sow pool satellites (where the sows farrow). Thus, the sows are regularly moved back and forth between the central unit and the satellites and they do not always farrow in the same satellite herd.

We included seven production types which may be selected by the farmer on the report form. Farmers also have the possibility to select a box for “other” and leave free text information, but this was excluded from the analysis. Instructions state that a maximum of two production types may be reported for each PPN but the maximum number of production types reported was five. In total, 20% of the holdings had reported more than one type and 7.6% of the holdings had no information on production type or had only given free text information. The number of holdings that had reported each production type is presented in Table 1.

2.2 Model and parameter specification
The available data include reported production types, which we denote $\mathbf{R}$. We indicate $r_{ik} = 1$ if holding $i$ ($i = 1, 2, \ldots, N$, where $N$ is the number of holdings) has reported production type $k$ ($k = 1, 2, \ldots, K$, where $K$ is the number of production types including the additional type “Missing information”, which is referred to as type $m$) and $r_{ik} = 0$ if holding $i$ has not reported production type $k$.
Also, the data include the start, $\mathbf{s}$, and end, $\mathbf{e}$, holdings of all $T$ movements and we use the notation that $e_t$ and $s_t$ are the end and start holdings, respectively, of movement $t$. We say that $\mathbf{R}$ is fixed and, given the observed data, we search the posterior distribution of two parameters of interest, $\mathbf{H}$ and $\mathbf{v}$. The parameter $\mathbf{H}$ models how common movements between production types are (see below for more details). The parameter vector $\mathbf{v}$ has $K - 1$ elements (production type $m$ is excluded, see below) and indicates whether some production types are dominant in determining the contacts of a holding with more than one production type. If there is no difference between the types, then $v_k = 1/(K - 1)$ for all production types $k = 1, 2, \ldots, K - 1$.

Parameter matrix $\mathbf{H}$ has $K \times K$ elements and indicates how common movements between the different production types are, and $h_{\alpha\beta} = 1/K^2$ if there is no difference in probability of contacts between holdings. A higher value indicates that movements between the types are more common than expected at random. Hence, $\mathbf{H}$ can be seen as a measurement of the commonness of movements between holdings with different production types but accounting for how frequent production types are, and we refer to this as the commonness index. Both $\mathbf{H}$ and $\mathbf{v}$ are defined such that the sum of the elements equals one.

To include the fact that many holdings have more than one production type we model movements between holdings assuming that these act as a mixture of the types. The proportions are given through $\mathbf{v}$ and $\mathbf{R}$ and each holding $i$ is assumed to be of type $k$ with the proportion

\[
\tilde{v}_{ik} =
\begin{cases}
\dfrac{r_{ik} v_k}{\sum_{\gamma} r_{i\gamma} v_{\gamma}} & \text{for } k \neq m, \text{ if } r_{im} \neq 1 \\[2ex]
0 & \text{for } k = m, \text{ if } r_{im} \neq 1 \\[1ex]
0 & \text{for } k \neq m, \text{ if } r_{im} = 1 \\[1ex]
1 & \text{for } k = m, \text{ if } r_{im} = 1
\end{cases}
\tag{1}
\]

where $k = 1, 2, \ldots, K - 1$ and type $m$ refers to “Missing information” in the data analyzed.
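Equation 1 can be illustrated with a short numerical sketch. The Python code below is not the implementation used in the study; the function name `type_proportions`, the array layout and the toy values are assumptions made for illustration only. It assumes that every holding either has reported at least one type or is flagged as type $m$.

```python
import numpy as np

def type_proportions(R, v, m):
    """Sketch of equation 1: proportion of each holding attributed to each type.

    R : (N, K) 0/1 array of reported types; column m is "Missing information".
    v : length K-1 weights for the non-missing types (assumed to sum to one).
    m : column index of the missing-information type (hypothetical layout).
    """
    R = np.asarray(R, dtype=float)
    N, K = R.shape
    # Put v into a length-K vector with zero weight for the missing type m.
    w = np.zeros(K)
    w[np.arange(K) != m] = v
    v_tilde = np.zeros((N, K))
    missing = R[:, m] == 1
    # Holdings with missing information are entirely of type m.
    v_tilde[missing, m] = 1.0
    # All other holdings: v normalized over the types they have reported.
    weighted = R[~missing] * w
    v_tilde[~missing] = weighted / weighted.sum(axis=1, keepdims=True)
    return v_tilde

# Toy example: two real types (columns 0 and 1) and the missing type (column 2).
R = [[1, 1, 0],   # holding reporting both types
     [0, 1, 0],   # holding reporting one type
     [0, 0, 1]]   # holding with missing information
print(type_proportions(R, v=[0.8, 0.2], m=2))
```

A holding that reports both types is split 0.8 to 0.2 here, rather than equally, which is the behaviour the parameter $\mathbf{v}$ is meant to capture.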
The reason why this has to be treated separately is that this enforced type is never shared with any other type for any holding (as it by definition then would not be of type $m$). Hence, $\tilde{v}_{im} = 1$ independent of $\mathbf{v}$. In the case where $k \neq m$, equation 1 is interpreted as $\mathbf{v}$ normalized over the reported production types of each holding $i$.

The basic outline of the model is that we search the posterior distribution

\[
P(\mathbf{H}, \mathbf{v} \mid \mathbf{e}, \mathbf{s}, \mathbf{R}) \propto L(\mathbf{H}, \mathbf{v} \mid \mathbf{e}, \mathbf{s}, \mathbf{R})\, P(\mathbf{H}, \mathbf{v})
\tag{2}
\]

where $L(\mathbf{H}, \mathbf{v} \mid \mathbf{e}, \mathbf{s}, \mathbf{R})$ is the likelihood of all movements and $P(\mathbf{H}, \mathbf{v})$ is the prior. The likelihood of one movement, $L(e_t, s_t \mid \mathbf{H}, \mathbf{v}, \mathbf{R})$, may be rewritten as

\[
L(e_t, s_t \mid \mathbf{H}, \mathbf{v}, \mathbf{R}) = P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)\, P(\alpha_t, \beta_t \mid \mathbf{H}, \mathbf{v}, \mathbf{R}),
\]

hence formulating the probability of $e_t, s_t$ conditional on the production types of the start ($\alpha_t$) and end holding ($\beta_t$). We use the functions

\[
P(\alpha_t, \beta_t \mid \mathbf{H}, \mathbf{v}, \mathbf{R}) = \frac{h_{\alpha\beta}\, \tilde{N}_{\alpha}\, \tilde{\tilde{N}}_{\beta\alpha}}{\sum_{\alpha} \sum_{\beta} h_{\alpha\beta}\, \tilde{N}_{\alpha}\, \tilde{\tilde{N}}_{\beta\alpha}}
\tag{3}
\]

and

\[
P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t) = \frac{\tilde{v}_{s\alpha}\, \tilde{v}_{e\beta}}{\tilde{N}_{\alpha}\, \tilde{\tilde{N}}_{\beta s}},
\tag{4}
\]

where $\tilde{N}_{\alpha} = \sum_i \tilde{v}_{i\alpha}$, $\tilde{\tilde{N}}_{\beta\alpha} = \sum_i \tilde{v}_{i\beta} (1 - \tilde{v}_{i\alpha} / \tilde{N}_{\alpha})$ and $\tilde{\tilde{N}}_{\beta s} = \sum_{i \neq s} \tilde{v}_{i\beta}$ may be interpreted as the amount of holdings of each production type, taking into account that a holding may be a proportion of each type. $\tilde{\tilde{N}}_{\beta\alpha}$ and $\tilde{\tilde{N}}_{\beta s}$ are adjusted so that the normalizations of equations 3 and 4 account for the exclusion of $e_t = s_t$: a movement will not end up at the same holding as it is sent from. The likelihood of all observed start and end holdings is written

\[
L(\mathbf{H}, \mathbf{v} \mid \mathbf{e}, \mathbf{s}, \mathbf{R}) = \prod_{t=1}^{T} \left( P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)\, P(\alpha_t, \beta_t \mid \mathbf{H}, \mathbf{v}, \mathbf{R}) \right)
\tag{5}
\]

Appendix A gives a more detailed description of the model.

We use Markov Chain Monte Carlo (MCMC) for parameter estimation. This is described in Appendix B.

2.3 Simulation
We set up a simple simulation model where contacts via animal movements between holdings are modeled with probabilities given by section 2.2.
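A movement with the probabilities of section 2.2 can be drawn in two steps: a pair of production types from equation 3, then a start and an end holding from equation 4. The Python sketch below illustrates this together with a directed, instant-infection pass over an ordered list of movements. It is not the authors' program; all function names and the toy values of the commonness matrix and type proportions are hypothetical, and the sketch assumes every type is present at two or more holdings so that an end holding always exists.

```python
import numpy as np

def type_pair_probs(H, v_tilde):
    """Equation 3: probability of each (start type, end type) pair."""
    N_a = v_tilde.sum(axis=0)  # amount of holdings of each type
    share = np.divide(v_tilde, N_a, out=np.zeros_like(v_tilde), where=N_a > 0)
    # NN[a, b] = sum_i v_tilde[i, b] * (1 - v_tilde[i, a] / N_a[a])
    NN = (1.0 - share).T @ v_tilde
    w = H * N_a[:, None] * NN
    return w / w.sum()

def draw_movement(H, v_tilde, rng):
    """Draw (start, end): types from equation 3, holdings from equation 4."""
    K = H.shape[0]
    g = type_pair_probs(H, v_tilde)
    a, b = divmod(rng.choice(K * K, p=g.ravel()), K)
    p_start = v_tilde[:, a] / v_tilde[:, a].sum()
    s = rng.choice(len(p_start), p=p_start)
    p_end = v_tilde[:, b].copy()
    p_end[s] = 0.0  # a movement never ends at the holding it was sent from
    e = rng.choice(len(p_end), p=p_end / p_end.sum())
    return int(s), int(e)

def spread(movements, index_holding):
    """Infection passes only from an infected start holding to the end holding."""
    infected = {index_holding}
    for s, e in movements:
        if s in infected:
            infected.add(e)
    return infected

# Toy values (assumed): five holdings, two types, holding 2 is a mix of both.
rng = np.random.default_rng(1)
v_tilde = np.array([[1, 0], [1, 0], [0.5, 0.5], [0, 1], [0, 1]], dtype=float)
H = np.array([[0.1, 0.6], [0.1, 0.2]])
movements = [draw_movement(H, v_tilde, rng) for _ in range(50)]
print(len(spread(movements, index_holding=0)))
```

In a full simulation the matrix `H` and the proportions `v_tilde` would be recomputed from a fresh posterior draw for each run, so that parameter uncertainty carries over into the predicted number of infected holdings.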
Programs were written in MATLAB 7.8. In each simulation the model is parameterized with random draws from the posterior distribution, as provided by the MCMC output. We simulated transmission between the same holdings, with their reported production types, as were used in the analysis of $\mathbf{H}$ and $\mathbf{v}$, and the holdings are used as infective units. Each holding will be in either state S (susceptible) or I (infected). The simulation assumes instant infection through directed contacts, i.e. if a movement occurs from infected holding A to B, B will instantly become infected and all subsequent movements from B to any uninfected holding C will lead to infection. If however there is a movement from uninfected holding D to infected holding E, D will not become infected.

We investigated how the expected number of infected holdings varies depending on the production type of the initially infected holding. Therefore we varied the type of the index holding and selected randomly from holdings that had reported the current production type. We ran the simulations 500 times for each production type and recorded the number of infected holdings after 388, 1552 and 20231 movements, which corresponds to the expected number of movements for 7, 28 and 365 days, respectively. Note that these numbers also include movements between non-infected holdings. We analyzed the simulation results with Kruskal-Wallis tests and box plots.

3. Results
Estimates of $\mathbf{H}$ are shown in Table 2. The five highest commonness indices were found for, in decreasing order, movements from Multiplying herds to Sow pool centers, Nucleus herds to other Nucleus herds, Sow pool centers to Sow pool satellites, Sow pool satellites to Sow pool centers and Nucleus herds to Multiplying herds. Movements involving Fattening herds were estimated to be relatively rare, and in particular movements from such holdings.
Movements to and from Farrow-to-finish type holdings also had low commonness indices. Sow pool satellites only had high commonness indices for movements from Sow pool centers and, except for the high commonness index of movements back to Sow pool centers, Sow pool satellites tend to send pigs to Fattening herds. Movements from Nucleus herds generally had high commonness indices but estimates were low for all incoming movements except from other Nucleus herds.

Estimates for $\mathbf{v}$ are shown in Table 3. Sow pool centers were estimated to be dominant in determining the contacts with other holdings. Fattening herds were estimated to predict very little about the contacts of a herd when this production type was reported together with other types.

The predicted numbers of infected holdings depending on the production type where the infection was initialized are shown in Figure 1. The highest median was found for simulations where the initial infection started in a Nucleus herd, followed by Multiplying herds and Sow pool centers. Simulations with the initial infection on holdings with Fattening herds, Farrow-to-finish and Missing information showed a low median number of infected holdings but many outliers with large numbers of infections. The trend was most apparent in the one-year simulation, where the lowest median, including the index holding, was found for Fattening herds (median=1) and the largest for Nucleus herds (median=2161). The Kruskal-Wallis test showed that $p \ll 0.01$ for all tested time periods, and the null hypothesis that the number of infected holdings is independent of the production type of the initial holding is rejected.

4. Discussion
Largely, our results on general movement patterns between production types reflect what is already known about the Swedish pig industry. The analysis of $\mathbf{H}$ shows that the top five estimates are found for movements between types that are known to move many animals between them.
Also, the commonness index for movement between Sow pool centers is estimated to be high, indicating that trade of animals between different holdings of this type would be common. However, trade between Sow pool centers is not allowed in the system of Sow pools (the idea being that each Sow pool should function as one unit without contact with herds outside the pool except for some sourcing herds for gilts). This finding is thus contrary to what is known about the actual practice of these units and indicates that there may be misclassification of the production type of some herds in this category. Also, the analysis estimates that there is a high commonness index for movements from Nucleus herds to Sow pool centers. Hence, while there are (as expected) many movements from Nucleus herds to Multiplying herds and from Multiplying herds to Sow pool centers, movements directly from Nucleus herds to Sow pool centers are also estimated to be common. We cannot exclude the possibility that these unexpected results are due to erroneous reports on production types. The production types involved are quite rare (see Table 1) and thereby more sensitive to data quality.

From Table 2 we observe that Fattening herds have higher commonness indices for incoming than outgoing movements in contacts with all other types. This result is expected since Fattening herds only produce pigs for slaughter and slaughterhouses were not included in this study. By comparing the estimated values of $h_{\alpha\beta}$ to $h_{\beta\alpha}$ in Table 2 (i.e. movements from type $\alpha$ to $\beta$ compared to type $\beta$ to $\alpha$) we may further conclude that the commonness index of movements generally differs depending on the direction. The credibility intervals of $h_{\alpha\beta}$ and $h_{\beta\alpha}$, $\beta \neq \alpha$, only overlap in 4 out of 28 cases.
One important exception is the commonness index of movements between Sow pool centers and Sow pool satellites, for which the estimates are very similar independent of direction. This similarity is expected given that the system is based on the same number of sows moving through the entire system, back and forth between the center and the satellite holdings.

The fact that many holdings have more than one production type needs to be incorporated when modeling contacts. One may assume that a holding that for instance has two types will have a contact structure that is 50% of each type. The analysis of $\mathbf{v}$ however shows that such a supposition is incorrect, at least for the data analyzed in this study. There is great difference in how much the production types determine the contacts of a holding. The type Sow pool center is very dominant while Fattening herd determines very little about the contacts of a holding when this type is reported concurrently with others. So rather than assuming 50% of each type, a holding $i$ that has reported two production types is expected to have a contact structure that is determined by $\tilde{v}_{ik}$ (as defined by equation 1) of each type $k = 1, 2$. Hence, a holding that has reported e.g. both Sow pool center and Fattening herd is expected to interact with other holdings as 99.7% (using mean estimates of $\mathbf{v}$ as shown in Table 3, $0.83/(0.83 + 0.0023)$) Sow pool center and only 0.3% Fattening herd. The expected contact pattern of such a holding would be very different if it were assumed to be determined equally by the two types, as the estimates for $\mathbf{H}$ are very different between movements involving Sow pool centers and Fattening herds (see Table 2).

The assumptions of the simulation model are too crude to capture the dynamics of any real disease as there is no intra-herd dynamics and all other contacts are neglected.
Simplistic models with holdings as infective units do however, as pointed out by Vernon and Keeling (2009), allow for investigation of the effects of the contact structure. In the simulation study, disease transmission was estimated to be highest if the index holding was a Nucleus herd, and also high if the index holding was of types Sow pool center, Sow pool satellite or Multiplying herd. The median number of cases was very low if the index holding was of type Farrow-to-finish, Fattening herd or had Missing information. Also, the Kruskal-Wallis test showed that the number of infected holdings differed depending on the index holding, which is expected both from the analyzed contact structure and from previous studies where production types have been included in outbreak simulations (Boklund et al. 2008, Ward et al. 2009, Pineda-Krsh et al. 2010). This is also in accordance with what is expected and planned for in contingency plans and disease surveillance programs.

We conclude that the observed contact heterogeneities are also expected to influence the dynamics of disease transmission. However, a more realistic model needs to include other factors, such as intra-herd dynamics, incubation time, mode of spread and other disease specific aspects. Moreover, there are other factors influencing the probability of contacts between holdings via animal movements. Distance is known to be an important factor (Lindström et al. 2009, Ribbens et al. 2009) and we may expect holdings with larger herd sizes to have more contacts (Ribbens et al. 2009). We do however argue that these factors, if included, should be analyzed as being dependent on the production type. For instance, more short distance contacts may occur when more permanent agreements between farmers are present, such as for the actors in Sow pools.
Likewise, a large Farrow-to-finish holding might not necessarily have many incoming contacts as the whole production chain is integrated on the holding.

Our results however clearly show that production types influence the contact pattern and this in turn is expected to have implications for disease transmission. Hence, there is great value in including this information in animal databases and it has the potential to improve risk assessment. Ideally, animal databases should include exact geographic location, type of production, type of housing, animal species kept on the premises and the number of animals of each species, for each animal holding. Moreover, yearly updating of the data would be optimal. In reality, this may not be feasible for existing databases but for countries setting up new systems, these aspects should be considered.

We believe that production types should be included in disease spread models when data is available and urge other researchers to include this in their studies. However, one must remember that the contact patterns may change quickly, due to structural changes in pig production. Larger, more specialized units may lead to less frequent reporting of more than one production system. Some types of production may, however, become more common in smaller units or mixed with others. Therefore, analysis of the contact patterns needs to be updated to avoid erroneous assumptions about the structure. Also, reliable data is essential if analysis of the contact pattern is to be used in risk assessment of between-holding disease transmission. Clear guidelines to the farmers may improve the quality of the data and thereby the possibility to utilize this for risk assessments of disease spread.

5. Conclusion
We have presented a model for analysis of animal movements between holdings of varying production types and applied it to pig movement data from Sweden.
We have shown that production types differ greatly in how much they influence the contact pattern when a holding has more than one type, and that the contact pattern between the types is heterogeneous. The results also demonstrate that there generally is a difference in direction of contacts. We have further shown that the contact heterogeneity is expected to influence the dynamics of disease spread via the considered contacts. Hence we believe that models based on contact patterns between holdings may be improved by inclusion of production types and we argue that the method presented in this paper may be used for parameterization if data is available. It is therefore valuable to include such information in central databases of livestock holdings.

Conflict of interest
We have no conflict of interest.

Acknowledgement
We thank the Swedish Emergency Management Agency (KBM) for funding and the Swedish Board of Agriculture for supplying the data used.

Appendix A. Model details
Here follow some clarifications on the model used. In determining the distribution of $L(\mathbf{H}, \mathbf{v} \mid \mathbf{e}, \mathbf{s}, \mathbf{R})$, we start with the hypothetical case of no difference between production types and no difference in the probability of movements between them. In such a setup, the probability that a movement $t$ originates at holding $s$ and ends up at holding $e$ is given by

\[
P(e_t, s_t \mid N) = P(s_t \mid N)\, P(e_t \mid N, s_t) = \frac{1}{N} \cdot \frac{1}{N - 1}
\tag{A.1}
\]

where $N$ is the total number of holdings. $P(e_t \mid N, s_t) = \frac{1}{N - 1}$ rather than $\frac{1}{N}$ since a movement may not end up at the same holding as it originates.

If however there are different production types, each holding has exactly one type ($r_{ik} = 1$ for only one type $k$) and the probability of contacts between these varies, then

\[
P(e_t, s_t \mid \mathbf{H}, \mathbf{R}) = \frac{h_{\alpha\beta}}{\sum_{\alpha} \sum_{\beta} h_{\alpha\beta}\, n_{\alpha}\, \tilde{n}_{\beta}}
\tag{A.2}
\]

where $\alpha, \beta$ are the production types of $s$ and $e$ respectively.
Conditionality on $\mathbf{R}$ is expressed through $\mathbf{N}$, which is a vector of size $K$, and $N_k$ is the number of holdings with production type $k$, i.e. $\sum_i r_{ik}$. Since movements starting and ending at the same holding are excluded, $\tilde{N}_\beta = N_\beta$ if $\alpha \neq \beta$ and $\tilde{N}_\beta = N_\beta - 1$ if $\alpha = \beta$. The parameter $\mathbf{h}$ determines whether movements are more or less likely between types $\alpha, \beta$, and equation A.2 can be rewritten as equation A.1 if $h_{\alpha\beta} = 1/K^2$ for all $\alpha, \beta$.

If the interest is not in the probability of individual holdings, but rather in the probability of contact between production types, the probability of a movement from type $\alpha$ to type $\beta$ is given by

$$P(\alpha, \beta \mid \mathbf{h}, \mathbf{R}) = \frac{h_{\alpha\beta} N_\alpha \tilde{N}_\beta}{\sum_{\alpha'} \sum_{\beta'} h_{\alpha'\beta'} N_{\alpha'} \tilde{N}_{\beta'}} \qquad \text{(A.3)}$$

This is obtained by summation of the probability of movements between holdings of types $\alpha, \beta$ divided by the probability of all possible movements. If we include the fact that many holdings have more than one production type, equation A.3 may be rewritten as equation 3. Hence, equations 3 and A.3 are identical if all holdings have exactly one production type. The probability of the start holding is then $\hat{v}_{s\alpha} / \hat{N}_\alpha$, i.e. the proportion of $s$ being of type $\alpha$ divided by the total amount of this type, $\hat{N}_\alpha$. Similarly, the probability of the end holding is $\hat{v}_{e\beta} / \hat{\hat{N}}_{\beta s}$, and the probability of start and end holding of movement $t$ conditional on production types $\alpha_t, \beta_t$ is given by equation 4.

Appendix B. Parameter estimation
To facilitate computation of the full conditional distribution, we introduce an indicator variable (Gelman et al. 2004), $I$, of size $K \times K \times T$, and rewrite equation 5 as

$$L(\mathbf{e}, \mathbf{s}, I \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) = \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} \left( P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)\, P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) \right)^{I_{\alpha\beta t}} \qquad \text{(B.1)}$$

where $I_{\alpha\beta t} = 1$, probabilistically, for exactly one combination of $\alpha, \beta$ for each movement $t$ with (full conditional) probability

$$\Pr(I_{\alpha\beta t} = 1 \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) = \frac{\hat{v}_{s\alpha} \hat{v}_{e\beta} h_{\alpha\beta}}{\sum_{\alpha'} \sum_{\beta'} \hat{v}_{s\alpha'} \hat{v}_{e\beta'} h_{\alpha'\beta'}}. \qquad \text{(B.2)}$$

Note that unlike the standard formulation of mixture models in Gelman et al. (2004), the mixture components in equation B.1 include the mixing distribution $\mathbf{v}$. This is because $P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)$ and $P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R})$ are normalized over the proportion of holdings belonging to each production type (see equations 4 and 3, respectively). The full model distribution is then written as

$$P(\mathbf{h}, \mathbf{v}, I \mid \mathbf{e}, \mathbf{s}, \mathbf{R}) \propto L(\mathbf{e}, \mathbf{s} \mid \mathbf{h}, \mathbf{v}, I, \mathbf{R})\, \pi(\mathbf{h}, \mathbf{v}, I). \qquad \text{(B.3)}$$

Parameters $\mathbf{h}$, $\mathbf{v}$ and $I$ are estimated with MCMC. Parameter $I$ is updated with Gibbs sampling by drawing one random number for each $t$ from a multinomial distribution with probabilities given by equation B.2. The full conditional distributions of $\mathbf{h}$ and $\mathbf{v}$ are of non-standard form and Metropolis-Hastings updates have to be used (see below). The conditional distribution of $\mathbf{h}$ is based on equation 3, and with inclusion of the indicator variable $I$ it is given by

$$P(\mathbf{h} \mid \mathbf{v}, I, \mathbf{R}) \propto \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} \left( P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) \right)^{I_{\alpha\beta t}} \pi(\mathbf{h}) \qquad \text{(B.4)}$$

where conditionality on $\mathbf{v}$ and $\mathbf{R}$ is expressed through $\hat{\mathbf{v}}$, $\hat{\mathbf{N}}$ and $\hat{\hat{\mathbf{N}}}$, and $\pi(\mathbf{h})$ is the prior distribution of $\mathbf{h}$. We use an uninformative prior for $\mathbf{h}$ ($\mathrm{Dirichlet}(1, 1, \ldots, 1)$). For each update and $t$, $\left( P(\alpha, \beta \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) \right)^{I_{\alpha\beta t}}$ will deviate from one for only one combination of $\alpha, \beta$, and the probability distribution of $\mathbf{h}$ can be given by

$$L_1 = \mathrm{Multinomial}(n_{1,1}, n_{1,2}, \ldots, n_{2,1}, n_{2,2}, \ldots, n_{K,K-1}, n_{K,K} \mid p_{1,1}, p_{1,2}, \ldots, p_{2,1}, p_{2,2}, \ldots, p_{K,K-1}, p_{K,K}) \qquad \text{(B.5)}$$

where $p_{\alpha,\beta} = P(\alpha, \beta \mid \mathbf{h}, \mathbf{v}, \mathbf{R})$ as given in equation 3 and $n_{\alpha,\beta} = \sum_t I_{\alpha\beta t}$.
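The Gibbs update of the indicator variable can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `gibbs_update_indicator`, the array layout (`v_hat[i, k]` is assumed to hold the weight of production type k for holding i, and zero for types the holding lacks) and the toy numbers are assumptions made for illustration only. For each movement, the weights of equation B.2 are normalised and one pair of production types is drawn; the counts that enter the multinomial form are accumulated along the way.

```python
import numpy as np

def gibbs_update_indicator(s, e, v_hat, h, rng):
    """Draw one (alpha, beta) pair per movement; return the pairs and counts n."""
    K = h.shape[0]
    n = np.zeros((K, K), dtype=int)
    pairs = np.empty((len(s), 2), dtype=int)
    for t, (s_t, e_t) in enumerate(zip(s, e)):
        # Unnormalised weights v_hat[s, alpha] * v_hat[e, beta] * h[alpha, beta]
        w = np.outer(v_hat[s_t], v_hat[e_t]) * h
        p = (w / w.sum()).ravel()
        # One categorical draw over the K*K combinations of types
        idx = rng.choice(K * K, p=p)
        alpha, beta = divmod(idx, K)
        pairs[t] = (alpha, beta)
        n[alpha, beta] += 1
    return pairs, n

# Toy example (hypothetical values): 3 holdings, K = 2 production types
rng = np.random.default_rng(0)
v_hat = np.array([[1.0, 0.0],   # holding 0: type 0 only
                  [0.0, 1.0],   # holding 1: type 1 only
                  [0.7, 0.3]])  # holding 2: both types
h = np.array([[0.1, 0.4],
              [0.3, 0.2]])
s = [0, 2, 1]  # start holdings of the movements
e = [1, 0, 2]  # end holdings of the movements
pairs, n = gibbs_update_indicator(s, e, v_hat, h, rng)
```

In the toy data, holdings 0 and 1 each have a single type, so the first movement is always assigned the pair (0, 1); only the mixed holding 2 leaves part of the assignment random.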
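The parameter $\mathbf{v}$ is included in both $P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)$ and $P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R})$ in equation B.1, and so the conditional distribution of $\mathbf{v}$ is given by

$$\begin{aligned} P(\mathbf{v} \mid \mathbf{h}, I, \mathbf{R}, \mathbf{e}, \mathbf{s}) &\propto \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} \left( P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)\, P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R}) \right)^{I_{\alpha\beta t}} \pi(\mathbf{v}) \\ &= \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)^{I_{\alpha\beta t}} \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R})^{I_{\alpha\beta t}}\, \pi(\mathbf{v}) \end{aligned} \qquad \text{(B.6)}$$

where $P(\alpha_t, \beta_t \mid \mathbf{h}, \mathbf{v}, \mathbf{R})$ is given as in equation B.5 and $P(e_t, s_t \mid \mathbf{v}, \mathbf{R}, \alpha_t, \beta_t)$ for each $t$ is given by equation 4. For further notation we write

$$L_2 = \prod_{t=1}^{T} \prod_{\alpha=1}^{K} \prod_{\beta=1}^{K} \left( \frac{\hat{v}_{s\alpha} \hat{v}_{e\beta}}{\hat{N}_\alpha \hat{\hat{N}}_{\beta s}} \right)^{I_{\alpha\beta t}} \qquad \text{(B.7)}$$

and note that for each update $\left( (\hat{v}_{s\alpha} \hat{v}_{e\beta}) / (\hat{N}_\alpha \hat{\hat{N}}_{\beta s}) \right)^{I_{\alpha\beta t}}$ deviates from one for only one combination of $\alpha, \beta$.

For $\mathbf{h}$ or $\mathbf{v}$ we use Metropolis-Hastings updates, which involve proposing new parameter values and then accepting or rejecting the proposal. Since the updates of $\mathbf{h}$ and $\mathbf{v}$ have many similarities, we use the notation $\boldsymbol{\theta}$ (consisting of $\theta_1, \theta_2, \ldots, \theta_n$) when referring to either $\mathbf{h}$ or $\mathbf{v}$. Both $\mathbf{h}$ and $\mathbf{v}$ are defined such that $\theta_1 + \theta_2 + \cdots + \theta_n = 1$ and $0 < \theta_i < 1$ for all $i$. The elements are therefore dependent and cannot be updated separately, and the proposed values need to follow the same definition. Hence, proposals are drawn from a Dirichlet distribution. If the current position of an iteration is $\boldsymbol{\theta}^{c}$, then $\boldsymbol{\theta}^{p}$ is proposed from $q(\boldsymbol{\theta}^{p} \mid \boldsymbol{\theta}^{c}, A(\boldsymbol{\theta}^{c})) = \mathrm{Dirichlet}(\boldsymbol{\theta}^{p} \mid \mathbf{B})$ where $\mathbf{B} = \boldsymbol{\theta}^{c} A(\boldsymbol{\theta}^{c})$, which is centered at $\boldsymbol{\theta}^{c}$, and $A(\boldsymbol{\theta}^{c})$ controls the width of the proposal distribution. It is possible to use a fixed value $\bar{A}$ such that $A(\boldsymbol{\theta}^{c}) = \bar{A}$ for every current value $\boldsymbol{\theta}^{c}$, but computational problems may arise for very small $\theta_i$. Generation of Dirichlet random numbers is based on random numbers from either the Gamma or the Beta distribution (Gelman et al. 2004).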
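As a rough illustration of the proposal step, the following Python sketch draws a Dirichlet proposal by normalising Gamma variates and evaluates the log proposal density, which is needed in both directions because the proposal is not symmetric. The function names `propose_dirichlet` and `log_dirichlet_pdf`, the fixed width of 200 and the example vector are hypothetical choices for illustration and are not taken from the authors' code.

```python
from math import lgamma

import numpy as np

def propose_dirichlet(theta_cur, A, rng):
    """Propose from Dirichlet(theta_cur * A) via normalised Gamma draws."""
    shape = np.asarray(theta_cur, dtype=float) * A
    g = rng.gamma(shape, 1.0)  # independent Gamma(shape_i, 1) variates
    return g / g.sum()         # normalising gives a Dirichlet draw

def log_dirichlet_pdf(x, conc):
    """Log density of Dirichlet(conc) evaluated at x."""
    x = np.asarray(x, dtype=float)
    conc = np.asarray(conc, dtype=float)
    log_norm = lgamma(conc.sum()) - sum(lgamma(c) for c in conc)
    return log_norm + float(np.sum((conc - 1.0) * np.log(x)))

# Hypothetical current value and proposal width
rng = np.random.default_rng(1)
theta_cur = np.array([0.5, 0.3, 0.2])
A = 200.0
theta_prop = propose_dirichlet(theta_cur, A, rng)

# Proposal densities in both directions, as required by a Hastings correction
log_q_forward = log_dirichlet_pdf(theta_prop, theta_cur * A)
log_q_reverse = log_dirichlet_pdf(theta_cur, theta_prop * A)
```

Because the elements of the current vector sum to one, the mean of the proposal equals the current vector, and a larger width gives proposals that stay closer to it.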
For numerical reasons, small values of the shape parameter of either the Gamma or the Beta distribution generate values equal to exactly zero, which also gives $\theta_i = 0$, and all subsequent proposals will then have $\theta_i = 0$ for any value of $\bar{A}$. We therefore use $A(\boldsymbol{\theta}) = C_{\mathrm{lim}} / \theta_{\min}$, where $C_{\mathrm{lim}}$ is the critical value at which numerical problems occur in the generation of Dirichlet random numbers. The acceptance ratio is then given by

$$\min\left(1, \frac{P(\boldsymbol{\theta}^{p} \mid \cdot)\, \pi(\boldsymbol{\theta}^{p})\, q(\boldsymbol{\theta}^{c} \mid \boldsymbol{\theta}^{p}, A(\boldsymbol{\theta}^{p}))}{P(\boldsymbol{\theta}^{c} \mid \cdot)\, \pi(\boldsymbol{\theta}^{c})\, q(\boldsymbol{\theta}^{p} \mid \boldsymbol{\theta}^{c}, A(\boldsymbol{\theta}^{c}))}\right) \qquad \text{(B.8)}$$

where $P(\boldsymbol{\theta} \mid \cdot)$ is the likelihood term of the full conditional distribution of the parameter vector $\boldsymbol{\theta}$, $\pi(\boldsymbol{\theta})$ is the prior distribution of $\boldsymbol{\theta}$, and the superscripts $c$ and $p$ denote the current and proposed values. The acceptance ratio for $\mathbf{h}$ is given by

$$\min\left(1, \frac{L_1^{p}\, \pi(\mathbf{h}^{p})\, q(\mathbf{h}^{c} \mid \mathbf{h}^{p}, A(\mathbf{h}^{p}))}{L_1^{c}\, \pi(\mathbf{h}^{c})\, q(\mathbf{h}^{p} \mid \mathbf{h}^{c}, A(\mathbf{h}^{c}))}\right) \qquad \text{(B.9)}$$

where $L_1^{c}$ and $L_1^{p}$ are the distributions given by equation B.5 for the current and proposed values of $\mathbf{h}$, respectively. Good mixing is hard to obtain for $\mathbf{h}$ since it is based on a random walk in 63 (i.e. $K^2 - 1$) dimensions. We therefore make partial updates by rewriting $h_{\alpha\beta} = \bar{h}_\alpha \tilde{h}_{\alpha\beta}$, where $\bar{\mathbf{h}}$ is a vector with $K$ elements and $\tilde{\mathbf{h}}$ is a matrix with $K \times K$ elements, with $\sum_{\alpha=1}^{K} \bar{h}_\alpha = 1$ and $\sum_{\beta=1}^{K} \tilde{h}_{\alpha\beta} = 1$. For every iteration we perform one update on $\bar{\mathbf{h}}$ and $K$ updates on $\tilde{\mathbf{h}}$, in the latter case proposing values (from a Dirichlet distribution as described above) of $\tilde{h}_{k1}, \tilde{h}_{k2}, \ldots, \tilde{h}_{kK}$ for the $k$th partial update. The formulation of equation B.9 is however still valid, since the acceptance ratio is based on the probability density of $\mathbf{h}$, but $q$ will be different for each partial update.

The acceptance ratio for $\mathbf{v}$ is given by

$$\min\left(1, \frac{L_1^{p} L_2^{p}\, \pi(\mathbf{v}^{p})\, q(\mathbf{v}^{c} \mid \mathbf{v}^{p}, A(\mathbf{v}^{p}))}{L_1^{c} L_2^{c}\, \pi(\mathbf{v}^{c})\, q(\mathbf{v}^{p} \mid \mathbf{v}^{c}, A(\mathbf{v}^{c}))}\right) \qquad \text{(B.10)}$$

where $L_1^{c}$ and $L_1^{p}$ are the probability distributions given by equation B.5 for the current and proposed values of $\mathbf{v}$, respectively, and $L_2^{c}$ and $L_2^{p}$ are given analogously from equation B.7.

References
Bigras-Poulin, M., Thompson, R.A., Chriel, M., Mortensen, S., Greiner, M., 2006. Network analysis of Danish cattle industry trade patterns as an evaluation of risk potential for disease spread. Prev. Vet. Med. 76, 11-39.
Boklund, A., Goldbach, S.G., Uttenthal, A., Alban, L., 2008. Simulating the spread of classical swine fever virus between a hypothetical wild-boar population and domestic pig herds in Denmark. Prev. Vet. Med. 85, 187-206.
Christensen, J., McNab, B., Stryhn, H., Dohoo, I., Hurnik, D., Kellar, J., 2008. Description of empirical movement data from Canadian swine herds with an application to a disease spread simulation model. Prev. Vet. Med. 83, 170-185.
DEFRA, 2009. FMD: 2007 outbreak. http://www.defra.gov.uk/foodfarm/farmanimal/diseases/atoz/fmd/2007/index.htm
Dickey, B.F., Carpenter, T.E., Bartell, S.M., 2008. Use of heterogeneous operation-specific contact parameters changes predictions for foot-and-mouth disease outbreaks in complex simulation models. Prev. Vet. Med. 87, 272-287.
Dubé, C., Ribble, C., Kelton, D., McNab, B., 2009. A review of network analysis terminology and its application to foot-and-mouth disease modelling and policy development. Transbound. Emerg. Dis. 56, 73-85.
Févre, E.M., Bronsvoort, B.M.de C., Hamilton, K.A., Cleaveland, S., 2006. Animal movements and the spread of infectious diseases. TRENDS Microbiol. 14, 125-131.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 2004. Bayesian Data Analysis (2nd Edition). Chapman & Hall/CRC.
Keeling, M., 2005. The implications of network structure for epidemic dynamics. Theo. Pop. Bio. 67, 1-8.
Lindström, T., Sisson, S.A., Nöremark, M., Jonsson, A., Wennergren, U., 2009. Estimation of distance related probability of animal movements between holdings and implications for disease spread modeling. Prev. Vet. Med. 91, 85-94.
Mollison, D., Isham, V., Grenfell, B., 1993. Epidemics: Models and Data. J. R. Statist. Soc. B. 157, 115-149.
Moslonka-Lefebvre, M., Pautasso, M., Jeger, M., 2009. Disease spread in small-size directed networks: Epidemic threshold, correlation between links to and from nodes, and clustering. Theo. Pop. Bio. 260, 402-411.
Nöremark, M., Håkansson, N., Lindström, T., Wennergren, U., Sternberg Lewerin, S., 2009. Spatial and temporal investigations of reported movements, births and deaths of cattle and pigs in Sweden. Acta Vet. Scand. 51:37.
Ortiz-Pelaez, A., Pfeiffer, D.U., Soares-Magalhães, R.J., Guitian, F.J., 2006. Use of social network analysis to characterize the pattern of animal movements in the initial phases of the 2001 foot and mouth disease (FMD) epidemic in the UK. Prev. Vet. Med. 75, 40-55.
Pineda-Krch, M., O'Brien, J.M., Thunes, C., Carpenter, T.E., 2010. Potential impact of introduction of foot-and-mouth disease from wild pigs into commercial livestock premises in California. Am. J. Vet. Res. 71, 82-88.
Ribbens, S., Dewulf, J., Koenen, F., Mintiens, K., de Kruif, A., Maes, D., 2009. Type and frequency of contacts between Belgian pig herds. Prev. Vet. Med. 88, 57-66.
Robinson, S.E., Christley, R.M., 2007. Exploring the role of auction markets in cattle movements within Great Britain. Prev. Vet. Med. 81, 21-37.
Rweyemamu, M., Roeder, P., Mackay, D., Sumption, K., Brownlie, J., Leforban, Y., Valarcher, J.F., Knowles, N.J., Saraiva, V., 2008. Epidemiological patterns of foot-and-mouth disease worldwide. Transbound. Emerg. Dis. 55, 57-72.
Velthuis, A.G., Mourits, M.C., 2007. Effectiveness of movement-prevention regulations to reduce the spread of foot-and-mouth disease in The Netherlands. Prev. Vet. Med. 82, 262-281.
Vernon, M.C., Keeling, M.J., 2009. Representing the UK’s cattle herd as static and dynamic networks. Proc. Roy. Soc. London B. 276, 469-476.
Ward, M.P., Highfield, L.D., Vongseng, P., Graeme Garner, M., 2009. Simulation of foot-and-mouth disease spread within an integrated livestock system in Texas, USA. Prev. Vet. Med. 88, 286-297.

Table 1.
Table shows the production types and the number and percentages of holdings that have reported having them. Note that the percentages sum to more than 100% as holdings may have more than one type.

| Production type | Nr holdings | Percent of holdings |
| --- | --- | --- |
| Sow pool center | 36 | 1.2% |
| Sow pool satellite | 245 | 7.9% |
| Farrow-to-finish | 720 | 23.3% |
| Nucleus herd | 63 | 2.0% |
| Piglet producer | 1249 | 40.5% |
| Multiplying herd | 88 | 2.9% |
| Fattening herd | 1147 | 37.2% |
| Information missing | 233 | 7.6% |

Table 2.
Posterior mean estimated values (underlined) of commonness indices, $\mathbf{h}$, for movements between production types given such that the estimate for movements from type $\alpha$ to type $\beta$, $h_{\alpha\beta}$, is found in row $\alpha$, column $\beta$. Estimates are given as $h_{\alpha\beta} \times 10^3$, and 95% credibility intervals are given in brackets. A high value of $h_{\alpha\beta}$ means that movements from type $\alpha$ to type $\beta$ are estimated to be common relative to a homogeneous contact pattern. Estimated mean values larger than average commonness index (i.e. 1/64 = 0.016) are shown in bold.

| FROM \ TO | Sow pool center | Sow pool satellite | Farrow-to-finish | Nucleus herd | Piglet producer | Multiplying herd | Fattening herd | Missing information |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sow pool center | 77 (63,94) | 120 (110,140) | 0.79 (0.41,1.3) | 0.59 (0.014,2.2) | 4.0 (3.3,4.7) | 6.1 (3.3,9.9) | 10 (9.2,12) | 13 (11,16) |
| Sow pool satellite | 120 (110,130) | 1.6 (1.1,2.1) | 0.033 (0.001,0.095) | 0.11 (0.003,0.43) | 0.015 (0.002,0.038) | 0.11 (0.002,0.34) | 9.3 (8.6,10) | 0.51 (0.24,0.84) |
| Farrow-to-finish | 2.4 (1.7,3.1) | 0.047 (0.002,0.13) | 0.35 (0.28,0.43) | 0.037 (0.001,0.14) | 0.12 (0.087,0.15) | 0.42 (0.22,0.67) | 1.8 (1.6,2.0) | 2.3 (2.0,2.5) |
| Nucleus herd | 69 (56,82) | 0.51 (0.11,1.2) | 12 (10,13) | 130 (120,150) | 12 (11,13) | 120 (100,130) | 3.6 (3.0,4.3) | 16 (13,18) |
| Piglet producer | 5.3 (4.5,6.2) | 0.5 (0.35,0.65) | 0.45 (0.39,0.52) | 0.13 (0.031,0.26) | 0.29 (0.25,0.33) | 0.097 (0.008,0.22) | 9.5 (9.0,10) | 3.2 (2.9,3.5) |
| Multiplying herd | 150 (140,170) | 1.3 (0.58,2.3) | 20 (18,22) | 0.73 (0.079,2.1) | 25 (23,26) | 15 (11,20) | 11 (10,13) | 8.8 (7.2,11) |
| Fattening herd | 0.95 (0.61,1.3) | 0.019 (0.001,0.049) | 0.015 (0.005,0.030) | 0.076 (0.015,0.17) | 0.019 (0.010,0.031) | 0.39 (0.24,0.58) | 0.18 (0.15,0.22) | 0.17 (0.11,0.24) |
| Missing information | 2.1 (1.3,3.2) | 0.11 (0.018,0.22) | 0.16 (0.095,0.24) | 0.53 (0.19,1.0) | 0.066 (0.034,0.10) | 0.14 (0.007,0.38) | 0.97 (0.82,1.1) | 0.84 (0.60,1.1) |

Table 3.
Posterior mean values (underlined) of elements in parameter vector $\mathbf{v}$ used to model how much production types will determine the contacts of holdings with more than one type. 95% credibility intervals are shown in brackets.

| Production type | Posterior mean (95% credibility interval) |
| --- | --- |
| Sow pool center | 0.83 (0.68,0.93) |
| Sow pool satellite | 0.037 (0.016,0.064) |
| Farrow-to-finish | 0.012 (0.0045,0.022) |
| Nucleus herd | 0.047 (0.018,0.095) |
| Piglet producer | 0.019 (0.0074,0.036) |
| Multiplying herd | 0.052 (0.020,0.10) |
| Fattening herd | 0.0023 (0.00086,0.0044) |

Figure captions
Figure 1. Box plots with number of infected holdings after simulation of disease transmission via pig transports for 7 days (top), 28 days (middle) and 365 days (bottom). Each box shows the distribution of simulations initialized with an index holding of a different production type.