Manuscript 3 nov - IFM

Title:
Bayesian analysis of animal movements related to factors at herd and between herd levels: Implications for disease spread modeling.

Authors: Tom Lindström (a), Scott A. Sisson (b), Susanna Stenberg Lewerin (c) and Uno Wennergren (a)
(a) IFM Theory and Modelling, Linköping University, 581 83 Linköping, Sweden
(b) School of Mathematics and Statistics, University of New South Wales, Sydney 2052, Australia
(c) Department of Disease Control and Epidemiology, SVA, National Veterinary Institute, 751 89 Uppsala, Sweden

Corresponding Author: Uno Wennergren
Tel: +46 13 28 16 66
Fax: +46 13 28 13 99
Email: unwen@ifm.liu.se
Correspondence address: See above.

Key words
Markov Chain Monte Carlo; Hierarchical Bayesian; Mixture models; Indicator variable; Animal databases; Animal movements; Contact structure

Abstract
A method to assess the influence of between-herd distances, production types and herd sizes on patterns of between-herd contacts is presented. It was applied to pig movement data from a central database of the Swedish Board of Agriculture. To determine the influence of these factors on the contact between holdings we used a Bayesian model and Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters. The analysis showed that the contact pattern via animal movements is highly heterogeneous and influenced by all three factors: production type, herd size, and distance between holdings. Most production types showed a positive relationship between maximum capacity and the probability of both incoming and outgoing movements. In agreement with previous studies, holdings also differed both in the number of contacts and in the holding types with which contact occurred. Also, the scale and shape of distance dependence in contact probability were shown to differ depending on the production types of holdings.
To demonstrate how the methodology may be used for risk assessment, disease transmissions via animal movements were simulated with the model used for analysis of contacts, parameterized by the analyzed posterior distribution. A Generalized Linear Model showed that herds with production types Sow pool center, Multiplying herd and Nucleus herd have a higher risk of generating a large number of new infections. Multiplying herds are also expected to generate many long distance transmissions, while transmissions generated by Sow pool centers are confined to more local areas. We argue that the methodology presented may be a useful tool for improvement of risk assessment based on data found in central databases.

1. Introduction
In order to understand the process of disease spread between animal holdings, researchers are increasingly studying the contact patterns that contribute to the transmission (e.g. Velthuis and Mourits 2007, Dubé et al. 2009, Nöremark et al. 2009a, Vernon and Keeling 2009, Lindström et al. 2010). Analysis of the contact pattern allows for making predictions about disease spread (Kao et al. 2007) as well as testing the effects of changes in the contact structure (Velthuis and Mourits 2007). Different diseases may be transmitted via different paths, but generally between-holding movements of animals may be regarded as the main risk factor for transmission of livestock diseases (Févre et al. 2006, Ortiz-Pelaez et al. 2006, Rweyemamu et al. 2008).
In this paper we present a methodology to transform data on between-holding contacts into probability distributions useful for predictions and direct analysis. We analyze pig movements in Sweden and show how it is possible to assess the influence of several factors on the contact pattern, specifically production type, number of animals and distances between holdings.
For the latter, it has been shown that contacts between holdings are more common at short distances (Boender et al. 2007, Robinson and Christley 2007, Lindström et al. 2009, Ribbens et al. 2009). Epidemiological studies often describe the probability of transmission dependent on distance with a spatial kernel (Keeling et al. 2001, Tildesley et al. 2008), and Lindström et al. (2009) showed that a good description of contact probabilities at both short and long distances is important. As in Lindström et al. (2009) we characterize the spatial kernel by its two dimensional variance (quantifying the scale) and kurtosis (quantifying the shape). A high value of the kernel variance means that contacts occur more frequently at longer distances. A high value of kurtosis indicates that contacts are frequent at short distances, but concurrently long distance contacts, represented by a fat tail of the kernel, are also common. From ecological research it is known that leptokurtic kernels (i.e. kernels with higher kurtosis than a normal distribution) may be the result of heterogeneity in the dispersal processes (see Hawkes 2009 and references therein). Hence, kurtosis may be interpreted as a measure of the heterogeneity of the distance related contact probability. Figure 1 shows spatial kernels with different combinations of 2D-variance and kurtosis. More frequent contacts at shorter distances result in spatially clustered contact patterns (Keeling 1999) which lead to depletion of local susceptibles and a rapid decline in the reproductive ratio (Keeling 2005). Hence, such dynamics are expected from disease transmission where contacts are estimated to be described by a kernel with a small variance. A large value of kurtosis also results in such patterns, but since long distance contacts are also common, the number of contacts needed for connecting otherwise distant holdings is reduced.
From theory of small-world networks (Watts and Strogatz 1998) it is known that such contacts may have substantial impact on the dynamics of an outbreak.
Contact heterogeneity due to production types has been reported and shown to be important for the dynamics of outbreaks (Dickey et al. 2008, Ribbens et al. 2009, Lindström et al. 2010). The production type of a holding may be expected to influence both the number of animal movements and which other holdings animals are moved from/to. For the latter case, clustering patterns may be expected, similar to contact heterogeneities found due to spatial clustering. However, unlike the spatial factor, directional differences may be expected with animal movements being more common from production type A to type B than from B to A. Holdings of some production types also have many contacts (i.e. trade many animals) which means that if infected, these holdings may cause a large number of secondary infections and thereby function as super spreaders (Matthews and Woolhouse 2005).
The number of animals on a holding is also expected to influence the contact pattern. Typically, larger holdings are expected to trade more animals and hence have more contacts (Ribbens et al. 2009), making such holdings candidates as super spreaders. However, differences may be expected between production types. For example, the frequency of live animal movements to and from holdings with farrow-to-finish production might not be strongly dependent on the number of animals kept on the premises, since this production type includes both piglet producing units and fattening units on the same holding.
Commonly, between-holdings contacts are studied with network analysis (Bigras-Poulin et al. 2008, Brennan et al. 2008, Nöremark et al. In press).
Such studies provide good quantitative measures of the observed structure and may also provide an understanding of the contact patterns and the dynamics of disease transmission through these contacts (Keeling 2005, Kao et al. 2007). It may however be difficult to parameterize a model from network measures. Simulation models in a network context are usually confined to resampling observed contacts (e.g. Vernon and Keeling 2009). While the method presented in this paper addresses many of the same questions as network studies, it utilizes (hierarchical) Bayesian models and builds on methodology presented in Lindström et al. (2009) and Lindström et al. (2010). In Lindström et al. (2009) a method was presented for estimation of distance-related probability of contacts, but it was there assumed that all holdings are identical. Lindström et al. (2010) introduced a method that analyzed the contact patterns based on production types, but other factors were excluded. In this paper we present a model that describes contact via live animal movements between holdings where the probability of contacts depends on production types, the number of animals at each holding and distance between holdings. We estimate the posterior distribution of model parameters with Markov chain Monte Carlo (MCMC) methods and utilize data found in central databases of animal movements. EU members and some other states (e.g. Australia and New Zealand) are required to keep databases on all livestock holdings and register all movements of pigs and cattle, which means that such data may be available for analysis. The level of detail included in the databases does however vary between countries. While data quality is a problem (Nöremark et al. 2009a, Lindström et al. 2010), analysis of such data allows for investigation of large scale trends in the contact patterns. One should however be aware of its limitations when drawing conclusions from the analyses.
The aim of this paper, and the method presented, is to use a probabilistic model to investigate how the contact pattern is influenced by distance between holdings, herd sizes and production types. Our aim is also to investigate how the dependence on distance and herd size differs between holdings of different production types. We also aim to show how the analyzed contact pattern can be used in risk assessment.

2. Material and method
2.1 Data
The data used were supplied by the Swedish Board of Agriculture. Due to legal requirements, the analysis was performed on encoded data such that the ID number of specific holdings or names of farmers could not be retrieved. This prevented the tracing of potentially unexpected contacts. Holdings that were considered to be inactive were removed, as well as holdings that did not have spatial coordinates (see Nöremark et al. 2009a for more details on this). A total of 3084 holdings and 20231 movements (carried out from July 2005 until June 2006) were included in the analysis. Movements to slaughterhouses were not included in the analysis.
Data included the maximum capacity (i.e. the reported maximum number of animals that could be kept on the premises) of each holding, recorded separately for sows and fattening pigs. If the maximum capacity of a demographic group was missing in the database, it was assumed that the maximum capacity was zero. Such entries were mostly found for holdings with production types that are expected not to have animals of that demographic group and we found that maximum capacity equal to zero was rarely entered in the database. In addition, previous studies (Nöremark et al. 2009b) have shown a better consistency between larger holdings and the entries in the database, indicating that while 0 may not be accurate in some instances, a low rather than a high number is to be expected.
Seven production types were included in the study: Sow pool centers, Sow pool satellite, Farrow-to-finish, Nucleus herd, Piglet producer, Multiplying herd and Fattening herd. The form used by farmers for reporting also has an option for free text. Holdings that only had this information entered were placed in a group denoted “Missing information”. Note that when we use this term, we only refer to missing information about the production type and that farmers may still have reported location and herd sizes. For more details on the included production types, the pig farming structure of Sweden and how the data is entered in the database, see Nöremark et al. (2009a) and Lindström et al. (2010).

2.2 Model and parameter estimation
Data include holding production types, and this information is included in the model by a matrix 𝑹 of size 𝑛 × 𝐾, where 𝑛 is the number of holdings and 𝐾 is the number of production types (including the artificial type Missing information). We denote 𝑅𝑓𝑘 = 1 if holding 𝑓 has production type 𝑘 and 𝑅𝑓𝑘 = 0 otherwise, for 𝑘 = 1, … , 𝐾 and 𝑓 = 1, … , 𝑛. Data also include spatial coordinates of holdings and these are translated into a distance matrix, 𝑫 of dimensions 𝑛 × 𝑛, where 𝐷𝑓𝑔 is the Euclidean distance between holdings 𝑓 and 𝑔. Herd sizes of pig holdings are measured by the maximum capacity of fattening pigs and sows, and these are denoted 𝑺𝟏 and 𝑺𝟐 (both vectors with 𝑛 elements), respectively. When we refer to either of these demographic classes we write 𝑺, and 𝑆𝑢𝑓 refers to size 𝑢 (𝑢 = 1,2) of holding 𝑓. We use notation such that each movement, 𝑡 (𝑡 = 1,2, … , 𝑇, where 𝑇 is the number of movements), has a start holding 𝑠𝑡 and destination holding 𝑑𝑡, and vectors 𝒔 and 𝒅 (both with 𝑇 elements) refer to all start and destination holdings.
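
As an illustration of the data structures just defined, the following minimal Python sketch builds an indicator matrix 𝑹 and a Euclidean distance matrix 𝑫 from a toy data set. The function names, the three example holdings, their reported types and their coordinates are all hypothetical and chosen only for illustration; the sketch is not the implementation used in the study.

```python
import math

def production_type_matrix(reported_types, type_names):
    """Build R as n rows of K zeros/ones: R[f][k] = 1 if holding f reports type k."""
    return [[1 if name in types else 0 for name in type_names]
            for types in reported_types]

def distance_matrix(coords):
    """Build D as an n x n list of Euclidean distances between holdings."""
    return [[math.hypot(xf - xg, yf - yg) for (xg, yg) in coords]
            for (xf, yf) in coords]

# Hypothetical toy data: three holdings, three production types.
type_names = ["Piglet producer", "Fattening herd", "Missing information"]
reported_types = [
    {"Piglet producer"},
    {"Piglet producer", "Fattening herd"},  # holding reporting two types
    {"Missing information"},
]
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]  # coordinates in arbitrary units

R = production_type_matrix(reported_types, type_names)
D = distance_matrix(coords)
```

Storing 𝑫 as a full matrix is convenient for a toy example; for several thousand holdings one might instead compute distances on demand.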

2.2.1 Weight on production types
We want to estimate how contact probabilities depend on the factors herd size, production type and distance between holdings. As in Lindström et al. (2010), we assume that holdings with more than one type will behave as some mixture of each type, and rather than assuming that a holding will behave as an equally weighted mixture of each type, we estimate how much different production types will determine the behavior of a holding. This is estimated with a parameter vector 𝒗 with 𝐾 − 1 elements (see below for explanation of this) where ∑𝑘 𝑣𝑘 = 1. A high value of 𝑣𝑘 indicates that production type 𝑘 has a large influence on the contact pattern of a holding that has reported this type concurrently with other types.
A holding 𝑓 is assumed to consist of a proportion 𝑣̂𝑓𝑘 of each production type 𝑘, which is determined by 𝒗 and 𝑹 through

$$\hat{v}_{fk} = \begin{cases} \dfrac{R_{fk} v_k}{\sum_{l \neq m} R_{fl} v_l} & \text{for } k \neq m, \text{ if } R_{fm} = 0 \\ 0 & \text{for } k = m, \text{ if } R_{fm} = 0 \\ 0 & \text{for } k \neq m, \text{ if } R_{fm} = 1 \\ 1 & \text{for } k = m, \text{ if } R_{fm} = 1 \end{cases} \tag{1}$$

where 𝑅𝑓𝑘 = 1 if holding 𝑓 has type 𝑘 and 𝑅𝑓𝑘 = 0 otherwise. Production type 𝑚 is an artificial type introduced for holdings with missing information. This is never shared with any other type (as it then would not be missing) and is therefore excluded from 𝒗. Hence, 𝒗 has 𝐾 − 1 rather than 𝐾 elements. It is possible to formulate a model where the missing production types of holdings are estimated by writing a joint distribution of parameter estimates and unobserved production types. However, as the farmers of these holdings have chosen not to report any of the seven included production types, we may not assume that these holdings in fact are of any of these types. We rather interpret this group as mainly containing holdings that for different reasons do not fit into any of the listed types. We therefore include them in the analysis as an artificial type and expect the holdings included in the group to be heterogeneous.
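
The mapping from 𝒗 and 𝑹 to the holding-level proportions in equation 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, 𝒗 is stored with a placeholder entry at the index of the artificial type 𝑚 (an assumption made to keep indices aligned), and the example weights are invented. It also assumes that every holding reports at least one type.

```python
def type_proportions(R_f, v, m):
    """Proportions v_hat[k] for one holding, following equation 1.

    R_f : list of 0/1 indicators for the K production types of the holding
    v   : list of K weights; the entry at index m is a placeholder and ignored
    m   : index of the artificial type "Missing information"
    """
    K = len(R_f)
    if R_f[m] == 1:
        # Missing information is never shared with another type.
        return [1.0 if k == m else 0.0 for k in range(K)]
    # Normalise the weights of the types actually reported by the holding.
    total = sum(R_f[l] * v[l] for l in range(K) if l != m)
    return [0.0 if k == m else R_f[k] * v[k] / total for k in range(K)]

# Hypothetical weights for three real types plus the artificial type (index 3).
v = [0.5, 0.3, 0.2, 0.0]
m = 3
mixed = type_proportions([1, 0, 1, 0], v, m)    # holding reporting types 0 and 2
missing = type_proportions([0, 0, 0, 1], v, m)  # holding with missing information
```

In the example, the holding reporting types 0 and 2 is assigned proportions 0.5/0.7 and 0.2/0.7, so the type with the larger weight dominates its contact pattern.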

2.2.2 Dependence on production types
Also, as in Lindström et al. (2010), dependence on production type was modeled with a parameter matrix 𝒉, of dimensions 𝐾 × 𝐾 with ∑𝐼𝐽 ℎ𝐼𝐽 = 1, where a high (low) value of ℎ𝐼𝐽 indicates that movements from production types 𝐼 to 𝐽 are common (rare). The estimation of 𝒉 takes into account that some production types are more common than others and the elements are referred to as commonness indices. We expand the analysis of Lindström et al. (2010) regarding production types and also give estimates of the absolute number of movements between production types and refer to this as 𝑸, of dimensions 𝐾 × 𝐾, where 𝑄𝐼𝐽 is the estimated number of movements from type 𝐼 to 𝐽, taking into account that herds often are reported to have more than one type. Some production types are reported much more frequently than others and hence the estimates of 𝑸 and 𝒉 provide different and complementary insight into the contact pattern and the role of different holdings in a potential disease outbreak.

2.2.3 Herd size dependence
Dependence on sizes 𝑺𝟏 and 𝑺𝟐 was modeled as a power function with parameters 𝒄̇ and 𝒄̈ , both of dimension 2 × 𝐾, where 𝑐̇𝑢𝑘 and 𝑐̈𝑢𝑘 (𝑢 = 1,2, corresponding to sizes 𝑺𝟏 and 𝑺𝟐 , respectively, and 𝑘 = 1, … , 𝐾) are the size dependence of type 𝑘 for sending and receiving contacts, respectively. In the following, we use notation 𝒄 where we wish to refer to either 𝒄̇ or 𝒄̈ . If 𝑐𝑢𝐼 = 0, there is no size dependence for size 𝑢, and 𝑐𝑢𝐼 < 0 (𝑐𝑢𝐼 > 0) indicates a negative (positive) relationship between size and contact probability. For 𝑐𝑢𝐼 = 1, there is approximately a linear relationship such that e.g. twice as many animals results in a twice as high probability of contacts.
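
The effect of the exponents can be made concrete with a small numerical sketch. It uses the power function with the +1 offset that the paper gives as equation 6; the function name and the capacities are hypothetical. The weight is relative: in the model it is normalised over all holdings before it becomes a probability.

```python
def size_weight(s1, s2, c1, c2):
    """Relative contact weight (S1 + 1)^c1 * (S2 + 1)^c2 for one holding.

    s1, s2 : maximum capacity of fattening pigs and sows
    c1, c2 : size dependence parameters for the two demographic groups
    """
    return (s1 + 1) ** c1 * (s2 + 1) ** c2

# c = 0: no size dependence, every holding gets the same weight.
flat = size_weight(500, 40, 0.0, 0.0)

# c = 1: roughly linear, so doubling capacity roughly doubles the weight.
ratio_linear = size_weight(200, 0, 1.0, 0.0) / size_weight(100, 0, 1.0, 0.0)

# c < 0: larger holdings receive a lower weight.
ratio_negative = size_weight(200, 0, -0.5, 0.0) / size_weight(100, 0, -0.5, 0.0)
```

With capacities of 100 and 200 fattening pigs and an exponent of 1, the ratio is 201/101, which is close to but not exactly 2; this is why the relationship is described as approximately linear.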

2.2.4 Distance dependence
Distance dependence is modeled with a spatial kernel, which may be characterized by its variance (𝑉) and kurtosis (𝜅), measuring the scale and shape, respectively (Lindström et al. 2008, 2009). We assume a rotationally symmetric distribution and define the 2-D variance (in the following, this is what we refer to by variance) as the second moment around zero (i.e. raw moment) of the radial distance. Kurtosis is analogously defined by the fourth raw moment divided by the square of the second raw moment, following the suggestion of Clark et al. (1999). Contact probabilities dependent on distance between holdings may differ depending on production types and we therefore estimate 𝑉𝐼𝐽 and 𝜅𝐼𝐽 (indicating elements of matrices 𝑽 and 𝜿, respectively, of dimensions 𝐾 × 𝐾) for every combination of 𝐼, 𝐽.
However, the underlying processes (e.g. economic and social) are not completely different. For such systems it is suitable to use a hierarchical model (Gelman et al. 2004) where the parameters, in this case the elements of 𝑽 and 𝜿, have a hierarchical prior with a set of unknown hyper-parameters. This approach has the benefit that it improves the estimation of parameters where the data is weak, a concept known as “borrowing strength”. If it may be argued a priori that the parameters are not completely unrelated, then the estimation of one parameter may be informed by the estimation of other parameters. If there is in fact little similarity between the parameters, the hierarchical prior will have little influence on the estimates of 𝑽 and 𝜿.
In Lindström et al. (2009) it was shown that the data were better described when movements were modeled as arising from a mixture of distance dependent and mass action mixing (MAM) processes. In that study all other factors were excluded and a single kernel was used to describe all contacts.
To simplify the model and reduce the number of parameters in this study we exclude the MAM part. As we here include other factors and let 𝑉𝐼𝐽 and 𝜅𝐼𝐽 be different for different production types 𝐼, 𝐽 we assume that the model can account for factors that could not be accounted for with a single spatial kernel. To test these assumptions we visually compare the predicted and observed movement distances (see below).

2.2.5 Contact probability model
We used a model formulation

$$P(d_t, s_t \mid \boldsymbol{\theta}) = \sum_I \sum_J P(d_t \mid s_t, I, J, \boldsymbol{\theta}_1)\, P(s_t \mid I, \boldsymbol{\theta}_2)\, P(I, J \mid \boldsymbol{\theta}_3), \tag{2}$$

where 𝜽𝟏 , 𝜽𝟐 and 𝜽𝟑 are subsets of 𝜽 and refer to particular sets of parameters, yet to be defined. To clarify, we use the notation with 𝜽 here to give a more transparent description of the general outline of the model. Equation 2 should be interpreted such that for movement 𝑡, the probability of destination holding 𝑑𝑡 is conditional on the start holding 𝑠𝑡 and the production types of 𝑠 and 𝑑, denoted 𝐼 and 𝐽 respectively. Start holding 𝑠𝑡 is conditional on production type 𝐼 of the start holding. The joint distribution 𝑃(𝑑𝑡 , 𝑠𝑡 | … ) is a mixture distribution and (since holdings may have more than one type) the probability is summed over all types with ∑𝐼 ∑𝐽 𝑃(𝐼, 𝐽|𝜽𝟑 ) = 1. We use a probability function 𝑃(𝐼, 𝐽|𝜽𝟑 ) = 𝑃(𝐼, 𝐽|𝒗, 𝒉, 𝑹) as introduced in Lindström et al. (2010)

$$P(I, J \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}) = \frac{h_{IJ}\, \hat{N}_I\, \hat{\dot{N}}_{JI}}{\sum_I \sum_J h_{IJ}\, \hat{N}_I\, \hat{\dot{N}}_{JI}}, \tag{3}$$

where conditionally on 𝒗 and 𝑹 we define

$$\hat{N}_I = \sum_f \hat{v}_{fI}, \qquad \hat{\dot{N}}_{JI} = \sum_f \hat{v}_{fJ} \left(1 - \frac{\hat{v}_{fI}}{\hat{N}_I}\right), \tag{4}$$

where 𝑣̂𝑓𝑘 is given by equation 1. The quantities 𝑁̂𝐼 and 𝑁̂̇𝐽𝐼 may be interpreted as measurements of the amount of each production type, taking into account that holdings may not be of only one type (if more than one type is reported). The quantity 𝑁̂̇𝐽𝐼 is adjusted to account for exclusion of movements ending up at the same destination as the start holding.
The distribution of 𝑠𝑡 conditional on type 𝐼 (i.e. 𝑃(𝑠𝑡 |𝐼, 𝜽𝟐 ) of equation 2) is modeled as

$$P(s_t \mid \boldsymbol{v}, \boldsymbol{R}, I, \boldsymbol{S}) = \frac{\hat{v}_{sI}\, G(S_{1s}, S_{2s}, \dot{c}_{1I}, \dot{c}_{2I})}{\sum_f \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I})}, \tag{5}$$

where 𝐺(𝑆1𝑓 , 𝑆2𝑓 , 𝑐̇1𝐼 , 𝑐̇2𝐼 ) is a function describing dependence on sizes 𝑆1𝑓 , 𝑆2𝑓 of holding 𝑓, and 𝑐̇1𝐼 , 𝑐̇2𝐼 are the parameters determining the size dependence of sizes 𝑆1𝑓 , 𝑆2𝑓 , respectively, for production type 𝐼. As in the FMD model presented in Tildesley et al. (2008), we assume that contact probability dependence on herd size may be modeled as a power function and we write

$$G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I}) = (S_{1f} + 1)^{\dot{c}_{1I}}\, (S_{2f} + 1)^{\dot{c}_{2I}}. \tag{6}$$

We use 𝑆1𝑓 + 1 rather than just 𝑆1𝑓 to avoid 𝑃(𝑠|𝒗, 𝑹, 𝐼, 𝑺) = 0 if 𝑆𝑢𝑓 = 0 for any 𝑢 = 1,2 (e.g. Fattening herds are expected to have no sows).
The probability distribution of 𝑑 conditional on types 𝐼, 𝐽 and start holding 𝑠𝑡 (i.e. 𝑃(𝑑𝑡 |𝑠𝑡 , 𝐼, 𝐽, 𝜽𝟏 ) of equation 2) is dependent on both sizes, 𝑺, and distances, 𝑫, and given by

$$P(d_t \mid \boldsymbol{v}, \boldsymbol{R}, s_t, I, J, \boldsymbol{S}, \boldsymbol{D}, \ddot{\boldsymbol{c}}_J, \kappa_{IJ}, V_{IJ}) = \frac{\hat{v}_{dJ}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_f \hat{v}_{fJ}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})}, \tag{7}$$

for 𝑓 ≠ 𝑠. The destination of a movement may not be the same holding as the start, so 𝑃(𝑑𝑡 = 𝑠𝑡 |𝒗, 𝑹, 𝑠𝑡 , 𝐼, 𝐽, 𝑺, 𝑫, 𝒄̈ 𝑱 , 𝜅𝐼𝐽 , 𝑉𝐼𝐽 ) = 0. Recall that 𝑐̈ is the equivalent of 𝑐̇ but used for modeling of contact probability of incoming movements. The function 𝐹 is used for modeling of dependence on between-herd distance. As in Lindström et al. (2009), a generalized normal distribution is used. We write

$$F(D_{fg}, \kappa_{IJ}, V_{IJ}) = e^{-\left(\frac{D_{fg}}{a}\right)^{b}}, \tag{8}$$

where the relationships between 𝑎, 𝑏 and 𝑉, 𝜅 are given for two dimensional kernels in Lindström et al. (2008) as

$$V = a^2\, \frac{\Gamma\!\left(\frac{4}{b}\right)}{\Gamma\!\left(\frac{2}{b}\right)}, \qquad \kappa = \frac{\Gamma\!\left(\frac{6}{b}\right) \Gamma\!\left(\frac{2}{b}\right)}{\left(\Gamma\!\left(\frac{4}{b}\right)\right)^{2}}. \tag{9}$$

For continuous functions, equation 8 is normalized by 2𝜋𝑎²𝛤(2⁄𝑏)⁄𝑏. This cancels out in equation 7 and normalization is instead performed by summation of the functions over all possible destination holdings (see equation 7).
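
To make the kernel parameterization concrete, the sketch below evaluates the relationships of equation 9 and the discrete normalization of equation 7 in Python. It is an illustration under stated assumptions rather than the code used in the study: the function names, the toy distances and the toy weights are hypothetical, and the inputs `v_hat_J` and `G_J` stand for the per-holding quantities 𝑣̂𝑓𝐽 and 𝐺 evaluated for a fixed destination type 𝐽.

```python
import math

def kernel_moments(a, b):
    """2-D variance V and kurtosis kappa of the kernel exp(-(d/a)^b), equation 9."""
    g2, g4, g6 = math.gamma(2 / b), math.gamma(4 / b), math.gamma(6 / b)
    V = a * a * g4 / g2
    kappa = g6 * g2 / (g4 * g4)
    return V, kappa

def kernel(d, a, b):
    """Unnormalised distance kernel F of equation 8."""
    return math.exp(-((d / a) ** b))

def destination_probs(s, v_hat_J, G_J, D, a, b):
    """Destination probabilities of equation 7 for start holding s.

    The start holding gets probability zero, and the remaining weights are
    normalised by summing over all other holdings.
    """
    n = len(D)
    w = [0.0 if f == s else v_hat_J[f] * G_J[f] * kernel(D[s][f], a, b)
         for f in range(n)]
    total = sum(w)
    return [x / total for x in w]

# b = 2 gives a bivariate normal kernel: V = a^2 and kappa = 2.
V_norm, kappa_norm = kernel_moments(3.0, 2.0)
# b = 1 gives a fatter tail, hence a larger kurtosis.
V_exp, kappa_exp = kernel_moments(3.0, 1.0)

# Hypothetical toy system of three holdings on a line.
D = [[0.0, 1.0, 5.0], [1.0, 0.0, 4.0], [5.0, 4.0, 0.0]]
p = destination_probs(0, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], D, a=3.0, b=2.0)
```

The case 𝑏 = 2 is a useful check on equation 9, since the two dimensional normal kernel has kurtosis 2 under the raw moment definition used here; values of 𝑏 below 2 give leptokurtic kernels.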
When incorporated in this manner, 𝑉 is the parameter that will have the highest influence on the disease spread dynamic (Lindström et al. In press) but 𝜅 also provides important information. Also, it is difficult to a priori determine 𝜅 and erroneous assumptions may result in erroneous estimations of 𝑉.
Writing the full formulation of the joint probability distribution of 𝑑𝑡 and 𝑠𝑡 (equation 2) we get

$$P(d_t, s_t \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}, \boldsymbol{S}, \boldsymbol{D}, \boldsymbol{c}, \boldsymbol{\kappa}, \boldsymbol{V}) = \sum_I \sum_J P(d_t \mid \boldsymbol{v}, \boldsymbol{R}, s_t, I, J, \boldsymbol{S}, \boldsymbol{D}, \ddot{\boldsymbol{c}}_J, \kappa_{IJ}, V_{IJ})\, P(s_t \mid \boldsymbol{v}, \boldsymbol{R}, I, \boldsymbol{S}, \dot{\boldsymbol{c}}_I)\, P(I, J \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}). \tag{10}$$

To improve estimation of parameters 𝜿 and 𝑽 we implement hierarchical priors as described in Appendix A. In particular, this allows for improved estimation of parameters where the data is weak, i.e. where few movements are recorded between the production types. Estimation of parameters is performed with MCMC, and an indicator variable is introduced to aid computation (see Appendix B). This indicator variable is also used to calculate the posterior distribution of the number of transports between production types, denoted 𝑸, of dimensions 𝐾 × 𝐾, where 𝑄𝐼𝐽 refers to the estimated number of movements from production type 𝐼 to 𝐽. Note that 𝑸 is introduced because many holdings have several production types, and therefore the exact numbers of movements between holdings of different production types are unobserved.

2.2.6 Comparing observed and predicted movement distances
As we altered the model of Lindström et al. (2009) and removed the MAM part of the spatial kernel, we compared the observed movement distances with the predictions under the model presented above. Note that while this is a simplification of the kernel function, we are adding complexity by using a different kernel for every combination of production types of the sending and receiving holding. The predicted distances were obtained by generating 𝑁 animal movements with the model described in section 2.2.1.
Two thousand replicates were generated, each parameterized by a random draw from the posterior distribution (based on the MCMC output), and the mean cumulative distribution was calculated and compared to the cumulative distribution of observed distances.

2.3 Simulation
To demonstrate how the analyzed contact pattern may be used for risk assessment we performed a simplistic simulation with holdings in the analyzed database as infective units. The aim was to study the effect of the observed contact pattern and not provide estimates for any specific disease. All other contacts between holdings were excluded as well as intra-herd dynamics and recovery. Hence, an infected holding remained infectious for the entire time of the simulation. Simulations were initiated with one randomly infected holding and we simulated posterior predictive movements with probabilities given by equation 10 and parameterized the model with random draws from the posterior distribution. If a movement was simulated from an infected holding A to a susceptible holding B, the latter was assumed to become instantaneously infected and any subsequent movements from B to a susceptible holding C were assumed to result in transmission. Movements between two already infected holdings were not assumed to generate a new infection.
We ran the simulation 1000 times and simulated 𝑇̌ = 40462 movements for every replicate. For each replicate and infected holding we recorded the number of first, second and third degree infections. These are defined such that if holding A infects B, this is a first degree infection of A. If B subsequently infects C, this is a second degree infection of A and if C later infects D, this is a third degree infection of A.
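
The transmission rules and the definition of infection degrees can be sketched in a few lines of Python. This is a simplified illustration only: the function names and the movement list are hypothetical, the movements are taken as given rather than drawn from equation 10, and the restriction to a fixed window of movements after infection is not included.

```python
def simulate_transmission(movements, index_case):
    """Apply the transmission rules to an ordered list of (start, destination) pairs.

    Returns a dict mapping each infected holding to the holding that infected it
    (None for the index case). A movement transmits infection only if the start
    holding is already infected and the destination is still susceptible.
    """
    infected_by = {index_case: None}
    for s, d in movements:
        if s in infected_by and d not in infected_by:
            infected_by[d] = s
    return infected_by

def degree_counts(infected_by, holding, max_degree=3):
    """Count first, second, ... degree infections attributable to one holding."""
    counts = [0] * max_degree
    for h in infected_by:
        degree, cur = 0, h
        # Walk back along the chain of infectors until reaching the holding.
        while cur is not None and cur != holding:
            cur = infected_by[cur]
            degree += 1
        if cur == holding and 1 <= degree <= max_degree:
            counts[degree - 1] += 1
    return counts

# Hypothetical ordered movements between holdings A to E; A is the index case.
movements = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("E", "A"), ("D", "E")]
infected_by = simulate_transmission(movements, "A")
counts_A = degree_counts(infected_by, "A")
counts_C = degree_counts(infected_by, "C")
```

In this toy sequence the movement from A to C does not count as a new infection because C is already infected, and the movement from E to A has no effect because E is still susceptible at that point.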
The number of infections caused by every infected holding was recorded after 1556 movements, which assuming a constant rate of movements corresponds to a period of four weeks (as 20231 movements were analyzed for the one year period). The analysis presented in Nöremark et al. (2009a) suggests that there is in fact little seasonal variation in the number of pig movements. Long distance transmissions are often of particular interest and therefore we also recorded the number of first degree infections caused by movements longer than 10, 100 and 500 km.
We analyzed the results using Generalized Linear Models (GLM). Since the response variable was measured as the number of infections, which are natural numbers (0, 1, 2…), we used a Poisson error distribution with a log link function. The response variable 𝒀 was a vector of 𝑊 elements where 𝑊 = ∑𝑟 𝑤𝑟 and 𝑤𝑟 is the total number of holdings for replicate 𝑟 (𝑟 = 1, … ,1000) infected earlier than movement 𝑇̌ − 1556. Holdings infected later were excluded as the number of infections after 1556 consecutive movements could not be recorded. For most analyses we focus on production types only and for this we used a dummy variable 𝑿, of dimension 𝑊 × 𝐾, as predictor. The elements 𝑋𝑓́𝑘 = 1 if holding 𝑓́ (𝑓́ = 1, … 𝑊) was reported to have production type 𝑘.
To demonstrate the effect of the maximum capacity we divided the holdings into classes, denoted small, medium and large, respectively. We used the definitions given in Nöremark et al. (2009b) such that a holding is small if the maximum capacity of sows < 15 and slaughter pigs < 300, large if sows > 299 or slaughter pigs > 4999 and is otherwise medium. Hence, 3𝐾 combinations of production types and size classes were obtained and similarly to 𝑿, we used a dummy variable 𝒀, of size 𝑊 × 3𝐾, as predictor where 𝑌𝑓́𝑧 = 1 if holding 𝑓́ is reported to belong to combination 𝑧 (𝑧 = 1, … ,3𝐾).
Note that while a holding may have more than one production type, it will always belong to exactly one size class in this analysis of the simulation. No holding with “Missing information” was classified as large and so this combination was excluded from the analysis.
Since we simulate a large number of replicates and each replicate may include many infected holdings, significance is less relevant. We instead look at the coefficients of the parameters given by the GLM. A large positive value should be interpreted such that if an infected holding is reported to belong to a production type (or type and size class in the analysis where the latter is included), it is expected to generate a large number of newly infected holdings (of the analyzed degree). A large negative value means that holdings with the type are expected to generate few infections.
All programs for analysis and simulation were written and implemented in MatLab 7.8.

3. Results
3.1 Parameter estimates of contact probabilities
3.1.1 Weight of production types, 𝒗
Table 1 shows estimates of 𝒗, modeling dominance of production types in determining the contact pattern of a holding. The highest value was estimated for Sow pool centers and the lowest for Fattening herds.

3.1.2 Movements between production types, 𝒉 and 𝑸
Table 2 lists estimated values for the most common movements, defined by either 𝒉 or 𝑸. The estimated values of commonness indices, 𝒉, closely resembled the estimates given in Lindström et al. (2010). The five highest mean estimates were found for (in decreasing order) movements from Multiplying herds to Sow pool centers, Nucleus herds to other Nucleus herds, Nucleus herds to Multiplying herds, Sow pool centers to Sow pool satellites and Sow pool satellites to Sow pool centers.
Estimates of 𝑸 showed that animals were most frequently moved from (in decreasing order) Piglet producers to Fattening herds, Sow pool satellites to Fattening herds, Multiplying herds to Piglet producers, Farrow-to-finish to Fattening herds, Sow pool center to Sow pool satellite, Sow pool satellite to Sow pool center and Multiplying herds to Farrow-to-finish. We list the top seven rather than five to avoid the false impression that movements from Sow pool center to Sow pool satellite are more frequent than from Sow pool satellite to Sow pool center. These were ranked as number 5 and 6 and this ranking only differed by one movement. Also we include estimates for the seventh highest, Multiplying herds to Farrow-to-finish, as this showed the highest estimates for kernel variance (see below).

3.1.3 Size dependent parameters, 𝒄
Table 3 shows estimates for parameters 𝒄, determining how the maximum capacity of the holdings influences the contact pattern. Most estimates (23 of a total 32) showed a clear positive relationship between size and contact probabilities (i.e. 𝑐 = 0 is not included in the 95% central credibility interval) while 7 estimates showed a negative relationship. Of the negative relationships, 3 were found for estimates of the artificial type “Missing information”.

3.1.4 Distance related parameters, 𝑽 and 𝜿
We only include the estimates of 𝑽 and 𝜿 for the most common movements (given above) and list these in Table 2. Variance of the spatial kernel, 𝑉, is a measure of the scale at which contacts occur and a high value of 𝑉𝐼𝐽 indicates that long distance movements are common from holdings of production type 𝐼 to 𝐽. The highest estimate was found for movements from Multiplying herds to Farrow-to-finish holdings and the lowest was found for movements between Sow pool centers.
Kernel kurtosis, 𝜿, is a measure of the difference in movement distances.
A high value indicates that there are many short distance movements but concurrently many at long distance, and a low value indicates that movement distances are more uniform. In Lindström et al. (2009), where differences in production types were excluded, the kernel kurtosis was estimated at 32.6 with 95% central credibility interval (29.2, 36.4). Of the 64 estimates of 𝜅𝐼𝐽 in this study, 49 showed strong evidence for lower values (non-overlapping central credibility intervals) and 2 showed strong evidence for higher values.

Figure 2 illustrates the cumulative distribution of observed and predicted movement distances.

3.2 Simulation of disease transmission

Figure 3 shows the coefficients estimated by the GLMs as described in section 2.3. Large positive values indicate that herds with this characteristic are expected to generate a large number of new infections and negative values indicate that a herd is expected to generate few transmissions. Error bars were small and are excluded from the figure for clarity. Sow pool centers, Nucleus herds, Sow pool satellites and Fattening herds were estimated to have an increased risk (relative to other production types) of generating a large number of new infections when higher degree infections are accounted for (Figure 3a). However, apart from the herds in the group Missing information, the Fattening herds had the lowest coefficient also for higher degrees and are still considered low risk herds. Piglet producers were estimated to generate fewer infections when accounting for higher degree infections. Sow pool centers and satellites as well as holdings of the artificial type Missing information were estimated to generate fewer new infections when studying long distance transmissions (Figure 3b). Holdings with Multiplying herds and Farrow-to-finish were estimated to have a higher risk of long distance transmission.
For most production types, larger holdings were estimated to have a higher risk of causing new infections (Figure 3c). No holdings with type Missing information were classified as large, but the risk of infecting other holdings was estimated to be lower for Medium than for Small holdings. Also, Large Sow pool centers had a lower coefficient than Medium holdings reported with the same production type. When inspecting the data we found that no large holding reported with production type Sow pool center was reported as only this type.

4. Discussion

In this paper we have presented a hierarchical Bayesian model for analysis of contacts between holdings, applied to movements of pigs in Sweden. The analysis revealed a highly heterogeneous contact structure. Holdings of different production types were estimated to differ in the number of contacts and in the production types with which the contacts occurred. This was demonstrated both in the estimates of commonness indices, 𝒉, and in the estimated absolute number of movements, 𝑸. These two measures provide somewhat different information about the contact pattern. Whereas 𝒉 takes into account how frequent the production types are, 𝑸 does not. For example, Farrow-to-finish herds and Fattening herds are both common production types and while there are overall many transports from herds of the former type to herds of the latter (high value of 𝑸), transports between individual holdings of these types are rare (low value of 𝒉).

Posterior distributions of 𝒉 and 𝒗 were similar to the estimates given in Lindström et al. (2010). The Sow pool centers were less dominant in determining the contact pattern of a holding when the model was extended. However, it was still the dominant production type when reported concurrently with other types.
Its posterior mean of 𝒗 was estimated to be more than four times larger than that of the second most dominant type, Sow pool satellite, and 81 times larger than that of the least dominant type, Fattening herd.

Differences were also shown in how the probability of contacts depends on the maximum capacity of holdings (estimated with 𝒄), although most production types showed a positive relationship (negative values were generally small) between maximum capacity and the probability of both incoming and outgoing movements. This was found for both demographic groups (sows and fattening pigs). Without a proper understanding of the production types and the structure of the industry, the negative values of 𝒄 may seem unexpected. Negative estimates indicate that larger holdings have fewer contacts, whereas one might expect larger holdings to be more active and hence to trade more animals. For instance, Farrow-to-finish holdings showed a slightly lower probability of both incoming and outgoing movements with larger maximum capacity of fattening pigs. This is however not surprising, as large herds in this category may be expected to produce piglets that are kept in the herd until fattening, and thus the main movement would be animals sent to slaughter (which was not included in this study). For Nucleus herds a large negative value was found for incoming movements depending on the maximum capacity of sows. This might be due to the fact that large breeding herds mainly introduce new genetic material in the form of semen and rarely buy live animals, as part of their biosecurity policy. Other inconsistencies may be a result of the reporting system. As the production type is reported by the farmer, and no proper definition of the production types is provided, it leaves room for interpretation. Previous studies (Nöremark et al.
2009b) have reported that farmers with small herds had a different interpretation of their type, and, in particular, it was found that they sometimes regarded themselves as breeders even though only a few sows were kept on the premises. Moreover, there is no requirement for updating the information in the database if the production type is changed. Thus, the information on production type may be incorrect in the database. While Nucleus herds (by common definition) generally receive few animals but send many, this may not be the case for small herds (here mainly few sows). Nucleus herds are also expected to have a low maximum capacity of slaughter pigs, and herds with large numbers for this demographic group might also behave differently, which may explain the large positive estimates of 𝒄̈₁ for Nucleus herds (Table 3). The holdings in the group Missing information showed a negative relationship between the number of contacts and the maximum capacity in 3 out of 4 estimates. We believe this was a result of the heterogeneous nature of this group, which contains all holdings not reported to belong to any of the other seven types.

The analysis of 𝑽 and 𝜿 also showed considerable differences in how contact probability is influenced by the distance between holdings. Comparing the estimates of 𝜿 to the estimates of Lindström et al. (2009), where production type differences were ignored, we found that the kernel kurtosis for movements between production types was generally lower. This supports the interpretation of kurtosis as a measure of the heterogeneity of the distance related processes resulting in movements between holdings. Also, by including production type differences we found a good fit between observed and predicted movement distances (Figure 2). Of the more common types (i.e.
high values of 𝒉 or 𝑸 listed in Table 2), the largest mean posterior of 𝜿 was found for movements from Multiplying herds to Farrow-to-finish holdings. High kurtosis was also found for movements from Farrow-to-finish to Fattening herds. This may be interpreted to mean that the contacts between these types are the result of heterogeneous processes. This may partly be explained by the fact that in Farrow-to-finish herds with a limited capacity for fattening pigs, some piglets are sold to fattening herds, while Farrow-to-finish herds with larger fattening units keep all their piglets. Thus, it is not only the herd size but the relative size of the piglet producing and fattening units in the herd that affects the contact pattern of this herd category. Moreover, there is a trend towards specialized production units in pig farming and some herds registered as Farrow-to-finish may have changed their production into either piglet production or fattening pigs without this being recorded in the database.

Movements from Multiplying herds were estimated to have high variance (𝑽), with the highest estimate of the study found for Multiplying herds to Farrow-to-finish. This indicates that long distance movements are common compared to other production types, and from a disease transmission perspective, Multiplying herds may cause long distance transmission. Movements between Sow pool centers and satellites were found to have low estimates of 𝑽, indicating that while many movements occur between these types (both in absolute number and relative to their abundance), these movements are of relatively short distance.

Of the two parameters used to model distance dependence, 𝑽 and 𝜿, the former is the main determinant of disease spread dynamics (Lindström et al. In press). Extrapolating the kernel variance estimates to implications about disease transmission, we may expect that, for example, an infected
Sow pool center will cause few long distance transmissions, while a Multiplying herd (if infected) may rapidly increase the range of an emerging disease. This was also found in the simulation study. The coefficients of Sow pool centers (Figure 3b) decreased with distance but increased for Multiplying herds. Coefficients also increased for Farrow-to-finish holdings, which had a high estimated value of 𝑽 for the main contact type, Fattening herds. Generally we expect contacts between types estimated with a high 𝑽 to be particularly important in later stages of outbreaks. When local susceptibles are depleted (due to becoming infected from more local transmission), long distance contacts may spark new infections where depletion has not yet occurred. This dynamic was found, for example, in the 2001 UK FMD outbreak (Keeling et al. 2001).

Matthews and Woolhouse (2005) argued that animal markets acted as super spreaders in this outbreak. Such markets are rare in Sweden, and identification of possible super spreaders in the system may instead focus on holdings of different production types. However, by international standards animal farming in Sweden is less intensive, and movements are less frequent. Our analysis of holdings as potential super spreaders should therefore be interpreted as relative to other holdings in the system. Figure 3a shows how the potential of generating new infections changes with the infection degree for the different production types. This is mainly a result of the estimates of 𝒉 and 𝑸, and the largest increase was found for Nucleus herds. These mainly move animals to other Nucleus herds and Multiplying herds, which both in turn have many outgoing contacts. Piglet producers showed a lower potential (compared to other production types) of generating new infections when looking at higher degree infections.
The general pattern is however quite similar for different infection degrees, and we may conclude that production types with high estimates of 𝒉 for outgoing movements may act as super spreaders. For many diseases where the time between infection and first symptoms is short, second (and consequently third) degree transmission may be prevented by movement restrictions. While early detection is always crucial, our results suggest that its importance should be emphasized even more for some production types, such as Nucleus herds and Sow pool centers.

Further, the analysis of the simulation study showed that for most production types, infected holdings in the larger size classes were likely to generate more transmissions. Hence, we may conclude that larger holdings generally have a higher potential to act as super spreaders. Exceptions were found for holdings with Missing information (which is expected due to the negative relationship between maximum capacity and contact probability, see above) and Sow pool satellites. The latter showed a decrease in the coefficient given by the GLM for Large holdings. We believe this is a result of the fact that this size class contained no holdings reported with only this production type, and the number of transmissions was largely explained by the coexisting production types.

Using databases of holdings and animal movements to estimate parameters allows for assessment of large scale patterns. Also, unlike qualitative studies, where inference is made from a few handpicked holdings checked for consistency, the parameters are estimated from the same type of data that may be used for outbreak simulations. If, for example, parameters are estimated from 100 holdings with every trait checked and edited in great detail, they may not be reliable for modeling contact data for holdings that have not been checked in the same way. However, erroneous and dubious reports pose a problem.
In order to provide better estimates, the data quality needs to be improved. Better guides to farmers, as well as a requirement for regular updates of recorded information, may provide more reliable data. In particular, the interpretation of production type is of great importance for work such as this study. Production type has a high influence on the contact patterns and could be of great use in risk estimation if the data are reliable. Working with data from central databases always carries a risk of erroneous reports affecting the results. Using the same data as in this study, Lindström et al. (2010) reported unexpectedly many movements between Sow pool centers, a highly unlikely event and probably due to deficiencies in the recorded data on production type. As these are rare production types but with many movements, they are particularly sensitive to erroneous entries in the database and this may have affected the results.

Network analysis is another common approach to study animal movement contacts based on database entries. This commonly involves measurements that capture the overall structure, and conclusions about disease spread are then drawn from these measures (see Dubé et al. 2009 and references therein). The method we have presented here can be seen as a lower level analysis and uses a set of parameters to capture the underlying process of between herd contacts that ultimately results in the higher level structure captured by network measurements. For instance, our results show that holdings with larger herd sizes generally both send and receive more animals and hence would, in a network context, have both higher in- and outdegree (i.e. number of ingoing and outgoing links). Further, if ℎ𝐼𝐽 is generally large for all production types 𝐽 (𝐼), then holdings of type 𝐼 (𝐽) are expected to have a high outdegree (indegree).
An advantage of our analysis is that we may also address secondary (and higher order) infections and, as shown in Figure 3a, this can change the picture of which holdings are to be considered as potential super spreaders.

The distance dependence parameters, in particular 𝑽 (Håkansson et al. 2010), also relate to some commonly used network measurements. Various types of centrality indices are frequently used in network analyses. These capture how important nodes are for the overall connectivity of the network (Wasserman and Faust 1994). Holdings with production types with a high probability of long distance movements (large values of 𝑽) have the ability to connect otherwise distant (in terms of number of links) holdings and would have a high centrality. A strong spatial component, i.e. low values of 𝑽, will instead result in highly clustered networks (Keeling 1999).

With Bayesian inference it is straightforward to incorporate uncertainties in the parameter estimates. The more movements we include, the smaller the credibility intervals. A disadvantage of network analysis is that a single measure is presented, and this measure depends largely on the number of movements analyzed, hence on the time scale chosen. Whereas network analysis considers an observed animal movement between two holdings as a fixed link, our approach instead considers this as a random event, albeit with different probabilities. Another advantage of analysis at this lower level is that the parameters may be directly incorporated in explicit simulations of outbreaks. Here we have presented a simplistic simulation model to demonstrate how the parameters translate to predictions of disease spread. A more realistic but far more complex simulation model of a disease outbreak should include other relevant contact types as well as disease specific parameters, such as incubation time, recovery rate and intra-herd dynamics.
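To make the idea of treating each movement as a random event (rather than a fixed link) more concrete, the following is a minimal Python sketch of how first, second and higher degree infections could be counted from a matrix of contact probabilities. It is not the simulation model used in this study: the function name `simulate_degrees`, the matrix `contact_prob` and the assumption of one independent trial per pair of holdings are all hypothetical and chosen only for illustration.

```python
import random


def simulate_degrees(contact_prob, index, max_degree, rng):
    """Return a list of sets; element k holds the holdings infected at degree k+1.

    contact_prob[i][j] is an assumed probability that an infected holding i
    transmits to holding j (illustrative input, not an estimate from the paper).
    """
    infected = {index}   # all holdings infected so far
    frontier = {index}   # holdings infected at the previous degree
    by_degree = []
    for _ in range(max_degree):
        new = set()
        for src in sorted(frontier):
            for dst, p in enumerate(contact_prob[src]):
                # Each potential contact is a random event with probability p.
                if dst not in infected and dst not in new and rng.random() < p:
                    new.add(dst)
        by_degree.append(new)
        infected |= new
        frontier = new
    return by_degree


if __name__ == "__main__":
    # Hypothetical probabilities for three holdings.
    probs = [[0.0, 0.5, 0.1],
             [0.0, 0.0, 0.8],
             [0.0, 0.0, 0.0]]
    print(simulate_degrees(probs, index=0, max_degree=2, rng=random.Random(1)))
```

Repeating such a simulation many times, with probabilities drawn from a posterior distribution, would give a distribution of outbreak sizes per degree rather than a single number.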
An advantage of network analysis is however its accessibility. At present, there are numerous software packages available which allow for straightforward analyses of data. The method we have presented here is in comparison computationally heavy and requires some basic knowledge of MCMC techniques.

It should also be stressed that, while the model presented here is cumbersome in the number of parameters, there are further aspects of the contact structure that are not included. Reoccurring contacts between holdings may be expected, and in particular this is true for the holdings in the Sow pool system. A low variance of the spatial kernel, as reported for contacts between Sow pool centers and satellites, results in a high probability of contacts with nearby holdings, and a high rate of reoccurring contacts is expected to occur in the simulation study. However, since it is not incorporated explicitly, we believe this may have caused an overestimation of the number of infections caused by Sow pool centers and satellites in the simulation study. Recurrent contacts are expected to influence disease spread dynamics and, while it requires estimation of additional parameters, this may be a salient extension of the model.

While the contact model presented may be improved, we believe that most of the relevant features are included. If data on other contacts are available, the method may also be applied to these. Using generalized measures of contact probabilities (such as the parameters of the presented contact model) allows for comparison, both between different holdings and between different types of contacts. Also, if similar analyses are applied to data from other countries, comparisons of parameters may inform us of differences in the contact structure.
We also believe that the model may be used for risk assessment as a complement to classic methods, but there is a need for better data quality for reliable inference. Yet, an analysis such as the one presented may also guide towards increased data quality in the future.

5. Conclusion

In this study we have analyzed live animal movements between pig holdings and found that the contact pattern is highly heterogeneous. We found that generally, but not always, a positive relationship exists between the maximum capacity of a holding and the number of contacts.

Describing distance dependence with a spatial kernel and analyzing its characteristics provides valuable information about the contact pattern between holdings and is a main feature in predicting disease spread. Hence the more detailed knowledge gained by the methodology presented may improve both understanding and predictive power. We found that the distance dependence of the contact probability between holdings was influenced by the production types of the start and end holdings.

Heterogeneous contact patterns, with some holdings likely to act as super spreaders, and differences in the probability of long distance contacts are expected to cause stochastic dynamics of a disease outbreak where animal movements are important for the transmission.

Conflict of interest

We have no conflict of interest.

Acknowledgement

We thank the Swedish Board of Agriculture for supplying the data and the Swedish Civil Contingencies Agency for funding.

References

Bigras-Poulin, M., Thompson, R.A., Chriel, M., Mortensen, S., Greiner, M., 2006. Network analysis of Danish cattle industry trade patterns as an evaluation of risk potential for disease spread. Prev. Vet. Med. 76, 11–39.
Boender, G.J., Meester, R., Gies, E., De Jong, M.C.M., 2007. The local threshold for geographical spread of infectious diseases between farms. Prev. Vet. Med. 82, 90–101.
Brennan, M.L., Kemp, R., Christley, R.M., 2008. Direct and indirect contacts between cattle farms in north-west England. Prev. Vet. Med. 84, 242–260.
Casella, G., George, E.I., 1992. Explaining the Gibbs sampler. Am. Stat. 46, 167–174.
Clark, J.S., Silman, M., Kern, R., Macklin, E., HilleRisLambers, E., 1999. Seed dispersal near and far: generalized patterns across temperate and tropical forests. Ecology 80, 1475–1494.
Chib, S., Greenberg, E., 1995. Understanding the Metropolis-Hastings algorithm. Am. Stat. 49, 327–335.
Dickey, B.F., Carpenter, T.E., Bartell, S.M., 2008. Use of heterogeneous operation-specific contact parameters changes predictions for foot-and-mouth disease outbreaks in complex simulation models. Prev. Vet. Med. 87, 272–287.
Dubé, C., Ribble, C., Kelton, D., McNab, B., 2009. A review of network analysis terminology and its application to foot-and-mouth disease modelling and policy development. Transbound. Emerg. Dis. 56, 73–85.
Fan, Y., Dortet-Bernadet, J.-L., Sisson, S.A., 2010. A note on Bayesian curve fitting via auxiliary variables. J. Comput. Graph. Stats. 19, 626–644.
Févre, E.M., Bronsvoort, B.M.de C., Hamilton, K.A., Cleaveland, S., 2006. Animal movements and the spread of infectious diseases. Trends Microbiol. 14, 125–131.
Gamerman, D., Lopes, H.F., 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, second ed. CRC Press, Chapman & Hall.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 2004. Bayesian Data Analysis, second ed. Chapman & Hall/CRC (Chapter 18).
Hawkes, C., 2009. Linking movement behaviour, dispersal and population processes: is individual variation a key? J. Anim. Ecol. 78, 894–906.
Håkansson, N., Jonsson, A., Lennartsson, J., Lindström, T., Wennergren, U., 2010. Generating structure specific networks. Adv. Complex Syst. 13, 239–250.
Kao, R.R., Green, D.M., Johnson, J., Kiss, I.Z., 2007.
Disease dynamics over very different time-scales: foot-and-mouth disease and scrapie on the network of livestock movements in the UK. J. R. Soc. Interface 4, 907–916.
Keeling, M.J., 1999. The effects of local spatial structure on epidemiological invasions. Proc. R. Soc. London B 266, 859–869.
Keeling, M.J., Woolhouse, M.E.J., Shaw, D.J., Matthews, L., 2001. Dynamics of the 2001 UK foot and mouth epidemic: stochastic dispersal in a dynamic landscape. Science 294, 813–817.
Keeling, M., 2005. The implications of network structure for epidemic dynamics. Theo. Pop. Bio. 67, 1–8.
Lindström, T., Håkansson, N., Westerberg, L., Wennergren, U., 2008. Splitting the tail of the displacement kernel shows the unimportance of kurtosis. Ecology 89, 1784–1790.
Lindström, T., Sisson, S.A., Nöremark, M., Jonsson, A., Wennergren, U., 2009. Estimation of distance related probability of animal movements between holdings and implications for disease spread modeling. Prev. Vet. Med. 91, 85–94.
Lindström, T., Sisson, S.A., Sternberg Lewerin, S., Wennergren, U., 2010. Estimating animal movement contacts between holdings of different production types. Prev. Vet. Med. 95, 23–31.
Lindström, T., Håkansson, N., Wennergren, U. The shape of the spatial kernel and its implications for biological invasions in patchy environments. Proc. Roy. Soc. London B. In press.
Matthews, L., Woolhouse, M., 2005. New approaches to quantifying the spread of infection. Nat. Rev. Microbiol. 3, 529–536.
Nöremark, M., Håkansson, N., Lindström, T., Wennergren, U., Sternberg Lewerin, S., 2009a. Spatial and temporal investigations of reported movements, births and deaths of cattle and pigs in Sweden. Acta Vet. Scand. 51:37.
Nöremark, M., Lindberg, A., Vågsholm, I., Sternberg Lewerin, S., 2009b. Disease awareness, information retrieval and change in biosecurity routines among pig farmers in association with the first PRRS outbreak in Sweden. Prev. Vet. Med.
90, 1–9.
Nöremark, M., Håkansson, N., Sternberg Lewerin, S., Lindberg, A., Jonsson, A. Network analysis of cattle and pig movements in Sweden; measures relevant for disease control and risk based surveillance. In press.
Ortiz-Pelaez, A., Pfeiffer, D.U., Soares-Magalhães, R.J., Guitian, F.J., 2006. Use of social network analysis to characterize the pattern of animal movements in the initial phases of the 2001 foot and mouth disease (FMD) epidemic in the UK. Prev. Vet. Med. 75, 40–55.
Rweyemamu, M., Roeder, P., Mackay, D., Sumption, K., Brownlie, J., Leforban, Y., Valarcher, J.F., Knowles, N.J., Saraiva, V., 2008. Epidemiological patterns of foot-and-mouth disease worldwide. Transbound. Emerg. Dis. 55, 57–72.
Ribbens, S., Dewulf, J., Koenen, F., Mintiens, K., de Kruif, A., Maes, D., 2009. Type and frequency of contacts between Belgian pig herds. Prev. Vet. Med. 88, 57–66.
Robinson, S.E., Christley, R.M., 2007. Exploring the role of auction markets in cattle movements within Great Britain. Prev. Vet. Med. 81, 21–37.
Tildesley, M.J., Deardon, R., Savill, N.J., Bessell, P.R., Brooks, S.P., Woolhouse, M.E.J., Grenfell, B.T., Keeling, M.J., 2008. Accuracy of models for the 2001 foot-and-mouth epidemic. Proc. Roy. Soc. London B. 275, 1459–1468.
Velthuis, A.G., Mourits, M.C., 2007. Effectiveness of movement-prevention regulations to reduce the spread of foot-and-mouth disease in The Netherlands. Prev. Vet. Med. 82, 262–281.
Vernon, M.C., Keeling, M.J., 2009. Representing the UK’s cattle herd as static and dynamic networks. Proc. Roy. Soc. London B. 276, 469–476.
Wasserman, S., Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge.
Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of “small-world” networks. Nature 393, 440–442.

Appendix A.
Hierarchical priors

We implement hierarchical priors for 𝑽 and 𝜿, and denote these 𝛹(𝑉𝐼𝐽 |𝝃𝑽) and 𝛹(𝜅𝐼𝐽 |𝝃𝜿), respectively, where 𝝃𝑽 and 𝝃𝜿 are vectors of unknown hyper-parameters. These vectors model the degree of similarity between parameters and (if similarities are prominent) improve estimation of 𝑽 and 𝜿 when the data are weak (e.g. few movements between types 𝐼, 𝐽). For 𝑽 we use an inverse gamma distribution with hyper-parameters 𝛼𝑉 and 𝛽𝑉

\[ \Psi(V_{IJ} \mid \alpha_V, \beta_V) = \frac{\beta_V^{\alpha_V}}{\Gamma(\alpha_V)} \left(\frac{1}{V_{IJ}}\right)^{\alpha_V + 1} e^{-\beta_V / V_{IJ}} . \tag{A.1} \]

The probability density function of the inverse gamma distribution is defined for values larger than zero. Since the lower limiting value of 𝜿 is 4/3 (a uniform distribution obtained for 𝑏 → ∞, Lindström et al. 2008) we use a shifted inverse gamma distribution as hierarchical prior for 𝜿 and write

\[ \Psi(\kappa_{IJ} \mid \alpha_\kappa, \beta_\kappa) = \frac{\beta_\kappa^{\alpha_\kappa}}{\Gamma(\alpha_\kappa)} \left(\frac{1}{\kappa_{IJ} - 4/3}\right)^{\alpha_\kappa + 1} e^{-\beta_\kappa / (\kappa_{IJ} - 4/3)} . \tag{A.2} \]

Appendix B. Indicator variables

To facilitate computations when sampling from posteriors associated with mixture distributions, a common strategy is to introduce indicator variables (for example Gelman et al. 2004). With this approach, equation 10 may be rewritten as

\[ P(\mathbf{d}, \mathbf{s} \mid \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{c}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}) = \prod_{t} \prod_{I} \prod_{J} \left[ P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_J, \kappa_{IJ}, V_{IJ}) \, P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}, \dot{\mathbf{c}}_I) \, P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R}) \right]^{U_{IJt}} , \tag{B.1} \]

where 𝑼 is a tensor of dimension 𝐾 × 𝐾 × 𝑇 and 𝑈𝐼𝐽𝑡 = 1 for exactly one combination of 𝐼, 𝐽 for each movement 𝑡. The full posterior distribution of unobserved parameters 𝒄, 𝜿, 𝑽, 𝒉, 𝒗, 𝛼𝜅, 𝛽𝜅, 𝛼𝑉, 𝛽𝑉, 𝑼 is

\[ P(\mathbf{c}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{h}, \mathbf{v}, \alpha_\kappa, \beta_\kappa, \alpha_V, \beta_V, \mathbf{U} \mid \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{d}, \mathbf{s}) \propto P(\mathbf{d}, \mathbf{s} \mid \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{c}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}) \, \Psi(\boldsymbol{\kappa} \mid \alpha_\kappa, \beta_\kappa) \, \Psi(\mathbf{V} \mid \alpha_V, \beta_V) \, P(\mathbf{v}) P(\mathbf{h}) P(\mathbf{c}) P(\alpha_\kappa) P(\beta_\kappa) P(\alpha_V) P(\beta_V) , \tag{B.2} \]

where 𝑃(𝒗), 𝑃(𝒉), 𝑃(𝒄), 𝑃(𝛼𝜅), 𝑃(𝛽𝜅), 𝑃(𝛼𝑉) and 𝑃(𝛽𝑉) are prior distributions of parameters and hyper-parameters. For 𝒗 and 𝒉 (recall that the elements of these sum to one) we use uninformative 𝐷𝑖𝑟𝑖𝑐ℎ𝑙𝑒𝑡(1,1, … ,1) priors.
Priors 𝑃(𝒄), 𝑃(𝛼𝜅) and 𝑃(𝛼𝑉) are set to be proportional to one on the support of the parameters, while 𝑃(𝛽𝜅) and 𝑃(𝛽𝑉) are defined as uniform for 𝛽 > 1. The inverse gamma distribution does not have a finite mean for 𝛽 < 1 and we assume that both 𝑉 and 𝜅 are finite quantities.

Incorporation of the unobserved indicator variable 𝑼 in the model also allows for posterior estimation of the number of movements between production types. The posterior distribution of 𝑸 is calculated from the posterior distribution of 𝑼 by

\[ Q_{IJ} = \sum_{t} U_{IJt} . \tag{B.3} \]

Appendix C. MCMC estimation

We use Markov chain Monte Carlo (MCMC) techniques to estimate the posterior distribution of the model parameters. This involves the construction of a stochastic Markov chain with stationary distribution given by the posterior distribution of interest. Given a current state of the chain, MCMC methods sequentially update the parameters either individually or in blocks, based on the full posterior conditional distributions of each parameter under the model. Repeating this procedure, and after the chain has converged, the states of the chain represent (correlated) draws from the posterior distribution of model parameters. Two basic updates are involved. If the conditional distribution of a parameter is of a standard form, Gibbs sampling (see e.g. Casella and George 1992 for further details) may be used. If however the distribution is of non-standard form, Metropolis-Hastings updates may be used (see e.g. Chib and Greenberg 1995 for further details). In this case, parameter values 𝜽∗ are proposed from a density function 𝑞(𝜽∗ |𝜽) and subsequently accepted with probability

\[ \min\left(1, \frac{P(\boldsymbol{\theta}^{*} \mid \ldots)\, P(\boldsymbol{\theta}^{*})\, q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{*})}{P(\boldsymbol{\theta} \mid \ldots)\, P(\boldsymbol{\theta})\, q(\boldsymbol{\theta}^{*} \mid \boldsymbol{\theta})}\right), \tag{C.1} \]

where 𝜽 and 𝜽∗ are current and proposed parameter values, 𝑃(𝜽) is the prior and 𝑃(𝜽| … ) is the likelihood evaluated at 𝜽. Further information on MCMC methods can be found in Gamerman and Lopes (2006).
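The acceptance rule in equation C.1 can be illustrated with a minimal Python sketch of a random-walk Metropolis-Hastings update for a single scalar parameter. This is only an illustration under stated assumptions, not the MATLAB code used in this study: the function name `metropolis_hastings`, the argument `log_target` (the logarithm of likelihood times prior, up to a constant) and the step size are hypothetical. Because a Gaussian random walk is symmetric, the proposal densities 𝑞 cancel in the ratio, and the ratio is evaluated on the log scale for numerical stability.

```python
import math
import random


def metropolis_hastings(log_target, theta0, step_sd, n_iter, rng):
    """Random-walk Metropolis-Hastings for one scalar parameter.

    log_target(theta) must return log(likelihood * prior) up to an additive
    constant; theta0 must lie in the support (finite log_target).
    Returns the chain and the fraction of accepted proposals.
    """
    theta = theta0
    log_p = log_target(theta)
    samples = []
    accepted = 0
    for _ in range(n_iter):
        # Symmetric Gaussian proposal, so q(theta|theta*) / q(theta*|theta) = 1.
        proposal = theta + rng.gauss(0.0, step_sd)
        log_p_new = log_target(proposal)
        log_ratio = log_p_new - log_p
        # Accept with probability min(1, exp(log_ratio)), as in equation C.1.
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            theta, log_p = proposal, log_p_new
            accepted += 1
        samples.append(theta)
    return samples, accepted / n_iter


if __name__ == "__main__":
    # Toy target: a standard normal density (log scale, constant dropped).
    chain, rate = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 1.0,
                                      5000, random.Random(42))
    print(f"acceptance rate: {rate:.2f}")
```

In practice the step size would be tuned so that the acceptance rate is neither close to zero nor close to one, and an initial part of the chain would be discarded as burn-in.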
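All parameters except 𝑼 may be updated with Metropolis-Hastings steps. Parameter matrix 𝒉 is only involved in 𝑃(𝐼, 𝐽|𝒗, 𝒉, 𝑹) of equation B.1, and so the resulting conditional posterior distribution of 𝒉 is

\[ P(\mathbf{h} \mid \mathbf{v}, \mathbf{U}, \mathbf{R}) \propto \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ P(I_t, J_t \mid \mathbf{v}, \mathbf{h}, \mathbf{R}) \right]^{U_{IJt}} P(\mathbf{h}) . \tag{C.2} \]

The prior, 𝑃(𝒉), is proportional to 1 and the distribution \(\prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} [P(I_t, J_t \mid \mathbf{v}, \mathbf{h}, \mathbf{R})]^{U_{IJt}}\) is given in Lindström et al. (2010) as

\[ L_1 = \mathrm{Multinomial}(M_{1,1}, M_{1,2}, \ldots, M_{2,1}, M_{2,2}, \ldots, M_{K,K-1}, M_{K,K} \mid p_{1,1}, p_{1,2}, \ldots, p_{2,1}, p_{2,2}, \ldots, p_{K,K-1}, p_{K,K}) , \tag{C.3} \]

where 𝑝𝐼,𝐽 = 𝑃(𝐼, 𝐽|𝒗, 𝒉, 𝑹) is given in equation 3 and 𝑀𝐼,𝐽 = ∑𝑡 𝑈𝐼𝐽𝑡 for all 𝐼, 𝐽. Dirichlet distributions were used for proposals. More details may be found in Lindström et al. (2010).

Dirichlet proposals were also used for updates of 𝒗, which is included in each of the distributions 𝑃(𝑑𝑡 |𝒗, 𝑹, 𝑠𝑡, 𝐼, 𝐽, 𝑺, 𝑫, 𝒄̈𝑱, 𝜅𝐼𝐽, 𝑉𝐼𝐽), 𝑃(𝑠𝑡 |𝒗, 𝑹, 𝐼, 𝑺, 𝒄̇𝑰) and 𝑃(𝐼, 𝐽|𝒗, 𝒉, 𝑹). For simplicity we write

\[ L_2 = \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{sI}\, G(S_{1s}, S_{2s}, \dot{c}_{1I}, \dot{c}_{2I})}{\sum_{f} \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I})} \right]^{U_{IJt}} \tag{C.4} \]

and

\[ L_3 = \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{dI}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f} \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}} , \quad f \neq s, \tag{C.5} \]

and here the conditional posterior distribution for 𝒗 is

\[ P(\mathbf{v} \mid \mathbf{c}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}, \mathbf{J}, \mathbf{I}) \propto L_1 L_2 L_3\, P(\mathbf{v}) . \tag{C.6} \]

The elements of 𝒄̇ and 𝒄̈ were updated separately using Gaussian random walk proposal distributions. The conditional posterior distribution of 𝑐̇𝑢𝐼 (𝑢 = 1,2) is

\[ P(\dot{c}_{uI} \mid \dot{c}_{\tilde{u}I}, \mathbf{v}, \mathbf{U}, \mathbf{R}, I, \mathbf{S}, \mathbf{s}) \propto \left( \prod_{t=1}^{T} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{sI}\, G(S_{us}, S_{\tilde{u}s}, \dot{c}_{uI}, \dot{c}_{\tilde{u}I})}{\sum_{f} \hat{v}_{fI}\, G(S_{uf}, S_{\tilde{u}f}, \dot{c}_{uI}, \dot{c}_{\tilde{u}I})} \right]^{U_{IJt}} \right) P(\dot{c}_{uI}) , \tag{C.7} \]

where 𝑢̃ = 2 for 𝑢 = 1 and 𝑢̃ = 1 for 𝑢 = 2. Similarly, the conditional posterior distribution of 𝑐̈𝑢𝐽 is

\[ P(\ddot{c}_{uJ} \mid \ddot{c}_{\tilde{u}J}, \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}, \mathbf{J}) \propto \prod_{t=1}^{T} \prod_{I=1}^{K} \left[ \frac{\hat{v}_{dI}\, G(S_{ud}, S_{\tilde{u}d}, \ddot{c}_{uJ}, \ddot{c}_{\tilde{u}J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f} \hat{v}_{fI}\, G(S_{uf}, S_{\tilde{u}f}, \ddot{c}_{uJ}, \ddot{c}_{\tilde{u}J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}} P(\ddot{c}_{uJ}) , \quad f \neq s. \tag{C.8} \]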
Joint updates of $\boldsymbol{\kappa}$ and $\mathbf{V}$ were performed separately for each combination of $I, J$ using a multivariate Gaussian random walk on the logarithms of $\kappa_{IJ}$ and $V_{IJ}$. The joint conditional distribution of $\kappa_{IJ}, V_{IJ}$ is

\[ P(\kappa_{IJ}, V_{IJ} \mid \ddot{\mathbf{c}}_J, \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{U}, \mathbf{J}) = \prod_{t=1}^{T} \left[ \frac{\hat{v}_{dI}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_f \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}} \Psi(\kappa_{IJ} \mid \alpha_\kappa, \beta_\kappa)\, \Psi(V_{IJ} \mid \alpha_V, \beta_V), \quad f \neq s. \tag{C.9} \]

We utilized Metropolis-Hastings updates for both $\alpha$ and $\beta$. In order to improve the mixing we updated the parameters of the hierarchical priors five times for every update of the other parameters (e.g. Fan et al. 2010). The posterior distribution of $\alpha_\theta, \beta_\theta$ ($\theta = V, \kappa$) is

\[ \begin{aligned} P(\alpha_\theta \mid \boldsymbol{\theta}) &= \Psi(\boldsymbol{\theta} \mid \alpha_\theta, \beta_\theta)\, P(\alpha_\theta), \\ P(\beta_\theta \mid \boldsymbol{\theta}) &= \Psi(\boldsymbol{\theta} \mid \alpha_\theta, \beta_\theta)\, P(\beta_\theta), \end{aligned} \tag{C.10} \]

with $\Psi$ given by equations A.1 (for $\boldsymbol{\theta} = \mathbf{V}$) and A.2 (for $\boldsymbol{\theta} = \boldsymbol{\kappa}$).

As the indicator variable $U_{IJt} = 1$ for exactly one combination of $I, J$, $\mathbf{U}$ may be updated with Gibbs sampling by drawing one random number for each $t$ from a multinomial distribution with probabilities given by

\[ \Pr(U_{IJt} = 1) = \frac{P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_J, \kappa_{IJ}, V_{IJ})\, P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}, \dot{\mathbf{c}}_I)\, P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})}{\sum_i \sum_j P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, i, j, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_j, \kappa_{ij}, V_{ij})\, P(s_t \mid \mathbf{v}, \mathbf{R}, i, \mathbf{S}, \dot{\mathbf{c}}_i)\, P(i, j \mid \mathbf{v}, \mathbf{h}, \mathbf{R})}. \tag{C.11} \]

Figure Legends

Figure 1. Examples of the spatial kernel with (a) kurtosis = 3.33 and variance = 1000 (dashed), 10000 (solid), 100000 (dotted) (km²) and (b) variance = 1000000 (km²) and kurtosis = 2 (dashed), 4 (solid), 8 (dotted). The embedded axes show the same curves as the main axes but with a logarithmic y-axis and larger distances included.

Figure 2. Cumulative distribution of observed (solid line) and predicted (dotted line) distances of live pig movements between holdings. Note that the x-axis is on the log scale.

Figure 3. Coefficients of (panels a and b) the explanatory variables production type (0 if the holding had not reported the production type, 1 if reported) and (panel c) the combination of production type and size class (S=small, M=medium, L=large) analyzed by GLM. The response variable was (a) the number of first, second and third degree infections caused by a holding if infected, (b) the number of first degree infections at distances longer than 10, 100 and 500 km and (c) the number of first degree infections (but with size classes included in the explanatory variables). Note that (a) and (b) are each the result of three separate analyses (each with 8 explanatory variables) while (c) is the result of one analysis. Legend abbreviations: SPC=Sow pool centers, MH=Multiplying herds, NH=Nucleus herds, PP=Piglet producers, SPS=Sow pool satellites, FF=Farrow-to-finish, FH=Fattening herds, MI=Missing information.
