Manuscript 3 nov - IFM

Title:
Bayesian analysis of animal movements related to factors at herd and between herd levels: Implications for disease spread modeling.

Authors: Tom Lindström^a, Scott A. Sisson^b, Susanna Stenberg Lewerin^c and Uno Wennergren^a

a IFM Theory and Modelling, Linköping University, 581 83 Linköping, Sweden
b School of Mathematics and Statistics, University of New South Wales, Sydney 2052, Australia
c Department of Disease Control and Epidemiology, SVA, National Veterinary Institute, 751 89 Uppsala, Sweden

Corresponding Author:
Uno Wennergren
Tel: +46 13 28 16 66
Fax: +46 13 28 13 99
Email: unwen@ifm.liu.se
Correspondence address: See above.

Key words
Markov Chain Monte Carlo; Hierarchical Bayesian; Mixture models; Indicator variable; Animal databases; Animal movements; Contact structure

Abstract
A method to assess the influence of between-herd distances, production types and herd sizes on patterns of between-herd contacts is presented. It was applied to pig movement data from a central database of the Swedish Board of Agriculture. To determine the influence of these factors on the contact between holdings we used a Bayesian model and Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters. The analysis showed that the contact pattern via animal movements is highly heterogeneous and influenced by all three factors: production type, herd size, and distance between holdings. Most production types showed a positive relationship between maximum capacity and the probability of both incoming and outgoing movements. In agreement with previous studies, holdings also differed both in the number of contacts and in the holding types with which contact occurred. Also, the scale and shape of distance dependence in contact probability was shown to differ depending on the production types of holdings.
To demonstrate how the methodology may be used for risk assessment, disease transmissions via animal movements were simulated with the model used for analysis of contacts, parameterized by the estimated posterior distribution. A Generalized Linear Model showed that herds with production types Sow pool center, Multiplying herd and Nucleus herd have a higher risk of generating a large number of new infections. Multiplying herds are also expected to generate many long distance transmissions, while transmissions generated by Sow pool centers are confined to more local areas. We argue that the methodology presented may be a useful tool for improvement of risk assessment based on data found in central databases.

1. Introduction
In order to understand the process of disease spread between animal holdings, researchers are increasingly studying the contact patterns that contribute to the transmission (e.g. Velthuis and Mourits 2007, Dubé et al. 2009, Nöremark et al. 2009a, Vernon and Keeling 2009, Lindström et al. 2010). Analysis of the contact pattern allows for making predictions about disease spread (Kao et al. 2007) as well as testing the effects of changes in the contact structure (Velthuis and Mourits 2007). Different diseases may be transmitted via different paths, but generally between-holding movements of animals may be regarded as the main risk factor for transmission of livestock diseases (Fèvre et al. 2006, Ortiz-Pelaez et al. 2006, Rweyemamu et al. 2008).
In this paper we present a methodology to transform data on between-holding contacts into probability distributions useful for predictions and direct analysis. We analyze pig movements in Sweden and show how it is possible to assess the influence of several factors on the contact pattern, specifically production type, number of animals and distances between holdings. For the latter, it has been shown that contacts between holdings are more common at short distances (Boender et al. 2007, Robinson and Christley 2007, Lindström et al. 2009, Ribbens et al. 2009). Epidemiological studies often describe the probability of transmission dependent on distance with a spatial kernel (Keeling et al. 2001, Tildesley et al. 2008), and Lindström et al. (2009) showed that a good description of contact probabilities at both short and long distances is important. As in Lindström et al. (2009) we characterize the spatial kernel by its two-dimensional variance (quantifying the scale) and kurtosis (quantifying the shape). A high value of the kernel variance means that contacts occur more frequently at longer distances. A high value of kurtosis indicates that contacts are frequent at short distances, but concurrently long distance contacts, represented by a fat tail of the kernel, are also common. From ecological research it is known that leptokurtic kernels (i.e. kernels with higher kurtosis than a normal distribution) may be the result of heterogeneity in the dispersal processes (see Hawkes 2009 and references therein). Hence, kurtosis may be interpreted as a measure of the heterogeneity of the distance related contact probability. Figure 1 shows spatial kernels with different combinations of 2D-variance and kurtosis. More frequent contacts at shorter distances result in spatially clustered contact patterns (Keeling 1999) which lead to depletion of local susceptibles and a rapid decline in the reproductive ratio (Keeling 2005). Hence, such dynamics are expected from disease transmission where contacts are estimated to be described by a kernel with a small variance. A large value of kurtosis also results in such patterns, but since long distance contacts are also common, the number of contacts needed for connecting otherwise distant holdings is reduced. From the theory of small-world networks (Watts and Strogatz 1998) it is known that such contacts may have substantial impact on the dynamics of an outbreak.
Contact heterogeneity due to production types has been reported and shown to be important for the dynamics of outbreaks (Dickey et al. 2008, Ribbens et al. 2009, Lindström et al. 2010). The production type of a holding may be expected to influence both the number of animal movements and which other holdings animals are moved from or to. For the latter case, clustering patterns may be expected, similar to contact heterogeneities found due to spatial clustering. However, unlike the spatial factor, directional differences may be expected, with animal movements being more common from production type A to type B than from B to A. Holdings of some production types also have many contacts (i.e. trade many animals), which means that if infected, these holdings may cause a large number of secondary infections and thereby function as super spreaders (Matthews and Woolhouse 2005).
The number of animals on a holding is also expected to influence the contact pattern. Typically, larger holdings are expected to trade more animals and hence have more contacts (Ribbens et al. 2009), making such holdings candidates as super spreaders. However, differences may be expected between production types. For example, the frequency of live animal movements to and from holdings with farrow-to-finish production might not be strongly dependent on the number of animals kept on the premises, since this production type includes both piglet producing units and fattening units on the same holding.
Commonly, between-holding contacts are studied with network analysis (Bigras-Poulin et al. 2008, Brennan et al. 2008, Nöremark et al. In press). Such studies provide good quantitative measures of the observed structure and may also provide an understanding of the contact patterns and the dynamics of disease transmission through these contacts (Keeling 2005, Kao et al. 2007). It may however be difficult to parameterize a model from network measures. Simulation models in a network context are usually confined to resampling observed contacts (e.g. Vernon and Keeling 2009). While the method presented in this paper addresses many of the same questions as network studies, it utilizes (hierarchical) Bayesian models and builds on methodology presented in Lindström et al. (2009) and Lindström et al. (2010). In Lindström et al. (2009) a method was presented for estimation of distance-related probability of contacts, but it was there assumed that all holdings are identical. Lindström et al. (2010) introduced a method that analyzed the contact patterns based on production types, but other factors were excluded. In this paper we present a model that describes contact via live animal movements between holdings where the probability of contacts depends on production types, the number of animals at each holding and distance between holdings. We estimate the posterior distribution of model parameters with Markov chain Monte Carlo (MCMC) methods and utilize data found in central databases of animal movements. EU members and some other states (e.g. Australia and New Zealand) are required to keep databases on all livestock holdings and register all movements of pigs and cattle, which means that such data may be available for analysis. The level of detail included in the databases does however vary between countries. While data quality is a problem (Nöremark et al. 2009a, Lindström et al. 2010), analysis of such data allows for investigation of large scale trends in the contact patterns. One should however be aware of the limitations of the data when drawing conclusions from the analyses.
The aim of this paper, and the method presented, is to use a probabilistic model to investigate how the contact pattern is influenced by distance between holdings, herd sizes and production types. Our aim is also to investigate how the dependence on distance and herd size differs between holdings of different production types. We also aim to show how the analyzed contact pattern can be used in risk assessment.

2. Material and method
2.1 Data
The data used was supplied by the Swedish Board of Agriculture. Due to legal requirements, the analysis was performed on encoded data such that the ID number of specific holdings or names of farmers could not be retrieved. This prohibited the tracing of potentially unexpected contacts. Holdings that were considered to be inactive were removed, as well as holdings that did not have spatial coordinates (see Nöremark et al. 2009a for more details on this). A total number of 3084 holdings and 20231 movements (carried out from July 2005 until June 2006) were included in the analysis. Movements to slaughterhouses were not included in the analysis.
Data included the maximum capacity (i.e. the reported maximum number of animals that could be kept on the premises) of each holding, recorded separately for sows and fattening pigs. If the maximum capacity of a demographic group was missing in the database, it was assumed that the maximum capacity was zero. Such entries were mostly found for holdings with production types that are expected not to have animals of that demographic group and we found that maximum capacity equal to zero was rarely entered in the database. In addition, a previous study (Nöremark et al. 2009b) has shown a better consistency between larger holdings and the entries in the database, indicating that while 0 may not be accurate in some instances, a low rather than a high number is to be expected. Seven production types were included in the study: Sow pool centers, Sow pool satellite, Farrow-to-finish, Nucleus herd, Piglet producer, Multiplying herd and Fattening herd. When the production type is reported by the farmer, the form has an option for free text. Holdings that only had this information entered were placed in a group denoted “Missing information”. Note that when we use this term, we only refer to missing information about the production type and that farmers may still have reported location and herd sizes. For more details on the included production types, the pig farming structure of Sweden and how the data is entered in the database, see Nöremark et al. (2009a) and Lindström et al. (2010).

2.2 Model and parameter estimation
Data include holding production types, and this information is included in the model by a matrix $\mathbf{R}$ of size $n \times K$, where $n$ is the number of holdings and $K$ is the number of production types (including the artificial type Missing information). We denote $R_{fk} = 1$ if holding $f$ has production type $k$ and $R_{fk} = 0$ otherwise, for $k = 1, \ldots, K$ and $f = 1, \ldots, n$. Data also include spatial coordinates of holdings and these are translated into a distance matrix $\mathbf{D}$ of dimensions $n \times n$, where $D_{fg}$ is the Euclidean distance between holdings $f$ and $g$. Herd sizes of pig holdings are measured by the maximum capacity of fattening pigs and sows, and these are denoted $\mathbf{S}_1$ and $\mathbf{S}_2$ (both vectors with $n$ elements), respectively. When we refer to either of these demographic classes we write $\mathbf{S}$, and $S_{ef}$ refers to size $e$ ($e = 1, 2$) of holding $f$. We use notation such that each movement $t$ ($t = 1, 2, \ldots, T$, where $T$ is the number of movements) has a start holding $s_t$ and destination holding $d_t$, and vectors $\mathbf{s}$ and $\mathbf{d}$ (both with $T$ elements) refer to all start and destination holdings.
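
The data structures defined above can be illustrated with a minimal sketch. The coordinates, the reported types and the function names `distance_matrix` and `type_matrix` are hypothetical and only serve to show how $\mathbf{D}$ and $\mathbf{R}$ could be assembled; the original analysis was implemented in MATLAB and its code is not reproduced here.

```python
import math

def distance_matrix(coords):
    """Return the n x n matrix D, with D[f][g] the Euclidean distance between holdings f and g."""
    return [[math.hypot(xf - xg, yf - yg) for (xg, yg) in coords]
            for (xf, yf) in coords]

def type_matrix(reported, n_types):
    """Return the n x K 0/1 matrix R; reported[f] is the set of type indices of holding f."""
    return [[1 if k in types else 0 for k in range(n_types)]
            for types in reported]

# Hypothetical example: three holdings, three production types.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]  # assumed planar coordinates (km)
reported = [{0}, {0, 2}, {1}]                  # holding 1 reports two types
D = distance_matrix(coords)
R = type_matrix(reported, 3)
```

Holding 1 in this toy example has two reported types, which is the situation the weights $\mathbf{v}$ are meant to handle.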

2.2.1 Weight on production types
We want to estimate how contact probabilities depend on the factors herd size, production type and distance between holdings. As in Lindström et al. (2010), we assume that holdings with more than one type will behave as some mixture of each type, and rather than assuming that a holding will behave as an equally weighted mixture of each type, we estimate how much different production types will determine the behavior of a holding. This is estimated with a parameter vector $\mathbf{v}$ with $K - 1$ elements (see below for explanation of this) where $\sum_k v_k = 1$. A high value of $v_k$ indicates that production type $k$ has a large influence on the contact pattern of a holding that has reported this type concurrently with other types.
A holding $f$ is assumed to consist of a proportion $\hat{v}_{fk}$ of each production type $k$, and is determined by $\mathbf{v}$ and $\mathbf{R}$ through

$$
\hat{v}_{fk} =
\begin{cases}
\dfrac{R_{fk} v_k}{\sum_{l \neq m} R_{fl} v_l} & \text{for } k \neq m, \text{ if } R_{fm} = 0 \\
0 & \text{for } k = m, \text{ if } R_{fm} = 0 \\
0 & \text{for } k \neq m, \text{ if } R_{fm} = 1 \\
1 & \text{for } k = m, \text{ if } R_{fm} = 1
\end{cases}
\tag{1}
$$
where $R_{fk} = 1$ if holding $f$ has type $k$ and $R_{fk} = 0$ otherwise. Production type $m$ is an artificial type introduced for holdings with missing information. This is never shared with any other type (as it then would not be missing) and is therefore excluded from $\mathbf{v}$. Hence, $\mathbf{v}$ has $K - 1$ rather than $K$ elements. It is possible to formulate a model where the missing production types of holdings are estimated by writing a joint distribution of parameter estimates and unobserved production types. However, as the farmers of these holdings have chosen not to report any of the seven included production types, we may not assume that these holdings in fact are of any of these types. We rather interpret this group as mainly containing holdings that for different reasons do not fit into any of the listed types. We therefore include them in the analysis as an artificial type and expect the holdings included in the group to be heterogeneous.
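
As an illustration of equation 1, the proportions $\hat{v}_{fk}$ of a single holding could be computed as in the sketch below. The function name `type_proportions`, the example weights and the choice to store $\mathbf{v}$ as a list of length $K$ with an unused entry at position $m$ are assumptions made for this example only.

```python
def type_proportions(R_f, v, m):
    """Equation 1 for one holding.

    R_f: 0/1 list of length K with the reported types of the holding.
    v:   list of length K with the type weights; the entry v[m] is ignored.
    m:   index of the artificial type "Missing information".
    Assumes every holding reports at least one type or is of type m.
    """
    K = len(R_f)
    if R_f[m] == 1:
        # Missing information is never shared with another type.
        return [1.0 if k == m else 0.0 for k in range(K)]
    denom = sum(R_f[l] * v[l] for l in range(K) if l != m)
    return [0.0 if k == m else R_f[k] * v[k] / denom for k in range(K)]

# Hypothetical weights for K = 4 types, with type 3 as "Missing information".
v = [0.5, 0.3, 0.2, 0.0]
mixed = type_proportions([1, 0, 1, 0], v, m=3)    # reports types 0 and 2
missing = type_proportions([0, 0, 0, 1], v, m=3)  # reports no listed type
```

For the mixed holding the proportions are the weights of the reported types rescaled to sum to one, so the type with the larger $v_k$ dominates.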

2.2.2 Dependence on production types
Also as in Lindström et al. (2010), dependence on production type was modeled with a parameter matrix $\mathbf{h}$ of dimensions $K \times K$ with $\sum_{IJ} h_{IJ} = 1$, where a high (low) value of $h_{IJ}$ indicates that movements from production type $I$ to $J$ are common (rare). The estimation of $\mathbf{h}$ takes into account that some production types are more common than others and the elements are referred to as commonness indices. We expand the analysis of Lindström et al. (2010) regarding production types and also give estimates of the absolute number of movements between production types. We refer to this as $\mathbf{Q}$, of dimensions $K \times K$, where $Q_{IJ}$ is the estimated number of movements from type $I$ to $J$, taking into account that herds are often reported to have more than one type. Some production types are reported much more frequently than others and hence the estimates of $\mathbf{Q}$ and $\mathbf{h}$ provide different and complementary insight into the contact pattern and the role of different holdings in a potential disease outbreak.

2.2.3 Herd size dependence
Dependence on sizes $\mathbf{S}_1$ and $\mathbf{S}_2$ was modeled as a power function with parameters $\dot{\mathbf{c}}$ and $\ddot{\mathbf{c}}$, both of dimension $2 \times K$, where $\dot{c}_{ek}$ and $\ddot{c}_{ek}$ ($e = 1, 2$, corresponding to sizes $\mathbf{S}_1$ and $\mathbf{S}_2$, respectively, and $k = 1, \ldots, K$) are the size dependence of type $k$ for sending and receiving contacts, respectively. In the following, we use the notation $\mathbf{c}$ where we wish to refer to either $\dot{\mathbf{c}}$ or $\ddot{\mathbf{c}}$. If $c_{eI} = 0$, there is no size dependence for size $e$, and $c_{eI} < 0$ ($c_{eI} > 0$) indicates a negative (positive) relationship between size and contact probability. For $c_{eI} = 1$, there is approximately a linear relationship such that e.g. twice as many animals results in twice as high a probability of contacts.

2.2.4 Distance dependence
Distance dependence is modeled with a spatial kernel, which may be characterized by its variance ($V$) and kurtosis ($\kappa$), measuring the scale and shape, respectively (Lindström et al. 2008, 2009). We assume a rotationally symmetric distribution and define the 2-D variance (in what follows, this is what we refer to as variance) as the second moment around zero (i.e. raw moment) of the radial distance. Kurtosis is analogously defined as the fourth raw moment divided by the square of the second raw moment, following the suggestion of Clark et al. (1999). Contact probabilities dependent on distance between holdings may differ depending on production types and we therefore estimate $V_{IJ}$ and $\kappa_{IJ}$ (indicating elements of matrices $\mathbf{V}$ and $\boldsymbol{\kappa}$, respectively, of dimensions $K \times K$) for every combination of $I, J$. However, the underlying processes (e.g. economic and social) are not completely different. For such systems it is suitable to use a hierarchical model (Gelman et al. 2004) where the parameters, in this case the elements of $\mathbf{V}$ and $\boldsymbol{\kappa}$, have a hierarchical prior with a set of unknown hyper-parameters. This approach has the benefit that it improves the estimation of parameters where the data is weak, a concept known as “borrowing strength”. If it may be argued a priori that the parameters are not completely unrelated, then the estimation of one parameter may be informed by the estimation of other parameters. If there is in fact little similarity between the parameters, the hierarchical prior will have little influence on the estimates of $\mathbf{V}$ and $\boldsymbol{\kappa}$.
In Lindström et al. (2009) it was shown that the data were better described when movements were modeled as arising from a mixture of distance dependent and mass action mixing (MAM) processes. In that study all other factors were excluded and a single kernel was used to describe all contacts. To simplify the model and reduce the number of parameters in this study we exclude the MAM part. As we here include other factors and let $V_{IJ}$ and $\kappa_{IJ}$ be different for different production types $I, J$, we assume that the model can account for factors that could not be estimated with a single spatial kernel. To test these assumptions we visually compare the predicted and observed movement distances (see below).

2.2.5 Contact probability model
We used a model formulation

$$
P(d_t, s_t \mid \boldsymbol{\theta}) = \sum_I \sum_J P(d_t \mid s_t, I, J, \boldsymbol{\theta}_1)\, P(s_t \mid I, \boldsymbol{\theta}_2)\, P(I, J \mid \boldsymbol{\theta}_3),
\tag{2}
$$
where $\boldsymbol{\theta}_1$, $\boldsymbol{\theta}_2$ and $\boldsymbol{\theta}_3$ are subsets of $\boldsymbol{\theta}$ and refer to particular sets of parameters, yet to be defined. To clarify, we use the notation with $\boldsymbol{\theta}$ here to give a more transparent description of the general outline of the model. Equation 2 should be interpreted such that for movement $t$, the probability of destination holding $d_t$ is conditional on the start holding $s_t$ and the production types of $s$ and $d$, denoted $I$ and $J$ respectively. Start holding $s_t$ is conditional on production type $I$ of the start holding. The joint distribution $P(d_t, s_t \mid \ldots)$ is a mixture distribution and (since holdings may have more than one type) the probability is summed over all types with $\sum_I \sum_J P(I, J \mid \boldsymbol{\theta}_3) = 1$. We use a probability function $P(I, J \mid \boldsymbol{\theta}_3) = P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})$ as introduced in Lindström et al. (2010)

$$
P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R}) = \frac{h_{IJ}\, \hat{N}_I\, \dot{\hat{N}}_{JI}}{\sum_I \sum_J h_{IJ}\, \hat{N}_I\, \dot{\hat{N}}_{JI}},
\tag{3}
$$

where conditionally on $\mathbf{v}$ and $\mathbf{R}$ we define

$$
\hat{N}_I = \sum_f \hat{v}_{fI}, \qquad \dot{\hat{N}}_{JI} = \sum_f \hat{v}_{fJ} \left(1 - \frac{\hat{v}_{fI}}{\hat{N}_I}\right),
\tag{4}
$$

where $\hat{v}_{fk}$ is given by equation 1. The quantities $\hat{N}_I$ and $\dot{\hat{N}}_{JI}$ may be interpreted as measurements of the amount of each production type at a holding, taking into account that holdings may not be of only one type (if more than one type is reported). The quantity $\dot{\hat{N}}_{JI}$ is adjusted to account for exclusion of movements ending up at the same destination as the start holding.
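
Equations 3 and 4 can be traced numerically with a small sketch. The function name `type_pair_probabilities` and the toy values of $\hat{v}$ and $\mathbf{h}$ are hypothetical; the guard for types with $\hat{N}_I = 0$ is an added assumption for robustness and is not taken from the source.

```python
def type_pair_probabilities(v_hat, h):
    """Equations 3 and 4.

    v_hat: n x K list of type proportions (equation 1).
    h:     K x K list of commonness indices.
    Returns the K x K matrix of P(I, J | v, h, R).
    """
    n, K = len(v_hat), len(h)
    N = [sum(v_hat[f][I] for f in range(n)) for I in range(K)]

    def N_dot(J, I):
        # Amount of type J available as destination for a start holding of type I.
        return sum(v_hat[f][J] * (1.0 - v_hat[f][I] / N[I]) for f in range(n))

    w = [[h[I][J] * N[I] * N_dot(J, I) if N[I] > 0 else 0.0
          for J in range(K)] for I in range(K)]
    total = sum(map(sum, w))
    return [[w[I][J] / total for J in range(K)] for I in range(K)]

# Hypothetical example: three holdings, two types, flat commonness indices.
v_hat = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
h = [[0.25, 0.25], [0.25, 0.25]]
P = type_pair_probabilities(v_hat, h)
```

With flat $\mathbf{h}$, movements between different types come out as more probable than movements within a type in this example, because a within-type movement has fewer eligible destinations once the start holding is excluded.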
The distribution of $s_t$ conditional on type $I$ (i.e. $P(s_t \mid I, \boldsymbol{\theta}_2)$ of equation 2) is modeled as

$$
P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}) = \frac{\hat{v}_{sI}\, G(S_{1s}, S_{2s}, \dot{c}_{1I}, \dot{c}_{2I})}{\sum_f \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I})},
\tag{5}
$$

where $G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I})$ is a function describing dependence on sizes $S_{1f}, S_{2f}$ of holding $f$, and $\dot{c}_{1I}, \dot{c}_{2I}$ are the parameters determining the size dependence of sizes $S_{1f}, S_{2f}$, respectively, for production type $I$. As in the FMD model presented in Tildesley et al. (2008), we assume that contact probability dependence on herd size may be modeled as a power function and we write

$$
G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I}) = (S_{1f} + 1)^{\dot{c}_{1I}}\, (S_{2f} + 1)^{\dot{c}_{2I}}.
\tag{6}
$$

We use $S_{1f} + 1$ rather than just $S_{1f}$ to avoid $P(s \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}) = 0$ if $S_{ef} = 0$ for any $e = 1, 2$ (e.g. Fattening herds are expected to have no sows).
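
A minimal sketch of equations 5 and 6 is given below, assuming hypothetical herd sizes and parameter values; the function name `start_probabilities` is not from the source.

```python
def G(S1, S2, c1, c2):
    """Equation 6: power-function dependence on the two herd sizes."""
    return (S1 + 1) ** c1 * (S2 + 1) ** c2

def start_probabilities(v_hat_I, S1, S2, c1, c2):
    """Equation 5: probability of each holding being the start holding, given type I.

    v_hat_I: proportion of type I at each holding (equation 1).
    S1, S2:  maximum capacities of fattening pigs and sows per holding.
    c1, c2:  size-dependence parameters of type I for outgoing movements.
    """
    weights = [vh * G(a, b, c1, c2) for vh, a, b in zip(v_hat_I, S1, S2)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: linear dependence on fattening pigs, none on sows.
probs = start_probabilities(
    v_hat_I=[1.0, 1.0, 0.5], S1=[99, 199, 0], S2=[0, 0, 0], c1=1.0, c2=0.0
)
```

In this example the second holding has twice the value of $S_1 + 1$ of the first and is therefore twice as likely to be the start holding, while the third holding, with both sizes recorded as zero, keeps a small but non-zero probability.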
The probability distribution of $d_t$ conditional on types $I, J$ and start holding $s_t$ (i.e. $P(d_t \mid s_t, I, J, \boldsymbol{\theta}_1)$ of equation 2) is dependent on both sizes, $\mathbf{S}$, and distances, $\mathbf{D}$, and given by

$$
P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_J, \kappa_{IJ}, V_{IJ}) = \frac{\hat{v}_{dJ}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f \neq s} \hat{v}_{fJ}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})},
\tag{7}
$$

where the sum in the denominator runs over all holdings $f \neq s$. The destination of a movement may not be the same holding as the start, i.e. $P(d_t = s_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_J, \kappa_{IJ}, V_{IJ}) = 0$. Recall that $\ddot{c}$ is the equivalent of $\dot{c}$ but used for modeling of contact probability of incoming movements. The function $F$ is used for modeling the dependence on between-herd distance. As in Lindström et al. (2009), a generalized normal distribution is used. We write

$$
F(D_{fg}, \kappa_{IJ}, V_{IJ}) = e^{-\left(\frac{D_{fg}}{a}\right)^{b}},
\tag{8}
$$
where the relationships between $a$, $b$ and $V$, $\kappa$ are given for two-dimensional kernels in Lindström et al. (2008) as

$$
V = a^2\, \frac{\Gamma(4/b)}{\Gamma(2/b)}, \qquad \kappa = \frac{\Gamma(6/b)\, \Gamma(2/b)}{\left(\Gamma(4/b)\right)^2}.
\tag{9}
$$

For continuous functions, equation 8 is normalized by $2\pi a^2\, \Gamma(2/b)/b$. This cancels out in equation 7 and normalization is instead performed by summation of the functions over all possible destination holdings (see equation 7). When incorporated in this manner, $V$ is the parameter that will have the highest influence on the disease spread dynamic (Lindström et al. In press) but $\kappa$ also provides important information. Also, it is difficult to determine $\kappa$ a priori and erroneous assumptions may result in erroneous estimations of $V$.
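
Since the kernel is parameterized by $V$ and $\kappa$ while equation 8 is written in terms of $a$ and $b$, equation 9 has to be inverted. One possible way to do this, shown here only as a sketch, is bisection on $b$, using the fact that $\kappa$ in equation 9 decreases as $b$ increases. The function names, the search bounds and the tolerance are assumptions for this example; the source does not state how the inversion was done.

```python
import math

def kurtosis_from_b(b):
    """Kurtosis of the 2-D kernel as a function of the shape parameter b (equation 9)."""
    return math.exp(math.lgamma(6 / b) + math.lgamma(2 / b) - 2 * math.lgamma(4 / b))

def kernel_params(V, kappa, lo=0.05, hi=50.0, tol=1e-12):
    """Invert equation 9 for (a, b) by bisection on b (assumed search interval [lo, hi])."""
    if not kurtosis_from_b(hi) < kappa < kurtosis_from_b(lo):
        raise ValueError("kappa is outside the range covered by the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kurtosis_from_b(mid) > kappa:
            lo = mid  # kurtosis too high, so b must be larger
        else:
            hi = mid
    b = 0.5 * (lo + hi)
    a = math.sqrt(V * math.exp(math.lgamma(2 / b) - math.lgamma(4 / b)))
    return a, b

def F(distance, a, b):
    """Equation 8: unnormalized distance kernel."""
    return math.exp(-((distance / a) ** b))

# A 2-D Gaussian kernel has kappa = 2, which corresponds to b = 2 and V = a^2.
a, b = kernel_params(V=4.0, kappa=2.0)
```

The Gaussian case gives a convenient check: with $b = 2$, equation 9 reduces to $V = a^2$ and $\kappa = \Gamma(3)\Gamma(1)/\Gamma(2)^2 = 2$.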
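Writing the full formulation of the joint probability distribution of $d_t$ and $s_t$ (equation 2) we get

$$
P(d_t, s_t \mid \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{c}, \boldsymbol{\kappa}, \mathbf{V}) = \sum_I \sum_J P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_J, \kappa_{IJ}, V_{IJ})\, P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}, \dot{\mathbf{c}}_I)\, P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R}).
\tag{10}
$$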
To improve estimation of parameters $\boldsymbol{\kappa}$ and $\mathbf{V}$ we implement hierarchical priors as described in Appendix A. In particular, this allows for improved estimation of parameters where the data is weak, i.e. where few movements are recorded between the production types. Estimation of parameters is performed with MCMC, and an indicator variable is introduced to aid computation (see Appendix B). This indicator variable is also used to calculate the posterior distribution of the number of transports between production types, denoted $\mathbf{Q}$, of dimensions $K \times K$, where $Q_{IJ}$ refers to the estimated number of movements from production type $I$ to $J$. Note that $\mathbf{Q}$ is introduced because many holdings have several production types, and therefore the exact numbers of movements between holdings of different production types are unobserved.
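
To make the destination term of equation 10 (equation 7) concrete, the sketch below evaluates it for one start holding and one pair of types. All inputs are hypothetical, the kernel parameters $a$ and $b$ are assumed to have been obtained from $V_{IJ}$ and $\kappa_{IJ}$ beforehand, and the function name `destination_probabilities` is not from the source.

```python
import math

def destination_probabilities(s, v_hat_J, S1, S2, c1, c2, D, a, b):
    """Equation 7: probability of each holding being the destination.

    s:       index of the start holding (excluded as destination).
    v_hat_J: proportion of destination type J at each holding (equation 1).
    S1, S2:  maximum capacities per holding.
    c1, c2:  size-dependence parameters of type J for incoming movements.
    D:       n x n distance matrix.
    a, b:    kernel parameters of equation 8 for the type pair (I, J).
    """
    weights = []
    for f in range(len(v_hat_J)):
        if f == s:
            weights.append(0.0)  # a movement may not end where it started
            continue
        size = (S1[f] + 1) ** c1 * (S2[f] + 1) ** c2
        kernel = math.exp(-((D[s][f] / a) ** b))
        weights.append(v_hat_J[f] * size * kernel)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three holdings on a line, no size dependence.
D = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
probs = destination_probabilities(
    s=0, v_hat_J=[1.0, 1.0, 1.0], S1=[0, 0, 0], S2=[0, 0, 0],
    c1=0.0, c2=0.0, D=D, a=1.0, b=1.0
)
```

Because normalization is done by summing over the candidate destinations, the continuous normalizing constant of the kernel is never needed.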

2.2.6 Comparing observed and predicted movement distances
As we altered the model of Lindström et al. (2009) and removed the MAM part of the spatial kernel, we compared the observed movement distances with the predictions under the model presented above. Note that while this is a simplification of the kernel function, we are adding complexity by using a different kernel for every combination of production types of the sending and receiving holding. The predicted distances were obtained by generating $N$ animal movements with the model described in section 2.2.1. Two thousand replicates were generated, each parameterized by a random draw from the posterior distribution (based on the MCMC output), and the mean cumulative distribution was calculated and compared to the cumulative distribution of observed distances.
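
The comparison described above amounts to averaging empirical cumulative distributions over replicates. A minimal sketch follows, assuming the predicted distances of each replicate are already available as lists and that the curves are evaluated on a common grid of distances; both function names and the toy numbers are hypothetical.

```python
from bisect import bisect_right

def ecdf(distances, grid):
    """Empirical cumulative distribution of `distances`, evaluated at each grid point."""
    xs = sorted(distances)
    return [bisect_right(xs, g) / len(xs) for g in grid]

def mean_cdf(replicates, grid):
    """Pointwise mean of the empirical CDFs of several replicates."""
    curves = [ecdf(r, grid) for r in replicates]
    return [sum(col) / len(curves) for col in zip(*curves)]

# Hypothetical distances (km): two predicted replicates and one observed set.
grid = [2.0, 4.0]
predicted = mean_cdf([[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]], grid)
observed = ecdf([1.0, 2.0, 3.0, 4.0], grid)
```

Plotting `predicted` against `observed` over a fine grid would give the kind of visual comparison referred to in the text.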

2.3 Simulation
To demonstrate how the analyzed contact pattern may be used for risk assessment we performed a simplistic simulation with holdings in the analyzed database as infective units. The aim was to study the effect of the observed contact pattern and not provide estimates for any specific disease. All other contacts between holdings were excluded as well as intra-herd dynamics and recovery. Hence, an infected holding remained infectious for the entire time of the simulation. Simulations were initiated with one randomly infected holding and we simulated posterior predictive movements with probabilities given by equation 10 and parameterized the model with random draws from the posterior distribution. If a movement was simulated from an infected holding A to a susceptible holding B, the latter was assumed to become instantaneously infected and any subsequent movements from B to a susceptible holding C were assumed to result in transmission. Movements between two already infected holdings were not assumed to generate a new infection.
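
The transmission rules above can be sketched as follows. The sketch takes an already generated, ordered list of movements as input, whereas in the study the movements are drawn from equation 10 with posterior parameter draws; the function name `count_infections`, the bookkeeping of infection degrees and the toy movements are assumptions for illustration.

```python
def count_infections(movements, seed, window):
    """Track transmission along an ordered list of (start, destination) movements.

    Returns {holding: [first, second, third]}: the number of infections of each
    degree that occur within `window` movements after the holding was infected.
    """
    infected_at = {seed: -1}  # movement index at which each holding was infected
    infector = {seed: None}   # which holding infected which
    counts = {seed: [0, 0, 0]}
    for t, (s, d) in enumerate(movements):
        if s in infected_at and d not in infected_at:
            infected_at[d] = t
            infector[d] = s
            counts[d] = [0, 0, 0]
            # Credit the new infection to up to three ancestors in the chain.
            ancestor = s
            for degree in range(3):
                if ancestor is None:
                    break
                if t - infected_at[ancestor] <= window:
                    counts[ancestor][degree] += 1
                ancestor = infector[ancestor]
    return counts

# Four weeks of movements at a constant rate: 20231 movements per year * 4 / 52.
window = round(20231 * 4 / 52)

# Hypothetical ordered movements; holding "A" is the initially infected holding.
movements = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("B", "E")]
counts = count_infections(movements, seed="A", window=window)
```

In the toy example the movement from D back to A generates no infection, since both holdings are already infected.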
We ran the simulation 1000 times and simulated $\check{T} = 40462$ movements for every replicate. For each replicate and infected holding we recorded the number of first, second and third degree infections. These are defined such that if holding A infects B, this is a first degree infection of A. If B subsequently infects C, this is a second degree infection of A and if C later infects D, this is a third degree infection of A. The number of infections caused by every infected holding was recorded after 1556 movements, which, assuming a constant rate of movements, corresponds to a period of four weeks (as 20231 movements were analyzed for the one-year period). The analysis presented in Nöremark et al. (2009a) suggests that there is in fact little seasonal variation in the number of pig movements. Long distance transmissions are often of particular interest and therefore we also recorded the number of first degree infections caused by movements longer than 10, 100 and 500 km.
We analyzed the results using Generalized Linear Models (GLM). Since the response variable was measured as the number of infections, which are non-negative integers (0, 1, 2, ...), we used a Poisson error distribution with a log link function. The response variable $\mathbf{Y}$ was a vector of $W$ elements where $W = \sum_r w_r$ and $w_r$ is the total number of holdings for replicate $r$ ($r = 1, \ldots, 1000$) infected earlier than movement $\check{T} - 1556$. Holdings infected later were excluded as the number of infections after 1556 consecutive movements could not be recorded. For most analyses we focus on production types only and for this we used a dummy variable $\mathbf{X}$, of dimension $W \times K$, as predictor. The elements $X_{\acute{f}k} = 1$ if holding $\acute{f}$ ($\acute{f} = 1, \ldots, W$) was reported to have production type $k$, and 0 otherwise.
To demonstrate the effect of the maximum capacity we divided the holdings into classes, denoted small, medium and large, respectively. We used the definitions given in Nöremark et al. (2009b) such that a holding is small if the maximum capacity of sows < 15 and slaughter pigs < 300, large if sows > 299 or slaughter pigs > 4999 and is otherwise medium. Hence, $3K$ combinations of production types and size classes were obtained and similarly to $\mathbf{X}$, we used a dummy variable $\mathbf{Y}$, of size $W \times 3K$, as predictor where $Y_{\acute{f}z} = 1$ if holding $\acute{f}$ is reported to belong to combination $z$ ($z = 1, \ldots, 3K$). Note that while a holding may have more than one production type, it will always belong to exactly one size class in this analysis of the simulation. No holding with “Missing information” was classified as large and so this combination was excluded from the analysis.
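
The size classes defined above translate directly into a small function. The function name `size_class` is hypothetical; the thresholds are those quoted in the text.

```python
def size_class(sows, slaughter_pigs):
    """Classify a holding by maximum capacity, using the thresholds quoted in the text."""
    if sows > 299 or slaughter_pigs > 4999:
        return "large"
    if sows < 15 and slaughter_pigs < 300:
        return "small"
    return "medium"

# Hypothetical holdings given as (sows, slaughter pigs).
examples = [(10, 100), (15, 0), (0, 300), (300, 0), (0, 5000)]
classes = [size_class(sows, pigs) for sows, pigs in examples]
```

Since the conditions for small and large cannot hold at the same time, every holding falls into exactly one class.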
Since we simulate a large number of replicates and each replicate may include many infected holdings, significance is less relevant. We instead look at the coefficients of the parameters given by the GLM. A large positive value should be interpreted such that if an infected holding is reported to belong to a production type (or type and size class in the analysis where the latter is included), it is expected to generate a large number of newly infected holdings (of the analyzed degree). A large negative value means that holdings with the type are expected to generate few infections.
All programs for analysis and simulation were written and run in MATLAB 7.8.

3. Results
3.1 Parameter estimates of contact probabilities
3.1.1 Weight of production types, $\mathbf{v}$
Table 1 shows estimates of $\mathbf{v}$, modeling dominance of production types in determining the contact pattern of a holding. The highest value was estimated for Sow pool centers and the lowest for Fattening herds.

3.1.2 Movements between production types, $\mathbf{h}$ and $\mathbf{Q}$
Table 2 lists estimated values for the most common movements, defined by either $\mathbf{h}$ or $\mathbf{Q}$. The estimated values of the commonness indices, $\mathbf{h}$, closely resembled the estimates given in Lindström et al. (2010). The five highest mean estimates were found for (in decreasing order) movements from Multiplying herds to Sow pool centers, Nucleus herds to other Nucleus herds, Nucleus herds to Multiplying herds, Sow pool centers to Sow pool satellites and Sow pool satellites to Sow pool centers. Estimates of $\mathbf{Q}$ showed that animals were most frequently moved from (in decreasing order) Piglet producers to Fattening herds, Sow pool satellites to Fattening herds, Multiplying herds to Piglet producers, Farrow-to-finish to Fattening herds, Sow pool center to Sow pool satellite, Sow pool satellite to Sow pool center and Multiplying herds to Farrow-to-finish. We list the top seven rather than five to avoid the false connotation that movements from Sow pool center to Sow pool satellite are more frequent than from Sow pool satellite to Sow pool center. These were ranked as number 5 and 6 and this ranking only differed by one movement. We also include estimates for the seventh highest, Multiplying herds to Farrow-to-finish, as this showed the highest estimates for kernel variance (see below).
16

3.1.3 Size dependent parameters, 𝒄
Table 3 shows estimates of the parameters 𝒄, which determine how the maximum capacity of the holdings
influences the contact pattern. Most estimates (23 of 32) showed a clear positive relationship
between size and contact probabilities (i.e. 𝑐 = 0 is not included in the 95% central credibility interval),
while 7 estimates showed a negative relationship. Of the negative relationships, 3 were found for estimates
of the artificial type “Missing information”.
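
The criterion used above (whether 𝑐 = 0 falls inside the 95% central credibility interval) can be illustrated with a minimal sketch. It assumes that posterior draws of a single size parameter are available as a list of numbers; the function names and the synthetic draws are hypothetical and are not the estimates reported in Table 3.

```python
import random


def central_credible_interval(samples, level=0.95):
    """Return the (lower, upper) central credible interval from posterior draws."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def classify_relationship(samples, level=0.95):
    """Label a size parameter 'positive', 'negative' or 'unclear'."""
    lower, upper = central_credible_interval(samples, level)
    if lower > 0.0:
        return "positive"   # zero lies below the interval
    if upper < 0.0:
        return "negative"   # zero lies above the interval
    return "unclear"        # zero is inside the interval


# Synthetic stand-ins for MCMC output (assumption: NOT the values of Table 3).
rng = random.Random(1)
draws_positive = [rng.gauss(0.8, 0.1) for _ in range(5000)]
draws_unclear = [rng.gauss(0.05, 0.2) for _ in range(5000)]
print(classify_relationship(draws_positive))
print(classify_relationship(draws_unclear))
```

The interval is taken from empirical quantiles of the draws, which is one common way to summarize MCMC output; other interval definitions would need a different function.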

3.1.4 Distance related parameters, 𝑽 and 𝜿
We only include the estimates of 𝑽 and 𝜿 for the most common movements (given above) and list these in
Table 2. The variance of the spatial kernel, 𝑉, is a measure of the scale at which contacts occur, and a high
value of 𝑉𝐼𝐽 indicates that long distance movements are common from holdings of production type 𝐼 to 𝐽.
The highest estimate was found for movements from Multiplying herds to Farrow-to-finish holdings and
the lowest was found for movements between Sow pool centers.
Kernel kurtosis, 𝜿, is a measure of the variation in movement distances. A high value indicates that there
are many short distance movements but concurrently many at long distance, and a low value indicates that
movement distances are more uniform. In Lindström et al. (2009), where differences in production types
were excluded, the kernel kurtosis was estimated at 32.6 with 95% central credibility interval (29.2, 36.4).
Of the 64 estimates of 𝜅𝐼𝐽 in this study, 49 showed strong evidence for lower values (non-overlapping
central credibility intervals) and 2 showed strong evidence for higher values.
Figure 2 illustrates the cumulative distribution of observed and predicted movement distances.

3.2 Simulation of disease transmission
Figure 3 shows the coefficients estimated by the GLMs as described in section 2.3. Large positive values
indicate that herds with this characteristic are expected to generate a large number of new infections, and
negative values indicate that a herd is expected to generate few transmissions. Error bars were small and are
omitted from the figure for clarity. Sow pool centers, Nucleus herds, Sow pool satellites and Fattening
herds were estimated to have an increased risk (relative to other production types) of generating a large
number of new infections when higher degree infections are accounted for (Figure 3a). However, apart from the
herds in the group Missing information, the Fattening herds had the lowest coefficient also for higher
degrees and are still considered low risk herds. Piglet producers were estimated to generate fewer
infections when accounting for higher degree infections. Sow pool centers and satellites as well as
holdings of the artificial type Missing information were estimated to generate fewer new infections when
studying long distance transmissions (Figure 3b). Multiplying herds and Farrow-to-finish holdings
were estimated to have a higher risk of long distance transmission.
For most production types, larger holdings were estimated to have a higher risk of causing new infections
(Figure 3c). No holdings with type Missing information were classified as Large, but the risk of infecting
other holdings was estimated to be lower for Medium than for Small holdings. Also, Large Sow pool centers
had a lower coefficient than Medium holdings reported with the same production type. When inspecting
the data we found that no Large holding reported with production type Sow pool center was reported as
only this type.

4. Discussion
In this paper we have presented a hierarchical Bayesian model for analysis of contacts between holdings,
applied to movements of pigs in Sweden. The analysis revealed a highly heterogeneous contact structure.
Holdings of different production types were estimated to differ in the number of contacts and in which
other production types the contacts occurred with. This was demonstrated both in the estimates of the
commonness indices, 𝒉, and in the estimated absolute number of movements, 𝑸. These two measures provide
somewhat different information about the contact pattern. Whereas 𝒉 takes into account how frequent the
production types are, 𝑸 does not. For example, Farrow-to-finish herds and Fattening herds are both
common production types and while there are overall many transports from herds of the former type to
herds of the latter (high value of 𝑸), transports between individual holdings of these types are rare (low
value of 𝒉).

Posterior distributions of 𝒉 and 𝒗 were similar to the estimates given in Lindström et al. (2010). The Sow
pool centers were less dominant in determining the contact pattern of a holding when the model was
extended. However, it was still the dominant production type when reported concurrently with other types.
The posterior mean of 𝒗 for Sow pool centers was more than four times larger than that of the second most
dominant type, Sow pool satellite, and 81 times larger than that of the least dominant type, Fattening herd.

Differences were also shown in how the probability of contacts depends on the maximum capacity of
holdings (estimated with 𝒄), although most production types showed a positive relationship (negative
values were generally small) between maximum capacity and the probability of both incoming and
outgoing movements. This was found for both demographic groups (sows and fattening pigs). Without
proper understanding of the production types and the structure of the industry, the negative values of 𝒄
may seem unexpected. Negative estimates indicate that larger holdings have fewer contacts, whereas one might
expect larger holdings to be more active and hence to trade more animals. For instance, Farrow-to-finish
holdings showed a slightly lower probability of both incoming and outgoing movements with larger
maximum capacity of fattening pigs. This is however not surprising, as large herds in this category may be
expected to produce piglets that are kept in the herd until fattening, and thus the main movement would be
animals sent to slaughter (shipments to slaughterhouses were not included in this study). For Nucleus herds
a large negative value was found for incoming
movements depending on the maximum capacity of sows. This might be because large breeding
herds mainly introduce new genetic material in the form of semen and rarely buy live animals, as part of
their biosecurity policy. Other inconsistencies may be a result of the reporting system. As the production
type is reported by the farmer, and no proper definition of the production types is provided, it leaves
room for interpretation. Previous studies (Nöremark et al. 2009b) have reported that farmers with small
herds had a different interpretation of their type, and, in particular, it was found that they sometimes
regarded themselves as breeders even though only a few sows were kept on the premises. Moreover,
there is no requirement for updating the information in the database if the production type is changed.
Thus, the information on production type may be incorrect in the database. While Nucleus herds (by
common definition) generally receive few animals but send many, this may not be the case for small herds
(here mainly few sows). Nucleus herds are also expected to have a low maximum capacity of slaughter
pigs, and herds with large numbers for this demographic group might also behave differently, which may
explain the large positive estimates of 𝒄̈𝟏 for Nucleus herds (Table 3). The holdings in the group Missing
information showed a negative relationship between the number of contacts and the maximum capacity in
3 out of 4 estimates. We believe this was a result of the heterogeneous nature of this group, which contains
all holdings not reported to belong to any of the other seven types.

The analysis of 𝑽 and 𝜿 also showed considerable differences in how contact probability is influenced by
the distance between holdings. Comparing the estimates of 𝜿 to the estimates of Lindström et al. (2009),
where production type differences were ignored, we found that the kernel kurtosis for movements between
production types was generally lower. This supports the interpretation of kurtosis as a measure of the
heterogeneity of the distance related processes resulting in movements between holdings. Also, by
including production type differences we found a good fit between observed and predicted movement
distances (Figure 2). Of the more common types (i.e. high values of 𝒉 or 𝑸 listed in Table 2), the largest
posterior mean of 𝜿 was found for movements from Multiplying herds to Farrow-to-finish holdings. High
kurtosis was also found for movements from Farrow-to-finish to Fattening herds. This may be interpreted
as indicating that the contacts between these types are the result of heterogeneous processes. This may partly be
explained by the fact that in Farrow-to-finish herds with a limited capacity for fattening pigs, some piglets
are sold to fattening herds, while Farrow-to-finish herds with larger fattening units keep all their piglets.
Thus, it is not only the herd size but the relative size of the piglet producing and fattening units in the herd
that affects the contact pattern of this herd category. Moreover, there is a trend towards specialized
production units in pig farming and some herds registered as Farrow-to-finish may have changed their
production into either piglet production or fattening pigs without this being recorded in the database.

Movements from Multiplying herds were estimated to have high variance (𝑽), with the highest estimate of
the study found for Multiplying herds to Farrow-to-finish. This indicates that long distance movements are
common compared to other production types, and from a disease transmission perspective, Multiplying
herds may cause long distance transmission. Movements between Sow pool centers and satellites were
found to have low estimates of 𝑽, indicating that while many movements occur between these types (both
in absolute number and relative to their abundance), these movements are of relatively short distance.

Of the two parameters used to model distance dependence, 𝑽 and 𝜿, the former is the main determinant of
disease spread dynamics (Lindström et al. In press). Extrapolating the kernel variance estimates to
implications for disease transmission, we may expect that an infected Sow pool center, for example, will
cause few long distance transmissions, while a Multiplying herd (if infected) may rapidly increase the range of
an emerging disease. This was also found in the simulation study. The coefficients of Sow pool centers
(Figure 3b) decreased with distance but increased for Multiplying herds. Coefficients also increased for
Farrow-to-finish holdings, which had a high estimated value of 𝑽 for the main contact type, Fattening
herds. Generally we expect contacts between types estimated with a high 𝑽 to be particularly important in
later stages of outbreaks. When local susceptibles are depleted (due to becoming infected from more local
transmission), long distance contacts may spark new infection where depletion has not yet occurred. This
dynamic was found e.g. in the UK 2001 FMD outbreak (Keeling et al. 2001).

Matthews and Woolhouse (2005) argued that animal markets acted as super spreaders in this outbreak.
Such markets are rare in Sweden, and identification of possible super spreaders in the system may instead
focus on holdings of different production types. However, by international standards animal farming in
Sweden is less intensive, and movements are less frequent. Our analysis of holdings as potential super spreaders
should therefore be interpreted as relative to other holdings in the system. Figure 3a shows how the
potential for generating new infections changes with the infection degree for the different production types.
This is mainly a result of the estimates of 𝒉 and 𝑸, and the largest increase was found for Nucleus herds.
These mainly move animals to other Nucleus herds and Multiplying herds, which both in turn have many
outgoing contacts. Piglet producers showed a lower potential (compared to other production types) of
generating new infections when looking at higher degree infections. The general pattern is however quite
similar for different infection degrees, and we may conclude that production types with high estimates of
𝒉 for outgoing movements may act as super spreaders. For many diseases where the time between
infection and first symptoms is short, second (and consequently third) degree transmission may be
prevented by movement restrictions. While early detection is always crucial, our results suggest that its
importance should be emphasized even more for some production types, such as Nucleus herds and Sow
pool centers.

Further, the analysis of the simulation study showed that for most production types, infected holdings in
the larger size classes were likely to generate more transmissions. Hence, we may conclude that larger
holdings generally have higher potential to act as super spreaders. Exceptions were found for holdings
with Missing information (which is expected due to the negative relationship between maximum capacity
and contact probability, see above) and Sow pool satellites. The latter showed a decrease in the coefficient
given by the GLM for Large holdings. We believe this is a result of the fact that this size class contained
no holdings reported with only this production type, and the number of transmissions was largely
explained by the coexisting production types.

Using databases of holdings and animal movements to estimate parameters allows for assessment of large
scale patterns. Also, unlike qualitative studies, where inference is made from a few handpicked holdings
checked for consistency, the parameters are estimated from the same type of data that may be used for
outbreak simulations. If, for example, parameters are estimated from 100 holdings with every trait
checked and edited in great detail, they may not be reliably used for modeling contact data for holdings
that have not been checked in the same way. However, erroneous and dubious reports pose a problem. In
order to provide better estimates, the data quality needs to be improved. Better guidance to farmers, as well
as a requirement for regular updates of recorded information, may provide more reliable data. In particular,
the interpretation of production type is of great importance for work such as this study. Production type
has a high influence on the contact patterns and could be of great use in risk estimation if the data are
reliable. Working with data from central databases always carries a risk of erroneous reports affecting the
results. Using the same data as in this study, Lindström et al. (2010) reported unexpectedly many
movements between Sow pool centers, a highly unlikely event and probably due to deficiencies in the
recorded data on production type. As these are rare production types but with many movements, they are
particularly sensitive to erroneous entries in the database and this may have affected the results.

Network analysis is another common approach to study animal movement contacts based on database
entries. This commonly involves measurements that capture the overall structure, and conclusions about
disease spread are then drawn from these measures (see Dubé et al. 2009 and references therein). The
method we have presented here can be seen as a lower level analysis and uses a set of parameters to
capture the underlying process of between herd contacts that ultimately results in the higher level structure
captured by network measurements. For instance, our results show that generally holdings with larger herd
sizes both send and receive more animals and hence would, in a network context, have both higher in- and
outdegree (i.e. number of ingoing and outgoing links). Further, if ℎ𝐼𝐽 is generally large for all production
types 𝐽 (𝐼), then holdings of type 𝐼 (𝐽) are expected to have a high outdegree (indegree). An advantage of
our analysis is that we may also address secondary (and higher order) infections and, as shown in Figure 3a,
this can change the picture of which holdings are to be considered potential super spreaders.
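
For readers less familiar with the network terms used above, the following minimal sketch shows how indegree and outdegree could be computed from a list of movement records. The record layout (sender, receiver) and the holding identifiers are hypothetical, and counting repeated movements between the same pair as one link is an assumption of this sketch rather than a choice made in the study.

```python
from collections import defaultdict


def degrees(movements):
    """Compute (indegree, outdegree) per holding from (sender, receiver) records.

    Repeated movements between the same ordered pair count as a single link.
    """
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for sender, receiver in movements:
        out_links[sender].add(receiver)
        in_links[receiver].add(sender)
    holdings = set(out_links) | set(in_links)
    return {h: (len(in_links[h]), len(out_links[h])) for h in holdings}


# Hypothetical movement records; the holding identifiers are made up.
movements = [("A", "B"), ("A", "C"), ("A", "B"), ("B", "C"), ("D", "A")]
deg = degrees(movements)
for holding in sorted(deg):
    indegree, outdegree = deg[holding]
    print(holding, "indegree:", indegree, "outdegree:", outdegree)
```

Because the degree is counted from observed links only, it grows with the length of the observation period, which is the time scale dependence discussed later in this section.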

The distance dependence parameters, in particular 𝑽 (Håkansson et al. 2010), also relate to some
commonly used network measurements. Various types of centrality indices are frequently used in network
analyses. These capture how important nodes are for the overall connectivity of the network (Wasserman
and Faust 1994). Holdings with production types with high probability of long distance movements (large
values of 𝑽) have the ability to connect otherwise distant (in terms of number of links) holdings and would
have a high centrality. A strong spatial component, i.e. low values of 𝑽, will instead result in highly
clustered networks (Keeling 1999).

With Bayesian inference it is straightforward to incorporate uncertainties in the parameter estimates. The
more movements we include, the smaller the credibility intervals. A disadvantage of network analysis is
that a single measure is presented, and this measure depends largely on the number of movements
analyzed, hence on the time scale chosen. Whereas network analysis considers an observed animal
movement between two holdings as a fixed link, our approach instead considers this as a random event,
albeit with different probabilities. Another advantage of analysis at this lower level is that the
parameters may be directly incorporated in explicit simulations of outbreaks. Here we have presented a
simplistic simulation model to demonstrate how the parameters translate to predictions of disease spread.
A more realistic but far more complex simulation model of a disease outbreak should include other
relevant contact types as well as disease specific parameters, such as incubation time, recovery rate and
intra-herd dynamics.

An advantage of network analysis is however its accessibility. At present, there are numerous software
packages available which allow for straightforward analyses of data. The method we have presented here
is in comparison computationally heavy and requires some basic knowledge about MCMC techniques.

It should also be stressed that, while the model presented here is cumbersome in the number of
parameters, there are further aspects of the contact structure that are not included. Reoccurring contacts
between holdings may be expected, and in particular this is true for the holdings in the Sow pool system.
A low variance of the spatial kernel, as reported for contacts between Sow pool centers and satellites,
results in a high probability of contacts with nearby holdings, and a high rate of reoccurring contacts is
expected to occur in the simulation study. However, since recurrence is not incorporated explicitly, we believe this
may have caused an overestimation of the number of infections caused by Sow pool centers and satellites
in the simulation study. Recurrent contacts are expected to influence disease spread dynamics and while it
requires estimation of additional parameters, this may be a salient extension of the model.

While the contact model presented may be improved, we believe that many of the relevant features are
included. If data on other contacts are available, the method may also be applied to these. Using
generalized measures of contact probabilities (such as the parameters of the presented contact model)
allows for comparison, both between different holdings and between different types of contacts.
Also, if similar analyses are applied to data from other countries, comparisons of parameters may inform us
of differences in the contact structure. We also believe that the model may be used for risk assessment as a
complement to classic methods, but there is a need for better data quality for reliable inference. Yet, an
analysis such as the one presented may also guide towards increased data quality in the future.

5. Conclusion
In this study we have analyzed live animal movements between pig holdings and found that the contact
pattern is highly heterogeneous. We found that generally, but not always, a positive relationship exists
between the maximum capacity of a holding and the number of contacts.
Describing distance dependence with a spatial kernel and analyzing its characteristics provides valuable
information about the contact pattern between holdings and is a main feature in predicting disease spread.
Hence the more detailed knowledge gained by the methodology presented may improve both understanding
and predictive power. We found that the distance dependence of the probability of contacts between holdings
was influenced by the production types of the start and end holdings.
Heterogeneous contact patterns, with some holdings likely to act as super spreaders, and differences in the
probability of long distance contacts are expected to cause stochastic dynamics of a disease outbreak where
animal movements are important for the transmission.

Conflict of interest
We have no conflict of interest.

Acknowledgement
We thank the Swedish Board of Agriculture for supplying the data and the Swedish Civil Contingencies
Agency for funding.

References
Bigras-Poulin, M., Thompson, R.A., Chriel, M., Mortensen, S., Greiner, M., 2006. Network analysis of
Danish cattle industry trade patterns as an evaluation of risk potential for disease spread. Prev. Vet. Med.
76, 11–39.
Boender, G.J., Meester, R., Gies, E., De Jong, M.C.M., 2007. The local threshold for geographical spread
of infectious diseases between farms. Prev. Vet. Med. 82, 90–101.
Brennan, M.L., Kemp, R., Christley, R.M., 2008. Direct and indirect contacts between cattle farms in
north-west England. Prev. Vet. Med. 84, 242–260.
Casella, G., George, E.I., 1992. Explaining the Gibbs sampler. Am. Stat. 46, 167–174.
Clark, J.S., Silman, M., Kern, R., Macklin, E., HilleRisLambers, E., 1999. Seed dispersal near and far:
generalized patterns across temperate and tropical forests. Ecology 80, 1475–1494.
Chib, S., Greenberg, E., 1995. Understanding the Metropolis-Hastings algorithm. Am. Stat. 49, 327–335.
Dickey, B.F., Carpenter, T.E., Bartell, S.M., 2008. Use of heterogeneous operation-specific contact
parameters changes predictions for foot-and-mouth disease outbreaks in complex simulation models. Prev.
Vet. Med. 87, 272–287.
Dubé, C., Ribble, C., Kelton, D., McNab, B., 2009. A review of network analysis terminology and its
application to foot-and-mouth disease modelling and policy development. Transbound. Emerg. Dis. 56,
73–85.
Fan, Y., Dortet-Bernadet, J.-L., Sisson, S.A., 2010. A note on Bayesian curve fitting via auxiliary
variables. J. Comput. Graph. Stat. 19, 626–644.
Fèvre, E.M., Bronsvoort, B.M. de C., Hamilton, K.A., Cleaveland, S., 2006. Animal movements and the
spread of infectious diseases. Trends Microbiol. 14, 125–131.
Gamerman, D., Lopes, H.F., 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian
Inference, second ed. CRC Press, Chapman & Hall.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 2004. Bayesian Data Analysis, second ed. Chapman &
Hall/CRC (Chapter 18).
Hawkes, C., 2009. Linking movement behaviour, dispersal and population processes: is individual variation
a key? J. Anim. Ecol. 78, 894–906.
Håkansson, N., Jonsson, A., Lennartsson, J., Lindström, T., Wennergren, U., 2010. Generating structure
specific networks. Adv. Complex Syst. 13, 239–250.
Kao, R.R., Green, D.M., Johnson, J., Kiss, I.Z., 2007. Disease dynamics over very different time-scales:
foot-and-mouth disease and scrapie on the network of livestock movements in the UK. J. R. Soc. Interface
4, 907–916.
Keeling, M.J., 1999. The effects of local spatial structure on epidemiological invasions. Proc. R. Soc.
London B 266, 859–869.
Keeling, M.J., Woolhouse, M.E., Shaw, D.J., Matthews, L., 2001. Dynamics of the 2001 UK foot and
mouth epidemic: stochastic dispersal in a dynamic landscape. Science 294, 813–817.
Keeling, M., 2005. The implications of network structure for epidemic dynamics. Theor. Popul. Biol. 67, 1–8.
Lindström, T., Håkansson, N., Westerberg, L., Wennergren, U., 2008. Splitting the tail of the
displacement kernel shows the unimportance of kurtosis. Ecology 89, 1784–1790.
Lindström, T., Sisson, S.A., Nöremark, M., Jonsson, A., Wennergren, U., 2009. Estimation of distance
related probability of animal movements between holdings and implications for disease spread modeling.
Prev. Vet. Med. 91, 85–94.
Lindström, T., Sisson, S.A., Sternberg Lewerin, S., Wennergren, U., 2010. Estimating animal movement
contacts between holdings of different production types. Prev. Vet. Med. 95, 23–31.
Lindström, T., Håkansson, N., Wennergren, U., The shape of the spatial kernel and its implications for
biological invasions in patchy environments. Proc. Roy. Soc. London B. In press.
Matthews, L., Woolhouse, M., 2005. New approaches to quantifying the spread of infection. Nat. Rev.
Microbiol. 3, 529–536.
Nöremark, M., Håkansson, N., Lindström, T., Wennergren, U., Sternberg Lewerin, S., 2009a. Spatial and
temporal investigations of reported movements, births and deaths of cattle and pigs in Sweden. Acta Vet.
Scand. 51:37.
Nöremark, M., Lindberg, A., Vågsholm, I., Sternberg Lewerin, S., 2009b. Disease awareness, information
retrieval and change in biosecurity routines among pig farmers in association with the first PRRS outbreak
in Sweden. Prev. Vet. Med. 90, 1–9.
Nöremark, M., Håkansson, N., Sternberg Lewerin, S., Lindberg, A., Jonsson, A., Network analysis of cattle
and pig movements in Sweden; measures relevant for disease control and risk based surveillance. In press.
Ortiz-Pelaez, A., Pfeiffer, D.U., Soares-Magalhães, R.J., Guitian, F.J., 2006. Use of social network analysis
to characterize the pattern of animal movements in the initial phases of the 2001 foot and mouth disease
(FMD) epidemic in the UK. Prev. Vet. Med. 75, 40–55.
Rweyemamu, M., Roeder, P., Mackay, D., Sumption, K., Brownlie, J., Leforban, Y., Valarcher, J.F.,
Knowles, N.J., Saraiva, V., 2008. Epidemiological patterns of foot-and-mouth disease worldwide.
Transbound. Emerg. Dis. 55, 57–72.
Ribbens, S., Dewulf, J., Koenen, F., Mintiens, K., de Kruif, A., Maes, D., 2009. Type and frequency of
contacts between Belgian pig herds. Prev. Vet. Med. 88, 57–66.
Robinson, S.E., Christley, R.M., 2007. Exploring the role of auction markets in cattle movements within
Great Britain. Prev. Vet. Med. 81, 21–37.
Tildesley, M.J., Deardon, R., Savill, N.J., Bessell, P.R., Brooks, S.P., Woolhouse, M.E.J., Grenfell, B.T.,
Keeling, M.J., 2008. Accuracy of models for the 2001 foot-and-mouth epidemic. Proc. Roy. Soc. London
B. 275, 1459–1468.
Velthuis, A.G., Mourits, M.C., 2007. Effectiveness of movement-prevention regulations to reduce the
spread of foot-and-mouth disease in The Netherlands. Prev. Vet. Med. 82, 262–281.
Vernon, M.C., Keeling, M.J., 2009. Representing the UK’s cattle herd as static and dynamic networks.
Proc. Roy. Soc. London B. 276, 469–476.
Wasserman, S., Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge
University Press, Cambridge.
Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442.

Appendix A. Hierarchical priors
We implement hierarchical priors for 𝑽 and 𝜿, and denote these 𝛹(𝑉𝐼𝐽 |𝝃𝑽 ) and 𝛹(𝜅𝐼𝐽 |𝝃𝜿 ), respectively,
where 𝝃𝑽 and 𝝃𝜿 are vectors of unknown hyper-parameters. These vectors model the degree of similarity
between parameters and (if similarities are prominent) improve estimation of 𝑽 and 𝜿 when the data
are weak (e.g. few movements between types 𝐼, 𝐽). For 𝑽 we use an inverse gamma distribution with
hyper-parameters 𝛼𝑉 and 𝛽𝑉
$$\Psi(V_{IJ} \mid \alpha_V, \beta_V) = \frac{\beta_V^{\alpha_V}}{\Gamma(\alpha_V)} \left(\frac{1}{V_{IJ}}\right)^{\alpha_V} e^{-\beta_V / V_{IJ}}. \tag{A.1}$$
The probability density function of the inverse gamma distribution is defined for values larger than zero.
Since the lower limiting value of 𝜿 is 4/3 (a uniform distribution obtained for 𝑏 → ∞, Lindström et al.
2008) we use a shifted inverse gamma distribution as hierarchical prior for 𝜿 and write
$$\Psi(\kappa_{IJ} \mid \alpha_\kappa, \beta_\kappa) = \frac{\beta_\kappa^{\alpha_\kappa}}{\Gamma(\alpha_\kappa)} \left(\frac{1}{\kappa_{IJ} - 4/3}\right)^{\alpha_\kappa} e^{-\beta_\kappa / (\kappa_{IJ} - 4/3)}. \tag{A.2}$$
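
The lower limit of 4/3 can be checked with a short calculation. The sketch below assumes that kernel kurtosis is defined for a radially symmetric two-dimensional kernel as the ratio of the fourth moment of the distance 𝑟 to the squared second moment; this definition is an assumption made here for illustration and is not restated in this appendix. The radius R of the uniform disc is a symbol introduced only for this example.

```latex
\begin{align*}
f(r) &= \frac{2r}{R^2}, \qquad 0 \le r \le R
  && \text{(density of } r \text{ for a kernel uniform on a disc)} \\
E[r^2] &= \int_0^R r^2 \, \frac{2r}{R^2} \, dr = \frac{R^2}{2}, \\
E[r^4] &= \int_0^R r^4 \, \frac{2r}{R^2} \, dr = \frac{R^4}{3}, \\
\kappa &= \frac{E[r^4]}{\left(E[r^2]\right)^2} = \frac{R^4/3}{R^4/4} = \frac{4}{3}.
\end{align*}
```

Under the same assumed definition a two-dimensional Gaussian kernel gives a kurtosis of 2, so shifting the inverse gamma prior by 4/3 restricts its support to values of 𝜅𝐼𝐽 above the uniform limit.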

Appendix B. Indicator variables
To facilitate computations when sampling from posteriors associated with mixture distributions, a
common strategy is to introduce indicator variables (for example Gelman et al. 2004). With this approach,
equation 10 may be rewritten as
$$P(\boldsymbol{d}, \boldsymbol{s} \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}, \boldsymbol{S}, \boldsymbol{D}, \boldsymbol{c}, \boldsymbol{\kappa}, \boldsymbol{V}, \boldsymbol{U}) = \prod_t \prod_I \prod_J \left[ P(d_t \mid \boldsymbol{v}, \boldsymbol{R}, s_t, I, J, \boldsymbol{S}, \boldsymbol{D}, \ddot{\boldsymbol{c}}_J, \kappa_{IJ}, V_{IJ}) \, P(s_t \mid \boldsymbol{v}, \boldsymbol{R}, I, \boldsymbol{S}, \dot{\boldsymbol{c}}_I) \, P(I, J \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}) \right]^{U_{IJt}}, \tag{B.1}$$
where 𝑼 is a tensor of dimension 𝐾 × 𝐾 × 𝑇 and 𝑈𝐼𝐽𝑡 = 1 for exactly one combination of 𝐼, 𝐽 for each
movement 𝑡. The full posterior distribution of unobserved parameters 𝒄, 𝜿, 𝑽, 𝒉, 𝒗, 𝛼𝜅 , 𝛽𝜅 , 𝛼𝑉 , 𝛽𝑉 , 𝑼 is
$$P(\boldsymbol{c}, \boldsymbol{\kappa}, \boldsymbol{V}, \boldsymbol{h}, \boldsymbol{v}, \alpha_\kappa, \beta_\kappa, \alpha_V, \beta_V, \boldsymbol{U} \mid \boldsymbol{R}, \boldsymbol{S}, \boldsymbol{D}, \boldsymbol{d}, \boldsymbol{s}) = P(\boldsymbol{d}, \boldsymbol{s} \mid \boldsymbol{v}, \boldsymbol{h}, \boldsymbol{R}, \boldsymbol{S}, \boldsymbol{D}, \boldsymbol{c}, \boldsymbol{\kappa}, \boldsymbol{V}, \boldsymbol{U}) \, \Psi(\boldsymbol{\kappa} \mid \alpha_\kappa, \beta_\kappa) \, \Psi(\boldsymbol{V} \mid \alpha_V, \beta_V) \, P(\boldsymbol{v}) P(\boldsymbol{h}) P(\boldsymbol{c}) P(\alpha_\kappa) P(\beta_\kappa) P(\alpha_V) P(\beta_V), \tag{B.2}$$
where 𝑃(𝒗), 𝑃(𝒉), 𝑃(𝒄), 𝑃(𝛼𝜅 ), 𝑃(𝛽𝜅 ), 𝑃(𝛼𝑉 ) and 𝑃(𝛽𝑉 ) are prior distributions of parameters and
hyper-parameters. For 𝒗 and 𝒉 (recall that the elements of these sum to one) we use uninformative
Dirichlet(1, 1, … , 1) priors. Priors 𝑃(𝒄), 𝑃(𝛼𝜅 ) and 𝑃(𝛼𝑉 ) are set to be proportional to one on the support
of the parameters, while 𝑃(𝛽𝜅 ) and 𝑃(𝛽𝑉 ) are defined as uniform for 𝛽 > 1. The inverse gamma
distribution does not have a finite mean for 𝛽 < 1 and we assume that both 𝑉 and 𝜅 are finite quantities.
Incorporation of the unobserved indicator variable 𝑼 in the model also allows for posterior estimation of
the number of movements between production types. The posterior distribution of 𝑸 is calculated from the
posterior distribution of 𝑼 by
$$Q_{IJ} = \sum_t U_{IJt}. \tag{B.3}$$
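
Equation B.3 amounts to counting, within each MCMC iteration, how many movements are assigned to each pair of production types. A minimal sketch of that bookkeeping follows. It assumes the indicator draws are stored as one list of (𝐼, 𝐽) assignments per iteration rather than as a full 𝐾 × 𝐾 × 𝑇 tensor; this storage layout, the function names and the toy data are all hypothetical.

```python
from collections import Counter


def movement_counts(indicator_draws):
    """Turn draws of the indicator variable into draws of Q.

    indicator_draws holds one entry per MCMC iteration; each entry lists, for
    every movement t, the (I, J) pair for which U_IJt = 1 in that iteration.
    Returns one Counter per iteration mapping (I, J) to Q_IJ.
    """
    return [Counter(assignment) for assignment in indicator_draws]


def posterior_mean_counts(q_draws):
    """Average Q_IJ over iterations (pairs absent from a draw count as zero)."""
    totals = Counter()
    for q in q_draws:
        totals.update(q)
    return {pair: total / len(q_draws) for pair, total in totals.items()}


# Toy example: 3 movements and 2 iterations; the type labels are placeholders.
draws = [
    [("Piglet", "Fattening"), ("Piglet", "Fattening"), ("Nucleus", "Multiplying")],
    [("Piglet", "Fattening"), ("Nucleus", "Multiplying"), ("Nucleus", "Multiplying")],
]
q_draws = movement_counts(draws)
print(posterior_mean_counts(q_draws))
```

Keeping the per-iteration counts, rather than only their mean, retains the full posterior distribution of 𝑸 so that credibility intervals can be read off in the usual way.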
Appendix C. MCMC estimation
We use Markov chain Monte Carlo (MCMC) techniques to estimate the posterior distribution of the model parameters. This involves constructing a Markov chain whose stationary distribution is the posterior distribution of interest. Given the current state of the chain, MCMC methods sequentially update the parameters, either individually or in blocks, based on the full conditional posterior distribution of each parameter under the model. When this procedure is repeated, and after the chain has converged, the states of the chain represent (correlated) draws from the posterior distribution of the model parameters. Two basic updates are involved. If the conditional distribution of a parameter is of a standard form, Gibbs sampling may be used (see e.g. Casella and George 1992 for further details). If, however, the distribution is of non-standard form, Metropolis-Hastings updates may be used (see e.g. Chib and Greenberg 1995 for further details). In this case, parameter values $\boldsymbol{\theta}^*$ are proposed from a density function $q(\boldsymbol{\theta}^* \mid \boldsymbol{\theta})$ and subsequently accepted with probability

\[
\min\left(1, \frac{P(\boldsymbol{\theta}^* \mid \ldots)\, P(\boldsymbol{\theta}^*)\, q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^*)}{P(\boldsymbol{\theta} \mid \ldots)\, P(\boldsymbol{\theta})\, q(\boldsymbol{\theta}^* \mid \boldsymbol{\theta})}\right), \tag{C.1}
\]

where $\boldsymbol{\theta}$ and $\boldsymbol{\theta}^*$ are the current and proposed parameter values, $P(\boldsymbol{\theta})$ is the prior and $P(\boldsymbol{\theta} \mid \ldots)$ is the likelihood evaluated at $\boldsymbol{\theta}$. Further information on MCMC methods can be found in Gamerman and Lopes (2006).
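The acceptance rule in equation C.1 can be sketched generically as follows. This is a minimal illustration and not the implementation used for the analysis: the function names are hypothetical, the computation is done on the log scale for numerical stability, and the toy target (a standard normal density with a symmetric Gaussian random walk) is chosen only so that the example runs on its own.

```python
import math
import random

def metropolis_hastings_step(theta, log_target, propose, log_q, rng):
    """One Metropolis-Hastings update following equation C.1.

    log_target(x): log of likelihood times prior at x, up to a constant
    propose(x, rng): draws a candidate from q(. | x)
    log_q(a, b): log density of proposing a when the chain is at b
    """
    candidate = propose(theta, rng)
    log_ratio = (log_target(candidate) + log_q(theta, candidate)
                 - log_target(theta) - log_q(candidate, theta))
    # Accept with probability min(1, exp(log_ratio)).
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return candidate, True
    return theta, False

# Toy target (an assumption for illustration only): standard normal density.
def log_std_normal(x):
    return -0.5 * x * x

def random_walk(x, rng, step=1.0):
    return x + rng.gauss(0.0, step)

def log_q_symmetric(a, b):
    return 0.0  # symmetric proposal, so the q terms cancel in the ratio

rng = random.Random(1)
theta, samples, accepted = 0.0, [], 0
for _ in range(5000):
    theta, ok = metropolis_hastings_step(
        theta, log_std_normal, random_walk, log_q_symmetric, rng)
    samples.append(theta)
    accepted += ok
sample_mean = sum(samples) / len(samples)
acceptance_rate = accepted / len(samples)
```

For asymmetric proposals, such as Dirichlet proposals, `log_q` would return the log proposal density so that the ratio of the two q terms in equation C.1 is retained.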
All parameters except $\mathbf{U}$ may be updated with Metropolis-Hastings steps. The parameter matrix $\mathbf{h}$ is only involved in $P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})$ of equation B.1, and so the resulting conditional posterior distribution of $\mathbf{h}$ is

\[
P(\mathbf{h} \mid \mathbf{v}, \mathbf{U}, \mathbf{R}) \propto \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ P(I_t, J_t \mid \mathbf{v}, \mathbf{h}, \mathbf{R}) \right]^{U_{IJt}} P(\mathbf{h}). \tag{C.2}
\]

The prior, $P(\mathbf{h})$, is proportional to 1 and the distribution $\prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ P(I_t, J_t \mid \mathbf{v}, \mathbf{h}, \mathbf{R}) \right]^{U_{IJt}}$ is given in Lindström et al. (2010) as

\[
L_1 = \mathrm{Multinomial}(M_{1,1}, M_{1,2}, \ldots, M_{2,1}, M_{2,2}, \ldots, M_{k,k-1}, M_{k,k} \mid p_{1,1}, p_{1,2}, \ldots, p_{2,1}, p_{2,2}, \ldots, p_{k,k-1}, p_{k,k}), \tag{C.3}
\]

where $p_{I,J} = P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})$ is given in equation 3 and $M_{I,J} = \sum_t U_{IJt}$ for all $I, J$. Dirichlet distributions were used for proposals. More details may be found in Lindström et al. (2010).
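The ingredients of such an update can be sketched in Python as below. The sketch evaluates the part of the multinomial likelihood in equation C.3 that depends on the probabilities, and draws a Dirichlet proposal with its log density. How the Dirichlet proposals were parameterized is not stated here, so centring the proposal on the current value with a concentration constant `conc` is an assumption made only for illustration, as are all names and numbers in the example.

```python
import math
import random

def multinomial_log_kernel(counts, probs):
    """Sum of M_IJ * log(p_IJ); the multinomial coefficient is omitted
    because it does not depend on the parameters being updated."""
    return sum(m * math.log(p) for m, p in zip(counts, probs) if m > 0)

def dirichlet_draw(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

def dirichlet_log_pdf(x, alpha):
    """Log density of Dirichlet(alpha) at the probability vector x."""
    return (math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
            + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)))

# Made-up flattened counts M_IJ and probabilities p_IJ for K = 2.
counts = [3, 1, 0, 2]
probs = [0.4, 0.2, 0.1, 0.3]
log_kernel = multinomial_log_kernel(counts, probs)

# Hypothetical proposal: Dirichlet centred on the current value.
rng = random.Random(5)
current_h = [0.25, 0.25, 0.25, 0.25]
conc = 50.0  # assumed tuning constant, not taken from the paper
proposal = dirichlet_draw([conc * h for h in current_h], rng)
log_q_forward = dirichlet_log_pdf(proposal, [conc * h for h in current_h])
```

Because a Dirichlet proposal centred on the current value is not symmetric, both the forward and the reverse proposal densities would enter the acceptance ratio of equation C.1.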
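Dirichlet proposals were also used for updates of $\mathbf{v}$, which is included in each of the distributions $P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_{J}, \kappa_{IJ}, V_{IJ})$, $P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}, \dot{\mathbf{c}}_{I})$ and $P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})$. For simplicity we write

\[
L_2 = \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{sI}\, G(S_{1s}, S_{2s}, \dot{c}_{1I}, \dot{c}_{2I})}{\sum_{f} \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \dot{c}_{1I}, \dot{c}_{2I})} \right]^{U_{IJt}} \tag{C.4}
\]

and

\[
L_3 = \prod_{t=1}^{T} \prod_{I=1}^{K} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{dI}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f} \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}}, \quad f \neq s, \tag{C.5}
\]

and here the conditional posterior distribution for $\mathbf{v}$ is

\[
P(\mathbf{v} \mid \mathbf{c}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}, \mathbf{J}, \mathbf{I}) = L_1 L_2 L_3 P(\mathbf{v}). \tag{C.6}
\]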
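The elements of $\dot{\mathbf{c}}$ and $\ddot{\mathbf{c}}$ were updated separately using Gaussian random walk proposal distributions. The conditional posterior distribution of $\dot{c}_{eI}$ ($e = 1, 2$) is

\[
P(\dot{c}_{eI} \mid \dot{c}_{\tilde{e}I}, \mathbf{v}, \mathbf{U}, \mathbf{R}, I, \mathbf{S}, \mathbf{s}) = \left( \prod_{t=1}^{T} \prod_{J=1}^{K} \left[ \frac{\hat{v}_{sI}\, G(S_{es}, S_{\tilde{e}s}, \dot{c}_{eI}, \dot{c}_{\tilde{e}I})}{\sum_{f} \hat{v}_{fI}\, G(S_{ef}, S_{\tilde{e}f}, \dot{c}_{eI}, \dot{c}_{\tilde{e}I})} \right]^{U_{IJt}} \right) P(\dot{c}_{eI}), \tag{C.7}
\]

where $\tilde{e} = 2$ for $e = 1$ and $\tilde{e} = 1$ for $e = 2$. Similarly, the conditional posterior distribution of $\ddot{c}_{eJ}$ is

\[
P(\ddot{c}_{eJ} \mid \ddot{c}_{\tilde{e}J}, \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \boldsymbol{\kappa}, \mathbf{V}, \mathbf{U}, \mathbf{J}) = \prod_{t=1}^{T} \prod_{I=1}^{K} \left[ \frac{\hat{v}_{dI}\, G(S_{ed}, S_{\tilde{e}d}, \ddot{c}_{eJ}, \ddot{c}_{\tilde{e}J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f} \hat{v}_{fI}\, G(S_{ef}, S_{\tilde{e}f}, \ddot{c}_{eJ}, \ddot{c}_{\tilde{e}J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}} P(\ddot{c}_{eJ}), \quad f \neq s. \tag{C.8}
\]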
Joint updates of $\boldsymbol{\kappa}$ and $\mathbf{V}$ were performed separately for each combination of $I, J$, using a multivariate Gaussian random walk on the logarithms of $\kappa_{IJ}$ and $V_{IJ}$. The joint conditional distribution of $\kappa_{IJ}, V_{IJ}$ is

\[
P(\kappa_{IJ}, V_{IJ} \mid \ddot{\mathbf{c}}_{J}, \mathbf{v}, \mathbf{h}, \mathbf{R}, \mathbf{S}, \mathbf{D}, \mathbf{U}, \mathbf{J}) = \prod_{t=1}^{T} \left[ \frac{\hat{v}_{dI}\, G(S_{1d}, S_{2d}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sd}, \kappa_{IJ}, V_{IJ})}{\sum_{f} \hat{v}_{fI}\, G(S_{1f}, S_{2f}, \ddot{c}_{1J}, \ddot{c}_{2J})\, F(D_{sf}, \kappa_{IJ}, V_{IJ})} \right]^{U_{IJt}} \Psi(\kappa_{IJ} \mid \alpha_\kappa, \beta_\kappa)\, \Psi(V_{IJ} \mid \alpha_V, \beta_V), \quad f \neq s. \tag{C.9}
\]
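A random walk on the logarithms of two positive parameters can be sketched as follows. This is an illustration under stated assumptions and not the authors' code: the Cholesky factor `chol`, the function names and the toy target are hypothetical. The sketch also makes explicit a general property of such proposals: if the target density is written in terms of the parameters themselves rather than their logarithms, the proposal is log-normal and therefore not symmetric, and the ratio of proposal densities in equation C.1 reduces to the product of the proposed values divided by the product of the current values.

```python
import math
import random

def propose_log_scale(kappa, v, chol, rng):
    """Gaussian random walk on (log kappa, log V).

    chol is a lower-triangular 2x2 Cholesky factor of the proposal
    covariance (an assumed tuning quantity).
    """
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    step_k = chol[0][0] * z1
    step_v = chol[1][0] * z1 + chol[1][1] * z2
    return kappa * math.exp(step_k), v * math.exp(step_v)

def log_accept_ratio(log_target, current, candidate):
    """Log Metropolis-Hastings ratio when log_target is a density in
    (kappa, V); the extra term is the log of the proposal density ratio."""
    (k0, v0), (k1, v1) = current, candidate
    log_q_ratio = math.log(k1) + math.log(v1) - math.log(k0) - math.log(v0)
    return log_target(k1, v1) - log_target(k0, v0) + log_q_ratio

# Toy target (illustration only): independent unit-rate exponential densities.
def toy_log_target(kappa, v):
    return -(kappa + v)

rng = random.Random(7)
chol = [[0.8, 0.0], [0.2, 0.8]]
state = (1.0, 1.0)
kappas = []
for _ in range(20000):
    cand = propose_log_scale(state[0], state[1], chol, rng)
    log_r = log_accept_ratio(toy_log_target, state, cand)
    if rng.random() < math.exp(min(0.0, log_r)):
        state = cand
    kappas.append(state[0])
mean_kappa = sum(kappas) / len(kappas)
```

Working on the log scale keeps every proposed value positive, so no proposals are wasted outside the support of the parameters.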
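We utilized Metropolis-Hastings updates for both $\alpha$ and $\beta$. In order to improve the mixing, we updated the parameters of the hierarchical priors five times for every update of the other parameters (e.g. Fan et al. 2010). The posterior distributions of $\alpha_\theta$ and $\beta_\theta$ ($\theta = V, \kappa$) are

\[
\begin{aligned}
P(\alpha_\theta \mid \boldsymbol{\theta}) &= \Psi(\boldsymbol{\theta} \mid \alpha_\theta, \beta_\theta)\, P(\alpha_\theta), \\
P(\beta_\theta \mid \boldsymbol{\theta}) &= \Psi(\boldsymbol{\theta} \mid \alpha_\theta, \beta_\theta)\, P(\beta_\theta),
\end{aligned} \tag{C.10}
\]

with $\Psi$ given by equations A.1 (for $\boldsymbol{\theta} = \mathbf{V}$) and A.2 (for $\boldsymbol{\theta} = \boldsymbol{\kappa}$).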
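Because the indicator variable $U_{IJt} = 1$ for exactly one combination of $I, J$ for each $t$, $\mathbf{U}$ may be updated with Gibbs sampling by drawing, for each $t$, one sample from a multinomial distribution with probabilities given by

\[
\Pr(U_{IJt} = 1) = \frac{P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, I, J, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_{J}, \kappa_{IJ}, V_{IJ})\, P(s_t \mid \mathbf{v}, \mathbf{R}, I, \mathbf{S}, \dot{\mathbf{c}}_{I})\, P(I, J \mid \mathbf{v}, \mathbf{h}, \mathbf{R})}{\sum_{i} \sum_{j} P(d_t \mid \mathbf{v}, \mathbf{R}, s_t, i, j, \mathbf{S}, \mathbf{D}, \ddot{\mathbf{c}}_{j}, \kappa_{ij}, V_{ij})\, P(s_t \mid \mathbf{v}, \mathbf{R}, i, \mathbf{S}, \dot{\mathbf{c}}_{i})\, P(i, j \mid \mathbf{v}, \mathbf{h}, \mathbf{R})}. \tag{C.11}
\]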
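The Gibbs update of the indicator for a single movement can be sketched in Python as below. The sketch assumes that the numerator of equation C.11 has already been evaluated for every combination of I and J and stored in a nested list `weights`. That storage format, the function names and the numbers are hypothetical and serve only to illustrate the draw.

```python
import random

def normalise(weights):
    """Divide every weight by the total, giving Pr(U_IJt = 1) for all I, J."""
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def sample_indicator(weights, rng):
    """Draw one (I, J) pair with probability proportional to weights[I][J]."""
    pairs = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    flat = [w for row in weights for w in row]
    # random.choices accepts unnormalised relative weights.
    return rng.choices(pairs, weights=flat, k=1)[0]

# Made-up unnormalised weights for one movement t and K = 2 production types.
weights = [[0.02, 0.06],
           [0.00, 0.12]]
probs = normalise(weights)

rng = random.Random(3)
draws = [sample_indicator(weights, rng) for _ in range(10000)]
share_11 = draws.count((1, 1)) / len(draws)
```

Repeating the draw for every movement t yields a full update of the indicator variable within one MCMC iteration.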
Figure Legends
Figure 1. Examples of the spatial kernel with (a) kurtosis = 3.33 and variance = 1000 (dashed), 10000 (solid), 100000 (dotted) (km²) and (b) variance = 1000000 (km²) and kurtosis = 2 (dashed), 4 (solid), 8 (dotted). The inset axes show the same curves as the main axes but with a logarithmic y-axis and larger distances included.
Figure 2. Cumulative distribution of observed (solid line) and predicted (dotted line) distances of live pig movements between holdings. Note that the x-axis is on the log scale.
Figure 3. Coefficients of (panels a and b) the explanatory variable production type (0 if the holding had not reported the production type, 1 if reported) and (panel c) the combination of production type and size class (S=small, M=medium, L=large), analyzed by GLM. The response variable was (a) the number of first, second and third degree infections caused by a holding if infected, (b) the number of first degree infections at distances longer than 10, 100 and 500 km and (c) the number of first degree infections (but with size classes included in the explanatory variables). Note that (a) and (b) are the result of three separate analyses (each with 8 explanatory variables) while (c) is the result of one analysis. Legend abbreviations: SPC=Sow pool centers, MH=Multiplying herds, NH=Nucleus herds, PP=Piglet producers, SPS=Sow pool satellites, FF=Farrow-to-finish, FH=Fattening herds, MI=Missing information.