# Web-based Supplementary Materials for "A Bayesian Approach to the Multi-state Jolly-Seber Capture-Recapture Model"

Jerome A. Dupuis (Laboratoire de Statistique et Probabilités, Université Paul Sabatier, Toulouse, France; email: Jerome.Dupuis@math.ups-tlse.fr)

Carl James Schwarz (Department of Statistics and Actuarial Science, Simon Fraser University; email: cschwarz@stat.sfu.ca)
## Appendix A: Calculation of $f(B_j(i) \mid \widetilde{B}_j(i))$
To simplify notation, the index $i$ is omitted in $z_{(i,t)}$ and $x_{(i,t)}$, and $B_j(i)$ is denoted by $B$. If $B$ is of type I and equal to $(z_1, \ldots, z_{t_j})$, such that $z_t = *$ for all $t \le t^*$ and $z_{t^*+1} = r$ (the animal thus enters the population between times $t^*$ and $t^*+1$, in state $r$), then

$$f(B \mid \widetilde{B}) \propto \beta_{t^*}(r) \prod_{t=t^*+1}^{t_j} \{1 - p_t(z_t)\}\, q_t(z_t, z_{t+1})$$

when $t^* \le t_j - 1$. If $t^* = t_j$, then $f(B \mid \widetilde{B}) = \beta_{t_j}(s)$ when $z_{t_j+1} = s$.
If $B$ is of type II, we have:

$$f(B \mid \widetilde{B}) \propto q_{t_j}(z_{t_j-1}, z_{t_j}) \prod_{t=t_j}^{t'_j} q_t(z_t, z_{t+1})\,\{1 - p_t(z_t)\}.$$
When $B$ is of type III or IV, the calculations are similar; details are omitted.
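For concreteness, the type I conditional weight can be sketched in code. This is a minimal illustration, not the authors' implementation: the function name `type1_weight` and the lookup tables `beta`, `p`, and `q` are hypothetical, and only the case $t^* \le t_j - 1$ is covered.

```python
# Minimal sketch (not the authors' code) of the unnormalized conditional
# weight of a type I latent history, for the case t_star <= t_j - 1.
# beta[t][r], p[t][s], q[t][(r, s)] are hypothetical parameter lookups.

def type1_weight(states, t_star, t_j, beta, p, q):
    """states[k] is z_{t_star+1+k}, covering occasions t_star+1 .. t_j+1."""
    r = states[0]                     # entry state z_{t_star+1}
    w = beta[t_star][r]               # entry term beta_{t*}(r)
    for k, t in enumerate(range(t_star + 1, t_j + 1)):
        # not captured at occasion t, then transition z_t -> z_{t+1}
        w *= (1.0 - p[t][states[k]]) * q[t][(states[k], states[k + 1])]
    return w

# toy parameters for occasions 2 and 3 (entry between occasions 1 and 2)
beta = {1: {"a": 0.5}}
p = {2: {"a": 0.2}, 3: {"a": 0.2}}
q = {2: {("a", "a"): 0.9}, 3: {("a", "b"): 0.1}}
w = type1_weight(["a", "a", "b"], t_star=1, t_j=3, beta=beta, p=p, q=q)
# w = 0.5 * (0.8 * 0.9) * (0.8 * 0.1) = 0.0288
```

In a sampler, such weights would be computed for every admissible candidate history and then normalized before drawing.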
## Appendix B: Proof of Theorem 2
Ergodicity of the Markov chain $(\xi^{(l)})$ relies on the fact that the kernel density of the Markov chain generated by the Gibbs sampling, say $Q(\cdot \mid \cdot)$, is strictly positive on the support of the distribution of $\theta, N, z_m \mid D$. Indeed,

$$Q(\theta', N', z'_m \mid \theta, N, z_m) = \pi(\theta' \mid z_m, N, D)\,\pi(N' \mid \theta', D)\,f(z'_m \mid N', \theta', D),$$

and each conditional density appearing on the right-hand side is strictly positive on its support.
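The factorization above is exactly one sweep of a systematic-scan Gibbs sampler. As a schematic sketch (the callables `draw_theta`, `draw_N`, and `draw_zm` are hypothetical stand-ins for the three full conditionals, not code from the paper):

```python
# Schematic sketch of one Gibbs sweep corresponding to the kernel
# factorization Q(theta', N', zm' | theta, N, zm); the draw_* callables
# are hypothetical stand-ins for sampling the three full conditionals.

def gibbs_sweep(theta, N, zm, D, draw_theta, draw_N, draw_zm):
    theta_new = draw_theta(zm, N, D)       # theta' ~ pi(theta | zm, N, D)
    N_new = draw_N(theta_new, D)           # N'     ~ pi(N | theta', D)
    zm_new = draw_zm(N_new, theta_new, D)  # zm'    ~ f(zm | N', theta', D)
    return theta_new, N_new, zm_new

# each block conditions on the freshest values of the others, so Q is
# positive wherever each conditional is positive on its support
out = gibbs_sweep(0.1, 10, (1, 1), None,
                  draw_theta=lambda zm, N, D: 0.2,
                  draw_N=lambda th, D: 12,
                  draw_zm=lambda N, th, D: (3, 4))
# out == (0.2, 12, (3, 4))
```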
## Appendix C: Example for which the component-by-component simulation is not valid
In our example, $z_i$ denotes the individual dispersal process of animal $i$, which may move within a region divided into two sites $a$ and $b$, as motivated by Clobert et al. (2001). We assume that the movement process is directed by a first-order Markov chain. We distinguish two types of resident, according to whether animal $i$ resides in $a$ or in $b$, and we let $z_{(i,t)} = 1$ in the first case and $z_{(i,t)} = 2$ in the second. Similarly, we distinguish two types of dispersal, according to whether it disperses from $a$ or from $b$, and we let $z_{(i,t)} = 3$ in the first case and $z_{(i,t)} = 4$ in the second. It is clear that the $z_{(i,t)}$'s also form a first-order Markov chain ($i$ being fixed) and that the definitions of the $z_{(i,t)}$'s imply that some transitions are impossible, as shown in the following matrix:
$$\begin{pmatrix} + & 0 & + & 0 \\ 0 & + & 0 & + \\ 0 & + & 0 & + \\ + & 0 & + & 0 \end{pmatrix}$$
where state † has been omitted for convenience, and where + indicates a
possible transition and 0 an impossible transition.
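The pattern of this matrix can be recovered mechanically from the state definitions: each state fixes the site occupied at the start and at the end of the transition interval, and a move $r \to s$ is possible exactly when $r$ ends at the site where $s$ starts. A small sketch under these assumptions (an illustration, not code from the paper):

```python
# Each state is (site at start, site at end) of the interval:
# 1 = resident in a, 2 = resident in b,
# 3 = disperses from a (ends in b), 4 = disperses from b (ends in a).
STATES = {1: ("a", "a"), 2: ("b", "b"), 3: ("a", "b"), 4: ("b", "a")}

# transition r -> s is possible iff r ends where s starts
support = {(r, s): int(STATES[r][1] == STATES[s][0])
           for r in STATES for s in STATES}

for r in sorted(STATES):
    print([support[(r, s)] for s in sorted(STATES)])
# prints the +/0 pattern of the matrix above:
# [1, 0, 1, 0]
# [0, 1, 0, 1]
# [0, 1, 0, 1]
# [1, 0, 1, 0]
```

In particular, `support[(3, 1)] == 0` and `support[(1, 4)] == 0`, the two impossible transitions used in the non-ergodicity argument below.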
To demonstrate that the Gibbs sampling is not ergodic when the component-by-component simulation scheme is used, we use the Arnason-Schwarz model, but the same problem also exists in the multi-state Jolly-Seber model. Moreover, for simplicity, we assume that the Markov chain $(z_{(i,t)})$ is time homogeneous, that the capture probabilities $p_t(r)$ do not depend on $t$, and that $q(r, s)$ is not decomposed as $\psi(r, s) \times \phi_r$; we have put uniform distributions on all the parameters. Finally, we consider a very simple data set $y$ to show the failure of the Gibbs sampling to be ergodic:
$$y_1 = 1\,.\,.\,1\,3\,4;\quad y_2 = 1\,1\,3\,2\,2\,4;\quad y_3 = 1\,3\,4\,3\,4\,3;\quad y_4 = 1\,1\,1\,3\,2\,2;\quad y_5 = 1\,3\,2\,2\,2\,4.$$
When the missing data are simulated component by component, Dupuis (1995) showed that $z_{(i,t)}$ is simulated according to

$$p(z_{(i,t)} \mid z_{(i,t-1)}, x_{(i,t)}, z_{(i,t+1)}) \propto q(z_{(i,t-1)}, z_{(i,t)})\,\{1 - p(z_{(i,t)})\}\,q(z_{(i,t)}, z_{(i,t+1)}),$$

where $z_{(i,t+1)}$ denotes either the observation or the value of $z_{(i,t+1)}$ simulated at step $(l-1)$, and $z_{(i,t-1)}$ denotes either the observation or the value of $z_{(i,t-1)}$ simulated at step $(l)$.
First, the Markov chain $(z_m^{(l)};\, l \ge 0)$ takes values in $\{(1,1), (3,4)\}$, and it can neither move from $(1,1)$ to $(3,4)$ (because $q(3,1) = 0$) nor from $(3,4)$ to $(1,1)$ (because $q(1,4) = 0$). The states $(3,4)$ and $(1,1)$ are thus absorbing states of the Markov chain $(z_m^{(l)};\, l \ge 0)$. It implies that the Markov chain $(z_m^{(l)}, \theta^{(l)};\, l \ge 0)$ generated by the Gibbs algorithm is not ergodic; consequently, the Gibbs sampling yields incorrect results. For example, $\frac{1}{L}\sum_{l=1}^{L} q^{(l)}(1,1)$ converges to 0.50 when the Gibbs sampling is run from $z_m^{(0)} = (1,1)$, and to 0.33 when it is run from $(3,4)$, whereas the Bayesian estimate of $q(1,1)$ is actually equal to 0.46 (a value obtained by implementing the Gibbs sampling with the block-by-block scheme). The data set $y$ chosen to illustrate our point is clearly artificial, but it is easy to check that the non-ergodicity will also occur with any data set $y$ including, for example, one (or more) sequences of the type $r\,.\,.\,.\,s$.
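The absorbing behaviour can be checked by brute force. The sketch below (an illustration, not the paper's code) takes a history observed in state 1 at occasions 1 and 4, with occasions 2 and 3 missing, as in the first four occasions of $y_1$: the joint support of $(z_2, z_3)$ contains exactly $(1,1)$ and $(3,4)$, yet no single-site update can move between them.

```python
# possible transitions r -> s, read off the matrix in this appendix
OK = {(1, 1), (1, 3), (2, 2), (2, 4), (3, 2), (3, 4), (4, 1), (4, 3)}

# joint support of the missing pair (z2, z3) given z1 = 1 and z4 = 1
joint = [(z2, z3) for z2 in range(1, 5) for z3 in range(1, 5)
         if (1, z2) in OK and (z2, z3) in OK and (z3, 1) in OK]
# joint == [(1, 1), (3, 4)]

def single_site_moves(z2, z3):
    """Pairs with positive full-conditional support when updating
    z2 alone or z3 alone (component-by-component scheme)."""
    moves = {(v, z3) for v in range(1, 5)            # update z2
             if (1, v) in OK and (v, z3) in OK}
    moves |= {(z2, v) for v in range(1, 5)           # update z3
              if (z2, v) in OK and (v, 1) in OK}
    return moves

# each support point only reaches itself: both are absorbing
# single_site_moves(1, 1) == {(1, 1)}; single_site_moves(3, 4) == {(3, 4)}
```

A block update that draws $(z_2, z_3)$ jointly has both support points available at every iteration, which is why the block-by-block scheme remains ergodic.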
## Web Table 1

Summary statistics for the northern pike study: $n_t$ is the number of fish captured; $m_t$ is the number of marks present; $R_t$ is the number of fish released (with tags); $r_t$ is the number (from $R_t$) subsequently recovered; and $z_t$ is the number of fish seen before year $t$, not in year $t$, and seen after year $t$. Note that while these statistics are sufficient for the standard Jolly-Seber model, they are not sufficient for the multi-state model, and the full capture histories were used in the example.
| Year      | $n_t$ | $m_t$ | $R_t$ | $r_t$ | $z_t$ |
|-----------|------:|------:|------:|------:|------:|
| 1994      |  1663 |     0 |  1663 |   223 |     0 |
| 1995      |  1939 |   148 |  1939 |   257 |    75 |
| 1996      |  2669 |   239 |  2669 |   177 |    93 |
| 1997      |  1769 |   163 |  1769 |   103 |   107 |
| 1998      |  1817 |   139 |  1552 |    60 |    71 |
| 1999      |   626 |    19 |     1 |     1 |   112 |
| 2000      |  1567 |    33 |     0 |     0 |    80 |
| 2001      |  2076 |    14 |     0 |     0 |    66 |
| 2002      |  1997 |    11 |     0 |     0 |    55 |
| 2003      |  2136 |     9 |     0 |     0 |    46 |
| 2004      |  2033 |     6 |     0 |     0 |    40 |
| 2005      |  6799 |    40 |  6746 |   157 |     0 |
| 2005-fall |  1150 |   157 |  1126 |     0 |     0 |