Web-based Supplementary Materials for

A Bayesian approach to the Multi-state Jolly-Seber Capture-Recapture Model

Jerome A. Dupuis∗
Laboratoire de Statistique et Probabilités, Université Paul Sabatier Toulouse, France

and

Carl James Schwarz†
Department of Statistics and Actuarial Science, Simon Fraser University
Burnaby, BC, Canada V5A 1S6

∗ email: Jerome.Dupuis@math.ups-tlse.fr
† email: cschwarz@stat.sfu.ca

Appendix A

Calculation of $f(B_j(i) \mid \tilde{B}_j(i))$

To simplify notation, the index $i$ is omitted in $z_{(i,t)}$ and $x_{(i,t)}$, and $B_j(i)$ is denoted by $B$.

If $B$ is of type I and equal to $(z_1, \ldots, z_{t_j})$, with $z_t = *$ for all $t \le t^*$ and $z_{t^*+1} = r$ (the animal thus entered the population between times $t^*$ and $t^*+1$, in state $r$), then, when $t^* \le t_j - 1$,

$$f(B \mid \tilde{B}) \propto \beta_{t^*}(r) \prod_{t=t^*+1}^{t_j} \{1 - p_t(z_t)\}\, q_t(z_t, z_{t+1}).$$

If $t^* = t_j$, then $f(B \mid \tilde{B}) = \beta_{t_j}(s)$ when $z_{t_j+1} = s$.

If $B$ is of type II, we have:

$$f(B \mid \tilde{B}) \propto q_{t_j-1}(z_{t_j-1}, z_{t_j}) \prod_{t=t_j}^{t'_j} q_t(z_t, z_{t+1}) \{1 - p_t(z_t)\}.$$

When $B$ is of type III or IV, the calculations are similar; details are omitted.

Appendix B

Proof of Theorem 2

Ergodicity of the Markov chain $(\xi^{(l)})$ rests on the fact that the kernel density of the Markov chain generated by the Gibbs sampler, say $Q(\cdot \mid \cdot)$, is strictly positive on the support of the distribution of $\theta, N, z_m \mid D$. Indeed,

$$Q(\theta', N', z'_m \mid \theta, N, z_m) = \pi(\theta' \mid z_m, N, D)\, \pi(N' \mid \theta', D)\, f(z'_m \mid N', \theta', D)$$

and each conditional density on the right-hand side is strictly positive on its support.

Appendix C

Example for which the component-by-component simulation is not valid

In our example, $z_i$ denotes the individual dispersal process of animal $i$, which may move within a region divided into two states $a$ and $b$, as motivated by Clobert et al. (2001). We assume that the movement process is governed by a first-order Markov chain. We distinguish two types of resident according to whether animal $i$ resides in $a$ or in $b$, and we let $z_{(i,t)} = 1$ in the first case and $z_{(i,t)} = 2$ in the second.
Similarly, we distinguish two types of dispersal according to whether the animal disperses from $a$ or from $b$. We let $z_{(i,t)} = 3$ in the first case and $z_{(i,t)} = 4$ in the second. It is clear that the $z_{(i,t)}$'s also form a first-order Markov chain ($i$ being fixed) and that their definitions imply that some transitions are impossible, as shown in the following matrix (rows give the current state and columns the next state, both in the order 1, 2, 3, 4):

$$\begin{pmatrix} + & 0 & + & 0 \\ 0 & + & 0 & + \\ 0 & + & 0 & + \\ + & 0 & + & 0 \end{pmatrix}$$

where state † has been omitted for convenience, and where $+$ indicates a possible transition and $0$ an impossible transition.

To demonstrate that the Gibbs sampler is not ergodic when the component-by-component simulation scheme is used, we use the Arnason-Schwarz model, but the same problem also exists in the multi-state Jolly-Seber model. Moreover, for simplicity, we assume that the Markov chain $(z_{(i,t)})$ is time homogeneous, that the capture probability $p_t(r)$ does not depend on $t$, and that $q(r, s)$ is not decomposed as $\psi(r, s) \times \phi_r$; we have put uniform distributions on all the parameters. Finally, we consider a very simple data set $y$ to show the failure of the Gibbs sampler to be ergodic:

y1 = 1 . . 1 3 4
y2 = 1 1 3 2 2 4
y3 = 1 3 4 3 4 3
y4 = 1 1 1 3 2 2
y5 = 1 3 2 2 2 4

When missing data are simulated component-by-component, Dupuis (1995) showed that $z_{(i,t)}$ is simulated according to:

$$p(z_{(i,t)} \mid z_{(i,t-1)}, x_{(i,t)}, z_{(i,t+1)}) \propto q(z_{(i,t-1)}, z_{(i,t)})\, \{1 - p(z_{(i,t)})\}\, q(z_{(i,t)}, z_{(i,t+1)})$$

where $z_{(i,t+1)}$ denotes either the observation or the simulated value of $z_{(i,t+1)}$ at step $(l-1)$, and where $z_{(i,t-1)}$ denotes either the observation or the simulated value of $z_{(i,t-1)}$ at step $(l)$.

Here the missing data $z_m$ consist of the two unobserved components of $y_1$. First, observe that the Markov chain $(z_m^{(l)}; l \ge 0)$ takes values in $\{(1, 1), (3, 4)\}$, and that it can neither move from $(1, 1)$ to $(3, 4)$ (because $q(3, 1) = 0$) nor move from $(3, 4)$ to $(1, 1)$ (because $q(1, 4) = 0$). The states $(3, 4)$ and $(1, 1)$ are thus absorbing states of the Markov chain $(z_m^{(l)}; l \ge 0)$.
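The absorbing behaviour described above can be checked numerically. The following Python sketch is illustrative only: it assumes the allowed transitions are those implied by the state definitions (1 to 1 or 3, 2 to 2 or 4, 3 to 2 or 4, 4 to 1 or 3), which agrees with $q(3, 1) = 0$ and $q(1, 4) = 0$. The values 0.5 for $q$ on allowed transitions and for the capture probability are hypothetical placeholders (only the pattern of zeros matters), and the function names are not taken from the paper. It compares the support of one component-by-component sweep with the support of the joint conditional of the two missing components of y1.

```python
from itertools import product

STATES = (1, 2, 3, 4)

# Assumed zero pattern: possible next states for each current state.
ALLOWED = {1: {1, 3}, 2: {2, 4}, 3: {2, 4}, 4: {1, 3}}

P_CAPTURE = 0.5  # hypothetical capture probability, common to all states


def q(r, s):
    """Hypothetical transition probability: positive only where allowed."""
    return 0.5 if s in ALLOWED[r] else 0.0


def component_support(prev, nxt):
    """States with positive full-conditional weight
    q(prev, s) * {1 - p(s)} * q(s, nxt) for one missing component."""
    return [s for s in STATES
            if q(prev, s) * (1 - P_CAPTURE) * q(s, nxt) > 0]


def sweep_support(z2, z3, z1=1, z4=1):
    """Values of (z2, z3) reachable in one component-by-component sweep.
    z2 is updated given the old z3, then z3 given the new z2."""
    reachable = set()
    for new_z2 in component_support(z1, z3):
        for new_z3 in component_support(new_z2, z4):
            reachable.add((new_z2, new_z3))
    return reachable


def block_support(z1=1, z4=1):
    """Values of (z2, z3) with positive joint conditional weight."""
    return {(a, b) for a, b in product(STATES, repeat=2)
            if q(z1, a) * (1 - P_CAPTURE) * q(a, b)
            * (1 - P_CAPTURE) * q(b, z4) > 0}


print(sorted(sweep_support(1, 1)))  # [(1, 1)]
print(sorted(sweep_support(3, 4)))  # [(3, 4)]
print(sorted(block_support()))      # [(1, 1), (3, 4)]
```

Under these assumptions, a component-wise sweep started at (1, 1) or at (3, 4) can only return to its starting point, whereas drawing the two missing components jointly gives positive weight to both configurations.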
It follows that the Markov chain $(z_m^{(l)}, \theta^{(l)}; l \ge 0)$ generated by the Gibbs algorithm is not ergodic; consequently, the Gibbs sampler yields incorrect results. For example, $\frac{1}{L} \sum_{l=1}^{L} q^{(l)}(1, 1)$ converges to 0.50 when the Gibbs sampler is run from $z_m^{(0)} = (1, 1)$, and to 0.33 when it is run from $(3, 4)$, whereas the Bayesian estimate of $q(1, 1)$ is actually equal to 0.46 (value obtained by implementing the Gibbs sampler using the block-by-block scheme). The data set $y$ chosen to illustrate our point is clearly artificial, but it is easy to check that the non-ergodicity will also occur with any data set $y$ including, for example, one (or more) sequences of type $r \, . \, . \, . \, s$.

Web Table 1

Summary statistics for the northern pike study. $n_t$ is the number of fish captured; $m_t$ is the number of marks present; $R_t$ is the number of fish released (with tags); $r_t$ is the number (from $R_t$) subsequently recovered; and $z_t$ is the number of fish seen before year $t$, not in year $t$, and seen after year $t$. Note that while these statistics are sufficient for the standard Jolly-Seber model, they are not sufficient for the multi-state model, and the full capture histories were used in the example.

Year         n_t    m_t    R_t    r_t    z_t
1994        1663      0   1663    223      0
1995        1939    148   1939    257     75
1996        2669    239   2669    177     93
1997        1769    163   1769    103    107
1998        1817    139   1552     60     71
1999         626     19      1      1    112
2000        1567     33      0      0     80
2001        2076     14      0      0     66
2002        1997     11      0      0     55
2003        2136      9      0      0     46
2004        2033      6      0      0     40
2005        6799     40   6746    157      0
2005-fall   1150    157   1126      0      0
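The definitions in the caption of Web Table 1 can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: capture histories are coded as strings of 0 and 1 (state information is ignored), the `lost` argument is a hypothetical way of flagging fish that were captured but not released with a tag, and the toy histories below are invented for the example and are not pike data.

```python
def jolly_seber_summary(histories, lost=frozenset()):
    """Compute n_t, m_t, R_t, r_t and z_t for each occasion t.

    histories: list of equal-length strings of '0'/'1' (one per animal).
    lost: set of (animal_index, occasion_index) pairs for animals
          captured at that occasion but not released (an assumption
          of this sketch).
    """
    n_occasions = len(histories[0])
    rows = []
    for t in range(n_occasions):
        n = m = R = r = z = 0
        for i, h in enumerate(histories):
            seen_before = "1" in h[:t]
            seen_after = "1" in h[t + 1:]
            if h[t] == "1":
                n += 1                      # captured at t
                m += int(seen_before)       # already marked when captured
                if (i, t) not in lost:
                    R += 1                  # released with a tag
                    r += int(seen_after)    # released and seen again later
            elif seen_before and seen_after:
                z += 1                      # seen before and after, not at t
        rows.append({"n": n, "m": m, "R": R, "r": r, "z": z})
    return rows


# Toy capture histories (hypothetical), three occasions, four animals.
toy = ["110", "101", "011", "100"]
summary = jolly_seber_summary(toy)
print(summary[1])  # {'n': 2, 'm': 1, 'R': 2, 'r': 1, 'z': 1}
```

As the caption notes, such counts discard the state visited at each capture, which is why the full capture histories are needed for the multi-state model.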