Supporting Information

The Lévy flight foraging hypothesis in a pelagic seabird

Stefano Focardi and Jacopo G. Cecere

S1. Detection of reorientation events in LMCRW
S2. Identification of trajectories in experimental trips
S3. Models
S4. Formulas for likelihoods
S5. SAS Code
S6. Computation of quantile residuals
S7. Experimental rank-frequency plots and predicted EXPB and PLB curves
    Plot expected PLB
    Plot expected EXPB
S8. Sequential linearity plots
S9. Rank-frequency plots, expected values and residuals
S10. Bibliography

1. Validation of analysis used to estimate LMCRW

For animals whose angular distribution fitted model F1, we used ŵ to identify the turning points where a reorientation event occurred. The problem is illustrated in Fig. S1A: most of the angles drawn from a uniform distribution are larger (in absolute value) than those drawn from a wrapped Cauchy distribution. We attributed the fraction (1-w) of largest angles to the UD and all the others to the WCD. This is equivalent to establishing a trajectory-specific threshold angle θ: if the absolute value of the turning angle exceeds θ, the fix corresponds to a reorientation event (Fig. S1B). The method is rough, and the number of angles correctly attributed to the two distributions clearly depends on the value of ρ. For ρ → 1 the method is exact, while in the limit ρ → 0 the two distributions are not distinguishable. We therefore have to evaluate how precisely this method estimates the Lévy index, using a set of simulated LMCRW.

Figure S1. A. The pdf of a wrapped Cauchy distribution (red) with a concentration parameter ρ = 0.9 and of a uniform distribution (green), plotted as a function of the turning angle; θ indicates the angular threshold. B. A reorientation (red) occurs when the turning angle exceeds θ in absolute value.

We performed simulations of intermittent movements according to Bartumeus and Levin (2008). During the scanning mode, the movement is characterized by turning angles φ_s drawn from a wrapped Cauchy distribution (WCD) with mean 0 and an assigned concentration parameter ρ:

$$P(\varphi_s) = \frac{1}{2\pi}\,\frac{1-\rho^{2}}{1+\rho^{2}-2\rho\cos\varphi_s}.$$

ρ represents the correlation among turning angles: it is small in sinuous trajectories and large when the movement is straight. For convenience, we assume constant speed, and each move is characterized by a fixed (and short) time length.
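The wrapped Cauchy density and the thresholding rule described above can be illustrated with a minimal Python sketch (Python is used here only for illustration; the analyses in this document were run in SAS). The function names, the inverse-CDF sampler and the rounding of the number of flagged angles are assumptions of this sketch, not part of the original procedure.

```python
import math
import random

def wrapped_cauchy_pdf(phi, rho):
    """Density of a zero-mean wrapped Cauchy distribution at angle phi (radians)."""
    return (1.0 - rho ** 2) / (2.0 * math.pi * (1.0 + rho ** 2 - 2.0 * rho * math.cos(phi)))

def sample_wrapped_cauchy(rho, rng):
    """Draw one turning angle in [-pi, pi] by inverting the wrapped Cauchy CDF."""
    u = rng.random()
    return 2.0 * math.atan((1.0 - rho) / (1.0 + rho) * math.tan(math.pi * (u - 0.5)))

def flag_reorientations(angles, w_hat):
    """Flag the fraction (1 - w_hat) of angles that are largest in absolute value.

    Assumption of this sketch: the number of flagged angles is rounded
    to the nearest integer.
    """
    n_reorient = round((1.0 - w_hat) * len(angles))
    order = sorted(range(len(angles)), key=lambda i: abs(angles[i]), reverse=True)
    flagged = [False] * len(angles)
    for i in order[:n_reorient]:
        flagged[i] = True
    return flagged

# Hypothetical usage: 90 scanning angles (WCD, rho = 0.9) mixed with
# 10 reorientation angles (uniform), then thresholded with w_hat = 0.9.
rng = random.Random(1)
angles = [sample_wrapped_cauchy(0.9, rng) for _ in range(90)]
angles += [rng.uniform(-math.pi, math.pi) for _ in range(10)]
flags = flag_reorientations(angles, w_hat=0.9)
print(sum(flags), "angles flagged as reorientations")
```

Flagging a fixed fraction of the largest angles is the same as applying a trajectory-specific threshold θ, namely the smallest absolute angle among those flagged.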
During the trajectory, reorientations occur, and we assume that the reorientation turning angles φ_r are independent of the φ_s and, more specifically, that they are drawn from a uniform angular distribution (UD). The lengths of the displacements y between reorientations are drawn from a power-law distribution

$$p(y) = c\,y^{-\mu},$$

simulated using a classical method (cf. Edwards 2008):

$$y = y_{\min}\,(1-u)^{1/(1-\mu)},$$

where u is a uniformly distributed random variate in [0,1]. We defined w (1-w) as the fraction of moves drawn from a WCD (UD). No value of w was imposed on the simulations; it was computed a posteriori as the fraction of φ_s (φ_r) in the total number of turning angles in the trajectory.

In line with Morales et al. (2004), the models were fitted using Markov chain Monte Carlo (MCMC) techniques implemented in PROC MCMC in SAS 9.3 (SAS Institute 2009). As vague priors, we used the uniform (0,1) distribution for both w and ρ. For the sake of clarity, we denote ŵ and ρ̂ as the estimated values, and w and ρ as the actual values. We define the bias of the MCMC estimate for a generic variable k as

$$\mathrm{bias}_k = \frac{|k-\hat{k}|}{\hat{k}}.$$

The parameters used in the simulations were chosen to cover the range of values observed in the studied shearwaters. The values used were μ = (1.2, 1.4, …, 2.8), combined with several values of ρ. The step length ymin was set to 100 and ymax to 400000. We used the simulations to assess how accurately the procedure estimates the movement parameters.

1.1. Movement simulations.

Some examples of simulated trajectories are shown in Fig. S2. It is evident that the larger ρ, the less sinuous the movement.

Figure S2. Two simulations obtained with μ = 1.4 and ρ = 0.70 (left) or ρ = 0.95 (right). The black point represents the origin of the simulation.

1.2. Recovery of simulated ρ and w.

The accuracy with which ρ̂ recovers the true ρ is shown in Fig. S3A. The relationship between the bias of ŵ and ρ is shown in Fig. S3B.
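For reference, the inverse-transform sampling of move lengths and the bias measure used in these simulations can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the closed-form maximum-likelihood estimate of μ for an unbounded power law stands in for the MCMC and NLMIXED fits actually used in this document.

```python
import math
import random

def sample_power_law(mu, y_min, rng):
    """Inverse-transform draw: y = y_min * (1 - u) ** (1 / (1 - mu))."""
    u = rng.random()  # u in [0, 1), so (1 - u) is in (0, 1]
    return y_min * (1.0 - u) ** (1.0 / (1.0 - mu))

def relative_bias(true_value, estimate):
    """bias_k = |k - k_hat| / k_hat, as defined in Section 1."""
    return abs(true_value - estimate) / estimate

def mle_mu_unbounded(ys, y_min):
    """Closed-form MLE of mu for an unbounded power law (stand-in estimator)."""
    return 1.0 + len(ys) / sum(math.log(y / y_min) for y in ys)

# Hypothetical usage: simulate move lengths with mu = 2.0 and y_min = 100,
# then measure how well mu is recovered.
rng = random.Random(42)
moves = [sample_power_law(2.0, 100.0, rng) for _ in range(20000)]
mu_hat = mle_mu_unbounded(moves, 100.0)
print("mu_hat:", mu_hat, "bias:", relative_bias(2.0, mu_hat))
```

The sketch draws from the unbounded power law, so individual values may exceed ymax; how the upper bound was enforced in the original simulations is not stated here.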
The bias of ρ̂ is moderate (always <25%), while the bias of ŵ is small (always <10%). Unsurprisingly, in both cases the bias decreases with ρ: when ρ is large, the angular distribution becomes more and more clustered around 0, so that, for a given sample size, it is better identified by the MCMC. The bias of ρ̂ is independent of μ (Fig. S3C) and usually remains below 15%. The bias of ŵ also remains low (always below 7.5%) when plotted as a function of μ (Fig. S3D), although it is larger for large μ values. This analysis establishes the accuracy with which the movement parameters of a complex movement model can be estimated using the proposed methodology.

Figure S3. A. The bias of ρ̂ plotted as a function of the assigned ρ values. B. The bias of ŵ plotted as a function of the assigned ρ values. C. The bias of ρ̂ plotted as a function of the assigned μ values. D. The bias of the recovered ŵ plotted as a function of the assigned μ values. Vertical bars represent standard errors.

1.3. Accuracy in the identification of reorientation angles.

To identify the reorientation angles, we use the estimated ŵ of each trajectory and select the fraction (1-ŵ) of largest angles (in absolute value) as reorientation angles. This is equivalent to establishing a trajectory-specific threshold for angles, as previously suggested by Turchin (1998) and Reynolds et al. (2007). Since reorientation angles are uniformly distributed, we expect that the larger they are (in absolute value), the easier they are to identify. In other words, we will be able to identify only a fraction of the reorientation angles. We also expect that the larger ρ, the more efficient the identification. On average, our identification rate was 0.53±0.023 SE; however, it changed as a function of the assigned ρ and μ values, as shown in Fig. S4. For ρ (Fig. S4A), we observed, as expected, that the identification rate increases with ρ. The pattern for μ (Fig. S4B) is more complex, with a minimum identification rate for μ = 1.6. We believe that it results from the interplay between the linearity of the path (which increases for decreasing μ), which improves detection, and a larger number of reorientation events, which occurs at large μ values. The trade-off between these contrasting effects probably produces the observed minimum.

Figure S4. The fraction of successful identifications of reorientation angles is shown for ρ (A) and μ (B) values.

What really matters for the aim of this paper is how well the assigned μ values are recovered (Fig. S5). Except for very small μ values, the quality of the estimate is quite good, albeit slightly biased low. Note that these results depend on the procedure for ymin selection proposed by Clauset et al. (2009), which we also adopted in the analysis of Cory's shearwaters. We performed a set of numerical experiments (not reported here) without ymin selection, or using values chosen "by eye" from plots of the distribution of distances, and obtained much worse estimates of μ. We conclude that our method has some shortcomings but basically allows us to obtain good-quality estimates.

Figure S5. The recovered μ values are reported as a function of the assigned μ values.

2. Identification of trajectories in experimental trips.

In Fig. S6 we report the actual trajectories and the estimated Lévy displacements.

Figure S6. The recorded trajectory is shown in blue; in red, straight lines connect the points where a reorientation event was identified. Each panel is labelled with the bird identification. The suffixes _L and _T indicate the colony (Linosa and Tremiti, respectively); the suffixes _i and _c indicate the phase of the breeding period (incubation and chick-rearing, respectively).

3. Models

We compared four different models for the move length distribution P(y).
Let us define μ and λ as the parameters of the power-law and exponential distributions, respectively, and ymin and ymax as the smallest and largest values considered. The pdfs are:

Power-law (PL):

$$P(y) = C\,y^{-\mu}, \qquad y \ge y_{\min},$$

with normalization constant $C = (\mu-1)\,y_{\min}^{\mu-1}$.

Power-law bounded (PLB):

$$P(y) = C\,y^{-\mu}, \qquad y_{\min} \le y \le y_{\max},$$

with normalization constant

$$C = \frac{\mu-1}{y_{\min}^{1-\mu} - y_{\max}^{1-\mu}}.$$

This formula is valid only for μ > 1.

Exponential (EXP):

$$P(y) = \lambda \exp\left(-\lambda (y - y_{\min})\right), \qquad y \ge y_{\min}.$$

Exponential bounded (EXPB):

$$P(y) = \frac{\lambda}{\exp(-\lambda y_{\min}) - \exp(-\lambda y_{\max})}\,\exp(-\lambda y), \qquad y_{\min} \le y \le y_{\max}.$$

4. Formulas for likelihoods

We report the log-likelihood formulas for EXP, EXPB, PL, and PLB from Edwards et al. (2007) and Edwards (2011), as well as SAS code for estimating the models' parameters. In the following formulas, y represents the vector of data (move lengths) of size n.

Unbounded power law:

$$\log L(\mu \mid \mathbf{y}) = n \log(\mu-1) + n(\mu-1)\log(y_{\min}) - \mu \sum_{i=1}^{n} \log(y_i)$$

Unbounded exponential:

$$\log L(\lambda \mid \mathbf{y}) = n \log\lambda + n\lambda y_{\min} - \lambda \sum_{i=1}^{n} y_i$$

Bounded power law:

$$\log L(\mu \mid \mathbf{y}) = n \log(\mu-1) - n \log\left(y_{\min}^{1-\mu} - y_{\max}^{1-\mu}\right) - \mu \sum_{i=1}^{n} \log(y_i)$$

Bounded exponential:

$$\log L(\lambda \mid \mathbf{y}) = n \log\lambda - n \log\left(\exp(-\lambda y_{\min}) - \exp(-\lambda y_{\max})\right) - \lambda \sum_{i=1}^{n} y_i$$

5. SAS Code

We used PROC NLMIXED of SAS 9.3 to estimate the parameters of the four distributions above. The option data= indicates the file which contains the n data values, denoted by y. A few variables have to be computed for the estimation: logy is log(yi), ysum is the sum of the data, and logysum is the sum of the log(yi), which is the quantity multiplied by μ in the power-law likelihoods.
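As a cross-check outside SAS, the summary variables and the four log-likelihoods of Section 4 can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and logysum is taken to be the sum of log(yi), which is what the likelihood formulas require.

```python
import math

def summaries(ys):
    """Return n, ysum (sum of the data) and logysum (sum of log(y_i))."""
    n = len(ys)
    ysum = sum(ys)
    logysum = sum(math.log(y) for y in ys)
    return n, ysum, logysum

def loglik_pl(mu, n, logysum, y_min):
    """Unbounded power law."""
    return n * math.log(mu - 1) + n * (mu - 1) * math.log(y_min) - mu * logysum

def loglik_plb(mu, n, logysum, y_min, y_max):
    """Bounded power law (mu > 1)."""
    norm = y_min ** (1 - mu) - y_max ** (1 - mu)
    return n * math.log(mu - 1) - n * math.log(norm) - mu * logysum

def loglik_exp(lam, n, ysum, y_min):
    """Unbounded exponential."""
    return n * math.log(lam) + n * lam * y_min - lam * ysum

def loglik_expb(lam, n, ysum, y_min, y_max):
    """Bounded exponential."""
    norm = math.exp(-lam * y_min) - math.exp(-lam * y_max)
    return n * math.log(lam) - n * math.log(norm) - lam * ysum

# Hypothetical usage: profile the PLB log-likelihood over a grid of mu values,
# mirroring the grid of starting values used by the PARMS statement.
moves = [120.0, 150.0, 300.0, 480.0, 1000.0, 2600.0, 5000.0]
n, ysum, logysum = summaries(moves)
grid = [1.01 + 0.01 * k for k in range(200)]
best_mu = max(grid, key=lambda m: loglik_plb(m, n, logysum, 100.0, 10000.0))
print("grid maximum of the PLB log-likelihood at mu =", best_mu)
```

A grid search is used here only to keep the sketch dependency-free; it is not the optimizer used by PROC NLMIXED.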
* PL unbounded;
proc nlmixed data= ;
  parms mu=1.01 to 3 by 0.1;
  bounds 1<mu<=3;
  logl1=n*log(mu-1)+n*(mu-1)*log(ymin)-mu*logysum;
  model logy ~ general(logl1);
run;

* PL bounded;
proc nlmixed data= ;
  parms mu=1 to 3 by 0.01;
  bounds 1<=mu<=3;
  * note that for mu=1 the PLB model is still normalisable;
  if (mu>1) then logl1=n*log(mu-1)-n*log(ymin**(1-mu)-ymax**(1-mu))-mu*logysum;
  else if (mu=1) then logl1=-n*log(log(ymax)-log(ymin))-logysum;
  model logy ~ general(logl1);
run;

* EXP unbounded;
proc nlmixed data= ;
  parms lambda=0.01 to 1 by 0.1;
  logl1=n*log(lambda)+n*lambda*ymin-lambda*ysum;
  model logy ~ general(logl1);
run;

* EXP bounded;
proc nlmixed data= ;
  parms lambda=0.01 to 1 by 0.1;
  bounds 0<lambda<=1;
  logl1=n*log(lambda)-n*log(exp(-lambda*ymin)-exp(-lambda*ymax))-lambda*ysum;
  model logy ~ general(logl1);
run;

6. Computation of quantile residuals for PLB

Assuming that μ̂ and ymin are fixed, the probability that an observation y lies between ymin and a generic value t reads:

$$F(t) = \int_{y_{\min}}^{t} C\,y^{-\hat{\mu}}\,dy$$

with $C = (\hat{\mu}-1)\,y_{\min}^{\hat{\mu}-1}$, which yields:

$$F(t) = \frac{C}{1-\hat{\mu}}\left(t^{1-\hat{\mu}} - y_{\min}^{1-\hat{\mu}}\right).$$

The quantile residual ri for the observed value yi is:

$$r_i = \Phi^{-1}\left(F(y_i)\right),$$

where Φ^{-1}( ) is the inverse of the cumulative distribution function of the standard normal. To compute ri, we used the function PROBIT of SAS. Apart from the sampling variability of C and μ̂, r should be distributed as N(0,1).

7. Experimental rank-frequency plots and predicted EXPB and PLB curves

A rank-frequency plot represents the cumulative number of moves greater than any given y (ymin≤y≤ymax), plotted as a function of y on double-logarithmic axes. To visualize how well the experimental data y are fitted by the predicted models (here we limit our analysis to PLB and EXPB, which are considered to be more relevant), it is useful to plot the predicted values for these models. This is quite similar to the analysis of residuals above, with the difference that now we are interested in 1-F(t).
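The quantile residuals of Section 6 and the complementary distribution 1-F used for the rank-frequency plots can be sketched in Python as follows. The function names are hypothetical, and two choices are assumptions of this sketch rather than statements about the original analysis: the PLB distribution function uses the bounded normalization constant of Section 3, so that F(ymax) = 1, and the EXPB curve is normalized so that 1-F(ymin) = 1. `statistics.NormalDist().inv_cdf` plays the role of the SAS PROBIT function.

```python
import math
from statistics import NormalDist

def plb_cdf(y, mu, y_min, y_max):
    """F(y) for the bounded power law, with the bounded normalization constant."""
    c = (mu - 1.0) / (y_min ** (1.0 - mu) - y_max ** (1.0 - mu))
    return c / (1.0 - mu) * (y ** (1.0 - mu) - y_min ** (1.0 - mu))

def plb_survival(y, mu, y_min, y_max):
    """1 - F(y) for the bounded power law."""
    return 1.0 - plb_cdf(y, mu, y_min, y_max)

def expb_survival(y, lam, y_min, y_max):
    """1 - F(y) for the bounded exponential, normalized to 1 at y_min."""
    norm = math.exp(-lam * y_min) - math.exp(-lam * y_max)
    return (math.exp(-lam * y) - math.exp(-lam * y_max)) / norm

def quantile_residual(y, mu, y_min, y_max):
    """r = Phi^{-1}(F(y)); y must lie strictly between y_min and y_max."""
    return NormalDist().inv_cdf(plb_cdf(y, mu, y_min, y_max))

# Hypothetical usage with mu_hat = 2.0, y_min = 100 and y_max = 400000.
moves = [150.0, 220.0, 500.0, 1200.0, 9000.0]
residuals = [quantile_residual(y, 2.0, 100.0, 400000.0) for y in moves]
expected_rank = [len(moves) * plb_survival(y, 2.0, 100.0, 400000.0) for y in moves]
print(residuals)
print(expected_rank)
```

Multiplying 1-F(y) by the sample size gives the expected number of moves longer than y, which is the quantity drawn on the rank-frequency plot.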
Plot expected PLB

$$1 - F(y) = \frac{C}{1-\hat{\mu}}\,y_{\max}^{1-\hat{\mu}} - \frac{C}{1-\hat{\mu}}\,y^{1-\hat{\mu}},$$

where, as above,

$$C = \frac{\mu-1}{y_{\min}^{1-\mu} - y_{\max}^{1-\mu}}.$$

Plot expected EXPB

$$1 - F(y) = \frac{1}{\lambda}\left(\exp(-\lambda y) - \exp(-\lambda y_{\max})\right)$$

8. Sequential linearity plots

Let us consider the series xi, yi (i=1,…,N), where yi is the log(move length) and xi is the log(number of points with move length > yi). For each i-th value, we considered the series of data from the first to the i-th point. For this series, we fitted the linear and the quadratic regression models, and the best model was selected using the AIC value (corrected for small sample size). Starting from point N-1 (the N-th point has to be excluded because the number of points with a longer move is 0 and its logarithm cannot be computed), we counted how many points favoured the quadratic regression and stopped (at point k) when the linear model was preferred. The value (N-k)/N is the fraction of data points that are excluded from the linear part of the curve.

9. Rank-frequency plots, expected values and residuals

The rank-frequency plot is an important tool to discriminate whether or not an experimental distribution of move lengths conforms to a Lévy distribution (Benhamou 2007). For each bird characterized by an F1 angular distribution, we tested for compliance with Lévy walks in Fig. S7. We report: left column, the rank-frequency plot of the actual data, with expected values under model PLB (red) and EXPB (blue); central column, the histogram of quantile residuals and the expected normal distribution (in the inset, we report the sample size, the mean value of the residuals (expected 0), the standard deviation (expected 1), and the Kolmogorov-Smirnov test (expected non-significant)); right column, the QQ plot of the residuals.

Figure S7. Model fitting. Each panel is labelled with the bird identification. The suffixes _L and _T indicate the colony (Linosa and Tremiti, respectively); the suffixes _i and _c indicate the phase of the breeding period (incubation and chick-rearing, respectively).

10. Bibliography

Benhamou, S. 2007. How many animals really do the Lévy walk? Ecology, 88, 1962–1969. (doi:10.1890/06-1769.1).

Clauset, A., Shalizi, C.R. & Newman, M.E.J. 2009. Power-law distributions in empirical data. SIAM Review, 51, 661–703.

Dunn, P.K. & Smyth, G.K. 1996. Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5, 236–244.

Edwards, A.M. et al. 2007. Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature, 449, 1044–1048.

Edwards, A.M. 2011. Overturning conclusions of Lévy flight movement patterns by fishing boats and foraging animals. Ecology, 92, 1247–1257. Ecological Archives E092-104-A1.

Morales, J.M., Haydon, D.T., Frair, J., Holsinger, K.E. & Fryxell, J.M. 2004. Extracting more out of relocation data: building movement models as mixtures of random walks. Ecology, 85, 2436–2445.

Reynolds, A.M., Smith, A.D., Reynolds, D.F., Carreck, N.L. & Osborne, J.L. 2007. Honey bees perform optimal scale-free searching flights when attempting to locate a food source. The Journal of Experimental Biology, 210, 3763–3770.

Turchin, P. 1998. Quantitative Analysis of Movement: Measuring and Modeling Population Redistribution in Animals and Plants. Sinauer Press, Sunderland, Mass., USA.