Supplementary Document for Point Pattern Modeling for Degraded Presence-Only Data over Large Regions

Avishek Chakraborty & Alan E. Gelfand & Adam M. Wilson & Andrew M. Latimer & John A. Silander

1 Posterior consistency of the intensity

We concluded Section 3 of the paper by noting that the posterior of the prevalence intensity surface is theoretically well behaved. Here we discuss the relevant result in detail.

Posterior consistency in generalized linear models (GLM) with a Gaussian process (GP) prior on the link function is discussed in, e.g., Ghosal and Roy (2006) and Tokdar and Ghosh (2007). Establishing posterior consistency is useful for deducing the rate of posterior convergence (Ghosal et al., 2000). Sufficient conditions for posterior consistency in the case of non-i.i.d. observations with Gaussian errors were established in Choi and Schervish (2007). Consistency results for binary data, with assumptions on the covariance function of the GP prior for the link function, were developed in Ghosal and Roy (2006). The corresponding results for a Poisson model were discussed in Pillai (2008), for the case where the support of the prior distribution is restricted to the class of functions $\Lambda_\gamma = \{\lambda(\cdot) : \lambda(\cdot) > \gamma\}$ for some fixed $\gamma > 0$. We observe that under a log GP prior distribution on the intensity function (with some regularity conditions), the assumption of a uniform lower bound can be relaxed to that of strict positivity. In particular, we can prove a modified version of Remark 4.2.5 of Pillai (2008).

Theorem 1. Let $y(s) \overset{ind}{\sim} f(\cdot, s)$, where $f(\cdot, s)$ is the Poisson distribution with mean $\lambda(s)$ corresponding to the covariate $s \in D$, and $D$ is a compact subset of $\mathbb{R}^2$. Place a prior $\Pi$ on the space of functions $\eta(\cdot) = \log\lambda(\cdot)$ through a GP with mean $\mu_0(\cdot)$ and covariance kernel $\sigma(\cdot, \cdot)$. Let $\Lambda_0 = \{\lambda_0(\cdot) \in C_D : \forall s \in D,\ \lambda_0(s) > 0\}$, where $C_D$ is the set of all continuous functions on $D$.
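Then under regularity conditions on $\mu_0$, $\sigma$, we have, for any $\lambda_0(\cdot) \in \Lambda_0$,

\[ \Pi\Big(\lambda : \int |\lambda(s) - \lambda_0(s)| \, dQ_n(s) > \epsilon \,\Big|\, y_1, y_2, \ldots, y_n\Big) \to 0 \quad \text{in } P_{\lambda_0}, \]

where $Q_n$ is the empirical distribution function of $s_1, s_2, \ldots, s_n$.

Proof. Following Choi and Schervish (2007), define

\[ \Lambda_i(\lambda_0, \lambda) = \log \frac{f(y(s_i) \mid \lambda_0(s_i))}{f(y(s_i) \mid \lambda(s_i))}, \qquad K_i(\lambda_0, \lambda) = E_{\lambda_0}\big(\Lambda_i(\lambda_0, \lambda)\big), \qquad V_i(\lambda_0, \lambda) = \mathrm{Var}_{\lambda_0}\big(\Lambda_i(\lambda_0, \lambda)\big). \]

Then, to prove the theorem, it is sufficient to show the following.

(A1) There exists a set $B$ such that
(i) given $\epsilon > 0$, $\Pi(B \cap \{\lambda : K_i(\lambda_0, \lambda) < \epsilon\ \forall i\}) > 0$,
(ii) $\sum_{i=1}^{\infty} \frac{V_i(\lambda_0, \lambda)}{i^2} < \infty$, $\forall \lambda \in B$.

(A2) There exist a sequence of test functions $\{\Phi_n\}$, sets $\{\Lambda_n\}$, and constants $C_1, C_2, c_1, c_2 > 0$ such that for any sequence of neighborhoods $U_n$ of $\lambda_0$,
(i) $\sum_{n=1}^{\infty} E_{\lambda_0} \Phi_n < \infty$,
(ii) $\sup_{\lambda \in U_n^c \cap \Lambda_n} E_\lambda(1 - \Phi_n) < C_1 e^{-c_1 n}$,
(iii) $\Pi(\Lambda_n^c) < C_2 e^{-c_2 n}$.

As in Ghosal and Roy (2006), we assume the covariance kernel for $\eta(\cdot)$ is $\sigma(x, y) = \tau^{-1} \sigma_0(\beta x, \beta y)$. Let us start with the true intensity function $\lambda_0(\cdot) \in \Lambda_0$. We first want to show that for any $\epsilon > 0$,

\[ P(\{\lambda : \|\lambda - \lambda_0\|_\infty < \epsilon\}) > 0. \]

As a continuous function on a compact set, $\lambda_0(\cdot)$ is bounded below and above by some constants $l_{\lambda_0}, u_{\lambda_0} > 0$ respectively. So $\eta_0(s) = \log(\lambda_0(s))$ is continuous on $D$ (since $\log$ is a continuous function on any compact subinterval of $\mathbb{R}^+$). Thus, from Choi and Schervish (2007), $\eta_0$ belongs to the closure of the RKHS of $\sigma_0$ in the sup metric.

It is enough to show that, given $\epsilon > 0$, there exists $\delta > 0$, which can be constructed from $\epsilon$, such that $\|\eta - \eta_0\|_\infty \le \delta$ implies $\|\lambda - \lambda_0\|_\infty \le \epsilon$. Indeed, by Theorem 4 of Ghosal and Roy (2006), under regularity assumptions (see Appendix 1) on the prior of $\tau$, $\beta$ and on the smoothness of $\sigma_0(\cdot, \cdot)$, $\mu_0(\cdot)$, for any $\delta > 0$,

\[ P(\{\eta(\cdot) : \|\eta - \eta_0\|_\infty \le \delta\}) > 0, \]

and we will be done.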
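The continuity step just stated (a sup-norm ball of radius δ around η0 maps into a sup-norm ball of radius ε around λ0) can be checked numerically. The following is a minimal sketch and not part of the original proof: the intensity `lambda0`, the grid on the unit square and the uniform perturbations are hypothetical choices made only for illustration, and the radius δ = log(1 + ε/u), with u an upper bound for λ0, is assumed here as one admissible choice.

```python
import math
import random


def delta_for(eps, upper):
    """Log-scale radius delta = log(1 + eps / upper), so that upper * (e^delta - 1) = eps."""
    return math.log1p(eps / upper)


def lambda0(x, y):
    """Hypothetical strictly positive, continuous intensity on [0, 1]^2 (illustration only)."""
    return 0.5 + math.exp(-3.0 * ((x - 0.3) ** 2 + (y - 0.6) ** 2))


def worst_sup_gap(eps, n_grid=30, n_draws=50, seed=0):
    """Perturb eta0 = log(lambda0) by at most delta and record the largest |lambda - lambda0|."""
    rng = random.Random(seed)
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    base = [lambda0(x, y) for x in grid for y in grid]
    upper = max(base)  # plays the role of the upper bound u on the grid
    delta = delta_for(eps, upper)
    worst = 0.0
    for _ in range(n_draws):
        for lam0 in base:
            # Any perturbation with absolute value at most delta is allowed;
            # uniform noise is an arbitrary choice.
            eta = math.log(lam0) + rng.uniform(-delta, delta)
            worst = max(worst, abs(math.exp(eta) - lam0))
    return delta, worst
```

Calling `worst_sup_gap(0.1)` returns the radius δ used and the largest observed gap; up to floating-point rounding, the gap should not exceed the chosen ε, in line with the bound |λ − λ0| ≤ u (e^δ − 1).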
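Now, for some $\delta > 0$ (to be specified later), consider

\[ D_{\eta_0, \delta} = \{\eta : \|\eta - \eta_0\|_\infty < \delta\}. \]

Given $\eta \in D_{\eta_0, \delta}$, for $\lambda = e^{\eta}$,

\[ \|\lambda - \lambda_0\|_\infty = \sup_x |e^{\eta(x)} - e^{\eta_0(x)}| \le \sup_x |e^{\eta_0(x)}| \, \sup_x |e^{(\eta - \eta_0)(x)} - 1| \le u_{\lambda_0} \sup_x |e^{(\eta - \eta_0)(x)} - 1|, \]

and also

\[ -\delta < \eta(x) - \eta_0(x) < \delta \quad \forall x, \]
\[ e^{-\delta} - 1 < e^{\eta(x) - \eta_0(x)} - 1 < e^{\delta} - 1 \quad \forall x, \]
\[ |e^{\eta(x) - \eta_0(x)} - 1| < \max(e^{\delta} - 1,\ 1 - e^{-\delta}) \quad \forall x. \]

So for $\|\eta - \eta_0\|_\infty < \delta$ we have $\|\lambda - \lambda_0\|_\infty < u_{\lambda_0} \max(e^{\delta} - 1, 1 - e^{-\delta})$. Hence, for

\[ \delta(\epsilon) = \min\Big(\log\big(1 + \tfrac{\epsilon}{u_{\lambda_0}}\big),\ \log\big(1 - \tfrac{\epsilon}{u_{\lambda_0}}\big)^{-1}\Big), \]

we have $\|\lambda - \lambda_0\|_\infty \le \epsilon$.

Then, given $\epsilon > 0$, define

\[ B_{\lambda_0, \epsilon} = \Big\{\lambda(\cdot) = e^{\eta(\cdot)} : \eta \in D_{\eta_0,\ \frac{\epsilon}{2 u_{\lambda_0}} \wedge \delta(\frac{\epsilon}{2})}\Big\}, \]

so that $P(B_{\lambda_0, \epsilon}) > 0$ and $\lambda \in B_{\lambda_0, \epsilon} \Rightarrow \|\lambda - \lambda_0\|_\infty < \frac{\epsilon}{2}$.

Now, for $\lambda \in B_{\lambda_0, \epsilon}$, the Poisson likelihood gives

\[ K_i(\lambda_0, \lambda) = -(\lambda_0 - \lambda)(s_i) + \lambda_0(s_i)(\eta_0 - \eta)(s_i), \]

so

\[ |K_i(\lambda_0, \lambda)| < \|\lambda_0 - \lambda\|_\infty + u_{\lambda_0} \|\eta - \eta_0\|_\infty < \frac{\epsilon}{2} + u_{\lambda_0} \frac{\epsilon}{2 u_{\lambda_0}} = \epsilon. \]

Again, for $\lambda \in B_{\lambda_0, \epsilon}$ we have

\[ V_i(\lambda_0, \lambda) \le u_{\lambda_0} \|\eta - \eta_0\|_\infty^2 < \frac{\epsilon^2}{4 u_{\lambda_0}}, \]

a bound that does not depend on $i$. So we have $\sum_{i=1}^{\infty} \frac{V_i}{i^2} < \infty$, and (A1) is verified.

To verify (A2), we work with the sieve

\[ \Lambda_n = \{\lambda = e^{\eta(\cdot)} : \|D^{\omega}(\eta)\|_\infty < M_n,\ |\omega| < \alpha\}. \]

Condition (iii) of (A2) for this sieve was verified in Ghosal and Roy (2006). The rest follows exactly as in Pillai (2008), pp. 70–73.

2 Additional figures and reading materials

Sections 1 and 2 of the paper contain an extensive discussion of recent approaches in ecology for modelling species distributions. Here we list some additional references, apart from the ones cited in the main article, that may be useful for readers interested in further detail.

References

Austin, M. P., Belbin, L., Meyers, J. A., Doherty, M. D. and Luoto, M. (2006) Evaluation of statistical models used for predicting plant species distributions: Role of artificial data and theory. Ecological Modelling 199(2): 197–216.

Choi, T. and Schervish, M. J. (2007) On posterior consistency in nonparametric regression problems. Journal of Multivariate Analysis 98: 1969–1987.

Dudík, M., Schapire, R. and Phillips, S. (2006) Correcting sample selection bias in maximum entropy density estimation.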
In Advances in Neural Information Processing Systems 18 (eds: Weiss, Y., Schölkopf, B. and Platt, J.) 323–330. MIT Press, Cambridge, MA.

… boosted regression trees. Journal of Animal Ecology 77: 802–813.

Ghosal, S., Ghosh, J. K. and Van der Vaart, A. W. (2000) Convergence rates of posterior distributions. Annals of Statistics 28: 500–531.

Ghosal, S. and Roy, A. (2006) Posterior consistency of Gaussian process prior for nonparametric binary regression. Annals of Statistics 34: 2413–2429.

Graham, C. H. and Hijmans, R. J. (2006) A comparison of methods for mapping species ranges and species richness. Global Ecology and Biogeography 15: 578–587.

Guisan, A. and Zimmermann, N. E. (2000) Predictive habitat distribution models in ecology. Ecological Modelling 135: 147–186.

Guisan, A. and Thuiller, W. (2005) Predicting species distribution: offering more than simple habitat models. Ecology Letters 8: 993–1009.

Hernandez, P. A., Graham, C. H., Master, L. L. and Albert, D. L. (2006) The effect of sample size and species characteristics on performance of different species distribution modeling methods. Ecography 29(5): 773–785.

Hooten, M. B., Larsen, D. R. and Wikle, C. K. (2003) Predicting the spatial distribution of ground flora on large domains using a hierarchical Bayesian model. Landscape Ecology 18: 487–502.

Phillips, S. J., Anderson, R. P. and Schapire, R. E. (2006) Maximum entropy modeling of species geographic distributions. Ecological Modelling 190(3-4): 231–259.

Pillai, N. S. (2008) Posterior consistency of nonparametric Poisson regression models. PhD thesis, Duke University, 66–77. http://stat.duke.edu/people/theses/PillaiNS.html

Schulman, L., Toivonen, T. and Ruokolainen, K. (2007) Analyzing botanical collecting effort in Amazonia and correcting for it in species range estimation. Journal of Biogeography 34: 1388–1399.

Tokdar, S. T. and Ghosh, J. K. (2007) Posterior consistency of logistic Gaussian process priors in density estimation.
Journal of Statistical Planning and Inference 137(1): 34–42.

Vaughan, I. P. and Ormerod, J. (2003) Improving the quality of distribution models for conservation by addressing shortcomings in field collection of training data. Conservation Biology 17: 1601–1611.