Null models in Ecology
Diane Srivastava
Sept 2010

The big questions
• What constitutes a null model?
• What biological assumptions are behind the deterministic constraints in null models?
• How do these constraints affect our ability to detect “interesting” patterns?
• Is a process or a pattern assumed to be stochastic in null models?
• Are neutral models null models?

What is a null model?
Gotelli and Graves (1996): “A null model is a pattern-generating model that is based on randomization of ecological data…Certain elements of the data are held constant and others are allowed to vary stochastically…The randomization is designed to produce a pattern that would be expected in the absence of a particular ecological mechanism”

Two views of null models:
• Statistical descriptions of randomized data (Simberloff 1983)
• Simulations of random assembly processes (Colwell and Winkler 1984, Gotelli and Graves 1996)

Example: A null model for the trophic-rank hypothesis
• From data, construct a regional pool of individuals
• Populate each bromeliad with individuals by randomly sampling the regional pool
• Stop populating a bromeliad when it reaches its individual capacity (related to size)
• Calculate the difference in species-area curves between predators and prey
• Repeat 1000s of times
• Compare with the observed trophic-rank effect
[Figure: histogram of simulated Predator – Prey Z values (x-axis) against Frequency (y-axis)]

Questions about this example:
• What aspects of the data were held constant?
• What was randomized?
• What ecological process(es) were removed by the null model?
• What biological assumptions did I make?
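The steps of the bromeliad example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis behind the slide: the species names, bromeliad sizes and counts are hypothetical, each bromeliad's capacity is taken to be its observed number of individuals, the pool is sampled with replacement, and z is estimated as the slope of log(richness + 1) on log(size) so that empty guilds do not break the logarithm.

```python
import math
import random

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def z_difference(sizes, communities, guild):
    """Predator z minus prey z; z = slope of log(richness + 1) on log(size)."""
    log_size = [math.log(s) for s in sizes]
    z = {}
    for g in ("predator", "prey"):
        richness = [len({sp for sp in comm if guild[sp] == g}) for comm in communities]
        z[g] = ols_slope(log_size, [math.log(r + 1) for r in richness])
    return z["predator"] - z["prey"]

def null_distribution(communities, sizes, guild, n_iter=1000, seed=0):
    """Re-populate every bromeliad from the regional pool, n_iter times."""
    rng = random.Random(seed)
    pool = [ind for comm in communities for ind in comm]  # regional pool of individuals
    capacities = [len(comm) for comm in communities]      # held constant (assumption)
    null = []
    for _ in range(n_iter):
        # Randomized: which individuals end up in which bromeliad.
        simulated = [rng.choices(pool, k=cap) for cap in capacities]
        null.append(z_difference(sizes, simulated, guild))
    return null

# Hypothetical data: one list of individuals (by species) per bromeliad.
guild = {"damselfly": "predator", "tabanid": "predator",
         "mosquito": "prey", "chironomid": "prey", "scirtid": "prey"}
sizes = [10.0, 40.0, 160.0, 640.0]
observed = [
    ["mosquito"] * 4 + ["chironomid"] * 2,
    ["mosquito"] * 6 + ["chironomid"] * 5 + ["scirtid"] * 3,
    ["mosquito"] * 9 + ["chironomid"] * 8 + ["scirtid"] * 6 + ["damselfly"] * 2,
    ["mosquito"] * 12 + ["chironomid"] * 10 + ["scirtid"] * 9
    + ["damselfly"] * 4 + ["tabanid"] * 2,
]

observed_effect = z_difference(sizes, observed, guild)
null = null_distribution(observed, sizes, guild, n_iter=500)
# One-sided randomization p-value: how often is the null effect at least as large?
p_value = (1 + sum(d >= observed_effect for d in null)) / (1 + len(null))
print(f"observed z difference = {observed_effect:.3f}, p = {p_value:.3f}")
```

In this sketch the regional pool and the number of individuals per bromeliad are held constant, and only the identity of the individuals in each bromeliad is randomized.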
Null models assume:
• Ecological hypotheses are falsifiable (sensu Popper)
• The simplest hypothesis is the best (Occam’s razor)
• Ecological processes can be removed from the data by simulation (now the fun begins…)

A brief history of null models
• First null models tested S/G ratios in communities (Maillefer 1929, Williams 1947)
• Rarefaction techniques for species richness estimates developed in 1960s (Sanders 1968)
• Null models developed for niche overlap and body size limits in 1970s (Brown 1973, Sale 1974)
• First neutral model for relative abundance: Caswell 1976
• Passive sampling models for species-area relationships in 1980s (Colwell 1981)

Null models developed for:
- Species-area relationship
- Diversity constancy through time
- Food-web structure
- Niche overlap
- Limiting similarity in body sizes
- Species:genus ratios in communities
- Phylogenetic diversity within communities
- Nestedness
- Species co-occurrence
What process might your null model remove?
What large dataset might you randomly sample to generate a community-level null pattern?

Species co-occurrence
Do species distributions reflect negative effects of competition?
If so, species should not be distributed randomly: some species combinations should be missing or rare.
Checkerboard distribution:
             Site A   Site B
Species A      1        0
Species B      0        1
(Unfortunately this pattern can occur for other reasons, like habitat filtering!)

Species co-occurrence
1960s/70s: Chris Pielou developed a null model for co-occurrence that involved randomizing a species list amongst sites and comparing the result to the observed pattern. Also developed a variance test for co-occurrence.
1975: Jared Diamond published 7 “assembly rules” based on observations of birds on islands.
1979: Connor and Simberloff countered Diamond with randomization tests of co-occurrence
1980s: Gilpin and Diamond argued that C&S’s constraints bias the randomization test toward not finding patterns; Schluter’s variance ratio test
1990s: Gotelli and Entsminger release ECOSIM

Species co-occurrence: constraints
Pielou and Pielou (1968): Rows and columns both equiprobable.
  Species equally likely to provide next colonist, sites equally likely to receive next colonist.
Connor and Simberloff (1979): Row and column totals both fixed + restricted each species to sites whose species richness was in its observed range.
  Regionally rare species remain rare, species-rich islands remain species rich, incidence functions of species are preserved.
Gilpin and Diamond (1982): Cell probability proportional to observed row and column totals.
  Sites are targets that differ in likelihood of being hit; some species have traits that predispose them to be the next colonist.

Species co-occurrence: constraints
• Site accessibility
• Site saturation
• Species regional abundance
• Species colonization ability
Deterministic constraint, or modeled as stochastic?
Paradox: To incorporate these constraints, we need to assume that the observed regional occurrence of species or occupancy of sites is independent of competition, in order to test for effects of competition on species occurrence and site occupancy!

Species co-occurrence: the pairs?
• Randomization tests look at overall patterns in co-occurrence; there is little meaning to pairs with high C-scores (Gotelli and Graves 1996)
• C-scores of individual pairs are influenced by the abundance of species
• Need to conduct separate randomization tests of just the pair to determine significance

Species co-occurrence in the age of meta-analyses
• Meta-analysis of 96 datasets suggests that most have nonrandom associations (fixed row, fixed column method; Gotelli and McCabe 2002)

What is a null model? (Revisited)
Graham Bell (2000) argued that there are two distinct types of null model: statistical and dynamic.
Statistical: “output varies stochastically” (e.g. randomization tests of co-occurrence)
Dynamic: “input to the system varies stochastically” (e.g. neutral model: mutation and demography are stochastic)

Statistical null vs. neutral
                          (Statistical) null   Neutral
What is stochastic?       pattern              process
Species interactions?     No                   No
Species equivalence?      No                   Yes

Random processes do not cause random patterns!
Neutral models show correlation of species between sites (top) or through time (bottom) simply because of dispersal limitation.
Such non-random patterns are not expected under a statistical null model.
[Figure: spatial covariance (top) and temporal covariance (bottom) (Bell et al 2006, Bell 2000)]
Simulated neutral communities have significant C-scores (Ulrich 2004, Bell et al. 2005)

Can neutral processes explain nonrandom species associations?
Perhaps only in part! C-scores in real data are much higher than expected from neutral models.
Neutral model: mean = 0.5 (Ulrich 2004)
100 datasets: mean = 2.67 (Gotelli and McCabe 2002)

ECOSIM Lab: West Indies finches
West Indies finches, from Gotelli and Abele 1982
Your task: Experiment with all possible combinations of equiprobable, proportional and fixed row/column totals! Which changes in constraints have the most effect?
Gotelli 2000: Fixed row methods have lowest type 1 (& 2) errors
Truly random
SIM2: Column totals equiprobable
SIM4: Column totals proportional
SIM9: Column totals fixed (Connor & Simberloff 1979)
Finches
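To see how the choice of constraints changes the outcome without ECOSIM at hand, the comparison can be sketched in plain Python. This is an illustrative sketch, not ECOSIM's implementation: the presence-absence matrix is hypothetical (it is not the finch data), the function names `sim2` and `sim9` simply borrow the labels used above, and the fixed-fixed algorithm is approximated by repeatedly swapping 2x2 checkerboard submatrices starting from the observed matrix. The C-score is computed as the mean over species pairs of (r_i − S)(r_j − S), where r_i and r_j are the species' numbers of occupied sites and S is the number of sites they share.

```python
import random
from itertools import combinations

def c_score(m):
    """Mean checkerboard units per species pair (rows = species, columns = sites)."""
    units = []
    for a, b in combinations(m, 2):
        shared = sum(x * y for x, y in zip(a, b))
        units.append((sum(a) - shared) * (sum(b) - shared))
    return sum(units) / len(units)

def sim2(m, rng):
    """Row totals fixed, columns equiprobable: shuffle each species' row."""
    return [rng.sample(row, len(row)) for row in m]

def sim9(m, rng, n_swaps=200):
    """Row and column totals fixed: swap random 2x2 checkerboard submatrices."""
    m = [row[:] for row in m]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Swappable only if the submatrix is [[1, 0], [0, 1]] or [[0, 1], [1, 0]].
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

def null_tail(m, algorithm, n_iter=200, seed=1):
    """Observed C-score and the fraction of null matrices scoring at least as high."""
    rng = random.Random(seed)
    observed = c_score(m)
    null = [c_score(algorithm(m, rng)) for _ in range(n_iter)]
    return observed, sum(s >= observed for s in null) / n_iter

# The two-species checkerboard has exactly one checkerboard unit.
print(c_score([[1, 0], [0, 1]]))  # → 1.0

# Hypothetical 5 species x 6 sites presence-absence matrix.
matrix = [
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
for name, algorithm in (("SIM2", sim2), ("SIM9", sim9)):
    observed, tail = null_tail(matrix, algorithm)
    print(name, "observed C-score:", round(observed, 3), "upper-tail fraction:", tail)
```

Comparing the upper-tail fractions from the two algorithms on the same matrix shows how much the verdict depends on which margins are held fixed.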