J. Chave – Pattern and scale in ecology

SUPPLEMENTARY INFORMATION
For “The problem of pattern and scale in ecology: what have we learned in 20 years?”
Jérôme Chave

SI1. Measuring the modularity of a network

The search for modules in complex systems has often been equated to that of ‘communities’, groups of actors who interact more than expected by chance (Freeman 1977). Although the theory has a long history (de Solla Price 1965, Freeman 1977), many methods to measure patterns of modularity have been developed in the wake of internet social networks in the early 2000s.

Modularity may be given a formal definition, as follows. In any network, biological or not, a module is defined as a relatively autonomous subset of components (nodes), such that any two components within the module are highly connected, but such that a component within the module is loosely connected with any component outside the module. One measure of modularity, denoted by $Q$, measures the excess of links within modules relative to the number expected by chance, summed over all modules (Girvan & Newman 2002). We study a graph composed of $N$ nodes which are related by $K$ undirected links. If we assign each node to one of $M$ predefined communities, we may define the fraction $E_{ij}$ of links connecting nodes in community $i$ to nodes in community $j$ (such that $\sum_{i=1}^{M} \sum_{j=1}^{M} E_{ij} = 1$). Then, $a_i = \sum_{j=1}^{M} E_{ij}$ is the fraction of links joining at least one node in community $i$. If the links are randomly placed between nodes, irrespective of the community, then $E_{ij} = a_i a_j$. Modularity then is $Q = \sum_{i=1}^{M} (E_{ii} - a_i^2)$.

The concept of betweenness centrality is another useful way to quantify the notion of modularity (Freeman 1977, Girvan & Newman 2002). It measures how important an edge is as a connection route between a priori defined modules.
Formally, let $\sigma_{ij}$ be the number of shortest paths between nodes $i$ and $j$, with $\sigma_{ij} \geq 1$. We focus on an edge $e$, and ask how many of these shortest paths connecting $i$ and $j$ go through $e$. Let $\sigma_{ije}$ be the number of these paths connecting $\{i, j\}$ and going through edge $e$. The betweenness of edge $e$, $B_e$, is defined as the sum, over all pairs $\{i, j\}$, of the ratio $\sigma_{ije} / \sigma_{ij}$ (Girvan & Newman 2002, Newman & Girvan 2004). Ma & Zeng (2003) used such an approach to understand the design principles and the patterns of organization in large metabolic networks, by identifying the most central metabolites of the networks based on their topology. Salathé & Jones (2010) investigated the spread of disease in networks with community structure, and found that community structure has a major impact on disease dynamics. Using the ideas of betweenness centrality, they found that in networks with strong community structure, immunization interventions targeted at individuals bridging communities are more effective than those simply targeting highly connected individuals.

With the rapid development of high-throughput DNA sequencing technologies, it is becoming obvious that species need to be defined statistically. In the past, microbiologists have used 16S prokaryotic ribosomal DNA sequences to infer the similarity among microbial strains, and this has resulted in astounding discoveries, including altogether new lineages (Pace 1997). Early examples are the discovery of the cyanobacterium Prochlorococcus marinus in the late 1980s, and that of the picoplankton Ostreococcus tauri in 1994. Traditionally, techniques of species discovery have been predicated on the assumption that 16S DNA sequences with 97% similarity or more delimit species.
However, the problem of sequence clustering is often far more complex than a simple 97% threshold (Meyer & Paulay 2005), as the rapid development of DNA-based identification of eukaryotic organisms and the large programs of DNA barcoding for a wide range of clades have offered plenty of opportunities to verify (Moritz & Cicero 2004, Knowles & Carstens 2007).

Instead, techniques for modularity discovery should be implemented, and these are not unrelated to the multivariate analyses invented by Fisher (1936) to analyse Anderson’s (1936) Iris flower dataset. One intriguing technique to cluster data into natural groups (modules, operational taxonomic units, or others) was developed based on the physics of diluted magnets (Wiseman et al. 1998). Another application has made use of so-called Markov clustering (Enright et al. 2002), used for defining molecular taxonomic units in fungi using sequence data (Zinger et al. 2009). With the explosion of environmental genomics approaches, there is no doubt that these sequence clustering techniques will become crucial to avoid statistical artefacts in the process of diversity discovery and community detection.

References

Anderson, E. (1936). The species problem in Iris. Ann. Miss. Bot. Garden, 23, 457-509.
de Solla Price, D.J. (1965). Networks of scientific papers. Science, 149, 510-515.
Enright, A.J., van Dongen, S. & Ouzounis, C.A. (2002). An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res., 30, 1575-1584.
Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugenics, 7, 179-188.
Freeman, L.C. (1977). A measure of centrality based on betweenness. Sociometry, 40, 35-41.
Girvan, M. & Newman, M.E.J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA, 99, 7821-7826.
Ma, H.-W. & Zeng, A.-P. (2003). The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics, 19, 1423-1430.
Meyer, C.P. & Paulay, G. (2005). DNA barcoding: error rates based on comprehensive sampling. PLoS Biol., 3, e422.
Newman, M.E.J. & Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E, 69, 026113.
Pace, N.R. (1997). A molecular view of microbial diversity and the biosphere. Science, 276, 734-740.
Salathé, M. & Jones, J.H. (2010). Dynamics and control of diseases in networks with community structure. PLoS Comp. Biol., 6, e1000736.
Wiseman, S., Blatt, M. & Domany, E. (1998). Superparamagnetic clustering of data. Phys. Rev. E, 57, 3767-3783.
Zinger, L., Coissac, E., Choler, P. & Geremia, R.A. (2009). Assessment of microbial communities by graph partitioning in a study of soil fungi in two alpine meadows. Appl. Environm. Microbiol., 75, 5863-5870.

SI2. Coarsening dynamics in discrete spatial models

Durrett & Levin (1994) famously illustrated the importance of being spatial and discrete by modelling three classical biological problems using four models: individual-based versus mean-field models, and spatial versus non-spatial models. Here I offer a simplified version along the same lines of reasoning, with two spatial models which both exhibit domain growth dynamics. The spatial dynamics of these models is illustrated in Fig. 7 of the main text.

The voter model is the first model. It is usually defined using electors regularly placed on a grid, each holding one of two possible opinions (Clifford & Sudbury 1973), but here I translate the same model in terms of a species coexistence model. In a square lattice, every site is occupied by an individual of one of two possible species. The total number of individuals is $N$.
The dynamics follows how each site changes its species occupancy as a result of local interactions. During an infinitesimal time step, one randomly chosen individual dies, and it is replaced by the offspring of a randomly chosen neighbour (of four possible neighbours). A macroscopic time step consists of $N$ such draws (so that each individual is chosen once on average). As time $t$ goes by, clusters occupied by the same species grow in size, and the number of neighbouring pairs with different species (henceforth, zones of tension) declines proportionally to $1/\ln(t)$ (Fig. 7), as may be shown based on the duality of the voter model with a system of coalescing random walks. Now assume that, with some probability $p > 0$, a vacated site can be occupied by offspring produced anywhere in the lattice. With even a small amount of long-distance dispersal, the domain growth depicted in Fig. 7 is destroyed, and the model no longer shows any coarsening.

A second toy model is the majority model (known as the zero-temperature Ising model in physics, Krapivsky et al. 2010). It is defined similarly to the voter model, but the local update rules are slightly different: all four neighbours send an offspring into the vacated cell, and the majority always wins. In case of a tie, the winner is chosen at random. In this model, isolated clusters are at a disadvantage and the number of zones of tension declines much faster than in the voter model, as $1/\sqrt{t}$. Importantly, the domain growth behaviour of this model is not altered by the addition of a probability $p > 0$ of long-distance dispersal (up to a point $p = p_c$ where a critical transition occurs).

These two models are lattice-based, and they could be approximated by non-spatial models such as in the example offered by Durrett & Levin (1994) or following methods developed in Bolker & Pacala (1997).
In fact, for these particular cases, much is known about the macroscopic behaviour of the models (for overviews, see Hinrichsen 2000 and Krapivsky et al. 2010).

References

Bolker, B. & Pacala, S.W. (1997). Using moment equations to understand stochastically driven spatial pattern formation in ecological systems. Theor. Pop. Biol., 52, 179-197.
Clifford, P. & Sudbury, A.W. (1973). A model for spatial conflict. Biometrika, 60, 581-588.
Durrett, R. & Levin, S.A. (1994). The importance of being discrete (and spatial). Theor. Pop. Biol., 46, 363-394.
Hinrichsen, H. (2000). Non-equilibrium critical phenomena and phase transitions into absorbing states. Adv. Phys., 49, 815-958.
Krapivsky, P.L., Redner, S. & Ben-Naim, E. (2010). A Kinetic View of Statistical Physics. Cambridge University Press.
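
The update rules of the voter and majority models described in SI2 can be illustrated with a minimal Python sketch. It is not the code used to produce Fig. 7 of the main text. The function names, the periodic boundary conditions, the random initial condition and the name `p` for the long-distance dispersal probability are all assumptions made for illustration. The sketch records the number of zones of tension (neighbouring pairs occupied by different species) after each macroscopic time step.

```python
import random


def make_lattice(L, rng):
    """Random L x L lattice; each site holds species 0 or 1 (assumed start)."""
    return [[rng.randrange(2) for _ in range(L)] for _ in range(L)]


def neighbours(L, x, y):
    """Four nearest neighbours, with periodic boundaries (an assumption)."""
    return [((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]


def tension(lattice):
    """Number of neighbouring pairs occupied by different species."""
    L = len(lattice)
    count = 0
    for x in range(L):
        for y in range(L):
            s = lattice[x][y]
            # Each pair is counted once: only the two "forward" neighbours.
            count += s != lattice[(x + 1) % L][y]
            count += s != lattice[x][(y + 1) % L]
    return count


def sweep(lattice, rule, rng, p=0.0):
    """One macroscopic time step: N = L * L single-site replacements."""
    L = len(lattice)
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)  # the individual that dies
        if rng.random() < p:
            # Long-distance dispersal: the parent is drawn from the whole
            # lattice (in this sketch it may be the vacated site itself).
            lattice[x][y] = lattice[rng.randrange(L)][rng.randrange(L)]
            continue
        states = [lattice[i][j] for i, j in neighbours(L, x, y)]
        if rule == "voter":
            # Offspring of one randomly chosen neighbour.
            lattice[x][y] = rng.choice(states)
        elif rule == "majority":
            ones = sum(states)
            if ones == 2:
                lattice[x][y] = rng.randrange(2)  # tie: random winner
            else:
                lattice[x][y] = 1 if ones > 2 else 0
        else:
            raise ValueError("rule must be 'voter' or 'majority'")


def run(rule, L=32, sweeps=20, p=0.0, seed=0):
    """Tension counts at t = 0, 1, ..., sweeps for the chosen rule."""
    rng = random.Random(seed)
    lattice = make_lattice(L, rng)
    series = [tension(lattice)]
    for _ in range(sweeps):
        sweep(lattice, rule, rng, p)
        series.append(tension(lattice))
    return series
```

Calling `run("voter")` and `run("majority")` with the same lattice size gives two series that can be compared by eye. A small lattice and a few sweeps are far too little to check the asymptotic $1/\ln(t)$ and $1/\sqrt{t}$ laws quoted above, so the sketch shows the rules themselves and does not test the scaling.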