1 Joining the dots: an automated method for constructing food webs 2 from compendia of published interactions 3 4 Clare Graya, b, David H Figueroac, Lawrence N. Hudsond, Athen Mae, Dan Perkinsb, Guy Woodwardb* 5 6 a Queen Mary University of London, School of Biological and Chemical Sciences, London, E1 4NS, 7 UK b 8 Department of Life Sciences, Imperial College London, Silwood Park, Buckhurst Road, Ascot, 9 Berkshire,SL5 7PY,UK c 10 11 12 d Universidad Católica de Temuco, Escuela de Ciencias Ambientales, Montt 56 Temuco Chile Department of Life Sciences, Natural History Museum, Cromwell Road, London, SW7 5BD, UK e Queen Mary University of London, School of Electronic Engineering and Computer Science, 13 London, E1 4NS, UK 14 *Corresponding author 15 Email: guy.woodward@imperial.ac.uk (GW) 16 17 1 18 19 20 Abstract Food webs are important tools for understanding how complex natural communities are 21 structured and how they respond to environmental change. However their full potential has yet to be 22 realised because of the huge amount of resources required to construct them de novo. Consequently, 23 the current catalogue of networks that are suitable for rigorous and comparative analyses and 24 theoretical development still suffers from a lack of standardisation and replication. 25 Here, we present a novel R function, WebBuilder, which automates the construction of 26 food webs from taxonomic lists, and a dataset of trophic interactions. This function works by 27 matching species against those within a dataset of trophic interactions, and ‘filling in’ missing trophic 28 interactions based on these matches. We also present a dataset of over 20,000 freshwater trophic 29 interactions, and use this and four well-characterised freshwater food webs to test the method. 30 The WebBuilder function facilitates the generation of food webs of comparable quality to 31 the most detailed published food webs, but at a fraction of the research effort or cost. Furthermore, it 32 matched and often outperformed a selection of predictive models, which are currently among the best, 33 in terms of capturing key properties of empirical food webs. The method is simple to use, systematic 34 and, perhaps most importantly, reproducible, which will facilitate (re-) analysis and data sharing. 35 Although developed and tested on a sample of freshwater food webs, this method could easily be 36 extended to cover other types of ecological interactions (such as mutualistic interactions). 37 38 Keywords – food web construction, gut contents analysis, food web modelling 39 1 - Introduction 40 Characterising food webs (networks representing trophic interactions between species) and 41 other ecological networks (networks which represent any type of ecological interaction, such as 42 pollination) can help us understand and, ultimately predict multispecies systems’ responses to changes 43 in environmental conditions (Thompson et al., 2012; Tylianakis et al., 2010). Food webs can reveal 44 subtle but important changes in the biotic interactions that underpin ecosystem functioning, stability, 45 and resilience to perturbations - higher-level phenomena that cannot be inferred from studying the 46 nodes (i.e., species or populations) alone (Gray et al., 2014; Thompson et al., 2012). 47 Despite the many advantages of a network-based approach to ecology, significant challenges 48 need to be overcome, particularly in terms of gathering interaction data. Interactions occur between 49 individuals and data are often collected at this level: for example, via collection, rearing and 2 50 identification of every leaf miner, and subsequent leaf miner parasitoid along a transect to build 51 herbivore-parasitoid networks (Macfadyen et al., 2011; as in Memmott et al., 2000), or through 52 dissecting and identifying consumer gut-contents via microscopy (as in Layer et al., 2013). Such 53 laborious methods require substantial investment of time and resources, and it can take many 54 thousands of lab hours to characterise just one food web, which even then may still be undersampled 55 for links between its rarer members (see Table 1; e.g. Woodward et al. 2005; Olito & Fox 2014). 56 Many hundreds or thousands of individuals of each species are often needed to fully characterise the 57 full set of feeding links within a food web (e.g. Ings et al., 2009), which is rarely practical given the 58 financial and time restraints of research funding. In addition, such comprehensive sampling is often 59 destructive and can impose undesirable disturbance on study systems. Consequently, empirical food 60 webs are often incompletely described and constructed from relatively small sample sizes (e.g. 61 Kaiser-Bunbury et al., 2010; Layer et al., 2013). This limits the conclusions that can be drawn and the 62 number of comparable food webs that are available both across and within studies (Bascompte et al., 63 2003; e.g. Briand, 1983; Olesen et al., 2007) although exceptions to this exist (Bascompte et al., 2003; 64 Cohen and Mulder, 2014). Most studies still have patchy and differing levels of sampling effort and 65 taxonomic resolution, making meta-analyses difficult or even inappropriate: the ability to construct 66 large numbers of realistic, comparable food webs across multiple systems would, therefore, help 67 realise the true potential of network approaches (Gray et al., 2014). 68 69 Table 1. Methods for constructing food webs, with their advantages and disadvantages. Method Observation of evidence of interaction (e.g. feeding trials or gut contents analysis) Extrapolating from previously published interactions (e.g. WebBuilder function) Advantages High confidence in links produced. Disadvantages Very slow and labour-intensive. Rare interactions are often missed. Interaction type is biased by the method employed, e.g. the prey of suctorial predators cannot be determined through gut contents analysis. Fair confidence in links produced. Rare interactions can be included. Interactions from multiple studies determined through different methods can be easily incorporated. Reliant on the quality of the data contained within the reference dataset. Can only be used to construct ‘cumulative’ or ‘summary’ food webs, i.e. temporal or spatial changes in feeding behaviour cannot be incorporated. Low effort and quick. 3 Examples Woodward et al. (2005) Macfadyen et al. (2011) Henson et al. (2009) Ledger et al. (2012) Hall & Raffaelli (1991) Goldwasser et al. (1993) Havens (1993) Piechnik et al. (2008) Pocock et al. (2012) Layer et al.(2013) Cohen & Mulder (2014) Strong & Leroux (2014) Predictive models Ecological rules and theory can be incorporated. Require prior knowledge of the structure of the food web in order to optimize parameter values. Low effort and quick. Many perform poorly at predicting individual interactions, even when food web structure is predicted well. Cohen et al. (1985) Williams & Martinez (2000) Petchey et al. (2008) Allesina & Pascual (2009) Allesina (2011) Olito & Fox (2014) 70 71 72 Ecological networks are often constructed by incorporating species interactions from the 73 published literature (Table 1) and many food webs are constructed entirely in this manner (Cohen and 74 Mulder, 2014; e.g. Goldwasser et al., 1993; Havens, 1993; Strong and Leroux, 2014), while other 75 food webs contain a blend of observational and extrapolated data (Layer et al., 2013; M. J. O. Pocock 76 et al., 2012). By filling in ‘missing’ trophic interactions to a given species list, the implicit assumption 77 is made that, if a given pair of species have been observed to interact at one site, they will interact in 78 the same way at other sites where they co-occur (at least in terms of a feeding link between the 79 species being realised, or not). Food webs built through this method are often referred to as 80 ‘summary’ or ‘cumulative’ food webs as they represent all potential interactions (of a particular type, 81 for instance trophic interactions within a food web) between species of a particular community, rather 82 than a snapshot in time. As such, food webs build through this method are unsuitable for detecting 83 changes in species feeding behaviour across sites or over time, but are highly effective for detecting 84 broad macro-ecological trends such as changes in food web structure across environmental gradients 85 (K Layer et al., 2010; Mulder and Elser, 2009; e.g. Piechnik et al., 2008). 86 This approach can be taken further, by assigning interactions of species on the basis of 87 taxonomic similarity: i.e., species within the same genus are assumed to have identical links if a link 88 has been established through direct observation for at least one congener (e.g. Goldwasser et al., 1993; 89 K Layer et al., 2010). This process is often used when constructing summary food webs for species 90 the interactions of which have not been fully characterised (e.g., as revealed from yield-effort curves) 91 to minimise potential biases arising from under-sampling, i.e. including only observed links would 92 otherwise significantly underestimate food web complexity, especially among the rarer and/or more 93 obscure taxa (Woodward et al 2010). Recent work (Eklof et al., 2012) has provided justification for 94 this approach, by highlighting the strong influence that taxonomy has in determining the structure of 95 food webs. Thus, given the prevalence of undersampling in even relatively well-described food webs, 96 dietary data extrapolated from the literature and generalised taxonomically can potentially produce far 97 more complete and realistic summary food webs than those that rely solely on observations made in a 98 particular locale. 99 Despite the prevalence of these methods for constructing summary food webs in the literature 100 (Cohen and Mulder, 2014; e.g. Goldwasser et al., 1993; Havens, 1993; Katrin Layer et al., 2010; M. J. 4 101 O. O. Pocock et al., 2012; Strong and Leroux, 2014), there is still no standard method for inferring 102 feeding interactions, resulting in inconsistencies among studies, even within the same ecosystem type. 103 This is especially problematic because authors rarely state explicitly which links have been observed 104 or extrapolated, or the source from which they have been drawn, or how closely the previously 105 published interactions match those reported in their particular study, making replication impossible 106 and preventing other researchers from scrutinising published interactions fully (but see Strong and 107 Leroux, 2014). 108 Recent research has sought to develop predictive models of the structure of ecological 109 networks (Eklof et al., 2012; Gravel et al., 2013; Olito and Fox, 2014; e.g. Rohr et al., 2010). Simple 110 rules based on ecological theory have been used to model and predict the structure and topology of 111 food webs, the most successful of which include deterministic models based on information on 112 species’ body sizes, for example the ‘Difference’, ‘Ratio’, and ‘Difference/Ratio’ models (Allesina, 113 2011) and the Allometric Diet Breadth Model (ADBM; Petchey et al. 2008) which incorporates 114 allometric scaling and optimal foraging parameters. Whilst these models have been developed 115 primarily to advance ecological theory, they provide a possible tool through which food webs could 116 be built de novo in order to address questions about network structure across environmental gradients 117 or scales. However, to achieve their best performance (proportion of correctly predicted links) these 118 models require some prior knowledge about the number of links in the network. For instance, for the 119 models mentioned above a researcher is required to go through a parameter optimisation procedure, 120 by fixing the number of links, values of constants and exponents can be derived, by maximizing the 121 number of links correctly predicted. When constructing a network for the first time for a particular 122 system, a research would be required to fix the number of links to an expected value which would bias 123 the network structure towards that which the researcher expected to find. 124 Additionally, recent work (Olito and Fox, 2014) has highlighted that while predictive models 125 might perform well at predicting metrics of network structure, they tend to perform poorly at 126 predicting pairwise interactions (Sáyago et al., 2013; e.g. Vázquez et al., 2009; Verdú and Valiente- 127 Banuet, 2011; Vizentin-Bugoni et al., 2014), so whilst they may predict network structural metrics 128 well, they are doing so for the wrong reason as the underlying biological mechanisms have not been 129 fully incorporated into the predictive models (Petchey et al., 2011). To the best of our knowledge, the 130 models used here have not, up until now, been used to predict network structure de novo, as this is not 131 the scenario for which they were developed. 132 Given the limitations of constructing food webs from observation of interactions or predictive 133 models, we need an automated, repeatable and reliable method of building local food webs that can be 134 applied across studies and, ultimately, different ecosystem and network types. Here, we introduce a 135 method, the WebBuilder function that assembles food webs by systematically assigning links for 136 taxa based upon a given set of user-defined rules applied to a dataset of known trophic interactions. 137 We provide an implementation of our method for the R statistical modelling language (R Core Team, 5 138 2013), building upon the methods and data structures provided by the Cheddar R package (Hudson et 139 al., 2013). We tested the method on four highly resolved freshwater food webs which have had their 140 interactions characterised through gut contents analysis, as these represent some of the most complete 141 food webs described to date, as a test case for our proof-of-concept. Specifically, our key aims were 142 to: 143 144 145 146 1. Collate a dataset of trophic interactions in a standard format to act as an example system in which to test this method. 2. Automate the process of constructing food webs from this reference dataset in a repeatable and reliable manner. 147 3. Compare the performance of this method with the structure of food webs with ‘known’ 148 interactions, i.e. those which have been built through observation of the interactions. 149 150 4. Compare the performance of this method with another way of predicting food web structure; the ADBM, Difference, Ratio and Difference/Ratio models. 151 2 - Methods 152 2.1 - Dataset of trophic interactions 153 We collated a dataset of 20,823 pairwise trophic interactions among species (or the next highest level 154 of resolution available, usually genus), from 51 different data sources, most of which were primary 155 literature (Table A.1, Appendix A). It contains trophic interactions between primarily UK freshwater 156 species, including 203 producer taxa, 593 invertebrate taxa, 24 fish taxa, 10,348 producer-animal 157 links, 9,531 animal-animal links and 944 detritus-animal links. When the necessary data were not 158 available in the original publications, we contacted the authors directly, where possible, to obtain the 159 raw data. The taxonomy of every resource and consumer has been standardised through the Global 160 Names Resolver (http://resolver.globalnames.biodinfo.org/) using the Global Biodiversity Information 161 Facility dataset. For every resource-consumer link the taxonomy (species, genus, subfamily, family, 162 order, class) of both is given, along with life-stage information, if relevant, and a literature reference 163 for the source of the link. This dataset builds upon the collection assembled by Brose et al (2005), and 164 to the best of our knowledge, represents the largest standardised collection of trophic links for 165 freshwater organisms. This dataset is available to download at 166 https://sites.google.com/site/foodwebsdataset/ (doi: 10.5281/zenodo.13751) and is designed to be 167 easily updated by the iterative addition of new data (details of how to submit new data to the dataset 168 are given on the website), allowing its content to improve over time, in an analogous manner to 169 molecular-based bioinformatics datasets. New data will be subjected to a quality assurance procedure 170 prior to inclusion in the dataset. Specifically, all taxa will be parsed through Global Names Resolver 171 (http://resolver.globalnames.biodinfo.org/) using the Global Biodiversity Information Facility dataset. 172 Additionally the data will be eyeballed for irregularities. It is anticipated that this data will exists as an 6 173 open access resource, and as such the community of researchers who access it will report any errors 174 they find so they can be double checked and removed. New iterations of the dataset can be produced, 175 hosted on the webpage alongside the original, and assigned a new DOI, allowing researchers to cite 176 exactly which version of the dataset they have used for their research, allowing analyses to be 177 repeated using identical versions to those cited in a given study, if required in the future. 178 179 2.2 - The WebBuilder function 180 The method of constructing ecological networks by extrapolating from previously published 181 interactions is implemented in a new R function - WebBuilder (see Appendix B for code). The user 182 is required to provide the following; firstly a list of taxa (i.e. nodes) in the community of interest (step 183 I, Figure 1), this data can be gathered from multiple sources and could be in the form of survey or 184 biomonitoring data. Secondly, for each node, the minimum level of taxonomic generalisation 185 (explained below; step II), and the taxonomic classification of each node (step III). Lastly a registry – 186 a dataset of known trophic interactions, including taxonomic classification (step IV), as example of 187 which is published here, but which can also be created by the user or obtained elsewhere. It is 188 recommended that the user resolve the taxonomy of their taxa list and registry using the same 189 procedure so as to ensure that taxa are matched correctly, if the user were using the registry provided 190 here they would need to parse their taxa list through Global Names Resolver 191 (http://resolver.globalnames.biodinfo.org/) using the Global Biodiversity Information Facility dataset. 192 The function searches the registry for every possible combination of resource-consumer interactions 193 (for N taxa there are N2 possible trophic interactions) which match the provided taxa list given the 194 specified level of taxonomic generalisation. 195 The minimum level of taxonomic generalisation determines the taxonomic rank at which 196 matches are made, thus generalising the resources or consumers of the candidate node to the species, 197 genus, subfamily, family, order etc level, as specified in the input (step II, Figure 1). For instance, a 198 researcher might decide to ascribe the level of taxonomic generalisation of ‘genus’ to the mayfly 199 Baetis fuscatus, allowing it to be matched with the more commonly studied species Baetis rhodani in 200 the dataset, and take on the appropriate feeding interactions of that species, i.e. those which include 201 taxa also present on the provided taxa list (see the first Scenario in Figure 2 ). This level of taxonomic 202 generalisation is selected based on knowledge of a candidate node’s trophic interactions in relation to 203 its sister taxa (i.e. if all members of the same taxonomic unit can be assumed to have the same trophic 204 interactions or not), and this can be tailored depending on the resource/consumer status of the node. 205 For example, consumers of the larvae of the non-biting midge subfamily Tanypodinae tend to be 206 trophic generalists and would likely consume other larvae of the family Chironomidae, while it is not 207 likely that the resources of Tanypodinae larvae (which are predominantly predatory) would be shared 208 by all Chironomidae larvae (many of which are grazers or filter feeders). Hence it would not be 7 209 appropriate to assign the trophic generalisation level ‘family’ to both resource and consumer 210 interactions of Chironomidae. Instead a researcher might ascribe the ‘resource method’ for 211 Tanypodinae as ‘family’, but the ‘consumer method’ as ‘subfamily’ (see the second Scenario in 212 Figure 2). The function output contains references to the original empirical links, the number of 213 matches that were found and the taxonomic level at which those matches were found, so links can be 214 additionally screened and scrutinised post hoc, and analysis can be repeated easily because the 215 function output contains the necessary information. Example R code is supplied (Appendix C). 216 2.3 - Comparing the WebBuilder function with empirical food webs 217 The WebBuilder function was validated on a collection of highly-resolved stream food webs which 218 have had their trophic interactions characterised through direct observation; Broadstone Stream 219 (Woodward et al., 2010), Afon Hirnant (Gilljam et al., 2011; Woodward et al., 2010), Tadnoll Brook 220 (Edwards et al., 2009) and the summary food web for the replicated four reference Mill Stream side- 221 channels (Ledger et al., 2012; Woodward et al., 2012). The replicates for the Mill Stream data were 222 pooled to aid comparison with the other food webs, which were all constructed as a single aggregate 223 food web. The Broadstone and Afon Hirnant food webs contained only trophic interactions between 224 macro-invertebrates, the Tadnoll food web contained interactions between macro-invertebrates and 225 fishes and the Mill Stream data contain interactions between macroinvertebrates, algae and detritus. 226 When the WebBuilder function was used to generate the empirical food webs, in turn each 227 respective local dataset was first removed from the global dataset, so each food web was generated in 228 the absence of its own link information (to remove circularities). 229 The performance of the WebBuilder function was evaluated by calculating the True Skill 230 Statistic (TSS; Allouche et al., 2006). This statistic was used as it can be digested into its component 231 parts to provide information on the types of differences between the empirical and generated food 232 webs, and builds upon the most commonly used metric which is simply the proportion of links 233 correctly generated (e.g. Allesina, 2011; Petchey et al., 2008; Woodward et al., 2010). This statistic 234 was chosen over likelihood based approaches because we were not interested so much in the 235 efficiency of these predictive models, more the biological realism of the generated food webs 236 (Petchey et al., 2011). The TSS is calculated from the following formula: 237 πππ = (ππ − ππ)/[(π + π)(π + π)] 238 where a is the number of links which were correctly generated by the function (the True Positives 239 Rate; TPR), b the number of links generated by the function but not observed empirically, c the 240 number of links not generated by the function but were observed empirically and d the number of 241 links neither generated by the function nor observed empirically. TSS score values range from -1 to 1, 242 where a score of -1 represents a generated food web that is the inverse of the empirically observed one 243 (no observed empirical links are seen in the generated food web, and every non-link in the empirical 8 244 food web is present in the generated food web), and 1 representing a generated food web having the 245 exact same links as the empirically observed one. 246 Each empirical food web was generated using the level of taxonomic generalisation 247 considered most appropriate (see Appendix D), this mostly consisted of exact and genus level matches 248 although some family and order matches were used. To test how the generated food webs compared to 249 their empirical counterparts, a series of network metrics were calculated; number of links (L), linkage 250 density (L/S; where S is the number of nodes), connectance (C, where C=L/S2), generality (the 251 average number of resources per consumer), vulnerability (the average number of consumers per 252 resource), and proportion of top, intermediate and basal nodes (with cannibalistic links removed). The 253 difference between the generated and empirical network metric was tested with paired Wilcoxon 254 signed rank tests. 255 To test how the quality of the generated food webs varied with dataset size, the dataset was 256 randomly subsampled, in sequential steps of 5% from 5-100%, of the original dataset size, and then 257 used to generate each food web. Each subsample size was repeated five times and each empirical food 258 web was generated in the absence of its own food web data as above, to remove circularities. For each 259 node within each network, the same level of taxonomic generalisation was used as above. 260 To test how the quality of the pairwise interactions generated by the WebBuilder function 261 varied with the level of taxonomic generalisation, each food web was built using exact, genus, family 262 or order taxonomic generalisation for all nodes. The degree (the number of links into or out of a 263 particular node), generality, and vulnerability for every node in the generated food web was compared 264 with that in the empirical network. The difference between the two for every node was recorded so 265 that a positive score represented interactions ‘missed’ by the WebBuilder function, and a negative 266 score represented ‘extra’ interactions not found in the empirical food web. The distribution of these 267 scores gives an indication of how well the WebBuilder function predicted pairwise interactions 268 across the whole network: i.e., if, on average, it tended to ‘miss’ more interactions, or tended to pick 269 up ‘extra’ interactions. To test if the mean was different from zero (indicating no difference in the 270 quality of pairwise interactions between the generated and empirical food webs) a one sampled t-test 271 was used. 272 273 2.4 - Comparing the WebBuilder function to theoretical food web models 274 The performance of the WebBuilder function was compared with examples of some of the best- 275 performing predictive models currently available: the ‘Difference’, ‘Ratio’ and ‘Difference/Ratio’ 276 models (Allesina, 2011) as well as the Allometric Diet Breadth Model (ADBM; with “ratio” handling 277 time, Petchey et al., 2008). The ‘Difference’, ‘Ratio’ and ‘Difference/Ratio’ models all generate food 278 web links on the basis of body size, (either the difference between consumer body size and resource 279 body size, the ratio between the two, or the difference multiplied by the ratio). The ADBM builds on 9 280 this and incorporates allometries of body size and foraging behaviour of individual consumers to 281 model food web structure (Allesina, 2011 for more detailed explanations; see Petchey et al., 2008). 282 Detritus nodes were first excluded from the Mill Stream food web because these nodes had no body 283 size or abundance data. For the ‘Difference’, ‘Ratio’ and ‘Difference/Ratio’ models two parameters 284 required optimisation, a and b. For the ADBM we used parameter values for the mass to attack rate 285 constant (a), resource mass to attack rate exponent (ai) and consumer mass to attack rate exponent (aj) 286 from the literature (Rall et al., 2012) rather than through parameter optimisation as in Petchey et al. 287 (2008), so as to simulate a situation for which the WebBuilder function was designed, where food 288 webs are being generated for the first time with no prior knowledge of the system other than the 289 species richness. For two parameters (mass to handling time constant, h.a; mass to handling time 290 critical ratio, b) we were unable to find information in the literature with which to value these 291 parameters, so went through the process of optimisation. For all models this was achieved by 292 constructing food webs with a range of values for each parameter, and selecting those food webs 293 which had a number of links that was within the range set by the WebBuilder function, i.e. if the 294 WebBuilder function generated K links, and there were L empirical links and K-L=t we selected all 295 possible solutions within the range L-t:L+t, to make the comparison with the WebBuilder function 296 fair. Note that for some food webs the difference between L and K was large, leading to large 297 variation in the food web sizes generated by these models. Indeed for the Afon Hirnanlt food web this 298 range fell below zero, and so the range was arbitrarily set to be the same proportional size as that of 299 the Tadnoll food web, which had the next highest range. Parameter optimisation was conducted 300 without using the connectance of the empirical food webs, hence although the same data have been 301 used, results will vary from previous publications. Prior knowledge of the connectance of food webs 302 would not be possible if a food web were being built de novo, so here we are using these models in a 303 different way from their original application. 304 3 - Results 305 3.1 - Comparing the WebBuilder function with empirical food webs 306 When we constructed food webs from random subsets of the dataset, the quality (as measured by TSS 307 scores) of the generated food webs improved as the number of records in the dataset increased, 308 allowing more complete resource and consumer interactions to be ascribed to each taxa (Figure 3). 309 The strength of this relationship was food web specific, for instance Broadstone and Afon Hirnant did 310 not continue to improve beyond a dataset size of about 25%. These food webs are relatively simplistic 311 compared to Tadnoll and Millstream, and so the WebBuilder function reached its optimum 312 performance when generating these food webs with a fraction of the total dataset. Tadnoll and 10 313 Millstream did not reach their asymptotes suggesting that more data is needed to improve upon the 314 quality of their generated food webs. 315 The level of taxonomic generalisation for each node was important for the quality of the 316 generated food web; if the taxonomy of a given node list was generalised too far (typically beyond the 317 family level) then the ascribed links became unrepresentative and the food web become over- 318 connected resulting in an increased FPR and lower TSS score (Figure A.1, Appendix A). At the scale 319 of individual trophic interactions, the difference in degree, generality and vulnerability was generally 320 positive when matching taxa exactly or at the genus level, and becomes progressively more negative 321 as the taxonomic generalisation increased, indicating that the WebBuilder function was ‘missing’ 322 links when matching nodes exactly or at the genus level, and including progressively more links the 323 further the taxonomy was generalised (Figure 4). For Afon Hirnant and Tadnoll there was no 324 significant difference in the generality of consumers between the generated and empirical food webs 325 when taxa were matched at the genus level, and there was no significant difference in vulnerability of 326 resources for the Tadnoll food web when match at the genus level. This suggests that matching taxa at 327 the genus level for these food webs produces the most ‘accurate’ pairwise interactions. For all other 328 food webs and levels of taxonomic generalisations the generated links were different from that of the 329 empirical food web (Figure 4). 330 The occurrence of each food web’s nodes in the dataset is given in Table 2. The coverage of 331 these within the reduced dataset (food web data were removed from the dataset when used to generate 332 the food web for that site) varied between 1,497 (Broadstone) and 6,704 occurrences (Mill Stream) 333 (Table 2). Even at the family level some nodes from Broadstone, Afon Hirnant and Mill Stream were 334 not represented in the dataset, meaning that those nodes needed to be generalised further still for the 335 WebBuilder function to generate their links. These nodes tended to be rare taxa which were poorly 336 represented in the dataset. The generated food webs (see Appendix S4 for the generated trophic links) 337 had similar network metrics to the empirical food webs (Table 3), although the proportion of top 338 nodes was consistently lower in the generated food webs, and the proportion of intermediate and basal 339 nodes was consistently higher. All generated food web metrics were found to be similar to that of 340 their empirical counterparts (paired Wilcoxon signed rank test, p=>0.05). The TSS ranged from 0.407 341 (Broadstone) to 0.557 (Tadnoll Brook). Two food webs (Broadstone and Afon Hirnant) contained 342 nodes that were found to have predatory links in the empirical food web but were not predicted to 343 have any by the WebBuilder function, and vice versa many nodes distributed across all the food 344 webs were predicted to have consumer links but were not found to have any empirically (Figure 5). 345 346 Table 2. The representation of the food web taxa within the full dataset and partial dataset (i.e., diet 347 data gathered from a food web were excluded from the generation of its own inferred food web). 11 Number of appearances in Percentage of nodes appearing in partial dataset at dataset each taxonomic level Food web Full data Partial data set set exact genus family Broadstone 2,196 1,497 81% 84% 94% Afon Hirnant 2,945 2,266 72% 79% 92% Tadnoll 12,405 4,314 84% 91% 100% Mill Stream 11,545 6,704 87% 96% 97% 348 349 351 (C, where C=L/S2), Generality, Vulnerability, proportion of top, intermediate and basal species of the 352 empirical and generated food webs. The performance of the WebBuilder function (relative to the 353 original empirical food web) is summarised by the TSS statistic (which gives an overall measure of 354 performance), and TPR (the proportion of links correctly generated). All food web metrics for the 355 generated food webs were found to be similar to that of their empirical counterparts (paired Wilcoxon 356 signed rank test, p = >0.05) 4.43 6.93 2.82 7.58 2.91 4.91 9.19 8.64 0.158 0.247 0.085 0.23 0.05 0.085 0.124 0.117 13 9.84 7.67 15.87 9.33 12.45 16.98 16.63 iat e Bas al 124 194 93 250 169 285 680 639 Inte rme d C Top 357 358 L/S Vul nera empirical generated empirical Afon Hirnant generated empirical Tadnoll generated empirical Mill Stream generated Broadstone L Gen Network bili t y Table 3. The number of links (L), linkage density (L/S, where S=number of nodes), the connectance eral it y 350 4.33 7.48 4 8.5 3.23 5.59 14.15 11.7 0.68 0.04 0.58 0 0.69 0.07 0.46 0.19 0.29 0.64 0.12 0.45 0.21 0.31 0.19 0.32 0.04 0.25 0.24 0.39 0.1 0.53 0.35 0.41 TSS 0.407 0.376 0.418 0.24 0.557 0.372 0.547 0.531 359 3.2 - Comparing the WebBuilder function method to theoretical food web models 360 The percentage of links correctly predicted (TPR) by the ADBM ranged from 12-43%, for the 361 Difference model it was 0-46%, Ratio model it was 0-51% and the Difference/Ratio it was 0-54% 362 (Figure 6). All webs generated by the WebBuilder function had higher TPR and TSS scores than 363 the median values for the Difference, Ratio and Difference /Ratio models (Figure 6). In general the 364 WebBuilder function had higher TPR and TSS scores than the ADBM, however the TPR score for 365 Tadnoll and Afon Hirnant were similar to the median ADBM TPR score for that particular food web 12 TPR 366 (as opposed to the overall median). Additionally the TSS score for Tadnoll generated by the 367 WebBuilder function was similar to the median ADBM TSS score (Figure A.2, Appendix A). 368 369 4 - Discussion 370 4.1 - Strengths and weaknesses of the WebBuilder function 371 Here we have demonstrated a systematic and reproducible method for building ecological networks 372 from compilations of previously observed interactions. The WebBuilder function facilitates 373 comparability across studies, re-analysis and data sharing. Although developed in the context of 374 freshwater food webs, given its simplicity and generality the WebBuilder function could be easily 375 applied to other systems, such as terrestrial food webs or even mutualistic networks. Plenty of other 376 datasets already exist which could be exploited similarly to produce comparable, reproducible 377 networks from marine and terrestrial systems (e.g., Barnes, 2008; Database of Insects and their Food 378 Plants; http://www.brc.ac.uk/dbif/). 379 The WebBuilder function is an effective tool for constructing summary ecological networks for 380 the first time. The overall performance (TSS) of the WebBuilder function exceeded that of the 381 ADBM, Difference, Ratio and Difference /Ratio models. The proportion of correctly predicted links 382 (TPR) was similar to or exceeded that of the ADBM. The ADBM cannot predict links for nodes that 383 have no body-size information – either because it is not known or because the concept is meaningless 384 for the node, such as detrital resources. This problem does not apply to the WebBuilder function. 385 The ADBM, Difference, Ratio and Difference/Ratio models have been used to generate the food webs 386 presented here before, and have performed better than we have achieved here (Allesina, 2011; 387 Petchey et al., 2008; Woodward et al., 2010), however to achieve this accuracy the generated food 388 webs were constrained to have the same connectance as the empirical food webs, an approach not 389 available when building a food web for the first time. Indeed there were instances here that the TPR 390 and TSS of modelled food webs exceeded that of the WebBuilder function, but from the range of 391 possible food webs generated by these models, it is impossible to select the most ‘accurate’ one 392 without knowledge of the expected number of links. The WebBuilder function does not rely on 393 prior knowledge of the food web, only on the correct identification of the nodes, thus reducing biases 394 and restrictions. 395 It is perhaps unfair to compare the performance of the WebBuilder function to that of the 396 ADBM, Difference, Ratio and Difference/Ratio models due to the inherent differences in the 397 mechanisms through which they operate, and indeed it is not our intention for this exercise to be taken 398 as a criticism of these alternative approaches. Rather, we have compared them here in order to place 399 the WebBuilder method in the broader context of some of other more widely-used predictive methods 13 400 currently available. Comparing our approach with the performance of the ADBM essentially 401 represents a test of, and a means of improving, our understanding of the mechanistic theory behind 402 these trophic interactions. Comparing the food webs produced by our approach with empirical food 403 webs represents a test of the quality of the underlying data held within the dataset of trophic 404 interactions. The WebBuilder function should be used as tool with which to construct large 405 collections of food webs with which to test our understanding of food web structure across 406 environmental gradients. The WebBuilder function is particularly suited to constructing food webs 407 for data-poor systems, e.g., where there is no information available about the abundance or body size 408 of nodes, with the only information being a list of species present. Clearly the ADBM or other 409 predictive models would not be suited to these conditions, as they were never designed to work in this 410 way. The WebBuilder function, however, would be able to generate reasonably realistic food webs if 411 given a reference dataset of relevant trophic interactions. The WebBuilder function is adaptive, and 412 can be improved upon over time; for instance, by increasing the size and coverage of the dataset of 413 interactions. Hence it requires a substantial amount of data to perform well, unlike the predictive 414 models analysed here. These types of methods can be viewed as complementary: a researcher might 415 use both in conjunction in order to harness the advantages of both to better predict food web structure: 416 indeed we envisage combining the WebBuilder function and other predictive approaches in parallel 417 to build and understand food webs. 418 The four food webs presented here are among the most highly resolved and complete freshwater 419 food webs published to date, yet the links are still under-sampled for many nodes (Woodward et al., 420 2010), due to methodological issues and logistical constraints on sampling effort. The WebBuilder 421 function can help to overcome these issues. Firstly, it can take many hundreds of individuals to 422 characterise a species diet (Ings et al., 2009) and thus the interactions between rare consumers and 423 rare resources are often under-sampled. The WebBuilder function helps to overcome this as rare 424 interactions need only be observed once in the dataset of previously published interactions in order to 425 be incorporated into applicable food webs as they are constructed: i.e., potentially the “global diet” of 426 a species is held within the dataset, and can be expanded in future data collections. Secondly, the 427 method of observing interactions often limits the types of interactions which can be characterised; for 428 instance, the prey of suctorial predators (which are especially common in terrestrial ecosystems) 429 cannot be identified through traditional gut contents analysis, but if characterised through other means 430 (e.g. laboratory trials or molecular sequencing) they can be included in the dataset and incorporated 431 into generated food webs. For instance, two suctorial predators in the Broadstone food web 432 (Platambus maculatus and Bezzia sp.) did not have their guts analysed for predatory links in the 433 original study (Woodward et al., 2005) and so had been previously excluded from the food web (e.g. 434 Petchey et al., 2008; Woodward et al., 2010), these nodes would have been predicted to prey upon 435 other species by the WebBuilder function. This is due to the WebBuilder function generalising 14 436 the taxonomy of these nodes, and their subsequent appearances in the dataset, as other studies have 437 characterised the diets of these taxa. Some links in the dataset of trophic interactions were known 438 from just a single data source, e.g. Cordulegaster boltonii as a consumer of Nemurella pictetii is 439 known only from the Broadstone food web. Therefore, when we excluded self-referrential diet data, 440 the WebBuilder function reconstruction of Broadstone did not predict a trophic link between C. 441 boltonii and N. pictetii. We have not quantified how often this effect occurred. As with other open- 442 source datasets, anomalies will be ironed out as the dataset is enriched with more observations as it 443 grows, and its coverage will improve over time. 444 Besides constructing food webs de novo, the WebBuilder function could be used to standardise 445 a collection of networks gathered from different sources prior to analysis. This would effectively 446 standardise the sampling effort for included interactions (although not for species richness or 447 taxonomic resolution) and would remove spatially or temporally explicit interactions (or lack thereof). 448 If the analysis was concerned with the structure of summary food webs from different locations and 449 habitat types then this might be an appropriate first step. 450 451 452 4.2 - Future Directions The realism of links generated by the WebBuilder function could be addressed by assessing the 453 number of times a particular interaction appears in the dataset, as well as the number of times an 454 interaction could have occurred but did not (i.e. species found at the same site but not found to 455 interact). If a particular interaction has been observed many times across many systems, it is probably 456 reasonable to assume it also occurs at other sites where those species co-exist. However, if it has only 457 been observed rarely, or at a site with very different characteristics than the one in question (for 458 instance contrasting environmental conditions, or significantly different community assemblages) this 459 assumption might not be so reasonable. As the size of the dataset continues to grow, evaluation of 460 whether links are realised or not will improve over time. 461 The WebBuilder function is designed to construct summary food webs, and ignores potential 462 behavioural shifts of species, hence it is unsuitable for constructing temporally or spatially explicit 463 food webs. Additional data such as abundance information could be used to weight interactions, this 464 would, for instance, reduce the weight of interactions between rare species reducing their influence on 465 food web structure and increasing the realism of the resulting food web. There is an increasing body 466 of literature detailing the importance of weak and strong interactions within networks (Berlow et al., 467 2004; e.g. de Ruiter et al., 1995; Vazquez et al., 2007) and a multitude of methods already exist for 468 determining interaction strengths in food webs (see Berlow et al., 2004) some of which can be 469 employed alongside the WebBuilder function. Thus, despite the ‘coarse’ nature of food webs built 470 in this way there is much potential for their use in ecological research, and by combining them with 15 471 models such as those presented here potential mismatches arising from behavioural shifts could be 472 highlighted. 473 It would be straightforward for the underlying code of the WebBuilder function to be extended 474 to incorporate a range of traits that could influence the realisation of potential trophic interactions, 475 other than phylogeny, such as life stage or body size. For instance, within freshwater food webs body 476 size is an important determinant of trophic interactions, and food web structure predicted using body 477 size alone may be more accurate than those predicted using phylogeny alone (Woodward et al., 2010). 478 This could further increase the realism of the constructed food webs and hence their wider 479 applicability and usefulness. 480 This dataset of trophic interactions was collated to test the performance of the WebBuilder 481 function when predicting the structure of the four empirical UK freshwater food webs used here. It 482 would be straightforward to extend the coverage of this dataset by augmenting it with data collected 483 from other geographic regions. If this dataset is used to construct food webs in the future researchers 484 will need to use their discretion to decide how applicable it is to their system. For instance, this initial 485 version of the dataset does not provide good coverage of lentic species, or species from across Europe 486 or other parts of the world. However, interaction data are being published at a rapidly accelerating rate 487 (Ings et al., 2009) and this can be used to form an iterative feedback process, improving data quality 488 over time; the presence of links predicted by the WebBuilder function can therefore be tested 489 evermore rigorously in the future. Identifying underrepresented nodes in the dataset will help target 490 further research more cost effectively: e.g., a great deal is known about the diet of a handful of often 491 economically valuable species in the dataset (for instance, brown trout, Salmo trutta appears >3,000 492 times), but very little is known about many others. Additionally, technologies such as those provided 493 by recent advances in molecular sequencing will improve the efficiency of trophic interaction 494 detection (Clare, 2014) and therefore the volume of data which can be incorporated into the dataset. 495 We actively encourage researchers with suitable data to contribute them to this dataset. Exciting 496 initiatives such as Global Biotic Interactions (Poelen et al., 2014), by incorporating necessary 497 information such as the method through which an interaction was determined, could provide a global, 498 open source repository of interaction data which the WebBuilder function could access through R. 499 As more of these unknown links become known, nodes will not need to be generalised taxonomically 500 in order to find matches in the dataset, the links generated will more closely match the known links 501 for those species and therefore the quality of the ecological food webs generated by the 502 WebBuilder function will improve. 503 504 4.3 - Conclusions 505 We have demonstrated that the food webs generated here are comparable to empirically observed 506 food webs and exceeded the accuracy of other potential methods of predicting freshwater food webs. 16 507 This method could be used to build vast numbers of ecological networks from data that already exists, 508 such as routine biomonitoring data which is collected in huge volumes in many parts of the world 509 (e.g., Dutch soil biomonitoring data have recently been used to build a large collection of food webs; 510 Cohen & Mulder 2014). Producing collections of replicable networks is vital for advancing ecological 511 network research beyond the largely unreplicated case-study approach that has dominated to date: the 512 WebBuilder function approach presents a new robust and repeatable method that helps move us 513 considerably closer to that goal. 514 5 - Acknowledgements 515 This study is a contribution from the Imperial College Grand Challenges in Ecosystems and the 516 Environment initiative. CG was supported by Queen Mary University of London and the Freshwater 517 Biological Association. GW, DMP and LNH were supported by the Natural Environment Research 518 Council (Grants reference: NE/J015288/1 and NE/J011193/1). We thank Samraat Pawar and Uli 519 Brose for their helpful comments on an earlier draft that greatly improved the manuscript, as well as 520 Stefano Allesina who helped with the analysis of the ‘Difference’, ‘Ratio’ and ‘Difference/Ratio’ 521 models. 522 6 - References 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 Allesina, S., 2011. Predicting trophic relations in ecological networks: A test of the Allometric Diet Breadth Model. J. Theor. Biol. 279, 161–168. doi:DOI 10.1016/j.jtbi.2010.06.040 Allesina, S., Pascual, M., 2009. Food web models: A plea for groups. Ecol. Lett. 12, 652–662. doi:10.1111/j.1461-0248.2009.01321.x Allouche, O., Tsoar, A., Kadmon, R., 2006. Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J. Appl. Ecol. 43, 1223–1232. doi:10.1111/j.1365-2664.2006.01214.x Barnes, C., 2008. Predator and prey body sizes in marine food webs. Ecology 89, 881. Bascompte, J., Jordano, P., Melian, C.J., Olesen, J.M., 2003. The nested assembly of plant-animal mutualistic networks. Proc. Natl. Acad. Sci. U. S. A. 100, 9383–9387. doi:DOI 10.1073/pnas.1633576100 Berlow, E.L., Neutel, A.-M., Cohen, J.E., De Ruiter, P.C., Ebenman, B., Emmerson, M., Fox, J.W., Jansen, V.A.A., Iwan Jones, J., Kokkoris, G.D., Logofet, D.O., McKane, A.J., Montoya, J.M., Petchey, O., 2004. Interaction strengths in food webs: issues and opportunities. J. Anim. Ecol. 73, 585–598. doi:10.1111/j.0021-8790.2004.00833.x Briand, F., 1983. Environmental Control of Food Web Structure. Ecology 64, 253–263. Clare, E.L., 2014. Molecular detection of trophic interactions: emerging trends, distinct advantages, significant considerations and conservation applications. Evol. Appl. 7, 1144–1157. doi:10.1111/eva.12225 Cleveland, W., Grosse, E., Shyu, W., Chambers, J., Hastie, T., 1992. Local regression models. Stat. Model. S. Cohen, J.E., Mulder, C., 2014. Soil invertebrates, chemistry, weather, human management, and edaphic food webs at 135 sites in The Netherlands: SIZEWEB. Ecology 95, 578. Cohen, J.E., Newman, C.M., Briand, F., 1985. A Stochastic Theory of Community Food Webs: II. Individual Webs. Proc. R. Soc. B Biol. Sci. doi:10.1098/rspb.1985.0043 17 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 De Ruiter, P.C., Neutel, A.M., Moore, J.C., 1995. Energetics, Patterns of Interaction Strengths, and Stability in Real Ecosystems. Science (80-. ). 269, 1257–1260. doi:DOI 10.1126/science.269.5228.1257 Edwards, F.K., Lauridsen, R.B., Fernandes, W.P.A., Beaumont, W.R.C., Ibbotson, A.T., Scott, L., Davies, C.E., Jones, J.I., 2009. Re-introduction of Atlantic salmon, Salmo salar L., to the Tadnoll Brook, Dorset. Proc. Dorset Nat. Hist. Archaeol. Soc. 130, 9–16. Eklof, a., Helmus, M.R., Moore, M., Allesina, S., 2012. Relevance of evolutionary history for food web structure. Proc. R. Soc. B Biol. Sci. 279, 1588–1596. doi:10.1098/rspb.2011.2149 Gilljam, D., Thierry, A., Edwards, F.K., Figueroa, D., Ibbotson, A.T., Jones, J.I., Lauridsen, R.B., Petchey, O.L., Woodward, G., Ebenman, B., 2011. Seeing Double: Size-Based and Taxonomic Views of Food Web Structure. Adv. Ecol. Res. 45, 67–133. doi:Doi 10.1016/B978-0-12386475-8.00003-4 Goldwasser, L., Goldwasser, L., Roughgarden, J., Roughgarden, J., 1993. Construction and analysis of a large Caribbean food web. Ecology 74, 1216–1233. Gravel, D., Poisot, T., Albouy, C., Velez, L., Mouillot, D., 2013. Inferring food web structure from predator–prey body size relationships. Methods Ecol. Evol. n/a–n/a. doi:10.1111/2041210X.12103 Gray, C., Baird, D.J., Baumgartner, S., Jacob, U., Jenkins, G.B., O’Gorman, E.J., Lu, X., Ma, A., Pocock, M.J.O., Schuwirth, N., Thompson, M., Woodward, G., 2014. Ecological networks: the missing links in biomonitoring science. J. Appl. Ecol. 51, 1444–1449. doi:10.1111/13652664.12300 Hall, S.J., Raffaelli, D., 1991. Food-Web Patterns: Lessons from a Species-Rich Web. J. Anim. Ecol. 60, 823–841. Havens, K.E., 1993. Predator-Prey Relationships in Natural Community Food Webs. Oikos 68, 117. doi:10.2307/3545316 Henson, K.S.E., Craze, P.G., Memmott, J., 2009. The restoration of parasites, parasitoids, and pathogens to heathland communities. Ecology 90, 1840–1851. doi:Doi 10.1890/07-2108.1 Hudson, L.N., Emerson, R., Jenkins, G.B., Layer, K., Ledger, M.E., Pichler, D.E., Thompson, M.S.A., O’Gorman, E.J., Woodward, G., Reuman, D.C., 2013. Cheddar: analysis and visualisation of ecological communities in R. Methods Ecol. Evol. 4, 99–104. doi:10.1111/2041210X.12005 Ings, T.C., Montoya, J.M., Bascompte, J., Bluthgen, N., Brown, L., Dormann, C.F., Edwards, F., Figueroa, D., Jacob, U., Jones, J.I., Lauridsen, R.B., Ledger, M.E., Lewis, H.M., Olesen, J.M., van Veen, F.J.F., Warren, P.H., Woodward, G., 2009. Ecological networks - beyond food webs. J. Anim. Ecol. 78, 253–269. doi:DOI 10.1111/j.1365-2656.2008.01460.x Kaiser-Bunbury, C.N., Muff, S., Memmott, J., Muller, C.B., Caflisch, A., Müller, C.B., 2010. The robustness of pollination networks to the loss of species and interactions: a quantitative approach incorporating pollinator behaviour. Ecol. Lett. 13, 442–452. doi:DOI 10.1111/j.14610248.2009.01437.x Layer, K., Hildrew, A., Monteith, D., Woodward, G., 2010. Long-term variation in the littoral food web of an acidified mountain lake. Glob. Chang. Biol. no–no. doi:10.1111/j.13652486.2010.02195.x Layer, K., Hildrew, A.G., Woodward, G., 2013. Grazing and detritivory in 20 stream food webs across a broad pH gradient. Oecologia 171, 459–471. doi:10.1007/s00442-012-2421-x Layer, K., Riede, J.O., Hildrew, A.G., Woodward, G., 2010. Food Web Structure and Stability in 20 Streams Across a Wide pH Gradient. Adv. Ecol. Res. 42, 265–299. doi:Doi 10.1016/S00652504(10)42005-X Ledger, M.E., Brown, L.E., Edwards, F.K., Milner, A.M., Woodward, G., 2012. Drought alters the structure and functioning of complex food webs. Nat. Clim. Chang. 3, 223–227. doi:10.1038/nclimate1684 Macfadyen, S., Craze, P.G., Polaszek, A., van Achterberg, K., Memmott, J., 2011. Parasitoid diversity reduces the variability in pest control services across time on farms. Proc. R. Soc. B-Biological Sci. 278, 3387–3394. doi:DOI 10.1098/rspb.2010.2673 18 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 Memmott, J., Martinez, N.D., Cohen, J.E., 2000. Predators, parasitoids and pathogens: species richness, trophic generality and body sizes in a natural food web. J. Anim. Ecol. 69, 1–15. doi:DOI 10.1046/j.1365-2656.2000.00367.x Mulder, C., Elser, J.J., 2009. Soil acidity, ecological stoichiometry and allometric scaling in grassland food webs. Glob. Chang. Biol. 15, 2730–2738. doi:10.1111/j.1365-2486.2009.01899.x Olesen, J.M., Bascompte, J., Dupont, Y.L., Jordano, P., 2007. The modularity of pollination networks. Proc. Natl. Acad. Sci. U. S. A. 104, 19891–19896. doi:DOI 10.1073/pnas.0706375104 Olito, C., Fox, J.W., 2014. Species traits and abundances predict metrics of plant – pollinator network structure , but not pairwise interactions. Oikos 1–9. doi:10.1111/oik.01439 Petchey, O.L., Beckerman, A.P., Riede, J.O., Warren, P.H., 2011. Fit, efficiency, and biology: some thoughts on judging food web models. J. Theor. Biol. 279, 169–71. doi:10.1016/j.jtbi.2011.03.019 Petchey, O.L., Beckerman, A.P., Riede, J.O., Warren, P.H., 2008. Size, foraging, and food web structure. Proc. Natl. Acad. Sci. U. S. A. 105, 4191–4196. doi:10.1073/pnas.0710672105 Piechnik, D. a., Lawler, S.P., Martinez, N.D., 2008. Food-web assembly during a classic biogeographic study: Species’ “trophic breadth” corresponds to colonization order. Oikos 117, 665–674. doi:10.1111/j.0030-1299.2008.15915.x Pocock, M.J.O., Evans, D.M., Memmott, J., 2012. The robustness and restoration of a network of ecological networks. Science (80-. ). 335, 973–7. doi:10.1126/science.1214915 Pocock, M.J.O.O., Memmott, J., Evans, D.M., Memmott, J., 2012. Supporting Material - The robustness and restoration of a network of ecological networks. Science (80-. ). 335, 973–7. doi:10.1126/science.1214915 Poelen, J.H., Simons, J.D., Mungall, C.J., 2014. Global biotic interactions: An open infrastructure to share and analyze species-interaction datasets. Ecol. Inform. 24, 148–159. doi:10.1016/j.ecoinf.2014.08.005 R Core Team, 2013. R: A language and environment for statistical computing. Rall, B.C., Brose, U., Hartvig, M., Kalinkat, G., Schwarzmüller, F., Vucic-Pestic, O., Petchey, O.L., 2012. Universal temperature and body-mass scaling of feeding rates. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 367, 2923–34. doi:10.1098/rstb.2012.0242 Rohr, R.P., Scherer, H., Kehrli, P., Mazza, C., Bersier, L.-F., 2010. Modeling food webs: exploring unexplained structure using latent traits. Am. Nat. 176, 170–177. doi:10.1086/653667 Sáyago, R., Lopezaraiza-Mikel, M., Quesada, M., Álvarez-Añorve, M.Y., Cascante-Marín, A., Bastida, J.M., 2013. Evaluating factors that predict the structure of a commensalistic epiphytephorophyte network. Proc. Biol. Sci. 280, 20122821. doi:10.1098/rspb.2012.2821 Strong, J.S., Leroux, S.J., 2014. Impact of Non-Native Terrestrial Mammals on the Structure of the Terrestrial Mammal Food Web of Newfoundland, Canada. PLoS One 9, e106264. doi:10.1371/journal.pone.0106264 Thompson, R.M., Brose, U., Dunne, J.A., Hall, R.O., Hladyz, S., Kitching, R.L., Martinez, N.D., Rantala, H., Romanuk, T.N., Stouffer, D.B., Tylianakis, J.M., 2012. Food webs: reconciling the structure and function of biodiversity. Trends Ecol. Evol. 27, 689–697. doi:DOI 10.1016/j.tree.2012.08.005 Tylianakis, J.M., Laliberté, E., Nielsen, A., Bascompte, J., 2010. Conservation of species interaction networks. Biol. Conserv. 143, 2270–2279. doi:http://dx.doi.org/10.1016/j.biocon.2009.12.004 Vázquez, D.P., Chacoff, N.P., Cagnolo, L., 2009. Evaluating multiple determinants of the structure of plant-animal mutualistic networks. Ecology 90, 2039–2046. doi:10.1890/08-1837.1 Vazquez, D.P., Melian, C.J., Williams, N.M., Bluthgen, N., Krasnov, B.R., Poulin, R., 2007. Species abundance and asymmetric interaction strength in ecological networks. Oikos 116, 1120–1127. doi:DOI 10.1111/j.2007.0030-1299.15825.x Verdú, M., Valiente-Banuet, A., 2011. The relative contribution of abundance and phylogeny to the structure of plant facilitation networks. Oikos 120, 1351–1356. doi:10.1111/j.16000706.2011.19477.x Vizentin-Bugoni, J., Maruyama, P.K., Sazima, M., 2014. Processes entangling interactions in communities: forbidden links are more important than abundance in a hummingbird-plant network. Proc. R. Soc. B Biol. Sci. 281, 20132397–20132397. doi:10.1098/rspb.2013.2397 Williams, R.J., Martinez, N.D., 2000. Simple rules yield complex food webs. Nature 404, 180–183. 19 656 657 658 659 660 661 662 663 664 665 Woodward, G., Blanchard, J., Lauridsen, R.B., Edwards, F.K., Jones, J.I., Figueroa, D., Warren, P.H., Petchey, O.L., 2010. Individual-Based Food Webs: Species Identity, Body Size and Sampling Effects. Adv. Ecol. Res. 43, 211–266. doi:10.1016/b978-0-12-385005-8.00006-x Woodward, G., Brown, L.E., Edwards, F.K., Hudson, L.N., Milner, A.M., Reuman, D.C., Ledger, M.E., 2012. Climate change impacts in multispecies systems: drought alters food web size structure in a field experiment. Philos. Trans. R. Soc. B-Biological Sci. 367, 2990–2997. doi:DOI 10.1098/rstb.2012.0245 Woodward, G., Speirs, D.C., Hildrew, A.G., 2005. Quantification and resolution of a complex, sizestructured food web. Adv. Ecol. Res. 36, 85–135. 666 667 7 - Figure captions 668 669 Figure 1. A simplified workflow demonstrating the WebBuilder function. For a workable example 670 see Appendix C. 671 672 Figure 2. An example of inputs and outputs for the WebBuilder function. Two different scenarios are 673 highlighted. Firstly in blue the taxa Baetis fuscatus is generalised to the genus level for both its 674 consumer and resource links, this allows it to be matched with B. scambus in the registry and the 675 Navicula tripunctata – B. fuscatus, and Cocconeis placentula – B. fuscatus links to be included in the 676 output. Secondly in green, the taxa Tanypodinae are generalised to the family level for its resource 677 links and subfamily level for its consumer links, allowing it to matched with all entries in the registry 678 with the subfamily Tanypodinae and the Tanypodinae – Salmo trutta, and Baetis fuscatus – 679 Tanypodinae links to be included in the output. 680 681 Figure 3. The quality of the generated food web increases with the size of the dataset. Fluctuations in 682 the TSS score are caused by changes in the component parts of the TSS, i.e. while the TPR may 683 increase as the dataset size increases, other metrics such as the FPR might also increase, causing the 684 total TSS to fall (see methods). Lines are fitted using a LOESS smoother (Cleveland et al., 1992), 685 grey shading indicates the 95% confidence intervals. 686 687 688 Figure 4. Box plots showing the changes in generated trophic interactions as the level of taxonomic 689 generalisation is varied. The difference in degree (top), generality (middle) and vulnerability 690 (bottom) of individual nodes between the generated and empirical food web, thus the sample size 691 reflects the number of nodes in the empirical food web. Positive values represent links which were 692 ‘missed’ by the WebBuilder function, while negative values represent additional links not found 20 693 empirically. Stars indicate if the mean is different from zero (one sampled t-test) and indicate if the 694 generated trophic interactions are different from that of the empirical food web, 0.05 >= p > 0.01 = 695 *, 0.01 >= p > 0.001 = **, p <= 0.001=***. 696 697 698 Figure 5. Predation matrices for the empirical food webs compared to those generated by the AR 699 method. Nodes are ordered by increasing body mass. A trophic link is represented by a point 700 indicating that the taxon in that column consumes the taxon in that row. Links generated by the 701 WebBuilder function are represented by empty circles, and those found empirically are 702 represented by smaller, filled circles. 703 704 Figure 6. Box plots showing the performance of the WebBuilder function compared to the ADBM, 705 Difference, Ratio and Difference/Ratio models. The performance of the WebBuilder function is 706 plotted as four vertical lines, one for each of the empirical food webs; Broadstone 707 , Tadnoll and Mill Stream , Afon Hirnant . The TSS score (top panel) gives an overall measure of the 708 performance of the predictive method relative to empirical food webs, and varies between 1 (a 709 generated food web that is exactly the same as the empirical food web) and -1 (a generated food web 710 which is the exact inverse of the empirical food web). The TPR (True Positives Rate; bottom panel) is 711 the proportion of generated food web links that were also found empirically, and varies between 0 (no 712 links generated correctly) and 1 (all links generated correctly). A box plot of each set of values is 713 given, indicating the range, quartile ranges and median of each set of values. For the WebBuilder 714 function only the individual scores for the four food webs are shown, for all others there are too many 715 generated scores to be shown individually; ADBM (n=508), Difference (n=32,025), Ratio (n=43,638) 716 and Difference/Ratio (n=41,602). 717 718 21 719 Appendices: 720 Appendix A: 721 Table A.1 – Table of sources of trophic information contained in the dataset 722 Figure A.1 - The trade-off between taxonomic resolution and dataset coverage 723 Figure A.2 – The performance of the WebBuilder function compared to predictive models. 724 Appendix B: Function R code 725 Appendix C: Example R code 726 Appendix D: The trophic links predicted by the WebBuilder function. 727 22 728 Supporting Information 729 Appendix A 730 Table A.1 The sources of trophic-interaction data used in the database. Source Number of trophic links 6433 6,280 2,440 1,423 316 286 199 170 153 145 144 143 108 81 78 65 57 41 37 33 35 34 33 30 30 27 20 20 13 10 9 9 Gilljam et al. 2011 Layer et al. 2010 Ledger et al. 2012 Brose et al. 2005 Warren 1989 Becker 1990 Jones et al. 1951 Northcott 1981 Hynes 1950 Moore & Potter 1976 Iversen 1988 Spänhoff et al. 2003 Thomas 1962 Slack 1936 Clitherow et al. 2013 Maitland 1965 Lancaster et al. 2005 Rowan Dunn 1954 Radforth 1940 Woodward et al. 2008 Woodward et al. 2005 Woodward unpublished Woodward et al. 2008 Badcock 1949 Mackereth 1957 Cook 1979 Perkins unpublished Townsend & Hildrew 1979 Tikkanen et al. 1997 Harper-Smith et al. 2004 Englund 2005 N. Dewhurst & G. Woodward unpublished data 23 System Place freshwater stream UK freshwater stream experimental freshwater channels freshwater lake experimental freshwater stream freshwater pond UK USA freshwater stream Europe freshwater river UK freshwater lake laboratory experimental freshwater freshwater stream UK freshwater stream laboratory experimental freshwater freshwater stream UK Europe freshwater stream Europe freshwater river UK freshwater river UK freshwater river Europe freshwater river UK freshwater stream UK freshwater lake UK freshwater river UK freshwater stream UK UK UK UK UK UK UK freshwater stream UK freshwater unknown freshwater stream UK freshwater stream UK freshwater river UK freshwater lake UK freshwater stream UK freshwater stream UK freshwater lake Europe 7 4 4 3 3 2 2 2 2 1 Young & Procter 1986 Mann & Blackburn 1991 Warren, unpublished Friday 1988 Gee & Young 1993 Elliott et al. 1988 Fox 1978 Harrison et al. 2005 Hall et al. 2000 Armitage & Young 1990 freshwater lake USA freshwater stream experimental freshwater stream freshwater lake UK Europe freshwater stream UK freshwater lake UK freshwater UK freshwater UK UK freshwater stream UK freshwater stream USA 731 732 733 References: 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 Armitage, M.J. & Young, J.O. (1990). The realized food niches of three species of stream-dwelling triclads (Turbellaria). Freshwater Biology, 24, 93–100. Badcock, R.M. (1949). Studies in Stream Life in Tributaries of the Welsh Dee. Journal of Animal Ecology, 18, 193–208. Becker, G. (1990). Comparison of the dietary composition of epilithic trichopteran species in a firstorder stream. Archiv für Hydrobiologie, 120, 13–40. Brose, U., Cushing, L., Berlow, E.L., Jonsson, T., Banasek-Richter, C., Bersier, L.-F., Blanchard, J.L., Brey, T., Carpenter, S.R., Blandenier, M.-F.C., Cohen, J.E., Dawah, H.A., Dell, T., Edwards, F., Harper-Smith, S., Jacob, U., Knapp, R.A., Ledger, M.E., Memmott, J., Mintenbeck, K., Pinnegar, J.K., Rall, B.C., Rayner, T., Ruess, L., Ulrich, W., Warren, P., Williams, R.J., Woodward, G., Yodzis, P. & Martinez, N.D. (2005). Body sizes of consumers and their resources. Ecology, 86, 2545. Clitherow, L.R., Carrivick, J.L. & Brown, L.E. (2013). Food Web Structure in a Harsh Glacier-Fed River. Plos One, 8, e60899. Cook, M.P. (1979). The interrelationships of fish and zooplankton in gravel-pit lakes. City of London Polytechnic. Elliott, J.M., Humpesch, U.H. & Macan, T.T. (1988). Larvae of the British Ephemeroptera: a key with ecological notes. Englund, G. (2005). Scale dependent effects of predatory fish on stream benthos. Oikos, 1, 19–30. Fox, P.J. (1978). The population dynamics of the bullhead (Cottus gobio L., Pisces) with special reference to spawning, mortality of young fish and homoestatic mechanisms. Journal of Fish Biology, 12. Friday, L.E. (1988). A key to the adults of British water beetles. Dorset Press, Dorset, UK. Gee, H. & Young, J.O. (1993). The food niches of the invasive Dugesia tigrina (Girard) and indigenous Polycelis tennis Ijima and P. nigra (MοΏ½ller) (Turbellaria; Tricladida) in a Welsh lake. Hydrobiologia, 254, 99–106. Gilljam, D., Thierry, A., Edwards, F.K., Figueroa, D., Ibbotson, A.T., Jones, J.I., Lauridsen, R.B., Petchey, O.L., Woodward, G. & Ebenman, B. (2011). Seeing Double: Size-Based and Taxonomic Views of Food Web Structure. Advances in Ecological Research, 45, 67–133. Hall, R.O., Wallace, J.B. & Eggert, S.L. (2000). Organic matter flow in stream food webs with reduced detrital resource base. Ecology, 81, 3445–3463. Harper-Smith, S., Berlow, E.L., Roland, A., Williams, R.J. & Martinez, N.D. (2004). Ecology through food webs: visualising and quantifying the effects of stocking alpine lakes with trout. Dynamic Webs: Multispecies Assemblages, Ecosystem Development, and Environmental Change (eds P. De Ruiter, J.C. Moore & V. Wolters), pp. 407–423. Elsevier Academic Press Inc. Harrison, S.S.C., Bradley, D.C. & Harris, I.T. (2005). Uncoupling strong predator-prey interactions in streams: the role of marginal macrophytes. Oikos, 108, 433–448. 24 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 Hynes, H.B.N. (1950). The Food of Fresh-Water Sticklebacks (Gasterosteus aculeatus and Pygosteus pungitius), with a Review of Methods Used in Studies of the Food of Fishes. Journal of Animal Ecology, 19, 36–58. Iversen, T.M. (1988). Secondary production and trophic relationships in a spring invertebrate community. Limnology and Oceanography, 33, 582–592. Jones, J.R.E., Journal, S., May, N. & Jones, B.Y.J.R.E. (1951). An Ecological Study of the River Towy. Journal of Animal Ecology, 20, 68–86. Lancaster, J., Bradley, D.C., Hogan, A. & Waldron, S. (2005). Intraguild omnivory in predatory stream insects. Journal of Animal Ecology, 74, 619–629. Layer, K., Riede, J.O., Hildrew, A.G. & Woodward, G. (2010). Food Web Structure and Stability in 20 Streams Across a Wide pH Gradient. Advances in Ecological Research, 42, 265–299. Ledger, M.E., Brown, L.E., Edwards, F.K., Milner, A.M. & Woodward, G. (2012). Drought alters the structure and functioning of complex food webs. Nature Climate Change, 3, 223–227. Mackereth, J.C. (1957). Notes on the Plecoptera from a Stony Stream. Journal of Animal Ecology, 26, 343–351. Maitland, P.S. (1965). The Feeding Relationships of Salmon, Trout, Minnows, Stone Loach and Three-Spined Stickle-Backs in the River Endrick, Scotland. Journal of Animal Ecology, 34, 109–133. Mann, R.H.K. & Blackburn, J.H. (1991). The biology of the eel Anguilla anguilla (L.) in an English chalk stream and interactions with juvenile trout Salmo trutta L. and salmon Salmo salar L. Hydrobiologia, 218, 65–76. Moore, J.W. & Potter, I.C. (1976). A Laboratory Study on the Feeding of Larvae of the Brook Lamprey Lampetra planeri (Bloch). Journal of Animal Ecology, 45, 81–90. Northcott, D.S. (1981). The role of aquatic macrophytes in the availabillty of food for young fish. City of London Polytechnic. Radforth, I. (1940). The Food of the Grayling (Thymallus thymallus), Flounder (Platichthys flesus), Roach (Rutilus rutilus) and Gudgeon (Gobio fluviatilis), with Special Reference to the Tweed Watershed. Journal of Animal Ecology, 9, 302–318. Rowan Dunn, D. (1954). The Feeding Habits of some of the Fishes and some Members of the Bottom Fauna of Llyn Tegid (Bala Lake), Merionethshire. Journal of Animal Ecology, 23, 224–233. Slack, H.D. (1936). The Food of Caddis Fly (Trichoptera) Larvae. Journal of Animal Ecology, 5, 105–115. Spänhoff, B., Schulte, U., Alecke, C., Kaschek, N. & Meyer, E.I. (2003). Mouthparts, gut contents, and retreat-construction by the wood-dwelling larvae of Lype phaeopa (Trichoptera: Pschomyiidae). European Jounal of Entomology, 100, 563–570. Thomas, J.D. (1962). The Food and Growth of Brown Trout (Salmo trutta L.) and its Feeding Relationships with the Salmon Parr (Salmo salar L.) and the Eel (Anguilla anguilla (L.)) in the River Teify, West Wales. Journal of Animal Ecology, 31, 175–205. Tikkanen, P., Muotka, T., Huhta, A. & Juntunen, A. (1997). The roles of active predator choice and prey vulnerability in determining the diet of predatory stonefly (Plecoptera) nymphs. Journal of Animal Ecology, 66, 36–48. Townsend, C.R. & Hildrew, A.G. (1979). Resource Partitioning by Two Freshwater Invertebrate Predators with Contrasting Foraging Strategies. Journal of Animal Ecology, 48, 909–920. Warren, P.H. (1989). Spatial and temporal variation in the structure of a fresh-water food web. Oikos, 55, 299–311. Woodward, G., Papantoniou, G., Edwards, F. & Lauridsen, R.B. (2008). Trophic trickles and cascades in a complex food web: impacts of a keystone predator on stream community structure and ecosystem processes. Oikos, 117, 683–692. Woodward, G., Speirs, D.C. & Hildrew, A.G. (2005). Quantification and resolution of a complex, size-structured food web. Advances in Ecological Research, 36, 85–135. Young, J.O. & Procter, R.M. (1986). Are the lake-dwelling leeches, Glossiphonia complanata (L.) and Helobdella stagnalis (L.), opportunistic predators on molluscs and do they partition this food resource? Freshwater Biology, 16, 561–566. 25 825 26 826 827 828 829 830 831 Figure A.1, The percentage of food web nodes that appear in the database, compared to the accuracy of the resulting food web prediction. The taxonomic resolution of the food web nodes are reduced systematically in order to increase their occurrence in the database. 27 832 833 834 835 836 837 838 839 840 Figure A.2. The performance of the WebBuilder function compared to the ADBM, Diff, Ratio and Diff/Ratio models. The TSS score gives an overall measure of the performance of the predictive method relative to empirical webs, TPR (True Positives Rate) is the proportion of generated food web links which were also found empirically. A box plot of each set of values is given, indicating the range, quartile ranges and median of each set of values. For the WebBuilder function only the individual scores for the four food webs are shown, for all others the mean of each food web is shown. 28 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 Appendix B # Functions for synthesising food webs from a registry of # known trophic links .StripWhitespace <- function(v) { # Returns v with leading and trailing whitespace removed return (gsub('^\\s+|\\s+$', '', v, perl=TRUE)) } .AddRegistryLinks <- function(synthesized, registry, res.col, con.col, res.method, con.method, reference.cols, debug) { stopifnot(res.col %in% colnames(synthesized)) stopifnot(res.col %in% colnames(registry)) stopifnot(con.col %in% colnames(synthesized)) stopifnot(con.col %in% colnames(registry)) if(any(!reference.cols %in% colnames(registry))) { missing <- setdiff(reference.cols, colnames(registry)) stop('Missing reference columns: ', paste(missing, collapse=',')) } # Match on multiple columns by pasting # http://www.r-bloggers.com/identifying-records-in-data-frame-a-thatare-not-contained-in-data-frame-b-%E2%80%93-a-comparison/ # Strip whitespace and convert to lowercase a <- lapply(synthesized[,c(res.col,con.col)], .StripWhitespace) a <- lapply(a, tolower) a <- do.call('paste', a) b <- lapply(registry[,c(res.col,con.col)], .StripWhitespace) b <- lapply(b, tolower) b <- do.call('paste', b) Debug <- function(...) if(debug) cat(...) Debug('Matching on resource', res.method, 'consumer', con.method, '\n') # Match where both resource and consumer columns are non-empty and a link # does not already exist to.match <- ''!=synthesized[,res.col] & ''!=synthesized[,con.col] & is.na(synthesized$N.records) for(row in which(to.match)) { N <- sum(a[row]==b) if(N>0) { synthesized$res.method[row] <- res.method synthesized$con.method[row] <- con.method synthesized$N.records[row] <- N for(col in reference.cols) { 29 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 synthesized[,col] <paste(sort(unique(registry[a[row]==b,col])), collapse=',') } } } return (synthesized) } .AllPossible <- function(nodes, exclude.cols=NULL) { # A data.frame of all possible links resource <- consumer <- nodes[,setdiff(colnames(nodes), exclude.cols),drop=FALSE] colnames(resource) <- paste0('res.', tolower(colnames(resource))) colnames(resource)['res.node'==colnames(resource)] <- 'resource' colnames(consumer) <- paste0('con.', tolower(colnames(consumer))) colnames(consumer)['con.node'==colnames(consumer)] <- 'consumer' synthesized <- data.frame(resource[rep(1:nrow(resource), each=nrow(resource)),,drop=FALSE], consumer[rep(1:nrow(resource), times=nrow(resource)),,drop=FALSE]) return (synthesized) } .ARWorker <- function(synthesized, registry, methods, reference.cols, debug) { # Add links for each level in methods for(outer in methods) { for(inner in methods[1:which(methods==outer)]) { synthesized <- .AddRegistryLinks(synthesized, registry, ifelse('exact'==outer, 'resource', paste0('res.', outer)), ifelse('exact'==inner, 'consumer', paste0('con.', inner)), outer, inner, reference.cols, debug) if(inner!=outer) { synthesized <- .AddRegistryLinks(synthesized, registry, ifelse('exact'==inner, 'resource', paste0('res.', inner)), ifelse('exact'==outer, 'consumer', paste0('con.', outer)), inner, outer, reference.cols, debug) } } } synthesized <- droplevels(synthesized[!is.na(synthesized$N.records),]) rownames(synthesized) <- NULL return (synthesized) } WebBuilder <- function(nodes, registry, methods=c('exact','genus','subfamily','family','order','class'), reference.cols=c('linkevidence', 'source.id'), debug=FALSE) 30 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 { # nodes - a data.frame containing columns 'node' (character - unique node # names) and either 'minimum.method' the minumum method for that # taxon or 'minimum.res.method' and 'minimum.con.method' - the # minimum methods for that taxon as a resource and a consumer # respectively. Values in 'minimum.method', 'minimum.res.method' and # 'minimum.con.method' must be in the 'methods' argument. # registry - a data.frame of known trophic interactions # methods - character vector of methods, in order # reference.cols - names of columns in registry that will be included in the # returned data.frame # Checks stopifnot(0<nrow(nodes)) stopifnot(nrow(nodes)==length(unique(nodes$node))) stopifnot(all(c('resource','consumer') %in% colnames(registry))) error.msg <- paste('nodes should contain either "minimum.method" or', '"minimum.res.method" and "minimum.con.method"') if('minimum.method' %in% colnames(nodes)) { if(any(c('minimum.res.method', 'minimum.con.method') %in% colnames(nodes))) { stop(error.msg) } # Use minimum.method for both ends of the links nodes$minimum.res.method <- nodes$minimum.con.method <nodes$minimum.method } else { if(!all(all(c('minimum.res.method', 'minimum.con.method') %in% colnames(nodes)))) { stop(error.msg) } # Use the provided minimum.res.method and minimum.con.method } stopifnot(all(nodes$minimum.res.method %in% methods)) stopifnot(all(nodes$minimum.con.method %in% methods)) synthesized <- .AllPossible(nodes, c('minimum.res.method','minimum.con.method','minimum.method')) synthesized[,c('res.method','con.method')] <- '' synthesized$N.records <- NA synthesized[,reference.cols] <- '' links <- .ARWorker(synthesized, registry, methods, reference.cols, debug) # Set factor levels # What client has specified 31 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 nodes$minimum.res.method <- factor(nodes$minimum.res.method, levels=methods) nodes$minimum.con.method <- factor(nodes$minimum.con.method, levels=methods) # What RegistryLinks found links$res.method <- factor(links$res.method, levels=methods) links$con.method <- factor(links$con.method, levels=methods) links <links[as.integer(links$res.method)<=as.integer(nodes$minimum.res.method)[ma tch(links$resource, nodes$node)] & as.integer(links$con.method)<=as.integer(nodes$minimum.con.method)[match(li nks$consumer, nodes$node)],] return(droplevels(links)) } 32 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 Appendix C #Gray et al Food Webs MS #worked example of how to use the WebBuilder function within teh cheddar package library(cheddar) source("WebBuilder.R") #add in links #read in without factor levels so NAs can be removed later registry<-read.csv('Gray et al SuppMat data.csv') data(BroadstoneStream) BroadstoneStream <- OrderCommunity(BroadstoneStream, 'M') IsolatedNodes(BroadstoneStream) BroadstoneStream<-RemoveNodes(community=BroadstoneStream, remove=c('Algae','CPOM','FPOM','Leptothrix spp.','Terrestrial invertebrates','Diptera')) #extract node information #n.b. node names need to be resolved using the GBIF database on Global Names Resolver #in order to match the taxonomy used in the database nps<-NPS(BroadstoneStream) #supply a list of desired level of generalisation, either a single list with one entry per node, or two #with two entries per node, i.e. a 'resource method' and a 'consumer method' minimum.res.method<- c("class", "family", "family", "family", "family", "family", "family", "family", "family", "genus", "genus", "family", "family", "genus", "genus", "family", "family", "genus", "genus", "order", "genus", "family", "class", "family", "genus", "family", 33 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 "family", "genus", "genus", "family", "genus") minimum.con.method <- c("class", "genus", "genus", "genus", "genus", "genus", "genus", "family", "genus", "genus", "genus", "family", "family", "genus", "genus", "genus", "genus", "genus", "genus", "order", "genus", "family", "class", "family", "genus", "family", "family", "genus", "genus", "family", "genus") nodes<-cbind(nps[,c(1,7:10)],minimum.res.method,minimum.con.method) links<-WebBuilder(nodes, registry, method=c('exact','genus','family','order','class')) BroadstoneExample <- Community(properties = CPS(BroadstoneStream), nodes = NPS(BroadstoneStream), trophic.links = links) table(links$con.method, links$res.method) #visualise network PlotWebByLevel(BroadstoneExample, level='ChainAveragedTrophicLevel') 34