Botanical Information and Ecology Network

Brian J. Enquist
Dept. of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
and The Santa Fe Institute, Santa Fe, NM

Botanical Information and Ecology Network (BIEN) Organizers
- Brian J. Enquist, University of Arizona
- Richard Condit, STRI, Panama and CTFS
- Robert K. Peet, University of North Carolina at Chapel Hill
- Brad Boyle, University of Arizona
- Steven Dolins, Bradley University
- Mark Schildhauer, NCEAS

CTFS Meeting, Panama, December 2006
Advisory meeting for CTFS plot database
A unique opportunity to do something larger . . . .

BIEN Goals
(1) Specific science questions – Compile the primary sources of biodiversity data, merging herbarium, plot (abundance), and trait data for plants in the Americas.
(2) Technology development goals associated with answering these questions effectively – and establish an informatics methodology for continuing to assemble and integrate relevant observation data for this and other projects.
(3) Longer-term program development – seek support to develop a permanent technical solution to the integration of vegetation/botanical data.

Science Goals
- Generate a standardized species list for the Americas
- Generate geographic range maps for all species within the BIEN database
- Ask basic science questions at the nexus between abundance, distribution, traits, and diversity across broad gradients

Justification for Working Group
• The lack of standardized and integrated botanical information across the world’s major biomes is an impediment to advancing basic ecological understanding.
• To do biodiversity science, we need to document and develop workflows to integrate and ‘scrub’ botanical data
• We need to clarify how the abundance, distribution, and diversity of plants vary across broad gradients and respond to global change

Diagram labels: Data Sources; Cyberinfrastructure; Plot and Trait Data; Taxonomic Phylogenetic Intelligence; Data Scrubbing, Correcting; Data Standardization Tools; Specimen Data; Exchange schema; Database BIEN 2.0; Data Discovery; Confederated resource BIEN 3.0; Science!; Deliverables

Justification for Working Group
• The lack of integrated botanical information across the world’s major biomes is an impediment to advancing basic ecological understanding.
• This is especially true in the tropics, where biodiversity is uniquely high and patchy.
• We need to clarify exactly how tropical and temperate floras and communities differ, how they vary across gradients, and how they respond to global change

BIEN Proposals
- Jan. 2008: NCEAS BIEN Working Group Proposal
- Feb. 2009: iPlant BIEN Working Group Proposal
- Aug. 2010: iPlant BIEN GeoSpatial Proposal

Provided support for
- Meetings
- Graduate student support
- Post-doctoral funding (Brad Boyle)
- Technician support (John Donoghue, Aaron Marcuse-Kubitza)

BIEN NCEAS Meetings: 2008, 2009, 2010

Additional sub-meetings
- Spring 2010: Development of a Taxonomic Name Resolution Service, meeting at Missouri Botanical Garden
- Spring 2011: Development of Geospatial Initiative, meeting at iPlant, Tucson

The past three years . . . . .
What have we learned?
A large fraction of biodiversity data available is crap!
- Mangled coordinates
- Misspelled names, taxa
- Bad taxonomy
- Data corruption from cultivated species
- Heterogeneous sampling
Integrating and using biodiversity data is fraught with numerous technical issues
These are not trivial issues – they are major impediments to the use of biodiversity data
Developed tools and workflows to correct and scrub data and to remove ‘bad data’

Summary – Major Steps: BIEN Data Workflow
Hurdles overcome in order to integrate and standardize botanical data
- Compilation of herbarium, trait, and ecological plot data
- Formalization of BIEN database 2.0
- Geovalidation of observations
- Taxonomic corrections and synonymy (TNRS!)
- Identifying and removing cultivated specimens, plantation data, etc.
- Computational challenge – scaling up geographic range calculations
http://tnrs.iplantcollaborative.org/

The past three years . . . . .
Science anticipation
Botanical data have enormous problems
Developed a workflow, tools, and scripts to standardize, clean, and scrub botanical data in order to do science
Now, we are ready . . . . .

BIEN 2.0 Data Sources (post scrubbing)

Plot Data
- CTFS
- FIA
- Madidi plots
- Vegbank
- TEAM
- SALVIAS
Plot# = 329,741

Herbarium and Observation Data
- GBIF
- MOBOT
- NYBG
- CRIA (Brazil collections)
- Arizona
- UNC, NCS etc.
- REMIB (Mexico)
Specimen# = 9,345,197

Plant Traits
- Utrecht
- GLOPNET
- Numerous literature sources
- BIEN researcher data
Traits = 27
Trait observations = 140,285

Total number of species = 204,929
Total number of observations = 12,171,014

What we now have available
- A scrubbed species list of all plants in the Americas
- Summary data for all species in BIEN
  - mean abundance, max abundance, total # of plots observed in
  - latitudinal range
  - mean trait values and trait variation
  - mean dbh, max dbh
  - habit information (tree, shrub, liana, etc.)?
- Summary data for all species in BIEN?
  - Conservation status (IUCN Red List)
- Geographic range maps (convex hull, MaxEnt, etc.)
  Recent output from High Performance Computing
- A species-level phylogeny for BIEN species (?)
- A website (data soon to be accessible)
  http://bien.nceas.ucsb.edu/bien/

What is unique about BIEN 2.0?
These are the primary data used for asking questions about botanical diversity, distribution, and ecology
Computation demands . . .
- No one has modeled ranges for this number of species
- We have a workflow established for large-scale calculation of ranges
We have documented a repeatable workflow that any researcher must use to take species observations, combine them with traits and ecology, and put them ‘on a map’.
- This is the most basic workflow required in biodiversity science

Meeting goals
Break into subgroups
(1) Do science and write papers
(2) Detail BIEN 3.0 database and geospatial tools
    Write a major requirements document for NCEAS
(3) Future funding
(4) iPlant/BIEN GeoSpatial planning

NCEAS deliverables
- Species list for New World checklist
- Range maps for all species
- Traits and habit values for a large fraction of BIEN species
- The ability to download and calculate basic stats for taxa and clades
- Data accessible (at least summary data)

GeoSpatial Discovery Proposal to iPlant
Diagram labels: NCEAS (other?); iPlant; User Interface; Climate layers; Science User; Applied User; Outreach and Education; GeoSpatial Data Discovery Environment; Integrate data; BIEN Plant Distributions
- User contributing data?
- Data standardization tools?
- What type of science?
- What kind of applied and education demand?

iPlant GeoSpatial Seed Projects
Diagram labels:
- iPlant cyberinfrastructure development team
- iPlant Plant Adaptation Group
- iPlant Tree of Life (iPToL)
- iPlant Plant Nutrition
- Tree Biology
- BIEN GeoSpatial Tools
- Tree Biology Data Discovery
- NCEAS McGill et al.
- NCEAS Group
- Data Discovery Environment

Proposed GeoSpatial Tools
Merge BIEN 3.0 with climate and geography layers

Tools for Botanical GeoSpatial Discovery
(1) Click on a map and get a species list
(2) Click on a plant clade and map out its distribution
(3) Click on a plant observation (taxa) and obtain climate/environmental data

Tools to Geoscrub and Correct Botanical Observation Data
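
Illustration: what a geoscrub check might look like
The slides name "mangled coordinates" as a major data problem but do not describe the rules BIEN uses to catch them. The Python sketch below is only an illustration of the kind of check a geoscrubbing tool could run on one observation. The `Observation` record, the flag names, and the bounding box for the Americas are all hypothetical assumptions, not BIEN's schema or thresholds.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately loose bounding box for the Americas.
# These limits are an illustrative assumption, not a BIEN rule.
LAT_MIN, LAT_MAX = -60.0, 85.0
LON_MIN, LON_MAX = -170.0, -30.0


@dataclass
class Observation:
    """Hypothetical minimal record: a taxon name plus decimal-degree coordinates."""
    taxon: str
    lat: Optional[float]
    lon: Optional[float]


def in_americas_box(lat: float, lon: float) -> bool:
    """True if the point falls inside the assumed bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def geoscrub_flags(obs: Observation) -> List[str]:
    """Return a list of problem flags; an empty list means no check fired."""
    if obs.lat is None or obs.lon is None:
        return ["missing_coordinates"]
    # Values outside the valid range of latitude/longitude cannot be located at all.
    if not (-90.0 <= obs.lat <= 90.0 and -180.0 <= obs.lon <= 180.0):
        return ["out_of_range"]
    flags: List[str] = []
    # (0, 0) is a common placeholder for an unknown location.
    if obs.lat == 0.0 and obs.lon == 0.0:
        flags.append("zero_zero")
    if not in_americas_box(obs.lat, obs.lon):
        flags.append("outside_americas")
        # If negating the longitude puts the point inside the box,
        # a dropped minus sign is a plausible cause.
        if in_americas_box(obs.lat, -obs.lon):
            flags.append("possible_longitude_sign_error")
    return flags


if __name__ == "__main__":
    records = [
        Observation("Quercus alba", 35.9, -79.05),
        Observation("Quercus alba", 35.9, 79.05),
        Observation("Quercus alba", 0.0, 0.0),
    ]
    for rec in records:
        print(rec.taxon, rec.lat, rec.lon, geoscrub_flags(rec))
```

Returning flags rather than deleting records keeps the decision about what counts as 'bad data' with the researcher, and a flag such as a suspected sign error points toward correction rather than removal.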