BiodiversityWorld
the biologist’s goals
The University of Reading
Frank Bisby, Alistair Culham, Neil Caithness, Tim Sutton, Peter Brewer,
Chris Yesson
Cardiff University
Alec Gray, Andrew Jones, Richard White, Nick Fiddian, Xuebiao Xu,
Mikhaila Burgess, Jaspreet Singh Pahwa
The Natural History Museum
Malcolm Scoble, Paul Williams,
Shonil Bhagwat
Bristol University
Paul Valdes
(The University of Southampton)
Major challenges in
Biodiversity Science
• How to access Global Biodiversity?
– To see and aggregate data from all round the
world
– To synthesise a global view
– To move from description to real analysis
– Ultimately to bring the totality onto the Internet at a
level of abstraction above that achieved by
individual travel and fieldwork
– What GBIF calls ‘Digital Biodiversity Science’
Major challenges in
Biodiversity Science
• First steps towards a Systems Biology
for the behaviour of global biodiversity
– To access an aggregated and synthesised
view of the factual base
– To build hypotheses with a sound basis
– To model outcomes based on the
hypotheses
– To test the modelled outcomes
Major challenges in
Biodiversity Science
• To a large extent these challenges are
convergent with the goals of the UK
e-Science Initiative
– indeed, it has been said that analysing global biodiversity is
one of the clearest application areas
– ‘e-Science is about global collaboration in key areas of
science, and the next generation of infrastructure that will
enable it’ (John Taylor, 02)
– We certainly qualify as e-Science
– We certainly need distributed computing, possibly
combining needs for the GRID and for the Semantic Web.
Our Vision for the BDWorld GRID:
• a distributed problem-solving
environment
• giving access to a wide array of the
world’s data sources and analytical
tools
• providing an integrated and flexible
environment for analysis of global
scale patterns in biodiversity
Our Vision for the BDWorld GRID:
• And suitable for addressing some
difficult Biodiversity questions:
- where might a species be expected
to occur, under past, present, or
predicted climatic conditions?
- where should conservation efforts
be concentrated?
- to what extent is biogeography
reflected in phylogeny?
What are the technical
goals of BDWorld?
• Extensible problem solving environment
for global biodiversity analysis
• Employ GRID technology because:
• (i) Distributed computing
• (ii) Distributed resources
• (iii) Semantic mediation
• Resource location
• Workflow design & validation
[Figure: BDWorld workflow]
START: Enquiry name(s)
STAGE 1: Species 2000 Catalogue of Life (distributed array of GSDs). Returns list of accepted taxa, synonyms and common names.
STAGE 2: Enquiry: select ‘data’ for ‘taxon set’. Distributed array of thematic data sources, with reference to abiotic datasets. Return dataset composed of homologous responses from multiple thematic data sources.
STAGE 3: Analytical Toolbox. Presentation and storage of results.
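The three stages above can be sketched in a few lines of Python. This is an illustrative sketch only, not the BDWorld implementation: the checklist, the two data sources, their coordinates and all function names are invented for the example.

```python
# Hypothetical stand-in for a Catalogue of Life style checklist:
# maps every known name (accepted or synonym) to its accepted name.
CHECKLIST = {
    "Leucaena leucocephala": "Leucaena leucocephala",
    "Mimosa leucocephala": "Leucaena leucocephala",  # illustrative synonym entry
}

# Hypothetical thematic data sources; the coordinates are made up.
SOURCES = {
    "herbarium_A": {"Mimosa leucocephala": [(19.4, -99.1)]},
    "database_B": {"Leucaena leucocephala": [(14.6, -90.5), (17.2, -88.8)]},
}


def resolve_names(enquiry):
    """Stage 1: return the accepted name and all synonyms for an enquiry name."""
    accepted = CHECKLIST.get(enquiry)
    if accepted is None:
        return []
    return sorted(name for name, acc in CHECKLIST.items() if acc == accepted)


def gather_records(names):
    """Stage 2: query every source under every name and merge the responses."""
    records = []
    for source, table in sorted(SOURCES.items()):
        for name in names:
            for lat, lon in table.get(name, []):
                records.append(
                    {"source": source, "name": name, "lat": lat, "lon": lon}
                )
    return records


def summarise(records):
    """Stage 3: a trivial analysis step that counts records per source."""
    counts = {}
    for record in records:
        counts[record["source"]] = counts.get(record["source"], 0) + 1
    return counts


def run_workflow(enquiry):
    """Chain the three stages for a single enquiry name."""
    return summarise(gather_records(resolve_names(enquiry)))
```

Because Stage 1 expands the enquiry to every synonym, a record filed under an old name in one source is still returned alongside records filed under the accepted name in another.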
Architecture
• BDWorld Resources: data sets & analytical tools
• Resource Wrappers
• BGI (BDWorld GRID Interface)
• GRID
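One way to picture the wrapper layer is as a common interface that every data set and analytical tool implements, with a registry standing in for the BGI. The sketch below is an assumption for illustration: the class names, the `describe`/`invoke` methods and the registry are hypothetical and are not taken from BDWorld.

```python
from abc import ABC, abstractmethod


class ResourceWrapper(ABC):
    """Hypothetical common interface exposed by every wrapped resource."""

    @abstractmethod
    def describe(self):
        """Return metadata: the resource type and the operations it offers."""

    @abstractmethod
    def invoke(self, operation, **params):
        """Run one named operation and return its result."""


class OccurrenceSource(ResourceWrapper):
    """Wraps a data set: a list of occurrence records held in memory."""

    def __init__(self, records):
        self._records = list(records)

    def describe(self):
        return {"type": "data set", "operations": ["occurrences"]}

    def invoke(self, operation, **params):
        if operation != "occurrences":
            raise ValueError(f"unsupported operation: {operation}")
        return [r for r in self._records if r["taxon"] == params["taxon"]]


class RecordCounter(ResourceWrapper):
    """Wraps an analytical tool: here, simply counting records."""

    def describe(self):
        return {"type": "analytical tool", "operations": ["count"]}

    def invoke(self, operation, **params):
        if operation != "count":
            raise ValueError(f"unsupported operation: {operation}")
        return len(params["records"])


class Interface:
    """Stand-in for the BGI: registers wrappers and routes calls to them."""

    def __init__(self):
        self._resources = {}

    def register(self, resource, wrapper):
        self._resources[resource] = wrapper

    def find(self, operation):
        """Resource location: names of all resources offering an operation."""
        return sorted(
            name
            for name, wrapper in self._resources.items()
            if operation in wrapper.describe()["operations"]
        )

    def call(self, resource, operation, **params):
        return self._resources[resource].invoke(operation, **params)
```

The point of the design is that a workflow only ever talks to the interface, so a new data set or tool can be added by writing one wrapper rather than changing the workflow.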
1. Bioclimatic Modelling:
• Predicting species distributions
under past, present and future
climate scenarios.
• Models:
– GARP (Genetic Algorithm for Rule-set Production)
– CSM (Climate Space Models)
– Bioclim
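To make the idea of a climate envelope concrete, here is a minimal Bioclim-style sketch. It assumes the simplest form of the approach: the envelope is the minimum-to-maximum range of each climate variable at known presence points, and a site counts as suitable when every variable falls inside its range. Practical implementations typically trim the ranges to percentiles; the variable names and values below are invented.

```python
def fit_envelope(presence_samples):
    """Return {variable: (min, max)} over climate values at presence points.

    presence_samples is a non-empty list of dicts, one per presence point,
    each mapping a climate variable name to its value at that point.
    """
    variables = presence_samples[0].keys()
    return {
        var: (
            min(sample[var] for sample in presence_samples),
            max(sample[var] for sample in presence_samples),
        )
        for var in variables
    }


def is_suitable(envelope, site):
    """True when every climate variable at the site lies inside the envelope."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Made-up climate values at three presence points.
presence = [
    {"mean_temp_c": 20.0, "annual_rain_mm": 800.0},
    {"mean_temp_c": 24.0, "annual_rain_mm": 1500.0},
    {"mean_temp_c": 28.0, "annual_rain_mm": 1200.0},
]
envelope = fit_envelope(presence)
```

Predicting under a past or future scenario then amounts to calling `is_suitable` with each grid cell's modelled climate instead of its present-day climate.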
Case Study - Leucaena leucocephala
Leucaena leucocephala
(Lam.) De Wit
• Native of Central America
• Widely introduced around the tropics
• Widely utilised around the globe for:
- Wood
- Forage
- Soil enrichment and erosion control
• Regarded as an invasive weed in some areas
Distribution Data
• Area data from ILDIS
• Point data from private databases and herbaria
Point data of Leucaena leucocephala from Hughes (1998)
October 2001
Example of Modelling
Model of Leucaena leucocephala - for exploring:
- in which countries may further introductions be made?
- has the species become invasive by adapting to new niches?
- how will the distribution change under global warming scenarios?
Leucaena leucocephala –
future predictions
• Hadley Centre Coupled Model (HadCM3), IS92a scenario
“Population rises to 11.3 billion by 2100 and economic growth averages 2.3%
per annum between 1990 and 2100 with a mix of conventional and renewable
energy sources being used.”
Global view
Workflow Design
2. Biodiversity Richness &
Conservation Evaluation
• Which areas represent an optimal
conservation area network?
• What compromises can be made in
such a selection process?
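The two questions above can be illustrated with a greedy complementarity heuristic: repeatedly pick the area that adds the most species not yet represented. This is a sketch under stated assumptions, not the algorithm used by WorldMap; a greedy choice is not guaranteed to give the smallest possible network, and the area names and species sets are invented.

```python
def greedy_network(areas, max_areas=None):
    """Pick areas greedily by how many unrepresented species each adds.

    areas maps an area name to the set of species recorded there.
    max_areas optionally caps the size of the network (a budget).
    Returns (chosen area names in pick order, species left unrepresented).
    """
    remaining = set().union(*areas.values())
    chosen = []
    while remaining and (max_areas is None or len(chosen) < max_areas):
        # Iterating over sorted names makes ties resolve alphabetically.
        best = max(sorted(areas), key=lambda name: len(areas[name] & remaining))
        gain = areas[best] & remaining
        if not gain:
            break
        chosen.append(best)
        remaining -= gain
    return chosen, remaining


# Made-up species lists for four candidate areas.
candidate_areas = {
    "A": {"sp1", "sp2", "sp3"},
    "B": {"sp3", "sp4"},
    "C": {"sp4", "sp5"},
    "D": {"sp1"},
}
```

Setting `max_areas` shows the compromise directly: the second return value lists the species that a smaller network would leave unrepresented.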
3. Phylogenetic Analysis &
Biogeography:
• Does a combined analysis of climate
and character data enhance the
robustness of a phylogenetic
analysis?
A strict consensus of
1024 most parsimonious
trees for Pelargonium
Some relevant resource types:
• Data sources:
– Taxonomic Verification and Synonymic Indexing
Species 2000 & ITIS Catalogue of Life
– Species Information Sources (SISs)
• Species geography: Species bank databases
• Descriptive data: Species bank databases
• Specimen distribution (BioCASE, AVH, SpeciesAnalyst, RDG,
MBG ... DiGIR, ABCD, etc.)
– Geographical
• Boundaries of geographical & political units
• Climate surfaces (Hadley, Paul Valdes' Palaeoclimate Data)
• Modelled Climate progressions past and future
– Genetic sequences (EMBL/GenBank, local data)
• Analytic tools:
– Biodiversity richness assessment (WorldMap)
– Bioclimatic modelling (GARP, CSM, Bioclim)
– Phylogenetic analysis (PAUP, ClustalW, etc.)
What does this mean for data
management - data sets?
• Functionality and integrity –
– Accurate access by taxonomy
• Synonymic indexing in taxonomic verification
systems
• Accurate identification and names in other data sets
– Accurate access by geographical distribution
• Accurate geospatial data for specimen and
observational datasets
• Also a role for political units in synthetic datasets
– Accurate access via metadata and semantic
mediation
• Semantic inference using metadata
and ontology
What does this mean for data
management - systems?
• Global Connectivity –
– Need for physical connectivity
• WWW, GRID, Semantic Web ...
– Need for Semantic Standards
• TDWG (IUBS Taxonomic Databases Working Group)
• GBIF
– Need for generic solutions to resource location,
metadata and packaging of biodiversity objects.