TFBIS Project 282
Primer / discussion document for comment from project team and wider community
Prepared by Jochen Schmidt
Background
Various organisations across New Zealand hold and/or curate archives and databases of freshwater biodiversity survey (FBIO) data. By FBIO data we mean observations of species in the field.
Typically these data contain many attributes related to each observation. To date there is no easy, nationally agreed way to query or transfer freshwater biodiversity survey information across organisations and systems. New Zealand needs a biodiversity interoperability framework to achieve that. In this project we conduct a number of workshops with a key stakeholder group of freshwater biodiversity data managers and architects to scope key users, components, and architecture, and to develop a roadmap for a national freshwater biodiversity interoperability framework. (from NIWA TFBIS proposal)
Key Stakeholder Group
Jim McLoed, WRC
Jim Fretwell, BOP
Jerry Cooper, LCR
James Lambie, Horizons
Jochen Schmidt, NIWA
Brent Wood, NIWA
Paul Barter, Cawthron
Norm Thornley, DOC
Lucy Baker, MfE (Observer)
Potentially observer from MSI/MAF
Target provider audience
Our understanding is that there are different communities with regard to the management of FBIO data:
1. A range of CADDIS users among regional councils and Cawthron.
2. A range of EcoBase (IQUEST) users among regional councils.
3. The freshwater fish community, which feeds data into FFDB, which in turn feeds into FBIS.
4. The NIWA aquatic plants group, which uses AQPDB, which feeds into FBIS.
5. NIWA freshwater invertebrate groups, which feed into FBIS using various spreadsheets.
6. NIWA freshwater algae groups, which plan to feed into FBIS using various spreadsheets.
7. A large range of NIWA and RC staff who use their own Excel spreadsheets etc. (= ‘wild west’)
It seems sensible to focus for now on how to integrate the CADDIS/EcoBase/FBIS communities into an interoperability framework.
Target user audience
The key user group we are focusing on in this study is freshwater scientists or other end-users who are interested in acquiring all available FBIO data for a particular domain of interest and who want to conduct some sort of analysis or investigation of that dataset. Our understanding of information use in the freshwater bio community is that there are many scientists, consultants, and other end-users who want to extract data and perform their own analyses on the extracted data. Key problems will be consistency in taxa data, location data, time data, and methods across different providers.
Use-cases for the user audience
A NIWA freshwater scientist seeks all observations of a particular species made in a particular year in NZ. She wants to produce a species density map from those data.
A Yale PD wants to get time-tagged observation information on a range of different freshwater fish species for New Zealand. He wants to study predator-prey relationships (real case;-)
A regional council scientist wants to extract all available data for invertebrate species x for a particular catchment, to create invertebrate abundance maps and derive the ecological health of different rivers.
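The three use-cases share one query pattern: select observations by taxon, by spatial extent, and by time. A minimal Python sketch of that pattern follows. The record fields (taxon, lon, lat, observed), the function name, and the sample coordinates and dates are illustrative assumptions, not part of any existing FBIO system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple

# Hypothetical, simplified observation record; field names are assumptions.
@dataclass(frozen=True)
class Observation:
    taxon: str
    lon: float
    lat: float
    observed: date

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]

def select_observations(
    records: Iterable[Observation],
    taxon: Optional[str] = None,
    bbox: Optional[BBox] = None,
    year: Optional[int] = None,
) -> List[Observation]:
    """Return the records that match every filter that is not None."""
    selected = []
    for rec in records:
        if taxon is not None and rec.taxon != taxon:
            continue
        if year is not None and rec.observed.year != year:
            continue
        if bbox is not None:
            min_lon, min_lat, max_lon, max_lat = bbox
            inside = min_lon <= rec.lon <= max_lon and min_lat <= rec.lat <= max_lat
            if not inside:
                continue
        selected.append(rec)
    return selected

# Made-up sample records for illustration only.
records = [
    Observation("Galaxias maculatus", 175.3, -37.8, date(2010, 3, 14)),
    Observation("Galaxias maculatus", 172.6, -43.5, date(2011, 2, 1)),
    Observation("Anguilla dieffenbachii", 175.3, -37.8, date(2010, 11, 30)),
]
hits = select_observations(records, taxon="Galaxias maculatus", year=2010)
```

The sketch only works if taxon names, coordinates, and dates mean the same thing for every provider, which is exactly the consistency problem named above.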
What are the fundamental and common concepts of a FBIO Management system?
The key paradigm of the FBIO interoperability framework can be defined as:
Regardless of how and where FBIO data is managed, we want stakeholders to be able to discover it in a taxa, spatial, and temporal context, provide consistent metadata, and access the information in a consistent way.
To enable interoperability we need to define, at an abstract level, the concepts that are common to the underlying FBIO systems.
We suggest they are:
Feature of interest. The surveyed environmental feature. Typically for freshwater: river reach, lake, lagoon, wetland, estuary, … In a DB it is typically represented as a descriptive field or a link to reference GIS. It is typically described, in simplified form, by a geospatial feature (although we realize that this is a simplification!). Could be exposed as WFS/WMS?
Survey (or dataset). A set of FBIO data collected for one purpose (e.g. project), through one survey. Survey or dataset information provides valuable meta information for FBIO data, such as organisation, project, purpose, limitations of use, etc. The scope of what a “survey” entails can vary and is defined by the curator. We suggest using ANZLIC? / Darwin Core? as the common data concept to describe surveys, and Catalogue Services for the Web (CSW) as the standard delivery mechanism.
Sampling efforts (or sampling events). A survey always consists of a number of sampling events. A sampling event is defined by FBIO data collected at one location (coordinate), at one date/time, with one method (and more?). What is the common conceptual information model for sampling efforts? We suggest WFS/WMS as the common delivery mechanism.
Sampling data. The actual data can be quite variable and multi-faceted. Therefore, in FBIS we have currently modelled them as a set of (hierarchical) key-value pairs. What is the common conceptual information model for sampling data?
(Note: We are just focussed on survey data for the purposes of this project, but we do recognise that other data, e.g. community captured, & opportunistic, exists).
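To illustrate what a set of (hierarchical) key-value pairs could look like in practice, the sketch below flattens a nested record into path/value pairs. The keys, values, and the "/" path separator are invented for illustration and do not reflect the actual FBIS schema.

```python
from typing import Any, Dict, Iterator, Tuple

def flatten(pairs: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every leaf of a nested dict, joining keys with '/'."""
    for key, value in pairs.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the nested level, carrying the path so far.
            yield from flatten(value, path)
        else:
            yield path, value

# Hypothetical sampling data for one sampling effort.
sampling_data = {
    "taxon": "Deleatidium",
    "count": 42,
    "habitat": {"substrate": "cobble", "flow": {"type": "riffle"}},
}
flat = dict(flatten(sampling_data))
```

A flat path/value form like this is easy to exchange, but providers would still need to agree on the key vocabulary for the values to be comparable.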
[Figure: conceptual model linking Feature of interest (metadata, GIS polygon), Survey / Dataset (metadata), Sampling effort (metadata, point, time), Method, Taxa / NZOR, and Sampling data (sampling data, taxa) through 1-to-n relationships; one link is labelled “Represented through”.]
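One possible reading of the conceptual model, written as Python dataclasses. All class and field names are assumptions made for this sketch, and the cardinalities chosen here (one survey to many sampling efforts, one effort to many data records, one feature of interest per effort) are an interpretation rather than an agreed design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Tuple

@dataclass
class FeatureOfInterest:
    name: str
    kind: str  # e.g. "river reach", "lake", "wetland"
    polygon: List[Tuple[float, float]] = field(default_factory=list)

@dataclass
class SamplingData:
    taxon: str  # ideally a reference that resolves against NZOR
    values: Dict[str, Any] = field(default_factory=dict)

@dataclass
class SamplingEffort:
    location: Tuple[float, float]  # one coordinate
    when: datetime                 # one date/time
    method: str                    # one method
    feature: FeatureOfInterest
    data: List[SamplingData] = field(default_factory=list)

@dataclass
class Survey:
    title: str
    organisation: str
    purpose: str
    efforts: List[SamplingEffort] = field(default_factory=list)

    def taxa(self) -> List[str]:
        """Sorted list of distinct taxa recorded anywhere in the survey."""
        return sorted({d.taxon for e in self.efforts for d in e.data})

# Made-up example values.
reach = FeatureOfInterest("Example Stream, lower reach", "river reach")
effort = SamplingEffort(
    location=(175.3, -37.8),
    when=datetime(2010, 3, 14, 9, 30),
    method="kick net",
    feature=reach,
    data=[
        SamplingData("Potamopyrgus antipodarum", {"count": 12}),
        SamplingData("Deleatidium", {"count": 42}),
    ],
)
survey = Survey("Example survey", "Example organisation", "monitoring", [effort])
```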
Key elements of a NZ FBIO interoperability framework
We suggest the following elements need to be dealt with:
1. Interoperable taxa management. To be interoperable, all taxa data need to be compatible. This should be achieved through references to NZOR.
2. Interoperable feature of interest management. To be interoperable, references to features of interest need to be compatible across the databases. Not so important if the end-user uses coordinates?
3. Interoperable spatial references. To be interoperable, all spatial references need to be compatible: either the same coordinate system is used, or a well-defined transformation is applied (on the fly).
4. Interoperable temporal references. To be interoperable, all temporal references need to be compatible. Should not be a big deal??
5. Interoperable methods references. To be interoperable, all methods references need to be compatible. Typically different systems use their own terminology, so it will be very hard to achieve consistency here. Probably best dealt with by reference to a locally managed methods register.
6. Open standard (web service) to expose survey information. We suggest this should be done through OGC Catalogue Services for the Web, CSW (or others?). Note that this web service needs to include links to (web services exposing) the sampling efforts related to the survey.
7. Open standard (web service) to expose sampling effort information. We suggest this should be done through a clearly specified OGC Web Feature Service, WFS (or SOS? or others?). Note that this web service needs to include links to (web services exposing) the sampling data related to the effort.
8. Open standard (web service) to expose sampling data. Need to develop a standard for that? Is there an existing one? GBIF/SOS?
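Elements (1), (4) and (5) amount to mapping each provider's local terms onto shared references. The sketch below shows the idea in Python. The lookup tables, identifiers such as "taxon-001", method codes, and accepted date formats are all hypothetical; real mappings would come from NZOR and from a methods register.

```python
from datetime import datetime
from typing import Dict, Optional

# Hypothetical lookup tables, keyed by lower-cased local terms.
TAXON_IDS = {"inanga": "taxon-001", "galaxias maculatus": "taxon-001"}
METHOD_CODES = {"electric fishing": "EF", "efm": "EF", "kick net": "KN"}
# Date formats assumed to occur in provider data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(text: str) -> str:
    """Return the date as an ISO 8601 string, trying each known format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")

def harmonise(record: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map local taxon, method, and date values onto the shared references.

    Unknown taxa or methods map to None so they can be reviewed by a curator.
    """
    return {
        "taxon_id": TAXON_IDS.get(record["taxon"].strip().lower()),
        "method_code": METHOD_CODES.get(record["method"].strip().lower()),
        "date": parse_date(record["date"]),
    }

harmonised = harmonise({"taxon": "Inanga", "method": "EFM", "date": "14/03/2010"})
```

Spatial references (3) are left out on purpose: a coordinate transformation needs a projection library and an agreed target coordinate system, neither of which is specified here.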
Overall architecture of a NZ FBIO framework.
We suggest that the overarching principles and architecture should be defined as follows:
Every major collector of FBIO data needs to feed their data into a “managed FBIO system”.
A “managed FBIO system” uses the fundamental concepts as defined above.
A “managed FBIO system” is consistent with (1), (2), (3), (4), (5) above. (We might be able to relax 2, 5?).
A “managed FBIO system” stores survey information and exposes it through (6), (7), and (8).
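As a rough sketch of these principles, a “managed FBIO system” could be seen as anything that offers three linked views: surveys, the sampling efforts of a survey, and the sampling data of an effort. The Python below expresses that as an interface with a toy in-memory provider and a function that walks the links across several providers. The interface, method names, and record keys are assumptions for illustration; in the framework proposed here the three views would be web services, not Python calls.

```python
from typing import Dict, Iterable, List, Protocol

class ManagedFBIOSystem(Protocol):
    """Hypothetical interface mirroring elements (6), (7) and (8)."""
    def surveys(self) -> List[dict]: ...
    def sampling_efforts(self, survey_id: str) -> List[dict]: ...
    def sampling_data(self, effort_id: str) -> List[dict]: ...

class InMemorySystem:
    """Toy provider backed by dictionaries, standing in for a real database."""
    def __init__(self, surveys: List[dict],
                 efforts: Dict[str, List[dict]],
                 data: Dict[str, List[dict]]) -> None:
        self._surveys = surveys  # list of {"id": ...}
        self._efforts = efforts  # survey id -> list of {"id": ...}
        self._data = data        # effort id -> list of {"taxon": ...}

    def surveys(self) -> List[dict]:
        return list(self._surveys)

    def sampling_efforts(self, survey_id: str) -> List[dict]:
        return list(self._efforts.get(survey_id, []))

    def sampling_data(self, effort_id: str) -> List[dict]:
        return list(self._data.get(effort_id, []))

def collect(systems: Iterable[ManagedFBIOSystem], taxon: str) -> List[dict]:
    """Follow survey -> effort -> data links in every system and keep one taxon."""
    found = []
    for system in systems:
        for survey in system.surveys():
            for effort in system.sampling_efforts(survey["id"]):
                for row in system.sampling_data(effort["id"]):
                    if row["taxon"] == taxon:
                        found.append({**row, "survey": survey["id"], "effort": effort["id"]})
    return found

# Two made-up providers.
provider_a = InMemorySystem(
    [{"id": "a-s1"}], {"a-s1": [{"id": "a-e1"}]},
    {"a-e1": [{"taxon": "taxon-001", "count": 3}, {"taxon": "taxon-002", "count": 1}]},
)
provider_b = InMemorySystem(
    [{"id": "b-s1"}], {"b-s1": [{"id": "b-e1"}]},
    {"b-e1": [{"taxon": "taxon-001", "count": 7}]},
)
rows = collect([provider_a, provider_b], "taxon-001")
```

Because collect depends only on the three methods, any provider that exposes them can be added without changing the query code, which is the point of agreeing on common concepts and delivery mechanisms.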