Endresen GermplasmDwC

GLOBAL BIODIVERSITY INFORMATION FACILITY
DarwinCore Germplasm Extension
and deployment in the GBIF
infrastructure
TDWG 2009, Montpellier, November 12, 2009
Dag Endresen (NordGen) & Samy Gaiji (GBIF)
WWW.GBIF.ORG
Topics for this session






Darwin Core (2009)
DwC Germplasm Extension (DRAFT 0.1)
Germplasm Extension Terms
Mapping to the Multi-Crop Passport Descriptors
Integrated Publishing Toolkit (IPT)
IPT Germplasm Extension
Darwin Core (2009)

The Darwin Core should be viewed as an
extension of the Dublin Core for biodiversity
information.

The purpose of these terms is to facilitate
data sharing and to maximize re-usability through:
• a well-defined standard core vocabulary
• a flexible framework
The Darwin Core can be extended by
adding new terms to share additional
information.
http://rs.tdwg.org/dwc/
DwC star schema

Star schema model

Can relate elements
one-to-many
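
As a rough illustration of the star schema, the sketch below keeps one core record per accession and links any number of extension rows to it through a shared identifier. The field names (`id`, `coreid`, `catalogNumber`, `germinationPercent`) and all sample values are assumptions made for this example, not terms taken from the draft extension.

```python
from collections import defaultdict

# Core records: one row per accession, keyed by a unique id (hypothetical data).
core_records = [
    {"id": "acc-1", "catalogNumber": "ACC-0001", "genus": "Hordeum"},
    {"id": "acc-2", "catalogNumber": "ACC-0002", "genus": "Avena"},
]

# Extension rows: each one points back to a core record through "coreid".
# Several rows may share the same coreid (the one-to-many relation).
extension_rows = [
    {"coreid": "acc-1", "germinationPercent": "95"},
    {"coreid": "acc-1", "germinationPercent": "91"},
    {"coreid": "acc-2", "germinationPercent": "88"},
]


def attach_extensions(core, extensions):
    """Return copies of the core records with their extension rows attached."""
    by_core = defaultdict(list)
    for row in extensions:
        by_core[row["coreid"]].append(row)
    # A core record without extension rows simply gets an empty list.
    return [dict(record, extensions=by_core.get(record["id"], [])) for record in core]


joined = attach_extensions(core_records, extension_rows)
```

The core record stays flat, while each extension row can repeat per core record; this is what lets an extension add detail without changing the core terms.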
DwC Germplasm Extension

DwC Germplasm Extension (DRAFT 0.1)
 August 26, 2009

The DarwinCore Germplasm Extension provides additional terms
to describe germplasm samples maintained by genebanks worldwide.
http://rs.nordgen.org/dwc/
http://www.nordgen.org/epgris3/wiki/index.php/DwC_Germplasm
DwC Germplasm Extension

DwC Germplasm Extension (DRAFT 0.1)

Modelled starting from the Multi-Crop
Passport Descriptors standard (MCPD, 2001)

Includes the new terms for crop trait
experiments developed as part of the
European EPGRIS3 project.

Includes a few additional terms for new
international crop treaty regulations.
DwC Germplasm (1)
DwC Germplasm (2)
DwC Germplasm (3)
DwC Germplasm (4)
DwC Germplasm (5)
DwC Germplasm (6)
GermplasmDistribution
Perhaps add new terms to facilitate the reporting of germplasm
distribution for the ITPGRFA (International Treaty on Plant Genetic
Resources for Food and Agriculture)
GermplasmManagement
The Millennium Seed Bank (Kew) has contributed feedback to the
DwC-G modeling and proposed to include a number of seed
management descriptors.
• Seed processing terms
• Seed cleaning
• Seed germination testing
Mapping of DwC-G terms to the MCPD descriptors
Mapping of DwC-G terms to the MCPD descriptors (continued)
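
A mapping of this kind can be sketched as a simple lookup table plus a value conversion. The pairs in `MCPD_TO_DWC` are an illustrative subset chosen for this example and are assumptions, not the mapping shown on the slides. The coordinate helper assumes the MCPD 2001 convention of degrees, minutes and seconds followed by a hemisphere letter, with hyphens for missing minutes or seconds (for example `10----S`).

```python
# Illustrative subset only: these pairs are assumptions for the example,
# not the authoritative DwC-G to MCPD mapping.
MCPD_TO_DWC = {
    "INSTCODE": "institutionCode",
    "ACCENUMB": "catalogNumber",
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "COLLSITE": "locality",
}


def _part(text):
    """Minutes or seconds; '--' marks a missing value and counts as zero."""
    return 0 if text == "--" else int(text)


def mcpd_coordinate_to_decimal(value):
    """Convert e.g. '103020S' (latitude) or '0203045E' (longitude) to decimal degrees."""
    value = value.strip().upper()
    hemisphere, body = value[-1], value[:-1]
    if hemisphere not in "NSEW" or len(body) not in (6, 7):
        raise ValueError(f"unexpected MCPD coordinate: {value!r}")
    degrees = int(body[:-4])
    decimal = degrees + _part(body[-4:-2]) / 60 + _part(body[-2:]) / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if hemisphere in "SW" else decimal


def translate_record(mcpd_record):
    """Rename mapped MCPD fields; unmapped fields keep their original names."""
    return {MCPD_TO_DWC.get(key, key): value for key, value in mcpd_record.items()}
```

For instance, `translate_record({"INSTCODE": "XYZ001", "GENUS": "Hordeum"})` would return the same values under the keys `institutionCode` and `genus`; the institute code here is made up.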
MCPD -> ABCD 2.06 (2004)
National Inventory Code
Institute Code
Accession Number
Collecting Number
Collecting Institute Code
Genus
Species
Species Authority
„Subtaxa“
„Subtaxa“ Authority
Common Crop Name
Accession Name
Acquisition Date
Country of Origin
Location of Collection Site
Latitude of CS
Longitude of CS
Elevation of CS
Collecting Date of Sample
Breeding Institute Code
Biological Status of Accession
Ancestral Data
Collecting/Acquisition Source
Donor Institute Code
Donor Accession Number
Other Identification (Number) associated with the accession
Location of Safety Duplicates
Type of Germplasm Storage
Remarks
Decoded Collecting Institute
Decoded Breeding Institute
Decoded Donor Institute
Decoded Safety Duplication Location
Accession URL
Helmut Knüpffer
IPK Gatersleben
Descriptors marked red did not match the earlier versions of ABCD
 ABCD was extended by a PGR section [W. Berendsohn, H. Knüpffer]
Walter Berendsohn
BGBM
http://www.ecpgr.cgiar.org/epgris/Tech_papers/EURISCO_Descriptors.pdf
GBIF Informatics Suite

GBIF Decentralization
Strategy (WP 2009-2010)

Customized biodiversity
data networks

Tools to empower
decentralized thematic or
regional networks
IPT






Project site: http://code.google.com/p/gbif-providertoolkit/
IPT DEMO: http://ipt.gbif.org/
IPT LITE DEMO: http://ipt-lite.gbif.org/index.html
IPT Mailing List: http://lists.gbif.org/mailman/listinfo/ipt/
GBIF HIT: http://code.google.com/p/gbif-indexingtoolkit/
GBRDS: http://code.google.com/p/gbif-registry/
• The GBIF IPT is an open source, Java (TM) based web
application that connects and serves primary biodiversity
data.
• The data registered in the IPT is connected to the GBIF
distributed network and made available for public
access.
• Designed to decentralize and speed up the process of
indexing (large) biodiversity occurrence datasets.
• IPT also provides a local tool for data quality
assessment to data publishers.
GBIF Integrated Publishing Toolkit (IPT)
Java 1.5 or higher is required
Apache Tomcat is recommended (1 GB RAM+)
GBIF IPT is provided as a WAR archive (for easy
deployment)
GeoServer is included for web mapping (OGC
Compliant, WFS, WMS, etc)
H2 Embedded Java Database (with JDBC
interface and web console)
Hibernate (object relational mapping)
http://ipt.nordgen.org/ipt/
IPT Interfaces





REST XML
TAPIR
DwC Archive
OGC (WFS, WMS, Web Mapping)
EML (Ecological Metadata Language)
Darwin Core Archive (DwC-A)



DwC-A publishes DwC records, including extensions
Simple text based format
Zipped single file archive
Germplasm.txt
http://code.google.com/p/gbif-ecat/wiki/DwCArchive
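
To make the "zipped text files" idea concrete, the sketch below writes a core table and a `Germplasm.txt` extension table as tab-delimited text into a single zip file and reads one table back. The file name `occurrence.txt`, the column names and the helper functions are assumptions for this example. The sketch also leaves out any descriptor or metadata file that a complete archive would carry.

```python
import csv
import io
import zipfile


def write_tsv(rows, fieldnames):
    """Serialize a list of dicts as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_archive(core_rows, germplasm_rows):
    """Zip the core table and the extension table into one in-memory archive."""
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical column names; a real dataset would use the published terms.
        archive.writestr("occurrence.txt", write_tsv(core_rows, ["id", "catalogNumber"]))
        archive.writestr("Germplasm.txt", write_tsv(germplasm_rows, ["coreid", "storageType"]))
    return payload.getvalue()


def read_table(archive_bytes, member):
    """Read one tab-delimited member of the archive back into a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        text = archive.read(member).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

Each extension row carries the identifier of its core record in the first column, so a consumer can re-join the tables after unzipping.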
The GBIF IPT
service has a
graphical interface
to the datasets,
including a map, pie
charts, and the right-side
context menu
(taxonomy and
geography).
The IPT user
interface includes
the extensions
XML interface
includes the
extensions
GBIF IPT implements the
Darwin Core Standard; and
provides an interface to
easily build extensions to
the core Darwin Core terms.
The draft germplasm
extension is one example of
how to extend the Darwin
Core terms for the GBIF IPT.
Using GBIF technology (and
contributing to its
development), the PGR
community can easily
establish specific PGR
networks without
duplicating GBIF's work.
The compatibility of data
standards between PGR
and biodiversity collections
made it possible to integrate
the worldwide germplasm
collections into the
biodiversity community
(GBIF, TDWG).
GBIF PGR Network 2
http://data.gbif.org/datasets/network/2
Special thanks to:
• GBIF, Global Biodiversity Information
Facility http://www.gbif.org
• TDWG, Biodiversity Information
Standards http://www.tdwg.org
• BioCASE, The Biological Collection
Access Service for Europe.
http://www.biocase.org
• Bioversity International
http://www.bioversityinternational.org
Things can happen in a band, or any type of collaboration, that would not otherwise happen. (Jim Coleman, Musician)