Developing a Protocol for Scientific Collections

Week 9
Natural History Institute, Prescott, AZ.
Senior Project- BS in Environmental Studies
Luna (Taide) Martínez G.
PRESCOTT COLLEGE
Spring 2014
Week 9: Contributing to SEINet
In this section, I describe the steps that institutions need to follow in order to
become contributors to SEINet.
SEINet
Introduction
The Southwest Environmental Information Network (SEINet) is a portal that distributes data
resources to the scientific community in the region. It offers tools to locate, access, and
work with different kinds of data. About 20 member institutions provide the information that
is shared with the world. SEINet currently hosts specimen and observation collections, an
image library, plant games for users, 8 flora projects, and other access technologies. It holds
a total of 1.1 million specimen records, mainly from the Southwest. SEINet was created at
Arizona State University with a grant from the National Science Foundation in 1999.
Characteristics
-Making Checklists: Users can choose a collection or a specific part of it and
convert a list of specimens into a list of taxa. This could be used for class field trips, local
managers, or other education and conservation projects.
-Plant images as an aid to identification: SEINet’s image library includes over
17,000 records. Many taxa have images associated with them, so clicking on a taxon in a
checklist will bring up any media files linked to it.
-Mapping: Users can search a collection for a given family, genus, or species and
produce a Google Map or a Google Earth map of their area.
-Using maps to identify specimens: Users can narrow a specimen down to an area
and use the Map feature to identify it.
-Identification keys: SEINet allows users to zoom into a specimen and use
interactive keys that appear for each specific area.
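The checklist feature listed above amounts to collapsing specimen records into a list of distinct taxa. A minimal sketch of that idea follows, assuming each record is a dictionary with Darwin Core style `family` and `scientificName` keys. The function name `make_checklist` and the sample records are illustrative only and are not part of SEINet.

```python
def make_checklist(records):
    """Collapse specimen records into a sorted list of unique (family, taxon) pairs."""
    taxa = set()
    for rec in records:
        name = rec.get("scientificName", "").strip()
        if name:  # skip records that have no identification
            taxa.add((rec.get("family", "").strip(), name))
    return sorted(taxa)


# Hypothetical specimen records, for illustration only.
specimens = [
    {"catalogNumber": "0001", "family": "Pinaceae", "scientificName": "Pinus ponderosa"},
    {"catalogNumber": "0002", "family": "Fagaceae", "scientificName": "Quercus gambelii"},
    {"catalogNumber": "0003", "family": "Pinaceae", "scientificName": "Pinus ponderosa"},
    {"catalogNumber": "0004", "family": "", "scientificName": ""},
]

checklist = make_checklist(specimens)
for family, taxon in checklist:
    print(f"{family}: {taxon}")
```

Four specimen records reduce to two taxa here, because duplicates and unidentified specimens drop out.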
Contributing to SEINet
New users can contribute to SEINet through Symbiota. Any data set that is
compliant with any version of Darwin Core can be loaded into a data portal. Once the user
uploads these records, they are indexed into a relational structure alongside existing records.
Symbiota provides the following guidelines for compliance with its field names:
http://symbiota.org/tiki/files/SymbiotaOccurrenceFields.pdf
It also provides a spreadsheet template that can be used to configure a database that
we wish to upload:
http://symbiota.org/tiki/files/SymbiotaOccurrenceFields.pdf
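Before uploading, it can help to check a file's column headers against the expected field names. The sketch below assumes a small, hand-picked subset of Darwin Core terms; the authoritative list is the Symbiota field guide linked above. The function name `check_headers` and the sample data are hypothetical.

```python
import csv
import io

# A hand-picked subset of Darwin Core terms, for illustration only.
EXPECTED_FIELDS = {
    "catalogNumber", "scientificName", "family", "recordedBy", "eventDate",
    "country", "stateProvince", "county", "locality",
    "decimalLatitude", "decimalLongitude",
}


def check_headers(csv_text):
    """Return the column names in the header row that are not expected fields."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in header if name.strip() not in EXPECTED_FIELDS]


# Hypothetical file contents: "collector" is not a Darwin Core term.
sample = (
    "catalogNumber,scientificName,collector,eventDate\n"
    "0001,Pinus ponderosa,A. Collector,2014-03-15\n"
)

unknown = check_headers(sample)
print(unknown)
```

Any name returned by `check_headers` would need to be renamed or mapped to a recognized field before loading.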
There are three main methods of contributing specimen data, plus a fourth as a fallback:
1. Providing direct read-only connection to our existing database.
2. Flat comma-delimited data file (CSV). Comma-separated values files store
tabular data in plain-text form. A file consists of a number of records separated
by line breaks; each record consists of fields separated by a delimiter character,
commonly a comma. The file would be compressed as a zip file before sending it to
Symbiota.
3. Providing a URL to a DiGIR (Distributed Generic Information Retrieval)
provider. DiGIR is an open-source protocol for querying distributed databases. Most of
these providers, however, do not include images or annotation information.
4. Sending a copy of the database to Symbiota (only if none of the three previous
options is possible).
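The CSV method above (write the records as delimited text, then compress the file) can be sketched as follows. This is only an illustration using Python's standard `csv` and `zipfile` modules. The column names are a small subset of Darwin Core terms, and the function name `build_zip`, the file name `occurrences.csv`, and the sample record are all hypothetical.

```python
import csv
import io
import zipfile

# Darwin Core term names used as column headers (a small subset).
FIELDS = ["catalogNumber", "scientificName", "recordedBy", "eventDate",
          "stateProvince", "decimalLatitude", "decimalLongitude"]


def build_zip(records, csv_name="occurrences.csv"):
    """Return the bytes of a zip archive holding one CSV file of the records."""
    text = io.StringIO()
    writer = csv.DictWriter(text, fieldnames=FIELDS)
    writer.writeheader()          # first line: the field names
    writer.writerows(records)     # one line per specimen record

    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(csv_name, text.getvalue().encode("utf-8"))
    return archive.getvalue()


# Hypothetical specimen record, for illustration only.
records = [{
    "catalogNumber": "0001",
    "scientificName": "Pinus ponderosa",
    "recordedBy": "A. Collector",
    "eventDate": "2014-03-15",
    "stateProvince": "Arizona",
    "decimalLatitude": "34.54",
    "decimalLongitude": "-112.47",
}]

zip_bytes = build_zip(records)
print(len(zip_bytes), "bytes in archive")
```

To produce a file on disk, the returned bytes can be written out with `open("occurrences.zip", "wb")`.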
Once this information is sent, portal managers can provide a way for collection
managers to update their collection regularly via a web interface. Annotation information
can be included and is usually supplied as a separate file/source. If there are images online,
specimen records can be linked to them. It is also necessary to let Symbiota know what
format the data are stored in (Excel, Oracle, etc.), and whether we have a field to tag species
that need protection. It is necessary to contact their portal administrator so they can give us
access to their data editing modules (seinetAdmin@asu.edu).
Creating a Symbiota Portal
The Symbiota website clearly states that it is preferable to contribute to regional
flora/fauna through integrating data into existing collections and datasets, even though it is
possible to install the software so it only features the data from a single collection.
Symbiota’s features appear to be more effective if they can work to link several collections
and datasets. Finally, setting up a new portal requires some technical management skills
in order to maintain and update it.
Requirements
-Apache HTTP Server (2.x or higher)
-PHP (5.x or higher)
-MySQL (5.x or higher)
-SVN Client
-Installation instructions and source code:
http://sourceforge.net/projects/symbiota/
http://symbiota.org/tiki/tiki-index.php?page=Installation+Instructions
Data requirements
-Specimen records: Annotations to the identification of a specimen should be included
whenever available.
-Taxonomic data: Needed to create a taxonomic thesaurus that can resolve
discrepancies.
-Morphological character information: Identification keys are generated directly from
descriptive data.
-Species lists: Static lists are preferred in order to maintain accuracy and greater control
over species composition, area comments, etc.
-Images: The system can host specimen and field images.
Data Loading
The layers of data in Symbiota are: user, taxonomic, occurrence, images, floristic,
ID key, and taxon profile. Many of these layers can be managed through a web interface.
1. User and permissions: A default administrative user is installed with the following
login: admin; password: admin.
2. Taxonomic Thesaurus: Required for all Symbiota modules (Occurrence, Floristic,
and Key Modules). Taxonomic names are stored within a data structure that defines
the taxonomic hierarchy.
-Names can be added one by one to the default thesaurus through:
-Batch Loading - Multiple names can be loaded from a flat CSV text file or
an ITIS download file.
3. Occurrence (specimen) Records: Several methods are used to upload specimen data.
The data interoperability link I provided describes these.
4. Images: This will create the image URLs that will be stored in the "images" table.
5. Biotic Survey Data (floras / faunas): Species within a checklist can be linked to
specimen vouchers within collections. Checklists can be created within one's user
profile and then made public. Projects can be created using the editing features.
6. Identification Key Data: Data need to be loaded directly into the table by hand. As
characters are loaded, they will need to be linked to the taxonomic hierarchy at the
ranking level at which they are most relevant.
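Items 2 and 6 above both rely on a taxonomic hierarchy: each name points to its parent, and a character attached at one rank applies to every taxon below it. The sketch below illustrates that idea with plain Python dictionaries. It is not Symbiota's actual table layout; the names `PARENT`, `CHARACTERS`, `lineage`, and `characters_for`, as well as the sample data, are hypothetical.

```python
# Each taxon maps to its parent taxon; the root has parent None.
PARENT = {
    "Pinaceae": None,
    "Pinus": "Pinaceae",
    "Pinus ponderosa": "Pinus",
    "Pinus edulis": "Pinus",
}

# Characters are attached at the rank where they are most relevant.
# Sample values are for illustration only.
CHARACTERS = {
    "Pinaceae": {"leaf type": "needle"},
    "Pinus ponderosa": {"needles per fascicle": "3"},
    "Pinus edulis": {"needles per fascicle": "2"},
}


def lineage(taxon):
    """Return the taxon followed by its ancestors, up to the root."""
    chain = []
    while taxon is not None:
        chain.append(taxon)
        taxon = PARENT[taxon]
    return chain


def characters_for(taxon):
    """Merge characters from the root down, so lower ranks override higher ones."""
    merged = {}
    for name in reversed(lineage(taxon)):
        merged.update(CHARACTERS.get(name, {}))
    return merged


print(lineage("Pinus ponderosa"))
print(characters_for("Pinus edulis"))
```

Attaching "leaf type" once at the family level means it does not have to be repeated for every species, which is the practical reason for linking characters at the most relevant rank.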