Week 9
Developing a Protocol for Scientific Collections
Natural History Institute, Prescott, AZ.
Senior Project - BS in Environmental Studies
Luna (Taide) Martínez G.
PRESCOTT COLLEGE
Spring 2014

Week 9: Contributing to SEINet

In this section, I will describe the steps that institutions need to follow in order to become contributors to SEINet.

SEINet Introduction

The Southwest Environmental Information Network (SEINet) is a portal that distributes data resources to the scientific community in the region. It offers tools to locate, access, and work with different kinds of data. About 20 member institutions provide this information to be shared with the world. SEINet currently hosts specimen and observation collections, an image library, plant games for users, 8 flora projects, and other access technologies. It holds a total of 1.1 million specimen records, mainly from the Southwest. SEINet was created at Arizona State University with a grant from the National Science Foundation in 1999.

Characteristics

-Making Checklists: Users can choose a collection, or a specific part of it, and convert a list of specimens into a list of taxa. This could be used for class field trips, by local managers, or for other education and conservation projects.
-Plant images as an aid to identification: SEINet’s image library includes over 17,000 records. Many taxa have images associated with them, so clicking on a taxon in a checklist will bring up any media files linked to it.
-Mapping: Users can search a collection for a given family, genus, or species and produce a Google Map or a Google Earth map of their area.
-Using maps to identify specimens: Users can narrow a specimen down to an area and use the Map feature to identify it.
-Identification keys: SEINet allows users to zoom into a specimen and use interactive keys that appear for each specific area.
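To make the checklist idea concrete, here is a minimal sketch of how a list of specimen records could be reduced to a list of taxa. It is not SEINet's own code: the function name make_checklist, the dictionary keys, and the sample records are all assumptions made for illustration.

```python
def make_checklist(specimens):
    """Reduce specimen records to a family -> sorted unique taxa mapping.

    `specimens` is a list of dicts with (assumed) keys "family" and
    "scientificName"; records without a name are skipped.
    """
    taxa_by_family = {}
    for record in specimens:
        name = record.get("scientificName", "").strip()
        if not name:
            continue  # an unidentified specimen adds nothing to a checklist
        family = record.get("family", "").strip()
        taxa_by_family.setdefault(family, set()).add(name)
    # Sort families and names so the checklist reads alphabetically.
    return {family: sorted(names)
            for family, names in sorted(taxa_by_family.items())}


if __name__ == "__main__":
    # Hypothetical records, invented for this example only.
    sample = [
        {"family": "Pinaceae", "scientificName": "Pinus ponderosa"},
        {"family": "Pinaceae", "scientificName": "Pinus ponderosa"},
        {"family": "Fagaceae", "scientificName": "Quercus gambelii"},
    ]
    for family, names in make_checklist(sample).items():
        print(family, names)
```

Several specimens of the same species collapse into one checklist entry, which is the main difference between a specimen list and a taxon list.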
Contributing to SEINet

New users can contribute to SEINet through Symbiota. Any data set compliant with any version of Darwin Core can be loaded into a data portal. Once the user uploads these records, they are indexed into a relational structure alongside existing records. Symbiota provides the following guidelines for compliance with its field names:
http://symbiota.org/tiki/files/SymbiotaOccurrenceFields.pdf
It also provides a spreadsheet template that can be used to configure a database that we wish to upload:
http://symbiota.org/tiki/files/SymbiotaOccurrenceFields.pdf

There are four ways to contribute specimen data, the last of which is a fallback:
1. Providing a direct read-only connection to our existing database.
2. Flat comma-delimited data file (CSV). A comma-separated values file stores tabular data in plain-text form. It consists of a number of records separated by line breaks; each record consists of fields separated by a delimiter character, commonly a comma. The file would be compressed as a zip file before sending it to Symbiota.
3. Providing a URL to a DiGIR (Distributed Generic Information Retrieval) provider. DiGIR is open source software for retrieving data from distributed databases. Most of these providers, however, do not include image and annotation information.
4. Sending a copy of the database to Symbiota (only if the three previous options are not possible).

Once this information is sent, portal managers can provide a way for collection managers to update their collection regularly via a web interface. Annotation information can be included and is usually supplied as a separate file/source. If there are images online, specimen records can be linked to them. It is also necessary to let Symbiota know what format the data are stored in (Excel, Oracle, etc.), and whether we have a field to tag species that need protection. It is necessary to contact the portal administrator so they can give us access to the data editing modules:
seinetAdmin@asu.edu

Creating a Symbiota Portal

The Symbiota website clearly states that it is preferable to contribute to regional flora/fauna by integrating data into existing collections and datasets, even though it is possible to install the software so that it features only the data from a single collection. Symbiota’s features appear to be more effective when they link several collections and datasets. Finally, setting up a new portal requires some technical management skills in order to maintain and update it.

Requirements
-Apache HTTP Server (2.x or higher)
-PHP (5.x or higher)
-MySQL (5.x or higher)
-SVN Client
-Installation instructions and source code:
http://sourceforge.net/projects/symbiota/
http://symbiota.org/tiki/tiki-index.php?page=Installation+Instructions

Data requirements
-Specimen records: Annotations to the identification of a specimen are needed whenever available.
-Taxonomic data: Needed in order to create a taxonomic thesaurus that is used to resolve discrepancies.
-Morphological character information: Identification keys are generated directly from descriptive data.
-Species lists: Static lists are preferred in order to maintain accuracy and greater control over species composition, area comments, etc.
-Images: The system can host specimen and field images.

Data Loading

The layers of data in Symbiota are: user, taxonomic, occurrence, images, floristic, ID key, and taxon profile. There is web management for many of these layers.

1. User and permissions: A default administrative user is installed with the following login: admin; password: admin.
2. Taxonomic Thesaurus: Required for all Symbiota modules (Occurrence, Floristic, and Key Modules). Taxonomic names are stored within a data structure that defines the taxonomic hierarchy.
-Names can be added one by one to the default thesaurus.
-Batch Loading: Multiple names can be loaded from a flat CSV text file or an ITIS download file.
3. Occurrence (specimen) Records: Several methods are used to upload specimen data. The data interoperability link I provided describes these.
4. Images: Loading images creates the image URLs that are stored in the "images" table.
5. Biotic Survey Data (floras / faunas): Species within a checklist can be linked to specimen vouchers within collections. Checklists can be created within one's user profile and then made public. Projects can be created using the editing features.
6. Identification Key Data: Data need to be loaded directly into the table by hand. As characters are loaded, they will need to be linked to the taxonomic hierarchy at the ranking level at which they are most relevant.
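The flat-file route (method 2 under Contributing to SEINet, and one way of supplying the occurrence records in item 3 above) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column headers are a small subset of Darwin Core term names, not the full list of fields Symbiota expects (that list is in the guidelines PDF), and the function names, file names, and sample record are hypothetical.

```python
import csv
import io
import os
import tempfile
import zipfile

# Illustrative subset of Darwin Core term names used as column headers.
# The fields Symbiota actually expects are listed in its guidelines PDF.
DWC_FIELDS = [
    "catalogNumber", "scientificName", "family", "recordedBy",
    "eventDate", "stateProvince", "decimalLatitude", "decimalLongitude",
]


def records_to_csv(records):
    """Return CSV text: a header row, then one line per specimen record."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS)
    writer.writeheader()
    for record in records:
        # Missing fields are written as empty cells; extra keys are dropped.
        writer.writerow({field: record.get(field, "") for field in DWC_FIELDS})
    return buffer.getvalue()


def zip_csv(csv_text, zip_path, member_name="occurrences.csv"):
    """Compress the CSV text into a zip archive before sending it."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, csv_text.encode("utf-8"))
    return zip_path


if __name__ == "__main__":
    # Hypothetical record, invented for this example only.
    sample = [{
        "catalogNumber": "EXAMPLE-0001",
        "scientificName": "Pinus ponderosa",
        "family": "Pinaceae",
        "recordedBy": "A. Collector",
        "eventDate": "2014-03-15",
        "stateProvince": "Arizona",
    }]
    target = os.path.join(tempfile.mkdtemp(), "occurrences.zip")
    print("Wrote", zip_csv(records_to_csv(sample), target))
```

Using the csv module rather than joining strings by hand matters here because locality notes and collector names often contain commas, which the module quotes automatically.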