Data management for NEES
Stanislav (Standa) Pejša,
NEEScomm Data Curator
spejsa@purdue.edu
Content
• Benefits of data management
• File formats
• File names
• File description
• Storage
• Reports/Publications
• Copyright
• Curation
• Things to remember
Benefits of data management
• Better chances to find what you are looking for
  • Predictable location
  • Meaningful structure
• More efficient work with data
• Easier sharing of your data and transfer of knowledge
• Safe location of files
• Less stress when finishing a project
File formats
• Use common, mainstream formats
• Use formats consistently
• Each format requires a different preservation approach
• If possible, avoid bundling and embedding different formats, which can result in loss of functionality or data
Recommended formats:
• Sensor measurements: tab-delimited ASCII or CSV
• Reports, publications, and other documentation: PDF
• Images: PNG, JPG, and GIF; avoid BMP
• Frame captures: ZIP, TAR, TAR.GZ
• Video: currently there are no restrictions; avoid formats that require a specific codec, e.g. ASF
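The format recommendations above can be turned into a quick pre-upload check. This is a minimal sketch, not a NEEShub tool: the function name `classify_format` and the category labels are hypothetical. The `.txt` and `.tsv` extensions for tab-delimited ASCII are an assumption, since the slides do not name an extension.

```python
# Suffixes taken from the recommended formats; ".txt" and ".tsv" are an
# assumption for tab-delimited ASCII, which the slides do not specify.
RECOMMENDED_SUFFIXES = {
    "sensor measurements": (".csv", ".txt", ".tsv"),
    "documentation": (".pdf",),
    "images": (".png", ".jpg", ".gif"),
    "frame captures": (".zip", ".tar", ".tar.gz"),
}

# Formats the slides say to avoid.
DISCOURAGED_SUFFIXES = (".bmp", ".asf")


def classify_format(filename: str) -> str:
    """Return the matching category, "discouraged", or "unlisted"."""
    name = filename.lower()
    if name.endswith(DISCOURAGED_SUFFIXES):
        return "discouraged"
    for category, suffixes in RECOMMENDED_SUFFIXES.items():
        if name.endswith(suffixes):
            return category
    return "unlisted"
```

A file reported as "unlisted" is not necessarily wrong (video, for example, has no restrictions); it only means the slides give no recommendation for it.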
File names
• A file naming convention is a good idea
  • Be consistent in using lower case and upper case
  • Use file extensions consistently: do not mix JPG and jpg (lower case is preferred)
• Make filenames meaningful
• Avoid forbidden characters: |;,!@%#$()<>/\"'`~{}[]=+&^*?
• Do not start filenames with a period "."
• Avoid whitespace; use underscore (_) or hyphen (-) instead
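The naming rules above are mechanical enough to check automatically before upload. The sketch below is one possible way to do that; the function name `filename_problems` and the wording of the messages are hypothetical, and it covers only the rules listed on this slide. It cannot judge whether a name is meaningful.

```python
from typing import List

# The forbidden characters listed on the slide.
FORBIDDEN = set("|;,!@%#$()<>/\\\"'`~{}[]=+&^*?")


def filename_problems(name: str) -> List[str]:
    """Return a list of rule violations for a single file name."""
    problems = []
    if name.startswith("."):
        problems.append("starts with a period")
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace")
    bad = sorted({ch for ch in name if ch in FORBIDDEN})
    if bad:
        problems.append("forbidden characters: " + "".join(bad))
    # Check the extension only when there is something before the last dot.
    stem, dot, ext = name.rpartition(".")
    if dot and stem and ext != ext.lower():
        problems.append("extension is not lower case")
    return problems
```

For example, `filename_problems("beam_test-01.csv")` returns an empty list, while a name such as `my file.JPG` is flagged for both whitespace and an upper-case extension.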
File description
• Descriptions on directories and/or files
  • Make retrieval and identification of files easier
  • Help researchers understand the purpose of the file or directory
• A description should include:
  • Notes about data processing
  • Software used for creation or processing of data
  • Information necessary for rendering the files
  • Notes about the context of the file
Storage
• NEEShub will keep your data SAFE
  • Your laptop or desktop is not enough
  • Save as you work on experiments
  • Do not wait to be told to upload your data
• and your experiments ORGANIZED
  • Data (./Experiment-n/Trial-n/Rep-n/Type_of_data)
  • Sensor metadata (./Experiment-n/Documentation/Sensors)
  • Material properties (./Experiment-n/Documentation/MaterialNNNN)
  • Technical drawings: specimen, instrumentation plans (./Experiment-n/Documentation/Drawings)
  • Analytical files (./Experiment-n/Analysis)
  • Presentations, reports, images (./Documentation)
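Mirroring this layout on a local disk from the start of a project can make later uploads easier. The following is a minimal sketch, assuming the `n` in `Experiment-n`, `Trial-n`, and `Rep-n` is a running number starting at 1. The function name `create_project_skeleton` is hypothetical. The `Type_of_data` and `MaterialNNNN` folders are left out because their actual names depend on the project.

```python
from pathlib import Path
from typing import List


def create_project_skeleton(root, n_experiments=1, n_trials=1, n_reps=1) -> List[Path]:
    """Create the directory layout from the slide under `root`."""
    root = Path(root)
    # Project-level documentation (presentations, reports, images).
    dirs = [root / "Documentation"]
    for e in range(1, n_experiments + 1):
        exp = root / f"Experiment-{e}"
        dirs.append(exp / "Documentation" / "Sensors")
        dirs.append(exp / "Documentation" / "Drawings")
        dirs.append(exp / "Analysis")
        for t in range(1, n_trials + 1):
            for r in range(1, n_reps + 1):
                dirs.append(exp / f"Trial-{t}" / f"Rep-{r}")
    for d in dirs:
        # exist_ok lets the function be re-run safely on an existing tree.
        d.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling `create_project_skeleton("my_project", n_experiments=2, n_trials=3)` would create two experiment folders with three trials each, plus the shared documentation folders.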
Reports
• Reports are a requirement: an essential tool for understanding research and its context
  • Final report (project level)
  • Executive summary (project level)
  • Experimental setup report (experiment level)
• NEEShub accepts
  • MSci/PhD theses
  • Pre-prints/post-prints
    • Pre-prints: drafts before they are peer-reviewed
    • Post-prints: drafts that incorporate peer reviewers' comments
    • Researchers typically DON’T own copyright to their articles
  • Reports to other grant agencies
Resources in the Project warehouse…
• Project related resources*
  • Articles
  • Conference papers
  • Theses
  * currently a limited set of document types
• Retrievable within the NEEShub
• Discoverable through Google
Copyright @ NEEShub
• Let others know that they can use your data
• Open Data
  • data
• Creative Commons
  • presentations, reports, pre-prints/post-prints, teaching materials
• Open Source
  • software
More on intellectual property considerations: https://nees.org/legal/licensing
Curation
• IS a service that helps researchers archive their data in a meaningful way
• IS about planning and organizing data, metadata, and documentation
• IS concerned with current and future use of data
• IS an iterative and interactive process between research teams and the curator
• IS a continuum of actions from creation through publication and preservation of data
Curation and metadata
Metadata need to be:
• consistent
• accurate
• standardized
Example: the relationship among
• Sensor metadata
• Data file headers
• Instrumentation
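One way to check the consistency described above is to compare the column names in a data file header against the sensor names recorded in the sensor metadata. This is a minimal sketch under stated assumptions: the header is a single tab-delimited line, and the sensor names (such as `LVDT1`) are illustrative. The function name `compare_headers` is hypothetical and not part of NEEShub.

```python
def compare_headers(header_line, sensor_names, delimiter="\t"):
    """Compare data file columns with documented sensor names.

    Returns (undocumented, unused): columns missing from the metadata,
    and documented sensors that do not appear in the header.
    """
    columns = [c.strip() for c in header_line.split(delimiter) if c.strip()]
    known = set(sensor_names)
    undocumented = [c for c in columns if c not in known]
    unused = sorted(known - set(columns))
    return undocumented, unused
```

If both returned lists are empty, the header and the sensor metadata use the same names. Any entry in either list points to a spelling difference or a missing record worth resolving before requesting curation.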
Curation and metadata
On the experiment level
• CURATION self-check in ‘EDIT’ mode
• you can repeatedly check your progress and compliance with the data model
• the self-check indicates whether files were uploaded to the correct location
• use the provided box to communicate with the curator
• once done, send a curation request to the curator
Things to remember
• Save files as you work on them
• Plan ahead
• Do not wait to be told to upload your data
• Be consistent
If you need help with upload or organization of data
• Search for curation
  • Many documents are tagged ‘curation’ or ‘data curation’
• Or email the NEEScomm Data Curator
  • spejsa@purdue.edu
Thank you!
And if you have any questions, email me at:
spejsa@purdue.edu