Data Organization

Data Collection and Spreadsheets
[Figure: Data Life Cycle. Labels shown: Data Discovery, Proposal Planning and Writing, Project Start Up, Data Collection, Data Analysis, Deposit, Data Archive, Data Sharing, End of Project, Re-Use, Re-Purpose]
Consistent Data Organization
• Spreadsheets (such as those found in
Excel) are sometimes a necessary evil
– They allow “shortcuts” which will result in your
data not being machine-readable
• But there are some simple steps you can
take to ensure that you are creating
spreadsheets that are machine-readable
and will withstand the test of time
Spreadsheets
From NASA Environmental Data Management Best Practices Webinar: Bob Cook
Spreadsheet Best Practices
• Include a header line as the 1st line (or record)
• Label each column with a short but descriptive name
– Names should be unique
– Use letters, numbers, or “_” (underscore)
– Do not include blank spaces or symbols (+ - & ^ *)
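
The header rules above can be checked automatically. The following is a minimal sketch, not part of the original webinar material; the helper name check_header and the sample column names are hypothetical.

```python
import re

# Assumption: a valid name uses only letters, numbers, or underscore.
VALID_NAME = re.compile(r"[A-Za-z0-9_]+")

def check_header(names):
    """Return a list of problems found in a list of column names."""
    problems = []
    seen = set()
    for name in names:
        # Blank spaces and symbols such as + - & ^ * fail this match.
        if not VALID_NAME.fullmatch(name):
            problems.append(f"invalid characters in {name!r}")
        # Names should be unique within the header line.
        if name in seen:
            problems.append(f"duplicate name {name!r}")
        seen.add(name)
    return problems

print(check_header(["site_id", "air temp", "precip", "precip"]))
# ["invalid characters in 'air temp'", "duplicate name 'precip'"]
```

An empty result means the header line follows all three naming rules.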
More Spreadsheet Best Practices
• Columns of data should be consistent
– Use the same naming convention for text data
• Each line should be “complete”
• Columns should include only a single kind of data
– Text or “string” data
– Integer numbers
– Floating point or real numbers
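
One way to test a file against these rules is to classify every cell and report columns that hold more than one kind of value. This is a rough sketch under stated assumptions: the data is available as CSV text, the function names are hypothetical, and an empty cell is treated as "missing" (an incomplete line).

```python
import csv
import io

def classify(value):
    """Classify one cell as 'missing', 'integer', 'float', or 'text'."""
    if value is None or value.strip() == "":
        return "missing"
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "text"

def mixed_columns(csv_text):
    """Return the columns that contain more than one kind of value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kinds = {name: set() for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            kinds[name].add(classify(row[name]))
    return {name: sorted(k) for name, k in kinds.items() if len(k) > 1}

sample = "site,count,temp_c\nA,3,12.5\nB,four,13.1\nC,5,\n"
print(mixed_columns(sample))
# {'count': ['integer', 'text'], 'temp_c': ['float', 'missing']}
```

Note that this sketch also flags a column mixing integers with floating point numbers, which follows the "single kind of data" rule literally.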
Use Naming Standards & Codes
• Use commonly accepted label names that
describe the contents (e.g., precip for
precipitation)
• Use consistent capitalization (e.g., not: temp,
Temp, and TEMP in same file)
• Standard codes
– State Postal (VA, MA)
– FIPS Codes for Counties and County
Equivalent Entities
(http://www.census.gov/geo/reference/codes/cou.html)
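
Inconsistent capitalization is easy to detect by grouping labels that differ only in case. A minimal sketch, with a hypothetical function name and the sample labels from the slide:

```python
from collections import defaultdict

def case_variants(values):
    """Group values that are identical except for capitalization."""
    groups = defaultdict(set)
    for value in values:
        groups[value.lower()].add(value)
    # Keep only the groups where more than one spelling was used.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}

print(case_variants(["temp", "Temp", "TEMP", "precip"]))
# {'temp': ['TEMP', 'Temp', 'temp']}
```

Any group that is reported should be collapsed to a single spelling before the file is shared.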
Use Standardized Formats
• Use standardized formats for units
– International System of Units (SI)
http://physics.nist.gov/Pubs/SP330/sp330.pdf
• ISO 8601 Standard for Date and Time
YYYY-MM-DDThh:mm:ss.sTZD
2009-10-13T09:12:34.9Z
2009-10-13T09:12:34.9+05:00
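
Most programming languages can write and read ISO 8601 timestamps directly, which avoids hand-typed formatting mistakes. The sketch below uses Python's standard datetime module and the extended form of ISO 8601 (hyphens in the date, colons in the time); the helper name to_iso is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_iso(moment):
    """Format a timezone-aware datetime in extended ISO 8601 form."""
    return moment.isoformat(timespec="milliseconds")

# 13 October 2009, 09:12:34.9, once in UTC and once at a +05:00 offset.
utc_time = datetime(2009, 10, 13, 9, 12, 34, 900000, tzinfo=timezone.utc)
offset_time = datetime(2009, 10, 13, 9, 12, 34, 900000,
                       tzinfo=timezone(timedelta(hours=5)))

print(to_iso(utc_time))     # 2009-10-13T09:12:34.900+00:00
print(to_iso(offset_time))  # 2009-10-13T09:12:34.900+05:00

# Reading a timestamp back gives the same moment and offset.
parsed = datetime.fromisoformat("2009-10-13T09:12:34.900+05:00")
print(parsed == offset_time)  # True
```

Python writes UTC as +00:00, which ISO 8601 treats as equivalent to the Z suffix.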
• Spatial coordinates for latitude/longitude
+/- DD.DDDDD
-78.476 (longitude)
+38.029 (latitude)
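
Signed decimal degrees can be produced and range-checked with a few lines of code. A minimal sketch, assuming five decimal places as in the DD.DDDDD pattern; the function name format_coordinate is hypothetical.

```python
def format_coordinate(value, kind):
    """Format decimal degrees with an explicit sign and five decimals."""
    limits = {"latitude": 90.0, "longitude": 180.0}
    if kind not in limits:
        raise ValueError(f"unknown coordinate kind: {kind!r}")
    # Latitude must lie in [-90, 90] and longitude in [-180, 180].
    if not -limits[kind] <= value <= limits[kind]:
        raise ValueError(f"{kind} out of range: {value}")
    return f"{value:+.5f}"

print(format_coordinate(-78.476, "longitude"))  # -78.47600
print(format_coordinate(38.029, "latitude"))    # +38.02900
```

The range check catches a common spreadsheet error, which is swapping the latitude and longitude columns.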