Data Organization Quality Assurance and Transformations

[Figure: Data Life Cycle diagram. Labels: Data Discovery, Proposal Planning Writing, Project Start Up, Re-Use, Data Collection, Data Analysis, Deposit Data Archive, Data Sharing, Re-Purpose, End of Project]

Data Validation
• Check for missing, impossible, anomalous values
  – Plotting
  – Mapping
• Examine summary statistics
• Verify data transfers from notebooks to digital files
• Verify data conversion from one file format to another
Hook et al. 2010. Best Practices for Preparing Environmental Data Sets to Share and Archive. Available online: http://daac.ornl.gov/PI/BestPractices-2010.pdf.

Preserve & Record Information
• Processing Script (R)
• Keep Original (Raw) File
  – Do not include transformations, interpolations, etc.
  – Make the raw data “read-only”
• Save as a new file

Data Manipulation
• You will need to repeat reduction and analysis procedures many times
  – You need to have a workflow that recognizes this
  – Scripted languages can help capture the workflow
  – You could just document all steps by hand
  – After the 20th iteration through your data set, however, you may feel more fondly towards scripted languages
• Learn the analytical tools of your field
  – Talk to colleagues, etc.
    and choose at least one tool to master

Preserve Processing Information
• Scripts used in file cleaning
• Programs / algorithms
• Document workflows or data file transformations
[Figure: processing workflow diagram. Labels: Temperature data (T), Data import into R, Salinity data (S), “Clean” T & S data, Quality control & data cleaning, Analysis, Graph Production, Data in R format, Summary statistics]

Preserving: Scripted Notes
• Use a scripted language to process data
  – R Statistical package (free, powerful)
  – SAS
  – MATLAB
• Processing scripts record the processing
  – Steps are recorded in textual format
  – Can be easily revised and re-executed
  – Easy to document
• GUI-based analysis may be easier, but it is harder to reproduce

Reproducibility Methods
• Do use version control
• Do document software environment
• Only save what cannot be reconstructed from original data + code
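
The practices above (keep the raw file read-only, script the cleaning, check for missing and impossible values, examine summary statistics, save the result as a new file, record the software environment) can be sketched in a few lines of script. This is a minimal illustration only, not the workflow from the slides: the slides name R, SAS and MATLAB, and Python is used here purely as an example of a scripted language. The column names `temperature` and `salinity`, the plausibility limits in `VALID_RANGE`, and the function names are all assumptions made for the example.

```python
import csv
import math
import os
import platform
import stat
import statistics

# Hypothetical plausibility limits; real limits depend on the instrument and site.
VALID_RANGE = {"temperature": (-2.0, 40.0), "salinity": (0.0, 42.0)}


def make_read_only(path):
    """Clear the write permission bits so the raw file is not edited by accident."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def parse_value(text):
    """Return a float, or None when the cell is empty, non-numeric, or NaN."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def clean_file(raw_path, clean_path):
    """Read the raw CSV, count problem rows, and write the clean rows to a NEW file."""
    report = {"rows": 0, "missing": 0, "impossible": 0}
    kept = []
    with open(raw_path, newline="") as src:
        for row in csv.DictReader(src):
            report["rows"] += 1
            values = {name: parse_value(row.get(name)) for name in VALID_RANGE}
            if any(v is None for v in values.values()):
                report["missing"] += 1  # missing or unreadable value
                continue
            if any(not (lo <= values[n] <= hi) for n, (lo, hi) in VALID_RANGE.items()):
                report["impossible"] += 1  # outside the assumed plausible range
                continue
            kept.append(values)

    # The raw file is never modified; cleaned rows go to a separate file.
    with open(clean_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=list(VALID_RANGE))
        writer.writeheader()
        writer.writerows(kept)

    report["kept"] = len(kept)
    # Summary statistics of the cleaned data, for a quick sanity check.
    report["mean"] = (
        {n: statistics.mean(r[n] for r in kept) for n in VALID_RANGE} if kept else {}
    )
    # Record the software environment alongside the results.
    report["python_version"] = platform.python_version()
    return report
```

A typical call would be `make_read_only("raw.csv")` followed by `clean_file("raw.csv", "clean.csv")`, where both file names are placeholders. Because every step lives in the script, the cleaned file can be regenerated from the raw data at any time, so only the raw file and the script need to be kept under version control.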