Data goodness
Mostly in black and white
By Dom

You must love your data!
• Current imaging data in BRIC cost ~£5.1M, just for scanning costs! (2011)
• Lost data: no research, no publications, no jobs, no PhDs!
  – Sad Dom
• Look after your data!
  – It looks after you
  – Happy Dom

Data Storage
• Home directories:
  – ISIS home, U Home
    » Not for large amounts of imaging data
• Projects directory
  – ISIS, V: Big stuff goes here
• If you require large amounts of space
  – E.g. > 50 GB
  – LET ME KNOW IN ADVANCE!

Server goodness
• Why is the server a good place to store data?
  – Mirror and parity - some errors - data can be easily recovered
  – BACKUPS:
    • Tape backups, daily - 1 month retention
    • If you have funding, processed data can be mirrored off site
    • Raw data is always mirrored off site (ECDF) by default
• Desktop PCs
  – Not reliable - no mirroring, no parity - some errors - data is lost (often all of it)
  – Network backups often fail
    » Machines turned off, network busy
    » Moving to a new system when I get time!

Data love
• Curation: Do this as you work!
• Plan your data use
  – Use meaningful folder names
  – Make 'README.txt' files with dates, names of students/employees involved, references to software, scripts and versions, and the purpose of the experiment/processing
  – Be tidy with your data - tidy up occasionally
    » Friday afternoon - quick tidy up
    » Big tidy up at end of experiment/project/phase/year
• BE CAREFUL, don’t rush
• Data, spreadsheets, databases
  – Anonymisation
  – *** Repatriation keys ***

Code and Scripts
• Coding:
  – Do not use hard-coded paths
• Testing
  – Make sure that the software you are using does exactly what you think it does!
    » Check every step for every image!
• Use versioning software (ECDF)

Safe data is Happy data!
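
The 'README.txt' and "no hard-coded paths" advice above could be combined in a small helper like the one below. This is only a minimal sketch, not the group's actual tooling: the function names, the README field layout and the PROJECT_DATA_DIR environment variable are all assumptions made for illustration.

```python
import os
from datetime import date
from pathlib import Path


def resolve_data_dir(explicit=None, env_var="PROJECT_DATA_DIR"):
    """Find the data folder without hard coding it in the script.

    The folder comes from the caller or from an environment variable.
    PROJECT_DATA_DIR is a hypothetical name chosen for this sketch.
    """
    value = explicit or os.environ.get(env_var)
    if not value:
        raise ValueError(f"Pass a data folder or set {env_var}")
    return Path(value).expanduser().resolve()


def write_readme(data_dir, people, purpose, software, today=None):
    """Write README.txt into data_dir and return its path.

    software maps a program or script name to its version string.
    """
    today = today or date.today()
    lines = [
        f"Date: {today.isoformat()}",
        f"People involved: {', '.join(people)}",
        f"Purpose: {purpose}",
        "Software, scripts and versions:",
    ]
    lines += [f"  - {name} {version}" for name, version in software.items()]
    readme = Path(data_dir) / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

A script would call resolve_data_dir() once at the top and build every other path from the result, so the same code runs unchanged from a home directory, the projects directory or a colleague's machine. Calling write_readme() whenever a processing step finishes keeps the curation notes next to the data they describe.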