Data Storage and Research Methodology GEO4012

GEO4012
Data Storage and Research Methodology
Some aspects of
USE AND STORAGE OF DATA
AT THE DEPARTMENT
Data storage during Master thesis project
• Every Master's student at the institute will
obtain a user-specific directory to store
temporary data, results, documents etc.
• Your home directory (M:\ on Windows or
~<username> on Linux) should not be
used to store data related to the Master project.
• Reasons:
  • More disk space
  • Easier to share data with your supervisor and/or
    other students working in the same project
  • Jurisdiction
Data storage during Master thesis project
• Master-project directory
  • K:\section- or projectdisk\username (Windows)
  • /felles/section- or projectdisk/username (Linux)
[Figure: Linux example and Windows example]
• You will receive an e-mail when the
directory has been created.
Data storage at IG@UIO
• Data and results should be stored in open formats
(e.g. ASCII) or global standards (e.g. SEG-Y, bitmaps)
• Proprietary formats (e.g. Excel) should not be used
to store final results
• Raw data, if not ‘confidential’, should preferably be
stored at K:\data\<data-type>. The IT group can
help to copy data to that site.
• IT-related questions should always be directed to
drift@geo.uio.no with a cc to your supervisor. It is
recommended to talk with your supervisor first.
System overview
[Figure: system overview. Labels in the diagram: DATA HANDLING, HOME DIRECTORIES, vann, jern, ice, ekman, abel, rossby, sverdrup, kant]
Some aspects of
RESEARCH METHODOLOGY
Research Documentation
• Why document your research?
  • To allow other researchers to understand the
    methods you used;
  • To be able to replicate your results;
  • To determine if your findings are reliable;
  • To make it easier for those who come after you;
  • To avoid suspicion of fraud or plagiarism;
  • To receive credit for the research you’ve done on
    a project and eventually write scientific papers.
Research Documentation
• How to document your research?
  • Keep track of all the methods / models used to
    conduct your research
  • Keep information which describes all aspects of your
    data (metadata)
  • Keep a list of all the scientific papers you read/consult
  • Draft your research report as you go (don’t wait until the very end)
Models/Methods
• Simulation is an important tool in
engineering and research.
• But be careful with its use:
  • How well does the simulation model reflect
    reality?
  • You might be drawing conclusions based on
    “artificial worlds” ...
• So:
  • Always keep track of the model version you used
    and all the changes you may have made
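One possible way to follow this advice is to append a line to a plain-text log every time a model file is used. The sketch below is only an illustration: the function name `log_model_version`, the default log file name `model_versions.log` and the tab-separated layout are assumptions, not something prescribed by the course.

```shell
#!/bin/bash
# Hypothetical helper (not part of the course material): append one line
# per model run to an ASCII log.
# Usage: log_model_version <model-file> <version-label> <note> [log-file]
log_model_version() {
    local model_file="$1" version="$2" note="$3"
    local log_file="${4:-model_versions.log}"   # assumed default name

    if [ ! -f "$model_file" ]; then
        echo "log_model_version: no such file: $model_file" >&2
        return 1
    fi

    # cksum (POSIX) reading stdin prints "<checksum> <size>". Keeping the
    # checksum makes an unrecorded change to the model file visible.
    local checksum
    checksum=$(cksum < "$model_file" | cut -d ' ' -f 1)

    # Tab-separated: UTC timestamp, file, version, checksum, free-text note.
    printf '%s\t%s\t%s\t%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "$model_file" "$version" "$checksum" "$note" >> "$log_file"
}
```

A call such as `log_model_version cryo.m v1 "baseline run"` (file name and label chosen only for illustration) would add one line to the log. If the project is under version control, the revision number could be used as the version label.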
Meta-data
• Metadata (metacontent) are defined as the
data providing information about one or
more aspects of the data, such as:
  • Means of creation of the data
  • Purpose of the data
  • Time and date of creation
  • Creator or author of the data
  • Location on a computer network where the data
    were created
  • Standards used
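The aspects listed above could, for example, be written to a small plain-text file stored next to the data. This is a minimal sketch under stated assumptions: the function name `write_metadata`, the `.meta.txt` suffix and the field labels are hypothetical, and the timestamp is only the creation time of the data if the function is called right after the data file is written.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not a departmental standard):
# write a plain-text "sidecar" file describing a data file.
# Usage: write_metadata <data-file> <purpose> <means-of-creation> <standard>
write_metadata() {
    local data_file="$1" purpose="$2" means="$3" standard="$4"
    local meta_file="${data_file}.meta.txt"     # assumed naming convention

    {
        echo "File:     $(basename "$data_file")"
        echo "Means:    $means"
        echo "Purpose:  $purpose"
        # Assumes the sidecar is written right after the data were created.
        echo "Created:  $(date -u +%Y-%m-%dT%H:%M:%SZ)"
        echo "Creator:  ${USER:-$(id -un)}"
        # Machine name and absolute directory of the data file.
        echo "Location: $(uname -n):$(cd "$(dirname "$data_file")" && pwd)"
        echo "Standard: $standard"
    } > "$meta_file"

    # Print the name of the file that was written.
    echo "$meta_file"
}
```

Because the sidecar is plain ASCII, it stays readable without the program that produced the data.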
Metadata - Examples
Bad documentation
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --account=geofag
#SBATCH --time=10:10:00
#SBATCH --mem-per-cpu=2000M
#SBATCH --nodes=1 --ntasks-per-node=8
source /cluster/bin/jobsetup
module load matlab
matlab -nodisplay -nodesktop -nosplash < cryo.m
Good documentation
#!/bin/bash
#
### Script for matlab code CryoGrid2 on Abel
### on 8 tasks
#
### Mandatory parameters to run a job via SLURM
#SBATCH --job-name=PFNNorway
#SBATCH --account=geofag
#SBATCH --time=10:10:00
#SBATCH --mem-per-cpu=2000M
#SBATCH --nodes=1 --ntasks-per-node=8
## Set up job environment on abel
source /cluster/bin/jobsetup
module load matlab
## Start matlab
matlab -nodisplay -nodesktop -nosplash < cryo.m
Metadata - Examples
[Figure: metadata example from Petrel]
Metadata - Examples
Smart data storage by humans
[Figure: what to save. Labels in the diagram: Input data (TB); Output data (TB); Article/Master; Source; Program/code; Libraries; Compilers; Hw/time; Script(s); Changes?; Machine; Meta-data; Internet; Visualization; Data; To save]
Version control
• Why do we need version control?
• What are the basic operations for version
control?
• Example with SVN
Why do we need version control?
• A version control system keeps track of all
work and all changes in a set of files, and
allows several developers (potentially widely
separated in space and time) to collaborate.
• It helps you keep track of a larger programming or text
project, including file locking, versioning and
conflict handling.
Other tools for managing projects
• rcs - UNIX command: rcs creates new RCS files or
changes attributes of existing ones. An RCS file contains
multiple revisions of text, an access list, a change log,
descriptive text, and some control attributes.
• CVS - Concurrent Versions System,
http://en.wikipedia.org/wiki/CVS_(software)
• Git/GitHub - GitHub offers both paid plans for private
repositories, and free accounts for open source projects.
http://en.wikipedia.org/wiki/GitHub
Basic operations for version control
• Checkout
• Update
• Commit
• Tag
• Branch
• Merge
http://en.wikipedia.org/wiki/Revision_control
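To make the six operations concrete, the sketch below runs each of them once against a throwaway local Subversion repository. It is an illustration only: the function name `svn_operations_demo`, the trunk/branches/tags layout and the file `notes.txt` are assumptions, and nothing in it touches svn.uio.no. It assumes that `svn` and `svnadmin` are installed and that the argument is an absolute path to an empty directory.

```shell
#!/bin/bash
# Hypothetical walk-through of the six basic operations in a throwaway
# local repository. Usage: svn_operations_demo <absolute-scratch-dir>
svn_operations_demo() {
    local scratch="$1"
    local url="file://$scratch/repo"
    (
        set -e    # stop at the first failing command (when called plainly)
        svnadmin create "$scratch/repo"
        svn mkdir -q -m "Create layout" "$url/trunk" "$url/branches" "$url/tags"

        # Checkout: get a working copy of trunk.
        svn checkout -q "$url/trunk" "$scratch/wc"
        cd "$scratch/wc"
        echo "first line" > notes.txt
        svn add -q notes.txt

        # Commit: send the local change to the repository.
        svn commit -q -m "Add notes"

        # Tag: a named snapshot. Branch: a separate line of work.
        svn copy -q -m "Tag version 1" "$url/trunk" "$url/tags/v1"
        svn copy -q -m "Start experiment" "$url/trunk" "$url/branches/experiment"

        # Someone changes the file on the branch, in a second working copy.
        svn checkout -q "$url/branches/experiment" "$scratch/wc-branch"
        echo "second line" >> "$scratch/wc-branch/notes.txt"
        svn commit -q -m "Extend notes" "$scratch/wc-branch"

        # Update: bring the trunk working copy up to date.
        svn update -q

        # Merge: pull the branch changes into trunk, then commit the result.
        svn merge -q "$url/branches/experiment"
        svn commit -q -m "Merge experiment into trunk"
    )
}
```

In Subversion, a tag and a branch are both created with `svn copy`; the difference is a convention about whether anyone commits to the copy afterwards.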
SVN - Initial copy of the repository
• Finding the repository
Ask a team member where to find it or check the local
repository!
$ ssh svn.uio.no
$ cd /svnroot
/usit/vcs-uio/svnroot
$ ls
osloctm3
...
• Getting the repository (in your Master-project directory):
$ mkdir svn
$ cd svn
$ mkdir osloctm3
$ svn checkout svn+ssh://svn.uio.no/svnroot/osloctm3
SVN - Normal usage of existing repo
• Going there
$ cd osloctm3
$ svn update [FILE]
U fc/fc-switches.html ..
• Editing the file(s)
$ emacs -nw fc-switches.html
• Checking the updates (optional)
$ svn diff fc-switches.html
• Sending the change upstream
$ svn commit fc-switches.html
Note: Why, or why not, send the changes upstream?
• Yes: you did something for the project
  • Found an error/bug
  • Added new functionality
• No: //Think!//
  • Changes are for your interest only
  • It may break the idea of the project
NB! *When* you find your changes missing in the original, it is way too
late, and you must either
  • drop them, or
  • choose to make a branch/new repo (who will help you?)
The cost on either side may be quite big.
References
• USIT (Norwegian):
http://www.uio.no/tjenester/it/maskin/filer/versjonskontroll/svn.html
• Internet, e.g. http://www.abbeyworkshop.com/howto/misc/svn01/
• Subversion's own project website: http://subversion.tigris.org/
• Wikipedia: http://en.wikipedia.org/wiki/Subversion_%28software%29