nwodatamanagplbiodiv2

advertisement
“Onderzoeksprogramma Biodiversiteit werkt”
Introduction Data Management Plan
The Dutch government recognises the importance of open access to research data from public
funding and is committed to the terms and conditions set by the OECD on this matter1&2.
NWO is subject to the international commitments of the Dutch government but also
recognises, as an independent science funding organisation, the importance of open access to
research data for science and society. The Internet, and established and new digital
information infrastructures, create an environment in which the value of generated research
data can be significantly enhanced by allowing multiple users to find, see and use research
data. NWO therefore signed the “Berlin Declaration”3 and actively supports access to
research data by means of the Internet and digital information infrastructures that are relevant
for the activities and programmes of NWO.
With regards to specifically biodiversity data; the Netherlands have signed the Memorandum
of Understanding of the Global Biodiversity Information Facility (GBIF)4 and are committed
to the “free and open access to biodiversity data” objectives of this international network.
In order to evaluate the output of research proposals in terms of the (electronic) accessibility
of data, applicants are requested to include a Data Management Plan with their research
proposal. The Data Management Plan is expected to describe the data that will be authored as
well as how the data will be managed and made accessible throughout its lifetime. The data
management plan should cover several general elements:
- the types of data to be authored;
- the standards that would be applied for primary data, metadata, etc.;
- provisions for archiving and preservation;
- access policies and provisions;
- plans for eventual transition or termination of the data collection in the long-term future.
This set of elements provides a basic view of what NWO is likely to expect in a data
management plan. The specific sections, remarks and questions below act as a more detailed
guideline. Information that is not mentioned in this document but is considered to be relevant
for the Data Management Plan should obviously be mentioned. Please bear in mind that the
Data Management Plan acts as an information source for NWO, not as an internal action plan
to operationalise the stages of data management in your research.
The guidelines provided in this document are partly based on documentation generated by the
UK’s Digital Curation Centre5.
OECD Declaration on Access to research Data from Public Funding, 30 March 2004 – C(2004)31/REV1
OECD Recommendation of the Council concerning Access to Research Data form Public Funding, 14
December 2006 – C(2006)184
3
Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities:
http://oa.mpg.de/openaccess-berlin/berlin_declaration.pdf
4
GBIF Memorandum of Understanding (2007–2011): http://www2.gbif.org/mou07-11.pdf
5
The Digital Curation Centre, UK: http://www.dcc.ac.uk/
1
2
-1-
Requirements Data Management Plan
1) Introduction and context
1.1
1.2
1.3
Basic research information (title, summary, duration, partners etc.).
What are the aims and purpose of the research?
List of related data policies.
E.g. is the research co-funded by another funding body with a specific data policy?
Is the research subject to your institute’s data policy?
Is the research subject to certain data restrictions because you work in protected area’s
and / or with protected species?
2) Legal, rights and ethical issues
2.1
2.2
2.3
2.4
Who owns the Intellectual Property and Copyright? Is this set in your employers
contract?
How will the data be licensed? E.g. Creative Commons, attribution?
What are the ethical and privacy issues? How will these be resolved? E.g. names of data
collectors, anonymisation of data, consent agreements.
What is the dispute resolution process and / or mechanism for mediation?
3) Access, data sharing and reuse
3.1
3.2
3.3
How will data be made available? What is the process for gaining access? Are any
permissions & restrictions placed on data?
- Mention the use of data networks such as GBIF for species occurrence data, GenBank
for genetic sequences, CBOL for DNA bar coding data, ILTER for ecological data, etc.
- Specify the resolution of data access, e.g. metadata only, data with blurred
geographical info, data in full detail, etc.
- Mention the technical accessibility and related process for access. Examples: listed on
website and available upon request, downloadable from website, available through
webservices, etc.
Under what conditions are the data available? Mention any kind of legal procedures or
agreements that are in place.
How can / will the data be shared and re-use? Which bodies / groups are likely to be
interested? What are the foreseeable uses?
- Related to 3.1; mention through which networks and for which community the data
will be specifically available.
- Are there specific disciplines and/or research communities that will benefit from the
publication of your (raw) data?
- Are there specific none-scientific communities that will benefit from your published
data? E.g. nature conservation, eco-tourism, gardeners, etc.
Are there embargo periods? How is the release timeframe justified? Include note on
right-of-first-use for original data collector / creator / investigator. E.g. data available
after two year, data available after all primary publications, data release subject to
institutional policy, etc.
4) Data collection / development methods
4.1
What does 'data' comprise for the research? Data description incl. volume, type, content
to be created etc.
- Examples: natural species occurrence data for indigenous trees, genetic data for mouse
-2-
4.2
4.3
4.4
4.5
4.6
4.7
4.8
populations, plant community data in Dutch wetlands, capture re-capture data for fish
populations in lake IJsselmeer, radio-tracking data for wild geese, historic natural
history collection data, etc.
Do mention the less obvious data that come with the biological data such as images,
shape files, etc.
- Try to quantify volumes of data, for example: data collection concerns 5 years of
monthly population sampling at 25 sites with an estimated number of xx recordings
throughout the research period
- Taxonomic, geographical and temporal range of the data. (These are important
features if you only release the metadata for your research.)
For example: Family Mustelidae, The Netherlands and Belgium, 2010-2015
Have you surveyed existing data? (incl. 3rd party data) What can be used / extended?
Are there any access issues? What is the ‘added value’ to reuse? Why does new data
need to be created? What is the relationship between new dataset(s) and existing data?
How will you manage interoperability i.e. what methods will be used to integrate the
data being gathered in the project with pre- existing data sources?
How will you capture / create the data? incl. content selection, instrumentation,
technologies and approaches chosen, methods for naming, versioning etc.
How will metadata and documentation be captured? What form will it take? What
standards will be used? What contextual details are needed to make data meaningful?
Mention the taxonomic hierarchy you have used for taxonomic ordering.
Why have you chosen particular standards and approaches? E.g. recourse to staff
expertise, Open Source, accepted domain-local standards, widespread usage, institute’s
policy, etc.
What criteria will be used for Quality Assurance / Management? E.g. ISO standards,
documentation, calibration, taxonomic validation, general data validation, monitoring,
transcription metadata, etc.
Are you using Digital Object Identifiers (DOI’s), Life Science Identifiers (LSID’s), or
any other kind of persistent identifiers for your data or datasets?
5) Data standards
5.1
5.2
5.3
5.4
5.5
Describe data types e.g. experimental measures, qualitative, raw, processed, etc.
List all data standards you are specifically using, e.g. Ecological Metadata Language
(EML) for ecological data, DarwinCore for species occurrence data, ISO 19136:2007
for geographic information, etc.
Which file formats and platforms will be used and why? E.g. recourse to staff expertise,
Open Source, accepted standards, widespread usage, etc.
How do data creation decisions take account of end user needs? Have you thought about
the data interoperability on forehand?
Does the research takes account of specific data directives, such as the EU’s Inspire
Directive6 for spatial information?
6) Short-term storage and data management
6.1
6.2
6
Anticipated data volumes? (Mega – Gigabytes?)
Where will the data be stored? On what media? Who will be responsible? How will it
be transmitted? (encryption if appropriate)
- Mention the use of specific data storage packages, e.g. TurboVeg for plant community
data.
http://inspire.jrc.ec.europa.eu/
-3-
6.3
6.4
6.5
- Mention more general platform information, e.g. customised access database, oracle
database, etc.
How will access arrangements and data security be managed? How permissions,
restrictions and embargoes are enforced? Note on sensitive data, storage on off- network
mobile devices etc.
How regularly, by whom, and how will data be backed up?
Appraisal and retention timeframes, ideally with definite figures. (N.B. this may simply
point to relevant institutional or funding body requirements / policies: political,
temporal, commercial, legal.)
7) Deposit and long-term preservation
7.1
7.2
7.3
7.4
7.5
7.6
7.7
What is the long-term strategy for maintaining, curating and archiving the data?
On what basis will data be selected for preservation? How long will data be kept?
(ideally with definite figures) (N.B. this may simply point to relevant institutional or
funding body requirements / policies: political, temporal, commercial, legal.) How will
you dispose / transfer sensitive data? Justification of decisions.
How will data be prepared for preservation / data sharing? (incl. anonymisation if
appropriate)
Where and how data will be archived e.g. deposit in public repository or existing
community database? Transmission of data (encryption if appropriate)
What related information will be deposited? - references, reports, research papers, fonts,
original bid proposal, etc
What metadata / documentation will be created at each stage of ingest / transformation?
– descriptive, structural, administrative, preservation etc. How will this be created and
by whom?
What procedures are in place for preservation and backup? How regular, by whom,
methods used? E.g. format normalisation, migration, etc.
8) Resourcing
8.1
8.2
Staff / organisational roles and responsibilities for implementing this data management
plan, incl. time allocations, project management of technical aspects, contributions of
non-project staff etc.
Financial issues (e.g. payments to service providers within institutions, payments to
external data centres for hosting data, income derived from licensing data, etc.)
9) Compliance and review
9.1
9.2
How adherence to this data management plan will be checked or demonstrated?
How and when this data management plan will be reviewed?
10) Agreement / ratification by stakeholders (if appropriate)
10.1 Statement of agreement (with signatures if required)
Cees H.J. Hof, Coordinator Netherlands Biodiversity Information Facility (NLBIF)
Amsterdam, 29 August 2011
-4-
Download