CONCEPTUALISE Mark Thorley Natural Environment Research Council

a centre of expertise in data curation and preservation
Funded by:
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK:
Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San
Francisco, California, 94105, USA.
Digital Curation 101, October 6th-10th, 2008, NeSC, Edinburgh
Overview
• Background – what & why.
• Policy drivers – funder rules.
• Roles – whose job is it anyway?
• Practical steps – thinking about data issues.
Background: what do we mean by
data?
• Data as a by-product of research.
• Data as a part of the scientific record – must
be maintained to allow reproduction and
validation.
• Data as a ‘published’ output in its own right.
Background: why is data
management important?
• Good research practice: to do good science
requires good data management.
• Reproducibility: maintenance of the
scientific record.
• Long-term value for re-use and re-purposing.
• Good research = good digital curation.
Background: drivers for sharing
• Scientific need: especially for large-scale or
long-term studies.
• Increased value: where part of a larger
collection (e.g. oceans or atmosphere).
• Value for money: data collection can be very
expensive.
• Publicly funded: public right of access.
Key learning 1
• Do not do data management / curation for its
own sake. Do it to ensure:
• Good research outcomes;
• Data of long-term value are available for re-use
and re-purposing;
• The scientific record is protected.
Policy drivers
• Professional standards: GLP etc.
• ESF: Good scientific practice in research and
scholarship (2000).
• RCUK: Governance of good research
conduct (2008).
• Research Council data policies.
ESF: Data accumulation, handling
and storage
36. Data are produced at all stages in experimental research
and in scholarship. Data sets are an important resource, which
enable later verification of scientific interpretation and
conclusions. They may also be the starting point for further
studies. It is vital, therefore, that all primary and secondary data
are stored in a secure and accessible form.
37. Institutions must pay particular attention to documenting and
archiving original research and scholarship data. Several codes
of good practice recommend a minimum period of 10 years,
longer in the case of especially significant or sensitive data.
National or regional discipline-based archives should be
considered where there are practical or other problems in
storing data at the institution where the research was
conducted.
RCUK code of conduct
• Management and preservation of data and
primary materials:
• … ensure that relevant primary data and research evidence are preserved
and accessible to others for reasonable periods after the completion of the
research. This is a shared responsibility between researcher and the
research organisation, but individual researchers should always ensure that
primary material is available to be checked. Such conditions should also be
applied where ownership of data may rest with third parties, for example
where there is commercial sponsorship of research. Data should normally
be preserved and accessible for not less than 10 years for any projects, and
for projects of clinical or major social, environmental or heritage importance,
the data should be retained for up to 20 years, and preferably permanently
within a national collection, or as required by the funder’s data policy.
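The retention periods quoted above reduce to a simple date calculation. The following is a minimal Python sketch and not part of the RCUK code: the names `add_years` and `minimum_retention_end`, the `major_importance` flag, and the reading of "up to 20 years" as a 20-year planning horizon are all assumptions made for illustration. A funder's own data policy may set different periods.

```python
from datetime import date

# Figures taken from the RCUK text quoted above.
BASELINE_YEARS = 10          # "not less than 10 years for any projects"
MAJOR_IMPORTANCE_YEARS = 20  # "up to 20 years" (assumed here as a planning horizon)


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is 29 February and the target year is not a leap year
        return start.replace(year=start.year + years, day=28)


def minimum_retention_end(project_end: date, major_importance: bool = False) -> date:
    """Hypothetical helper: date until which data should stay preserved and accessible."""
    years = MAJOR_IMPORTANCE_YEARS if major_importance else BASELINE_YEARS
    return add_years(project_end, years)


# Illustrative project end dates only.
print(minimum_retention_end(date(2008, 10, 10)))
print(minimum_retention_end(date(2008, 10, 10), major_importance=True))
```

The code's preference for permanent retention in a national collection is not modelled here; the sketch only gives the earliest date at which disposal could even be considered.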
Generic policy principles
• Research Councils recognise data as a
valuable long-term, public-good resource.
• Data sharing improves opportunities for
exploitation.
• Investigator teams have a right of first use
and a right to be acknowledged.
• Effective exploitation requires effective data
management.
Research Council data policies: ESRC
• Formal data policy – currently being updated.
• Joint JISC & ESRC supported UK Data Archive,
including the Economic and Social Data Service.
• Applicants must carry out a data review to ensure
funds are not requested for data that are already
available. Data must be offered to the archive within
3 months of end of award.
• Partner in National Data Strategy for Social Science
Research.
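The three-month rule above can be turned into a concrete deadline. This is a hedged Python sketch: the helper names `add_months` and `offer_deadline` are hypothetical, and counting "3 months" as calendar months with the day clamped to the end of a shorter month is an assumption, not something the slide specifies.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 30 November plus 3 months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def offer_deadline(award_end: date, months_allowed: int = 3) -> date:
    """Hypothetical helper: latest date to offer data to the archive."""
    return add_months(award_end, months_allowed)


# Illustrative award end date only.
print(offer_deadline(date(2008, 9, 30)))
```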
Research Council data policies: NERC
• Data policy handbook and guidance. New
version under development.
• All data must be offered to a NERC data
centre to enable long-term management and
re-use.
• Recognition of rights of investigator teams.
• NERC supports 6 data centres for long-term
management of environmental data.
Research Council data policies: BBSRC
• Data sharing policy and implementation guidelines.
Endorsed by Council 2006, apply from April 2007.
• Applicants must produce a data sharing plan. Data
sharing encouraged in all research areas where there
is a strong scientific need and it is cost effective to do
so.
• Funds can be requested to support data
management and sharing activities.
Research Council data policies: MRC
• Data sharing and preservation policy – applies to new
grants awarded from January 2006.
• Applicants must produce a plan for data sharing and
preservation and include costings in grant
applications.
• Implementing data management facilities at MRC
owned centres (as part of corporate responsibility for
data).
Research Council data policies: AHRC
• De facto policy - detailed in funding guidance.
• Any significant electronic resources or datasets
created as a result of research funded by the AHRC
must be made available in an accessible depository
for at least three years after the end of the grant.
• Can request resources to support management and
sharing.
• Archaeology – special case. Must use the AHRC
supported Archaeology Data Service.
Research Council data policies: EPSRC
• No formal policy as yet; however, policy development
is under strong consideration.
• Encourages PIs to manage primary data, as
the basis for publications, securely, in a durable
form and for an appropriate time, under the
control of the institution where the data originated.
Research Council data policies: STFC
• Policies under development following merger
of PPARC and CCLRC.
• Facilities (i.e. CCLRC) – well developed
policies and facilities on a per-project basis.
• Grant holders (i.e. PPARC) – data curation
policy agreed in principle.
Roles: whose job is it?
Data sharing / curation has placed additional requirements
and expectations on research teams.
[Diagram: Research, Curation, Re-use]
Researchers have to ‘do stuff’
to their data to enable re-use.
‘Stuff’ is not always getting done!
‘Doing stuff’ to data takes time and skills which
research teams do not always have.
[Diagram: Research, Curation, Re-use]
Asking researchers to ‘do stuff’
with data that falls outside of
their area of expertise/interest.
Research teams need access to data
management skills and incentives.
[Diagram: Research, Curation, Re-use]
Informaticians bridge the gap.
Who are the key players?
[Diagram: Research, Curation, Re-use; key players: Researchers, Informaticians, Data managers]
Key learning 2
• Data management is too important to be left
to data managers – must involve researchers!
• Researchers and data managers must work
together to identify data management
activities appropriate to the research.
• Roles will change over time:
• Within project – responsibility of research team;
• Post project – responsibility of ‘data centre’.
Practical steps: the grant application
• Depends on the research area and the
requirements of the funder, but avoid
nugatory effort.
• Research funder wants to know:
• That relevant policies are being met;
• That the PI has given serious consideration to what is
needed with respect to data and has demonstrated that
they will be able to deliver;
• That the necessary resources are included in the proposal.
Grant application: example
• What data are planned for collection and which of
these data are perceived as having long-term value?
• What, if any, existing data will be required? Who will
supply these data and will there be a cost?
• Have the necessary legal & ethical issues been
considered (e.g. consent and confidentiality)?
• What specialist data and informatics skills will be
required by the programme and where will these be
obtained from?
• Where will data be held and how will access be
provided for long-term re-use and re-purposing?
• ££££££££
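The example questions above can be kept as a simple checklist, so that gaps are visible before a proposal is submitted. The following is a minimal Python sketch under stated assumptions: the `DataPlan` class, its field names and the `unanswered` method are hypothetical and do not represent any funder's form, and the sample answers are illustrative only.

```python
from dataclasses import dataclass, fields


@dataclass
class DataPlan:
    """Hypothetical record of answers to the example questions above."""
    data_to_collect: str = ""
    long_term_value: str = ""
    existing_data_needed: str = ""
    legal_and_ethical_issues: str = ""
    skills_required: str = ""
    storage_and_access: str = ""
    cost_estimate: str = ""

    def unanswered(self) -> list[str]:
        """Return the names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative answers only.
plan = DataPlan(
    data_to_collect="Hourly river-level readings",
    storage_and_access="Offer to the funder's designated data centre",
)
print(plan.unanswered())
```

Anything returned by `unanswered()` is a prompt for a conversation with a data manager, not a substitute for one.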
Practical steps: the detailed plan
• Ask an expert!
• Researchers must work with data managers to
ensure that data of long-term value are
appropriately managed.
• Be pragmatic.
• Build on what others have done;
• Don’t aim to change the world – recognise that for
many researchers data management is a
necessary but ‘un-productive’ overhead.
• Early intervention – better long-term outcome.
Key learning 3
• Early intervention leads to better long-term
outcomes.
• Data management professionals should be
involved from the research planning phase
onwards;
• Develop domain specialist ‘informaticians’.
Further information
Mark Thorley
NERC
mrt@nerc.ac.uk