
Vision for DataBank: Summary
DataBank will be a parallel online service to ORA (Oxford University Research Archive) with which it will
operate seamlessly. ORA is a repository for text-based material: DataBank is designed to hold other
types of research outputs, loosely termed ‘data’. This broad term encompasses content such as
numerical datasets, audio files, still and moving images and other non-text items. Such items require
different metadata, search, access and delivery to text items. DataBank will operate in conjunction with
other research data stores dispersed across the University and beyond, for the storage and discovery of
and access to Oxford research data. An integrated data catalogue¹ (metadata only) will be developed to
record the existence and location of data created at Oxford, including that not held in DataBank.
Researcher’s perspective
DataBank will be valued, trusted and used by Oxford researchers as a key resource in support of
research. Interaction with DataBank will be easy i) for deposit of research data and ii) as a core resource
for easy discovery of and access to Oxford’s research data outputs. Paired deposit of papers in ORA
linked to supporting data in DataBank will be common practice. The DataBank data catalogue will be the
main source of information about the existence and location of Oxford data and as such will support
researchers wanting to build on previous research and find collaborators. Metadata and content deposit
by researchers and acquisition from distributed sources will be simple and automated wherever
possible. Metadata will be obtained from and deposited in other repositories on behalf of Oxford
authors (where permitted). DOIs can be assigned to datasets on request and in line with the Bodleian
Libraries DOI service. The design and functionality of the DataBank search and access interface will be
informed by user preferences. DataBank will provide tools and functions to enable easy search,
manipulation and export of search results and for embedding the service in websites. Assistance with
deposit and help and information will be made easily available. Access to datasets will be barrier free
(open access), unless data creators request an embargo, with appropriate machine-readable rights
attached.
University of Oxford perspective
DataBank will be the accepted central University store for small research data outputs (metadata and
datasets) and will form one strand of Oxford’s federated research data management facilities. The
DataBank catalogue will contain a comprehensive inventory of Oxford’s research data outputs; as such, it
will support activities such as responding to FOI requests. Where appropriate, data will be linked to
publications recorded in ORA. DataBank open access content will be easily discoverable via search
engines and other discovery tools and within the semantic web. DataBank will be a core resource within
the mix of methods employed across the University for disseminating and publicising its research, and as
such will be a key tool for outreach and engagement with the wider community. DataBank will be the
University’s central service for the digital preservation, curation and continued access to small research
data outputs. DataBank will use the entity store for research information management data, and share
data and services for the benefit of the Oxford research community and the Collegiate University. The
University will recognise the clear economic and strategic benefits of supporting DataBank within the UK
and global research communities.
Bodleian Libraries’ perspective
DataBank will be a core mainstream service within the Bodleian Libraries’ digital collections for research
and teaching support. It will be used by library staff in their interactions with readers. It will retain a
reputation as a benchmark technical system for preserving, managing and serving content. The systems
underpinning DataBank will be robust and reliable, will exploit current and forward-looking
technologies, and will meet appropriate accepted standards. It will provide a view on one of the
Bodleian Libraries’ digital collections (and its subsets), with the ability to interoperate with and
seamlessly navigate to other collections and discovery services. DataBank (and the data catalogue) will
contain data outputs and metadata from funded and unfunded research across all disciplines. Its policies
will be clear and easily accessible, and its procedures efficient and workable. DataBank will serve and
export metadata in standard formats, including as open linked data, that encourage use and re-use.
Interaction with DataBank will be spread across Bodleian Libraries’ departments and across faculties and
administrative departments of the University. Information about and tools to support new forms of
scholarly communication will be provided.
1 Working title: DataFinder
Manager’s/administrator’s perspective
DataBank will be a core part of Oxford’s research information management systems and its metadata
and content valued across the University. Data can be easily exported from DataBank and shared
appropriately with such systems (for example, in Research Services). DataBank will provide data and
tools for reporting, for business information and for informing strategic decisions. It will provide
appropriate data for and obtain data from external agencies (for example, research funders) as an
integrated part of Oxford research management systems. Deposit statistics and usage statistics will be
gathered and published as appropriate. DataBank will have an intuitive administrative and editing
interface for use by approved editors and administrators. Bodleian Libraries’ staff will work with
faculties to enable local systems to deposit content into DataBank using the DataFlow model.
Appendix
Current limitations
Development of DataBank has so far been dependent on ‘soft’ funding. The basis of the system was
developed to enable storage of data forming a part of a digital DPhil thesis. Future development in the
short term is dependent on project funding (specifically the JISC UMF-funded DataFlow project). This will
pay for some developer staff time with development driven by project goals (aligned as far as possible
with BDLSS goals). Further development will be limited by the available resources. This vision presents
an ‘ideal world’ view of the future DataBank.
Scope
• Deposit of content and metadata into DataBank will be possible by a variety of channels, both
manual and automated (wherever possible). Deposit can be via a local data management system.
• Content will comprise ‘non-text’ items that fall within the scope for file size.
• Files up to 20Gb will be accepted into DataBank.
• Search and access will be via a web interface.
• Metadata will be assigned using the ‘sheer curation’ model: core metadata as specified by DataCite
will be the priority, with additional metadata added as and when possible. DOIs will be assigned where
appropriate. Metadata schemas will be adopted as appropriate.
• DataBank will not enable data manipulation: its purpose is storage and access.
• Data export and download will be provided in a variety of formats to be determined by ease of
implementation, user requirements and availability of resources.
• Workflows and procedures will be adapted as Bodleian Libraries and University policies are
developed and implemented, for example digital preservation policy and research data
management policies.
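As an illustration only, two of the scope rules above (the file size limit and priority for core metadata) could be expressed as a simple deposit check. The sketch below is not part of DataBank: the names `Deposit`, `check_deposit`, `MAX_FILE_BYTES` and `CORE_FIELDS` are hypothetical, the 20Gb limit is assumed to mean 20 gigabytes, and the core fields are assumed to be the DataCite mandatory properties other than the identifier, on the assumption that a DOI would be assigned after deposit where appropriate.

```python
from dataclasses import dataclass, field

# Assumption: the "20Gb" limit in the scope list means 20 gigabytes (decimal).
MAX_FILE_BYTES = 20 * 10**9

# Assumption: core metadata = DataCite mandatory properties, minus the
# identifier (treated here as assigned after deposit, where appropriate).
CORE_FIELDS = ("creator", "title", "publisher", "publication_year")


@dataclass
class Deposit:
    """Hypothetical record of one item offered for deposit."""
    file_size_bytes: int
    metadata: dict = field(default_factory=dict)


def check_deposit(deposit):
    """Return a list of problems; an empty list means the deposit is in scope."""
    problems = []
    if deposit.file_size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds size limit")
    # A core field counts as missing if it is absent or empty.
    missing = [name for name in CORE_FIELDS if not deposit.metadata.get(name)]
    if missing:
        problems.append("missing core metadata: " + ", ".join(missing))
    return problems
```

Under the ‘sheer curation’ model, a real service might accept a deposit with missing core metadata and flag it for later completion rather than reject it; the check above only reports what is absent.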
Sally Rumsey, June 2011