Vision for DataBank

Summary

DataBank will be a parallel online service to ORA (Oxford University Research Archive), with which it will operate seamlessly. ORA is a repository for text-based material; DataBank is designed to hold other types of research outputs, loosely termed ‘data’. This broad term encompasses content such as numerical datasets, audio files, still and moving images and other non-text items. Such items require different metadata, search, access and delivery from text items. DataBank will operate in conjunction with other research data stores dispersed across the University and beyond, to support the storage and discovery of, and access to, Oxford research data. An integrated data catalogue[1] (metadata only) will be developed to record the existence and location of data created at Oxford, including data not held in DataBank.

Researcher’s perspective

DataBank will be valued, trusted and used by Oxford researchers as a key resource in support of research. Interaction with DataBank will be easy, both i) for deposit of research data and ii) as a core resource for discovery of and access to Oxford’s research data outputs. Paired deposit of papers in ORA linked to supporting data in DataBank will be common practice. The DataBank data catalogue will be the main source of information about the existence and location of Oxford data and, as such, will support researchers wanting to build on previous research and find collaborators. Metadata and content deposit by researchers, and acquisition from distributed sources, will be simple and automated wherever possible. Metadata will be obtained from and deposited in other repositories on behalf of Oxford authors (where permitted). DOIs can be assigned to datasets on request and in line with the Bodleian Libraries DOI service. The design and functionality of the DataBank search and access interface will be informed by user preferences.
DataBank will provide tools and functions to enable easy searching, manipulation and export of search results, and embedding of the service in websites. Assistance with deposit, together with help and information, will be readily available. Access to datasets will be barrier-free (open access) unless data creators request an embargo, and appropriate machine-readable rights will be attached.

University of Oxford perspective

DataBank will be the accepted central University store for small research data outputs (metadata and datasets) and will form one strand of Oxford’s federated research data management facilities. The DataBank catalogue will contain a comprehensive inventory of Oxford’s research data outputs; as such, it will support activities such as responding to FOI requests. Where appropriate, data will be linked to publications recorded in ORA. DataBank open access content will be easily discoverable via search engines and other discovery tools and within the semantic web. DataBank will be a core resource within the mix of methods employed across the University for disseminating and publicising its research, and as such will be a key tool for outreach and engagement with the wider community. DataBank will be the University’s central service for the digital preservation and curation of, and continued access to, small research data outputs. DataBank will use the entity store for research information management data, and will share data and services for the benefit of the Oxford research community and the Collegiate University. The University will recognise the clear economic and strategic benefits of supporting DataBank within the UK and global research communities.

Bodleian Libraries’ perspective

DataBank will be a core mainstream service within the Bodleian Libraries’ digital collections for research and teaching support. It will be used by library staff in their interactions with readers.
It will retain a reputation as a benchmark technical system for preserving, managing and serving content. The systems underpinning DataBank will be robust and reliable, will exploit current and forward-looking technologies, and will meet appropriate accepted standards. It will provide a view on one of the Bodleian Libraries’ digital collections (and its subsets), with the ability to interoperate with and seamlessly navigate to other collections and discovery services. DataBank (and the data catalogue) will contain data outputs and metadata from funded and unfunded research across all disciplines. Its policies will be clear and easily accessible, and its procedures efficient and workable. DataBank will serve and export metadata in standard formats, including as open linked data, that encourage use and re-use. Interaction with DataBank will be spread across Bodleian Libraries’ departments and across faculties and administrative departments of the University. Information about, and tools to support, new forms of scholarly communication will be provided.

[1] Working title: DataFinder

Manager’s/administrator’s perspective

DataBank will be a core part of Oxford’s research information management systems, and its metadata and content will be valued across the University. Data can be easily exported from DataBank and shared appropriately with such systems (for example, in Research Services). DataBank will provide data and tools for reporting, for business information and for informing strategic decisions. It will provide appropriate data for, and obtain data from, external agencies (for example, research funders) as an integrated part of Oxford research management systems. Deposit statistics and usage statistics will be gathered and published as appropriate. DataBank will have an intuitive administrative and editing interface for use by approved editors and administrators.
Bodleian Libraries’ staff will work with faculties to enable local systems to deposit content into DataBank using the DataFlow model.

Appendix

Current limitations

Development of DataBank has so far been dependent on ‘soft’ funding. The basis of the system was developed to enable storage of data forming a part of a digital DPhil thesis. Future development in the short term is dependent on project funding (specifically the JISC UMF-funded DataFlow project). This will pay for some developer staff time, with development driven by project goals (aligned as far as possible with BDLSS goals). Further development will be limited by the available resources. This vision presents an ‘ideal world’ view of the future DataBank.

Scope

Deposit of content and metadata into DataBank will be possible via a variety of channels, both manual and automated (wherever possible).
Deposit can be via a local data management system.
Content will comprise ‘non-text’ items that fall within the scope for file size: files up to 20 GB will be accepted into DataBank.
Search and access will be via a web interface.
Metadata will be assigned using the ‘sheer curation’ model: core metadata as specified by DataCite will be the priority, with additional metadata added as and when possible.
DOIs will be assigned where appropriate.
Metadata schemas will be adopted as appropriate.
DataBank will not enable data manipulation: its purpose is storage and access.
Data export and download will be provided in a variety of formats, to be determined by ease of implementation, user requirements and availability of resources.
Workflows and procedures will be adapted as Bodleian Libraries and University policies are developed and implemented, for example the digital preservation policy and research data management policies.

Sally Rumsey, June 2011