VIVO as an Institutional Data Registry

Philippa Broadley, Gawri Edussuriya, Lance De Vine, Stephanie Bradbury
Queensland University of Technology, Brisbane, Australia
Contact: [email protected]t.edu.au

VIVO at QUT

The Queensland University of Technology (QUT) in Brisbane, Australia, is involved in a number of projects funded by the Australian National Data Service (ANDS). QUT is currently working on one of these, the Metadata Stores Project, which uses the open source VIVO software to store and manage metadata about data sets created or managed by the QUT research community. The resulting registry, called QUT Research Data Finder, will support the sharing and reuse of research datasets within and beyond QUT.

QUT uses VIVO for both displaying and editing research metadata. Several customisations to VIVO are under development to better suit it to its role as a dataset registry. A general architecture and workflow, consisting of a public-facing VIVO and a restricted-access VIVO for data integration, is in the final stages of development (see Figure 1).

Several novel approaches have been used in the development, with the anticipated outcome that importing data into VIVO and exporting data from it will be more efficient. For example, we have created a DOI minting web application, and the DOI minting process is currently integrated with VIVO. This application will help researchers track who is reusing their data sets and how they are being reused.

Figure 4. Collection record in Research Data Australia (left).
Figure 5. Collection record in QUT Research Data Finder (right).

[Figure 1 diagram labels: Research Data Australia; NLA; Border firewall; Corporate firewall; OAI-PMH Provider; Outbound request to selected IP for minting of identifiers; Java Application Server; Research Data Finder (VIVO); MySQL (Database); SOLR; Data Librarian; Research Master; ePrints ...]
[Figure 1 diagram labels, continued: SOLR; Dataset Store 1; Dataset Store 2; Transformation and Integration; Staging and Data Integration]

Figure 1. Architecture of the metadata capture and publishing system.

QUT has developed and will use a Java object model (CRMM) as an intermediate data representation for data integration and transformation tasks (see Figure 6). Metadata mappings have been developed for various formats, and mappings are presently being constructed for moving data between CRMM and VIVO. The rationale for using a Java object model is that Java objects are relatively easy to manipulate, especially via high-level scripting languages that run on the JVM. Groovy was used for data manipulation.

Key Outcomes of Metadata Stores Project

- A public research profile portal (QUT Research Data Finder, beta version until December 2012 - http://researchdatafinder.qut.edu.au/vivo/)
- Workflow for registering new research data collections in the university
- Alignment of metadata records about research activities with our institutional research management system (ResearchMaster)
- Alignment of metadata records about parties with an institutional name authority, e.g. QUT's Staff Profiles system - http://staff.qut.edu.au/
- Strategic reporting on contents and coverage of the metadata store (for internal use)

As well as displaying records in QUT Research Data Finder, we will contribute collection, party, activity and service records to Research Data Australia (ANDS' discovery portal - http://researchdata.ands.org.au/) (see Figures 2 and 4).

[Figure 6 diagram labels: RIF-CS/XML encapsulated within OAI-PMH; RIF-CS/XML; YAML (YAML Ain't Markup Language); VIVO Jena Model; Mediaflux; Java Object DB or RDB Mapping; Transformation Scripts (three instances); CRMM (placeholder name) Java model]

Figure 6. Basic overview of metadata transformation and integration using the Java Object Model (CRMM).
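The intermediate-model approach shown in Figure 6 can be illustrated with a minimal sketch. The poster does not show the actual CRMM classes, so everything named here (`CrmmSketch`, `CollectionRecord`, `fromSourceRow`, the `id`/`title`/`abstract` source fields, and the example.org URI) is hypothetical. The XML output is a simplified RIF-CS-style fragment that has not been validated against the RIF-CS schema, and the RDF output is a single `rdfs:label` triple rather than a full VIVO mapping. The project used Groovy for data manipulation; plain Java is used here only to keep the sketch self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CrmmSketch {

    /** Hypothetical intermediate record; stands in for a CRMM collection object. */
    static final class CollectionRecord {
        final String key;
        final String title;
        final String description;

        CollectionRecord(String key, String title, String description) {
            this.key = key;
            this.title = title;
            this.description = description;
        }
    }

    /** Source -> intermediate model: build a record from a flat row (assumed field names). */
    static CollectionRecord fromSourceRow(Map<String, String> row) {
        return new CollectionRecord(
                row.get("id"),
                row.get("title").trim(),
                row.get("abstract").trim());
    }

    /** Escapes the characters that are unsafe in XML element content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Intermediate model -> simplified RIF-CS-style XML fragment (not schema-validated). */
    static String toRifCsLikeXml(CollectionRecord r) {
        return "<registryObject>"
                + "<key>" + escapeXml(r.key) + "</key>"
                + "<collection type=\"dataset\">"
                + "<name type=\"primary\"><namePart>" + escapeXml(r.title) + "</namePart></name>"
                + "<description type=\"brief\">" + escapeXml(r.description) + "</description>"
                + "</collection>"
                + "</registryObject>";
    }

    /** Intermediate model -> one N-Triples statement giving the record an rdfs:label. */
    static String toLabelTriple(CollectionRecord r) {
        String literal = r.title.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + r.key + "> <http://www.w3.org/2000/01/rdf-schema#label> \"" + literal + "\" .";
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "http://example.org/collection/1"); // made-up identifier
        row.put("title", "  Reef & coast survey ");
        row.put("abstract", "Water quality samples");

        CollectionRecord record = fromSourceRow(row);
        System.out.println(toRifCsLikeXml(record));
        System.out.println(toLabelTriple(record));
    }
}
```

The design point is that each source format needs only one mapping into the intermediate object and each target needs only one mapping out of it, so adding a new source or target does not require touching the other mappings.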
About ANDS

One of the chief goals of ANDS is to build the Australian Research Data Commons, a cohesive collection of research data outputs from all Australian research institutions. Funded by the Australian Commonwealth Government's Department of Industry, Innovation, Science, Research and Tertiary Education (DIISRTE), ANDS is engaged with Australian universities to ensure that research data is better described, more connected, more integrated and organised, more accessible and more easily used for new purposes.

Figure 2. Party record in Research Data Australia (left).
Figure 3. Party record in QUT Research Data Finder (right).

This project is supported by the Australian National Data Service (ANDS). ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative.

This work is licensed under a Creative Commons Attribution 3.0 Australia License.