Library Services

OUDL-Stellar metadata implementation

Author: Alex Addyman and Lara Whitelaw
Document no. (if applicable):
Publication Date: 26/06/2013
Version no.:
Status: Draft
Confidentiality: Public
Location (inc. Livelink link):
Last saved: 2016-02-08 (note: this is an automated field)

OUDL-STELLAR metadata implementation.docx Page 1 of 13 Authors: Alex Addyman and Lara Whitelaw 26/06/2013

Contents

Summary Table
Content Specific Metadata
    Main Texts
    Collections Metadata
    Supplementary Texts
    Time-Based Media
    Web Pages

Figure 1 - Source metadata from Voyager
Figure 2 - MARCXML to MODS and DC
Figure 3 - Legacy collections metadata example
Figure 4 - Portfolio Record

Summary Table

Content type            Metadata
Audio                   Generic: OAI-DC; Audio Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Collection              Generic: OAI-DC; Collections Metadata: DCCAP or MODS (mapped to EAD)
Main texts              Generic: DC, RELS-INT, RELS-EXT; Digital Resource: MODS; Preservation: PREMIS
Moving image or video   Generic: OAI-DC; AV Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Still image             Generic: OAI-DC; Image Metadata: VRA Core, MIX; Preservation: PREMIS
Supplementary text      Generic: OAI-DC; Digital Resource: MODS; Preservation: PREMIS
Web page                Generic: DC, RELS-INT, RELS-EXT; Webpage Metadata: MODS; Preservation: PREMIS

Content Specific Metadata

Main Texts

The Open University Archives has digitised a range of print study materials which formed part of the S100 Science Foundation Course during its early years. These include the Main Texts, which were sent out to students to guide them through the course, as well as supplementary material such as course guides, workbooks, Tutor Marked Assignments and so on. For the purposes of metadata the Main Texts are treated separately from the supplementary texts.

Sources of legacy metadata

Every Main Text digitised will have an existing record in our Voyager Library Management System encoded in the MARC-21 format. The MARC field usage varies per record in our sample collection but the example in Figure 1 is fairly atypical.
Figure 1 - Source metadata from Voyager

MARC Code               MARC Field                              Example
$001                    Control Number                          206749
$005                    Date and time of latest transaction     20070716165119.0
008/30-31 MU (code j)   Fixed length data fields                010508s1971 enka 000 0 eng
$008/15-17
008/07-10
008/07-11
$020 a                  ISBN                                    0335020321
$082 a                  DDC Number                              500
$110 a                  Corporate name                          Open University S100/Unit 14
$245 a                  Title statement                         The chemistry and structure of the cell
$260 a                  Place of publication                    Milton Keynes
$260 b                  Name of publisher                       Open University
$260 c                  Date of publication                     1971
$500 a                  General note                            Unit 14 of S100 Science foundation course
$650 a                  Subject - topical                       Biochemistry; Cells; Cell physiology
$650 x                  Subject - general subdivision           History
$740 a                  Uncontrolled Related/Analytical Title   Science foundation course
$842 a                  Textual physical form designator
$852 b                  Location
$852 n                  Sublocation

In order to assess the suitability of MODS we took the MARC Voyager records and examined which MARC fields were most commonly used across our sample content. We then cross-walked these against MODS as shown in Figure 2. We also cross-walked against simple Dublin Core as this is a core requirement of OAI harvesting.
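As a rough illustration of what a cross-walk to simple Dublin Core involves, the following Python sketch maps a few MARC subfields to DC elements. It is a minimal sketch under stated assumptions, not the project's implementation: the table `MARC_TO_DC`, the function `crosswalk` and the triple-based record format are hypothetical names introduced here, only a handful of mappings are included, and the `$852 b` value is a placeholder because the sample record gives no example for it.

```python
from collections import defaultdict

# Hypothetical cross-walk table: (MARC tag, subfield) -> simple Dublin Core element.
# Only a few of the mappings discussed in this section are included.
MARC_TO_DC = {
    ("020", "a"): "identifier",
    ("082", "a"): "subject",
    ("110", "a"): "contributor",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("500", "a"): "description",
}


def crosswalk(marc_fields):
    """Map (tag, subfield, value) triples to DC elements.

    Returns a dict of DC element -> list of values, plus the list of
    (tag, subfield) pairs that have no DC equivalent in the table.
    """
    dc = defaultdict(list)
    unmapped = []
    for tag, subfield, value in marc_fields:
        element = MARC_TO_DC.get((tag, subfield))
        if element is None:
            unmapped.append((tag, subfield))
        else:
            dc[element].append(value)
    return dict(dc), unmapped


# Values taken from the sample Voyager record; the 852 value is a placeholder.
sample = [
    ("020", "a", "0335020321"),
    ("245", "a", "The chemistry and structure of the cell"),
    ("260", "b", "Open University"),
    ("260", "c", "1971"),
    ("852", "b", "placeholder-location"),
]

record, unmapped = crosswalk(sample)
print(record)
print("No crossover:", unmapped)
```

Keeping the unmapped fields in a separate list makes the "No crossover" cases explicit, which is the point of comparing simple DC with a richer schema such as MODS.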
Figure 2 - MARCXML to MODS and DC

MARC Code               MARC Field                              MODS field                                                        DC field
$001                    Control Number                          <recordIdentifier>                                                Identifier
$005                    Date and time of latest transaction     <recordChangeDate> with encoding="iso8601"                        No crossover
008/30-31 MU (code j)   Fixed length data fields                <language><languageTerm>                                          Language
$008/15-17                                                      <place><placeTerm> with type="code" and authority="marccountry"   No crossover
008/07-10                                                       <dateIssued> with encoding="marc"                                 Date
008/07-11                                                       <dateCreated> with encoding="marc"                                Date
$020 a                  ISBN                                    <identifier type="isbn">                                          Identifier
$082 a                  DDC Number                              <classification authority="ddc">                                  Subject
$110 a                  Corporate name                          <name type="corporate">                                           Contributor
$245 a                  Title statement                         <titleInfo><title>                                                Title
$260 a                  Place of publication                    <place><placeTerm> with type="text"                               Publisher
$260 b                  Name of publisher                       <publisher>                                                       Publisher
$260 c                  Date of publication                     <dateIssued>                                                      Date
$500 a                  General note                            <note> with type=appropriate name assigned                        Description
$562 a                  Copy and Version Identification Note    <note> with type="version identification"                         No crossover
$562 b                  Copy identification                     <note> with type="version identification"                         No crossover
$650 a                  Subject - topical                       <subject authority=" ">                                           Subject
$650 x                  Subject - general subdivision           <topic>                                                           Subject/Coverage
$740 a                  Uncontrolled Related/Analytical Title   <titleInfo type="alternative"><title>                             No crossover
$842 a                  Textual physical form designator                                                                          Format
$852 b                  Location                                <physicalLocation>                                                No crossover
$852 n                  Sublocation                             <shelfLocator>                                                    No crossover

MODS is derived from MARC21, so it is no surprise that it is both granular enough and semantically close enough to map to every MARC field in our legacy metadata. Dublin Core alone does not offer the complexity to fully represent our legacy metadata.

Collections Metadata

The large and varied content which will make up OUDL requires a clear and easy-to-navigate hierarchical structure.
To achieve this, collections and subcollections will be developed to group similar content together.

Collections Characteristics

As the development of the OUDL is an iterative process, and as the number of digital items suitable for inclusion in the repository will no doubt grow with time, there are no fixed collections as yet. However, the following collection themes have been identified; their titles and contents are subject to change.

OU Study Materials Archive
Contents: All the study materials (also called learning materials/course materials/teaching materials) in all formats. "Study Materials" was the preferred phrase when discussed with the Learning and Teaching team a while ago and is now the official title of the collection on the website etc.

OU Life Collection
Contents:
    OU Historical Images (non-teaching - this could be complicated as our images are jumbled together at the moment - teaching and historical)
    OU Historical broadcasting (Open Forum TV and radio, other OU "magazine" programmes)
    OU Vice-Chancellor's speeches (need to talk to the VC's office about these)
    Other collections such as Sesame, Open House, OU web archive of social/comms stuff could also go into here.

OU Learning Journey Collection
This collection title would be consistent with the OU/Agreement and the title now given to this material by OMU

Sources of legacy metadata

Collections-level metadata for digital content at the OUDL is non-existent. The closest thing we can draw on is the Encoded Archival Description (EAD) metadata used to describe physical archive collections such as the Jennie Lee Collection and the Walter Perry Collection (Open University, 2006) (see Figure 3). These are important to map because they contain useful identifier metadata and because they may be preserved in the OUDL in the future.
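To make the idea of mapping finding-aid fields concrete, the sketch below turns a few of the labelled fields from the Walter Perry finding aid (Figure 3) into a simple Dublin Core description serialised as XML. It is only a sketch under assumptions: the `LABEL_TO_DC` table and the function name are hypothetical, the choice of DC elements for each label is an illustrative guess rather than an agreed mapping, and the output uses the generic OAI-DC container rather than a collections-level profile.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-DC records.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Assumed mapping from finding-aid labels to simple DC elements (illustrative only).
LABEL_TO_DC = {
    "Reference": "identifier",
    "Title": "title",
    "Dates of Creation": "date",
}


def finding_aid_to_oai_dc(fields):
    """Build an oai_dc:dc element from the finding-aid fields that have a mapping."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for label, value in fields.items():
        element = LABEL_TO_DC.get(label)
        if element and value:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return root


# Values copied from the legacy finding aid.
walter_perry = {
    "Reference": "GB/2315/WP",
    "Title": "The Walter Perry Collection",
    "Dates of Creation": "1926 - 2003",
    "Physical Description": "225 files",  # no mapping in this sketch, so it is skipped
}

root = finding_aid_to_oai_dc(walter_perry)
print(ET.tostring(root, encoding="unicode"))
```

Fields without an entry in the table are silently skipped here; a fuller mapping would need to decide where descriptive fields such as the physical description belong.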
Figure 3 - Legacy collections metadata example

Element name                    Example
Reference                       GB/2315/WP
Held.at                         The Open University Archive
Dates of Creation               1926 - 2003
Physical Description            225 files
Name of creator                 Walter Perry created the collection.
Title                           The Walter Perry Collection
Sub-title
Author                          Finding aid compiled by Miss Ruth Cammies
Publication                     The Open University Library 2006 The Open University,, Walton Hall, , Milton Keynes, , MK7 6AA , Tel: 01908 653378,
Edition                         1st Edition
Creation                        Finding aid encoded in EAD (Encoded Archival Description) 2002 using Altova XMLSpy by Miss Ruth Cammies, Open University Archivist, Mrs Julie Vavangas, Archive Assistant, Miss Georgina Parsons, Archive Assistant. Initial catalogue of the first deposit compiled by Beveley Hunt, Archivist, 2001.2006
Descriptive Rules               This finding aid has been created using the ISAD(G) 2nd Edition (International Standard of Archival Description (Generalised)) and Encoded Archival Description (EAD).
Language usage                  Finding aid written in English
scope and content
biographical history
arrangement                     The second deposit arrived at the Open University Archive with no structure or filing system. The structure has therefore been artificially created to aid access. Individual files have not been split unless clearly stated in the item record. WP/1 The Open University; WP/2 Other Educational Work; WP/3 Papers regarding Health, Science and the Environment; WP/4 Personal Files and Interests
access guidelines               To access the collection contact the Open University Archivist. All items will be monitored for personal or sensitive information before they are released to researchers. The Archivist reserves the right to restrict access if necessary. All researchers will be required to complete an access/data protection/copyright form. Access to some of the papers within the collection is restricted under the principles of the Data Protection Act 1998.
copying restrictions            Reproduction of items from the collection will be permitted according to copyright legislation and Open University Library policy.
immediate source of acquisition The first deposit of material was largely an internal transfer of papers from the Vice Chancellor's Office of the Open University. The second deposit was transferred from the Edinburgh Regional Office of the Open University. A small selection of materials were donated by Lady Perry.
custodial history               The first deposit was transferred to the University Library by Walter Perry. The second deposit was transferred from Walter Perry's office within the Edinburgh Regional Centre for the Open University after his death in 2003. A small selection of files was also donated by Lady Perry at this time.
archvists note                  This finding aid was created in 2006.
related material
subjects                        Education, Higher Distance Education Medicine Research Broadcasting
subject.personal.names          Perry Walter 1921 - 2003 Lord Perry of Walton; Lee Baroness Jennie 1904 - 1988 MP; Wilson Baron of Rievaulx James Harold 1916 - 1995 Statesman; Goodman Baron Arnold Abraham 1913 - 1995 Lawyer
subject.corporate.names         Labour Party Great Britain; The Open University Great Britain
subject.geographical.names      Milton Keynes England; Edinburgh Scotland; Dundee Scotland

Dublin Core Collections-level Application Profile

The Dublin Core Collections-level Application Profile (Dublin Core Metadata Initiative, 2007) is one of the most widely used collections-level profiles. As it draws on Dublin Core standards it is highly interoperable with other schemas and libraries.
As with the simple DC schema, however, it is somewhat limited in scope, but at the collections level this is less of an issue as the collection descriptions should be very brief. DC-CAP draws on the functional model and element set defined by RSLP (UKOLN, 2000).

Supplementary Texts

Accompanying every Main Text produced for an Open University Unit is a series of Supplementary Texts. These include the following types [1]:

    Assessment
    Assignment
    Computer Marked Assignment (CMA)
    Calendar
    Case Study
    Companion
    Computing Guide
    End of Course Assessment (ECA)
    Files
    Glossary
    Handbook
    Media Notes
    Module Guide
    Musical Scores
    Portfolio
    Readings
    Student Marked Assessment (SMA)
    Specimen Exam Paper
    Study File
    Study Guide
    Tutor Marked Assignment (TMA)
    Work Book

[1] This list is by no means comprehensive and is being expanded as part of the project.

Sources of Legacy Metadata

Historically this supplementary material has not been catalogued in the way that Main Texts have been through Voyager MARC records. There are, however, two sources of metadata we can draw from. The first is PLANET, the Open University’s central planning system. Within PLANET every item produced for a course is recorded in an inventory. The metadata is minimal but will at least allow us to identify titles, identifiers and, crucially, which presentation of a course each item belongs to. Some courses will also have more in-depth metadata profiles in the Portfolio system, which is a digital asset management system. Roughly 30% of items within Portfolio (which itself only contains items from a small proportion of courses produced by the OU) have detailed IEEE LOM records (Figure 4). Where available these will be ported directly into OUDL.
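The preference described above (use a detailed Portfolio record where one exists, otherwise fall back to the minimal PLANET inventory entry) can be sketched as follows. This is a minimal sketch under assumptions: the function name, the dictionary-based record stores, the item identifiers and all field names are hypothetical, and neither system's real export format is shown here.

```python
def choose_legacy_source(item_id, portfolio_lom, planet_inventory):
    """Return (source name, record) for an item, preferring Portfolio over PLANET.

    Returns (None, {}) when neither system knows the item, which would
    signal that manual cataloguing is needed.
    """
    if item_id in portfolio_lom:
        return "Portfolio", portfolio_lom[item_id]
    if item_id in planet_inventory:
        return "PLANET", planet_inventory[item_id]
    return None, {}


# Hypothetical records, for illustration only.
portfolio_lom = {
    "ITEM-001": {"title": "Example study guide", "description": "Detailed LOM description"},
}
planet_inventory = {
    "ITEM-001": {"title": "Example study guide", "presentation": "EXAMPLE-PRESENTATION"},
    "ITEM-002": {"title": "Example glossary", "presentation": "EXAMPLE-PRESENTATION"},
}

for item_id in ("ITEM-001", "ITEM-002", "ITEM-003"):
    source, record = choose_legacy_source(item_id, portfolio_lom, planet_inventory)
    print(item_id, source, record.get("title"))
```

In practice the two sources could also be merged, for example to take the course presentation from PLANET even when a Portfolio record exists; the sketch keeps to a simple either/or choice.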
Figure 4 - Portfolio Record

Selection of Metadata Standards

As a number of items are already described in the IEEE LOM format, it initially made sense to retain this format for all of our Supplementary Texts. LOM also appeared to be the most appropriate standard given that the items in question are more exclusively learning objects, as opposed to the Main Texts, which are more like traditional bibliographic texts. We mapped the Portfolio elements back to their original LOM structure and used this to map the PLANET fields to create a consistent profile.

Given that the PLANET metadata was not intended to be used for resource discovery purposes, there are some fields which are difficult to map, particularly subject fields. Where possible and appropriate we may be able to draw these fields from the module to which the supplementary item belongs. However, for more granular subject descriptions manual cataloguing may be required.

When we started to look at how we could transform these materials to Linked Data for the STELLAR project we found that IEEE LOM was not available in RDF. We considered whether to use the Learning Resource Metadata Initiative (LRMI) instead, which is the schema.org based standard that will replace IEEE LOM, but found that although it will be useful for sharing our metadata with others, like OAI-DC it is not suitable for internal repository use. We have therefore made the decision to use MODS for supplementary resources as well as main texts, and to map this profile to LRMI for sharing externally.

Time-Based Media

The OUDL will include a significant amount of video and audio content which has been made available digitally through this and previous projects.
The most notable project was the Access to Video Assets project (AVA), which sought to “address the increasing demand for exploitation of The Open University’s rich media legacy assets” (Open University Library Services, 2008). The key output of AVA was the development of VideoFinder, a centralised repository and catalogue for OU video (which has since been extended to include audio assets).

The Access to Video Assets Time-Based Media Profile was created in the development of VideoFinder. The original AVA profile used Dublin Core, PREMIS, EBU-Core, MPEG-7 and VideoSD, based on the AHDS Core Element Set for Moving Images (Arts and Humanities Data Service, 2007), and was recorded as a schema in the following namespace: http://open.ac.uk/library/ava/ns_ou.xsd.

When we reviewed this standard for the STELLAR project we found that it did not give us the functionality that we have with other schemas developed for the OUDL. Additionally, we were unable to transform this specification into Linked Data. We decided to remap the specification and looked at both PBCore and EBUCore as candidate schemas.

The Public Broadcasting Metadata Dictionary Project (PBCore): http://pbcore.org/index.php is a simple and flexible standard based on Dublin Core and looked a very promising match for the OU’s audio-visual resources, but unfortunately there is currently no RDF ontology representing the schema and it was therefore ruled out.

The European Broadcasting Union’s EBUCore: http://tech.ebu.ch/MetadataSpecifications is a much more complex specification than PBCore but is used by EUScreen (a European portal on audiovisual public archives counting 12 EBU members and national archives), which delivers linked data to Europeana. The W3C Media Annotation Ontology is based on EBU's Class Conceptual Data Model; EBUCore is also being registered in SMPTE and has been published as AES60 by the Audio Engineering Society (AES).
The project has therefore decided to use the EBUCore metadata profile within its repository.

Web Pages

The Open University Archive has begun the process of archiving university web pages from the learn.open.ac.uk domain dating back to 2006. The project has been using the Wayback Machine (The Internet Archive, 2001) to convert web pages into a standardised web archive format known as a WARC file. This file allows a site and its contents to be re-rendered at a later date.

Sources of legacy metadata

Metadata for the websites may be derived from the WARC file itself, which contains very basic descriptive and technical elements. Other than this there is no legacy metadata for VLE pages, either within the HTML of the pages or elsewhere.

Selecting metadata standards

Descriptive metadata

As there is little descriptive metadata to harness, we have created a MODS record which will require largely manual metadata entry unless a source of VLE descriptive metadata can be found. The elements for this are drawn from the minimal descriptive metadata we can harness from the WARC file and from those listed in the Library of Congress Web Archive Application Profile (Library of Congress, 2009).

Technical metadata

Technical metadata will be drawn from the WARC file itself and mapped to PREMIS in line with preservation requirements.
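As an illustration of drawing technical metadata from a WARC file, the sketch below parses the named fields of a single WARC record header and copies a few of them into PREMIS-style semantic units. It is a minimal sketch under assumptions: the sample header values are invented for illustration, the function names are hypothetical, and the pairing of WARC fields with PREMIS units in `WARC_TO_PREMIS` is an illustrative guess rather than the project's agreed mapping.

```python
# Illustrative WARC record header; all values are made up for this example.
SAMPLE_HEADER = """WARC/1.0
WARC-Type: response
WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>
WARC-Date: 2006-01-01T00:00:00Z
WARC-Target-URI: http://example.org/
Content-Type: application/http; msgtype=response
Content-Length: 1024
"""

# Assumed pairing of WARC named fields with PREMIS semantic units.
WARC_TO_PREMIS = {
    "WARC-Record-ID": "objectIdentifierValue",
    "Content-Length": "size",
    "Content-Type": "formatName",
    "WARC-Date": "dateCreatedByApplication",
}


def parse_warc_header(block):
    """Return (version line, dict of named fields) for one WARC record header."""
    lines = block.strip().splitlines()
    version = lines[0].strip()
    fields = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        # Split on the first colon only, so values containing colons stay intact.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return version, fields


def to_premis_like(fields):
    """Copy the mapped WARC fields into a dict keyed by PREMIS semantic unit."""
    return {unit: fields[name] for name, unit in WARC_TO_PREMIS.items() if name in fields}


version, fields = parse_warc_header(SAMPLE_HEADER)
premis = to_premis_like(fields)
print(version)
print(premis)
```

A production workflow would more likely read records with a dedicated WARC library and write a full PREMIS XML document; the sketch only shows the shape of the field-to-unit mapping.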