Library Services
OUDL-Stellar metadata implementation
Author: Alex Addyman and Lara Whitelaw
Document no. (if applicable):
Publication Date: 26/06/2013
Version no.:
Status: Draft
Confidentiality: Public
Location (inc. Livelink link):
Last saved: 2016-02-08 (note – this is an automated field)
OUDL-STELLAR metadata implementation.docx
Page 1 of 13
Authors: Alex Addyman and Lara Whitelaw
26/06/2013
Contents
Summary Table
Content Specific Metadata
Main Texts
Collections Metadata
Supplementary Texts
Time-Based Media
Web Pages
Figure 1 - Source metadata from Voyager
Figure 2 - MARCXML to MODS and DC
Figure 3 - Legacy collections metadata example
Figure 4 - Portfolio Record
Summary Table
Content type | Metadata
Audio | Generic: OAI-DC; Audio Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Collection | Generic: OAI-DC; Collections Metadata: DCCAP or MODS (mapped to EAD)
Main texts | Generic: DC, RELS-INT, RELS-EXT; Digital Resource: MODS; Preservation: PREMIS
Moving image or video | Generic: OAI-DC; AV Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Still image | Generic: OAI-DC; Image Metadata: VRA Core, MIX; Preservation: PREMIS
Supplementary text | Generic: OAI-DC; Digital Resource: MODS; Preservation: PREMIS
Web page | Generic: DC, RELS-INT, RELS-EXT; Webpage Metadata: MODS; Preservation: PREMIS
Content Specific Metadata
Main Texts
The Open University Archives has digitised a range of print study materials
which formed part of the S100 Science Foundation Course during its early years.
These include the Main Texts which were sent out to students in order to guide
them through the course as well as any supplementary material such as course
guides, workbooks, Tutor Marked Assignments and so on. For the purposes of
metadata the Main Texts are treated separately from the supplementary texts.
Sources of legacy metadata
Every digitised Main Text has an existing record in our Voyager Library
Management System, encoded in the MARC 21 format. MARC field usage varies
from record to record in our sample collection, but the example in Figure 1 is
fairly typical.
Figure 1 - Source metadata from Voyager
MARC Code | MARC Field | Example
$001 | Control Number | 206749
$005 | Date and time of latest transaction | 20070716165119.0
008/30-31 MU (code j), 008/15-17, 008/07-10, 008/07-11 | Fixed length data fields | 010508s1971 enka 000 0 eng
$020 a | ISBN | 0335020321
$082 a | DDC Number | 500
$110 a | Corporate name | Open University S100/Unit 14
$245 a | Title statement | The chemistry and structure of the cell
$260 a | Place of publication | Milton Keynes
$260 b | Name of publisher | Open University
$260 c | Date of publication | 1971
$500 a | General note | Unit 14 of S100 Science foundation course
$650 a | Subject - topical | Biochemistry; Cells; Cell physiology
$650 x | Subject - general subdivision | History
$740 a | Uncontrolled Related/Analytical Title | Science foundation course
$842 a | Textual physical form designator |
$852 b | Location |
$852 n | Sublocation |
In order to assess the suitability of MODS we took the MARC Voyager records
and examined which MARC fields were most commonly used across our sample
content. We then cross-walked this against MODS as shown in
Figure 2. We also cross-walked against simple Dublin Core as this is a core
requirement of OAI harvesting.
Figure 2 - MARCXML to MODS and DC
MARC Code | MARC Field | MODS field | DC field
$001 | Control Number | <recordIdentifier> | Identifier
$005 | Date and time of latest transaction | <recordChangeDate> with encoding="iso8601" | No crossover
008/30-31 MU (code j) | Fixed length data fields | <language><languageTerm> | Language
008/15-17 | Fixed length data fields | <place><placeTerm> with type="code" and authority="marccountry" | No crossover
008/07-10 | Fixed length data fields | <dateIssued> with encoding="marc" | Date
008/07-11 | Fixed length data fields | <dateCreated> with encoding="marc" | Date
$020 a | ISBN | <identifier type="isbn"> | Identifier
$082 a | DDC Number | <classification authority="ddc"> | Subject
$110 a | Corporate name | <name type="corporate"> | Contributor
$245 a | Title statement | <titleInfo><title> | Title
$260 a | Place of publication | <place><placeTerm> with type="text" | Publisher
$260 b | Name of publisher | <publisher> | Publisher
$260 c | Date of publication | <dateIssued> | Date
$500 a | General note | <note> with type=appropriate name assigned | Description
$562 a | Copy and Version Identification Note | <note> with type="version identification" | No crossover
$562 b | Copy identification | <note> with type="version identification" | No crossover
$650 a | Subject - topical | <subject authority=" "> | Subject
$650 x | Subject - general subdivision | <topic> | Subject/Coverage
$740 a | Uncontrolled Related/Analytical Title | <titleInfo type="alternative"><title> | No crossover
$842 a | Textual physical form designator | | Format
$852 b | Location | <physicalLocation> | No crossover
$852 n | Sublocation | <shelfLocator> | No crossover
MODS is derived from MARC 21, so it is no surprise that it is both granular
enough and semantically close enough to accommodate the MARC fields used in
our legacy metadata. Simple Dublin Core alone is not expressive enough to
represent our legacy metadata fully.
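As an illustration of the crosswalk in Figure 2, the following Python sketch maps a handful of MARC subfields to simple Dublin Core elements. It is a minimal sketch under stated assumptions, not the project's actual transformation: the MARC_TO_DC table, the marc_to_dc function and the (tag, subfield, value) tuple layout are hypothetical names chosen for this example, and only a subset of the rows in Figure 2 is included.

```python
# Hypothetical lookup table: (MARC tag, subfield code) -> simple DC element.
# Only a subset of the rows in Figure 2 is shown. Control fields such as 001
# have no subfield code, so None is used in its place.
MARC_TO_DC = {
    ("001", None): "identifier",
    ("020", "a"): "identifier",
    ("082", "a"): "subject",
    ("110", "a"): "contributor",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("500", "a"): "description",
    ("650", "a"): "subject",
}


def marc_to_dc(fields):
    """Group MARC values under the DC element they map to.

    `fields` is an iterable of (tag, subfield_code, value) tuples, an
    assumed in-memory layout rather than the output of a real MARC parser.
    Fields with no entry in MARC_TO_DC ("No crossover") are skipped.
    """
    dc = {}
    for tag, code, value in fields:
        element = MARC_TO_DC.get((tag, code))
        if element is None:
            continue
        dc.setdefault(element, []).append(value)
    return dc


# Values taken from the sample record in Figure 1; the 852 value is a
# placeholder because Figure 1 gives no example for that field.
sample = [
    ("001", None, "206749"),
    ("020", "a", "0335020321"),
    ("245", "a", "The chemistry and structure of the cell"),
    ("260", "b", "Open University"),
    ("260", "c", "1971"),
    ("852", "b", "placeholder location"),  # no DC crossover, so dropped
]
record = marc_to_dc(sample)
```

Because both the control number and the ISBN map to the same DC element, the result keeps a list of values per element. This also shows in miniature why simple DC loses distinctions that MODS retains.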
Collections Metadata
The large and varied content which will make up OUDL requires a clear and
easy-to-navigate hierarchical structure. To achieve this, collections and
sub-collections will be developed to group similar content together.
Collections Characteristics
Because the development of the OUDL is an iterative process, and because the
number of digital items suitable for inclusion in the repository will no doubt
grow with time, there are no fixed collections as yet. However, the following
collection themes have been identified; their titles and contents are subject
to change.
OU Study Materials Archive
Contents:
• All the study materials (also called learning materials/course
materials/teaching materials) in all formats. "Study Materials" was the
preferred phrase when discussed with the Learning and Teaching team a
while ago and is now the official title of the collection on the website etc.
OU Life Collection
Contents:
• OU Historical Images (non-teaching - this could be complicated as our
images are jumbled together at the moment - teaching and historical)
• OU Historical broadcasting (Open Forum TV and radio, other OU
"magazine" programmes)
• OU Vice-Chancellor's speeches (need to talk to the VC's office about
these)
• Other collections such as Sesame, Open House, OU web archive of
social/comms stuff could also go into here.
OU Learning Journey Collection
This collection title would be consistent with the OU/Agreement and the title now
given to this material by OMU.
Sources of legacy metadata
Collections-level metadata for digital content at the OUDL is non-existent. The
closest thing we can draw on is Encoded Archival Description (EAD) metadata
used to describe physical archive collections such as the Jennie Lee Collection
and the Walter Perry Collection (Open University, 2006) (see
Figure 3). These are important to map because they contain useful identifier
metadata and because they may be preserved in the OUDL in the future.
Figure 3 - Legacy collections metadata example
Element name | Example
Reference | GB/2315/WP
Held.at | The Open University Archive
Dates of Creation | 1926 - 2003
Physical Description | 225 files
Name of creator | Walter Perry created the collection.
Title | The Walter Perry Collection
Sub-title |
Author | Finding aid compiled by Miss Ruth Cammies
Publication | The Open University Library 2006 The Open University, Walton Hall, Milton Keynes, MK7 6AA, Tel: 01908 653378
Edition | 1st Edition
Creation | Finding aid encoded in EAD (Encoded Archival Description) 2002 using Altova XMLSpy by Miss Ruth Cammies, Open University Archivist, Mrs Julie Vavangas, Archive Assistant, Miss Georgina Parsons, Archive Assistant. Initial catalogue of the first deposit compiled by Beveley Hunt, Archivist, 2001.2006
Descriptive Rules | This finding aid has been created using the ISAD(G) 2nd Edition (International Standard of Archival Description (Generalised)) and Encoded Archival Description (EAD).
Language usage | Finding aid written in English
scope and content |
biographical history |
arrangement | The second deposit arrived at the Open University Archive with no structure or filing system. The structure has therefore been artificially created to aid access. Individual files have not been split unless clearly stated in the item record. WP/1 The Open University; WP/2 Other Educational Work; WP/3 Papers regarding Health, Science and the Environment; WP/4 Personal Files and Interests
access guidelines | To access the collection contact the Open University Archivist. All items will be monitored for personal or sensitive information before they are released to researchers. The Archivist reserves the right to restrict access if necessary. All researchers will be required to complete an access/data protection/copyright form. Access to some of the papers within the collection is restricted under the principles of the Data Protection Act 1998.
copying restrictions | Reproduction of items from the collection will be permitted according to copyright legislation and Open University Library policy.
immediate source of acquisition | The first deposit of material was largely an internal transfer of papers from the Vice Chancellor's Office of the Open University. The second deposit was transferred from the Edinburgh Regional Office of the Open University. A small selection of materials were donated by Lady Perry.
custodial history | The first deposit was transferred to the University Library by Walter Perry. The second deposit was transferred from Walter Perry's office within the Edinburgh Regional Centre for the Open University after his death in 2003. A small selection of files was also donated by Lady Perry at this time.
archivists note | This finding aid was created in 2006.
related material |
subjects | Education, Higher; Distance Education; Medicine Research; Broadcasting
subject.personal.names | Perry Walter 1921 - 2003 Lord Perry of Walton; Lee Baroness Jennie 1904 - 1988 MP; Wilson Baron of Rievaulx James Harold 1916 - 1995 Statesman; Goodman Baron Arnold Abraham 1913 - 1995 Lawyer
subject.corporate.names | Labour Party Great Britain; The Open University Great Britain
subject.geographical.names | Milton Keynes England; Edinburgh Scotland; Dundee Scotland
Dublin Core Collections-level Application Profile
The Dublin Core Collections-level Application Profile (DC-CAP) (Dublin Core
Metadata Initiative, 2007) is one of the most widely used collection-level
profiles. Because it draws on Dublin Core standards, it is highly interoperable
with other schemas and libraries. Like the simple DC schema, it is somewhat
limited in scope, but at the collection level this is less of an issue because
collection descriptions should be very brief. DC-CAP draws on the functional
model and element set defined by RSLP (UKOLN, 2000).
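To make the idea of a brief collection description concrete, the sketch below builds one from a few of the legacy values in Figure 3. The pairing of finding-aid labels with Dublin Core property names is an assumption made for illustration, not a mapping agreed by the project, and FINDING_AID_TO_DC and describe_collection are hypothetical names.

```python
# Assumed pairing of Figure 3 labels with Dublin Core property names.
# This is illustrative only and is not the project's agreed crosswalk.
FINDING_AID_TO_DC = {
    "Reference": "dc:identifier",
    "Title": "dc:title",
    "Physical Description": "dcterms:extent",
    "subjects": "dc:subject",
}


def describe_collection(finding_aid):
    """Return a brief collection-level description as a plain dict."""
    description = {"dc:type": "Collection"}  # marks the record as a collection
    for label, prop in FINDING_AID_TO_DC.items():
        if label in finding_aid:
            description[prop] = finding_aid[label]
    return description


# A few values copied from the Walter Perry example in Figure 3.
walter_perry = {
    "Reference": "GB/2315/WP",
    "Title": "The Walter Perry Collection",
    "Physical Description": "225 files",
    "subjects": ["Education, Higher", "Distance Education"],
    "Author": "Finding aid compiled by Miss Ruth Cammies",  # unmapped, ignored
}
collection = describe_collection(walter_perry)
```

Only the mapped labels survive, which keeps the description short, in line with the point that collection descriptions should be very brief.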
Supplementary Texts
Accompanying every Main Text produced for an Open University Unit is a series
of Supplementary Texts. These include the following types [1]:
• Assessment
• Assignment
• Computer Marked Assignment (CMA)
• Calendar
• Case Study
• Companion
• Computing Guide
• End of Course Assessment (ECA)
• Files
• Glossary
• Handbook
• Media Notes
• Module Guide
• Musical Scores
• Portfolio
• Readings
• Student Marked Assessment (SMA)
• Specimen Exam Paper
• Study File
• Study Guide
• Tutor Marked Assignment (TMA)
• Work Book
Sources of Legacy Metadata
Historically, this supplementary material has not been catalogued in Voyager
MARC records in the way that Main Texts have. There are, however, two sources
of metadata we can draw on. The first is PLANET, the Open University’s central
planning system. Within PLANET every item produced for a course is recorded in
an inventory. The metadata is minimal but will at least allow us to identify
titles, identifiers and, crucially, which presentation of a course each item
belongs to.
Some courses will also have more in-depth metadata profiles in the Portfolio
system, a digital asset management system. Roughly 30% of items within
Portfolio (which itself only contains items from a small proportion of the
courses produced by the OU) have detailed IEEE LOM records (Figure 4). Where
available, these will be ported directly into OUDL.
[1] This list is by no means comprehensive and is being expanded as part of the project.
Figure 4 - Portfolio Record
Selection of Metadata Standards
As a number of items already have records in the IEEE LOM format, it made sense
to retain this format for all of our Supplementary Texts. LOM is also the most
appropriate standard given that the items in question are more strictly
learning objects, whereas the Main Texts are more like traditional
bibliographic texts. We have mapped the Portfolio elements back to their
original LOM structure and used this to map the PLANET fields, creating a
consistent profile.
Given that the PLANET metadata was not intended for resource discovery, some
fields are difficult to map, particularly subject fields. Where possible and
appropriate, we may be able to draw these fields from the module to which the
supplementary item belongs. However, for more granular subject descriptions
manual cataloguing may be required.
When we started to look at how we could transform these materials into Linked
Data for the STELLAR project, we found that IEEE LOM was not available in RDF.
We considered using the Learning Resource Metadata Initiative (LRMI)
specification instead, the schema.org-based standard that will replace IEEE
LOM. We found that, although it will be useful for sharing our metadata with
others, like OAI-DC it is not suitable for internal repository use. We have
therefore decided to use MODS for supplementary resources as well as main
texts, and to map this profile to LRMI for sharing externally.
Time-Based Media
The OUDL will include a significant amount of video and audio content which has
been made available digitally through this and previous projects. The most
notable project was the Access to Video Assets project (AVA) which sought to
“address the increasing demand for exploitation of The Open University’s rich
media legacy assets” (Open University Library Services, 2008). The key output
of AVA was the development of VideoFinder – a centralised repository and
catalogue for OU video (which has since been extended to include audio assets).
The Access to Video Assets Time-Based Media Profile was created in the
development of VideoFinder. The original AVA profile used Dublin Core, PREMIS,
EBU-Core, MPEG-7 and VideoSD based on the AHDS Core Element Set for
Moving Images (Arts and Humanities Data Service, 2007) and was recorded in
the following namespace as a schema: http://open.ac.uk/library/ava/ns_ou.xsd.
When we reviewed this profile for the STELLAR project, we found that it did not
give us the functionality that we have with other schemas developed for the
OUDL. Additionally, we were unable to transform the specification into Linked
Data. We decided to remap the specification and looked at both PBCore and
EBUCore as candidate schemas.
PBCore, the Public Broadcasting Metadata Dictionary Project
(http://pbcore.org/index.php), is a simple and flexible standard based on Dublin
Core and looked like a very promising match for the OU’s audio-visual
resources. Unfortunately there is currently no RDF ontology representing the
schema, and it was therefore ruled out.
The European Broadcasting Union’s EBUCore
(http://tech.ebu.ch/MetadataSpecifications) is a much more complex specification
than PBCore, but it is used by EUScreen (a European portal on audiovisual public
archives counting 12 EBU members and national archives), which delivers linked
data to Europeana. The W3C Media Annotation Ontology is based on the EBU's Class
Conceptual Data Model; EBUCore is also being registered in SMPTE and has been
published as AES60 by the Audio Engineering Society (AES). The project has
therefore decided to use the EBUCore metadata profile within its repository.
Web Pages
The Open University Archive has begun archiving university web pages from the
learn.open.ac.uk domain dating back to 2006. The project has been using the
Wayback Machine (The Internet Archive, 2001) to convert web pages into a
standardised web archive format known as WARC. A WARC file allows a site and
its contents to be re-rendered at a later date.
Sources of legacy metadata
Metadata for the websites may be derived from the WARC file itself, which
contains very basic descriptive and technical elements. Other than this, there
is no legacy metadata for VLE pages, either within the HTML of the pages or
elsewhere.
Selecting metadata standards
Descriptive metadata
As there is little descriptive metadata to harness, we have created a MODS
record that will require largely manual metadata entry unless a source of VLE
descriptive metadata can be found. Its elements are drawn from the minimal
descriptive metadata available in the WARC file and from those listed in the
Library of Congress Web Archive Application Profile (Library of Congress,
2009).
Technical metadata
Technical metadata will be drawn from the WARC file itself and mapped to
PREMIS in line with preservation requirements.
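The sketch below shows one way the named header fields of a WARC record could be read and collected as technical metadata. It assumes the record header has already been extracted as text. The helpers parse_warc_header and to_technical_metadata are hypothetical, the sample record is invented for illustration, and the output keys are placeholder labels rather than confirmed PREMIS element names.

```python
def parse_warc_header(header_text):
    """Parse the header block of a single WARC record into a dict.

    The first line carries the version (e.g. "WARC/1.0"); each following
    line is a "Name: value" pair, and a blank line ends the header.
    """
    lines = header_text.strip().splitlines()
    fields = {"version": lines[0].strip()}
    for line in lines[1:]:
        if not line.strip():
            break  # blank line marks the end of the header block
        # Split on the first colon only, so URIs and timestamps stay intact.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def to_technical_metadata(fields):
    """Pick out header fields of interest for preservation metadata.

    The keys used here are placeholder labels, not PREMIS element names.
    """
    return {
        "uri": fields.get("WARC-Target-URI"),
        "capture_date": fields.get("WARC-Date"),
        "format": fields.get("Content-Type"),
        "size_bytes": int(fields.get("Content-Length", 0)),
    }


# Invented sample header, for illustration only.
sample_header = """WARC/1.0
WARC-Type: response
WARC-Target-URI: http://learn.open.ac.uk/
WARC-Date: 2006-10-01T12:00:00Z
Content-Type: application/http; msgtype=response
Content-Length: 1024
"""
metadata = to_technical_metadata(parse_warc_header(sample_header))
```

Keeping the parsing step separate from the selection step means the choice of target elements can be revised once the PREMIS mapping is settled, without touching the header parser.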