Kraemer Family Library University of Colorado Colorado Springs

Kraemer Family Library
University of Colorado
Colorado Springs
Minimum Metadata Standard
Draft Version 1.2
(March 2012)
1. Status
1.1. Document Status
Draft
Comments Requested

Draft Released
1.2. Document Location
\\Columbia\dept\ADR\Metadata Standards\MinimumStandardv1_5.docx
1.3. Version History
Version Number   Release Date   Editor      Description
1.0              6/14/2011      Mary Rupp   Created document
1.2              3/9/2012       Mary Rupp   Modifications to metadata formats
2. Documentation Structure
2.1. Normative and Non-normative Sections
Normative material describes the element names, attributes, formats, and contents of
elements that are required in order for content or systems to comply with the KFL
metadata specifications. Non-normative material explains, expands on, or clarifies
the normative material, but it does not represent requirements for compliance.
Normative materials are explicitly identified as such; any material not so identified
can be assumed to be non-normative.
2.2. Requirement Wording Note
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and
“OPTIONAL” in this document are to be interpreted as described in RFC 2119 (Key
words for use in RFCs to Indicate Requirement Levels, Best Current Practice of the
IETF [Internet Engineering Task Force], 1997).
2.3. Additional Documentation
2.3.1.
3. Introduction
3.1. Purpose and Scope
The purpose of this document is to describe the elements of the Dublin Core Metadata
schema and their usage by the Kraemer Family Library in the [Digital Repository –
Shared Services]. For the Dublin Core specification, see
http://dublincore.org/specifications/.
3.2.
3.3. Acknowledgements
This draft borrows freely from:
• Dublin Core Metadata Element Set v.1.1
http://www.dublincore.org/documents/dces/
• PRISM (Publishing Requirements for Industry Standard Metadata) subset for the
Dublin Core Namespace v.2.0
http://www.prismstandard.org/specifications/2.0/PRISM2.0Errata09.zip
• CSU Core Data Dictionary v.1.1
3.4. Format
All the element definitions appear in a uniform format.
Element Attribute: Description
Name: Designation of the element.
Definition: Explanation of the meaning of each element term.
Standards Referenced: Metadata standards referenced in the description of the element:
• Dublin Core Metadata Initiative | Specifications [DC]
http://www.dublincore.org/specifications/
• Dublin Core Metadata Initiative | Type Vocabulary [DCMITYPE]
http://dublincore.org/documents/dcmi-type-vocabulary/
• Thesaurus of Geographic Names [TGN]
http://www.getty.edu/research/tools/vocabulary/tgn/index.html
• W3C Date and Time Formats [W3CDTF]
http://www.w3.org/TR/NOTE-datetime
• Internet Assigned Numbers Authority | MIME Media Types [MIME]
http://www.iana.org/assignments/media-types/
• Internet Engineering Task Force | Best Current Practice [RFC4646]
http://www.ietf.org/rfc/rfc4646.txt
Obligation: States whether the element is
• Required
• Recommended
• Optional
Recurrence: States whether the element may be repeated
• Repeatable
• Non-repeatable
Qualifiers: Lists terms, semantically similar to the element, used to narrow or refine
the meaning of an element.
Schema: Lists valid schema to be used in the element.
Audience: Lists intended audience for the element
• System
• Manager (repository manager, collection curator)
• Staff User
• End User
Simple DC Mapping: The simple Dublin Core element to which this element maps for
metadata sharing via OAI harvesting.
Input Guidelines: Provides guidance for entering and encoding values for the element
and its qualifiers.
Examples: Sample usage of the element.
Comment: Description or additional information on the use of the element.
4. Metadata Elements
4.1. Contributor
Name: Contributor
Definition: An entity involved in the creation or responsible for making
contributions to the resource.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: .role (Optional if not linked to a specific Contributor). Examples of
roles: advisor, committee member, chair, co-chair, editor, illustrator, etc.
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Contributor
Input Guidelines:
1. Enter each Contributor in a separate element.
2. The form of the name should be used consistently in all occurrences, across all
projects.
3. The role of the contributor may be entered with the qualifier Contributor.Role,
based on a locally created list of roles.
Examples:
Contributor: Reddy, Venkateshwar
Contributor.Role: Committee member
Comment: Examples of a Contributor include a person, an organization, or a service.
Typically, the name of a Contributor should be used to indicate the entity.
Implementation in DigiTool
XML Format
<contributor>role, name; rank, department, organization. (additional information) </contributor>
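The DigiTool XML format for Contributor shown above can be assembled from its parts. The following is a minimal Python sketch, not the DigiTool implementation: the function name, its parameters, and the way missing parts are skipped are assumptions made for illustration. Only the ordering "role, name; rank, department, organization. (additional information)" comes from this document.

```python
from xml.sax.saxutils import escape


def contributor_element(name, role=None, rank=None, department=None,
                        organization=None, note=None):
    """Build '<contributor>role, name; rank, department, organization. (note)</contributor>'.

    Hypothetical helper: parts that are not supplied are simply left out.
    """
    head = ", ".join(part for part in (role, name) if part)
    affiliation = ", ".join(part for part in (rank, department, organization) if part)
    text = head
    if affiliation:
        text += "; " + affiliation + "."
    if note:
        text += " (" + note + ")"
    # escape() protects &, < and > so the value stays well-formed XML.
    return "<contributor>" + escape(text) + "</contributor>"


print(contributor_element("Reddy, Venkateshwar", role="Committee member"))
```

Escaping the text before wrapping it in the element keeps names or departments that contain an ampersand from breaking the record.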
4.2. Coverage
Name: Coverage
Definition: The spatial or temporal topic of the resource, the spatial applicability
of the resource, or the jurisdiction under which the resource is relevant.
Standard Referenced: DC, TGN
Obligation: Recommended
Recurrence: Repeatable
Qualifiers:
.spatial (Required)
.temporal (Required)
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Coverage
Input Guidelines:
1. Use as a qualified element.
2. Use separate elements for each place or time period.
   a. Spatial characteristics may include geographic names, latitude/longitude, or
      other established geo-reference values.
   b. Temporal characteristics include those aspects of time that relate to the
      intellectual content of a resource and not its lifecycle.
3. Enter dates as YYYY/MM/DD.
   a. For a range of dates, enter dates as YYYY-YYYY or YYYY/MM-YYYY/MM or
      YYYY/MM/DD-YYYY/MM/DD.
   b. Use free text to input BCE dates (e.g. 200 BCE).
Examples:
Coverage.Spatial: 38° 50' 26'' N; 105° 2' 41'' W
Coverage.Temporal: 1924
Comment: Spatial topic and spatial applicability may be a named place or a location
specified by its geographic coordinates. Temporal topic may be a named period, date,
or date range. A jurisdiction may be a named administrative entity or a geographic
place to which the resource applies. Recommended best practice is to use a controlled
vocabulary such as the Thesaurus of Geographic Names [TGN]. Where appropriate, named
places or time periods can be used in preference to numeric identifiers such as sets
of coordinates or date ranges.
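The date guidelines for Coverage.Temporal can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, it assumes the year-only range is written with a hyphen (YYYY-YYYY), it assumes both ends of a range use the same granularity, and it treats any "<number> BCE" string as acceptable free text.

```python
import re
from datetime import date

# One point in time: YYYY, YYYY/MM or YYYY/MM/DD.
_POINT = r"\d{4}(?:/\d{2}(?:/\d{2})?)?"
_PATTERN = re.compile(rf"({_POINT})(?:-({_POINT}))?")
_BCE = re.compile(r"\d+ BCE")


def _is_real_date(value):
    """Check that the year/month/day numbers form a real calendar date."""
    parts = [int(p) for p in value.split("/")]
    parts += [1] * (3 - len(parts))  # default missing month/day to 1
    try:
        date(*parts)
    except ValueError:
        return False
    return True


def is_valid_coverage_date(value):
    """Return True for a single date, a date range, or a free-text BCE date."""
    if _BCE.fullmatch(value):
        return True
    match = _PATTERN.fullmatch(value)
    if match is None:
        return False
    start, end = match.group(1), match.group(2)
    if end is not None and start.count("/") != end.count("/"):
        return False  # assumption: both ends of a range share one granularity
    return _is_real_date(start) and (end is None or _is_real_date(end))


print(is_valid_coverage_date("1924"), is_valid_coverage_date("1924/13/01"))
```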
4.3. Creator
Name: Creator
Definition: An entity primarily responsible for making the resource.
Standard Referenced: DC
Obligation: Required, if applicable
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Creator
Input Guidelines:
1. If multiple people or entities are equally responsible for the intellectual or
artistic content of the resource, each person or entity must be listed in a separate
element.
2. If there are mixed responsibilities, the Creator element should be used for the
primary responsibility and the Contributor element should be used for the secondary
responsibility.
3. The form of name should be used consistently in all occurrences, across all
projects.
Examples:
Creator: Burnett, Brian
Creator: Office of Sustainability (individuals identified as Contributor with Role
defined)
Comment: Examples of a Creator include a person, an organization, or a service.
Typically, the name of a Creator should be used to indicate the entity.
4.4. Date
Name: Date
Definition: A point or period of time associated with an event in the lifecycle of
the resource.
Standard Referenced: DC, W3CDTF
Obligation: Required
Recurrence: Non-repeatable
Qualifiers:
.original = date analog original was created
.digital = date digital surrogate or version was created
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Date
Input Guidelines:
1. Analog item digitized
   a. .original = date analog original was created
   b. .digital = date digital surrogate was created
2. Born-digital item: the .original and .digital dates are the same
3. Reformatted digital item
   a. .original = date digital item was created
   b. .digital = date reformatted version was created
Examples:
Date.Original: 1965-06-15
Date.Digital: 2011-07-01
Comment: Date may be used to express temporal information at any level of
granularity. Recommended best practice is to use an encoding scheme, such as the
W3CDTF profile of ISO 8601 [W3CDTF].
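As an illustration of the W3CDTF recommendation, the following Python sketch checks the date-only forms of that profile (YYYY, YYYY-MM, YYYY-MM-DD). The function name is hypothetical, and the time-of-day forms that W3CDTF also allows are deliberately left out of this sketch.

```python
import re
from datetime import date

# Date-only W3CDTF granularities: year, year-month, or complete date.
_W3CDTF_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value):
    """Return True if value is a well-formed, real W3CDTF date (date-only forms)."""
    match = _W3CDTF_DATE.fullmatch(value)
    if match is None:
        return False
    # Missing month or day defaults to 1 so the calendar check can run.
    year, month, day = (int(part) if part else 1 for part in match.groups())
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


for sample in ("1965-06-15", "2011-07-01", "2011-02-30"):
    print(sample, is_w3cdtf_date(sample))
```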
4.5. Description
Name: Description
Definition: An account of the resource.
Standard Referenced: DC
Obligation: Required for public objects. Optional for unprocessed files.
Recurrence: Repeatable
Qualifiers:
.abstract (Required if applicable, Repeatable)
.tableofcontents (Optional, Non-repeatable)
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Description
Input Guidelines:
1. Use Description without a qualifier when the description is neither an abstract nor
a table of contents.
2. Enter descriptive text, remarks, and comments about the digital resource.
   a. Include information from all sources.
   b. Examples: description, technique, distinguishing features of the digital
      resource (see Source).
3. Use Description.abstract only for an existing abstract.
   a. Repeat .abstract if needed to fit full abstract in record – use for paragraph
      separations?
4. Use Description.tableofcontents when a table of contents, chapter or section list,
or list of works within a single file is available.
Examples:
Description:
Description.abstract:
Description.tableofcontents:
Comment: Description may include but is not limited to: an abstract, a table of
contents, a graphical representation, or a free-text account of the resource.
4.6. Format
Name: Format
Definition: The file format, physical medium, or dimensions of the resource.
Standard Referenced: DC, MIME
Obligation: Required
Recurrence: Non-repeatable
Qualifiers: .extent (Required, Repeatable) size or duration of digital resource
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Format
Input Guidelines:
1. The Format element is provided as part of the extracted technical metadata in
DigiTool.
2. Use Format to record the Internet Media Type (MIME).
   a. If the format of the digital resource is not yet registered as a MIME type,
      follow the MIME convention: use a broad category of object format (audio,
      video, application, etc.) as the first half, and the file name suffix usually
      attached to files of that format as the second half.
3. Use Format.extent to record the size and duration of the resource.
   a. List the file size in the format.extent element in terms of bytes instead of
      kilo- or megabytes.
   b. The extraction process in DigiTool will record the file size in bytes.
4. For audio and video formats, list duration (playing time) in a separate
format.extent field.
   a. The playing time should be listed as both a numeric value and a caption that is
      needed to interpret the numeric value.
   b. Use hour(s), minute(s), and second(s) as captions.
Examples:
Format:
Format.extent:
Comment: Examples of dimensions include size and duration. Recommended best practice
is to use a controlled vocabulary such as the list of Internet Media Types [MIME].
See relation.requires if software or hardware external to the resource is needed.
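The Format and Format.extent guidelines can be sketched in Python as follows. This is an illustration under stated assumptions, not what DigiTool's extraction does: the function names are hypothetical, Python's built-in `mimetypes` table stands in for the IANA registry, and the exact wording of the extent strings ("N bytes", "1 hour, 2 minutes") is an assumed rendering of the "numeric value plus caption" rule.

```python
import mimetypes
import os


def media_type(filename, category="application"):
    """Return a MIME type, falling back to '<category>/<suffix>' when unknown."""
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed:
        return guessed
    suffix = os.path.splitext(filename)[1].lstrip(".").lower()
    # Without a suffix there is nothing to build the second half from.
    return f"{category}/{suffix}" if suffix else None


def size_extent(path):
    """File size for Format.extent, in bytes rather than kilo- or megabytes."""
    return f"{os.path.getsize(path)} bytes"


def duration_extent(total_seconds):
    """Playing time for Format.extent as numeric values with captions."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = []
    for value, unit in ((hours, "hour"), (minutes, "minute"), (seconds, "second")):
        if value:
            parts.append(f"{value} {unit}{'' if value == 1 else 's'}")
    return ", ".join(parts) or "0 seconds"


print(media_type("photo.jpg"))
print(duration_extent(3725))
```

The `category` argument is left to the cataloger because the broad category of an unregistered format cannot be inferred from the file name alone.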
4.7. Identifier
Name: Identifier
Definition: An unambiguous reference to the resource within a given context.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Identifier
Input Guidelines:
1. Use separate Identifier elements to enter multiple identifiers.
2. PID – the Digital Asset Management System will generate a unique Identifier when
the digital object is ingested into the system.
3. A handle will be generated by the handle server for digital resources in DigiTool
(not on test server).
4. Batch ingest
   a. DCxml, MARCxml, or .csv – Identifier must be added to the metadata with the
      exact file name(s) (including file extension) of the file to be ingested.
      i. If multiple Identifiers are needed, this Identifier element must be the
         first in the order of Identifier elements for DigiTool to perform correct
         linking.
      ii. In DCxml and MARCxml this Identifier element is displayed in the object
         viewer. In .csv ingest, this Identifier is not displayed.
   b. When ingesting JPEG2000 with .csv files, an additional Identifier element will
      be added to the metadata with the exact file name of the digital master
      (including file extension).
5. Minimally, there must be an Identifier assigned by the metadata creator using the
guidelines below, an Identifier assigned by the Digital Asset Management System, and
an Identifier assigned by the handle system.
Examples: See Identifier components below.
Comment: Recommended best practice is to identify the resource by means of a string
conforming to a formal identification system.
Identifier Components
The Identifier will consist of a structured string of characters (alphabetic and numeric) requiring
a minimum of 3 components; each component must be a fixed length.
1. First Component = 4-character alphabetic abbreviation for the campus = CUCS
a. Must be unique in the repository
b. Used to identify the campus/institution.
2. Second Component = 4-character string (alphabetic and/or numeric) that is a
subcollection within the campus.
a. Does not have to be unique
b. If the string is numeric (e.g. representing a year), the string must be the
subcollection and precede the third component to avoid ambiguity.
c. Use “aaaa” as a placeholder if a subcollection is not necessary.
3. Third Component = 6-digit accession number unique to the resource within the context of
the first two components.
a. This number will begin with a digit other than zero.
b. Normally between ‘100001’ and ‘999999’
c. Must be 6 digits long and unique within the collection.
4. Additional Components – each will consist of 4-character strings (alpha/numeric)
a. May be added after the third component.
b. To aid in building more specific logical collections in DigiTool.
c. Does not have to be unique.
d. Determined at project level.
e. No restrictions on how many additional components may be added.
f. All additional components must be appended to the end of the required minimal
three components as described above.
Examples:
CUCSSN66100001
• CUCS = University of Colorado Colorado Springs
• SN66 = Student Newspaper, 1966
• 100001 = issue #1
CUCSUR41100001
• CUCS = University of Colorado Colorado Springs
• UR41 = Undergraduate Research Journal (URJ), Volume 4, Issue 1
• 100001 = first article (numbered as they appear in the table of contents)
These Identifiers will not be used for browsing by repository users. A list will be maintained by
the Digital Repository Coordinator.
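Because every component has a fixed length, an Identifier can be split and checked without separators. The Python sketch below illustrates the component rules above; the function names and the dictionary layout are hypothetical, and uniqueness within the repository or collection is not checked because it requires the maintained list.

```python
import re

# 4 letters (campus) + 4 alphanumerics (subcollection) + 6 digits not starting
# with zero (accession) + any number of 4-character additional components.
_IDENTIFIER = re.compile(
    r"(?P<campus>[A-Za-z]{4})"
    r"(?P<subcollection>[A-Za-z0-9]{4})"
    r"(?P<accession>[1-9][0-9]{5})"
    r"(?P<additional>(?:[A-Za-z0-9]{4})*)"
)


def parse_identifier(identifier):
    """Split an Identifier into its components or raise ValueError."""
    match = _IDENTIFIER.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a valid Identifier: {identifier!r}")
    extra = match.group("additional")
    return {
        "campus": match.group("campus"),
        "subcollection": match.group("subcollection"),
        "accession": int(match.group("accession")),
        "additional": [extra[i:i + 4] for i in range(0, len(extra), 4)],
    }


def build_identifier(accession, subcollection="aaaa", campus="CUCS", additional=()):
    """Assemble an Identifier; 'aaaa' is the placeholder subcollection."""
    identifier = f"{campus}{subcollection}{accession:06d}" + "".join(additional)
    parse_identifier(identifier)  # raises ValueError if any rule is broken
    return identifier


print(parse_identifier("CUCSSN66100001"))
print(build_identifier(100001, subcollection="UR41"))
```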
4.8. Language
Name: Language
Definition: A language of the resource – human language in which text is written or
spoken.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema: ISO 639-2
http://www.loc.gov/standards/iso639-2/php/English_list.php
Audience: Manager, Staff User, End User
Simple DC Mapping: Language
Input Guidelines:
1. If the digital resource contains more than one language, enter the additional
languages in separate Language fields or clearly separate each language value with a
semicolon followed by a space.
2. If special explanation is necessary to identify how language relates to the
digital resource, add text to the Description element to describe the situation.
Examples:
Language: english
Comment: Dublin Core - Recommended best practice is to use a controlled vocabulary
such as RFC 4646 [RFC4646]. Compare with DigiTool and possibility of using full
language names in English.
4.9. Publisher
Name: Publisher
Definition: An entity responsible for making the resource available.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Publisher
Input Guidelines:
1. University of Colorado Colorado Springs.
2. Colleges, departments, institutes, centers, etc. may also appear in repeated
Publisher element if they are responsible for making the digital resource available.
   a. Also publishing the original resource in paper.
   b. See list for standard form of department name.
Examples:
Publisher: University of Colorado Colorado Springs.
Publisher: University of Colorado Colorado Springs. Kraemer Family Library
Comment: Examples of a Publisher include a person, an organization, or a service.
Typically, the name of a Publisher should be used to indicate the entity.
4.10. Relation
Name: Relation
Definition: A related resource.
Standard Referenced: DC
Obligation: Recommended - depends on Qualifier
Recurrence: Depends on Qualifier
Qualifiers: Required – Qualifier describes relationship
.isPartOf (Req’d/Rep) describes physical or logical relation (e.g. chapter)
.hasPart (Req’d/Rep) includes related resource (e.g. song)
.isVersionOf (Req’d/Rep) version, edition, adaptation
.hasVersion (Opt’l/Rep) version, edition, or adaptation exists
.isFormatOf (Req’d/Rep) different format of content
.hasFormat (Opt’l/Rep) different format exists
.isReferencedBy (Opt’l/Rep)
.references (Opt’l/Rep) described resource references other resources
.isReplacedBy (Req’d/Non-r) resource superseded by related resource
.replaces (Req’d/Non-r) resource supersedes related resource
.isRequiredBy (Rec’d/Rep) is required by related resource to support its function,
delivery, or coherence of content
.requires (Rec’d/Rep) related resource required to support its function, delivery, or
coherence of content
.conformsTo (Req’d/Rep) reference to established standard to which the resource
conforms
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Relation
Input Guidelines:
1. Use separate Relation element to enter multiple relationships.
2. Include sufficient information in the Relation element to enable users to
identify, cite, and locate or link to related resource.
3. Do not repeat Source information in Relation element.
Examples:
Relation.isPartOf: UCCS Archives Collection CS:84/001
Relation.replaces: UCCS Master Plan…2005
Relation.isReferencedBy: McKay, D. 25 Years….
Comment: Recommended best practice is to identify the related resource by means of a
string conforming to a formal identification system. Information necessary to
describe, find, or link to a related resource.
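The qualifier rules for Relation lend themselves to a small lookup table. The following Python sketch is illustrative: the table copies the obligation and recurrence codes listed above, while the function name and the representation of a record as (element name, value) pairs are assumptions.

```python
from collections import Counter

# qualifier -> (obligation, repeatable), as listed in the Qualifiers row above.
RELATION_QUALIFIERS = {
    "isPartOf": ("Required", True),
    "hasPart": ("Required", True),
    "isVersionOf": ("Required", True),
    "hasVersion": ("Optional", True),
    "isFormatOf": ("Required", True),
    "hasFormat": ("Optional", True),
    "isReferencedBy": ("Optional", True),
    "references": ("Optional", True),
    "isReplacedBy": ("Required", False),
    "replaces": ("Required", False),
    "isRequiredBy": ("Recommended", True),
    "requires": ("Recommended", True),
    "conformsTo": ("Required", True),
}


def check_relations(elements):
    """Return a list of problems found in the Relation elements of a record."""
    problems = []
    counts = Counter()
    for name, _value in elements:
        element, _, qualifier = name.partition(".")
        if element != "Relation":
            continue
        if not qualifier:
            problems.append("Relation used without a qualifier")
        elif qualifier not in RELATION_QUALIFIERS:
            problems.append(f"unknown Relation qualifier: {qualifier}")
        else:
            counts[qualifier] += 1
    for qualifier, count in counts.items():
        if count > 1 and not RELATION_QUALIFIERS[qualifier][1]:
            problems.append(
                f"Relation.{qualifier} is non-repeatable but appears {count} times")
    return problems


print(check_relations([("Relation.isPartOf", "UCCS Archives Collection CS:84/001")]))
```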
4.11. Rights
Name: Rights
Definition: Information about rights held in and over the resource.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Rights
Input Guidelines:
1. The Rights element will minimally be a hyperlink to the rights statement relevant
to the resource.
2. Specific rights information may be added to the Rights element as free text when
conditions warrant.
3. Access?
Examples:
Rights: [link]
?Rights: Access is limited to University of Colorado Colorado Springs users.
Copyright information: [link]
Comment: Typically, rights information includes a statement about various property
rights associated with the resource, including intellectual property rights.
4.12. Source
Name: Source
Definition: A related resource from which the described resource is derived.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Source
Input Guidelines:
1. Input the ISSN, ISBN, or other international standard numbers assigned to the
analog original.
   a. File names, accession numbers, call numbers, or other identification schemes
      should be entered in the Identifier element.
2. Enter source information in order of importance.
   a. Describe the nature of the relationship between resources (“Excerpted from”,
      “Original”, etc.).
   b. Information about the physical condition of the physical object may be
      included.
      i. Use the physical unit of measurement most appropriate for the resource.
3. Use separate Source elements to identify source for multiple items.
4. If a single resource has multiple sources, separate with semicolons in a single
Source element.
Examples:
Source: Original: 35mm color slide
Comment: The described resource may be derived from the related resource in whole or
in part. Recommended best practice is to identify the related resource by means of a
string conforming to a formal identification system.
4.13. Subject
Name: Subject
Definition: The topic of the resource.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Staff User, End User
Simple DC Mapping: Subject
Input Guidelines:
1. The use of Subject is to be determined at the project level.
   a. Create project data dictionary including any controlled vocabulary or
      vocabulary used.
   b. If multiple controlled vocabularies are used for a digital resource, each
      vocabulary should be entered in separate elements. (identification TBD)
Examples:
Subject: Gallery of Contemporary Art (GoCA)
Comment: Typically, the subject will be represented using keywords, key phrases, or
classification codes. Recommended best practice is to use a controlled vocabulary. To
describe the spatial or temporal topic of the resource, use the Coverage element.
4.14. Title
Name: Title
Definition: A name given to the resource.
Standard Referenced: DC
Obligation: Required
Recurrence: Non-repeatable
Qualifiers: Title.alternative (Required if applicable, Repeatable) Any form of the
title used as a substitute or alternative to the formal title of a digital resource.
Can be used for
• Caption title
• Former title
• Spine title
• Collection title
• Series title
• Artist’s title
• Object name
• Translation of title
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Title
Input Guidelines:
1. Transcribe title from resource if available from the digital resource itself.
Otherwise follow standard cataloguing rules.
2. Title may be an identifying phrase or name supplied by the metadata creator,
project manager, or archivist.
   a. Same rules apply to Title.alternative.
   b. Does not need to be unique [untitled].
3. Remove initial article and append to end of title, following a comma, to aid title
sorting of results.
4. Capitalize only the first letter of the title and proper nouns contained within
the title.
5. Use the punctuation provided by the title or standard English for created titles.
Examples:
Title: The Scribe, v.#, n.#
Title.alternative: The Scribe
Comment: Typically, a Title will be a name by which the resource is formally known.
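Input guideline 3 for Title (move the initial article to the end of the title) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes English articles only ("a", "an", "the"); titles where the first word merely looks like an article still need a cataloger's judgment. The first letter of the remaining title is capitalized, in line with guideline 4.

```python
# English articles only; this list is an assumption of the sketch.
_ARTICLES = {"a", "an", "the"}


def move_initial_article(title):
    """Return the title with a leading article appended after a comma."""
    first, separator, rest = title.partition(" ")
    if separator and rest and first.lower() in _ARTICLES:
        # The remaining title now starts the string, so capitalize its first letter.
        return f"{rest[0].upper()}{rest[1:]}, {first}"
    return title


print(move_initial_article("The history of UCCS"))
print(move_initial_article("Annual report"))
```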
4.15. Type
Name: Type
Definition: The nature or genre of the resource.
Standard Referenced: DC, DCMITYPE
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Type
Input Guidelines:
1. Images of written language are assigned as Text.
2. Some digital resources may require more than one Type.
   a. Scanned page may include text and image
   b. Attached text and audio file
3. Digital representations of 3-D objects should be assigned “Image” rather than
“Physical Object”.
Examples:
Type: text
Type: image
Comment: Recommended best practice is to use a controlled vocabulary such as the DCMI
Type Vocabulary [DCMITYPE]. To describe the file format, physical medium, or
dimensions of the resource, use the Format element.
Summary Table
Element       Obligation                 Notes
Contributor   Recommended
Coverage      Recommended
Creator       Required (if applicable)
Date          Required (always)          Format: YYYYMMDD
Description   Required (always)          Optional for unprocessed files.
Format        Required (always)
Identifier    Required (always)
Language      Recommended
Publisher     Required (always)
Relation      Recommended                Depends on qualifier
Rights        Required (always)
Source        Recommended
Subject       Recommended                Subject vocabulary determined at project level.
Title         Required (always)          Optional for unprocessed files.
Type          Required (always)
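A record can be checked against the obligations stated in sections 4.1 to 4.15. The Python sketch below is a minimal illustration: the function name, the `public` flag, and the representation of a record as (element name, value) pairs are assumptions. Creator is left out because "Required, if applicable" cannot be decided mechanically.

```python
# Elements whose Obligation is stated as "Required" in sections 4.1 to 4.15.
REQUIRED = ("Date", "Format", "Identifier", "Publisher", "Rights", "Title", "Type")
# Description is required for public objects and optional for unprocessed files.
REQUIRED_FOR_PUBLIC = ("Description",)


def missing_required(record, public=True):
    """Return the required elements that have no non-empty value in the record."""
    # Qualified names such as Date.Original count toward their base element.
    present = {name.partition(".")[0] for name, value in record if value.strip()}
    needed = list(REQUIRED) + (list(REQUIRED_FOR_PUBLIC) if public else [])
    return sorted(element for element in needed if element not in present)


record = [
    ("Title", "Scribe"),
    ("Date.Original", "1965-06-15"),
    ("Date.Digital", "2011-07-01"),
    ("Format", "image/jpeg"),
    ("Identifier", "CUCSSN66100001"),
    ("Publisher", "University of Colorado Colorado Springs."),
    ("Rights", "[link]"),
    ("Type", "text"),
]
print(missing_required(record))
```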