Kraemer Family Library
University of Colorado Colorado Springs
Minimum Metadata Standard
Draft Version 1.2 (March 2012)

1. Status

1.1. Document Status
Draft Comments Requested Draft Released

1.2. Document Location
\\Columbia\dept\ADR\Metadata Standards\MinimumStandardv1_5.docx

1.3. Version History

Version Release Number   Date        Editor      Description
1.0                      6/14/2011   Mary Rupp   Created document
1.2                      3/9/2012    Mary Rupp   Modifications to metadata formats

2. Documentation Structure

2.1. Normative and Non-normative Sections
Normative material describes the element names, attributes, formats, and element contents that are required for content or systems to comply with the KFL metadata specifications. Non-normative material explains, expands on, or clarifies the normative material, but it does not represent requirements for compliance. Normative materials are explicitly identified as such; any material not so identified can be assumed to be non-normative.

2.2. Requirement Wording Note
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 (Key words for use in RFCs to Indicate Requirement Levels, Best Current Practice of the IETF [Internet Engineering Task Force], 1997).

2.3. Additional Documentation

2.3.1.

3. Introduction

3.1. Purpose and Scope
The purpose of this document is to describe the elements of the Dublin Core Metadata schema and their usage by the Kraemer Family Library in the [Digital Repository – Shared Services]. For the Dublin Core specification, see http://dublincore.org/specifications/.

3.2.

3.3.
Acknowledgements
This draft borrows freely from:
    Dublin Core Metadata Element Set v.1.1 http://www.dublincore.org/documents/dces/
    PRISM (Publishing Requirements for Industry Standard Metadata) subset for the Dublin Core Namespace v.2.0 http://www.prismstandard.org/specifications/2.0/PRISM2.0Errata09.zip
    CSU Core Data Dictionary v.1.1

3.4. Format
All the element definitions appear in a uniform format.

Element Attribute: Description
Name: Designation of the element.
Definition: Explanation of the meaning of each element term.
Standards Referenced: Metadata standards referenced in the description of the element:
    Dublin Core Metadata Initiative | Specifications [DC] http://www.dublincore.org/specifications/
    Dublin Core Metadata Initiative | Type Vocabulary [DCMITYPE] http://dublincore.org/documents/dcmi-type-vocabulary/
    Thesaurus of Geographic Names [TGN] http://www.getty.edu/research/tools/vocabulary/tgn/index.html
    W3C Date and Time Formats [W3CDTF] http://www.w3.org/TR/NOTE-datetime
    Internet Assigned Numbers Authority | MIME Media Types [MIME] http://www.iana.org/assignments/media-types/
    Internet Engineering Task Force | Best Current Practice [RFC4646] http://www.ietf.org/rfc/rfc4646.txt
Obligation: States whether the element is:
    Required
    Recommended
    Optional
Recurrence: States whether the element may be repeated:
    Repeatable
    Non-repeatable
Qualifiers: Lists terms, semantically similar to the element, used to narrow or refine the meaning of an element.
Schema: Lists valid schema to be used in the element.
Audience: Lists the intended audience for the element:
    System
    Manager (repository manager, collection curator)
    Staff User
    End User
Simple DC Mapping: The simple Dublin Core element to which this element maps for metadata sharing via OAI harvesting.
Input Guidelines: Provides guidance for entering and encoding values for the element and its qualifiers.
Examples: Sample usage of the element.
Comment: Description or additional information on the use of the element.

4. Metadata Elements

4.1.
Contributor

Name: Contributor
Definition: An entity involved in the creation or responsible for making contributions to the resource.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: .role (Optional if not linked to a specific Contributor)
    Examples: advisor, committee member, chair, co-chair, editor, illustrator, etc.
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Contributor
Input Guidelines:
    1. Enter each Contributor in a separate element.
    2. The form of the name should be used consistently in all occurrences, across all projects.
    3. The role of the contributor may be entered with the qualifier Contributor.Role, based on a locally created list of roles.
Examples:
    Contributor: Reddy, Venkateshwar
    Contributor.Role: Committee member
Comment: Examples of a Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity.

Implementation in DigiTool XML Format
<contributor>role, name; rank, department, organization. (additional information) </contributor>

4.2. Coverage

Name: Coverage
Definition: The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.
Standard Referenced: DC, TGN
Obligation: Recommended
Recurrence: Repeatable
Qualifiers:
    .spatial (Required)
    .temporal (Required)
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Coverage
Input Guidelines:
    1. Use as a qualified element.
    2. Use separate elements for each place or time period.
        a. Spatial characteristics may include geographic names, latitude/longitude, or other established geo-reference values.
        b. Temporal characteristics include those aspects of time that relate to the intellectual content of a resource and not its lifecycle.
    3. Enter dates YYYY/MM/DD.
        a. For a range of dates, enter dates as YYYY-YYYY or YYYY/MM-YYYY/MM or YYYY/MM/DD-YYYY/MM/DD.
        b.
           Use free text to input BCE dates (e.g. 200 BCE).
Examples:
    Coverage.Spatial: 38° 50' 26'' N; 105° 2' 41'' W
    Coverage.Temporal: 1924
Comment: Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Recommended best practice is to use a controlled vocabulary such as the Thesaurus of Geographic Names [TGN]. Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges.

4.3. Creator

Name: Creator
Definition: An entity primarily responsible for making the resource.
Standard Referenced: DC
Obligation: Required, if applicable
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Creator
Input Guidelines:
    1. If multiple people or entities are equally responsible for the intellectual or artistic content of the resource, each person or entity must be listed in a separate element.
    2. If there are mixed responsibilities, the Creator element should be used for the primary responsibility and the Contributor element should be used for the secondary responsibility.
    3. The form of name should be used consistently in all occurrences, across all projects.
Examples:
    Creator: Burnett, Brian
    Creator: Office of Sustainability (individuals identified as Contributor with Role defined)
Comment: Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.

4.4. Date

Name: Date
Definition: A point or period of time associated with an event in the lifecycle of the resource.
Standard Referenced: DC, W3CDTF
Obligation: Required
Recurrence: Non-repeatable
Qualifiers:
    .original = date analog original was created
    .digital = date digital surrogate or version was created
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Date
Input Guidelines:
    1. Analog item digitized
        a. .original = date analog original was created
        b. .digital = date digital surrogate was created
    2. Born Digital item - .original and .digital dates same
    3. Reformatted Digital item
        a. .original = date digital item was created
        b. .digital = date reformatted version was created
Examples:
    Date.Original: 1965-06-15
    Date.Digital: 2011-07-01
Comment: Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601 [W3CDTF].

4.5. Description

Name: Description
Definition: An account of the resource.
Standard Referenced: DC
Obligation: Required for public objects. Optional for unprocessed files.
Recurrence: Repeatable
Qualifiers:
    .abstract (Required if applicable, Repeatable)
    .tableofcontents (Optional, Non-repeatable)
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Description
Input Guidelines:
    1. Use Description without qualifier when the description is neither an abstract nor a table of contents.
    2. Enter descriptive text, remarks, and comments about the digital resource.
        a. Include information from all sources
        b. Examples: description, technique, distinguishing features of the digital resource (see Source)
    3. Use Description.Abstract only for existing abstract.
        a. Repeat .abstract if needed to fit full abstract in record – use for paragraph separations?
    4. Use Description.tableofcontents when table of contents, chapter or section list, or list of works within single file is available.
Examples:
    Description:
    Description.abstract:
    Description.tableofcontents:
Comment: Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource.

4.6.
Format

Name: Format
Definition: The file format, physical medium, or dimensions of the resource.
Standard Referenced: DC, MIME
Obligation: Required
Recurrence: Non-repeatable
Qualifiers: .extent (Required, Repeatable) size or duration of digital resource
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Format
Input Guidelines:
    1. The Format element is provided as part of the extracted technical metadata in DigiTool.
    2. Use Format to record the Internet Media Type (MIME).
        a. If the format of the digital resource is not yet registered as a MIME type, follow the MIME convention: use a broad category of object format (audio, video, application, etc.) as the first half, and the file name suffix usually attached to files of that format as a brief identifier for the second half.
    3. Use Format.extent to record size and duration of resource.
        a. List the file size in the format.extent element in bytes rather than kilobytes or megabytes.
        b. The extraction process in DigiTool will record the file size in bytes.
    4. For audio and video formats, list duration (playing time) in a separate format.extent field.
        a. The playing time should be listed as both a numeric value and a caption that is needed to interpret the numeric value.
        b. Use hour(s), minute(s), and second(s) as captions.
Examples:
    Format:
    Format.extent:
Comment: Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME]. See relation.requires if software or hardware external to the resource is needed.

4.7. Identifier

Name: Identifier
Definition: An unambiguous reference to the resource within a given context.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: System, Manager, Staff User, End User
Simple DC Mapping: Identifier
Input Guidelines:
    1. Use separate Identifier elements to enter multiple identifiers.
    2.
       PID – the Digital Asset Management System will generate a unique Identifier when the digital object is ingested into the system.
    3. A handle will be generated by the handle server for digital resources in DigiTool (not on test server).
    4. Batch ingest
        a. DCxml, MARCxml, or .csv – an Identifier must be added to the metadata with the exact file name(s) (including file extension) of the file to be ingested.
            i. If multiple Identifiers are needed, this Identifier element must be the first in the order of Identifier elements for DigiTool to perform correct linking.
            ii. In DCxml and MARCxml this Identifier element is displayed in the object viewer. In .csv ingest, this Identifier is not displayed.
        b. When ingesting JPEG2000 with .csv files, an additional Identifier element will be added to the metadata with the exact file name of the digital master (including file extension).
    5. Minimally, there must be an Identifier assigned by the metadata creator using the guidelines below, an Identifier assigned by the Digital Asset Management System, and an Identifier assigned by the handle system.
Examples: See Identifier components below.
Comment: Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.

Identifier Components
The Identifier will consist of a structured string of characters (alphabetic and numeric) requiring a minimum of 3 components; each component must be a fixed length.
    1. First Component = 4-character alphabetic abbreviation for the campus = CUCS
        a. Must be unique in the repository
        b. Used to identify the campus/institution.
    2. Second Component = 4-character string (alphabetic and/or numeric) that identifies a subcollection within the campus.
        a. Does not have to be unique
        b. If the string is numeric (e.g. representing a year), the string must be the subcollection and precede the third component to avoid ambiguity.
        c. Use “aaaa” as a placeholder if a subcollection is not necessary.
    3.
       Third Component = 6-digit accession number unique to the resource within the context of the first two components.
        a. This number will begin with a digit other than zero.
        b. Normally between ‘100001’ and ‘999999’
        c. Must be 6 digits long and unique within the collection.
    4. Additional Components – each will consist of 4-character strings (alpha/numeric)
        a. May be added after the third component.
        b. To aid in building more specific logical collections in DigiTool.
        c. Does not have to be unique.
        d. Determined at project level.
        e. No restrictions on how many additional components may be added.
        f. All additional components must be appended to the end of the required minimal three components as described above.
Examples:
    CUCSSN66100001
        CUCS = University of Colorado Colorado Springs
        SN66 = Student Newspaper, 1966
        100001 = issue #1
    CUCSUR41100001
        CUCS = University of Colorado Colorado Springs
        UR41 = Undergraduate Research Journal (URJ), Volume 4, Issue 1
        100001 = first article (numbered as appear in table of contents)
These Identifiers will not be used for browsing by repository users. A list will be maintained by the Digital Repository Coordinator.

4.8. Language

Name: Language
Definition: A language of the resource – human language in which text is written or spoken.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema: ISO 639-2 http://www.loc.gov/standards/iso639-2/php/English_list.php
Audience: Manager, Staff User, End User
Simple DC Mapping: Language
Input Guidelines:
    1. If the digital resource contains more than one language, enter the additional languages in separate Language fields or clearly separate each language value with a semicolon followed by a space.
    2. If special explanation is necessary to identify how language relates to the digital resource, add text to the Description element to describe the situation.
Examples:
    Language: english
Comment: Dublin Core - Recommended best practice is to use a controlled vocabulary such as RFC 4646 [RFC4646]. Compare with DigiTool and possibility of using full language names in English.

4.9. Publisher

Name: Publisher
Definition: An entity responsible for making the resource available.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Publisher
Input Guidelines:
    1. University of Colorado Colorado Springs.
    2. Colleges, departments, institutes, centers, etc. may also appear in repeated Publisher element if they are responsible for making the digital resource available.
        a. Also publishing the original resource in paper.
        b. See list for standard form of department name.
Examples:
    Publisher: University of Colorado Colorado Springs.
    Publisher: University of Colorado Colorado Springs. Kraemer Family Library
Comment: Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.

4.10. Relation

Name: Relation
Definition: A related resource.
Standard Referenced: DC
Obligation: Recommended - depends on Qualifier
Recurrence: Depends on Qualifier
Qualifiers: Required – Qualifier describes relationship
    .isPartOf (Req’d/Rep) describes physical or logical relation (e.g. chapter)
    .hasPart (Req’d/Rep) includes related resource (e.g.
    song)
    .isVersionOf (Req’d/Rep) version, edition, adaptation
    .hasVersion (Opt’l/Rep) version, edition, or adaptation exists
    .isFormatOf (Req’d/Rep) different format of content
    .hasFormat (Opt’l/Rep) different format exists
    .isReferencedBy (Opt’l/Rep)
    .references (Opt’l/Rep) described resource references other resources
    .isReplacedBy (Req’d/Non-r) resource superseded by related resource
    .replaces (Req’d/Non-r) resource supersedes related resource
    .isRequiredBy (Rec’d/Rep) is required by related resource to support its function, delivery, or coherence of content
    .requires (Rec’d/Rep) related resource required to support its function, delivery, or coherence of content
    .conformsTo (Req’d/Rep) reference to established standard to which the resource conforms
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Relation
Input Guidelines:
    1. Use separate Relation elements to enter multiple relationships.
    2. Include sufficient information in the Relation element to enable users to identify, cite, and locate or link to the related resource.
    3. Do not repeat Source information in the Relation element.
Examples:
    Relation.isPartOf: UCCS Archives Collection CS:84/001
    Relation.replaces: UCCS Master Plan…2005
    Relation.isReferencedBy: McKay, D. 25 Years….
Comment: Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system. Information necessary to describe, find, or link to a related resource.

4.11. Rights

Name: Rights
Definition: Information about rights held in and over the resource.
Standard Referenced: DC
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Rights
Input Guidelines:
    1. The Rights element will minimally be a hyperlink to the rights statement relevant to the resource.
    2. Specific rights information may be added to the Rights element as free text when conditions warrant.
    3. Access?
Examples:
    Rights: [link]
    ?Rights: Access is limited to University of Colorado Colorado Springs users.
    Copyright information: [link]
Comment: Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.

4.12. Source

Name: Source
Definition: A related resource from which the described resource is derived.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Source
Input Guidelines:
    1. Input the ISSN, ISBN, or other international standard numbers assigned to the analog original.
        a. File names, accession numbers, call numbers, or other identification schemes should be entered in Identifier element.
    2. Enter source information in order of importance.
        a. Describe the nature of the relationship between resources: “Excerpted from”, “Original”, etc.
        b. Information about the physical condition of the physical object may be included.
            i. Use the physical unit of measurement most appropriate for the resource.
    3. Use separate Source elements to identify source for multiple items.
    4. If a single resource has multiple sources, separate with semicolons in single Source element.
Examples:
    Source: Original: 35mm color slide
Comment: The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.

4.13. Subject

Name: Subject
Definition: The topic of the resource.
Standard Referenced: DC
Obligation: Recommended
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Staff User, End User
Simple DC Mapping: Subject
Input Guidelines:
    1. The use of Subject is to be determined at the project level.
        a. Create project data dictionary including any controlled vocabulary or vocabulary used.
        b. If multiple controlled vocabularies are used for a digital resource, each vocabulary should be entered in separate elements.
           (identification TBD)
Examples:
    Subject: Gallery of Contemporary Art (GoCA)
Comment: Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element.

4.14. Title

Name: Title
Definition: A name given to the resource.
Standard Referenced: DC
Obligation: Required
Recurrence: Non-repeatable
Qualifiers: Title.alternative (Required if applicable, Repeatable)
    Any form of the title used as a substitute or alternative to the formal title of a digital resource. Can be used for:
        Caption title
        Former title
        Spine title
        Collection title
        Series title
        Artist’s title
        Object name
        Translation of title
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Title
Input Guidelines:
    1. Transcribe the title from the digital resource itself if available. Otherwise follow standard cataloguing rules.
    2. Title may be an identifying phrase or name supplied by the metadata creator, project manager, or archivist.
        a. Same rules apply to Title.alternative
        b. Does not need to be unique [untitled]
    3. Remove initial article and append to end of title, following a comma, to aid title sorting of results.
    4. Capitalize only the first letter of the title and proper nouns contained within the title.
    5. Use the punctuation provided by the title or standard English for created titles.
Examples:
    Title: The Scribe, v.#, n.#
    Title.alternative: The Scribe
Comment: Typically, a Title will be a name by which the resource is formally known.

4.15. Type

Name: Type
Definition: The nature or genre of the resource.
Standard Referenced: DC, DCMITYPE
Obligation: Required
Recurrence: Repeatable
Qualifiers: None
Schema:
Audience: Manager, Staff User, End User
Simple DC Mapping: Type
Input Guidelines:
    1. Images of written language are assigned as Text.
    2. Some digital resources may require more than one Type
        a.
           Scanned page may include text and image
        b. Attached text and audio file
    3. Digital representations of 3-D objects should be assigned “Image” rather than “Physical Object”.
Examples:
    Type: text
    Type: image
Comment: Recommended best practice is to use a controlled vocabulary such as the DCMI Type Vocabulary [DCMITYPE]. To describe the file format, physical medium, or dimensions of the resource, use the Format element.

Summary Table

Element       Required          Required   Recommended   Optional   Notes
              (If Applicable)   (Always)
Contributor                                X
Coverage                                   X
Creator       X
Date                            X                                   Format: YYYY-MM-DD
Description                     X                                   Optional for unprocessed files.
Format                          X
Identifier                      X
Language                                   X
Publisher                       X
Relation                                   X                        Depends on qualifier
Rights                          X
Source                                     X
Subject                                    X                        Subject vocabulary determined at project level.
Title                           X
Type                            X

Note with no element identified: Optional for unprocessed files.
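
The Identifier Components rules in section 4.7 can be checked mechanically. The following is a minimal Python sketch, not part of the standard or of DigiTool. It assumes that the components are concatenated with no separator, as in the examples CUCSSN66100001 and CUCSUR41100001, and that any four letters are acceptable as a campus abbreviation. The names IDENTIFIER_PATTERN, LocalIdentifier, and parse_identifier are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple, Tuple

# Assumption: components are concatenated without separators, as in the
# examples CUCSSN66100001 and CUCSUR41100001.
IDENTIFIER_PATTERN = re.compile(
    r"(?P<campus>[A-Za-z]{4})"             # first component: 4 letters (e.g. CUCS)
    r"(?P<subcollection>[A-Za-z0-9]{4})"   # second component: 4 letters and/or digits
    r"(?P<accession>[1-9][0-9]{5})"        # third component: 6 digits, first digit not zero
    r"(?P<extra>(?:[A-Za-z0-9]{4})*)"      # zero or more additional 4-character components
)


class LocalIdentifier(NamedTuple):
    """Hypothetical container for the parts of a locally assigned Identifier."""
    campus: str
    subcollection: str
    accession: str
    additional: Tuple[str, ...]


def parse_identifier(value: str) -> LocalIdentifier:
    """Split an Identifier into its components, or raise ValueError."""
    match = IDENTIFIER_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid local identifier: {value!r}")
    extra = match.group("extra")
    # Cut the trailing part into fixed-length 4-character components.
    additional = tuple(extra[i:i + 4] for i in range(0, len(extra), 4))
    return LocalIdentifier(
        match.group("campus"),
        match.group("subcollection"),
        match.group("accession"),
        additional,
    )


if __name__ == "__main__":
    print(parse_identifier("CUCSSN66100001"))
    print(parse_identifier("CUCSUR41100001"))
```

Because every component has a fixed length, a single anchored pattern is enough to split the string unambiguously. Uniqueness of the accession number within a collection cannot be checked from the string alone and would still need the list maintained by the Digital Repository Coordinator.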