Metadata standards
Guidelines, data structures, and file
formats to facilitate reliability and
quality of description
INF 384 C, Spring 2009
Why create and follow metadata standards?
What kinds of standards are there?
How does this all work?
How do standards evolve?
The world of standards
A standard is any agreed-upon means of doing something.
Standards can be formally created and adopted or
merely customary.
With standards, products and processes have a certain
level of consistency and predictability that can make
production and use more efficient.
Goals of metadata standards
Metadata standards enable more reliable description. For
example, by agreeing to use separate fields to indicate first
names and last names of resource creators, displays of
search results by author can be properly alphabetized and
more easily read, no matter if first name or last name comes
first in the display.
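A minimal sketch of this idea in Python, assuming a hypothetical `Creator` record with separate `first_name` and `last_name` fields (the class, function names, and sample data are illustrative and not taken from any standard). Because the name parts are stored separately, the sort order and the display order can be chosen independently:

```python
from dataclasses import dataclass


@dataclass
class Creator:
    # Hypothetical record: each name part has its own field.
    first_name: str
    last_name: str


def sort_creators(creators):
    """Alphabetize by last name, then first name, ignoring case."""
    return sorted(
        creators,
        key=lambda c: (c.last_name.casefold(), c.first_name.casefold()),
    )


def display(creator, last_first=True):
    """Render a name in either display order."""
    if last_first:
        return f"{creator.last_name}, {creator.first_name}"
    return f"{creator.first_name} {creator.last_name}"


# Illustrative sample data.
creators = [
    Creator("Ada", "Lovelace"),
    Creator("Alan", "Turing"),
    Creator("Grace", "Hopper"),
]
ordered = sort_creators(creators)
last_first = [display(c, last_first=True) for c in ordered]
first_last = [display(c, last_first=False) for c in ordered]
```

Both lists come out in the same alphabetical order by last name; only the rendering differs. With a single free-text name field, the sort would depend on how each name happened to be typed.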
Reliable description enables the sharing of data across
different systems.
Types of standards
Elings and Waibel describe four types of metadata standards:
• Data structure (fields): MARC and EAD.
• Data content (values): AACR2 (RDA) and DACS.
• Data format: XML.
• Data exchange: Z39.50 and OAI.
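To make the first three categories concrete, here is a small Python sketch. It is an assumption-laden illustration, not an implementation of MARC, AACR2, or any real schema: the field names `title` and `creator`, the `record` element, and the sample values are all hypothetical. The field list plays the role of a data structure standard, the wording of the values is what a data content standard would govern, and XML is the data format. Data exchange (how two systems pass such records around) is not shown.

```python
import xml.etree.ElementTree as ET

# Data structure: which fields a record must have (hypothetical field names).
REQUIRED_FIELDS = ("title", "creator")


def check_structure(record):
    """Raise ValueError if the record lacks a required field."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")


def to_xml(record):
    """Data format: serialize the record's fields as a small XML document."""
    check_structure(record)
    root = ET.Element("record")
    for name in REQUIRED_FIELDS:
        ET.SubElement(root, name).text = record[name]
    return ET.tostring(root, encoding="unicode")


# Data content: the values themselves, worded as a content standard
# would prescribe (these sample values are made up).
record = {"title": "Museum informatics", "creator": "Example, Author"}
xml_text = to_xml(record)
```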
These are useful categories, but sometimes standards may straddle
them. You could say, for example, that MARC reflects AACR2 and
not the other way around (although MARC defines data fields in a
technical sense, AACR2 defines the content with which the fields
are populated and to some degree conceptually determines the
MARC fields; in practice these two become functionally
Multiple standards at work
A cataloger uses AACR2 to determine:
• That a book’s title should be part of its description.
• The wording, spelling, capitalization, and punctuation
of the title.
The cataloger uses MARC to record the title information
in a consistent form that computers can process.
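As a rough illustration of "a consistent form that computers can process," the sketch below models a heavily simplified stand-in for the MARC 245 (title statement) field. In MARC 21, the second indicator of field 245 records the number of nonfiling characters, so that a leading article such as "The " can be skipped when sorting. The `Field245` class, its attribute names, and the sample titles are assumptions made for this example; real MARC records carry more indicators and subfields.

```python
from dataclasses import dataclass


@dataclass
class Field245:
    """Simplified stand-in for a MARC 245 (title statement) field."""
    title: str                # subfield $a, worded per the content standard
    nonfiling_chars: int = 0  # second indicator: leading characters to skip
    tag: str = "245"

    def filing_title(self):
        """Return the title as it should be sorted (leading article skipped)."""
        return self.title[self.nonfiling_chars:]


# Illustrative sample titles.
fields = [
    Field245("The world of standards", nonfiling_chars=4),
    Field245("Metadata basics"),
    Field245("A guide to cataloging", nonfiling_chars=2),
]
sorted_titles = [
    f.title for f in sorted(fields, key=lambda f: f.filing_title().casefold())
]
```

The cataloger decides the wording of the title; the encoded field gives software enough structure to file "The world of standards" under W rather than T.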
Multiple standards at work
Two computer networks can use Z39.50 to determine
how to exchange their MARC catalog records.
The result? A user at Library A can search Library B’s
catalog and not discern a difference in the way that
information is structured and presented. It just works.
Developing and adopting standards
Organizations agree to adopt standards because the benefits
of creating products or services that work together can be
However, developing standards and forging that agreement
can be a difficult process.
For metadata content standards, using them can be
complicated, and there is plenty of room for interpretive
Content standards: considerations
Why are content standards so complicated? Because
documents are various!
Most content standards will try to implement a few
basic guidelines supplemented by rules and options for
special cases.
Ideally, the basic guidelines will be based on clearly
articulated goals and principles.
Example: RDA goals
RDA has articulated a concrete set of descriptive goals and principles.
A few goals:
• Enable description of any resource (not just printed materials).
• Align with the FRBR conceptual model (works, expressions, manifestations,
items) and its objectives (finding, selecting, understanding, and so on).
• Create content descriptions that can be used in multiple encodings and
• Retain backward compatibility with existing records.
Example: RDA Principles
One principle is that descriptions should reflect “the resource’s
representation of itself.”
This is a longstanding principle in library cataloging: where
possible, description = transcription.
This can be linked to the objective of finding known items: the
catalog description should match how the item is known to
others, which is most likely from the item itself.
Example: RDA guidelines
This principle of transcription underlies the basic
guideline for RDA titles, which is that the “title proper”
or primary title should come from the preferred source
of information, which for books is the title page.
While the wording comes from the title page, though,
the capitalization and punctuation are standardized for
all titles.
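The split between transcribed wording and standardized capitalization can be sketched in Python. The rule used here (capitalize only the first word) is a toy stand-in chosen for illustration; the actual guidelines are more detailed and, for instance, keep capitals on proper nouns, which this sketch would wrongly lowercase. The function name `transcribe_title` is hypothetical.

```python
def transcribe_title(title_page_text):
    """Keep the wording from the title page; standardize spacing and case.

    Toy rule: capitalize only the first word. Real guidelines also
    preserve capitals on proper nouns, which this sketch does not.
    """
    wording = " ".join(title_page_text.split())  # collapse layout whitespace
    return wording[:1].upper() + wording[1:].lower()


# The same wording, however it is typeset on the title page,
# yields the same recorded title.
recorded = transcribe_title("  MUSEUM   INFORMATICS ")
also_recorded = transcribe_title("Museum Informatics")
```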
Example: RDA special cases
What if...
• Some introductory words on the title page seem like they’re not really part of
the title (e.g., Walt Disney Presents Sleeping Beauty)?
• The title is given in two languages (e.g., Canadian Literature/Litterature
• There is a spelling mistake in the title?
• The document is a manifestation of a commonly known work but has a
slightly different title than most manifestations (e.g., William Shakespeare’s
• A subtitle appears under what seems to be the main title (e.g., Museum
Informatics an introductory textbook)?
• The title is over one paragraph long?
Keeping standards relevant
Standards are immediately out of date, of course.
RDA has been in development since 2004, as part of a
cooperative effort by U.S., U.K., Canadian, and Australian
library associations. These are tremendous efforts!
Particular institutions, such as the Library of Congress, will issue
their own rules for interpreting the standards, which smaller
organizations (such as the University of Texas) may or may
not choose to adopt.
Your mission
Complete your subject classification for next
week: introduction, classified structure,
alphabetical structure, and reflective essay.
A few notes on assignments, based on the
individual conferences, follow...
A few assignment notes
Brevity is nice for concept labels, but it’s more
important to specify the precise extent of the
concept clearly.
If you mean “taking pictures with a digital
camera,” don’t use the label “digital camera.”
If you’ve identified several synonymous terms for a concept,
select one term for the label. You can mention the others in a
usage note in the alphabetical structure.
Example usage note: "Water bugs is a synonym for this term. Class documents
that refer to water bugs here."
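One way to picture the relationship between a label and its synonyms is a lookup table from entry terms to the single preferred label. In this Python sketch the preferred label "Cockroaches" is a hypothetical choice made for the example (the label being illustrated is not stated here), and the names `PREFERRED_LABEL`, `USE_FOR`, and `resolve` are likewise invented.

```python
# Hypothetical preferred label; only "water bugs" comes from the usage note.
PREFERRED_LABEL = "Cockroaches"

# Synonyms listed in the usage note, each pointing to the one label.
USE_FOR = {"water bugs": PREFERRED_LABEL}


def resolve(term):
    """Map a term a document uses to the label it is classed under.

    Returns None when the term is neither the label nor a listed synonym.
    """
    key = term.strip().casefold()
    if key == PREFERRED_LABEL.casefold():
        return PREFERRED_LABEL
    return USE_FOR.get(key)


label_for_synonym = resolve("Water bugs")
label_for_unknown = resolve("Dragonflies")
```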
Non-subject concepts
Don’t include document attributes that aren’t subjects, such as forms or genres
(blogs, articles, books, diaries...).
You are creating a representation of a subject that can be used to organize
documents; you are not describing the types of documents in which users might
be interested.
Include in your classification: terms for concepts that relate to gardening, such
as types of plants (grasses, cacti, shrubs).
Do not include in your classification: Document types that list such plants
(plant databases, seed catalogs). However, you might use your classification to
categorize a cactus database with the Cacti concept...
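The distinction can be sketched as data: the classification holds only subject concepts, while a document's form is an attribute of the document itself. The `Document` class, its field names, and the `classify` function are hypothetical, introduced only to illustrate the point.

```python
from dataclasses import dataclass

# The classification contains subject concepts only.
SUBJECT_CONCEPTS = {"Grasses", "Cacti", "Shrubs"}


@dataclass
class Document:
    title: str
    form: str     # document type; an attribute of the document, not a concept
    subject: str  # what the document is about


def classify(doc):
    """Return the concept a document is classed under, based on its subject."""
    if doc.subject not in SUBJECT_CONCEPTS:
        raise ValueError(f"no concept for subject: {doc.subject}")
    return doc.subject


cactus_db = Document(title="Cactus database", form="database", subject="Cacti")
assigned = classify(cactus_db)
```

The cactus database lands under Cacti because of what it is about; "database" never needs to appear in the classification.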