2. Descriptive metadata Standards

The Future of MARC: dead or
reviving?
Rebecca Guenther
NYTSL Fall Program
Nov. 4, 2011
Overview of presentation
History of MARC
The current bibliographic framework
Efforts to evolve MARC
  XML formats
  Linked data explorations
  RDA changes
LC Bibliographic Framework Transition Initiative
What is MARC 21?
A syntax defined by an international standard for communications with 2 expressions:
  Classic MARC (MARC 2709)
  MARCXML
A data element set defined by content designation and semantics
Many data elements are defined by external content rules; a common misperception is that it is tied to AACR2
It does not specify internal storage and institutions do not store “MARC 21”
A set of 5 formats for different purposes: Bibliographic, Authority, Holdings, Classification, Community Information
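To make the "classic MARC" syntax concrete, here is a minimal sketch of the ISO 2709 layout: a 24-character leader, a directory of 12-character entries (tag, field length, starting offset), and the field data. The function names and the record values are invented for illustration, and the sketch assumes ASCII-only data so that character counts equal byte counts; it is not a full MARC 21 implementation.

```python
# Sketch of the ISO 2709 ("classic MARC") layout: leader, directory, data.
# Record values below are made up; ASCII only, so len() counts bytes.
FT, RT, SF = "\x1e", "\x1d", "\x1f"  # field terminator, record terminator, subfield delimiter


def build_record(fields):
    """Serialize (tag, body) pairs; each body excludes the field terminator."""
    directory, data = "", ""
    for tag, body in fields:
        chunk = body + FT
        # Directory entry: 3-char tag, 4-digit field length, 5-digit start offset.
        directory += f"{tag}{len(chunk):04d}{len(data):05d}"
        data += chunk
    base = 24 + len(directory) + 1  # leader + directory + directory terminator
    total = base + len(data) + 1    # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader + directory + FT + data + RT


def parse_record(raw):
    """Walk the directory and return control fields and data fields."""
    base = int(raw[12:17])          # leader positions 12-16: base address of data
    directory = raw[24:base - 1]    # stop before the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        tag = directory[i:i + 3]
        length = int(directory[i + 3:i + 7])
        start = int(directory[i + 7:i + 12])
        body = raw[base + start:base + start + length - 1]  # drop field terminator
        if tag < "010":             # control fields have no indicators or subfields
            fields.append((tag, body))
        else:
            indicators, *subs = body.split(SF)
            fields.append((tag, indicators, [(s[0], s[1:]) for s in subs]))
    return fields


record = build_record([
    ("001", "demo0001"),
    ("245", "10" + SF + "aExample title /" + SF + "cA. Author."),
])
print(parse_record(record))
```

The directory is what makes the format compact for exchange, and it is also where the fixed three-character tags and single-character subfield codes come from.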
The current bibliographic
environment







Billions of rich descriptive records in MARC
systems
Many national formats have been harmonized
with MARC 21
Integrated library systems support MARC
bibliographic, authority and holdings formats for
different functions
Wide sharing of records for 30+ years
OCLC is a major source of records
MARC records are being reused (sometimes
converted) and repackaged
Need to interact with descriptions in other
formats/syntaxes
MARC successes
Can carry data formulated by different cataloging rules and conventions
  Multiple descriptive rules, different principles and models
  Different subject thesauri
  Multiple languages and scripts
Cooperation in record exchange has resulted in widespread use and cost savings
Richness of MARC records supports multifaceted retrieval
  Coded data
  Parsed data
Problems with MARC






MARC 2709 syntax problems
Limitation of available fields, subfields,
indicator values, etc.
Redundant data (fixed vs. variable fields)
The longevity of the format complicates
reuse of data tags; redundancies have
built up over time
Ability to link is limited
Lack of explicit hierarchical levels
Efforts to streamline MARC 21
Take advantage of XML
  Increasingly use MARC 21 in an XML structure
  Take advantage of freely available XML tools
Develop simpler (but compatible) alternatives
  MODS and MADS
Allow for interoperability with different XML metadata schemas
  Assemble coordinated set of tools
MARCXML





MARCXML uses the MARC data element set in
an XML syntax
Lossless roundtrip conversions
Simple flexible XML schema, no need to
change when MARC 21 changes
Continuity with current data and flexible
transition options
Problems with limitations in tagging persist
http://lccn.loc.gov/2004012412/marcxml
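As a rough illustration of "the MARC data element set in an XML syntax", the sketch below reads a small MARCXML record with Python's standard library. The namespace is the MARCXML one; the sample record and the helper name `datafields` are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record: same tags, indicators and subfield codes as
# classic MARC, only the carrier syntax differs.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">demo0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example title /</subfield>
    <subfield code="c">A. Author.</subfield>
  </datafield>
</record>"""


def datafields(xml_text):
    """Return (tag, ind1, ind2, [(code, value), ...]) for each data field."""
    root = ET.fromstring(xml_text)
    result = []
    for field in root.findall("marc:datafield", NS):
        subs = [(sf.get("code"), sf.text)
                for sf in field.findall("marc:subfield", NS)]
        result.append((field.get("tag"), field.get("ind1"), field.get("ind2"), subs))
    return result


print(datafields(SAMPLE))
```

Because tags and codes travel as attribute values rather than element names, the schema does not need to change when MARC 21 adds a field, but the tagging limitations are carried over unchanged.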
MARC derivatives: MODS and
MADS









Attempts to deal with MARC limitations
Eliminates some of the problems with MARC
(e.g. lack of tags/subfield codes)
More user-friendly (uses language tags)
Repackages redundant data elements into one
Can carry hierarchical data
Less tied to cataloging rules
Highly compatible with MARC but simpler,
although retaining some richness
Widely implemented especially for digital
projects
Governed by Editorial Committee
Example: http://lccn.loc.gov/2004012412/mods
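To show what "language tags" and hierarchy look like in practice, here is a small sketch that turns the subfields of a MARC 245 field into a MODS titleInfo element. The mapping is deliberately simplified and is an assumption for illustration (only $a and $b are handled, with naive punctuation trimming); it is not the official MARC to MODS mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def marc_title_to_mods(subfields):
    """Map 245 subfields [(code, value), ...] to a <mods:mods> element.

    Simplified, hypothetical mapping: $a -> title, $b -> subTitle.
    """
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    for code, value in subfields:
        if code == "a":
            # Trim trailing ISBD punctuation left over from the MARC field.
            ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = value.rstrip(" /:")
        elif code == "b":
            ET.SubElement(title_info, f"{{{MODS_NS}}}subTitle").text = value.rstrip(" /")
    return mods


element = marc_title_to_mods([("a", "Example title :"), ("b", "a subtitle /")])
print(ET.tostring(element, encoding="unicode"))
```

The element names carry the meaning that MARC expresses through numeric tags and subfield codes, which is the sense in which MODS is more user-friendly.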
Related XML schemas: METS




METS
 A container/information package
 Wrapper for MARCXML and MODS descriptions
 Allows for additional technical and
preservation metadata
 Enables tracking of actions on the metadata
itself
Many use METS as a framework for digital
libraries and their metadata
Particularly useful for complex digital objects
Allows for reuse of rich descriptions
Experimentation with “Linked
data”





Library of Congress Authorities & Vocabularies service:
http://id.loc.gov
Allows both human-oriented and programmatic access to
LC authorities and vocabularies
Actionable URIs associated with concepts
First offering was Library of Congress Subject Headings,
then Names, MARC code lists, Thesaurus of Graphic
Materials, ISO 639-2, PREMIS vocabularies
Advantages





Facilitate development and maintenance process for
vocabularies
Expose vocabularies to wider communities
Experiment with Linked Data
Offer bulk downloads
Example:
http://id.loc.gov/authorities/sh85049843
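Programmatic access to an actionable URI might look like the sketch below, which prepares an HTTP request that asks for a machine-readable representation through the Accept header. That the service answers this particular media type is an assumption here, and the helper names are invented; only the concept URI comes from the slide.

```python
from urllib.request import Request, urlopen

CONCEPT_URI = "http://id.loc.gov/authorities/sh85049843"  # URI from the slide


def build_request(uri, media_type="application/rdf+xml"):
    """Prepare (but do not send) a request for a machine-readable form."""
    return Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type="application/rdf+xml", timeout=10):
    """Send the request. Needs network access; the returned format is
    whatever the service chooses to send for this Accept header."""
    with urlopen(build_request(uri, media_type), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


request = build_request(CONCEPT_URI)
print(request.full_url, request.get_header("Accept"))
```

The same URI serves a human-readable page in a browser, which is the "both human-oriented and programmatic" access the slide describes.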
Experimentation with Linked
Data
MADS in RDF
 MODS in RDF
 Linking vocabularies in id.loc.gov
with other external vocabularies
 PREMIS OWL ontology
 Integration between ontologies and
controlled vocabularies becomes
possible
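Linking a vocabulary term to an external vocabulary comes down to one statement of the form subject, predicate, object. The sketch below writes such a statement as an N-Triples line using the SKOS closeMatch property; the external concept URI is a placeholder on example.org, not a real vocabulary entry.

```python
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def to_ntriple(subject, predicate, obj):
    """Serialize one URI-to-URI statement as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


link = to_ntriple(
    "http://id.loc.gov/authorities/sh85049843",  # concept URI from the slides
    SKOS_CLOSE_MATCH,
    "http://example.org/vocab/concept/123",      # hypothetical external concept
)
print(link)
```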

MARC Changes for RDA
MARC community made many changes to
accommodate RDA
 In some cases RDA was more granular than
MARC and data elements had to be examined as
to whether such detail was needed
 Limitations in number of fields/subfields
prevented complete crosswalking
 Need for additional experimentation to
determine what needs to be accommodated
http://www.loc.gov/marc/RDAinMARC29-9-1211.html

Challenges in adapting MARC
for RDA






RDA was changing as MARC was revised
Not all MARC users will be using RDA
Continuity with current data is important
Not all RDA users will use the increased
granularity; tension between simpler vs.
more complex
Impact of FRBR
Financial constraints of too much change
and scarce resources
Specific RDA changes
RDA Content, Media, Carrier
  Fields 336, 337, 338
  Controlled vocabularies: codes or text
Carrier characteristics
  Additional values in 008
  New subfields in 340
  New fields for sound, video, digital
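The "codes or text" choice for fields 336, 337 and 338 can be sketched as follows. The terms, codes and $2 source codes shown are the usual ones for a printed volume; the display layout and the helper name `type_field` are assumptions made for illustration.

```python
# tag -> ($2 source code, term for $a, code for $b), for a printed volume
RDA_TYPE_FIELDS = {
    "336": ("rdacontent", "text", "txt"),
    "337": ("rdamedia", "unmediated", "n"),
    "338": ("rdacarrier", "volume", "nc"),
}


def type_field(tag, use_code=False):
    """Render a content/media/carrier field with either the term or the code."""
    source, term, code = RDA_TYPE_FIELDS[tag]
    value = f"$b {code}" if use_code else f"$a {term}"
    return f"{tag} ## {value} $2 {source}"


for tag in ("336", "337", "338"):
    print(type_field(tag))
    print(type_field(tag, use_code=True))
```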
Authority changes
Attributes of Names and Resources
Changes to Authority format for additional metadata about persons, families, organizations
  New fields for date, place, address, field of activity, occupation, gender, family information
Changes to Authority format for uniform titles (works or expressions)
  New fields for date, content type, language, form of work, medium of performance, key
  All elements for works/expressions also added to bibliographic
Other RDA changes
Relationships between resources
  Name to resource (RDA App I)
  Resource to resource (RDA App J)
  Name to name (RDA App K)
  Uses MARC relators or subfield $i
Production, publication, distribution
  Field 264 with designation of function in indicator
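Field 264 carries the function in its second indicator, so one repeatable field can hold production, publication, distribution, manufacture or copyright statements. A minimal sketch of reading that indicator, with invented field content and a hypothetical helper name:

```python
# Second-indicator values of field 264 and the function each designates.
FUNCTION_BY_INDICATOR = {
    "0": "production",
    "1": "publication",
    "2": "distribution",
    "3": "manufacture",
    "4": "copyright notice date",
}


def describe_264(ind2, subfields):
    """Label a 264 statement by function; subfields is [(code, value), ...]."""
    function = FUNCTION_BY_INDICATOR.get(ind2, "unknown function")
    statement = " ".join(value for _code, value in subfields)
    return f"{function}: {statement}"


print(describe_264("1", [("a", "Place :"), ("b", "Publisher,"), ("c", "2011.")]))
print(describe_264("4", [("c", "©2011")]))
```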
URIs in MARC records
Links to resources
  Field 856 for link to resource or related resource
  URIs available in numerous fields as a link to additional information, e.g. 505, 506, 583
Links to values
  Controlled vocabulary values may be identified by a URI
  id.loc.gov
  RDA vocabularies being established with URIs in Open Metadata Registry
  http://rdvocab.info/
URIs in MARC records
Do URIs need their own data element or are they self-identifying?
Data elements where needed
  Code lists (relators, countries, GACs, orgs)
  RDA controlled vocabs (e.g. 336, 337, 338)
  Fields with controlled lists (with $2)
  Headings
Approach (experimental)
  Use same subfield where data is now
  Both URI and textual data?
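If URIs share a subfield with textual values, software has to treat them as self-identifying. The sketch below shows one way that test could work, by checking for an http or https scheme; this rule and the helper names are assumptions for illustration, not part of any MARC specification.

```python
from urllib.parse import urlparse


def is_uri(value):
    """Treat a value as a URI if it parses with an http(s) scheme and a host."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def split_values(values):
    """Separate the URIs from the textual values found in one subfield."""
    uris = [v for v in values if is_uri(v)]
    texts = [v for v in values if not is_uri(v)]
    return uris, texts


print(split_values(["text", "http://id.loc.gov/authorities/sh85049843"]))
```

A dedicated subfield would avoid the guesswork, at the cost of one more data element in fields that are already short of available codes.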
Results of the RDA test





Feeling that MARC structure doesn’t allow for
taking full advantage of RDA
Not all RDA data elements have a distinct place
in MARC
RDA is element based; MARC groups elements
that can’t live independently
Concerns whether MARC can interoperate with
other metadata in a semantic web world
Limitations for showing relationships between
entities and applying FRBR model
Evolving the bibliographic
framework: issues to consider









Actionable vs. descriptive data
Parsed vs. text
Controlled/access vs. transcribed
Codes vs. words
Library vs. non-library traditions
My model vs. your model
Stability vs. change
Basic retrieval vs. scholar retrieval
Cost of change
Bibliographic Framework
Transition Initiative






Rethinking bibliographic control because of technological
and environmental changes
Content and packaging of RDA suggest that a different
carrier is needed to fully exploit it
Reevaluate use of scarce resources and provide efficiencies
in creating and sharing bibliographic metadata
Analyze present and future environment
Identify components of the bibliographic framework to
support users
Plan for an evolution to a future framework
Issues to be addressed






Determine aspects of MARC that should be
retained
Experiment with Semantic Web and Linked Data
technologies
Foster reuse of existing rich metadata
Allow for navigating relationships among entities
Explore risks of action and inaction and pace of
change
Plan for migrating existing metadata into a new
infrastructure
Components of a new
bibliographic framework




Based on Working Group on Bibliographic
Control and RDA test
Continue to support MARC during the
transition and as long as is needed
Broaden participation in a network of
resources and be able to link patrons to
all kinds of resources
Follow an open and transparent process
Requirements




Broad accommodation of content rules
and data models
Provide for types of data that accompany
or support bibliographic data, e.g.
holdings, preservation
Accommodate textual and linked data
with URIs
Reconsider the relationship between
internal storage, displays and input
screens
Requirements




Consider all sizes and types of libraries
Continue maintenance of MARC until no
longer necessary; minimize changes to
only those needed for RDA
Compatibility with existing records
Provide transformations from MARC 21 to
the new environment to enable
experimentation
General approach




Focus on the Web environment, Linked
Data and RDF
Integrate library data and other cultural
heritage data on the Web
Use of triplestores to provide more
options for storing and retrieving data
Allow the library environment to become
more readily understandable by data
creators and software developers
Explorations





Develop interaction scenarios in the
broader information community
Develop use cases to scope its boundaries
and interdependence with other
initiatives, e.g. PREMIS, METS
Develop ontologies for the description of
resources
Experiment collaboratively with new
models
Use existing partners for prototyping
Collaborations
Close contact with MARC format
partner institutions (national
libraries)
 Review and comment from MARC
advisory bodies (e.g. MARBI)
 Prototyping by networks and
vendors
 Input on modeling with general
resource description community

Timetable and next steps





Provide funding through a 2-year grant
Organize consultative groups and
prototyping activities
Develop models and scenarios
Assemble and review ontologies
Few real details on time frame
Community input
Individuals and institutions can
recommend members to serve on
the advisory or technical committee
 Join and post thoughts to the
bibliographic transition listserv
(bibframe@loc.gov)
 Comments will be publicly available

Likely characteristics of post-MARC








Web and linked data based
High level simple core ontologies
Modularized format that allows for extensions
Application builders can pick from ontologies and
extensions
There should be a way to keep all elements of
MARC
MARC to post-MARC could be lossy
Agnostic to cataloging rules
Ability to output in various syntaxes
Conclusions




MARC 21 has served the community well for
wide sharing of bibliographic metadata
Much effort will go into the new initiative
There are widely differing views
More questions than answers remain
 How much of MARC will be retained?
 Will the new format look like MODS, a
derivative, or will it be completely new?
 How will supporting data be accommodated?
 How will systems change?
 How long will it take?
Thank you!
Rebecca Guenther
rguenther52@gmail.com
http://www.meetyourdata.com