Lifecycle Metadata for Digital Objects
November 6, 2006
Descriptive Metadata: “Modeling the World”
Descriptive metadata for
what?
WWW: now seen as the ONE place to
find everything
Descriptive metadata provides:
  Unique identification for a resource
  Information permitting evaluation/selection of a resource
  Information describing all “essential properties”
WWW: How to find things
What does search mean on the WWW?
How to support its multiple purposes?
  Failure of search engines to render precise results (problems of scale; is this true?)
  Failure of HTML metatags (spamming)
Solution
  local expert cataloging (providing access points)
  making local cataloging available
  remote free-text searching (inferring access points)
Dublin Core and its limitations
Warwick Framework and RDF
Universal Semantic Web
Some metadata examples
Individual objects (Dublin Core and its
derivatives)
Multimedia and/or complex objects
(METS/MPEG21)
Books and other chunks of information
(MARC)
Finding aids (EAD)
Semantic Web
Berners-Lee’s vision for the Web:
basically machine-understandable
metadata about meaning for everything
on the Web
http://www.semaview.com/d/Semweb_Illustrated.pdf
Aside on cataloging
Cataloging systems as relatively static: relationships remained tacit and externally specified
  Classification systems
  Controlled vocabularies
  Name authorities
  Note all of these can be represented in XML as specific namespaces (MARC, MODS, etc.)
New methods aren’t that different: ontologies for the Semantic Web are also namespaces, but ones that are much more specific about actions
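As an illustration of vocabularies expressed as XML namespaces, the following is a minimal sketch that builds a record from namespace-qualified Dublin Core elements. The `record` wrapper element, the `build_dc_record` helper, and the sample values are assumptions made for this example; only the Dublin Core element set namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return a <record> element whose children are namespace-qualified DC elements.

    `fields` is a list of (element_name, value) pairs. The <record> wrapper
    is a hypothetical container, not part of Dublin Core itself.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Sample values are invented for illustration.
record = build_dc_record([
    ("identifier", "urn:example:object-001"),
    ("title", "Sample digital object"),
    ("subject", "Metadata"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because each element name carries its namespace, a consumer can tell a Dublin Core `title` apart from a `title` defined by any other vocabulary in the same record.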
Ontologies
Like previous classification systems, they are
being built by hand
General (Cyc) and domain-specific (especially for
B to B, web services)
Ontologies establish a joint terminology between
members of a community of interest
Ontologies specify domain knowledge in terms of
formal logic that includes actions by and among
entities
Ontologies will be used to guide extraction of
semantic content from texts (and perhaps
automatic generation of metadata)
Topic Maps
Representation of information using topics,
associations, and occurrences
Note how this “triples” representation fits
well with RDF (entity, relationship, entity)
An XML representation: XTM
An (older) ISO standard: ISO/IEC
13250:2003
Related to ontologies and mind maps;
designed to “map” semantic regions
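The fit between topics, associations, occurrences and RDF-style triples can be sketched in a few lines. This is a toy in-memory model, not XTM or any real Topic Maps API; the `Triple` type, the `about` helper, the predicate names, and the example.org URL are all assumptions made for illustration.

```python
from collections import namedtuple

# A triple in the RDF sense: (entity, relationship, entity).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical sample data. The first two triples play the role of
# associations between topics; the third plays the role of an occurrence,
# linking a topic to an information resource.
triples = [
    Triple("Puccini", "composed", "Tosca"),
    Triple("Tosca", "premiered_in", "Rome"),
    Triple("Tosca", "occurrence", "http://example.org/tosca-libretto"),
]


def about(topic, store):
    """Return every triple in which `topic` appears as subject or object."""
    return [t for t in store if t.subject == topic or t.object == topic]


for t in about("Tosca", triples):
    print(t.subject, t.predicate, t.object)
```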
Web Services
How to provide processing services over the
WWW: XML and HTTP infrastructure passing
remote procedure calls
UDDI (Universal Description, Discovery and Integration) is the registry of services
WSDL (Web Services Description Language)
allows “advertisement” of services (in XML, of
course) in the UDDI registry
SOAP (Simple Object Access Protocol) is the XML
wrapper for requests sent to services
Example: DC metadata registry:
http://dublincore.org/dcregistry/
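To make the “XML wrapper” role of SOAP concrete, here is a minimal sketch that builds a SOAP 1.1 envelope around a request. The service namespace, the `searchRegistry` operation, and its `term` parameter are hypothetical; they do not describe the DC registry or any real service, whose actual operations would be defined in its WSDL. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/registry"  # hypothetical service namespace
ET.register_namespace("soap", SOAP_NS)


def build_soap_request(operation, params):
    """Wrap a remote procedure call in a SOAP Envelope/Body and return it as text."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The call itself: one element named after the operation,
    # with one child element per parameter.
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter names.
request_xml = build_soap_request("searchRegistry", {"term": "creator"})
print(request_xml)
```

In practice the resulting text would be sent as the body of an HTTP POST to the service endpoint advertised in its WSDL.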
Does what we know fit into
this?
DC and derivatives are aimed at the single object
(though not always used for it) and are
frequently used in WWW contexts (cf. Warwick
Framework ≈ RDF namespaces)
EAD describes descriptions of aggregate chunks
of information (chunked in terms of “series” or
“collections”) but can describe single objects
MARC/MODS describes aggregate chunks of
information (chunked in the form of “books”)
METS and MPEG21 are frames for multiple and
multimedia objects
Granularity
Granularity governs the level at which
metadata can be descriptive
Metadata granularity tends to be finer for
digital objects
Digital objects cannot be managed
without individual granularity (thank you
David Bearman)
EAD: Describing descriptions
What is a finding aid?
Describing a finding aid so it can be
searched
Expanding a finding aid to accommodate
individual granularity
Is it efficient to drill down through a
finding aid to individual objects?
Can EAD be searched from the bottom up?
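One way to picture a bottom-up search is to match an item-level title first and then climb to its enclosing series and collection. The sketch below uses a heavily simplified finding aid with EAD 2002 element names (`archdesc`, `dsc`, `c01`, `c02`, `did`, `unittitle`); the sample content and the `context_path` helper are invented for illustration, and a real finding aid would be far richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified finding aid.
FINDING_AID = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Smith Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="item">
          <did><unittitle>Letter to J. Doe, 1923</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def context_path(root, needle):
    """Find the first unittitle containing `needle`, then walk upward.

    Returns the titles from the matched component up to the collection,
    or an empty list when nothing matches.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for title in root.iter("unittitle"):
        if needle.lower() in (title.text or "").lower():
            path = []
            node = parent_of[parent_of[title]]  # unittitle -> did -> component
            while node is not None:
                own = node.find("did/unittitle")  # this level's own title only
                if own is not None:
                    path.append(own.text)
                node = parent_of.get(node)
            return path
    return []


root = ET.fromstring(FINDING_AID)
print(context_path(root, "letter"))
```

The result lists the item first and the collection last, which is the reverse of the top-down drill through the finding aid.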
EAD Schizophrenia
Because it describes finding aids, it has
retained concern with look and feel
Mixes granular conceptual description
with box/folder lists for physical (and
contingent) object arrangements
Lack of granularity is expressed in the
possibility of writing narrative with <p>
tags everywhere
MARC: Chunked packages
International Standard Bibliographic Description
(ISBD) as parent of MARC, TEI
MODS: User-friendly MARC? subset of MARC
elements (20), language-based tags
MARC as descriptive metadata
  Bibliographical detail for the work
  Bibliographical detail for the specific instance of the work (cf. FRBR)
  Places the work within one or many classificatory systems (ontologies, controlled vocabularies, authority lists)
  But alas! Not consistent!
METS: Multimedia/Multiversion
METS developed to express “archival bond”
among objects related to one another as a single
work (cf. FRBR, Warwick Framework, RDF)
Reflects concerns of digital librarians who want to
make a wide range of versions available
Standard form:
  General descriptive metadata for package
  Object link
  Object type
  Specific descriptive metadata set(s) for specific kinds of objects
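The standard form can be sketched as a small METS-like package. This is a rough illustration under stated assumptions, not a schema-validated METS document: `dmdSec` carries the package-level description, each `file` records an object type in `MIMETYPE`, and `FLocat` holds the object link. The `build_package` helper, the IDs, and the example.org URLs are hypothetical. Type-specific descriptive metadata would go in additional `dmdSec` elements, which are not shown.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
for prefix, uri in (("mets", METS_NS), ("xlink", XLINK_NS), ("dc", DC_NS)):
    ET.register_namespace(prefix, uri)


def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_package(title, files):
    """Build a package; `files` is a list of (file_id, mime_type, url) tuples."""
    mets = ET.Element(m("mets"))
    # General descriptive metadata for the package as a whole.
    dmd = ET.SubElement(mets, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, m("xmlData"))
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title
    # One <file> per version: MIMETYPE is the object type, FLocat the object link.
    grp = ET.SubElement(ET.SubElement(mets, m("fileSec")), m("fileGrp"))
    # The structural map ties all versions together as a single work.
    div = ET.SubElement(ET.SubElement(mets, m("structMap")), m("div"), {"DMDID": "DMD1"})
    for file_id, mime, url in files:
        f = ET.SubElement(grp, m("file"), {"ID": file_id, "MIMETYPE": mime})
        ET.SubElement(f, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
        ET.SubElement(div, m("fptr"), {"FILEID": file_id})
    return mets


# Hypothetical versions of one work.
package = build_package("Oral history interview", [
    ("F1", "audio/mpeg", "http://example.org/interview.mp3"),
    ("F2", "text/plain", "http://example.org/transcript.txt"),
])
print(ET.tostring(package, encoding="unicode"))
```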
What about the single object?
Is Dublin Core enough? Outdated? (15 elements)
What about derivatives?
  Qualified DC, DC profiles
  Australian elements (20)
Why describe the single object?
Who will describe at the object level?
  Zillions of archivists?
  Authors?
  Automatic analysis (ontology-driven)?
Wisdom of Crowds vs Long Tail
The wisdom of crowds: tagging as
democratized subject cataloging
The long tail: specialist cataloging for
small niche groups, now visible online