Lifecycle Metadata for Digital Objects

LIFECYCLE METADATA
FOR
DIGITAL OBJECTS
Danielle Cunniff Plumer
School of Information
The University of Texas at Austin
Summer 2014
Class Introductions
• Identify
• Yourself
• Types of digital projects you have
worked on or are interested in
• Your role in these digital projects
• Your experience with metadata
Metadata
• “Metadata is structured information that
describes, explains, locates, or otherwise makes
it easier to retrieve, use, or manage an
information resource.”
• Key Concepts:
• Structured information
• Ease of use
• Formal standards
Source:
NISO. (2004) Understanding Metadata.
Bethesda, MD: NISO Press, p.1
Functions of Metadata
• List:
Metadata
• A good object has associated metadata.
• A good object will have descriptive and
administrative metadata, and compound objects
will have structural metadata to document the
relationships between components of the object
and ensure proper presentation and use of the
components.
Source:
NISO. (2007). “Objects Principle 6” in
A Framework of Guidance for Building Good
Digital Collections. Bethesda, MD: NISO Press.
Types of Metadata
• Descriptive
• Structural
• Administrative
• Technical
• Rights Management
• Preservation
Source:
NISO. (2004) Understanding Metadata.
Bethesda, MD: NISO Press, p.1
Metadata Standards
• Interoperability and object exchange require the
use of established standards
• Semantic: Content Standards
• Syntactic: Metadata Schemas
• Format: Machine encoding
• Exchange: Interface protocols
• XML is a common method for exchanging
metadata descriptions on the Internet
• Others include: JSON, RDF
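As a rough illustration of the exchange layer, the sketch below serializes one small record as both JSON and XML using only the Python standard library. The field names (creator, title, date) and the <record> wrapper element are assumptions made for this example and are not taken from any particular schema; the values come from the catalog record used later in these slides.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical field names; the values come from the catalog example in this deck.
record = {
    "creator": "Davis, Ellis A.",
    "title": "The encyclopedia of Texas",
    "date": "[1922?]",
}

# JSON: the dictionary maps directly onto a JSON object.
as_json = json.dumps(record, indent=2)

# XML: one child element per field under an assumed <record> root element.
root = ET.Element("record")
for field, value in record.items():
    ET.SubElement(root, field).text = value
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
```

The same content survives in both encodings; only the syntax changes, which is why the content standard and the schema have to be agreed on separately from the exchange format.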
Metadata Components
[Diagram: a spectrum from Human to Machine, with the layers Content Standard, Syntax/Schema, Data Format, and Data Exchange]
Insert Joke about Standards
"Fortunately, the charging one has been solved now that we've all standardized
on mini-USB. Or is it micro-USB? Shit."
http://xkcd.com/927/
Plan of Course: Schedule
Week  Topic                         Lab
1     Introduction                  Command Line Tools (1)
2     Metadata in Digital Objects   Command Line Tools (2)
3     Metadata and Markup           XML and RDF
4     Content Standards             Descriptive Metadata (ASpace)
5     Authority Records             EAC-CPF (ASpace)
6     Controlled vocabularies       Controlled vocabularies (ASpace)
7     Ontologies and Linked Data    Linked Data
8     Digital Forensics             BitCurator
9     Student Presentations
10    Student Presentations
Plan of Course: Readings
• Required:
• Baca, Murtha, ed. 2008. Introduction to Metadata (Online Edition, Version 3.0).
Los Angeles: Getty Publications. Retrieved from
http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.html
• Society of American Archivists. 2013. Describing Archives: A Content Standard
(DACS). Second Edition. Chicago: Society of American Archivists. Retrieved
from http://files.archivists.org/pubs/DACS2E-2013.pdf
• Optional:
• Dow, Elizabeth. 2005. Creating EAD-Compatible Finding Guides on Paper.
Scarecrow Press.
• Roe, Kathleen. 2005. Arranging and Describing Archives and Manuscripts
(Archival Fundamentals Series II). Chicago: Society of American Archivists.
• Additional readings will be assigned for specific topics.
Plan of Course: Assignments
Assignment                    Due Date      Percent of Grade
Class participation           Ongoing       25%
  • Lab work
Markup Assignment                           25%
  • Authorities (EAC-CPF)     July 9
  • Finding Aid (EAD)         July 21
  • Tutorial (ArchivesSpace)  July 28
Seminar Paper                               25%
  • Presentation              August 4-11
  • Paper                     August 4
Exams
  • Midterm                   July 16       10%
  • Final                     August 13     15%
Questions?
Using Metadata
• Description
• To (more or less) uniquely identify an item
• To identify the parts of an item and their relationship to the whole
Author: Davis, Ellis A.
Title: The encyclopedia of Texas, compiled and edited by Ellis A. Davis and Edwin H. Grobe.
Imprint: Dallas, Texas Development Bureau [1922?]
Using Metadata
• Location
• To show where to find the item
• Call number
• Archival container
• Storage unit
• Uniform Resource Identifier (URI)
Library: UNT Libraries
Location: WILLIS 4FL TEXANA COLL
Identifier: 976.4 D292t V. 1
Using Metadata
• Condition
• To document the condition of an item at a given time
• To record any actions taken with respect to the item’s condition
Binding: Full-Leather
Book Condition: Fair
Jacket Condition: No Jacket
Using Metadata
• Use
• To explain conditions of use for an item
• Based on condition
• Based on rights
Status: LIB USE ONLY
Rights: Public Domain based on publication date of 1922.
Digital is Different
• Does the metadata describe the physical item or
the digital item, or both?
• Physical item
• Metadata as “surrogate” for the physical item
• Metadata for inventory of and access to the
physical item
• Metadata aggregated for use in “union catalogs”
• Digital item
• Metadata as a component of the digital object itself
• Metadata used as a way of pointing to the digital object from
metadata aggregations
Metadata and Digital Objects
• Metadata can be:
• In the digital object
• File headers, e.g., TIFF, EXIF; EAD, TEI headers
• Near the digital object
• Same directory, hard drive, network
• Thousands of miles from the digital object
• On another network, in another state, in another country
Email Metadata (1 of 3)
Delivered-To: dcplumer@utmail.utexas.edu
Received: by 10.220.251.201 with SMTP id mt9csp166864vcb;
Mon, 9 Jun 2014 10:47:13 -0700 (PDT)
X-Received: by 10.60.63.110 with SMTP id f14mr27889922oes.8.1402336033290;
Mon, 09 Jun 2014 10:47:13 -0700 (PDT)
Return-Path: <dcplumer@gmail.com>
Received: from angband.mail.utexas.edu (angband.mail.utexas.edu. [146.6.25.8])
by mx.google.com with ESMTPS id eb3si28108314oeb.17.2014.06.09.10.47.12
for <dcplumer@utmail.utexas.edu>
(version=TLSv1 cipher=RC4-SHA bits=128/128);
Mon, 09 Jun 2014 10:47:13 -0700 (PDT)
Received-SPF: pass (google.com: domain of dcplumer@gmail.com designates 209.85.216.170
as permitted sender) client-ip=209.85.216.170;
Authentication-Results: mx.google.com;
spf=pass (google.com: domain of dcplumer@gmail.com designates 209.85.216.170 as
permitted sender) smtp.mail=dcplumer@gmail.com;
dkim=pass header.i=@gmail.com
X-Utexas-Sender-Group: None
X-IronPort-MID: 87878886
X-SBRS: 4.9
X-Utexas-Seen-Inbound: true
Email Metadata (2 of 3)
Received: from mail-qc0-f170.google.com ([209.85.216.170])
by angband.mail.utexas.edu with ESMTP/TLS/RC4-SHA; 09 Jun 2014 12:47:06 -0500
Received: by mail-qc0-f170.google.com with SMTP id l6so1040781qcy.1
for <dcplumer@utexas.edu>; Mon, 09 Jun 2014 10:47:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=20120113;
h=mime-version:sender:from:date:message-id:subject:to:content-type;
bh=Wwk4SOA8TwweO+8MVa0kj6m8+yIRwdAxIdQANX5SDiA=;
b=jBD86rqYhja+NT1vAffPy1U5qHwzAGQ3PGy6ITs4hzkBUYATlSt65ebQQAb/3as2V8
gDBzcgcpDkJNnQEpwHwbqvkFfjTwquSJWXJVYOtbvefIkqFV/NpSfW8qda7NGcOsjT50
H8iGNnG1CZ3ijW9HvpelUmr7FkqeJX+1RF6q3R+15RsoB837cXmwhTahRXGpNt/hsrWF
s/PMVXPR55JnOm8CQ3/uRhWmfceiSkvPUL/zYwfxw46TFDp7gY7ubZyICni6sbIyJ06t
TnXdb6WBbfvkpnxa5pdvpcCi3YVB5tgEfVuub7+c75rez8P3hkb7aBzJ5P721GGwu4BT
aasg==
X-Received: by 10.224.162.212 with SMTP id w20mr5870278qax.50.1402336025270;
Mon, 09 Jun 2014 10:47:05 -0700 (PDT)
MIME-Version: 1.0
Email Metadata (3 of 3)
Sender: dcplumer@gmail.com
Received: by 10.229.39.72 with HTTP; Mon, 9 Jun 2014 10:46:25 -0700 (PDT)
From: danielle plumer <danielle@dcplumer.com>
Date: Mon, 9 Jun 2014 12:46:25 -0500
X-Google-Sender-Auth: 5r4G-BoZO-kUgBWPn7LKoFFOomo
Message-ID: <CAAJtrZYB5rPf+2joUY=P5+C=HE9sWwSr=1QqV7VyG_GTH5GTMw@mail.gmail.com>
Subject: Test message
To: dcplumer@utexas.edu
Content-Type: multipart/alternative; boundary=089e013cb7cea45fbb04fb6accb2
--089e013cb7cea45fbb04fb6accb2
Content-Type: text/plain; charset=UTF-8
This is a test message to see what kinds of metadata are embedded in an
email.
--089e013cb7cea45fbb04fb6accb2
Content-Type: text/html; charset=UTF-8
<div dir="ltr">This is a test message to see what kinds of metadata are embedded in an
email.</div>
--089e013cb7cea45fbb04fb6accb2--
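The headers above can also be read programmatically. The following is a minimal sketch using Python's standard email package on an abridged copy of the test message: the routing headers and the HTML part are omitted, and the blank lines that separate headers from body are restored so the parser can find the body. The keys of the resulting dictionary (creator, title, date, and so on) are assumptions chosen for illustration, not a standard mapping.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Abridged copy of the test message shown on the slides.
RAW = """\
Return-Path: <dcplumer@gmail.com>
MIME-Version: 1.0
Sender: dcplumer@gmail.com
From: danielle plumer <danielle@dcplumer.com>
Date: Mon, 9 Jun 2014 12:46:25 -0500
Message-ID: <CAAJtrZYB5rPf+2joUY=P5+C=HE9sWwSr=1QqV7VyG_GTH5GTMw@mail.gmail.com>
Subject: Test message
To: dcplumer@utexas.edu
Content-Type: multipart/alternative; boundary=089e013cb7cea45fbb04fb6accb2

--089e013cb7cea45fbb04fb6accb2
Content-Type: text/plain; charset=UTF-8

This is a test message to see what kinds of metadata are embedded in an
email.

--089e013cb7cea45fbb04fb6accb2--
"""

msg = message_from_string(RAW)

# Split the From header into a display name and an address.
display_name, address = parseaddr(msg["From"])

# Convert the Date header into a timezone-aware datetime.
sent = parsedate_to_datetime(msg["Date"])

# Hypothetical descriptive record built from the embedded headers.
record = {
    "creator": display_name,
    "creator_address": address,
    "title": msg["Subject"],
    "date": sent.isoformat(),
    "identifier": msg["Message-ID"].strip("<>"),
    "format": msg.get_content_type(),
}

for key, value in record.items():
    print(f"{key}: {value}")
```

Note that Sender and From disagree in this message; which one counts as the "creator" is a content-standard decision that the parser cannot make for you.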
Twitter Metadata
http://online.wsj.com/public/resources/documents/TweetMetadata.pdf
Identifiers as Metadata
• A good object will be named with a persistent, globally
unique identifier that can be resolved to the current
address of the object.
• Good identifiers will at minimum be locally unique, so that
resources within the digital collection or repository can be
unambiguously distinguished from each other.
• Global uniqueness can then be achieved through the addition of a
globally unique prefix element, such as a code representing the
organization.
Source:
NISO. (2007). “Objects Principle 4” in
A Framework of Guidance for Building Good
Digital Collections. Bethesda, MD: NISO Press.
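The prefixing idea in this principle can be sketched in a few lines of Python. The function name, the colon separator, and the sample prefix and local identifiers below are all hypothetical; real repositories typically rely on an established identifier scheme rather than an ad hoc format like this one.

```python
def assign_global_ids(org_prefix: str, local_ids: list[str]) -> dict[str, str]:
    """Map each locally unique identifier to a prefixed, globally scoped one.

    Raises ValueError if the local identifiers are not unique, since the
    prefix can only guarantee global uniqueness when local uniqueness holds.
    """
    seen = set()
    result = {}
    for local_id in local_ids:
        if local_id in seen:
            raise ValueError(f"local identifier is not unique: {local_id!r}")
        seen.add(local_id)
        result[local_id] = f"{org_prefix}:{local_id}"
    return result


# Hypothetical organization code and local identifiers.
ids = assign_global_ids("org-example", ["obj-0001", "obj-0002"])
print(ids["obj-0001"])
```

This covers uniqueness only. Persistence and resolution to the object's current address require a maintained resolver service, which is outside the scope of this sketch.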
Identifiers in Metadata
• The description of a digital object must be specific
enough to distinguish it from similar objects
• Unique identifiers make this easier
• The link between a digital object and its metadata
must exist and be maintained over the entire
lifespan of the object
• Persistent identifiers are essential
Exercise: Physical & Digital
• Break into groups
• Choose one of the objects listed (view online)
• http://goo.gl/BpkS1U
• For each object, create two records:
• Surrogate record for the physical object in a library,
museum, or archives.
• Metadata record for the digitized object available online.
• Describe the differences between the
two records
Plan for Thursday
• Read:
• Stephenson, Neal. (1999). In the beginning was the command line.
Available from http://www.cryptonomicon.com/beginning.html