Lifecycle Metadata for Digital Objects
September 11, 2006
It’s markup all the way down
Some infrastructure details
Class meetings: SZB 546, Wed. 1-4
URL of syllabus: http://courses.ischool.utexas.edu/galloway/2006/fall/INF389K/index.html (presently under construction)
Office: SZB 566
Office hours: Tuesday, 9-11 or by appointment
TA: Dana Lamparello
What is metadata?
Data about data (Information about information?)
Kind: what is this thing?
Function
Management
Where
How long
Access
Metadata and information
Shannon and Weaver: what is information?
Source
Encoder
Message
Channel
Decoder
Receiver
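The chain above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the channel is noiseless, UTF-8 stands in for the encoder, and the function names are made up for the example.

```python
# Illustrative sketch of source -> encoder -> channel -> decoder -> receiver.
# All function names here are hypothetical, chosen for this example only.

def encode(message: str) -> bytes:
    """Encoder: turn the source's message into a signal (UTF-8 bytes)."""
    return message.encode("utf-8")

def channel(signal: bytes) -> bytes:
    """Channel: carries the signal. This one is noiseless, so nothing changes."""
    return signal

def decode(signal: bytes) -> str:
    """Decoder: turn the signal back into a message for the receiver."""
    return signal.decode("utf-8")

source_message = "metadata is data about data"
received = decode(channel(encode(source_message)))

# The receiver recovers the message only because encoder and decoder
# share the same convention (UTF-8).
print(received == source_message)
```

The point of the sketch is that the shared convention between encoder and decoder is itself a piece of metadata, even though it never appears in the message.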
Saussure: the arbitrariness of the sign
Separating signs by gaps
Modulating sound or signal
Using white space
Bracketing objects
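As a rough illustration of two of these separation conventions, the sketch below splits a string on white space and then pulls a label and its content out of a bracketed string. The sample strings and the tag name are invented for the example, and the regular expression handles only this one simple case.

```python
import re

# Using white space: gaps between signs are the only "metadata" needed.
plain = "signs separated by gaps"
tokens = plain.split()

# Bracketing objects: a start and end marker enclose the content and name it.
bracketed = "<title>Lifecycle Metadata</title>"
match = re.fullmatch(r"<(\w+)>(.*)</\1>", bracketed)
label, content = match.group(1), match.group(2)

print(tokens)
print(label, "->", content)
```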
Metadata orders
Still under construction!
Informal series to cover the field of digital metadata
Object as representation (1, 2)
Meaning/content of object (3, 4)
Management of object (5)
First-order metadata (base conventions)
Written and spoken language: intrinsic metadata that makes it possible
Layout/expressive conventions
Separation of words
Arrangement of groups of words
Punctuation, capitalization, emphasis, etc.
Note that this is usually considered to belong to an external standard, and nobody worries about it!
Second-order metadata (rendering)
Encoding (ASCII, proprietary formatting schemes)
Compression schemes
Encryption or other intentional distortion schemes
Note that when these are referred to as external standards, and especially when the result is not human-readable, everyone worries about it.
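A small Python sketch of why rendering metadata matters: the same bytes are readable only if the encoding and compression schemes are known. The sample text, the `rendering` dictionary, and the `restore` helper are assumptions made for this illustration.

```python
import zlib

text = "naïve café"

# Second-order metadata: which schemes were applied to the object.
rendering = {"encoding": "utf-8", "compression": "zlib"}

stored = zlib.compress(text.encode(rendering["encoding"]))

def restore(payload: bytes, meta: dict) -> str:
    """Undo the compression and encoding named in the metadata (hypothetical helper)."""
    if meta["compression"] == "zlib":
        payload = zlib.decompress(payload)
    return payload.decode(meta["encoding"])

right = restore(stored, rendering)

# Guessing the wrong encoding still "works" but yields garbled text.
wrong = text.encode("utf-8").decode("latin-1")

print(right == text)   # the recorded schemes recover the original
print(wrong == text)   # the wrong guess does not
```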
Third-order metadata (meaning)
“Connections to the world”
Meaning
Semantics: what does it mean?
Pragmatics: in what context?
Classification and facets
Fourth-order metadata (beyond individual objects)
Groups of digital object types: “aggregate metadata”
Archival series
Project files
“Complex documents”
Books seen as chapters and sections
Fifth-order metadata (instrumentality)
Functions
What is the digital object’s purpose?
What can you do with the digital object?
What should you do with/for the digital object?
Explicit digital object types
“Object types” are clusters of functions
Explicit treatment regimes
Why markup?
Natural language processing, the dream: self-describing text objects
The reality: you can’t do NLP without a lexicon (including semantics) and a grammar
Markup allows lexicon and grammar to be contained in the text
Markup thus allows all or parts of metadata to be contained in the object
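To make the last two points concrete, here is a minimal sketch of a marked-up object that carries some of its own metadata, read with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`memo`, `author`, `subject`, `body`, `date`) and the sample values are invented for illustration and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# A hypothetical marked-up object: the tags name the parts of the text.
document = """<memo date="2006-09-11">
  <author>A. Writer</author>
  <subject>Lifecycle metadata</subject>
  <body>Class meets on Wednesday.</body>
</memo>"""

root = ET.fromstring(document)

# Metadata travels inside the object and can be pulled out by tag name.
metadata = {
    "date": root.get("date"),
    "author": root.findtext("author"),
    "subject": root.findtext("subject"),
}
content = root.findtext("body")

print(metadata)
print(content)
```

No lexicon or grammar outside the object is needed to find the author or the subject here; the markup supplies both.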
How else to include metadata?
File header
Additional file with linkage
Database segment of DAM/repository system
External annotation databases
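The "additional file with linkage" option might look like the sketch below: the object is stored untouched and a JSON sidecar file points back to it by name. The file naming scheme, the `describes` key, and the sample values are assumptions for this example, not an established convention.

```python
import json
import tempfile
from pathlib import Path

def write_with_sidecar(directory: Path, name: str, content: bytes, metadata: dict):
    """Store an object plus a separate metadata file linked to it by filename."""
    obj_path = directory / name
    obj_path.write_bytes(content)
    sidecar_path = directory / (name + ".metadata.json")
    record = {"describes": name, **metadata}   # the linkage back to the object
    sidecar_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return obj_path, sidecar_path

with tempfile.TemporaryDirectory() as tmp:
    obj_path, sidecar_path = write_with_sidecar(
        Path(tmp), "report.txt", b"plain text content",
        {"format": "text/plain", "retention": "5 years"},
    )
    loaded = json.loads(sidecar_path.read_text(encoding="utf-8"))

print(loaded["describes"], loaded["format"])
```

The trade-off compared with markup is that the object stays unmodified, but the linkage breaks if either file is renamed or moved on its own.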
Markup standards?
SGML
Proliferation of proprietary markup
Now comes the preservation problem: return to standard
SGML
HTML
XML
Metadata standards?
Zillions!
Dublin Core, VRA Core, etc.
MARC->MODS
RKMS
ODRL, InDECS
Etc., etc.
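As one small example of what such a standard looks like in practice, the sketch below builds a two-element Dublin Core description of this lecture using Python's standard library. The `dc` namespace URI is the Dublin Core element set namespace; the `record` wrapper element is a hypothetical container chosen for the example, and only the title and date are taken from the slides.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace
ET.register_namespace("dc", DC)

# "record" is a made-up wrapper element; Dublin Core itself does not require it.
record = ET.Element("record")
ET.SubElement(record, f"{{{DC}}}title").text = "Lifecycle Metadata for Digital Objects"
ET.SubElement(record, f"{{{DC}}}date").text = "2006-09-11"

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```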