Lifecycle Metadata for Digital Objects
October 30, 2006
Archival Metadata
Appraisal, Inventory, Retention Schedule, Authenticity, Accession
Records Continuum Issues
Attention to all digital records from before creation
Do as much as you can on the front end
Integrate into the business process
Metadata enhances management and repurposing, whatever the fate of the digital object
What is Appraisal?
In archival (nonprofit) sense, not about assigning exchange value, but only use value (even if only the value “archival”)
In business (for-profit) sense, about assigning both use value and known or possible exchange value (cf. “information asset,” “digital asset”)
Appraisal Theories I
(Shepherd)
European School: moral defense
If it is a record, then it should be kept
Selection by creators
Provenance, original order
American School: especially Schellenberg
Primary value (TX: administrative, fiscal, legal)
Secondary values (TX: historical)
Evidential
Informational
Records management, life-cycle
Appraisal Theories II
(Shepherd)
Societal Models
Booms
Society-centered
Where the creator meets the citizen through function
Macro Appraisal: Functional Analysis
Top-down analysis from function
Appraisal Practice
Keep (apply values)
Destroy (apply costs)
Why not keep it all?
Excluding records driven by costs
Excluding records driven by elitist ideas of informational value (cf. case files)
Excluding records driven by concepts of archival purpose
But compare notion of monetized intellectual property vs risk analysis in business
Digital Record Types
Phase I: the mainframe
Primarily databases
NARA archival “data warehouses”
Phase II: the desktop
Disciplining the desktop
UBC, 5015.2
Phase III: the network and its nodes
Write once, run everywhere
Universal encoding standard
Web services
“Thresholds” to encode function, etc.
Digital Appraisal Decisions
Keep (costs of carrying into the future)
Allow to Die (keep but do nothing)
Repurpose (separating content and form)
Destroy (microwave the disk?)
Digital Appraisal: What to Appraise
Content (as with paper?)
Technical support
System
Creating application
Display requirements
Functionality
Digital Appraisal Process
When?
Before creation
How?
Macro-appraisal of records/objects
Functional appraisal of supporting system
By whom? “Participatory appraisal”
Records managers/archivists
IT specialists
Creators
What is Inventory?
After the fact
Survey and classification of existing objects
Location
Format
Dates
Confidentiality
Estimate of space requirements
Determination of retention costs
Storage
Migration
Access
Follow by remedial appraisal
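The survey step above can be sketched in a few lines. This is a minimal illustration, not a prescribed tool: the function names `survey` and `space_by_format` are hypothetical, and the file extension is used only as a crude stand-in for format (a real inventory would identify formats more reliably and would also record confidentiality, which cannot be read from the file system).

```python
import os
from collections import defaultdict
from datetime import datetime, timezone


def survey(root):
    """Walk a directory tree and record location, format, date, and size per file."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            rows.append({
                "location": path,
                # Assumption: the extension approximates the format.
                "format": os.path.splitext(name)[1].lower() or "(none)",
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
                "bytes": info.st_size,
            })
    return rows


def space_by_format(rows):
    """Total the bytes per format as a first estimate of space requirements."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["format"]] += row["bytes"]
    return dict(totals)
```

The per-format totals give a starting point for estimating storage and migration costs before remedial appraisal.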
What is a Retention Schedule?
Classic record statuses: active, semiactive, inactive
Keep
Alter function of custodian
Alter custodianship
Allow to Die
Leave with creator?
Why not always do this?
Destroy
Determine when to destroy
Almost always a method for reprieve
Texas Retention schedule form
Automatically Collect or Add?
Refer to Word example
Automatic collection of many types of metadata by the creating system (standard for all types?)
Automatic application of other metadata by the managing system/RMA (varying with type?)
User-added metadata (standard and varying)
Schedule Triggers
Time-driven triggers: records schedule specifies action after a certain amount of time has passed
Event-driven triggers: records schedule specifies action if a certain event transpires
Mixed triggers: records schedule specifies action after a certain amount of time if an event doesn’t take place first
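The three trigger types can be sketched as a single check. This is an illustration under stated assumptions: the `Trigger` class, the `action_due` function, and the event names are hypothetical, and events are assumed to be logged as a mapping from event name to the date it occurred.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Optional


@dataclass(frozen=True)
class Trigger:
    kind: str                              # "time", "event", or "mixed"
    retention_days: Optional[int] = None   # used by "time" and "mixed"
    event: Optional[str] = None            # used by "event" and "mixed"


def action_due(trigger: Trigger, created: date, today: date,
               events: Dict[str, date]) -> bool:
    """Return True when the scheduled action should be taken."""
    deadline = None
    if trigger.retention_days is not None:
        deadline = created + timedelta(days=trigger.retention_days)
    event_date = events.get(trigger.event) if trigger.event else None

    if trigger.kind == "time":
        # Act once the specified amount of time has passed.
        return deadline is not None and today >= deadline
    if trigger.kind == "event":
        # Act once the specified event has transpired.
        return event_date is not None and today >= event_date
    if trigger.kind == "mixed":
        # Act after the time has passed, unless the event took place first.
        if deadline is None:
            return False
        if event_date is not None and event_date <= deadline:
            return False
        return today >= deadline
    raise ValueError(f"unknown trigger kind: {trigger.kind!r}")
```

For example, `action_due(Trigger("mixed", 365, "litigation_hold"), date(2006, 1, 1), date(2007, 1, 2), {})` is true, while logging a `litigation_hold` event before the deadline suppresses the action.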
Record-level vs Group-level Metadata
Record-level: Metadata orders 1-4
1 written (content)
2 encoded (content)
3 meaning (ontology)
4 function/purpose=type (form)
Group-level: Metadata order 5
5 Object grouping schemes (categories)
Record groups, record series (intellectual management)
File plans (within-group ordering if present)
Format, security concerns (physical management)
Records management model for digital object management
Requirements here were developed to manage government and regulated records
Managing records is part of general management practice
Managing non-record digital objects of more than transitory interest shares many of these concerns
Managing non-record objects
Protecting digital assets
Value of intellectual property and time considerations for copyright
Investment in conversion process and possible reconversion
Provision of access to digital assets
Predicting technological requirements
Predicting costs
What is accession?
Accession the noun: the object or group of objects accessioned; strictly, the term applies to the occasion of accessioning.
Accession the verb: to take legal and/or physical custody of an object.
Accession also includes the process of making a record of the accession.
How does accession follow on from transfer?
Transfer terminates with quality control on the object received, to be sure it is an authentic copy of what was sent and someone takes responsibility for having received it. There is a seamless connection with accession.
Accession begins with the “adoption” of the object: in an analogy with the human world, it undergoes a “renaming ceremony” and is “adopted into the tribe.”
Note this process indicates a change in ownership but not always in custodianship.
What is the nature of the accession task?
The object received has been uprooted from its former context
The object is equipped with enough metadata to reconstruct that context
Contextual metadata now is no longer functional but is descriptive of the old context
Object must be integrated into a new (meta-) context
New functions must be provided for
These functions may include replicating the functions from the old context
The paper accession form
(example)
Accession number (to assist with internal management)
Accession title (may already exist?)
Date of receipt
Location (new)
Administrative / biographical information
(data from another source)
Contents (subject?)
Extent (already exists)
Donor information
Restrictions (IP)
Custodial history
Date of acknowledgement
(note: maintain three copies!)
Digital processes at accession
(OAIS ingest process)
Accept a SIP
Perform QA on SIP
Prepare contents for storage and management
Create/derive management/preservation metadata: descriptive, technical
Quarantine
Prior to evaluation for acceptance
Virus checking in quarantine
Antivirus update to virus-checking tool before each check
Accepting the SIP: Validation of the object
Validation test suite (established for every acceptable format and metadata schema)
Validation tools (established in SIP agreement)
DTD, Schema templates
Format viewer/emulator
Validation process
Formal validation process
Check wrapper against SIP agreement
Perform QA on metadata
Validation outcomes
Rejection
Re-transfer
Acceptance
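The validation process and its three outcomes can be sketched as follows. Which failure leads to which outcome is an assumption made for illustration: here an unacceptable format is rejected, while a damaged payload or incomplete metadata prompts a re-transfer. The wrapper and agreement layouts, and the name `validate_sip`, are hypothetical rather than taken from any SIP agreement.

```python
import hashlib
from enum import Enum


class Outcome(Enum):
    REJECTION = "rejection"
    RETRANSFER = "re-transfer"
    ACCEPTANCE = "acceptance"


def validate_sip(payload: bytes, wrapper: dict, agreement: dict) -> Outcome:
    """Check a received object and its wrapper against the SIP agreement."""
    # Check wrapper against SIP agreement: is the format one we accept?
    if wrapper.get("format") not in agreement["accepted_formats"]:
        return Outcome.REJECTION

    # Formal validation: does the payload match the checksum that was sent?
    if hashlib.sha256(payload).hexdigest() != wrapper.get("sha256"):
        return Outcome.RETRANSFER

    # QA on metadata: are all mandatory elements present and non-empty?
    metadata = wrapper.get("metadata", {})
    missing = [e for e in agreement["mandatory_elements"] if not metadata.get(e)]
    if missing:
        return Outcome.RETRANSFER

    return Outcome.ACCEPTANCE
```

A production test suite would also validate the object against the DTD or schema for its format; that step is omitted here.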
Extraction/assembly of metadata
Metadata as data and processing applications
Metadata storage: issue of separate storage
Extraction from object
Extraction from wrapper
Assembly from transfer and accession processes
Metadata important to accession
Intellectual property: metadata instantiated as policy and permission settings in access system
Retention requirements: metadata instantiated as expiration dates in management system
Review categories for describing metadata elements
Element name (subelements?)
Singular vs repeating
Definition
Mandatory vs Optional
Granularity
How recorded (by whom—or what)
Also: allowed values
Example table
See “2001 metadata table” example on syllabus
“Elements” on this table define discrete functional “instances”:
record instance
person instance
organization instance
series instance
disposition instance
transfer instance
change history
use instance
management instance
Operationalization of elements
Element name (applies to elements and subelements); if a standard, should be so indicated (namespace nomenclature)
Subelements: this kind of structure is useful for grouping elements, but may or may not be reflected in the XML implementation (are subelements really hierarchical?)
Singular vs repeating: for implementation, “repeating” will signal the need for a separate table
Operationalization of elements
(continued)
Definition: definition must be very specific, since it provides information for implementation, especially the need for attributes
Mandatory vs optional: Must be part of validation at every stage
Granularity: this characteristic will be connected to how metadata are collected and how they are connected to the object
How recorded: Automatically? Manually? By whom?
Allowed values (if relevant)
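The review categories can be made operational by recording each element definition as data and checking records against it. This is a sketch under assumptions: `ElementDef`, `check`, and the sample field values are hypothetical, and a record is modeled as a plain dictionary in which repeating elements hold lists.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ElementDef:
    name: str
    definition: str
    mandatory: bool
    repeating: bool
    granularity: str            # e.g. "record" or "series"
    recorded_by: str            # e.g. "system" or "user"
    allowed_values: Optional[FrozenSet[str]] = None


def check(record: dict, schema: List[ElementDef]) -> List[str]:
    """Return a list of problems found when validating a record against the schema."""
    problems = []
    for el in schema:
        values = record.get(el.name, [])
        if not isinstance(values, list):
            values = [values]   # treat a single value as a one-item list
        if el.mandatory and not values:
            problems.append(f"{el.name}: mandatory element missing")
        if not el.repeating and len(values) > 1:
            problems.append(f"{el.name}: singular element has {len(values)} values")
        if el.allowed_values is not None:
            for value in values:
                if value not in el.allowed_values:
                    problems.append(f"{el.name}: value {value!r} not allowed")
    return problems
```

Because mandatory status must be part of validation at every stage, the same `check` call could be run at transfer, at accession, and after each migration.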
Preparation of the object for storage: copies, versions
For persistent objects (XML model)
Conversion to neutral format
Retain wrapped original as own digital signature
Copies: archival, use, versions
Storage locations: multiple, separated
Online
Offline
Federated
Track the object for its life in the repository
(location-instances)
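Tracking location-instances can be sketched as a register of copies, each with a recorded checksum that later audits compare against. The class names, the location identifiers, and the choice of SHA-256 are assumptions for illustration; the slides do not specify a fixity method.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LocationInstance:
    location: str   # hypothetical storage identifier
    storage: str    # "online", "offline", or "federated"
    role: str       # "archival", "use", or "version"
    sha256: str     # checksum recorded when the copy was stored


@dataclass
class TrackedObject:
    object_id: str
    instances: List[LocationInstance] = field(default_factory=list)

    def add_copy(self, location: str, storage: str, role: str, content: bytes) -> None:
        """Register a copy and record its checksum at the time of storage."""
        digest = hashlib.sha256(content).hexdigest()
        self.instances.append(LocationInstance(location, storage, role, digest))

    def drifted(self, read: Callable[[str], bytes]) -> List[str]:
        """Return locations whose current content no longer matches the record."""
        return [
            inst.location
            for inst in self.instances
            if hashlib.sha256(read(inst.location)).hexdigest() != inst.sha256
        ]
```

The `read` argument is any function that fetches the bytes stored at a location, so the same register can audit online, offline, and federated copies.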
“Internal accession” of revised/migrated versions
Over time additional versions will be generated
Migrated versions
Repurposed/refactored versions
If these versions are worth making, they are worth caring for
These versions should be taken through most parts of the accession process