Lifecycle Metadata for Digital Objects
October 16, 2006
Implementing metadata in a repository system
The OAIS model





NASA/CCSDS
Developed to preserve scientific data
Assumed that creators were scientists
Assumed that users would be limited
Specifies a set of functions for a trusted digital repository
OAIS external relations
Where’s the (external) metadata?


Implications of externalities: pre-ingest, post-dissemination
Pre-ingest: repository may not control



Who provides the metadata?
How is it obtained?
Post-dissemination: repository must respond


What environment will the metadata be used in?
What metadata should be kept re usage?
Importance of external agreements

SIP agreement





Defining formats
Providing tools to receive and access formats
Providing specific metadata
Providing for testing of automated ingest
DIP agreement



Not an explicit part of the model
But crucial for defining the “designated user community” and its expectations
At any given time, may define the “plain-vanilla” context of the general user population
OAIS (internal) functional model
Where’s the (internal) metadata?






Metadata is generated at every step
Ingest
AIP bundling
Ongoing data management
Access events
Repository management
Roles of metadata in a repository







Regulation of ingest process
Serving as warrant of genuineness
Defining placement in repository
Defining relations with other objects in repository
Regulation of access permissions
Regulation of preservation scheduling and actions
Assisting in management of repository
Ingest

Verifying what was received



Automated test template
Harvest of existing metadata per SIP agreement
Preparing to put it away




Aggregates?
Single items?
Additional metadata?
Archival bond links to existing collections?
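The "automated test template" idea above can be sketched as a check that runs on each submission before it is put away. This is only an illustration: the field names, the accepted formats, and the `Submission` and `check_submission` names are assumptions, since a real SIP agreement would define its own requirements.

```python
from dataclasses import dataclass, field

# Hypothetical requirements; a real SIP agreement would supply these.
REQUIRED_FIELDS = ("title", "creator", "format", "date_created")
ACCEPTED_FORMATS = {"application/pdf", "image/tiff", "text/xml"}


@dataclass
class Submission:
    """One submitted file plus the metadata harvested with it."""
    filename: str
    metadata: dict = field(default_factory=dict)


def check_submission(sub: Submission) -> list[str]:
    """Return a list of problems; an empty list means the submission passes."""
    problems = []
    # Every field named in the agreement must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not sub.metadata.get(name):
            problems.append(f"missing metadata field: {name}")
    # The declared format must be one the agreement covers.
    fmt = sub.metadata.get("format")
    if fmt and fmt not in ACCEPTED_FORMATS:
        problems.append(f"format not covered by agreement: {fmt}")
    return problems
```

A submission that returns a non-empty list could be routed back to the depositor rather than ingested.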
Archival storage


Taking care of digital objects
Preserving them as received



Importance of message digest
Regular integrity-checking
Preserving them otherwise than as received


“Use copies” for frequently-used materials
Migration on demand
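The message digest and regular integrity-checking points above can be illustrated with a short sketch. It assumes digests were recorded at ingest in a simple mapping from relative path to hex digest; the function names and the choice of SHA-256 are assumptions for illustration, not something the slides specify.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Compute the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(recorded: dict[str, str], root: Path) -> dict[str, str]:
    """Compare recorded digests with freshly computed ones.

    Returns a status per relative path: "ok", "changed", or "missing".
    """
    report = {}
    for rel_path, expected in recorded.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif file_digest(target) != expected:
            report[rel_path] = "changed"
        else:
            report[rel_path] = "ok"
    return report
```

Running `check_fixity` on a schedule and storing each report as preservation metadata is one way to make the check "regular".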
Storage within the repository

Active file system



Active database



Used to contain archival digital objects
Used to contain use copies
Used to contain metadata
Can also be used to contain index to all text data in repository (as inverted index)
Offline file system and database (“dark archive”)

Used to store objects and metadata securely
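As a rough illustration of the inverted index mentioned above, the sketch below maps each term to the set of object identifiers whose text contains it. The tokenization (lowercased runs of word characters) and the function names are assumptions; a production repository would more likely rely on a database or search engine feature.

```python
import re
from collections import defaultdict


def build_inverted_index(texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the IDs of the objects whose text contains it."""
    index = defaultdict(set)
    for object_id, text in texts.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(object_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return IDs of objects that contain every term in the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```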
Data management





Taking care of metadata
Maximizing access
Tracking and understanding usage
Assisting with making new connections (analyzing usage data)
Integrating possible feedback metadata
Database functions

Internal



Tracks ingest, repackaging, usage, and preservation activities
Provides locator for objects
External



Provides searching on metadata fields
Provides searching on object content for text objects
Provides information for validation of access privileges
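The internal and external functions listed above can be sketched with a small relational schema. This is a minimal illustration using an in-memory SQLite database; the table layout, column names, and helper functions are all assumptions, and the choice of a relational store here is only for convenience.

```python
import sqlite3

# Hypothetical schema: one table for objects, one for lifecycle events.
SCHEMA = """
CREATE TABLE objects (
    object_id TEXT PRIMARY KEY,
    title     TEXT NOT NULL,
    locator   TEXT NOT NULL
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    object_id  TEXT NOT NULL REFERENCES objects(object_id),
    event_type TEXT NOT NULL,
    occurred   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_event(conn, object_id, event_type):
    """Internal function: track ingest, usage, or preservation activity."""
    conn.execute(
        "INSERT INTO events (object_id, event_type) VALUES (?, ?)",
        (object_id, event_type),
    )


def register_object(conn, object_id, title, locator):
    """Store an object's metadata and log its ingest event."""
    conn.execute(
        "INSERT INTO objects VALUES (?, ?, ?)", (object_id, title, locator)
    )
    record_event(conn, object_id, "ingest")


def locate(conn, object_id):
    """Internal function: return the storage locator, or None if unknown."""
    row = conn.execute(
        "SELECT locator FROM objects WHERE object_id = ?", (object_id,)
    ).fetchone()
    return row[0] if row else None


def search_titles(conn, term):
    """External function: search one metadata field for a substring."""
    rows = conn.execute(
        "SELECT object_id FROM objects WHERE title LIKE ? ORDER BY object_id",
        (f"%{term}%",),
    )
    return [r[0] for r in rows]
```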
Database type and choice





Relational
Hierarchical
“Native XML”
“Supports XML”
Hybrid (database structure, XML document access)
Access


As conceived in original OAIS document: handled by people, offline
In practice: automated as much as possible



Who gets access?
What kind of access?
Recording access instances
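The three questions above (who, what kind, and recording the instance) can be illustrated with a small sketch of automated access handling. The roles, the kinds of access, and every name below are hypothetical; an actual repository would derive them from its agreements with depositors and users.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical permission table: user role -> kinds of access allowed.
PERMISSIONS = {
    "public": {"view_metadata"},
    "researcher": {"view_metadata", "view_use_copy"},
    "archivist": {"view_metadata", "view_use_copy", "download_original"},
}


@dataclass
class AccessLog:
    """Keeps one entry per access request, whether granted or refused."""
    entries: list = field(default_factory=list)

    def record(self, user, role, object_id, kind, granted):
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "object_id": object_id,
            "kind": kind,
            "granted": granted,
        })


def request_access(log: AccessLog, user, role, object_id, kind) -> bool:
    """Decide whether the role permits this kind of access, and log the request."""
    granted = kind in PERMISSIONS.get(role, set())
    log.record(user, role, object_id, kind, granted)
    return granted
```

Logging refused requests as well as granted ones keeps usage metadata available for later analysis.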
Overall management

Repository as a whole





External relations in general
Administration including periodic recertification
Preservation planning
SIP agreements and negotiation with depositors
DIP agreements and interaction with users
Persistence and trustworthiness





Most crucial element of the OAIS model: requirement for specifying cessation process
Commitment to donor and user community
Guarantee of continued service
Explicit agreements with potential successor organizations
Vital to user community that expects permanent guarantees (e.g., government)
Certification



Repository excellence must be judged against standards
May be audited by certifying body (this is still under discussion)
Certification plan is current task of core group from RLG and NARA with interest from Cornell, Harvard, OCLC and others; draft checklist released spring 2006, final version in process