The Harvard University Library (HUL) operates the Digital Repository Service (DRS) as a preservation and access repository for library-like digital assets.
The primary obligation of the DRS is to manage digital assets and ensure their usability over time.
The emphasis on usability implies that the object of preservation is the underlying information content of the digital asset as exposed to patrons through specific behaviors.
The unit of preservation interest is the digital object, an expression of abstract information content that is tangibly manifested in one or more formatted digital files.
The baseline preservation function of the DRS includes system backup, disaster planning and recovery, monitoring and risk assessment, and the maintenance of the bit-level integrity of all managed digital objects. Note, however, that these activities alone may not be sufficient to ensure usability over time.
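As an illustration of what maintaining bit-level integrity can involve, the following is a minimal sketch of a fixity check. It assumes that a digest was recorded for each file at deposit. The choice of SHA-256 and the names `sha256_of` and `verify_fixity` are assumptions made for this example and do not describe the actual DRS implementation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded_digest: str) -> bool:
    """Return True if the file's current digest matches the recorded one."""
    # Hypothetical policy: the recorded digest is a hex string from deposit time.
    return sha256_of(path) == recorded_digest.lower()
```

A check of this kind only detects that the bits have changed. It says nothing about whether the content can still be rendered, which is why bit-level integrity alone may not ensure usability.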
All managed digital objects receive the highest level of preservation service that is supportable given their formal characteristics, the degree to which those characteristics are documented in metadata, and current technical understanding of the digital environment.
Preservation services beyond the baseline functions, such as preservation planning and intervention, are performed on a “best effort” basis, subject to managerial considerations of staff availability, the allocation of resources for other HUL priorities, and cost.
To achieve necessary operational efficiencies as the scale of the managed assets grows over time, preservation monitoring, assessment, and planning are performed at the aggregate level rather than the unit level. Preservation intervention, on the other hand, always occurs at the unit level.
The mechanism for aggregation is the content model, which defines a class of digital objects that share a particular arrangement of:
– […], and relational properties
– Characterization of these technical properties by metadata
– Patron expectations for behavior
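The aggregation idea can be sketched as a small data structure: objects are assigned to a class, and monitoring or planning then works on the classes rather than on individual objects. The class names, field names, and example values below are all hypothetical and chosen only for illustration; they are not DRS definitions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentModel:
    """Hypothetical stand-in for a class of similarly arranged objects."""
    name: str
    formats: tuple[str, ...]        # formats of the component files
    characterization: str           # how the technical properties are described
    patron_expectation: str         # what patrons expect to do with the object


@dataclass(frozen=True)
class DigitalObject:
    object_id: str
    model: ContentModel


def group_by_model(objects):
    """Map each content model name to the ids of the objects in that class."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj.model.name].append(obj.object_id)
    return dict(groups)


# Two illustrative classes that both contain TIFF files but differ otherwise.
still = ContentModel("still-image", ("TIFF",), "technical metadata", "view an image")
paged = ContentModel("page-turned", ("TIFF", "XML"), "technical metadata", "page through a volume")

objects = [
    DigitalObject("a", still),
    DigitalObject("b", paged),
    DigitalObject("c", still),
]
print(group_by_model(objects))
```

Grouping in this way lets an assessment be made once per class, while any resulting intervention would still be applied to each object id individually.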
Content models are periodically assessed with respect to their ability to facilitate usability over time. The results of these assessments are expressed in terms of a tripartite classification scheme:
[…] – Content models […] conducive to preserving the usability of underlying content and behavior over time
[…] – Content models […] conducive to preserving usability over time
[…] – Content models […] conducive to usability over time
The primary questions involved in this assessment are focused on:
a. […] – How easily available and widely deployed are appropriate processing tools?
DRS Preservation Policy and Practice Page 1
[…] – Primarily expressed in terms of:
  – Will preservation intervention be needed sooner rather than later?
  – When preservation intervention becomes necessary, is the potential for loss of information content or experiential behavior large or small?
  – When preservation intervention becomes necessary, is the degree of resource expenditure high or low?
This assessment is based in part on a consideration of the formats used in a particular content model. Consequently, all formats also undergo a similar assessment process using the same classification scheme. The granularity and completeness of metadata are also an important part of this assessment, as are behavioral considerations.
DRS staff issue periodic best practice recommendations for digital asset creation and acquisition.
These recommendations provide guidance regarding the selection of content models most conducive to preservation.
Conformance to DRS best practice recommendations and the use of the HUL integrated infrastructure for digital content management and delivery will result in a higher level of preservation confidence than would otherwise be the case.
The use of content models is necessary to clarify the varying planning and intervention efforts that must be carried out for the DRS to meet its preservation obligations for digital content that is otherwise indistinguishable solely on the basis of format. The inclusion of behavior as a component of content model definition recognizes that all digitally-encoded information needs to be rendered, i.e., transformed into humanly-sensible form, in order for that information to be usable. (Note that rendering is distinct from delivery, which involves merely providing a copy of the digitally-encoded information, i.e., the “bits,” to a client-side agent.) While rendering always takes place on the client side, there may be significant pre-rendering server-side steps in the process, for example, AIP-to-DIP conversion. Client-side rendering can be performed by agents that are commonly deployed, e.g., web browsers, or by highly specialized agents particular to niche communities.
[Figure: five TIFF-based examples traced from the server-side repository (storage and management; access) to the client-side agent. Labels recoverable from the figure: TIFF, JPEG, JP2, XML, ASCII, HTML, GIF, BIL image stack, IDS, ADS; image processing tool, web browser, biomedical volumetric viewer.]
For example, these five instances of TIFF files will probably be “preserved” in several different ways, depending on the nature of the server-side AIP-to-DIP conversion (some conversions are transformative, others are not) and on the type of client-side agent responsible for rendering.