DRS 2
One in a series of periodic updates
Harvard University Library
Andrea Goethals
October 21, 2009
DRS = Digital Repository Service

Agenda
1. DRS 2 context
2. DRS vs DRS 2
3. Current work: DRS 2.1
4. Next set of work: DRS 2.2
5. Questions & comments


1. DRS 2 context …

HUL’s Digital Preservation Program
- A continuation of HUL’s mission to provide current and future access to research materials and resources, with recognition that preserving access to digital content requires different strategies, tools and skills
- Centerpiece of the preservation program: the DRS

Shapers of the DRS
- Digital Preservation Community
  - Best practices, standards, lessons learned, experiments
  - Collaborative projects, member organizations, interest groups, meetings, conferences, correspondence, conversations, shared tool development
- Harvard needs
  - Increasing amount of digital content
  - New formats and genres, born-digital material

DRS growth has been fueled by large projects…
- Require services to store, preserve, manage, make discoverable, etc.
- Bring new requirements
- Support changing user expectations
  - Print on demand, e-readers

DRS growth
[Chart: DRS size in TB (0 to 120) by year, Jun-01 through Jun-09]
10/1/09: 118 TB in the DRS (counting all backups: 378 TB)


2. DRS vs DRS 2

DRS
Set of professionally managed services
- preservation planning & activities, administration, management tools
- creation/acquisition
- creation & format guidelines, training, ingest service
- storage & monitoring service
- delivery services, access restrictions, persistent names
- use

DRS 2
Same services, but much improved
- preservation planning & activities, administration, management tools
- creation/acquisition
- creation & format guidelines, training, ingest service
- storage & monitoring service
- delivery services, access restrictions, persistent names
- use

DRS 2 Improvements
- revamped management tools, adding reporting, more preservation planning
- creation/acquisition
- more guidelines, acceptance of more formats and metadata
- richer data model, more robust and scalable storage system, better monitoring and recovery processes
- additional access restrictions, redundant delivery servers, additional delivery services
- use


3. Current work: DRS 2.1

DRS 2.1 Scope
1. Redesign of conceptual foundation
   - Modified data model
   - Content models
   - Object descriptors
   - New and different metadata schemas
2. Release to a QA environment
   - New and enhanced tools for creation and deposit of objects, for depositor testing

Modified Data Model
Current DRS: file level
- All metadata is associated at the file level
- All management has to be done at the individual file level
  - Even if the same metadata applies to a group of files
- Non-intuitive and unwieldy
DRS 2: adding 2 more levels
- objects
- (files)
- bitstreams

Objects?
- Aggregations of files that together represent a coherent unit of content
  - All the files that make up a single digital book
  - All the master and use copies representing a single photograph
- Useful for management, reporting and searching
  - “How many PDS document objects do I have in the DRS?”
- Hook for new metadata
  - Administrative categories (projects, exhibits, collections, etc.)
  - Descriptive metadata, catalog records

Bitstreams?
- A subset of a file
- Hooks for metadata that apply to part but not all of the file
  - To characterize the audio portion of a video file
  - To describe the contents of a ZIP file
- Allow fine-grained description and management
- May save storage space
  - Some types of content can remain compressed and still be described

Content models
- Object types
- Define
  - valid file formats and relationships
  - known delivery and rendering applications
  - associated assessments and preservation plans
- Enforce conformity: we know what we have
- Tie directly to technology watches and preservation plans

DRS 2.1 content models – deposit & delivery
1. Still image: image objects, delivered by IDS
2. PDS document: page-turned documents, delivered by PDS
3. Document: initially just PDF files, delivered by FDS
4. Opaque: files in any format
5. Text: text, XML, etc., delivered by FDS

Object descriptors
- A METS metadata file per object, on the file system alongside content files
- Descriptive, administrative, preservation, technical and structural metadata
- Describes the object, all its files and bitstreams, and related significant events
- Gives the metadata the same secure storage as the content files
- Self-contained, portable objects

Peering into a METS object descriptor
- For the object: MODS, PdsMD (for PDS document objects)
- For the object, each file and bitstream: PREMIS, HulAdminMD
- For each applicable file and bitstream: MIX, TextMD, DocumentMD, …

Deposit tools
Currently:
- BatchBuilder
- DRS Loader
DRS 2.1:
- Enhanced BatchBuilder
- New! File Information Tool Set (FITS)
- New! Object Tool Set (OTS)
- Enhanced DRS Loader
- New! DRS Services

Enhanced BatchBuilder
- Will build batches of objects rather than batches of files
- Will automatically determine all technical metadata (using FITS)
- Will automatically create all object descriptors (using OTS)

DRS Services
- New back-end service to centralize and control access to DRS objects
- Simplifies front-end applications
- Secures content and metadata
- DRS 2.1 services
  1. Object ingest
  2. File delivery

June 2010: QA release to depositors
- Depositors will be able to test new workflows in QA
  - New BatchBuilder and DRS Loader to create and deposit objects into the DRS
  - Enhanced IDS, FDS and PDS to view the deposited content


4. Next set of work: DRS 2.2

DRS 2.2 Scope
- DRS Web Admin
  - Easier discovery, batch updates, reporting, etc.
  - Repository administration and monitoring
- Additional content models
  - Audio, Web Harvest, Dark PDS Document, various Google, MOA2 document, Biomedical Image, Target Image and Email
- Improved audio support
  - MP3, MP4/AAC
  - BatchBuilder support
- Rights and access management metadata
  - Rights metadata stored in DRS with content
  - Analysis of need for more granular access restrictions

June 2011: Production release
- Creation, deposit and management of objects
- All delivery services integrated with the DRS Services
- All DRS files will have been migrated to objects

Many people in OIS working on DRS 2
- Digital Library Projects Group
- Systems Operations Group
- Systems Development Group
- Metadata Analyst

More information
- HUL’s Digital Preservation Program: http://hul.harvard.edu/ois/digpres/
- DRS 2 Enhancements: http://hul.harvard.edu/ois/systems/drs/enhancements.html
- andrea_goethals@harvard.edu


5. Questions & Comments
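
Sketch: objects, files, bitstreams and content models

The three-level data model from the "Modified Data Model" slides, combined with the idea that a content model defines which file formats are valid, could be sketched roughly as below. This is an illustration only, not the DRS 2 implementation: the class names, field names, format labels and the `CONTENT_MODELS` table are all assumptions. The slides say only that the Document model initially accepts PDF files and that Opaque accepts files in any format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Bitstream:
    """A subset of a file, e.g. the audio portion of a video file."""
    label: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class RepoFile:
    """A single stored file (the only level in the current DRS)."""
    name: str
    fmt: str  # hypothetical short format label such as "pdf"
    metadata: Dict[str, str] = field(default_factory=dict)
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class RepoObject:
    """An aggregation of files that together form one coherent unit."""
    object_id: str
    content_model: str
    metadata: Dict[str, str] = field(default_factory=dict)
    files: List[RepoFile] = field(default_factory=list)


# Hypothetical format rules; None stands for "any format is accepted".
CONTENT_MODELS: Dict[str, Optional[Set[str]]] = {
    "Document": {"pdf"},
    "Opaque": None,
}


def invalid_files(obj: RepoObject) -> List[str]:
    """Return the names of files whose format the content model rejects."""
    if obj.content_model not in CONTENT_MODELS:
        raise ValueError(f"unknown content model: {obj.content_model}")
    allowed = CONTENT_MODELS[obj.content_model]
    if allowed is None:
        return []
    return [f.name for f in obj.files if f.fmt not in allowed]


def count_objects(objects: List[RepoObject], content_model: str) -> int:
    """Answer questions such as 'how many Document objects do I have?'."""
    return sum(1 for o in objects if o.content_model == content_model)
```

Metadata set once on a `RepoObject` (for example a project name) applies to the whole group of files, which is the management convenience the object level is meant to provide.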
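
Sketch: skeleton of a METS object descriptor

To make the "Object descriptors" slides more concrete, the following sketch builds a bare METS skeleton for one object using Python's standard library. It uses only generic METS elements (`dmdSec`, `amdSec`, `fileSec`, `structMap`); the function name, the ID scheme and the choice of sections are assumptions and are not meant to reflect the actual layout of DRS 2 descriptors. The metadata wrappers are left empty where MODS or PREMIS content would go.

```python
import xml.etree.ElementTree as ET
from typing import List

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_descriptor(object_id: str, file_names: List[str]) -> ET.Element:
    """Build a minimal METS tree for one object (hypothetical layout)."""
    mets = ET.Element(q("mets"), {"OBJID": object_id})

    # Object-level descriptive metadata (MODS content would go in here).
    dmd = ET.SubElement(mets, q("dmdSec"), {"ID": "dmd-1"})
    ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "MODS"})

    # Preservation metadata (PREMIS content would go in here).
    amd = ET.SubElement(mets, q("amdSec"), {"ID": "amd-1"})
    prov = ET.SubElement(amd, q("digiprovMD"), {"ID": "prov-1"})
    ET.SubElement(prov, q("mdWrap"), {"MDTYPE": "PREMIS"})

    # File inventory plus a structural map pointing at each file.
    group = ET.SubElement(ET.SubElement(mets, q("fileSec")), q("fileGrp"))
    struct = ET.SubElement(mets, q("structMap"))
    div = ET.SubElement(struct, q("div"), {"DMDID": "dmd-1", "ADMID": "amd-1"})
    for i, name in enumerate(file_names, start=1):
        file_id = f"file-{i}"
        entry = ET.SubElement(group, q("file"), {"ID": file_id})
        ET.SubElement(entry, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name})
        ET.SubElement(div, q("fptr"), {"FILEID": file_id})
    return mets
```

Calling `ET.tostring(build_descriptor("obj-1", ["page1.tif"]), encoding="unicode")` serializes the tree to XML text that could be stored alongside the content files.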