DRS 2 Orientation
Harvard University Library
September 30, 2010
DRS = Digital Repository Service

Agenda
1. DRS 2
   1. Concepts (Andrea)
   2. New metadata (Robin)
   3. Overall schedule (Andrea)
2. BatchBuilder 2 demo (Vitaly)
3. Testing instructions (Vitaly)
4. Questions & comments

DRS 2 Concepts

DRS 1: everything's a file
[Diagram: a loose collection of individual files: TIFF, JP2 and JPEG image files, text files, METS XML files, a ZIP file and a PDF document file]
File level is not a meaningful level for curatorial uses…

Which DRS files make up my digital manuscript?
HOLLIS number 009412949
http://nrs.harvard.edu/urn-3:FHCL.HOUGH:1116980
http://pds.lib.harvard.edu/pds/view/6522882
DRS file ID = 6522882
[Diagram: the same collection of files, followed by the manuscript's files on their own: a METS XML file plus a TIFF image file and a JP2 image file for each of page 1 and page 2]

Objects
- Aggregations of files that together represent a coherent unit of content
  - All the files that make up a single digital book
  - All the master and use copies representing a single photograph
- Useful for management, reporting and searching
  - "How many PDS document objects do I have in the DRS?"

Objects
- New hook for metadata
  - Administrative categories (projects, exhibits, collections, etc.)
  - Descriptive metadata, catalog records
[Diagram labels: Object; Hollis # 009412949; Digital Medieval Manuscripts at Houghton Library; Moralia in Job: manuscript]

Content models
- Object types
- Define
  - valid file formats and relationships
  - known delivery and rendering applications
  - associated assessments and preservation plans
- Enforce conformity: we know what we have in the DRS and can monitor & preserve it

DRS 2.1 content models: deposit & delivery
1. Still image: image objects, delivered by IDS
2. PDS document: page-turned documents, delivered by PDS
3. Document: initially just PDF files, delivered by FDS
4. Opaque: files in any format
5. Text: text, XML, etc., delivered by FDS

Still image CM: print
- TIFF archival master
- Several derivative JPEG deliverables
- Derivative JPEG thumbnail
Example: Pope Joan Series: Illustration from Philippus Bergomensis, De Claribus Mulieribus. Ferrara, Rossi. Harvard Art Museum/Fogg Museum, Gift of Philip Hofer.

PDS document CM: book
- JP2 archival master / deliverable images per page …
- Plain text files per page
Example: Zoeller, Karl William. Merchandising the plumbing business. Chicago : Domestic Engineering Co., c1921. Baker Library.

Document CM: report
- PDF deliverable
Example: Intergovernmental Panel on Climate Change (IPCC) WG1 Fourth Assessment Report, Environmental Science and Public Policy Archives, Harvard College Library.

Opaque content model
- WordPerfect files, text files, PDF documents, etc.
- Plus documentation about the collection
Example: The contents of Judge Tragers’ hard drive, Harvard Law School Library.

Text CM: methodology
- Plain text file
Example: Processing methodology for Intergovernmental Panel on Climate Change (IPCC) documents, HCL Imaging Services.
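The idea that a content model defines valid file formats, so that conformity can be enforced, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ALLOWED_FORMATS`, `DrsObject` and `nonconforming_files` are hypothetical, and the format lists are guessed from the example slides above, not taken from the actual DRS 2 rules.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical format rules, inferred from the example slides only.
# None stands for "files in any format" (the Opaque content model).
ALLOWED_FORMATS: dict[str, Optional[frozenset[str]]] = {
    "still image": frozenset({"tiff", "jpeg"}),
    "pds document": frozenset({"jp2", "text"}),
    "document": frozenset({"pdf"}),
    "opaque": None,
    "text": frozenset({"text", "xml"}),
}


@dataclass
class DrsObject:
    """An object: a set of files that together form one unit of content."""
    content_model: str
    files: dict[str, str] = field(default_factory=dict)  # file name -> format


def nonconforming_files(obj: DrsObject) -> list[str]:
    """Return the names of files whose format the content model does not allow."""
    allowed = ALLOWED_FORMATS[obj.content_model]
    if allowed is None:  # Opaque: every format is accepted
        return []
    return sorted(name for name, fmt in obj.files.items() if fmt not in allowed)


book = DrsObject(
    "pds document",
    {"page1.jp2": "jp2", "page1.txt": "text", "notes.pdf": "pdf"},
)
print(nonconforming_files(book))  # ['notes.pdf']
```

Because the check runs per object rather than per file, a question such as "does this book conform to its content model?" has a direct answer, which is the point of moving from files to objects.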
New metadata

Object descriptors
- A METS metadata file per object on the file system alongside content files
- Descriptive, administrative, preservation, technical and structural metadata
- Describes the object, all its files and bitstreams and related significant events
- Gives the metadata the same secure storage as the content files
- Self-contained, portable objects

The move to standards
- PREMIS: for key preservation metadata, including
  - Events that affect content
  - Relationships that are not implicit
- MODS: for descriptive metadata
- Format-specific schemas for technical metadata, including
  - MIX for images
  - textMD for text
  - DocumentMD for PDF and other document formats
  - More to come…
- Supplemented by local administrative schemas

New local metadata
- adminCategory
- adminFlag
- captions, phase 2
  - Behavior, default, unit name, description for objects
- content model identification
- DRS URI
- isFirstGenerationInDrs
  - Closest to original capture
- isPreferredDeliverableSource

Changes to local metadata
- OwnerSuppliedName
  - Required for objects, optional for files
- Role
  - Instead of "purpose"; repeatable
- Quality
  - Repeatable for both objects and files
- Processing
  - Optional
- Methodology
  - Now for objects and files of all types

Tracking changes
- DRS 2 will keep track of
  - Changes that affect content
  - Troubleshooting content errors
  - Key administrative metadata
- Three types:
  - Events
  - Administrative flags
  - "Versioned" metadata elements
- Not tracking every metadata change

Events
- Object
  - creation
  - deletion / recovery from deletion
  - ingest
  - merge
- File
  - addition
  - deletion / recovery from deletion
  - integrity check confirmation
  - replacement
  - virus check confirmation

Other tracking
- Metadata where changes will be tracked:
  - Access Flag
  - Administrative Flag
  - Billing Code
  - Owner Code

What's inside a descriptor?
- Descriptive Metadata
  - MODS
- Administrative Metadata
  - For the object:
    - PREMIS (including relationships)
    - DRS administrative metadata
  - For each file:
    - PREMIS (including relationships)
    - Format-specific metadata
    - DRS administrative metadata
  - PREMIS Events
- Inventory of Files
- Structure Map

Overall schedule
- Available now: first release of BatchBuilder 2 for depositor training and testing
  - Supports 5 content models
- Fall 2010 to Summer 2011
  - BatchBuilder 2 enhancements & bug fixes
  - Web Admin 2 development and testing
- ~September 2011: BatchBuilder 2 and Web Admin 2 in production

BatchBuilder 2
- Will build batches of objects rather than batches of files
- Will automatically determine most technical metadata (using FITS)
- Will automatically create all object descriptors (using OTS)

BatchBuilder 1 compared with BatchBuilder 2

  BatchBuilder 1: Expects files and creates batches of files.
  BatchBuilder 2: Expects objects and creates batches of objects.

  BatchBuilder 1: Can use an existing PDS METS file for PDS objects.
  BatchBuilder 2: Can import a structmap from an "oldstyle" PDS METS file to create a PDS Document descriptor.

  BatchBuilder 1: Uses batch genres.
  BatchBuilder 2: Uses DRS Content Models.

  BatchBuilder 1: Uses a supplied HOLLIS ID to import contents of a HOLLIS record to a PDS METS Label.
  BatchBuilder 2: Uses a supplied HOLLIS ID to import contents of a HOLLIS record into the MODS section of the object descriptor.

  BatchBuilder 1: Batch level and directory level metadata entered in Batch Template panel.
  BatchBuilder 2: Object level and directory level metadata entered in Object Template.

  BatchBuilder 1: Project level metadata is entered in Administrative Properties panel.
  BatchBuilder 2: Project level metadata is entered in Deposit Settings panel.

  BatchBuilder 1: No depositor authorization. Anyone with access to the ftp dropbox can load batches.
  BatchBuilder 2: Depositor authorization. Only depositors with permission to load into a particular owner code can load batches into that owner code.

Testing instructions

Questions & Comments