ALA Annual 2013
ALCTS PARS Intellectual Access to Preservation Metadata

PREMIS: To Be or Not To Be in My METS
The Preservation Journey at the University of Connecticut Libraries

Digital Preservation
“Digital preservation combines Policies, Strategies, and Actions that ensure access to digital content over time.”
--ALA/ALCTS/PARS Short Definition

TRAC & PREMIS
• Policies:
  • Compliance (if not certification) with TRAC/CCSDS 652.0-M-1 (“Magenta Book”)
  • Requires a set of policies
• Strategies:
  • Fixity checking: “4.4.1.2: The repository shall actively monitor the integrity of AIPs.”
  • Remotely replicated copies
• Actions:
  • Fixity checking, at ingest and over time
  • Record fixity “hashes” or “message digests” in PREMIS
  • Replace fixity failures with good copies

Current Landscape
Currently, the University of Connecticut Libraries relies on a number of solutions that incorporate various “levels” of preservation:
• CONTENTdm
• Digital Commons
• Archivists’ Toolkit
• Archivematica
• UCL’s AMFS (Archival Master File Service)
In 2011, a new team, the Second Generation Digital Library Services Working Group (2G), was created to investigate alternatives to these solutions that would apply a more consistent preservation mission across all of the Libraries’ digital collections.

Timeline of Events for 2G
Fall 2011
• Creation of the Second Generation Digital Library Services Working Group
Spring & Summer 2012
• Initial Questions on Metadata
• Metadata Standards & Normalization
• Metadata & Our Content Model: Where Does Metadata Live?
Fall 2012
• Islandora
• Role of Metadata & Islandora
Spring & Summer 2013
• METS Profile Registration
• Playing with metadata
• Re-conceptualize Islandora as a presentation layer
Fall 2013 & Beyond
• Investigations into Handling PREMIS Events, RDF & Linked Open Data

Fall 2011
• In the search for alternatives to our current solutions, selected Fedora Digital Repository
• Began investigating ingest scenarios
  • How will content creators submit content and associated metadata?
  • How will metadata be structured and organized?
  • Will content and metadata be in shareable and/or proprietary formats?
  • Where do content and associated metadata live in Fedora?
  • Do we work with SIPs, and what would these SIPs look like?
  • Do we follow others and rely on METS? How do we use METS?

Spring & Summer 2012
UCL’s Content Model (CM) for Fedora
• Needed to decide whether to “lump” or “split” our architecture
• Decision was made to “split” our content model into 3 different levels of related Fedora Digital Objects

UCL’s CM, Metadata and Content
• Our “atomistic” CM means a more “atomistic” approach to metadata
• Grouping Object Level: acts as the highest level to group like objects
• Container Object Level: refers to the type of a specific resource, such as an image vs. a letter
• Media Object Level: contains the actual digital content, such as the jpg or pdf
• Metadata can live at any of our 3 levels, that is, as a data stream in any Fedora Digital Object at the grouping, container, or media object level

Metadata Standards and Normalization (METS)
• We wanted the ability to process a variety of metadata (technical, descriptive, preservation, administrative, structural) at ingest, so we needed a way to “normalize” ingested metadata in order to create and/or update data streams in the appropriate Fedora Digital Objects
• We chose METS

Fall 2012
METS, aka the Uberset
• What is the Uberset?
• The role of the METS Uberset file
• Our ideal role for metadata and, in particular, preservation metadata
Islandora
• Seen early on as an administrative model and presentation layer for our Fedora Digital Repository

Spring & Summer 2013
Development of the METS Uberset file
• What are the minimal requirements for metadata?
• How do these minimal requirements apply across the 3 different levels of related Fedora Digital Objects?
• What is the role of METSRights in relation to the other rights statements in the descriptive metadata?
• What do we do with the technical metadata?
• Where do we get our initial PREMIS data, and where does that go in the METS Uberset file?

Spring & Summer 2013: Issues Encountered
• Development of Islandora
  • Encountered conflicting content models
  • Decision to use Islandora as a presentation layer but not as an administrative model
• Metadata in the METS Uberset file
  • Problem with the technical metadata from Archivematica
  • Problem with PREMIS as it was transformed from the METS Uberset file to a data stream
    • Inconsistent
    • Logical loop created
• Other issues with METS Uberset files
  • Workarounds for the problems above required METS Uberset files to be rewritten several times
  • Large METS Uberset files, which caused transformation problems and slow transformation times
  • Too much repetition in METS Uberset files

Summer 2013
• Decision to move away from the METS Uberset file as a tool to create and/or update data streams
• From metadata “lumpers” to “splitters”
• Notion of Metadata “Modules” as
  • Flexible
  • Re-usable
  • Interoperable
  • Ability to add directly into FOXML and Fedora data streams
  • Unique specifications (best practices, standards, policies, forms, etc.)
  • Parts that can be packaged a number of different ways, including as our METS Uberset file if needed

Fall 2013 & Our Next Steps
• Refine specifications for metadata modules
  • Descriptive metadata (Simple Dublin Core, MODS for now)
  • Rights metadata (METSRights for now)
  • Administrative metadata (Fedora)
  • Structural metadata (Fedora, RELS-EXT, RELS-INT)
  • Technical metadata (via FITS)
  • Preservation metadata (normalized from technical metadata, with the one event, “ingest,” for now)
• Investigate how to handle PREMIS beyond the ingest event
• Investigate RDF and Linked Open Data

Thank you
Jennifer Eustis, Jennifer.eustis@lib.uconn.edu
David Lowe, David.lowe@lib.uconn.edu
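
Supplementary sketch: fixity checking
The three fixity actions on the TRAC & PREMIS slide (check at ingest and over time, record the message digest, replace failures with good copies) can be illustrated roughly as follows. This is a minimal sketch, not the Libraries' implementation: the function names, the choice of SHA-256, and the idea of a single local replica path are all assumptions made for illustration. The dictionary keys borrow the names of the PREMIS fixity semantic units, but the record is a plain Python dict, not PREMIS XML.

```python
import hashlib
import shutil


def message_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(path, algorithm="sha256"):
    """Build a fixity record at ingest (hypothetical helper).

    The keys echo PREMIS semantic unit names; serializing to PREMIS
    is out of scope for this sketch.
    """
    return {
        "messageDigestAlgorithm": algorithm,
        "messageDigest": message_digest(path, algorithm),
    }


def check_and_repair(path, fixity, replica):
    """Re-check a file against its recorded digest (hypothetical helper).

    Returns "pass" if the file still matches, "repaired" if it failed and
    was overwritten from a replica that does match, or "failed" if the
    replica is bad as well.
    """
    algorithm = fixity["messageDigestAlgorithm"]
    expected = fixity["messageDigest"]
    if message_digest(path, algorithm) == expected:
        return "pass"
    if message_digest(replica, algorithm) == expected:
        shutil.copyfile(replica, path)
        return "repaired"
    return "failed"
```

In use, record_fixity would run once at ingest and check_and_repair on a schedule afterward; each outcome is the kind of result that could later be logged as a PREMIS event once events beyond “ingest” are handled.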