MetaArchive of Southern Digital Culture
Partners in the dispersed redundant dark archive
University Libraries at Emory, Auburn, Florida State, Georgia Tech, Louisville, Virginia Tech
LOCKSS: Lots of Copies Keep Stuff Safe
Library of Congress NDIIPP

The MetaArchive of Southern Digital Culture will create and develop a digital preservation network for critical and at-risk content relating to Southern culture and history. The partners in this initiative will select and preserve institutional digital archives, including institutionally relevant materials such as electronic theses and dissertations, as well as ephemeral works such as online exhibitions and cultural history displays. This digital content will include subjects that complement Library of Congress collections covering the Civil War, the civil rights movement, slave narratives, Southern music, handicrafts, and church history. These collections will encompass born-digital works containing existing works as well as newly created digital works produced for web and other archival purposes.

MetaArchive Project Goals
1. Create a conspectus of digital content within the subject domain held by the partners
2. Harvested body of the most critical content to be preserved (3 TB per institution)
3. Develop a model cooperative agreement for ongoing collaboration and sustainability
4. Distributed preservation network infrastructure based on the LOCKSS software

MetaArchive Deliverables
• Define the Scope of the Content
  – What is Southern digital culture?
  – What is “at risk”?
• Developing a conspectus: content selection
  – What collections will be preserved?
  – Metadata
    • Adaptations showing any unique or qualified tags
• Rights issues: harvesting for preservation vs. user access

Key Features of the MetaArchive of Southern Digital Culture
1. Distributed preservation strategy
2. Flexible organizational model
3. Formal content selection process
4. Capability for migrating archives
5. Dark archiving strategy
6. Low cost of deployment
7. Self-sustaining incentives
8. Simple preservation exchange mechanisms with the Library of Congress

MetaArchive Preservation Strategy
• Redundant Access
  – Bit Preservation
• Automated
  – Content Ingestion
  – MD5 Checksums
  – LOCKSS Polling Algorithm (Trust Relationships)
  – Distributed Copies
    • 200,000+ square mile area

MetaArchive NDIIPP Network via Internet2
[Figure: network map connecting University of Louisville, Virginia Tech, Emory University, Georgia Tech, Florida State University, and Auburn University, with points labeled NYC, CH, DC, IN, and ATL and a MAX connection to Virginia Tech. Legend: FL Lambda Rail, MAX Network, Abilene Network, SOX Network.]

MetaArchive Hardware
• Off-the-Shelf Strategy
  – Dell/Intel-based hardware
    • Could easily be HP, Sun, or other Intel-based hardware
    • Could be old desktops with large hard drives
  – New low-cost SATA SAN
    • EMC AX100
      – $4.00 per GB (already dropping in price)

MetaArchive Software
• Operating System
  – Red Hat Enterprise Linux AS v. 3/4
    • Ease of update management and experience with the OS
  – Could easily work on other versions of Linux
• Java SDK
• LOCKSS Content Ingestion/Replication
  – LOCKSS Daemon 1.8.3
  – Updates every 6-8 weeks via RPM
• Conspectus Database
  – MySQL/PHP Interface
  – Standalone System
• MetaArchive Collection Description Metadata Schema

MetaArchive Collection Description Metadata Schema
• Based on UKOLN RSLP Collection Description
  – Includes
    • LOCKSS Manifest Page
    • Risk Ranking
      – MA Ranking
    • OAI-PMH Data Provider URL

Collection-Level Conspectus Metadata Specification
• Access Rights
• Accrual Periodicity
• Accrual Policy
• Accumulation Date Range
• Alternative Title
• Associated Collection
• Associated Publication
• Bytes
• Cataloged Status
• Catalogue or description
• Collection Size
• Contents Date Range
• Creator
• Custodial History
• Description
• Format Characteristics
• Institution Collection Identifier
• Is Available Via
• Language
• LOCKSS Manifest Page
• Manifestation
• MetaArchive Collection Identifier
• OAI Provider
• Publisher
• Recommended Harvest Procedure
• Rights
• Risk Factors
• Risk Rank
• Spatial Coverage
• SubCollection
• Subject
• SuperCollection
• Temporal Coverage
• Title
• Type

MetaArchive Follows Standards
• OAIS Reference Model
  – LOCKSS Compliance
• OAI-PMH 2.0
  – Using as an alternative to the current LOCKSS AU strategy with ETDs
  – VaTech, GaTech, FSU
• UKOLN RSLP Collection Description
  – Basis for MetaArchive Conspectus
    • http://www.metaarchive.org/pdfs/conspectus_md_2005.html

MetaArchive Collaboration
• Kickstart Installations for Linux Servers
  – Easy to set up all hardware together exactly the same.
• Efficiency of Replication
  – Kickstart can be used with the production system as well as with any Intel-based machine.
• Communication
  – Telephone conference, video conference (I2), iVocalize Chat/VOIP Room, Wiki, PhpCollab
• Study issues
  – Dynamic content
  – Format migration

MetaArchive Rights Issues
Any use of protected works generally will need to:
  – fit within an exception to the exclusive rights of owners, such as the “fair-use” doctrine or other provisions relating specifically to library copying and other activities
  – undergo an investigation to determine whether the work still enjoys protection or has lapsed into the public domain due to notice or renewal defects
  – occur as a result of valid permission from the copyright owner(s)
  – constitute an acceptable risk for the institution in the potential absence of “clear” resolution

MetaArchive Approach to Rights Issues
• Who will make decisions?
• How will we investigate the current status of works that pre-date the 1976 Act?
• Define “acceptable” legal risk individually or across institutions.
• Does a “dark archive” lessen potential risks?
• How to identify owners and seek their permission?

MetaArchive: Beyond Rights Issues
• Rights of publicity
  – Names, images, likeness
• Rights of privacy
  – Potential for damage
• Common-law or state statutory protections
  – Apply to most restrictive jurisdiction?
  – Apply to widespread norm across jurisdictions?

MetaArchive: Beyond Rights Issues
• What is the practical meaning of “infringement” in the context of a “dark archive”?
• What kinds of limits currently exist in Sect. 108 that prevent or lessen its application to preservation efforts such as MetaArchive?
• How can the law address works that have no “owner” in a practical sense from which to acquire permission to preserve?

Contact Information
MetaArchive of Southern Digital Culture
  – Dwayne Buttler – dwayne.buttler@louisville.edu
  – Martin Halbert – mhalber@emory.edu
  – Robert H. McDonald – rmcdonal@mailer.fsu.edu
  – Gail McMillan – gailmac@vt.edu
http://www.metaarchive.org
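
Appendix sketch: MD5 checksums across distributed copies
The preservation strategy slide lists MD5 checksums and distributed copies as automated steps. The following is a minimal sketch of that idea only, assuming each site holds the same file and that a simple majority of matching digests identifies the good copy. It is not the LOCKSS polling algorithm, and the site names and function names are hypothetical.

```python
import hashlib
from collections import Counter


def md5_digest(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def find_damaged_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the sites whose copy disagrees with the most common digest.

    Assumption: the digest held by the largest number of sites is correct.
    On a tie, the digest seen first is treated as the reference.
    """
    digests = {site: md5_digest(content) for site, content in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(site for site, d in digests.items() if d != majority_digest)


# Hypothetical example: three sites, one of which holds a corrupted copy.
copies = {
    "site_a": b"collection data",
    "site_b": b"collection data",
    "site_c": b"collection dat\x00",
}
damaged = find_damaged_copies(copies)
print(damaged)
```

In this toy setup a site flagged as damaged would be repaired from one of the agreeing sites. A real network would also need to handle ties, unreachable sites, and trust between peers.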