Report on Preservation of ETDs: The LOCKSS Prototype
The work of Kamini Santhanagopalan, Virginia Tech Graduate Student in Computer Science
Reported at the 9th International Symposium on ETDs, Quebec City
Presented by: Gail McMillan, Director, Digital Library and Archives, Virginia Tech

Agenda
• Goals
• What is LOCKSS?
• Participating Universities
• International ETD Preservation
• Analysis and Results
• Conclusion

Digital Preservation
• Goal: information should be
  • Readable
  • Usable in the future
• Preservation, NOT just backup
• Existing preservation techniques
  • Floppy, CD and hard disk drives
  • Central and distributed database servers

Technical Infrastructure Goals
• Build on the successful LOCKSS open-source model
• Create a dark archive for locally produced digital content
• Use off-the-shelf hardware
• Use open-source software
• Easy replication
• Demonstrate LOCKSS scalability

LOCKSS
• Lots of Copies Keep Stuff Safe
• Peer-to-peer digital preservation system
• Open-source software
• Turns an inexpensive desktop computer into a digital preservation appliance
• Easy, inexpensive way to
  • Collect
  • Store
  • Preserve
  • Provide access to the contents, or not
Functions of LOCKSS (1)
• Collect
  • Via a web crawler
  • Appropriate crawl rules are specified
• Preserve and Audit
  • Every institution preserves
    • Its own contents
    • Contents of partner universities
  • Contents are polled to determine authenticity and reinstate bad files

Functions of LOCKSS (2)
• Provide access
  • By running web proxies
  • Open or restricted access
  • Dark archives for partners’ ETDs
  • Levels of access controlled at originating institutions
• Administration
  • Via a web user interface
  • Controlling access to cached contents and other functions

LOCKSS Preservation
• Contents of each university (nodes M1 through M5) preserved at every other university
• Multiple, dispersed copies
• Not a backup: nothing is overwritten
• All versions retained
[Diagram: peer nodes M1, M2, M3, M4, M5]

ASERL-LOCKSS-ETD Initiative
• Florida State University
• Georgia Institute of Technology
• University of Kentucky
• University of Tennessee
• Vanderbilt University
• Virginia Polytechnic Institute and State University
http://www.aserl.org/

Preservation using LOCKSS
• Prerequisites
  • Minimum hardware configuration
  • LOCKSS software installed on all participating partners’ systems
  • Permissions for the LOCKSS system to collect, preserve, periodically validate, and repair ETDs

Example Hardware Configuration
Enterprise (3 TB)
• Dell PowerEdge Server 1850, LOCKSS: $3,500
• Dell PowerEdge Server 1850, firewall: $2,500
• Dell/EMC AX100 SAN (3 TB): $10,000
• Red Hat Enterprise AS, 2 @ $50: $100
• UPS: $700
• Server rack: $1,200
• Grand total: $16,800.00 (with rack: $18,000.00)
Desktop (200 GB)
• Intel-based desktop, LOCKSS (200 GB): $500
• Intel-based desktop, firewall: $350
• CentOS Linux: $0
• UPS: $50
• Grand total: $900.00

Participating Universities
• International universities
  • Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  • Humboldt-Universität, Germany
  • University of Cape Town, South Africa
• US universities
  • Florida State University
  • Georgia Tech
  • Virginia Tech

International ETD Preservation (1)
• For international universities: KS wrote plug-ins to collect contents (ETDs) from the 3 universities
• For US universities: verified and reused OAI plug-ins for the 3 universities

International ETD Preservation (2)
• Example ETD collection: University of Cape Town ETD collection
• Manifest (i.e., permissions) page: http://pubs.cs.uct.ac.za/lockss/manifest.html
• Screen shots of the UCT plug-in and the crawl results of contents follow

University of Cape Town Plug-in (1)
[Screen shot]

UCT Plug-in: Crawl Results with
• Level (depth) = 4
• Fetch delay = 6 seconds

Harvested International ETD Collections
[Screen shot]

Harvested American ETD Collection
[source: http://lockss-etd.lib.vt.edu:8081/DaemonStatus]

Tutorial on how to write plug-ins
• KS developed a mini-tutorial: http://scholar.lib.vt.edu/lockss/introduction.htm
• 10 screens
• This tutorial can be
  • Generalized for ETD plug-ins
  • Extended to write OAI plug-ins

Conclusion and Future Work
• International ETDs can be harvested and preserved using LOCKSS and OAI-PMH
• It requires cooperation and collaboration from participating universities
• Future Work
  • An online portal open for the public to view certain details
  • Brazil expressed interest in formalizing ETD preservation for the NDLTD using LOCKSS

Acknowledgements
• Special thanks to LOCKSS (Stanford University)
  • Thomas Robertson
  • Seth Morabito
• Thanks to all participating universities
  • Florida State
  • Georgia Tech
  • Humboldt-Universität, Germany
  • Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  • University of Cape Town, South Africa
  • Virginia Tech

Send Questions/Comments to ksanthan@vt.edu
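
Supplementary sketch: the two crawl settings reported for the UCT plug-in (level/depth = 4 and fetch delay = 6 seconds) can be illustrated with a small depth-limited, rate-limited crawl. This is only a sketch under stated assumptions, not the LOCKSS crawler or the UCT plug-in itself: the link graph `SITE`, the page names, the function `crawl`, and the choice to count the start page as level 1 are all hypothetical, and pages are looked up in memory rather than fetched over the network.

```python
import time
from collections import deque
from typing import Callable, Dict, List

# Hypothetical link graph standing in for an ETD site (not real UCT pages).
SITE: Dict[str, List[str]] = {
    "manifest": ["index"],
    "index": ["etd-1", "etd-2"],
    "etd-1": ["etd-1.pdf"],
    "etd-2": ["etd-2.pdf"],
    "etd-1.pdf": ["deep"],  # "deep" would be level 5, beyond the limit
}


def crawl(
    start: str,
    links: Dict[str, List[str]],
    max_depth: int = 4,          # "Level (depth) = 4" from the slide
    fetch_delay: float = 6.0,    # "Fetch delay = 6 seconds" from the slide
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Breadth-first crawl; the start page is assumed to be level 1."""
    seen = {start}
    fetched: List[str] = []
    queue = deque([(start, 1)])
    while queue:
        page, depth = queue.popleft()
        if fetched:
            sleep(fetch_delay)   # wait between consecutive fetches
        fetched.append(page)     # stands in for downloading the page
        if depth >= max_depth:
            continue             # do not follow links past the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return fetched


# Record the delays instead of really sleeping, so the demo runs instantly.
delays: List[float] = []
pages = crawl("manifest", SITE, sleep=delays.append)
print(pages)
print(len(delays), "pauses of", fetch_delay_used := 6.0, "seconds")
```

With these assumed inputs, the crawl stops at the fourth level, so the page named "deep" is never collected, and one pause is taken before every fetch after the first. A longer delay is gentler on the originating university's server at the cost of a slower harvest.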