Report on Preservation of ETDs: The LOCKSS Prototype
The work of Kamini Santhanagopalan, Virginia Tech Graduate Student in Computer Science
Reported at the 9th International Symposium on ETDs, Quebec City
Presented By: Gail McMillan, Director, Digital Library and Archives, Virginia Tech
Agenda
  Goals
  What is LOCKSS?
  Participating Universities
  International ETD Preservation
  Analysis and Results
  Conclusion
Digital Preservation
Goal: Information should be
  Readable
  Usable in the future
Preservation is NOT just backup
Existing preservation techniques
  Floppy, CD and hard disk drives
  Central and distributed database servers
Technical Infrastructure Goals
  Build on the successful LOCKSS open-source model
  Create a dark archive for locally produced digital content
  Use off-the-shelf hardware
  Use open-source software
  Easy replication
  Demonstrate LOCKSS scalability
LOCKSS
Lots of Copies Keep Stuff Safe
  Peer-to-peer digital preservation system
  Open source software
  Turns an inexpensive desktop computer into a digital preservation appliance
  Easy, inexpensive way to
    Collect
    Store
    Preserve
    Provide access to the contents (or not)
Functions of LOCKSS (1)
Collect
  Via a web crawler
  Appropriate crawl rules are specified
Preserve and Audit
  Every institution preserves
    Its own contents
    Contents of partner universities
  Contents are polled to determine authenticity and reinstate bad files
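
The polling step above can be pictured as a vote over content hashes. What follows is a minimal sketch under stated assumptions, not the actual LOCKSS polling protocol: every peer is assumed to hold a copy of the same file, the digest held by a strict majority is treated as authentic, and any disagreeing copy is replaced with a good one. The function name `audit_and_repair` and the sample bytes are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a stored copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Poll all peers, pick the majority digest, and repair disagreeing copies.

    `copies` maps a (hypothetical) peer name to the bytes it holds.
    Returns the names of the peers whose copy was repaired.
    Raises ValueError when no digest has a strict majority.
    """
    votes = Counter(digest(content) for content in copies.values())
    winning_digest, count = votes.most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no majority; cannot decide which copy is authentic")

    # Any peer that agrees with the majority can supply the good copy.
    good_copy = next(c for c in copies.values() if digest(c) == winning_digest)

    repaired = []
    for peer, content in copies.items():
        if digest(content) != winning_digest:
            copies[peer] = good_copy  # reinstate the damaged file
            repaired.append(peer)
    return repaired


peers = {
    "M1": b"thesis.pdf bytes",
    "M2": b"thesis.pdf bytes",
    "M3": b"thesis.pdf byt\x00s",  # simulated damage on one peer
    "M4": b"thesis.pdf bytes",
    "M5": b"thesis.pdf bytes",
}
repaired_peers = audit_and_repair(peers)
print(repaired_peers)
```

The sketch shows why more copies help: with only two peers a disagreement cannot be resolved, while five peers can outvote a single damaged copy.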
Functions of LOCKSS (2)
Provide access
  By running web proxies
  Open or restricted access
  Dark Archives for partners' ETDs
  Levels of access controlled at originating institutions
Administration
  Via a web user interface
  Controlling access to cached contents and other functions
LOCKSS Preservation
  Contents of each university (nodes M1 through M5) preserved at every other university
  Multiple, dispersed copies
  Not a backup: nothing is overwritten
  All versions retained
[Diagram: nodes M1 through M5]
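
The two ideas on this slide, full replication across nodes and retention of every version, can be illustrated with a short sketch. It is an illustration only and makes no claim about how LOCKSS stores content internally; `replication_plan` and `VersionedStore` are hypothetical names, and the URL is a placeholder.

```python
def replication_plan(nodes: list[str]) -> dict[str, list[str]]:
    """Map each node to the other nodes that hold a copy of its content."""
    return {owner: [n for n in nodes if n != owner] for owner in nodes}


class VersionedStore:
    """Append-only store: saving a new version never overwrites an old one."""

    def __init__(self) -> None:
        self._versions: dict[str, list[bytes]] = {}

    def save(self, url: str, content: bytes) -> None:
        self._versions.setdefault(url, []).append(content)

    def versions(self, url: str) -> list[bytes]:
        return list(self._versions.get(url, []))


nodes = ["M1", "M2", "M3", "M4", "M5"]
plan = replication_plan(nodes)
# Remote copies plus the original held by the owning node.
copies_per_collection = len(plan["M1"]) + 1

store = VersionedStore()
store.save("http://etd.example.edu/etd-0001.pdf", b"first deposit")
store.save("http://etd.example.edu/etd-0001.pdf", b"corrected deposit")
print(plan["M1"], copies_per_collection, len(store.versions("http://etd.example.edu/etd-0001.pdf")))
```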
ASERL-LOCKSS-ETD Initiative
  Florida State University
  Georgia Institute of Technology
  University of Kentucky
  University of Tennessee
  Vanderbilt University
  Virginia Polytechnic Institute and State University
http://www.aserl.org/
Preservation using LOCKSS
Prerequisites
  Minimum hardware configuration
  LOCKSS software installed on all participating partners' systems
  Permissions for the LOCKSS system to collect, preserve, periodically validate, and repair ETDs
Example Hardware Configuration
Enterprise (3 TB)
  Dell PowerEdge 1850 server, LOCKSS       $3,500
  Dell PowerEdge 1850 server, firewall     $2,500
  Dell/EMC AX100 SAN (3 TB)               $10,000
  RedHat Enterprise AS, 2 @ $50              $100
  UPS                                        $700
  Grand total                             $16,800
  Server rack                              $1,200
  Grand total with rack                   $18,000
Desktop (200 GB)
  Intel-based desktop, LOCKSS (200 GB)       $500
  Intel-based desktop, firewall              $350
  CentOS Linux                                 $0
  UPS                                         $50
  Grand total                                $900
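
As a quick arithmetic check, the totals on this slide can be recomputed from the listed line items. The dictionary keys below are shorthand labels chosen for this sketch; the prices are the ones on the slide.

```python
# Line items in US dollars, as listed on the slide.
enterprise = {
    "PowerEdge 1850 (LOCKSS)": 3500,
    "PowerEdge 1850 (firewall)": 2500,
    "Dell/EMC AX100 SAN (3 TB)": 10000,
    "RedHat Enterprise AS (2 @ $50)": 2 * 50,
    "UPS": 700,
}
server_rack = 1200

desktop = {
    "Intel-based desktop (LOCKSS, 200 GB)": 500,
    "Intel-based desktop (firewall)": 350,
    "CentOS Linux": 0,
    "UPS": 50,
}

enterprise_total = sum(enterprise.values())
enterprise_with_rack = enterprise_total + server_rack
desktop_total = sum(desktop.values())

print(enterprise_total, enterprise_with_rack, desktop_total)
```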
Participating Universities
International universities
  Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  Humboldt-Universität, Germany
  University of Cape Town, South Africa
US universities
  Florida State University
  Georgia Tech
  Virginia Tech
International ETD Preservation (1)
For international universities
  Kamini Santhanagopalan (KS) wrote plug-ins to collect contents (ETDs) from the three universities
For US universities
  Verified and reused OAI plug-ins for the three universities
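
For readers unfamiliar with OAI-PMH, the sketch below shows roughly what a harvest request and response look like. It is not LOCKSS plug-in code: it only builds a standard `ListRecords` request URL and extracts record identifiers from a response. The base URL `http://etd.example.edu/oai` and the sample response are made up for illustration; the `verb` and `metadataPrefix` parameters and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml: str) -> list[str]:
    """Extract the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    found = root.findall(".//oai:header/oai:identifier", OAI_NS)
    return [element.text for element in found]


# Hypothetical, heavily trimmed response used instead of a network call.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:etd.example.edu:etd-0001</identifier></header></record>
    <record><header><identifier>oai:etd.example.edu:etd-0002</identifier></header></record>
  </ListRecords>
</OAI-PMH>
"""

request_url = list_records_url("http://etd.example.edu/oai")
identifiers = record_identifiers(SAMPLE_RESPONSE)
print(request_url)
print(identifiers)
```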
International ETD Preservation (2)
Example ETD collection
  University of Cape Town ETD collection
  Manifest (i.e., permissions) page: http://pubs.cs.uct.ac.za/lockss/manifest.html
Screen shots of the UCT plug-in and the crawl results of contents follow
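
A manifest page grants the crawler permission to collect a site. The sketch below shows one way a harvester might check for such a grant before crawling. The exact wording of `PERMISSION_STATEMENT` is an assumption made for this example and should be checked against the LOCKSS documentation; `has_permission` and both sample pages are hypothetical.

```python
import re

# Assumed wording of the permission statement; verify against LOCKSS documentation.
PERMISSION_STATEMENT = (
    "LOCKSS system has permission to collect, preserve, and serve this Archival Unit"
)


def has_permission(manifest_html: str) -> bool:
    """Return True when the manifest page text contains the permission statement."""
    text = re.sub(r"<[^>]+>", " ", manifest_html)  # drop HTML tags
    text = " ".join(text.split())                  # collapse whitespace and line breaks
    return PERMISSION_STATEMENT.lower() in text.lower()


# Hypothetical manifest pages used instead of fetching a live URL.
granting_page = """
<html><body>
<p>LOCKSS system has permission to collect,
preserve, and serve this Archival Unit</p>
<a href="etd/">ETD collection</a>
</body></html>
"""
silent_page = "<html><body><a href='etd/'>ETD collection</a></body></html>"

print(has_permission(granting_page), has_permission(silent_page))
```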
University of Cape Town Plug-in (1)
UCT plug-in: crawl results with
• Level (depth) = 4
• Fetch delay = 6 seconds
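
The two crawl settings on this slide, a depth of 4 and a fetch delay of 6 seconds, can be illustrated with a small breadth-first crawler. This is a sketch, not the LOCKSS crawler: it assumes the start page counts as level 1, it walks an in-memory link table (`SITE`, made up for this example) instead of the network, and the `sleep` parameter is injectable so the example runs instantly.

```python
import time
from collections import deque
from typing import Callable


def crawl(
    start_url: str,
    get_links: Callable[[str], list[str]],
    max_depth: int = 4,
    fetch_delay: float = 6.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Fetch pages breadth-first down to max_depth, pausing between fetches."""
    seen = {start_url}
    queue = deque([(start_url, 1)])  # assumption: the start page is level 1
    fetched: list[str] = []
    while queue:
        url, depth = queue.popleft()
        if fetched:
            sleep(fetch_delay)  # be polite to the server between fetches
        links = get_links(url)
        fetched.append(url)
        if depth < max_depth:  # pages at max_depth are fetched but not expanded
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return fetched


# Hypothetical link structure: manifest -> volume -> ETD pages -> PDFs -> appendix.
SITE = {
    "manifest": ["volume"],
    "volume": ["etd-1", "etd-2"],
    "etd-1": ["etd-1.pdf"],
    "etd-2": ["etd-2.pdf"],
    "etd-1.pdf": ["appendix"],
    "etd-2.pdf": [],
    "appendix": [],
}

delays: list[float] = []
fetched = crawl("manifest", lambda url: SITE.get(url, []), sleep=delays.append)
print(fetched)
```

In this made-up site the appendix sits at level 5, so a depth of 4 leaves it uncollected; choosing the depth is therefore part of writing a plug-in for a given collection.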
Harvested International ETD Collections
Harvested American ETD Collection
[source: http://lockss-etd.lib.vt.edu:8081/DaemonStatus ]
Tutorial on how to write plug-ins
KS developed a mini-tutorial
  http://scholar.lib.vt.edu/lockss/introduction.htm
  10 screens
This tutorial can be
  Generalized for ETD plug-ins
  Extended to write OAI plug-ins
Conclusion and Future Work
  International ETDs can be harvested and preserved using LOCKSS and OAI-PMH
  It requires cooperation and collaboration from participating universities
Future Work
  An online portal open for the public to view certain details
  Brazil expressed interest in formalizing ETD preservation for the NDLTD using LOCKSS
Acknowledgements
Special thanks to LOCKSS (Stanford University)
  Thomas Robertson
  Seth Morabito
Thanks to all participating universities
  Florida State
  Georgia Tech
  Humboldt-Universität, Germany
  Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  University of Cape Town, South Africa
  Virginia Tech
Send Questions/Comments to ksanthan@vt.edu