Katherine Skinner, Emory University
Gail McMillan, Virginia Tech
NDIIPP Annual Partners Meeting
June 24, 2009

Central aim: to better understand the terrain of the emergent field of digital curation.
▪ How emergent is it?
▪ What trends are beginning to emerge within it?

MetaArchive 2009

Two surveys, 158 participants

ETD: December 2007-April 2008
▪ Universities and Colleges
▪ 96 Respondents
▪ Five Listservs: Association of Research Libraries, Association of Southeastern Research Libraries, Council of Graduate Schools, Digital Library Federation, and Electronic Theses and Dissertations

Cultural Memory: March 2009
▪ Archives, Museums, Libraries, Historical Societies, Government Agencies
▪ 62 Respondents
▪ Three Listservs: H-Museum, A&A-L (Society of American Archivists), and ERECS-L (Electronic Records Managers)

Who is collecting digital materials, what are they collecting, and how are they storing these materials?
Who seeks to preserve their digital collections and how do they want to preserve them?
What are the biggest barriers to preservation?
What are the most desired offerings in preservation?

Cultural Memory: 98.4% are collecting
▪ Range: 1 GB-20 TB, average 2 TB
▪ Average Growth: 540 GB/year
▪ Formats/Genres include: text (83%), video (76%), audio (75%), email (47%), databases (48%), websites (41%), and GIS material (36%) + scads more
▪ Repository structures include: home-grown (65%), CONTENTdm (17%), Fedora (9%), DSpace (7%), Access/Excel (6%), plus SRB, Filemaker, and 10 others

ETDs: 80% accept ETDs; 40% only accept ETDs
▪ Range: 22-60 GB, average 41 GB
▪ Average Growth: 4.5 GB/year
▪ Formats/Genres include: images (92%), applications (89%), audio (79%), text (64%), video (52%), and other (15%)
▪ Repository structures include: DSpace (31%), ETD-db (15%), Fedora (5%), Eprints (2%), as well as locally developed solutions (34%) and vendor-based solutions: bepress (6%), DigiTool (6%), ProQuest (6%), and CONTENTdm (6%)

Formats (ETD & Cultural Memory)
▪ ETD: .ppt .qt .tif .xml .wav .png .pdf .mpg .mp3 .aif .avi .doc .gif
▪ Cultural Memory: .html .jpg .mov .dwt .xls .csv .zip .mix .snd .tex .txt .midi .exe .jar
▪ JP2 .ps
▪ Genres: Textual documents, Databases, Still images, Video, Audio, GIS, Websites, Email, Computer games, Science data, Publications, Presentation materials

Platforms (ETD & Cultural Mem.)
▪ ETD-db, Eprints, Fedora, DSpace, Archimede, bepress/Digital Commons, CONTENTdm, Cybertesis, Dias, DigiTool, DLXS, ProQuest
▪ MS Access, Excel, SRB, ResCarta, Augias-data, Cumulus, CollectiveAccess, Windows Explorer, IRODS, Filesystem, ArchivalWare, Filemaker Pro, iTunes
▪ Documentum, Fez, Millennium Online Catalog, OhioLINK, Oracle, Sesame, VTLS Vital, Past Perfect, ANCS, MINISIS, CDs/DVDs, In House

Structure (ETD & Cultural Mem.)

Cultural Memory*
▪ subject (33%)
▪ collection (35%)
▪ format (21%)
▪ date (10%)
▪ department (10%)
▪ creator (8%)
▪ funder (4%)

ETD
▪ All in one directory (28%)
▪ Date (26%)
▪ Departments, Authors, or Disciplines (26%)
▪ Access-level labels (7%)
▪ Don’t know (13%)

*Some Cultural Memory respondents selected multiple ways.

Variation is the theme
▪ Infrastructures
▪ Data Structures
Presents preservation challenges, to be sure!

Who seeks preservation and how do they want to preserve?

Readiness is low
▪ Most institutions are not even backing up
▪ Dearth of preservation plans and policies

Desire is high
▪ Want training
▪ Want independent assessments
▪ Want to manage their own digital preservation solutions

Cultural Memory:
▪ Only 50% back up 100% of their digital holdings
▪ Only 19% report having in-house “expert” knowledge in digital preservation
▪ 79% have NO preservation plan
▪ 55% have NO written policies

ETDs:
▪ 95% are engaging SOME backup strategies
▪ 72% have NO preservation plan

Cultural Memory
▪ 83% will develop policies in the next 3 years
▪ 90% cited interest in participating in a community-based digital preservation solution
▪ Only 30% cited interest in third-party vendor offerings, even at a reasonable cost

ETDs
▪ 70% have experience with/knowledge of LOCKSS
▪ 92% cited interest in participating in an NDLTD-supported LOCKSS-based ETD archive

▪ CMOs are engaging actively with the idea of digital preservation
▪ High level of knowledge about community-based approaches to digital preservation
▪ Outsourcing is not the top choice of institutions as they pursue digital preservation; they would rather participate in it themselves

What are the biggest barriers to preservation?
▪ Growth of digital collection
▪ Backups. NOT
▪ File formats
▪ Platforms
▪ Structures. NOT
▪ Lack of documented policies, procedures

What are the threats identified by our survey respondents?

What are the most desired preservation offerings?
1. Training provided by professional organizations
2. Independent study/assessment
3. Local courses in computer or digital technology
4. Hire staff with digital knowledge experience
5. Hire consultants
6. Training provided by vendors

The MetaArchive Cooperative
The most effective preservation strategies incorporate:
▪ replication of content
▪ geographically distributed secure locations
▪ private network of trusted partners

Desirable Preservation Service
1. Cooperative preservation network
2. Standards
3. Training: Best practices, including technical
4. Model policies
5. Conversion or migration services
6. Preservation services provided by third-party vendors
7. Access services

Conclusion

Calf-Path Syndrome
▪ Idiosyncratic, ad-hoc data storage structures
▪ Increasingly difficult remediation

MASH: triage
▪ Survey documented narratives
▪ Outreach
▪ Offer help to those adrift in cyberspace

Through collaboration there are cost-effective and strong strategies that can protect cultural memories.

Katherine Skinner, katherine.skinner@emory.edu
Gail McMillan, gailmac@vt.edu