Guide to Creating a Digital Archive
by Meaghan Fukunaga, Julie Judkins and Krystal Thomas

Introduction

This guide was written by Julie Judkins and Meaghan Fukunaga, Digital Curators at the Center for the History of Medicine (CHM), University of Michigan Medical School, and Krystal Thomas, Digital Library Coordinator at the Theodore Roosevelt Center at Dickinson State University. The information contained in this guide is based on the professional experience of the authors, best practices in the field of digital preservation, and coursework at the University of Michigan’s School of Information.

The authors can be contacted at:
Julie Judkins: julieju@med.umich.edu
Meaghan Fukunaga: mof@med.umich.edu
Krystal Thomas: krystal.thomas@dickinsonstate.edu

Planning

Initial planning begins with stakeholder meetings to discuss topics like intended audience, standards, collection policy scope, hardware and software, finances, and staff requirements. All involved parties should be present at these meetings whenever possible.

Discussion Questions

Audience
◦ Who will be most likely to use the archive?
◦ Who do we want to draw to the archive?
◦ Do we want any security measures for privacy?
◦ Do we want a public archive?

Standards
◦ Do we want to follow federal standards, like those put out by the National Archives and Records Administration (NARA)?
◦ Do we want to follow standards used by academic institutions?
◦ What controlled vocabularies will we want to use? Do we need to create our own set of keywords?

Collection Policy
◦ What is the scope of our archive?
◦ What materials do we want to include? Do we want to exclude any materials?
   ▪ What are the copyright concerns for the materials we wish to include?
◦ What are acceptable sources for our materials? Do we want to exclude any sources?
◦ How long do we want to preserve the materials in the archive?
◦ Will we ever deaccession any materials?
◦ How will we preserve the materials in the archive?

Hardware and Software
◦ What sorts of hardware (new computer, scanner, external hard drive) do we need to acquire?
◦ Are we going to scan items in house or use an outside vendor?
◦ What kind of software do we want to use?
◦ Do we want to host it ourselves?
◦ Do we want to use a vendor?

Finances
◦ How much money do we have for initial development?
◦ How much money do we have over the long term?
◦ Do we have a steady, long-term funding solution?
◦ If our long-term funding dries up, do we have a viable backup?

Staff
◦ How many staff members do we want?
◦ How many staff members can we sustain over the long term?
◦ What kind of educational background do our staff members need?
◦ What kind of experience do our staff members need?

Development of the 1918 Influenza Digital Encyclopedia

MPublishing
◦ Helped CHM design and mount the digital interface that will exhibit the archival documents collected.

Digital Conversion Unit (DCU)
◦ Where CHM’s materials were digitized. Files were then transferred to MPublishing for processing.

Development of the Theodore Roosevelt Digital Library

Information Technology Department, North Dakota (ITD)
◦ Home of the digital archival materials for the Theodore Roosevelt Digital Library (TRDL). Acts as an advisor to the TRDL for long-term preservation planning.

DataFormat
◦ Company that designed and maintains DARMA, the Content Management System (CMS) for the TRDL. Acts as a partner in maintenance of standards and OAIS compliance. Hosts our back-end CMS and public access images.

The Berndt Group
◦ Web design firm that redesigned, hosts, and maintains the Theodore Roosevelt Center website and digital library. Also maintains the website’s CMS, SiteCore, and acts as a technology partner for future development.

Implementation

Creating “investability” (getting people excited about using your digital archive)
◦ Research presentation opportunities.
◦ Contribute articles about the project or digital humanities to relevant journals or edited collections.
◦ Join a relevant listserv and promote the launch of your project or other relevant milestones (use caution: if you overdo this, you risk alienating your audience).
◦ Create a social media presence (blog, Facebook, Twitter) to show people what is coming and to share some of the discoveries you’ve already made.
◦ If your project can use volunteers, get the word out locally (or globally, if volunteers can work from anywhere) and get real people helping you who can then tell their networks what your project is about.

User Testing and Reporting

Questions to ask during planning
◦ Who will use our collection?
◦ What are our users’ needs?
◦ What will our users look for?
◦ How will our users find what they’re looking for?
◦ What will our users expect to be able to do with our collection? (share it with friends, save items for later use, download the document, comment on the document)

Questions to ask before/just after the website is launched
◦ Who can we ask to help us test the website’s functionality? (e.g. colleagues)
◦ How will we test the usability of the website? (questionnaire? pop-up web survey? informal response?)
◦ How feasible are the suggestions we receive? Can we implement them? How?

Tracking
◦ How will we track site visitors?
◦ What do we want our reports to tell us about user behavior on the site? (Most popular pages? How long a user stays on the page? How does a user find our page?)

Maintenance

Digital archives require a solid long-term maintenance strategy that takes into consideration financial, staff, and preservation requirements. Long-term maintenance planning should be a large component of your initial planning, and the resulting plans should be revisited for feasibility from time to time as the project progresses.

Discussion Questions

Finances
◦ Do we have a long-term funding strategy?
◦ Are we prepared in the event of a disaster? (e.g. catastrophic power failure, the company hosting project servers going out of business, loss of any major project stakeholders, including financial backers and staff)
◦ Are we financially prepared to migrate digital materials to new formats or even a new software platform as needed?
◦ Do we have the finances to maintain and keep current any software or websites associated with the archive?

Staff
◦ Do we have the finances to maintain staff over the long term?
◦ Do we have a plan to help staff keep current with trends in their fields? (e.g. conference attendance, certification, seminars)
◦ What core staff do we want to maintain? (e.g. a digital archivist, a web designer)

Preservation
◦ How long do we want to maintain this digital archive?
◦ How will we ensure our digital materials are backed up in the event of a catastrophe?
◦ Do we understand the technical aspects of long-term preservation of digital materials?
◦ Can we support digital materials over the long term? (e.g. migration to new formats)
◦ If we are using a vendor to host our archive, can we afford this over the long term?
◦ Do we have the funds to maintain this digital archive over the long term?
◦ Do we have the funds to maintain staff over the long term?

Learn More

Examples of Digital Archives
Veterans History Project at Library of Congress
Walt Whitman Archive
National Archives Experience: Digital Vaults
September 11th Digital Archive
Digital Library of the Week (ALA)
Minnesota Reflections
Europeana
World Digital Library
Community & Conflict: The Impact of the Civil War in the Ozarks
Theodore Roosevelt Digital Library
The American Influenza Epidemic of 1918: A Digital Encyclopedia
Seeking Michigan

Information about Digitization
Digital Conversion Unit, University of Michigan. “Summary of Digitization Costs.” https://www.lib.umich.edu/files/DCURechargeRates-2009.pdf
Writing History in the Digital Age (many articles are examples of digital projects)
◦ Julie’s article on AIE: “Case Study of the American Influenza Epidemic of 1918: A Digital Encyclopedia”
Digital Preservation, resources and tools compiled by the Library of Congress regarding preservation of digital information
◦ Sustainability of Digital Formats
Theodore Roosevelt Center Digital Imaging and Metadata Guidelines
◦ Comprehensive manual about all imaging guidelines and metadata standards for the Theodore Roosevelt Digital Library project
NINCH (National Initiative for a Networked Cultural Heritage) Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials
◦ From Scotland, a good overview of all aspects of planning, implementing and maintaining a digital project with archival and library materials.

Oral History On-line Resources
Existing Oral History Sites (Veterans History Project, Library of Congress)
Oral History on the Web: Exemplary Sites (History Matters)

Digitization Standards
Library of Congress Digitization Standards
Federal Agencies Digitization Guidelines Initiative
U.S. National Archives and Records Administration (NARA) Technical Guidelines for Digitizing Archival Materials for Electronic Access: Creation of Production Master Files – Raster Images
Society of American Archivists Standards Portal
◦ Includes description and arrangement standards as well as information on digitization standards
PREMIS: Preservation Metadata Maintenance Activity (Library of Congress)
Digital Projects Guide (Harvard University Information Technology)

Collection Policy Examples
University of Texas Libraries http://www.lib.utexas.edu/admin/cird/policies/subjects/framework.html
Theodore Roosevelt Center (Dickinson, North Dakota) http://www.theodorerooseveltcenter.org/About-Us/TRDL-Collection-Policy.aspx
South Carolina Digital Library http://www.scmemory.org/about/policy.php

Metadata, Keywords, and Thesauri
Library of Congress Basic Genre Terms for Cultural Heritage Materials
Library of Congress Subject and Name Authorities
Art and Architecture Thesaurus
FAST: Faceted Application of Subject Terminology

User Testing and Usability
Dumas and Loring, Moderating Usability Tests (2008)
Dumas and Redish, A Practical Guide to Usability Testing (Rev. ed., 1999)
Jakob Nielsen’s useit.com
Rubin and Chisnell, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests (2nd ed., 2008)

Copyright Resources
Copyright and Digitization, Midwest Collaborative for Library Services
Copyright Term and the Public Domain, Cornell University (updated yearly)
United States Copyright Office

Social Media Resources
Ten Must Haves in Your Social Media Policy, Mashable
Social Media Policies Superlist, iiG
Social Media Strategy Handbook, The American Red Cross

Conferences/Networking Opportunities
Digital Humanities Winter Institute http://www.cmlt.umd.edu/
◦ January 7 - 11, 2013
◦ College Park, Maryland (University of Maryland)
◦ Week-long courses on subjects related to the digital humanities http://www.cmlt.umd.edu/?q=courses
Digital Humanities Summer Institute http://www.dhsi.org/
◦ June 10 - 14, 2013
◦ Victoria, British Columbia, Canada (University of Victoria)
◦ Week-long courses on subjects related to the digital humanities http://dhsi.org/courses.php
◦ Scholarships available (applications due in Fall)
THATCamp http://thatcamp.org/
◦ National and international, year-round
Joint Conference on Digital Libraries http://jcdl.org/
◦ 2013 conference will be held in Indianapolis, Indiana
D-Lib Magazine’s list of conferences of interest http://www.dlib.org/groups.html

Further Reading
Historical Collections for the National Digital Library: Lessons and Challenges at the Library of Congress [Part I]
Historical Collections for the National Digital Library: Lessons and Challenges at the Library of Congress [Part II]
Standards Related to Digital Imaging of Pictorial Materials
Library of Congress Workshop Materials
◦ The Digital Library Environment materials include information on digital project planning, metadata creation, and cataloging standard selection.
D-Lib Magazine
◦ “An electronic publication with a focus on digital library research and development, including new technologies, applications, and contextual social and economic issues.”