HATHITRUST
A Shared Digital Repository

Why Digitize? or The Limits of Preservation
Books from Different <angles>

Unless otherwise noted, these slides and their contents are licensed under a Creative Commons Attribution Unported License.

2014 TEI/DHCS Plenary Session
Evanston, IL
23 October 2014

Mike Furlough
Executive Director, HathiTrust

Caveat auditor

HATHITRUST.ORG

Bethany Nowviskie, “Digital Humanities in the Anthropocene”
http://nowviskie.org/2014/anthropocene/

From The Art of Google Books:
http://theartofgooglebooks.tumblr.com/post/74936156541/married-employees-hand-overbookplate-and

From Lorcan Dempsey’s Weblog:
http://orweblog.oclc.org/archives/001284.html

≠

HathiTrust Mission

To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

Efforts include, but are not limited to
• …building comprehensive collections co-owned and managed by partners.
• …enabling access by users with print disabilities.
• …supporting computational research with the collections.
• …stimulating shared collection storage strategies among libraries.

HathiTrust Members

Allegheny College
Arizona State University
Baylor University
Boston College
Boston University
Brandeis University
Brown University
California Digital Library
Carnegie Mellon University
Colby College
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Florida State University
Getty Research Institute
Harvard University Library
Indiana University
Iowa State University
Johns Hopkins University
Kansas State University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
McGill University
Michigan State University
Montana State University
Mount Holyoke College
New York Public Library
New York University
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Rutgers University
Stanford University
Syracuse University
Temple University
Texas A&M University
Texas Tech
Tufts University
Universidad Complutense de Madrid
University of Alabama
University of Alberta
University of Arizona
University of British Columbia
University of Calgary
University of California
  – Berkeley
  – Davis
  – Irvine
  – Los Angeles
  – Merced
  – Riverside
  – San Diego
  – San Francisco
  – Santa Barbara
  – Santa Cruz
The University of Chicago
University of Connecticut
University of Delaware
University of Florida
University of Houston
University of Illinois
University of Illinois at Chicago
The University of Iowa
University of Kansas
University of Maine
University of Maryland
University of Massachusetts, Amherst
University of Miami
University of Michigan
University of Minnesota
University of Missouri
University of Nebraska-Lincoln
University of New Mexico
The University of North Carolina at Chapel Hill
University of Notre Dame
University of Oklahoma
University of Pennsylvania
University of Pittsburgh
University of Queensland
University of Tennessee, Knoxville
University of Texas
University of Utah
University of Vermont
University of Virginia
University of Washington
University of Wisconsin-Madison
Utah State University
Vanderbilt University
Virginia Tech
Wake Forest University
Washington University
Yale University Library

Shared Responsibilities

• Leverage expertise across institutions
  – Collective work
• Distributed Infrastructure
  – Preservation repository and access services
    • University of Michigan
    • Mirror site: Indiana University
  – Metadata management services (Zephir)
    • California Digital Library
  – HathiTrust Research Center
    • Indiana University and University of Illinois

Growth of Collection

Year    Volumes
2008     2,477,871
2009     5,221,092
2010     7,836,698
2011     9,966,572
2012    10,599,355
2013    10,878,121
2014    12,104,793

Language Distribution (1)

The top 10 languages make up ~87% of all content.

English      49%
German        9%
French        7%
Spanish       5%
Chinese       4%
Russian       4%
Japanese      3%
Italian       3%
Arabic        2%
Latin         1%
Remaining languages  13%

* As of February 17, 2014

Language Distribution (2)

The next 40 languages make up ~12% of total.

7%: Polish; Portuguese
5%: Dutch; Hebrew; Hindi
4%: Indonesian; Korean; Swedish
3%: Croatian; Czech; Danish; Thai; Turkish; Urdu
2%: Bengali; Greek, Modern (1453-); Hungarian; Norwegian; Persian; Sanskrit; Tamil
1%: Armenian; Bulgarian; Catalan; Finnish; Greek, Ancient (to 1453); Malay; Malayalam; Marathi; Multiple languages; Panjabi; Romanian; Serbian; Slovak; Slovenian; Telugu; Turkish, Ottoman; Ukrainian; Vietnamese; Yiddish
0%: Nepali

* As of February 17, 2014

Dates

0-1500       0.04%
1500-1599    0.07%
1600-1699    0.01%
1700-1799    0.01%
1800-1849    3%
1850-1899    10%
1900-1909    4%
1910-1919    4%
1920-1929    4%
1930-1939    4%
1940-1949    4%
1950-1959    6%
1960-1969    11%
1970-1979    13%
1980-1989    14%
1990-1999    14%
2000-2009    10%

* As of February 17, 2014

Preservation with Access

• Preservation
  – TRAC-certified
  – Long-term commitments on digital content facilitate planning, decision-making
• Discovery
  – Bibliographic and full-text search of all materials
  – Mechanisms for local loading of records
• Access and Use
  – Full text search (all users)
  – Public domain and open access works (all users)
  – Collections and APIs (all users)
  – Lawful uses of in-copyright works (members)

Title page of an edition of JF Cooper’s Satanstoe presented in the Making of America database. (Accessed October 18, 2014)

Spine of an edition of JF Cooper’s Satanstoe presented in the Early American Fiction database. (Accessed October 18, 2014)

6 of 15 records for different copies of JF Cooper’s Satanstoe presented in HathiTrust. (Accessed October 18, 2014)

Some Issues

• Collection strategies
  – What else?
  – Associated access and preservation questions
• The “Evolving Scholarly Record”
  – The book and the network
  – Fragmentation and loss

How to find out more

• About: http://www.hathitrust.org/about
• Resources: http://www.hathitrust.org/resources
• Twitter: http://twitter.com/hathitrust
• Facebook: http://www.facebook.com/hathitrust
• Monthly newsletter:
  – http://www.hathitrust.org/updates
  – RSS: http://www.hathitrust.org/updates_rss
• Contact us: feedback@issues.hathitrust.org
• Blogs: http://www.hathitrust.org/blogs
  – Large-scale Search
  – Perspectives from HathiTrust

Thank you!
furlough@hathitrust.org
@MikeFurlough