Why Digitize? or The Limits of Preservation

HATHITRUST
A Shared Digital Repository
Why Digitize?
or
The Limits of Preservation
Books from Different <angles>
Unless otherwise noted, these slides and their
contents are licensed under a Creative Commons
Attribution Unported License.
2014 TEI/DHCS Plenary Session
Evanston, IL
Mike Furlough
Executive Director, HathiTrust
Caveat auditor
23 October 2014
HATHITRUST.ORG
Bethany Nowviskie, “Digital Humanities in the Anthropocene” http://nowviskie.org/2014/anthropocene/
From The Art of Google Books: http://theartofgooglebooks.tumblr.com/post/74936156541/married-employees-hand-overbookplate-and
From Lorcan Dempsey’s Weblog: http://orweblog.oclc.org/archives/001284.html
≠
HathiTrust Mission
To contribute to the common good by collecting,
organizing, preserving, communicating, and sharing the
record of human knowledge.
Efforts include, but are not limited to
…building comprehensive collections co-owned and
managed by partners.
…enabling access by users with print disabilities.
…supporting computational research with the collections.
…stimulating shared collection storage strategies among
libraries.
HathiTrust Members
Allegheny College
Arizona State University
Baylor University
Boston College
Boston University
Brandeis University
Brown University
California Digital Library
Carnegie Mellon University
Colby College
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Florida State University
Getty Research Institute
Harvard University Library
Indiana University
Iowa State University
Johns Hopkins University
Kansas State University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
McGill University
Michigan State University
Montana State University
Mount Holyoke College
New York Public Library
New York University
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Rutgers University
Stanford University
Syracuse University
Temple University
Texas A&M University
Texas Tech
Tufts University
Universidad Complutense de Madrid
University of Alabama
University of Alberta
University of Arizona
University of British Columbia
University of Calgary
University of California (Berkeley, Davis, Irvine, Los Angeles, Merced, Riverside, San Diego, San Francisco, Santa Barbara, Santa Cruz)
The University of Chicago
University of Connecticut
University of Delaware
University of Florida
University of Houston
University of Illinois
University of Illinois at Chicago
The University of Iowa
University of Kansas
University of Maine
University of Maryland
University of Massachusetts, Amherst
University of Miami
University of Michigan
University of Minnesota
University of Missouri
University of Nebraska-Lincoln
University of New Mexico
The University of North Carolina at Chapel Hill
University of Notre Dame
University of Oklahoma
University of Pennsylvania
University of Pittsburgh
University of Queensland
University of Tennessee, Knoxville
University of Texas
University of Utah
University of Vermont
University of Virginia
University of Washington
University of Wisconsin-Madison
Utah State University
Vanderbilt University
Virginia Tech
Wake Forest University
Washington University
Yale University Library
Shared Responsibilities
• Leverage expertise across institutions
– Collective work
• Distributed Infrastructure
– Preservation repository and access services
• University of Michigan
• Mirror site: Indiana University
– Metadata management services (Zephir)
• California Digital Library
– HathiTrust Research Center
• Indiana University and University of Illinois
Growth of Collection
[Bar chart, 2008-2014; vertical axis 0 to 14,000,000]
2008: 2,477,871
2009: 5,221,092
2010: 7,836,698
2011: 9,966,572
2012: 10,599,355
2013: 10,878,121
2014: 12,104,793
Language Distribution (1)
The top 10 languages make up ~87% of all content
English, 49%
German, 9%
French, 7%
Spanish, 5%
Chinese, 4%
Russian, 4%
Japanese, 3%
Italian, 3%
Arabic, 2%
Latin, 1%
Remaining Languages, 13%
* As of February 17, 2014
Language Distribution (2)
The next 40 languages make up ~12% of total
Polish, 7%
Portuguese, 7%
Dutch, 5%
Hebrew, 5%
Hindi, 5%
Indonesian, 4%
Korean, 4%
Swedish, 4%
Croatian, 3%
Czech, 3%
Danish, 3%
Thai, 3%
Turkish, 3%
Urdu, 3%
Bengali, 2%
Greek, Modern (1453-), 2%
Hungarian, 2%
Norwegian, 2%
Persian, 2%
Sanskrit, 2%
Tamil, 2%
Armenian, 1%
Bulgarian, 1%
Catalan, 1%
Finnish, 1%
Greek, Ancient (to 1453), 1%
Malay, 1%
Malayalam, 1%
Marathi, 1%
Multiple languages, 1%
Panjabi, 1%
Romanian, 1%
Serbian, 1%
Slovak, 1%
Slovenian, 1%
Telugu, 1%
Turkish, Ottoman, 1%
Ukrainian, 1%
Vietnamese, 1%
Yiddish, 1%
Nepali, 0%
* As of February 17, 2014
Dates
0-1500, 0.04%
1500-1599, 0.07%
1600-1699, 0.01%
1700-1799, 0.01%
1800-1849, 3%
1850-1899, 10%
1900-1909, 4%
1910-1919, 4%
1920-1929, 4%
1930-1939, 4%
1940-1949, 4%
1950-1959, 6%
1960-1969, 11%
1970-1979, 13%
1980-1989, 14%
1990-1999, 14%
2000-2009, 10%
* As of February 17, 2014
Preservation with Access
• Preservation
– TRAC-certified
– Long-term commitments on digital content facilitate
planning, decision-making
• Discovery
– Bibliographic and full-text search of all materials
– Mechanisms for local loading of records
• Access and Use
– Full text search (all users)
– Public domain and open access works (all users)
– Collections and APIs (all users)
– Lawful uses of in-copyright works (members)
Title page of edition of JF Cooper’s Satanstoe presented in the Making of America database. (Accessed October 18, 2014)
Spine of edition of JF Cooper’s Satanstoe presented in the Early American Fiction database. (Accessed October 18, 2014)
6 of 15 records for different copies of JF Cooper’s Satanstoe presented in HathiTrust. (Accessed October 18, 2014)
Some Issues
• Collection strategies
– What else?
– Associated access and preservation questions
• The “Evolving Scholarly Record”
– The book and the network
– Fragmentation and loss
How to find out more
• About: http://www.hathitrust.org/about
• Resources: http://www.hathitrust.org/resources
• Twitter: http://twitter.com/hathitrust
• Facebook: http://www.facebook.com/hathitrust
• Monthly newsletter:
– http://www.hathitrust.org/updates
– RSS http://www.hathitrust.org/updates_rss
• Contact us: feedback@issues.hathitrust.org
• Blogs: http://www.hathitrust.org/blogs
– Large-scale Search
– Perspectives from HathiTrust
Thank you!
furlough@hathitrust.org
@MikeFurlough