Business Models for the Interdependent Digital Collection

What is HathiTrust and why is it relevant to research libraries?
‘Sourcing and Scaling’ brought to the
collective collection
What is HathiTrust?
HathiTrust is attempting nothing short of creating a
comprehensive preservation repository of published
literature, primarily though not exclusively through
digitization.
Content Distribution
6,947,494 – Total
1,567,058 – Public Domain
* As of October 11, 2010
Language Distribution (1)
The Top 10 languages make
up close to 86% of total
content
Language Distribution (2)
The next 40
languages make
up ~14% of total
Dates
Originating Institution
Content over time
HathiTrust is about collections, writ large, and not about
Google digitization.
The first order of HathiTrust business is long-term
preservation of this digital content, and we don’t
believe in preservation without access.
HathiTrust takes the business of sustainability seriously,
with regard to governance, finances and technology.
Governance
Budget/Finances
Decision-making
Strategic Advisory Board
Executive Committee
HathiTrust
Guidance on Policy, Planning
Executive Committee
• Paul Courant, University Librarian and Dean of Libraries, UM
• Laine Farley, Executive Director, CDL
• John King, Vice Provost for Academic Information, UM
• Paula Kaufman, University Librarian and Dean of Libraries, UI
• Brian Schottlaender, University Librarian, UCSD
• Ed Van Gemert, UW – Madison (ex officio)
• Brenda Johnson, Dean of Libraries, IU
• Brad Wheeler, Chief Information Officer, IU
• John Wilkin, Executive Director of HathiTrust and Associate University Librarian, LIT, UM
Strategic Advisory Board
• Ed Van Gemert (Chair), UW - Madison
• John Butler, AUL for Information Technology, U Minn
• Patricia Cruse, Director, Preservation, CDL
• Bernie Hurley, Director, Library Technologies, UC Berkeley
• R. Bruce Miller, University Librarian, UC - Merced
• Sarah Pritchard, University Librarian, Northwestern
• Paul Soderdahl, Director, LIT, U Iowa
• John Wilkin, Executive Director, HathiTrust (ex officio)
• Robert Wolven, Columbia University
… and the future
• October 2011 Constitutional Convention
• Delegates from institutions that are
participating by October 31st, 2010
• Weighted voting model to reflect varying
levels of investment
• Formal review of HathiTrust by SAB in early
2011 (in time for Constitutional Convention)
• Framing the next stage of governance,
refinement of new cost model
All of the reasonable costs of sustaining the archive, including
replacement costs and a sort of insurance policy, are combined
to create a sort of atomic cost unit (in this case, a GB of content).
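The atomic cost unit described above can be illustrated with a minimal sketch. The cost component names and all figures here are hypothetical and chosen only to show the arithmetic; the slides do not give the actual components or amounts.

```python
def cost_per_gb(annual_costs, total_gb):
    """Fold every cost component into a single per-GB unit cost.

    annual_costs: mapping of component name -> annual cost (hypothetical names)
    total_gb: total size of the archive in GB
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return sum(annual_costs.values()) / total_gb


# Hypothetical figures, for illustration only (not HathiTrust's numbers).
costs = {
    "storage": 600_000.0,
    "replacement": 300_000.0,  # hardware replacement reserve
    "insurance": 100_000.0,    # the "sort of insurance policy"
}
unit = cost_per_gb(costs, total_gb=500_000)
print(f"cost per GB: {unit:.2f}")
```

Expressing everything as one unit makes it possible to price any contribution to the archive by its size alone.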
How much does it cost?
HathiTrust Functional Framework
• Governance
– Budget, Finances
– Decision-making
– Policy
• Enterprise Management
– Communication and Coordination with partner institutions
– Project management
– Planning
• Repository Administration
– Hardware configuration and maintenance
– Data management (content storage, backup, integrity checks, deletion)
– Web and application server configuration and maintenance
– Security
– Hardware selection and replacement
– Content and Metadata specifications
– Disaster Recovery
– Logging
– Processes for ensuring content integrity
• Permissions
• Rights Management
– Copyright determination
– Copyright review
– Copyright information management (database)
– Rightsholder permissions
• Bibliographic Data Management
– Entity description (record-level)
– Object identification (item-level)
– Data availability
• Collection Development
– Digital: expansion beyond books and journals (born-digital, images and maps, audio); selection of content (for non-Google volume ingest and pilot projects)
– Print: Cloud Library (effect of digital on print)
• e-Commerce
– Print on Demand
• Content Ingest
– Transformation
– Validation
• Content Access
– PageTurner
– Collection Builder
– Large-scale Search
– Research Center
– Bibliographic Catalog
– APIs
• Quality Assurance
– Quality Review
– Content Certification
• User Services
– Usability
– User support (helpdesk)
• Financial contributions of partners
• Outreach
– Project website
– Monthly newsletter
– Papers and presentations
– Communication with potential partners
– Surveys, general inquiries
• Repository evaluation and audit (e.g., DRAMBORA, TRAC)
• Legal
– Risk management (use of materials)
– Partner agreements
• Advocacy
• Mission: “to contribute to the common good by collecting, organizing,
preserving, communicating, and sharing the record of human knowledge.”
• Goals
– To build a reliable and increasingly comprehensive digital archive of library
materials converted from print that is co-owned and managed by a number of
academic institutions.
– To dramatically improve access to these materials in ways that, first and
foremost, meet the needs of the co-owning institutions.
– To help preserve these important human records by creating reliable and
accessible electronic representations.
– To stimulate redoubled efforts to coordinate shared storage strategies among
libraries, thus reducing long-term capital and operating costs of libraries
associated with the storage and care of print collections.
– To create and sustain this “public good” in a way that mitigates the problem of
free-riders.
– To create a technical framework that is simultaneously responsive to members
through the centralized creation of functionality and sufficiently open to the
creation of tools and services not created by the central organization.
A global change in the library environment
Academic print book collection already substantially
duplicated in mass digitized book corpus
[Chart: % of Titles in Local Collection (0% to 60%) plotted against Rank in 2008 ARL Investment Index (0 to 120)]
June 2009 median duplication: 19%
June 2010 median duplication: 31%
An ARL institution that wishes to use
HathiTrust as part of a larger strategy,
part of a “cloud” strategy
The HathiTrust Business Model, v.2: Costs based on
holdings overlap and the perceived benefits we derive
new cost model: http://www.hathitrust.org/cost
For public domain volumes: (PD*X*C)/N
For a given in-copyright volume: IC=(C*X)/H
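The two formulas can be sketched in code. The slide does not define its symbols, so the readings used here are assumptions: C as the cost per volume, X as an overhead multiplier, PD as the number of public domain volumes, N as the number of partners, and H as the number of partners holding a given in-copyright volume. Adding the two parts into one partner fee is likewise an assumption about how the shares combine, and all figures are hypothetical.

```python
def public_domain_share(pd_volumes, overhead, cost_per_volume, partners):
    """(PD*X*C)/N: public domain cost split evenly across all partners."""
    if partners <= 0:
        raise ValueError("partners must be positive")
    return pd_volumes * overhead * cost_per_volume / partners


def in_copyright_share(cost_per_volume, overhead, holders):
    """IC=(C*X)/H: one volume's cost split across the partners that hold it."""
    if holders <= 0:
        raise ValueError("holders must be positive")
    return cost_per_volume * overhead / holders


def partner_fee(pd_volumes, overhead, cost_per_volume, partners, holder_counts):
    """Assumed total: public domain share plus the sum of IC shares.

    holder_counts lists, for each in-copyright volume this partner holds,
    how many partners (including this one) hold it.
    """
    pd_part = public_domain_share(pd_volumes, overhead, cost_per_volume, partners)
    ic_part = sum(
        in_copyright_share(cost_per_volume, overhead, h) for h in holder_counts
    )
    return pd_part + ic_part


# Hypothetical inputs, for illustration only.
fee = partner_fee(
    pd_volumes=1000,
    overhead=1.5,
    cost_per_volume=2.0,
    partners=10,
    holder_counts=[3, 1, 6],
)
print(f"partner fee: {fee:.2f}")
```

Under these assumptions a widely held in-copyright volume costs each holder less than a uniquely held one, which matches the slide's statement that costs are based on holdings overlap.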
sharing in the curation; having a voice in shaping the future
Collective digital curation
• driving down costs
• reducing bibliographic indeterminacy
• making meaningful decisions about formats and quality
• increasing discoverability
• consolidating development talent
• improving strength of archiving
Partner Status
• As of October 11th
– 33 Contributing partner libraries
– 1 Sustaining partner library
• In final stages of contract review or pending
announcement
– 5 Contributing partner libraries
– 6 Sustaining partner libraries
scale!
“transfer resource[s] away from 'infrastructure'
and towards user engagement.”
Lorcan Dempsey
Inviting participation…
http://www.hathitrust.org/join