HathiTrust: On TRAC - HathiTrust Digital Library

HATHITRUST
A Shared Digital Repository
HathiTrust: On TRAC
ICPSR Applied Data Science
Repository Requirements and Assessment: HathiTrust
July 26, 2012
Jeremy York, Project Librarian, HathiTrust
Partnership
Arizona State University
Baylor University
Boston College
Boston University
California Digital Library
Columbia University
Cornell University
Dartmouth College
Duke University
Emory University
Florida State University
Getty Research Institute
Harvard University Library
Indiana University
Johns Hopkins University
Lafayette College
Library of Congress
Massachusetts Institute of Technology
McGill University
Michigan State University
New York Public Library
New York University
North Carolina Central University
North Carolina State University
Northwestern University
The Ohio State University
The Pennsylvania State University
Princeton University
Purdue University
Stanford University
Texas A&M University
Universidad Complutense de Madrid
University of Arizona
University of Calgary
University of California
Berkeley
Davis
Irvine
Los Angeles
Merced
Riverside
San Diego
San Francisco
Santa Barbara
Santa Cruz
The University of Chicago
University of Connecticut
University of Florida
University of Illinois
University of Illinois at Chicago
The University of Iowa
University of Maryland
University of Miami
University of Michigan
University of Minnesota
University of Missouri
University of Nebraska-Lincoln
The University of North Carolina at Chapel Hill
University of Notre Dame
University of Pennsylvania
University of Pittsburgh
University of Utah
University of Virginia
University of Washington
University of Wisconsin-Madison
Utah State University
Washington University
Yale University Library
Digital Repository
• Launched 2008
• Initial focus on digitized book and journal
content
– 10.4 million volumes
– 5.5 million book titles
– 270,000+ serial titles
– 3.1 million public domain volumes (~30%)
Mission
• To contribute to the common good by collecting,
organizing, preserving, communicating, and
sharing the record of human knowledge
HathiTrust
Universal Library
Common Goal
Single Entity, Many Partners
Collections and Collaboration
• Comprehensive collection
– Preservation…with Access
• Shared strategies
• Public Good
Services
• Long-term preservation
– Bit-level and migration
• Bibliographic search
• Full-text search
• Reading and download capabilities
• Print on demand
• Collections
• Datasets, Research Center
Governance
• 12-member Board of Governors
– April 2012
• Manages budget and finances
• Budget separately held within the University
of Michigan
• Strategic Advisory Board
• Working Groups and Committees
CRL Audit
• Why
– Value Community Standards
– Accountability, Openness, Transparency
• Desire to know how we were doing, and let the
community know
What is TRAC
• Trusted Digital Repositories (OCLC, RLG) 2002
– A framework of attributes and responsibilities
– One of the recommended items was a process for
certifying digital repositories
• TRAC (RLG, NARA) 2007
– CRL, nestor, DCC, National Library of Australia
• Administered by CRL in US
CRL Audit (2)
• Guided by criteria included in TRAC, as well as
other metrics developed by CRL
• HathiTrust’s practices are sound…appropriate
to the content being archived and the general
needs of the CRL community.
What was involved?
• Timeline
– Data gathering: November 2009 - December 2010
– Site visit May 2010
• Logistics
– Questions by email, documentation
– Phone conversations
– Staff: Project Librarian, Digital Preservation
Librarian, Executive Director
Where we were
• Developmental stages
– Changing, growing
• Core pieces in place
Results
• Organizational Infrastructure (2)
– Mission statement, succession plan, staff, assessment,
accountability, business plan, agreements
• Digital Object Management (3)
– Properties preserved, SIP, AIP, validation, naming
conventions, identifiers, understandability,
preservation strategies, logging, access policies
• Technologies, Technical Infrastructure, and Security (4)
– Hardware, software, error-handling, change
management, security, staff roles, disaster
preparedness
Key Issues
• Staff/Organization
• Rights and ownership of HathiTrust enterprise
assets
• Succession plan
• Clarify and strengthen quality assurance and
print archiving components of HathiTrust
program
Executive Committee
Strategic Advisory Board
Budget/Finances Decision-making
Guidance on Policy, Planning
Collective Work: Working
Groups and Committees
Strategic
• Collections
• Discovery Interface
• Full-text Search
Operational
• Communications
• User Support
• User Experience
Distributed work
• Driven by needs of institutions
• Leverage across the partnership
• Projects, Grant Work, Ingest Specifications, PageTurner,
Bibliographic Data Management
HathiTrust
Governance
Budget, Finances
Decision-making
Policy
Enterprise
Management
Repository
Administration
Communication
and Coordination
with partner
institutions
Hardware
configuration and
maintenance
Data management
(content storage,
backup, integrity
checks, deletion)
Project
management
Planning
Web and
application server
configuration and
maintenance
Security
Hardware selection
and replacement
Content and
Metadata
specifications
Permissions
Rights
Management
Bibliographic
Data
Management
Copyright
determination
Entity description
(record-level)
Copyright review
Object
identification
(item-level)
Copyright
information
management
(database)
Data availability
Collection
Development
Digital
• Expansion beyond
books and journals
(born-digital,
images and maps,
audio)
• Selection of
content (for non-Google volume
ingest and pilot
projects)
Print
• Cloud Library (effect
of digital on print)
Rightsholder
permissions
Disaster Recovery
Logging
Processes for
ensuring content
integrity
e-Commerce
Print on Demand
Content Ingest
Content Access
Quality
Assurance
User Services
Transformation
PageTurner
Quality Review
Usability
Validation
Collection Builder
Content
Certification
User support
(helpdesk)
Large-scale Search
Financial
contributions
of partners
Research Center
Bibliographic
Catalog
APIs
HathiTrust Functional
Framework
Outreach
Project website
Monthly
newsletter
Papers and
presentations
Communication
with potential
partners
Surveys, general
inquiries
Repository
evaluation and
audit (e.g.,
DRAMBORA,
TRAC)
Legal
Risk management
(use of materials)
Partner
agreements
Advocacy
Key Issues (2)
• Rights and ownership of HathiTrust enterprise
assets
• Succession plan
• Clarify and strengthen quality assurance and
print archiving components of HathiTrust
program
Future Work
• Disaster Recovery
• Change Management
– Moving to new formats: image, audio, born-digital
• Governance
• Certification updates
Thank you very much!