Seamless Sharing: NYU, HathiTrust, ReCAP and the Cloud Library

Kat Hagedorn
HathiTrust Special Projects Coordinator
University of Michigan Libraries
October 9, 2009

With thanks to Constance Malpas at OCLC and John Wilkin at University of Michigan for their considerable contributions

Overview
  The cloud library and this pilot project
  Brief overview of HathiTrust
  Findings
  Expectations

Cloud Library, not cloud computing
  Similar but vastly different
  Necessity/desire to share resources
  Multiple digital and print repositories
  Repositories can now move into a “cloud” that will become a shared network resource
  What infrastructure needed?

[Diagram: Aggregate holdings and joint commitments constitute a shared asset enabling collaborative management strategies. Labels: Loans; Borrowing; System; Digitized Library Collections; Off-Site Collections; Shared Collections; ReCAP; Transfers; Retrievals; Disclose; Local Collections; Holdings; Withdrawals; Registry; Assets; Infrastructure; Policies; Procedures]

Perceived need
  Already good support of other “virtual” shared services, e.g., ILL, doc delivery
  What exists in off-site storage and digital repositories that isn’t currently accessible?
  Collection development mechanisms need to discover accessibility and preservation statuses
  How should we build such a service for consumers?

Demand for services
  Multiple, sometimes overlapping, reasons institutions will be interested in being part of a cloud library
    preserving titles that are rare and/or special in some manner
    removing titles that are duplicated across many institutions
    added value of shared materials in digital repository (discovery, search)
    contributing to a public good

Partners in pilot
  NYU – model customer
    Acute space pressures; major library renovation
    Limited mandate to build local collection of record
  ReCAP – model supplier
    Large-scale shared academic storage collection
  HathiTrust – model supplier
    Large-scale shared digital repository
  OCLC Research and CLIR – consultants & convener

A bit about HathiTrust
  To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge
    materials converted from print
    improve access
    …to meet the needs of the co-owning institutions
    reliable and accessible electronic representations
    coordinate shared storage strategies
    “public good”
    …sustaining the historical record
    simultaneously
    …centralized
    …open

Growth of HathiTrust
  Includes ingest of materials not from Google (GBS)

Intersections
  Material that NYU can obtain through HT dependent on copyright status – enhance ‘local’ collection
  Opportunities for institutional cooperation
    shared policy frameworks
    joint service agreements
    increased operational efficiencies
  Material that NYU can relegate with a high degree of confidence
  Material that NYU can already source through existing ILL – enhance local collection
  ReCAP that NYU may choose to relegate based on copyright/availability
  Material that NYU may choose to relegate with appropriate service level agreement
  [Diagram labels: HathiTrust; ReCAP; N=3.8M; N=2.3M; N=7.6M]

The Cloud Library
  Increased reliance on a network of collections and services with a robust underpinning of shared policy and service infrastructures that are jointly owned by participating libraries
  Naturally, as number of participants grows, value of
  partnership increases
  Goal of pilot study:
    service expectations for both digital and print repositories
    cost/benefit analyses for sharing resources
    processes for discovery of shareable titles

Process for discovery of overlap
  Ingestion on a monthly basis
  Checking of OCLC numbers (records without one can’t be processed) – use of xID to derive more
  New data structure…
  [Workflow diagram: monthly data harvest; 2 weeks per cycle to process.
   Stages: Harvest Hathi metadata; Normalize rights values; Extract OCLC numbers; Derive add’l OCLC numbers via xID; Extract WorldCat data; Join Hathi and WorldCat data; Process, index, analyze.
   Reports: Overlap analysis report; Rights anomalies report; OCLCnum report]

HathiTrust: Looking forward
  Ingesting from 4 institutions (UC, Indiana, Wisconsin, Michigan), more to come
  Moving from off-site storage scanning to main libraries
    Result: slight changes in number of PD volumes
  Change in membership
    …broader base of institutions for cost-sharing
    Future contracts will mostly be picklists
  Internet Archive ingest starts this winter/late fall
  Completion of TRAC certification

Requirements and benefits
  Service expectations for both HathiTrust and ReCAP
    turnaround time
    continuity of operations
    access privileges
  With HathiTrust, all are par for the course
  As partners in the cloud library…
    preservation of texts and metadata
    longevity and perpetuity
    trust and reliability
    access to titles not held by library (comprehensive)
    opportunity for voice in HathiTrust development

Questions?
  Constance Malpas (OCLC): malpasc@oclc.org
  John Wilkin (HathiTrust): jpwilkin@umich.edu
  Kat Hagedorn (HathiTrust): khage@umich.edu
  http://hathitrust.org/
  hathitrust-info@umich.edu