Project Kick-Off Meeting
September 13, 2013

Today’s Objectives
• Who and what is SCS?
• Rethinking Library Resources
• Data-Driven Deselection
• Overview: SCS Processes
  – Planning & Requirements Gathering
  – Data Preparation
  – GreenGlass (web-based collection analysis application)
  – Group Collection Summary

Today’s Objectives
• VIVA Project Goals
• VIVA Project Scope
• Proposed Project Schedule & Dependencies
• Project Roles and Communication
• Review Initial Tasks
  – Questions: Data
  – Questions: Collections
• Clarify everything we can!

Your Objectives?
• Introductions
• What do you and your colleagues hope to achieve with this project?
• What would a successful outcome look like?

Sustainablecollections.com

Who and what is SCS?
• Founded in February 2011
• Principals
  – Chief Analytics Officer [Andy Breeding]
  – Chief Technical Officer [Eric Redman]
  – Chief Operations Officer [Ruth Fischer]
  – Chief Executive Officer [Rick Lugg]
  – OCLC [Strategic Partner]
• 100+ projects to date

SCS Mission
To help libraries manage and share print monographs

Actionable Collection Intelligence℠

Deselection: Defined Broadly
• ‘Deselection’ can encompass a number of different goals:
  • Transfer to offsite storage, automated storage & retrieval systems (ASRS) or compact shelving
  • Shared Print Archiving
  • Retention and Preservation
  • Digitization
  • Weeding or Withdrawal

Broader Collection Analytics
• Identifying & protecting scarcely-held titles
• Gap analysis
• Overlap analysis
• Exact edition vs. Any edition
• Print/E-Book overlap [not quite there yet]
• Using historical data to influence ongoing collection development

Good Decisions Require Data
• How many holdings/copies?
• Where are they?
• Is the title securely archived?
• Can the title be accessed quickly? Can the title be reobtained if needed?
• Collection strengths
• What options are available for each title?
• What will the data support?

SCS Group Projects to date
• Michigan Shared Print Initiative (MI-SPI)
• California State University System
• Connect New York
• Maine Shared Collections Strategy
• VIVA Videos
• Central Iowa Collaborative Collections Initiative (CICCI)
• Washington Research Library Consortium (WRLC)

RETHINKING LIBRARY RESOURCES

Evolution of the Library Paradigm
Reader-centered: from monastic scriptorium and library; dominated by light and reading tables
Book-centered: collection growth; unrelenting need for more shelving
Learning-centered: digital content; information commons; learning spaces; information literacy
Source: Scott Bennett, Libraries and Learning: A History of Paradigm Change (2003)

The Problem
• Stacks are overcrowded
• Use of print books is low and declining
• Library space is wanted for other purposes
• Print redundancy is significant
• The cost of keeping books on shelves is high
• Alternatives exist, but data is scattered
• Traditional approaches to deselection are costly and time-consuming

Stacks are crowded and empty

Circulation in Academic Libraries Continues to Decline
37% Decline

Space Requirements: Monographs

Volumes      Square Feet
100,000      20,000
250,000      45,000
500,000      80,000
1,000,000    150,000
2,700,000    405,000

Source: Stephen R. Lawrence, Lynn Silipigni Connaway, and Keith H. Brigham, “Life Cycle Costs of Library Collections” College & Research Libraries, November 2001, p. 546.
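
The space figures above can be checked with simple arithmetic. The sketch below is not part of the original slides: it recomputes square feet per volume for each row of the table and estimates space for a collection size that falls between rows. The straight-line interpolation and the function names are assumptions made for illustration only.

```python
# Figures copied from the "Space Requirements: Monographs" table above
# (Lawrence, Connaway, and Brigham, 2001): (volumes, square feet).
SPACE_TABLE = [
    (100_000, 20_000),
    (250_000, 45_000),
    (500_000, 80_000),
    (1_000_000, 150_000),
    (2_700_000, 405_000),
]


def square_feet_per_volume(volumes: int, square_feet: int) -> float:
    """Floor space needed per volume for one table row."""
    return square_feet / volumes


def estimate_square_feet(volumes: int) -> float:
    """Estimate floor space for a collection size inside the table's range.

    Assumption: straight-line interpolation between adjacent rows. The
    source table gives only five points, so this is an illustration,
    not a planning formula.
    """
    lowest, highest = SPACE_TABLE[0][0], SPACE_TABLE[-1][0]
    if not lowest <= volumes <= highest:
        raise ValueError("collection size is outside the table's range")
    for (v0, s0), (v1, s1) in zip(SPACE_TABLE, SPACE_TABLE[1:]):
        if v0 <= volumes <= v1:
            return s0 + (volumes - v0) / (v1 - v0) * (s1 - s0)
    raise AssertionError("unreachable: range was checked above")


if __name__ == "__main__":
    for vols, sqft in SPACE_TABLE:
        density = square_feet_per_volume(vols, sqft)
        print(f"{vols:>9,} volumes: {density:.2f} sq ft per volume")
    print(f"175,000 volumes: about {estimate_square_feet(175_000):,.0f} sq ft")
```

On the table's own figures, space per volume falls from 0.20 sq ft at 100,000 volumes to 0.15 sq ft at 1,000,000 volumes and above.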

Library space is wanted for other purposes…
“The crowding out of readers by reading materials is one of the most common and disturbing ironies in library space planning.”
-- Scott Bennett

Lifecycle Costs: Monographs
• CLIR, June 2010
• Courant & Nielsen
• Estimated Annual Costs
  – $4.26/volume annually in central stacks
  – $0.86/volume in high-density facility

Print redundancy is significant…
Potential for shared print and local reductions

Two functions of library print collections
• Preservation function
• “Dispensing” function
Source: Michael Buckland, Redesigning Library Services: A Manifesto (Chicago: American Library Association, 1992).

Strong preferences: print, self-sufficiency
• Hathi Trust or other digital surrogate
• Print in Collective Collection
• Print in state
• Print within group

‘Archive’ copies
• Print Archives
  • Failsafe for technological or natural disaster
  • New digital surrogates or re-digitization
  • Dark, dim, or light?
  • People trust print
• Digital Archives
  • Secure, high-quality
  • Hathi Trust, Portico
  • CRL certification

‘Service’ copies
• Once content is securely archived, ‘dispensing’ function can be managed with fewer copies
• Focus on distribution, convenience, speed of delivery
• Borrow or re-purchase; print, electronic (including PDA, DDA, Short-term Loan); POD

Surplus copies
• Archiving requirements satisfied
• Sufficient service copies to meet anticipated demand
• How many holdings/copies remain?
• Are all of them needed?
• Share? Store? Withdraw?

The Case of Bertrand Russell…
Alternatives exist, but the data is scattered…

Shared Collections?

Shared Benefit?

Independent action in a collective context

Deselection Metadata

Collaborative Analysis

Collective Decisions Based on Data
[Chart: stacked bars for categories 1, 2, and 3-12; series: 0 Circs, 1-3 Circs, 4+ Circs]

OVERVIEW: SCS PROCESSES

Project Segments
• Planning & requirements gathering
• Getting usable catalog extracts
• Data preparation and review
=====================================================
• Group collection summary
• Scenario development
• Iterations
• Candidate lists

Planning & Requirements Gathering
1. Project goals and strategies
2. Collections and analytical strategies
3. Cataloging practices and data extracts

Project Goals and Strategies
• Take time to understand member needs and perspectives:
  – Where does collection assessment fit in the hierarchy of goals?
  – Differing levels of urgency? Space pressures?
  – Differing philosophies related to shared print?
  – Agreement on what constitutes a successful outcome?
• Equity issues should be discussed (very large versus very small collections)
• Is everyone in a position to make retention commitments?

Project Goals and Strategies
• How does a collection analysis project relate to the development of a Memorandum of Understanding, last-copy policy, and other shared print initiatives?
  – Timing
  – Membership
  – Duration of commitments

Collections and Analytical Strategies
• Ensure a shared understanding of the scope of the project
• Most productive focus: circulating print monographs
• Which libraries, which branches?
• What comparisons will the group’s data support?

Collections and Analytical Strategies
• Define comparator groups
  – VIVA pilot group
  – Other VIVA libraries*
  – Other groups or individual libraries (TBD)*
  – Other libraries in the state (standard)*
  – US libraries (standard)*
  – Global libraries (standard)*
* Based on WorldCat holdings

Comparator library groups

Library                               OCLC Symbol

UNI UI ISU
University of Northern Iowa           NIU
University of Iowa                    NUI
Iowa State University                 IWA

Other IPAL
Ashford University                    IO9
Briar Cliff University                IOB
Buena Vista University                IOE
Clarke University                     IOC
Coe College                           ION
Cornell College                       IMV
Des Moines University                 IWO
Divine Word College                   DIV
Dordt College                         IOT
Graceland College                     IOF
Iowa Wesleyan                         IOI
Loras College                         IOL
Luther College                        IOH
Maharishi University of Management    MIU
Mercy College of Health Sciences      Y4Q
Morningside College                   IOM
Mount Mercy College                   UIW
Northwestern College                  IOO
Palmer College of Chiropractic        PWT
St. Ambrose University                IOJ
University of Dubuque                 IOV
Upper Iowa University                 IOY
Waldorf College                       IX5
Wartburg College                      IOW
Wartburg Theological Seminary         IWT
William Penn University               IOX

Collections and Analytical Strategies
• Define group-wide local interest materials – to be protected from withdrawal? For retention commitments?
• Local title protection rules – for individual member libraries?

Collections and Analytical Strategies
• External comparisons
  – WorldCat Holdings
  – Hathi Trust In-Copyright
  – Hathi Trust Public Domain
  – Internet Archive
  – CHOICE
  – CHOICE Outstanding Academic Titles

Collections and Analytical Strategies
• Will subject analysis be wanted?
• What will the goals of the subject analysis be?
• Dewey to LC cross-walk needed? Other cataloging schemas?

How to deal with uneven depths of data?
Library      Total Charges   Earliest Last Charge Date
Library 1    20 years        6/29/1993
Library 2    11 years        6/26/2002
Library 3    7 years         1/20/2005
Library 4    23 years        7/23/1990
Library 5    15 years        9/22/1998

Extensive item data will be collected
• item call number
• location code*
• volume
• item type code*
• last reserve date
• note field*
• copy #
• opac message*
• in-house uses
• item status code*
• barcode
• total checkouts
• last check-in date
• item record number
• last check-out date
• item create date

Getting usable catalog extracts
• SCS will specify the desired data
• Data call(s) with system librarians and catalogers
• As needed, we can arrange for assistance
• SCS will set up an FTP area for extract delivery
• SCS will review test extracts and request changes if necessary
• SCS will confirm successful delivery of all extracts

SCS normalizes the data from each library
Bibliographic, item, circulation, and holdings data extracted, transformed, and loaded to an SCS Postgres database
• Filter out-of-scope bib records (eBooks, maps, DVDs, Gov Docs)
• Eliminate duplicate bib records
• Normalize call numbers
• Eliminate trailing spaces in control numbers
• Validate OCLC numbers
• LCCN/title-string lookups for records lacking an OCLC number
• Identify and accommodate unusual implementations of MARC
• Identify bibs without items and items with multiple bib records
• Map item-level data and interpret codes
• Assign LC (and/or Dewey) Classes to records

Data counts that we will want you to validate
• Bib record counts, filtered and unfiltered
• Bib records filtered out by cause
• Circulation / internal use counts
• Title/item counts by location

Bib records filtered out by cause: example

Bib Records filtered out                  21,675
Government docs                            1,880
Non-language materials                     2,821
Non-monographic materials                  1,880
Non-print resources                       13,725
Unable to obtain OCLC number               3,461
Bib Title/Author mismatch with OCLC          279
Multiple OCLC numbers per record              47

Individual Library Data Loaded to GreenGlass
• Visualize your library’s collection
• Run queries against your library’s collection
• See your collection in the context of
  • Usage
  • Age
  • Overlap
• Understand your collection in new ways
• See GreenGlass videos on the SCS website

Data Remediation Lists Available in GreenGlass
• OCLC numbers assigned by SCS
• Records without OCLC numbers
• Holdings not set in WorldCat
• WorldCat Title/Author Risk
• Multiple OCLC numbers
• Other
• Hathi Public Domain titles
• Hathi/Internet Archive URLs

GreenGlass Caveats
• Aside from taking advantage of catalog remediation lists, it is important that no action be taken based on individual GreenGlass modeling, etc.
• Group level analysis will be provided offline.

GROUP COLLECTION SUMMARY

Recorded Uses

Row  Title-Holding Counts              All Libraries     %
1    All Title Holdings - Filtered         1,048,251  100%

Row  Recorded Use Counts               All Libraries     %
2    Total Recorded Uses = 0                 448,173   43%
3    Total Recorded Uses = 1                 208,568   20%
4    Total Recorded Uses = 2                 119,039   11%
5    Total Recorded Uses = 3                  73,754    7%
6    Total Recorded Uses 4-9                 150,156   14%
7    Total Recorded Uses > 10                 48,651    5%
14   Last charge after 2010                  104,933   10%
15   Last charge after 2007                  211,842   20%
16   Last charge after 2005                  272,626   26%

WorldCat™ Counts – US

Row  WorldCat Counts - US - Specific Edition   Title Holdings     %
2    Unique in the US                                   2,804    0%
4    2-4 Holdings in the US                             7,327    1%
6    5-9 Holdings in US                                10,822    1%
8    10-19 Holdings in US                              19,452    2%
10   20+ Holdings In US                             1,007,213   96%
12   50+ Holdings in US                               953,539   91%
14   100+ Holdings in the US                          875,579   84%
16   200+ Holdings in the US                          728,019   69%

Overlap based on SCS Matching – for a 5 Library Group

Row  Overlap within the 5 participating libraries   Title Holdings     %
2    Unique in group                                       526,526   50%
3    Title-holdings in 2 libraries                         280,360   27%
4    Title-holdings in 3 libraries                         154,351   15%
5    Title-holdings in 4 libraries                          68,681    7%
6    Title-holdings in all 5 libraries                      18,333    2%

Overlap with a Peer Group

Row  Overlap with other IPAL libraries – specific editions   Title Holdings     %
29   WorldCat holding set in 1 other IPAL Library                   170,962   16%
30   WorldCat holding set in 2-4 other IPAL libraries               293,053   28%
32   WorldCat holding set in 5-9 other IPAL libraries               155,259   15%
34   WorldCat holding set in 10+ other IPAL libraries                33,678    3%

Title-Holdings by Publication Year
[Chart: title-holdings (0 to 25,000) by publication year, 1900 to 2010]

Title Holdings by LC Class
[Chart]

Holdings and Usage Levels Compared
[Chart: Number of Title Holdings; Average Uses per Title-Holding]

Hathi Trust and Internet Archive

Row  SCS Matches                          Title Holdings     %
9    Hathi Trust Public Domain Match              53,595    5%
10   Hathi Trust In-Copyright Match              455,250   43%
11   Internet Archive Match                      158,754   15%
12   In Internet Archive not in Hathi             60,875    6%
13   In Hathi not in Internet Archive            425,414   41%

After the Group Summary has been delivered …
Ask additional questions!
• What surprises?
• What do we still not know? What additional information should we ask for?
• How can we use the data to inform cooperative collection development agendas?
• How can we use the data to inform potential deselection projects?

Project Segments
• Planning & requirements gathering
• Getting usable catalog extracts
• Data preparation and review
=====================================================
• Group collection summary
• Scenario development
• Iterations
• Candidate lists

QUESTIONS

Today’s Objectives (2)
• VIVA Project Goals
• VIVA Project Scope
• Proposed Project Schedule & Dependencies
• Project Roles and Communication
• Initial Tasks
• Clarify everything we can

VIVA PROJECT GOALS

VIVA Project Goals
• Pilot a coordinated, consortial approach to collection assessment.
• Use the data and analysis to inform future, collaborative collection development.
• Identify scarcely-held titles in need of protection.
• Begin a discussion about the possibility of reducing unnecessary duplication and saving local space through strategic weeding.
• Provide remediated and enhanced records back to the participating schools.

VIVA Project Scope
• 10 data sets representing a range of ILS and institution types
  – Public doctoral
  – Public Four-year
  – Public Two-year
  – Private
• Compare pilot library holdings with rest of VIVA (via WorldCat holdings)
• Can this scale to include all of VIVA?

VIVA Project Scope
• Circulating Print Monographs (est. 5.8 million)
• English-language only
• Main libraries only (excludes Law, Health Sciences, and other specialized libraries)
• LC libraries only
• Out of Scope:
  • Reference
  • Government Documents
  • Special Collections
  • eBooks

High-level project schedule/dependencies
(Each entry: Task, Description, Dates)

Planning Meetings
  Description: Key players discuss data extracts, anomalies, peers, etc.
  Dates: Sept 2013 (You Are Here)
  Description: Comparators, local interest rules, scoping refinements
  Dates: Sept-Oct 2013

Data Preparation
  Description: Libraries prepare and deliver extracts to SCS. SCS validates, normalizes, matches, and performs holdings lookups.
  Dates: Sept-Dec 2013?

Group Collection Summary
  Description: Categorical overview of the group data set. Used to gauge opportunities and guide scenario development.
  Dates: Early 2014?

Collections Decisions Scenario Development
  Description: Project leaders suggest preliminary assessment criteria. SCS iterates and revises scenarios.
  Dates: Jan-April 2014

Candidate Lists
  Description: Detailed Excel spreadsheets for review, based on finalized criteria for retention and withdrawal. Modify as necessary.
  Dates: April-June 2014

Discussions Facilitation
  Description: This will be needed at many points – but especially around scenario development, allocation, and policy development.
  Dates: Throughout

Allocation
  Description: Assignment of withdrawal opportunities and retention commitments – based on many factors.
  Dates: TBD

List Production
  Description: Once allocation decisions have been made, SCS will derive title/item lists for use by individual libraries.
  Dates: TBD

Ongoing Data Management
  Description: SCS will maintain (but will not update) the VIVA dataset for 2 years, which can be used for additional projects.
  Dates: …

Project Roles & Communication
• VIVA Staff
  • Project Coordination
  • Communication
• Libraries
  • Local Operational Context
  • Input on Criteria, Policies
  • Local Data/Collections Perspective

Project Roles & Communication
• SCS
  • Data Management/Consolidation/Augmentation
  • Comparative Intelligence
  • GreenGlass
  • Framing & Facilitation
• Functional Departments
  • Systems/Data
  • Technical Services
  • Collection Development

Project Roles & Communication
• Collections & Resources for Users Committee
• Pilot Group local project managers
• Steering Committee
• Library Directors?
• Technical and domain experts
• Other VIVA libraries?
• Deadlines?

Project Roles & Decision-Making
• Collections Decisions
• Data Decisions
• Decision-Making: How will decisions be made, validated, communicated? How will discussions be conducted?
• Project Management: Representation
• Communication: Listserv?
• What happens when participants disagree?

Role of Task Force/Project Managers
• Task force (the local project managers) empowered to fine-tune the scope of the analysis, as long as the number of records will not increase.
• Task force will determine the peer groups to be used as comparisons. (Since the remainder of VIVA will take up around 70 OCLC symbols, there might not be a lot of options.)
• The group will be required to document and justify their choices and report to the Steering Committee.

INITIAL TASKS

Initial Tasks: Data-related
• Prepare Data Extracts
• Data Mapping Documentation
• Item/Status/Location Codes
• Circulation Data Elements

VIVA Data Extracts: 10 Sources, 12 Libraries

Institution               ILS                   OCLC Symbol   Estimated Records*   Notes
George Mason              Voyager               VGM           [750,000]            Also in WRLC. Re-use extract for VIVA.
Old Dominion              III Sierra            VOD           713,995
University of Virginia    SirsiDynix Symphony   VA@           1,240,421
Virginia Commonwealth     Alma                  VRC           884,649
Virginia Tech             Millennium            VPI           660,000
James Madison             Millennium            VMC           [460,443]            Extract already paid via JMU-specific project
Radford                   Millennium            VRA           233,809
Germanna CC               Aleph                 PZJ           31,730               CC Libraries share a system, so the three pilot libraries count as a single data extract
J. Sargeant Reynolds CC   Aleph                 PZL           66,062               See above
Mountain Empire CC        Aleph                 PZP           37,205               See above
University of Richmond    Voyager               VRU           382,228
Washington & Lee          Millennium            VLW           405,409

Est. Total Records to be Processed**        5,800,000
Est. Total Records to be paid In Pilot***   4,600,000

Cataloging Practices and Data Extracts
• 6 local systems
• 10 approaches to cataloging
• Extent and form of item data – will be unique for each library
• Some item data will be delivered in 945 sub-fields of the MARC record. Some will be delivered in delimited files.

Extensive item data will be collected
• item call number
• location code*
• volume
• item type code*
• last reserve date
• note field*
• copy #
• opac message*
• in-house uses
• item status code*
• barcode
• total checkouts
• last check-in date
• item record number
• last check-out date
• item create date

Initial Tasks: Collections-Related
• Confirm scope of analysis
• Define Comparator groups
• Discuss/Define Local Interest Rules
• CHOICE?
• ALL REVIEWS?
• OUTSTANDING ACADEMIC TITLES

VIVA Comparator Groups?
• SCS Limit: up to 100 OCLC holdings symbols in up to 5 groups. (VIVA: 70+ symbols)
• Geographic peers?
• Resource-sharing partners?
• Institutional peers?
• Consider: are you interested in archival security of content? Access for users?
• Are external comparators useful? VIVA may already be self-sufficient

Strong preferences: print, self-sufficiency
• Hathi Trust or other digital surrogate
• Print in Collective Collection
• Print in state
• Print within group

Local Interest Rules
• Categories of material to protect regardless of circulation levels
• Remember, we are focused on circulating print monographs
• These must be systematically identifiable; consistent data must be available

Local Interest: Examples

What do you want to know/learn?

DISCUSSION AND NEXT STEPS

In preparation for next steps
• Think about the questions you want to ask. Does it matter? Is it actionable?
• Think about which data points (and combinations of points) can help answer those questions
• Think about VIVA’s 5.8 million title-holdings as a single distributed collection (this is only an exercise)
• Think first about titles that have never circulated and are held by multiple libraries
• Think about storage, retention, and withdrawal
• Ask: what is the worst-case scenario?

Additional considerations
• Retention commitments
  – Equity across group – balancing withdrawal and retention
  – Duration of retention commitment (5-10-25?)
• Conservative vs aggressive libraries
• Ongoing or one & done?
• Complexity vs understandability…

Contact Info
http://sustainablecollections.com
rick@sustainablecollections.com
andy@sustainablecollections.com
ruth@sustainablecollections.com
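
Illustration: the preparation prompts ask participants to think first about titles that have never circulated and are held by multiple libraries. The sketch below shows one way such a question could be posed against pooled title-holding data. It is not SCS's method or the GreenGlass query model: the field names (`oclc_number`, `library`, `total_checkouts`), the sample rows, and the function name are all hypothetical, and it assumes titles have already been matched across libraries by OCLC number.

```python
from collections import defaultdict

# Made-up sample rows for illustration only. The field names are
# hypothetical and do not reflect any SCS or library extract layout.
SAMPLE_HOLDINGS = [
    {"oclc_number": "1001", "library": "Library A", "total_checkouts": 0},
    {"oclc_number": "1001", "library": "Library B", "total_checkouts": 0},
    {"oclc_number": "1002", "library": "Library A", "total_checkouts": 0},
    {"oclc_number": "1003", "library": "Library A", "total_checkouts": 0},
    {"oclc_number": "1003", "library": "Library C", "total_checkouts": 4},
    # Two copies at the same library still count as one holding library.
    {"oclc_number": "1004", "library": "Library B", "total_checkouts": 0},
    {"oclc_number": "1004", "library": "Library B", "total_checkouts": 0},
]


def never_circulated_multi_held(holdings, min_libraries=2):
    """Return OCLC numbers held by at least `min_libraries` libraries
    where no holding library has recorded a checkout."""
    # oclc_number -> library -> summed checkouts across that library's copies
    checkouts_by_title = defaultdict(lambda: defaultdict(int))
    for row in holdings:
        checkouts_by_title[row["oclc_number"]][row["library"]] += row["total_checkouts"]
    return sorted(
        oclc
        for oclc, by_library in checkouts_by_title.items()
        if len(by_library) >= min_libraries
        and all(count == 0 for count in by_library.values())
    )


if __name__ == "__main__":
    print(never_circulated_multi_held(SAMPLE_HOLDINGS))
```

In the sample data only title 1001 qualifies: 1002 and 1004 are each held by a single library, and 1003 has circulated at Library C. Real criteria would also need to account for the uneven depth of circulation data across libraries noted earlier in the deck.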