Discovery and Access: Primo or What’s Next?
Prepared by Lisa Janicke Hinchliffe and Bill Mischo
With the Advice of the Discovery and Delivery Study Team
University Library Faculty/AP Un-Retreat
January 7, 2014

A foundational purpose of a library is making available materials for the user community. Specifically, one of the University Library’s guiding values is “improving access to library content and collections”1 and Goal 1 of the Strategic Initiatives is to “Promote Access to, and Discovery of, Library Content and Collections.”2 The University Library has pursued access through a variety of initiatives in the past decade and continues to seek improvements in this area.

Easy Search

In 2007, in response to concerns about the Library Gateway system expressed by Library staff and users, the Library deployed a new public facing Gateway that featured Easy Search. Easy Search is a locally developed federated search and recommender tool and functions as the primary search agent operating over licensed abstracting/indexing services, the catalog, ebook collections, media, and other information resources. Easy Search was originally developed just after the Library abandoned implementation of the Web Feat federated search tool due to technical and usability problems.3 With the development and deployment of Easy Search, the Library achieved the goal set out by the Access Working Group in 2003 for a federated/broadcast search tool that could serve as a cornerstone of an integrated information environment for our users.4

Since 2007, the Library has developed custom Easy Search implementations for various departmental libraries, including the UGL, Music and Performing Arts, Grainger Engineering, ACES, Biology/Biotechnology, Classics, Geology, Global Studies, International and Area Studies, Library and Employment Relations, Library and Information Science, Literatures and Languages, Physics, and University High School.
Over time, detailed analysis of the Gateway Easy Search custom transaction logs has informed the implementation of a set of search assistance mechanisms and search tips within the Easy Search results display.

Primo

In 2012, the Library introduced the Primo web-scale discovery system. Primo went live to library personnel on September 29, 2012 and to the public on February 13, 2013. Primo uses an aggregated single indexing structure over catalog records, journal and newspaper articles, the AtoZ journal list, and local digital collections. Specifically, Primo indexes the online catalog, the SFX knowledgebase, IDEALS, ContentDM, and LibGuides as well as activated collections in the Primo Central Index. Primo also offers the ability to conduct an “unblended” search against locally developed scopes such as the UIUC Online Catalog, Online Journals and Databases, UIUC Created Content, or Articles and More (Primo Central Index). Primo “blends” search results into a single, relevancy-ranked display, with facet limiters for refining and focusing search results by resource type, subject, collection, author name, year, and language.

The Easy Search search assistance mechanisms have been implemented within a custom tile in the Primo display, enhancing the functionality of Primo and leveraging our local development work in Easy Search. The custom tile includes direct links to exact journal and database title matches, links to relevant LibGuides, author search reformulation recommendations, spelling checks, and DOI searches. It also creates new title searches for search strings of more than three words or titles that contain capitalized words.

The Web-Scale Implementation Team has conducted a number of evaluation studies over the past months. These studies have included a usability study, user survey, search log transcript analysis, and testing relative to topics from Composition courses.
The results from these studies have informed the development of our Primo implementation while also revealing some of the limitations of the system.

The implementation of Primo (as a web-scale discovery tool) has allowed us to examine key issues in search and discovery, including: the role of a web-scale discovery system in the Library's Gateway; the relationship between a web-scale aggregated central index and the specialty disciplinary abstracting and indexing services the Library licenses; the effectiveness of vendor databases such as EBSCO databases, ISI, and Scopus when integrated into Primo; the value of blended display; instructional issues; the relationship between a web-scale system and a federated search/recommender system such as Easy Search; the efficacy of full-text search as compared with metadata-based searching; user search behavior relative to a web-scale discovery system; and many other issues.

The Primo implementation has had mixed results. Many library staff find that Primo combines some of the best search features of vuFind and webVoyage (Classic Voyager) and is an excellent catalog interface for our local catalog. An additional value of Primo is its FRBRization and de-duplication features, which pull together multiple editions and formats of the same title. This functionality of merging records in the public display is something the Library has been seeking to achieve for many years, and it works relatively well in Primo; however, there are known problems, particularly with respect to activating records for the HathiTrust collection.

Searching the Primo Central Index has both strengths and weaknesses. The Primo Central Index is very comprehensive in size and depth of coverage. It does exact title phrase searches well. The pre-filter options in Primo work to limit by format, fielded search (title, author, subject, etc.), or type of search (keyword, phrase, or starts with).
The advanced search features in Primo provide effective filtering alternatives when searching with multiple search terms and/or against multiple metadata fields.

Unfortunately, for many other searches, particularly general keyword searches or searches that mix fields from title, author, subject and/or publication dates, Primo does not provide adequate results. Primo's default searching is against full-text content, which is problematic when 50% of our user searches are for known items. Most of the time, irrelevant results dominate what the user sees. Primo's search algorithms and relevancy rankings are currently inadequate and are not as robust as those of other vendors (e.g., EBSCO's EDS or ProQuest's Summon). For these reasons, we have found that Primo is a poor resource for topical undergraduate research, and in most cases we find its use as a tool in library instruction for undergraduate students questionable. Also, many users still prefer to use specific EBSCO databases for discipline-based research and, at present, search results generated in Primo through the EBSCO API cannot be successfully merged with the results from the Primo Central Index.

Finally, starting in May 2013, we discovered we were having issues renormalizing and re-indexing our local data in Primo. After consulting with Ex Libris, we were informed we were reaching capacity with the current dedicated server setup, which can hold between 6 and 7 million records depending on the size of the metadata records. We are currently at more than 6.6 million records.

The implementation of Primo was intentionally set to be three years to give the Library an opportunity to test the next generation of web-scale discovery systems and determine what really works with these services and what challenges there are.
Primo meets some of the needs the Library identified at the outset of implementation; however, there are limitations in its functionality and it does not meet all of the requirements. While the Discovery and Delivery Study Team undertakes its work, the Web-Scale Implementation Team has recommended that the Library keep Primo available in its current state but not proceed with additional development or customization of the system at this time.

Discovery and Delivery Strategy

In order to gather data concerning our future needs and to review our options for moving forward, the Content, Access, Policy and Technology (CAPT) Committee has appointed a Discovery and Delivery Study Team, co-chaired by Lisa Hinchliffe and Bill Mischo.5 The Study Team is charged to develop and recommend a “discovery and delivery strategy” for the Library through broad consultation and engagement. Developing this strategy will entail a comprehensive review of how the Library currently facilitates discovery of and provides access to content; the marketplace of current and emerging search, retrieval and access technologies; and approaches for coordinating methods and techniques throughout the Library's decentralized service structure; as well as articulations of principles and assumptions that should guide the Library's work in this area.

Discussion Questions

What principles should be the foundation for the Library’s “discovery and delivery strategy” (e.g., fully develop and implement fewer tools)?
What does it mean to be “in the flow” of our users’ work?
How can we best engage our user communities in order to understand their information search and retrieval needs?
What practices could we adopt in the University Library to achieve a more coherent and efficient search, discovery, and delivery experience for our users?
What are best practices in nimble implementation/retirement of systems?

1 Page 2, http://www.library.illinois.edu/committee/exec/documents/20112012/Library_Strategic_Initiatives_Final.pdf
2 Page 3, http://www.library.illinois.edu/committee/exec/documents/20112012/Library_Strategic_Initiatives_Final.pdf
3 Federated Search Pilot Project Implementation Team: Final Report, August 28, 2006, http://www.library.illinois.edu/committee/capt/supplement/2006-2007/FedsFinalReport.pdf
4 The Access Working Group Priority Matrix is not online currently but will be posted on the Discovery and Delivery Study Team website with the minutes from the first meeting of that group.
5 http://www.library.illinois.edu/committee/capt/workinggroups/discoverydelivery/