Endeca: A Faceted Search Solution for the Library Catalog
Kristin Antelman & Emily Lynema
UNC University Library Advisory Council, June 15, 2006

Overview
- Why did we do this?
- What is Endeca?
- NCSU investment
- Working with a non-library vendor
- Assessing the results

Some user reaction
"The new Endeca system is incredible. It would be difficult to exaggerate how much better it is than our old online card catalog (and therefore that of most other universities). I've found myself searching the catalog just for fun, whereas before it was a chore to find what I needed."
- NCSU undergraduate, Statistics

"The new library catalog search features are a big improvement over the old system. Not only is the search extremely fast, but seemingly it's much more intelligent as well."
- NCSU faculty, Psychology

Why did we do this?
Existing catalogs are hard to use:
- Known-item searching works reasonably well, but users often run keyword searches on topics and get large result sets back in system sort order.
- Catalogs are unforgiving of spelling errors and do little stemming.
The catalog's value is buried:
- Subject headings are not leveraged in searching; they should be browsed or linked from, not searched.
- Data from the item record is not leveraged; users should be able to filter by item type, location, circulation status, and popularity.

How does Endeca work?
- The Endeca Information Access Platform coexists with the SirsiDynix Unicorn ILS and the Web2 online catalog.
- Endeca indexes MARC records exported from Unicorn.
- The index is refreshed nightly with records added or updated during the previous day.

Endeca IAP overview
Raw MARC data -> NCSU exports and reformats -> flat text files -> Data Foundry (parses text files) -> MDEX Engine indices -> NCSU web application -> client browser (HTTP)

Quick demo
http://catalog.lib.ncsu.edu

Implementation team
Information Technology (4):
- Team chair and project manager - IT department head
- Technical lead - Java-trained librarian
- ILS librarian - managing data extracts
- Technical manager - determining appropriate technologies
Research and Information Services (1):
- Reference librarian - experience with public services and OPAC problems
Metadata and Cataloging (1):
- Cataloging librarian - identifying data for indexing and display; fixing backend data problems
Digital Library Initiatives (1):
- Interface developer - mockups, usability, beta testing
The team met weekly during implementation (40-60 hours total).

Implementation timeline
- License/negotiation: spring 2005
- Acquisition: summer 2005
- Implementation:
  - August 2005: vendor training
  - September 2005: finalize requirements
  - October 2005 - January 2006: design and development
  - January 12, 2006: go-live date
It doesn't have to be perfect!

Ongoing investments
- Little ongoing work is required for maintenance once the application is deployed:
  - infrequent data refreshing from the ILS
  - version upgrades
- A six-member product team meets biweekly.
- Lots of development ideas (as time and library priorities afford)!
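The faceted navigation this architecture supports can be illustrated with a small sketch. This is a hypothetical, simplified model: the sample records and the facet_counts/refine functions are invented for illustration, and bear no relation to the actual MDEX Engine internals.

```python
from collections import Counter

# Hypothetical miniature catalog; real data would come from the nightly MARC export.
records = [
    {"title": "Intro to Statistics", "format": "Book", "library": "D. H. Hill", "available": True},
    {"title": "Statistics Handbook", "format": "E-book", "library": "Online", "available": True},
    {"title": "Topology Lectures", "format": "Book", "library": "D. H. Hill", "available": False},
]

def facet_counts(recs, dimension):
    """Count how many records fall under each value of one facet dimension."""
    return Counter(r[dimension] for r in recs)

def refine(recs, dimension, value):
    """Narrow the result set by selecting one facet value."""
    return [r for r in recs if r[dimension] == value]

# After a search, the UI shows counts per dimension (Format, Library, etc.)...
print(facet_counts(records, "format"))  # -> Counter({'Book': 2, 'E-book': 1})
# ...and clicking a value refines the set without a new search.
books = refine(records, "format", "Book")
print([r["title"] for r in books])      # -> ['Intro to Statistics', 'Topology Lectures']
```

The same refine step can be chained across dimensions, which is what lets users drill from Format into Library into Availability.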
Saving time previously invested in Web2 OPAC enhancement.

MarcAdapter: a case study
- The NCSU implementation required a local program to transform MARC data for Endeca.
- Endeca staff recognized the effort required to duplicate this process at each library, and quickly created a MarcAdapter plugin for raw MARC data:
  - ability to create local field mappings and special-case handlers
  - eliminates the need for external MARC 21 translation and file merging

Basic statistics (March - May 2006)
Requests by search type:
- Search: 51%
- Search -> navigation: 29%
- Navigation: 20%

Navigation statistics (March - May 2006)
Navigation requests by dimension:
- LC Classification: 169,249
- Subject: Topic: 155,856
- Library: 87,221
- Format: 74,985
- Author: 70,516
- Subject: Genre: 65,545
- Subject: Region: 59,248
- Subject: Era: 38,605
- Language: 38,074
- Availability: 23,848

Navigation by dimension (share of requests):
- LC Classification: 20%
- Subject: Topic: 19%
- Library: 11%
- Format: 9%
- Author: 9%
- Subject: Genre: 8%
- Subject: Region: 7%
- Subject: Era: 5%
- Language: 5%
- New: 4%
- Availability: 3%

Sorting statistics (March - May 2006)
Sorting requests:
- Pub date: 53%
- Most popular: 19%
- Title A-Z: 13%
- Author A-Z: 9%
- Call number: 6%

Other interesting tidbits...
- Authority searching decreased 45%; keyword searching increased 230% (March 2006). Caveat: the default catalog search changed from title authority to keyword.
- About 5% of keyword searches offered spelling correction or suggestion:
  - 3.1% automatic spell correction
  - 2.3% "Did you mean..." suggestion

Usability testing
- 10 undergraduate students: 5 with the Endeca catalog, 5 with the old Web2 OPAC.
- Endeca performed as well as the OPAC for known-item searching:
  - 89% of Endeca tasks completed "easily" (8/9)
  - 71% of OPAC tasks completed "easily" (15/21)
- Endeca performed better than the OPAC for topical searching.
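The kind of MARC-to-flat-text transformation that MarcAdapter automates can be sketched as follows. This is a hypothetical illustration: the FIELD_MAP table and marc_to_flat function are invented, and a real adapter would parse binary MARC 21 records with their indicators and subfields rather than a simple dictionary.

```python
# Hypothetical mapping from MARC tags to flat index fields,
# loosely modeled on MarcAdapter's configurable field mappings.
FIELD_MAP = {
    "245": "title",
    "100": "author",
    "650": "subject_topic",
}

def marc_to_flat(record):
    """Flatten an already-parsed MARC record (tag -> list of values)
    into the kind of delimited text rows an indexer could ingest."""
    rows = []
    for tag, values in record.items():
        field = FIELD_MAP.get(tag)
        if field is None:
            continue  # unmapped tags are dropped; real adapters allow special-case handlers
        for value in values:
            rows.append(f"{field}|{value}")
    return rows

sample = {
    "245": ["Endeca and the library catalog"],
    "650": ["Online library catalogs", "Faceted search"],
}
print(marc_to_flat(sample))
# -> ['title|Endeca and the library catalog',
#     'subject_topic|Online library catalogs',
#     'subject_topic|Faceted search']
```

The value of a plugin like MarcAdapter is that each library only supplies a mapping table like FIELD_MAP instead of writing the whole transformation program.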
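The "Did you mean..." behavior described in the statistics above can be approximated with Python's standard-library difflib. This is only an illustrative stand-in, not how Endeca computes its suggestions, and the vocabulary list is invented; a real system would harvest terms from the index itself.

```python
import difflib

# Hypothetical vocabulary; in practice this would come from indexed titles and subjects.
vocabulary = ["statistics", "psychology", "catalog", "topology", "librarianship"]

def did_you_mean(query, vocab, cutoff=0.75):
    """Return the closest vocabulary term to a (possibly misspelled) query,
    or None if nothing is similar enough to suggest."""
    matches = difflib.get_close_matches(query.lower(), vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(did_you_mean("statistcs", vocabulary))  # -> statistics
print(did_you_mean("zzzz", vocabulary))       # -> None
```

The cutoff parameter controls how aggressive the suggestions are; a system could auto-correct above a high threshold and merely suggest in a middle band, which would produce the two-tier behavior (3.1% auto-corrected, 2.3% suggested) seen in the statistics.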
Topical searching tasks
Topical task success, Web2:
- Easy: 36%
- Medium: 7%
- Hard: 23%
- Failed: 34%
Topical task success, Endeca:
- Easy: 58%
- Medium: 17%
- Hard: 3%
- Failed: 22%
Average topical task duration

A relevance study
Are search results in Endeca more likely to be relevant to a user's query than search results in the Web2 OPAC?
- 100 topical user searches from one month in fall 2005
- How many of the top 5 results were relevant?
  - 40% relevant in the Web2 OPAC
  - 68% relevant in the Endeca catalog

Future plans
- FRBR-ized displays
- FAST (Faceted Application of Subject Terminology) instead of LCSH
- Enrich records with supplemental content from Web services: more usable TOCs, book reviews, etc.
- More integration with website search
- Use Endeca to index local collections

Thanks
http://www.lib.ncsu.edu/endeca
Emily Lynema, Systems Librarian for Digital Projects
emily_lynema@ncsu.edu