ELECTRONIC RESOURCES IN A NEXT GENERATION CATALOG
Wendy Robertson
The University of Iowa Libraries
Electronic Resources & Libraries, 2008

OVERVIEW
Next generation catalogs and Primo
Old “Smart Search” at The University of Iowa
Implementation issues
New “Smart Search”
Features and examples
Problems
Moving where users are
Future plans

WHAT TYPES OF ELECTRONIC RESOURCES?
Licensed/purchased full content (journals, books, audio, maps etc.)
Licensed/purchased databases
Local digital content (images, audio, video etc.)
Local full text
Local websites (including finding aids)

WHAT IS A NEXT GENERATION CATALOG?
“It’s designed less like a “catalog”—an inventory list—and more like a finding aid. It contains data as well as metadata, and it is bent on doing things with found items beyond listing and providing access to them.”
  LITA blog, July 7, 2006
Examples:
  NCSU’s Endeca implementation
  Open WorldCat (OCLC)
  Primo (Ex Libris)
  AquaBrowser Library (Bowker), e.g. University of Chicago
  Encore (Innovative Interfaces), e.g. Michigan State

SELECTED FEATURES OF A NEXT GENERATION CATALOG
Faceted navigation
Federated searching
Full text searching
Interaction with other systems/use of APIs
Multiple works merged (FRBR)
Notification of new items by topic etc.
Personalization, tagging
Readers’ advisory/recommendations
Relevancy ranking
Reviews
Search terms highlighted
Spell checking, did you mean…?
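Of the features listed above, faceted navigation is the one that depends most directly on record metadata. The following is a minimal Python sketch of the idea, not how Primo or any product named here implements it. The records, the field names (`resource_type`, `subject`) and the helper functions `facet_counts` and `narrow` are all invented for illustration.

```python
from collections import Counter

# Hypothetical records; titles, field names and values are invented for illustration.
RECORDS = [
    {"title": "The Magic Flute", "resource_type": "Audio", "subject": ["Opera"]},
    {"title": "Mozart: A Life", "resource_type": "Book", "subject": ["Composers", "Biography"]},
    {"title": "Mozart Studies", "resource_type": "Journal", "subject": ["Composers"]},
    {"title": "Requiem", "resource_type": "Audio", "subject": ["Masses"]},
]


def _values(record, field):
    """Return the values of a field as a list (single values are wrapped)."""
    value = record.get(field, [])
    return value if isinstance(value, list) else [value]


def facet_counts(records, field):
    """Count how many records carry each value of a facet field."""
    counts = Counter()
    for record in records:
        counts.update(_values(record, field))
    return counts


def narrow(records, field, wanted):
    """Keep only the records whose facet field includes the wanted value."""
    return [r for r in records if wanted in _values(r, field)]


if __name__ == "__main__":
    # Facet counts shown next to a result list, then one facet applied.
    print(facet_counts(RECORDS, "resource_type"))
    print([r["title"] for r in narrow(RECORDS, "resource_type", "Audio")])
```

A facet can only offer values that are actually present in the records, so a field that is empty or inconsistently filled in simply produces no useful facet.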

WHAT WE ARE AIMING FOR
Simple to use, single search box for all our content
With high quality content and good metadata

PRIMO
The University of Iowa’s choice for a next generation catalog
Finding and discovery tool
Not meant for the advanced researcher
Work in progress (“Everything is Beta”)
Does not yet have all possible features of a next generation catalog

“SMART SEARCH” BEFORE
Locally created search of
  Library Catalog (keyword search)
  E-journal A-Z list
  Local database of databases, websites, and book and journal collections (previously called the “Gateway”)
  Libraries website
Results from 4 sources not merged
Did not include digital collections

E-resources displayed in upper left (always at top, searching in collection of <2000 items)
Top 5 results display for each source
Click “more” for additional resources

E-resources displayed in alphabetical order
No separate interface

E-journals displayed in alphabetical order
Separate interface available
A-Z list from SFX
This interface still exists

Website results displayed in limited relevancy order
Originally no separate interface

Catalog results displayed in reverse system number order
Separate interface available
Traditional ILS
This interface still exists

Sorted by date, author, title
Digital Content Management System (ContentDM)
This interface still exists
No cross-searching with other resources

PRIMO TIMELINE
Worked on implementation summer 2007
Focused on
  Indexing, display, faceting
  How to load data
  Basic functionality
  Appearance, branding
Local soft release in late September
http://smartsearch.uiowa.edu
Full release in mid-January
Two updates implemented since then
V.2 will come out this spring

GETTING INFORMATION INTO PRIMO: CATALOG
Not a live connection; records need to be loaded
Loader exists for MARC
Aleph catalog loaded multiple times a day
  New and updated records loaded
  Records with changes to circulation information loaded

GETTING INFORMATION INTO PRIMO: A-Z LIST
Changed procedures to use MARCit records for packages, consortial agreements and free titles
Primo gives us the single record display we had been wanting
Change in ARL stats gave us more flexibility
Loaded missing titles into Aleph

GETTING INFORMATION INTO PRIMO: E-RESOURCES DATABASE
Added field for Aleph ID
Loaded basic records into Aleph
Database had only brief information
Standardized publisher information
Added 930 fields to existing records (controlled vocabulary and misc. terms)

GETTING INFORMATION INTO PRIMO: CONTENTDM
Loader exists for Dublin Core
We use LC Authorities when possible in CDM
DC lacks the structure of MARC, so some manipulation of complex names is not possible
Assess how subjects and types can best work with facets
Results varied depending on CDM collection settings (standardizing)
Some data inconsistencies in CDM (standardizing)

EXAMPLE DATA FROM CDM
http://digital.lib.uiowa.edu/cgi-bin/oai.exe?verb=ListRecords&set=uipress&metadataPrefix=oai_dc
(http://digital.lib.uiowa.edu/cgi-bin/oai.exe?verb=ListRecords&set=uipress&metadataPrefix=qdc)

NEW SMART SEARCH
Electronic resources database and digital resources completely integrated with traditional catalog resources
Federated search is a separate option
  Could be merged with local resources
  Non-local database searching is slower
  Ex Libris working with vendors to improve response time
At this time the Libraries website is not included

Digital object from CDM
Collection level record for digital collection from catalog
Traditional MARC records from ILS
Federated search option
Large results can be managed with faceting

Single record
Merged display of print and online records
These come from the electronic resources database
Digital objects usually are under resource type images or text resources etc., but in this case they are 3-D objects
Image from CDM

INCLUSION OF LIBRARIES WEBSITE
Goal was to have libraries website included at full release
Public service said not critical
Still very important for Special Collections finding aids
Separate search available

WEBSITE
Current status: Successfully crawled www.lib.uiowa.edu (omitting pages that don't make sense).
Modified an open source Perl product, Swish-E Spider: http://swish-e.org/docs/spider.html
Hopefully live before the end of the semester
Our biggest challenges:
  Crawling logic: making sure we don't inadvertently access URLs that time out
  Character encoding as related to HTML and XML entities; we've had to tweak standard Perl packages

MERGING RECORDS IN PRIMO
Two separate functions: de-duplication and FRBRization
Rules assess similarity between records. Those that meet a threshold for similarity will be merged.
Dedup records are completely merged; individual records cannot be viewed in Primo but do have a link to the Aleph catalog
FRBR records are merged for display, but also allow viewing of individual titles

EXAMPLE OF SINGLE RECORD DISPLAY
Single record. Online access shows on brief results. Single link to Aleph catalog.

EXAMPLE OF DEDUP PRINT + ONLINE
Online record takes priority for display
Single record. Online access shows on brief results. Two links to Aleph catalog.

EXAMPLE OF FRBR ONLINE + PRINT
Online record takes priority for display
FRBR link
Print record. Published Washington DC, 1990-
Online record. Published Washington DC, 1995-

NAME DISPLAY FROM CDM
ILS names not inverted
CDM names inverted. I could not get them to display properly unless inverted

ILS & CDM NON-MERGER
Working on this
Few collections have individual objects both in catalog and in CDM

EXAMPLE: M.F.A. THESIS AND M.F.A. ART
Print thesis and image of thesis both in Smart Search
Imperfect because different sources for name
  LC NAF vs. ULAN (Union List of Artist Names)
  No authority record in this case
What artist calls self
Official registered name on thesis

KNOWN JOURNAL SEARCH – BEFORE

KNOWN JOURNAL SEARCH – AFTER

KNOWN JOURNAL SEARCH – BEFORE
Not on page

KNOWN JOURNAL SEARCH – AFTER

KNOWN E-RESOURCE SEARCH – BEFORE
Search for OED brings Oxford English Dictionary to top

KNOWN E-RESOURCE SEARCH – AFTER
Search for OED brings Oxford English Dictionary to top
Icons previously labeled, which confused library staff

KNOWN DATABASE SEARCH – BEFORE
All the Ebsco databases
Most popular happen to appear
Can easily get to the rest

KNOWN DATABASE SEARCH – AFTER
Resource type based on cataloging
Integrating resource still in BK format with 006s
May be lacking 008/21 d
CHANGE: Databases now a resource type
Computer file with 008/26 d or e

DID YOU MEAN…?

LOCAL ADDITIONS TO DID YOU MEAN…?
Selected words added at request of staff
Ulrichsweb is #8 in list

FACETING
Not magic: there has to be data in the records (i.e. good cataloging)
We added terms based on codes in fixed fields (e.g. Newspaper, CD etc.)
Searched for Mozart:

GENRE HEADINGS FROM CDM + CATALOG
Trade cards
Science fiction

Szathmary
African American women
Iowa City
ILS
ILS
CDM
CDM includes dates but all in one field. Subfield d not included from ILS (local choice)
CDM
Unsure why CDM is not clustering with MARC

Call number faceting for unclassified and electronic journals
Faceted down to RC554-569
Search originally had 152,637 results

Faceting for general topic

LINKS TO OTHER RESOURCES
Links made by an API
A little bit circular

SEARCH BOXES WHERE USERS ARE
Dummy course page. Library widget now a default for all courses.
iGoogle page
IE search option
Facebook
  No Smart Search box… yet
  It is just being made right now and should be there after the conference

PROBLEMS
Online resource link is not being seen
Databases have been difficult to find because of labels
Known item searching can be more difficult (especially for major works of literature)
Librarians concerned it “dumbs down searching”
HOWEVER: Users seem to like it
Concern that faculty (as expert searchers and older than average students) may not adapt as well as students

FUTURE PLANS
Inclusion of Libraries’ website
Talking to LibGuides about including content
Google Book Search and CIC’s Shared Digital Repository (metadata and full text)
Full text from local e-journals
Investigating getting tags from LibraryThing
Will include data from institutional repository

CONCLUSION
Need to be flexible
  Willing to change searching method
  Able to adjust to constant beta
  Able to keep up with users’ needs and requests
  Able to incorporate new technology
Tool that works for many people much of the time, but not for all people all of the time
ILS not going away
Electronic resources are especially important for access and have some unique problems

THANKS!
Contact:
Wendy Robertson
Electronic Resources Systems Librarian
Digital Library Services
The University of Iowa Libraries
wendy-robertson@uiowa.edu