Why Reference and Instruction Librarians Hate Federated Searching and NextGen Catalogs

Nina McHale
Assistant Professor, University of Colorado Denver
Web Librarian, Auraria Library
1100 Lawrence Street
Denver, CO 80204
303-556-4729
nina.mchale@ucdenver.edu

Confession: this presentation is therapy for me. In the past six years, I have implemented federated search twice in academic libraries more or less on my own. I also have spent much of 2009 chairing and serving on a NextGen catalog product selection and implementation committee; we went live with WorldCat Local this fall semester. Discovery products such as these are often billed by vendors as the Holy Grail in the age of Google and Amazon. Perhaps this has raised our collective professional expectations of federated search and NextGen products to an unreasonably high level, because, in my experience at least, the reception among reference and instruction librarians can be described as lukewarm at best. In this pool of dissatisfied librarians, I include my former reference and instruction librarian self, my colleagues at the University of Colorado Denver’s Auraria Library, and, more scientifically, the respondents of a survey conducted by Lynn Lampert and Katherine Dabbour at California State University Northridge.1

[SLIDE 2] The short answer for the lukewarm reception is that expectations of discovery tools are built on expectations of the products that they are designed to work with. The longer answer to this question envelops these technical issues and also the dissonance between what federated searching and NextGen catalogs purport to do and the pedagogical roles of reference and instruction staff in the library.

While much has changed in the realm of federated searching in recent years, there were still a handful of technical shortcomings that were hard to swallow in 2007 when we chose 360 Search at Auraria.
Note, however, that the following discussion does not target any specific product; my personal experiences with products and customer support with WebFeat (2003) and Serials Solutions 360 Search (2007) were both very positive, in spite of my initial distaste for federated searching generally. The shortcomings discussed here—whether real or perceived—were more or less endemic to the products regardless of the brand at the time they were implemented. These include lack of features, inability to search all databases, speed, and unmet performance expectations.

Date and peer review filters are now standard on most online databases, and these features have, logically, become embedded in reference and instruction routines. While Serials Solutions’ 360 Search and other products are technically capable in and of themselves of applying these limits to searches, limitations in the metadata that database vendors provided rendered date and peer-reviewed filters useless. Serials Solutions’ 360 Search supports peer-reviewed filtering, but technical support recommends avoiding it because filtering a list of results for peer-reviewed articles usually returns zero results. Further, users may include a single specific year in their search terms; however, searching a range of dates is not yet supported, for the same reason that the peer-reviewed filter does not work. (To reiterate, these were not flaws in the 360 Search product itself but conditions of the technology at the time.)

Secondly, in spite of what product literature may claim, no federated searching product can—or in some cases, should—currently search every online resource. Again, this is not necessarily a reflection on the quality of the search product itself, and whether or not to federate all databases and resources is a philosophical question all its own.
Four common reasons that a database may be excluded from a federated search are vendor prohibition, the lack of an existing “translator” for the product, a license agreement that stipulates a limited number of concurrent users for a resource, and pricing on a pay-by-search model.

First, notable holdouts among vendors who did not permit their clients to federate some or all of their products in 2007 included Hoover’s, InfoUSA, and content giant LexisNexis. At the time Auraria went live with 360 Search, only LexisNexis Academic could be made available through a federated interface. Libraries sold on the concept of federated search have expressed their dismay about exclusion to holdout vendors, and they have also considered looking for equivalent content in other online resources that will allow inclusion.

Secondly, the nature of the electronic resources market creates a demand for constant creation by federated search vendors of the “translators” that allow a resource to be included in federated search. Therefore, while it may be technologically possible for the federated search product to work with a given resource, if the vendor has not developed a translator yet, that resource will be effectively wait-listed. If the resource is local or highly specialized, the client library may have to wait until more client libraries request inclusion of a resource to increase the demand for the translator. In the case of the Auraria Library, this meant exclusion of Prospector, the unified catalog for the Colorado Alliance of Research Libraries consortium, which is used heavily for interlibrary loan.

Third and fourth, technical support generally recommends that resources for which a library has a limited number of concurrent users or which are priced on a pay-by-search basis be excluded from federated searching. If a subscription to a resource with limited concurrent users is included, it is unlikely that any user would ever be able to successfully access the product.
A resource that is charge-by-use will be rapidly depleted of the allowed number of searches. This is because “use” of the database in this context begins not when a user clicks a link to an item in one of these resources, but when the resource is included in a federated search. In some cases, changing the subscription to the electronic resource to unlimited use may be an option; however, for other resources, this is prohibitively expensive. Some favorite high-quality resources of reference and instruction staff fell into this third category, as purchasing a resource with a low number of concurrent users may be a factor in licensing an expensive product at all.

Deciding how to handle all of the orphaned resources created by vendor exclusion, lack of an existing translator, a limited number of concurrent users, or limited number of searches in a customized implementation of a federated search product can be quite difficult. Links to the excluded resources’ native interfaces can be included in an A-Z list within the federated environment or on appropriate subject guides or other pertinent web pages, but even so, they may be easily overlooked by patrons seeking a quicker—that is to say, federated—method of searching, such as selecting a bundle of resources grouped around a subject area, e.g., “Art & Architecture.” From a reference and instruction perspective, it can be difficult to market and encourage use of a new search feature that omits some of the best and most recommended resources.

A further issue with both federated searching tools and NextGen catalogs is search speed. With the explosion in the number of online resources made available in recent years, many libraries have gradually outgrown their network infrastructure; in a worst-case scenario, network capabilities can be pushed to the breaking point because of the increase in traffic.
However, even in a healthy network environment, the simple fact that a federated search product is doing more work than a search in a database’s native interface also accounts for this extended search time. Waiting a minute or more for a search to grind away can create an awkward lull at the reference desk or in a classroom setting, and it does little to instill confidence in the minds of library staff and patrons. Even though we can say that this search grinding away takes less time than searching each native interface individually, we’ve all become impatient due to the high speed of the interwebs.

Finally, there are behaviors with discovery products that are simply unexpected. A particular use of the software that sounds brilliant in theory sometimes does not prove effective in practice. For example, reference and instruction staff at Auraria were asked to draw up a list of ten or so resources that would be included in a general-focus “Quick Search” box on the Library’s home page. Eleven databases plus the library catalog were chosen for inclusion, and staff were excited by the potential of offering results to general queries from these resources from a search box on the home page. However, in practice, the result was disappointing. The results returned from the fastest resource were the results on top of the pile, and of the twelve resources chosen, PsycINFO routinely returned results first. Reference and instruction staff felt that this skewed the results for a general query; therefore, the feature was gradually reduced to three databases plus the catalog, and then simply three databases.

While these current technical shortcomings are a large part of the dissatisfaction in the reference and instruction department, there are philosophical and pedagogical issues as well. One of the primary concerns of reference and instruction staff is that discovery tools dumb down the research process.
All of the controlled vocabulary and carefully constructed indexes behind individual online resources are tossed out the window; the results returned from a certain resource via the federated interface may be of a lesser quality than those returned from a search in that resource’s native interface. In the words of a colleague, in an academic environment, federated searching “removes many kinds of academic research drills and routines one or more steps from reality.” My response to that question is, “Whose reality?” A librarian’s? Because we know how to walk both ways to school uphill in the snow, our patrons have to, too? Just earlier this week at KLA, Rick Anderson from the University of Utah called for the slaughter of five sacred cows of librarianship, the third of which was reference. He noted that reference is not scalable, and that our goal should be to make libraries so easy to use that reference service is not necessary. Auraria Library is, in fact, moving to an information desk model that refocuses the main duties of reference librarians to consultation, liaison, and outreach duties. If the tools we choose to offer our patrons are easier to use, reference librarians can go beyond the library’s walls and increase our presence in our communities in these ways. (Thank you whoever tweeted about Rick Anderson’s presentation, by the way.) Further, federated searching and NextGen products bring no content into either the physical or virtual library. Reference and instruction librarians quite understandably crave content with which to fulfill their reference and instruction duties. With product price tags in the tens of thousands and budgets shriveling, buying a tool that does not stand up to staff expectations and brings no more content into the library seems foolish. 
I find this a bit ironic, however, because libraries spend tens of thousands of dollars on online resources that have terribly unfriendly user interfaces, even for information professionals, yet are the sole online provider for crucial resources. Finally, in terms of personal use—whether for conducting one’s own research or while assisting patrons—using discovery tools feels like putting the training wheels back onto the bicycle. Reference librarians know or can surmise which resources will likely yield good results for a given query, and they proceed to what they know are “the usual suspects” in the lineup of electronic resources. Because of this expertise, it is difficult to use federated searching instinctively in reference and instruction. One reference librarian told me that she never taught federated search in her classes because she hated it. While I am now a systems librarian and no longer work at the reference desk, an undergraduate asked me in passing shortly after we implemented 360 Search how to find a “professional article” in a biology journal. I found myself directing her to the “biology” drop down menu in our homegrown directory of databases, ultimately offering her a choice of BioOne, BIOSIS, ScienceDirect, and Web of Science. I added as a footnote that she could search multiple resources simultaneously from the Biology subject guide—but not Web of Science because it was not included in our federated search setup because of our limited number of concurrent users. Not to mention the peer-reviewed filter issue so that she’d get a “professional” article. And there’s the rub: explaining the shortcomings of the federated search box on the Biology guide was more difficult than simply pointing her to a couple of “best bet” choices in the first place. Given the above, why are federated search products still on the market, and why are libraries still contracting with vendors? 
What has changed my own mind in the last five years, transforming me from a reluctant community college reference librarian fighting WebFeat tooth and nail to a web librarian petitioning the university’s budget priorities committee for extra funds to pay for 360 Search? Two things: usability testing and developments in discovery tools themselves.

Web usability testing has finally opened our eyes to the fact that patrons have vastly different mental models of the world of information than librarians. For me personally, this became painfully clear in 2004 while observing a graduate student at Georgetown University during a usability testing session. [SLIDE 3] He timed out after three minutes while trying to decipher from the library’s home page where to find a scholarly article about Descartes. He vacillated between the “catalog” and “database” links on the library’s home page for 180 agonizing seconds and, in the end, he never made a choice. Libraries have historically had difficulty marketing and presenting in an intuitive fashion what is the heart of the virtual library: subscribed electronic content, which accounts for the lion’s share of our annual resource budgets.

While not inexpensive, discovery tools are now offered with a number of pricing options. Even if a library chooses to federate as many resources as possible, the annual price tag will still likely be less than one percent of the total annual expenditure for electronic resources—a small price to pay for what can be a large return on a very large investment. Additionally, implementing federated searching as a discovery tool can restore patrons’ faith in their ability to find what they need when they come to their library’s web site versus the open Internet. We will only continue to offer more and more online content in the coming years, and discovery tools provide a way to present sensible options for patrons as the number of online resources continues to grow.
The products themselves, to include the implementation process, have also evolved quite a bit in recent years. Early vendor offerings of federated search tools were clunky behemoths that required a local server installation and took months, sometimes years, to prepare for patron use. Now, vendors typically offer more lightweight hosted options that are ideal for libraries whose local technology resources are limited or lacking. Serials Solutions’ promised—and, in the case of the Auraria Library, delivered— six-to-eight-week turnaround from contract to launch is a dramatic improvement over the years-long implementation time. Generally, vendor support during implementation is better, with the vendor doing more of the setup work, depending upon the products and the options chosen. And this year, because of a new amendment passed recently in the state of Colorado, it actually took us longer to secure our contract with WorldCat Local than to implement it. In terms of technical improvements with federated search tools, vendors have found effective ways of deduplicating results, which was an early Achilles’ heel. The practice of HTML screen scraping—analyzing the output of a database by “reading” the HTML of the results page—is being replaced by more highly structured XML Gateway technology, which improves the quality of the results returned. Additional features like clustered results, integration with other tools, supported web integration services, and impressive administration and statistics modules are now available among the various products. Notable examples in 2007 included Serials Solutions’ clustered results feature in 360 Search and Info-Graphic’s administrative module for their AGent product. Further, customization options allow discovery tools to overlay an entire web site, unifying many resources to a single access point. 
Some librarians wanted to put both 360 Search and WorldCat Local on the databases page; there’s not much point spending money on a discovery tool if it’s put on a web page that no one understands in the first place. In addition to providing a single-search box on the library’s home page, many libraries are using federated search products to enhance the more traditional and usually home-grown A-Z/subject directories of databases as well as library guides and pathfinders. Almost any combination of resources and audience is possible. For example, an “English 101” bundle in an academic library could include EBSCO Academic Search Premier, Gale’s General OneFile, and LexisNexis Academic. Code for this particular search could be embedded on a class guide for English composition courses so that less time could be spent training students on the individual interfaces. More time could then be spent discussing topic selection and refinement, Boolean logic—which is supported by federated search products—and other important research concepts.

I’m pleased to report that I have found myself instinctively using WorldCat Local as launched on our home page. [SLIDE 4] There were a few issues that arose during implementation, not the least of which involved the labels of the tabs on the single search box. We based our search box on the University of Washington’s home page display of WorldCat Local, copying the tab labels “books,” “articles,” “dvds” and “cds.” Then the debate began. The top issue was that librarians were concerned that students would think that they were searching all articles, so the label was briefly “Select Articles.” This was problematic, however, in that it sounded as though “select” was a verb. Then “selected articles” was proposed, but this gave the incorrect impression that this material was somehow vetted by library staff. A compromise was struck by adding a “research tip” on each of the tabbed areas of the search box.
[SLIDE 5] This alerted users to additional information in a way that did not sound like an apology about what librarians were dissatisfied with. “All formats” was briefly suggested for the “all” tab, but this was rejected in favor of single-word tab labels, and also because “format” is an empty, jargony librarian word.

Are discovery tools the Holy Grail? To the Knights of the Round Table, the Grail represented an unattainable ideal that compelled the Knights to pursue it, in some cases to their own destruction. Sometimes, things fail; Auraria discontinued 360 Search this summer because OCLC is adding federated search to WorldCat Local. [SLIDE 6] I don’t think that the big picture future is so bleak for discovery tools, however. [SLIDE 7] At Computers in Libraries this past spring, SirsiDynix Vice President Stephen Abram noted, “Criticizing federated searching is like looking at your child learning to crawl and declaring, ‘He’s a horrible accountant, I’d never hire him.’” He further noted, “It takes a village to raise this stuff.” We are finally listening to what our customers want; vendors are listening more to what we want and partnering with libraries during development. Opening these channels of communication and collaboration is a very good thing.

New products that seem to be neither federated search nor NextGen catalogs, such as Serials Solutions’ Summon, are emerging onto the market, and open source and other small vendor offerings will be interesting to watch in the coming years. These nascent discovery tools are proving that they can successfully connect patrons with subscribed online content; they are more tools that we can put in our patrons’ research toolboxes, even if librarians still prefer not to use them until future developments improve on what reference and instruction librarians find lacking.
Even with all of their current technical shortcomings, discovery tools provide a means of presenting our hyperstructured universe, with all of its semi-secret classification schemes and codes, to our customers in a way that they not only understand, but have come to expect.

Notes:

1. Lampert, Lynn D., and Katherine S. Dabbour, “Librarian Perspectives on Teaching Metasearch and Federated Search Technologies,” in Federated Search: Solution or Setback for Online Library Services, ed. Christopher Cox (Binghamton, NY: Haworth, 2007), 253-78.