Why Reference Librarians Hate Federated Searching

Why Reference and Instruction Librarians Hate Federated Searching and NextGen
Catalogs
Nina McHale
Assistant Professor, University of Colorado Denver
Web Librarian, Auraria Library
1100 Lawrence Street
Denver, CO 80204
303-556-4729
nina.mchale@ucdenver.edu
Confession: this presentation is therapy for me. In the past six years, I have
implemented federated search twice in academic libraries more or less on my own. I also
have spent much of 2009 chairing and serving on a NextGen catalog product selection
and implementation committee; we went live with WorldCat Local this fall semester.
Discovery products such as these are often billed by vendors as the Holy Grail in the age
of Google and Amazon. Perhaps this has raised our collective professional expectations
of federated search and NextGen products to an unreasonably high level, because, in my
experience at least, the reception among reference and instruction librarians can be
described as lukewarm at best. In this pool of dissatisfied librarians, I include my former
reference and instruction librarian self, my colleagues at the University of Colorado
Denver’s Auraria Library, and, more scientifically, the respondents of a survey conducted
by Lynn Lampert and Katherine Dabbour at California State University Northridge.1
[SLIDE 2] The short answer for the lukewarm reception is that expectations of discovery
tools are built on expectations of the products that they are designed to work with. The
longer answer to this question envelops these technical issues and also the dissonance
between what federated searching and NextGen catalogs purport to do and the
pedagogical roles of reference and instruction staff in the library.
While much has changed in the realm of federated searching in recent years, there
were still a handful of technical shortcomings that were hard to swallow in 2007 when we
chose 360 Search at Auraria. Note, however, that the following discussion does not target
any specific products here; my personal experiences with products and customer support
with WebFeat (2003) and Serials Solutions 360 Search (2007) were both very positive, in
spite of my initial distaste for federated searching generally. The shortcomings discussed
here—whether real or perceived—were more or less endemic to the products regardless
of the brand at the time they were implemented. These include lack of features, inability
to search all databases, speed, and unmet performance expectations.
Date and peer review filters are now standard on most online databases, and these
features have, logically, become embedded in reference and instruction routines. While
Serials Solutions’ 360 Search and other products are technically capable in and of
themselves of applying these limits to searches, limitations in the metadata that database
vendors provided rendered date and peer-reviewed filters useless. Serials Solutions’ 360
Search supported peer-reviewed filtering, but technical support recommended avoiding it
because filtering a list of results for peer-reviewed articles usually returned zero results.
Further, users could include a single specific year in their search terms; however,
searching a range of dates was not yet supported, for the same reason that the peer-reviewed
filter did not work. (To reiterate, these were not flaws in the 360 Search product itself
but conditions of the technology at the time.)
Secondly, in spite of what product literature may claim, no federated searching
product can—or in some cases, should—currently search every online resource. Again,
this is not necessarily a reflection on the quality of the search product itself, and whether
or not to federate all databases and resources is a philosophical question all its own. Four
common reasons that a database may be excluded from a federated search include:
vendor prohibition, no existing “translator” for the product, a license agreement that
stipulates a limited number of concurrent users for a resource, or sources that are priced
in a pay-by-search model.
First, notable holdouts among vendors who did not permit their clients to federate
some or all of their products in 2007 included Hoover’s, InfoUSA, and content giant
LexisNexis. At the time Auraria went live with 360 Search, only LexisNexis Academic
could be made available through a federated interface. Libraries sold on the concept of
federated search have expressed their dismay about exclusion to holdout vendors, and
they have also considered looking for equivalent content in other online resources that
will allow inclusion.
Secondly, the nature of the electronic resources market creates a demand for
constant creation by federated search vendors of the “translators” that allow a resource to
be included in federated search. Therefore, while it may be technologically possible for
the federated search product to work with a given resource, if the vendor has not
developed a translator yet, that resource will be effectively wait-listed. If the resource is
local or highly specialized, the client library may have to wait until more client libraries
request inclusion of a resource to increase the demand for the translator. In the case of the
Auraria Library, this meant exclusion of Prospector, the unified catalog for the Colorado
Alliance of Research Libraries consortium, which is used heavily for interlibrary loan.
Third and fourth, technical support generally recommends that resources for
which a library has a limited number of concurrent users or which are priced on a
pay-by-search model be excluded from federated searching. If a subscription to a resource with limited
concurrent users is included, it is unlikely that any user would ever be able to
successfully access the product. A resource that is charge-by-use will be rapidly depleted
of the allowed number of searches. This is because “use” of the database in this context
begins not when a user clicks a link to an item in one of these resources, but when the
resource is included in a federated search. In some cases, changing the subscription to the
electronic resource to unlimited use may be an option; however, for other resources, this
is prohibitively expensive. Some favorite high-quality resources of reference and
instruction staff fell into this third category, as purchasing a resource with a low number
of concurrent users may be a factor in licensing an expensive product at all.
Deciding how to handle all of the orphaned resources created by vendor
exclusion, lack of an existing translator, a limited number of concurrent users, or limited
number of searches in a customized implementation of a federated search product can be
quite difficult. Links to the excluded resources’ native interfaces can be included in an
A-Z list within the federated environment or on appropriate subject guides or other pertinent
web pages, but even so, they may be easily overlooked by patrons seeking a quicker—
that is to say, federated—method of searching, such as selecting a bundle of resources
grouped around a subject area, e.g., “Art & Architecture.” From a reference and
instruction perspective, it can be difficult to market and encourage use of a new search
feature that omits some of the best and most recommended resources.
A further issue with both federated searching tools and NextGen catalogs is
search speed. With the explosion in the number of online resources made available in
recent years, many libraries have gradually outgrown their network infrastructure; in a
worst-case scenario, network capabilities can be pushed to the breaking point because of
the increase in traffic. However, even in a healthy network environment, the simple fact
that a federated search product is doing more work than a search in a database’s native
interface also accounts for this extended search time. Waiting a minute or more for a
search to grind away can create an awkward lull at the reference desk or in a classroom
setting, and it does little to instill confidence in the minds of library staff and patrons. Even
though we can say that this search grinding away takes less time than searching each
native interface individually, we’ve all become impatient due to the high speed of the
interwebs.
Finally, there are behaviors with discovery products that are simply unexpected.
A particular use of the software that sounds brilliant in theory sometimes does not prove
effective in practice. For example, reference and instruction staff at Auraria were asked to
draw up a list of ten or so resources that would be included in a general-focus “Quick
Search” box on the Library’s home page. Eleven databases plus the library catalog were
chosen for inclusion, and staff were excited by the potential of offering results to general
queries from these resources from a search box on the home page. However, in practice,
the result was disappointing. The results returned from the fastest resource were the
results on top of the pile, and of the twelve resources chosen, PsycINFO routinely
returned results first. Reference and instruction staff felt that this skewed the results for a
general query; therefore, the feature was gradually reduced to three databases plus the
catalog, and then simply three databases.
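The ordering problem described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each source returns a ranked list and that the merged list is built in the order the sources respond; the response times, source names other than PsycINFO, and result labels are invented. The round-robin function is a hypothetical alternative shown only for contrast, not something 360 Search is known to have offered.

```python
from itertools import zip_longest


def merge_by_arrival(responses):
    """Concatenate result lists in the order the sources responded."""
    ordered = sorted(responses, key=lambda r: r["seconds"])
    merged = []
    for response in ordered:
        merged.extend(response["results"])
    return merged


def merge_round_robin(responses):
    """Take one result from each source in turn so no single source dominates."""
    ordered = sorted(responses, key=lambda r: r["seconds"])
    columns = zip_longest(*(r["results"] for r in ordered))
    return [item for column in columns for item in column if item is not None]


# Invented sample data: three sources with made-up response times.
responses = [
    {"source": "Catalog", "seconds": 4.0, "results": ["cat-1", "cat-2"]},
    {"source": "PsycINFO", "seconds": 0.8, "results": ["psy-1", "psy-2", "psy-3"]},
    {"source": "General database", "seconds": 2.5, "results": ["gen-1", "gen-2"]},
]

top_three_by_arrival = merge_by_arrival(responses)[:3]
top_three_round_robin = merge_round_robin(responses)[:3]
```

With arrival-order merging, the top of the list comes entirely from whichever source answers first, which matches the skew staff observed for general queries. Interleaving spreads the top slots across sources, at the cost of no longer reflecting any one source's relevance ranking.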
While these current technical shortcomings are a large part of the dissatisfaction
in the reference and instruction department, there are philosophical and pedagogical
issues as well. One of the primary concerns of reference and instruction staff is that
discovery tools dumb down the research process. All of the controlled vocabulary and
carefully constructed indexes behind individual online resources are tossed out the
window; the results returned from a certain resource via the federated interface may be of
a lesser quality than those returned from a search in that resource’s native interface. In the
words of a colleague, in an academic environment, federated searching “removes many
kinds of academic research drills and routines one or more steps from reality.”
My response to that statement is, “Whose reality?” A librarian’s? Because we
know how to walk both ways to school uphill in the snow, our patrons have to, too? Just
earlier this week at KLA, Rick Anderson from the University of Utah called for the
slaughter of five sacred cows of librarianship, the third of which was reference. He noted
that reference is not scalable, and that our goal should be to make libraries so easy to use
that reference service is not necessary. Auraria Library is, in fact, moving to an
information desk model that refocuses the main duties of reference librarians to
consultation, liaison, and outreach duties. If the tools we choose to offer our patrons are
easier to use, reference librarians can go beyond the library’s walls and increase our
presence in our communities in these ways. (Thank you whoever tweeted about Rick
Anderson’s presentation, by the way.)
Further, federated searching and NextGen products bring no content into either
the physical or virtual library. Reference and instruction librarians quite understandably
crave content with which to fulfill their reference and instruction duties. With product
price tags in the tens of thousands and budgets shriveling, buying a tool that does not
stand up to staff expectations and brings no more content into the library seems foolish. I
find this a bit ironic, however, because libraries spend tens of thousands of dollars on
online resources that have terribly unfriendly user interfaces, even for information
professionals, yet are the sole online provider for crucial resources.
Finally, in terms of personal use—whether for conducting one’s own research or
while assisting patrons—using discovery tools feels like putting the training wheels back
onto the bicycle. Reference librarians know or can surmise which resources will likely
yield good results for a given query, and they proceed to what they know are “the usual
suspects” in the lineup of electronic resources. Because of this expertise, it is difficult to
use federated searching instinctively in reference and instruction. One reference librarian
told me that she never taught federated search in her classes because she hated it.
While I am now a systems librarian and no longer work at the reference desk, an
undergraduate asked me in passing shortly after we implemented 360 Search how to find
a “professional article” in a biology journal. I found myself directing her to the “biology”
drop down menu in our homegrown directory of databases, ultimately offering her a
choice of BioOne, BIOSIS, ScienceDirect, and Web of Science. I added as a footnote that
she could search multiple resources simultaneously from the Biology subject guide—but
not Web of Science because it was not included in our federated search setup because of
our limited number of concurrent users. Not to mention the peer-reviewed filter issue so
that she’d get a “professional” article. And there’s the rub: explaining the shortcomings
of the federated search box on the Biology guide was more difficult than simply pointing
her to a couple of “best bet” choices in the first place.
Given the above, why are federated search products still on the market, and why
are libraries still contracting with vendors? What has changed my own mind in the last
five years, transforming me from a reluctant community college reference librarian
fighting WebFeat tooth and nail to a web librarian petitioning the university’s budget
priorities committee for extra funds to pay for 360 Search? Two things: usability testing
and developments in discovery tools themselves.
Web usability testing has finally opened our eyes to the fact that patrons have
vastly different mental models of the world of information than librarians do. For me
personally, this became painfully clear in 2004 while observing a graduate student at
Georgetown University during a usability testing session. [SLIDE 3] He timed out after
three minutes while trying to decipher from the library’s home page where to find a
scholarly article about Descartes. He vacillated between the “catalog” and “database”
links on the library’s home page for 180 agonizing seconds and, in the end, he never
made a choice.
Libraries have historically had difficulty marketing and presenting in an intuitive
fashion what is the heart of the virtual library: subscribed electronic content, which
accounts for the lion’s share of our annual resource budgets. While not inexpensive,
discovery tools are now offered with a number of pricing options. Even if a library
chooses to federate as many resources as possible, the annual price tag will still likely be
less than one percent of the total annual expenditure for electronic resources, a small
price to pay for what can be a large return on a very large investment. Additionally,
implementing federated searching as a discovery tool can restore patrons’ faith in their
ability to find what they need when they come to their library’s web site versus the open
Internet. We will only continue to offer more and more online content in the coming
years, and discovery tools provide a way to present sensible options for patrons as the
number of online resources continues to grow.
The products themselves, including the implementation process, have also
evolved quite a bit in recent years. Early vendor offerings of federated search tools were
clunky behemoths that required a local server installation and took months, sometimes
years, to prepare for patron use. Now, vendors typically offer more lightweight hosted
options that are ideal for libraries whose local technology resources are limited or
lacking. Serials Solutions’ promised—and, in the case of the Auraria Library, delivered—
six-to-eight-week turnaround from contract to launch is a dramatic improvement over the
years-long implementation time. Generally, vendor support during implementation is
better, with the vendor doing more of the setup work, depending upon the products and
the options chosen. And this year, because of a new amendment passed recently in the
state of Colorado, it actually took us longer to secure our contract with WorldCat Local
than to implement it.
In terms of technical improvements with federated search tools, vendors have
found effective ways of deduplicating results, which was an early Achilles’ heel. The
practice of HTML screen scraping—analyzing the output of a database by “reading” the
HTML of the results page—is being replaced by more highly structured XML Gateway
technology, which improves the quality of the results returned. Additional features like
clustered results, integration with other tools, supported web integration services, and
impressive administration and statistics modules are now available among the various
products. Notable examples in 2007 included Serials Solutions’ clustered results feature
in 360 Search and Info-Graphic’s administrative module for their AGent product.
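As a rough illustration of what deduplicating federated results involves, the sketch below collapses records that share a normalized title and publication year. The matching rule, field names, and sample records are all assumptions made for this example; vendors' actual deduplication methods are not described here and are likely more sophisticated.

```python
import re


def record_key(record):
    """Build a rough match key from a normalized title plus publication year."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    title = " ".join(title.split())  # collapse runs of whitespace
    return (title, record["year"])


def deduplicate(records):
    """Keep the first record seen for each key and note every source that had it."""
    merged = {}
    for record in records:
        key = record_key(record)
        if key in merged:
            merged[key]["sources"].append(record["source"])
        else:
            merged[key] = {
                "title": record["title"],
                "year": record["year"],
                "sources": [record["source"]],
            }
    return list(merged.values())


# Invented sample records: the first two differ only in case and punctuation.
records = [
    {"title": "Federated Search: A Primer", "year": 2007, "source": "Database A"},
    {"title": "federated search - a primer", "year": 2007, "source": "Database B"},
    {"title": "Usability Testing in Libraries", "year": 2004, "source": "Database A"},
]

unique_records = deduplicate(records)
```

A key this simple will miss duplicates whose titles are abbreviated differently and may wrongly merge distinct items with identical titles, which hints at why consistent, structured metadata from database vendors matters so much for result quality.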
Further, customization options allow discovery tools to overlay an entire web site,
unifying many resources to a single access point. Some librarians wanted to put both 360
Search and WorldCat Local on the databases page; there’s not much point spending
money on a discovery tool if it’s put on a web page that no one understands in the first
place. In addition to providing a single-search box on the library’s home page, many
libraries are using federated search products to enhance the more traditional and usually
home-grown A-Z/subject directories of databases as well as library guides and
pathfinders. Almost any combination of resources and audience is possible. For example,
an “English 101” bundle in an academic library could include EBSCO Academic Search
Premier, Gale’s General OneFile, and LexisNexis Academic. Code for this particular
search could be embedded on a class guide for English composition courses so that less
time could be spent training students on the individual interfaces. More time could then
be spent discussing topic selection and refinement, Boolean logic—which is supported by
federated search products—and other important research concepts.
I’m pleased to report that I have found myself instinctively using WorldCat Local
as launched on our home page. [SLIDE 4] There were a few issues that arose during
implementation, not the least of which included labels of the tabs on the single search
box. We based our search box on the University of Washington’s home page display of
WorldCat Local, copying the tab labels “books,” “articles,” “dvds” and “cds.” Then the
debate began. The top issue was that librarians were concerned that students would think
that they were searching all articles, so the label was briefly “Select Articles.” This was
problematic, however, in that it sounded as though “select” was a verb. Then “selected
articles” was proposed, but this gave the incorrect impression that this material was
somehow vetted by library staff. A compromise was struck by adding a “research tip” on
each of the tabbed areas of the search box. [SLIDE 5] This alerted users to additional
information in a way that did not sound like an apology about what librarians were
dissatisfied with. “All formats” was briefly suggested for the “all” tab, but this was
rejected in favor of single-word tab labels, and also because “format” is an empty,
jargony librarian word.
Are discovery tools the Holy Grail? To the Knights of the Round Table, the Grail
represented an unattainable ideal that compelled the Knights to pursue it, in some cases to
their own destruction. Sometimes, things fail; Auraria discontinued 360 Search this
summer because OCLC is adding federated search to WorldCat Local. [SLIDE 6] I don’t
think that the big picture future is so bleak for discovery tools, however. [SLIDE 7] At
Computers in Libraries this past Spring, SirsiDynix Vice President Stephen Abram noted,
“Criticizing federated searching is like looking at your child learning to crawl and
declaring, ‘He’s a horrible accountant, I’d never hire him.’” He further noted, “It takes a
village to raise this stuff.” We are finally listening to what our customers want; vendors
are listening more to what we want and partnering with libraries during development.
Opening these channels of communication and collaboration is a very good thing. New
products that seem to be neither federated search nor NextGen catalogs, such as Serials
Solutions’ Summon, are emerging onto the market, and open source and other small
vendor offerings will be interesting to watch in the coming years.
These nascent discovery tools are proving that they can successfully connect
patrons with subscribed online content; they are more tools that we can put in our
patrons’ research toolboxes, even if librarians still prefer not to use them until future
developments, hopefully, will make improvements in what reference and instruction
librarians find lacking. Even with all of their current technical shortcomings, discovery
tools provide a means of presenting our hyperstructured universe, with all of its
semi-secret classification schemes and codes, to our customers in a way that they not only
understand, but have come to expect.
Notes:
1. Lampert, Lynn D. and Katherine S. Dabbour, “Librarian Perspectives on Teaching
Metasearch and Federated Search Technologies,” in Federated Search: Solution or
Setback for Online Library Services, ed. Christopher Cox (Binghamton, NY: Haworth,
2007) 253-78.
2645 words