Faculty/AP Un-Retreat, January 7, 2014: Primo or What’s Next

Faculty/AP Un-Retreat, January 7, 2014
Session Topic: Discovery and Access: Primo or What’s Next
Session Leaders: Bill Mischo and Tim Cole
Introduction
One of the University Library’s guiding values is “improving access to library content and
collections” and Goal 1 of the Strategic Initiatives is to “Promote Access to, and Discovery of,
Library Content and Collections.” The University Library has pursued access to information
resources through a variety of initiatives in the past decade and continues to seek
improvements in this area. The two Un-Retreat sessions on Discovery and Access generated
some wide-ranging and in-depth discussions regarding access services and elicited a great deal
of useful information and opinions on Library discovery technologies and philosophy. While the
session leaders did frame the group discussions around the efficacy of Primo and Easy Search,
a myriad of information access and delivery mechanisms were discussed.
We are at a critical juncture in our discovery and delivery strategy in the Library. The
implementation of Primo has allowed us to examine key issues in search and discovery,
including the role of a web-scale discovery system (WSDS) in the Library's Gateway, the
relationship between a web-scale aggregated central index and the specialty disciplinary
abstracting and indexing services the Library licenses, the effectiveness of vendor databases
such as EBSCO databases, ISI, and Scopus when integrated into Primo, the use and
effectiveness of blended result displays, instruction issues connected with WSDS, the
relationship between a web-scale system and a federated search/recommender system such as
Easy Search, the efficacy of full-text search as compared with metadata-based searching, and
user search behavior within web-scale discovery systems.
Session participants were presented with the following discussion questions:
• What principles should be the foundation for the Library’s “discovery and delivery
strategy” (e.g., fully develop and implement fewer tools)?
• What does it mean to be “in the flow” of our user’s work?
• How can we best engage our user communities in order to understand their information
search and retrieval needs?
• What practices could we adopt in the University Library to achieve a more coherent and
efficient search, discovery, and delivery experience for our users?
• What are best practices in nimble implementation/retirement of systems?
The proposed principle of “fully develop and implement fewer tools” was not interpreted by
participants in either session as necessarily a desirable tactic or goal. Many of the other
questions were addressed indirectly. It is important to note that the Discovery and Delivery
Study Team (DDST) appointed by CAPT has discussed strategies for meeting all the goals and
tasks detailed below. The DDST has held several open sessions with Library staff, APs, and
faculty and will address all the goals and tasks identified by this Un-Retreat report. The DDST
charge is provided as an appendix to this report.
Challenges:
The session leaders provided background and historical information on federated search
technologies, Easy Search, Primo, and other web-scale discovery systems (WSDS). The
participants were informed that the Primo WSDS implementation was initially viewed as a
natural progression from Easy Search. The Primo implementation team planned to utilize
Primo’s Google-like display capabilities and the publisher-based collections in conjunction with
the search suggestions and links within the Primo custom tile. The custom tile was designed to
incorporate many of the Easy Search tactical search tips and suggestive prompts that had been
developed for Easy Search. After early issues with the comprehensiveness of the Primo Central
Index collection arose, the implementation team decided that loading additional A&I services
into Primo would solve some of the comprehensiveness issues. The loading of major A&I
service records from Scopus and Web of Knowledge is now being offered by several WSDS,
but participants commented that loading this content into WSDS does not replace the native A&I
services, as more metadata and controlled vocabularies -- in addition to more customizable
interfaces -- are typically available in the disciplinary A&I services.
Each session participant was asked to describe their use of Library discovery tools and
services. The use of Primo by session participants was very infrequent and, in the opinion of
participants, often did not result in a successful resolution of their (or the patron’s) information
need. These problems center around Primo’s sole reliance on full-text search (every search is
an AND search across the full-text of a document; there is no “metadata only” search and no
OR search across selected fields) coupled with issues with the search results relevancy
rankings and blended format result displays. For these reasons, searches for specific
known-items where there is no match in Primo can bring back a large number of irrelevant matches,
and, when there is a match, the desired known-item match may not appear in the first several
pages of Primo results – although this has improved markedly recently for some searches (but
not all). There is some irony in the fact that approximately 50% of the searches we see at the
Gateway are known-item searches but that Primo is relying on a full-text search system for
effective retrieval of these known-items.
At the same time, topical searches may bring back results with matches from words on separate
pages of the full-text as the highest ranked results – rather than matches from words in the title,
subject vocabulary, and abstract. In addition, Primo may match on data that cannot be
displayed to users due to contractual reasons between Primo/Ex Libris and database producers.
Because of this, in general, Primo is not being promoted by instructional and subject librarians,
particularly as testing has shown other tools are better suited to more precise or relevant search
results sets. While these problems exist in all WSDS, Primo, in particular, is a poor resource for
topical undergraduate research and its introduction as a potential tool in library instruction for
undergraduate students is questionable. It also does not offer the comprehensiveness and
custom relevancy rankings that subject specialists are accustomed to with their disciplinary A&I
services.
All Un-Retreat session participants noted that they typically used a combination of Easy Search
and one (or both) of the Voyager and VuFind catalogs. Many of the session participants also
used disciplinary A&I services, citing fuller metadata records and enhanced and flexible search
features and mechanisms. Participants noted that there is little compelling reason or need to
use Primo. Online catalog information and access mechanisms are covered in the Voyager and
VuFind OPACs. While Primo has added the Scopus and ISI Web of Knowledge A&I services,
they are indexed in the same way as all content in the Primo Central Index and are subject to
Primo’s relevancy ranking problems. They are not as useful within Primo as they are in their
native form, especially given that within Primo the records are represented with fewer data
elements and reduced metadata.
There was general agreement that Easy Search is easier to train on, that it is easier to
customize at the departmental library/subject area level, and that the grouping of search result
targets by result categories mimicked the “bento box” approach employed by several other ARL
libraries. Easy Search helps guide new users looking for the best database option, but admittedly also
suffers from what can be an overwhelming display format and information overload.
Statistics from the Primo custom tile logs show that approximately 75 searches a day are being
performed from the native Primo front-end interface. That’s a small number. All other Primo
searches are from Easy Search -- which is averaging over 6,500 searches per day. In addition,
from analysis of Easy Search logs, we know that users do not demonstrate a preference for
Primo results from the target display listings.
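As a rough check on the scale of native Primo use, the two daily figures above can be compared directly. The sketch below assumes the 6,500 figure can be read as the number of Primo searches arriving through Easy Search each day; because that figure is reported as "over 6,500", the computed share is an upper bound. The variable names are illustrative only.

```python
# Daily averages quoted in the report. Assumption: the Easy Search figure
# is treated as the count of Primo searches that originate in Easy Search.
native_primo_per_day = 75
easy_search_per_day = 6500  # reported as "over 6,500", so a lower bound

total_per_day = native_primo_per_day + easy_search_per_day
native_share_pct = 100 * native_primo_per_day / total_per_day

print(f"Native Primo share: at most about {native_share_pct:.1f}% of daily searches")
```

Under that reading, searches started from the native Primo interface make up only about one in a hundred of the Primo searches seen each day.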
Some other clearly defined challenges emerged in our conversations:
Challenge #1 -- we need to better define the information needs of our users and address the
identified issues and problems we have with our information resources and determine how the
Library's collections and resources are best placed into the pathway of the user.
Challenge #2 -- we need to better define priorities; that should come before we institute a
WSDS.
Goal 1: Get more complete data about how our users search and the types of
searches they are performing.
Strategies for Goal 1:
The custom transaction logs that we have gathered from the Gateway and departmental library
single entry search boxes have provided insights into user search behaviors. They have been
used to design and deploy the search assistance mechanisms and tactical tips used in both
Easy Search and the Primo custom tile. Several detailed log analyses (the most recent in 2011)
have identified the types of searches being performed, and several studies have looked at
Primo, Scopus, and EBSCO database coverage and retrieval effectiveness. The large-scale
transaction log data has been supplemented by user interviews and focus group interviews as
well as focused log analysis projects.
Task 1: Identify a representative sample set of user searches (a test suite) and use them
to examine the performance of different WSDS, including Primo, EBSCO Discovery
Service, Summon, Google Scholar, and WorldCat Local in terms of coverage, ease of
retrieval, and delivery effectiveness. This is part of the “bakeoff” that has been widely
discussed.
Task 2: Look at the target clickthrough patterns identified in the logs, particularly with
regard to search success.
Task 3: Examine the session tracking available in Easy Search to observe user
navigation patterns after search has been initiated.
Task 4: Examine search reformulation patterns and how search support systems (like the
suggestions in Easy Search and custom tile in Primo) can incorporate these findings.
Task 5: Compare what we know about searches at the Gateway with searches being done
in disciplinary A&Is.
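One way the test suite in Task 1 might be scored for known-item searches is sketched below. This is only an illustration under stated assumptions, not the method the DDST has chosen: it assumes each system's ranked results for a query can be captured as a list of record identifiers, and it uses two common retrieval measures, mean reciprocal rank and success within the first ten results. The function names, system names, and record identifiers are all hypothetical.

```python
from statistics import mean

def reciprocal_rank(results, wanted_id):
    """Return 1/rank of wanted_id in results (1-based), or 0.0 if absent."""
    for rank, item_id in enumerate(results, start=1):
        if item_id == wanted_id:
            return 1.0 / rank
    return 0.0

def score_system(runs, cutoff=10):
    """Score one system; runs is a list of (ranked_result_ids, wanted_id)."""
    rr = [reciprocal_rank(results, wanted) for results, wanted in runs]
    found = [wanted in results[:cutoff] for results, wanted in runs]
    return {"mrr": mean(rr), "success_at_cutoff": sum(found) / len(runs)}

# Hypothetical test suite: made-up record IDs, not real search results.
suite = {
    "system_a": [(["r1", "r2", "r3"], "r1"), (["r9", "r4"], "r4"), (["r7"], "r5")],
    "system_b": [(["r2", "r1"], "r1"), (["r4"], "r4"), (["r5", "r6"], "r5")],
}

scores = {name: score_system(runs) for name, runs in suite.items()}
for name, result in scores.items():
    print(name, result)
```

A measure of this kind rewards systems that place the desired known item near the top of the list, which is the behavior the session participants found lacking. Topical searches would need a different approach, such as relevance judgments by subject specialists.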
Goal 2: Provide Discovery Approaches that Address our User Needs
Participants felt that the overriding concern of the Library should be in providing access services
that address our environment of heterogeneous users. We have a broad continuum of users
and user needs, interests, and discovery characteristics. We need to address the needs of a
wide variety of users -- from undergraduate students with few information literacy and scholarly
communication skills to senior researchers in very specific subject domains. It will be difficult if
not impossible to provide a one-stop shopping environment for these users. Some users do not
utilize or need Easy Search or Primo.
A large number of users and reference staff perform predominantly known-item searching and
know what they are looking for. For these users, efficient delivery of content is most important.
These known-item searches cover a variety of materials formats.
The Library needs to highlight and place in front of the user the tools that are most useful.
Library tools and services need to be placed in the flow of our users’ work. It is important to put
people into a situation where they can readily access the most useful resources. With the
importance of the A&I services, we need to determine how to best direct the user to the most
relevant database in the subject area. We cannot design a perfect system, but we need to
design an evolving, improving system.
There may not be an ideal system: We currently have multiple search and discovery systems
and multiple user paths. We seem to struggle with consensus -- that in itself is an answer. We
know that our users demand online full-text. There has been a growth in the use of ebooks and
there is an almost universal reliance on e-journals.
Strategies for Goal 2:
Task 1: Work on a better Wayfinding function. Develop mechanisms for matching the
most appropriate tools and techniques with the specific user community.
Task 2: Design a system that combines the Easy Search recommender approach with the
WSDS model. This could possibly be done by extending the custom tile functionality.
Task 3: Other display approaches may be easier for users to comprehend and use. Easy
Search is a bento box approach similar to Google’s grouping by type or format. Note that
Rochkind, NCSU, and other bento approaches require going into the target displays to
reach the full-text links – just as the user has to do in Easy Search. In many ways, Easy
Search incorporates many of the best practices currently emerging in WSDS but is
lacking the quick display of sample results.
Task 4: Look at a more “classical” bento box utilizing Easy Search and a WSDS.
Task 5: Given that full-text delivery is paramount, examine the relationships between
search techniques, discoverability mechanisms, retrieval effectiveness, and delivery
technologies.
The participants strongly felt that we need to clean up the data and the tools we have. Perhaps
more important than deploying a WSDS is fixing the problems we have with existing services.
There are record accuracy problems and SFX sometimes gives bad results. These errors are
only magnified in a WSDS and so, regardless of the specific tools we adopt, must be
addressed.
There is great frustration with trying to locate things we have but that don't show up in discovery
tools, including things we've digitized that don't show up in search results. For the first time, a
system like the Primo online catalog scope has allowed for the amalgamation of the Library's
MARC bibliographic records for the physical collections and non-MARC metadata records for
our digitized collections. We have learned that mixing these local collections into a blended
search that also includes millions of article citations may not be the optimal search
solution.
Goal 3: Investigate the Overarching Role of a WSDS
WSDS seem to be more useful for bachelor's degree-granting institutions than Research I
institutions. The Primo Central Index (PCI) has deep coverage over many sources but is still not
comprehensive and cannot take the place of the myriad subject A&I Services and publisher
repository search systems. And the WSDS interface is not as feature rich and customized to the
subject discipline (e.g. there is no Chemical Abstracts Registry Index search in Primo). Primo is
intended as a Google for academics. It may be good for survey searches but there are issues
with scope of coverage and relevancy rankings for known-item search. In addition, Primo tends
to obscure the characteristics that help undergraduate students determine what kind of item
something is and whether it is considered scholarly within the discipline of the course they are
taking.
There is overlap but also a clear difference between a WSDS and disciplinary A&Is. The A&Is
will always offer search features not available in a lowest common denominator WSDS. The
specialty databases and publisher portals are still needed.
Strategies for Goal 3:
Task 1: Look at additional custom tile functions that can mitigate some of the issues with
WSDS. This will aid us in future development for whatever service we utilize.
Task 2: We know that some locally developed services (e.g. Journal and Article Locator,
Archon) provide important access and delivery services. Look at their integration into
WSDS and further integration through the Gateway and subject library websites.
Task 3: The literature is less than clear regarding the intended audience of a WSDS.
Several surveys suggest institutions view the primary role of a WSDS as serving
undergraduates; however, our local analysis has determined that undergraduate needs
are better met through EBSCO Academic Search. Is there a way to balance
undergraduate needs with the needs of users accessing the Library through the single
entry search box on the Gateway?
Task 4: Because of the difficulties users have with WSDS blended result displays and the
relevancy rankings in the Primo Central Index, it was suggested that a version of Primo
that includes only monographs and local digital content be generated (we have put
together a Primo View for this).
Some participants felt that a useful WSDS may be an impossible dream or a solution in search
of a problem.
However, it may be that we picked the wrong WSDS. We should stand up a fully-loaded
functional EBSCO Discovery Service, a Summon system, and a WorldCat Local system (we
already have WorldCat Local) to compare with Primo.
Participants felt that we need to find a way to support discovery of non-bibliographic information
within special collections, data, etc. Primo is not a good solution for finding datasets and
datasets will be of increasing importance.
Participants also felt that we need to find a way to make sure staff know more about what tools
we have (and what they cover and what they should be used for). They also felt that we need to
involve more people with regard to WSDS selection and implementation -- more than CAPT.
We also need more networking with others. With all we have learned regarding user search
over the past several years, we should take a leadership role in helping determine the best
scenarios regarding search and discovery. We could sponsor a conference here on WSDS or
use a CIC conference for discussions with other member libraries. We are all facing similar
situations. And we may need a broader discussion of what should be in the WSDS. Note: Lisa
Hinchliffe and Susan Avery will be presenting at the CIC Spring Information Literacy Conference
on their discovery analysis work.
Resources needed for addressing the Goals and Tasks identified in this report.
The DDST wants to use some of the remaining funds in the recurring Student Library/IT Fee
allocation for Next-Generation System Support to fund full-blown UIUC EBSCO Discovery
Service and Summon WSDS implementations. These systems will be used in the comparative
retrieval studies. There is a recurring allocation of $250K in this budget
line.
While many of the transaction log comparative studies can be automated or partially automated,
there will be a need for hourly GAs to perform the detailed analysis and assist in the statistical
analysis. It is important to identify this representative sample set of user searches so we can use it
to test any available system, and the funding for this study should be arranged as soon as
possible.
Appendix: Discovery and Delivery Study Team
Charge: The Discovery and Delivery Study Team is charged to develop a recommended
“discovery and delivery strategy” for the University Library. Developing this strategy entails
comprehensive review of how the Library currently facilitates discovery of and provides access to
content, the marketplace of current and emerging search, retrieval and access technologies,
and approaches for coordinating methods and techniques throughout the Library’s decentralized
service structure as well as articulations of principles and assumptions that should guide the
Library’s work in this area. The Discovery and Delivery Study Team will review the Priorities
(2003) from the Taskforce on Access, and evaluate the Library’s implementation work over the
past decade as well as user perspectives gathered through user surveys, usability studies, and
search log analysis. Through this review, as well as forums for library employees and users to
discuss current challenges and opportunities, the Team will identify 4-6 topics for small groups
to investigate. The Study Team will recommend the topics and small group membership to
CAPT, which will determine both the topics and the makeup of the small groups. The small
groups will investigate these topics in detail and develop recommendations. The Study Team
will then articulate these recommendations as well as principles and assumptions into an
integrated discovery and delivery strategy, which will be submitted to CAPT.
Membership: Lisa Hinchliffe (co-lead), Bill Mischo (co-lead), Kirstin Dougan, Sarah Williams,
Michael Norman, Susan Avery.