Statement of Competency

Timothy W. Miller
Fall 2013
SLIS 289
Core Competency E: design, query and evaluate information retrieval
systems.
Statement of Competency:
Information retrieval is a complex process that must be handled according
to the way that the data is structured. A database such as Dialog is structured
very differently from the basket of children’s board books at the public library.
Understanding how databases are designed is a prerequisite for knowing how to
query them and find information. Evaluating search results, to ensure that the
queries are precise and aren’t inadvertently filtering out the information you want,
is also a skill that requires a basic understanding of how databases are designed.
I have studied various types of databases, have created simple databases and
have also created simple search interfaces to gain a better understanding of how
I can best conduct my searches. I have worked with advanced features of basic
web search engines such as Google and Bing and have investigated other
methods of retrieving information from the Internet. I have found that whether I
am searching Yahoo!, Factiva, Ebsco, or a library OPAC, or querying a
MySQL database, I can’t expect to find everything I’m looking for without
knowing how the search works. Therefore, I have studied and experimented with
a variety of these types of databases to learn all that I can about the best ways to
plan my searches.
Evidence:
My first piece of evidence is a basic evaluation of a growing trend in
library OPACs: the evolution of NextGen OPACs with Web 2.0 features
(evalNextGenOPACs.docx). This study examines the new features that OPAC
designers are incorporating to bridge the gap between the traditional OPAC and
the web search engine. My findings did not indicate that there is a magic recipe
for building the perfect OPAC. One conclusion is that libraries should design
their search interfaces to be as user-friendly as web search engines. Studies
show that most users will simply abandon searches that aren’t simple and
intuitive, that is, searches that don’t reveal obvious results early in the process.
However, it has also been shown that users are often dissatisfied when such
searches bring up relevant titles but not the accompanying full-text
articles. There is a middle ground between overly complex searches (that
include elaborate filter options and controlled vocabularies) and overly simple
searches (that bring up too many irrelevant or inaccurate results). Web 2.0
features such as user-generated content (tags, reviews), assistive search
(spell-checking, recommending similar queries), federated searching (searching more
than one database at a time), and faceted navigation (computer-generated links
to narrow overly broad result sets) have been recommended in an attempt to
bridge these two extremes. However, the two most important considerations are
ensuring that the interface is user-friendly and that the results are relevant and
valid. I argue that when trying to reconcile these two concepts, the former should
be the primary focus. If the interface is unusable to the average library patron, it
simply will not be their first choice. However, if we strive to make the OPAC as
user-friendly as possible and release ‘beta’ versions as we make improvements,
users will be attracted to the interface, and although some functions may not
always be perfect, they will improve over time. Furthermore, as users
become more accustomed to this type of ‘beta’ development (as seen in
web browsers and general search engines such as Google), they will be more
forgiving of such shortcomings and more responsive when asked to
learn how to use a new feature that meets their needs.
The next piece of evidence is a basic interface that I designed as an
experiment in creating a PHP/MySQL database (assignment_8.php). This
database is a mock-up of a library OPAC but is built on principles that apply to
any type of online database such as Ebsco or ProQuest. It features a way to
create user accounts as well as bibliographic records. Since it is built using a
relational database management system (MySQL), the data within the database
can be linked and used to make the interface interactive. A user can enter a
search term to find a specific item record using keyword or subject searches.
Each deliberately simple bibliographic record contains subject, ISBN, publication date
and title fields that can all be searched with a simple Google-like search (one
search box). A user can create an account with a username and password (that
is secure from a malicious attack). The basic structure of this database
can easily be extended to add features such as user-generated
content (once logged in, a user could fill out a form to add tags to a
new column in a table that links those tags to the ‘booklist’ table) or
librarian-curated filtering options (by creating a restricted form that allows staff to
create links that filter to criteria such as subject type). However, since this was
created as a basic experiment and is therefore an oversimplification of an
OPAC, there are a few limitations that I have not addressed. These
include: search results that are sorted only by title, not relevance (there is also no
dedicated search engine); no access control to restrict users from viewing
sensitive information (such as other users’ passwords); and no feature for
consulting a thesaurus in order to choose precise terms. As databases such as
these grow in size and become more feature-rich, there is also a considerable
increase in demands on computing resources. A MySQL database that would
include these and other features (as well as the SQL queries themselves) would
need to be designed so that searches can be run quickly and efficiently.
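As a rough illustration of the one-box search and the tag extension described above, the following Python sketch uses the standard-library sqlite3 module as a stand-in for MySQL. It is not the code from assignment_8.php. Only the ‘booklist’ table name and the searchable fields come from the text; the column names, the ‘tags’ table layout, the function names and the sample records are assumptions made up for the example.

```python
import sqlite3
from typing import List


def build_catalog() -> sqlite3.Connection:
    """Create an in-memory mock catalog (hypothetical schema and sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE booklist (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            subject TEXT NOT NULL,
            isbn TEXT NOT NULL,
            pub_date TEXT NOT NULL
        );
        -- Linking table: each row attaches one user-supplied tag to one book.
        CREATE TABLE tags (
            book_id INTEGER NOT NULL REFERENCES booklist(id),
            tag TEXT NOT NULL
        );
        """
    )
    # Made-up records; the ISBN values are placeholders, not real ISBNs.
    conn.executemany(
        "INSERT INTO booklist (id, title, subject, isbn, pub_date) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Goodnight Board Book", "Children", "1111111111", "2001"),
            (2, "Database Design Basics", "Computers", "2222222222", "2010"),
            (3, "Searching Online Databases", "Library science", "3333333333", "2012"),
        ],
    )
    conn.executemany(
        "INSERT INTO tags (book_id, tag) VALUES (?, ?)",
        [(1, "bedtime"), (3, "dialog")],
    )
    return conn


def search(conn: sqlite3.Connection, term: str) -> List[str]:
    """Match one search-box term against every searchable field and the tags.

    The term is passed as a bound parameter, never pasted into the SQL text,
    which guards against SQL injection. Note that '%' and '_' inside the term
    still act as LIKE wildcards in this simplified version.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT DISTINCT b.title
        FROM booklist AS b
        LEFT JOIN tags AS t ON t.book_id = b.id
        WHERE b.title LIKE ? OR b.subject LIKE ? OR b.isbn LIKE ?
           OR b.pub_date LIKE ? OR t.tag LIKE ?
        ORDER BY b.title  -- sorted by title only, not by relevance
        """,
        (pattern,) * 5,
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = build_catalog()
    print(search(catalog, "database"))
    print(search(catalog, "bedtime"))
```

A substring match against every column requires a full table scan, which is one reason a larger catalog would likely need indexes or a dedicated full-text search engine to keep searches fast.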
The third piece of evidence is a log of some searches I performed using
Dialog (miller_exercise1.docx). The searches demonstrate some essential
database search techniques, including: constructing a precise search string;
using the thesaurus to choose controlled vocabulary; referencing the manual (the
‘bluesheets’) to determine which features can be used with specific databases;
using a variety of commands to find all variations of an author’s name and all
relevant controlled vocabulary terms for a specific concept (e.g. audiobook(s) =
playaway(s) = recorded book(s) = talking books); using truncation and wildcards
to expand search terms; and using the bluesheet information and search
commands to make cost-effective queries (e.g. avoid bringing up results that are
costly until the search has been adequately narrowed down and a level of
precision has been determined). The searches also demonstrate some essential
search strategies, including: pearl-growing (finding a relevant result and using
information within that item’s record, such as subject terms, to continue the
search); onion-peeling or successive fractions (starting with a broad search and
filtering to narrow down the results); and combining relevant terms with Boolean
operators to narrow or widen a search. The searches that I conducted show that
I use a variety of techniques and strategies to find each piece of information.
Using only one strategy or technique will rarely suffice. I also use these
techniques to ensure that I have found only relevant resources (precision) and
that my search has uncovered all of the relevant records (recall).
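The two measures named above can be made concrete with a short calculation. The following Python sketch assumes that the full set of relevant records is known, which in practice is rarely the case and usually has to be estimated. The function name and the record identifiers are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one search.

    precision = relevant records retrieved / all records retrieved
    recall    = relevant records retrieved / all relevant records
    An empty denominator is reported as 0.0 rather than raising an error.
    """
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical record IDs: four records retrieved, three relevant overall,
    # two of which appear in the result set.
    retrieved = {"rec1", "rec2", "rec3", "rec4"}
    relevant = {"rec2", "rec3", "rec5"}
    precision, recall = precision_recall(retrieved, relevant)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Narrowing a search (for example, by adding a term with AND) tends to raise precision at the cost of recall, while widening it (with OR or truncation) tends to do the opposite, which is why a single strategy rarely suffices.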
Conclusion:
Evaluating database design and search strategies is highly interesting to
me. From learning how a database is constructed and what features are
included, to experimenting to find out which strategies are most efficient and/or
cost-effective, I enjoy the detective work. Because I am also interested
in learning how systems function, I have experimented with building my own
databases for specific purposes. In my work I construct SQL queries to run
reports that enable me to keep track of item circulation. I also write
programs that let me enter data into MySQL databases rather than
spreadsheets, so that I can run more powerful queries and have more control
over data formats and exporting. To keep my database design and search skills
current, I read the product literature (such as the Dialog bluesheets or Ebsco
help sheets). I test new features and products, such as Ebsco’s mobile apps, to
see what techniques can be used and how they can be accessed. To learn
about data storage techniques, I build test cases as examples:
assignment_8.php (PHP/MySQL), mobileCatalog.html (jQuery/JSON),
henry_v.html (XML/XSLT). This is an area that is constantly changing as
designers figure out ways to make searching more intuitive and user-friendly
and as technologies advance to allow for bigger databases and
faster searches. It is also an area that I truly enjoy staying informed about,
and I gladly accept the challenge.