2014 3rd International Conference on User Science and Engineering (i-USEr)
Wikipedia Search Engine:
Interactive Information Retrieval Interface Design
Amanjot Kaur Sandhu1 and Tiewei Liu2
College of Fine Arts1
School of Information2
The University of Texas at Austin, Texas, USA
{1kaur.amanjotsandhu, 2twliu}@utexas.edu
Abstract - The Wikipedia search interface was redesigned based on a literature review. An initial interface was designed with interactive information retrieval features covering the search box, categories, navigation, layout and search result views. Eight randomly selected subjects tested the interface in simulated search tasks and provided feedback via a post-task questionnaire. A redesigned search user interface was then proposed based on the subjects' feedback on the initial design. The new interface is expected to meet users' search preferences and to inform further IIR (Interactive Information Retrieval) interface design.
Keywords - Wikipedia search, interactive information retrieval, search user interface
978-1-4799-5813-9/14/$31.00 ©2014 IEEE
I. INTRODUCTION
Interactive search user interfaces are an active topic in information science. Numerous features have been proposed over the past two decades to optimize the user experience. Although most of these features have been adopted by popular search engines, online communities and other websites, it remains unclear to what extent they help in the search process and whether users actually like using them. The purpose of this study is to elicit users' search preferences using a self-designed Wikipedia search interface in simulated search tasks. Subjects compared and evaluated several key features of interactive search interface design, such as the search box, control features, navigation and layout. The search interface was then redesigned based on the subjects' feedback. The redesigned interface is expected to meet users' search preferences, and the results of the user tests should also inform further IIR interface design. The study was intended to answer the following questions based on the experimental results:
i. Are the auto-complete and query clarification features of the search box useful?
Subjects were asked to compare a non-interactive search box with an interactive search box offering auto-complete and query clarification.
ii. What kind of categories helps users most in the search?
Two sets of categories were shown to subjects while they performed the search tasks. One set was created with a clustering and sorting method. The other was designed according to traditional classification schemes. Subjects were asked which set of categories helped them most in completing the task and why.
iii. Is our navigation design flexible?
Search is a long journey during which users' information needs change frequently, and users often jump back and forth from page to page. Subjects were asked to evaluate the flexibility of our crumb box navigation design and other navigation features.
iv. What kind of layout do users prefer?
Subjects were presented with several layout designs of the result page and asked for their preference. The question was whether the arrangement of the elements displayed on the screen (categories, search results, etc.) affected the search experience.
v. Are the different search result views useful to users?
Subjects were shown search results in various formats, including text, images and metadata, and were asked whether the different views were useful. Their answers revealed users' preferences among the formats of information display.
vi. Are there other interactive features users want to see in a search engine?
Subjects were asked to recommend interactive features to add to the interface being tested. The answers reflected users' preferences and expectations for the IIR interface.
II. RELATED WORK
Information professionals and computer scientists have made many efforts to investigate features that can improve the usability of interactive search engines and the user experience. Russell-Rose and Tate [1] argue that recognition over recall is one of the key principles in HCI: people are better at recognizing things they have previously experienced than at recalling them from memory. Auto-complete accommodates this trait by transforming a problem of recall into one of recognition. The auto-complete function lets users choose whether to select from the suggested list or to enter the query in full. Selecting from the list has two advantages: it helps users save time and keystrokes;
and it avoids spelling mistakes and typographic errors. Russell-Rose and Tate distinguish auto-complete from auto-suggest: auto-complete is used for lookup, while auto-suggest accommodates exploratory search [1]. White and Roth [2] discuss query suggestion in the context of exploratory search systems (ESSs), which must offer users the ability to specify information needs. User-defined queries are shaped by users' existing knowledge, which may limit the opportunity for exploratory search. Query suggestions help users select additional query terms [2]. When suggestions are generated from historical query log data, they effectively narrow a search to target a particular subtopic [2]. Wilson [3] divides the elements of a search user interface into four main groups (input, control, informational and personalizable) and discusses auto-complete within this framework. Auto-complete guides people towards queries that are likely to work. Since auto-complete provides information to searchers as they query, it "helps make the search box a better Informational feature as well as an Input feature" [3]. Auto-complete can also be personalized with the queries a user has issued in his or her search history [3]. White and Marchionini [4] regard interactive query expansion (IQE) as a useful technique that helps users formulate improved query statements and ultimately retrieve better search results. They introduced a technique called Real-Time Query Expansion (RTQE), which "offers query expansion terms to searchers as they enter queries, and updates following each term to reflect potential completions of the search query".
Classification systems aim to make information more findable and usable by removing some of the ambiguity of language [5]. Categories, as a manual classification, are sometimes discussed in comparison with the automated grouping approach of clustering. Hearst [6] argues that category systems are usually logical and consistent. They present well-understood and predictable units of meaning, and they navigate well in a hierarchical structure. However, documents need to be manually assigned to categories. By contrast, clustering methods are fully automatable, but they are less consistent, coherent and comprehensible. In addition, they usually lack predictability and mix different dimensions simultaneously. Current online clustering systems cannot produce understandable results in a hierarchical structure [6]. Faceted classifications are less applicable when collections are large and unmanaged [3], [7]. Wilson [3] argues that clustering, the approach of automatically identifying attributes of a collection or result set, is more important in this circumstance. However, he admits that the results of automated classification can be highly variable and that it is difficult to generate meaningful groups and effective labels [3].
Clustering has been a popular topic in recent years. It has achieved high precision, recall and efficiency in information retrieval because of its advantages in domain independence, scalability, and the potential to capture meaningful themes within a set of documents [3], [8]. Clustering enables the user to explore a collection through interaction and a form of query preview. However, as Tunkelang [9] states, the clustering-based Scatter/Gather work assumes that documents contain only unstructured text. This minimal data model limits the power of an exploratory interface [9]. One study shows that users perceived the Scatter/Gather method as more difficult to use than classic information retrieval systems, although it helped subjects accomplish the tasks more efficiently. In fact, Scatter/Gather clustering was particularly useful when users were less familiar with the search tasks [10].
Tidwell [11] describes page layout as the art of manipulating the user's attention on a page to convey meaning, sequence, and points of interaction. Visual hierarchy is an important element of page layout. "A good visual hierarchy gives instant clues about the relative importance of page elements and the relationships among them" [11]. The large text block located in the center of the page is usually the primary content, while small but important items should be put at the top of the page, along the left side or in the top-right corner [11]. Russell-Rose and Tate [1] discuss three main layout choices for faceted search: vertical, horizontal and hybrid. The most common vertical layout places facets on the left. "It provides visual coherence that helps reinforce the relationship between the selections made and the results returned" [1]. In addition, it helps maintain visibility if the browser is resized [1]. Some websites choose to display the facets on the right, such as the Harvard University Library and the Edinburgh University Library [12], [13]. In horizontal layouts, facets are placed at the top of the page, a more dominant and visible position. However, the number of facets shown depends on the width of the page, and the facet menu becomes invisible when users scroll down the results. The hybrid layout combines the features of the vertical and horizontal configurations and arranges the facets both on the sides and at the top.
Tidwell [11] argues that navigating around a website or application is like commuting. A good navigation design should shorten the distance a user must travel in a search. Navigation incurs a cognitive cost, and interface designers should take care that this cost does not become too high [10]. Images and metadata are alternative formats for displaying information. They can also enhance the visualization of text data. Russell-Rose and Tate [1] consider these applications at an aggregate level; that is, they are designed to aggregate, organize and summarize data from numerous sources by using data visualizations to communicate key metrics, patterns and overall status [1].
III. METHODOLOGY
Both qualitative and quantitative approaches were used in this project.
A. Literature Review
The initial interface was designed based on the literature review. Previous research on users' information needs and preferences helped us understand user behavior [1]. Some frequently recommended features were adopted in the initial design, and the study was intended to gather users' feedback on these features.
B. User Test
Eight users were recruited to perform two simulated search tasks on the initial search interface. Each subject then answered several post-task questions. Their answers informed how the interface design was improved.
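As an illustration only, the post-task answers on the five-point scale could be tallied per question along the following lines. The paper does not describe the tooling used, so the function names `tally` and `share_agreeing` are hypothetical; the sample data are the navigation answers reported in Section V (1 strongly agree, 2 agree, 5 neutral).

```python
from collections import Counter

# Five-point scale used for the objective questions in this study.
SCALE = ["strongly agree", "agree", "neutral", "disagree", "strongly disagree"]


def tally(responses):
    """Count answers per scale point, keeping scale order and zero counts.

    Hypothetical helper for illustration; not the authors' tooling.
    """
    unknown = set(responses) - set(SCALE)
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    counts = Counter(responses)
    return {point: counts.get(point, 0) for point in SCALE}


def share_agreeing(counts):
    """Fraction of subjects who answered 'agree' or 'strongly agree'."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["strongly agree"] + counts["agree"]) / total


# Navigation question, as reported in Section V:
# 1 strongly agree, 2 agree, 5 neutral (8 subjects in total).
navigation = ["strongly agree"] + ["agree"] * 2 + ["neutral"] * 5
nav_counts = tally(navigation)
nav_share = share_agreeing(nav_counts)
```

With only eight subjects, raw counts such as these are more informative than percentages, which is why the sketch keeps the full distribution alongside the agreeing share.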
C. Observation
The user tests were observed to better understand the difficulties subjects encountered when completing the tasks on the interface. The notes taken during the sessions provided additional support when analyzing the data collected in the user tests.
D. Statistical Analysis
Statistical methods were applied when analyzing the data collected in the user tests. The results of the statistical analysis also set the criteria for whether and how to modify the features of the initial design.
IV. RESEARCH PROCESS
The research process had five stages: project idea and literature review, initial interface design, user test, data analysis, and redesign of the user interface.
A. Stage 1: Literature Review
The initial idea of the project was to identify users' preferences regarding the category-related features of interactive search interfaces, including what kinds of categories users prefer and how those categories should be displayed. Books and articles on interactive search interfaces, information seeking and user experience were read; the bibliography lists these readings. In this process, more IIR design features caught our attention. An interactive search box and diverse formats of search results (text, image, metadata) were therefore added to the research plan to make the search experience more complete for users.
B. Stage 2: Initial Interface Design
After the features to be studied were decided, the initial interface was designed. For convenience of data retrieval, we chose to design a Wikipedia search engine and used data retrieved from Wikipedia, one of the most popular open-access databases. The initial search interface contained the features to be studied, such as auto-complete, query clarification, categories, various search result views and flexible navigation. All the features were incorporated into the minimum number of web pages. The resulting search engine consisted of 12 pages. Two versions of the prototype were created, one with InDesign and the other with Axure.
C. Stage 3: User Test
After the prototype was created, 8 subjects were invited to test the usability of the search interface. The subjects were randomly selected students, ranging from freshmen to PhD students. All were proficient in English. The subjects were informed of the purpose of the research and the procedure of the user study. They were also informed that the study was voluntary and confidential. Each subject signed a consent form before participating in the study. Each subject was then asked to complete two search tasks on the interface.
• Search Task 1: Search for information about the Apple company.
• Search Task 2: Search for the Apple company's Wikipedia images.
The subjects were asked to complete a post-task questionnaire after they completed both search tasks. The questions covered the key features studied in this project. Data collection was anonymous. The subject's name appeared only on the consent form. No other personal data was recorded. Each subject was assigned a number that was used to record the study results.
D. Stage 4: Data Analysis
The questionnaires collected in the user tests were carefully studied. Each question was designed to elicit users' feedback on one of the features studied in the project. Most objective questions were rated on a five-point scale: strongly agree, agree, neutral, disagree and strongly disagree. Subjects' feedback proved to be stable and consistent in general. For almost all the objective questions, the answers fell overwhelmingly on either the "agree" or the "disagree" side, so users' preferences regarding the interactive features were easy to see. Answers to the 3 subjective questions were listed and analyzed one by one. Notes taken during the user tests provided additional support when interpreting the questionnaire answers.
E. Stage 5: Redesign of User Interface
The redesign of the Wiki search interface was based on the data collected in the user tests. Design features that received positive feedback were retained, while those that received negative feedback were abandoned. New features recommended by the users were adopted if they were compatible with the well-received parts of our original design or if they could replace original features that received negative feedback. The following changes were made to the initial design:
1) Search query clarification page
On this page, the user is asked to clarify the search query, and the various possible categories related to that query are displayed (see Fig. 1). We changed the terminology of some categories after the users' feedback.
Fig. 1. Search query clarification page
2) Search results list page
This page shows the search results in the list view (see Fig. 2). The left side displays all the related categories. The user can also change the language and can navigate to three different views: List, Images and Keywords. After user testing, tooltip labels were added to the icons of these views. Also,
users preferred this layout over the other layouts used during the user testing. They want the categories to be displayed on the left side, as this is the traditional placement.
Fig. 2. Search results list page
3) Search results images page
This page displays the Wikipedia images of the search results (see Fig. 3). When the mouse hovers over an image, the thumbnail plays a slide show of all the other images on that Wikipedia page. After the users' feedback, a snippet of the Wikipedia page was added alongside the image thumbnail. The heading of this page will also be refined so that users can easily recognize what the page is about.
Fig. 3. Search results images page
4) Search image details page
The user navigates to this page by clicking any image on the search results images page (see Fig. 4). All the images of that Wikipedia page are displayed here. The user can click any thumbnail, and the selected image is displayed at a larger size with the image information below. After the users' feedback, a link to the Wikipedia page and clear back navigation to the previous page were added.
Fig. 4. Search image details page
5) Search results keyword page
This page shows all the keywords related to a category (see Fig. 5). It helps researchers and experts see the range of keywords related to any category they are searching. After the users' feedback, a clear heading will be added to this page so that users can easily understand what the page is about.
Fig. 5. Search results keyword page
V. RESULTS
Each design feature of the search interface was studied with specific objective questions in the questionnaire. In some cases, answers to the subjective questions explained subjects' preferences in the objective questions. They also revealed users' preferences for other features of interactive search user interfaces that were not considered in the initial design but are of significant importance in IIR study.
A. Search Box
The search box has two interactive features: query auto-complete and query clarification. When asked whether the auto-complete function is useful, 4 of the 8 subjects agreed that it is (see Fig. 6). In answering the subjective questions, one subject suggested adding two options to the search box, text search and image search, so that users can search within a limited scope from the very beginning.
B. Categories
Two pages with two different sets of categories were shown to the subjects. One set was used by another popular Wikipedia search engine, SearchTechnologies, and was generated according to how frequently users clicked Wikipedia tags. In effect, this method of generating categories is clustering and sorting of tags. The other set was designed by the researchers based on common understanding and traditional classification principles. The entries included in each set of categories are:
1) SearchTechnologies categories
Apple II Games, Commodore 64 Games, Year of Birth Missing (living people), DOS Games, American Film Actors, Amiga Games, American Television Actors, English-language Films, Mac OS Games, Windows Games, Atari ST Games, IOS Games, 2011 Singles and Atari 8-Bit Family Games.
2) Researchers' categories
All, Corporation, People, Product, Eatables, Folklore, Books, Games, Places and Scholarly Papers.
Only 1 subject found both sets of categories useless, as the subject used only the search function and selected "All" to complete the task. The other 7 subjects unanimously agreed that the categories created by the researchers helped them most when performing the task. From the follow-up question asking why they preferred that set of categories, we can see that most subjects found the researchers' categories more useful because they are simple and broad. Some also thought the categories used by SearchTechnologies were too specific and hard to understand. Another thought the SearchTechnologies categories would be useful when users narrow down the search results, but that they should not be displayed as the first-level filter.
Fig. 6. Search box query results (bar chart of the number of subjects, 0 to 4, giving each answer from Strongly Agree to Strongly Disagree for two statements: "The auto-complete function is useful in the search task." and "The query clarification is useful in the search task.")
C. Navigation
One key navigation feature of the initial interface is the crumb box shown at the top of the search results. However, navigation was tested throughout the website, including commonly used features such as navigational labels and buttons. When asked whether the navigation is flexible and easy to use, 5 subjects held a neutral opinion, 2 agreed and 1 strongly agreed that the navigation was well designed. Two subjects could not find a way to exit from the pages displaying images and suggested adding an "Exit" or "Back" button to these pages.
D. Layout
Three different layouts of the search result page were presented to the subjects, with categories on the left side, on the right side and at the middle-top of the page respectively. Of the 8 subjects, 6 preferred the layout with categories displayed on the left. The other two layouts each received one preference.
E. Formats of Search Results
The search results were presented in three formats: text, image and metadata. Four pages were designed to test the usefulness of the diverse formats of results display. Of the 8 subjects, 3 agreed and 2 strongly agreed that the different views of the search results are helpful, while the other 3 found that the different formats of search results are not beneficial. Some subjects commented that the metadata view would be helpful for experts. Another subject said that the metadata page would be helpful when the user wanted to explore everything available for a particular category.
F. Search Engine
One objective question asked subjects to evaluate the Wikipedia search engine as a whole. When asked whether the search engine is easy to use, 2 subjects strongly agreed and 4 agreed that it was (see Fig. 7). The other two subjects held a neutral opinion.
Fig. 7. Search Engine as a whole query results (bar chart of the number of subjects, 0 to 4, giving each answer from Strongly Agree to Strongly Disagree for the statement "This search engine is easy to use.")
G. More Features
The two subjective questions, one asking subjects what features they want to add to the search engine and the other asking for other suggestions, received very good feedback. The suggestions can be grouped into two categories: labels and visualization.
1) Labels
Two subjects suggested adding to each picture a title or description indicating what the image was and where it was retrieved from. Three subjects suggested adding labels to the text, image and metadata icons displayed at the upper right of the search results. The researchers also observed that some subjects had difficulty finding the "Image View" button during the user test.
2) Visualization
Some subjects offered suggestions that could improve the visualization of elements of the search interface. For example, the circles on the query clarification page could be dynamic and animated; the skip button should be better placed; the metadata view should have a smaller font and a clearer view; and instead of bold and highlighted words in the search result list, it would be better to show tags or categories.
VI. DISCUSSION
The study proved to be fruitful considering its reliability and validity. "Validity is the extent to which methods and measures allow a researcher to get at the essence of whatever it is that is being studied, while reliability is the extent to which the method and measures yield consistent findings" [14]. The study achieved comparatively high reliability. As an experiment conducted in a lab environment, the situations subjects experienced were tightly controlled by the researchers, such as receiving the same instructions and
performing exactly the same tasks on the same interface in the same environment. The strictly controlled experimental conditions make consistent research results possible. Several efforts were made to improve the validity of the study.
First of all, the subjects were randomly selected students, all proficient in English and in computer use. Therefore, all the subjects were assumed to fully understand the search tasks and to be competent to complete them. Second, it may be questionable whether the behavior subjects exhibited when they were aware of being monitored with Morae would be the same as their behavior in a natural environment. However, subjects were informed that the study was anonymous. Each subject was assigned a number that was used to record the study results, and there is no way to link subjects to the data. A consent form was signed by each subject before participation. The strict protocol between subjects and researchers ensured that subjects could freely express their opinions in the study. Third, the study was well focused on several key features of interactive search user interfaces, and the post-task questions were designed specifically to elicit subjects' preferences regarding each feature. Thus the results spoke directly to the research questions, and the researchers could get at the essence of the issue being studied.
VII. CONCLUSION
The results of the user tests spoke well to the research questions. Users' preferences regarding the key features of the search interface were stable in general. Most features designed on the basis of previous studies were welcomed by the subjects. This study contributes to further IIR study in several respects:
First of all, an ideal design of an interactive search user interface was proposed based on the results of the user study. This design contains features that can meet most users' search preferences and thus can be used as a model for search interface development. Second, categories generated automatically by search systems based on click frequency have become very popular in recent years. Many search engines, including some successful and popular ones, have adopted this clustering and sorting method to create categories. However, this study shows that users still prefer categories created according to traditional hierarchical classification schemes, which are usually characterized by systemization and generality. These findings are also supported by the study conducted by English et al., which suggested that the explicit exposure of hierarchical faceted metadata in a manner that is intuitive and inviting to users can strikingly improve the usability of a user interface [15]. Third, whether metadata should be used as a format for displaying information to general users, and how to display it, is a question for further studies. As we can see from this study, some subjects thought the metadata view was useless in the search, while others believed that metadata could be used only in some specific cases. Fourth, the study discovered several features that deserve future investigation. Features such as labels, extensions and visualizations were not the concern of this study. However, users clearly have expectations in these respects, and future researchers should pay more attention to these issues. In summary, this study points researchers to several interesting topics in IIR and thus proved to be fruitful.
VIII. LIMITATIONS
Though the study achieved reliability and validity to some extent, the results of this research have certain limitations. First, because this was a course project, the researchers had to finish the literature review, design the prototype, conduct the user tests and redesign the search engine within a limited time. Some features, such as "Exit" or "Back" buttons on the image pages, should have been considered by the researchers at the prototype design stage. Failing to include such a function might have affected subjects' answers when they evaluated the navigation function and the general performance of the search engine. Second, only two tasks were designed to test the Wikipedia search engine. Moreover, the two tasks tested different features. This greatly reduced the external validity of the results of this study.
REFERENCES
[1] T. Russell-Rose and T. Tate, Designing the Search Experience: The Information Architecture of Discovery. Newnes, 2013.
[2] R. W. White and R. A. Roth, "Exploratory search: Beyond the query-response paradigm," Synthesis Lectures on Information Concepts, Retrieval, and Services, 2009, 1(1), pp. 1–98. doi:10.2200/S00174ED1V01Y200901ICR003
[3] M. L. Wilson, "Search user interface design," Synthesis Lectures on Information Concepts, Retrieval, and Services, 2012, 3(3), pp. 1–143. doi:10.2200/S00371ED1V01Y201111ICR020
[4] R. W. White and G. Marchionini, "Examining the effectiveness of real-time query expansion," Information Processing and Management, 2007, 43(3), pp. 685–704. doi:10.1016/j.ipm.2006.06.005
[5] G. Smith, Tagging: People-Powered Metadata for the Social Web. Berkeley, CA: New Riders Publishing, 2007.
[6] M. Hearst, Search User Interfaces. Cambridge University Press, 2009.
[7] J. Teevan, S. Dumais, and Z. Gutt, "Challenges for supporting faceted search in large, heterogeneous corpora like the Web," HCIR, 2008. Redmond, WA, USA.
[8] M. Hassenzahl and N. Tractinsky, "User experience: A research agenda," Behaviour & Information Technology, 2006, 25(2), pp. 91–97. doi:10.1080/01449290500330331
[9] D. Tunkelang, "Faceted search," Synthesis Lectures on Information Concepts, Retrieval, and Services, 2009, 1(1), pp. 1–80. doi:10.2200/S00190ED1V01Y200904ICR005
[10] X. Gong, W. Ke, Y. Zhang, and R. Broussard, "Interactive search result clustering: A study of user behavior and retrieval effectiveness," Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 07/2013, pp. 167–170.
[11] J. Tidwell, Designing Interfaces. O'Reilly Media, 2010.
[12] The Harvard University Library Hollis System. [Online]. Available: http://hollis.harvard.edu/?q=HCI
[13] The Edinburgh University Library OPAC. [Online]. Available: http://catalogue.lib.ed.ac.uk/vwebv/search?searchArg=hci&searchCode=GKEY%5E*&searchType=0
[14] D. Kelly, "Methods for evaluating interactive information retrieval systems with users," Found. Trends Inf. Retr., 2009, 3(1–2), pp. 1–224. doi:10.1561/1500000012
[15] J. English, M. Hearst, R. Sinha, K. Swearingen, and P. Yee, "Flexible search and navigation using faceted metadata," Technical Report, University of Berkeley, 2002.