Toward Usable Browse Hierarchies for the Web

Kirsten Risden
Microsoft Research
1 Introduction
The World Wide Web presents both new challenges and opportunities for
conducting Human Factors work. Browse hierarchies used to classify web
content present an interesting case in point. On the one hand, the size of the
domain to be classified and its general-purpose intent (i.e., it is intended for all
users and retrieval of all types of information) make traditional techniques such
as card sorts too unwieldy. On the other hand, the nature of the Web makes
efficient data collection from a relatively large number of people a possibility.
The purpose of this initial, small-scale study was to begin to investigate usability
methods that have the potential to scale to the range of users and tasks and at the
same time take advantage of the data collection possibilities that exist for the
Web (e.g., server log data).
A coherent, learnable category structure is a central goal for browse hierarchies
such as those on Yahoo, Excite, msn.com and other Internet portals. Such a
structure will allow users to efficiently find the information they need and to
become more and more proficient in using the hierarchy over time. We know
from cognitive psychology (Rosch 1975) that coherent, learnable category
structures have high within-category similarity and high between-category
discriminability. For abstract categories, such as those found in browse
hierarchies for the web, we also know that linguistic cues that highlight relevant
features of the categories can be important (Horton and Markman 1980).
Categories whose members go together in a loose way, have high overlap with
other categories, or are represented with general labels that do not highlight
reasons for category membership should be difficult for people to use.
Beyond the difficulties that size and general-purpose intent pose for creating
usable browse hierarchies, content continues to be added after a hierarchy is
released to the Web, which makes maintaining a good user experience
challenging. Changes to the makeup and
size of categories mean that the category structure can continue to grow and
evolve new categories and new category structure. Accommodating changes
may have unanticipated effects on the user’s ability to find information. Taking
the hypothetical example in Table 1 below, evolving from an “Arts” to an “Arts
& Humanities” category may lead to confusion between this new category and a
“Culture & Society” category within the same browse hierarchy. This is
presumably because increases in the generality of a category demand more
encompassing labels, which in turn allow for greater overlap between category
content. So in addition to doing usability evaluation during the creation of a
browse hierarchy, there is a need to do usability “check-ups” on an ongoing
basis.
ARTS            ARTS & HUMANITIES    CULTURE & SOCIETY
Art History     Architecture         Architecture
Artists         Art History          Art History
Design Arts     Artists              Culture
Museums         Culture              Death and Dying
Theory          Design Arts          Fashion
Visual Arts     Humanities           Food and Drink
                Museums              Gender
                Musical Arts         Religion
                Performing Arts      Holidays
                Photography          Mythology
                Theory
                Visual Arts
Table 1. Illustration of how changes within one category (Arts) may lead to greater
similarity and user confusion across categories (Arts & Humanities versus Culture &
Society). Added categories are bolded.
Clearly, it would be valuable to have a way to continually monitor how effectively
and efficiently users can locate information, as well as the source of any
difficulties they are experiencing. Users should be able to take a direct path
through the hierarchy to the content they want, easily identifying which
categories lead to the information and differentiating them from categories that
will not lead to the desired information. Traversals between major categories of
the browse hierarchy on the same information retrieval task would be evidence
of confusion and an indication of usability problems in areas where traversals
are common. Such data may be obtained through server logs under certain
circumstances. The goal of the following study was to determine the potential
usefulness of tracking traversal patterns through a browse hierarchy as a way to
monitor confusion and determine its source.
2 Study design and methodology
Five participants were asked to complete 35 information retrieval tasks using an
experimental browse hierarchy. The hierarchy was presented to subjects within a
software tool that displays categories in a simple hierarchical format and records
user paths. This software set the tasks in a context in which user interface
problems would not interfere with the specific interest in the usability of the
categories. The 35 information retrieval tasks were based on popular activities
on the Web. Participants did the tasks in different orders and were
allowed to “back up” in the hierarchy if they thought they needed to use a
different area.
Of particular interest were the top-level categories explored on a given task.
This was, in part, to simplify analysis and, in part, because top-level categories
pose challenges to finding information in a browse hierarchy. Top-level
categories tend to be more general. As a result it is often quite difficult for users
to determine which top-level category a particular topic is likely to be in. If the
methodology I am proposing is useful, it should be sensitive to this difficulty
and should expose the primary sources of user problems. The top-level
categories of the browse hierarchy used in this study are shown in Table 2
below.
Business & Finance
Computers & Internet
Entertainment & Media
Health & Fitness
Home & Family
Interests & Lifestyles
People & Communities
News & Information
Reference & Education
Sports & Recreation
Travel & Leisure
Table 2. Top-level categories of the browse hierarchy used in this study.
3 Results
Exploration of multiple top-level categories on the same task was assumed to
indicate a lack of certainty or confusion regarding where the information would
be located. Each top-level category explored on a given task was scored as
“confused” with each of the others explored on that task. For example, if the
task was to look for information about bike riding and the user looked in Health
& Fitness first, Interests & Lifestyles second and finally settled on Sports &
Recreation, then each of these categories would be scored as being confused
with one another on this trial. Confusability matrices were constructed by
tallying the number of times each pair of categories was confused across tasks
and subjects.
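The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool used in the study: the function name `tally_confusions` and the sample trials are hypothetical, and it assumes that a category revisited within a single trial counts only once for that trial.

```python
from collections import Counter
from itertools import combinations


def tally_confusions(trials):
    """Count how often each pair of top-level categories was explored on the same task.

    `trials` is an iterable of lists; each list holds the top-level categories
    one participant visited on one task, in visit order.
    """
    counts = Counter()
    for visited in trials:
        # Assumption: a category revisited within a trial is counted once.
        distinct = sorted(set(visited))
        # Every pair of distinct categories explored on the trial is "confused".
        for pair in combinations(distinct, 2):
            counts[pair] += 1
    return counts


# Hypothetical trials; the first mirrors the bike-riding example in the text.
trials = [
    ["Health & Fitness", "Interests & Lifestyles", "Sports & Recreation"],
    ["Sports & Recreation"],  # direct path, so no confusion is scored
    ["Interests & Lifestyles", "Sports & Recreation", "Interests & Lifestyles"],
]
confusions = tally_confusions(trials)
```

Pairs are stored as alphabetically sorted tuples so that (A, B) and (B, A) land in the same cell of the matrix.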
Analysis of these data was organized around three questions. 1) How prevalent
is confusion between top-level categories in the browse hierarchy? 2) What is
the structure of this confusion? 3) What is the source of the problem?
The average frequency with which top-level categories were confused with one
another was 18.55 across the 35 trials. Forty-four percent of that confusion
occurred during the first ten tasks, for an average frequency of 8.18. The
average frequency with which top-level categories were confused on the last ten
tasks was 3.82. This indicated that
there was a substantial problem with differentiating the categories from one
another, one that diminished but was nonetheless present even after 25 trials.
To determine the structure of confusion, a network representation was created.
(See Chi and Koeske 1983 and Chen 1997 for other examples of using networks to
understand concept relationships in complex information sets.) Categories
that were confused were linked together in the network (see footnote 1). This
network is shown in Figure 1. The overall pattern reveals that the vast majority
of user confusion
involved the Interests & Lifestyles and News & Information categories. A
separate network constructed for just the last 10 tasks that users performed
(not shown here) showed that Interests & Lifestyles continued to cause confusion
even after a substantial number of trials.
Figure 1. Network representation of category confusion. Categories that were frequently
confused with one another are linked.
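One way such a network could be derived from pair counts is sketched below. The threshold of four follows the footnote to this section. The function names and the pair counts are hypothetical and are not the study's data.

```python
from collections import defaultdict


def build_confusion_network(pair_counts, min_count=4):
    """Return an adjacency map linking pairs confused at least `min_count` times."""
    network = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def rank_by_links(network):
    """Order categories by number of links, most connected first (ties by name)."""
    return sorted(network, key=lambda category: (-len(network[category]), category))


# Hypothetical pair counts for illustration only, NOT the study's data.
pair_counts = {
    ("Interests & Lifestyles", "Sports & Recreation"): 6,
    ("Interests & Lifestyles", "News & Information"): 5,
    ("Home & Family", "Interests & Lifestyles"): 4,
    ("Business & Finance", "News & Information"): 4,
    ("Health & Fitness", "Travel & Leisure"): 1,  # below threshold, not linked
}
network = build_confusion_network(pair_counts)
hubs = rank_by_links(network)
```

Categories with many links are candidates for the primary sources of confusion. Categories whose pairs all fall below the threshold do not appear in the network at all.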
Examination of the content of these and other categories in the browse hierarchy
showed substantial redundancy. Many subcategories were “members” of two
or more of the top-level categories. The proportion of redundant subcategories
ranged from a high of .89 to a low of .00, as shown in Table 3. Follow-up
analyses showed a strong correlation between the proportion of redundant
subcategories and the frequency with which a top-level category was confused with
other categories (r(9) = .76, p < .05). This finding suggests that a high level of
redundancy makes it very difficult for users to learn to differentiate one category
from another.
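The following Python sketch shows one plausible way to compute these quantities. The paper does not spell out its exact procedure, so the definition of redundancy used here (a subcategory listed under more than one top-level category) and all function names are assumptions. The sample hierarchy reuses two columns of the hypothetical example in Table 1, not the experimental hierarchy. The last function converts a correlation into a t statistic with the reported 9 degrees of freedom (11 categories minus 2).

```python
import math
from collections import Counter


def redundancy_proportions(hierarchy):
    """Share of each top-level category's subcategories listed under another one too."""
    # Number of top-level categories that list each subcategory.
    listings = Counter(sub for subs in hierarchy.values() for sub in set(subs))
    return {
        top: sum(listings[sub] > 1 for sub in set(subs)) / len(set(subs))
        for top, subs in hierarchy.items()
        if subs
    }


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, df):
    """t statistic for testing a correlation against zero, with df = n - 2."""
    return r * math.sqrt(df / (1 - r * r))


# Two columns of the hypothetical Table 1 example, not the study's hierarchy.
hierarchy = {
    "Arts & Humanities": [
        "Architecture", "Art History", "Artists", "Culture", "Design Arts",
        "Humanities", "Museums", "Musical Arts", "Performing Arts",
        "Photography", "Theory", "Visual Arts",
    ],
    "Culture & Society": [
        "Architecture", "Art History", "Culture", "Death and Dying", "Fashion",
        "Food and Drink", "Gender", "Religion", "Holidays", "Mythology",
    ],
}
proportions = redundancy_proportions(hierarchy)

# Reported correlation: r(9) = .76.
t_redundancy = t_from_r(0.76, 9)
```

With 9 degrees of freedom, the two-tailed critical value of t at the .05 level is 2.262. The reported r of .76 gives a t of roughly 3.5, consistent with the stated p < .05.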
Overly general labels may also fail to provide linguistic cues that highlight
differences between categories. For example, most of the categories could be
viewed as “interests” or “information”. The use of these highly general words in
Interests & Lifestyles and News & Information may make it difficult for users to
distinguish between these and other categories. If so, the more similar
a label is to other labels, the more frequently its category should be confused
with other categories during information retrieval tasks. To determine whether
this is the case, a separate set of subjects was asked to rate the similarity of
each pair of labels on a 7-point
Likert scale. Average similarity scores are provided in Table 3. Higher numbers
indicate greater similarity. The correlation between similarity ratings and
frequency of confusion was strong and significant (r(9) = .71, p < .05),
indicating that categories with more similar labels were more likely to be
confused with other categories in the set.
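An "average similarity to other labels" score could be derived from pairwise ratings along the following lines. This is a sketch under stated assumptions: ratings are taken to run from 1 to 7, each pair's ratings are averaged across raters first, and a label's score is the mean of its pair averages. The function name and the sample ratings are hypothetical.

```python
from collections import defaultdict


def mean_similarity_to_others(pair_ratings):
    """Average, for each label, the mean rating of every pair that includes it.

    `pair_ratings` maps (label_a, label_b) to a list of ratings from
    different raters (assumed here to be on a 1 to 7 scale).
    """
    totals = defaultdict(float)
    pair_counts = defaultdict(int)
    for (a, b), ratings in pair_ratings.items():
        pair_mean = sum(ratings) / len(ratings)
        # The pair's mean rating contributes to both labels' scores.
        for label in (a, b):
            totals[label] += pair_mean
            pair_counts[label] += 1
    return {label: totals[label] / pair_counts[label] for label in totals}


# Hypothetical ratings from two raters for three labels, NOT the study's data.
pair_ratings = {
    ("Interests & Lifestyles", "News & Information"): [6, 4],
    ("Interests & Lifestyles", "Business & Finance"): [3, 3],
    ("News & Information", "Business & Finance"): [2, 2],
}
similarity = mean_similarity_to_others(pair_ratings)
```

Averaging within each pair before averaging across pairs keeps a pair with more raters from dominating a label's score.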
Footnote 1. For clarity of presentation, only those category pairs confused four
or more times are linked. This accounts for 56% of the total confusion data and
clearly illustrates the major patterns in the data.
Top-level category       Proportion redundant   Average similarity
                         subcategories          to other labels
Business & Finance       .00                    4.35
Computers & Internet     .14                    4.12
Entertainment & Media    .20                    4.30
Health & Fitness         .00                    4.35
Home & Family            .80                    4.75
Interests & Lifestyles   .89                    5.15
People & Communities     .83                    4.25
News & Information       .80                    5.08
Reference & Education    .60                    4.25
Sports & Recreation      .20                    4.45
Travel & Leisure         .33                    4.57
Table 3. Redundancy and similarity scores for each top-level category.
4 Discussion
The major conclusion that can be drawn from this study is that tracking traversal
patterns through a browse hierarchy is a useful and insightful way to monitor user
experience. This study has shown that traversal data contained valuable information
about usability problems in one browse hierarchy. More importantly, analysis of
the traversal data revealed the source of confusion, and permitted diagnosis of why
it occurred. Specifically, the users in this study experienced a significant amount of
confusion in using a browse hierarchy. The source of confusion was pinpointed to
two major categories by mapping the data into a network representation that makes
interrelations among categories explicit. Finally, the patterns observed in the
network representation were validated against measures of learnable category
structures. This demonstrated that those patterns were indeed rooted in users'
psychological experience of the browse hierarchy and provided both an explanation
for why confusion occurred where it did and guidance on how to alleviate it.
The next step is to generalize the methods and approach used in this small-scale
study to data from large numbers of people carrying out their own information
retrieval tasks in real world settings. Tools for automatically collecting and
analyzing such data will need to be developed to make such work tractable.
However, the opportunity to continually monitor user experience in a dynamic and
changing software environment is likely to be worth the effort.
5 References
Chen, C. (1997). Structuring and visualizing the WWW by generalized similarity
analysis. In Proceedings of the 8th ACM Conference on Hypertext (Southampton,
U.K., April), pp. 177-186.
Chi, M.T.H. & Koeske, R.D. (1983). Network representation of a child’s dinosaur
knowledge. Developmental Psychology, 19, 29-39.
Horton, M. S. & Markman, E. M. (1980). Developmental differences in the
acquisition of basic and superordinate categories. Child Development, 51, 708-719.
Rosch, E. H. (1975). Cognitive representations of semantic categories. Journal of
Experimental Psychology: General, 104, 192-233.