When the Wait Isn’t So Bad: The Interacting Effects of Web Site
Delay, Familiarity, and Breadth
February 20, 2004
Dennis F. Galletta (contact author)
Associate Prof. of Bus. Admin.
Katz Graduate School of Business
University of Pittsburgh
Pittsburgh, PA 15260
Phone 412-648-1699; Fax 412-648-1693
galletta@katz.pitt.edu
Scott McCoy
Assistant Professor of MIS
School of Business
College of William & Mary
Williamsburg, VA 23187-8795
Phone 757-221-2062; Fax: 757-221-2937
scott.mccoy@business.wm.edu
Raymond M. Henry
Assistant Professor
Department of Management
Clemson University
Clemson, SC 29634
Phone 864-656-0591; Fax 864-656-2015
rhenry@clemson.edu
Peter Polak
Assistant Professor of CIS
School of Business Administration
University of Miami
Coral Gables, FL 33124-6540
Phone 305-284-8044; Fax 305-284-5161
ppolak@miami.edu
Acknowledgements: The authors would like to thank Enrique Mu, Dong-Gil Ko, and Alex Lopes for allowing their students to participate. The authors also appreciate the excellent comments from the attendees of
the research workshop at the University of Illinois at Urbana-Champaign.
Abstract
Although its popularity is widespread, the World-Wide Web is well known for one particular drawback: its
frequent delay when moving from one page to another. This experimental study examined whether delay and two other
Web site design variables (site depth and content familiarity) have direct and interactive effects on user performance,
attitudes, and behavioral intentions. An experiment was conducted with 160 undergraduate business majors in a completely counterbalanced fully factorial design that exposed them to two Web sites and asked them to browse the sites for
9 pieces of information. Results showed that all three factors have strong direct impacts on user attitudes and performance, leading to behavioral intentions, as predicted. A significant 3-way interaction was found between all three factors on attitudes and performance, and strong 2-way interactions were found between all pairs of factors on performance, also as predicted. The negative attitude and performance effects of delay appear to be reduced by breadth and
familiarity together, but only performance is enhanced by all possible two-way interactions. Additional research is needed to discover other factors interacting with delay, other effects of delay, and under what amounts of delay these effects
occur.
Keywords: electronic commerce, response time, web site design, attitudes, performance
ISRL Categories: AA01, AA02, AA05, AI0105, EI0206, FB0403, HA0902, HC0101
In spite of large investments in business-to-consumer electronic commerce, online stores lag
behind brick-and-mortar stores by a wide margin. Online consumer spending represents only 1.5%
of the $872.5 billion in total retail sales for the most recent quarter (Commerce Department, 2003).
Meanwhile, the majority of users report that they try to use the Web at least to collect information
for purchase decisions, but have problems doing so.
Practitioners have provided ample testimony that one of the most important problems is the
pervasive delay with each attempt to view information. Studies have found that the enhanced usability of a fast site leads to increased usage (Nielsen, 1999a), and that reducing delay from 8 seconds or
more to between 2 and 5 seconds doubled the traffic of some sites (Wonnacott, 2000). There is a
potential strategic disadvantage resulting from slow service; losses from download times of 8 seconds and higher have been estimated to be $4.35 billion annually (Zona, 1999).
The research community has also examined systems delay in many settings over the years,
including, most recently, Web browsing. Building on that work, this study examines delay, along
with factors that have not earlier been studied in the Web arena, or with delay, as they influence
multiple outcomes such as attitudes, behavioral intentions, and performance of users.
Delay: No Silver Bullet
Delay is considered by some to be “the most commonly experienced problem with the
Web” (Pitkow et al., 1998), leading many to call it the “World Wide Wait” (Nah & Kim, 2000). Although modal access speed has increased considerably since early user surveys, problems of delay
continue to plague users (Selvidge, 2003; Nielsen, 1999b). Many are turning to broadband access for
help, although high costs and limited access will continue to leave them in the minority of users both
in the US and worldwide.
One should not be tempted, however, to assume that providing broadband to the ever-increasing minority of people will solve delay problems, which have a variety of causes (Nah & Kim,
2000; Rose et al., 1999; Nielsen, 1997). The ultimate speed of loading a page is only as fast as the
slowest link in the chain from the server itself, to the Internet backbone, to the ISP, and finally, to
the user’s desktop. If they manage to avoid telecommunications congestion at the weakest link in the
system, users could still find themselves in a global waiting line at the server or find themselves as
victims of design or configuration error that in some cases triple page loading time (Koblentz, 2000).
Network traffic growth continues to exceed upgrades in network bandwidth, painting a gloomy picture for the future (Sears & Jacko, 2000). An increase in the number of broadband users will also
require a large number of server upgrades (Connolly, 2000). In the words of Sears and Jacko, “delays
will be a concern for many years to come” (p. 49).
Delay, Familiarity and Structure
Although these factors have been neglected until now, this paper makes the case that familiarity and structure interact with delay in their effects on a user’s Web experiences. Slow loading of a single page or even a
small number of pages might not be very noticeable or memorable. However, repeated delays are
quite common for two reasons:


1. Many sites have “deep” structures that require several clicks to arrive at a target page.
2. Many users are unfamiliar with the hierarchical structure of the site, becoming confused or “lost” and choosing incorrect hyperlinks (Edwards & Hardman, 1989) through trial and error. A user-centered approach will align the links with user goals (Sawasdichair and Poggenpohl, 2002).
Repeated clicks on each site will add up to a sizable time investment in viewing what might
be a blank screen for a great deal of time.i The factors mentioned above lead to the following research question:
How do delay, familiarity, and site depth interact to influence attitudes, performance, and finally, behavioral intentions in online product information searches?
This question should be of interest to researchers who wish to understand how those factors
independently and interactively form a predictive model of Web attitudes and performance. It
should also be of interest to designers who wish to attract and retain visitors to their site.
Because designers seldom have complete control over delay, it is important to understand
ways in which other features of the site might compensate for, or perhaps exacerbate, delay difficulties. Providing fast access to a target Web page is therefore a complex issue that cannot be remedied
with a single technology or approach; for the foreseeable future, it appears that users will continue to
confront issues of delay. Design choices such as increasing the depth of a site or making use of unfamiliar terminology can greatly magnify the delay (Shneiderman, 1998).
Theoretical Foundations
Individuals using the Web navigate within a structure to find information (Bernard, 2002).
There are, however, several ways to view this structure. Korthauer and Koubek (1994) identify three
types of structural variables that affect users’ ability to find information: inherent structure, internal
structure, and navigational structure. Inherent structure refers to the underlying physical structure,
which creates the overall form of a Web site. Internal structure is the user’s knowledge of the domain being accessed. Navigational structure refers to the structure of tangible navigational aids within a site. These three aspects of site structure impact the user’s ability to find information.
Discussions of the ability to find information in a Web site are informed by Pirolli and
Card’s (1995) Information Foraging Theory (Katz & Byrne, 2003; Larson & Czerwinski, 1998). This
theory asserts that individuals searching for information employ strategies to maximize their rate of
finding target information and minimize the costs associated with finding that information (Katz &
Byrne, 2003). Costs pertain to cognitive effort required of users and penalties imposed by following
incorrect paths. A key component of this theory is the concept of information scent, defined as “the
‘imperfect’ perception of the value, cost, or access path of information sources obtained from proximal cues” (Pirolli & Card, 1995, p. 646). When the scent is strong, users can easily find target information. In a Web environment, scent would refer to the amount of information about the location of target information that is provided by the site’s structure.
A site can be made more usable if it conforms to a known genre (Crowston and Williams,
2000), defined as “typified communicative actions characterized by similar substance and form and
taken in response to recurrent situations” (Orlikowski and Yates, 1994, p. 299). If Web sites serving
a similar function have similar structures (such as terminology or other cues about the desired path)
that are implied by conventions within that genre, scent would increase and navigation would be simplified (Crowston and Williams, 2000).
While genre may have some impact, there are still structural choices that can be made, including choices about the number of links and nodes used and the labeling of links, even within the
same genre. These choices will impact scent and the ability of users to find information. The ability
to find information will be mediated by the complexity of the structure being navigated (Bernard,
2002). Various structures impose different costs and provide differing degrees of information scent.
This article presents an experimental study that looks at two aspects of structure (depth and
familiarity) along with delay to determine the extent to which these factors have independent and
interactive effects on user attitudes, performance, and finally, behavioral intentions.
The research model is presented in Figure 1. In the following sections the elements of the
model are described and hypotheses about the relationships among these elements are developed.
Delay
Our understanding of Web page download time is informed by research on computer response time. Response time has been shown consistently to be a major factor in overall usability
both in studies of the World-Wide Web (McKinney et al., 2002; Torkzadeh & Dillon, 2002; Rose et
al., 2001, 1999; Rose & Straub, 2001; Nielsen, 1999b) and large-scale information systems (Shneiderman, 1998; Kuhmann, 1989).
[Figure 1: Research Model. Speed, Depth, and Familiarity are linked to Attitudes and Performance by main effects (H1, H2, H3), 2-way interactions (H4, H5, H6), and a 3-way interaction (H7); Attitudes and Performance are linked to Behavioral Intentions (H8).]
Effects of delay. Prior research has shown an inverse relationship between response time
and user performance (Barber & Lucas, 1983; Butler, 1983) and productivity (Martin & Corl, 1986;
Dannenbring, 1983). Long delays lead to frustration (Doherty & Kelisky, 1979), dissatisfaction (Lee
& MacGregor, 1985), feelings of being lost (Sears et al., 2000), and giving up (Nah, 2002). Negative
impacts are strongest when delays are longer than expected, occur in unpredictable patterns, or are
unaccompanied by duration or status information (Dellaert & Kahn, 1999; Polak, 2002).
Tolerable Level of Delay. Early studies suggested that the ideal response time is 2 seconds
(Miller, 1968), although such short response times might be impractical to use as a general Web design standard (Shneiderman, 1998), given the abundant and varied sources for delay.
The maximum delay varies widely in previous literature, where delays over certain limits led
to a variety of outcomes. At 38 seconds, users will abandon a task (Nah, 2002). At 15 seconds, the
delay becomes disruptive (Shneiderman, 1998). At 12 seconds, users will lose patience (Hoxmeier &
DiCesare, 2000). At 10 seconds, they will lose interest (Ramsay et al., 1998). At 8 seconds they suffer
psychological and performance consequences (Kuhmann, 1989).
A recent study by Galletta et al. (2004) imposed delays of 0, 2, 4, 6, 8, 10, and 12 seconds on
196 Web-browsing subjects to determine effects of lengthening delay times on performance, attitudes, and behavioral intentions. The results revealed significant effects on all three dependent variables as delays increased. Interestingly, by the time delays exceeded 6 seconds, further reductions in
performance, attitudes, and behavioral intentions seemed to lose significance.
It is striking that such relatively short delays in prior research have been shown to produce
significant results in both user attitudes and performance. Users of the World Wide Web often must
deal with much longer wait times (Ramsay et al., 1998). From a cognitive perspective, increased delays increase the cognitive effort that users must expend. The increased time between steps on the
information path may cause users to lose track of the information scent. Delays also directly increase the cost that a user incurs for each choice that is made in finding target information. Therefore, we propose the following hypotheses:
H1a: A fast site will result in more favorable user attitudes about the Web site than will a slow site.
H1b: A fast site will result in higher user performance than will a slow site.
Depth
Many alternative Web site designs can be selected, with varying effects on the number of selections that must be made. Given a fixed number of end nodes in a Web site, a designer can elect
to use a “broad” strategy, increasing the number of hyperlinks on each page while decreasing the
number of clicks and page loads, or a “deep” strategy with fewer links per page but more hierarchical levels. This is a primary aspect of the inherent structure of the website. The tradeoff between
breadth and depth has been widely studied in menu design for information acquisition (for example,
see Norman & Chin, 1988; Parkinson et al., 1988; Landauer & Nachbar, 1985; Kiger, 1984; Miller,
1981) and hypermedia (Edwards & Hardman, 1989), but only recently in Web design (Katz & Byrne,
2003; Chae & Kim, 2003; Zaphiris & Mtei, 1997).
Much of the work concerning the tradeoff between breadth and depth has focused on determining the optimal number of items per level (Larson & Czerwinski, 1998). While there has been
some variation in research findings, most of the studies listed above have determined that breadth is
preferred to depth (Jacko et al., 1995).
The theory of task complexity (Frese, 1987) can be used to explain this preference (Jacko &
Salvendy, 1996): A deeper structure increases the number of decisions that must be made, which in
turn increases complexity, response time and the number of errors. Additionally, the information
scent, the certainty that the chosen path is the correct one, decreases between distant nodes (Bernard, 2002), perhaps because of the stronger association between the descriptor and the target (Larson & Czerwinski, 1998). These explanations are consistent with Shneiderman’s (1998) observation:
“when the depth goes to four or five (levels), there is a good chance of users becoming lost or disoriented” (p. 249). Also, too few links per page will require more page loads, decisions, and further
delay (Zaphiris & Mtei, 1997; Landauer & Nachbar, 1985).
Although increasing the breadth of a site requires fewer clicks by the user to reach the desired page, providing too many links on a page can result in clutter, making it difficult to read or
comprehend (Larson & Czerwinski, 1998). Most laboratory studies consider 8-9 items per menu to
be a “broad” structure and 2-3 items per menu to be a “deep” structure, but even when those numbers are altered, the preference for breadth over depth remains (Katz & Byrne, 2003).
To summarize, an ideal, universal depth or breadth of a Web site cannot be determined, although most studies have found that users prefer, and perform best with, a “broad” structure with 8-9 items per menu in interactive systems. These choices can have direct implications on the complexity of a site and information scent that is provided, both factors that impact the ability to find
target information. Therefore, we propose:
H2a: A broad Web site will result in more favorable user attitudes about the site than will a deep Web site.
H2b: A broad Web site will result in higher user performance than will a deep Web site.
Familiarity
Another aspect of structure is the internal structure, or the underlying user knowledge of the
domain. Familiarity with Web site content has been shown to be an important determinant of shopping effectiveness, but familiarity is difficult to control because of widespread heterogeneity of users
(Chau et al., 2000). We therefore define familiarity as the ability of users to discern, from the terminology presented on the site, their next move to reach their goal.
Because users choose their paths based on the likelihood of finding target information, familiarity with the terms used to define those paths increases the information scent provided by the
paths. Pages higher on the hierarchy often depict broad identifiers for which the user must make
inclusion judgments in order to reach the appropriate page at lower levels (Paap & Roske-Hofstrand,
1988; Katz & Byrne, 2003; Pardue & Landry, 2001). If users do not understand the category names
and partitions used by designers, they will take longer (Somberg & Picardi, 1983) and have more difficulty reaching their goal (Dumais & Landauer, 1983). The goal is similar to that in “wayfinding,”
(Santosa, 2003; Jul & Furnas, 1997) but we make use of semantic, rather than graphical, cues.
The disorientation that occurs when users become lost is so widespread and serious that
Edwards and Hardman (1989) call it “hypertext’s major limiting factor” (p. 62). A cognitive explanation of this is that users spend some of their limited processing capabilities on navigation that could
otherwise be spent on attending to comprehension of the content (Thuring et al., 1995). This cognitive overhead should obviously be minimized or the site will become unusable.
Unfamiliarity requires that users locate information through trial and error (Schwartz &
Norman, 1986), increasing the costs that users experience and the likelihood of choosing incorrect
paths. When dealing with unfamiliar product categories it is possible for users to spend a great deal
of time lost in the hierarchy of the system (Norman & Chin, 1988; Robertson et al., 1981). Paap and
Roske-Hofstrand (1988) suggested that user familiarity with category names should influence the
design of hierarchical structures.
Quite often, firms organize Web sites according to their own internally-generated product
line categories. For example, one very large consumer electronics firm once included three paths to
its camcorders. The links were “Camcorders,” “Handycams,” and “Professional Camcorders.” All
were camcorders, but in different price, size, and/or feature groupings. A customer might be looking for a particular model or format, but lack of familiarity with the groupings would make it necessary to perform a “brute force” search through the site.
The use of meaningful categories is one of the determinants of effective e-commerce Web
sites (Turban & Gehrke, 2000) and of Web site ease of use (Lederer et al., 2000). These structural
elements directly impact information acquisition costs, cognitive effort required, and information
scent. We propose the following hypotheses regarding familiarity:
H3a: A familiar Web site will result in more favorable user attitudes about the site than will an unfamiliar
Web site.
H3b: A familiar Web site will result in higher user performance than will an unfamiliar Web site.
Interactions
The model asserts that delay, depth, and familiarity have not only direct but also interactive
effects on attitudes, performance, and finally, behavioral intentions. Although doing so is somewhat unusual, we
also hypothesize all possible interactions because of expected synergistic relationships among the
three conditions, as described below. The hypothesized interactions are based on the sharply rising
amount of foraging required under unfavorable combinations of conditions.
Depth and Familiarity
While increasing depth will decrease the rate of information acquired (Pirolli & Card, 1999),
increasing the unfamiliarity of the terminology on the site will introduce trial-and-error foraging at
the same time, decreasing dramatically the foraging rate as described below.
A familiar site with L levels to traverse (from the main index page to a bottom-level page)
would require an expected average of C clicks to reach the bottom node as follows:
$$C_{fam} = L$$
By definition, a perfectly familiar site causes no user errors in navigation; therefore the number of clicks to reach a bottom-level page is equal to the number of levels the user must traverse,
with no backtracking. For example, if there are 5 levels from top to bottom, a user who starts at level 1 needs to traverse 4 levels to reach level 5, the goal.
A perfectly unfamiliar site, however, without learning, would require a “brute force” systematic (next neighbor) search with backtracking. Such a search would require many more clicks to
reach the correct bottom node. The expected number of clicks to reach the desired node is a function of the number of links per page and the number of levels:
L 1
C unfam  ( n i )  1
i 1
where: n = the number of links per page (assumed to be constant
throughout)
L = the number of levels to traverse
Counting from the level under the home page, at each level i, there are $n^i$ intermediate pages.
In a “brute force” search, the user would traverse each of these pages, hoping to find a direct link to
the bottom-level node sought. Each unsuccessful search would normally stop at level L-1, that is,
one level above the bottom, in the absence of the desired link. In other words, if a user is looking
for a certain model camcorder, and arrives at a page including only links to the pages of other models, he or she would not need to load each page containing the incorrect model, and would move on.
Ultimately, when a link to the correct bottom page is found, he or she would need to click one last
time to get there, hence the +1 at the end of the formula.
Traversing through the structure requires backtracking after each “dead end,” suggesting two
clicks per intermediate page (one down and one back up). However, on average, a randomly-placed
goal would be found halfway through the search. Therefore, the effect of backtracking is cancelled
by finding the goal halfway through the task (multiplying by 2 clicks per page, and then dividing by
2), so the formula provides the average number of clicks to arrive at the goal.
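The two click-count expressions can be checked numerically. The following is a minimal Python sketch, not the authors' own code: the function names `clicks_familiar` and `clicks_unfamiliar` are illustrative assumptions, and the unfamiliar case assumes the brute-force search with a constant number of links per page described above.

```python
def clicks_familiar(levels: int) -> int:
    """Expected clicks in a perfectly familiar site: one click per level traversed."""
    return levels


def clicks_unfamiliar(links_per_page: int, levels: int) -> int:
    """Expected clicks in a perfectly unfamiliar site under brute-force search.

    Sums the n**i intermediate pages at levels 1..L-1, then adds the final
    click that loads the goal page.
    """
    intermediate_pages = sum(links_per_page ** i for i in range(1, levels))
    return intermediate_pages + 1


if __name__ == "__main__":
    # (label, links per page n, levels to traverse L)
    for label, n, levels in [("deep", 3, 4), ("broad", 9, 2)]:
        print(label, clicks_familiar(levels), clicks_unfamiliar(n, levels))
```

With n = 3 and L = 4 the sketch yields 4 and 40 clicks for the familiar and unfamiliar cases; with n = 9 and L = 2 it yields 2 and 10.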
If a site has 81 bottom-level pages, two options might be to arrange them into 4 levels each
with 3 links ($3^4 = 81$) or into 2 levels each with 9 links ($9^2 = 81$). The differences between the expected
number of clicks in a familiar and unfamiliar site using each option are striking (see Table 1), indicating theoretical support for interaction provided by the formula.
Table 1: Average Number of Clicks Predicted for Navigating to a Given Node

Site structure                                      Familiar Site    Unfamiliar Site
Deep (4 levels to traverse, with 3 links each)      4                40
Broad (2 levels to traverse, with 9 links each)     2                10
The pattern in Table 1 supports the following hypotheses:
H4a: There will be an interaction between familiarity and depth on user attitudes.
H4b: There will be an interaction between familiarity and depth on user performance.
Depth and Delay
According to Shneiderman (1998), deep systems with slow response time are “annoying to
the user” because they result in “long and multiple delays” (p. 255). The formula and Table 2 support that assertion. The total expected delay time in an unfamiliar site is zero with instantaneous response time, and rises dramatically when an 8-second delay is imposed. Averaging across the familiarity treatment, the expected amount of delay jumps from 0 to 48 seconds [(16 + 80)/2] when delay
is imposed in the sample broad site, and from 0 to 176 seconds [(32 + 320)/2] in the sample deep
site. In short, Hypotheses 5a and 5b state that a “broad” design can partially make up for delay.
Table 2: Delay Predicted for Navigating to a Given Node, 0- or 8-Second Delay
(each cell shows: 0-second delay / 8-second delay)

Site structure                                      Familiar Site             Unfamiliar Site
Deep (4 levels to traverse, with 3 links each)      0 seconds / 32 seconds    0 seconds / 320 seconds
Broad (2 levels to traverse, with 9 links each)     0 seconds / 16 seconds    0 seconds / 80 seconds
H5a: There will be an interaction between delay and depth on user attitudes.
H5b: There will be an interaction between delay and depth on user performance.
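The delay predictions follow from multiplying the predicted number of clicks by the delay per page. The sketch below is an illustration under that assumption only; the names `STRUCTURES`, `expected_delay`, and `delay_table` are hypothetical, and the click formulas are the familiar and brute-force cases defined earlier.

```python
def clicks_familiar(levels: int) -> int:
    # Assumption from the text: one click per level, no backtracking.
    return levels


def clicks_unfamiliar(links_per_page: int, levels: int) -> int:
    # Brute-force search: n**i pages at each intermediate level, plus the final click.
    return sum(links_per_page ** i for i in range(1, levels)) + 1


def expected_delay(clicks: int, delay_per_page: float) -> float:
    # Total waiting time (seconds) if every click incurs the same page delay.
    return clicks * delay_per_page


# (links per page, levels to traverse) for the two sample structures
STRUCTURES = {"deep": (3, 4), "broad": (9, 2)}


def delay_table(delay_per_page: float) -> dict:
    """Expected total delay per cell for a given per-page delay."""
    table = {}
    for name, (links, levels) in STRUCTURES.items():
        table[name] = {
            "familiar": expected_delay(clicks_familiar(levels), delay_per_page),
            "unfamiliar": expected_delay(clicks_unfamiliar(links, levels), delay_per_page),
        }
    return table


if __name__ == "__main__":
    for name, row in delay_table(8).items():
        mean_across_familiarity = (row["familiar"] + row["unfamiliar"]) / 2
        print(name, row, "mean across familiarity:", mean_across_familiarity)
```

Averaged across familiarity, an 8-second delay gives 48 seconds for the broad structure and 176 seconds for the deep structure, while a zero delay gives zero in every cell.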
Familiarity and Delay
Likewise, averaging across the number of levels in Table 2, the worst case predicts an expected delay that jumps from 0 to 24 seconds [(32 + 16)/2] for a perfectly familiar site and from 0 to 200 seconds [(320 + 80)/2]
for a perfectly unfamiliar site thanks to the “brute force” strategy that might be needed given the
large differences in scent for the foraging task. Indeed, Galletta et al. (2004) found that attitudes
“bottomed out” sooner for unfamiliar sites than familiar sites.
H6a: There will be an interaction between familiarity and delay on user attitudes.
H6b: There will be an interaction between familiarity and delay on user performance.
Delay, Depth, and Familiarity
The most striking conclusion of Table 2 lies in the likely three-way interaction that comes
without abstracting across depth or familiarity dimensions. As the delay with each click increases, the
differences among the cells increase dramatically. If pages loaded instantaneously, the expected delay
(for all clicks to the goal) would be zero in all cells. However, if a site incurred an 8-second delay, a
brute-force search in the deep site would impose an expected delay of 320 seconds (5.3 minutes),
while a brute-force search in the broad site would impose a delay of only 80 seconds (1.3 minutes).
In a familiar site, the user would only need to click once at each level, with a delay of 32 seconds and
16 seconds in the deep and broad site, respectively. While previous researchers stopped short of examining these dimensions together, the mathematical and logical justification above suggests that:
H7a: There will be an interaction between delay, familiarity, and depth on user attitudes.
H7b: There will be an interaction between delay, familiarity, and depth on user performance.
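One way to see why the predictions imply a three-way interaction is to compute the familiarity-by-depth contrast (a difference of differences) at each delay level: if that contrast changes with delay, the two-way pattern itself depends on delay. The sketch below is illustrative only; it uses the predicted click counts from Table 1, and the function names are assumptions.

```python
# Predicted clicks per cell, taken from Table 1.
PREDICTED_CLICKS = {
    ("deep", "familiar"): 4,
    ("deep", "unfamiliar"): 40,
    ("broad", "familiar"): 2,
    ("broad", "unfamiliar"): 10,
}


def total_delay(depth: str, familiarity: str, delay_per_page: float) -> float:
    # Expected waiting time (seconds): predicted clicks times delay per click.
    return PREDICTED_CLICKS[(depth, familiarity)] * delay_per_page


def familiarity_gap(depth: str, delay_per_page: float) -> float:
    # Extra waiting time caused by unfamiliarity within one structure.
    return (total_delay(depth, "unfamiliar", delay_per_page)
            - total_delay(depth, "familiar", delay_per_page))


def depth_by_familiarity_contrast(delay_per_page: float) -> float:
    # Difference of differences: how much larger the familiarity gap is
    # in the deep site than in the broad site.
    return familiarity_gap("deep", delay_per_page) - familiarity_gap("broad", delay_per_page)


if __name__ == "__main__":
    for delay in (0, 8):
        print(delay, depth_by_familiarity_contrast(delay))
```

Under these predictions the contrast is 0 seconds with no delay and 224 seconds with an 8-second delay [(320 - 32) - (80 - 16)], so the size of the familiarity-by-depth interaction grows in proportion to the delay.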
Is The Worst Case Realistic?
To address the worst case, we examined logs of a subset of our subjects to see how closely
they conformed to the idealistic numbers suggested by the formula and Tables above. Based on the
first 128 participants (64 per cell) for whom working logs were available, Table 3 presents the actual
number of clicks for each condition. As might be expected, user performance did not exactly match
the predicted numbers, but was quite close. In general, participants in the familiar site needed more clicks than predicted, and participants in the unfamiliar site performed at better than “worst
case” levels. The subjects in the familiar site might have missed a few of the cues designed to be
familiar, and participants in the unfamiliar site might have learned some of the paths along the way.
Table 3: Average Actual versus Predicted Clicks for Navigation to a Given Node

Site structure                                      Familiar Site         Unfamiliar Site
Deep (4 levels to traverse, with 3 links each)      6.5 (predicted: 4)    26.5 (predicted: 40)
Broad (2 levels to traverse, with 9 links each)     3.8 (predicted: 2)    8.2 (predicted: 10)
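The gap between actual and predicted clicks can be summarized as a simple ratio per cell. The following sketch uses only the values reported in Table 3; the dictionary layout and the function name `actual_to_predicted` are illustrative assumptions rather than part of the original analysis.

```python
# (actual clicks, predicted clicks) per cell, as reported in Table 3.
TABLE_3 = {
    ("deep", "familiar"): (6.5, 4),
    ("deep", "unfamiliar"): (26.5, 40),
    ("broad", "familiar"): (3.8, 2),
    ("broad", "unfamiliar"): (8.2, 10),
}


def actual_to_predicted(actual: float, predicted: float) -> float:
    # Ratio above 1 means more clicks than predicted; below 1 means fewer.
    return actual / predicted


if __name__ == "__main__":
    for (depth, familiarity), (actual, predicted) in TABLE_3.items():
        ratio = actual_to_predicted(actual, predicted)
        direction = "more" if ratio > 1 else "fewer"
        print(f"{depth:5s} {familiarity:10s} ratio={ratio:.2f} ({direction} clicks than predicted)")
```

Both familiar cells have ratios above 1 and both unfamiliar cells have ratios below 1, which matches the direction of the deviations described in the text.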
Attitudes and Behavioral Intentions
One of the key Web design goals is to get potential users to visit and to return later. Recent
research suggests that the use of behavioral intentions is appropriate in a Web context (Song &
Zahedi, 2001), where a user’s behavioral intention (to return) serves as a useful summary variable
indicating design success. Users form lasting attitudes about sites based on very brief encounters
(Chiravuri & Peracchio, 2003). Fishbein and Ajzen (1975) theorize attitudes to be a central antecedent of behavioral intentions. The relationship between attitudes and behavior has seen significant
attention (Ajzen, 2001). Technology acceptance research (Davis, 1989; Venkatesh & Davis, 2000;
Taylor & Todd, 1995) follows from this stream. Based on this work, we would expect a direct and
positive relationship between attitudes and behavioral intentions:
H8a: More favorable user attitudes will lead to more favorable behavioral intentions about the site.
Performance and Behavioral Intentions
The HCI literature has focused substantial attention on performance as the “ultimate concern” (p. 404) and central dependent variable for users of technology, whether evaluating designs of
hardware or software (Card et al., 1983). Although their focus has been on choosing design factors
based on their contributions to performance, HCI researchers have used quantitative performance
models to make multi-million dollar adoption decisions (e.g., Gray et al., 1993).
Over the last two decades, hundreds of HCI studies have provided quantitative analysis of
user-system performance for adopting the best technological approach in a variety of situations. Design of both software and hardware almost always involves such analysis, whether the designer is
considering how to enter formulas into a spreadsheet, how an external mouse compares to a laptop
pointing device, or how much time is saved by using a pupil tracking system.
In the MIS literature, Compeau and colleagues have posited a link between performance and
behavior for several years, based on Social Cognitive Theory (SCT) (see Bandura, 1977 and Compeau & Higgins, 1995, for a review of SCT). In a recent study by Compeau et al. (1999), the two
strongest predictors of usage behavior were performance and attitudes (in both cases r = .25 and p
< .001). While behavior is not the same as behavioral intentions, we suspect that a link exists between performance and behavioral intentions, and we present the final hypothesis:
H8b: Higher user performance will lead to more favorable behavioral intentions about the site.
Dependent Variables: Summary
To summarize, attitudes and performance are theorized to be important antecedents of behavioral intentions, when a user has other alternatives to using the system or browsing on a particular site. In this study, we focus on delay, site depth, and familiarity as important Web site design factors, serving as direct and interacting antecedents of user attitudes and performance, and ultimately
(for our purposes), behavioral intentions.
Research Methodology
The study was conducted in an experimental setting to control delay, site depth, and familiarity, as well as to allow measurement of outcome variables. Two levels of each of three factors provided a 2x2x2 design with two between-subjects factors (delay and depth) and one within-subjects
factor (familiarity). We employed a completely counterbalanced, fully factorial design, providing all
32 combinations of order, delay, depth, and familiarity.ii
Operationalization of Variables
Delay
Delay, the central factor in this study, was manipulated with JavaScript code to provide a
constant eight-second delay per page for the slow site, with no such delay for the fast site. An eight-second delay is about the middle of the range of what others have considered serious (Hoxmeier &
DiCesare, 2000; Ramsay et al., 1998; Zona, 1999; Shneiderman, 1998; Kuhmann, 1989), and has been
shown to be well past the patience levels of laboratory subjects (Galletta et al., 2004). The eight-second delay also fits the simplicity of the pages (Sears & Jacko, 2000).
A manipulation check for the delay variable asked subjects to evaluate the speed of displaying the Web pages. On a 7-point Likert scale, “fast” subjects, on average, rated the speed as 6.0
(standard deviation 1.2) and “slow” subjects rated the speed as 2.2 (standard deviation 1.3). The difference was significant (t=312.3; one-tailed p<.001).
Familiarity
Two artificial Web sites were created. The “familiar” site contained images, prices, and descriptions of products found in grocery and/or hardware stores almost anywhere, to ensure that
subjects would recognize these products and categories of products (for example, a top level category was “Health Care Products”). In contrast, the “unfamiliar” site contained fictitious software
products and accessories and their categories were meaningless (for example, a top level category
was “Novo Products”). Subjects could not make use of previous experience in searching through
the unfamiliar site.
A manipulation check for the familiarity variable asked subjects to evaluate their level of familiarity with the subject matter in the Web site. Subjects on average rated the familiar site as 5.4
(standard deviation 1.4) and the unfamiliar site as 2.2 (standard deviation 1.4) on a 7-point Likert scale.
The difference was significant (t=21.3; one-tailed p<.001).
Depth
The experimental Web sites were created with two different hierarchical depths for the 81
bottom level pages. A “broad” structure was created with 2 levels to traverse (9 hyperlinks per page),
and a “deep” structure contained 4 levels to traverse (3 hyperlinks per page). These levels were chosen after considering previous studies (Norman & Chin, 1988; Landauer & Nachbar, 1985; Kiger,
1984; Miller, 1981), which used very similar schemes.
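The arithmetic behind the two structures can be checked with a short sketch. It assumes a strictly uniform hierarchy (every page offers the same number of links) and, for the slow condition, one eight-second delay per page load on a direct path to a bottom-level page. The function and variable names are illustrative only and are not taken from the experimental software.

```python
# Illustrative check of the two site structures described above.
# ASSUMPTION: a uniform hierarchy in which every page offers the same number of links.

DELAY_SECONDS = 8  # per-page delay in the slow condition (from the text)


def bottom_level_pages(links_per_page: int, levels: int) -> int:
    """Number of bottom-level pages reachable in a uniform hierarchy."""
    return links_per_page ** levels


def minimum_delay(levels: int, delay: int = DELAY_SECONDS) -> int:
    """Total imposed delay on a direct path, assuming one delayed page load per level."""
    return levels * delay


# (links per page, levels to traverse), as described in the text
structures = {"broad": (9, 2), "deep": (3, 4)}

for name, (links, levels) in structures.items():
    print(name, bottom_level_pages(links, levels), "bottom-level pages,",
          levels, "levels,", minimum_delay(levels), "s minimum delay when slow")
```

Both structures lead to the same 81 bottom-level pages; under the stated assumption, the deep structure doubles the number of delayed page loads on a direct path.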
The Web sites utilized a linear design where lower child pages were accessible only through
their parent page. We did not provide the ability to skip through levels by shortcuts to bottom nodes
or other non-linear methods of navigation. The only exception was a link to the site’s Home page
located on every page to provide a lifeline to a well-known starting point.
A manipulation check for the depth variable asked subjects to evaluate the number of further choices or links on each page visited in the Web site. On a 7-point Likert scale, subjects on average rated the broad site 4.1 (standard deviation 1.9) and the deep site 3.9 (standard deviation 1.6).
Although in the expected direction, the difference was not significant (t=1.0; one-tailed p<.16), suggesting that either the condition or the manipulation check measure was inadequate.
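As a rough plausibility check, a t statistic can be recomputed from the reported means and standard deviations. The sketch below uses the unequal-variance form for independent samples and assumes 160 ratings per depth condition (80 subjects rating two sites each). Neither the group size nor the exact test variant is stated in the text, so both are assumptions, and ratings from the same subject are not truly independent.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic (unequal-variance form) from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Reported depth manipulation check: broad 4.1 (SD 1.9), deep 3.9 (SD 1.6).
# ASSUMPTION: 160 ratings per condition; this count is not stated in the text.
N_PER_CONDITION = 160

t_depth = t_from_summary(4.1, 1.9, N_PER_CONDITION, 3.9, 1.6, N_PER_CONDITION)
print(round(t_depth, 2))
```

Under these assumptions the statistic comes out at roughly 1.0, in line with the reported value.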
Further testing on the manipulation check for depth revealed that users had trouble judging
“absolute” depth or breadth in this setting. Because the effect of depth has greatest implications in
the unfamiliar site, effectively multiplying the number of required clicks by at least 5, we omitted the
scores of subjects who went from the unfamiliar to the familiar sites, assuming that the extra two
clicks in the deep familiar site would be too trivial to comprise a meaningful difference in perceived
depth. For the subjects who browsed the unfamiliar site as the second site, the mean score was 4.6
for the broad site and 3.9 for the deep site, differing significantly on the depth variable (t=1.7; one-tailed p<.046). Thus, there is some evidence that the subjects could perceive and judge the
depth/breadth manipulation, though not as strongly as the other factors.
Dependent Variables
Table 4 provides a summary of the items and their scales.
Table 4: Instrument Reliabilities
Dependent Measure        Number of Items   Type of Items   Reliability
Attitudes                7                 9-point scale   alpha = .95
Performance              9                 Dichotomous     KR-20 = .90
Behavioral Intentions    2                 7-point scale   alpha = .94
Attitudes
Attitudes about the sites were measured by averaging the responses to a set of seven 9-point
Likert-type questions adapted from Part 3 of the long form of the QUIS (Questionnaire for User
Interaction Satisfaction) (Shneiderman, 1998, p. 136), which has demonstrated reliability and validity
(Chin et al., 1988). Cronbach’s alpha was .95.
Performance
Performance was operationalized using an average score for nineiii search tasks that were developed for this study. The nine tasks required subjects to visit each third of the site’s hierarchy the
same number of times, providing a reasonably balanced overall view of the site. The search
task questions for each site were identical except for the search object names (general store products versus software products). Subjects needed to visit exactly the same position of
the structure in each site, and the only differences between the familiar and unfamiliar sites were the
products and product categories themselves.
The nine tasks required browsing the site for facts (such as prices or packaging) about various products, and subjects received 1 point for each correct answer. Short, dichotomous instruments are not
normally subjected to reliability analysis, so the results of such analysis should not be viewed with
the same expectations as Likert-type instruments or dichotomous instruments with over 20 items
(Nunnally, 1978). Nevertheless, the statistic of the Kuder-Richardson-20, or KR-20 test (analogous
to alpha, but for dichotomous items), was quite high in this study (.90).
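For readers unfamiliar with the statistic, KR-20 is computed as k/(k-1) times (1 minus the sum of p*q over items, divided by the variance of total scores), where k is the number of items and p and q are the proportions answering each item correctly and incorrectly. The sketch below is a minimal illustration with a hypothetical toy data set, not the study's data. It uses the population variance of total scores, which is one common convention and an assumption here.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a list of subjects' 0/1 item scores."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    # Proportion answering each item correctly (p); q is 1 - p.
    p = [sum(row[i] for row in responses) / n_subjects for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Total score per subject and its population variance (ASSUMED convention).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_subjects
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_subjects
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data (4 subjects x 3 items); NOT data from the study.
toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy_scores))
```

The function is undefined when all subjects obtain the same total score, because the variance in the denominator is then zero.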
Behavioral Intentions
Behavioral intentions were measured using the average of two original questions that focus
on two related future behaviors: how readily the subject would visit the site again and how likely he
or she would be to recommend that others visit the site (7-point scales). The alpha score for this very
short instrument was also extremely high, at .94, as in a previous study (Galletta et al., 2004).
Subjects
Undergraduate business majors enrolled in sections of an upper-division undergraduate MIS
course at a large Northeastern U.S. university were invited to participate in the study. Of 191 students invited, 177 (93%) volunteered to participate. The first 160 served to fill 5 completely counterbalanced groups of 32 each (77 females, 83 males). The last 17 subjects were discarded. Subjects
were given the opportunity to participate during one of their class periods, and instructors offered to
provide extra credit points to stimulate participation. The within- and between-subjects design provided an equal number of subjects (40) for each of the 8 cells.
A pre-test employing a separate group of 32 subjects from the same sampling pool (1 for
each of the 32 combinations of conditions) in a previous term helped us refine the instruments and
procedures. Those data are not included in our results.
Subjects were randomly assigned to one of the 32 counterbalanced conditions. In general,
ANOVA revealed no differences among the 8 cells on gender, computer experience (measured by
the instrument from Compeau & Higgins, 1995), and computer efficacy (measured by two scales
adapted from Nelson, 1991; following the procedure of Hartzel, 1999). Of the 21 2x2x2 tests (3
main effects and 4 interactions for each of the three demographic variables), only one difference was
significant: computer efficacy differed along a 2-way interaction of delay and depth (F=4.7; p=.032;
1,149 df). One test out of 21 is expected to be significant at the .05 level purely by chance, and
a Bonferroni adjustment renders it nonsignificant.
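The reasoning about chance findings can be made concrete with a few lines of arithmetic. This sketch assumes the simplest form of the Bonferroni adjustment (dividing the .05 level by the number of tests); the variable names are illustrative.

```python
ALPHA = 0.05
N_TESTS = 21        # 7 effects x 3 demographic variables (from the text)
P_OBSERVED = 0.032  # delay x depth interaction on computer efficacy (from the text)

# Number of tests expected to reach p < .05 by chance alone.
expected_chance_hits = N_TESTS * ALPHA

# Per-test significance threshold after a simple Bonferroni adjustment.
bonferroni_threshold = ALPHA / N_TESTS

still_significant = P_OBSERVED < bonferroni_threshold
print(round(expected_chance_hits, 2), round(bonferroni_threshold, 4), still_significant)
```

With 21 tests, about one chance finding is expected, and the adjusted threshold is far below the observed p value.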
Procedure
The Web sites were created and their 32 combinations written on CDs to precisely control
the browser’s response time.iv A computer laboratory containing 46 identical Pentium II computers
and 17” XGA screens further controlled the subjects’ environments. Upon their arrival, subjects
were asked to sign an “informed consent” form, and were told that participation was voluntary and
they could leave the study at any time. They were then instructed to enter the randomly-assigned
code number (1-32) printed on their packet to trigger the system to load the correct treatment condition. Logs provided by background software and manipulation check questions described earlier
provided assurance that the correct number was entered.
A very small sample site provided a “warm-up” and the subjects were prompted to complete
the required 9 tasks for each of the two main sites. At the conclusion of each main site, the subjects
were instructed to close the browser window and complete the questions addressing attitudes and
behavioral intentions. On average, subjects completed the experiment in about an hour.
While page simplicity could endanger task realism, our site is typical of most web sites that
categorize product pages into intermediate categories that are linked together from a top page, and
the task typifies common questions for which users might seek information. Many sites require that
users seek the best scent possible to navigate to pages lower in the hierarchy toward the foraging
goal. Any failures in realism would be likely to mute our findings.
Results
Because the correlation between performance and attitudes was .50 (p<.001), our analysis of
each hypothesis from H1 through H7 began with a multivariate ANOVA followed by a univariate
ANOVA for each dependent variable. Because each MANOVA test was significant, univariate tests
were run. SPSS version 10 was used for all of the analyses in this study.
Tables 5 and 6 provide means for each dependent variable and Table 7 provides the multivariate tests of H1 through H7. The tests reveal that every main and every possible interaction effect
is significant (p<.002). Therefore, univariate testing was performed for each hypothesis.
Individual ANOVA tests were run for both variables, and the amount of variance explained
by each analysis (including interaction effects) was .557 for attitudes and .498 for performance (adjusted R2).
Table 5: Cell Means for Attitudes
                         Slow                         Fast
Treatment                Mean   Standard Deviation    Mean   Standard Deviation
Deep    Unfamiliar       1.8    1.0                   2.8    1.5
Deep    Familiar         4.8    1.9                   6.4    1.6
Broad   Unfamiliar       2.5    1.4                   4.0    2.1
Broad   Familiar         6.2    1.7                   6.6    1.2
Table 6: Cell Means for Performance
                         Slow                         Fast
Treatment                Mean   Standard Deviation    Mean   Standard Deviation
Deep    Unfamiliar       .42    .29                   .80    .30
Deep    Familiar         .95    .09                   .98    .05
Broad   Unfamiliar       .81    .23                   .93    .17
Broad   Familiar         1.0    0                     1.0    .02
Table 7: Overall MANOVA: Both Dependent Variables (for H1-H7)
Factor                           F       Sig
Delay                            37.3    .000
Familiarity                      216.0   .000
Depth                            35.8    .000
Delay * Familiarity              17.2    .000
Delay * Depth                    6.4     .002
Depth * Familiarity              15.6    .000
Delay * Familiarity * Depth      7.7     .001
Means for the dependent variables, and results of hypothesis tests for the main effects are
presented in Tables 8-10. All 6 tests are significant (in the predicted direction) at the .001 level,
providing strong support for the main effects predicted by the model. It is notable that the
depth/breadth treatment shows strong results; it is apparent that the weakness in the depth/breadth
treatment was in the manipulation check, and not the actual treatment itself.
Table 8: The Delay Main Effect
                     Slow                         Fast
Dependent Measure    Mean   Standard Deviation    Mean   Standard Deviation    Difference (F, p) at 1,312 df
Attitudes            3.8    2.3                   5.0    2.3                   F=42.1; p<.001
Performance          .80    .30                   .92    .19                   F=40.8; p<.001
Table 9: The Depth Main Effect
                     Deep                         Broad
Dependent Measure    Mean   Standard Deviation    Mean   Standard Deviation    Difference (F, p) at 1,312 df
Attitudes            3.9    2.4                   4.8    2.3                   F=27.3; p<.001
Performance          .79    .31                   .93    .16                   F=52.0; p<.001
Table 10: The Familiarity Main Effect
                     Unfamiliar                   Familiar
Dependent Measure    Mean   Standard Deviation    Mean   Standard Deviation    Difference (F, p) at 1,312 df
Attitudes            2.8    1.7                   6.0    1.8                   F=331.7; p<.001
Performance          .74    .31                   .98    .06                   F=144.0; p<.001
Main Effects
Attitudes and performance were significantly more positive in the faster site, broader site,
and familiar site, providing clear support for Hypotheses H1a, H1b, H2a, H2b, H3a, and H3b.
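Because each of the 8 cells contains the same number of subjects, the main-effect means can be recovered as simple averages of the cell means. The sketch below does this for performance, using the cell means as read from Table 6. The dictionary layout and function name are illustrative, and small differences from the published main-effect means are expected because the cell means are rounded.

```python
from statistics import mean

# Performance cell means as read from Table 6.
# Key order: (depth, delay, familiarity).
cells = {
    ("deep", "slow", "unfamiliar"): 0.42, ("deep", "slow", "familiar"): 0.95,
    ("deep", "fast", "unfamiliar"): 0.80, ("deep", "fast", "familiar"): 0.98,
    ("broad", "slow", "unfamiliar"): 0.81, ("broad", "slow", "familiar"): 1.00,
    ("broad", "fast", "unfamiliar"): 0.93, ("broad", "fast", "familiar"): 1.00,
}


def marginal_mean(factor_index, level):
    """Unweighted average of all cells at one level of one factor (equal cell sizes)."""
    return mean(v for key, v in cells.items() if key[factor_index] == level)


for index, levels in enumerate([("deep", "broad"), ("slow", "fast"),
                                ("unfamiliar", "familiar")]):
    for level in levels:
        print(level, round(marginal_mean(index, level), 3))
```

The resulting averages agree, to within rounding, with the performance means reported for the delay, depth, and familiarity main effects.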
Interaction Effects
All main (reported above) and interaction effects were assessed using the same model, resulting in analysis similar to the multivariate analysis in Table 7.
Depth and Familiarity
Hypotheses H4a and H4b predict that the positive effects of breadth are more important for
an unfamiliar site than for a familiar site (and conversely, the negative effects of depth are more severe for an unfamiliar site than a familiar site). Although the interaction effect was not significant for
attitudes (H4a), the performance effect seems to hold very clearly, and precisely in the expected direction (H4b; F=31.4; p<.001).
Depth and Delay
Hypotheses H5a and H5b predict that the positive effects of breadth (rather than depth) are
enhanced more for a slower site than for a faster site. As in the analysis of the previous interaction,
such an interaction was found for performance (H5b; F=12.57; p<.001) but not attitudes (H5a). The
means are in the expected direction, demonstrating that breadth indeed reduces the performance
disadvantage of a slow site.
Familiarity and Delay
Hypotheses H6a and H6b predict that the positive effects of familiarity are enhanced more
for a slower site than for a faster site. Once again, the interaction effect on performance was significant (H6b; F=34.5; p<.001) but not on attitudes (H6a). The results demonstrate a strong interaction,
as predicted, indicating that familiarity reduces the performance disadvantage of a slow site.
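The direction of this interaction can be illustrated from the performance cell means in Table 6 by comparing the cost of delay at each level of familiarity. The sketch below averages over depth and computes a simple difference of differences. It is a descriptive illustration only, not the ANOVA reported above, and the names are illustrative.

```python
# Performance cell means as read from Table 6.
# Key: (delay, familiarity) -> cell means for the [deep, broad] structures.
performance = {
    ("slow", "unfamiliar"): [0.42, 0.81],
    ("fast", "unfamiliar"): [0.80, 0.93],
    ("slow", "familiar"): [0.95, 1.00],
    ("fast", "familiar"): [0.98, 1.00],
}


def average(values):
    return sum(values) / len(values)


def delay_penalty(familiarity):
    """Fast-minus-slow difference in mean performance at one familiarity level."""
    return (average(performance[("fast", familiarity)])
            - average(performance[("slow", familiarity)]))


penalty_unfamiliar = delay_penalty("unfamiliar")
penalty_familiar = delay_penalty("familiar")

# Difference of differences: how much more delay hurts on the unfamiliar site.
interaction_contrast = penalty_unfamiliar - penalty_familiar
print(round(penalty_unfamiliar, 3), round(penalty_familiar, 3),
      round(interaction_contrast, 3))
```

On these cell means, delay costs about a quarter of the tasks on the unfamiliar site but almost nothing on the familiar site.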
Delay, Depth, and Familiarity
Hypotheses H7a and H7b predict that the combined positive effects of familiarity and
breadth are enhanced more for a slower site than for a faster site. The results support both H7a and
H7b, demonstrating a three-way interaction for both attitudes (F=5.9; p<.016) and task performance (F=7.98; p<.005). The cell means reveal that familiarity and breadth interact to reduce the
performance and attitudinal disadvantage of a slow site.
Antecedents of Behavioral Intentions
H8a and H8b state that behavioral intentions can be predicted by attitudes and performance,
respectively. Intercorrelations among the constructs indicated general support for both hypotheses
(p<.001), with a .774 correlation between attitudes and behavioral intentions and a .392 correlation
between performance and behavioral intentions.
To determine the relative contribution of each variable in predicting behavioral intentions, a
multiple regression approach was also used. The regression model included intentions as a dependent variable and attitudes and performance as independent variables, and was found to explain significant variance in behavioral intentions (adjusted R2=.597). However, only the attitude measure
contributed significantly to the model (t=18.8; p<.001). In a stepwise regression, only attitudes entered the model (adjusted R2=.598).
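With only two predictors, standardized regression weights can be derived directly from the three pairwise correlations, which helps show why performance adds so little once attitudes are in the model. The sketch below uses the correlations reported in the paper (the attitudes-performance correlation of .495 is the value given in the Discussion). It assumes 320 observations (160 subjects, two sites each), which is not stated alongside the regression, so the adjusted values are approximate.

```python
# Correlations reported in the paper.
r_att_int = 0.774   # attitudes vs. behavioral intentions
r_perf_int = 0.392  # performance vs. behavioral intentions
r_att_perf = 0.495  # attitudes vs. performance (value given in the Discussion)

N_OBS = 320  # ASSUMPTION: 160 subjects x 2 sites; not stated with the regression


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Standardized weights for a two-predictor regression.
denominator = 1 - r_att_perf ** 2
beta_attitudes = (r_att_int - r_att_perf * r_perf_int) / denominator
beta_performance = (r_perf_int - r_att_perf * r_att_int) / denominator

r2_both = beta_attitudes * r_att_int + beta_performance * r_perf_int
r2_attitudes_only = r_att_int ** 2

print(round(beta_attitudes, 3), round(beta_performance, 3))
print(round(adjusted_r2(r2_both, N_OBS, 2), 3),
      round(adjusted_r2(r2_attitudes_only, N_OBS, 1), 3))
```

Under these assumptions the standardized weight for performance is close to zero, and both adjusted R2 values land near the reported .597 and .598.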
Table 11 provides a summary of the Hypotheses and results. The results indicate that, perhaps unsurprisingly, there are indeed powerful main effects from Web site delay, depth, and familiarity on user attitudes and performance in searching through the site. However, beyond those intuitive
main effects, for performance we found strong two-way interactions between depth and familiarity,
depth and delay, and between familiarity and delay as expected. A strong three-way interaction
among all three factors was found for both performance and attitudes. Finally, attitudes appear to be
a powerful antecedent of behavioral intentions.
Interestingly, performance effects were consistently stronger in every main and interaction
test than effects on attitudes. Meanwhile, 2-way interaction effects were completely absent for attitudes, and while most of the hypotheses received strong support from manipulating the three factors, the strongest findings are in the arena of performance. Simultaneously, attitudes are the strongest predictor of behavioral intentions, overwhelming any additional impacts of performance. Therefore, a focus only on performance or attitudes appears to be rather incomplete.
Table 11: Summary of Findings
Hypothesis   Expectation                                                     Effect of Factors   Result
H1a          Attitudes: Fast > Slow                                          Main                Supported
H1b          Performance: Fast > Slow                                        Main                Supported
H2a          Attitudes: Broad > Deep                                         Main                Supported
H2b          Performance: Broad > Deep                                       Main                Supported
H3a          Attitudes: Familiar > Unfamiliar                                Main                Supported
H3b          Performance: Familiar > Unfamiliar                              Main                Supported
H4a          Attitudes: Interaction between Depth & Familiarity              2-way               Not Supported
H4b          Performance: Interaction between Depth & Familiarity            2-way               Supported
H5a          Attitudes: Interaction between Depth & Delay                    2-way               Not Supported
H5b          Performance: Interaction between Depth & Delay                  2-way               Supported
H6a          Attitudes: Interaction between Familiarity & Delay              2-way               Not Supported
H6b          Performance: Interaction between Familiarity & Delay            2-way               Supported
H7a          Attitudes: Interaction between Delay, Depth, & Familiarity      3-way               Supported
H7b          Performance: Interaction between Delay, Depth, & Familiarity    3-way               Supported
H8a          Attitudes as an antecedent of Behavioral Intentions             direct              Supported
H8b          Performance as an antecedent of Behavioral Intentions           direct              Mixed
Discussion
Our model predicted all possible main and interactive effects of delay, familiarity, and depth
on users’ attitudes and performance on nine search tasks. The model also predicted that behavioral
intentions would, in turn, be influenced by both attitudes and performance.
The 3-way ANOVAs explained 56% of the variance in attitudes and 50% of the variance in
performance. As expected, all main effects were highly significant on both of the hypothesized outcome variables. Attitudes and performance on the search task were strongly degraded by delay, unfamiliar structure and content, and deep structure. Those results are consistent with HCI research
outside of the Web arena, and provide some assurance that the independent variables chosen in this
study did have strong effects on the outcomes we examined.
The interaction effects seemed to have the strongest effects on user performance. All possible 2-way interactions among the factors showed significant effects on performance, but not on attitudes. The 3-way interaction was significant for both performance and attitudes. Although it is relatively rare to theorize and find 3-way interactions, it appears that the three factors indeed have a synergistic relationship.
The failure of 2-way interactions for attitudes led us to examine the means graphically. Figures 2 and 3 provide 3-D bar charts of both performance and attitudes, and it becomes apparent
that the unfamiliar treatment has such a devastating effect on attitudes that there is much less room
for interaction effects to take hold. Stated another way, the familiarity main effect seems to overshadow the 2-way interactions for attitudes. It is possible that the 3-way interaction remains highly
significant for attitudes because of the high strength of the synergies among the three factors. Further, examination of Figure 3 seems to indicate that familiarity has even stronger effects than delay
on attitudes, suggesting that familiarity might actually be the core problem.
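[Figure 2: 3-Way Interaction for Performance. 3-D bar chart of mean performance for the Deep-Slow, Deep-Fast, Broad-Slow, and Broad-Fast conditions, with separate bars for the unfamiliar (Unfam) and familiar (Fam) sites.]
[Figure 3: 3-Way Interaction for Attitudes. 3-D bar chart of mean attitudes for the Deep-Slow, Deep-Fast, Broad-Slow, and Broad-Fast conditions, with separate bars for the unfamiliar (Unfam) and familiar (Fam) sites.]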
Behavioral intentions correlated significantly with both attitudes (r=.774; p<.001) and performance (r=.392; p<.001), suggesting that both H8a and H8b are supported. While performance
explains 15.1% of the variation in behavioral intentions (p<.001), a multiple regression revealed that
the effects of attitudes appeared to overshadow completely the effects of performance on behavioral
intentions. In other words, if we did not measure attitudes, we would consider H8b to be supported
rather strongly. We suspected multicollinearity between attitudes and performance, but the correlation between the two was .495, below the traditional threshold of .6, and the VIF score was 1.32, below the general threshold of 4. Therefore, our conclusion is that support for H8a is strong and support for H8b is “mixed,” demonstrating that studying performance without attitudes could provide
misleading results.
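Two of the figures in this paragraph follow directly from the reported correlations. With two predictors, the variance inflation factor is 1/(1 - r^2), where r is the correlation between the predictors, and the variance in intentions explained by performance alone is the squared correlation, adjusted for sample size. The sketch below assumes 320 observations (160 subjects, two sites each), which the text does not state explicitly; the variable names are illustrative.

```python
r_att_perf = 0.495  # attitudes vs. performance (from the text)
r_perf_int = 0.392  # performance vs. behavioral intentions (from the text)

N_OBS = 320  # ASSUMPTION: 160 subjects x 2 sites

# Variance inflation factor when there are exactly two predictors.
vif = 1 / (1 - r_att_perf ** 2)

# Variance in intentions explained by performance alone, then adjusted
# for one predictor and N_OBS observations.
r2_performance = r_perf_int ** 2
adjusted_r2_performance = 1 - (1 - r2_performance) * (N_OBS - 1) / (N_OBS - 2)

print(round(vif, 2), round(adjusted_r2_performance, 3))
```

Under that assumption, the results are consistent with the reported VIF of 1.32 and the 15.1% of variation explained by performance.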
Limitations
As in most studies, there are several limitations that should be kept in mind when considering the results of this study.
First, the subjects were college students, and the results might not generalize to the rest of
the population. Fortunately, Voich (1995) found students to be particularly representative of values
and beliefs of individuals employed in a variety of occupations. Also, we did not expect students to
react differently to the stimuli than would other individuals (by contrast, students would probably
not be appropriate subjects for studying how loan officers make decisions). Thus, the materials were designed to tap rather “invariant” (Simon, 1990) activities such as aspects of Web use.
Another limitation is that the task might not represent typical tasks that on-line shoppers
undertake. To address this potential problem, we attempted to make the tasks as realistic as possible.
It is probably common for shoppers to browse through a site, looking for facts and details of
products that they might buy. Our task required precisely this activity. The main departure from traditional usage was the lack of “search” facilities. A search option would have destroyed our control
over the paths followed and would have confounded our results.
Finally, incentives of laboratory browsers and home shoppers might differ. In short, home
shoppers can choose a different store if they become frustrated searching for a desired product,
while our users did not desire any of the products but participated to receive extra credit. While extra credit is indeed an incentive, it does not reflect incentives of actual shopping. Tracking real shoppers on real on-line stores would have provided greater realism, but the cost would be a loss of control. Our sacrifice provided greater assurance that the effects we found were caused by the treatments. We probably muted our findings because students in a lab with a set of tasks placed in front
of them would not be as likely to care very much about the task or be as likely to suffer reduced performance (quit before their tasks were all complete). Our findings are therefore perhaps more conservative than necessary.
Implications and Conclusions
This study is meant to provide implications for both researchers and practitioners. Web researchers can use, with confidence, two-decade-old findings of research on the main effects of delay,
depth, and familiarity in the context of time-sharing menu systems. Interactions among those factors
provide strong findings beyond the main effects that have been studied separately for so long, and
studying both performance and attitudinal outcomes together provides a more complete story than
studying only performance or attitude outcomes.
Information foraging theory appears to be a promising basis for research in Web design, as
users appear to have strong reactions as the costs of browsing for information increase, possibly exceeding its benefits. Furthermore, time is a potent component of foraging cost, but only considering
delay time would omit other important factors such as familiarity and depth, two factors that add to
foraging cost. These factors allow a researcher to perform the more complete analysis that is suggested by estimating foraging cost with a formula that predicts the nature of the users’ likely path to
a goal and the distance to that goal.
While our model appears to explain a large amount of variation in attitudes and task performance, there is ample opportunity to focus on other outcomes and other antecedents which might
lead to greater understanding of the usability of Web sites. Also, while we were able to demonstrate
correlations between behavioral intentions and both attitudes and task performance, other studies
should focus very closely on these three constructs to better understand their relationships under
different situations.
Practitioners should be advised to pay close attention to each of the three individual factors
within their control to the greatest extent possible. Page loading time should be minimized, links to
bottom level pages should use the most familiar possible terminology, and intermediate pages should
have more, rather than fewer, links per page.
More importantly, however, although practitioners are not usually made aware of interactions among experimental factors, the ones in this study provide some clear advice:
• First and foremost, designers should ascertain if the terminology used to categorize Web
content is familiar to users. One promising tool suggested for testing a potential design is
the cognitive walkthrough (Blackmon et al., 2003).
• When it is difficult or impossible to expect a reasonable amount of familiarity in target users, then a great deal of effort should be expended to maximize breadth and speed (either
by minimizing graphic content, streamlining page design, or assuring proper server configuration, redundancy, and/or bandwidth).
• When familiarity is expected to be high, pages can be made more attractive by using
graphics, and the site can be made with greater depth than might otherwise have been considered advantageous.
• When it is likely that delay will be an issue because of the need for graphical content or
likely bottlenecks, then breadth should be maximized and the site’s structure should employ very familiar, clear terminology.
While analysis of single factors such as those investigated in this study is important for establishing foundations for understanding basic design phenomena, analysis of interactions provides
the basis for simultaneously more thorough understanding for researchers and more practical heuristics for designers. Refining and extending such interactive models would be a promising avenue for
future work.
References
Ajzen, I. “Nature and Operation of Attitudes,” Annual Review of Psychology, 52, 2001. 27–58
Bandura, A. “Self-Efficacy: Toward a Unifying Theory of Behavioral Change,” Psychological Review,
84:2, February 1977, 191-215.
Barber, R.E. and Lucas, H.C., “System Response Time, Operator Productivity, and Job Satisfaction,” CACM, (26:11), November 1983, 972-986.
Blackmon, M.H., Kitajima, M. and Polson, P.G. “Repairing usability problems identified by the cognitive walkthrough for the web,” SIGCHI 2003, Ft. Lauderdale, FL, pp. 497-504.
Butler, T.W. “Computer Response Time and User Performance,” CHI 1983, Boston, 58-63.
Card, S.K., Moran, T.P., and Newell, A. The Psychology of Human-Computer Interaction, 1983, Hillsdale,
NJ: Lawrence Erlbaum Associates.
Chae, M. and Kim, J. “An Empirical Study On The Breadth And Depth Tradeoffs In Very Small
Screens: Focusing On Mobile Internet Phones,” AMCIS 2003, pp. 2083 – 2093.
Chau, P.Y.K. Au, G. and Tam, K.Y. “Impact of information presentation modes on online shopping: An empirical evaluation of a broadband interactive shopping service,” Journal of Organizational Computing and Electronic Commerce, 10 (1), 2000, 1-22.
Chi, E.H., Pirolli, P. and Pitkow, J. “The scent of a site: a system for analyzing and predicting information scent, usage, and usability of a Web site,” SIGCHI Proceedings, 2000, The Hague, The
Netherlands, pp. 161-168.
Chin, J.P., Diehl, V.A. and Norman, K.L. “Development of an Instrument Measuring User Satisfaction of the Human-Computer Interface,” CHI 1988, New York, 213-218.
Chiravuri, A. and Peracchio, L. “Investigating Online Consumer Behavior Using Thin Slices Of Usability Of Web Sites,” AMCIS 2003, pp. 2105-2110
Compeau, D.R. & Higgins, C.A. “Computer Self-Efficacy: Development of a Measure and Initial
Test,” MIS Quarterly, (19:2), 1995, 189-211.
Compeau, D.R., Higgins, C.A., and Huff, S. “Social Cognitive Theory and Individual Reactions to
Computing Technology: A Longitudinal Study,” MIS Quarterly, 23, 1999, 145-158.
Connolly, P.J., “Growing Pains Won’t Slow Down the Broadband Revolution,” Infoworld, November
6, 2000, 66 and 68.
Crowston, K. and Williams, M., 2000, "Reproduced and Emergent Genres of Communication on
the World Wide Web," The Information Society, 16, 201-215.
Dannenbring, G.L. “The Effect of Computer Response Time on user Performance and Satisfaction:
a Preliminary Investigation,” Behavior Research Methods & Instrumentation, 15(2), 1983, 213-216.
Dataquest/Gartner, 2002. http://www3.gartner.com/Init.
Davis, F.D. "Perceived Usefulness, Perceived Ease of Use and User Acceptance of Information
Technology," MIS Quarterly (13:3), September, 1989, 319-340.
Dellaert, B.G.C. and Kahn, B.E., “How Tolerable is Delay? Consumers’ Evaluations of Internet
Web Sites after Waiting,” CentER Discussion Paper, MIT e-Commerce Research Forum,
http://e-commerce.mit.edu/cgi-bin/viewpaper?id=8, 1999.
Doherty, W. and Kelisky, R. “Managing VM/CMS Systems for User Effectiveness,” IBM Systems
Journal, (18:1), 1979, 143-163.
Dumais, S. T. and Landauer, T. K. “Using Examples to Describe Categories,” Proceedings of the CHI
Conference on Human Factors in Computing Systems, Boston, 1983, 112-115.
Edwards, D.M. and Hardman, L. 'Lost in hyperspace': cognitive mapping and navigation in a hypertext environment. Hypertext: theory into practice. Edited by R. McAleese. Oxford: BSP, 1989.
Fishbein, M. and Ajzen, I. Belief, Attitude, Intention and Behavior: An Introduction to Theory and Research,
Addison-Wesley Publishing Company, Reading, MA, (1975).
Frese, M. “A theory of control and complexity: Implications for software design and integration of
the computer system into the work place,” In Frese, M., Ulich, E. & Dzida, W. (eds.), Psychological issues of human-computer interaction at the work place. North-Holland, Amsterdam, 1987, 313-337.
Galletta, D., Henry, R., McCoy, S., and Polak, P. (2004) “Web Site Delays: How Slow Can You
Go?” Journal of AIS, Volume 5, Number 1, pp. 1-28.
Gauch, S. and Smith, J.B. “Search Improvement via Automatic Query Reformulation,” ACM Transactions on Information Systems, (9:3), 1991, 249-280.
Gray, W.D., John, B.E. & Atwood, M.C. (1993) "Project Ernestine: Validating GOMS for Predicting
and Explaining Real-World Task Performance," Human-Computer Interaction, 8(3), 237-309.
Grossberg, M., Weisner, R.A., and Yntema, D.B., “An Experiment on Problem Solving With Delayed Computer Responses,” IEEE Transactions on Systems, Man, and Cybernetics, 6, 1976, 219-222.
Hartzel, K.S. “Understanding the Role of Client Characteristics, Expectations, and Desires in Packaged Software Satisfaction,” unpublished doctoral thesis, University of Pittsburgh, Katz Graduate School of Business, 1999.
Hoxmeier, J. A. and DiCesare, C. “System Response Time and User Satisfaction: An Experimental
Study of Browser Based Applications", AMCIS, 2000, 140-145.
In-Stat/MDR, 2002; http://www.instat.com/
Jacko, J. A., Salvendy, G. and Koubek, R. J. “Modeling of Menu Design in Computerized Work,”
Interacting with Computers, (7:3), 1995, 304-330.
Jacko J. A. and Salvendy, G. “Hierarchical menu design: Breadth, depth, and task complexity,” Perceptual and Motor Skills, (82:3), 1996, 1187-1202.
Jul, S. and Furnas, G.W. Navigation in Electronic Worlds: ACM SIGCHI Bulletin (29:4), 1997
Kaindl, H., Kramer, S., and Afonso, L.M. “Combining Structure Search and Content Search for the
World-Wide Web,” Ninth ACM Conference on Hypertext, Pittsburgh, PA, 1998, 217-224.
Katz, M.A. and Byrne, M.D., “The Effect of Scent and Breadth on Use of Site-Specific Search on E-commerce Web Sites”, ACM Transactions on Computer-Human Interaction, (10:3), 2003, 198-220.
Kiger, J., “The depth/breadth trade-off in the design of menu-driven user interfaces,” International
Journal of Man-Machine Studies, 20, 1984, 201-213.
Koblentz, E. “Web Hosting Partner Keeps Stone Site Rolling,” Eweek, October 30, 2000, p. 27.
Korthauer, R.D. and Koubek, R.J., 1994, "An Empirical Evaluation of Knowledge, Cognitive Style,
and Structure upon the Performance of a Hypertext Task," International Journal of Human-Computer Interaction, 6(4), 373-390.
Kuhmann, W. “Experimental Investigation of Stress-inducing Properties of System Response
Times,” Ergonomics, (32:3), 1989, 271-280.
Landauer, T. and Nachbar, D. “Selection from alphabetic and numeric menu trees using a touch
screen: Breadth, depth, and width,” Proceedings of CHI’85, New York 1985, 73-78.
Larson, K. and Czerwinski, M., “Web Page Design: Implications of Memory, Structure and Scent
for Information Retrieval,” Proceedings of CHI’98, Los Angeles, 1998, 25-31.
Lederer, A.L., Maupin, D.J., Sena, M.P., and Zhuang, Y. “The Technology Acceptance Model and
the World Wide Web,” Decision Support Systems, (29:3), 2000, 269-282.
Lee, A.T., “Web Usability: A Review of the Research,” SIGCHI Bulletin, (31:1), 1999, 38-40.
Lee, E. and MacGregor, J. “Minimizing User Search Time in Menu Retrieval Systems,” Human Factors, 27, 1985, 157-162.
Lohse, G.L. and Spiller, P. “Electronic Shopping,” Communications of the ACM, (41:7), 1998, 81-87.
Martin, G.L., and Corl, K.G., “System Response Time Effects on User Productivity,” Behavior and
Information Technology, (5:1) 1986, 3-13.
McKinney, V., Yoon, K. & Zahedi, F. (2002). “The Measurement of Web-Customer Satisfaction:
An Expectation and Disconfirmation Approach,” Information Systems Research, 13 (3), 296-315.
Miller, D. “The Depth/Breadth Tradeoff in Hierarchical Computer Menus,” Proceedings of the 25th
Annual Meeting of the Human Factors Society, New York, 1981, 296-300.
Miller, R.B. “Response Time in Man-Computer Conversational Transactions,” Proceedings of the Fall
Joint Computer Conference, 1968, 267-277.
Muramatsu and Pratt, “Transparent Queries: Investigating Users' Mental Models of Search Engines,”
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pages 217-224, 2001.
Nah, F. “A Study of Web Users’ Waiting Time,” in Intelligent Support Systems Technology: Knowledge Management, Vijayan Sugumaran (editor), IRM Press, 2002, pp. 145-152.
Nah, F.H. and Kim, K. “World Wide Wait,” in Managing Web-Enabled Technologies in Organizations: A
Global Perspective, Medhi Khosrowpour (editor), Idea Publishing Group, (2000), pp. 146-161.
Nelson, R.R. “Educational Needs as Perceived by IS and end-user personnel: A Survey of
Knowledge and Skill Requirements,” MIS Quarterly, (15:4), 1991, 503-525.
Nielsen, J. “Who Commits The "Top Ten Mistakes" of Web Design?,”
http://www.useit.com/alertbox/990516.html, 1999a.
Nielsen, J. “The Top Ten New Mistakes of Web Design,”
http://www.useit.com/alertbox/990530.html, 1999b.
Nielsen, J. “The Need for Speed,” http://www.useit.com/alertbox/9703a.html, 1997
Norman, K. and Chin, J. “The Effect of Tree Structure on Search in a Hierarchical Menu Selection
System,” Behaviour and Information Technology, (7:1), 1988, 51-65.
Nunnally, J. Psychometric Theory, McGraw-Hill Book Company, New York, 1978.
Olston, C. and Chi, E.H. “ScentTrails: Integrating Browsing and Searching on the Web,” ACM
Transactions on Computer-Human Interaction, Vol. 10, No. 3, 2003, pp. 177–197.
Paap, K. R. and Roske-Hofstrand, R. J., “Design of Menus”, In Handbook of Human-Computer Interaction, M. Helander (Ed.), Elsevier, Amsterdam, 205-235, 1988.
Pardu, H. and Landry, J. “Evaluation Of Alternative Interface Designs For E-Tail Shopping: An
Empirical Study Of Three Generalized Hierarchical Navigation Schemes,” AMCIS 2001, pp.
1335-1337.
Parkinson, S., Hill, M., Sisson, N. and Viera, C. “Effects of Breadth, Depth, and Number of Responses On Computer Menu Search Performance,” International Journal of Man-Machine Studies,
28, 1988, 683-692.
Pirolli, P. and Card, S., 1999, "Information Foraging," Psychological Review, 106(4), 643-675.
Pitkow, C., Kehoe, J., and Rogers, J. GVU's 9th WWW User Survey, http://www.cc.gatech.edu/
gvu/user_surveys/survey-1998-04/reports/, 1998.
Polak, P. (2002). “The Direct and Interactive Effects of Web Site Delay Length, Delay Variability,
and Feedback on User Attitudes, Performance, and Intentions,” unpublished doctoral dissertation,
University of Pittsburgh, Katz Graduate School of Business.
Ramsay, J., Barbesi, A., Preece, J. “A Psychological Investigation of Long Retrieval Times on the
World Wide Web,” Interacting with Computers, (10:1), 1998, 77-86.
Robertson, G., McCracken, D., and Newell, A., “The ZOG Approach to Man-Machine Communication,” International Journal of Man-Machine Studies, 14, 1981, 461-488.
Rose, G., Khoo, H., Straub, D.W., (1999) “Current Technological Impediments to Business-to-Consumer Electronic Commerce,” Communications of the AIS, 1, Article 16.
Rose, G.M, Lees, J. & Meuter, M. (2001). “A refined view of download time impacts on e-consumer
attitudes and patronage intentions toward e-retailers,” The International Journal on Media Management, 3 (2), 105-111.
Rose, G.M. & Straub, D. (2001). “The Effect of Download Time on Consumer Attitude Toward the
E-Service Retailer,” e-Service Journal, 1 (1), 55-76.
Santosa, P.I. “Observing a User’s Mental Model of an Informational Website,” AMCIS 2003, pp.
2223-2230.
Sawasdichair and Poggenpohl, “User purposes and information-seeking behaviors in web-based media: a user-centered approach to information design on websites,” Conference on Designing interactive systems: processes, practices, methods, and techniques, 2002, pp. 201-212.
Schwartz, J. P. and Norman, K. L. “The Importance of Item Distinctiveness on Performance Using
A Menu Selection System,” Behaviour and Information Technology, (5:2), 1986, 173-182.
Sears, A. and Jacko, J.A. “Understanding the relation between network quality of service and the usability of distributed multimedia documents,” Human-Computer Interaction, (15:1), 2000, 43-68.
Sears, A., Jacko, J.A., and Dubach, E.M. “International aspects of WWW usability and the role of
high-end graphical enhancements,” International Journal of Human-Computer Interaction, (12:2),
2000, 243-263.
Shneiderman, B. Designing the User Interface, Addison-Wesley, Boston, 1998.
Simon, H.A., “Invariants of Human Behavior,” Annual Review of Psychology, (41), 1990, 1-19.
Somberg, B. and Picardi, M. “Locus of the Information Familiarity Effect in the Search of Computer Menus,” Proceedings, 27th Annual Meeting of the Human Factors Society, Virginia, 1983, 826-830.
Song, J. and Zahedi, F., “Web Design in E-Commerce: A Theory and Empirical Analysis,” Proceedings of the International Conference on Information Systems, December 17-20, 2001, Session T3.3.
Taylor, S. and Todd, P.A., “Understanding Information Technology Usage: A Test of Competing
Models,” Information Systems Research, 6, 1995, 144-176.
Teal, S.L. and Rudnicky, A.I., “A Performance Model of System Delay and User Strategy Selection,”
Proceedings of the 1992 Conference on Computer-Human Interaction, ACM, New York, 295-305.
Thuring, M., Hannemann, J., and Haake, J. M. “Hypermedia and Cognition: Designing for
Comprehension,” Communications of the ACM, (38:8), 1995, 57-66.
Torkzadeh, G. & Dhillon, G. (2002). “Measuring Factors that Influence the Success of Internet
Commerce,” Information Systems Research, 13 (2), 187-204.
Turban, E. and Gehrke, D. “Determinants of e-commerce Website” (sic), Human Systems Management,
19, 2000, 111-120.
Venkatesh, V. and Davis, F.D. "A Theoretical Extension of the Technology Acceptance Model:
Four Longitudinal Field Studies," Management Science, (46:2), February 2000, 186-204.
Voich, D., Comparative Empirical Analysis of Cultural Values and Perceptions of Political Economy Issues,
Westport, Connecticut: Praeger, 1995.
Wonnacott, L. “When User Experience is at Risk, Tier 1 Providers can Help,” Infoworld, November
6, 2000, p. 76.
Zaphiris, P. and Mtei, L. “Depth vs Breadth in the Arrangement of Web Links,”
http://www.otal.umd.edu/SHORE/, 1997.
Zona Research Study, “The Need for Speed,”
http://www.zonaresearch.com/info/press/99%2Djun30.htm, 1999.
End Notes
i Although a search engine could allow a user to jump directly to a needed resource, full-text searches cannot be counted on for exclusive access to documents (Olston & Chi, 2003). Search engines are not always available and,
when available, are often not very useful. Few people use such tools effectively, partly due to the unique syntax and
behavior of each search engine (Lohse & Spiller, 1998), partly because searching is a difficult task for most users
(Muramatsu & Pratt, 2001; Gauch & Smith, 1991), and partly because most search outcomes lack precision (Kaindl
et al., 1998). In short, the typical search engine interface is “notorious for its lack of usability” (Lee, 1999, p. 38).
Sometimes terminology itself is the problem, such as searching for the old term “PCMCIA” versus the new term
“PC Card.”
ii To prevent any learning effects in the second task, we completely counterbalanced the order of the within-subjects factors. The 32 combinations = 2 familiarity orders (familiar-unfamiliar and unfamiliar-familiar) x 4 speed conditions (fast-fast, fast-slow, slow-fast, and slow-slow) x 4 depth conditions (deep-deep, deep-broad, broad-deep,
broad-broad).
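The count of 32 in this note can be checked by enumerating the design directly. The following is a minimal sketch in Python, not part of the original study materials; the variable names are hypothetical and the condition labels are copied from the note above.

```python
from itertools import product

# Condition labels taken from end note ii; each label gives the
# (first site)-(second site) level of a within-subjects factor.
familiarity_orders = ["familiar-unfamiliar", "unfamiliar-familiar"]
speed_conditions = ["fast-fast", "fast-slow", "slow-fast", "slow-slow"]
depth_conditions = ["deep-deep", "deep-broad", "broad-deep", "broad-broad"]

# Full counterbalancing crosses every familiarity order with every
# speed condition and every depth condition.
combinations = list(product(familiarity_orders, speed_conditions, depth_conditions))

print(len(combinations))  # 2 x 4 x 4 combinations
for familiarity, speed, depth in combinations[:3]:
    print(familiarity, speed, depth)
```

Each tuple in `combinations` describes one ordering that a participant could be assigned to; how participants were actually allocated across these orderings is not stated in this note.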
iii We performed hypothesis tests (as shown later in the Analysis section) for all tasks separately and found
nearly identical results. We also examined the progression of performance from one task to the next, and found
that there was no systematic degradation of performance over time.
iv The CD, as well as all instruments and materials, is available from the first author on request.