ARTICLE IN PRESS

Int. J. Human-Computer Studies 60 (2004) 753–770

Web navigation structures in cellular phones: the depth/breadth trade-off issue

Avi Parush a,b,*, Nirit Yuviler-Gavish b

a Department of Psychology, Carleton University, B552 Loeb Building, 1125 Colonel By Drive, Ottawa, ON, Canada, K1S 5B6
b Faculty of Industrial Management and Engineering, Israel Institute of Technology, Haifa 32000, Israel

Received 15 October 2003; accepted 22 October 2003

* Corresponding author. Tel.: +972-4-8294548.
E-mail addresses: Avi Parush@carleton.ca, parush@tx.technion.ac.il (A. Parush), nirity@tx.technion.ac.il (N. Yuviler-Gavish).
1071-5819/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ijhcs.2003.10.010

Abstract

One can browse the web with a variety of devices, including hand-held devices such as the cellular phone. The small screen of those devices poses some serious usability issues, one of which is the appropriate hierarchy depth of the web site. In this study, we empirically examined whether a broad navigation structure, which was found to be superior in regular screen-size platforms, also has an advantage for a small-screen device such as the cellular phone, where it may require more movements and scrolling between screens of the same hierarchical level. Navigation times and success rates were measured for two search tasks in a mock web site that was built in two versions: one with a broad navigation structure and the other with a deep structure. Both structures were tested with a cellular phone emulation and a standard desktop personal computer (PC). Results indicate that performance was better with the broad navigation structure for both the cellular phone and the PC. In addition, performance was better with the PC as compared to the cellular phone, and this difference was more pronounced in the broad structure. The results are discussed in terms of the impact of device-independent characteristics of the hierarchy depth along with the theoretical account of increased working memory load, confusion and disorientation associated more with deep structures.

1. Introduction

With the advent of today's technology, web browsing has become an activity that can be performed with a variety of platforms and devices such as the desktop personal computer (PC), interactive TV, cellular telephones, Personal Digital Assistants and more. Platforms such as the cellular phone and Personal Digital Assistants have unique operational characteristics that directly influence their web browsing usability and thus present developers with unique design challenges. The major influencing characteristics are the small physical display and the mobile context of usage. In this paper we focus on the impact of the small display on critical aspects of the navigation structure in web sites.

The root of the usability problem is the desire to let users of small-screen devices have access to the same information that is presented in the fixed, regular-size screen platforms such as the desktop PC, in addition to letting the user perform similar tasks with this information (Buchanan et al., 2001; Ericsson et al., 2001). Typical approaches to deal with this requirement are to either deliver the information as is or change the data structure, reformat and adapt it to the small screen (Watters et al., 2003). Various ways of adaptation were developed, the primary one being reformatting page layout. Reformatting guidelines were published for Personal Digital Assistants (e.g., AvantGo; Palm OS) and for cellular phones, primarily Wireless Application Protocol phones (e.g., WAP Forum, 2001; Sprint Spectrum, 2002; Nokia Corporation, 2002; Openwave, 2003).
One line of investigation that is related to the content reformatting approach was to examine the ability to read information from small screens (e.g., Duchnicky and Kolers, 1983; Reseil and Shneiderman, 1987; Jones et al., 1999). In general it was found that smaller display size with shorter text lines degraded visual search and reading performance.

Another line of development and research looked at the feasibility and efficacy of various dynamic content adaptations to the small screen. These included the use of Rapid Serial Visual Presentation to solve the space–time tradeoff of small screen presentations (e.g., de Bruijn et al., 2002; Ford et al., 1997), the use of transparent navigation mechanisms (Kamba et al., 1996), the use of thumbnailing of web pages to enhance search (e.g., Wobbrock et al., 2002), and real-time restructuring of web pages to fit the small screen (e.g., Keranen and Plomp, 2002; Buyukkokten et al., 2001; Chen et al., 2003).

Another common approach to adapt the content to small displays is to restructure the information and divide it into smaller displayable chunks. In general it was found that such an approach was associated with more navigation steps required in the small-screen device, such as backward and forward moves (e.g., Dillon et al., 1999), longer web search and browsing history usage (de Bruijn et al., 2002), and longer browsing paths (Jones et al., 1999). Taken together, the approach of information restructuring in small-display devices was shown to influence navigation in the web site.

Regardless of the platform, information structure or architecture is critical to web site navigation (Brinck et al., 2002; Nielsen, 2000). There are two primary factors that characterize web site structure or the navigation structure: the number of items per page and the number of levels in the site. When there are few items in a page (i.e., small chunks), it creates many levels and the structure is considered to be deep.
When there are many items in a page (large chunks), the structure is considered to be wide or broad since there is no need for many levels.

There is an inherent tradeoff between the depth and the breadth in the navigation structure. Various factors such as visual search time, motor response time, limitations of human working memory and others are associated with the impact of this tradeoff on navigation performance. While this tradeoff was studied extensively in regular-size displays, very little attention was directed to examining it in small-screen platforms.

The research that deals with the breadth-depth tradeoff in the desktop PC or other fixed, regular screen-size platforms looked primarily at menu structures and showed a consistent advantage of broad menu structures with respect to performance times and accuracy rates (Allen, 1983; Hagelbarger and Thompson, 1983; Kiger, 1984; Snowberry et al., 1983; Seppala and Salvendy, 1985; Tullis, 1985). In addition, when the item that the participant had to find was deep in the tree, the average time in each level was longer (Allen, 1983; Hagelbarger and Thompson, 1983).

Less research was done with respect to the impact of menu structure on form-filling interaction. Norman et al. (2001) showed that when the navigation is linear (there is no need to jump back and forth in the form), participants preferred scrolling down the page or using the “next page/item” links rather than navigating by choosing the direct links. This indicated that for this kind of form filling, a broad menu is better than a deep one. No other research was found to give further support to this conclusion.

In contrast, the considerations for choosing between a broad vs. a deep menu or navigation structure can be different and fuzzier for platforms with a small physical screen such as the Personal Digital Assistant and cellular phone.
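The tradeoff between items per page and number of levels described above can be made concrete with a small sketch. It assumes a balanced hierarchy in which every page offers the same number of items and each terminal item is reached by a single path; the item count of 64 and the limit of 4 links per screen are hypothetical values chosen only for illustration, and the function names are ours, not the authors'.

```python
def levels_needed(total_items: int, items_per_page: int) -> int:
    """Smallest number of levels d such that items_per_page ** d >= total_items."""
    if total_items < 1 or items_per_page < 2:
        raise ValueError("need total_items >= 1 and items_per_page >= 2")
    depth, capacity = 0, 1
    while capacity < total_items:
        capacity *= items_per_page  # one more level multiplies the reachable items
        depth += 1
    return depth


def screens_per_page(items_per_page: int, links_per_screen: int) -> int:
    """Number of screens one page spans when only a few links fit at once."""
    if items_per_page < 1 or links_per_screen < 1:
        raise ValueError("both arguments must be >= 1")
    return -(-items_per_page // links_per_screen)  # ceiling division


if __name__ == "__main__":
    TOTAL_ITEMS = 64       # hypothetical number of terminal items
    LINKS_PER_SCREEN = 4   # hypothetical small-screen capacity
    for breadth in (2, 4, 8, 64):
        print(
            f"breadth={breadth:2d}  "
            f"levels={levels_needed(TOTAL_ITEMS, breadth)}  "
            f"screens per page={screens_per_page(breadth, LINKS_PER_SCREEN)}"
        )
```

Under these assumptions, widening the pages reduces the number of levels but increases the number of screens a single page occupies on a small display, which is the tension the following paragraphs discuss.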
Tang (2001) and Ziefle (2002) present the problems in broad menus for a cellular phone. The problems are caused primarily by the inability to display all the items of the same level (that are on the same page) in one screen as on the PC, and as a result they need to be divided and displayed in several screens. As a consequence, users have to gather information from each of these screens as they navigate in the same level, in addition to gathering information from other levels. Thus, a broader menu will not necessarily alleviate the load on the working memory even though it has fewer levels as compared to navigating through many levels in a deep menu structure (Ziefle, 2002). In addition to this problem, Tang (2001) indicated that navigation in cellular phones requires the user to execute motor actions such as pressing keys in order to scroll between screens that belong to the same level (the same page). Consequently, there may be more user actions required in a broad structure in a small-screen device and these actions may take more time as compared to a deep menu.

Very little research looked at the tradeoffs inherent in various web site structures in terms of the influence on navigation performance in a small-screen device. Tang (2001) compared a broad and a deep structure and showed that it took more time to execute the tasks in the deep menu. Kaikkonen and Roto (2003) compared a flat structure (one level) to a deep structure (two levels) and showed that task execution times were shorter for the flat structure. In addition, Chen et al. (2003) showed that by dynamically splitting pages into two levels, users' browsing experience was improved. However, these studies did not examine the typical deeper navigation structures, 3–6 levels deep, which are often encountered in web sites. In addition, no rigorous empirical research was done to examine the impact of navigation structure on form-filling interaction with the items that are searched for in a cellular phone.

The little research that addressed the issue of navigation structures in small-screen devices is not as conclusive as the research that addressed this issue in regular-size screens. Specifically, the question remains whether a broad navigation structure in a cellular phone is better than a deep structure, as was shown for the desktop and other platforms with a larger screen. Conversely, a broad structure is not necessarily better because of the small physical screen and the need to display the information of the same page in several screens, which would require more scrolling actions.

This question is significant from both theoretical and practical aspects. Theoretically, the question is whether the influence of hierarchy depth in navigation structures is device dependent or independent. If the influence is device dependent, we should not expect an advantage to the broad structure in a small-screen device such as a cellular phone (Tang, 2001; Ziefle, 2002). This can be expected because a broad structure in the small-screen device may entail more user actions than the deep structure, as was explained above. However, if the influence is device independent and is due to other factors such as human memory limits, then we should expect to see an advantage to the broad structure in the small-screen device similar to what is observed with the regular screen platforms. Practically, there is still a need to formulate clear recommendations and guidelines for the design and information architecture of web sites for small-screen devices.

The objective of the study reported here was to empirically examine the breadth–depth tradeoff in the navigation structure of a web site in a cellular phone.
An identical web site in a desktop platform was used as a control condition to replicate the advantage of the broad structure reported in the literature. The research was conducted using a mock web site of movie listings and movie ticket booking. For each of the platforms, cellular phone and desktop, two user interfaces were designed for browsing the site. The user interfaces were different in their navigation structures, but identical in every other aspect. One interface was broader and less deep (“broad” from hereon): it consisted of many items per page along with fewer levels. The other interface was deeper (“deep” from hereon): it consisted of few items per page along with five levels. The experimental tasks were to find a certain movie and book tickets to that movie by filling in certain details. The sequence of steps and the time to perform every step were recorded. Another important aspect of the measures in this study was the characterization of various navigation patterns and errors that can be related to the research question.

2. Method

2.1. Participants

Ninety-six students from the Industrial Engineering and Management Faculty at the Technion in Haifa, Israel, served as experiment participants. Most of them reported a large (36% of the participants) or very large (21% of the participants) amount of experience with browsing the web in general. In terms of specifically browsing the web in a cellular phone, the majority (75% of the participants) had no prior experience. None of them had experience with the specific site that was used in the experiment. Participants were students from two courses and received class credit for their participation in the experiment.

2.2. Experimental tasks

The experiment included two tasks.
In each task the participant had to navigate and search for certain information (e.g., find the movie that is shown at the earliest time in a certain city), make a decision (e.g., check all the movies that are shown in that city, compare their show times and then decide on the earliest show) and finally book tickets for that movie.

The tasks were distinguished in terms of their relative difficulty. According to Seppala and Salvendy (1985), the definition of task difficulty in navigation structures is derived from the distance between the items that need to be found in the hierarchical tree. The farther the items are from each other, the more difficult the task is. In task 1 of this experiment, navigating to the screens that contained the information needed for making the decision involved greater inter-item distances than in task 2. However, in task 2 there was more information to be gathered in order to decide for which movie to book tickets, as compared to task 1. The tasks were distinguished in their relative difficulty in order to further examine the possible influences of the navigation structures.

For analysis purposes, task performance was divided into two parts: (1) navigation until finding the correct movie; and (2) booking tickets for that movie.

2.3. Stimuli

The mock web site that was built for the research included a listing of 72 different movies playing in 43 cities in Israel and a mechanism for booking tickets. This number of items was identical for both structures and both platforms. The site included only simple links, without search options and without any graphics. For every movie one could find its description, score, the actors that participate in it and its show times in the theaters in the different cities.
In addition, one could book tickets for a certain movie by filling in details such as the desired show time, the desired number of tickets and the number of a credit card for payment.

The site was built as a hierarchical tree. A list of the cities and the movies was the root, and the details about the movies were the terminal items (“the leaves of the tree”). Consequently, one could reach every terminal item via several paths. The web sites were designed according to known web site design guidelines (e.g., Nielsen, 2000; Sprint Spectrum Web Style Guide).

We built two prototypes that enabled web browsing in each of the two platforms. One platform was an actual desktop computer, and the web browser was MS Internet Explorer version 5 (see top panel of Fig. 1). Link colors followed the convention used in standard sites.

A prototype that emulated web browsing in a cellular phone was developed for the second platform of the study. An image of a generic cellular phone was displayed on a screen, and pressing the various phone keys was done only by pointing and clicking with the mouse (see bottom panel of Fig. 1). Four keys were available: SELECT to select a link, EXIT to go to the previous (higher) level, and UP and DOWN for scrolling. The screen of the cellular phone displayed a title and 1–4 links or 1–6 text lines (for the terminal items). If the page contained more than that, the user had to scroll using the “UP” or “DOWN” keys. The numeric keys were also available for the task of booking movie tickets.

Fig. 1. Screen-shots of the setup for the two platforms, a regular PC screen (top panel), and on-screen emulation of a cellular phone (bottom panel).

Each platform included two navigation structures, a deep one and a broad one. The two deep structures, which were identical for both platforms, consisted of five levels.
The broad navigation structures consisted of two levels for the PC and three levels for the cellular phone. It should be emphasized that the web sites in both platforms and in both structures were designed identically in terms of the visual design and the use of textual terminology, and the only difference was the depth of the navigation structures.

The different hierarchic structures were also expressed in the ticket booking part of the interaction. In the broad structure in the cellular phone platform, the ticket booking part included only one level, which was divided into several screens. Every screen contained a single data item that the user had to fill in. The “UP” and “DOWN” keys enabled scrolling between the screens. The last screen contained the final confirmation, and to confirm the user had to press the “SELECT” key (see Fig. 2).

Fig. 2. Booking tickets in the broad navigation structure in the cellular phone. [Diagram: the Date, Time, Card and Confirm screens in sequence, each reached from the previous one with DOWN; SELECT on Confirm ends the ticket booking.]

The ticket booking part in the deep structure was broken into a two-level hierarchy. The upper level contained all the links to the pages with the data items that the user had to fill in, along with the final confirmation option. When the user selected a link she reached a page where she could fill in certain data or confirm the booking. The user had to press “EXIT” to get back to the upper level in order to select another option or confirm. The confirmation was done in the same way as for the broad structure (see Fig. 3).

Fig. 3. Booking tickets in the deep navigation structure in the cellular phone. [Diagram: an upper-level menu listing Date, Time, Card and Confirm; SELECT opens the Date, Time or Card page and EXIT returns to the menu; SELECT on Confirm ends the ticket booking.]

2.4. Apparatus

The prototype for the desktop platform was an Internet site that was built using the software MS FrontPage version 2000 and was installed on the local server in the Industrial Management and Engineering Faculty. Actions of participants working with the PC were recorded using iOpus STARR PC & Internet monitor version 3.18, PRO Edition. This software is memory-resident, and it stores data about the web pages that the user visited and how much time was spent in every page.

The cellular phone emulation was displayed on a computer screen with two main areas. The upper right-hand part displayed general experiment instructions and task instructions. In addition, there was a key with the label: “Press on me at the end of the task”. An explanation about the keys in the phone appeared in the lower right-hand side. The image of the cellular phone with the movies site was displayed on the left side (see bottom panel of Fig. 1). The prototype was built using the software Magic version 8.3. The application was also used to log users' actions: screens visited, keys pressed and a time stamp for every such event.

A Pentium 2 PC was used for the experiment. The operating system was Windows NT. The size of the screen was 17 in. The resolution that was chosen for the display was 1024 × 768 pixels with 16 bit color depth.

2.5. Design

The study was a fully factorial design composed of three independent variables.

1. Platform: the desktop computer and the cellular phone.
2. Navigation structure: a broad structure and a deep structure.
3. Task: two search and book tickets tasks.

The platform and the navigation structure were between-participant factors, and the task was a within-participant factor.

2.6. Procedure

The experiment took place in the computers laboratory of the Industrial Engineering and Management faculty at the Israel Institute of Technology.
All 96 participants were randomly assigned to one of the four groups: (1) 22 participants in the cellular phone platform with the broad structure group; (2) 23 participants in the cellular phone platform with the deep structure group; (3) 29 participants in the desktop platform with the broad structure group; and (4) 22 participants in the desktop platform with the deep structure group. Participants were divided randomly into several sessions of 2 h each. There were up to 16 participants and one experimenter in every session. The computers in the laboratory were prepared with one of the two interfaces and the participants chose a computer randomly. The participants' work with the computers was self-paced, and they turned to the experimenter only if they had major problems.

Experiment and task instructions were displayed on screen as part of the cellular phone emulation. Participants assigned to the cellular phone experimental groups read the explanation about the experiment by themselves. Then they were asked to perform two practice tasks, and then perform the two experimental tasks. The instructions emphasized that participants should perform the experimental tasks as quickly and as accurately as possible (without unnecessary steps). They were also asked to indicate task completion by pressing the key with the label: “Press on me at the end of the task”. After learning the interface, the participants performed the tasks according to the order determined by the system. The two practice tasks appeared in fixed order, and the actual experimental tasks appeared in random order. A task ended when the participant pressed the “Press on me at...” key, or when 15 min passed since it began. The system informed the participant whether the task was completed successfully, failed, or timed-out (15 min passed).
The procedure went on for the second task, and after its completion the experiment ended for the participant.

The procedure for participants with the desktop platform was similar. The experiment and task instructions were presented on printed paper.

3. Results

In the results analysis, statistical significance refers to p < 0.05. We also note results that were significant at the 90% significance level (p < 0.1).

3.1. Performance success rates

Failure in task performance was defined as follows: the participant informed the system that the task was completed by pressing the appropriate key but did not actually finish it, or 15 min passed from the beginning of the task and the participant did not finish it.

The success rates were similar between the navigation structures and between the tasks. In the broad structure, the success rate was 92.16%, and in the deep structure it was 94.57%. For task 1 the success rate was 91.75%, and for task 2 it was 94.85%. There was some difference between the platforms, but not a very large one. In the cellular phone platform the success rate was 90.00%, and in the desktop platform it was 96.15%.

Logistic regression was applied to the data in order to explore the factors affecting the success rates in each task (the factors included were platform, navigation structure and the interaction between them). In addition, the McNemar test was used to compare the tasks. Forward Stepwise (Conditional) Regression was used for building the logistic regression, with an entry condition of p < 0.05 and a removal condition of p > 0.1.

3.1.1. The success rate for task 1

The success rate with the cellular phone platform was 86.67%, and 96.08% with the desktop platform. The difference was significant at the 90% significance level (χ² = 2.771, p = 0.096). The success rate in the broad structure was 90.20%, and in the deep structure 93.33%. The difference was not significant (χ² = 0.308, p = 0.579).
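As a rough check on the task 1 figures above, the sketch below computes a Pearson chi-square statistic (1 degree of freedom, no continuity correction) for a 2 × 2 table of successes and failures. The success counts are an assumption: they are reconstructed from the reported percentages and the group sizes given in the Procedure (45 cellular, 51 desktop; 51 broad, 45 deep), and the paper does not state which statistic its regression procedure reported. The function names are ours.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Reconstructed counts (assumption): rows are groups, columns are success / failure.
# Platform, task 1: cellular 39 of 45 (86.67%), desktop 49 of 51 (96.08%).
chi2_platform = pearson_chi2_2x2(39, 6, 49, 2)
# Structure, task 1: broad 46 of 51 (90.20%), deep 42 of 45 (93.33%).
chi2_structure = pearson_chi2_2x2(46, 5, 42, 3)

print(f"platform:  chi2 = {chi2_platform:.3f}, p = {p_value_df1(chi2_platform):.3f}")
print(f"structure: chi2 = {chi2_structure:.3f}, p = {p_value_df1(chi2_structure):.3f}")
```

With these reconstructed counts the statistics come out close to the reported values (roughly 2.77 and 0.31), which suggests, without confirming, that the reported χ² values behave like simple 2 × 2 tests on the success counts.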
The interaction between the navigation structure and the platform was not significant (χ² = 0.278, p = 0.598).

3.1.2. The success rate for task 2

For this task as well, none of the factors entered the regression equation; they were not significant. The success rate with the cellular phone platform was 93.33%, and with the desktop platform 96.08%. The difference was not significant (χ² = 0.365, p = 0.546). The success rate with the broad structure was 94.12%, and with the deep structure 95.56%. The difference was not significant (χ² = 0.100, p = 0.752). The interaction between the navigation structure and the platform was not significant (χ² = 0.213, p = 0.644).

3.1.3. Comparing between the tasks

In task 1 the success percentage was 91.75%, and in task 2 it was 94.85%. The difference between the tasks was not significant (p = 0.508, using the McNemar test, binomial distribution).

In summary, the success rate in task 1 was higher for the desktop platform than for the cellular phone, a difference that was significant only at the 90% significance level. The rest of the comparisons were not significant.

3.2. Performance duration analysis

Task performance durations consisted of two parts:

1. Navigation time, the time from the beginning of the browsing until the desired movie was found and the participant reached the screen in which tickets for that movie can be booked.
2. Booking duration, the time from entering the booking screen to the end of the task indicated by the participant.

The analysis of the navigation times is presented first, followed by the analysis of the booking time. The analysis was done only for the participants who succeeded in the tasks.

3.2.1. Navigation times

Navigation times were analysed with repeated measures ANOVA, with the task (1 or 2) as the within-participant independent variable and the platform (cellular phone or desktop) and the navigation structure (broad or deep) as the between-participant independent variables. A natural log transform was applied to the data to comply with the assumption of the homogeneity of variances (homoscedasticity) among the data samples.

The mean navigation time of task 1, 280.69 s, was significantly shorter (F(1,71) = 20.788, p < 0.001) than the mean navigation time in task 2, 333.47 s. The mean navigation time in the desktop platform was 281.95 s, and it was significantly shorter than the mean navigation time in the cellular phone platform, 335.21 s (F(1,71) = 6.584, p = 0.012). The mean navigation time in the broad structure was 271.57 s, and it was significantly shorter than the mean navigation time in the deep interface, 345.58 s (F(1,71) = 13.016, p = 0.001). A first-order interaction between the task and the navigation structure was significant at the 90% significance level (F(1,71) = 2.929, p = 0.091). The mean navigation times in each task, as a function of the experimental conditions, are shown in Fig. 4.

Fig. 4. Mean navigation times, along with standard errors, for the broad and deep navigation structures in each of the platforms, and each of the tasks. [Two panels, Task 1 and Task 2; vertical axis: mean navigation time in seconds (0 to 450); horizontal axis: platform (PC, Cellular); series: Broad, Deep.]

It can be seen in Fig. 4 that overall there were shorter navigation times with the broad structure as compared to the deep structure, for both the PC and the cellular phone. However, differences between the PC and the cellular platforms were larger with the broad structure, particularly in task 2, as compared with navigation times in the deep structure that were similar for both the PC and the cellular phone. It should be noted that the different number of levels in the broad structure between the platforms (two for the PC and three for the cellular phone) did not influence the consistent pattern of shorter navigation times with the broad structure for both platforms in both tasks.

3.2.2. Ticket booking durations

The statistical analysis of booking durations was identical to the analysis for the navigation durations. The booking times were similar for the different platforms and the different structures. There was a significant difference between the tasks: in task 1, booking a movie was performed with a mean time of 80.99 s, and this time was significantly shorter than the mean time to perform movie booking in task 2, which was 92.31 s (F(1,71) = 8.288, p = 0.005). The mean time to book a movie in the cellular phone platform was 96.57 s, and it was similar to the mean time to book a movie in the desktop platform, 94.19 s (F(1,71) = 0.221, p = 0.640). The mean time to book in the broad structure was 96.28 s, and it was similar to the mean time to book a movie in the deep structure, 94.49 s (F(1,71) = 0.004, p = 0.952). The various interactions were not significant.

3.3. Navigation characteristics for the cellular phone platform

Any navigation sequence that was not the shortest possible path to the required information, without any nonessential steps, was classified as sub-optimal navigation or an interaction error. The sub-optimal navigation sequences and interaction errors that the participants executed during the navigation in the cellular phone platform were classified and analysed. Following are the types of those interactions that were examined and their measures (number of clicks, number of times that the sequence occurred, etc.). The nonparametric Mann–Whitney test was used to compare the navigation structures for each task.
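The Mann–Whitney comparison named above can be sketched as follows. This is a minimal illustration using the normal approximation with a tie correction and a two-sided p-value; the click counts in the example are hypothetical and are not the study's data, the function names are ours, and the software the authors used may compute the statistic slightly differently.

```python
import math
from collections import Counter


def midranks(values):
    """Rank the values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, Z, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    n = n1 + n2
    rank_sum_x = sum(midranks(pooled)[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all observations are identical; Z is undefined")
    z = (u_x - n1 * n2 / 2.0) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p_two_sided


# Hypothetical EXIT-click counts per participant (illustration only).
broad_clicks = [0, 0, 1, 0, 1]
deep_clicks = [2, 3, 1, 4, 2]
u, z, p = mann_whitney(broad_clicks, deep_clicks)
print(f"U = {u:.1f}, Z = {z:.3f}, two-sided p = {p:.4f}")
```

In this sketch Z is negative when the first sample tends to have lower ranks; the sign depends only on which group is passed first.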
The significance was calculated for a two-sided assumption. We report the differences that were significant (p < 0.05) or significant at the 90% significance level (p < 0.1).

3.3.1. Premature exit

A premature exit is an exit to a higher level before all the required information for the task was viewed. An example of this kind of navigation action: while looking at the details of a movie playing in a certain theater, the participant exited from the movie list in that theater and accessed the list of the theaters in the city. To successfully complete the task, the participant had to continue reviewing all the details of all the movies in that theater and not move to another theater. The number of clicks on the “EXIT” key that the participant used in order to exit from a certain level was counted.

There were significantly fewer “EXIT” clicks in task 1 using the broad structure, with a mean of 0.44 clicks, as compared to a mean of 2.67 clicks using the deep structure (Z = 2.224, p = 0.025). A similar pattern was found for task 2, with significantly fewer clicks using the broad structure, a mean of 0.81 clicks, as compared to a mean of 2.67 clicks in the deep interface (Z = 2.125, p = 0.034).

3.3.2. Multiple viewing of several items

This measure is the proportion of the items that were viewed more than once. Participants had to check details of several movies in order to execute the tasks. The number of movies whose details were checked more than once and their proportion relative to the number of movies that should have been viewed were computed.

For task 2, the proportion of the items that were viewed more than once was significantly smaller using the broad structure, with a mean of 7%, as compared to the deep structure, with a mean of 26% (Z = 1.992, p = 0.046). The difference was not significant for task 1.

3.3.3. Wrong movie booking attempts

The participant could book tickets only for the correct movie.
In the case of an attempt to book tickets for the wrong movie, the display did not change. The measure for this mistake is the number of times that the participant tried to book tickets for the wrong movie. For task 1, the number of wrong attempts was significantly (at the 90% significance level) smaller in the broad structure, with not a single attempt, as compared to the deep structure with a mean of 0.5 attempts (Z = 2.496, p = 0.013).

3.3.4. Unnecessary “SELECT” key activation

The ticket booking process was designed so that confirmation was required only after filling in all the details and not for every item separately. The correct way to confirm was by pressing the “SELECT” key for the ‘Confirm’ option. Sometimes participants pressed the “SELECT” key before going back to the main form to select the ‘Confirm’ option, in the deep structure, or before moving to the last screen containing the ‘Confirm’ option, in the broad structure. The measure for this error was the number of times that the participant pressed “SELECT” in the wrong place. Pressing the key several times in succession was counted as one time. In task 1, there were significantly fewer “SELECT” clicks using the broad structure, with a mean of 0.38 clicks, as compared to a mean of 3.17 clicks using the deep structure (Z = 3.109, p = 0.002). A similar pattern was found for task 2, with significantly fewer clicks using the broad structure, a mean of 0.38 clicks, as compared to a mean of 3.22 clicks in the deep structure (Z = 3.803, p < 0.001).

4. Discussion

The findings of this study show similarity between the desktop PC and cellular phone in terms of the impact of hierarchy depth of the navigation structure on performance. There was an advantage to browsing in a broad structure as compared to a deep structure for both platforms. In addition, web browsing in the desktop PC was performed faster and better than in the cellular phone.
The following discussion is divided into three parts: a summary of the findings, a discussion of the theoretical implications of those findings, and finally the practical implications and a future research agenda.

4.1. Summary of the findings

4.1.1. Overall task success rates

The success rate in the different conditions can be viewed as a measure of effectiveness. According to this measure, task effectiveness was high: success rates were in the range of 80–100%, and were similar across platforms, navigation structures and the two tasks. In addition, there was a higher success rate for task 1 with the PC as compared to the cellular phone.

4.1.2. Navigation durations

The navigation time in the desktop platform was shorter than the navigation time in the cellular phone platform. This finding is in line with findings reported by Kim and Albers (2001). The navigation time in the broad structure was shorter than the navigation time in the deep structure. The advantage of the broad structure was found both for the PC platform and for the cellular phone platform. The differences between the platforms were more pronounced with the broad navigation structure, particularly with task 2. Finally, the navigation time in task 1 was shorter than the navigation time in task 2.

4.1.3. Sub-optimal navigation sequences and errors

The various sub-optimal navigation sequences and errors that were made during navigation were classified and analysed in order to characterize and examine more closely the advantage of the broad navigation structure in the cellular phone. For task 1 in the broad structure, there were fewer premature exits and fewer attempts to book tickets for the wrong movie. For task 2 there was also an advantage to the broad structure in terms of fewer premature exits. In addition, there was an advantage of the broad structure in terms of less multiple viewing of several items.

4.1.4.
Ticket booking durations

There were no differences between the different platforms and the different navigation structures in the time it took to complete ticket booking. The lack of difference in performance durations between the platforms and between the two navigation structures, for each platform, is somewhat surprising. It is surprising because one would expect the interaction type of form filling to be more familiar with the PC and thus be performed faster than with the cellular phone platform. In addition, faster performance would be expected with a broad structure of form filling because all the details are in a single screen as opposed to a deep structure, where the details are divided into several screens (also see Norman et al., 2001). It is possible that navigation structures differing in the depth of the hierarchy have their primary impact on the navigation part of the task and less on the interactive part, such as ticket booking in this study.

4.1.5. Ticket booking errors

There was an advantage to the broad structure, for both tasks, in terms of fewer presses of the “SELECT” key after entering the data. This error can be characterized in two ways. One is that using the “SELECT” key implies confusion with the navigation keys such as the UP, DOWN and EXIT keys. The other is that using the “SELECT” key implies an erroneous understanding that every item that is filled in needs to be confirmed, instead of the requirement to confirm only at the end of the interaction. The increased occurrence of this error in the deep structure is discussed below.

4.2.
Theoretical implications

The findings for the desktop platform in this study are in line with many previous reports indicating that navigation performance in a broad structure is better than navigation in a deep structure (e.g., Norman, 1991; Snowberry et al., 1983; Kiger, 1984; Tullis, 1985; Seppala and Salvendy, 1985). The important empirical finding in this study is that similar navigation structures implemented in a cellular phone with a small screen also produced performance advantages with the broad structure. The question is: what can account for this performance similarity between platforms that are very different operationally?

On the one hand, the findings here may imply that the impact of hierarchy depth in navigation structures is due to device-independent factors. In previous research with regular-size displays, such as the desktop PC and telecommunication monitors, increased load on the user’s working memory was characterized as being associated mainly with the deep navigation structure and thus accounting for the poorer performance (e.g., Seppala and Salvendy, 1985; Tullis, 1985; Norman, 1991). In deep menu structures, the user has to make more choices by having to select between various options in every level, and having to go through more levels. Every such choice increases visual search time, decision time and reaction time (Seppala and Salvendy, 1985; Norman, 1991). Thus, the determining device-independent factor affecting performance is having more actions and increased working memory load associated with navigating through more levels in deep hierarchies.

On the other hand, the findings here cannot be accounted for by simply assuming that there are fewer user actions in a broad structure as compared to a deep structure. The broad structure in a small-screen device does not necessarily require fewer user actions in comparison to a deep structure.
A given number of items per page in the small-screen device can be distributed, in the broad structure, on many more screens, and consequently, there can be many selection actions: moving between screens in the same level and moving between levels (e.g., Tang, 2001). In this study, for example, the minimal number of clicks required to complete task 1 in the broad structure of the cellular phone was more than the minimal number of clicks required to complete it in the deep structure. From this perspective, because of the small screen of the cellular phone, it is possible that increased working memory load will also be associated with a broad structure, which requires the user to move between several screens in the same level. In other words, a smaller number of actions or less working memory load alone may not be sufficient to account for the consistent advantage of the broad structure in the cellular phone.

Another perspective that can further account for the broad structure advantage is the increased confusion and disorientation associated with deep navigation hierarchies. In some of the decision points in the hierarchy, the terms for describing the categories could be vague and ambiguous, and the user may have trouble identifying which category should be selected in order to reach a certain item. Consequently, there is more uncertainty and confusion in deep menus about the location of the desired items (Tullis, 1985; Norman, 1991). The details about the navigation structure, the items that need to be stored in the working memory, and the additional load due to the characteristics of the deep hierarchy, could make it difficult to store or retrieve those details. Consequently, the users may become disoriented: they may forget where they came from, which places they visited already, and where to go next.
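The earlier point, that a broad structure on a small screen does not necessarily require fewer user actions, can be illustrated with a toy click-count model. This is a sketch under stated assumptions rather than a description of the sites used in the study: the function name, the hierarchy sizes and the cost model (one click per highlight move on a keypad, one click per selection, direct pointing on a PC) are all hypothetical.

```python
def worst_case_clicks(options_per_level, levels, scroll_per_item):
    """Clicks needed to reach the last option at every level of a menu tree.

    scroll_per_item=True models a keypad device on which the highlight moves
    one item per click before a selection click (hypothetical cost model).
    scroll_per_item=False models a pointing device on which any visible
    option is selected with a single click.
    """
    scroll_clicks = (options_per_level - 1) if scroll_per_item else 0
    return levels * (scroll_clicks + 1)


# Two hypothetical hierarchies that both hold 64 leaf items.
broad = {"options_per_level": 8, "levels": 2}  # 8 ** 2 == 64
deep = {"options_per_level": 2, "levels": 6}   # 2 ** 6 == 64

for name, shape in (("broad", broad), ("deep", deep)):
    phone = worst_case_clicks(scroll_per_item=True, **shape)
    pc = worst_case_clicks(scroll_per_item=False, **shape)
    print(f"{name}: phone={phone} clicks, pc={pc} clicks")
```

Under these assumptions the broad hierarchy needs fewer clicks than the deep one with a pointing device but more clicks on the keypad device, which is why a raw count of actions cannot by itself explain the broad structure advantage on the phone.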
Examination of the characteristics of sub-optimal navigation sequences and errors committed in this study with the cellular phone platform implies more confusion and disorientation in the deep structure as compared to the broad structure. Specifically, behaviors such as premature exit, multiple viewing of several items and trying to book the wrong movie reflect confusion and disorientation. Moreover, the error of unnecessarily pressing the “SELECT” key may reflect confusion between navigation (moving from one screen to another) and a specific data-related action (confirmation). In contrast, navigation in the broad structure, even if it means more scrolling between screens that are on the same hierarchical level, would be less associated with disorientation. This is because the user, while scrolling between physical screens, may still maintain position awareness such as “I know where I am: I am still in the same level”. The increased amount of scrolling performed with the broad structure in cellular phones as opposed to the PC can account for the more pronounced differences between the platforms in the broad structure.

Another factor that needs to be considered is the difference between the two tasks. Because of the greater decision complexity in task 2, the participant needed to gather more information before reaching the ticket booking part. This in turn may have increased the load on the working memory. This additional load degraded performance in task 2 in general. In addition, the increased memory load due to the task complexity and due to the deep structure made the task almost equally difficult in the deep structure in both platforms.

4.3. Practical implications and research agenda

Generalization of the findings of this study should be qualified. The main limit of the external validity of this study is the use of an on-screen emulation of the cellular phone.
Such a setup excludes some important factors that are associated with the mobile context of use typical of the cellular phone. The mobile context probably has an impact on user performance; however, much can be investigated in the laboratory context. In particular, this study examined a basic conceptual issue of the web site structure and its influence on navigation performance. Such issues can be studied in simulators, emulators or even “paper prototypes”, since most of the determining performance factors are not motor actions but rather cognitive: visual search, short-term memory, decision making, orientation and position awareness, comprehension and mental models (e.g., Seppala and Salvendy, 1985; Tullis, 1985; Norman, 1991).

The general recommendation for web site architecture is to limit the depth of the hierarchy rather than its breadth (e.g., Brinck et al., 2002; Larson and Czerwinski, 1998). Those recommendations referred primarily to web sites displayed on regular screen-size platforms such as the desktop PC. Based on the findings of this study, taking into consideration its external validity limit, and in line with very few other studies (Chen et al., 2003; Kaikkonen and Roto, 2003), a similar recommendation can be formulated for small-screen devices such as the cellular phone. Browsing performance with a broad structure in cellular phones would still be inferior compared with regular screen-size platforms, but it would improve performance relative to breaking the content of the site into smaller chunks. It should be emphasized that designing a broader navigation structure for web sites displayed on small-screen devices does not solve problems such as reading from small screens and the need to perform more scrolling actions. A line of research and development that should be pursued is the examination of various content adaptation techniques combined with broad navigation structures.

References

Allen, R.B., 1983.
Cognitive factors in the use of menus and trees: an experiment. IEEE Journal on Selected Areas in Communications SAC-1 (2), 333–336.
AvantGo and HTML Styles for Handheld Devices. Available online from: http://avantgo.co/developer/reference/styleguide.htm.
Brinck, T., Gergle, D., Wood, S.D., 2002. Usability for the Web. Morgan Kaufman Publishers, San Francisco, CA, USA.
Buchanan, G., Jones, M., Thimbelby, H., Farrant, S., Pazzani, M., 2001. Improving mobile internet usability. In: Proceedings of the 10th International WWW Conference. ACM Press, New York, pp. 673–680.
Buyukkokten, O., Garcia-Molina, H., Paepcke, A., 2001. Accordion summarization for end-game browsing on PDAs and cellular phones. In: The Proceedings of SIGCHI’01, Seattle, WA, USA. ACM Press, New York, pp. 213–220.
Chen, Y., Ma, W-Y., Zhang, H-J., 2003. Detecting web page structure for adaptive viewing on small form factor devices. In: Proceedings of the WWW2003, Budapest, Hungary. ACM Press, New York, pp. 225–233.
de Bruijn, O., Spence, R., Chong, M.Y., 2002. RSVP browser: web browsing on small screen devices. Personal and Ubiquitous Computing 6, 245–252.
Dillon, A., Richardson, J., McKnight, C., 1999. The effect of display size and text splitting on reading lengthy text from the screen. Behavior and Information Technology 9 (3), 215–227.
Duchnicky, R.L., Kolers, P.A., 1983. Readability of text scrolled on visual display terminals as a function of window size. Human Factors 25, 683–692.
Ericsson, T., Chincholle, D., Goldstein, M., 2001. Both the cellular phone and the service impact WAP usability. In: Joint Proceedings of the IHM 2001 and HCI 2001. Springer, Berlin.
Ford, S., Forlizzi, J., Ishizaki, S., 1997. Kinetic typography: issues in time based presentation of text. In: Proceedings of the CHI’97, Atlanta, USA. ACM Press, New York, pp. 269–270.
Hagelbarger, D.W., Thompson, R.A., 1983.
Experiment in tele-terminal design. IEEE Spectrum 20, 40–45.
Jones, M., Marsden, G., Mohd-Nasir, N., Boone, K., Buchanan, G., 1999. Improving web interaction on small displays. Proceedings of the WWW8 Conference, Toronto, Canada, pp. 51–59.
Kaikkonen, A., Roto, V., 2003. Navigating in a mobile XHTML application. In: Proceedings of the CHI2003, Vol. 5 (1), Ft. Lauderdale, FL, USA. ACM Press, New York, pp. 329–336.
Kamba, T., Elson, S., Harpold, T., Stamper, T., Piyawadee, N., 1996. Using small screen space more efficiently. In: Proceedings of the CHI’96, Vancouver, Canada. ACM Press, New York, pp. 383–390.
Keranen, H., Plomp, J., 2002. Adaptive runtime layout of hierarchical UI components. In: Proceedings of the NordiCHI, Arhus, Denmark. ACM Press, New York, pp. 251–254.
Kiger, J.I., 1984. The depth/breadth trade-off in the design of menu-driven interfaces. International Journal of Man-Machine Studies 20, 201–213.
Kim, L., Albers, M.J., 2001. Web design issues when searching for information in a small screen display. In: Proceedings of the SIGDOC’01, Santa Fe, NM, USA. ACM Press, New York, pp. 193–200.
Larson, K., Czerwinski, M., 1998. Web page design: implications of memory, structure and scent for information retrieval. In: Proceedings of the SIGCHI’98, Los Angeles, CA, pp. 25–32.
Nielsen, J., 2000. Designing Web Usability. New Riders Publications, Indianapolis, IN.
Nokia Corporation: XHTML Guidelines (2002). Available online from: http://forum.nokia.com.
Norman, K.L., 1991. The Psychology of Menu Selection: Designing Cognitive Control at the Human/Computer Interface. Intellect Books, Bristol, UK.
Norman, K.L., Friedman, Z., Norman, K., Stevenson, R., 2001. Navigational issues in the design of online self-administered questionnaire. Behaviour and Information Technology 20 (1), 37–45.
Openwave Guidelines (2003). Available online from: http://developer.openwave.com/resources/uiguide.html.
Reisel, J.F., Shneiderman, B., 1987. Is bigger better?
The effects of display size on program reading. In: Salvendy, G. (Ed.), Social Ergonomic and Stress Aspects of Work with Computers. Elsevier, Amsterdam, pp. 113–122.
Seppala, P., Salvendy, G., 1985. Impact of depth of menu hierarchy on performance effectiveness in a supervisory task: computerized flexible manufacturing system. Human Factors 27 (6), 713–722.
Snowberry, K., Parkinson, S.R., Sisson, N., 1983. Computer display menus. Ergonomics 26 (7), 699–712.
Sprint Spectrum L.P., 2002. PCS Web Style Guide, Vision Edition, 2002. Available online from: http://www.sprintpcs.com.
Tang, K.E., 2001. Menu design with visual momentum for compact smart products. Human Factors 43 (2), 267–277.
Tullis, T.S., 1985. Designing a menu-based interface to an operating system. Proceedings of CHI’85, pp. 79–84.
WAP Forum, 2001. Available online from: http://www.wapforum.org.
Watters, C., Duffy, J., Duffy, C., 2003. Using large tables on small display devices. International Journal of Human Computer Studies 58, 21–37.
Wobbrock, J.O., Forlizzi, J., Hudson, S.E., Myers, B.A., 2002. WebThumb: interaction techniques for small screen browsers. In: Proceedings of the UIST’02, Vol. 4 (2), Paris, France. ACM Press, New York, pp. 205–208.
Ziefle, M., 2002. The influence of user expertise and phone complexity on performance, ease of use and learnability of different mobile phones. Behaviour and Information Technology 21 (5), 303–311.