Proceedings of the 8th Annual
International Conference on Industrial
Engineering – Theory, Applications
and Practice, Las Vegas, Nevada, USA,
November 10-12, 2003
WEB PAGE AESTHETICS AND PERFORMANCE: A SURVEY AND AN
EXPERIMENTAL STUDY
Kristi E. Schmidt, Michael Bauerly, Yili Liu, and Srivatsan Sridharan
Department of Industrial and Operations Engineering
The University of Michigan
1205 Beal Avenue
Ann Arbor MI 48109-2117
Corresponding author’s e-mail: krischmi@umich.edu
Abstract: A dual-process research and evaluation methodology was used to identify the underlying clusters of design
variables affecting aesthetic judgment of a Web page, and to examine user preference, ease of interaction, and interaction
speed for Web pages with various font and graphic sizes. To identify the clusters of variables, 57 design variables were
identified by conducting a content analysis on relevant literature and by conducting structured interviews. A balanced
incomplete block survey of the 57 variables was administered. Cluster analysis of the results revealed 10 underlying
clusters, two of which were selected to conduct a 2 × 10 experiment that explored Web pages with two levels of graphic
size and ten levels of font size. User preference and ease of interaction increase as font size increases and graphic size
decreases. There was no difference in interaction speed among Web pages with varying font or graphic sizes.
1. INTRODUCTION
The World Wide Web (WWW) has grown greatly in user population and breadth since its creation just over a
decade ago. It is now viewed as a backbone for several sectors including advertising and marketing, business and
ecommerce, entertainment, healthcare, communication, education, religion, and government (Nua, 2003). There were
nearly 414 million global home Internet users who each spent an average of over twelve hours online during the month of
July 2003, up 1.46% from the previous month (Nielsen//NetRatings, 2003).
The breadth, volume, and accessibility of the Internet have made it popular for individuals and organizations to
create and maintain Web pages that go beyond communication to collaboration and even Internet commerce. The surge in
Internet presence was not, however, paired with widespread design savvy or consideration for usability. The increased
complexity of the Internet and the sheer volume of Internet users make the World Wide Web a very complex and often
competitive environment. If users are unable to find what they need from a given Web page due to the lack of information
or the complexity of navigation, they will become frustrated and move on to another site. On average, users spend one
minute viewing each Web page (Nielsen//NetRatings, 2003). This relatively short amount of time requires the Web page to
communicate critical information rapidly and places high information-processing demands on the user.
Variables affecting user judgments of a Web page have often been intuitively defined and communicated by Web
page designers through instructional Web page design manuals. Oliver (2003) defines four principles of web interface
design and development: 1) usability—how intuitively or easily the media item is navigated and processed; 2)
visualization—creation of visually interesting and aesthetically pleasing media items while avoiding potentially distracting
or unnecessary features; 3) functionality—features of the media item and how useful they are for supporting a given task;
and 4) accessibility—tools that help users access the site in alternative formats and provide increased functionality. Burstein
(2003) groups Web page design variables into fifteen design elements: links, color issues, images, image maps, animated
images, spacing, tables, frames, style sheets, cookies, JavaScript, Java, plug-ins, screen size, and file distribution. These
online Web page design manuals are often based upon designer intuition and qualitative evaluation of existing Web pages.
Research about Web page usability, preference, and performance has also been published in peer-reviewed
journals, yet this research often represents qualitative survey results and literature reviews of case studies, rather than
empirical quantitative research. Turner (2002) identified seven categories affecting Web page usability: navigation, page
design, content, accessibility, media use, interactivity, and consistency. Cox and Dale (2002) developed a conceptual model
to assess how a Web page can meet user expectations based upon six quality factors in Web page design and use: 1) clarity
of purpose; 2) design; 3) accessibility and speed; 4) content; 5) customer service; and 6) customer relationships. Design was
further broken down into five key issues: 1) links; 2) consistency, menus and site maps; 3) pages, text and clicks; 4)
communication and feedback; and 5) search and fill-in forms.
Beyond defining overall conceptual Web page design considerations by scientific and quantitative means, the
relative importance of the design elements so defined must be determined and quantified in order to prioritize
efforts in Web page design. Checklists such as the Heuristic Evaluation by Proxy (HEP test) of Web page usability (Turner,
2002) aim to quantitatively and qualitatively evaluate several observations and criteria, yet checklists such as this are often
based on surveys, compilation of existing literature, or case studies rather than empirical research work. Clearly, current
literature lacks definition of factors affecting user judgments of a Web page based upon rigorous measurement tools and
lacks further definition of the quantitative relationship among these factors. This study quantitatively obtains and
determines relationships among Web page design variables.
The questions addressed in this study were: 1) what is the ranked importance of design variables and what are the
underlying variable clusters affecting the user’s judgment of Web page aesthetics; and 2) what are the tradeoffs between
Web page font size and Web page graphic size? A dual-process engineering aesthetics research and evaluation
methodology (Liu, 2001, 2003) was used to scientifically and quantitatively investigate these issues. This dual-process
methodology utilizes two parallel but closely related types of research methods that are aimed at achieving a
comprehensive, rigorous, and quantitative understanding of aesthetic response, in this case with respect to Web page
design. The two types of research methods that define the dual-process research and evaluation methodology and that are
carried out simultaneously are multidimensional construct analysis and psychophysical analysis.
Multidimensional construct analysis is a global top-down analysis that quantitatively answers questions involving
the conceptual and mathematical structure of the aesthetic constructs involved in aesthetic judgment, the definition and
measurement of the major psychological and physical dimensions involved, the identification of the relative importance and
relationship of these dimensions, and the development of a multidimensional evaluation scale to measure the aesthetic
construct validly and reliably. In this study, multidimensional construct analysis was used to identify
variables of importance with respect to Web page design using content analysis and structured interviews, to rank those
identified variables according to user preference by surveying several participants, and then to cluster and factor the
ranked variables, using multivariate statistical data reduction, according to relationships involving user perception taken
from the survey’s profile data.
Psychophysical analysis is a local bottom-up analysis that establishes a quantitative view of how user preference
changes as a function of specific aesthetic variables identified in the multidimensional construct analysis. Specifically, user
ability to perceive and judge values, changes and variations in design parameters, and corresponding preferences of the
levels of values of aesthetic variables are of interest. In this study, a psychophysical experiment was conducted to
quantitatively investigate user preference, ease of interaction, and performance tradeoffs between two variables from two
separate clusters of variables identified in the multidimensional construct analysis and to identify and establish a
quantitative relationship among these variables.
2. MULTIDIMENSIONAL CONSTRUCT ANALYSIS
2.1 Method
Texts were selected that addressed aesthetics, usability, and/or design guidelines for Web page design (Nielsen, 2003; De
Graaff, 2003; Gibbs and Szentivanyi, 2003; Hom, 2003; Ericsson, 2003; Perlman, 2003; Marion, 2003; Burstein, 2003; and
Instone, 2003). A content analysis was performed on these texts to obtain relevant variables affecting Web page aesthetics,
Web page usability, and Web page design guidelines by extracting all meaningful words. Meaningful words are those
that are not merely transition words or prepositions (i.e., not a, and, then, with, or to).
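The word-extraction step described above can be sketched as a simple frequency count. This is a minimal illustration, not the authors' procedure: the function name, the tokenizing pattern, and the stop list (limited to the five example words given in the text) are assumptions.

```python
import re
from collections import Counter

# Hypothetical stop list: only the example words named in the text.
STOP_WORDS = {"a", "and", "then", "with", "to"}

def meaningful_word_counts(texts):
    """Count every word in the texts that is not on the stop list."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep alphabetic words (hyphenated words stay whole).
        words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up sentences for illustration, not text from the cited sources.
sample = [
    "Keep download time short and provide a search.",
    "Download time matters; search helps with navigation.",
]
counts = meaningful_word_counts(sample)
print(counts.most_common(3))
```

The resulting counts correspond to the "frequency of appearance" that the method later combines with the interview responses.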
A parallel structured interview process was conducted with twenty undergraduate engineering students.
Participants ranged in age from 18-22 years, each had at least five years of Internet experience, and each logged onto the
Internet at least once daily. Each participant was specifically instructed to list variables that they thought were important in
affecting Web page usability while browsing Web pages at their leisure. The participants were encouraged to list as many
items as they could think of and to not limit their thought and creativity.
The results of the content analysis and the structured interview were combined to obtain a list of variables
including frequency of appearance. Fifty-seven variables were ultimately extracted based upon a tradeoff between the total
number of variables collected, the frequency of each variable collected, and the limitations of the survey design, including
potential central fatigue of the participants due to the length of the survey.
A survey was conducted to rank the fifty-seven variables that resulted from the content analysis and structured
interview. The survey used a Balanced Incomplete Block (BIB) design (Dunn-Rankin, 1983), in which a large set of
ranking items is broken up into smaller groups. This survey design reduces the cognitive load on the participant by
requiring the ranking of only 8 variables at a time rather than the full list of fifty-seven variables. There were a total of
fifty-seven small groups of 8 variables each that every participant (n=20) ranked, so that each variable appeared in 8 groups
and was compared with every other variable exactly once.
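The block counts stated above can be checked against the two standard counting identities for a balanced incomplete block design, bk = vr and λ(v − 1) = r(k − 1), where v is the number of items, b the number of blocks, k the block size, r the number of blocks containing each item, and λ the number of blocks containing each pair of items. The function name below is hypothetical, and the sketch checks only the arithmetic; it does not construct the design.

```python
def bib_parameters(v, b, k):
    """Return (r, lam) for a balanced incomplete block design.

    v: number of items, b: number of blocks, k: items per block.
    r: blocks containing each item; lam: blocks containing each pair.
    Raises ValueError if the counts cannot belong to a balanced design.
    """
    r, remainder = divmod(b * k, v)            # from b*k = v*r
    if remainder:
        raise ValueError("b*k must be divisible by v")
    lam, remainder = divmod(r * (k - 1), v - 1)  # from lam*(v-1) = r*(k-1)
    if remainder:
        raise ValueError("r*(k-1) must be divisible by v-1")
    return r, lam

r, lam = bib_parameters(v=57, b=57, k=8)
print(r, lam)
```

For 57 variables in 57 groups of 8, the identities give r = 8 and λ = 1.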
Clusters (top to bottom in the figure): Page Progression/Targeting Strategy; Image and Text Balance; Navigation;
Information Value; Relevance/Speed; Trust (Security); Platform Independence; Marketing; Appeal/Diversion;
Accessibility and Multimedia.

RANK  VARIABLE
40    Frames
38    Non-Frames
42    Opening of New Browser Window
34    Visual Design Cues
6     Visual Groups
53    Coordinated Audio and Video
36    Pictures Instead of Description
32    Simple Images
22    Graphic File Size
26    Font Size
48    Graphics for Graphics and Text for Text
28    Position in the Screen
24    Clear Exits
14    Wait-Time Feedback
29    Printable Contents
25    Navigation Support
30    Back Button
19    Grouping and Subheadings
27    Simple Headlines and/or Titles
23    Innovative
18    Provide Search
20    Length of an Article
17    Interactive
21    Minimized Scrolling
15    Simple Uniform Resource Identifiers (URIs)
39    Accurate Plain-Language Error Messages
2     Server Response Times
3     Time to Load
4     Download Time
5     Speed
12    Timely Information
16    Updated Regularly
1     Information Layout
9     Location of Information
8     Credible and Original Information
10    Privacy
7     Security
13    Browser Independent
11    System Independent
57    Advertisements
52    Banners
55    Sudden Pop-Up Windows
46    Animations
33    Graphics
35    Background Images
43    Entertainment
37    Drop Down Menus
31    Free Service
56    Songs
54    Movies
41    Games
51    Icons
45    Logo
50    3-D Images
49    Multiple Colors
44    Standard Colors for Links
47    Accessible for Users with Disabilities

[Dendrogram graphic not reproduced]
Figure 1. Ten Clusters Consisting of 57 Ranked Variables and Corresponding Dendrogram
2.2 Results
An overall rank of the fifty-seven variables for each participant, as well as an overall rank considering all participants’
responses, was obtained from the BIB survey profile data. Multivariate statistical data reduction using K-means
nonhierarchical and hierarchical cluster analyses, as well as a traditional factor analysis and a factor analysis with Varimax
rotation, was also carried out on the profile data from the BIB survey. The cluster analyses identify clusters, or groups, of
similar variables according to the underlying user perception of the variables. The factor analyses describe the relationship
among the observed variables in terms of a few underlying, but unobservable, constructs called factors.
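One common way to turn within-block rankings into an overall rank is to sum, for each variable, the positions it received across all blocks and participants, and then order the variables by that sum. The paper does not state its exact scoring rule, so the rank-sum rule, the function name, and the toy 7-item design below (blocks of 3, each pair of items sharing one block) are all assumptions made for illustration.

```python
from collections import defaultdict

def overall_ranking(responses):
    """Order items by total rank sum, smallest (most important) first.

    responses: one entry per participant; each entry is a list of blocks,
    and each block lists its items from most to least important.
    """
    totals = defaultdict(int)
    for participant in responses:
        for block in participant:
            for position, item in enumerate(block, start=1):
                totals[item] += position
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(totals, key=lambda item: (totals[item], item))

# Toy balanced design: 7 items, 7 blocks of 3, every pair together once.
blocks = ["ABC", "ADE", "AFG", "BDF", "BEG", "CDG", "CEF"]
# One hypothetical participant who always prefers the earlier letter.
participant = [sorted(block) for block in blocks]
print(overall_ranking([participant]))
```

In a balanced design every item appears in the same number of blocks, so the rank sums are directly comparable across items.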
Figure 1 illustrates the fifty-seven variables that were identified, ranked, and clustered. BIB ranking yielded the
five most important variables affecting Web page design (most important first): information layout, server response time,
time to load, download time, and speed. The five least preferred variables affecting Web page design (least preferred first)
were: advertisements, songs, sudden pop-up windows, movies, and coordinated audio and video.
Hierarchical cluster analysis performed using the BIB ranking results produced the dendrogram that illustrates the
underlying relationships among the variables by grouping similarly quantified entities at various stages of relationship
formation. Ten clusters were identified based upon these underlying relationships: page progression/targeting strategy,
image and text balance, navigation, information value, relevance/speed, trust (security), platform independence, marketing,
appeal/diversion, and accessibility and multimedia. Factor analysis yielded results similar to those of the hierarchical cluster
analysis. The similarity of results between the factor analysis and hierarchical cluster analysis illustrates consistency among
the various analyses and supports the validity of the method.
3. PSYCHOPHYSICAL EXPERIMENT
3.1 Method
Two clusters of variables identified in the multidimensional construct analysis were selected for the psychophysical
experiment: 1) appeal/diversion; and 2) image and text balance. One variable from each of the two clusters, respectively,
was selected for this illustrative psychophysical experiment: 1) graphics; and 2) font size. This phase of the experiment
quantitatively examined how user preference, ease of interaction, and interaction speed change as a function of Web page
graphic size and Web page font size.
Twenty participants aged 22.3 to 29.3 years (mean 24.7 years, standard deviation 1.9 years) participated in the
experiment. All participants had normal or corrected-normal vision and normal color vision. Each participant accessed the
Internet on average at least two hours a day, and each participant had at least four years of prior Internet experience.
The participants were compensated $10.00 for approximately one hour of their time.
Each participant completed twenty experimental trials that involved viewing twenty Web pages that were
designed based upon The New York Times on the Web (2003). The participant was instructed to read the article text
displayed on the Web page and then to click a link at the bottom of the page when finished. At the conclusion of
every trial, participants answered a four-choice multiple-choice comprehension question about the article and then rated
their preference for the Web page and ease of their interaction with the Web page on a scale from 0 (low) to 10 (high). The
total time the participant spent viewing each Web page was also recorded.
Among the twenty Web pages each participant viewed, the content of the article, the graphic size, and the font size
varied. A total of twenty different news articles were selected for the experimental stimuli and then reduced in length to
between 180 and 190 words. There were two graphics that corresponded to each article, a small graphic that ranged from
124-300 pixels high by 175-200 pixels wide, and a large graphic that ranged from 224-382 pixels high by 552 pixels wide.
The two graphic sizes corresponded to the small picture/large picture format of The New York Times on the Web (2003).
There were ten font sizes possible for each article: 7.5, 8.5, 9, 9.5, 10, 10.5, 11, 12, 13, or 14 point. Twenty articles, two
graphic sizes, and ten font sizes combined to create four hundred unique Web pages. No article content/graphic size/font
size condition was replicated within-subject or between-subjects. Each participant viewed each article once, each graphic
size ten times, and each font size two times. Each Web page was presented to the participant on a traditional 17 inch CRT
visual display terminal with 60 Hertz refresh rate and 1280x1024 pixel resolution. Participants used a mouse with a scroll
wheel as an input device.
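The constraints described above (400 unique pages, each participant seeing each article once, each graphic size ten times, and each font size twice, with no page repeated across participants) can be met by a cyclic assignment. The paper does not say how pages were assigned, so the rotation rule and all names below are assumptions; presentation order would presumably also be randomized, which this sketch omits.

```python
from itertools import product

FONT_SIZES = [7.5, 8.5, 9, 9.5, 10, 10.5, 11, 12, 13, 14]
GRAPHIC_SIZES = ["small", "large"]
N_ARTICLES = 20
N_PARTICIPANTS = 20

# The 20 graphic-size x font-size conditions.
CONDITIONS = list(product(GRAPHIC_SIZES, FONT_SIZES))

def trials_for(participant):
    """Return the (article, graphic, font) trials for one participant (0-based)."""
    trials = []
    for article in range(N_ARTICLES):
        # Rotate the condition list by the participant index so that each
        # article meets a different condition for every participant.
        graphic, font = CONDITIONS[(article + participant) % len(CONDITIONS)]
        trials.append((article, graphic, font))
    return trials

schedule = {p: trials_for(p) for p in range(N_PARTICIPANTS)}
print(schedule[0][:3])
```

Because each participant meets every one of the 20 conditions exactly once, the per-participant counts for graphic size and font size follow automatically.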
[Plot: interaction time in seconds (axis 0 to 60) and preference and ease-of-interaction ratings (0=low to 10=high) for the
Large and Small graphic sizes]
Figure 2. Average Interaction Time, Preference, and Ease of Interaction for Each Graphic Size Condition
[Plot: interaction time in seconds (axis 0 to 60) and preference and ease-of-interaction ratings (0=low to 10=high) for font
sizes of 7.5, 8.5, 9, 9.5, 10, 10.5, 11, 12, 13, and 14 point]
Figure 3. Average Interaction Time, Preference, and Ease of Interaction for Each Font Size Condition
3.2 Results
A 2 × 10 repeated measures analysis of variance (ANOVA) was performed with the 400 data points from the experiment
(20 participants, 2 graphic sizes, 10 font sizes). The statistical analysis used a within-subject design. The within-subject
factors were Web page graphic size (small or large) and Web page font size (7.5, 8.5, 9, 9.5, 10, 10.5, 11, 12, 13, or
14 point).
Figure 2 displays the average interaction time, the average preference rating, and the average ease of interaction
rating by each graphic size condition. The main effect of graphic size on interaction time was not significant (p=0.1771).
The main effect of graphic size on user preference was significant (p=0.0027). The main effect of graphic size on ease of
interaction was also significant (p=0.0179).
Figure 3 displays the average interaction time, the average preference rating, and the average ease of interaction
rating by each font size condition. The main effect of font size on interaction time was not significant (p=0.4913). The main
effect of font size on user preference was significant (p<0.0001). The main effect of font size on ease of interaction was
also significant (p<0.0001).
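For readers unfamiliar with the test, the F statistic behind a within-subject main effect can be sketched as a one-way repeated-measures ANOVA on a participants-by-conditions table. The paper does not name the software or procedure used, so this is only an illustration under assumptions: the function name is hypothetical, the numbers are synthetic, and each participant's scores are assumed to have been averaged over the other factor beforehand. The sketch returns F and its degrees of freedom but not a p-value, which requires the F distribution.

```python
def rm_anova_f(scores):
    """One-way repeated-measures ANOVA.

    scores[i][j] is the score of participant i under condition j.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of conditions
    grand = sum(map(sum, scores)) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Error term: variation left after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error

# Synthetic ratings for three participants under two conditions (not study data).
toy = [[1, 2], [2, 4], [3, 6]]
f_stat, df_cond, df_error = rm_anova_f(toy)
print(f_stat, df_cond, df_error)
```

With two conditions, this F equals the square of the paired t statistic computed on the same scores.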
4. DISCUSSION AND CONCLUSIONS
This study applied a dual-process engineering aesthetics research and evaluation methodology to Web page design
evaluation. The multidimensional construct analysis (top-down) approach yielded 57 ranked variables affecting user
judgments of a Web page and identified ten clusters grouping the 57 variables that reflect the underlying mental
structure of the user preferences. The ten clusters summarizing user judgment of a Web page are: page
progression/targeting strategy, image and text balance, navigation, information value, relevance/speed, trust (security),
platform independence, marketing, appeal/diversion, and accessibility and multimedia. The 57 variables (Figure 1) and ten
clusters identified using the dual-process engineering aesthetics research and evaluation methodology are relatively
consistent with previous studies that were based upon intuition, case studies, and literature review alone. The ranked
variables as well as the ten descriptive clusters provide valuable insights to Web page designers regarding the underlying
motivations and perceptions of Web page users in a quantitative and analytical manner as opposed to an intuitive generation
of a prioritized list or a literature review based upon case studies or qualitative surveys.
The psychophysical analysis (bottom-up) approach illustrated the ability to quantify relationships between
variables that summarize user judgment of a Web page. Results showed that user preference and ease of interaction increase
as font size increases and graphic size decreases; however, there was no difference in speed of interaction among Web
pages with varying font size or graphic size. This finding provides useful insight to Web page designers that performance
may not be the best indicator of Web page preference or ease of use. This psychophysical analysis also provides an
introduction to further investigation of relationships among the variables underlying clusters identified in the
multidimensional construct analysis.
Future study may investigate further issues such as age, task of Web page browsing (general purpose browsing or
a directed fact-finding search), content of Web pages (user interest or choice of the researcher) (Marchionini and
Shneiderman, 1988), and different methods of achieving a high level of aesthetics without significantly degrading loading
speed.
5. REFERENCES
1. Burstein, C.D. (2003). Viewable with any browser: Campaign. http://www.anybrowser.org/campaign/
2. Cox, J. and Dale, B.G. (2002). Key quality factors in Web site design and use: An examination. International Journal of
Quality and Reliability Management, 19(7): 862-888.
3. De Graaff, H. (2003). HCI index. http://degraaff.org/hci/
4. Dunn-Rankin, P. (1983). Scaling Methods. Lawrence Erlbaum Associates, Hillsdale, New Jersey.
5. Ericsson, M. (2003). HCI resources: Guidelines, styleguides, standards. http://www.ida.liu.se/~miker/hci/guidelines/
6. Gibbs, S. and Szentivanyi, G. (2003). Index to multimedia information sources. http://viswiz.gmd.de/MultimediaInfo/
7. Hom, J. (2003). The usability methods toolbox. http://www.best.com/~jthom/usability/usable.htm
8. Instone, K. (2003). Usable web. http://www.usableweb.com/
9. Liu, Y. (2001). Engineering aesthetics and aesthetic ergonomics: A dual-process methodology and its applications.
Proceedings of the International Conference on Affective Human Factors Design, pp. 248-255.
10. Liu, Y. (2003). Engineering aesthetics and aesthetic ergonomics: A dual-process methodology and its applications.
Ergonomics, (in press).
11. Marchionini, G. and Shneiderman, B. (1988). Finding facts vs. browsing knowledge in hypertext systems. IEEE
Computer, 21(3): 70-79.
12. Marion, C. (2003). Software design smorgasbord. http://www.chesco.com/~cmarion/
13. The New York Times on the Web. (2003). http://nytimes.com/
14. Nielsen, J. (2003). Jakob Nielsen on Usability and Web Design. http://www.useit.com/
15. Nielsen//NetRatings: The global standard for digital media measurement and analysis. (2003). http://www.nielsennetratings.com/news.jsp?section=dat_gi
16. Nua. (2003). Online Internet surveys, demographics, statistics and market research.
http://www.nua.ie/surveys/index.cgi
17. Oliver, K. (2003). Web Interface Design. http://www.edtech.vt.edu/edtech/id/interface/index.html
18. Perlman, G. (2003). HCI Sites. http://www.hcibib.org/hci-sites/
19. Turner, S. (2002). The HEP test for grading Web site usability. Computers in Libraries, 22(10): 37-39.