Guidelines: Web Data Collection for Understanding and
Interacting with Your Users
Judith Ramey, University of Washington
SUMMARY
The global growth of the World Wide Web challenges technical communicators to
reconsider the methods we use to create designs that meet the goals and needs of our
users. This article focuses on taking advantage of the Web’s potential for interactivity
between designers and users. It offers strategies for getting data from users of Web sites
and using it for two main purposes: (1) analyzing audience and patterns of use to support
continuous redesign, and (2) building a relationship or sense of community on a Web site.
WEB DATA COLLECTION: AN INTRODUCTION
Audience analysis has, from the earliest days of technical communication, been regarded
as essential to ensuring that an information design will meet the needs of its
intended users or readers. Historically, audience analysis has been completed early in the
design process, so as to guide basic design choices, and has typically yielded a document
enumerating audience characteristics and the design strategies selected to respond to
them. This document is then kept on hand to be consulted during the rest of the design
process.
But in designing for the World Wide Web, the pace has become so accelerated, the
audiences so diverse, and audience needs so mutable that we need to reconsider how we
do audience analysis. And in fact, the medium itself offers us new answers. When
designing for the World Wide Web, technical communicators can be in direct contact
with their users/readers on an ongoing basis. Thus they have the opportunity to monitor
their audiences continuously and adjust their designs based on what they learn. Further,
they have the opportunity to go beyond traditional audience analysis to actually engage
their users/readers in direct communication of various kinds, thus creating new kinds of
immediate, interactive relationships with them.
These guidelines aim to guide technical communicators in the process of understanding
the power of direct, ongoing contact with users/readers and the possibilities for design
that can result from it. This area is new enough that much of the background information
appears online rather than in print. Also, most of it focuses on marketing or technical
issues rather than rhetorical issues like audience analysis. In the following discussion the
references to supporting literature often take the form of a uniform resource locator (URL),
although where printed sources exist they are cited, and more no doubt will have
appeared by the time of publication.
The design of these guidelines
These guidelines focus on strategies for getting data from users of existing informational
Web sites and using it for two main purposes: (1) analyzing audience and patterns of use
to support continuous redesign, and (2) building a relationship or sense of community,
either between you and your users or among the users themselves.
Although both of these purposes require you as designer to consider real user behavior,
the designer's task is quite different in the two cases. The first case is the more elementary
and the closer to the goals and practices of traditional audience analysis. It calls for the
analysis of basic data available on any Web site, and of questionnaire responses, email,
and other direct user input, to draw conclusions about audience needs and patterns of use.
Meeting the second purpose goes beyond simple data analysis. It requires creative
rhetorical exploitation of the direct link to your users/readers and transforms data
exchange with users/readers into something much richer (Amkreutz 2000). But as you
work with even simple data from users, they become increasingly vivid and present to
you, and you can begin to see how much more you might do with the link you have to
them. The interactive nature of the Web in fact gives you the power to radically reimagine your relationship with your user or reader. Thus these guidelines first offer (in
Part One) a primer on Web-based audience analysis to support continuous redesign, and
then (in Part Two) go on to treat the more sophisticated relationships with users that you
can create.
Technical requirements for using these guidelines
Most of the guidelines in this set require that you work with the system administrator
responsible for your Web server (to extract the necessary data) and possibly with a
programmer (to implement data capture techniques). In most cases the guidelines require
you to use software that collects, manipulates, and/or graphically displays specific kinds
of data from the server log file (the record of activity on a site). Especially at the
beginning, while you are getting your tools and processes in place, these data analyses can
be very time-intensive. Thus your organization must endorse this user-centered approach
to Web site design and make a significant commitment of resources to implement it.
PART ONE: ANALYZING AUDIENCE AND PATTERNS OF USE
TO SUPPORT CONTINUOUS REDESIGN
You can use two main sources of data to analyze your audiences and their patterns of use:
data from server logs and data collected directly from your site visitors, for instance
answers to a questionnaire that you have posted on your site. You can use what you learn
to continually improve the fit between your site, its goals and purposes, and its users.
Important points to remember
Analysis of Web data does not substitute for doing initial audience analysis.
Web site design (like the design of other forms of communication) begins with a careful
analysis of the intended or expected audience(s), purposes, and uses. After the initial
release, however, Web statistics and Web survey data can provide the designer with a
dynamically emerging picture of site visitors and their patterns of use.

Web data from logs must be used cautiously.
Server log data reports on machines and transactions (individual requests for files), not
people and sessions (see below for more details).

Web data from user informants must also be used cautiously.
User informants (respondents to online questionnaires, visitors sending email, and other
voluntary providers of data) are self-selected and possibly not representative of your
broader audience(s).
Guidelines for analyzing audience and patterns of use by means of
server log data
The first three guidelines focus on the use of server log data for audience analysis and
analysis of patterns of use. These guidelines primarily support the detection and
diagnosis of problems. For those who are new to the idea of server log data, the next
section provides an introductory overview, followed by a brief list of products for
analyzing Web statistics.
Server log data: an introductory overview

When applying the guidelines that rely on server log data, it is important to remember the
problems and limitations of the data. First, the Web is “sessionless”: each transaction (file
request) is reported separately.
When a user types in a url for your Web site, or clicks on a link, each request for the file
or files associated with the link is treated as a single event not associated with any other
requests issued for files on the site. That is, requests for files are tracked, not users.
Second, the log recognizes visits by specific machine addresses, not specific people. The
number of visitors reported is affected (increased or decreased) by the use of dynamic
Internet Protocol (IP) addresses, proxy servers, and caching. In the case of dynamic IP
addresses, a single user at a single machine might actually be using more than one IP
address. For instance, in a lab with numerous workstations, the lab manager might figure
that not everybody will want to be on the Web at the same time, and thus might set up a
small pool of IP addresses to be assigned as needed. A user’s machine might thus be
assigned a different IP address for each Web transaction. Alternatively, in the case of
proxy servers, all internet traffic in an organization might be channeled through a server
used as a “stand-in” IP address (often the case with “firewalls” and other company
security arrangements). In this case, all the hits to your Web site from all the people in
that organization would show the same IP address. Also, when your user requests a file
from your server, it is transmitted to your user’s computer and typically is stored in a
cache, a temporary storage file. If the user returns to that file (page), his or her computer
might retrieve it from the local cache rather than from your site, in which case your server
log file would not record the transaction.
Third, the numbers of “hits” generally report the number of files requested, which may or
may not correspond to pages (a “page” on your site might for instance contain several
graphic images stored in separate files, each of which is logged separately).
Thus, reaching conclusions about your users and what they are doing requires you to
make complicated logical links and inferences that can take you far from the actual data
at hand. The greater the distance between the actual data and the conclusions that you
draw from them, the less reliable and certain your conclusions are and the more caution
you need to exercise in making design decisions based on them.
Keeping in mind these challenges to interpreting server log data, let’s look in more detail
at the data reported in a server log. Log files follow one of two formats: common log
file (CLF) format and extended log file (ELF) format.
The common log file (CLF) format basically records the date and time of the transaction,
the IP address of the remote host, the file that was requested, the size of the file in bytes,
and the status of the request (e.g. “404, file not found”). Table 1 shows a small sample of
server log data from a University of Washington informational Web site about arthritis
(Macklin, Turns, and Shelton 2000). Each row in the table corresponds to a hit. The first
two columns indicate the date and time of the hit; the third column contains the IP
address of the requester. The last column indicates the resource that was requested. (In
this analysis they did not track the number of bytes transferred.)
Table 1: Sample common log file format

Date     Time      Remote host               Resource requested
1/1/96   0:09:30   dial18.chemek.cc.or.us.   :bonejoint:kkakkkkk2_1.html
1/1/96   0:09:32   dial18.chemek.cc.or.us.   :bonejoint:gif:Clip.GIF
1/1/96   0:10:03   dial18.chemek.cc.or.us.   :bonejoint:mov:ScopeACLTear.mov
1/1/96   0:10:47   dial18.chemek.cc.or.us.   :bonejoint:mov:ACLgraft/mov
1/1/96   0:10:56   dial18.chemek.cc.or.us.   :bonejoint:Arthritis.idx.html
1/1/96   0:13:01   pm5-00.magicnet.net       :bonejoint:nzzzzzzz1_2.html
1/1/96   0:15:00   pm5-00.magicnet.net.      :bonejoint:xzzzzyzz1_1.html
The size of the actual log file is suggested by the very small amount of time that elapsed
between hits on this site; imagine the possible size of a file covering for instance a full
day. Note that these seven entries report transactions with only two different IP
addresses; given the small amounts of time involved, you might decide to interpret these
as two visitors. Also note that the requests are for different kinds of files: formatted
pages (“html”), graphics (“GIF”), and animations (“mov”).
By analyzing these log entries, you can examine the pattern and consistency of use over
time (monthly or daily statistics, day of week statistics, or even hourly statistics), the
origins of hits to your site, the resources most often consulted (say, the top five files by
number of hits or the top five most frequently requested “404” files), number of apparent
repeat visitors, etc.
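To make this kind of tallying concrete, the following minimal sketch (in Python, offered only as an illustration and not as part of any particular log-analysis product) counts hits per resource, apparent visitors, and busy hours. It assumes a simplified whitespace-delimited export like the sample in Table 1, with date, time, remote host, and requested resource on each line; the file name access_log.txt is a placeholder.

```python
from collections import Counter

hits_by_hour = Counter()
hits_by_resource = Counter()
hosts = set()

# Assumed format: one hit per line, "date time host resource", as in Table 1.
with open("access_log.txt") as log:
    for line in log:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        date, time, host, resource = fields[:4]
        hour = time.split(":")[0]
        hits_by_hour[(date, hour)] += 1   # pattern of use over time
        hits_by_resource[resource] += 1   # resources most often consulted
        hosts.add(host)                   # apparent (not certain) visitors

print("Apparent visitors (unique hosts):", len(hosts))
print("Five most requested resources:", hits_by_resource.most_common(5))
print("Busiest hours:", hits_by_hour.most_common(5))
```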
The extended log file (ELF) format includes other data points about visits and visitors,
such as the visitor’s browser and platform and the Web site that the visitor is coming
from, called the “referring page” (plus the search term the visitor used, if he or she arrived
from a search engine or directory). The ELF format can include “cookie” information as well,
if available. A “cookie” is a small file that records a visiting computer’s activity on a site.
When you visit a site (and you have not set a preference in your browser to refuse
“cookies”), the site server can transmit a cookie file that records the files that you have
requested. The cookie is placed in a folder on your computer; if you visit the site again,
the server requests the cookie file that it sent to your machine before and updates it with
data about your current visit. In this way the server can build up a historical record of
your computer’s actions on the site. (There are privacy concerns with the use of cookies
that will be discussed in more detail later.)
Neither the CLF nor the ELF format records the search terms that visitors enter into your
site’s own internal search engine. To get that data, you need to have your system
administrator set up your server so that the search engine reports all search terms to a file
that you can then analyze.
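One simple way to do this, sketched below in Python under the assumption that you can wrap whatever function actually performs the search, is to append every query to a plain-text file before handing it to the search engine. The function names and the log file name are illustrative only.

```python
import datetime

SEARCH_TERM_LOG = "site_search_terms.log"  # illustrative file name

def log_search_term(term):
    """Append the query and a timestamp to a plain-text log file."""
    with open(SEARCH_TERM_LOG, "a", encoding="utf-8") as f:
        f.write(f"{datetime.datetime.now().isoformat()}\t{term}\n")

def logged_search(term, search_function):
    """Record the term, then run the site's real search function."""
    log_search_term(term)
    return search_function(term)

# Example use with a stand-in search function:
if __name__ == "__main__":
    results = logged_search("arthritis exercise", lambda query: [])
```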
Several case studies of groups that have used server log data for analysis of audience and
patterns of use have appeared in print and on the Web (to mention only a few, see Drott
1998; Nielsen 1999; Sullivan 1998; Yu et al. 1999; and Kantner 2000).
Software products for analyzing Web statistics

There are a number of products on the market that help you manipulate and visualize
Web server log data. These products differ
in the features that they offer; if your organization needs to acquire a product to
implement analysis of Web server log data, work with your technical staff to understand
your requirements so as to choose a tool that is right for your situation. These tools
change so rapidly that it is not possible to summarize their features accurately. Here is a
brief list of some tools (Macklin 2000); no endorsement of any product is implied:

Free Ware Stats Analysis, http://awsd.com/scripts/weblog/index.html
Bazaar Analyzer, http://www.bazaarsuite.com/
FastStats, http://www.mach5.com/fast/
FunnelWeb, http://www.activeconcepts.com/
Gwstat, http://www.ccs.cs.umass.edu/stats/gwstat/html/
Summary, http://www.summary.net/
Webalizer, http://www.webalizer.org/
WebTrends, http://www.webtrends.com/products/log/def
wwwstat, http://www.ics.uci.edu/pub/websoft/wwwstat/
For more information about currently available log analysis tools, see
http://www.uu.se/Software/Analyzers/Access-analyzers.html.
There are a number of resources, both print and online, that you can consult to learn more
about server log data and how it can be used (to mention only a few, see Buchanan and
Lukaszewski 1997; Stout 1997; Aviram 1998; Burke 1997; Goldberg 1999; Linder 1999;
Marketwave.com 1999; and Stehle 1999).
Armed with this understanding of the nature and limitations of Web statistics, we can
now turn to the three guidelines for analyzing audience and patterns of use based on them.
1. USING SERVER LOG DATA TO MONITOR YOUR AUDIENCE DEMOGRAPHICS
Use server log data to monitor your audience demographics, keeping in mind that
drawing conclusions about your audience demographics requires interpretation of the
data.
1.1 Analyze the IP addresses, translated into domain names or countries of origin,
for computers sending requests to your Web site server.
Determine what percentage of visits come from each of the various domains (indicated by
the extensions at the end of the names: .com for a business, .edu for an educational
institution, .gov for a government agency, etc.).
Determine what percentage of visits come from each country (.nl for The Netherlands,
.jp for Japan, etc.), and thus get a view of the international composition of your audience.
Compare to the initial assumptions that you made in your audience analysis. If you
discover a difference, consider how much and in what way your actual audience differs
from the audience you expected to get. If there is a difference, do you still want or need
to reach the audience that you originally targeted? If so, consider what you can do to
raise their awareness of your site or increase your site’s attractiveness to them. Or is your
current actual audience acceptable and productive for you? If so, identify any design
changes to your site that are required by their characteristics.
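The arithmetic involved is simple; the sketch below (Python, illustrative only) tallies hits by the final extension of each resolved host name, assuming your log-analysis step has already translated IP addresses into host names as in Table 1. The sample host list stands in for data read from your own log.

```python
from collections import Counter

# Stand-in for one resolved host name per logged visit:
hosts = [
    "dial18.chemek.cc.or.us",
    "pm5-00.magicnet.net",
    "some.host.example.edu",
]

# The last label of the host name is the top-level domain (.edu, .net, .us, ...).
domains = Counter(host.rstrip(".").rsplit(".", 1)[-1] for host in hosts)
total = sum(domains.values())

for domain, count in domains.most_common():
    print(f".{domain}: {count / total:.0%} of visits")
```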
1.2 Analyze the browsers or platforms being used by computers sending requests to
your Web site server.
Determine the technical composition and level of sophistication of your audience.
Compare to your initial assumptions.
If this is not the audience that you need to reach, consider how to make your site more
visible and attractive to the audience you want. Or, if your current actual audience is
acceptable even if unexpected, identify any design changes to your site that are required
by the actual browsers and platforms that they are using.
1.3 Analyze the number of unique IP addresses that visited your site and the
number of visits each made.
By (cautiously) assuming that each IP address is a single user, you can determine where
your audience falls on a continuum from heavy users to one-time visitors. Compare to
your initial assumptions.
Identify any design changes to your site that are called for by the pattern of visits of your
current actual audience. For instance, if you get mostly one-time visitors, do you clearly
announce the audience, purpose, and use of your site on each page?
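As a minimal sketch of this continuum, the Python fragment below counts logged hits per host as a rough proxy for visits and splits apparent visitors into one-time and repeat groups; as noted above, equating one host with one user is only an assumption.

```python
from collections import Counter

# Stand-in for one host name per logged hit, extracted from your server log:
hits = ["dial18.chemek.cc.or.us"] * 5 + ["pm5-00.magicnet.net"] * 2

# Hits per host serve here as a rough, cautious proxy for visits per visitor.
visits_per_host = Counter(hits)
one_time = sum(1 for n in visits_per_host.values() if n == 1)
repeat = len(visits_per_host) - one_time

print("Apparent visitors:", len(visits_per_host))
print("One-time visitors:", one_time, "| repeat visitors:", repeat)
```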
2. USING SERVER LOG DATA TO GET A GROSS VIEW OF PATTERNS OF USE ON YOUR SITE
Use server log data to monitor the patterns of use on your site. (Drawing conclusions
about patterns of use requires extensive interpretation because the actions of a visitor that
together make up that visitor's session on your site are each reported as a separate
transaction. Thus you can reason that two or more requests within a very short time by
the same IP address constitute a sequence of requests by a single user, but in fact you
can't be sure.)
2.1 Analyze the patterns in the dates and times of transactions.
Determine how even or uneven your level of use is, and what your periods of heaviest use
are.
Identify any design changes to your site that are called for by the pattern of visits of your
current actual audience.
2.2 Analyze the number of hits (files requested) and the number of page views
(which can be derived from your site structure by most of the software tools for
server log analysis).
Use this data to determine the amount of traffic on your site and the level of demand for
the various topics and types of content that you offer.
Compare these patterns to your initial assumptions. Do you see differences that call for
changes to the site’s design or content?
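If your tool does not separate the two for you, a rough page-view count can be derived by filtering hits down to requests for pages, as in this illustrative Python sketch; the extension list is an assumption to adapt to your own site structure.

```python
from collections import Counter

# Stand-in for the list of requested files taken from the server log:
requested_files = [
    ":bonejoint:kkakkkkk2_1.html",
    ":bonejoint:gif:Clip.GIF",
    ":bonejoint:mov:ScopeACLTear.mov",
    ":bonejoint:Arthritis.idx.html",
]

PAGE_EXTENSIONS = (".html", ".htm")  # which files count as "pages" on your site

hits = len(requested_files)
page_views = Counter(f for f in requested_files if f.lower().endswith(PAGE_EXTENSIONS))

print("Total hits:", hits)
print("Page views:", sum(page_views.values()))
print("Views per page:", page_views.most_common())
```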
2.3 Analyze the referring pages from which visitors come to your site.
Identify what sites your visitors are coming from. Are there patterns that you did not
expect to see? Are there design changes that you can think of that would better serve
visitors with these apparent interests or affiliations?
2.4 Analyze the amount of time spent on each page.
Using averages over long blocks of time to minimize the effects of disrupted user
attention (for instance, users leaving your page open while answering the phone),
determine which of your pages users appear to spend the most time on.
Use care in responding to this statistic. Although time spent on a page might indicate
interest, it might also indicate confusion or difficulty in understanding your content. This
statistic can be combined with other statistics (and results of user questionnaires and other
direct user queries) to clarify the user experience.
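Because the log records only request times, "time on page" has to be inferred from the gap between one request and the next from the same host, discarding very long gaps as probable interruptions. The sketch below (Python, illustrative only; the 30-minute cutoff is an arbitrary assumption) shows that inference.

```python
from collections import defaultdict
from datetime import datetime

CUTOFF_SECONDS = 30 * 60  # ignore gaps longer than 30 minutes (assumed interruption)

# (host, timestamp, page) tuples, stand-ins for parsed log entries:
entries = [
    ("dial18.chemek.cc.or.us", "1996-01-01 00:09:30", "kkakkkkk2_1.html"),
    ("dial18.chemek.cc.or.us", "1996-01-01 00:10:47", "ACLgraft.html"),
    ("dial18.chemek.cc.or.us", "1996-01-01 00:10:56", "Arthritis.idx.html"),
]

by_host = defaultdict(list)
for host, stamp, page in entries:
    by_host[host].append((datetime.fromisoformat(stamp), page))

time_on_page = defaultdict(float)
for visits in by_host.values():
    visits.sort()
    # The gap before the next request approximates time spent on the earlier page.
    for (t1, page), (t2, _) in zip(visits, visits[1:]):
        gap = (t2 - t1).total_seconds()
        if gap <= CUTOFF_SECONDS:
            time_on_page[page] += gap

for page, seconds in sorted(time_on_page.items(), key=lambda item: -item[1]):
    print(f"{page}: about {seconds:.0f} seconds in total")
```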
2.5 Analyze your most and least frequently visited pages.
Identify the pages that appear, based on the number of hits, to be relevant or interesting to
the greatest number of visitors, and those that appear to be of least interest.
Are there patterns that you did not expect to see? Are your visitors overlooking content
that your initial analysis suggested would be important to them or that you particularly
want them to see? Can you think of design changes that could redirect their attention?
2.6 Analyze the search terms used to hit your pages.
Identify the vocabulary that your site matches in searches. Compare to your initial
assumptions and to the whole set of terms used in your design.
Are terms that are important descriptors of some of your content not showing up in the
search terms that bring visitors to your site? You can work with your Web master and
system administrator to add keywords to your site that will be picked up by search
engines.
Are there other problems with the search terms that lead people to your site? By looking
at the search terms and referring pages, you may be able to identify ambiguous terms or
other terms that are leading you to get unproductive hits.
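Incoming search terms usually have to be pulled out of the referring-page URLs recorded in an ELF-format log. The Python sketch below shows the idea; the query-parameter names it checks ("q", "query", "p") are common conventions, not a complete list, and the sample referrers are invented.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Stand-ins for referring-page URLs taken from an extended (ELF) log:
referrers = [
    "http://search.example.com/search?q=knee+arthritis",
    "http://directory.example.org/find?query=acl+tear+surgery",
]

terms = Counter()
for ref in referrers:
    params = parse_qs(urlparse(ref).query)
    for key in ("q", "query", "p"):      # common, but not universal, parameter names
        for value in params.get(key, []):
            terms[value.lower()] += 1

print("Most common incoming search terms:", terms.most_common())
```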
2.7 Analyze the search terms used to search within your Web site.
Identify the vocabulary used by your visitors to look for the content they are trying to
locate on your site. (Remember, to analyze the search terms used on your site, you need to
work with your Web master or other technical staff to set up the search engine so that the
terms are also reported to a file.) Compare the terms actually being used to your original
assumptions (labels and titles that you use).
Are users apparently seeking content that you offer, but simply using different names for
it? Are users seeking content that you don’t offer but could? Are users thinking about
your content in ways that differ from your terminology or organization?
Consider whether you can modify your terminology so that it fits better with the way your
actual users are thinking, so as to reduce the number of unproductive searches and
improve access to your content.
2.8 Analyze the most frequent paths through your site.
Using larger blocks of data to minimize problems with interpretation, identify the
pathways that users most often follow through your site.
Remember, doing so requires software that traces a given IP address through a string of
link choices, clusters the results, and graphically displays the results as pathways with
frequencies. One such product, Link trakker, provides information on what search
engines and terms visitors use, paths they take through the site, and other sites that are
linking to your site (http://www.radiation.com/cgi-bin/trakker/secure/demoreport.cgi).
Keep in mind that this kind of tracking depends on making a number of inferences rather
than on certain knowledge. Be careful not to put more faith in the results than is
warranted.
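For readers who want to see what such software has to infer, here is a minimal Python sketch of the core step: grouping hits from one host into a "session" whenever they fall within a timeout of each other and counting page-to-page moves. The timeout and the sample data are assumptions, and the one-host-equals-one-user equation remains, as stated above, uncertain.

```python
from collections import defaultdict, Counter
from datetime import datetime

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity assumed to end a session

# (host, timestamp, page) tuples, stand-ins for parsed log entries:
entries = [
    ("dial18.chemek.cc.or.us", "1996-01-01 00:09:30", "home.html"),
    ("dial18.chemek.cc.or.us", "1996-01-01 00:10:47", "treatments.html"),
    ("pm5-00.magicnet.net", "1996-01-01 00:13:01", "home.html"),
    ("pm5-00.magicnet.net", "1996-01-01 00:15:00", "treatments.html"),
]

by_host = defaultdict(list)
for host, stamp, page in entries:
    by_host[host].append((datetime.fromisoformat(stamp), page))

transitions = Counter()
for visits in by_host.values():
    visits.sort()
    for (t1, page1), (t2, page2) in zip(visits, visits[1:]):
        if (t2 - t1).total_seconds() <= SESSION_TIMEOUT:
            transitions[(page1, page2)] += 1  # an inferred step on a path

print("Most frequent page-to-page moves:", transitions.most_common(5))
```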
3. COMBINING TYPES OF SERVER LOG DATA
By juxtaposing your insights from different parts of your server log data, you can draw
additional conclusions about your users and their needs. These additional conclusions
can point you to specific ways to improve the usefulness of your Web site.
3.1 Combine types of server log data to draw conclusions about users and user
needs to support strategic decisions about revision and redesign.
You can get more power out of your analyses if you combine or juxtapose the findings
related to different types of server log data. For example, compare the search terms used
on your site to the terminology you use; combine that data with data about the number of
hits on pages containing related content.
3.2 Track the effects of your changes by monitoring new server log data in the same
categories.
After you have redesigned your pages based on what the server log data told you, monitor
new server log data in the same categories to see the effects of your changes.
Examples:
As suggested above, you might observe that a significant block of users is using the same
set of terms to search within your Web site, and you might learn by inspecting your site’s
terminology that your users appear to have a vocabulary for your content that differs from
yours. After changing your vocabulary, you may learn by monitoring the search terms
used by visitors coming in to your site that you are now getting more hits on your new
terms.
You might compare users’ search terms to your most and least popular pages and
pathways. You might then change your terms and monitor patterns of use to see if you
have raised the visibility of your “hidden” content.
You might study the data about your most and least popular pages compared to the
ordering and grouping of topics in your menu hierarchy. You might then reorganize your
menu structure and monitor the server log data to see if the pattern of use changes.
If you are located in the U.S.A. and you find that your site is heavily visited from Asian
countries, and (perhaps as a result) the site has its heaviest period of use late at night, you
might reschedule routine maintenance that otherwise could reduce access.
Guidelines for analyzing audience and patterns of use by means of data
from user informants
Beyond using server log data, the second major way to analyze audience and patterns of
use on your Web site is to collect data directly from the visitors to your site–“user
informants.” The next three guidelines focus on collecting and using this kind of data.
Notes
Data from user informants is often used for marketing purposes, for instance to build
demographic profiles, conduct e-commerce, and document levels of traffic in order to sell
advertising. But data from user informants can also be collected and analyzed to provide
direction for improvements to a site's design.
Remember that user informants (respondents to online questionnaires, visitors sending
email, and other voluntary providers of data) are self-selected and possibly not
representative of your broader audience(s). Also remember that people may not have a
clear understanding of the question you are asking, may not be highly articulate, and may
not necessarily report factually or fully.
4. USING DATA PROVIDED BY USER INFORMANTS TO IMPROVE YOUR PICTURE OF AUDIENCE DEMOGRAPHICS AND PATTERNS OF USE
Visitors to your site can cooperate more directly in producing data to help you understand
your audiences and their needs. They can give you many kinds of data, varying greatly in
the effort required on their part. Each approach also imposes different technical
requirements on your site design and requires a different level of trust and willingness to
participate from your users.
A Word about Privacy
The information in cookies and other user data is often used to create profiles of typical
users and patterns of use. Profiling generally aggregates information about users, so that
no individual is named or tracked. But (by using cookies as well as passwords, for
instance) some profilers do link individual users and their names and addresses to their
specific purchases, so as to sell targeted ads. That is, you can compare the behavior of a
new visitor to patterns of behavior that you have seen over large numbers of visitors.
Where you see a match, you can tell the new visitor about the behavior of people "like"
him or her (for instance, you can say that people who liked this book or movie–the one
the visitor is looking at–also liked a second one, which you can then offer to sell them).
A second form of profiling, called collaborative filtering, goes farther. Rather than
simply inferring visitors’ preferences by observing what they do, collaborative filtering
asks the visitor to overtly rate or rank choices. For instance, on the site
http://www.moviecritic.com, the visitor fills out an attitude questionnaire about a handful
of movies, and on the basis of that indication of his or her taste, the site recommends
other movies.
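At its simplest, the "people who liked this also liked that" mechanism can be sketched as a co-occurrence count over visitors' ratings, as in the Python fragment below; the ratings data is invented purely for illustration, and real recommendation systems are considerably more sophisticated.

```python
from collections import defaultdict, Counter

# Invented data: visitor -> set of items that visitor rated highly.
ratings = {
    "visitor1": {"movie_a", "movie_b", "movie_c"},
    "visitor2": {"movie_a", "movie_c"},
    "visitor3": {"movie_b", "movie_c"},
}

# Count how often each pair of items is liked by the same visitor.
co_occurrence = defaultdict(Counter)
for liked in ratings.values():
    for item in liked:
        for other in liked - {item}:
            co_occurrence[item][other] += 1

def recommend(item, n=2):
    """Items most often liked by visitors who also liked `item`."""
    return [other for other, _ in co_occurrence[item].most_common(n)]

print("If you liked movie_a, you may also like:", recommend("movie_a"))
```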
The power of profiling, collaborative filtering, and other user data analysis is just
beginning to be understood (e.g., Gladwell 1999), but in any case it raises serious issues
about users’ privacy. It is important to have a clear policy about privacy rights and adhere
to it strictly.
4.1 If you collect user data, put a privacy policy statement on your Web site that
clearly informs your users about the data you are collecting and the use you are
making of it.
TRUSTe is an independent, non-profit initiative whose mission is to build users’ trust
and confidence in the Internet by promoting the principles of disclosure and informed
consent. In the privacy policy statement that they recommend, the visitor can expect to be
notified of what information is gathered/tracked, how the information is used, and whom
the information is shared with. The TRUSTe site (http://www.truste.org/) provides
extensive information about privacy statements. TRUSTe has come to be a sort of
watchdog for the internet; AOL and other major internet service providers post a privacy
policy on their Web site and submit to auditing by TRUSTe.
4.2 Analyze data from cookies (files stored on the visitor’s computer that
accumulate a history of the computer’s activity on your site).
This is the least complicated cooperation that you can ask of your visitor; the only user
efforts involved in the use of cookies are leaving the computer set to accept them and not
throwing them away. By updating the cookie each time that computer requests a file from
your site, your server can add to a record of the computer's activity on your site: sequence
of pages visited, participation in interactive features, transactions, etc.
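The exchange itself is simple, as the following Python sketch of the mechanism suggests: the server issues an identifier the first time a browser arrives and recognizes it on later requests. It uses the standard http.cookies module, but the header handling is deliberately simplified; a real site would do this inside its Web server or application framework.

```python
from http.cookies import SimpleCookie
import uuid

def handle_request(cookie_header):
    """Return (visitor_id, Set-Cookie header or None) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookie:
        return cookie["visitor_id"].value, None   # returning visitor
    visitor_id = uuid.uuid4().hex                 # first visit: issue an id
    new_cookie = SimpleCookie()
    new_cookie["visitor_id"] = visitor_id
    return visitor_id, new_cookie.output(header="Set-Cookie:")

# First visit: the browser sends no cookie, so the server sets one.
vid, set_cookie = handle_request(None)
print(set_cookie)

# Later visit: the browser sends the cookie back and is recognized.
vid_again, _ = handle_request(f"visitor_id={vid}")
print("Recognized returning visitor:", vid_again == vid)
```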
4.3 Ask your users to use a user i.d. and password on your site.
By asking (or requiring) your users to create and use a user i.d. and password, you can
track the activities of individuals as opposed to the computers that they are using. Thus
you can raise your confidence that the patterns of behavior that you are seeing on your
Web site do reflect the real behavior of users.
By aggregating these patterns of use, you can identify the main types of activity on your
site. For instance, if you have a cooking site, do most users go first to the recipe pages?
Do they search or browse? Do they check the seasonal availability of fresh ingredients as
they look at recipes? Do they consult the bulletin board for comments from other
user/cooks? By analyzing such patterns, you can draw conclusions about your users’
goals and interests (Dervin 1989).
4.4 Give users the opportunity to refuse dialog with you.
You can also use what you learn from user i.d.s and passwords to present individual users
with messages tailored to their apparent characteristics, tastes, and interests. Many users
will find such messages helpful and interesting; others will not. If you decide to display
targeted messages, give users the option to reject any one message and to turn off the
entire messaging activity.
4.5 Ask your users to fill out online questionnaires.
You can invite users to respond to online questionnaires about any issues on which you
want feedback. The shorter the questionnaire, the more likely users are to complete it. The same
rules of design used for paper questionnaires govern online ones.
4.6 On every page, offer your users the opportunity to send you email.
You can provide an email link to allow users to send the Web master questions,
comments, and other feedback.
If you do, you must be prepared to respond to the emails in a reasonable period of time. It
is possible to first send an automatic reply that tells the user when and how to expect an
answer. One somewhat less labor-intensive way to respond to emails is to aggregate the
answers and post them on a bulletin board or Frequently Asked Questions page on your
Web site.
Email from your users can be very powerful at getting you outside the confines of your
own thinking. Often unstructured input from users can help you discover creative or
radical solutions that would never have occurred to you otherwise. The tradeoff is that
this kind of input is time-consuming to respond to and challenging to control (for
instance, you don’t want to get locked into an extended email discussion about a problem
beyond your control).
4.7 Conduct “remote usability tests” over the Web.
A record of an individual’s session on your Web site is essentially a remote usability test,
but without the thinking out loud that can tell you what the user wanted to do, what
assumptions he or she was making while working, and other insights into his or her thinking.
You can get that important dimension of user behavior using technology as simple as the
telephone to have them talk you through a session as you follow along on your own
computer.
5. COMBINING TYPES OF USER INFORMANT DATA
Again, by juxtaposing your insights from different kinds of user-supplied data, you can
draw additional conclusions about your users and their needs. These additional
conclusions can point you to specific ways to improve the usefulness of your Web site.
5.1 Combine types of data provided by user informants to draw additional
conclusions about users and user needs.
Examples
You can compare the questions and requests for information contained in email from your
users to the pathways they follow (provided by tracking the actions of users identified by
user i.d.'s) and their self-descriptions (provided by their answers to pop-up
questionnaires).
You can combine data from cookies, user i.d.'s and passwords, and popup questionnaires
about the users' level of familiarity with your content to build a profile of the
behavior typical of beginners (e.g., search terms used, if any; pages visited).
6. COMBINING DATA FROM WEB SOURCES WITH OTHER DATA AVAILABLE IN YOUR ORGANIZATION
Continue to explore ways to combine the data available to you to support design
improvements. Don't overlook other data available in your organization derived from
sources other than the Web site itself.
6.1 Combine data from server logs and user informants with other data available in
your organization to support strategic decisions about revision and redesign. Track
the effect of your changes by monitoring new data in all categories.
Examples
Compare problem reports from your customer support organization to your record of
search terms used on your site to identify new content that should be added to your site.
Examine the search terms that triggered hits to your site. Construct a pop-up
questionnaire to ask visitors about their level of interest in related topics, appropriate to
your site's scope and purpose, which a search engine would not find on your site. If the
feedback warrants it, add the new content.
PART TWO: BUILDING A RELATIONSHIP OR SENSE OF
COMMUNITY
We started by saying that these guidelines would focus on strategies for getting data from
users and using it for two main purposes: (1) analyzing audience and patterns of use to
support continuous redesign, and (2) building a relationship or sense of community, either
between you and your users or among the users themselves. We now turn to focus on
this second main purpose, building community, often described as one of the most
powerful effects of Web communication.
Meeting this second purpose goes beyond analysis of audience and patterns of use to
examine how to manage the link between you and your users so as to build a relationship
or sense of community. We have in fact already talked about using this link for data
collection in which the user gives data to the designer. These methods actually already
imply a certain kind of relationship between the designer and user.
In the case that we first considered, in which the user is passive or even unaware that data
is being collected (use of server logs), we can say that the user experiences the Web site
as an inanimate product. The designer sees himself or herself as an analyst examining
data from a subject (the user). In the case of the user who actively takes part in the data
exchange (for instance, by sending email), the roles and relationship of designer and user
are very different. The user may still perceive the Web site as an inanimate product, but
now he or she has an awareness of the designer as a person, the producer of the site, and
of him- or herself as at least at some level a contributor to producing the site. Thus there
is a much richer human relationship involved in this exchange.
Type of interaction: user to designer

Examples: Data from both server logs and user informants.

Relationship established:
1. If user is passive or unaware (server log data, cookies): Web site perceived as
inanimate product, designer is "analyst," user is "subject"
2. If user is active (questionnaire respondent, emailer): Web site is perceived as
inanimate product, (human) designer is acknowledged as its producer, user is
"contributor" to the site
But providing the designer with data is only one form of user communication (user-to-designer) that your site can offer. You can also offer numerous other forms: designer-to-user, designer-to-group, user-to-user, user-to-group. And these different types of
communication with users have different rules and create different relationships.
When the designer responds directly to communications from users, more complex
relationships become possible. Let’s consider the simplest case where the designer
presents the user with plain reportage, for instance a list of his or her recent actions. This
response, even though still quite impersonal, at least involves the closure of the feedback
loop–the user took actions, the designer analyzed the actions, and then the designer
reflected the record of the actions back to the user. The user may continue to perceive the
Web site as just a product, but the “breadcrumb trail” information offered by the designer
may lead the user to think of the designer as a more active partner in a communication
(“producer/communicator”). In the next more complex situation, in which the designer
responds directly and personally to the user (for instance, in a personal email), the
beginnings of a peer relationship begin to emerge: the designer and user are consulting,
are collaborating, on the design of the site.
Let’s take this process one step further and look at the case where the designer
communicates back to the entire group of users. Here, the feedback loop is closed not by
providing feedback to just a single user but by showing the whole group the aggregated
results of the communication or data collection/analysis. For instance, the designer might
present a chart of the distribution of user responses to an online questionnaire. The
designer is still in the role of producer/collaborator, but now the user becomes aware of
himself or herself as a member of a community, and the Web site begins to feel more like
a setting for interactions than like an inanimate product.
Type of interaction: designer to user

Examples: Display of former user actions (for instance, topics consulted earlier); replies
from designer to user email.

Relationship established:
1. Impersonal response (e.g. topic list): Web site is a product, designer is
"producer/communicator," user is "participant"
2. Personal response (e.g. email reply): Web site is a product, designer is
"producer/collaborator," user is "collaborator"

Type of interaction: designer to group

Examples: Designer reflects data back to users, displaying charts or other reports of the
results of online questionnaires, posing new questions based on feedback from/dialog
with users, suggesting actions or options based on analysis of behavior of other similar
users ("if you liked this movie, you'll probably like these others;" selecting ads to show
based on user profile)

Relationship established: Designer is "producer/collaborator;" user is "collaborator,"
possibly with some feeling of "community member," Web site is a product and a setting
Once the user has become aware of the other users in the community, it is a short step to
go on to offer him or her the option of communicating directly user-to-user. You may
choose to offer a bulletin board where users can leave postings, or a chatroom where
users can interact with each other more casually. Now the relationship between designer
and user has changed radically; the designer now has become the enabler of
communication among users, the Web site is the setting, and the users are themselves
creators of content and even community.
Type of interaction: user to user

Examples: User replies directly to another user's posting on a bulletin board; user replies
to another user's query or ad.

Relationship established: Web site is a setting, designer is "enabler," focus on designer
displaced by focus on user as "co-creator" of use and content

Type of interaction: user to group, group to user

Examples: User selects an audience or topic (for instance, on a bulletin board) and
addresses all other users associated with it (e.g. by posting a message). Users post
messages for or against positions taken or attitudes expressed by one or more other
users. Users confirm group identity.

Relationship established: Web site is a setting, designer is "enabler," focus on designer
displaced by focus on user as "creator" of use, content, and community
Note that these radically different imaginings of the purpose and use of a Web site derive
directly from the designer’s choices about communication. By choosing communication
roles and relationships for themselves and for the users, Web site designers define the
scope of the human dimension of the Web.
Guidelines for building a relationship or sense of community
The final three guidelines focus on your selection and management of the roles and
relationships set up by your choice of modes of interaction and communication.
7. OFFERING FORMS OF INTERACTING WITH USERS THAT ARE
APPROPRIATE FOR AND CONSISTENT WITH YOUR SITE'S INTENDED
AUDIENCES, PURPOSES AND USES
We have said that different choices about how you communicate with your users create
different roles and relationships. Not all roles and relationships are appropriate for every
audience, purpose, or use of every Web site.
If a site claims that it is a forum for a particular interest group, does it offer the members
of that group a mechanism for posting content? If the site represents itself as an advocate
of a group (members of a social group, for instance), it should offer the members of that
group an egalitarian setting to encourage wide participation.
If a site, on the other hand, claims that it presents authoritative information, does it
nevertheless allow users to post unverified information? Generally, a site that wants to
maintain an authoritative persona would welcome questions but not enable unscreened
postings of information. It might, however, enhance community feeling by posting human-interest stories or case studies (often done, for instance, on health or education sites).
A site can have areas that differ as to what forms of interaction are appropriate. For
instance, a health site might have a “news” area that is quite authoritative and a “chat”
area in which visitors can exchange feelings and stories. If a site has areas that differ in
this way, the site design should clearly indicate moves from one area to another.
8. CONSIDERING WHETHER YOUR INTERACTION DESIGN IS APPLIED
CONSISTENTLY, AND WHETHER YOU MAINTAIN THE RESULTING
DESIGNER/USER ROLES CONSISTENTLY ACROSS OTHER DIMENSIONS
OF YOUR DESIGN (FOR INSTANCE, TONE)
Roles and relationships are fragile and can be undermined or undone by abrupt departures
from the overall pattern or user expectation.
8.1 Make the communicative tasks and opportunities of the reader as clear and
explicit as possible.
Do you use one or more forms of interacting with your users? Is each type designed in
the same way everywhere that it is used? If not, can you justify the difference? If you use
more than one type, do the types work together without creating conflicting roles and
relationships for users? Are the relationships implied by your choice of forms of
interaction with users maintained consistently across your site?
8.2 Inspect your site to confirm that all of your design choices are working together.
The forms of interactivity with users that you employ create roles and relationships
between and among designers and users, but other dimensions of your design also
contribute to the creation of roles and relationships (see Coney and Steehouder,
Guidelines: Reader Roles). Inspect your site to confirm that all your design choices are
working together.
Does your choice of tone (for instance, authority speaking in a formal tone versus peer
speaking in a familiar tone) work with the roles and relationships created by the form of
interactivity you are using? Do the other dimensions of your Web site maintain
relationships between designer and user (and among users themselves) that are consistent
with those created by the forms of interactivity that you provide to the user? Or do you
change the roles allowed users with respect to designers and other users? If you do
change the roles allowed to users, are the changes appropriate and justifiable, and on what
grounds?
9. DECIDING HOW EXPLICIT YOU WILL MAKE THE DESIGNER/USER
ROLES AND RELATIONSHIPS CREATED BY YOUR CHOICE OF
INTERACTIVITY WITH THE USER, AND IDENTIFYING THE DESIGN
CHOICES AND MOTIFS THAT YOU WILL USE TO REINFORCE THEM
Choose the extent to which you want to reveal and explicitly reinforce the roles and
relationships created by your choice of forms of interactivity.
If you do not want to emphasize the relationship of designer to user or user to user on
your site, consider whether you have chosen forms of interactivity with your users that
create expectations that you do not intend to or cannot meet. Consider choosing forms of
data collection and interactivity more consistent with the relationship with your user that
you want to maintain.
If you do want to emphasize this aspect of your site, consider using labels, design layouts,
or other design motifs to draw attention to the roles and relationships. If you use terms or
motifs that draw attention to or showcase user participation (user profiles, summaries of
responses, chatrooms, etc.), use them consistently across the site.
Quicklist: Web Data Collection for Understanding
and Interacting with Your Users
This guideline discusses Web data collection for understanding and interacting with your
users in two main parts: (1) analyzing audience and patterns of use to support continuous
redesign and (2) building relationship and community on your Web site.
Four Considerations to Keep in Mind About Web Data Collection
Analysis of Web data does not substitute for doing initial audience analysis.
Web site design (like the design of other forms of communication) begins with a careful
analysis of the intended or expected audience(s), purposes, and uses. After the initial
release, however, Web statistics and Web survey data can provide the designer with a
dynamically emerging picture of site visitors and their patterns of use.

Web data from logs must be used cautiously.
Server log data reports on machines and transactions (individual requests for files), not
people and sessions (see below for more details).

Web data from user informants must also be used cautiously.
User informants (respondents to online questionnaires, visitors sending email, and other
voluntary providers of data) are self-selected and possibly not representative of your
broader audience(s).

Collecting and interpreting Web data requires technical support.
Most of the guidelines in this set require that you work with the system administrator
responsible for your Web server (to extract the necessary data) and possibly with a
programmer (to implement data capture techniques). In most cases the guidelines
require you to use software that collects, manipulates, and/or graphically displays
specific kinds of data from the server log file (the record of activity on a site).
PART ONE: ANALYZING AUDIENCE AND
PATTERNS OF USE TO SUPPORT CONTINUOUS
REDESIGN
Part One focuses on guidelines for analyzing audience and patterns of use to support
continuous redesign. It covers the two approaches to collecting data for this purpose:
collecting data from Web server logs and collecting data directly from user informants.
Guidelines for Analyzing Audience and Patterns of Use by means of
Server Log Data
1. USING SERVER LOG DATA TO MONITOR YOUR AUDIENCE
DEMOGRAPHICS
Use server log data to monitor your audience demographics, keeping in mind that
drawing conclusions about your audience demographics requires interpretation of the
data.
1.1 Analyze the IP addresses, translated into domain names or countries of origin,
for computers sending requests to your Web site server.
1.2 Analyze the browsers or platforms being used by computers sending requests
to your Web site server.
1.3 Analyze the number of unique IP addresses that visited your site and the
number of visits each made.
2. USING SERVER LOG DATA TO GET A GROSS VIEW OF PATTERNS OF USE
ON YOUR SITE
Use server log data to get a gross view of patterns of use on your site.
2.1 Analyze the patterns in the dates and times of transactions.
2.2 Analyze the number of hits (files requested) and the number of page views
(which can be derived from your site structure by most of the software tools for
server log analysis).
2.3 Analyze the referring pages from which visitors come to your site.
2.4 Analyze the amount of time spent on each page.
2.5 Analyze your most and least frequently visited pages.
2.6 Analyze the search terms used to hit your pages.
2.7 Analyze the search terms used to search within your Web site.
2.8 Analyze the most frequent paths through your site.
3. COMBINING TYPES OF SERVER LOG DATA
Combine types of server log data to draw conclusions about users and user needs to
support strategic decisions about revision and redesign. Track the effects of your changes
by monitoring new server log data.
Guidelines for Analyzing Audience and Patterns of Use by means of
Data from User Informants
4. USING DATA PROVIDED BY USER INFORMANTS
Use data provided by user informants to improve the accuracy and detail of your picture
of audience demographics and patterns of use.
Note: It is important to have a clear policy about privacy rights and adhere to it
strictly.
4.1 If you collect user data, put a privacy policy statement on your Web site that
clearly informs your users about the data you are collecting and the use you are
making of it.
4.2 Analyze data from cookies (files stored on the visitor’s computer that
accumulate a history of the computer’s activity on your site).
4.3 Ask your users to use a user i.d. and password on your site.
4.4 Give users the opportunity to refuse dialog with you.
4.5 Ask your users to fill out online questionnaires.
4.6 On every page, offer your users the opportunity to send you email.
4.7 Conduct “remote usability tests” over the Web.
5. COMBINING TYPES OF USER INFORMANT DATA
5.1 Combine types of data provided by user informants to draw additional conclusions
about users and user needs.
6. COMBINING DATA FROM WEB SOURCES WITH OTHER DATA AVAILABLE
IN YOUR ORGANIZATION
6.1 Combine data from server logs and user informants with other data available in your
organization to support strategic decisions about revision and redesign. Track the effect
of your changes by monitoring new data in all categories.
PART TWO: BUILDING A RELATIONSHIP OR
SENSE OF COMMUNITY
Part Two focuses on guidelines for building a relationship or sense of community on your
Web site.
Guidelines for Building a Relationship or Sense of Community
7. OFFERING FORMS OF INTERACTING WITH USERS THAT ARE
APPROPRIATE FOR AND CONSISTENT WITH YOUR SITE'S INTENDED
AUDIENCES, PURPOSES, AND USES
Offer forms of interacting with users that are appropriate for and consistent with your
site's intended audiences, purposes, and uses.
8. CONSIDERING WHETHER YOUR INTERACTION DESIGN IS APPLIED
CONSISTENTLY, AND WHETHER YOU MAINTAIN THE RESULTING
DESIGNER/USER ROLES CONSISTENTLY ACROSS OTHER DIMENSIONS OF
YOUR DESIGN (FOR INSTANCE, TONE)
Consider whether your interaction design is applied consistently, and whether you
maintain the resulting designer/user roles consistently across other dimensions of your
design (for instance, tone).
8.1 Make the communicative tasks and opportunities of the reader as clear and
explicit as possible.
8.2 Inspect your site to confirm that all of your design choices are working
together.
9. DECIDING HOW EXPLICIT YOU WILL MAKE THE DESIGNER/USER ROLES
AND RELATIONSHIPS
Decide how explicit you will make the designer/user roles and relationships created by
your choice of interactivity with the user, and, if you want to emphasize them, identify the
design choices and motifs that you will use to reinforce them.
REFERENCES
Amkreutz Boyd, Suzanne (2000). Practitioners' review of Web guidelines. Master's
thesis, Department of Communication, University of Washington, Seattle, Washington.
(I would also like to thank Suzanne for the extensive support, over more than a year’s
duration, that she provided me and my colleagues during the development of these
guidelines.)
Aviram, Mariva H. (1998). “Analyze Your Web Site Traffic,” Builder.com, February 3, 1998,
http://www.builder.com/Servers/Traffic/
Buchanan, R.W. & Lukaszewski, C. (1997). Measuring the Impact of Your Web
Site: Proven Yardsticks for Evaluating. New York: John Wiley.
Burke, Raymond R. (1997). “The Future of Market Research on the Web: Who is
visiting your site?” Continuous Learning Project: Problems with Traditional
Measurement Techniques, Indiana University,
http://universe.indiana.edu/clp/or/future.htm
Dervin, Brenda (1989). “Users as research inventions: how research categories
perpetuate inequalities.” Journal of Communication, 39, 3, pp. 216-232.
Drott, M. Carl (1998). “Using Web Server Logs to Improve Site Design,” SIGDOC ’98
Conference Proceedings, pp. 43-50.
Esler, Mike, Katherine Puckering, Ryan Knutsen, Josh Cohen, Dorothy Lin, and Tristan
Robinson, “Privacy on the Web: Pro and Con,” seminar report for TC505, Computer-Assisted Communication, Autumn 1999. I would like to thank these students for
identifying the sources cited concerning privacy.
Gladwell, Malcolm (1999). “Annals of Marketing: The Science of the Sleeper (How the
Information Age could blow away the blockbuster).” New Yorker: October 4, 1999.
Goldberg, Jeff (1999). “Why Web usage statistics are (worse than) meaningless,”
Cranfield Computer Centre, Cranfield University, http://www.cranfield.ac.uk/docs/stats/
Kantner, Laurie (2000). "Assessing Web Site Usability from Server Logs," Common
Ground, newsletter of the Usability Professionals' Association, vol. 10, no. 1 (March
2000), pp. 1, 5-11. Also published as Tec-Ed, Inc. (1999), “Assessing Web Site Usability
from Server Log Files,” white paper prepared by Tec-Ed, Inc., PO Box 1905, Ann Arbor,
MI 48106, December 1999.
Linder, Doug (1999). “Interpreting WWW statistics,” National Archives and Records
Administration Web site, http://gopher.nara.gov:70/Oh/what/stats/webanal.html
Macklin, Scott, Jennifer Turns, and Brett Shelton (2000). Personal communication. I am
grateful to Scott Macklin, director of the University of Washington PETTT (Program for
Educational Transformation through Technology), and Jennifer Turns and Brett Shelton
of the same project, for allowing me to use a sample of their server log file.
Macklin, Scott (2000). Personal communication.
Marketwave.com (1999). “Web Mining: Going Beyond Web Traffic Analysis,” White
Paper–Web Statistics and Traffic Analysis Software, Tuesday, June 1, 1999.
http://www.marketwave.com/press/whitepaper.htm
Nielsen, Jakob (1999). “Collecting Feedback From Users of an Archive (Reader
Challenge),” Useit.com Alertbox, January 10, 1999.
http://www.useit.com/alertbox/990110.html
Stehle, Tim (1999). “Getting Real About Usage Statistics,”
http://www.wprc.com/wpl/stats.html
Stout, R. (1997). Web Site Stats: Tracking Hits and Analyzing Traffic.
Berkeley: Osborne/McGraw-Hill.
Sullivan, Terry (1998). “Reading Reader Reaction: A Proposal for Inferential Analysis
of Web Server Log Files,” U.S. West Web Conference: http://www.uswest.com/webconference/proceedings/rrr.html
Yu, Jack J., Prasad V. Prabhu, and Wayne C. Neale (1999). “A User-Centered Approach
to Designing a New Top-Level Structure for a Large and Diverse Corporate Web Site,”
Our Global Community Conference Proceedings,
http://www.research.att.com/conf/hfweb/proceedings/yu/index.html
BIOSKETCH
Judith Ramey, PhD, is professor and chair of technical communication at the University
of Washington. She edited a special issue of IEEE Transactions on Professional
Communication on usability testing in 1989. With Ginny Redish, she conducted a
research study on the value added by technical communicators to a product or process, the
results of which were published in a special section of Technical Communication in
1995. With Dennis Wixon, she co-edited a collection of essays entitled Field Methods
Casebook for Software Design, published by John Wiley and Sons in 1996. She is a
Fellow of STC.
jramey@u.washington.edu, (206) 543-2588