Uploaded by 432yesudas

hci uni 1-3 ans

Unit 1
2. Outline the factors that can limit the speed of an interactive computer system.
There are several factors that can limit the speed of an interactive system:
Computation bound, Storage, channel bound, Graphics bound, Network capacity.
3. Explain in detail about the Input-output channels of the Human.
A person’s interaction with the outside world occurs through information being received and
sent: input and output. In an interaction with a computer the user receives information that is
output by the computer, and responds by providing input to the computer– the user’s output
becomes the computer’s input and vice versa.
For example, sight may be used primarily in receiving information from the computer, but it can
also be used to provide information to the computer, for example by fixating on a particular
screen point when using an eyegaze system. Input in the human occurs mainly through the
senses and output through the motor control of the effectors.
There are five major senses: sight, hearing, touch, taste and smell. Of these, the first three are
the most important to HCI. Taste and smell do not currently play a significant role in HCI, and it
is not clear whether they could be exploited at all in general computer systems, although they
could have a role to play in more specialized systems (smells to give warning of malfunction, for
example) or in augmented reality systems. vision, hearing and touch are central.
There are a number of effectors, including the limbs, fingers, eyes, head and vocal system. In
the interaction with the computer, the fingers play the primary role, through typing or mouse
control, with some use of voice, and eye, head and body position.
Imagine using a personal computer (PC) with a mouse and a keyboard. The application you are
using has a graphical interface, with menus, icons and windows. In your interaction with this
system you receive information primarily by sight, from what appears on the screen.
4. Explain the model of the structure of human memory with diagrammatic illustration.
Our memory contains our knowledge of actions or procedures. It allows us to repeat actions, to
use language, and to use new information received via our senses. It also gives us our sense of
identity, by preserving information from our past experiences.
Memory is the second part of our model of the human as an information-processing system.
Memory is associated with each level of processing. Bearing this in mind, we will consider the
way in which memory is structured and the activities that take place within the system. It is
generally agreed that there are three types of memory or memory function: sensory buffers,
short-term memory or working memory, and long-term memory. There is some disagreement
as to whether these are three separate systems or different functions of the same system. It is
sufficient to note three separate types of memory. These memories interact, with information
being processed and passed between memory stores.
memories- Iconic,
Echoic, Haptic
memory or
working memory
Long term
5. List and explain the stages of Norman's model of interaction.
The stages in Norman’s model of interaction are as follows:
1. Establishing the goal.
2. Forming the intention.
3. Specifying the action sequence.
4. Executing the action.
5. Perceiving the system state.
6. Interpreting the system state.
7. Evaluating the system state with respect to the goals and intentions.
Norman uses this model of interaction to demonstrate why some interfaces cause
problems to their users. He describes these in terms of the gulfs of execution and the
gulfs of evaluation.
As we noted earlier, the user and the system do not use the same terms to describe the
domain and goals – remember that we called the language of the system the core
language and the language of the user the task language.
The gulf of execution is the difference between the user’s formulation of the actions to
reach the goal and the actions allowed by the system. If the actions allowed by the
system correspond to those intended by the user, the interaction will be effective.
The interface should therefore aim to reduce this gulf.
The gulf of evaluation is the distance between the physical presentation of the system
state and the expectation of the user.
If the user can readily evaluate the presentation in terms of his goal, the gulf of
evaluation is small.
The more effort that is required on the part of the user to interpret the presentation,
the less effective the interaction.
6. Explain the framework of Human computer interaction.
In particular, the field of ergonomics addresses issues on the user side of the interface,
covering both input and output, as well as the user’s immediate context.
Dialog design and interface styles can be placed particularly along the input branch of
the framework, addressing both articulation and performance.
However, dialog is most usually associated with the computer and so is biased to that
side of the framework.
Presentation and screen design relates to the output branch of the framework.
The entire framework can be placed within a social and organizational context that also
affects the interaction.
Each of these areas has important implications for the design of interactive systems and
the performance of the user.
7. Discuss the technologies involved in display devices?
1. Cathode ray tube
 The cathode ray tube is the television-like computer screen still most common as we
write this, but rapidly being displaced by flat LCD screens. It works in a similar way to a
standard television screen. A stream of electrons is emitted from an electron gun, which
is then focussed and directed by magnetic fields.
 As the beam hits the phosphor-coated screen, the phosphor is excited by the electrons
and glows.
 The electron beam is scanned from left to right, and then flicked back to rescan the
next line, from top to bottom. This is repeated, at about 30 Hz (that is, 30 times a
second), per frame, although higher scan rates are sometimes used to reduce the flicker
on the screen.
 Another way of reducing flicker is to use interlacing, in which the odd lines on the
screen are all scanned first, followed by the even lines.
 Using a high-persistence phosphor, which glows for a longer time when excited, also
reduces flicker, but causes image smearing especially if there is significant animation.
The CRT is a cheap display device and has fast enough response times for rapid
animation coupled with a high color capability.
CRT screen
2. Liquid crystal display
 If you have used a personal organizer or notebook computer, you will have seen the
light, flat plastic screens.
 These displays utilize liquid crystal technology and are smaller, lighter and consume far
less power than traditional CRTs.
 These are also commonly referred to as flat-panel displays.
 They have no radiation problems associated with them, and are matrix addressable,
which means that individual pixels can be accessed without the need for scanning.
 Similar in principle to the digital watch, a thin layer of liquid crystal is sandwiched
between two glass plates.
 The top plate is transparent and polarized, whilst the bottom plate is reflective. External
light passes through the top plate and is polarized, which means that it only oscillates in
one direction.
 This then passes through the crystal, reflects off the bottom plate and back to the eye,
and so that cell looks white.
 When a voltage is applied to the crystal, via the conducting glass plates, the crystal
twists. This causes it to turn the plane of polarization of the incoming light, rotating it so
that it cannot return through the top plate, making the activated cell look black.
 The LCD is less tiring to use than standard CRT ones, with reduced eyestrain.
3. Special displays
 There are a number of other display technologies used in niche markets. The one you
are most likely to see is the gas plasma display, which is used in large screens.
8. Explain the various interaction styles. (pg no. 136 [3.5])
Interaction can be seen as a dialog between the computer and the user. The choice of interface
style can have a profound effect on the nature of this dialog. There are a number of common
interface styles including
n command line interface
n menus
n natural language
n question/answer and query dialog
n form-fills and spreadsheets
n point and click
n three-dimensional interfaces.
9. Explain the Paradigms for Interaction.
1. Time sharing
 One of the major contributions to come out of this new emphasis in research was the
concept of time sharing, in which a single computer could support multiple users.
 Previously, the human (or more accurately, the programmer) was restricted to batch
sessions, in which complete jobs were submitted on punched cards or paper tape to an
operator who would then run them individually on the computer.
 Time-sharing systems of the 1960s made programming a truly interactive venture and
brought about a subculture of programmers known as ‘hackers’ – single-minded
masters of detail who took pleasure in understanding complexity.
 Though the purpose of the first interactive time-sharing systems was simply to augment
the programming capabilities of the early hackers, it marked a significant stage in
computer applications for human use.
 Rather than rely on a model of interaction as a pre-planned activity that resulted in a
complete set of instructions being laid out for the computer to follow, truly interactive
exchange between programmer and computer was possible.
 The computer could now project itself as a dedicated partner with each individual user
and the increased throughput of information between user and computer allowed the
human to become a more reactive and spontaneous collaborator.
 Indeed, with the advent of time sharing, real human–computer interaction was now
2. Video display units
 These display screens could provide a more suitable medium than a paper printout for
presenting vast quantities of strategic information for rapid assimilation.
 Sketchpad allowed a computer operator to use the computer to create, very rapidly,
sophisticated visual models on a display screen that resembled a television set.
The visual patterns could be stored in the computer’s memory like any other data, and
could be manipulated by the computer’s processor; But Sketchpad was much more than
a tool for creating visual displays.
 It was a kind of simulation language that enabled computers to translate abstractions
into perceptually concrete forms. And it was a model for totally new ways of operating
computers; by changing something on the display screen, it was possible, via Sketchpad,
to change something in the computer’s memory.
3. Programming toolkits
 Many of the ideas that Engelbart’s team developed at the Augmentation Research
Center – such as word processing and the mouse – only attained mass commercial
success decades after their invention.
 Engelbart wrote of how humans attack complex intellectual problems like a carpenter
who produces beautifully complicated pieces of woodwork with a good set of tools.
 The secret to producing computing equipment that aided human problemsolving ability
was in providing the right toolkit.
 Taking this message to heart, his team of programmers concentrated on developing the
set of programming tools they would require in order to build more complex interactive
 The idea of building components of a computer system that will allow you to rebuild a
more complex system is called bootstrapping and has been used to a great extent in all
of computing.
 The power of programming toolkits is that small, well-understood components can be
composed in fixed ways in order to create larger tools.
 Once these larger tools become understood, they can continue to be composed with
other tools, and the process continues.
4. Personal computing
 Alan Kay was profoundly influenced by the work of both Engelbart and Papert. He
realized that the power of a system such as NLS was only going to be successful if it was
as accessible to novice users as was LOGO.
 In the early 1970s his view of the future of computing was embodied in small, powerful
machines which were dedicated to single users, that is personal computers.
 Together with the founding team of researchers at the Xerox Palo Alto Research Center
(PARC), Kay worked on incorporating a powerful and simple visually based programming
environment, Smalltalk, for the personal computing hardware that was just becoming
 As technology progresses, it is now becoming more difficult to distinguish between what
constitutes a personal computer, or workstation, and what constitutes a mainframe.
 Kay’s vision in the mid-1970s of the ultimate handheld personal computer – he called it
the Dynabook.
5. Window systems and the WIMP interface
Humans are able to think about more than one thing at a time, and in accomplishing some
piece of work, they frequently interrupt their current train of thought to pursue some other
related piece of work. A personal computer system which forces the user to progress in order
through all of the tasks needed to achieve some objective, from beginning to end without any
diversions, does not correspond to that standard working pattern. One presentation
mechanism for achieving this dialog partitioning is to separate physically the presentation of
the different logical threads of user–computer conversation on the display device. The window
is the common mechanism associated with these physically and logically separate display
6. The metaphor
Papert used the metaphor of a turtle dragging its tail in the dirt. Children could quickly identify
with the real-world phenomenon and that instant familiarity gave them an understanding of
how they could create pictures. The danger of a metaphor is usually realized after the initial
honeymoon period. When word processors were first introduced, they relied heavily on the
typewriter metaphor. The keyboard of a computer closely resembles that of a standard
typewriter, so it seems like a good metaphor from which to start.
7. Hypertext
Hypertext is text which is not constrained to be linear. Hypertext is text which contains links to
other texts. The term was coined by Ted Nelson around 1965. HyperMedia is a term used for
hypertext which is not constrained to be text: it can include graphics, video and sound, for
example. Apparently Ted Nelson was the first to use this term too. Hypertext and HyperMedia
are concepts, not products.
8. Multi-modality
Genuine multi-modal systems rely to a greater extent on simultaneous use of multiple
communication channels for both input and output. Humans quite naturally process
information by simultaneous use of different channels. We point to someone and refer to them
as you‗, and it is only by interpreting the simultaneous use of voice and touch that our
directions are easily articulated and understood. Designers have wanted to mimic this flexibility
in both articulation and observation by extending the input and output expressions.
9. Computer-supported cooperative work
Personal computing provides individuals with enough computing power so that they were
liberated from dumb terminals which operated on a time-sharing system. It is interesting to
note that as computer networks became widespread, individuals retained their powerful
workstations but now wanted to reconnect themselves to the rest of the workstations in their
immediate working environment, and even throughout the world! One result of this
reconnection was the emergence of collaboration between individuals via the computer –
called computer-supported cooperative work, or CSCW.
10. The World Wide Web
WWW or "Web" is a global information medium which users can read and write via computers
connected to the Internet. The term is often mistakenly used as a synonym for the Internet
itself, but the Web is a service that operates over the Internet, just as e-mail also does. The
history of the Internet dates back significantly further than that of the World Wide Web. The
internet is simply a collection of computers, each linked by any sort of data connection,
whether it be slow telephone line and modem or high-bandwidth optical connection. The
computers of the internet all communicate using common data transmission protocols (TCP/IP)
and addressing systems (IP addresses and domain names). This makes it possible for anyone to
read anything from anywhere, in theory, if it conforms to the protocol. The web builds on this
with its own layer of network protocol (http), a standard markup notation (such as HTML) for
laying out pages of information and a global naming scheme (uniform resource locators or
URLs). Web pages can contain text, color images, movies, sound and, most important,
hypertext links to other web pages. Hypermedia documents can therefore be ‗published‗by
anyone who has access to a computer connected to the internet.
11. Ubiquitous computing
Ubiquitous computing is a paradigm in which the processing of information is linked with each
activity or object as encountered. It involves connecting electronic devices, including
embedding microprocessors to communicate information. Devices that use ubiquitous
computing have constant availability and are completely connected. Ubiquitous computing
focuses on learning by removing the complexity of computing and increases efficiency while
using computing for different daily activities. Ubiquitous computing is also known as pervasive
computing, every ware and ambient intelligence.
10. Explain about reasoning and problem solving.
Humans, on the other hand, are able to use information to reason and solve problems, and
indeed do these activities when the information is partial or unavailable. Human thought is
conscious and self-aware: while we may not always be able to identify the processes we use,
we can identify the products of these processes, our thoughts. In addition, we are able to think
about things of which we have no experience, and solve problems which we have never seen
Thinking can require different amounts of knowledge. Some thinking activities are much
directed and the knowledge required is constrained. Others require vast amounts of knowledge
from different domains. For example, performing a subtraction calculation requires a relatively
small amount of knowledge, from a constrained domain, whereas understanding newspaper
headlines demands.
Reasoning is the process by which we use the knowledge we have to draw conclusions or infer
something new about the domain of interest. There are a number of different Types of
reasoning: deductive, inductive and abductive. We use each of these types of reasoning in
everyday life, but they differ in significant ways.
Deductive reasoning
Deductive reasoning derives the logically necessary conclusion from the given premises.
For example,
If it is Friday then she will go to work It is Friday
Therefore she will go to work.
Inductive reasoning
Induction is generalizing from cases we have seen to infer information about cases we have not
seen. Induction is a useful process, which we use constantly in learning about our environment.
We can never see all the elephants that have ever lived or will ever live, but we have certain
knowledge about elephants which we are prepared to trust for all practical purposes, which has
largely been inferred by induction. Even if we saw an elephant without a trunk, we would be
unlikely to move from our position that ‗All elephants have trunks‗, since we are better at
using positive than negative evidence.
Abductive reasoning
The third type of reasoning is abduction. Abduction reasons from a fact to the action or state
that caused it. This is the method we use to derive explanations for the events we observe. For
example, suppose we know that Sam always drives too fast when she has been drinking. If we
see Sam driving too fast we may infer that she has been drinking. Of course, this too is
unreliable since there may be another reason why she is driving fast: she may have been called
to an emergency.
Problem solving
Human problem solving is characterized by the ability to adapt the information we have to deal
with new situations often solutions seem to be original and creative. There are a number of
different views of how people solve problems.
The Gestalt view that problem solving involves both reuse of knowledge and insight. This has
been largely superseded but the questions it was trying to address remain and its influence can
be seen in later research. In the 1970s by Newell and Simon, was the problem space theory,
which takes the view that the mind is a limited information processor.
Unit 2
All interaction design includes some form of iteration of ideas. This often starts early on with paper
designs and storyboards being demonstrated to colleagues and potential users. Any of these
prototypes, whether paper-based or running software, can then be evaluated to see whether they are
acceptable and where there is room for improvement. This sort of evaluation, intended to improve
designs, is called formative evaluation. This is in contrast to summative evaluation, which is
performed at the end to verify whether the product is good enough. One approach is to get an expert
to use a set of guidelines, for example the ‗knowing where you are‗ list above, and look screen by
screen to see if there are any violations.
The other main approach is to involve real users either in a controlled experimental setting, or ‗in the
wild‗ – a real-use environment. The result of evaluating the system will usually be a list of faults or
problems and this is followed by a redesign exercise, which is then prototyped, evaluated The end
point is when there are no more problems that can economically be fixed. So iteration and
prototyping are the universally accepted ‗best practice ‗approach for interaction design.
Prototyping is an example of what is known as a hill-climbing approach. Imagine you are standing
somewhere in the open countryside. You walk uphill and keep going uphill as steeply as possible.
Eventually you will find yourself at a hill top.is exactly how iterative prototyping works: you start
somewhere, evaluate it to see how to make it better, change it to make it better and then keep on
doing this until it cant get any better.
2) Software Life cycle models
In the development of a software product, we consider two main parties: the customer who requires
the use of the product and the designer who must provide the product. Typically, the customer and
the designer are groups of people and some people can be both customer and designer. It is often
important to distinguish between the customer who is the client of the designing company and the
customer who is the eventual user of the system. These two roles of customer can be played by
different people. The group of people who negotiate the features of the intended system with the
designer may never be actual users of the system. This is often particularly true of web applications.
In this chapter, we will use the term ‗customer‗ to refer to the group of people who interact with the
design team and we will refer to those who will interact with the designed system as the user or enduser.
2.6.1 Activities
The graphical representation is reminiscent of a waterfall, in which each activity naturally leads into
the next. The analogy of the waterfall is not completely faithful to the real relationship between these
activities, but it provides a good starting point for discussing the logical flow of activity. We describe
the activities of this waterfall model of the software life cycle
Requirements specification
Requirements specification begins at the start of product development. Though the requirements are
from the customer‗s perspective, if they are to be met by the software product they must be
formulated in a language suitable for implementation. Requirements are usually initially expressed in
the native language of the customer. The executable languages for software are less natural and are
more closely related to a mathematical language in which each term in the language has a precise
interpretation, or semantics. The transformation from the expressive but relatively ambiguous natural
language of requirements to the more precise but less expressive executable languages is one key to
successful development. Task analysis techniques, which are used to express work domain
requirements in a form that is both expressive and precise.
Architectural design
The requirements specification concentrates on what the system is supposed to do. The next activities
concentrate on how the system provides the services expected from it. The first activity is a highlevel decomposition of the system into components that can either be brought in from existing
software products or be developed from scratch independently. An architectural design performs this
decomposition. It is not only concerned with the functional decomposition of the system, determining
which components provide which services. It must also describe the interdependencies between
separate components and the sharing of resources that will arise between components.
Detailed design
The architectural design provides a decomposition of the system description that allows for isolated
development of separate components which will later be integrated. For those components that are
not already available for immediate integration, the designer must provide a sufficiently detailed
description so that they may be implemented in some programming language. The detailed design is
a refinement of the component description provided by the architectural design. The behavior
implied by the higher-level description must be preserved in the more detailed description.
There will be more than one possible refinement of the architectural component that will satisfy the
behavioral constraints. Choosing the best refinement is often a matter of trying to satisfy as many of
the non-functional requirements of the system as possible. Thus the language used for the detailed
design must allow some analysis of the design in order to assess its properties.
Coding and unit testing
The detailed design for a component of the system should be in such a form that it is possible to
implement it in some executable programming language. After coding, the component can be tested
to verify that it performs correctly, according to some test criteria that were determined in earlier
activities. Research on this activity within the life cycle has concentrated on two areas. There is
plenty of research that is geared towards the automation of this coding activity directly from a lowlevel detailed design. Most of the work in formal methods operates under the hypothesis that, in
theory, the transformation from the detailed design to the implementation is from one mathematical
representation to another and so should be able to be entirely automated. Other, more practical work
concentrates on the automatic generation of tests from output of earlier activities which can be
performed on a piece of code to verify that it behaves correctly.
Integration and testing
Once enough components have been implemented and individually tested, they must be integrated as
described in the architectural design. Further testing is done to ensure correct behavior and
acceptable use of any shared resources. It is also possible at this time to perform some acceptance
testing with the customers to ensure that the system meets their requirements. It is only after
acceptance of the integrated system that the product is finally released to the customer.
After product release, all work on the system is considered under the category of maintenance, until
such time as a new version of the product demands a total redesign or the product is phased out
entirely. Consequently, the majority of the lifetime of a product is spent in the maintenance activity.
Maintenance involves the correction of errors in the systems which are discovered after release and
the revision of the system services to satisfy requirements that were not realized during previous
2.6.2 Validation and verification
Throughout the life cycle, the design must be checked to ensure that it both satisfies the high-level
requirements agreed with the customer and is also complete and internally consistent. These checks
are referred to as validation and verification, respectively. Verification of a design will most often
occur within a single life-cycle activity or between two adjacent activities. For example, in the
detailed design of a component of a payroll accounting system, the designer will be concerned with
the correctness of the algorithm to compute taxes deducted from an employee‗s gross income.
The architectural design will have provided a general specification of the information input to this
component and the information it should output. The detailed description will introduce more
information in refining the general specification. The detailed design may also have to change the
representations for the information and will almost certainly break up a single high-level operation
into several low-level operations that can eventually be implemented. In introducing these changes to
information and operations, the designer must show that the refined description is a legal one within
its language (internal consistency) and that it describes all of the specified behavior of the high-level
description (completeness) in a provably correct way (relative consistency). Validation of a design
demonstrates that within the various activities the customer‗s requirements are satisfied.
Validation is a much more subjective exercise than verification, mainly because the disparity
between the language of the requirements and the language of the design forbids any objective form
of proof. In interactive system design, the validation against HCI requirements is often referred to as
evaluation and can be performed by the designer in isolation or in cooperation with the customer.
2.6.3 Management and contractual issues
The life cycle described above concentrated on the more technical features of software development.
In a technical discussion, managerial issues of design, such as time constraints and economic forces,
are not as important. The different activities of the life cycle are logically related to each other. We
can see that requirements for a system precede the high- level architectural design which precedes the
detailed design, and so on. In reality, it is quite possible that some detailed design is attempted before
all of the architectural design. In management, a much wider perspective must be adopted which
takes into account the marketability of a system, its training needs, the availability of skilled
personnel or possible subcontractors, and other topics outside the activities for the development of
the isolated system.
2.6.4 Interactive systems and the software life cycle
The life cycle for development we described above presents the process of design in a somewhat
pipeline order. In reality, even for batch-processing systems, the actual design process is iterative,
work in one design activity affecting work in any other activity either before or after it in the life
A final point about the traditional software life cycle is that it does not promote the
use of notations and techniques that support the user‗s perspective of the interactive
system. We discussed earlier the purpose of validation and the formality gap. It is
very difficult for an expert on human cognition to predict the cognitive demands that
an abstract design would require of the intended user if the notation for the design
does not reflect the kind of information the user must recall in order to interact. The
same holds for assessing the timing behavior of an abstract design that does not
explicitly mention the timing characteristics of the operations to be invoked or their
relative ordering. Though no structured development process will entirely eliminate
the formality gap, the particular notations used can go a long way towards making
validation of non-functional requirements feasible with expert assistance. In the
remaining sections of this chapter, we will describe various approaches to augment
the design process to suit better the design of interactive systems. These approaches
are categorized under the banner of user-centered design.
3) Elements of WIMP interfaces
4) Nielsen‘s ten heuristics are:
1. Visibility of system status
The system should always keep users informed about what is going on, through
appropriate feedback within reasonable time.
2. Match between system and the real world
The system should speak the users' language, with words, phrases and concepts
familiar to the user, rather than system-oriented terms. Follow real-world
conventions, making information appear in a natural and logical order.
3. User control and freedom
Users often choose system functions by mistake and will need a clearly marked
"emergency exit" to leave the unwanted state without having to go through an
extended dialogue. Support undo and redo.
4. Consistency and standards
Users should not have to wonder whether different words, situations, or actions
mean the same thing. Follow platform conventions.
5. Error prevention
Even better than good error messages is a careful design which prevents a
problem from occurring in the first place. Either eliminate error-prone
conditions or check for them and present users with a confirmation option
before they commit to the action.
6. Recognition rather than recall
Minimize the user's memory load by making objects, actions, and options
visible. The user should not have to remember information from one part of the
dialogue to another. Instructions for use of the system should be visible or
easily retrievable whenever appropriate.
7. Flexibility and efficiency of use
Accelerators — unseen by the novice user — may often speed up the
interaction for the expert user such that the system can cater to both
inexperienced and experienced users. Allow users to tailor frequent actions.
8. Aesthetic and minimalist design
Dialogues should not contain information which is irrelevant or rarely needed.
Every extra unit of information in a dialogue competes with the relevant units
of information and diminishes their relative visibility.
9. Help users recognize, diagnose, and recover from errors
Error messages should be expressed in plain language (no codes), precisely
indicate the problem, and constructively suggest a solution.
10. Help and documentation
Even though it is better if the system can be used without documentation, it may
be necessary to provide help and documentation. Any such information should
be easy to search, focused on the user's task, list concrete steps to be carried out,
and not be too large.
5) Norman’s Seven Principles for Transforming Difficult Tasks into
Simple Ones
1. Use both knowledge in the world and knowledge in the head. People work
better when the knowledge they need to do a task is available externally – either
explicitly or through the constraints imposed by the environment. But experts
also need to be able to internalize regular tasks to increase their efficiency. So
systems should provide the necessary knowledge within the environment and
their operation should be transparent to support the user in building an
appropriate mental model of what is going on.
2. Simplify the structure of tasks. Tasks need to be simple in order to avoid
complex problem solving and excessive memory load. There are a number of
ways to simplify the structure of tasks. One is to provide mental aids to help the
user keep track of stages in a more complex task. Another is to use technology
to provide the user with more information about the task and better feedback. A
third approach is to automate the task or part of it, as long as this does not
detract from the user‗s experience. The final approach to simplification is to
change the nature of the task so that it becomes something simpler. In all of
this, it is important not to take control away from the user.
3. Make things visible: bridge the gulfs of execution and evaluation. The
interface should make clear what the system can do and how this is achieved,
and should enable the user to see clearly the effect of their actions on the
4. Get the mappings right. User intentions should map clearly onto system
controls. User actions should map clearly onto system events. So it should be
clear what does what and by how much. Controls, sliders and dials should
reflect the task – so a small movement has a small effect and a large movement
a large effect.
5. Exploit the power of constraints, both natural and artificial. Constraints are
things in the world that make it impossible to do anything but the correct action
in the correct way. A simple example is a jigsaw puzzle, where the pieces only
fit together in one way. Here the physical constraints of the design guide the
user to complete the task.
6. Design for error. To err is human, so anticipate the errors the user could
make and design recovery into the system.
7. When all else fails, standardize. If there are no natural mappings then
arbitrary mappings should be standardized so that users only have to learn them
once. It is this standardization principle that enables drivers to get into a new
car and drive it with very little difficulty – key controls are standardized.
Occasionally one might switch on the indicator lights instead of the windscreen
wipers, but the critical controls (accelerator, brake, clutch, steering) are always
the same.
7) Discuss in detail about the activities in the waterfall model and spiral
model of the software life cycle.
Both the models, Waterfall model and Spiral model are used in software
development. Waterfall model is also known as classical or linear model
because it follows sequential stages while the spiral model does not follow the
sequential stages it follows the evolutionary way. The amount of risk in waterfall
model is more as compared to spiral model because the errors or risks are
identified and refined after the completion of each stages.
There are some difference between these two models waterfall and spiral
model which are given below:
Waterfall Model
Spiral Model
The Waterfall model is simple and
The spiral model is a lot more
Waterfall model works in sequential
While spiral model works in
evolutionary method.
In waterfall model errors or risks are
identified and rectified after the
completion of stages.
In spiral model errors or risks are
identified and rectified earlier.
Waterfall model is adopted by
While spiral model is adopted by
Waterfall model is applicable for small
While Spiral model is used for large
In waterfall model requirements and
early stage planning is necessary.
While in spiral model requirements
and early stage planning is necessary
if required.
Flexibility to change in waterfall model
is Difficult.
Flexibility to change in spiral model
is not Difficult.
There is high amount risk in waterfall
There is low amount risk in spiral
Waterfall model is comparatively
While cost of spiral model is very
Customer involvement is minimum in
Waterfall Model
In Spiral Model Customer
involvement is high.
8) The UI design principals are:
6 key UI design principles
So what are the key design principles of a high-performing user interface?
Let’s take a look at them.
From recognizing interactive and static elements to making navigation
intuitive, clarity is an essential part of a great UI design.
As Peep Laja aptly says, “Your job is to build an interface that minimizes the
knowledge gap between what users know prior to coming to your site and
what they must know to use it properly.”
So when you create your product, ask yourself the following questions:
Is your navigation intuitive? Are users directed and encouraged to move
from one page to the next with ease?
 Have you used highly visible buttons that prompt users into clicking
 Is the purpose of each element on your product, website or application,
clear and easy to understand?
You only have eight seconds to grab a user's attention, so make sure you
don’t waste those seconds by creating confusion and chaos.
Do you automatically look for the menu at the top of the homepage of a new
Why? Because that’s where we perceive the menu to be.
Displaying the website menu at the top is a great way of incorporating the design
principle of familiarity like Sephora does here
The best interfaces are familiar for users.
Usability, i.e. how easily a user interacts with a product or a website, is closely
related to familiarity. Users depend on elements and interfaces acting in a way
that’s familiar to their digital experience.
Jacob’s Law states that “Users spend most of their time on other sites. This
means that users prefer your site to work the same way as all the other sites
they already know.”
Not only should you leverage established UI design principles and rules (like
Ben Shneiderman’s golden rules of UI design described above) to incorporate
familiarity in your design, but also ensure that all elements are in sync
throughout the interface.
There are several benefits of incorporating the UI design principle of familiarity
in your product:
 Increases user retention
The more familiar the user is with your interface and more easily they can use
your website or app, the more they’ll come back to it.
And since research shows that acquiring a new customer costs 6x to 7x times
more than to retain an old one, it’s vital that you create a seamless user
experience by leveraging familiarity.
Easier for UI designers
It’s easier for UI designers to incorporate tried and tested interface design
solutions than create new ones from scratch.
Reduces the learning curve for users
The less time visitors have to spend understanding how the user interface
works, the faster they can start using your product or service.
It also reduces the chances of them exiting your website and moving on to the
next website because yours had a steep learning curve.
User Control
Place users in control of the interface. Jakob Nielsen explains why this is
important: “Users often choose system functions by mistake and will need a
clearly marked ‘emergency exit’ to leave the unwanted state without having to
go through an extended dialogue. Support undo and redo.”
This basically means giving users different options to go back a step when
they feel they’ve made a mistake. This is closely related to Ben
Shneiderman’s golden UI design rule, ‘Supporting internal locus of control’,
and ‘Allowing users to reverse their actions easily’.
As an example, whenever you’re creating forms, allow users to click
the <Back> button and go back to the page they were last on. Don’t take them
all the way back to the homepage or the start of the form.
Similarly, when using overlays on your website, make sure the exit button {x}
is clear. Otherwise, users might click the back button on the browser and go
back two steps instead of just exiting the overlay.
See how the small x button is clear and instantly noticeable here.
Making the exit button on an overlay visible is a great way of placing users in control of
the interface.
But, can you find the exit button on this screen?
An invisible exit button on an overlay can easily confuse and frustrate users
Strong visual hierarchy is a core design principle of a successful user
interface. It consists of arranging visual elements in a way that explains the
level of importance of each element and guides users to take the desired
action. As a designer, your job is to organize UI design elements in a way that
makes it easy for users to navigate within your product.
As Pascal Potvin, design lead at IBM, says, “[...] Visual hierarchy creates an
order of importance to visual elements in order to direct a user’s attention and
make information easier to consume.”
It ensures that users see the most important information first, then the next,
and so on, and is established through various elements. Some of them
 Color
The first and one of the most important elements of establishing visual
hierarchy is color. Bright colors stand out the most and can be used in muted
color schemes to direct users to take a certain action.
Monese and Mailchimp both make great use of colour in their designs.
Colour blocking is an excellent way of highlighting certain UI elements
Size matters a lot in UI design, especially when establishing visual hierarchy.
The bigger the element is, the more visible it is. Smaller elements are usually
those of less importance. So, as a designer, try to make important things (like
headlines or CTAs) bigger and bolder.
Play around with different sizes, weights and styles of fonts to establish visual
hierarchy. This is exactly what Odoo does here.
Odoo using different sized fonts to create visual hierarchy
Notice how the main headline, which aptly explains what it’s all about, is of the
largest font size and two main words (‘real’ and ‘CRM’) are written in bold to
emphasise it’s message.
And while the CTA is of smaller font size, it’s highlighted through a coloured
box to ensure that the audience’s eyes are drawn to it immediately after
reading the headline.
Negative space
Give your elements some breathing room - don’t cram all the objects and
elements together on the screen. Negative space makes important elements
pop and stand out. Swaggar makes great use of negative space on their
website’s main page.
Swagger uses negative space in a great manner to highlight its various features and
main message
As Nick Kampouris, User Experience & User Interface Designer,
advises, “Social distance your elements, text blocks and everything on your
darn screen. Space is everyone's friend. Use the Law of Proximity to help
users visually navigate your pages. This is as much a part of your design than
everything else so please mind the gap.”
“Flexibility doesn’t just follow a linear path, it’s about knowing your customers
and giving flexibility for different customer intents,” explains Brooke Cowling,
Co-founder & CMO of Digital of Things.
So from designing solutions that work great in all situations (from your grand
dad’s old computer to your son’s latest iPad), to using shortcuts that speed up
interaction for users, flexibility is a key principle of user interface design.
You not only need to make your user interface learnable for new users, but it
should also have accelerators that help expert users speed up their
processes. From novices to experts, ensure your product is flexible and
efficient for all kinds of users.
Some examples of a flexible UI design include:
Shortcuts for performing frequently used steps with a single click
 Advanced search features
 Incorporating filter bars
And think multi-modal.
A multimodal user interface allows users to interact with the system through
multiple modes, such as speech, text touch, vision. As a designer, you need
to be mindful of all these modes. Also, design the user interface keeping in
mind that your audience might use it via mobile, tablet, laptop, or a rusty old
“Many users depend mostly or entirely on mobile access; optimize the UI for
the strengths of each device. Be mindful that users' experiences may span
across devices and modes of interaction (phone, email, mobile device,
desktop, in-person) - design experiences across those modes to support their
needs,” shares Jon Fukuda, Co-founder and Principal at Limina, and a design
and UX industry leader.
Designing your website for all users is essential. According to the WHO,
around 285 million people are visually impaired, between 110 million and 190
million adults have significant mobility difficulties and 360 million people
worldwide have disabling hearing loss.
Jesse Hausler, Principal Accessibility Specialist at
Salesforce, writes, “Accessibility will not force you to make a product that is
ugly, boring, or cluttered. It will introduce a set of constraints to incorporate as
you consider your design.”
When designing products and websites, make sure that you meet the
requirements of WCAG (Web Content Accessibility Guidelines).
Some easy ways to incorporate the principle of accessibility while designing
products are:
Use the WebAim Color Contrast checker to create strong color
Ensure all your images have corresponding alt attributes
Ask yourself if users can easily navigate your website via the tab key on
your keyboard
Design rationale is the information that explains why a computer system is the way it is, including its
structural or architectural description and its functional or behavioral description. In this sense,
design rationale does not fit squarely into the software life cycle described in this chapter as just
another phase or box. Rather, design rationale relates to an activity of both reflection (doing design
rationale) and documentation (creating a design rationale) that occurs throughout the entire life cycle.
In an explicit form, a design rationale provides a communication mechanism among the members of
a design team so that during later stages of design and/or maintenance it is possible to understand
what critical decisions were made, what alternatives were investigated (and, possibly, in what order)
and the reason why one alternative was chosen over the others. This can help avoid incorrect
assumptions later.
• Accumulated knowledge in the form of design rationales for a set of products can be reused to
transfer what has worked in one situation to another situation which has similar needs. The design
rationale can capture the context of a design decision in order that a different design team can
determine if a similar rationale is appropriate for their product.
• The effort required to produce a design rationale forces the designer to deliberate more carefully
about design decisions. The process of deliberation can be assisted by the design rationale technique
by suggesting how arguments justifying or discarding a particular design option are formed.
In the area of HCI, design rationale has been particularly important, again for several reasons:
• There is usually no single best design alternative. More often, the designer is faced with a set of
trade-offs between different alternatives. For example, a graphical interface may involve a set of
actions that the user can invoke by use of the mouse and the designer must decide whether to present
each action as a ‗button‗on the screen, which is always visible, or hide all of the actions in a menu
which must be explicitly invoked before an action can be chosen. The former option maximizes the
operation visibility but the latter option takes up less screen space. It would be up to the designer to
determine which criterion for evaluating the options was more important and then communicating
that information in a design rationale.
Even if an optimal solution did exist for a given design decision, the space of alternatives is so vast
that it is unlikely a designer would discover it. In this case, it is important that the designer indicates
all alternatives that have been investigated. Then later on it can be determined if she has not
considered the best solution or had thought about it and discarded it for some reason. In project
management, this kind of accountability for design is good.
• The usability of an interactive system is very dependent on the context of its use. The flashiest
graphical interface is of no use if the end-user does not have access to a high- quality graphics
display or a pointing device. Capturing the context in which a design decision is made will help later
when new products are designed.
If the context remains the same, then the old rationale can be adopted without revision. If the context
has changed somehow, the old rationale can be re-examined to see if any rejected alternatives are
now more favorable or if any new alternatives are now possible.
2.9.1 Process-oriented design rationale
Rationale is based on Rittel‗s issue-based information system, or IBIS, a style for representing
design and planning dialog developed in the 1970s. In IBIS (pronounced
‗ibbiss‗), a hierarchical structure to a design rationale is created. A root issue is identified which
represents the main problem or question that the argument is addressing. Various positions are put
forth as potential resolutions for the root issue, and these are depicted as descendants in the IBIS
hierarchy directly connected to the root issue. Each position is then supported or refuted by
arguments, which modify the relationship between issue and position. The hierarchy grows as
secondary issues are raised which modify the root issue in some way. Each of these secondary issues
is in turn expanded by positions and arguments, further sub- issues, and so on.
A graphical version of IBIS has been defined by Conklin and Yakemovic called gIBIS
(pronounced ‗gibbiss‗), which makes the structure of the design rationale more apparent
visually in the form of a directed graph which can be directly edited by the creator of the design
rationale. Above figure gives a representation of the gIBIS vocabulary. Issues, positions and
arguments are nodes in the graph and the connections between them are labeled to clarify the
relationship between adjacent nodes. So, for example, an issue can suggest further sub-issues, or a
position can respond to an issue or an argument can support a position. The gIBIS structure can be
supported by a hypertext tool to allow a designer to create and browse various parts of the design
2.9.2 Design space analysis
MacLean and colleagues have proposed a more deliberative approach to design rationale which
emphasizes a post hoc structuring of the space of design alternatives that have been considered in a
design project. Their approach, embodied in the Questions, Options and Criteria (QOC) notation, is
characterized as design space analysis issues raised based on reflection and understanding of the
actual design activities. Questions in a design space analysis are therefore similar to issues in IBIS
except in the way they are captured. Options provide alternative solutions to the question. They are
assessed according to some criteria in order to determine the most favorable option. In Figure an
option which is favorably assessed in terms of a criterion is linked with a solid line, whereas negative
links have a dashed line.
The key to an effective design space analysis using the QOC notation is deciding the right questions
to use to structure the space and the correct criteria to judge the options. The initial questions raised
must be sufficiently general that they cover a large enough portion of the possible design space, but
specific enough that a range of options can be clearly identified. It can be difficult to decide the right
set of criteria with which to assess the options.
Structure-oriented technique, called Decision Representation Language (DRL), developed by Lee
and Lai, structures the design space in a similar fashion to QOC, though its language is somewhat
larger and it has a formal semantics. The questions, options and criteria in DRL are given the names:
decision problem, alternatives and goals. QOC assessments are represented in DRL by a more
complex language for relating goals to alternatives. The sparse language in QOC used to assess an
option relative to a criterion (positive or negative assessment only) is probably insufficient, but there
is a trade-off involved in adopting a more complex vocabulary which may prove too difficult to use
in practice. The advantage of the formal semantics of DRL is that the design rationale can be used as
a computational mechanism to help manage the large volume of information. For example, DRL can
track the dependencies between different decision problems, so that subsequent changes to the design
rationale for one decision problem can be automatically propagated to other dependent problems.
Design space analysis directly addresses the claim that no design activity can hope to uncover all
design possibilities, so the best we can hope to achieve is to document the small part of the design
space that has been investigated. An advantage of the post hoc technique is that it can abstract away
from the particulars of a design meeting and therefore represent the design knowledge in such a way
that it can be of use in the design of other products. The major disadvantage is the increased overhead
such an analysis warrants. More time must be taken away from the design activity to do this separate
documentation task. When time is scarce, these kinds of overhead costs are the first to be trimmed.
2.9.3 Psychological design rationale
The final category of design rationale tries to make explicit the psychological claims of usability
inherent in any interactive system in order better to suit a product for the tasks users have. This
psychological design rationale has been introduced by Carroll and Rosson, and before we describe
the application of the technique it is important to understand some of its theoretical background.
When designing a new interactive system, the designers take into account the tasks that users
currently perform and any new ones that they may want to perform. This task identification serves as
part of the requirements for the new system, and can be done through empirical observation of how
people perform their work currently and presented through informal language or a more formal task
analysis language. When the new system is implemented, or becomes an artifact, further observation
reveals that in addition to the required tasks it was built to support, it also supports users in tasks that
the designer never intended. Once designers understand these new tasks, and the associated problems
that arise between them and the previously known tasks, the new task definitions can serve as
requirements for future artifacts.
• What can I do: that is, what are the possible operations or functions that this programming
environment allows?
• How does it work: that is, what do the various functions do?
• How can I do this: that is, once I know a particular operation I want to perform,
• How do I go about programming it?
Carroll refers to this real-life phenomenon as the task–artifact cycle. He provides a good example of
this cycle through the evolution of the electronic spreadsheet. When the first electronic spreadsheet,
VisiCalc, was marketed in the late 1970s, it was presented simply as an automated means of
supporting tabular calculation, a task commonly used in the accounting world. Within little over a
decade of its introduction, the application of spreadsheets had far outstripped its original intent within
accounting. Spreadsheets were being used for all kinds of financial analysis, ‗what-if ‗simulations,
report formatting and even as a general programming language paradigm! As the set of tasks
expands, new spreadsheet products have flooded the marketplace trying to satisfy the growing
customer base. Another good example of the task–artifact cycle in action is with word processing,
which was originally introduced to provide more automated support for tasks previously achieved
with a typewriter and now provides users with the ability to carry out various authoring tasks that
they never dreamed possible with a conventional typewriter. And today, the tasks for the spreadsheet
and the word processor are intermingled in the same artifact.
The purpose of psychological design rationale is to support this natural task– artifact cycle of design
activity. The main emphasis is not to capture the designer‗s intention in building the artifact. Rather,
psychological design rationale aims to make explicit the consequences of a design for the user, given
an understanding of what tasks he intends to perform. Previously, these psychological consequences
were left implicit in the design, though designers would make informal claims about their systems
The first step in the psychological design rationale is to identify the tasks that the proposed
system will address and to characterize those tasks by questions that the user tries to answer
in accomplishing them. For instance, Carroll gives an example of designing a system to help
programmers learn the Smalltalk object-oriented programming language environment. The
main task the system is to support is learning how Smalltalk works. In learning about the
programming environment, the programme will perform tasks that help her answer the
2.3.1 Tools for layout
We have a number of visual tools available to help us suggest to the user appropriate ways to read
and interact with a screen or device.
Figure: Grouping related items in an order screen
Grouping and structure
If things logically belong together, then we should normally physically group them together. This
may involve multiple levels of structure. We can see a potential design for an ordering screen. Notice
how the details for billing and delivery are grouped together spatially; also note how they are
separated from the list of items actually ordered by a line as well as spatially. This reflects the
following logical structure:
• Administrative information o Billing details
o Delivery details
o Order line 2
• Order information o Order line 1
• Entering information
Order of groups and items
In general we need to think: what is the natural order for the user? This should normally match the
order on screen. For data entry forms or dialog boxes we should also set up the order in which the tab
key moves between fields. Occasionally we may also want to force a particular order; for example
we may want to be sure that we do not forget the credit card details
Decorative features like font style, and text or background colors can be used to emphasize
Alignment of lists is also very important. For users who read text from left to right, lists of text items
should normally be aligned to the left. Numbers, however, should normally be aligned to the right
(for integers) or at the decimal point. This is because the shape of the column then gives an indication
of magnitude – a sort of mini histogram. Items like names are particularly difficult.
White space
Spacing or whitespace, white space is any section of a document that is unused or space around an
object. White spaces help separate paragraphs of text, graphics, and other portions of a document,
and help a document look less crowded. Using white space effectively in a document keeps the
reader reading the document and helps the reader quickly find what they are interested in reading.
How to create white space
White space is created by pressing the return key, spacebar key, or the tab key and can also be
created by setting the document's margins and inserting form feeds or tables.
2.3.2 User action and control
In each case the screen consists not only of information presented to the user, but also of places for
the user to enter information or select options. Many of the same layout issues for data presentation
also apply to fields for data entry. Alignment is still important. It is
• Presenting information
especially common to see the text entry boxes aligned in a jagged fashion because the field names are
of different lengths. This is an occasion where right-justified text for the field labels may be best or,
alternatively, in a graphical interface a smaller font can be used for field labels and the labels placed
just above and to the left of the field they refer to. For both presenting and entering information a
clear logical layout is important.
The task analysis techniques can help in determining how to group screen items and also the order in
which users are likely to want to read them or fill them in. Knowing also that users are likely to read
from left to right and top to bottom (depending on their native language!) means that a screen can be
designed so that users encounter items in an appropriate order for the task at hand.
• Knowing what to do
If everyone designs buttons to look the same and menus to look the same, then users will be able to
recognize them when they see them. It is important that the labels and icons on menus are also clear.
Standards can help for common actions such as save, delete or print. For more system-specific
actions, one needs to follow broader principles. For example, a button says ‗bold‗: does this
represent the current state of a system or the action that will be performed if the button is pressed?
• Affordances
These are especially difficult problems in multimedia applications where one may deliberately adopt
a non-standard and avant-garde style. How are users supposed to know where to click? The
psychological idea of affordance says that things may suggest by their shape and other attributes
what you can do to them: a handle affords pulling or lifting; a button affords pushing. These
affordances can be used when designing novel interaction elements. One can either mimic real-world
objects directly, or try to emulate the critical aspects of those objects. What you must not do is depict
a real-world object in a context where its normal affordances do not work!
2.3.3 Appropriate appearance
The way of presenting information on screen depends on the kind of information: text, numbers,
maps, tables; on the technology available to present it: character display, line drawing, graphics, and
virtual reality; and, most important of all, on the purpose for which it is being used. The file listing is
alphabetic, which is fine if we want to look up the details of a particular file, but makes it very
difficult to find recently updated files. Of course, if the list were ordered by date then it would be
difficult to find a particular file. Different purposes require different representations. For more
complex numerical data, we may be considering scatter graphs, histograms or 3D surfaces; for
hierarchical structures, we may consider outlines or organization diagrams. But, no matter how
complex the data, the principle of matching presentation to purpose remains. We have an advantage
when presenting information in an interactive system in that it is easy to allow the user to choose
among several representations, thus making it possible to achieve different goals.
Figure : Alphabetic file listing. Screen shot reprinted by permission from Apple Computer, Inc.
Aesthetics and utility
The beauty and utility may sometimes be at odds. For example, an industrial control panel will often
be built up of the individual controls of several subsystems, some designed by different teams, some
bought in. The resulting inconsistency in appearance may look a mess and suggest tidying up.
Certainly some of this inconsistency may cause problems.
The conflict between aesthetics and utility can also be seen in many
‗well designed‗ posters and multimedia systems. In particular, the backdrop behind text must have
low contrast in order to leave the text readable; this is often not the case and graphic designers may
include excessively complex and strong backgrounds because they look good. The results are
impressive, perhaps even award winning, but completely unusable! In consumer devices these
aesthetic considerations may often be the key differentiator betweenproducts, for example, the sleek
curves of a car. This is not missed by designers of electronic goods: devices are designed to be good
to touch and feel as well as look at and this is certainly one of the drivers for the futuristic shapes of
the Apple iMac family.
Making a mess of it: colour and 3D
The increasing use of 3D effects in interfaces has posed a whole new set of problems for text and
numerical information. Whilst excellent for presenting physical information and certain sorts of
graphs, text presented in perspective can be very difficult to read and the all too common 3D pie
chart is all but useless.
Localization / internationalization
If you are working in a different country, you might see a document being word processed where the
text of the document and the file names are in the local language, but all the menus and instructions
are still in English. The process of making software suitable for different languages and cultures is
called localization or internationalization.
It is clear that words have to change and many interface construction toolkits make this easy
by using resources. When the program uses names of menu items, error messages and other
text, it does not use the text directly, but instead uses a resource identifier, usually simply a
number. A simple database is constructed separately that binds these identifiers to particular
words and phrases. A different resource database is constructed for each language, and so the
program can be customized to use in a particular country by simply choosing the appropriate
resource database.
Unit 3
1. Cognitive complexity theory,
It was introduced by Kieras and Polson [199], begins with
the basic premises of goal decomposition from GOMS and enriches the model to
provide more predictive power. CCT has two parallel descriptions: one of the user’s
goals and the other of the computer system (called the device in CCT). The description
of the user’s goals is based on a GOMS-like goal hierarchy, but is expressed
primarily using production rules. We introduced production rules in Chapter 1 and
we further describe their use in CCT below. For the system grammar, CCT uses
generalized transition networks, a form of state transition network. This will not be
described here, but state transition networks will be discussed in detail in Chapter 16.
The production rules are a sequence of rules:
if condition then action
where condition is a statement about the contents of working memory. If the condition
is true then the production rule is said to fire. An action may consist of one or
more elementary actions, which may be either changes to the working memory,
or external actions such as keystrokes. The production rule ‘program’ is written in a
LISP-like language.
Airline booking
3. Open System Task Analysis (OSTA)
OSTA attempts to describe what happens when a technical system is introduced into an
organizational work environment. Like CUSTOM, OSTA specifies both social and technical
aspects of the system. However, whereas in CUSTOM these aspects are framed in terms of
stakeholder perspectives, in OSTA they are captured through a focus on tasks.
OSTA has eight main stages:
1. The primary task which the technology must support is identified in terms of users‗ goals.
Task inputs to the system are identified. These may have different sources and forms that may
constrain the design.
2. The external environment into which the system will be introduced is described, including
physical, economic and political aspects.
3. The transformation processes within the system are described in terms of actions performed on or
with objects.
4. The social system is analyzed, considering existing work-groups and relationships within and
external to the organization.
5. The technical system is described in terms of its configuration and integration with other systems.
6. Performance satisfaction criteria are established, indicating the social and technical requirements of
the system.
7. The new technical system is specified.
4. CUSTOM methodology
CUSTOM is a socio-technical methodology designed to be practical to use in small
organizations. It is based on the User Skills and Task Match (USTM) approach, developed to
allow design teams to understand and fully document user requirements. CUSTOM focusses on
establishing stakeholder requirements: all stakeholders are considered, not just the end- users.
It is applied at the initial stage of design when a product opportunity has been identified, so the
emphasis is on capturing requirements. It is a forms-based methodology, providing a set of
questions to apply at each of its stages.
There are six key stages to carry out in a CUSTOM analysis:
1. Describe the organizational context, including its primary goals, physical characteristics,
political and economic background.
2. Identify and describe stakeholders. All stakeholders are named, categorized (as primary,
secondary, tertiary or facilitating) and described with regard to personal issues, their role in the
organization and their job. For example, CUSTOM addresses issues such as stakeholder
motivation, disincentives, knowledge, skills, power and influence within the organization, daily
tasks and so on.
3. Identify and describe work-groups. A work-group is any group of people who work together
on a task, whether formally constituted or not. Again, work-groups are described in terms of their
role within the organization and their characteristics.
4. Identify and describe task–object pairs. These are the tasks that must be performed, coupled
with the objects that are used to perform them or to which they are applied.
5. Identify stakeholder needs. Stages 2–4 are described in terms of both the current system and
the proposed system. Stakeholder needs are identified by considering the differences between the
two. For example, if a stakeholder is identified as currently lacking a particular skill that is
required in the proposed system then a need for training is identified.
6. Consolidate and check stakeholder requirements. Here the stakeholder needs list is checked
against the criteria determined at earlier stages.
5 & 6. Discuss the ways in which a full-page word-processor is or is not a direct manipulation interface
for editing a document using Shneiderman’s criteria. What features of a modern word processor
break the metaphor of composition with pen (or typewriter) and paper?
We will answer the first question by evaluating the word-processors relative to the criteria for direct
manipulation given by Shneiderman.
_ Visibility of the objects of interest: The most important objects of interest in a word-processor are the
words themselves. Indeed, the visibility of the text on a continual basis was one of the major usability
advances in moving from line-oriented to display-oriented editors. Depending on the user’s application,
there may be other objects of interest in word-processing that may or may not be visible. For example,
are the margins for the text on screen similar to the ones which would eventually printed? Is the spacing
within a line and the line-breaks similar? Are the different fonts and formatting characteristics of the
text visible (without altering the spacing)? Expressed in this way, we can see the visibility criterion for
direct manipulation as very similar to the criteria for a WYSIWYG (What You See Is What You Get)
_ Incremental action at the interface with rapid feedback on all actions: We expect from a modern wordprocessor that characters appear in the text as we type them it at the keyboard, with little delay. If we
are inserting text within a paragraph, we might also expect that the format of the paragraph adjust
immediately to accommodate the new changes. Various word processors do this reformatting
automatically, whereas others do it occasionally or only at the explicit request of the user. One of the
other important actions which require incremental and rapid feedback is movement of the insertion
point, usually by means of arrow keys. If there is a significant delay between the input command to
move the insertion point down one line and the actual movement of the cursor on screen, it is quite
possible that the user will “overshoot” the target when repeatedly pressing the down-arrow key to
move down a few lines on the screen.
_ Reversibility of all actions, so that users are encouraged to explore without severe penalties: Single step
undo commands in most word-processors allow the user to recover from the last action performed. One
problem with this is that the user must recognize the error before doing any other action. More
sophisticated undo facilities allow the user to retrace back more than one command at a time. The kind
of exploration this reversibility provides in a word-processor is best evidenced with the ease of
experimentation that is now available for June 4, 1993 17:34 exercises Sheet number 28 Page number
26 Usability Paradigms and Principles formatting changes in a document (fonts types and sizes and
margin changes). One problem with the ease of exploration is that emphasis may move to the look of a
document rather than what the text actually says (style over content).
_ Syntactic correctness of all actions, so that every operation is a legal operation WYSYWYG: wordprocessors usually provide menus and buttons which the user uses to articulate many commands. These
interaction mechanisms serve to constrain the input language to only allow legal input from the user.
Document production systems, such as troff, TeX, and Scribe, force the user to input textual commands
(which may be erroneously entered by the user) to achieve desired formatting effects.
_ Replacement of complex command languages with actions to manipulate directly the visible objects:
The case for word processors is similar to that described above for syntactic correctness. In addition,
operations on portions of text are achieved many times by allowing the user to directly highlight the text
with a mouse (or arrow keys). Subsequent action on that text, such as moving it or copying it to
somewhere else, can then is achieved more directly by allowing the user to “drag” the selected via the
mouse to its new location. To answer the second question concerning the drawback of the pen (or
typewriter) metaphor for word-processing, we refer to the discussion on metaphors in Section 4.2.6.
The example there compares the functionality of the space key in typewriting versus word-processing.
For a typewriter, the space key is passive; it merely moves the insertion point one space to the right. In a
word processor, the space key is active, as it inserts a character (the space character) into the
document. The functionality of the typewriter space key is produced by the movement keys for the
word-processor (typically an arrow key pointing right to move forward within one line). In fact, much of
the functionality that we have come to expect of a word-processor is radically different from that
expected of a typewriter, so much so that the typewriter as a metaphor for word-processing is not all
that instructive. In practice, modern typewriters have begun to borrow from word-processors when
defining their functionality!
Text-based communication is familiar to most people, in that they will have written and received
letters. However, the style of letter writing and that of face-to face communication are very different.
The text-based communication in groupware systems is acting as a speech substitute, and, thus, there
are some problems adapting between the two media.
There are four types of textual communication in current groupware:
discrete – directed message as in email. There is no explicit connection between different messages,
except in so far as the text of the message refers to a previous one.
linear – participants‗ messages are added in (usually temporal) order to the end of a single
non-linear – when messages are linked to one another in a hypertext fashion.
spatial – where messages are arranged on a two-dimensional surface.
Linear text vs. Hypertext
Multiplexed messages can be represented as updates to several parts of the hypertext, thus reducing
the likelihood of breakdown and lost topics. In addition, if the messages themselves can be minihypertexts, then eager messages listing several possible courses of action can be explicitly
represented by the message.
Even static hypertexts, which have been carefully crafted by their authors, can be difficult to
navigate. A hypertext that is created ‗on the fly‗ is unlikely to be comprehensible to any but those
involved in its creation. Conklin and Begeman, themselves associated with the hypertext based
argumentation tool gIBIS, conclude that ‗traditional linear text provides a continuous, unwinding
thread of context.
For the asynchronous reader trying to catch up with a conversation, a linear transcript is clearly
easier, but it is precisely in more asynchronous settings where overlap in linear text is most likely to
cause confusion.
Cooperation or conflict?
The term ‗computer-supported cooperative work‗ (CSCW) seems to assume that groups will be
acting in a cooperative manner. This is obviously true to some extent;even opposing football teams
cooperate to the extent that they keep (largely) within the rules of the game, but their cooperation
only goes so far. People in organizations and groups have conflicting goals, and systems that ignore
this are likely to fail spectacularly.
Imagine that an organization is already highly computerized, the different departments all have their
own systems and the board decides that an integrated information system is needed. The production
manager can now look directly at stocks when planning the week‗s work, and the marketing
department can consult the sales department‗s contact list to send out marketing questionnaires.
The storekeeper always used to understate stock levels slightly in order to keep an emergency supply,
or sometimes inflate the quoted levels when a delivery was due from a reliable supplier. Also,
requests for stock information allowed the storekeeper to keep track of future demands and hence
plan future orders. The storekeeper has now lost a sense of control and important sources of
information. Members of the sales department are also unhappy: their contacts are their livelihood.
The last thing they want is someone from marketing blundering in and spoiling a relationship with a
customer built up over many years.
Some of these people may resort to subverting the system, keeping ‗sanitized‗ information
online, but the real information in personal files.
Changing power structures
The identification of stakeholders will uncover information transfer and power relationships that cut
across the organizational structure. Indeed, all organizations have these informal networks that
support both social and functional contacts. The official lines of authority and information tend to
flow up and down through line management.
The physical layout of an organization often reflects the formal hierarchy: each department is on a
different floor, with sections working in the same area of an office. If someone from sales wants to
talk to someone from marketing then one of them must walk to the other‗s office. Their respective
supervisors can monitor the contact. In face-to-face conversation, the manager can easily exert
influence over a subordinate: both know their relative positions and this is reflected in the patterns of
conversation and in other non-verbal cues. Email messages lose much of this sense of presence and it
is more difficult for a manager to exercise authority. The ‗levelling‗ effect even makes it possible for
subordinates to direct messages ‗diagonally‗ across the hierarchy, to their manager‗s peers, or, even
worse, to their manager‗s manager!
The invisible worker
The ability to work and collaborate at a distance can allow functional groups to be distributed over
different sites. This can take the form of cross-functional neighbourhood centers, where workers
from different departments do their jobs in electronic contact with their functional colleagues. If the
approach in an organization is ‗management by presence‗, that is you know someone is working
because they are in the office, then there is no way a remote worker is going to be trusted. If, on the
other hand, the style is ‗management by objectives‗, that is you know your subordinates are working
because they are doing their jobs and producing results, then remote working is not so problematical.
Who benefits?
In these systems the sender has to do work in putting information into fields appropriately, but it is
the recipient who benefits. Another example is shared calendars. The beneficiary of the system is a
manager who uses the system to arrange meeting times, but whose personal secretary does the work
of keeping the calendar up to date. Subordinates are less likely to have secretarial support, yet must
keep up the calendar with little perceived benefit. Of course, chaos results when a meeting is
automatically arranged and the subordinates may have to rearrange commitments that have not been
recorded on the system. The manager may force use by edict or the system may simply fall into
disuse. Many such groupware systems are introduced on a ‗see if it works‗ basis,
Free rider problem
A system may still not function symmetrically, which may be a problem, particularly with shared
communication systems. One issue is the free rider problem. Take an electronic conferencing system.
If there is plenty of discussion of relevant topics then there are obvious advantages to subscribing and
reading the contributions. When considering writing a contribution, the effort of doing so may
outweigh any benefits.
The total benefit of the system for each user outweighs the costs, but for any particular decision the
balance is overturned. A few free riders in a conference system are often not a problem, as the danger
is more likely from too much activity. In addition, in electronic
Figure: Cost/benefit of system use
conferences the patterns of activity and silence may reflect other factors such as expertise. It is easy
for the number of free riders gradually to increase and the system slide into disuse. It is hard to
enforce equal use, except by restrictive schemes such as round-robin contributions (everyone
contributes something however short). In the real world, such problems are often solved by social
pressure, and the free rider reacts to the collective censure of the group. Increasing the visibility of
participants‗ contributions might also help these social mechanisms.
Critical mass
When telephones were only in public places, their use as a form of pervasive interpersonal
communication was limited. However, once a large number of people have telephones in their homes
it becomes worthwhile paying to have a telephone installed. In cost/benefit terms, the early
subscribers probably have a smaller benefit than the cost. Only when the number of subscribers
increases beyond the critical mass does the benefit for all dominate the cost .
The telephone was useful for subgroups before it became beneficial for all. Even when only a small
proportion of the population had personal telephones, they still formed a significant proportion of
their social group, so these cliques of use could grow gradually over time.
Automating processes – workflow and BPR
Organizations have many such processes, and workflow systems aim to automate much of the
process using electronic forms, which are forwarded to the relevant person based on pre-coded rules.
Some workflow systems are built using special purpose groupware, often based on a notation for
describing the desired workflow.
The rigid form of a typical workflow system is an example of global structuring. The danger with
any form of global structuring is that it may conflict with or inhibit more informal and less structured
patterns of activity which also contribute to the organization‗s free running.
A more radical approach to organizational processes is found in business process re- engineering
(BPR). Traditionally, organizations have been structured around functions: sales, accounts, stores,
manufacturing. However, the purpose of an organization can be seen in terms of key business
processes. The ordering/delivery process described above is a typical and important example. In BPR
these processes are recorded and analyzed. Problems in the current process are noted and the whole
process may be redesigned in order to make the path of the process more efficient. For example,
instead of sending an order to the accounts department to approve, a list of customer credit limits
could be given to the sales executives. They could then check the credit rating of the customer whilst
on the phone and only forward the order to accounts if there are any unusual problems.
Evaluating the benefits
The benefits from cooperative systems, especially organization-wide systems such as email or
electronic conferencing, are in terms of job satisfaction or more fluid information flow. Some, such
as the video wall , are expected primarily to help social contact within the organization. It may be
possible to measure contentment and job satisfaction using attitude questionnaires.
The user‗s interaction with a computer is often viewed in terms of a language, so it is not surprising
that several modeling formalisms have developed centered around this concept. BNF grammars are
frequently used to specify dialogs.
The models here, although similar in form to dialog design notations, have been proposed with the
intention of understanding the user‗s behavior and analyzing the cognitive difficulty of the interface.
Representative of the linguistic approach is Reisner‗s use of Backus–Naur Form (BNF) rules to
describe the dialog grammar . This views the dialog at a purely syntactic level, ignoring the
semantics of the language. BNF has been used widely to specify the syntax of computer
programming languages, and many system dialogs can be described easily using BNF rules. For
example, imagine a graphics system that has a line-drawing function. To select the function the user
must select the ‗line‗ menu option. The line-drawing function allows the user to draw a polyline,
that is a sequence of line arcs between points. The user selects the points by clicking the mouse
button in the drawing area. The user double clicks to indicate the last point of the polyline.
The aims in the description are of two types: non-terminals, shown in lower case, and terminals,
shown in upper case. Terminals represent the lowest level of user behavior, such as pressing a key,
clicking a mouse button or moving the mouse. Non-terminals are higher-level abstractions. The nonterminals are defined in terms of other non-terminals and terminals by a definition of the form name
::= expression The ‗::=‗ symbol is read as ‗is defined as‗. Only non-terminals may appear on the
left of a definition. The right-hand side is built up using two operators ‗+‗ (sequence) and ‗|‗
(choice). For example, the first rule says that the non- terminal draw-line is defined to be select-line
followed by choose-points followed by lastpoint. All of these are non-terminals, that is they do not
tell us what the basic user actions are. The second rule says that select-line is defined to be position
mouse (intended to be over the ‗line‗ menu entry) followed by CLICK-MOUSE. This is our first
terminal and represents the actual clicking of a mouse.
Position-mouse is, we look at the last rule. This tells us that there are two possibilities for positionmouse (separated by the ‗|‗ symbol). One option is that position-mouse is empty – a special symbol
representing no action. That is, one option is not to move the mouse at all. The other option is to doa
MOVE-MOUSE action followed by position-mouse. This rule is recursive, and this second positionmouse may itself either be empty or be a MOVE-MOUSE action followed by position-mouse, and so
on. That is, position-mouse may be any number of MOVE-MOUSE actions whatsoever. Similarly,
choose-points is defined recursively, but this time it does not have the option of being empty. It may
be one or more of the non-terminal choose-one which is itself defined to be (like select-line)
position-mouse followed by
The BNF description of an interface can be analyzed in various ways. One measure is to count the
number of rules. The more rules an interface requires to use it, the more complicated it is. This
measure is rather sensitive to the exact way the interface is described. For example, we could have
replaced the rules for choose points and choose-one with the single definition choose-points ::=
position-mouse + CLICK-MOUSE | position-mouse + CLICK-MOUSE + choose-points.
Task–action grammar
Measures based upon BNF have been criticized as not ‗cognitive‗ enough. They ignore the
advantages of consistency both in the language‗s structure and in its use of command names and
letters. Task–action grammar (TAG)
10. STATIC WEB CONTENT (Pg no. 771 [21.5 static web content])
21.5.1 The message and the medium
One thing is often forgotten when web pages are created. It is of vital importance,
and hence will be discussed first.
It is content.
Many people assume that because they can make information available on the
web, they should. Unfortunately, because it is very easy to publish information,
much less care is taken with the actual content. Material may be nonsense, it may be
incorrect, it may not read well, or be incomplete, or inane.
Excellent page design can make useless material look attractive, but it still remains
useless material. On the other hand, poor design can mean that excellent material
is never seen by potential readers, as they have become bored, or intolerant of the
medium, or confused, or for a host of other reasons have aborted their attempts to
download and view the information. Pages do have to look immediately interesting
and attractive if people are to spend time, effort and, because of the communication
costs, money, in viewing them; the user-centered nature of the medium makes this
imperative. This is in marked contrast to television or cinema or other dynamic
media, which are not under any direct user control, where information is presented
to a passive audience. With web documents, people have actually to want to see the
information, and make an effort to retrieve it, which clearly must have an influence
on design.
Whatever is being presented, underlying all the comments made on good and
bad design, the fundamental message is that, for the user group or groups who are
targeted, the content should be worth reading.
When it is likely that a user will require a paper copy of the information made
available over the web, ideally they should be able to download it in one go as a
Delivering WAP content – balancing usability and feedback
Because of the tiny screens on phones it is difficult to scroll through a long WAP page. However,
if every link involved going back to the WAP server the feedback would be very slow. For this
reason WML divides WAP content into stacks of notes. For the user browsing the content, the
note is the parallel of the normal web page. A link may be to a note in the same stack or one in a
different stack; the user is largely unaware of which it is. However, when you request a note in a
stack the whole stack is downloaded to the WAP browser on your phone. This means that links
to notes in the same stack have much faster feedback. By carefully arranging content within stacks
and notes, the overall user experience can be improved.
single complete file, with the same information content but possibly a different
layout. Paper does not have the same inbuilt hypertextual and active capabilities as
the web page, and will be accessed in a predominantly linear fashion.
21.5.2 Text
Because web pages are displayed on many different machines, there are only a
small set of fonts that can be guaranteed to be available: a standard font and a typewriter
font (e.g. courier) with bold and italic versions in different sizes. However,
it is possible to specify preferred fonts and many of these such as Arial, Verdana or
Comic Sans are available on most web platforms. The difficult thing is to balance
fine tuning the appearance of the text on one platform with making it readable
on all.
The various structured styles such as headings allow the web designer to create
material that will lay out passably on all platforms. But these offer a fairly coarse level
of control. The size and boldness of the heading should be chosen carefully; for
example, huge dark fonts on a page can look loud and brash.
There is an increasing desire to have fine control. Cascading style sheets (CSS)
allow you to specify fonts, line spacing, size, etc., in a similar way to styles in a word
processor or DTP package. However, care must be taken. For example, many pages
specify fixed point sizes that may not display well on different platforms and can
cause problems for people with visual impairments.
The use of color is of great importance for web pages, but it is often abused. First,
it should be remembered that a significant proportion of the potential viewers of
the page will have problems with color, either because they are using older machines
with a limited color palette, or because they have some form of color blindness.
Color, when used, should not be the only cue available. Users also bring a deeprooted
emotional interpretation to colors; as we have seen, in some cultures, red is
associated with danger and anger, whilst green is regarded as go, or safe. Blue can be
a cool color, orange a warm one, and so on.
Links usually change color once they have been accessed, providing cues to the
user about what material they have already explored. This means that two distinct
but still suitable colors need to be associated with each link, so that the system is
acceptable whether or not the links have been activated. Note, too, that consistent
use of color can help the user understand the role of particular elements more intuitively,
whereas color used for no clear purpose is often distracting.
One common mistake is to put colored text onto a similar colored background
so that it becomes nearly invisible. One of the authors had a student who designed a
beautifully laid out page of text, and decided to add a background to the page just
before demonstrating it to the rest of the group. It was only at the demonstration
that he realized that the cool black background he had added made the black text
impossible to see!
There are only a limited number of text-placing options: text can be left or right
justified, or centered. There are a few predefined formatting styles such as ordered
and unordered lists that have additional structure, in the form of indentation from
the left margin, with numbering in the case of ordered lists. Vertical positioning is
even more limited, but tables and (cautious use of) frames allow a greater degree of
horizontal and vertical placement. More precise positioning still can be obtained
using ‘dynamic HTML’ (DHTML), which allows parts of an HTML document
(called layers or ‘div’ sections) to be positioned as if they were separate mini-pages
within the browser window. The word ‘dynamic’ is used because these can then be
controlled using JavaScript to produce various animated effects (those pages where
little stars follow your mouse!).
Remember that monitors are different sizes and that some people use full-screen
windows and others smaller ones. To prevent very long lines, many designers lay out
pages within tables that put maximum widths (in pixels) for columns based on typical
minimum expected monitor sizes (perhaps 800 600 or even 640 480). If fixed
layouts or large graphics are used then they may either display strangely on smaller
windows or force the user to scroll horizontally, which many users find confusing.
And don’t forget people viewing the page on PDAs!
The lack of explicit textual positioning makes it very difficult to produce complex
mathematical equations, and the font set available is not rich enough to provide
a suitable approximation. Developments in the specification are addressing this,
though the intrinsic complexity of typesetting mathematics suggests that it may be
a while before a simple, usable solution is found that is acceptable to readers, page
designers, and implementors of web browsers alike.
21.5.3 Graphics
Obtaining graphics
There are a number of sites on the web that contain archives of graphical images,
icons, backgrounds and so on. There are also paint and image manipulation packages
available on almost all computer systems, and scanners and digital cameras,
where available, enable the input of photographs and diagrams.
Using graphics
While graphics and icons tend to play a significant role in web page design, their use
should be carefully thought out. Graphical images take longer to load than text, and
this may become a problem. Text uses 8 bits to represent a character: some rough
calculations show that approximately 2000 characters represent about a screenful of
information, and so 16,000 bits (2 K) are required. For graphics, one pixel may use
8 bits to represent its color: a page-sized image will be at least 600 by 400 pixels,
which will take 1,920,000 bits (240 K), or 120 times as long to load. Put another way,
while a picture may tell a thousand words, it takes approximately 50 times as long to
appear! Users become bored with operations that take a long time to complete, and
are unlikely to wait for ages while a page appears.
774 Chapter 21 n Hypertext, multimedia and the world wide web
Complex backgrounds are the worst offenders in this area; they offer little in the
way of added value to the information presented on the page, and cause great
frustration for the poor reader. They tend to be designed and tested only on local
machines, with high-bandwidth connections between them, which means that the
time factor is negligible for the designer/user. However, this disregards the fact
that many people accessing the page will be using congested, slow networks, with a
transfer rate sometimes down to a few kilobits per second, rather than fast megabit
links. Fussy backgrounds also have the unfortunate ability to obscure text, making it
very difficult or impossible to read.
Different browsers support different types of functionality, with more recent
versions having features that try to alleviate the usability problems introduced by the
delay involved in downloading graphics. Most browsers support caching, in which
graphics are downloaded once and temporarily stored on the user’s local machine. If
the same image is reused, it is fetched from the local store far more rapidly than if
it were retrieved from the remote site. This clearly has implications for page design:
if graphics are to be used, then their reuse wherever possible speeds up the whole
process of drawing the page. Complex graphics can sometimes be broken down into
a set of items, many of which can be reused and assembled in different ways to add
visual impact to the page without causing large delays. Most browsers also offer the
option of turning off automatic image loading, so that only the text is downloaded.
If a page then appears to be of interest, the graphics can be explicitly requested. It is
sometimes possible to set out a page so that it still looks attractive even without the
graphics, which is necessary if the user has turned off image loading. There are other
browsers that are purely text based and do not support graphics of any sort, and for
these HTML offers an additional image attribute that allows a textual description
of the image to be used as an alternative. The need to support these different user
preferences and browser capabilities provides a great challenge in designing pages
that are acceptable to all.
Some browsers have additional features related to image handling as a technological
response to the problem of page usability. If the designer specifies the size
of the image in advance, the browser can lay out the text on the page first, leaving
spaces for the images. This allows the user to continue to read the page contents
whilst the images are being downloaded into their respective slots. This capability
improves the usability of the page, and so should be supported by the page
designer whenever possible, by incorporating the necessary information into the
image reference.
Both GIF (graphics interchange format) and JPEG (Joint Photographic Experts
Group), the most widely used web graphic image formats, can be saved in forms
that allow them to be progressively transmitted. This means that images appear
as a whole, but very blurred, version that becomes gradually sharper, rather than
appearing in perfect resolution a line at a time. An overall impression of the page
and the graphic information appearing is thus given to the user, who is then better
informed about whether or not to continue the download.
The JPEG format is optimized for photographic images and makes use of their
properties to offer a higher compression ratio and hence faster loading. However,
its compression is lossy, that is the image reproduced is slightly different from the
original, losing certain kinds of visually indistinguishable colors and, more importantly,
losing high-frequency change. The latter is because photographs tend to have
slowly varying changes with few sharp edges. If sharp-edged images such as diagrams
or text labels are stored as JPEGs, small artifacts are produced such as ripples appearing
around letters and lines. In contrast GIF uses a lossless compression so that the
image appears exactly as it started. Although GIFs can be used for photographic
images, the compression is very poor.
The GIF format also allows animated GIFs. These are a sort of mini-slide show
or movie where several images are stored in the same file and play one after another.
These can be used to produce simple and effective animations, but when overused
can lead to very ‘noisy’ pages.
Active image maps are pictures with defined ‘hot’ areas, which, when clicked, execute
a particular script, usually calling another web page or graphic. The careful use
of such maps can transform an interface, but there is an overhead to pay in loading
the map and calling and running the script, and this should be considered carefully
by a page designer. Another characteristic of image maps is that there is rarely any
indication of which areas are active and which are not, and it can be difficult for users
to ensure that they have visited all the active regions. For accessing spatially organized
information, image maps are very suitable (for example, in a tourist information
system, the area about which information is available can be represented well),
but for information that does not have any clear spatial analog, their use is more
Icons often appear on web pages, and while there are many available to choose from,
they should be used with care. On web pages, icons are typically used in one of two
ways. They are either visual cues, associating some small picture with different parts
of the text (for example, some pages have icon-sized characters that appear next
to instructions). Alternatively, they are used in much the same way as in a standard
WIMP interface to represent aspects of the functionality of the underlying pages.
In this latter case, they must represent their associated functionality in either a concrete
or an abstract form. This means that the design of the individual icon has to be
carefully thought out, as a lot of information may have to be represented in a small
area of screen estate. However, icons are rarely seen on their own, and when placed
next to their neighbors, the whole effect has to be pleasing rather than disruptive
and garish. Therefore, the group of icons has to be designed together, with a coherent
and recognizable style. The picture is broader than this, however: other applications
also use icons, which has its advantages and disadvantages. One advantage is
that certain icons are already associated with specific functionality (for example,
a picture of a floppy disk to represent ‘save’). Disadvantages are that it restricts
the individuality and style icon sets can show, and may mean that icons designed
for one purpose are misunderstood by users because they have seen something
similar in another context. It is therefore vital that time is spent in examining the way icons are used in other systems,
before importing them into web pages or designing
new ones.
Icons sometimes appear for no apparent reason at all, when the
page creator has decided that as graphics are supported, a few should
be used. One interesting example is the icon that is a copy of the
roadworks sign, used in conjunction with text saying something like
‘This page still under construction!’.
This is an interesting social effect brought on by the ease of web publishing –
incomplete (sometimes even non-existent!) information can be made immediately
available. It’s like buying a map of the world nowadays and finding ‘here be dragons’
around the edges because the geographer could not be bothered to draw all the
countries in. Most pages can be designed properly before they are made available,
structured and presented in a complete and coherent way, allowing for extensions
and updates from the beginning.
There are times when the disclaimer ‘under construction’ has its uses: when critically
important information becomes available, publishing it may well be more
important than presenting it (much like older maps with their dragons – if maps had
only been printed once the whole world had been explored, civilization would be
very different today). There is, too, a sense in which web pages can be continually
‘under construction’, changing, evolving and growing, because of their dynamic
nature and the ease with which they can be updated, but this does not obviate the
designer’s responsibility to create pages that have both form and content.
Graphics and color
Using many different colors within graphics may well result in the browsers for older
machines running out of entries in the colormap, with unpredictable consequences.
This is often problematical as the browser may be running in tandem with other
color applications, and only has a restricted range of colors to begin with.
For many consumer markets, for example in the UK and the US, this is unlikely
to be a problem as home machines are often relatively recent. However, many
businesses continue to use older PCs so long as they ‘do the job’ and PDAs may not
have a full color palette. Furthermore, in economically deprived areas, where there is
computer access it may well be through older or second-hand machines.
If universal access is required it is therefore still wise, where possible, to restrict
images to a limited number of colors, taken from the standard 216 color web palette,
and to reduce complex color images to simpler approximations. Reducing the
number of colors used also allows the depth of the images to be reduced; a change
from a default of 8 bits to, say, 4 bits will produce a twofold speedup in image loading.
The earlier comments on the use of color obviously apply as much to graphics
as they do to text.
One further point should be made about graphics: computer screens are typically
limited to a resolution of around 72 dpi (dots per inch), and so either high-resolution
images will have to be displayed much larger than actual size, or the increased resolution
will be forfeited.
21.5.4 Movies and sound
Movies and sound are both available to users of the web, and hence to page designers.
One problem associated with them is actually obtaining appropriate sound and
video clips, as they usually require some sort of multimedia capability on behalf
of the host machine in order to be able to digitize sound and capture and digitize
video. Video suffers from the same problems as graphics, magnified by an order of
magnitude or two; it can take extremely large amounts of time for a video segment
to download. Video is also not well integrated into the web, requiring the creation
of a process to run it that is separate from the page from whence it came. Not all
receiving machines have the capability to play video, or sound, and so it is unwise for
a designer to rely on these dynamic media to convey information without replicating
it elsewhere.
The use of sound and video moves page design further away from the typesetter
and toward the sound engineer and cinematographer; the integration of these
cinematic media with the enhanced textual capabilities offered by the web is a new
domain, in which the techniques that work and those that fail have not yet been fully
explored, let alone understood.
The need to download movies and sound (see Figure 21.8) puts sharp limits on
the length of clip that can be shown. Streaming media over the internet, such as
RealVideo, RealAudio and CuSeeMe, allow potentially unlimited sources. As well
as longer prepared clips, these techniques allow live transmission (e.g. live radio
broadcasts over RealAudio) and long recorded sequences for asynchronous communication.
An excellent use of the latter is the eClass project, introduced in Section
20.2.2, which links recordings of audio and video during a lecture with pen strokes
on an electronic whiteboard, so that students can replay the part of a lecture associated
with any slide or annotation.
Acceptable streaming video and audio is achieved by a combination of high compression
and large client-end buffers. The former leads to loss of quality including
blurring and ghosting after rapid changes in screen content. The latter leads to
delays, often of several seconds, which makes it impossible to support video conferencing.
The challenges of achieving high quality transmissions (e.g. for video on
demand) and low latency (e.g. for video conferencing) are active research topics in
multimedia technology.
Figure 21.8 Animated GIF or movie needs to download completely
Stepping back a bit to look again at this, note that it is often not raw bandwidth
which is the problem on the internet, but packet losses and jitter. Buffering effectively
solves this by trading off quality against delay, which is okay for fixed content,
or low pace change (as in eClass), but is problematic when we require a high pace of
interaction (as in video conferencing). CuSeeMe uses little buffering and hence is more
21.6.1 The active web
In the early days, the web was simply a collection of (largely text) pages linked
together. The material was static or slowly changing and much of it authored and
updated by hand. Some pages were generated on the fly, in particular the gateways
into ftp servers and to gophers, which were so important in adding ‘free’ content to
the web [97]. However, even here the user’s model was still of a static repository of
information. Web surfers may not have always known where they were, but they had
a pretty good idea of what they were seeing and that if they came back it would be
the same.
It was a pleasant, if somewhat boring world, but from a usability viewpoint it was
wonderful – a consistent interface to terabytes of information. Who could ask for
more? Indeed, this is one of the key arguments Nielsen brings against frames-rich
sites in his famous alertbox, Why frames suck (most of the time) [263] – frames break
this simple user model and hence cause trouble. Nielsen calls for a new richer model
for the web, which preserves the simplicity of the old model, but which can accommodate
and guide the development of new features.
Well, if frames cause trouble, what about applets, timed refreshing pages, rollovers,
dynamic content creation? What are we interacting with – is it information, is
it computer systems? In fact, this was a problem with hypertext interfaces well before
the web existed. Back in 1989, two of the authors, Janet and Alan, wrote about the
potential problems of these shifts between passive and active paradigms within an
interface. Our solution was to accept these differences, but to make them evident to
the user through the design of an effective medium of interaction [103]. Of course
it’s easy to say . . .
As HCI researchers and designers, we can neither ignore nor uncritically accept
new technology in the web. The active web is here, our job is to understand it and to
learn how to use it appropriately.
In previous sections, we have already looked at the simplest form of active web
page, those with movies, animated gifs or streaming audio. These are simplest, not
in the sense that no effort is required – a short video clip may require many days of
production effort – but in the sense that they have least user interaction. In this section
we’ll look at more complex forms of interaction. First, where the actual content is fixed, but the user can
change the form of presentation; secondly, at the generation
of pages from database content; and finally at the update of database information
through the web.
21.6.2 What happens where
When considering dynamic material on the web we need to take the external, user’s
viewpoint and ask what is changing: media, presentation or actual data; by whom: by
the computer automatically, by the author, by the end-user or by another user; and
how often, the pace of change: seconds, days or months? From a technical standpoint,
we also need to know where ‘computation’ is happening: in the user’s web-browsing
client, in the server, in some other machine or in the human system surrounding it?
The ‘what happens where’ question is the heart of architectural design. It has a major
impact on the pace of interaction, both feedback, how fast users see the effects of their
own actions, and feedthrough, how fast they see the effects of others’ actions. Also,
where the computation happens influences where data has to be moved to with corresponding
effects on download times and on the security of the data.
The user view
One set of issues is based on what the end-user sees, the end-user here being the web
What changes? This may be a media stream (video, audio or animation) which is
changing simply because it is the fundamental nature of the medium. It may be
the presentation or view the user has of the underlying content; for example, sorting
by different categories or choosing text-only views for blind users. A special
form of presentation change is when only a selection of the full data set is shown,
and that selection changes. The deepest form of change is when the actual content
By whom? Who effects the changes? In the case of a media stream or animation,
the changes are largely automatic – made by the computer. The other principal
sources of change are the site author and the user. However, in complex sites users
may see each other’s changes – feedthrough.
How often? Finally, what is the pace of change? Months, days, or while you watch?
We’ll use the ‘what changes?’ categories as we examine alternatives and trade-offs in
more detail below. But first we also need to look at the technological constraints.
Technology and security
The fundamental question here is where ‘computation’ is happening. If pages are
changing, there must be some form of ‘computation’ of those changes. Where does
it happen?
Client One answer is in the user’s web-browsing client enabled by Java applets,
various plug-ins such as Flash, scripting using JavaScript or VBScript with
dynamic HTML layers, CSS and DOM (Domain Object Model).
Server A second possibility is at the web server using CGI scripts (written in Perl,
C, UNIX shell, Java or whatever you like!), Java Servlets, Active Server Pages or
one of the other server-specific scripting languages such as PHP. In addition,
client-side Java applets are only allowed to connect to networked resources on the
same machine as they came from. This means that databases accessed from clientside
JDBC (Java database connectivity) must run on the web server (see below).
Another machine Although the pages are delivered from the web server, they may
be constructed elsewhere. For hand-produced pages, this will usually be on the
page author’s desktop PC. For generated pages, this may be a PC or a central
database server.
People Of course, as noted earlier, the process of production and update may even
involve people!
It is easy to roll out maxims such as ‘users first’, but, in reality, the choice between
these options is not solely a matter of matching the end-user requirements. The best
choice also depends on the expertise of the web developer and external limitations.
If the server runs on a UNIX machine, you can’t expect to use Microsoft Active
Server Pages. On the other hand, if you are designing for an intranet you may even
get to influence the choice of client software and so make it easier to use more
complex client-end solutions.
The choice of technological solution is also heavily influenced by issues of
security. When we do any computation on data the computation and the data must
be in the same place [299]. This apparently simple factor means that if we want to
keep data secure (read ‘on well-protected servers’) then we must also perform the
critical computation on the same servers. For example, imagine a password check.
It would be foolish to send the correct password to a Java applet to check!
21.6.3 Fixed content – local interaction and changing views
Probably the most hyped aspect of the web in recent years has been Java. In fact, Java
can be used to write server-end software and platform independent standalone programs
(not to mention the embedded systems for which it was originally designed!),
but the aspect that most people think of is Java applets.
Applets are just one of the techniques that can be added to give client-end interaction
(and about the least well integrated into the rest of the page). The most
common alternatives are JavaScript, Flash and if you are prepared to limit yourself
to Windows platforms, ActiveX plug-ins. These techniques share the characteristic
that they are downloaded to the user’s own machine (see Figure 21.9) and thereafter
all interaction happens on the PC, not across the network (with caveats – see below).
The simplest use of this is to add interaction widgets such as roll-over buttons
(usually using JavaScript). More complex pages may add the equivalent of an
interactive application on the page. For examples, see Alan’s pages on coin tossing
experiments (Figure 21.10), which use JavaScript to emulate real and biased coins,
and dancing histograms (Figure 21.11), which use a Java applet. See Sun and
JavaSoft’s own sites for many more examples. The addition of DHTML gives even
more opportunities for dynamic pages where parts of the page can move, change
size, or change content all without any interaction with the web server.
Notice how this local interaction confuses the static model of the web. What
should happen when you go back to a previously visited page, or reload it? Do
you get the original state or the last state of your interaction? What happens if you
launch a second window on the same page? The actual behavior tends to be browser
specific and not always what you would expect! In particular, some browsers do
not re-initialize applets on a reload and so if you edit the applet’s parameters and
then reload you may not see the effects of your changes. More of a problem for web
developers than end-users, but very confusing.
Figure 21.10 Simulated coin tossing using JavaScript. Screen shot frame reprinted
by permission from Microsoft Corporation
Figure 21.9 Java applet or JavaScript running locally
21.6.4 Search
Some user-driven interaction can be accommodated at the client end, but not all.
Consider search engines. It would be foolish to download several megabytes of
information so that a Java applet can search it online! Instead, all common web
search pages work by submitting forms to the server where CGI programs perform
the searches and return results. An additional reason for this approach is that
most browsers support forms, but some still do not support Java or scripting in a
consistent manner. The web search engine for this book works in this way. The user’s
keywords are submitted to the server using an HTML form, they are compared
against pre-prepared indexes at the server and all matching paragraphs in the book
are returned (Figure 21.12). This also reminds us of another reason for not downloading
all the text to the user’s machine – security; we don’t want to distribute the
full electronic text for free!
Notice that, in all the above, the underlying content does not change; the variable
factor is the user’s input. The computation (working out what to show) needs both
the data supplied by the web author (pages, databases, indexes, etc.) and the user’s
input. The result must end up on the user’s screen. Either the data must come to the
user’s machine (as in Alan’s dancing histograms where the histogram data are in
applet parameters); or the user’s input must go to the server (as with the search). We
can see from the examples that the choice between these depends on the required pace
of interaction, the size of the data set required, security and available technology.
Figure 21.11 Dancing histograms using Java applet. Screen shot frame reprinted by
permission from Microsoft Corporation
21.6.5 Automatic generation
It was evident in the earliest days of the web that a key problem for the future would
be maintenance. In the first rush of enthusiasm, individuals and organizations
produced extensive and fascinating sites built largely of hand-crafted HTML. Not
surprisingly, vast areas of the web are not just static but in perpetual stasis. Web
surfing sometimes seems not so much a watersport, but an exercise in archaeology.
From the beginning it was clear that websites would eventually need to be created
from databases of content combined with some form of templates or layout description.
However, at that stage there were no tools available and those who saw the
database future used a variety of ad hoc methods.
Happily, there are now a (sometimes bewildering) array of products for automating
web production from existing and bespoke databases. These include
vendor-specific products such as Oracle Web Server and Domino (for publishing
Lotus Notes), and also more general techniques such as using SQL (structured query
language) or JDBC to access databases from CGI scripts or even from running Java
Database-generated websites have many advantages. They make use of existing
data sources. They guarantee consistency of different views of the data within the site
and between the site and the corporate data. They allow easy generation of tables of
contents, indices, and inter-page links. They separate content and layout.
The most dynamic way to get database content on the web is by accessing a
database directly from a running applet. The interface can then have any lookandfeel that can be programmed in Java and can allow very rapid interaction
with the user. The Java applet can establish an internet connection back to the web
server to access data files using HTTP (as if they were web pages), it can connect
to a bespoke server (e.g. for chat type applications) or it can use standard database
access methods. The latter would normally use JDBC, the Java database access
package. Using JDBC the applet can issue complex SQL queries back to the database
meaning that some of the most complex work happens there (Figure 21.13).
In all cases, the Java security model built into most web browsers means that the
applet can only connect back to the machine from which it came. This means that
the database server must run on the same machine as the web server. Think about
this. The most insecure part of any system is usually the web server, both because it
is easy to leave loopholes in the many file access permissions and also because it often
sits outside the most secure part of a corporate firewall.
The more common solution is where the user uses a web forms interface (or
special URL) and then a CGI script runs at the server end accessing the database
(Figure 21.14). The CGI script generates a web page, which is then returned to the
user. Some of the vendor-specific solutions use essentially this approach but bypass
the web-server/CGI step by having their own special web server which accesses the
database directly using their own scripting language or templates.
The user interface of such systems is limited to standard HTML features. This is a
limitation, but is at least consistent and means that it will work with virtually any
browser. Java applets can offer more rapid surface interaction, but both have to wait
for the actual data to move between server and client. Of course, the pages generated
by a CGI script can themselves contain JavaScript or Java applets for local interaction,
so the difference between the two solutions is not so radical as first appears.
From a security angle, the database accessed from the CGI script can run on a separate
machine (using standard database remote access methods or even a Java/JDBC
CGI program), thus making the system more secure. However, the database cannot
be entirely secure – if the web-server machine is compromised the CGI scripts can
be altered to corrupt or modify the database! The special vendor-specific web servers
are probably more secure as they don’t require a standard web server to be running.
21.6.6 Batch generation
A low-tech but very secure solution is to generate pages offline from a database
and then upload them to the web server (Figure 21.15). Many of Alan’s earliest web
pages were generated in this way from HyperCard stacks.
This is certainly a simple solution as it separates out the task of page generation
from that of page delivery. Pages can be generated directly using many standard
database packages such as Access or HyperCard. Alternatively, standalone programs
in languages such as Visual Basic, Java or C can access a database and output HTML
pages. These programs can run on a central computer, or on your own PC. The
generating program simply produces a set of HTML pages on your own disk that can
be checked locally and then copied onto the web server using ftp or shared network
disks. Many people think that this will be difficult, but in reality it is remarkably
easy, as you can use the tools you are used to – if you can create a text file you can
create HTML. In fact, the snippet of Visual Basic in Figure 21.16 is a trivial but fully
functioning HTML generator!
Some web scripting languages can be used in this mode too. For example PHP
allows you to send the page being generated into a buffer, which can then be saved
to a file. This can be run on a separate machine, or on the web server itself. The
latter sounds unnecessary; however, web servers are more efficient at delivering static
material so this sort of batch generation can be used simply to improve efficiency.
As well as the ease of programming, the offline generation of web pages means that
there is no need for an online connection between the web server and the database,
so a breach in the security of the web server doesn’t compromise your database. In
addition, it may mean that the web server can be configured without CGI scripting
enabled at all, which considerably increases its security.
The downside is that you can only show the indices and pages that you can precompute.
So, you could use a product database to produce a pro-forma page for each
stock item, plus alphabetic and categorized lists, but you could not produce a special
list based on a user’s own search criteria.
This low-tech solution is appropriate in many circumstances – whenever the pace
of change is low (e.g. overnight or when periodic updates are acceptable), the volume
of data is not too large and no online searching is required. Even when some of
these conditions don’t hold, it is possible to use the general approach. For example,
searching can often be arranged by having a much cut-down database or index on
the web server with most pages pre-computed.
21.6.7 Dynamic content
The mechanisms we have been discussing manage the feedthrough when the database
is updated by some non-web means. Perhaps the most ‘active’ web pages are those
where the content of the pages reacts to and is updateable by the web user.
If pages are generated from database content using either the Java-applet/JDBC
method or the CGI method, the same mechanisms can as easily be used to update
as to access the database. The feedback of changes to the user is thus effectively
instantaneous – you check for seat availability on the theatre web page, select a seat,
enter your credit card details and not only is the seat booked, but you can see it
change from free to booked on the web page.
This sort of web application opens up many additional problems. You may need
to add some form of login or authentication. If credit card numbers are supplied you
need to ensure that the web server is secure. Also, without care it is easy to design
solutions that accidentally book multiple seats if the user presses the back button and
ends up on what appears to be a simple confirmation screen.
If we consider an estate agent’s web page, with houses for sale and potential
buyers, the situation is rather different. The pace of change is slow; house purchases
take place over days, weeks and months. A solution that automatically marked a
house as sold would neither be necessary nor desirable! In this case a socio-technical
solution using low-tech database generation would probably be sufficient. The web
page can have a contact telephone number, email address or message form. Queries
via these channels (as well as non-web-based queries) come to the estate agent who
is responsible for deciding when to mark the house ‘sold’. There is a continuous
loop between web user and the database, but it involves human as well as automatic
Going in the direction of greater complexity, many business applications operate
an n-tier web architecture. This involves multiple layers of software where the outer
layers are concerned more with the user interface and the inner layers more with
business functionality. Figure 21.17 shows this using several web standards. The user
interacts through a web browser with a web server. The pages are generated using
Java Servlet Pages (JSP). To generate the page the servlets connect to Java Enterprise
Beans (JEB) on an enterprise server. These are components that encapsulate ‘business
logic’. For example, in a banking system this could include rules on whether a
particular transaction is allowed. These Java Enterprise Beans draw their data from
the corporate database using JDBC connections.