Requirements for Visualization Software
This article reviews the problems with current visualization software and develops four groups of requirements for
visualization software: hardware and software requirements, user requirements, visualization researcher requirements,
and visualization developer requirements. I believe that meeting these requirements will produce visualization software
that will be effective, flexible, highly usable, and fast and easy to implement.
1. Introduction
Hardware and software requirements deal with the range of software that contains visualizations and the variety of
hardware platforms on which they run. User requirements concern the needs of the users of visualization software.
Visualization researchers have requirements for designing, prototyping, and evaluating new visualization software.
Finally, visualization developer requirements deal with the needs of programmers that implement visualization
software. I believe that meeting these requirements will produce effective, flexible, and highly usable visualization
software that is fast and easy to implement.
Section 2 introduces visualization and the people involved with it. Sections 3, 4, 5 and 6 describe the hardware and
software, user, researcher, and developer requirements, respectively. Section 7 summarizes the four groups of
requirements.
2. Background
The amount of information produced is growing exponentially. Recent estimates indicate that approximately one million
terabytes of data is generated annually worldwide, of which 99.9% is available only in digital form (Keim 2001).
Repositories are swelling with financial, legal, bibliographical, geographical, political, and environmental information, to
name a few.
Information is analyzed by commercial, governmental, and academic research institutions to shape policies and plan for
growth and development. Information collected by businesses provides a valuable strategic asset. In particular, e-commerce businesses are able to collect an unprecedented amount of information about their customers' on-line
activities. Each page visited and re-visited, mouse click, and on-line purchase is recorded for analysis.
The need to process, understand, and make decisions using information has never been greater and this need will
increase in the future. Visualization techniques provide powerful tools for displaying, analyzing, and exploring
information.
2.1 Visualization
Visualization uses a variety of techniques to present large amounts of information in a pictorial form so that it can be
understood more easily. Comprehending a large amount of data can be an enormous cognitive task, but may be eased
through visualization which exploits the highly developed human visual perceptual system.
The simplest forms of visualization use techniques from mathematics. Simple line graphs and histograms show numeric
data such as patient temperatures and stock prices that change over time. Two- and three-dimensional scatter plots
show clusters, trends, and outliers. More dimensions can be added by varying the size, shape, colour, and shading of
each data point.
Data visualization is primarily concerned with understanding large amounts of numeric information, often physical
phenomena with inherently spatial characteristics. Data visualization has many applications in science, engineering, and
medicine, disciplines that commonly use 3D graphics and animation to visualize data. Magnetic resonance imaging
(MRI), for example, produces 3D models of organs, bones, and teeth that medical professionals can rotate and slice.
Data that change over time such as weather patterns and the turbulence over an aeroplane wing can be understood
more easily using animation to show the changes as they happen.
Information visualization is concerned with understanding the structure of non-numeric information and its
interrelationships. Examples of information visualization include network and hierarchical information structures, the
spatial data used in geographical information systems, and the analysis of business transactions.
The power of a visualization can be increased by providing tightly coupled tools for exploring and analyzing the data.
Visual exploration has enormous potential for revealing interesting patterns and relationships such as clusters,
correlations, trends, dependencies, and exceptions. Sophisticated exploratory data mining and analysis systems can be
created by enabling the user’s task knowledge and sophisticated decision making abilities to drive highly interactive
visualization software.
So far, I have given a traditional definition of visualization. I extend this definition to include all visual displays of
information such as interactive maps, product advertisements, and organization charts. Broadening the definition to
accommodate a much wider range of visual information display problems provides a significant opportunity for
supplying visualization solutions.
2.2 People in Visualization
I am concerned with improving visualization software for three groups of people: users, researchers, and developers.
Visualization users are the people that use visualization software to display, explore, and analyze their data.
Visualization researchers design, prototype, and evaluate new visualization software. Visualization developers
implement complete, robust, and polished visualization software, often using the results of visualization researchers.
These three groups are not mutually exclusive. For example, a visualization developer might research and then
implement new visualization software. A visualization user might be able to design a new visualization style that meets
his or her needs better, but require a developer to implement it.
The next section begins the discussion of visualization software requirements with the requirements for hardware and
software.
3. Hardware and Software Requirements
Visualization software needs to be incorporated into a variety of software applications and needs to be run on a range of
hardware with vastly different resources.
3.1 Hardware
Visualization software was previously only available on high end workstations. As processing power has increased and
memory prices fallen, PCs and laptops are now also able to run sophisticated visualization software. Extending the
definition of visualization to all visual information displays opens up the range of hardware on which visualizations need
to be run to include less powerful PCs and resource limited mobile devices such as personal digital assistants (PDAs) and
mobile phones.
Each new generation of mobile technology is becoming more powerful and is able to run a wider range of applications.
For example, devices that provide in-car navigation using Global Positioning System (GPS) data are already available.
GPS navigation devices use visualization to display the user’s current position overlaid on a map of the region.
Small devices such as PDAs require different modes of interaction than the hardware traditionally used with visualization
applications. Virtual on-screen keyboards and styli replace hardware keyboards and mice. Such devices rely more on
visual interaction which provides further opportunities for visualization applications.
A vast number of users already have mobile devices such as mobile phones. As mobile devices become more
sophisticated, the opportunities for supplying visualization applications to such a vast user base are enormous. To capture
the largest possible user base, visualization software needs to run on a range of hardware that spans, from most powerful to
least powerful, workstations, PCs/laptops, PDAs, and mobile phones. A different version of a visualization application
will be needed for each of these platforms.
3.2 Software
Visualization is used in a wide variety of applications such as dedicated visualization software, networked software and
web pages, and ordinary non-visualization software applications. Dedicated visualization applications are powerful
workstation-based software packages that focus on tasks such as visualizing geographical data collected from satellites,
complex mathematical simulations of nuclear reactions, and the results of MRI scans. Visualization software needs to
support traditional workstation-based visualization applications.
The growth of the Internet and the development of the web and browser technology require software to be
networked. The uptake of ISDN and now broadband means that fast network connections are available to small business
and domestic users as well as large companies and research establishments. Visualization software must be able to be
run over networks and to visualize networked information. As network bandwidth increases, visualizations of data that
change in real time will need to be updated in real time.
Visualizations are also becoming more common in ordinary non-visualization software. The term non-visualization refers
to software that is not primarily designed to visualize information, such as word processors, spreadsheets, and file
management programs, but that still uses data visualization. Examples include the disk utilization pie chart in the
Windows Explorer application and the Disk Defragmentation application, both part of the Microsoft Windows operating
system, currently the most widely used PC software. The Calendar application in the Pocket PC operating system for PDAs uses
visualization to display large amounts of information on a limited screen with levels of detail. There is a significant
opportunity for providing visualization software components that enable developers to add visualization to their
software.
Visualization software must integrate into third party software and expose APIs that let developers programmatically
control all aspects of a visualization and capture low level user interaction events such as mouse movements.
Visualization software must also capture and generate high level visualization events such as data selection to fully
support the development of visualization applications.
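The split between low level and high level events can be sketched as a single event registry. This is a minimal illustration, not an existing API: the class name VisualizationEvents, the methods on and emit, and the event names mouse_move and data_selected are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[dict], None]


class VisualizationEvents:
    """Hypothetical event hub shared by low level and high level events."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_name: str, handler: Handler) -> None:
        """Register a handler for a named event."""
        self._handlers[event_name].append(handler)

    def emit(self, event_name: str, **payload: Any) -> int:
        """Deliver the payload to every handler; return how many were called."""
        handlers = self._handlers.get(event_name, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


events = VisualizationEvents()
log = []

# Low level: raw pointer movement reported by the host application.
events.on("mouse_move", lambda e: log.append(("low", e["x"], e["y"])))
# High level: the visualization reports which data items were selected.
events.on("data_selected", lambda e: log.append(("high", tuple(e["ids"]))))

events.emit("mouse_move", x=10, y=20)
events.emit("data_selected", ids=[3, 7])
```

A host application would subscribe only to the events it needs; the high level events spare it from reconstructing a data selection out of raw mouse co-ordinates.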
Visualization software needs to be able to scale up to large complex 3D scientific visualizations and to scale down to
simple 2D information displays on resource limited mobile devices. Visualizations need to be scalable so that the most
appropriate version of a visualization can be supplied for the platform on which it is to run. For example, a complex 3D
visualization of a large amount of data is ideal for a workstation but would not be able to run on a PDA. Resource limited
devices have far less memory and processing power than workstations and PCs so visualization scaling is required.
Visualization scaling reduces the size and complexity of visualizations to enable them to be displayed by less powerful
hardware platforms. For example, a complex 3D animated visualization of 10,000 data points for a workstation could be
simplified to a 2D animated visualization of 2000 data points for a PC/Laptop, and further simplified to a static
visualization with summaries of the data for a PDA. Another example of the need for visualization scaling is to run PC
route finding software on PDAs and mobile phones. PDAs have smaller screens and less memory and processing power
than PCs, and current mobile phones have smaller screens and less memory and processing power than PDAs.
Visualization scaling would produce a simpler map of the region with fewer place names and landmarks for a PDA than
for a PC, and an even simpler map for a mobile phone than for a PDA.
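Visualization scaling of the kind described above might look like the following sketch. The platform names, the point budgets (taken from the 10,000 and 2000 point example), and the function name scale_points are assumptions made for illustration; a real system would also scale the rendering style, not only the data.

```python
from statistics import mean

# Assumed budgets per platform; 0 means "summaries only".
POINT_BUDGET = {"workstation": 10_000, "pc": 2_000, "pda": 0}


def scale_points(points, platform):
    """Reduce a non-empty list of numeric values to fit the platform's budget."""
    if not points:
        raise ValueError("points must not be empty")
    budget = POINT_BUDGET[platform]
    if budget == 0:
        # Least powerful platform: replace the data with a static summary.
        return {
            "kind": "summary",
            "count": len(points),
            "min": min(points),
            "max": max(points),
            "mean": mean(points),
        }
    if len(points) <= budget:
        return {"kind": "points", "points": list(points)}
    # Keep every step-th point so that at most `budget` points remain.
    step = -(-len(points) // budget)  # ceiling division
    return {"kind": "points", "points": list(points[::step])}


data = list(range(10_000))
workstation_view = scale_points(data, "workstation")  # all 10,000 points
pc_view = scale_points(data, "pc")                    # every 5th point
pda_view = scale_points(data, "pda")                  # summary statistics
```

Uniform sampling is the simplest possible reduction; clustering or aggregation would preserve outliers better but is beyond this sketch.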
4. User Requirements
Users have a variety of requirements for visualization software. This section describes the need for different
visualization styles, generic visualization and data analysis tools, and for the ability to tailor generic visualization
software to meet the needs of specific visualization applications. This section concludes with the requirements for an
integrated visualization software package that incorporates these requirements.
4.1 Data
Users need to visualize many different types of information; seven of the most common are one-, two- and three-dimensional; temporal; multi-dimensional; tree; and network data. One-dimensional linear data types include textual
documents, program source code, and alphabetical lists of names that are all organized sequentially. Two-dimensional
planar data represent objects that have area, and examples include geographical maps, floor plans, and newspaper
layouts. Three-dimensional data represent real world objects that have volume or 3D co-ordinates, such as
molecules, the human body, and buildings.
Temporal data record events that happen over time such as medical record timelines, project management schedules,
historical records of events, and video editing sequences. Temporal data are distinct from one-dimensional data because
temporal items have a start and finish time.
Multi-dimensional data are often stored in relational and statistical databases. Data with n attributes are often
represented as points in an n-dimensional space. Techniques such as multi-dimensional scaling can scale an n-dimensional space onto two- or three-dimensional spaces to make use of the visualization techniques that are available
for these lower dimensional spaces.
Tree data represent hierarchies: collections of items in which each item is connected to its
parent. Networks represent data that cannot be represented by trees because they need arbitrarily complex
interconnections between items.
There are also many variations of these seven basic data types, such as multi-trees where each item can be the root of
another tree, and four-dimensional data that is represented by adding colour or different plotting symbols as an extra
dimension to three-dimensional co-ordinates.
The amount and homogeneity of the data displayed by a visualization varies widely. Information displays such as an
interactive product advertisement will present a relatively small amount of heterogeneous data such as price,
dimensions, available colours, and the current stock level. In contrast, database visualizations will display a large number
of homogeneously structured database records.
Visualization software needs to be able to represent all of the commonly displayed types of information, as well as their
combinations and variants. Visualization software must also be able to represent large and small amounts of
homogeneous and heterogeneous data.
4.2 Different Visualization Styles
Visualization research shows that no visualization style is best for all data sets or tasks and several different styles can
often be used to visualize the same data; for example, pie charts and histograms can summarize the same set of numeric
data.
The selection of an appropriate style depends on several criteria such as the amount of data and its characteristics, the
stage the user is at in a data exploration, and the experience and abilities of the user. The choice of visualization style is
often constrained by the amount of information to be visualized. Some visualizations present all the data at once, others
present an overview and require users to explore it to discover more details. Some styles are not suitable for large
amounts of data.
The characteristics of the information can also influence the style of visualization. For example, scientific data
visualization tends to model physical phenomena that are inherently three dimensional. Models of organs, bones, and
teeth built from the results of MRI scans, and geographical models built from satellite data are naturally presented in
3D. Information visualization, on the other hand, models data that is more abstract and that does not naturally map
onto a particular visualization style.
The stage the user is at in a data exploration can suggest which visualization style is best. Shneiderman’s (1996)
information seeking mantra—overview, zoom and filter, then details on demand—supports this view. Visualizations
such as maps and network graphs provide useful overviews of large data sets without cluttering up the display with the
details of each data object. More detail can be presented when users zoom into interesting parts of the data using levels
of detail: as the user gets closer to the information of interest, more detail is added; when the user zooms out, the
details are hidden.
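The levels of detail behaviour can be illustrated with a small filter. The importance scores, the 1/zoom threshold, and the function name visible_labels are assumptions chosen for this sketch rather than a standard scheme.

```python
def visible_labels(items, zoom):
    """Return the names to draw at a given zoom factor (zoom >= 1).

    items is a list of (name, importance) pairs with importance in (0, 1].
    The assumed rule: an item is shown once its importance reaches 1/zoom,
    so zooming in progressively discloses less important items.
    """
    threshold = 1.0 / zoom
    return [name for name, importance in items if importance >= threshold]


places = [("Capital", 1.0), ("Town", 0.5), ("Village", 0.2)]

overview = visible_labels(places, zoom=1)   # only the most important item
closer = visible_labels(places, zoom=2)     # towns appear
detailed = visible_labels(places, zoom=8)   # villages appear as well
```

Zooming back out applies the same rule in reverse, so the added details are hidden again without any extra bookkeeping.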
Other styles of visualization do not enable this progressive disclosure of detail; visualizations that are suitable for
overviews may not be suitable for displaying more details when the user zooms in. For example, pie charts summarize
numerical information but do not naturally fit into the progressive disclosure with zooming model. In such cases,
different visualization styles need to be used to provide more detail. Several co-ordinated styles may also be used to
provide overview and detail simultaneously; for instance, a network graph might provide a global overview of the
departments in an organization that marks the user’s current position, and a tree might show a detailed view of the
hierarchical organization of the current department.
The style of visualization can also depend on the experience and abilities of the user. Users new to a subject may prefer
a general overview with gradual and progressive disclosure of the details; experts in the field may want to navigate as
quickly as possible to an area of interest and then interrogate it. The needs of disabled users can also restrict the
visualization styles that can be used. For example, partially sighted users may not be able to use visualizations with a
large number of densely clustered data points; they may prefer summaries and different exploration tools to discover
the details.
Visualization software needs to be able to represent data in as many different styles as possible and to enable users to
control which styles are used during a data exploration session. Visualization software should enable different
visualization styles to be combined to produce multiple co-ordinated views: a change in one view causes a
corresponding change in another view. Users should be able to select data from a visualization and generate a new
visualization of the selected data in the same style or a different style.
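One common way to obtain multiple co-ordinated views is to have every view share a single selection model, as in the sketch below. The class names SelectionModel and View and their methods are hypothetical; the views here only record what they would highlight instead of drawing anything.

```python
class SelectionModel:
    """Shared selection state that notifies every attached view."""

    def __init__(self):
        self._selected = frozenset()
        self._views = []

    def attach(self, view):
        self._views.append(view)

    def select(self, ids):
        self._selected = frozenset(ids)
        for view in self._views:
            view.selection_changed(self._selected)


class View:
    """A visualization style (scatter plot, tree, ...) bound to the model."""

    def __init__(self, name, model):
        self.name = name
        self.highlighted = frozenset()
        self._model = model
        model.attach(self)

    def brush(self, ids):
        # The user selects items in this view; the model propagates the change.
        self._model.select(ids)

    def selection_changed(self, selected):
        self.highlighted = selected


model = SelectionModel()
scatter = View("scatter plot", model)
tree = View("tree", model)

scatter.brush({1, 2})  # selecting in one view updates the other
```

Because views never talk to each other directly, a new visualization style can join the co-ordination simply by attaching to the model.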
4.3 Generic Visualization Tools
Many generic operations are commonly applied to visualizations. Shneiderman’s information seeking mantra suggests
four basic operations: requesting an overview, filtering out uninteresting items, zooming into items of interest, and
requesting more detailed information. This initial list can be extended to include a wide range of generic tools and
operations that are common to all visualizations: navigating around an information display, searching for specific values,
browsing for serendipitous discovery, partitioning the data into user-defined groups or groups suggested by the data,
viewing the relationships between items of data, and automatic data analysis, such as cluster and statistical analysis.
Maintaining a history of reversible actions is a central feature of graphical user interfaces and is also important for
visualization software. A history of reversible actions encourages experimentation because undesirable outcomes can be
undone. Preserving sequences of exploration and analysis enables them to be replayed to produce guided tours through
a data set. Users should also be able to extract subsets of the data in a visualization. Subsets may be selected by the user
or may be the results of an exploration or search. Users should be able to save and print subsets and import them into other
software.
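A history of reversible actions that can also be replayed as a guided tour might be sketched as follows. The sketch assumes that each action is a pure function from one exploration state to the next; the class name ExplorationHistory and its methods are hypothetical.

```python
class ExplorationHistory:
    """Keeps every state so actions can be undone and sequences replayed."""

    def __init__(self, initial_state):
        self._initial = initial_state
        self._states = [initial_state]
        self._actions = []  # (label, function) pairs in the order applied

    @property
    def state(self):
        return self._states[-1]

    def apply(self, label, action):
        """Apply an action (state -> new state) and record it."""
        self._states.append(action(self.state))
        self._actions.append((label, action))

    def undo(self):
        """Revert the most recent action; return False if there is none."""
        if not self._actions:
            return False
        self._states.pop()
        self._actions.pop()
        return True

    def replay(self):
        """Yield (label, state) for each recorded step, starting afresh."""
        state = self._initial
        for label, action in self._actions:
            state = action(state)
            yield label, state


history = ExplorationHistory(tuple(range(10)))
history.apply("keep even values", lambda s: tuple(x for x in s if x % 2 == 0))
history.apply("keep values above 4", lambda s: tuple(x for x in s if x > 4))
subset = history.state        # the extracted subset after both filters
tour = list(history.replay()) # the same steps as a guided tour
```

Storing whole states keeps undo trivial; for large data sets, storing inverse operations instead would use less memory.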
These generic visualization tools can be used with most visualization styles. Some tools will be better than others for
interacting with certain types of visualization and for completing certain types of tasks, but users should be able to
choose which tool to use.
Visualization software should provide these generic tools to produce a consistent and familiar set of operations that will,
once they have been learned, create a powerful visualization environment. All of the generic visualization and data
analysis tools need to be tightly integrated with the visualizations to produce an effective exploratory data analysis
environment.
4.4 Tailoring
Some visualization applications need specific data interrogation and analysis facilities that are not provided by the
generic tools listed above. Users should be able to use a single integrated package to perform specialist visualization
tasks as well as a variety of generic tasks. Visualization software should capitalize on the power of, and user familiarity
with, the generic tools and be flexible enough to incorporate additional application specific tools. Using task specific
visualization styles, generic visualization software can then be tailored to meet the needs of specialist visualization tasks.
Visualization software should provide an environment in which new tools are automatically integrated with existing
tools and are tightly integrated with the environment.
4.5 Integrated Visualization Software
Users need an integrated visualization application that meets their requirements for visualizing different types and
amounts of data in different visualization styles, that provides a generic set of visualization and data analysis tools, and
that can be tailored to specific visualization tasks.
An integrated visualization application that provides a variety of styles and tools has the important advantage of a single
interface for the user to learn. A single interface will reduce the time and effort required to learn how to use the
software.
5. Researcher Requirements
The goal of information visualization research is to develop rich visual interfaces to help users understand and navigate
through complex information spaces that are often abstract, non-spatial, and highly dimensional with no natural
physical mapping onto 2D or 3D spaces. Visualization researchers need to develop new visualization styles for presenting
information and new tools for manipulating and exploring them.
Effective visualization software combines imaging, graphics, visualization, and human computer interaction.
Visualization software is highly interactive and must undergo extensive usability evaluation to ensure that users are able
to focus on their tasks rather than on the software. Rapid prototyping—an iterative cycle of prototyping and user
evaluation—is an essential part of developing any usable and effective interactive system. Visualization software should
support rapid prototyping to encourage experimentation with new visual interaction styles and exploration tools that
will produce more usable and effective visualization software. Making rapid prototyping easier will broaden the range of
users and developers able to implement visualizations.
Implementing visualization designs requires specific skills; visualization designers may not have the necessary skills to be
able to prototype their designs. Visualization software should reduce the complexity of prototyping to broaden the
base of researchers that are able to prototype their designs. Even for skilled developers, lengthy development times and
complex implementations make rapid prototyping prohibitively expensive. Reducing the development time and
complexity will make extensive rapid prototyping easier.
Academic research tends to produce highly creative experimental prototypes. The prototype is the focus of the research
so complete and robust software is rarely produced. Visualization software that enables researchers to develop
complete and robust software from their prototypes will increase the potential for marketing the results of their
research. Software for enabling new visualization styles to be produced quickly and easily from prototypes will also
provide a competitive advantage for companies that develop visualizations.
6. Developer Requirements
Two of the main approaches to implementing visualizations are programming with graphics APIs and visualization
toolkits and describing visualizations with textual graphics description languages. Each of these approaches has its
advantages but they also have significant disadvantages. Visualization software that addresses the disadvantages will be
a significant improvement over the visualization software that is currently available.
6.1 Programming with Graphics APIs and Visualization Toolkits
Visualization software can be implemented with the graphics APIs of programming languages such as Java, C++, C#, and
Visual Basic. The advantage of developing a visualization from scratch is that developers have complete control over the
implementation. Visualization applications can be built exactly to the requirements and application specific
optimizations can be implemented. The main drawback is that implementing visualizations with graphics APIs is complex
and time consuming, which limits development to highly skilled developers. The complexity of the code and the time
required to implement it increase further when 3D, interactivity, real-time behaviors, and tightly integrated data
exploration and analysis tools are required.
Two- and three-dimensional scenegraph toolkits have been developed to ease the development of graphic scenes.
Scenegraph toolkits describe graphic scenes as a hierarchy of objects which simplifies the code and provides rendering
optimizations. Scenegraph toolkits provide a higher level of abstraction than graphics APIs and are particularly useful for
implementing zooming, animation, and levels of detail. Examples of scenegraph toolkits are Jazz/Piccolo (2D) and
OpenGL and Java3D (3D).
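The idea of describing a scene as a hierarchy of objects can be shown with a minimal 2D sketch. It is not the API of any of the toolkits named above: the Node class and the world_positions function are hypothetical, and only translation is modelled, where real scenegraphs also carry rotation, scaling, and rendering state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A scene object positioned relative to its parent."""

    name: str
    dx: float = 0.0
    dy: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def world_positions(node, origin_x=0.0, origin_y=0.0):
    """Walk the hierarchy, accumulating offsets from the root downwards."""
    x, y = origin_x + node.dx, origin_y + node.dy
    positions = [(node.name, x, y)]
    for child in node.children:
        positions.extend(world_positions(child, x, y))
    return positions


chart = Node("chart", 100, 50)
axis = chart.add(Node("axis", 10, 0))
axis.add(Node("tick", 0, 5))

layout = world_positions(chart)
```

Moving the chart node moves the axis and its tick with it, which is why a hierarchy simplifies zooming and animation compared with managing absolute co-ordinates by hand.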
Visualization toolkits have also been developed to help reduce the complexity and development time by providing
commonly used visualization components. These include 2D and 3D information displays such as scatter plots,
histograms, pie charts, and line graphs as well as sliders, radio buttons, and other interaction controls. These off the
shelf components are highly parameterized and can be easily integrated into bespoke software. Although visualization
toolkits make implementing existing visualization styles easier, they do not help to develop new styles. New styles must
be developed with graphics APIs and scenegraph toolkits.
The complexity of implementing visualizations can limit designs to those which the designer has the ability to
implement. Developers need experience with graphics APIs and scenegraph and visualization toolkits to be able to use
them effectively. Projects that require visualizations often do not have developers with the necessary skills or
experience to implement them. Reducing the complexity of producing visualizations will broaden the base of developers
that will be able to implement powerful visualization software.
6.2 Textual Graphics Description Languages
Visualizations are often implemented by describing them with textual graphics description languages such as the Virtual
Reality Modelling Language (VRML), the 3D Modelling Language (3DML), and Scalable Vector Graphics (SVG), an XML
application for describing graphic scenes. Description languages have a number of advantages over graphics APIs and
toolkits for implementing visualizations: rendering software is provided, they use higher level concepts, generating
visualizations is simple, and they have high compression ratios. The most significant advantage is that software to render
scenes is already provided. Developers do not need to write rendering code so development time is reduced
considerably. Rendering software, often called a viewer, is usually distributed as a web browser plug-in or as a stand
alone application.
Description languages often provide higher level concepts than graphics APIs that enable developers to focus on the
visualization rather than the implementation details of the graphics. VRML, for example, enables complex colouring and
lighting of 3D objects to be described without requiring the mathematical knowledge to be able to implement such
effects.
Visualizations implemented with description languages can be easily generated by creating a text file with a text editor
or as the output of a program. The level of programming ability required to implement a visualization with a description
language is significantly less than that required to implement a visualization with a graphics API or toolkit—compare the
code required to output a text file with the code required for a 3D interactive visualization. People with limited
programming skills are able to implement their own visualizations with description languages.
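To make the comparison concrete, the following sketch generates a bar chart as SVG text from a list of numbers. The function name bar_chart_svg and its layout parameters are illustrative choices; the sketch assumes a non-empty list with a positive maximum and leaves all rendering to whatever SVG viewer opens the result.

```python
def bar_chart_svg(values, bar_width=20, gap=5, height=100):
    """Return an SVG document, as text, with one <rect> per value.

    Assumes values is non-empty and its largest value is positive.
    """
    peak = max(values)
    width = len(values) * (bar_width + gap)
    lines = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for i, value in enumerate(values):
        bar_height = round(height * value / peak)
        x = i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points downwards
        lines.append(
            f'  <rect x="{x}" y="{y}" width="{bar_width}" height="{bar_height}"/>'
        )
    lines.append("</svg>")
    return "\n".join(lines)


svg_text = bar_chart_svg([1, 2, 4])
# Writing svg_text to a file such as "chart.svg" yields a viewable chart.
```

The whole program is string formatting. Note that the generated file contains only rectangles; the original values 1, 2, and 4 cannot be recovered exactly from it.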
Description languages have high compression ratios that enable large visualizations to be archived, shared among
users, and sent over networks such as the Internet.
Description languages also have several significant drawbacks for implementing visualizations: a lack of visualization
concepts, no preservation of data, a lack of flexibility, limited language extensions, and no support environment.
Although graphics description languages often provide higher level concepts than graphics APIs, they are not designed
for implementing visualizations. They are only able to express a limited number of the concepts that are present in
visualizations. High level visualization concepts include layout algorithms, links between objects rather than co-ordinates, rule based behaviors, and real-time updates of visual presentations based on live data.
Visualizations created with description languages do not preserve the data that is visualized. The data is embodied in the
visual presentation but the data itself is not contained in the textual description. This has two important implications for
users. First, new visualizations cannot be generated from an existing visualization. For example, users may want to select
interesting data points in a visualization and summarize them by producing a new visualization of the selections. Second,
the data is not available for further analysis. Users are unable to verify that the visualization accurately describes the
data or whether it hides aspects of the data, as can happen with animated guided tours through the data.
Visualization development with description languages is less flexible than programming with graphics APIs and
scenegraph toolkits. Developers are constrained by what the language is able to describe, and by what APIs the
rendering software exposes to enable it to be integrated into other software.
Textual graphics description languages are often adopted as standards and tightly controlled by the standards body so
they cannot be extended. This eliminates incompatibilities caused by non-standard features not being available in all
viewers. Prohibiting extensions ensures that visualizations written in a description language will be rendered correctly by
any viewer that renders the language standard. Language extensions are often desirable and there is a dichotomy
between inflexible standardized languages and flexible languages with non-standard features that inhibit compatibility.
A middle ground is needed where description languages can be extended with a well specified extension mechanism.
Such a mechanism must be able to describe the extensions to the language, the parser, and to the rendering software. A
graceful degradation system is also required to enable viewing software without an extension to be able to render a
visualization.
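Graceful degradation could work along the lines of the sketch below, in which each extension element names a standard element to draw when the viewer lacks the extension. The element kinds, the fallback key, and the render function are hypothetical; no existing description language is claimed to work this way.

```python
# Elements every viewer of the (hypothetical) standard language can draw.
STANDARD = {"rect", "circle", "text"}


def render(elements, supported_extensions=frozenset()):
    """Return the kinds actually drawn, degrading unknown extensions.

    Each element is a dict with a "kind" and, for extension elements,
    an optional "fallback" naming a standard kind to draw instead.
    """
    drawn = []
    for element in elements:
        kind = element["kind"]
        if kind in STANDARD or kind in supported_extensions:
            drawn.append(kind)
        elif element.get("fallback") in STANDARD:
            # Viewer lacks the extension: draw the declared fallback.
            drawn.append(element["fallback"])
        # Otherwise the element is skipped so the rest still renders.
    return drawn


scene = [
    {"kind": "rect"},
    {"kind": "treemap", "fallback": "rect"},  # extension with a fallback
    {"kind": "hologram"},                     # extension without one
]

basic_viewer = render(scene)
extended_viewer = render(scene, supported_extensions={"treemap"})
```

A viewer without the extension still produces a usable picture, while a viewer with it renders the richer element.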
The final disadvantage of description languages is that they enable visualizations to be described, but they do not
support visualization. A supportive visualization environment should provide tightly integrated tools to navigate,
explore, and analyze the information provided by a visualization. Although description language viewers such as the
VRML viewer can render 3D scenes with complex lighting and shading, they are simply presentation tools. A limited set
of navigation tools is usually provided but there are no tools for searching, exploring, or analyzing the information
because description language viewers are not designed for this purpose.
Viewers can be extended to provide extra facilities to support visualization, but these facilities must be explicitly
programmed. Description language viewers for VRML, for example, expose APIs that enable them to be integrated into a
visualization environment, but integration is cumbersome because the APIs were an afterthought: VRML was not originally
designed to be programmatically controlled, so the concepts in the language do not facilitate easy integration with other
software. Implementing extra facilities demands the same skills and experience needed to implement visualizations with
graphics APIs and toolkits.
7. Summary of Requirements
The following points summarize the requirements for visualization software discussed in this article.
Hardware Requirements
Visualization software must be able to run on a range of hardware platforms that includes workstations, PCs/laptops,
PDAs and mobile phones.
Software Requirements
Visualization software must be network-capable: it must be able to visualize networked information and to update that
information in real time.
Visualization software must expose APIs that enable developers to programmatically control all aspects of a
visualization, to capture high level visualization events such as data selection, and to capture low level user interaction
events such as mouse movements.
Visualizations must be able to scale up to large, complex 3D scientific visualizations and to scale down to simple 2D
information displays on resource-limited mobile devices.
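The API requirement above can be illustrated with a small sketch. It assumes a hypothetical Visualization class with
an on() method for registering callbacks; the event names "mouse_move" and "data_selected" are likewise assumptions
made for illustration, not part of any existing toolkit. The point is that the same mechanism delivers both low-level
interaction events and high-level visualization events.

```python
from collections import defaultdict


class Visualization:
    """Hypothetical visualization object exposing an event API."""

    def __init__(self, items):
        self.items = list(items)
        self.selected = []
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a callback for a named event."""
        self._handlers[event].append(handler)

    def _emit(self, event, payload):
        # Call every handler registered for this event, in registration order.
        for handler in self._handlers[event]:
            handler(payload)

    def mouse_move(self, x, y):
        # Low-level user interaction event.
        self._emit("mouse_move", {"x": x, "y": y})

    def select(self, predicate):
        # High-level event: reports which data items were selected.
        self.selected = [item for item in self.items if predicate(item)]
        self._emit("data_selected", {"items": self.selected})


log = []
vis = Visualization([3, 8, 15])
vis.on("mouse_move", lambda e: log.append(("move", e["x"], e["y"])))
vis.on("data_selected", lambda e: log.append(("selected", e["items"])))

vis.mouse_move(10, 20)
vis.select(lambda value: value > 5)
print(log)
```

A host application that only cares about data selection can ignore the mouse events entirely, which is one reason to
expose both levels rather than only raw input.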
User Requirements
Visualizations must represent all of the most common types of information: one-, two- and three-dimensional;
temporal; multi-dimensional; tree and network data.
Visualizations must represent large and small amounts of homogeneous and heterogeneous data in a wide variety of
styles.
Users must be able to control which visualization styles are used during a data exploration session.
Users should be able to use a range of different visualization styles in a single visualization package rather than be forced
to use several packages.
Visualization software must provide a generic set of tools that can be applied to a wide variety of visualizations:
requesting an overview, filtering out uninteresting items, zooming in on items of interest, and requesting more detailed
information.
Generic visualization software must be able to incorporate application-specific tools to tailor it to specific visualization
applications.
Users must be able to select data from one visualization style and generate a new visualization of the selected data in
the same style or a different style.
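The generic tools and the selection requirement listed above might be sketched as follows. The View and Item classes,
their method names, and the style names "scatter" and "treemap" are hypothetical and chosen only for illustration; a
style is represented here as a plain string rather than real rendering code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    value: float


class View:
    """Hypothetical view: a data set plus the name of a visualization style."""

    def __init__(self, items, style):
        self.items = list(items)
        self.style = style

    def overview(self):
        # Summarize the whole data set.
        values = [item.value for item in self.items]
        return {
            "count": len(values),
            "min": min(values, default=None),
            "max": max(values, default=None),
        }

    def filter(self, keep):
        # Drop uninteresting items; the style is unchanged.
        return View([i for i in self.items if keep(i)], self.style)

    def zoom(self, low, high):
        # Restrict the view to a value range of interest.
        return self.filter(lambda i: low <= i.value <= high)

    def details(self, name):
        # Details on demand for a single item (None if it is not in the view).
        return next((i for i in self.items if i.name == name), None)

    def revisualize(self, style):
        # Show the same selected data in the same or a different style.
        return View(self.items, style)


data = [Item("a", 1.0), Item("b", 4.0), Item("c", 9.0)]
scatter = View(data, "scatter")
selection = scatter.zoom(2.0, 10.0)        # select a subset of the data
tree = selection.revisualize("treemap")    # new visualization of the selection
print(scatter.overview(), [i.name for i in tree.items], tree.style)
```

Because each operation returns a new View, the original visualization stays intact while the user explores the
selected subset in another style.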
Researcher Requirements
Visualization software must support rapid prototyping to encourage experimentation and to enable new ideas and
designs to be tested.
Visualization software must reduce the complexity of prototyping. This will broaden the range of researchers who are
able to prototype their visualization designs and will make extensive rapid prototyping a realistic proposition.
Visualization software must enable complete and robust commercial visualization software to be developed from
academic prototypes.
Visualization software must enable new visualization styles to be produced quickly and easily, which will provide a
competitive advantage for companies that develop visualization software.
Developer Requirements
Visualization software must reduce the programming experience required to implement visualizations by providing
powerful rendering software that implements high-level visualization concepts.
Visualization software must reduce the time and complexity required to implement visualizations, which will broaden the
range of developers who are able to implement powerful visualization software.
References
Keim, Daniel A., Visual exploration of large data sets, Communications of the ACM, 44(8) 2001.
Shneiderman, Ben, The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations, Proceedings of Visual
Languages 1996.