
Visualizing the Non-Visual
Spatial Analysis and Interaction with
Information from Text Documents
Wise, Thomas, Pennock, Lantrip, Pottier, Schur, and Crow
Presented By: Cyntrica Eaton
Presentation Overview

Paper Description

Contributions

Current State

Critique

References
Paper Description

Motivation

Approach

Visualization Paradigms


Galaxies
Themescapes
MVAB

Multidimensional Visualization and
Advanced Browsing Project

Researchers at the Pacific Northwest
National Laboratory were interested in
solving the problem of information
overload for intelligence analysts.
Motivation

Modern information technologies have contributed to an
increased availability of information.

The increasing quantity of available information is
accompanied by a correspondingly decreasing amount of time
to locate and absorb it.

The ability to overview large document corpora and get
information without the heavy cognitive processes
involved in language processing will improve the search
process.
Approach

The problem of processing large amounts of text can
be solved if text is spatialized in a manner that
takes advantage of human perceptual abilities.

Visual processing takes place in parallel at the
retinal level and is:

Relatively effortless
Exceptionally fast
Not additive to cognitive workload
Approach
Transform text into visualizations that:

Communicate through images instead of prose.

Preserve information characteristics from documents.

Represent textual content and meaning without the
need to read it in the normal manner.

Reveal thematic patterns and relationships between
documents in ways similar to how the natural world is
perceived.
SPIRE

Spatial Paradigm for Information Retrieval
and Exploration

Developed to facilitate the browsing and
selection of documents from large corpora

Two major approaches:


Galaxies
Themescapes
Galaxies and Themescapes
Display metaphor rationale:

Each paradigm offers a rich variety of
cognitive spatial affordances that naturally
address the problems of text visualization.

Spatial perceptual mechanisms that operate
on the real world will respond analogously to
synthetic cues.
Paradigm Overviews

Galaxies


Point clusters suggest patterns of interest
Themescapes

Topographies of peaks and valleys that can
easily be detected based on contour patterns.
Paradigm Overviews

Both allow for overview + detail without a
change of view.

Each view offers a different perspective of
the same information.
Galaxies

Two-dimensional scatterplot of ‘docupoints’ that
appear like stars in the night sky.

Computes word similarities and patterns in
documents and communicates similarity via
proximity.

Provides a first cut at sifting through information
and determining how the contents of a document
base are related.
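
The idea of placing similar documents near one another can be sketched as follows. This is a minimal illustration, not the method used in SPIRE: it assumes plain term-frequency vectors and a projection onto the first two principal directions (found by power iteration). The names `term_vectors`, `principal_direction`, and `galaxy_layout` are hypothetical.

```python
import math
from collections import Counter

def term_vectors(docs):
    """Turn each document into a term-frequency vector over a shared vocabulary."""
    counts = [Counter(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*counts))
    return [[float(c[w]) for w in vocab] for c in counts]

def center(rows):
    """Subtract the per-term mean so the projection captures differences."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_direction(rows, iterations=100):
    """Power iteration for the dominant direction of variation in the rows."""
    dim = len(rows[0])
    v = [float(i + 1) for i in range(dim)]  # deterministic, non-uniform start
    start_norm = math.sqrt(dot(v, v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:  # no variation left to capture
            break
        v = [x / norm for x in w]
    return v

def galaxy_layout(docs):
    """Return one (x, y) 'docupoint' per document."""
    rows = center(term_vectors(docs))
    v1 = principal_direction(rows)
    xs = [dot(row, v1) for row in rows]
    # Remove the first direction, then find the next strongest one.
    residual = [[x - s * a for x, a in zip(row, v1)] for row, s in zip(rows, xs)]
    v2 = principal_direction(residual)
    ys = [dot(row, v2) for row in residual]
    return list(zip(xs, ys))

docs = [
    "heart disease treatment",
    "heart disease therapy",
    "stock market prices",
    "stock market trading",
]
for doc, (x, y) in zip(docs, galaxy_layout(docs)):
    print(f"{x:7.3f} {y:7.3f}  {doc}")
```

In this toy corpus the two medical documents land close together and far from the two financial ones, which is the proximity cue the Galaxies view relies on.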
[Figure: labels Types, Treatment, Case Studies]
Themescapes

Three-dimensional relief map of themes within
the document corpus.

Complex surfaces convey information about
topics or themes found within the corpus without
the cognitive load of reading.

Terrain simultaneously communicates:



Primary themes of an arbitrarily large collection of
documents.
Measure of relevance in the corpus.
Similarity of themes.
Themescapes
Glance provides visual thematic summary
of the entire corpus

Elevation: Theme strength

Shapes: Information distribution

Proximity: Content Similarity
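
The mapping from theme strength to elevation can be sketched as a height field. This is an illustrative assumption, not the Themescape algorithm itself: each document contributes a Gaussian bump at its 2D position, scaled by a per-document theme strength, so dense or strong regions rise into peaks. The names `height_field` and `highest_cell`, the grid size, and the `sigma` smoothing width are all hypothetical choices.

```python
import math

def height_field(points, strengths, size=21, sigma=0.1):
    """Sum one Gaussian bump per document onto a size x size grid over [0, 1]^2."""
    grid = [[0.0] * size for _ in range(size)]
    for (px, py), strength in zip(points, strengths):
        for row in range(size):
            gy = row / (size - 1)
            for col in range(size):
                gx = col / (size - 1)
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                grid[row][col] += strength * math.exp(-d2 / (2 * sigma ** 2))
    return grid

def highest_cell(grid):
    """Return (row, col) of the tallest grid cell, i.e. the dominant peak."""
    _, row, col = max(
        (value, r, c)
        for r, cells in enumerate(grid)
        for c, value in enumerate(cells)
    )
    return row, col

# Two nearby documents and one isolated document, all with equal strength.
points = [(0.20, 0.20), (0.25, 0.20), (0.80, 0.80)]
strengths = [1.0, 1.0, 1.0]
grid = height_field(points, strengths)
print(highest_cell(grid))
```

Because the two nearby documents reinforce each other, the tallest peak forms over their shared region, while the isolated document produces a lower hill.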
Themescapes

Utilizes human abilities for pattern
recognition and spatial reasoning

Employs communicative invariance across
levels of textual scale
Entire document corpus
 Cluster of documents
 Individual documents

Summarization

Reading is a slow, serial process of
mentally encoding a document.

Text visualizations can overcome many of
the user limitations that result from
accessing and trying to read from large
document bases.
Summarization

Visual cues can offer readers a way to
employ their primarily preattentive, parallel
processing powers of visual perception.

Galaxy and landscape metaphors allow the
cognitive and visual processes that enable
our spatial interactions with the natural
world to be applied to the search process.
Contributions

Prior visualization approaches offered methods
for visualization of structured, hierarchical text.

Free text visualization was relatively
unexamined.

MVAB Project produced novel methods for
interaction with large amounts of text.
Current Project Status

Correlation Tool

WebTheme

ThemeRiver

Rainbow
[Figure: labels Love, Tybalt, Romeo, Caesar]
Critique

The visualization paradigms were
discussed in a straightforward manner.

However, the example figures were not
adequately explained.
My Favorite Sentence

[The] perceptual processes involved are the
results of millions of years of selective
mammalian and primate evolution, and
have become biologically tuned to seeing
in the natural world.
References

Information Retrieval

Information Visualization
Visualizing the Non-Visual
Spatial Analysis and Interaction with
Information from Text Documents
Questions?
Technical Considerations




Clear definition of text.
A way to transform text into a different visual form
that retains the high-dimensional invariants of natural
language.
Suitable mathematical procedures and analytical
measures must be defined as the foundation of the
visualizations.
A database management system must be designed
to store and manage text.
Technical Considerations

A way to transform text into a different visual form
that retains the high-dimensional invariants of natural
language.

Text has statistical and semantic attributes, such as
the frequency, context, and combination of words in
themes and topics.
Differences between texts' statistical and semantic
compositions provide much of the opportunity for the text
visualizations described in this paper.
Approach


A set of measures that characterize the text in
meaningful ways provides multiple perspectives
on documents and their relationships to one
another.
One measure is similarity.

Based on the occurrences and context of keywords or
other extracted features, measures of similarity can be
computed that reflect relatedness between
documents.
In a visualization, similarity can be shown as
proximity or congruity of form.
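
One common way to compute such a similarity measure is the cosine of the angle between keyword-count vectors. The slides do not say which measure the authors used, so cosine similarity is an assumption here, and the helper names `term_counts` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def term_counts(text):
    """Count keyword occurrences; whitespace tokenization is a simplification."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (1.0 = same direction)."""
    shared = set(a) & set(b)
    numerator = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as unrelated to everything
    return numerator / (norm_a * norm_b)

doc_a = term_counts("heart disease treatment")
doc_b = term_counts("heart disease therapy")
doc_c = term_counts("stock market prices")

print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```

Documents that share more keywords score closer to 1 and would be drawn nearer to each other; documents with no keywords in common score 0.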