Space & Order
(1)
Jing Li
2003.1.27
Topics

The Visual Design and Control of Trellis Display,
R. A. Becker, W. S. Cleveland, and
M. J. Shyu (1996).
Source: http://cm.bell-labs.com/stat/doc/trellis.jcgs.col.ps

VisDB: Database Exploration using
Multidimensional Visualization, Daniel A.
Keim and Hans-Peter Kriegel, IEEE
CG&A, 1994
Source: http://www.dbs.informatik.uni-muenchen.de/dbs/projekt/papers/visdb.ps
The Visual Design and Control
of Trellis Display
A framework for the visualization of
multivariable data
Introduction
Trellis Basics:
 A three-way rectangular array of panels with
columns, rows, and pages
 Panel Variables and Conditioning Variables
 Strip labels at the top of each panel with a
dark bar indicating the value of the variable
 Packet: info sent to each panel, including the
values of the panel variables to be graphed
on the panel.
Introduction
Display method: a technique used to uncover the
structure of data (e.g. a dot plot, a
scatter plot, a box plot…)
 Control method: a technique for
specifying info (e.g. layout and the assignment
of packets to panels) so that a Trellis display can
be drawn.
 But the precise boundary between them
is sometimes fuzzy.

Figure 1. A dotplot of the barley data showing yield against variety given year and site
Main-Effects Ordering

Order the levels (unique values) of a categorical
variable, such as variety, by the median of the response
 Allows the user to discover anomalous
behavior
 But better to use the natural order of the
variable if a categorical variable is naturally
ordered and there are more than two levels.
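Main-effects ordering can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `order_levels_by_median` and the sample values are made up for the example (they are not the real barley measurements).

```python
from statistics import median

def order_levels_by_median(records, factor, response):
    """Return the levels of `factor`, sorted by the median of `response`."""
    groups = {}
    for row in records:
        groups.setdefault(row[factor], []).append(row[response])
    # Levels with a lower median response come first.
    return sorted(groups, key=lambda level: median(groups[level]))

# Made-up values for illustration only.
rows = [
    {"variety": "A", "yield": 30.0},
    {"variety": "A", "yield": 34.0},
    {"variety": "B", "yield": 22.0},
    {"variety": "B", "yield": 26.0},
    {"variety": "C", "yield": 40.0},
    {"variety": "C", "yield": 28.0},
]

print(order_levels_by_median(rows, "variety", "yield"))  # ['B', 'A', 'C']
```

The same function could be applied to the conditioning variable to order the panels themselves.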
Figure 2. A dotplot of the barley data showing yield against site and year given variety
Multiple Conditionings
From figure 1, how can we compare the
six values of yield for each combination
of variety and year?
 Need another Trellis display. The
dependence changes as the value of
the conditioning variables change.
 Make multiple Trellis displays so that
each explanatory variable appears at
least once as a panel variable.

Figure 4. Yield against site given variety and year
Partial Residuals

Take the mean from all the measurements in
each panel
 Subtract the mean from each measurement
 Graph the residuals as the response by
Trellis display
 Partial residuals plots allow subtler effects to
emerge by removing gross main effects.
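The three steps above (panel mean, subtract, graph the residuals) can be sketched as follows. The function name `panel_residuals` and the sample numbers are assumptions for illustration; only the residual computation is shown, not the plotting.

```python
from collections import defaultdict
from statistics import mean

def panel_residuals(records, panel_keys, response):
    """Subtract each panel's mean response from the measurements in that panel."""
    groups = defaultdict(list)
    for row in records:
        groups[tuple(row[k] for k in panel_keys)].append(row[response])
    panel_mean = {key: mean(values) for key, values in groups.items()}
    result = []
    for row in records:
        key = tuple(row[k] for k in panel_keys)
        # Copy the row and attach the residual as the new response.
        result.append({**row, "residual": row[response] - panel_mean[key]})
    return result

# Made-up values for illustration only.
sample = [
    {"site": "X", "yield": 10.0},
    {"site": "X", "yield": 14.0},
    {"site": "Y", "yield": 5.0},
    {"site": "Y", "yield": 7.0},
    {"site": "Y", "yield": 9.0},
]

for row in panel_residuals(sample, ["site"], "yield"):
    print(row["site"], row["residual"])
```

The `residual` values would then be passed to the Trellis display in place of the raw response.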
Figure 5. Differences of barley yield against variety given site
Trellising Mechanism





Dimensions: columns, rows and pages
Order for conditioning variables and order for
the levels of each conditioning variable
Packet Order: the levels of the first
conditioning variable vary the fastest…
Panel Order: start at the bottom left panel of the first
page, then move through columns, rows, and pages
Packet assignments to Panels: match the
packet order and the panel order
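The matching of packet order to panel order can be sketched as below. This is a hedged illustration of the mechanism described on the slide, not the S-PLUS code; the function names and the site labels `s1` to `s3` are placeholders.

```python
from itertools import product

def packet_order(levels_by_variable):
    """List packets so that the levels of the FIRST conditioning variable vary fastest."""
    # itertools.product varies its last argument fastest, so reverse on the
    # way in and reverse each combination on the way out.
    combos = product(*reversed(levels_by_variable))
    return [tuple(reversed(combo)) for combo in combos]

def panel_order(columns, rows, pages):
    """List panels as (column, row, page), 1-based, with row 1 at the bottom."""
    return [
        (c, r, p)
        for p in range(1, pages + 1)
        for r in range(1, rows + 1)
        for c in range(1, columns + 1)
    ]

def assign_packets(levels_by_variable, dimensions):
    """Pair the i-th packet with the i-th panel."""
    return dict(zip(panel_order(*dimensions), packet_order(levels_by_variable)))

# Two years and three placeholder sites on a (2, 3, 1) trellis.
layout = assign_packets([["1931", "1932"], ["s1", "s2", "s3"]], (2, 3, 1))
for panel, packet in layout.items():
    print(panel, packet)
```

With dimensions (2, 3, 1) the bottom row holds both years for `s1`; changing the dimensions to (3, 2, 1) reflows the same packet sequence without touching the data.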
Trellising

Different Trellising
Dimension (2, 6, 1)  Dimension (6, 2, 1)
 Flexible Trellising
The numbers of levels of the conditioning
variables and the trellis dimensions are
independent
 Breaking: Enhance our perception
 Skipping: Assign packets with an irregular
structure to the rectangular trellis. If the
skip sequence specified is shorter than the number
of panels, the sequence is repeated
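Skipping can be sketched as a list of true/false flags that is walked alongside the panel order. The sketch assumes that a skip sequence shorter than the number of panels is recycled; the name `assign_with_skip` and the sample packets are hypothetical.

```python
from itertools import cycle

def assign_with_skip(panels, packets, skip):
    """Fill panels in order, leaving a panel empty wherever its skip flag is True."""
    remaining = iter(packets)
    layout = {}
    # cycle() recycles the skip flags when there are fewer flags than panels.
    for panel, skipped in zip(panels, cycle(skip)):
        layout[panel] = None if skipped else next(remaining, None)
    return layout

panels = [1, 2, 3, 4, 5, 6]
packets = ["a", "b", "c", "d"]
print(assign_with_skip(panels, packets, [False, False, True]))
# {1: 'a', 2: 'b', 3: None, 4: 'c', 5: 'd', 6: None}
```

Here every third panel is left blank, so four packets occupy a six-panel grid.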
Conditioning on A Numeric
Variable of Discrete Values

Response:
F -- the operating temperature of the fuse
 Variables:
A – the ambient temperature (75°, 110°)
S – the start condition of the fuse in a run
(cold or hot)
V – the voltage (110V, 120V, 126V)
Figure 6. Fuse temperature vs. Partial residual fuse temperature against
voltage given start and ambient temperature
Conditioning on Intervals

Shingle: The intervals for a numerical variable
together with the measured values of the
variable. The intervals often overlap.
 Equal Count Algorithm: Choose the number
of intervals and the percentage of overlap.
The endpoints are chosen to make the
number of points in the intervals nearly equal
while maintaining the percentage of points
shared by successive intervals as close to the
target percentage as possible.
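A rough sketch of an equal-count computation follows. The index formula is an assumption chosen for illustration (each interval holds about r = n / (k(1 - f) + f) points for k intervals and overlap fraction f); it is not necessarily the exact rule used in the paper or in S-PLUS.

```python
def equal_count_intervals(values, number, overlap):
    """Return `number` (low, high) intervals with roughly equal counts.

    `overlap` is the target fraction of points shared by successive intervals.
    """
    xs = sorted(values)
    n = len(xs)
    # Points per interval so that the overlapping intervals span all n points.
    per_interval = n / (number * (1 - overlap) + overlap)
    intervals = []
    for i in range(number):
        start = i * (1 - overlap) * per_interval
        low_index = min(n - 1, max(0, round(start)))
        high_index = min(n - 1, max(0, round(start + per_interval) - 1))
        intervals.append((xs[low_index], xs[high_index]))
    return intervals

print(equal_count_intervals(range(1, 13), number=3, overlap=0.5))
# [(1, 6), (4, 9), (7, 12)]
```

In this example each interval covers six of the twelve values, and successive intervals share three of them, which matches the 50% target.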
Equal Count Illustrated
Banking to 45°
Principle: Orientations of line segments
are most accurately judged when the
absolute slopes are centered on 45°
 Choose the right aspect ratio, the height
of the data region of the graph divided
by the width.
 Example: Sunspot cycles

Figure 7. Sunspot numbers vs. year
(source: http://www.research.att.com/~rab/trellis/sunspot.html)
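One simple way to bank to 45° is to pick the aspect ratio that makes the median absolute slope of the plotted segments equal to 1. The sketch below uses that median rule as a stand-in; the paper may use a more refined criterion, and the function name `bank_to_45` is hypothetical.

```python
from statistics import median

def bank_to_45(xs, ys):
    """Return an aspect ratio (height / width) that centers segment slopes on 45 degrees."""
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    slopes = []
    for i in range(len(xs) - 1):
        # Slopes are measured in units of the data rectangle, not data units.
        dx = (xs[i + 1] - xs[i]) / x_range
        dy = (ys[i + 1] - ys[i]) / y_range
        if dx != 0 and dy != 0:  # skip flat and vertical segments
            slopes.append(abs(dy / dx))
    # On screen the slope is aspect * (dy / dx), so aspect = 1 / median slope
    # puts the median segment at 45 degrees.
    return 1.0 / median(slopes)

# A steep zig-zag: banking asks for a plot four times wider than tall.
print(bank_to_45([0, 1, 2, 3, 4], [0, 4, 0, 4, 0]))  # 0.25
```

For rapidly oscillating series such as sunspot cycles, a rule like this yields a short, wide plot in which the rise and fall of each cycle can be compared.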
High-Level Design for Software

The trellising mechanism:
The conceptual framework as well as the
control mechanism for users
 Conditioning variables use appropriate data
structure:
Category for categorical variables;
Shingle for numerical variables, etc.
 Program a panel function instead of a high-level routine
Trellis Display Summary
Brings substantial generality to multipanel display as an overall framework
 Can be scatter plots, dot plots, curve
plots, wireframes, etc.
 The use of strip labels to make panels
self-contained
 Implementation: The S-PLUS system
for graphics and data analysis

VisDB: Database Exploration Using
Multidimensional Visualization
A tool to support
Exploration of large databases
By using
Human Visual System
To analyze large databases
Reasons
Scientific and Geographic databases tend to
have large amounts of data.

Some of the challenges in dealing with these
databases are:
– Mining these databases for useful information is a
difficult task due to the sheer volume of data
Reasons
– Users do not know what they are looking
for exactly.
– With traditional query specification
languages, it is not possible to specify
vague queries and thus not possible to get
approximate results.
– There is no feedback. Result set may
contain too few or too many points.
Requirements
Requirements for a good Visualization
System to explore large databases:



Flexible Query Specification
Good Query Feedback
Interactive system
Requirements
Also, the users should be able to view
as many data points as possible to see
the patterns and clusters.
 Necessary to display the
interdependencies between data
attributes and hotspots (anomalies).

VisDB Concept
The basic idea for visualizing the data is
to map the distances to colors and
represent each data item resulting from
a query by one or multiple colored
pixels.
 The goal of the VisDB system is to
address the tasks of visualization of the
results and to provide an effective way
of incrementally refining the query to
find interesting data properties.

Features
More feedback on the results of the
queries provided
 Interactivity allows immediate feedback
from a modified query
 Configurable tool, that allows various
forms of data visualization techniques
 Using the human vision system for
pattern recognition

Approach

Use each pixel of the screen to visualize
the results.


Display size and resolution are limiting factors
Provide data items that not only fulfill the
query exactly, but also those that match
approximately.
Approach
Approximate results are determined by
a relevance factor.
 The relevance factor of a data item is
obtained by calculating distances for
each selection predicate and combining
them.
 The smaller the combined distance, the
higher the relevance factor of the data
point.
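A minimal sketch of the relevance computation, under stated assumptions: each selection predicate is a numeric range, per-predicate distances are normalized to [0, 1], and they are combined by a plain average. The VisDB paper describes its own normalization and weighting; the names below are hypothetical.

```python
def predicate_distance(value, low, high):
    """Distance from `value` to the range [low, high]; 0 when the predicate is met."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0

def relevance_factors(items, predicates):
    """items: list of dicts; predicates: {attribute: (low, high)}."""
    combined = [0.0] * len(items)
    for attribute, (low, high) in predicates.items():
        distances = [predicate_distance(item[attribute], low, high) for item in items]
        largest = max(distances) or 1.0  # avoid dividing by zero
        for i, d in enumerate(distances):
            # Normalize per predicate, then average across predicates.
            combined[i] += d / largest / len(predicates)
    # Smaller combined distance means higher relevance.
    return [1.0 - c for c in combined]

items = [
    {"temp": 20, "humidity": 50},  # satisfies both predicates
    {"temp": 30, "humidity": 50},  # misses one predicate slightly
    {"temp": 40, "humidity": 70},  # misses both
]
print(relevance_factors(items, {"temp": (15, 25), "humidity": (40, 60)}))
```

An exact match gets relevance 1.0, and the item farthest from every predicate gets 0.0.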

Basic Technique
Sort the query results by relevance, and
map relevance factors to colors
 Highest relevance factor in the center
 Yellow-Green-Blue-Red-Black in
decreasing order of relevance.
 Plot the sorted, colored points starting
from the center of the screen moving
outwards in a rectangular spiral fashion.
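The sort, color, and spiral steps above can be sketched as follows. The five named colors stand in for what is presumably a continuous color scale in VisDB, and the function names are hypothetical.

```python
COLOR_SCALE = ["yellow", "green", "blue", "red", "black"]  # high to low relevance

def rectangular_spiral(count):
    """Return `count` (x, y) offsets spiralling outward from the center (0, 0)."""
    coords = []
    x = y = 0
    dx, dy = 1, 0
    leg_length = 1
    while True:
        for _ in range(2):  # two legs share each leg length
            for _ in range(leg_length):
                if len(coords) == count:
                    return coords
                coords.append((x, y))
                x, y = x + dx, y + dy
            dx, dy = -dy, dx  # turn 90 degrees
        leg_length += 1

def color_for(relevance):
    """Map a relevance in [0, 1] to one of the five colors (1 is yellow, 0 is black)."""
    index = min(int((1.0 - relevance) * len(COLOR_SCALE)), len(COLOR_SCALE) - 1)
    return COLOR_SCALE[index]

def spiral_layout(relevances):
    """Place the most relevant item at the center and the rest outward."""
    ranked = sorted(relevances, reverse=True)
    return [(pos, color_for(r)) for pos, r in zip(rectangular_spiral(len(ranked)), ranked)]

print(spiral_layout([0.0, 1.0, 0.5]))
# [((0, 0), 'yellow'), ((1, 0), 'blue'), ((1, 1), 'black')]
```

Reusing the same positions for the per-predicate windows, with colors taken from each predicate's own distance, would give the linked views described on the next slide.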

Overall Result Plot
Figure 8. Spiral Shaped Arrangement of One Dimension
Basic Technique
 To relate the visualization of the overall
result to the visualization of different
selection predicates, separate windows
for each selected predicate of the query
are created and shown along with the
result window.
 The position of the data items in all the
other windows is determined by their
position in the overall result window.
Arrangement of Windows for 5D Data
Figure 9. Arrangement of Windows for Displaying Five-Dimensional Data
Mapping 2D To The Axes
Visualization of inherently 2D or 3D data
is not handled in VisDB
 Use of two axes for two dimensions and

arrange the relevance factors according to
the direction of the distance, so that both positive and
negative distances are displayed.
 Some space may be wasted (e.g. some
quadrants may be almost empty, while others
are saturated)
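The 2D arrangement can be sketched by grouping items into quadrants according to the signs of their two distances. The convention used here (positive to the right and top, ordering within a quadrant by the sum of absolute distances) is an assumption for illustration, as is the name `quadrant_groups`.

```python
from collections import defaultdict

def quadrant_groups(items):
    """items: (name, dx, dy) tuples with signed distances for the two axis dimensions."""
    groups = defaultdict(list)
    for name, dx, dy in items:
        quadrant = ("right" if dx >= 0 else "left", "top" if dy >= 0 else "bottom")
        groups[quadrant].append((abs(dx) + abs(dy), name))
    # Within a quadrant, items closest to the query sit nearest the center.
    return {q: [name for _, name in sorted(members)] for q, members in groups.items()}

items = [("a", 1, 2), ("b", -1, 3), ("c", 0.5, 0.5), ("d", 2, -1)]
print(quadrant_groups(items))
```

In this sample no item lands in the bottom-left quadrant, which is the wasted-space effect noted on the slide.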
2D Arrangement
Figure 10. 2D-Arrangement of One Dimension
Grouping the Dimensions
The pixels corresponding to the different
dimensions of one data item are placed in
one area instead of distributing them in
different windows
 Coloring is similar to the previous method
 Requires more pixels per dimension per data
item. Data in multiple dimensions are
represented as clusters of pixels
 Useful for data sets with larger
dimensionality

Grouping Multidimensional Data
Figure 11. Grouping Arrangement for Five-Dimensional Data
Interactive Data Exploration

Dynamic Query Modification Techniques
 Feedback on the results
– Change in color means change in values that are
“relevant”
– Change in structure means overall distribution of
data has changed

Sliders for discrete as well as continuous
values
 Initial Query is SQL or “Gradi”
Calibrations
Calculation of “relevance” factor can be
calibrated by the user
 Starting and ending values for various
numeric data

– e.g. blood sample counts
Figure 12. The VisDB System
How about complex queries?
Multiple layers of windows for complex
queries using nested AND and OR
operators
 Data that satisfies ALL join conditions is
yellow. The rest is colored based on the
number of criteria met
 Works well with the relational databases
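Coloring by the number of criteria met can be sketched as below. Only the use of yellow for full matches comes from the slide; the colors chosen for partial matches and the name `color_by_criteria` are assumptions, and nested AND/OR structure is not modeled.

```python
PARTIAL_COLORS = ["green", "blue", "red", "black"]  # assumed scale for partial matches

def color_by_criteria(item, conditions):
    """Yellow when every condition holds; otherwise darker as more conditions fail."""
    missed = sum(1 for condition in conditions if not condition(item))
    if missed == 0:
        return "yellow"
    return PARTIAL_COLORS[min(missed, len(PARTIAL_COLORS)) - 1]

conditions = [
    lambda row: row["temp"] > 20,
    lambda row: row["humidity"] < 60,
]
print(color_by_criteria({"temp": 25, "humidity": 50}, conditions))  # yellow
print(color_by_criteria({"temp": 25, "humidity": 70}, conditions))  # green
print(color_by_criteria({"temp": 10, "humidity": 70}, conditions))  # blue
```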

Applications
Molecular Biology - to find possible
docking regions by identifying sets of
surface points with distinct
characteristics.
 Database of geographical data
 Environmental Data
 NASA Earth observation data

Future Extensions
Automatic generation of queries that
correspond to data in specific regions
(Select some data, and the SQL query
that matches that data will get
generated…) Cool!!
 Time series visualization

VisDB Summary
Useful for identifying and isolating
clusters, correlations and hotspots in
large databases.
 Good Query specification system.
 No Zoom for the visualizations

Thank You!