MDMV Visualization

Visualization of
Multidimensional Multivariate
Large Dataset
Presented by:
Zhijian Pan
zpan@cs.umd.edu
University of Maryland
Description
 Covered papers:
– Alfred Inselberg, Multidimensional Detective
– Ted Mihalisin, Visualizing Multivariate
Functions, Data, and Distributions
 The problem:
• Visualization and analysis of large datasets with
multiple parameters or factors, and of the key
relationships among them
• MDMV problem
Key words explanation
 Multidimensional:
– The dimensionality of independent variables
 Multivariate:
– The dimensionality of dependent variables
 Example:
– 3-D volume space+temperature+pressure
produces 3D2V data
 The data set could be larger than the number of
pixels
Four Stages of Development
 1st: Graphical representation of one- or two-variate
data, e.g. scatterplot, scatterplot matrix
 2nd: Two-dimensional graphics that encode
multiple parameters, e.g. color, size, shape coding
 3rd: High-dimensional graphics, high-speed
computation, single display, such as Parallel
Coords
 4th: Elaboration and assessment of various
visualization techniques
MDMV Visualization Category
 Broadly categorized into five groups:
– Brushing
– Panel Matrix
– Iconography
– Hierarchical Displays
– Non-Cartesian Displays
Group 1
 Brushing
– Direct manipulation of MDMV visualization
display: labeling, enhanced linking
– E.g. brushing a scatterplot matrix
Group 2
 Panel Matrix (pairwise 2-D plot, n-D box)
– E.g. Hyperbox: n*n lines, n*(n-1)/2 faces
– Elaboration of scatterplot matrix
– Adding interactive data navigation (hyperbox
cutting)
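The n*(n-1)/2 count is the number of unordered variable pairs, one per 2-D panel or face. A minimal sketch, assuming hypothetical variable names x1..x4 that are not from the papers:

```python
from itertools import combinations

def pairwise_panels(variables):
    """Return every unordered pair of variables, one pair per 2-D panel."""
    return list(combinations(variables, 2))

# Hypothetical 4-variable data set: 4 * 3 / 2 = 6 panels.
panels = pairwise_panels(["x1", "x2", "x3", "x4"])
print(len(panels))
```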
Group 3
 Iconography: glyphs are graphical entities
that encode MDMV data with shape, size,
color, and position.
– E.g. face glyph: size and position of eyes, nose,
mouth; curvature of mouth; angle of eyebrows
Group 4
 Hierarchical Displays:
– Map subsets of variates into different
hierarchical displays
– Dynamic interactive analysis
– Covered in the Ted Mihalisin paper; more details follow
Group 4 (cont’d)
 New term: speed = how quickly a variable cycles along the hierarchical axes
 E.g. three variables x, y, and z, each taking values in {0, 1, 2}
 x is the fastest axis, z the slowest axis
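The fastest/slowest ordering behaves like a nested loop, or an odometer: x runs through all its values before y advances, and y runs through all of its values before z advances. A minimal sketch of that enumeration; the function name `hierarchical_order` is illustrative and not taken from the paper:

```python
from itertools import product

def hierarchical_order(values_by_axis):
    """Yield grid points as dicts; axes are given fastest first."""
    names = list(values_by_axis)
    # itertools.product varies its LAST argument fastest,
    # so the axes are passed slowest first.
    slow_first = names[::-1]
    for combo in product(*(values_by_axis[n] for n in slow_first)):
        yield dict(zip(slow_first, combo))

# x fastest, z slowest, each in {0, 1, 2}: 27 points in total.
points = list(hierarchical_order({"x": [0, 1, 2], "y": [0, 1, 2], "z": [0, 1, 2]}))
for p in points[:4]:
    print(p["x"], p["y"], p["z"])
```

x changes at every step, y every 3 steps, and z every 9 steps.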
Group 4 (Cont’d)
 Visualizing 3 variables:
– 2 independent variables: x, y:
• x = -2, -1, 0, 1, 2
• y = -2, -1, 0, 1, 2
– 1 dependent variable: z = x**2 + y**2
– So, a 2D1V problem
– x fastest, y slowest
Group 4 (Cont’d)
 3D1V: w = (x**2) * (e**-y) + z
• Top panel speed order: x, y, z
• Bottom panel speed order: z, y, x
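Changing the speed order changes where each grid point lands along the horizontal axis, while the value of w at that point stays the same. A minimal sketch, assuming 3 samples per axis (the slide does not give the sampling); `flat_index` is a hypothetical helper name:

```python
import math

def flat_index(indices, sizes, speed_order):
    """Horizontal slot of a grid point; speed_order lists axes fastest first."""
    position, stride = 0, 1
    for axis in speed_order:
        position += indices[axis] * stride
        stride *= sizes[axis]  # a slower axis advances once per full cycle of the faster ones
    return position

def w(x, y, z):
    """Dependent variable from the slide: w = x**2 * e**(-y) + z."""
    return x ** 2 * math.exp(-y) + z

sizes = {"x": 3, "y": 3, "z": 3}   # assumed: 3 samples per axis
point = {"x": 1, "y": 2, "z": 0}   # per-axis sample indices of one grid point
top = flat_index(point, sizes, ["x", "y", "z"])      # 1 + 2*3 + 0*9
bottom = flat_index(point, sizes, ["z", "y", "x"])   # 0 + 2*3 + 1*9
print(top, bottom)
```

The same data point sits at slot 7 in the top panel's ordering and slot 15 in the bottom panel's ordering.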
Group 4 (cont’d)
 What if the number of the data points
greatly exceeds the number of horizontal
pixels assigned to the panel?
 Example: 7 independent variables, each with
10 values, gives 10**7 = 10,000,000 points
 Need:
– hierarchical subspace zooming to reduce
dimension
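The arithmetic behind the zooming need can be sketched as follows. This is a simplified reading that treats zooming as pinning some variables to a single value each, which may be coarser than the interactive technique in the paper; the variable names v1..v7 are hypothetical:

```python
from math import prod

def points_in_view(sizes, fixed_axes):
    """Count grid points left after pinning each axis in fixed_axes to one value."""
    return prod(n for axis, n in sizes.items() if axis not in fixed_axes)

sizes = {f"v{i}": 10 for i in range(1, 8)}   # 7 variables, 10 values each
total = points_in_view(sizes, set())
zoomed = points_in_view(sizes, {"v3", "v4", "v5", "v6", "v7"})
print(total, zoomed)
```

With all 7 variables free there are 10,000,000 points. With five of them pinned, the remaining 2-D subspace has 100 points, which fits within a panel's horizontal pixels.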
Group 4 (cont’d)
 From 7D to 2D:
Group 4 (cont’d)
 Example: experimental data visualization:
– Dependent: specific heat
– Independent:
• Fastest: temperature (white): Gaussian peak
• Then alloy concentration (blue): linear increase
• Then magnetic field (red): nonlinear decrease
Group 5
 Parallel Coordinates
– So many class presentations have already been
done!
– Everybody is already an expert at using it
– What are some basic ideas behind it?
– Cartesian vs. Parallel Coords
Group 5 (cont’d)
 A Cartesian line:
– L: x2 = mx1+b
– A set of points sampled
on this line
• On Parallel Coords:
– Each point becomes a line
– The set of points becomes a
set of intersecting lines
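The point-to-line mapping can be checked numerically. A minimal sketch, assuming the two parallel axes sit at horizontal positions 0 and d = 1, and using a hypothetical line with m = -1, b = 2:

```python
def to_parallel_segment(x1, x2, d=1.0):
    """Map Cartesian point (x1, x2) to a segment from (0, x1) on axis X1 to (d, x2) on axis X2."""
    return (0.0, x1), (d, x2)

def intersect(seg_a, seg_b):
    """Intersect the infinite lines through two segments (assumes they are not parallel)."""
    (ax0, ay0), (ax1, ay1) = seg_a
    (bx0, by0), (bx1, by1) = seg_b
    slope_a = (ay1 - ay0) / (ax1 - ax0)
    slope_b = (by1 - by0) / (bx1 - bx0)
    u = (by0 - ay0 + slope_a * ax0 - slope_b * bx0) / (slope_a - slope_b)
    return u, ay0 + slope_a * (u - ax0)

m, b = -1.0, 2.0                                   # hypothetical line x2 = m*x1 + b
points = [(x1, m * x1 + b) for x1 in (0.0, 1.0, 2.0)]
segments = [to_parallel_segment(x1, x2) for x1, x2 in points]
meet_01 = intersect(segments[0], segments[1])
meet_12 = intersect(segments[1], segments[2])
print(meet_01, meet_12)
```

Both pairs of segments meet at the same location, so the sampled points of one Cartesian line share a single intersection point in parallel coordinates.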
Group 5 (cont’d)
 The intersection point:
 The location of the intersection point is important!
– Between the two axes: inverse relationship, slope m < 0
(x1 ∝ 1/x2)
– Outside the two axes: direct relationship, slope m > 0
(x1 ∝ x2)
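Why the location matters can be worked out from the line L: x2 = m x1 + b. The derivation below assumes the two parallel axes are placed at horizontal positions 0 and d, with (u, v) as screen coordinates; these symbols are introduced here for illustration.

```latex
% A point with x_1 = t on L maps to the segment from (0, t) to (d, mt + b):
v(u) = t + \frac{u}{d}\,\bigl(mt + b - t\bigr)
     = t\left(1 + \frac{u}{d}(m - 1)\right) + \frac{u}{d}\,b .
% All segments meet where v no longer depends on t:
1 + \frac{u}{d}(m - 1) = 0
\;\Longrightarrow\;
u^{*} = \frac{d}{1 - m}, \qquad v^{*} = \frac{b}{1 - m}, \qquad m \neq 1 .
% Location of u^{*} relative to the axes:
m < 0 \;\Rightarrow\; 0 < u^{*} < d \quad \text{(between the axes)}, \\
0 < m < 1 \;\Rightarrow\; u^{*} > d, \qquad
m > 1 \;\Rightarrow\; u^{*} < 0 \quad \text{(outside the axes)} .
```

For m = 1 the segments are parallel and never meet at a finite point.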
Group 5 (cont’d)
 Application example
– Aircraft collision
checking
– Converting the
problem into detecting
a four-dimensional
geometric intersection
– Collision at (2,2,2,1)
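The 4-D intersection idea can be sketched as follows: two aircraft collide when their 3-D positions coincide at the same time, so the check is an intersection test in time plus three space coordinates. The straight-line trajectories below are hypothetical, and reading the 4-D point as (t, x, y, z) is an assumed ordering; neither is taken from the paper.

```python
def collision_time(p0, v0, p1, v1, tol=1e-9):
    """Return t with p0 + v0*t == p1 + v1*t on every axis, else None.

    Sketch only: identical trajectories also return None.
    """
    t = None
    for a, va, b, vb in zip(p0, v0, p1, v1):
        rel_pos, rel_vel = b - a, va - vb
        if abs(rel_vel) < tol:
            if abs(rel_pos) > tol:
                return None        # constant nonzero gap on this axis
            continue               # this axis agrees at all times
        t_axis = rel_pos / rel_vel
        if t is None:
            t = t_axis
        elif abs(t - t_axis) > tol:
            return None            # axes agree at different times
    return t

# Hypothetical aircraft: start position and velocity in (x, y, z).
t_hit = collision_time((0, 0, 1), (1, 1, 0), (4, 0, 1), (-1, 1, 0))
t_miss = collision_time((0, 0, 1), (1, 1, 0), (4, 0, 2), (-1, 1, 0))
print(t_hit, t_miss)
```

With these made-up inputs the aircraft meet at t = 2 at position (2, 2, 1), which is the 4-D point (2, 2, 2, 1) under the assumed ordering. Raising the second aircraft by one unit of altitude removes the collision.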
Group 5 (cont’d)
 Application example:
– Economic model of a
real country
– 8 variables:
• Agriculture
• Fishing
• Mining
• Manufacturing
• Construction
• Government
• Miscellaneous
• GNP
Group 5 (cont’d)
 A least-squares function defines the
boundary region in 8-dimensional space
 Any point (a polygonal line in parallel
coordinates) inside the boundary
represents a feasible economic policy for
the country
Group 5 (cont’d)
 Discoveries:
– No policy would favor
Agriculture without
also favoring Fishing:
(x1 ∝ x2)
– Inverse relationship
between Fishing and
Mining, suggesting resource
competition:
(x1 ∝ 1/x2)
Notes on the References
 Inselberg’s paper:
– 11 citations found on
ResearchIndex
– Applications in
knowledge discovery,
user interfaces, aircraft
design, etc.
 Mihalisin’s paper:
– Only one citation
found
Contribution
 Inselberg’s paper:
– Transform MDMV hyperspace relations into a
2-D geometric pattern problem
– Empirical studies demonstrated its
strength in trade-off analysis,
sensitivity discovery, and optimization
 Mihalisin’s paper:
– Hierarchical technique for visualizing data sets whose
points greatly outnumber the available pixels
Critique
 Inselberg’s paper:
– No comparison with other MDMV techniques
– No examples supporting the claim that
displayed objects can be recognized under
projective transformations
 Mihalisin’s paper:
– Limited number of values for each variable
visualized in one display
– No discussion of potential information loss
with a coarse-grained grid
Favorite Sentence
 “You can’t be unlucky all the time!”
– Multiple techniques exist for the MDMV
visualization problem
– Each has strengths and weaknesses
– Whichever you start with, you can’t be unlucky
all the time!
– Integration and collaboration of existing tools
remain active research topics.