Visual Computing
Lecture 2
Visualization, Data, and Process Pipeline

High Level Visualization Process Pipeline
1. Data Modeling
2. Data Selection
3. Data to Visual Mappings
4. Scene Parameter Settings (View Transforms)
5. Rendering

Computer Graphics Pipeline
1. Modeling
2. Viewing
3. Clipping
4. Hidden Surface Removal
5. Projection
6. Rendering

Visualization Process Pipeline

Knowledge Discovery (Data Mining)
A Data Analysis Pipeline
[Figure: Raw Data → (A: Cleaning, Filtering, Transforming) → Processed Data → (B: Statistical Analysis, Pattern Rec, Knowledge Disc) → Hypotheses, Models → (C: Validation) → Results (D)]

Where Does Visualization Come In?
• All stages can benefit from visualization
• A: identify bad data, select subsets, help choose transforms (exploratory)
• B: help choose computational techniques, set parameters, use vision to recognize, isolate, classify patterns (exploratory)
• C: superimpose derived models on data (confirmatory)
• D: present results (presentation)

What do we need to know to do Information Visualization?
• Characteristics of data
  – Types, size, structure
  – Semantics, completeness, accuracy
• Characteristics of user
  – Perceptual and cognitive abilities
  – Knowledge of domain, data, tasks, tools
• Characteristics of graphical mappings
  – What are the possibilities
  – Which convey data effectively and efficiently
• Characteristics of interactions
  – Which support the tasks best
  – Which are easy to learn, use, remember

Visualization Components
• Human Abilities (imply Design Principles)
  – Visual perception
  – Cognition
  – Motor skills
• Design Principles
  – Visual display
  – Interaction
• Design Process (arrow label: Inform design)
  – Iterative design
  – Design studies
  – Evaluation
• Frameworks (arrow label: Constrain design)
  – Data types
  – Tasks
• Techniques
  – Graphs & plots
  – Maps
  – Trees & Networks
  – Volumes & Vectors
  – …

Issues Regarding Data
• Type may indicate which graphical mappings are appropriate
  – Nominal vs. ordinal
  – Discrete vs. continuous
  – Ordered vs. unordered
  – Univariate vs. multivariate
  – Scalar vs. vector vs. tensor
  – Static vs. dynamic
  – Values vs. relations
• Trade-offs between size and accuracy needs
• Different orders/structures can reveal different features/patterns

Types of Data
• Quantitative (allows arithmetic operations)
  – 123, 29.56, …
• Categorical (group, identify & organize; no arithmetic)
  – Nominal (name only, no ordering)
    • Direction: North, East, South, West
  – Ordinal (ordered, not measurable)
    • First, second, third …
    • Hot, warm, cold
  – Interval (starts out as quantitative, but is made categorical by subdividing into ordered ranges)
    • Time: Jan, Feb, Mar
    • 0-999, 1000-4999, 5000-9999, 10000-19999, …
  – Hierarchical (successive inclusion)
    • Region: Continent > Country > State > City
    • Animal > Mammal > Horse
Adapted from Stone & Zellweger

Quantitative Data
• Characterized by its dimensionality and the scales over which the data has been measured
• Data scales comprise:
  – Interval scales: real data values, such as degrees Celsius, that do not have a natural zero point.
  – Ratio data scales: like interval scales, but they have a natural zero point and can be defined in terms of arbitrary units.
  – Absolute data scales: ratio scales that are defined in terms of non-arbitrary units.

Data Dimensions
• Scalar: single value
  – e.g., speed. It specifies how fast an object is traveling.
• Vector: multiple values
  – e.g., velocity. It gives both the speed and the direction.
• Tensor: multiple values
  – Scalars and vectors are special cases of tensors with degree (n) equal to 0 and 1, respectively.
  – The number of tensor components is d^n, where d is the dimensionality of the coordinate system.
  – In a three-dimensional coordinate system (d = 3), a scalar (n = 0) requires one value, a vector (n = 1) requires three values, and a tensor (n = 2) requires nine values.
  – There is a difference between a vector and a collection of scalars.
  – A multidimensional vector is a unified entity, the components of which are physically related.
  – The three components of the velocity vector of a particle moving through three-space are coherently linked, while a collection of scalar measurements, such as weight, temperature, and index of refraction, is not.

Metadata
• Metadata provides a description of the data and the things it represents.
  – e.g., a data value of 98.6 °F has two metadata attributes: temperature and temperature scale.
  – The value 98.6 has little meaning without the metadata attribute of temperature.
  – By adding the attribute Fahrenheit, we know the Fahrenheit scale is used.
• Metadata may also include descriptions of experimental conditions and documentation of data accuracy and precision.

Issues Regarding Mappings
• Variables include shape, size, orientation, color, texture, opacity, position, motion, …
• Some of these have an order, others don’t
• Some use up significant screen space
• Sensitivity to occlusion
• Domain customs/expectations
www3.sympatico.ca/blevis/Image10.gif

Importance of Evaluation
• Easy to design bad visualizations
• Many design rules exist: many conflict, many are routinely violated
• 5 E’s of evaluation: effective, efficient, engaging, error tolerant, easy to learn
• Many styles of evaluation (qualitative and quantitative):
  – Use/case studies
  – Usability testing
  – User studies
  – Longitudinal studies
  – Expert evaluation
  – Heuristic evaluation

Categories of Mappings
• Based on data characteristics
  – Numbers, text, graphs, software, …
• Logical groupings of techniques (Keim)
  – Standard: bars, lines, pie charts, scatterplots
  – Geometrically transformed: landscapes, parallel coordinates
  – Icon-based: stick figures, faces, profiles
  – Dense pixels: recursive segments, pixel bar charts
  – Stacked: treemaps, dimensional stacking
• Based on dimension management (Ward)
  – Dimension subsetting: scatterplots, pixel-oriented methods
  – Dimension reconfiguring: glyphs, parallel coordinates
  – Dimension reduction: PCA, MDS, Self Organizing Maps
  – Dimension embedding: dimensional stacking, worlds within worlds

Scatterplot Matrix
• Each pair of dimensions generates a single scatterplot
• All combinations arranged in a grid or matrix, each dimension controls a row or column
• Look for clusters, outliers, partial correlations, trends

Parallel Coordinates
• Each variable/dimension is a vertical line
• Bottom of line is low value, top is high
• Each record creates a polyline across all dimensions
• Similar records cluster on the screen
• Look for clusters, outliers, line angles, crossings

Star Glyph
• Glyphs are shapes whose attributes are controlled by data values
• Star glyph is a set of N rays spaced at equal angles
• Length of each ray proportional to value for that dimension
• Line connects all endpoints of shape
• Lay glyphs out in rows and columns
• Look for shape similarities and differences, trends

Other Types of Glyphs

Dimensional Stacking
• Break each dimension range into bins
• Break the screen into a grid using the number of bins for 2 dimensions
• Repeat the process for 2 more dimensions within the subimages formed by first grid, recurse through all dimensions
• Look for repeated patterns, outliers, trends, gaps

Pixel-Oriented Techniques
• Each dimension creates an image
• Each value controls color of a pixel
• Many organizations of pixels possible (raster, spiral, circle segment, space-filling curves)
• Reordering data can reveal interesting features, relations between dimensions

Methods to Cope with Scale
• Many modern datasets contain large number of records (millions and billions) and/or dimensions (hundreds and thousands)
• Several strategies to handle scale problems
  – Sampling
  – Filtering
  – Clustering/aggregation
• Techniques can be automated or user-controlled

Examples of Data Clustering

Example of Dimension Clustering

Example of Data Sampling

The Visual Data Analysis (VDA) Process
• Overview
• Filter/cluster/sample
• Scan
• Select “interesting”
• Details on demand
• Link between different views

Issues Regarding Users
• What graphical attributes do we perceive accurately?
• What graphical attributes do we perceive quickly?
• Which combinations of attributes are separable?
• Coping with change blindness
• How can visuals support the development of accurate mental models of the data?
• Relative vs. absolute judgements: impact on tasks

Role of Perception
MC Escher

Consider the Following

Role of Perception
• Users interact with visualizations based on what they see. (e.g. black spots at intersection of white lines)
• Must understand how humans perceive images.
• Primitive image attributes: shape, color, texture, motion, etc.

Visualization Example
Op Art - Victor Vasarely
OpGlyph (Marchese)

Gestalt Psychology Rules of Visual Perception
• Proximity
• Similarity
• Continuity
• Closure
• Symmetry
• Foreground & Background
• Size

Principles of Art & Design
• Emphasis / Focal Point
• Balance
• Unity
• Contrast
• Symmetry / Asymmetry
• Movement / Rhythm
• Pattern / Repetition

Issues Regarding Interactions
• Interaction critical component
• Many categories of techniques
  – Navigation, selection, filtering, reconfiguring, encoding, connecting, and combinations of above
• Many “spaces” in which interactions can be applied
  – Screen/pixels, data, data structures, graphical objects, graphical attributes, visualization structures
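
Supplementary sketch: Methods to Cope with Scale
The three strategies listed under Methods to Cope with Scale (sampling, filtering, clustering/aggregation) can be illustrated roughly as follows. This is a minimal sketch in Python using only the standard library. The function names, the toy records, and the use of a per-group mean as the aggregate are assumptions made for illustration and are not part of the lecture; real systems usually rely on more elaborate clustering methods.

```python
import random
from collections import defaultdict
from statistics import mean


def sample_records(records, k, seed=0):
    """Sampling: keep a random subset of at most k records."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(records, min(k, len(records)))


def filter_records(records, predicate):
    """Filtering: keep only the records that satisfy a chosen predicate."""
    return [r for r in records if predicate(r)]


def aggregate_records(records, key, value):
    """Aggregation: replace each group of records by the mean of one field."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r[value])
    return {group: mean(values) for group, values in groups.items()}


# Hypothetical toy dataset (illustrative values only).
records = [
    {"region": "north", "temp": 10.0},
    {"region": "north", "temp": 14.0},
    {"region": "south", "temp": 25.0},
    {"region": "south", "temp": 27.0},
    {"region": "south", "temp": 29.0},
]

subset = sample_records(records, 3)                          # 3 of the 5 records
warm = filter_records(records, lambda r: r["temp"] >= 14.0)  # drops the 10.0 record
means = aggregate_records(records, "region", "temp")         # one value per region
```

Each function returns a smaller dataset that could then be handed to any of the display techniques above. The sample size and the predicate are the settings a user-controlled version would expose, while an automated version would choose them itself.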