i247: Information Visualization and Presentation
Marti Hearst

Data Types and Graph Types

Outline
• The Roles and Stages of Visualization (briefly)
• Data Models and Types of Data
• Which Kinds of Graphs for Which Types of Data?
• Class Exercise

The Roles and Stages of Visualization

What Visualization Can Do (Ware)
• Allows comprehension of huge amounts of data.
• Allows perception of emergent properties.
• Enables problems with the data to stand out.
• Facilitates understanding at both large and small scales; patterns linking local features.
• Facilitates hypothesis formation.

What Visualization Can Do (Tufte ’83)
• Show the data
• Induce the viewer to think about the data
• Avoid distorting what the data have to say
• Present many numbers in a small space
• Make large data sets coherent
• Encourage the eye to compare different pieces of data
• Reveal the data at several levels of detail, from overview to fine structure
• Serve a clear purpose:
  – Description, exploration, tabulation, or decoration
• Be closely integrated with the statistical and verbal descriptions of a data set.

Stages of Visualization (Ware)
• Collection and storage of data
• Preprocessing to transform data into something understandable
• Hardware and graphics algorithms for producing an image on the screen
• Human perceptual and cognitive system.
• (I think he’s missing a stage … Design of the visualization.)

Put it Into Questions
• What are our goals?
• What questions do we want to answer?
• What kind of data might we collect?
• How might we convey the information associated with this data?

Visualization Components (from Melanie Tory)
• Human Abilities (imply Design Principles)
  – Visual perception
  – Cognition
  – Motor skills
• Design Principles (inform design)
  – Visual display
  – Interaction
• Frameworks (constrain design)
  – Data types
  – Tasks
• Techniques
  – Graphs & plots
  – Maps
  – Trees & Networks
  – Volumes & Vectors
  – …
• Design Process
  – Iterative design
  – Design studies
  – Evaluation

Data Models and Types of Data

Basic Elements of a Data Model
• A data model represents some aspect of the world
• Data models consist of these basic elements:
  – objects
  – values (also called attributes)
  – relations
(Adapted from Stone & Zellweger)

Basic Elements: Objects
• Objects are items of interest
  – people, plants, cars, films, etc…
• Objects allow you to define and reason about a domain
  – ecosystem: ponds, streams, woodlands, mountains, plants, animals, etc.
(Adapted from Stone & Zellweger)

Basic Elements: Values
• Values (or attributes) are properties of objects
• Two major types
  – quantitative
  – categorical
• Appropriate visualizations often depend upon the type of the data values
(Adapted from Stone & Zellweger)

Basic Elements: Relations
• Relations relate two or more objects
  – leaves are part of a plant
  – a department consists of employees
• Ecosystem
  – connections between streams and lakes
  – predator/prey network of what eats what
  – …
(Adapted from Stone & Zellweger)

Types of Data (Ware)
• Entities
• Relationships
• Attributes of Entities or Relationships
  – Nominal / Ordinal / Interval / Ratio (Stevens ’46)
  – Categorical / Integer / Real
• Operations Considered as Data
  – Mathematical
  – Merging lists
  – Transforming data, etc.
• Metadata (derived data)

Types of Data (Few)
• Quantitative (allows arithmetic operations)
• Categorical (group, identify & organize; no arithmetic)
  – Nominal
  – Ordinal
  – Interval
  – Hierarchical
(Adapted from Stone & Zellweger)

Types of Data
• Quantitative (allows arithmetic operations)
  – 123, 29.56, …
• Categorical (group, identify & organize; no arithmetic)
  – Nominal (name only, no ordering)
    • Direction: North, East, South, West
  – Ordinal (ordered, not measurable)
    • First, second, third …
    • Hot, warm, cold
  – Interval (starts out as quantitative, but is made categorical by subdividing into ordered ranges)
    • Time: Jan, Feb, Mar
    • 0-999, 1000-4999, 5000-9999, 10000-19999, …
  – Hierarchical (successive inclusion)
    • Region: Continent > Country > State > City
    • Animal > Mammal > Horse
(Adapted from Stone & Zellweger)

Which Types of Graphs for Which Kinds of Data?

Quantitative Against Categorical
[Figure] From Few, "Quantitative vs. Categorical Data: A Difference Worth Knowing", DM Review Magazine, April 2005

Quantitative Against Quantitative
[Figure] From Few, "Quantitative vs. Categorical Data: A Difference Worth Knowing", DM Review Magazine, April 2005

Questions to ask when creating a graph
• Is a graph needed?
  – Yes, if illustrating relationships among measurements
• What information is being conveyed?
  – What is most important?
  – Start by writing a title

Questions to ask when creating a graph
• What data is needed to answer specific questions?
  – Overview? Relationships?
  – Grice’s maxims
    • combine relevant information together
    • don’t show extraneous information
• Who is your audience?

What Format to Use?
• Bertin has a notion of efficiency
• Tufte says “show the data”
• Let’s start with familiar graph types
  – line graphs
  – bar charts
  – scatter plots
  – layer graphs
• When to use each?

Anatomy of a Graph (Kosslyn 89)
• Framework
  – sets the stage
  – kinds of measurements, scale, ...
• Content
  – marks
  – point symbols, lines, areas, bars, …
• Labels
  – title, axes, tic marks, ...
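
The "Interval" type in the Types of Data slide (a quantitative value made categorical by subdividing it into ordered ranges) can be sketched in a few lines of Python. This is only an illustration: the range labels come from the slide's example, while the names EDGES, LABELS, to_interval, and bin_counts are hypothetical, and the choice to return None for values outside the listed ranges is an assumption.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of each range, plus one final upper bound (exclusive).
# Ranges follow the slide's example: 0-999, 1000-4999, 5000-9999, 10000-19999.
EDGES = [0, 1000, 5000, 10000, 20000]
LABELS = ["0-999", "1000-4999", "5000-9999", "10000-19999"]


def to_interval(value):
    """Map a quantitative value to its ordered range label.

    Returns None when the value falls outside the listed ranges
    (an assumption; the slide's list continues with "...").
    """
    if value < EDGES[0] or value >= EDGES[-1]:
        return None
    # bisect_right finds how many edges are <= value; subtract 1 for the label index.
    return LABELS[bisect_right(EDGES, value) - 1]


def bin_counts(values):
    """Count how many values fall in each range, keeping the ranges in order."""
    counts = Counter(to_interval(v) for v in values)
    return [(label, counts.get(label, 0)) for label in LABELS]
```

Once binned, the values support grouping and ordering (for example, counting items per range for a bar chart), but arithmetic on the labels no longer makes sense, which is the distinction the slide draws between quantitative and categorical data.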

When to use which type?
• Line graph
  – x-axis requires quantitative variable
  – differences among contiguous values
  – familiar/conventional ordering among ordinals
• Bar graph
  – comparison of relative point values
• Scatter plot
  – convey overall impression of relationship between two variables

What to put on the x axis?
• Independent vs. Dependent variables
  – we often measure one quantitative variable against another
  – the value of one changes in relation to the other
  – the dependent variable changes relative to the independent one
  – the independent variable acts as a “measuring stick”
• Independent usually goes on the x (horizontal) axis

Independent vs. Dependent
• Independent vs. Dependent variables
  – heat in degrees against time
  – sales against season
  – tax revenue against city
• What happens when there is more than one independent variable?
  – Choose one for the x axis, and another as a variation in the mark (color, shape)

Few on How to Show Information
• The best way to show a single value?
  – Use a textual representation.
  – Why?
• How to draw attention to a number?

Few on How to Show Information
• What are tables good for?
  – Data lookup
  – Hierarchical relationships

Class Exercise

How to Combine Data Types?
• Class Exercise:
  – Using data about autos from the 1970s
  – Each person gets a column of data
• First, identify the data type
• Then, stand up
• Then, repeat the following several times:
  – Walk up to someone else. If they have a different column than you do, discuss whether and how you should plot your two columns.
    » If yes, what question are you answering?
    » If no, why not?
• Then, repeat this, but with groups of three people.
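
As a starting point for the exercise, the guidance in "When to use which type?" and "What to put on the x axis?" can be written down as a rough decision rule. The sketch below is one possible reading of those slides, not a rule stated in them: the function names column_type and suggest_graph, the show_trend flag, and the decision to return None when the dependent variable is not quantitative are all assumptions made for illustration.

```python
def column_type(values):
    """Crude first step of the exercise: identify a column's data type.

    Treats a column of numbers as quantitative and anything else as
    categorical. Deciding between nominal and ordinal still needs a human.
    """
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "categorical"


def suggest_graph(x_type, y_type, show_trend=False):
    """Suggest a graph type for an independent (x) and dependent (y) variable.

    x_type and y_type are 'quantitative', 'ordinal', or 'nominal'.
    show_trend means differences among contiguous x values matter.
    """
    if y_type != "quantitative":
        # Assumption: the graph types on the slides all plot a quantitative measure.
        return None
    if x_type == "quantitative":
        # Line graph for contiguous differences, scatter plot for the
        # overall impression of a relationship between two variables.
        return "line graph" if show_trend else "scatter plot"
    if x_type == "ordinal" and show_trend:
        # Ordinals with a familiar, conventional ordering can carry a line.
        return "line graph"
    # Comparison of relative point values across categories.
    return "bar graph"
```

For example, suggest_graph("nominal", "quantitative") gives "bar graph", matching the slides' "tax revenue against city" case, and a second independent variable would be shown as a variation in the mark (color, shape) rather than as another axis.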