Stopping to Look at the Flowers
Di Cook, Statistics, Iowa State University
Joint with D. F. Swayne, A. Buja, AT&T

What is data visualization?
Data: information in a table or list.
Visualization:
  - abstract relationships between variables.
  - beyond 3D, to arbitrary dimensions.
  - applicable to many types of data.

Beyond a Flat Page
"Multiple views" paradigm.
Augmented by:
  - Focusing using zoom/pan/re-scale.
  - Linking by queries, or motion.
  - Rearranging to make multiple comparisons.

History of Statistical Graphics
  - PRIM-9: Fisherkeller, Friedman, Tukey, 1974.
  - brushing: Newton, 1978; McDonald, 1982.
  - grand tour: Asimov, 1985.

What is "interactive"?
Direct manipulation in the plot:
  - linked brushing of points/regions/lines.
  - querying the id of a point or group of points.
  - dragging a scrollbar to change the value of a parameter.
  - clicking a button to change the variables viewed in a plot.

What is "dynamic"?
Motion graphics:
  - cycling between plots.
  - 3D rotating plots.
  - tour methods: grand/random, guided, manual.

What Makes Graphics Special?
The eye can absorb enormous amounts of information.
With graphics we can:
  - often find features that we wouldn't otherwise detect from numerical methods:
      - small departures from the trend.
      - sparse structure in high dimensions.
  - refine numerical results and make them more interpretable.

Intricate Features: Tipping Behavior
One waiter recorded 244 dining parties over 2.5 months in early 1990.
Recorded: total tip, total bill, sex of payer, smoking or not, day of the week, time of day, size of the party.
What are the important factors in tipping behavior?
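One way to start on that question is to condition on two of the recorded factors, sex of payer and smoking status, and look at each group separately. The sketch below is a minimal illustration under stated assumptions, not the analysis from the talk: the field names (total_bill, tip, sex, smoker), both helper functions, and the three example rows are hypothetical. The rows are invented placeholders and are not values from the tipping data.

```python
from collections import defaultdict


def split_into_panels(records):
    """Group (total_bill, tip) pairs by the (sex, smoker) combination."""
    panels = defaultdict(list)
    for r in records:
        panels[(r["sex"], r["smoker"])].append((r["total_bill"], r["tip"]))
    return dict(panels)


def tip_rate(pairs):
    """Sum of tips divided by sum of bills for one panel."""
    total_bill = sum(bill for bill, _ in pairs)
    total_tip = sum(tip for _, tip in pairs)
    return total_tip / total_bill


# Placeholder rows, invented for illustration only; NOT from the study.
example = [
    {"total_bill": 20.0, "tip": 3.0, "sex": "F", "smoker": False},
    {"total_bill": 10.0, "tip": 2.0, "sex": "F", "smoker": False},
    {"total_bill": 40.0, "tip": 4.0, "sex": "M", "smoker": True},
]

panels = split_into_panels(example)
rates = {key: tip_rate(pairs) for key, pairs in panels.items()}
```

Each key of panels corresponds to one group of dining parties, and the (total_bill, tip) pairs in a group are what a scatterplot of that group would draw. A single rate per group hides most of the structure, which is the argument for plotting the pairs rather than stopping at the summary.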
Reference: Bryant and Smith (1995)

Intricate Features: Tipping Behavior
[Figure: plots of tips; horizontal axes labeled "Tips"]

Intricate Features: Tipping Behavior
[Figure: scatterplots of Total Tip against Total Bill; panels: Male Smokers, Female Smokers, Male Non-smokers, Female Non-smokers]

Software: XGobi
  - Developed at Bellcore by Swayne, Cook, and Buja, beginning 1989 (Swayne et al., 1998).
  - Freely available from www.research.att.com/areas/stat/xgobi/
  - Data represented by scatterplots and connected lines.
  - Linked brushing of points and lines across plots.
  - Dynamic plots: cycling, 3D rotations, tours.
  - Interprocess communication to other software.
  - X Window System application.

Sparse Structure: 7D particle physics
[Figure: two projections of the 7D data, with variable axes X1 to X7]

Refining Results: Italian Olive Oils
Percentage composition of 8 fatty acids for 572 samples from 3 regions (and several areas) in Italy.
How do we distinguish the oils from different regions and areas in Italy based on their combinations of the fatty acids?
Reference: Forina et al. (1983)

Refining Results: Italian Olive Oils
[Figure: plots of the oil samples labeled by region (1, 2, 3); axes: eicosenoic, linoleic, arachidic, oleic]

Refining Results: Italian Olive Oils
[Figure]

Summary
Applicable wherever data is collected: all areas of science, governments, and the financial, retail, health, and telecommunications industries.

Web Pages
The author can be contacted by email at dicook@iastate.edu, and the XGobi software can be downloaded from the XGobi web site: http://www.research.att.com/areas/stat/xgobi/