A BIS2 White Paper

A Primer on Super Graphics™
Introducing Temporal Analysis

“A good sketch is better than a long speech” - Napoleon Bonaparte

Figure 1 - Andy Warhol, Do It Yourself Flowers, 1962. http://americanhistory.si.edu/paint/unfinishedWork.html ; http://artobserved.com/artimages/2010/11/andy-warhol-do-it-yourself-flowers-1962.jpg

CONTENTS
Paint and Calculate By Numbers
Observations on Minard's Visualization
Example: Temporal Super Graphic
Temporal Super Graphic Observations
Conclusion

PAINT AND CALCULATE BY NUMBERS

In the 1950s, the Palmer Paint Company marketed the first paint-by-numbers products.¹ “Master” artists provided specific outlines and colors that allowed the amateur to create paintings well beyond his or her capability. Put another way, they gave those with little artistic talent the assistance needed to create appreciable artworks. Though widely criticized by purists, these products were soon very popular.

What the “purists” failed to appreciate, however, was the pleasure that paint-by-numbers enthusiasts experienced through approximating the output of the masters. Along with delight, the paint-by-numbers artists gained knowledge and an appreciation for the original art and artist.

PAINT-BY-NUMBERS

It was not long before the paint-by-numbers phenomenon became a metaphor for the modern world’s attempt to mechanize everything, including art (Figure 1).

CHARLES MINARD

Charles Minard’s often-cited graphic of Napoleon’s 1812 March to Moscow (Figure 2) uses colors, positions, sizes, proportions, texts, and graphs to display related facts; on a single canvas, it depicts what we would now call a 360° view of the event. The effects of the march, as well as the causes, are quickly discerned when art is applied to science in this manner. Edward Tufte, a recognized authority on visual design, characterizes this work as “the best statistical graph ever drawn”.²

The richness of this depiction goes well beyond the charts we commonly see in print or can produce in spreadsheet programs. Charts and graphs are effective in presenting comparative data in two or three dimensions. Minard artistically summarized multiple dimensions of related data on a single page. This technique lets a reader understand, at a glance, the many causes and effects related to the march. Using graphics to depict related data in close proximity allows the brain to uncover patterns more easily. Such data visualizations provide an alternative to statistical models as they apply to data presentation, and make visual data mining possible. However, just as the paint-by-numbers master artists helped the amateur create a pleasurable product, those creating and using data visualizations also require assistance from masters.

SKILLS REQUIRED

According to Stephen Few, “though simple, the skills that are required by the many to make sense of data don’t come naturally and the tools that support this effort are in short supply. These skills must be taught, and well-designed tools must be provided, to support cognitively rich immersion in information.”

He goes on to say, “we certainly don’t want all infovis [information visualization] researchers to go through exactly the same training resulting in a community of clones, but there are a few fundamentals, other than those typically found in computer science programs, which everyone should learn. Here are a few often overlooked areas:

- Training in psychology, especially visual perception
- Training in data analysis, including actual experience
- Training in design, especially visual and human-computer interaction design
- Training in the scientific method, including how to design, conduct, and publish robust research studies.

Those of us who guide research efforts are responsible both for knowing these fundamentals ourselves, and instilling them in our students.”

Few’s bulleted list of the skills required of infovis researchers reveals a need for not just left-brain technical and/or statistical skills, but also right-brain subjective skills, namely training in psychology, visual perception, and human-computer interaction design. Looking back at Minard’s visualization (Figure 2), one can imagine the hours of data analysis and summarization, and the scores of rough drawings, required to achieve the end result.

1. http://americanhistory.si.edu/paint/rembrandt.html
2. Tufte, Edward; The Visual Display of Quantitative Information, Graphics Press, 2001, Second Edition, p. 40.

Figure 2 - Charles Minard’s Visualization of Napoleon’s 1812 March to Moscow

OBSERVATIONS ON MINARD'S VISUALIZATION

Top 10 Criteria for a Super Graphic™

The following section covers the 10 aspects of a Super Graphic and how they apply to the Minard visualization (Figure 2):

1. Deterministic Framework: The framework naturally lends itself to the display of data. It is a spatial-temporal design in which the majority of the ink forms the shape of the troop movement. This framework is the central theme, and the relationships among the influencing factors are easily discerned by virtue of their position.

2. Attributes (Dimensions and Measures) Are Self Comparable: The content of the data defines the characteristics of the display. Six dimensions of data are displayed, namely time, troop size, temperature, latitude, longitude and distance. These dimensions have been merged to show a complete picture.

3. Volume of Ink Related to Scale of Data: The overwhelming majority of the ink displays data in the spatial structure, and shows the continuous decline in the number of troops over the distance and time travelled.
4. Language Neutral: Even though the text is in French, the use of graphics and numbers in the visualization allows non-French readers to readily understand it.

5. Purpose is to Present Findings: The best way to understand this feature is to attempt to read the graphic from any direction. Whichever direction one reads it from, the story is shown. Edward Tufte, in his seminars, describes a process in which observers are allowed to wander throughout Super Graphics and develop their own story.

6. No Independent Legend: The Minard graphic has no separate legend. This is the most difficult requirement to meet in achieving the status of Super Graphic. All of the elements that one would include in a legend are clear on the graphic itself. Troop numbers are displayed at appropriate points in the text, and the meaning of the color of the lines is apparent from their spatial separation.

7. The Color is Metric and Excites the Key Points: The use of color is the second most difficult area when creating Super Graphics. The visual cortex is highly tuned to color processing. Approximately 90% of foveal input comes from color receptors. The Minard graphic makes effective use of color to distinguish the troop movement into Russia from the movement out of it.

8. Pre-attentive Multi-layers of Information Shown: The Minard graphic is a classic multi-layered map with annotation overlaid on graphics. These layers are used to show different dimensions of data.

9. Fine-Grained Measurability: The width of the line is an enlarged spatial representation; the distortion gives a directly measurable (and continuous) value. Location and distance are also approximately measurable. Classification is not required for measurement.

10. Illustrates Causality: Although some historians argue that it was Russian forces, and not the cold, that caused deaths on the retreat, the clear linkage between temperature (distance travelled) and deaths is illustrated graphically.
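Criteria 3 and 9 above both rest on a linear mapping from troop count to the width of the band. The following is a minimal sketch of that mapping, not Minard's or BIS2's method. It assumes the scale usually read off Minard's chart (one millimeter of width per 10,000 men) and the commonly cited troop counts of 422,000, 100,000 and 10,000; neither the scale nor the counts are stated in this paper, and the function names are hypothetical.

```python
# Hypothetical sketch: function names are illustrative and do not come from
# the paper or from any BIS2 product.

# Assumption: the scale usually read off Minard's chart, 1 mm per 10,000 men.
MEN_PER_MM = 10_000


def band_width_mm(troops: int, men_per_mm: int = MEN_PER_MM) -> float:
    """Linear encoding: ink width is directly proportional to troop count."""
    if troops < 0:
        raise ValueError("troop count cannot be negative")
    return troops / men_per_mm


def troops_from_width(width_mm: float, men_per_mm: int = MEN_PER_MM) -> float:
    """Inverse mapping: a reader with a ruler can recover the count."""
    return width_mm * men_per_mm


# Assumption: commonly cited counts for the start of the advance, the arrival
# in Moscow, and the end of the retreat.
STAGES = [
    ("start of advance", 422_000),
    ("Moscow", 100_000),
    ("end of retreat", 10_000),
]

# Print the band width that each stage would occupy on the page.
for label, troops in STAGES:
    print(f"{label:>16}: {troops:>7,} men -> {band_width_mm(troops):4.1f} mm")
```

Because the mapping is linear and invertible, no classification into bands or buckets is needed: every width corresponds to exactly one troop count, which is the sense in which the graphic is continuously measurable.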
EXAMPLE: TEMPORAL SUPER GRAPHIC

The following is a data visualization produced by BIS²’s vizExplorer product (Figure 3). It depicts a calendar comparing the percentage of DVD rental revenue in the family category over a seven-year period.

The hot colors (reds) indicate periods of high percentages, the intermediate colors (greens) represent periods of average percentages, and the cool colors (blues) are periods of low percentages. The intensity of the colors indicates the relative values. The small dots mark the beginning of each week, starting in 2001 at the center and spiraling out clockwise. The large red dots are the significant movie releases. The size of the dot indicates the significance of the event.

If one starts at the center of the circle and follows a line of the small dots indicating weeks toward the outer circle, one can quickly compare the levels of sales over time in the context of major film releases.

Figure 3 - Temporal Super Graphic

TEMPORAL SUPER GRAPHIC OBSERVATIONS

- Understand seasonality by following the natural cyclical patterns.
- Read from one week inside to out (or vice versa) to see year-on-year trends.
- Analyze movie releases over an extended period.

STRATEGIC

The family category represents a large percentage of the overall business in April. It contributes a fairly consistent percentage of the overall revenue for the majority of the year.

TACTICAL

There were uncharacteristic increases in revenue during the 4th week of September 2004 and on Valentine’s Day 2005.

Top 10 Criteria for a Super Graphic™

The following section applies the Top 10 aspects of a Super Graphic to BIS²’s Temporal Super Graphic (Figure 3):

1. Deterministic Framework: The framework naturally lends itself to the display of data. The spiral calendar represents the natural cyclical structure of time.

2. Attributes (Dimensions and Measures) Are Self Comparable: The content of the data defines the characteristics of the display. The combination of large monthly and weekly cycles with percentage of revenue, overlaid with key events (the top 10 movie releases), brings together 3-4 dimensions of data.

3. Volume of Ink Related to Scale of Data: The heat map comprises 365 data points of sales data. The volume and color of the ink make the scale used to represent sales volume evident.

4. Language Neutral: The movie titles are English annotations (it could be argued that movie titles have multilingual characteristics due to their global popularity), but no aspect of the graphic requires language-specific knowledge.

5. Purpose is to Present Findings: Observers can read the visualization from the inside or the outside (from past to present or present to past). Broad patterns are shown along with microtrends. There is no guiding principle on where to start the interpretation.

6. No Independent Legend: This visualization could benefit from a legend, but one is not required for the majority of the interpretation (e.g., colors known to be “hot” indicate areas of high sales).

7. The Color is Metric and Excites the Key Points: The extensive use of spectrum-based color scales enables a metric reading of continuous values and intuitive interaction.

8. Pre-attentive Multi-layers of Information Shown: Along with annotation layers, the graphic has at least three layers, including heat, week and major releases.

9. Fine-Grained Measurability: The heat map provides a specific color band. The size-graduated symbols (e.g., the red dots) are a direct representation of the metric.

10. Illustrates Causality: The link between releases and sales is apparent. The strong seasonality (e.g., April) is also apparent.

CONCLUSION

In his classic work, F. J.
Anscombe states that “graphs are essential to good statistical analysis.” He goes on to state that “graphs can have various purposes, such as: (i) to help us perceive and appreciate some broad features of the data, (ii) to let us look behind those broad features and see what else is there.”³

As with the Minard visualization (Figure 2), colors, positions, sizes, proportions, and texts are all used to provide and communicate related information. The same multi-layered methods are at work in the vizExplorer visualization (Figure 3) as in Minard’s: the brain discerns patterns, thereby mining the data without reliance on black-box or other probabilistic statistical models. The visualization illustrates the sales information and allows a human to make actionable decisions based on the data; this is visual data mining.

Like the paint-by-numbers master artists' outlines, vizExplorer was designed and developed by masters. Its goal is to give someone without particular talent in visual design or statistics the ability to create complex and informative views of their data, views capable of providing rapid understanding of the relationships in the data and, with it, the ability to act on the new knowledge. It accomplishes this goal by providing, in the form of software, expert guidance on how best to summarize and display available raw data. The software turns the data into real information: guidance that everyone can use to create far-reaching output, and an accessible resource for learning.

3. Anscombe, F. J.; “Graphs in Statistical Analysis”; The American Statistician, Vol. 27, No. 1 (Feb. 1973), pp. 17-21. Emphasis added.

This document contains proprietary information and may not be used, reproduced or disclosed without explicit written permission from BIS2. This document may also describe current and proposed product features intended to be released depending on development and market demands.
Accordingly, the end product may not contain all product features shown in this document. vizExplorer, vizMaker and other BIS2 company, product or service names are trademarks of BIS2. Other brand and product names are trademarks of their respective companies. BIS2 comprises WME BI, LLC and its subsidiaries. Private & Confidential. Copyright © 2008 - 2013 NEW BIS SAFE LUXCO S.Á.R.L All Rights Reserved.

For more information, please contact us:
Web: www.bis2.net
Email: sales@bis2.net
Phone:
Americas: +1 800 323 5690 (U.S. Toll Free)
Americas: +1 858 381 5092 (International)
Asia Pacific: +64 (0) 4 889 0228

NOV 2013 - V3.3