Small multiples, or the science and art of combining graphs

Nicholas J. Cox
Department of Geography
Durham University, UK

Small multiples

Good graphics often exploit one simple design that is repeated for different parts of the data. Edward Tufte called this the use of small multiples.

Well-designed small multiples are inevitably comparative, deftly multivariate, shrunken, high-density graphics….
Edward Rolf Tufte (1942–)

…in Stata

In Stata, small multiples are supported for different subsets of the data with by() or over() options of many graph commands. Users can emulate this in their own programs by writing wrapper programs that call twoway or graph bar and its siblings.

Otherwise, specific machinery offers repetition of a design for different variables, such as the graph matrix command.

Users can always put together their own composite graphs by saving individual graphs and then combining them using graph combine. This presentation offers further modest automation of the same design repeated for different data.

Original programs discussed are

stripplot
sparkline
crossplot
combineplot
designplot
subsetplot

with cameo roles for aaplot and sepscatter. All may be installed from SSC.

What’s in a name?

roseplot by any other name…

A minor theme here is that definite names are needed for programs, even if kinds of graphs do not have distinct agreed names. As in advertising, a good name attracts and keeps users. As in politics, a bad name can be fatal.

stripplot

Show me.
Unofficial nickname of Missouri

stripplot started as an alternative to graph oneway in 1999, but by a mix of accident and design has morphed into an alternative to the official command dotplot. I have shown results from stripplot in previous meetings, so I will just feature here some additions to the latest incarnation.

The aim is to compare univariate distributions with scope for linear or stacked dot plots, box plots and confidence intervals.
We can now do side-by-side quantile plots.

As with dotplot, you can now show reference lines for means or medians – and indeed any reference level for which there is a suitable egen function.

The examples here use Stata’s citytemp and auto datasets.

[Figure: stripplot of temperature (°F, axis labelled 14 to 86) by Census Region: NE, N Cntrl, South, West.]

[Figure: stripplot by Car type (Domestic, Foreign), axis 0 to 15,000; whiskers to 5 and 95% points.]

[Figure: stripplot by Car type (Domestic, Foreign), axis 0 to 15,000.]

[Figure: stripplot by Repair Record 1978 (1 to 5), axis 0 to 15,000.]

sparkline

The purpose of visualization is insight, not pictures.
Ben Shneiderman (1947–)

Sparklines

The name “sparkline” was suggested by Edward Tufte for intense text-like graphics. Sparklines are typically simple in design, sparing of space and rich in data, but they include several quite different kinds of graph otherwise. The most common kind shows several time series stacked vertically. sparkline is a Stata implementation.

Sparklines have long been standard in several fields, including physics and chemistry (spectroscopy), seismology, climatology, ecology, archaeology and physiology (notably encephalography and cardiography). Tufte provided a memorable and evocative new name and an excellent provocative discussion.

The Grunfeld data (webuse grunfeld) are a classic dataset in panel-based economics. Ten companies were monitored for 1935–54. They give us a simple sandbox.

What are we doing here?
The problem of time series graphics

Comparisons of time series are a rich and challenging area of statistical graphics. The widespread term spaghetti plot hints immediately at the difficulties. As always, we want to combine a grasp of general patterns with access to individual details.

With this in mind, we look at some sparklines of the Grunfeld dataset.
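As a rough sketch of the idea behind such stacked displays, the three Grunfeld series for one company could be drawn as separate small panels and stacked with graph combine. This uses official commands only. It is not how sparkline itself is implemented, and the graph names sp_* are arbitrary choices made for illustration.

```stata
* Sketch only: official commands, not the sparkline source code.
webuse grunfeld, clear
local graphs
foreach v of varlist kstock mvalue invest {
    * one small panel per series for company 1, with no x-axis title
    twoway line `v' year if company == 1, ///
        ytitle("`v'") xtitle("") name(sp_`v', replace) nodraw
    local graphs `graphs' sp_`v'
}
* stack the panels in one column on a common time axis
graph combine `graphs', cols(1) xcommon
```

Each panel keeps its own vertical scale, so the display compares shapes, such as the timing of peaks, rather than magnitudes.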
[Figure: sparklines of kstock, mvalue and invest against year, 1935–1955, for company 1, with extreme values labelled: kstock 2.8 and 2226.3; mvalue 2792.2 and 6241.7; invest 257.7 and 1486.7.]

[Figure: sparklines of kstock, mvalue and invest against year, 1935–1955, for each of companies 1 to 10. Graphs by company.]

[Figure: the same sparklines for companies 1 to 10, with extreme values labelled for each series.]

Vertical and horizontal

By default sparkline stacks small graphs vertically. If several graphs are combined, it is typical to cut down on axis labels and rely on differences in shape to convey information.

Horizontal stacking is also supported, which can be useful for archaeological or environmental problems focused on variations with depth or height. Here is an archaeological dataset as example.

[Figure: horizontally stacked sparklines of cores, blanks and tools against levels 12 to 25, with extreme values labelled.]

Nightingale’s data

Florence Nightingale (1820–1910) is well remembered for her nursing in the Crimean war and (within statistical science) for use of quantitative arguments.

Her most celebrated dataset is often reproduced using her polar diagram, but is easier to think about as time series.

Zymotic (loosely, infectious) disease mortality dominates other kinds, so much so that a square root scale helps comparison. (A logarithmic scale over-transforms here.)

[Image: Nightingale’s polar diagram.]

Watch out: the small print does explain that we are given superimposed sectors.
Each sector must be assessed as a whole, from the centre outwards. The distinct colouring of each annular sector shows only the outermost part of each sector.

Source of image: http://understandinguncertainty.org/coxcombs

[Figure: Nightingale's data on mortality in the Crimea. Time series for zymotic disease, wounds and injuries, and all other causes, 1854–1856; annualised rates per 1000.]

[Figure: Nightingale's data on mortality in the Crimea. The same three series on a square root scale (axis labelled 0, 25, 100, 225, 400, 625, 900); annualised rates per 1000.]

Would sparkline help?

A sparkline display is useful to show relative shape, such as times of peaks. We see that seasonality is only part of what is being seen. The harsh winter of 1854–5 coincided with some of the hardest battles of the war, but 1855–6 was quite different.

But, as often happens, no one graph dominates others here.

[Figure: Nightingale's data on mortality in the Crimea as sparklines, 1854–1856, with extreme values labelled: all other causes 2.5 and 140.1; wounds and injuries .4 and 115.8; zymotic disease 1.4 and 1022.8; annualised rates per 1000.]

crossplot

The scatter plot is the workhorse of statistical graphics.
John McKinley Chambers (1941– )

crossplot is designed as a quick-and-easy way to combine scatter plots. The basic syntax is

crossplot (yvarlist) (xvarlist)

and the idea is to plot every y in yvarlist against every x in xvarlist.

The use of two varlists gives greater flexibility than does graph matrix, which produces every possible scatter plot for a single varlist.

Scatter plot matrices

Scatter plot matrices are great, but they can be excessive. Their main feature is also a limitation. p variables mean p² plots all at once, so 10 means 100, and so forth.

(The half option just controls which plots you see.
)

crossplot design

crossplot was developed in teaching, especially of regression, with the aim of encouraging focused comparisons.

Originally (1999) crossplot was called cpyxplot, cp meaning Cartesian product, but the name was ugly, cryptic and easily forgotten.

The syntax had to be as simple as possible.

crossplot examples

Versions of a response variable versus a key predictor.
A response variable versus versions of a key predictor.
Each output versus each input.
Principal components versus original variables.

First, let us look at four versions of mpg versus weight in the auto dataset.

[Figure: four scatter plots of mpg, rt_mpg, ln_mpg and rec_mpg, each against Weight (lbs.).]

Next we look at an audiometric dataset used as a multivariate example in the Stata manuals. There are 8 response variables, 4 for left ears and 4 for right ears.

Here we just focus on the 16 plots pairing left and right. Another graph could be the 4 plots comparing left and right ears at the same frequency, the diagonal here.
[Figure: 16 scatter plots pairing left ear at frequencies 500, 1000, 2000 and 4000 with right ear at frequencies 500, 1000, 2000 and 4000.]

crossplot syntax for examples

crossplot (mpg rt_mpg ln_mpg rec_mpg) weight, combine(imargin(small))

crossplot (lft*) (rght*), jitter(1)

crossplot syntax extras

By default, crossplot is just calling twoway scatter followed by graph combine. It follows that recast() is available to recast to twoway line or twoway connected.

crossplot has an extra sequence() option to label graphs to ease preparation of graphics for papers, e.g. sequence(a b c d)

combineplot

The greatest value of a picture is when it forces us to notice what we never expected to see.
John Wilder Tukey (1915–2000)

combineplot is a generalisation of crossplot, more flexible and inevitably more complicated in syntax.

The general problem of combining plots of similar kind reduces to a loop producing individual plots and a call to graph combine. That is bound to be a challenge to beginning users.
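For concreteness, the do-it-yourself route of a loop followed by graph combine might look like the following minimal sketch with official commands. The variables chosen and the graph names g1, g2, … are illustrative assumptions, and this is not the source code of crossplot or combineplot.

```stata
* Sketch only: every y against every x, then one combined graph.
sysuse auto, clear
local graphs
local i = 0
foreach y of varlist mpg price {
    foreach x of varlist weight length {
        local ++i
        * draw nothing yet; just store each scatter plot in memory
        scatter `y' `x', name(g`i', replace) nodraw
        local graphs `graphs' g`i'
    }
}
* put the stored graphs together
graph combine `graphs', imargin(small)
```

The bookkeeping (naming each graph, suppressing its display, collecting the names) is predictable, which is why it lends itself to being wrapped in a single command.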
The idea is to avoid that by encapsulating the predictable syntax within one command.

combineplot examples

We will look at a series of univariate examples followed by a series of bivariate examples.

A great variety is possible, as we can loop over user-written graphics commands as well as official commands.

[Figure: combined panels including Headroom (in.) and Weight (lbs.), each by categories 1 to 5.]

[Figure: combined panels including Price and Headroom (in.), each by Repair Record 1978.]

[Figure: combined panels plotting Mileage (mpg), Price, Length (in.) and a further variable against Inverse Normal.]

[Figure: combined panels labelled a, b, c and d, including Price and Mileage (mpg), with categories 1 to 5 and Domestic, Foreign.]

[Figure: scatter plots of Price (USD) against Mileage (mpg), Weight (lbs.), Length (in.) and Displacement (cu. in.), with Domestic and Foreign cars distinguished and a legend on each panel.]

A digression on sepscatter

The last example used sepscatter, a program automating separation of data points on a scatter plot by a categorical variable.

The repetition of the legend needs some kind of fix. In this and similar examples, the legend could be deleted and explaining symbols left as a task for the text caption.

sepscatter and scatter plot matrices

combineplot with sepscatter meets a felt need, scatter plot matrices with categorisation of data points.

Here is an example with “size” variables from the auto dataset.
The diagonal scatter plots have meaning, yet are not conventional. But not every graph need be immediately publishable.

[Figure: 3 × 3 array of scatter plots of Weight (lbs.), Length (in.) and Displacement (cu. in.) against each other, including each variable against itself on the diagonal.]

[Figure: four scatter plots of Price (USD) against a predictor, each annotated with a fitted regression, n = 74:
price = 11253 - 238.89 mpg, R² = 22.0%, RMSE = 2,623.7
price = -6.7074 + 2.0441 weight, R² = 29.0%, RMSE = 2,502.3
price = -4584.9 + 57.202 length, R² = 18.6%, RMSE = 2,678.7
price = 3029 + 15.896 displace~t, R² = 24.5%, RMSE = 2,580.6]

A digression on aaplot

The last example used aaplot.

aaplot customises automatic annotation of scatter plots with fitted regressions with text for key results.

Originally, it was written following a request by my Ph.D. student Alona Armstrong.

Back to combineplot

Some examples of its syntax will make clearer how it works. First look at a univariate example:

combineplot mpg price weight headroom: graph box @y, over(rep78)

Here we have one varlist and the syntax @y is a placeholder for the variable name.
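To see what the placeholder stands for, here is a sketch of roughly equivalent official syntax, in which the loop variable plays the role of @y. It is an illustration under that reading of the syntax, not combineplot's own code, and the graph names box_* are arbitrary.

```stata
* Sketch only: the loop variable `y' stands in for the @y placeholder.
sysuse auto, clear
local graphs
foreach y of varlist mpg price weight headroom {
    graph box `y', over(rep78) name(box_`y', replace) nodraw
    local graphs `graphs' box_`y'
}
graph combine `graphs'
```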
Next look at a bivariate example:

combineplot price (mpg weight length displacement): sepscatter @y @x, ytitle("Price (USD)") sep(foreign)

Here we have two varlists and the syntax elements @y and @x are placeholders for the variable names.

The two varlists may each contain a single variable and they may be identical. When both are presented, the combination is the Cartesian product of the varlists.

Naturally, you can reach through to control the options of graph combine as well as those of the particular graph command used.

Quirk or quick?

The quirky syntax of combineplot might cause some queasiness. Some might recall the obsolete for command.

Confident users would (should) be happy to write their own loops, topped by graph combine, and that is fine too.

The justification for combineplot is just convenience: it can be quicker than writing your own script.

designplot

Real life is both complicated and short, and we make no mockery of honest adhockery.
Irving John Good (1916–2009)

Here more than anywhere arbitrariness of names can bite. If you have used S or S-Plus or R much, you may have come across “design plots”. But as implemented there they do not look much like the graphs you are going to see.

Nor are they plots showing fitted results; nor do they imply experimental design.

To understand designplot, we need to creep up on it step by step.

[Figure: designplot of mean Mileage (mpg) for (all) and for categories 1 to 5.]

[Figure: designplot of mean Mileage (mpg) for (all); categories 1 to 5; Domestic and Foreign; and the combinations 1 Domestic, 2 Domestic, 3 Domestic, 3 Foreign, 4 Domestic, 4 Foreign, 5 Domestic, 5 Foreign.]

designplot syntax

Minimal syntax specifies a response first, then one or more predictors. The predictors should in practice be categorical, meaning taking on only a small or moderate number of distinct levels (“factors”, if you like).

The examples were

designplot mpg rep78

designplot mpg rep78 foreign

designplot default

The statistics shown are means.
Given one, two, … predictors, the means are shown for all the data, each one-way breakdown, each two-way breakdown, …. designplot uses a syntax of way being 0, 1, 2, …

graph dot is the default vehicle. statsby underpins calculations.

In essence, we can get a multiscale breakdown. In practice, we might want to restrict what is shown.

[Figure: designplot of mean Mileage (mpg) for (all); Domestic and Foreign; and categories 1 to 5.]

Restricting designplot

Here we restricted the scope by

designplot mpg foreign rep78, maxway(1)

Let us look at a different dataset. The response variable for these data on the Titanic is a binary variable survived, so its mean is the fraction survived.

We restrict using maxway(2).

[Figure: designplot of mean of survived for (all); crew, first, second, third; child, adult; female, male; and every two-way combination of these.]

So we have here:

the overall mean
one-way breakdowns for three predictors class, adult, male
two-way breakdowns for combinations class×adult, class×male, adult×male

This kind of graph is for detailed scrutiny, rather than delivering shock.

Logically similar displays are often used for reporting opinion poll or electoral results.

That reminds us of…

The structure echoes analysis of variance, used descriptively. Similar ideas appear in ANOVA and other literature going back to J.W. Tukey in 1977.

It also echoes the little used official command grmeanby. By default, grmeanby also shows means. (Medians are allowed.) It allows one-way breakdowns only.

[Figure: grmeanby display, Means of survived, for class, adult and male.]

[Figure: grmeanby display, Means of mpg, Mileage (mpg), for rep78 and foreign.]

grmeanby

In these examples, grmeanby shows different means distinctly, but that is not guaranteed.
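For comparison, calls of the following form should give grmeanby displays of the kind just described for the auto data. The exact options used for the slides are not recorded here, so this is an assumed reconstruction using the official syntax.

```stata
* Sketch only: assumed form of the grmeanby calls for the auto data.
sysuse auto, clear
* means of mpg for each level of rep78 and of foreign (one-way breakdowns)
grmeanby rep78 foreign, summarize(mpg)
* the same display with medians instead of means
grmeanby rep78 foreign, summarize(mpg) median
```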
Using graph dot as a default within designplot ensures more readability, although that too has its limits.

designplot can show other statistics

You can show any summarize result.

In practice, you would only want to plot results sharing the same units of measurement (including none at all, as with skewness and kurtosis).

[Figure: designplot of Mileage (mpg) showing min, p25, median, mean, p75 and max for (all); Domestic and Foreign; and categories 1 to 5.]

More to say…

Although based on graph dot by default, designplot can be recast to graph bar or graph hbar.

Reference lines in the style of grmeanby can also easily be added.

Although based on summarizing single variables, what could be simpler than putting different designplots side-by-side?

[Figure: designplots side by side for means of Mileage (mpg), Price and Weight (lbs.), plus counts, each for (all); Domestic and Foreign; and categories 1 to 5 and missing. Counts shown: 74; 52, 22; 2, 8, 30, 18, 11, 5.]

Is this just a reinvention of graph dot?

No. graph dot and its siblings are restricted in offering only one-way or two-way or three-way breakdowns given, respectively, one or two or three “factors”.

designplot gives scope for saving results for separate graphing or tabulation.

subsetplot

To clarify, add detail.
Edward Rolf Tufte (1942– )

Graphing subsets

subsetplot automates an approach discussed in Stata Journal 10: 670–681 (2010).

The idea is to plot each subset separately, but with the rest of the data as a backdrop.

We thus combine juxtaposing and superimposing, in the hope of getting the best of both approaches. The cost is some redundancy.

Superimpose or juxtapose?

The principle of superimposing subsets is easy to understand. The question is whether it really works in practice.

With even say 5 subsets, mentally extracting each subset and comparing with the others can be hard work.
Consider a conventional grouped scatter plot and a subsetplot alternative in our final fling with the auto data.

[Figure: scatter plot against Weight (lbs.) with the values 1 to 5 of rep78 shown as marker labels.]

The previous graph can be got with

sepscatter mpg weight, sep(rep78) mylabel(rep78)

The next graph can be got with

subsetplot scatter mpg weight, by(rep78)

[Figure: five panels, one for each of categories 1 to 5, of Mileage (mpg) against Weight (lbs.).]

With an ordered (Likert) scale such as repair record rep78, self-describing marker labels can be natural and effective.

[Figure: five panels, one for each of categories 1 to 5, of Mileage (mpg) against Weight (lbs.), with the category value shown as marker label.]

The Grunfeld data again

This approach is especially suitable as another way to tackle the spaghetti problem of plotting multiple time series. Here are the invest data for different companies.

If the plot seems excessively simple, then there are some bells and whistles for adding key extras.
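The backdrop idea can be sketched with official commands alone: each panel draws every company's series in a light colour and then the series for one company on top. This is only an illustration of the principle, not subsetplot's own code; the colours, the logarithmic scale and the graph names co1, co2, … are choices made here.

```stata
* Sketch only: each company in turn against a backdrop of all companies.
webuse grunfeld, clear
sort company year
local graphs
forvalues c = 1/10 {
    * backdrop first; connect(ascending) avoids joining one company to the next
    twoway (line invest year, connect(ascending) lcolor(gs12))        ///
           (line invest year if company == `c', lcolor(black)),       ///
        yscale(log) legend(off) title("`c'") name(co`c', replace) nodraw
    local graphs `graphs' co`c'
}
graph combine `graphs'
```

Every panel repeats the full dataset, which is the redundancy that buys a common frame of reference for each subset.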
[Figure: invest against year, 1935–1955, on a logarithmic scale (1 to 1000), one panel for each of companies 1 to 10.]

[Figure: the same panels for companies 1 to 10 with numeric labels added to each series.]

How far can we go?

The Grunfeld data are perhaps at the trivial end of this problem.

For a stiffer challenge, here are some data for the 28 countries of the European Union on long-term unemployment.

As often, a graph can be valuable in suggesting what else to plot….
[Figure: Long-term unemployment (%), axis 0 to 15, against year, 1992–2012, one panel per country of the European Union.]

The main players again were

stripplot
sparkline
crossplot
combineplot
designplot
subsetplot

Our attraction to images as a source of understanding is both primal and pervasive.
Stephen Jay Gould (1941–2002)