Scientific Information Skills
Bad Graphs

Bad graphs
- a graph should display data pictorially in a sensible and meaningful way
- many graphs in the public domain fail to do this
- there are two reasons why this happens:
  - poor design
  - intentional deception

Poor design
- the aspects mentioned last week
- plotting the wrong data
- carelessness
- over-complication
- duplication of information

Example 2.1
- What’s the graph about?
  - which companies are the “biggest”?
  - by showing revenue & profit
- Problems
  - a lot of the image is taken up with text
  - the profits aren’t a bar chart, but the image suggests that they are
- Fixes
  - change of size/shape
  - double bar (or column) chart to add profit (or leave profit out entirely)
  - scatter graph
[Figures: revised versions of the Example 2.1 graph]

Exercise 2.6(a)
- changes in poverty rates over time for different age groups
- you can’t see where the lines go after crossing
- year scale not consistent
- different line styles: dashed, dotted

Exercise 2.6(b)
- something to do with shares?
- background too dark
- too much small text
- lines all over the place
- etc.
- fire the person responsible

Exercise 2.6(c)
- changes in military divisions
- hard to see much difference between the two series
- % of what?
- if you have to include % labels, the graph is unnecessary
- change graph type and data measure (% to numbers)

Exercise 2.6(d)
- how much of each basic food group is in a serve
- calories are grouped with mass
- amount not proportional to sector size
- the 3 g of saturated fat is part of the 4.5 g
- a graph isn’t necessary

Exercise 2.6(e)
- fuel economy for different cars
- why create the graphs and then put the exact figures in as well?
- ditch the numbers or the graph

Exercise 2.6(f)
- growth of economies
- OK for showing the difference between rich & poor
- cannot see detail for middle & poor
- need a different scale

Exercise 2.6(g)
- see page 23

Exercise 2.6(h)
- supposed to be the shape of the UK, showing marriage status rates
- dark normally implies more (not here)
- it implies everyone in a square is in that status
- this amount of info can’t be presented in one graph

Lying with graphs
- by “distorting” the graph, you distort the information being shown
- this can occur by:
  - hiding a graph element
  - distorting the graph elements
  - overemphasising one element
  - making unfair comparisons
- most people just look at the picture

Hiding an element
- taking away a key piece of information
- the viewer needs it to properly understand the data
- without it, they are left to look at the picture
- missing elements:
  - title: not as serious; you should be able to tell enough from the axis labels or surrounding text
  - axis scale: a serious problem; ignore any graph like this

Example 2.2
[Figure: ppm CO2 (vertical axis, 280 to 400) against year (horizontal axis, 1950 to 2010)]
a) Does it matter that the graph has no title?
- Yes
- the necessary info is in the text above
- the vertical axis has a label referring to CO2
- the horizontal axis is clearly years
- but where does the data come from?

Now the same graph without the vertical scale
- this is really meaningless
- some people will see a steep rise, others not much
[Figure: ppm CO2 against year (1950 to 2010), vertical axis with no scale]

Lesson 1
- if there is no title and nothing else tells you what it is about, ignore the graph
- if there is no axis scale, ignore the graph

Distorting elements
- inappropriate axis scales
- fancy 2D graphics
- the picture doesn’t reflect the data

Example 2.3(a)
- the same data graphed by a climate-change denier
- Are you worried about rising CO2 levels after seeing this? I don’t think so.
[Figure: ppm CO2 (vertical axis, 0 to 600) against year (1950 to 2010)]

Example 2.3(b)
- the same data graphed by the greenest environmentalist
- Scared now? Very likely.
[Figure: ppm CO2 (vertical axis, 310 to 390) against year (1950 to 2010)]

Column type graphs
- golden rule: always begin the value axis at 0
- the size of the column is then proportional to the value
- be careful, since Excel will default to the wrong scale!
[Figure: two columns on a value axis running from 40 to 44]

Exercise 2.7
- a graph of energy consumption in the USA
- a small but steady increase
- or is it? the author of this one would like us to think energy use has skyrocketed
[Figures: trillion Btu against year (1980 to 2005), drawn once with the value axis starting at 0 and once with it starting at 75]

Non-linear scales
- an absolute no-no
- that doesn’t mean people don’t do it deliberately

Example 2.5
- it appears that CO2 has begun to rise even faster
- but really the time scale is compressed at the RH end
- How is this done in Excel? By using a line graph and deleting 2 of every 3 points from 1984 on
[Figure: ppm CO2 (280 to 400) against year, with axis labels 1958, 1968, 1978, 1994]

Images as columns
- a relevant image instead of a column
- what is wrong with this?
[Figure: an image used as a column, labelled 1.4 billion (1960), 2.9 billion (1975) and 5.8 billion (1990)]
- the width has increased, not just the height
- this means that the area increases more than the height
- the 1990 image is 16 times the size of the 1960 one

So …
- use repeated small images stacked on top of each other
- it may not look as good, but it doesn’t mislead
- stretching in 1 dimension isn’t good
[Figure: stacked small images for 1960, 1975 and 1990]

Overemphasising one component
- where you are comparing two (or more) categories or graphs
- make the one you are trying to “sell” stand out
- 3D pie charts are notorious for this (impossible to avoid it)
- also possible with line or column graphs

Fancy pie charts
[Figures: two 3D pie charts with sectors Nuclear, Coal, Gas, Oil and Renewables, drawn in different rotations]
- the “front” section looks more important than it really should, because you see the front of the pie as part of it
- in one chart gas looks more important than coal
- in the other it is the other way round
- actually they are of equal magnitude

Exercise 2.8
- what do you think is going on here?
- perhaps trying to conceal the problem with lead?
[Figure: mg/L effluent (0 to 9) against year (2001 to 2007), with series Zn and Pb]

Unfair comparisons
- two separate graphs compared side by side
- differences in numerical value will be masked
- mainly a problem with column and pie charts
- if two items being compared are in the same graph, then they will be scaled according to their relative amounts
- in separate graphs, unless the scaling is exactly the same, differences will disappear

Exercise 2.9
- What is this trying to tell you about energy output?
  - coal use is increasing
  - but look how well renewables are doing: almost as much as coal
- How should they have been graphed?
  - double-column
[Figure: trillions Btu (0 to 25) against year (1960 to 2000), with series Coal and Renewables]

Exercise 2.10
For each, consider the following:
- what the PICTURE ONLY indicates
- what type of deception it is
- how the graph should have been done to be accurate

Exercise 2.10(a)
- isn’t Apple doing well?
- notice how Apple has managed to be at the front
- who is in the very large Other category?
- use a 2D pie chart

Exercise 2.10(b)
- it looks like more companies have concerns about Linux
- unfair comparison: look at the scales
- use a double-column graph

Exercise 2.10(c)
- all these graphs are very popular
- no scale, but % values given
- the vertical axis can’t be starting at 0
- change the vertical axis scale

Exercise 2.10(d)
- there are so many more murders than assaults
- another unfair comparison, like (b)
- change the scale on one of them

Exercise 2.10(e)
- aren’t pitchers bad?
- the scale isn’t right
- unfair comparison: so many more pitchers
- fix the scale and adjust for the relative number of each player type

Exercise 2.10(f)
- the dollar doesn’t buy much any more: look how small it is in 1978
- width has changed as well
- is the hand included?
- use lots of small notes

Exercise 2.10(g)
- the drop in overseas aid is so great when income has been rising
- the grey arrow is totally out of scale
- probably can’t be expressed with a graph
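
Checking the size distortions with arithmetic
Two of the distortions that recur in these exercises (a value axis that does not start at 0, and an image scaled in both width and height) can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original slides. The function names are hypothetical, and the column values 41 and 43 are made up to fit the 40 to 44 axis of the column example. The population figures are the ones from the images-as-columns slide.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn column heights for values a and b
    when the value axis begins at axis_start."""
    if min(a, b) <= axis_start:
        raise ValueError("both values must lie above the axis start")
    return (b - axis_start) / (a - axis_start)


def drawn_area_ratio(a, b):
    """Ratio of the drawn areas when an image is scaled in BOTH
    width and height in proportion to the value."""
    return (b / a) ** 2


# Columns: made-up values 41 and 43 (assumption, not from the slides).
true_ratio = drawn_height_ratio(41, 43)                     # axis starts at 0
truncated_ratio = drawn_height_ratio(41, 43, axis_start=40) # axis starts at 40

# Images as columns: 1.4 billion (1960) and 5.8 billion (1990) from the slides.
height_ratio = 5.8 / 1.4
area_ratio = drawn_area_ratio(1.4, 5.8)

print(f"columns from 0:  second is {true_ratio:.2f}x the first")
print(f"columns from 40: second is {truncated_ratio:.1f}x the first")
print(f"image height ratio {height_ratio:.1f}, area ratio {area_ratio:.1f}")
```

With the axis starting at 40, a difference of about 5% is drawn as a column three times as tall. For the image, the height ratio is a little over 4, so the drawn area ratio comes out at about 17, in line with the roughly 16-fold figure quoted on the slide.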