Scientific Information Skills Bad Graphs

Bad graphs

a graph should pictorially display data in a sensible and meaningful way

many graphs in the public domain fail this

two reasons why this happens:
 poor design
 intentionally deceiving
Poor design

the aspects mentioned last week

plotting the wrong data

carelessness

over-complication

duplication of information
Example 2.1
What’s the graph about?
 which companies are the “biggest”?
 by showing revenue & profit
Problems
 a lot of the image is taken up with text
 the profits aren’t drawn as a bar chart, but the image suggests that they are
Fixes
 change of size/shape
 double bar (or column) to add profit (or leave profit out entirely)
 scatter graph
Example 2.1 Fixes
Example 2.1 Fix
Exercise 2.6(a)




changes in poverty rates
over time for different
age groups
you can’t see where the
lines go after crossing
year scale not consistent
different styles – dashed,
dotted
Exercise 2.6(b)






something to do with shares?
background too dark
too much little text
lines all over the place
etc
fire the person responsible
Exercise 2.6(c)





changes in military divisions
hard to see much difference between the two series
% of what?
if you have to include % labels, the graph is unnecessary
change graph type and data measure (% to numbers)
Exercise 2.6(d)





how much of the basic food
groups in a serve
calories are grouped with mass
amount not proportional to
sector size
3 g of saturated fat is part of the 4.5 g
a graph isn’t necessary
Exercise 2.6(e)



fuel economy for different cars
why create the graphs and
then put the exact figures in as
well
ditch the numbers or the graph
Exercise 2.6(f)




growth of economies
OK for showing difference
between rich & poor
cannot see detail for
middle- & poor
need a different scale
Exercise 2.6(g)

see page 23
Exercise 2.6(h)





supposed to be shape of UK
showing marriage status
rates
dark normally implies more
(not here)
it implies everyone in a
square is in that status
this amount of info can’t be
presented in one graph
Lying with graphs

by “distorting” the graph, you distort the information being shown

this can occur by:
 hiding a graph element
 distorting the graph elements
 overemphasising one element
 making unfair comparisons

most people just look at the picture
Hiding an element




taking away a key piece of information
the viewer needs it to properly understand the data
without it, they are left to look at the picture alone
missing elements:
 title – not as serious; you should be able to tell enough from the axes labels or
surrounding text
 axis scale – a serious problem; ignore any graph like this
Example 2.2
[Graph: CO2 concentration (ppm), vertical axis 280 to 400, horizontal axis 1950 to 2010]
Example 2.2
a)
Does it matter that the graph has no title?

Yes
the necessary info is in the text above
the vertical axis has a label referring to CO2
the horizontal axis is clearly years
but where does the data come from?




Now the same graph without the vertical scale
this is really meaningless
some people will see a steep rise
others not much
[Graph: CO2 concentration (ppm), no vertical scale, horizontal axis 1950 to 2010]
Lesson 1

if there is no title and nothing else tells you what it is about, ignore the graph

if there is no axis scale, ignore the graph
Distorting elements

inappropriate axis scales

fancy 2D graphics

the picture doesn’t reflect the data
Example 2.3(a)
the same data graphed by a climate-change denier
[Graph: CO2 concentration (ppm), vertical axis 0 to 600, horizontal axis 1950 to 2010]
Are you worried about rising CO2 levels after seeing this?
I don’t think so.
Example 2.3(b)
the same data graphed by the greenest environmentalist
[Graph: CO2 concentration (ppm), vertical axis 310 to 390, horizontal axis 1950 to 2010]
Scared now?
Very likely.
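The two versions in Example 2.3 differ only in the range of the vertical axis. As a rough sketch of why that matters, the Python below works out how much of the plot height the same rise fills under each range. The axis ranges (0 to 600 and 310 to 390) come from the graphs; the function name and the start and end values of 315 and 385 ppm are assumptions chosen for illustration, not read from the data.

```python
def rise_fraction(start, end, axis_min, axis_max):
    """Fraction of the plot height covered by a change from start to end."""
    return (end - start) / (axis_max - axis_min)

# Hypothetical CO2 values in ppm (assumed, not taken from the slides)
start_ppm, end_ppm = 315, 385

wide_axis = rise_fraction(start_ppm, end_ppm, 0, 600)      # axis 0 to 600
narrow_axis = rise_fraction(start_ppm, end_ppm, 310, 390)  # axis 310 to 390

print(f"axis 0-600:   rise fills {wide_axis:.1%} of the height")
print(f"axis 310-390: rise fills {narrow_axis:.1%} of the height")
```

The data are identical in both cases; only the share of the picture given to the change is different.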
Column type graphs



golden rule: always begin the value axis at 0
the size of the column is proportional to the value
be careful since Excel will default to the wrong scale!
[Graph: column chart of two categories (1 and 2) with the value axis running from 40 to 44]
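A small sketch of what goes wrong when the value axis does not begin at 0: the drawn heights of the columns no longer stand in the same ratio as the values. The function name and the two column values (41 and 43) are assumptions for illustration; only the 40 to 44 axis range is taken from the graph.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of column heights as drawn when the value axis starts at baseline."""
    return (b - baseline) / (a - baseline)

# Hypothetical values for the two columns (assumed for illustration)
col_1, col_2 = 41.0, 43.0

true_ratio = drawn_ratio(col_1, col_2)                   # axis starts at 0
truncated_ratio = drawn_ratio(col_1, col_2, baseline=40)  # axis starts at 40

print(f"axis from 0:  column 2 is drawn {true_ratio:.2f} times as tall as column 1")
print(f"axis from 40: column 2 is drawn {truncated_ratio:.2f} times as tall as column 1")
```

With these assumed values, a difference of about 5% is drawn as a threefold difference once the axis starts at 40.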
Exercise 2.7


a graph of energy consumption in the USA
a small but steady increase or is it?
the author of this one would like us to think energy use has skyrocketed
[Two graphs of the same data, in trillion Btu, 1980 to 2005: one with the vertical axis from 0 to 100, the other from 75 to 100]
Non-linear scales

an absolute no-no

that doesn’t mean people don’t do it deliberately
Example 2.5
[Graph: CO2 concentration (ppm), vertical axis 280 to 400, horizontal axis labelled 1958, 1968, 1978, 1994]
it appears that CO2 has begun to rise even faster
but really the time scale is compressed at the right-hand end
How is this done in Excel?
By using a line graph and deleting 2 of every 3 points from 1984 on
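The trick in Example 2.5 can be sketched in a few lines of Python. A line graph on a category axis gives every remaining point the same horizontal step, so dropping points makes each step cover more years. The function name and the assumption of one data point per year from 1958 to 2010 are illustrative; only the rule "delete 2 of every 3 points from 1984 on" comes from the example.

```python
def thin_from(years, cutoff, keep_every=3):
    """Keep every year before cutoff, then only every keep_every-th one."""
    early = [y for y in years if y < cutoff]
    late = [y for y in years if y >= cutoff]
    return early + late[::keep_every]

years = list(range(1958, 2011))  # assumed annual data, 1958 to 2010
kept = thin_from(years, cutoff=1984)

# On a category axis each kept point gets the same width,
# so the number of years covered by one step shows the compression.
years_per_step = [b - a for a, b in zip(kept, kept[1:])]

print(kept[24:30])            # points either side of the cutoff
print(years_per_step[24:30])  # years covered by each equal-width step
```

After the cutoff, each equal-width step covers three years instead of one, which is what makes the curve look steeper at the right-hand end.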
Images as columns


relevant image instead of column
what is wrong with this?
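[Graph: picture columns for 1960, 1975 and 1990, labelled 1.4 billion, 2.9 billion and 5.8 billion]
the width has increased, not just the height
this means that the area increases more than the height
the area of the 1990 image is 16 times that of the 1960 one, although the value is only about 4 times larger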
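The arithmetic behind the factor of 16 can be checked with a short Python sketch. It assumes the image is scaled in both width and height by the ratio of the values, and that the labels 1.4 billion and 5.8 billion belong to 1960 and 1990 respectively (an assumption about how the labels pair with the years). The function name is hypothetical.

```python
def area_ratio(value_small, value_big):
    """Area ratio when an image is scaled in width and height by the value ratio."""
    linear = value_big / value_small
    return linear ** 2

print(area_ratio(1.0, 2.0))  # one doubling of the value
print(area_ratio(1.0, 4.0))  # two doublings of the value
print(area_ratio(1.4, 5.8))  # the labelled values, assumed to be 1960 and 1990
```

Each doubling of the value quadruples the area, so two doublings give an image 16 times the size for a value that is only about 4 times larger.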
So …



use repeated small images stacked on top of each other
it may not look as good but it doesn’t mislead
stretching in 1 dimension isn’t good
[Graph: stacked small images for 1960, 1975 and 1990]
Overemphasising one component

where you are comparing two (or more) categories or graphs

make the one you are trying to “sell” stand out

3D pie charts are notorious for this (impossible to avoid it)

also possible with line or column graphs
Fancy pie charts
[Two 3D pie charts of the same energy sources (Nuclear, Coal, Gas, Oil, Renewables), each rotated so that a different sector is at the front]
the “front” section looks more important than it really should because you
see the front of the pie as part of it
here gas looks more important than coal
here it is the other way round
actually they are both equal magnitude
Exercise 2.8

what do you think is going on here?
perhaps trying to conceal the problem with lead?
[Graph: Zn and Pb in effluent (mg/L), vertical axis 0 to 9, horizontal axis 2001 to 2007]
Unfair comparisons





two separate graphs compared side by side
differences in numerical value will be masked
mainly a problem with column and pie charts
if two items being compared are in the same graph, then they will be scaled
according to their relative amounts
in separate graphs, then unless the scaling is exactly the same, differences will
disappear
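A minimal sketch of how separate scales mask a difference, in Python. Each column's height is expressed as a fraction of its own graph's value axis. All the numbers here (values of 22 and 6, axis maxima of 25 and 7) and the function name are assumptions for illustration.

```python
def axis_fraction(value, axis_max):
    """Column height as a fraction of the value axis (axis starts at 0)."""
    return value / axis_max

# Hypothetical values (assumed for illustration)
big, small = 22.0, 6.0

# Two separate graphs, each scaled to fit its own data
separate = (axis_fraction(big, 25.0), axis_fraction(small, 7.0))

# One graph, so both columns share the same axis
shared = (axis_fraction(big, 25.0), axis_fraction(small, 25.0))

print("separate graphs:", [f"{f:.2f}" for f in separate])
print("shared axis:    ", [f"{f:.2f}" for f in shared])
```

In the separate graphs both columns fill most of their axis and look about the same; on a shared axis the smaller one is visibly a fraction of the larger.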
Exercise 2.9


What is this trying to tell you about energy output?
coal use is increasing but look how well renewables are doing – almost as much as
coal

How should they have been graphed?
double-column
[Graphs: Coal and Renewables, in trillions Btu, 1960 to 2000; vertical axis ticks from 0 to 25]
Exercise 2.10

For each, consider the following:
 what the PICTURE ONLY indicates

what type of deception it is

how the graph should have been done to be accurate
Exercise 2.10(a)




isn’t Apple doing well?
notice how Apple
has managed to be
at the front
who is in the very
large Other
category?
2D pie chart
Exercise 2.10(b)



it looks like more companies have concerns about Linux
unfair comparison – look at the scales
double-column graph
Exercise 2.10(c)




all these graphs are very
popular
no scale, but % values
given
the vertical axis can’t start at 0
change vertical axis
scale
Exercise 2.10(d)



there are so many more
murders than assaults
another unfair comparison
like (b)
change the scale on one of
them
Exercise 2.10(e)




aren’t pitchers bad?
the scale isn’t right
unfair comparison – so many more pitchers
fix the scale up and adjust for the relative number of each player type
Exercise 2.10(f)




the dollar doesn’t
buy much any more
– look how small it is
in 1978
width has changed as
well
is the hand included?
lots of small notes
Exercise 2.10(g)



the drop in overseas aid is so great when
the income has been rising
the grey arrow is totally out of scale
probably can’t be expressed with a graph