2. Graphing Sci. Info Skills

Graphing

Remember the cliché that a picture is worth a thousand words (or numbers)

But only if the picture is clear and accurate

Hand-drawn graphs are slow, inaccurate and unprofessional
Rules for preparing graphs

axes labelled clearly
informative title
axis scale is the same all the way along
the axis scales must be shown
where multiple series are plotted on the same graph, make it clear which data belongs to which line/column (legend)
don’t overdo the fancy formatting
the measured variable should be plotted on the vertical axis
Formatting issues - General

Do not accept the default settings without thinking “can I make this
look better?”
Formatting issues - General
2D vs 3D
- “3-dimensional” columns look better than plain rectangles
- but they make it more difficult to judge where the top of the column is
- gridlines can help
- avoid 3D pie charts completely
[Figure: two panels, "2D column graph" and "3D column graph", each plotting Monday to Friday on a 0 to 100 scale]
Formatting issues - General
Gridlines
- too many gridlines mask the graph
- Excel often seems to default to about 10, which is too many
- definitely needed with 3D column graphs
- less necessary with 2D columns
- not needed at all with scatter graphs
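As a rough illustration of choosing fewer gridlines than the default, the sketch below picks a "round" spacing that gives about five intervals on an axis. The function names `nice_step` and `gridlines` and the target of five are assumptions made for this example; they are not Excel settings.

```python
import math


def nice_step(lo, hi, target=5):
    """Return a round gridline spacing giving roughly `target` intervals."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    raw = (hi - lo) / target                      # ideal, but usually awkward, spacing
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiple in (1, 2, 5, 10):                # round up to 1, 2, 5 or 10 x 10^n
        if multiple * magnitude >= raw:
            return multiple * magnitude
    return 10 * magnitude                         # safety net for rounding error


def gridlines(lo, hi, target=5):
    """List the gridline positions between lo and hi (inclusive)."""
    step = nice_step(lo, hi, target)
    first = math.ceil(lo / step) * step
    count = math.floor((hi - first) / step + 1e-9)
    return [first + i * step for i in range(count + 1)]


print(gridlines(0, 100))             # about five intervals
print(gridlines(0, 100, target=10))  # the crowded case the slide warns about
```

With a 0 to 100 axis, a target of five gives a spacing of 20, while a target of ten gives a spacing of 10 and eleven gridlines.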
[Figure: three column graphs of Monday to Friday data, titled "3D without gridlines" (scale 0 to 100 in steps of 20), "3D graph with default gridlines" (steps of 10) and "2D without gridlines" (steps of 20)]
Formatting issues - General
Legends
- crucial where there is more than one data series
- totally useless when there is only one
- made useless if the labels are simply Series 1, Series 2 etc
[Figure: three examples. "Useless legend": Abs. against Conc., with a legend reading Absorbance. "Useless without a legend": Energy, Mon to Fri, with no legend. "Useless with one": Energy, Mon to Fri, with a legend reading Series1 and Series2]
Formatting issues - General
Background
- the default is mid grey
- hopeless when printed
- change to white or none
- some colour can work on PowerPoint slides with a non-white background
[Figure: three Energy graphs, Mon to Fri on a 0 to 100 scale, titled "Dark slide – no background", "Dark slide – white background" and "Dark slide – non-white bkgd"]
Formatting issues - General
Colours & patterns
- when changing colours, don’t end up with colours of similar brightness
- bright light colours are hard to see on PowerPoint slides
- avoid patterns
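One way to check whether two colours have similar brightness is to compare their luma, a weighted sum of the red, green and blue values. The sketch below uses the Rec. 601 weights. The function names and the `min_gap` threshold of 50 are assumptions chosen for illustration, not a standard.

```python
def luma(rgb):
    """Approximate brightness (0 to 255) of an (R, G, B) colour, Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def similar_brightness(colour_a, colour_b, min_gap=50):
    """True if the two colours are likely to look alike in monochrome.

    min_gap is an illustrative threshold, not an established rule.
    """
    return abs(luma(colour_a) - luma(colour_b)) < min_gap


red = (255, 0, 0)
mid_green = (0, 128, 0)
navy = (0, 0, 128)
yellow = (255, 255, 0)

print(similar_brightness(red, mid_green))  # True: different hue, similar brightness
print(similar_brightness(navy, yellow))    # False: clearly different brightness
```

Red and mid green are easy to tell apart in colour but come out as nearly the same grey, which is the situation the first point above warns against.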
[Figure: "Hard on the eyes!", a column graph with series Urban and Undeveloped across the categories Native, Introduced and Not identified, scale 0 to 70]
Excel chart types

look at the graph and ask yourself:

Does this graph tell me what I want it to?
Column, bar, cylinder, cone, pyramid
- for plotting frequencies for category data
- the only difference between them is appearance
- the choice is personal preference
Excel chart types
Line
- only useful for plotting frequencies for tallied numerical data where an XY scatter doesn’t work (eg group values on the horizontal axis)
- MUST NOT be used for category data
Pie & doughnut
- show relative proportions of different categories
- doughnut allows multiple different sets to be compared
Excel chart types
XY scatter
- numerical data where two variables have some relationship to each other
- eg concentration vs time, temperature vs CO2, absorbance vs concentration
Area
- line graph where the space underneath is filled with a solid colour
- can be used for multiple related sets
Excel chart types
Radar, bubble, stock
- limited or no use in this context
Line-Column
- allows a combination of the two types on one graph
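The chart-type guidance above can be summarised as a small decision helper. This is only a sketch of the rules as stated on these slides. The function name `suggest_chart` and the labels "category", "grouped" and "numeric" are made up for this example.

```python
def suggest_chart(horizontal, purpose="frequency"):
    """Suggest a chart type from the kind of data on the horizontal axis.

    horizontal: "category" (eg type of vehicle),
                "grouped"  (tallied numerical data, eg 50-100, 100-150),
                "numeric"  (a second measured variable, eg concentration)
    purpose:    "frequency" or "proportion" (only matters for category data)
    """
    if horizontal == "category":
        # A line graph must not be used here.
        return "pie" if purpose == "proportion" else "column"
    if horizontal == "grouped":
        return "column or line"
    if horizontal == "numeric":
        return "xy scatter"
    raise ValueError(f"unknown kind of data: {horizontal!r}")


print(suggest_chart("category"))
print(suggest_chart("category", purpose="proportion"))
print(suggest_chart("grouped"))
print(suggest_chart("numeric"))
```

The helper ignores the appearance-only variants (bar, cylinder, cone, pyramid), since these plot the same kind of data as a column graph.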
Column type
[Figure: two graphs titled "Population Density": a column graph of No. of countries (vertical, 0 to 35) against Popn/sq. km (horizontal, 0 to 400), and a bar graph with the axes swapped]
Class Exercise 2.1
Does either of the above graphs illustrate the data better?
No
Column types – multiple sets

multiple columns/bars per category
[Figure: column graph with separate Urban and Undeveloped columns for each of Native, Introduced and Not identified, scale 0 to 70]
Column types – multiple sets

stacked columns
[Figure: stacked column graph with one column each for Urban and Undeveloped, stacked by Native, Introduced and Not identified, scale 0 to 100; annotated "gridlines useful here"]
Exercise 2.2

Which way of showing multiple sets portrays the data better?

each gives a different way of comparing
multiple columns give an easy comparison of each category
stacked columns give an overall picture of how the sets differ
Alternative multi-column - OK
[Figure: column graph grouped by Urban and Undeveloped, with separate columns for Native, Introduced and Not identified, scale 0 to 70]
Alternative stacked – wrong!
Exercise 2.3
What’s wrong?
[Figure: stacked column graph with Urban and Undeveloped stacked on each other for each of Native, Introduced and Not identified, scale 0 to 120]
There is no reason to stack urban & undeveloped on each other; they don’t belong together
Formatting issues - Columns
your choice of which style
is 3D a plus or minus?
[Figure: several column graphs of Mon to Sun data drawn in different styles]
Line graphs


it is easy to mistake line and scatter charts for the same thing
line charts are basically for the same type of data as column-type graphs
- must not be used for category data
- only tallied numerical data
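"Tallied numerical data" means raw values that have been counted into groups before plotting. The sketch below tallies values into groups labelled in the style of the population density example (<50, 50-100 and so on). The function name `tally`, the group width and the sample values are hypothetical and are not the data behind the slides.

```python
def tally(values, width=50, top=400):
    """Count values into groups of `width`, with open-ended first and last groups.

    Assumes `top` is a multiple of `width`. A value on a boundary goes into
    the higher group, and values equal to `top` go into the last group.
    """
    edges = list(range(width, top + 1, width))        # 50, 100, ..., 400
    labels = [f"<{width}"]
    labels += [f"{lo}-{lo + width}" for lo in edges[:-1]]
    labels.append(f">{top}")
    counts = dict.fromkeys(labels, 0)
    for value in values:
        if value < 0:
            raise ValueError("values must not be negative")
        index = min(int(value // width), len(labels) - 1)
        counts[labels[index]] += 1
    return counts


# Hypothetical densities (people per sq. km), for illustration only.
densities = [12, 48, 75, 130, 130, 420, 999]
for group, count in tally(densities).items():
    print(group, count)
```

The group labels go on the horizontal axis and the counts on the vertical axis, which suits a column or line graph.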
Line graph - OK
[Figure: line graph "Population Density" with No. countries (0 to 35) on the vertical axis and Popn/sq. km in groups from <50 to >400 on the horizontal axis]
Exercise 2.4

How does the line graph compare to the column/bar graphs for this data?

no better or worse
the line implies continuity between the values on the horizontal axis, which is fine here because the groups are ranges of a numerical variable

Exercise 2.5


Why is a line graph wrong for plotting category data?
Because it implies a connection between the categories
[Figure: example with the categories Truck, Bus, Car and Motorbike on the horizontal axis, scale 0 to 350]
Scatter graphs

both axes need a number associated with them

two measurements should have been made on each item
- eg absorbance and concentration, pH and time

if there is only one measurement per item, a scatter graph is not possible
Scatter graph – line or dots

line or dots or both?
line – if there is an obvious connection between the points
- eg the measurements have been made at different times or at different distances, but on the same basic population
dots only – no direct connection
join-the-dots line – no need to show the dots
best-fit line – must show dots
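A best-fit line is calculated from the points rather than drawn through them, which is why the dots still need to be shown. A minimal sketch of an ordinary least-squares straight-line fit follows. The function name `best_fit` and the absorbance/concentration values are invented for illustration, and a spreadsheet trendline may be calculated differently.

```python
def best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration data: concentration (mg/L) and absorbance.
conc = [0, 5, 10, 15, 20]
absorbance = [0.02, 0.24, 0.51, 0.74, 1.01]

slope, intercept = best_fit(conc, absorbance)
print(f"Abs. = {slope:.4f} x Conc. + {intercept:.4f}")
```

The fitted line generally passes near the points rather than through each of them, so hiding the dots would hide how well the line describes the data.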
Scatter graph – line, no dots
[Figure: "Carbon dioxide levels - Hawaii", Concentration (ppm) from 300 to 340 against year, 1958 to 1974]
Scatter – dots only
[Figure: Nitrate (mg/L), 0.5 to 2.3, against pH, 6 to 9]
Scatter – line & dots
[Figure: Abs., 0 to 1, against Conc. (mg/L), 0 to 20]
FORMATTING ISSUES – Scatter graphs

make sure the line is thick enough to be clearly visible

if you have more than one line on the graph, make sure that:
- they differ in style (solid, dotted, dashed)
- they are not light colours, so that the graph is readable when printed

when using join-the-dots, avoid the auto-smoothed line
Bad scatter graph
[Figure: Value, 0 to 2, against Time, 0 to 12, for Location A and Location B]
Good scatter graph
[Figure: Value, 0 to 2, against Time, 0 to 12, for Location A and Location B]
Pie charts

for category data
to show the relative proportions
similar to stacked column
only used when items are from the same population
- eg amount of each type of recyclables in collected waste
not suitable for related items from different populations
- eg weight of paper collected in recycling in different suburbs
inappropriate to leave out some categories
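A pie chart works by turning each category's count into a share of the whole, which is why every category has to be included. The sketch below converts counts into fractions and segment angles. The function name `pie_segments` and the counts are hypothetical.

```python
def pie_segments(counts):
    """Map each category to (fraction of total, segment angle in degrees)."""
    if any(value < 0 for value in counts.values()):
        raise ValueError("counts must not be negative")
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one non-zero count")
    return {
        name: (value / total, 360 * value / total)
        for name, value in counts.items()
    }


# Hypothetical counts, for illustration only.
plants = {"Native": 50, "Introduced": 30, "Not identified": 20}

for name, (fraction, angle) in pie_segments(plants).items():
    print(f"{name}: {fraction:.0%} of the pie, {angle:.0f} degrees")
```

Dropping a category would change the total and therefore every other segment's share, so the remaining segments would no longer show true proportions.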
Pie chart
[Figure: pie chart with segments Not identified, Introduced and Native]
FORMATTING ISSUES – Pie charts

do not use 3D pie charts under any circumstances
if you intend to print in monochrome, be careful to avoid adjoining segments of the same colour intensity
put the labels next to the segments, rather than in a box-type legend
it is not necessary to include values, and especially percentages, in the label
to emphasise a particular category, it is common practice to draw that segment slightly displaced from the main pie
Assignment 3

graphing provided data (2 separate data sets)
some graphs have a specified format, others do not
evaluating which graph works better for given data