TUTORIAL 1
January 9th 2003
THE PLAN
1. Types of data
a) qualitative or quantitative
b) qualitative: nominal or ordinal
c) quantitative: discrete or continuous
2. Paired data: why use it?
3. Descriptive stats
a) mean & median & mode
b) variation in data
4. Frequency tables and graphs
1. TYPES OF DATA
QUALITATIVE: measurements are QUALITIES
COLOUR: red/blue/green
SPECIES: oak/beech/larch
SIZE: small/medium/large
QUANTITATIVE: measurements are QUANTITIES
has units, e.g. x cm or x mm, light wavelength
QUALITATIVE DATA
Do your categories have a natural order?
N => NOMINAL
Colour of eyes: blue/brown/hazel
Species of insect: wasp/fly/beetle
Evergreen/deciduous
Nocturnal/diurnal
Y => ORDINAL
Size: small-medium-large
Behaviour rank:
0 = no aggression
1 = approach
2 = threat
3 = attack
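A minimal Python sketch of the nominal/ordinal distinction, using the behaviour ranks above. The class name `Aggression` and both lists of observations are invented for illustration. Ordinal categories can be put in order, so picking the middle one makes sense; for nominal categories, counting is all we can do.

```python
from collections import Counter
from enum import IntEnum


class Aggression(IntEnum):
    """Ordinal scale: the numbers encode order, not measured amounts."""
    NO_AGGRESSION = 0
    APPROACH = 1
    THREAT = 2
    ATTACK = 3


# Hypothetical observations, invented for this sketch.
observed = [Aggression.THREAT, Aggression.NO_AGGRESSION, Aggression.ATTACK,
            Aggression.APPROACH, Aggression.THREAT]

# Ordinal data can be ranked, so a middle category exists.
ranked = sorted(observed)
middle = ranked[len(ranked) // 2]

# Nominal data has no order: we can only count each category.
eye_colours = ["blue", "brown", "hazel", "brown"]  # hypothetical
counts = Counter(eye_colours)

print(middle.name)            # THREAT
print(counts.most_common(1))  # [('brown', 2)]
```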
QUANTITATIVE DATA
Can your measurements take any value?
i.e. can you have a decimal point/fraction?
N => DISCRETE (integers)
Number of offspring
Number of trees in field
Number of feeding events
Mass measured in STONE
Y => CONTINUOUS (real numbers)
Mass of rats (g)
Mean trees in field
Lifespan
Speed
Mass in POUNDS & OUNCES
2. THE JOYS OF PAIRED DATA
A pair of measurements made on a) the same
individual or b) the same type of individual
To look for a relationship:
height & mass of people
Control for diffs in individuals:
Mass before and after Christmas
Worm load before and after drug
Minimise diffs in individuals:
Pair rats by size/age for 2 diff diets
(cannot measure 2 diets in same rat)
MORE POWER
[Figure: NO. WORMS for PLACEBO vs DRUG]
8 diff individuals
Trend for drug < worms?
Lots of noise => significant?
[Figure: NO. WORMS BEFORE vs AFTER]
4 diff individuals
All < worms after drug
Trend clearer
P.S. if more than 2 measurements on
the same individual, come and see me
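To see why pairing gives more power, here is a small Python sketch. The worm counts are hypothetical, made up purely for illustration, and are not the values in the slides. The idea is that taking one difference per individual strips out the variation between individuals.

```python
from statistics import mean, stdev

# Hypothetical worm counts for 4 individuals (invented for illustration).
before = [30, 22, 41, 35]
after = [21, 15, 30, 28]

# Paired design: one difference (after - before) per individual.
differences = [a - b for a, b in zip(after, before)]

# Spread between individuals vs spread of the within-individual change.
spread_between = stdev(before)
spread_of_change = stdev(differences)

print(differences)        # [-9, -7, -11, -7]
print(mean(differences))  # -8.5
print(spread_of_change < spread_between)  # True
```

In this made-up example every individual has fewer worms after the drug, and the differences vary much less than the raw counts do.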
3. DESCRIPTIVE STATS
How to describe raw data - what is a typical value?
MEAN = average = Σx/n
1,1,1,2,3,3,4,5 => 20/8 => 2.5
MEDIAN = middle value when
data is in numerical order
1,2,3,4,5,6,7,8,9 => 5
1,2,3,4,5,6,7,8 => ½ × (4+5) => 4.5
MODE = most common
1,2,2,3,3,3,4,4,5 => 3
[Figures: no. students against no. pints drunk, axes 1-5 and 1-7 pints]
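The worked examples for mean, median and mode can be checked with Python's standard `statistics` module. This is just one convenient way to reproduce the arithmetic; the variable names are chosen for this sketch.

```python
from statistics import mean, median, mode

mean_example = mean([1, 1, 1, 2, 3, 3, 4, 5])         # sum 20 over n = 8
median_odd = median([1, 2, 3, 4, 5, 6, 7, 8, 9])      # single middle value
median_even = median([1, 2, 3, 4, 5, 6, 7, 8])        # average of 4 and 5
mode_example = mode([1, 2, 2, 3, 3, 3, 4, 4, 5])      # most common value

print(mean_example)   # 2.5
print(median_odd)     # 5
print(median_even)    # 4.5
print(mode_example)   # 3
```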
MEAN OR MEDIAN OR MODE?
Usually all 3 very similar:
Trout mass (g)
4.5 4.6 4.9 5.1 5.3 5.5 5.5 5.6 5.7 6.1
MEAN = 5.3g MODE = 5.5g MEDIAN = 5.4g
What if we have a giant fish?
Now fish 10 = 50g (not 6.1g)
MEAN = 9.7g
MODE = 5.5g
MEDIAN = 5.4g
The MEAN might NOT be best if distribution skewed
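A short Python sketch of the giant fish effect, using the trout masses listed above. The helper name `summary` is an assumption for this example. It shows the mean being dragged upwards by one extreme value while the median and mode stay put.

```python
from statistics import mean, median, mode

trout = [4.5, 4.6, 4.9, 5.1, 5.3, 5.5, 5.5, 5.6, 5.7, 6.1]  # mass (g)
giant = trout[:-1] + [50.0]  # fish 10 is now 50 g instead of 6.1 g


def summary(masses):
    """Return (mean, median, mode); mean and median rounded to 1 d.p."""
    return (round(mean(masses), 1), round(median(masses), 1), mode(masses))


print(summary(trout))  # (5.3, 5.4, 5.5)
print(summary(giant))  # (9.7, 5.4, 5.5)
```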
VARIATION IN DATA
Say the average speed of penguins = t m s⁻¹.
Do all penguins swim close to speed t?
i.e. what is the range of penguin speeds?
Measures of DISPERSION & SPREAD
STANDARD DEVIATION
[Figure: Height (cm), axis 0 to 8, mean marked; residuals shown, e.g. 8-4 = +4 and 0-4 = -4]
RESIDUALS: deviation from the
mean for each data point (x)
residual = x - mean
Used to calculate standard
deviation (s) & variance (S²)
S² = Σ(x - mean)² / (n - 1)
Data (cm)
0, 1, 2, 3, 4, 5, 6, 7, 8
x - mean
-4, -3, -2, -1, 0, 1, 2, 3, 4
(x - mean)²
16, 9, 4, 1, 0, 1, 4, 9, 16
n = 9
Σ(x - mean)² = 60
S² = 60/8 = 7.5
=> s = 2.7 cm
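The calculation can be traced step by step in Python. This sketch works directly from the nine data values 0 to 8 cm, dividing by n - 1, and the variable names are chosen for illustration only.

```python
import math
from statistics import stdev

data = [0, 1, 2, 3, 4, 5, 6, 7, 8]  # heights (cm)

n = len(data)
mean = sum(data) / n                    # 36 / 9 = 4.0
residuals = [x - mean for x in data]    # deviation of each point from the mean
squares = [r ** 2 for r in residuals]   # squared residuals
variance = sum(squares) / (n - 1)       # sample variance, S squared
s = math.sqrt(variance)                 # standard deviation

print(n, sum(squares), variance, round(s, 1))  # 9 60.0 7.5 2.7
```

The standard library function `statistics.stdev(data)` uses the same n - 1 formula and gives the same value.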
4. FREQUENCY TABLES & GRAPHS
The data
No. pints drunk            3   4   5   6   7   8   9
No. students (frequency)   1   2   4  11  21   5   3
Cumulative frequency       1   3   7  18  39  44  47
Most drink 7 pints
18 drink 6 or fewer
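The cumulative frequency row is a running total of the frequency row. A minimal Python sketch using the figures from the table (variable names are assumptions for this example):

```python
from itertools import accumulate

pints = [3, 4, 5, 6, 7, 8, 9]
students = [1, 2, 4, 11, 21, 5, 3]  # frequency for each number of pints

# Running total of the frequencies gives the cumulative frequency.
cumulative = list(accumulate(students))

# The most common number of pints (the mode) has the highest frequency.
modal_pints = pints[students.index(max(students))]

# Cumulative frequency at 6 pints = students drinking 6 or fewer.
six_or_fewer = cumulative[pints.index(6)]

print(cumulative)    # [1, 3, 7, 18, 39, 44, 47]
print(modal_pints)   # 7
print(six_or_fewer)  # 18
```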