variability

Intra-Individual Variability
• Intra-individual variability is greater among older
adults (Morse 1993)
– May be an indicator of the functioning of the central nervous
system (Dykiert 2012)
– Variability increases with neurodegenerative disease and after brain injury
• Alzheimer’s disease
• Parkinson’s disease
• Following a traumatic brain injury
• Test with a reaction time test
– http://www.bbc.co.uk/science/humanbody/sleep/sheep/reaction_version5.swf
– Record a series of reaction times
– The variability of the reaction times is the measure of interest
Example data from reaction time test

Trial   Reaction time in milliseconds
1       220
2       281
3       200
4       214
5       287
6       220
7       160
8       290
9       280
10      170

Mean       232.2
Max        290.0
Min        160.0
Range      130.0
Variance   2417.5
Std Dev    49.2
Reaction Time embedded Excel sheet

Trial   Reaction time in milliseconds
1       281
2       351
3       370
4       280
5       391
6       501
7       350
8       289
9       348
10      271

Mean       343.2
Max        501.0
Min        271.0
Range      230.0
Variance   4894.2
Std Dev    70.0
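
The summary rows under the two reaction-time tables can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the names `first_run`, `second_run` and `describe` are illustrative and are not part of the original spreadsheet. It assumes the variance and standard deviation were computed with the n – 1 (sample) denominator, which is the choice that matches the tabled values.

```python
from statistics import mean, stdev, variance

# Reaction times in milliseconds, copied from the two tables above.
first_run = [220, 281, 200, 214, 287, 220, 160, 290, 280, 170]
second_run = [281, 351, 370, 280, 391, 501, 350, 289, 348, 271]


def describe(times):
    """Return the summary statistics listed under each table."""
    return {
        "mean": mean(times),
        "max": max(times),
        "min": min(times),
        "range": max(times) - min(times),
        "variance": variance(times),  # sample variance, SS / (n - 1)
        "std_dev": stdev(times),      # square root of the sample variance
    }


for label, times in (("first run", first_run), ("second run", second_run)):
    print(label)
    for name, value in describe(times).items():
        print(f"  {name:<9}{value:8.1f}")
```

The second run has both a higher mean and a larger standard deviation, so it is slower on average and also less consistent from trial to trial.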
Variability
• A measure of how spread out the scores are in a distribution
• Usually accompanies a measure of central tendency as basic
descriptive statistics for a set of scores
• As a descriptive measure, variability describes the degree to
which the scores are spread out or clustered together in a
distribution.
• As a component of most inferential statistics,
variability provides a measure of how accurately any
individual score or sample represents the entire population
Population Variability
• Effects on sampling
– When population variability is small (“homogeneous”)
• all of the scores are clustered close together
• any individual score or sample will necessarily provide a good
representation of the entire set
– When population variability is large (“heterogeneous”)
• scores are widely spread
• one or two extreme scores can give a distorted picture of the
general population
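
To make the contrast concrete, the sketch below compares two small populations that share the same mean but differ in spread, and reports how far the worst-placed single score sits from the population mean. Both data sets and the helper name `worst_single_score_error` are hypothetical, made up only for illustration.

```python
from statistics import mean, pstdev

# Two hypothetical populations with the same mean (50) but different spread.
homogeneous = [48, 49, 50, 51, 52]
heterogeneous = [10, 30, 50, 70, 90]


def worst_single_score_error(population):
    """Largest distance between any one score and the population mean."""
    mu = mean(population)
    return max(abs(x - mu) for x in population)


for label, population in (("homogeneous", homogeneous),
                          ("heterogeneous", heterogeneous)):
    print(label,
          "| std dev:", round(pstdev(population), 2),
          "| worst single-score error:", worst_single_score_error(population))
```

In the homogeneous population any single score lands within 2 points of the mean; in the heterogeneous one a single score can miss the mean by 40 points.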
Figure 4-1 (p. 105)
Population distributions of adult heights and adult weights.
Selective breeding for taste preference
• Selective breeding of dogs or cats for coat color, body size
and behavior has occurred for centuries.
• Rats selectively bred for low versus high saccharin drinking.
– Low-saccharin-consuming (LoS)
– High-saccharin-consuming (HiS)
• Compare taste preference of male and female rats from
HiS and LoS lines.
– Values in the table are phenotype scores: taste preference
adjusted for baseline water intake and body weight.

           HSF      HSM      LSF      LSM
MEAN       53.66    28.59    5.52     1.02
STD DEV    16.58    11.90    10.48    5.48
Measuring Variability
• Variability can be measured with
– range: the total distance covered by the distribution, from the
highest score to the lowest score
– variance: the average squared distance from the mean
– standard deviation: the standard distance between a
score and the mean
• In each case, variability is determined by measuring
distance.
The Range
• The range is the total distance covered by the
distribution
– from the highest score (MAX) to the lowest score (MIN)
• The range tells us the number of measurement
categories.
• For continuous variables
• Upper real limit (URL) for Xmax – lower real limit (LRL) for Xmin
• Scores from 1 to 5: 5.5 – 0.5 = 5
• Alternative definition of range:
– When scores are whole numbers or discrete variables
with numerical scores,
• Xmax – Xmin + 1
• Scores from 0 to 4: 4 – 0 + 1 = 5
The Range
• Commonly the range can be defined as the
difference between the largest score (max) and the
smallest score (min).
• Used by the SPSS computer program
• Scores from 1 to 5: 5 – 1 = 4
• OK for discrete but not continuous variables
• Which calculation is used usually does not matter
• The range is usually not an accurate measure of
variability.
• Completely determined by two scores (max and min)
• An extremely large or small score will inflate the range
• Can be used as an adjunct measure alongside
variance
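
The three range definitions from these two slides can be compared side by side. This is a minimal sketch; the function names are illustrative, and the `unit` parameter (the size of one measurement step, assumed to be 1) is an assumption added so the real-limits version can be written out explicitly.

```python
def range_real_limits(scores, unit=1):
    """Upper real limit of Xmax minus lower real limit of Xmin."""
    half = unit / 2
    return (max(scores) + half) - (min(scores) - half)


def range_inclusive(scores):
    """Xmax - Xmin + 1, for whole-number scores."""
    return max(scores) - min(scores) + 1


def range_simple(scores):
    """Xmax - Xmin, the common definition."""
    return max(scores) - min(scores)


scores = [1, 2, 3, 4, 5]
print(range_real_limits(scores))  # 5.5 - 0.5
print(range_inclusive(scores))    # 5 - 1 + 1
print(range_simple(scores))       # 5 - 1

# A single extreme score inflates the range, however it is defined.
with_outlier = scores + [50]
print(range_simple(with_outlier))
```

For whole-number scores the real-limits and inclusive versions agree, and the simple version is one unit smaller.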
Interquartile and Semi-interquartile Range
• Interquartile range:
• Range covered by the middle 50% of the distribution
• = (Q3–Q1)
• Q3 is the 75th percentile, Q1 is the 25th percentile
• Semi-interquartile range:
• half the interquartile range = (Q3–Q1) / 2
• less likely to be influenced by extreme scores
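
A short sketch of both quantities, using `statistics.quantiles` from the Python standard library. The data are hypothetical, and the choice of `method="inclusive"` is an assumption: statistical packages use several different percentile conventions, so Q1 and Q3 for the same scores can differ slightly between programs.

```python
from statistics import quantiles


def interquartile_range(scores):
    """Q3 - Q1, the range covered by the middle 50% of the scores."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1


def semi_interquartile_range(scores):
    """Half of the interquartile range."""
    return interquartile_range(scores) / 2


# Hypothetical scores; the second list swaps the top score for an extreme one.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 90]

for data in (scores, with_extreme):
    print("range:", max(data) - min(data),
          "| IQR:", interquartile_range(data),
          "| semi-IQR:", semi_interquartile_range(data))
```

The extreme score changes the range from 8 to 89 but leaves the interquartile and semi-interquartile ranges unchanged.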
The Standard Deviation
• Standard deviation measures the standard
(average) distance between a score and the mean.
– All of the scores are distributed around the mean
– There is a distance between each score and the mean
• called a deviation
• X – µ = deviation score
– Conceptually, the standard deviation is the average of
these deviation scores
Example 4.1 page 107

X     X – µ
8     5
1     -2
3     0
0     -3

∑X = 12, µ = 12/4 = 3
∑(X – µ) = 0

Oops: the average deviation cannot be calculated this way.
The sum of the deviations from the mean is always zero,
so the sum of deviations divided by the number of scores
is 0/4 = 0, however spread out the scores are.
A workaround is needed: see example 4.2.
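
The dead end in Example 4.1 is easy to confirm numerically. A minimal sketch, with illustrative variable names; the second data set is borrowed from Example 4.2 only to show that the result does not depend on the particular scores.

```python
from statistics import mean

scores = [8, 1, 3, 0]                    # the X column from Example 4.1
mu = mean(scores)                        # 12 / 4 = 3
deviations = [x - mu for x in scores]    # X - mu for each score

print(deviations)
print(sum(deviations))                   # positive and negative deviations cancel
print(sum(deviations) / len(scores))     # so the "average deviation" is zero

# The same cancellation happens with any other set of scores.
other_scores = [1, 9, 5, 8, 7]
other_mu = mean(other_scores)
other_sum = sum(x - other_mu for x in other_scores)
print(other_sum)
```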
Example 4.2 table of data from page 109
Population of N = 5 scores.
Calculate the mean
Calculate deviation from the mean
Calculate Squared Deviation
Calculate “SS” Sum of Squared Deviations

Score     Deviation     Squared Deviation
X         X – µ         (X – µ)²
1         -5            25
9         +3            9
5         -1            1
8         +2            4
7         +1            1

∑X = 30
µ = (∑X)/N
µ = 30/5 = 6
∑(X – µ) = 0
SS = ∑(X – µ)² = 40
Figure 4.3 (page 111) from example data 4.2
A frequency distribution histogram for a population of N = 5 scores. The mean for this
population is µ = 6. The smallest distance from the mean is 1 point, and the largest distance is
5 points. The standard distance (or standard deviation) should be between 1 and 5 points.
See example 4.2 for calculation table
Sum of Squares SS = ∑(X – µ)² = 40
Figure 4.2 (page 109)
The calculations of variance and standard deviation
Calculation of Standard Deviation for a Population
For example 4.2
• Calculate the mean: µ = (∑X)/N = 30/5 = 6
• Calculate the sum of squares: SS = ∑(X – µ)² = 40
• Compute the variance, which is the mean square
– calculate the mean of the squared deviations
• SS divided by N
– Variance σ² = SS/N = 40/5 = 8
• Compute the standard deviation
– Standard deviation is the square root of the variance
– Standard deviation σ = √8 = 2.83
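
The steps above can be sketched in a few lines of Python. The variable names are illustrative; the final two lines cross-check the hand calculation against `pvariance` and `pstdev`, the standard library's population (divide by N) functions.

```python
from math import sqrt
from statistics import pstdev, pvariance

scores = [1, 9, 5, 8, 7]                 # Example 4.2, population of N = 5
N = len(scores)

mu = sum(scores) / N                     # population mean
ss = sum((x - mu) ** 2 for x in scores)  # SS, sum of squared deviations
variance = ss / N                        # population variance (mean square)
std_dev = sqrt(variance)                 # population standard deviation

print(mu, ss, variance, round(std_dev, 2))

# Cross-check against the standard library's population functions.
assert variance == pvariance(scores)
assert abs(std_dev - pstdev(scores)) < 1e-12
```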
Figure 4.3 (page 111) from example data 4.2
The computed standard deviation falls between the smallest distance from the mean (1 point)
and the largest distance (5 points), as expected.
Sum of Squares SS = ∑(X – µ)² = 40
Variance σ² = SS/N = 40/5 = 8
Standard deviation σ = √σ² = √8 = 2.83
Population Variance and Standard Deviation Formulas
• Definitional formulas
– Sum of Squares SS = ∑(X – µ)²
– Population Variance σ² = SS/N
– Population Standard deviation σ = √(SS/N)
• Computational formula for the Sum of Squares
– SS = ∑X² – (∑X)²/N
– Easier with a hand calculator
– Using example 4.2

X     X²
1     1
9     81
5     25
8     64
7     49

∑X = 30     ∑X² = 220

SS = 220 – (30)²/5
SS = 220 – 900/5
SS = 220 – 180
SS = 40
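
A quick check that the definitional and computational formulas give the same sum of squares for Example 4.2. This is a sketch with illustrative function names, not a prescribed implementation.

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)


def ss_computational(scores):
    """SS as the sum of X squared minus (sum of X) squared over N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return sum_x_squared - sum_x ** 2 / n


scores = [1, 9, 5, 8, 7]
print(ss_definitional(scores))
print(ss_computational(scores))
```

The computational version needs only the two running totals ∑X and ∑X², which is why it is convenient on a hand calculator.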
Relationship with Other Statistical
Measures
• Variance and standard deviation are mathematically
related to the mean. They are computed from the
squared deviation scores (squared distance of each
score from the mean).
• Median and semi-interquartile range are both based
on percentiles and therefore are used together.
When the median is used to report central tendency,
semi-interquartile range is often used to report
variability.
• Range has no direct relationship to any other
statistical measure.
Figure 4-4 (p. 114)
The population of adult heights forms a normal distribution. If you select a
sample from this population, you are most likely to obtain individuals who are
near average in height. As a result, the scores in the sample will be less
variable (spread out) than the scores in the population.
[Figure axes: FREQUENCY (vertical); height, marked at 48 inches and 84 inches (horizontal)]
Calculation of Standard Deviation for a Sample
1. Compute the deviation (distance from the mean)
for each score.
2. Square each deviation.
3. Compute the mean of the squared deviations
– called the variance or mean square
– for samples, take the sum of the squared deviations (SS)
and divide by n – 1, rather than N
• n – 1 is known as the degrees of freedom (df)
• used so that the sample variance will provide an unbiased
estimate of the population variance
4. Take the square root of the variance to obtain the
standard deviation
Example 4.5 table from page 116
Sample of n = 7 scores

Score     Deviation     Squared Deviation
X         X – M         (X – M)²
1         -4            16
6         +1            1
4         -1            1
3         -2            4
8         +3            9
7         +2            4
6         +1            1

∑X = 35
M = (∑X)/n
M = 35/7 = 5
∑(X – M) = 0
SS = ∑(X – M)² = 36
Figure 4-5 (p. 116) using Example data 4.5 (p. 116)
The frequency distribution histogram for a sample of n = 7 scores. The sample mean
is M = 5. The smallest distance from the mean is 1 point, and the largest distance
from the mean is 4 points. The standard distance (standard deviation) should be
between 1 and 4 points or about 2.5.
Sum of Squares SS = ∑(X – M)² = 36
Sample Variance s² = SS/(n – 1) = 36/6 = 6
Sample Standard Deviation s = √s² = √6 = 2.45
APA style uses SD for standard deviation, so it would be written SD = √s² = √6 = 2.45
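
The sample calculation for Example 4.5 can be sketched the same way as the population one, with n – 1 in the denominator. Variable names are illustrative; the last two lines cross-check against `variance` and `stdev`, the standard library's sample (divide by n – 1) functions.

```python
from math import sqrt
from statistics import stdev, variance

scores = [1, 6, 4, 3, 8, 7, 6]           # Example 4.5, sample of n = 7
n = len(scores)

M = sum(scores) / n                      # sample mean
ss = sum((x - M) ** 2 for x in scores)   # SS, sum of squared deviations
df = n - 1                               # degrees of freedom
sample_variance = ss / df                # s squared = SS / (n - 1)
sample_sd = sqrt(sample_variance)        # s, reported as SD in APA style

print(M, ss, df, sample_variance, round(sample_sd, 2))

# Cross-check against the standard library's sample functions.
assert sample_variance == variance(scores)
assert abs(sample_sd - stdev(scores)) < 1e-12
```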
Degrees of Freedom (df)
• Sample variability is biased: it tends to underestimate
population variability
• Using n – 1 corrects for that bias
• With this correction, sample variance is an unbiased estimate
of population variance
• For a single sample of size n, the degrees of freedom are n – 1
• Used as the denominator when calculating sample variance:
s² = SS/(n – 1)
• BTW: degrees of freedom refers to the number of scores that
are free to vary
– For a sample of n = 3 with M = 5, where the first two scores are X =
4 and X = 5, what must the third score be?
– X = 6: the scores must sum to n × M = 15, so the third score has no freedom to vary
– So when n = 3 there are only 2 degrees of freedom
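
Both points on this slide can be checked with a small script. The first part solves for the score that is not free to vary. The second part is an illustration that is not in the original slides: it assumes the Example 4.2 scores are the whole population, draws every possible sample of size 2 with replacement, and averages the sample variance computed with each denominator. All names are illustrative.

```python
from itertools import product

# Part 1: once the mean is fixed, the last score is not free to vary.
n, M = 3, 5
known_scores = [4, 5]
last_score = n * M - sum(known_scores)   # scores must sum to n * M
print(last_score)

# Part 2: compare dividing SS by n with dividing SS by n - 1.
population = [1, 9, 5, 8, 7]             # assumed population (Example 4.2)
mu = sum(population) / len(population)
population_variance = sum((x - mu) ** 2 for x in population) / len(population)


def sum_of_squares(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


sample_size = 2
samples = list(product(population, repeat=sample_size))  # all 25 samples

avg_dividing_by_n = sum(
    sum_of_squares(s) / sample_size for s in samples) / len(samples)
avg_dividing_by_df = sum(
    sum_of_squares(s) / (sample_size - 1) for s in samples) / len(samples)

print("population variance:", population_variance)
print("average of SS/n:", avg_dividing_by_n)
print("average of SS/(n - 1):", avg_dividing_by_df)
```

Averaged over all possible samples, SS/n comes out at half the population variance here, while SS/(n – 1) matches it, which is the sense in which the n – 1 version is unbiased.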