Measures of Relative Standing

• Percentiles
• z-scores
• T-scores
Percentiles and Quantiles
• The sample kth percentile (Pk) is a value such that
k% of the observations in the sample are less
than Pk and (100 – k)% are greater.
• E.g. the 90th percentile (P90) is a value such that
90% of the observations are smaller in value and
10% of the observations are greater in value.
• Quantile is just another term for percentile, e.g.
JMP refers to percentiles as quantiles.
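The percentile definition above can be checked directly by counting. The following is a minimal sketch, assuming a made-up sample of ten scores; the helper names percent_below and percent_above are hypothetical and are not from the slides.

```python
def percent_below(data, value):
    """Percentage of observations strictly less than `value`."""
    return 100.0 * sum(1 for x in data if x < value) / len(data)


def percent_above(data, value):
    """Percentage of observations strictly greater than `value`."""
    return 100.0 * sum(1 for x in data if x > value) / len(data)


# Hypothetical sample of 10 scores (made up for illustration).
scores = [52, 55, 61, 64, 68, 70, 73, 77, 85, 93]

# 9 of the 10 scores are below 90 and 1 is above,
# so the value 90 behaves like a 90th percentile (P90) for this sample.
print(percent_below(scores, 90))  # 90.0
print(percent_above(scores, 90))  # 10.0
```

Note that statistical packages use slightly different interpolation rules, so the percentile reported for a small sample can differ from one program to another.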
Quartiles
• Quartiles are specific percentiles
• Q1 = 1st quartile = 25th percentile
• Q2 = 2nd quartile = 50th percentile = Median
• Q3 = 3rd quartile = 75th percentile
Boxplot
[Figure: boxplot with the following parts labeled]
• Minimum = x(1)
• Q1
• Median
• Q3
• Maximum = x(n)
• Outliers
• IQR = Interquartile Range (Q3 – Q1),
which is the range of the middle
50% of the data
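The quantities labeled on a boxplot can be computed with Python's standard library. This is a rough sketch, assuming a made-up sample; statistics.quantiles uses its default "exclusive" method here, and other software (JMP included) may use a different quartile convention and give slightly different values.

```python
import statistics

# Hypothetical sample, made up for illustration.
data = [12, 15, 17, 20, 22, 25, 30]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# IQR = Q3 - Q1, the range of the middle 50% of the data.
iqr = q3 - q1

summary = {
    "min": min(data),   # x(1)
    "Q1": q1,
    "median": q2,
    "Q3": q3,
    "max": max(data),   # x(n)
    "IQR": iqr,
}
print(summary)
```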
Comparative Boxplots
Definition of z-score
Population z-score
$z = \dfrac{x - \mu}{\sigma}$
Sample z-score
$z = \dfrac{x - \bar{x}}{s}$
In either case, the z-score tells us how many
standard deviations above (if z > 0) or
below (if z < 0) the mean an observation is.
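As an illustration of the sample z-score, here is a small sketch that assumes a made-up sample of five values; the function name z_score is hypothetical. It uses statistics.stdev, which computes the sample standard deviation s (dividing by n - 1).

```python
import statistics


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical sample, made up for illustration.
sample = [4, 8, 6, 5, 7]

xbar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)     # sample standard deviation (n - 1 in the denominator)

for x in sample:
    print(x, round(z_score(x, xbar, s), 2))
```

For a population z-score, the same function applies with the population mean and standard deviation passed in place of xbar and s.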
Interpretation of z-Scores
• If z = 0 an observation is at the mean.
• If z > 0 the observation is above the mean in
value, e.g. if z = 2.00 the observation is 2 SDs
above the mean.
• If z < 0 the observation is below the mean in
value, e.g. if z = -1.00 the observation is 1 SD
below the mean.
The Empirical Rule (z-scores)
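• 68% of data are within 1 standard deviation of the mean
• 95% within 2 standard deviations
• 99.7% within 3 standard deviations
[Figure: normal curve over z-scores from -3.00 to 3.00. Approximate
areas by region: below -3.00, 0.1%; -3.00 to -2.00, 2.4%;
-2.00 to -1.00, 13.5%; -1.00 to 0, 34%; 0 to 1.00, 34%;
1.00 to 2.00, 13.5%; 2.00 to 3.00, 2.4%; above 3.00, 0.1%]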
The Empirical Rule (z-scores)
Therefore for normally distributed data:
• 68% of observations have z-scores between
-1.00 and 1.00
• 95% of observations have z-scores between
-2.00 and 2.00
• 99.7% of observations have z-scores between
-3.00 and 3.00
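The empirical-rule percentages can be verified against the standard normal distribution. The sketch below uses statistics.NormalDist from the Python standard library; the helper name prob_within is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1 (the z-score scale).
std_normal = NormalDist(mu=0, sigma=1)


def prob_within(k):
    """Probability that a normal observation has a z-score between -k and k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Prints percentages close to the rounded 68%, 95%, 99.7% of the empirical rule.
for k in (1, 2, 3):
    print(f"within {k} SD: {100 * prob_within(k):.2f}%")
```

The exact values are slightly different from the rounded figures in the rule, which is why the rule is stated as an approximation for normally distributed data.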
Outliers based on z-scores
• Based on the empirical rule, an observation with a
z-score < -2.00 or z-score > 2.00
might be characterized as a mild outlier.
• Any observation with a
z-score < -3.00 or z-score > 3.00
might be characterized as an extreme outlier.
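The two cutoffs above translate into a simple classification rule. This is a minimal sketch; the function name classify_by_z and the label "not flagged" are hypothetical choices made for illustration.

```python
def classify_by_z(z):
    """Label an observation using the z-score cutoffs of 2.00 and 3.00."""
    if abs(z) > 3.0:
        return "extreme outlier"
    if abs(z) > 2.0:
        return "mild outlier"
    return "not flagged"


# Made-up z-scores for illustration.
for z in (0.4, -2.5, 3.2):
    print(z, classify_by_z(z))
```

Because the cutoffs come from the empirical rule, this labeling is most meaningful when the data are roughly normally distributed.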
Example: z-scores
Q: Which is more extreme: an
infant with a gestational age of 31
weeks or one with a birth weight of
1950 grams?
Calculate z-scores for each case.
Gestational Age = 31 weeks
$z = \dfrac{31 - 38.61}{2.72} = -2.80$
Birthweight = 1950 grams
$z = \dfrac{1950 - 3299.27}{638.97} = -2.11$
Because the z-score associated with a
gestational age of 31 weeks is farther below the mean
(more extreme), we conclude that it
corresponds to the more extreme infant.
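The arithmetic in this example can be reproduced in a few lines. The sketch below takes the figures shown on the slide as the sample means and standard deviations (38.61 and 2.72 weeks for gestational age, 3299.27 and 638.97 grams for birth weight); the function name z_score is hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Means and SDs as they appear in the slide's calculations.
z_gest = z_score(31, 38.61, 2.72)          # gestational age, weeks
z_bw = z_score(1950, 3299.27, 638.97)      # birth weight, grams

print(round(z_gest, 2))  # -2.8
print(round(z_bw, 2))    # -2.11

# The observation whose z-score is farther from 0 is the more extreme one.
more_extreme = "gestational age" if abs(z_gest) > abs(z_bw) else "birth weight"
print(more_extreme)      # gestational age
```

Comparing absolute values makes the comparison work whether the observations fall above or below their means.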
Standardized Variables
We can convert each observed value of a numeric
variable to its associated z-score. This process
is called standardization and the resulting
variable is called the standardized variable.
Note: When standardized,
the variable has a mean of 0 and a standard
deviation of 1!
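Standardization and the mean-0, SD-1 property can be demonstrated on a small data set. This is a sketch under the assumption of made-up birth weights; the function name standardize is hypothetical.

```python
import statistics


def standardize(values):
    """Convert each value to its sample z-score."""
    xbar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return [(x - xbar) / s for x in values]


# Hypothetical birth weights in grams, made up for illustration.
weights = [2900, 3100, 3300, 3500, 3700]
z_values = standardize(weights)

# Up to floating-point rounding, the standardized variable has mean 0 and SD 1.
print(statistics.mean(z_values))
print(statistics.stdev(z_values))
```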
T-Scores ~ Another “Standardization”
Facts About T-scores
• Have a mean of 50.
• Have a standard deviation of 10.
• May extend from 0 to 100.
• It is unlikely that any T-score will be below
20 or above 80 (i.e. more than 3 SDs above or below the mean)
Definition of T-Score
$T = 50 + 10\left(\dfrac{x - \bar{x}}{s}\right)$ (this is the formula in Grove, pg. 145 YUCK!)
$T = 50 + 10z$, where $z$ = z-score
Empirical Rule: z- and T-scores
[Figure: normal curve marking the central 68%, 95%, and 99.7% of
observations on the z-score and T-score scales]
T-Scores
T-scores may be used in same way as z-scores, but
may be preferred because:
• Only positive whole numbers are reported.
• Range from 0 to 100.
However, they are sometimes confusing because
a T-score of 60 or above is a good score, BUT not if we are
taking a 100 point exam!
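Converting between the two scales is a one-line calculation, T = 50 + 10z. The sketch below is illustrative only; the function names t_score and reported_t_score are hypothetical, and rounding to a whole number reflects the reporting convention mentioned above.

```python
def t_score(z):
    """Convert a z-score to a T-score (mean 50, standard deviation 10)."""
    return 50 + 10 * z


def reported_t_score(z):
    """T-score rounded to a whole number, as T-scores are usually reported."""
    return round(t_score(z))


# z-scores of -3, 0, 1 and 3 correspond to T-scores of 20, 50, 60 and 80.
for z in (-3.0, 0.0, 1.0, 3.0):
    print(z, reported_t_score(z))
```

A T-score of 60 therefore means one standard deviation above the mean, not 60 points out of 100.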