Measures of Relative Standing
• Percentiles
• z-scores
• T-scores

Percentiles/Quantiles
• The sample kth percentile (Pk) is a value such that k% of the observations in the sample are less than Pk and (100 - k)% are greater.
• E.g. the 90th percentile (P90) is a value such that 90% of the observations are smaller in value and 10% of the observations are greater in value.
• Quantile is just another term for percentile, e.g. JMP refers to percentiles as quantiles.

Quartiles
• Quartiles are specific percentiles
• Q1 = 1st quartile = 25th percentile
• Q2 = 2nd quartile = 50th percentile = Median
• Q3 = 3rd quartile = 75th percentile

Boxplot
[Figure: boxplot labeled with Minimum = x(1), Q1, Median, Q3, Maximum = x(n), and Outliers.]
IQR = Interquartile Range = Q3 - Q1, which is the range of the middle 50% of the data.

Comparative Boxplots
[Figure: comparative boxplots.]

Definition of z-score
Population z-score: $z = \frac{x - \mu}{\sigma}$
Sample z-score: $z = \frac{x - \bar{x}}{s}$
In either case, the z-score tells us how many standard deviations above (if z > 0) or below (if z < 0) the mean an observation is.

Interpretation of z-Scores
• If z = 0 the observation is at the mean.
• If z > 0 the observation is above the mean in value, e.g. if z = 2.00 the observation is 2 SDs above the mean.
• If z < 0 the observation is below the mean in value, e.g. if z = -1.00 the observation is 1 SD below the mean.

The Empirical Rule (z-scores)
• 99.7% of data are within 3 standard deviations of the mean
• 95% within 2 standard deviations
• 68% within 1 standard deviation
[Figure: normal curve over z-scores from -3.00 to 3.00. Areas by band on each side of the mean: 34% within 1 SD, 13.5% between 1 and 2 SDs, 2.4% between 2 and 3 SDs, and 0.1% beyond 3 SDs.]

Therefore, for normally distributed data, approximately:
• 68% of observations have z-scores between -1.00 and 1.00
• 95% of observations have z-scores between -2.00 and 2.00
• 99.7% of observations have z-scores between -3.00 and 3.00

Outliers based on z-scores
• Under the empirical rule, an observation with a z-score < -2.00 or z-score > 2.00 might be characterized as a mild outlier.
• Any observation with a z-score < -3.00 or z-score > 3.00 might be characterized as an extreme outlier.

Example: z-scores
Q: Which is more extreme, an infant with a gestational age of 31 weeks or one with a birth weight of 1950 grams?
Calculate z-scores for each case.
Gestational Age = 31 weeks: $z = \frac{31 - 38.61}{2.72} = -2.80$
Birthweight = 1950 grams: $z = \frac{1950 - 3299.27}{638.97} = -2.11$
Because the z-score associated with a gestational age of 31 weeks is smaller (farther below the mean), we conclude that it is the more extreme of the two observations.

Standardized Variables
We can convert each observed value of a numeric variable to its associated z-score. This process is called standardization and the resulting variable is called the standardized variable.
Note: When standardized, the mean is 0 and the standard deviation is 1!

T-Scores ~ Another "Standardization"

Facts About T-scores
• Have a mean of 50.
• Have a standard deviation of 10.
• May extend from 0 to 100.
• It is unlikely that any T-score will be below 20 or above 80 (i.e. more than 3 SDs below or above the mean).

Definition of T-Score
$T = 50 + \frac{10(x - \bar{x})}{s}$ (this is the formula in Grove, pg. 145 YUCK!)
$T = 50 + 10z$, where z = z-score

Empirical Rule: z- and T-scores
[Figure: normal curve with matching z-score and T-score axes, marking the 68%, 95%, and 99.7% regions.]
Since T = 50 + 10z, about 68% of T-scores fall between 40 and 60, 95% between 30 and 70, and 99.7% between 20 and 80.

T-Scores
T-scores may be used in the same way as z-scores, but may be preferred because:
• Only positive whole numbers are reported.
• They range from 0 to 100.
However, they are sometimes confusing because a T-score of 60 or above is a good score, BUT not if we are taking a 100-point exam!
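
As a rough check on the z-score and T-score formulas above, here is a minimal Python sketch using only the standard library. It reworks the infant example with the means and standard deviations quoted in the slides. The function names (z_score, t_score, classify, standardize) are illustrative choices, not from the slides or any package, and the small sample list is made-up data used only to show that a standardized variable has mean 0 and standard deviation 1.

```python
from statistics import mean, stdev


def z_score(x, center, spread):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - center) / spread


def t_score(z):
    """Rescale a z-score so the mean is 50 and the standard deviation is 10."""
    return 50 + 10 * z


def classify(z):
    """Label an observation using the z-score outlier cutoffs from the slides."""
    if abs(z) > 3:
        return "extreme outlier"
    if abs(z) > 2:
        return "mild outlier"
    return "not an outlier"


def standardize(values):
    """Convert every value to its sample z-score (uses the sample SD, s)."""
    xbar, s = mean(values), stdev(values)
    return [z_score(v, xbar, s) for v in values]


# Worked example: means and SDs are the values quoted in the slides.
z_gest = z_score(31, 38.61, 2.72)        # gestational age, weeks
z_bwt = z_score(1950, 3299.27, 638.97)   # birth weight, grams

print("gestational age:", round(z_gest, 2), classify(z_gest), round(t_score(z_gest)))
print("birth weight:   ", round(z_bwt, 2), classify(z_bwt), round(t_score(z_bwt)))

# Hypothetical data, only to illustrate standardization.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
zs = standardize(sample)
print("standardized mean and SD:", round(mean(zs), 10), round(stdev(zs), 10))
```

Both infants' z-scores fall between -3 and -2, so both would be flagged as mild outliers under the cutoffs above, with gestational age the more extreme of the two.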