Chapter 3: Descriptive Measures
Section 2: Measures of Variation

Slide 3-2
Measures of Variation
There are three common measures of the spread or variability of a data set: the range, the variance, and the standard deviation.
To describe quantitatively how data sets differ in spread, we use a descriptive measure that indicates the amount of variation, or spread, in the data set. Such measures are referred to as measures of variation or measures of spread.
The range rule of thumb: a rough estimate of the standard deviation is $s \approx \dfrac{\text{range}}{4}$.

Slide 3-3
Variance: the average of the squares of the distances of each value from the mean.
The symbol for the population variance is $\sigma^2$ ($\sigma$ is the Greek lowercase letter sigma). The symbol for the sample variance is $s^2$.
Sample variance: $s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$
Population variance: $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$
where $X$ = individual values, $\mu$ = population mean, $N$ = population size, $\bar{X}$ = sample mean, $n$ = sample size.

Slide 3-4
Measures of Variation or Measures of Spread
The data sets have the same mean, median, and mode, yet they clearly differ.
Figure 3.3

Slide 3-5
Range of a Data Set
The range of a data set is given by the formula Range = Max - Min, where Max and Min denote the maximum and minimum observations, respectively.
In other words, the range is the distance between the highest value and the lowest value. The symbol R is used for the range: R = highest value - lowest value.

Slide 3-6
Measures of Variation or Measures of Spread: The Range
Team I has a range of 6 inches; Team II has a range of 17 inches.
Figure 3.4

Slide 3-7
Standard Deviation: measures variation by indicating how far, on average, the observations are from the mean. It is the square root of the variance.
The symbol for the population standard deviation is $\sigma$ (sigma). The symbol for the sample standard deviation is $s$.
Variation and the Standard Deviation: the more variation there is in a data set, the larger its standard deviation.
Rounding rule: the final answer should be rounded to one more decimal place than the original data.
Sample standard deviation: $s = \sqrt{s^2} = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$
Population standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$

Slide 3-8
Sample Standard Deviation
Deviations from the mean: how far each observation is from the mean, $x_i - \bar{x}$. Computing them is the first step in computing a sample standard deviation.
Sum of squared deviations: the sum of the squared deviations from the mean, $\sum (x_i - \bar{x})^2$, which gives a measure of the total deviation from the mean for all the observations.

Slide 3-9
Sample standard deviation: the standard deviation of a sample. Take the square root of the sample variance.
Sample variance formula: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$, so $s = \sqrt{s^2}$.
Why is the denominator (n - 1) rather than n? Division by (n - 1) increases the value of the sample variance so that it more closely reflects the population variance, giving us an unbiased estimate of the population variance.
Shortcut or computational formulas for data obtained from samples:
Variance: $s^2 = \dfrac{\sum X^2 - (\sum X)^2 / n}{n - 1}$
Standard deviation: $s = \sqrt{\dfrac{\sum X^2 - (\sum X)^2 / n}{n - 1}}$

Slide 3-10
Variance and Standard Deviation
Variances and standard deviations can be used to determine the spread of the data. If the variance or standard deviation is large, the data are more dispersed. This information is useful in comparing two (or more) data sets to determine which is more (most) variable.
The measures of variance and standard deviation are used to determine the consistency of a variable.
The variance and standard deviation are used to determine the number of data values that fall within a specified interval in a distribution.
The variance and standard deviation are used quite often in inferential statistics.

Slide 3-11
Computing Formula for a Sample Standard Deviation
Rounding rule: do not perform any rounding until the computation is complete; otherwise, substantial round-off error can result.

Slide 3-12
Standard Deviation: the more variation, the larger the standard deviation. Data set II has greater variation.
Table 3.10
Table 3.11

Slide 3-13
Data set II has greater variation, and the visual clearly shows that it is more spread out.
Figure 3.6: Data Set I
Figure 3.7: Data Set II

Slide 3-14
Three-Standard-Deviation Rule
Almost all the observations in any data set lie within three standard deviations to either side of the mean.

Slide 3-15
Example: Find the variance and standard deviation.
Exam Scores: For 108 randomly selected college students, the following exam score frequency distribution was obtained.

Class limits   Frequency f   Midpoint Xm   f · Xm    f · (Xm)^2
90-98                6            94          564        53,016
99-107              22           103        2,266       233,398
108-116             43           112        4,816       539,392
117-125             28           121        3,388       409,948
126-134              9           130        1,170       152,100
Total              108                      12,204     1,387,854

Sample computations for the first class: midpoint = (90 + 98)/2 = 94; f · Xm = 6 × 94 = 564; f · (Xm)^2 = 6 × (94)^2 = 53,016.

Slide 3-16
Example:
Variance: $s^2 = \dfrac{\sum f X_m^2 - (\sum f X_m)^2 / n}{n - 1} = \dfrac{1{,}387{,}854 - 12{,}204^2 / 108}{108 - 1} \approx 82.26$, or 82.3
Standard deviation: $s = \sqrt{82.26} \approx 9.07$, or 9.1
*** NOTE: $9.07^2 \approx 82.26$ ***

Slide 3-17
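The grouped-data computation from the exam score example (Slides 3-15 and 3-16) can be checked with a short script. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and it assumes, as the example does, that every score in a class is represented by the class midpoint. It computes the sample variance twice, once with the shortcut formula and once directly from the squared deviations from the mean, to show that the two agree.

```python
import math

# Frequency table from the exam score example: class midpoints Xm and frequencies f.
midpoints = [94, 103, 112, 121, 130]
freqs = [6, 22, 43, 28, 9]


def grouped_variance_shortcut(midpoints, freqs):
    """Shortcut formula: s^2 = [sum(f*Xm^2) - (sum(f*Xm))^2 / n] / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


def grouped_variance_definition(midpoints, freqs):
    """Defining formula: s^2 = sum(f * (Xm - mean)^2) / (n - 1)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / (n - 1)


variance = grouped_variance_shortcut(midpoints, freqs)
std_dev = math.sqrt(variance)

# Round only at the end, as the rounding rule recommends.
print(round(variance, 2), round(std_dev, 2))  # 82.26 9.07
```

Both functions return the same value because the shortcut formula is an algebraic rearrangement of the defining formula; the shortcut simply avoids computing the mean and each deviation first.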