Lecture 4: Measures of Variation
2.5 Measures of Variation

Review of Lecture 3: Measures of Center
• Given a stem-and-leaf plot, be able to find the mean, median, and mode.

    Stem (tens) | Leaves (units)
         4      | 0 2
         5      | 0 0 0 1 2 2
         6      | 4 7

  » Mean: (40 + 42 + 3*50 + 51 + 2*52 + 64 + 67)/10 = 518/10 = 51.8
  » Median: average of the 5th and 6th values = (50 + 51)/2 = 50.5
  » Mode: 50

• Given a regular frequency distribution, be able to find the sample size, mean, and median.

    # of phones (x)     f     fx    Cum. freq.
          4             2      8        2
          3             4     12        6
          2             5     10       11
          1            16     16       27    (median group)
          0            13      0       40 = n

  » Sample size: 2 + 4 + 5 + 16 + 13 = 40
  » Mean: (8 + 12 + 10 + 16 + 0)/40 = 46/40 = 1.15
  » Median: average of the two middle values (the 20th and 21st) = 1

Measures of Variation
Statistics handles variation, so this section is one of the most important sections in the entire book.

Definition
Measure of variation (measure of dispersion): a measure that describes the spread of a data set.
Candidates:
• Range
• Standard deviation, variance
• Coefficient of variation

Definition
The range of a set of data is the difference between the highest value and the lowest value.
    Range = (highest value) - (lowest value)
Example: the range of {1, 3, 14} is 14 - 1 = 13.

Standard Deviation
The standard deviation of a set of values is a measure of variation of values about the mean.
We introduce two standard deviations:
• Sample standard deviation
• Population standard deviation

Sample Standard Deviation Formula
Formula 2-4:
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$
where $x$ is a data value, $\bar{x}$ is the sample mean, and $n$ is the sample size.

Sample Standard Deviation (Shortcut Formula)
Formula 2-5:
$$ s = \sqrt{\frac{n\left(\sum x^2\right) - \left(\sum x\right)^2}{n(n - 1)}} $$

Example: Publix check-out waiting times in minutes
Data: 1, 4, 10. Find the sample mean and sample standard deviation.

    n = 3,  $\bar{x} = 15/3 = 5.0$ min

      x      $x - \bar{x}$     $(x - \bar{x})^2$     $x^2$
      1          -4                  16                1
      4          -1                   1               16
     10           5                  25              100
    Total: 15                        42              117

Using Formula 2-4:
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{42}{3 - 1}} = \sqrt{21} = 4.6 \text{ min} $$
Using the shortcut formula (Formula 2-5):
$$ s = \sqrt{\frac{n\left(\sum x^2\right) - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\frac{3(117) - (15)^2}{3(3 - 1)}} = \sqrt{\frac{351 - 225}{6}} = \sqrt{\frac{126}{6}} = \sqrt{21} = 4.6 \text{ min} $$

Standard Deviation Key Points
• The value of the standard deviation s is usually positive and always non-negative.
• The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).
• The units of the standard deviation s are the same as the units of the original data values.
• The standard deviation is a measure of variation of all values from the mean.

Population Standard Deviation
$$ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} $$
This formula is similar to Formula 2-4, but the population mean $\mu$ and the population size $N$ are used instead.

Variance
• The variance of a set of values is a measure of variation equal to the square of the standard deviation.
• Sample variance $s^2$: square of the sample standard deviation $s$
• Population variance $\sigma^2$: square of the population standard deviation $\sigma$

Variance - Notation
Standard deviation squared:
    $s^2$         Sample variance
    $\sigma^2$    Population variance

Round-off Rule for Measures of Variation
Carry one more decimal place than is present in the original set of data.
Round only the final answer, not values in the middle of a calculation.

Example: How to compare the variability in heights and weights of men?
Sample: 40 males were randomly selected. The summarized statistics are given below.

                                       Height       Weight
    Sample mean ($\bar{x}$)            68.34 in     172.55 lb
    Sample standard deviation ($s$)     3.02 in      26.33 lb

Definition
The coefficient of variation (or CV) for a set of sample or population data, expressed as a percent, describes the standard deviation relative to the mean.
Sample:
$$ CV = \frac{s}{\bar{x}} \cdot 100\% $$
Population:
$$ CV = \frac{\sigma}{\mu} \cdot 100\% $$
• A good measure for comparing variation between populations
• Because the CV has no units, it makes it possible to compare apples and pears.

Solution: Use the CV to compare the variability.
Heights:
$$ CV = \frac{s}{\bar{x}} \cdot 100\% = \frac{3.02}{68.34} \cdot 100\% = 4.42\% $$
Weights:
$$ CV = \frac{s}{\bar{x}} \cdot 100\% = \frac{26.33}{172.55} \cdot 100\% = 15.26\% $$
Conclusion: Heights (with CV = 4.42%) have considerably less variation than weights (with CV = 15.26%).

Standard Deviation from a Frequency Distribution
Formula 2-6:
$$ s = \sqrt{\frac{n\left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n - 1)}} $$
Use the class midpoints as the x values.

Example: Number of TV Sets Owned by Households
• A random sample of 80 households was selected.
• The number of TV sets owned by each household is given below.

    TV sets (x)    # of households (f)     fx     fx^2
         0                  4               0       0
         1                 33              33      33
         2                 28              56     112
         3                 10              30      90
         4                  5              20      80
       Total               80             139     315

Compute: (a) the sample mean (b) the sample standard deviation
(a)
$$ \bar{x} = \frac{139}{80} = 1.7 \text{ sets} $$
(b)
$$ s = \sqrt{\frac{n \sum (f x^2) - \left(\sum f x\right)^2}{n(n - 1)}} = \sqrt{\frac{80(315) - (139)^2}{80(80 - 1)}} = \sqrt{\frac{5879}{6320}} = 1.0 \text{ sets} $$

Estimation of Standard Deviation: Range Rule of Thumb
For estimating a value of the standard deviation s, use
$$ s \approx \frac{\text{range}}{4} $$
where range = (highest value) - (lowest value).

For interpreting a known value of the standard deviation s, find rough estimates of the minimum and maximum "usual" values by using:
    Minimum "usual" value ≈ (mean) - 2 × (standard deviation)
    Maximum "usual" value ≈ (mean) + 2 × (standard deviation)

The Empirical Rule
Definition
Empirical (68-95-99.7) Rule
For data sets having a distribution that is approximately bell shaped, the following properties apply:
• About 68% of all values fall within 1 standard deviation of the mean
• About 95% of all values fall within 2 standard deviations of the mean
• About 99.7% of all values fall within 3 standard deviations of the mean

[FIGURE 2-13: The Empirical Rule]

Recap
In this section we have looked at:
• Range
• Standard deviation of a sample and population
• Variance of a sample and population
• Coefficient of variation (CV)
• Standard deviation using a frequency distribution
• Range Rule of Thumb
• Empirical Rule

Homework Assignment 4
• Problems 2.5: 1, 3, 7, 9, 11, 17, 23, 25, 27, 31
• Read: Section 2.6: Measures of Relative Standing.
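
Checking the worked examples numerically
The worked examples in this lecture (Formulas 2-4, 2-5, and 2-6, the coefficient of variation, and the Range Rule of Thumb) can be checked with a few lines of Python. This is a minimal sketch and not part of the original slides: the function names below are hypothetical, chosen only for illustration, and nothing beyond the standard library is assumed.

```python
import math


def sample_sd(data):
    """Formula 2-4: sqrt( sum((x - mean)^2) / (n - 1) )."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def sample_sd_shortcut(data):
    """Formula 2-5: sqrt( (n * sum(x^2) - (sum x)^2) / (n * (n - 1)) )."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


def cv_percent(sd, mean):
    """Coefficient of variation: standard deviation relative to the mean, in percent."""
    return sd / mean * 100


def grouped_sd(values, freqs):
    """Formula 2-6: sample standard deviation from a frequency distribution."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))


def estimate_sd_from_range(data):
    """Range Rule of Thumb (estimating s): s is roughly range / 4."""
    return (max(data) - min(data)) / 4


def usual_range(mean, sd):
    """Range Rule of Thumb (interpreting s): mean - 2*sd and mean + 2*sd."""
    return mean - 2 * sd, mean + 2 * sd


# Publix waiting times (minutes): both formulas give sqrt(21).
waits = [1, 4, 10]
print(round(sample_sd(waits), 1))           # 4.6
print(round(sample_sd_shortcut(waits), 1))  # 4.6

# Heights vs. weights of the 40 sampled males.
print(round(cv_percent(3.02, 68.34), 2))    # 4.42
print(round(cv_percent(26.33, 172.55), 2))  # 15.26

# TV sets owned by 80 households.
print(round(grouped_sd([0, 1, 2, 3, 4], [4, 33, 28, 10, 5]), 1))  # 1.0
```

Following the round-off rule, the sketch rounds only when printing and keeps full precision in every intermediate step.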