251solnG3B 1/31/08 (Open this document in 'Page Layout' view!)

G. Measures of Dispersion and Asymmetry.

1. Range
   Downing & Clark, problem 7 above (Use data to find IQR). Review solutions and terms on page 41 (36 in 3rd ed.) of Downing & Clark.
2. The Variance and Standard Deviation of Ungrouped Data.
   Text exercises 3.1b, 3.2b, 3.6, 3.37, 3.24 [3.1b, 3.2b, 3.7, 3.37, 3.23] (3.1b, 3.2b, 3.7, 3.23, 3.33)
3. The Variance and Standard Deviation of Grouped Data.
   Text exercises 3.28, 3.30 (3.68, 3.70) (work 3.30 in thousands), Downing & Clark pg 42 or 37, problems 6, 7 (Find sample standard deviation – hint: run problem 6 in hundreds) (Note that you can use the Excel or Minitab techniques in the graded assignment to compute and sum the $fx$ and $fx^2$ columns in problems 6 and 7.), Problems G1, G2. Graded Assignment 1
4. Skewness and Kurtosis.
   Find the standard deviation, coefficient of variation and measures of skewness in Text problem 3.1, 3.2. Problems G3A, G4 (See 251wrksht).
5. Review
   a. Grouped Data.
   b. Ungrouped Data.

Part of Section 4 is in this document.
----------------------------------------------------------------------------------------------------------------------------

Problem G3A: Use computational formulas on the data below. Consider the data a sample.
a. Complete the cumulative frequency under F.
b. Calculate the mean.
c. Calculate the median.
d. Calculate the mode.
e. Calculate the variance.
f. Calculate the interquartile range.
g. Calculate the standard deviation.
h. Calculate a statistic showing skewness.
i. Show all the data presented on a histogram with five class intervals.
j. Put a box plot below the histogram.
Now repeat Problem G3A using definitional formulas.

Class          x     f     F     xf    $x^2 f$   $x^3 f$
0 - 9.999            50
10 - 19.999          50
20 - 29.999         100
30 - 39.999         150
40 - 49.999          50

Solution: Fill in the table. Note that the conventional way of writing the headings is Class, $x$, $f$, $F$, $fx$, $fx^2$ and $fx^3$. We use computational formulas first.
So if $x = 5$ and $f = 50$, $fx = 50(5) = 250$, $fx^2 = 250(5) = 1250$ and $fx^3 = 1250(5) = 6250$.

Class          x (midpoint)     f      F       xf     $x^2 f$     $x^3 f$
0 - 9.999            5          50     50      250       1250        6250
10 - 19.999         15          50    100      750      11250      168750
20 - 29.999         25         100    200     2500      62500     1562500
30 - 39.999         35         150    350     5250     183750     6431250
40 - 49.999         45          50    400     2250     101250     4556250
Total                          400           11000     360000    12725000

To summarize our results, $\sum f = n = 400$, $\sum fx = 11000$, $\sum fx^2 = 360000$ and $\sum fx^3 = 12725000$.

a. Complete the cumulative frequency under F: (See above.) We add down the $f$ column.

b. Calculate the mean: $\bar{x} = \frac{\sum fx}{n} = \frac{11000}{400} = 27.50$

c. Calculate the median: To get a measure of position in grouped data, first use position $= p(n+1)$, then use $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$ to find the value, where $L_p$ is the lower limit of the class containing the position, $F$ is the cumulative frequency below that class, $f_p$ is the frequency of the class, $w$ is the class width and $N = n = 400$. Here $p = .5$, so $p(n+1) = .5(401) = 200.50$. This location is above 200 and below 350, so use 30 to 39.999. Then $x_{.5} = 30 + \left[\frac{.5(400) - 200}{150}\right] 10 = 30.00$.

d. Calculate the mode: The group 30 - 39.999 has a frequency of 150, which is the largest frequency. So the mode is 35.00, its midpoint.

e. Calculate the variance: $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{360000 - 400(27.50)^2}{399} = 144.1103$

f. Calculate the interquartile range: For the first quartile, position $= p(n+1) = .25(401) = 100.25$. This location is above 100 and below 200, so use 20 to 29.999. Then, using $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$, we find $x_{.75} = Q_1 = 20 + \left[\frac{.25(400) - 100}{100}\right] 10 = 20.00$.
For the third quartile, position $= p(n+1) = .75(401) = 300.75$. This location is above 200 and below 350, so use 30 to 39.999. Then we find $x_{.25} = Q_3 = 30 + \left[\frac{.75(400) - 200}{150}\right] 10 = 36.67$.
So $IQR = Q_3 - Q_1 = 36.67 - 20.00 = 16.67$.

g. Calculate the standard deviation: $s = \sqrt{\text{variance}} = \sqrt{144.1103} = 12.005$. Note also the coefficient of variation $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{12.005}{27.50} = 0.4365$.

h. Calculate a statistic showing skewness: There are three possibilities:
1) $k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{400}{(399)(398)}\left[12725000 - 3(27.5)(360000) + 2(400)(27.5)^3\right] = .00251886\left[12725000 - 29700000 + 16637500\right] = .00251886(-337500) = -850.115$.
2) $g_1 = \frac{k_3}{s^3} = \frac{-850.115}{\left(\sqrt{144.1103}\right)^3} = \frac{-850.115}{1729.98580} = -0.4914$.
3) Pearson’s Measure of Skewness $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(27.5 - 35)}{12.005} = -1.874$.
Only one of these three is needed. All are negative, indicating skewness to the left.

i. Show all the data presented on a histogram with five class intervals.

j. Put a box plot below the histogram. The box will begin at 20 and end at 36.67 with a band at 30 to indicate the median. (Include a hand-drawn solution to i and j.)

Now we do the problem using definitional formulas. Note how much bigger the table has to be! Once again, the conventional headings would be Class, $x$, $f$, $fx$, $x - \bar{x}$, $f(x - \bar{x})$, $f(x - \bar{x})^2$ and $f(x - \bar{x})^3$. There is no reason to use both the computational and the definitional methods unless specifically requested, though, of course, one method serves as a check on the other.

Class          x (midpoint)     f      xf      F    $x - \bar{x}$   $(x - \bar{x})f$   $(x - \bar{x})^2 f$   $(x - \bar{x})^3 f$
0 - 9.999            5          50     250     50       -22.5           -1125              25312.5             -569531.25
10 - 19.999         15          50     750    100       -12.5            -625               7812.5              -97656.25
20 - 29.999         25         100    2500    200        -2.5            -250                625.0               -1562.50
30 - 39.999         35         150    5250    350         7.5            1125               8437.5               63281.25
40 - 49.999         45          50    2250    400        17.5             875              15312.5              267968.75
Total                          400   11000                                  0              57500.0             -337500.00

We can summarize the table as follows: $\sum f = n = 400$, $\sum fx = 11000$, $\sum f(x - \bar{x})^2 = 57500$, $\sum f(x - \bar{x})^3 = -337500$.

e. Calculate the variance: $s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{57500}{399} = 144.1103$

h. Calculate a statistic showing skewness: $k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{400}{(399)(398)}(-337500) = -850.1152$

Other calculations are the same as in the computational solution above.

Problem G4: For the sample below, compute the following:
b. the mean
c. the median (hint: put in order first!)
d. the mode
e. the variance
f. the interquartile range
g. the standard deviation
h. a statistic showing skewness.

1, 2, 4, 5, 6, 3, 3, 7, 8, 3, 1, 2

Solution: Both the computational and definitional methods are shown. There is no reason to do both unless specifically requested.
Computational Method

   x     $x^2$    $x^3$
   1       1        1
   2       4        8
   4      16       64
   5      25      125
   6      36      216
   3       9       27
   3       9       27
   7      49      343
   8      64      512
   3       9       27
   1       1        1
   2       4        8
  45     227     1359

Definitional Method

   x    $x - \bar{x}$   $(x - \bar{x})^2$   $(x - \bar{x})^3$
   1       -2.75            7.5625            -20.79688
   2       -1.75            3.0625             -5.35938
   4        0.25            0.0625              0.01563
   5        1.25            1.5625              1.95313
   6        2.25            5.0625             11.39063
   3       -0.75            0.5625             -0.42188
   3       -0.75            0.5625             -0.42188
   7        3.25           10.5625             34.32813
   8        4.25           18.0625             76.76563
   3       -0.75            0.5625             -0.42188
   1       -2.75            7.5625            -20.79688
   2       -1.75            3.0625             -5.35938
  45        0.00           58.2500             70.87499

(The last column total is 70.875 exactly; 70.87499 reflects rounding in the individual entries.)

x in order

index        1   2   3   4   5   6   7   8   9   10   11   12
x in order   1   1   2   2   3   3   3   4   5    6    7    8

So $n = 12$, $\sum x = 45$, $\sum x^2 = 227$, $\sum x^3 = 1359$, $\sum (x - \bar{x})^2 = 58.250$, and $\sum (x - \bar{x})^3 = 70.875$.

b. the mean: $\bar{x} = \frac{\sum x}{n} = \frac{45}{12} = 3.75$

c. the median (hint: put in order first!): To get a measure of position, first use position $= p(n+1) = .5(13) = 6.5$. This implies that we want the mean of $x_6$ and $x_7$: $x_{.5} = \frac{3 + 3}{2} = 3$. We can also use the method for finding any fractile. Position $= p(n+1) = .5(13) = 6.5 = a.b$. From this we get $a = 6$ and $.b = .5$. Now use $x_{1-p} = x_a + .b(x_{a+1} - x_a)$. So $x_{.50} = x_6 + .5(x_7 - x_6) = 3 + .5(3 - 3) = 3$. The two ways of finding the median of ungrouped data always give identical results.

d. the mode: The mode is 3, since that value appears most often.

e. the variance:
i) Computational Formula: $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{227 - 12(3.75)^2}{11} = 5.29545$
ii) Definitional Formula: $s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{58.2500}{11} = 5.29545$
You need only one of these two. I strongly recommend the first one.

f. the interquartile range:
First Quartile: position $= p(n+1) = .25(13) = 3.25$. From this we get $a = 3$ and $.b = .25$. Now use $x_{1-p} = x_a + .b(x_{a+1} - x_a)$. So $Q_1 = x_{.75} = x_3 + .25(x_4 - x_3) = 2 + .25(2 - 2) = 2$.
Third Quartile: position $= p(n+1) = .75(13) = 9.75$. From this we get $a = 9$ and $.b = .75$. Now use $x_{1-p} = x_a + .b(x_{a+1} - x_a)$. So $Q_3 = x_{.25} = x_9 + .75(x_{10} - x_9) = 5 + .75(6 - 5) = 5.75$.
$IQR = Q_3 - Q_1 = 5.75 - 2 = 3.75$

g. the standard deviation: $s = \sqrt{\text{variance}} = \sqrt{5.29545} = 2.30119$. Note also the coefficient of variation $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{2.30119}{3.75} = 0.6137$.

h. a statistic showing skewness.
There are four possibilities:
i) Computational Formula for Skewness: $k_3 = \frac{n}{(n-1)(n-2)}\left[\sum x^3 - 3\bar{x}\sum x^2 + 2n\bar{x}^3\right] = \frac{12}{(11)(10)}\left[1359 - 3(3.75)(227) + 2(12)(3.75)^3\right] = \frac{12}{(11)(10)}\left[1359 - 2553.75 + 1265.625\right] = \frac{12}{(11)(10)}(70.875) = 7.73182$.
ii) Definitional Formula for Skewness: $k_3 = \frac{n}{(n-1)(n-2)}\sum (x - \bar{x})^3 = \frac{12}{(11)(10)}(70.875) = 7.73182$.
iii) Relative Skewness: $g_1 = \frac{k_3}{s^3} = \frac{7.73182}{(2.30119)^3} = 0.6345$.
iv) Pearson’s Measure of Skewness: $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(3.75 - 3)}{2.30119} = 0.9778$.
Only one of these four is needed. All of these are positive, indicating skewness to the right.
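
The hand computations in Problems G3A and G4 can be checked by machine. What follows is a minimal Python sketch and is not part of the original solution: the function names (`summary_from_sums`, `grouped_summary`, `ungrouped_summary`, `grouped_fractile`, `ungrouped_fractile`) are made up for illustration. It assumes the computational formulas for $s^2$ and $k_3$, the $p(n+1)$ position rule, and the $3(\text{mean} - \text{mode})$ form of Pearson's measure used in this document, with the mode read off by hand.

```python
import math


def summary_from_sums(n, sum_x, sum_x2, sum_x3):
    """Mean, s^2, s, k3 and g1 from the computational-formula sums."""
    mean = sum_x / n
    var = (sum_x2 - n * mean**2) / (n - 1)
    k3 = n / ((n - 1) * (n - 2)) * (sum_x3 - 3 * mean * sum_x2 + 2 * n * mean**3)
    return {"n": n, "mean": mean, "var": var, "sd": math.sqrt(var),
            "k3": k3, "g1": k3 / var**1.5}


def grouped_summary(midpoints, freqs):
    """Grouped data: every sum is weighted by the class frequency f."""
    pairs = list(zip(midpoints, freqs))
    return summary_from_sums(
        sum(f for _, f in pairs),
        sum(f * x for x, f in pairs),
        sum(f * x**2 for x, f in pairs),
        sum(f * x**3 for x, f in pairs),
    )


def ungrouped_summary(data):
    """Ungrouped data: the same formulas with every f equal to 1."""
    return grouped_summary(data, [1] * len(data))


def grouped_fractile(lower_limits, width, freqs, p):
    """x_{1-p} = L_p + ((pN - F) / f_p) * w, class found from position p(n+1)."""
    n = sum(freqs)
    position = p * (n + 1)
    below = 0  # cumulative frequency F of the classes below the current one
    for low, f in zip(lower_limits, freqs):
        if position <= below + f:
            return low + (p * n - below) / f * width
        below += f
    raise ValueError("position p(n+1) falls outside the classes")


def ungrouped_fractile(data, p):
    """x_{1-p} = x_a + .b (x_{a+1} - x_a), where position p(n+1) = a.b."""
    xs = sorted(data)
    position = p * (len(xs) + 1)
    a = int(position)   # integer part a
    b = position - a    # fractional part .b
    if not 1 <= a < len(xs):
        raise ValueError("position p(n+1) must lie between 1 and n")
    return xs[a - 1] + b * (xs[a] - xs[a - 1])


# Problem G3A (grouped data; midpoints are lower limit + 5).
lows, freqs = [0, 10, 20, 30, 40], [50, 50, 100, 150, 50]
g3a = grouped_summary([low + 5 for low in lows], freqs)
g3a_q1 = grouped_fractile(lows, 10, freqs, 0.25)
g3a_median = grouped_fractile(lows, 10, freqs, 0.50)
g3a_q3 = grouped_fractile(lows, 10, freqs, 0.75)
g3a_sk = 3 * (g3a["mean"] - 35) / g3a["sd"]  # mode 35 taken from the table

# Problem G4 (ungrouped data).
sample = [1, 2, 4, 5, 6, 3, 3, 7, 8, 3, 1, 2]
g4 = ungrouped_summary(sample)
g4_q1 = ungrouped_fractile(sample, 0.25)
g4_median = ungrouped_fractile(sample, 0.50)
g4_q3 = ungrouped_fractile(sample, 0.75)
g4_sk = 3 * (g4["mean"] - 3) / g4["sd"]  # mode 3 taken by inspection

print(g3a, g3a_q1, g3a_median, g3a_q3, g3a_sk)
print(g4, g4_q1, g4_median, g4_q3, g4_sk)
```

Treating ungrouped data as grouped data with every frequency equal to 1 keeps a single copy of the variance and skewness formulas, which mirrors the way the two problems use the same computational formulas with and without $f$.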