251y0815

251y0815 6/11/08

ECO251 QBA1
FIRST EXAM
June 11, 2008

Name: ___KEY_________________
Student Number: _________________________
Class Hour: _____________________

Remember: neatness, or at least legibility, counts. In most non-multiple-choice questions an answer needs a calculation or short explanation to count. Show your work! The exam is normed on 50 points, so that any grade above 48 is an A+ and grades wrap around.

Part I. (7 points) The following numbers are to be considered a sample of prices of gasoline taken at five gas stations.

4.14   4.19   4.25   4.01   3.99

Compute the following: Show your work!
a) The Median (1)
b) The Standard Deviation (3)
c) The 3rd quintile (2)
d) The Coefficient of variation (1)
Extra credit beyond this point!
e) The harmonic mean (1.5)
f) The root-mean square (1.5)
g) The geometric mean (3 ways!) (3)

Solution: My calculations are below. $\log x$ is a logarithm to the base 10 and $\ln x$ is a natural logarithm.

Row    $x$     $x$ in order   $x^2$     $1/x$      $\log x$   $\ln x$
1      4.14    3.99           17.1396   0.241546   0.617000   1.42070
2      4.19    4.01           17.5561   0.238663   0.622214   1.43270
3      4.25    4.14           18.0625   0.235294   0.628389   1.44692
4      4.01    4.19           16.0801   0.249377   0.603144   1.38879
5      3.99    4.25           15.9201   0.250627   0.600973   1.38379
Sum    20.58                  84.7584   1.21551    3.07172    7.07290

a) The Median (1): The median is the middle number when the data is in order: 4.14.

b) The Standard Deviation (3): We have $\sum x = 20.58$, $\sum x^2 = 84.7584$ and $n = 5$, so $\bar{x} = \frac{\sum x}{n} = \frac{20.58}{5} = 4.116$ and
$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{84.7584 - 5(4.116)^2}{4} = \frac{0.05112}{4} = 0.01278$.
So that $s = \sqrt{0.01278} = 0.113049$.

If you chose to annoy me by using the definitional formula, you should have gotten the following.

Row    $x$     $x - \bar{x}$   $(x - \bar{x})^2$
1      4.14     0.024          0.000576
2      4.19     0.074          0.005476
3      4.25     0.134          0.017956
4      4.01    -0.106          0.011236
5      3.99    -0.126          0.015876
Sum    20.58    0.000          0.05112

You would have $\sum x = 20.58$, $\bar{x} = \frac{\sum x}{n} = \frac{20.58}{5} = 4.116$ and $\sum (x - \bar{x})^2 = 0.05112$, so
$s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{0.05112}{4} = 0.01278$.
So that $s = \sqrt{0.01278} = 0.113049$.

c) The 3rd quintile (2): The third quintile has $\frac{3}{5} = .60$ of the data below it. The data must be in order. $\text{location} = p(n+1) = .60(6) = 3.60 = a.b$, so $a = 3$ and $.b = 0.60$.
$x_{1-p} = x_{.40} = x_3 + 0.60(x_4 - x_3) = 4.14 + 0.60(4.19 - 4.14) = 4.17$

d) The Coefficient of variation (1): $C = \frac{s}{\bar{x}} = \frac{0.113049}{4.116} = 0.0275$ or 2.75%.

Extra credit beyond this point!

e) The harmonic mean (1.5): The formula table says $\frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x}$, or
$\frac{1}{x_h} = \frac{1}{5}\left(\frac{1}{4.14} + \frac{1}{4.19} + \frac{1}{4.25} + \frac{1}{4.01} + \frac{1}{3.99}\right) = \frac{1}{5}(1.21551) = 0.243102$.
So $x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.243102} = 4.1135$.

Of course some of you decided that
$\frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{5}\left(\frac{1}{4.14} + \frac{1}{4.19} + \frac{1}{4.25} + \frac{1}{4.01} + \frac{1}{3.99}\right) \overset{?}{=} \frac{1}{5}\left(\frac{1}{4.14 + 4.19 + 4.25 + 4.01 + 3.99}\right) = \frac{1}{5}\left(\frac{1}{20.58}\right) = 0.009718$.
This is, of course, an easier way to do the problem. It is also wrong and unreasonable (since it is not between 3.99 and 4.25), and you will get an A for the course if you can prove to me that it is not wrong! And please don't try any math if you get on "Are you smarter than a fifth-grader."

f) The root-mean square (1.5): The formula table says $x_{rms}^2 = \frac{1}{n}\sum x^2$, or $x_{rms} = \sqrt{\frac{1}{n}\sum x^2}$.
$x_{rms}^2 = \frac{84.7584}{5} = 16.95168$. So $x_{rms} = \sqrt{16.95168} = 4.11724$.

g) The geometric mean (3): The formula table says $x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \sqrt[n]{\prod x}$. So
$x_g = \sqrt[5]{(4.14)(4.19)(4.25)(4.01)(3.99)} = \sqrt[5]{1179.56143} = (1179.56143)^{0.2} = 4.11476$.
The formula table also says $\ln x_g = \frac{1}{n}\sum \ln(x)$, but I said in class that this could be either natural logs or logs to the base 10.
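As a cross-check on the three routes to the geometric mean just named (the nth root of the product, base-10 logs, natural logs), here is a minimal Python sketch. It uses only the standard library; the variable names are illustrative and not from the exam.

```python
import math

# Sample of five gasoline prices from Part I.
prices = [4.14, 4.19, 4.25, 4.01, 3.99]
n = len(prices)

# Way 1: n-th root of the product of the observations.
gm_product = math.prod(prices) ** (1 / n)

# Way 2: average the base-10 logs, then raise 10 to that power.
gm_log10 = 10 ** (sum(math.log10(x) for x in prices) / n)

# Way 3: average the natural logs, then exponentiate.
gm_ln = math.exp(sum(math.log(x) for x in prices) / n)

print(gm_product, gm_log10, gm_ln)
```

All three values should agree up to floating-point rounding, which is the point of the "3 ways" in the question.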
If we use logarithms to the base 10 we get $\log x_g = \frac{1}{n}\sum \log(x) = \frac{3.07172}{5} = 0.614344$ and $x_g = 10^{0.614344} = 4.11476$. If we use natural logarithms we get $\ln x_g = \frac{1}{n}\sum \ln(x) = \frac{7.07290}{5} = 1.41458$ and $x_g = e^{1.41458} = 4.11476$.

Part II. (18 points) According to Anderson, Sweeney and Williams a bank found the following as a sample of 30 waiting times (in seconds) for service.

Row   Time         Frequency
1     60-119.99    6
2     120-179.99   10
3     180-239.99   8
4     240-299.99   4
5     300-359.99   2

a. Calculate the Cumulative Frequency (1)
b. Calculate the Mean (1)
c. Calculate the Median (2)
d. Calculate the Mode (1)
e. Calculate the Variance (3)
f. Calculate the Standard Deviation (2)
g. Calculate the Interquartile Range (3)
h. Calculate a Statistic showing Skewness and interpret it (3)
i. Make a frequency polygon of the data (Neatness Counts!) (2)

Solution: Note that unreasonable answers are answers where the mean, median, mode, first quartile and third quartile do not fall between 60 and 360.

If we use the computational method, we get the following. $x$ is the midpoint of the class.

Row   Class        $f$   $F$   $x$   $fx$   $fx^2$    $fx^3$
1     60-119.99    6     6     90    540    48600     4374000
2     120-179.99   10    16    150   1500   225000    33750000
3     180-239.99   8     24    210   1680   352800    74088000
4     240-299.99   4     28    270   1080   291600    78732000
5     300-359.99   2     30    330   660    217800    71874000
Sum                30                5460   1135800   262818000

If we use the definitional method, we get the following. $x$ is the midpoint of the class. I usually tell people that they are wasting their time if they use the definitional method. Because of the large numbers here that may not be true.

Row   Class        $f$   $F$   $x$   $fx$   $x - \bar{x}$   $f(x - \bar{x})$   $f(x - \bar{x})^2$   $f(x - \bar{x})^3$
1     60-119.99    6     6     90    540    -92             -552               50784                -4672128
2     120-179.99   10    16    150   1500   -32             -320               10240                -327680
3     180-239.99   8     24    210   1680   28              224                6272                 175616
4     240-299.99   4     28    270   1080   88              352                30976                2725888
5     300-359.99   2     30    330   660    148             296                43808                6483584
Sum                30                5460                   0                  142080               4385280

If you used the computational method, you would have gotten $n = \sum f = 30$ and $\sum fx = 5460$, so that the mean is $\bar{x} = \frac{\sum fx}{n} = \frac{5460}{30} = 182.0000$. You would also find $\sum fx^2 = 1135800$ and $\sum fx^3 = 262818000$.

If you used the definitional method, you would have gotten $n = \sum f = 30$ and $\sum fx = 5460$, so that the mean is $\bar{x} = \frac{\sum fx}{n} = \frac{5460}{30} = 182.0000$. You would have followed by getting $\sum f(x - \bar{x}) = 0$ (a check), $\sum f(x - \bar{x})^2 = 142080$ and $\sum f(x - \bar{x})^3 = 4385280$.

If you used one of Pearson's measures of skewness, you would not have bothered with the $f(x - \bar{x})^3$ or the $fx^3$ columns.

a. Calculate the Cumulative Frequency (1): See the $F$ column above.

b. Calculate the Mean (1): We have already found $\bar{x} = 182.0000$.

Remember $n = \sum f = 30$, $\sum fx = 5460$, $\bar{x} = \frac{\sum fx}{n} = \frac{5460}{30} = 182.0000$, $\sum fx^2 = 1135800$, $\sum fx^3 = 262818000$, $\sum f(x - \bar{x}) = 0$, $\sum f(x - \bar{x})^2 = 142080$ and $\sum f(x - \bar{x})^3 = 4385280$.

c. Calculate the Median (2): $\text{position} = p(n+1) = .5(31) = 15.5$. This is above $F = 6$ and below $F = 16$, so the interval is the 2nd, 120 to 180, which has a frequency of 10. Each interval width is 180 - 120 = 60.
$x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$, so $x_{1-.5} = x_{.5} = 120 + \left[\frac{.5(30) - 6}{10}\right] 60 = 120 + 0.9(60) = 174$.
Check: this is between 120 and 180.

d. Calculate the Mode (1): The largest group is 120 to 180, which has a frequency of 10, so by convention the mode is its midpoint, which is $mo = 150$.

e. Calculate the Variance (3): We have $\sum fx^2 = 1135800$ and $\bar{x} = 182.0000$, or $\sum f(x - \bar{x})^2 = 142080$.
$s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{1135800 - 30(182.0000)^2}{29} = \frac{142080}{29} = 4899.3103$
or
$s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{142080}{29} = 4899.3103$.

f.
Calculate the Standard Deviation (2): $s = \sqrt{4899.3103} = 69.9951$.

g. Calculate the Interquartile Range (3): Note that to be reasonable, $Q_1 \le x_{.50} \le Q_3$.

First Quartile: $\text{position} = p(n+1) = .25(31) = 7.75$. This is above $F = 6$ and below $F = 16$, so the interval is the 2nd, 120 to 180, which has a frequency of 10. Each interval width is 180 - 120 = 60.
$x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$ gives us $Q_1 = x_{1-.25} = x_{.75} = 120 + \left[\frac{.25(30) - 6}{10}\right] 60 = 120 + 0.15(60) = 129.00$.

Third Quartile: $\text{position} = p(n+1) = .75(31) = 23.25$. This is above $F = 16$ and below $F = 24$, so the interval is the 3rd, 180 to 240, which has a frequency of 8.
$Q_3 = x_{1-.75} = x_{.25} = 180 + \left[\frac{.75(30) - 16}{8}\right] 60 = 180 + 0.8125(60) = 228.75$.

So $IQR = Q_3 - Q_1 = 228.75 - 129.00 = 99.75$.

h. Calculate a Statistic showing Skewness and interpret it (3): Remember $n = \sum f = 30$, $\sum fx^2 = 1135800$, $\sum fx^3 = 262818000$, $\bar{x} = 182.0000$, $x_{.5} = 174$, $mo = 150$, $s = \sqrt{4899.3103} = 69.9951$, $\sum f(x - \bar{x}) = 0$, and $\sum f(x - \bar{x})^3 = 4385280$.

$k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{30}{(29)(28)}\left[262818000 - 3(182.000)(1135800) + 2(30)(182.000)^3\right]$
$= 0.0369458\left[262818000 - 620146800 + 361714080\right] = 0.0369458(4385280) = 162017.734$
or
$k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{30}{(29)(28)}(4385280) = 162017.734$
or
$g_1 = \frac{k_3}{s^3} = \frac{162017.734}{69.9951^3} = 0.4725$
or Pearson's Measure of Skewness
$SK_1 = \frac{\text{mean} - \text{mode}}{\text{std. deviation}} = \frac{182 - 150}{69.9951} = 0.4572$
or
$SK_2 = \frac{3(\text{mean} - \text{median})}{\text{std. deviation}} = \frac{3(182 - 174)}{69.9951} = 0.3429$.

Because of the positive sign, the measures all imply (slight) skewness to the right.

i.
Make a frequency polygon of the data (Neatness Counts!) (2)

Row   Class        $f$   $x$
0     0-59.99      0     30
1     60-119.99    6     90
2     120-179.99   10    150
3     180-239.99   8     210
4     240-299.99   4     270
5     300-359.99   2     330
6     360-419.99   0     390

The seven points on your graph should be (30, 0), (90, 6), (150, 10), (210, 8), (270, 4), (330, 2) and (390, 0).

Part III. Multiple choice (12 points). Note: If you say 'None of the above,' you should supply a correct answer to get full credit.

1. If a distribution is skewed to the right, the following must be true. (Hint: making a diagram first is a good way to prevent errors.)
a. Mean < median < mode
b. Median < mean < mode
c. *Mode < median < mean
d. Mode < mean < median
e. Mean = median = mode
f. None of the above.

2. If I have a population described as grouped data and I am using definitional formulas.
a. $\sum f(x - \mu)$ [relation symbol missing in source] $0$
b. $\sum f(x - \mu)$ [relation symbol missing in source] $0$
c. $\sum f(x - \mu)$ [relation symbol missing in source] $1$
d. *None of the above.
Solution: For the same reason that $\sum f(x - \bar{x}) = 0$ in Part II, this sum is zero. To do the mathematics,
$\sum f(x - \bar{x}) = \sum fx - \sum f\bar{x} = \sum fx - \bar{x}\sum f = \sum fx - n\bar{x} = \sum fx - n\left(\frac{\sum fx}{n}\right) = \sum fx - \sum fx = 0$.

3. Which of the following does not describe a population?
a. *$\bar{x}$
b. [symbol missing in source]
c. The coefficient of variation
d. [symbol missing in source]
e. Pearson's coefficient of skewness.
f. [symbol missing in source]
g. All of the above describe a population.

4. Mark the following items N (nominal), O (ordinal), I (interval) or R (ratio) data. If the data is interval or ratio data, would it be considered C (continuous) or D (discrete)? (4)
a) Likert Scale. The format of a typical five-level Likert item is: 1) Strongly disagree; 2) Disagree; 3) Neither agree nor disagree; 4) Agree; 5) Strongly agree. Answer: O
b) Next year's tuition (in dollars and cents). Answer: RC
c) Place of residence. Answer: N
d) Number of credit cards that you hold. Answer: RD

5. If you make a graph to represent a data set, the following should be plotted at class midpoints.
a.
An ogive
b. *A frequency polygon
c. A Pareto diagram
d. All of the above
e. None of the above

Part IV. (13+ points)

Table 1
Given below is a stem-and-leaf display for the amount of gas purchased at a service station.

 9 | 147
10 | 02238
11 | 125566777
12 | 223489
13 | 02

Minitab gives the following information. (SE Mean is the standard deviation divided by the square root of $n$.)

MTB > describe c1

Descriptive Statistics: x

Variable   n    Mean     SE Mean   StDev   Minimum   Q1    Median   Q3    Maximum
x          25   11.372   0.232     1.158   9.100     …     …        …     13.200

1. In Table 1, what is the median purchase? (2)
Solution: The way that I did these problems is to write out the indices of the numbers as below.

 9 | 147          $x_1$ to $x_3$
10 | 02238        $x_4$ to $x_8$
11 | 125566777    $x_9$ to $x_{17}$
12 | 223489       $x_{18}$ to $x_{23}$
13 | 02           $x_{24}$ to $x_{25}$

It may be clearer if I actually write out the numbers and their indices.

Index   1     2     3     4      5      6      7      8      9      10     11     12     13
Value   9.1   9.4   9.7   10.0   10.2   10.2   10.3   10.8   11.1   11.2   11.5   11.5   11.6

Index   14     15     16     17     18     19     20     21     22     23     24     25
Value   11.6   11.7   11.7   11.7   12.2   12.2   12.3   12.4   12.8   12.9   13.0   13.2

$\text{location} = p(n+1) = 0.5(26) = 13$, so $\text{median} = x_{13} = 11.6$.

2. Create a 5-number summary from Table 1. (4)
Solution: To find the first quartile, we write $\text{location} = p(n+1) = 0.25(26) = 6.5$. This implies that the first quartile is $x_6 + 0.5(x_7 - x_6) = 10.2 + 0.5(10.3 - 10.2) = 10.25$. For the third quartile, $\text{location} = p(n+1) = 0.75(26) = 19.5$. This implies that the third quartile is $x_{19} + 0.5(x_{20} - x_{19}) = 12.2 + 0.5(12.3 - 12.2) = 12.25$. The five-number summary is thus 9.1, 10.25, 11.6, 12.25, 13.2.

3. In Table 1, assume that you were asked to present the data in 4 classes. Using the method you learned in class, show how you would decide what class interval to use and list the classes below with their frequencies.
(4) [10]

      Class                  Frequency
A     ___ to under ___       ___
B     ___ to under ___       ___
C     ___ to under ___       ___
D     ___ to under ___       ___

Solution: The highest number is 13.2 and the lowest is 9.1. We calculate $\frac{13.2 - 9.1}{4} = 1.025$. Use 1.2 or 1.25.

If we use 1.25, we might get the following.

      Class                     Frequency
A     8.75 to under 10.00       3
B     10.00 to under 11.25      7
C     11.25 to under 12.50      11
D     12.50 to under 13.75      4
Total                           25

You could also start at 8.50.

If we use 1.2, we might get the following.

      Class                     Frequency
A     9.0 to under 10.2         4
B     10.2 to under 11.4        6
C     11.4 to under 12.6        11
D     12.6 to under 13.8        4
Total                           25

1.5 would work if you start at 8.

4. In Table 1, according to the Tchebyschev inequality, what is the minimum number of observations that should be between 9.056 and 13.688? What would the empirical rule say? Should the empirical rule apply here? Why? (4)

Variable   n    Mean     SE Mean   StDev   Minimum   Q1    Median   Q3    Maximum
x          25   11.372   0.232     1.158   9.100     …     …        …     13.200

The mean is 11.372 and 11.372 - (2)1.158 = 9.056, 11.372 + (2)1.158 = 13.688. These points are thus 2 standard deviations from the mean. The inequality says that at most $\frac{1}{2^2} = \frac{1}{4}$ of the points should be below 9.056 or above 13.688. So at least 75% of the points should be between these numbers. This is at least 19 points. The empirical rule says that about 95% of the observations should be between these two points. This is about 24 points. The stem-and-leaf diagram shows that the data is roughly symmetrical, so we would expect this to be almost true. Actually all the 25 points are in the interval. As usual the inequality gives us an underestimate of the number of the points in the interval.

5. Which of the following are not sensitive to extreme values? (Circle all correct answers.) (3)
a. *The mode
b. The mean
c. The variance
d. The coefficient of variation
e. $k_3$, the third k-statistic.
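The Part IV answers for Table 1 can be checked with a short script. This is a minimal Python sketch, assuming the 25 values are read off the stem-and-leaf display (stem = whole units, leaf = tenths); the helper `value_at` and the other names are hypothetical, written here only to mirror the $p(n+1)$ location rule and the Tchebyschev bound used in the solutions.

```python
import math
import statistics

# Table 1, read off the stem-and-leaf display (stem = whole units, leaf = tenths).
stems = {9: "147", 10: "02238", 11: "125566777", 12: "223489", 13: "02"}
data = sorted((10 * stem + int(leaf)) / 10
              for stem, leaves in stems.items() for leaf in leaves)
n = len(data)

def value_at(sorted_x, p):
    """Value at location p(n+1) = a.b, i.e. x_a + .b * (x_{a+1} - x_a).

    Hypothetical helper; assumes 1 <= p(n+1) <= n so both neighbours exist.
    """
    loc = p * (len(sorted_x) + 1)
    a = int(loc)        # whole part of the location (1-based index)
    b = loc - a         # fractional part
    if b == 0:
        return sorted_x[a - 1]
    return sorted_x[a - 1] + b * (sorted_x[a] - sorted_x[a - 1])

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = [data[0], value_at(data, 0.25), value_at(data, 0.50),
               value_at(data, 0.75), data[-1]]

# Tchebyschev: at least 1 - 1/k^2 of the points lie within k standard deviations.
k = 2
mean = statistics.mean(data)
s = statistics.stdev(data)      # sample standard deviation (n - 1 divisor)
chebyshev_min = math.ceil((1 - 1 / k**2) * n)
inside = sum(mean - k * s <= x <= mean + k * s for x in data)

print(five_number, chebyshev_min, inside)
```

The inequality only gives a floor on the count, so `inside` is expected to be at least `chebyshev_min`; comparing the two shows how conservative the bound is for this data set.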
