251y0011 10/11/00

ECO251 QBA1
FIRST HOUR EXAM
OCTOBER 7, 2000

Name: KEY
Section: MWF 10  11   TR 11  12:30

Part I. Multiple Choice (10 points)

1. (D7-1) The major contribution of inferential statistics is that it
   a. Allows us to take population information and make statements about samples.
   b. Gives us a description of data contained in a sample.
   c. Gives us a description of data contained in a population.
  *d. Allows us to take sample information and make statements about the population.
   e. None of the above.

2. (S-3) Debit balances owed in a retail store are an example of
   a. Ordinal data.
   b. Nominal data.
   c. Interval data.
  *d. Ratio data.
   e. None of the above.

3. A used automobile dealer lists cars in the following classes: A - 100,000 miles or more on the odometer, B - less than 100,000 miles on the odometer, C - Diesel. Are these three categories
   a. Mutually exclusive?
  *b. Collectively exhaustive?
   c. Both mutually exclusive and collectively exhaustive?
   d. Neither mutually exclusive nor collectively exhaustive?
   e. Can't tell with the information given.

4. (D7-9) If a distribution is skewed to the right, we can say that it is likely that
  *a. Mean > median > mode
   b. Median > mean > mode
   c. Mode > median > mean
   d. Mode > mean > median
   e. Mode = mean = median
   (Most people got this backwards - make a diagram!)

5. A graph that connects points, each of which represents the frequency f, is called a
   a. Histogram
   b. Ogive
  *c. Frequency Polygon
   d. Pie chart
   e. None of the above

Part II. Compute an appropriate answer, showing your work (except in a)) (15 Points maximum - if you do more than 15 points, only your right answers will be counted.):

a) Fill in the following table (3)

   Class       f     f rel   F
   50-59.99    __    .12     __
   60-69.99    4     __      __
   70-79.99    __    __      12
   80-89.99    6     __      __
   90-99.99    7     __      __
   Total       25    __

Solution:

   Class       f     f rel   F
   50-59.99    3     .12     3
   60-69.99    4     .16     7
   70-79.99    5     .20     12
   80-89.99    6     .24     18
   90-99.99    7     .28     25
   Total       25    1.00

Note that \( n = 25 \).
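The table in a) can be checked with a few lines of Python. This is only a sketch and not part of the exam; the variable names are illustrative assumptions, and the class frequencies are the ones from the solution table.

```python
# Class labels and frequencies f taken from the solution table in a).
classes = ["50-59.99", "60-69.99", "70-79.99", "80-89.99", "90-99.99"]
f = [3, 4, 5, 6, 7]

n = sum(f)                       # total number of observations
f_rel = [fi / n for fi in f]     # relative frequency = f / n

F = []                           # cumulative frequency = running total of f
running = 0
for fi in f:
    running += fi
    F.append(running)

# Print the completed table in the same column order: class, f, f rel, F.
for c, fi, fr, Fi in zip(classes, f, f_rel, F):
    print(f"{c:10s} {fi:3d} {fr:6.2f} {Fi:4d}")
print(f"{'Total':10s} {n:3d} {sum(f_rel):6.2f}")
```

The last cumulative frequency must equal n, and the relative frequencies must sum to 1, which is how the blanks in the table are recovered.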
b) Assume that we have sold 1000 life insurance policies in amounts between $5200 and $9800. If this data is to be presented in eight classes, what intervals would you use? Explain your reasoning using the appropriate formula and make a table showing the class intervals you would actually use. (3)

Solution: \( \frac{9800 - 5200}{8} = 575 \), so use 600. This is only a suggestion. Any number somewhat above 575 will work.

   Class   From    To
   A       5200    5799.99
   B       5800    6399.99
   C       6400    6999.99
   D       7000    7599.99
   E       7600    8199.99
   F       8200    8799.99
   G       8800    9399.99
   H       9400    9999.99

c) (S-30) If a population of 1000 items with an unknown distribution has a mean of 12 and a standard deviation of 1.2, (i) what is the approximate minimum number of items that must be between 6 and 18? (ii) What is the maximum that can be above 18? (3)

Solution: (i) If we use the formula \( z = \frac{x - \mu}{\sigma} \), we find that \( \frac{6 - 12}{1.2} = -5 \) and \( \frac{18 - 12}{1.2} = 5 \), so \( k = 5 \). According to the Chebyshef inequality, the minimum fraction of the data that must be between \( \mu \pm 5\sigma \) is \( 1 - \frac{1}{k^2} = 1 - \frac{1}{25} = \frac{24}{25} \). \( \frac{24}{25} \) of 1000 is 960. (ii) The answer is the complement of the answer to (i). There are about 1000 - 960 = 40 items left over. All of these could be above 18.

d) Do c) again assuming that the distribution is unimodal and symmetric. (2)

Solution: Since the Empirical Rule says that almost all points must be between \( \mu \pm 3\sigma \), we would expect almost all of the 1000 points to be between 6 and 18, since these points are \( \mu \pm 5\sigma \), and we would be quite surprised if even one point is above 18.

e) For the numbers 11.1, 13.2, 15.1 and 12.7, compute the (i) Root-mean-square, (ii) Harmonic mean, (iii) Geometric mean (2 each)

Solution: Note that \( \sum x = 52.1 \). This is not used in any of the following calculations and there is no reason why you should have computed it!

(i) The Root-Mean-Square.

\[ x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{4}\left(11.1^2 + 13.2^2 + 15.1^2 + 12.7^2\right) = \frac{1}{4}\left(123.21 + 174.24 + 228.01 + 161.29\right) = \frac{1}{4}(686.75) = 171.6875 \]

So \( x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{171.6875} = 13.103 \).

(ii) The Harmonic Mean.
\[ \frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{4}\left(\frac{1}{11.1} + \frac{1}{13.2} + \frac{1}{15.1} + \frac{1}{12.7}\right) = \frac{1}{4}\left(0.090090 + 0.075758 + 0.066225 + 0.078740\right) = \frac{1}{4}(0.310813) = 0.077703 \]

So \( x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.077703} = 12.8695 \).

(iii) The Geometric Mean.

\[ x_g = \left(x_1 \cdot x_2 \cdot x_3 \cdots x_n\right)^{1/n} = \sqrt[n]{\prod x} = \sqrt[4]{(11.1)(13.2)(15.1)(12.7)} = \sqrt[4]{28098.1404} = (28098.1404)^{0.25} = 12.9470 \]

Or

\[ \ln x_g = \frac{1}{n}\sum \ln(x) = \frac{1}{4}\left(\ln 11.1 + \ln 13.2 + \ln 15.1 + \ln 12.7\right) = \frac{1}{4}\left(2.40695 + 2.58022 + 2.71469 + 2.54160\right) = \frac{1}{4}(10.24346) = 2.56086 \]

So \( x_g = e^{2.56086} = 12.9470 \). I got the last result by putting 2.56086 into the calculator and pressing 'inverse' and then 'ln x.'

Or

\[ \log x_g = \frac{1}{n}\sum \log(x) = \frac{1}{4}\left(\log 11.1 + \log 13.2 + \log 15.1 + \log 12.7\right) = \frac{1}{4}\left(1.04532 + 1.12057 + 1.17898 + 1.10380\right) = \frac{1}{4}(4.44868) = 1.11217 \]

So \( x_g = 10^{1.11217} = 12.9470 \). I got the last result by putting 1.11217 into the calculator and pressing 'inverse' and then 'log x.'

Notice that the original numbers and all the means are between 11.1 and 15.1. In spite of everything that I said, there are many of you who think that: (i) You can find a sum of squares by summing numbers and squaring the sum; (ii) You can find the sum of \( \frac{1}{x} \) by adding up the numbers and taking the reciprocal; (iii) You can find an nth root by dividing by n. I can only recommend a remedial math class (unless, of course, you want to try listening in class and checking out the homework very carefully.)

Part III. Do the following problems (25 Points)

1. In a period of 7 days you make the following numbers of sales (in millions):

   Day:    1     2      3     4      5      6      7
   Sales:  9.2   10.2   9.2   11.2   19.5   12.2   13.2

Compute the following (assuming that the numbers are a sample):
a) Mean Sales (1)
b) The Median (1)
c) The Standard Deviation (3)
d) The 2nd Quintile (2)

   Index   x      x²        x - x̄    (x - x̄)²
   1       9.2    84.64     -2.9     8.41
   2       9.2    84.64     -2.9     8.41
   3       10.2   104.04    -1.9     3.61
   4       11.2   125.44    -0.9     0.81
   5       12.2   148.84    0.1      0.01
   6       13.2   174.24    1.1      1.21
   7       19.5   380.25    7.4      54.76
   Total   84.7   1102.09   0.0      77.22

Isn't it wonderful how predictable so many of you are!
I strongly recommended that you compute the variance by the computational formula in both this and the next problem. Many of you ignored me. Two thirds of those who used the definitional formula got the problem wrong because they had not checked out the method enough so that they knew what the formula meant. \( \sum x^2 \) is not \( \left(\sum x\right)^2 = (84.7)^2 \), as some of you seem to have fooled yourselves into believing. Nor is \( \sum (x - \bar{x})^2 \) equal to \( \left(\sum x - \bar{x}\right)^2 = (84.7 - 12.1)^2 \). If you had tried these in any of the homework problems, you would have found that these tricks didn't work. Note that, to be reasonable, the mean, median and 2nd quintile must fall between 9.2 and 19.5.

Solution: Compute the following. Note that x is in order.

\( n = 7 \), \( \sum x = 84.7 \), \( \sum x^2 = 1102.09 \), \( \sum (x - \bar{x}) = 0.00 \), \( \sum (x - \bar{x})^2 = 77.22 \).

a) \( \bar{x} = \frac{\sum x}{n} = \frac{84.7}{7} = 12.1 \)

b) Just put the numbers in order and pick the middle number, 11.2. Or formally: \( \text{position} = p(n+1) = a.b = .5(8) = 4.0 \) and \( x_{1-p} = x_a + .b\,(x_{a+1} - x_a) \), so \( x_{1-.5} = x_{.5} = x_4 + .0\,(x_5 - x_4) = 11.2 \).

c) \( s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{1102.09 - 7(12.1)^2}{6} = 12.87 \) or \( s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{77.22}{6} = 12.87 \), so \( s = \sqrt{12.87} = 3.58748 \).

d) The 2nd quintile has 40% below it. \( \text{position} = p(n+1) = a.b = .4(8) = 3.2 \) and \( x_{1-p} = x_a + .b\,(x_{a+1} - x_a) \), so \( x_{1-.4} = x_{.6} = x_3 + .2\,(x_4 - x_3) = 10.2 + .2(11.2 - 10.2) = 10.4 \).

I warned you about quintiles - they are fifths, not fourths. This is an excellent warning! You can't answer a question that you haven't read carefully!

2. A bank finds that the amounts overdue on its credit cards are the following. (Assume that the numbers are a sample.)

   amount (thousands)    frequency
   0-$1.99999            80
   $2.000-3.99999        40
   $4.000-5.99999        30
   $6.000-7.99999        30
   $8.000-9.99999        20
   $10.000 and up        0

a. Calculate the Cumulative Frequency (1)
b. Calculate the Mean (1)
c. Calculate the Median (2)
d. Calculate the Mode (1)
e. Calculate the Variance (3)
f. Calculate the Standard Deviation (2)
g. Calculate the Interquartile Range (3)
h. Calculate a Statistic showing Skewness and Interpret it (3)
i. Make a histogram of the Data (Neatness Counts!) (2)

Are there reasons why so many of you (i) totally ignored the classes, (ii) decided that the frequency column was both \( f \) and \( x \), (iii) computed the \( fx^2 \) column by taking each value of \( fx \) and squaring it after I had specifically warned you not to?

Solution: \( x \) is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999.

   class              f     F     x     fx    fx²    fx³
   $0-$1.99999        80    80    1.0   80    80     80
   $2.000-3.99999     40    120   3.0   120   360    1080
   $4.000-5.99999     30    150   5.0   150   750    3750
   $6.000-7.99999     30    180   7.0   210   1470   10290
   $8.000-9.99999     20    200   9.0   180   1620   14580
   Total              200               740   4280   29780

   x      x - x̄   f(x - x̄)   f(x - x̄)²   f(x - x̄)³
   1.0    -2.7    -216       583.2       -1574.64
   3.0    -0.7    -28        19.6        -13.72
   5.0    1.3     39         50.7        65.91
   7.0    3.3     99         326.7       1078.11
   9.0    5.3     106        561.8       2977.54
   Total          0          1542.0      2533.20

\( n = \sum f = 200 \), \( \sum fx = 740 \), \( \sum fx^2 = 4280 \), \( \sum fx^3 = 29780 \), \( \sum f(x - \bar{x}) = 0 \), \( \sum f(x - \bar{x})^2 = 1542.0 \), and \( \sum f(x - \bar{x})^3 = 2533.20 \).

Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 10. And no, I did not get the 1.0 in the \( x \) column by rounding 0.999995, or, for that matter, by rounding anything else - Think!

a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole \( F \) column.

b. Calculate the Mean (1): \( \bar{x} = \frac{\sum fx}{n} = \frac{740}{200} = 3.7 \)

c. Calculate the Median (2): \( \text{position} = p(n+1) = .5(201) = 100.5 \). This is above 80 and below 120, so the interval is 2-3.99999. \( x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w \), so \( x_{1-.5} = x_{.5} = 2 + \left[\frac{.5(200) - 80}{40}\right] 2 = 3.000 \).

d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 80 is the largest frequency, the modal group is 0 to 1.99999 and the mode is 1.000.

e. Calculate the Variance (3): \( s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{4280 - 200(3.7)^2}{199} = 7.74874 \) or \( s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{1542.0}{199} = 7.74874 \)

f. Calculate the Standard Deviation (2): \( s = \sqrt{7.74874} = 2.78366 \)

g. Calculate the Interquartile Range (3):

First Quartile: \( \text{position} = p(n+1) = .25(201) = 50.25 \). This is above \( F = 0 \) and below \( F = 80 \), so the group is 0 to 1.99999. \( x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w \) gives us \( Q_1 = x_{1-.25} = x_{.75} = 0 + \left[\frac{.25(200) - 0}{80}\right] 2 = 1.250 \).

Third Quartile: \( \text{position} = p(n+1) = .75(201) = 150.75 \).
This is above 150 and below 180, so the group is 6.000 to 7.99999. \( Q_3 = x_{1-.75} = x_{.25} = 6 + \left[\frac{.75(200) - 150}{30}\right] 2 = 6.000 \).

\( IQR = Q_3 - Q_1 = 6.000 - 1.250 = 4.750 \).

h. Calculate a Statistic showing Skewness and interpret it (3):

\[ k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{200}{(199)(198)}\left[29780 - 3(3.7)(4280) + 2(200)(3.7)^3\right] = 0.00507588\,(2533.2) = 12.8582 \]

or

\[ k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{200}{(199)(198)}(2533.2) = 12.8582 \]

or

\[ g_1 = \frac{k_3}{s^3} = \frac{12.8582}{(2.78366)^3} = 0.596121 \]

or Pearson's Measure of Skewness

\[ SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(3.7 - 1.0)}{2.78366} = 2.9098 \]

Because of the positive sign, the measures imply skewness to the right.

i. Make a histogram of the Data (Neatness Counts!) (2): A histogram is a bar graph of the frequency. The first bar is between 0 and 2 on the x axis (or has a midpoint at 1) and has a height of 80.
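The grouped-data results for Problem 2 can be checked numerically. The following Python sketch is not part of the exam. It assumes the class limits, frequencies and formulas given in the solution; the names (`grouped_fractile`, `pearson_sk`, and so on) are illustrative choices, not anything the course defines.

```python
import math

# Class lower limits, class width and frequencies from Problem 2.
lower = [0.0, 2.0, 4.0, 6.0, 8.0]
width = 2.0
f = [80, 40, 30, 30, 20]
x = [lo + width / 2 for lo in lower]          # class midpoints 1, 3, 5, 7, 9

n = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, x)) / n
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))
var = (sum_fx2 - n * mean ** 2) / (n - 1)     # computational formula
sd = math.sqrt(var)

def grouped_fractile(p):
    """Value with proportion p below it: L + ((p*N - F) / f) * w."""
    position = p * (n + 1)                    # locates the class
    cum = 0                                   # F: cumulative frequency below the class
    for lo, fi in zip(lower, f):
        if cum + fi >= position:
            return lo + (p * n - cum) / fi * width
        cum += fi
    raise ValueError("p must be between 0 and 1")

median = grouped_fractile(0.5)
q1 = grouped_fractile(0.25)
q3 = grouped_fractile(0.75)
iqr = q3 - q1

# Mode: midpoint of the class with the largest frequency.
mode = x[f.index(max(f))]

# Skewness measures: k3, relative skewness g1, and Pearson's SK.
k3 = n / ((n - 1) * (n - 2)) * sum(fi * (xi - mean) ** 3 for fi, xi in zip(f, x))
g1 = k3 / sd ** 3
pearson_sk = 3 * (mean - mode) / sd

print(mean, var, sd)
print(median, q1, q3, iqr)
print(k3, g1, pearson_sk)
```

All three skewness measures come out positive, which agrees with the conclusion that the distribution is skewed to the right.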