251y0012 10/11/00

ECO251 QBA1
FIRST HOUR EXAM
OCTOBER 7, 2000

Name ______KEY__________
SECTION MWF 10-11 TR 11-12:30

Part I. Multiple Choice (10 points)

1. (S-2) Inferential statistics is
   a. The display of characteristics of a sample in a graph with summary measures.
   b. The display of characteristics of a population in a graph with summary measures.
   *c. The process of estimating facts (parameters) about a population from a sample taken from the population.
   d. A branch of mathematics devoted to the collection, display and analysis of data.
   e. None of the above.

2. (S-3) The Fortune 500 listing of the 500 largest companies in the US in order of their annual sales is an example of
   *a. Ordinal data.
   b. Nominal data.
   c. Interval data.
   d. Ratio data.
   e. None of the above.

3. A used automobile dealer lists cars in the following classes: A - 100,000 miles or more on the odometer, B - less than 100,000 miles on the odometer, C - Diesel. Are these three categories
   *a. Collectively exhaustive?
   b. Mutually exclusive?
   c. Both mutually exclusive and collectively exhaustive?
   d. Neither mutually exclusive nor collectively exhaustive?
   e. Can't tell with the information given.

4. (D7-9) If a distribution is skewed to the left, we can say that it is likely that
   a. Mean > median > mode
   b. Median > mean > mode
   *c. Mode > median > mean
   d. Mode > mean > median
   e. Mode = mean = median
   (Most people got this backwards - make a diagram!)

5. A graph that connects points, each of which represents the cumulative frequency F, is called a
   a. Histogram
   *b. Ogive
   c. Frequency Polygon
   d. Pie chart
   e. None of the above

Part II.
Compute an appropriate answer, showing your work (except in a)). (15 points maximum - if you do more than 15 points, only your right answers will be counted.)

a) Fill in the following table (3)

   Class        f     f rel    F
   50-59.99     __    .12      __
   60-69.99     3     __       __
   70-79.99     __    __       12
   80-89.99     7     __       __
   90-99.99     6     __       __
   Total        25    __

Solution: Note that \( n = 25 \).

   Class        f     f rel    F
   50-59.99     3     .12      3
   60-69.99     3     .12      6
   70-79.99     6     .24      12
   80-89.99     7     .28      19
   90-99.99     6     .24      25
   Total        25    1.00

b) Assume that we have sold 1000 life insurance policies in amounts between $5300 and $9800. If this data is to be presented in seven classes, what intervals would you use? Explain your reasoning using the appropriate formula and make a table showing the class intervals you would actually use. (3)

Solution: \( \frac{9800 - 5300}{7} = 642.86 \), so use 650 or 700. This is only a suggestion. Any number somewhat above or equal to 643 will work. With a width of 700:

   Class   from    to
   A       5000    5699.99
   B       5700    6399.99
   C       6400    7099.99
   D       7100    7799.99
   E       7800    8499.99
   F       8500    9199.99
   G       9200    9899.99

c) (S-30) If a population of 1000 items with an unknown distribution has a mean of 12 and a standard deviation of 1.5, what is the approximate minimum number of items that must be (i) between 6 and 18? (ii) between 12 and 18? (3)

Note: there was an error here - (ii) was a harder question than I intended to give - I will thus give 3 points for a correct answer for (i). (ii) should have read: (iic) What is the maximum that could be over 18?

Solution:
(i) If we use the formula \( k = z = \frac{x - \mu}{\sigma} \), we find that \( \frac{6 - 12}{1.5} = -4 \) and \( \frac{18 - 12}{1.5} = 4 \). According to the Chebyshev inequality, the minimum fraction of the data that must be between \( \mu \pm 4\sigma \) is \( 1 - \frac{1}{k^2} = 1 - \frac{1}{16} = \frac{15}{16} \). Fifteen sixteenths of 1000 is about 938.
(ii) Since we can't pick sides here, the answer can't really be found.
(iic) The answer is the opposite to the answer to (i). There are about 1000 - 938 = 62 items left over. All of these could be above 18.
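The arithmetic in b) and c) can be checked with a few lines of Python. This is only a sketch: the function name `chebyshev_min_count` is made up for illustration, and it assumes the interval is symmetric about the mean, as it is in c)(i).

```python
import math

def chebyshev_min_count(n, mean, sd, low, high):
    """Minimum number of the n items that must lie in [low, high]
    by Chebyshev's inequality: fraction >= 1 - 1/k^2.

    Assumes (hypothetical helper) that low and high are the same
    number of standard deviations, k, away from the mean.
    """
    k = (high - mean) / sd
    if not math.isclose(k, (mean - low) / sd):
        raise ValueError("interval must be symmetric about the mean")
    return n * (1 - 1 / k**2)

# Part b): suggested class width for seven classes.
suggested_width = (9800 - 5300) / 7
print(round(suggested_width, 2))   # about 642.86, so round up to 650 or 700

# Part c)(i): k = 4, so at least 15/16 of the 1000 items are between 6 and 18.
inside = chebyshev_min_count(1000, 12, 1.5, 6, 18)
print(inside)                      # 937.5, i.e. about 938

# Part c)(iic): everything left over could be above 18.
print(1000 - inside)               # 62.5, i.e. about 62
```

Note that the inequality says nothing about how the leftover items split between the two tails, which is why (ii) cannot be answered this way.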
d) Do c) again assuming that the distribution is unimodal and symmetric. (2)

Solution:
(i and iic) Since the Empirical Rule says that almost all points must be between \( \mu \pm 3\sigma \), we would expect almost all of the 1000 points to be between 6 and 18, since these points are \( \mu \pm 4\sigma \), and we would be quite surprised if even one point is above 18.
(ii) If the distribution is symmetric, we would expect half of the 1000 points, or 500, on one side.
Again there will be 2 points for a correct answer to (i).

e) For the numbers 11.1, 13.2, 15.1 and 11.5, compute the i) root-mean-square, ii) harmonic mean, iii) geometric mean. (2 each)

Solution: Note that \( \sum x = 50.9 \). This is not used in any of the following calculations and there is no reason why you should have computed it!

(i) The Root-Mean-Square.
\( \frac{1}{n}\sum x^2 = \frac{1}{4}\left(11.1^2 + 13.2^2 + 15.1^2 + 11.5^2\right) = \frac{1}{4}\left(123.21 + 174.24 + 228.01 + 132.25\right) = \frac{1}{4}(657.71) = 164.4275 \).
So \( x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{164.4275} = 12.823 \).

(ii) The Harmonic Mean.
\( \frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{4}\left(\frac{1}{11.1} + \frac{1}{13.2} + \frac{1}{15.1} + \frac{1}{11.5}\right) = \frac{1}{4}\left(0.090090 + 0.075758 + 0.066225 + 0.086957\right) = \frac{1}{4}(0.319029) = 0.079757 \).
So \( x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.079757} = 12.5380 \).

(iii) The Geometric Mean.
\( x_g = \left(x_1 x_2 x_3 \cdots x_n\right)^{1/n} = \sqrt[4]{(11.1)(13.2)(15.1)(11.5)} = \sqrt[4]{25443.198} = (25443.198)^{0.25} = 12.6297 \).

Or \( \ln x_g = \frac{1}{n}\sum \ln(x) = \frac{1}{4}\left(\ln 11.1 + \ln 13.2 + \ln 15.1 + \ln 11.5\right) = \frac{1}{4}\left(2.40695 + 2.58022 + 2.71469 + 2.44234\right) = \frac{1}{4}(10.14420) = 2.53605 \).
So \( x_g = e^{2.53605} = 12.6297 \). I got the last result by putting 2.53605 into the calculator and pressing 'inverse' and then 'ln x.'

Or \( \log x_g = \frac{1}{n}\sum \log(x) = \frac{1}{4}\left(\log 11.1 + \log 13.2 + \log 15.1 + \log 11.5\right) = \frac{1}{4}\left(1.04532 + 1.12057 + 1.17898 + 1.06070\right) = \frac{1}{4}(4.40557) = 1.10139 \).
So \( x_g = 10^{1.10139} = 12.6297 \). I got the last result by putting 1.10139 into the calculator and pressing 'inverse' and then 'log x.'

Notice that the original numbers and all the means are between 11.1 and 15.1.
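The three means in e) can be reproduced with the Python standard library. This is a minimal sketch for checking the arithmetic; the variable names are illustrative only.

```python
import math

data = [11.1, 13.2, 15.1, 11.5]
n = len(data)

# (i) Root-mean-square: square first, then average, then take the root.
rms = math.sqrt(sum(x**2 for x in data) / n)

# (ii) Harmonic mean: average the reciprocals, then take the reciprocal.
harmonic = 1 / (sum(1 / x for x in data) / n)

# (iii) Geometric mean, three equivalent routes.
geometric_root = math.prod(data) ** (1 / n)                        # nth root of the product
geometric_ln = math.exp(sum(math.log(x) for x in data) / n)        # natural logs
geometric_log10 = 10 ** (sum(math.log10(x) for x in data) / n)     # base-10 logs

print(round(rms, 3))             # about 12.823
print(round(harmonic, 3))        # about 12.538
print(round(geometric_root, 4))  # about 12.6297

# Sanity check from the key: every mean lies between the smallest and largest value.
for m in (rms, harmonic, geometric_root):
    assert min(data) <= m <= max(data)
```

All three routes to the geometric mean should agree to within floating-point rounding, which is a quick way to catch a slip in any one of them.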
In spite of everything that I said, there are many of you who think that:
(i) You can find a sum of squares by summing numbers and squaring the sum;
(ii) You can find the sum of \( \frac{1}{x} \) by adding up the numbers and taking the reciprocal;
(iii) You can find an nth root by dividing by n.
I can only recommend a remedial math class (unless, of course, you want to try listening in class and checking out the homework very carefully).

Part III. Do the following problems (25 Points)

1. In a period of 7 days you make the following numbers of sales (in millions):

   Day:    1     2      3     4      5      6      7
   Sales:  9.2   10.2   9.2   11.2   19.2   12.2   14.2

Compute the following (assuming that the numbers are a sample):
a) Mean Sales (1)
b) The Median (1)
c) The Standard Deviation (3)
d) The 2nd Quintile (2)

Solution: Note that x is in order in the table.

   Index   x      x^2       x-xbar   (x-xbar)^2
   1       9.2    84.64     -3.0     9.00
   2       9.2    84.64     -3.0     9.00
   3       10.2   104.04    -2.0     4.00
   4       11.2   125.44    -1.0     1.00
   5       12.2   148.84    0.0      0.00
   6       14.2   201.64    2.0      4.00
   7       19.2   368.64    7.0      49.00
   Total   85.4   1117.88   0.0      76.00

\( n = 7 \), \( \sum x = 85.4 \), \( \sum x^2 = 1117.88 \), \( \sum (x - \bar{x}) = 0.00 \), \( \sum (x - \bar{x})^2 = 76.00 \).

Isn't it wonderful how predictable so many of you are! I strongly recommended that you compute the variance by the computational formula in both this and the next problem. Many of you ignored me. Two thirds of those who used the definitional formula got the problem wrong because they had not checked out the method enough to know what the formula meant. \( \sum x^2 \) is not equal to \( \left(\sum x\right)^2 = (85.4)^2 \), as some of you seem to have fooled yourselves into believing. Nor is \( \sum (x - \bar{x})^2 \) equal to \( \left(\sum x - \bar{x}\right)^2 = (85.4 - 12.2)^2 \). If you had tried these in any of the homework problems, you would have found that these tricks didn't work.

Note that, to be reasonable, the mean, median and 2nd quintile must fall between 9.2 and 19.2.

a) \( \bar{x} = \frac{\sum x}{n} = \frac{85.4}{7} = 12.2 \)

b) Just put the numbers in order and pick the middle number, 11.2.
Or formally: position \( = p(n+1) = a.b = .5(8) = 4.0 \). \( x_{1-p} = x_a + .b\,(x_{a+1} - x_a) \), so \( x_{1-.5} = x_{.5} = x_4 + .0\,(x_5 - x_4) = 11.2 \).

c) \( s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{1117.88 - 7(12.2)^2}{6} = 12.6667 \), or \( s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{76.00}{6} = 12.6667 \).
\( s = \sqrt{12.6667} = 3.55903 \)

d) The 2nd quintile has 40% below it. Position \( = p(n+1) = a.b = .4(8) = 3.2 \). \( x_{1-p} = x_a + .b\,(x_{a+1} - x_a) \), so \( x_{1-.4} = x_{.6} = x_3 + .2\,(x_4 - x_3) = 10.2 + .2(11.2 - 10.2) = 10.4 \).
I warned you about quintiles - they are fifths, not fourths. This is an excellent warning! You can't answer a question that you haven't read carefully!

2. A bank finds that the amounts overdue on its credit cards are the following. (Assume that the numbers are a sample.)

Are there reasons why so many of you (i) totally ignored the classes, (ii) decided that the frequency column was both f and x, (iii) computed the \( fx^2 \) column by taking each value of \( fx \) and squaring it after I had specifically warned you not to?

   amount (thousands)   frequency
   0-$1.99999           70
   $2.000-3.99999       40
   $4.000-5.99999       40
   $6.000-7.99999       30
   $8.000-9.99999       20
   $10.000 and up       0

a. Calculate the Cumulative Frequency (1)
b. Calculate the Mean (1)
c. Calculate the Median (2)
d. Calculate the Mode (1)
e. Calculate the Variance (3)
f. Calculate the Standard Deviation (2)
g. Calculate the Interquartile Range (3)
h. Calculate a Statistic showing Skewness and Interpret it (3)
i. Make a histogram of the Data (Neatness Counts!) (2)

Solution: x is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999.

   class            f     F     x     fx    fx^2   fx^3    x-xbar   f(x-xbar)   f(x-xbar)^2   f(x-xbar)^3
   $0-$1.99999      70    70    1.0   70    70     70      -2.9     -203        588.7         -1707.23
   $2.000-3.99999   40    110   3.0   120   360    1080    -0.9     -36         32.4          -29.16
   $4.000-5.99999   40    150   5.0   200   1000   5000    1.1      44          48.4          53.24
   $6.000-7.99999   30    180   7.0   210   1470   10290   3.1      93          288.3         893.73
   $8.000-9.99999   20    200   9.0   180   1620   14580   5.1      102         520.2         2653.02
   Total            200               780   4520   31020            0           1478.0        1863.60

\( n = \sum f = 200 \), \( \sum fx = 780 \), \( \sum fx^2 = 4520 \), \( \sum fx^3 = 31020 \), \( \sum f(x - \bar{x}) = 0 \), \( \sum f(x - \bar{x})^2 = 1478.0 \), and \( \sum f(x - \bar{x})^3 = 1863.60 \).
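The column totals in the table can be verified with a short Python sketch. The variable names are illustrative, and the last lines show why squaring the \( fx \) column does not give \( \sum fx^2 \).

```python
from itertools import accumulate

midpoints = [1.0, 3.0, 5.0, 7.0, 9.0]   # x: class midpoints, in thousands
freqs = [70, 40, 40, 30, 20]             # f: class frequencies

n = sum(freqs)
cumulative = list(accumulate(freqs))     # the F column

sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x**2 for f, x in zip(freqs, midpoints))
sum_fx3 = sum(f * x**3 for f, x in zip(freqs, midpoints))

mean = sum_fx / n
sum_dev2 = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
sum_dev3 = sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints))

print(n, cumulative)               # 200 [70, 110, 150, 180, 200]
print(sum_fx, sum_fx2, sum_fx3)    # 780.0 4520.0 31020.0
print(round(sum_dev2, 1))          # about 1478.0
print(round(sum_dev3, 1))          # about 1863.6

# The mistake warned about above: squaring each fx value gives a
# completely different (and meaningless) total.
wrong_fx2 = sum((f * x) ** 2 for f, x in zip(freqs, midpoints))
print(wrong_fx2)                   # 135800.0, nowhere near 4520
```

The computational route gives the same third-moment sum, since \( 31020 - 3(3.9)(4520) + 2(200)(3.9)^3 = 1863.6 \).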
Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 10. And no, I did not get the 1.0 in the x column by rounding 0.999995, or, for that matter, by rounding anything else - Think!

a. Calculate the Cumulative Frequency (1): (See above.) The cumulative frequency is the whole F column.

b. Calculate the Mean (1): \( \bar{x} = \frac{\sum fx}{n} = \frac{780}{200} = 3.9 \)

c. Calculate the Median (2): position \( = p(n+1) = .5(201) = 100.5 \). This is above 70 and below 110, so the interval is 2-3.99999. \( x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w \), so \( x_{1-.5} = x_{.5} = 2 + \left[\frac{.5(200) - 70}{40}\right] 2 = 3.500 \).

d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 70 is the largest frequency, the modal group is 0 to 1.99999 and the mode is 1.000.

e. Calculate the Variance (3): \( s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{4520 - 200(3.9)^2}{199} = 7.42714 \), or \( s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{1478.0}{199} = 7.42714 \).

f. Calculate the Standard Deviation (2): \( s = \sqrt{7.42714} = 2.72528 \)

g. Calculate the Interquartile Range (3):
First Quartile: position \( = p(n+1) = .25(201) = 50.25 \). This is above \( F = 0 \) and below \( F = 70 \), so the group is 0 to 1.99999. \( x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w \) gives us \( Q_1 = x_{1-.25} = x_{.75} = 0 + \left[\frac{.25(200) - 0}{70}\right] 2 = 1.4286 \).
Third Quartile: position \( = p(n+1) = .75(201) = 150.75 \). This is above 150 and below 180, so the group is 6.000 to 7.99999. \( Q_3 = x_{1-.75} = x_{.25} = 6 + \left[\frac{.75(200) - 150}{30}\right] 2 = 6.000 \).
\( IQR = Q_3 - Q_1 = 6.000 - 1.4286 = 4.5714 \)

h. Calculate a Statistic showing Skewness and interpret it (3):
\( k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{200}{(199)(198)}\left[31020 - 3(3.9)(4520) + 2(200)(3.9)^3\right] = 0.00507588\,(1863.6) = 9.45942 \),
or \( k_3 = \frac{n}{(n-1)(n-2)}\sum f(x - \bar{x})^3 = \frac{200}{(199)(198)}(1863.60) = 9.45942 \),
or \( g_1 = \frac{k_3}{s^3} = \frac{9.45942}{2.72528^3} = 0.467339 \),
or Pearson's Measure of Skewness \( SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(3.9 - 1.0)}{2.72528} = 3.1923 \).
Because of the positive sign, the measures imply skewness to the right.

i. Make a histogram of the Data (Neatness Counts!) (2): A histogram is a bar graph of the frequency. The first bar is between 0 and 2 on the x axis (or has a midpoint at 1) and has a height of 70.
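The grouped-data interpolation used in c and g, and the skewness measures in h, can be checked with the following sketch. The helper name `grouped_quantile` is made up for illustration; it follows the key's convention of locating the class with position \( p(n+1) \) and then interpolating with \( L_p + \left[\frac{pN - F}{f_p}\right] w \).

```python
import math

lower_limits = [0.0, 2.0, 4.0, 6.0, 8.0]   # L: lower class limits, in thousands
midpoints = [1.0, 3.0, 5.0, 7.0, 9.0]      # x: class midpoints
freqs = [70, 40, 40, 30, 20]                # f: class frequencies
width = 2.0                                 # w: class width
n = sum(freqs)

def grouped_quantile(p):
    """Hypothetical helper: value with a fraction p of the data below it."""
    position = p * (n + 1)
    cum_before = 0                          # F: cumulative frequency below the class
    for lower, f in zip(lower_limits, freqs):
        if position <= cum_before + f:
            return lower + (p * n - cum_before) / f * width
        cum_before += f
    raise ValueError("p must be between 0 and 1")

median = grouped_quantile(0.5)
q1 = grouped_quantile(0.25)
q3 = grouped_quantile(0.75)
iqr = q3 - q1

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
s = math.sqrt(variance)

# Skewness measures as defined in the key.
k3 = n / ((n - 1) * (n - 2)) * sum(f * (x - mean) ** 3 for f, x in zip(freqs, midpoints))
g1 = k3 / s**3
mode = midpoints[freqs.index(max(freqs))]   # midpoint of the largest class
pearson_sk = 3 * (mean - mode) / s

print(median, round(q1, 4), q3, round(iqr, 4))   # 3.5, about 1.4286, 6.0, about 4.5714
print(round(variance, 5), round(s, 5))            # about 7.42714 and 2.72528
print(round(k3, 4), round(g1, 4), round(pearson_sk, 4))
```

All three skewness measures come out positive, which agrees with the conclusion that the data are skewed to the right.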