251x0811 (corrected) 2/25/07
ECO251 QBA1 FIRST EXAM
February 21, 2008

Name: _________________________
Student Number: _________________________
Class Hour: _____________________

Remember – Neatness, or at least legibility, counts. In most non-multiple-choice questions an answer needs a calculation or short explanation to count.

Part I. (7 points) The following numbers are to be considered a sample and represent the scores of a class of ten seniors on the first exam in Dr. Hardnose’s accounting class (Doane and Seward).

x1: 60 88 60 71 60 73 74 75 60 99

Compute the following: Show your work!
a) The Median (1)
b) The Standard Deviation (3)
c) The 7th decile (2)
d) The Coefficient of variation (1)
e) (Extra Credit) Dr. Hardnose gives a second exam, but one student is absent. He enters the data from both exams into Minitab with the following results.

MTB > describe c1 c2

Descriptive Statistics: x1, x2

Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
x1        10   0   ----      ---   ----    60.00  60.00   72.00  78.25    99.00
x2         9   0  72.78     2.19   6.57    65.00  65.00   74.00  79.00    79.00

The numbers computed should be self explanatory except for ‘SE Mean,’ (the standard error of the mean) which can be gotten by computing $s_{\bar{x}} = \frac{s}{\sqrt{n}}$. I have erased the numbers related to the statistics you should compute. So – given the statistics that we have learned to compute and the facts that the mean and sample size have changed, can you compare the variability of the results of the first and second exams? (Yes or No won’t work here, give me some numbers!)

Part II. (At least 35 points – 2 points each unless marked - Parentheses give points on individual questions. Brackets give cumulative point total.) Exam is normed on 50 points.

1. Which of the following is not a measure of central tendency (2)
a) The arithmetic mean
b) The geometric mean
c) The standard deviation
d) The median
e) The mode.
f) All of the above are measures of central tendency
g) None of the above is a measure of central tendency.

2. The smaller the spread of data around the mean (2)
a) The smaller the interquartile range
b) The smaller the standard deviation
c) The smaller the coefficient of variation
d) All of the above
e) None of the above.

3. Mark the following items N (nominal), O (ordinal), I (interval) or R (ratio) data. If the data is interval or ratio data, would it be considered C (continuous) or D (discrete)? (4) [8]
a) The weights of Sumo wrestlers ________
b) Your Social Security number ________
c) Volume of traffic on I-95 (light, medium, heavy) __________
d) Number of hits in the World Series _________

4. A number that is used to summarize population data is called (2)
a) A parameter
b) s
c) A statistic
d) None of the above would be used to summarize population data
e) All of the above could be used to summarize population data.

5. If a frequency distribution has a positive coefficient of skewness, we would expect (2) [10]
a) The mean to be between the median and the mode
b) The mean to exceed the median
c) The median to be larger than the mean or the mode.
d) The standard deviation to exceed the mean
e) The coefficient of excess to be positive
f) None of the above would be likely to be true.

6. Under what circumstances would you expect a population variance to be zero? [12]
a) If the mean is zero
b) If every observation above the sample mean is precisely offset by a number the same distance below the sample mean.
c) If the mean, median and mode were all the same.
d) When there are an equal number of observations above and below the median
e) None of the above is likely to result in a zero population variance
f) All of the above are likely to result in a zero population variance.

7. Batting averages of major league baseball players are thought to follow a symmetrical unimodal distribution with a mean of .260 and a standard deviation of .03.
a) What proportion of major league players would you expect to have a batting average above .320? Why? (3)
b) If you did not know that the distribution was symmetrical, what is the largest proportion of players that you would expect to have a batting average above .320? Why? (3) [18]

Table 1
Given below is a stem-and-leaf display of the heights in inches of 50 Christmas trees being grown for sale. The smallest tree is 44 inches.

4|44668899
5|00112244555577888899
6|000022223344557799
7|3355

8. In Table 1, what is the median height? (2)

9. In Table 1, what are the first and third quartiles? What do they lead you to think about the skewness of the distribution? (3)

10. In Table 1, assume that you were asked to present the data in 5 classes. Show how you would decide what class interval to use and list the classes below with their frequencies. (4) [27]

Class   Interval              Frequency
A       ___ to under ___      ___
B       ___ to under ___      ___
C       ___ to under ___      ___
D       ___ to under ___      ___
E       ___ to under ___      ___

Table 2

Class    f     frel    F     Frel
20-35    11    .122    11    .122
35-50    ___   .278    36    .400
50-65    ___   ___     ___   .700
65-80    ___   ___     ___   .889
80-95    10    .111    ___   ___
Total    90    1.000

11. Fill in the missing numbers in Table 2 (3)

12. Find the 1/3 fractile of the data in Table 2. (3) [33]

13. The Des Moines Metropolitan area consists of Dallas, Polk and Warren Counties. In the 1990 census Dallas County had 30700 people and a mean income of $9673, Polk County had 309000 people and a mean income of $11779 and Warren County had 37300 people with a mean income of $9749.
a) What is the mean income of residents of the Des Moines area? (Note: I am much more interested in how you do this than whether you get the right answer, so show your work!) (2)
b) Suppose that in 1991 Census officials believe that the population and mean income in Polk County and Warren County stayed the same, but the mean income in Dallas County fell while the population stayed constant. For the Des Moines area, which of the following is likely to have occurred?
(2) [37]
a) The median income fell most
b) The modal income fell most
c) The mean income fell most
d) The mean, median and modal income all fell to the same degree.

ECO251 QBA1 FIRST EXAM
February 21, 2008
TAKE HOME SECTION

Name: _________________________
Student Number: _________________________

Throughout this exam show your work! Please indicate clearly what sections of the problem you are answering and what formulas you are using. Turn this in with your in-class exam.

Part III. Do all the Following (12+ Points). The problems are based on problems by Doane and Seward. Show your work!

1. The table below represents the distribution of winning times in seconds for horses in the Kentucky Derby. Treat these data as a sample. Personalize the data below by adding the last two digits of your student number to the last 2 frequencies. Add one digit per frequency. For example, Seymour Butz’s student number is 876509, so he adds 0 to the second-to-last frequency and 9 to the last frequency and uses {1, 5, 16, 22, 12, 8, 5, 3 and 11} (adding to 83). You may check your work on the computer, but what is turned in should look as if it had all been done by hand.

a. Calculate the Cumulative Frequency (0.5)
b. Calculate the Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and interpret it (1.5)
i. Make an ogive of the data (Neatness Counts!) (1)
j. Extra credit: Put a (horizontal) box plot below the ogive using the same horizontal scale (1)
k. (Extra, extra credit) The trimmed mean is a measure of the data that mitigates the effect of extreme values. A 5% trimmed mean is a mean calculated after 5% of the data is removed from both the top and bottom of the data.
(For example, if there are 100 points, the top 5 and the bottom 5 are removed and the mean of the middle 90 points is calculated.) Try to calculate a 5% trimmed mean of your Kentucky Derby numbers. (1)
l. (More extra credit) Davies’ test: In 1929 Professor George Davies tried to find a method that would tell us whether an arithmetic mean or a geometric mean was a better way to characterize a set of data. Davies recommended using $D = \frac{\log Q_1 + \log Q_3 - 2\log x_{.50}}{\log Q_3 - \log Q_1}$. We should decide to use the geometric mean if
(α) $D < 0.20$.
(β) The data seems to be convincingly skewed to the right.
(γ) There are at least 50 observations.
(i) Using the Derby data compute D and make a recommendation with an explanation.
(ii) See if you can compute a geometric mean for this grouped data. Be sure that you explain what you do so that I can follow it.

Row   Time in seconds       Frequency
1     119 to under 120       1
2     120 to under 121       5
3     121 to under 122      16
4     122 to under 123      22
5     123 to under 124      12
6     124 to under 125       8
7     125 to under 126       5
8     126 to under 127       3
9     127 to under 128       2

2. The sales of your firm over the last 5 years are as follows.

Year                 1     2     3     4     5
Sales ($millions)    131   227   311   354   403

Personalize these data by subtracting the last two digits of your student number from the first sales figure. For example, Ima Badrisk’s student number is 876519, so she subtracts 19 from 131 to get 112. The effect of this will be to raise the growth rates in a). Do the following (3)
a) Find the average growth rate of sales by taking a geometric mean using the four year-to-year growth rates.
b) Find the harmonic mean of your sales numbers.
c) Find the root-mean-square of your sales numbers.
d) (Extra credit) Compute the geometric mean from a) using natural and/or base 10 logarithms. (1 point extra credit each).
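
For Problem 2, the three averages can be checked on a computer after working them by hand. The following is a minimal Python sketch, assuming the unpersonalized sales figures printed in the problem (131, 227, 311, 354, 403); the function names are illustrative and not part of the exam, and personalized data should replace the first figure. The geometric mean is computed through natural logarithms, which is the route part d) asks about.

```python
import math

# Unpersonalized sales figures from Problem 2 ($ millions).
# Assumption: replace the first value with your personalized figure.
sales = [131, 227, 311, 354, 403]


def growth_ratios(values):
    """Year-to-year ratios x[t+1] / x[t] (one fewer than the number of years)."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


def geometric_mean(values):
    """n-th root of the product, computed as exp(mean of natural logs)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """n divided by the sum of reciprocals."""
    return len(values) / sum(1 / v for v in values)


def root_mean_square(values):
    """Square root of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


ratios = growth_ratios(sales)
average_ratio = geometric_mean(ratios)  # average growth factor per year

print(f"year-to-year ratios: {[round(r, 4) for r in ratios]}")
print(f"average growth rate: {average_ratio - 1:.4f}")
print(f"harmonic mean:       {harmonic_mean(sales):.2f}")
print(f"root-mean-square:    {root_mean_square(sales):.2f}")
```

Because the product of the four ratios telescopes, the geometric mean of the ratios equals the fourth root of the last sales figure divided by the first, which gives a quick hand check. For positive numbers that are not all equal, the harmonic mean should come out below the geometric mean, which is below the arithmetic mean, which is below the root-mean-square.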