ECO251 QBA1 Exam: Statistics & Data Analysis

251y0312 9/26/03
ECO251 QBA1 FIRST HOUR EXAM, October 1, 2003
Name: _____KEY____________ Social Security Number: _____________________

Part I. (32 points)

1. The process of using sample statistics to draw conclusions about true population parameters is called
a) *statistical inference.
b) the scientific method.
c) sampling.
d) descriptive statistics.

2. A summary measure that is computed to describe a characteristic of an entire population is called
a) *a parameter.
b) a census.
c) a statistic.
d) the scientific method.

3. Which of the following is a discrete quantitative variable?
a) the Dow Jones Industrial Average
b) the volume of water released from a dam
c) the distance you drove yesterday
d) *the number of employees of an insurance company

TABLE 1-1
The manager of the customer service division of a major consumer electronics company is interested in determining whether the customers who have purchased a videocassette recorder made by the company over the past 12 months are satisfied with their products.

4. Referring to Table 1-1, consider the possible responses to the question "Are you happy, indifferent, or unhappy with the performance per dollar spent on the videocassette recorder?" If we write down a 1 for 'happy,' a 2 for 'unhappy' and a 3 for 'indifferent,' the responses are values of which kind of random variable?
a) ratio
b) *nominal
c) interval
d) ordinal

TABLE 2-2
At a meeting of information systems officers for regional offices of a national company, a survey was taken to determine the number of employees the officers supervise in the operation of their departments, where X is the number of employees overseen by each information systems officer.

X    f
1    7
2    5
3   11
4    8
5    9

5. Referring to Table 2-2, how many regional offices are represented in the survey results?
a) 127
b) 5
c) 15
d) *40
Explanation: $n = \sum f = 7 + 5 + 11 + 8 + 9 = 40$.

TABLE 2-5
The following are the durations (in minutes) of a sample of long-distance phone calls made within the continental United States, reported by one long-distance carrier:

Time (in minutes)       Relative frequency
0 but less than 5       0.37
5 but less than 10      0.22
10 but less than 15     0.15
15 but less than 20     0.10
20 but less than 25     0.07
25 but less than 30     0.07
30 but less than 35     0.02

6. Referring to Table 2-5, if 1,000 calls were randomly sampled, how many calls lasted under 10 minutes?
a) 220
b) 370
c) 410
d) *590
Explanation: The answer is the cumulative relative frequency for the 2nd class, 0.59, multiplied by 1000.

Class                   f_rel   F_rel
0 but less than 5       0.37    0.37
5 but less than 10      0.22    0.59
10 but less than 15     0.15    0.74
15 but less than 20     0.10    0.84
20 but less than 25     0.07    0.91
25 but less than 30     0.07    0.98
30 but less than 35     0.02    1.00

7. If I make a graph of the data in Table 2-5 (assume the table represents a sample of 1000 calls) with the following x and y coordinates for the first five points: {(0, 0), (5, 370), (10, 590), (15, 740), (20, 840)}, a one-word name for this type of graph is _ogive_, and the last point on the line could be (45, _1000_).
Explanation: The x coordinates are the upper class limits, starting with the upper limit of the last empty class. The y coordinates are the cumulative frequencies, obtained by multiplying the F_rel column by 1000. When the graph gets to x = 35, y hits 1000 and is 1000 for all subsequent points.

8. Referring to Table 2-5, what is the cumulative relative frequency, F_rel, for calls that lasted under 20 minutes?
a) 0.10
b) 0.76
c) *0.84 (Look at the table.)
d) None of the above: write in the correct answer.

TABLE 2-7
The stem-and-leaf display below contains data on the number of months between the date a civil suit is filed and when the case is actually adjudicated for 50 cases heard in superior court.

Stem   Leaves
1      234447899
2      22223455678889
3      0011135778
4      02345579
5      112466
6      158

9. Referring to Table 2-7, the civil suit with the fourth shortest waiting time between when the suit was filed and when it was adjudicated had a wait of _14_ months.
Explanation: The first four numbers are 12, 13, 14, 14.

10. Eunice computes the following statistics from a sample:
$k_3 = \frac{n}{(n-1)(n-2)} \sum (x - \bar{x})^3$,
$g_1 = \frac{k_3}{s^3}$,
$s^2 = \frac{\sum (x - \bar{x})^2}{n-1}$,
$SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}}$,
$k_4 = \frac{n^2}{(n-1)(n-2)(n-3)} \left[ (n+1)\frac{\sum (x - \bar{x})^4}{n} - \frac{3(n-1)^3 s^4}{n^2} \right]$.
She thinks the sample represents a population that is skewed to the right. Which of the statistics would show skewness and what sign should she expect from them? (No partial credit on this one.)
Answer: Any legitimate measure of skewness would be positive if the population is skewed to the right. From your formula table, the measures of skewness are:
(i) $k_3$, skewness;
(ii) $g_1 = k_3 / s^3$, relative skewness;
(iii) $SK$, Pearson's measure of skewness.
The other two are $s^2$, the sample variance, which is always positive and measures dispersion, and $k_4$, the coefficient of excess (in the outline), which measures kurtosis.

11. In a perfectly symmetrical distribution with one mode,
a) the arithmetic mean equals the median.
b) the median equals the mode.
c) the arithmetic mean equals the mode.
d) *all of the above.
e) none of the above.

12. According to the Bienayme-Chebyshev rule (I called it Chebyshef's Inequality), at least 93.75% of all observations in any data set are contained within a distance of how many standard deviations around the mean?
a) 1
b) 2
c) 3
d) *4
Explanation: If at least 93.75% are 'in,' then at most 6.25% are out in the tails. The rule says that $\frac{1}{k^2}$ is the maximum proportion in the tails, defined as the points below $\mu - k\sigma$ and the points above $\mu + k\sigma$. If you try out the values here, you will find $\frac{1}{k^2} = \frac{1}{4^2} = \frac{1}{16} = .0625$, so k must be 4. More directly, you could solve $1 - \frac{1}{k^2} = .9375$ by trying the four values of k that were given. This is a problem that was done in class.

13. Evaluate the following statements.
(i) The median of the values 3.4, 4.7, 1.9, 7.6, and 6.5 is 4.05.
(ii) In a set of numerical data, the value for Q3 can never be smaller than the value for Q1.
(iii) In a set of numerical data, the value for Q2 is always halfway between Q1 and Q3.
a) (i) and (ii) are false.
b) *(i) and (iii) are false.
c) (ii) and (iii) are false.
d) Only one of the statements is false.
e) All of the statements are false.
Explanation: The numbers in order are 1.9, 3.4, 4.7, 6.5, 7.6, so the median is 4.7 and (i) is wrong. The order of the quartiles is Q1, median, Q3. If all the middle numbers are the same, Q3 could equal both the median and Q1, but it could never be smaller than Q1, so (ii) is true. Q2 is the second quartile and its value could be anywhere between Q1 and Q3, depending on what the numbers are. Only its position is halfway between them, so (iii) is false.

14. Which one of the following statements is false?
a) In a sample of size 40, the sample mean is 15. In this case, the sum of all observations in the sample is $\sum x = 600$.
b) *A population with 200 elements has an arithmetic mean of 10. From this information, it can be shown that the population standard deviation is 15.
c) The median of a data set with 20 items would be the average of the 10th and 11th items in the ordered array.
d) The coefficient of variation measures variability in a data set relative to the size of the arithmetic mean.
e) If every possible group of 10 individuals in the population is equally likely to be chosen to be in the sample, we must be taking a simple random sample of 10.
f) All of the above statements are false.

15. Which of the following is NOT a measure of central tendency?
a) the arithmetic mean
b) the geometric mean
c) the mode
d) *the interquartile range

16. Which of the following is most sensitive to extreme values?
a) the median
b) the interquartile range
c) *the arithmetic mean
d) the 1st quartile

Part II. (Ng pp 77-79) (8 points)

The data below represent the amount of grams of carbohydrates in a serving of breakfast cereal. It is a sample containing 11 numbers. Note: $\sum x = 217$, $\sum x^2 = 4541$.

{11, 15, 23, 29, 19, 22, 21, 20, 15, 25, 17}

Find:
a) The First Quartile (1.5)
b) The Standard Deviation (2)
c) The Coefficient of Variation (1.5)
d) The five-number summary (3)

Solution:
a) Put the numbers in order: $(x_1, x_2, \ldots, x_{11}) = (11, 15, 15, 17, 19, 20, 21, 22, 23, 25, 29)$. $n = 11$, so the first quartile is at position $p(n+1) = .25(12) = 3.0$, and $Q1 = x_3 = 15$. Or, if $a.b = 3.0$, $x_{1-p} = x_{.75} = x_a + .b(x_{a+1} - x_a) = x_3 + 0(x_4 - x_3) = 15 + 0(17 - 15) = 15$.
b) $\bar{x} = \frac{\sum x}{n} = \frac{217}{11} = 19.7273$, so, using the computational formula, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{4541 - 11(19.7273)^2}{10} = \frac{260.17}{10} = 26.017$ and $s = \sqrt{26.017} = 5.101$.
c) $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{s}{\bar{x}} = \frac{5.101}{19.7273} = 0.2586$.
d) For the median, position $= p(n+1) = .5(12) = 6.0$, and for the third quartile, position $= p(n+1) = .75(12) = 9.0$. So $x_{.50} = x_6 = 20$ and $Q3 = x_{.25} = x_9 = 23$. The five-number summary would be {lower bound, Q1, median, Q3, upper bound} or {11, 15, 20, 23, 29}.

251x0312 9/23/03
ECO251 QBA1 FIRST EXAM, October 1, 2003
TAKE HOME SECTION
Name: _____KEY________________ Social Security Number: _________________________

Throughout this exam show your work! Please indicate clearly what sections of the problem you are answering and what formulas you are using.

Part III. Do all the Following (11 Points) Show your work!

1. My Social Security Number is 265398248.
If I use each digit as a frequency in the intervals below, I get:

Class              Frequency
$0-5999            2
$6000-11999        6
$12000-17999       5
$18000-23999       3
$24000-29999       9
$30000-35999       8
$36000-41999       2
$42000-47999       4
$48000-53999       8

Assume that this data represents a sample of rents paid in Chester County.
a. Calculate the Cumulative Frequency (0.5)
b. Calculate the Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (It is possible but unlikely that there is more than one) (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and Interpret it (1.5)
i. Make a frequency polygon of the Data (Neatness Counts!) (1)
j. Extra credit: Put a (horizontal) box plot below the histogram using the same scale. (1)

Replace my Social Security number with your own in the frequency column. To make the problem easier, you may replace all zeros in your new frequency column with 10s.

Solution: x is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999. Note also that the midpoints and class limits have been divided by 1000. Most numbers should be multiplied by 1000, the variance should be multiplied by 1,000,000 and $k_3$ by 1,000,000,000.
     Class        f    F    x    fx    fx²      fx³      x−x̄       f(x−x̄)    f(x−x̄)²   f(x−x̄)³
A    0-5.999      2    2    3     6      18       54   -26.1702   -52.340    1369.76   -35846.9
B    6-11.999     6    8    9    54     486     4374   -20.1702  -121.021    2441.02   -49236.0
C    12-17.999    5   13   15    75    1125    16875   -14.1702   -70.851    1003.97   -14226.5
D    18-23.999    3   16   21    63    1323    27783    -8.1702   -24.511     200.26    -1636.1
E    24-29.999    9   25   27   243    6561   177147    -2.1702   -19.532      42.39      -92.0
F    30-35.999    8   33   33   264    8712   287496     3.8298    30.638     117.34      449.4
G    36-41.999    2   35   39    78    3042   118638     9.8298    19.660     193.25     1899.6
H    42-47.999    4   39   45   180    8100   364500    15.8298    63.319    1002.33    15866.6
I    48-53.999    8   47   51   408   20808  1061208    21.8298   174.638    3812.32    83222.1
     Total       47            1371   50175  2058075                0.000   10182.64      400.2

$n = \sum f = 47$, $\sum fx = 1371$, $\sum fx^2 = 50175$, $\sum fx^3 = 2058075$, $\sum f(x - \bar{x}) = 0$, $\sum f(x - \bar{x})^2 = 10182.64$, and $\sum f(x - \bar{x})^3 = 400.2$. Note that, to be reasonable, the mean, median and quartiles must fall between 0 and 54.

a. Calculate the Cumulative Frequency (1): (See above.) The cumulative frequency is the whole F column.

b. Calculate the Mean (1): $\bar{x} = \frac{\sum fx}{n} = \frac{1371}{47} = 29.1702$

c. Calculate the Median (2): position $= p(n+1) = .5(48) = 24$. This is above $F = 16$ and below $F = 25$, so the interval is E, 24-29.999 in thousands. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$, so
$x_{1-.5} = x_{.5} = 24 + \left[\frac{.5(47) - 16}{9}\right] 6 = 24 + 0.83333(6) = 29.0000$

d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 9 is the largest frequency, the modal group is E, 24 to 29.999, and the mode is 27 (in thousands).

e. Calculate the Variance (3): $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{50175 - 47(29.1702)^2}{46} = \frac{10182.673}{46} = 221.3627$ or $s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{10182.64}{46} = 221.3617$. The computer got 221.362 (in millions).

f. Calculate the Standard Deviation (2): $s = \sqrt{221.3627} = 14.8783$ or $s = \sqrt{221.3617} = 14.8782$ (in thousands).

g. Calculate the Interquartile Range (3):
First Quartile: position $= p(n+1) = .25(48) = 12$. This is above $F = 8$ and below $F = 13$, so the interval is C, 12-17.999. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right] w$ gives us, in thousands,
$Q1 = x_{1-.25} = x_{.75} = 12 + \left[\frac{.25(47) - 8}{5}\right] 6 = 16.500$.
Third Quartile: position $= p(n+1) = .75(48) = 36$. This is above $F = 35$ and below $F = 39$, so the interval is H, 42-47.999.
$Q3 = x_{1-.75} = x_{.25} = 42 + \left[\frac{.75(47) - 35}{4}\right] 6 = 42.375$.
$IQR = Q3 - Q1 = 42.375 - 16.500 = 25.875$ (in thousands).

h. Calculate a Statistic showing Skewness and interpret it (3):
$k_3 = \frac{n}{(n-1)(n-2)} \left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{47}{(46)(45)} \left[2058075 - 3(29.1702)(50175) + 2(47)(29.1702)^3\right]$
$= 0.0227053\,[2058075 - 4390844.4 + 2333168.3] = 0.0227053(399.3) = 9.066$
or $k_3 = \frac{n}{(n-1)(n-2)} \sum f(x - \bar{x})^3 = \frac{47}{(46)(45)}(400.2) = 9.087$ (The computer gets 9.0849; the small differences among these results come from rounding $\bar{x}$.)
or $g_1 = \frac{k_3}{s^3} = \frac{9.085}{(14.8782)^3} = .00276$
or Pearson's Measure of Skewness $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(29.1702 - 27)}{14.8782} = 0.4376$.
Because of the positive sign, the measures imply skewness to the right.

i. A frequency polygon is a simple line graph with frequency on the y-axis and the numbers 0-54 (thousand) on the x-axis. Since class A has a frequency of 2 plotted at x = 3 and the class width is 6, the graph should really start at x = -3 and y = 0. You should at least show the line falling across the y-axis. Since the last nonempty class is 48-53.999, with its frequency plotted at x = 51, there should be a zero at x = 57.

j. The box plot should show the median and the quartiles. (See text.)

2. My Social Security Number is 265398248. If I write it in clumps of 2 numbers and add 100 to the end, I get: 26, 53, 98, 24, 8, 100. Write your Social Security number the same way, so that you have a list of six numbers. Note: If any of the five numbers taken from your Social Security number is a zero, change it to a one. For these six numbers, compute the a) Geometric Mean, b) Harmonic Mean, c) Root-mean-square (1 point each). Label each clearly.
If you wish, d) Compute the geometric mean using natural or base 10 logarithms. (1 point extra credit each.)

Solution: Note that $\sum x = 309$. This is not used in any of the following calculations and there is no reason why you should have computed it!

a) The Geometric Mean.
$x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \sqrt[n]{\prod x} = \sqrt[6]{(26)(53)(98)(24)(8)(100)} = \sqrt[6]{2592844800} = (2592844800)^{1/6} = (2592844800)^{0.16667} = 37.0649$.

b) The Harmonic Mean.
$\frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{6}\left(\frac{1}{26} + \frac{1}{53} + \frac{1}{98} + \frac{1}{24} + \frac{1}{8} + \frac{1}{100}\right)$
$= \frac{1}{6}(0.0384615 + 0.0188679 + 0.0102041 + 0.0416667 + 0.125 + 0.01) = \frac{1}{6}(0.2442002) = 0.0407000$.
So $x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.0407000} = 24.5700$.

c) The Root-Mean-Square.
$x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{6}(26^2 + 53^2 + 98^2 + 24^2 + 8^2 + 100^2) = \frac{1}{6}(676 + 2809 + 9604 + 576 + 64 + 10000) = \frac{1}{6}(23729) = 3954.83$.
So $x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{3954.83} = 62.8875$.

d) (i) $\ln(x_g) = \frac{1}{n}\sum \ln(x) = \frac{1}{6}(\ln 26 + \ln 53 + \ln 98 + \ln 24 + \ln 8 + \ln 100)$
$= \frac{1}{6}(3.25809 + 3.97029 + 4.58497 + 3.17805 + 2.07944 + 4.60517) = \frac{1}{6}(21.67601) = 3.61267$.
So $x_g = e^{3.61267} = 37.0649$.
(ii) $\log(x_g) = \frac{1}{n}\sum \log(x) = \frac{1}{6}(\log 26 + \log 53 + \log 98 + \log 24 + \log 8 + \log 100)$
$= \frac{1}{6}(1.41497 + 1.72428 + 1.99123 + 1.38021 + 0.90309 + 2.00000) = \frac{1}{6}(9.41378) = 1.56896$.
So $x_g = 10^{1.56896} = 37.0649$.

Notice that the original numbers and all the means are between 8 and 100.
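The three means in Part III, Problem 2 can be checked with a few lines of Python. This is only a minimal sketch for verifying the hand arithmetic, not part of the exam; the function names `geometric_mean`, `harmonic_mean` and `root_mean_square` are illustrative choices, and the geometric mean is computed through natural logarithms as in part d).

```python
import math

def geometric_mean(xs):
    """n-th root of the product, computed as exp of the mean natural log."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    """Reciprocal of the mean of the reciprocals."""
    return len(xs) / sum(1.0 / x for x in xs)

def root_mean_square(xs):
    """Square root of the mean of the squares."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

# The six numbers built from the Social Security number in the problem.
data = [26, 53, 98, 24, 8, 100]

print("geometric mean:  ", round(geometric_mean(data), 4))
print("harmonic mean:   ", round(harmonic_mean(data), 4))
print("root-mean-square:", round(root_mean_square(data), 4))
```

For positive data the results should always satisfy harmonic mean ≤ geometric mean ≤ root-mean-square, which gives a quick sanity check on any hand calculation. Python's standard `statistics` module also provides `geometric_mean` and `harmonic_mean` if you prefer not to write your own.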

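The grouped-data statistics in Part III, Problem 1 can be checked the same way. The sketch below assumes the key's conventions: class midpoints 3, 9, ..., 51 (in thousands), a class located by the position p(n+1), and interpolation with L + [(pn − F)/f]·w. The names `lower`, `freq`, `width` and `grouped_quantile` are hypothetical, introduced only for this illustration.

```python
import math

# Lower class limits (in thousands), frequencies, and common class width.
lower = [0, 6, 12, 18, 24, 30, 36, 42, 48]
freq = [2, 6, 5, 3, 9, 8, 2, 4, 8]
width = 6

n = sum(freq)
mid = [lo + width / 2 for lo in lower]  # class midpoints 3, 9, ..., 51

mean = sum(f * x for f, x in zip(freq, mid)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freq, mid)) / (n - 1)
std_dev = math.sqrt(variance)

# k3 = n / ((n-1)(n-2)) * sum of f(x - mean)^3
k3 = n / ((n - 1) * (n - 2)) * sum(f * (x - mean) ** 3 for f, x in zip(freq, mid))

def grouped_quantile(p):
    """Interpolated quantile for grouped data.

    The class is located with position p(n+1); the value is
    L + ((p*n - F) / f) * w, where F is the cumulative frequency
    below the class and f is the frequency of the class.
    """
    position = p * (n + 1)
    cum = 0
    for lo, f in zip(lower, freq):
        if cum + f >= position:
            return lo + (p * n - cum) / f * width
        cum += f
    raise ValueError("p must be between 0 and 1")

q1 = grouped_quantile(0.25)
median = grouped_quantile(0.50)
q3 = grouped_quantile(0.75)

print("mean:", round(mean, 4), "variance:", round(variance, 3), "s:", round(std_dev, 4))
print("Q1:", q1, "median:", median, "Q3:", q3, "IQR:", q3 - q1)
print("k3:", round(k3, 4))
```

All results are in thousands (variance in millions, k3 in billions), matching the scaling used in the solution. To try your own numbers, replace `freq` with the digits of your own Social Security number.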