251y0311 2/19/03
ECO251 QBA1
FIRST HOUR EXAM
February 21, 2003

Name: ______Key____________
Social Security Number: _____________________

Part I. (34 points)

1. Which of the following is NOT a reason for the need for sampling?
a) It is usually too costly to study the whole population.
b) It is usually too time consuming to look at the whole population.
c) It is sometimes destructive to observe the entire population.
d) *It is always more informative to investigate a sample than the entire population.

2. Most analysts focus on the cost of tuition as the way to measure the cost of a college education. But incidentals, such as textbook costs, are rarely considered. A researcher at Drummand University wishes to estimate the textbook costs of first-year students at Drummand. To do so, she monitored the textbook cost of 250 first-year students and found that their average textbook cost was $300 per semester. Identify the population of interest (target population) to the researcher.
a) All Drummand University students.
b) All college students.
c) *All first-year Drummand University students.
d) The 250 students that were monitored.

3. Which of the following is a continuous quantitative variable?
a) The color of a student’s eyes
b) The number of employees of an insurance company
c) *The amount of milk produced by a cow in one 24-hour period
d) The number of gallons of milk sold at the local grocery store yesterday

4. If I describe the place where a number is in a table as column 3, row 5, the location of that number is a:
a) Field
b) *Cell
c) Stub
d) Label

TABLE 2-1
An insurance company evaluates many numerical variables about a person before deciding on an appropriate rate for automobile insurance. A representative from a local insurance agency selected a random sample of insured drivers and recorded X, the number of claims each made in the last 3 years, with the following results.

X        f     fx
1       14     14
2       18     36
3       12     36
4        5     20
5        1      5
Total   50    111

5. 
Referring to Table 2-1, how many drivers are represented in the sample?
a) 5
b) 15
c) 18
d) *50

6. Referring to Table 2-1, how many total claims are represented in the sample?
a) 15
b) 50
c) *111
d) 250

7. When constructing charts, the following is plotted at the class midpoints:
a) *frequency histograms.
b) percentage polygons.
c) cumulative relative frequency ogives.
d) All of the above.

8. Which of the following is NOT a reason for drawing a sample?
a) A sample is less time consuming than a census.
b) A sample is less costly to administer than a census.
c) *A sample is usually a good representation of the target population.
d) A sample is less cumbersome and more practical to administer.

TABLE 2-4
A survey was conducted to determine how people rated the quality of programming available on television. Respondents were asked to rate the overall quality from 0 (no quality at all) to 100 (extremely good quality). The stem-and-leaf display of the data is shown below.

Stem   Leaves
3      24
4      03478999
5      0112345
6      12566
7      01
8
9      2

9. Referring to Table 2-4, what percentage of the respondents rated overall television quality with a rating of 50 or below?
a) 0.11
b) 0.40
c) *0.44 (11 out of 25)
d) 0.56

TABLE 2-11
The ordered array below resulted from taking a sample of 25 batches of 500 computer chips and determining how many in each batch were defective.

Defects: 1, 2, 4, 4, 5, 5, 6, 7, 9, 9, 12, 12, 15, 17, 20, 21, 23, 23, 25, 26, 27, 27, 28, 29, 29

Class         Frequency (f)   Rel Frequency (f rel)
0 – 4.99            4               .16
5 – 9.99            6               .24
10 – 14.99          2               .08
15 – 19.99          2               .08
20 – 24.99          4               .16
25 – 29.99          7               .28
Total              25              1.00

10. Referring to Table 2-11, if a frequency distribution for the defects data is constructed, using "0 but less than 5" as the first class, the frequency of the “20 but less than 25” class would be __4______.

11. 
Referring to Table 2-11, if a frequency distribution for the defects data is constructed, using "0 but less than 5" as the first class, the relative frequency of the “15 but less than 20” class would be ___.08_____.

TABLE 2-13
A research analyst was directed to arrange raw data collected on the yield of wheat, ranging from 40 to 93 bushels per acre, in a frequency distribution.

12. If the researcher was directed to present the data in 5 classes, what should the class interval be? Show your calculations.
Solution: \( \frac{93 - 40}{5} = 10.6 \). Use 11 or more.

13. Show the actual intervals you might use. (I used 12 as my width; some used 15.)

Class   From   To
A       40     51.99
B       52     63.99
C       64     75.99
D       76     87.99
E       88     99.99

14. Which of the following is NOT sensitive to extreme values?
a) The range.
b) The standard deviation.
c) *The interquartile range.
d) The coefficient of variation.

15. In right-skewed distributions, which of the following is the correct statement? (Q2 and the median are the same.)
a) The distance from Q1 to Q2 is larger than the distance from Q2 to Q3.
b) *The distance from Q1 to Q2 is smaller than the distance from Q2 to Q3.
c) The mean is smaller than the median.
d) The mode is larger than the mean.

16. In a perfectly symmetrical distribution
a) the range equals the interquartile range.
b) the interquartile range equals the mean.
c) *the median equals the mean.
d) the variance equals the standard deviation.

17. Evaluate the following statements:
(i) If every individual in the population is equally likely to be chosen to be in the sample, we must be taking a simple random sample.
(ii) The sum of cumulative frequencies in a distribution always equals 1.
(iii) The Chebyshev inequality says that at least 1/9 of the data must be 3 standard deviations or more from the mean.
a) Only the first is true.
b) Only the second is true.
c) Only the third is true.
d) *None are true.
e) All are true.
(i) A simple random sample of n must also have all possible samples of n equally likely. (ii) It is the relative frequencies, not the cumulative frequencies, that add to 1. (iii) At most 1/9 of the data is 3 or more standard deviations from the mean.

Part II.
My Social Security Number is 265398248. If I write it in clumps of 2 numbers I get: 26, 53, 98, 24, 8. Write your social security number the same way. Compute the following:
a) The Median (1)
b) The Standard Deviation (3)
c) The 2nd Quintile (2)

Solution: The numbers in order are 8, 24, 26, 53, 98.

          x      x^2
x1        8       64
x2       24      576
x3       26      676
x4       53     2809
x5       98     9604
Total   209    13729

a) The middle number is 26.

b) \( n = 5 \), \( \bar{x} = \frac{\sum x}{n} = \frac{209}{5} = 41.80 \),
\[ s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{13729 - 5(41.80)^2}{4} = \frac{4992.8}{4} = 1248.2 . \]
So \( s = \sqrt{1248.2} = 35.3299 \).

c) position \( = p(n+1) = .4(6) = 2.4 \). So \( a = 2 \) and \( .b = .4 \). \( x_{1-p} = x_a + .b(x_{a+1} - x_a) \), so
\[ x_{1-.4} = x_{.6} = x_2 + 0.4(x_3 - x_2) = 24 + 0.4(26 - 24) = 24.8 . \]

ECO251 QBA1
FIRST EXAM
February 21, 2003
TAKE HOME SECTION

Name: _________________________
Social Security Number: _________________________

Throughout this exam show your work! Please indicate clearly what sections of the problem you are answering and what formulas you are using.

Part III. Do all the Following (11 Points) Show your work!

1. My Social Security Number is 265398248. If I use each digit as a frequency in the intervals below, I get:

Class             frequency
$300- 399.99          2
$400- 499.99          6
$500- 599.99          5
$600- 699.99          3
$700- 799.99          9
$800- 899.99          8
$900- 999.99          2
$1000-1099.99         4
$1100-1199.99         8

Replace my social security number with your own in the frequency column. To make the problem easier, you may replace all zeros in your new frequency column with 10s. Assume that this data represents a sample of rents paid in Chester County.
a. Calculate the Cumulative Frequency (0.5)
b. Calculate the Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and Interpret it (1.5)
i. 
Make a histogram of the Data showing relative or percentage frequency (Neatness Counts!) (1)
j. Extra credit: Put a (horizontal) box plot below the histogram using the same scale. (1)

Solution: x is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not of 0 to 1.99999. Note also that the midpoints have been divided by 10. Most numbers should be multiplied by 10, the variance should be multiplied by 100 and \( k_3 \) by 1000.

Class  Interval          f    F     x     fx     fx^2       fx^3
A      300- 399.99       2    2    35     70     2450      85750
B      400- 499.99       6    8    45    270    12150     546750
C      500- 599.99       5   13    55    275    15125     831875
D      600- 699.99       3   16    65    195    12675     823875
E      700- 799.99       9   25    75    675    50625    3796875
F      800- 899.99       8   33    85    680    57800    4913000
G      900- 999.99       2   35    95    190    18050    1714750
H      1000-1099.99      4   39   105    420    44100    4630500
I      1100-1199.99      8   47   115    920   105800   12167000
Total                   47               3695   318775   29510375

Class    x - x̄      f(x - x̄)   f(x - x̄)^2   f(x - x̄)^3
A      -43.6170     -87.234       3804.9       -165958
B      -33.6170    -201.702       6780.6       -227944
C      -23.6170    -118.085       2788.8        -65864
D      -13.6170     -40.851        556.3         -7575
E       -3.6170     -32.553        117.7          -426
F        6.3830      51.064        325.9          2080
G       16.3830      32.766        536.8          8794
H       26.3830     105.532       2784.2         73457
I       36.3830     291.064      10589.8        385287
Total                 0.001      28285.0          1851

So \( n = \sum f = 47 \), \( \sum fx = 3695 \), \( \sum fx^2 = 318775 \), \( \sum fx^3 = 29510375 \), \( \sum f(x - \bar{x}) = 0 \), \( \sum f(x - \bar{x})^2 = 28285 \), and \( \sum f(x - \bar{x})^3 = 1851 \). Note that, to be reasonable, the mean, median and quartiles must fall within the range of the classes, that is, between 30 and 120 in tens of dollars.

a. Calculate the Cumulative Frequency (1): (See above.) The cumulative frequency is the whole F column.

b. Calculate the Mean (1): \( \bar{x} = \frac{\sum fx}{n} = \frac{3695}{47} = 78.6170 \)

c. Calculate the Median (2): position \( = p(n+1) = .5(48) = 24 \). This is above \( F = 16 \) and below \( F = 25 \), so the interval is E, 70-79.99 (in tens of dollars). \( x_{1-p} = L_p + \left[ \frac{pN - F}{f_p} \right] w \), so
\[ x_{1-.5} = x_{.5} = 70 + \left[ \frac{.5(47) - 16}{9} \right] 10 = 70 + 0.83333(10) = 78.3333 . \]

d. Calculate the Mode (1): The mode is the midpoint of the largest group. Since 9 is the largest frequency, the modal group is E, 700 to 799.99, and the mode is 75 (in tens of dollars, that is, $750).

e. 
Calculate the Variance (3):
\[ s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{318775 - 47(78.6170)^2}{46} = \frac{28285.264}{46} = 614.897 \]
or
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n-1} = \frac{28285.0}{46} = 614.891 . \]
The computer got 614.894.

f. Calculate the Standard Deviation (2): \( s = \sqrt{614.897} = 24.7971 \) or \( s = \sqrt{614.891} = 24.7970 \)

g. Calculate the Interquartile Range (3):
First Quartile: position \( = p(n+1) = .25(48) = 12 \). This is above \( F = 8 \) and below \( F = 13 \), so the interval is C, 500-599.99. \( x_{1-p} = L_p + \left[ \frac{pN - F}{f_p} \right] w \) gives us, in tens of dollars,
\[ Q1 = x_{1-.25} = x_{.75} = 50 + \left[ \frac{.25(47) - 8}{5} \right] 10 = 50 + 0.75(10) = 57.5 . \]
Third Quartile: position \( = p(n+1) = .75(48) = 36 \). This is above \( F = 35 \) and below \( F = 39 \), so the interval is H, 1000-1099.99.
\[ Q3 = x_{1-.75} = x_{.25} = 100 + \left[ \frac{.75(47) - 35}{4} \right] 10 = 100 + 0.0625(10) = 100.625 . \]
\( IQR = Q3 - Q1 = 100.625 - 57.5 = 43.125 \).

h. Calculate a Statistic showing Skewness and interpret it (3):
\[ k_3 = \frac{n}{(n-1)(n-2)} \left[ \sum fx^3 - 3\bar{x} \sum fx^2 + 2n\bar{x}^3 \right] = \frac{47}{(46)(45)} \left[ 29510375 - 3(78.6170)(318775) + 2(47)(78.6170)^3 \right] = 0.0227053(1836) = 41.687 \]
or
\[ k_3 = \frac{n}{(n-1)(n-2)} \sum f(x - \bar{x})^3 = \frac{47}{(46)(45)} (1851) = 42.028 . \]
The computer gets 42.062 and 41.959.
or
\[ g_1 = \frac{k_3}{s^3} = \frac{42}{24.797^3} = .00275 \]
or Pearson's Measure of Skewness
\[ SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(78.6170 - 75)}{24.797} = 0.4376 . \]
Because of the positive sign, the measures imply skewness to the right.

i. A histogram is a simple bar graph with frequency on the y-axis and the numbers 300-1200 on the x-axis.

j. The box plot should show the median and the quartiles.

251x0311 2/13/03

2. My Social Security Number is 265398248. If I write it in clumps of 2 numbers I get: 26, 53, 98, 24, 8. Write your social security number the same way. For these five numbers, compute the a) Geometric Mean, b) Harmonic Mean, c) Root-Mean-Square (1 point each). Label each clearly. If you wish, d) compute the geometric mean using natural or base 10 logarithms (1 point extra credit each). While you're at it, compute the sample mean and bring it to the exam (no credit, but it won’t hurt).

Solution: Note that \( \sum x = 209 \). This is not used in any of the following calculations and there is no reason why you should have computed it!

a) The Geometric Mean. 
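\[ x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \sqrt[n]{\prod x} = \sqrt[5]{(26)(53)(98)(24)(8)} = \sqrt[5]{25928448} = (25928448)^{1/5} = (25928448)^{0.2} = 30.3917 . \]

b) The Harmonic Mean.
\[ \frac{1}{x_h} = \frac{1}{n} \sum \frac{1}{x} = \frac{1}{5} \left( \frac{1}{26} + \frac{1}{53} + \frac{1}{98} + \frac{1}{24} + \frac{1}{8} \right) = \frac{1}{5} (0.0384615 + 0.0188679 + 0.0102041 + 0.0416667 + 0.125) = \frac{1}{5} (0.2342002) = 0.0468400 . \]
So \( x_h = \frac{1}{\frac{1}{n} \sum \frac{1}{x}} = \frac{1}{0.0468400} = 21.3493 \).

c) The Root-Mean-Square.
\[ x_{rms}^2 = \frac{1}{n} \sum x^2 = \frac{1}{5} (26^2 + 53^2 + 98^2 + 24^2 + 8^2) = \frac{1}{5} (676 + 2809 + 9604 + 576 + 64) = \frac{1}{5} (13729) = 2745.8 . \]
So \( x_{rms} = \sqrt{\frac{1}{n} \sum x^2} = \sqrt{2745.8} = 52.4004 \).

d) (i)
\[ \ln x_g = \frac{1}{n} \sum \ln(x) = \frac{1}{5} (\ln 26 + \ln 53 + \ln 98 + \ln 24 + \ln 8) = \frac{1}{5} (3.25809 + 3.97029 + 4.58497 + 3.17805 + 2.07944) = \frac{1}{5} (17.0709) = 3.4142 . \]
So \( x_g = e^{3.4142} = 30.3917 \).

(ii)
\[ \log x_g = \frac{1}{n} \sum \log(x) = \frac{1}{5} (\log 26 + \log 53 + \log 98 + \log 24 + \log 8) = \frac{1}{5} (1.41497 + 1.72428 + 1.99123 + 1.38021 + 0.90309) = \frac{1}{5} (7.41378) = 1.48276 . \]
So \( x_g = 10^{1.48276} = 30.3917 \).

Notice that the original numbers and all the means are between 8 and 98.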
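The three means in problem 2 can be cross-checked by machine. What follows is only a minimal Python sketch, assuming the five clumps 26, 53, 98, 24, 8 from the example Social Security Number; the function names are illustrative and are not part of the exam. The geometric mean is computed through natural logarithms, as in part d) (i).

```python
import math


def geometric_mean(xs):
    """n-th root of the product, computed as exp of the mean natural log."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """Reciprocal of the mean of the reciprocals."""
    return len(xs) / sum(1.0 / x for x in xs)


def root_mean_square(xs):
    """Square root of the mean of the squares."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))


# Assumed input: the instructor's example numbers; substitute your own.
data = [26, 53, 98, 24, 8]

if __name__ == "__main__":
    print("geometric mean  :", round(geometric_mean(data), 4))
    print("harmonic mean   :", round(harmonic_mean(data), 4))
    print("root-mean-square:", round(root_mean_square(data), 4))
```

For any set of positive numbers the results should satisfy harmonic mean ≤ geometric mean ≤ root-mean-square, and all three should lie between the smallest and largest values, which gives a quick sanity check on hand calculations.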