STAT 366 QUIZ 2 2/3/10

1) You are a commercial fisherman who catches large fish, and at the end of each day you bring your catch to the market, where each fish is weighed, the total is calculated, and you are paid by the pound. On Saturdays, near 6pm when everyone wants to go home, there is a particular weigh person who checks the total # of fish you have and, if he knows you, will weigh only the first half of your fish, provided they are presented in either ascending (increasing) or descending (decreasing) order of size. He then gives you a total weight calculated by
[A] (total # of fish)*(weight of your median size fish).
This saves time. You know, as a former student of statistics, that the actual total weight is
[B] (total # of fish)*(average fish weight).
You also have a good idea of the distribution of the weights of your fish. For each distribution below, say whether you would bring your catch near 6pm (getting formula [A] for total weight), before 6pm (getting formula [B]), or whether it wouldn’t matter. Briefly justify your answer.

[Figure: two Fish weight Distribution Plots, left and right; horizontal axis X]

Ans: The distribution on the left is clearly right-skewed, meaning that the mean is bigger than the median. Therefore, you will get a larger total weight if formula [B] is used, so you should go before 6pm. The distribution on the right is symmetric, meaning the mean and median are equal, so formulas [A] and [B] will give the same total weight, and it wouldn’t matter when you brought your fish in.

2) (Hypothetical) Suppose the price of a one bedroom apartment within the city limits is considered to be normally distributed, with an average of $525 and a standard deviation of $70.
a) What percentage of all one bedroom apartments within the city limits are priced above $560?
b) What is the 30th percentile among the prices of all one bedroom apartments within the city limits?
c) What percentage of all one bedroom apartments within the city limits fall between the prices of the 30th and 50th percentiles?
Ans:
a) The Z score for $560 is Z = (560-525)/70 = 0.5. This corresponds to an area under the normal curve to the left of 0.5 (less than 0.5) of .69, or 69%. But since we are interested in the % above $560, we need the area to the right of Z=0.5, which is 1-.69 = 0.31, or 31%.

b) The 30th percentile among prices is the price for which 30% of the data falls to the left (is less than). So we need to find the Z value for which the area to the left is 0.3. Unfortunately, our table doesn’t have an area (a number in one of the left hand columns) of .3. All is not lost, as we can use the 70th percentile, or the Z value corresponding to an area of .7. Since the area to the right of the 70th percentile is .3, and the normal curve is symmetric about 0, we can easily obtain the 30th percentile as the negative of the 70th percentile.

[Figure: Distribution Plot, Normal, Mean=0, StDev=1. Axes: X, Density. Marked values: -0.524, 0, 0.524, with an area of 0.3 in each tail.]

So the Z value of the 30th percentile is the negative of the Z value of the 70th percentile, or -.524. That is, the 30th percentile for any normal curve is .524 SD’s below the mean. So the 30th percentile in price is 525-.524(70) = approx. $488.

c) The percentage of data between two prices is the area under the normal curve between them, or, as we have seen, the % of data to the left of the larger price minus the % of data to the left of the smaller price. Fortunately, we don’t even need to calculate any Z scores or look up any areas. By definition, the 50th percentile is the price for which 50% of the data is to the left, and the 30th percentile is the price for which 30% of the data is to the left. So without any effort, we know, for any normal curve, that the % of data between the 30th and 50th percentiles is 50%-30% = 20%.

[Figure: Distribution Plot, Normal, Mean=0, StDev=1. Axes: X, Density. Area of 0.2 between the 30th and 50th percentiles.]

[Figure: Histogram of C5. Axes: C5 (0 to 21), Percent (0 to 30).]

3) Using the empirical rule, estimate the SD for the data in the histogram above.

Ans: The empirical rule tells us that if the data is “mound” shaped, then approx. 95% of the data fits between xbar-2*SD and xbar+2*SD. That is, the “middle” 95% of the data fits into an interval of length 4*SD. To get the “middle” 95%, we cut off 2.5% from each end of the histogram, and the range of values left over should be about 4*SD wide. The interval at the high end holds about 2-2.5% of the data, so we cut off all of it, at 18. The interval at the low end holds about 7-7.5% of the data, so we need to cut off only about a third of it, since 7.5%/3 is about 2.5%, so we cut off at 1. Then 4*SD = 18-1 = 17, so SD is approx. 17/4 = 4.25.
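
The arithmetic in Questions 2 and 3 can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class; the variable names are illustrative and not part of the quiz, and the cutoffs 18 and 1 are simply the values read off the histogram in the answer to Question 3.

```python
from statistics import NormalDist

# Question 2: apartment prices assumed Normal(mean=525, SD=70), as stated in the quiz.
prices = NormalDist(mu=525, sigma=70)

# a) Fraction of apartments priced above $560 (area to the right of Z = 0.5).
pct_above_560 = 1 - prices.cdf(560)

# b) 30th percentile: first as a Z value on the standard normal, then as a price.
z30 = NormalDist().inv_cdf(0.30)
p30 = prices.inv_cdf(0.30)

# c) Fraction of apartments priced between the 30th and 50th percentiles.
p50 = prices.inv_cdf(0.50)
pct_between = prices.cdf(p50) - prices.cdf(p30)

# Question 3: the middle 95% spans about 4 SDs; cutoffs 18 and 1 come from the histogram.
sd_estimate = (18 - 1) / 4

print(f"a) above $560: {pct_above_560:.1%}")
print(f"b) Z for 30th percentile: {z30:.3f}, price: ${p30:.0f}")
print(f"c) between 30th and 50th percentiles: {pct_between:.0%}")
print(f"3) estimated SD: {sd_estimate}")
```

The software values differ slightly from the table-based answers only because the table rounds areas to two decimal places.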