Introduction to the Practice of Statistics, Sixth Edition (Moore, McCabe)
Section 5.2 Homework Answers

5.43 A grinding machine for auto axles. An automatic grinding machine in an auto parts plant prepares axles with a target diameter µ = 40.135 mm. The machine has some variability, so the standard deviation of the diameters is σ = 0.003 mm. A sample of 4 axles is inspected each hour for process control purposes, and records are kept of the sample mean diameter. If the process mean is exactly equal to the target value, what will be the mean and standard deviation of the numbers recorded?

\[ \mu_{\bar{x}} = 40.135 \text{ mm}, \qquad \sigma_{\bar{x}} = \frac{0.003}{\sqrt{4}} = 0.0015 \text{ mm} \]

5.45 Axle diameters. Averages are less variable than individual observations. Suppose that the axle diameters in Exercise 5.43 vary according to a normal distribution. In that case, the mean \( \bar{x} \) of an SRS of axles also has a normal distribution.

a) Make a sketch of the normal curve for a single axle. Add the normal curve for the mean of an SRS of 4 axles on the same sketch.

[Sketch: normal curves for a single axle and for the mean of 4 axles, both centered at 40.135 mm; horizontal axis marked from 40.126 to 40.144 mm in steps of 0.003 mm.]

b) What is the probability that the diameter of a single randomly chosen axle differs from the target value by 0.006 mm or more?

Let the random variable X measure the diameter of an axle. A difference of 0.006 mm is two standard deviations, so by the 68-95-99.7 rule

\[ P(X < 40.129 \text{ or } X > 40.141) \approx 5\% \]

[Sketch: normal curve with 40.129, 40.135, and 40.141 marked.]

c) What is the probability that the mean diameter of an SRS of 4 axles differs from the target value by 0.006 mm or more?

\[ P(\bar{x} < 40.129 \text{ or } \bar{x} > 40.141) = 2\,P\!\left(Z < \frac{40.129 - 40.135}{0.003/\sqrt{4}}\right) = 2\,P(Z < -4) \approx 0 \]

More precisely, the probability is about 0.000063.

5.50 North Carolina State University posts the grade distributions for its courses online. The distribution of grades in English 210 in the Spring 2006 semester was:

Grade   X   P(x)
A       4   0.31
B       3   0.40
C       2   0.20
D       1   0.04
F       0   0.05

a) Using the common scale A = 4, B = 3, C = 2, D = 1, F = 0, take X to be the grade of a randomly chosen English 210 student.
Use the definitions of the mean (page 271) and standard deviation (page 280) for discrete random variables to find the mean µ and the standard deviation σ of grades in this course.

\[ \mu_X = 0.31(4) + 0.40(3) + 0.20(2) + 0.04(1) + 0.05(0) = 2.88 \]

\[ \sigma_X = \sqrt{0.31(4 - 2.88)^2 + 0.40(3 - 2.88)^2 + 0.20(2 - 2.88)^2 + 0.04(1 - 2.88)^2 + 0.05(0 - 2.88)^2} \approx 1.051 \]

b) English 210 is a large course. We can take the grades of an SRS of 50 students to be independent of each other. If \( \bar{x} \) is the average of these 50 grades, what are the mean and standard deviation of \( \bar{x} \)?

\[ \mu_{\bar{x}} = 2.88, \qquad \sigma_{\bar{x}} = \frac{1.051}{\sqrt{50}} \approx 0.1486 \]

c) What is the probability P(X ≥ 3) that a randomly chosen English 210 student gets a B or better? What is the approximate probability \( P(\bar{x} \ge 3) \) that the grade point average for 50 randomly chosen English 210 students is B or better?

\[ P(X \ge 3) = 0.40 + 0.31 = 0.71 \]

\[ P(\bar{x} \ge 3) \approx P\!\left(Z > \frac{3 - 2.88}{1.051/\sqrt{50}}\right) \approx P(Z > 0.8074) \approx 0.2097 \]

5.52 A lottery payoff. A $1 bet in a state lottery's Pick 3 game pays $500 if the three-digit number you choose exactly matches the winning number, which is drawn at random. Here is the distribution of the payoff X:

Payoff X      $0      $500
Probability   0.999   0.001

Each day's drawing is independent of other drawings.

(a) What are the mean and standard deviation of X?

\[ \mu_X = 0(0.999) + 500(0.001) = 0.50 \text{ dollars} \]

\[ \sigma_X = \sqrt{0.999(0 - 0.5)^2 + 0.001(500 - 0.5)^2} \approx 15.80 \text{ dollars} \]

(b) Joe buys a Pick 3 ticket twice a week. What does the law of large numbers say about the average payoff Joe receives from his bets?

I mentioned before that the sample space determines the probability. The problem I am having here is that I can read this question in two ways, and I am not sure which one is intended.

Case 1: Is the mention of twice a week meant to tell me that Joe buys lots of tickets, but that I am supposed to think of each ticket individually? If so, the law of large numbers says that in the long run the average payoff approaches $0.50 per ticket.
Case 2: Is the mention of the two tickets meant to tell me that the sample space consists of the outcomes of two tickets, not just one? Then the average return per two tickets is 2(0.5) = $1. Thus, in the long run, over years of buying 2 tickets a week, the average of those weekly winnings would be around $1. Of course, this does not take into account that Joe pays exactly $2 per week for the privilege of getting $1 back on average.

(c) What does the central limit theorem say about the distribution of Joe's average payoff after 104 bets in a year?

The population of payoffs is discrete, but the distribution of the average payoff \( \bar{x} \) over 104 bets will be roughly normal, with mean $0.50 and standard deviation \( 15.80/\sqrt{104} \approx 1.55 \) dollars.

(d) Joe comes out ahead for the year if his average payoff is greater than $1. What is the probability Joe ends the year ahead?

I will treat this as a sample of size 104, assuming from the previous discussion that Joe buys 104 tickets in a year, and I will think of the sample space as consisting of all single-ticket outcomes.

\[ P(\bar{x} > 1) \approx P\!\left(Z > \frac{1 - 0.5}{15.80/\sqrt{104}}\right) \approx P(Z > 0.3227) \approx 0.3735 \]

5.54 Flaws in carpets. The number of flaws per square yard in a type of carpet material varies with mean 1.5 flaws per square yard and standard deviation 1.3 flaws per square yard. This population distribution cannot be normal, because a count takes only whole-number values (i.e., this is a discrete population). An inspector studies 200 square yards of the material and records the number of flaws found in each square yard inspected. Use the central limit theorem to find the approximate probability that the mean number of flaws exceeds 2 per square yard.

\[ P(\bar{x} > 2) \approx P\!\left(Z > \frac{2 - 1.5}{1.3/\sqrt{200}}\right) = P(Z > 5.44) \approx 0.0000000268 \]

The result indicates that seeing more than 2 flaws per square yard on average in a sample of 200 square yards is extremely rare.

In this situation we have a population that is not normally distributed. Regardless of the distribution type, we can still calculate the mean and standard deviation, but the 68-95-99.7 rule no longer applies to the population.
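The calculations in 5.50, 5.52, and 5.54 all follow the same two steps: find the mean and standard deviation of the population, then apply the normal approximation to the sample mean. The Python sketch below is one way to check the arithmetic; it is not part of the original answers. The function names `discrete_mean_sd` and `clt_upper_tail` are hypothetical, and the normal tail area is computed with `math.erfc` rather than read from a table, so the results may differ from the answers above in the last digit.

```python
import math


def discrete_mean_sd(pmf):
    """Return (mean, standard deviation) of a discrete random variable.

    pmf is a dict mapping each value to its probability.
    """
    mean = sum(x * p for x, p in pmf.items())
    variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())
    return mean, math.sqrt(variance)


def clt_upper_tail(mu, sigma, n, threshold):
    """Return (z, approximate P(sample mean > threshold)) for sample size n.

    Uses the normal approximation: the sample mean has mean mu and
    standard deviation sigma / sqrt(n).
    """
    z = (threshold - mu) / (sigma / math.sqrt(n))
    tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for standard normal Z
    return z, tail


# 5.50: grades in English 210, average of 50 students at least 3
grades = {4: 0.31, 3: 0.40, 2: 0.20, 1: 0.04, 0: 0.05}
mu_g, sd_g = discrete_mean_sd(grades)
z_g, p_g = clt_upper_tail(mu_g, sd_g, 50, 3)

# 5.52: Pick 3 payoff, average of 104 bets greater than $1
payoff = {0: 0.999, 500: 0.001}
mu_p, sd_p = discrete_mean_sd(payoff)
z_p, p_p = clt_upper_tail(mu_p, sd_p, 104, 1)

# 5.54: carpet flaws, mean and sd are given directly
z_c, p_c = clt_upper_tail(1.5, 1.3, 200, 2)

for label, z, p in [("5.50", z_g, p_g), ("5.52", z_p, p_p), ("5.54", z_c, p_c)]:
    print(f"{label}: z = {z:.4f}, upper-tail probability = {p:.3g}")
```

Because the script carries the unrounded standard deviations through, its z-scores can differ slightly from the hand calculations, which round σ first.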
We are told that µ = 1.5 flaws per sq yd and that σ = 1.3 flaws per sq yd. Again, you can see that this is not normally distributed: the smallest possible measurement is 0 flaws per sq yd, and if we go two standard deviations to the left, 1.5 − 2(1.3) = −1.1, we get a negative number of flaws per sq yd, which is nonsense. Also, normal distributions are continuous and this distribution is discrete. We can only have whole numbers as our outcomes: 0 flaws/sq yd, 1 flaw/sq yd, 2 flaws/sq yd, and so on. We cannot have 1.75 flaws/sq yd in one measurement; only when we average several values can we get fractional flaws/sq yd.

The situation is that we will be looking at 200 sq yd of material. If we looked at blocks of 1 yd by 1 yd, recorded the flaws in each square yard, and then averaged all 200 numbers, what is the probability that the average is greater than 2 flaws/sq yd?

[Figure: Simulated distribution based on the given information.]

Key words: Central Limit Theorem. The theorem is mentioned as a reminder that you will be dealing with the sampling distribution and not the population itself. Also, according to the theorem the sampling distribution is only approximately normal, so our calculated probability is also an approximation.

You can see from the histogram and the normal quantile plot that the Central Limit Theorem holds here: the sampling distribution is very close to a normal distribution. The straight line in the normal quantile plot shows that it is close enough that we can depend on the calculations we are about to make to be very good approximations.

I want to calculate the probability that my average of 200 numbers exceeds 2 flaws per sq yd. I can see from my histogram that this is not very likely, since in my 750 simulations it did not occur once. I used the simulated distribution above to sample from.
[Figure: Normal quantile plot for the sampling distribution simulation; 750 sample means, each averaging 200 values. Vertical axis: averages of 200 values (about 1.3 to 1.9). Horizontal axis: expected z-score (−4 to 4).]

I conducted a simulation in which I sampled 200 square yards of material and recorded the flaws in each of the 200 pieces of 1 yd by 1 yd. I then averaged the 200 values to get one value of the sample mean, \( \bar{x} \). I repeated this procedure many times, 765 in all, to obtain the sampling distribution of the mean from the 765 values. By doing this many times, I am hoping that the distribution I got by experiment is close to the theoretical distribution, or at least that I get a glimpse of what the theoretical distribution probably looks like.
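The simulation described above can be sketched in Python. The original does not state which flaw-count distribution was sampled, so the `FLAW_PMF` below is a hypothetical one, chosen only because it has mean 1.5 and standard deviation 1.3. The function names, the seed, and the choice of 750 repetitions are likewise assumptions for illustration. Instead of drawing the normal quantile plot, the sketch computes the correlation between the sorted sample means and their expected z-scores; a value near 1 corresponds to a nearly straight line in the plot.

```python
import math
import random
from statistics import NormalDist, mean, stdev

# Hypothetical flaw-count distribution (an assumption, not the author's):
# it has mean 1.5 and standard deviation 1.3 flaws per square yard.
FLAW_PMF = {0: 0.24, 1: 0.35, 2: 0.20, 3: 0.12, 4: 0.06, 5: 0.03}


def simulate_sample_means(pmf, n_per_sample, n_samples, seed=0):
    """Draw n_samples samples of size n_per_sample and return their means."""
    rng = random.Random(seed)
    values = list(pmf)
    weights = list(pmf.values())
    means = []
    for _ in range(n_samples):
        sample = rng.choices(values, weights=weights, k=n_per_sample)
        means.append(sum(sample) / n_per_sample)
    return means


def normal_quantile_correlation(data):
    """Correlation between sorted data and expected standard normal z-scores."""
    xs = sorted(data)
    n = len(xs)
    zs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, mz = mean(xs), mean(zs)
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sz = math.sqrt(sum((z - mz) ** 2 for z in zs))
    return cov / (sx * sz)


sample_means = simulate_sample_means(FLAW_PMF, n_per_sample=200, n_samples=750)

print("mean of sample means:", round(mean(sample_means), 4))
print("sd of sample means:  ", round(stdev(sample_means), 4))
print("theoretical sd:      ", round(1.3 / math.sqrt(200), 4))
print("largest sample mean: ", max(sample_means))
print("means above 2:       ", sum(m > 2 for m in sample_means))
print("quantile correlation:", round(normal_quantile_correlation(sample_means), 4))
```

With a different population of the same mean and standard deviation, the individual sample means would change, but the central limit theorem suggests the overall picture should be similar: sample means clustered around 1.5 with spread close to \( 1.3/\sqrt{200} \approx 0.092 \).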