Homework 10 Solutions

1. Ross 7.79 p. 418

a) Let $X_1$ and $X_2$ represent the weekly sales for the next 2 weeks; then $(X_1, X_2)$ has a bivariate normal distribution with common mean 40, common standard deviation 6, and correlation 0.6. Thus, $X_1 + X_2$ has a normal distribution with mean 80 and standard deviation

$$\sqrt{6^2 + 6^2 + 2(0.6)(6)(6)} = \sqrt{115.2} \approx 10.733,$$

and the probability that the total of the next 2 weeks' sales exceeds 90 is

$$P(X_1 + X_2 > 90) = P\left(Z > \frac{90 - 80}{\sqrt{115.2}}\right) = P(Z > 0.932) \approx 0.176.$$

b) If the correlation were 0.2 instead of 0.6, the answer from (a) would decrease. A smaller correlation results in a smaller standard deviation for $X_1 + X_2$. A deviation of 10 from the mean is more pronounced when the standard deviation is smaller, so the z-score would increase, making the right-tail probability that we are seeking decrease.

c) The standard deviation is now

$$\sqrt{6^2 + 6^2 + 2(0.2)(6)(6)} = \sqrt{86.4} \approx 9.295,$$

and the probability that the total of the next 2 weeks' sales exceeds 90 is

$$P(X_1 + X_2 > 90) = P\left(Z > \frac{90 - 80}{\sqrt{86.4}}\right) = P(Z > 1.076) \approx 0.141.$$

2. a) The regression to the mean effect implies that for any test-retest situation, we would expect higher "scores" on the initial test to decrease on a re-test, and similarly, lower "scores" on the initial test to increase on a re-test, regardless of any other influences. In the context of this problem, we would naturally expect higher first readings to go down on the second reading, and lower first readings to go up on the second reading; the patient's state of tension is irrelevant because the observed outcomes are consistent with the regression to the mean effect.

b) If a large study is performed, and it is found that first readings average 130 mm, second readings average 120 mm, and both readings have a standard deviation of about 15 mm, then this evidence supports the first doctor's claim that patients are more relaxed on the second reading.
A difference in the overall means between two readings (or a test and a re-test) cannot be attributed to the regression to the mean effect; see Notes 23, in which the regression to the mean effect occurs in a context in which the test and retest have the same overall means and standard deviations.

3. i) Ross 8.2 p. 457

a) Let $X$ be the random variable denoting the score on the final exam. Using Markov's inequality, we have

$$P(X \ge 85) \le \frac{E(X)}{85} = \frac{75}{85} = \frac{15}{17} \approx 0.882.$$

b) From Chebyshev's inequality, we have

$$P(|X - 75| \ge 10) \le \frac{25}{100} = 0.25.$$

Hence,

$$P(65 < X < 85) = 1 - P(|X - 75| \ge 10) \ge 1 - 0.25 = 0.75.$$

c) Let $n$ represent the number of students taking the final exam. Then the average test score $\bar{X}$ of the $n$ students is a random variable with mean 75 and variance $25/n$. Also, note that having the class average be within 5 of 75 with probability at least 0.9 is equivalent to having the class average be more than 5 away from 75 with probability at most 0.1; hence, using Chebyshev's inequality again, we require

$$P(|\bar{X} - 75| > 5) \le \frac{25/n}{5^2} = \frac{25}{25n} \le 0.1.$$

Solving the right-hand inequality gives $n \ge 10$, so we need at least 10 students to have a probability of at least 0.9 that the class average is within 5 of 75.

ii) Ross 8.3 p. 457

Using the CLT, we have

$$P(|\bar{X} - 75| > 5) = P\left(\frac{|\bar{X} - 75|}{5/\sqrt{n}} > \frac{5}{5/\sqrt{n}}\right) \approx P(|Z| > \sqrt{n}) = 2P(Z > \sqrt{n}) \le 0.1 \iff P(Z > \sqrt{n}) \le 0.05.$$

Looking in a normal table, this means $\sqrt{n} \ge 1.645$, i.e. $n \ge 2.71$, so we need $n$ to be at least 3 under the CLT. Note, however, that the CLT is an asymptotic result, so its accuracy for an $n$ as small as 3 can reasonably be questioned.

4. Ross 8.7 p. 457

Let $X_i$, $i = 1, \ldots, 100$, represent the lifetime of the $i$-th lightbulb. Then, using the CLT, we have

$$P\left(\sum_{i=1}^{100} X_i > 525\right) \approx P\left(Z > \frac{525 - 500}{\sqrt{(100)(25)}}\right) = P\left(Z > \frac{25}{50}\right) \approx 0.309.$$

5. Ross 8.8 p. 457

Let $X_i$ be defined as above, and let $Y_j$, $j = 1, \ldots, 99$, be the time needed to replace the $j$-th lightbulb (note that once all bulbs have failed, we stop, so we do not include the time needed to replace the very last lightbulb). So, the probability we are looking for is

$$P\left(\sum_{i=1}^{100} X_i + \sum_{j=1}^{99} Y_j \le 550\right).$$

Now, since the replacement time is uniformly distributed over $(0, 0.5)$, we have:

$$E\left(\sum_{i=1}^{100} X_i\right) = (100)(5) = 500, \qquad \mathrm{Var}\left(\sum_{i=1}^{100} X_i\right) = (100)(25) = 2500,$$

$$E\left(\sum_{j=1}^{99} Y_j\right) = (99)(0.25) = 24.75, \qquad \mathrm{Var}\left(\sum_{j=1}^{99} Y_j\right) = (99)\left(\frac{1}{48}\right) = 2.0625.$$

Hence, using the CLT,

$$P\left(\sum_{i=1}^{100} X_i + \sum_{j=1}^{99} Y_j \le 550\right) \approx P\left(Z \le \frac{550 - 524.75}{\sqrt{2502.0625}}\right) = P(Z \le 0.505) \approx 0.693.$$

6. Ross 8.13 p. 458

a) Let $\bar{X}$ denote the average test score for the class of size 25. Then,

$$P(\bar{X} > 80) \approx P\left(Z > \frac{80 - 74}{14/\sqrt{25}}\right) = P\left(Z > \frac{30}{14}\right) = P(Z > 2.143) \approx 0.016.$$

b) Let $\bar{Y}$ denote the average test score for the class of size 64. Then,

$$P(\bar{Y} > 80) \approx P\left(Z > \frac{80 - 74}{14/\sqrt{64}}\right) = P\left(Z > \frac{48}{14}\right) = P(Z > 3.429) \approx 0.0003.$$

c) Since $SD(\bar{Y} - \bar{X}) = \sqrt{(14)^2/64 + (14)^2/25} \approx 3.302$, we have

$$P(\bar{Y} - \bar{X} > 2.2) \approx P\left(Z > \frac{2.2}{3.302}\right) = P(Z > 0.666) \approx 0.253.$$

d) Same as (c), by symmetry of the distribution of $\bar{Y} - \bar{X}$ about 0:

$$P(\bar{X} - \bar{Y} > 2.2) = P(\bar{Y} - \bar{X} < -2.2) = P(\bar{Y} - \bar{X} > 2.2) \approx 0.253.$$

7. Ross TE 8.9 p. 460

We would expect the proportion of heads on the remaining 900 tosses to be 0.5 (so the expected proportion of heads on the 1000 tosses would be 0.55), as the remaining tosses are independent of the first 100 tosses. The strong law of large numbers guarantees that the overall proportion of heads will be in the neighborhood of the expected proportion of 0.5; however, it does not suggest that the remaining 900 tosses will behave in a manner designed to have the 1000 tosses see a proportion of heads equal to 0.5 (for the proportion of heads to be 0.5 for the 1000 tosses, we would have needed the 900 remaining tosses to yield 44.444% heads).

8. a) Since $Z$ is an indicator variable, $E(Z) = P(Z = 1)$. The probability that $Z = 1$ is the probability that the random point falls inside the object, but this is just the proportion of the unit square that is covered by this object; i.e.,

$$P(Z = 1) = \frac{\text{Area of the object}}{\text{Area of the unit square}} = \frac{A}{1} = A, \qquad \text{so } E(Z) = A.$$

b) This is the essence of Monte Carlo integration: suppose we have a complicated shape $S$ with unknown area $A$ which can be contained entirely within a simple shape $S^*$ with known area $A^*$ ($S^*$ chosen so that its area is simple to compute).
If we pick $n$ points randomly from within $S^*$, we can then approximate the area $A$ as the fraction of those points that fall within $S$, multiplied by $A^*$. In the context of this problem, $S^*$ is the unit square, so $A^* = 1$; hence, we'd estimate $A$ using the fraction of the $n$ points that fall within the object.

c) If we denote the indicators for our $n$ independent random points as $X_i$, $i = 1, \ldots, n$, then

$$\hat{A} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$

The $X_i$'s are indicators with mean $A$ and variance $A(1 - A)$. Hence, by the CLT, we have

$$P(|\hat{A} - A| < 0.01) = P\left(\frac{|\hat{A} - A|}{\sqrt{A(1 - A)/n}} < \frac{0.01}{\sqrt{A(1 - A)/n}}\right) \approx P\left(|Z| < \frac{0.01\sqrt{n}}{\sqrt{A(1 - A)}}\right).$$

For this probability to equal 0.99, we require

$$\frac{0.01\sqrt{n}}{\sqrt{A(1 - A)}} = 2.576 \;\Longrightarrow\; \sqrt{n} = 257.6\sqrt{A(1 - A)} \;\Longrightarrow\; n = 66357.76\,A(1 - A).$$

This is unsatisfying: the approximate sample size as it stands depends upon the very value that we're estimating. However, observe that $A(1 - A)$ has a maximum value of 0.25 (when $A = 0.5$), so the most conservative sample size would be $n = (66357.76)(0.25) = 16589.44$. Rounding up, we conclude that using 16,590 points would be sufficient for $P(|\hat{A} - A| < 0.01)$ to be at least 0.99 no matter what the true value of $A$ is.
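The sample-size calculation in 8(c) and the Monte Carlo estimate in 8(b) can be tried numerically. The following is only a minimal sketch: the function names `required_points` and `estimate_area` are hypothetical, and the quarter disk $x^2 + y^2 \le 1$ inside the unit square (true area $\pi/4$) is an assumed test shape that does not come from the problem itself. The normal quantile 2.576 is the table value used in the solution.

```python
import math
import random


def required_points(margin=0.01, z=2.576, area=0.5):
    """Smallest n with z * sqrt(area * (1 - area) / n) <= margin.

    area=0.5 is the worst case, since A(1 - A) is at most 0.25.
    """
    return math.ceil((z / margin) ** 2 * area * (1 - area))


def estimate_area(inside, n, rng):
    """Fraction of n uniform points in the unit square that land in the shape."""
    hits = sum(1 for _ in range(n) if inside(rng.random(), rng.random()))
    return hits / n


def in_quarter_disk(x, y):
    # Assumed example shape (not from the problem): quarter of the unit disk.
    return x * x + y * y <= 1.0


n = required_points()              # conservative sample size from part (c)
rng = random.Random(0)             # fixed seed so the run is repeatable
a_hat = estimate_area(in_quarter_disk, n, rng)
print(n, a_hat, math.pi / 4)
```

With the conservative $n$, the estimate should land within 0.01 of $\pi/4$ in about 99% of runs or more, since $A(1 - A)$ for this shape is below 0.25.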