STAT 431: HOMEWORK 3 Problem 1. (5 points) Suppose X1 , . . . , Xn are i.i.d. draws from a population with mean µ and known variance σ 2 . Consider the following two situations: Situation 1: You perform the one-sided hypothesis test H0 : µ ≥ µ0 and H1 : µ < µ0 at signicance level α. Situation test 2: Using the same data set, your friend performs the one-sided hypothesis H0 : µ < µ0 and H1 : µ ≥ µ0 . at signicance level α. True or false: You reject your null hypothesis if and only if your friend does not reject his/her null hypothesis. Explain your answer. Without loss of generality, let's assume σ is known. Suppose we are testing the hypotheses at signicance level α. In situation 1, we reject H0 whenever X < µ0 − zα √σn (i.e., we reject H0 if X is in the red shaded region in the left-panel of the gure below.) In situation 2, we fail to reject H0 whenever X < µ0 + zα √σn (i.e., we fail to reject H0 if X is in the blue shaded region in the right panel of the gure below). Solution: 1 Note that if we reject H0 in situation 1, we will fail to reject H0 in situation 2 (in terms of the gure, if X is in the red region, it is also in the blue region). However if we fail to reject H0 in situation 2, we do not necessarily reject H0 (in terms of the gure, if X is in the blue shaded region, it is not necessarily in the red shaded region). Hence, the statement is false. Problem 2. (5 points) According to historical records, a large department store found that 1 out of every 14 people entering the store engaged in shoplifting. To help alleviate the problem, the store hired additional security guards. Suppose the store now chooses 300 shoppers at random and follows their movements by camera. If 18 of the 300 shoppers are found to engage in shoplifting, is there signicant evidence to conclude that the strategy is working? Solution: We want to test H0 : p ≥ 1/14 against H1 : p < 1/14. Our test statistic is p̂ − 1/4 q 1 300 1 4 × × 1− 1 4 = −0.769 and our p-value is computed below. > z <− (18 / 300 − 1/ 14) / s q r t (1 / 300 * 1/14 * 13/ 14) > pnorm ( z , mean = 0 , sd = 1) [ 1 ] 0.2210609 2 Since the p-value is larger than 0.05, we cannot reject the null hypothesis. So while the sample proportion of shoplifters is smaller than the historical record (6% compared to 7.14%), it is not signicantly smaller. Thus there is not enough evidence (at any reasonable signicance level) to conclude that the strategy is working. Problem 3. (a) (5 points) Let X be a random variable with a continuous cumulative distribution function F (so that the quantile function F −1 is well-dened). Dene a new random variable Y = F (X). Show that Y has a uniform distribution on the interval [0, 1]. Solution: show that To show that Y has a uniform distribution on the unit interval we need to P r(X ≤ x) = x for x ∈ [0, 1] . Since F −1 is well-dened we have that F is a bijection between the values that X may take and the interval [0, 1]. What does this imply? First, it means that the event {X ≤ x} for x in the range of X is equivalent to the event {F (X) ≤ F (x)}. Second, it means that for all y ∈ [0, 1] there is an x ∈ Range(X) such that y = F (x). The rst fact implies that F (x) = P r(X ≤ x) = P r(F (X) ≤ F (x)) = P r(Y ≤ F (x)) . So we have that P r(Y ≤ F (x)) = F (x) for all x ∈ Range(X). The second fact implies that P r(Y ≤ y) = y for all y ∈ [0, 1], completing our proof. (b) (5 points) Let X1 , X2 , . . . , Xn be i.i.d. random samples from a N (µ, σ 2 ) distribution, with known σ 2 . Consider testing H0 : µ = 0 versus H1 : µ > 0 √ 0 . For observed data x1 , x2 , . . . , xn the p-value is based on the test statistic Z = X̄−µ σ/ n x̄−µ √ 0 . Thus, over repeated random samples with the same p = 1 − Φ(z), with z = σ/ n sample size n, the p-value P = 1 − Φ(Z) is a random variable. Use part (a) to show that P has a uniform distribution on the interval [0, 1], and use this result to further show that a test which rejects H0 when the p-value is less than α has the probability of making Type I error equal to α. Solution: Using part (a) we know that Φ(Z) ∼ U[0, 1] so we can write P = 1 − U where U ∼ U[0, 1]. We begin by showing that P is uniform on [0, 1] when we assume that H0 is true. One way to see this is that U 0 = U − E[U ] = U − 1/2 is symmetric 3 about 0 and thus U − 1/2 and 1/2 − U have the same distribution. Consequently, 1/2+(U −1/2) = U and 1/2+(1/2−U ) = 1−U have the same distribution. Another way would be to show that P and U have the same distribution function. Problem 4. Let X1 , X2 , . . . , Xn be i.i.d. random samples from N (µ, σ2 ). Consider the problem of testing H0 : σ 2 = σ02 and H1 : σ 2 > σ02 , at signicance level α, using the test that rejects H0 if (n − 1)S 2 > χ2n−1,α , σ02 where S 2 is the sample variance. a (3 points) Find an expression for the power of this test in terms of the χ2n−1 distribution if the true σ 2 = cσ02 , where c > 1. 2 2 Solution: The expression for power at σ = cσ0 is P (n − 1)S 2 > χ2n−1,α σ 2 = cσ02 σ02 . has a chi-squared distribution under H0 but not under H1 . HowRecall that (n−1)S σ02 2 ever, we do know that (n−1)S is chi-squared distributed with n−1 degrees of freedom cσ02 when this particular alternative, σ 2 = cσ02 is true. Using this fact we can express the power in terms of the distribution function of the χ2n−1 , Fχ2n−1 (·): 2 Power (cσ02 ) =P (n − 1)S 2 > χ2n−1,α σ 2 = cσ02 2 σ0 χ2n−1,α (n − 1)S 2 =P > cσ02 c 2 χn−1,α = 1 − Fχ2n−1 . c 2 σ = cσ02 Notice that as c → ∞ the power increases to 1 and that as c → 1 then the power converges to the level, α. (b) (2 points) Find the power of this test if α = 0.05, n = 16, and c = 4, that is, the true σ 2 is four times the value being tested under H0 . Solution: To compute this we run the following R code, 4 > alpha <− 0 . 0 5 > n <− 16 > c <− 4 > pow <− 1 − p c h i s q ( q c h i s q (1 − alpha , df = n −1)/c , df = n −1) > pow [ 1 ] 0.9752557 Problem 5. On Groundhog Day, February 2, a famous groundhog in Punxsutawney, PA is used to predict whether a winter will be long or not based on whether or not he sees his shadow. We collected data on whether he saw his shadow or not, and kept them in the le groundhog.txt. Now, suppose we want to know whether this information says anything about whether or not we will experience a rainy winter in California. For this, we found rainfall data, and saved it in the le rainfall.csv. (a) (4 points) Make a boxplot of the average rainfall in Northen California comparing the years the groundhog sees his shadow versus the years he does not. 3 4 5 6 7 Solution: N Y ghog <− read . csv ( " groundhog . t x t " ) r a i n f a l l <− read . csv ( " r a i n f a l l . csv " ) a l l . data <− merge ( ghog , r a i n f a l l , by . x = " year " , by . y = "WY" ) a l l . data $Avg <− a l l . data $ Total /12 boxplot (Avg ~ shadow , data = a l l . data ) 5 (b) (3 points) Construct a 90% condence interval for the dierence between the mean rainfall during the month of February in years the groundhog sees his shadow and years he does not. Solution: Now we just consider February rainfall. Let Ȳ be the average February rainfall in years when the groundhog saw his shadow and N̄ the average when he did not. Since there are very few observations, we'll probably want to assume that February rainfall is normal so that we can use t critical values to construct our 90% CI. Our CI will take the form s Ȳ − N̄ − tν,0.05 Where ν = s2Y nY s2Y + s2N s , Ȳ + N̄ − tν,0.05 s2Y + s2N nN 2 2 2 2 2 s2N sY sN + nN / /(nY − 1) + nN /(nN − 1) ≈ 5.77. nY nY nN We can use R to compute the interval 6 nY > > > > > > > > > > > i .Y <− a l l . data $shadow=="Y" i .N <− a l l . data $shadow=="N" s .Y <− sd ( a l l . data $Feb [ i .Y] ) s .N <− sd ( a l l . data $Feb [ i .N] ) n .Y <− sum( i .Y) n .N <− sum( i .N) Y. bar <− mean ( a l l . data $Feb [ i .Y] ) N. bar <− mean ( a l l . data $Feb [ i .N] ) s <− s q r t ( s .Y^2/n .Y + s .N^2/n .N) nu <− s ^4/ ( ( s .Y^2/n .Y) ^2/ ( n .Y−1)+(s .N^2/n .N) ^2/ ( n .N−1) ) c (Y. bar −N. bar − qt ( 0 . 9 5 , df=nu ) * s ,Y. bar −N. bar + qt ( 0 . 9 5 , df= nu ) * s ) [ 1 ] − 3.878695 8.933445 The 90% CI is [−3.88, 8.93]. (GRADING NOTE: If the student uses a pooled standard error estimator correctly that answer should also receive full credit.) (c) (3 points) At level, α = 0.05, would you reject the null hypothesis that the average rainfall in Northern California during the month of February was the same in years the groundhog sees his shadow versus years he does not? What assumptions are you making in forming your condence interval and in your hypothesis test? The interval formed in part (b) includes zero which means we cannot reject the null hypothesis that the average rainfall in Northern California during the month of February was the same in years the groundhog sees his shadow versus years he does not at level α = 0.1. Consequently, we cannot reject this null at α = 0.05 either. We have assumed that the February rainfall data for years the g'hog sees his shadow and for years that he doesn't are independent samples and are normally distributed. However, we do not assume they have equal variance (it might actually be a good idea to do this since there are so few data, this would give us a better estimate of s... full credit will be given if you use the equal variance assumption and compute s appropriately). Solution: Problem 6. Suppose we are given two independent random samples of sizes n1 and n2 from Bernoulli populations with parameters p1 and p2 . Let p̂1 and p̂2 be the corresponding sample 7 proportions. Consider the problem of testing the hypotheses: H0 : p1 − p2 = 0 versus H1 : p1 − p2 6= 0. (a) (1 point) Read Section 9.2.1 from the book, and write down the test statistic for the above hypothesis. Solution: There are two possible test statistics we can use. The rst is p̂1 − p̂2 p̂1 − p̂2 − (p1 − p2 ) =q . t= q p̂2 (1−p̂2 ) p̂2 (1−p̂2 ) p̂1 (1−p̂1 ) p̂1 (1−p̂1 ) + + n1 n2 n1 n2 Since the null hypothesis states that p1 = p2 we can use the pooled proportion p̂ = n1 p̂1 + n2 p̂2 n1 + n2 to estimate the variance which gives us that p̂1 − p̂2 − (p1 − p2 ) t= r . 1 1 p̂(1 − p̂) n1 + n2 Either is OK to use for a hypothesis test while the second is not appropriate when constructing a condence interval. (GRADING NOTE: part (b) is solved using the rst version but full credit should be given if you use the second. Part (c) using the second version should only get partial credit) (b) Now, suppose a random sample of 220 female and 210 male coee drinkers was selected and interviewed. The result was that 71 women and 58 men indicated a preference for decaeinated coee. (i) (2 points) Should we conclude, at a 5% signicance level, that the proportion of female coee drinkers who prefer decaeinated coee diers from the proportion of male coee drinkers who prefer decaeinated coee? Let pM and pF be the proportion of male and female coee drinkers who prefer decaeinated coee. We want to test H0 : pM = pF vs H1 : pM 6= pF . Observe that Solution: p̂M pM (1 − pM ) pF (1 − pF ) − p̂F ∼ N pM − pF , + nM nF 8 We compute p̂M − p̂F q p̂M (1−p̂M ) nM + = −1.05 p̂F (1−p̂F ) nF If Z ∼ N (0, 1) then the p-value is P (|Z| ≥ 1.395) = 0.291, and we fail to reject the hypothesis that pM = pF . (ii) (2 points) Construct a 95% condence interval for the dierence in proportions between male and female coee drinkers who prefer decaeinated coee. Solution: s p̂M − p̂F − z α 2 The condence interval has the form p̂M (1 − p̂M ) p̂F (1 − p̂F ) + , p̂M − p̂F + z α2 nM nF s p̂M (1 − p̂M ) p̂F (1 − p̂F ) + nM nF We compute the 95% CI as [−0.148, 0.025]. > s <− s q r t ( ( phat .m * (1 − phat .m) ) /n .m + ( phat . f * (1 − phat . f ) ) /n . f ) > c ( phat .m − phat . f − qnorm ( 0 . 9 7 5 , 0 , 1) * s , phat .m − phat . f + qnorm ( 0 . 9 7 5 , 0 , 1) * s ) [ 1 ] − 0.13298585 0.03991226 Problem 7. Consider the problem of testing H0 : µ1 = µ2 versus H1 : µ1 > µ2 , when the samples are independently drawn from two normal populations N (µ1 , σ12 ) and N (µ2 , σ22 ), where σ12 and σ22 are assumed to be known. Let n1 and n2 be the sample sizes and x̄ and ȳ be the sample means, respectively. The α-level test for H0 rejects if x̄ − ȳ z=p 2 > zα . σ1 /n1 + σ22 /n2 (a) (4 points) Show that the power of the α-level test as a function of µ1 − µ2 is given by µ1 − µ2 π(µ1 − µ2 ) = Φ −zα + p 2 σ1 /n1 + σ22 /n2 9 ! . We rst observe that under the alternative µ1 − µ2 > 0, our test statisp tic (x̄ − ȳ)/ σ12 /n1 + σ22 /n2 is normal with standard deviation 1 and mean (µ1 − p µ2 )/ σ12 /n1 + σ22 /n2 . We now start with the general denition of power and use this fact to write it in terms of the standard normal CDF: Solution: π(µ1 − µ2 ) = P (reject H0 | H1 ) =P ! x̄ − ȳ =p 2 > zα µ1 − µ2 > 0 σ1 /n1 + σ22 /n2 x̄ − ȳ − (µ1 − µ2 ) µ1 − µ2 =P = p 2 > zα − p 2 2 σ1 /n1 + σ2 /n2 σ1 /n1 + σ22 /n2 ! µ1 − µ2 = 1 − Φ zα − p 2 σ1 /n1 + σ22 /n2 ! µ1 − µ2 = Φ −zα + p 2 . σ1 /n1 + σ22 /n2 ! µ1 − µ2 > 0 (b) (3 points) For detecting a specied dierence µ1 − µ2 = δ > 0 show that for a xed total sample size n1 + n2 = N , the power is maximized when n1 = σ1 N σ1 + σ2 n2 = σ2 N. σ1 + σ2 (Here we are ignoring integer restrictions on n1 and n2 ). In order to maximize π(µ1 − µ2 ) in terms of n1 , n2 given that we have a total sample size of N we need to make the term inside of Φ(·) as big as possible. Since µ1 − µ2 = δ > 0, this is equivalent to making the term σ12 /n1 + σ22 /n2 as small as possible such that n1 + n2 = N . Now notice that we can write n2 as N − n1 and write this term to minimize as the function Solution: f (n1 ) = σ12 /n1 + σ22 /(N − n1 ) . One can easily check that this function is strictly convex for n1 > 0 and thus setting the derivative to zero (rst order condition) and solving for n1 will yield the unique 10 minimizer: f 0 (n1 ) = σ22 /(N − n1 )2 − σ12 /(n21 ) = 0 ⇒ n1 = (σ1 /σ2 )(N − n1 ) ⇒ n1 (σ2 + σ1 )/σ2 = (σ1 /σ2 )N ⇒ n1 = N σ1 /(σ1 + σ2 ) . From the relation n2 = N − n1 we see that n2 = N σ2 /(σ1 + σ2 ), which is our desired result. (c) (3 points) Show that the smallest integer total sample size required to guarantee power at least 1 − β , when µ1 − µ2 = δ > 0, is given by (zα + zβ )(σ1 + σ2 ) N= δ 2 . If we are interested in the smallest N to guarantee a certain power, then we should use the result from part (b) to choose the allocation of the total sample across the two populations. Since Φ−1 is well dened we have that Solution: ! δ Φ −zα + p 2 σ1 /n1 + σ22 /n2 =1−β δ =⇒ −zα + p 2 = zβ σ1 /n1 + σ22 /n2 δ =⇒ p = zα + zβ (σ1 + σ2 )σ1 /N + (σ1 + σ2 )σ2 /N ) √ N zα + zβ = =⇒ p δ (σ1 + σ2 )2 2 (zα + zβ )(σ1 + σ2 ) =⇒ N = . δ Now since this might not return an integer, we must take the ceiling of this term to get the smallest integer N to guarantee us power of at least 1 − β . 11 Problem 8. (5 points) To study the eectiveness of a certain commercial liquid protein diet, the Food and Drug Administration sampled nine individuals who were entering a two-week weight loss program. Their weights immediately before and 6 months after completing the program are recorded below: Person Weight before Weight after 1 197 185 2 212 220 3 188 180 4 226 217 5 170 185 6 194 197 7 233 219 8 166 170 9 205 202 Based on the data, would you conclude that the diet is eective for weight loss? Use a 5% signicance level. What assumptions have you made? This is a matched pairs design (since each person contributes two measurements, one before the weight loss program and one after the program). For person i, let di be the dierence between their weight after the program and their weight before the program (weight after minus weight before). Let µd be the (population) average weight loss following the program. We want to test H0 : µd ≥ 0 vs H1 : µd < 0. 1 Let D and sD be the sample average and standard deviations of the observed dierences d1 , . . . , d9 . Then under the null we have Solution: D √ ∼ t9−1 sd / 9 and can proceed as we have previously. We compute our t-statistic to be -0.504 and the p-value to be 0.301 So we cannot reject the null that the average weight after the program is at least the average weight before the program (i.e. there's weak evidence that the plan doesn't work). The crucial assumption is that the di 's are i.i.d normal random variables. Note, that we do not need the weights themselves to be normally distribution, just the dierence. 1From the standpoint of the regulatory agency, approving a weight-loss regimen that doesn't work (or increases weight) is much worse than rejecting a regimen that does reduce weight. In light of this, we set H0 so that the rst situation corresponds to a Type I error. 12 Problem 9. (5 points) A study was instigated to see if southern California earthquakes of at least moderate size (having values of at least 4.4 on the Richter scale) are more likely to occur on certain days of the week than on others. The following data were obtained for 1100 earthquakes: Day Sun Mon Tues Weds Thurs Fri Sat Number of Earthquakes 156 144 170 158 172 148 152 (a) Test the hypothesis that an earthquake is equally likely to occur on any of the seven days of the week. Use a 5% signicance level. If earthquakes are equally likely to occur on any of the seven days, we would expect to see counts of 157.1429 in each cell of the table. Under the null we the approximate distribution of our test statistic Solution: 7 X (observedi − expected )2 i=1 expectedi i ∼ χ26 . We compute χ20.05,6 = 12.591 and our test-statistic is 4.269. Since 4.269 < 12.591, we fail to reject the hypothesis that earthquakes are equally likely to occur on any of the seven days of the week. > n . c <− 7 > obs <− c ( 1 5 6 , 144 , 170 , 158 , 172 , 148 , 152) > expected <− rep (sum( obs ) /n . c , times = n . c ) > c r i t i c a l . value <− q c h i s q ( 0 . 9 5 , df = n . c − 1) > c r i t i c a l . value [ 1 ] 12.59159 > t e s t . s t a t <− sum( ( obs − expected ) ^2/ expected ) > test . stat [ 1 ] 4.269091 (b) What is the p-value of the data? The p-value is the probability that χ26 random variable exceeds our observed test-statistic. Solution: > 1 − p c h i s q ( t e s t . s t a t , df = n . c − 1) [ 1 ] 0.640312 13 (c) What kind of condence interval might be used here? We may try to construct a condence interval for any parameter that summarizes the "dierences" between the parameters. For example, the variance of of the probability of earthquakes per day, or the dierence between the max and min probability. However, nding a condence interval may be dicult using the methods we have talked about. None of the standard intervals t this directly. 14