Ch13. Confidence interval and Testing hypothesis two

Chapter 13. Confidence intervals and tests of hypotheses comparing two parameters

Problem PS274
1. Report the mean μ1 = 13 and the standard deviation σ1 = 3 of a normal random variable in cells A1 and A2. Do likewise for a second normal random variable in cells B1 and B2: μ2 = 10 and σ2 = 2. Generate 25 values of the first normal variable in the range A3:A27 and 16 values of the second variable in the range B3:B18. Compute the mean of the first sample in cell E3 and the mean of the second sample in cell F3;
2. Compute a 95% C.I. for the difference μ1 − μ2, assuming the variances σ1² and σ2² are known, in cell D7 (lower bound) and cell D8 (upper bound). Example for cell D7: =E3-F3-NORM.S.INV(0.975)*SQRT(A2*A2/25+B2*B2/16). Likewise for cell D8. Compute the width of the interval in cell D9;
3. As in Step 2 for the 99% C.I. in cells D12 to D14;
4. Repeat the simulation a number of times; each time compare the widths of the intervals and check whether the value 0 is contained in the interval;
5. Change the mean of the second normal variable to 12 and check the results;
6. Restore the mean of the second variable to 10, change the standard deviation of the first variable to 8 and of the second variable to 4, and check the influence of the increased standard deviations on the width of the intervals.

Assignment PA274
Generate data as in Step 1 above. Apply a two-sided hypothesis test H0: μ1 − μ2 = 0 assuming the variances are known. Use the following decision rule (p* = p-value): when p* < 0.01, reject H0 strongly; when 0.01 < p* < 0.05, reject H0; when 0.05 < p* < 0.1, reject H0 mildly; when p* > 0.1, accept H0. Apply a one-sided test with HA: μ1 − μ2 > 0 and a significance level of 0.05.

Problem PS275
1. Generate data as in Step 1 of Problem PS274, changing σ2 to the value 3. Compute the means of both samples in cells E3 and F3 and the variances in cells E4 and F4. Compute the pooled sample variance in cell F5: =(24*E4+15*F4)/(25+16-2);
2. Compute a 95% C.I. for the difference μ1 − μ2, assuming the variances σ1² and σ2² are unknown but equal, in cell D8 (lower bound) and cell D9 (upper bound). Example for cell D8: =E3-F3-T.INV.2T(0.05;25+16-2)*SQRT(F5*(1/25+1/16)). Likewise for cell D9. Compute the width of the interval in cell D10;
3. Perform a two-sided test for H0: μ1 − μ2 = 0 with significance level 0.05. Compute the T-ratio in cell D13: =(E3-F3)/SQRT(F5*(1/25+1/16)) and the p-value in cell D14: =T.DIST.2T(ABS(D13);25+16-2);
4. Use an alternative formula available in Excel to compute the p-value in cell D15: =T.TEST(A3:A27;B3:B18;2;2). Report the decision in cell D16: =IF(D15<0.01;"reject H0 strongly";IF(D15<0.05;"reject H0";IF(D15<0.1;"reject H0 mildly";"accept H0")));
5. Change the mean of the second normal variable to 12.5 and repeat the simulation;
6. Keep the second mean at 12.5 and decrease the value of both standard deviations to 0.5. Repeat the simulation a few times (key F9).

Assignment PA275
Generate data as in Step 1 above with the mean of the second normal variable equal to 12. Perform a two-sided test to compare the means in two ways: test 1 assumes the variances are known, as in Assignment PA274, and test 2 assumes the variances are unknown, as above. Repeat the tests several times and notice that the outcomes may differ.

Problem PS276
1. Generate data as in Step 1 of Problem PS274, changing σ1 to the value 5. Compute the means of both samples in cells E3 and F3 and the variances in cells E4 and F4. Compute the pooled sample variance in cell F5: =(24*E4+15*F4)/(25+16-2);
2. Compute a 95% C.I. for the difference μ1 − μ2 assuming the variances σ1² and σ2² are not equal (Behrens-Fisher problem). To do this, first compute s1²/n1 and s2²/n2 in cells E8 and F8 and the degrees of freedom in cell E9 using the formula =POWER(E8+F8;2)/(POWER(E8;2)/24+POWER(F8;2)/15).
Compute the lower bound of the interval in cell E11 and the upper bound in cell E12. Example for cell E11: =E3-F3-T.INV.2T(0.05;E9)*SQRT(E8+F8), and likewise for the upper bound. Compute the width of the interval in cell E13;
3. Compute a 95% C.I. for the difference μ1 − μ2 assuming the variances σ1² and σ2² are equal (wrong assumption) in cells I11 and I12. Compute the width of the interval in cell I13. Compare the results with Step 2;
4. Perform a two-sided test for H0: μ1 − μ2 = 0. Compute the T-ratio in cell D16: =(E3-F3)/SQRT(E8+F8). Compute the p-value in cell D17: =T.DIST.2T(ABS(D16);E9);
5. Use an alternative formula available in Excel to compute the p-value in cell D19: =T.TEST(A3:A27;B3:B18;2;3). The p-values in Steps 4 and 5 may differ slightly, probably because of a different handling of non-integer degrees of freedom. Report the decision in cell D20, similar to Step 4 in Problem PS275;
6. Perform a two-sided test assuming equal variances by computing the p-value using the T.TEST function of Excel (cell I17). Compare the result with Step 5.

Assignment PA276
As the t-test above is an approximate procedure, the degrees of freedom (dof) are sometimes computed according to a slightly different formula: dof = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/n1 + (s2²/n2)²/n2] − 2. Use this formula to compute the dof for the data above and compare with the result in Step 2.

Problem PS277
1. Report the mean μ1 = 10 and the standard deviation σ1 = 2.5 of a normal random variable in cells A1 and A2. Generate 20 values of the normal variable in the range A4:A23. Generate dependent values of a second variable in the range B4:B23 as follows (example for cell B4): =NORM.INV(RAND();A4-1.5;2), and likewise for cells B5:B23. Compute the pairwise differences in cells C4:C23. Compute the mean of the first sample in cell F3, the mean of the second sample in cell G3 and the mean of the differences in cell H3. Compute the sample variances in cells F4 to H4.
Compute the pooled sample variance in cell F5;
2. Perform a two-sided test for the equality of means for dependent samples. To do so, compute the ratio H3/SQRT(H4/20) in cell E7 and the p-value in cell E8: =T.DIST.2T(ABS(E7);19). An alternative way to compute the p-value in Excel, in cell E9: =T.TEST(A4:A23;B4:B23;2;1). Report the conclusion in cell E10: =IF(E9<0.01;"reject H0 strongly";IF(E9<0.05;"reject H0";IF(E9<0.1;"reject H0 mildly";"accept H0")));
3. Perform a two-sided test for the equality of means for independent samples (wrong assumption) assuming equal variances (use the T.TEST function). Compute the p-value in cell E13 and the conclusion in cell E14 (see Problem PS276). Notice the strong effect of the dependence in the samples;
4. Perform a two-sided test for the equality of means for independent samples (wrong assumption) assuming unequal variances (use the T.TEST function). Compute the p-value in cell E17 and the conclusion in cell E18 (see Problem PS276). Notice here, too, the strong effect of the dependence in the samples.

Assignment PA277
Generate data as in Step 1 above. Perform a two-sided test for the equality of the population means assuming dependent data. Compare the result with a two-sided test assuming
1. independent samples with known population variances;
2. independent samples with unknown but equal population variances.

Problem PS278
We test the equality of two variances of normally distributed data.
1. Generate data as in Step 1 of Problem PS274. Compute the variances of both samples in cells E3 and F3;
2. Compute the ratio of both variances in cell E6: =E3/F3 and the p-value for a two-sided test for the equality of the two variances in cell E7: =IF(E6>1;2*F.DIST.RT(E6;24;15);2*(1-F.DIST.RT(E6;24;15))). Draw the conclusion in cell E8: =IF(E7<0.01;"reject H0 strongly";IF(E7<0.05;"reject H0";IF(E7<0.1;"reject H0 mildly";"accept H0")));
3. An alternative and more direct way of computing the p-value for the two-sided test is =F.TEST(A3:A27;B3:B18). Notice that Excel assumes that a two-sided test is required;
4. Perform a one-sided test of the form H0: σ1² = σ2² versus HA: σ1² > σ2². Compute the p-value in cell E13: =F.DIST.RT(E6;24;15) and the conclusion in cell E14, similar to cell E8.

Assignment PA278
Repeat Step 1 above. Compute a 95% confidence interval for the ratio σ1²/σ2².

Problem PS279
Test of the equality of two proportions or fractions.
1. Report the (unknown) true proportion of successes of a first population in cell A2 (0.4) and of a second population in cell B2 (0.3). Generate 80 successes/failures from the first population in the range A3:A82. Example for cell A3: =IF(RAND()<A$2;"S";"F"). Do likewise for 75 observations from population 2 in the range B3:B77;
2. Compute the sample proportion of successes in both samples in cells D3 and E3. Example for cell D3: =COUNTIF(A3:A82;"S")/80, and similarly for cell E3. Assuming that both proportions in the populations are equal, compute the estimate of the common proportion in cell E4: =(80*D3+75*E3)/155;
3. Compute the test ratio in cell D7: =(D3-E3)/SQRT(E4*(1-E4)*(1/80+1/75)) and the p-value for a two-sided test of the equality of the two proportions in cell D8: =2*(1-NORM.S.DIST(ABS(D7);1)). Report the conclusion in cell D9, similar to the problems above. Notice that for these sample sizes the null hypothesis of equal proportions is often accepted;
4. Change the population proportion of the second population to 0.2 and check the results.

Assignment PA279
Repeat Step 1 above. Construct a 95% confidence interval for the difference π1 − π2 of the population proportions.

Problem PS280
Consider the data set 'Rice'. Assume the weights for the different fillers are (approximately) normally distributed.
1. Compute the sample variances for the weights for fillers 1 and 2 in cells G2 and G3 and the pooled sample variance in cell J2;
2. Test the equality of the variances of the weights of filler 1 and filler 2 in cell G4: =F.TEST(filler1;filler2), where filler1 and filler2 are the names of the data for the two fillers. Notice that this function applies a two-sided test for the equality of the variances. Answer: p-value = 0.3581. The hypothesis that both variances are equal cannot be rejected;
3. Compute a 95% C.I. for the ratio σ1²/σ2². The lower bound in cell G6: =F.INV(0.025;19;19)*G2/G3, and likewise for the upper bound;
4. Compute the sample means of the weights for fillers 1 and 2 in cells G9 and G10;
5. Apply a two-sided test for the equality of the means of the weights of filler 1 and filler 2 in cell G12 (assume the variances are equal): =T.TEST(filler1;filler2;2;2). Answer: p-value = 1.1623E-04. The hypothesis that both means are equal is strongly rejected;
6. Apply a two-sided test for the equality of the means of the weights of fillers 1 and 2 assuming the variances are unequal in cell G13: =T.TEST(filler1;filler2;2;3). Compare the p-value with Step 5. Answer: p-value = 1.2402E-04. The p-values in Steps 5 and 6 are quite close and lead to the same conclusion;
7. Derive a 95% C.I. for the difference of the means for fillers 1 and 2. The lower bound in cell G15: =G9-G10-T.INV.2T(0.05;38)*SQRT(J2*(1/20+1/20)), and likewise for the upper bound.

Assignment PA280
Apply Steps 1 to 7 above for the fillers 1 and 4 of the data set 'Rice'.

Problem PS281
Consider the data set 'Sabena'. We compare the mean delay at arrival between flights originating in Marseille (LINE STATION-DEP=MRS) and Florence (LINE STATION-DEP=FLR).
1. Generate in column R the delay times at arrival for flights originating in Marseille. Example for cell R2: =IF(G2="MRS";Q2;""). Do likewise in column S for flights originating in Florence;
2. Compute the mean and the variance of the delay times of flights originating in Marseille in cells T2 and T3. Do likewise for the flights originating in Florence in cells U2 and U3;
3. Cell T7: apply a two-sided test for the equality of the variances of the delays at arrival between flights originating in Marseille and in Florence (assume normality of delay times, see later): =F.TEST(R2:R3854;S2:S3854). Answer: p-value = 1.2070E-22. Decision: reject the equality of the variances strongly;
4. Cell T10: apply a two-sided test for the equality of the means of the delays at arrival between flights originating in Marseille and in Florence (assume normality of delay times, see later): =T.TEST(R2:R3854;S2:S3854;2;3). Answer: p-value = 0.0242. Decision: reject the equality of the means;
5. Cell T13: apply a two-sided test for the equality of the means of the delays at arrival between flights originating in Marseille and in Florence assuming equal variances (which is actually false): =T.TEST(R2:R3854;S2:S3854;2;2). Answer: p-value = 0.0236. Notice that the p-values do not differ much between the equal-variance and unequal-variance tests because of the fairly large sample sizes (217 and 215 data points).

Assignment PA281
Apply Steps 1 to 5 for the flights originating in Marseille and Edinburgh.

Problem PS282
Consider the data set 'Sabena'. Assume the first 50 observations to be a random sample of the data.
1. Use the sample of 50 observations to test the hypothesis that the average delay at departure equals the average delay at arrival. Clearly, delays at departure and at arrival cannot be considered independent. Therefore, to compute the p-value for a two-sided test, use the instruction =T.TEST(J2:J51;Q2:Q51;2;1), say in cell S5. Answer: p-value = 1.0135E-03 (significantly different means);
2. Assume now that delays at departure and arrival are independent (wrong assumption).
Test again the equality of average delays using the sample of 50 observations, assuming both equal and unequal variances. Use the function T.TEST with last argument 2 and 3. Answer: p-value = 8.3096E-02 assuming equal variances and p-value = 8.3176E-02 assuming unequal variances. Notice the (much) larger p-values compared to Step 1;
3. Check the dependence of both delays by computing the sample correlation coefficient. Answer: corr = 0.759;
4. Derive the p-value in Step 1 in an alternative way as follows: compute the difference of the delays for the 50 observations in column R, rows 2 to 51. Use the instruction =T.DIST.2T(ABS(AVERAGE(R2:R51)*SQRT(50)/STDEV.S(R2:R51));49). Compare with the p-value in Step 1.

Assignment PA282
Apply Steps 1 to 4 above using the next 50 observations as a random sample.

Problem PS283
Consider the data set 'Sabena'. We compare the proportion of flights with a delay of more than 5 minutes at arrival for flights originating in Marseille (LINE STATION-DEP=MRS) and Florence (LINE STATION-DEP=FLR).
1. Generate in column R the value 1 for flights from Marseille with a delay of more than 5 minutes. Example for cell R2: =IF(AND(G2="MRS";Q2>5);1;""). Do likewise in column S for flights originating in Florence;
2. Count the number of flights originating from Marseille in cell T2: =COUNTIF(G2:G3854;"MRS") and do likewise for the flights from Florence in cell U2. Compute the proportion of flights with a delay of more than 5 minutes originating from Marseille in cell T3: =SUM(R2:R3854)/T2. Do the same for the flights from Florence in cell U3. Compute the pooled proportion in cell U4: =(T2*T3+U2*U3)/(T2+U2). Answer: Marseille: 0.6590, Florence: 0.6093, pooled proportion: 0.6343;
3. Compute a 95% C.I. for the difference of the proportions of late flights from Marseille and Florence. For the lower bound in cell T6: =T3-U3-NORM.S.INV(0.975)*SQRT(T3*(1-T3)/T2+U3*(1-U3)/U2).
Do likewise for the upper bound in cell T7. Answer: lower bound: -0.0410, upper bound: 0.1404;
4. Apply a two-sided test for the equality of proportions of late flights from Marseille and Florence. First compute the ratio in cell T10: =(T3-U3)/SQRT(U4*(1-U4)*(1/T2+1/U2)). Compute the p-value in cell T11: =2*(1-NORM.S.DIST(ABS(T10);TRUE)). Answer: p-value = 0.2837. Equality of the proportions cannot be rejected.

Assignment PA283
Apply Steps 1 to 4 for the flights originating in Marseille and Edinburgh.
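
The pooled two-proportion test and the confidence interval of Problem PS283 can be cross-checked outside Excel. The following is a minimal sketch in Python (standard library only), not part of the original worksheet. The function name two_proportion_summary is hypothetical, and the counts of 143 and 131 late flights are assumptions, back-computed from the reported proportions 0.6590 and 0.6093 together with the sample sizes 217 and 215 mentioned in Problem PS281.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_summary(x1, n1, x2, n2, level=0.95):
    """Pooled z-test and unpooled confidence interval for p1 - p2.

    x1, x2: numbers of successes; n1, n2: sample sizes.
    Mirrors cells T3, U3, U4, T6, T7, T10 and T11 of Problem PS283.
    """
    std_normal = NormalDist()  # standard normal, like NORM.S.DIST / NORM.S.INV
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test of H0: p1 = p2 uses the pooled proportion in the standard error.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # The confidence interval uses the unpooled standard error.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)

    return {
        "p1": p1,
        "p2": p2,
        "pooled": pooled,
        "z": z,
        "p_value": p_value,
        "lower": diff - z_crit * se_unpooled,
        "upper": diff + z_crit * se_unpooled,
    }


# Assumed counts: 143 of 217 Marseille flights and 131 of 215 Florence
# flights arrive more than 5 minutes late (inferred, not given in the text).
result = two_proportion_summary(143, 217, 131, 215)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts, the output should agree, up to rounding, with the answers reported in Steps 2 to 4 (pooled proportion 0.6343, interval from -0.0410 to 0.1404, p-value 0.2837). Taking the absolute value of the test ratio keeps the two-sided p-value valid when the first sample proportion is the smaller one.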
