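Statistics
Statistical Inference for Two Samples

Contents, figures, and exercises come from the textbook: Applied Statistics and Probability for Engineers, 5th Edition, by Douglas C. Montgomery, John Wiley & Sons, Inc., 2011.

Inference on the Difference in Means of Two Normal Distributions, Variances Known

Assumptions
1. $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample from population 1.
2. $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample from population 2.
3. The two populations represented by $X_1$ and $X_2$ are independent.
4. Both populations are normal.

$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

The quantity
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
has a $N(0,1)$ distribution.

Hypothesis tests on the difference in means, variances known

Test statistic:
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

Hypotheses, two-sided alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$
P-value: $P = 2[1 - \Phi(|z_0|)]$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Hypotheses, upper-tailed alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 > \Delta_0$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$

Hypotheses, lower-tailed alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 < \Delta_0$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$

Type II error and choice of sample size
Finding the probability of type II error

Hypotheses, two-sided alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$
Suppose the true value of the difference under $H_1$ is $\mu_1 - \mu_2 = \Delta$.
Test statistic:
$$z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} = \frac{\bar{X}_1 - \bar{X}_2 - \Delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} + \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

Under $H_1$,
$$z_0 \sim N\left(\frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}},\ 1\right)$$
$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$$

Type II error and choice of sample size
Sample size formulas

If $\Delta - \Delta_0 > 0$, the second term is negligible:
$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) \approx \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$$

If $\Delta - \Delta_0 > 0$, let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution.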
Then $\beta = \Phi(-z_\beta)$ and
$$-z_\beta \approx z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

Note that, because $\Phi(x) = 1 - \Phi(-x)$,
$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$$
$$= \left[1 - \Phi\left(-z_{\alpha/2} + \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)\right] - \left[1 - \Phi\left(z_{\alpha/2} + \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)\right]$$
$$= \Phi\left(z_{\alpha/2} + \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} + \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$$
so the same approximation applies when $\Delta - \Delta_0 < 0$.

Sample size for a two-sided test on the difference in means with $n_1 = n_2$, variances known
$$n \approx \frac{(z_{\alpha/2} + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$

Sample size for a one-sided test on the difference in means with $n_1 = n_2$, variances known
$$n = \frac{(z_\alpha + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d$ for various sample sizes $n = n_1 = n_2$:
$$d = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$
See Appendix VII.
For a given $n$ and $d$, find $\beta$. For a given $\beta$ and $d$, find $n$.
If $n_1 \neq n_2$, use
$$n = \frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2/n_1 + \sigma_2^2/n_2}$$

Confidence interval on the difference in means, variances known
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
has a standard normal distribution, so
$$P\left(-z_{\alpha/2} \le \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \le z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(\bar{X}_1 - \bar{X}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) = 1 - \alpha$$

Confidence interval on the difference in means, variances known
$$\bar{X}_1 - \bar{X}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Choice of sample size
From
$$-z_{\alpha/2} \le \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \le z_{\alpha/2}$$
$$|\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)| \le z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{n}} \le E$$
we have
$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 (\sigma_1^2 + \sigma_2^2)$$
if $n = n_1 = n_2$. This is the sample size $n$ required so that the error in estimating $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$ will be less than $E$ at $100(1-\alpha)\%$ confidence.

One-sided confidence bounds on the difference in means, variances known
◦ A $100(1-\alpha)\%$ upper-confidence bound for $\mu_1 - \mu_2$ is
$$\mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + z_\alpha\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
◦ A $100(1-\alpha)\%$ lower-confidence bound for $\mu_1 - \mu_2$ is
$$\bar{X}_1 - \bar{X}_2 - z_\alpha\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2$$

Example 10-1 Paint Drying Time
$\sigma_1 = \sigma_2 = 8$, $\alpha = 0.05$, $n_1 = n_2 = 10$, $\bar{x}_1 = 121$, $\bar{x}_2 = 112$
What conclusions can the product developer draw about the effectiveness of the new ingredient?

Example 10-2 Paint Drying Time, Sample Size from OC Curves
If the true difference in mean drying times is as much as 10 minutes, find the sample sizes required to detect this difference with probability at least 0.90.
$\Delta_0 = 0$, $\Delta = 10$,
$$d = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{10}{\sqrt{8^2 + 8^2}} = 0.88$$

Example 10-3 Paint Drying Time Sample Size
If the true difference in mean drying times is as much as 10 minutes, find the sample sizes required to detect this difference with probability at least 0.90.
$1 - \beta = 0.90$

Exercise 10-9
The concentration of active ingredient in a liquid laundry detergent is thought to be affected by the type of catalyst used in the process. The standard deviation of active concentration is known to be 3 grams per liter, regardless of the catalyst type. Ten observations on concentration are taken with each catalyst, and the data follow:
Catalyst 1: 57.9, …
Catalyst 2: 66.4, …
(a) Find a 95% confidence interval on the difference in mean active concentrations for the two catalysts. Find the P-value.
(b) Is there any evidence to indicate that the mean active concentrations depend on the choice of catalyst? Base your answer on the results of part (a).
(c) Suppose that the true mean difference in active concentration is 5 grams per liter. What is the power of the test to detect this difference if $\alpha = 0.05$?
(d) If this difference of 5 grams per liter is really important, do you consider the sample sizes used by the experimenter to be adequate? Does the assumption of normality seem reasonable for both samples?
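The summary data of Example 10-1 and the requirement of Example 10-3 can be pushed through the variance-known formulas numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the upper-tailed alternative ($H_1: \mu_1 > \mu_2$) and the use of the one-sided sample size formula are assumptions, since the slides do not state which alternative the examples use.

```python
from math import sqrt, ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, N(0, 1)


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return z0 for H0: mu1 - mu2 = delta0 when both variances are known."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / se


def sample_size_one_sided(alpha, beta, sigma1, sigma2, delta, delta0=0.0):
    """Common n = n1 = n2 for a one-sided test, rounded up to an integer."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)  # upper 100*alpha percentage point
    z_beta = STD_NORMAL.inv_cdf(1 - beta)    # upper 100*beta percentage point
    n = (z_alpha + z_beta) ** 2 * (sigma1**2 + sigma2**2) / (delta - delta0) ** 2
    return ceil(n)


# Example 10-1 summary data: sigma1 = sigma2 = 8, n1 = n2 = 10.
z0 = two_sample_z(121, 112, 8, 8, 10, 10)
p_upper = 1 - STD_NORMAL.cdf(z0)                 # assumes H1: mu1 > mu2
p_two_sided = 2 * (1 - STD_NORMAL.cdf(abs(z0)))  # for H1: mu1 != mu2

# Example 10-3: detect a 10-minute difference with power 0.90 at alpha = 0.05.
n_required = sample_size_one_sided(0.05, 0.10, 8, 8, delta=10)

print(f"z0 = {z0:.2f}, upper-tailed P = {p_upper:.4f}, two-sided P = {p_two_sided:.4f}")
print(f"n per group = {n_required}")
```

Under either alternative the P-value is below 0.05, so the null hypothesis of equal mean drying times would be rejected at that level.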
Exercise 10-9
[Figures: normal probability plots for Catalyst 1 and Catalyst 2]

Inference on the Difference in Means of Two Normal Distributions, Variances Unknown
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Hypothesis tests on the difference in means, variances unknown
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
The pooled estimator of $\sigma^2$, denoted by $S_p^2$, is
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
We know that
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}}$$
has a $N(0,1)$ distribution. Then
$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}$$
has a $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom.

Hypothesis tests on the difference in means, variances unknown
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Test statistic:
$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{1/n_1 + 1/n_2}}$$

Hypotheses, two-sided alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$
P-value: $P = 2P(T_{n_1+n_2-2} > |t_0|)$
Reject $H_0$ if $t_0 > t_{\alpha/2,\,n_1+n_2-2}$ or $t_0 < -t_{\alpha/2,\,n_1+n_2-2}$

Hypotheses, upper-tailed alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 > \Delta_0$
P-value: $P = P(T_{n_1+n_2-2} > t_0)$
Reject $H_0$ if $t_0 > t_{\alpha,\,n_1+n_2-2}$

Hypotheses, lower-tailed alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 < \Delta_0$
P-value: $P = P(T_{n_1+n_2-2} < t_0)$
Reject $H_0$ if $t_0 < -t_{\alpha,\,n_1+n_2-2}$

Hypothesis tests on the difference in means, variances unknown
Case 2: $\sigma_1^2 \neq \sigma_2^2$
If $H_0: \mu_1 - \mu_2 = \Delta_0$ is true, the statistic
$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
is distributed approximately as $t$ with degrees of freedom given by
$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

Type II error and choice of sample size
Finding the probability of type II error
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Hypotheses, two-sided alternative
$H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$
Test statistic:
$$t_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{1/n_1 + 1/n_2}} = \frac{\dfrac{\bar{X}_1 - \bar{X}_2 - \Delta}{\sigma\sqrt{1/n_1 + 1/n_2}} + \dfrac{\Delta - \Delta_0}{\sigma\sqrt{1/n_1 + 1/n_2}}}{\sqrt{\dfrac{(n_1 + n_2 - 2)S_p^2}{\sigma^2} \cdot \dfrac{1}{n_1 + n_2 - 2}}}$$
Under $H_1$, $t_0$ follows the noncentral $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom and noncentrality parameter
$$\frac{\Delta - \Delta_0}{\sigma\sqrt{1/n_1 + 1/n_2}}$$

$$\beta = P\{-t_{\alpha/2,\,n_1+n_2-2} \le T_0 \le t_{\alpha/2,\,n_1+n_2-2} \mid H_1\} = P\{-t_{\alpha/2,\,n_1+n_2-2} \le T_0' \le t_{\alpha/2,\,n_1+n_2-2}\}$$
where $T_0'$ denotes the noncentral $t$ random variable with $n_1 + n_2 - 2$ degrees of freedom and noncentrality parameter $(\Delta - \Delta_0)/(\sigma\sqrt{1/n_1 + 1/n_2})$.

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d$ for various sample sizes $n^*$:
$$d = \frac{|\Delta - \Delta_0|}{2\sigma}, \quad n^* = 2n - 1, \quad n = n_1 = n_2$$
See Appendix VII.
Note that $d$ depends on the unknown parameter $\sigma$.

Confidence interval on the difference in means, variances unknown
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
With the pooled estimator $S_p^2$ defined above,
$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}$$
has a $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom, so
$$P(-t_{\alpha/2,\,n_1+n_2-2} \le T \le t_{\alpha/2,\,n_1+n_2-2}) = 1 - \alpha$$

Confidence interval on the difference in means, variances unknown and equal
◦ Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
$$\bar{X}_1 - \bar{X}_2 - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Confidence interval on the difference in means, variances unknown and unequal
Case 2: $\sigma_1^2 \neq \sigma_2^2$
If $H_0: \mu_1 - \mu_2 = \Delta_0$ is true, the statistic
$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
is distributed approximately as $t$ with degrees of freedom given by
$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
Confidence interval:
$$\bar{X}_1 - \bar{X}_2 - t_{\alpha/2,\,v}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{\alpha/2,\,v}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

Example 10-5 Yield from a Catalyst
$\alpha = 0.05$, $n_1 = n_2 = 8$, $\bar{x}_1 = 92.255$, $\bar{x}_2 = 92.733$, $s_1 = 2.39$, $s_2 = 2.98$
Is there any difference between the mean yields?
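To make the Case 1 and Case 2 statistics concrete, the sketch below evaluates both on the summary data of Example 10-5. It is only an illustration under stated assumptions: the function names are made up for this note, $\Delta_0 = 0$ is assumed, and the critical value $t_{0.025,14} = 2.145$ is typed in from a standard $t$ table rather than computed, because the Python standard library has no $t$ distribution.

```python
from math import sqrt


def pooled_t(xbar1, xbar2, s1, s2, n1, n2, delta0=0.0):
    """Return (t0, degrees of freedom) for the pooled-variance test (Case 1)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance S_p^2
    t0 = (xbar1 - xbar2 - delta0) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t0, df


def welch_t(xbar1, xbar2, s1, s2, n1, n2, delta0=0.0):
    """Return (t0*, approximate degrees of freedom v) for unequal variances (Case 2)."""
    a, b = s1**2 / n1, s2**2 / n2
    t0_star = (xbar1 - xbar2 - delta0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0_star, v


T_CRIT_0025_14 = 2.145  # t_{0.025,14}, taken from a t table (assumption: alpha = 0.05)

# Example 10-5 summary data.
t0, df = pooled_t(92.255, 92.733, 2.39, 2.98, 8, 8)
reject = abs(t0) > T_CRIT_0025_14
t0_star, v = welch_t(92.255, 92.733, 2.39, 2.98, 8, 8)

print(f"pooled: t0 = {t0:.3f}, df = {df}, reject H0: {reject}")
print(f"Case 2: t0* = {t0_star:.3f}, v = {v:.2f}")
```

Because $n_1 = n_2$ here, the two statistics coincide and only the degrees of freedom differ; in practice $v$ is rounded down to an integer before a table lookup.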
Example 10-6 Arsenic in Drinking Water
$\alpha = 0.05$, $n_1 = n_2 = 10$, $\bar{x}_1 = 12.5$, $\bar{x}_2 = 27.5$, $s_1 = 7.63$, $s_2 = 15.3$
Is there any difference in mean arsenic concentrations? (It is unlikely that the population variances are the same.)

Example 10-7 Yield from a Catalyst Sample Size
If catalyst 2 produces a mean yield that differs from the mean yield of catalyst 1 by 4.0%, we would like to reject the null hypothesis with probability at least 0.85. What sample size is required?
$s_p = 2.70$,
$$d = \frac{|\Delta|}{2\sigma} = \frac{|4.0|}{(2)(2.70)} = 0.74$$

Example 10-8 Cement Hydration
$\alpha = 0.05$, $n_1 = 10$, $n_2 = 15$, $\bar{x}_1 = 90.0$, $\bar{x}_2 = 87.0$, $s_1 = 5.0$, $s_2 = 4.0$
Find the confidence interval for $\mu_1 - \mu_2$.

Exercise 10-29
The overall distance traveled by a golf ball is tested by hitting the ball with Iron Byron, a mechanical golfer with a swing that is said to emulate the legendary champion, Byron Nelson. Ten randomly selected balls of two different brands are tested and the overall distance measured. The data follow:
Brand 1: 251, …
Brand 2: 236, …
(a) Is there evidence that overall distance is approximately normally distributed? Is an assumption of equal variances justified?
(b) Test the hypothesis that both brands of ball have equal mean overall distance. Use $\alpha = 0.05$. What is the P-value?
(c) Construct a 95% two-sided CI on the mean difference in overall distance between the two brands of golf balls.
(d) What is the power of the statistical test in part (b) to detect a true difference in mean overall distance of 4.5 m?
(e) What sample size would be required to detect a true difference in mean overall distance of 2.75 m with power of approximately 0.75?
[Figure: Exercise 10-29, normal probability plot]

A Nonparametric Test for the Difference in Two Means

Wilcoxon rank-sum test
Appendix Table X ($w_\alpha$)
Two samples, $n_1 \le n_2$.
Arrange all $n_1 + n_2$ observations in ascending order of magnitude and assign ranks to them. If two or more observations are tied (identical), use the mean of the ranks that would have been assigned if the observations differed.
$W_1$: the sum of the ranks in the smaller sample
$$W_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - W_1$$

Wilcoxon rank-sum test
Appendix Table X ($w_\alpha$)

Hypotheses, two-sided alternative
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \neq \mu_2$
Reject $H_0$ if $\min(w_1, w_2) \le w_\alpha$

Hypotheses, upper-tailed alternative
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 > \mu_2$
Reject $H_0$ if $w_2 \le w_\alpha$

Hypotheses, lower-tailed alternative
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 < \mu_2$
Reject $H_0$ if $w_1 \le w_\alpha$

Normal approximation for the Wilcoxon rank-sum test statistic ($n_1 > 8$ and $n_2 > 8$)
$$\mu_{W_1} = \frac{n_1(n_1 + n_2 + 1)}{2}, \qquad \sigma_{W_1}^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$
$$Z_0 = \frac{W_1 - \mu_{W_1}}{\sigma_{W_1}}$$
Reject $H_0$ if $|z_0| > z_{\alpha/2}$ for $H_1: \mu_1 \neq \mu_2$, or if $z_0 > z_\alpha$ for $H_1: \mu_1 > \mu_2$, or if $z_0 < -z_\alpha$ for $H_1: \mu_1 < \mu_2$.

Example 10-9 Axial Stress
$n_1 = n_2 = 10$, $\alpha = 0.05$
We wish to test the hypothesis that the means of the two stress distributions are identical.

Exercise 10-33
The manufacturer of a hot tub is interested in testing two different heating elements for his product. The element that produces the maximum heat gain after 15 minutes would be preferable. He obtains 10 samples of each heating unit and tests each one. The heat gain after 15 minutes (in ℃ (K)) follows.
Unit 1: 25, …
Unit 2: 31, …
(a) Is there any reason to suspect that one unit is superior to the other? Use $\alpha = 0.05$ and the Wilcoxon rank-sum test.
(b) Use the normal approximation for the Wilcoxon rank-sum test. Assume that $\alpha = 0.05$. What is the approximate P-value for the test statistic?

Paired t-Test
Define the differences between each pair of observations as $D_j = X_{1j} - X_{2j}$, $j = 1, 2, \ldots, n$.
The $D_j$'s are assumed to be normally distributed with mean
$$\mu_D = E(X_1 - X_2) = E(X_1) - E(X_2) = \mu_1 - \mu_2$$
and variance $\sigma_D^2$.
Test statistic:
$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$

Hypotheses, two-sided alternative
$H_0: \mu_D = \Delta_0$, $H_1: \mu_D \neq \Delta_0$
P-value: $P = 2P(T_{n-1} > |t_0|)$
Reject $H_0$ if $t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$

Hypotheses, upper-tailed alternative
$H_0: \mu_D = \Delta_0$, $H_1: \mu_D > \Delta_0$
P-value: $P = P(T_{n-1} > t_0)$
Reject $H_0$ if $t_0 > t_{\alpha,\,n-1}$

Hypotheses, lower-tailed alternative
$H_0: \mu_D = \Delta_0$, $H_1: \mu_D < \Delta_0$
P-value: $P = P(T_{n-1} < t_0)$
Reject $H_0$ if $t_0 < -t_{\alpha,\,n-1}$

Paired versus unpaired comparisons
The two-sample $t$-statistic is
$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{1/n + 1/n}}$$
which would be compared to $t_{2n-2}$.
The paired $t$-statistic is
$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$
which is compared to $t_{n-1}$.
Since $\bar{D} = \bar{X}_1 - \bar{X}_2$,
$$V(\bar{D}) = V(\bar{X}_1 - \bar{X}_2 - \Delta_0) = V(\bar{X}_1) + V(\bar{X}_2) - 2\,\mathrm{cov}(\bar{X}_1, \bar{X}_2) = \frac{2\sigma^2(1 - \rho)}{n}$$
whereas the squared denominator of the two-sample statistic, $S_p^2(1/n + 1/n)$, estimates $2\sigma^2/n$.

Guidelines
If the experimental units are relatively homogeneous (small $\sigma$) and the correlation within pairs is small, the gain in precision attributable to pairing will be offset by the loss of degrees of freedom, so an independent-sample experiment should be used.
If the experimental units are relatively heterogeneous (large $\sigma$) and there is large positive correlation within pairs, the paired experiment should be used. Typically, this case occurs when the experimental units are the same for both treatments.

Confidence interval for $\mu_D$
$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$$
has a $t$ distribution with $n - 1$ degrees of freedom, so
$$P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$$
Confidence interval on the difference in means:
$$\bar{d} - t_{\alpha/2,\,n-1}\,s_D/\sqrt{n} \le \mu_D \le \bar{d} + t_{\alpha/2,\,n-1}\,s_D/\sqrt{n}$$

Example 10-10 Shear Strength of Steel Girders
◦ $n = 9$, $\alpha = 0.05$, $\bar{d} = 0.2769$, $s_d = 0.1350$
◦ We wish to determine whether there is any difference (on the average) between the two methods.

Example 10-11 Parallel Park Cars
◦ $n = 14$, $\alpha/2 = 0.05$, $\bar{d} = 1.21$, $s_d = 12.68$
◦ The 90% confidence interval for $\mu_D = \mu_1 - \mu_2$ is
$$\bar{d} - t_{0.05,13}\,s_D/\sqrt{n} \le \mu_D \le \bar{d} + t_{0.05,13}\,s_D/\sqrt{n}$$
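The paired statistic and interval can be checked on the summary data of Examples 10-10 and 10-11. This is a rough sketch rather than the textbook's worked solution: the helper names are hypothetical, $\Delta_0 = 0$ is assumed for Example 10-10, and the critical values $t_{0.025,8} = 2.306$ and $t_{0.05,13} = 1.771$ are entered by hand from a $t$ table.

```python
from math import sqrt


def paired_t(dbar, s_d, n, delta0=0.0):
    """Return t0 = (dbar - delta0) / (s_d / sqrt(n)) for the paired t-test."""
    return (dbar - delta0) / (s_d / sqrt(n))


def paired_ci(dbar, s_d, n, t_crit):
    """Return (lower, upper) limits dbar -/+ t_crit * s_d / sqrt(n)."""
    half_width = t_crit * s_d / sqrt(n)
    return dbar - half_width, dbar + half_width


T_0025_8 = 2.306   # t_{0.025,8} from a t table (two-sided test at alpha = 0.05)
T_005_13 = 1.771   # t_{0.05,13} from a t table (90% two-sided interval)

# Example 10-10: n = 9 girders, dbar = 0.2769, s_d = 0.1350.
t0 = paired_t(0.2769, 0.1350, 9)
reject = abs(t0) > T_0025_8

# Example 10-11: n = 14 subjects, dbar = 1.21, s_d = 12.68.
lower, upper = paired_ci(1.21, 12.68, 14, T_005_13)

print(f"Example 10-10: t0 = {t0:.2f}, reject H0: {reject}")
print(f"Example 10-11: 90% CI for mu_D = ({lower:.2f}, {upper:.2f})")
```

The interval for Example 10-11 contains zero, so those data alone would not indicate a difference in mean parking time at the 90% level.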
Exercise 10-45
◦ An article in Neurology (1988, Vol. 50, pp. 1246-1252) discussed that monozygotic twins share numerous physical, psychological, and pathological traits. The investigators measured an intelligence score of 10 pairs of twins, and the data are as follows: ….
◦ (a) Is the assumption that the difference in score is normally distributed reasonable? Show results to support your answer.
◦ (b) Find a 95% confidence interval on the difference in mean score. Is there any evidence that mean score depends on birth order?
◦ (c) It is important to detect a mean difference in score of one point, with a probability of at least 0.90. Was the use of 10 pairs an adequate sample size? If not, how many pairs should have been used?

Inference on the Variances of Two Normal Distributions

F distribution
Let $W$ and $Y$ be independent chi-square random variables with $u$ and $v$ degrees of freedom, respectively. Then the ratio
$$F = \frac{W/u}{Y/v}$$
has the probability density function
$$f(x) = \frac{\Gamma\left(\frac{u+v}{2}\right)\left(\frac{u}{v}\right)^{u/2} x^{(u/2)-1}}{\Gamma\left(\frac{u}{2}\right)\Gamma\left(\frac{v}{2}\right)\left[\left(\frac{u}{v}\right)x + 1\right]^{(u+v)/2}}, \quad 0 < x < \infty$$
and is said to follow the F distribution with $u$ degrees of freedom in the numerator and $v$ degrees of freedom in the denominator.

[Figure: PDF of the F distribution. From Wikipedia, http://www.wikipedia.org.]
[Figure: CDF of the F distribution. From Wikipedia, http://www.wikipedia.org.]

F distribution
$$\mu = \frac{v}{v - 2} \ (v > 2), \qquad \sigma^2 = \frac{2v^2(u + v - 2)}{u(v - 2)^2(v - 4)} \ (v > 4)$$
$$P(F > f_{\alpha,u,v}) = \int_{f_{\alpha,u,v}}^{\infty} f(x)\,dx = \alpha$$
$$f_{1-\alpha,u,v} = \frac{1}{f_{\alpha,v,u}}$$
$$F = \frac{\left[(n_1 - 1)S_1^2/\sigma_1^2\right]/(n_1 - 1)}{\left[(n_2 - 1)S_2^2/\sigma_2^2\right]/(n_2 - 1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

Test on the ratio of variances from two normal distributions
Test statistic:
$$F_0 = \frac{S_1^2}{S_2^2}$$

Hypotheses, two-sided alternative
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
P-value: $P = 2\min\left(P(F_{n_1-1,\,n_2-1} > f_0),\ P(F_{n_1-1,\,n_2-1} < f_0)\right)$
Reject $H_0$ if $f_0 > f_{\alpha/2,\,n_1-1,\,n_2-1}$ or $f_0 < f_{1-\alpha/2,\,n_1-1,\,n_2-1}$

Hypotheses, upper-tailed alternative
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$
P-value: $P = P(F_{n_1-1,\,n_2-1} > f_0)$
Reject $H_0$ if $f_0 > f_{\alpha,\,n_1-1,\,n_2-1}$

Hypotheses, lower-tailed alternative
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$
P-value: $P = P(F_{n_1-1,\,n_2-1} < f_0)$
Reject $H_0$ if $f_0 < f_{1-\alpha,\,n_1-1,\,n_2-1}$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
$$\beta = P\left\{f_{1-\alpha/2,\,n_1-1,\,n_2-1} \le \frac{\dfrac{(n_1 - 1)S_1^2}{\sigma_1^2} \cdot \dfrac{\sigma_1^2}{n_1 - 1}}{\dfrac{(n_2 - 1)S_2^2}{\sigma_2^2} \cdot \dfrac{\sigma_2^2}{n_2 - 1}} \le f_{\alpha/2,\,n_1-1,\,n_2-1} \,\Big|\, H_1\right\}$$
$$= P\left\{\frac{\sigma_2^2}{\sigma_1^2}\, f_{1-\alpha/2,\,n_1-1,\,n_2-1} \le F_{n_1-1,\,n_2-1} \le \frac{\sigma_2^2}{\sigma_1^2}\, f_{\alpha/2,\,n_1-1,\,n_2-1}\right\}$$

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $\lambda = \sigma_1/\sigma_2$ for various sample sizes $n = n_1 = n_2$.
See Appendix VII.

Confidence interval on the ratio of two variances
$$F = \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2}$$
has an F distribution with $n_2 - 1$ and $n_1 - 1$ degrees of freedom, so
$$P\left(f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le F \le f_{\alpha/2,\,n_2-1,\,n_1-1}\right) = 1 - \alpha$$
Confidence interval on the ratio of variances from two normal distributions:
$$\frac{s_1^2}{s_2^2}\, f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{\alpha/2,\,n_2-1,\,n_1-1}$$

Example 10-12 Semiconductor Etch Variability
◦ $n = 16$, $s_1 = 1.96$, $s_2 = 2.13$
◦ Is there any evidence to indicate that either gas is preferable? Use a fixed-level test with $\alpha = 0.05$.

Example 10-13 Semiconductor Etch Variability Sample Size
◦ Suppose that one gas resulted in a standard deviation of oxide thickness that is half the standard deviation of oxide thickness of the other gas.
If we wish to detect such a situation with probability at least 0.80, is the sample size $n_1 = n_2 = 20$ adequate?
$$\lambda = \frac{\sigma_1}{\sigma_2} = 2$$

Example 10-14 Surface Finish for Titanium Alloy
◦ $n_1 = 11$, $s_1 = 0.13$, $n_2 = 16$, $s_2 = 0.12$
◦ Find a 90% confidence interval on the ratio of the two standard deviations, $\sigma_1/\sigma_2$.
$$\frac{s_1^2}{s_2^2}\, f_{0.95,15,10} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\, f_{0.05,15,10}$$

Inference on Two Population Proportions
Hypotheses: $H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
The statistic
$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
is distributed approximately as standard normal.
If $p_1 = p_2 = p$,
$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{p(1 - p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
is distributed approximately $N(0,1)$.
A pooled estimator of the common parameter $p$ is
$$\hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$
The test statistic for $H_0: p_1 = p_2$ is then
$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Approximate tests on the difference of two population proportions
Test statistic:
$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Hypotheses, two-sided alternative
$H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
P-value: $P = 2(1 - \Phi(|z_0|))$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Hypotheses, upper-tailed alternative
$H_0: p_1 = p_2$, $H_1: p_1 > p_2$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$

Hypotheses, lower-tailed alternative
$H_0: p_1 = p_2$, $H_1: p_1 < p_2$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative
$H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
$$\beta = P\left\{-z_{\alpha/2} \le \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})(1/n_1 + 1/n_2)}} \le z_{\alpha/2} \,\Big|\, H_1\right\}$$
$$\approx P\left\{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} \le \hat{P}_1 - \hat{P}_2 \le z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} \,\Big|\, H_1\right\}$$
where
$$\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \quad \text{and} \quad \bar{q} = \frac{n_1(1 - p_1) + n_2(1 - p_2)}{n_1 + n_2}$$

Standardizing with $\sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2}$,
$$\beta = P\left\{\frac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}} \le \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}} \le \frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}} \,\Big|\, H_1\right\}$$
$$= \Phi\left(\frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right) - \Phi\left(\frac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right)$$

Hypotheses, upper-tailed alternative
$H_0: p_1 = p_2$, $H_1: p_1 > p_2$
$$\beta = \Phi\left(\frac{z_\alpha\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right)$$

Hypotheses, lower-tailed alternative
$H_0: p_1 = p_2$, $H_1: p_1 < p_2$
$$\beta = 1 - \Phi\left(\frac{-z_\alpha\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right)$$

Type II error and choice of sample size
Sample size, two-sided alternative
$H_0: p_1 = p_2$, $H_1: p_1 \neq p_2$
$$-z_\beta \approx \frac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}$$
With $n_1 = n_2 = n$ and $q_i = 1 - p_i$, this becomes
$$z_\beta\sqrt{(p_1 q_1 + p_2 q_2)/n} + z_{\alpha/2}\sqrt{(p_1 + p_2)(q_1 + q_2)/(2n)} = p_1 - p_2$$
$$n = \frac{\left[z_{\alpha/2}\sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_\beta\sqrt{p_1 q_1 + p_2 q_2}\right]^2}{(p_1 - p_2)^2}$$

Confidence interval on the difference in population proportions
$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) \approx 1 - \alpha$$
$$\hat{p}_1 - \hat{p}_2 - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \le p_1 - p_2 \le \hat{p}_1 - \hat{p}_2 + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

Example 10-15 St. John’s Wort
◦ $n_1 = n_2 = 100$, $\hat{p}_1 = 27/100$, $\hat{p}_2 = 19/100$
◦ Is there any reason to believe that St. John’s Wort is effective in treating major depression? Use $\alpha = 0.05$.

Example 10-16 Defective Bearings
◦ $n_1 = n_2 = 85$, $\hat{p}_1 = 0.12$, $\hat{p}_2 = 8/85 = 0.09$
◦ Find the approximate 95% confidence interval on the difference in the proportion of defective bearings produced under the two processes.

Exercise 10-71
◦ Two different types of polishing solutions are being evaluated for possible use in a tumble-polish operation for manufacturing intraocular lenses used in the human eye following cataract surgery. Three hundred lenses were tumble polished using the first polishing solution, and of this number 253 had no polishing-induced defects. Another 300 lenses were tumble polished using the second polishing solution, and 196 lenses were satisfactory upon completion.
◦ (a) Is there any reason to believe that the two polishing solutions differ? Use $\alpha = 0.01$. What is the P-value for this test?
◦ (b) Discuss how this question could be answered with a confidence interval on $p_1 - p_2$.
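The pooled test statistic and the approximate confidence interval for $p_1 - p_2$ can be evaluated on the St. John's Wort and defective bearings data given above. The sketch below uses only the Python standard library; the function names are illustrative, the two-sided P-value is an assumption about which alternative is wanted, and the bearings interval takes $\hat{p}_1 = 0.12$ exactly as quoted on the slide.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, N(0, 1)


def two_proportion_z(x1, n1, x2, n2):
    """Return z0 for H0: p1 = p2 using the pooled estimate of p."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


def proportion_diff_ci(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Return the approximate (lower, upper) interval for p1 - p2."""
    z = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# St. John's Wort example: 27 of 100 versus 19 of 100 improved.
z0 = two_proportion_z(27, 100, 19, 100)
p_two_sided = 2 * (1 - STD_NORMAL.cdf(abs(z0)))  # assumes H1: p1 != p2

# Defective bearings example: p1_hat = 0.12 as quoted, p2_hat = 8/85.
lower, upper = proportion_diff_ci(0.12, 85, 8 / 85, 85)

print(f"z0 = {z0:.2f}, two-sided P = {p_two_sided:.3f}")
print(f"95% CI for p1 - p2 = ({lower:.3f}, {upper:.3f})")
```

The same two functions apply to Exercise 10-71 by substituting the counts 253 and 196 out of 300 lenses each.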