Chapter 9: Inferences Involving Two Populations

Independent and Dependent Samples
• The object is to compare the means of two samples and draw conclusions about the difference in population means.
• There are two basic kinds of samples: independent and dependent (paired).
• Which kind you have depends on the sources of the two samples and how the data were collected.

Dependent Samples
• If one observation is collected for each sample from the same source, the samples are dependent.
• This is often called “paired data” because you get a pair of observations from one individual or experimental unit.
• Examples include pretest and posttest scores, weight before and after a diet, left-eye and right-eye acuity, etc.
• There is a one-to-one correspondence between an observation in one sample and an observation in the other sample.

Independent Samples
• Two samples are independent if there is no connection between an observation in one sample and any particular observation in the other.
• Also, there can be no connection in the sampling procedure: selecting an individual for one sample in no way affects the selection of any individual for the other, including by exclusion.

Dependent Sample Examples
• The same test is given to all students at the beginning and end of a course to measure learning (one pair of scores per person).
• IQ tests are given to husband and wife pairs.
• A medical treatment is given to patients matched for condition, age, sex, race, weight, and other characteristics with patients in a control group.

Independent Sample Examples
• The same test is given to all students in two classes (no pairing of scores occurs).
• IQ tests are given to men and women without consideration of any relationship between them.
• Subjects are randomly assigned to a treatment group and a control group to test a new drug. No attempt is made to “match” them.

Difference of means for Paired Data
• When dependent samples are involved, the data is paired data.
• Paired data results from:
– before and after studies,
– a common source, or
– matched pairs.
• We will denote the random variables from the two samples by $X_1$ and $X_2$.

The meaning of $X_1$ and $X_2$
• Two samples are being taken. For example, the pretest is one sample, and the posttest is another sample.
• Then $X_1$ represents a pretest score and $X_2$ represents a posttest score.
• However, there is an $X_1$ and an $X_2$ for each person taking the test.

Paired Data
$X_1$:  $x_{11}$  $x_{12}$  $x_{13}$  $x_{14}$  $x_{15}$  $x_{16}$  $x_{17}$  …
$X_2$:  $x_{21}$  $x_{22}$  $x_{23}$  $x_{24}$  $x_{25}$  $x_{26}$  $x_{27}$  …
Here the random variable names are given by capital letters, and individual observations by small letters. The data appear side-by-side, each pair having the same second subscript. You cannot change the order of one column without destroying the relationship between the columns. That is what makes it “paired data.” There are the same number of observations, $n$, in each sample.

What do we want to know?
• In paired data studies, the parameter of interest is the mean difference between the groups.
• This is conceptually different from the difference between the means of the groups.
• In other words, the population of interest is actually the differences between $X_1$ and $X_2$.
• We define a new value, $d = x_1 - x_2$, as one observation taken from this population.

Why is this important?
• The mean difference between the groups and the difference between the means of the groups are the same number.
• But their sampling distributions are different!
• From $d = x_1 - x_2$, we calculate $\bar{d}$, the mean difference.
• Now, $\bar{d}$ will have a normal distribution if $X_1$ and $X_2$ are normal or $n > 30$ (approximately).

Distribution of mean differences
• If we know $\bar{d}$ is normally distributed, then we can use the same tests and confidence intervals that we learned for $\bar{x}$.
• We won’t bother with the “variance known” situation this time. We will calculate the variance from the sample and use the t distribution.
• In other words, treat the d’s as the sample. Find their mean and standard deviation.
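
The idea of treating the d’s as the sample can be sketched in a few lines of Python. This is a minimal illustration and not part of the original slides: it borrows the before/after blood pressure readings from the salt-free diet example later in this chapter, and the variable names are chosen only for the sketch.

```python
from statistics import mean, stdev

# Paired readings from the salt-free diet example in this chapter
before = [93, 106, 87, 92, 102, 95, 88, 110]
after = [92, 102, 89, 92, 101, 96, 88, 105]

# One difference per pair; zip keeps each "before" with its own "after"
d = [x1 - x2 for x1, x2 in zip(before, after)]

d_bar = mean(d)   # mean difference
s_d = stdev(d)    # sample standard deviation of the d's (n - 1 divisor)

# The mean difference is the same number as the difference between the means
diff_of_means = mean(before) - mean(after)

print(d)
print(d_bar, round(s_d, 2), diff_of_means)
```

Shuffling one of the two lists before calling `zip` would leave `diff_of_means` unchanged but would change `s_d`, which is why the pairing has to be preserved.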
Distribution of mean differences
• There is a population parameter, $\mu_d$, that we are trying to estimate.
• The point estimate is $\bar{d}$, taken from a sample of $n$ differences (d’s).
• The d’s have a standard deviation, $s_d$, which is calculated in the same way as $s$.
• The standard deviation of $\bar{d}$ is $s_{\bar{d}} = s_d / \sqrt{n}$.
• This is no different from what we had before, except for symbols!

Confidence Interval for Paired Differences
• A $(1-\alpha)100\%$ CI for $\mu_d$ is given by
$$\bar{d} \pm t(n-1, \alpha/2)\, s_{\bar{d}}, \quad \text{where } s_{\bar{d}} = \frac{s_d}{\sqrt{n}}.$$

Hypothesis Testing: When testing a null hypothesis about the mean difference, the test statistic is
$$t^* = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$
where $\mu_d$ is the value stated in the null hypothesis and $t^*$ has a t distribution with $df = n - 1$.

Example: The corrosive effects of various chemicals on normal and specially treated pipes were tested by using a dependent sampling plan. The data collected is summarized by $n = 17$, $\bar{d} = 5.7$, $s_d = 4.8$, where $d$ is the amount of corrosion on the treated pipe subtracted from the amount of corrosion on the normal pipe.

Example (continued): Does this sample provide sufficient evidence to conclude the specially treated pipes are more resistant to corrosion? Use $\alpha = 0.05$.
a. Solve using the classical approach.
b. Solve using the p-value approach.

Solution:
1. State the hypotheses (you must say something about the direction of the difference): Test for the mean difference in corrosion, normal pipe minus treated pipe. The null and alternative hypotheses:
$H_0$: $\mu_d = 0$ (did not lower corrosion)
$H_a$: $\mu_d > 0$ (did lower corrosion)
2. Determine the appropriate type of test: Assumptions: Assume corrosion measures are approximately normal, σ unknown. Use the t-test for paired differences.
3. Define the rejection region:
a. Right-tailed test: reject $H_0$ if $t^* > t(16, 0.05) = 1.75$.
b. Reject $H_0$ if $p < 0.05$.
4. Calculate the value of the test statistic: $n = 17$, $\bar{d} = 5.7$, $s_d = 4.8$
$$t^* = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{5.7 - 0.0}{4.8 / \sqrt{17}} = \frac{5.7}{1.164} = 4.896$$
$p = P(t > 4.896) < 0.0001$ by the table
$p = P(t > 4.896) = 0.00006731$ using Excel
5. State the conclusion:
a. Decision: Reject $H_0$ because $t^* = 4.896 > 1.75$.
b. Decision: Reject $H_0$ because $p < 0.0001 < \alpha = 0.05$.
Conclusion: The treated pipes do not corrode as much as the normal pipes when subjected to chemicals.

Example: Salt-free diets are often prescribed for people with high blood pressure. The following data was obtained from an experiment designed to estimate the reduction in diastolic blood pressure as a result of following a salt-free diet for two weeks. Assume diastolic readings to be normally distributed.

Before       93  106   87   92  102   95   88  110
After        92  102   89   92  101   96   88  105
Difference    1    4   -2    0    1   -1    0    5

Test the hypothesis that the salt-free diet resulted in a reduction in blood pressure, using $\alpha = 0.02$.
Question: How do you decide which way to subtract?

Solution:
1. A reduction means “before” > “after”, so we will use “before − after” as our difference, which is expected to be positive.
$H_0$: $\mu_d = 0$ (no reduction)
$H_a$: $\mu_d > 0$ (reduction occurred)
2. Determine the type of test: This is a t-test of paired differences. The mean reduction (difference) in diastolic blood pressure is being tested. Both before and after scores are assumed normally distributed, so $d$ is normal, and σ is unknown, so we use the t-test with 7 df.
3. Rejection Region: Reject $H_0$ if $t^* > t(7, 0.02) = 2.52$ (from Excel) or if $p < 0.02$.
4. Calculate the test statistic and/or p-value: $n = 8$, $\bar{d} = 1.0$, $s_d = 2.39$, $\alpha = 0.02$
$$t^* = \frac{\bar{d} - 0}{s_d / \sqrt{n}} = \frac{1.0}{2.39 / \sqrt{8}} = 1.18, \qquad p = 0.1383$$
5. Conclusion: Since $t^*$ is not in the rejection region, and $p$ is not less than $\alpha$, we do not reject $H_0$. We conclude there is not enough evidence to say that the diet reduces blood pressure.

Two Independent Samples
• Compare the means of two populations
• Parameter of interest: $(\mu_1 - \mu_2)$
• Base inferences on $(\bar{x}_1 - \bar{x}_2)$
• The parentheses indicate that we are thinking of the difference as one parameter
• Consider the general confidence interval formula, P ± TS. We know what P is now.
• We need to know the distribution of $(\bar{X}_1 - \bar{X}_2)$ to find T and S.

Distribution of $(\bar{X}_1 - \bar{X}_2)$
• The sampling distribution of $(\bar{X}_1 - \bar{X}_2)$ has a mean $\mu_{\bar{x}_1 - \bar{x}_2} = (\mu_1 - \mu_2)$.
• The point estimate of $\mu_{\bar{x}_1 - \bar{x}_2}$ is $(\bar{x}_1 - \bar{x}_2)$.
• The standard deviation of $(\bar{X}_1 - \bar{X}_2)$ is
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• Since the variances are hardly ever known, we will have to estimate them.

Sample Standard Deviation
• The sample standard deviation of $(\bar{X}_1 - \bar{X}_2)$ is
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• The following assumptions are needed to use the above formula:
– The samples are randomly selected from normally distributed populations
– The samples are independent
– There is no reason to believe $\sigma_1 = \sigma_2$
– The populations (not samples) are “large”

Distribution
• The t distribution will be used.
• Degrees of freedom:
– If $n_1 = n_2$, no problem, $df = n_1 - 1$.
– Otherwise, df may be calculated by a complicated formula. Statistical computer software will do this automatically.
– Alternatively, the smaller of $n_1 - 1$ and $n_2 - 1$ can be used as an approximation. (This is conservative: the actual confidence level will be higher, and the actual p-value will be lower.)

Confidence Interval
• Now we have all the information we need.
• $P = (\bar{x}_1 - \bar{x}_2)$, $T = t(df, \alpha/2)$, $S = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• A $(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$ is given by
$$(\bar{x}_1 - \bar{x}_2) \pm t(df, \alpha/2) \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Hypothesis Tests: To test a null hypothesis about the difference between two population means, use the test statistic
$$t^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_{01} - \mu_{02})}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
where df is the smaller of $df_1$ or $df_2$ when computing $t^*$ without the aid of a computer.
Note: The hypothesized difference between the two population means, $(\mu_{01} - \mu_{02})$, can be any specified value. The most common value is zero.

Example: A recent study compared a new drug to ease postoperative pain with the leading brand. Independent random samples were obtained and the number of hours of pain relief for each patient was recorded. The summary statistics are given in the table below.

Pain Reliever     n     Mean    St.Dev.
New Drug         10    4.350    0.542
Leading Brand    17    3.929    0.169

Is there any evidence to suggest the new drug provides longer relief from post-operative pain?
Use $\alpha = 0.05$.
a. Solve using the p-value approach.
b. Solve using the classical approach.

Solution:
1. The Hypotheses:
$H_0$: $\mu_1 - \mu_2 = 0$ (new drug relieves pain no longer)
$H_a$: $\mu_1 - \mu_2 > 0$ (new drug works longer to relieve pain)
2. The appropriate test: Assumptions: Both populations are assumed to be approximately normal. The samples were random and independently selected. Use the t-test for difference of means, independent samples, df = 9.
3. Rejection Region: Reject if $t^* > t(9, 0.05) = 1.83$ or $p < 0.05$.
4. Calculations:
$$t^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(4.350 - 3.929) - (0.00)}{\sqrt{\frac{0.542^2}{10} + \frac{0.169^2}{17}}} = \frac{0.421}{\sqrt{0.0294 + 0.0017}} = \frac{0.421}{0.1763} = 2.39$$
The p-value: $p = P(t > 2.39, \text{ with } df = 9) = 0.020$
5. The Conclusion: Reject $H_0$. There is evidence to suggest that the new drug provides longer relief from post-operative pain.

Sample Standard Deviation: Equal Variance
• If $\sigma_1 = \sigma_2$,
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}}$$
$$df = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$$
• The formula also assumes that the populations are normal and the samples are random and independent.

Example: We modify the previous drug study example. Suppose the statistics are as given below, and the variances of the two populations are equal.

Pain Reliever     n     Mean    St.Dev.
New Drug         10    4.350    0.542
Leading Brand    17    3.929    0.469

Is there any evidence to suggest the new drug provides longer relief from post-operative pain? Use $\alpha = 0.05$.
a. Solve using the p-value approach.
b. Solve using the classical approach.

Solution:
1. The Hypotheses:
$H_0$: $\mu_1 - \mu_2 = 0$ (new drug relieves pain no longer)
$H_a$: $\mu_1 - \mu_2 > 0$ (new drug works longer to relieve pain)
2. The appropriate test: Assumptions: Populations normal, samples random and independently selected, variances equal. Use the t-test for difference of means, independent samples, df = 25.
3. Rejection Region: Reject if $t^* > t(25, 0.05) = 1.71$ or $p < 0.05$.
4. Calculations:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{9(0.542^2) + 16(0.469^2)}{9 + 16} = 0.2465$$
$$t^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(4.350 - 3.929) - (0.00)}{\sqrt{0.2465\left(\frac{1}{10} + \frac{1}{17}\right)}} = \frac{0.421}{\sqrt{0.2465(0.1588)}} = \frac{0.421}{0.1979} = 2.13$$
The p-value: $p = P(t > 2.13, \text{ with } df = 25) \approx 0.022$
5. The Conclusion: Reject $H_0$. There is evidence to suggest that the new drug provides longer relief from post-operative pain.

Difference of Two Proportions
If independent samples of sizes $n_1$ and $n_2$ are drawn randomly from large populations with $p_1 = P_1(\text{success})$ and $p_2 = P_2(\text{success})$, respectively, then the sampling distribution of $p_1' - p_2'$ has these properties:
1. a mean $\mu_{p_1' - p_2'} = p_1 - p_2$
2. a standard error $\sigma_{p_1' - p_2'} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
3. an approximately normal distribution if $n_1$ and $n_2$ are sufficiently large.

Note: To ensure normality:
1. The sample sizes are both larger than 20.
2. The products $n_1 p_1$, $n_1 q_1$, $n_2 p_2$, $n_2 q_2$ are all larger than 5. Since $p_1$ and $p_2$ are unknown, these products are estimated by $n_1 p_1'$, $n_1 q_1'$, $n_2 p_2'$, $n_2 q_2'$.
3. The samples consist of less than 10% of their respective populations.

Confidence Intervals:
1. A confidence interval for $p_1 - p_2$ is based on the unbiased sample statistic $p_1' - p_2'$.
2. The confidence limits are found using the following formula:
$$(p_1' - p_2') \pm z(\alpha/2) \sqrt{\frac{p_1' q_1'}{n_1} + \frac{p_2' q_2'}{n_2}}$$

Hypothesis Tests: If the null hypothesis is that there is no difference between proportions, this can be written as $p_1 = p_2$, or $p_1 - p_2 = 0$. Let’s consider how we can construct a standard error term. The standard deviation of $p_1' - p_2'$ is actually
$$\sigma_{p_1' - p_2'} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
However, if the null hypothesis is true, $p_1 = p_2 = p$, so we can say
$$\sigma_{p_1' - p_2'} = \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}} = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
But we don’t know $p$ and $q$! How can we estimate these from the sample? Under the null hypothesis, the two population proportions are the same. So simply take all of the data and pool it together to estimate the common proportion.
$$p_p' = \frac{x_1 + x_2}{n_1 + n_2}, \quad \text{and} \quad q_p' = 1 - p_p'$$
The test statistic becomes
$$z^* = \frac{p_1' - p_2'}{\sqrt{(p_p')(q_p')\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Example: The proportions of defective parts from two different suppliers were compared. The following data were collected.

Supplier    Sample Size    Number Defective
1               300               15
2               275                9

Is there any evidence to suggest the proportion of defectives is different for the two suppliers? Use $\alpha = 0.01$.

1. The null and alternative hypotheses:
$H_0$: $p_1 - p_2 = 0$ (proportion of defectives the same)
$H_a$: $p_1 - p_2 \neq 0$ (proportion of defectives different)
2. The type of test: Difference of proportions. Both samples are larger than 20. The products $n_1 p_1'$, $n_1 q_1'$, $n_2 p_2'$, $n_2 q_2'$ are all larger than 5. The sampling distribution should be approximately normal. Use $z^*$ for difference of proportions.
3. Rejection region: Reject $H_0$ if $z^* > z(0.005) = 2.575$ or $z^* < -z(0.005) = -2.575$.
4. Calculations:
$$p_p' = \frac{x_1 + x_2}{n_1 + n_2} = \frac{15 + 9}{300 + 275} = \frac{24}{575} = 0.042, \qquad q_p' = 1 - p_p' = 1 - 0.042 = 0.958$$
$$z^* = \frac{p_1' - p_2'}{\sqrt{(p_p')(q_p')\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.05 - 0.0327}{\sqrt{(0.042)(0.958)\left(\frac{1}{300} + \frac{1}{275}\right)}} = \frac{0.0173}{\sqrt{0.0002804}} = \frac{0.0173}{0.0167} = 1.03$$
5. Conclusion: Do not reject $H_0$. There is no evidence to suggest the proportion of defectives is different for the two suppliers.
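
The pooled calculation in this example can be checked with a short Python sketch. It is an illustration under the same assumptions as the example (large independent samples, normal approximation) and is not code from the original slides. The function name `two_proportion_z` is hypothetical, and the two-sided p-value, which the slides do not compute, is obtained from the standard normal distribution through `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H0: p1 = p2.

    Hypothetical helper written for this sketch only.
    """
    p1 = x1 / n1                      # sample proportion, supplier 1
    p2 = x2 / n2                      # sample proportion, supplier 2
    pooled = (x1 + x2) / (n1 + n2)    # pooled estimate of the common p
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for a standard normal Z
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Defective counts and sample sizes from the supplier example
z, p_value = two_proportion_z(15, 300, 9, 275)
print(round(z, 2), round(p_value, 3))
```

Carrying full precision instead of the rounded 0.042 and 0.958 still gives a statistic that rounds to 1.03, well inside the region between −2.575 and 2.575, so the decision not to reject the null hypothesis is unchanged.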