• Pick up your test and a Chapter 11 notes packet
• Today and tomorrow: 11.1
• Friday: Chapter 11 Quiz 1
• If you have a Quiz 1 grade on your test, you do not have to take the quiz on Friday!
• If you will not be here on Friday, you must take the quiz tomorrow!

Chapter 11
Comparing Two Populations or Treatments

Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed.

Suppose we take a random sample of 30 men and a random sample of 25 women from their respective populations and calculate the difference in the sample mean heights (men's mean height - women's mean height). If we did this many times, what would the distribution of differences be like?

Male Heights: $\mu_M = 71$, $\sigma_M = 2.5$
Suppose we took repeated samples of size n = 30 from the population of male heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_M$, with

$$\mu_{\bar{x}_M} = 71 \qquad \sigma_{\bar{x}_M} = \frac{2.5}{\sqrt{30}}$$

Female Heights: $\mu_F = 65$, $\sigma_F = 2.3$
Suppose we took repeated samples of size n = 25 from the population of female heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_F$, with

$$\mu_{\bar{x}_F} = 65 \qquad \sigma_{\bar{x}_F} = \frac{2.3}{\sqrt{25}}$$

Randomly take one of the sample means for the males and one of the sample means for the females and find the difference in mean heights. Doing this repeatedly, we will create the sampling distribution of $(\bar{x}_M - \bar{x}_F)$, with

$$\mu_{\bar{x}_M - \bar{x}_F} = 71 - 65 = 6 \qquad \sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}}$$

Heights Continued . . .

• Describe the sampling distribution of the difference in mean heights between men and women.

The sampling distribution is normally distributed with

$$\mu_{\bar{x}_M - \bar{x}_F} = 71 - 65 = 6 \qquad \sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}}$$

• What is the probability that the difference in mean heights of a random sample of 30 men and a random sample of 25 women is less than 5 inches?
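A minimal Python sketch of this probability calculation, assuming the normal model described above. The helper name `diff_of_means_distribution` is illustrative and not part of the notes. It builds the sampling distribution of $\bar{x}_M - \bar{x}_F$ from the two population means, standard deviations, and sample sizes, then finds the area below 5 inches.

```python
from math import sqrt
from statistics import NormalDist


def diff_of_means_distribution(mu1, sigma1, n1, mu2, sigma2, n2):
    """Sampling distribution of (sample mean 1 - sample mean 2) for independent samples."""
    mean = mu1 - mu2                                 # difference of the population means
    sd = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)   # the variances of the two means add
    return NormalDist(mean, sd)


# Heights example: men have mean 71, sd 2.5, n = 30; women have mean 65, sd 2.3, n = 25
heights_diff = diff_of_means_distribution(71, 2.5, 30, 65, 2.3, 25)
prob = heights_diff.cdf(5)   # P(difference in sample means < 5 inches)
print(f"mean = {heights_diff.mean}, sd = {heights_diff.stdev:.3f}, P(diff < 5) = {prob:.4f}")
```

The printed probability should agree closely with the value worked out by hand in the notes.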
$$P\big((\bar{x}_M - \bar{x}_F) < 5\big) = .0614$$

Properties of the Sampling Distribution of $\bar{x}_1 - \bar{x}_2$

If the random samples on which $\bar{x}_1$ and $\bar{x}_2$ are based are selected independently of one another, then

1. Mean value of $\bar{x}_1 - \bar{x}_2$:

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$$

The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is always centered at the value of $\mu_1 - \mu_2$, so $\bar{x}_1 - \bar{x}_2$ is an unbiased statistic for estimating $\mu_1 - \mu_2$.

2. The variance of the differences is the sum of the variances:

$$\sigma^2_{\bar{x}_1 - \bar{x}_2} = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \qquad \text{and} \qquad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

3. If $n_1$ and $n_2$ are both large or the population distributions are (at least approximately) normal, $\bar{x}_1$ and $\bar{x}_2$ each have (at least approximately) normal distributions. This implies that the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is also (approximately) normal.

The properties of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ imply that $\bar{x}_1 - \bar{x}_2$ can be standardized to obtain a variable with a sampling distribution that is approximately the standard normal (z) distribution.

When two random samples are independently selected and $n_1$ and $n_2$ are both large or the population distributions are (at least approximately) normal, the distribution of

$$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

is described (at least approximately) by the standard normal (z) distribution.

We must know $\sigma_1$ and $\sigma_2$ in order to use this procedure. If $\sigma_1$ and $\sigma_2$ are unknown, we must use t distributions.

Two-Sample t Test for Comparing Two Populations

Null Hypothesis: $H_0$: $\mu_1 - \mu_2$ = hypothesized value

The hypothesized value is often 0, but there are times when we are interested in testing for a difference that is not 0.

Test Statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The appropriate df for the two-sample t test is

$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}} \qquad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$

The computed number of df should be truncated to an integer.

A conservative estimate of the P-value can be found by using the t curve with the number of degrees of freedom equal to the smaller of $(n_1 - 1)$ or $(n_2 - 1)$.

Two-Sample t Test for Comparing Two Populations Continued . . .
Null Hypothesis: $H_0$: $\mu_1 - \mu_2$ = hypothesized value

Alternative Hypothesis: $H_a$: $\mu_1 - \mu_2$ > hypothesized value
P-value: Area under the appropriate t curve to the right of the computed t

Alternative Hypothesis: $H_a$: $\mu_1 - \mu_2$ < hypothesized value
P-value: Area under the appropriate t curve to the left of the computed t

Alternative Hypothesis: $H_a$: $\mu_1 - \mu_2 \neq$ hypothesized value
P-value: 2(area to right of computed t) if +t or 2(area to left of computed t) if -t

Another Way to Write Hypothesis Statements:

When the hypothesized value is 0, we can rewrite these hypothesis statements:

$H_0$: $\mu_1 = \mu_2$
$H_a$: $\mu_1 < \mu_2$
$H_a$: $\mu_1 > \mu_2$
$H_a$: $\mu_1 \neq \mu_2$

Be sure to define BOTH $\mu_1$ and $\mu_2$!

Two-Sample t Test for Comparing Two Populations Continued . . .

Assumptions:
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least approximately) normal.

When comparing two treatment groups, use the following assumptions:
1) Individuals or objects are randomly assigned to treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or the treatment response distributions are approximately normal.

Are women still paid less than men for comparable work? A study was carried out in which salary data were collected from a random sample of men and from a random sample of women who worked as purchasing managers and who were subscribers to Purchasing magazine. Annual salaries (in thousands of dollars) appear below (the actual sample sizes were much larger). Use $\alpha = .05$ to determine if there is convincing evidence that the mean annual salary for male purchasing managers is greater than the mean annual salary for female purchasing managers.

Men:    81  69  81  76  76  74  69  76  79  65
Women:  78  60  67  61  62  73  71  58  68  48

State the hypotheses:

$H_0$: $\mu_1 - \mu_2 = 0$
$H_a$: $\mu_1 - \mu_2 > 0$

Where $\mu_1$ = mean annual salary for male purchasing managers and $\mu_2$ = mean annual salary for female purchasing managers

If we had defined $\mu_1$ as the mean salary for female purchasing managers and $\mu_2$ as the mean salary for male purchasing managers, then the correct alternative hypothesis would be that the difference in the means is less than 0.

Salary War Continued . . .

Verify the assumptions:
1) Given two independently selected random samples of male and female purchasing managers. Even though these are samples from subscribers of Purchasing magazine, the authors of the study believed it was reasonable to view the samples as representative of the populations of interest.
2) Since the sample sizes are small, we must determine if it is plausible that the salary distributions for each of the two populations are approximately normal. Since the boxplots are reasonably symmetrical with no outliers, it is plausible that the population distributions are approximately normal.

[Figure: side-by-side boxplots of the men's and women's salaries]

Salary War Continued . . .

Compute the test statistic and P-value.

Test Statistic:

$$t = \frac{74.6 - 64.6 - 0}{\sqrt{\dfrac{5.4^2}{10} + \dfrac{8.62^2}{10}}} = 3.11$$

To find the P-value, first find the appropriate df:

$$df = \frac{(2.916 + 7.396)^2}{\dfrac{2.916^2}{9} + \dfrac{7.396^2}{9}} = 15.14$$

Truncate this value (round down): df = 15.

Now find the area to the right of t = 3.11 in the t curve with df = 15.

P-value = .004    $\alpha = .05$

Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence that the mean salary for male purchasing managers is higher than the mean salary for female purchasing managers.

What potential type of error could we have made with this conclusion? Type I
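The salary calculation can be checked with a short Python sketch. This is an illustration under stated assumptions, not the method used to prepare the notes: the function names `welch_t` and `t_upper_tail` are made up for this example, and because the Python standard library has no t distribution, the tail area is approximated by Simpson's rule on the t density instead of being read from a table or calculator.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def welch_t(sample1, sample2, hypothesized=0.0):
    """Two-sample t statistic and truncated df (variances not assumed equal)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1          # V1 = s1^2 / n1
    v2 = variance(sample2) / n2          # V2 = s2^2 / n2
    t = (mean(sample1) - mean(sample2) - hypothesized) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, int(df)                    # int() truncates df to an integer


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t under the t curve, by Simpson's rule (steps must be even)."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3           # 0.5 minus the area between 0 and t


men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]
t_stat, df = welch_t(men, women)
p_value = t_upper_tail(t_stat, df)       # Ha: mu1 - mu2 > 0, so use the right tail
print(f"t = {t_stat:.2f}, df = {df}, P-value = {p_value:.3f}")
```

Because the sketch works from the unrounded sample variances, intermediate values such as $V_2$ can differ slightly from the rounded figures in the hand calculation, but the truncated df is the same.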
The Two-Sample t Confidence Interval for the Difference Between Two Population or Treatment Means

The general formula for a confidence interval for $\mu_1 - \mu_2$ when
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least approximately) normal.
is

$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

The t critical value is based on

$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}} \qquad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$

df should be truncated to an integer.

For a comparison of two treatments, use the following assumptions:
1) Individuals or objects are randomly assigned to treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or the treatment response distributions are approximately normal.

In a study on food intake after sleep deprivation, men were randomly assigned to one of two treatment groups. The experimental group was required to sleep only 4 hours on each of two nights, while the control group was required to sleep 8 hours on each of two nights. The amount of food intake (Kcal) on the day following the two nights of sleep was measured. Compute a 95% confidence interval for the true difference in the mean food intake for the two sleeping conditions.

4-hour sleep:  3585  4470  3068  5338  2221  4791  4435  3187  3901  3868  3869  4878  3632  4518  3099
8-hour sleep:  4965  3918  1987  4993  5220  3653  3510  4100  5792  4547  3319  3336  4304  4057  3338

Find the mean and standard deviation for each treatment.

$\bar{x}_4 = 3924$    $s_4 = 829.67$    $\bar{x}_8 = 4069.27$    $s_8 = 952.90$

Food Intake Study Continued . . .

Verify the assumptions.

Assumptions:
1) Men were randomly assigned to two treatment groups
2) The assumption of normal response distributions is plausible because both boxplots are approximately symmetrical with no outliers.

[Figure: boxplots of food intake for the 4-hour and 8-hour groups]

Food Intake Study Continued . . .

Calculate the interval.

$$(3924 - 4069.27) \pm 2.052\sqrt{\frac{829.67^2}{15} + \frac{952.90^2}{15}} \qquad (-814.1,\ 523.6)$$

Interpret the interval in context.

We are 95% confident that the true difference in the mean food intake for the two sleeping conditions is between -814.1 Kcal and 523.6 Kcal.

Based upon this interval, is there a significant difference in the mean food intake for the two sleeping conditions?

No, since 0 is in the confidence interval, there is not convincing evidence that the mean food intake for the two sleep conditions is different.

Pooled t Test

• Used when the variances of the two populations are equal ($\sigma_1 = \sigma_2$)
• Combines information from both samples to create a "pooled" estimate of the common variance which is used in place of the two sample standard deviations
• Is not widely used due to its sensitivity to any departure from the equal variance assumption

When the population variances are equal, the pooled t procedure is better at detecting deviations from $H_0$ than the two-sample t test.

P-values computed using the pooled t procedure can be far from the actual P-value if the population variances are not equal.

Suppose that an investigator wants to determine if regular aerobic exercise improves blood pressure. A random sample of people who jog regularly and a second random sample of people who do not exercise regularly are selected independently of one another.

Can we conclude that the difference in mean blood pressure is attributed to jogging? What about other factors like weight?
One way to avoid these difficulties would be to pair subjects by weight then assign one of the pair to jogging and the other to no exercise.

Summary of the Paired t Test for Comparing Two Population or Treatment Means

Null Hypothesis: $H_0$: $\mu_d$ = hypothesized value

Where $\mu_d$ is the mean of the differences in the paired observations. The hypothesized value is usually 0, meaning that there is no difference.

Test Statistic:

$$t = \frac{\bar{x}_d - \text{hypothesized value}}{\dfrac{s_d}{\sqrt{n}}}$$

Where n is the number of sample differences and $\bar{x}_d$ and $s_d$ are the mean and standard deviation of the sample differences. This test is based on df = n - 1.

Alternative Hypothesis: $H_a$: $\mu_d$ > hypothesized value
P-value: Area to the right of calculated t

Alternative Hypothesis: $H_a$: $\mu_d$ < hypothesized value
P-value: Area to the left of calculated t

Alternative Hypothesis: $H_a$: $\mu_d \neq$ hypothesized value
P-value: 2(area to the right of t) if +t or 2(area to the left of t) if -t

Summary of the Paired t Test for Comparing Two Population or Treatment Means Continued . . .

Assumptions:
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally at least 30) or the population distribution of differences is (at least approximately) normal.

Is this an example of paired samples?

An engineering association wants to see if there is a difference in the mean annual salary for electrical engineers and chemical engineers. A random sample of electrical engineers is surveyed about their annual income. Another random sample of chemical engineers is surveyed about their annual income.

No, there is no pairing of individuals; you have two independent samples.

Is this an example of paired samples?

A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to volunteers, company researchers weigh each person. After a month of using the drug, each person's weight is measured again.

Yes, you have two observations on each individual, resulting in paired data.
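The paired t statistic summarized above can be sketched in a few lines of Python. The function name `paired_t` and its parameter names are illustrative choices for this sketch, not part of the notes. It reduces the two paired lists to a single list of differences and then applies the one-sample formula with df = n - 1.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second, hypothesized=0.0):
    """Paired t statistic and df, computed from the differences (first - second)."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations in each list")
    diffs = [a - b for a, b in zip(first, second)]   # one difference per pair
    n = len(diffs)
    d_bar = mean(diffs)                              # mean of the sample differences
    s_d = stdev(diffs)                               # sample sd of the differences
    t = (d_bar - hypothesized) / (s_d / sqrt(n))
    return t, n - 1
```

The two lists must hold each subject's first and second measurement in the same order. The order of subtraction (first minus second) determines the sign of t, so it has to match the direction used in the alternative hypothesis.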
Can playing chess improve your memory? In a study, students who had not previously played chess participated in a program in which they took chess lessons and played chess daily for 9 months. Each student took a memory test before starting the chess program and again at the end of the 9-month period.

First, find the differences pre-test minus post-test.

Student        1     2     3     4     5     6     7     8     9    10    11    12
Pre-test     510   610   640   675   600   550   610   625   450   720   575   675
Post-test    850   790   850   775   700   775   700   850   690   775   540   680
Difference  -340  -180  -210  -100  -100  -225   -90  -225  -240   -55    35    -5

State the hypotheses.

$H_0$: $\mu_d = 0$
$H_a$: $\mu_d < 0$

Where $\mu_d$ is the mean memory score difference between students with no chess training and students who have completed chess training

If we had subtracted post-test minus pre-test, then the alternative hypothesis would be that the mean difference is greater than 0.

Playing Chess Continued . . .

Verify assumptions.

Assumptions:
1) Although the sample of students is not a random sample, the investigator believed that it was reasonable to view the 12 sample differences as representative of all such differences.
2) A boxplot of the differences is approximately symmetrical with no outliers so the assumption of normality is plausible.

Playing Chess Continued . . .

Compute the test statistic and P-value.

Test Statistic:

$$t = \frac{-144.6 - 0}{\dfrac{109.74}{\sqrt{12}}} = -4.56$$

P-value ≈ 0    df = 11    $\alpha = .05$

State the conclusion in context.

Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence to suggest that the mean memory score after chess training is higher than the mean memory score before training.

Paired t Confidence Interval for $\mu_d$

When
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally at least 30) or the population distribution of differences is (at least approximately) normal.

the paired t interval for $\mu_d$ is

$$\bar{x}_d \pm (t \text{ critical value})\frac{s_d}{\sqrt{n}}$$

Where df = n - 1

Playing Chess Revisited . . .

Compute a 90% confidence interval for the mean difference in memory scores before chess training and the memory scores after chess training.

$$-144.6 \pm 1.796\left(\frac{109.74}{\sqrt{12}}\right) \qquad (-201.5,\ -87.69)$$

We are 90% confident that the true mean difference in memory scores before chess training and the memory scores after chess training is between -201.5 and -87.69.

Large-Sample Inferences Concerning the Difference Between Two Population or Treatment Proportions

Some people seem to think that duct tape can fix anything . . . even remove warts!

Investigators at Madigan Army Medical Center tested using duct tape to remove warts versus the more traditional freezing treatment.
Suppose that the duct tape treatment will successfully remove 50% of warts and that the traditional freezing treatment will successfully remove 60% of warts.

Let's investigate the sampling distribution of $\hat{p}_{freeze} - \hat{p}_{tape}$

$p_{freeze}$ = the true proportion of warts that are successfully removed by freezing; $p_{freeze} = .6$
$p_{tape}$ = the true proportion of warts that are successfully removed by using duct tape; $p_{tape} = .5$

Suppose we repeatedly treated 100 warts using the traditional freezing treatment and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{freeze}$, with

$$\mu_{\hat{p}_{freeze}} = .6 \qquad \sigma_{\hat{p}_{freeze}} = \sqrt{\frac{.6(.4)}{100}}$$

Suppose we repeatedly treated 100 warts using the duct tape method and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{tape}$, with

$$\mu_{\hat{p}_{tape}} = .5 \qquad \sigma_{\hat{p}_{tape}} = \sqrt{\frac{.5(.5)}{100}}$$

Randomly take one of the sample proportions for the freezing treatment and one of the sample proportions for the duct tape treatment and find the difference. Doing this repeatedly, we will create the sampling distribution of $(\hat{p}_{freeze} - \hat{p}_{tape})$, with

$$\mu_{\hat{p}_{freeze} - \hat{p}_{tape}} = .1 \qquad \sigma_{\hat{p}_{freeze} - \hat{p}_{tape}} = \sqrt{\frac{.6(.4)}{100} + \frac{.5(.5)}{100}}$$

Properties of the Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

If two random samples are selected independently of one another, the following properties hold:

1. $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$

This says that the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is centered at $p_1 - p_2$, so $\hat{p}_1 - \hat{p}_2$ is an unbiased statistic for estimating $p_1 - p_2$.

2. $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

3. If both $n_1$ and $n_2$ are large (that is, if $n_1 p_1 > 10$, $n_1(1 - p_1) > 10$, $n_2 p_2 > 10$, and $n_2(1 - p_2) > 10$), then $\hat{p}_1$ and $\hat{p}_2$ each have a sampling distribution that is approximately normal, and their difference $\hat{p}_1 - \hat{p}_2$ also has a sampling distribution that is approximately normal.

When performing a hypothesis test, we will use the null hypothesis that $p_1$ and $p_2$ are equal. We will not know the common value for $p_1$ and $p_2$. Since the value for $p_1$ and $p_2$ is unknown, we will combine $\hat{p}_1$ and $\hat{p}_2$ to estimate the common value of $p_1$ and $p_2$. Use:

$$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

Summary of Large-Sample z Test for $p_1 - p_2 = 0$

Null Hypothesis: $H_0$: $p_1 - p_2 = 0$

Test Statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\dfrac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}} \qquad \text{Use: } \hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

Alternative Hypothesis: $H_a$: $p_1 - p_2 > 0$
P-value: area to the right of calculated z

Alternative Hypothesis: $H_a$: $p_1 - p_2 < 0$
P-value: area to the left of calculated z

Alternative Hypothesis: $H_a$: $p_1 - p_2 \neq 0$
P-value: 2(area to the right of z) if +z or 2(area to the left of z) if -z

Another Way to Write Hypothesis Statements:

$H_0$: $p_1 = p_2$
$H_a$: $p_1 > p_2$
$H_a$: $p_1 < p_2$
$H_a$: $p_1 \neq p_2$

Be sure to define both $p_1$ and $p_2$!

Summary of Large-Sample z Test for $p_1 - p_2 = 0$ Continued . . .

Assumptions:
1) The samples are independently chosen random samples or treatments were assigned at random to individuals or objects
2) Both sample sizes are large: $n_1\hat{p}_1 > 10$, $n_1(1 - \hat{p}_1) > 10$, $n_2\hat{p}_2 > 10$, $n_2(1 - \hat{p}_2) > 10$

Since $p_1$ and $p_2$ are unknown we must use $\hat{p}_1$ and $\hat{p}_2$ to verify that the samples are large enough.

Investigators at Madigan Army Medical Center tested using duct tape to remove warts. Patients with warts were randomly assigned to either the duct tape treatment or to the more traditional freezing treatment. Those in the duct tape group wore duct tape over the wart for 6 days, then removed the tape, soaked the area in water, and used an emery board to scrape the area. This process was repeated for a maximum of 2 months or until the wart was gone. The data follow:

Treatment                   n     Number with wart successfully removed
Liquid nitrogen freezing    100   60
Duct tape                   104   88

Do these data suggest that freezing is less successful than duct tape in removing warts?

Duct Tape Continued . . .
$H_0$: $p_1 - p_2 = 0$
$H_a$: $p_1 - p_2 < 0$

Where $p_1$ is the true proportion of warts that would be successfully removed by freezing and $p_2$ is the true proportion of warts that would be successfully removed by duct tape

Assumptions:
1) Subjects were randomly assigned to the two treatments.
2) The sample sizes are large enough because:
$n_1\hat{p}_1 = 100(.6) = 60 > 10$
$n_1(1 - \hat{p}_1) = 100(.4) = 40 > 10$
$n_2\hat{p}_2 = 104(.85) \approx 88 > 10$
$n_2(1 - \hat{p}_2) = 104(.15) \approx 16 > 10$

Duct Tape Continued . . .

$$\hat{p}_c = \frac{60 + 88}{100 + 104} = .73$$

$$z = \frac{.6 - .85 - 0}{\sqrt{\dfrac{.73(.27)}{100} + \dfrac{.73(.27)}{104}}} = -4.03$$

P-value ≈ 0    $\alpha = .01$

Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence to suggest the proportion of warts successfully removed is lower for freezing than for the duct tape treatment.

A Large-Sample Confidence Interval for $p_1 - p_2$

When
1) The samples are independently chosen random samples or treatments were assigned at random to individuals or objects
2) Both sample sizes are large: $n_1\hat{p}_1 > 10$, $n_1(1 - \hat{p}_1) > 10$, $n_2\hat{p}_2 > 10$, $n_2(1 - \hat{p}_2) > 10$

a large-sample confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value})\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

The article "Freedom of What?" (Associated Press, February 1, 2005) described a study in which high school students and high school teachers were asked whether they agreed with the following statement: "Students should be allowed to report controversial issues in their student newspapers without the approval of school authorities." It was reported that 58% of students surveyed and 39% of teachers surveyed agreed with the statement. The two samples (10,000 high school students and 8000 high school teachers) were selected from schools across the country.

Compute a 90% confidence interval for the difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement.

Newspaper Problem Continued . . .

$\hat{p}_1 = .58$    $\hat{p}_2 = .39$

1) Assume that it is reasonable to regard these two samples as being independently selected and representative of the populations of interest.
2) Both sample sizes are large enough:
$n_1\hat{p}_1 = 10000(.58) > 10$, $n_1(1 - \hat{p}_1) = 10000(.42) > 10$,
$n_2\hat{p}_2 = 8000(.39) > 10$, $n_2(1 - \hat{p}_2) = 8000(.61) > 10$

$$(.58 - .39) \pm 1.645\sqrt{\frac{.58(.42)}{10000} + \frac{.39(.61)}{8000}} \qquad (.178,\ .202)$$

We are 90% confident that the difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement is between .178 and .202.

Based on this confidence interval, does there appear to be a significant difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement? Explain.
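The two large-sample procedures for proportions can be sketched in Python as follows. The function names `two_proportion_z` and `two_proportion_ci` are illustrative and not from the notes. The test uses the combined estimate $\hat{p}_c$ in the standard error, while the interval uses the two separate sample proportions, matching the formulas above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Large-sample z statistic for H0: p1 - p2 = 0, using the combined estimate."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)            # same as (n1*p1_hat + n2*p2_hat) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    return (p1_hat - p2_hat) / se


def two_proportion_ci(p1_hat, n1, p2_hat, n2, confidence):
    """Large-sample confidence interval for p1 - p2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z critical value
    margin = z_star * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Duct tape study: 60 of 100 warts removed by freezing, 88 of 104 by duct tape
z = two_proportion_z(60, 100, 88, 104)
p_value = NormalDist().cdf(z)              # Ha: p1 - p2 < 0, so use the left tail

# Newspaper study: 58% of 10,000 students and 39% of 8000 teachers agreed
low, high = two_proportion_ci(0.58, 10000, 0.39, 8000, 0.90)
print(f"z = {z:.2f}, P-value = {p_value:.5f}, 90% CI = ({low:.3f}, {high:.3f})")
```

The z computed here uses unrounded proportions, so it comes out somewhat smaller in magnitude than the hand calculation, which rounds $\hat{p}_2$ and $\hat{p}_c$ to two decimals. The conclusion is the same either way.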