Math 311, Spring 2008, Lab 6
Due May 20th at 3:00 p.m.

The Tools
In this section you’ll learn the mechanics of 1-sample and 2-sample proportion tests

Inferences With Minitab:
Have Minitab compute two columns of 10,000 rows of data. Store this data in columns C1 and C2. Choose the first (C1) from a B(1,.5) population and the second (C2) from a B(1,.2) population. You’ll get a couple of columns of 0’s and 1’s. Recall one does this by selecting Calc>Random Data> …. We will use these data in our tests below.

Because the data in C1 came from a B(1,.5) population, about 50% of the data will be 1’s. Similarly, because the data in C2 came from a B(1,.2) population, about 20% of the data will be 1’s.

1-sample proportion test:
View the data in column C1 as a sample of size 10,000. We may use Minitab to compute a confidence interval for the (true) population proportion and perform a hypothesis test for the population proportion at the same time!
a. Select Stat>Basic Statistics>1 Proportion…
b. Enter C1 in the Samples in columns box.
c. Click Perform hypothesis test.
d. Enter the value of the population proportion from the Null Hypothesis as the Hypothesized proportion: (this is for the hypothesis test; the Alternative Hypothesis is entered below). This time, use .45 (this is, in fact, pretty close to the true population proportion of .5; recall that the data came from a B(1,.5) distribution).
e. To set the confidence level, select Options (this is for the confidence interval). 95% is the default setting.
f. Notice that while you’re in the Options menu, you can also select not equal to, greater than, or less than for your Alternative Hypothesis. Since this is just practice, pick whichever floats your boat.
g. Finally, select OK (twice) and get something like this:

Test and CI for One Proportion: C1
Test of p = 0.45 vs p not = 0.45
Event = 1
                                                            Exact
Variable      X       N  Sample p                95% CI   P-Value
C1        49980  100000  0.499800  (0.496696, 0.502904)     0.000

h. Note: the hypotheses tested (p = 0.45 vs p not = 0.45) are listed at the top of the output. (I’ve highlighted these for emphasis.)
i. “X” gives the number of 1’s in the data. A 1 is considered a “success” and a 0 is a “failure.”
j. “Sample p” is our sample proportion (p̂). Notice that it’s close to, but not exactly equal to, .5. This is because of random error.
k. “Exact P-Value” is the p-value from the hypothesis test. The p-value is low because our null hypothesis was not correct: p is not .45. (I’ve highlighted these for emphasis too.)

2-sample proportion tests:
We can perform 2-sample inferences by viewing the data in columns C1 and C2 as samples from independent populations and by following the directions below:
a. Select Stat>Basic Statistics>2 Proportions…
b. Select Samples in Different Columns (in this example, we’re going to compare the samples in C1 and C2).
c. Enter C1 in First: and C2 in Second:
d. Select Options to set the confidence level, the type of hypothesis test (1-sided vs. 2-sided), and the value of the difference of the proportions in the null hypothesis.
e. Finally, select OK (twice) and get something like this:

Test and CI for Two Proportions: C1, C2
Event = 1
Variable      X       N  Sample p
C1        49980  100000  0.499800
C2        19977  100000  0.199770

Difference = p (C1) - p (C2)
Estimate for difference: 0.30003
95% CI for difference: (0.296062, 0.303998)
Test for difference = 0 (vs not = 0): Z = 148.20 P-Value = 0.000
Fisher's exact test: P-Value = 0.000

f. As above, “Sample p” gives the sample proportions for each sample.
g. Note: Estimate for difference = difference of sample proportions = p̂1 - p̂2.
h. Everything else should look familiar. Look at it. Do you know what each bit means? If not, ask. If so, good.
i. Important note: the samples from the two populations do not need to be in separate columns. For example, suppose we were comparing the proportions of dead fish in streams on the east coast vs. the west coast.
If the dead fish data were in C1 and the location (east or west coast) were in C2, then we could compare the dead fish proportions by selecting the option Samples in one column and then entering C1 in the Samples: box and C2 in the Subscripts: box. You’ll need to try this in one of the following questions.

The Questions
In this section you’ll apply the techniques learned above

!!!BE CERTAIN TO READ EACH SITUATION CAREFULLY – THEY CONTAIN USEFUL CLUES!!!

Nutrition: platewaste. Again.
Just in case you hadn’t heard: A few years ago colleagues from the Family and Consumer Sciences Department and I studied various factors that affect the amount of food elementary school children eat during lunch. Among other things, we were curious whether the children’s Recommended Daily Allowances (RDA’s) were being met at lunch. The results for each child at one of the schools are recorded in the file RDAMet.mtw. Get this file now.

The non-obvious variable codes:
Ethnic Code: H = Hispanic, NH = Non-Hispanic
Entrée Type: mex = Mexican influenced (e.g. tacos), nonmex = not Mexican influenced (e.g. cheese zombies)
T. Fat = total fat
S. Fat = saturated fat
Choles = cholesterol
Na+ = sodium
Fiber = fiber
Pro = protein
Fe+ = iron
Ca+ = calcium
Vit A = vitamin A
Vit C = vitamin C

A 1 was recorded if the student met or exceeded the RDA. A 0 was recorded otherwise.

1. Food service personnel would consider it “not horrible” if at least 50% of the kids consumed at least 667 calories at lunch. Do we have evidence that this is true? Run the appropriate test and then cut ‘n paste your results into a Word document. Clearly state your conclusion.

2. Is there a difference in the proportion of kids exceeding their daily allowance of saturated fat when served Mexican influenced entrees vs. non-Mexican influenced entrees? Cut ‘n paste your results into a Word document. Clearly state your conclusion.

3. Is there a difference in the proportion of those exceeding the RDA of vitamin C between the genders?
Cut ‘n paste your results into a Word document. Clearly state your conclusion.

4. Pick something interesting from the data and run either a 1-sample or a 2-sample proportion test on it. Yes, anything. You get to choose. I think this is an interesting data set and I hope you do too. Report what you chose to investigate and then cut ‘n paste your results into a Word document. Clearly state your conclusion.
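
Optional cross-check for The Tools: the numbers in the sample Minitab outputs can be approximated by hand with the usual large-sample (normal approximation) formulas. The Python sketch below is not part of Minitab and is only an illustration; the function names one_prop_z and two_prop_z are made up for this example. It handles the 2-sided alternative only, and it assumes the 2-sample test uses the separate (unpooled) sample proportions in its standard error. Because the 1-proportion output is labeled “Exact,” the normal-approximation interval may differ from it slightly in the last digits.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def one_prop_z(x, n, p0, level=0.95):
    """Hypothetical helper: large-sample test of H0: p = p0 (2-sided) and CI for p."""
    p_hat = x / n
    # Test statistic: standard error is computed under the null value p0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    # Confidence interval: standard error is computed from p_hat.
    z_star = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, z, p_value, (p_hat - margin, p_hat + margin)


def two_prop_z(x1, n1, x2, n2, level=0.95):
    """Hypothetical helper: large-sample test of H0: p1 - p2 = 0 (2-sided) and CI.

    Assumption: unpooled standard error (separate sample proportions).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    z_star = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    return diff, z, p_value, (diff - z_star * se, diff + z_star * se)


if __name__ == "__main__":
    # Counts copied from the sample outputs in The Tools.
    print(one_prop_z(49980, 100000, 0.45))
    print(two_prop_z(49980, 100000, 19977, 100000))
```

With the counts from the sample outputs, the 2-sample Z comes out near 148.2 and the interval for the difference lands on roughly (0.2961, 0.3040), in line with the Minitab output. Use this only to check your understanding of the formulas; the lab questions should still be answered with Minitab.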