Chapter 9 Section 3 day 3 mu hyp test 2015s.notebook, March 26, 2015
Advanced Placement Statistics
Tuesday March 24, 2015

Daily Agenda
1. Welcome to class
2. Please find folder and take your seat.
3. Homework check
4. Introduction to paired data
5. Miscellaneous
6. Collect Folders

Entry Quiz
1. A small P-value (.003) is __________ evidence against the null hypothesis.
2. For proportions (the p, p̂ world), the biggest difference between a significance test and a confidence interval is ...
3. Significance level is the symbol _____ and ________ error.
4. If a P-value is significant then we _________ the null hypothesis.
5. What is a P-value? I need a sentence for explanation. Assuming that _______________________ ...

OTL C9#6 TWO WORKSHEETS

P: p = the true proportion of this restaurant chain's employees who feel that work stress has a negative impact on their personal lives.
H: Ho: p = 0.75   Ha: p ≠ 0.75
A:
* CENTER: p̂ came from a random sample, and μp̂ = p, so p̂ is an unbiased estimator of p. (checked)
* SHAPE: 100(0.75) = 75 > 10 and 100(0.25) = 25 > 10, so the sampling distribution of p̂ is approximately normally distributed. (checked)
* SPREAD: 100(10) = 1000 < N, the total number of employees in the chain, so the standard deviation of the sampling distribution is √(0.75(0.25)/100) ≈ 0.0433 and the individual observations can be considered independent. (checked)
T: Sampling distribution of p̂ ≈ N(0.75, 0.0433) [sketch, axis marked 0.620, 0.663, 0.707, 0.75, 0.793, 0.837, 0.880]; p̂ = 0.68, z = -1.62
O: 2P(p̂ < 0.68) = 2P(z < -1.62) = 2(0.052979) = 0.1059
M: This is weak evidence against the null hypothesis. A P-value of 0.106 means that this observation would occur approximately 10.6 out of 100 times by chance. This IS NOT statistically significant at α = 0.05 because 0.106 > 0.05. Therefore I fail to reject the null hypothesis that the proportion of employees at this restaurant chain who would say work stress has a negative impact on their personal lives equals 0.75.
There is not convincing evidence to suggest the proportion is different from 75%.

Confidence interval for the same setting
A: should use p̂ vs. p0 when checking the conditions (checked)
N: use a z confidence interval for a population proportion
I: (58.9%, 77.1%)
C: the true proportion of employees at this restaurant chain who would say work stress impacts their personal lives is estimated to be between 58.9% and 77.1%.

Aspirin example (35 degrees of freedom)
T: Sampling distribution of x̄ ≈ N(320, 0.5) [sketch, axis marked 318.5, 319, 319.5, 320, 320.5, 321, 321.5]; x̄ = 319, t = -2.00
O: 2P(x̄ < 319) = 2P(t35 < -2.00) = 2(0.0266) = 0.0532
M: This is moderately strong evidence against the null hypothesis. A P-value of 0.0532 means that this observation would occur approximately 5 out of 100 times by chance. This IS NOT statistically significant at α = 0.05 because 0.0532 > 0.05. Therefore I fail to reject the null hypothesis that the mean milligrams of active ingredient in the Aspro brand aspirin tablets is 320 mg. There is not convincing evidence to suggest that the mean active ingredient is different from 320 mg.

Confidence interval: df = 35, area 0.025 in each tail, 0.95 in the middle, t*35 = 2.03
I: (317.99, 320.02) mg for the mean milligrams of active ingredient in the Aspro brand aspirin

Matched Pairs t Procedures
The parameter μ in a paired t procedure is ...
∎ the mean difference in the responses to the two treatments within matched pairs of subjects in the entire population
∎ the mean difference in response to the two treatments for individuals in the population (when the same subject receives both treatments)
∎ the mean difference between before-and-after measurements for all individuals in the population.

Goal - Carrying out a significance test
Part 2 - perform the significance test and state the conclusion
USE THE PHANTOM METHOD
P - Parameter declaration or description
H - state the Hypotheses in terms of μ
A - Assumptions (check the assumptions)
N - Name the test that you will use
T - calculate the Test statistic (z or t score)
O - Obtain the P-value (use the curve and tables)
M - Make a conclusion (5 sentence parts)

P - parameter
μD =
μN =
μDiff = μD - μN
μDiff =
H - hypotheses
Ho:
Ha:
A - assumptions / conditions
* the data are not from a random sample; x̄ comes from our AP class, so x̄ is not an unbiased estimate of μ for the general population
* we must examine the data
* 10n < total population: 10( ) = ___; we must assume that this is less than the total number of times we could perform this activity, so the standard error of the sampling distribution is s/√n and individual observations can be considered independent.
N - name the test
Use a matched pairs t-test for the mean difference with ___ degrees of freedom
T - test statistic
O - obtain P-value
M - make a conclusion
This is ________ evidence against the null hypothesis. A P-value of ______ means that this observation would occur approximately _______ out of ______ times by chance. This __________ statistically significant at α = .05 because _____ __ ______.
Therefore I ___ reject the null hypothesis that the mean difference between the number of O's crossed off by the dominant hand and the number of O's crossed off by the weak hand is 0.

What are the changes?
P - the parameter is a mean difference, μD
H - hypotheses: Ho: μD = 0, Ha: μD > 0, μD < 0, or μD ≠ 0
A - assumptions: check the conditions on the list of differences only (not on each data set individually)
N - use a t-test for the mean difference (a paired t test)
TO - remember that your μ0 is 0
M - use the same sentences, but remember it is a mean difference

IF the P-value is low (small, .0?), then Ho must go ... (reject the null)
IF the P-value is high (big, from .1-ish to 1), then give Ho a try (do not reject the null)

MORE DETAIL of PHANTOM
P - state the population parameter
μ = the true population mean ... make this description very specific to the problem
H - state the hypotheses
Ho: μ = ___   Ha: μ > or < or ≠ ___
Always state these in terms of the population parameter. Let the null hypothesis be the "status quo".
A -
N - name the test
t-test for a population mean with ____ (n - 1) degrees of freedom; this could now be a paired t test
T - find the test statistic
O - obtain the P-value
Draw the sampling distribution, shade, and use correct notation: P(x̄ ...) = P(t ...) ≈ etc.
M - make a decision; make this description very specific to the problem

OTL C9#7
Page 597: 77, 79
Page 599: 87

http://en.wikipedia.org/wiki/Tablet_press

A drug manufacturer forms tablets by compressing a granular material that contains the active ingredient and various fillers. The hardness of a sample from each batch of tablets produced is measured in order to control the compression process. The target value for the hardness is μ = 11.5. The hardness data for a sample of 20 tablets were given on the page. Use the Sx from this. Is there significant evidence at the 5% level that the mean hardness of the tablets is different from the target value?

P: μ = the true mean hardness of the batch of tablets from the manufacturer
H: Ho: μ = 11.5   Ha: μ ≠ 11.5
We would like to find evidence against the tablets being manufactured at 11.5 so that we know when the process needs to be recalibrated.
A:
* It is stated that the sample mean came from a random sample, so x̄ is an unbiased estimator of μ (μx̄ = μ).
* n = 20, with no outliers in the graphs [graphs not reproduced], so the sampling distribution of x̄ will be approximately normally distributed.
* 20(10) = 200; assume 200 < total number of tablets produced, so the standard error of the sampling distribution is s/√n and the individual observations can be considered independent.
N: use a t test for a population mean with 19 degrees of freedom
Gather your known information together first, then find the test statistic and obtain the P-value.
T: Sampling distribution of x̄ ≈ N(11.5, 0.02) [sketch, axis marked 11.44, 11.46, 11.48, 11.5, 11.52, 11.54, 11.56]; x̄ = 11.52, marked t = 1.00 on the sketch
Use the program DRAWT to find the P-value for the test statistic t.
O: 2P(x̄ > 11.52) = 2P(t19 > 0.755) ≈ 2(0.2298) ≈ 0.4596
(The rounded values on the sketch give t = 1.00; the P-value calculation uses t = 0.755.)
M: This is little evidence against the null hypothesis. A P-value of 0.4596 means that this observation (or one more extreme) would occur approximately 46 out of 100 times by chance. This is NOT statistically significant at α = 0.05 because 0.4596 > 0.05. Therefore, I fail to reject the null hypothesis that the mean hardness of the tablets is 11.5. There is not convincing evidence that the mean hardness differs from 11.5.
Don't shut down the manufacturing process; it is working well.

How does this relate to a 95% confidence interval? Is 11.5 in the interval or not?

Confidence interval for the same setting
P and A: the same parameter and the same condition checks as in the significance test.
N: use a t confidence interval for a population mean with 19 degrees of freedom
I:
C:

pg 599 #87 GROWING TOMATOES
An agricultural field trial compares the yield of two varieties of tomatoes for commercial use. The researchers divide in half each of 10 small plots of land in different locations and plant each tomato variety on one-half of each plot. After harvest, they compare the yields in pounds per plant at each location.
The 10 differences (Variety A - Variety B) give x̄ = 0.34 and s = 0.83. Is there convincing evidence that Variety A has the higher yield?
a) Describe in words what the parameter μ is in this setting.
b) Perform a significance test to answer the question. Follow the Inference Toolbox. (PHANTOM)

P: μD = the mean difference in the yield in pounds per plant (Variety A plant yield - Variety B plant yield)
H: Ho: μD = 0   Ha: μD > 0
A:
* It is not stated whether the varieties were randomly assigned to the halves of each plot, but we must be willing to act as if the plant placement was random, so that x̄ will be an unbiased estimator of μD.
* We must examine the data. WHOOPS, there is no data, so we must be willing to assume that with n = 10 the sample data have no outliers and no skew, and thus the sampling distribution is approximately normally distributed.
* 10n < total population (are the scores independent?): 10(10) = 100; safely assume that this is less than the total number of possible tomato plants of Variety A and Variety B, so the standard error of the sampling distribution is s/√n and the individual differences can be considered independent.
NOTE: Variety A and Variety B plants need not be randomly selected, but we assumed the differences came from randomly placed plants in each plot. These yields should be independent.
N: Use a matched pairs t-test for the mean difference with 9 degrees of freedom.
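The matched pairs t test for the tomato differences can be sketched from the summary statistics alone (x̄ = 0.34, s = 0.83, n = 10). This is only an illustration, not the calculator program used in class: the helper names `t_pdf`, `t_upper_tail`, and `paired_t_from_summary` are made up for this sketch, and the tail area is approximated by numerical integration rather than read from a table. Note that the unrounded standard error gives a t statistic of about 1.30, slightly below the 1.31 obtained when the standard error is rounded to 0.26.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 < T < |t|)
    upper = 0.5 - area        # P(T > |t|), using symmetry about 0
    return upper if t >= 0 else 1 - upper

def paired_t_from_summary(mean_diff, sd_diff, n):
    """One-sided test of Ho: muD = 0 against Ha: muD > 0 from summary statistics."""
    se = sd_diff / math.sqrt(n)       # standard error of the mean difference
    t = (mean_diff - 0) / se          # the null value of the mean difference is 0
    p = t_upper_tail(t, n - 1)        # upper tail with n - 1 degrees of freedom
    return se, t, p

se, t_stat, p_value = paired_t_from_summary(0.34, 0.83, 10)
print(f"SE = {se:.4f}, t = {t_stat:.2f}, one-sided P-value = {p_value:.3f}")
```

The P-value printed here should land close to the table value of roughly 0.11, which is larger than α = 0.05.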
a) P: μD = the mean difference in the yield in pounds per plant (Variety A plant yield - Variety B plant yield)
H: Ho: μD = 0   Ha: μD > 0
b) c) d) Describe in words what a Type I error and a Type II error are in this setting.
The null hypothesis is basically that there is no difference between the yield in pounds of Variety A and Variety B. A Type I error would be to decide that Variety A yields more pounds of fruit when in fact there is no difference in the variety yields.
The alternative hypothesis is basically that Variety A yields more pounds than Variety B. A Type II error would be to decide that the varieties produce the same amount of fruit when in fact Variety A yields more pounds of fruit than Variety B.

T: Sampling distribution of x̄ ≈ N(0, 0.26) [sketch, axis marked -0.78, -0.52, -0.26, 0, 0.26, 0.52, 0.78]; x̄ = 0.34, t = 1.31
O: P(x̄ > 0.34) = P(t9 > 1.31) ≈ 0.11132
M: This is weak evidence against the null hypothesis. A P-value of 0.11 means that this observation would occur approximately 11 out of 100 times by chance. This is NOT statistically significant at α = .05 because 0.11 > 0.05. Therefore I do not reject the null hypothesis that the mean difference between the Variety A yield and the Variety B yield is 0. (That is, the claim that A has a larger yield than B is not upheld; we can detect no statistically significant difference in the yields of the varieties.)

12.19 The power of tomatoes pg 763
a) Describe in words what a Type I error and a Type II error are in the setting of 12.18; also describe POWER.
The null hypothesis is basically that there is no difference between the yield in pounds of Variety A and Variety B.
A Type I error would be to decide that Variety A yields more pounds of fruit when in fact there is no difference in the variety yields.
The alternative hypothesis is basically that Variety A yields more pounds than Variety B. A Type II error would be to decide that the varieties produce the same amount of fruit when in fact Variety A yields more pounds of fruit than Variety B.
The power of the test is the probability of finding that Variety A produces more than Variety B when that is really true.
d) Give two other ways to increase the power of the test besides increasing the sample size.
To increase power ...
* increase the significance level
* look for a true mean difference that is farther from the null value (farther away from zero)
* decrease the population standard deviation (sometimes impossible)
* INCREASE THE SAMPLE SIZE

12.18 GROWING TOMATOES pg 763
An agricultural field trial compares the yield of two varieties of tomatoes for commercial use. The researchers divide in half each of 10 small plots of land in different locations and plant each tomato variety on one-half of each plot. After harvest, they compare the yields in pounds per plant at each location. The 10 differences (Variety A - Variety B) give x̄ = 0.34 and s = 0.83. Is there convincing evidence that Variety A has the higher yield?
a) Describe in words what the parameter μ is in this setting.
b) Perform a significance test to answer the question. Follow the Inference Toolbox. (PHANTOM)

P: μD = the mean difference in the yield in pounds per plant.
(Variety A plant yield - Variety B plant yield)
H: Ho: μD = 0   Ha: μD > 0
A:
* It is not stated whether the varieties were randomly assigned to the halves of each plot, but we must be willing to act as if the plant placement was random, so that x̄ will be an unbiased estimator of μD.
* We must examine the data. WHOOPS, there is no data, so we must be willing to assume that with n = 10 the sample data have no outliers and no skew, and thus the sampling distribution is approximately normally distributed.
* 10n < total population (are the scores independent?): 10(10) = 100; safely assume that this is less than the total number of possible tomato plants of Variety A and Variety B, so the standard error of the sampling distribution is s/√n.
NOTE: Variety A and Variety B plants need not be randomly selected, but we assumed the differences came from randomly placed plants in each plot. These yields should be independent.
N: Use a matched pairs t-test for the mean difference with 9 degrees of freedom.
T: Sampling distribution of x̄ ≈ N(0, 0.26) [sketch, axis marked -0.78, -0.52, -0.26, 0, 0.26, 0.52, 0.78]; x̄ = 0.34, t = 1.31
O: P(x̄ > 0.34) = P(t9 > 1.31) ≈ 0.11132
M: This is weak evidence against the null hypothesis. A P-value of 0.11 means that this observation would occur approximately 11 out of 100 times by chance. This is NOT statistically significant at α = .05 because 0.11 > 0.05. Therefore I do not reject the null hypothesis that the mean difference between the Variety A yield and the Variety B yield is 0. (That is, the claim that A has a larger yield than B is not upheld; we can detect no statistically significant difference in the yields of the varieties.)

12.19 The power of tomatoes pg 763
The tomato experts who carried out the field trial described in the previous exercise suspect the large P-value there is due to low power.
They would like to be able to detect a mean difference in yields of 0.5 pound per plant at the 0.05 significance level. Based on the previous study, use 0.83 as an estimate of both the population σ and the value of s in future samples.
a) Describe a Type I and Type II error in this setting. Which is more serious?

TYPE I ERROR (α)
A Type I error is to reject the null hypothesis when I should not have rejected it (it was actually TRUE).
In the context of the tomato plants: a Type I error is to find that Variety A tomato plants outproduce Variety B tomato plants when in fact they do not.
Consequence - The farmers switch to a new variety of tomato plant that really has no benefit (no higher yield) ... what could happen here?

TYPE II ERROR (β)
A Type II error is to fail to reject the null hypothesis when I should have rejected it (it was actually FALSE).
In the context of the tomato plants: a Type II error is to find that Variety A tomato plants and Variety B tomato plants produce the same yield in pounds of tomatoes when in fact Variety A produces more tomatoes.
Consequence - The farmers do not switch to the new variety of tomato plants and miss a great opportunity to make more food (profit, money, etc.).
For this company (farmer) the Type II error seems to be the more serious one. They would fail to make a bigger profit.

POWER (1 - β)
Power is the probability of rejecting the null hypothesis when it is in fact FALSE.
In the context of the tomato plants: power is the probability of finding that Variety A tomato plants produce more than Variety B tomato plants when that is actually the case (false null).
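
The power idea for the tomato trial can be made concrete with a rough calculation. The sketch below is an approximation under stated assumptions, not the method from the textbook exercise: it treats σ = 0.83 as known and uses a one-sided z test in place of the t test, and the function name `approx_power` and the alternative sample size of 25 plots are hypothetical choices made only for illustration.

```python
import math
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a one-sided z test of Ho: muD = 0 vs Ha: muD > 0.

    delta is the true mean difference we want to detect; sigma is treated
    as known (an assumption, since the notes only have the estimate s = 0.83).
    """
    z = NormalDist()                           # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)              # reject Ho when z exceeds this value
    shift = delta * math.sqrt(n) / sigma       # how far the alternative moves the z statistic
    return 1 - z.cdf(z_crit - shift)           # P(reject Ho | true difference = delta)

base = approx_power(0.5, 0.83, 10)                      # the original 10 plots
more_plots = approx_power(0.5, 0.83, 25)                # hypothetical larger trial
bigger_alpha = approx_power(0.5, 0.83, 10, alpha=0.10)  # larger significance level
smaller_sigma = approx_power(0.5, 0.50, 10)             # hypothetical smaller spread

print(f"n = 10: {base:.2f}")
print(f"n = 25: {more_plots:.2f}")
print(f"alpha = 0.10: {bigger_alpha:.2f}")
print(f"sigma = 0.50: {smaller_sigma:.2f}")
```

Under these assumptions the power with 10 plots comes out near 0.6, so a true difference of 0.5 pound per plant would be missed roughly 4 times in 10. A larger sample, a larger α, or a smaller standard deviation each raises the power.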