Chapter 8 Hypothesis Testing

8.1 Review & Preview
Inferential statistics involve using sample data to…
1.
2.

8.2 Basics of Hypothesis Testing
Part I: Basic Concepts of Hypothesis Testing
Hypothesis:
Hypothesis test (or test of significance):
Examples of typical hypotheses:
--the mean body temperature of humans is less than 98.6°F: µ < 98.6°F
--the XSORT method of gender selection increases the probability that a baby born will be a girl, so the probability of a girl is greater than 0.5: p > 0.5
--the population of college students has IQ scores with a standard deviation equal to 15: σ = 15

Rare Event Rule for Inferential Statistics
To analyze sample data to test a claim, we choose between the following 2 explanations:
1. The sample results COULD easily occur by chance
Ex: In testing the XSORT gender-selection method that is supposed to make babies more likely to be girls, the result of 52 girls in 100 births is greater than 50%, but 52 girls could easily occur by chance, so there is not sufficient evidence to conclude that the XSORT method is effective.
2. The sample results are NOT LIKELY to occur by chance
Ex: In testing the XSORT gender-selection method, the result of 95 girls in 100 births is greater than 50%, and 95 girls is so extreme that it could not easily occur by chance, so there is sufficient evidence to conclude that the XSORT method is effective.

Ex: Assume that 100 babies are born to 100 couples treated with the XSORT method of gender selection that is claimed to make girls more likely. If 58 of the 100 babies born are girls, test the claim that “with the XSORT method, the proportion of girls is greater than the proportion of 0.5 that occurs without any treatment.” Using p to denote the proportion of girls born with the XSORT method, test the claim that p > 0.5.
Given the probability of getting 58 or more girls = 0.0548

Components of a Formal Hypothesis Test
Null Hypothesis: denoted by H0
We test the null hypothesis directly in the sense that we assume it is true and reach a conclusion to either reject H0 or fail to reject H0.
Ex: H0: p = 0.5     H0: µ = 98.6     H0: σ = 15
Alternative Hypothesis: denoted by H1, Ha, or HA
must use one of these symbols: <, >, or ≠
Ex: H1: p > 0.5     H1: µ < 98.6     H1: σ ≠ 15

Ex: Use the given claims to express the corresponding null and alternative hypotheses in symbolic form.
a. The proportion of workers who get jobs through networking is greater than 0.5.
Claim:          H1:          H0:
b. The mean weight of airline passengers with carry-on baggage is at most 195 lb.
Claim:          H1:          H0:
c. The standard deviation of IQ scores of actors is equal to 15.
Claim:          H1:          H0:

Forming Your Own Claims (hypotheses): If you are conducting a study and want to use a hypothesis test to support your claim, the claim must be worded so that it becomes the alternative hypothesis (and can be expressed using only the symbols <, >, or ≠). You can never support a claim that some parameter is EQUAL to a specified value.

Significance level (denoted by α): If the test statistic falls in the critical region, we reject the null hypothesis, so α is the probability of making the mistake of rejecting the null hypothesis when it is true.
Most common choices for α are _______________________ (They correspond with confidence levels of 90%, 95%, and 99%)

Test Statistic:

Parameter | Sampling Distribution | Requirements | Test Statistic
Proportion p |  | np ≥ 5 and nq ≥ 5 | $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Mean µ |  | σ is unknown and normally distributed population OR σ is unknown and n > 30 | $t = \dfrac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$
Mean µ |  | σ is known and normally distributed population OR σ is known and n > 30 | $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}}$
Standard deviation σ or variance σ² |  | normally distributed population (strict requirement) | $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$

Critical region (or rejection region):

Ex: A survey of n = 703 randomly selected workers showed that 61% (p̂ = 0.61) of those respondents found their job through networking. Find the value of the test statistic for the claim that most workers (more than 50%) get their jobs through networking.

Ex #2: Using the claim about the XSORT method of gender selection, we have n = 100 and x = 58 girls. Find the value of the test statistic for the claim that more girls are born if couples use the XSORT method.

2 Types of Hypothesis Tests:
The test statistic alone usually does not give us enough information to make a decision about the claim being tested.
There are 2 methods of interpreting the test statistic:
1. ______________________________ (or p-value or probability value)
2. ______________________________ (or traditional method)
Both approaches require that we first determine whether our hypothesis test is two-tailed, left-tailed, or right-tailed:
a. Two-tailed test: the critical region is in the two extreme regions (tails) under the curve (the significance level α is divided equally between the two tails that constitute the critical region)
Ex: p ≠ 0.5 (so the critical region is in ___________ of the normal distribution)
b. Left-tailed test: the critical region is in the extreme left region (tail) under the curve
Ex: p < 0.5 (so the critical region is in the _________ of the normal distribution)
c.
Right-tailed test: the critical region is in the extreme right region (tail) under the curve
Ex: p > 0.5 (so the critical region is in the _________ of the normal distribution)
Hint: The direction of the inequality symbol in the alternative hypothesis points to the tail of the critical region:
> ____________________
< ____________________

P-Value Method of Hypothesis Testing:
P-value:
--The P-value is used with the test statistic in order to make a decision about the null hypothesis
--More common than the critical value method because technology can be used to find P-values quickly
Summary of procedure:
Critical region in the left tail: P-Value = __________________________________
Critical region in the right tail: P-Value = __________________________________
Critical region in 2 tails: P-Value = __________________________________

Ex: You found a test statistic of z = 1.60 and the alternative hypothesis states that the mean length is greater than 30.45 mm.

Ex: You found a test statistic of z = 1.78 and the alternative hypothesis states that the population proportion is not equal to 75%.

Don’t confuse the P-value (the probability that the test statistic is at least as extreme as the one obtained) with the population proportion p.

Critical Value Method of Hypothesis Testing:
Critical value (or traditional) method:
Depends on the nature of the null hypothesis, the sampling distribution that applies, and the significance level α
Ex: Using a significance level of α = 0.05, find the critical values for each of the following alternative hypotheses (assuming that the normal distribution can be used to approximate the binomial distribution).
o Left-tailed test: p < 0.5 (so the critical region is in the left tail of the normal distribution)
o Right-tailed test: p > 0.5 (so the critical region is in the right tail of the normal distribution)
o Two-tailed test: p ≠ 0.5 (so the critical region is in both tails of the normal distribution)

Conclusions in Hypothesis Testing:
We always test the null hypothesis. The initial conclusion will always be one of the following:
1.
2.
The decision to reject or fail to reject the null hypothesis is made using one of the following methods:

P-value method:
--Reject H0 if the P-value ≤ α (where α is the significance level, such as 0.05).
--Fail to reject H0 if the P-value > α
Ex: With a significance level of α = 0.05 and P-value = 0.0548, we have a P-value > α, so we should ____________
Ex: With a significance level of α = 0.10 and P-value = 0.075, we have a P-value < α, so we should ____________

Critical Value method:
--Reject H0 if the test statistic falls within the critical region
--Fail to reject H0 if the test statistic does not fall within the critical region
Ex: With a test statistic z = 1.60 and α = 0.05 where the critical region is everything to the right of the critical value z = 1.645, the test statistic _______ fall within the critical region & we ________ ___________
Ex: With a test statistic z = 2.45 and α = 0.05 in a two-tailed test, the critical region is everything outside the critical values of $-z_{\alpha/2}$ = ______ and $z_{\alpha/2}$ = _______ so the test statistic ________ fall within the critical region & we ___________

Confidence Intervals:
--Because a confidence interval estimate of a population parameter contains the likely values of that parameter, reject a claim that the population parameter has a value that is not included in the confidence interval.
-for a two-tailed test with significance level α, construct a confidence interval with a confidence level of 1 - α
-for a one-tailed hypothesis test with significance level α, construct a confidence interval with a confidence level of 1 - 2α
--For tests about a population proportion p, this method may give a conclusion that differs from the conclusion reached with the other two methods
--We will not focus on using confidence intervals to test hypotheses in this course

Comprehensive Hypothesis Test:
P-value Method:
Critical Value Method:
Confidence Interval Method:

Restate the Conclusion Using Simple & Non-technical Terms:
--We always test the null hypothesis

Condition | Conclusion
Original claim does not include equality and you reject H0 | “There is sufficient evidence to support the claim that…(original claim)”
Original claim does not include equality and you fail to reject H0 | “There is not sufficient evidence to support the claim that…(original claim)”
Original claim includes equality and you reject H0 | “There is sufficient evidence to warrant rejection of the claim that…(original claim)”
Original claim includes equality and you fail to reject H0 | “There is not sufficient evidence to warrant rejection of the claim that…(original claim)”

--Recognize that we are not proving the null hypothesis. The term “accept” is somewhat misleading because it seems to imply that the null hypothesis has been proved. “Fail to reject” says more correctly that the available evidence isn’t strong enough to warrant rejection of the null hypothesis.
--When stating the final conclusion, it’s possible to get correct statements with up to 3 negative terms. Such conclusions are confusing & should be restated in a way that makes them more understandable (but does not change the meaning).
--Ex: “There is not sufficient evidence to warrant rejection of the claim of no difference between 0.5 and the population proportion.”

Examples: First determine whether the given conditions result in a right-tailed test, a left-tailed test, or a two-tailed test, then find the P-value and the critical value(s). Use both methods to test the null hypothesis and state the conclusion.
a. A significance level of α = 0.05 is used in testing the claim that p > 0.25, and the sample data result in a test statistic of z = 1.18.
H1:
H0:
P-Value method:
Critical value method:
Conclusion:
b. A significance level of α = 0.05 is used in testing the claim that p ≠ 0.25, and the sample data result in a test statistic of z = 2.34.
H1:
H0:
P-Value method:
Critical value method:
Conclusion:

Errors in Hypothesis Tests:
Conclusions are sometimes correct & sometimes wrong (even if we do everything correctly)
--Type I error (α):
--Type II error (β):
--Mnemonic device: use the consonants from “RouTiNe FoR FuN”
Type I error: “Reject True Null” (RTN)
Type II error: “Failure to Reject a False Null” (FRFN)
Descriptions of type I and II errors refer to the null hypothesis being true or false, but when wording a statement representing a type I or type II error, be sure that the conclusion addresses the original claim (which may or may not be the null hypothesis).
Ex: Assume that we are conducting a hypothesis test of the claim that p < 0.5. Here are the null and alternative hypotheses:
H0: p = 0.5
H1: p < 0.5
a. Type I error: the mistake of rejecting a true null hypothesis, so this is a type I error:
b. Type II error: the mistake of failing to reject the null hypothesis when it is false, so this is a type II error:

Controlling type I & II errors:
Since α, β, and the sample size n are all related, when you choose or determine any two of them, the 3rd is automatically determined. The usual practice is to select the values of α and n, so the value of β is determined.
1.
For any fixed α, an increase in the sample size n will cause a decrease in β. (A larger sample will lessen the chance that you make the error of not rejecting the null hypothesis when it’s actually false.)
2. For any fixed sample size n, a decrease in α will cause an increase in β. (Conversely, an increase in α will cause a decrease in β.)
3. To decrease both α and β, increase the sample size.
-Ex: In testing the claim that µ = 0.8535 g for M&M’s, we might choose α = 0.05 and n = 100. In testing the claim that µ = 325 mg for Bufferin brand aspirin tablets, we might choose α = 0.01 and n = 500 because of the more serious consequences associated with testing a commercial drug.

Part II: Beyond the Basics of Hypothesis Testing: The Power of a Test
Power of a Hypothesis Test:
--Computed by using a particular significance level α and a particular value of the population parameter that is an alternative to the value assumed true in the null hypothesis.
--The power of a hypothesis test is the probability (1 - β) of rejecting a false null hypothesis, that is, the probability of supporting an alternative hypothesis that is true.
--Used to gauge the test’s effectiveness in recognizing that a null hypothesis is false
--The values of power can be found using some software packages such as Minitab or can be manually calculated (but calculations are complicated).

Ex: Assume that we have the following null and alternative hypotheses, significance level, and sample data:
H0: p = 0.5
H1: p ≠ 0.5
significance level: α = 0.05
sample size: n = 100
sample proportion: p̂ = 0.57
test statistic: z = 1.4
We get the following examples of power values:

Specific Alternative Value of p | β | Power of Test (1 - β)
0.3 | 0.013 | 0.987
0.4 | 0.484 | 0.516
0.6 | 0.484 | 0.516
0.7 | 0.013 | 0.987

We see that this hypothesis test has a power of 0.987 (a 98.7% probability) of rejecting H0: p = 0.5 when the population proportion is actually 0.3. Similarly, there is a 51.6% probability of rejecting p = 0.5 when the true value of p is actually 0.4.
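The power values in the table above can be approximated by hand with the normal approximation. The sketch below is one way to do that in Python, not the method used by Minitab; the function name `power_two_tailed_proportion` is hypothetical, and it assumes the rejection region is fixed using the standard error under H0 while the sampling distribution of p̂ is evaluated at the alternative value of p.

```python
from math import sqrt
from statistics import NormalDist


def power_two_tailed_proportion(p0, p_alt, n, alpha):
    """Approximate power of a two-tailed z test of H0: p = p0
    when the true proportion is p_alt (normal approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Boundaries of the critical region on the p-hat scale, computed under H0.
    se_null = sqrt(p0 * (1 - p0) / n)
    lower = p0 - z_crit * se_null
    upper = p0 + z_crit * se_null

    # Sampling distribution of p-hat if the true proportion is p_alt.
    sampling = NormalDist(mu=p_alt, sigma=sqrt(p_alt * (1 - p_alt) / n))

    # Power = probability that p-hat lands in either tail of the critical region.
    return sampling.cdf(lower) + (1 - sampling.cdf(upper))


for p_alt in (0.3, 0.4, 0.6, 0.7):
    power = power_two_tailed_proportion(0.5, p_alt, 100, 0.05)
    print(f"p = {p_alt}: beta = {1 - power:.3f}, power = {power:.3f}")
```

Under these assumptions the printed values agree with the table to three decimal places, and rerunning the function with a larger n shows the power rising for every alternative value of p.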
It makes sense that this test is more effective in rejecting the claim of p = 0.5 when the population proportion is actually 0.3 than when the population proportion is actually 0.4.
To increase power, you can…
--
--
A power of at least 0.80 is a common requirement for determining that a hypothesis test is effective. (Some argue that it should be higher: 0.85 or 0.90.)

8.3 Testing a Claim about a Proportion
Claims about a population proportion are usually tested by using a normal distribution as an approximation to the binomial distribution. Instead of using the same exact methods of section 6.7, we use a different but equivalent form of the test statistic and don’t include the correction for continuity (because its effect tends to be very small with large samples).

Testing Claims about a Population Proportion p:
Requirements:
1. The sample observations are a simple random sample
2. The conditions for a binomial distribution are satisfied (a fixed number of independent trials having constant probabilities, and each trial has 2 outcome categories: success and failure)
3. The conditions np ≥ 5 and nq ≥ 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$
--Note that p is the assumed proportion used in the claim, not the sample proportion

Notation:
n =
p̂ = x/n
p =
q =

Test Statistic for Testing a Claim about a Proportion:
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$

P-values: use the standard normal distribution (table A-2) and find the P-values
Memory Trick:
“If the P-value is low, the null must go”
“If the P-value is high, the null will fly”
Critical values: use the standard normal distribution (table A-2) and find the z scores that correspond to the given significance level

Ex: Among 703 randomly selected workers, 61% got their jobs through networking. Use the following sample data with a 0.05 significance level to test the claim that most workers (more than 50%) get their jobs through networking.
Claim: Most workers get their jobs through networking (p > 0.5)
Sample data: n = 703 and p̂ = 0.61
--Are the requirements satisfied?
1. simple random sample
2. fixed number of independent trials, 2 categories: got job through networking or not
3. np ≥ 5 and nq ≥ 5 are both satisfied with n = 703, p = 0.5 and q = 0.5 (np = 351.5 and nq = 351.5)

--P-value method of hypothesis testing:
Step 1: The original claim in symbolic form:
Step 2: The opposite of the original claim:
Step 3: H0: __________ H1: __________
Step 4: Use a significance level of α = ________
Step 5: The normal distribution is used for this test (because we are testing a claim about a population proportion p & the sampling distribution of sample proportions p̂ is approximated by a normal distribution)
Step 6: Calculate the test statistic $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Because the hypothesis test is right-tailed, we find the cumulative area to the right of z = _____ is ______
Step 7: Because the P-value of ______ is less than or equal to the significance level of α = 0.05, we ______________________
Step 8: We conclude that

--Critical value method of hypothesis testing:
Steps 1-5: same as above
Step 6: This is a right-tailed test, so the area of the critical region is an area of α = ___ in the right tail. From table A-2, we find z = ________
Step 7: Because the test statistic (z = 5.83) falls within the critical region, we _____________________________
Step 8: We conclude that

--Confidence interval method of hypothesis testing:
The claim of p > 0.5 can be tested with a 0.05 significance level by constructing a 90% confidence interval.
Using the methods from section 7-2, we find the following confidence interval: 0.580 < p < 0.640
Because we are 90% confident that the true value of p is contained within the limits of 0.580 and 0.640, we have sufficient evidence to support the claim that p > 0.5.
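The test statistic, P-value, and critical value in the networking example can be checked with a few lines of Python. This is only a sketch using the standard library's `statistics.NormalDist` in place of table A-2; the function name `proportion_z_test` is made up for illustration, and it handles the right-tailed case only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, p0, n, alpha=0.05):
    """Right-tailed z test of H0: p = p0 against H1: p > p0.

    Returns the test statistic, the P-value, and the critical value.
    """
    q0 = 1 - p0
    z = (p_hat - p0) / sqrt(p0 * q0 / n)      # z = (p-hat - p) / sqrt(pq/n)
    std_normal = NormalDist()
    p_value = 1 - std_normal.cdf(z)           # area to the right of z
    z_crit = std_normal.inv_cdf(1 - alpha)    # cuts off area alpha in the right tail
    return z, p_value, z_crit


z, p_value, z_crit = proportion_z_test(p_hat=0.61, p0=0.5, n=703)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
print("P-value <= alpha:", p_value <= 0.05)
print("z in critical region:", z > z_crit)
```

Both decision rules give the same answer here because they are two views of the same comparison: the P-value is at most α exactly when z lies at or beyond the critical value.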
Note: The confidence interval uses an estimated standard deviation based on the sample proportion & may often differ from results of the traditional and P-value methods of hypothesis testing. A good strategy is to use confidence intervals to estimate population proportions but use the traditional and P-value methods for testing a hypothesis.

Ex: When Gregor Mendel conducted his famous hybridization experiments with peas, one such experiment resulted in offspring consisting of 428 peas with green pods and 152 peas with yellow pods. According to Mendel’s theory, ¼ of the offspring peas should have yellow pods. Use a 0.05 significance level to test the claim that the proportion of peas with yellow pods is equal to ¼.
--Are the requirements satisfied?
--P-value method of hypothesis testing:
--Critical value method of hypothesis testing:

--Finding the number of successes if you’re given the sample proportion:
-Technology often requires the number of successes (x) in order to test a claim about a population proportion
-Change the percent to a decimal and multiply it by the sample size (n)
-Ex: A Harris Interactive survey conducted for Gillette showed that among 514 human resource professionals polled, 90% said that appearance of a job applicant is most important for a good first impression. What is the actual number of respondents who said that appearance of a job applicant is most important for a good first impression?

--Using a TI-83 or TI-84 Graphing Calculator to find the test statistic when testing a claim about a population proportion:
1. Press STAT and select TESTS
2. Select 1: PropZTest
3. Enter the claimed value of the population proportion for p0, then enter the values for x and n, and then select the type of test.
4.
Highlight Calculate and press ENTER
Ex: If you were testing the claim from the previous example that the proportion of human resource professionals who said that appearance of a job applicant is most important for a good first impression is greater than 75%, use the graphing calculator to find the test statistic.

8.4 Testing a Claim about a Mean
Part I: Testing Claims about a Population Mean (with σ not known):
More practical and realistic because we don’t usually know the value of σ
Requirements:
1.
2.
3.
Test Statistic for Testing a Claim about a Mean (with σ not known):
$t = \dfrac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$
P-values & Critical Values: use table A-3 with df = n – 1 for the number of degrees of freedom.

Ex: Data set 20 in Appendix B includes weights of 13 red M&M candies randomly selected from a bag containing 465 M&M’s. The sample weights are listed below and they have a mean of x̄ = 0.8635 and a standard deviation of s = 0.0576. The bag states that the net weight of the contents is 396.9 g, so the M&M’s must have a mean weight that is at least 396.9/465 = 0.8535 g in order to provide the amount claimed. Use the sample data with a 0.05 significance level to test the claim of a production manager that the M&M’s have a mean that is actually greater than 0.8535 g, so consumers are being given more than the amount indicated on the label.
0.751 0.942 0.857 0.841 0.873 0.856 0.809 0.799 0.890 0.966 0.878 0.859 0.905
--Check the requirements: simple random sample, we are not using a known value of σ, the sample size n = 13 is not greater than 30 but the normal quantile plot suggests the weights are normally distributed.

--Critical value method of hypothesis testing:
Step 1: The original claim:
Step 2: The opposite of the original claim:
Step 3: H0: _________ H1: _________
Step 4: Use a significance level of α = __________
Step 5: Because the claim is made about the population mean µ and because the requirements for using the t test statistic are satisfied, we use the t distribution.
Step 6: Calculate the test statistic: $t = \dfrac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$ =
The correct df = ____________
The test is right-tailed with α = _______
Use table A-3 to find the critical value of t = ________
Step 7: Because the test statistic of t = 0.626 does not fall in the critical region, we __________________________
Step 8: We conclude that

Finding P-values with the t distribution:
Use table A-3 to identify a range of P-values as follows: use the number of degrees of freedom to locate the relevant row of the table, then determine where the test statistic lies relative to the t values in that row. Based on a comparison of the t test statistic and the t values in the row of table A-3, identify a range of values by referring to the area values given at the top of table A-3.
--Ex: Use the previous example and the P-value method of hypothesis testing:
-This is a right-tailed test, so the P-value is the area to the right of the test statistic.
-Use table A-3 & locate the row where df = _____
-The test statistic of t = _____ is smaller than every value in that row of the table, so the “area in one tail” is greater than ______
-The area to the right of t = ______ is greater than _____
--Ex: Use table A-3 to find a range of values for the P-value corresponding to a t test with these components: test statistic is t = 2.777, sample size is n = 20, significance level is α = 0.05, alternative hypothesis is H1: µ ≠ 5
-Use table A-3 & locate the row where df = _____
-Because the test statistic of t = ____ lies between the table values of _____ and ____, the P-value is between the corresponding “areas in two tails” of _____ and ____
-Although we can’t find the exact P-value, we can conclude that ______ < P-value < _______

Using a Confidence Interval to test a claim about µ when σ is not known:
--For a two-tailed hypothesis test with a 0.05 significance level, we construct a 95% confidence interval.
--For a one-tailed hypothesis test with a 0.05 significance level, we construct a 90% confidence interval.
--Using the sample data from the previous example: n = 13, x̄ = 0.8635 g, s = 0.0576 g (with σ not known), and a significance level of 0.05
--By the methods learned in section 7-4, we construct this 90% confidence interval: 0.8350 < µ < 0.8920.
--Because the assumed value of µ = 0.8535 is contained within the confidence interval, we cannot reject that assumption.
--We don’t have sufficient evidence to support the claim that the mean weight is greater than 0.8535 g. Based on the confidence interval, the true value of µ is likely to be any value between 0.8350 and 0.8920 g, including 0.8535 g.
Although the 3 methods of testing hypotheses can give different results when testing a claim about a population proportion (section 8-3), when testing a claim about a population mean there is no such difference: all 3 methods are equivalent.

--Ex #2: Listed below are the measured radiation emissions (in W/kg) corresponding to a sample of cell phones. Use a 0.05 significance level to test the claim that cell phones have a mean radiation level that is less than 1.00 W/kg. Assume the sample is a simple random sample and that the data are from a population that is normally distributed.
0.38 0.55 1.54 1.55 0.50 0.60 0.92 0.96 1.00 0.86 1.46
-Critical value method:
Step 1: The original claim:
Step 2: The opposite of the original claim:
Step 3: H0: ___________ H1: ___________
Step 4: Use a significance level of α = _______
Step 5: Because the claim is made about the population mean µ and because the requirements for using the t test statistic are satisfied, we use the t distribution.
Step 6: Calculate the test statistic: $t = \dfrac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$ =
The correct df = ___________
The test is left-tailed with an area of α = ______ in one tail
Use table A-3 to find the critical value of t = __________
Step 7: Because the test statistic of t = -0.486 does not fall in the critical region, we ________________________________
Step 8: We conclude that

-P-value method:
-This is a left-tailed test, so the P-value is the area to the left of the test statistic.
-Use table A-3 & locate the row where df = _____
-The test statistic of t = +________ is smaller than every value in that row of the table, so the “area in one tail” (the area to the right of 0.486) is greater than _______
-So, by symmetry, the area to the left of t = -0.486 is also greater than 0.10
-The area to the left of t = ______ is greater than ______
-Because the P-value is greater than α = 0.05, we ________________________________
-We conclude that

--Using a TI-83 or TI-84 Graphing Calculator to test a claim about the population mean when σ is unknown
1. Press STAT and select TESTS
2. Select 2: T-Test
3. You can choose whether you want to enter data into the list L1 or you can enter the statistics
4. Enter the claimed value of the population mean for µ0, then enter the values for the sample mean, sample standard deviation, and sample size, and then select the type of test.
5. Highlight Calculate and press ENTER
Ex: Find the test statistic t and the P-value using example #2:

Part II: Testing Claims about a Population Mean (with σ known):
Uses the normal distribution with the same components of hypothesis testing as section 8-3.
Requirements:
1.
2.
3.
Test Statistic for Testing a Claim about a Mean (with σ known):
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}}$
P-values: use the standard normal distribution (table A-2) and find the P-values
Critical values: use the standard normal distribution (table A-2) and find the z scores

Ex: Data set 13 in Appendix B includes weights of 13 red M&M candies randomly selected from a bag containing 465 M&M’s. The standard deviation of the weights of all M&M’s in the bag is σ = 0.0565 g. The sample weights are listed below and they have a mean of x̄ = 0.8635. The bag states that the net weight of the contents is 396.9 g, so the M&M’s must have a mean weight that is at least 396.9/465 = 0.8535 g in order to provide the amount claimed. Use the sample data with a 0.05 significance level to test the claim of a production manager that the M&M’s have a mean that is actually greater than 0.8535 g, so consumers are being given more than the amount indicated on the label.
0.751 0.942 0.857 0.841 0.873 0.856 0.809 0.799 0.890 0.966 0.878 0.859 0.905
--Check the requirements: simple random sample, known value of σ = 0.0565 g, the sample size n = 13 is not greater than 30 but the normal quantile plot suggests the weights are normally distributed.

--P-value method of hypothesis testing:
Step 1: The original claim:
Step 2: The opposite of the original claim:
Step 3: H0: ______________ H1: ______________
Step 4: Use a significance level of α = _____
Step 5: Because σ is known and the population appears to be normally distributed, the central limit theorem indicates that the distribution of sample means can be approximated by a normal distribution.
Step 6: Calculate the test statistic: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}}$ =
Because the hypothesis test is right-tailed, we find the cumulative area to the right of z = ____ is ______________
Step 7: Because the P-value of ________ is greater than the significance level of α = 0.05, we _____________________ ________________________
Step 8: We conclude that

--Critical value method of hypothesis testing:
Steps 1-5: same as above
Step 6: This is a right-tailed test, so the area of the critical region is an area of α = ___ to the right, so 0.95 is the area from the left. From table A-2, we find z = _______
Step 7: Because the test statistic (z = 0.64) does not fall within the critical region, __________________________________
Step 8:

--Using a TI-83 or TI-84 Graphing Calculator to test a claim about the population mean when σ is known
1. Press STAT and select TESTS
2. Select 1: Z-Test
3. You can choose whether you want to enter data into the list L1 or you can enter the statistics
4. Enter the claimed value of the population mean for µ0, then enter the values for the sample mean, population standard deviation, and sample size, and then select the type of test.
5. Highlight Calculate and press ENTER
Ex: Find the test statistic z and the P-value using the previous example:

8.5 Testing a Claim about a Standard Deviation or Variance
Testing Claims about a Population Standard Deviation σ or a Population Variance σ²
Uses the chi-square distribution from section 7-5
Requirements:
1.
2.
Test Statistic for Testing a Claim about σ or σ²:
$\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$
where
n =
s =
σ =
s² =
σ² =
P-values and Critical Values: Use table A-4 with df = n – 1 for the number of degrees of freedom
*Remember that table A-4 is based on cumulative areas from the _______

Properties of the Chi-Square Distribution:
1. All values of χ² are nonnegative and the distribution is not symmetric
2. There is a different χ² distribution for each number of degrees of freedom
3.
The critical values are found in table A-4 (based on cumulative areas from the right)
--locate the row corresponding to the appropriate number of degrees of freedom (df = n – 1)
--the significance level α is used to determine the correct column
--Right-tailed test: Because the area to the right of the critical value is 0.05, locate 0.05 at the top of table A-4
--Left-tailed test: With a left-tailed area of 0.05, the area to the right of the critical value is 0.95, so locate 0.95 at the top of table A-4
--Two-tailed test: Divide the significance level of 0.05 between the left and right tails, so the areas to the right of the two critical values are 0.975 and 0.025. Locate 0.975 and 0.025 at the top of table A-4

Ex #1: The industrial world shares this common goal: improve quality by reducing variation. Quality control engineers want to ensure that a product has an acceptable mean, but they also want to produce items of consistent quality so that there will be few defects. The Newport Bottling Company had been manufacturing cans of cola with amounts having a standard deviation of 0.051 oz. A new bottling machine is tested, and a simple random sample of 24 cans results in the amounts (in ounces) listed below. (Those 24 amounts have a standard deviation of s = 0.039 oz.) Use a 0.05 significance level to test the claim that cans of cola from the new machine have amounts with a standard deviation that is less than 0.051 oz.
11.98 11.98 11.99 11.98 11.90 12.02 11.99 11.93 12.02 12.02 12.02 11.98
12.01 12.00 11.99 11.95 11.95 11.96 11.96 12.02 11.99 12.07 11.93 12.05
--Check requirements: simple random sample, histogram & normal quantile plot show that the sample appears to come from a population with a normal distribution

--Critical value method of testing hypotheses:
Step 1: The original claim: __________
Step 2: If the original claim is false, then __________
Step 3: H0: __________ H1: __________
Step 4: The significance level is α = ________
Step 5: Because the claim is made about σ, we use the chi-square distribution and we have a _______________ test
Step 6: The test statistic is $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ =
Using table A-4, df = __ and the column corresponding to ___, we find the critical value = _______
Step 7: Because the test statistic is not in the critical region, we __________________________________
Step 8:

--P-value method of testing hypotheses:
When using table A-4, we usually cannot find exact P-values because that chi-square distribution table includes only selected values of α. If using table A-4, we can identify limits that contain the P-value. You will reach the same conclusion as with the traditional method.
Step 1: Identify the df = n – 1 =
Step 2: Use the test statistic of _____ and see that in the ____ row of table A-4, the test statistic is between _____ and ______ so the area to the right is between ______ and _______
Step 3: Since we have a left-tailed test, the area to the left must be between _____ and _____
Since our P-value is ______________ to 0.05, we _____________________________________
Step 4: Conclusion:

--Confidence interval method of testing hypotheses:
Using the methods from section 7-5, we can use the sample data (n = 24, s = 0.039) to construct this 90% confidence interval: 0.032 < σ < 0.052. Again, we cannot conclude that σ is less than 0.051.
(same conclusions as the critical value & P-value methods)
--Note: The TI-83/84 graphing calculator does not test hypotheses about σ or σ²

Ex #2: The Skytek Avionics company uses a new production method to manufacture aircraft altimeters. A simple random sample of new altimeters resulted in the errors listed below. Use a 0.01 significance level to test the claim that the new production method has errors with a standard deviation greater than 32.2 feet, which was the standard deviation for the old production method. If it appears that the standard deviation is greater, does the new production method appear to be better or worse than the old method? Should the company take any action?
-42 78 -22 -72 -45 15 17 51 -5 -53 -9 -109
--Critical value method of testing hypotheses:
--P-value method of testing hypotheses:
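Because the TI-83/84 has no built-in test for σ, the χ² test statistic from this section has to be computed some other way. The sketch below is one possible check in Python; the helper name `chi_square_statistic` is hypothetical, and the two critical values are assumed to be read from a chi-square table such as table A-4 (13.091 for df = 23 with area 0.95 to the right, 24.725 for df = 11 with area 0.01 to the right) rather than computed.

```python
from statistics import stdev


def chi_square_statistic(n, s, sigma0):
    """Test statistic chi^2 = (n - 1) s^2 / sigma^2, with sigma taken from H0."""
    return (n - 1) * s**2 / sigma0**2


# Ex #1 (cola cans): summary statistics given in the notes, left-tailed test.
chi2_cola = chi_square_statistic(n=24, s=0.039, sigma0=0.051)
crit_cola = 13.091  # assumed table value: df = 23, area to the right = 0.95
print(f"cola: chi-square = {chi2_cola:.3f}, "
      f"in critical region: {chi2_cola < crit_cola}")

# Ex #2 (altimeter errors): s is computed from the raw data, right-tailed test.
errors = [-42, 78, -22, -72, -45, 15, 17, 51, -5, -53, -9, -109]
s_alt = stdev(errors)  # sample standard deviation (divides by n - 1)
chi2_alt = chi_square_statistic(n=len(errors), s=s_alt, sigma0=32.2)
crit_alt = 24.725  # assumed table value: df = 11, area to the right = 0.01
print(f"altimeters: s = {s_alt:.1f}, chi-square = {chi2_alt:.3f}, "
      f"in critical region: {chi2_alt > crit_alt}")
```

Note the direction of each comparison: a left-tailed test rejects for small values of χ², a right-tailed test for large ones. The code reports only whether the statistic lands in the critical region; wording the conclusion in terms of the original claim is still done by hand.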