STATISTICS AND PROBABILITY
Quarter 4 - Week 1: Testing Hypothesis (Introduction)

Content Standards: The learner demonstrates understanding of key concepts of tests of hypotheses on the population mean and population proportion.

Performance Standards: The learner is able to perform appropriate tests of hypotheses involving the population mean and population proportion to make inferences in real-life problems in different disciplines.

Most Essential Learning Competencies:
1. Illustrate: (a) null hypothesis; (b) alternative hypothesis; (c) level of significance; (d) rejection region; and (e) types of errors in hypothesis testing. M11/12SP-IVa-1
2. Identify the parameter to be tested given a real-life problem. M11/12SP-IVa-3
3. Formulate the appropriate null and alternative hypotheses on a population mean. M11/12SP-IVb-1
4. Identify the appropriate form of the test statistic when: (a) the population variance is assumed to be known; (b) the population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used. M11/12SP-IVb-2

Lesson 1
Testing Hypothesis

Hypothesis testing is a statistical method applied in making decisions using experimental data. It tests an assumption that we make about a population. A hypothesis is a proposed explanation, assertion, or assumption about a population parameter or about the distribution of a random variable.

The Null and Alternative Hypothesis

The null hypothesis, denoted by H₀, states that there is no difference between a parameter and a specific value, or that there is no difference between two parameters. It can be written as follows:

H₀: μ = μ₀
H₀: μ ≤ μ₀
H₀: μ ≥ μ₀

The alternative hypothesis, denoted by H₁ or Hₐ, states that there is a difference between a parameter and a specific value, or that there is a difference between two parameters.
It can be written as follows:

H₁: μ ≠ μ₀
H₁: μ > μ₀
H₁: μ < μ₀

Hypothesis-Testing Common Phrases

=    is equal to; is the same as; is exactly the same as; has not changed from
≠    is not equal to; is not the same as; is different from; has changed from
>    is increased; is greater than; is higher than; is above; is bigger than; is longer than
<    is decreased; is less than; is lower than; is below; is smaller than; is decreased or reduced from
≥    is at least; is not less than; is greater than or equal to
≤    is at most; is not more than; is less than or equal to

Example 1. The owner of a factory that sells a particular bottled fruit juice claims that the average capacity of their product is 250 ml. Is the claim true?

Solutions:
The parameter of interest is the population mean μ, and the claimed value is 250 ml.
H₀: The bottled drinks contain 250 ml per bottle. (This is the claim.)
In symbols: H₀: μ = 250
H₁: The bottled drinks do not contain 250 ml per bottle. (This is the opposite of the claim.)
In symbols: H₁: μ ≠ 250

Example 2. A farmer believes that using organic fertilizers on his plants will yield greater income. His average income from the past was P200,000.00 per year. State the hypotheses in symbols.

Solutions:
H₀: μ = 200,000.00
The phrase "greater income" is associated with the greater-than direction. So,
H₁: μ > 200,000.00

Level of Significance

The level of significance, denoted by alpha or α, refers to the degree of significance at which we accept or reject the null hypothesis. Because 100% accuracy is not possible in accepting or rejecting a hypothesis, some risk of error is always present. The significance level α is the probability of making the wrong decision when the null hypothesis is true, that is, of rejecting a true null hypothesis. The most commonly used significance levels are 0.10, 0.05, and 0.01.

Two-Tailed Test vs One-Tailed Test

Two-Tailed Test - It is a non-directional test with the rejection region lying on both tails of the normal curve. It is used when the alternative hypothesis uses words such as not equal to, significantly different, etc.
H₀: μ = μ₀
H₁: μ ≠ μ₀

One-Tailed Test - It is a directional test with the rejection region lying on either the left or the right tail of the normal curve.

a. Right directional test. The region of rejection is on the right tail. It is used when the alternative hypothesis uses comparatives such as greater than, higher than, better than, superior to, exceeds, etc.
H₀: μ = μ₀
H₁: μ > μ₀

b. Left directional test. The region of rejection is on the left tail. It is used when the alternative hypothesis uses comparatives such as less than, smaller than, inferior to, lower than, below, etc.
H₀: μ = μ₀
H₁: μ < μ₀

Example 3. Determine whether the test is one-tailed or two-tailed.
a. Given: The owner of a factory that sells a particular bottled fruit juice claims that the average capacity of their product is 250 ml.
Answer: two-tailed test
b. Given: A farmer believes that using organic fertilizers on his plants will yield greater income. His average income from the past was P200,000.00 per year.
Answer: one-tailed test (right)

Illustration of the Rejection Region

The rejection region (or critical region) is the set of all values of the test statistic that cause us to reject the null hypothesis.
The non-rejection region (or acceptance region) is the set of all values of the test statistic that cause us to fail to reject the null hypothesis.
The critical value is a point (boundary) on the test distribution that is compared to the test statistic to determine whether the null hypothesis should be rejected.

Commonly Used Levels of Significance and Their Corresponding Critical Values of the z-Distribution

Level of Significance α    One-Tailed           Two-Tailed
0.010                      +2.33 or -2.33       ±2.575
0.025                      +1.96 or -1.96       ±2.24
0.050                      +1.645 or -1.645     ±1.96
0.100                      +1.28 or -1.28       ±1.645

Example 4. Write the critical value of the following:
a. two-tailed test, α = 0.01, n = 67
Solutions: The test is two-tailed and n ≥ 30, so use the z-distribution.
z = ±2.575
b.
right-tailed test, α = 0.05, n = 25
Solutions: The test is one-tailed (positive) and n < 30, so use the t-distribution.
Identify the level of significance: α = 0.05
Identify the degrees of freedom: df = n - 1 = 25 - 1 = 24
Find the critical value in the t-distribution table, in the row for n - 1 degrees of freedom.
t = 1.711

Example 5. Illustrate the rejection region given the critical value and identify if the t-values lie in the non-rejection region or rejection region.
a. critical t-value of -2.33, computed t-value of -1.38
With a negative critical value, the rejection region is the left tail, that is, values below -2.33. Since -1.38 is greater than -2.33, the computed t-value is in the non-rejection region.

Type I and Type II Errors

If the null hypothesis is true and it is rejected, then a Type I error is committed. The probability of committing a Type I error is denoted by α (alpha).
If the null hypothesis is false and it is accepted (not rejected), then a Type II error is committed. The probability of committing a Type II error is denoted by β (beta).

Example 6.
a. Maria's Age
Maria insists that she is 30 years old when, in fact, she is 32 years old. What error is Maria committing?
Solutions: Maria is rejecting the truth. She is committing a Type I error.
b. Monkey-Eating Eagle Hunt
A man plans to go hunting the Philippine monkey-eating eagle, believing that it is proof of his mettle. What type of error is this?
Solutions: Hunting the Philippine eagle is prohibited by law. Thus, it is not a good sport. The man is accepting a belief that is false, so it is a Type II error.

To summarize the difference between Type I and Type II errors, take a look at the table below:

Decision                              H₀ is true                    H₀ is false
Fail to reject H₀ (or accept H₀)      Correct decision              Type II error
                                      (failed to reject H₀          (failed to reject H₀
                                      when it is true)              when it is false)
Reject H₀                             Type I error                  Correct decision
                                      (rejected H₀                  (rejected H₀
                                      when it is true)              when it is false)

Lesson 2
Identifying Appropriate Test Statistics Involving Population Mean

A test statistic is a value used to determine the probability needed in decision making. It is a random variable that is calculated from sample data and used in a hypothesis test.

z-test.
In a z-test, the sample is assumed to be normally distributed. A z-score is calculated with population parameters such as the population mean and the population standard deviation. It is used when the population is normal or the sample size is large.

t-test. The sample is also assumed to be normally distributed. A t-test is used when the population variance or standard deviation is not known and the sample size is less than 30.

Central Limit Theorem

If the population is normally distributed or the sample size is large, and the true population mean is μ = μ₀, then the test statistic z = (x̄ - μ₀) / (σ / √n) has a standard normal distribution. When the population standard deviation σ is not known and the sample size is large, we may still use the z-score by replacing σ with its estimate, the sample standard deviation s.

When the value of the sample size (n) is…

Sample size    σ is known    σ is unknown
n ≥ 30         z-test        z-test
n < 30         z-test        t-test

Example 7. Identify the appropriate test statistic to be used in the given problem.

a. The average test score for an entire school is 75 with a standard deviation of 10. What is the probability that a random sample of 5 students scored above 80?
Answer: Here, the sample size (n) is 5, which is less than 30, and the population standard deviation (10) is known. The appropriate test statistic to be used is the z-test.

b. From a random sample of 100 students who have passed a statistics course, the average score was 71.8. Assuming that the population standard deviation is 8.9, and using a significance level of 0.05, does it seem to signify that the average score is more than 70?
Answer: Here, the sample size (n) is 100, which is greater than 30, and the population standard deviation (8.9) is known. The appropriate test statistic to be used is the z-test.

c. An English teacher wanted to test whether the mean reading speed of students is 550 words per minute. A sample of 12 students revealed a sample mean of 540 words per minute with a standard deviation of 5 words per minute.
At the 0.05 significance level, is the reading speed different from 550 words per minute?
Answer: The sample size (n) is 12, which is less than 30, and only the sample standard deviation (5 words per minute) is given, so the population standard deviation is unknown. Therefore, the appropriate test is the t-test.

ASSESSMENT: (30 points)

I. State the null and the alternative hypotheses of the following statements. (4 points each)
1. A car dealership announces that the mean time for an oil change is less than 15 minutes.
2. A company advertises that the mean life of its furnaces is more than 18 years.
3. A consumer analyst reports that the mean life of a certain type of automobile battery is not 74 months.
4. A transportation network company claims that the mean travel time between two destinations is about 16 minutes.

II. Determine if a one-tailed test or a two-tailed test fits the given alternative hypothesis. (2 points each)
1. The average age of doctors in Las Piñas is 35 years.
2. The proportion of senior male students' height is significantly higher than that of senior female students.

III. Illustrate the rejection region given the critical value and identify if the t-values lie in the non-rejection region or rejection region. (2 points each)
1. critical t-value of -2.086, computed t-value of -2.096
2. critical t-value of ±1.071, computed t-value of 1.01

IV. Identify the appropriate test statistic to be used in each problem. (2 points each)
1. Based on the report of the school nurse, the average height of Grade 11 students has increased. Five years ago, the average height of Grade 11 students was 170 cm with a standard deviation of 38 cm. She took a random sample of 150 students and obtained an average height of 165 cm.
2. A manufacturer of tires claims that their tire has a mean life of at least 50,000 km. A random sample of 28 of these tires is tested and the sample mean is 33,000 km. Assume that the population standard deviation is 3,000 km and the lives of the tires are approximately normally distributed.
3.
In the population, the average IQ is 100. A team of scientists wants to test a new medication to see if it has either a positive or a negative effect on intelligence, or no effect at all. A sample of 30 participants who have taken the medication has a mean of 140 with a standard deviation of 20. Did the medication affect intelligence? α = 0.05.

REFERENCES:

Textbooks:
Belecina, R. R., Baccay, E. S., & Mateo, E. B. (2016). Statistics and Probability. Rex Book Store.
Mangaran, A. J., & Santos, E. M. (2005). Probability and Statistics: A Comprehensive Approach.

Online Resources:
Wow Math. (2021, April 23). Null and alternative hypotheses||hypothesis testing||statistics and probability q4 [Video]. YouTube. https://www.youtube.com/watch?v=8IxJaU06qJA&t=3s
BYJU'S. (2021). Retrieved from: https://byjus.com/maths/t-test-table/
Alcantara, A. [Teacher Ayhi]. (2021, April 13). Identifying Appropriate Test Statistics involving Population Mean [Video]. YouTube. https://www.youtube.com/watch?v=Vk5mJeROzME
Rai University. (2015). Unit 4 Tests of Significance. Slideshare. https://www.slideshare.net/raiuniversity/unit-4-45983025
socratic.org/statistics. (n.d.). Retrieved from: https://socratic.org/questions/how-do-you-find-the-area-under-thenormal-distribution-curve-to-the-right-of-z-3
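
Supplementary sketch: the critical-value table in Lesson 1 and the z-test/t-test rule in Lesson 2 can be checked with a few lines of Python. This is only a minimal illustration, not part of the module's required method. The function names z_critical, choose_test, and in_rejection_region and the tail labels "two", "right", and "left" are made up for this sketch. It also assumes that a test statistic exactly equal to the critical value is not counted as being in the rejection region, a boundary convention the lessons do not state. Only Python's standard statistics module is used, so t critical values still have to be read from a t-table.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the z critical value(s) for a significance level alpha.

    tail is "two", "right", or "left" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        # Split alpha equally between the two tails.
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def choose_test(n, sigma_known):
    """Lesson 2 rule: t-test only when n < 30 and sigma is unknown."""
    if n < 30 and not sigma_known:
        return "t-test"
    return "z-test"


def in_rejection_region(statistic, critical, tail):
    """Check whether a computed statistic lies beyond the critical value.

    Assumption of this sketch: a statistic equal to the critical value
    is treated as NOT being in the rejection region.
    """
    if tail == "two":
        return abs(statistic) > abs(critical)
    if tail == "right":
        return statistic > critical
    if tail == "left":
        return statistic < critical
    raise ValueError("tail must be 'two', 'right', or 'left'")


if __name__ == "__main__":
    # Rebuild the one-tailed and two-tailed z critical values of Lesson 1.
    for alpha in (0.01, 0.025, 0.05, 0.10):
        one_tailed = z_critical(alpha, "right")
        _, two_tailed = z_critical(alpha, "two")
        print(f"alpha={alpha}: one-tailed {one_tailed:.3f}, two-tailed ±{two_tailed:.3f}")

    # Example 7: (n, is sigma known?) for parts a, b, and c.
    for n, sigma_known in ((5, True), (100, True), (12, False)):
        print(n, sigma_known, choose_test(n, sigma_known))

    # Example 5a: critical value -2.33, computed value -1.38, left tail.
    print(in_rejection_region(-1.38, -2.33, "left"))
```

The computed values agree with the table up to rounding. For instance, z_critical(0.01, "two") gives about ±2.576, which the table lists as ±2.575, and z_critical(0.10, "right") gives about 1.282, which the table lists as 1.28.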