STAT 210 Probability and Statistics
Unit 7a: Tests of hypotheses for a single sample

Outline
◼ Introduction
◼ Testing of one mean
◼ Testing and Errors
◼ Testing of one proportion

An Example: Example 5.2 from the text
Microdrills example in Chapter 5: We studied the average lifetimes.
◼ A sample of 50 microdrills had a mean of 12.68 and a standard deviation of 6.83.
◼ The population mean lifetime (μ) of all microdrills is unknown.
◼ Let us assume that the main question is whether or not the population mean lifetime is greater than 11.
◼ We can address this by conducting a hypothesis test.

Example
◼ We see that our sample mean is larger than 11, but because of the uncertainty in the sample mean, this does not guarantee that μ > 11.
◼ We would like to know just how certain we can be that μ > 11.
◼ The statement "μ > 11" is a hypothesis about the population mean μ.
◼ To determine just how certain we can be that a hypothesis is true, we must perform a hypothesis test.

Introduction
There are two common types of formal statistical inference:
1) Confidence intervals (Interval Estimation): appropriate when our goal is to estimate a population parameter.
2) Hypothesis Testing: used to assess the evidence (support) provided by the sample data in favor of some claim about the population.
We perform a test of hypothesis only when we are making a decision about a population parameter based on the value of the sample statistic.

More Examples
◼ A computer system currently has 10 terminals and uses a single printer. The average turnaround time for the system is 15 minutes. Ten new terminals and a second printer are added to the system. Has the mean turnaround time been improved?
◼ Suppose a manufacturer claims their VHS tapes can hold 120 minutes of programming at SP mode. You believe they are shorter.
◼ The manufacturer of the ColorSmart-5000 television set claims that 95% of its sets last at least five years without needing a single repair.

Hypotheses
◼ A statistical hypothesis is a statement about the parameters of one or more populations.
◼ A null hypothesis H0 is a statement about a population parameter that is assumed to be true until it is declared false.
   ◼ The test is designed to assess the strength of the evidence against the null hypothesis.
◼ The alternative hypothesis Ha (or H1) is a claim about a population parameter that will be true if the null hypothesis is false.
◼ In performing a hypothesis test, we essentially put the null hypothesis on trial.

More on Alternative Hypothesis
The alternative hypothesis H1 will contain either a greater than sign (one-tailed test), a less than sign (one-tailed test), or a not equal to sign (two-tailed test).
◼ Greater than (>): results if the problem says increases, improves, better, result is higher, etc.
◼ Less than (<): results if the problem says decreases, reduces, worse than, result is lower, etc.
◼ Not equal to (≠): results if the problem says different from, no longer the same, changes, etc.

Procedure
◼ We begin by assuming that H0 is true.
◼ The random sample provides the evidence.
◼ The hypothesis test involves measuring the strength of the disagreement between the sample and H0 to produce a number between 0 and 1, called a P-value.

Test Statistic and P-value
◼ The test statistic is a quantity calculated from the sample data that we have collected. It is used to determine the strength of the evidence against H0.
◼ Based on the distribution of the test statistic, we compute the P-value.
◼ The smaller the P-value, the stronger the evidence is against H0.
◼ If the P-value is sufficiently small, we reject the assumption that H0 is true and believe H1 instead.
This is referred to as rejecting the null hypothesis.

P-value
We take one final step to assess the evidence against H0. We compare the P-value with a fixed value, called the significance level (α). Typical values of α are 0.05 and 0.01.
◼ If, for a test, the P-value is less than α, we reject the null hypothesis H0 and conclude that there is enough evidence to believe in H1.

Decision Summary
When we carry out the test we assume H0 is true. Hence the test will result in one of two decisions.
◼ Reject H0: we have sufficient evidence to conclude that the alternative hypothesis is true. Such a test is said to be significant.
◼ Fail to reject H0: we do not have sufficient evidence to conclude that the alternative hypothesis is true. Such a test is said to be not significant.
◼ We reject H0 if P-value < α.

General Testing Procedure
1) State the null and alternative hypotheses.
2) Carry out the experiment, collect the data, verify the assumptions, and compute the value of the test statistic.
3) Calculate the P-value or the rejection region.
4) Make a decision on the significance of the test (reject or fail to reject H0). Make a conclusion statement in the words of the original problem.

Tests about Population Mean
Goal: We hypothesize that the population mean (μ) equals some value μ0, and state the alternative hypothesis that we wish to prove is true.
Case I: Normal population with known σ
◼ We use the one-sample z-test.
◼ Test statistic:
$$ z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} $$
◼ Not a realistic case.

Tests about Population Mean
Case II: Used when σ is unknown
◼ For large sample sizes (n ≥ 30),
◼ or for small sample sizes (n < 30) from a Normal population.
◼ If the value of the standard deviation is unknown, then instead of the z test statistic we should use the t test statistic.
◼ For a small sample, we must verify normality before applying this procedure. We use the normal probability plot to check for normality.
◼ The test statistic follows a t distribution with df = n - 1 degrees of freedom.

What do we do in practice?
In real applications σ is not known, and we use the one-sample t-test.
◼ If the sample size is small (n < 30), we need to check for normality.
◼ If the sample size is large (n ≥ 30), there is no need for a normality check.

One-sample t-test
1. Hypotheses
   ◼ H1: μ > μ0, H1: μ < μ0, or H1: μ ≠ μ0
2. Test Statistic
$$ t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} $$
3. P-value / Critical region

   H1         P-value                 Critical Region
   μ > μ0     P(T_{n-1} > t0)         t0 > t_{α, n-1}
   μ < μ0     P(T_{n-1} < t0)         t0 < -t_{α, n-1}
   μ ≠ μ0     2P(T_{n-1} > |t0|)      |t0| > t_{α/2, n-1}

In Minitab
[figure]

Example
The life in hours of a battery is known to be approximately normally distributed. A random sample of 10 batteries has a mean life of 40.5 hours and a standard deviation of 1.25 hours. Is there evidence to support the claim that battery life exceeds 40 hours? Use α = 0.05.
1. Hypotheses: H0: μ ≤ 40 vs. H1: μ > 40
2. Test statistic (df = 10 - 1 = 9):
$$ t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{40.5 - 40}{1.25/\sqrt{10}} = 1.26 $$

Example
3. P-value = P(T9 > 1.26) = 1 - P(T9 < 1.26) = 0.1197.
4. Since P-value = 0.1197 > 0.05, we do not reject H0. There is not sufficient evidence to support the claim that battery life exceeds 40 hours.

Minitab output using one-sample t-test:

One-Sample T
Test of mu = 40 vs > 40

 N     Mean   StDev  SE Mean  95% Lower Bound     T      P
10  40.5000  1.2500   0.3953          39.7754  1.26  0.119

Exercises
1) Before a substance can be deemed safe for landfilling, its chemical properties must be characterized. An article reports that in a sample of six sludge specimens from a New Hampshire wastewater treatment plant, the mean pH was 6.68 with a standard deviation of 0.20. Can we conclude that the mean pH is less than 7.0?
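Problems like these can be worked from the summary statistics alone (sample mean, sample standard deviation, sample size). The following is a minimal Python sketch, not the course's Minitab procedure, that reruns the battery example. The function names `t_pdf`, `t_upper_tail`, and `one_sample_t` are illustrative assumptions, and the tail probability is obtained by numerically integrating the t density with Simpson's rule rather than by a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T_df > t) via Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # P(0 < T < |t|)
    upper = 0.5 - area              # P(T > |t|)
    return upper if t >= 0 else 1.0 - upper


def one_sample_t(xbar, s, n, mu0, alternative="greater"):
    """Return (t0, df, P-value) for a one-sample t-test from summary statistics."""
    t0 = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if alternative == "greater":        # H1: mu > mu0
        p = t_upper_tail(t0, df)
    elif alternative == "less":         # H1: mu < mu0
        p = 1.0 - t_upper_tail(t0, df)
    elif alternative == "two-sided":    # H1: mu != mu0
        p = 2.0 * t_upper_tail(abs(t0), df)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t0, df, p


# Battery example: n = 10, mean 40.5, s = 1.25, H1: mu > 40, alpha = 0.05
t0, df, p = one_sample_t(40.5, 1.25, 10, 40, "greater")
print(f"t0 = {t0:.2f}, df = {df}, P-value = {p:.3f}")
# prints: t0 = 1.26, df = 9, P-value = 0.119
print("reject H0" if p < 0.05 else "fail to reject H0")
```

The P-value here is computed from the unrounded t0, which is why it matches the Minitab value 0.119 rather than the hand calculation 0.1197 based on t0 rounded to 1.26.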
Exercises
2) Ford Motor Company wants to test a new type of engine to determine whether it meets new air-pollution standards. The mean emission of all engines of this type must be less than 20 parts per million (ppm) of carbon. Ten engines are manufactured for testing purposes and the emission level of each is determined. The data in ppm are listed below.

15.6 16.2 22.5 20.5 16.4 19.4 16.6 17.9 12.7 13.9

At α = 0.01, do the data supply sufficient evidence to allow Ford to conclude that this type of engine meets the pollution standard?

Confidence Intervals and Hypothesis Testing
In hypothesis testing, another concept of interest is the relationship between hypothesis testing and confidence intervals. Assume that the same significance level is used in a two-sided hypothesis test and in finding the confidence interval:
◼ When the confidence interval contains the hypothesized mean, do not reject the null hypothesis.
◼ When the confidence interval does not contain the hypothesized mean, reject the null hypothesis.

Example
Sugar is packed in 5-pound bags. An inspector suspects the bags may not contain 5 pounds. A sample of 50 bags produces a mean of 4.6 pounds and a standard deviation of 0.7 pound. Is there enough evidence to conclude that the bags do not contain 5 pounds as stated, at α = 0.05? Also, find the 95% confidence interval of the true mean.

Example
1. H0: μ = 5 vs. H1: μ ≠ 5
2. The test value is
$$ t_0 = \frac{4.6 - 5}{0.7/\sqrt{50}} = -4.04 $$
3. P-value ≈ 0
4. The null hypothesis is rejected. There is enough evidence to support the claim that the bags do not weigh 5 pounds.
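The agreement rule between a two-sided test and a confidence interval can be checked numerically for the sugar example. The sketch below is a small illustration under stated assumptions: the critical value 2.01 for t with 49 degrees of freedom at α/2 = 0.025 is taken from a t table rather than computed, and the function names `t_statistic` and `confidence_interval` are hypothetical.

```python
import math


def t_statistic(xbar, s, n, mu0):
    """One-sample t statistic t0 = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


def confidence_interval(xbar, s, n, t_crit):
    """Two-sided interval xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Sugar example: n = 50, mean 4.6, s = 0.7, H0: mu = 5, alpha = 0.05
xbar, s, n, mu0 = 4.6, 0.7, 50, 5.0
t_crit = 2.01  # assumed tabulated value of t_{0.025, 49}

t0 = t_statistic(xbar, s, n, mu0)
low, high = confidence_interval(xbar, s, n, t_crit)

reject_by_test = abs(t0) > t_crit          # critical-region decision
reject_by_ci = not (low <= mu0 <= high)    # interval excludes mu0?

print(f"t0 = {t0:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
# prints: t0 = -4.04, 95% CI = (4.401, 4.799)
print("decisions agree:", reject_by_test == reject_by_ci)
```

Both decisions reject H0 here; the two conditions are algebraically the same statement, since |t0| > t_crit exactly when μ0 lies more than t_crit standard errors from the sample mean.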
One-Sample T
Test of mu = 5 vs not = 5

 N    Mean   StDev  SE Mean            95% CI      T      P
50  4.6000  0.7000   0.0990  (4.4011, 4.7989)  -4.04  0.000

Example
The 95% confidence interval for the mean is given by
$$ \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = 4.6 \pm (2.01)\,\frac{0.7}{\sqrt{50}} = (4.401,\ 4.799) $$

One-Sample T

 N    Mean   StDev  SE Mean            95% CI
50  4.6000  0.7000   0.0990  (4.4011, 4.7989)

Notice that the 95% confidence interval of μ does not contain the hypothesized value μ = 5. Hence, there is agreement between the hypothesis test and the confidence interval.

Errors
Four scenarios when making a decision based on a sample:

                     H0 true          H0 false
Do not reject H0     Great!           Type II Error
Reject H0            Type I Error     Great!

◼ Type I Error: false rejection of H0. P(Type I Error) = α (significance level)
◼ Type II Error: H0 not rejected given H0 false. P(Type II Error) = β

Error Types
◼ A type I error occurs when the null hypothesis (H0) is rejected when in fact H0 is true.
   P(Type I error) = P(Reject H0 | H0 is true) = α
◼ A type II error occurs when we fail to reject the null hypothesis (H0) when in fact H0 is false.
   P(Type II error) = P(Fail to reject H0 | H0 is false) = β
◼ Note that when α decreases, β increases. If we want to control β for a fixed α, we need to increase the sample size.

Tests for a Population Proportion
◼ Our hypothesis test is similar to the one we saw before. But now we have a sample that consists of successes and failures, e.g. a "success" may be a defective wafer.
◼ The population proportion of defective wafers is p.
◼ A supplier claims that the proportion of defective wafers in his supply is less than 0.1 (or 10%), i.e., p < 0.1.
◼ Since our hypothesis concerns a population proportion, it is natural to base the test on the sample proportion.

Tests on a population proportion (p)
Now, we consider testing hypotheses about a population proportion p when the sample size n is large.
We hypothesize that the population proportion p equals some specified value p0, and we want to use the data in a sample to test whether this null hypothesis is appropriate or whether we should reject it in favor of some alternative hypothesis.
To use the one-proportion z-test, we must have both np0 ≥ 5 and n(1 - p0) ≥ 5.

One-proportion z-test
1. Hypotheses
   ◼ H1: p > p0, H1: p < p0, or H1: p ≠ p0
2. Test Statistic
$$ z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}, \qquad \text{where } \hat{p} = \frac{x}{n} $$
3. P-value and critical value:

   H1         P-value              Critical Region
   p > p0     P(Z > z0)            z0 > z_{α}
   p < p0     P(Z < z0)            z0 < -z_{α}
   p ≠ p0     2[P(Z > |z0|)]       |z0| > z_{α/2}

Example
Scientists think that robots will play an increasingly crucial role in factories over the next 20 years. Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible, a robot was used to assemble 500 cables. The cables were examined and 14 defectives were found. If human assemblers have a defect rate of 3%, do these data support the hypothesis that the rate of defectives is lower for robots than for humans? Use α = 0.01.

Example
1. Hypotheses: H0: p = 0.03 vs. H1: p < 0.03
   ◼ np0 = (500)(0.03) = 15 > 5 and n(1 - p0) = 485 > 5.
2. Test Statistic
$$ \hat{p} = \frac{14}{500} = 0.028 \quad \text{and} \quad z_0 = \frac{0.028 - 0.03}{\sqrt{(0.03)(0.97)/500}} = -0.26 $$
3. P-value = P(Z < -0.26) = 0.3974
4. Since P-value = 0.3974 > 0.01, we do not reject H0. There is not sufficient evidence to conclude that the rate of defectives is lower for robots than for humans.

In Minitab
[figure]

Example
Minitab output using one-proportion z-test:

Test and CI for One Proportion
Test of p = 0.03 vs p < 0.03

Sample   X    N  Sample p  95% Upper Bound  Z-Value  P-Value
1       14  500  0.028000         0.040135    -0.26    0.397

Using the normal approximation.
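The robot example can also be reproduced outside Minitab. The following is a minimal Python sketch, assuming only the standard library (`statistics.NormalDist` supplies the standard normal CDF); the function name `one_proportion_z` and its `alternative` argument are illustrative choices, not part of the course material.

```python
import math
from statistics import NormalDist


def one_proportion_z(x, n, p0, alternative="less"):
    """Return (p_hat, z0, P-value) for the large-sample one-proportion z-test."""
    # Conditions for the normal approximation, as stated in the slides
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")
    p_hat = x / n
    z0 = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf              # standard normal CDF
    if alternative == "less":           # H1: p < p0
        p_value = cdf(z0)
    elif alternative == "greater":      # H1: p > p0
        p_value = 1.0 - cdf(z0)
    elif alternative == "two-sided":    # H1: p != p0
        p_value = 2.0 * (1.0 - cdf(abs(z0)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return p_hat, z0, p_value


# Robot example: 14 defectives in 500 cables, H0: p = 0.03 vs H1: p < 0.03
p_hat, z0, p_value = one_proportion_z(14, 500, 0.03, "less")
print(f"p_hat = {p_hat:.3f}, z0 = {z0:.2f}, P-value = {p_value:.3f}")
# prints: p_hat = 0.028, z0 = -0.26, P-value = 0.397
print("reject H0" if p_value < 0.01 else "fail to reject H0")
```

As with Minitab, the P-value is computed from the unrounded z0, so it reads 0.397 rather than the table-based 0.3974 obtained from z0 rounded to -0.26.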
Exercise 1
A telephone company is trying to decide whether some new lines in a large community should be installed underground. Because a small surcharge will be added to telephone bills to pay for the extra installation costs, the company has decided to survey customers and proceed only if the survey strongly indicates that more than 60% of all customers favor underground installation. If 118 of 160 customers surveyed favor underground installation despite the surcharge, what should the company do? Test the relevant hypothesis using α = 0.05.

Exercise 2
An article presents a method for measuring orthometric heights above sea level. For a sample of 1225 baselines, 926 gave results that were within the class C spirit leveling tolerance limits. Can we conclude that this method produces results within the tolerance limits more than 75% of the time?