Tests of Hypotheses – One Sample Case

General Objectives: In this chapter, the concept of a statistical test of a hypothesis is formally introduced. The sampling distributions of statistics presented in earlier chapters are used to construct large-sample tests and small-sample tests concerning the values of population parameters of interest to the experimenter.

Topics Include
1. The concept of a statistical test of hypotheses
2. Large-sample Z-test about a population mean \(\mu\)
3. The use of the p-value for testing a hypothesis
4. Small-sample t-test about a population mean
5. Testing a hypothesis about a population proportion p

An Analogy for Hypothesis Testing

Hypothesis testing is similar to a court case. It is a process of decision-making based on the information in the data. The rule used to make the decision rests on a simple idea: in a decision-making situation, we often have two choices. We collect evidence to help us decide. If the observed information favors choice A, we take decision A; otherwise, we take decision B.

Before we look at how to conduct a hypothesis test, let us go to court to observe how a judge decides whether someone is innocent or guilty.

When a criminal case goes to court, the person is first assumed INNOCENT and is eventually determined to be either innocent or guilty based on the INFORMATION (or EVIDENCE) presented by the prosecutor and the defendant. The rules used by the judge are the US LAW. The two choices for the judge are (1) this person is INNOCENT or (2) this person is GUILTY.

Because the person is ASSUMED INNOCENT at the outset, we use \(H_0\) for this assumed situation (that is, the person is assumed INNOCENT) and call it the NULL HYPOTHESIS. The alternative that the prosecutor is trying to prove (the person is GUILTY) is the ALTERNATIVE HYPOTHESIS, denoted \(H_a\).
Based on the above discussion, can you fill in the following blanks for the court case?

\(H_0\): _______________________
\(H_a\): __________________________

What is the Decision Rule (the rule that the judge uses to make the decision)?
__________________________________________________________

What is the Sample Information (the information presented by the prosecutor and the lawyer)?
________________________________________________________

Final Decision: Made by the judge, who applies the Decision Rule to the Sample Information and decides whether the person is innocent (take \(H_0\)) or guilty (take \(H_a\)).

NOTE: It is important to understand that no matter what final decision the judge makes, there is ALWAYS some chance of making an error.

Q: There are two possible errors in this decision-making. What are they? (Hint: One type of error is: the judge decides the person is guilty, but s/he is not guilty.)

Q: Of the two types of errors you described above, which is considered more critical? That is, which error would have more serious consequences if it were made?

Q: The judge would like to keep the probability of making the more critical error from being too high. What suggestion(s) do you have for reducing this type of error?

A Statistical Test of Hypothesis

A statistical test of hypothesis involves five steps:
1. Set up the alternative hypothesis, denoted by \(H_a\), and the null hypothesis, denoted by \(H_0\).
2. Determine the Decision Rule and the test statistic.
3. Apply the information from the data to compute the observed test statistic.
4. Compare the observed test statistic with the critical value set in the Decision Rule. If the observed test statistic falls in the "Reject \(H_0\)" region, we reject \(H_0\). Otherwise, we do not reject \(H_0\).
5. State the conclusion in the context of the problem.
Definition: The two competing hypotheses are the alternative hypothesis \(H_a\), generally the hypothesis that the researcher wishes to support, and the null hypothesis \(H_0\), a contradiction of the alternative hypothesis.

The researcher then uses the sample data to decide whether the evidence favors \(H_a\) rather than \(H_0\) and draws one of these two conclusions:
- Reject \(H_0\) and conclude that \(H_a\) is true.
- Accept (do not reject) \(H_0\) as true.

Examples on p. 300, p. 307, and example 6.1 on p.307 show null and alternative hypotheses and the procedure of performing a test.

A test of a hypothesis can be two-tailed or one-tailed; a one-tailed test is either a left-tailed test or a right-tailed test.

The test statistic is a single number calculated from the sample data. The p-value is a probability calculated using the test statistic (see Figures 6.7 and 6.8 for more examples). Either or both of these measures act as a decision maker for the researcher in deciding whether to reject or accept \(H_0\). Examples 6.1 and 6.2 and Figures 6.1, 6.2, and 6.5 show acceptance and rejection regions for different types of tests.

A Large-Sample Test About a Population Mean

For a Right-Side Test:
\(H_0: \mu = \mu_0\)
\(H_a: \mu > \mu_0\)

The standard error of \(\bar{x}\) is calculated as
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

The standardized test statistic:
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]

Important points to remember:
(a) For setting the hypotheses: the researcher's interest, or the question asked, is used to determine \(H_a\). This is the one to determine first.
(b) Always set \(H_0: \mu = \mu_0\).
(c) There are three types of tests:
    Right-side test: \(H_a: \mu > \mu_0\)
    Two-side test: \(H_a: \mu \ne \mu_0\)
    Left-side test: \(H_a: \mu < \mu_0\)

Example: The average weekly earnings for women in managerial and professional positions is $670. Do men in the same positions have average weekly earnings that are higher than those for women? A random sample of n = 40 men in managerial and professional positions showed \(\bar{x}\) = $725 and s = $102. Test the appropriate hypothesis using \(\alpha = .01\).
Solution
You would like to show that the average weekly earnings for men are higher than $670, the women's average. Hence, if \(\mu\) is the average weekly earnings in managerial and professional positions for men, the hypotheses to be tested are
\(H_0: \mu = 670\) versus \(H_a: \mu > 670\) (NOTE: This is a right-side test)

The rejection region for this right-side one-tailed test consists of large values of \(\bar{x}\) or, equivalently, values of the standardized test statistic z in the right tail of the standard normal distribution. With \(\alpha = .01\), the critical value is \(z_{.01} = 2.33\). That is, the interval z > 2.33 is the REJECTION REGION for the right-side test when \(\alpha = .01\).

The observed value of the test statistic, using s as an estimate of the population standard deviation, is
\[ z = \frac{\bar{x} - 670}{s/\sqrt{n}} = \frac{725 - 670}{102/\sqrt{40}} = 3.41 \]

From the data, we observe the sample average $725. The corresponding observed z-value is 3.41, which is larger than the critical value 2.33. Since the observed value of the test statistic falls in the rejection region, you can reject \(H_0\) and conclude that the average weekly earnings for men in managerial and professional positions are significantly higher than those for women. The probability of rejecting \(H_0\) when it is actually true is \(\alpha = .01\).

Figure: The rejection region of a right-tailed test with \(\alpha = .01\)

Decision Rule:
If the observed test statistic \(z_{obs}\) > 2.33 (the critical z-value, \(z_{.01}\)), then REJECT \(H_0\) in favor of \(H_a\).
If the observed test statistic \(z_{obs} \le\) 2.33, then ACCEPT (do not reject) \(H_0\).

For this case, \(z_{obs}\) = 3.41 > 2.33. Therefore, based on the decision rule, we reject \(H_0\) and take \(H_a\), which means: men's average weekly earnings are significantly higher than those of their female counterparts.

Another type of hypothesis test is the two-sided test. The two-sided hypothesis is written as
\(H_0: \mu = \mu_0\) versus \(H_a: \mu \ne \mu_0\),
which implies either \(\mu > \mu_0\) or \(\mu < \mu_0\).

Figure: The rejection region for a two-tailed test with \(\alpha = .01\)

NOTE: There are two critical values: \(-z_{\alpha/2}\) and \(z_{\alpha/2}\).
This is because we do not know whether \(\mu > \mu_0\) or \(\mu < \mu_0\), so we reject \(H_0\) whenever the observed average is too LARGE or too SMALL. The rejection probabilities in the two tails total \(\alpha\) (that is, \(\alpha/2\) in each tail).

Summary of Large-Sample Statistical Test for \(\mu\):
1. Null hypothesis: \(H_0: \mu = \mu_0\)
2. Alternative hypothesis:
   One-Tailed Test: \(H_a: \mu > \mu_0\) (Right-side Test) (or \(H_a: \mu < \mu_0\), Left-side Test)
   Two-Tailed Test: \(H_a: \mu \ne \mu_0\)
3. Test statistic:
\[ z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]
   If \(\sigma\) is unknown (which is usually the case), substitute the sample standard deviation s for \(\sigma\).
4. Rejection region: Reject \(H_0\) when
   One-Tailed Test: \(z > z_{\alpha}\) (or \(z < -z_{\alpha}\) when the alternative hypothesis is \(H_a: \mu < \mu_0\))
   Two-Tailed Test: \(z > z_{\alpha/2}\) or \(z < -z_{\alpha/2}\)

Assumptions: The n observations in the sample are randomly selected from the population and n is large, say \(n \ge 30\).

Figure: Right-side and two-side rejection regions

Calculating the p-Value

To avoid any ambiguity in their conclusions, some experimenters prefer to use the observed level of significance, called the p-value, for the test.

Definition: The p-value or observed significance level of a statistical test is the tail probability beyond the observed value of the test statistic, taken in the direction of the rejection region. The p-value measures the strength of the evidence against \(H_0\).

For a right-side test, the p-value is the area to the right of the calculated value of the test statistic:
p-value = \(P(Z > z_{obs})\) for a large-sample right-side test.

For a left-side test, the p-value is the area to the left of the calculated value of the test statistic:
p-value = \(P(Z < z_{obs})\) for a large-sample left-side test.

For a two-side test, the p-value is twice the area to the right of the absolute value of the calculated value of the test statistic:
p-value = \(2P(Z > |z_{obs}|)\) for a large-sample two-side test.

Figure: P-value for a right-side test (showing \(\alpha\), the p-value, \(z_{\alpha}\), and \(z_{obs}\))

Drawing a conclusion based on the p-value:
If p-value < \(\alpha\), then we reject \(H_0\) and take \(H_a\).
If p-value \(\ge \alpha\), then we do not reject \(H_0\).

Example
Calculate the p-value and draw your conclusion based on the p-value for the test of hypothesis in the earlier example, which tested whether men's average weekly earnings are significantly higher than those of their female counterparts.

Solution
Since the observed value of the test statistic is z = 3.41 and this is a right-side test, the p-value is given by
p-value = \(P(z > 3.41) = .5 - .4997 = .0003\)

Based on the decision rule using the p-value, we see p-value = .0003 < \(\alpha\) = .01. Therefore, we reject \(H_0\) and take \(H_a\). We conclude that men's average weekly earnings are significantly higher than those of their female counterparts at \(\alpha\) = 1%.

NOTE: This conclusion is the same as the conclusion using the z-value. Computer software usually gives us the p-value. We use the z-value to draw a conclusion when no computer is available but a z-table or t-table is.

Many researchers use a "sliding scale" to classify their results:
- If the p-value is less than .01, \(H_0\) is rejected. The results are highly significant.
- If the p-value is between .01 and .05, \(H_0\) is rejected. The results are statistically significant.
- If the p-value is between .05 and .10, \(H_0\) is usually not rejected. The results are only tending toward statistical significance.
- If the p-value is greater than .10, \(H_0\) is not rejected. The results are not statistically significant.

In this class, the \(\alpha\)-value will be given. If it is not given, use \(\alpha\) = 5%.

Using the p-value to make the decision has two advantages:
- Statistical output from packages such as Minitab usually reports the p-value of the test.
- Based on the p-value, your test results can be evaluated using any significance level you wish to use.

The smaller the p-value, the stronger the evidence against \(H_0\).

Whenever we make a decision in a hypothesis test, we risk making one of two types of mistakes:

Definition: A Type I error for a statistical test is the error of rejecting the null hypothesis when it is true.
The probability of making a Type I error is denoted by the symbol \(\alpha\). A Type II error for a statistical test is the error of accepting (not rejecting) the null hypothesis when it is false and some alternative hypothesis is true. The probability of making a Type II error is denoted by the symbol \(\beta\).

Table: Illustration of the two types of errors

Decision Based on Sample      TRUTH: H0 True       TRUTH: Ha True
Accept H0                     Correct decision     Type II error
Reject H0 (Take Ha)           Type I error         Correct decision

Notice that the probability of a Type I error is exactly the same as the level of significance \(\alpha\) and is therefore controlled by the researcher.

Keep in mind that "accepting" a particular hypothesis means deciding in its favor. There is always a risk of being wrong, measured by \(\alpha\) and \(\beta\).

Work on some hands-on activities for identifying Type I and Type II errors.

Hands-on Activities
We will do some of the Extra Exercise Problems.

Small-Sample Inferences Concerning a Population Mean

Small-sample inference can involve either estimation or hypothesis testing.

Small-Sample Hypothesis Test for \(\mu\):
1. Null hypothesis: \(H_0: \mu = \mu_0\)
2. Alternative hypothesis:
   One-Tailed Test: \(H_a: \mu > \mu_0\) (or \(H_a: \mu < \mu_0\))
   Two-Tailed Test: \(H_a: \mu \ne \mu_0\)
3. Test statistic:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
4. Rejection region: Reject \(H_0\) when
   One-Tailed Test: \(t > t_{\alpha}\) (or \(t < -t_{\alpha}\) when the alternative hypothesis is \(H_a: \mu < \mu_0\))
   Two-Tailed Test: \(t > t_{\alpha/2}\) or \(t < -t_{\alpha/2}\)
   or when the p-value < \(\alpha\).
   The critical values of t are based on n − 1 degrees of freedom.

Assumption: The sample is randomly selected from a normally distributed population.

Example
A new process for producing synthetic diamonds can be operated at a profitable level only if the average weight of the diamonds is greater than .5 karat. To evaluate the profitability of the process, six diamonds are generated, with recorded weights: .46, .61, .52, .48, .57, and .54 karat. Do the six measurements present sufficient evidence to indicate that the average weight of the diamonds produced by the process is in excess of .5 karat?
Solution
The population of diamond weights produced by this new process has mean \(\mu\), the value in question. The hypotheses to be tested are
\(H_0: \mu = .5\) versus \(H_a: \mu > .5\)
and the test statistic is a t-statistic with (n − 1) = (6 − 1) = 5 degrees of freedom. You can use your calculator to verify that the mean and standard deviation for the six diamond weights are .53 and .0559, respectively. The calculated value of the test statistic is then
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{.53 - .5}{.0559/\sqrt{6}} = 1.32 \]

As with the large-sample tests, the test statistic provides evidence for either rejecting or accepting \(H_0\), depending on how far from the center of the t distribution it lies. If you choose a 5% level of significance (\(\alpha = .05\)), the right-tailed rejection region is found using the critical values of t from Table 4 in Appendix I. With df = n − 1 = 5, you can reject \(H_0\) if \(t > t_{.05} = 2.015\). Since the calculated value of the test statistic, 1.32, does not fall into the rejection region, you cannot reject \(H_0\). The data do not present sufficient evidence to indicate that the mean diamond weight exceeds .5 karat.

There are two ways to conduct a test of a hypothesis:
- The critical value approach, as described in the above example.
- The p-value approach. This example is a right-side test, so p-value = \(P(t > t_{obs}) = P(t > 1.32)\). Because 1.32 is smaller than \(t_{.05} = 2.015\), this p-value is larger than .05, as the graph shows.

Figure: Rejection region for the above Example, with \(t_{obs}\) marked; p-value = \(P(t > t_{obs})\) for the right-side test

Most statistical computing packages contain programs that will implement the Student's t test or construct a confidence interval for \(\mu\) when the data are properly entered. The following example illustrates how a computer can be useful for computing confidence intervals and conducting hypothesis tests.
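Before moving on, the diamond computation above can be checked with a few lines of code. The following is only a minimal sketch in Python using the standard library; the variable names are illustrative, and the critical value 2.015 is copied from the t-table quoted in the text rather than computed, because Python's standard library does not provide the t distribution.

```python
import math
import statistics

# Diamond weights (karats) from the example above.
weights = [0.46, 0.61, 0.52, 0.48, 0.57, 0.54]

mu0 = 0.5        # hypothesized mean under H0
t_crit = 2.015   # t_.05 with 5 df, taken from the t-table value quoted in the text

n = len(weights)
xbar = statistics.mean(weights)
s = statistics.stdev(weights)              # sample standard deviation (divisor n - 1)
t_obs = (xbar - mu0) / (s / math.sqrt(n))  # observed t statistic

# Right-side test: reject H0 only if t_obs exceeds the critical value.
reject_h0 = t_obs > t_crit

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.4f}")
print(f"t_obs = {t_obs:.2f}, critical value = {t_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The sketch follows the critical value approach only. A p-value for a t-test would require a t-distribution routine from a statistics package.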
Example: For most brands of paint, a gallon will cover between 250 and 500 square feet, depending on the texture of the surface to be painted. One manufacturer claims that a gallon of its paint can cover 400 square feet of surface area. To test this claim, a random sample of ten 1-gallon cans of white paint was used to paint ten identical areas using the same kind of paint brush. The actual areas covered by these ten 1-gallon cans of paint are given here:

310, 311, 412, 368, 447, 376, 303, 410, 365, 350

Do the data present sufficient evidence to indicate that the average coverage of this brand differs from 400 (square feet) at \(\alpha\) = 5%?

Complete the following steps for this TWO-SIDE test:
• Hypothesis:
• Test-statistic:
• Decision Rule:
• P-value:
• Conclusion:

Minitab output for the Paint Example

Calculating the p-value for the Paint Example
For a two-side t-test, p-value = \(2P(t > |t_{obs}|)\). For this example, p-value = \(2P(t > 2.27)\) = .049, as given in the computer output.

NOTE: Typically, we cannot compute the p-value by hand when a t-test is performed. A computer comes in handy in these situations.

Hands-on Activities
Work on some of the extra exercise problems.

A Large-Sample Test of a Hypothesis for a Binomial Proportion

Large-Sample Statistical Test for p
1. Null hypothesis: \(H_0: p = p_0\)
2. Alternative hypothesis:
   One-Tailed Test: \(H_a: p > p_0\) (or \(H_a: p < p_0\))
   Two-Tailed Test: \(H_a: p \ne p_0\)
3. Test statistic:
\[ z = \frac{\hat{p} - p_0}{SE} = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \quad \text{with} \quad \hat{p} = \frac{x}{n} \]
   where x is the number of successes in n binomial trials and \(q_0 = 1 - p_0\).
4. Rejection region: Reject \(H_0\) when
   One-Tailed Test: \(z > z_{\alpha}\) (or \(z < -z_{\alpha}\) when the alternative hypothesis is \(H_a: p < p_0\))
   Two-Tailed Test: \(z > z_{\alpha/2}\) or \(z < -z_{\alpha/2}\)
   or when the p-value < \(\alpha\).

Assumption: The sampling satisfies the assumptions of a binomial experiment, and n is large enough that the sampling distribution of \(\hat{p}\) can be approximated by a normal distribution (\(np_0 > 5\) and \(nq_0 > 5\)).
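The summary of the large-sample test for p can be sketched in code. The following Python function is a hypothetical helper written for illustration, not part of any statistics package; the name `proportion_z_test`, its arguments, and the sample numbers (60 successes in 100 trials, tested against p0 = .5) are assumptions and do not come from the text. It uses the standard library's `NormalDist` for standard normal areas.

```python
import math
from statistics import NormalDist


def proportion_z_test(x, n, p0, alternative="two-sided"):
    """Large-sample z-test for a binomial proportion (illustrative sketch).

    alternative: "greater" (Ha: p > p0), "less" (Ha: p < p0),
    or "two-sided" (Ha: p != p0). Returns (z, p_value).
    """
    q0 = 1 - p0
    # Assumption from the summary above: n*p0 > 5 and n*q0 > 5.
    if n * p0 <= 5 or n * q0 <= 5:
        raise ValueError("need n*p0 > 5 and n*q0 > 5 for the normal approximation")

    p_hat = x / n
    se = math.sqrt(p0 * q0 / n)   # standard error computed under H0
    z = (p_hat - p0) / se

    std_normal = NormalDist()     # standard normal distribution
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)              # area to the right of z
    elif alternative == "less":
        p_value = std_normal.cdf(z)                  # area to the left of z
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))   # twice the tail beyond |z|
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data, not from the text: 60 successes in 100 trials, H0: p = .5.
z, p_value = proportion_z_test(60, 100, 0.5, alternative="two-sided")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The standard error is computed from p0 rather than from the sample proportion, matching the test statistic in step 3 of the summary.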
Example
Regardless of age, about 20% of American adults participate in fitness activities at least twice a week. However, these fitness activities change as people get older, and occasional participants become nonparticipants as they age. In a local survey of n = 100 adults over 40 years old, a total of 15 people indicated that they participated in a fitness activity at least twice a week. Do these data indicate that the participation rate for adults over 40 years of age is significantly less than the 20% figure? Calculate the p-value and use it to draw the appropriate conclusions.

Solution
It is assumed that the sampling procedure satisfies the requirements of a binomial experiment. You can answer the question posed by testing the hypothesis
\(H_0: p = .2\) versus \(H_a: p < .2\)
A one-tailed test is used because you wish to detect whether the value of p is less than .2. The point estimator of p is \(\hat{p} = x/n\), and the test statistic is
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \]

When \(H_0\) is true, the value of p is \(p_0 = .2\), and the sampling distribution of \(\hat{p}\) has a mean equal to \(p_0\) and a standard deviation of \(\sqrt{p_0 q_0 / n}\). The value of the test statistic is
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{.15 - .20}{\sqrt{(.20)(.80)/100}} = -1.25 \]

The p-value associated with this test is found as the area under the standard normal curve to the left of z = −1.25, as shown in Figure 9.10. Therefore,
p-value = \(P(z < -1.25) = .5 - .3944 = .1056\)

Because this p-value is greater than .10, \(H_0\) is not rejected: the data do not provide sufficient evidence that the participation rate for adults over 40 is less than 20%.

Figure: p-value for the above Example (NOTE: This is a left-side test)

Hands-on Activities
Work on some of the Extra Exercises problems.

Some Comments on Testing Hypotheses

If the p-value is greater than .05, the results are reported as NS (not significant at the 5% level).
If the p-value lies between .05 and .01, the results are reported as P < .05 (significant at the 5% level).
If the p-value lies between .01 and .001, the results are reported as P < .01 ("highly significant", or significant at the 1% level).
If the p-value is less than .001, the results are reported as P < .001 ("very highly significant", or significant at the .1% level).
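The reporting convention above can be applied to the three examples in this chapter. The following Python sketch is illustrative only: the function name `report_label` is hypothetical, and how a p-value falling exactly on a boundary (.05, .01, or .001) is classified is an assumption, since the text does not say. The p-values for the two z-tests are recomputed with the standard library's `NormalDist`; the paint p-value is simply copied from the Minitab output quoted earlier.

```python
import math
from statistics import NormalDist


def report_label(p_value):
    """Map a p-value to the reporting convention described above.

    Boundary handling (strict '<') is an assumption, not from the text.
    """
    if p_value < 0.001:
        return "P < .001 (very highly significant)"
    if p_value < 0.01:
        return "P < .01 (highly significant)"
    if p_value < 0.05:
        return "P < .05 (significant at the 5% level)"
    return "NS (not significant at the 5% level)"


std_normal = NormalDist()  # standard normal distribution

# Salary example: right-side z-test, H0: mu = 670, n = 40, xbar = 725, s = 102.
z_salary = (725 - 670) / (102 / math.sqrt(40))
p_salary = 1 - std_normal.cdf(z_salary)

# Paint example: two-side t-test; p-value as quoted from the Minitab output.
p_paint = 0.049

# Fitness example: left-side z-test, H0: p = .2, 15 successes in n = 100.
z_fitness = (0.15 - 0.20) / math.sqrt(0.20 * 0.80 / 100)
p_fitness = std_normal.cdf(z_fitness)

for name, p in [("salary", p_salary), ("paint", p_paint), ("fitness", p_fitness)]:
    print(f"{name}: p-value = {p:.4f} -> {report_label(p)}")
```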