252ones 9/20/07 (Open this document in 'Outline' view!)

B. HYPOTHESIS TESTS FOR ONE SAMPLE

1. The Meaning of Hypothesis Testing

A hypothesis is a statement about the characteristics of a population. To be of any use to us it must be quantifiable and testable. The hypothesis to be tested is usually called the Null Hypothesis $H_0$. A rival hypothesis is called the Alternative Hypothesis $H_1$ or $H_A$. It is usually true that the null hypothesis will be a hypothesis of "no difference" and the alternative hypothesis covers all other possibilities.

To start out, ask "What do I want to know about a population or populations?"
   Can I state this in terms of population parameters?
   Can I state this in terms of a testable hypothesis (a null hypothesis)?
   Does my null hypothesis say that these parameters or differences between these parameters are insignificant (i.e. not distinct from zero)?
Next ask "What am I assuming about the population or populations?"
   Are the parameters that I am testing appropriate to the type of population that I am assuming?
What can I use to test my hypothesis?
   Can I find a sample statistic or statistics to do the job?
   How many samples do I need?
   Can I calculate the sample statistic or statistics?
   What distribution does the test statistic have? Is this in accord with the null hypothesis?
   What errors am I likely to make?

Usually there are three approaches to hypothesis testing involving a statement about a parameter of a population:
(i) the test ratio method, in which a ratio involving an estimate of a parameter is tested against a well known distribution like $t$;
(ii) the critical value method, in which values of estimates of a parameter are found which could lead to rejection of $H_0$; and
(iii) the confidence interval method, in which a confidence interval is constructed for the parameter and compared to the null hypothesis.

If I use a test ratio, what is the probability of getting values as extreme or more extreme than I actually got?
(This is the p-value; the lower it is, the less credible the null hypothesis is. If the p-value falls below the significance level, I can say that I reject the null hypothesis.)
If I use a test statistic and my significance level is 5% or 1%, did the value fall among the most likely 95% or 99% of values? Or was it a very unlikely value?
If I use a confidence interval, did the parameter value in my null hypothesis fall in the confidence interval?

Remember:
a. A null hypothesis is usually a statement about a parameter of a population. It is never a statement about a sample statistic. A sample statistic is used to test the hypothesis.
b. A null hypothesis usually contains an equality; an alternative hypothesis does not contain an equality.
c. A null hypothesis often says that a parameter or a difference between parameters is insignificant. If a result is significant we reject the null hypothesis.

2. Steps for Testing a Hypothesis Applied to Testing for a Population Mean

a. Outline
i. State the problem as two hypotheses.
ii. Quantify the hypotheses.
iii. Identify the statistic, ratio or interval to be used.
iv. Determine the sampling distribution of the statistic to be used.
v. Select a level of significance.
vi. Find a value or values of the test ratio or statistic that would lead to rejection of the null hypothesis.
vii. Compute a value of the statistic or ratio from a random sample.
viii. By comparing the results of (vii) with the values found in (vi), accept or reject the null hypothesis.

b. Application to a Population Mean
To test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$. Assume that we have computed $s$ from our sample, and that we do not know $\sigma$.
i. Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$
ii. Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}}$
iii. Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$
Note: If $\sigma$, the population standard deviation, is known, replace $t$ and $s_{\bar{x}}$ with $z$ and $\sigma_{\bar{x}}$.

c. One-sided tests.
To test $H_0: \mu \ge \mu_0$ against $H_1: \mu < \mu_0$ or $H_0: \mu \le \mu_0$ against $H_1: \mu > \mu_0$, if you use a critical value or a confidence interval, you must use a one-sided one. Replace $t_{\alpha/2}$ with $t_{\alpha}$. One-sided tests take more thinking than two-sided tests, and the most common error is in stating the null hypothesis. In a problem statement, the question asked is often the alternative hypothesis, not the null. Always ask yourself if the statement contains a strict inequality. If it does, it cannot be a null hypothesis.
Examples:
Question: Is the mean income less than 20000? $H_0: \mu \ge 20000$
Question: Is the mean income at least 20000? $H_0: \mu \ge 20000$
Question: Is the mean income more than 20000? $H_0: \mu \le 20000$
Question: Is the mean income at most 20000? $H_0: \mu \le 20000$

3. The Use of p-value instead of Significance Levels

A p-value is a measure of the credibility of the null hypothesis and is defined as the probability that a test statistic or ratio as extreme as or more extreme than the observed statistic or ratio could occur, assuming that the null hypothesis is true. For a left-sided test read "as low as or lower than"; for a right-sided test read "as high as or higher than".

Note: If we have a p-value and want to do a conventional hypothesis test, we can reject the null hypothesis if the p-value is below the significance level. The p-value can thus be said to represent the smallest level of significance at which the null hypothesis can be rejected.

Other interpretations are:
a) (i) If p-value $< .01$ we strongly doubt $H_0$, (ii) if $.01 \le$ p-value $< .05$ we somewhat doubt $H_0$, and (iii) if p-value $\ge .05$ we cannot doubt $H_0$; or
b) (i) If p-value $< .01$ results are very significant, (ii) if $.01 \le$ p-value $< .05$ results are significant, (iii) if $.05 \le$ p-value $< .10$ results are marginally significant and (iv) if p-value $\ge .10$ results are not significant.

This means that if we have calculated a t-ratio with a value of $t_{calc}$, and we have a left-sided test, p-value $= P(t \le t_{calc})$. If we have a right-sided test, p-value $= P(t \ge t_{calc})$. If we have a 2-sided test, p-value $= 2P(t \le t_{calc})$ or p-value $= 2P(t \ge t_{calc})$, whichever is smaller.
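The three p-value rules just stated can be sketched in code. This is only an illustration under stated assumptions: the function names `t_pdf`, `t_cdf` and `p_value` are hypothetical, the t probabilities are obtained by numerical integration (Simpson's rule) rather than from a t table, and the inputs $t_{calc} = -0.678$ with 6 degrees of freedom are the values used in the Appendix of this outline.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about zero."""
    a = abs(t)
    h = a / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

def p_value(t_calc, df, side):
    """p-value for a 'left', 'right' or 'two' sided test (labels are assumptions)."""
    lower = t_cdf(t_calc, df)          # P(t <= t_calc)
    upper = 1.0 - lower                # P(t >= t_calc)
    if side == "left":
        return lower
    if side == "right":
        return upper
    return 2 * min(lower, upper)       # two-sided: double the smaller tail

# Appendix values: t_calc = -0.678 with DF = 6
for side in ("left", "right", "two"):
    print(side, round(p_value(-0.678, 6, side), 4))
```

The left-sided result should fall between .25 and .30 and the two-sided result between .50 and .60, which agrees with the brackets read from the t table in the Appendix.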
So, for a one-sided test make a diagram of the t distribution with a mean of zero, find the value of $t_{calc}$ and shade the appropriate side of $t_{calc}$. For a 2-sided test, find both $t_{calc}$ and $-t_{calc}$ and shade the tail above whichever is positive and below whichever is negative. For an example using $t$ see 252onesx0. For an example using $z$ replace $t$ with $z$ in this paragraph and see 252doctor.

4. Type One and Type Two Errors

a. Definitions
A Type one error is rejecting $H_0$ when $H_0$ is true. A Type two error is not rejecting $H_0$ when $H_0$ is false.

b. Probabilities
                       $H_0$ True                        $H_0$ False
Do not reject $H_0$    $1 - \alpha$ (Confidence Level)   $\beta$ (Operating Characteristic)
Reject $H_0$           $\alpha$ (Significance Level)     $1 - \beta$ (Power)

5. Hypotheses about a Proportion

a. Tests: (For an example see 252onesx1) To test $H_0: p = p_0$ against $H_1: p \ne p_0$
i. Test Ratio: $z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$, where $\sigma_{\bar{p}} = \sqrt{\frac{p_0 q_0}{n}}$
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_{\bar{p}}$
iii. Confidence Interval: $p = \bar{p} \pm z_{\alpha/2}\, s_{\bar{p}}$, where $s_{\bar{p}} = \sqrt{\frac{\bar{p}\,\bar{q}}{n}}$

(b. Continuity Correction. The continuity correction acts to expand the 'accept' interval by $\frac{1}{2n}$ in each direction. It should be used if $npq < 9$.
i. Test Ratio: $z = \frac{\bar{p} \pm \frac{1}{2n} - p_0}{\sigma_{\bar{p}}}$. Use $+$ if $\bar{p} < p_0$ and $-$ if $\bar{p} > p_0$. This is the same as testing $z = \frac{x \pm .5 - np_0}{\sqrt{np_0 q_0}}$ against $z_{\alpha/2}$.
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm \left(\frac{1}{2n} + z_{\alpha/2}\, \sigma_{\bar{p}}\right)$
iii. Confidence Interval: $p = \bar{p} \pm \left(\frac{1}{2n} + z_{\alpha/2}\, s_{\bar{p}}\right)$)

6. The Sign Test

a. The Sign Test for a Median. To test $H_0: \eta = \eta_0$ against $H_1: \eta \ne \eta_0$.
In any distribution outside of the normal distribution, it is usually easier to use the p-value approach. For example, let us assume that we are testing the hypotheses $H_0: \eta = 25$ and $H_1: \eta \ne 25$, where $\eta$ is, as before, the median. The most important fact to know about testing for a median is that numbers above and below a median are equally likely to occur in a random sample. A test of the median is a test of proportions.
So let us use $p$ as the proportion of the observations in our population that are above 25. ($p$ could just as easily be the proportion below 25.)
If this is true, and we are working with a continuous distribution, our hypotheses become $H_0: p = .5$ and $H_1: p \ne .5$. Now let us assume that we take a sample of $n = 20$ and that we find that $x$, the number of points above 25, is 5. We expect that half of our points, or 10, will be above 25, so 5 seems low. We thus use a binomial table to find $P(x \le 5)$ for $n = 20$ and $p = .5$. The table tells us that $P(x \le 5) = .0207$. Since this is a two-sided test, we double this probability to .0414 and use this as our p-value. If our confidence level is 95%, our significance level must be 5%, and, since the p-value is below 5%, we reject the null hypothesis.

(But if we are to repeat tests, it may be wise to define acceptance and rejection regions by defining two critical values, and saying that if $x \le CV_L$ or $x \ge CV_U$, we will reject the null hypothesis. Again assume that the significance level is $\alpha = .05$. We can use the p-value approach by saying that, if we would reject the null hypothesis using the p-value approach for some value of $x$, that value is in our rejection region. Starting from the bottom, try $x = 0$. From the table for $n = 20$ and $p = .5$, $P(x \le 0) = .0000$. Since this probability is below $\frac{\alpha}{2}$, we would reject $H_0$ if $x$ were 0. We come to a similar conclusion if $x$ takes values of 1, 2, 3 or 4. If $x = 5$, we have already seen that $P(x \le 5) = .0207$, and that we would still reject the null hypothesis. But if we try $x = 6$, we find $P(x \le 6) = .0577$, which is above $\frac{\alpha}{2} = .025$, so we accept $H_0$ if $x$ is 6 or larger. But, since this is a two-sided test, it is also possible that $x$ is too large. For example, if $x$ is 16, $P(x \ge 16) = 1 - P(x \le 15) = 1 - .9941 = .0059$. Since this is below $\frac{\alpha}{2}$, we would reject the null hypothesis if $x$ were 16 or larger. So try $x = 15$: $P(x \ge 15) = 1 - P(x \le 14) = 1 - .9793 = .0207$. This is still too low for acceptance, so try 14: $P(x \ge 14) = 1 - P(x \le 13) = 1 - .9423 = .0577$. Since this is above $\frac{\alpha}{2}$, we would accept the null hypothesis if $x$ were 14.
We can thus say that we accept the null hypothesis if $x$ is between 6 and 14, or that our critical values for $x$ are 5 and 15. If we now look back at the cumulative binomial table, we see that we rejected the null hypothesis for probabilities below .025 or $\frac{\alpha}{2}$ and above .975 or $1 - \frac{\alpha}{2}$.)

Let's try a one-sided problem. Suppose that our null hypothesis is that median income in a region is at least $20000 and that we take a sample with the results shown below. Let $\alpha = .05$.

Observation No.   Income
 1                10132
 2                11252
 3                13475
 4                14260
 5                16871
 6                19357
 7                19438
 8                23010
 9                30278
10                35932

Our hypotheses are $H_0: \eta \ge 20000$, $H_1: \eta < 20000$. Let $p$ be the proportion of numbers in the population below 20000. If the median is exactly 20000, $p$ will be exactly .5. But if the median is above 20000, $p$ will be below .5. We can replace our original hypotheses with $H_0: p \le .5$ and $H_1: p > .5$. We see that $x$, the quantity of numbers in the sample below 20000, is 7. Our expected number of items below 20000 is $.5n = .5(10) = 5$, so 7 is high and our p-value will be $P(x \ge 7) = 1 - P(x \le 6) = 1 - .8281 = .1719$. Since the p-value is above the significance level, we accept the null hypothesis.

(If we wish to set up accept and reject zones for this one-sided test, we need to try higher values of $x$. A value of $x = 8$ is still too small; it gives a probability of .0547, which is above $\alpha$, so try $x = 9$. According to the binomial table for $n = 10$ and $p = .5$, $P(x \ge 9) = 1 - P(x \le 8) = 1 - .9893 = .0107$, which is below $\alpha$. So 9 is our critical value, and we will reject the hypothesis if $x \ge 9$.)

To clarify the correspondence between hypotheses about a median and hypotheses about a proportion, let us assume that $p$ is the proportion of the data above 20000. If 20000 is the median, then by definition of the median $p$ is one half. But let us assume that the median is above 20000, say 21000. Then one half of the data must be above 21000, so that more than one half of the data must be above 20000, which means less than one half of the data is below 20000.
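The binomial table lookups in the sign test examples above can be reproduced with a short calculation. The following is a minimal sketch, assuming an exact binomial tail sum in place of the printed table; the helper names `binom_cdf` and `sign_test_p_value` are hypothetical, and the income figures are the ten observations listed in the one-sided example.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(x <= k) for a binomial count with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

def sign_test_p_value(x, n, side):
    """Sign-test p-value under H0: p = .5 ('left', 'right' or 'two' sided)."""
    lower = binom_cdf(x, n)              # P(X <= x)
    upper = 1.0 - binom_cdf(x - 1, n)    # P(X >= x)
    if side == "left":
        return lower
    if side == "right":
        return upper
    return min(1.0, 2 * min(lower, upper))  # double the smaller tail

# One-sided income example: H0 median >= 20000, x = number of incomes below 20000
incomes = [10132, 11252, 13475, 14260, 16871, 19357, 19438, 23010, 30278, 35932]
x = sum(1 for income in incomes if income < 20000)
p_income = sign_test_p_value(x, len(incomes), "right")
print(x, round(p_income, 4))

# Two-sided example: n = 20 with 5 observations above 25
p_two_sided = sign_test_p_value(5, 20, "two")
print(round(p_two_sided, 4))
```

Both results can be compared with the table values quoted in the text, $P(x \ge 7) = .1719$ for the income sample and $2(.0207) = .0414$ for the two-sided example.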
Since a hypothesis about a median is a hypothesis about a proportion, $H_0: \eta \ge \eta_0$, $H_1: \eta < \eta_0$ corresponds to $H_0: p \ge .5$, $H_1: p < .5$ when $p$ is the proportion above $\eta_0$. The table below shows these correspondences depending on the definition of $p$.

Hypotheses about a median                  Hypotheses about a proportion
                                           If $p$ is the proportion above $\eta_0$   If $p$ is the proportion below $\eta_0$
$H_0: \eta = \eta_0$, $H_1: \eta \ne \eta_0$   $H_0: p = .5$, $H_1: p \ne .5$            $H_0: p = .5$, $H_1: p \ne .5$
$H_0: \eta \ge \eta_0$, $H_1: \eta < \eta_0$   $H_0: p \ge .5$, $H_1: p < .5$            $H_0: p \le .5$, $H_1: p > .5$
$H_0: \eta \le \eta_0$, $H_1: \eta > \eta_0$   $H_0: p \le .5$, $H_1: p > .5$            $H_0: p \ge .5$, $H_1: p < .5$

b. The Sign Test more Generally.
This technique can be used in other ways. For instance, let us say that we wish to check the effectiveness of a product brochure. A sample of 17 clients is asked about their impression of a product. Then they read the brochure and once again are asked their impression. We write a + if their impression has improved and a - if it is worse. A zero indicates no change. Our results are as follows:

Client   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Sign     +   +   +   +   0   -   +   0   -   +   +   +   -   +   0   +   +

Since we are hoping for a positive effect, count the zeros as minuses. We will use the brochure if we believe that the majority of the population will respond favorably. Let $p$ be the proportion of plusses in the population. Our hypotheses are $H_0: p \le .5$, $H_1: p > .5$. There are 11 plusses, so that we must find $P(x \ge 11)$ when $p = .5$ and $n = 17$. A large binomial table says that this value is .166, so that we must accept the null hypothesis and not use the brochure.

In the absence of a binomial table we must use the normal approximation to the binomial distribution. If $\bar{p} = \frac{x}{n}$ is our observed proportion, we use $z = \frac{\bar{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$. But for the sign test, $p_0 = .5$ and $q_0 = 1 - .5 = .5$. So $z = \frac{\frac{x}{n} - .5}{\sqrt{\frac{.25}{n}}} = \frac{\frac{x}{n} - .5}{\frac{.5}{\sqrt{n}}} = \frac{x - .5n}{.5\sqrt{n}} = \frac{2x - n}{\sqrt{n}}$.

(For relatively small values of $n$, a continuity correction is advisable, so try $z = \frac{2x \pm 1 - n}{\sqrt{n}}$, where the $+$ applies if $x < \frac{n}{2}$, and the $-$ applies if $x > \frac{n}{2}$. In the problem above, where $n = 17$, use $P(x \ge 11) \approx P\left(z \ge \frac{2(11) - 1 - 17}{\sqrt{17}}\right) = P(z \ge .970) = .5 - .3340 = .1660$.
Since this is a p-value, if $\alpha$ takes a typical value like .05 or .10, we can say p-value $> \alpha$ and accept the null hypothesis.)

7. Hypothesis Test for Means - Rare Events

In statistics, 'rare events' is a code word for the Poisson distribution. The easiest way to approach Poisson results is to use a p-value. For example, if you wish to test $H_0: \mu = 5$ against $H_1: \mu \ne 5$ and you have a result that says $x = 7$, the p-value is $2P(x \ge 7)$. (For an example, see 252sx2)

8. Hypothesis Tests for a Variance

To test $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 \ne \sigma_0^2$ (For an example, see 252sx2)
i. Test Ratio: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ or, for large samples, $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$
ii. Critical Value: $s^2_{cv} = \frac{\chi^2_{\alpha/2}\, \sigma_0^2}{n-1}$ and $s^2_{cv} = \frac{\chi^2_{1-\alpha/2}\, \sigma_0^2}{n-1}$ (Don't try this for large samples.)
iii. Confidence Interval: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ or, for large samples, $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$

Appendix: One-sided and Two-sided Tests

Assume the following: $n = 7$, $\mu_0 = 12.2$, $DF = n - 1 = 6$, $\alpha = .05$, $\bar{x} = 12.00$, $s^2 = 0.6082333$, so that $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{0.6082333}{7}} = 0.29477$.

A 2-sided Test: $H_0: \mu = 12.2$, $H_1: \mu \ne 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$. We test this against two values of $t$, $t^{(n-1)}_{\alpha/2} = t^{(6)}_{.025} = 2.447$ and $-t^{(n-1)}_{\alpha/2} = -t^{(6)}_{.025} = -2.447$. We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha/2}$ or below $-t^{(n-1)}_{\alpha/2}$. In this case we do not reject $H_0$.
If we use p-value: pval $= 2P(t \le -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \le -0.678)$ is between .25 and .30 and the p-value is between .50 and .60. Since the p-value is above .05, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}} = 12.2 \pm 2.447(0.29477) = 12.2 \pm 0.72$. We reject $H_0$ if $\bar{x}$ is above the upper $\bar{x}_{cv} = 12.2 + 0.72 = 12.92$ or below the lower $\bar{x}_{cv} = 12.2 - 0.72 = 11.48$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}} = 12.00 \pm 2.447(0.29477) = 12.00 \pm 0.72$. This interval is 11.28 to 12.72. Since $\mu_0 = 12.2$ is between these two limits, we do not reject $H_0$.

A Left-sided Test: $H_0: \mu \ge 12.2$, $H_1: \mu < 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$.
We test this against one value of $t$, $-t^{(n-1)}_{\alpha} = -t^{(6)}_{.05} = -1.943$. We reject $H_0$ if $t$ is below $-t^{(n-1)}_{\alpha}$. In this case we do not reject $H_0$.
If we use p-value: pval $= P(t \le -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \le -0.678)$ is between .25 and .30 and the p-value is between .25 and .30. Since the p-value is above .05, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 - t_{\alpha}\, s_{\bar{x}} = 12.2 - 1.943(0.29477) = 12.2 - 0.57 = 11.63$. We reject $H_0$ if $\bar{x}$ is below $\bar{x}_{cv} = 11.63$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternative hypothesis is $H_1: \mu < 12.2$, use $\mu \le \bar{x} + t_{\alpha}\, s_{\bar{x}} = 12.00 + 1.943(0.29477) = 12.00 + 0.57 = 12.57$. Since $H_0: \mu \ge 12.2$ does not contradict $\mu \le 12.57$, we do not reject $H_0$.

A Right-sided Test: $H_0: \mu \le 12.2$, $H_1: \mu > 12.2$
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{12.00 - 12.2}{0.29477} = -0.678$. We test this against one value of $t$, $t^{(n-1)}_{\alpha} = t^{(6)}_{.05} = 1.943$. We reject $H_0$ if $t$ is above $t^{(n-1)}_{\alpha}$. In this case we do not reject $H_0$.
If we use p-value: pval $= P(t \ge -0.678)$. On the t-table 0.678 is between $t^{(6)}_{.30} = 0.553$ and $t^{(6)}_{.25} = 0.718$. So $P(t \le -0.678)$ is between .25 and .30, $P(t \ge -0.678)$ is between .70 and .75 and the p-value is between .70 and .75. Since the p-value is above .05, we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 + t_{\alpha}\, s_{\bar{x}} = 12.2 + 1.943(0.29477) = 12.2 + 0.57 = 12.77$. We reject $H_0$ if $\bar{x}$ is above $\bar{x}_{cv} = 12.77$. In this case $\bar{x} = 12.00$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternative hypothesis is $H_1: \mu > 12.2$, use $\mu \ge \bar{x} - t_{\alpha}\, s_{\bar{x}} = 12.00 - 1.943(0.29477) = 12.00 - 0.57 = 11.43$. Since $H_0: \mu \le 12.2$ does not contradict $\mu \ge 11.43$, we do not reject $H_0$.

More on p-value

Let's say that you have gotten one of the following results for a test of a mean with $n = 25$ (the values of $z$ could also come from tests of proportions or variances):
a) $t = 1.000$
b) $t = -1.000$
c) $z = 1.000$
d) $z = -1.000$

A 2-sided Test
A p-value is defined as the probability that a test statistic or ratio as extreme as or more extreme than the observed statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \ge 1.000)$. To find $P(t \ge 1.000)$, look at the t table. Since $n = 25$, $df = n - 1 = 24$. Look at the $df = 24$ line. You will find that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these values we can say $.15 < P(t \ge 1.000) < .20$. So pval $= 2P(t \ge 1.000)$, which means $.30 <$ pval $< .40$.
b) $t = -1.000$. You want $P(t \le -1.000 \text{ or } t \ge 1.000) = 2P(t \le -1.000)$. You found in a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$, but, since the t distribution is symmetrical, we can also say $P(t \le -0.857) = .20$ and $P(t \le -1.059) = .15$. Since $-1.000$ is between these values we can say $.15 < P(t \le -1.000) < .20$. So pval $= 2P(t \le -1.000)$, which means $.30 <$ pval $< .40$.
c) $z = 1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \ge 1.000)$. Make a diagram for $z$ with a center at zero and shade the area above 1.000. Use the Normal table. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$, so pval $= 2P(z \ge 1.000) = 2(.1587) = .3174$.
d) $z = -1.000$. You want $P(z \le -1.000 \text{ or } z \ge 1.000) = 2P(z \le -1.000)$. Make a diagram for $z$ with a center at zero and shade the area below $-1.000$. $P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$, so pval $= 2P(z \le -1.000) = 2(.1587) = .3174$.

A Left-sided Test
A p-value is defined as the probability that a test statistic or ratio as low as or lower than the observed statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \le 1.000)$. You found in 2-sided Test a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these values we can say $.15 < P(t \ge 1.000) < .20$. But you want $P(t \le 1.000)$, so subtract these probabilities from 1. So $.80 <$ pval $< .85$.
b) $t = -1.000$. You want $P(t \le -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$.
Since the t distribution is symmetrical, we can also say $.15 < P(t \le -1.000) < .20$ or $.15 <$ pval $< .20$.
c) $z = 1.000$. You want $P(z \le 1.000)$. Make a diagram for $z$ with a center at zero and shade the area below 1.000. $P(z \le 1.000) = P(z \le 0) + P(0 \le z \le 1) = .5 + .3413 = .8413$, so pval $= .8413$.
d) $z = -1.000$. You want $P(z \le -1.000)$. Make a diagram for $z$ with a center at zero and shade the area below $-1.000$. $P(z \le -1.000) = P(z \le 0) - P(-1 \le z \le 0) = .5 - .3413 = .1587$, so pval $= .1587$.

A Right-sided Test
A p-value is defined as the probability that a test statistic or ratio as high as or higher than the observed statistic or ratio could occur, assuming that the null hypothesis is true.
a) $t = 1.000$. You want $P(t \ge 1.000)$. You found in 2-sided Test a) that 1.000 is between $t^{(24)}_{.20} = 0.857$ and $t^{(24)}_{.15} = 1.059$. This means that $P(t \ge 0.857) = .20$ and $P(t \ge 1.059) = .15$. Since 1.000 is between these values we can say $.15 < P(t \ge 1.000) < .20$. So $.15 <$ pval $< .20$.
b) $t = -1.000$. You want $P(t \ge -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t distribution is symmetrical, we can also say $.15 < P(t \le -1.000) < .20$. But you want $P(t \ge -1.000)$, so subtract these probabilities from 1. So $.80 <$ pval $< .85$.
c) $z = 1.000$. You want $P(z \ge 1.000)$. Make a diagram for $z$ with a center at zero and shade the area above 1.000. $P(z \ge 1.000) = P(z \ge 0) - P(0 \le z \le 1) = .5 - .3413 = .1587$, so pval $= .1587$.
d) $z = -1.000$. You want $P(z \ge -1.000)$. Make a diagram for $z$ with a center at zero and shade the area above $-1.000$. $P(z \ge -1.000) = P(z \ge 0) + P(-1 \le z \le 0) = .5 + .3413 = .8413$, so pval $= .8413$.

Note that, since every one of these p-values is above 1%, 5% and 10%, you would not reject the null hypothesis if you used any of these significance levels.

© 2005 R. E. Bove
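The $z$ cases c) and d) of the 'More on p-value' section can be checked without a Normal table. This is a small sketch, assuming Python's standard-library `statistics.NormalDist` for the standard Normal cumulative distribution; the function name `z_p_value` and the side labels are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_p_value(z_calc, side):
    """p-value for a z ratio in a 'left', 'right' or 'two' sided test."""
    lower = STANDARD_NORMAL.cdf(z_calc)   # P(z <= z_calc)
    upper = 1.0 - lower                   # P(z >= z_calc)
    if side == "left":
        return lower
    if side == "right":
        return upper
    return 2 * min(lower, upper)          # two-sided: double the smaller tail

# Cases c) z = 1.000 and d) z = -1.000 under each kind of test
for z_calc in (1.0, -1.0):
    for side in ("two", "left", "right"):
        print(z_calc, side, round(z_p_value(z_calc, side), 4))
```

The printed values can be compared with the table results above: .3174 for both two-sided cases, and .8413 or .1587 for the one-sided cases depending on the sign of $z$ and the direction of the test.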