Course: A0392 – Economic Statistics
Year: 2006

Session 09
Hypothesis Testing for Proportions and Categorical Data

Outline:
• Hypothesis test for a proportion
• Hypothesis test for the difference between two proportions
• Test of independence for categorical data

Summary of Test Statistics to be Used in a Hypothesis Test about a Population Mean
• n > 30, \(\sigma\) known: \( z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \)
• n > 30, \(\sigma\) unknown: use \(s\) to estimate \(\sigma\); \( z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \)
• n ≤ 30, population approximately normal, \(\sigma\) known: \( z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \)
• n ≤ 30, population approximately normal, \(\sigma\) unknown: use \(s\) to estimate \(\sigma\); \( t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \)
• n ≤ 30, population not approximately normal: increase n to > 30

A Summary of Forms for Null and Alternative Hypotheses about a Population Proportion
• The equality part of the hypotheses always appears in the null hypothesis.
• In general, a hypothesis test about the value of a population proportion \(p\) must take one of the following three forms (where \(p_0\) is the hypothesized value of the population proportion):
  H0: \(p \ge p_0\)   Ha: \(p < p_0\)
  H0: \(p \le p_0\)   Ha: \(p > p_0\)
  H0: \(p = p_0\)   Ha: \(p \ne p_0\)

Tests about a Population Proportion: Large-Sample Case (\(np \ge 5\) and \(n(1 - p) \ge 5\))
• Test Statistic
\[ z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}} \qquad \text{where} \qquad \sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} \]
• Rejection Rule
  One-tailed, H0: \(p \le p_0\): reject H0 if \(z > z_\alpha\)
  One-tailed, H0: \(p \ge p_0\): reject H0 if \(z < -z_\alpha\)
  Two-tailed, H0: \(p = p_0\): reject H0 if \(|z| > z_{\alpha/2}\)

Example: NSC
• Two-Tailed Test about a Population Proportion: Large n
For a Christmas and New Year's week, the National Safety Council estimated that 500 people would be killed and 25,000 injured on the nation's roads. The NSC claimed that 50% of the accidents would be caused by drunk driving. A sample of 120 accidents showed that 67 were caused by drunk driving. Use these data to test the NSC's claim with \(\alpha = 0.05\).
– Hypotheses
  H0: \(p = .5\)   Ha: \(p \ne .5\)
– Test Statistic
\[ \sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{.5(1 - .5)}{120}} = .045644 \]
\[ z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}} = \frac{(67/120) - .5}{.045644} = 1.278 \]
– Rejection Rule
  Reject H0 if \(z < -1.96\) or \(z > 1.96\).
– Conclusion
  Do not reject H0. For \(z = 1.278\), the p-value is .201.
Because the p-value exceeds \(\alpha = .05\), rejecting H0 would exceed the maximum allowed risk of committing a Type I error (p-value > .050).

Hypothesis Testing and Decision Making
• In many decision-making situations the decision maker may want, and in some cases may be forced, to take action under both conclusions: do not reject H0 and reject H0.
• In such situations, it is recommended that the hypothesis-testing procedure be extended to include consideration of making a Type II error.

Calculating the Probability of a Type II Error in Hypothesis Tests about a Population Mean
1. Formulate the null and alternative hypotheses.
2. Use the level of significance \(\alpha\) to establish a rejection rule based on the test statistic.
3. Using the rejection rule, solve for the value of the sample mean that identifies the rejection region.
4. Use the results from step 3 to state the values of the sample mean that lead to the acceptance of H0; this defines the acceptance region.
5. Using the sampling distribution of \(\bar{x}\) for any value of \(\mu\) from the alternative hypothesis, and the acceptance region from step 4, compute the probability that the sample mean will be in the acceptance region.

Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
1. Hypotheses are: H0: \(\mu \le 12\) and Ha: \(\mu > 12\)
2. Rejection rule is: reject H0 if \(z > 1.645\)
3. Value of the sample mean that identifies the rejection region:
\[ z = \frac{\bar{x} - 12}{3.2/\sqrt{40}} \ge 1.645 \quad\Longrightarrow\quad \bar{x} \ge 12 + 1.645\left(\frac{3.2}{\sqrt{40}}\right) = 12.8323 \]
4. We will accept H0 when \(\bar{x} < 12.8323\).
5. Probabilities that the sample mean will be in the acceptance region, using \( z = \dfrac{12.8323 - \mu}{3.2/\sqrt{40}} \):

  Value of μ     z        β        1 - β
  14.0          -2.31     .0104    .9896
  13.6          -1.52     .0643    .9357
  13.2          -0.73     .2327    .7673
  12.83          0.00     .5000    .5000
  12.8           0.06     .5239    .4761
  12.4           0.85     .8023    .1977
  12.0001        1.645    .9500    .0500

Observations about the preceding table:
– When the true population mean \(\mu\) is close to the null hypothesis value of 12, there is a high probability that we will make a Type II error.
– When the true population mean \(\mu\) is far above the null hypothesis value of 12, there is a low probability that we will make a Type II error.

Power of the Test
• The probability of correctly rejecting H0 when it is false is called the power of the test.
• For any particular value of \(\mu\), the power is \(1 - \beta\).
• We can show graphically the power associated with each value of \(\mu\); such a graph is called a power curve.

Determining the Sample Size for a Hypothesis Test About a Population Mean
\[ n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_0 - \mu_a)^2} \]
where
  \(z_\alpha\) = z value providing an area of \(\alpha\) in the tail
  \(z_\beta\) = z value providing an area of \(\beta\) in the tail
  \(\sigma\) = population standard deviation
  \(\mu_0\) = value of the population mean in H0
  \(\mu_a\) = value of the population mean used for the Type II error
Note: In a two-tailed hypothesis test, use \(z_{\alpha/2}\), not \(z_\alpha\).

Relationship among \(\alpha\), \(\beta\), and n
• Once two of the three values are known, the other can be computed.
• For a given level of significance \(\alpha\), increasing the sample size n will reduce \(\beta\).
• For a given sample size n, decreasing \(\alpha\) will increase \(\beta\), whereas increasing \(\alpha\) will decrease \(\beta\).
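As a rough check on the Metro EMS table and the sample-size formula above, the following Python sketch recomputes the Type II error probability for an upper-tail test and evaluates the formula for n. The function names `type_ii_error` and `sample_size` and their parameters are illustrative assumptions, not part of the course material. The sketch uses only the standard library's `statistics.NormalDist` and works with unrounded z values, so its results can differ from the table in the third decimal place.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def type_ii_error(mu_true, mu0=12.0, sigma=3.2, n=40, alpha=0.05):
    """Probability of a Type II error for the upper-tail test H0: mu <= mu0.

    Defaults are the Metro EMS values from the slides (hypothetical helper).
    """
    std_error = sigma / sqrt(n)
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)   # about 1.645 for alpha = .05
    x_crit = mu0 + z_alpha * std_error        # boundary of the rejection region
    # beta = P(sample mean falls in the acceptance region | true mean = mu_true)
    return STD_NORMAL.cdf((x_crit - mu_true) / std_error)


def sample_size(mu0, mu_a, sigma, alpha, beta):
    """One-tailed sample size n = (z_alpha + z_beta)^2 sigma^2 / (mu0 - mu_a)^2."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    n = (z_alpha + z_beta) ** 2 * sigma ** 2 / (mu0 - mu_a) ** 2
    return ceil(n)  # round up to the next whole observation


if __name__ == "__main__":
    for mu in (14.0, 13.6, 13.2, 12.8, 12.4, 12.0001):
        beta = type_ii_error(mu)
        print(f"mu = {mu:<8} beta = {beta:.4f}  power = {1 - beta:.4f}")
```

Plotting `1 - type_ii_error(mu)` against `mu` would give the power curve described above. For a two-tailed test, `alpha` would need to be halved before calling `sample_size`, in line with the note on \(z_{\alpha/2}\).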
Inferences About the Difference Between the Proportions of Two Populations
• Sampling Distribution of \(\bar{p}_1 - \bar{p}_2\)
• Interval Estimation of \(p_1 - p_2\)
• Hypothesis Tests about \(p_1 - p_2\)

Sampling Distribution of \(\bar{p}_1 - \bar{p}_2\)
• Expected Value
\[ E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2 \]
• Standard Deviation
\[ \sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} \]
• Distribution Form
If the sample sizes are large (\(n_1 p_1\), \(n_1(1 - p_1)\), \(n_2 p_2\), and \(n_2(1 - p_2)\) are all greater than or equal to 5), the sampling distribution of \(\bar{p}_1 - \bar{p}_2\) can be approximated by a normal probability distribution.

Interval Estimation of \(p_1 - p_2\)
• Interval Estimate
\[ \bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2}\, \sigma_{\bar{p}_1 - \bar{p}_2} \]
• Point Estimator of \(\sigma_{\bar{p}_1 - \bar{p}_2}\)
\[ s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}} \]

Example: MRA
MRA (Market Research Associates) is conducting research to evaluate the effectiveness of a client's new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households "aware" of the client's product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks. A survey conducted immediately after the new campaign showed 120 of 250 households "aware" of the client's product. Do the data support the position that the advertising campaign has increased awareness of the client's product?

• Point Estimator of the Difference Between the Proportions of Two Populations
\[ \bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08 \]
  \(p_1\) = proportion of the population of households "aware" of the product after the new campaign
  \(p_2\) = proportion of the population of households "aware" of the product before the new campaign
  \(\bar{p}_1\) = sample proportion of households "aware" of the product after the new campaign
  \(\bar{p}_2\) = sample proportion of households "aware" of the product before the new campaign

• Interval Estimate of \(p_1 - p_2\): Large-Sample Case
For \(\alpha = .05\), \(z_{.025} = 1.96\):
\[ .48 - .40 \pm 1.96\sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}} = .08 \pm 1.96(.0510) = .08 \pm .10 \]
or -.02 to +.18.
– Conclusion
At a 95% confidence level, the interval estimate of the difference between the proportions of households aware of the client's product before and after the new advertising campaign is -.02 to +.18.

Hypothesis Tests about \(p_1 - p_2\)
• Hypotheses
  H0: \(p_1 - p_2 \le 0\)   Ha: \(p_1 - p_2 > 0\)
• Test Statistic
\[ z = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sigma_{\bar{p}_1 - \bar{p}_2}} \]
• Point Estimator of \(\sigma_{\bar{p}_1 - \bar{p}_2}\) where \(p_1 = p_2\)
\[ s_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \qquad \text{where} \qquad \bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2} \]

Example: MRA
• Hypothesis Tests about \(p_1 - p_2\)
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client's product increased after the new advertising campaign?
– Hypotheses
  H0: \(p_1 - p_2 \le 0\)   Ha: \(p_1 - p_2 > 0\)
– Rejection Rule
  Reject H0 if \(z > 1.645\).
– Test Statistic
\[ \bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45 \]
\[ s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514 \]
\[ z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56 \]
– Conclusion
  Do not reject H0, because \(z = 1.56\) does not exceed 1.645.

Test of Independence: Contingency Tables
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed frequency, \(f_{ij}\), for each cell of the contingency table.
3. Compute the expected frequency, \(e_{ij}\), for each cell:
\[ e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \]
4. Compute the test statistic:
\[ \chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \]
5. Reject H0 if \(\chi^2 > \chi^2_\alpha\) (where \(\alpha\) is the significance level and with n rows and m columns there are \((n - 1)(m - 1)\) degrees of freedom).

Example: Finger Lakes Homes (B)
• Contingency Table (Independence) Test
Each home sold can be classified according to price and to style. Finger Lakes Homes' manager would like to determine if the price of the home and the style of the home are independent variables.
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $65,000 or less or more than $65,000.

  Price         Colonial   Ranch   Split-Level   A-Frame
  ≤ $65,000     18         6       19            12
  > $65,000     12         14      16            3

• Hypotheses
  H0: Price of the home is independent of the style of the home that is purchased
  Ha: Price of the home is not independent of the style of the home that is purchased

• Observed Frequencies with Row and Column Totals

  Price         Colonial   Ranch   Split-Level   A-Frame   Total
  ≤ $65,000     18         6       19            12        55
  > $65,000     12         14      16            3         45
  Total         30         20      35            15        100

• Expected Frequencies, \( e_{ij} = (\text{Row } i \text{ Total})(\text{Column } j \text{ Total})/100 \)

  Price         Colonial   Ranch   Split-Level   A-Frame   Total
  ≤ $65,000     16.50      11.00   19.25         8.25      55
  > $65,000     13.50      9.00    15.75         6.75      45
  Total         30         20      35            15        100

• Test Statistic
\[ \chi^2 = \frac{(18 - 16.5)^2}{16.5} + \frac{(6 - 11)^2}{11} + \cdots + \frac{(3 - 6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.1486 \]
• Rejection Rule
  With \(\alpha = .05\) and \((2 - 1)(4 - 1) = 3\) d.f., \(\chi^2_{.05} = 7.81\).
  Reject H0 if \(\chi^2 > 7.81\).
• Conclusion
  Since \(9.1486 > 7.81\), we reject H0, the assumption that the price of the home is independent of the style of the home that is purchased.
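The five-step test of independence can be sketched in a few lines of Python and applied to the Finger Lakes Homes counts. This is a minimal illustration, not the course's own method: the function name `chi_square_independence` is an assumption, and the critical value 7.81 is taken from the slide rather than computed, because the standard library has no chi-square quantile function.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom, expected frequencies)
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # e_ij = (row i total)(column j total) / sample size
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # chi-square = sum over all cells of (f_ij - e_ij)^2 / e_ij
    statistic = sum(
        (f - e) ** 2 / e
        for f_row, e_row in zip(observed, expected)
        for f, e in zip(f_row, e_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Observed counts from the example: rows are the two price classes,
# columns are Colonial, Ranch, Split-Level, A-Frame.
observed = [
    [18, 6, 19, 12],
    [12, 14, 16, 3],
]
CRITICAL_VALUE = 7.81  # chi-square(.05) with 3 d.f., as quoted on the slide

statistic, dof, expected = chi_square_independence(observed)
reject_h0 = statistic > CRITICAL_VALUE

if __name__ == "__main__":
    print(f"chi-square = {statistic:.4f}, d.f. = {dof}, reject H0: {reject_h0}")
```

The sketch does not check that every expected frequency is large enough for the chi-square approximation to hold; a fuller implementation would validate that before reporting a decision.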