Week 7 (October 13–17): Three Mini-Lectures
QMM 510, Fall 2014

ML 7.1 — Chapter 8: Confidence Interval for a Proportion (π)

A proportion is a mean of data whose only values are 0 or 1.

Example: a sample of 100 binary outcomes (0's and 1's) contains 56 "successes" (1's), so p = 56/100 = .56.

Applying the CLT
• The distribution of a sample proportion p = x/n is symmetric if π = .50.
• The distribution of p approaches the normal as n increases, for any π.

When Is It Safe to Assume Normality of p?
• Rule of Thumb: the sample proportion p = x/n may be assumed normal if both nπ ≥ 10 and n(1 − π) ≥ 10.
• Equivalently, it is safe to assume normality of p = x/n if we have at least 10 "successes" and at least 10 "failures" in the sample (Table 8.9 shows sample sizes needed to assume normality).

Confidence Interval for π
• The confidence interval for the unknown π (assuming a large sample) is based on the sample proportion p = x/n:
  p ± z_(α/2) √( p(1 − p)/n )

Example: Auditing
(worked example shown on slide)

Chapter 8: Estimating π from a Finite Population
• N = population size; n = sample size.
• The finite population correction factor, FPCF = √( (N − n)/(N − 1) ), narrows the confidence interval somewhat.
• When the sample is small relative to the population, the FPCF has little effect. If n/N < .05, it is reasonable to omit it (FPCF ≈ 1).

ML 7-2 — Chapter 8: Sample Size Determination

Sample Size to Estimate μ
• To estimate a population mean with a precision of ± E (allowable error), you would need a sample of size
  n = z² σ² / E²

Now, How to Estimate σ?
• Method 1: Take a Preliminary Sample — take a small preliminary sample and use the sample s in place of σ in the sample size formula.
• Method 2: Assume Uniform Population — estimate rough upper and lower limits a and b and set σ = √[ (b − a)²/12 ] = (b − a)/√12.
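The large-sample interval and the normality rule of thumb can be sketched in a few lines of Python. This is a minimal illustration, not MegaStat output; the function name `proportion_ci` and the hard-coded z value are my own, and the FPCF option assumes the finite-population formula above.

```python
import math

def proportion_ci(x, n, z=1.96, N=None):
    """Normal-approximation confidence interval for a proportion pi.

    z defaults to 1.96 (95% confidence). Pass the population size N
    to apply the finite population correction factor (FPCF).
    """
    # Rule of thumb: need at least 10 "successes" and 10 "failures"
    if x < 10 or n - x < 10:
        raise ValueError("normal approximation may be unsafe")
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    # FPCF narrows the interval; it has little effect when n/N < .05
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return p - z * se, p + z * se

# The 100-outcome sample above had 56 successes:
lo, hi = proportion_ci(56, 100)   # roughly (.463, .657)
```

Passing a population size, e.g. `proportion_ci(56, 100, N=500)`, yields a slightly narrower interval, matching the FPCF discussion above.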
• Method 3: Assume Normal Population — estimate rough upper and lower limits a and b and set σ = (b − a)/4. This assumes normality with most of the data within μ ± 2σ, so the range is 4σ.
• Method 4: Poisson Arrivals — in the special case where μ is a Poisson arrival rate, σ = √μ.

Sample Size for a Mean: Using MegaStat
Example: how large a sample is needed to estimate the population mean age of college professors with 95 percent confidence and a precision of ± 2 years (i.e., E = 2 years allowable error)?
To estimate σ, we assume a uniform distribution of ages from 25 to 70:
  σ = (70 − 25)/√12 ≈ 13
  n = z²σ²/E² = (1.96)²(13)²/(2)² ≈ 163

Sample Size to Estimate π
• To estimate a population proportion with a precision of ± E (allowable error), you would need a sample of size
  n = z² π(1 − π) / E²
• Since π is a number between 0 and 1, the allowable error E is also between 0 and 1.

How to Estimate π?
• Method 1: Assume π = .50 — this conservative method ensures the desired precision; however, the sample may end up being larger than necessary.
• Method 2: Take a Preliminary Sample — take a small preliminary sample and use the sample p in place of π in the sample size formula.
• Method 3: Use a Prior Sample or Historical Data — how often are such samples available? Even when they are, π might be different enough to make this a questionable assumption.

Sample Size for a Proportion: Using MegaStat
Example: how large a sample is needed to estimate the population proportion with 95 percent confidence and a precision of ± .02 (i.e., 2% allowable error)?
  n = z²π(1 − π)/E² = (1.96)²(.50)(1 − .50)/(.02)² = 2401

ML 7-3 — Chapter 9: One-Sample Hypothesis Tests

Learning Objectives
LO9-1: List the steps in testing hypotheses.
LO9-2: Explain the difference between H0 and H1.
LO9-3: Define Type I error, Type II error, and power.
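Both worked examples can be verified numerically. A small sketch (the function names are illustrative; always round a sample size up to the next whole unit):

```python
import math

def sample_size_mean(z, sigma, E):
    """n = z^2 * sigma^2 / E^2, rounded up to the next whole unit."""
    return math.ceil((z * sigma / E) ** 2)

def sample_size_proportion(z, E, pi=0.50):
    """n = z^2 * pi * (1 - pi) / E^2; pi = .50 is the conservative default."""
    # round before taking the ceiling to guard against floating-point noise
    return math.ceil(round(z ** 2 * pi * (1 - pi) / E ** 2, 6))

sigma = (70 - 25) / math.sqrt(12)            # uniform-population estimate, about 13
n_ages = sample_size_mean(1.96, sigma, 2)    # professor-age example -> 163
n_poll = sample_size_proportion(1.96, 0.02)  # proportion example -> 2401
```

Note how the conservative default π = .50 maximizes π(1 − π), so 2401 is the largest sample the formula can demand at this confidence and precision.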
LO9-4: Formulate a null and alternative hypothesis for μ or π.

Chapter 9: Logic of Hypothesis Testing

State the Hypothesis (LO9-2)
• Hypotheses are a pair of mutually exclusive, collectively exhaustive statements about some fact about a population. One statement or the other must be true, but they cannot both be true.
• H0: null hypothesis; H1: alternative hypothesis. These two statements are hypotheses because the truth is unknown.
• Efforts will be made to reject the null hypothesis. If H0 is rejected, we tentatively conclude H1 to be the case.
• H0 is sometimes called the maintained hypothesis. H1 is called the action alternative because action may be required if we reject H0 in favor of H1.

Can Hypotheses Be Proved?
• We cannot accept a null hypothesis; we can only fail to reject it.

Role of Evidence
• The null hypothesis is assumed true and a contradiction is sought.

Types of Error (LO9-3)
• Type I error: rejecting the null hypothesis when it is true. This occurs with probability α (the level of significance). Also called a false positive.
• Type II error: failing to reject the null hypothesis when it is false. This occurs with probability β. Also called a false negative.

Probability of Type I and Type II Errors
• If we choose α = .05, we expect to commit a Type I error about 5 times in 100.
• β cannot be chosen in advance because it depends on α and the sample size.
• A small β is desirable, other things being equal.

Power of a Test
• A low β risk means high power.
• Larger samples lead to increased power.

Relationship between α and β
• Both a small α and a small β are desirable.
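The statement that α = .05 means a Type I error "about 5 times in 100" can be checked by simulation: repeatedly test a true H0 and count rejections. A sketch, assuming an arbitrary population (μ0 = 50, σ = 10 are my own illustrative choices, not from the slides):

```python
import math
import random

random.seed(42)
mu0, sigma, n = 50.0, 10.0, 30   # population where H0: mu = mu0 is actually TRUE
z_crit = 1.96                    # two-tailed critical value for alpha = .05
trials = 20_000

rejections = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu0, sigma) for _ in range(n)) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic
    if abs(z) > z_crit:                         # lands in the rejection region
        rejections += 1

type1_rate = rejections / trials   # close to .05, as the theory predicts
```

Every rejection counted here is a false positive, since H0 was true by construction; the long-run rate settles near the chosen α.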
• For a given type of test and fixed sample size, there is a trade-off between α and β.
• The larger critical value needed to reduce α risk makes it harder to reject H0, thereby increasing β risk.
• Both α and β can be reduced simultaneously only by increasing the sample size.

Consequences of Type I and Type II Errors
• The consequences of these two errors are quite different, and the costs are borne by different parties.
• Example: a Type I error is convicting an innocent defendant, so the costs are borne by the defendant. A Type II error is failing to convict a guilty defendant, so the costs are borne by society if the guilty person returns to the streets.
• Firms are increasingly wary of Type II error (failing to recall a product as soon as sample evidence begins to indicate potential problems).

Chapter 9: Statistical Hypothesis Testing (LO9-4)
• A statistical hypothesis is a statement about the value of a population parameter.
• A hypothesis test is a decision between two competing, mutually exclusive, and collectively exhaustive hypotheses about the value of the parameter.
• When testing a mean we can choose among three tests.

One-Tailed and Two-Tailed Tests
• The direction of the test is indicated by H1:
  > indicates a right-tailed test
  < indicates a left-tailed test
  ≠ indicates a two-tailed test

Decision Rule
• A test statistic shows how far the sample estimate is from its expected value, in terms of its own standard error.
• The decision rule uses the known sampling distribution of the test statistic to establish the critical value that divides the sampling distribution into two regions.
• Reject H0 if the test statistic lies in the rejection region.
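The three decision rules can be collected into one helper. A hedged sketch (`reject_h0` and its argument names are my own, not textbook notation):

```python
def reject_h0(z_stat, z_crit, h1):
    """Decision rule for a z test.

    h1 is the direction in the alternative hypothesis:
    '>' right-tailed, '<' left-tailed, '!=' two-tailed.
    For a two-tailed test, z_crit should already reflect alpha/2 per tail.
    """
    if h1 == '>':
        return z_stat > z_crit       # reject in the right tail
    if h1 == '<':
        return z_stat < -z_crit      # reject in the left tail
    return abs(z_stat) > z_crit      # reject in either tail

# Examples at alpha = .05:
reject_h0(2.10, 1.96, '!=')    # True: |2.10| > 1.96
reject_h0(-1.50, 1.645, '<')   # False: -1.50 is not below -1.645
```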
Decision Rule for a Two-Tailed Test
• Reject H0 if the test statistic < left-tail critical value or if the test statistic > right-tail critical value.

When to Use a One- or Two-Sided Test
• A two-sided hypothesis test (i.e., H1: μ ≠ μ0) is used when direction (< or >) is of no interest to the decision maker.
• A one-sided hypothesis test is used when the consequences of rejecting H0 are asymmetric, or where one tail of the distribution is of special importance to the researcher.
• Rejection in a two-sided test guarantees rejection in a one-sided test, other things being equal.

Decision Rule for a Left-Tailed Test
• Reject H0 if the test statistic < left-tail critical value (Figure 9.2).

Decision Rule for a Right-Tailed Test
• Reject H0 if the test statistic > right-tail critical value.

Type I Error
• Also called a false positive.
• A reasonably small level of significance α is desirable, other things being equal.
• Chosen in advance, common choices for α are .10, .05, .025, .01, and .005 (i.e., 10%, 5%, 2.5%, 1%, and .5%).
• The α risk is the area under the tail(s) of the sampling distribution. In a two-sided test, the α risk is split with α/2 in each tail since there are two ways to reject H0.
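The claim that two-sided rejection guarantees one-sided rejection follows because the one-sided critical value is smaller (at α = .05, 1.645 < 1.96). It can be spot-checked numerically; the helper name `rejects` is illustrative:

```python
def rejects(z, z_two=1.96, z_one=1.645):
    """Return (two_sided, one_sided) rejection decisions at alpha = .05.

    The one-sided test is taken in the tail matching the sign of z.
    """
    two_sided = abs(z) > z_two
    one_sided = z > z_one if z >= 0 else z < -z_one
    return two_sided, one_sided

# Wherever the two-sided test rejects, the matching one-sided test rejects
# as well, since its critical value is closer to zero.
for z in (2.2, -2.0, 1.7, -1.8, 0.5):
    two_sided, one_sided = rejects(z)
    assert not two_sided or one_sided
```

The converse fails, of course: z = 1.7 rejects one-sided but not two-sided, which is exactly the "other things being equal" caveat on the slide.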