Sample Size Determination

One of the most frequent problems in applying statistical theory to practice is determining sample sizes for surveys and other empirical observations. This paper pulls together various approaches to this problem from a typical text¹ on statistics for business, economics, and the social sciences.

Sample Size Based on Confidence Intervals

If knowledge of the values of population parameters is desired, a common practice is to perform observations on a sample from the population. An estimate of each population parameter is then calculated as a confidence interval within which the true value will lie for some proportion of all the samples ever to be taken. For the population mean the upper bound of the confidence interval is given by

    upper bound = x̄ + z_{α/2} s/√n = x̄ + e

where
    x̄ = sample mean
    z_{α/2} = value from the standard normal distribution, with α = 1 − C (C = confidence level)
    s = sample standard deviation
    n = sample size

The term after the + sign is the half-width of the interval and can be considered the error, e, with which the mean is estimated. When x̄ is subtracted from both sides of the equation, the result can be solved for n:

    n = z_{α/2}^2 s^2 / e^2

To use this formula, business decisions must be made for the confidence level, C (from which α = 1 − C is calculated), and the allowable error, e; obviously, both have a lot to do with the situation. Then a conservative estimate for n should be made (if the t distribution is used in place of z, the smaller the assumed sample size the larger t_{α/2, n−1} will be, so conservative estimates correspond to small assumed values of n). In addition, s must be estimated. This can be done from historically similar situations or a pilot sample.

For a population proportion the corresponding formula is

    n = z_{α/2}^2 p q / e^2

where q = 1 − p. Since the object of using this formula is to find out what the population proportion, p, is, it is necessary to estimate p from prior knowledge or a pilot survey, as with s above.
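As a concrete illustration, the two formulas above can be evaluated directly. The numbers used below (95% confidence with s = 15 and e = 2 for the mean; p = .5 and e = .03 for the proportion) are illustrative assumptions, not values from the text:

```python
import math

def n_for_mean(z, s, e):
    """Sample size for estimating a mean: n = z^2 s^2 / e^2, rounded up."""
    return math.ceil((z * s / e) ** 2)

def n_for_proportion(z, p, e):
    """Sample size for estimating a proportion: n = z^2 p q / e^2, q = 1 - p."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / e ** 2)

z = 1.96  # z_{alpha/2} for C = .95 (alpha = .05)

# Estimate a mean to within e = 2 units when a pilot sample suggests s = 15.
print(n_for_mean(z, 15, 2))          # 217

# Estimate a proportion to within 3 points, using the conservative p = .5.
print(n_for_proportion(z, 0.5, 0.03))  # 1068
```

Rounding up is conventional, since rounding down would exceed the allowable error; taking p = .5 maximizes pq and so gives the most conservative sample size when p is unknown.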
An underlying assumption for the use of these formulas is that the sample size will be less than 5% of the population size, which is often not known exactly and must be guessed. It can always be assumed that the population is infinite, in which case the above formulas apply; but if it is suspected that the values of n above could be greater than 5% of the population, the sample size can be reduced by the correction factor in the following section.

¹ McClave, Benson, and Sincich (2007) Statistics for Business and Economics, 10th Edition, Pearson/Prentice Hall

M Peter Jurkat, Document1, 2/8/2016

For two populations the same formulas apply except that the variance factor (the second factor in the numerator) is the sum of the two populations' variances. Thus for a test for a difference in means or in proportions:

    n_1 = n_2 = z_{α/2}^2 (σ_1^2 + σ_2^2) / e^2   or   n_1 = n_2 = z_{α/2}^2 (p_1 q_1 + p_2 q_2) / e^2

Correction for Finite Populations

The correction in this section needs to be applied when the sample size can be expected to be more than 5% of the population size. Here the population size is designated with an upper case N. The finite population correction factor is

    √((N − n) / (N − 1))

This factor multiplies the sample standard deviation, resulting in an equation with n on both sides of the equal sign. When that equation is solved for n the result is

    n = n_0 N / (n_0 + N − 1)

where n_0 is the sample size calculated for infinite populations in the previous section, for both the population mean and the population proportion.

Sample Size Consideration when Sampling for Small Proportions

An additional consideration arises when samples are drawn to estimate small population proportions, such as for epidemiologic and quality control studies.
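Returning briefly to the correction for finite populations above, a minimal sketch (the population size N = 5000 and the 95%, ±3-point proportion setting are illustrative assumptions):

```python
import math

def finite_population_n(n0, N):
    """Correct an infinite-population sample size n0 for a population of
    size N: n = n0 * N / (n0 + N - 1), rounded up."""
    return math.ceil(n0 * N / (n0 + N - 1))

# Infinite-population sample size for a proportion at 95% confidence,
# e = .03, conservative p = .5:
n0 = math.ceil(1.96 ** 2 * 0.5 * 0.5 / 0.03 ** 2)  # 1068

# 1068 is well over 5% of a population of N = 5000, so apply the correction.
print(finite_population_n(n0, 5000))  # 881
```

As expected, the correction reduces the required sample substantially when n_0 is a large fraction of N, and has almost no effect when N is huge.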
In both of these cases it is hoped that the proportions of diseased individuals or production/service failures are very small, sometimes as few as one in hundreds of thousands. In such cases samples must be large enough that, for each category being sampled, at least 5 or 6 diseased individuals or failures are actually found.

Sample Size Calculation for Given Probabilities of Type I and Type II Error

Section 6.6 of McClave et al.² discusses the probability of Type II errors in hypothesis testing for the mean of a single population. The probability of making a Type II error is usually denoted by β, and 1 − β is referred to as the power of the test. However, the section does not extend the discussion to the most common use of the power of a test, which is to calculate the sample size that will satisfy predetermined values of α and β. This paper provides an example of such calculations. The situation is specified as follows:

    H_0: μ = 2400
    H_a: μ > 2400
    Significance level α = .05, the probability of making a Type I error
    Sample size initially specified as n = 50, with sample s.d. s = 200

² Ibid, p376-381

Let x̄ denote a typical value of the mean of such a sample. The critical region consists of those values of x̄ for which H_0 will be rejected. It may be calculated as follows. The significance level of .05 means the critical region for z, a standard normal random variable, is z > 1.645, since that region has probability .05. Corresponding values of x̄ can be calculated by solving

    (x̄ − μ_0) / (s/√n) = (x̄ − 2400) / (200/√50) = 1.645

for x̄. Thus

    x̄ = 2400 + 1.645 (200/√50) ≈ 2446.53

With the critical region x̄ ≥ 2446.53 the probability of a Type I error will be .05. The acceptance region is then x̄ < 2446.53, no matter what the true value of the population mean really is.
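The calculation just outlined, together with the Type II error probability and the sample size required for 90% power derived next, can be checked numerically. A minimal sketch in Python, using the same rounded table values z_.05 = 1.645 and z_.10 = 1.28 that the text uses:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

mu0, mu_a, s, n = 2400, 2475, 200, 50
z_alpha, z_beta = 1.645, 1.28  # rounded table values for alpha = .05, beta = .10

# Upper boundary of the acceptance region (critical value for x-bar):
x_b = mu0 + z_alpha * s / math.sqrt(n)
print(round(x_b, 2))           # 2446.53

# Type II error probability when the true mean is mu_a = 2475:
beta = phi((x_b - mu_a) / (s / math.sqrt(n)))
print(round(beta, 4))          # about .157, i.e. power of about .84

# Sample size for 90% power, keeping the boundary x_b fixed as in the text:
sqrt_n = z_beta * s / (mu_a - x_b)
print(math.ceil(sqrt_n ** 2))  # 81
```

The computed β of about .157 differs slightly from the text's .1587 only because the text reads .3413 from the table at z = 1.00 rather than interpolating at z = 1.007.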
However, the probability, β, of this acceptance region, which is the probability of a Type II error, will differ with various values of the true mean. For instance, for the true alternate mean μ_a = 2475 the probability of this region is

    β = P(x̄ ≤ 2446.53) = P(z ≤ (2446.53 − 2475)/(200/√50)) = P(z ≤ −1.007) = .5 − P(0 ≤ z ≤ 1.007) = .5 − .3413 = .1587

This value differs somewhat from the one in the text due to differences in reading and interpolating the values of the normal probability table. This results in a power of about 1 − .1587 ≈ .84 = 84%.

The only quantity in the expression (2446.53 − 2475)/(200/√n) that is not fixed at this point in the development is the sample size, n = 50. The upper boundary of the acceptance region, 2446.53, is fixed by the Type I error requirement of .05; the alternate true mean, 2475, is the minimum value to be distinguished; and the estimated value of the population standard deviation, 200, is the only estimate we have until other samples are taken.

Suppose a power of 90% is desired. For 90% power the probability of falsely accepting the null hypothesis, H_0: μ = 2400, is β = .1. The corresponding value z_{.1} is the solution of P(z ≤ z_{.1}) = .1, which is z_{.1} = −1.28. Then the required sample size can be calculated by solving

    (2446.53 − 2475)/(200/√n) = −1.28

for n, which results in

    √n = 1.28 × 200 / (2475 − 2446.53) ≈ 8.99 ≈ 9

Squaring yields n = 81.

This development has illustrated how the concepts of Type I and Type II error are used to specify a sample size for a survey. In general:
1. State a null and alternate hypothesis: e.g., H_0: μ = μ_0 and H_a: μ ≠ μ_0.
2. Select a level of significance, α, which will be the probability of making a Type I error, rejecting the null hypothesis when it is true.
3. Select an alternate population mean value, μ_a, and a power level, 1 − β. The alternate mean, μ_a, is to be distinguished with power 1 − β, i.e., when the alternate μ_a is the true mean the null hypothesis is to be rejected with probability 1 − β.
This is the same as making a Type II error, accepting the null when it is false, with probability β.
4. Use historical data or perform a pilot survey (or guess) to estimate the population standard deviation, s, noting the sample size, n, of the historical data or the pilot survey.
5. Calculate the boundaries (one could be infinite) of the rejection region, x̄_b. The acceptance region is then its complement.
6. Calculate the probability of a Type II error, accepting a false null hypothesis, from the acceptance region, assuming the alternate population mean is the true mean.
7. If the probability of a Type II error is greater than β, calculate a larger sample size to be used in the survey to test the null hypothesis.

This procedure, while correct, is complicated, and it really applies to only the one alternate mean value, μ_a. When it is repeated for other alternates the result is an operating characteristic curve. However, even this does not provide easily applied recommendations for sample size.

Sample Size Estimation Based on Effect Size

An alternate approach to estimating sample sizes that control both Type I and Type II errors can be based on an estimated effect size. Effect size is defined as the difference between a sample statistic, such as the mean, and the true value, divided by the standard deviation:

    effect size = (x̄ − μ_0) / s

In practice the effect size is often estimated by the largest difference realistically expected. Once an effect size has been estimated, the following table (Hair, Anderson, Tatham, Black (1998) Multivariate Data Analysis, p12) and graph can be used to estimate a sample size that satisfies controls on Type I and Type II errors.
Values of Power (1 − β)

                      Alpha (α) = .05              Alpha (α) = .01
                      Effect Size (ES)             Effect Size (ES)
    Sample size   Small (.2)  Moderate (.5)    Small (.2)  Moderate (.5)
        20          .095         .338            .025         .144
        40          .143         .598            .045         .349
        60          .192         .775            .067         .549
        80          .242         .882            .092         .709
       100          .290         .940            .120         .823
       150          .411         .990            .201         .959
       200          .516         .998            .284         .992

Here α is the acceptable probability of a Type I error (rejecting the null hypothesis when it is true) and β is the acceptable probability of a Type II error (accepting a false null hypothesis). Typical values of α = .05 and β = .2 are often used unless there is a good reason not to. Power (the values in the above table) is 1 − β; the previous recommendation results in a power value of .8 = 80%. So the above table recommends sample sizes of about 80 for α = .05 and an effect size of .5, and sample sizes considerably greater than 200 for effect sizes of .2. For α = .01 a similar sample size of 100 is needed for a moderate (.5) effect size, and much larger than 200 for small effect sizes.

A graph³ showing the relationship between sample size, effect size, and power is shown next for an effect size between the small value (.2) and the moderate value (.5).

³ Hair, Anderson, Tatham, Black (1998) Multivariate Data Analysis, p13
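The source does not say how the table's power values were computed. Under the assumption (mine, not the source's) that they describe a two-sided comparison of two groups with n observations per group, they can be approximated by the standard normal-approximation power formula, power ≈ Φ(ES·√(n/2) − z_{α/2}):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def approx_power(es, n, z_half_alpha):
    """Normal-approximation power for a two-sided, two-group comparison
    with n observations per group (an assumed reading of the table)."""
    return phi(es * math.sqrt(n / 2) - z_half_alpha)

z_05, z_01 = 1.960, 2.576  # z_{alpha/2} for alpha = .05 and alpha = .01

# Compare with a few table entries:
print(round(approx_power(0.5, 100, z_05), 3))  # table value: .940
print(round(approx_power(0.2, 200, z_05), 3))  # table value: .516
print(round(approx_power(0.5, 100, z_01), 3))  # table value: .823
```

The approximation agrees with the tabulated entries to within about .01, which supports, but does not confirm, the two-group reading; exact agreement would require knowing which test Hair et al. assumed.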