Chapter 7 Systematic Sampling

Selection of every kth case from a list of possible subjects.

Definition: A sample obtained by randomly selecting 1 element from among the first k elements in the frame and every kth element thereafter is called a 1-in-k systematic sample with a random start. (Assumes the population is randomly ordered.)

Does each element in the frame have an equal chance to be selected? Yes.
If so, what is this equal chance? 1/k.
Is this a simple random sample? No!

Example: N = 100, want n = 20

1. k = N/n = 5.
2. Select a random number between 1 and 5: choose 4.
3. Start with #4 and select every 5th item: 4, 9, 14, …, 94, 99.

[Figure: the frame, numbered 1 to 100, with the selected items marked.]

There are actually only 5 distinct systematic random samples:

1. {1, 6, 11, …, 91, 96}
2. {2, 7, 12, …, 92, 97}
3. {3, 8, 13, …, 93, 98}
4. {4, 9, 14, …, 94, 99}
5. {5, 10, 15, …, 95, 100}

We are simply choosing 1 of these 5 groups at random. This is why a systematic sample is not a simple random sample: every other subset of 20 elements has no chance of being selected.

Advantages

– Easier to perform in the field, especially if a good frame is not available.
– Frequently provides more information per unit cost than simple random sampling, in the sense of smaller variances.

Example. A systematic sample is drawn from a batch of produced computer chips. The first 400 chips are fine but, due to a fault in the machine later in the production process, the last 300 chips are defective. Systematic sampling selects uniformly over the non-defective and defective items and gives a very accurate estimate of the fraction of defective items.

Value of k?

In general, for a systematic random sample of n elements from a population (or frame) of size N, choose k ≤ N/n. Rounding k down rather than up keeps the sample size at or above n.

Example: From a population of 90,000 students we desire a sample of 12,000 students. Since 90,000/12,000 = 7.5, we can select a 1-in-7 systematic sample, which contains about 12,857 students.

Value of k when N unknown?
We must guess the value of k needed to achieve a sample size n. If the guessed k turns out to be too large, so that the sample falls short of n, in some cases we can go back and select another 1-in-k sample until the sample size n is attained.

Estimation

Parameters of interest that we typically want to estimate:

– population mean $\mu$
– population total $\tau$
– population proportion $p$

Example question: How many people at this rally?

Estimation of population mean

Estimator of the population mean $\mu$:

\[ \hat{\mu} = \bar{y}_{sy} = \frac{1}{n} \sum_{i=1}^{n} y_i \]

where the subscript sy signifies that systematic sampling was used.

Estimated variance of $\bar{y}_{sy}$:

\[ \hat{V}(\bar{y}_{sy}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n} \]

assuming a randomly ordered population.

This is identical to the estimated variance of $\bar{y}$ obtained by using simple random sampling. This does NOT imply that $V(\bar{y}) = V(\bar{y}_{sy})$. The true variances are

\[ V(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{\sigma^2}{n}, \qquad V(\bar{y}_{sy}) = \frac{\sigma^2}{n} \left[ 1 + (n-1)\rho \right] \]

where $-\frac{1}{n-1} \le \rho \le 1$ is a measure of the correlation between pairs of elements within the same systematic sample.

1. If ρ is close to 0, and N is fairly large, systematic sampling is roughly equivalent to simple random sampling.
2. If ρ is close to 1, then the elements within the sample are quite similar with respect to the characteristic being measured, and systematic sampling will yield a higher variance of the sample mean than will simple random sampling.
3. If the elements in the systematic sample tend to be very different, then ρ is negative and systematic sampling may be more precise than simple random sampling.

Summary: comparison of systematic and simple random sampling

1. Random order (ρ close to 0): systematic and simple random sampling are approximately equal in precision.
2. Cyclic pattern in the y's (ρ > 0, e.g. when k matches the period of the cycle): systematic sampling is worse than simple random sampling.
3. Increasing or decreasing order in the y's (ρ < 0): systematic sampling is better than simple random sampling.
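The three cases in the summary can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names and the three test populations are made up for this example, N is assumed to be a multiple of k, and the simple random sampling variance is computed as (1 - n/N) S²/n with S² using the divisor N - 1. Because a 1-in-k systematic sample has only k possible outcomes, the variance of its mean can be computed exactly by enumerating all k starts.

```python
import random
import statistics


def systematic_samples(y, k):
    """Return the k possible 1-in-k systematic samples of the list y."""
    return [y[start::k] for start in range(k)]


def var_systematic_mean(y, k):
    """Variance of the systematic-sample mean over the k equally likely starts.

    Assumes len(y) is a multiple of k, so every sample has the same size.
    """
    mu = statistics.fmean(y)
    means = [statistics.fmean(s) for s in systematic_samples(y, k)]
    return sum((m - mu) ** 2 for m in means) / k


def var_srs_mean(y, n):
    """Variance of the SRS mean: (1 - n/N) * S^2 / n, with S^2 using divisor N - 1."""
    N = len(y)
    return (1 - n / N) * statistics.variance(y) / n


def average_var_shuffled(y, k, reps=2000, seed=0):
    """Average var_systematic_mean over `reps` random orderings of y."""
    rng = random.Random(seed)
    values = list(y)
    total = 0.0
    for _ in range(reps):
        rng.shuffle(values)
        total += var_systematic_mean(values, k)
    return total / reps


if __name__ == "__main__":
    N, n, k = 100, 20, 5
    ordered = list(range(1, N + 1))     # increasing y's
    cyclic = [i % k for i in range(N)]  # cycle whose period equals k
    print("ordered:", var_systematic_mean(ordered, k), "vs SRS", var_srs_mean(ordered, n))
    print("cyclic: ", var_systematic_mean(cyclic, k), "vs SRS", var_srs_mean(cyclic, n))
    print("random: ", average_var_shuffled(ordered, k), "vs SRS", var_srs_mean(ordered, n))
```

For the ordered population the systematic variance should come out well below the simple random sampling value, for the cyclic population well above it, and averaged over many random orderings the two should roughly agree.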
Estimation of population total

Estimator of the population total $\tau$:

\[ \hat{\tau} = N \bar{y}_{sy} = \frac{N}{n} \sum_{i=1}^{n} y_i \]

where the subscript sy signifies that systematic sampling was used.

Estimated variance of $\hat{\tau}$:

\[ \hat{V}(\hat{\tau}) = N^2 \hat{V}(\bar{y}_{sy}) = N^2 \left(1 - \frac{n}{N}\right) \frac{s^2}{n} \]

assuming a randomly ordered population.

Estimation of population proportion p

Let

\[ y_i = \begin{cases} 1, & \text{if the } i\text{th element has the characteristic of interest} \\ 0, & \text{if the } i\text{th element does not have the characteristic of interest} \end{cases} \]

Estimator of the population proportion p:

\[ \hat{p}_{sy} = \bar{y}_{sy} = \frac{1}{n} \sum_{i=1}^{n} y_i \]

where the subscript sy signifies that systematic sampling was used.

Estimated variance of $\hat{p}_{sy}$:

\[ \hat{V}(\hat{p}_{sy}) = \left(1 - \frac{n}{N}\right) \frac{\hat{p}_{sy}\,\hat{q}_{sy}}{n - 1} \]

where $\hat{q}_{sy} = 1 - \hat{p}_{sy}$, assuming a randomly ordered population.

Required Sample Size for Bound B

Sample size for $\mu$:

\[ n = \frac{N \sigma^2}{(N-1)D + \sigma^2}, \quad \text{where } D = \frac{B^2}{4} \]

Sample size for p:

\[ n = \frac{N p q}{(N-1)D + p q}, \quad \text{where } q = 1 - p \text{ and } D = \frac{B^2}{4} \]
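The estimators and sample-size formulas above translate directly into code. The following is a minimal sketch, not a reference implementation: the function names and the numbers in the demo are hypothetical, the variance estimates rely on the same randomly-ordered-population assumption as the formulas, and rounding the required sample size up to the next integer is an added convention that the formulas themselves do not state.

```python
import math
import statistics


def estimate_mean(sample, N):
    """Return (y_bar_sy, estimated variance) for a systematic sample from N units."""
    n = len(sample)
    y_bar = statistics.fmean(sample)
    # (1 - n/N) * s^2 / n, where s^2 is the sample variance (divisor n - 1)
    var_hat = (1 - n / N) * statistics.variance(sample) / n
    return y_bar, var_hat


def estimate_total(sample, N):
    """Return (tau_hat, estimated variance): N * y_bar_sy and N^2 * V_hat(y_bar_sy)."""
    y_bar, var_hat = estimate_mean(sample, N)
    return N * y_bar, N ** 2 * var_hat


def estimate_proportion(sample, N):
    """Return (p_hat_sy, estimated variance) for a sample of 0/1 indicators."""
    n = len(sample)
    p_hat = sum(sample) / n
    var_hat = (1 - n / N) * p_hat * (1 - p_hat) / (n - 1)
    return p_hat, var_hat


def sample_size_mean(N, sigma2, B):
    """Sample size to estimate mu with bound B, given a guess sigma2 for sigma^2."""
    D = B ** 2 / 4
    return math.ceil(N * sigma2 / ((N - 1) * D + sigma2))  # rounding up is an assumption


def sample_size_proportion(N, p, B):
    """Sample size to estimate p with bound B, given a guess p for the proportion."""
    D = B ** 2 / 4
    q = 1 - p
    return math.ceil(N * p * q / ((N - 1) * D + p * q))  # rounding up is an assumption


if __name__ == "__main__":
    # Made-up data: a 1-in-10 systematic sample of 4 measurements from N = 40 units.
    print(estimate_mean([2, 4, 6, 8], N=40))
    print(estimate_total([2, 4, 6, 8], N=40))
    # Made-up 0/1 data: 12 of 20 sampled units have the characteristic, N = 100.
    print(estimate_proportion([1] * 12 + [0] * 8, N=100))
    print(sample_size_mean(N=1000, sigma2=100, B=2))
    print(sample_size_proportion(N=2000, p=0.5, B=0.05))
```

When no prior guess for p is available, using p = 0.5 gives the largest value of pq and therefore a conservative sample size.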