Class exercise (1)
NOTES
This classroom exercise concerns numerical illustrations of simple random, clustered (multi-stage)
and stratified sampling. It is recommended that the students work in pairs.
To begin with, from a given population we will select simple random samples of elements of
various sizes to illustrate how the distribution of sample means varies with sample size. We will
also illustrate how the variability among elements in any given sample estimates the population
variance.
1 The population
(1) In order to facilitate quick selection of many random samples, we will employ a simple, well-mixed population with known characteristics in which units with different values appear in an entirely random order, so that any arbitrary set of units can be regarded as a random sample.
Our example is artificial in two respects:
- We actually know the entire population (the Yj values vary in the range 0 to 9).
- The population is thoroughly mixed, at least in relation to values of the variable of interest, i.e. the Yj values appear in an entirely random order.
(2) Theoretically, the population parameters are:
\bar{Y} = \frac{1}{10}\sum_{j=1}^{10} Y_j = 4.500; \qquad \sigma^2 = \frac{1}{10}\sum_{j=1}^{10}\left(Y_j - 4.5\right)^2 = 8.25; \qquad \sigma = 2.87.
Our illustrative population is shown in tab “40x40 random digits 0-9”. Let us assume that these
digits represent values of some variable Y, the average value of which we are interested in
estimating. Our table of 1,600 digits is of limited size and not perfect. It compares as follows with
the theoretical values above (see tab “frequency distribution”):
Comparison of frequencies in the 1,600-digit table with the expected frequencies

Frequency distribution of digits
digit:         0    1    2    3    4    5    6    7    8    9  |   ȳ      σ²     S²
theoretical  160  160  160  160  160  160  160  160  160  160  | 4.500  8.250  8.250
actual       151  162  161  144  176  169  169  170  154  144  | 4.498  7.940  7.945
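Such a comparison is easy to reproduce; the minimal Python sketch below uses a random stand-in for the 1,600 digits (the actual tab is not reproduced here) and computes the frequency distribution, the mean, σ² and S²:

import numpy as np

rng = np.random.default_rng(0)
digits = rng.integers(0, 10, size=1600)   # stand-in for the 1,600 digits in the "40x40" tab

freq = np.bincount(digits, minlength=10)  # actual frequency of each digit 0-9
mean = digits.mean()                      # compare with the theoretical 4.500
sigma2 = digits.var(ddof=0)               # sigma^2: divisor N
S2 = digits.var(ddof=1)                   # S^2: divisor N - 1
print(freq, mean, sigma2, S2)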
2 Simple random sampling
For illustrative purposes, we construct six sets of simple random samples:
      Sample “design”              Sample size        Number of samples
(1r)  one quarter of each row      n = (1x10) = 10    160
(1c)  one quarter of each column   n = (10x1) = 10    160
(2)   each (4x4) square            n = (4x4) = 16     100
(3r)  each whole row               n = (1x40) = 40    40
(3c)  each whole column            n = (40x1) = 40    40
(4)   each (10x10) square          n = (10x10) = 100  16
In each case, the above-mentioned samples amount to a very small subset of all possible samples of the given size that can be drawn from the population. Many more can easily be created.
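As an illustration of design (1r), the sketch below (using a random stand-in for the 40x40 digit table, since the actual tab is not reproduced here) splits each row into four quarters of 10 digits and summarizes the resulting 160 sample means:

import numpy as np

rng = np.random.default_rng(0)
table = rng.integers(0, 10, size=(40, 40))   # stand-in for the "40x40 random digits 0-9" tab

# Design (1r): each quarter of a row is one sample of n = 10, giving 4 x 40 = 160 samples.
samples = table.reshape(40, 4, 10)           # (row, quarter, 10 digits)
means = samples.mean(axis=2).ravel()         # the 160 sample means

print("mean =", means.mean())
print("var  =", means.var(ddof=0))           # variance of this sampling distribution
print("StDev=", means.std(ddof=0))
print("cv   =", means.std(ddof=0) / means.mean())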
3 Clusters with different degrees of homogeneity
Here we have considered the 100 square clusters of size (4x4).
Clusters with five different degrees of homogeneity have been formed:
(1) Entirely random clusters
(2) For each set of 4 columns of digits, the first column is sorted by increasing value of Y. Such sorting is applied to the left half of the table of (40x40) digits (see the sketch after this list). The (4x4) clusters are formed in the normal way, but using the set of digits sorted as above.
(3) As above, except that the sorting is applied to the whole table.
(4) The sorting is applied to the first two columns of each set of 4 columns of digits in the left half
of the table.
(5) The above is applied to the whole table.
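A minimal sketch of scheme (2), assuming the table is available as a 40x40 NumPy array (a random stand-in is generated here; the helper names are illustrative):

import numpy as np

rng = np.random.default_rng(1)
table = rng.integers(0, 10, size=(40, 40))    # stand-in for the 40x40 digit table

def scheme2(tab):
    # In the left half of the table only, sort the first column of every
    # set of 4 columns in increasing order of Y.
    t = tab.copy()
    for c in range(0, 20, 4):                 # column sets 0-3, 4-7, ..., 16-19
        t[:, c] = np.sort(t[:, c])
    return t

def clusters_4x4(tab):
    # Cut the 40x40 table into its 100 non-overlapping (4x4) square clusters.
    return [tab[r:r+4, c:c+4] for r in range(0, 40, 4) for c in range(0, 40, 4)]

cluster_means = [c.mean() for c in clusters_4x4(scheme2(table))]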
If a simple random sample of a clusters (each of size n) were selected from a population of A
clusters, the variance of the mean for this clustered sample would be
Var(\bar{y}_A) = \frac{A - a}{A}\cdot\frac{S_A^2}{a} = (1 - f)\cdot\frac{S_A^2}{a}, \quad f = \frac{a}{A}, \quad \text{with } S_A^2 = \frac{\sum_k\left(\bar{Y}_k - \bar{Y}\right)^2}{A - 1}.
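A sketch of this calculation, assuming the A cluster means are available as an array (the function name is illustrative):

import numpy as np

def var_clustered_mean(cluster_means, a):
    # Var(ybar_A) for a simple random sample of a clusters out of the A clusters
    # whose means are supplied (all clusters of the same size n).
    cluster_means = np.asarray(cluster_means, dtype=float)
    A = len(cluster_means)
    S_A2 = cluster_means.var(ddof=1)   # sum_k (Ybar_k - Ybar)^2 / (A - 1)
    f = a / A
    return (1 - f) * S_A2 / a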
When the clusters are formed by entirely random grouping as in (1), the variance would be identical
to that for a simple random sample of elements of the same size, i.e. of (a.n) elements:
Var_0(\bar{y}) = \frac{N - a\cdot n}{N}\cdot\frac{S^2}{a\cdot n} = \frac{A - a}{A}\cdot\frac{S^2}{a\cdot n}, \quad \text{giving} \quad S_{1A}^2 \cong \frac{S^2}{n}, \quad \text{or} \quad Var_1(\bar{y}_A) \cong Var_0(\bar{y}).
In schemes (2)-(5), clusters correspond to increasingly homogeneous groupings of elements.
Greater homogeneity within clusters implies greater variability between clusters. Hence the cluster
means in populations (2)-(5) are increasingly more diverse compared to cluster means in (1).
S_{5A}^2 \ge S_{4A}^2 \ge S_{3A}^2 \ge S_{2A}^2 \ge S_{1A}^2 \cong \frac{S^2}{n}.

Generally, with n \gg 1, S_{iA}^2 \ll S^2.

Var_k(\bar{y}_A) = \frac{A - a}{A}\cdot\frac{S_{kA}^2}{a}, which gives Var_k(\bar{y}_A) > Var_{k-1}(\bar{y}_A), \; k = 1, \dots, 5.

deft_k^2 = \frac{Var_k(\bar{y}_A)}{Var_0(\bar{y})} = \frac{S_{kA}^2}{S^2/n}, \; k = 1, \dots, 5.

deft_{5A}^2 \ge deft_{4A}^2 \ge deft_{3A}^2 \ge deft_{2A}^2 \ge deft_{1A}^2 \cong 1.
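The design effect can be computed directly from these quantities; a minimal sketch (function name illustrative) comparing a sample of a clusters of size n with an SRS of a·n elements:

import numpy as np

def deft2(cluster_means, a, n, S2, N):
    # deft^2 = Var_k(ybar_A) / Var_0(ybar) for a sample of a clusters of size n,
    # against an SRS of a*n elements from the N population elements.
    cluster_means = np.asarray(cluster_means, dtype=float)
    A = len(cluster_means)
    S_kA2 = cluster_means.var(ddof=1)
    var_clustered = (1 - a / A) * S_kA2 / a
    var_srs = (1 - a * n / N) * S2 / (a * n)
    return var_clustered / var_srs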
4 Stratification
If sample selection and estimation are done separately within each stratum, the same basic expressions given above apply to each stratum. Using subscript h to refer to a particular stratum, we have, with SRS within strata:
Var(\bar{y}_h) = (1 - f_h)\cdot\frac{S_h^2}{n_h}, \quad \text{with} \quad S_h^2 = \frac{\sum_j\left(Y_{hj} - \bar{Y}_h\right)^2}{N_h - 1},
summed over Nh units in the stratum h.
In putting together the results from different strata, we often weight them in proportion to stratum size, i.e. W_h = N_h/N. For the total population \bar{Y} = \sum_h W_h\cdot\bar{Y}_h, and if the W_h are known, \bar{y} = \sum_h W_h\cdot\bar{y}_h and Var(\bar{y}) = \sum_h W_h^2\cdot Var(\bar{y}_h).
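A sketch of this stratified estimator, assuming one simple random sample per stratum and known stratum sizes (function and argument names are illustrative):

import numpy as np

def stratified_estimate(strata_samples, stratum_sizes):
    # ybar_st = sum_h W_h * ybar_h and var(ybar_st) = sum_h W_h^2 * (1 - f_h) * s_h^2 / n_h
    N = sum(stratum_sizes)
    ybar, var = 0.0, 0.0
    for y_h, N_h in zip(strata_samples, stratum_sizes):
        y_h = np.asarray(y_h, dtype=float)
        n_h, W_h, f_h = len(y_h), N_h / N, len(y_h) / N_h
        ybar += W_h * y_h.mean()
        var += W_h**2 * (1 - f_h) * y_h.var(ddof=1) / n_h
    return ybar, var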
For simplicity in our illustrations, we consider the population as divided into H strata equal in
population as well as in sample size (Wh = Nh/N = nh/n = 1/H). With this, it follows from the above
expression that with SRS within each stratum, variance can be written as
Var y  E 
1  f   h S h2 
.
.
n  H 
Comparing this with unstratified SRS, the effect of stratification is in proportion to the ratio of the average within-stratum variance \bar{S}^2 = \sum_h S_h^2/H to the unstratified value S^2.
We may decompose the total variance into variation within strata and variation among the strata means:

\sum_h\sum_j\left(Y_{hj} - \bar{Y}\right)^2 = \sum_h\sum_j\left(Y_{hj} - \bar{Y}_h\right)^2 + \sum_h\sum_j\left(\bar{Y}_h - \bar{Y}\right)^2,
or, dividing by N, we may write the above as \sigma^2 = \bar{\sigma}^2 + \Delta^2, where the first term on the right is the within-stratum component and the second the between-strata component; \Delta^2 is the mean squared deviation of the strata means from the overall mean. The proportionate reduction in variance from stratification is approximately
2
2

 hj  Yh  Y 

 hj Yhj  Y
2

2


 h N h .  Yh  Y 
 hj Yhj  Yh

2
2
  h N h .  Yh  Y 
2
.
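A sketch computing this ratio from population values grouped by stratum (function name illustrative):

import numpy as np

def prop_reduction(strata_values):
    # Between-strata sum of squares over the total sum of squares,
    # i.e. the approximate proportionate reduction in variance from stratification.
    strata_values = [np.asarray(y, dtype=float) for y in strata_values]
    grand = np.concatenate(strata_values).mean()
    within = sum(((y - y.mean())**2).sum() for y in strata_values)
    between = sum(len(y) * (y.mean() - grand)**2 for y in strata_values)
    return between / (within + between)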
The actual gain is slightly smaller due to the (generally minor) difference in the definition of \sigma and S. With H strata of equal size N/H, it is seen to be

\frac{S^2 - \bar{S}^2}{S^2} \cong \frac{\Delta^2 - \frac{H-1}{N}\cdot\bar{S}^2}{S^2}.
An important point to note is that exactly the same idea applies when we are dealing with clusters rather than elements as the sampling units in a stratified design. The above quantities then refer to the variance of cluster means. With a given stratification, the deviation between strata means, \Delta^2, is the same whether element or cluster sampling is used within strata. By contrast, S^2, or the within-stratum term \bar{S}^2, is usually much smaller for cluster means than for individual elements (as noted earlier).
Hence with cluster sampling, the relative gain from stratification is usually much more appreciable.
5 Estimating variance from the sample
In our illustrations, the average over all samples of the sample mean \bar{y} = \sum_i y_i/n is equal to the population mean \bar{Y}. We say that the expected value of the former equals the latter: E[\bar{y}] = \bar{Y}; i.e. \bar{y} provides an unbiased estimator of \bar{Y}.
Furthermore, the variability among elements in any particular sample provides a measure of that variability in the population, i.e.

\hat{\sigma}^2 = \frac{\sum_i\left(y_i - \bar{y}\right)^2}{n} \cong \frac{\sum_j\left(Y_j - \bar{Y}\right)^2}{N} = \sigma^2,

where the summation on the left is over the n elements of the sample and that on the right is over the N population elements.
Actually, for an SRS the exact relationship happens to be E[s^2] = S^2, where

s^2 = \frac{\sum_i\left(y_i - \bar{y}\right)^2}{n - 1} \quad \text{and} \quad S^2 = \frac{\sum_j\left(Y_j - \bar{Y}\right)^2}{N - 1}.

Hence for a simple random sample

E[var(\bar{y})] = Var(\bar{y}),

where var(\bar{y}) = (1 - f)\,s^2/n is estimated from the sample and Var(\bar{y}) = (1 - f)\cdot S^2/n is its population value.
This is the basis on which we can estimate the variance (a measure of variability among different
samples) from the results of a single sample that is available.
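In practice this is a one-line calculation from the selected sample; a minimal sketch for SRS (function name illustrative):

import numpy as np

def var_hat_srs(sample, N):
    # var(ybar) = (1 - f) * s^2 / n estimated from a single simple random sample.
    y = np.asarray(sample, dtype=float)
    n = len(y)
    s2 = y.var(ddof=1)           # s^2 with divisor n - 1, unbiased for S^2 under SRS
    return (1 - n / N) * s2 / n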
It is important to note that the variance computed above provides a valid estimate only for simple random sampling. For more complex designs, estimating the variance will involve more complex formulae that take into account the complexity of the design. But interestingly, an important result of sampling theory is that for many complex designs the relationship E[s^2] \cong S^2 still holds approximately.
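This can be checked numerically; the sketch below builds an artificial population of very homogeneous clusters (an assumption made only for illustration), repeatedly draws samples of whole clusters, and compares the average of s² with S²:

import numpy as np

rng = np.random.default_rng(3)
pop = np.sort(rng.integers(0, 10, size=1600)).reshape(100, 16)  # 100 homogeneous clusters of n = 16

S2 = pop.ravel().var(ddof=1)
s2_values = []
for _ in range(2000):
    picked = rng.choice(100, size=40, replace=False)    # a = 40 clusters per sample
    s2_values.append(pop[picked].ravel().var(ddof=1))
print(S2, np.mean(s2_values))   # the average of s^2 stays close to S^2 even for this clustered design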
Population characteristics (tab “frequency distribution”)
          Theoretical   Illustrative
mean=     4.500         4.498
var=      8.250         7.940
StDev=    2.870         2.818
cv=       0.638         0.626
Examples of simple random samples
(1) n=10: (10x1); (1x10)
(2) n=16: (4x4)
(3) n=40: (40x1); (1x40)
(4) n=100: (10x10)
[Worksheet tabs: statistics of the sampling distributions (mean, var, StDev, cv) computed for the unstratified designs and, for the stratified illustrations, separately for stratum (1), stratum (2) and the combined estimate (1+2).]
Complete the statistics for the sampling distributions
         (1.r)   (1.c)   (2)   (3.r)   (3.c)   (4)
mean=
var=
StDev=
cv=
[Worksheet: some columns are pre-filled, with (mean, var, StDev, cv) values of (4.50, 0.85, 0.92, 0.20), (4.50, 1.60, 1.27, 0.28) and (4.50, 2.25, 1.50, 0.33).]