Chapter 5 Stratified Random Sampling

- Advantages of stratified random sampling
- How to select a stratified random sample
- Estimating the population mean and total
- Determining sample size and allocation
- Estimating a population proportion; sample size and allocation
- Optimal rule for choosing strata

Stratified Random Sampling
The ultimate function of stratification is to organize the population into homogeneous subsets and to select an SRS of the appropriate size from each stratum.

Stratified Random Sampling
- Often-used option
  - May produce a smaller bound on the error of estimation (BOE) than an SRS of the same size
  - Cost per observation may be reduced
  - Obtain estimates of population parameters for subgroups
- Useful when the population is heterogeneous and it is possible to establish strata that are reasonably homogeneous within each stratum

Improved Sampling Designs with Auxiliary Information
- Chapter 5 Stratified Random Sampling
- Chapter 6 Ratio and Regression Estimators

Stratified Random Sampling: Notation
- \bar{y}_i : sample mean of the data from stratum i, i = 1, \dots, L
- n_i : sample size for stratum i
- N_i : number of population units in stratum i, with N = N_1 + N_2 + \dots + N_L
- s_i^2 : sample variance of the data from stratum i
- \mu_i : population mean of stratum i
- \tau_i : population total of stratum i
- Population total: \tau = \tau_1 + \tau_2 + \dots + \tau_L

Stratified Random Sampling
An SRS is taken within each stratum, so
\[ E(\bar{y}_i) = \mu_i, \qquad \hat{\tau}_i = N_i \bar{y}_i, \qquad E(\hat{\tau}_i) = E(N_i \bar{y}_i) = N_i E(\bar{y}_i) = N_i \mu_i = \tau_i . \]
Estimate the population total \tau by summing the estimates of the \tau_i:
\[ \hat{\tau} = \hat{\tau}_1 + \hat{\tau}_2 + \dots + \hat{\tau}_L, \qquad \bar{y}_{st} = \frac{\hat{\tau}}{N} . \]

Stratified Random Sampling: Estimate of Mean \mu
\[ \bar{y}_{st} = \frac{1}{N}\left[ N_1 \bar{y}_1 + N_2 \bar{y}_2 + \dots + N_L \bar{y}_L \right] = \frac{1}{N} \sum_{i=1}^{L} N_i \bar{y}_i \]
\[ \hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\left[ N_1^2 \hat{V}(\bar{y}_1) + N_2^2 \hat{V}(\bar{y}_2) + \dots + N_L^2 \hat{V}(\bar{y}_L) \right] \]
\[ = \frac{1}{N^2}\left[ N_1^2 \left(1 - \frac{n_1}{N_1}\right) \frac{s_1^2}{n_1} + N_2^2 \left(1 - \frac{n_2}{N_2}\right) \frac{s_2^2}{n_2} + \dots + N_L^2 \left(1 - \frac{n_L}{N_L}\right) \frac{s_L^2}{n_L} \right] \]
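The estimators of the mean and total can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stratified_estimates` and the two-stratum numbers are hypothetical and do not come from the slides.

```python
import math


def stratified_estimates(N, n, ybar, s):
    """Return (mean, var_mean, total, var_total) for a stratified random sample.

    N, n, ybar, s are per-stratum lists: stratum size, sample size,
    sample mean, and sample standard deviation.
    """
    N_total = sum(N)
    # tau_hat = sum of N_i * ybar_i
    total = sum(Ni * yi for Ni, yi in zip(N, ybar))
    # V_hat(tau_hat) = sum of N_i^2 * (1 - n_i/N_i) * s_i^2 / n_i
    var_total = sum(
        Ni ** 2 * (1 - ni / Ni) * si ** 2 / ni
        for Ni, ni, si in zip(N, n, s)
    )
    # The mean and its variance rescale the total by N and N^2.
    return total / N_total, var_total / N_total ** 2, total, var_total


# Hypothetical two-stratum data, chosen only so the arithmetic is easy to follow.
N = [100, 200]
n = [10, 20]
ybar = [10.0, 20.0]
s = [2.0, 4.0]

mean, var_mean, total, var_total = stratified_estimates(N, n, ybar, s)
boe_mean = 1.96 * math.sqrt(var_mean)  # bound on the error of estimation
print(mean, var_mean, boe_mean)
```

The variance of the total is computed first and then divided by N^2, which mirrors the relation between the variance of the estimated total and the variance of the estimated mean.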
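Stratified Random Sampling: Estimate of Mean \mu, BOE
\[ \text{BOE} = 1.96 \sqrt{ \frac{1}{N^2}\left[ N_1^2 \left(1 - \frac{n_1}{N_1}\right) \frac{s_1^2}{n_1} + N_2^2 \left(1 - \frac{n_2}{N_2}\right) \frac{s_2^2}{n_2} + \dots + N_L^2 \left(1 - \frac{n_L}{N_L}\right) \frac{s_L^2}{n_L} \right] } \]

Stratified Random Sampling: Estimate of Population Total \tau
\[ N \bar{y}_{st} = N_1 \bar{y}_1 + N_2 \bar{y}_2 + \dots + N_L \bar{y}_L = \sum_{i=1}^{L} N_i \bar{y}_i \]
\[ \hat{V}(N \bar{y}_{st}) = N^2 \hat{V}(\bar{y}_{st}) = N_1^2 \hat{V}(\bar{y}_1) + N_2^2 \hat{V}(\bar{y}_2) + \dots + N_L^2 \hat{V}(\bar{y}_L) \]
\[ = N_1^2 \left(1 - \frac{n_1}{N_1}\right) \frac{s_1^2}{n_1} + N_2^2 \left(1 - \frac{n_2}{N_2}\right) \frac{s_2^2}{n_2} + \dots + N_L^2 \left(1 - \frac{n_L}{N_L}\right) \frac{s_L^2}{n_L} \]

Stratified Random Sampling: BOE for Mean and Total, t distribution
- When the stratum sample sizes are small, the t distribution can be used.
- Satterthwaite degrees of freedom:
\[ df = \frac{ \left( \sum_{k=1}^{L} a_k s_k^2 \right)^2 }{ \sum_{k=1}^{L} \dfrac{ (a_k s_k^2)^2 }{ n_k - 1 } }, \qquad \text{where } a_k = \frac{ N_k (N_k - n_k) }{ n_k } \]
- BOE for \mu:
\[ t_{df} \sqrt{ \frac{1}{N^2}\left[ N_1^2 \left(1 - \frac{n_1}{N_1}\right) \frac{s_1^2}{n_1} + N_2^2 \left(1 - \frac{n_2}{N_2}\right) \frac{s_2^2}{n_2} + \dots + N_L^2 \left(1 - \frac{n_L}{N_L}\right) \frac{s_L^2}{n_L} \right] } \]
- BOE for \tau:
\[ t_{df} \sqrt{ N_1^2 \left(1 - \frac{n_1}{N_1}\right) \frac{s_1^2}{n_1} + N_2^2 \left(1 - \frac{n_2}{N_2}\right) \frac{s_2^2}{n_2} + \dots + N_L^2 \left(1 - \frac{n_L}{N_L}\right) \frac{s_L^2}{n_L} } \]

Degrees of Freedom (worksheet cont.)
Stratified random sample summary:
\[ n_1 = 20, \; n_2 = 8, \; n_3 = 12; \qquad N_1 = 155, \; N_2 = 62, \; N_3 = 93 \]
\[ a_k = \frac{ N_k (N_k - n_k) }{ n_k }: \qquad a_1 = 1046.25, \; a_2 = 418.5, \; a_3 = 627.75 \]
\[ df = \frac{ \left( 1046.25 \cdot 5.95^2 + 418.5 \cdot 15.25^2 + 627.75 \cdot 9.36^2 \right)^2 }{ \dfrac{ (1046.25 \cdot 5.95^2)^2 }{19} + \dfrac{ (418.5 \cdot 15.25^2)^2 }{7} + \dfrac{ (627.75 \cdot 9.36^2)^2 }{11} } = 21.09; \qquad t_{21.09} = 2.08 \]
(see Excel worksheet)

Compare BOE in Stratified Random Sample and SRS (worksheet cont.)
Stratified random sample summary:
\[ n = 40, \qquad \bar{y}_{st} = 27.7, \qquad \hat{V}(\bar{y}_{st}) = 1.97 \]
If the observations were from an SRS:
\[ s = 11.31, \qquad \hat{V}(\bar{y}) = \left(1 - \frac{40}{310}\right) \frac{11.31^2}{40} = 2.79 \]
The stratified random sample has more precision.

Approx. Sample Size to Estimate \mu
\[ 2 \sqrt{ V(\bar{y}_{st}) } = B \;\Rightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4} \]
Let n_i = a_i n, where a_i is the proportion of the sample taken from stratum i. Then
\[ \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left(1 - \frac{a_i n}{N_i}\right) \frac{s_i^2}{a_i n} = \frac{B^2}{4} \;\Rightarrow\; n = \frac{ \sum_{i=1}^{L} N_i^2 s_i^2 / a_i }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 }, \qquad \text{where } D = \frac{B^2}{4} \]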
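The degrees-of-freedom worksheet and the comparison with an SRS can be checked numerically. The sketch below uses the worksheet values quoted above (stratum sizes, sample sizes, stratum standard deviations 5.95, 15.25, 9.36, and the critical value 2.08); the function names are hypothetical, and the t critical value is taken from the worksheet rather than computed.

```python
import math


def satterthwaite_df(N, n, s):
    """Satterthwaite degrees of freedom with a_k = N_k (N_k - n_k) / n_k."""
    a = [Nk * (Nk - nk) / nk for Nk, nk in zip(N, n)]
    terms = [ak * sk ** 2 for ak, sk in zip(a, s)]
    numerator = sum(terms) ** 2
    denominator = sum(t ** 2 / (nk - 1) for t, nk in zip(terms, n))
    return numerator / denominator


def var_stratified_mean(N, n, s):
    """Estimated variance of the stratified mean (finite population correction included)."""
    N_total = sum(N)
    return sum(
        Nk ** 2 * (1 - nk / Nk) * sk ** 2 / nk for Nk, nk, sk in zip(N, n, s)
    ) / N_total ** 2


def var_srs_mean(N_total, n, s):
    """Estimated variance of the sample mean under simple random sampling."""
    return (1 - n / N_total) * s ** 2 / n


# Worksheet values quoted in the text.
N = [155, 62, 93]
n = [20, 8, 12]
s = [5.95, 15.25, 9.36]
T_CRIT = 2.08  # t critical value reported in the worksheet, not recomputed here

df = satterthwaite_df(N, n, s)
v_strat = var_stratified_mean(N, n, s)
v_srs = var_srs_mean(sum(N), sum(n), 11.31)
boe_mean = T_CRIT * math.sqrt(v_strat)

print(round(df, 2), round(v_strat, 2), round(v_srs, 2))
```

A t quantile function (for example from a statistics library) could replace the hard-coded critical value; it is left out here so the sketch needs only the standard library.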
Sample Size to Estimate \tau (approximate)
\[ 2 \sqrt{ V(N \bar{y}_{st}) } = B \;\Rightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4 N^2} \]
Let n_i = a_i n, where a_i is the proportion of the sample taken from stratum i. Then
\[ \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left(1 - \frac{a_i n}{N_i}\right) \frac{s_i^2}{a_i n} = \frac{B^2}{4 N^2} \;\Rightarrow\; n = \frac{ \sum_{i=1}^{L} N_i^2 s_i^2 / a_i }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 }, \qquad \text{where } D = \frac{B^2}{4 N^2} \]

Summary: Approx. Sample Size to Estimate \mu, \tau
\[ n = \frac{ \sum_{i=1}^{L} N_i^2 s_i^2 / a_i }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 } \]
- D = B^2 / 4 when estimating \mu
- D = B^2 / (4 N^2) when estimating \tau

Example: Sample Size to Estimate \mu (worksheet cont.)
Prior survey: \sigma_1 \approx 5, \sigma_2 \approx 15, \sigma_3 \approx 10; these values are used in place of the s_i. Estimate \mu to within 2 hrs with 95% confidence. The allocation proportions are a_1 = a_2 = a_3 = 1/3.
\[ D = \frac{B^2}{4} = \frac{2^2}{4} = 1; \qquad N^2 D = 310^2 = 96{,}100 \]
\[ \sum_{i=1}^{3} \frac{N_i^2 s_i^2}{a_i} = \frac{155^2 (25)}{1/3} + \frac{62^2 (225)}{1/3} + \frac{93^2 (100)}{1/3} = 6{,}991{,}275 \]
\[ \sum_{i=1}^{3} N_i s_i^2 = 155(25) + 62(225) + 93(100) = 27{,}125 \]
\[ n = \frac{6{,}991{,}275}{96{,}100 + 27{,}125} = 56.7 \approx 57, \qquad \text{so } n_1 = n_2 = n_3 = \tfrac{1}{3}(57) = 19 \]

Example: Sample Size to Estimate \tau (worksheet cont.)
Prior survey: \sigma_1 \approx 5, \sigma_2 \approx 15, \sigma_3 \approx 10. Estimate \tau to within 400 hrs with 95% confidence. The allocation proportions are a_1 = a_2 = a_3 = 1/3.
\[ D = \frac{B^2}{4 N^2} = \frac{400^2}{4 N^2} = \frac{160{,}000}{4 N^2} = \frac{40{,}000}{N^2}; \qquad N^2 D = 40{,}000 \]
\[ \sum_{i=1}^{3} \frac{N_i^2 s_i^2}{a_i} = \frac{155^2 (25)}{1/3} + \frac{62^2 (225)}{1/3} + \frac{93^2 (100)}{1/3} = 6{,}991{,}275 \]
\[ \sum_{i=1}^{3} N_i s_i^2 = 155(25) + 62(225) + 93(100) = 27{,}125 \]
\[ n = \frac{6{,}991{,}275}{40{,}000 + 27{,}125} = 104.2 \approx 105, \qquad \text{so } n_1 = n_2 = n_3 = \tfrac{1}{3}(105) = 35 \]

5.5 Allocation of the Sample
Objective: obtain estimators with small variance at the lowest cost.
Allocation is affected by 3 factors:
1. Total number of elements in each stratum
2. Variability in each stratum
3. Cost per observation in each stratum

5.5 Allocation of the Sample: Proportional Allocation
If variability and cost information for the strata are not available, proportional allocation can be used. The sample size for stratum h is
\[ n_h = n \left( \frac{N_h}{N} \right) \]
In general this is not the optimum choice for the stratum sample sizes.
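The two sample-size examples and the proportional allocation rule can be reproduced with a short sketch. The function names `sample_size` and `proportional_allocation` and the `target` argument are hypothetical conveniences; the inputs are the prior-survey values from the examples above.

```python
import math


def sample_size(N, sigma, a, B, target="mean"):
    """Approximate total sample size n for a bound B on the error of estimation.

    N: stratum sizes; sigma: planning values for the stratum standard
    deviations; a: allocation proportions (should sum to 1).
    target is "mean" (D = B^2/4) or "total" (D = B^2/(4 N^2)).
    """
    N_total = sum(N)
    if target == "mean":
        D = B ** 2 / 4
    else:
        D = B ** 2 / (4 * N_total ** 2)
    numerator = sum(Ni ** 2 * si ** 2 / ai for Ni, si, ai in zip(N, sigma, a))
    denominator = N_total ** 2 * D + sum(Ni * si ** 2 for Ni, si in zip(N, sigma))
    return numerator / denominator


def proportional_allocation(n, N):
    """Unrounded stratum sample sizes n_h = n * N_h / N."""
    N_total = sum(N)
    return [n * Nh / N_total for Nh in N]


N = [155, 62, 93]
sigma = [5, 15, 10]          # prior-survey values from the examples
a = [1 / 3, 1 / 3, 1 / 3]    # equal allocation proportions

n_mean = math.ceil(sample_size(N, sigma, a, B=2, target="mean"))
n_total = math.ceil(sample_size(N, sigma, a, B=400, target="total"))
print(n_mean, n_total)
print(proportional_allocation(n_mean, N))
```

Rounding up with `math.ceil` matches the examples, which round 56.7 to 57 and 104.2 to 105 so that the bound is not exceeded.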
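5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample: same cost/obs in each stratum
\[ \hat{V}(\bar{y}_{st}) = \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left(1 - \frac{n_i}{N_i}\right) \frac{s_i^2}{n_i} \]
\[ \min_{n_1, n_2, \dots, n_L} \hat{V}(\bar{y}_{st}) \quad \text{subject to } g(n_1, n_2, \dots, n_L) = 0, \quad \text{where } g(n_1, n_2, \dots, n_L) = n_1 + n_2 + \dots + n_L - n \]
Use Lagrange multipliers:
\[ \frac{\partial \hat{V}(\bar{y}_{st})}{\partial n_i} + \lambda \frac{\partial g}{\partial n_i} = 0, \quad i = 1, \dots, L \;\Rightarrow\; n_i = n \, \frac{ N_i s_i }{ \sum_{k=1}^{L} N_k s_k }, \quad i = 1, \dots, L \]
- n_i is directly proportional to stratum size and stratum variability.
- This method of choosing n_1, n_2, \dots, n_L is called Neyman allocation.

5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample: same cost/obs in each stratum
From the previous slide,
\[ n_i = n \, \frac{ N_i s_i }{ \sum_{k=1}^{L} N_k s_k }, \quad i = 1, \dots, L \]
Substituting a_i = N_i s_i / \sum_{k=1}^{L} N_k s_k into
\[ n = \frac{ \sum_{i=1}^{L} N_i^2 s_i^2 / a_i }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 } \]
gives
\[ n = \frac{ \left( \sum_{i=1}^{L} N_i s_i \right)^2 }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 } \]

5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample: same cost/obs in each stratum
- Worksheet 11

5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample for fixed cost C: c_i = cost/obs in stratum i
\[ \hat{V}(\bar{y}_{st}) = \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left(1 - \frac{n_i}{N_i}\right) \frac{s_i^2}{n_i} \]
\[ \min_{n_1, n_2, \dots, n_L} \hat{V}(\bar{y}_{st}) \quad \text{subject to } g(n_1, n_2, \dots, n_L) = 0, \quad \text{where } g(n_1, n_2, \dots, n_L) = c_1 n_1 + c_2 n_2 + \dots + c_L n_L - C \]
Use Lagrange multipliers:
\[ \frac{\partial \hat{V}(\bar{y}_{st})}{\partial n_i} + \lambda \frac{\partial g}{\partial n_i} = 0, \quad i = 1, \dots, L \;\Rightarrow\; n_i = n \, \frac{ N_i s_i / \sqrt{c_i} }{ \sum_{k=1}^{L} N_k s_k / \sqrt{c_k} }, \quad i = 1, \dots, L \]
- n_i is directly proportional to stratum size and stratum variability.
- n_i is inversely proportional to the square root of the stratum cost per observation.

5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample: c_i = cost/obs in stratum i
From the previous slide,
\[ n_i = n \, \frac{ N_i s_i / \sqrt{c_i} }{ \sum_{k=1}^{L} N_k s_k / \sqrt{c_k} }, \quad i = 1, \dots, L \]
Substituting a_i = (N_i s_i / \sqrt{c_i}) / \sum_{k=1}^{L} (N_k s_k / \sqrt{c_k}) into
\[ n = \frac{ \sum_{i=1}^{L} N_i^2 s_i^2 / a_i }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 } \]
gives
\[ n = \frac{ \left( \sum_{k=1}^{L} N_k s_k / \sqrt{c_k} \right) \left( \sum_{i=1}^{L} N_i s_i \sqrt{c_i} \right) }{ N^2 D + \sum_{i=1}^{L} N_i s_i^2 } \]

5.5 Optimal {min V(\bar{y}_{st})} allocation of the sample: c_i = cost/obs in each stratum
- Worksheet 12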
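The two optimal allocation rules differ only in the cost weights, so one sketch can cover both. The stratum sizes and standard deviations below are the prior-survey values from the sample-size examples; the per-observation costs `[9, 9, 16]` are hypothetical values chosen for illustration, as are the function names.

```python
import math


def optimal_fractions(N, s, c=None):
    """Allocation fractions a_i proportional to N_i * s_i / sqrt(c_i).

    With c omitted (equal costs) this reduces to Neyman allocation.
    """
    if c is None:
        c = [1.0] * len(N)
    weights = [Ni * si / math.sqrt(ci) for Ni, si, ci in zip(N, s, c)]
    total = sum(weights)
    return [w / total for w in weights]


def optimal_sample_size(N, s, D, c=None):
    """Total n under optimal allocation for a given D (B^2/4 or B^2/(4 N^2))."""
    if c is None:
        c = [1.0] * len(N)
    N_total = sum(N)
    sum_over_sqrt_c = sum(Ni * si / math.sqrt(ci) for Ni, si, ci in zip(N, s, c))
    sum_times_sqrt_c = sum(Ni * si * math.sqrt(ci) for Ni, si, ci in zip(N, s, c))
    denominator = N_total ** 2 * D + sum(Ni * si ** 2 for Ni, si in zip(N, s))
    return sum_over_sqrt_c * sum_times_sqrt_c / denominator


N = [155, 62, 93]
s = [5, 15, 10]      # prior-survey values from the sample-size examples
costs = [9, 9, 16]   # hypothetical cost per observation in each stratum
D = 2 ** 2 / 4       # bound B = 2 on the mean

neyman = optimal_fractions(N, s)
with_costs = optimal_fractions(N, s, costs)
n_neyman = optimal_sample_size(N, s, D)
n_costs = optimal_sample_size(N, s, D, costs)
print(neyman, math.ceil(n_neyman))
print(with_costs, math.ceil(n_costs))
```

Under these assumed costs the third stratum, which is the most expensive to sample, receives a smaller share than under Neyman allocation, which is the behaviour the cost-weighted rule is meant to produce.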
