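Chapter 5 Stratified Random Sampling

– Advantages of stratified random sampling
– How to select stratified random sample
– Estimating population mean and total
– Determining sample size, allocation
– Estimating population proportion; sample size and allocation
– Optimal rule for choosing strata

Stratified Random Sampling

The ultimate function of stratification is to organize the population into homogeneous subsets and to select an SRS of the appropriate size from each stratum.

Stratified Random Sampling

Often-used option:
– May produce smaller bound on the error of estimation (BOE) than SRS of same size
– Cost per observation may be reduced
– Obtain estimates of population parameters for subgroups

Useful when the population is heterogeneous and it is possible to establish strata that are reasonably homogeneous within each stratum.

Improved Sampling Designs with Auxiliary Information

– Chapter 5 Stratified Random Sampling
– Chapter 6 Ratio and Regression Estimators

Stratified Random Sampling: Notation

– $\bar{y}_i$: sample mean of data from stratum $i$, $i = 1, \ldots, L$
– $s_i^2$: sample variance of data from stratum $i$
– $n_i$: sample size for stratum $i$
– $N_i$: number of population elements in stratum $i$, with $N = N_1 + N_2 + \cdots + N_L$
– $\mu_i$: population mean of stratum $i$
– $\tau_i$: population total of stratum $i$
– Population total: $\tau = \tau_1 + \tau_2 + \cdots + \tau_L$

Stratified Random Sampling

SRS within each stratum, so:

$$E(\bar{y}_i) = \mu_i$$

$$\hat{\tau}_i = N_i \bar{y}_i; \qquad E(\hat{\tau}_i) = E(N_i \bar{y}_i) = N_i E(\bar{y}_i) = N_i \mu_i = \tau_i$$

Estimate population total $\tau$ by summing estimates of $\tau_i$:

$$\hat{\tau} = \hat{\tau}_1 + \hat{\tau}_2 + \cdots + \hat{\tau}_L, \qquad \bar{y}_{st} = \frac{\hat{\tau}}{N}$$

Stratified Random Sampling: Estimate of Mean $\mu$, BOE

$$\bar{y}_{st} = \frac{1}{N}\left[N_1 \bar{y}_1 + N_2 \bar{y}_2 + \cdots + N_L \bar{y}_L\right] = \frac{1}{N}\sum_{i=1}^{L} N_i \bar{y}_i$$

$$\hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\left[N_1^2 \hat{V}(\bar{y}_1) + N_2^2 \hat{V}(\bar{y}_2) + \cdots + N_L^2 \hat{V}(\bar{y}_L)\right]$$

$$= \frac{1}{N^2}\left[N_1^2\left(1 - \frac{n_1}{N_1}\right)\frac{s_1^2}{n_1} + N_2^2\left(1 - \frac{n_2}{N_2}\right)\frac{s_2^2}{n_2} + \cdots + N_L^2\left(1 - \frac{n_L}{N_L}\right)\frac{s_L^2}{n_L}\right]$$

$$\text{BOE} = 1.96\sqrt{\frac{1}{N^2}\left[N_1^2\left(1 - \frac{n_1}{N_1}\right)\frac{s_1^2}{n_1} + N_2^2\left(1 - \frac{n_2}{N_2}\right)\frac{s_2^2}{n_2} + \cdots + N_L^2\left(1 - \frac{n_L}{N_L}\right)\frac{s_L^2}{n_L}\right]}$$

Stratified Random Sampling: Estimate of Population Total $\tau$

$$N\bar{y}_{st} = N_1 \bar{y}_1 + N_2 \bar{y}_2 + \cdots + N_L \bar{y}_L = \sum_{i=1}^{L} N_i \bar{y}_i$$

$$\hat{V}(N\bar{y}_{st}) = N^2 \hat{V}(\bar{y}_{st}) = N_1^2 \hat{V}(\bar{y}_1) + N_2^2 \hat{V}(\bar{y}_2) + \cdots + N_L^2 \hat{V}(\bar{y}_L)$$

$$= N_1^2\left(1 - \frac{n_1}{N_1}\right)\frac{s_1^2}{n_1} + N_2^2\left(1 - \frac{n_2}{N_2}\right)\frac{s_2^2}{n_2} + \cdots + N_L^2\left(1 - \frac{n_L}{N_L}\right)\frac{s_L^2}{n_L}$$

Stratified Random Sampling: BOE for Mean $\mu$ and Total $\tau$, t distribution

When stratum sample sizes are small, can use the t distribution with Satterthwaite degrees of freedom:

$$df = \frac{\left(\sum_{k=1}^{L} a_k s_k^2\right)^2}{\sum_{k=1}^{L} \dfrac{\left(a_k s_k^2\right)^2}{n_k - 1}}, \qquad \text{where } a_k = \frac{N_k (N_k - n_k)}{n_k}$$

BOE for $\mu$:

$$t_{df}\sqrt{\frac{1}{N^2}\left[N_1^2\left(1 - \frac{n_1}{N_1}\right)\frac{s_1^2}{n_1} + N_2^2\left(1 - \frac{n_2}{N_2}\right)\frac{s_2^2}{n_2} + \cdots + N_L^2\left(1 - \frac{n_L}{N_L}\right)\frac{s_L^2}{n_L}\right]}$$

BOE for $\tau$:

$$t_{df}\sqrt{N_1^2\left(1 - \frac{n_1}{N_1}\right)\frac{s_1^2}{n_1} + N_2^2\left(1 - \frac{n_2}{N_2}\right)\frac{s_2^2}{n_2} + \cdots + N_L^2\left(1 - \frac{n_L}{N_L}\right)\frac{s_L^2}{n_L}}$$

Degrees of Freedom (worksheet cont.)

Stratified Random Sample Summary:

$$a_k = \frac{N_k (N_k - n_k)}{n_k}$$

$n_1 = 20,\ n_2 = 8,\ n_3 = 12$
$N_1 = 155,\ N_2 = 62,\ N_3 = 93$
$a_1 = 1046.25,\ a_2 = 418.5,\ a_3 = 627.75$

$$df = \frac{\left(1046.25 \cdot 5.95^2 + 418.5 \cdot 15.25^2 + 627.75 \cdot 9.36^2\right)^2}{\dfrac{\left(1046.25 \cdot 5.95^2\right)^2}{19} + \dfrac{\left(418.5 \cdot 15.25^2\right)^2}{7} + \dfrac{\left(627.75 \cdot 9.36^2\right)^2}{11}} = 21.09; \qquad t_{21.09} = 2.08$$

(see Excel worksheet)

Compare BOE in Stratified Random Sample and SRS (worksheet cont.)

Stratified Random Sample Summary:

$$n = 40, \quad \bar{y}_{st} = 27.7; \quad \hat{V}(\bar{y}_{st}) = 1.97$$

If observations were from SRS:

$$s = 11.31, \qquad \hat{V}(\bar{y}) = \left(1 - \frac{40}{310}\right)\frac{11.31^2}{40} = 2.79$$

Since $1.97 < 2.79$, the stratified random sample has more precision.

Approx. Sample Size to Estimate $\mu$

$$2\sqrt{V(\bar{y}_{st})} = B \;\Rightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4}$$

Let $n_i = a_i n$, where $a_i$ = proportion of sample from stratum $i$. Then

$$\frac{1}{N^2}\sum_{i=1}^{L} N_i^2\left(1 - \frac{a_i n}{N_i}\right)\frac{s_i^2}{a_i n} = \frac{B^2}{4}$$

$$\Rightarrow\; n = \frac{\sum_{i=1}^{L} N_i^2 s_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}, \qquad \text{where } D = \frac{B^2}{4}$$

Approx. Sample Size to Estimate $\tau$

$$2\sqrt{V(N\bar{y}_{st})} = B \;\Rightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4N^2}$$

Let $n_i = a_i n$, where $a_i$ = proportion of sample from stratum $i$. Then

$$\frac{1}{N^2}\sum_{i=1}^{L} N_i^2\left(1 - \frac{a_i n}{N_i}\right)\frac{s_i^2}{a_i n} = \frac{B^2}{4N^2}$$

$$\Rightarrow\; n = \frac{\sum_{i=1}^{L} N_i^2 s_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}, \qquad \text{where } D = \frac{B^2}{4N^2}$$

Summary: Approx. Sample Size to Estimate $\mu$, $\tau$

$$n = \frac{\sum_{i=1}^{L} N_i^2 s_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}$$

– $D = \dfrac{B^2}{4}$ when estimating $\mu$
– $D = \dfrac{B^2}{4N^2}$ when estimating $\tau$

Example: Sample Size to Estimate $\mu$ (worksheet cont.)

Prior survey: $\sigma_1 \approx 5,\ \sigma_2 \approx 15,\ \sigma_3 \approx 10$ (used in place of the $s_i$). Estimate $\mu$ to within 2 hrs with 95% conf.; allocation proportions are $a_1 = a_2 = a_3 = 1/3$.

$$D = \frac{B^2}{4} = \frac{2^2}{4} = 1; \qquad N^2 D = 310^2 = 96{,}100$$

$$\sum_{i=1}^{3} \frac{N_i^2 s_i^2}{a_i} = \frac{155^2 (25)}{1/3} + \frac{62^2 (225)}{1/3} + \frac{93^2 (100)}{1/3} = 6{,}991{,}275$$

$$\sum_{i=1}^{3} N_i s_i^2 = 155(25) + 62(225) + 93(100) = 27{,}125$$

$$n = \frac{6{,}991{,}275}{96{,}100 + 27{,}125} = 56.7, \text{ rounded up to } 57, \qquad \text{so } n_1 = n_2 = n_3 = \tfrac{1}{3}(57) = 19$$

Example: Sample Size to Estimate $\tau$ (worksheet cont.)

Prior survey: $\sigma_1 \approx 5,\ \sigma_2 \approx 15,\ \sigma_3 \approx 10$ (used in place of the $s_i$). Estimate $\tau$ to within 400 hrs with 95% conf.; allocation proportions are $a_1 = a_2 = a_3 = 1/3$.

$$D = \frac{B^2}{4N^2} = \frac{400^2}{4N^2} = \frac{160{,}000}{4N^2} = \frac{40{,}000}{N^2}; \qquad N^2 D = 40{,}000$$

$$\sum_{i=1}^{3} \frac{N_i^2 s_i^2}{a_i} = \frac{155^2 (25)}{1/3} + \frac{62^2 (225)}{1/3} + \frac{93^2 (100)}{1/3} = 6{,}991{,}275$$

$$\sum_{i=1}^{3} N_i s_i^2 = 155(25) + 62(225) + 93(100) = 27{,}125$$

$$n = \frac{6{,}991{,}275}{40{,}000 + 27{,}125} = 104.2, \text{ rounded up to } 105, \qquad \text{so } n_1 = n_2 = n_3 = \tfrac{1}{3}(105) = 35$$

5.5 Allocation of the Sample

Objective: obtain estimators with small variance at lowest cost.

Allocation affected by 3 factors:
1. Total number of elements in each stratum
2. Variability in each stratum
3. Cost per observation in each stratum

5.5 Allocation of the Sample: Proportional Allocation

If variability and cost information for the strata is not available, can use proportional allocation. Sample size for stratum $h$:

$$n_h = n\,\frac{N_h}{N}$$

In general this is not the optimum choice for the stratum sample sizes.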
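The two sample-size examples and the proportional allocation rule above can be checked numerically. The following is a minimal Python sketch, not part of the original worksheet: the function names `required_sample_size` and `proportional_allocation` and the `target` argument are illustrative assumptions, and the inputs are the stratum sizes and prior standard deviations quoted in the examples.

```python
import math


def required_sample_size(N_i, s_i, a_i, B, target="mean"):
    """Approximate total sample size n for a bound B on the error of estimation.

    N_i: stratum population sizes
    s_i: stratum standard deviations (planning values)
    a_i: fraction of the sample assigned to each stratum (sums to 1)
    target: "mean" uses D = B^2/4, "total" uses D = B^2/(4 N^2)
    """
    N = sum(N_i)
    if target == "mean":
        D = B**2 / 4
    elif target == "total":
        D = B**2 / (4 * N**2)
    else:
        raise ValueError("target must be 'mean' or 'total'")
    numerator = sum(Nk**2 * sk**2 / ak for Nk, sk, ak in zip(N_i, s_i, a_i))
    denominator = N**2 * D + sum(Nk * sk**2 for Nk, sk in zip(N_i, s_i))
    return numerator / denominator


def proportional_allocation(N_i, n):
    """Stratum sample sizes n_h = n * N_h / N (not rounded)."""
    N = sum(N_i)
    return [n * Nh / N for Nh in N_i]


# Values from the worked examples: three strata, equal allocation fractions.
N_i = [155, 62, 93]
sigma_i = [5, 15, 10]
equal = [1 / 3, 1 / 3, 1 / 3]

n_mean = required_sample_size(N_i, sigma_i, equal, B=2, target="mean")
n_total = required_sample_size(N_i, sigma_i, equal, B=400, target="total")

print(math.ceil(n_mean))    # 57
print(math.ceil(n_total))   # 105
print(proportional_allocation(N_i, 40))
```

With n = 40, proportional allocation gives stratum sizes of 20, 8 and 12, which are the stratum sample sizes listed in the degrees-of-freedom worksheet summary.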
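5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample: same cost/obs in each stratum

$$\hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{i=1}^{L} N_i^2\left(1 - \frac{n_i}{N_i}\right)\frac{s_i^2}{n_i}$$

$$\min_{n_1, n_2, \ldots, n_L} \hat{V}(\bar{y}_{st}), \quad \text{subject to } g(n_1, n_2, \ldots, n_L) = 0, \quad \text{where } g(n_1, n_2, \ldots, n_L) = n_1 + n_2 + \cdots + n_L - n$$

Use Lagrange multipliers:

$$\frac{\partial \hat{V}(\bar{y}_{st})}{\partial n_i} + \lambda\,\frac{\partial g}{\partial n_i} = 0, \quad i = 1, \ldots, L$$

Since $\partial \hat{V}(\bar{y}_{st}) / \partial n_i = -N_i^2 s_i^2 / (N^2 n_i^2)$ and $\partial g / \partial n_i = 1$, each $n_i$ is proportional to $N_i s_i$; the constraint $\sum n_i = n$ then gives

$$n_i = n\,\frac{N_i s_i}{\sum_{k=1}^{L} N_k s_k}, \quad i = 1, \ldots, L$$

– Directly proportional to stratum size and stratum variability
– This method of choosing $n_1, n_2, \ldots, n_L$ is called Neyman allocation

5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample: same cost/obs in each stratum

From the previous slide:

$$n_i = n\,\frac{N_i s_i}{\sum_{k=1}^{L} N_k s_k}, \quad i = 1, \ldots, L$$

Substituting $a_i = n_i / n$ into the sample size formula

$$n = \frac{\sum_{i=1}^{L} N_i^2 s_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}$$

gives

$$n = \frac{\left(\sum_{i=1}^{L} N_i s_i\right)^2}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}$$

5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample: same cost/obs in each stratum

Worksheet 11

5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample for fixed cost $C$: $c_i$ = cost/obs in stratum $i$

$$\hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{i=1}^{L} N_i^2\left(1 - \frac{n_i}{N_i}\right)\frac{s_i^2}{n_i}$$

$$\min_{n_1, n_2, \ldots, n_L} \hat{V}(\bar{y}_{st}), \quad \text{subject to } g(n_1, n_2, \ldots, n_L) = 0, \quad \text{where } g(n_1, n_2, \ldots, n_L) = c_1 n_1 + c_2 n_2 + \cdots + c_L n_L - C$$

Use Lagrange multipliers:

$$\frac{\partial \hat{V}(\bar{y}_{st})}{\partial n_i} + \lambda\,\frac{\partial g}{\partial n_i} = 0, \quad i = 1, \ldots, L$$

Here $\partial g / \partial n_i = c_i$, so each $n_i$ is proportional to $N_i s_i / \sqrt{c_i}$:

$$n_i = n\,\frac{N_i s_i / \sqrt{c_i}}{\sum_{k=1}^{L} N_k s_k / \sqrt{c_k}}, \quad i = 1, \ldots, L$$

– Directly proportional to stratum size and stratum variability
– Inversely proportional to the square root of stratum cost/obs

5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample: $c_i$ = cost/obs in stratum $i$

From the previous slide:

$$n_i = n\,\frac{N_i s_i / \sqrt{c_i}}{\sum_{k=1}^{L} N_k s_k / \sqrt{c_k}}, \quad i = 1, \ldots, L$$

Substituting $a_i = n_i / n$ into the sample size formula

$$n = \frac{\sum_{i=1}^{L} N_i^2 s_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}$$

gives

$$n = \frac{\left(\sum_{k=1}^{L} N_k s_k / \sqrt{c_k}\right)\left(\sum_{i=1}^{L} N_i s_i \sqrt{c_i}\right)}{N^2 D + \sum_{i=1}^{L} N_i s_i^2}$$

5.5 Optimal $\{\min V(\bar{y}_{st})\}$ allocation of the sample: $c_i$ = cost/obs in each stratum

Worksheet 12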
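The optimal allocation rules above reduce to one computation: weight each stratum by $N_i s_i / \sqrt{c_i}$, normalize to get the fractions $a_i$, then plug those into the general sample size formula. The sketch below illustrates this in Python under stated assumptions. It is not a reproduction of Worksheets 11 and 12, whose data are not shown here. The function names are illustrative, the stratum sizes and standard deviations are borrowed from the earlier sample-size example, and the per-observation costs in `costs` are hypothetical values chosen only for demonstration.

```python
import math


def allocation_fractions(N_i, s_i, c_i=None):
    """Optimal allocation fractions a_i = n_i / n.

    Weights are N_i * s_i / sqrt(c_i). With c_i=None every stratum has the
    same cost per observation, which reduces to Neyman allocation.
    """
    if c_i is None:
        c_i = [1.0] * len(N_i)
    weights = [Nk * sk / math.sqrt(ck) for Nk, sk, ck in zip(N_i, s_i, c_i)]
    total = sum(weights)
    return [w / total for w in weights]


def total_sample_size(N_i, s_i, a_i, D):
    """General formula n = sum(N_i^2 s_i^2 / a_i) / (N^2 D + sum(N_i s_i^2))."""
    N = sum(N_i)
    numerator = sum(Nk**2 * sk**2 / ak for Nk, sk, ak in zip(N_i, s_i, a_i))
    denominator = N**2 * D + sum(Nk * sk**2 for Nk, sk in zip(N_i, s_i))
    return numerator / denominator


N_i = [155, 62, 93]      # stratum sizes from the earlier example
s_i = [5, 15, 10]        # prior standard deviations from the earlier example
costs = [9, 9, 16]       # hypothetical cost per observation in each stratum
D = 2**2 / 4             # bound B = 2 on the mean, so D = B^2 / 4

a_neyman = allocation_fractions(N_i, s_i)
n_neyman = total_sample_size(N_i, s_i, a_neyman, D)

a_cost = allocation_fractions(N_i, s_i, costs)
n_cost = total_sample_size(N_i, s_i, a_cost, D)

for label, a, n in (("Neyman", a_neyman, n_neyman), ("with costs", a_cost, n_cost)):
    n_up = math.ceil(n)  # round the total up, then split it by the fractions
    print(label, n_up, [round(ak * n_up, 1) for ak in a])
```

The stratum sizes printed here are left unrounded to one decimal place; in practice each $n_i$ would be rounded to an integer, and the rounding rule is a design choice the slides do not specify.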