PROPOSED SAMPLE DESIGN FOR AGRICULTURAL FINANCE MARKET SCOPING SURVEY IN TANZANIA (AGFIMS SURVEY, 2011)
(SAMPLE FOR NATIONAL ESTIMATES)

Prepared by: National Bureau of Statistics, Dar es Salaam
MARCH, 2011

Table of Contents
1. Introduction
2. Sampling Frame
3. Stratification of the Sampling Frame
4. Sampling Design and Selection
5. Sample Size and Allocation
6. The Distribution of Rural / Urban EAs in Agfims Survey
7. Estimation procedure

1. Introduction

The Financial Sector Deepening Trust (FSDT) aims to conduct a national survey of Agricultural Finance Market Scoping in Tanzania (AGFIMS survey, 2011). Joint steering and technical committees oversee the project implementation. This will be the first national AGFIMS survey to be conducted in Tanzania, and it will report on male, female, zonal, urban and rural domains to allow for national and zonal analysis.
Since the AGFIMS survey will form the baseline data, it will be important to ensure the following:
- A detailed and robust, nationally representative survey design;
- Extensive technical support and guidance provided to the research firm and the AGFIMS Steering and Technical Committees by the National Bureau of Statistics and the Office of the Chief Government Statistician (NBS/OCGS) throughout the survey process;
- Quality control at all stages of the project;
- Demographic and key questions in the AGFIMS survey harmonized with other existing household-based surveys by NBS/OCGS.

The proposed sample design for Tanzania's 2011 AGFIMS survey will provide estimates at zonal, urban / rural, Mainland / Zanzibar, and male / female levels. Note that estimates for the rural and urban domains will only be feasible at national level. AgriFim's main objective is to boost the supply of finance to the agricultural sector and to enhance access to agricultural finance through market-leading innovation and policy change.

2. Sampling Frame

The sampling frame for the 2011 Agfims Survey is based on the data and cartography from the 2002 Tanzania Population and Housing Census. A stratified multi-stage sample design is used for this survey. The primary sampling units (PSUs) selected at the first stage are the enumeration areas (EAs), which are small operational areas defined on maps for the 2002 Census enumeration. The EAs have an average of 133 households each (153 for rural EAs and 97 for urban EAs) in Tanzania Mainland and 90 households each (92 for rural EAs and 88 for urban EAs) in Tanzania Zanzibar, which is an effective size for conducting a new listing of households. The 2002 Census frame for Tanzania Mainland contains 52,375 EAs with 6,966,966 private households (33,947 rural EAs and 18,428 urban EAs), while the frame for Tanzania Zanzibar contains 2,172 EAs with 196,293 private households (1,351 rural EAs and 821 urban EAs).
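As a quick consistency check on the frame figures quoted above, the average EA sizes can be recomputed from the EA and household counts. The sketch below is illustrative only: the stratum-level household counts are taken from Table 2, while the dictionary layout and the function name are assumptions made for this example and are not part of the NBS frame files.

```python
# EA and household counts by stratum, as quoted in the text and in Table 2.
# Layout: area -> stratum -> (number of EAs, number of private households).
FRAME = {
    "Mainland": {"rural": (33_947, 5_184_084), "urban": (18_428, 1_782_882)},
    "Zanzibar": {"rural": (1_351, 124_334), "urban": (821, 71_959)},
}


def average_ea_sizes(frame):
    """Return the average number of households per EA, rounded to whole households."""
    result = {}
    for area, strata in frame.items():
        total_eas = sum(eas for eas, _ in strata.values())
        total_hhs = sum(hhs for _, hhs in strata.values())
        sizes = {"total": round(total_hhs / total_eas)}
        for stratum, (eas, hhs) in strata.items():
            sizes[stratum] = round(hhs / eas)
        result[area] = sizes
    return result


averages = average_ea_sizes(FRAME)
for area, sizes in averages.items():
    print(area, sizes)
```

Under these inputs the recomputed averages agree with the figures stated in the text (133, 153 and 97 for Mainland; 90, 92 and 88 for Zanzibar).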
Table 1: Distribution of Population in 2002 Tanzania Census Frame by Region, Rural and Urban Stratum

Region                  Total         %       Rural       %       Urban       %
                   Population  National  Population  Region  Population  Region
Dodoma              1,684,561     5.01%   1,472,571  87.42%     211,990  12.58%
Arusha              1,253,082     3.73%     850,632  67.88%     402,450  32.12%
Kilimanjaro         1,347,098     4.01%   1,064,778  79.04%     282,320  20.96%
Tanga               1,623,252     4.83%   1,324,969  81.62%     298,283  18.38%
Morogoro            1,709,273     5.09%   1,237,420  72.39%     471,853  27.61%
Pwani                 867,831     2.58%     677,139  78.03%     190,692  21.97%
Dar es Salaam       2,460,824     7.33%     150,607   6.12%   2,310,217  93.88%
Lindi                 779,451     2.32%     654,313  83.95%     125,138  16.05%
Mtwara              1,124,663     3.35%     898,298  79.87%     226,365  20.13%
Ruvuma              1,095,468     3.26%     925,236  84.46%     170,232  15.54%
Iringa                424,374     1.26%     399,918  94.24%      24,456   5.76%
Mbeya               2,053,205     6.11%   1,634,081  79.59%     419,124  20.41%
Singida             1,079,691     3.21%     932,900  86.40%     146,791  13.60%
Tabora              1,701,617     5.07%   1,486,126  87.34%     215,491  12.66%
Rukwa                 722,768     2.15%     587,824  81.33%     134,944  18.67%
Kigoma              1,296,588     3.86%   1,094,881  84.44%     201,707  15.56%
Shinyanga           1,538,060     4.58%   1,359,660  88.40%     178,400  11.60%
Kagera              1,605,400     4.78%   1,514,148  94.32%      91,252   5.68%
Mwanza              1,919,584     5.71%   1,418,625  73.90%     500,959  26.10%
Mara                1,356,202     4.04%   1,103,847  81.39%     252,355  18.61%
Manyara             1,005,102     2.99%     866,186  86.18%     138,916  13.82%
Njombe                639,114     1.90%     540,464  84.56%      98,650  15.44%
Katavi                999,547     2.98%     878,974  87.94%     120,573  12.06%
Simiyu              2,173,672     6.47%   2,050,905  94.35%     122,767   5.65%
Geita               1,131,524     3.37%   1,017,426  89.92%     114,098  10.08%
Total Mainland     33,591,951   100.00%  26,141,928  77.82%   7,450,023  22.18%
Kaskazini Unguja      186,876    18.23%     183,120  97.99%       3,756   2.01%
Kusini Unguja          92,636     9.04%      87,744  94.72%       4,892   5.28%
Mjini Magharibi       385,366    37.59%      68,810  17.86%     316,556  82.14%
Kaskazini Pemba       185,113    18.05%     154,700  83.57%      30,413  16.43%
Kusini Pemba          175,292    17.10%     143,634  81.94%      31,658  18.06%
Total Zanzibar      1,025,283   100.00%     638,008  62.23%     387,275  37.77%

The Tanzania mainland is divided administratively into 25 regions and Zanzibar into five regions, identified in Table 1. Each region is divided into districts, which are further divided into wards. For the 2002 Census the wards were classified by type of residence as urban, rural or mixed, and all the EAs within a ward were assigned the same classification. The EAs in mixed wards were later individually assigned to the rural and urban strata using the EA coding scheme. The EAs with codes of 300 or higher in mixed wards were assigned to the urban stratum, since they are part of small towns. Table 1 shows the distribution of the population by region, rural and urban strata, based on the 2002 Tanzania Census.

It can be seen in Table 1 that the largest region in Tanzania Mainland is Dar es Salaam, with 7.3 percent of the population, and the smallest region is Iringa, with 1.3 percent of the population. In Tanzania Zanzibar the largest region is Mjini Magharibi, with 37.6 percent of the total Zanzibar population, and the smallest region is Kusini Unguja, with 9.0 percent. In reference to type of residence, at the national level for Mainland, 77.8 percent of the population is classified as rural and 22.2 percent as urban. In Zanzibar 62.2 percent of the population is classified as rural and 37.8 percent as urban.

Table 2: Distribution of EAs and Households in 2002 Tanzania Census Frame by Region, Rural and Urban Strata

Region                     Total               Rural               Urban
                      No. EAs  No. Hhs.   No. EAs  No. Hhs.   No. EAs  No. Hhs.
Dodoma                  2,217   381,140     1,732   330,711       485    50,429
Arusha                  2,147   284,964     1,254   177,940       893   107,024
Kilimanjaro             2,313   298,262     1,632   227,471       681    70,791
Tanga                   2,284   360,498     1,599   292,583       685    67,915
Morogoro                2,956   385,148     1,750   270,609     1,206   114,539
Pwani                   1,389   201,281       919   155,284       470    45,997
Dar es Salaam           6,721   603,393       181    37,688     6,540   565,705
Lindi                   1,338   191,449     1,010   158,175       328    33,274
Mtwara                  2,073   297,757     1,408   237,918       665    59,839
Ruvuma                  1,470   233,129     1,080   192,957       390    40,172
Iringa                    670    98,786       611    92,718        59     6,068
Mbeya                   3,046   496,926     2,103   390,286       943   106,640
Singida                 1,500   219,217     1,199   185,364       301    33,853
Tabora                  2,200   293,663     1,728   244,715       472    48,948
Rukwa                   1,078   149,952       799   120,249       279    29,703
Kigoma                  1,761   238,783     1,273   201,134       488    37,649
Shinyanga               2,122   264,101     1,693   222,012       429    42,089
Kagera                  2,038   350,093     1,873   326,937       165    23,156
Mwanza                  2,594   337,775     1,595   226,451       999   111,324
Mara                    2,026   248,570     1,398   195,088       628    53,482
Manyara                 1,529   196,447     1,215   162,473       314    33,974
Njombe                  1,044   153,553       800   128,341       244    25,212
Katavi                  1,500   170,066     1,244   143,978       256    26,088
Simiyu                  2,940   326,855     2,660   301,137       280    25,718
Geita                   1,419   185,158     1,191   161,865       228    23,293
Total Mainland         52,375 6,966,966    33,947 5,184,084    18,428 1,782,882
Kaskazini Unguja          421    38,703       412    37,914         9       789
Kusini Unguja             203    19,993       192    18,853        11     1,140
Mjini Magharibi           833    74,394       155    15,125       678    59,269
Kaskazini Pemba           365    33,258       306    27,920        59     5,338
Kusini Pemba              350    29,945       286    24,522        64     5,423
Total Zanzibar          2,172   196,293     1,351   124,334       821    71,959

Table 2 shows the distribution of the total number of EAs and households in the 2002 Tanzania Census frame by region and stratum.

Table 3 presents the average number of households per EA and the average number of persons per household in the 2002 Tanzania Census frame, by region, rural and urban stratum. It can be seen that the average number of households per EA for Mainland is 133, and that it is higher for the rural EAs (153) than for the urban EAs (97).
For Tanzania Zanzibar the average number of households per EA is 90, and it is slightly higher for the rural EAs (92) than for the urban EAs (88). The average number of persons per household in Mainland is 4.8, and it is considerably higher for the rural areas (5.0) than for the urban areas (4.2). For Zanzibar the average number of persons per household is 5.2, and it is higher for urban areas (5.4) than for rural areas (5.1).

Table 3: Average Number of Households per EA and Average Number of Persons per Household in 2002 Tanzania Census Frame by Region, Rural and Urban Stratum

Region                     Total                 Rural                 Urban
                     Hhs./EA  Persons/hh.  Hhs./EA  Persons/hh.  Hhs./EA  Persons/hh.
Dodoma                   172          4.4      191          4.5      104          4.2
Arusha                   133          4.4      142          4.8      120          3.8
Kilimanjaro              129          4.5      139          4.7      104          4.0
Tanga                    158          4.5      183          4.5       99          4.4
Morogoro                 130          4.4      155          4.6       95          4.1
Pwani                    145          4.3      169          4.4       98          4.1
Dar es Salaam             90          4.1      208          4.0       86          4.1
Lindi                    143          4.1      157          4.1      101          3.8
Mtwara                   144          3.8      169          3.8       90          3.8
Ruvuma                   159          4.7      179          4.8      103          4.2
Iringa                   147          4.3      152          4.3      103          4.0
Mbeya                    163          4.1      186          4.2      113          3.9
Singida                  146          4.9      155          5.0      112          4.3
Tabora                   133          5.8      142          6.1      104          4.4
Rukwa                    139          4.8      150          4.9      106          4.5
Kigoma                   136          5.4      158          5.4       77          5.4
Shinyanga                124          5.8      131          6.1       98          4.2
Kagera                   172          4.6      175          4.6      140          3.9
Mwanza                   130          5.7      142          6.3      111          4.5
Mara                     123          5.5      140          5.7       85          4.7
Manyara                  128          5.1      134          5.3      108          4.1
Njombe                   147          4.2      160          4.2      103          3.9
Katavi                   113          5.9      116          6.1      102          4.6
Simiyu                   111          6.7      113          6.8       92          4.8
Geita                    130          6.1      136          6.3      102          4.9
Total Mainland           133          4.8      153          5.0       97          4.2
Kaskazini Unguja          92          4.8       92          4.8       88          4.8
Kusini Unguja             98          4.6       98          4.7      104          4.3
Mjini Magharibi           89          5.2       98          4.5       87          5.3
Kaskazini Pemba           91          5.6       91          5.5       90          5.7
Kusini Pemba              86          5.9       86          5.9       85          5.8
Total Zanzibar            90          5.2       92          5.1       88          5.4

Following the selection of the sample EAs at the first sampling stage, a new listing of households will be conducted in each sample EA. At the second sampling stage households will be selected from the listing of each sample EA.
The units of analysis for the 2011 Agfims Survey will be the individual households and the persons in these households.

3. Stratification of the Sampling Frame

In order to increase the efficiency of the sample design for the 2011 Agfims Survey, it is important to divide the sampling frame of EAs into strata that are as homogeneous as possible. The first stage sample selection is carried out independently within each explicit stratum. The nature of the stratification depends on the most important characteristics to be measured in the survey, as well as the domains of analysis; the strata should be consistent with the geographic disaggregation to be used in the survey tables. It is also desirable to order the EAs within each stratum by criteria that are correlated with key survey variables, in order to provide further implicit stratification when systematic selection is used.

The first level of stratification will correspond to the geographic domains of analysis defined for the 2011 Agfims Survey. The separate frame of EAs for Dar es Salaam Region can be ordered by urban and rural residence; there are not many rural EAs in this region, so only a small, proportional sample of rural EAs would be selected for Dar es Salaam. Given that the sample EAs will be selected systematically with probability proportional to size (PPS), this ordering of the sampling frame will also automatically provide a proportional allocation of the sample EAs within each region, based on the total number of households in the frame. In this case the rural and urban parts of each region were treated as explicit strata for the calculation of sampling errors for the estimates of key indicators. The EAs were further sorted by region, district, ward and EA codes to ensure that the sample is geographically representative.

4. Sampling Design and Selection

The 2011 Agfims Survey utilized a three-stage sampling design. The primary sampling units (PSUs) were the enumeration areas (EAs) and the secondary sampling units (SSUs) were the households.
The ultimate sampling units (USUs) were the individual household members aged 18 years and above.

The first stage involved selection of urban and rural enumeration areas (EAs) from the frame used in the 2002 Population and Housing Census, for each administrative region in the Mainland and for Unguja and Pemba Islands in Zanzibar. The EAs were selected using a probability proportional to size (PPS) procedure. The measure of size for the EAs was the number of households.

The second stage involved selection of households from all the selected EAs. A listing form will be used to list all private households in each selected EA in a serpentine order, to avoid clustering when drawing households. In each selected EA, 8 agricultural producing households will be selected systematically from the updated list of households and interviewed. Random selection of processors and service providers is described in the technical proposal of the research firm.

The third stage will involve random selection of respondents aged 18 years and above from the selected agricultural producing households. Within each selected household, all members aged 18 years and above will be listed on a questionnaire, and the gender and age of each of these members will be recorded. If the selected household is occupied by only one eligible individual, that individual will be selected automatically. Where a selected household is occupied by more than one eligible individual, the interviewer will randomly choose one household member for interview using a Kish Grid.

5. Sample Size and Allocation

The sample size for a particular survey is determined by:
- The number of domains of analysis (main determinant);
- Resource constraints;
- Operational constraints;
- Accuracy required for the survey domain estimates (measured through sampling error, by variance estimation, and through non-sampling error, by interview or validation studies).
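The first-stage selection described in Section 4 (systematic selection of EAs with probability proportional to size, with the number of households as the measure of size) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run by NBS: the function names, the toy frame of four EAs and the fixed random start are all hypothetical.

```python
import random


def pps_systematic(sizes, n, start=None, rng=None):
    """Systematic PPS selection of n units from a frame listed in its sort order.

    sizes : measure of size for each unit (here, households per EA).
    start : optional random start in [0, interval); drawn at random if omitted.
    Returns the frame positions of the selected units. A unit larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    if start is None:
        rng = rng or random.Random()
        start = rng.random() * interval
    selected = []
    cumulated = 0.0  # total size of all units listed before position idx
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Move forward until the target falls inside the size range of unit idx.
        while idx < len(sizes) - 1 and cumulated + sizes[idx] <= target:
            cumulated += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def first_stage_probability(size, total, n):
    """First-stage inclusion probability of an EA under PPS: n * M_hi / M_h."""
    return n * size / total


# Hypothetical stratum with four EAs, already sorted by geographic codes.
ea_households = [100, 200, 300, 400]
picked = pps_systematic(ea_households, n=2, start=250.0)
print(picked)
```

Because the selection walks through the frame in its listed order, sorting the EAs by urban / rural residence and by geographic codes before the draw is what provides the implicit stratification mentioned in Section 3.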
The sampling error is inversely proportional to the square root of the sample size. On the other hand, the non-sampling error may increase with the sample size, since it is more difficult to control the quality of a larger operation. It is therefore important that the overall sample size be manageable for quality and operational control purposes. The sample size also depends on cost considerations and logistical issues related to the organization of the teams of enumerators and the workload for each team. The overall sample size for the 2011 Agfims Survey at national level is 639 EAs (567 EAs for Mainland and 72 EAs for Zanzibar).

6. The Distribution of Rural / Urban EAs in Agfims Survey

The measure of size enabled the selection of EAs allocated proportionally to the rural / urban explicit domains, with geographical representation. The availability of the EA inventory from the 2002 Population and Housing Census enabled the selection of the required number of EAs in each domain for both Mainland and Zanzibar (the selected EAs are attached as Annex 1). The allocation of EAs by these domains is summarized in Table 4 below.
Table 4: Number of Selected EAs by Region, Urban and Rural Strata

Region                     Urban   Rural    Total EAs      Households        Total
                             EAs     EAs   per Region       to Sample   Households
Dodoma                         2      19           21               8          168
Arusha                         7      14           21               8          168
Kilimanjaro                    2      19           21               8          168
Tanga                          5      16           21               8          168
Pwani                         12       9           21               8          168
[Morogoro]                     4      17           21               8          168
Dar es Salaam - Kinondoni     28       0           28               8          224
Dar es Salaam - Ilala         14       1           15               8          120
Dar es Salaam - Temeke        18       2           20               8          160
Lindi                          2      19           21               8          168
Mtwara                         4      17           21               8          168
Ruvuma                         3      18           21               8          168
Iringa                         3      18           21               8          168
Mbeya                          3      18           21               8          168
Singida                        3      18           21               8          168
Tabora                         1      20           21               8          168
Rukwa                          5      16           21               8          168
Kigoma                         2      19           21               8          168
Shinyanga                      4      17           21               8          168
Kagera                         0      21           21               8          168
Mwanza                         6      15           21               8          168
Mara                           5      16           21               8          168
Manyara                        2      19           21               8          168
Njombe                         2      19           21               8          168
Katavi                         4      17           21               8          168
Simiyu                         2      19           21               8          168
Geita                          1      20           21               8          168
Kaskazini [Unguja]             0       1            1               8            8
[Kusini Unguja]                0       9            9               8           72
Mjini Magharibi               27       6           33               8          264
Kaskazini Pemba                3      12           15               8          120
Kusini Pemba                   4      10           14               8          112
Total                        178     461          639               8        5,112

Note: region names in square brackets are missing or displaced at that row in the source table and are inferred by elimination.

7. Estimation procedure

In order to obtain representative estimates for the population, it is necessary to attach a weight to each household. Weights consist of two factors: the initial weight (sometimes known as the design weight), which results from the sampling design, and a correction factor for non-response. The initial weight for each household is equal to the inverse of its inclusion probability (this inclusion probability is the product of the inclusion probabilities from each stage). The 2011 Agfims Survey was based on a three-stage stratified random sample. The primary sampling units (PSUs) were EAs from the 2002 Population and Housing Census frame, and the secondary sampling units (SSUs) were households selected from the updated EA listings. Where a household was occupied by more than one eligible individual, the third-stage selection procedure was applied in order to select one individual.
The total inclusion probability for an individual is equal to:

\[ p_{hij} = \frac{n_h \, M_{hi}}{M_h} \times \frac{m_{hi}}{M'_{hi}} \times \frac{1}{k_{hij}} \]

where
$p_{hij}$ = total inclusion probability of the individual in the $j$-th household, selected in the $i$-th EA, in the $h$-th stratum;
$n_h$ = number of sample EAs in the $h$-th stratum;
$M_{hi}$ = number of households in the $i$-th EA, in the $h$-th stratum;
$M_h$ = total number of households in the frame from the 2002 Population and Housing Census in the $h$-th stratum;
$m_{hi}$ = 8, the number of households selected from the updated list of households in the $i$-th EA, in the $h$-th stratum;
$M'_{hi}$ = number of households in the $i$-th EA, in the $h$-th stratum, determined after updating the list of households;
$k_{hij}$ = number of individuals in the $j$-th household, in the $i$-th EA, in the $h$-th stratum.

The three probability components correspond to the three stages of the sample selection. If a household is occupied by one individual, the last probability component is equal to 1.

The initial sample weight is equal to the inverse of the selection probability:

\[ W_{hij} = \frac{M_h \, M'_{hi} \, k_{hij}}{n_h \, M_{hi} \, m_{hi}} \]

where $W_{hij}$ = initial weight for an individual in the $j$-th household, in the $i$-th EA, in the $h$-th stratum.

After data collection the initial weight would be adjusted for non-response using the following formula:

\[ W'_{hij} = W_{hij} \times \frac{m'_{hi}}{m''_{hi}} \]

where
$W'_{hij}$ = adjusted weight for an individual in the $j$-th household, in the $i$-th EA, in the $h$-th stratum;
$m'_{hi}$ = number of households in the $i$-th EA, in the $h$-th stratum;
$m''_{hi}$ = number of completed 2011 Agfims questionnaires (completed questionnaires for individuals) in the $i$-th EA, in the $h$-th stratum.

In order to analyze and publish the 2011 Agfims results, it is necessary to measure the standard errors of the estimates and determine confidence intervals. There is specialized software for the estimation of parameters and their errors based on complex sample data (SAS, SUDAAN, WesVar and STATA).
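The weighting steps of this section can be sketched in a few lines of code. The example below is illustrative only: the function and argument names are assumptions made for this sketch, and the EA-level inputs (150 census households, 160 listed households, 3 eligible adults, 7 completed interviews out of 8) are hypothetical values, not survey data. Only the stratum figures (19 sample EAs and 330,711 frame households for rural Dodoma) are taken from Tables 4 and 2.

```python
def inclusion_probability(n_h, M_hi, M_h, m_hi, M_hi_listed, k_hij):
    """Three-stage inclusion probability of one individual.

    n_h         : number of sample EAs in the stratum
    M_hi        : households in the EA according to the 2002 Census frame
    M_h         : total frame households in the stratum
    m_hi        : households selected in the EA (8 in this design)
    M_hi_listed : households in the EA after the listing update (M'_hi)
    k_hij       : eligible individuals in the selected household
    """
    p_ea = n_h * M_hi / M_h           # stage 1: PPS selection of the EA
    p_household = m_hi / M_hi_listed  # stage 2: household within the EA
    p_person = 1 / k_hij              # stage 3: one adult within the household
    return p_ea * p_household * p_person


def design_weight(n_h, M_hi, M_h, m_hi, M_hi_listed, k_hij):
    """Initial (design) weight: the inverse of the inclusion probability."""
    return 1.0 / inclusion_probability(n_h, M_hi, M_h, m_hi, M_hi_listed, k_hij)


def nonresponse_adjusted_weight(weight, m_prime, m_double_prime):
    """Scale the design weight by m'_hi / m''_hi for the EA."""
    return weight * m_prime / m_double_prime


# Hypothetical individual in a rural Dodoma EA (EA-level values are made up).
example = dict(n_h=19, M_hi=150, M_h=330_711, m_hi=8, M_hi_listed=160, k_hij=3)
w = design_weight(**example)
w_adj = nonresponse_adjusted_weight(w, m_prime=8, m_double_prime=7)
print(round(w, 1), round(w_adj, 1))
```

Keeping the three stage probabilities as separate terms makes it easy to confirm that the last component reduces to 1 for a single-adult household, as noted in the text.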