PROPOSED SAMPLE DESIGN FOR AGRICULTURAL FINANCE MARKET SCOPING
SURVEY IN TANZANIA (AGFIMS SURVEY, 2011)
(SAMPLE FOR NATIONAL ESTIMATES)
Prepared by:
National Bureau of Statistics
Dar es Salaam
MARCH, 2011
Table of Contents
1. Introduction
2. Sampling Frame
3. Stratification of the Sampling Frame
4. Sampling Design and Selection
5. Sample Size and Allocation
6. The Distribution of Rural / Urban EAs in Agfims Survey
7. Estimation procedure
1. Introduction
The Financial Sector Deepening Trust (FSDT) aims to conduct a national Agricultural Finance Market
Scoping survey in Tanzania (AGFIMS survey, 2011). Joint steering and technical committees oversee
project implementation. This will be the first national AGFIMS survey conducted in Tanzania, and it
will report on male, female, zonal, urban and rural domains to allow for national and zonal analysis.
Since the AGFIMS survey will form the baseline data, it will be important to ensure the following:
• A detailed, robust and nationally representative survey design;
• Extensive technical support and guidance provided to the research firm and the AGFIMS
  Steering and Technical Committees by the National Bureau of Statistics and the Office of the
  Chief Government Statistician (NBS/OCGS) throughout the survey process;
• Quality control at all stages of the project;
• Harmonization of demographic and key questions in the AGFIMS survey with other existing
  household-based surveys conducted by NBS/OCGS.
The proposed sample design for Tanzania's 2011 AGFIMS survey will provide estimates at zonal,
urban / rural, Mainland / Zanzibar, and male / female levels. Note that estimates for the rural and
urban domains will only be feasible at national level.
The main objective of AGFIMS is to boost the supply of finance to the agricultural sector and to
enhance access to agricultural finance through market-leading innovation and policy change.
2. Sampling Frame
The sampling frame for the 2011 AGFIMS survey is based on the data and cartography from the 2002
Tanzania Population and Housing Census. A stratified multi-stage sample design is used for this
survey. The primary sampling units (PSUs) selected at the first stage are the enumeration areas
(EAs), which are small operational areas defined on maps for the 2002 Census enumeration. The EAs
have an average of 133 households each in Tanzania Mainland (153 for rural EAs and 97 for urban
EAs) and 90 households each in Tanzania Zanzibar (92 for rural EAs and 88 for urban EAs), which is
an effective size for conducting a new listing of households. The 2002 Census frame for Tanzania
Mainland contains 52,375 EAs (33,947 rural and 18,428 urban) with 6,966,966 private households;
the frame for Tanzania Zanzibar contains 2,172 EAs (1,351 rural and 821 urban) with 196,293
private households.
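As a quick consistency check, the averages quoted above can be recomputed from the frame counts. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the stratum-level household counts are the "Total Mainland" and "Total Zanzibar" figures reported in Table 2.

```python
# Frame counts from the 2002 Census frame: (number of EAs, private households).
# Stratum-level household counts are taken from the totals rows of Table 2.
FRAME = {
    ("Mainland", "rural"): (33_947, 5_184_084),
    ("Mainland", "urban"): (18_428, 1_782_882),
    ("Zanzibar", "rural"): (1_351, 124_334),
    ("Zanzibar", "urban"): (821, 71_959),
}


def avg_households_per_ea(area, stratum=None):
    """Mean households per EA for an area, optionally for one stratum only."""
    rows = [counts for (a, s), counts in FRAME.items()
            if a == area and stratum in (None, s)]
    total_eas = sum(eas for eas, _ in rows)
    total_hhs = sum(hhs for _, hhs in rows)
    return total_hhs / total_eas


for area in ("Mainland", "Zanzibar"):
    print(area,
          round(avg_households_per_ea(area)),
          round(avg_households_per_ea(area, "rural")),
          round(avg_households_per_ea(area, "urban")))
```

Rounded to whole households, the results agree with the averages stated in the text for both the Mainland and Zanzibar.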
Table 1: Distribution of Population in 2002 Tanzania Census Frame by Region, Rural
and Urban Stratum

Region            Total pop.  % national  Rural pop.  % region  Urban pop.  % region
Dodoma            1,684,561   5.01%       1,472,571   87.42%    211,990     12.58%
Arusha            1,253,082   3.73%       850,632     67.88%    402,450     32.12%
Kilimanjaro       1,347,098   4.01%       1,064,778   79.04%    282,320     20.96%
Tanga             1,623,252   4.83%       1,324,969   81.62%    298,283     18.38%
Morogoro          1,709,273   5.09%       1,237,420   72.39%    471,853     27.61%
Pwani             867,831     2.58%       677,139     78.03%    190,692     21.97%
Dar es Salaam     2,460,824   7.33%       150,607     6.12%     2,310,217   93.88%
Lindi             779,451     2.32%       654,313     83.95%    125,138     16.05%
Mtwara            1,124,663   3.35%       898,298     79.87%    226,365     20.13%
Ruvuma            1,095,468   3.26%       925,236     84.46%    170,232     15.54%
Iringa            424,374     1.26%       399,918     94.24%    24,456      5.76%
Mbeya             2,053,205   6.11%       1,634,081   79.59%    419,124     20.41%
Singida           1,079,691   3.21%       932,900     86.40%    146,791     13.60%
Tabora            1,701,617   5.07%       1,486,126   87.34%    215,491     12.66%
Rukwa             722,768     2.15%       587,824     81.33%    134,944     18.67%
Kigoma            1,296,588   3.86%       1,094,881   84.44%    201,707     15.56%
Shinyanga         1,538,060   4.58%       1,359,660   88.40%    178,400     11.60%
Kagera            1,605,400   4.78%       1,514,148   94.32%    91,252      5.68%
Mwanza            1,919,584   5.71%       1,418,625   73.90%    500,959     26.10%
Mara              1,356,202   4.04%       1,103,847   81.39%    252,355     18.61%
Manyara           1,005,102   2.99%       866,186     86.18%    138,916     13.82%
Njombe            639,114     1.90%       540,464     84.56%    98,650      15.44%
Katavi            999,547     2.98%       878,974     87.94%    120,573     12.06%
Simiyu            2,173,672   6.47%       2,050,905   94.35%    122,767     5.65%
Geita             1,131,524   3.37%       1,017,426   89.92%    114,098     10.08%
Total Mainland    33,591,951  100.00%     26,141,928  77.82%    7,450,023   22.18%
Kaskazini Unguja  186,876     18.23%      183,120     97.99%    3,756       2.01%
Kusini Unguja     92,636      9.04%       87,744      94.72%    4,892       5.28%
Mjini Magharibi   385,366     37.59%      68,810      17.86%    316,556     82.14%
Kaskazini Pemba   185,113     18.05%      154,700     83.57%    30,413      16.43%
Kusini Pemba      175,292     17.10%      143,634     81.94%    31,658      18.06%
Total Zanzibar    1,025,283   100.00%     638,008     62.23%    387,275     37.77%

The Tanzania mainland is divided administratively into 25 regions and Zanzibar into five regions,
identified in Table 1. Each region is divided into districts, which are further divided into wards. For
the 2002 Census the wards were classified by type of residence as urban, rural or mixed, and all the
EAs within a ward were assigned the same classification. The EAs in mixed wards were later
individually assigned to the rural and urban strata using the EA coding scheme. The EAs with codes
of 300 or higher in mixed wards were assigned to the urban stratum, since they are part of small
towns. Table 1 shows the distribution of the population by region, rural and urban strata, based on the
2002 Tanzania Census.
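The classification rule for EAs described above can be written as a small function. This is a sketch under stated assumptions: the function name and the string labels are hypothetical, and it assumes that EAs in mixed wards with codes below 300 fall in the rural stratum, which the text implies but does not state explicitly.

```python
def ea_stratum(ward_type, ea_code):
    """Return the stratum ("urban" or "rural") of an EA.

    ward_type: 2002 Census classification of the ward ("urban", "rural" or "mixed").
    ea_code:   numeric EA code within the ward.
    """
    if ward_type in ("urban", "rural"):
        # All EAs in a non-mixed ward inherit the ward classification.
        return ward_type
    if ward_type == "mixed":
        # EAs coded 300 or higher are part of small towns and count as urban.
        # Assumption: the remaining EAs in a mixed ward are rural.
        return "urban" if ea_code >= 300 else "rural"
    raise ValueError(f"unknown ward type: {ward_type!r}")


print(ea_stratum("mixed", 301))   # urban
print(ea_stratum("mixed", 12))    # rural
```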
Table 1 shows that the largest region in Tanzania Mainland is Dar es Salaam, with 7.3 percent of
the population, and the smallest is Iringa, with 1.3 percent. In Tanzania Zanzibar the largest
region is Mjini Magharibi, with 37.6 percent of the Zanzibar population, and the smallest is Kusini
Unguja, with 9.0 percent. By type of residence, 77.8 percent of the Mainland population is
classified as rural and 22.2 percent as urban. In Zanzibar, 62.2 percent of the population is
classified as rural and 37.8 percent as urban.
Table 2: Distribution of EAs and Households in 2002 Tanzania Census Frame by Region, Rural
and Urban Strata

Region            Total EAs  Total Hhs.  Rural EAs  Rural Hhs.  Urban EAs  Urban Hhs.
Dodoma            2,217      381,140     1,732      330,711     485        50,429
Arusha            2,147      284,964     1,254      177,940     893        107,024
Kilimanjaro       2,313      298,262     1,632      227,471     681        70,791
Tanga             2,284      360,498     1,599      292,583     685        67,915
Morogoro          2,956      385,148     1,750      270,609     1,206      114,539
Pwani             1,389      201,281     919        155,284     470        45,997
Dar es Salaam     6,721      603,393     181        37,688      6,540      565,705
Lindi             1,338      191,449     1,010      158,175     328        33,274
Mtwara            2,073      297,757     1,408      237,918     665        59,839
Ruvuma            1,470      233,129     1,080      192,957     390        40,172
Iringa            670        98,786      611        92,718      59         6,068
Mbeya             3,046      496,926     2,103      390,286     943        106,640
Singida           1,500      219,217     1,199      185,364     301        33,853
Tabora            2,200      293,663     1,728      244,715     472        48,948
Rukwa             1,078      149,952     799        120,249     279        29,703
Kigoma            1,761      238,783     1,273      201,134     488        37,649
Shinyanga         2,122      264,101     1,693      222,012     429        42,089
Kagera            2,038      350,093     1,873      326,937     165        23,156
Mwanza            2,594      337,775     1,595      226,451     999        111,324
Mara              2,026      248,570     1,398      195,088     628        53,482
Manyara           1,529      196,447     1,215      162,473     314        33,974
Njombe            1,044      153,553     800        128,341     244        25,212
Katavi            1,500      170,066     1,244      143,978     256        26,088
Simiyu            2,940      326,855     2,660      301,137     280        25,718
Geita             1,419      185,158     1,191      161,865     228        23,293
Total Mainland    52,375     6,966,966   33,947     5,184,084   18,428     1,782,882
Kaskazini Unguja  421        38,703      412        37,914      9          789
Kusini Unguja     203        19,993      192        18,853      11         1,140
Mjini Magharibi   833        74,394      155        15,125      678        59,269
Kaskazini Pemba   365        33,258      306        27,920      59         5,338
Kusini Pemba      350        29,945      286        24,522      64         5,423
Total Zanzibar    2,172      196,293     1,351      124,334     821        71,959

Table 2 shows the distribution of the total number of EAs and households in the 2002 Tanzania
Census frame by region and stratum.
Table 3 presents the average number of households per EA and the average number of persons per
household in the 2002 Tanzania Census frame, by region and by rural and urban stratum. For the
Mainland, the average number of households per EA is 133, and it is higher for rural EAs (153) than
for urban EAs (97). For Tanzania Zanzibar the average is 90 households per EA, slightly higher for
rural EAs (92) than for urban EAs (88). The average number of persons per household in the
Mainland is 4.8, considerably higher in rural areas (5.0) than in urban areas (4.2). In Zanzibar the
average is 5.2 persons per household, higher in urban areas (5.4) than in rural areas (5.1).
Table 3: Average Number of Households per EA and Average Number of Persons per
Household in 2002 Tanzania Census Frame by Region, Rural and Urban Stratum

Region            Total     Total        Rural     Rural        Urban     Urban
                  Hhs./EA   Persons/hh.  Hhs./EA   Persons/hh.  Hhs./EA   Persons/hh.
Dodoma            172       4.4          191       4.5          104       4.2
Arusha            133       4.4          142       4.8          120       3.8
Kilimanjaro       129       4.5          139       4.7          104       4.0
Tanga             158       4.5          183       4.5          99        4.4
Morogoro          130       4.4          155       4.6          95        4.1
Pwani             145       4.3          169       4.4          98        4.1
Dar es Salaam     90        4.1          208       4.0          86        4.1
Lindi             143       4.1          157       4.1          101       3.8
Mtwara            144       3.8          169       3.8          90        3.8
Ruvuma            159       4.7          179       4.8          103       4.2
Iringa            147       4.3          152       4.3          103       4.0
Mbeya             163       4.1          186       4.2          113       3.9
Singida           146       4.9          155       5.0          112       4.3
Tabora            133       5.8          142       6.1          104       4.4
Rukwa             139       4.8          150       4.9          106       4.5
Kigoma            136       5.4          158       5.4          77        5.4
Shinyanga         124       5.8          131       6.1          98        4.2
Kagera            172       4.6          175       4.6          140       3.9
Mwanza            130       5.7          142       6.3          111       4.5
Mara              123       5.5          140       5.7          85        4.7
Manyara           128       5.1          134       5.3          108       4.1
Njombe            147       4.2          160       4.2          103       3.9
Katavi            113       5.9          116       6.1          102       4.6
Simiyu            111       6.7          113       6.8          92        4.8
Geita             130       6.1          136       6.3          102       4.9
Total Mainland    133       4.8          153       5.0          97        4.2
Kaskazini Unguja  92        4.8          92        4.8          88        4.8
Kusini Unguja     98        4.6          98        4.7          104       4.3
Mjini Magharibi   89        5.2          98        4.5          87        5.3
Kaskazini Pemba   91        5.6          91        5.5          90        5.7
Kusini Pemba      86        5.9          86        5.9          85        5.8
Total Zanzibar    90        5.2          92        5.1          88        5.4

Following the selection of the sample EAs at the first sampling stage, a new listing of households will
be conducted in each sample EA. At the second sampling stage households will be selected from the
listing of each sample EA.
The units of analysis for the 2011 Agfims will be the individual
households and the persons in these households.
3. Stratification of the Sampling Frame
In order to increase the efficiency of the sample design for 2011 Agfims Survey, it is important to
divide the sampling frame of EAs into strata that are as homogeneous as possible. The first stage
sample selection is carried out independently within each explicit stratum.
The nature of the
stratification depends on the most important characteristics to be measured in the survey, as well as
the domains of analysis; the strata should be consistent with the geographic disaggregation to be used
in the survey tables. It is also desirable to order the EAs within each stratum by certain criteria that
are correlated with key survey variables, in order to provide further implicit stratification when
systematic selection is used.
The first level of stratification will correspond to the geographic domains of analysis defined for the
2011 Agfims Survey. The separate frame of EAs for Dar es Salaam Region can be ordered by urban
and rural residence; there are not many rural EAs in this region, so a small proportional sample of
rural EAs would be selected for Dar es Salaam.
Given that the sample EAs will be selected
systematically with PPS, this ordering of the sampling frame will also automatically provide a
proportional allocation of the sample EAs in each region based on the total number of households in
the frame. In this case the rural and urban parts of each region were treated as explicit strata for the
calculation of sampling errors for the estimates of key indicators. The EAs were further sorted by
region, district, ward and EA codes to ensure that the sample is geographically representative.
4. Sampling Design and Selection
The 2011 Agfims Survey utilized a three-stage sampling design. The primary sampling units (PSUs)
were the enumeration areas (EAs) and the secondary sampling units (SSUs) were the households. The
ultimate sampling units (USUs) were the individual household members aged 18 years and above.
The first stage involved selection of urban and rural enumeration areas (EAs) from the frame used in
the 2002 Population and Housing Census, for each administrative region in the Mainland and for
Unguja and Pemba Islands in Zanzibar. The EAs were selected using a probability proportional to
size (PPS) procedure. The measure of size for the EAs was the number of households.
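The first-stage selection can be sketched as follows. This is a minimal illustration of systematic PPS selection, not the procedure actually run by NBS: the function name and arguments are hypothetical, and the list of sizes is assumed to be already sorted in frame order (region, district, ward and EA code) so that the sort provides the implicit stratification described in Section 3.

```python
import random


def systematic_pps(sizes, n, rng=None):
    """Select n units systematically with probability proportional to size.

    sizes: measure of size (number of households) of each EA, in frame order.
    Returns the indices of the selected EAs. An EA larger than the sampling
    interval can be hit more than once.
    """
    rng = rng or random.Random()
    interval = sum(sizes) / n            # sampling interval on the cumulated sizes
    start = rng.random() * interval      # random start in [0, interval)
    selected, cumulated, k = [], 0, 0
    for index, size in enumerate(sizes):
        cumulated += size
        # Select this EA for every selection point falling inside its range.
        while k < n and start + k * interval < cumulated:
            selected.append(index)
            k += 1
    return selected


# Hypothetical frame of six EAs with their household counts.
print(systematic_pps([150, 90, 200, 60, 120, 180], 3, random.Random(7)))
```

Because the selection points are equally spaced along the cumulated household counts, each EA's chance of selection is proportional to its number of households, and the sample is spread across the sorted frame.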
The second stage involved selection of households from all the selected EAs. A listing form will be
used to list all private households in each selected EA in a serpentine order, to avoid clustering
when drawing households. In each selected EA, 8 agricultural producing households will be selected
systematically from the updated list of households and interviewed. Random selection of
processors and service providers is described in the technical proposal of the research firm.
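A minimal sketch of the second-stage draw is given below. It assumes the listing has already been restricted to agricultural producing households and is held in serpentine listing order; the function name and the fractional-interval method are illustrative choices, not details taken from the survey documentation.

```python
import random


def systematic_households(listing, m=8, rng=None):
    """Select m households systematically from an ordered household listing."""
    rng = rng or random.Random()
    if len(listing) <= m:
        # Fewer eligible households than required: take all of them.
        return list(listing)
    step = len(listing) / m              # fractional sampling interval
    start = rng.random() * step          # random start in [0, step)
    # Take the household at each equally spaced position along the listing.
    return [listing[int(start + k * step)] for k in range(m)]


# Hypothetical EA with 153 listed households, numbered in listing order.
print(systematic_households(list(range(1, 154)), rng=random.Random(3)))
```

Spreading the 8 selections evenly along the serpentine listing is what prevents the selected households from clustering in one part of the EA.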
The third stage will involve the random selection of one respondent aged 18 years and above from
each selected agricultural producing household. Within each selected household, all members aged
18 years and above will be listed on the questionnaire, and the gender and age of each listed
member will be recorded. If the selected household is occupied by only one eligible individual,
that individual will be selected automatically. If the household is occupied by more than one
eligible individual, the interviewer will randomly choose one individual for interview using a
Kish Grid.
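The layout of the Kish Grid used in the field is not reproduced in this document. The sketch below is therefore a simplified stand-in, not the actual grid: it orders the eligible adults by age and rotates through them using the questionnaire serial number, which mimics how a grid removes interviewer discretion. The function name, the tuple layout and the rotation rule are all assumptions.

```python
def select_respondent(members, questionnaire_no):
    """Pick one eligible adult from a household roster.

    members: list of (name, sex, age) tuples for the household.
    questionnaire_no: serial number of the household questionnaire.
    """
    # Keep members aged 18+ and order them oldest first (ties broken by name),
    # so the outcome does not depend on the order in which they were listed.
    eligible = sorted((m for m in members if m[2] >= 18),
                      key=lambda m: (-m[2], m[0]))
    if not eligible:
        return None                      # nobody qualifies for interview
    # A single eligible adult is selected automatically (index 0); otherwise
    # rotate through the ordered roster by questionnaire number.
    return eligible[questionnaire_no % len(eligible)]


household = [("Asha", "F", 40), ("Juma", "M", 25), ("Neema", "F", 61), ("Ali", "M", 12)]
print(select_respondent(household, 4))
```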
5. Sample Size and Allocation
The sample size for a particular survey is determined by:
• The number of domains of analysis (main determinant);
• Resource constraints;
• Operational constraints;
• Accuracy required for the survey domain estimates (sampling error is measured through
  variance estimation, and non-sampling error through interview or validation studies).
The sampling error is inversely proportional to the square root of the sample size. On the other hand,
the nonsampling error may increase with the sample size, since it is more difficult to control the
quality of a larger operation. It is therefore important that the overall sample size be manageable for
quality and operational control purposes. The sample size also depends on cost considerations and
logistical issues related to the organization of the teams of enumerators and the workload for each
team.
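The inverse square-root relationship can be illustrated with the textbook standard error of a proportion under simple random sampling. This is only an illustration under that simplifying assumption: it ignores the clustering and stratification of the actual design, and the function name and the value p = 0.5 are hypothetical.

```python
import math


def srs_standard_error(p, n):
    """Standard error of an estimated proportion p under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the sampling error.
small = srs_standard_error(0.5, 1_278)
large = srs_standard_error(0.5, 5_112)
print(round(small / large, 3))
```

In other words, each halving of the sampling error requires roughly four times as many interviews, which is why accuracy requirements have to be balanced against cost and quality control.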
The overall sample size for the 2011 Agfims Survey at national level is 639 EAs (567 EAs for
Mainland and 72 EAs for Zanzibar).
6. The Distribution of Rural / Urban EAs in Agfims Survey
Using the number of households as the measure of size, the sample EAs were allocated
proportionally to the explicit rural / urban domains while maintaining geographical representation.
The availability of the EA inventory from the 2002 Population and Housing Census enabled the
selection of the required number of EAs in each domain for both Mainland and Zanzibar (the selected
EAs are attached as Annex 1).
The allocation of EAs by these domains is summarized in Table 4 below.
Table 4: Number of Selected EAs by Region, Urban and Rural strata

Region                     Urban EAs  Rural EAs  Total EAs  Households to    Total
                                                            sample per EA    households
Dodoma                     2          19         21         8                168
Arusha                     7          14         21         8                168
Kilimanjaro                2          19         21         8                168
Tanga                      5          16         21         8                168
Pwani                      12         9          21         8                168
Morogoro                   4          17         21         8                168
Dar es Salaam - Kinondoni  28         0          28         8                224
Dar es Salaam - Ilala      14         1          15         8                120
Dar es Salaam - Temeke     18         2          20         8                160
Lindi                      2          19         21         8                168
Mtwara                     4          17         21         8                168
Ruvuma                     3          18         21         8                168
Iringa                     3          18         21         8                168
Mbeya                      3          18         21         8                168
Singida                    3          18         21         8                168
Tabora                     1          20         21         8                168
Rukwa                      5          16         21         8                168
Kigoma                     2          19         21         8                168
Shinyanga                  4          17         21         8                168
Kagera                     0          21         21         8                168
Mwanza                     6          15         21         8                168
Mara                       5          16         21         8                168
Manyara                    2          19         21         8                168
Njombe                     2          19         21         8                168
Katavi                     4          17         21         8                168
Simiyu                     2          19         21         8                168
Geita                      1          20         21         8                168
Kaskazini Unguja           0          1          1          8                8
Kusini Unguja              0          9          9          8                72
Mjini Magharibi            27         6          33         8                264
Kaskazini Pemba            3          12         15         8                120
Kusini Pemba               4          10         14         8                112
Total                      178        461        639        8                5,112

7. Estimation procedure
To obtain representative estimates for the population, a weight must be attached to each household.
The weight consists of two factors: the initial weight (also known as the design weight), which
results from the sampling design, and a correction factor for nonresponse.
The initial weight is equal to the inverse of the inclusion probability, which is the product of the
inclusion probabilities at each stage. The 2011 AGFIMS survey was based on a three-stage stratified
random sample. The primary sampling units (PSUs) were EAs from the 2002 Population and Housing
Census frame, and the secondary sampling units (SSUs) were households selected from the updated
household lists of the sample EAs. Where a household was occupied by more than one eligible
individual, a third-stage selection procedure was applied to select one individual.
Total inclusion probability for an individual is equal to:

$$p_{hij} = \frac{n_h \times M_{hi}}{M_h} \times \frac{m_{hi}}{M'_{hi}} \times \frac{1}{k_{hij}}$$

Where
$p_{hij}$ = total inclusion probability of the individual in the $j$-th household, selected in the
$i$-th EA, in the $h$-th stratum;
$n_h$ = sample number of EAs in the $h$-th stratum;
$M_{hi}$ = number of households in the $i$-th EA, in the $h$-th stratum;
$M_h$ = total number of households in the frame from the 2002 Population and Housing Census in the
$h$-th stratum;
$m_{hi}$ = 8, the number of households selected in the sample from the updated list of households
in the $i$-th EA, in the $h$-th stratum;
$M'_{hi}$ = number of households in the $i$-th EA, in the $h$-th stratum, determined after updating
the list of households;
$k_{hij}$ = number of individuals in the $j$-th household, in the $i$-th EA, in the $h$-th stratum.
Three probability components correspond to three stages of the sample selection. If a household is
occupied by one individual, in that case the last probability component would be equal to 1.
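The three components can be multiplied out in a short function. The sketch below follows the symbols defined above, with `M_hi_listed` standing for the updated household count; the function name and the example values for the EA size, the updated listing and the household composition are hypothetical (only the 19 sample EAs and 330,711 frame households for rural Dodoma come from Tables 4 and 2).

```python
def inclusion_probability(n_h, M_hi, M_h, m_hi, M_hi_listed, k_hij):
    """Overall inclusion probability of one individual (three-stage design)."""
    p_ea = n_h * M_hi / M_h              # stage 1: EA selected with PPS
    p_household = m_hi / M_hi_listed     # stage 2: household drawn from updated list
    p_individual = 1 / k_hij             # stage 3: one eligible individual chosen
    return p_ea * p_household * p_individual


# Hypothetical rural Dodoma EA: 150 households in the frame, 160 after
# relisting, 8 households sampled, 2 eligible adults in the household.
p = inclusion_probability(n_h=19, M_hi=150, M_h=330_711,
                          m_hi=8, M_hi_listed=160, k_hij=2)
print(p)
```

When the updated listing equals the frame count, the EA size cancels out and every household in the stratum has the same chance of selection, which is the usual motivation for combining PPS at the first stage with a fixed take of 8 households at the second.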
The initial sample weight is equal to the inverse of the selection probability:

$$W_{hij} = \frac{M_h \times M'_{hi} \times k_{hij}}{n_h \times M_{hi} \times m_{hi}}$$

$W_{hij}$ = initial weight for an individual in the $j$-th household, in the $i$-th EA, in the
$h$-th stratum.
After data collection, the initial weight will be adjusted for non-response using the following
formula:

$$W'_{hij} = W_{hij} \times \frac{m'_{hi}}{m''_{hi}}$$

$W'_{hij}$ = adjusted weight for the individual in the $j$-th household, in the $i$-th EA, in the
$h$-th stratum;
$m'_{hi}$ = number of households selected in the $i$-th EA, in the $h$-th stratum;
$m''_{hi}$ = number of completed 2011 AGFIMS questionnaires (completed questionnaires for
individuals) in the $i$-th EA, in the $h$-th stratum.
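The weighting steps can be sketched in code as follows. The function and argument names are hypothetical, with `M_hi_listed` standing for the updated household count in the EA, and the example values are invented apart from the stratum figures for rural Dodoma taken from Tables 2 and 4.

```python
def initial_weight(n_h, M_hi, M_h, m_hi, M_hi_listed, k_hij):
    """Design weight: inverse of the three-stage inclusion probability."""
    return (M_h * M_hi_listed * k_hij) / (n_h * M_hi * m_hi)


def adjusted_weight(weight, m_selected, m_completed):
    """Inflate the design weight by the EA-level non-response factor."""
    if m_completed == 0:
        raise ValueError("no completed questionnaires in this EA")
    return weight * m_selected / m_completed


# Hypothetical EA: 8 households selected, 7 questionnaires completed.
w = initial_weight(n_h=19, M_hi=150, M_h=330_711, m_hi=8, M_hi_listed=160, k_hij=2)
print(round(w, 1), round(adjusted_weight(w, 8, 7), 1))
```

The adjustment spreads the weight of non-responding households over the responding ones in the same EA, so the weighted total for the EA is preserved.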
To analyze and publish the 2011 AGFIMS results, it is necessary to measure the standard errors of
the estimates and to determine confidence intervals. Specialized software is available for
estimating parameters and their errors from complex sample data (for example SAS, SUDAAN, WesVar
and Stata).
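The document does not state which variance estimator will be used. As an assumed illustration only, the sketch below applies the "ultimate cluster" (with-replacement) approximation that such packages commonly offer for stratified multi-stage designs: within each stratum, the variance of a weighted total is estimated from the variation between the weighted EA totals. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def weighted_total_and_se(records):
    """Estimate a weighted total and its standard error.

    records: iterable of (stratum, ea, weight, value) tuples, one per respondent.
    Uses the with-replacement ultimate-cluster approximation, which needs at
    least two sample EAs in every stratum.
    """
    ea_totals = defaultdict(lambda: defaultdict(float))
    for stratum, ea, weight, value in records:
        ea_totals[stratum][ea] += weight * value     # weighted total per EA

    estimate, variance = 0.0, 0.0
    for eas in ea_totals.values():
        totals = list(eas.values())
        n = len(totals)
        if n < 2:
            raise ValueError("each stratum needs at least two sample EAs")
        mean = sum(totals) / n
        estimate += sum(totals)
        # Between-EA variation within the stratum, scaled by n / (n - 1).
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in totals)
    return estimate, variance ** 0.5


data = [("A", 1, 2.0, 5), ("A", 2, 2.0, 7), ("B", 3, 1.0, 5), ("B", 4, 5.0, 1)]
print(weighted_total_and_se(data))
```

Strata with fewer than two sample EAs, which occur in the allocation in Table 4, would have to be collapsed with a neighbouring stratum before an estimator of this kind can be applied.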