Experiences of variance estimation for relative
poverty measures and inequality indicators in
official sample surveys
Claudia Rinaldelli
ISTAT, Servizio Condizioni economiche delle famiglie via Adolfo Ravà 150, 00142
Roma
rinaldel@istat.it
Summary. This paper reports ISTAT's experience in evaluating the sampling variance of the
relative poverty measures and inequality indicators estimated from the Household Budget
survey and the Statistics on Income and Living Conditions survey (EU-SILC survey).
Key words: sampling variance, linearization, resampling, relative poverty, inequality
1 Introduction
ISTAT calculates the sampling variance of common estimates (frequencies, means, totals)
using the standard methodology (see footnote 1) [SSW92] [Wol85]; two software procedures
were developed in SAS to implement this methodology for complex sampling designs
[FR98] [IST05]. Over the last three years ISTAT has paid special attention to the
dissemination of complex statistics, that is, statistics that are complex because they are
estimated from complex sample surveys and, above all, because they are expressed by
non-linear functions. The standard methodology, and therefore the current variance
estimation software, cannot be applied directly to these statistics. This paper
reports ISTAT's experience in evaluating the sampling variance of a set of complex
cross-sectional measures, namely the relative poverty measures and inequality indicators
estimated from the Household Budget survey [CPR05] and the Statistics on Income
and Living Conditions survey (EU-SILC survey) [Rin05]. Sections 2 and 3 report the
experience with the Household Budget survey and with the EU-SILC survey respectively.
Finally, section 4 contains the conclusions.
Footnote 1: Standard methodology means a well-known and widely implemented methodology;
the literature on sampling theory for finite populations provides formulas to calculate
the sampling variance for the most commonly used sampling designs and estimators.
2 The experience of the Household Budget survey
Official poverty estimates in Italy are calculated yearly by ISTAT using the Household
Budget survey data. A household whose monthly consumption expenditure is equal to or
below a threshold called the (relative) poverty line is defined as poor. The poverty line
for a two-member household is the monthly average per capita consumption expenditure.
An equivalence scale is used to adjust the poverty line for households of other sizes
[IST02]. This procedure means that poverty is not observed directly on households but is
a function of the observed sample: it depends on the poverty line and on the distribution
of consumption expenditure among the sampled households. The incidence of relative
poverty for households (defined as the percentage of households with a monthly
consumption expenditure equal to or below the poverty line) is a target poverty measure:
\[
\hat{I}_{pov} = \frac{\sum_{j \in S} I_j w_j}{\text{Households}} \times 100 \tag{1}
\]
‘Households’ is the total number of resident households (a known demographic total), j is
the household index, S is the collected sample, wj is the weight of household j, and Ij
is a binary variable defined as:
\[
I_j =
\begin{cases}
1 & \text{if } y_j \le \text{povertyline} \\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]
yj is the monthly consumption expenditure of household j.
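As an illustration of equations (1) and (2), the following minimal Python sketch computes the incidence for a handful of households. The data, the poverty line value and the function name `poverty_incidence` are hypothetical, and the expenditures are assumed to have already been made comparable through the equivalence scale.

```python
def poverty_incidence(expenditures, weights, poverty_line, total_households):
    """Weighted percentage of households at or below the poverty line, as in equation (1)."""
    # I_j = 1 when y_j <= poverty line and 0 otherwise, as in equation (2)
    poor_weight = sum(w for y, w in zip(expenditures, weights) if y <= poverty_line)
    return 100.0 * poor_weight / total_households


# Hypothetical sample: four households, each representing 250 resident households.
y = [700.0, 800.0, 1500.0, 2100.0]
w = [250.0, 250.0, 250.0, 250.0]
print(poverty_incidence(y, w, poverty_line=800.0, total_households=1000.0))
```

In this sketch the poverty line is passed in as a fixed number, which corresponds to the simplification under which the standard methodology can be applied.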
The complexity of this measure limits the use of the standard methodology, which can be
applied only by assuming that poverty is observed directly on households, that is, by
treating the poverty line as a fixed value. Linearization and resampling approaches were
studied and tested to take the whole sampling variance into account [DDF03] [PR03];
sampling errors of the incidence of relative poverty were calculated using:
1. standard methodology assuming that the poverty line is not affected by sampling
variability;
2. linearization;
3. resampling approach by Balanced Repeated Replications technique (BRR).
Table 1 shows the results of these approaches [DDF03] [PR03].
The applications showed that the relative sampling errors obtained by the standard
methodology are not far from those obtained by the BRR technique and by linearization.
The standard methodology can underestimate the sampling variance of poverty measures
because it does not take into account the sampling variance of the poverty line. However,
this underestimation is slight because the sampling variability of the poverty line is
small. The standard methodology (under simplification) does not produce severe
differences compared with more suitable (but more complex) variance estimation
techniques. For this reason, sampling errors of the incidence of relative poverty have
been calculated by the standard methodology since 2003.
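The replication idea can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the cited applications: it assumes a design with exactly two primary sampling units (PSUs) per stratum, builds the half-samples from a Sylvester Hadamard matrix, and measures deviations from the full-sample estimate. All names and data are hypothetical, and `incidence_with_estimated_line` uses the weighted mean of the sample as a stand-in for the poverty line, so that the line is re-estimated in every replicate.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    matrix = [[1]]
    while len(matrix) < order:
        matrix = [row + row for row in matrix] + [row + [-x for x in row] for row in matrix]
    return matrix


def brr_variance(units, estimator):
    """units: (stratum, psu, weight, y) tuples, strata numbered 0..H-1, psu in {0, 1}.
    estimator: maps a list of (weight, y) pairs to a number.
    Returns the full-sample estimate and its BRR variance."""
    n_strata = max(u[0] for u in units) + 1
    order = 1
    while order <= n_strata:  # smallest power of two strictly greater than H
        order *= 2
    full = estimator([(w, y) for _, _, w, y in units])
    squared_deviations = []
    for row in hadamard(order):
        replicate = []
        for stratum, psu, w, y in units:
            # Column stratum + 1 skips the all-ones first column of the matrix.
            kept_psu = 0 if row[stratum + 1] == 1 else 1
            # The kept PSU gets a doubled weight, the other PSU is dropped.
            replicate.append((2.0 * w if psu == kept_psu else 0.0, y))
        squared_deviations.append((estimator(replicate) - full) ** 2)
    return full, sum(squared_deviations) / len(squared_deviations)


def weighted_total(sample):
    return sum(w * y for w, y in sample)


def incidence_with_estimated_line(sample):
    """Share (%) of weight at or below a line re-estimated from the same sample.
    The line here is a stand-in (weighted mean of y), not the official definition."""
    total_weight = sum(w for w, _ in sample)
    line = sum(w * y for w, y in sample) / total_weight
    return 100.0 * sum(w for w, y in sample if y <= line) / total_weight


# Hypothetical design: three strata, two PSUs each, one household per PSU.
units = [
    (0, 0, 10.0, 1.0), (0, 1, 10.0, 3.0),
    (1, 0, 20.0, 2.0), (1, 1, 20.0, 2.5),
    (2, 0, 5.0, 4.0), (2, 1, 5.0, 8.0),
]
print(brr_variance(units, weighted_total))
print(brr_variance(units, incidence_with_estimated_line))
```

For a linear statistic such as the weighted total, a fully balanced set of half-samples should reproduce the usual two-PSU variance, the sum over strata of the squared difference between the two weighted PSU totals; the interest of the technique lies in non-linear statistics such as the second estimator.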
Table 1. Incidence of relative poverty for households and relative sampling errors
(%) by standard methodology, BRR and linearization - year 2002

                  Incidence of         Relative sampling errors (%)
                  relative             Standard
Region            poverty (%)          methodology     BRR    Linearization
Piemonte               7.0             12.0         12.0        12.2
Valle d’Aosta          7.1             18.4         18.6        14.8
Lombardia              3.7             10.5         11.7        12.1
Trentino AA            9.9              9.9         10.0        11.7
Veneto                 3.9             12.6         13.4        15.8
Friuli VG              9.8             11.4         11.5        12.3
Liguria                4.8             14.4         14.9        18.1
Emilia-R               4.5             14.0         14.4        14.9
Toscana                5.9             12.2         12.4        13.8
Umbria                 6.4             17.1         17.4        21.0
Marche                 4.9             12.5         13.1        15.0
Lazio                  7.8              9.3          9.4         9.7
Abruzzo               18.0             15.0         14.2        15.5
Molise                26.2              6.4          6.4         6.9
Campania              23.5              6.1          6.0         6.8
Puglia                21.4              8.6          8.4         9.6
Basilicata            26.9             11.6         11.5        14.0
Calabria              29.8              6.6          6.5         7.4
Sicilia               21.3              5.8          5.6         5.6
Sardegna              17.1              8.8          8.8         8.7
ITALY                 11.0              2.4          2.4         2.4
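To read the figures in Table 1, a relative sampling error can be converted into a standard error and an approximate confidence interval. The sketch below assumes that the relative sampling error is the standard error expressed as a percentage of the estimate, and that a normal approximation with z = 1.96 is acceptable; neither assumption is stated explicitly in the text, and the function names are illustrative.

```python
def standard_error(estimate, relative_error_pct):
    """Standard error implied by a relative sampling error given in percent."""
    return estimate * relative_error_pct / 100.0


def confidence_interval(estimate, relative_error_pct, z=1.96):
    """Approximate interval: estimate plus or minus z standard errors."""
    se = standard_error(estimate, relative_error_pct)
    return estimate - z * se, estimate + z * se


# ITALY row of Table 1: incidence 11.0%, relative sampling error 2.4% (all three methods).
low, high = confidence_interval(11.0, 2.4)
print(round(low, 2), round(high, 2))
```

Under these assumptions the national estimate of 11.0% carries a standard error of about 0.26 percentage points, while regional estimates with relative errors above 15% are considerably less precise.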
3 The experience of the EU-SILC survey
The EU-SILC survey, which is carried out under European Regulation and was first
conducted in 2004, is aimed at producing estimates of income, living conditions and
poverty. Among these are the EU(ropean) relative poverty measures and inequality
indicators, which are estimated together with their sampling errors [Reg03]; most of them
are complex in the double sense mentioned in section 1. The EU complex statistics dealt
with here are briefly described below, where k is the unit index (person), yk is the
value of variable Y on unit k, wk is the weight of unit k, and Ŷβ is the estimated βth
quantile of variable Y (0 ≤ β ≤ 1):
• at Risk-of-Poverty Threshold (RPT) is 60% of the national median income:
\[
RPT = 0.6 \, \hat{Y}_{0.5} \tag{3}
\]
• at Risk-of-Poverty Rate (RPR) is the percentage of persons (over the total
population) with an income below RPT:
\[
RPR = \frac{\sum_{k \in S} I_k w_k}{\sum_{k \in S} w_k} \times 100 \tag{4}
\]
In equation (4), S represents the collected sample and Ik is a binary variable defined as:
\[
I_k =
\begin{cases}
1 & \text{if } y_k < RPT \\
0 & \text{otherwise}
\end{cases}
\tag{5}
\]
• inequality of income distribution, Gini index:
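\[
G = 100 \times \left[
\frac{2 \sum_{k=\text{firstunit}}^{\text{lastunit}} \left( y_k w_k \sum_{l=\text{firstunit}}^{k} w_l \right)
      - \sum_{k=\text{firstunit}}^{\text{lastunit}} y_k w_k^2}
     {\left( \sum_{k=\text{firstunit}}^{\text{lastunit}} w_k \right)
      \left( \sum_{k=\text{firstunit}}^{\text{lastunit}} y_k w_k \right)}
- 1 \right]
\tag{6}
\]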
• Gender Pay Gap (GPG) is the difference between men’s and women’s average
gross hourly earnings as a percentage of men’s average gross hourly earnings
(the population consists of all paid employees aged 16-64 at work 15+ hours per
week):
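\[
GPG = \frac{\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k} - \dfrac{\sum_{k \in F} y_k w_k}{\sum_{k \in F} w_k}}
           {\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k}} \times 100
\tag{7}
\]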
In equation (7), M is the set of sampled male paid employees aged 16-64 at work 15+
hours per week and F is the set of sampled female paid employees aged 16-64 at work 15+
hours per week;
• Relative Median at Risk-of-Poverty Gap (RPG) is the difference between RPT
and the median income of poor units, expressed as a percentage of RPT:
\[
RPG = \frac{RPT - \hat{Y}_{0.5}^{poor}}{RPT} \times 100 \tag{8}
\]
In equation (8), $\hat{Y}_{0.5}^{poor}$ is the estimated median income of poor units
(units with Ik = 1, see equation (5));
• Income Quintile Share Ratio (QSR) is the ratio of total income received by
20% of the country’s population with the highest income (top quintile) to that
received by 20% of the country’s population with the lowest income (lowest
quintile):
\[
QSR = \frac{\sum_{k \in T} y_k w_k \Big/ \sum_{k \in T} w_k}
           {\sum_{k \in L} y_k w_k \Big/ \sum_{k \in L} w_k}
\tag{9}
\]
in equation (9) T is the set of sampled units with yk > Ŷ0.80 and L is the set of
sampled units with yk ≤ Ŷ0.20 .
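The measures defined in equations (3) to (9) can be sketched in Python as follows. This is an illustrative implementation under stated assumptions: the weighted quantile is taken as the smallest value whose cumulative weight share reaches β (one of several conventions, not specified in the text), units are sorted by increasing income for the Gini index, and all function names and data are hypothetical. The sketch also assumes that at least one unit falls in each of the subsets it averages over.

```python
def weighted_quantile(values, weights, beta):
    """Smallest value whose cumulative weight share reaches beta (assumed convention)."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= beta * total:
            return y
    return pairs[-1][0]


def weighted_mean(values, weights):
    return sum(y * w for y, w in zip(values, weights)) / sum(weights)


def rpt(y, w):
    return 0.6 * weighted_quantile(y, w, 0.5)  # equation (3)


def rpr(y, w):
    threshold = rpt(y, w)
    poor_weight = sum(wk for yk, wk in zip(y, w) if yk < threshold)  # equation (5)
    return 100.0 * poor_weight / sum(w)  # equation (4)


def gini(y, w):
    cumulative_w = 0.0
    cross = 0.0
    for yk, wk in sorted(zip(y, w)):  # increasing income
        cumulative_w += wk
        cross += yk * wk * cumulative_w
    sum_yw = sum(yk * wk for yk, wk in zip(y, w))
    sum_yw2 = sum(yk * wk * wk for yk, wk in zip(y, w))
    return 100.0 * ((2.0 * cross - sum_yw2) / (sum(w) * sum_yw) - 1.0)  # equation (6)


def gpg(y_men, w_men, y_women, w_women):
    mean_men = weighted_mean(y_men, w_men)
    return 100.0 * (mean_men - weighted_mean(y_women, w_women)) / mean_men  # equation (7)


def rpg(y, w):
    threshold = rpt(y, w)
    poor = [(yk, wk) for yk, wk in zip(y, w) if yk < threshold]
    median_poor = weighted_quantile([p[0] for p in poor], [p[1] for p in poor], 0.5)
    return 100.0 * (threshold - median_poor) / threshold  # equation (8)


def qsr(y, w):
    q20 = weighted_quantile(y, w, 0.2)
    q80 = weighted_quantile(y, w, 0.8)
    top = [(yk, wk) for yk, wk in zip(y, w) if yk > q80]
    low = [(yk, wk) for yk, wk in zip(y, w) if yk <= q20]
    mean_top = weighted_mean([t[0] for t in top], [t[1] for t in top])
    mean_low = weighted_mean([l[0] for l in low], [l[1] for l in low])
    return mean_top / mean_low  # equation (9)


# Hypothetical incomes for nine persons with equal weights.
income = [2.0, 4.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 100.0]
weight = [1.0] * 9
print(rpt(income, weight), rpr(income, weight), gini(income, weight))
print(rpg(income, weight), qsr(income, weight))
```

Every measure except the Gini index and the gender pay gap depends on an estimated quantile, which is what makes the statistics non-linear in the sample and rules out the direct use of the standard variance formulas.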
Linearization and resampling approaches were tested to evaluate the sampling
errors of these complex measures [MR05] [MPR06]. First, four of these measures
(at risk of poverty threshold, at risk of poverty rate, Gini index, gender pay gap) were
linearized by the estimating equations and Taylor–Woodruff methods [KB97] [Dev99];
then a resampling approach based on the BRR technique was also considered, so that
sampling errors could be obtained experimentally (see footnote 3) both from linearized
variables and from BRR [Mcc69] [PR03] [SSW92]. The results of these applications
(reported in Table 2) show that the different approaches lead to similar values. The
linearizations of ‘Relative median at risk of poverty gap’ and ‘Income quintile share
ratio’ were provided by EUROSTAT, which recommended the use of this approach for all
the EU measures. Sampling errors of the EU measures have been calculated using
linearization since 2005, taking into account the results in Table 2, the EUROSTAT
recommendation and the computational workload of resampling.
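The linearization idea can be illustrated on the gender pay gap, which is a smooth function of two weighted means. The Python sketch below is a plain Taylor expansion under simplifying assumptions and is not necessarily the form used in the cited applications: each unit receives a linearized value z_k, and the variance of the statistic is approximated by the variance of the weighted total of z_k, estimated here by treating PSUs as drawn with replacement within strata. Function names, data layout and data are hypothetical.

```python
import math
from collections import defaultdict


def gpg_linearized(units):
    """units: (stratum, psu, weight, earnings, sex) tuples with sex in {"M", "F"}.
    Returns the GPG estimate and one linearized value z_k per unit."""
    w_m = sum(w for _, _, w, _, s in units if s == "M")
    w_f = sum(w for _, _, w, _, s in units if s == "F")
    mean_m = sum(w * y for _, _, w, y, s in units if s == "M") / w_m
    mean_f = sum(w * y for _, _, w, y, s in units if s == "F") / w_f
    gpg = 100.0 * (1.0 - mean_f / mean_m)
    z = []
    for _, _, _, y, s in units:
        if s == "M":
            # d(GPG)/d(mean_m) times the linearized variable of the men's mean
            z.append(100.0 * mean_f / mean_m ** 2 * (y - mean_m) / w_m)
        else:
            # d(GPG)/d(mean_f) times the linearized variable of the women's mean
            z.append(-100.0 / mean_m * (y - mean_f) / w_f)
    return gpg, z


def stratified_variance(units, z):
    """Variance of the weighted total of z; needs at least two PSUs per stratum."""
    psu_totals = defaultdict(lambda: defaultdict(float))
    for (stratum, psu, w, *_), zk in zip(units, z):
        psu_totals[stratum][psu] += w * zk
    variance = 0.0
    for totals in psu_totals.values():
        t = list(totals.values())
        n = len(t)
        mean_t = sum(t) / n
        variance += n / (n - 1) * sum((x - mean_t) ** 2 for x in t)
    return variance


# Hypothetical sample: two strata, two PSUs each, one man and one woman per PSU.
units = [
    (0, 0, 10.0, 20.0, "M"), (0, 0, 10.0, 15.0, "F"),
    (0, 1, 10.0, 24.0, "M"), (0, 1, 10.0, 18.0, "F"),
    (1, 0, 20.0, 16.0, "M"), (1, 0, 20.0, 14.0, "F"),
    (1, 1, 20.0, 22.0, "M"), (1, 1, 20.0, 17.0, "F"),
]
estimate, z = gpg_linearized(units)
variance = stratified_variance(units, z)
print(estimate, 100.0 * math.sqrt(variance) / estimate)  # estimate and relative error (%)
```

Once the linearized values are available, the variance computation is the same as for an ordinary total, which is why this approach fits more easily into an existing production process than a full set of replications. Quantile-based measures such as RPT and RPR need additional terms in z_k that are not covered by this sketch.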
4 Conclusions
Two different solutions were adopted for estimating the sampling variance of the
described relative poverty measures and inequality indicators: the standard
methodology under simplification was preferred in the Household Budget survey, and
a linearization approach was implemented in the EU-SILC survey. In addition to the
results reported in sections 2 and 3, the following reasons were taken into account in
choosing a satisfactory variance estimation solution:
1. the Household Budget survey has been carried out since 1997, so the use of the
standard methodology (under simplification) avoided severe changes to the current
data process; by contrast, the planning of the new survey (EU-SILC) made it possible
to implement a more complex approach in the data process;
2. more complex measures are estimated in the EU-SILC survey than the single one
disseminated from the Household Budget survey; simplification can be reasonable
for one measure but not for a whole set of complex statistics;
3. the EU-SILC survey is carried out under European Regulation, so the EUROSTAT
recommendation has to be considered.
Footnote 3: Data from the ECHP (European Community Household Panel survey) were used
because those from EU-SILC were not available.
Table 2. Relative sampling errors (%) of EU measures by linearization and BRR

                                            Relative sampling errors (%)
Measure                                     Linearization       BRR
At risk of poverty threshold                     1.72           1.73
At risk of poverty rate                          3.55           3.53
Gini index                                       1.71           1.60
Gender pay gap                                  30.51          32.61
Relative median at risk of poverty gap            -             6.61
Income quintile share ratio                       -             2.88
References
[CPR05] Coccia, G., Pannuzi, N., Rinaldelli, C.: Poor and non poor households: the estimation from sample surveys. In: Book of Short papers, Cladag2005. MUP editore, 69-72 (2005)
[DDF03] De Vitiis, C., Di Consiglio, L., Falorsi, S., Pauselli, C., Rinaldelli, C.: La valutazione dell’errore di campionamento delle stime di povertà relativa. Final report, Conference ‘Povertà Regionale ed Esclusione Sociale’, Roma 17-12-03 (2003)
[Dev99] Deville, J.C.: Variance Estimation for complex statistics and estimators: linearization and residual techniques. In: Survey Methodology, 25, 2, 193-203 (1999)
[FR98] Falorsi, S., Rinaldelli, C.: Un software generalizzato per il calcolo delle stime e degli errori di campionamento. In: Statistica Applicata, 10, 2, 217-234 (1998)
[IST02] ISTAT: La stima ufficiale della povertà in Italia 1997-2000. Argomenti 24 (2002)
[IST05] ISTAT: GENESEES 3.0 Manuale Utente e Aspetti Metodologici (2005)
[KB97] Kovacevic, M.S., Binder, D.A.: Variance estimation for measures of income inequality and polarization – The estimating equations approach. In: Journal of Official Statistics, 13, 1, 41-58 (1997)
[Mcc69] McCarthy, P.J.: Pseudoreplication: Further evaluation and application of the balanced Half-Sample Technique. In: Vital and Health Statistics, Series 2, 31, National Center for Health Statistics, Public Health Service, Washington, D.C. (1969)
[MR05] Moretti, D., Rinaldelli, C.: Variance estimation for relative poverty measures and inequality indicators from complex sample surveys. In: Atti del Quarto Convegno S.Co.2005, Cluep editrice Padova, 67-72 (2005)
[MPR06] Moretti, D., Pauselli, C., Rinaldelli, C.: La stima della varianza campionaria di indicatori complessi di povertà e disuguaglianza. In: Statistica Applicata, to be printed (2006)
[PR03] Pauselli, C., Rinaldelli, C.: La valutazione dell’errore di campionamento delle stime di povertà relativa secondo la tecnica Replicazioni Bilanciate Ripetute. In: Rivista di Statistica Ufficiale, 2/2003, 7-22 (2003)
[Reg03] Regulation (EC) No 1177/2003 of the EUROPEAN PARLIAMENT and of the COUNCIL of 16 June 2003 concerning Community statistics on income and living conditions (EU-SILC), 3.7.2003 L 165/1 Official Journal of the European Union
[Rin05] Rinaldelli, C.: Statistiche complesse e software. In: Statistica & Società, 3, 2, 01.2005, 27-29 (2005)
[SSW92] Särndal, C.E., Swensson, B., Wretman, J.: Model assisted survey sampling. New York, Springer-Verlag (1992)
[Wol85] Wolter, K.M.: Introduction to variance estimation. New York, Springer-Verlag (1985)