Experiences of variance estimation for relative poverty measures and inequality indicators in official sample surveys

Claudia Rinaldelli
ISTAT, Servizio Condizioni economiche delle famiglie, via Adolfo Ravà 150, 00142 Roma
rinaldel@istat.it

Summary. This paper reports ISTAT's experience in evaluating the sampling variance of the relative poverty measures and inequality indicators estimated from the Household Budget survey and the Statistics on Income and Living Conditions survey (EU-SILC survey).

Key words: sampling variance, linearization, resampling, relative poverty, inequality

1 Introduction

ISTAT calculates the sampling variance of common estimates (frequencies, means, totals) using the standard methodology¹ [SSW92] [Wol85]; two software procedures were developed in SAS to implement this methodology for complex sampling designs [FR98] [IST05].

Over the last three years ISTAT has paid special attention to the dissemination of complex statistics, that is, statistics that are complex because they are estimated from complex sample surveys and, above all, because they are non-linear functions of the sample data. The standard methodology, and therefore the current variance estimation software, cannot be applied directly to these statistics.

This paper reports ISTAT's experience in evaluating the sampling variance of a set of complex cross-sectional measures, namely the relative poverty measures and inequality indicators estimated from the Household Budget survey [CPR05] and the Statistics on Income and Living Conditions survey (EU-SILC survey) [Rin05]. Sections 2 and 3 report the experience with the Household Budget survey and with the EU-SILC survey respectively. Section 4 contains the conclusions.

¹ Standard methodology means a well-known and widely implemented methodology; the literature on sampling theory for finite populations provides formulas to calculate the sampling variance for the most commonly used sampling designs and estimators.

2 The experience of the Household Budget survey

Official poverty estimates in Italy are calculated yearly by ISTAT using the Household Budget survey data. A household whose monthly consumption expenditure is equal to or below a threshold called the (relative) poverty line is defined as poor. For a two-member household, the poverty line is the monthly average per capita consumption expenditure. An equivalence scale is used to adjust the poverty line for households of different sizes [IST02].

This procedure means that poverty is not observed directly on households; it is a function of the observed sample, because it depends on the poverty line and on the distribution of the consumption expenditures of the sampled households.

The incidence of relative poverty for households (defined as the percentage of households with a monthly consumption expenditure equal to or below the poverty line) is a target poverty measure:

$$\hat{I}_{pov} = \frac{\sum_{j \in S} I_j w_j}{\mathrm{Households}} \times 100 \qquad (1)$$

where 'Households' is the total number of resident households (a known demographic total), $j$ is the household index, $S$ is the collected sample, $w_j$ is the weight of household $j$ and $I_j$ is a binary variable defined as:

$$I_j = \begin{cases} 1 & \text{if } y_j \le \text{poverty line} \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

and $y_j$ is the monthly consumption expenditure of household $j$.

The complexity of this measure limits the use of the standard methodology, which can be applied only if poverty is assumed to be observed directly on households, that is, if the poverty line is treated as a fixed value. Linearization and resampling approaches were studied and tested to take the whole sampling variance into account [DDF03] [PR03]; the sampling errors of the incidence of relative poverty were calculated using:

1. the standard methodology, assuming that the poverty line is not affected by sampling variability;
2. linearization;
3. a resampling approach based on the Balanced Repeated Replications technique (BRR).

Table 1 shows the results of these approaches [DDF03] [PR03].
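
As an illustration, the difference between approach 1 (poverty line treated as fixed) and a replication approach in which the poverty line is re-estimated in every replicate can be sketched in Python. This is only a minimal sketch under stated assumptions, not the SAS procedures used by ISTAT: all function names are hypothetical, the poverty line is simplified to the weighted per capita expenditure (the equivalence scale is assumed to be already applied in `y_eq`), the sum of the weights replaces the known demographic total when none is supplied, and the replicate weights are assumed to be available as input (in BRR they would come from balanced half-samples).

```python
import numpy as np


def incidence(y_eq, w, line, total_households=None):
    """Equation (1): weighted percentage of households with y_eq <= line.

    Assumption: if no demographic total is given, the sum of the weights
    is used as the denominator.
    """
    y_eq = np.asarray(y_eq, dtype=float)
    w = np.asarray(w, dtype=float)
    denom = w.sum() if total_households is None else float(total_households)
    return 100.0 * np.sum(w * (y_eq <= line)) / denom


def per_capita_line(y, size, w):
    """Simplified poverty line: weighted total expenditure / weighted persons."""
    y, size, w = (np.asarray(a, dtype=float) for a in (y, size, w))
    return np.sum(w * y) / np.sum(w * size)


def replicate_variance(full_estimate, replicate_estimates):
    """Replication variance: mean squared deviation from the full estimate."""
    r = np.asarray(replicate_estimates, dtype=float)
    return float(np.mean((r - full_estimate) ** 2))


def incidence_variances(y, y_eq, size, w, rep_weights):
    """Return (estimate, variance with fixed line, variance with re-estimated line)."""
    line = per_capita_line(y, size, w)
    full = incidence(y_eq, w, line)
    # Approach 1: the full-sample poverty line is reused in every replicate.
    fixed = [incidence(y_eq, rw, line) for rw in rep_weights]
    # Replication approach: the poverty line is recomputed in every replicate.
    moving = [incidence(y_eq, rw, per_capita_line(y, size, rw))
              for rw in rep_weights]
    return full, replicate_variance(full, fixed), replicate_variance(full, moving)


# Toy data (invented for illustration): four two-member households in two
# strata of two units each; each replicate keeps one unit per stratum and
# doubles its weight.
y = [1000.0, 1100.0, 3000.0, 4000.0]
size = [2, 2, 2, 2]
w = [1.0, 1.0, 1.0, 1.0]
rep_weights = [[2.0, 0.0, 2.0, 0.0], [0.0, 2.0, 2.0, 0.0],
               [2.0, 0.0, 0.0, 2.0], [0.0, 2.0, 0.0, 2.0]]
full, v_fixed, v_moving = incidence_variances(y, y, size, w, rep_weights)
print(full, v_fixed, v_moving)
```

In this toy case the variance obtained with a fixed poverty line is smaller than the one obtained when the line is re-estimated, which is the component that the standard methodology leaves out; the size of the difference in real data depends on the sampling variability of the poverty line.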

The applications showed that the relative sampling errors obtained with the standard methodology are close to those obtained with the BRR technique and with linearization. The standard methodology can underestimate the sampling variance of poverty measures because it does not take into account the sampling variance of the poverty line. However, this underestimation is slight because the sampling variability of the poverty line is small. The standard methodology (under simplification) therefore does not produce severe differences compared with more suitable (but more complex) variance estimation techniques. For this reason, the sampling errors of the incidence of relative poverty have been calculated with the standard methodology since 2003.

Table 1. Incidence of relative poverty for households (%) and relative sampling errors (%) by standard methodology, BRR and linearization - year 2002

                  Incidence of      Relative sampling errors (%)
                  relative          standard
Region            poverty (%)       methodology    BRR     linearization
Piemonte               7.0             12.0        12.0        12.2
Valle d’Aosta          7.1             18.4        18.6        14.8
Lombardia              3.7             10.5        11.7        12.1
Trentino AA            9.9              9.9        10.0        11.7
Veneto                 3.9             12.6        13.4        15.8
Friuli VG              9.8             11.4        11.5        12.3
Liguria                4.8             14.4        14.9        18.1
Emilia-R               4.5             14.0        14.4        14.9
Toscana                5.9             12.2        12.4        13.8
Umbria                 6.4             17.1        17.4        21.0
Marche                 4.9             12.5        13.1        15.0
Lazio                  7.8              9.3         9.4         9.7
Abruzzo               18.0             15.0        14.2        15.5
Molise                26.2              6.4         6.4         6.9
Campania              23.5              6.1         6.0         6.8
Puglia                21.4              8.6         8.4         9.6
Basilicata            26.9             11.6        11.5        14.0
Calabria              29.8              6.6         6.5         7.4
Sicilia               21.3              5.8         5.6         5.6
Sardegna              17.1              8.8         8.8         8.7
ITALY                 11.0              2.4         2.4         2.4

3 The experience of the EU-SILC survey

The EU-SILC survey² is aimed at producing estimates on income, living conditions and poverty. Among these are the EU (European) relative poverty measures and inequality indicators, which are estimated together with their sampling errors [Reg03]; most of them are complex in the double sense mentioned in Section 1.

² This survey is carried out under European Regulation; the first survey year was 2004.

The EU complex statistics considered here are briefly described below, where $k$ is the unit index (person), $y_k$ is the value of variable $Y$ on unit $k$, $w_k$ is the weight of unit $k$ and $\hat{Y}_{\beta}$ is the estimated $\beta$-th quantile of variable $Y$ ($0 \le \beta \le 1$):

• At-Risk-of-Poverty Threshold (RPT) is 60% of the national median income:

$$RPT = 60\% \, \hat{Y}_{0.5} \qquad (3)$$

• At-Risk-of-Poverty Rate (RPR) is the percentage of persons (over the total population) with an income below RPT:

$$RPR = \frac{\sum_{k \in S} I_k w_k}{\sum_{k \in S} w_k} \times 100 \qquad (4)$$

in equation (4) $S$ is the collected sample and $I_k$ is a binary variable defined as:

$$I_k = \begin{cases} 1 & \text{if } y_k < RPT \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

• Inequality of income distribution, Gini index (units sorted in ascending order of income, from the first unit to the last unit):

$$G = 100 \times \left[ \frac{2 \sum_{k=\text{first unit}}^{\text{last unit}} \left( y_k w_k \sum_{l=\text{first unit}}^{k} w_l \right) - \sum_{k=\text{first unit}}^{\text{last unit}} y_k w_k^2}{\sum_{k=\text{first unit}}^{\text{last unit}} w_k \; \sum_{k=\text{first unit}}^{\text{last unit}} y_k w_k} - 1 \right] \qquad (6)$$

• Gender Pay Gap (GPG) is the difference between men's and women's average gross hourly earnings as a percentage of men's average gross hourly earnings (the population consists of all paid employees aged 16-64 at work 15+ hours per week):

$$GPG = \frac{\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k} - \dfrac{\sum_{k \in F} y_k w_k}{\sum_{k \in F} w_k}}{\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k}} \times 100 \qquad (7)$$

in equation (7) $M$ is the set of male sampled paid employees aged 16-64 at work 15+ hours per week and $F$ is the set of female sampled paid employees aged 16-64 at work 15+ hours per week;

• Relative Median at-Risk-of-Poverty Gap (RPG) is the difference between RPT and the median income of poor units, expressed as a percentage of RPT:

$$RPG = \frac{RPT - \hat{Y}^{poor}_{0.5}}{RPT} \times 100 \qquad (8)$$

in equation (8) $\hat{Y}^{poor}_{0.5}$ is the estimated median income of poor units (units with $I_k = 1$, see equation (5));

• Income Quintile Share Ratio (QSR) is the ratio of the total income received by the 20% of the country's population with the highest income (top quintile) to that received by the 20% of the country's population with the lowest income (lowest quintile):

$$QSR = \frac{\sum_{k \in T} y_k w_k \,/\, \sum_{k \in T} w_k}{\sum_{k \in L} y_k w_k \,/\, \sum_{k \in L} w_k} \qquad (9)$$

in equation (9) $T$ is the set of sampled units with $y_k > \hat{Y}_{0.80}$ and $L$ is the set of sampled units with $y_k \le \hat{Y}_{0.20}$.

Linearization and resampling approaches were tested to evaluate the sampling errors of these complex measures [MR05] [MPR06]. First, four of these measures (at risk of poverty threshold, at risk of poverty rate, Gini index, gender pay gap) were linearized by the estimating equations and Taylor–Woodruff methods [KB97] [Dev99]; then a resampling approach based on the BRR technique was also considered, in order to obtain experimentally³ the sampling errors both from the linearized variables and from BRR [Mcc69] [PR03] [SSW92]. The results of these applications (reported in Table 2) show that the different approaches lead to similar values.

The linearizations of the 'Relative median at risk of poverty gap' and of the 'Income quintile share ratio' were provided by EUROSTAT, which recommended the use of this approach for all the EU measures. Sampling errors of the EU measures have been calculated using linearization since 2005, taking into account the results in Table 2, the EUROSTAT recommendation and the computational workload of resampling.

4 Conclusions

Two different solutions were adopted for estimating the sampling variance of the relative poverty measures and inequality indicators described above: the standard methodology under simplification was preferred in the Household Budget survey, and a linearization approach was implemented in the EU-SILC survey. In addition to the results reported in Sections 2 and 3, the following reasons were taken into account in choosing a satisfactory variance estimation solution:

1. the Household Budget survey has been carried out since 1997, so the use of the standard methodology (under simplification) avoided major changes to the current data process; conversely, the planning of the new survey (EU-SILC) made it possible to build a more complex approach into the data process;
2.
more complex measures are estimated in the EU-SILC survey than in the Household Budget survey, where only one is disseminated; simplification can be reasonable for one measure but not for a whole set of complex statistics;
3. the EU-SILC survey is carried out under European Regulation, so the EUROSTAT recommendation has to be considered.

³ Data from the ECHP (European Community Household Panel survey) were used because those from EU-SILC were not available.

Table 2. Relative sampling errors (%) of EU measures by linearization and BRR

                                          Relative sampling errors (%)
Measure                                   linearization     BRR
At risk of poverty threshold                   1.72          1.73
At risk of poverty rate                        3.55          3.53
Gini index                                     1.71          1.60
Gender pay gap                                30.51         32.61
Relative median at risk of poverty gap           -           6.61
Income quintile share ratio                      -           2.88

References

[CPR05] Coccia, G., Pannuzi, N., Rinaldelli, C.: Poor and non poor households: the estimation from sample surveys. In: Book of Short papers, Cladag2005. MUP editore, 69-72 (2005)
[DDF03] De Vitiis, C., Di Consiglio, L., Falorsi, S., Pauselli, C., Rinaldelli, C.: La valutazione dell’errore di campionamento delle stime di povertà relativa. Final report, Conference ‘Povertà Regionale ed Esclusione Sociale’, Roma 17-12-03 (2003)
[Dev99] Deville, J.C.: Variance estimation for complex statistics and estimators: linearization and residual techniques. In: Survey Methodology, 25, 2, 193-203 (1999)
[FR98] Falorsi, S., Rinaldelli, C.: Un software generalizzato per il calcolo delle stime e degli errori di campionamento. In: Statistica Applicata, 10, 2, 217-234 (1998)
[IST02] ISTAT: La stima ufficiale della povertà in Italia 1997-2000. Argomenti 24 (2002)
[IST05] ISTAT: GENESEES 3.0 Manuale Utente e Aspetti Metodologici (2005)
[KB97] Kovacevic, M.S., Binder, D.A.: Variance estimation for measures of income inequality and polarization – The estimating equations approach.
In: Journal of Official Statistics, 13, 1, 41-58 (1997)
[Mcc69] McCarthy, P.J.: Pseudoreplication: Further evaluation and application of the balanced half-sample technique. In: Vital and Health Statistics, Series 2, 31, National Center for Health Statistics, Public Health Service, Washington, D.C. (1969)
[MR05] Moretti, D., Rinaldelli, C.: Variance estimation for relative poverty measures and inequality indicators from complex sample surveys. In: Atti del Quarto Convegno S.Co.2005, Cluep editrice Padova, 67-72 (2005)
[MPR06] Moretti, D., Pauselli, C., Rinaldelli, C.: La stima della varianza campionaria di indicatori complessi di povertà e disuguaglianza. In: Statistica Applicata, to be printed (2006)
[PR03] Pauselli, C., Rinaldelli, C.: La valutazione dell’errore di campionamento delle stime di povertà relativa secondo la tecnica Replicazioni Bilanciate Ripetute. In: Rivista di Statistica Ufficiale, 2/2003, 7-22 (2003)
[Reg03] Regulation (EC) No 1177/2003 of the European Parliament and of the Council of 16 June 2003 concerning Community statistics on income and living conditions (EU-SILC). Official Journal of the European Union, L 165/1, 3.7.2003
[Rin05] Rinaldelli, C.: Statistiche complesse e software. In: Statistica & Società, 3, 2, 01.2005, 27-29 (2005)
[SSW92] Särndal, C.E., Swensson, B., Wretman, J.: Model assisted survey sampling. Springer-Verlag, New York (1992)
[Wol85] Wolter, K.M.: Introduction to variance estimation. Springer-Verlag, New York (1985)