no 1014 re-submission

Weighting procedure in SILC and its effect on
variance estimation
POSTER PREFERRED
Keywords: Weighting, variance estimation
1. INTRODUCTION
Weighting is an important element of EU-SILC data quality. Calibration and the
distribution of weights strongly affect the estimated variance of indicators and,
ultimately, data quality. The poster will present an empirical analysis based on EU-SILC
data that shows how these components can affect the variability of the
at-risk-of-poverty indicator. An analysis of non-response and of its pattern over time
will also be included.
2. VARIANCE ESTIMATION
For the past two years, Eurostat has been able to calculate standard errors, and standard
errors of net changes, for the main EU-SILC indicators thanks to a linearization-based
methodology developed in cooperation with NETSILC2. This method is a compromise
between scientific accuracy and practical considerations; even though the results rely on
a series of hypotheses, they give a reasonable idea of standard errors and related
measures. The approach takes into account stratification, multi-stage selection, unequal
inclusion probabilities for the sample units and re-weighting for unit non-response.
However, it does not reflect the gain in accuracy due to calibration weighting nor the
loss due to imputation; this will be the subject of future developments. It should be
noted, however, that calibration may improve results for specific indicators but not for
all of them.
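To make the idea of a linearization-based standard error concrete, the following is a minimal sketch and not Eurostat's production method. It assumes a weighted proportion computed against a fixed threshold (for the at-risk-of-poverty rate the threshold itself depends on the estimated median, which adds a term ignored here), treats primary sampling units (PSUs) as drawn with replacement within strata, and applies no finite population correction. The function name and the row layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def linearized_se_of_proportion(rows):
    """rows: iterable of (stratum, psu, weight, y) with y in {0, 1}.

    Returns (estimate, standard_error) for the weighted proportion of y,
    using the ultimate-cluster variance of the linearized variable.
    """
    rows = list(rows)
    n_hat = sum(w for _, _, w, _ in rows)              # estimated population size
    r_hat = sum(w * y for _, _, w, y in rows) / n_hat  # weighted proportion

    # Weighted PSU totals of the linearized variable z = (y - R) / N_hat.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - r_hat) / n_hat

    variance = 0.0
    for totals in psu_totals.values():
        t = list(totals.values())
        n_h = len(t)
        if n_h < 2:
            continue  # a single-PSU stratum contributes nothing in this sketch
        t_bar = sum(t) / n_h
        variance += n_h / (n_h - 1) * sum((x - t_bar) ** 2 for x in t)
    return r_hat, sqrt(variance)


# Made-up illustrative data: two strata, two PSUs each (not EU-SILC values).
toy_rows = [
    ("S1", "a", 100.0, 1), ("S1", "a", 120.0, 0),
    ("S1", "b", 90.0, 0), ("S1", "b", 110.0, 0),
    ("S2", "c", 300.0, 1), ("S2", "c", 250.0, 1),
    ("S2", "d", 280.0, 0), ("S2", "d", 260.0, 0),
]
estimate, std_error = linearized_se_of_proportion(toy_rows)
```

Because the variance is built from weighted PSU totals, a few very large weights can dominate those totals, which is the link to the weighting analysis in the next section.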
3. WEIGHTING PROCEDURE IN SILC
Doc65 gives some basic guidelines on weights and recommends trimming extreme weights to
more acceptable values. Countries follow these recommendations and usually report all
vital metadata on the weighting procedure in their quality reports.
3.1 WEIGHTING ANALYSIS: FIRST RESULTS
No inconsistency has been detected in the available quality reports: all countries
where the EU-SILC survey is implemented adjust their data for non-response, in
order to reduce the bias it causes, and calibrate their data to external sources.
For calibration, the majority of countries use the SAS macro "CALMAR" developed by
INSEE. To look more closely at the household cross-sectional weights, some basic
statistics have been computed by country for the variable DB090.
The distribution of weights varies widely across countries.
The main findings of this preliminary analysis are:
1. High CV detected in DK, EL, LT, LU, NL
2. High extreme values in UK, FR, ES, EL and NL (max value > 10 000)
3. Low minimum values (0 or 1) in HR, RO, FI, LU, SI, DK
4. High max/min ratio (> 300) in FI, LU, DK, EE, LT, NL, ES, BE and EL
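The diagnostics behind these findings (CV, minimum, maximum and max/min ratio of the weights by country) can be sketched as follows. This is only an illustration: the function names, the input layout and the toy values are assumptions, and the weights are made up rather than taken from DB090.

```python
from statistics import mean, pstdev


def weight_summary(weights):
    """Return basic diagnostics for a list of non-negative survey weights."""
    w = [float(x) for x in weights]
    avg = mean(w)
    lo, hi = min(w), max(w)
    return {
        "n": len(w),
        "min": lo,
        "max": hi,
        "mean": avg,
        "cv": pstdev(w) / avg,  # coefficient of variation of the weights
        # A zero minimum weight makes the ratio undefined; report it as infinite.
        "max_min_ratio": hi / lo if lo > 0 else float("inf"),
    }


def summarise_by_country(records):
    """records: iterable of (country_code, weight) pairs."""
    groups = {}
    for country, weight in records:
        groups.setdefault(country, []).append(weight)
    return {c: weight_summary(ws) for c, ws in sorted(groups.items())}


# Made-up illustrative weights for two hypothetical countries.
toy = [("AA", 100), ("AA", 120), ("AA", 80), ("BB", 1), ("BB", 400), ("BB", 50)]
summary = summarise_by_country(toy)
```

In this toy input, country "BB" combines a minimum weight of 1 with a maximum of 400, so it would be flagged on the max/min criterion, while "AA" would not.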
High variability of weights is a concern because it increases variance. It is therefore
desirable to avoid extreme weights, especially very large weights. The use of extremely
variable (large) weights, even if affecting only a small part of the sample cases, can result
in a substantial increase in variance, while their contribution to reducing the bias may be
small.
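One common way to quantify this effect is Kish's approximate design effect due to unequal weighting, 1 + CV², which equals n·Σw² / (Σw)². It is an approximation that assumes the weights are unrelated to the survey variable; it is not part of the methodology described in this abstract. The sketch below, with a hypothetical function name and made-up weights, shows how two very large weights in a sample of 100 can dominate.

```python
def kish_deff(weights):
    """Kish's approximate design effect from unequal weighting: n*sum(w^2)/(sum(w))^2."""
    w = [float(x) for x in weights]
    n = len(w)
    return n * sum(x * x for x in w) / sum(w) ** 2


# Made-up weights: 100 equal weights versus 98 equal weights plus 2 extreme ones.
equal = [100.0] * 100
skewed = [100.0] * 98 + [5000.0] * 2

deff_equal = kish_deff(equal)
deff_skewed = kish_deff(skewed)
effective_n = len(skewed) / deff_skewed  # approximate effective sample size
```

With these toy values the equal weights give a factor of 1, whereas the skewed set gives a factor of about 13, so the 100 cases carry roughly the information of 8 equally weighted ones.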
Doc65 recommends trimming extreme weights so as to limit the associated increase in
variance. Currently, checking programs verify that the CV of weights is not above a
certain threshold. We recommend exploring the possibility of making better use of
weight trimming, so as to avoid extreme values and an excessive influence of a few
observations on the estimates.
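As an illustration of what weight trimming can look like, here is a generic sketch that caps weights at a multiple of the mean weight and redistributes the trimmed excess proportionally over the remaining weights, so that the weighted total is preserved. The cap of three times the mean, the function name and the toy weights are assumptions chosen for the example; this is not the specific rule given in Doc65.

```python
def trim_weights(weights, max_ratio_to_mean=3.0, max_iter=50):
    """Cap weights at max_ratio_to_mean * mean and keep the sum of weights fixed.

    max_ratio_to_mean should exceed 1, otherwise the total cannot be preserved.
    """
    w = [float(x) for x in weights]
    total = sum(w)
    cap = max_ratio_to_mean * total / len(w)  # the mean is fixed because the total is
    for _ in range(max_iter):
        capped = [min(x, cap) for x in w]
        excess = total - sum(capped)               # weight removed by capping
        free_total = sum(x for x in capped if x < cap)
        w = capped
        if excess <= 1e-9 * total or free_total <= 0.0:
            break
        # Spread the excess proportionally over the weights below the cap.
        # This can push some of them above the cap, hence the iteration.
        scale = 1.0 + excess / free_total
        w = [x * scale if x < cap else x for x in capped]
    return w


# Made-up weights: 98 ordinary weights and 2 extreme ones.
raw = [100.0] * 98 + [5000.0] * 2
trimmed = trim_weights(raw)
```

In this toy case the two extreme weights are cut to the cap of 594 and the other weights are scaled up to compensate, which lowers the max/min ratio from 50 to about 3 while leaving the estimated population size unchanged. Trimming trades a reduction in variance for a possible increase in bias, so the cap is a judgment call.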
4. CONCLUSIONS
Weighting and calibration are important elements of EU-SILC quality and they heavily
affect the variance of estimates. With this first analysis, Eurostat would like to draw
attention to the importance of these elements for SILC data quality.