Redesigning French structural business statistics: first methodological studies
Bonn, September 2006
Ph. Brion, Insee

Outline
- General principles of the future system
- A « specific » variable: the breakdown of turnover
- Questions raised by the use of multiple sources

General principles of the future system
The present system of structural business statistics needs to be redesigned, for several reasons:
- the current system runs two « parallel » processes: a statistical survey and a process using tax data (annual income statements)
- administrative data (especially tax data) are available earlier than before
- the concept of « enterprise group » needs to play a larger role in business statistics

General principles of the future system (continued)
INSEE has started a long-term project to take these aspects into account.
The idea is to use different kinds of administrative data:
- annual income statements
- annual statements of payroll data
- external trade data
- …
and to keep a statistical survey for specific variables.

General principles: calendar
[Timeline diagram running from 01/01 to 31/12, with the labels: Survey, Tax data, Other administrative data, First results, Definitive results]

A « cornerstone » variable: the breakdown of turnover
This variable is very important for business statistics:
- it is used for the national accounts
- an algorithm derives the APE code (principal activity code, according to the NAF nomenclature) from this breakdown
It is not available in tax data, so it has to be collected in the statistical survey.
Example: extract of the questionnaire of the annual enterprise survey for the industrial sector (next slide)

[Figure: extract of the questionnaire of the annual enterprise survey for the industrial sector]

Study of the efficiency of selective editing for the variable « breakdown of turnover »
- In fact, « n » variables
- How to select the units to be edited manually, using a score
- Test of different types of « local » scores: based on the difference between raw data and the previous year's data, or on the difference between raw data and an average profile
- Test of different ways of aggregating the « local » scores to produce a global score
- Two examples (next slides)

[Figure: estimator of the turnover of the economic branch « cars trade ». Series: 501-methodA1, 501-methodA2, 501-methodA3, 501-methodA4. Horizontal axis from 0 to 14586; vertical axis from 48800000 to 49800000.]

[Figure: estimator of the turnover of the economic branch « Food retail trade in specialized shops ». Series: 522-methodA1, 522-methodA2, 522-methodA3, 522-methodA4. Horizontal axis from 0 to 14665; vertical axis from 1620000 to 1800000.]

The variable « breakdown of turnover » (continued)
For selective editing, we have to take into account the fact that statistics using this variable are of two types:
- aggregates concerning economic sectors, such as the turnover of economic sector k: $\sum_i \mathbf{1}_{APE_k}(i) \, w_i \, T_i$
- aggregates concerning economic branches, such as the turnover of branch k: $\sum_i w_i \, T_i(APE_k)$

Some questions raised by the use of multiple sources
- Different flows of data arrive at different times
- Is it possible to check the data arriving first (from the statistical survey) without administrative data?
- In particular, two « administrative » variables are important: the turnover and the number of employees
- Study of the possibility of using infra-annual data for these two variables
- Questions raised by the estimators used: $F(X_i, w_i, Y_i)$, where $X_i$ are administrative data, $Y_i$ survey data and $w_i$ weights
- Questions raised by the calibration of estimators (consequences for the weights $w_i$ of using the administrative data)
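
The two types of aggregates listed for selective editing (turnover of a sector k and turnover of a branch k) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the slides: the sector formula is read as an indicator that unit i has APE code k, multiplied by its weight and its total turnover; the branch formula is read as the weight multiplied by the part of the unit's turnover in activity k; the data are invented; and `principal_activity` is a hypothetical stand-in (largest share wins) for the actual APE algorithm, which the slides do not describe.

```python
from collections import defaultdict

# Invented sample: each unit has a weight w and a breakdown of its
# turnover by activity code (figures are for illustration only).
units = [
    {"w": 2.0, "breakdown": {"501": 80.0, "522": 20.0}},
    {"w": 1.0, "breakdown": {"522": 50.0, "501": 10.0}},
    {"w": 3.0, "breakdown": {"522": 40.0}},
]


def principal_activity(breakdown):
    """Hypothetical stand-in for the APE algorithm: the activity
    with the largest turnover is taken as the principal activity."""
    return max(breakdown, key=breakdown.get)


def sector_turnover(units):
    """Sector k: sum of w_i * T_i over units whose principal activity
    is k, where T_i is the total turnover of unit i."""
    totals = defaultdict(float)
    for u in units:
        k = principal_activity(u["breakdown"])
        totals[k] += u["w"] * sum(u["breakdown"].values())
    return dict(totals)


def branch_turnover(units):
    """Branch k: sum of w_i * T_i(APE_k), where each unit contributes
    only the part of its turnover made in activity k."""
    totals = defaultdict(float)
    for u in units:
        for k, t in u["breakdown"].items():
            totals[k] += u["w"] * t
    return dict(totals)


sectors = sector_turnover(units)
branches = branch_turnover(units)
print("sectors: ", sectors)
print("branches:", branches)
```

Under this reading, both sets of aggregates add up to the same overall turnover but distribute it differently. An error in one unit's breakdown shifts only the misreported amount between branches, whereas it can change the unit's APE code and so move its whole weighted turnover from one sector to another, which may be one reason why a score for selective editing has to consider both types of statistics.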