13th International Roundtable on Business Survey Frames
Paris, 27 September-1 October 1999
Session No 3, Paper No 5

Wim KLOEK and Jean RITZEN, Statistics Netherlands

Sources for co-ordinated population estimates

Abstract

There are several internal and external sources for co-ordinated estimates of the population of enterprises. The paper discusses the advantages and disadvantages of these sources. The sources must be combined into a single estimate. What makes an estimate optimal, and how can we combine the sources in an optimal way?

Keywords: Business Register, Co-ordination, Population estimates

Introduction

Last year we presented a paper on the roots of unco-ordinated population figures in business statistics and on possible solutions for the resulting co-ordination problem. We distinguished between taking away the cause (i.e. harmonisation of the production processes) and removing the effect (reweighting to a common frame). These two courses of action complement each other. In this paper we discuss the construction of the common weighting frame: what sources are available? what are their advantages and disadvantages? how can they be combined in an optimal way?

Sources

Roughly, there are four types of sources for the estimation of the population of enterprises: the Business Register; Administrative sources; Business Surveys; the Register Maintenance Survey.

The Dutch Business Register contains information on, in principle, all enterprises. It is maintained through information on changes from the Chambers of Commerce, from the social security authorities (mainly for information on size) and through information collected by Statistics Netherlands. This collection takes several forms: feedback from Business Surveys, individual checking of suspicious records and systematic maintenance surveys. The Register is our natural starting point; the co-ordination problem arises from the practical impossibility of maintaining the Register in such a way that it always directly reflects the population of enterprises.
Information from other Administrative Sources is made accessible in a co-ordinated way through an input database (BASELINE). For the moment this database contains only fiscal information (from VAT and corporate tax), but several extensions are possible (for instance data from the public utility companies). The database provides information for business statistics, both as a substitute for direct coverage in a survey and as a tool to stratify the surveys more efficiently.

The basis for each Business Survey is a sample taken from the Business Register. In most surveys the large enterprises are fully covered, medium-sized enterprises are covered on a sample basis and very small enterprises are not covered at all. The surveys concern production and related subjects, for instance research and development, investment or automation technology.

The final element in the collection of sources is the Register Maintenance Survey. The survey covers the small enterprises. For the last two years this survey has been carried out in co-operation with the Chambers of Commerce: Statistics Netherlands draws the sample and the Chambers of Commerce mail the questionnaires. In this way both the register of the Chambers and the register of Statistics Netherlands profit from the information, which reduces the response burden and also enhances external co-ordination with the Chambers. The survey facility at the Business Register has been reduced.

Advantages and disadvantages

The usefulness of the different sources for the estimation of the population of enterprises depends on a mix of design characteristics: coverage; timeliness; sample size; non-response level; non-response policy; matching possibilities or problems; delineation of units; the determination of characteristics.

The different sources have a different mix of characteristics. The advantage is that their combination can bring about a fair step forward in the determination of the population of enterprises.
On the other hand, the combination adds to the complexity of the process.

Scheme 1 gives the main advantages and disadvantages of the sources from the point of view of the co-ordination of the enterprise population. The scheme is rather broad and not exhaustive. It is the first step in the current project to draw up a list of design elements for all specific sources.

The impact of uncovered areas is clear; the impact of matching problems and non-response is easily underrated. Matching problems occur especially with very small and with very large enterprises. Inactive units tend to respond less often than active ones. In short, matching problems and non-response are correlated with the target variable. As a result, standard reweighting techniques will produce biased estimates. The solution to this problem is a low level of non-response combined with knowledge of the composition of the non-response. This knowledge is to be used in the estimation process. The same procedure applies, mutatis mutandis, to the matching problem. It will not always be easy to implement, yet it is essential to the quality of statistical output.

Scheme 1. Advantages and disadvantages of sources to estimate the population of enterprises (for the Dutch situation)

Business Register
  Advantages: nearly complete coverage
  Disadvantages: registration of inactive enterprises; errors in delineation of units and characteristics; time lags

Administrative sources
  Advantages: complete in the covered strata; information on turnover; information on employment
  Disadvantages: VAT: some (small) enterprises not covered; corporate tax: natural persons not covered; matching problems; delineation of units; no information on the kind of economic activity; no statistical control over continuity of administrative procedures

Business Surveys
  Advantages: statistical control over all design aspects
  Disadvantages: non-response; small units not covered or only covered on the basis of a small sample; a set of surveys with different designs; little information on the kind of economic activity

Register Maintenance Survey
  Advantages: coverage of small units; low level of non-response; collects all relevant information
  Disadvantages: sample survey; quality problems: missing records, regional differences, possible bias in the answers

For medium-sized and large enterprises there is an abundance of information. For the estimation of the number of small enterprises the Register Maintenance Survey is crucial.

In traditional statistical programming high priority is given to data collection on large and industrial enterprises. As a few large enterprises account for a large part of production and turnover, it is cost-efficient to direct the efforts towards these companies. On the other hand we see a revival of industrial organisation, the growing importance of business services, and discussion on innovation, entrepreneurship and enterprise dynamics in small business economics. To gain insight into these aspects, we will have to redirect some of the statistical effort towards small enterprises.
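The bias discussed earlier in this section, where inactive units respond less often than active ones, can be illustrated with a small numerical sketch. All figures below (stratum size, sample size, response rates, composition of the non-response) are hypothetical and chosen only for illustration; the two function names are likewise our own and do not describe the estimation procedure actually used at Statistics Netherlands. The first estimate scales up the share of active units among respondents, the second uses assumed knowledge of the composition of the non-response.

```python
# Hypothetical illustration; none of these figures come from the paper.


def naive_active_estimate(register_size, resp_active, resp_inactive):
    """Scale the active share among respondents up to the register stratum."""
    respondents = resp_active + resp_inactive
    return register_size * resp_active / respondents


def adjusted_active_estimate(register_size, resp_active, resp_inactive,
                             nonresp_total, nonresp_active_share):
    """Use knowledge of the composition of the non-response in the estimate."""
    sample_size = resp_active + resp_inactive + nonresp_total
    active_in_sample = resp_active + nonresp_total * nonresp_active_share
    return register_size * active_in_sample / sample_size


# Assumed stratum: 1000 registered units, of which 750 active, 250 inactive.
# Assumed sample of 200 units: 150 active, 50 inactive.
# Assumed response rates: 80 % for active units, 40 % for inactive units.
REGISTER_SIZE = 1000
RESP_ACTIVE = 120            # 150 * 0.80
RESP_INACTIVE = 20           # 50 * 0.40
NONRESP_TOTAL = 60           # 200 - 140
NONRESP_ACTIVE_SHARE = 0.5   # 30 of the 60 non-respondents are active

naive = naive_active_estimate(REGISTER_SIZE, RESP_ACTIVE, RESP_INACTIVE)
adjusted = adjusted_active_estimate(REGISTER_SIZE, RESP_ACTIVE, RESP_INACTIVE,
                                    NONRESP_TOTAL, NONRESP_ACTIVE_SHARE)

print(f"naive estimate of active units:    {naive:.0f}")     # 857
print(f"adjusted estimate of active units: {adjusted:.0f}")  # 750
```

Under these assumptions the naive estimate overstates the number of active enterprises (about 857 against a true 750), while the estimate that uses the composition of the non-response recovers the true figure. In practice the composition of the non-response is itself an estimate, for instance from a follow-up of non-respondents, so the correction is only as good as that knowledge.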
Statistics Netherlands has reduced its interview capacity for register maintenance surveys considerably; this makes the Business Register more and more dependent on administrative sources. For this reason it is of strategic importance to have regular checks on the quality of the administrative data.

How to combine

Timeliness provides a first reduction in the set of possible sources. The co-ordinated population estimate is direct output for enterprise demography, and a tool for the reweighting of the annual business surveys. These statistics are published within a year after the reference period (which is also consistent with the SBS Regulation). This implies that the direct link between annual statistics and short-term statistics will possibly be lost. As the short-term figures are mainly indicators, this does not seem to be a significant loss.

A second reduction is on the basis of the size of enterprises. The quality of the Business Register in the area of large enterprises can be taken for granted. Large firms have their own account manager within our Bureau; moreover, they receive many questionnaires from the Business Surveys. In this area frame errors can be dealt with and corrected at an individual level. In the project on co-ordination of the population of enterprises we will analyse the feedback from the surveys to the Register in this respect. For the large enterprises the co-ordinated population can be derived directly from the Register.

What should be the level of co-ordination? A major problem is that the NACE classification distributes the number of enterprises very unevenly over the separate branches. For instance, it is not advisable to take the section level as a basis: it would create an over-representation of manufacturing and mining branches, whereas all construction is confined to one code. We are in the process of defining the co-ordination level. It depends on the wishes of the users, but also on the accuracy of the estimates.
Perhaps the issue of the level of co-ordination should have an international dimension.

The production process we have in mind will be roughly as follows: for each separate business survey the sample is taken from the Business Register; data collection (either through a survey or from administrative files); editing; raising to the sample frame; reweighting to the co-ordinated population estimate. In the reweighting procedure the latest information is made available in a co-ordinated way. This includes changes in the Business Register since the sample was taken.

Concluding remarks

The problem of unco-ordinated populations of enterprises is largely confined to the small enterprises. Many inactive units are registered in the Business Register (up to 25 % of the units in a stratum). The quality of information on characteristics is also lower, because small units are seldom surveyed. This makes a Register Maintenance Survey important for monitoring the existence of enterprises and the quality of their characteristics.

For the co-ordinated population estimate to be a success it is important that the corrections in the reweighting are not too large; the estimate itself will generate an acceptable time series.

The main overall advantages of the mixed-mode approach in assessing the population of enterprises are the higher quality of the population figures and the better comparability of business statistics.

Reference

Wim Kloek and Jean Ritzen, Co-ordination of the estimated population of enterprises. Paper presented at the 12th International Roundtable on Business Survey Frames, Helsinki, 1998.