Sources for co-ordinated population estimates

13th International Roundtable on Business Survey Frames
Paris ; 27 September-1 October 1999
Session No 3
Paper No 5
Wim KLOEK and Jean RITZEN, Statistics Netherlands
Abstract
There are several internal and external sources for co-ordinated estimates of the population of
enterprises. The paper discusses the advantages and disadvantages of these sources. The sources
must be combined in an estimate. What makes an optimal estimate and how can we combine the
sources in an optimal way?
Keywords: Business Register, Co-ordination, Population estimates
Introduction
Last year we presented a paper on the roots of uncoordinated population figures in business statistics
and possible solutions for the co-ordination problem that arises. We distinguished between taking
away the cause (i.e. harmonisation of the production processes) and removing the effect (reweighting
to a common frame). These two approaches are complementary. In this paper we will discuss the construction
of the common weighting frame:
- what sources are available?
- what are their advantages and disadvantages?
- how can they be combined in an optimal way?
Sources
Roughly there are four types of sources for the estimation of the population of enterprises:
- the Business Register;
- Administrative sources;
- Business Surveys;
- Register Maintenance Survey.
The Dutch Business Register contains information on, in principle, all enterprises. It is maintained
through information on changes from the Chambers of Commerce, from the social security authorities
(mainly for information on size) and through information collected by Statistics Netherlands. The latter
takes several forms: feedback from Business Surveys, individual checking of suspicious records and
systematic maintenance surveys. The Register is our natural starting point; the co-ordination problem
arises from the practical impossibility of maintaining the Register in such a way that it always directly
reflects the population of enterprises.
Information from other Administrative Sources is made accessible in a co-ordinated way through an
input database (BASELINE). For the moment this database contains only fiscal information (from VAT
and corporate tax), but there are several possible extensions (for instance data from the public utility
companies). The database serves to provide information for business statistics, both as a substitute
for direct coverage in a survey, and as a tool to stratify the surveys in a more efficient way.
The basis for each Business Survey is a sample taken from the Business Register. In most surveys
the large enterprises are fully covered, medium-sized are covered on a sample basis and very small
enterprises are not covered at all. The surveys concern production and related subjects, for instance:
research and development, investment or automation technology.
The tailpiece in the collection of sources is the Register Maintenance Survey. The survey covers the
small enterprises. For the last two years this survey has been carried out in co-operation with the Chambers of
Commerce. Statistics Netherlands takes a sample and the Chambers of Commerce mail the
questionnaires. In this way both the register of the Chambers, and the register of Statistics
Netherlands profit from the information, which reduces response burden and also enhances the
external co-ordination with the Chambers. The survey-facility at the Business Register has been
reduced.
Advantages and disadvantages
The usefulness of the different sources for the estimation of the population of enterprises depends on
a mix of design-characteristics:
- coverage;
- timeliness;
- sample size;
- non-response level;
- non-response policy;
- matching possibilities or problems;
- delineation of units;
- the determination of characteristics.
The different sources have a different mix of characteristics. The advantage is that their combination
can bring about a fair step forward in the determination of the population of enterprises. On the
other hand, it adds to the complexity of the process.
The scheme gives the main advantages and disadvantages of the sources from the point of view of
the co-ordination of the enterprise population. The scheme is rather broad and not exhaustive. It is the
first step in the actual project to draw up a list of design elements for all specific sources.
The impact of uncovered areas is clear; the impact of matching problems and non-response is easily
underrated. Matching problems occur especially with very small and with very large enterprises.
Inactive units tend to respond less than active ones. In short, matching problems and non-response
are correlated with the target variable. As a result, standard reweighting techniques will yield
biased estimates.
The solution for this problem is a low level of non-response combined with knowledge on the
composition of non-response. This knowledge is to be used in the estimation process. The same
procedure applies, mutatis mutandis, to the matching problem. It goes without saying that it will not
always be easy to implement, yet it is essential to the quality of statistical output.
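A small numerical sketch may help to show the size of the effect. The figures are invented for illustration and do not come from any survey of Statistics Netherlands; the function names are hypothetical, and the assumed share of active units among the non-respondents stands for whatever knowledge on the composition of non-response is available.

```python
def naive_active_share(resp_active, resp_inactive):
    """Share of active units among respondents only.

    This is what standard reweighting implicitly assumes for the
    non-respondents as well.
    """
    return resp_active / (resp_active + resp_inactive)


def adjusted_active_share(resp_active, resp_inactive, nonresp, nonresp_active_share):
    """Share of active units in the whole sample, using outside knowledge
    (an assumption here) on the composition of the non-response."""
    active = resp_active + nonresp * nonresp_active_share
    total = resp_active + resp_inactive + nonresp
    return active / total


# Invented sample of 1000 units: 800 respond (720 active, 80 inactive),
# 200 do not respond. Assumption: only half of the non-respondents are active.
naive = naive_active_share(720, 80)
adjusted = adjusted_active_share(720, 80, 200, 0.5)
print(f"naive share: {naive:.2f}, adjusted share: {adjusted:.2f}")
```

In this invented case the naive share of active units is 0.90 and the adjusted share is 0.82, so ignoring the composition of the non-response would overstate the active population by about a tenth.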
Scheme 1. Advantages and disadvantages of sources to estimate the population of enterprises
(for the Dutch situation)

Business Register
Advantages:
- nearly complete coverage
Disadvantages:
- registration of inactive enterprises
- errors in delineation of units and characteristics
- time lags

Administrative sources
Advantages:
- complete in the covered strata
- information on turnover
- information on employment
Disadvantages:
- VAT: some (small) enterprises not covered
- corporate tax: natural persons not covered
- matching problems
- delineation of units
- no information on the kind of economic activity
- no statistical control over continuity of administrative procedures

Business Surveys
Advantages:
- statistical control over all design aspects
Disadvantages:
- non-response
- small units not covered or only covered on the basis of a small sample
- a set of surveys with different designs
- little information on the kind of economic activity

Register Maintenance Survey
Advantages:
- coverage of small units
- low level of non-response
- collects all relevant information
Disadvantages:
- sample survey
- quality problems: missing records, regional differences, possible bias in the answers

For medium-sized and large enterprises there is an abundance of information. For the estimation of
the number of small enterprises the Register Maintenance Survey is crucial. In traditional statistical
programming high priority is given to data collection on large and industrial enterprises. As a few large
enterprises account for a large part of production and turnover, it is cost-efficient to direct the efforts to
these companies. On the other hand we see a revival of industrial organisation, the growing
importance of business services, and discussion on innovation, entrepreneurship and enterprise dynamics
in small business economics. To gain insight into these aspects, we will have to redirect some of the
statistical effort towards small enterprises.
Statistics Netherlands has reduced its interview-capacity for register maintenance surveys
considerably; this makes the business register more and more dependent on administrative sources.
For this reason it is of strategic importance to have regular checks on the quality of the administrative
data.
How to combine
Timeliness makes a first reduction in the possible sources. The co-ordinated population estimate is a direct
output for enterprise demography, and a tool for the reweighting of the annual business surveys.
These statistics are published within a year after the reference period (which is also consistent with the
SBS Regulation). This implies that the direct link between annual statistics and short-term statistics will
possibly be lost. As the short-term figures are mainly indicators, this does not seem to be a significant loss.
A second reduction is on the basis of the size of enterprises. The quality of the Business Register in
the area of large enterprises can be taken for granted. Large firms have their own account manager within
our Bureau; moreover, they receive a lot of questionnaires from the Business Surveys. In this area
frame errors can be dealt with and corrected at an individual level. In the project on co-ordination of
the population of enterprises we will analyse the feedback from the surveys to the Register in this
respect. For the large enterprises the co-ordinated population can be derived directly from the
Register.
What should be the level of co-ordination? A major problem is that the NACE code distributes the number
of enterprises very unevenly over the separate branches. For instance, it is not advisable to take
the section level as a basis: it would create an over-presence of manufacturing and mining branches,
whereas all construction is confined to one code. We are in the process of defining the co-ordination
level. It depends on the wishes of the users, but also on the accuracy of the estimates. Perhaps the issue
of the level of co-ordination should have an international dimension.
The production process we have in mind will be about as follows:
- for each separate business survey the sample is taken from the Business Register;
- data collection (either through a survey, or from administrative files);
- editing;
- raising to the sample frame;
- reweighting to the co-ordinated population estimate.
In the reweighting procedure the latest information is made available in a co-ordinated way. This
includes changes in the Business Register since the sample was taken.
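The last two steps of this process can be sketched for a single stratum as follows. This is only a minimal illustration under assumptions of our own: a simple expansion estimator for the raising step and a simple ratio correction for the reweighting step. The paper does not prescribe these estimators, and the function names and figures are hypothetical.

```python
def raising_weight(frame_count, respondent_count):
    """Raising to the sample frame: each respondent stands for
    frame_count / respondent_count units of the frame the sample was drawn from."""
    if respondent_count <= 0:
        raise ValueError("at least one respondent is needed in the stratum")
    return frame_count / respondent_count


def correction_factor(coordinated_count, frame_count):
    """Reweighting: ratio of the co-ordinated population estimate
    to the number of units in the original sample frame."""
    return coordinated_count / frame_count


def reweighted_total(values, frame_count, coordinated_count):
    """Return (total raised to the frame, total reweighted to the
    co-ordinated population estimate) for one stratum."""
    raised = raising_weight(frame_count, len(values)) * sum(values)
    return raised, raised * correction_factor(coordinated_count, frame_count)


# Invented stratum: 500 units in the frame when the sample was drawn,
# 50 respondents with a turnover of 10.0 each, and a co-ordinated
# population estimate of 400 units for the same stratum.
raised, coordinated = reweighted_total([10.0] * 50, 500, 400)
print(round(raised), round(coordinated))
```

In this invented stratum the total raised to the frame is 5000 and the reweighted total is 4000, a downward correction of 20 % that reflects the smaller co-ordinated population.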
Concluding remarks
The problem of unco-ordinated populations of enterprises is largely confined to the small enterprises.
Many inactive units are registered in the Business Register (up to 25 % of the units in a stratum). Also
the quality of information on characteristics is lower, because small units are seldom surveyed. This
makes a Register Maintenance Survey important to monitor the existence of enterprises and the
quality of their characteristics.
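As an illustration of how such a survey could feed the population estimate, the sketch below estimates the number of active enterprises in one stratum from the Register count and the share of active units found in a sample. It assumes simple random sampling without replacement and full response, which is a simplification of our own; the function name and the figures are hypothetical.

```python
import math


def estimate_active_population(register_count, sample_size, sample_active):
    """Estimate the number of active enterprises in a stratum.

    Assumes a simple random sample without replacement from the register
    and full response. Returns the estimate and its standard error.
    """
    share = sample_active / sample_size
    # Finite population correction for sampling without replacement.
    fpc = 1 - sample_size / register_count
    se_share = math.sqrt(fpc * share * (1 - share) / (sample_size - 1))
    return register_count * share, register_count * se_share


# Invented stratum: 10000 registered units, a sample of 400,
# of which 300 turn out to be active (25 % inactive).
estimate, std_error = estimate_active_population(10000, 400, 300)
print(f"estimated active enterprises: {estimate:.0f} (s.e. {std_error:.0f})")
```

The standard error gives a first indication of whether the sample size in a stratum is sufficient for the chosen level of co-ordination.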
For the co-ordinated population estimate to be a success it is important that
- the corrections in the reweighting are not too large;
- the estimate itself will generate an acceptable time-series.
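The first condition can be monitored with a simple check per stratum. The sketch below flags strata in which the ratio of the co-ordinated estimate to the frame count deviates from 1 by more than a chosen tolerance. The tolerance of 10 % and all names and figures are assumptions for illustration only; the paper does not define what counts as too large.

```python
def flag_large_corrections(frame_counts, coordinated_counts, tolerance=0.10):
    """Return the strata whose correction factor differs from 1
    by more than the tolerance (a hypothetical threshold)."""
    flagged = {}
    for stratum, frame in frame_counts.items():
        factor = coordinated_counts[stratum] / frame
        if abs(factor - 1.0) > tolerance:
            flagged[stratum] = factor
    return flagged


# Invented strata: A needs a correction of 2 %, B of 20 %.
frame_counts = {"A": 1000, "B": 500}
coordinated_counts = {"A": 980, "B": 400}
flagged = flag_large_corrections(frame_counts, coordinated_counts)
print(flagged)
```

With these invented figures only stratum B is flagged, with a correction factor of 0.8.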
The main overall advantages of the mixed-mode approach in assessing the population of enterprises
are the higher quality of the population figures and the better comparability of business statistics.
Reference
Wim Kloek and Jean Ritzen, Co-ordination of the estimated population of enterprises, Paper
presented at the 12th International Roundtable on Business Survey Frames, Helsinki 1998