EUROPEAN COMMISSION
EUROSTAT
Directorate B: Quality, methodology and information systems
Unit B6: Reference databases and metadata

Feasibility Study on the collection and production of process related metadata

APRIL 2011

1. Overview of the applied Methodology

This section outlines the methodology applied in the feasibility study on process metadata. The study was composed of the following stages:

Determination of target population
The target population was composed of the same 33 NSIs that participated in the second phase of the assessment analysis. These are the 27 EU member states, the 3 EFTA countries and 3 countries that are candidates for EU membership [1].

Determination of needs for information
The use of statistical business process models and the availability of process metadata in relation to the main phases of those processes were defined as the main subjects of the analysis. The integration and harmonisation of statistical business processes is mainly related to 1) the use of a specific Model/Standard of statistical business processes and 2) the documentation of the statistical processes. For that reason, these two elements were defined as the main areas of investigation in the feasibility analysis.

Data Collection
The questionnaire was sent to the 33 NSIs concerned on 16 December 2010, and replies were requested by 21 January 2011.

Analysis
In the end, 32 out of 33 NSIs participated in the survey by filling in the questionnaire on process related metadata and statistical business process models.

[1] Iceland is a member of EFTA and has also been a candidate country since 2009. In the analysis it is considered as an EU candidate country. Hence, in the 3rd phase of the assessment analysis the population of monitored candidate countries was enlarged. The set of candidate countries is composed of Croatia, Turkey and Iceland.

2.
Current Situation

This section provides the main outcomes of the analysis, both for the Statistical Business Process Models and for the metadata that describe those processes. The feasibility analysis investigated the following aspects:

1) The role of the Statistical Business Process Models in the statistical business lifecycle of the NSIs within the ESS
2) Process metadata in terms of current and future availability, content and their relation to the IT applications used for their production, storage and dissemination.

The analysis of process metadata was conducted in relation to the 9 main phases of the Generic Statistical Business Process Model (GSBPM).

2.1. Statistical Business Process Models

The idea of modelling the statistical process (through the development of statistical business process models) is not new: efforts in this direction have been made in the statistical community for more than ten years. Moreover, the work has reached a level of maturity at which the next logical step was the development of a generic international model, such as the GSBPM. A generic model can provide answers and solutions to many of the daily challenges that statistical offices have to tackle, such as:

- Harmonisation of the terminologies used;
- A common framework for the development of metadata systems;
- Facilitation of quality management procedures;
- Software sharing and reuse;
- Enabling of process-based management, and others.

However, despite the importance and added value of generic business process models, and the resources invested in their development, it is estimated that the level of adoption of such models differs considerably between statistical offices.
In the rest of this subsection, we present and comment on the findings of the “2010/2011” questionnaire survey of the project on monitoring of national metadata systems that relate to statistical business process models.

Use of models/standards in the statistical business lifecycle

The first important finding of the analysis is that the European Statistical Community is by no means homogeneous as far as the use of statistical business process models is concerned. More specifically, 19 out of the 32 NSIs declare that they use a model/standard for modelling their business processes, but only 4 of them have adopted and use the GSBPM (See Table 2.1). Of the remaining 15 NSIs, which use other models/standards, 12 use a model that is related to the GSBPM (See Table 2.2), 2 use models not related to the GSBPM, and for the last one no information is available. Finally, about 40% of the NSIs (13/32) do not use any kind of model/standard.

If we focus only on those NSIs that have adopted some kind of model for business process modelling (GSBPM or other), their maturity level again differs considerably. In fact, only 5 out of the 19 NSIs currently use their model to a large extent for their processes and 8 use it to a small extent (See Table 2.3). The picture is different when considering new/future processes. In this case, 5 out of the 8 NSIs which currently use the model to a small extent plan to extend its use, whereas the other 3 have no particular plans. Finally, all the NSIs that currently use the statistical business process model to a large extent plan to maintain this policy for future processes.

Table 2.1: Statistical Business Process Model
Do you have a Model/Standard for describing the statistical business processes?
                                                                       Total
Yes, we have the Generic Statistical Business Process Model (GSBPM)        4
Yes, we have other than GSBPM                                             15
No, we do not have a specific Model/Standard                              13
Total                                                                     32

Table 2.2: Relation of the model to the GSBPM
Is the model related to the GSBPM?    Total [2]
Yes                                          12
No                                            2
NA                                            1
Total                                        15

[2] The Total concerns the 15 NSIs which use a model other than GSBPM.

Table 2.3: Extent of use of the Statistical Business Process Model
Categories (used for both current and future processes):
(A) Use of the Model to a large extent
(B) Use of the Model to a smaller extent
(C) We have adopted the Model but we are not actually using it for the moment
(D) NA

                     Future processes
Current processes    (A)   (B)   (C)   (D)   Total
(A)                    5     0     0     0       5
(B)                    5     3     0     0       8
(C)                    2     0     2     0       4
(D)                    1     0     0     1       2
Total                 13     3     2     1      19

Obstacles in using a Specific Model/Standard

Another important objective of the questionnaire survey was to pinpoint the reasons which inhibit and discourage NSIs from adopting and using any type of generalised business process model. The majority of the NSIs that do not have a model (9/13) indicated “limited human and/or financial resources” as the main obstacle to introducing models/standards (See Table 2.4). The "absence of (corporate) strategy for improving the degree of harmonisation and standardisation of business process" was mentioned as a reason for the absence of a model by 2 NSIs.

Table 2.4: Reasons for not having a model/standard
Reasons for not having a Model/Standard                                                                   Total [3]
Absence of strategy for improving the degree of harmonisation and standardisation of business processes          2
Limited human and/or financial resources                                                                         9
Other                                                                                                            3

[3] The Total concerns the 13 NSIs which do not use a model/standard. One country selected two answers, therefore the sum of the Total column equals 14.
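The counts reported in Tables 2.1, 2.2 and 2.4 can be cross-checked with a few lines of arithmetic. The following Python sketch is only an illustration: the variable and function names are hypothetical, and the figures are transcribed from the tables above.

```python
# Counts transcribed from Tables 2.1, 2.2 and 2.4 (all names are illustrative).
table_2_1 = {"GSBPM": 4, "other model": 15, "no model": 13}
table_2_2 = {"related to GSBPM": 12, "not related": 2, "NA": 1}
table_2_4 = {"absence of strategy": 2, "limited resources": 9, "other": 3}


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


respondents = sum(table_2_1.values())                   # all responding NSIs
users = table_2_1["GSBPM"] + table_2_1["other model"]   # NSIs with any model

# Table 2.2 only covers the NSIs that use a model other than GSBPM.
assert sum(table_2_2.values()) == table_2_1["other model"]

# One NSI gave two reasons, so Table 2.4 sums to one more than the 13 non-users.
assert sum(table_2_4.values()) == table_2_1["no model"] + 1

print(f"{users} of {respondents} NSIs use a model ({share(users, respondents)}%)")
print(f"{share(table_2_1['no model'], respondents)}% of NSIs use no model")
```

Rounded to the nearest integer, the share of NSIs without a model is consistent with the "about 40%" quoted in the text.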
Future plans for the adoption of a Model/Standard

Another important aim of the survey questionnaire was to draw a general picture of the future of statistical business process models and standards in the NSIs. To this end, the NSIs that do not currently use any kind of model or standard (13 in total) were invited to provide information on their future plans. Again, the results were varied (See Table 2.5): 4 out of the 13 NSIs declared that they intend to adopt GSBPM in the future, although the implementation has not started yet. The same number of respondents (i.e. 31%) replied that they intend to use some kind of model/standard, but have not yet decided which one. Moreover, another 2 NSIs stated that they are already running a project to introduce a model/standard in the organisation, although this is not GSBPM. None of the NSIs mentioned a project in progress for the adoption of GSBPM. Finally, one NSI reported that it has no plan to adopt a model (GSBPM or other) and one respondent did not provide any answer to this question.

Table 2.5: Future plans for using a model/standard
Existence of plans for the future adoption and implementation of a statistical business process model    Total [4]
Yes, we plan to adopt GSBPM. The implementation did not start yet                                                4
Yes, but we do not know yet which model/standard we will adopt                                                   4
Yes, a project for the adoption of a Model/Standard other than GSBPM is in progress                              2
No, we do not have any plans                                                                                     1
Other                                                                                                            1
NA                                                                                                               1
Total                                                                                                           13

[4] 13 NSIs that do not use a model/standard.

Assess the contribution of GSBPM

Another aspect of the questionnaire survey addressed the contribution of GSBPM to various aspects of statistical production.
The respondents were asked to evaluate the importance of the contribution of GSBPM to a selection of 8 main issues. Importance was rated on a four-level scale, ranging from “very important” to “not important at all”. The analysis revealed that NSIs consider the most important contribution of the GSBPM to be the “standardisation of statistical processes”, since 23/31 NSIs (74%) selected the option “Very important” for this issue (See Table 2.6). On the other hand, the least important contributions were found to be the “impact on the organisational structure” and the "measurement of operational costs", each rated “not important at all” by 5/31 NSIs (16%). Looking at the individual issues, the majority of the NSIs considered that the GSBPM makes a significant contribution (answers “very important” and “important”) in the fields of:

- “Development of statistical metadata systems” (10 and 17 respondents respectively),
- “Quality assessment of statistical business processes” (same scores),
- “Description of statistical business processes” (18 and 11),
- “Increase of understanding of statistical business processes” (17 and 12).
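The combined share of “very important” and “important” answers follows directly from the counts quoted above. A minimal Python sketch, assuming the 31 respondents used as denominator in the text; the names are illustrative only.

```python
RESPONDENTS = 31  # denominator used in the text (e.g. 23/31)

# (very important, important) counts quoted in the text for four issues.
counts = {
    "Development of statistical metadata systems": (10, 17),
    "Quality assessment of statistical business processes": (10, 17),
    "Description of statistical business processes": (18, 11),
    "Increase of understanding of statistical business processes": (17, 12),
}


def positive_share(very_important, important, total=RESPONDENTS):
    """Percentage of respondents answering 'very important' or 'important'."""
    return round(100 * (very_important + important) / total)


shares = {issue: positive_share(*pair) for issue, pair in counts.items()}

for issue, pct in shares.items():
    print(f"{issue}: {pct}%")
```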
Table 2.6: Contribution of GSBPM
Importance of GSBPM's contribution: (A) Very important, (B) Important, (C) Not all that important, (D) Not important at all, (E) Don’t know

Issue to which GSBPM contributes                                (A)  (B)  (C)  (D)  (E)
Development of statistical metadata systems                      10   17    4    0    0
Standardisation of statistical business processes                23    7    1    0    0
Quality assessment of statistical business processes             10   17    2    2    0
Description of statistical business processes                    18   11    1    1    0
Impact on the organisational structure                            5    9   12    5    0
Increase of understanding of statistical business processes      17   12    0    2    0
Provision of an input to high-level corporate work planning       6   12    8    3    2
Measurement of operational costs                                  1   13   10    5    2

The entities of the NSIs that are involved in each phase of the GSBPM

As a last step for this part of the analysis, the questionnaire focused on the nine phases of the statistical business process, as defined by the statistical community, and on the way each statistical institute deals with each individual phase (central handling, handling by production units, other entities or any combination of the three). In total, 30 organisations responded to this question (See Table 2.7). The results reveal that, in general, Statistical Production Teams are involved in most of the phases of the statistical business process, either alone or in cooperation with other partners (mainly central units/departments). In each phase of the statistical business process (except Dissemination), the main responsibilities are assigned exclusively to the Statistical Production Teams in about half of the participating countries. This is mainly the case for phases “5.Process” and “6.Analyse”. The role of the Statistical Production Teams is also essential in phases “2.Design” and "9.Evaluate" (involvement in 29/30 NSIs and 25/27 NSIs respectively), in which Central Units/Departments often collaborate. The most centralised phase is Dissemination.
This phase is exclusively under the responsibility of Central Units/Departments in 37% of the NSIs (11/30). Moreover, the proportion of NSIs in which dissemination processes are executed both by Central Units/Departments and Statistical Production Teams is also equal to 37%.

Table 2.7: Entities involved in each phase of the statistical business process
The entities of the NSIs that are involved in the different phases:
(A) Central Unit/Dept. + Statistical Production Teams + other entity
(B) Central Unit/Dept. + Statistical Production Teams
(C) Central Unit/Dept.
(D) Statistical Production Teams
(E) Statistical Production Teams + other entity
(F) Central Unit/Dept. + other entity
(G) Other
(H) NA

Phase              (A)  (B)  (C)  (D)  (E)  (F)  (G)  (H)  Total
1. Specify needs     3    3    7   14    2    0    0    0     29
2. Design            1   12    1   13    3    0    0    0     30
3. Build             4    8    4   11    1    1    0    1     30
4. Collect           3    4    8   13    1    0    1    0     30
5. Process           2    8    1   17    2    0    0    0     30
6. Analyse           1    5    1   20    3    0    0    0     30
7. Disseminate       4   11   11    2    0    1    1    0     30
8. Archive           1    6    6   11    2    0    2    0     28
9. Evaluate          1   10    0   12    2    0    1    1     27

2.2. Process Metadata

The main objective of the survey on process metadata was to measure the extent to which the production and provision of process metadata within the ESS are feasible. Hence, within the framework of the feasibility study the following issues were investigated:

- The extent of current and future availability of process metadata in the ESS for each phase of the Generic Statistical Business Process Model (GSBPM).
- The content of process metadata: the type of information that is currently provided or that is feasible to produce in the future.
- The identification of the phases of the statistical business process for which better documentation is necessary.
- The degree to which the existing IT infrastructure of the NSIs currently supports the production, storage and dissemination of process metadata within specific phases of the statistical production lifecycle.

Current availability of process metadata

Process metadata concern the description of the statistical business process.
The main reason for investigating the availability of process metadata for each phase of GSBPM is that the structure of the model is considered the proper basis for the compilation of process metadata. The results indicate very clearly that process metadata are currently available within each phase of GSBPM, but the extent of availability differs among the 9 phases (See Figure 2.8). Phase “7.Disseminate” is the one for which process metadata are most often fully available in NSIs (61%). Furthermore, in 32% of the NSIs, information is partially available for this phase. The same cumulative share (93%) of NSIs with fully or partially available process metadata is found for phases "4.Collect" and "5.Process", but the share of NSIs where the information is fully available is lower for these two phases than for phase "7.Disseminate" (52% and 42% respectively). Among all phases, the availability of metadata is lowest for phases “8.Archive” and “9.Evaluate”. More concretely, 11/30 (37%) NSIs have no metadata at all for these two phases. For the remaining phases ("1.Specify needs", "2.Design", "3.Build" and "6.Analyse"), process metadata are partially available in about half of the NSIs.

Figure 2.8: Availability of process metadata in each phase of GSBPM

Future availability of process metadata

The feasibility of the future provision of process metadata was investigated for the subset of NSIs that do not currently provide process metadata for the different phases of GSBPM. The results of this analysis are provided in Table 2.9. According to this table, phases "1.Specify needs", "8.Archive" and "9.Evaluate" are those where the future collection of information seems most feasible. To a lesser extent, information concerning phase "3.Build" could also be partially collected.
By contrast, it appears difficult for NSIs which do not collect any process metadata on phase "2.Design" to obtain more information in the future. Finally, uncertainty exists concerning the possible future collection of metadata related to phase "6.Analyse", where 2 out of the 4 NSIs replying to this question cannot conclude on any possible improvement.

Table 2.9: Feasibility to collect currently non-provided process metadata
Feasibility to collect currently non available process metadata

Phases of GSBPM     Feasible  Partially feasible  Not feasible  I don't know  Total
1. Specify needs           3                   2             3             2     10
2. Design                  0                   1             3             0      4
3. Build                   0                   4             1             0      5
4. Collect                 0                   1             1             0      2
5. Process                 0                   1             1             0      2
6. Analyse                 0                   1             1             2      4
7. Disseminate             0                   1             1             0      2
8. Archive                 3                   2             2             0      7
9. Evaluate                3                   4             2             1     10

Types of process metadata currently available

Apart from the extent of availability, the content of the process metadata currently collected by the NSIs within each phase of GSBPM was also investigated. NSIs were therefore asked to indicate to which of the following categories the metadata that they currently collect belong. The proposed categories of process metadata were:

1) Methodological process metadata: describe the methodological tools and standards used along a particular statistical production process.
2) Technical process metadata: describe the workflow, IT tools and staff activities at each step of the production cycle.
3) Process quality metadata: describe the quality of the statistical output and the underlying statistical production process.

The distribution of the available types of metadata is provided in Table 2.8. According to the replies, all the possible combinations of the 3 types of process metadata currently available in the ESS were mentioned. The analysis revealed that Methodological process metadata are the most common type of process metadata currently collected in the ESS.
They are the predominant type of metadata collected in most of the phases of the GSBPM. This is particularly the case for phases "2.Design", "5.Process" and "6.Analyse", where their availability is often combined with process quality metadata. This latter type of process metadata is also collected for phases "4.Collect" and "7.Disseminate" and is mainly found in phase "9.Evaluate", where 8 out of 11 NSIs mentioned it as a type of available metadata. In phases "4.Collect" and "8.Archive", the three proposed types of metadata are generally available to more or less the same extent. Finally, it should also be noted that the proposed field "other type" was often chosen by NSIs when responding to this question (See footnote of Table 2.8).

Table 2.8: Types of process metadata
Total Number of NSIs by type of process metadata for each phase of GSBPM:
(A) Methodological + Technical + Process quality
(B) Methodological + Process quality
(C) Methodological
(D) Process quality
(E) Methodological + Technical
(F) Technical + Process quality
(G) Technical
(H) Other types [5]

Phases of GSBPM    (A)  (B)  (C)  (D)  (E)  (F)  (G)  (H)  Total
1. Specify needs     1    1    9    2    0    0    0    7     20
2. Design            2    6    6    2    2    0    1    6     25
3. Build             2    3    2    0    1    0    5   10     23
4. Collect           3    4    4    3    2    0    2    8     26
5. Process           2    6    4    2    3    1    0    9     27
6. Analyse           2    6    4    2    1    0    0    7     22
7. Disseminate       2    3    4    4    0    1    1   10     25
8. Archive           1    2    2    1    0    0    5    5     16
9. Evaluate          1    2    3    5    0    0    0    0     11

[5] Other refers to very analytical descriptions of process metadata that cannot directly be classified into one or more of the main types (Methodological, Technical, Process Quality).

Improvement of documentation of statistical business processes

Another aspect of the questionnaire survey concerned the identification of the phases of the statistical business process for which better documentation is considered necessary.
Table 2.9 indicates that, among the NSIs that replied to this question (24 replies in total), more than half consider that better documentation is necessary for each phase of the statistical business process. Phase "2.Design" is the one for which the most NSIs require better documentation (83%), followed by phases "9.Evaluate" and "6.Analyse" (79% and 75% respectively).

Table 2.9: Need for better documentation
Phases of statistical business process    % of Total respondents [6]
1. Specify needs                                                  71
2. Design                                                         83
3. Build                                                          67
4. Collect                                                        63
5. Process                                                        71
6. Analyse                                                        75
7. Disseminate                                                    54
8. Archive                                                        67
9. Evaluate                                                       79

[6] In total, 24 NSIs provided information on the need to improve the documentation within each phase.

IT applications in the production, storage and dissemination of process metadata

In the survey, the use of dedicated IT applications in the production, storage and dissemination of process related metadata was investigated for phases 4 to 7 of the GSBPM. The results from the 29 replies received are available in Table 2.10. They show that dedicated IT applications are mainly used for the production and the storage of process metadata collected within phases "4.Collect" and "5.Process" (between 15 and 19 NSIs for the four cases). Phase "6.Analyse" is the phase where dedicated IT applications are least often used, for production, storage and dissemination of process related metadata alike (fewer than half of the NSIs concerned). Logically, phase "7.Disseminate" is the one where IT applications are most often dedicated to the dissemination of process related metadata (16/29). However, a similar number of NSIs (15/29) use such IT applications for the production and storage of process related metadata within this phase.
Table 2.10: The use of dedicated IT applications
Dedicated IT Application(s) for:
(A) The production of process related metadata
(B) The storage of process related metadata
(C) The dissemination of process related metadata

Phases of statistical business process    (A)  (B)  (C)
4. Collect                                 19   16    9
5. Process                                 15   17   10
6. Analyse                                  8   10    6
7. Disseminate                             15   15   16
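
The counts in Table 2.10 can be turned into shares of respondents to make the comparison between phases explicit. The sketch below is illustrative only: it assumes that every count is out of the 29 replies mentioned in the text, and all names are hypothetical.

```python
REPLIES = 29  # assumed denominator: the 29 replies mentioned in the text

# (production, storage, dissemination) counts transcribed from Table 2.10.
it_applications = {
    "4. Collect": (19, 16, 9),
    "5. Process": (15, 17, 10),
    "6. Analyse": (8, 10, 6),
    "7. Disseminate": (15, 15, 16),
}


def as_percentages(row, total=REPLIES):
    """Convert a row of counts into percentages rounded to the nearest integer."""
    return tuple(round(100 * n / total) for n in row)


percentages = {phase: as_percentages(row) for phase, row in it_applications.items()}

# Phases where fewer than half of the NSIs use a dedicated application
# for each of the three purposes.
below_half = [
    phase
    for phase, row in it_applications.items()
    if all(2 * n < REPLIES for n in row)
]

for phase, pcts in percentages.items():
    print(phase, pcts)
print("Below half for all three purposes:", below_half)
```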