Transforming how we produce statistics: An inside perspective

Submission for the 2014 IAOS Prize for Young Statisticians

Michelle Feyen

Statistical Analyst

Statistics New Zealand www.stats.govt.nz

Christchurch, New Zealand

Michelle.Feyen@stats.govt.nz

Abstract

Under the Micro-Economic Platform Programme, we are transforming how micro-economic surveys are processed at Statistics New Zealand. This is part of our organisation-wide business transformation programme, Statistics 2020 Te Kāpehu Whetū (Stats 2020), which aims to increase our relevance to government and public data users, and reduce the cost of producing statistical outputs. I report on insights into this programme and outline what the platform team has learned during the projects included in the Micro-Economic Platform. In particular, I report on the challenges involved in using standard methods and tools, increasing the level of automation in the system, introducing survey teams to the platform, and collaborating to achieve the migration of each survey to the platform. I also discuss what we learned from these challenges as part of achieving Statistics New Zealand’s statistical system of the future.

Introduction

The growth of information available in the world poses a challenge to national statistical offices (NSOs) to maintain their relevance and efficiency (Kent, 2011). At Statistics New Zealand we recognise this challenge and understand the need to improve our technology to remain relevant.

Core to the business of producing statistics is the way we store, manage, and process data collected from different surveys and administrative data sources. Historically, survey systems were designed, built, managed, and updated completely autonomously from each other.

This paper outlines one way Statistics NZ successfully addressed the need for change, by building the Micro-Economic Platform. This infrastructure system allows data from different micro-economic surveys and administrative data sources (referred to as collections in this paper) to be stored, processed, and analysed in a standard way, thus increasing efficiency.

We share lessons learnt as we overcame technical challenges and resistance from subject matter experts to ‘change the way they do things’.

Culture of change vital for overcoming initial barriers

We could not have achieved the transformation of how we produce statistics without the culture of change that permeates Statistics NZ. Our organisation is undergoing a business transformation programme called Statistics 2020 Te Kāpehu Whetū (Stats 2020). This culture change initiative is organisation-wide and involves a range of programmes and projects that aim to increase our relevance to government and public data users, and to reduce costs of producing statistical outputs (see figure 1).

Figure 1

Stats 2020 strategic priorities

The programme of work to build the Micro-Economic Platform and migrate the data and processes for different surveys to the new platform embraces the four strategic priorities of Stats 2020. This meant we started the work with the organisation’s financial and emotional support for trying something new.

Our programme of work for the Micro-Economic Platform

The aim of our programme was to store and process data from many (if not all) of Statistics NZ’s micro-economic surveys on the same system, as well as maintain configurations using standard statistical analysis tools.

The purpose was to increase our relevance, by improving processing for collections, offering better options for analysis of data, and mitigating risks resulting from the use of outdated legacy systems.

To achieve this aim, we built an infrastructure to support the processing of many collections, using standardised databases, software, methodology, and processing tools.

Individual migration project

Each collection had an individual ‘migration project’ to ensure the smooth transition of data and associated processes to the new platform. Each project involved analysing the entire business process of the collection, and replicating the design on the new platform, using standard tools and processes. To date, we have migrated 11 collections into the Micro-Economic Platform.

Insider’s perspective

As a member of the technical team that helped subject matter experts migrate their collections from old systems to the new platform, I was in a unique position to capture the insights and lessons learnt reported in this paper.

Benefits of Micro-Economic Platform

There are many benefits to storing and processing data, and maintaining configurations, from different data sources (‘collections’) on the same system. As well as financial benefits, analysts spend less time on processing and on learning new systems, giving them more time for the core work of analysing the data and telling the stories behind the statistics.

Automation, so analysts don’t need to learn new processes

Any analyst trained to work with the platform will be able to run almost any survey, as much of the survey processing is done automatically. This is a shift in approach from the current ‘legacy’ systems, which require a lot of manual intervention.

Similar processes and shared functionality, so more time for analysis

All surveys will be processed in a similar manner, with shared functionality and internationally recognised, up-to-date statistical analysis tools. This means analysts can focus more on learning about the aims of the collection to further their analytical capability, instead of having to learn new processes.

Standard set of tools

The platform uses a standard set of tools, evaluated and tested by methodologists to ensure best practice methods. The sample or collection design for each dataset is unchanged, but the integrity of the processing system used to produce data ready for estimates is improved.

We use tools from external agencies that have already been built and are able to serve our needs. For example, the main tool for editing and imputation is the Banff suite of procedures (Statistics Canada, 2013). We also use international standards such as the Generic Statistical Information Model (UNECE, 2013) and the Generic Statistical Business Process Model (UNECE, 2013). All of the statistical processing is done using SAS.

Standard reports can be adapted for specific collections

The platform’s suite of reports can be adapted slightly during the build phase for each survey.

The reports use standard tools, so they are easy to learn.

Parameters are configurable, so less reliance on IT staff

Anyone using the platform can see what parameters have been programmed into the configurations, where previously this information was only available to IT staff who managed the individual legacy systems. Analysts will increasingly analyse data and turn it into relevant information, instead of carrying out a more basic data processing role.

Standardisation means less expense and risk

The standard platform replaces a situation where every survey collection had an individually created system, built and maintained separately. In some cases bespoke tools and methods were developed to deal with unique situations that arose, making them non-standard. This required a large amount of IT effort to solve problems (some of which were never solved).

This means the old legacy systems were both expensive and risky to maintain.

Survey teams able to continuously improve

The real power of the platform is the huge amount of knowledge that has been built into it, and the transparency of this information. Analysts are able to view the configurations and parameters, and even access the SAS code of the procedures used, should they wish.

This means that even after the migrated survey has been handed back to the subject matter experts, analysts are able to improve how their process runs. They are also able to build their own reports using the cubes, and store these so that their whole team, or an even wider network, is able to access and use them.

Benefits and challenges of standardising old systems and processes

Despite the many benefits of standardised systems and processes, we had to overcome several challenges to successfully migrate different collections to the new platform. The first challenge was to standardise very different collections.

New structure for some collections

On the platform, every survey, register, or dataset needs a similar structure. They are set up as collections, with period dates based on the frequency of the data collected. Each collection must use an instrument, which stores the variables used as part of that collection, and tells the data how it needs to be structured. The variables are stored in the variable library, in a standard format. This means the system ‘knows’ on which data in that collection to run the configuration (sets of processing steps).

The advantage of this is that standard practice and methods are supported and encouraged.
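The structure described above can be pictured with a minimal Python sketch. All class names, field names, and values here (Variable, Instrument, Collection, the step names) are hypothetical illustrations chosen for this example, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class Variable:
    """An entry in the shared variable library, held in one standard format."""
    name: str
    description: str


@dataclass
class Instrument:
    """Lists the variables a collection uses, which fixes the shape of its data."""
    name: str
    variables: List[Variable]

    def conforms(self, record: dict) -> bool:
        # A record fits the instrument when it carries exactly the listed variables.
        return set(record) == {v.name for v in self.variables}


@dataclass
class Collection:
    """A survey, register, or dataset set up in the common structure."""
    name: str
    frequency: str  # e.g. "annual" or "quarterly"
    instrument: Instrument
    periods: List[date] = field(default_factory=list)
    configuration: List[str] = field(default_factory=list)  # ordered processing steps


income = Variable("total_income", "Total income for the period")
expenses = Variable("total_expenses", "Total expenses for the period")
form = Instrument("hypothetical_form", [income, expenses])
survey = Collection(
    name="hypothetical_annual_survey",
    frequency="annual",
    instrument=form,
    periods=[date(2013, 3, 31)],
    configuration=["outlier_detection", "error_localisation", "imputation"],
)

print(form.conforms({"total_income": 120, "total_expenses": 60}))  # True
print(form.conforms({"total_income": 120}))                        # False
```

Because every collection points to an instrument, a configuration can check a record against the instrument before any processing step runs on it.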

Teamwork helps successful configuration

During each migration project, configuration experts, methodologists, and subject matter experts were involved with setting up the configurations. This collaboration meant everyone understood the entire business process for the collection. As a result, there were more innovative solutions in the process of moving the data and processes into the new platform.

Innovation required to maintain data integrity

Subject matter experts and our external users expect that the same survey will produce similar results to what were produced previously, regardless of the technology and processes used to reach the end point. Some of the methods previously used in surveys were not available on the platform, so we had to ensure new methods were able to produce results very similar to those produced on the legacy system. Although methodologically sound, the legacy methods were often particular to a survey, so were unsuitable for use on a standardised platform. Modifying these methods to suit the platform involved finding innovative ways of dealing with the data.

Matching old system with new

More recently, we had time to investigate improvements before migrating collections, and could therefore introduce changes. With these improvements came new challenges, for instance, how to match the old system to the new system when the processes have changed. The solution was to undertake thorough analysis to ensure the appropriate outcome was achieved.

Adapting standard reports for specific collections

Not every collection requires the same type of report to be run. The Micro-Economic Platform has a suite of reports that can be adapted during the build phase for each survey. The result is that each survey team can use their own reports to see the data, including impact analysis and commentary that survey analysts can store at almost any level of the data.

Analysts are also able to develop their own reports once the migration project is complete.

These reports can be produced using the cube (a specially formatted database that allows quick access to large volumes of data via user configurable reporting, using measures and dimensions of the data). The reports can be used for analysis during the production process, which enables further development and investigation by the subject matter analysts.
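As a rough illustration of measures and dimensions, the sketch below sums one measure over chosen dimensions for a handful of made-up unit records. The records, field names, and the `roll_up` helper are hypothetical; a real cube pre-computes and indexes such aggregates rather than recalculating them on request.

```python
from collections import defaultdict

# Made-up unit records: two dimensions (industry, region) and one measure (sales).
RECORDS = [
    {"industry": "retail", "region": "north", "sales": 100},
    {"industry": "retail", "region": "south", "sales": 150},
    {"industry": "construction", "region": "north", "sales": 200},
    {"industry": "construction", "region": "north", "sales": 50},
]


def roll_up(records, dimensions, measure):
    """Sum one measure over every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        totals[key] += rec[measure]
    return dict(totals)


by_industry = roll_up(RECORDS, ["industry"], "sales")
by_both = roll_up(RECORDS, ["industry", "region"], "sales")
print(by_industry)
print(by_both)
```

Changing the list of dimensions changes the level of detail of the report without touching the underlying records, which is the property that makes user-configurable reporting possible.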

Benefits and challenges of automated processes

A major benefit of migrating a collection to the platform is the increased automation. This is a key part of increasing the efficiency in processing.

A large proportion of the automation is in the editing and imputation phase of the process, which analysts can then review.

Challenge of fully understanding process and translating it successfully

In order to automate the editing and imputation process, analysts working on the solution first have to fully understand the current business process from start to finish. This involves confirming that the existing methods are required, then trialling standard methods and tools to replace them.

Sometimes the existing methods are exclusive to the survey and were created to enable it to run on the legacy system. Translating these methods has to be done with the utmost care, to ensure customer confidence in the methods and continued time series, and also the quality of Statistics NZ’s outputs.

Change of perspective

Subject matter experts have to change their perspective on how they view data. The legacy systems encouraged viewing data on a unit-by-unit basis with small numbers of variables compared at a time. Analysts working on the survey then gained in-depth knowledge of many units individually. The platform approach is to look at the data from a higher level, only examining any large impacts to the entire dataset in detail, thus the change of perspective.

Need to fully understand in order to successfully build

With the ability to automate (eg using Banff procedures such as Proc Outlier, Proc Estimator, Error Locate (Proc Errorloc), and Deterministic Edit (Proc Deterministic); Statistics Canada, 2013), units are frequently compared with one another to find errors; a larger number of variables and the relationships between them can be compared at once. This means we have to fully understand the dataset’s structure and the relationships between variables to build the automated process.
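To illustrate what comparing units with one another can look like, here is a minimal sketch of a quartile-based outlier check. It is a generic textbook rule, not the algorithm that Banff's outlier procedure implements, and the multiplier, function name, and data are assumptions chosen for the example.

```python
from statistics import quantiles


def flag_outliers(values, multiplier=3.0):
    """Flag values lying far outside the interquartile range of their peers."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - multiplier * spread, q3 + multiplier * spread
    return [v < lower or v > upper for v in values]


# Made-up turnover values for units in one industry; the last looks suspect.
turnover = [98, 102, 105, 110, 95, 101, 99, 1040]
flags = flag_outliers(turnover)
print([v for v, is_outlier in zip(turnover, flags) if is_outlier])
```

The fences depend on the other units in the group, so a sensible grouping (for example by industry and size) has to be understood before a rule like this can be configured.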

Complexity requires high level of statistical judgement

In the legacy systems for each collection, the setup enabled fast views of data but limited processing. Simple errors were found by the system for a statistical unit, which was then manually ‘cleaned’ by one of the subject matter expert analysts. In contrast, the platform can process the vast majority of these fixes automatically, with the users focusing on complex data issues. This shift towards more complex issues demands a higher level of statistical judgement from collection analysts on a day-to-day basis.

A key example is how this was applied to the Annual Enterprise Survey collection. The survey has over 100 variables that add to totals or that match across the form. The legacy survey system would consider only one set of variables at a time – for example, triggering a warning to display when the income variables didn’t add to total income. With Error Locate, the platform has the computational power to consider multiple groups of variables at once – for example, edit rules that check the income variables add to total income, that expenses variables add to total expenses, and that these totals match to the values used in the calculation of profit, all at the same time. The procedure then works out the minimum number of variables that should be changed in order to make the unit pass the edit rules and assigns a failed status to the variables that need to be changed. Deterministic Edit then follows up these failures and, where possible, calculates the exact value needed to pass the edit rules. This is an increase in both the level of specificity and the level of automation.
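The two steps described above can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not Banff's implementation: the variable names and edit rules are made up, every variable carries equal weight, ties go to the first candidate found, and constraints such as non-negativity are ignored. The `solve` helper uses exact Gauss-Jordan elimination to test whether a chosen set of variables can be changed so that all edits pass, and returns the values that are uniquely determined.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical balance edits, each a set of coefficients that must sum to zero.
EDITS = [
    {"sales": 1, "other_income": 1, "total_income": -1},
    {"wages": 1, "other_expenses": 1, "total_expenses": -1},
    {"total_income": 1, "total_expenses": -1, "profit": -1},
]
VARIABLES = list(dict.fromkeys(name for edit in EDITS for name in edit))


def solve(record, free):
    """Treat `free` variables as unknowns and all others as fixed.

    Returns None when no values for the unknowns satisfy every edit,
    otherwise a dict of the unknowns whose values are uniquely determined.
    """
    free = list(free)
    rows = []
    for edit in EDITS:
        coeffs = [Fraction(edit.get(name, 0)) for name in free]
        fixed = sum(c * record[name] for name, c in edit.items() if name not in free)
        rows.append(coeffs + [Fraction(-fixed)])
    pivots, r = [], 0
    for col in range(len(free)):  # Gauss-Jordan elimination with exact fractions
        hit = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if hit is None:
            continue
        rows[r], rows[hit] = rows[hit], rows[r]
        rows[r] = [x / rows[r][col] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    if any(row[-1] != 0 and not any(row[:-1]) for row in rows):
        return None  # some edit reduces to 0 = non-zero, so it cannot pass
    return {
        free[col]: rows[i][-1]
        for i, col in enumerate(pivots)
        if all(rows[i][j] == 0 for j in range(len(free)) if j != col)
    }


def locate_errors(record):
    """Smallest set of variables whose change lets the record pass all edits."""
    for size in range(len(VARIABLES) + 1):
        for candidate in combinations(VARIABLES, size):
            if solve(record, candidate) is not None:
                return candidate
    return None  # not reached: freeing every variable always has a solution


record = {"sales": 100, "other_income": 20, "total_income": 120,
          "wages": 50, "other_expenses": 10, "total_expenses": 70, "profit": 60}
to_change = locate_errors(record)   # error localisation step
fixes = solve(record, to_change)    # deterministic step
print(to_change, fixes)
```

In this made-up record both the expenses edit and the profit edit fail, yet changing the single variable `total_expenses` to 60 satisfies all three edits at once, which a check that looks at one group of variables at a time could not establish.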

Benefit of automation

The impact of increasing the level of automation with these Banff procedures is clear – the Annual Enterprise Survey legacy processing system required approximately 40% of units to have manual changes made to them. On the platform we expect approximately 15% of units to require review (10% from errors unable to be fixed automatically or review of large automated changes, 5% from large or complicated businesses in the collection (key firms)).
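As a quick check on these figures, the sketch below works through the arithmetic. It assumes the legacy rate (units changed manually) and the expected platform rate (units reviewed) can be compared directly, which is a simplification; the variable names are ours.

```python
legacy_rate = 40       # % of units needing manual changes on the legacy system
review_errors = 10     # % expected to need review for unresolved errors or large changes
review_key_firms = 5   # % expected to need review as key firms

platform_rate = review_errors + review_key_firms
point_drop = legacy_rate - platform_rate          # percentage points
relative_drop = 100 * point_drop / legacy_rate    # % of the legacy workload
print(platform_rate, point_drop, relative_drop)
```

On these assumptions the share of units needing analyst attention falls by 25 percentage points, or a little under two-thirds of the legacy level.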

This represents a significant reduction in the amount of time analysts spend making minor changes to data, as these small or simple changes are all done by an automated process. It also means that analysts’ time is spent working on significant problems or reviews, making their jobs more analytical in focus. It also enables more time for innovation and high-value analysis – i.e. the statistical transformation envisaged by the Statistics NZ strategic priorities (Seyb, McKenzie, & Skerret, 2013).

Overcoming inconsistency

Despite the best intentions, analysts are not always completely consistent with one another when it comes to manual editing, nor are they always truly consistent over time. Automation provides reliability to what used to be random. The new, more cohesive view of units, variables, and their relationships, provides more consistency in processing than previously possible. Because automation involves implementing a consistent solution for a known problem, it has the potential to remove human error from statistical processing. With the consistency of automatic editing, we can be more confident in the quality of the data and associated outputs that we produce. This repeatable process will significantly reduce the risk of errors being published.

Challenge of educating subject matter experts new to automation

Subject matter experts new to the idea of automation may think the knowledge built in to the system could risk including bias or over-editing. However, the information available as a result of the platform’s focus on automation means that managers and analysts are increasingly able to understand and quantify the quality of the data, and the quality of the processing, over time.

It helped to be able to show reluctant subject matter experts how to configure the platform (rather than having to rely on IT technical experts). They saw how easy it was to build or alter configurations and thoroughly test them before following change-control processes to move the solution into production.

Benefits of automation

Automation is not just about making the survey processing time faster. It is a tool that builds intellectual property into a system to enable analysts to continue to stretch and build their knowledge, to carry out more analysis, and to further innovate. This means that automation is essential to Statistics NZ becoming and remaining a relevant producer of data and information.

Benefit to survey teams able to continuously improve

The real power of the platform is the huge amount of knowledge that has been built into it, and the transparency of this information. Analysts are able to view the configurations and parameters, and even access the SAS code of the procedures used, should they wish.

This means that even after the new survey processing system has been handed back to the subject matter experts, analysts are able to improve how their process runs. They are also able to build their own reports using the cubes, and store these so that their whole team, or an even wider network, is able to access and use them.

Challenge of introducing subject matter experts to new platform

One of the greatest challenges for those of us responsible for the technical aspects of the migrations was to learn how to engage constructively with the subject matter experts whose surveys were migrated. It was critical they were well informed and fully involved with the migration process. They had to be prepared to work with the new, more automated system, in order to achieve the expected improved efficiency.

Technical team provide training

In most cases the subject matter experts had not been exposed to the platform or the user interface before the migration of their survey. This meant they faced a steep learning curve.

This was alleviated by working with configuration experts and methodologists who work almost exclusively on the platform.

We also provided training for staff new to the platform, which is being expanded as more surveys move on to the platform. There is a formal support system in place, where users can report issues or defects, and have these managed and fixed by the platform team in an organised and timely manner.

Working together in non-confrontational way

During the information gathering process about the old systems, we were careful to avoid any confrontation with the subject matter experts, as they came to terms with the potential of the new processing procedures.

For some subject matter experts, the new platform offered the opportunity to introduce innovations they had been waiting years to implement. Others found it a struggle to approach the new system and the new ways of processing. We made sure the subject matter experts were fully involved in the migration of their collections. This meant we captured all nuances of the data effectively, and created a set of reports that allowed them to see the data in a range of ways.

The configuration experts involved in migrating the collections to the new platform valued having the subject matter experts’ practical knowledge of the survey available as they worked together to ensure a particular approach would work.

Challenge of viewing data differently

Subject matter experts had to adjust to a new way of seeing their data on the new platform.

Instead of the traditional bottom–up approach of checking individual units, they had to adjust to a top–down approach, where only units that have a significant impact on the values to be published are examined. This means analysts spend much less time examining small changes in the data, and more time focusing on a smaller number of units whose value could affect the top-level estimate. This change can be a challenge for subject matter experts.
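One way to picture this top-down prioritisation is a simple impact score. The sketch below keeps only the units whose proposed change would move a weighted total by more than a threshold, and lists them from largest impact to smallest. The scoring rule, the 1% threshold, and the data are assumptions for illustration, not the platform's actual review criteria, and the weighted total is assumed to be positive.

```python
def units_to_review(units, threshold=0.01):
    """Return ids of units whose change moves the weighted total by more than `threshold`."""
    total = sum(u["weight"] * u["edited"] for u in units)
    selected = []
    for u in units:
        impact = abs(u["weight"] * (u["edited"] - u["raw"])) / total
        if impact > threshold:
            selected.append((impact, u["id"]))
    return [unit_id for _, unit_id in sorted(selected, reverse=True)]


# Made-up units: raw reported value, value after automated editing, survey weight.
units = [
    {"id": "A", "weight": 1, "raw": 5000, "edited": 500},
    {"id": "B", "weight": 20, "raw": 101, "edited": 100},
    {"id": "C", "weight": 20, "raw": 90, "edited": 90},
    {"id": "D", "weight": 10, "raw": 150, "edited": 120},
]
print(units_to_review(units))
```

Unit B is changed by the automated edit but never reaches an analyst, because its effect on the total is too small to matter for the published value.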

Commissioning a hand-over phase for ongoing support and documentation

A key challenge for the platform team was to ensure the subject matter experts had adequate support and documentation, once their collection was migrated to the new platform.

A large amount of knowledge is embedded in the system, which the subject matter experts need to learn. We solved this by introducing a ‘commissioning’ phase at the end of each migration project, where configurations and reports have been built and tested, but not run in production by the subject matter area. During this phase the platform team works closely with the analysts to ensure processes run smoothly, reports are built to requirements, and the analysts understand how to run the system and report issues that may arise.

More time for analysis

As well as reducing labour costs, the overall benefit of the new platform is that subject matter experts can spend more time analysing their data, and less time processing it. Freeing the teams from an IT-constrained system means they can produce data and information for the New Zealand public more quickly than before, and with a much greater level of confidence and detail.

Benefits of a collaborative platform team

The platform team for each migration project was made up of a number of specialised roles, including methodologists, subject matter experts, configuration experts, and business analysts. The challenge of disparate members was overcome by encouraging collaboration, starting with co-location.

Relocating individuals helps collaboration

We relocated the members of the platform team to one place. This was a significant factor in the success of their collaboration, which drew on the strengths of the different expertise areas.

We were able to meet all the requirements and maximise the ability of the platform to automate, standardise, and innovate, and therefore increase the statistical transformation potential of the platform.

Benefits of configuration experts working closely with subject matter experts

The platform team developed the processing steps the subject matter analysts would eventually run. The configuration experts on the team had to understand the legacy system and its processes, in order to translate these into the standard processes and tools on the platform. Since they also had to explain the process the data goes through in a way that the subject matter analyst could understand and pass on to other members of their team, it was an advantage that the members of the team worked together. Some of the newly implemented methodology is quite different from the legacy systems’ methodology, so it was important the survey analysts understood the new implementation.

Value of the methodologists

Another key element of the migration process has been the input and review role of methodologists. Some have been involved with the programme from its initial phases, while others are asked to investigate potential changes to each collection before they are implemented. This ensures the tools we use are standardised and available in our statistical toolbox and enables us to be confident that our methods and execution of processes are sound and reliable.

Value of a project team

The platform team included members from different disciplines, all focused on the same goal of producing a standardised and automated process for a survey. Having these people together on a project team enhanced their ability to deliver an efficient, innovative system.

Benefit of continuing developments

The platform is constantly being developed, with new IT functionality introduced for new migrations and improvements made to the current systems.

For example, as we migrate the building consents data, we are working to introduce automatic coding of building types. In the legacy system, analysts had to manually code every entry. The building consents collection will gain significant efficiencies, as only the units that need a manual check will receive one.
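The idea of coding automatically and routing only the doubtful cases to analysts can be sketched as follows. The keyword lists, building types, and function names are invented for this example; the coding method actually being developed is not described in this paper.

```python
# Invented keyword lists; a real classification would be far richer.
KEYWORDS = {
    "residential": ["dwelling", "house", "apartment"],
    "commercial": ["shop", "office", "warehouse"],
}


def code_description(text):
    """Return a building type when exactly one type matches, else None."""
    words = text.lower().split()
    matches = [code for code, terms in KEYWORDS.items()
               if any(term in words for term in terms)]
    return matches[0] if len(matches) == 1 else None


def code_consents(descriptions):
    """Split descriptions into automatically coded ones and a manual review queue."""
    coded, manual = {}, []
    for text in descriptions:
        code = code_description(text)
        if code is None:
            manual.append(text)
        else:
            coded[text] = code
    return coded, manual


coded, manual = code_consents([
    "New dwelling with garage",
    "Office fit-out",
    "Shop with apartment above",
    "Retaining wall",
])
print(coded)
print(manual)
```

Descriptions that match no type, or more than one, fall through to the manual queue, so analysts see only the entries that genuinely need judgement.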

Another aspect included in Statistics NZ’s strategic priorities is a push to reduce respondent burden. For the building activity collection, we are working to reduce the number of survey forms that we send out. Some data will instead be modelled using methodology that has been thoroughly investigated, and will be introduced to the platform as part of this collection’s migration project.

What we learned

Future migrations and transformation within Statistics NZ will benefit from what we have learned from the migration journeys.

Manage expectations

It is essential to manage the subject matter experts’ expectations. If they don’t realise they need to transform their own processes, they will be dissatisfied when the product (their migrated collection) is delivered to them on a new platform. They will try to carry out non-standard and manual processes in a system built for automation, and therefore miss the opportunity to focus on more analysis and less processing.

Clear product description

The configuration experts work hard to ensure the subject matter experts in the migration teams understand what will be delivered. This includes a product description that must be agreed upon, and a push from platform team members for subject matter experts to understand that a different approach is required.

Collaborative team

Subject matter experts, methodologists, and configuration experts need to work together, and engage with IT on system requirements, to build a suitable solution for each collection on the platform.

Innovation

The migration process taught us standardisation cannot be carried out without a large amount of innovation. Because the standard platform tools cannot do a version of every bespoke method, we had to be creative to make the system work to our needs. We looked at each process conceptually – the answer to “Why are we doing this?” helped us find innovative solutions.

People, people, people

The key to the successful migration of collections on to the new platform is the people involved. We learnt the significance of support and assurance – both of the people and of the platform. The subject matter experts in the process need to be engaged through the migration process and be adequately supported during the commissioning phase, and after handover.

Dual run provides confidence and trust

Carrying out a dual run is a must-have requirement. A dual run is where the raw data for a collection is run through the legacy system and the platform processes simultaneously, in order to confirm the results are as expected. This demonstrates that the output of the processing is the same, and shows the gains and efficiencies made by the standard automated system. The dual run also means the production teams who are migrating their surveys to the platform will trust the changes made and the standard processes used.
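The comparison at the heart of a dual run can be sketched as below. The 0.5% tolerance, the output names, and the figures are hypothetical; in practice the acceptable difference would be agreed with methodologists and subject matter experts for each output.

```python
def compare_runs(legacy, platform, tolerance=0.005):
    """List outputs whose relative difference between the two runs exceeds `tolerance`."""
    differences = {}
    for name in sorted(set(legacy) | set(platform)):
        old, new = legacy.get(name), platform.get(name)
        if old is None or new is None:
            differences[name] = "missing from one run"
        elif abs(new - old) > tolerance * abs(old):
            differences[name] = f"{old} -> {new}"
    return differences


# Made-up estimates from running the same raw data through both systems.
legacy_out = {"total_income": 1000.0, "total_expenses": 800.0, "profit": 200.0}
platform_out = {"total_income": 1001.0, "total_expenses": 800.0, "profit": 215.0}
print(compare_runs(legacy_out, platform_out))
```

An empty result supports signing off the migration, while each listed difference gives the team a specific output to investigate before the legacy system is retired.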

Documentation

We learned the importance of documentation in supporting subject matter experts during the transition period. It helped them understand the platform’s processes and how they affected their collection’s data. This assurance has been essential, as we continue to build and expand the platform while balancing platform support requirements for collections already using it to produce their outputs.

Culture change, from top down

None of the transformation we achieved would have been possible without the backing of Statistics NZ’s culture-change initiative. The strategic priorities clearly set out that the way we do our jobs needs to change, and the culture-change directive sets clear expectations for everyone in the organisation, top to bottom. With this strong backing, the Micro-Economic Platform developments have the potential to achieve a total transformation of how production analysts do their jobs.

Conclusion

The programme of work to migrate collections to the Micro-Economic Platform provided many challenges and chances for innovation for the team members working on each of the migration projects. It also provided an opportunity for subject matter experts to improve their processes, and increase their analytical capability. For the programme to succeed, collaboration and innovation are essential, especially in order to achieve standardisation and efficiency goals set out in the Stats 2020 strategic priorities. This new approach to analysis and data will mean that Statistics NZ will maintain relevance into the future. This relevance is further supported by the platform’s standardisation and automation advances, which will allow Statistics NZ to continue to produce relevant information for New Zealand to grow and prosper.

Acknowledgements

Thanks to Catherine Cumpstone for her contributions to this paper; and Allyson Seyb, Richard Penny, Gary Dunnet, Penny Barber, Darren Allen, Sue Chapman, and Chris Toohey for their support and advice.

References

Banff Support Team, 2013, Functional description of the Banff System for edit and imputation, Version 2.05. Statistics Canada, Canada.

Kent, J., 2011, Availability of data: From scarcity to profusion. Invited paper, United Nations Economic Commission for Europe Conference of European Statisticians (Ljubljana, Slovenia, 9–11 May 2011). Work Session on Statistical Data Editing, Topic (v): Changing organizational cultures, WP.33.

Seyb, A., McKenzie, R., & Skerret, A., 2013, Innovative production systems at Statistics New Zealand: Overcoming the design and build bottleneck. Journal of Official Statistics, Vol. 29(1), p 73–97.

United Nations Economic Commission for Europe (UNECE), 2013, Generic statistical business process model (GSBPM): Specification, Version 5.0. http://www1.unece.org/stat/platform/display/GSBPM/GSBPM+v5.0

United Nations Economic Commission for Europe (UNECE), 2013, Generic statistical information model (GSIM): Specification, Version 1.1. http://www1.unece.org/stat/platform/display/gsim/GSIM+Specification
