2010 Population Census (SP2010) and 2013 Agriculture Census (ST2013) Modern Data Processing Automation in BPS-Statistics Indonesia: An Innovation

Author: Dudy S. Sulaiman
Affiliation: Deputy Chief Statistician for Methodology and Statistical Information System, BPS-Statistics Indonesia
Email: dudy@bps.go.id

Abstract: Indonesia is the world's largest archipelagic country. It has a population of around 250 million and 28 million agricultural households, about 400 local languages and more than 300 ethnic groups. These conditions make it very challenging for BPS to conduct censuses on time and with the desired quality. Past experience has shown BPS that this is almost impossible to achieve. To address these challenges and meet the census objectives of timeliness and high quality, BPS has undertaken several key strategic initiatives: better coordination among personnel, adoption of the right census methodology, and use of Information Technology to improve efficiency and quality. Based on various studies, BPS has concluded that a high-quality, on-time census can be produced with state-of-the-art Data and Document Capture Technology (DCT). For SP2010 and ST2013, BPS deployed such a solution from Kofax, a leading provider of Smart Process Applications with numerous enterprise-level customers worldwide. Deploying modern DCT has brought significant benefits in the time taken for each census and in the quality of data and metadata. This paper describes the innovations and shares the experience gained from implementing modern DCT in SP2010 and ST2013. It also explains how selecting the right technology affects the data produced and delivers better and faster results.

Keywords: GSBPM, BPS-BPM, Data Capture, Census, Agriculture, Population

1. Introduction

As can be seen from Figure 1, Indonesia is the largest archipelagic country in the world. It consists of more than 17,845 large and small islands spread across the equator.
Indonesia is located between roughly 6° North and 11° South latitude and between 95° and 141° East longitude. In satellite pictures it looks like a chain of pearls. The country is divided into three time zones: eastern, central and western Indonesia time. The east-west spread is comparable to the distance from New York to Los Angeles, or from Jeddah to London. Indonesia currently has a population of around 250 million and 28 million agricultural households [1], about 400 local languages and more than 300 ethnic groups. This makes Indonesia geo-politically and geo-culturally exposed from all sides.

Badan Pusat Statistik (BPS-Statistics Indonesia) is a government institution mandated to meet the need for statistics in Indonesia [2]. To accomplish this task, BPS regularly conducts surveys and censuses. Some of them are carried out in accordance with United Nations recommendations, while others fulfil the Indonesian Government's data requirements for national or regional statistics such as development, economic and inflation statistics, to name just a few.

To provide statistics, BPS conducts three censuses and about a dozen surveys. The censuses are the Population, Agriculture and Economic Censuses, each conducted every 10 years. The Population Census is held in years ending in 0; the latest was the 2010 Population Census (SP2010) [3]. The Agriculture Census is held in years ending in 3; the latest was the 2013 Agriculture Census (ST2013). The Economic Census is held in years ending in 6; BPS is currently preparing the next one, which will be run in 2016.

Surveys, on the other hand, are carried out to fill and narrow the information gaps between census periods. Surveys have far fewer respondents than censuses, and the number of respondents varies from one survey to another. In terms of process, the more respondents there are, the more effort is needed to produce statistics from them.
A census therefore requires more effort than a survey, since a survey has fewer respondents. Effort here covers all stages of the BPS Business Process Model (BPS-BPM), as illustrated in Figure 2 [4]. The model used by BPS is identical to the Generic Statistics Business Process Model (GSBPM) shown in Figure 3 [5]. The process runs from Specify Needs to Evaluate in GSBPM, or from planning to evaluation in BPS-BPM.

Within these models, the phase where the labour required differs most between a census and a survey, and which can be considered the pain point, is the collect (collection) stage. In this stage we collect data in the field and, more importantly, transfer paper-based questionnaires into electronic form. We have time constraints and a schedule to follow, and millions of documents to process. To finish the task we therefore have to mobilise tens of thousands of temporary staff. These staff must be selected carefully: we cannot employ people who are unsuitable for the job, and because there is little time for training we have to pick staff who are ready to work or need little training. Furthermore, the census schedule keeps getting tighter, since census results are usually announced in the President's speech to the House of Representatives on the occasion of Independence Day in the census year.

These conditions make it very challenging for BPS to conduct a census on time and with the desired quality. Based on previous experience, BPS understands that this is almost impossible.

Nevertheless, Information Technology (IT) is evolving. A common saying today is that “IT is an enabler for business”. We can use IT to support our statistical business process, particularly to cope with the pain point described above. Moreover, IT can ease the problem by providing solutions that reduce the effort.
In the end, it can help BPS run the census on time and with the desired quality.

This paper describes the innovations and shares the experience BPS gained from implementing modern Data Capture Technology in SP2010 and ST2013. These two censuses are key case studies for BPS and the best examples of how BPS leverages Information Technology to streamline processes, support accurate data, and report in a timely fashion. The paper also explains how selecting the right technology affects the data produced and delivers better and faster results.

2. 2010 Population Census (SP2010)

As briefly explained above, BPS, as advised by the United Nations, is obliged to conduct a population census once every 10 years. The latest was conducted in 2010. It targeted all Indonesian households, located in more than 759,000 census blocks and 80,000 villages, in 523 cities/municipalities across the 33 provinces of Indonesia. All activities involved in preparing, executing, analysing and reporting the census were targeted to be completed in 6 months [3].

SP2010 used two types of questionnaire, one capturing housing information and one capturing population information. The housing form contains 24 variables and the population form contains 43 variables. Both forms are packaged into one 8-page booklet, so one booklet can record one household and the characteristics of up to 6 inhabitants. If a household has more than 6 members, additional population pages are used [3] [6].

Given the amount of work for each household, BPS hired more than 700,000 temporary staff to obtain the information on each form from each household. These enumerators, as they were called, were each assigned to a specific area, within which they were expected to carry out face-to-face interviews, house to house and door to door.
The objective was to ensure a complete and detailed study yielding the most accurate information possible. The data obtained in each interview were entered on the forms by the enumerators, all by hand. The completed forms were then collected, grouped in batches and sent to the respective provincial Data Processing Centre (DPC) for processing. There are 33 DPCs, one in each of the 33 provinces (Figure 4) [3].

To give a sense of the magnitude of the study, there were more than 63 million households to enumerate in SP2010. At an average of 8 pages per household, this worked out to about 500 million pages of paper documents, or more than 50 billion fields, to process in 6 months. This massive amount of data on paper required enormous effort and manpower just to collate and organise.

To address these challenges and achieve the census objectives of timeliness and high quality, BPS undertook three key strategic initiatives:

1. Better coordination among personnel
2. Adoption of the right methodology for conducting the census
3. Use of Information Technology to improve the efficiency of the census and the quality of data collection

From the third initiative in particular, BPS realised that it had to leverage IT for this task. The expected IT solution had to meet specific requirements:

1. The IT solution should reduce manpower and manual effort.
2. The IT solution should automate the process.
3. The IT solution should reduce processing time.
4. The IT solution should maintain the quality of the data captured in the questionnaires.
5. The IT solution should be easy to learn and to enhance.
6. The IT solution should be reusable for other projects.

Based on those requirements, BPS carried out research and discussions to decide which kind of technology would meet them.
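To make the scale behind these requirements concrete, the short Python calculation below turns the figures quoted in this section (63 million households, 8 pages each, 50 billion fields, 33 DPCs, 6 months) into a rough required throughput. It is only a back-of-the-envelope sketch and not a BPS planning model: the 30-day month and the even split of documents across DPCs are assumptions made here for illustration, and because the 6-month target also covered preparation, analysis and reporting, the real processing window was shorter than assumed.

```python
# Rough SP2010 volume and throughput estimate based on figures quoted in the text.
HOUSEHOLDS = 63_000_000          # "more than 63 million households"
PAGES_PER_HOUSEHOLD = 8          # average booklet length
TOTAL_FIELDS = 50_000_000_000    # "more than 50 billion fields"
DPC_COUNT = 33                   # one Data Processing Centre per province
DEADLINE_MONTHS = 6
DAYS_PER_MONTH = 30              # assumption: 30-day months, processing every day

total_pages = HOUSEHOLDS * PAGES_PER_HOUSEHOLD
processing_days = DEADLINE_MONTHS * DAYS_PER_MONTH

# Average number of fields on one page, implied by the quoted totals.
fields_per_page = TOTAL_FIELDS / total_pages

# National daily rate needed to finish within the deadline.
pages_per_day = total_pages / processing_days

# Assumption: documents are spread evenly over the DPCs; in reality
# provincial volumes differ widely.
pages_per_dpc_per_day = pages_per_day / DPC_COUNT

print(f"total pages:           {total_pages:,}")
print(f"fields per page:       {fields_per_page:.0f}")
print(f"pages per day:         {pages_per_day:,.0f}")
print(f"pages per DPC per day: {pages_per_dpc_per_day:,.0f}")
```

Under these assumptions the total is 504 million pages, which means 2.8 million pages per day nationally, or roughly 85,000 pages per DPC per day. Volumes of this order are difficult to key in by hand within the deadline, which is the motivation for requirements 1 to 3.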
The recommendation that emerged was a technology that could automate and digitise the data collection process; in particular, one that could read, understand and process not just data, but handwritten data, without compromising accuracy or processing speed. That technology came in the form of an automated Document and Data Capture Solution.

At this stage BPS had identified what it needed to process hundreds of millions of census questionnaire pages: an automated Document and Data Capture Solution. The next step was an open tender to select an appropriate solution. After an exhaustive search and a thorough review and evaluation of readily available technology, BPS awarded the Data Capture Solution development project to Kofax Inc., a Document and Data Capture solution provider. Today the company is a leading provider of Smart Process Applications, with numerous enterprise-level customers and governments around the world [7]. Kofax is best known for its enterprise-level, high-volume Document and Data Capture Technology. In addition, the software is rich in features and functions that make it easy for developers to adopt and customise for local requirements.

To support the implementation of Data Capture Technology, BPS deployed the necessary hardware at the provincial branches (DPCs), as illustrated in Figure 5, and at BPS headquarters in Jakarta. This hardware, which included scanners, printers, servers and storage, was needed to capture, validate and extract the data.

Once the software and hardware were in place, the next step was to implement a Standard Operating Procedure (SOP) for data processing in the DPCs. The steps of the SOP, depicted in Figure 6, were [3]:

1. Registration and staging. When documents arrived at the DPC, located at the provincial branch, the branch recorded the identity of each document batch (registration) and queued the batches on a first-come, first-served basis to avoid missing or duplicate pages (staging).
2. Data capture process:
   a. Manual document preparation.
      i. Cleaning up the documents by removing any staples or paper clips, which would damage the scanners.
      ii. Cutting the questionnaire booklet into single pages for scanning.
   b. Scanning.
      i. This is the first and most important step, and it starts the capture process.
      ii. The image is then cleaned up and prepared for extraction. The image has to be of decent quality for data to be captured accurately. To achieve this, BPS used Kofax VRS (VirtualReScan), patented Kofax image clean-up software that improves image quality by removing “noise” from images. This software runs automatically.
   c. Recognition. The system automatically:
      i. identifies the forms,
      ii. separates the documents,
      iii. extracts the handwritten data, and
      iv. stores the extracted data in a database for further processing.
   d. Correction and completion.
      i. This step follows recognition.
      ii. In correction, the operator is shown a variable or item that could not be recognised and fixes it manually, based on the image presented by the system.
      iii. Completion works like correction, but for the whole form.
   e. Validation.
      i. The system checks the data against business rules and for accuracy.
      ii. If the system finds an error, manual correction has to be performed.
   f. Export.
      i. The system automatically exports the data and the images for further processing.
      ii. Data from all the provinces are then combined, collated and processed.

This process ran in 3 shifts, 24 hours a day, every day. That is how we were able to meet the 6-month deadline.
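The routing logic of steps c to f can be sketched in a few lines of Python. This is purely illustrative and is not the Kofax API or the rule set BPS used: the `Field` class, the confidence threshold of 0.90, and the two business rules (age between 0 and 120, sex coded 1 or 2) are all hypothetical, chosen only to show how recognised fields might be sent either to an operator or on to export.

```python
from dataclasses import dataclass

# Hypothetical threshold: fields recognised with lower confidence go to an operator.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class Field:
    name: str
    value: str
    confidence: float  # recognition confidence in [0, 1]; illustrative only


def needs_correction(field: Field) -> bool:
    """Step d: flag fields the recogniser was unsure about."""
    return field.confidence < CONFIDENCE_THRESHOLD


def validate(record: dict) -> list:
    """Step e: apply illustrative business rules and return error messages."""
    errors = []
    age = record.get("age", "")
    if not age.isdecimal() or not 0 <= int(age) <= 120:
        errors.append("age must be a whole number between 0 and 120")
    if record.get("sex") not in {"1", "2"}:
        errors.append("sex must be coded 1 or 2")
    return errors


def process_form(fields: list) -> dict:
    """Route one recognised form either to export or to manual review."""
    to_correct = [f.name for f in fields if needs_correction(f)]
    record = {f.name: f.value for f in fields}
    errors = validate(record)
    status = "export" if not to_correct and not errors else "manual_review"
    return {"record": record, "to_correct": to_correct,
            "errors": errors, "status": status}


# A cleanly recognised form and one with a doubtful, invalid age field.
clean = process_form([Field("age", "34", 0.98), Field("sex", "1", 0.97)])
doubtful = process_form([Field("age", "3A", 0.55), Field("sex", "2", 0.99)])
print(clean["status"])
print(doubtful["status"], doubtful["to_correct"], doubtful["errors"])
```

Keeping the confidence check separate from the business rules mirrors the SOP: correction deals with what the recogniser could not read, while validation deals with values that were read but are implausible.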
Essentially, what would ordinarily have taken 24 months took us only 6 months. Quite an achievement, I must say.

As stated above, the volume of documents to be processed was extremely high and was spread across 33 processing centres, so monitoring the data processing became critical. Issues needed to be highlighted and raised as early as possible. To extend the use of technology in the process, BPS used Kofax Monitor to highlight issues and flag problems [8]. Through this system, BPS was able to monitor all critical activities in real time, so issues could be rectified immediately with as little downtime as possible. The visibility that Kofax Monitor provided in one central location also made it possible to ensure that the process remained on track.

Through a combination of hard work by the BPS team, outstanding coordination and the deployment of the right technology, SP2010 was completed ahead of schedule, using only 5 of the 6 months allotted. This would never have been possible without the use of Kofax. The census became a yardstick and established a level of competence and achievement never before seen on a project of this scale and magnitude.

The use of Information Technology has helped BPS devise effective solutions, resulting in the following benefits:

1. Increased efficiency and effectiveness of statistics administration in BPS
2. Reduced human resource requirements
3. Faster and more accurate statistical results
4. Easier operational and management monitoring
5. Improved data quality
6. Improved user satisfaction with, and confidence in, BPS products and services

3. Agriculture Census 2013 (ST2013)

ST2013 refers to the 2013 Agriculture Census, which targeted 28 million agricultural households residing in every province, city and village in Indonesia [9]. This study was targeted to be completed within 3 months.
Just as in SP2010, every household in ST2013 was covered by an 8-page document, here consisting of more than 150 questions [9]. This study was more detailed than SP2010 and had more fields to populate and capture.

Based on its experience in SP2010, BPS decided to use the same data processing stages and the same set of Kofax licences deployed for SP2010 for ST2013, as well as for other smaller studies (sub-sector surveys) taking place at the same time. Having learned valuable lessons in SP2010, we were able to apply them to this study and cut a considerable amount of time from the learning cycle. For example, to be more cost effective, BPS printed the documents in black and white instead of colour, which brought considerable cost savings. BPS used the same data capture process, but with enhancements aimed at better accuracy, lower cost and a shorter turnaround time.

The end result? It was spectacular! For the first time in the history of BPS, we could announce the results of the Agriculture Census in the same year as the census itself. This was unprecedented given the sheer volume of documents, the number of data fields, the geographical expanse the census covered, and the number of people it took to execute the study. The people at BPS made it possible, the streamlined processes made it quicker, and Kofax solutions had everything to do with that outcome.

A simulation conducted after these censuses showed that, had they been processed manually, they would have taken more than two years to complete. Manual processing would have required more people, because the number of documents to be processed is so large, and this would have lengthened the time needed to finish the census.

4. Lessons Learned

To improve future census results, some lessons learned are listed below:

1. During forms design, getting the right people involved is critical. IT teams need to be involved so that the forms are designed to be data-capture friendly; this brings better accuracy and better results during the data capture process.
2. Attend regular product or technology updates to learn about the technology currently available in the market. This provides broader ideas during planning and implementation and is a critical step in building the Information Technology vision for BPS.
3. Some locations in remote areas, especially in the eastern part of Indonesia, have low volumes but limited electricity and communications infrastructure. With the right technology, what used to be a huge worry can now be resolved.
4. BPS requires tools that can produce productivity reports or visualisations in real time. With Kofax, these reports not only describe the process but can also analyse the activities and identify areas that need attention or are potential bottlenecks. Details such as error fields and misread characters can also be reported, and the reports identify the most and least productive operators. Going forward, this will become a critical tool to help streamline the processes of future studies. I see Kofax solutions as an important part of that improvement process.
5. Information Technology keeps improving, and demand for better quality in the statistical data and products produced by BPS keeps increasing. As a result, BPS has to keep improving the skills of its human resources on an ongoing basis and conduct regular research into which good and appropriate Information Technology solutions are available in the market.
6. Based on various studies, BPS has concluded that a high-quality, on-time census can be produced with the support of state-of-the-art Data and Document Capture Technology.
7. For the 2010 Population Census and the 2013 Agriculture Census, BPS deployed such a Data Capture Solution from Kofax, a US-based software company.
8. The deployment of such modern Data Capture Technology has resulted in significant benefits in the time taken for each census and in the quality of the data and metadata produced by BPS.

5. Conclusion

Technology is key. Adopt the right technology at the right time. BPS adopted Kofax, and it proved completely successful. Adopting the right Information Technology leads to efficient and effective outcomes in any census process: higher-quality data, fewer human resources, less time to completion, and greater cost savings.

In conclusion, if you are looking for proof that Information Technology makes a difference in census projects, the proof is right here. BPS is testimony that it doesn't have to take years, it doesn't have to be manual, and it doesn't have to be expensive. All it takes is better planning and better coordination, supported by the right human skills and the right technology.

6. References

[1] S. o. S. Demographic, Indonesia Population Projection 2010 - 2035, Jakarta: BPS-Statistics Indonesia, 2013.
[2] BPS, Statistik Indonesia 2013, Jakarta: BPS, 2013.
[3] BPS, Dokumentasi Komprehensif Sensus Penduduk 2010 Indonesia, Jakarta: BPS, 2013.
[4] B. PMU, BPS Analysis Document, Jakarta: BPS-Statistics Indonesia, 2010.
[5] T. Lalor, “GSBPM v5.0,” UNECE, 2013.
[6] BPS, “Sensus Penduduk 2010,” 2010. [Online]. Available: http://sp2010.bps.go.id.
[7] Kofax, “Kofax Capture Enterprise Implementation Considerations White Paper,” Kofax, 2008.
[8] R. Software, Kofax, Using Kofax Monitor Wizards Version 1.2, Reveille Software, 2011.
[9] BPS, “Sensus Pertanian 2013,” BPS, 2012. [Online]. Available: http://st2013.bps.go.id/dev/st2013/index.php/metadata/index. [Accessed 10 06 2014].

7. Figures

Figure 1. Indonesia Archipelago
Figure 2. BPS Business Process Model (Source: BPS Analysis Document, page 25)
Figure 3. Generic Statistics Business Process Model (GSBPM)
Figure 4. 33 Provincial Data Processing Centres
Figure 5. Provincial Computer Network Schema (diagram labels: VPN, Completion 1, Completion 2)
Figure 6. Workflow of the Data Processing Centre