Use of Scanning Technology for Data Capture: ICR System (Intelligent Character Recognition)
Information and Communication Technology Center, National Statistical Office (NSO), Thailand

ICR System
The ICR system automates data entry by scanning questionnaires. It recognizes handwritten numeric and alphanumeric characters (ICR) and handwritten optical marks (OMR) and converts them to digital data before processing on other computer systems.

ICR System in NSO (Thailand)
The ICR system at NSO consists of two parts:
• TELEform software system
• ABBYY software system

ICR for the Population Census 2000
NSO first used the ICR system to process the Population Census 2000. The forms for 16 million households (16 million forms) were scanned, and the raw data took only 8 months to process, compared with more than 1 year using key-in data entry.

ICR for the Agricultural Census 2003
NSO used ABBYY software to process about 25% of the Agricultural Census 2003, which covered 5.7 million households in total (24 million forms).

Other Survey Projects
Since the Population Census 2000, several survey projects have used the ICR system, such as the Labour Force Survey (monthly report) and the Household Manufacturing Survey.

TELEform Hardware System
• NetServer for TELEform server: 1 unit
• NetServer for database server: 1 unit
• Reader module workstations: 21 units
• Verifier module workstations: 30 units
• Scanner control workstations: 6 units
• Fujitsu M4099D scanners: 6 units

TELEform Software System
TELEform software was selected for the ICR system and is used for forms and document processing. It is a product of Cardiff Software, Inc., USA.
TELEform 6.2 Elite Enterprise Edition components:
• TELEform Designer
• TELEform Reader
• TELEform Verifier
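The Population Census 2000 figures quoted above (16 million forms processed in 8 months, against more than 1 year for key-in entry) imply a rough throughput. The sketch below works through that arithmetic. It assumes that all six scanners in the TELEform hardware list were used for the census, which the slides do not state, and `WORKING_DAYS_PER_MONTH` is a hypothetical value chosen only for illustration.

```python
# Figures taken from the slides.
FORMS_SCANNED = 16_000_000   # Population Census 2000 forms
ICR_MONTHS = 8               # processing time with the ICR system
KEY_IN_MONTHS_MIN = 12       # "more than 1 year" with key-in entry (lower bound)
SCANNERS = 6                 # Fujitsu M4099D units in the TELEform hardware list

# Hypothetical assumption, not stated in the slides.
WORKING_DAYS_PER_MONTH = 22


def forms_per_month(forms: int, months: int) -> float:
    """Average number of forms processed per month."""
    return forms / months


def forms_per_scanner_per_day(forms: int, months: int,
                              scanners: int, days_per_month: int) -> float:
    """Average daily load on one scanner if the work is spread evenly."""
    return forms / (months * scanners * days_per_month)


monthly = forms_per_month(FORMS_SCANNED, ICR_MONTHS)
per_scanner_per_day = forms_per_scanner_per_day(
    FORMS_SCANNED, ICR_MONTHS, SCANNERS, WORKING_DAYS_PER_MONTH
)
# Key-in took more than 12 months, so this fraction is a lower bound.
min_time_saved = 1 - ICR_MONTHS / KEY_IN_MONTHS_MIN

print(f"Forms per month: {monthly:,.0f}")
print(f"Forms per scanner per working day: {per_scanner_per_day:,.0f}")
print(f"Time saved versus key-in entry: at least {min_time_saved:.0%}")
```

Under these assumptions the system handled about 2 million forms per month, and the time saved relative to key-in entry was at least one third.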

ABBYY Hardware System
• IBM Server X Series 225: 1 unit
• Correction station: 1 unit
• Verifier module workstations: 25 units
• Scanner control workstations: 4 units
• Fujitsu M4099D scanners: 4 units
• Storageflex LT707: 1 unit

ABBYY Software System
NSO also uses ABBYY software, a product from Russia.
ABBYY FormReader 6.0 Enterprise Edition components:
❑ Form Design
❑ Administration Station
❑ Recognition Station
❑ Correction Station

TELEform / ABBYY Designer Function
• Create the template form.
• Set the data type properties of each field of the questionnaire template.

Questionnaire Template (figure)

TELEform Reader / ABBYY Administration Function
• Evaluate the questionnaires.
• Export the correctly recognized questionnaires to a data file.
• Send unclear questionnaires to the TELEform / ABBYY Verifier function for correction, after which the corrected questionnaires are transferred to a data file.
• Store the scanned images.

TELEform / ABBYY Verifier Function
• Correct questionnaires that contain mismarked or illegible fields.
• Corrected questionnaires are automatically exported to a data file.

BasicScript Programming Language
A scripting language, similar to Visual Basic, used to add extended features.
• Develop form scripts that run during evaluation and correction.
• Develop export scripts that run during data export.

Steps of the ICR System in NSO
• Scan and Forms Distribution: The questionnaires are scanned by block / village, and multi-page image files are created on the local hard disk.
• Forms Evaluation: The questionnaire images are evaluated. Questionnaires that are recognized correctly skip the verifier workstations and are exported directly to the database server.

Figure: The questionnaires are scanned.

Scan and Forms Distribution (diagram): Questionnaires → Scanned → Image files

Steps of the ICR System in NSO (cont.)
• Forms Verification: Unclear questionnaires are reviewed and corrected at the verifier workstations before being transferred to the database server.
• Data Export:
  - Link the data file from the database server to the IBM mainframe system.
  - Store the scanned image files on CD.

Figure: Unrecognized fields must be corrected at the verifier workstations.

Forms Verification and Data Export (diagram): Verify; Database Server; Export data for processing; Storage (image files); CD

ICR Link System (diagram)
• TELEform software: Questionnaires; Scanners (6 units); Controller PCs (6 units); Readers (21 units); Verifications (30 units); HP server
• ABBYY software: Questionnaires; Scanners (4 units); Controller PCs (4 units); Verifications (25 units); Correction station (1 unit); IBM server; Administration, Export, Recognition
• Other components: Storage (HD 880 GB); CD; COMPAQ server (Software, Database, Backup Data); Transfers to the mainframe; Processing (Editing & Reporting)

Specific Questionnaire for ICR System
The questionnaires must be professionally printed with field boxes in a specific colour (blue, green, or red).
The scanners recognize the characters and numerals on the questionnaires and drop out the colour of the field boxes.
Test the questionnaires to make sure they can be processed with speed and accuracy.

ICR Benefits
• Reduced cost
• Reduced time
• Efficient data capture
• Increased data accuracy
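
The evaluation, verification, and export steps described in the slides can be sketched as a simple routing rule: forms whose fields are all recognized confidently are exported directly, and the rest go to a verifier for correction first. This is a minimal illustration in Python, not TELEform or ABBYY code. The class names, the 0.0 to 1.0 confidence scale, and `CONFIDENCE_THRESHOLD` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.90  # hypothetical cut-off, not from the slides


@dataclass
class RecognizedField:
    name: str
    value: str
    confidence: float  # assumed scale: 0.0 (unreadable) to 1.0 (certain)


@dataclass
class Form:
    form_id: str
    fields: List[RecognizedField]


def needs_verification(form: Form,
                       threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """A form is 'unclear' if any field falls below the threshold."""
    return any(f.confidence < threshold for f in form.fields)


def route(forms: List[Form]) -> Tuple[List[Form], List[Form]]:
    """Split forms into (export directly, send to verifier)."""
    direct, to_verify = [], []
    for form in forms:
        (to_verify if needs_verification(form) else direct).append(form)
    return direct, to_verify


def verify(form: Form, corrections: Dict[str, str]) -> Form:
    """Apply an operator's corrections and treat every field as confirmed."""
    fixed = [
        RecognizedField(f.name, corrections.get(f.name, f.value), 1.0)
        for f in form.fields
    ]
    return Form(form.form_id, fixed)


def export_record(form: Form) -> Dict[str, str]:
    """Flatten a form into one record for the data file."""
    record = {"form_id": form.form_id}
    record.update({f.name: f.value for f in form.fields})
    return record


forms = [
    Form("A001", [RecognizedField("household_size", "4", 0.99),
                  RecognizedField("age", "37", 0.97)]),
    Form("A002", [RecognizedField("household_size", "7", 0.55),
                  RecognizedField("age", "52", 0.98)]),
]

direct, to_verify = route(forms)
verified = [verify(f, {"household_size": "1"}) for f in to_verify]
records = [export_record(f) for f in direct + verified]
print(records)
```

In this example form A001 skips verification, while A002 is held back because one field was read with low confidence. The operator's correction replaces the misread value before the record is exported.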