Emerging technologies: 2010 Censuses Challenges
Shoshani Eli, Managing Director, Asia Pacific
UN Workshop, Thailand, 2008

Agenda
  – Introduction
  – Who we are?
  – Data capture methods
  – eFLOW platform
  – Summary

“Counted” by eFLOW worldwide
1,374,026,304

TIS’s Experience in Census Projects
India 2001, Turkey 1997, Brazil 2000, South Africa 2001, Ireland 2001, Germany (DP) 1999, Cyprus 2002, Turkey 2000, Kenya 2001, Slovak Republic 2001, Hong Kong 2001, Italy 2002, Slovak Republic 2006, Hong Kong 2006, South Africa Pilot 2007, Ireland 2006
Largest market share worldwide in information capture for census projects

Projects won in 2008
Belarus, Argentina, Thailand

Overview - Top Image Systems
Founded 1991
Data extraction and workflow solutions; specialized in census projects
Since 1996, traded on NASDAQ (TISA)
~250 employees
Local offices in the region:
  – Asia: Shanghai, Japan, Singapore, Hong Kong, Guangzhou (R&D) and Australia
  – Europe: United Kingdom, Germany, Italy, Spain, France, Benelux
  – Americas: Boston, Rio de Janeiro
Present in approx. 40 countries
Strong partner network worldwide
Around 800 installed systems worldwide

The evolution of data capture in census projects
From OCR to an IDR solution
[Diagram: Key From Paper, Key From Image, OMR, Automated Data Capture, Intelligent Data Capture (eFLOW)]

The evolution of data capture in census projects
Manual data entry (key from paper)
  – Slow
  – High error rate in the data entry process
  – Recruitment, training and management of personnel
Key from image
  – Archive
  – Approx. 30-40% faster than key from paper

The evolution of data capture in census projects
OMR (hardware readers for checkboxes)
  – Requires specially printed forms and special scanners
  – Cannot handle handwritten or printed data
  – Forms are not user-friendly
  – OMR requires more answers => more space => increased paper expenditure => more handling and printing costs
  – Not flexible; difficult to adapt to other applications once the census is over
  – No possibility of adding business rules: computation, validations, coding

The evolution of data capture in census projects
Automated data capture
  – Requires less human intervention and enables census data capture to be completed much faster (less space, lower salaries, less hardware)
  – Ensures data integrity: enables the use of automatic and manual online validations, exception handling and coding
  – The most advanced and proven technology for censuses, recommended by the UN and used by all modern countries for census projects
  – Full flexibility in the type of data gathered (checkbox, handwritten, alphabetic and numeric, barcode…)
  – Provides all the capabilities of OMR and much more
  – Creates a correlation between the image and the actual form
  – Remote capabilities enable all forms to be scanned locally and then sent to a central site for processing

The evolution of data capture in census projects
eFLOW: intelligent data capture platform using OCR/ICR/barcode/PDA/Web/email
  – Automated data capture, plus:
  – Smart: automatic classification of documents. Smart understands and differentiates between various types of documents and languages, based on state-of-the-art machine learning algorithms
  – Freedom: artificial intelligence algorithms that provide enough information for the system to find the location of the fields on its own

Intelligent Data Capture
Unified content platform
Census database
Suggest a single platform for all enterprise content

Lessons learned

The customer says it best…
Saving of 25%
Saving of 12%
Benefits of the eFLOW technology
(Source: CSO, Central Statistics Office, Ireland)

First, several general lessons…
Invest in creating the right application for the project
  – System design
    – High-level business process
    – Functional design
    – Technical/detailed design
    – Code guidelines and conventions
    – Technical DR, with R&D
  – Development
    – Project DR
    – Code review
  – Budget control
  – Bi-weekly reports
  – …

First, several general lessons…
Spend time on getting the form right
  – Meet organization standards
  – Form design
Prepare and optimize with a pilot
Training & support

Indian Census 2001
TIS partners with CMC, an Indian governmental agency with years of experience and offices all over India.
Form processing technology:
  – Around 500 million A3 images
  – More than 2 million enumerators
  – The technology was implemented at 15 processing centers in major state capitals
  – Data was captured using only 25 high-end Kodak 7520DS scanners
  – 16 languages
  – The advanced technology in 2001: eFLOW ver. 1.0
  – Two phases

Presenting new advanced technologies to meet 2010 census challenges
eFLOW 5.0: Next Generation…

Main improvements in eFLOW to meet census challenges
Architectural changes
Core changes
Recognition technologies
Modules
Features

eFLOW Architectural Improvements
Core redesigned, built on .NET technology
  – Microsoft .NET is the Microsoft strategy for connecting systems, information, and devices through Web services so people can collaborate and communicate more effectively
Customization by embedded .NET
  – Speeds up runtime: 200x faster
Custom code is now part of the CAB
  – No need to manage DLLs separately
Debug inside eFLOW
  – No need to install a development environment

.NET allows an object-oriented design approach
[Diagram: Batch, House, Person]

eFLOW Architectural Improvements
Improved flexibility
  – Multiple active applications on the same server (run phases in parallel)
    – Balance workload and personnel
    – Ensure ongoing work for all team members
  – Multiple sites
  – Support for multiple servers and clusters

New eFLOW Architecture - Sites
[Diagram: FormID, Export]

Monitoring and Management

Architectural Improvements (cont.)
Easier management of the application:
  – Control all stations from any location
    – Automatic stations behave like Windows Services
  – Remote activation of stations; no need to physically access the server room
  – Restart, start and control stations remotely from a central place using eFLOW Controller and Enterprise Manager

Controller

Architectural Improvements
Handling huge batches:
  – Ability to handle huge batches of 300-3000 pages each
  – Ability to process many batches in parallel
  – A stable, robust platform
(Picture from eFLOW’s performance test)

Architectural Changes (cont.)
Load balancing
  – Load balancing between stations (automatic notifications and better allocation of employees)
  – Automatic load balancing according to the number of batches in a queue
  – Priority handling: using the eFLOW capabilities for automatic prioritization by code (for example according to county, region, etc.)

Architectural Changes (cont.)
Improved security mechanism

Advanced approaches
Automatic EFI matching
  – Improving the speed of the template recognition station via the “Force EFI” mechanism, a unique barcode placed on each page

Advanced approaches (cont.)
Auto coding
  – Coding tasks and data validations are performed on the data capture platform: a cost-effective solution
  – Use one of the statistical software packages on the market, such as ACTR (Canadian statistical software for coding some fields)
  – Use approximate search tools to improve results via DB (Exorbyte)

Advanced approaches (cont.)
Dynamic dictionary update
  – Lookups and dictionaries via DB (rather than txt files)
Export
  – Reconstruct the original form according to the template

Advanced features (cont.)
Splitting & merging: using the built-in eFLOW4 splitting/merging mechanism
Handling problematic batches with improved split/merge abilities
  – Physically take out bad pages (or a bad household) and continue to work with the rest of the batch
  – Split/merge automatically without the need to build a specific station for merging data
Additional powerful interfaces exposed in the CSM for faster development time
  – Priority (for example according to county, region, etc.)
  – Load balancing between stations (automatic notifications and better allocation of employees)

Modules
Statistical report
  – Statistical report to monitor the daily, weekly and monthly rate per user/station
  – Quality checking using
Licenses
  – Flexible license policy
    – Per station
    – Per number of pages processed

Statistic Reporter (e.g. Crystal Reports)

Recognition technologies
OCR/ICR engines: RICOH (Japanese), LIGATURE, PENPOWER (Chinese), JUSTICR, OCE, ABBYY, KADMOS, EXPERVISION, INLITE, A2IA, OMNIPAGE, TIS, NESTOR

Custom stations approach

eFLOW Receives Everything
[Diagram: Mobile Devices, MNIC, Web Completion, Remote scanning]

Web Completion
[Diagram: eFLOW 4.x Web Completion. Employees on the LAN (Active Directory, eFLOW server, eFLOW thick clients); DMZ network segment with the eFLOW Web Completion Server; Internet backbone; Web browsers for outsourcing employees, shared service, online workforce and work from home; Web browsers for external parties involved in the business process (vendors/customers/partners)]

Summary
A data capture and IDR platform (paper, electronic, mobile) and not a recognition product
A proven solution in census data capture: no need to invest time and money in a new technology and vendor, minimizing the risk
Extensive experience in the design, development and implementation of real census and other high-volume form processing projects
Largest market share worldwide in the processing of census projects
Extensive experience based on long research into the special needs of the Indian Census
Maximum flexibility, redundancy and a robust platform, ensuring you meet the project timetable for releasing census results

Thank you
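
Appendix sketch: priority handling and load balancing
The load-balancing slide describes prioritizing batches by code (for example by county or region) and balancing stations according to the number of batches in a queue. The Python sketch below is only an illustration of that idea under stated assumptions; it is not eFLOW code and does not use any eFLOW API. The names REGION_PRIORITY, Station and Dispatcher, the region codes and the station names are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical priority codes per region (lower number = processed first).
REGION_PRIORITY = {"capital": 0, "north": 1, "south": 2}


@dataclass
class Station:
    """A hypothetical processing station with its own queue of batch IDs."""
    name: str
    queue: list = field(default_factory=list)


class Dispatcher:
    """Toy dispatcher: orders batches by region priority, then sends each
    batch to the station that currently has the fewest queued batches."""

    def __init__(self, stations):
        self.stations = stations
        self._heap = []
        self._seq = count()  # tie-breaker keeps arrival order within a priority

    def submit(self, batch_id, region):
        # Unknown regions get the lowest priority.
        priority = REGION_PRIORITY.get(region, len(REGION_PRIORITY))
        heapq.heappush(self._heap, (priority, next(self._seq), batch_id))

    def dispatch_all(self):
        assignments = []
        while self._heap:
            _, _, batch_id = heapq.heappop(self._heap)
            # Load balancing by queue length; ties go to the first station.
            station = min(self.stations, key=lambda s: len(s.queue))
            station.queue.append(batch_id)
            assignments.append((batch_id, station.name))
        return assignments


stations = [Station("completion-1"), Station("completion-2")]
dispatcher = Dispatcher(stations)
dispatcher.submit("B001", "south")
dispatcher.submit("B002", "capital")
dispatcher.submit("B003", "north")
assignments = dispatcher.dispatch_all()
print(assignments)
```

In this sketch the batch from the "capital" region is dispatched first even though it was submitted second, and the three batches are spread across the two stations so that neither queue grows more than one batch longer than the other. A production system would also have to handle stations going offline and batches of very different sizes, which this sketch ignores.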