About InftyReader:

InftyReader is a Windows-based Optical Character Recognition (OCR) application that recognizes scientific documents (including math symbols) and translates them into the following formats:

IML: an XML file format related to InftyEditor.
LaTeX: pronounced either "Lah-tech" or "Lay-tech", LaTeX is a macro package based on TeX. Its purpose is to simplify TeX typesetting, especially for documents containing mathematical formulae.
XHTML (MathML): mathematical expressions are output using MathML notation.
HR-TeX: a simplified LaTeX-like notation designed to be easier to “read” by individuals who are blind.

InftyReader uses three OCR engines. The first is InftyReader’s own OCR engine, which has been designed to recognize scientific and mathematical symbols in formulae. The second and third OCR engines are from the Toshiba Corporation (ExpressReaderPro) and MediaDrive Corporation (WinReader). The ExpressReaderPro and WinReader OCR engines operate simultaneously in order to improve the recognition of characters appearing in ordinary text areas.

InftyReader can recognize tables, including those containing math expressions, so long as the ruled lines are not broken. InftyReader can also process image-based PDF files.

In order for InftyReader to process an image or image-based PDF file, the images must be:
1. black and white (binary) only;
2. high resolution (600 dots-per-inch);
3. non-anti-aliased (no gray-scaling); and,
4. in TIFF, GIF or PNG format.

Web-based PDF files often contain low resolution images (<300 DPI) in order to reduce download and load times. In such cases, InftyReader will not work.

Copyright © 2008-2011 by InftyReader Group, Inc. All rights Reserved.

Image-Based vs. Text-Based PDF Documents

The following screenshots are of an image-based PDF and a text-based PDF document containing the same material. Both have the same visual appearance, but functionally they are different.
Trying to select the text with the selection tool, as described below, is a good test for determining whether InftyReader will process the PDF file correctly.

Text-based PDF Documents: Note, in the following image, that when using the selection tool the text can be selected on a character, word, line or paragraph basis. Likewise, the icon next to the selection is a “T”, representing “Text”. Should any of these characters be other than standard ASCII characters, InftyReader will not process this file accurately!

Image-based PDF Documents: Note, in the following image, that the entire area is selected and highlighted, as opposed to simply the text. Also note that at the bottom of the page there is an image icon to help communicate that the selection is an image. Assuming that the images of text contained in this image-based PDF document meet the minimum quality requirements cited previously, InftyReader should process it fine.

What is Anti-Aliasing (gray-scaling)?

Anti-aliasing is a method of fooling the eye into seeing a jagged edge as smooth. Remember, though, that anti-aliasing does not actually smooth any edges of images; it merely fools the eye. Let’s take a look at the example below to demonstrate the effects of anti-aliasing.

The letter on the left is a blown-up letter “a” with no anti-aliasing. The “a” on the right has had anti-aliasing applied to it. In this blown-up form it looks blurred.

Now look closely at the following two letters:

You can still tell that the letter on the left is jagged, but the letter on the right looks a lot smoother and less blurry than in the example above. So, as you can see, anti-aliasing produces an image that is much more pleasing to the eye, something like what comes out of a high-quality printer rather than what you may be used to seeing on a computer screen.

InftyReader cannot process anti-aliased images.
Source: http://www.pantherproducts.co.uk/Articles/Graphics/anti_aliasing.shtml

Additional comparisons of non/anti-aliased images:

600% magnification of a binary, non-anti-aliased image. InftyReader will correctly process this image.

100% magnification of a non-binary, anti-aliased image. InftyReader will not correctly process this file even though it looks like a black and white (binary) image. It’s really not. Take a look at the following sequence of enlargements.

200% magnification of a non-binary, anti-aliased image. InftyReader will not correctly process this file!

300% magnification of a non-binary, anti-aliased image. InftyReader will not correctly process this file!

400% magnification of a non-binary, anti-aliased image. InftyReader will not correctly process this file!
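
As the enlargements show, an anti-aliased image can look black and white while actually containing gray pixels. The following Python sketch illustrates how the four image requirements listed earlier could be checked before a file is given to InftyReader. It is not part of InftyReader. The names ScanInfo, gray_levels and check_scan are hypothetical, the pixel values are assumed to have already been read from the image as 8-bit grayscale numbers (0 = black, 255 = white), and 600 DPI is treated as a minimum, which is an assumption based on the "minimum quality requirements" wording above.

```python
from dataclasses import dataclass
from typing import List, Set

# Values taken from the requirements list in this document.
ACCEPTED_FORMATS = {"TIFF", "GIF", "PNG"}
REQUIRED_DPI = 600  # assumption: interpreted as a minimum


@dataclass
class ScanInfo:
    """Hypothetical description of one scanned image (not an InftyReader type)."""
    file_format: str   # e.g. "PNG"
    dpi: int           # scan resolution in dots per inch
    pixels: List[int]  # 8-bit grayscale values, 0 = black, 255 = white


def gray_levels(pixels: List[int]) -> Set[int]:
    """Return every pixel value that is neither pure black nor pure white."""
    return {p for p in pixels if p not in (0, 255)}


def check_scan(info: ScanInfo) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if info.file_format.upper() not in ACCEPTED_FORMATS:
        problems.append("format must be TIFF, GIF or PNG")
    if info.dpi < REQUIRED_DPI:
        problems.append("resolution is below 600 DPI")
    if gray_levels(info.pixels):
        # Gray pixels mean the image is not binary, which is what
        # anti-aliasing (gray-scaling) produces along letter edges.
        problems.append("image contains gray pixels (not binary)")
    return problems


# A jagged, binary edge: only pure black and pure white pixels.
sharp = ScanInfo("PNG", 600, [255, 255, 0, 0, 255])
# The same edge after anti-aliasing: a gray pixel softens the transition.
soft = ScanInfo("PNG", 600, [255, 128, 0, 0, 255])

print(check_scan(sharp))  # → []
print(check_scan(soft))   # → ['image contains gray pixels (not binary)']
```

A check of this kind only shows whether an image meets the stated requirements; it does not guarantee that InftyReader will recognize the content correctly.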