Scanning the Document

Technology Learning Center
OmniPage Professional 17
Optical Character Recognition Software
Technology Learning Center
916.278.6112
AIRC 3012
http://www.csus.edu/irt/fsrc
Last Updated: February 9, 2016
Contents
OmniPage Professional 17
Overview
Prerequisites
Objectives
Creating Accessible Scanned Documents
What is OCR?
OmniPage Pro Scanning Software
Scanning the Document
Launch OmniPage Pro
Set Scanning Preferences
Load file:
Optical Character Recognition Tool
Export the document:
Start Scanning
Multiple Pages
Other ways to load files
Converting from PDF
Creating PDF files from other applications
OCR and Proofreading
Exporting the Files
Save the File
OmniPage Document format
Save images in the document
Save to PDF
References
Manual referred - © Nuance_Omnipage_Professional_17_1_User_Guide
Overview
This manual explains how to create accessible scanned documents using Optical Character Recognition
(OCR) software. OmniPage Pro 17 lets users scan text documents such as journal articles, text-based
handouts, pages in a book and other documents. It converts scanned content to readable text formats,
and it also converts pictures of documents to readable text formats.
Features available in OmniPage 17 include:
1. Asian Recognition: Recognizes Asian characters (Japanese, Korean, Traditional Chinese and
Simplified Chinese).
2. Vertical Text Recognition: Identifies text written vertically and allows it to be edited in the Text
Editor, using the True Page formatting level.
3. Improved support for Office 2007: The Direct OCR buttons now appear on a separate Nuance OCR
tab instead of being mixed with all other Add-Ins in Office.
4. Robust Batch Processing: The Batch Manager automatically skips files that cannot be processed,
including those blocked by password requirements, without stopping the main flow of work. The Job
results window indicates which files were not processed.
5. Running: The program’s launch speed is increased and performance is considerably improved on
multi-core computers. Support for quad-core machines is introduced.
6. Linking workflows to scanner buttons: OmniPage functions and workflows can be associated with
scanner buttons, so the whole process of pre-processing, recognition and storage of documents can be
launched from the scanner.
7. Output to Kindle: The new Kindle Assistant lets you create workflows that send recognition results
to a Kindle account at Amazon, so that you can view them on a Kindle device registered with that
account.
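The skip-and-report behavior described in item 4 can be pictured with a short sketch. The Python below is purely illustrative and is not OmniPage code: `process_file`, `FileBlockedError` and the file names are hypothetical stand-ins. It only shows the general pattern of skipping files that cannot be processed and listing them afterward, rather than halting the whole batch.

```python
class FileBlockedError(Exception):
    """Hypothetical error for a file that cannot be processed (e.g. password-protected)."""


def process_file(name, blocked):
    """Stand-in for recognizing one file; raises if the file is in the blocked set."""
    if name in blocked:
        raise FileBlockedError(name)
    return name + ".txt"  # pretend recognition result


def run_batch(names, blocked=frozenset()):
    """Process every file; collect failures instead of stopping at the first one."""
    done, skipped = [], []
    for name in names:
        try:
            done.append(process_file(name, blocked))
        except FileBlockedError:
            skipped.append(name)  # remember it for the results report and move on
    return done, skipped


done, skipped = run_batch(["a.pdf", "locked.pdf", "b.pdf"], blocked={"locked.pdf"})
print("processed:", done)
print("not processed:", skipped)
```

The second list plays the role of the Job results window: it records which files were left out while the rest of the batch ran to completion.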
Prerequisites
To use OmniPage, users should have basic knowledge of how to scan documents with a scanner, know how
to use word processing applications such as Microsoft Word, Notepad or WordPad, and be able to work
in the Windows operating system environment.
Objectives
After following the steps in this manual, you should:
• Understand how to use the different features in OmniPage Pro that aid in creating accessible
scanned documents.
• Apply what you have learned about OmniPage Pro to work with different types of course
materials (text documents) that need to be scanned so that they are accessible.
Creating Accessible Scanned Documents
What is OCR?
When you scan documents such as journal articles, pages in a book or handouts, the standard scanning
software that comes with your scanner may scan a text document as an image. Assistive technology such
as screen readers cannot access image-only content because it contains no readable or editable text.
OCR, or Optical Character Recognition, software enables you to take an image (a scanned document page)
and create editable text from it. This is usually referred to as “OCR’ing” an image file. The editable
text can then be saved in a word processing format such as Microsoft Word (.doc/.docx) or as an Adobe
PDF (.pdf). These formats allow for further editing and formatting of the text if necessary.
OmniPage Pro Scanning Software
OmniPage Pro OCR software converts the scanned “image” output from print-based documents, such as
laser-printed and typewritten documents, and from digital documents, such as an image PDF, into
editable text. This text can then be edited, formatted and saved in your choice of application: MS Word,
PowerPoint, Excel or Adobe PDF.
OmniPage also retains various elements from your scanned print-based documents and digital documents.
The elements retained include:
• Graphics: photos, drawings, charts, graphs, etc.
• Text Formatting: fonts, font sizes, font styles and font color.
• Page Formatting: column structure, paragraph spacing, table formats, placement of graphics, etc.
Scanning the Document
Before you begin scanning:
• Gather all the course materials that you will be scanning.
• Think about how you will make the scanned document(s) available to users. For example, will you
post the scanned document on a website so that users can download it, or will it be linked to a
SacCT course?
• Determine the file format in which to save your scanned document so that it benefits those who
access the document.
Launch OmniPage Pro
step 1. Click the Start Menu button on the Windows taskbar.
step 2. Select All Programs.
step 3. Select ScanSoft OmniPage 17.
step 4. Select OmniPage Professional 17.
step 5. The OmniPage Professional main window displays.
Set Scanning Preferences
Before you begin scanning, you need to specify the processing method you will use to scan documents.
You can choose from automatic, manual or workflow processing. This tutorial focuses on automatic
processing, which is the fastest and easiest method. In automatic processing, OmniPage scans the
image, performs OCR to generate editable text so that you can check and correct errors in the document,
and lastly gives you the option to export the document to the desired format and location. OmniPage can
complete these three steps from beginning to end in the Automatic Process.
You can set scanning preferences using the Scanner Setup Wizard in Tools:
step 1. Select your scanning preferences by clicking the downward arrows (Toolbox drop-down lists)
and making your selection for each phase (1, 2, 3).
We generally select 1-2-3, which executes step 1 (loading the file), step 2 (proofreading the document)
and step 3 (saving the document to its destination folder). By default, the file is saved in Documents
in Libraries.
Load file
Choose the location of the file you would like OmniPage to convert.
Optical Character Recognition Tool:
Choose the option ‘Automatic’ so that the software can detect the type/style of document being
submitted. If you want to be more specific to help the software understand the type of document being
fed, click the option that best describes your document.
Export the document:
You can export the document to a desired location. We generally select ‘Save to File’, but you could
also copy it to the clipboard, send it as an e-mail, etc.
You can save it in any of the available formats. The document can later be opened in MS Office Word
(or any other text editing tool) to be edited.
step 1. Now that you have chosen the recommended workflow, click on the workflow icon.
step 2. Once you click the icon, a Scan Setup window appears. Choose whether to update to the latest
scanner database from Nuance.
step 3. Next, choose “Select and test scanner or digital camera”.
step 4. Locate your device and click Next.
step 5. Testing the device you’re using is optional. If the Setup Wizard has indicated that your
device is ready, click Next.
step 6. Click Finish to exit the Setup Wizard.
step 7. The setup can also be done manually by going to Tools / Scanner Setup Wizard. The Setup Wizard
provides the same options as the previous steps.
step 8. For normal black and white scanning of primarily text, we recommend choosing the Automatic
Process 1-2-3 procedure and, under each phase (1, 2, 3), selecting the following:
a. Phase 1: Scan B&W
b. Phase 2: Automatic Recognition
c. Phase 3: Save to File
Start Scanning
After you select your scanning preferences you can begin scanning.
step 1. Click on the 1-2-3-> button to start scanning.
Multiple Pages
OmniPage allows you to scan single or multiple pages at a time. If you will be scanning multiple pages:
step 1. After the first page is scanned, the Continue Automatic Processing window appears.
step 2. If you have more pages to scan, place the new page in the scanner.
step 3. Click the Add More Pages button.
step 4. When you are done scanning, select the Stop Loading Pages button.
Other ways to load files
Converting from PDF
To extract text content from a PDF file, load it into OmniPage, recognize it, and save the results to a
text format. A variety of outputs is also available from a PDF file’s shortcut menu: Word, Excel, RTF,
WordPerfect or text. For more options, use the Convert Now Wizard.
Creating PDF files from other applications
The Nuance PDF Create product supplied with OmniPage Professional lets you create Normal PDF files
from documents in any print-capable application on your system. Click File / Print and select the
printer “ScanSoft PDF Create!”. Adjust properties as desired, click OK, and supply a file name and
location. If View resulting PDF is selected, your default PDF viewer displays the result.
OCR and Proofreading
After scanning is completed, you will see your image in the Image Panel, and the 100% bar appears below
the second button (get file type). The OCR and Proofreading steps follow automatically. You can then
respond to the OCR Proofreader’s suggestions.
step 1. If the OCR does not recognize a word because it is not very clear, it highlights the
questionable element and displays it with a few suggestions that it considers appropriate.
Whenever an unidentified character comes up, you need to identify it. Then click the option that fits:
Ignore - ignores the current mistake that has been pointed out.
Ignore all - ignores all similar occurrences throughout the document.
Add - adds the new word that appears in the text box.
Change - replaces the current word with a suggestion or with the changes you made.
Change all - changes all occurrences of that word in the rest of the document.
More >> - displays more special characters.
Page Ready - select this if you do not want to make any more changes on the page.
Document Ready - select this if you do not want to make any more changes to the document.
Close - closes the editing window and takes you to the next step, which is exporting the file. You can
either save it to a folder or e-mail it.
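The difference between the single-occurrence and the "all" options can be illustrated with a small sketch. This is not how OmniPage is implemented; the function `proofread`, the action names and the sample words are hypothetical, chosen only to mirror the Ignore, Ignore all, Change and Change all choices described above.

```python
def proofread(words, suspects, decide):
    """Walk the words; for each suspect word ask `decide` for (action, replacement)."""
    ignore_all = set()  # words the user chose to ignore everywhere
    change_all = {}     # word -> replacement applied to every later occurrence
    result = []
    for word in words:
        if word in change_all:
            result.append(change_all[word])
        elif word not in suspects or word in ignore_all:
            result.append(word)
        else:
            action, replacement = decide(word)
            if action == "change":
                result.append(replacement)  # fixes this occurrence only
            elif action == "change_all":
                change_all[word] = replacement
                result.append(replacement)
            else:
                if action == "ignore_all":
                    ignore_all.add(word)
                result.append(word)  # "ignore" keeps this one occurrence as-is
    return result


words = ["tbe", "cat", "tbe", "c0t", "c0t"]
decisions = {"tbe": ("change_all", "the"), "c0t": ("ignore", None)}
result = proofread(words, {"tbe", "c0t"}, lambda w: decisions[w])
print(result)
```

In this sketch the user is asked about "tbe" once, because Change all covers the later occurrence, but is asked about "c0t" twice, because Ignore applies to one occurrence at a time.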
Exporting the Files
Save the File
1. When you are done with the Proofreader, click the Close button on the Proofreader window. The Save
to File window appears automatically. Select your preferred format in the Files of type drop-down list.
You can save the file in any format. It is highly recommended that you save it as a Word document,
because you will then have an original digital copy in a file format that is easy to edit in the
future. You can also save your scanned file as a PDF file. In both cases (Word or Acrobat), you will
also need to structure the content by assigning tags/styles to the text and adding text-equivalent
information to images or figures.
Save As: OmniPage Document format
This way the document is always available in OmniPage to be edited. It remains in OmniPage after export
and can be edited multiple times, exported to different formats or mailed through Outlook 2007. You can
even add pages or re-recognize already recognized pages.
Other formatting levels include:
Plain Text - This exports plain, left-aligned text without columns, in a single font and font size.
When exporting to Text or Unicode file types, graphics and tables are not supported. You can export
plain text to nearly all file types and target applications; in these cases graphics, tables and
bullets can be retained.
Formatted Text - This exports text without columns but with font and paragraph styling, graphics and
tables. This is available for nearly all file types.
Flowing Page - This keeps the original layout of the pages, including columns. This is done wherever possible
with column and indent settings, not with text boxes or frames. Text will then flow from one column to the
other, which does not happen when text boxes are used.
True Page - This keeps the original layout of the pages, including columns. This is done with text, picture and
table boxes and frames. This is offered only for target applications capable of handling these. True Page
Formatting is the only choice for XML export and for all PDF export, except to the file type ‘PDF Edited’.
Spreadsheet - This exports recognition results in tabular form, suitable for use in spreadsheet applications.
This places each document page onto a separate worksheet.
Save images in the document
You can save images in various formats. Under Save to file in the Export results drop down list, select Image
under Save as.
step 1. Choose a folder location and type in the file name. Then select the format in which you want to
save your images, and select whether to save images from the current page, multiple pages or all the
pages.
step 2. Select to save the selected zone image(s) only, the current page image, selected page images or all
images in the document. For multiple zones or multiple pages, you can have all images in a single multi-page
image file, providing you set TIFF, MAX, DCX, JB2 or Image-only PDF or XPS as file type. Otherwise each
image is placed in a separate file. Omni Page adds numerical suffixes to the file name you provide, to
generate unique file names.
step 3. Click Options... if you want to specify a saving mode (black and white, grayscale, color or ‘As is’), a
maximum resolution and other settings. For TIFF files, you specify the compression method here. Click OK to
save the image(s) as specified. Zones and recognized text are not saved with the file.
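The numerical suffixes mentioned in step 2 can be pictured with a short sketch. The manual does not state the exact naming pattern OmniPage uses, so the pattern below (base name, running number, original extension) is an assumption, and `suffixed_names` is a hypothetical helper rather than part of any OmniPage interface.

```python
from pathlib import PurePath


def suffixed_names(file_name, count):
    """Return `count` file names derived from `file_name` by adding 1, 2, 3, ..."""
    path = PurePath(file_name)
    # Assumed pattern: stem + running number + original extension.
    return [f"{path.stem}{i}{path.suffix}" for i in range(1, count + 1)]


print(suffixed_names("figure.png", 3))
```

Each image saved to its own file gets a distinct name derived from the one you typed, so no file overwrites another.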
Save to PDF
You have five choices when saving to Portable Document Format (PDF) files. The first four are presented as
Text converters; the last one is listed among the Image converters.
PDF (Normal):
Pages are exported as they appeared in the Text Editor in True Page view. The PDF file can be viewed and
searched in a PDF viewer and edited in a PDF editor.
PDF Edited:
Use this if you have made significant editing changes in the recognition results. You have three
formatting level choices, including True Page. The PDF file can be viewed, searched and edited.
PDF Searchable Image (formerly PDF Image on Text):
The PDF file is viewable only and cannot be modified in a PDF editor. The original images are exported, but
there is a linked text file behind each image, so the text can be searched. A found word is highlighted in the
image.
PDF with image substitutes:
As for PDF (Normal), but words containing reject and suspect characters have image overlays, so these
uncertain words display as they were in the original document. The PDF file can be viewed, searched and
edited.
PDF Image (formerly PDF, image only):
The original images are exported. The PDF file is viewable only and cannot be modified in a PDF editor and
text cannot be searched.
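The five PDF choices differ mainly in whether the text can be searched and whether the file can be edited. The sketch below restates the descriptions above as a small lookup table; the names `PDF_CHOICES` and `choices_for` are hypothetical and exist only for this illustration.

```python
# (searchable, editable) for each PDF choice, as described in the text above.
PDF_CHOICES = {
    "PDF (Normal)": (True, True),
    "PDF Edited": (True, True),
    "PDF Searchable Image": (True, False),
    "PDF with image substitutes": (True, True),
    "PDF Image": (False, False),
}


def choices_for(need_search, need_edit):
    """List the PDF choices that satisfy the requested capabilities."""
    return [
        name
        for name, (searchable, editable) in PDF_CHOICES.items()
        if (searchable or not need_search) and (editable or not need_edit)
    ]


print(choices_for(need_search=True, need_edit=False))
```

Because screen readers cannot access image-only content, a choice that is not searchable (PDF Image) is a poor fit when the goal is an accessible document.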
Summary
Topics and techniques described in this manual include:
• What is OCR? OCR software converts the scanned “image” output from print-based documents such
as laser-printed and typewritten documents and digital documents such as an image PDF into editable
text
• The process of launching, scanning, setting preferences and exporting
• Other ways to load files, such as converting from PDF
• Proofreading your OCR after it has been loaded or scanned
• Exporting options
References
For further assistance, please refer to the official OmniPage Professional 17 User Guide by Nuance:
http://www.nuance.com/imaging/pdf/ug_OmniPage17UserGuide.pdf