OmniPage SE Release Notes

Version 2.0

Copyright © 2002-2003 ScanSoft, Inc.

 

OmniPage SE Version 2

Portions of this product copyright © 2000 Proximity Technology, Inc.

Portions of this product copyright © 2000 Stellent Corporation.

Portions of this product copyright © 1991-1998, Thomas G. Lane.

Portions of this product copyright © 1993, Vantage Research.

 

 

Please read this document for up-to-date information about ScanSoft's OmniPage SE.

These release notes discuss the following topics:

 

·         Minimum System Requirements

·         Installation

·         General Application Notes

·         General Technical Notes

·         Scanner Support

·         Scanner Technical Notes

·        Sample Image Files, Shipped with the Product

 

Minimum System Requirements

 

To install and run OmniPage SE, your Windows-compatible PC must be equipped with the following:

 

·         An Intel (or compatible) Pentium™ or higher processor.

·         An SVGA monitor with 256 colors and 800x600 pixel resolution. 16-bit color is recommended; this setting is sometimes called “High color”, and is called “Medium Color Quality” on Windows XP.

·         A minimum of 64 megabytes (MB) of random access memory (RAM). We recommend 128 MB or more for the best performance.

·         Microsoft Windows 98 SE (Second Edition), Windows Me, Windows NT 4.0 SP6 (Service Pack 6), Windows 2000 or Windows XP. Your PC must be Y2K compliant. OmniPage will not run on Windows 3.1, Windows 98 First Edition, Windows NT 3.51, or OS/2.

·         A hard disk with a minimum of 135MB of free space for the application files, plus 65MB working space on the system drive during installation, if the destination folder is not on the system drive.

·         5MB for the Microsoft Installer (MSI), if it is not already on your system. This program is part of the Windows Me, Windows 2000 and Windows XP operating systems. The OmniPage SE Installer will install it if it is missing.

·         A Windows-compatible pointing device.

·         A CD-ROM drive for installation.

·         A TWAIN or WIA (Windows Image Acquisition) compatible scanner if you plan to scan documents.
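
The disk space figures in the list above can be checked before starting the installer. The following is a minimal sketch only, not part of OmniPage SE: the function names are hypothetical, and it assumes that the 5MB for the Microsoft Installer is counted against the system drive, which these notes do not state.

```python
import shutil

MB = 1024 * 1024

# Figures taken from the requirements list above.
APP_FILES_MB = 135       # application files in the destination folder
WORKING_SPACE_MB = 65    # working space on the system drive during installation
MSI_MB = 5               # Microsoft Installer, only if not already present


def required_space_mb(dest_on_system_drive, msi_installed=True):
    """Return the free space needed per drive, in MB.

    Assumption (not stated in the notes): the Microsoft Installer is
    counted against the system drive.
    """
    system_mb = 0 if msi_installed else MSI_MB
    if dest_on_system_drive:
        # Everything lands on one drive; the notes list the extra working
        # space only for the case of a separate destination drive.
        return {"system": system_mb + APP_FILES_MB}
    return {
        "system": system_mb + WORKING_SPACE_MB,
        "destination": APP_FILES_MB,
    }


def has_free_space(path, needed_mb):
    """True if the drive holding `path` has at least `needed_mb` MB free."""
    return shutil.disk_usage(path).free >= needed_mb * MB
```

For example, required_space_mb(False) reports the space needed on the system drive and on a separate destination drive, and has_free_space can then be called once per drive with a path on that drive.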

 

Installation

 

The OmniPage SE User's Guide provides information about installing and setting up OmniPage SE. Please refer to it for complete installation instructions.

 

Install your scanner before you install OmniPage SE. Your scanner must be working independently of OmniPage SE prior to connecting it to OmniPage SE. Scanners used with OmniPage SE should be installed according to the scanner manufacturer’s specifications. Please consult the documentation that came with your scanner for information. We recommend that you turn on your scanner before you turn on your PC. Please also refer to the additional information about scanners in these Release Notes.

 

Before you install or uninstall OmniPage SE, exit from any open applications so that only Windows is running. There should be no applications listed in the task bar and no floating toolbars.

 

Install OmniPage SE. If the OmniPage SE Installer does not display the installation menu shortly after you insert the CD, use the following procedure to install this version:

 

1.       Insert the CD into the CD-ROM drive (D: or the letter assigned to your CD-ROM drive).

2.       From the Start menu, select Run.

3.       Type d:\Autorun.exe, replacing d: with the letter of your CD-ROM drive if it is different.

4.       Follow the on-screen instructions.

 

 

On Windows 2000, NT or XP systems, you should have administrator privileges to install OmniPage SE. After installation, users with restricted rights will be able to use OmniPage SE.

 

Register OmniPage SE. At the end of installation, you will be asked to fill out an electronic form to register your copy of OmniPage SE. If you decide not to, you will be prompted again seven days later.

 

The installer may stop with the error code 1706. You may observe this behavior after installing OmniPage SE in a multi-user environment. When a normal user without administrator privileges logs in and tries to launch OmniPage SE by selecting its shortcut from the Windows Start menu, the Windows Installer starts completing the installation procedure for the user and aborts with the above error code, unless the "Enable user to use media source while elevated" policy is enabled. Use the following steps to enable it:

1.       Log on as Administrator.

2.       Launch the Windows Group Policy editing application, GPEDIT.MSC.

3.       Locate the policy to be modified. It is located in Computer Configuration\Administrative Templates\Windows Components\Windows Installer. Its name is "Enable user to use media source while elevated".

4.       Double-click it, select the "Enabled" option and also the checkbox labeled "Check to force setting on..."

5.       Click Apply, then OK.

6.       Close GPEDIT.MSC, then log on as the initial user and click the OmniPage SE shortcut in the Windows Start menu to launch the installation for this user.

 

Uninstalling OmniPage SE. OmniPage SE can be uninstalled in the standard way using the Add/Remove tool in the Control Panel. Should the uninstallation fail for any reason, you can remove all references to OmniPage SE from your system by using our Remover.exe application. This application can be found on the OmniPage SE product CD in the OmniPage\Tools folder.

 

Launch OmniPage SE. You should launch OmniPage SE at least once after installation, before attempting to use it through integrated solutions, such as Direct OCR, the Shell Extension (recognizing from an image file’s shortcut menu) or applications like PaperPort.

 

OmniPage SE and earlier versions of OmniPage Pro. If you have an earlier OmniPage version on your system that you no longer need, we recommend that you remove it before starting the installation of OmniPage SE. If you do not, you will be asked during installation whether to keep it or remove it from your computer.

 

 

General Application Notes

 

This section discusses a number of points related to using features in OmniPage SE.

 

OmniPage SE User’s Guide

·        The OmniPage SE User’s Guide is available in PDF format. You can read and search documents in PDF format in Adobe Acrobat Reader 3.0 or higher. Adobe Acrobat Reader Version 4.05 is included on the OmniPage SE CD. You can install it by selecting the “Install Acrobat Reader” option on OmniPage SE Installer’s starting screen.

 

 

General Technical Notes

 

Registering OmniPage SE on a computer without an established Internet connection

·        If you install OmniPage SE on a computer on which an Internet connection has not been established yet, when you try to register, the Internet Connection Wizard is started. OmniPage may generate a message saying that the registration is not successful. Dismiss this message and work through the Internet Connection Wizard; when completed, ScanSoft's registration page will be opened where you can fill out an electronic form to register your copy of OmniPage SE.

 

Header/Footer recognition

·        Headers and footers on the first page of a document may be recognized only after more pages have been added to the document and the first page is re-recognized.

 

Table cell background color retention

·        The background color of a table is retained only if the whole table has the same background color. Retaining the background color of different colored cells within a table is not supported.

 

Background color retention of normal text

·        OmniPage SE can detect and retain the background color of the recognized text, if requested. If a page is recognized with the “Retain text and background color” setting selected and then re-recognized with the option turned off, the originally recognized background colors remain on the page. To remove the background color, turn the option off, delete the page, add it to the document again by re-scanning or re-loading the original image, and then perform OCR on the page.

 

Problems with text and background color retention

·        The program may not be able to detect the color of the text and its background. Detection fails if the colors are very pale, or if a color is not homogeneous but is composed of tiny pixels of different colors.

 

Problems with the recognition of special items

·        The program is able to recognize special layout elements, such as bullets, leader dots, drop caps, e-mail addresses and URLs, but there might be special cases when these elements are not recognized correctly, depending on the quality and layout of the original document. However, most problems can be fixed using the OmniPage SE Text Editor.

 

Relation between paragraphs and zones

·        A paragraph can never cross from one zone to another.

 

Retention of frames and rule lines

·       OmniPage SE can detect and retain frames and rule lines from the original document. This feature cannot be disabled. Unwanted rule lines and frames can be deleted in the OmniPage SE Text Editor or later, when the document is opened in the target application. Rule lines and frames are exported with the document in True Page formatting level. Exporting rule lines can be turned off in the Converter Options dialog box. 
 If you select Custom for your document’s layout and detecting graphics is disabled in the Custom Layout panel of the Options dialog box, frames around graphics will be detected. You can remove them in the OmniPage SE Text Editor or in the target application, as described above.

 

Using OmniPage SE from other applications such as PaperPort and Pagis Pro

·        If a recognition process is started from another application, OmniPage SE is run in automation mode with its user interface not visible. If it takes a long time for OmniPage SE to process the pages before the calling application regains control, the “OLE server busy” message box may be displayed by the Windows system. Select “Retry” to let OmniPage SE continue processing the document and allow the client application to wait for the completion of the processing.

 

Converting images to text documents using OmniPage SE’s Shell Extension

·        Right-clicking a filename in Windows Explorer displays a shortcut menu. For image files, OmniPage SE adds the submenu item “Convert To”. This menu item allows the image file’s text to be recognized and transferred to the following text formats: *.doc, *.txt, *.xls, *.rtf and *.wpd. Where appropriate, images, layout and styling are also transferred. The text documents are created in the image file’s folder and have the same name as the original image, with the appropriate text format’s extension. If the “Launch Application” option is enabled on the shortcut menu’s Properties/OCR Settings dialog, OmniPage SE also tries to launch the application associated with the selected document format. If you do not have an associated application, you will get a system warning message informing you that there is no associated application for the given file type. However, the conversion is performed even in this situation.
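
The naming rule described above (same folder, same base name, extension of the chosen text format) can be illustrated with a short sketch. This is not OmniPage SE code; the function name and the example path are hypothetical, and the extension list is simply the one given in the note.

```python
from pathlib import PureWindowsPath

# Text formats listed in the note above.
SUPPORTED_EXTENSIONS = {".doc", ".txt", ".xls", ".rtf", ".wpd"}


def converted_document_path(image_path, extension):
    """Return where the converted document is expected to appear.

    The document keeps the image's folder and base name; only the
    extension changes to that of the selected text format.
    """
    ext = extension.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError("unsupported text format: " + extension)
    return PureWindowsPath(image_path).with_suffix(ext)


if __name__ == "__main__":
    # Hypothetical example path.
    print(converted_document_path(r"C:\Scans\Sample1.tif", ".rtf"))
```

PureWindowsPath is used so that the sketch handles Windows-style paths the same way on any platform.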

 

Message box displayed if using OmniPage with PaperPort

·        If you use OmniPage SE from PaperPort, the message “Please finish any open dialogs so that the link can complete” may appear while the OmniPage SE Options dialog is displayed or OmniPage SE is processing. You can continue working normally after you close the message box. The message box may appear behind the PaperPort or OmniPage SE window. You can switch to it by using the ALT+TAB key combination.

 

Direct OCR menu items may not be removed from earlier versions of MS Office applications

·        After uninstalling OmniPage SE, the “Acquire Text...” and “Acquire Text Settings...” menu items may remain in earlier versions of MS Office applications. If you select any of these menu items, you will get a warning message from the MS Office application saying that OmniPage SE is not available. To remove the items from the File menu, use the Tools/Customize dialog box of the MS Office applications.

 

Application does not get registered for use with Direct OCR, even though it was added to the list of registered applications

·        If you try to register a running application in OmniPage SE for Direct OCR using the Add button on the Options/Direct OCR dialog and the application does not get registered (that is, it does not appear on the Registered list after you re-open the dialog), use the Browse button to locate the application in its installation directory and add it again to the registered applications.

 

Using Direct OCR to insert text in applications that support multiple documents

·        Some applications that support multiple open documents may start with no open document. To make sure that the recognized text is inserted into your application, make sure you have an open document. Typically you do this by selecting “New” in the File menu.

 

Using Direct OCR for processing documents with graphics

·        Graphics in the original document are usually identified by OmniPage SE and, if the appropriate option is enabled, are also exported with the output document. This applies both to normal exporting and to using Direct OCR. However, some applications will not accept graphics when used with Direct OCR. Microsoft PowerPoint and Excel are known to be such applications.

 

Using Direct OCR from within Microsoft PowerPoint

·        When you are using Direct OCR in Microsoft PowerPoint and “True Page” is selected as the output formatting level on the Format tab of the Acquire Text Settings dialog box, documents will be transferred to PowerPoint in the “Retain Font and Paragraph” formatting level and not in the level selected. This restriction is applied to avoid inserting documents with incorrect layout.

 

Direct OCR does not support Corel WordPerfect 6.0 and 7.0, Corel Presentations, Quattro Pro and other items in the Corel Suite

·        Although these applications can be registered to be used with OmniPage SE Direct OCR, and the "Acquire Text..." menu items appear in the File menu of these applications, recognized text is not transferred correctly into those applications.

 

Low memory situation during automatic processing

·       If the program runs out of memory while scanning and recognizing in parallel in automatic mode, use manual processing instead. Scan the pages first and recognize them afterwards, when scanning is complete.

 

Recognizing special characters

·        The recognition of special mathematical characters is not supported. The program will substitute these with other characters or the reject character.

 

Text orientation

·        All text in the original document must have the same orientation. The program can determine the dominant text orientation of the document. Text printed with a different orientation may be handled as graphics.

 

Opening OmniPage Documents (OPDs) created with earlier versions

·        OmniPage SE can open OPD files created with OmniPage Pro 10, OmniPage Pro 11 and the Special Edition of version 11. These files must not be read-only when you open them with OmniPage SE.
When opening an OmniPage Document created with version 10, the program loads the images stored in that document, but not the recognized text. Re-recognize those images to get their text.
When opening an OmniPage Document created with version 11 or its Special Edition, the program loads the images, zones and text stored in that document.

 

Problems using the JAWS for Windows screen reading software

·       JAWS may not read out the state of list items in dialog boxes that also have associated checkboxes. To fix this problem, turn on the “Rely on MSAA for Listviews” option in the advanced configuration dialog box of JAWS. An example of a list with checkboxes is the recognition language list on the OCR tab of the Options panel.

 

Deleting User Dictionaries

·        A user dictionary is a list of words which are typically not native to a language, but which from the user's point of view are normal words that the recognition engine should treat as dictionary words. OmniPage SE lists not only its own user dictionaries, created by the user, but also the dictionaries belonging to Microsoft Office. While you are allowed to delete user dictionaries created in OmniPage SE, you may not delete the dictionaries belonging to Microsoft Office. The default dictionary shipped with Microsoft Office is "Custom". If you have Microsoft Office installed, you can find this name on OmniPage SE's user dictionary list.

 

Scanner Support

 

OmniPage SE supports scanners that are controlled by TWAIN or WIA (Windows Image Acquisition) scanner drivers. OmniPage SE supports any fully TWAIN- or WIA-compliant scanner or other input device that can supply at least a binary (black and white) image in a supported resolution (200 to 600 dots per inch). Please note that the WIA standard is supported on Windows Me and Windows XP operating systems only.

 

TWAIN source drivers (.ds files) must be provided by the scanner manufacturer. When these are installed on your PC, they are located in the TWAIN, TWAIN_32, or TWAIN32 subfolder of the Windows folder. OmniPage SE provides a TWAIN interface that communicates with these TWAIN source drivers.
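
To see which TWAIN source drivers are installed, you can look for .ds files in the folders named above. The sketch below is illustrative only and is not part of OmniPage SE: the function name is hypothetical, and searching subfolders recursively is an assumption (the notes only name the top-level folders).

```python
from pathlib import Path

# Folder names given in the paragraph above.
TWAIN_FOLDERS = ("TWAIN", "TWAIN_32", "TWAIN32")


def find_twain_sources(windows_dir):
    """Return the .ds files found under the TWAIN folders of `windows_dir`.

    Assumption: source drivers may sit in vendor subfolders, so each
    TWAIN folder is searched recursively.
    """
    found = []
    for name in TWAIN_FOLDERS:
        folder = Path(windows_dir) / name
        if folder.is_dir():
            found.extend(sorted(folder.rglob("*.ds")))
    return found


if __name__ == "__main__":
    # C:\Windows is the usual location; adjust if Windows is installed elsewhere.
    for source in find_twain_sources(r"C:\Windows"):
        print(source)
```

An empty result suggests that no TWAIN source driver has been installed, in which case the scanner manufacturer's software should be installed first.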

 

Common scanners are tested and tuned for use with OmniPage SE. Tuning is done with scanner hints that specify the best use and optimal settings. OmniPage SE includes a Scanner Setup Wizard that will make an effort to automatically test and optimize your scanner for use with OmniPage SE. You can find its shortcut next to that of OmniPage SE in the Windows Start menu. You can also launch it at any time by selecting Setup in OmniPage SE’s Options dialog box (Tools/Options/Scanner). The Scanner Setup Wizard is launched automatically after installation when you first attempt to scan a document or first open the Tools/Options/Scanner dialog box. This ensures that your scanner is optimized before you start scanning documents.

 

Scanner Technical Notes

 

This section discusses a number of points on using scanners with OmniPage SE. Since scanners change frequently, please refer to http://support.scansoft.com/compatibility for the most recent information about OmniPage SE and scanner compatibility. If you are having problems with your scanner, please contact the scanner manufacturer for assistance. Scanner manufacturers often maintain web sites that offer the latest scanner drivers, answers to frequently asked questions, and other information about their products. Again, your scanner must be working independently of OmniPage SE prior to connecting it to OmniPage SE.

 

Resolution used for scanning documents

·        For most OCR jobs, scanning should be done at 300 dpi resolution. This resolution is automatically set and used by OmniPage SE. You can set a different resolution on the TWAIN driver user interface if it is enabled. You can enable the display of the TWAIN user interface in the Scanner Setup Wizard.
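
As a rough illustration of the resolution rules in these notes (300 dpi by default, and a supported range of 200 to 600 dpi as stated under Scanner Support), the check can be sketched as follows. The function names are hypothetical and this is not how OmniPage SE itself is implemented.

```python
# Values taken from these release notes.
MIN_DPI = 200       # lowest supported resolution
MAX_DPI = 600       # highest supported resolution
DEFAULT_DPI = 300   # resolution set automatically for most OCR jobs


def is_supported_resolution(dpi):
    """True if `dpi` lies within the supported 200 to 600 dpi range."""
    return MIN_DPI <= dpi <= MAX_DPI


def choose_resolution(requested_dpi=None):
    """Return the default resolution, or a requested one if it is supported."""
    if requested_dpi is None:
        return DEFAULT_DPI
    if not is_supported_resolution(requested_dpi):
        raise ValueError(
            "resolution must be between %d and %d dpi" % (MIN_DPI, MAX_DPI)
        )
    return requested_dpi
```

A value chosen in the TWAIN driver's own user interface would correspond to the requested_dpi argument here.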

 

Black and White vs. Grayscale and Color scanning

·        Scanning in binary (black and white or line art) mode will provide you with the fastest processing. However, on most documents grayscale will provide more accurate results, especially on degraded or shaded background documents. Use color scanning only if you want to retain color information and/or color graphics in your document.

 

TWAIN driver user interface

·        Some TWAIN source drivers, when activated, may display their own user interface window behind OmniPage SE's main window. To access the TWAIN interface, press ALT+TAB and tab to the TWAIN screen.

 

Scanning legal size pages

·        Many flatbed scanners do not have a platen large enough for a legal page. These scanners will scan to their platen length if you select "Legal" page size. To scan legal size pages on scanners whose platens are smaller than legal page size, you must use an ADF (Automatic Document Feeder).

 

·        TWAIN drivers may allow you to select legal page size in the Page Type tab of the Settings dialog box even when this size is not available.

 

Be sure to properly set up your scanner to run with OmniPage

·        After you have installed OmniPage SE, it is necessary to run the Scanner Setup Wizard all the way through to Finish at least once before you can use your scanner properly with OmniPage.  For some scanners, it will be necessary to run the complete diagnostic test of the Scanner Setup Wizard to properly optimize the scanner for use with OmniPage SE. Some scanners have been tested and optimized with OmniPage SE already. For these scanners, it is still necessary to run the Scanner Setup Wizard to Finish for the optimization to take effect. When you run OmniPage for the first time, it will automatically run the Scanner Setup Wizard when you select any scanning function. Your scanner may or may not already be optimized for OmniPage. The following paragraphs describe the first few steps in either case:

 

·        The first Wizard page has the Test and configure current scanning source button checked:

Your scanner needs to be optimized to run with OmniPage SE. Leave this button checked and select "Next". Follow the instructions on the next Wizard pages to test and optimize your scanner for OmniPage SE.

 

·        The first Wizard page has Use current settings checked:

Your scanner has already been optimized for OmniPage but the Wizard still needs to be run to Finish before the optimization can take effect. Leave this button checked and select Next. Depending on the devices connected to your system or the driver you are using, you may see a screen asking you to select your scanner from a list.  Please select your scanner or “Other” if it is not listed, then select Next. On the next Wizard page, select Finish.

 

NOTE: You should not cancel from within the Scanner Setup Wizard unless you have already run the Scanner Setup Wizard to Finish at least once with your scanner.

 

·        If you have problems running the Scanner Setup Wizard from within OmniPage, do the following:

  1. Exit OmniPage.
  2. Shut down and restart Windows.
  3. From the Windows Start menu, run the Scanner Wizard:
    Start -> Programs -> OmniPage SE 2.0 -> Scanner Wizard

·        If the scanner you wish to run is showing under Scanning source:, make sure Test and configure current scanning source is checked.

1.       Select Next and follow the instructions on the following pages to optimize your scanner for OmniPage.

·        If your scanner is not shown under Scanning source:, make sure Select scanning source is checked and select Next.

1.       On the next page, make sure the scanner driver you wish to use is selected. On most systems, you will only see your scanner driver and No scanner in the list.

2.       Select Next.

3.       Depending on the devices connected to your system or the driver you are using, you may see a screen asking you to select your scanner from a list.  Please select your scanner or “Other” if it is not listed, then select Next.

4.       On the next page, make sure that Yes is checked.

5.       Select Next and follow the instructions to optimize your scanner for use with OmniPage.

 

Installing Scanners on Windows 98 Second Edition and Windows 2000

 

·        Windows 98 Second Edition and Windows 2000 have built-in TWAIN drivers for a few scanners. You will get better results with OmniPage on these systems if you install the TWAIN software that came with the scanner. Before doing so, it is best to check the web site of the scanner manufacturer to see if an updated version of the TWAIN driver or a patch is available.

 

Installing Scanners on Windows Me and Windows XP

 

·        Windows Me and Windows XP support the Windows Image Acquisition (WIA) method of connecting to scanners. A scanner must have a WIA driver to use this method. Windows Me and Windows XP ship with a number of WIA drivers for some scanners. A list of devices that have built-in WIA drivers can be found at http://www.microsoft.com/hwdev/archive/WIA/WinME_WIAdrv.asp and http://www.microsoft.com/hwdev/tech/WIA/XP_WIAdrvs.asp respectively.

If your scanner is supported by a WIA driver, you do not have to install a software driver to connect to the scanner. Some WIA drivers, however, do not support the full functionality of the scanner. Known problems include ADF handling, problems with certain page sizes and canceling a scan in progress. If you are having such problems, visit the web site of the scanner manufacturer to see if a TWAIN scanner driver is available for download there. The manufacturer will specify if the driver is meant to be used in Windows XP or Windows Me.

Once you have downloaded and installed this driver, you will have to use the Scanner Wizard in OmniPage to select it. Select “Select scanner or digital camera” on the first wizard page (you won’t see this page if the wizard has been run at least once). On the second page, select “Other drivers” and, on the dialog that comes up, check TWAIN. Once you have a list of TWAIN drivers, select your driver. Be careful not to select the driver that begins with “(WIA-)”. If you have installed a TWAIN driver from the manufacturer, you should have a driver in the list that does not begin with “(WIA-)”. If the Wizard recommends that you run the scanner tests on this new driver, please do so. Be sure to complete all pages of the Wizard until you reach a page with the “Finish” button. You must select this button for the changes to take effect.

 

Scanning with HP Scanners that use PrecisionScan LT or Pro

 

The HP scanner dialog has a Scan in color checkbox that must be checked before OmniPage SE can scan in color with HP scanners that use any version of PrecisionScan LT. Some versions of HP’s PrecisionScan drivers automatically segment the page, which may produce poor images in OmniPage. Please consult your scanner documentation to turn this feature off. OmniPage will perform this step when you recognize the image.

 

Sample Image Files, Shipped with the Product

OmniPage SE is supplied with sample files to demonstrate the program’s abilities, and provide a benchmark for the level of OCR accuracy you should expect. Files in your installation language will be placed in the folder My Pictures, if this exists on your system. All sample files have 300 x 300 dpi resolution.

File name:    Sample1
Title:        Camping
Pages, mode:  1, Color
Format:       TIFF
Contents:     Illustrated brochure; table with inverted text header row.
Preparation:  Check Automatic is set as Layout Description.

File name:    Sample2
Title:        Document Management
Pages, mode:  3, Gray
Format:       TIFF
Contents:     Multi-column text with many subheadings, bullets and a main title straddling columns.
Preparation:  Use Automatic or Multiple Columns, no Table as Layout Description.

File name:    Sample4
Title:        Population Tables
Pages, mode:  1, Color
Format:       TIFF
Contents:     Two tables with right-aligned numerical data.
Preparation:  Use Automatic or Spreadsheet as Layout Description.

File name:    Sample5
Title:        Automotive Equipment
Pages, mode:  2, Color
Format:       TIFF
Contents:     Tri-lingual illustrated brochure with text at a small point size and colored sub-headings.
Preparation:  Specify English, French and German as recognition languages. (Tools/Options/OCR)

The following table summarizes which layout features are included in each sample file. Read the files repeatedly with different settings (for example, as advised below) to discover their practical effects.

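(x = feature included in the sample page, - = not included)

Feature                  Sample1    Sample2/1  Sample2/2  Sample2/3  Sample4    Sample5/1  Sample5/2
text attributes          x          x          x          x          x          x          x
text color               x          -          -          -          -          x          x
text background          x          -          -          -          -          -          -
inverse text             x          -          -          -          x          -          -
graphics                 x          -          -          -          -          x          x
tables                   x          -          -          -          x          -          -
frames                   -          -          -          -          -          -          -
lines                    -          -          -          -          -          -          -
columns                  -          x          x          x          -          x          x
bullets                  x          -          -          x          -          -          -
titles spanning columns  -          x          -          -          -          -          -
superscripts             x          -          x          -          -          -          -
cell text alignment      x          -          -          -          x          -          -
accented letters         -          -          -          -          -          x          x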

Text attributes: View the file alternately in No Formatting View and Retain Fonts and Paragraphs view to see which text attributes are retained or dropped.

Text and background color: Select to have these retained or dropped in the Process panel of the Options dialog box before recognition. Recognize the page twice with the setting on and off to see the difference.

Inverse text: Select to have this retained or converted to non-inverse text in the Process panel of the Options dialog box before recognition. Recognize the page twice with different settings to see the difference.

Graphics: These are always displayed; an export converter setting option allows you to retain or drop them when exporting.

Tables: These are displayed in grids. Format borders before or after recognition. Export converter settings let you define how the tables should be exported.

Frames and Lines: Select these in the Text Editor and use a shortcut menu to format lines, frame borders and shading. Choose in export converter settings how frames should be exported.

Column handling: Check column retention by viewing recognized pages in True Page view. Check decolumnization in the other views.