OmniPage SE Release Notes
Version 2.0
Copyright © 2002-2003 ScanSoft, Inc.
OmniPage SE Version 2
Portions of this product copyright © 2000 Proximity Technology, Inc.
Portions of this product copyright © 2000 Stellent Corporation.
Portions of this product copyright © 1991-1998, Thomas G. Lane.
Portions of this product copyright © 1993, Vantage Research.
Please read this document for up-to-date information about ScanSoft's OmniPage SE.
These release notes discuss the following topics:
· Sample Image Files, Shipped with the Product
To install and run OmniPage SE, your Windows-compatible PC must be equipped with the following:
· An Intel (or compatible) Pentium™ or higher processor.
· An SVGA monitor with 256 colors and 800x600 pixel resolution. 16-bit color is recommended; this is sometimes called “High color”, and is called “Medium Color Quality” on Windows XP.
· A minimum of 64 megabytes (MB) of random access memory (RAM). We recommend 128 MB or more for the best performance.
· Microsoft Windows 98 SE (Second Edition), Windows Me, Windows NT 4.0 SP6 (Service Pack 6), Windows 2000 or Windows XP. Your PC must be Y2K compliant. OmniPage will not run on Windows 3.1, Windows 98 First Edition, Windows NT 3.51, or OS/2.
· A hard disk with a minimum of 135MB of free space for the application files, plus 65MB working space on the system drive during installation, if the destination folder is not on the system drive.
· 5MB for the Microsoft Installer (MSI) (if it is not currently on your system). This program is present as part of the Windows Me, Windows 2000 and Windows XP operating systems. OmniPage SE’s Installer will install this program if it is not currently on your system.
· A Windows-compatible pointing device.
· A CD-ROM drive for installation.
· A TWAIN or WIA (Windows Image Acquisition) compatible scanner if you plan to scan documents.
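As a rough illustration of the disk space figures in the list above, the following Python sketch adds them up for the two installation cases. The function name and its parameters are hypothetical and are not part of OmniPage SE or its Installer. The sketch also assumes that the Microsoft Installer is placed on the system drive, which these notes do not state.

```python
# Figures restated from the requirements list, in megabytes.
APP_FILES_MB = 135         # application files in the destination folder
INSTALL_WORKSPACE_MB = 65  # working space on the system drive during installation
MSI_MB = 5                 # Microsoft Installer, if not already on the system


def required_space_mb(dest_on_system_drive: bool, msi_present: bool) -> dict:
    """Hypothetical helper: free space needed per drive, in MB."""
    # Assumption: the Microsoft Installer goes on the system drive.
    system_mb = 0 if msi_present else MSI_MB
    if dest_on_system_drive:
        # Everything lands on the system drive; the notes quote the
        # extra working space only for the separate-drive case.
        return {"system": system_mb + APP_FILES_MB}
    return {
        "destination": APP_FILES_MB,
        "system": system_mb + INSTALL_WORKSPACE_MB,
    }


if __name__ == "__main__":
    print(required_space_mb(dest_on_system_drive=False, msi_present=False))
```

For example, installing to a second drive on a Windows 98 SE system without the Microsoft Installer would, under these assumptions, need 135 MB on the destination drive and 70 MB on the system drive.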
The OmniPage SE User's Guide provides information about installing and setting up OmniPage SE. Please refer to it for complete installation instructions.
Before you install or uninstall OmniPage SE, exit from any open applications so that only Windows is running. There should be no applications listed in the task bar and no floating toolbars.
Install OmniPage SE. If the OmniPage SE Installer does not display the installation menu shortly after you insert the CD, use the following procedure to install this version:
1. Insert the CD into the CD-ROM drive (D: or the letter assigned to your CD-ROM drive).
2. From the Start menu, select Run.
3. Type d:\Autorun.exe (replace d: with the letter of your CD-ROM drive, if different).
4. Follow the on-screen instructions.
On Windows 2000, NT or XP systems, you should have administrator privileges to install OmniPage SE. After installation, users with restricted rights will be able to use OmniPage SE.
Register OmniPage SE. At the end of installation, you will be asked to fill out an electronic form to register your copy of OmniPage SE. If you decide not to, you will be prompted again seven days later.
The installer may stop with the error code 1706. You may observe this behavior after installing OmniPage SE in a multi-user environment. When a normal user without administrator privileges logs in and tries to launch OmniPage SE by selecting its shortcut from the Windows Start menu, the Windows Installer starts completing the installation procedure for the user and aborts with the above error code, unless the "Enable user to use media source while elevated" policy is enabled. Use the following steps to enable it:
1. Log on as Administrator.
2. Launch the Windows Group Policy editing application GPEDIT.MSC.
3. Locate the policy to be modified. It is located in Computer Configuration\Administrative Templates\Windows Components\Windows Installer. Its name is "Enable user to use media source while elevated".
4. Double-click it, select the "Enabled" option and also select the checkbox labeled "Check to force setting on..."
5. Click Apply, then OK.
6. Close GPEDIT.MSC, then log on as the initial user and click the OmniPage SE shortcut in the Windows Start menu to launch the installation for this user.
Uninstalling OmniPage SE. OmniPage SE can be uninstalled in the standard way using the Add/Remove tool in the Control Panel. Should the uninstallation of OmniPage SE fail for any reason, you can remove all references to OmniPage SE from your system by using our Remover.exe application. This application can be found on the OmniPage SE product CD in the OmniPage\Tools folder.
Launch OmniPage SE. You should launch OmniPage SE at least once after installation, before attempting to use it through integrated solutions, such as Direct OCR, the Shell Extension (recognizing from an image file’s shortcut menu) or applications like PaperPort.
OmniPage SE and earlier versions of OmniPage Pro. If you have an earlier OmniPage version on your system that you will no longer need, we recommend you remove it before starting the installation of OmniPage SE. If you do not, you will be asked during installation whether to keep it or remove it from your computer.
This section discusses a number of points related to using features in OmniPage SE.
· The OmniPage SE User’s Guide is available in PDF format. You can read and search documents in PDF format in Adobe Acrobat Reader 3.0 or higher. Adobe Acrobat Reader Version 4.05 is included on the OmniPage SE CD. You can install it by selecting the “Install Acrobat Reader” option on OmniPage SE Installer’s starting screen.
· If you install OmniPage SE on a computer on which an Internet connection has not yet been set up, the Internet Connection Wizard starts when you try to register. OmniPage may display a message saying that the registration was not successful. Dismiss this message and work through the Internet Connection Wizard; when it completes, ScanSoft's registration page opens, where you can fill out an electronic form to register your copy of OmniPage SE.
· Headers and footers on the first page of a document may be recognized only when that page is re-recognized after more pages have been added to the document.
· The background color of a table is retained only if the whole table has the same background color. Retaining the background color of different colored cells within a table is not supported.
· OmniPage SE can detect and retain the background color of the recognized text, if requested. If a page is recognized with the “Retain text and background color” setting selected and re-recognized with the option turned off, the originally recognized background colors remain on the page. To remove the background color, turn the option off, then delete the page, then add it to the document by re-scanning or re-loading the original image, and then perform OCR on the page.
· It may happen that the program is not able to detect the color of the text and its background. Detection fails if the colors are very pale or the color is not homogeneous, but is composed of tiny color pixels of different colors.
· The program is able to recognize special layout elements, such as bullets, leader dots, drop caps, e-mail addresses and URLs, but there might be special cases when these elements are not recognized correctly, depending on the quality and layout of the original document. However, most problems can be fixed using the OmniPage SE Text Editor.
· A paragraph can never cross from one zone to another.
· OmniPage SE can detect and retain frames and rule lines from the original document. This feature cannot be disabled. Unwanted rule lines and frames can be deleted in the OmniPage SE Text Editor or later, when the document is opened in the target application. Rule lines and frames are exported with the document in True Page formatting level. Exporting rule lines can be turned off in the Converter Options dialog box.
If you select Custom for your document’s layout and detecting graphics is disabled in the Custom Layout panel of the Options dialog box, frames around graphics will be detected. You can remove them in the OmniPage SE Text Editor or in the target application, as described above.
· If a recognition process is started from another application, OmniPage SE is run in automation mode with its user interface not visible. If it takes a long time for OmniPage SE to process the pages before the calling application regains control, the “OLE server busy” message box may be displayed by the Windows system. Select “Retry” to let OmniPage SE continue processing the document and allow the client application to wait for the completion of the processing.
· Right-clicking a filename in Windows Explorer displays a shortcut menu. For image files, OmniPage SE adds the submenu item “Convert To”. This menu item allows the image file’s text to be recognized and transferred to the following text formats: *.doc, *.txt, *.xls, *.rtf and *.wpd. Where appropriate, images, layout and styling are also transferred. The text documents will be created in the image file’s folder and will have the same name as the original image, with the appropriate text format’s extension. If the “Launch Application” option is enabled on the shortcut menu’s Properties/OCR Settings dialog, OmniPage SE also tries to launch the application associated with the selected document format. If you do not have an associated application, you will get a system warning message informing you that there is no associated application for the given file type. However, conversion will be performed even in this situation.
· If you use OmniPage SE from PaperPort, the message “Please finish any open dialogs so that the link can complete” may appear while the OmniPage SE Options dialog is displayed or OmniPage SE is processing. You can continue working normally after you close the message box. The message box may appear behind the PaperPort or OmniPage SE window. You can switch to it by using the ALT+TAB key combination.
· After uninstalling OmniPage SE, the “Acquire Text...” and “Acquire Text Settings...” menu items may remain in earlier versions of MS Office applications. If you select any of these menu items, you will get a warning message from the MS Office application saying that OmniPage SE is not available. To remove the items from the File menu, use the Tools/Customize dialog box of the MS Office applications.
· If you try to register a running application in OmniPage SE for Direct OCR using the Add button on the Options/Direct OCR dialog and the application does not get registered (that is, it does not appear on the Registered list after you re-open the dialog), use the Browse button to locate the application in its installation directory and add it again to the registered applications.
· Some applications that support multiple open documents may start with no open document. To make sure that the recognized text is inserted into your application, make sure you have an open document. Typically you do this by selecting “New” in the File menu.
· Graphics in the original document are usually identified by OmniPage SE and, if the appropriate option is enabled, will also be exported with the output document. This applies both to normal exporting and to using Direct OCR. However, some applications will not accept graphics when used with Direct OCR. Microsoft PowerPoint and Excel are identified as such applications.
· When you are using Direct OCR in Microsoft PowerPoint and “True Page” is selected as the output formatting level on the Format tab of the Acquire Text Settings dialog box, documents will be transferred in the “Retain Font and Paragraph” formatting level and not in the level selected. This restriction is applied to avoid inserting documents with incorrect layout.
· Although these applications can be registered to be used with OmniPage SE Direct OCR, and the "Acquire Text..." menu items appear in the File menu of these applications, recognized text is not transferred correctly into those applications.
· If the program runs out of memory while scanning and recognizing in parallel in automatic mode, use manual processing instead. Scan the pages first and recognize them afterwards, when scanning is complete.
· The recognition of special mathematical characters is not supported. The program will substitute these with other characters or the reject character.
· All text on the original document must have the same orientation. The program can determine the dominant text orientation of the document. Text printed with a different orientation may be handled as graphics.
· OmniPage SE can open OPD files created with OmniPage Pro 10, OmniPage Pro 11 and its Special Edition. These files must not be read-only when you want to open them with OmniPage SE.
When opening an OmniPage Document created with version 10, the program loads the images stored in that document, but the recognized text will not be loaded. You should re-recognize those images to get their text.
When opening an OmniPage Document created with version 11 or its Special Edition, the program loads the images, zones and text stored in that document.
· JAWS may not read out the state of list items in dialog boxes that also have associated checkboxes. To fix this problem, turn on the “Rely on MSAA for Listviews” option in the advanced configuration dialog box of JAWS. An example of a list with checkboxes is the recognition language list on the OCR tab of the Options panel.
· A user dictionary is a list of words which are typically not native to a language, but from the user's point of view are normal words which should be treated by the recognition engine as dictionary words. OmniPage SE lists not only its own user dictionaries, created by the user, but also the dictionaries belonging to Microsoft Office. While you are allowed to delete user dictionaries created in OmniPage SE, you may not delete the dictionaries belonging to Microsoft Office. The default dictionary, shipped with Microsoft Office is "Custom". If you have Microsoft Office installed, you can find this name on OmniPage SE's user dictionary list.
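The naming rule described above for the “Convert To” shortcut menu item (same folder, same name, new extension) can be sketched in Python as follows. This is only an illustration of the rule as stated in these notes: the helper name and the example path are hypothetical, and the code does not call or reproduce anything in OmniPage SE itself.

```python
from pathlib import PureWindowsPath

# Output formats listed for the "Convert To" shortcut menu item.
CONVERT_TO_EXTENSIONS = (".doc", ".txt", ".xls", ".rtf", ".wpd")


def converted_path(image_path: str, extension: str) -> str:
    """Hypothetical helper: same folder and base name, new extension."""
    if extension not in CONVERT_TO_EXTENSIONS:
        raise ValueError(f"unsupported output format: {extension}")
    # with_suffix() swaps only the final extension and keeps the folder.
    return str(PureWindowsPath(image_path).with_suffix(extension))


if __name__ == "__main__":
    # Example path is made up for illustration.
    print(converted_path(r"C:\Scans\Sample1.tif", ".rtf"))
```

PureWindowsPath is used so that the sketch handles Windows-style paths regardless of the system it is run on.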
OmniPage SE supports scanners that are controlled by TWAIN or WIA (Windows Image Acquisition) scanner drivers. OmniPage SE supports any fully TWAIN- or WIA-compliant scanner or other input device that can supply at least a binary (black and white) image in a supported resolution (200 to 600 dots per inch). Please note that the WIA standard is supported on Windows Me and Windows XP operating systems only.
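The two constraints in the paragraph above (the supported resolution range and the operating systems on which WIA is available) can be restated as a short Python sketch. The function names are hypothetical and exist only for this illustration; the sketch assumes the operating system name is written exactly as in these notes.

```python
# Rules restated from the paragraph above.
MIN_DPI, MAX_DPI = 200, 600
WIA_SYSTEMS = {"Windows Me", "Windows XP"}


def driver_interfaces(os_name: str) -> list:
    """Hypothetical helper: driver standards usable on a supported OS."""
    interfaces = ["TWAIN"]
    if os_name in WIA_SYSTEMS:
        interfaces.append("WIA")
    return interfaces


def is_supported_resolution(dpi: int) -> bool:
    """True if dpi lies in the supported range of 200 to 600 dpi."""
    return MIN_DPI <= dpi <= MAX_DPI


if __name__ == "__main__":
    print(driver_interfaces("Windows XP"), is_supported_resolution(300))
```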
TWAIN source drivers (.ds files) must be provided by the scanner manufacturer. When these are installed on your PC, they are located in the Windows folder in the TWAIN, TWAIN_32, or TWAIN32 folder. OmniPage SE provides a TWAIN interface that communicates with these TWAIN source drivers.
Common scanners are tested and tuned for use with OmniPage SE. Tuning is done with scanner hints that specify the best use and optimal settings. OmniPage SE includes a Scanner Setup Wizard that will make an effort to automatically test and optimize your scanner for use with OmniPage SE. You can find its shortcut next to that of OmniPage SE in the Windows Start menu. You can also launch it at any time by selecting Setup in OmniPage SE’s Options dialog box (Tools/Options/Scanner). The Scanner Setup Wizard is launched automatically after installation when you first attempt to scan a document or first open the Tools/Options/Scanner dialog box. This ensures that your scanner is optimized before you start scanning documents.
This section discusses a number of points on using scanners with OmniPage SE. Since scanners change frequently, please refer to http://support.scansoft.com/compatibility for the most recent information about OmniPage SE and scanner compatibility. If you are having problems with your scanner, please contact the scanner manufacturer for assistance. Scanner manufacturers often maintain web sites that provide the latest scanner drivers, answers to frequently asked questions, and other information about their products. Your scanner must be working independently of OmniPage SE prior to connecting it to OmniPage SE.
· For most OCR jobs, scanning should be done at 300 dpi resolution. This resolution is automatically set and used by OmniPage SE. You can set a different resolution on the TWAIN driver user interface if it is enabled. You can enable the display of the TWAIN user interface in the Scanner Setup Wizard.
· Scanning in binary (black and white or line art) mode will provide you with the fastest processing. However, on most documents grayscale will provide more accurate results, especially on degraded or shaded background documents. Use color scanning only if you want to retain color information and/or color graphics in your document.
· Some TWAIN source drivers, when activated, may display their own user interface window behind OmniPage SE's main window. To access the TWAIN interface, press ALT+TAB and tab to the TWAIN screen.
· Many flatbed scanners do not have a platen large enough for a legal page. These scanners will scan to their platen length if you select "Legal" page size. To scan legal size pages on scanners whose platens are smaller than legal page size, you must use an ADF (Automatic Document Feeder).
· TWAIN drivers may allow you to select legal page size in the Page Type tab of the Settings dialog box even when this size is not available.
· After you have installed OmniPage SE, it is necessary to run the Scanner Setup Wizard all the way through to Finish at least once before you can use your scanner properly with OmniPage. For some scanners, it will be necessary to run the complete diagnostic test of the Scanner Setup Wizard to properly optimize the scanner for use with OmniPage SE. Some scanners have been tested and optimized with OmniPage SE already. For these scanners, it is still necessary to run the Scanner Setup Wizard to Finish for the optimization to take effect. When you run OmniPage for the first time, it will automatically run the Scanner Setup Wizard when you select any scanning function. Your scanner may or may not already be optimized for OmniPage. The following paragraphs describe the first few steps in either case:
· The first Wizard page has the Test and configure current scanning source button checked:
Your scanner needs to be optimized to run with OmniPage SE. Leave this button checked and select "Next". Follow the instructions on the next Wizard pages to test and optimize your scanner for OmniPage SE.
· The first Wizard page has Use current settings checked:
Your scanner has already been optimized for OmniPage but the Wizard still needs to be run to Finish before the optimization can take effect. Leave this button checked and select Next. Depending on the devices connected to your system or the driver you are using, you may see a screen asking you to select your scanner from a list. Please select your scanner or “Other” if it is not listed, then select Next. On the next Wizard page, select Finish.
NOTE: You should not cancel from within the Scanner Setup Wizard unless you have already run the Scanner Setup Wizard to Finish at least once with your scanner.
· If you have problems running the Scanner Setup Wizard from within OmniPage, do the following:
· If the scanner you wish to run is showing under Scanning source:, make sure Test and configure current scanning source is checked.
1. Select Next and follow the instructions on the following pages to optimize your scanner for OmniPage.
· If your scanner is not shown under Scanning source:, make sure Select scanning source is checked and select Next.
1. On the next page, make sure the scanner driver you wish to use is selected. On most systems, you will only see your scanner driver and No scanner in the list.
2. Select Next.
3. Depending on the devices connected to your system or the driver you are using, you may see a screen asking you to select your scanner from a list. Please select your scanner or “Other” if it is not listed, then select Next.
4. On the next page, make sure that Yes is checked.
5. Select Next and follow the instructions to optimize your scanner for use with OmniPage.
· Windows 98 Second Edition and Windows 2000 have built-in TWAIN drivers for a few scanners. You will get better results with OmniPage on these systems if you install the TWAIN software that came with the scanner. Before doing so, it is best to check the web site of the scanner manufacturer to see if an updated version of the TWAIN driver or a patch is available.
· Windows Me and Windows XP support the Windows Image Acquisition (WIA) method of connecting to scanners. A scanner must have a WIA driver to use this method. Windows Me and Windows XP ship with WIA drivers for a number of scanners. Lists of devices that have built-in WIA drivers can be found at http://www.microsoft.com/hwdev/archive/WIA/WinME_WIAdrv.asp and http://www.microsoft.com/hwdev/tech/WIA/XP_WIAdrvs.asp respectively.
If your scanner is supported by a WIA driver, you do not have to install a software driver to connect to the scanner. Some WIA drivers, however, do not support the full functionality of the scanner. Known problems include ADF handling, problems with certain page sizes and canceling a scan in progress. If you are having such problems, visit the web site of the scanner manufacturer to see if a TWAIN scanner driver is available for download there. The manufacturer will specify if the driver is meant to be used in Windows XP or Windows Me.
Once you have downloaded and installed this driver, you will have to use the Scanner Setup Wizard in OmniPage to select it. Select “Select scanner or digital camera” on the first wizard page (you won’t see this page if the wizard has been run at least once). On the second page, select “Other drivers” and, on the dialog that comes up, check TWAIN. Once you have a list of TWAIN drivers, select your driver. Be careful not to select the driver whose name begins with “(WIA-)”. If you have installed a TWAIN driver from the manufacturer, you should have a driver in the list that does not begin with “(WIA-)”. If the Wizard recommends that you run the scanner tests on this new driver, please do so. Be sure to complete all pages of the Wizard until you reach a page with the “Finish” button. You must select this button for the changes to take effect.
The HP scanner dialog has a Scan in color checkbox that must be checked before OmniPage SE can scan in color with HP scanners that use any version of PrecisionScan LT. Some versions of HP’s PrecisionScan drivers automatically segment the page, which may produce poor images in OmniPage. Please consult your scanner documentation to turn this feature off. OmniPage will perform this step when you recognize the image.
OmniPage SE is supplied with sample files to demonstrate the program’s abilities, and provide a benchmark for the level of OCR accuracy you should expect. Files in your installation language will be placed in the folder My Pictures, if this exists on your system. All sample files have 300 x 300 dpi resolution.
File name | Title | Pages | Format | Contents | Preparation |
Sample1 | Camping | 1 | TIFF | Illustrated brochure; table with inverted text header row. | Check Automatic is set as Layout Description. |
Sample2 | Document | 3 | TIFF | Multi-column text with many subheadings, bullets and a main title straddling columns. | Use Automatic or Multiple Columns, no Table as Layout Description. |
Sample4 | Population | 1 | TIFF | Two tables with right-aligned numerical data. | Use Automatic or Spreadsheet as Layout Description. |
Sample5 | Automotive | 2 | TIFF | Tri-lingual illustrated brochure with text at a small point size and colored sub-headings. | Specify English, French and German as recognition languages. (Tools/Options/OCR) |
The following table summarizes which layout features are included in each sample file. Read the files repeatedly with different settings (for example, as advised below) to discover their practical effects.
Sample    | text | text | text | inverse | graphics | tables | frames | lines | columns | bullets | titles spanning | super- | cell text | accented letters |
Sample1   | ·    | ·    | ·    | ·       | ·        | ·      |        |       |         | ·       |                 | ·      | ·         |                  |
Sample2/1 | ·    |      |      |         |          |        |        |       | ·       |         | ·               |        |           |                  |
Sample2/2 | ·    |      |      |         |          |        |        |       | ·       |         |                 | ·      |           |                  |
Sample2/3 | ·    |      |      |         |          |        |        |       | ·       | ·       |                 |        |           |                  |
Sample4   | ·    |      |      | ·       |          | ·      |        |       |         |         |                 |        | ·         |                  |
Sample5/1 | ·    | ·    |      |         | ·        |        |        |       | ·       |         |                 |        |           | ·                |
Sample5/2 | ·    | ·    |      |         | ·        |        |        |       | ·       |         |                 |        |           | ·                |
Text attributes: View the file alternately in No Formatting View and Retain Fonts and Paragraphs view to see which text attributes are retained or dropped.
Text and background color: Select to have these retained or dropped in the Process panel of the Options dialog box before recognition. Recognize the page twice with the setting on and off to see the difference.
Inverse text: Select to have this retained or converted to non-inverse text in the Process panel of the Options dialog box before recognition. Recognize the page twice with different settings to see the difference.
Graphics: These are always displayed; an export converter setting option allows you to retain or drop them when exporting.
Tables: These are displayed in grids. Format borders before or after recognition. Export converter settings let you define how the tables should be exported.
Frames and Lines: Select these in the Text Editor and use a shortcut menu to format lines, frame borders and shading. Choose in export converter settings how frames should be exported.
Column handling: Check column retention by viewing recognized pages in True Page view. Check decolumnization in the other views.