OCR/ICR processing options
The processing options are grouped into three tabs:
- General
- Page Processing
- Character Filter
The General tab has the following options:
Zone: Select a zone from the list or click Define Zones to define a new zone.
Show Zone popup window: Selecting this option causes a dialog box to appear when indexing is active in the product.
Do not run Zone OCR / ICR if the index field is already populated: Skip OCR/ICR if there is already data in the index field.
Validate based on OCR/ICR confidence level: Select this option to validate based on a minimum OCR/ICR confidence level, and specify the minimum level expected. If the results fall under this percentage, the value is flagged as invalid during indexing.
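The skip and validation rules described above can be sketched as follows. This is an illustration only, not the product's implementation: the names `ZoneResult`, `should_run_zone_ocr`, and `is_valid` are hypothetical, and the sketch assumes the engine reports confidence as a percentage from 0 to 100.

```python
from dataclasses import dataclass


@dataclass
class ZoneResult:
    """Hypothetical container for one zone's recognition result."""
    text: str
    confidence: float  # assumed to be a percentage, 0-100


def should_run_zone_ocr(current_value: str, skip_if_populated: bool) -> bool:
    """Skip recognition when the option is on and the index field already has data."""
    return not (skip_if_populated and current_value.strip() != "")


def is_valid(result: ZoneResult, minimum_confidence: float) -> bool:
    """A result that falls under the minimum level is flagged as invalid."""
    return result.confidence >= minimum_confidence


result = ZoneResult(text="INV-1042", confidence=72.5)
flagged_invalid = not is_valid(result, minimum_confidence=80.0)
```

In this sketch a result exactly at the minimum level counts as valid, because the description only flags results that fall under it.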
The Page Processing tab has the following options:
Page processing option: Select which pages of the document the recognition engine should process:
- Defined Page
- First Page
- Last Page
- All Pages
- Selected Pages
Skip document / folder separator pages: Skip the document or folder separator pages.
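One way to picture how these options combine is the sketch below. It rests on assumptions that the text above does not confirm: that Defined Page means a single page number chosen by the user, that Selected Pages means a user-supplied list of page numbers, and that separator pages are simply dropped from the resulting set. The function `pages_to_process` and its parameters are hypothetical.

```python
def pages_to_process(page_count, option, defined_page=1, selected_pages=(),
                     separator_pages=frozenset(), skip_separators=False):
    """Return the 1-based page numbers to send to the recognition engine."""
    if option == "First Page":
        pages = [1]
    elif option == "Last Page":
        pages = [page_count]
    elif option == "All Pages":
        pages = list(range(1, page_count + 1))
    elif option == "Defined Page":
        pages = [defined_page]  # assumption: one user-chosen page number
    elif option == "Selected Pages":
        pages = sorted(set(selected_pages))  # assumption: user-supplied list
    else:
        raise ValueError(f"Unknown page processing option: {option}")

    # Ignore page numbers that do not exist in the document.
    pages = [p for p in pages if 1 <= p <= page_count]

    # Assumption: skipping separators removes them from the chosen set.
    if skip_separators:
        pages = [p for p in pages if p not in separator_pages]
    return pages


all_but_separator = pages_to_process(
    5, "All Pages", separator_pages={1}, skip_separators=True
)
```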
The Character Filter tab has the following options:
Character filter: If a zone is known to contain only numeric or only alphabetic characters, select the matching filter so that the OCR results return only those characters. If the OCR text contains both alphabetic and numeric characters, select All Characters. The available options include All Characters, Alpha Only, Numeric Only, Numeric Extended (0-9,$,%,#,+,- ..), Date (0-9./-), Extended Characters Only, and Standard Printable Characters.
Enable extended characters: Include additional characters that the selected Character filter option would otherwise exclude. For instance, if a zone contains numeric characters and the letters A, B, and C, set the Character filter to Numeric Only, select Enable extended characters, and add A, B, and C to the extended characters list. This approach can produce better OCR results than the All Characters setting.
Click Extended Characters Setup to add characters to the character filter.
Enter each extended character individually into the list. For example, to allow all letters and numbers, select Alpha Only and then enter "1234567890" into the extended characters list.
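The combination of a character filter with an extended characters list can be thought of as a set union. The sketch below is a simplified model, not the product's logic: the contents of each filter set are assumptions (ASCII letters and digits only), and `allowed_characters` and `invalid_characters` are hypothetical helper names.

```python
import string

# Assumed contents of two of the filters; the real sets may differ.
FILTERS = {
    "Alpha Only": set(string.ascii_letters),
    "Numeric Only": set(string.digits),
}


def allowed_characters(filter_name, extended=""):
    """Union of the base filter and any extended characters."""
    return FILTERS[filter_name] | set(extended)


def invalid_characters(text, allowed):
    """Characters in the OCR text that the filter does not permit."""
    return [ch for ch in text if ch not in allowed]


# Numeric zone that may also contain the letters A, B, and C.
allowed = allowed_characters("Numeric Only", extended="ABC")
rejected = invalid_characters("12A9Z", allowed)  # only "Z" is rejected
```

The same helpers model the second example: `allowed_characters("Alpha Only", "1234567890")` accepts every letter and digit.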
Invalid character action: The following options are available:
- Do Not Correct: All characters detected by the OCR/ICR engine are returned unchanged, so the result may contain inaccuracies.
- Remove: All invalid characters, as defined by the character filter, are deleted from the return value.
- Auto Correct: All invalid characters are replaced with the user-specified characters defined in the Auto Correction Settings list. Click Auto Correction Settings to display the list. You can add or remove settings to improve the quality of the OCR results. For example, if the OCR engine returns the letter O while the character filter and extended characters allow only 0-9 and a, b, or c, an auto correction setting that replaces O with 0 places a zero (0) in the field.
- Replace with marker: Invalid characters are replaced with the value of the Invalid character marker option below.
Processing option: Filter the entire OCR/ICR zone or only certain matching words when detected.
Invalid character marker: Enter the character that replaces invalid characters when Replace with marker is selected.
The marker must be valid for the data type of the index field. A marker that is invalid for that data type, for example an asterisk (*) in a numeric field, causes either no data to be returned or errors to occur.
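The four invalid character actions can be compared side by side with a small sketch. It is illustrative only: `apply_invalid_character_action` is a hypothetical function, the corrections dictionary stands in for the Auto Correction Settings list, and the sketch assumes that Auto Correct leaves an invalid character unchanged when no correction is defined for it.

```python
ACTIONS = {"Do Not Correct", "Remove", "Auto Correct", "Replace with marker"}


def apply_invalid_character_action(text, allowed, action,
                                   corrections=None, marker="?"):
    """Apply one invalid character action to the OCR text."""
    if action not in ACTIONS:
        raise ValueError(f"Unknown action: {action}")
    corrections = corrections or {}
    out = []
    for ch in text:
        if ch in allowed or action == "Do Not Correct":
            out.append(ch)  # valid, or left as detected
        elif action == "Remove":
            continue  # drop the invalid character
        elif action == "Auto Correct":
            # Assumption: characters without a correction stay unchanged.
            out.append(corrections.get(ch, ch))
        else:  # "Replace with marker"
            out.append(marker)
    return "".join(out)


allowed = set("0123456789abc")
removed = apply_invalid_character_action("1O4a", allowed, "Remove")
corrected = apply_invalid_character_action(
    "1O4a", allowed, "Auto Correct", corrections={"O": "0"}
)
marked = apply_invalid_character_action(
    "1O4", set("0123456789"), "Replace with marker", marker="*"
)
# The asterisk marker makes the value unusable as a number.
marked_is_numeric = marked.isdigit()
```

Here `removed` is "14a", `corrected` is "104a", and `marked` is "1*4", which no longer passes a numeric check. That is the situation the note above describes when the marker is invalid for a numeric field.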