General tab
Use this tab to set general OCR attributes.
Option |
Description |
---|---|
Activate |
Use this combo box to activate the component according to a condition (see Conditional fields under Appendices). |
CPU usage |
Use this setting to limit processor usage by the OPOCR component. The default value is 100%. Modifying the setting requires some knowledge of processor activity. For example, on a four-processor system, choose a quartile percentage (25%, 50%, 75%, or 100%) rather than a value such as 40%. You can experiment with this setting to make the most efficient use of resources on a system. |
Pass through |
Set this option to "Yes" to pass the original document to subsequent components in the workflow. You can use conditions in this field (see Conditional fields under Appendices). If you enter an invalid condition in the Pass through box, the option defaults to "Yes".
|
Input files |
Defines the file types that the component will process. Enter a wildcard character and extension (such as *.pdf) to define a file type. Separate entries with a comma (,) or semicolon (;). By default, this box lists the following file types: *.pdf; *.tif; *.tiff; *.jpg; *.jpeg; *.jfif; *.bmp; *.pcx; *.dcx; *.jp2; *.jpc; *.j2c; *.gif; *.png; *.jb2. You can use the following wildcard characters to specify file types:
|
Resolution |
You can use this setting to minimize resource use for very large files.
|
Languages |
Select the language of the text to be recognized from the list. To specify multiple languages, separate the language names with commas. You can use RRTs in this field to define the recognition language at run time. RRTs used in this text box should resolve to internal language names. To view internal language names, expand a language category node in the Select language dialog box and select a language. The internal name appears at the bottom of the dialog box.
|
Recognition mode |
Select the recognition mode, that is, the desired balance between speed and error rate. There are three recognition modes available:
|
Recognition type |
Select the type of text to be recognized. The text type setting influences recognition speed and quality. If the text type is incorrect, the OPOCR engine might recognize images more slowly and less accurately. The following options are available:
|
Output OCR text as |
This group allows you to specify how to output the recognized text. |
File |
Select this check box to save the recognized text as a file. The file is passed to the subsequent components. Specify the file format for the recognition results by typing it or by selecting it from the drop-down list. Possible formats are TXT, CSV, HTML, PDF, PDF/A, PDF (Keep original), PPTX, RTF, DOCX, XLS, XLSX, and OPD. To specify multiple file formats, separate them with a comma (,). You can use RRTs from another component in this box. Specify the parameters of the output file in the Format Settings dialog box (see Format Settings). |
Set up output file |
Click this button to open the Format Settings dialog box. |
Run-time replacement ~FRO::OCRText~ |
Select this check box to save recognized text as the ~FRO::OCRText~ Runtime Replacement Tag. |
Zoned OCR |
Select this check box to use zoned OCR. Recognized fields are output as RRTs, as CSV files, or both. |
Set up zoned OCR |
Click this button to open the Setup Zoned OCR dialog box and configure zoned OCR settings. This button is enabled only if the Zoned OCR check box is selected. You must select at least one of the check boxes in the Output OCR text as group.
|
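
The Input files entry format described above (wildcard patterns separated by commas or semicolons) can be illustrated with a short Python sketch. This is not how the OPOCR component is implemented. The function names `parse_patterns` and `is_accepted` are hypothetical, case-insensitive matching is an assumption, and the wildcard behavior shown is that of Python's `fnmatch` module, which may differ from the wildcard characters the component supports.

```python
import fnmatch
import re

# Default value of the Input files box, as listed in the table above.
DEFAULT_INPUT_FILES = (
    "*.pdf; *.tif; *.tiff; *.jpg; *.jpeg; *.jfif; *.bmp; *.pcx; "
    "*.dcx; *.jp2; *.jpc; *.j2c; *.gif; *.png; *.jb2"
)


def parse_patterns(spec):
    """Split an Input files entry on commas or semicolons (hypothetical helper)."""
    # Surrounding whitespace is trimmed and empty entries are dropped.
    return [part.strip() for part in re.split(r"[;,]", spec) if part.strip()]


def is_accepted(filename, patterns):
    """Return True if the file name matches at least one pattern.

    Assumption: matching ignores case. The source does not say whether
    the component treats *.PDF and *.pdf as the same file type.
    """
    name = filename.lower()
    return any(fnmatch.fnmatchcase(name, pattern.lower()) for pattern in patterns)


if __name__ == "__main__":
    patterns = parse_patterns(DEFAULT_INPUT_FILES)
    for candidate in ("invoice.pdf", "Scan001.TIF", "notes.docx"):
        print(candidate, is_accepted(candidate, patterns))
```

Under these assumptions, a file named `notes.docx` would be skipped by the default list, while `invoice.pdf` would be processed.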