General
The options available on the General tab are outlined below.
OCR engine options
-
OCR engine: Select which engine to use for recognition. Both engines can recognize hand-printed and machine-printed text, but the RecoStar engine is deprecated.
-
Enable image pre-processing: If image processing was not performed in an earlier step, this option lets you apply operations such as deskew, despeckle, resize, smoothing, and auto-orient during OCR. Configure these operations in Image Processing. PSIcapture runs every item configured in that section when it runs OCR for this capture profile.
-
Enable auto rotation: If pages in a document are upside down or otherwise incorrectly oriented, and were not corrected in a previous step, this setting rotates them automatically. Correcting the orientation can improve OCR accuracy.
-
Enable spell check: This feature uses the surrounding characters as context to spell-check the recognized text and attempts to match each questionable result to a corresponding, correct word.
This feature is called "Enable logical context corrections" when the deprecated RecoStar OCR engine is selected.
Output options
-
Output type: Select which document type to use when outputting captured images.
-
Adobe PDF (Image Only): Converts TIFF images to PDF without performing OCR.
-
Adobe PDF (Image with Hidden Text): Performs OCR on the document and includes the data as hidden text within the PDF.
-
Text: Performs OCR on the document and outputs only a text file with the OCR results.
-
XML: Performs OCR on the document and outputs only an XML file with the OCR results.
-
OCR file tag: Enter a tag to associate with the output document.
-
Output OCR as single page: Selecting this option produces each image as a single-page PDF.
By default, PSIcapture outputs a multi-page file.
-
Include Folder separators in output: Select this option to include the folder separator page in the output file. If this option is not selected, the folder separator page is still available in the QA or Index module but is omitted from the final output file.
-
Include Document separators in output: Use this option to include or exclude the document separator page in the output file.
-
Do not output items marked with Skip flag: Any page, document, or folder tagged with a Skip flag is excluded from the output file.
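The combined effect of the separator, Skip flag, and single-page options can be sketched in a few lines of Python. This is only an illustration of the selection logic described above, not PSIcapture's implementation or API: the `Page` class, its fields, and both function names are hypothetical, and the Skip flag is modeled at page level only for brevity.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one captured image in a batch."""
    name: str
    kind: str = "content"  # "content", "folder_separator", or "document_separator"
    skip: bool = False     # models the Skip flag


def pages_for_output(pages, include_folder_separators,
                     include_document_separators, omit_skipped):
    """Return the pages that would reach the output file under the given options."""
    kept = []
    for page in pages:
        if page.kind == "folder_separator" and not include_folder_separators:
            continue  # still visible in QA/Index, but not written to the output
        if page.kind == "document_separator" and not include_document_separators:
            continue
        if omit_skipped and page.skip:
            continue
        kept.append(page)
    return kept


def group_into_files(pages, single_page):
    """Group pages into output files: one file per page, or one multi-page file."""
    if single_page:
        return [[page] for page in pages]
    return [list(pages)] if pages else []
```

With folder separators excluded, document separators included, and skipped items omitted, a batch containing one of each separator, one normal page, and one skipped page would keep only the document separator and the normal page.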
Secondary output options
Enable secondary output: Select this box to produce an additional document type when outputting.
-
Secondary output type: Select which additional document type to use when outputting captured images.
-
Adobe PDF (Image Only): Converts TIFF images to PDF without performing OCR.
-
Adobe PDF (Image with Hidden Text): Performs OCR on the document and includes the data as hidden text within the PDF.
-
Text: Performs OCR on the document and outputs only a text file with the OCR results.
-
XML: Performs OCR on the document and outputs only an XML file with the OCR results.
-
Secondary OCR file tag: Enter an additional tag to associate with the additional output document.
User dictionary
Enable checking using user dictionary: Select this box and click Setup to enter any words to be used in the user dictionary. This function may be helpful when performing OCR on specialized documents, such as medical documents.
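To show why a user dictionary can help with specialized vocabulary, here is a minimal Python sketch of dictionary-assisted correction. It is a general illustration and makes no claim about how PSIcapture or its OCR engines apply the dictionary. The function name, the `cutoff` value, and the sample words are assumptions chosen for the example; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib


def correct_with_user_dictionary(words, user_dictionary, cutoff=0.8):
    """Replace OCR words with a close user-dictionary entry when one exists.

    Illustrative only: the similarity threshold is an assumption,
    not a PSIcapture setting.
    """
    # Map lowercase forms back to the spelling entered in the dictionary.
    lookup = {entry.lower(): entry for entry in user_dictionary}
    corrected = []
    for word in words:
        if word.lower() in lookup:
            corrected.append(word)  # already a known term
            continue
        matches = difflib.get_close_matches(
            word.lower(), list(lookup), n=1, cutoff=cutoff
        )
        # Keep the original word when nothing in the dictionary is close enough.
        corrected.append(lookup[matches[0]] if matches else word)
    return corrected
```

For example, with a dictionary containing "acetaminophen", a misread such as "acetaminophcn" is close enough to be replaced, while an unrelated word such as "tablet" is left unchanged.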