Black and white conversion
When training documents are added to a project for extraction or classification training, their source files can be large because of high resolution, color, or file format. This can result in large training sets. Similarly, attached test sets can contain large files, resulting in large document sets.
To avoid this, you can convert your test and training documents to black and white. Conversion typically reduces the size of a document set.
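To give a rough sense of why a bitonal image is smaller, the following minimal sketch compares the uncompressed size of one page at 24-bit color and at 1 bit per pixel. The page dimensions (an A4 page at 300 dpi) and the function name `raw_page_bytes` are assumptions chosen for illustration. Real files are compressed, so the actual savings differ.

```python
def raw_page_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Uncompressed size of one page image, rounding each row up to whole bytes."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed example: an A4 page scanned at 300 dpi is roughly 2480 x 3508 pixels.
WIDTH, HEIGHT = 2480, 3508

color_bytes = raw_page_bytes(WIDTH, HEIGHT, 24)   # 24-bit RGB color
bitonal_bytes = raw_page_bytes(WIDTH, HEIGHT, 1)  # 1 bit per pixel (black or white)

print(color_bytes, bitonal_bytes, color_bytes // bitonal_bytes)
```

Before compression, the bitonal page needs 1/24 of the raw pixel data of the color page. This ratio describes raw pixel data only and is not a prediction for any particular document set.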
Conversion is restricted to test and training sets. You cannot convert a benchmark document set.
For the best results, train your project before converting the training documents to bitonal format. This ensures that any quality lost during conversion does not negatively affect the training results.
Similarly, ensure that all configuration and testing are complete before converting any test sets. This ensures that you are using the best quality documents to configure and test your extraction results.
When you convert documents, the following applies:
- Once the conversion is complete, it cannot be undone.
- A small reduction in quality occurs during conversion.
- PDFs are converted to bitonal black and white TIFFs.
- You cannot convert documents in benchmark document sets.
- You cannot convert documents in a protected project.
- You cannot convert .txt or protected documents. If one or more of these documents are selected, those documents are skipped and a message is displayed indicating the number of skipped files.
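The skip rule for .txt and protected documents in the list above can be sketched as a simple filter. This is only an illustration of the described behavior, not the product's implementation. The `Document` record, the `split_for_conversion` function, the file names, and the message wording are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass
class Document:
    # Hypothetical record; the product's real document model is not shown here.
    path: str
    protected: bool = False


def split_for_conversion(docs):
    """Return (convertible, skipped), skipping .txt files and protected documents."""
    convertible, skipped = [], []
    for doc in docs:
        is_text = PurePath(doc.path).suffix.lower() == ".txt"
        if is_text or doc.protected:
            skipped.append(doc)
        else:
            convertible.append(doc)
    return convertible, skipped


# Hypothetical selection of documents in a training set.
selection = [
    Document("invoice_001.pdf"),
    Document("notes.txt"),
    Document("contract.tif", protected=True),
    Document("scan_17.PDF"),
]

convertible, skipped = split_for_conversion(selection)
message = f"{len(skipped)} document(s) skipped."  # assumed wording
print(message)
```

In this sketch, the two PDFs remain eligible for conversion, while the text file and the protected document are counted as skipped.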
A PDF with many pages that contains only a text layer, or a small PDF compressed with a more advanced compression method, may increase in size when converted to a black and white TIFF.