OCR Smart Zone Configuration

A Smart Zone uses OCR to search for information based on a common anchor point across documents of varying format. For example, users may have a mixture of invoices in varying formats. Each invoice carries a label called "Invoice Number", but the placement of the invoice number itself varies from form to form: it could be next to or underneath the label, and the label may even sit in a different location on each document (left side or right side). Instead of defining multiple zones or multiple zone definition profiles, users can use one OCR Smart Zone. It finds the common anchor point (that is, the label "Invoice Number"), and then searches in predefined areas around the anchor to find the actual value needed (that is, the invoice number itself).
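The anchor-then-search idea can be sketched in a few lines of Python. This is only a conceptual model, not the product's implementation or API: the Fragment and ChildZone types, the coordinates, and the function names are all hypothetical, and it assumes OCR has already produced text fragments with bounding boxes.

```python
import re
from dataclasses import dataclass

@dataclass
class Fragment:
    """One piece of OCR text with its bounding box (hypothetical model)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

@dataclass
class ChildZone:
    """A search area positioned relative to the anchor's top-left corner."""
    dx: float
    dy: float
    w: float
    h: float

def find_anchor(fragments, pattern):
    """Return the first fragment whose whole text matches the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    for frag in fragments:
        if rx.fullmatch(frag.text):
            return frag
    return None

def text_in_zone(fragments, anchor, zone):
    """Join the text of all fragments lying fully inside the child zone."""
    left, top = anchor.x + zone.dx, anchor.y + zone.dy
    hits = [
        f for f in fragments
        if f is not anchor
        and left <= f.x and f.x + f.w <= left + zone.w
        and top <= f.y and f.y + f.h <= top + zone.h
    ]
    return " ".join(f.text for f in sorted(hits, key=lambda f: (f.y, f.x)))

def read_smart_zone(fragments, anchor_pattern, child_zones):
    """Find the anchor, then return the first non-empty child zone value."""
    anchor = find_anchor(fragments, anchor_pattern)
    if anchor is None:
        return None
    for zone in child_zones:
        value = text_in_zone(fragments, anchor, zone)
        if value:
            return value
    return None

ANCHOR = r"invoice\s+number:?"
CHILD_ZONES = [
    ChildZone(dx=125, dy=-5, w=200, h=25),  # to the right of the label
    ChildZone(dx=-5, dy=18, w=200, h=25),   # directly below the label
]

# Layout A: label on the right side of the page, value beside it.
layout_a = [
    Fragment("Invoice Number", 400, 50, 120, 15),
    Fragment("INV-1001", 530, 50, 70, 15),
]
# Layout B: label on the left side of the page, value underneath it.
layout_b = [
    Fragment("INVOICE NUMBER:", 40, 50, 120, 15),
    Fragment("INV-2002", 40, 70, 70, 15),
]

print(read_smart_zone(layout_a, ANCHOR, CHILD_ZONES))
print(read_smart_zone(layout_b, ANCHOR, CHILD_ZONES))
```

Because the child zones are defined relative to the anchor rather than to the page, the same definition handles both layouts even though the label sits in a different place on each document.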

This topic describes the configuration of basic OCR Smart Zones, grouped OCR Smart Zones (which can populate multiple index fields), and Multi-Record OCR Smart Zones (which are useful for extracting oddly formatted tables).

In addition to creating a new OCR Smart Zone, the user can create OCR Smart Zone profiles and associate them with a capture profile or page that needs its own OCR Smart Zone configuration, whether a single configuration or several. Because a profile can be created, edited, or copied, the user has more flexibility. This option can be configured while creating an OCR Smart Zone or in Global Lists: User-defined form fields.

Basic OCR Smart Zone

In this example, we create an OCR Smart Zone to locate the invoice number and store it in an index field.

  1. Create an index field called "Invoice Number" in any of the index field tables.
  2. Click the Define Zones button in the Zone tab.
  3. Click the Select Template Image button in the ribbon menu and load a document into the viewer.
  4. Draw an OCR Smart Zone over a large area of the image where the invoice number label is sure to be.

    To draw an OCR Smart Zone on the image, click the Draw OCR Smart Zone toolbar icon (or press Ctrl + 3). Alternatively, you can select an OCR Smart Zone profile that already exists by clicking Select Profile and selecting an existing profile.

    The OCR Smart Zone Configuration screen appears.

  5. Click Add Anchor under Anchor expressions to create an OCR Smart Zone anchor expression. Enter "INVOICE NUMBER" as the Expression to locate that phrase on the document page.

    You can also use the built-in regular expression builder by clicking the Build a regular expression by selecting input text button and highlighting one or more words to automatically build a regular expression to match them.

  6. Click Locate Anchors to locate the anchor on the page.
  7. Select a Type for the anchor expression.
    • Custom: You can manually enter any expression or use one of the Regex buttons to create the expression.
    • System: Select from the list of system form fields:
      • Invoice Number
      • Invoice Date
      • Invoice Total
      • Purchase Order Number
      • Vendor Name
      • Sales Order Number
      • RMA Number
      • Sales Order Date
    • User Defined: Select from the list of user-defined form fields.

  8. Define child zones for each area by clicking the Add Zone button under Child zones.

    In this example, we have two layouts for the invoice number: one is directly below the anchor, while the other is to the right side of the anchor.

  9. Click the Save Smart Zone button to save the changes and return to the Zone Configuration screen.
  10. Enter a name for the new OCR Smart Zone ("InvoiceNumberZone"), then click the Save Zone Settings button.
  11. Select OCR as the zone action and assign the zone to the index field in General > Zone. Also, select the check box labeled Do not run Zone OCR if the index field is already populated so that processing of the child zones stops once a value has been found.

When you run a batch, the OCR Smart Zone picks up the invoice number from either of the two child positions defined.

(Screenshots: the invoice number captured beside the label, and under the label.)
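Step 5 mentions building a regular expression from highlighted words. How the built-in builder generalizes the selected text is not described in this topic, so the following Python sketch makes the simplest possible assumption: it escapes each highlighted word literally and allows any amount of whitespace between words. The function name build_anchor_regex is hypothetical.

```python
import re

def build_anchor_regex(selected_words):
    """Build a pattern that matches the selected words in order.

    Assumption: words are matched literally, separated by one or more
    whitespace characters. The real builder may generalize differently.
    """
    escaped = [re.escape(word) for word in selected_words]
    return r"\s+".join(escaped)

pattern = build_anchor_regex(["INVOICE", "NUMBER"])

# OCR output often varies in case and spacing, so match case-insensitively.
match = re.search(pattern, "Invoice   Number: 4471", re.IGNORECASE)
print(pattern)
print(match is not None)
```

An expression of this shape can be entered as a Custom anchor expression. Escaping matters when a label contains punctuation such as "P.O. No.", where an unescaped period would match any character.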

Grouped Smart Zones

The basic Smart Zone can only populate one index field. However, there are often multiple values users wish to capture into different index fields that are all anchored by a common anchor point. In these cases, use a grouped Smart Zone. A grouped Smart Zone has group names associated with each child zone. Users can then assign individual child zone groups directly to an index field.

In the example below, we have one anchor that looks for the text "Invoice". This anchor has four child zones. From this single anchor, we can locate and capture the invoice number, invoice date, company, and phone number.

Back on the Zone tab, you can now select each of the individual child zone groups for assignment to an index field:

In the example above, the "Invoice Number" index field is populated by the Invoice Data:Invoice child zone. Invoice Date receives its value from the Invoice Data:Date child zone, Phone from Invoice Data:Phone, and Company Name from Invoice Data:Company.
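The assignment of child zone groups to index fields amounts to a lookup table. The Python sketch below models it with plain dictionaries. The "Group:Child" names come from the example above, while the captured values and the function name are made up for illustration and do not reflect how the product stores its configuration.

```python
# Values read by each child zone, keyed as "Group:Child" (sample data).
captured = {
    "Invoice Data:Invoice": "INV-1001",
    "Invoice Data:Date": "2024-03-15",
    "Invoice Data:Phone": "555-0100",
    "Invoice Data:Company": "Example Supply Co.",
}

# Which child zone group feeds which index field.
assignments = {
    "Invoice Number": "Invoice Data:Invoice",
    "Invoice Date": "Invoice Data:Date",
    "Phone": "Invoice Data:Phone",
    "Company Name": "Invoice Data:Company",
}

def populate_index_fields(assignments, captured):
    """Fill each index field from its assigned child zone group.

    A field whose child zone found nothing is left empty.
    """
    return {field: captured.get(zone, "") for field, zone in assignments.items()}

index_fields = populate_index_fields(assignments, captured)
for field, value in index_fields.items():
    print(f"{field}: {value}")
```

One anchor search therefore serves all four index fields, which is the advantage of a grouped Smart Zone over four separate basic ones.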

Multi-Record Smart Zones

The multi-record OCR Smart Zone is used for table extraction when a table is oddly formatted. For example, if a page is made up of multiple tables, it can be difficult to set up a set of standard, multi-record zones to extract the line items correctly. Instead, create a single OCR Smart Zone that covers just one column, enter a regular expression to detect the values of that column, and then set up grouped child zones to extract the details of each line item.

In the example below, the OCR Smart Zone anchors on the numeric values in the quantity column. Whenever one is found, the child zones gather the data from that line item:

  1. To generate the multi-record Smart Zone, select the check box labeled Create a record for each smart zone anchor found on the page in the Options group.

    The OCR Smart Zone configuration screen accepts a hierarchy level number. These level numbers define the parent/child relationships that exist between multiple Smart Zones that are used for table extraction.

  2. After saving the zone configuration, select the child zones and assign them to each individual index field.
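The one-record-per-anchor behavior can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: OCR fragments carry only a top-left position, the quantity column is an x-range, each child zone is an x-offset range on the anchor's row, and hierarchy levels are not modeled. All names, coordinates, and sample values are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Fragment:
    """One piece of OCR text with its top-left position (hypothetical model)."""
    text: str
    x: float
    y: float

QUANTITY = re.compile(r"\d+")  # anchor expression: a whole number

def extract_records(fragments, qty_column, child_offsets, row_tolerance=5.0):
    """Create one record for every quantity anchor found in the column.

    qty_column:    (left, right) x-range covered by the Smart Zone.
    child_offsets: {group name: (from, to)} x-offsets relative to the anchor.
    """
    anchors = [
        f for f in fragments
        if qty_column[0] <= f.x < qty_column[1] and QUANTITY.fullmatch(f.text)
    ]
    records = []
    for anchor in sorted(anchors, key=lambda f: f.y):
        record = {"Quantity": anchor.text}
        for name, (lo, hi) in child_offsets.items():
            hits = [
                f for f in fragments
                if anchor.x + lo <= f.x < anchor.x + hi
                and abs(f.y - anchor.y) <= row_tolerance
            ]
            record[name] = " ".join(f.text for f in sorted(hits, key=lambda f: f.x))
        records.append(record)
    return records

page = [
    Fragment("Qty", 50, 100), Fragment("Description", 120, 100), Fragment("Price", 400, 100),
    Fragment("2", 50, 130), Fragment("Widget", 120, 130), Fragment("9.50", 400, 131),
    Fragment("Ships separately", 120, 150),  # no quantity on this row, so no record
    Fragment("10", 50, 170), Fragment("Gasket", 120, 170), Fragment("set", 180, 170),
    Fragment("1.25", 400, 169),
]

records = extract_records(
    page,
    qty_column=(40, 100),
    child_offsets={"Description": (60, 340), "Price": (340, 420)},
)
for record in records:
    print(record)
```

The header row and the note row produce no records because neither has text in the quantity column that matches the anchor expression, which is why anchoring on a single well-behaved column copes with oddly formatted tables.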