Tables and Key Value Pairs for Document Annotation

How to turn unstructured text into structured data with V7

V7 has a suite of tools that help you turn unstructured data in PDFs into structured data for downstream AI projects. This document explains how to use the Text Scanner models, Table Tool, and Graph Tool to create higher-quality data faster.

Document Annotation with a Model in the Loop

V7 has three built-in Text Scanner models that you can add to your document annotation workflows.

Bringing a Text Scanner model into the loop for your PDF annotations can speed up turning unstructured data into structured data.

  1. In the dataset workflow panel, drag in an AI Model Stage
  2. Connect the Text Scanner model of your choosing
  3. Map classes to the model (note: this requires bounding box or polygon classes with the Text subtype enabled)
  4. Configure the rest of your workflow, and hit 'Save & Apply'
  5. Send your data to the model from the main datasets page

Extracting Text Data into Tables

The V7 Table Tool allows you to extract bounding boxes of text into structured tables.

To use the Table Tool you must create the following classes:

  • The table class must be of type table; it does not require a specific name
  • The text class must be a bounding box or polygon with the "text" subtype enabled
  • The cell class must be of type string and must be named 'cell'

After you've set this up, you can put bounding boxes of text straight into tables by following these steps:

  1. Select the Table Tool
  2. Configure the table layout in the top panel (i.e. number of rows, and number of columns)
  3. Select the area that contains the data for the table
  4. Adjust the rows and columns as required to capture the data
  5. Press 'enter' and assign the correct 'Table' class

You now have a table containing the text annotations!
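To make the idea of placing text boxes into table cells concrete, here is a minimal Python sketch. It is not V7's implementation or export format: the TextBox type, the boxes_to_table function, and the evenly spaced grid are all assumptions made for illustration. Each text box is assigned to the cell that contains its centre point.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical text annotation: a string plus its bounding box."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def boxes_to_table(boxes, table_x, table_y, table_w, table_h, n_rows, n_cols):
    """Assign each text box to the grid cell that contains its centre.

    Assumes an evenly spaced grid; a real table may have rows and
    columns of different sizes.
    """
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    cell_w = table_w / n_cols
    cell_h = table_h / n_rows
    # Process boxes top-to-bottom, left-to-right so text in a shared cell reads in order.
    for box in sorted(boxes, key=lambda b: (b.y, b.x)):
        cx = box.x + box.w / 2
        cy = box.y + box.h / 2
        inside_x = table_x <= cx < table_x + table_w
        inside_y = table_y <= cy < table_y + table_h
        if not (inside_x and inside_y):
            continue  # the box lies outside the selected table area
        row = min(int((cy - table_y) // cell_h), n_rows - 1)
        col = min(int((cx - table_x) // cell_w), n_cols - 1)
        grid[row][col] = (grid[row][col] + " " + box.text).strip()
    return grid


example_boxes = [
    TextBox("Name", 10, 10, 40, 10),
    TextBox("Qty", 110, 10, 30, 10),
    TextBox("Widget", 10, 60, 50, 10),
    TextBox("3", 110, 60, 10, 10),
    TextBox("Footer", 10, 300, 60, 10),  # outside the table area, ignored
]
example_table = boxes_to_table(example_boxes, 0, 0, 200, 100, n_rows=2, n_cols=2)
```

Using the centre point rather than full containment means a box that slightly overlaps a cell border still lands in a single cell, which mirrors why step 4 asks you to adjust the rows and columns until they capture the data.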

Mapping Key Value Pairs

The V7 Graph Tool allows you to create mappings between two bounding boxes containing key value pairs (KVPs).

To use the Graph Tool to create KVPs you must create the following classes:

  • The text class must be a bounding box or polygon with the "text" subtype enabled
  • The KVP class must be of type graph; it does not require a specific name
  • The key class must be of type string and must be named 'key'
  • The value class must be of type string and must be named 'value'

After you've set this up, you can map key value pairs using the graph tool by following these steps:

  1. Select the Graph Tool
  2. Highlight the Key
  3. Highlight the Value

This creates a pairing between the key and the value, shown in the UI as a link between the two annotations.
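As a rough illustration of how linked key and value annotations can become structured data downstream, the Python sketch below turns a small graph into a dictionary. The input shapes (texts, roles, edges) and the kvp_from_graph function are hypothetical and do not reflect V7's actual export format; check your own export before adapting it.

```python
def kvp_from_graph(texts, roles, edges):
    """Build a {key text: value text} mapping from graph links.

    texts: annotation id -> extracted text (hypothetical shape)
    roles: annotation id -> "key" or "value" (hypothetical shape)
    edges: list of (id, id) links between annotations
    """
    pairs = {}
    for a, b in edges:
        # Accept the link in either direction, as long as it joins a key to a value.
        if roles.get(a) == "key" and roles.get(b) == "value":
            key_id, value_id = a, b
        elif roles.get(a) == "value" and roles.get(b) == "key":
            key_id, value_id = b, a
        else:
            continue  # skip links that do not connect a key to a value
        pairs[texts[key_id]] = texts[value_id]
    return pairs


example_texts = {"a1": "Invoice No.", "a2": "INV-0042", "a3": "Date", "a4": "2023-01-15"}
example_roles = {"a1": "key", "a2": "value", "a3": "key", "a4": "value"}
example_edges = [("a1", "a2"), ("a4", "a3")]
example_pairs = kvp_from_graph(example_texts, example_roles, example_edges)
```

A plain dictionary assumes each key appears once per document; if keys can repeat, collecting the values into a list per key would be a safer choice.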

Getting Started with Document Processing

The text annotation tools are available on the Pro tier only. If you are interested in using these features, please get in touch with your Customer Success Manager (CSM) or Account Manager (AM).