Bounding Boxes

In addition to showing location references in the product, we include them (when available) in the API response payload for workflow run outputs. The format of these references depends on your processor’s configuration type: Citations are used for the recommended JSON Schema configuration, while Bounding Boxes are used for the legacy Fields Array configuration.

Importantly, these references are not the same as traditional OCR, and we do not guarantee that our system can detect and return them for every extracted value. However, we are constantly improving our models and will continue to do so.

Currently, location references are available only for Extract output fields, and only for the following file/document types:

PDF
Images (JPEG, PNG, etc.)

Citation Schema (JSON Schema Config)

This section is relevant for processors using the JSON Schema config type. If you are using the legacy Fields Array config type, please see the Bounding Box Schema documentation. If you aren’t sure which config type you are using, please see the Migrating to JSON Schema documentation.

Citations are a way to reference a specific location in a document where a field is located. Citations are returned in the metadata object for each field. Learn more about Metadata here.

The polygon and referenceText fields are only returned if the includeBoundingBoxCitations option is enabled in the processor config. This can be enabled through the Studio in the Build tab under “Advanced options”.

The shape of a citation, including its polygon, is as follows:

export type Citation = {
  page?: number;
  referenceText?: string | null;
  polygon?: Point[];
};

type Point = {
  x: number;
  y: number;
};
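Because every field on a citation is optional, it can help to filter out citations that cannot be drawn before rendering anything. The following is a minimal sketch, not part of the API: the names `isDrawable`, `groupDrawableByPage`, and `DrawableCitation` are hypothetical, and it assumes you have already collected the citations for a field into an array. It makes no assumption about whether `page` is 0- or 1-indexed.

```typescript
// Types repeated from the schema above so the snippet stands alone.
type Point = { x: number; y: number };
type Citation = {
  page?: number;
  referenceText?: string | null;
  polygon?: Point[];
};

// Hypothetical helper type: a citation with everything needed to draw it.
type DrawableCitation = Citation & { page: number; polygon: Point[] };

// A citation is treated as drawable when it has a page number and a polygon
// with at least three points (the minimum needed to enclose an area).
function isDrawable(citation: Citation): citation is DrawableCitation {
  return (
    typeof citation.page === "number" &&
    Array.isArray(citation.polygon) &&
    citation.polygon.length >= 3
  );
}

// Groups drawable citations by their page value, skipping the rest.
function groupDrawableByPage(
  citations: Citation[]
): Map<number, DrawableCitation[]> {
  const byPage = new Map<number, DrawableCitation[]>();
  for (const citation of citations) {
    if (!isDrawable(citation)) continue;
    const existing = byPage.get(citation.page) ?? [];
    existing.push(citation);
    byPage.set(citation.page, existing);
  }
  return byPage;
}
```

Citations without a polygon (for example, when includeBoundingBoxCitations is not enabled) are simply skipped, so the rendering code only has to handle complete entries.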

How to use citations

How you use the polygon values to place a bounding box on a file depends heavily on the file type. In general, you will need to convert the values we return to the coordinate system your rendering library uses, or to one you define yourself if you are drawing directly over an image, for instance with a native Canvas element.

The values we return in the polygon field represent the coordinates of the bounding box on the image or page. They indicate the points of the polygon that form the bounding box. You may need to convert these values into a format suitable for your rendering library. If your rendering library uses a coordinate system based on percentages of the total image or page size, you would need to perform an additional conversion step.

For PDFs, here is an example of how to apply a transform to the value in order to be rendered using react-pdf-viewer or a similar library:

export type HighlightArea = {
  left: number;
  top: number;
  width: number;
  height: number;
  pageIndex: number;
};

function convertPolygonToHighlightArea({
  polygon,
  pageIndex,
  pageHeight,
  pageWidth,
}: {
  polygon: Point[];
  pageIndex: number; // The index of the page in the document (0 indexed)
  pageHeight: number;
  pageWidth: number;
}): HighlightArea {
  // Collapse the polygon into its axis-aligned bounding box.
  // The minimums start at Infinity and the maximums at -Infinity so that the
  // first point always replaces them; starting from 0 would pin left and top
  // to 0. This assumes the polygon has at least one point.
  const boundingBox = polygon.reduce(
    (acc, curr) => {
      return {
        left: Math.min(acc.left, curr.x),
        top: Math.min(acc.top, curr.y),
        right: Math.max(acc.right, curr.x),
        bottom: Math.max(acc.bottom, curr.y),
      };
    },
    { left: Infinity, top: Infinity, right: -Infinity, bottom: -Infinity }
  );

  // If you are able to access the page height and width of the document, use them to calculate the highlight area
  // as percentages of the page. The +2 and -1 adjustments pad the highlight by one percentage point on each side.
  if (pageHeight && pageWidth) {
    const highlightArea: HighlightArea = {
      height: ((boundingBox.bottom - boundingBox.top) / pageHeight) * 100 + 2,
      width: ((boundingBox.right - boundingBox.left) / pageWidth) * 100 + 2,
      top: (boundingBox.top / pageHeight) * 100 - 1,
      left: (boundingBox.left / pageWidth) * 100 - 1,
      pageIndex,
    };
    return highlightArea;
  }

  const DPI = 72; // Default DPI - calibrate this based on your product

  // Otherwise you can default to a conversion based on the zoom level of the document
  const highlightArea: HighlightArea = {
    left: boundingBox.left * 9,
    top: boundingBox.top * 9,
    width: ((boundingBox.right - boundingBox.left) * DPI) / 8,
    height: ((boundingBox.bottom - boundingBox.top) * DPI) / 8,
    pageIndex,
  };
  return highlightArea;
}
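For images, the native Canvas approach mentioned above might look roughly like the sketch below. It is an illustration under stated assumptions rather than a definitive implementation: it assumes the polygon coordinates use the same units as the source image dimensions you pass in, with the origin at the top-left. The names `scalePolygon`, `strokePolygon`, and `PathContext` are hypothetical. `PathContext` lists only the path methods the sketch needs; a browser `CanvasRenderingContext2D` provides all of them.

```typescript
// Point mirrors the polygon point type from the Citation schema.
type Point = { x: number; y: number };

// Minimal subset of CanvasRenderingContext2D used by this sketch, declared
// here so the code also type-checks outside a browser.
type PathContext = {
  beginPath(): void;
  moveTo(x: number, y: number): void;
  lineTo(x: number, y: number): void;
  closePath(): void;
  stroke(): void;
};

type Size = { width: number; height: number };

// Scales polygon points from the source image's coordinate space to the
// canvas's pixel space. Assumes both share a top-left origin.
function scalePolygon(polygon: Point[], source: Size, target: Size): Point[] {
  const scaleX = target.width / source.width;
  const scaleY = target.height / source.height;
  return polygon.map(({ x, y }) => ({ x: x * scaleX, y: y * scaleY }));
}

// Traces the polygon as a closed path and strokes its outline.
function strokePolygon(ctx: PathContext, points: Point[]): void {
  const [first, ...rest] = points;
  if (!first || rest.length < 2) return; // fewer than three points: nothing to draw
  ctx.beginPath();
  ctx.moveTo(first.x, first.y);
  for (const point of rest) {
    ctx.lineTo(point.x, point.y);
  }
  ctx.closePath();
  ctx.stroke();
}
```

In a browser you would pass the result of `canvas.getContext("2d")` as `ctx`, after drawing the image onto the same canvas and setting `strokeStyle` and `lineWidth` to taste.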

Bounding Box Schema (Fields Array Config)

This section is relevant for processors using the Fields Array config type. If you are using the JSON Schema config type, please see the Citation Schema documentation. If you aren’t sure which config type you are using, please see the Migrating to JSON Schema documentation.

By default, bounding boxes are produced by heuristic-based matching, which supports only the following field types:

date fields
string fields
signature fields
array fields (on nested string fields)
object fields (on nested string fields)

If you have selected “Bounding box citations” in the extraction settings in the Extend Studio, or the includeBoundingBoxCitations option is set to true in the processor config, bounding boxes are also available for the additional field types listed below. These bounding boxes will have significantly higher coverage.

enum fields
number fields
boolean fields
null fields - If a field is declaratively null (e.g. an empty form input) this will be returned as a bounding box reference. If there is no declarative indication of null, bounding boxes will not be returned.

The shape of the bounding box reference is as follows:

type BoundingBox = {
  /**
   * The left most position of the bounding box
   */
  left: number;
  /**
   * The top most position of the bounding box
   */
  top: number;
  /**
   * The right most position of the bounding box
   */
  right: number;
  /**
   * The bottom most position of the bounding box
   */
  bottom: number;
};

How to use bounding boxes

The values we return in the BoundingBox type represent the coordinates of the bounding box on the image or page. They indicate the top, bottom, left, and right extremities of the box. You may need to convert these values into a format suitable for your rendering library. If your rendering library uses a coordinate system based on percentages of the total image or page size, you would need to perform an additional conversion step.

For PDFs, here is an example of how to apply a transform to the value in order to be rendered using react-pdf-viewer or a similar library:

export type HighlightArea = {
  left: number;
  top: number;
  width: number;
  height: number;
  pageIndex: number;
};

function convertBoundingBoxToHighlightArea({
  boundingBox,
  pageIndex,
  pageHeight,
  pageWidth,
}: {
  boundingBox: BoundingBox;
  pageIndex: number; // The index of the page in the document (0 indexed)
  pageHeight: number;
  pageWidth: number;
}): HighlightArea {
  // If you are able to access the page height and width of the document, use them to calculate the highlight area
  if (pageHeight && pageWidth) {
    const highlightArea: HighlightArea = {
      height: ((boundingBox.bottom - boundingBox.top) / pageHeight) * 100 + 2,
      left: (boundingBox.left / pageWidth) * 100 - 1,
      width: ((boundingBox.right - boundingBox.left) / pageWidth) * 100 + 2,
      top: (boundingBox.top / pageHeight) * 100 - 1,
      pageIndex,
    };
    return highlightArea;
  }

  const DPI = 72; // Default DPI - calibrate this based on your product

  // Otherwise you can default to a conversion based on the zoom level of the document
  const highlightArea: HighlightArea = {
    left: boundingBox.left * 9,
    top: boundingBox.top * 9,
    width: ((boundingBox.right - boundingBox.left) * DPI) / 8,
    height: ((boundingBox.bottom - boundingBox.top) * DPI) / 8,
    pageIndex,
  };
  return highlightArea;
}
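If you are not using a PDF viewer library, the same percentage conversion can drive an absolutely positioned overlay element placed on top of a rendered page or image. The sketch below is one possible approach, not an official helper: `toOverlayStyle` and `OverlayStyle` are hypothetical names, and it assumes the bounding box values use the same units as the page dimensions you supply, with the origin at the top-left. Unlike the example above, it adds no padding.

```typescript
// BoundingBox repeated from the schema above so the snippet stands alone.
type BoundingBox = { left: number; top: number; right: number; bottom: number };

// CSS values for an absolutely positioned overlay inside a container that
// has the same aspect ratio as the page or image.
type OverlayStyle = { left: string; top: string; width: string; height: string };

function toOverlayStyle(
  box: BoundingBox,
  page: { width: number; height: number }
): OverlayStyle {
  // Multiply before dividing to limit floating-point rounding in the output.
  const percent = (value: number, total: number): string =>
    `${(value * 100) / total}%`;
  return {
    left: percent(box.left, page.width),
    top: percent(box.top, page.height),
    width: percent(box.right - box.left, page.width),
    height: percent(box.bottom - box.top, page.height),
  };
}

const style = toOverlayStyle(
  { left: 100, top: 200, right: 300, bottom: 250 },
  { width: 1000, height: 2000 }
);
// style is { left: "10%", top: "10%", width: "20%", height: "2.5%" }
```

Because the values are percentages, the overlay stays aligned when the container is resized or zoomed, as long as its aspect ratio matches the page.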