Bounding Boxes & Citations
Extend provides references to locate extracted data within your documents. The specific format and availability depend on your processor’s configuration type: JSON Schema (recommended) or Fields Array (legacy).
While traditional OCR products often include bounding boxes, Extend uses a mix of multimodal large language models and traditional vision models. Because of this mix, references cannot always be produced, and coverage is not guaranteed for every field, even when the feature is enabled. However, we are always working to improve coverage.
These references are currently only available for Extract output fields and are supported for the following file/document types:
- PDF
- IMG (jpeg, png, etc.)
Citations (JSON Schema Config)
This section is relevant for processors using the JSON Schema config type. If you are using the legacy Fields Array config type, please see the Bounding Boxes (Legacy Fields Array Config) section. If you aren’t sure which config type you are using, please see the Migrating to JSON Schema documentation.
For processors configured with JSON Schema, Extend uses Citations. Citations provide a polygon reference to a specific location in the document.
Key Points:
- Availability: Citations are returned in the metadata object for each field only if the includeBoundingBoxCitations option is enabled in the processor config. You can enable this in the Studio via the Build tab under “Advanced options”.
- Field Type Coverage: When enabled, Citations can potentially be returned for all field types.
- Schema: Citations use a polygon structure representing points on the page. For detailed schema information and usage examples, see the API Reference.
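As a rough illustration of the points above, the sketch below reads Citations from a metadata object and reduces each polygon to an axis-aligned bounding box. The payload shape is an assumption made for this example, not the documented schema: it supposes that metadata is keyed by field name, that each entry may carry a "citations" list, and that each citation has a "page" number and a "polygon" given as a list of {"x", "y"} points. The helper names polygon_to_bbox and citation_boxes are hypothetical. Check the API Reference for the actual structure before relying on any of these names.

```python
from typing import Any

Box = tuple[float, float, float, float]  # (left, top, right, bottom)


def polygon_to_bbox(polygon: list[dict[str, float]]) -> Box:
    """Return the axis-aligned box that encloses every point of the polygon."""
    xs = [point["x"] for point in polygon]
    ys = [point["y"] for point in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def citation_boxes(metadata: dict[str, Any]) -> dict[str, list[Box]]:
    """Map each field name to the bounding boxes of its citations.

    Fields without citations map to an empty list, since coverage
    is not guaranteed for every field.
    """
    boxes: dict[str, list[Box]] = {}
    for field_name, field_meta in metadata.items():
        citations = (field_meta or {}).get("citations") or []
        boxes[field_name] = [
            polygon_to_bbox(citation["polygon"])
            for citation in citations
            if citation.get("polygon")
        ]
    return boxes


# Assumed example payload; the key names and coordinates are illustrative only.
sample_metadata = {
    "invoice_number": {
        "citations": [
            {
                "page": 1,
                "polygon": [
                    {"x": 100.0, "y": 50.0},
                    {"x": 220.0, "y": 52.0},
                    {"x": 219.0, "y": 70.0},
                    {"x": 99.0, "y": 68.0},
                ],
            }
        ]
    },
    "total": {},  # no citation returned for this field
}

print(citation_boxes(sample_metadata))
# {'invoice_number': [(99.0, 50.0, 220.0, 70.0)], 'total': []}
```

Treating a missing or empty citation list as a normal case, rather than an error, matches the note above that references may not be returned for every field even when the option is enabled.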