Processor output types
Document processor outputs follow standardized formats based on the processor type. Understanding these formats is essential when working with evaluation sets, webhooks, and API responses.
Extraction output type (Fields Array)
This section is relevant for the Fields Array config type. If you are using the JSON Schema config type, please see the Extraction output type (JSON Schema) documentation. If you aren’t sure which config type you are using, please see the Migrating to JSON Schema documentation.
For processors using the legacy Fields Array configuration, the extraction output is a flat dictionary where each key is the fieldName (or sometimes the id if names aren’t unique) you defined in the configuration, and the value is an ExtractionFieldResult object containing the extracted data and associated details.
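As a minimal sketch of consuming this shape, assume the output has already been parsed into a Python dictionary. The field names (invoice_number, total), the id strings, the type strings, and the sample values below are hypothetical; only the flat key-to-result layout and the id, type, and value keys come from this section.

```python
# Hypothetical parsed extraction output: a flat dict keyed by fieldName.
# All names and values here are illustrative, not taken from a real processor.
output = {
    "invoice_number": {"id": "field_1", "type": "string", "value": "INV-1001"},
    "total": {"id": "field_2", "type": "number", "value": 125.5},
}


def field_values(extraction_output):
    """Map each field key to its extracted value, dropping the other details."""
    return {key: result.get("value") for key, result in extraction_output.items()}


values = field_values(output)
print(values)
```

Because the dictionary is flat, a single pass over its items is enough to reach every top-level field; nested fields would need additional handling that this sketch does not attempt.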
Type definition
Each ExtractionFieldResult object contains the core id, type, and extracted value. It can also include the following optional details:
schema: The schema definition for nested fields (like objects or array items).
insights: Reasoning or explanations from the model (if enabled).
references: Location information, including the page number and specific Bounding Boxes relevant to the legacy Fields Array configuration (see Bounding Boxes Guide).
enum: The available options if the field type is enum.
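The description above can be sketched as a Python TypedDict. This is an approximation under stated assumptions, not the official type definition: the concrete types of id and type are assumed to be strings, and the optional details are typed as Any because their exact shapes are not given here. The helper missing_core_keys is a hypothetical name introduced for illustration.

```python
from typing import Any, TypedDict


class _CoreFieldResult(TypedDict):
    # Core keys named in the text; the str types are an assumption.
    id: str
    type: str
    value: Any


class ExtractionFieldResult(_CoreFieldResult, total=False):
    # Optional details; shapes are left as Any because they are not specified here.
    schema: Any
    insights: Any
    references: Any
    enum: Any


CORE_KEYS = ("id", "type", "value")


def missing_core_keys(result):
    """Return the core keys that are absent from a field result dict."""
    return [key for key in CORE_KEYS if key not in result]
```

Splitting the definition into a required base and an optional subclass keeps the three core keys mandatory for a type checker while leaving the remaining details optional.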
References
Examples
Basic Field Types
Nested Structures with References and Insights
Shared Types
Certain types are shared across different processor outputs. These provide additional context and information about the processor’s decisions.
Type Definition
Example
Insights can appear in both Extraction and Classification outputs to provide transparency into the model’s decision-making process. They are particularly useful when debugging or validating processor results.
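For debugging, one option is to gather the insights attached to each field of a Fields Array extraction output. The sketch below assumes the output is a parsed Python dictionary and treats each insights entry as an opaque value, since its structure is not described in this section. The function name collect_insights and the sample data are hypothetical.

```python
def collect_insights(extraction_output):
    """Return {field key: insights} for fields with a non-empty insights entry."""
    return {
        key: result["insights"]
        for key, result in extraction_output.items()
        if result.get("insights")
    }


# Hypothetical sample; the insight contents are placeholders, not a real payload.
sample = {
    "vendor": {
        "id": "field_1",
        "type": "string",
        "value": "Acme",
        "insights": ["placeholder insight"],
    },
    "total": {"id": "field_2", "type": "number", "value": 10},
}

print(collect_insights(sample))
```

Fields without insights (for example, when the feature is not enabled) are simply omitted from the result.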