Legacy extraction outputs follow a standardized format that you’ll encounter when working with evaluation sets, webhooks, and API responses.
This section applies to the Fields Array config type. If you are using the JSON Schema config type, see the Extraction output type (JSON Schema) documentation. If you aren’t sure which config type you are using, see Migrating to JSON Schema.
For processors using the legacy Fields Array configuration, the extraction output is a flat dictionary where each key is the fieldName (or sometimes the id if names aren’t unique) you defined in the configuration, and the value is an ExtractionFieldResult object containing the extracted data and associated details.
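As a rough illustration of the flat dictionary described above, the following Python sketch models one output and reads the extracted values from it. The keys id, type, value, schema, insights, references, and enum are the ones this page names. The sample field names, ids, type strings, and the flatten_values helper are hypothetical, and the inner shapes of the optional details are left untyped because this section does not specify them.

```python
from typing import Any, Dict, TypedDict


class ExtractionFieldResult(TypedDict, total=False):
    # Core keys named in this section.
    id: str
    type: str
    value: Any
    # Optional details; inner shapes are not specified here, so they stay loosely typed.
    schema: Any
    insights: Any
    references: Any
    enum: Any


# Hypothetical sample output: field names, ids, and type strings are illustrative only.
sample_output: Dict[str, ExtractionFieldResult] = {
    "invoice_number": {"id": "field_1", "type": "string", "value": "INV-001"},
    "total_amount": {"id": "field_2", "type": "number", "value": 125.5},
}


def flatten_values(output: Dict[str, ExtractionFieldResult]) -> Dict[str, Any]:
    """Map each top-level key to its extracted value (None if no value is present)."""
    return {name: result.get("value") for name, result in output.items()}


print(flatten_values(sample_output))
```

Using .get("value") rather than direct indexing is a defensive choice for this sketch, so that a result without a value maps to None instead of raising an error.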
Each ExtractionFieldResult object contains the core id, type, and extracted value. It can also include the following optional details:
schema: The schema definition for nested fields (like objects or array items).
insights: Reasoning or explanations from the model (if enabled).
references: Location information, including the page number and specific Bounding Boxes relevant to the legacy Fields Array configuration (see Bounding Boxes Guide).
enum: The available options if the field type is enum.

Certain types are shared across different processor outputs. These provide additional context and information about the processor’s decisions.
Insights can appear in both Extraction and Classification outputs to provide transparency into the model’s decision-making process. They are particularly useful when debugging or validating processor results.