Extract File (Sync)
Extract structured data from a file synchronously, waiting for the result before returning. This endpoint has a 5-minute timeout; if processing takes longer, the request fails.
Note: This endpoint is intended for onboarding and testing only. For production workloads, use POST /extract_runs with polling or webhooks instead, as it provides better reliability for large files and avoids timeout issues.
The Extract endpoint allows you to extract structured data from files using an existing extractor or an inline configuration.
For more details, see the Extract File guide.
Bearer authentication of the form Bearer <token>, where token is your auth token.
Reference to an existing extractor. One of extractor or config must be provided.
Inline extract configuration. One of extractor or config must be provided.
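As a rough illustration of the request described above, the following sketch assembles the headers and JSON body without sending anything. Several names here are assumptions that do not come from this page: the base URL, the `/extract` path, the `file` field and its `{"url": ...}` shape, and the `{"id": ...}` shape of an extractor reference. It also treats `extractor` and `config` as mutually exclusive, which is one reading of "one of extractor or config must be provided".

```python
import json
from typing import Any, Dict, Optional

# Placeholders only: neither value is taken from this page.
BASE_URL = "https://api.example.com"
SYNC_EXTRACT_PATH = "/extract"  # assumed path for the synchronous endpoint

# The endpoint times out after 5 minutes, so a client-side timeout of the
# same length avoids waiting on a request that can no longer succeed.
SYNC_TIMEOUT_SECONDS = 5 * 60


def build_extract_request(
    token: str,
    file: Dict[str, Any],
    extractor: Optional[Dict[str, Any]] = None,
    config: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble URL, headers, and JSON body for a sync extract call."""
    # Assumption: exactly one of the two must be present.
    if (extractor is None) == (config is None):
        raise ValueError("provide exactly one of 'extractor' or 'config'")

    body: Dict[str, Any] = {"file": file}  # 'file' field name is assumed
    if extractor is not None:
        body["extractor"] = extractor
    else:
        body["config"] = config
    if metadata is not None:
        body["metadata"] = metadata

    return {
        "url": BASE_URL + SYNC_EXTRACT_PATH,
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
        "timeout_seconds": SYNC_TIMEOUT_SECONDS,
    }
```

The returned dictionary can be handed to whichever HTTP client you prefer. Keeping request construction separate from sending makes the validation easy to test without network access.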
The type of object. Will always be "extract_run".
The unique identifier for this extract run.
Example: "exr_Xj8mK2pL9nR4vT7qY5wZ"
The status of a processor run (extract, classify, or split):
"PENDING" - The run has been created and is waiting to be processed
"PROCESSING" - The run is in progress
"PROCESSED" - The run completed successfully
"FAILED" - The run failed
"CANCELLED" - The run was cancelled
The final output, either reviewed or initial. This is a union of two possible shapes:
Availability: Present when status is "PROCESSED".
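A minimal sketch of how a client might act on the status values and the availability rule for the final output. The key names `status` and `output` are assumptions, since this page does not show the JSON field names. Grouping PROCESSED, FAILED, and CANCELLED as terminal is an inference from their descriptions.

```python
from typing import Any, Dict, Optional

# Statuses after which the run is not expected to change further
# (inferred from the descriptions, not stated explicitly).
TERMINAL_STATUSES = {"PROCESSED", "FAILED", "CANCELLED"}


def is_terminal(status: str) -> bool:
    """Return True when the run has finished, successfully or not."""
    return status in TERMINAL_STATUSES


def get_output(run: Dict[str, Any]) -> Optional[Any]:
    """Return the final output only when the run status is PROCESSED."""
    if run.get("status") != "PROCESSED":
        return None
    return run.get("output")  # 'output' key name is assumed
```

For example, `get_output({"status": "FAILED"})` returns `None` rather than raising, so callers can branch on the result.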
The initial output from the extract run, before any review edits.
Availability: Present when reviewed is true.
The output after human review.
Availability: Present when reviewed is true.
The reason for failure.
Availability: Present when status is "FAILED".
Possible values include:
ABORTED - The run was aborted by the user
INTERNAL_ERROR - An unexpected internal error occurred
FAILED_TO_PROCESS_FILE - Failed to process the file (e.g., OCR failure, file access issues)
INVALID_PROCESSOR - The processor configuration is invalid
INVALID_CONFIGURATION - The provided configuration is incompatible with the selected model
PARSING_ERROR - Failed to parse the extraction output
PRE_PROCESSING_FAILURE - An error occurred during preprocessing (e.g., chunking)
POST_PROCESSING_FAILURE - An error occurred during postprocessing
OUT_OF_CREDITS - Insufficient credits to run the extraction
Note: Additional failure reasons may be added in the future. Your integration should handle unknown values gracefully.
A detailed message about the failure.
Availability: Present when status is "FAILED".
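One way to handle unknown failure reasons gracefully is to compare against the documented values and fall back to a generic label. In this sketch the key names `status`, `failureReason`, and `failureMessage` are assumptions, as this page does not show the JSON field names.

```python
from typing import Any, Dict

# Failure reasons documented at the time of writing; more may be added.
KNOWN_FAILURE_REASONS = {
    "ABORTED",
    "INTERNAL_ERROR",
    "FAILED_TO_PROCESS_FILE",
    "INVALID_PROCESSOR",
    "INVALID_CONFIGURATION",
    "PARSING_ERROR",
    "PRE_PROCESSING_FAILURE",
    "POST_PROCESSING_FAILURE",
    "OUT_OF_CREDITS",
}


def describe_failure(run: Dict[str, Any]) -> str:
    """Build a log-friendly failure summary; empty string if not FAILED."""
    if run.get("status") != "FAILED":
        return ""
    reason = run.get("failureReason", "UNKNOWN")  # assumed key name
    message = run.get("failureMessage", "")       # assumed key name
    # Unrecognized reasons are reported rather than raising, so the
    # integration keeps working when new values are introduced.
    label = reason if reason in KNOWN_FAILURE_REASONS else f"unrecognized reason {reason!r}"
    return f"{label}: {message}" if message else label
```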
Any metadata that was provided when creating the extract run.
Availability: Present when metadata was provided during creation.
Details of edits made during review.
Availability: Present when edited is true.
The configuration used for this extract run. This is a union of two possible shapes:
The extractor that was used for this run.
Availability: Present when an extractor reference was provided. Not present when using inline config.
The version of the extractor that was used for this run.
Availability: Present when an extractor reference was provided. Not present when using inline config.
The ID of the parse run that was used for this extract run.
Availability: Present when a parse run was created.
Usage credits consumed by this extract run.
Availability: Present when status is "PROCESSED". Will not be returned for runs created before October 7, 2025 or for customers on legacy billing systems.
The time (in UTC) at which the object was created. Will follow the RFC 3339 format.
Example: "2024-03-21T16:45:00Z"
The time (in UTC) at which the object was last updated. Will follow the RFC 3339 format.
Example: "2024-03-21T16:45:00Z"
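The timestamps above can be parsed with the Python standard library. This is a small sketch that covers the format shown in the examples. It is not a complete RFC 3339 parser, and the function name is illustrative.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as '2024-03-21T16:45:00Z' into an aware UTC datetime."""
    # datetime.fromisoformat accepts a trailing 'Z' only from Python 3.11,
    # so rewrite it as an explicit UTC offset for older versions.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)
```

Working with timezone-aware values avoids accidental comparisons against naive local times.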
API version to use for the request. If you’re using an SDK, you can ignore this parameter. If you are not using an SDK and do not specify a version, the request will either fail with a 400 Bad Request or be handled under a previous legacy version. See API Versioning for more details.
An optional object that can be passed in to identify the run. It is returned to you in the response and in webhooks. Maximum size is 10KB.
To categorize runs for billing and usage tracking, include extend:usage_tags with an array of string values (e.g., {"extend:usage_tags": ["production", "team-eng", "customer-123"]}). Tags must contain only alphanumeric characters, hyphens, and underscores; any special characters will be automatically removed.
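Since special characters are stripped from tags automatically, it can help to sanitize them on the client so the stored tags match what you expect. The sketch below does that and checks the metadata size. It assumes "alphanumeric" means ASCII letters and digits, and that the 10KB limit is 10,240 bytes of UTF-8 encoded JSON. Neither detail is confirmed by this page, and the helper names are illustrative.

```python
import json
import re
from typing import Any, Dict, List, Optional

MAX_METADATA_BYTES = 10 * 1024  # assumed interpretation of "10KB"

# Anything other than ASCII letters, digits, hyphens, and underscores.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_tag(tag: str) -> str:
    """Remove characters that are not allowed in a usage tag."""
    return _DISALLOWED.sub("", tag)


def build_metadata(
    tags: List[str], extra: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Combine custom metadata with sanitized usage tags and check the size."""
    metadata: Dict[str, Any] = dict(extra or {})
    cleaned = [sanitize_tag(t) for t in tags]
    # Design choice of this sketch: drop tags that are empty after cleaning.
    metadata["extend:usage_tags"] = [t for t in cleaned if t]

    size = len(json.dumps(metadata).encode("utf-8"))
    if size > MAX_METADATA_BYTES:
        raise ValueError(f"metadata is {size} bytes, above the 10KB limit")
    return metadata
```

For example, `build_metadata(["production", "customer#123"])` yields tags `["production", "customer123"]`.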