Run processors (extraction, classification, splitting, etc.) on a given document.
Synchronous vs Asynchronous Processing:
PROCESSING status. Use webhooks or polling to get results.sync: true to wait for completion and get final results in the response (5-minute timeout).For asynchronous processing:
Bearer authentication of the form Bearer <token>, where token is your auth token.
The ID of the processor to be run.
Example: "dp_Xj8mK2pL9nR4vT7qY5wZ"
A raw string to be processed. Can be used in place of file when passing raw text data streams. One of file or rawText must be provided.
API version to use for the request. If you do not specify a version, you will either receive a 400 Bad Request or be set to a previous legacy version. See API Versioning for more details.
An optional version of the processor to use. When not supplied, the most recent published version of the processor will be used. Special values include:
"latest" for the most recent published version. If there are no published versions, the draft version will be used."draft" for the draft version."1.0", "2.2", etc.The file to be processed. One of file or rawText must be provided. Supported file types can be found here.
Whether to run the processor synchronously. When true, the request will wait for the processor run to complete and return the final results. When false (default), the request returns immediately with a PROCESSING status, and you can poll for completion or use webhooks. For production use cases, we recommending leaving sync off and building around an async integration for more resiliency, unless your use case is predictably fast (e.g. sub < 30 seconds) run time or otherwise have integration constraints that require a synchronous API.
Timeout: Synchronous requests have a 5-minute timeout. If the processor run takes longer, it will continue processing asynchronously and you can retrieve the results via the GET endpoint.
An optional object that can be passed in to identify the run of the document processor. It will be returned back to you in the response and webhooks.
To categorize processor runs for billing and usage tracking, include extend:usage_tags with an array of string values (e.g., {"extend:usage_tags": ["production", "team-eng", "customer-123"]}). Tags must contain only alphanumeric characters, hyphens, and underscores; any special characters will be automatically removed.