Run Processor

Run processors (extraction, classification, splitting, etc.) on a given document.

In general, the recommended way to integrate with Extend in production is via workflows, using the Run Workflow endpoint, for several reasons:

  • File parsing and pre-processing are automatically reused across processors. Many use cases run several processors on the same document, so this simplifies integration and reduces cost.
  • Workflows provide dedicated human-in-the-loop document review when needed.
  • Workflows let you model and manage your pipeline through a single endpoint, with a corresponding UI for modeling and monitoring.

However, in some use cases and systems it is easier to model the pipeline in code and run processors directly. This endpoint exists for that purpose.

Like workflow runs, processor runs are asynchronous and return a status of PROCESSING until the run is complete. You can configure webhooks to receive notifications when a processor run completes or fails.
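Where webhooks are not an option, the asynchronous behavior described above can be handled by polling. The following is a minimal Python sketch, not an official client: `fetch_status` is a hypothetical caller-supplied function that returns the current status string of a run, since this page does not describe how a run's status is retrieved. The only status value taken from this page is PROCESSING.

```python
import time
from typing import Callable


def wait_for_run(
    fetch_status: Callable[[], str],  # hypothetical: returns the run's current status
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run leaves the PROCESSING status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status != "PROCESSING":
            # Any other status is treated as terminal (completed or failed).
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("processor run is still PROCESSING after the timeout")
        sleep(interval_s)
```

The `sleep` parameter is injectable so that the loop can be exercised in tests without real delays.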

Headers

Authorization string Required

Bearer authentication of the form Bearer <token>, where token is your auth token.

x-extend-api-version enum Optional

API version to use for the request. If you do not specify a version, the request will either fail with a 400 Bad Request or fall back to a previous legacy version. See API Versioning for more details.

Allowed values:
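As an illustration of the two headers above, here is a small Python sketch that assembles them into a dictionary. The helper name `build_headers` is hypothetical, and the version string is left to the caller because this page does not list the allowed values.

```python
from typing import Dict, Optional


def build_headers(token: str, api_version: Optional[str] = None) -> Dict[str, str]:
    """Build the request headers described above."""
    # Bearer authentication of the form "Bearer <token>".
    headers = {"Authorization": f"Bearer {token}"}
    # The version header is optional; it is sent only when the caller supplies one.
    if api_version is not None:
        headers["x-extend-api-version"] = api_version
    return headers
```

Because omitting the version may result in a 400 Bad Request or a legacy version, callers will usually want to pass `api_version` explicitly.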

Request

This endpoint expects an object.

processorId string Required

The ID of the processor to run. The ID starts with "dp_" and can be found when viewing a processor on the Extend platform.

Example: "dp_Xj8mK2pL9nR4vT7qY5wZ"

version string Optional Defaults to latest

An optional version of the processor to use. When not supplied, the most recent published version of the processor will be used. Special values include:

  • "latest" for the most recent published version. If there are no published versions, the draft version will be used.
  • "draft" for the draft version.
  • Specific version numbers corresponding to versions your team has published, e.g. "1.0", "2.2", etc.
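
The version selection rules above can be sketched as a small Python function. This only models the documented behavior on the client side and is not how Extend implements it. The function name, the `published` list, and the assumption that the list is ordered from oldest to newest are all hypothetical.

```python
from typing import Optional, Sequence


def resolve_version(requested: Optional[str], published: Sequence[str]) -> str:
    """Return the version that would be used for a run.

    `published` is assumed to be ordered from oldest to newest.
    """
    if requested is None or requested == "latest":
        # Most recent published version, or the draft if nothing is published.
        return published[-1] if published else "draft"
    if requested == "draft":
        return "draft"
    if requested not in published:
        raise ValueError(f"version {requested!r} has not been published")
    return requested
```

For example, with published versions "1.0" and "2.2", omitting the version or passing "latest" selects "2.2", while a processor with no published versions falls back to the draft.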

file object Optional

The file to be processed. One of file or rawText must be provided. Supported file types can be found here.

rawText string Optional

A raw string to be processed. Can be used in place of file when passing raw text data streams. One of file or rawText must be provided.

priority integer Optional >=1 <=100 Defaults to 50

An optional value used to determine the relative order of ProcessorRuns when rate limiting is in effect. Lower values will be prioritized before higher values.
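To make the ordering rule concrete, the Python sketch below sorts a list of pending runs so that lower priority values come first. It is only an illustration of "lower values are prioritized": Extend's actual scheduling is not described on this page, and the helper names and the `id` field are hypothetical.

```python
from typing import Any, Dict, List

DEFAULT_PRIORITY = 50  # documented default


def validate_priority(priority: int = DEFAULT_PRIORITY) -> int:
    """Check that a priority is an integer in the documented range 1..100."""
    if isinstance(priority, bool) or not isinstance(priority, int):
        raise TypeError("priority must be an integer")
    if not 1 <= priority <= 100:
        raise ValueError("priority must be between 1 and 100")
    return priority


def dispatch_order(runs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order runs so that lower priority values come first.

    Runs without a priority use the default; sorted() is stable, so runs
    with equal priority keep their original relative order.
    """
    return sorted(
        runs, key=lambda run: validate_priority(run.get("priority", DEFAULT_PRIORITY))
    )
```

A run submitted with priority 1 would therefore be ordered ahead of one that uses the default of 50, which in turn precedes one submitted with priority 80.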

metadata object Optional

An optional object that can be passed in to identify the run of the document processor. It is returned to you in the response and in webhooks.

config object Optional

The configuration for the processor run. If provided, this configuration is used for the run; otherwise, the configuration of the selected processor version is used. The type of configuration must match the processor type.
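The request fields above can be assembled and checked on the client side before sending. The following Python sketch is one way to do that, under stated assumptions: the function name is hypothetical, the `file` and `config` objects are passed through as opaque dictionaries because their shapes are not described on this page, and the "one of file or rawText" rule is read as "at least one".

```python
from typing import Any, Dict, Optional


def build_run_request(
    processor_id: str,
    *,
    file: Optional[Dict[str, Any]] = None,  # opaque: shape not described here
    raw_text: Optional[str] = None,
    version: Optional[str] = None,
    priority: Optional[int] = None,
    metadata: Optional[Dict[str, Any]] = None,
    config: Optional[Dict[str, Any]] = None,  # opaque: must match the processor type
) -> Dict[str, Any]:
    """Build the JSON-serializable request object for a processor run."""
    if not processor_id.startswith("dp_"):
        raise ValueError('processorId is expected to start with "dp_"')
    if file is None and raw_text is None:
        raise ValueError("one of file or rawText must be provided")
    if priority is not None and not 1 <= priority <= 100:
        raise ValueError("priority must be between 1 and 100")

    body: Dict[str, Any] = {"processorId": processor_id}
    optional_fields = {
        "version": version,
        "file": file,
        "rawText": raw_text,
        "priority": priority,
        "metadata": metadata,
        "config": config,
    }
    # Omit unset fields so the documented server-side defaults apply.
    body.update({key: value for key, value in optional_fields.items() if value is not None})
    return body
```

Leaving out unset fields, rather than sending explicit nulls, lets the documented defaults ("latest" for version, 50 for priority) take effect.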

Response

Successfully created processor run

success boolean

processorRun object
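
A response with these two fields could be handled along the following lines. This is a hedged Python sketch: the function name is hypothetical, and the fields inside `processorRun` are not listed on this page, so the object is returned to the caller as an opaque dictionary.

```python
import json
from typing import Any, Dict


def parse_run_response(body: str) -> Dict[str, Any]:
    """Return the processorRun object from a successful response body."""
    data = json.loads(body)
    if data.get("success") is not True:
        raise RuntimeError("processor run was not created")
    run = data.get("processorRun")
    if not isinstance(run, dict):
        raise ValueError("response does not contain a processorRun object")
    return run
```

Since runs are asynchronous, the returned object will typically describe a run that is still in progress rather than a finished result.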

Errors