Processors | Extend Documentation

A processor is the saved, reusable form of an extractor, classifier, or splitter. Instead of passing a config inline on every request, you save that config once as a processor, give it a stable id, and reference it from then on. The processor is the entity your runs are tracked against, the thing you version and iterate on in Extend Studio, and the unit you evaluate and optimize.

“Processor” (the saved entity) is different from baseProcessor (a config field). baseProcessor selects the underlying model tier for a run — for example extraction_performance vs extraction_light — and is just one of the settings stored inside a processor’s configuration.
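To make the distinction concrete, here is a minimal sketch of how the two relate. The `Processor` dataclass, the id value, and the `schema` key are hypothetical stand-ins for illustration and are not Extend's actual schema; only `baseProcessor` and the tier name `extraction_performance` come from the text above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Processor:
    """Hypothetical local model of a saved processor (not Extend's real schema)."""
    id: str                      # stable id you reference on each run
    type: str                    # "extractor", "classifier", or "splitter"
    config: Dict[str, Any] = field(default_factory=dict)


invoice_extractor = Processor(
    id="ex_invoice_example",     # made-up id for illustration
    type="extractor",
    config={
        # baseProcessor is one setting among many inside the config.
        "baseProcessor": "extraction_performance",
        # Hypothetical placeholder for the rest of the configuration.
        "schema": {"invoice_number": {"type": "string"}},
    },
)

# The processor is the entity; baseProcessor is a value stored inside it.
print(invoice_extractor.id, invoice_extractor.config["baseProcessor"])
```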

Inline config vs. a saved processor

There are two ways to run extraction, classification, or splitting:

Inline config — pass the full configuration on each /extract, /classify, or /split call. Great for getting started, prototyping, and one-off runs.
A saved processor — create the processor once, then reference it by id on each run. Use this for anything you’ll run more than once.
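The two options differ mainly in what the request carries. The sketch below builds both request bodies side by side; the key names `file`, `url`, `config`, and `processor`, the helper names, and the example id are assumptions made for illustration, so check the API reference for the real request shape.

```python
import json
from typing import Any, Dict


def inline_run_body(file_url: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Inline config: the full configuration travels with every request."""
    return {"file": {"url": file_url}, "config": config}


def saved_run_body(file_url: str, processor_id: str) -> Dict[str, Any]:
    """Saved processor: the request only carries a reference to it."""
    return {"file": {"url": file_url}, "processor": {"id": processor_id}}


inline_body = inline_run_body(
    "https://example.com/invoice.pdf",
    {"baseProcessor": "extraction_light"},
)
saved_body = saved_run_body("https://example.com/invoice.pdf", "ex_invoice_example")

print(json.dumps(inline_body, indent=2))
print(json.dumps(saved_body, indent=2))
```

With the saved form, the configuration lives in one place, so changing it does not require touching every caller.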

When you reference a saved processor you can still adjust individual settings for a single run with overrideConfig, without changing the saved configuration.
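One way to picture a per-run override is as a layer applied on top of the saved configuration for that run only. The sketch below assumes a shallow, top-level merge; how Extend actually combines `overrideConfig` with the saved configuration is not stated here, and the `effective_config` helper and the `instructions` key are hypothetical.

```python
import copy
from typing import Any, Dict, Optional


def effective_config(
    saved: Dict[str, Any],
    override: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the config a single run would use, leaving `saved` untouched.

    Assumption: override keys replace saved keys at the top level only.
    """
    merged = copy.deepcopy(saved)
    merged.update(copy.deepcopy(override or {}))
    return merged


saved = {"baseProcessor": "extraction_performance", "instructions": "Extract totals."}
run_config = effective_config(saved, {"baseProcessor": "extraction_light"})

print(run_config["baseProcessor"])  # the override applies to this run
print(saved["baseProcessor"])       # the saved configuration is unchanged
```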

What a saved processor gives you

Saving your configuration as a processor unlocks the parts of Extend that operate on a persistent entity rather than a one-off request:

A stable identity for your runs. Every run created against a processor is tracked against it, so you can list and review a processor’s runs over time.
Iteration in the dashboard. Build and refine the configuration visually in Extend Studio. Edits are saved to the processor’s draft version until you publish.
Versioning. Publish the draft as a new version to pin a stable configuration; published versions are read-only. See Versioning.
Evaluation. Evaluation sets are created against a processor, letting you measure accuracy on representative documents and verify that changes actually improve results.
Optimization. Composer optimizes a processor’s configuration — field descriptions, prompts, classifications, and split rules — using your evaluation sets.
Use in workflows and batches. Workflows reference published processor versions as steps, and the batch endpoints require a saved processor (inline config isn’t supported for batch).
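The batch constraint in the last item can be enforced on the client before a request is sent. This is a minimal sketch: the `build_batch_body` helper and the `processor` and `inputs` key names are hypothetical, and only the rule itself (batch needs a saved processor, not inline config) comes from the text above.

```python
from typing import Any, Dict, List, Optional


def build_batch_body(
    file_urls: List[str],
    processor_id: Optional[str] = None,
    inline_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a batch request body, rejecting inline config up front."""
    if inline_config is not None:
        raise ValueError("batch runs need a saved processor; inline config isn't supported")
    if not processor_id:
        raise ValueError("processor_id is required for batch runs")
    return {
        "processor": {"id": processor_id},
        "inputs": [{"file": {"url": url}} for url in file_urls],
    }


batch_body = build_batch_body(
    ["https://example.com/a.pdf", "https://example.com/b.pdf"],
    processor_id="ex_invoice_example",
)
print(len(batch_body["inputs"]))
```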

The three processor types

Processor	What it does	Configuration reference
Extractor	Pulls structured fields out of a document.	Extractor configuration
Classifier	Assigns a document to one of your defined types.	Classifier configuration
Splitter	Breaks a multi-document file into typed sub-documents.	Splitter configuration

Each type has the same lifecycle — create, run, version, evaluate, optimize — and its own set of management endpoints:

Extractors

Create, update, and version extractors.

Classifiers

Create, update, and version classifiers.

Splitters

Create, update, and version splitters.

Versioning

A processor always has a draft version that holds your latest edits. When you’re ready to lock in a configuration, publish the draft as one of two version types:

Minor version — for changes that don’t alter the shape of the output (for example, refining a field or classification prompt).
Major version — for changes that alter the output shape or how the processor interacts with a workflow.

Once published, a version is immutable and viewable in read-only mode. Publishing a new version does not affect workflows already using an earlier version — they keep running the version they were deployed with — so you can iterate freely and pin versions where you need stability. Workflow steps reference a specific processor version (a semver string or "draft").
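The pinning behavior described above can be modeled locally in a few lines. This is a toy in-memory sketch, not Extend's implementation: the `ProcessorVersions` class, the `MAJOR.MINOR` numbering that starts at `1.0`, and the workflow step shape are all assumptions made for illustration.

```python
import copy
from types import MappingProxyType
from typing import Any, Dict, Mapping


class ProcessorVersions:
    """Toy model: one editable draft plus read-only published versions."""

    def __init__(self, draft_config: Dict[str, Any]) -> None:
        self.draft = dict(draft_config)
        self.published: Dict[str, Mapping[str, Any]] = {}
        self._latest = (0, 0)  # assumed MAJOR.MINOR scheme

    def publish(self, kind: str) -> str:
        major, minor = self._latest
        if kind == "major":
            major, minor = major + 1, 0
        elif kind == "minor":
            minor += 1
        else:
            raise ValueError("kind must be 'major' or 'minor'")
        version = f"{major}.{minor}"
        # Snapshot the draft; the proxy blocks top-level writes to the snapshot.
        self.published[version] = MappingProxyType(copy.deepcopy(self.draft))
        self._latest = (major, minor)
        return version

    def resolve(self, ref: str) -> Mapping[str, Any]:
        """Look up the config for a version reference ("draft" or a version)."""
        return self.draft if ref == "draft" else self.published[ref]


versions = ProcessorVersions({"prompt": "Extract the total."})
v1 = versions.publish("major")
workflow_step = {"processor": "ex_invoice_example", "version": v1}  # pinned

# Keep iterating on the draft and publish again.
versions.draft["prompt"] = "Extract the grand total, including tax."
v2 = versions.publish("minor")

print(v1, v2)
print(versions.resolve(workflow_step["version"])["prompt"])  # still the v1 prompt
```

The workflow step keeps resolving to the version it was pinned to, while a step that referenced `"draft"` would pick up each new edit.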

This is separate from base model versioning. The baseVersion field pins Extend’s underlying model release for a processor (see, for example, Extraction Performance versions), while publishing manages versions of your configuration.
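Put differently, a published processor version carries two independent pins. The record below is a hypothetical shape for illustration: the `processorId`, `version`, and `config` keys and the `pins` helper are assumptions, and the `baseVersion` value is a deliberate placeholder rather than a real release name.

```python
from typing import Any, Dict

processor_version: Dict[str, Any] = {
    "processorId": "ex_invoice_example",  # made-up id
    "version": "1.1",                     # YOUR configuration version, created by publishing
    "config": {
        "baseProcessor": "extraction_performance",  # model tier
        "baseVersion": "<base-model-release>",      # placeholder for Extend's model release
    },
}


def pins(record: Dict[str, Any]) -> Dict[str, str]:
    """Report the two separate things a processor version pins."""
    return {
        "configuration": record["version"],
        "base_model": record["config"]["baseVersion"],
    }


print(pins(processor_version))
```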

When to use a processor

Reach for a saved processor when you want to:

Run the same configuration repeatedly or across multiple files.
Measure and improve accuracy with evaluation sets and Composer.
Process files in bulk with the batch endpoints.
Use the configuration as a step in a workflow.
Collaborate and iterate on the configuration in Extend Studio.

Stick with inline config for quick experiments and one-off runs.

Next steps

Evaluation Sets

Test and track the accuracy of a processor on representative documents.

Composer

Automatically optimize a processor’s configuration using your evaluation sets.

Workflows

Orchestrate published processors into an end-to-end document pipeline.

Batch Processing

Run a saved processor across many files in a single request.