A processor is the saved, reusable form of an extractor, classifier, or splitter. Instead of passing a config inline on every request, you save that config once as a processor, give it a stable id, and reference it from then on. The processor is the entity your runs are tracked against, the thing you version and iterate on in Extend Studio, and the unit you evaluate and optimize.
“Processor” (the saved entity) is different from baseProcessor (a config field). baseProcessor selects the underlying model tier for a run — for example extraction_performance vs extraction_light — and is just one of the settings stored inside a processor’s configuration.
There are two ways to run extraction, classification, or splitting:
Inline config: pass the full configuration on each /extract, /classify, or /split call. Great for getting started, prototyping, and one-off runs.
Saved processor: save the configuration once and reference the processor by id on each run. Use this for anything you’ll run more than once.
When you reference a saved processor you can still adjust individual settings for a single run with overrideConfig, without changing the saved configuration.
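As a minimal sketch of the two modes, the helper below builds a request body for either an inline run or a saved-processor run. Only the names config, id, and overrideConfig come from the text above. The helper build_run_body, the "file" field, the example URL, and the processor id are hypothetical, and the real request shape may nest these fields differently, so check the API reference before relying on it.

```python
import copy
from typing import Any, Dict, Optional


def build_run_body(
    file_url: str,
    *,
    config: Optional[Dict[str, Any]] = None,
    processor_id: Optional[str] = None,
    override_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a JSON-style body for one run (hypothetical shape, no network call)."""
    # Exactly one mode per run: inline config OR a saved processor id.
    if (config is None) == (processor_id is None):
        raise ValueError("pass exactly one of config or processor_id")
    # Overrides only make sense on top of a saved configuration.
    if override_config is not None and processor_id is None:
        raise ValueError("override_config requires processor_id")

    body: Dict[str, Any] = {"file": {"url": file_url}}  # "file" is an assumed field
    if config is not None:
        body["config"] = copy.deepcopy(config)
    else:
        body["id"] = processor_id
        if override_config:
            # Sent with this run only; the saved configuration is not modified.
            body["overrideConfig"] = copy.deepcopy(override_config)
    return body


# Inline: the full configuration travels with every call.
inline = build_run_body(
    "https://example.com/invoice.pdf",
    config={"baseProcessor": "extraction_light"},
)

# Saved: reference the processor by id and tweak one setting for this run.
saved = build_run_body(
    "https://example.com/invoice.pdf",
    processor_id="ex_hypothetical123",
    override_config={"baseProcessor": "extraction_performance"},
)
```

Rejecting a body that mixes both modes keeps it unambiguous which configuration a run actually used.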
Saving your configuration as a processor unlocks the parts of Extend that operate on a persistent entity rather than a one-off request, such as versioning, evaluation, optimization, workflows, and batch processing (inline config isn’t supported for batch).
Each processor type has the same lifecycle (create, run, version, evaluate, optimize) and its own set of management endpoints:
Create, update, and version extractors.
Create, update, and version classifiers.
Create, update, and version splitters.
A processor always has a draft version that holds your latest edits. When you’re ready to lock in a configuration, publish it.
Once published, a version is immutable and viewable in read-only mode. Publishing a new version does not affect workflows already using an earlier version — they keep running the version they were deployed with — so you can iterate freely and pin versions where you need stability. Workflow steps reference a specific processor version (a semver string or "draft").
This is separate from base model versioning. The baseVersion field pins Extend’s underlying model release for a processor (see, for example, Extraction Performance versions), while publishing manages versions of your configuration.
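The draft and publish behavior described above can be modeled with a small in-memory sketch. This is a toy illustration of the semantics, not Extend's API: the class ProcessorVersions, its methods, the version labels "1.0.0" and "1.1.0", and the baseVersion placeholder value are all hypothetical. It also shows that baseProcessor and baseVersion are ordinary config fields that get frozen along with everything else when a version is published.

```python
import copy
from types import MappingProxyType
from typing import Any, Dict, Mapping


class ProcessorVersions:
    """Toy model: one mutable draft plus read-only published snapshots."""

    def __init__(self, draft_config: Dict[str, Any]) -> None:
        self.draft: Dict[str, Any] = copy.deepcopy(draft_config)
        self._published: Dict[str, Mapping[str, Any]] = {}

    def publish(self, version: str) -> None:
        """Freeze a copy of the current draft under the given version label."""
        if version == "draft" or version in self._published:
            raise ValueError(f"version {version!r} is reserved or already published")
        # MappingProxyType gives a read-only view at the top level of the snapshot.
        self._published[version] = MappingProxyType(copy.deepcopy(self.draft))

    def resolve(self, version: str) -> Mapping[str, Any]:
        """Return the config a workflow step pinned to `version` would run."""
        if version == "draft":
            return self.draft
        return self._published[version]


extractor = ProcessorVersions({
    "baseProcessor": "extraction_performance",
    "baseVersion": "<model release>",  # placeholder; real values are not shown here
})
extractor.publish("1.0.0")

# Keep iterating on the draft, then publish again.
extractor.draft["baseProcessor"] = "extraction_light"
extractor.publish("1.1.0")

# A workflow step pinned to the earlier version still sees the old config.
pinned = extractor.resolve("1.0.0")
```

Because each publish stores a deep copy, later edits to the draft cannot leak into a version that a workflow is already pinned to.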
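Reach for a saved processor when you want to run the same configuration repeatedly, version it, evaluate or optimize it, use it in a workflow, or run it across many files in a batch.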
Stick with inline config for quick experiments and one-off runs.
Test and track the accuracy of a processor on representative documents.
Automatically optimize a processor’s configuration using your evaluation sets.
Orchestrate published processors into an end-to-end document pipeline.
Run a saved processor across many files in a single request.