Processors

A processor is the saved, reusable form of an extractor, classifier, or splitter. Instead of passing a config inline on every request, you save that config once as a processor, give it a stable id, and reference it from then on. The processor is the entity your runs are tracked against, the thing you version and iterate on in Extend Studio, and the unit you evaluate and optimize.

“Processor” (the saved entity) is different from baseProcessor (a config field). baseProcessor selects the underlying model tier for a run — for example extraction_performance vs extraction_light — and is just one of the settings stored inside a processor’s configuration.

Inline config vs. a saved processor

There are two ways to run extraction, classification, or splitting:

  • Inline config — pass the full configuration on each /extract, /classify, or /split call. Great for getting started, prototyping, and one-off runs.
  • A saved processor — create the processor once, then reference it by id on each run. Use this for anything you’ll run more than once.

When you reference a saved processor you can still adjust individual settings for a single run with overrideConfig, without changing the saved configuration.
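The two call styles can be sketched as request bodies. This is a minimal illustration, not Extend's documented request schema: the keys `file`, `config`, and `processor`, the placeholder id, the example URL, and both helper names are hypothetical. Only `overrideConfig`, `baseProcessor`, and the tier names `extraction_performance` and `extraction_light` come from this page; check the API Reference for the real field names and nesting.

```python
import copy
import json
from typing import Optional


def inline_request(file_url: str, config: dict) -> dict:
    """Body for a one-off run: the full configuration travels with the call."""
    # "file" and "config" are hypothetical key names used for illustration.
    return {"file": {"url": file_url}, "config": copy.deepcopy(config)}


def saved_processor_request(file_url: str, processor_id: str,
                            override_config: Optional[dict] = None) -> dict:
    """Body for a run against a saved processor, referenced by id."""
    # "processor" and "id" are hypothetical key names; overrideConfig is the
    # per-run adjustment described above. Its placement here is an assumption.
    body = {"file": {"url": file_url}, "processor": {"id": processor_id}}
    if override_config:
        body["processor"]["overrideConfig"] = copy.deepcopy(override_config)
    return body


if __name__ == "__main__":
    url = "https://example.com/invoice.pdf"  # placeholder file location
    # One-off run: everything is spelled out inline.
    print(json.dumps(inline_request(url, {"baseProcessor": "extraction_performance"}), indent=2))
    # Repeat run: reference the saved processor and switch the tier for this run only.
    print(json.dumps(
        saved_processor_request(url, "processor_id_placeholder",
                                {"baseProcessor": "extraction_light"}),
        indent=2))
```

In this sketch the override lives only in the request body, which mirrors the point above: the saved configuration is untouched and the next run without `overrideConfig` uses it as stored.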

What a saved processor gives you

Saving your configuration as a processor unlocks the parts of Extend that operate on a persistent entity rather than a one-off request:

  • A stable identity for your runs. Every run created against a processor is tracked against it, so you can list and review a processor’s runs over time.
  • Iteration in the dashboard. Build and refine the configuration visually in Extend Studio. Edits are saved to the processor’s draft version until you publish.
  • Versioning. Publish the draft as a new version to pin a stable configuration; published versions are read-only. See Versioning.
  • Evaluation. Evaluation sets are created against a processor, letting you measure accuracy on representative documents and verify that changes actually improve results.
  • Optimization. Composer optimizes a processor’s configuration — field descriptions, prompts, classifications, and split rules — using your evaluation sets.
  • Use in workflows and batches. Workflows reference published processor versions as steps, and the batch endpoints require a saved processor (inline config isn’t supported for batch).
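The batch restriction in the last bullet can be checked on the client before anything is sent. The sketch below assumes a hypothetical request body in which a saved processor appears under a `processor` key with an `id`, and inline configuration appears under a `config` key; neither name is taken from the Extend API Reference, and the function is illustrative only.

```python
def batch_problems(body: dict) -> list:
    """Return reasons this (hypothetical) body is unsuitable for a batch run."""
    problems = []
    # Inline config is not supported for batch, per the list above.
    if "config" in body:
        problems.append("inline config is not supported for batch runs")
    # A batch run must point at a saved processor.
    processor = body.get("processor") or {}
    if not processor.get("id"):
        problems.append("batch runs must reference a saved processor id")
    return problems


if __name__ == "__main__":
    ok = {"processor": {"id": "processor_id_placeholder"}}
    bad = {"config": {"baseProcessor": "extraction_light"}}
    print(batch_problems(ok))
    print(batch_problems(bad))
```

An empty list means the body passes this local check; it says nothing about whether the server would accept the request.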

The three processor types

| Processor | What it does | Configuration |
| --- | --- | --- |
| Extractor | Pulls structured fields out of a document. | Configuration |
| Classifier | Assigns a document to one of your defined types. | Configuration |
| Splitter | Breaks a multi-document file into typed sub-documents. | Configuration |

Each type has the same lifecycle — create, run, version, evaluate, optimize — and its own set of management endpoints:

Extractors

Create, update, and version extractors.

Classifiers

Create, update, and version classifiers.

Splitters

Create, update, and version splitters.

Versioning

A processor always has a draft version that holds your latest edits. When you’re ready to lock in a configuration, publish it:

  • Minor version — for changes that don’t alter the shape of the output (for example, refining a field or classification prompt).
  • Major version — for changes that alter the output shape or how the processor interacts with a workflow.

Once published, a version is immutable and viewable in read-only mode. Publishing a new version does not affect workflows already using an earlier version — they keep running the version they were deployed with — so you can iterate freely and pin versions where you need stability. Workflow steps reference a specific processor version (a semver string or "draft").
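The minor/major rule can be approximated by comparing the output shape of the published configuration with the draft. The sketch below is a heuristic under stated assumptions, not how Extend decides: it treats a schema as a hypothetical mapping from field name to a dict with `type` and `description` keys, and it does not cover the second major-version trigger (changes to how the processor interacts with a workflow).

```python
def output_shape(schema: dict) -> dict:
    """Map each field name to its declared type, ignoring descriptions."""
    return {name: spec.get("type") for name, spec in schema.items()}


def suggest_bump(published: dict, draft: dict) -> str:
    """Suggest "major", "minor", or "none" for publishing the draft."""
    # A different set of fields or types changes the shape of the output.
    if output_shape(published) != output_shape(draft):
        return "major"
    # Same shape but different wording, e.g. a refined field description.
    if published != draft:
        return "minor"
    return "none"


if __name__ == "__main__":
    published = {"invoice_number": {"type": "string", "description": "Invoice ID"}}
    reworded = {"invoice_number": {"type": "string",
                                   "description": "Invoice ID printed in the header"}}
    extended = dict(published, total={"type": "number", "description": "Grand total"})
    print(suggest_bump(published, reworded))
    print(suggest_bump(published, extended))
    print(suggest_bump(published, published))
```

Treat the result as a prompt for review rather than a decision: a description change that leaves the shape intact is a minor version by the rule above, but only you know whether a downstream workflow depends on something this comparison cannot see.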

This is separate from base model versioning. The baseVersion field pins Extend’s underlying model release for a processor (see, for example, Extraction Performance versions), while publishing manages versions of your configuration.

When to use a processor

Reach for a saved processor when you want to:

  • Run the same configuration repeatedly or across multiple files.
  • Measure and improve accuracy with evaluation sets and Composer.
  • Process files in bulk with the batch endpoints.
  • Use the configuration as a step in a workflow.
  • Collaborate and iterate on the configuration in Extend Studio.

Stick with inline config for quick experiments and one-off runs.

Next steps

Evaluation Sets

Test and track the accuracy of a processor on representative documents.

Composer

Automatically optimize a processor’s configuration using your evaluation sets.

Workflows

Orchestrate published processors into an end-to-end document pipeline.

Batch Processing

Run a saved processor across many files in a single request.