Migration Guides, API version 2026-02-09

Extractors Migration

What You Get

  • Dedicated /extractors endpoints — No more type: "EXTRACT" filters, cleaner API surface
  • New GET /extractors/{id} endpoint — Retrieve a single extractor with its draft version (not available in old API)
  • Typed SDK responses — extractor objects are typed, no casting needed
  • Simpler config — No more type field required in config

The old /processors endpoint is still supported in this API version for backward compatibility. You can migrate incrementally.


Quick Start: Common Patterns

Creating an Extractor

TypeScript

Before (2025-04-21):

    const processor = await client.processor.create({
      name: "Invoice Extractor",
      type: "EXTRACT",
      config: { type: "EXTRACT", baseProcessor: "extraction_performance", schema: {...} }
    });
    console.log(processor.processor.id);

After (2026-02-09):

    const extractor = await client.extractors.create({
      name: "Invoice Extractor",
      config: { baseProcessor: "extraction_performance", schema: {...} }
    });
    console.log(extractor.id);

Retrieving an Extractor (New!)

TypeScript

    const extractor = await client.extractors.retrieve("ex_abc123");
    console.log(extractor.draftVersion.config);

Listing Extractors

TypeScript

Before:

    const processors = await client.processor.list({ type: "EXTRACT" });

After:

    const extractors = await client.extractors.list();

Publishing a Version

TypeScript

Before:

    const version = await client.processorVersion.create("dp_abc123", { releaseType: "minor" });

After:

    const version = await client.extractorVersions.create("ex_abc123", { releaseType: "minor" });

Endpoint Changes Summary

| Old Endpoint | New Endpoint |
| --- | --- |
| POST /processors (type: EXTRACT) | POST /extractors |
| GET /processors?type=EXTRACT | GET /extractors |
| (not available) | GET /extractors/{id} (new!) |
| POST /processors/{id} | POST /extractors/{id} |
| POST /processors/{id}/publish | POST /extractors/{extractorId}/versions |
| GET /processors/{id}/versions | GET /extractors/{extractorId}/versions |
| GET /processors/{id}/versions/{versionId} | GET /extractors/{extractorId}/versions/{versionId} |
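
If you call the REST API directly, the table above can be encoded as a small route translator while you migrate call sites one at a time. The sketch below is illustrative only: `toExtractorRoute` is a hypothetical helper, not part of the SDK, and it assumes the resource ID is passed through unchanged (the quick-start examples use `dp_` IDs before and `ex_` IDs after, so verify how your IDs carry over). It applies only to processors of type EXTRACT.

```typescript
type HttpMethod = "GET" | "POST";

interface Route {
  method: HttpMethod;
  path: string;
}

// Hypothetical helper (not part of the SDK): rewrites a legacy /processors
// route to the /extractors route listed in the endpoint table.
// Assumption: the resource ID is passed through unchanged.
function toExtractorRoute(method: HttpMethod, legacyPath: string): Route {
  // Drop the query string; the type=EXTRACT filter is implicit in /extractors.
  const queryStart = legacyPath.indexOf("?");
  const path = queryStart === -1 ? legacyPath : legacyPath.slice(0, queryStart);
  const segments = path.split("/").filter((segment) => segment.length > 0);
  const [root, id, child, versionId] = segments;

  if (root !== "processors") {
    throw new Error(`Not a /processors route: ${legacyPath}`);
  }
  if (segments.length === 1) {
    // POST /processors and GET /processors map to the same new path.
    return { method, path: "/extractors" };
  }
  if (segments.length === 2 && method === "POST") {
    // POST /processors/{id} keeps its shape under /extractors.
    return { method, path: `/extractors/${id}` };
  }
  if (segments.length === 3 && child === "publish" && method === "POST") {
    // Publishing becomes creating a version.
    return { method: "POST", path: `/extractors/${id}/versions` };
  }
  if (segments.length === 3 && child === "versions" && method === "GET") {
    return { method: "GET", path: `/extractors/${id}/versions` };
  }
  if (segments.length === 4 && child === "versions" && method === "GET") {
    return { method: "GET", path: `/extractors/${id}/versions/${versionId}` };
  }
  throw new Error(`No /extractors equivalent listed for ${method} ${legacyPath}`);
}
```

Routes that the table does not list throw instead of guessing, so unmapped calls surface early during migration.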

Request Changes

Creating an Extractor

| Old | New | Notes |
| --- | --- | --- |
| type: "EXTRACT" | (removed) | Implicit from endpoint |
| cloneProcessorId | cloneExtractorId | Renamed |
| config.type: "EXTRACT" | (removed) | Implicit from endpoint |
| config.parser | config.parseConfig | Renamed |
| config.baseProcessor | config.baseProcessor | No change (optional) |
| config.schema | config.schema | Now required |
| config.fields | (removed) | Use schema instead |

cloneExtractorId and config are mutually exclusive. You can either clone an existing extractor or provide a config, but not both. The API will return a validation error if both are provided.
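
To catch this before the request leaves your code, a client-side guard along these lines may help. It is a minimal sketch: `CreateExtractorBody` and `checkCloneOrConfig` are hypothetical names, the error text is not the API's actual message, and because this guide does not say whether omitting both fields is allowed, that case is deliberately not checked.

```typescript
// Hypothetical shape of a create-extractor body; only the fields that matter
// for this check are modelled.
interface CreateExtractorBody {
  name: string;
  cloneExtractorId?: string;
  config?: Record<string, unknown>;
}

// Returns an error message when both cloneExtractorId and config are set,
// or null when the body passes this particular check.
function checkCloneOrConfig(body: CreateExtractorBody): string | null {
  if (body.cloneExtractorId !== undefined && body.config !== undefined) {
    return "cloneExtractorId and config are mutually exclusive; provide only one";
  }
  return null;
}
```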

Example: Create Request

Before (2025-04-21), the legacy request body sent to POST /processors:

    {
      "name": "Invoice Extractor",
      "type": "EXTRACT",
      "config": {
        "type": "EXTRACT",
        "baseProcessor": "extraction_performance",
        "schema": {
          "type": "object",
          "properties": {
            "vendorName": { "type": ["string", "null"] }
          }
        },
        "parser": { "target": "markdown" }
      }
    }

For POST /extractors, drop both type fields and rename parser to parseConfig, as listed in the table above.
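
The request changes can be applied mechanically. The following sketch converts a legacy create body into the new one according to the request table in this section. The type names and `toExtractorCreateRequest` are hypothetical, the contents of `parser` are assumed to carry over to `parseConfig` unchanged (the table only says "Renamed"), and a legacy `cloneProcessorId` is passed through as-is.

```typescript
type Json = Record<string, unknown>;

// Hypothetical model of the legacy POST /processors body.
interface LegacyCreateRequest {
  name: string;
  type: "EXTRACT";
  cloneProcessorId?: string;
  config?: {
    type: "EXTRACT";
    baseProcessor?: string;
    schema?: Json;
    fields?: unknown; // deprecated; has no direct equivalent in the new API
    parser?: Json;
  };
}

// Hypothetical model of the new POST /extractors body.
interface ExtractorCreateRequest {
  name: string;
  cloneExtractorId?: string;
  config?: {
    baseProcessor?: string;
    schema: Json;
    parseConfig?: Json;
  };
}

function toExtractorCreateRequest(legacy: LegacyCreateRequest): ExtractorCreateRequest {
  // Top-level type is dropped: it is implicit from the endpoint.
  const request: ExtractorCreateRequest = { name: legacy.name };
  if (legacy.cloneProcessorId !== undefined) {
    request.cloneExtractorId = legacy.cloneProcessorId;
  }
  if (legacy.config !== undefined) {
    const { baseProcessor, schema, parser } = legacy.config;
    if (schema === undefined) {
      // schema is now required; fields must be rewritten as a schema by hand.
      throw new Error("config.schema is required; convert config.fields to a schema first");
    }
    request.config = {
      schema,
      ...(baseProcessor !== undefined ? { baseProcessor } : {}),
      ...(parser !== undefined ? { parseConfig: parser } : {}),
    };
  }
  return request;
}
```

Applied to the legacy Invoice Extractor body, this yields a body with `name` and a `config` holding `baseProcessor`, `schema`, and `parseConfig: { "target": "markdown" }`, with no `type` fields.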

Response Changes

Response shape changes: Single object responses are now returned directly (no wrapper key), and list responses use { "object": "list", "data": [...] } format. See Simplified Response Shapes for details.

Key Differences

| Old | New |
| --- | --- |
| success: true | (removed); use HTTP status codes |
| { "extractor": {...} } | {...} (object returned directly) |
| processors | extractors |
| processorVersion | extractorVersion |
| versions | extractorVersions |
| processor.type: "EXTRACT" | (removed); implicit from endpoint |
| draftVersion.processorId | draftVersion.extractorId |
| draftVersion.processorType | (removed) |
| draftVersion.processorName | (removed) |
| draftVersion.updatedAt | (removed) |
| List includes versions[] | (removed); use versions endpoint |

Example: Response

Before (2025-04-21), the legacy response:

    {
      "success": true,
      "processor": {
        "object": "document_processor",
        "id": "dp_abc123",
        "name": "Invoice Extractor",
        "type": "EXTRACT",
        "draftVersion": {
          "object": "document_processor_version",
          "id": "dpv_xyz789",
          "processorId": "dp_abc123",
          "processorName": "Invoice Extractor",
          "processorType": "EXTRACT",
          "version": "draft",
          "config": { ... },
          "createdAt": "2024-03-21T15:30:00Z",
          "updatedAt": "2024-03-21T16:45:00Z"
        }
      }
    }

The new response returns the extractor object directly, without the success flag or the processor wrapper, and drops type, processorName, processorType, and draftVersion.updatedAt.
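
Because the old /processors endpoint remains available during an incremental migration, it can be convenient to normalize legacy responses into the new shape so downstream code only handles one form. The adapter below is a sketch based on the Key Differences table and the schema tables in this guide. All type and function names are hypothetical, IDs are passed through unchanged, and `config` is copied as-is, so the `config.type` and `parser` differences are not handled here.

```typescript
// Legacy shapes, modelled on the legacy response in this guide.
interface LegacyProcessorVersion {
  object: string;
  id: string;
  processorId: string;
  processorName?: string;
  processorType: string;
  version: string;
  description?: string;
  config: unknown;
  createdAt: string;
  updatedAt: string;
}

interface LegacyProcessorResponse {
  success: boolean;
  processor: {
    object: string;
    id: string;
    name: string;
    type: string;
    draftVersion: LegacyProcessorVersion;
    createdAt: string;
    updatedAt: string;
  };
}

// New shapes, modelled on the schema tables in this guide.
interface ExtractorVersion {
  object: "extractor_version";
  id: string;
  extractorId: string;
  version: string;
  description?: string;
  config: unknown;
  createdAt: string;
}

interface Extractor {
  object: "extractor";
  id: string;
  name: string;
  draftVersion: ExtractorVersion;
  createdAt: string;
  updatedAt: string;
}

// Hypothetical adapter: unwraps the legacy response and drops removed fields.
function toExtractor(legacy: LegacyProcessorResponse): Extractor {
  const { processor } = legacy;
  const draft = processor.draftVersion;
  const draftVersion: ExtractorVersion = {
    object: "extractor_version",
    id: draft.id,
    extractorId: draft.processorId, // processorId is renamed to extractorId
    version: draft.version,
    config: draft.config, // assumption: copied without converting its contents
    createdAt: draft.createdAt,
  };
  if (draft.description !== undefined) {
    draftVersion.description = draft.description;
  }
  return {
    object: "extractor",
    id: processor.id,
    name: processor.name,
    draftVersion,
    createdAt: processor.createdAt,
    updatedAt: processor.updatedAt,
  };
}
```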

Versions Endpoint Changes

List Versions

The list endpoint now returns summaries without config. Use the get version endpoint for full details.

Before (2025-04-21), each listed version included its config:

    {
      "versions": [
        {
          "id": "dpv_abc123",
          "version": "1.0",
          "config": { ... } // Included
        }
      ]
    }
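
Code that relied on `config` being present in the list response now needs one extra call per version. The sketch below shows one way to do that against a minimal interface. `ExtractorVersionsApi` is a hypothetical stand-in for `client.extractorVersions`; the argument order of `list` and `retrieve` is an assumption modelled on the `create("ex_abc123", ...)` call in the quick start, and the list shape follows the `{ "object": "list", "data": [...] }` format described under Response Changes. Pagination is ignored.

```typescript
interface VersionSummary {
  id: string;
  version: string;
}

interface VersionDetail extends VersionSummary {
  config: unknown;
}

interface ListResponse<T> {
  object: "list";
  data: T[];
}

// Hypothetical subset of client.extractorVersions; signatures are assumed.
interface ExtractorVersionsApi {
  list(extractorId: string): Promise<ListResponse<VersionSummary>>;
  retrieve(extractorId: string, versionId: string): Promise<VersionDetail>;
}

// Lists version summaries, then retrieves each version to obtain its config.
async function listVersionsWithConfig(
  api: ExtractorVersionsApi,
  extractorId: string,
): Promise<VersionDetail[]> {
  const { data } = await api.list(extractorId);
  return Promise.all(data.map((summary) => api.retrieve(extractorId, summary.id)));
}
```

This issues one retrieve request per version concurrently. For extractors with many versions, fetching only the versions you actually need is likely the better choice.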

SDK Method Reference

| Old Method | New Method |
| --- | --- |
| client.processor.create() | client.extractors.create() |
| client.processor.list() | client.extractors.list() |
| client.processor.update() | client.extractors.update() |
| (not available) | client.extractors.retrieve() (new!) |
| client.processorVersion.create() | client.extractorVersions.create() |
| client.processorVersion.list() | client.extractorVersions.list() |
| client.processorVersion.get() | client.extractorVersions.retrieve() |
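
To find call sites that still use the old methods, a simple text scan over your source files may be enough. The sketch below reports matches rather than rewriting them, because a `client.processor.*` call may target a classifier or splitter instead of an extractor. `findLegacyCalls` is a hypothetical helper, and it assumes the SDK client variable is literally named `client`.

```typescript
// Old-to-new method names, copied from the SDK method table.
const SDK_RENAMES: ReadonlyArray<readonly [string, string]> = [
  ["client.processor.create", "client.extractors.create"],
  ["client.processor.list", "client.extractors.list"],
  ["client.processor.update", "client.extractors.update"],
  ["client.processorVersion.create", "client.extractorVersions.create"],
  ["client.processorVersion.list", "client.extractorVersions.list"],
  ["client.processorVersion.get", "client.extractorVersions.retrieve"],
];

interface LegacyCall {
  line: number; // 1-based line number in the scanned source
  legacy: string;
  replacement: string;
}

// Hypothetical helper: reports each legacy SDK call found in a source string.
function findLegacyCalls(source: string): LegacyCall[] {
  const found: LegacyCall[] = [];
  source.split("\n").forEach((text, index) => {
    for (const [legacy, replacement] of SDK_RENAMES) {
      // Match the method name followed by "(" so partial names are not reported.
      if (text.includes(`${legacy}(`)) {
        found.push({ line: index + 1, legacy, replacement });
      }
    }
  });
  return found;
}
```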

Detailed Schema Changes

Extractor Schema

| Property | Old (Processor) | New (Extractor) | Change |
| --- | --- | --- | --- |
| object | "document_processor" | "extractor" | Value changed |
| id | Required string | Required string | No change |
| name | Required | Required | No change |
| type | Required "EXTRACT" | - | Removed |
| draftVersion | ProcessorVersion | ExtractorVersion | No change |
| createdAt | Required | Required | No change |
| updatedAt | Required | Required | No change |

ExtractorSummary Schema (List Response)

| Property | Old | New | Change |
| --- | --- | --- | --- |
| object | "document_processor" | "extractor" | Value changed |
| type | Required | - | Removed |
| versions | Required array | - | Removed |

ExtractorVersion Schema

| Property | Old (ProcessorVersion) | New (ExtractorVersion) | Change |
| --- | --- | --- | --- |
| object | "document_processor_version" | "extractor_version" | Value changed |
| id | Required string | Required string | No change |
| processorId | Required | - | Renamed to extractorId |
| extractorId | - | Required | New |
| processorName | Optional | - | Removed |
| processorType | Required | - | Removed |
| version | Required | Required | No change |
| description | Optional | Optional | No change |
| config | ExtractionConfig | ExtractConfig | No change |
| createdAt | Required | Required | No change |
| updatedAt | Required | - | Removed |

ExtractConfig Schema

| Property | Old (ExtractionConfig) | New (ExtractConfig) | Change |
| --- | --- | --- | --- |
| type | Required "EXTRACT" | - | Removed |
| baseProcessor | Optional | Optional | No change |
| schema | Optional | Required | Now required |
| fields | Optional (deprecated) | - | Removed |
| parser | Optional | - | Renamed |
| parseConfig | - | Optional | New (replaces parser) |

Need Help?

If you encounter any issues while migrating, please contact our support team at support@extend.app.


Migration Guides

| Guide | Migrating From | Migrating To |
| --- | --- | --- |
| Overview | - | What's new and how to upgrade |
| Extract Runs | /processor_runs | /extract_runs + /extract |
| Classify Runs | /processor_runs | /classify_runs + /classify |
| Split Runs | /processor_runs | /split_runs + /split |
| Parse Runs | /parse, /parse/async | /parse_runs + /parse |
| Edit Runs | /edit, /edit/async | /edit_runs + /edit |
| Extractors | /processors | /extractors |
| Classifiers | /processors | /classifiers |
| Splitters | /processors | /splitters |
| Files | /files | /files (breaking changes) |
| Evaluation Sets | evaluation endpoints | Updated evaluation endpoints |
| Workflow Runs | /workflow_runs | /workflow_runs (breaking changes) |
| Webhooks | processor_run.* events | extract_run.*, classify_run.*, etc. |