Migration Guides · 2026-02-09

Extract Runs Migration


What You Get

  • Fully typed SDK responses: extractRun.output is typed, so no casting is needed
  • Run without an extractor: pass your schema inline with config instead of creating an extractor first
  • Cleaner request/response format: simpler property names and predictable nullable fields
  • Better IDE experience: autocomplete works out of the box

The old /processor_runs endpoint is still supported in this API version for backward compatibility. You can migrate incrementally.


Quick Start: Common Patterns

Running an Extraction

TypeScript

Before (2025-04-21)

const run = await client.processorRun.create({
  processorId: "dp_abc123",
  file: { fileUrl: "https://example.com/invoice.pdf" },
  sync: true
});
if (run.success) {
  console.log(run.processorRun.output);
}

After (2026-02-09)

const result = await client.extract({
  extractor: { id: "ex_abc123" },
  file: { url: "https://example.com/invoice.pdf" }
});
console.log(result.output?.value);

// Override an extractor's config (config moved inside extractor object)
const overridden = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: {
      schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } }
    }
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// Or with inline config (no extractor needed)
const inline = await client.extract({
  config: {
    schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } }
  },
  file: { url: "https://example.com/invoice.pdf" }
});

Listing Extract Runs

TypeScript

Before

const runs = await client.processorRun.list({ processorType: "EXTRACT", processorId: "dp_abc123" });

After

const runs = await client.extractRuns.list({ extractorId: "ex_abc123" });

TypeScript: Typed Schemas with Zod

Define your extraction schema using Zod for fully typed output values. Pass a z.object() schema directly as config.schema — the SDK automatically converts it to the API’s JSON Schema format and infers the output type:

import { ExtendClient, extendDate, extendCurrency } from "extend-ai";
import { z } from "zod";

const client = new ExtendClient({ token: "your-api-key" });

// Extract with a typed zod schema
const result = await client.extract({
  config: {
    schema: z.object({
      invoice_number: z.string().nullable(),
      invoice_date: extendDate(),
      total: extendCurrency(),
      line_items: z.array(z.object({
        description: z.string().nullable(),
        quantity: z.number().nullable(),
        unit_price: z.number().nullable(),
      })),
    }),
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// TypeScript knows the exact shape of output.value!
if (result.output) {
  const output = result.output.value;
  console.log(output.invoice_number); // string | null
  console.log(output.invoice_date); // string | null (ISO date)
  console.log(output.total.amount); // number | null
  console.log(output.total.iso_4217_currency_code); // string | null
  output.line_items.forEach(item => {
    console.log(item.description, item.quantity, item.unit_price);
  });
}

You can also use Zod schemas with an existing extractor by passing them via extractor.overrideConfig:

const result = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: {
      schema: z.object({
        invoice_number: z.string().nullable(),
        total: extendCurrency(),
      }),
    },
  },
  file: { url: "https://example.com/invoice.pdf" }
});

Available custom type helpers:

  • extendDate() — Date fields (output is ISO format yyyy-mm-dd)
  • extendCurrency() — Currency fields (output has amount and iso_4217_currency_code)
  • extendSignature() — Signature detection fields (output has printed_name, signature_date, is_signed, title_or_role)

Endpoint Changes Summary

| Old Endpoint | New Endpoint |
| --- | --- |
| POST /processor_runs (type: EXTRACT) | POST /extract_runs |
| GET /processor_runs?processorType=EXTRACT | GET /extract_runs |
| GET /processor_runs/{id} | GET /extract_runs/{id} |
| DELETE /processor_runs/{id} | DELETE /extract_runs/{id} |
| POST /processor_runs/{id}/cancel | POST /extract_runs/{id}/cancel |

Extraction Schema Changes

Legacy “Fields” Format Removed

The legacy fields array format for defining extraction schemas is no longer supported in this API version. You must use the JSON Schema schema format instead.

Legacy format (no longer accepted):

{
  "config": {
    "type": "EXTRACT",
    "fields": [
      { "id": "vendor_name", "name": "Vendor Name", "type": "string", "description": "..." },
      { "id": "total", "name": "Total", "type": "currency", "description": "..." }
    ]
  }
}

See the JSON Schema guide for the full schema reference and examples.

Strict Nullable Validation

In previous API versions, passing a non-nullable primitive type (e.g. "type": "string") would be silently converted to its nullable form ("type": ["string", "null"]). Similarly, enum arrays without null would have null automatically appended.

In 2026-02-09, this is now a validation error. You must explicitly declare nullable types in your schema. This prevents a mismatch between your schema definition and the actual API response — particularly important if you’re using SDK features like Zod schema validation in the TypeScript SDK, where a null return for a non-nullable field would cause a runtime error.

Breaking change: Schemas that previously worked due to silent conversion will now return a 400 Bad Request. Update your schemas before migrating.

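Primitive types must use the nullable array form:

Before (now rejected): { "type": "string" }
After: { "type": ["string", "null"] }

Enum arrays must include null:

Before (now rejected): { "enum": ["active", "inactive"] }
After: { "enum": ["active", "inactive", null] }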

This applies to all endpoints that accept an extraction schema: POST /extract, POST /extract_runs, POST /extractors, POST /extractors/{id}, and POST /extractors/{extractorId}/versions.
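If many existing schemas relied on the old silent conversion, one option is to apply the same conversion on the client before sending them. The following is a minimal sketch, not part of the SDK: makeNullable and the JsonSchema type are hypothetical names, and it assumes that "primitive" means string, number, integer, and boolean, and that nested schemas only appear under properties and items.

```typescript
// Minimal JSON Schema shape used by this sketch (not the SDK's own types).
type JsonSchema = {
  type?: string | string[];
  enum?: unknown[];
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
  [key: string]: unknown;
};

// Assumption: these are the primitive types the validation applies to.
const PRIMITIVES = ["string", "number", "integer", "boolean"];

// Returns a copy of `schema` where every primitive type is nullable and
// every enum includes null, recursing through `properties` and `items`.
function makeNullable(schema: JsonSchema): JsonSchema {
  const out: JsonSchema = { ...schema };

  if (typeof out.type === "string") {
    // "string" -> ["string", "null"]
    if (PRIMITIVES.indexOf(out.type) !== -1) out.type = [out.type, "null"];
  } else if (Array.isArray(out.type)) {
    // ["string", "number"] -> ["string", "number", "null"]
    const types = out.type;
    const hasPrimitive = types.some((t) => PRIMITIVES.indexOf(t) !== -1);
    if (hasPrimitive && types.indexOf("null") === -1) out.type = [...types, "null"];
  }

  // ["active", "inactive"] -> ["active", "inactive", null]
  if (Array.isArray(out.enum) && out.enum.indexOf(null) === -1) {
    out.enum = [...out.enum, null];
  }

  if (out.properties) {
    const props: Record<string, JsonSchema> = {};
    for (const key of Object.keys(out.properties)) {
      props[key] = makeNullable(out.properties[key]);
    }
    out.properties = props;
  }
  if (out.items) out.items = makeNullable(out.items);

  return out;
}
```

Object and array types are left untouched because the text above only describes primitives and enums. Review the converted schemas before relying on them, since code that consumes the output must now handle null for every converted field.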


Request Changes

File Properties (All Endpoints)

| Old | New |
| --- | --- |
| file.fileUrl | file.url |
| file.fileId | file.id |
| file.fileName | file.name |
| rawText | file.text |

Creating an Extract Run

| Old | New | Notes |
| --- | --- | --- |
| processorId | extractor.id | Nested in object |
| version | extractor.version | Nested in object |
| config (with extractor) | extractor.overrideConfig | Moved inside extractor object. Top-level config is now only for inline extraction without an extractor |
| sync: true | (removed) | Use POST /extract (sync, for testing), or createAndPoll() / webhooks for production |
| config.type: "EXTRACT" | (removed) | Implicit from endpoint |
| config.parser | parseConfig | Renamed. Use config.parseConfig (inline) or extractor.overrideConfig.parseConfig (with extractor) |

Breaking change: If you previously passed config alongside a processorId to override extractor settings, you must now pass it as extractor.overrideConfig inside the extractor object. The top-level config property is reserved for inline extraction (without an extractor).

Additionally, config.parser has been renamed to parseConfig everywhere — use config.parseConfig for inline config or extractor.overrideConfig.parseConfig when overriding an extractor.

Example: Create Request

Old request format (2025-04-21):

{
  "processorId": "dp_abc123",
  "version": "latest",
  "file": {
    "fileUrl": "https://example.com/invoice.pdf",
    "fileName": "invoice.pdf"
  },
  "sync": true,
  "metadata": { "customerId": "cust_123" }
}

Example: Overriding Extractor Config

If you previously passed config alongside processorId to override the extractor’s settings, this now moves inside the extractor object as overrideConfig:

Old request format (2025-04-21):

{
  "processorId": "dp_abc123",
  "config": {
    "schema": { "type": "object", "properties": { "vendorName": { "type": "string" } } },
    "parser": { "type": "SIMPLE" }
  },
  "file": { "fileUrl": "https://example.com/invoice.pdf" },
  "sync": true
}
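The renames listed under Request Changes can be applied mechanically to an old request body. The sketch below is one way to do that under stated assumptions: migrateCreateRequest, migrateConfig, and the type names are hypothetical helpers rather than SDK functions, the old request is assumed to always carry a processorId, and the contents of config.parser are passed through unchanged because this guide only documents the rename.

```typescript
// Shapes limited to the properties named in the tables above.
type OldConfig = { type?: string; parser?: unknown; [key: string]: unknown };
type OldRequest = {
  processorId: string;
  version?: string;
  config?: OldConfig;
  file?: { fileUrl?: string; fileId?: string; fileName?: string };
  rawText?: string;
  sync?: boolean;
  metadata?: Record<string, unknown>;
};

type NewConfig = { parseConfig?: unknown; [key: string]: unknown };
type NewRequest = {
  extractor: { id: string; version?: string; overrideConfig?: NewConfig };
  file: { url?: string; id?: string; name?: string; text?: string };
  metadata?: Record<string, unknown>;
};

function migrateConfig(old: OldConfig): NewConfig {
  // config.type is dropped (implicit from the endpoint).
  const { type, parser, ...rest } = old;
  const next: NewConfig = { ...rest };
  // config.parser is renamed to parseConfig; its contents are not changed here.
  if (parser !== undefined) next.parseConfig = parser;
  return next;
}

function migrateCreateRequest(old: OldRequest): NewRequest {
  // processorId and version move inside the extractor object.
  // The id value is copied as-is; this sketch does not translate id prefixes.
  const extractor: NewRequest["extractor"] = { id: old.processorId };
  if (old.version !== undefined) extractor.version = old.version;
  // config passed alongside an extractor becomes extractor.overrideConfig.
  if (old.config !== undefined) extractor.overrideConfig = migrateConfig(old.config);

  // fileUrl/fileId/fileName lose their prefix; rawText moves to file.text.
  const oldFile = old.file || {};
  const file: NewRequest["file"] = {};
  if (oldFile.fileUrl !== undefined) file.url = oldFile.fileUrl;
  if (oldFile.fileId !== undefined) file.id = oldFile.fileId;
  if (oldFile.fileName !== undefined) file.name = oldFile.fileName;
  if (old.rawText !== undefined) file.text = old.rawText;

  const next: NewRequest = { extractor, file };
  if (old.metadata !== undefined) next.metadata = old.metadata;
  // `sync` is intentionally not copied: it no longer exists on the request.
  return next;
}
```

Applied to the override example above, this yields an extractor object with id and overrideConfig (containing schema and parseConfig) plus file.url, with no sync property. The migrated schema still declares vendorName as a plain "string", so it would also need the nullable form described under Strict Nullable Validation before the new API accepts it.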

Response Changes

Response shape changes: Single object responses are now returned directly (no wrapper key), and list responses use { "object": "list", "data": [...] } format. See Simplified Response Shapes for details.

Key Differences

| Old | New |
| --- | --- |
| success: true | (removed); use HTTP status codes |
| { "extractRun": {...} } | {...} (object returned directly) |
| processorRun.processorId | extractRun.extractor.id |
| processorRun.processorVersionId | extractRun.extractorVersion.id |
| processorRun.files[] | extractRun.file (single object) |
| processorRun.url | extractRun.dashboardUrl |
| Optional fields (may be absent) | Required but nullable (always present) |

Example: Response

Old response format (2025-04-21):

{
  "success": true,
  "processorRun": {
    "object": "document_processor_run",
    "id": "dpr_abc123",
    "processorId": "dp_xyz789",
    "processorVersionId": "dpv_456",
    "processorName": "Invoice Extractor",
    "type": "EXTRACT",
    "status": "PROCESSED",
    "output": {
      "value": { "vendorName": "Acme Corp" },
      "metadata": { "vendorName": { "logprobsConfidence": 0.95 } }
    },
    "files": [{ "id": "file_123", "name": "invoice.pdf" }],
    "url": "https://dashboard.extend.ai/runs/dpr_abc123"
  }
}
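Because the old endpoint remains available during an incremental migration, downstream code may see both response shapes for a while. The sketch below normalizes either shape into one view, following the Key Differences table. The names toRunView, RunView, OldResponse, and NewResponse are hypothetical, only the fields used here are declared, and the assumption that the new file object exposes an id is not confirmed by this guide.

```typescript
// Old shape: wrapped in { success, processorRun }.
type OldResponse = {
  success: boolean;
  processorRun: {
    id: string;
    processorId: string;
    processorVersionId: string;
    status: string;
    output: unknown;
    files: { id: string; name: string }[];
    url: string;
  };
};

// New shape: the run object is returned directly, nullable fields always present.
type NewResponse = {
  object: "extract_run";
  id: string;
  extractor: { id: string } | null;
  extractorVersion: { id: string } | null;
  status: string;
  output: unknown;
  file: { id: string }; // assumption: the file summary carries an id
  dashboardUrl: string;
};

// A small, shape-independent view for downstream code.
type RunView = {
  id: string;
  extractorId: string | null;
  extractorVersionId: string | null;
  status: string;
  output: unknown;
  fileId: string | null;
  dashboardUrl: string;
};

function isOldResponse(body: OldResponse | NewResponse): body is OldResponse {
  return "processorRun" in body;
}

function toRunView(body: OldResponse | NewResponse): RunView {
  if (isOldResponse(body)) {
    const run = body.processorRun;
    return {
      id: run.id,
      extractorId: run.processorId,
      extractorVersionId: run.processorVersionId,
      status: run.status,
      output: run.output,
      // files[] became a single file object; take the first entry if any.
      fileId: run.files.length > 0 ? run.files[0].id : null,
      dashboardUrl: run.url,
    };
  }
  return {
    id: body.id,
    // extractor is null for runs created with inline config.
    extractorId: body.extractor ? body.extractor.id : null,
    extractorVersionId: body.extractorVersion ? body.extractorVersion.id : null,
    status: body.status,
    output: body.output,
    fileId: body.file.id,
    dashboardUrl: body.dashboardUrl,
  };
}
```

Note that the new format has no success flag, so error handling should branch on the HTTP status code before a body ever reaches a function like this.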

Citation Format Change

If you’re using citations (bounding boxes), the page field structure has changed:

const pageNumber = citation.page; // number

SDK Method Reference

| Old Method | New Method |
| --- | --- |
| client.processorRun.create() | client.extractRuns.create() |
| client.processorRun.get() | client.extractRuns.retrieve() |
| client.processorRun.list() | client.extractRuns.list() |
| client.processorRun.delete() | client.extractRuns.delete() |
| client.processorRun.cancel() | client.extractRuns.cancel() |
| - | client.extract() (new sync endpoint) |
| - | client.extractRuns.createAndPoll() (new) |

Detailed Schema Changes

ExtractRun Schema

| Property | Old (ProcessorRun) | New (ExtractRun) | Change |
| --- | --- | --- | --- |
| object | "document_processor_run" | "extract_run" | Value changed |
| id | Required string | Required string | No change |
| processorId | Required | - | Removed, see extractor.id |
| processorVersionId | Required | - | Removed, see extractorVersion.id |
| processorName | Required | - | Removed, see extractor.name |
| extractor | - | Required ExtractorSummary \| null | New |
| extractorVersion | - | Required ExtractorVersionSummary \| null | New |
| type | Required "EXTRACT" | - | Removed (implicit) |
| status | Required | Required | No change |
| output | Required | Required but nullable | Now nullable |
| initialOutput | Optional | Required but nullable | Now required |
| reviewedOutput | Optional | Required but nullable | Now required |
| failureReason | Optional | Required but nullable | Now required |
| failureMessage | Optional | Required but nullable | Now required |
| metadata | Optional | Required but nullable | Now required |
| config | ExtractionConfig | ExtractConfig | No change |
| files | Required File[] | - | Removed |
| file | - | Required FileSummary | New |
| parseRunId | - | Required string \| null | New |
| url | Required | - | Renamed |
| dashboardUrl | - | Required | New (replaces url) |
| usage | Optional | Required but nullable | Now required |
| createdAt | - | Required | New |
| updatedAt | - | Required | New |
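The move from "Optional" to "Required but nullable" changes how presence checks behave: a key that used to be missing is now present with the value null, so tests such as `"failureReason" in run` or `run.failureReason === undefined` no longer distinguish anything. A minimal sketch of checks that work with both shapes follows. The type names and helper functions are hypothetical, and treating failureReason and failureMessage as strings is an assumption, since the table does not state their types.

```typescript
// Old shape: the keys may be absent entirely.
type OldFailureFields = { failureReason?: string; failureMessage?: string };

// New shape: the keys are always present; null means "no value".
type NewFailureFields = { failureReason: string | null; failureMessage: string | null };

// `!= null` (loose inequality) is false for both undefined and null,
// so this check is safe before and after the migration.
function hasFailure(run: OldFailureFields | NewFailureFields): boolean {
  return run.failureReason != null;
}

// With the new shape, null checks replace "is the key there?" checks.
function describeFailure(run: NewFailureFields): string {
  if (run.failureReason === null) return "no failure";
  return run.failureMessage === null
    ? run.failureReason
    : `${run.failureReason}: ${run.failureMessage}`;
}
```

The same pattern applies to the other fields marked "Now required" in the table, such as initialOutput, reviewedOutput, metadata, and usage.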

ExtractConfig Schema

| Property | Old (ExtractionConfig) | New (ExtractConfig) | Change |
| --- | --- | --- | --- |
| type | Required "EXTRACT" | - | Removed (implicit) |
| baseProcessor | Optional | Optional | No change |
| schema | Optional | Required | Now required |
| fields | Optional (deprecated) | - | Removed |
| parser | Optional | - | Renamed to parseConfig |
| parseConfig | - | Optional | New |

ExtractRunSummary Schema (List Response)

| Property | Old (ProcessorRunSummary) | New (ExtractRunSummary) | Change |
| --- | --- | --- | --- |
| object | - | "extract_run" | New |
| processorId | Required | - | Removed, see extractor.id |
| processorName | Required | - | Removed, see extractor.name |
| extractor | - | Required | New |
| extractorVersion | - | Required | New |
| type | Optional | - | Removed |
| file | - | Required | New |
| dashboardUrl | - | Required | New |

Need Help?

If you encounter any issues while migrating, please contact our support team at support@extend.app.


Migration Guides

| Guide | Migrating From | Migrating To |
| --- | --- | --- |
| Overview | - | What’s new and how to upgrade |
| Extract Runs | /processor_runs | /extract_runs + /extract |
| Classify Runs | /processor_runs | /classify_runs + /classify |
| Split Runs | /processor_runs | /split_runs + /split |
| Parse Runs | /parse, /parse/async | /parse_runs + /parse |
| Edit Runs | /edit, /edit/async | /edit_runs + /edit |
| Extractors | /processors | /extractors |
| Classifiers | /processors | /classifiers |
| Splitters | /processors | /splitters |
| Files | /files | /files (breaking changes) |
| Evaluation Sets | evaluation endpoints | Updated evaluation endpoints |
| Workflow Runs | /workflow_runs | /workflow_runs (breaking changes) |
| Webhooks | processor_run.* events | extract_run.*, classify_run.*, etc. |