Migration Guides / 2026-02-09

Extract Runs Migration

What You Get

  • Fully typed SDK responses: extractRun.output is typed, no more casting
  • Run without an extractor: pass your schema inline with config instead of creating an extractor first
  • Cleaner request/response format: simpler property names, predictable nullable fields
  • Better IDE experience: autocomplete works out of the box

The old /processor_runs endpoint is still supported in this API version for backward compatibility. You can migrate incrementally.


Quick Start: Common Patterns

Running an Extraction

Before (2025-04-21)
const run = await client.processorRun.create({
  processorId: "dp_abc123",
  file: { fileUrl: "https://example.com/invoice.pdf" },
  sync: true
});
if (run.success) {
  console.log(run.processorRun.output);
}
After (2026-02-09)
const result = await client.extract({
  extractor: { id: "ex_abc123" },
  file: { url: "https://example.com/invoice.pdf" }
});
console.log(result.output?.value);

// Override an extractor's config (config moved inside extractor object)
const overrideResult = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: { schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } } }
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// Or with inline config (no extractor needed)
const inlineResult = await client.extract({
  config: { schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } } },
  file: { url: "https://example.com/invoice.pdf" }
});

Listing Extract Runs

Before
const runs = await client.processorRun.list({ processorType: "EXTRACT", processorId: "dp_abc123" });
After
const runs = await client.extractRuns.list({ extractorId: "ex_abc123" });

TypeScript: Typed Schemas with Zod

Define your extraction schema using Zod for fully typed output values. Pass a z.object() schema directly as config.schema. The SDK automatically converts it to the API's JSON Schema format and infers the output type:

import { ExtendClient, extendDate, extendCurrency } from "extend-ai";
import { z } from "zod";

const client = new ExtendClient({ token: "your-api-key" });

// Extract with a typed zod schema
const result = await client.extract({
  config: {
    schema: z.object({
      invoice_number: z.string().nullable(),
      invoice_date: extendDate(),
      total: extendCurrency(),
      line_items: z.array(z.object({
        description: z.string().nullable(),
        quantity: z.number().nullable(),
        unit_price: z.number().nullable(),
      })),
    }),
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// TypeScript knows the exact shape of output.value!
if (result.output) {
  const output = result.output.value;
  console.log(output.invoice_number); // string | null
  console.log(output.invoice_date); // string | null (ISO date)
  console.log(output.total.amount); // number | null
  console.log(output.total.iso_4217_currency_code); // string | null
  output.line_items.forEach(item => {
    console.log(item.description, item.quantity, item.unit_price);
  });
}

You can also use Zod schemas with an existing extractor by passing them via extractor.overrideConfig:

const result = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: {
      schema: z.object({
        invoice_number: z.string().nullable(),
        total: extendCurrency(),
      }),
    },
  },
  file: { url: "https://example.com/invoice.pdf" }
});

Available custom type helpers:

  • extendDate(): Date fields (output is ISO format yyyy-mm-dd)
  • extendCurrency(): Currency fields (output has amount and iso_4217_currency_code)
  • extendSignature(): Signature detection fields (output has printed_name, signature_date, is_signed, title_or_role)

Endpoint Changes Summary

Old Endpoint                                New Endpoint
POST /processor_runs (type: EXTRACT)        POST /extract_runs
GET /processor_runs?processorType=EXTRACT   GET /extract_runs
GET /processor_runs/{id}                    GET /extract_runs/{id}
DELETE /processor_runs/{id}                 DELETE /extract_runs/{id}
POST /processor_runs/{id}/cancel            POST /extract_runs/{id}/cancel

Extraction Schema Changes

Legacy "Fields" Format Removed

The legacy fields array format for defining extraction schemas is no longer supported in this API version. You must use the JSON Schema schema format instead.

Before (legacy fields format, no longer supported)
{
  "config": {
    "type": "EXTRACT",
    "fields": [
      { "id": "vendor_name", "name": "Vendor Name", "type": "string", "description": "..." },
      { "id": "total", "name": "Total", "type": "currency", "description": "..." }
    ]
  }
}
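
For simple cases, a legacy fields array can be rewritten mechanically. The following is a minimal sketch, not an official converter: fieldsToSchema, LegacyField, and the choice to use each field's id as the property key are assumptions made for illustration. It converts only string, number, and boolean fields and throws on anything else (for example "currency"), because the guide does not spell out how those legacy types map to JSON Schema.

```typescript
// Legacy field entry, limited to the keys shown in the example above.
interface LegacyField {
  id: string;
  name: string;
  type: string;
  description?: string;
}

interface PropertySchema {
  type: [string, "null"];
  description?: string;
}

interface ObjectSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
}

// Assumption: only these legacy types carry over unchanged as JSON Schema primitives.
const CONVERTIBLE_TYPES = ["string", "number", "boolean"];

// Hypothetical helper: builds an object schema keyed by each field's id.
function fieldsToSchema(fields: LegacyField[]): ObjectSchema {
  const properties: Record<string, PropertySchema> = {};
  for (const field of fields) {
    if (CONVERTIBLE_TYPES.indexOf(field.type) === -1) {
      // Types such as "currency" need a hand-written schema; see the JSON Schema guide.
      throw new Error(`Field "${field.id}" has type "${field.type}", which this sketch does not convert`);
    }
    // Declare nullability explicitly, as this API version requires.
    const property: PropertySchema = { type: [field.type, "null"] };
    if (field.description !== undefined) {
      property.description = field.description;
    }
    properties[field.id] = property;
  }
  return { type: "object", properties };
}
```

The display name from the legacy format is dropped here; whether and how to carry it over is left to the JSON Schema guide.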

See the JSON Schema guide for the full schema reference and examples.

Strict Nullable Validation

In previous API versions, passing a non-nullable primitive type (e.g. "type": "string") would be silently converted to its nullable form ("type": ["string", "null"]). Similarly, enum arrays without null would have null automatically appended.

In 2026-02-09, this is now a validation error. You must explicitly declare nullable types in your schema. This prevents a mismatch between your schema definition and the actual API response, which is particularly important if you're using SDK features like Zod schema validation in the TypeScript SDK, where a null return for a non-nullable field would cause a runtime error.

Breaking change: Schemas that previously worked due to silent conversion will now return a 400 Bad Request. Update your schemas before migrating.

Primitive types must use the nullable array form:

Before (rejected in 2026-02-09)
{ "type": "string" }
After
{ "type": ["string", "null"] }

Enum arrays must include null:

Before (rejected in 2026-02-09)
{ "enum": ["active", "inactive"] }
After
{ "enum": ["active", "inactive", null] }

This applies to all endpoints that accept an extraction schema: POST /extract, POST /extract_runs, POST /extractors, POST /extractors/{id}, and POST /extractors/{extractorId}/versions.
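
If you have many schemas that relied on the old silent conversion, the same rewrite can be applied in code before sending them. This is a sketch under stated assumptions: withExplicitNulls and SchemaNode are hypothetical names, the helper only handles the type, enum, properties, and items keywords, and it treats string, number, integer, and boolean as the primitive types.

```typescript
// Minimal JSON Schema node shape used by this sketch (not the SDK's own type).
type SchemaNode = {
  type?: string | string[];
  enum?: unknown[];
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  [key: string]: unknown;
};

// Assumption: these are the primitive types that must be declared nullable.
const PRIMITIVE_TYPES = ["string", "number", "integer", "boolean"];

// Returns a copy of the schema with explicit nullability on primitives and enums.
function withExplicitNulls(node: SchemaNode): SchemaNode {
  const out: SchemaNode = { ...node };

  // "string" becomes ["string", "null"]; objects and arrays are left alone.
  if (typeof out.type === "string" && PRIMITIVE_TYPES.indexOf(out.type) !== -1) {
    out.type = [out.type, "null"];
  }

  // ["active", "inactive"] becomes ["active", "inactive", null].
  if (Array.isArray(out.enum) && out.enum.indexOf(null) === -1) {
    out.enum = [...out.enum, null];
  }

  // Recurse into nested object properties and array items.
  if (out.properties) {
    const properties: Record<string, SchemaNode> = {};
    for (const key of Object.keys(out.properties)) {
      properties[key] = withExplicitNulls(out.properties[key] as SchemaNode);
    }
    out.properties = properties;
  }
  if (out.items) {
    out.items = withExplicitNulls(out.items);
  }
  return out;
}
```

The function does not mutate its input, so the original schema can be kept for comparison while you verify the migrated version.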


Request Changes

File Properties (All Endpoints)

Old             New
file.fileUrl    file.url
file.fileId     file.id
file.fileName   file.name
rawText         file.text

Creating an Extract Run

Old                       New                        Notes
processorId               extractor.id               Nested in object
version                   extractor.version          Nested in object
config (with extractor)   extractor.overrideConfig   Moved inside extractor object. Top-level config is now only for inline extraction without an extractor
sync: true                (removed)                  Use POST /extract (sync, for testing), or createAndPoll() / webhooks for production
config.type: "EXTRACT"    (removed)                  Implicit from endpoint
config.parser             parseConfig                Renamed. Use config.parseConfig (inline) or extractor.overrideConfig.parseConfig (with extractor)

Breaking change: If you previously passed config alongside a processorId to override extractor settings, you must now pass it as extractor.overrideConfig inside the extractor object. The top-level config property is reserved for inline extraction (without an extractor).

Additionally, config.parser has been renamed to parseConfig everywhere: use config.parseConfig for inline config or extractor.overrideConfig.parseConfig when overriding an extractor.

Example: Create Request

Before (2025-04-21)
{
  "processorId": "dp_abc123",
  "version": "latest",
  "file": {
    "fileUrl": "https://example.com/invoice.pdf",
    "fileName": "invoice.pdf"
  },
  "sync": true,
  "metadata": { "customerId": "cust_123" }
}
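
The renames in the tables above can be applied mechanically to a legacy request body. The sketch below is illustrative only: migrateCreateRequest and both interfaces are hypothetical names, and it assumes metadata is still accepted at the top level since the mapping table does not list it as changed. It passes the ID through untouched; note that the examples in this guide use a dp_ prefix for the old ID and an ex_ prefix for the new one, and this sketch does not translate between them.

```typescript
// Legacy request shape, limited to the fields named in the mapping tables.
interface LegacyCreateRequest {
  processorId: string;
  version?: string;
  file?: { fileUrl?: string; fileId?: string; fileName?: string };
  rawText?: string;
  sync?: boolean;
  metadata?: Record<string, unknown>;
}

// New request shape, again limited to the fields named in the mapping tables.
interface ExtractRunRequest {
  extractor: { id: string; version?: string };
  file: { url?: string; id?: string; name?: string; text?: string };
  metadata?: Record<string, unknown>;
}

// Hypothetical helper: applies the documented renames to one request body.
function migrateCreateRequest(old: LegacyCreateRequest): ExtractRunRequest {
  // processorId and version move into the nested extractor object.
  const extractor: ExtractRunRequest["extractor"] = { id: old.processorId };
  if (old.version !== undefined) extractor.version = old.version;

  // fileUrl, fileId, and fileName lose their prefix; rawText becomes file.text.
  const file: ExtractRunRequest["file"] = {};
  if (old.file?.fileUrl !== undefined) file.url = old.file.fileUrl;
  if (old.file?.fileId !== undefined) file.id = old.file.fileId;
  if (old.file?.fileName !== undefined) file.name = old.file.fileName;
  if (old.rawText !== undefined) file.text = old.rawText;

  // sync is dropped on purpose: the new request has no such flag.
  const next: ExtractRunRequest = { extractor, file };
  if (old.metadata !== undefined) next.metadata = old.metadata; // assumption: unchanged
  return next;
}
```

Because sync is removed, a caller that relied on sync: true also needs to pick one of the documented replacements (POST /extract, createAndPoll(), or webhooks); this helper only reshapes the body.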

Example: Overriding Extractor Config

If you previously passed config alongside processorId to override the extractor's settings, this now moves inside the extractor object as overrideConfig:

Before (2025-04-21)
{
  "processorId": "dp_abc123",
  "config": {
    "schema": { "type": "object", "properties": { "vendorName": { "type": "string" } } },
    "parser": { "type": "SIMPLE" }
  },
  "file": { "fileUrl": "https://example.com/invoice.pdf" },
  "sync": true
}
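
The config move involves three of the changes described above: the config object goes under extractor.overrideConfig, config.type is dropped, and parser is renamed to parseConfig. A minimal sketch of that rewrite follows; toOverrideConfig and moveConfigIntoExtractor are hypothetical helper names, and the sketch leaves the schema itself untouched.

```typescript
type LegacyConfig = Record<string, unknown>;

// Drops the implicit "type" key and renames "parser" to "parseConfig".
function toOverrideConfig(config: LegacyConfig): Record<string, unknown> {
  const next: Record<string, unknown> = { ...config };
  delete next["type"];
  if ("parser" in next) {
    next["parseConfig"] = next["parser"];
    delete next["parser"];
  }
  return next;
}

// Nests the rewritten config under extractor.overrideConfig.
function moveConfigIntoExtractor(processorId: string, config: LegacyConfig) {
  return {
    extractor: { id: processorId, overrideConfig: toOverrideConfig(config) },
  };
}

// Usage with the config from the legacy example.
const moved = moveConfigIntoExtractor("dp_abc123", {
  type: "EXTRACT",
  schema: { type: "object", properties: { vendorName: { type: "string" } } },
  parser: { type: "SIMPLE" },
});
console.log(JSON.stringify(moved, null, 2));
```

The schema in the legacy example declares vendorName as a plain "string". Under Strict Nullable Validation that would be rejected, so the schema still needs explicit null types before the request is sent.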

Response Changes

Response shape changes: Single object responses are now returned directly (no wrapper key), and list responses use { "object": "list", "data": [...] } format. See Simplified Response Shapes for details.
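
Client code that handles raw HTTP responses can check for the list envelope before reading data. The following is a small sketch, assuming only the two keys named above; ListResponse and isListResponse are hypothetical names, and any other keys the API may return on a list response (for example pagination fields) are not modelled.

```typescript
// The list envelope as described above: { "object": "list", "data": [...] }.
interface ListResponse<T> {
  object: "list";
  data: T[];
}

// Type guard: true only when the body has object === "list" and an array in data.
function isListResponse(body: unknown): body is ListResponse<unknown> {
  if (typeof body !== "object" || body === null) {
    return false;
  }
  const candidate = body as { object?: unknown; data?: unknown };
  return candidate.object === "list" && Array.isArray(candidate.data);
}

// Usage: a single-object response is returned directly, so it fails the check.
const listBody: unknown = { object: "list", data: [{ object: "extract_run", id: "run_1" }] };
const singleBody: unknown = { object: "extract_run", id: "run_1" };
console.log(isListResponse(listBody), isListResponse(singleBody));
```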

Key Differences

Old                                New
success: true                      (removed): use HTTP status codes
{ "extractRun": {...} }            {...} (object returned directly)
processorRun.processorId           extractRun.extractor.id
processorRun.processorVersionId    extractRun.extractorVersion.id
processorRun.files[]               extractRun.file (single object)
processorRun.url                   extractRun.dashboardUrl
Optional fields (may be absent)    Required but nullable (always present)

Example: Response

Before (2025-04-21)
{
  "success": true,
  "processorRun": {
    "object": "document_processor_run",
    "id": "dpr_abc123",
    "processorId": "dp_xyz789",
    "processorVersionId": "dpv_456",
    "processorName": "Invoice Extractor",
    "type": "EXTRACT",
    "status": "PROCESSED",
    "output": {
      "value": { "vendorName": "Acme Corp" },
      "metadata": { "vendorName": { "logprobsConfidence": 0.95 } }
    },
    "files": [{ "id": "file_123", "name": "invoice.pdf" }],
    "url": "https://dashboard.extend.ai/runs/dpr_abc123"
  }
}
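
While migrating incrementally, it can help to read legacy responses through the new field names. The sketch below applies the renames from the Key Differences table to a legacy response like the one above. It is not the real ExtractRun type: toExtractRunView, LegacyRunResponse, and ExtractRunView are hypothetical names, and the view covers only a subset of fields. It also shows the optional-to-nullable change by turning an absent failureReason into null.

```typescript
// Legacy response shape, limited to fields that appear in the example above.
interface LegacyRunResponse {
  success: boolean;
  processorRun: {
    id: string;
    processorId: string;
    processorVersionId: string;
    processorName: string;
    status: string;
    output: unknown;
    files: { id: string; name: string }[];
    url: string;
    failureReason?: string;
  };
}

// Partial view using the new property names (a subset, for illustration).
interface ExtractRunView {
  object: "extract_run";
  id: string;
  extractor: { id: string; name: string };
  extractorVersion: { id: string };
  status: string;
  output: unknown;
  file: { id: string; name: string };
  dashboardUrl: string;
  failureReason: string | null;
}

function toExtractRunView(res: LegacyRunResponse): ExtractRunView {
  const run = res.processorRun; // the new format has no wrapper key
  const file = run.files[0];    // files[] becomes a single file object
  if (file === undefined) {
    throw new Error(`Run ${run.id} has no file`);
  }
  return {
    object: "extract_run",
    id: run.id,
    extractor: { id: run.processorId, name: run.processorName },
    extractorVersion: { id: run.processorVersionId },
    status: run.status,
    output: run.output,
    file,
    dashboardUrl: run.url,
    // Optional fields become required but nullable: absent turns into null.
    failureReason: run.failureReason ?? null,
  };
}
```

With a view like this, downstream code can switch from checking for undefined to checking for null before the HTTP calls themselves are migrated.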

Citation Format Change

If you're using citations (bounding boxes), the page field structure has changed:

const pageNumber = citation.page; // number

SDK Method Reference

Old Method                     New Method
client.processorRun.create()   client.extractRuns.create()
client.processorRun.get()      client.extractRuns.retrieve()
client.processorRun.list()     client.extractRuns.list()
client.processorRun.delete()   client.extractRuns.delete()
client.processorRun.cancel()   client.extractRuns.cancel()
-                              client.extract() (new sync endpoint)
-                              client.extractRuns.createAndPoll() (new)

ExtractRun Schema

Property             Old (ProcessorRun)          New (ExtractRun)                          Change
object               "document_processor_run"    "extract_run"                             Value changed
id                   Required string             Required string                           No change
processorId          Required                    -                                         Removed, see extractor.id
processorVersionId   Required                    -                                         Removed, see extractorVersion.id
processorName        Required                    -                                         Removed, see extractor.name
extractor            -                           Required ExtractorSummary | null          New
extractorVersion     -                           Required ExtractorVersionSummary | null   New
type                 Required "EXTRACT"          -                                         Removed (implicit)
status               Required                    Required                                  No change
output               Required                    Required but nullable                     Now nullable
initialOutput        Optional                    Required but nullable                     Now required
reviewedOutput       Optional                    Required but nullable                     Now required
failureReason        Optional                    Required but nullable                     Now required
failureMessage       Optional                    Required but nullable                     Now required
metadata             Optional                    Required but nullable                     Now required
config               ExtractionConfig            ExtractConfig                             No change
files                Required File[]             -                                         Removed
file                 -                           Required FileSummary                      New
parseRunId           -                           Required string | null                    New
url                  Required                    -                                         Renamed
dashboardUrl         -                           Required                                  New (replaces url)
usage                Optional                    Required but nullable                     Now required
createdAt            -                           Required                                  New
updatedAt            -                           Required                                  New

ExtractConfig Schema

Property        Old (ExtractionConfig)   New (ExtractConfig)   Change
type            Required "EXTRACT"       -                     Removed (implicit)
baseProcessor   Optional                 Optional              No change
schema          Optional                 Required              Now required
fields          Optional (deprecated)    -                     Removed
parser          Optional                 -                     Renamed to parseConfig
parseConfig     -                        Optional              New

ExtractRunSummary Schema (List Response)

Property           Old (ProcessorRunSummary)   New (ExtractRunSummary)   Change
object             -                           "extract_run"             New
processorId        Required                    -                         Removed, see extractor.id
processorName      Required                    -                         Removed, see extractor.name
extractor          -                           Required                  New
extractorVersion   -                           Required                  New
type               Optional                    -                         Removed
file               -                           Required                  New
dashboardUrl       -                           Required                  New

Need Help?

If you encounter any issues while migrating, please contact our support team at support@extend.app.


Migration Guides

Guide             Migrating From           Migrating To
Overview          -                        What's new and how to upgrade
Extract Runs      /processor_runs          /extract_runs + /extract
Classify Runs     /processor_runs          /classify_runs + /classify
Split Runs        /processor_runs          /split_runs + /split
Parse Runs        /parse, /parse/async     /parse_runs + /parse
Edit Runs         /edit, /edit/async       /edit_runs + /edit
Extractors        /processors              /extractors
Classifiers       /processors              /classifiers
Splitters         /processors              /splitters
Files             /files                   /files (breaking changes)
Evaluation Sets   evaluation endpoints     Updated evaluation endpoints
Workflow Runs     /workflow_runs           /workflow_runs (breaking changes)
Webhooks          processor_run.* events   extract_run.*, classify_run.*, etc.