Extract Runs Migration

What You Get

Fully typed SDK responses: extractRun.output is typed, so no more casting
Run without an extractor: pass your schema inline with config instead of creating an extractor first
Cleaner request/response format: simpler property names and predictable nullable fields
Better IDE experience: autocomplete works out of the box

The old /processor_runs endpoint is still supported in this API version for backward compatibility. You can migrate incrementally.

Quick Start: Common Patterns

Running an Extraction

TypeScript

Before (2025-04-21)

const run = await client.processorRun.create({
  processorId: "dp_abc123",
  file: { fileUrl: "https://example.com/invoice.pdf" },
  sync: true
});
if (run.success) {
  console.log(run.processorRun.output);
}

After (2026-02-09)

const result = await client.extract({
  extractor: { id: "ex_abc123" },
  file: { url: "https://example.com/invoice.pdf" }
});
console.log(result.output?.value);

// Override an extractor's config (config moved inside extractor object)
const overridden = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: { schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } } }
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// Or with inline config (no extractor needed)
const inline = await client.extract({
  config: { schema: { type: "object", properties: { vendorName: { type: ["string", "null"] } } } },
  file: { url: "https://example.com/invoice.pdf" }
});

Listing Extract Runs

TypeScript

Before

const runs = await client.processorRun.list({ processorType: "EXTRACT", processorId: "dp_abc123" });

After

const runs = await client.extractRuns.list({ extractorId: "ex_abc123" });

TypeScript: Typed Schemas with Zod

Define your extraction schema using Zod for fully typed output values. Pass a z.object() schema directly as config.schema — the SDK automatically converts it to the API’s JSON Schema format and infers the output type:

import { ExtendClient, extendDate, extendCurrency } from "extend-ai";
import { z } from "zod";

const client = new ExtendClient({ token: "your-api-key" });

// Extract with a typed zod schema
const result = await client.extract({
  config: {
    schema: z.object({
      invoice_number: z.string().nullable(),
      invoice_date: extendDate(),
      total: extendCurrency(),
      line_items: z.array(z.object({
        description: z.string().nullable(),
        quantity: z.number().nullable(),
        unit_price: z.number().nullable(),
      })),
    }),
  },
  file: { url: "https://example.com/invoice.pdf" }
});

// TypeScript knows the exact shape of output.value!
if (result.output) {
  const output = result.output.value;
  console.log(output.invoice_number);           // string | null
  console.log(output.invoice_date);             // string | null (ISO date)
  console.log(output.total.amount);             // number | null
  console.log(output.total.iso_4217_currency_code); // string | null
  output.line_items.forEach(item => {
    console.log(item.description, item.quantity, item.unit_price);
  });
}

You can also use Zod schemas with an existing extractor by passing them via extractor.overrideConfig:

const result = await client.extract({
  extractor: {
    id: "ex_abc123",
    overrideConfig: {
      schema: z.object({
        invoice_number: z.string().nullable(),
        total: extendCurrency(),
      }),
    },
  },
  file: { url: "https://example.com/invoice.pdf" }
});

Available custom type helpers:

extendDate() — Date fields (output is ISO format yyyy-mm-dd)
extendCurrency() — Currency fields (output has amount and iso_4217_currency_code)
extendSignature() — Signature detection fields (output has printed_name, signature_date, is_signed, title_or_role)

Endpoint Changes Summary

Old Endpoint	New Endpoint
`POST /processor_runs` (type: EXTRACT)	`POST /extract_runs`
`GET /processor_runs?processorType=EXTRACT`	`GET /extract_runs`
`GET /processor_runs/{id}`	`GET /extract_runs/{id}`
`DELETE /processor_runs/{id}`	`DELETE /extract_runs/{id}`
`POST /processor_runs/{id}/cancel`	`POST /extract_runs/{id}/cancel`

Extraction Schema Changes

Legacy “Fields” Format Removed

The legacy fields array format for defining extraction schemas is no longer supported in this API version. You must use the JSON Schema schema format instead.

Before (2025-04-21, no longer accepted)

{
  "config": {
    "type": "EXTRACT",
    "fields": [
      { "id": "vendor_name", "name": "Vendor Name", "type": "string", "description": "..." },
      { "id": "total", "name": "Total", "type": "currency", "description": "..." }
    ]
  }
}

See the JSON Schema guide for the full schema reference and examples.

Strict Nullable Validation

In previous API versions, passing a non-nullable primitive type (e.g. "type": "string") would be silently converted to its nullable form ("type": ["string", "null"]). Similarly, enum arrays without null would have null automatically appended.

In 2026-02-09, this is now a validation error. You must explicitly declare nullable types in your schema. This prevents a mismatch between your schema definition and the actual API response — particularly important if you’re using SDK features like Zod schema validation in the TypeScript SDK, where a null return for a non-nullable field would cause a runtime error.

Breaking change: Schemas that previously worked due to silent conversion will now return a 400 Bad Request. Update your schemas before migrating.

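Primitive types must use the nullable array form:

Before (2025-04-21)

{ "type": "string" }

After (2026-02-09)

{ "type": ["string", "null"] }

Enum arrays must include null:

Before (2025-04-21)

{ "enum": ["active", "inactive"] }

After (2026-02-09)

{ "enum": ["active", "inactive", null] }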

This applies to all endpoints that accept an extraction schema: POST /extract, POST /extract_runs, POST /extractors, POST /extractors/{id}, and POST /extractors/{extractorId}/versions.
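A schema that relied on the old silent conversion can usually be updated mechanically. The following is a minimal sketch of that idea, not an SDK feature: `makeNullable` and its helper names are hypothetical, and it assumes the schema only uses `type`, `enum`, `properties`, and `items`. Other keywords (for example `anyOf` or `$ref`) are passed through untouched, so review the result before sending it.

```typescript
// Hypothetical helper, not part of the Extend SDK: rewrites a JSON Schema so
// that primitive types and enums explicitly allow null.
type JsonSchema = { [key: string]: unknown };

const PRIMITIVES = new Set(["string", "number", "integer", "boolean"]);

function isObject(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function makeNullable(schema: JsonSchema): JsonSchema {
  const out: JsonSchema = { ...schema };

  // "type": "string" -> "type": ["string", "null"]
  if (typeof out.type === "string" && PRIMITIVES.has(out.type)) {
    out.type = [out.type, "null"];
  }

  // "enum": ["a", "b"] -> "enum": ["a", "b", null]
  if (Array.isArray(out.enum) && !out.enum.includes(null)) {
    out.enum = [...out.enum, null];
  }

  // Recurse into object properties and array items.
  if (isObject(out.properties)) {
    const nextProps: JsonSchema = {};
    for (const [key, child] of Object.entries(out.properties)) {
      nextProps[key] = isObject(child) ? makeNullable(child) : child;
    }
    out.properties = nextProps;
  }
  if (isObject(out.items)) {
    out.items = makeNullable(out.items);
  }
  return out;
}
```

Object and array nodes keep their original `type`, since the rule above is stated only for primitive types and enums.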

Request Changes

File Properties (All Endpoints)

Old	New
`file.fileUrl`	`file.url`
`file.fileId`	`file.id`
`file.fileName`	`file.name`
`rawText`	`file.text`

Creating an Extract Run

Old	New	Notes
`processorId`	`extractor.id`	Nested in object
`version`	`extractor.version`	Nested in object
`config` (with extractor)	`extractor.overrideConfig`	Moved inside extractor object. Top-level `config` is now only for inline extraction without an extractor
`sync: true`	(removed)	Use `POST /extract` (sync, for testing), or `createAndPoll()` / webhooks for production
`config.type: "EXTRACT"`	(removed)	Implicit from endpoint
`config.parser`	`parseConfig`	Renamed. Use `config.parseConfig` (inline) or `extractor.overrideConfig.parseConfig` (with extractor)

Breaking change: If you previously passed config alongside a processorId to override extractor settings, you must now pass it as extractor.overrideConfig inside the extractor object. The top-level config property is reserved for inline extraction (without an extractor).

Additionally, config.parser has been renamed to parseConfig everywhere — use config.parseConfig for inline config or extractor.overrideConfig.parseConfig when overriding an extractor.
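With `sync: true` gone, production code waits for a run through `createAndPoll()` or webhooks. To make the polling idea concrete, here is a hand-rolled sketch of a polling loop. It is not the SDK's implementation of `createAndPoll()`: `pollUntilDone`, `fetchRun`, `isDone`, and the option names are all hypothetical, and the only status value taken from this guide is `"PROCESSED"`.

```typescript
// Generic polling sketch. `fetchRun` stands in for whatever call retrieves a
// run (for example a wrapper around client.extractRuns.retrieve); it is a
// hypothetical parameter, not an SDK signature.
interface RunLike {
  id: string;
  status: string;
}

async function pollUntilDone<T extends RunLike>(
  fetchRun: () => Promise<T>,
  isDone: (run: T) => boolean,
  { intervalMs = 1000, maxAttempts = 60 } = {},
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const run = await fetchRun();
    if (isDone(run)) return run;
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Run did not finish after ${maxAttempts} attempts`);
}
```

The caller decides what "done" means, so the sketch does not need to know the full set of run statuses. In practice you would likely prefer `createAndPoll()` or webhooks over maintaining a loop like this yourself.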

Example: Create Request

Before (2025-04-21)

{
  "processorId": "dp_abc123",
  "version": "latest",
  "file": {
    "fileUrl": "https://example.com/invoice.pdf",
    "fileName": "invoice.pdf"
  },
  "sync": true,
  "metadata": { "customerId": "cust_123" }
}

Example: Overriding Extractor Config

If you previously passed config alongside processorId to override the extractor’s settings, this now moves inside the extractor object as overrideConfig:

Before (2025-04-21)

{
  "processorId": "dp_abc123",
  "config": {
    "schema": { "type": "object", "properties": { "vendorName": { "type": "string" } } },
    "parser": { "type": "SIMPLE" }
  },
  "file": { "fileUrl": "https://example.com/invoice.pdf" },
  "sync": true
}
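If you build request bodies in several places, the renames from the tables above can be applied in one function. The sketch below is a hypothetical adapter, not something the SDK provides: `migrateRequest`, `migrateConfig`, and the interface names are made up for illustration, and the interfaces list only the fields this guide mentions. It passes the processor ID through unchanged (this guide does not say whether an old `dp_` ID remains valid) and it does not make the schema nullable.

```typescript
// Hypothetical adapter from the 2025-04-21 request body to the 2026-02-09 one.
interface LegacyFile { fileUrl?: string; fileId?: string; fileName?: string }
interface LegacyConfig { type?: string; parser?: unknown; [key: string]: unknown }
interface LegacyRequest {
  processorId?: string;
  version?: string;
  config?: LegacyConfig;
  file?: LegacyFile;
  rawText?: string;
  sync?: boolean;
  metadata?: Record<string, unknown>;
}

interface NewFile { url?: string; id?: string; name?: string; text?: string }
interface NewExtractor { id: string; version?: string; overrideConfig?: Record<string, unknown> }
interface NewRequest {
  extractor?: NewExtractor;
  config?: Record<string, unknown>;
  file: NewFile;
  metadata?: Record<string, unknown>;
}

function migrateConfig(config: LegacyConfig): Record<string, unknown> {
  // Drop the now-implicit "type" and rename "parser" to "parseConfig".
  const { type: _type, parser, ...rest } = config;
  return parser === undefined ? rest : { ...rest, parseConfig: parser };
}

function migrateRequest(old: LegacyRequest): NewRequest {
  const file: NewFile = {};
  if (old.file?.fileUrl !== undefined) file.url = old.file.fileUrl;
  if (old.file?.fileId !== undefined) file.id = old.file.fileId;
  if (old.file?.fileName !== undefined) file.name = old.file.fileName;
  if (old.rawText !== undefined) file.text = old.rawText;

  const next: NewRequest = { file };
  const config = old.config ? migrateConfig(old.config) : undefined;

  if (old.processorId !== undefined) {
    // With an extractor, config becomes extractor.overrideConfig.
    const extractor: NewExtractor = { id: old.processorId };
    if (old.version !== undefined) extractor.version = old.version;
    if (config) extractor.overrideConfig = config;
    next.extractor = extractor;
  } else if (config) {
    // Without an extractor, config stays at the top level (inline extraction).
    next.config = config;
  }
  if (old.metadata !== undefined) next.metadata = old.metadata;
  // "sync" is dropped on purpose: the new request body has no such flag.
  return next;
}
```

Applied to the override example above, this yields an `extractor` object with `id` and `overrideConfig` (containing `schema` and `parseConfig`), a `file` with `url`, and no `sync` key.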

Response Changes

Response shape changes: Single object responses are now returned directly (no wrapper key), and list responses use { "object": "list", "data": [...] } format. See Simplified Response Shapes for details.
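For code that parses raw HTTP responses rather than using the SDK, the list shape can be checked with a small type guard. This is a minimal sketch: `ListResponse`, `isListResponse`, and `listItems` are hypothetical names, and the sketch assumes nothing about list responses beyond the `object` and `data` keys mentioned above.

```typescript
// Minimal shape for list endpoints; real responses may carry extra fields
// that this sketch ignores.
interface ListResponse<T> {
  object: "list";
  data: T[];
}

function isListResponse(body: unknown): body is ListResponse<unknown> {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as { object?: unknown; data?: unknown };
  return candidate.object === "list" && Array.isArray(candidate.data);
}

// Returns the items of a list response, or throws if the body has another shape.
function listItems(body: unknown): unknown[] {
  if (!isListResponse(body)) {
    throw new Error('Expected a { "object": "list", "data": [...] } response');
  }
  return body.data;
}
```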

Key Differences

Old	New
`success: true`	(removed) — Use HTTP status codes
`{ "processorRun": {...} }`	`{...}` (object returned directly)
`processorRun.processorId`	`extractRun.extractor.id`
`processorRun.processorVersionId`	`extractRun.extractorVersion.id`
`processorRun.files[]`	`extractRun.file` (single object)
`processorRun.url`	`extractRun.dashboardUrl`
Optional fields (may be absent)	Required but nullable (always present)

Example: Response

Before (2025-04-21)

{
  "success": true,
  "processorRun": {
    "object": "document_processor_run",
    "id": "dpr_abc123",
    "processorId": "dp_xyz789",
    "processorVersionId": "dpv_456",
    "processorName": "Invoice Extractor",
    "type": "EXTRACT",
    "status": "PROCESSED",
    "output": {
      "value": { "vendorName": "Acme Corp" },
      "metadata": { "vendorName": { "logprobsConfidence": 0.95 } }
    },
    "files": [{ "id": "file_123", "name": "invoice.pdf" }],
    "url": "https://dashboard.extend.ai/runs/dpr_abc123"
  }
}
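Because the old endpoint remains available during an incremental migration, you may want downstream code to see a single shape regardless of which endpoint produced the run. The sketch below maps an old response onto the new field names from the Key Differences table. It is a hypothetical adapter: `toExtractRun` and the interfaces are illustrative, cover only fields named in this guide, and leave out new fields such as `createdAt`, `updatedAt`, and `parseRunId`, which an old response cannot supply.

```typescript
// Hypothetical adapter, not part of the Extend SDK. Summary objects in real
// responses may carry more properties than the ones listed here.
interface RunOutput { value: unknown; metadata?: unknown }

interface LegacyRunResponse {
  success: boolean;
  processorRun: {
    id: string;
    processorId: string;
    processorVersionId: string;
    processorName: string;
    status: string;
    output: RunOutput;
    files: { id: string; name: string }[];
    url: string;
    failureReason?: string;
    failureMessage?: string;
  };
}

interface ExtractRunLike {
  object: "extract_run";
  id: string;
  extractor: { id: string; name: string } | null;
  extractorVersion: { id: string } | null;
  status: string;
  output: RunOutput | null;
  file: { id: string; name: string };
  dashboardUrl: string;
  failureReason: string | null;
  failureMessage: string | null;
}

function toExtractRun(legacy: LegacyRunResponse): ExtractRunLike {
  const run = legacy.processorRun; // unwrap: new responses have no wrapper key
  const [file] = run.files; // files[] becomes a single file object
  if (file === undefined) {
    throw new Error(`Run ${run.id} has no file to map`);
  }
  return {
    object: "extract_run",
    id: run.id,
    extractor: { id: run.processorId, name: run.processorName },
    extractorVersion: { id: run.processorVersionId },
    status: run.status,
    output: run.output,
    file,
    dashboardUrl: run.url,
    // Optional fields become always-present, nullable fields.
    failureReason: run.failureReason ?? null,
    failureMessage: run.failureMessage ?? null,
  };
}
```

Note that `success` has no counterpart in the result: with the new endpoints, failure is signalled by the HTTP status code instead.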

Citation Format Change

If you’re using citations (bounding boxes), the page field structure has changed:

const pageNumber = citation.page;  // number

SDK Method Reference

Old Method	New Method
`client.processorRun.create()`	`client.extractRuns.create()`
`client.processorRun.get()`	`client.extractRuns.retrieve()`
`client.processorRun.list()`	`client.extractRuns.list()`
`client.processorRun.delete()`	`client.extractRuns.delete()`
`client.processorRun.cancel()`	`client.extractRuns.cancel()`
—	`client.extract()` (new sync endpoint)
—	`client.extractRuns.createAndPoll()` (new)

Detailed Schema Changes

ExtractRun Schema

Property	Old (ProcessorRun)	New (ExtractRun)	Change
`object`	`"document_processor_run"`	`"extract_run"`	Value changed
`id`	Required `string`	Required `string`	No change
`processorId`	Required	—	Removed, see `extractor.id`
`processorVersionId`	Required	—	Removed, see `extractorVersion.id`
`processorName`	Required	—	Removed, see `extractor.name`
`extractor`	—	Required `ExtractorSummary | null`	New
`extractorVersion`	—	Required `ExtractorVersionSummary | null`	New
`type`	Required `"EXTRACT"`	—	Removed (implicit)
`status`	Required	Required	No change
`output`	Required	Required but nullable	Now nullable
`initialOutput`	Optional	Required but nullable	Now required
`reviewedOutput`	Optional	Required but nullable	Now required
`failureReason`	Optional	Required but nullable	Now required
`failureMessage`	Optional	Required but nullable	Now required
`metadata`	Optional	Required but nullable	Now required
`config`	`ExtractionConfig`	`ExtractConfig`	Type renamed, see ExtractConfig Schema
`files`	Required `File[]`	—	Removed, see `file`
`file`	—	Required `FileSummary`	New (replaces `files`)
`parseRunId`	—	Required `string | null`	New
`url`	Required	—	Removed, see `dashboardUrl`
`dashboardUrl`	—	Required	New (replaces `url`)
`usage`	Optional	Required but nullable	Now required
`createdAt`	—	Required	New
`updatedAt`	—	Required	New

ExtractConfig Schema

Property	Old (ExtractionConfig)	New (ExtractConfig)	Change
`type`	Required `"EXTRACT"`	—	Removed (implicit)
`baseProcessor`	Optional	Optional	No change
`schema`	Optional	Required	Now required
`fields`	Optional (deprecated)	—	Removed
`parser`	Optional	—	Renamed to `parseConfig`
`parseConfig`	—	Optional	New

ExtractRunSummary Schema (List Response)

Property	Old (ProcessorRunSummary)	New (ExtractRunSummary)	Change
`object`	—	`"extract_run"`	New
`processorId`	Required	—	Removed, see `extractor.id`
`processorName`	Required	—	Removed, see `extractor.name`
`extractor`	—	Required	New
`extractorVersion`	—	Required	New
`type`	Optional	—	Removed
`file`	—	Required	New
`dashboardUrl`	—	Required	New

Need Help?

If you encounter any issues while migrating, please contact our support team at support@extend.app.

Migration Guides

Guide	Migrating From	Migrating To
Overview	—	What’s new and how to upgrade
Extract Runs	`/processor_runs`	`/extract_runs` + `/extract`
Classify Runs	`/processor_runs`	`/classify_runs` + `/classify`
Split Runs	`/processor_runs`	`/split_runs` + `/split`
Parse Runs	`/parse`, `/parse/async`	`/parse_runs` + `/parse`
Edit Runs	`/edit`, `/edit/async`	`/edit_runs` + `/edit`
Extractors	`/processors`	`/extractors`
Classifiers	`/processors`	`/classifiers`
Splitters	`/processors`	`/splitters`
Files	`/files`	`/files` (breaking changes)
Evaluation Sets	evaluation endpoints	Updated evaluation endpoints
Workflow Runs	`/workflow_runs`	`/workflow_runs` (breaking changes)
Webhooks	`processor_run.*` events	`extract_run.*`, `classify_run.*`, etc.