Multifile Extraction
Multifile extraction lets you run a single extraction over a collection of files with a shared context. It is useful when a field’s value must be chosen from the best of several sources, or when values are derived from multiple files.
Example 1: A contract contains a number of amendments and those amendments supersede the original content.
Example 2: A user submits multiple pictures of a long receipt that should be extracted together.
How it works
Pass a package instead of a file on your request. The package.files array accepts up to 50 entries, each either a URL or an existing Extend file ID. The API ingests all files concurrently, runs extraction across the full corpus, and returns a single ExtractRun with a files array in the response (and file: null).
Quick start
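As a minimal sketch of the request shape described above, the Python below builds only the package portion of a request body. The helper name build_multifile_body is hypothetical, the URLs are placeholders, and the "at least one entry" check is an assumption; the endpoint, authentication, and any extractor fields are not covered here and are left out.

```python
import json

MAX_FILES = 50  # limit stated above for package.files


def build_multifile_body(urls=(), file_ids=()):
    """Build the `package` part of a request body (hypothetical helper).

    URLs are listed first, then file IDs; build the list by hand if you
    need a different submission order.
    """
    files = [{"url": u} for u in urls] + [{"id": i} for i in file_ids]
    if not files:
        # Assumption: an empty package is not meaningful.
        raise ValueError("package.files needs at least one entry")
    if len(files) > MAX_FILES:
        raise ValueError(
            f"package.files accepts up to {MAX_FILES} entries, got {len(files)}"
        )
    # `package` takes the place of `file`; other request fields are omitted.
    return {"package": {"files": files}}


body = build_multifile_body(
    urls=[
        "https://example.com/contract.pdf",     # placeholder URL
        "https://example.com/amendment-1.pdf",  # placeholder URL
    ],
)
print(json.dumps(body, indent=2))
```

Checking the 50-entry limit on the client side is optional, but it surfaces an oversized package before any request is sent.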
File inputs
Each entry in package.files can be either a URL (url) or the ID of an existing Extend file (id).
Raw text (text) and base64 inputs are not supported in multifile packages; use url or id.
You can mix URLs and file IDs in the same package:
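A mixed package might look like the sketch below. The URL and file IDs are placeholders, check_entry is a hypothetical helper, and the rule that each entry carries exactly one of url or id is an assumption inferred from the description above.

```python
ALLOWED_KEYS = {"url", "id"}  # the two input kinds named above


def check_entry(entry):
    """Return the entry if it has exactly one of `url` or `id`, else raise."""
    keys = set(entry)
    if len(keys) != 1 or not keys <= ALLOWED_KEYS:
        raise ValueError(
            f"each entry needs exactly one of url or id, got {sorted(keys)}"
        )
    return entry


# Placeholder URL and file IDs, not real resources.
files = [
    {"url": "https://example.com/receipt-part-1.jpg"},
    {"id": "file_abc123"},
    {"id": "file_def456"},
]
package = {"files": [check_entry(e) for e in files]}
```

An entry such as {"text": "..."} fails this check, which mirrors the restriction on raw text and base64 inputs.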
Response
A multifile run returns the same ExtractRun shape as a single-file run, with two differences:
file is null
files is an ordered array of FileSummary objects, one per input file in submission order
output.value is a single object covering the whole corpus — not one object per file. Design your extractor schema to describe what you want extracted across all files together.
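To illustrate reading such a response, here is a small sketch that works on an ExtractRun-shaped dict. The function name read_multifile_run is hypothetical, the sample is hand-written, and the fields inside each file summary and inside output.value are invented for the example; only file, files, and output.value come from the description above.

```python
def read_multifile_run(run):
    """Return (files, value) from an ExtractRun-shaped dict for a multifile run."""
    if run.get("file") is not None:
        raise ValueError("expected file to be null on a multifile run")
    files = run["files"]            # one summary per input, in submission order
    value = run["output"]["value"]  # one object for the whole corpus
    return files, value


# Hand-written sample for illustration only; the keys inside each file
# summary and inside value are hypothetical.
sample_run = {
    "file": None,
    "files": [{"id": "file_abc123"}, {"id": "file_def456"}],
    "output": {"value": {"total": "42.10"}},
}
files, value = read_multifile_run(sample_run)
print(len(files), value)
```

Because value describes the whole corpus, there is no per-file result to iterate over; the files array only records which inputs contributed to the run.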
Multifile vs batch
Multifile extraction and batch processing are complementary but different:
Use multifile when your extractor schema is designed to aggregate across a set of documents. Use batch when you just want to submit many independent files efficiently.

