CLI

The extend CLI gives you direct terminal access to Extend’s document operations: parse, extract, classify, split, edit, and run workflows. Reach for it for quick one-off jobs, batch processing, scripting, and CI/CD pipelines — no application code required.

Building a durable integration in your own codebase? Use one of the SDKs (Python, TypeScript, Java, Go) instead; they ship typed clients, polling helpers, and webhook verification.

Installation

$ brew install extend-hq/tap/extend

You can also download a signed binary from the releases page. Confirm the install with extend --version.

Authentication

Grab a key from the Developers page, export it, and verify it with a read-only command (it spends no credits):

$ export EXTEND_API_KEY="sk_xxx"

$ extend extractors list

Using an organization-scoped key? Also set EXTEND_WORKSPACE_ID="ws_xxx". On a non-default deployment? Set EXTEND_REGION (us | us2 | eu, default us).
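A script or CI job can check these variables before it calls the CLI. The sketch below is a hypothetical helper (check_extend_env is not an extend command); it only inspects the environment variables described above and never contacts Extend.

```shell
# Hypothetical pre-flight check; check_extend_env is not part of the CLI.
# It only validates the environment variables described above.
check_extend_env() {
    if [ -z "${EXTEND_API_KEY:-}" ]; then
        echo "EXTEND_API_KEY is not set" >&2
        return 1
    fi
    # EXTEND_REGION is optional; the documented default is us.
    case "${EXTEND_REGION:-us}" in
        us|us2|eu) ;;
        *)
            echo "EXTEND_REGION must be us, us2, or eu" >&2
            return 1
            ;;
    esac
    return 0
}
```

One way to use it is check_extend_env && extend extractors list, so the read-only verification only runs once the environment looks sane.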

Quick examples

# Parse a document to clean markdown
$ extend parse invoice.pdf > invoice.md

# Extract fields with an inline schema
$ extend extract invoice.pdf --config '{"baseProcessor":"extraction_performance","schema":{"type":"object","properties":{"total":{"type":["number","null"]}}}}'

# Extract, classify, or split using a saved processor
$ extend extract invoice.pdf --using ex_abc
$ extend classify invoice.pdf --using cl_abc
$ extend split combined.pdf --using spl_abc

# Run a workflow, then watch it to completion
$ RUN=$(extend run doc.pdf --using workflow_abc -o id)
$ extend runs watch "$RUN"

# Process a whole folder
$ extend extract batch invoices/*.pdf --using ex_abc

Every command takes an <input> — a local file path (uploaded for you), an Extend file_xxx ID, or a publicly reachable https:// URL — across 30+ file types. Results are written to stdout, so redirect them to a file (> result.json) for anything large.

Commands

parse

Convert a document into clean markdown chunks. No configuration required.

Flag              Description
--chunk-strategy  How to segment output: page, section, or document.

$ extend parse contract.pdf > contract.md
$ extend parse contract.pdf -o json

extract

Pull structured data into JSON. Pass an inline --config for prototyping, or reference a saved extractor with --using and tweak a single run with --patch.

# Inline config (a baseProcessor plus a JSON schema)
$ extend extract invoice.pdf --config "$(cat config.json)"

# Saved extractor, with optional per-run overrides
$ extend extract invoice.pdf --using ex_abc
$ extend extract invoice.pdf --using ex_abc --patch tweaks.json

Schemas are validated strictly, and every field must be nullable (for example, "type": ["string", "null"]). Keep your schema in a file so you can fix validation errors one at a time, and see the schema reference for custom field types like dates and currency.
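For illustration, an inline config kept in a file might look like the sketch below. The baseProcessor value and the nullable type arrays follow the quick examples on this page; the field names invoice_id and total are placeholders, and the write_config helper is hypothetical.

```shell
# Hypothetical helper: writes a starter config to the given path.
# Field names are placeholders; every field type includes "null",
# matching the nullable requirement described above.
write_config() {
    cat > "$1" <<'EOF'
{
  "baseProcessor": "extraction_performance",
  "schema": {
    "type": "object",
    "properties": {
      "invoice_id": { "type": ["string", "null"] },
      "total": { "type": ["number", "null"] }
    }
  }
}
EOF
}

write_config config.json
```

The resulting file can then be passed inline with extend extract invoice.pdf --config "$(cat config.json)".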

classify

Label a document’s type using a saved classifier.

$ extend classify document.pdf --using cl_abc

split

Break a multi-document bundle into segments using a saved splitter.

$ extend split combined.pdf --using spl_abc

edit

Fill a PDF form’s fields and get back a filled PDF. Use natural-language --instructions for quick fills, or scaffold a schema for repeatable, structured fills.

# Natural-language fill
$ extend edit form.pdf --instructions "name is Acme Corp; date is 2026-04-15"

# Structured fill: scaffold a schema, populate its values, then apply it
$ extend edit schema generate form.pdf > schema.json
$ extend edit form.pdf --schema schema.json --output-file filled.pdf

run

Run a saved workflow — a multi-step pipeline that chains the operations above.

$ RUN=$(extend run doc.pdf --using workflow_abc -o id)
$ extend runs watch "$RUN"
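In a script, the same two steps can be wrapped so that a failed submission stops before the watch step. This is a minimal sketch: run_and_watch is a hypothetical function name, and it uses only the run, runs watch, --using, and -o id forms shown on this page.

```shell
# Hypothetical wrapper around the two commands above.
# Usage: run_and_watch <input> <workflow-id>
run_and_watch() {
    # -o id prints only the run ID, so it can be captured directly.
    run_id=$(extend run "$1" --using "$2" -o id) || return 1
    if [ -z "$run_id" ]; then
        echo "no run ID returned for $1" >&2
        return 1
    fi
    extend runs watch "$run_id"
}
```

Whether extend runs watch exits non-zero when a run fails is not stated here, so check extend runs watch --help before relying on the wrapper's exit code.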

batch

Add batch to any processing command to run across many files at once.

$ extend parse batch ./docs/*.pdf
$ extend extract batch invoices/*.pdf --using ex_abc --files-from list.txt
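The list passed to --files-from can be generated rather than maintained by hand. The sketch below assumes the file holds one path per line, which this page does not confirm; build_file_list is a hypothetical helper, so check extend extract batch --help for the expected format.

```shell
# Hypothetical helper: writes every PDF under a directory to a list file.
# Assumption: --files-from expects one path per line.
# Usage: build_file_list <directory> <list-file>
build_file_list() {
    find "$1" -type f -name '*.pdf' | sort > "$2"
}
```

For example, build_file_list invoices list.txt produces a list.txt that could be supplied to the --files-from flag shown above.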

Managing resources

Beyond processing, the CLI manages the resources behind your runs:

runs get | list | watch | cancel | delete | update
files upload | list | get | delete | download
extractors list | get | create | update | versions ... # also classifiers, splitters, workflows
evaluations list | get | create (items, runs as subcommands)
webhooks endpoints | subscriptions | verify

Create a saved extractor without leaving the terminal:

$ echo '{"config":{"baseProcessor":"extraction_performance","schema":{"type":"object","properties":{"invoice_id":{"type":["string","null"]}}}}}' \
    | extend extractors create --name "Invoices" --from-file - -o id
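Because -o id prints only the new extractor's ID, the create step can feed straight into a first extraction. The following is a sketch under that assumption; create_and_extract is a hypothetical function, and it reads the config from stdin with --from-file - exactly as the command above does.

```shell
# Hypothetical wrapper: creates an extractor from a config file and
# immediately runs it on one document.
# Usage: create_and_extract <name> <config-file> <input>
create_and_extract() {
    # --from-file - reads the config from stdin; -o id prints only the new ID.
    extractor_id=$(extend extractors create --name "$1" --from-file - -o id < "$2") || return 1
    extend extract "$3" --using "$extractor_id"
}
```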

Run extend <command> --help for the full flag list on any command, and see the CLI on GitHub for the complete reference.

Output and scripting

  • -o <format>: json, yaml, raw, id, table, or markdown.
  • --jq '<expr>': filter structured results with the built-in jq, e.g. --jq '.output.value.invoice_id' -o raw (no separate jq install needed).

Data goes to stdout and status to stderr, so you can pipe results cleanly. The CLI honors NO_COLOR and CLICOLOR_FORCE.
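The stdout/stderr split means a script can keep results and status messages in separate files, or pull out a single field. Both functions below are hypothetical helpers that use only flags shown on this page; the .output.value.invoice_id path is the example path from the list above and only applies if your schema defines invoice_id.

```shell
# Hypothetical wrapper: keeps the result and the status messages apart.
# Usage: extract_to_file <input> <extractor-id> <result-file> <log-file>
extract_to_file() {
    # stdout carries the data, stderr carries status, so each gets its own file.
    extend extract "$1" --using "$2" -o json > "$3" 2> "$4"
}

# Hypothetical helper: prints one field using the built-in jq filter.
# Usage: invoice_id_of <input> <extractor-id>
invoice_id_of() {
    extend extract "$1" --using "$2" --jq '.output.value.invoice_id' -o raw
}
```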

Agent skill

The CLI can describe itself to coding agents as an installable skill — a SKILL.md (following the agentskills.io standard) that teaches harnesses like Claude Code, Codex, Cursor, OpenCode, and Goose how to drive extend correctly. Install it to the cross-client default path:

$ extend skill install

This writes ~/.agents/skills/extend/SKILL.md. Claude Code reads from ~/.claude/skills/ instead, so point --target there:

$ extend skill install --target ~/.claude/skills/extend/SKILL.md

Or print it to stdout and redirect it anywhere with extend skill > SKILL.md. The skill is generated from the CLI’s own command tree, so re-run the install after upgrading the CLI. For the full agent setup, including the platform context file for writing SDK code, see the Agent Quickstart.

The CLI is under active development — commands, flags, and output formats may change. If you depend on a specific output shape (for example in CI), pin to a specific release tag.
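A CI job that pins a release can also fail fast when the installed binary differs from the pinned one. The guard below is a sketch: require_extend_version is a hypothetical function, and it assumes extend --version prints the version string somewhere in its output, since the exact format is not documented here. The match is a loose substring test, so pass the full version string.

```shell
# Hypothetical CI guard: fails when the installed CLI does not report the
# expected version. Assumption: `extend --version` includes the version
# string in its output; the exact format is not documented on this page.
# Usage: require_extend_version <expected-version>
require_extend_version() {
    installed=$(extend --version 2>&1) || return 1
    case "$installed" in
        *"$1"*) return 0 ;;
        *)
            echo "expected extend $1, found: $installed" >&2
            return 1
            ;;
    esac
}
```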

Next steps