Configuring Workflows via API

Workflows can be configured programmatically by passing a steps array when you create a workflow, update its draft, or create a version.

Each step has a name, a type, an optional config, and a next array that defines where documents flow after the step completes. The same shape is used in request bodies and in the workflow version responses returned by the API.
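
Because every next entry refers to another step by name, a quick local check for misspelled targets can be useful before sending a request. The sketch below is a hypothetical client-side helper (findDanglingTargets is not part of the API) and assumes only the step shape described above.

```javascript
// Hypothetical helper: list next[] targets that do not match any step name.
function findDanglingTargets(steps) {
  const names = new Set(steps.map((s) => s.name));
  const dangling = [];
  for (const step of steps) {
    // next is optional, so treat a missing array as "no outgoing edges".
    for (const edge of step.next ?? []) {
      if (!names.has(edge.step)) {
        dangling.push({ from: step.name, to: edge.step });
      }
    }
  }
  return dangling;
}

const steps = [
  { name: "trigger", type: "TRIGGER", next: [{ step: "parse" }] },
  { name: "parse", type: "PARSE", next: [{ step: "extract" }] }, // "extract" is not defined below
  { name: "review", type: "HUMAN_REVIEW" },
];

console.log(findDanglingTargets(steps));
// → [ { from: "parse", to: "extract" } ]
```

How the API itself reports an unknown target is not covered here; this only catches the mistake earlier.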

Quick Start

The simplest useful workflow extracts structured data from a document:

trigger → parse → extract → review

{
  "name": "Invoice Processing",
  "steps": [
    {
      "name": "trigger",
      "type": "TRIGGER",
      "next": [{ "step": "parse" }]
    },
    {
      "name": "parse",
      "type": "PARSE",
      "next": [{ "step": "extract" }]
    },
    {
      "name": "extract",
      "type": "EXTRACT",
      "config": {
        "extractor": { "id": "ex_abc123", "version": "latest" }
      },
      "next": [{ "step": "review" }]
    },
    {
      "name": "review",
      "type": "HUMAN_REVIEW"
    }
  ]
}

Every workflow starts with a TRIGGER step followed by a PARSE step. After parsing, you can chain any combination of processing, branching, and validation steps.
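
The entry rule can be checked locally before a steps array is submitted. This is a minimal sketch, assuming the rule means exactly one TRIGGER step whose single next entry points at a PARSE step; checkEntry is a hypothetical helper name, not an API function.

```javascript
// Hypothetical helper: returns null when the entry rule holds, otherwise a message.
function checkEntry(steps) {
  const byName = new Map(steps.map((s) => [s.name, s]));
  const triggers = steps.filter((s) => s.type === "TRIGGER");
  if (triggers.length !== 1) {
    return `expected exactly one TRIGGER step, found ${triggers.length}`;
  }
  const next = triggers[0].next ?? [];
  if (next.length !== 1) {
    return `TRIGGER must route to exactly one step, found ${next.length}`;
  }
  const target = byName.get(next[0].step);
  if (!target || target.type !== "PARSE") {
    return `TRIGGER must route to a PARSE step, got ${target ? target.type : "an unknown step"}`;
  }
  return null;
}

const valid = [
  { name: "trigger", type: "TRIGGER", next: [{ step: "parse" }] },
  { name: "parse", type: "PARSE", next: [{ step: "review" }] },
  { name: "review", type: "HUMAN_REVIEW" },
];
const skipsParse = [
  { name: "trigger", type: "TRIGGER", next: [{ step: "review" }] },
  { name: "review", type: "HUMAN_REVIEW" },
];

console.log(checkEntry(valid)); // → null
console.log(checkEntry(skipsParse)); // → "TRIGGER must route to a PARSE step, got HUMAN_REVIEW"
```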

Key Concepts

Routing

Each step's next array defines where documents flow. For most step types, you only need to specify the target step:

"next": [{ "step": "extract" }]

For branching step types, each next entry includes a routing field specific to the step type:

// CLASSIFY or SPLIT: use classificationId
"next": [
  { "step": "extract_invoice", "classificationId": "cls_invoice" },
  { "step": "extract_receipt", "classificationId": "cls_receipt" }
]

// CONDITIONAL: use conditionId
"next": [
  { "step": "review", "conditionId": "high_value" },
  { "step": "webhook", "conditionId": "default_path" }
]

// RULE_VALIDATION: use result
"next": [
  { "step": "webhook", "result": "pass" },
  { "step": "review", "result": "fail" }
]
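
The mapping from step type to routing field can be written down as a small lookup table. The sketch below flags next entries on a branching step that omit the field for that type; ROUTING_FIELD and missingRoutingFields are hypothetical names used for illustration only.

```javascript
// Routing field expected on each next entry, keyed by branching step type.
const ROUTING_FIELD = {
  CLASSIFY: "classificationId",
  SPLIT: "classificationId",
  CONDITIONAL: "conditionId",
  RULE_VALIDATION: "result",
};

// Hypothetical helper: describe next entries that lack the routing field.
function missingRoutingFields(step) {
  const field = ROUTING_FIELD[step.type];
  if (!field) return []; // non-branching steps only need a target step
  return (step.next ?? [])
    .filter((edge) => edge[field] === undefined)
    .map((edge) => `${step.name} -> ${edge.step}: missing ${field}`);
}

const validate = {
  name: "validate",
  type: "RULE_VALIDATION",
  next: [{ step: "webhook", result: "pass" }, { step: "review" }],
};

console.log(missingRoutingFields(validate));
// → [ "validate -> review: missing result" ]
```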

Workflow Patterns

Linear Extraction

The simplest pattern: every step has exactly one downstream step.

trigger → parse → extract → webhook

[
  { "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] },
  { "name": "parse", "type": "PARSE", "next": [{ "step": "extract" }] },
  {
    "name": "extract",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_abc123", "version": "latest" } },
    "next": [{ "step": "webhook" }]
  },
  { "name": "webhook", "type": "WEBHOOK_RESPONSE" }
]
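
For a linear workflow the next arrays follow mechanically from the step order, so they can be generated rather than typed by hand. The sketch below uses a hypothetical helper, chainSteps, that links each step to the one after it and leaves the last step without next.

```javascript
// Hypothetical helper: link each step to the following one, in order.
function chainSteps(steps) {
  return steps.map((step, i) => {
    const following = steps[i + 1];
    // The last step is terminal, so it gets no next array.
    return following
      ? { ...step, next: [{ step: following.name }] }
      : { ...step };
  });
}

const linear = chainSteps([
  { name: "trigger", type: "TRIGGER" },
  { name: "parse", type: "PARSE" },
  {
    name: "extract",
    type: "EXTRACT",
    config: { extractor: { id: "ex_abc123", version: "latest" } },
  },
  { name: "webhook", type: "WEBHOOK_RESPONSE" },
]);

console.log(JSON.stringify(linear, null, 2));
```

The printed array has the same shape as the Linear Extraction example above.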

Classify and Route

Use a CLASSIFY step to route documents to different extractors based on document type. Each next entry's classificationId must match a classification ID from the classifier's config.

                            ┌─ cls_invoice ─→ extract_invoice ─┐
trigger → parse → classify ─┼─ cls_receipt ─→ extract_receipt ─┼→ review → webhook
                            └─ cls_other ──→───────────────────┘

First, your classifier defines classifications with stable IDs:

const classifierConfig = {
  classifications: [
    { id: "cls_invoice", type: "invoice", description: "Invoice documents" },
    { id: "cls_receipt", type: "receipt", description: "Receipt documents" },
    { id: "cls_other", type: "other", description: "Other documents" },
  ],
};

Then the workflow step uses those IDs as classificationId values:

[
  { "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] },
  { "name": "parse", "type": "PARSE", "next": [{ "step": "classify" }] },
  {
    "name": "classify",
    "type": "CLASSIFY",
    "config": {
      "classifier": { "id": "cl_abc123", "version": "0.1" }
    },
    "next": [
      { "step": "extract_invoice", "classificationId": "cls_invoice" },
      { "step": "extract_receipt", "classificationId": "cls_receipt" },
      { "step": "review", "classificationId": "cls_other" }
    ]
  },
  {
    "name": "extract_invoice",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_invoice456", "version": "1.0" } },
    "next": [{ "step": "review" }]
  },
  {
    "name": "extract_receipt",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_receipt789", "version": "1.0" } },
    "next": [{ "step": "review" }]
  },
  { "name": "review", "type": "HUMAN_REVIEW", "next": [{ "step": "webhook" }] },
  { "name": "webhook", "type": "WEBHOOK_RESPONSE" }
]

Conditions use classification IDs (e.g. "cls_invoice"), not type strings (e.g. "invoice"). IDs are stable across renames: if you rename a classification type from "invoice" to "billing_invoice", the ID stays the same and routing continues to work.
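
A common slip is to route on the type string instead of the ID. The sketch below compares a step's classificationId values against a classifier config of the shape shown above; unknownClassificationIds is a hypothetical helper, and the config is assumed to be available locally.

```javascript
// Hypothetical helper: classificationId values that are not IDs in the config.
function unknownClassificationIds(classifierConfig, step) {
  const known = new Set(classifierConfig.classifications.map((c) => c.id));
  return (step.next ?? [])
    .map((edge) => edge.classificationId)
    .filter((id) => !known.has(id));
}

const config = {
  classifications: [
    { id: "cls_invoice", type: "invoice", description: "Invoice documents" },
    { id: "cls_receipt", type: "receipt", description: "Receipt documents" },
  ],
};

const classify = {
  name: "classify",
  type: "CLASSIFY",
  next: [
    { step: "extract_invoice", classificationId: "cls_invoice" },
    { step: "extract_receipt", classificationId: "receipt" }, // type string, not an ID
  ],
};

console.log(unknownClassificationIds(config, classify));
// → [ "receipt" ]
```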

Split and Route

Use a SPLIT step to break a multi-document file into individual sub-documents and route each one by type. The same ID-based routing rules apply as for CLASSIFY.

                         ┌─ cls_invoice ─→ extract_invoice ─┐
trigger → parse → split ─┼─ cls_receipt ─→ extract_receipt ─┼→ collect → webhook
                         └─ cls_other ──→ review ───────────┘

[
  { "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] },
  { "name": "parse", "type": "PARSE", "next": [{ "step": "split" }] },
  {
    "name": "split",
    "type": "SPLIT",
    "config": {
      "splitter": { "id": "spl_abc123", "version": "0.1" }
    },
    "next": [
      { "step": "extract_invoice", "classificationId": "cls_invoice" },
      { "step": "extract_receipt", "classificationId": "cls_receipt" },
      { "step": "review", "classificationId": "cls_other" }
    ]
  },
  {
    "name": "extract_invoice",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_invoice456", "version": "1.0" } },
    "next": [{ "step": "collect" }]
  },
  {
    "name": "extract_receipt",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_receipt789", "version": "1.0" } },
    "next": [{ "step": "collect" }]
  },
  { "name": "review", "type": "HUMAN_REVIEW", "next": [{ "step": "collect" }] },
  { "name": "collect", "type": "COLLECT", "next": [{ "step": "webhook" }] },
  { "name": "webhook", "type": "WEBHOOK_RESPONSE" }
]

Conditional Logic

Use a CONDITIONAL step to route based on extracted data values. Each condition has an id that is referenced by next[].conditionId.

trigger → parse → extract → route_total ─┬─ high_value ──→ review → webhook
                                         └─ default_path → webhook

[
  { "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] },
  { "name": "parse", "type": "PARSE", "next": [{ "step": "extract" }] },
  {
    "name": "extract",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_abc123", "version": "latest" } },
    "next": [{ "step": "route_total" }]
  },
  {
    "name": "route_total",
    "type": "CONDITIONAL",
    "config": {
      "conditions": [
        {
          "id": "high_value",
          "type": "IF",
          "operation": "GTE",
          "leftOperand": "{{ extract.total }}",
          "rightOperand": "10000"
        },
        { "id": "default_path", "type": "ELSE" }
      ]
    },
    "next": [
      { "step": "review", "conditionId": "high_value" },
      { "step": "webhook", "conditionId": "default_path" }
    ]
  },
  { "name": "review", "type": "HUMAN_REVIEW", "next": [{ "step": "webhook" }] },
  { "name": "webhook", "type": "WEBHOOK_RESPONSE" }
]

See Formulas for the expression language used in leftOperand and rightOperand.
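
Since conditions and next entries are linked only by ID, it can help to cross-check the two lists. The sketch below reports conditionId values that match no condition and conditions that no next entry routes; checkConditionRouting is a hypothetical helper, and whether the API rejects an unrouted condition is not stated here, so treat that half of the report as informational.

```javascript
// Hypothetical helper: cross-check config.conditions[].id against next[].conditionId.
function checkConditionRouting(step) {
  const defined = (step.config?.conditions ?? []).map((c) => c.id);
  const routed = (step.next ?? []).map((edge) => edge.conditionId);
  return {
    unknown: routed.filter((id) => !defined.includes(id)), // routed but never defined
    unrouted: defined.filter((id) => !routed.includes(id)), // defined but never routed
  };
}

const routeTotal = {
  name: "route_total",
  type: "CONDITIONAL",
  config: {
    conditions: [
      { id: "high_value", type: "IF", operation: "GTE", leftOperand: "{{ extract.total }}", rightOperand: "10000" },
      { id: "default_path", type: "ELSE" },
    ],
  },
  next: [
    { step: "review", conditionId: "high_value" },
    { step: "webhook", conditionId: "default" }, // typo for "default_path"
  ],
};

console.log(checkConditionRouting(routeTotal));
// → { unknown: [ "default" ], unrouted: [ "default_path" ] }
```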

Validation

Use RULE_VALIDATION to check extracted data against business rules and branch on the result.

trigger → parse → extract → validate ─┬─ pass → webhook
                                      └─ fail → review → webhook

[
  { "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] },
  { "name": "parse", "type": "PARSE", "next": [{ "step": "extract" }] },
  {
    "name": "extract",
    "type": "EXTRACT",
    "config": { "extractor": { "id": "ex_abc123", "version": "latest" } },
    "next": [{ "step": "validate" }]
  },
  {
    "name": "validate",
    "type": "RULE_VALIDATION",
    "config": {
      "rules": [
        {
          "name": "total_matches_sum",
          "formula": "extraction1.total = extraction1.subtotal + extraction1.tax",
          "description": "Checks invoice math"
        }
      ]
    },
    "next": [
      { "step": "webhook", "result": "pass" },
      { "step": "review", "result": "fail" }
    ]
  },
  { "name": "review", "type": "HUMAN_REVIEW", "next": [{ "step": "webhook" }] },
  { "name": "webhook", "type": "WEBHOOK_RESPONSE" }
]

See Formulas for the rule expression language and Validation Step for the UI guide.

Step Reference

All processor references (extractor, classifier, splitter) require an explicit version field.

CLASSIFY and SPLIT steps do not support "latest": you must pin to a specific semver version (e.g. "0.1") or "draft". This is because classification IDs used for routing are tied to a specific version's config. If a new version were published with different classifications, routing would break silently.

Step type            "latest"  "draft"  Semver (e.g. "1.0")
EXTRACT              Yes       Yes      Yes
CONDITIONAL_EXTRACT  Yes       Yes      Yes
CLASSIFY             No        Yes      Yes
SPLIT                No        Yes      Yes
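
The table translates directly into a small lookup. The sketch below is a hypothetical client-side check (isVersionAllowed is not an API function); the pattern used to recognize semver strings is an assumption that accepts forms like "0.1" and "1.0.0", and the API may accept other forms.

```javascript
// Whether "latest" is accepted, per step type, following the table above.
const VERSION_RULES = {
  EXTRACT: { latest: true },
  CONDITIONAL_EXTRACT: { latest: true },
  CLASSIFY: { latest: false },
  SPLIT: { latest: false },
};

// Assumption: two or three dot-separated numbers count as a semver string.
const SEMVER_LIKE = /^\d+\.\d+(\.\d+)?$/;

// Hypothetical helper: is this version string allowed for this step type?
function isVersionAllowed(stepType, version) {
  const rule = VERSION_RULES[stepType];
  if (!rule) throw new Error(`no processor version rule for ${stepType}`);
  if (version === "draft") return true;
  if (version === "latest") return rule.latest;
  return SEMVER_LIKE.test(version);
}

console.log(isVersionAllowed("EXTRACT", "latest")); // → true
console.log(isVersionAllowed("CLASSIFY", "latest")); // → false
console.log(isVersionAllowed("CLASSIFY", "0.1")); // → true
console.log(isVersionAllowed("SPLIT", "draft")); // → true
```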

Trigger

The single entry point for every workflow. Must route to exactly one PARSE step.

{ "name": "trigger", "type": "TRIGGER", "next": [{ "step": "parse" }] }

Parse

Converts the uploaded file into structured content (OCR, text extraction). Must appear immediately after the trigger.

Optionally configure parsing behavior with parseConfig. See Parse Configuration Options.

{
  "name": "parse",
  "type": "PARSE",
  "config": {
    "parseConfig": {
      "target": "markdown",
      "chunkingStrategy": { "type": "page" }
    }
  },
  "next": [{ "step": "extract" }]
}

Extract

Runs a published extractor against parsed content. A version is required: "latest", "draft", or a semver string. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy.

{
  "name": "extract",
  "type": "EXTRACT",
  "config": {
    "extractor": { "id": "ex_abc123", "version": "latest" }
  },
  "next": [{ "step": "review" }]
}

Classify

Routes documents to different downstream steps based on classification. Conditions must reference classification IDs, not type strings. Requires a pinned version; "latest" is not allowed. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy.

See the Classify and Route pattern above for a complete example.

Split

Splits a multi-document file into sub-documents and routes each one. The same ID-based routing and pinned-version rules apply as for CLASSIFY. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy.

See the Split and Route pattern above for a complete example.

Merge Extract

Combines outputs from multiple upstream extract steps. Use mergeOrder to control how overlapping fields are prioritized.

{
  "name": "merge",
  "type": "MERGE_EXTRACT",
  "config": { "mergeOrder": "confidence" },
  "next": [{ "step": "webhook" }]
}

Conditional

Routes based on extracted data values using if/else logic. See the Conditional Logic pattern above.

For the UI-based version of this step, see Conditional Steps.

Conditional Extract

Chooses which extractor to run based on formula conditions. Each rule pairs a formula with an extractor reference. The last rule must have formula: "TRUE" as a default catch-all to prevent runtime failures when no other rule matches. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy.

{
  "name": "route_extractor",
  "type": "CONDITIONAL_EXTRACT",
  "config": {
    "rules": [
      {
        "name": "cigna_provider",
        "formula": "metadata.provider_name = \"cigna\"",
        "extractor": { "id": "ex_cigna", "version": "latest" }
      },
      {
        "name": "fallback",
        "formula": "TRUE",
        "extractor": { "id": "ex_generic", "version": "latest" }
      }
    ]
  },
  "next": [{ "step": "validate" }]
}

See Formulas for the expression language and Conditional Extraction Step for the UI guide.
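
The catch-all requirement is easy to verify locally. The sketch below checks that the last rule of a CONDITIONAL_EXTRACT step has the literal formula "TRUE"; hasCatchAllRule is a hypothetical helper, and it assumes the comparison is an exact, case-sensitive string match.

```javascript
// Hypothetical helper: true when the final rule is the "TRUE" catch-all.
function hasCatchAllRule(step) {
  const rules = step.config?.rules ?? [];
  const last = rules[rules.length - 1];
  return last !== undefined && last.formula === "TRUE";
}

const withFallback = {
  name: "route_extractor",
  type: "CONDITIONAL_EXTRACT",
  config: {
    rules: [
      { name: "cigna_provider", formula: 'metadata.provider_name = "cigna"', extractor: { id: "ex_cigna", version: "latest" } },
      { name: "fallback", formula: "TRUE", extractor: { id: "ex_generic", version: "latest" } },
    ],
  },
};

// Same rules, but with the catch-all placed first instead of last.
const wrongOrder = {
  ...withFallback,
  config: { rules: [...withFallback.config.rules].reverse() },
};

console.log(hasCatchAllRule(withFallback)); // → true
console.log(hasCatchAllRule(wrongOrder)); // → false
```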

Rule Validation

Checks extracted data against boolean rules. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy. See the Validation pattern above for a complete example.

External Data Validation

Sends extraction data to an external HTTP endpoint for validation. The step can be created without config, but next cannot be set until config is provided, and config is required before deploy.

{
  "name": "external_validate",
  "type": "EXTERNAL_DATA_VALIDATION",
  "config": {
    "requestOptions": {
      "url": "https://api.example.com/validate",
      "method": "POST",
      "headers": { "x-api-key": "secret" },
      "contentType": "application/json"
    },
    "failureBehavior": "EXIT"
  },
  "next": [{ "step": "review" }]
}

See External Data Validation Step for more context.
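
The EXTRACT, CLASSIFY, SPLIT, CONDITIONAL_EXTRACT, RULE_VALIDATION, and EXTERNAL_DATA_VALIDATION sections above all state the same rule: the step can be created without config, but next cannot be set until config is provided. The sketch below finds steps in a draft that break it. stepsRoutedWithoutConfig is a hypothetical helper, and treating an empty next array as "not set" is an assumption.

```javascript
// Step types documented above as "next cannot be set until config is provided".
const CONFIG_BEFORE_NEXT = new Set([
  "EXTRACT",
  "CLASSIFY",
  "SPLIT",
  "CONDITIONAL_EXTRACT",
  "RULE_VALIDATION",
  "EXTERNAL_DATA_VALIDATION",
]);

// Hypothetical helper: names of steps that set next while config is still missing.
function stepsRoutedWithoutConfig(steps) {
  return steps
    .filter(
      (s) =>
        CONFIG_BEFORE_NEXT.has(s.type) &&
        s.config === undefined &&
        (s.next ?? []).length > 0
    )
    .map((s) => s.name);
}

const draft = [
  { name: "trigger", type: "TRIGGER", next: [{ step: "parse" }] },
  { name: "parse", type: "PARSE", next: [{ step: "extract" }] },
  { name: "extract", type: "EXTRACT", next: [{ step: "validate" }] }, // next set, no config yet
  { name: "validate", type: "RULE_VALIDATION" }, // placeholder: no config, no next
];

console.log(stepsRoutedWithoutConfig(draft));
// → [ "extract" ]
```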

Human Review

Pauses the workflow for manual review in the dashboard before continuing to downstream steps.

{ "name": "review", "type": "HUMAN_REVIEW", "next": [{ "step": "webhook" }] }

Collect

Joins multiple upstream branches before continuing. Use after CLASSIFY or SPLIT branches to wait for all parallel work to complete.

{ "name": "collect", "type": "COLLECT", "next": [{ "step": "webhook" }] }
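
To see where branches converge in a steps array, it is enough to count incoming edges per step. The sketch below does that with a hypothetical helper, incomingCounts. It only reports the counts; whether a given join point needs a COLLECT step depends on the branching step upstream and is not decided here.

```javascript
// Hypothetical helper: number of next entries pointing at each step name.
function incomingCounts(steps) {
  const counts = new Map(steps.map((s) => [s.name, 0]));
  for (const step of steps) {
    for (const edge of step.next ?? []) {
      counts.set(edge.step, (counts.get(edge.step) ?? 0) + 1);
    }
  }
  return counts;
}

// Abbreviated version of the Split and Route pattern (config omitted).
const splitWorkflow = [
  { name: "split", type: "SPLIT", next: [
    { step: "extract_invoice", classificationId: "cls_invoice" },
    { step: "extract_receipt", classificationId: "cls_receipt" },
    { step: "review", classificationId: "cls_other" },
  ] },
  { name: "extract_invoice", type: "EXTRACT", next: [{ step: "collect" }] },
  { name: "extract_receipt", type: "EXTRACT", next: [{ step: "collect" }] },
  { name: "review", type: "HUMAN_REVIEW", next: [{ step: "collect" }] },
  { name: "collect", type: "COLLECT", next: [{ step: "webhook" }] },
  { name: "webhook", type: "WEBHOOK_RESPONSE" },
];

const counts = incomingCounts(splitWorkflow);
console.log(counts.get("collect")); // → 3
console.log(counts.get("webhook")); // → 1
```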

File Conversion

Converts the file format before downstream processing. Use failureBehavior to control whether conversion failures stop the workflow.

{
  "name": "convert",
  "type": "FILE_CONVERSION",
  "config": { "failureBehavior": "CONTINUE" },
  "next": [{ "step": "parse" }]
}

Webhook Response

Terminal step that delivers results to your webhook endpoint. Must not have next.

{ "name": "webhook", "type": "WEBHOOK_RESPONSE" }
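
The terminal rule can be checked in one pass over the steps array. The sketch below uses a hypothetical helper, webhookStepsWithNext, and assumes that any next property on a WEBHOOK_RESPONSE step, even an empty array, counts as a violation.

```javascript
// Hypothetical helper: names of WEBHOOK_RESPONSE steps that carry a next property.
function webhookStepsWithNext(steps) {
  return steps
    .filter((s) => s.type === "WEBHOOK_RESPONSE" && "next" in s)
    .map((s) => s.name);
}

const workflow = [
  { name: "review", type: "HUMAN_REVIEW", next: [{ step: "webhook" }] },
  { name: "webhook", type: "WEBHOOK_RESPONSE" },
  { name: "notify", type: "WEBHOOK_RESPONSE", next: [{ step: "review" }] }, // not terminal
];

console.log(webhookStepsWithNext(workflow));
// → [ "notify" ]
```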