Configuration

The Classify API accepts a config object that controls how a document is categorized. The only required field is classifications — the set of categories the classifier chooses from. The rest are optional: classificationRules for natural-language guidance, baseProcessor and baseVersion to pick the model, advancedOptions for context, multimodal processing, memory, and page ranges, and parseConfig to tune how the document is parsed first.

You can pass the same config inline on a one-off /classify call or save it to a reusable Classifier and reference that classifier by id. Either way the configuration is identical.

For default values and the full schema, see the Create Classify Run API reference.

Prefer a UI? Extend Studio lets you configure a classifier visually and export the config JSON.
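
As a rough sketch of the config shape described in the opening paragraph, the fields can be written down as Python `TypedDict`s. The class names (`Classification`, `PageRange`, `AdvancedOptions`, `ClassifyConfig`) are hypothetical and only illustrate the structure; the field names and allowed values are taken from this page, and the API reference remains the authority on the schema.

```python
import json
from typing import Any, Literal, TypedDict


class Classification(TypedDict):
    id: str
    type: str
    description: str


class PageRange(TypedDict):
    start: int
    end: int


class AdvancedOptions(TypedDict, total=False):
    context: Literal["default", "max"]
    advancedMultimodalEnabled: bool
    memoryEnabled: bool
    pageRanges: list[PageRange]


class _RequiredConfig(TypedDict):
    # The only required field.
    classifications: list[Classification]


class ClassifyConfig(_RequiredConfig, total=False):
    # Everything below is optional.
    classificationRules: str
    baseProcessor: Literal["classification_performance", "classification_light"]
    baseVersion: str
    advancedOptions: AdvancedOptions
    parseConfig: dict[str, Any]  # see Parse Configuration for its fields


# A minimal config: only `classifications`, including the "other" catch-all.
minimal_config: ClassifyConfig = {
    "classifications": [
        {
            "id": "invoice",
            "type": "invoice",
            "description": "An invoice is a document that lists the items purchased and the total amount due.",
        },
        {"id": "other", "type": "other", "description": "Any other document type."},
    ]
}

print(json.dumps({"config": minimal_config}, indent=2))
```

Type checkers such as mypy or pyright can then flag a missing `classifications` key or a misspelled option before the request is sent.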

Classifications

`classifications`

Type: array (required)

The categories the classifier can choose from. Provide at least one classification, and give at least one of them the type "other" so it acts as a catch-all. Each classification must have a unique `id`.

Each entry has these fields:

| Field | Type | Description |
| --- | --- | --- |
| `id` | string | Unique identifier for the classification. Lowercase, underscore-separated format is recommended. |
| `type` | string | Type identifier for the classification, returned in the output. |
| `description` | string | A detailed description of the classification. This is your biggest lever on accuracy. |

{
  "config": {
    "classifications": [
      {
        "id": "invoice",
        "type": "invoice",
        "description": "An invoice is a document that lists the items purchased and the total amount due."
      },
      {
        "id": "bill_of_lading",
        "type": "bill_of_lading",
        "description": "A bill of lading documenting a shipment of goods."
      },
      {
        "id": "other",
        "type": "other",
        "description": "Any other document type."
      }
    ]
  }
}
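
The rules stated for `classifications` (at least one entry, at least one entry of type "other", unique ids) can be checked on the client before sending a request. The helper below is a minimal sketch; `validate_classifications` is a hypothetical name, not part of any SDK, and the API may enforce further constraints that are not listed here.

```python
def validate_classifications(classifications: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the stated rules are met."""
    problems = []

    # Rule 1: at least one classification.
    if not classifications:
        problems.append("at least one classification is required")

    # Rule 2: every id is unique.
    ids = [c.get("id") for c in classifications]
    duplicates = sorted({i for i in ids if i is not None and ids.count(i) > 1})
    if duplicates:
        problems.append(f"duplicate ids: {duplicates}")

    # Rule 3: at least one catch-all entry with type "other".
    if not any(c.get("type") == "other" for c in classifications):
        problems.append('no classification has type "other"')

    return problems


example = [
    {"id": "invoice", "type": "invoice", "description": "An invoice."},
    {"id": "other", "type": "other", "description": "Any other document type."},
]
print(validate_classifications(example))
```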

Classification rules

`classificationRules`

Type: string

Custom rules to guide the classification process in natural language. Useful for disambiguating categories that look alike or encoding business logic.

{
  "config": {
    "classificationRules": "Remember, when it comes to differentiating between invoices and purchase orders, the most important thing to look for is the date of the document."
  }
}

Base processor

`baseProcessor`

Type: "classification_performance" | "classification_light" (default: "classification_performance")

The base classification model to use.

| Processor | When to use |
| --- | --- |
| `classification_performance` | Highest accuracy (default). |
| `classification_light` | Faster, cheaper classification. |

{ "config": { "baseProcessor": "classification_performance" } }

`baseVersion`

Type: string

The version of the classification_performance or classification_light processor to use. If not provided, the latest stable version for the selected baseProcessor is used automatically. See the Classification Changelog.

{ "config": { "baseProcessor": "classification_performance", "baseVersion": "3.2.0" } }

Advanced options

These options are nested under `advancedOptions`.

`advancedOptions.context`

Type: "default" | "max" (default: "default")

How much of the document to pass to the model as context.

default — passes in only a limited amount of the document, and you’re only charged for the pages passed in. Typically the whole document isn’t needed to classify it — only the first few pages — so this keeps costs down.
max — passes in the full document. Use it when later pages are needed to tell categories apart.

{ "config": { "advancedOptions": { "context": "max" } } }

`advancedOptions.advancedMultimodalEnabled`

Type: boolean (default: true)

Enable advanced multimodal processing for better handling of visual elements during classification.

{ "config": { "advancedOptions": { "advancedMultimodalEnabled": true } } }

`advancedOptions.memoryEnabled`

Type: boolean (default: false)

Enable memory for enhanced processing by learning from past successful classifications. See Memory.

{ "config": { "advancedOptions": { "memoryEnabled": true } } }

`advancedOptions.pageRanges`

Type: Array<{ start: number, end: number }>

Restrict classification to specific page ranges.

{
  "config": {
    "advancedOptions": {
      "pageRanges": [{ "start": 1, "end": 3 }]
    }
  }
}
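
When page ranges come from application code, a small helper can build the `pageRanges` array and catch reversed ranges early. This is a sketch only: `page_ranges` is a hypothetical helper, and the check that `start` does not exceed `end` is an assumption about what the API accepts. This page does not state whether page numbers are 0- or 1-based or whether `end` is inclusive, so the helper does not enforce either.

```python
def page_ranges(*ranges: tuple[int, int]) -> list[dict[str, int]]:
    """Turn (start, end) pairs into the pageRanges structure."""
    result = []
    for start, end in ranges:
        # Assumption: a range whose start is after its end is a mistake.
        if start > end:
            raise ValueError(f"start {start} is after end {end}")
        result.append({"start": start, "end": end})
    return result


config = {"advancedOptions": {"pageRanges": page_ranges((1, 3), (10, 12))}}
print(config)
```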

Parse config

`parseConfig`

Type: object

Controls how the document is parsed before classification (target format, chunking, and block options). See Parse Configuration for the full set of options.

{
  "config": {
    "parseConfig": {
      "target": "markdown",
      "chunkingStrategy": { "type": "page" }
    }
  }
}

Using a saved classifier

A classifier is a kind of processor; see that page for how saving a configuration lets you version, evaluate, and optimize it.

To reuse a configuration, create a classifier and reference it by id, optionally overriding specific fields per run with `overrideConfig`:

Create a classifier — set up a new classifier with your configuration.
Update a classifier — modify an existing classifier’s configuration.
Run a classifier — execute a classifier, optionally with classifier.overrideConfig.

With overrideConfig, only the fields you provide override the classifier’s saved configuration — for example, you can pass only classificationRules without providing classifications.
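
The override behaviour described above can be pictured as a merge in which fields present in the override replace the saved ones and all other saved fields are kept. The sketch below models that locally; it is not the service's implementation. `effective_config` is a hypothetical helper, the top-level (non-recursive) merge is an assumption, since this page does not say how nested objects such as `advancedOptions` are combined, and the request body shape with `classifier.id` is inferred from the `classifier.overrideConfig` wording, with a placeholder id.

```python
import json


def effective_config(saved: dict, override: dict) -> dict:
    """Top-level merge: keys in `override` replace the same keys in `saved`."""
    return {**saved, **override}


saved_config = {
    "classifications": [
        {"id": "invoice", "type": "invoice", "description": "An invoice."},
        {"id": "other", "type": "other", "description": "Any other document type."},
    ],
    "baseProcessor": "classification_performance",
}

# Only classificationRules is supplied; classifications come from the saved config.
override_config = {"classificationRules": "Illustrative rule text for this run only."}

merged = effective_config(saved_config, override_config)
print(sorted(merged))

# Hypothetical run body: the classifier is referenced by id, with the override attached.
run_body = {"classifier": {"id": "<classifier-id>", "overrideConfig": override_config}}
print(json.dumps(run_body, indent=2))
```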

