Classification assigns a document to exactly one of the categories you define and returns the match as structured JSON. You describe the possible document types with classifications, and Extend returns the matched type, a confidence score, and reasoning behind the decision. Use it to route incoming documents (invoices vs. bills of lading vs. purchase orders), gate downstream processing, or branch a workflow based on document type.
We’ll classify a freight invoice into one of several logistics document types. For this quick start we’ve uploaded the file here.
Grab a key from the Developers page and store it as the EXTEND_API_KEY environment variable. If you’re using an SDK, see the installation instructions.
The /classify endpoint takes a file and a config with the classifications you want to choose between.
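As a minimal sketch of such a request, assuming a JSON body with a `file` reference and `config.classifications`: the base URL, the bearer-token header, the example file URL, and the helper names `build_classify_payload` and `classify` are illustrative assumptions, not confirmed API details. Check the API reference for the exact values.

```python
import json
import os
import urllib.request

# Assumption: placeholder base URL; use the one given in the API reference.
API_BASE = "https://api.extend.ai"

# Example categories for logistics documents. Each entry has an id, a type,
# and a description; the last entry is the "other" catch-all.
CLASSIFICATIONS = [
    {"id": "invoice", "type": "invoice",
     "description": "A bill issued by a carrier or vendor requesting payment."},
    {"id": "bill_of_lading", "type": "bill_of_lading",
     "description": "A shipping document listing goods, shipper, and consignee."},
    {"id": "purchase_order", "type": "purchase_order",
     "description": "A buyer's order for goods, issued before shipment."},
    {"id": "other", "type": "other",
     "description": "Any document that fits none of the categories above."},
]


def build_classify_payload(classifications, file_url=None, file_id=None):
    """Build the request body: one file reference plus config.classifications."""
    if (file_url is None) == (file_id is None):
        raise ValueError("pass exactly one of file_url or file_id")
    file_ref = {"url": file_url} if file_url is not None else {"id": file_id}
    return {"file": file_ref, "config": {"classifications": classifications}}


def classify(payload):
    """POST the payload to /classify (hypothetical auth header scheme)."""
    request = urllib.request.Request(
        f"{API_BASE}/classify",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['EXTEND_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder URL; substitute the document you want to classify.
payload = build_classify_payload(
    CLASSIFICATIONS, file_url="https://example.com/freight-invoice.pdf"
)
```

Calling `classify(payload)` sends the request. Building the payload separately makes it easy to swap `file_url` for `file_id` while keeping the same config.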
Want to classify your own document? Upload it first, then pass the returned file id instead of a url (reusing the same config).
After you run the code snippet above, you’ll see a response like this. Extend parses the document, picks the best-matching category, and returns an output with the matched classification, a confidence score, and insights explaining the decision.
For full request/response details, see the Create Classify Run API reference.
Branch your logic on the returned id, and use confidence to decide when to route a document to manual review. Match on id rather than type: the id is a stable identifier you control, while type and description are part of the prompt that steers the classification decision and may change as you tune accuracy.
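The routing advice above might look like the following sketch. It assumes the output is a dict with `id` and `confidence` keys and that confidence is on a 0 to 1 scale. The threshold value, the queue names, and the `route` helper are hypothetical.

```python
# Hypothetical threshold; tune it against your own evaluation data.
REVIEW_THRESHOLD = 0.8


def route(output, handlers, threshold=REVIEW_THRESHOLD):
    """Pick a destination from the classification id, or fall back to review.

    `output` is assumed to look like {"id": "...", "confidence": 0.93}.
    `handlers` maps classification ids you control to destination names.
    """
    confidence = output.get("confidence")
    # Missing or low confidence goes to a human instead of being guessed at.
    if confidence is None or confidence < threshold:
        return "manual_review"
    # Match on id, not type: ids stay stable while prompts are tuned.
    return handlers.get(output["id"], "manual_review")


handlers = {
    "invoice": "accounts_payable",
    "bill_of_lading": "shipping",
    "purchase_order": "procurement",
}
```

Unknown ids also fall through to manual review, so adding a new classification never silently drops documents.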
For the full output shape and shared types, see Response Format.
The example above calls the synchronous /classify endpoint. For large files and high-volume use cases, use the asynchronous /classify_runs endpoint instead.
See Async Processing for the full comparison, polling options, and webhook setup.
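If you choose polling, the loop could be sketched roughly as below. The status names (`PROCESSING`, `PROCESSED`, `FAILED`), the `status` field, and the `get_run` callable that fetches a run by id are all assumptions for illustration. Use the values documented in Async Processing.

```python
import time

# Assumed terminal status names; confirm them against the API reference.
TERMINAL_STATUSES = {"PROCESSED", "FAILED"}


def wait_for_run(get_run, run_id, interval_seconds=2.0, timeout_seconds=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll `get_run(run_id)` until the run reaches a terminal status.

    `get_run` is any callable returning a dict with a "status" key, for
    example a thin wrapper around a GET request for the run.
    """
    deadline = clock() + timeout_seconds
    while True:
        run = get_run(run_id)
        if run["status"] in TERMINAL_STATUSES:
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish in time")
        sleep(interval_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. For high volume, webhooks avoid polling altogether.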
The quick start runs with an inline config, which is perfect for getting started. To reuse a configuration across runs — and to version it, measure its accuracy, and optimize it — save it as a classifier, a kind of processor. Processors are the saved entities you iterate on in the dashboard, run evaluation sets against, and improve with Composer.
The quick start sends just file and config.classifications. To control how classification runs, pass more options inside config. Here are the most commonly used ones; for the full reference, see Configuration.
The classifications array is the heart of every classifier — the set of categories the model chooses from. Each entry needs a unique id, a type returned in the output, and a description (your biggest lever on accuracy). At least one classification must have the type "other" as a catch-all.
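The requirements above can be checked locally before sending a request. This is a sketch only: `validate_classifications` is a hypothetical helper, not part of any SDK, and it encodes just the rules stated here (unique id, a type, a description, and at least one "other" entry).

```python
def validate_classifications(classifications):
    """Return a list of problems; an empty list means the array looks valid."""
    errors = []
    seen_ids = set()
    for index, entry in enumerate(classifications):
        # Every entry needs a non-empty id, type, and description.
        for field in ("id", "type", "description"):
            if not entry.get(field):
                errors.append(f"entry {index}: missing {field}")
        entry_id = entry.get("id")
        if entry_id is not None and entry_id in seen_ids:
            errors.append(f"entry {index}: duplicate id {entry_id!r}")
        seen_ids.add(entry_id)
    # At least one catch-all entry with type "other" is required.
    if not any(entry.get("type") == "other" for entry in classifications):
        errors.append('no classification with type "other"')
    return errors


example = [
    {"id": "invoice", "type": "invoice",
     "description": "A bill requesting payment for freight services."},
    {"id": "other", "type": "other",
     "description": "Anything that is not a freight invoice."},
]
```

Running a check like this in CI catches a missing catch-all or a duplicated id before it reaches production traffic.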
Choose the processor based on your accuracy and latency needs.
Steer the model with plain-language rules — useful for disambiguating categories that look similar.
Because Classify runs Parse under the hood, you can tune how the document is parsed before classification with parseConfig.
For every option, including advanced options like context, multimodal processing, and memory, see the Configuration reference.