Async Processing
Every processing endpoint in Extend (extract, classify, split, parse, and edit) supports both synchronous and asynchronous modes. Workflows are async-only. The right mode depends on your use case.
Sync vs. Async
Sync endpoints have a 5-minute timeout. If processing takes longer, the request will fail. For production workloads, always use async endpoints.
Why Use Async in Production
Sync endpoints are convenient, but they don’t scale well:
- Large files, such as multi-page PDFs and complex Excel spreadsheets, often exceed the 5-minute timeout
- High volume means many blocked connections waiting for results
- Network issues can cause you to lose a result that already finished processing server-side
- Workflows are async-only and can take minutes to hours depending on complexity
Async endpoints solve these problems. You fire off a request, get an ID back immediately, and retrieve the result when it’s ready — either by polling or by receiving a webhook.
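The submit-then-retrieve contract can be illustrated with a small in-memory stand-in. This is only a sketch: `FakeRunService`, its method names, and the `PROCESSED` status are assumptions made for illustration and are not the Extend API.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class FakeRunService:
    """In-memory stand-in for an async endpoint (hypothetical, not the Extend API)."""

    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))
    _runs: dict = field(default_factory=dict)

    def create(self, file_name: str) -> str:
        # Returns an ID right away; no result exists yet.
        run_id = f"run_{next(self._ids)}"
        self._runs[run_id] = {"id": run_id, "file": file_name,
                              "status": "PROCESSING", "output": None}
        return run_id

    def finish(self, run_id: str, output: dict) -> None:
        # Simulates the server completing the work in the background.
        self._runs[run_id].update(status="PROCESSED", output=output)

    def get(self, run_id: str) -> dict:
        # The result stays retrievable by ID, so a dropped connection loses nothing.
        return dict(self._runs[run_id])
```

Because the caller only holds an ID, it can disconnect after `create` and call `get` again later, which is what makes the async mode robust to network interruptions.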
Polling with SDK Helpers
The SDKs provide createAndPoll / create_and_poll methods that handle polling for you automatically. They use a hybrid strategy: fast polling (every ~1 second) for the first 30 seconds, then gradual backoff up to 30-second intervals. The method returns when the run reaches a terminal state.
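To make the hybrid strategy concrete, here is a rough sketch of a delay schedule with the same shape: about 1 second between polls for the first 30 seconds, then a growing delay capped at 30 seconds. The function name and the backoff factor of 1.5 are assumptions; the SDKs' actual growth curve is not specified here.

```python
from typing import Iterator


def poll_delays(
    fast_interval: float = 1.0,   # ~1 second between polls at first
    fast_window: float = 30.0,    # length of the fast-polling phase
    backoff_factor: float = 1.5,  # assumed growth rate, not documented
    max_interval: float = 30.0,   # cap on the delay between polls
) -> Iterator[float]:
    """Yield sleep durations: fixed fast polls first, then capped backoff."""
    elapsed = 0.0
    delay = fast_interval
    while True:
        if elapsed >= fast_window:
            # Past the fast window: grow the delay, but never beyond the cap.
            delay = min(delay * backoff_factor, max_interval)
        yield delay
        elapsed += delay
```

A polling loop would sleep for each yielded value between status checks. Short runs get picked up within about a second, while long runs settle at one request every 30 seconds.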
Available Polling Methods
Terminal States
Polling completes when the run is no longer in a PROCESSING, PENDING, or CANCELLING state. The terminal states depend on the run type.
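Since a terminal state is defined here as anything outside the three in-progress states, a check can be written as a set complement. This is a minimal sketch; the helper name is hypothetical, and the terminal status names used in the usage note (`PROCESSED`, `FAILED`) are assumed examples.

```python
# The three in-progress states named above.
NON_TERMINAL_STATES = frozenset({"PROCESSING", "PENDING", "CANCELLING"})


def is_terminal(status: str) -> bool:
    """Return True once a run has left every in-progress state."""
    return status.upper() not in NON_TERMINAL_STATES
```

For example, `is_terminal("PENDING")` is false, while `is_terminal("PROCESSED")` and `is_terminal("FAILED")` are true. Defining the check by exclusion means the same helper works across run types without listing each type's terminal states.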
Full Example with Error Handling
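The following Python sketch shows the general shape of polling with error handling. It does not use the real SDK: `get_run` is a hypothetical callable that you would back with your own client call, and the `PROCESSED` success status, the exception names, and the retry-on-`ConnectionError` policy are all assumptions.

```python
import time
from typing import Callable, Mapping

IN_PROGRESS = {"PROCESSING", "PENDING", "CANCELLING"}


class RunFailed(Exception):
    """The run reached a terminal state other than the expected success state."""


class PollTimeout(Exception):
    """The run was still in progress after max_wait seconds."""


def wait_for_run(
    get_run: Callable[[str], Mapping],   # hypothetical fetch-by-ID function
    run_id: str,
    *,
    success_status: str = "PROCESSED",   # assumed status name
    interval: float = 1.0,
    max_wait: float = 300.0,
    max_fetch_errors: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping:
    """Poll until the run is terminal; raise on failure, timeout, or repeated fetch errors."""
    deadline = clock() + max_wait
    fetch_errors = 0
    while True:
        try:
            run = get_run(run_id)
            fetch_errors = 0
        except ConnectionError:
            # A failed status check does not affect the run itself, so retry a few times.
            fetch_errors += 1
            if fetch_errors > max_fetch_errors:
                raise
            run = None
        if run is not None and run["status"] not in IN_PROGRESS:
            if run["status"] != success_status:
                raise RunFailed(f"run {run_id} ended with status {run['status']}")
            return run
        if clock() >= deadline:
            raise PollTimeout(f"run {run_id} still in progress after {max_wait}s")
        sleep(interval)
```

Callers would typically catch `RunFailed` to log or retry the document, and `PollTimeout` to fall back to checking the run later by ID, since a timeout on the client side does not mean the run stopped.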
Configuring Polling Options
You can customize polling behavior by passing options.
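As a rough illustration of the idea, the sketch below models polling options as an immutable value with validated defaults. Every name in it (`PollingOptions`, `initial_interval_s`, `max_interval_s`, `max_wait_s`) is hypothetical; consult the SDK reference for the actual option names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PollingOptions:
    """Hypothetical option bag; field names are illustrative, not the SDK's."""

    initial_interval_s: float = 1.0      # fast-poll interval described earlier
    max_interval_s: float = 30.0         # backoff cap described earlier
    max_wait_s: Optional[float] = None   # None means no overall limit (assumption)

    def __post_init__(self) -> None:
        if self.initial_interval_s <= 0:
            raise ValueError("initial_interval_s must be positive")
        if self.max_interval_s < self.initial_interval_s:
            raise ValueError("max_interval_s must be >= initial_interval_s")
        if self.max_wait_s is not None and self.max_wait_s <= 0:
            raise ValueError("max_wait_s must be positive when set")


DEFAULTS = PollingOptions()
# Override only what differs from the defaults, e.g. cap the total wait at 10 minutes.
batch_options = replace(DEFAULTS, max_wait_s=600.0)
```

Validating at construction time surfaces a bad configuration before any run is submitted, instead of partway through a long poll.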
Webhooks
For event-driven architectures or very long-running processes (especially workflows), webhooks are the recommended approach. Instead of polling, Extend sends an HTTP request to your server when a run completes.
When to use webhooks over polling:
- High volume — you’re processing hundreds or thousands of files and don’t want to keep processes alive waiting for results
- Long-running workflows — complex workflows can take minutes to hours
- Cost efficiency — with polling, the calling process must stay alive for the entire duration of the run, which can be expensive at scale
- Event-driven architectures — you want to react to completions asynchronously in your backend
Every run type emits webhook events for completion and failure (e.g., extract_run.processed, extract_run.failed). Workflow runs also emit events for needs_review, rejected, and cancelled states.
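Event names of the form `extract_run.processed` split naturally into a run type and an outcome, which suggests a simple dispatch table on the receiving side. The sketch below assumes a JSON body with an `eventType` field, which is a guess at the payload shape, and it omits signature verification, which should happen before any parsing as described in the Webhooks documentation.

```python
import json
from typing import Callable, Dict, Tuple


def parse_event_type(event_type: str) -> Tuple[str, str]:
    """Split 'extract_run.processed' into ('extract_run', 'processed')."""
    resource, _, outcome = event_type.partition(".")
    if not resource or not outcome:
        raise ValueError(f"unexpected event type: {event_type!r}")
    return resource, outcome


def handle_webhook(body: bytes, handlers: Dict[str, Callable[[dict], None]]) -> bool:
    """Route a webhook body to the handler registered for its outcome.

    Returns False when no handler is registered, so the caller can still
    acknowledge the delivery and ignore events it does not care about.
    """
    payload = json.loads(body)
    # 'eventType' is an assumed field name for this sketch.
    _, outcome = parse_event_type(payload["eventType"])
    handler = handlers.get(outcome)
    if handler is None:
        return False
    handler(payload)
    return True
```

A server would register handlers such as `{"processed": save_result, "failed": alert_on_failure}`, where both functions are your own, and add entries for the workflow-only outcomes (needs review, rejected, cancelled) when processing workflows.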
For setup instructions, event types, and signature verification, see the Webhooks documentation.
Choosing the Right Approach
For most production integrations, we recommend starting with SDK polling for its simplicity, and adding webhooks as you scale or when processing workflows.

