Every processing endpoint in Extend (extract, classify, split, parse, and edit) supports both a synchronous and an asynchronous mode. Workflows are async-only. Which mode to use depends on your use case.
Sync endpoints have a 5-minute timeout. If processing takes longer, the request will fail. For production workloads, always use async endpoints.
POST /edit_schemas/generate is sync-only. It returns the generated schema directly and does not have async run, polling, or webhook support.
Sync endpoints are convenient, but they don’t scale well:
Async endpoints solve these problems. You fire off a request, get an ID back immediately, and retrieve the result when it’s ready, either by polling or by receiving a webhook.
The SDKs provide createAndPoll / create_and_poll methods that handle polling for you automatically. They use a hybrid strategy: fast polling (every ~1 second) for the first 30 seconds, then gradual backoff up to 30-second intervals. The method returns when the run reaches a terminal state.
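To make the hybrid strategy concrete, here is a minimal sketch of a delay schedule with the same shape: 1-second polls for the first 30 seconds, then a growing delay capped at 30 seconds. This is not the SDK's implementation. The names `polling_delays` and `preview` are hypothetical, and the backoff factor of 1.5 is an assumption, since the growth rate is not stated here.

```python
from itertools import islice
from typing import Iterator, List

FAST_INTERVAL_S = 1.0   # "every ~1 second"
FAST_PHASE_S = 30.0     # "for the first 30 seconds"
MAX_INTERVAL_S = 30.0   # "up to 30-second intervals"
BACKOFF_FACTOR = 1.5    # assumption: the real growth rate is not documented here


def polling_delays() -> Iterator[float]:
    """Yield successive sleep durations for a hybrid polling schedule."""
    elapsed = 0.0
    delay = FAST_INTERVAL_S
    while True:
        if elapsed >= FAST_PHASE_S:
            # Past the fast phase: grow the delay, capped at the maximum.
            delay = min(delay * BACKOFF_FACTOR, MAX_INTERVAL_S)
        yield delay
        elapsed += delay


def preview(n: int) -> List[float]:
    """Return the first n delays, useful for inspecting the schedule."""
    return list(islice(polling_delays(), n))
```

With these assumed constants, the first 30 delays are 1 second each. After that, the delay grows by a factor of 1.5 per attempt until it reaches the 30-second cap.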
Polling completes when the run is no longer in a PROCESSING, PENDING, or CANCELLING state. The terminal states depend on the run type:
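If you write your own polling loop instead of using the SDK helpers, the completion rule above can be sketched as follows. This is an illustrative sketch, not the SDK's code: `fetch_status` is a hypothetical callable standing in for whatever call you use to retrieve the run's current status, and `delays` is any sequence of sleep durations in seconds.

```python
import time
from typing import Callable, Iterable

# Non-terminal states named in the text above.
IN_PROGRESS_STATES = frozenset({"PROCESSING", "PENDING", "CANCELLING"})


def is_terminal(status: str) -> bool:
    """A run is terminal once it leaves every in-progress state."""
    return status not in IN_PROGRESS_STATES


def poll_until_terminal(
    fetch_status: Callable[[], str],  # hypothetical: returns the run's status
    delays: Iterable[float],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it reports a terminal state."""
    status = fetch_status()
    for delay in delays:
        if is_terminal(status):
            return status
        sleep(delay)
        status = fetch_status()
    if is_terminal(status):
        return status
    raise TimeoutError(f"run still {status} after exhausting all polling delays")
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting. A finite `delays` sequence doubles as an overall time limit.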
You can customize polling behavior by passing options:
For event-driven architectures or very long-running processes (especially workflows), webhooks are the recommended approach. Instead of polling, Extend sends an HTTP request to your server when a run completes.
When to use webhooks over polling:
Every run type emits webhook events for completion and failure (e.g., extract_run.processed, extract_run.failed). Workflow runs also emit events for needs_review, rejected, and cancelled states.
For setup instructions, event types, and signature verification, see the Webhooks documentation.
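On the receiving side, a webhook handler typically routes each event by its type. The sketch below shows one way to do that for the two extract events named above. The payload shape (`eventType` and `payload` keys, and an `id` field on the run) is an assumption for illustration, as are the handler names. Check the Webhooks documentation for the actual schema, and verify the request signature before dispatching.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], str]


def on_processed(payload: Dict[str, Any]) -> str:
    # Hypothetical handler: persist the finished run's output.
    return f"store result for run {payload.get('id')}"


def on_failed(payload: Dict[str, Any]) -> str:
    # Hypothetical handler: surface the failure to an operator.
    return f"alert on failed run {payload.get('id')}"


# Event type names taken from the text; extend with other run types as needed.
HANDLERS: Dict[str, Handler] = {
    "extract_run.processed": on_processed,
    "extract_run.failed": on_failed,
}


def dispatch(raw_body: str) -> str:
    """Route a webhook body to its handler; unknown event types are ignored."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("eventType", ""))  # assumed key name
    if handler is None:
        return "ignored"
    return handler(event.get("payload", {}))  # assumed key name
```

Ignoring unknown event types, rather than raising, keeps the endpoint tolerant of events you have not subscribed a handler to.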
For most production integrations, we recommend starting with SDK polling for its simplicity, then adding webhooks as you scale or when you start running workflows.