Best Practices: Latency Optimization

When you process high volumes of documents or build real-time applications, latency becomes a critical factor. This guide covers the settings that have the largest effect on latency.

Many of these settings trade accuracy on complex documents for speed. See the Advanced Options guide for detailed explanations of each setting.

Quick Reference

Use this checklist when optimizing for latency:

Advanced Options

Use extraction_light for simple document types (verify accuracy with evaluation sets)
Turn off model reasoning insights (modelReasoningInsightsEnabled: false) - only needed for debugging
Disable advanced multimodal (advancedMultimodalEnabled: false) - unless processing scans/handwritten content
Turn off bounding box citations (citationsEnabled: false) - removes spatial location references

Extraction Chunking Options

Limit page ranges if data is on specific pages
Use confidence, take_first, or take_last merging instead of intelligent
Use large_array_heuristics array strategy if processing large arrays

Parser Configuration

Use document chunk type for non-array extraction to skip merging entirely
Disable figure parsing - unless documents contain important charts/diagrams
Disable agentic OCR - unless processing handwritten/poor quality scans

Workflow

Split into parallel extractors if you have both simple fields and complex arrays
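
The workflow item above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `run_extractor(name, document)` function that stands in for whatever call you use to invoke a processor; it is not part of any SDK, and the extractor names are made up. The point is that two extractors running concurrently take roughly as long as the slower one, rather than the sum of both.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_extractor(name: str, document: str) -> dict:
    """Hypothetical stand-in for a real extractor call (not an SDK function)."""
    # Simulate work: the array extractor is slower than the simple-field one.
    time.sleep(0.2 if name == "line_items" else 0.05)
    return {"extractor": name, "document": document}


def extract_in_parallel(document: str, extractor_names: list[str]) -> dict:
    """Run every extractor on the same document concurrently, keyed by extractor name."""
    with ThreadPoolExecutor(max_workers=len(extractor_names)) as pool:
        futures = {
            name: pool.submit(run_extractor, name, document)
            for name in extractor_names
        }
        # result() blocks until each extractor finishes.
        return {name: future.result() for name, future in futures.items()}


results = extract_in_parallel("invoice.pdf", ["header_fields", "line_items"])
```

Threads suit this case because the time is spent waiting on a remote call rather than on local computation.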

Light Extraction

The biggest change you can make to reduce latency is selecting Extraction Light instead of the default Extraction Performance.

Core performance settings in Extend Studio

config: {
  "type": "EXTRACT",
  "baseProcessor": "extraction_light"
}

Extraction Light is faster and cheaper, but removes support for advanced visual features like figure parsing and signature detection. See the Extraction Light Changelog for details.
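
Before switching, it helps to measure how often the lighter processor agrees with known-good values. The sketch below assumes you already have extraction outputs as plain dictionaries and a hand-labeled expected result. The field names and sample values are made up for illustration, and this is not how Evaluation Sets are implemented.

```python
def field_accuracy(expected: dict, actual: dict) -> float:
    """Fraction of expected fields whose extracted value matches exactly."""
    if not expected:
        return 1.0
    matches = sum(1 for key, value in expected.items() if actual.get(key) == value)
    return matches / len(expected)


# Illustrative data only: one labeled document and two processor outputs.
expected = {
    "invoice_number": "INV-001",
    "total": "120.00",
    "currency": "USD",
    "due_date": "2024-05-01",
}
performance_output = dict(expected)               # every field matches
light_output = {**expected, "due_date": None}     # one field missed

performance_score = field_accuracy(expected, performance_output)
light_score = field_accuracy(expected, light_output)
```

If the light score stays within your tolerance across a representative set of documents, the switch is likely safe for that document type.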

Disabling Advanced Options

Each of these options adds processing overhead. Disable what you don’t need:

Option	What disabling does	Config
Bounding Box Citations	Removes spatial location references for extracted values. See Citations.	`citationsEnabled: false`
Advanced Multimodal	Skips vision-language model processing. Keep enabled for scans/handwriting.	`advancedMultimodalEnabled: false`
Model Reasoning Insights	Removes decision-making explanations. Only needed for debugging.	`modelReasoningInsightsEnabled: false`
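
As a rough sketch, the three flags from the table can be applied to an existing config in one step. The flag names come from the table; the `advancedOptions` key that holds them here is an assumption about the config layout, so check it against your own processor config before relying on it.

```python
import copy

# Flag names are taken from the table above.
LATENCY_FLAGS = {
    "citationsEnabled": False,
    "advancedMultimodalEnabled": False,
    "modelReasoningInsightsEnabled": False,
}


def disable_advanced_options(config: dict, keep: tuple = ()) -> dict:
    """Return a copy of `config` with the latency-related flags turned off.

    Flags listed in `keep` are left untouched.
    The "advancedOptions" nesting is an assumed layout, not a documented one.
    """
    updated = copy.deepcopy(config)
    options = updated.setdefault("advancedOptions", {})
    for flag, value in LATENCY_FLAGS.items():
        if flag not in keep:
            options[flag] = value
    return updated


base = {
    "type": "EXTRACT",
    "baseProcessor": "extraction_light",
    "advancedOptions": {"advancedMultimodalEnabled": True},
}
# Keep multimodal processing on, for example when documents are scans.
fast = disable_advanced_options(base, keep=("advancedMultimodalEnabled",))
```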

Chunking Optimizations

Chunking options in Extend Studio

For non-array extraction: Set chunk type to document to skip intelligent merging entirely. This is the fastest option.

For large array extraction: Use large_array_heuristics array strategy with smaller chunk sizes.

Merging strategy: Switch from intelligent to confidence, take_first, or take_last to avoid extra processing overhead.

Merging strategy settings in Extend Studio

Merging Strategy	Speed	Use when
`intelligent`	Slowest	Accuracy is critical (default)
`confidence`	Fast	General purpose, good default for latency
`take_first`	Fastest	Authoritative values appear at document start
`take_last`	Fastest	Authoritative values appear at document end
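
To make the trade-off concrete, here is a conceptual sketch of what the three fast strategies could look like when the same field is extracted from several chunks. It assumes each chunk yields a value and a confidence score; the data shapes and the `merge_field` function are illustrative and not the product's implementation. `intelligent` is omitted because its behavior is not described here.

```python
from typing import Optional


def merge_field(candidates: list[dict], strategy: str) -> Optional[str]:
    """Pick one value from per-chunk candidates, ordered by chunk position.

    Each candidate is {"value": ..., "confidence": float}; None values are skipped.
    """
    found = [c for c in candidates if c["value"] is not None]
    if not found:
        return None
    if strategy == "take_first":
        return found[0]["value"]
    if strategy == "take_last":
        return found[-1]["value"]
    if strategy == "confidence":
        return max(found, key=lambda c: c["confidence"])["value"]
    raise ValueError(f"unsupported strategy: {strategy}")


# Hypothetical per-chunk results for a single field.
chunks = [
    {"value": "ACME Corp", "confidence": 0.71},
    {"value": None, "confidence": 0.0},
    {"value": "ACME Corporation", "confidence": 0.93},
    {"value": "Acme", "confidence": 0.64},
]
first = merge_field(chunks, "take_first")
last = merge_field(chunks, "take_last")
best = merge_field(chunks, "confidence")
```

Each of these rules is a single pass over the candidates, which is consistent with the table ranking them faster than `intelligent`.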

Disable Advanced Parsing Options

Parser block options in Extend Studio

Figure parsing - Disable unless documents contain important charts/diagrams
Signature detection - Disable unless signature verification is needed
Agentic OCR - Disable unless processing handwritten or poor-quality scans

Learn about Advanced Options for detailed explanations of each setting
Understand Field Names and Prompt Crafting for schema optimization
Explore Evaluation Sets to validate accuracy when optimizing for speed