Best Practices: Latency Optimization

When processing high volumes of documents or building real-time applications, latency becomes a critical factor. This guide lists the settings that have the largest effect on latency.

Many of these settings trade accuracy on complex documents for speed. See the Advanced Options guide for detailed explanations of each setting.

Quick Reference

Use this checklist when optimizing for latency:

Advanced Options

  • Use extraction_light for simple document types (verify accuracy with evaluation sets)
  • Turn off model reasoning insights (modelReasoningInsightsEnabled: false) - only needed for debugging
  • Disable advanced multimodal (advancedMultimodalEnabled: false) - unless processing scans/handwritten content
  • Turn off bounding box citations (citationsEnabled: false) - removes spatial location references

Extraction Chunking Options

  • Limit page ranges if data is on specific pages
  • Use confidence, take_first, or take_last merging instead of intelligent
  • Use the large_array_heuristics array strategy when processing large arrays

Parser Configuration

  • Use document chunk type for non-array extraction to skip merging entirely
  • Disable figure parsing - unless documents contain important charts/diagrams
  • Disable agentic OCR - unless processing handwritten/poor quality scans

Workflow

  • Split into parallel extractors if you have both simple fields and complex arrays
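To illustrate why splitting helps: when two extractors run concurrently, wall-clock time approaches that of the slower one rather than the sum of both. The following is a minimal Python sketch under stated assumptions. `run_extractor` is a hypothetical stand-in for whatever call invokes an extractor; it is simulated here with a sleep and is not an Extend API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_extractor(name: str, seconds: float) -> dict:
    """Hypothetical stand-in for an extractor call; simulated with sleep."""
    time.sleep(seconds)
    return {name: "done"}


def run_in_parallel(jobs: dict[str, float]) -> dict:
    """Run all extractors concurrently and merge their outputs into one dict."""
    merged: dict = {}
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(run_extractor, n, s) for n, s in jobs.items()]
        for future in futures:
            merged.update(future.result())
    return merged


start = time.perf_counter()
result = run_in_parallel({"simple_fields": 0.2, "line_items": 0.2})
elapsed = time.perf_counter() - start  # close to 0.2s rather than 0.4s
```

Threads suit this case because each call spends its time waiting on a remote service rather than using the CPU.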

Light Extraction

The biggest change you can make to reduce latency is selecting Extraction Light instead of the default Extraction Performance.

Core performance settings in Extend Studio

config: {
  "type": "EXTRACT",
  "baseProcessor": "extraction_light"
}

Extraction Light is faster and cheaper, but removes support for advanced visual features like figure parsing and signature detection. See the Extraction Light Changelog for details.
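Before switching, it helps to measure how much accuracy changes on a labeled evaluation set. The sketch below shows one simple way to compare field-level accuracy, assuming you already have each processor's outputs as dictionaries. The function name and sample data are hypothetical, and this is not Extend's built-in evaluation feature.

```python
def field_accuracy(predicted: list[dict], expected: list[dict]) -> float:
    """Fraction of expected fields whose predicted value matches exactly."""
    total = 0
    correct = 0
    for pred, exp in zip(predicted, expected):
        for field, value in exp.items():
            total += 1
            if pred.get(field) == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical labeled examples and outputs from the two processors.
expected = [{"invoice_number": "A-1", "total": "10.00"},
            {"invoice_number": "B-2", "total": "25.50"}]
performance_out = [{"invoice_number": "A-1", "total": "10.00"},
                   {"invoice_number": "B-2", "total": "25.50"}]
light_out = [{"invoice_number": "A-1", "total": "10.00"},
             {"invoice_number": "B-2", "total": "25.00"}]

performance_acc = field_accuracy(performance_out, expected)
light_acc = field_accuracy(light_out, expected)
```

If the accuracy gap is within your tolerance for the document type, the lighter processor is a reasonable choice.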


Disabling Advanced Options

Each of these options adds processing overhead. Disable what you don’t need:

| Option | What disabling does | Config |
|---|---|---|
| Bounding Box Citations | Removes spatial location references for extracted values. See Citations. | citationsEnabled: false |
| Advanced Multimodal | Skips vision-language model processing. Keep enabled for scans/handwriting. | advancedMultimodalEnabled: false |
| Model Reasoning Insights | Removes decision-making explanations. Only needed for debugging. | modelReasoningInsightsEnabled: false |
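As a rough sketch, the three flags could be set together when building a processor config in Python. The flag names come from this guide, but nesting them under an `advancedOptions` key is an assumption; check the Advanced Options guide for the exact structure.

```python
import json

# Flag names come from this guide; placing them under "advancedOptions"
# is an assumption and may not match the real config layout.
config = {
    "type": "EXTRACT",
    "baseProcessor": "extraction_light",
    "advancedOptions": {
        "citationsEnabled": False,
        "advancedMultimodalEnabled": False,
        "modelReasoningInsightsEnabled": False,
    },
}

# Serialize to JSON, e.g. to paste into a config editor or send in a request.
config_json = json.dumps(config, indent=2)
```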

Chunking Optimizations

Chunking options in Extend Studio

For non-array extraction: Set the chunk type to document to skip intelligent merging entirely. This is the fastest option.

For large array extraction: Use the large_array_heuristics array strategy with smaller chunk sizes.
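One way to apply this guidance is to inspect the extraction schema for array fields. The sketch below assumes a JSON-Schema-style dictionary. The helper names and the keys of the returned dictionary are illustrative and are not confirmed config field names. Whether an array is large enough to need the heuristic strategy remains a judgment call.

```python
def has_array_field(schema: dict) -> bool:
    """Return True if a JSON-Schema-style dict declares an array anywhere."""
    schema_type = schema.get("type")
    types = schema_type if isinstance(schema_type, list) else [schema_type]
    if "array" in types:
        return True
    return any(
        has_array_field(child)
        for child in schema.get("properties", {}).values()
        if isinstance(child, dict)
    )


def recommend_chunking(schema: dict) -> dict:
    """Map the guidance above to settings; the returned keys are illustrative."""
    if has_array_field(schema):
        return {"array_strategy": "large_array_heuristics"}
    return {"chunk_type": "document"}


flat_schema = {"type": "object",
               "properties": {"vendor": {"type": "string"}}}
array_schema = {"type": "object",
                "properties": {"line_items": {"type": "array",
                                              "items": {"type": "object"}}}}
```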

Merging strategy: Switch from intelligent to confidence, take_first, or take_last to avoid extra processing overhead.

Merging strategy settings in Extend Studio

| Merging Strategy | Speed | Use when |
|---|---|---|
| intelligent | Slowest | Accuracy is critical (default) |
| confidence | Fast | General purpose, good default for latency |
| take_first | Fastest | Authoritative values appear at document start |
| take_last | Fastest | Authoritative values appear at document end |
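The sketch below illustrates what the three faster strategies plausibly do when several chunks each propose a value for the same field. It is a conceptual illustration inferred from the strategy names and the "Use when" guidance, not Extend's actual implementation. The `ChunkValue` type and the per-chunk confidence score are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkValue:
    value: Optional[str]  # value extracted from one chunk, or None if absent
    confidence: float     # hypothetical per-chunk confidence score in [0, 1]


def merge(values: list[ChunkValue], strategy: str) -> Optional[str]:
    """Pick one value for a field from per-chunk candidates in document order."""
    present = [v for v in values if v.value is not None]
    if not present:
        return None
    if strategy == "take_first":
        return present[0].value
    if strategy == "take_last":
        return present[-1].value
    if strategy == "confidence":
        return max(present, key=lambda v: v.confidence).value
    raise ValueError(f"unsupported strategy: {strategy}")


candidates = [
    ChunkValue("2023-01-05", 0.60),
    ChunkValue(None, 0.0),
    ChunkValue("2023-01-15", 0.90),
    ChunkValue("2023-01-25", 0.70),
]
```

Each of these strategies is a single pass over the candidates, which is consistent with the guide describing them as cheaper than intelligent merging.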

Disable Advanced Parsing Options

Parser block options in Extend Studio

  • Figure parsing - Disable unless documents contain important charts/diagrams
  • Signature detection - Disable unless signature verification is needed
  • Agentic OCR - Disable unless processing handwritten or poor-quality scans
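These decisions can be captured as a small helper that enables each costly parser feature only when the documents call for it. This is a sketch only: the function and the dictionary keys are illustrative labels, not confirmed parser config field names.

```python
def parser_options(has_charts: bool, needs_signatures: bool,
                   poor_scans: bool) -> dict:
    """Enable each costly parser feature only when the documents need it.

    The keys are illustrative labels, not confirmed config field names.
    """
    return {
        "figure_parsing": has_charts,
        "signature_detection": needs_signatures,
        "agentic_ocr": poor_scans,
    }


# Example: clean digital invoices with no charts or signatures.
invoice_options = parser_options(False, False, False)

# Example: handwritten forms that must be checked for a signature.
form_options = parser_options(False, True, True)
```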