Best Practices

This guide covers best practices for using the Parse API, including performance optimization and accuracy improvements.


Async processing and webhooks

The Parse API processes documents asynchronously. POST /parse_runs starts a parse run and returns a parse run ID, which you can use to check the run's status with GET /parse_runs/{id}.

There are two ways to get results:

  1. Polling - Periodically call GET /parse_runs/{id} until status is PROCESSED or FAILED. Simple to implement but less efficient.

  2. Webhooks - Configure a webhook endpoint to receive notifications when parsing completes. More efficient for production workloads.

For large documents with long processing times, consider using webhooks to avoid unnecessary polling requests. See Webhooks for setup instructions.
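
The polling option can be sketched as below. This is a minimal illustration, not an official client: fetch_run is a hypothetical callable that wraps GET /parse_runs/{id} and returns the decoded response as a dict, and the "status" field name is an assumption. Only the PROCESSED and FAILED values come from this guide. The exponential backoff is a design choice added here to cut down on requests for long-running documents.

```python
import time
from typing import Callable, Dict

# Terminal statuses named in this guide.
TERMINAL_STATUSES = {"PROCESSED", "FAILED"}


def wait_for_parse_run(
    fetch_run: Callable[[str], Dict],
    run_id: str,
    interval_s: float = 2.0,
    max_interval_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the run is PROCESSED or FAILED, doubling the wait each time."""
    deadline = time.monotonic() + timeout_s
    while True:
        # fetch_run is hypothetical: it should call GET /parse_runs/{id}.
        run = fetch_run(run_id)
        # The "status" key is an assumed response field name.
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"parse run {run_id} did not finish in {timeout_s}s")
        sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)
```

Passing fetch_run and sleep as arguments keeps the loop independent of any particular HTTP library and makes it easy to test without network access.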


Performance optimization

For fastest processing:

  • Set blockOptions.text.agentic.enabled to false (the most significant speedup, because it skips AI-based OCR corrections)
  • Set blockOptions.figures.enabled to false (avoids AI-based figure analysis)
  • Set pageRotationEnabled to false if all pages are correctly oriented
  • Use chunkingStrategy: "document" (fastest) or "page" instead of "section"; section chunking adds CPU overhead during parsing to generate semantic sections
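
As a rough sketch, the speed-oriented settings above could be assembled into a nested config like this. The dotted option paths are taken from this guide as written; the set_path helper is hypothetical, and writing the chunking choice as chunkingStrategy.type is an assumption based on the type and options fields mentioned under Troubleshooting. Check the API reference for the exact config shape.

```python
from typing import Any, Dict


def set_path(config: Dict[str, Any], dotted_path: str, value: Any) -> None:
    """Set a nested key such as "a.b.c", creating intermediate dicts as needed."""
    keys = dotted_path.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Option paths as written in this guide.
FAST_SETTINGS = {
    "blockOptions.text.agentic.enabled": False,  # skip AI-based OCR corrections
    "blockOptions.figures.enabled": False,       # skip AI-based figure analysis
    "pageRotationEnabled": False,                # only if pages are correctly oriented
    "chunkingStrategy.type": "document",         # assumed shape of chunkingStrategy
}


def build_fast_config() -> Dict[str, Any]:
    """Expand the dotted paths into a nested config dict."""
    config: Dict[str, Any] = {}
    for path, value in FAST_SETTINGS.items():
        set_path(config, path, value)
    return config
```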

For highest accuracy:

  • Set target: "markdown" with chunkingStrategy: "section"
  • Set blockOptions.text.agentic.enabled to true for handwritten/degraded documents
  • Set tableHeaderContinuationEnabled to true for multi-page tables
  • Set signatureDetectionEnabled to true for legal documents
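
The accuracy-oriented settings depend on the kind of document, so one way to express them is a small builder with a flag per case. This is an illustrative sketch only: the function and its parameters are hypothetical, and placing tableHeaderContinuationEnabled and signatureDetectionEnabled at the top level of the config is an assumption, since this guide does not say where they are nested.

```python
from typing import Any, Dict


def build_accuracy_config(
    handwritten_or_degraded: bool = False,
    multi_page_tables: bool = False,
    legal_document: bool = False,
) -> Dict[str, Any]:
    """Return a config that favors accuracy for the given document traits."""
    config: Dict[str, Any] = {
        "target": "markdown",
        "chunkingStrategy": {"type": "section"},  # assumed object shape
    }
    if handwritten_or_degraded:
        config["blockOptions"] = {"text": {"agentic": {"enabled": True}}}
    if multi_page_tables:
        # Top-level placement is an assumption.
        config["tableHeaderContinuationEnabled"] = True
    if legal_document:
        # Top-level placement is an assumption.
        config["signatureDetectionEnabled"] = True
    return config
```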

Troubleshooting

Poor quality OCR results

  1. Set blockOptions.text.agentic.enabled to true for handwritten/degraded documents
  2. Set pageRotationEnabled to true for rotated pages
  3. Try target: "spatial" for very messy or skewed documents

Chunks are too large or too small

  1. Adjust minCharacters and maxCharacters in chunkingStrategy.options if you are using type: "section" with target: "markdown"
  2. Try different chunking types (page, section, or document)
  3. Consider using pageRanges to process fewer pages per request
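
Before changing the limits, it can help to measure how far the returned chunks fall outside the range you want. The sketch below assumes you have already extracted each chunk's text into a list of strings (the response shape is not covered in this guide), and the helper names and character limits are hypothetical.

```python
from typing import Any, Dict, List


def chunk_size_report(chunks: List[str], min_chars: int, max_chars: int) -> Dict[str, List[int]]:
    """Return the indexes of chunks that are shorter or longer than the limits."""
    return {
        "too_small": [i for i, text in enumerate(chunks) if len(text) < min_chars],
        "too_large": [i for i, text in enumerate(chunks) if len(text) > max_chars],
    }


def section_chunking(min_chars: int, max_chars: int) -> Dict[str, Any]:
    """Build a chunkingStrategy value using the option names from this guide."""
    if min_chars > max_chars:
        raise ValueError("min_chars must not exceed max_chars")
    return {
        "type": "section",
        "options": {"minCharacters": min_chars, "maxCharacters": max_chars},
    }
```

If most chunks land in too_large, lowering maxCharacters is the natural first step; if most land in too_small, raise minCharacters or try a coarser chunking type.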

Tables not parsing correctly

  1. Set tableHeaderContinuationEnabled to true for multi-page tables
  2. Set targetFormat: "html" for better table structure
  3. Set blockOptions.tables.agentic.enabled to true
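
When retrying a document with table-specific settings, one approach is to layer the overrides on top of the config you already use rather than rebuilding it. The following is a sketch under assumptions: deep_merge is a hypothetical helper, and the top-level placement of tableHeaderContinuationEnabled and targetFormat is assumed, because this guide does not state where they are nested.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with overrides applied recursively; base is not modified."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


TABLE_OVERRIDES = {
    "tableHeaderContinuationEnabled": True,  # placement assumed
    "targetFormat": "html",                  # placement assumed
    "blockOptions": {"tables": {"agentic": {"enabled": True}}},
}
```

For example, deep_merge(existing_config, TABLE_OVERRIDES) keeps any blockOptions.text settings in existing_config while adding the table options.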