Extraction Light

Our low-latency, low-cost extraction processor. It is faster and cheaper than Performance while still maintaining a high degree of accuracy, but it struggles with more complex extraction tasks.

This processor does not support logprobs. JSON Schema processors will have logprobsConfidence set to null. Legacy Fields Array processors will surface confidences of 0.
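Because a missing score shows up as null in one output format and as 0 in the other, downstream code may want to normalize both to "no confidence available" before applying any threshold. The following is a minimal sketch of that idea, not an official client. Only the `logprobsConfidence` name comes from this page; the `confidence` key for Legacy Fields Array output, the dict shape, and the `usable_confidence` helper are assumptions made for illustration.

```python
from typing import Optional


def usable_confidence(field: dict, legacy: bool = False) -> Optional[float]:
    """Return a confidence score, or None when the processor did not produce one.

    `field` is a hypothetical dict for a single extracted field.
    """
    if legacy:
        # Assumed key name for Legacy Fields Array output.
        value = field.get("confidence")
        # With Extraction Light, 0 means "no score available",
        # so it is mapped to None rather than treated as a very low score.
        return value if value else None
    # JSON Schema output: Extraction Light sets logprobsConfidence to null (None).
    return field.get("logprobsConfidence")


def needs_review(field: dict, threshold: float = 0.9, legacy: bool = False) -> bool:
    """Flag a field for review when no score exists or the score is below the threshold."""
    score = usable_confidence(field, legacy=legacy)
    return score is None or score < threshold
```

With this normalization, results from Extraction Light are always routed to review (or to whatever fallback is chosen) instead of being silently rejected by a threshold check against 0. Note that the 0-to-None mapping is only appropriate for a processor that, like this one, never reports a real score.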

Versions

| Version | Date | Changelog |
| --- | --- | --- |
| 3.3.0 | 2025-09-06 | Add support for arrays of scalars (primitive types like strings, numbers, booleans, integers). |
| 3.2.1 | 2025-07-10 | Small patch to handling of currency fields. |
| 3.2.0 | 2025-07-06 | Minor foundation model version upgrade. |
| 3.1.0 | 2025-05-29 | Improved citations support for most field types. |
| 3.0.0 | 2025-04-01 | New foundation model with improved accuracy and performance. A number of small improvements to the pre-processing pipeline for light processor. |
| 2.4.0 | 2024-10-10 | Add support for an optimized enum field type. Significant improvements to handling documents with messy handwriting and forms. |
| 2.3.0 | 2024-08-16 | Updates to our document pre-processing to better handle more complex document/table layouts. |
| 2.2.1 | 2024-07-18 | Patch to fix rare cases of missing page references in extraction results. |
| 2.2.0 | 2024-07-14 | Updates to our document pre-processing to better handle checkboxes. |
| 2.1.0 | 2024-07-01 | Updates to our document pre-processing to better handle more complex document/table layouts. |
| 2.0.0 | 2024-04-01 | Rolls out a new foundation model for the light extraction processor. Significantly more accurate, with little to no latency increase. |
| 1.0.0 | 2023-09-01 | Initial (and now legacy) light tier extraction model. |
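
The 3.3.0 entry mentions arrays of scalars. In standard JSON Schema terms, that is an `array` property whose `items` declare a single primitive type. The sketch below shows what such fields look like and how they differ from an array of objects. The schema, its field names, and the `is_scalar_array` helper are hypothetical, and this page does not confirm which JSON Schema keywords the processor accepts beyond the types listed in the changelog.

```python
# Primitive types named in the 3.3.0 changelog entry.
SCALAR_TYPES = {"string", "number", "integer", "boolean"}


def is_scalar_array(prop: dict) -> bool:
    """True when a JSON Schema property is an array whose items are one scalar type."""
    items = prop.get("items")
    if prop.get("type") != "array" or not isinstance(items, dict):
        return False
    item_type = items.get("type")
    return isinstance(item_type, str) and item_type in SCALAR_TYPES


# Hypothetical extraction schema, written as a Python dict.
schema = {
    "type": "object",
    "properties": {
        # Arrays of scalars: each element is a plain string or number.
        "invoice_numbers": {"type": "array", "items": {"type": "string"}},
        "line_totals": {"type": "array", "items": {"type": "number"}},
        # Array of objects, shown for contrast.
        "line_items": {
            "type": "array",
            "items": {"type": "object", "properties": {"sku": {"type": "string"}}},
        },
    },
}

scalar_fields = [
    name for name, prop in schema["properties"].items() if is_scalar_array(prop)
]
print(scalar_fields)  # ['invoice_numbers', 'line_totals']
```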