Evaluation

Overview

Evaluation sets let you test the accuracy and performance of your AI document processors in Extend reliably and continuously. An evaluation set is a collection of representative documents paired with validated outputs. Running a processor against a set lets you verify that your extraction and classification configs work as intended, identify areas for improvement, and quickly check whether a change has actually improved accuracy.

You can view and manage your evaluation sets from the Evaluation page in the Studio section of Extend. Here you’ll find a list of your existing evaluation sets along with options to create new sets and run evaluations.

Evaluation is central to building accurate AI document processors for your use cases in Extend. A typical cycle looks like this:

  1. Create evaluation sets containing examples that represent the range of documents your processor needs to handle
  2. Run evaluations on your sets to test processor accuracy and performance
  3. Review evaluation results to verify that processor outputs match the expected results, and to identify common errors and areas for improvement
  4. Iterate on processor configuration and training data to improve accuracy based on evaluation insights
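
To make the cycle above concrete, here is a minimal sketch of what comparing processor outputs against validated outputs amounts to. It does not use Extend's API or reproduce how Extend scores results: `Example`, `evaluate`, `fake_processor`, the file names, and the field names are all hypothetical, and exact-match comparison per field is an assumption chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Example:
    """One evaluation-set entry: a document plus its validated output."""
    document: str             # hypothetical: a file name or document ID
    expected: dict[str, Any]  # validated field values for this document


def evaluate(processor: Callable[[str], dict[str, Any]],
             examples: list[Example]) -> dict[str, float]:
    """Return per-field accuracy: the share of examples where the field matched exactly."""
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for ex in examples:
        actual = processor(ex.document)
        for field, want in ex.expected.items():
            total[field] = total.get(field, 0) + 1
            if actual.get(field) == want:
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / n for field, n in total.items()}


# Stand-in processor for illustration only (assumption): returns canned outputs
# instead of running a real extraction config.
def fake_processor(document: str) -> dict[str, Any]:
    outputs = {
        "invoice_a.pdf": {"vendor": "Acme", "total": 120.0},
        "invoice_b.pdf": {"vendor": "Globex", "total": 75.5},
    }
    return outputs.get(document, {})


examples = [
    Example("invoice_a.pdf", {"vendor": "Acme", "total": 120.0}),
    Example("invoice_b.pdf", {"vendor": "Globex", "total": 80.0}),
]

accuracy = evaluate(fake_processor, examples)
print(accuracy)  # {'vendor': 1.0, 'total': 0.5}
```

In this sketch the `total` field scores 0.5 because one of the two documents disagrees with its validated value, which is the kind of signal step 3 looks for before iterating on the configuration in step 4.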