The BatchProcessorRun object

The BatchProcessorRun object is returned by the Get Batch Processor Run endpoint.

The object represents a run of a processor over a batch of files. It contains the run's metrics, its status, and details of the processor that was run.

object
string

The type of response. Always “batch_processor_run”.

id
string

The unique identifier for this batch processor run.

processorId
string

The ID of the processor used for this run.

processorVersionId
string

The ID of the specific processor version used.

processorName
string

The name of the processor.

metrics
object

The metrics for the batch processor run.

numFiles
number

The total number of files that were processed.

numPages
number

The total number of pages that were processed.

type
string

The type of batch processor run. Possible values are EXTRACT, CLASSIFY, and SPLITTER.

The fields that follow depend on the type of run: field-level metrics for EXTRACT runs, classification metrics for CLASSIFY runs, and subdocument metrics for SPLITTER runs.
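Because the fields present depend on the run type, client code will usually branch on type before reading them. The sketch below shows one way to do that in Python. It assumes the response has already been parsed into a dict, that type sits alongside the other metrics fields, and that the per-type grouping is as inferred from the field descriptions in this reference. The function name summarize_metrics is hypothetical and not part of any SDK.

```python
def summarize_metrics(metrics: dict) -> str:
    """Return a one-line summary of a metrics dict, branching on its run type."""
    run_type = metrics.get("type")
    if run_type == "EXTRACT":
        # Assumption: EXTRACT runs carry per-field metrics under fieldMetrics.
        fields = metrics.get("fieldMetrics") or {}
        return f"EXTRACT: metrics for {len(fields)} top-level field(s)"
    if run_type == "CLASSIFY":
        # Assumption: CLASSIFY runs carry an overall accuracyPerc.
        return f"CLASSIFY: accuracy {metrics.get('accuracyPerc')}%"
    if run_type == "SPLITTER":
        # Assumption: SPLITTER runs carry subdocument precision and recall.
        return (
            f"SPLITTER: precision {metrics.get('precisionPerc')}%, "
            f"recall {metrics.get('recallPerc')}%"
        )
    raise ValueError(f"Unknown batch processor run type: {run_type!r}")
```

Raising on an unrecognized type, rather than returning a default, makes it obvious when a new run type appears that the client does not yet handle.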

fieldMetrics
object

Record mapping field names to their respective metrics.

meanConfidence
number

The mean confidence score for this field across all documents.

recallPerc
number

The recall percentage for this field, representing how many of the expected values were correctly extracted.

precisionPerc
number

The precision percentage for this field, representing how many of the extracted values were correct.

fieldMetrics
object

For nested object fields, this contains metrics for the child fields. Has the same structure as the parent fieldMetrics.

arrayCardinalityMetrics
object

Maps each root-level array field name to the number of times the extracted array had the correct number of rows.
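Since fieldMetrics can nest (a field's entry may itself contain a fieldMetrics record for its children), reporting code often needs to flatten it. The following is a minimal sketch, assuming the record has been parsed into nested dicts. The helper name flatten_field_metrics and the dotted-path convention are illustrative choices, not part of the API.

```python
from typing import Dict, Optional

METRIC_KEYS = ("meanConfidence", "recallPerc", "precisionPerc")


def flatten_field_metrics(
    field_metrics: Dict[str, dict], prefix: str = ""
) -> Dict[str, Dict[str, Optional[float]]]:
    """Flatten nested fieldMetrics into {"parent.child": {...metrics...}}."""
    flat: Dict[str, Dict[str, Optional[float]]] = {}
    for name, entry in field_metrics.items():
        path = f"{prefix}.{name}" if prefix else name
        # Copy only the scalar metrics documented for each field.
        flat[path] = {key: entry.get(key) for key in METRIC_KEYS}
        # Recurse into child metrics for nested object fields, if present.
        children = entry.get("fieldMetrics")
        if children:
            flat.update(flatten_field_metrics(children, path))
    return flat
```

A metric that is missing from an entry comes back as None instead of raising, so a partially populated response can still be reported.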

accuracyPerc
number

The overall accuracy percentage.

meanConfidence
number

The mean confidence score.

distribution
object

Record mapping classification values to their counts.

accuracyPercByClassification
object

Mapping from classification to accuracy percentage as calculated from the confusion matrix.

confusionMatrix
object

Mapping from actual class to predicted class to count. Only present when accuracy percentage is present.
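To illustrate how per-classification accuracy can be derived from a confusion matrix of this shape (actual class, then predicted class, then count), here is a hedged sketch. The exact formula the service uses is not stated in this reference. The code assumes per-class accuracy is the share of documents of an actual class that were predicted as that class, expressed on a 0 to 100 scale. The function names are hypothetical.

```python
from typing import Dict

ConfusionMatrix = Dict[str, Dict[str, int]]


def accuracy_by_classification(matrix: ConfusionMatrix) -> Dict[str, float]:
    """Per actual class: percentage of its documents predicted as that class."""
    result: Dict[str, float] = {}
    for actual, predicted_counts in matrix.items():
        total = sum(predicted_counts.values())
        correct = predicted_counts.get(actual, 0)
        # Assumption: a class with no documents is reported as 0.0.
        result[actual] = 100.0 * correct / total if total else 0.0
    return result


def overall_accuracy(matrix: ConfusionMatrix) -> float:
    """Percentage of all documents whose predicted class equals the actual class."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(row.get(actual, 0) for actual, row in matrix.items())
    return 100.0 * correct / total if total else 0.0
```

Comparing these locally computed values with accuracyPercByClassification and accuracyPerc in a real response is a quick way to check whether the assumed formula matches the service's.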

precisionPerc
number

Number of predicted subdocuments that are in the expected set of subdocuments divided by total number of predicted subdocuments.

recallPerc
number

Number of expected subdocuments that are in the predicted set of subdocuments divided by total number of expected subdocuments.

numExpectedDocs
number

The number of expected documents.

numPredictedDocs
number

The number of predicted documents.

numCorrectDocs
number

The number of correctly predicted documents.
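The precision and recall definitions above can be worked through from the three document counts. This sketch assumes numCorrectDocs counts the subdocuments that appear in both the predicted and expected sets, and that the Perc suffix means a 0 to 100 scale. Neither point is confirmed by this reference, and the function name is illustrative.

```python
from typing import Tuple


def splitter_precision_recall(
    num_expected_docs: int, num_predicted_docs: int, num_correct_docs: int
) -> Tuple[float, float]:
    """Return (precision %, recall %) for a splitter run from its document counts."""
    # Precision: correct predictions divided by all predictions.
    precision = (
        100.0 * num_correct_docs / num_predicted_docs if num_predicted_docs else 0.0
    )
    # Recall: correct predictions divided by all expected subdocuments.
    recall = (
        100.0 * num_correct_docs / num_expected_docs if num_expected_docs else 0.0
    )
    return precision, recall
```

For example, with 10 expected, 8 predicted, and 6 correct subdocuments, this gives a precision of 75.0 and a recall of 60.0.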

meanRunTimeMs
number

The mean runtime in milliseconds per document.

status
string

The current status of the batch processor run. Possible values are PENDING, PROCESSING, PROCESSED, and FAILED.
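A common use of status is polling until the run finishes. The sketch below assumes PROCESSED and FAILED are the terminal states, which is inferred from their names and not stated explicitly here. fetch_run is a hypothetical caller-supplied function that retrieves the run, for instance via the Get Batch Processor Run endpoint, and returns it as a dict.

```python
import time
from typing import Callable

# Assumption: these two statuses mean the run will not change any further.
TERMINAL_STATUSES = frozenset({"PROCESSED", "FAILED"})


def is_terminal(status: str) -> bool:
    """True if the run has finished, successfully or not."""
    return status in TERMINAL_STATUSES


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_run until the run reaches a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        run = fetch_run()
        if is_terminal(run["status"]):
            return run
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"Run still not finished after {max_attempts} attempts")
```

The sleep parameter is injectable so the loop can be exercised in tests without waiting. A caller should still check whether the returned status is FAILED before reading metrics.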

source
string

The source of the batch processor run.

EVAL_SET

The batch processor run was made from an evaluation set. In this case, the sourceId will be the ID of the evaluation set, such as ev_1234.

PLAYGROUND

The batch processor run was made from the playground. The sourceId will not be set for this value.

STUDIO

The batch processor run was made for a processor in Studio. The sourceId will be the ID of the processor, such as dp_1234.

sourceId
string

The ID of the source of the batch processor run. See the source field for more details.
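The relationship between source and sourceId can be captured in a small helper. This is only a sketch of the rules described above: the function name describe_source is hypothetical, and raising an error when an expected sourceId is missing is a design choice of this example, not documented API behavior.

```python
from typing import Optional


def describe_source(source: str, source_id: Optional[str] = None) -> str:
    """Return a human-readable description of where a run originated."""
    if source == "PLAYGROUND":
        # sourceId is not set for playground runs.
        return "playground"
    if source in ("EVAL_SET", "STUDIO"):
        if not source_id:
            raise ValueError(f"sourceId is expected when source is {source}")
        if source == "EVAL_SET":
            # sourceId is the evaluation set ID, such as ev_1234.
            return f"evaluation set {source_id}"
        # sourceId is the processor ID, such as dp_1234.
        return f"Studio processor {source_id}"
    raise ValueError(f"Unknown source: {source!r}")
```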

runCount
number

The number of runs that were made.

options
object

The options for the batch processor run.

fuzzyMatchFields
array

The fields that were fuzzy matched. Optional.

excludeFields
array

The fields that were excluded from the run. Optional.

clearPreProcessingCache
boolean

Whether the preprocessing cache was cleared. Optional.
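Since all three options are optional, parsing code should tolerate their absence. The sketch below models options as a Python dataclass. It assumes the two array fields hold field names as strings, which this reference does not state. It represents a missing value as None so that no server-side default is implied. The class name RunOptions is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunOptions:
    """Options of a batch processor run; None means the field was not present."""

    fuzzy_match_fields: Optional[List[str]] = None
    exclude_fields: Optional[List[str]] = None
    clear_pre_processing_cache: Optional[bool] = None

    @classmethod
    def from_dict(cls, data: Optional[dict]) -> "RunOptions":
        """Build RunOptions from the parsed options object, which may be missing."""
        data = data or {}
        return cls(
            fuzzy_match_fields=data.get("fuzzyMatchFields"),
            exclude_fields=data.get("excludeFields"),
            clear_pre_processing_cache=data.get("clearPreProcessingCache"),
        )
```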

createdAt
string

The date and time the batch processor run was created.

updatedAt
string

The date and time the batch processor run was last updated.