Configuring an Extractor (Legacy)
Configuring Fields (Legacy)
This section is relevant only if you are using the older “Fields Array” config type. For new processors, we strongly recommend using the JSON Schema config type described in the “Configuring Properties” section above. If you aren’t sure, please see the Migrating to JSON Schema documentation.
To configure a field in the legacy system, add a semantically accurate field name and write a description that explains how to identify and extract that field from the document.
You must also configure the proper field type:
Text
Use the text data type when you want to extract a string of text from a document. For example, if you want to extract the name of a person from a document, you would use the text data type.
Number
Use the number data type when you want to extract a number from a document. For example, if you want to extract the age of a person from a document, you would use the number data type.
Currency
Use the currency data type when you want to extract a currency value from a document. For example, if you want to extract the price of a product from a document, you would use the currency data type.
Boolean
Use the boolean data type when you want to extract a boolean value from a document. For example, if you want to extract whether a product is in stock from a document, you would use the boolean data type.
Date
Use the date data type when you want to extract a date from a document. For example, if you want to extract the date of birth of a person from a document, you would use the date data type.
Signature
Use the signature data type when you want to extract a signature from a document. For example, if you want to extract the signature of a person from a document, you would use the signature data type. Signature fields will automatically extract all relevant details of a document’s signature block:
- is_signed
- printed_name
- signatory_title
- signature_date
Object
Use the object data type when you want to extract a set of related fields from a document. For example, if you want to extract the address, name, and birth date of a person from a document you would use the object data type.
Array
Use the array data type when you want to extract a list of related fields from a document. For example, if you want to extract a list of products that each have a name, price, and quantity from a document you would use the array data type.
Configuration table
The field config table also will allow you to select the drag button to move the field up or down. Performance is best when related fields in the document are positioned in related order in the configuration table.
The below documentation about field ID
s is relevant for the legacy Fields
Array config type. This is not relevant for the JSON Schema config type.
You can also set a field ID
which is a unique identifier for the field to use in your downstream system, so that you can make changes to the semantic field name without
updating your downstream system.
Configuring Custom Settings
In addition to the fields, you can also configure custom settings for each field. These settings allow you to further customize the extraction process to better suit your specific needs. However, please note that these settings are experimental and may not work as expected in all cases.
Before using these settings, we recommend consulting with the Extend team to understand their potential impact on the extraction process.