Best Practices: Field Names and Prompt Crafting
Before manually tuning your extractor, try the Composer optimizer, which can automatically improve your field descriptions and extraction rules using Extend’s AI agent. After running Composer, use these best practices for additional manual refinements and edge cases.
Schema Fundamentals
Field/Property Naming Conventions
Field names are an important part of Extend’s extraction model, so keep these guidelines in mind:
Field Descriptions
Write clear, direct, and detailed descriptions
Good: “The unique invoice number printed at the top of the document. For Extend invoices, this will often appear with the prefix EX_.”
Bad: “Invoice number”
Include extra context:
- Note any formatting requirements*
- Mention where the field typically appears, if known
* Formatting can be suggested in field descriptions, but we often recommend implementing custom logic on your own platform.
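As a minimal sketch of the footnote above, the field description can carry the context while formatting is enforced in your own code after extraction. The dictionary keys and the `normalize_invoice_number` helper are hypothetical and are not Extend's actual schema format or API. The normalization rule (accepting `EX-` or `EX` and emitting `EX_`) is also an assumption made for illustration.

```python
import re

# Hypothetical field definition: the key names are illustrative only and
# are not taken from Extend's schema format.
invoice_number_field = {
    "name": "invoice_number",
    "type": "string",
    "description": (
        "The unique invoice number printed at the top of the document. "
        "For Extend invoices, this will often appear with the prefix EX_."
    ),
}


def normalize_invoice_number(raw: str) -> str:
    """Apply formatting rules on your own platform, after extraction."""
    # Drop all whitespace and uppercase the value.
    cleaned = re.sub(r"\s+", "", raw).upper()
    # Assumed rule: "EX-123", "EX123", and "EX_123" all become "EX_123".
    match = re.fullmatch(r"EX[_-]?(\d+)", cleaned)
    if match:
        return f"EX_{match.group(1)}"
    # Anything that does not match the assumed pattern is returned cleaned
    # but otherwise unchanged.
    return cleaned
```

Keeping this logic outside the prompt means the description only has to help the model find the value, while the exact output format stays deterministic and testable.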
Field Types
For complete details on all available field types and their configurations, see the Custom Field Types documentation.
Choosing the right field type for your data structure can help improve accuracy.
When to use arrays:
- Multiple similar items (line items, addresses, phone numbers)
- Extracting repeated data from tables
- Lists that can vary in length
Other custom field types:
- date: For date/time values with automatic ISO format conversion
- currency: For monetary amounts with amount and currency code
- signature: For signature detection with printed name, date, and signing status
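To make the shapes above concrete, here is a rough sketch of how extracted values might be modeled in downstream code: an array for repeated line items, a date parsed from an ISO string, and a currency value with an amount and a code. The class and attribute names are assumptions for illustration and do not reflect Extend's actual response format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Currency:
    # Hypothetical shape: a numeric amount plus a currency code such as "USD".
    amount: float
    iso_code: str


@dataclass
class LineItem:
    description: str
    total: Currency


@dataclass
class Invoice:
    invoice_date: date
    # An array fits here because the number of line items varies per document.
    line_items: list[LineItem] = field(default_factory=list)


def sum_line_items(invoice: Invoice) -> float:
    """Add up line item totals, rounded to two decimal places."""
    return round(sum(item.total.amount for item in invoice.line_items), 2)


invoice = Invoice(
    invoice_date=date.fromisoformat("2024-03-15"),  # ISO-formatted date string
    line_items=[
        LineItem("Widget", Currency(10.50, "USD")),
        LineItem("Shipping", Currency(4.25, "USD")),
    ],
)
```

Modeling line items as a list rather than as fixed fields such as `item_1` and `item_2` lets the same code handle documents with any number of rows.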
Extractor Architecture Decisions
Unified vs Multiple Extractors
One of the most common questions is whether to create one unified extractor or multiple specialized extractors for a document type.
Use a unified extractor when:
- Documents have consistent structure and field requirements
- All fields are typically present in most documents
- You want simpler maintenance and deployment
Use multiple extractors when:
- Processing different document types in the same workflow (e.g., splitting an invoice and associated checks from a single PDF)
- You need different confidence processing rules for different documents
- You want to optimize performance for specific document patterns
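One way to picture the multiple-extractor setup is a small routing table that maps each document type to its own extractor and its own confidence threshold. This is only a sketch: the extractor IDs, threshold values, and function names below are hypothetical, and how documents are classified or sent to an extractor depends on your own workflow.

```python
# Hypothetical extractor IDs and thresholds; replace with values from your setup.
EXTRACTORS = {
    "invoice": {"extractor_id": "invoice-extractor", "min_confidence": 0.90},
    "check": {"extractor_id": "check-extractor", "min_confidence": 0.98},
}


def pick_extractor(doc_type: str) -> str:
    """Return the extractor configured for a document type."""
    try:
        return EXTRACTORS[doc_type]["extractor_id"]
    except KeyError:
        raise ValueError(f"No extractor configured for {doc_type!r}") from None


def needs_review(doc_type: str, confidence: float) -> bool:
    """Flag a result for manual review using the per-type threshold."""
    return confidence < EXTRACTORS[doc_type]["min_confidence"]
```

With a single unified extractor, this table collapses to one entry, which is where the simpler maintenance comes from.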
Prompt Crafting
Effective prompt engineering is crucial for maximizing both extraction coverage and accuracy. Every extractor will have some unique nuance to it, but these tips should provide a solid foundation to build on.
Core Principles
Be Clear, Direct, and Contextual
Write prompts as if explaining to a new employee with no context about the business or document. Include what the data will be used for and provide clear examples.
Example:
For complex extractions, use sequential instructions:
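As a hedged illustration, the helper below assembles a prompt that states the business context first and then lists numbered steps. The function name, the context sentence, and the step wording are all made up for this example; adapt them to your own documents.

```python
def build_sequential_prompt(context: str, steps: list[str]) -> str:
    """Combine a one-line context statement with numbered instructions."""
    numbered = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join([context, "", "Follow these steps in order:", *numbered])


prompt = build_sequential_prompt(
    # Context: say what the data is for, as if briefing a new employee.
    "These are supplier invoices. The extracted totals are used to schedule payments.",
    [
        "Locate the line items table.",
        "For each row, extract the description and the row total.",
        "Extract the invoice date (when the invoice was issued), not the due date.",
    ],
)
```

Numbering the steps makes the intended order explicit and gives you a stable way to refer to a specific instruction when revising the prompt.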
Common Prompting Mistakes
Vague Instructions:
Bad: “Extract important information”
Good: “Extract the invoice number located at one of the top corners of the document.”
Missing Context:
Bad: “Extract the date”
Good: “Extract the invoice date (when the invoice was issued), not the due date or any other dates in the document.”
Inconsistent Formatting:
Bad: “Extract total amount. Use dollar sign.”
Good: “Extract the total amount in the format it appears in the document (e.g., ‘$1,250.00’ or ‘1250.00’). Include currency symbols and decimal places exactly as shown.”
Related Topics
- Learn about Advanced Options for chunking & merging strategies
- For latency optimization strategies, see Latency Optimization

