Schema Management Best Practices
Best practices for managing your document processing schemas
Schema Naming
- Use descriptive names: Choose names that clearly indicate the schema’s purpose or the type of document it handles (e.g., standard_supplier_invoice, monthly_bank_statement_chase).
- Include version information: If you anticipate schema changes, incorporate versioning in the name (e.g., invoice_schema_v2, statement_fields_2024_q1).
- Keep names consistent: Establish a naming convention and apply it uniformly across your organization for clarity.
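A naming convention is easier to keep consistent when it is checked automatically. The sketch below shows one way to do that on the client side. The convention it encodes (lowercase snake_case, with an optional version suffix such as _v2 or _2024_q1) is an assumption modeled on the example names above, not a rule enforced by the Invaro API, and check_schema_name is a hypothetical helper.

```python
import re

# Assumed convention (not enforced by the API): lowercase snake_case.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# Assumed version suffixes, modeled on invoice_schema_v2 and statement_fields_2024_q1.
VERSION_SUFFIX = re.compile(r"_(v\d+|\d{4}_q[1-4])$")


def check_schema_name(name: str) -> dict:
    """Report whether a name follows the convention and carries a version suffix."""
    return {
        "valid": bool(NAME_PATTERN.match(name)),
        "versioned": bool(VERSION_SUFFIX.search(name)),
    }


if __name__ == "__main__":
    for candidate in ("standard_supplier_invoice", "invoice_schema_v2", "Invoice Schema"):
        print(candidate, check_schema_name(candidate))
```

Adjust both patterns to whatever convention your organization settles on; the point is to have a single place where the rule is written down.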
Schema Versioning
- Automatic Versioning: The Invaro API automatically handles versioning. Each time you update a schema using the PUT /api/v1/schemas/id/{schema_id} endpoint, the version number is automatically incremented.
- History: Previous versions are maintained, allowing you to track changes over time, although only the latest active version is used for processing.
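As a rough illustration of the update call, the sketch below builds (but does not send) a PUT request for the endpoint path given above. Only the path comes from this page. The host in BASE_URL is a placeholder, the Bearer authorization header is an assumed scheme, and the payload keys are illustrative; check the API reference for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real Invaro base URL


def build_update_request(schema_id: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a PUT request for /api/v1/schemas/id/{schema_id} without sending it."""
    url = f"{BASE_URL}/api/v1/schemas/id/{schema_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


if __name__ == "__main__":
    # "schema-123" and the payload below are hypothetical values.
    req = build_update_request("schema-123", {"description": "Adds due_date"}, "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req)
```

Because the version number is incremented on the server, the client does not need to set it in the payload.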
Schema Types
- Supported Types: Currently, the API supports the "invoice" and "bank_statement" types.
- Choose Correctly: Select the type that matches the documents you intend to process with this schema. This helps the system apply the correct underlying models.
- Immutable Type: The type of a schema cannot be changed after it has been created.
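Both rules above can be checked locally before a request is made. This is a minimal sketch: the two type strings come from this page, while the helper names and error messages are hypothetical and the API's own error responses may differ.

```python
# The two types this page lists as currently supported.
SUPPORTED_TYPES = frozenset({"invoice", "bank_statement"})


def validate_schema_type(schema_type: str) -> str:
    """Return the type unchanged if it is supported, otherwise raise ValueError."""
    if schema_type not in SUPPORTED_TYPES:
        raise ValueError(
            f"Unsupported schema type {schema_type!r}; expected one of {sorted(SUPPORTED_TYPES)}"
        )
    return schema_type


def check_type_unchanged(existing_type: str, requested_type: str) -> None:
    """Reject an update that tries to change a schema's type after creation."""
    if existing_type != requested_type:
        raise ValueError(
            f"Schema type is immutable: cannot change {existing_type!r} to {requested_type!r}"
        )
```

If a schema was created with the wrong type, the practical fix is to create a new schema with the right type rather than to update the existing one.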
Schema String Format (schema_string)
- Define Fields: Clearly list all the data fields you expect to extract within the fields array.
- Consistency: While the AI is flexible, maintaining a consistent structure for similar document types across schemas can be beneficial.
- Use Description: Utilize the main schema description field to document the purpose of specific fields within the schema_string if needed.
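The sketch below shows one way to assemble a schema payload from a field list. It assumes that schema_string is a JSON-encoded string whose fields array holds plain field names, and that name, type, and description are top-level keys. This page confirms only that schema_string, fields, and description exist; the exact shape is an assumption, so verify it against the API reference.

```python
import json


def build_schema_payload(name: str, schema_type: str, description: str, fields: list) -> dict:
    """Assemble a schema payload with a JSON-encoded schema_string (assumed shape)."""
    if not fields:
        raise ValueError("A schema needs at least one field")
    if len(set(fields)) != len(fields):
        raise ValueError("Field names must be unique")
    return {
        "name": name,
        "type": schema_type,
        "description": description,  # documents the purpose of the fields
        "schema_string": json.dumps({"fields": fields}),
    }


if __name__ == "__main__":
    # The field names below are illustrative only.
    payload = build_schema_payload(
        "standard_supplier_invoice",
        "invoice",
        "invoice_number: supplier reference; total: gross amount incl. tax",
        ["invoice_number", "invoice_date", "total"],
    )
    print(json.dumps(payload, indent=2))
```

Building every schema through one helper like this is a simple way to keep the structure consistent across similar document types.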
Schema Management Workflow
- Review Regularly: Periodically review your active schemas to ensure they are still relevant and accurate for the documents you are processing.
- Update as Needed: Modify schemas when the format of your source documents changes or when you need to extract additional fields.
- Test Thoroughly: Before relying on a new or updated schema in production, test it with a variety of sample documents to ensure it extracts data correctly.
- Delete Unused Schemas: Remove schemas that are no longer needed using the DELETE /api/v1/schemas/id/{schema_id} endpoint to keep your schema list manageable.
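For the testing step, a small coverage check can flag sample documents where expected fields came back missing or empty. The sketch below assumes each extraction result has already been fetched and is available as a flat dictionary of field name to value. That result shape, the helper names, and the sample data are all hypothetical.

```python
def missing_fields(expected_fields, extraction):
    """Return the expected fields that are absent or empty in one extraction result."""
    return [f for f in expected_fields if extraction.get(f) in (None, "")]


def coverage_report(expected_fields, extractions):
    """Map each sample document to the expected fields it failed to produce."""
    report = {}
    for doc_name, extraction in extractions.items():
        gaps = missing_fields(expected_fields, extraction)
        if gaps:
            report[doc_name] = gaps
    return report


if __name__ == "__main__":
    expected = ["invoice_number", "total"]
    # Hypothetical extraction results for three sample documents.
    samples = {
        "a.pdf": {"invoice_number": "INV-1", "total": "10.00"},
        "b.pdf": {"invoice_number": "", "total": "5.00"},
        "c.pdf": {"total": "7.00"},
    }
    print(coverage_report(expected, samples))
```

An empty report means every sample produced every expected field. It does not confirm that the extracted values are correct, so spot-check values against the source documents as well.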