
Custom Layout

Classify documents by layout variation, such as vendor-specific formats. Train a custom ML model to recognize different layout templates and route processing accordingly.

Overview

The Custom Layout processor identifies which layout template a document matches. When the same document type (e.g., invoices) comes from different vendors, each vendor’s format has a distinct visual layout. The layout processor detects these variations so downstream extractors can use the correct field mappings.

Use cases:

  • Multi-vendor invoice processing (each vendor has a different invoice format)
  • Form template detection (different versions of the same form)
  • Layout-dependent extraction routing
  • Document format versioning (detecting old vs. new form layouts)
  • Vendor-specific processing pipelines

Creating a Custom Layout Processor

Create Processor

Navigate to Processors → Custom Processors and click Create processor on the Custom Layout card. Enter a name (e.g., “Invoice Layout Detector”) and click Create.

Define Variations

Open your processor and navigate to the Variation Management tab. Create a variation for each distinct layout template:

[Figure: Variation Management tab showing layout variations with reference document thumbnails, match thresholds, and training counts]

| Setting            | Description                     | Example                  |
| ------------------ | ------------------------------- | ------------------------ |
| Variation Name     | Machine-friendly identifier     | acme_invoice_v2          |
| Display Name       | Human-readable label            | “Acme Corp Invoice (v2)” |
| Color              | Visual indicator                | Orange swatch            |
| Reference Document | Upload a representative example | A sample Acme invoice    |
| Match Threshold    | Minimum confidence for matching | 0.85                     |

Reference documents are key — they serve as the visual pattern the model learns to match against. Upload a clean, representative example for each variation.
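
To make the Match Threshold setting concrete, here is a minimal sketch of how a per-variation threshold could gate a match. The `Variation` class, the `best_match` helper, the second variation, and the score dictionary are all hypothetical illustrations; they are not the product's API, and the real matching logic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variation:
    # Hypothetical container; field names mirror the settings listed above.
    name: str
    display_name: str
    match_threshold: float = 0.85


def best_match(scores: dict[str, float], variations: list[Variation]) -> Optional[Variation]:
    """Return the highest-scoring variation whose score meets its own threshold.

    `scores` maps variation names to model confidences (assumed to be in 0..1).
    Returns None when no variation clears its threshold.
    """
    candidates = [
        (scores.get(v.name, 0.0), v)
        for v in variations
        if scores.get(v.name, 0.0) >= v.match_threshold
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


# Example variations; "widget_invoice_v1" is made up for illustration.
variations = [
    Variation("acme_invoice_v2", "Acme Corp Invoice (v2)", 0.85),
    Variation("widget_invoice_v1", "Widget Co Invoice (v1)", 0.90),
]
```

With these assumptions, a document scoring 0.91 for `acme_invoice_v2` matches that variation, while a document whose best score is 0.50 matches nothing and would need another path, such as manual review.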

Variations can be reordered by dragging. Status badges show training and test counts per variation.

Import Training Documents

Navigate to the Documents tab and import documents representing all layout variations. Use the dedicated layout import interface.

Include at least 5-10 examples per variation, covering:

  • Different data content (same layout, different values)
  • Scan quality variations (clean vs. slightly skewed)
  • Minor format differences within the same vendor
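
Before training, it can help to confirm that every variation has enough examples. The sketch below is a hypothetical pre-flight check, not a product feature: it assumes you have a plain list of the variation names assigned to your imported documents, and it uses 5 as the minimum based on the guideline above.

```python
from collections import Counter

MIN_EXAMPLES = 5  # lower bound of the 5-10 examples-per-variation guideline


def underrepresented(labels, variation_names, minimum=MIN_EXAMPLES):
    """Return {variation: count} for variations with fewer than `minimum` examples.

    `labels` is a list with one variation name per labeled document (assumed input).
    """
    counts = Counter(labels)
    return {
        name: counts[name]  # Counter returns 0 for names that never appear
        for name in variation_names
        if counts[name] < minimum
    }
```

An empty result means every variation meets the minimum; otherwise the returned counts show which variations need more documents.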

Label Documents

Click a document to open the Labeling Interface, which provides:

  • Document preview with zoom and rotate controls
  • Variation assignment panel showing all defined variations
  • Click-to-assign: click a variation to assign it to the current document
  • Keyboard shortcuts (1-9) for quick variation assignment
  • Confidence display for auto-labeled variations
  • Multi-label support (a document can match multiple variations)

Workflow:

  1. View the document
  2. Identify which vendor/layout variation it matches
  3. Click the matching variation or press the corresponding number key
  4. Confirm and move to next document

Train Model

Navigate to Training Jobs and click Start Training. The layout processor trains a model to recognize visual layout patterns.

See Training for detailed configuration.

Evaluate Results

The Evaluation dashboard shows:

  • Overall layout detection accuracy
  • Per-variation performance — Accuracy, precision, recall, F1 per variation
  • Training curves — Loss and accuracy over epochs
  • Confusion matrix — Which variations get confused with each other
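
For readers who want to see how the per-variation numbers relate to the confusion matrix, the following is a minimal sketch using the standard definitions of precision, recall, and F1. It is not the dashboard's implementation; the function names and the list-of-labels input format are assumptions made for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count (actual, predicted) variation pairs."""
    return Counter(zip(y_true, y_pred))


def per_variation_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for each variation."""
    matrix = confusion_matrix(y_true, y_pred)
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = matrix[(label, label)]
        # Predicted as `label` but actually something else.
        fp = sum(n for (actual, pred), n in matrix.items() if pred == label and actual != label)
        # Actually `label` but predicted as something else.
        fn = sum(n for (actual, pred), n in matrix.items() if actual == label and pred != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

Off-diagonal entries of the matrix, where the actual and predicted variations differ, are the pairs that get confused with each other.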

Deploy

Activate the trained version for production use.

Dashboard

The layout processor has five tabs:

| Tab        | Purpose                                                       |
| ---------- | ------------------------------------------------------------- |
| Variations | Define and manage layout variations with reference documents  |
| Dataset    | Training data statistics and split assignment                 |
| Documents  | Import and manage training documents                          |
| Training   | Launch and monitor training jobs                              |
| Evaluate   | Review layout detection accuracy                              |

Backend Architecture

Custom layout processors map to the Layout Provider system in marie-ai:

  • Layout ID-based template selection — Each variation maps to a template ID (TID)
  • Configuration-driven — YAML-based layout definitions with annotator configs
  • Per-layout annotator pipelines — Each variation can have its own extraction prompts and field mappings

Layout configuration structure:

extract/
├── TID-<layout_id>/
│   └── annotator/
│       ├── config.yml        # Annotator pipeline config
│       ├── prompt_a.j2       # Jinja2 extraction prompts
│       └── _variables.json   # Template variables
├── base/
│   └── <fallback_prompts>/   # Shared prompt templates
└── config/
    └── base-config.yml       # Base extraction config

This architecture allows each vendor/layout variation to have completely different extraction logic while sharing common infrastructure.
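
As an illustration of the directory convention, the sketch below resolves the annotator config for a given layout ID. The helper name is hypothetical, and the fallback to `config/base-config.yml` when no `TID-<layout_id>` directory exists is an assumption for this example; marie-ai's actual lookup logic may differ.

```python
from pathlib import Path


def annotator_config_path(extract_root: Path, layout_id: str) -> Path:
    """Return the per-layout annotator config if present, else the base config.

    The fallback behavior is an assumption made for this sketch.
    """
    candidate = extract_root / f"TID-{layout_id}" / "annotator" / "config.yml"
    if candidate.is_file():
        return candidate
    return extract_root / "config" / "base-config.yml"
```

Keeping the lookup keyed on the template ID means that adding a vendor amounts to adding a directory, with no change to the shared code.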

Layout detection is typically the first step in a multi-processor pipeline. Once the layout is identified, the correct extractor configuration is applied automatically.

Common Pattern: Layout → Extract Pipeline

A typical multi-vendor extraction pipeline:

Document → Layout Detector → Route by Variation
├── Acme Layout → Acme Extractor (fields specific to Acme invoices)
├── Widget Co Layout → Widget Extractor (different field positions)
└── Unknown → HITL Review (manual classification)

Build this pattern using workflows with a Layout processor followed by a Branch node that routes to variation-specific extractors.
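
The branching logic can be sketched in plain Python as a dispatch table with a default. This only illustrates the routing idea; in the product the pattern is built from workflow nodes, and every function name and variation key below is hypothetical.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], str]


def acme_extractor(doc_id: str) -> str:
    # Placeholder for an extractor configured for Acme invoices.
    return f"acme-extract:{doc_id}"


def widget_extractor(doc_id: str) -> str:
    # Placeholder for an extractor configured for Widget Co invoices.
    return f"widget-extract:{doc_id}"


def hitl_review(doc_id: str) -> str:
    # Placeholder for queuing the document for human-in-the-loop review.
    return f"hitl-review:{doc_id}"


# Variation keys are made-up examples.
ROUTES: Dict[str, Handler] = {
    "acme_layout": acme_extractor,
    "widget_co_layout": widget_extractor,
}


def route(doc_id: str, variation: Optional[str]) -> str:
    """Send the document to its variation's extractor, or to HITL review if unknown."""
    handler = ROUTES.get(variation, hitl_review)
    return handler(doc_id)
```

A variation of `None`, or any name without a registered extractor, falls through to review, mirroring the Unknown branch in the diagram.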

Best Practices

  1. Use reference documents — Upload the cleanest, most representative example for each variation
  2. Set appropriate thresholds — Higher thresholds reduce false matches but may miss valid variations
  3. Include scan variations — Train on both clean and noisy versions of each layout
  4. Keep variations distinct — If two layouts are very similar, consider merging them
  5. Review the confusion matrix — Identify which variations need more training examples
  6. Combine with extractors — Build layout-aware extraction pipelines for multi-vendor processing
