RAG Indexes
Create, configure, and manage document indexes for semantic search and retrieval-augmented generation (RAG).
Creating an Index
RAG indexes organize documents into searchable collections with configurable chunking and embedding strategies.
Select a Tenant
All indexes are scoped to a tenant for data isolation. Use the tenant picker in the left sidebar to choose the target tenant.
If the tenant dropdown is empty, create a tenant first in the Administration section.
Click “New Index”
The index editor panel opens on the right side. Configure:
- Name - Unique identifier for the index (e.g., “customer-support-docs”)
- Description - Optional human-readable summary
- Embedding Model - Choose from OpenAI, Jina AI, Cohere, or custom models
Configure Chunking
Text splitting determines how documents are divided into searchable units:
| Strategy | Chunk Size | Overlap | Best For |
|---|---|---|---|
| Fixed Size | 512 tokens | 50 tokens | Uniform documents, simple pipelines |
| Sentence Boundary | 256-512 tokens | 0 tokens | Maintaining semantic coherence |
| Paragraph | Variable | 0 tokens | Structured content (reports, articles) |
| Recursive | 1024 tokens | 200 tokens | Long-form documents, books |
Overlap allows adjacent chunks to share context, improving retrieval when queries span chunk boundaries.
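To make the overlap column concrete, the sketch below splits a token list into fixed-size chunks that share a configurable number of tokens. It is illustrative only: `chunk_with_overlap` is a hypothetical helper, not part of M3 Forge, and whitespace-style placeholder words stand in for model tokens.

```python
def chunk_with_overlap(tokens, chunk_size=512, overlap=50):
    """Split a token list into fixed-size chunks sharing `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end
    return chunks


# Placeholder words stand in for real tokens here (an assumption).
words = [f"w{i}" for i in range(10)]
demo_chunks = chunk_with_overlap(words, chunk_size=4, overlap=1)
```

With a chunk size of 4 and an overlap of 1, the last token of each chunk reappears as the first token of the next, so a query that matches text near a boundary can still hit a chunk that contains both sides.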
Save and Index
Click “Create Index” to persist the configuration. The index is now ready to accept documents.

Changing embedding models or chunking strategies after documents are uploaded requires re-indexing all content. Plan your configuration before uploading large corpora.
Uploading Documents
Supported Formats
M3 Forge parses the following file types:
- PDF - Text extraction with layout preservation
- DOCX - Microsoft Word documents
- TXT - Plain text
- Markdown - .md files with formatting
- HTML - Web pages and exported content
- CSV - Structured data (each row becomes a chunk)
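As an illustration of the CSV rule above, where each row becomes its own chunk, the following sketch uses Python's standard `csv` module. The function name `csv_rows_to_chunks` and the `column: value` text layout are assumptions made for this example; how M3 Forge actually serializes a row is not documented here.

```python
import csv
import io


def csv_rows_to_chunks(csv_text):
    """Return one text chunk per data row, formatted as 'column: value' pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row in reader:
        # Join the cells of a single row into one searchable string.
        chunks.append("; ".join(f"{col}: {val}" for col, val in row.items()))
    return chunks


sample = "sku,name,price\nA1,Widget,9.99\nB2,Gadget,19.50\n"
row_chunks = csv_rows_to_chunks(sample)
```

The two data rows in `sample` yield two chunks, and the header row supplies the column names rather than becoming a chunk itself.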
Upload Methods
Drag & Drop
- Open the index in the sidebar
- Navigate to the “Files” tab
- Drag files from your desktop into the upload zone
- Upload begins automatically with progress tracking
Batch Indexing
For large document sets:
- Select multiple files (Shift+Click or Ctrl+Click)
- Upload in a single batch operation
- The backend processes files in parallel
- Status updates appear in the file list
Processing time scales with document count and complexity. A 100-page PDF typically indexes in 10-30 seconds depending on embedding model latency.
Managing Files
File List View
Each indexed file shows:
- Filename - Original document name
- Chunks - Number of text chunks extracted
- Status - indexed, processing, failed
- Uploaded - Timestamp of upload
- Actions - Re-index, delete
Re-Indexing
Re-index individual files when:
- Chunking configuration changes
- Embedding model is updated
- Document content is modified
Select the file and click “Re-index” to trigger fresh processing. Original chunks are replaced atomically.
Deleting Files
Remove files from the index:
- Select the file in the file list
- Click the trash icon
- Confirm deletion
Chunks are removed from the vector index immediately. This operation cannot be undone.
Deleting files impacts search results for queries that previously matched those chunks. Ensure dependencies (workflows, applications) are updated before deletion.
Index Configuration
Embedding Models
Switch embedding models in the index editor:
{
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "normalize": true
}
Model Comparison:
| Model | Latency (p95) | Cost (per 1M tokens) | Languages |
|---|---|---|---|
| OpenAI 3-large | 200ms | $0.13 | English-optimized |
| Jina v4 | 150ms | $0.02 | 100+ languages |
| Cohere multilingual-v3 | 180ms | $0.10 | 100+ languages |
For custom models, provide a Hugging Face model ID or a deployment endpoint.
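The normalize flag in the embedding configuration above presumably L2-normalizes each vector. That is a common convention, but it is an assumption here rather than documented behavior. The sketch below shows what it would mean in practice: after normalization every vector has unit length, so a plain dot product equals cosine similarity. The helper names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
similarity = dot(a, b)  # cosine similarity, since both vectors are unit length
```

Storing normalized vectors lets the search layer use the cheaper dot product without changing the ranking that cosine similarity would give.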
Chunking Parameters
Fine-tune splitting behavior:
- chunk_size - Target size in tokens (128-2048)
- chunk_overlap - Overlap between adjacent chunks (0-200 tokens)
- separator - Delimiter for splitting (newline, paragraph, sentence); the recursive strategy takes an ordered list, separators, as in the example configuration
- keep_separator - Include delimiter in chunks (true/false)
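The documented ranges can be checked before a configuration is submitted. This is a minimal sketch: `validate_chunking` is a hypothetical helper, and the rule that the overlap must be smaller than the chunk size is an added assumption, not something stated in the list above.

```python
def validate_chunking(config):
    """Return a list of problems found in a chunking config dict."""
    problems = []
    size = config.get("chunk_size")
    overlap = config.get("chunk_overlap", 0)
    if not isinstance(size, int) or not 128 <= size <= 2048:
        problems.append("chunk_size must be an integer between 128 and 2048")
    if not isinstance(overlap, int) or not 0 <= overlap <= 200:
        problems.append("chunk_overlap must be an integer between 0 and 200")
    elif isinstance(size, int) and overlap >= size:
        # Assumption: an overlap at or above chunk_size would never advance.
        problems.append("chunk_overlap must be smaller than chunk_size")
    if not isinstance(config.get("keep_separator", False), bool):
        problems.append("keep_separator must be true or false")
    return problems


ok = validate_chunking({"chunk_size": 1024, "chunk_overlap": 100, "keep_separator": False})
bad = validate_chunking({"chunk_size": 64, "chunk_overlap": 300})
```

An empty list means the configuration passed every check; otherwise each entry names one setting to correct.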
Example configuration:
{
  "strategy": "recursive",
  "chunk_size": 1024,
  "chunk_overlap": 100,
  "separators": ["\n\n", "\n", ". ", " "],
  "keep_separator": false
}
Recursive splitting tries each separator in order, falling back to character-level if needed.
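The fallback behavior described above can be sketched as follows. This is an illustrative implementation under simplifying assumptions, not M3 Forge's splitter: sizes are measured in characters rather than tokens, separators at chunk boundaries are dropped, overlap is omitted, and `recursive_split` is a hypothetical name.

```python
def recursive_split(text, separators, chunk_size):
    """Split text so every piece is at most chunk_size characters long."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to character-level slicing.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    if sep not in text:
        return recursive_split(text, rest, chunk_size)
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small parts
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # Part is still too big: retry with the finer separators.
            chunks.extend(recursive_split(part, rest, chunk_size))
            current = ""
    if current:
        chunks.append(current)
    return chunks


doc = "Intro line.\n\nFirst point. Second point.\n\nabcdefghijklmnopqrstuvwxyz"
pieces = recursive_split(doc, ["\n\n", "\n", ". ", " "], chunk_size=16)
```

In this example the paragraph break handles the first split, the sentence separator handles the second paragraph, and the unbroken alphabet string has no usable separator, so it is sliced by character count.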
Metadata Schema
Attach custom metadata to documents for filtered retrieval:
{
  "metadata_schema": {
    "product": "string",
    "version": "string",
    "category": "string",
    "created_date": "date"
  }
}
Metadata is specified during file upload:
{
  "file": "product-manual.pdf",
  "metadata": {
    "product": "enterprise",
    "version": "2.4.0",
    "category": "installation"
  }
}
Search queries can filter by metadata:
{
  "query": "database configuration",
  "metadata_filter": {
    "product": "enterprise",
    "version": ["2.4.0", "2.5.0"]
  }
}
Multi-Tenant Access
Tenant Hierarchy
Indexes inherit tenant permissions:
- Owner Tenant - Full read/write access
- Parent Tenants - Read-only access (if enabled)
- Child Tenants - No access by default
Enable “Include sub-tenants” to query across child tenant indexes.
Filtering by Tenant
The tenant picker in the sidebar controls which indexes are visible:
- All Tenants - Show all accessible indexes (admin only)
- Specific Tenant - Show only indexes owned by that tenant
- Include Sub-Tenants - Also show child tenant indexes (checkbox)
Search results respect tenant boundaries. To preserve security isolation, cross-tenant queries are not supported beyond the sub-tenant option described above.
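The picker rules above amount to a filter over index ownership. The sketch below is a simplified model for illustration: the tenant names, the `CHILDREN` map, the `INDEX_OWNER` map, and `visible_indexes` are all hypothetical, and every tenant below the selected one is treated as a sub-tenant.

```python
# Hypothetical tenant tree: parent -> direct children.
CHILDREN = {"acme": ["acme-eu", "acme-us"], "acme-eu": ["acme-eu-sales"]}

# Hypothetical indexes mapped to their owner tenant.
INDEX_OWNER = {
    "handbook": "acme",
    "eu-policies": "acme-eu",
    "sales-playbook": "acme-eu-sales",
    "us-policies": "acme-us",
}


def descendants(tenant):
    """Collect every tenant below `tenant` in the tree."""
    found = []
    for child in CHILDREN.get(tenant, []):
        found.append(child)
        found.extend(descendants(child))
    return found


def visible_indexes(tenant, include_sub_tenants=False):
    """Return index names shown for the selected tenant, sorted by name."""
    allowed = {tenant}
    if include_sub_tenants:
        allowed.update(descendants(tenant))
    return sorted(name for name, owner in INDEX_OWNER.items() if owner in allowed)


own_only = visible_indexes("acme-eu")
with_children = visible_indexes("acme-eu", include_sub_tenants=True)
```

Selecting a tenant without the checkbox shows only the indexes it owns; ticking the checkbox widens the allowed set downward, never sideways to sibling tenants or upward to the parent.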
Index Lifecycle
States
| State | Description | Actions Available |
|---|---|---|
| Draft | Configuration only, no documents | Edit, Delete |
| Active | Documents indexed, ready for search | Search, Upload, Edit Config |
| Indexing | Background processing in progress | View Progress |
| Error | Processing failed | View Logs, Retry |
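The table above can be read as a lookup from state to permitted actions. A minimal sketch, assuming lower-case identifiers for states and actions (the values used by the product itself are not documented here):

```python
# Allowed actions per index state, transcribed from the table above.
ALLOWED_ACTIONS = {
    "draft": {"edit", "delete"},
    "active": {"search", "upload", "edit_config"},
    "indexing": {"view_progress"},
    "error": {"view_logs", "retry"},
}


def can_perform(state, action):
    """Return True if `action` is listed for `state`; unknown states allow nothing."""
    return action in ALLOWED_ACTIONS.get(state, set())
```

A client could call `can_perform("indexing", "upload")` before enabling an upload button, for example, and keep the button disabled while background processing is in progress.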
Monitoring
The index detail view shows:
- Document Count - Total files indexed
- Chunk Count - Total searchable text chunks
- Storage Size - Vector and metadata storage
- Last Updated - Most recent file upload timestamp
- Error Count - Failed indexing operations (if any)
Maintenance
Regular maintenance tasks:
- Reindex All - Rebuild entire index (after config changes)
- Optimize - Compact vector storage and rebuild search structures
- Export Metadata - Download document list with chunk counts
Large indexes (10,000+ documents) benefit from periodic optimization to reduce query latency and storage overhead.
Best Practices
Index Design
- Single Domain per Index - Don’t mix unrelated document types
- Consistent Chunking - Use same strategy across all documents in an index
- Version Indexes - Create new indexes for major content updates (e.g., docs-v1, docs-v2)
- Metadata First - Define metadata schema before uploading documents
Performance Optimization
- Batch Uploads - Process multiple files in one operation
- Async Indexing - Upload during off-peak hours for large batches
- Index Warming - Run test queries after creation to populate caches
- Monitor Latency - Track p95 search latency in monitoring dashboard
Cost Management
- Right-Size Chunks - Larger chunks reduce embedding API calls
- Cache Embeddings - Reuse embeddings for duplicate content
- Prune Old Versions - Delete outdated documents to reduce storage costs
- Choose Efficient Models - Jina v4 costs about 85% less per token than OpenAI 3-large ($0.02 vs $0.13 per 1M tokens)
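The savings figure follows from the per-token prices in the model comparison table. The short calculation below reproduces it and estimates the embedding cost for a corpus. The 50-million-token corpus size is an arbitrary example, and the prices are the ones quoted in this document, which may change.

```python
# Prices in USD per 1M tokens, copied from the model comparison table.
PRICE_PER_MILLION = {
    "OpenAI 3-large": 0.13,
    "Jina v4": 0.02,
    "Cohere multilingual-v3": 0.10,
}


def embedding_cost(model, total_tokens):
    """Estimated one-off cost in USD to embed `total_tokens` tokens."""
    return PRICE_PER_MILLION[model] * total_tokens / 1_000_000


def savings_percent(cheaper, baseline):
    """Relative saving of `cheaper` versus `baseline`, as a percentage."""
    base = PRICE_PER_MILLION[baseline]
    return (base - PRICE_PER_MILLION[cheaper]) / base * 100


corpus_tokens = 50_000_000  # arbitrary example corpus size
openai_cost = embedding_cost("OpenAI 3-large", corpus_tokens)  # about 6.50 USD
jina_cost = embedding_cost("Jina v4", corpus_tokens)           # about 1.00 USD
saving = savings_percent("Jina v4", "OpenAI 3-large")          # about 84.6
```

Re-indexing repeats this cost, because every chunk must be embedded again, which is one more reason to settle the chunking and model configuration before uploading a large corpus.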
Next Steps
- Learn how to search indexes for optimal retrieval
- Integrate indexes into workflows for RAG pipelines
- Monitor index health in the monitoring dashboard