
RAG Indexes

Create, configure, and manage document indexes for semantic search and retrieval-augmented generation.

Creating an Index

RAG indexes organize documents into searchable collections with configurable chunking and embedding strategies.

Select a Tenant

All indexes are scoped to a tenant for data isolation. Use the tenant picker in the left sidebar to choose the target tenant.

If the tenant dropdown is empty, create a tenant first in the Administration section.

Click “New Index”

The index editor panel opens on the right side. Configure:

  • Name - Unique identifier for the index (e.g., “customer-support-docs”)
  • Description - Optional human-readable summary
  • Embedding Model - Choose from OpenAI, Jina AI, Cohere, or custom models

Configure Chunking

Text splitting determines how documents are divided into searchable units:

| Strategy | Chunk Size | Overlap | Best For |
|---|---|---|---|
| Fixed Size | 512 tokens | 50 tokens | Uniform documents, simple pipelines |
| Sentence Boundary | 256-512 tokens | 0 tokens | Maintaining semantic coherence |
| Paragraph | Variable | 0 tokens | Structured content (reports, articles) |
| Recursive | 1024 tokens | 200 tokens | Long-form documents, books |

Overlap allows adjacent chunks to share context, improving retrieval when queries span chunk boundaries.
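
To make the effect of overlap concrete, here is a minimal sketch of fixed-size splitting using the Fixed Size defaults from the table (512 tokens, 50 overlap). It is an illustration, not M3 Forge's implementation: the function name `fixed_size_chunks` is hypothetical, and placeholder strings stand in for real model tokens.

```python
def fixed_size_chunks(tokens, chunk_size=512, overlap=50):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Assumption: placeholder strings stand in for tokenizer output.
tokens = [f"t{i}" for i in range(1200)]
chunks = fixed_size_chunks(tokens, chunk_size=512, overlap=50)
```

Each chunk starts 462 tokens after the previous one, so the last 50 tokens of one chunk are repeated as the first 50 of the next. A query whose answer straddles a boundary can still match a single chunk.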

Save and Index

Click “Create Index” to persist the configuration. The index is now ready to accept documents.

[Figure: RAG Index detail view showing document list, chunking configuration, and embedding model settings]

Changing embedding models or chunking strategies after documents are uploaded requires re-indexing all content. Plan your configuration before uploading large corpora.

Uploading Documents

Supported Formats

M3 Forge parses the following file types:

  • PDF - Text extraction with layout preservation
  • DOCX - Microsoft Word documents
  • TXT - Plain text
  • Markdown - .md files with formatting
  • HTML - Web pages and exported content
  • CSV - Structured data (each row becomes a chunk)
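
The "each row becomes a chunk" behavior for CSV can be pictured with the sketch below. How M3 Forge actually serializes a row into chunk text is not documented here, so the `column: value` format and the function name `csv_rows_to_chunks` are assumptions for illustration.

```python
import csv
import io


def csv_rows_to_chunks(csv_text):
    """Turn each CSV data row into one chunk of 'column: value' text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row_number, row in enumerate(reader, start=1):
        # Assumed serialization: keep the header names so the chunk is
        # meaningful on its own when retrieved.
        text = "; ".join(f"{column}: {value}" for column, value in row.items())
        chunks.append({"row": row_number, "text": text})
    return chunks


sample = "sku,name,price\nA1,Widget,9.99\nB2,Gadget,19.50\n"
chunks = csv_rows_to_chunks(sample)
```

Because every row is embedded separately, chunking parameters such as chunk size and overlap would not apply to CSV files under this reading.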

Upload Methods

  1. Open the index in the sidebar
  2. Navigate to the “Files” tab
  3. Drag files from your desktop into the upload zone
  4. The upload begins automatically and shows its progress

Batch Indexing

For large document sets:

  1. Select multiple files (Shift+Click or Ctrl+Click)
  2. Upload in a single batch operation
  3. The backend processes the files in parallel
  4. Status updates appear in the file list

Processing time scales with document count and complexity. A 100-page PDF typically indexes in 10-30 seconds depending on embedding model latency.

Managing Files

File List View

Each indexed file shows:

  • Filename - Original document name
  • Chunks - Number of text chunks extracted
  • Status - One of indexed, processing, or failed
  • Uploaded - Timestamp of upload
  • Actions - Re-index, delete

Re-Indexing

Re-index individual files when:

  • Chunking configuration changes
  • Embedding model is updated
  • Document content is modified

Select the file and click “Re-index” to trigger fresh processing. Original chunks are replaced atomically.

Deleting Files

Remove files from the index:

  1. Select the file in the file list
  2. Click the trash icon
  3. Confirm deletion

Chunks are removed from the vector index immediately. This operation cannot be undone.

Deleting files impacts search results for queries that previously matched those chunks. Ensure dependencies (workflows, applications) are updated before deletion.

Index Configuration

Embedding Models

Switch embedding models in the index editor:

{
  "model": "openai/text-embedding-3-large",
  "dimensions": 3072,
  "normalize": true
}
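
The `normalize` flag is not defined further on this page. A common meaning, assumed here, is L2 normalization: each embedding is scaled to unit length so that a plain dot product equals cosine similarity. The sketch below shows that relationship with two toy vectors; the helper names are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumed meaning of "normalize": true)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # a zero vector has no direction to preserve
    return [x / norm for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
similarity = dot(a, b)  # cosine similarity, since both inputs are unit length
```

If this assumption holds, indexes built with and without normalization would rank results differently under dot-product search, which is one reason a model or setting change calls for re-indexing.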

Model Comparison:

| Model | Latency (p95) | Cost (per 1M tokens) | Languages |
|---|---|---|---|
| OpenAI 3-large | 200ms | $0.13 | English-optimized |
| Jina v4 | 150ms | $0.02 | 100+ languages |
| Cohere multilingual-v3 | 180ms | $0.10 | 100+ languages |

For custom models, provide a Hugging Face model ID or a deployment endpoint.

Chunking Parameters

Fine-tune splitting behavior:

  • chunk_size - Target size in tokens (128-2048)
  • chunk_overlap - Overlap between adjacent chunks (0-200 tokens)
  • separator - Delimiter for splitting (newline, paragraph, sentence); the recursive strategy takes an ordered list, separators, instead
  • keep_separator - Include delimiter in chunks (true/false)

Example configuration:

{
  "strategy": "recursive",
  "chunk_size": 1024,
  "chunk_overlap": 100,
  "separators": ["\n\n", "\n", ". ", " "],
  "keep_separator": false
}

Recursive splitting tries each separator in order, falling back to character-level splitting if a piece still exceeds chunk_size.
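
The recursive strategy can be sketched as follows. This is an illustrative approximation and not M3 Forge's splitter: it measures size in characters rather than tokens, ignores chunk_overlap, drops separators at chunk boundaries (as with keep_separator set to false), and the function name `recursive_split` is hypothetical.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", ". ", " ")):
    """Split text into pieces of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed character slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing parts into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # This part alone is too large: retry it with the next separator.
            chunks.extend(recursive_split(part, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


text = "Alpha beta gamma. Delta epsilon.\n\nZeta eta theta iota kappa lambda."
pieces = recursive_split(text, chunk_size=20)
```

Ordering the separators from coarse to fine means paragraphs stay intact when they fit, sentences are tried next, and words or raw characters are used only as a last resort.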

Metadata Schema

Attach custom metadata to documents for filtered retrieval:

{
  "metadata_schema": {
    "product": "string",
    "version": "string",
    "category": "string",
    "created_date": "date"
  }
}

Metadata is specified during file upload:

{
  "file": "product-manual.pdf",
  "metadata": {
    "product": "enterprise",
    "version": "2.4.0",
    "category": "installation"
  }
}
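
Before uploading, it can help to check metadata against the schema on the client side. The sketch below is one way to do that under stated assumptions: fields are optional (the upload example omits created_date), dates are ISO 8601 strings, and unknown fields are rejected. None of these rules is confirmed by this page, and `validate_metadata` is a hypothetical helper.

```python
from datetime import date


def _is_iso_date(value):
    """Return True for strings such as '2024-05-01'."""
    if not isinstance(value, str):
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Assumed checks for the two schema types used in the example schema.
CHECKS = {"string": lambda v: isinstance(v, str), "date": _is_iso_date}


def validate_metadata(metadata, schema):
    """Return a list of problems; an empty list means the metadata fits."""
    errors = []
    for field, value in metadata.items():
        expected = schema.get(field)
        if expected is None:
            errors.append(f"unknown field: {field}")
        elif expected not in CHECKS:
            errors.append(f"{field}: unsupported type {expected}")
        elif not CHECKS[expected](value):
            errors.append(f"{field}: expected {expected}")
    return errors


schema = {"product": "string", "version": "string",
          "category": "string", "created_date": "date"}
upload = {"product": "enterprise", "version": "2.4.0", "category": "installation"}
problems = validate_metadata(upload, schema)
```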

Search queries can filter by metadata:

{
  "query": "database configuration",
  "metadata_filter": {
    "product": "enterprise",
    "version": ["2.4.0", "2.5.0"]
  }
}
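
The filter semantics are not spelled out here. A plausible reading, assumed in this sketch, is that all fields must match (AND) and a list value matches any of its entries (OR). The function `matches_filter` and the sample chunks are hypothetical.

```python
def matches_filter(metadata, metadata_filter):
    """Return True if the chunk metadata satisfies every filter field."""
    for field, wanted in metadata_filter.items():
        actual = metadata.get(field)
        if isinstance(wanted, list):
            # Assumed: a list means "any of these values".
            if actual not in wanted:
                return False
        elif actual != wanted:
            return False
    return True


metadata_filter = {"product": "enterprise", "version": ["2.4.0", "2.5.0"]}
chunks = [
    {"id": 1, "metadata": {"product": "enterprise", "version": "2.4.0"}},
    {"id": 2, "metadata": {"product": "enterprise", "version": "2.3.0"}},
    {"id": 3, "metadata": {"product": "community", "version": "2.5.0"}},
]
kept = [c["id"] for c in chunks if matches_filter(c["metadata"], metadata_filter)]
```

Under this reading, only the first chunk survives: the second has the wrong version and the third the wrong product.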

Multi-Tenant Access

Tenant Hierarchy

Indexes inherit tenant permissions:

  • Owner Tenant - Full read/write access
  • Parent Tenants - Read-only access (if enabled)
  • Child Tenants - No access by default

Enable “Include sub-tenants” to query across child tenant indexes.

Filtering by Tenant

The tenant picker in the sidebar controls which indexes are visible:

  • All Tenants - Show all accessible indexes (admin only)
  • Specific Tenant - Show only indexes owned by that tenant
  • Include Sub-Tenants - Also show child tenant indexes (checkbox)

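Search results respect tenant boundaries: a query only reaches indexes within the selected tenant scope (the tenant itself, plus its sub-tenants when “Include sub-tenants” is enabled). Queries across unrelated tenants are not supported, to preserve security isolation.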
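
The picker behavior can be modeled roughly as below. This sketch assumes that "sub-tenants" covers all descendants, not only direct children, and that the hierarchy is a tree with no cycles. The data layout (`parent_of`, index dictionaries) and the function name are hypothetical.

```python
def visible_indexes(indexes, parent_of, selected, include_sub_tenants=False):
    """List index names visible when `selected` is chosen in the tenant picker."""

    def descends_from_selected(tenant):
        # Walk up the hierarchy until the selected tenant or the root is reached.
        current = parent_of.get(tenant)
        while current is not None:
            if current == selected:
                return True
            current = parent_of.get(current)
        return False

    return [
        index["name"]
        for index in indexes
        if index["tenant"] == selected
        or (include_sub_tenants and descends_from_selected(index["tenant"]))
    ]


# Hypothetical tenants: each key maps to its parent (None for a root tenant).
parent_of = {"acme": None, "acme-eu": "acme",
             "acme-eu-sales": "acme-eu", "globex": None}
indexes = [
    {"name": "handbook", "tenant": "acme"},
    {"name": "eu-policies", "tenant": "acme-eu"},
    {"name": "sales-playbook", "tenant": "acme-eu-sales"},
    {"name": "globex-docs", "tenant": "globex"},
]
own_only = visible_indexes(indexes, parent_of, "acme")
with_children = visible_indexes(indexes, parent_of, "acme", include_sub_tenants=True)
```

In both cases the unrelated tenant's index stays hidden, which matches the isolation rule described above.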

Index Lifecycle

States

| State | Description | Actions Available |
|---|---|---|
| Draft | Configuration only, no documents | Edit, Delete |
| Active | Documents indexed, ready for search | Search, Upload, Edit Config |
| Indexing | Background processing in progress | View Progress |
| Error | Processing failed | View Logs, Retry |
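
Client code that automates index operations may want to check the state before acting. The lookup below simply encodes the states table; the lowercase identifiers and the `can_perform` helper are hypothetical names, not part of any documented API.

```python
# Actions available in each state, transcribed from the states table.
ALLOWED_ACTIONS = {
    "draft": {"edit", "delete"},
    "active": {"search", "upload", "edit_config"},
    "indexing": {"view_progress"},
    "error": {"view_logs", "retry"},
}


def can_perform(state, action):
    """Return True if the action is listed for the given index state."""
    return action in ALLOWED_ACTIONS.get(state, set())


upload_while_indexing = can_perform("indexing", "upload")
retry_after_error = can_perform("error", "retry")
```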

Monitoring

The index detail view shows:

  • Document Count - Total files indexed
  • Chunk Count - Total searchable text chunks
  • Storage Size - Vector and metadata storage
  • Last Updated - Most recent file upload timestamp
  • Error Count - Failed indexing operations (if any)

Maintenance

Regular maintenance tasks:

  • Reindex All - Rebuild entire index (after config changes)
  • Optimize - Compact vector storage and rebuild search structures
  • Export Metadata - Download document list with chunk counts

Large indexes (10,000+ documents) benefit from periodic optimization to reduce query latency and storage overhead.

Best Practices

Index Design

  • Single Domain per Index - Don’t mix unrelated document types
  • Consistent Chunking - Use same strategy across all documents in an index
  • Version Indexes - Create new indexes for major content updates (e.g., docs-v1, docs-v2)
  • Metadata First - Define metadata schema before uploading documents

Performance Optimization

  • Batch Uploads - Process multiple files in one operation
  • Async Indexing - Upload during off-peak hours for large batches
  • Index Warming - Run test queries after creation to populate caches
  • Monitor Latency - Track p95 search latency in monitoring dashboard

Cost Management

  • Right-Size Chunks - Larger chunks reduce embedding API calls
  • Cache Embeddings - Reuse embeddings for duplicate content
  • Prune Old Versions - Delete outdated documents to reduce storage costs
  • Choose Efficient Models - Jina v4 costs about 85% less per 1M tokens than OpenAI 3-large ($0.02 vs $0.13)
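
The savings figure follows directly from the per-token prices in the Model Comparison table. The sketch below reproduces the arithmetic and estimates embedding cost for a corpus. The 50M-token corpus size is an arbitrary example, and the prices are those listed on this page, which may change.

```python
# Prices in USD per 1M tokens, copied from the Model Comparison table.
PRICE_PER_MILLION_TOKENS = {
    "OpenAI 3-large": 0.13,
    "Jina v4": 0.02,
    "Cohere multilingual-v3": 0.10,
}


def embedding_cost(total_tokens, model):
    """Estimated cost in USD of embedding total_tokens with the given model."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[model]


def savings_percent(cheaper, baseline):
    """Percentage saved by using `cheaper` instead of `baseline`."""
    prices = PRICE_PER_MILLION_TOKENS
    return (1 - prices[cheaper] / prices[baseline]) * 100


corpus_tokens = 50_000_000  # hypothetical corpus size
openai_cost = embedding_cost(corpus_tokens, "OpenAI 3-large")
jina_cost = embedding_cost(corpus_tokens, "Jina v4")
saving = savings_percent("Jina v4", "OpenAI 3-large")
```

With these prices the saving is 1 - 0.02/0.13, roughly 84.6%, which rounds to the 85% quoted above. Overlap also adds to the token count, since shared tokens are embedded once per chunk they appear in.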
