Knowledge Base
Build and manage intelligent document retrieval systems with RAG (Retrieval Augmented Generation) indexes, semantic search, and hybrid ranking.
What is the Knowledge Base?
The Knowledge Base in M3 Forge provides document indexing and retrieval infrastructure for AI applications. It enables:
- RAG Pipelines - Retrieve relevant context from large document corpora to augment LLM responses
- Semantic Search - Find documents by meaning, not just keywords
- Hybrid Ranking - Combine semantic similarity with keyword matching to improve both precision and recall
- Multi-Tenant Isolation - Organize indexes by tenant with hierarchical access control
- Chunk Management - Configure text splitting strategies optimized for your domain
RAG Architecture
Retrieval Augmented Generation enhances LLM outputs by grounding responses in retrieved documents:
1. Index Creation - Upload documents and configure chunking strategy
2. Embedding Generation - Text chunks converted to vector embeddings via embedding models
3. Query Processing - User questions embedded with the same model
4. Similarity Search - Vector database returns most relevant chunks
5. Context Augmentation - Retrieved chunks injected into LLM prompt
6. Response Generation - LLM generates answer grounded in source documents
This pattern reduces hallucination by constraining the LLM to factual information from your documents.
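The six steps above can be sketched end to end in a few lines. This is an illustrative toy, not M3 Forge's implementation: `embed` is a bag-of-words stand-in for a real embedding model, and the corpus is two hard-coded chunks.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: bag-of-words counts. A real index would call an
    # embedding model (e.g. text-embedding-3-large) and get a dense vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: chunk documents and embed each chunk
chunks = [
    "Refunds are issued within 14 days of purchase.",
    "The enterprise plan includes priority support.",
]
index = [(c, embed(c)) for c in chunks]

# Steps 3-4: embed the question and rank chunks by similarity
question = "How long do refunds take?"
q_vec = embed(question)
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)

# Steps 5-6: inject the top chunk into the LLM prompt
prompt = f"Context:\n{ranked[0][0]}\n\nQuestion: {question}"
```

The same shape holds at scale; the vector database simply replaces the sorted list.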

Key Features
Document Management
- File Upload - Drag-and-drop interface for PDF, TXT, DOCX, Markdown, HTML
- Batch Indexing - Process hundreds of documents in a single operation
- Version Control - Track document updates and re-index changed content
- Metadata Filtering - Attach custom metadata for filtered retrieval
Embedding Models
M3 Forge supports multiple embedding providers:
| Provider | Model | Dimensions | Use Case |
|---|---|---|---|
| OpenAI | text-embedding-3-large | 3072 | High accuracy, English-optimized |
| Jina AI | jina-embeddings-v4 | 768 | Multilingual, fast inference |
| Cohere | embed-multilingual-v3 | 1024 | Multilingual semantic search |
| Custom | Any HuggingFace model | Variable | Domain-specific fine-tuned models |
Choose models based on language coverage, latency requirements, and domain specificity.
Chunking Strategies
Configure text splitting to balance context window size with retrieval precision:
- Fixed Size - Split every N tokens (simple, predictable)
- Sentence Boundary - Respect sentence structure (better semantic coherence)
- Paragraph - Maintain logical document sections (ideal for structured content)
- Recursive - Hierarchical splitting with overlap (best for long-form documents)
Chunk size trades off precision vs recall. Smaller chunks (256-512 tokens) improve retrieval precision but may miss broader context. Larger chunks (1024+ tokens) provide more context but may dilute relevance scores.
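A minimal sketch of the fixed-size strategy with overlap, the simplest of the four. The function and its parameters are illustrative, not M3 Forge's API; tokens are represented as plain strings.

```python
def chunk_fixed(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split a token sequence into fixed-size windows, with `overlap`
    tokens shared between consecutive chunks to preserve context."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = [f"t{i}" for i in range(10)]
chunks = chunk_fixed(tokens, size=4, overlap=1)
# Three chunks of four tokens; each pair of neighbors shares one token.
```

Sentence-boundary and recursive strategies follow the same pattern but choose split points from document structure instead of a fixed stride.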
Search Capabilities
- Semantic Search - Vector similarity using cosine distance
- Hybrid Search - Combine vector search with BM25 keyword ranking
- Metadata Filters - Restrict results by custom attributes
- Reranking - Secondary model re-scores top results for relevance
- MMR (Maximal Marginal Relevance) - Diversify results to reduce redundancy
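One common way to fuse vector and BM25 rankings is reciprocal rank fusion (RRF); the docs above don't specify M3 Forge's exact fusion method, so treat this as one representative approach, with made-up document ids.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal Rank Fusion: merge ranked lists without needing the
    underlying scores to be on comparable scales."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

semantic = ["d3", "d1", "d2"]   # ranked by cosine similarity
keyword = ["d1", "d2", "d4"]    # ranked by BM25
scores = rrf([semantic, keyword])
fused = sorted(scores, key=scores.get, reverse=True)
# d1 appears near the top of both lists, so it wins the fused ranking.
```

Documents ranked well by both retrievers rise to the top, which is why hybrid search often beats either method alone.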
When to Use Knowledge Base
| Use Case | Example |
|---|---|
| Customer Support | Answer questions using product documentation and past tickets |
| Legal Research | Search case law and contracts by semantic meaning |
| Internal Wiki | Company knowledge base with natural language queries |
| Research Assistant | Literature review over academic papers |
| Compliance | Find relevant regulations and policy documents |
For simple keyword search over structured data, prefer traditional databases. RAG excels when semantic understanding matters.
Tenant-Aware Indexing
Indexes are scoped to tenants for data isolation and access control:
- Single Tenant - Dedicated index per customer (full isolation)
- Multi-Tenant - Shared index with metadata filtering (cost-efficient)
- Hierarchical - Parent-child tenant relationships with inheritance
The UI provides tenant selection and sub-tenant inclusion controls for flexible querying across organizational boundaries.
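Hierarchical scoping with sub-tenant inclusion can be sketched as a metadata filter over chunk records. The data shapes here (a parent-to-children dict, chunks tagged with a `tenant` field) are assumptions for illustration, not M3 Forge's storage model.

```python
def visible_tenants(tenant: str, tree: dict[str, list[str]],
                    include_children: bool) -> set[str]:
    """Expand a tenant id to itself plus, optionally, all descendants."""
    scope = {tenant}
    if include_children:
        stack = list(tree.get(tenant, []))
        while stack:
            t = stack.pop()
            scope.add(t)
            stack.extend(tree.get(t, []))
    return scope

tree = {"acme": ["acme-eu", "acme-us"], "acme-eu": ["acme-de"]}
chunks = [
    {"text": "EU policy", "tenant": "acme-eu"},
    {"text": "DE policy", "tenant": "acme-de"},
    {"text": "Other org", "tenant": "globex"},
]

# Query as acme-eu with sub-tenant inclusion enabled:
scope = visible_tenants("acme-eu", tree, include_children=True)
hits = [c for c in chunks if c["tenant"] in scope]
```

The same filter applied at query time is what makes the shared-index multi-tenant mode safe: isolation is enforced on every search, not by physical separation.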
Getting Started
RAG Indexes
Create and manage RAG indexes with document upload and configuration.
Search
Query indexes with semantic search, hybrid ranking, and metadata filters.
Integration with Workflows
Use Knowledge Base in workflows via the RAG Retrieval Node:
```json
{
  "type": "rag-retrieval",
  "config": {
    "index_id": "customer-docs-v2",
    "query": "$.data.question",
    "top_k": 5,
    "search_type": "hybrid",
    "metadata_filter": {
      "product": "enterprise"
    }
  }
}
```

Retrieved chunks flow to downstream nodes via `$.nodes.<node_id>.output.chunks`, ready for LLM context injection.
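A downstream LLM node typically stitches those chunks into a prompt. The chunk record shape below (`text` and `score` fields) is an assumption for illustration; check the node's actual output schema.

```python
# Hypothetical output of a rag-retrieval node, as seen by a downstream node.
retrieved = {
    "chunks": [
        {"text": "Enterprise plans include SSO.", "score": 0.91},
        {"text": "SSO setup requires admin rights.", "score": 0.84},
    ]
}

# Concatenate chunk texts into a context block and inject it into the prompt.
context = "\n\n".join(c["text"] for c in retrieved["chunks"])
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: Does the enterprise plan support SSO?"
)
```

Instructing the model to answer only from the supplied context is what keeps the final response grounded in the retrieved documents.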
Architecture
Indexes are stored in PostgreSQL, with vector embeddings held via the pgvector extension. The Marie-AI backend handles:
- Document parsing and chunking
- Embedding generation via configured models
- Vector storage and similarity search
- Metadata indexing and filtering
The M3 Forge frontend provides visual management through:
- RAG Index Sidebar (tenant filtering, search)
- Index Editor Panel (configuration, chunking)
- File Upload UI (drag-and-drop, batch operations)
Vector indexes require significant storage and compute. A 10,000-document corpus with 3072-dimensional float32 embeddings consumes roughly 120MB of raw vector storage (one embedding per document, before index overhead). Plan infrastructure accordingly.
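The estimate works out as simple arithmetic, assuming one float32 embedding per document and excluding pgvector index overhead:

```python
docs = 10_000
dims = 3072              # text-embedding-3-large
bytes_per_value = 4      # float32
mb = docs * dims * bytes_per_value / 1e6  # ~122.9 MB
```

Multiply by the average number of chunks per document for a chunk-level index, and budget extra for HNSW or IVFFlat index structures on top of the raw vectors.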
Next Steps
- Learn how to create and configure RAG indexes
- Explore search capabilities for optimal retrieval
- Integrate retrieval into workflows for RAG pipelines