Knowledge Base
Build and manage intelligent document retrieval systems with RAG (Retrieval Augmented Generation) indexes, semantic search, and hybrid ranking.
What is the Knowledge Base?
The Knowledge Base in M3 Forge provides document indexing and retrieval infrastructure for AI applications. It enables:
- RAG Pipelines - Retrieve relevant context from large document corpora to augment LLM responses
- Semantic Search - Find documents by meaning, not just keywords
- Hybrid Ranking - Combine semantic similarity with keyword matching to improve both precision and recall
- Multi-Tenant Isolation - Organize indexes by tenant with hierarchical access control
- Chunk Management - Configure text splitting strategies optimized for your domain
RAG Architecture
Retrieval Augmented Generation enhances LLM outputs by grounding responses in retrieved documents:
1. Index Creation - Upload documents and configure chunking strategy
2. Embedding Generation - Text chunks converted to vector embeddings via embedding models
3. Query Processing - User questions embedded with the same model
4. Similarity Search - Vector database returns most relevant chunks
5. Context Augmentation - Retrieved chunks injected into LLM prompt
6. Response Generation - LLM generates answer grounded in source documents
This pattern reduces hallucination by constraining the LLM to factual information from your documents.
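The six steps above can be sketched end to end in a few lines. This is an illustrative toy, not M3 Forge's implementation: `embed` is a bag-of-words stand-in for a real embedding model, and the corpus is two hard-coded chunks.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: bag-of-words counts. A real index would call an
    # embedding model (e.g. text-embedding-3-large) and get a dense vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: chunk documents and embed each chunk
chunks = [
    "Refunds are issued within 14 days of purchase.",
    "The enterprise plan includes priority support.",
]
index = [(c, embed(c)) for c in chunks]

# Steps 3-4: embed the question and rank chunks by similarity
question = "How long do refunds take?"
q_vec = embed(question)
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)

# Steps 5-6: inject the top chunk into the LLM prompt
prompt = f"Context:\n{ranked[0][0]}\n\nQuestion: {question}"
```

The same shape holds at scale; the vector database simply replaces the sorted list.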

Key Features
Document Management
- File Upload - Drag-and-drop interface for PDF, TXT, DOCX, Markdown, HTML
- Batch Indexing - Process hundreds of documents in a single operation
- Version Control - Track document updates and re-index changed content
- Metadata Filtering - Attach custom metadata for filtered retrieval
Embedding Models
M3 Forge supports multiple embedding providers:
| Provider | Model | Dimensions | Use Case |
|---|---|---|---|
| OpenAI | text-embedding-3-large | 3072 | High accuracy, English-optimized |
| Jina AI | jina-embeddings-v4 | 768 | Multilingual, fast inference |
| Cohere | embed-multilingual-v3 | 1024 | Multilingual semantic search |
| Custom | Any HuggingFace model | Variable | Domain-specific fine-tuned models |
Choose models based on language coverage, latency requirements, and domain specificity.
Chunking Strategies
Configure text splitting to balance context window size with retrieval precision:
- Fixed Size - Split every N tokens (simple, predictable)
- Sentence Boundary - Respect sentence structure (better semantic coherence)
- Paragraph - Maintain logical document sections (ideal for structured content)
- Recursive - Hierarchical splitting with overlap (best for long-form documents)
Chunk size trades off precision vs recall. Smaller chunks (256-512 tokens) improve retrieval precision but may miss broader context. Larger chunks (1024+ tokens) provide more context but may dilute relevance scores.
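A minimal sketch of the fixed-size strategy with overlap, the simplest of the four. The function and its parameters are illustrative, not M3 Forge's API; tokens are represented as plain strings.

```python
def chunk_fixed(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split a token sequence into fixed-size windows, with `overlap`
    tokens shared between consecutive chunks to preserve context."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = [f"t{i}" for i in range(10)]
chunks = chunk_fixed(tokens, size=4, overlap=1)
# Three chunks of four tokens; each pair of neighbors shares one token.
```

Sentence-boundary and recursive strategies follow the same pattern but choose split points from document structure instead of a fixed stride.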
Search Capabilities
- Semantic Search - Vector similarity using cosine distance
- Hybrid Search - Combine vector search with BM25 keyword ranking
- Metadata Filters - Restrict results by custom attributes
- Reranking - Secondary model re-scores top results for relevance
- MMR (Maximal Marginal Relevance) - Diversify results to reduce redundancy
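One common way to fuse vector and BM25 rankings is reciprocal rank fusion (RRF); the docs above don't specify M3 Forge's exact fusion method, so treat this as one representative approach, with made-up document ids.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal Rank Fusion: merge ranked lists without needing the
    underlying scores to be on comparable scales."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

semantic = ["d3", "d1", "d2"]   # ranked by cosine similarity
keyword = ["d1", "d2", "d4"]    # ranked by BM25
scores = rrf([semantic, keyword])
fused = sorted(scores, key=scores.get, reverse=True)
# d1 appears near the top of both lists, so it wins the fused ranking.
```

Documents ranked well by both retrievers rise to the top, which is why hybrid search often beats either method alone.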
When to Use Knowledge Base
| Use Case | Example |
|---|---|
| Customer Support | Answer questions using product documentation and past tickets |
| Legal Research | Search case law and contracts by semantic meaning |
| Internal Wiki | Company knowledge base with natural language queries |
| Research Assistant | Literature review over academic papers |
| Compliance | Find relevant regulations and policy documents |
For simple keyword search over structured data, prefer traditional databases. RAG excels when semantic understanding matters.
Tenant-Aware Indexing
Indexes are scoped to tenants for data isolation and access control:
- Single Tenant - Dedicated index per customer (full isolation)
- Multi-Tenant - Shared index with metadata filtering (cost-efficient)
- Hierarchical - Parent-child tenant relationships with inheritance
The UI provides tenant selection and sub-tenant inclusion controls for flexible querying across organizational boundaries.
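Hierarchical scoping with sub-tenant inclusion can be sketched as a metadata filter over chunk records. The data shapes here (a parent-to-children dict, chunks tagged with a `tenant` field) are assumptions for illustration, not M3 Forge's storage model.

```python
def visible_tenants(tenant: str, tree: dict[str, list[str]],
                    include_children: bool) -> set[str]:
    """Expand a tenant id to itself plus, optionally, all descendants."""
    scope = {tenant}
    if include_children:
        stack = list(tree.get(tenant, []))
        while stack:
            t = stack.pop()
            scope.add(t)
            stack.extend(tree.get(t, []))
    return scope

tree = {"acme": ["acme-eu", "acme-us"], "acme-eu": ["acme-de"]}
chunks = [
    {"text": "EU policy", "tenant": "acme-eu"},
    {"text": "DE policy", "tenant": "acme-de"},
    {"text": "Other org", "tenant": "globex"},
]

# Query as acme-eu with sub-tenant inclusion enabled:
scope = visible_tenants("acme-eu", tree, include_children=True)
hits = [c for c in chunks if c["tenant"] in scope]
```

The same filter applied at query time is what makes the shared-index multi-tenant mode safe: isolation is enforced on every search, not by physical separation.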
Getting Started
RAG Indexes
Create and manage RAG indexes with document upload and configuration.
Search
Query indexes with semantic search, hybrid ranking, and metadata filters.
Integration with Workflows
Use Knowledge Base in workflows via the RAG Retrieval Node:
```json
{
  "type": "rag-retrieval",
  "config": {
    "index_id": "customer-docs-v2",
    "query": "$.data.question",
    "top_k": 5,
    "search_type": "hybrid",
    "metadata_filter": {
      "product": "enterprise"
    }
  }
}
```

Retrieved chunks flow to downstream nodes via `$.nodes.<node_id>.output.chunks`, ready for LLM context injection.
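A downstream LLM node typically stitches those chunks into a prompt. The chunk record shape below (`text` and `score` fields) is an assumption for illustration; check the node's actual output schema.

```python
# Hypothetical output of a rag-retrieval node, as seen by a downstream node.
retrieved = {
    "chunks": [
        {"text": "Enterprise plans include SSO.", "score": 0.91},
        {"text": "SSO setup requires admin rights.", "score": 0.84},
    ]
}

# Concatenate chunk texts into a context block and inject it into the prompt.
context = "\n\n".join(c["text"] for c in retrieved["chunks"])
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: Does the enterprise plan support SSO?"
)
```

Instructing the model to answer only from the supplied context is what keeps the final response grounded in the retrieved documents.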
Architecture
Indexes are stored in PostgreSQL, with vector embeddings held via the pgvector extension. The Marie-AI backend handles:
- Document parsing and chunking
- Embedding generation via configured models
- Vector storage and similarity search
- Metadata indexing and filtering
The M3 Forge frontend provides visual management through:
- RAG Index Sidebar (tenant filtering, search)
- Index Editor Panel (configuration, chunking)
- File Upload UI (drag-and-drop, batch operations)
Vector indexes require significant storage and compute. A 10,000-document corpus with 3072-dimensional float32 embeddings consumes roughly 120MB of raw vector storage (one embedding per document, before index overhead). Plan infrastructure accordingly.
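The estimate works out as simple arithmetic, assuming one float32 embedding per document and excluding pgvector index overhead:

```python
docs = 10_000
dims = 3072              # text-embedding-3-large
bytes_per_value = 4      # float32
mb = docs * dims * bytes_per_value / 1e6  # ~122.9 MB
```

Multiply by the average number of chunks per document for a chunk-level index, and budget extra for HNSW or IVFFlat index structures on top of the raw vectors.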
Next Steps
- Learn how to create and configure RAG indexes
- Explore search capabilities for optimal retrieval
- Integrate retrieval into workflows for RAG pipelines