
Knowledge Base

Build and manage intelligent document retrieval systems with RAG (Retrieval Augmented Generation) indexes, semantic search, and hybrid ranking.

What is the Knowledge Base?

The Knowledge Base in M3 Forge provides document indexing and retrieval infrastructure for AI applications. It enables:

  • RAG Pipelines - Retrieve relevant context from large document corpora to augment LLM responses
  • Semantic Search - Find documents by meaning, not just keywords
  • Hybrid Ranking - Combine semantic similarity with keyword matching for optimal recall
  • Multi-Tenant Isolation - Organize indexes by tenant with hierarchical access control
  • Chunk Management - Configure text splitting strategies optimized for your domain

RAG Architecture

Retrieval Augmented Generation enhances LLM outputs by grounding responses in retrieved documents:

  1. Index Creation - Upload documents and configure chunking strategy
  2. Embedding Generation - Text chunks converted to vector embeddings via embedding models
  3. Query Processing - User questions embedded with the same model
  4. Similarity Search - Vector database returns most relevant chunks
  5. Context Augmentation - Retrieved chunks injected into LLM prompt
  6. Response Generation - LLM generates answer grounded in source documents

This pattern reduces hallucination by constraining the LLM to factual information drawn from your documents.
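The six steps above can be sketched end to end. This is a minimal illustration, not M3 Forge's actual API: `embed()` is a toy bag-of-words model standing in for a real embedding model, and the final LLM call is omitted.

```python
# Minimal RAG sketch. embed() is a toy bag-of-words stand-in for a real
# embedding model; none of these functions are M3 Forge's API.

def embed(text: str) -> dict[str, float]:
    counts: dict[str, float] = {}
    for word in text.lower().replace(".", " ").replace("?", " ").split():
        counts[word] = counts.get(word, 0.0) + 1.0
    return counts

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    qv = embed(query)  # step 3: embed the query with the same model
    ranked = sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)
    return ranked[:top_k]  # step 4: return the most similar chunks

# Steps 1-2: documents already chunked and indexed.
chunks = ["Invoices are sent monthly.",
          "Refunds take 5 business days.",
          "Support is available 24/7."]

# Steps 5-6: inject retrieved context into the LLM prompt.
context = retrieve("How long do refunds take?", chunks, top_k=1)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

In production, the bag-of-words `embed()` is replaced by one of the embedding models listed below, and the vector search runs against a database rather than an in-memory list.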

[Screenshot: Knowledge Base section showing the RAG index list with document counts and search configuration]

Key Features

Document Management

  • File Upload - Drag-and-drop interface for PDF, TXT, DOCX, Markdown, HTML
  • Batch Indexing - Process hundreds of documents in a single operation
  • Version Control - Track document updates and re-index changed content
  • Metadata Filtering - Attach custom metadata for filtered retrieval

Embedding Models

M3 Forge supports multiple embedding providers:

| Provider | Model | Dimensions | Use Case |
| --- | --- | --- | --- |
| OpenAI | text-embedding-3-large | 3072 | High accuracy, English-optimized |
| Jina AI | jina-embeddings-v4 | 768 | Multilingual, fast inference |
| Cohere | embed-multilingual-v3 | 1024 | Multilingual semantic search |
| Custom | Any HuggingFace model | Variable | Domain-specific fine-tuned models |

Choose models based on language coverage, latency requirements, and domain specificity.

Chunking Strategies

Configure text splitting to balance context window size with retrieval precision:

  • Fixed Size - Split every N tokens (simple, predictable)
  • Sentence Boundary - Respect sentence structure (better semantic coherence)
  • Paragraph - Maintain logical document sections (ideal for structured content)
  • Recursive - Hierarchical splitting with overlap (best for long-form documents)

Chunk size trades off precision vs recall. Smaller chunks (256-512 tokens) improve retrieval precision but may miss broader context. Larger chunks (1024+ tokens) provide more context but may dilute relevance scores.
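A fixed-size splitter with overlap (also the building block of the recursive strategy) can be sketched as follows. For a dependency-free example, `chunk_size` and `overlap` count words rather than tokens; a real implementation would use the embedding model's tokenizer.

```python
def split_fixed(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks, sharing `overlap` words between
    consecutive chunks (a dependency-free stand-in for token-based splitting)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final chunk already reaches the end of the text
    return chunks
```

The overlap ensures that a sentence falling on a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated storage.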

Search Capabilities

  • Semantic Search - Vector similarity using cosine distance
  • Hybrid Search - Combine vector search with BM25 keyword ranking
  • Metadata Filters - Restrict results by custom attributes
  • Reranking - Secondary model re-scores top results for relevance
  • MMR (Maximal Marginal Relevance) - Diversify results to reduce redundancy
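One common way to combine the two signals in hybrid search is a weighted sum of normalized scores. The min-max normalization and the `alpha` weighting below are illustrative choices, not M3 Forge's documented formula:

```python
def minmax(scores: list[float]) -> list[float]:
    # Rescale scores to [0, 1] so vector and BM25 scales are comparable.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(vector_scores: list[float], bm25_scores: list[float],
                alpha: float = 0.5) -> list[int]:
    """Blend normalized semantic and keyword scores; higher alpha favors
    vector similarity. Returns document indices, best match first."""
    v, k = minmax(vector_scores), minmax(bm25_scores)
    combined = [alpha * a + (1 - alpha) * b for a, b in zip(v, k)]
    return sorted(range(len(combined)), key=combined.__getitem__, reverse=True)
```

Reranking and MMR then operate on this fused candidate list: the reranker re-scores the top results with a heavier model, and MMR penalizes candidates too similar to ones already selected.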

When to Use Knowledge Base

| Use Case | Example |
| --- | --- |
| Customer Support | Answer questions using product documentation and past tickets |
| Legal Research | Search case law and contracts by semantic meaning |
| Internal Wiki | Company knowledge base with natural language queries |
| Research Assistant | Literature review over academic papers |
| Compliance | Find relevant regulations and policy documents |

For simple keyword search over structured data, prefer traditional databases. RAG excels when semantic understanding matters.

Tenant-Aware Indexing

Indexes are scoped to tenants for data isolation and access control:

  • Single Tenant - Dedicated index per customer (full isolation)
  • Multi-Tenant - Shared index with metadata filtering (cost-efficient)
  • Hierarchical - Parent-child tenant relationships with inheritance

The UI provides tenant selection and sub-tenant inclusion controls for flexible querying across organizational boundaries.
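In the shared-index model, tenant isolation reduces to a mandatory metadata filter applied at query time. A sketch of building such a filter with sub-tenant inclusion (the tenant-tree shape and `$in` filter syntax are hypothetical, not M3 Forge's schema):

```python
def tenant_filter(tenant_id: str, include_sub_tenants: bool,
                  tenant_tree: dict[str, list[str]]) -> dict:
    """Build a metadata filter restricting results to one tenant, optionally
    including all of its descendants (hypothetical filter shape)."""
    allowed = [tenant_id]
    if include_sub_tenants:
        queue = list(tenant_tree.get(tenant_id, []))
        while queue:  # walk the hierarchy to collect every descendant
            child = queue.pop()
            allowed.append(child)
            queue.extend(tenant_tree.get(child, []))
    return {"tenant_id": {"$in": sorted(allowed)}}
```

Because the filter is constructed server-side from the caller's identity, a tenant can never widen it to read another tenant's chunks.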

Getting Started

Integration with Workflows

Use Knowledge Base in workflows via the RAG Retrieval Node:

```json
{
  "type": "rag-retrieval",
  "config": {
    "index_id": "customer-docs-v2",
    "query": "$.data.question",
    "top_k": 5,
    "search_type": "hybrid",
    "metadata_filter": { "product": "enterprise" }
  }
}
```

Retrieved chunks flow to downstream nodes via `$.nodes.<node_id>.output.chunks`, ready for LLM context injection.
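Downstream, the retrieved chunks are typically joined into a prompt template before the LLM call. A hypothetical sketch of that injection step, assuming each chunk carries `text` and `source` keys (the chunk shape is illustrative, not M3 Forge's documented output schema):

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Inject retrieved chunks into an LLM prompt (illustrative template;
    each chunk is assumed to carry 'text' and 'source' keys)."""
    context = "\n\n".join(
        f"[{i + 1}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Numbering the chunks lets the LLM cite sources, which makes grounded answers auditable against the original documents.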

Architecture

Indexes are stored in PostgreSQL with vector embeddings in pgvector extension. The Marie-AI backend handles:

  • Document parsing and chunking
  • Embedding generation via configured models
  • Vector storage and similarity search
  • Metadata indexing and filtering

The M3 Forge frontend provides visual management through:

  • RAG Index Sidebar (tenant filtering, search)
  • Index Editor Panel (configuration, chunking)
  • File Upload UI (drag-and-drop, batch operations)

Vector indexes require significant storage and compute. A 10,000-document corpus with one 3072-dimensional float32 embedding per document consumes roughly 120MB of raw vector storage, before index overhead. Plan infrastructure accordingly.
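The ~120MB figure follows directly from the vector arithmetic (float32 assumed; ANN index structures such as HNSW add further overhead on top):

```python
def vector_storage_bytes(num_vectors: int, dims: int,
                         bytes_per_float: int = 4) -> int:
    # Raw embedding storage only; index structures add overhead on top.
    return num_vectors * dims * bytes_per_float

mb = vector_storage_bytes(10_000, 3072) / 1e6  # 10,000 x 3072 x 4 bytes
```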
