Assistants

An assistant is a named, versioned agent configuration deployed to the Agent Server as a runtime unit. Assistants define what an agent is — model, prompt, tools, memory — while the Agent Server handles how it runs: persistence, scaling, and lifecycle management.

What is an Assistant?

Every agent running on the Agent Server starts as an assistant configuration. An assistant packages everything the agent needs into a single deployable unit:

Identity — Unique name, description, and metadata
Model selection — LLM provider, model name, and inference parameters
Behavior — System prompt defining personality, constraints, and output format
Capabilities — Tools, skills, and MCP server connections
Memory — Scope and persistence strategy for conversational context

Assistants separate configuration from execution. You define an assistant once, then the Agent Server creates sessions and runs against it. This means multiple users can interact with the same assistant definition simultaneously, each with their own isolated session state.
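The split between definition and session state can be pictured with a small sketch. The class and field names below are illustrative stand-ins, not the Agent Server's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssistantDefinition:
    """Hypothetical stand-in for a stored assistant configuration."""
    name: str
    system_prompt: str


@dataclass
class Session:
    """Per-user execution state that references the shared definition."""
    assistant: AssistantDefinition
    user_id: str
    messages: list = field(default_factory=list)


support = AssistantDefinition("support-agent", "You are a helpful support agent.")

# Two users, one definition, two independent message histories.
alice = Session(support, "alice")
bob = Session(support, "bob")
alice.messages.append("How do I reset my password?")
```

Both sessions point at the same definition object, while each keeps its own message list.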

Assistant Types

M3 Forge supports three primary assistant types, each optimized for different agent patterns.

Type	Pattern	Best For	Tool Support
ReactAgent	ReAct (Reason + Act) loop	General-purpose tasks requiring tool use and multi-step reasoning	Full
DocumentAssistant	RAG (Retrieval-Augmented Generation)	Document Q&A against knowledge bases	Read-only retrieval
Custom BaseAgent	User-defined logic	Domain-specific behavior that doesn’t fit standard patterns	Configurable

ReactAgent

Implements the ReAct loop: the agent reasons about the current state, selects a tool, observes the result, and repeats until the task is complete. Best for general-purpose tasks that require chaining multiple tool calls with intermediate reasoning.

Use cases:

Research tasks requiring web search and summarization
Code generation with file system access
Multi-step data analysis with API calls
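As a rough illustration of the reason, act, observe cycle described above, the sketch below replaces the LLM call with a scripted policy so it runs on its own. The tool registry, the policy function, and the step budget are assumptions made for the example and do not reflect ReactAgent internals.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> callable taking a string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def scripted_policy(history: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM call: returns (action, argument).

    A real agent would send the history to the model and parse its reply.
    """
    if not any(line.startswith("observation:") for line in history):
        return "add", "2+3"
    return "finish", history[-1].removeprefix("observation: ")


def react_loop(task: str, policy=scripted_policy, max_steps: int = 5) -> str:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action, arg = policy(history)             # reason: choose the next action
        if action == "finish":
            return arg                            # task complete
        observation = TOOLS[action](arg)          # act: call the selected tool
        history.append(f"observation: {observation}")  # observe the result
    raise RuntimeError("step budget exhausted before the task finished")


answer = react_loop("What is 2 + 3?")
```

The step budget is one simple way to guarantee that the loop terminates even when the policy never decides to finish.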

DocumentAssistant

Specialized for retrieval-augmented generation. Connects to one or more knowledge bases and retrieves relevant documents before generating responses. Handles chunking strategies, relevance scoring, and citation tracking automatically.

Use cases:

Internal documentation Q&A
Policy and compliance lookups
Technical support with product knowledge
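The retrieve-then-generate flow can be sketched in a few lines. This toy version scores documents by word overlap instead of embeddings and stops at prompt assembly. The function names and the scoring rule are assumptions for illustration, not how DocumentAssistant is implemented.

```python
def tokenize(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank documents by word overlap with the query (toy relevance score)."""
    q = tokenize(query)
    if not q:
        return []
    scored = [
        (doc_id, len(q & tokenize(text)) / len(q))
        for doc_id, text in documents.items()
    ]
    scored = [item for item in scored if item[1] > 0]  # drop irrelevant documents
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


def build_prompt(query: str, documents: dict[str, str]) -> str:
    """Assemble retrieved passages, tagged with source IDs, for the model."""
    hits = retrieve(query, documents)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id, _ in hits)
    return f"Answer using only these sources and cite them by ID.\n{context}\n\nQuestion: {query}"


knowledge_base = {
    "policy-01": "password reset requires email verification",
    "policy-02": "refunds are processed within five business days",
}
hits = retrieve("how does password reset work", knowledge_base)
prompt = build_prompt("how does password reset work", knowledge_base)
```

Tagging each passage with its document ID is what lets the generated answer cite its sources.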

Custom BaseAgent Subclass

Extend BaseAgent for domain-specific logic that goes beyond the standard patterns. Custom subclasses can override the reasoning loop, implement specialized state machines, or integrate with proprietary systems.

Use cases:

Multi-modal agents processing images and text
Agents with custom approval workflows
Integration with legacy enterprise systems
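The sketch below shows the general shape of a subclass that adds an approval step as a small state machine. The BaseAgent class here is a stand-in defined only for this example. The real BaseAgent interface is not described on this page, so the method names and signatures are assumptions.

```python
class BaseAgent:
    """Stand-in only; the real M3 Forge BaseAgent interface may differ."""

    def run(self, request: str) -> str:
        raise NotImplementedError


class ApprovalAgent(BaseAgent):
    """Hypothetical subclass that gates risky requests behind an approval step."""

    RISKY_WORDS = ("delete", "refund")

    def __init__(self):
        self.state = "idle"
        self.pending = None

    def run(self, request: str) -> str:
        if any(word in request.lower() for word in self.RISKY_WORDS):
            # Park the request until a human approves it.
            self.state, self.pending = "awaiting_approval", request
            return "Waiting for approval."
        self.state = "done"
        return f"Completed: {request}"

    def approve(self) -> str:
        if self.state != "awaiting_approval":
            raise RuntimeError("nothing to approve")
        request, self.pending, self.state = self.pending, None, "done"
        return f"Completed: {request}"


agent = ApprovalAgent()
first = agent.run("Refund order 1234")
second = agent.approve()
```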

Configuration

Each assistant is composed of configuration blocks that define its behavior. All blocks are optional except LLMConfig.

LLMConfig

Specifies the model provider and inference parameters.


{
  "llm": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.7,
    "max_tokens": 4096,
    "top_p": 1.0,
    "stop_sequences": []
  }
}
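Because the llm block is the only required one, a client may want to check it before submitting a configuration. The following is a minimal sketch of such a check. Treating provider and model as the required fields, and the specific range checks, are assumptions rather than documented server validation rules.

```python
import json


def validate_llm_block(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    llm = config.get("llm")
    if not isinstance(llm, dict):
        return ["missing required 'llm' block"]
    problems = []
    for key in ("provider", "model"):  # assumed to be the required fields
        if not isinstance(llm.get(key), str) or not llm[key]:
            problems.append(f"'llm.{key}' must be a non-empty string")
    if llm.get("temperature", 0) < 0:
        problems.append("'llm.temperature' must not be negative")
    if not 0 <= llm.get("top_p", 1.0) <= 1:
        problems.append("'llm.top_p' must be between 0 and 1")
    return problems


good = json.loads('{"llm": {"provider": "openai", "model": "gpt-4o", "temperature": 0.7, "top_p": 1.0}}')
bad = json.loads('{"llm": {"provider": "openai", "top_p": 1.5}}')
```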

System Prompt

Instructions defining agent behavior, constraints, and output format. The system prompt is the primary mechanism for shaping assistant behavior.


{
  "system_prompt": "You are a technical support agent for M3 Forge. Answer questions using only the provided documentation. If you cannot find the answer, say so clearly. Always cite the source document."
}

ToolConfig

Lists the tools available to the assistant, each with its own parameters.


{
  "tools": [
    {
      "name": "web_search",
      "enabled": true,
      "parameters": {
        "max_results": 5,
        "safe_search": true
      }
    },
    {
      "name": "file_read",
      "enabled": true,
      "parameters": {
        "allowed_extensions": [".md", ".txt", ".pdf"]
      }
    }
  ]
}

SkillConfig

Lists the skills to load from the skill registry. Skills are multi-step capabilities that compose tools into higher-level operations.


{
  "skills": [
    {
      "skill_id": "research-and-summarize",
      "version": "1.2.0"
    },
    {
      "skill_id": "code-review",
      "version": "latest"
    }
  ]
}

MemoryConfig

Controls how the assistant retains and accesses conversational context.


{
  "memory": {
    "scope": "user",
    "provider": "mem0",
    "settings": {
      "max_memories": 1000,
      "relevance_threshold": 0.7,
      "ttl_days": 90
    }
  }
}

Memory scopes:

none — Stateless, no memory retained between runs
session — Memory persists within a single session
user — Memory shared across sessions for the same user
group — Memory shared across all users of this assistant
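One way to picture the four scopes is as progressively broader storage keys. The key layout below is purely illustrative and is not the format any memory provider actually uses.

```python
def memory_namespace(scope: str, assistant_id: str, user_id: str, session_id: str):
    """Map a memory scope to the key under which memories would be stored.

    Returns None for the stateless scope.
    """
    if scope == "none":
        return None
    if scope == "session":
        return f"{assistant_id}/user/{user_id}/session/{session_id}"
    if scope == "user":
        return f"{assistant_id}/user/{user_id}"
    if scope == "group":
        return f"{assistant_id}/group"
    raise ValueError(f"unknown memory scope: {scope!r}")


# The same user in two sessions shares a key under "user" scope but not under "session".
user_a = memory_namespace("user", "support-agent", "alice", "s1")
user_b = memory_namespace("user", "support-agent", "alice", "s2")
session_a = memory_namespace("session", "support-agent", "alice", "s1")
session_b = memory_namespace("session", "support-agent", "alice", "s2")
```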

MCPConfig

Defines Model Context Protocol (MCP) server connections, which extend the assistant’s capabilities with external tool providers.


{
  "mcp_servers": [
    {
      "name": "database-tools",
      "url": "http://mcp-db.internal:8080",
      "transport": "sse",
      "auth": {
        "type": "bearer",
        "token_env": "MCP_DB_TOKEN"
      }
    }
  ]
}
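The token_env field suggests that the secret lives in an environment variable rather than in the configuration itself. Assuming that reading is correct, a client could turn an entry into request headers roughly as follows. The helper name and the error handling are hypothetical.

```python
import os


def mcp_auth_headers(server: dict, env=os.environ) -> dict:
    """Build HTTP headers for an MCP server entry without storing the secret in config."""
    auth = server.get("auth")
    if not auth:
        return {}
    if auth["type"] != "bearer":
        raise ValueError(f"unsupported auth type: {auth['type']}")
    token = env.get(auth["token_env"])  # look up the variable named in the config
    if token is None:
        raise KeyError(f"environment variable {auth['token_env']} is not set")
    return {"Authorization": f"Bearer {token}"}


server = {
    "name": "database-tools",
    "url": "http://mcp-db.internal:8080",
    "transport": "sse",
    "auth": {"type": "bearer", "token_env": "MCP_DB_TOKEN"},
}
# A plain dict stands in for the process environment so the example is self-contained.
headers = mcp_auth_headers(server, env={"MCP_DB_TOKEN": "example-token"})
```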

Complete Configuration Example

A full assistant configuration combining all blocks:


{
  "name": "support-agent",
  "type": "ReactAgent",
  "description": "Technical support agent for product documentation",
  "llm": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.3,
    "max_tokens": 4096
  },
  "system_prompt": "You are a technical support agent. Use the knowledge base to answer questions. Cite sources. Escalate to a human when confidence is low.",
  "tools": [
    { "name": "knowledge_search", "enabled": true },
    { "name": "ticket_create", "enabled": true }
  ],
  "skills": [
    { "skill_id": "troubleshoot-and-resolve", "version": "2.0.0" }
  ],
  "memory": {
    "scope": "user",
    "provider": "mem0"
  },
  "mcp_servers": [],
  "metadata": {
    "team": "customer-success",
    "tier": "production"
  }
}

Versioning

Every configuration change to an assistant creates a new immutable version snapshot. This provides a complete audit trail and safe rollback.

How Versioning Works

Edit configuration

Modify any part of the assistant configuration — model, prompt, tools, memory, or MCP servers.

New version created

The Agent Server saves the updated configuration as a new version with an auto-incremented version number and timestamp.

Active sessions unaffected

Existing sessions continue using the version they were created with. No in-flight conversations are disrupted.

New sessions use latest

New sessions automatically use the latest version unless explicitly pinned to a prior version.

Pin a session to a specific version when you need deterministic behavior for testing or compliance.

Version management capabilities:

Rollback — Revert to any previous version with a single API call
Diff view — Compare configurations between any two versions side by side
Pin sessions — Lock a session to a specific version for deterministic behavior
Version tagging — Tag versions with labels like stable, canary, or v2-beta
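The versioning behavior described above (immutable snapshots, pinned sessions, rollback that copies forward) can be modeled with an append-only list. This is a toy in-memory sketch with hypothetical names, not the Agent Server's storage model.

```python
import copy


class VersionedAssistant:
    """Toy append-only version history."""

    def __init__(self, config: dict):
        self._versions = [copy.deepcopy(config)]  # index 0 holds version 1

    @property
    def latest(self) -> int:
        return len(self._versions)

    def get(self, version: int) -> dict:
        # Return a copy so callers cannot mutate a stored snapshot.
        return copy.deepcopy(self._versions[version - 1])

    def update(self, changes: dict) -> int:
        """Append a new snapshot; earlier snapshots are never modified."""
        self._versions.append({**self.get(self.latest), **changes})
        return self.latest

    def rollback(self, version: int) -> int:
        """Copy an old snapshot forward as the newest version."""
        self._versions.append(self.get(version))
        return self.latest


assistant = VersionedAssistant({"system_prompt": "v1 prompt"})
pinned = assistant.latest                       # a session created now pins version 1
assistant.update({"system_prompt": "v2 prompt"})
restored = assistant.rollback(1)                # becomes version 3, same config as version 1
```

A session that recorded `pinned` still resolves to the original prompt after the update and the rollback, because no existing entry is ever rewritten.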

Assistant Lifecycle

Assistants move through three states from creation to retirement. An archived assistant can also be restored to Active.


Created ──→ Active ──→ Archived

Created

The assistant configuration has been saved but is not yet serving requests. Use this state to prepare and review configurations before deployment.

Configuration is editable
No sessions can be created
Validation checks run automatically (model availability, tool existence)

Active

The assistant is deployed and accepting new sessions and runs. This is the primary operational state.

New sessions can be created against this assistant
Configuration changes create new versions (existing sessions unaffected)
Monitoring and metrics collection are active

Archived

The assistant has been removed from active service. Archived assistants are preserved for audit and compliance but cannot serve new requests.

No new sessions can be created
Existing session data is retained
Configuration is read-only
Can be restored to Active state if needed
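The lifecycle rules can be summarized as a small transition table. In this sketch, the enum and helper names are hypothetical, and the set of allowed transitions is inferred from the state descriptions above. Whether a Created assistant can be archived directly is not stated, so that path is left out.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    ACTIVE = "active"
    ARCHIVED = "archived"


# Allowed transitions, including the restore path from Archived back to Active.
TRANSITIONS = {
    State.CREATED: {State.ACTIVE},
    State.ACTIVE: {State.ARCHIVED},
    State.ARCHIVED: {State.ACTIVE},
}


def transition(current: State, target: State) -> State:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def can_create_session(state: State) -> bool:
    # Only Active assistants accept new sessions.
    return state is State.ACTIVE


state = transition(State.CREATED, State.ACTIVE)
state = transition(state, State.ARCHIVED)
state = transition(state, State.ACTIVE)  # restore
```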

Multi-Assistant Deployment

Multiple assistants can run on the same Agent Server instance, enabling teams to serve diverse use cases from shared infrastructure.

Request Routing

Every API request includes an assistant_id that routes the request to the correct assistant configuration. The server resolves the assistant, loads the appropriate version, and dispatches the run.

Resource Management

Shared resource pool — All assistants share compute, memory, and LLM connections
Per-assistant concurrency limits — Cap the number of concurrent runs per assistant to prevent resource starvation
Independent scaling — Scale individual assistant types based on demand (e.g., more capacity for a high-traffic support agent, less for an internal analytics agent)
Priority queuing — Assign priority levels to assistants for run scheduling
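Per-assistant concurrency limits are commonly built on semaphores. The sketch below is one such approach using asyncio, with a hypothetical limiter class and made-up limits. It does not describe how the Agent Server schedules runs internally.

```python
import asyncio


class ConcurrencyLimiter:
    """One semaphore per assistant caps how many runs execute at once."""

    def __init__(self, limits: dict[str, int]):
        self._limits = limits
        self._semaphores: dict[str, asyncio.Semaphore] = {}
        self.active = {aid: 0 for aid in limits}
        self.peak = {aid: 0 for aid in limits}

    async def run(self, assistant_id: str, job):
        # Create the semaphore lazily so it is bound to the running event loop.
        if assistant_id not in self._semaphores:
            self._semaphores[assistant_id] = asyncio.Semaphore(self._limits[assistant_id])
        async with self._semaphores[assistant_id]:
            self.active[assistant_id] += 1
            self.peak[assistant_id] = max(self.peak[assistant_id], self.active[assistant_id])
            try:
                return await job()
            finally:
                self.active[assistant_id] -= 1


async def demo() -> dict[str, int]:
    limiter = ConcurrencyLimiter({"support-agent": 2, "analytics-agent": 1})

    async def job():
        await asyncio.sleep(0.01)  # placeholder for a real run

    await asyncio.gather(
        *(limiter.run("support-agent", job) for _ in range(5)),
        *(limiter.run("analytics-agent", job) for _ in range(3)),
    )
    return limiter.peak


peak = asyncio.run(demo())
```

The recorded peaks never exceed the configured limits, so a burst of requests to one assistant queues behind its own semaphore instead of consuming the whole pool.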

Isolation

Each assistant operates with independent:

Session stores
Memory namespaces
Tool permissions
Rate limits and quotas

This ensures one assistant’s load or misconfiguration does not affect others running on the same server.

API Operations

Manage assistants through REST endpoints. All operations use JSON request and response bodies.

Create Assistant


POST /assistants


{
  "name": "support-agent",
  "type": "ReactAgent",
  "description": "Technical support agent",
  "llm": { "provider": "openai", "model": "gpt-4o" },
  "system_prompt": "You are a helpful support agent."
}

Returns the created assistant with a generated id and an initial version of 1.
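From a Python client, the create call might be prepared as shown here using only the standard library. The base URL is a placeholder, and authentication is omitted because this page does not describe it. The request is built but not sent, so the example runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical; use your Agent Server address


def build_create_request(config: dict) -> urllib.request.Request:
    """Prepare POST /assistants with a JSON body (the request is not sent here)."""
    return urllib.request.Request(
        url=f"{BASE_URL}/assistants",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request({
    "name": "support-agent",
    "type": "ReactAgent",
    "llm": {"provider": "openai", "model": "gpt-4o"},
})
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```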

List Assistants


GET /assistants
GET /assistants?type=ReactAgent&status=active&limit=20

Supports filtering by type, status, name (partial match), and pagination via limit and offset.
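Filter and pagination parameters can be assembled with the standard library's query-string encoder. The helper name is hypothetical, and the sketch only builds the request path.

```python
from urllib.parse import urlencode


def list_assistants_path(**filters) -> str:
    """Build the GET /assistants path, skipping filters that are unset."""
    params = {key: value for key, value in filters.items() if value is not None}
    return "/assistants" + (f"?{urlencode(params)}" if params else "")


first_page = list_assistants_path(type="ReactAgent", status="active", limit=20)
second_page = list_assistants_path(status="active", name=None, limit=20, offset=20)
```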

Get Assistant


GET /assistants/{id}

Returns the full assistant configuration including the current active version.

Update Assistant


PATCH /assistants/{id}


{
  "system_prompt": "Updated instructions for the agent.",
  "llm": { "temperature": 0.5 }
}

Partial updates are supported. Every update creates a new version. The response includes the new version number.
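The example body above changes only llm.temperature, which suggests that nested objects are merged rather than replaced wholesale. Assuming that is the behavior, the effect of a partial update can be modeled as a recursive merge. The function below is a sketch of that assumption, not the server's implementation.

```python
import copy


def apply_patch(current: dict, patch: dict) -> dict:
    """Return a new config with the patch applied; nested objects merge key by key."""
    merged = copy.deepcopy(current)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_patch(merged[key], value)  # merge nested object
        else:
            merged[key] = copy.deepcopy(value)             # replace scalar or list
    return merged


current = {
    "system_prompt": "You are a helpful support agent.",
    "llm": {"provider": "openai", "model": "gpt-4o", "temperature": 0.3},
}
patch = {
    "system_prompt": "Updated instructions for the agent.",
    "llm": {"temperature": 0.5},
}
updated = apply_patch(current, patch)
```

Returning a new dictionary rather than editing in place matches the versioning model, in which the previous configuration remains available as an earlier snapshot.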

Archive Assistant


DELETE /assistants/{id}

Transitions the assistant to Archived state. Does not delete data. To permanently delete, use DELETE /assistants/{id}?permanent=true.

List Versions


GET /assistants/{id}/versions
GET /assistants/{id}/versions?limit=10

Returns the version history for an assistant, ordered by version number descending. Each entry includes the full configuration snapshot and a timestamp.

Rollback Version


POST /assistants/{id}/versions/{version}/rollback

Creates a new version with the configuration from the specified historical version. This does not rewrite history — it copies the old configuration forward as the latest version.

Next Steps

Learn about Sessions to understand how users interact with assistants
Explore Runs for task execution and lifecycle
Set up Scheduled Runs for automated recurring tasks
Review Scaling for production deployment patterns