
Assistants

An assistant is a named, versioned agent configuration deployed to the Agent Server as a runtime unit. Assistants define what an agent is — model, prompt, tools, memory — while the Agent Server handles how it runs: persistence, scaling, and lifecycle management.

What is an Assistant?

Every agent running on the Agent Server starts as an assistant configuration. An assistant packages everything the agent needs into a single deployable unit:

  • Identity — Unique name, description, and metadata
  • Model selection — LLM provider, model name, and inference parameters
  • Behavior — System prompt defining personality, constraints, and output format
  • Capabilities — Tools, skills, and MCP server connections
  • Memory — Scope and persistence strategy for conversational context

Assistants separate configuration from execution. You define an assistant once, then the Agent Server creates sessions and runs against it. This means multiple users can interact with the same assistant definition simultaneously, each with their own isolated session state.
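The separation between a shared definition and per-user state can be pictured with a minimal sketch. The class and field names below (`AssistantDefinition`, `Session`, `messages`) are illustrative assumptions, not the Agent Server's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssistantDefinition:
    """Immutable configuration shared by every session (hypothetical shape)."""
    name: str
    system_prompt: str


@dataclass
class Session:
    """Per-user state that only references the shared definition."""
    assistant: AssistantDefinition
    user_id: str
    messages: list[str] = field(default_factory=list)


support = AssistantDefinition("support-agent", "You are a helpful support agent.")
alice = Session(support, "alice")
bob = Session(support, "bob")

# Both sessions point at one definition, but each keeps its own history.
alice.messages.append("How do I reset my password?")
```

Because the definition is frozen and each session owns its message list, activity in one session cannot leak into another.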

Assistant Types

M3 Forge supports three primary assistant types, each optimized for different agent patterns.

| Type | Pattern | Best For | Tool Support |
| --- | --- | --- | --- |
| ReactAgent | ReAct (Reason + Act) loop | General-purpose tasks requiring tool use and multi-step reasoning | Full |
| DocumentAssistant | RAG (Retrieval-Augmented Generation) | Document Q&A against knowledge bases | Read-only retrieval |
| Custom BaseAgent | User-defined logic | Domain-specific behavior that doesn’t fit standard patterns | Configurable |

ReactAgent

Implements the ReAct loop: the agent reasons about the current state, selects a tool, observes the result, and repeats until the task is complete. Best for general-purpose tasks that require chaining multiple tool calls with intermediate reasoning.

Use cases:

  • Research tasks requiring web search and summarization
  • Code generation with file system access
  • Multi-step data analysis with API calls
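The reason, act, observe cycle can be sketched in a few lines. This is a generic illustration of the ReAct pattern, not the ReactAgent implementation: the `TOOLS` registry, the `scripted_model` stand-in for an LLM call, and the step dictionary format are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> callable taking a string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def scripted_model(observations: list[str]) -> dict:
    """Stand-in for an LLM call: chooses the next step from what was observed."""
    if not observations:
        return {"action": "add", "input": "2+3"}
    return {"final": f"The sum is {observations[-1]}."}


def react_loop(model: Callable[[list[str]], dict], max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        step = model(observations)                      # reason
        if "final" in step:
            return step["final"]
        result = TOOLS[step["action"]](step["input"])   # act
        observations.append(result)                     # observe
    raise RuntimeError("step budget exhausted before the task completed")


answer = react_loop(scripted_model)
```

The `max_steps` cap is a design choice worth keeping in any real loop, since a model that never emits a final answer would otherwise run indefinitely.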

DocumentAssistant

Specialized for retrieval-augmented generation. Connects to one or more knowledge bases and retrieves relevant documents before generating responses. Handles chunking strategies, relevance scoring, and citation tracking automatically.

Use cases:

  • Internal documentation Q&A
  • Policy and compliance lookups
  • Technical support with product knowledge
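To make the retrieve-then-generate flow concrete, here is a toy sketch. It ranks documents by simple word overlap, which is a stand-in for whatever relevance scoring DocumentAssistant uses internally. The function names and the sample documents are hypothetical.

```python
def tokenize(text: str) -> set[str]:
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(question: str, documents: dict[str, str], top_k: int = 1) -> list[tuple[str, float]]:
    """Rank documents by the fraction of question words they contain."""
    q = tokenize(question)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(q & tokenize(text)) / len(q) if q else 0.0
        scored.append((doc_id, overlap))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, documents: dict[str, str], hits: list[tuple[str, float]]) -> str:
    """Place retrieved text before the question, tagged with ids for citation."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id, _ in hits)
    return f"Answer using only these sources and cite them by id.\n{context}\n\nQuestion: {question}"


DOCS = {
    "vacation-policy": "Employees accrue vacation days monthly.",
    "expense-policy": "Submit expense reports within 30 days.",
}
QUESTION = "How do employees accrue vacation days?"
hits = retrieve(QUESTION, DOCS)
prompt = build_prompt(QUESTION, DOCS, hits)
```

Tagging each retrieved passage with its id is what lets the generated answer cite a source.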

Custom BaseAgent Subclass

Extend BaseAgent for domain-specific logic that goes beyond the standard patterns. Custom subclasses can override the reasoning loop, implement specialized state machines, or integrate with proprietary systems.

Use cases:

  • Multi-modal agents processing images and text
  • Agents with custom approval workflows
  • Integration with legacy enterprise systems
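A custom approval workflow might look roughly like the sketch below. The real BaseAgent interface is not described on this page, so the `BaseAgent` class here is a stand-in with a single assumed `run` method. `ApprovalAgent` and its parameters are likewise hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable


class BaseAgent(ABC):
    """Stand-in for the real base class; its actual interface may differ."""

    @abstractmethod
    def run(self, task: str) -> str: ...


class ApprovalAgent(BaseAgent):
    """Hypothetical subclass that gates risky tasks behind an approval callback."""

    def __init__(self, approve: Callable[[str], bool], risky_keywords=("delete", "refund")):
        self.approve = approve
        self.risky_keywords = risky_keywords

    def run(self, task: str) -> str:
        needs_approval = any(word in task.lower() for word in self.risky_keywords)
        if needs_approval and not self.approve(task):
            return "rejected: approval denied"
        return f"executed: {task}"


# An approver that denies everything, to show the gate in action.
agent = ApprovalAgent(approve=lambda task: False)
```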

Configuration

Each assistant is composed of configuration blocks that define its behavior. All blocks are optional except LLMConfig.

LLMConfig

Specifies the model provider and inference parameters.

{
  "llm": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.7,
    "max_tokens": 4096,
    "top_p": 1.0,
    "stop_sequences": []
  }
}

System Prompt

Instructions defining agent behavior, constraints, and output format. The system prompt is the primary mechanism for shaping assistant behavior.

{ "system_prompt": "You are a technical support agent for M3 Forge. Answer questions using only the provided documentation. If you cannot find the answer, say so clearly. Always cite the source document." }

ToolConfig

Lists the tools available to the assistant with per-tool parameters.

{
  "tools": [
    {
      "name": "web_search",
      "enabled": true,
      "parameters": { "max_results": 5, "safe_search": true }
    },
    {
      "name": "file_read",
      "enabled": true,
      "parameters": { "allowed_extensions": [".md", ".txt", ".pdf"] }
    }
  ]
}

SkillConfig

Skills loaded from the skill registry. Skills are multi-step capabilities that compose tools into higher-level operations.

{
  "skills": [
    { "skill_id": "research-and-summarize", "version": "1.2.0" },
    { "skill_id": "code-review", "version": "latest" }
  ]
}

MemoryConfig

Controls how the assistant retains and accesses conversational context.

{
  "memory": {
    "scope": "user",
    "provider": "mem0",
    "settings": {
      "max_memories": 1000,
      "relevance_threshold": 0.7,
      "ttl_days": 90
    }
  }
}

Memory scopes:

  • none — Stateless, no memory retained between runs
  • session — Memory persists within a single session
  • user — Memory shared across sessions for the same user
  • group — Memory shared across all users of this assistant
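One way to picture the four scopes is as a rule for choosing the storage key under which memories are saved. The sketch below is an illustration only: the key layout and the `memory_namespace` function are assumptions, not how the memory provider actually partitions data.

```python
from typing import Optional


def memory_namespace(scope: str, assistant_id: str, user_id: str, session_id: str) -> Optional[str]:
    """Map a memory scope to a storage key; None means nothing is persisted."""
    if scope == "none":
        return None
    if scope == "session":
        return f"{assistant_id}/session/{session_id}"
    if scope == "user":
        return f"{assistant_id}/user/{user_id}"
    if scope == "group":
        return f"{assistant_id}/group"
    raise ValueError(f"unknown memory scope: {scope!r}")


# Same user, two different sessions.
user_a = memory_namespace("user", "support-agent", "alice", "s1")
user_b = memory_namespace("user", "support-agent", "alice", "s2")
sess_a = memory_namespace("session", "support-agent", "alice", "s1")
sess_b = memory_namespace("session", "support-agent", "alice", "s2")
```

With user scope both sessions resolve to the same key, so memories carry over. With session scope the keys differ, so they do not.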

MCPConfig

Model Context Protocol server connections that extend the assistant’s capabilities with external tool providers.

{
  "mcp_servers": [
    {
      "name": "database-tools",
      "url": "http://mcp-db.internal:8080",
      "transport": "sse",
      "auth": { "type": "bearer", "token_env": "MCP_DB_TOKEN" }
    }
  ]
}
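The `token_env` field names an environment variable rather than holding the secret itself, which keeps credentials out of the stored configuration. A minimal sketch of how a client might resolve it follows. The `auth_headers` helper is hypothetical, and it assumes the token is sent as a standard `Authorization: Bearer` header.

```python
import os
from typing import Mapping


def auth_headers(server: dict, env: Mapping[str, str] = os.environ) -> dict:
    """Build HTTP auth headers for one mcp_servers entry (illustrative only)."""
    auth = server.get("auth")
    if not auth:
        return {}
    if auth["type"] != "bearer":
        raise ValueError(f"unsupported auth type: {auth['type']!r}")
    token = env.get(auth["token_env"])
    if token is None:
        raise KeyError(f"environment variable {auth['token_env']} is not set")
    return {"Authorization": f"Bearer {token}"}


server = {
    "name": "database-tools",
    "url": "http://mcp-db.internal:8080",
    "transport": "sse",
    "auth": {"type": "bearer", "token_env": "MCP_DB_TOKEN"},
}
# Pass an explicit mapping here so the example does not depend on the real environment.
headers = auth_headers(server, env={"MCP_DB_TOKEN": "example-token"})
```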

Complete Configuration Example

A full assistant configuration combining all blocks:

{
  "name": "support-agent",
  "type": "ReactAgent",
  "description": "Technical support agent for product documentation",
  "llm": {
    "provider": "openai",
    "model": "gpt-4o",
    "temperature": 0.3,
    "max_tokens": 4096
  },
  "system_prompt": "You are a technical support agent. Use the knowledge base to answer questions. Cite sources. Escalate to a human when confidence is low.",
  "tools": [
    { "name": "knowledge_search", "enabled": true },
    { "name": "ticket_create", "enabled": true }
  ],
  "skills": [
    { "skill_id": "troubleshoot-and-resolve", "version": "2.0.0" }
  ],
  "memory": { "scope": "user", "provider": "mem0" },
  "mcp_servers": [],
  "metadata": { "team": "customer-success", "tier": "production" }
}
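Before submitting a configuration, a client-side check can catch the most obvious mistakes. The sketch below encodes the rule that only the LLM block is required and that memory scopes come from a fixed set. Treating `provider` and `model` as mandatory inside `llm` is an assumption, and `validate_config` is a hypothetical helper, not part of the Agent Server.

```python
MEMORY_SCOPES = {"none", "session", "user", "group"}


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means it looks valid."""
    errors = []
    llm = config.get("llm")
    if not isinstance(llm, dict):
        errors.append("llm block is required")
    else:
        # Assumption: provider and model are the minimum needed to select a model.
        for key in ("provider", "model"):
            if not llm.get(key):
                errors.append(f"llm.{key} is required")
    scope = (config.get("memory") or {}).get("scope")
    if scope is not None and scope not in MEMORY_SCOPES:
        errors.append(f"memory.scope must be one of {sorted(MEMORY_SCOPES)}")
    return errors


minimal = {"llm": {"provider": "openai", "model": "gpt-4o"}}
problems = validate_config({"llm": {"provider": "openai"}, "memory": {"scope": "team"}})
```

The server runs its own validation as well, so a check like this only shortens the feedback loop.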

Versioning

Every configuration change to an assistant creates a new immutable version snapshot. This provides a complete audit trail and safe rollback.

How Versioning Works

Edit configuration

Modify any part of the assistant configuration — model, prompt, tools, memory, or MCP servers.

New version created

The Agent Server saves the updated configuration as a new version with an auto-incremented version number and timestamp.

Active sessions unaffected

Existing sessions continue using the version they were created with. No in-flight conversations are disrupted.

New sessions use latest

New sessions automatically use the latest version unless explicitly pinned to a prior version.

Pin a session to a specific version when you need deterministic behavior for testing or compliance.

Version management capabilities:

  • Rollback — Revert to any previous version with a single API call
  • Diff view — Compare configurations between any two versions side by side
  • Pin sessions — Lock a session to a specific version for deterministic behavior
  • Version tagging — Tag versions with labels like stable, canary, or v2-beta
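The versioning behavior described in this section can be modeled with an append-only list of snapshots. This is a conceptual sketch, not the server's storage layer: the `AssistantVersions` class, its method names, and the shallow merge used in `update` are assumptions.

```python
import copy
from typing import Optional


class AssistantVersions:
    """Append-only snapshot history; version numbers start at 1."""

    def __init__(self, config: dict):
        self._versions = [copy.deepcopy(config)]

    @property
    def latest(self) -> int:
        return len(self._versions)

    def update(self, changes: dict) -> int:
        # Earlier snapshots are never modified; a new one is appended.
        # Top-level keys in `changes` replace those of the previous snapshot.
        self._versions.append({**self._versions[-1], **copy.deepcopy(changes)})
        return self.latest

    def get(self, version: int) -> dict:
        if not 1 <= version <= self.latest:
            raise KeyError(f"no such version: {version}")
        return copy.deepcopy(self._versions[version - 1])

    def open_session(self, pinned_version: Optional[int] = None) -> dict:
        version = pinned_version if pinned_version is not None else self.latest
        return {"version": version, "config": self.get(version)}


assistant = AssistantVersions({"system_prompt": "v1 prompt"})
early = assistant.open_session()                    # created before the update
assistant.update({"system_prompt": "v2 prompt"})
fresh = assistant.open_session()                    # picks up the latest version
pinned = assistant.open_session(pinned_version=1)   # explicitly pinned
```

Each session holds its own copy of the snapshot, which is why later updates cannot change the behavior of a conversation already in progress.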

Assistant Lifecycle

Assistants transition through three states from creation to retirement.

Created ──→ Active ──→ Archived

Created

The assistant configuration has been saved but is not yet serving requests. Use this state to prepare and review configurations before deployment.

  • Configuration is editable
  • No sessions can be created
  • Validation checks run automatically (model availability, tool existence)

Active

The assistant is deployed and accepting new sessions and runs. This is the primary operational state.

  • New sessions can be created against this assistant
  • Configuration changes create new versions (existing sessions unaffected)
  • Monitoring and metrics collection is active

Archived

The assistant has been removed from active service. Archived assistants are preserved for audit and compliance but cannot serve new requests.

  • No new sessions can be created
  • Existing session data is retained
  • Configuration is read-only
  • Can be restored to Active state if needed
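The lifecycle reads naturally as a small state machine. The sketch below allows only the transitions this page mentions (Created to Active, Active to Archived, and restoring Archived to Active). Whether the server permits others, such as archiving directly from Created, is not stated, so the table is an assumption.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    ACTIVE = "active"
    ARCHIVED = "archived"


# Allowed transitions, limited to those described in the text.
ALLOWED = {
    State.CREATED: {State.ACTIVE},
    State.ACTIVE: {State.ARCHIVED},
    State.ARCHIVED: {State.ACTIVE},   # restore
}


def transition(current: State, target: State) -> State:
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def can_create_session(state: State) -> bool:
    """Only Active assistants accept new sessions."""
    return state is State.ACTIVE


state = transition(State.CREATED, State.ACTIVE)
state = transition(state, State.ARCHIVED)
restored = transition(state, State.ACTIVE)
```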

Multi-Assistant Deployment

Multiple assistants can run on the same Agent Server instance, enabling teams to serve diverse use cases from shared infrastructure.

Request Routing

Every API request includes an assistant_id that routes the request to the correct assistant configuration. The server resolves the assistant, loads the appropriate version, and dispatches the run.

Resource Management

  • Shared resource pool — All assistants share compute, memory, and LLM connections
  • Per-assistant concurrency limits — Cap the number of concurrent runs per assistant to prevent resource starvation
  • Independent scaling — Scale individual assistant types based on demand (e.g., more capacity for a high-traffic support agent, less for an internal analytics agent)
  • Priority queuing — Assign priority levels to assistants for run scheduling
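Per-assistant concurrency limits can be pictured as one counting semaphore per assistant. The sketch below uses the standard library's `threading.BoundedSemaphore`. The `ConcurrencyLimiter` class and the limit values are hypothetical and say nothing about how the Agent Server enforces limits internally.

```python
import threading


class ConcurrencyLimiter:
    """Caps concurrent runs per assistant using one semaphore each."""

    def __init__(self, limits: dict[str, int]):
        self._slots = {
            assistant_id: threading.BoundedSemaphore(limit)
            for assistant_id, limit in limits.items()
        }

    def try_start(self, assistant_id: str) -> bool:
        """Claim a run slot without blocking; False means the cap is reached."""
        return self._slots[assistant_id].acquire(blocking=False)

    def finish(self, assistant_id: str) -> None:
        """Release a slot when a run completes."""
        self._slots[assistant_id].release()


limiter = ConcurrencyLimiter({"support-agent": 2, "analytics-agent": 1})
first = limiter.try_start("support-agent")
second = limiter.try_start("support-agent")
third = limiter.try_start("support-agent")     # over the cap of 2
other = limiter.try_start("analytics-agent")   # unaffected by support-agent load
```

Keeping a separate semaphore per assistant is what prevents a busy assistant from starving the others.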

Isolation

Each assistant operates with independent:

  • Session stores
  • Memory namespaces
  • Tool permissions
  • Rate limits and quotas

This ensures one assistant’s load or misconfiguration does not affect others running on the same server.

API Operations

Manage assistants through REST endpoints. All operations use JSON request and response bodies.

Create Assistant

POST /assistants
{ "name": "support-agent", "type": "ReactAgent", "description": "Technical support agent", "llm": { "provider": "openai", "model": "gpt-4o" }, "system_prompt": "You are a helpful support agent." }

Returns the created assistant with a generated id and initial version 1.

List Assistants

GET /assistants
GET /assistants?type=ReactAgent&status=active&limit=20

Supports filtering by type, status, name (partial match), and pagination via limit and offset.
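A client can assemble the filter and pagination parameters with the standard library. In this sketch the base URL `http://localhost:8000` and the `list_assistants_url` helper are assumptions. Only the parameter names come from the description above.

```python
from urllib.parse import urlencode


def list_assistants_url(base_url: str, **filters) -> str:
    """Build a GET /assistants URL, skipping filters that are None."""
    params = {key: value for key, value in filters.items() if value is not None}
    path = f"{base_url.rstrip('/')}/assistants"
    query = urlencode(params)
    return f"{path}?{query}" if query else path


url = list_assistants_url(
    "http://localhost:8000", type="ReactAgent", status="active", limit=20
)
second_page = list_assistants_url("http://localhost:8000", limit=20, offset=20)
```

`urlencode` also percent-encodes values, which matters once a `name` filter contains spaces or other reserved characters.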

Get Assistant

GET /assistants/{id}

Returns the full assistant configuration including the current active version.

Update Assistant

PATCH /assistants/{id}
{ "system_prompt": "Updated instructions for the agent.", "llm": { "temperature": 0.5 } }

Partial updates are supported. Every update creates a new version. The response includes the new version number.
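The page does not say how nested blocks are combined during a partial update. The sketch below assumes a recursive merge, in which sending only `llm.temperature` leaves `llm.provider` and `llm.model` untouched. The server may instead replace a whole block, so verify against real responses. `apply_patch` is a hypothetical helper.

```python
import copy


def apply_patch(current: dict, patch: dict) -> dict:
    """Recursively merge patch into a copy of current (assumed semantics)."""
    merged = copy.deepcopy(current)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_patch(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


current = {
    "system_prompt": "You are a helpful support agent.",
    "llm": {"provider": "openai", "model": "gpt-4o", "temperature": 0.7},
}
patch = {
    "system_prompt": "Updated instructions for the agent.",
    "llm": {"temperature": 0.5},
}
updated = apply_patch(current, patch)
```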

Archive Assistant

DELETE /assistants/{id}

Transitions the assistant to the Archived state without deleting any data. To delete the assistant permanently, use DELETE /assistants/{id}?permanent=true.

List Versions

GET /assistants/{id}/versions
GET /assistants/{id}/versions?limit=10

Returns the version history for an assistant, ordered by version number descending. Each entry includes the full configuration snapshot and a timestamp.

Rollback Version

POST /assistants/{id}/versions/{version}/rollback

Creates a new version with the configuration from the specified historical version. This does not rewrite history — it copies the old configuration forward as the latest version.
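The copy-forward behavior is easy to see with a plain list standing in for the version history. The `rollback` function below is a conceptual sketch, not the server's implementation.

```python
import copy


def rollback(history: list[dict], version: int) -> list[dict]:
    """Return a new history with the chosen version's config appended as latest."""
    if not 1 <= version <= len(history):
        raise ValueError(f"no such version: {version}")
    return history + [copy.deepcopy(history[version - 1])]


history = [
    {"system_prompt": "first"},    # version 1
    {"system_prompt": "second"},   # version 2
    {"system_prompt": "third"},    # version 3
]
after = rollback(history, 1)       # version 4 repeats version 1's configuration
```

Versions 1 through 3 remain in place, so the audit trail records both the change that was reverted and the rollback itself.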

Next Steps

  • Learn about Sessions to understand how users interact with assistants
  • Explore Runs for task execution and lifecycle
  • Set up Scheduled Runs for automated recurring tasks
  • Review Scaling for production deployment patterns