# Prompt Node

Invoke large language models with configurable prompts, model selection, and structured outputs.
## Overview
The Prompt Node sends requests to LLM providers (OpenAI, Anthropic, etc.) with customizable prompts and returns generated text. It supports:
- Multiple providers - OpenAI, Anthropic, Cohere, local models
- Structured outputs - JSON mode, function calling, schema enforcement
- Template variables - Dynamic prompt construction from workflow context
- Response caching - Deduplicate identical requests
- Token tracking - Monitor usage and costs
## When to Use
Use a Prompt Node when you need to:
- Generate text from templates with dynamic data
- Classify or categorize content (sentiment, topic, urgency)
- Extract structured data from unstructured text
- Summarize long documents or conversations
- Answer questions using provided context (RAG pattern)
- Transform text between formats or styles
For custom logic or API calls, use Code Node instead.
## Configuration

### Basic Settings
| Field | Description | Example |
|---|---|---|
| Provider | LLM provider | OpenAI, Anthropic, Cohere |
| Model | Specific model | gpt-4, claude-3-opus, command-r |
| Temperature | Randomness (0-2) | 0.7 (balanced), 0 (deterministic), 1.5 (creative) |
| Max tokens | Maximum response length | 1000, 4096, 8192 |
| System prompt | Role/context instructions | "You are an invoice extraction assistant." |
| User prompt | Main query/instruction | "Extract invoice_number, date, total from: {text}" |
### Input Source
JSONPath to the data being processed:

```json
// From workflow input
"input_source": "$.data.document_text"

// From previous node output
"input_source": "$.nodes.extract_text.output.text"

// From nested field
"input_source": "$.nodes.preprocessing.output.result.cleaned_text"
```

The input is injected into prompt templates via variable substitution.
### Template Variables
Use curly braces `{variable}` to inject data into prompts:

User Prompt:

```text
Extract the following fields from this invoice:
- Invoice Number
- Date
- Total Amount
- Vendor Name

Invoice text:
{text}

Return as JSON.
```

Variable Mapping:

```json
{
  "text": "$.nodes.extract_text.output.text"
}
```

At execution time, `{text}` is replaced with the actual extracted text.
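Substitution can be pictured as resolving each mapped path from the workflow context and then formatting the template. A minimal sketch under assumed data shapes (the helper name and `context` layout are illustrative, not the engine's internals):

```python
def render_prompt(template: str, mapping: dict, context: dict) -> str:
    """Resolve each mapped value from the context, then substitute into the template."""
    def lookup(path: str):
        # Walk a simple "$.a.b.c" path through nested dicts.
        value = context
        for key in path.removeprefix("$.").split("."):
            value = value[key]
        return value

    variables = {name: lookup(path) for name, path in mapping.items()}
    return template.format(**variables)

# Hypothetical workflow context for illustration
context = {"nodes": {"extract_text": {"output": {"text": "Invoice #42 ..."}}}}
mapping = {"text": "$.nodes.extract_text.output.text"}
print(render_prompt("Invoice text:\n{text}\n\nReturn as JSON.", mapping, context))
```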
## Supported Providers

### OpenAI
Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo

Configuration:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 2000,
  "api_key_source": "env:OPENAI_API_KEY"
}
```

Features: JSON mode, function calling, vision (gpt-4-vision)
## Structured Outputs
Force LLMs to return valid JSON:

JSON Mode:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": { "type": "json_object" },
  "user_prompt": "Extract fields as JSON: {text}"
}
```

JSON Schema Enforcement:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "invoice_extraction",
      "schema": {
        "type": "object",
        "properties": {
          "invoice_number": { "type": "string" },
          "date": { "type": "string", "format": "date" },
          "total_amount": { "type": "number" }
        },
        "required": ["invoice_number", "date", "total_amount"]
      }
    }
  }
}
```

The LLM is constrained to return JSON matching the schema exactly.
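Even with schema enforcement, it is worth validating the parsed response before relying on it downstream. Here is a minimal stdlib sketch checking the required fields and basic types from the invoice schema; a full JSON Schema validator (for example the `jsonschema` package) would be the more general choice, and the sample response string is invented for illustration:

```python
import json

# Required fields and accepted Python types, mirroring the schema above
REQUIRED = {"invoice_number": str, "date": str, "total_amount": (int, float)}

def validate_invoice(raw: str) -> dict:
    """Parse the LLM response and check required fields and basic types."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    for field, expected_type in REQUIRED.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field} has wrong type: {type(data[field]).__name__}")
    return data

response = '{"invoice_number": "INV-1001", "date": "2024-05-01", "total_amount": 149.5}'
invoice = validate_invoice(response)
print(invoice["total_amount"])  # → 149.5
```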
## Advanced Options

| Option | Type | Default | Description |
|---|---|---|---|
| top_p | number | 1.0 | Nucleus sampling threshold |
| frequency_penalty | number | 0 | Penalize token repetition |
| presence_penalty | number | 0 | Penalize topic repetition |
| stop_sequences | array | [] | Stop generation at specific strings |
| seed | integer | null | Deterministic sampling seed |
| cache_responses | boolean | false | Cache identical requests |
| cache_ttl | integer | 3600 | Cache lifetime in seconds |
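Response caching works by keying on the full request, so any change to the prompt or sampling parameters produces a fresh call. A sketch of how such a cache key and TTL check could work (the exact fields the engine hashes are an assumption):

```python
import hashlib
import json
import time

def cache_key(provider: str, model: str, prompt: str, params: dict) -> str:
    """Derive a stable key from everything that affects the response."""
    payload = json.dumps(
        {"provider": provider, "model": model, "prompt": prompt, "params": params},
        sort_keys=True,  # stable ordering so identical requests hash identically
    )
    return hashlib.sha256(payload.encode()).hexdigest()

# In-memory cache: key -> (expiry timestamp, cached response text)
cache: dict[str, tuple[float, str]] = {}

def get_cached(key: str):
    entry = cache.get(key)
    if entry and entry[0] > time.time():  # still within its TTL
        return entry[1]
    return None

k1 = cache_key("openai", "gpt-4", "Classify: great product!", {"temperature": 0})
k2 = cache_key("openai", "gpt-4", "Classify: great product!", {"temperature": 0})
print(k1 == k2)  # → True: identical requests share a key
```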
## Output
The Prompt Node produces an output object accessible in downstream nodes:

```json
{
  "text": "Generated text response",
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 123,
    "total_tokens": 368
  },
  "cost_usd": 0.0147,
  "model": "gpt-4",
  "finish_reason": "stop",
  "cached": false
}
```

Access fields via JSONPath:

- `$.nodes.prompt_node.output.text` - Generated text
- `$.nodes.prompt_node.output.usage.total_tokens` - Token count
- `$.nodes.prompt_node.output.cost_usd` - API cost
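The cost figure is derived from token usage and per-token pricing. As an illustration, with gpt-4's published rates of $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens, the sample usage above works out to the cost shown. Rates vary by model and change over time, so treat the engine's reported `cost_usd` as authoritative:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_rate_per_1k: float, completion_rate_per_1k: float) -> float:
    """Cost = tokens / 1000 * per-1K rate, summed over prompt and completion."""
    cost = (prompt_tokens / 1000) * prompt_rate_per_1k \
         + (completion_tokens / 1000) * completion_rate_per_1k
    return round(cost, 4)

# Sample output above: 245 prompt + 123 completion tokens on gpt-4.
print(estimate_cost(245, 123, 0.03, 0.06))  # → 0.0147
```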
## Example Configurations

### Text Classification
```json
{
  "node_id": "classify_sentiment",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-3.5-turbo",
    "temperature": 0,
    "max_tokens": 10,
    "system_prompt": "Classify sentiment as: positive, negative, or neutral.",
    "user_prompt": "Text: {text}\n\nSentiment:",
    "input_source": "$.data.customer_review"
  }
}
```

### Structured Data Extraction
```json
{
  "node_id": "extract_invoice",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-opus-20240229",
    "temperature": 0,
    "max_tokens": 1000,
    "system_prompt": "Extract invoice data as JSON with fields: invoice_number, date, total_amount, vendor.",
    "user_prompt": "Invoice text:\n{invoice_text}",
    "input_source": "$.nodes.ocr.output.text",
    "response_format": { "type": "json_object" }
  }
}
```

### RAG Question Answering
```json
{
  "node_id": "answer_question",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 500,
    "system_prompt": "Answer the question using only the provided context. If the answer is not in the context, say 'I don't have enough information.'",
    "user_prompt": "Context:\n{context}\n\nQuestion: {question}\n\nAnswer:",
    "input_source": {
      "context": "$.nodes.retrieve.output.documents",
      "question": "$.data.user_query"
    }
  }
}
```

### Multi-Turn Conversation
```json
{
  "node_id": "chat_response",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-sonnet-20240229",
    "temperature": 1.0,
    "max_tokens": 1000,
    "messages": [
      { "role": "user", "content": "{message_1}" },
      { "role": "assistant", "content": "{response_1}" },
      { "role": "user", "content": "{message_2}" }
    ],
    "input_source": {
      "message_1": "$.data.history[0].content",
      "response_1": "$.data.history[1].content",
      "message_2": "$.data.current_message"
    }
  }
}
```

## Best Practices
- Use low temperature (0-0.3) for extraction - Deterministic outputs for structured data
- Use higher temperature (0.7-1.2) for generation - Creative, varied responses
- Set appropriate max_tokens - Avoid truncated outputs; monitor usage
- Cache identical requests - Enable caching for repeated queries (e.g., classification)
- Use JSON mode for structured outputs - More reliable than asking "return as JSON" in the prompt
- Monitor costs - Set up alerts for runs exceeding token budgets
- Add guardrails downstream - Validate LLM outputs before using in critical operations
Prompt Node execution is asynchronous and billed by the provider. Set max_tokens conservatively to control costs.
## Error Handling
Common errors and resolutions:
| Error | Cause | Resolution |
|---|---|---|
| Rate limit exceeded | Too many requests to provider | Enable retry with exponential backoff |
| Invalid API key | Missing or incorrect credentials | Check API key in gateway configuration |
| Max tokens exceeded | Prompt + response > model limit | Reduce prompt length or use model with larger context |
| JSON parsing failed | LLM returned invalid JSON | Add Guardrail Node to validate and retry |
| Timeout | Response took > node timeout | Increase node timeout or use faster model |
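The rate-limit resolution above (retry with exponential backoff) can be sketched as follows; the error class and call signature are illustrative stand-ins, not the engine's API:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit (HTTP 429) error."""

def call_with_backoff(request_fn, max_retries: int = 3, base_delay: float = 1.0):
    """Retry on rate limits, doubling the delay each attempt, with small jitter."""
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the workflow
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Simulated provider call that fails twice, then succeeds.
attempts = []
def flaky_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

print(call_with_backoff(flaky_request, base_delay=0.01))  # → ok
```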
Set error handling in the node definition:

```json
{
  "error_handling": {
    "retry_on_rate_limit": true,
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "on_error": "fail_workflow"
  }
}
```

## Cost Optimization
Reduce LLM costs:
- Use cheaper models for simple tasks - gpt-3.5-turbo vs gpt-4 for classification
- Enable response caching - Deduplicate identical requests
- Reduce prompt length - Remove unnecessary context or examples
- Set a conservative max_tokens - Prevent runaway generation
- Batch requests - Combine multiple items in one prompt when possible
- Use local models - Deploy Llama/Mistral for high-volume use cases
Track costs in the Runs view under Cost Analysis.
## Related Nodes
- Guardrail Node - Validate LLM outputs
- Code Node - Custom data transformation
- Retrieval Node - Semantic search for RAG