# Prompt Node

Invoke large language models with configurable prompts, model selection, and structured outputs.
## Overview
The Prompt Node sends requests to LLM providers (OpenAI, Anthropic, etc.) with customizable prompts and returns generated text. It supports:
- Multiple providers - OpenAI, Anthropic, Cohere, local models
- Structured outputs - JSON mode, function calling, schema enforcement
- Template variables - Dynamic prompt construction from workflow context
- Response caching - Deduplicate identical requests
- Token tracking - Monitor usage and costs
## When to Use
Use a Prompt Node when you need to:
- Generate text from templates with dynamic data
- Classify or categorize content (sentiment, topic, urgency)
- Extract structured data from unstructured text
- Summarize long documents or conversations
- Answer questions using provided context (RAG pattern)
- Transform text between formats or styles
For custom logic or API calls, use Code Node instead.
## Configuration

### Basic Settings
| Field | Description | Example |
|---|---|---|
| Provider | LLM provider | OpenAI, Anthropic, Cohere |
| Model | Specific model | gpt-4, claude-3-opus, command-r |
| Temperature | Randomness (0-2) | 0.7 (balanced), 0 (deterministic), 1.5 (creative) |
| Max tokens | Maximum response length | 1000, 4096, 8192 |
| System prompt | Role/context instructions | "You are an invoice extraction assistant." |
| User prompt | Main query/instruction | "Extract invoice_number, date, total from: {text}" |
### Input Source
JSONPath to the data being processed:

```json
// From workflow input
"input_source": "$.data.document_text"

// From previous node output
"input_source": "$.nodes.extract_text.output.text"

// From nested field
"input_source": "$.nodes.preprocessing.output.result.cleaned_text"
```

The input is injected into prompt templates via variable substitution.
### Template Variables
Use curly braces `{variable}` to inject data into prompts:

User Prompt:

```text
Extract the following fields from this invoice:
- Invoice Number
- Date
- Total Amount
- Vendor Name

Invoice text:
{text}

Return as JSON.
```

Variable Mapping:

```json
{
  "text": "$.nodes.extract_text.output.text"
}
```

At execution time, `{text}` is replaced with the actual extracted text.
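Substitution can be pictured as resolving each mapped path from the workflow context and then formatting the template. A minimal sketch under assumed data shapes (the helper name and `context` layout are illustrative, not the engine's internals):

```python
def render_prompt(template: str, mapping: dict, context: dict) -> str:
    """Resolve each mapped value from the context, then substitute into the template."""
    def lookup(path: str):
        # Walk a simple "$.a.b.c" path through nested dicts.
        value = context
        for key in path.removeprefix("$.").split("."):
            value = value[key]
        return value

    variables = {name: lookup(path) for name, path in mapping.items()}
    return template.format(**variables)

# Hypothetical workflow context for illustration
context = {"nodes": {"extract_text": {"output": {"text": "Invoice #42 ..."}}}}
mapping = {"text": "$.nodes.extract_text.output.text"}
print(render_prompt("Invoice text:\n{text}\n\nReturn as JSON.", mapping, context))
```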
## Supported Providers

### OpenAI
Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo

Configuration:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 2000,
  "api_key_source": "env:OPENAI_API_KEY"
}
```

Features: JSON mode, function calling, vision (gpt-4-vision)
## Structured Outputs
Force LLMs to return valid JSON:

JSON Mode:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": { "type": "json_object" },
  "user_prompt": "Extract fields as JSON: {text}"
}
```

JSON Schema Enforcement:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "invoice_extraction",
      "schema": {
        "type": "object",
        "properties": {
          "invoice_number": { "type": "string" },
          "date": { "type": "string", "format": "date" },
          "total_amount": { "type": "number" }
        },
        "required": ["invoice_number", "date", "total_amount"]
      }
    }
  }
}
```

The LLM is constrained to return JSON matching the schema exactly.
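Even with schema enforcement, it is worth validating the parsed response before relying on it downstream. Here is a minimal stdlib sketch checking the required fields and basic types from the invoice schema; a full JSON Schema validator (for example the `jsonschema` package) would be the more general choice, and the sample response string is invented for illustration:

```python
import json

# Required fields and accepted Python types, mirroring the schema above
REQUIRED = {"invoice_number": str, "date": str, "total_amount": (int, float)}

def validate_invoice(raw: str) -> dict:
    """Parse the LLM response and check required fields and basic types."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    for field, expected_type in REQUIRED.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field} has wrong type: {type(data[field]).__name__}")
    return data

response = '{"invoice_number": "INV-1001", "date": "2024-05-01", "total_amount": 149.5}'
invoice = validate_invoice(response)
print(invoice["total_amount"])  # → 149.5
```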
## Advanced Options

| Option | Type | Default | Description |
|---|---|---|---|
| top_p | number | 1.0 | Nucleus sampling threshold |
| frequency_penalty | number | 0 | Penalize token repetition |
| presence_penalty | number | 0 | Penalize topic repetition |
| stop_sequences | array | [] | Stop generation at specific strings |
| seed | integer | null | Deterministic sampling seed |
| cache_responses | boolean | false | Cache identical requests |
| cache_ttl | integer | 3600 | Cache lifetime in seconds |
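Response caching works by keying on the full request, so any change to the prompt or sampling parameters produces a fresh call. A sketch of how such a cache key and TTL check could work (the exact fields the engine hashes are an assumption):

```python
import hashlib
import json
import time

def cache_key(provider: str, model: str, prompt: str, params: dict) -> str:
    """Derive a stable key from everything that affects the response."""
    payload = json.dumps(
        {"provider": provider, "model": model, "prompt": prompt, "params": params},
        sort_keys=True,  # stable ordering so identical requests hash identically
    )
    return hashlib.sha256(payload.encode()).hexdigest()

# In-memory cache: key -> (expiry timestamp, cached response text)
cache: dict[str, tuple[float, str]] = {}

def get_cached(key: str):
    entry = cache.get(key)
    if entry and entry[0] > time.time():  # still within its TTL
        return entry[1]
    return None

k1 = cache_key("openai", "gpt-4", "Classify: great product!", {"temperature": 0})
k2 = cache_key("openai", "gpt-4", "Classify: great product!", {"temperature": 0})
print(k1 == k2)  # → True: identical requests share a key
```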
## Output
The Prompt Node produces an output object accessible in downstream nodes:

```json
{
  "text": "Generated text response",
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 123,
    "total_tokens": 368
  },
  "cost_usd": 0.0147,
  "model": "gpt-4",
  "finish_reason": "stop",
  "cached": false
}
```

Access fields via JSONPath:

- `$.nodes.prompt_node.output.text` - Generated text
- `$.nodes.prompt_node.output.usage.total_tokens` - Token count
- `$.nodes.prompt_node.output.cost_usd` - API cost
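The cost figure is derived from token usage and per-token pricing. As an illustration, with gpt-4's published rates of $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens, the sample usage above works out to the cost shown. Rates vary by model and change over time, so treat the engine's reported `cost_usd` as authoritative:

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_rate_per_1k: float, completion_rate_per_1k: float) -> float:
    """Cost = tokens / 1000 * per-1K rate, summed over prompt and completion."""
    cost = (prompt_tokens / 1000) * prompt_rate_per_1k \
         + (completion_tokens / 1000) * completion_rate_per_1k
    return round(cost, 4)

# Sample output above: 245 prompt + 123 completion tokens on gpt-4.
print(estimate_cost(245, 123, 0.03, 0.06))  # → 0.0147
```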
## Example Configurations

### Text Classification
```json
{
  "node_id": "classify_sentiment",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-3.5-turbo",
    "temperature": 0,
    "max_tokens": 10,
    "system_prompt": "Classify sentiment as: positive, negative, or neutral.",
    "user_prompt": "Text: {text}\n\nSentiment:",
    "input_source": "$.data.customer_review"
  }
}
```

### Structured Data Extraction
```json
{
  "node_id": "extract_invoice",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-opus-20240229",
    "temperature": 0,
    "max_tokens": 1000,
    "system_prompt": "Extract invoice data as JSON with fields: invoice_number, date, total_amount, vendor.",
    "user_prompt": "Invoice text:\n{invoice_text}",
    "input_source": "$.nodes.ocr.output.text",
    "response_format": { "type": "json_object" }
  }
}
```

### RAG Question Answering
```json
{
  "node_id": "answer_question",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 500,
    "system_prompt": "Answer the question using only the provided context. If the answer is not in the context, say 'I don't have enough information.'",
    "user_prompt": "Context:\n{context}\n\nQuestion: {question}\n\nAnswer:",
    "input_source": {
      "context": "$.nodes.retrieve.output.documents",
      "question": "$.data.user_query"
    }
  }
}
```

### Multi-Turn Conversation
```json
{
  "node_id": "chat_response",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-sonnet-20240229",
    "temperature": 1.0,
    "max_tokens": 1000,
    "messages": [
      { "role": "user", "content": "{message_1}" },
      { "role": "assistant", "content": "{response_1}" },
      { "role": "user", "content": "{message_2}" }
    ],
    "input_source": {
      "message_1": "$.data.history[0].content",
      "response_1": "$.data.history[1].content",
      "message_2": "$.data.current_message"
    }
  }
}
```

## Best Practices
- Use low temperature (0-0.3) for extraction - Deterministic outputs for structured data
- Use higher temperature (0.7-1.2) for generation - Creative, varied responses
- Set appropriate max_tokens - Avoid truncated outputs; monitor usage
- Cache identical requests - Enable caching for repeated queries (e.g., classification)
- Use JSON mode for structured outputs - More reliable than asking "return as JSON" in the prompt
- Monitor costs - Set up alerts for runs exceeding token budgets
- Add guardrails downstream - Validate LLM outputs before using in critical operations
Prompt Node execution is asynchronous and billed by the provider. Set max_tokens conservatively to control costs.
## Error Handling
Common errors and resolutions:
| Error | Cause | Resolution |
|---|---|---|
| Rate limit exceeded | Too many requests to provider | Enable retry with exponential backoff |
| Invalid API key | Missing or incorrect credentials | Check API key in gateway configuration |
| Max tokens exceeded | Prompt + response > model limit | Reduce prompt length or use model with larger context |
| JSON parsing failed | LLM returned invalid JSON | Add Guardrail Node to validate and retry |
| Timeout | Response took > node timeout | Increase node timeout or use faster model |
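The rate-limit resolution above (retry with exponential backoff) can be sketched as follows; the error class and call signature are illustrative stand-ins, not the engine's API:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit (HTTP 429) error."""

def call_with_backoff(request_fn, max_retries: int = 3, base_delay: float = 1.0):
    """Retry on rate limits, doubling the delay each attempt, with small jitter."""
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the workflow
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Simulated provider call that fails twice, then succeeds.
attempts = []
def flaky_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

print(call_with_backoff(flaky_request, base_delay=0.01))  # → ok
```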
Set error handling in the node definition:

```json
{
  "error_handling": {
    "retry_on_rate_limit": true,
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "on_error": "fail_workflow"
  }
}
```

## Cost Optimization
Reduce LLM costs:
- Use cheaper models for simple tasks - gpt-3.5-turbo vs gpt-4 for classification
- Enable response caching - Deduplicate identical requests
- Reduce prompt length - Remove unnecessary context or examples
- Set a conservative max_tokens - Prevent runaway generation
- Batch requests - Combine multiple items in one prompt when possible
- Use local models - Deploy Llama/Mistral for high-volume use cases
Track costs in the Runs view under Cost Analysis.
## Related Nodes
- Guardrail Node - Validate LLM outputs
- Code Node - Custom data transformation
- Retrieval Node - Semantic search for RAG