
Prompt Node

Invoke Large Language Models with configurable prompts, model selection, and structured outputs.

Overview

The Prompt Node sends requests to LLM providers (OpenAI, Anthropic, etc.) with customizable prompts and returns generated text. It supports:

  • Multiple providers - OpenAI, Anthropic, Cohere, local models
  • Structured outputs - JSON mode, function calling, schema enforcement
  • Template variables - Dynamic prompt construction from workflow context
  • Response caching - Deduplicate identical requests
  • Token tracking - Monitor usage and costs

When to Use

Use a Prompt Node when you need to:

  • Generate text from templates with dynamic data
  • Classify or categorize content (sentiment, topic, urgency)
  • Extract structured data from unstructured text
  • Summarize long documents or conversations
  • Answer questions using provided context (RAG pattern)
  • Transform text between formats or styles

For custom logic or arbitrary API calls, use a Code Node instead.

Configuration

Basic Settings

| Field | Description | Example |
|---|---|---|
| Provider | LLM provider | OpenAI, Anthropic, Cohere |
| Model | Specific model | `gpt-4`, `claude-3-opus`, `command-r` |
| Temperature | Randomness (0-2) | 0.7 (balanced), 0 (deterministic), 1.5 (creative) |
| Max tokens | Maximum response length | 1000, 4096, 8192 |
| System prompt | Role/context instructions | "You are an invoice extraction assistant." |
| User prompt | Main query/instruction | "Extract invoice_number, date, total from: {text}" |

Input Source

JSONPath to the data being processed:

```json
// From workflow input
"input_source": "$.data.document_text"

// From previous node output
"input_source": "$.nodes.extract_text.output.text"

// From nested field
"input_source": "$.nodes.preprocessing.output.result.cleaned_text"
```

The input is injected into prompt templates via variable substitution.

Template Variables

Use curly braces {variable} to inject data into prompts:

User Prompt:

```
Extract the following fields from this invoice:
- Invoice Number
- Date
- Total Amount
- Vendor Name

Invoice text:
{text}

Return as JSON.
```

Variable Mapping:

```json
{
  "text": "$.nodes.extract_text.output.text"
}
```

At execution time, {text} is replaced with the actual extracted text.
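The substitution step can be sketched in Python. This is a minimal illustration of the assumed behavior; `resolve_path` and `render_prompt` are hypothetical helpers, not part of the platform API:

```python
# Hypothetical sketch: resolve each mapped JSONPath against the workflow
# context, then fill the {variable} placeholders in the prompt template.
def resolve_path(context: dict, path: str):
    """Resolve a simple dotted JSONPath like '$.nodes.x.output.text'."""
    value = context
    for part in path.lstrip("$.").split("."):
        value = value[part]
    return value

def render_prompt(template: str, variable_mapping: dict, context: dict) -> str:
    values = {name: resolve_path(context, path)
              for name, path in variable_mapping.items()}
    return template.format(**values)

context = {"nodes": {"extract_text": {"output": {"text": "INV-001 ..."}}}}
mapping = {"text": "$.nodes.extract_text.output.text"}
rendered = render_prompt("Invoice text: {text}", mapping, context)
```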

Supported Providers

OpenAI

Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo

Configuration:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 2000,
  "api_key_source": "env:OPENAI_API_KEY"
}
```

Features: JSON mode, function calling, vision (gpt-4-vision)

Structured Outputs

Force LLMs to return valid JSON:

JSON Mode:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": { "type": "json_object" },
  "user_prompt": "Extract fields as JSON: {text}"
}
```

JSON Schema Enforcement:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "invoice_extraction",
      "schema": {
        "type": "object",
        "properties": {
          "invoice_number": { "type": "string" },
          "date": { "type": "string", "format": "date" },
          "total_amount": { "type": "number" }
        },
        "required": ["invoice_number", "date", "total_amount"]
      }
    }
  }
}
```

The LLM is constrained to return JSON matching the schema exactly.
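Even with schema enforcement, it is good practice to validate the parsed response before using it downstream. A minimal stdlib-only sketch; `REQUIRED` mirrors the schema above, and `validate_extraction` is a hypothetical helper:

```python
import json

# Downstream validation sketch: parse the model's response and check it
# against the required fields and types from the schema above.
REQUIRED = {
    "invoice_number": str,
    "date": str,
    "total_amount": (int, float),
}

def validate_extraction(raw_response: str) -> dict:
    data = json.loads(raw_response)  # raises ValueError on invalid JSON
    for field, expected in REQUIRED.items():
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(data[field], expected):
            raise ValueError(f"wrong type for field: {field}")
    return data

result = validate_extraction(
    '{"invoice_number": "INV-001", "date": "2024-05-01", "total_amount": 99.5}'
)
```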

Advanced Options

| Option | Type | Default | Description |
|---|---|---|---|
| `top_p` | number | 1.0 | Nucleus sampling threshold |
| `frequency_penalty` | number | 0 | Penalize token repetition |
| `presence_penalty` | number | 0 | Penalize topic repetition |
| `stop_sequences` | array | `[]` | Stop generation at specific strings |
| `seed` | integer | null | Deterministic sampling seed |
| `cache_responses` | boolean | false | Cache identical requests |
| `cache_ttl` | integer | 3600 | Cache lifetime in seconds |
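The caching options can be illustrated with a sketch of the assumed semantics of `cache_responses` and `cache_ttl`: identical requests are keyed by a hash of the full request and share one response until the TTL expires. `cache_key` and `call_llm_cached` are illustrative names, not platform API:

```python
import hashlib
import json
import time

_cache = {}

def cache_key(request: dict) -> str:
    # Hash the canonicalized request so any change to the prompt, model,
    # or sampling settings produces a cache miss.
    payload = json.dumps(request, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def call_llm_cached(request: dict, call_llm, ttl: int = 3600):
    key = cache_key(request)
    hit = _cache.get(key)
    if hit is not None and time.time() - hit["at"] < ttl:
        return hit["response"], True      # corresponds to "cached": true
    response = call_llm(request)
    _cache[key] = {"response": response, "at": time.time()}
    return response, False
```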

Output

The Prompt Node produces an output object accessible in downstream nodes:

```json
{
  "text": "Generated text response",
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 123,
    "total_tokens": 368
  },
  "cost_usd": 0.0147,
  "model": "gpt-4",
  "finish_reason": "stop",
  "cached": false
}
```

Access fields via JSONPath:

  • $.nodes.prompt_node.output.text - Generated text
  • $.nodes.prompt_node.output.usage.total_tokens - Token count
  • $.nodes.prompt_node.output.cost_usd - API cost
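For illustration, here is how a `cost_usd` figure could be derived from the `usage` block. The per-1K-token rates below are placeholders, not real provider pricing, and `estimate_cost` is a hypothetical helper:

```python
# Illustrative only: placeholder USD-per-1K-token rates, not real pricing.
PRICING = {"gpt-4": {"prompt": 0.03, "completion": 0.06}}

def estimate_cost(model: str, usage: dict) -> float:
    rates = PRICING[model]
    cost = (usage["prompt_tokens"] / 1000 * rates["prompt"]
            + usage["completion_tokens"] / 1000 * rates["completion"])
    return round(cost, 4)

usage = {"prompt_tokens": 245, "completion_tokens": 123, "total_tokens": 368}
estimate_cost("gpt-4", usage)
```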

Example Configurations

Text Classification

```json
{
  "node_id": "classify_sentiment",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-3.5-turbo",
    "temperature": 0,
    "max_tokens": 10,
    "system_prompt": "Classify sentiment as: positive, negative, or neutral.",
    "user_prompt": "Text: {text}\n\nSentiment:",
    "input_source": "$.data.customer_review"
  }
}
```

Structured Data Extraction

```json
{
  "node_id": "extract_invoice",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-opus-20240229",
    "temperature": 0,
    "max_tokens": 1000,
    "system_prompt": "Extract invoice data as JSON with fields: invoice_number, date, total_amount, vendor.",
    "user_prompt": "Invoice text:\n{invoice_text}",
    "input_source": "$.nodes.ocr.output.text",
    "response_format": { "type": "json_object" }
  }
}
```

RAG Question Answering

```json
{
  "node_id": "answer_question",
  "node_type": "PROMPT",
  "definition": {
    "provider": "openai",
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 500,
    "system_prompt": "Answer the question using only the provided context. If the answer is not in the context, say 'I don't have enough information.'",
    "user_prompt": "Context:\n{context}\n\nQuestion: {question}\n\nAnswer:",
    "input_source": {
      "context": "$.nodes.retrieve.output.documents",
      "question": "$.data.user_query"
    }
  }
}
```

Multi-Turn Conversation

```json
{
  "node_id": "chat_response",
  "node_type": "PROMPT",
  "definition": {
    "provider": "anthropic",
    "model": "claude-3-sonnet-20240229",
    "temperature": 1.0,
    "max_tokens": 1000,
    "messages": [
      { "role": "user", "content": "{message_1}" },
      { "role": "assistant", "content": "{response_1}" },
      { "role": "user", "content": "{message_2}" }
    ],
    "input_source": {
      "message_1": "$.data.history[0].content",
      "response_1": "$.data.history[1].content",
      "message_2": "$.data.current_message"
    }
  }
}
```

Best Practices

  1. Use low temperature (0-0.3) for extraction - Deterministic outputs for structured data
  2. Use higher temperature (0.7-1.2) for generation - Creative, varied responses
  3. Set appropriate max_tokens - Avoid truncated outputs; monitor usage
  4. Cache identical requests - Enable caching for repeated queries (e.g., classification)
  5. Use JSON mode for structured outputs - More reliable than asking “return as JSON” in prompt
  6. Monitor costs - Set up alerts for runs exceeding token budgets
  7. Add guardrails downstream - Validate LLM outputs before using in critical operations

Prompt Node execution is asynchronous and billed by the provider. Set max_tokens conservatively to control costs.

Error Handling

Common errors and resolutions:

| Error | Cause | Resolution |
|---|---|---|
| Rate limit exceeded | Too many requests to provider | Enable retry with exponential backoff |
| Invalid API key | Missing or incorrect credentials | Check API key in gateway configuration |
| Max tokens exceeded | Prompt + response > model limit | Reduce prompt length or use a model with a larger context window |
| JSON parsing failed | LLM returned invalid JSON | Add a Guardrail Node to validate and retry |
| Timeout | Response took longer than the node timeout | Increase node timeout or use a faster model |

Set error handling in node definition:

```json
{
  "error_handling": {
    "retry_on_rate_limit": true,
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "on_error": "fail_workflow"
  }
}
```
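The retry behavior this configuration implies can be sketched as follows. These are assumed semantics; `RateLimitError` and `call_with_retries` are illustrative names, not platform API:

```python
import random
import time

class RateLimitError(Exception):
    pass

def call_with_retries(call, max_retries: int = 3, retry_delay_ms: int = 1000):
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries -> on_error: "fail_workflow"
            # Exponential backoff with a little jitter.
            delay = retry_delay_ms / 1000 * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))
```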

Cost Optimization

Reduce LLM costs:

  1. Use cheaper models for simple tasks - gpt-3.5-turbo vs gpt-4 for classification
  2. Enable response caching - Deduplicate identical requests
  3. Reduce prompt length - Remove unnecessary context or examples
  4. Set aggressive max_tokens - Prevent runaway generation
  5. Batch requests - Combine multiple items in one prompt when possible
  6. Use local models - Deploy Llama/Mistral for high-volume use cases

Track costs in the Runs view under Cost Analysis.
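Batching (tip 5) can be sketched as a prompt builder that numbers the items and asks for a single JSON array of labels, instead of issuing one request per item. `build_batch_prompt` is a hypothetical helper:

```python
def build_batch_prompt(reviews):
    # Number the items so the model can return labels in a stable order.
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(reviews))
    return (
        "Classify the sentiment of each numbered review as positive, "
        "negative, or neutral. Return a JSON array of labels in order.\n\n"
        + numbered
    )

prompt = build_batch_prompt(["Great service!", "Slow and rude staff."])
```

When parsing the response, validating that the returned array's length matches the input batch guards against dropped items.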
