
Guardrail Node

Run inline quality evaluation using pre-defined metrics during workflow execution.

Overview

The Guardrail Node evaluates workflow outputs against configurable quality metrics and routes execution to pass or fail paths based on the results. This enables automated quality control, validation, and conditional retry logic.

When to Use

Use a Guardrail Node when you need to:

  • Validate LLM outputs against quality thresholds before returning to users
  • Implement retry loops when responses don’t meet quality standards
  • Check data integrity with schema validation or regex patterns
  • Run custom evaluation logic via Python functions or external executors

Example: RAG Faithfulness Check

A common use case is validating RAG (Retrieval-Augmented Generation) responses:

  1. LLM generates a response using retrieved context
  2. Guardrail evaluates Faithfulness (is the response grounded in the context?)
  3. If score >= 0.8: Pass → Return response to user
  4. If score < 0.8: Fail → Retry with different prompt or return error
START → LLM_NODE → GUARDRAIL → [pass: END_SUCCESS, fail: RETRY or END_ERROR]
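The retry loop above can be sketched in plain Python. This is a minimal illustration, not the platform's implementation: `generate` and `evaluate` are hypothetical stand-ins for the LLM node and the Faithfulness metric, and only the pass/fail routing mirrors the workflow.

```python
# Sketch of the RAG faithfulness gate: generate, score, and either
# return on pass or retry up to a limit before taking the fail path.
PASS_THRESHOLD = 0.8
MAX_RETRIES = 2

def run_with_guardrail(query, context, generate, evaluate):
    score = 0.0
    for attempt in range(MAX_RETRIES + 1):
        response = generate(query, context, attempt)
        score = evaluate(response, context)
        if score >= PASS_THRESHOLD:
            return {"path": "pass", "response": response, "score": score}
    return {"path": "fail", "response": None, "score": score}
```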

Configuration

Input Source

JSONPath expression to the data being evaluated:

  • $.nodes.llm_node.output - LLM output from previous node
  • $.data.response - Response field from input data
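For intuition, the simple dotted paths shown above can be resolved with a few lines of Python. This is a minimal sketch for dotted-key paths only; full JSONPath supports filters, wildcards, and indexing that this does not.

```python
def resolve_path(data, path):
    """Resolve a simple '$.a.b.c' JSONPath (dotted keys only) against a dict."""
    keys = path.lstrip("$").strip(".").split(".")
    node = data
    for key in keys:
        node = node[key]
    return node

payload = {"nodes": {"llm_node": {"output": "The invoice total is $42."}}}
resolve_path(payload, "$.nodes.llm_node.output")  # → "The invoice total is $42."
```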

Metrics

Configure one or more evaluation metrics:

| Metric Type | Description | Example Params |
| --- | --- | --- |
| `faithfulness` | RAG faithfulness score | `{}` |
| `relevance` | Query-response relevance | `{}` |
| `json_schema` | JSON schema validation | `{"schema": {...}}` |
| `regex_match` | Pattern matching | `{"pattern": "^[A-Z].*"}` |
| `length_check` | String length bounds | `{"min": 10, "max": 1000}` |
| `contains_keywords` | Keyword presence | `{"keywords": ["invoice", "total"]}` |
| `llm_judge` | LLM-based evaluation | `{"prompt": "...", "model": "gpt-4"}` |
| `executor` | Custom Python function | `{"endpoint": "guardrail_executor://evaluate", "function": "my_fn"}` |
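An `executor` metric delegates scoring to a custom Python function such as the `my_fn` referenced above. The exact signature the platform passes to executor functions is not documented here, so the following is a hedged sketch assuming the function receives the evaluated value and returns a score in `[0.0, 1.0]`.

```python
# Hypothetical executor metric: scores 1.0 when the text mentions both
# "invoice" and "total", 0.0 otherwise. Signature is an assumption.
def my_fn(value: str) -> float:
    text = value.lower()
    return 1.0 if "invoice" in text and "total" in text else 0.0
```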

Aggregation Mode

How multiple metrics are combined:

  • all (default): All metrics must pass their individual thresholds
  • any: At least one metric must pass
  • weighted_average: Weighted score compared to overall threshold
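The three modes can be expressed directly from their definitions. A minimal sketch, assuming each per-metric result carries `passed`, `score`, and `weight` fields (field names are assumptions based on the metric configuration shown later):

```python
def aggregate(results, mode, pass_threshold=0.8):
    """Combine per-metric results according to the documented modes."""
    if mode == "all":
        return all(r["passed"] for r in results)
    if mode == "any":
        return any(r["passed"] for r in results)
    if mode == "weighted_average":
        total_weight = sum(r["weight"] for r in results)
        score = sum(r["score"] * r["weight"] for r in results) / total_weight
        return score >= pass_threshold
    raise ValueError(f"unknown aggregation mode: {mode}")
```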

Pass Threshold

Score threshold (0.0-1.0) for the aggregated result. Default: 0.8

Additional Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `fail_fast` | boolean | `false` | Stop evaluation on first metric failure |
| `include_feedback` | boolean | `true` | Include detailed feedback in results |
| `evaluation_timeout` | integer | `30` | Maximum time for evaluation (seconds) |

Output

The Guardrail Node produces evaluation results accessible in downstream nodes:

{
  "guardrail_result": {
    "overall_passed": true,
    "overall_score": 0.85,
    "selected_path_id": "pass",
    "individual_results": [
      {
        "metric_name": "Faithfulness",
        "passed": true,
        "score": 0.85,
        "feedback": "Response is well-grounded in context"
      },
      {
        "metric_name": "Length Check",
        "passed": true,
        "score": 1.0,
        "feedback": "Length 245 within bounds [10, 1000]"
      }
    ],
    "total_execution_time_ms": 156.3
  }
}
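A downstream node on the fail path might use `individual_results` to explain the rejection or to build a revised prompt. A small sketch (the result structure follows the example above; how a node receives it is an assumption):

```python
# Collect the names of metrics that failed, e.g. for an error message.
guardrail_result = {
    "overall_passed": False,
    "overall_score": 0.45,
    "selected_path_id": "fail",
    "individual_results": [
        {"metric_name": "Faithfulness", "passed": False, "score": 0.45},
        {"metric_name": "Length Check", "passed": True, "score": 1.0},
    ],
}

failed_metrics = [
    r["metric_name"]
    for r in guardrail_result["individual_results"]
    if not r["passed"]
]
```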

Example Configurations

{
  "task_id": "guardrail_1",
  "node_type": "GUARDRAIL",
  "query_str": "Validate response quality",
  "definition": {
    "method": "GUARDRAIL",
    "endpoint": "guardrail://control",
    "input_source": "$.nodes.llm_node.output",
    "metrics": [
      {
        "type": "length_check",
        "name": "Response Length",
        "threshold": 0.5,
        "weight": 1.0,
        "params": { "min": 10, "max": 1000 }
      },
      {
        "type": "regex_match",
        "name": "No PII",
        "threshold": 1.0,
        "weight": 1.0,
        "params": { "pattern": "^(?!.*\\d{3}-\\d{2}-\\d{4}).*$" }
      }
    ],
    "aggregation_mode": "all",
    "pass_threshold": 0.8,
    "paths": [
      { "path_id": "pass", "target_node_ids": ["success_node"] },
      { "path_id": "fail", "target_node_ids": ["error_node"] }
    ]
  }
}

Best Practices

Start with simple metrics (regex, length) before adding LLM-based evaluation.

  1. Set appropriate timeouts for executor-based metrics (default: 30s)
  2. Use fail_fast=true when any single metric failure should stop evaluation
  3. Log evaluation results by enabling include_feedback=true for debugging
  4. Route failures gracefully - point fail paths to error handlers or terminal nodes
  5. Consider retries - fail paths can loop back to retry with different parameters
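The effect of `fail_fast` (practice 2) can be illustrated with a minimal sketch, assuming each metric is a callable returning a result dict with a `passed` flag:

```python
def evaluate_metrics(metrics, value, fail_fast=False):
    """Evaluate metrics in order; with fail_fast, stop at the first failure."""
    results = []
    for metric in metrics:
        result = metric(value)
        results.append(result)
        if fail_fast and not result["passed"]:
            break  # remaining metrics are skipped
    return results
```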

Error Handling

When guardrail evaluation fails (timeout, exception), the workflow routes to the fail path by default:

  • Evaluation timeout: Routes to fail path with timeout metadata
  • Evaluation exception: Routes to fail path with error details
  • No metrics configured: Routes to fail path with validation error
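The timeout and exception cases above can be sketched with Python's standard `concurrent.futures` machinery. This is an illustration of the documented routing behavior, not the platform's implementation; `route_guardrail` and `evaluate` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as EvalTimeout

def route_guardrail(evaluate, value, timeout_s=30):
    """Timeouts and exceptions both select the fail path, with metadata."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(evaluate, value)
        try:
            passed = future.result(timeout=timeout_s)
        except EvalTimeout:
            return {"path_id": "fail", "error": "evaluation timeout"}
        except Exception as exc:
            return {"path_id": "fail", "error": f"evaluation exception: {exc}"}
    return {"path_id": "pass" if passed else "fail", "error": None}
```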

Design your workflow with appropriate terminal nodes on fail paths:

GUARDRAIL
 |-- pass_path → [Processing] → END_SUCCESS
 |-- fail_path → END_ERROR (terminal)