Prompt Playground

The Playground provides an interactive environment for testing prompts against live LLM providers with real-time streaming responses, variable substitution, and multi-modal support.

Overview

Test prompts before production deployment. The Playground automatically detects variables in your templates, presents fillable forms, and streams responses from any configured LLM connection. Save successful tests for later comparison or export results for analysis.

[Screenshot: Prompt Playground showing the template editor with variable inputs on the left and a streaming LLM response on the right]

Key Features

Auto-Detected Variables

Variables use curly brace syntax: {variable_name}. The Playground scans your prompt template and generates input fields automatically.

Example:

You are a {role} helping with {task}.
User question: {user_input}
Context: {context}

The UI displays four input fields:

  • role (e.g., “customer service agent”)
  • task (e.g., “technical troubleshooting”)
  • user_input (e.g., “My printer won’t connect”)
  • context (e.g., “Customer has HP LaserJet Pro”)

Variables support multiple types: text, numbers, JSON objects, and file attachments.

[Screenshot: Playground variable panel showing auto-detected template variables with input fields and type selectors]

Variable names must be alphanumeric with underscores. Nested variables like {user.name} are not supported — use flat keys like {user_name}.
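Detection and substitution of this kind can be sketched with a single regular expression. This is an illustrative sketch, not the Playground's actual implementation:

```python
import re

# Matches flat {variable_name} placeholders: letters, digits, and underscores
# only, so nested forms like {user.name} are ignored, as noted above.
VARIABLE_PATTERN = re.compile(r"\{([A-Za-z0-9_]+)\}")

def detect_variables(template: str) -> list[str]:
    """Return unique variable names in order of first appearance."""
    seen: dict[str, None] = {}
    for name in VARIABLE_PATTERN.findall(template):
        seen.setdefault(name)
    return list(seen)

def render(template: str, values: dict[str, str]) -> str:
    """Substitute each {name} placeholder with its value."""
    return VARIABLE_PATTERN.sub(lambda m: values[m.group(1)], template)

template = "You are a {role} helping with {task}. User question: {user_input}"
print(detect_variables(template))  # ['role', 'task', 'user_input']
```

Because the pattern excludes `.`, a nested `{user.name}` is simply not detected, which matches the flat-key rule above.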

Model Selection

Choose from all configured LLM connections:

Provider    Models                              Features
OpenAI      GPT-4, GPT-4 Turbo, GPT-3.5 Turbo   Vision, streaming, function calling
Anthropic   Claude 3 Opus, Sonnet, Haiku        200K context, vision, artifacts
Qwen        Qwen2.5-72B, Qwen2.5-7B             High-performance Chinese/English
Custom      Any OpenAI-compatible API           Self-hosted models

The model selector displays context window size and capabilities (vision, streaming) as badges.

Parameter Presets

Fine-tune generation behavior with preset configurations:

  • Deterministic — Temperature 0, consistent outputs for classification and extraction
  • Balanced — Temperature 0.7, general-purpose tasks
  • Creative — Temperature 1.2, storytelling and brainstorming
  • Diverse — Temperature 0.9 with high penalties, maximum variety
  • Custom — Manual control of all parameters

Advanced parameters:

  • Temperature (0-2)
  • Top-P nucleus sampling (0-1)
  • Presence penalty (-2 to 2)
  • Frequency penalty (-2 to 2)
  • Max tokens
  • Stop sequences
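The presets and parameters above map naturally onto a small lookup table. The sketch below uses the temperatures listed here, but the penalty values for Diverse and the default settings are illustrative assumptions, not the Playground's actual configuration:

```python
# Illustrative preset table; temperatures mirror the list above,
# the Diverse penalties are assumed values.
PRESETS = {
    "deterministic": {"temperature": 0.0},
    "balanced":      {"temperature": 0.7},
    "creative":      {"temperature": 1.2},
    "diverse":       {"temperature": 0.9,
                      "presence_penalty": 1.5,
                      "frequency_penalty": 1.5},
}

# Assumed defaults; ranges follow the parameter list above.
DEFAULTS = {
    "temperature": 0.7,        # 0 to 2
    "top_p": 1.0,              # 0 to 1
    "presence_penalty": 0.0,   # -2 to 2
    "frequency_penalty": 0.0,  # -2 to 2
    "max_tokens": 1024,
    "stop": None,              # stop sequences
}

def build_params(preset: str, **overrides) -> dict:
    """Merge defaults, a named preset, and any manual ("Custom") overrides."""
    return {**DEFAULTS, **PRESETS[preset], **overrides}

print(build_params("diverse")["temperature"])  # 0.9
```

Manual overrides win over the preset, which wins over the defaults, mirroring the Custom option.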

Streaming Responses

All models support real-time streaming. Responses appear token-by-token as the LLM generates them, providing immediate feedback during testing.
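Consuming a stream amounts to iterating over tokens as they arrive. The sketch below uses a stand-in generator in place of a real provider SDK, which would expose an equivalent iterator:

```python
from typing import Iterator

def fake_stream() -> Iterator[str]:
    """Stand-in for a provider SDK's streaming iterator."""
    for token in ["Hello", ",", " world", "!"]:
        yield token

def consume(stream: Iterator[str]) -> str:
    """Accumulate tokens as they arrive; a UI would render each one immediately."""
    parts = []
    for token in stream:
        parts.append(token)
    return "".join(parts)

print(consume(fake_stream()))  # Hello, world!
```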

Performance metrics displayed:

  • Time to first token (TTFT)
  • Tokens per second (TPS)
  • Total duration
  • Token count (input + output)
  • Estimated cost (based on provider pricing)
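All of these metrics can be derived from three timestamps plus the token counts. A sketch with placeholder pricing (the rates below are not real provider prices):

```python
from dataclasses import dataclass

@dataclass
class StreamMetrics:
    start: float            # request sent (seconds)
    first_token: float      # first token received
    end: float              # stream finished
    input_tokens: int
    output_tokens: int
    price_per_1k_in: float  # placeholder rate, not real pricing
    price_per_1k_out: float

    @property
    def ttft(self) -> float:
        """Time to first token."""
        return self.first_token - self.start

    @property
    def tokens_per_second(self) -> float:
        """Output throughput while streaming."""
        return self.output_tokens / (self.end - self.first_token)

    @property
    def estimated_cost(self) -> float:
        """Cost estimate from per-1K-token rates."""
        return (self.input_tokens / 1000 * self.price_per_1k_in
                + self.output_tokens / 1000 * self.price_per_1k_out)

m = StreamMetrics(0.0, 0.4, 10.4, 500, 200, 0.5, 1.5)
print(f"TTFT={m.ttft:.1f}s  TPS={m.tokens_per_second:.0f}  cost=${m.estimated_cost:.3f}")
```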

Multi-Modal Support

Vision-capable models (GPT-4 Vision, Claude 3) support image attachments:

  1. Click the image upload button
  2. Select image files (JPEG, PNG, GIF, WebP)
  3. Images are automatically resized and base64-encoded
  4. Add multiple images per prompt

Supported use cases:

  • Document analysis (invoices, receipts, contracts)
  • Visual QA (“describe this image”, “what’s wrong here?”)
  • OCR and text extraction
  • Diagram interpretation

Large images are automatically resized to 2048px max dimension to reduce API costs. Original aspect ratios are preserved.
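The resize rule reduces to a scale factor on the longer side. A sketch of the dimension math and the encoding step (the Playground's actual resampling pipeline is an implementation detail):

```python
import base64

MAX_DIMENSION = 2048

def scaled_size(width: int, height: int, limit: int = MAX_DIMENSION) -> tuple[int, int]:
    """Shrink so the longer side is at most `limit`, preserving aspect ratio."""
    longest = max(width, height)
    if longest <= limit:
        return width, height
    scale = limit / longest
    return round(width * scale), round(height * scale)

def encode_image(data: bytes) -> str:
    """Base64-encode raw image bytes for embedding in an API request."""
    return base64.b64encode(data).decode("ascii")

print(scaled_size(4096, 3072))  # (2048, 1536)
```

Images already under the cap pass through untouched, so small uploads carry no quality penalty.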

Using the Playground

Select a Prompt

Navigate to a prompt repository and click any prompt file. The editor opens in the main panel.

Open Playground

Click “Test in Playground” from the toolbar or use the keyboard shortcut Cmd+Shift+P.

The Playground opens as a side panel showing:

  • Prompt template preview
  • Auto-detected variable inputs
  • Model selector
  • Parameter controls

Configure Test

  1. Choose a model from the dropdown
  2. Fill variable values in the auto-generated form
  3. Adjust parameters using presets or custom values
  4. Attach images if testing vision models (optional)

Run Test

Click “Run” or press Cmd+Enter. The response streams in real-time with metrics displayed at the bottom.

Save or Compare

Successful tests can be:

  • Saved to the test history for later reference
  • Exported as JSON for analysis
  • Compared against other test runs to evaluate quality differences

Advanced Features

Test History

Every test run is saved with metadata:

  • Timestamp
  • Model and parameters used
  • Variable values
  • Full response text
  • Performance metrics

Access history from the sidebar to replay tests or compare results across different configurations.
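A saved run can be pictured as a plain JSON record holding the metadata listed above; the field names below are illustrative, not the actual history or export schema:

```python
import json
from datetime import datetime, timezone

# Hypothetical saved-run record; field names are illustrative.
test_run = {
    "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    "model": "claude-3-haiku",
    "parameters": {"temperature": 0.0, "max_tokens": 512},
    "variables": {"role": "customer service agent"},
    "response": "...",
    "metrics": {"ttft_s": 0.4, "tokens_per_second": 20, "total_tokens": 700},
}

# Export to JSON and read it back, as a comparison tool would.
exported = json.dumps(test_run, indent=2)
restored = json.loads(exported)
print(restored["model"])  # claude-3-haiku
```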

JSON Mode

For structured outputs, enable JSON mode in parameter settings. The LLM returns valid JSON that can be parsed directly into application logic.

Example prompt for JSON mode:

Extract customer information as JSON.
Input: "John Smith, john@example.com, lives in Seattle"
Output format:
{
  "name": "string",
  "email": "string",
  "location": "string"
}

With JSON mode enabled, supported providers constrain the response to valid JSON, e.g.: {"name": "John Smith", "email": "john@example.com", "location": "Seattle"}

System Prompts

Separate system and user prompts for better control:

  • System prompt — Defines agent behavior (e.g., “You are a helpful assistant”)
  • User prompt — Contains the task or question

Both support variables and can be tested independently.
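In OpenAI-compatible chat APIs, the two prompts become separate entries in the messages array. A sketch of assembling both after variable substitution (`build_messages` is a hypothetical helper, not a Playground API):

```python
def build_messages(system_template: str,
                   user_template: str,
                   values: dict[str, str]) -> list[dict]:
    """Render both templates and pair them as chat messages."""
    return [
        {"role": "system", "content": system_template.format(**values)},
        {"role": "user", "content": user_template.format(**values)},
    ]

messages = build_messages(
    "You are a {role}.",
    "User question: {user_input}",
    {"role": "customer service agent", "user_input": "My printer won't connect"},
)
print(messages[0]["content"])  # You are a customer service agent.
```

Keeping the system prompt in its own template lets you test either half independently, as noted above.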

Streaming Control

Pause or cancel streaming responses:

  • Pause — Temporarily stop streaming without losing progress
  • Cancel — Abort the request and discard partial results
  • Resume — Continue from where streaming paused

Best Practices

Variable Naming

Use descriptive, consistent names:

  • Good: {customer_name}, {order_id}, {support_tier}
  • Bad: {x}, {temp}, {data}

Parameter Selection

Choose presets based on use case:

  • Classification/Extraction → Deterministic
  • General conversation → Balanced
  • Content generation → Creative
  • Brainstorming → Diverse

Cost Optimization

Reduce API costs during testing:

  • Limit max tokens to reasonable values
  • Use smaller models (GPT-3.5 vs GPT-4) for initial iterations
  • Compress images before upload
  • Monitor cost per test in the metrics panel

Test with the cheapest model first (Claude Haiku, GPT-3.5 Turbo). Upgrade to premium models only after validating prompt structure.

Keyboard Shortcuts

Action             Shortcut
Open Playground    Cmd+Shift+P
Run test           Cmd+Enter
Clear response     Cmd+K
Toggle parameters  Cmd+/
Save result        Cmd+S