Prompt Studio includes built-in tools for testing your prompts against multiple models at once — helping you find the optimal model for your use case before committing to it.
## Single Model Testing
The simplest way to test is to click **Run** in the editor toolbar. This sends your prompt to the currently selected model and displays the output in the right panel.
## Multi‑Model Testing
For more rigorous testing, use the Multi-Model Test feature:
1. **Open Testing:** Click the **Test** icon in the Prompt Studio sidebar.
2. **Select Models:** Choose two or more models to test against (e.g., GPT-4o, Claude 3.5 Sonnet, Gemini Pro).
3. **Run:** Click **Run All** to dispatch the same prompt to all selected models simultaneously.
4. **Compare:** Review outputs side by side, with performance metrics for each model.
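The fan-out pattern behind Run All is easy to sketch in code. The example below dispatches one prompt to several models concurrently and records each response's latency; `call_model` is a hypothetical stand-in for your provider SDK calls, not a Traceport API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real provider SDK call."""
    return f"[{model}] response to: {prompt}"

def run_all(models: list[str], prompt: str) -> dict[str, dict]:
    """Send the same prompt to every model at once, timing each call."""
    def timed_call(model: str) -> tuple[str, dict]:
        start = time.perf_counter()
        output = call_model(model, prompt)
        latency_ms = (time.perf_counter() - start) * 1000
        return model, {"output": output, "latency_ms": latency_ms}

    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return dict(pool.map(timed_call, models))

results = run_all(["gpt-4o", "claude-3-5-sonnet", "gemini-pro"], "Summarize this ticket.")
for model, r in results.items():
    print(f"{model}: {r['latency_ms']:.1f} ms")
```

Running the calls in a thread pool rather than a loop means total wall time is roughly the slowest model's latency, not the sum of all latencies.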
## Comparison Metrics
For each model response, Traceport displays:
| Metric | Description |
|---|---|
| Output | The full response content from each model |
| Latency | Response time in milliseconds |
| Tokens | Input and output token counts |
| Cost | Estimated cost based on the provider’s pricing |
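The cost column can be reproduced from the token counts alone. The sketch below estimates cost from input/output tokens against a per-million-token price table; the prices shown are placeholders for illustration, not real provider rates.

```python
# Hypothetical per-million-token prices in USD; real rates vary by provider.
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost = (tokens / 1M) * price-per-million, summed over input and output."""
    price = PRICING[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# 1,000 input + 500 output tokens on the placeholder gpt-4o rates:
# (1000 * 2.50 + 500 * 10.00) / 1e6 = 0.0075
print(f"${estimate_cost('gpt-4o', 1000, 500):.4f}")
```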
Multi-model comparison is also available in the Playground, which offers a dedicated side‑by‑side comparison interface.
## Batch Testing with Datasets
For systematic testing across many inputs, use Datasets:
1. **Create a Dataset:** Open the **Datasets** section in the Prompt Studio sidebar, then add test cases with variable values and expected outputs.
2. **Run Batch:** Execute the prompt across all test cases in the dataset.
3. **Review Results:** Inspect each test case's input, expected output, and actual output in a table view.
Datasets are especially useful for regression testing — run them after every prompt change to ensure you haven’t introduced quality regressions.
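The regression loop can be sketched outside the UI as well. In the example below, each test case supplies variable values and an expected output; `render` fills a `{placeholder}`-style template, and `call_model` is a hypothetical stand-in for the real model call.

```python
def render(template: str, variables: dict[str, str]) -> str:
    """Fill {placeholders} in the prompt template with variable values."""
    return template.format(**variables)

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (here: uppercases the prompt)."""
    return prompt.upper()

def run_dataset(template: str, cases: list[dict]) -> list[dict]:
    """Run every test case and record whether the actual output matches expected."""
    results = []
    for case in cases:
        actual = call_model(render(template, case["variables"]))
        results.append({
            "input": case["variables"],
            "expected": case["expected"],
            "actual": actual,
            "passed": actual == case["expected"],
        })
    return results

cases = [
    {"variables": {"word": "hello"}, "expected": "SHOUT: HELLO"},
    {"variables": {"word": "bye"}, "expected": "SHOUT: BYE"},
]
results = run_dataset("shout: {word}", cases)
print(f"{sum(r['passed'] for r in results)}/{len(results)} passed")
```

Running this after every prompt change gives a quick pass/fail signal, mirroring the table view in the Datasets UI.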