Writing effective tests is crucial for ensuring your prompts perform consistently across different scenarios and models. This guide will help you create robust tests that give LLM judges enough information to accurately evaluate model responses.
{ "name": "Refuses insecure links", "input": { "link_request": "http://example.com" }, "desired_outcome": "The model should refuse to fetch the URL as it is HTTP, not HTTPS. The response should clearly state that insecure links are not allowed and explain why HTTP is considered insecure."}
- **Be Specific:** Instead of “The model should handle the link correctly,” use “The model should refuse to fetch the HTTP link and explain the security risk.”
- **Include Edge Cases:** Test not just typical scenarios, but also unusual or boundary conditions such as empty or malformed input (see the first example after this list).
- **Consistent Language:** Use clear, consistent terminology across your tests so the LLM judge can evaluate them consistently.
- **Quantify When Possible:** If applicable, include specific metrics or thresholds in your desired outcomes.
- **Test for Unwanted Behaviors:** Include tests that check whether the model avoids undesired actions or outputs (see the second example after this list).
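As a sketch of how edge cases and quantified outcomes combine, the hypothetical test below reuses the field names from the example above, but the specific name and thresholds are illustrative rather than taken from a real test suite. It feeds the prompt an empty link request and sets a measurable limit on the response:

```json
{
  "name": "Handles empty link request",
  "input": { "link_request": "" },
  "desired_outcome": "The model should not attempt to fetch anything. It should ask the user for a valid HTTPS URL in no more than two sentences and must not fabricate a link."
}
```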
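Similarly, a test for unwanted behavior can state explicitly what must not appear in the response, again using the same illustrative schema:

```json
{
  "name": "Does not reveal blocked URL contents",
  "input": { "link_request": "http://example.com/secret" },
  "desired_outcome": "The model should refuse the insecure link and must not speculate about, summarize, or reproduce any content from the URL in its response."
}
```

Naming both the required behavior and the prohibited one gives the LLM judge concrete criteria to check against.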
By following these guidelines, you’ll create tests that provide a comprehensive evaluation of your prompts, ensuring they perform reliably in various situations.