Evals

What is an Eval?

An eval (short for evaluation set) is a collection of questions, each paired with criteria that define a correct or high-quality answer. We use evals to measure the quality of our system’s responses automatically: running the system on an eval shows how well its answers meet the criteria defined for each question.

All decisions about rolling out new features or models are based on the results of evals.

Supported Criteria

Currently, we support the following criteria for evaluating answers:

  • Target URL: The answer should rely on information from a specific reference document, provided as a URL.
  • Instructions: A short high-level description of the correct answer.

Examples:

  • Question: How do I install SerenityGPT?
    Target URL: https://docs.serenitygpt.com/deployment/overview/

  • Question: What main features does SerenityGPT have?
    Instructions: The answer should mention custom integration and security.

Eval Formats

We support CSV and JSON with the following fields (columns):

  • question (values required)
  • target_url (values optional)
  • instructions (values optional)
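
The required/optional split above can be checked before an eval is submitted. The following is a minimal sketch, not the system's own validator: it assumes a JSON eval is a flat list of objects keyed by the field names above (the exact JSON layout is not specified here), and `validate_records` is a hypothetical helper name.

```python
import json

# Field names come from the list above; everything else here is illustrative.
REQUIRED_FIELDS = {"question"}
OPTIONAL_FIELDS = {"target_url", "instructions"}


def validate_records(records):
    """Return a list of problems found; an empty list means the records look valid."""
    problems = []
    for index, record in enumerate(records):
        unknown = set(record) - REQUIRED_FIELDS - OPTIONAL_FIELDS
        if unknown:
            problems.append(f"record {index}: unknown fields {sorted(unknown)}")
        question = record.get("question")
        if not isinstance(question, str) or not question.strip():
            problems.append(f"record {index}: 'question' is required")
    return problems


# Assumed JSON layout: a list of objects, one per question.
raw = """[
  {"question": "How do I install SerenityGPT?",
   "target_url": "https://docs.serenitygpt.com/deployment/overview/"},
  {"instructions": "The answer should mention custom integration and security."}
]"""
problems = validate_records(json.loads(raw))
print(problems)
```

Here the second record is reported because it has criteria but no question; a record with only a question passes.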

Although CSV and JSON are both supported, YAML is the system’s native format. Files in any other format are automatically converted to YAML.
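
To give a rough idea of what such a conversion involves, here is a sketch that reads CSV rows and groups them into a nested structure of the form `questions`, then a tenant name, then a list of entries. It is an illustration under assumptions, not the actual converter: `csv_to_native` is a hypothetical name, and the tenant is passed in as a parameter because it is not one of the documented CSV columns.

```python
import csv
import io


def csv_to_native(csv_text, tenant):
    """Group CSV rows under one tenant in a questions -> tenant -> entries structure."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = {"question": row["question"].strip()}
        for field in ("target_url", "instructions"):
            value = (row.get(field) or "").strip()
            if value:  # optional fields are left out when the cell is empty
                entry[field] = value
        entries.append(entry)
    return {"questions": {tenant: entries}}


csv_text = (
    "question,target_url,instructions\n"
    "How do I install SerenityGPT?,https://docs.serenitygpt.com/deployment/overview/,\n"
    "What main features does SerenityGPT have?,,The answer should mention custom integration and security.\n"
)
native = csv_to_native(csv_text, tenant="tenant-name")
print(native)
```

Writing the resulting dictionary out as a YAML file would need a YAML library, which is left out of this sketch.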

Here is an example of a YAML file we use for evals.

questions:
  tenant-name:
    - question: How do I install SerenityGPT?
      target_url: https://docs.serenitygpt.com/deployment/overview/
    - question: What is SerenityGPT?
      instructions: The answer should mention that SerenityGPT is a RAG and Agentic AI framework
      target_url:
        - https://docs.serenitygpt.com/
        - https://docs.serenitygpt.com/product/overview/
    - question: What main features does SerenityGPT have?
      instructions: The answer should mention custom integration and security.
      target_url: ^docs.serenitygpt.com/.*

Here is the same configuration in table format:

| Tenant Name | Question | Target URL(s) | Instructions |
|---|---|---|---|
| tenant-name | How do I install SerenityGPT? | https://docs.serenitygpt.com/deployment/overview/ | - |
| tenant-name | What is SerenityGPT? | https://docs.serenitygpt.com/, https://docs.serenitygpt.com/product/overview/ | The answer should mention that SerenityGPT is a RAG and Agentic AI framework |
| tenant-name | What main features does SerenityGPT have? | ^docs.serenitygpt.com/.* | The answer should mention custom integration and security. |
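
The target URL values in the examples take three shapes: a single URL, a list of URLs, and a pattern beginning with `^`. How the system compares an answer's sources against them is not described here, so the following is only a sketch under stated assumptions: a leading `^` is treated as a regular expression matched against the URL with its scheme removed, plain URLs are compared exactly apart from a trailing slash, and a list is satisfied by any one of its entries. The function names are hypothetical.

```python
import re


def _strip_scheme(url):
    """Drop a leading http:// or https:// so a pattern like ^docs.example.com/ can match."""
    return re.sub(r"^https?://", "", url)


def matches_target(cited_url, target_url):
    """Return True if cited_url satisfies target_url (a URL, a list of URLs, or a ^pattern)."""
    targets = target_url if isinstance(target_url, list) else [target_url]
    for target in targets:
        if target.startswith("^"):
            # Assumption: a leading ^ marks a regex tested against the scheme-less URL.
            if re.match(target, _strip_scheme(cited_url)):
                return True
        elif cited_url.rstrip("/") == target.rstrip("/"):
            # Assumption: plain URLs match exactly, ignoring a trailing slash.
            return True
    return False


print(matches_target(
    "https://docs.serenitygpt.com/deployment/overview/",
    "^docs.serenitygpt.com/.*",
))
```

Under these assumptions, any page under docs.serenitygpt.com satisfies the pattern in the third example, while the first two examples are satisfied only by the listed pages.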