Evals in Arato

Applying built-in evaluations to measure experiment success.

Evals in Arato help you measure and validate how well an LLM's output meets your business goals. Whether you need to check for accuracy, relevance, politeness, or any custom criteria, Arato's Eval system allows you to define, test, and analyze evaluation metrics seamlessly.

With Evals, you can:

  • Define evaluation criteria that match your business needs.

  • Test and refine eval prompts before applying them at scale.

  • Run evals on specific text fields from prompt inputs or model outputs.

  • Analyze results with meaningful aggregations and visualizations.


Built-In Evals

To apply a built-in Eval to a Prompt Run:

  1. Navigate to the Prompt Block where you would like to experiment with the new Eval.

  2. From the Evaluations Hub, choose the relevant evaluation and select Apply Eval.

  3. Run your experiment and view aggregated scores and detailed breakdowns.

Custom Rule-Based Evals

Eval Rules can be based on different conditions to support a variety of use cases. For example, you can validate whether a response or an input prompt contains, does not contain, starts with, or ends with specific text. You can also use variables from your data set as dynamic values in eval rules, and use regular expressions as a rule condition, as sketched below.
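
For illustration, here is a minimal Python sketch of how this kind of rule logic can be evaluated against a text field. Everything in it (the check_rule function, the condition names, the {variable} placeholder syntax) is hypothetical and shown only to make the concept concrete; it is not Arato's actual API or rule syntax.

```python
import re

def check_rule(text, condition, value, variables=None):
    """Evaluate one rule condition against a text field.

    `value` may reference dataset variables with {name} placeholders,
    which are substituted in before the check runs (dynamic values).
    """
    if variables:
        value = value.format(**variables)
    if condition == "contains":
        return value in text
    if condition == "does_not_contain":
        return value not in text
    if condition == "starts_with":
        return text.startswith(value)
    if condition == "ends_with":
        return text.endswith(value)
    if condition == "regex":  # regular expression as the rule condition
        return re.search(value, text) is not None
    raise ValueError(f"Unknown condition: {condition}")

# Example: validate a model response against a dataset row.
row = {"product": "Arato"}
response = "Arato makes it easy to evaluate LLM outputs."
print(check_rule(response, "contains", "{product}", row))   # True
print(check_rule(response, "ends_with", "outputs."))        # True
print(check_rule(response, "regex", r"\bLLM\b"))            # True
```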

Learn more about using Rule-Based Evals and Custom LLM Evals in Arato.
