# Evaluations

Comprehensive evaluation system for assessing AI agent performance.
## Overview
Reactive Agents provides a comprehensive evaluation system for measuring your AI agent's performance. Choose from the evaluation methods and individual evaluation types below to understand how well your agents complete tasks, maintain quality, and meet your requirements.
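To make the workflow concrete, the sketch below shows one way a batch of evaluators could run over a recorded agent interaction. It is a minimal illustration only: the `Interaction`, `Evaluator`, and `run_evaluations` names are assumptions made for this example, not the actual Reactive Agents API.

```python
# Hypothetical sketch of an evaluation run; every name here is an
# illustrative placeholder, not the Reactive Agents API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Interaction:
    task: str                   # what the user asked the agent to do
    transcript: list[str]       # recorded user/agent turns, in order
    tool_calls: list[dict] = field(default_factory=list)  # tools the agent invoked

# An evaluator is anything that maps an interaction to a score in [0, 1].
Evaluator = Callable[[Interaction], float]

def run_evaluations(interaction: Interaction,
                    evaluators: dict[str, Evaluator]) -> dict[str, float]:
    """Apply every evaluator to one interaction and collect named scores."""
    return {name: evaluate(interaction) for name, evaluate in evaluators.items()}
```

Each evaluation type below fits this shape: it consumes a recorded interaction (or full conversation) and produces a score or verdict.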
### Task Completion

LLM-as-a-judge evaluation that assesses whether your agent successfully completed a given task.
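As a rough illustration of the LLM-as-a-judge pattern this metric relies on, the sketch below asks a judge model for a YES/NO verdict. The prompt wording and the `judge` callable are assumptions for the example; in practice the judge would wrap whatever LLM client you use.

```python
# Minimal LLM-as-a-judge sketch for task completion. `judge` stands in for
# a call to a judge model; the prompt wording is illustrative only.
from typing import Callable

JUDGE_PROMPT = """You are grading an AI agent.
Task: {task}
Agent transcript:
{transcript}
Did the agent successfully complete the task? Answer YES or NO."""

def task_completed(task: str, transcript: str,
                   judge: Callable[[str], str]) -> bool:
    """Ask the judge model for a verdict and parse it as a boolean."""
    verdict = judge(JUDGE_PROMPT.format(task=task, transcript=transcript))
    return verdict.strip().upper().startswith("YES")
```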
### Argument Correctness

Validates whether your agent passes correct and appropriate arguments to its tool calls.
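A deterministic sketch of what such a check might look like is shown below; validating against a simple expected-types spec is an assumption made for the example (a real implementation might use JSON Schema or an LLM judge instead).

```python
# Illustrative argument check: validate a recorded tool call's arguments
# against an expected-types spec. The spec format is an assumption.
def arguments_correct(call: dict, spec: dict[str, type]) -> bool:
    """True if the call supplies exactly the expected parameters,
    each with the expected Python type."""
    args = call.get("arguments", {})
    return (set(args) == set(spec)
            and all(isinstance(args[name], t) for name, t in spec.items()))

# Example: a hypothetical weather tool expecting a string city and an int horizon.
call = {"name": "get_weather", "arguments": {"city": "Oslo", "days": 3}}
assert arguments_correct(call, {"city": str, "days": int})
```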
### Role Adherence

Measures how well your agent adheres to its defined role, constraints, and behavioral guidelines.
### Tool Correctness

Validates whether your agent calls the expected tools with correct parameters and outputs.
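For brevity, the sketch below checks tool names only; extending it to compare parameters and outputs follows the same pattern. Order-insensitive matching is an assumption made for the example.

```python
# Illustrative tool-correctness score: what fraction of the expected tool
# invocations did the agent actually make? Matching by name only and
# ignoring call order are simplifying assumptions.
from collections import Counter

def tool_correctness(actual_calls: list[dict],
                     expected_tools: list[str]) -> float:
    """Fraction of expected tool invocations matched by actual calls."""
    actual = Counter(call["name"] for call in actual_calls)
    expected = Counter(expected_tools)
    matched = sum(min(actual[tool], count) for tool, count in expected.items())
    return matched / max(sum(expected.values()), 1)
```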
### Knowledge Retention

Assesses how well your agent retains and recalls information across multi-turn conversations.
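One way to probe retention is to quiz the agent late in a conversation about facts stated earlier. The sketch below does this with a deliberately crude substring match; `ask_agent` and the probe format are assumptions for the example.

```python
# Illustrative retention probe: ask about facts from earlier turns and
# score the fraction the agent recalls. Substring matching is a crude
# stand-in for real answer grading.
from typing import Callable

def knowledge_retention(probes: list[tuple[str, str]],
                        ask_agent: Callable[[str], str]) -> float:
    """probes: (question, expected_answer) pairs drawn from earlier turns."""
    if not probes:
        return 1.0
    hits = sum(expected.lower() in ask_agent(question).lower()
               for question, expected in probes)
    return hits / len(probes)
```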
### Conversation Completeness

Measures how well your agent completes conversations by fully satisfying user needs and intentions.
### Turn Relevancy

Evaluates whether conversation turns are relevant to the preceding context and maintain conversation flow.
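Because each turn is judged against the context that precedes it, this metric naturally decomposes into per-turn verdicts. The sketch below averages YES/NO judgments from a judge model; the prompt and the `judge` callable are assumptions, as in the earlier examples.

```python
# Illustrative per-turn relevancy: judge each turn against the conversation
# so far, then average the verdicts. Prompt wording is an assumption.
from typing import Callable

RELEVANCY_PROMPT = """Conversation so far:
{context}

Next turn:
{turn}

Is this turn relevant to the conversation above? Answer YES or NO."""

def turn_relevancy(turns: list[str], judge: Callable[[str], str]) -> float:
    """Mean relevancy over all turns that have preceding context."""
    verdicts = []
    for i in range(1, len(turns)):
        context = "\n".join(turns[:i])
        reply = judge(RELEVANCY_PROMPT.format(context=context, turn=turns[i]))
        verdicts.append(reply.strip().upper().startswith("YES"))
    return sum(verdicts) / len(verdicts) if verdicts else 1.0
```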