Conversation Completeness

LLM-as-a-judge evaluation that measures how well AI assistants complete conversations by fully satisfying user needs

Overview

Conversation Completeness is an LLM-as-a-judge evaluation method that assesses how well your AI assistant completes conversations by fully satisfying user needs and intentions throughout multi-turn interactions. This metric helps you understand whether your assistant successfully addresses all user requests, resolves queries completely, and provides satisfactory closure to conversations.

Ideal for: Customer support conversations, multi-turn dialogues, task-oriented assistants, and any scenario where complete issue resolution is critical.

What Gets Evaluated

This evaluation analyzes how completely the assistant satisfies user intentions:

  • ✅ Evaluates: "Were all user requests and questions addressed?"
  • ✅ Evaluates: "Did the assistant provide complete responses to user needs?"
  • ✅ Evaluates: "Were follow-up actions necessary and completed?"
  • ❌ Does NOT evaluate: Response quality or style - only intention satisfaction
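
To make the scope concrete, here is a small hand-written example of judged intentions; the dictionary layout and the verdicts are illustrative assumptions, not output from any specific judge model:

```python
# Illustrative, hand-written verdicts for two user intentions from a short
# support conversation. Only "was the intention addressed?" is judged; the
# tone, length, or formatting of the assistant's replies is ignored.
intentions = [
    {"intention": "Reset the account password", "satisfied": True},
    {"intention": "Explain why the account was locked", "satisfied": False},
]

unsatisfied = [i["intention"] for i in intentions if not i["satisfied"]]
print(unsatisfied)  # ['Explain why the account was locked']
```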

Key Features

  • Intention Extraction: Identifies all user intentions from explicit and implicit requests
  • Satisfaction Analysis: Assesses whether each intention was fully addressed
  • Completeness Scoring: Provides ratio of satisfied intentions to total intentions
  • Multi-turn Support: Evaluates completeness across entire conversation flows
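
As a rough sketch of how the first two features could be driven by a judge model, the prompt templates below show one possible phrasing; the wording, placeholders, and JSON output schema are assumptions rather than the prompts this metric actually uses:

```python
# Hypothetical judge prompt templates -- wording and output schema are
# assumptions, not the exact prompts used by this metric.
EXTRACT_INTENTIONS_PROMPT = """\
Read the conversation below and list every user intention, including explicit
requests, implicit goals, and follow-up inquiries.
Return a JSON array of short intention strings.

Conversation:
{conversation}
"""

CHECK_SATISFACTION_PROMPT = """\
Given the conversation and a single user intention, decide whether the
assistant fully satisfied that intention.
Return JSON of the form {{"satisfied": true}} or {{"satisfied": false}}.

Conversation:
{conversation}

Intention:
{intention}
"""

# Filling in a conversation transcript before sending the prompt to the judge.
transcript = "user: My account is locked.\nassistant: I've reset your password."
print(EXTRACT_INTENTIONS_PROMPT.format(conversation=transcript))
```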

How It Works

The evaluation follows a three-step process:

  1. User Intention Extraction: Identifies all user intentions, including explicit requests, implicit goals, and follow-up inquiries
  2. Intention Satisfaction Analysis: For each intention, assesses whether the assistant acknowledged it, responded to it completely, and satisfied the underlying need
  3. Completeness Calculation: score = (Satisfied User Intentions) / (Total User Intentions)
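
A minimal sketch of the step-3 calculation, assuming the intention verdicts from steps 1 and 2 have already been produced by the judge model (the IntentionVerdict container and the choice to return 1.0 for a conversation with no intentions are assumptions):

```python
from dataclasses import dataclass

@dataclass
class IntentionVerdict:
    intention: str   # user intention extracted in step 1
    satisfied: bool  # judge's step-2 verdict: fully addressed or not

def completeness_score(verdicts: list[IntentionVerdict]) -> float:
    """Step 3: ratio of satisfied user intentions to total user intentions."""
    if not verdicts:
        return 1.0  # assumption: no intentions means nothing was left unsatisfied
    return sum(v.satisfied for v in verdicts) / len(verdicts)

# Example: three intentions extracted from a support conversation,
# two of which the judge marked as satisfied.
verdicts = [
    IntentionVerdict("Change the flight to Friday", satisfied=True),
    IntentionVerdict("Add a checked bag", satisfied=True),
    IntentionVerdict("Confirm the total cost of the changes", satisfied=False),
]
print(completeness_score(verdicts))  # 2 / 3 ≈ 0.667
```

In this example, two of three intentions were satisfied, giving a score of roughly 0.67; a conversation that fully resolves every user intention scores 1.0.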