# Conversation Completeness
LLM-as-a-judge evaluation that measures how well AI assistants complete conversations by fully satisfying user needs
## Overview
Conversation Completeness is an LLM-as-a-judge evaluation method that assesses whether your AI assistant fully satisfies user needs and intentions across multi-turn interactions. It helps you understand whether the assistant addresses every user request, resolves queries completely, and brings conversations to a satisfactory close.
**Ideal for:** Customer support conversations, multi-turn dialogues, task-oriented assistants, and any scenario where complete issue resolution is critical.
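To make the input concrete, here is a minimal sketch of the kind of multi-turn conversation such a metric consumes. The role/content message format is an assumption for illustration; the exact schema depends on your evaluation framework.

```python
# Assumed input format: a multi-turn conversation as role/content messages.
# The schema is illustrative; adapt it to your evaluation framework.
conversation = [
    {"role": "user", "content": "I was double-charged last month. Can you refund one of the charges?"},
    {"role": "assistant", "content": "I found the duplicate charge and issued a refund; it should post in 3-5 days."},
    {"role": "user", "content": "Thanks. Can you also update my billing email?"},
    {"role": "assistant", "content": "Done. Future invoices will go to the new address."},
]

# User intentions here: (1) refund the duplicate charge, (2) update the billing
# email. Both are satisfied, so a completeness judge should score this 1.0.
```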
## What Gets Evaluated
This evaluation analyzes whether the user's intentions were completely satisfied; the sketch after this list contrasts what is and is not scored:
- ✅ Evaluates: "Were all user requests and questions addressed?"
- ✅ Evaluates: "Did the assistant provide complete responses to user needs?"
- ✅ Evaluates: "Were follow-up actions necessary and completed?"
- ❌ Does NOT evaluate: Response quality or style - only intention satisfaction
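Here is a hypothetical pair of conversations that illustrates the distinction: a terse but complete reply should still score well, while a polished reply that ignores one of two requests should not.

```python
# Hypothetical scope illustration: the judge scores intention satisfaction,
# not tone or polish.
complete_but_terse = [
    {"role": "user", "content": "Reset my password and tell me my plan's renewal date."},
    {"role": "assistant", "content": "Reset link sent. Renewal date: March 3."},
]
# Both intentions addressed -> high completeness despite the curt style.

polished_but_incomplete = [
    {"role": "user", "content": "Reset my password and tell me my plan's renewal date."},
    {"role": "assistant", "content": "Happy to help! I've emailed you a password reset link."},
]
# The renewal-date request was never answered -> 1 of 2 intentions satisfied
# -> completeness score of 0.5.
```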
## Key Features
- **Intention Extraction**: Identifies all user intentions from explicit and implicit requests
- **Satisfaction Analysis**: Assesses whether each intention was fully addressed
- **Completeness Scoring**: Provides the ratio of satisfied intentions to total intentions
- **Multi-turn Support**: Evaluates completeness across entire conversation flows
## How It Works
The evaluation follows a multi-step process (a simplified sketch of these steps follows the list):

1. **User Intention Extraction**: Identifies all user intentions, including explicit requests, implicit goals, and follow-up inquiries
2. **Intention Satisfaction Analysis**: For each intention, assesses whether the assistant acknowledged it, responded completely, and satisfied the user's need
3. **Completeness Calculation**: Computes the final score as

   score = (Satisfied User Intentions) / (Total User Intentions)
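As a rough illustration of these three steps, here is a simplified sketch in Python. The `llm_judge` callable, its prompts, and the yes/no parsing are all assumptions made for exposition; a production implementation would use structured prompts and robust output parsing rather than free-text matching.

```python
from typing import Callable, List


def completeness_score(
    conversation: List[dict],
    llm_judge: Callable[[str], str],  # assumed: takes a prompt, returns judge text
) -> float:
    """Simplified sketch of the three-step scoring process."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in conversation)

    # Step 1: User Intention Extraction -- explicit requests, implicit goals,
    # and follow-up inquiries, one per line of the judge's answer.
    raw = llm_judge(
        "List every user intention in this conversation, one per line:\n" + transcript
    )
    intentions = [line.strip() for line in raw.splitlines() if line.strip()]
    if not intentions:
        return 1.0  # convention: nothing was asked, so nothing went unsatisfied

    # Step 2: Intention Satisfaction Analysis -- one yes/no verdict per intention.
    satisfied = sum(
        llm_judge(
            f"Conversation:\n{transcript}\n\n"
            f"Did the assistant fully satisfy this user intention: {intention!r}? "
            "Answer yes or no."
        ).strip().lower().startswith("yes")
        for intention in intentions
    )

    # Step 3: Completeness Calculation -- ratio of satisfied to total intentions.
    return satisfied / len(intentions)
```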