Performance
Understanding how Reactive Agents automatically optimizes performance using Thompson Sampling and cluster-based specialization
Automatic Optimization
Reactive Agents automatically optimizes performance through Thompson Sampling and K-Means++ clustering. The system learns which configurations perform best and allocates traffic accordingly—no manual tuning required.
How It Works
Thompson Sampling: Tests all configurations initially, then gradually favors better performers using statistical learning.
K-Means++ Clustering: Groups similar requests together so each cluster can discover its own optimal configuration.
Reward-Based Learning: Performance scores from evaluations update each configuration's statistics, steering traffic toward better configurations over time (see the sketch below).
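To make the selection loop concrete, here is a minimal Thompson Sampling sketch in Python. It is illustrative only: the class, configuration IDs, and reward value are assumptions for the example, not Reactive Agents' internals or API.

```python
import random

# Minimal Thompson Sampling sketch (hypothetical names, not the product's API).
# Each candidate configuration keeps a Beta(alpha, beta) posterior over its
# chance of earning a high evaluation score.
class ThompsonSampler:
    def __init__(self, config_ids):
        # Beta(1, 1) is a uniform prior: every configuration starts equal.
        self.stats = {cid: {"alpha": 1.0, "beta": 1.0} for cid in config_ids}

    def select(self):
        # Draw one sample from each posterior and pick the highest draw.
        # Early on the draws are noisy, so every configuration gets traffic
        # (exploration); as evidence accumulates, better configurations win
        # the draw more often (exploitation).
        draws = {cid: random.betavariate(s["alpha"], s["beta"])
                 for cid, s in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, config_id, reward):
        # Reward in [0, 1], e.g. an aggregated evaluation score.
        s = self.stats[config_id]
        s["alpha"] += reward
        s["beta"] += 1.0 - reward

sampler = ThompsonSampler(["config_a", "config_b", "config_c"])
chosen = sampler.select()            # configuration used for this request
sampler.update(chosen, reward=0.8)   # feed the evaluation score back in
```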
Key Factors
Temperature & Reasoning: Low = deterministic/fast, Medium = balanced, High = varied/slower
System Prompts: Multiple prompts tested to find optimal phrasing
Model Selection: The system learns which models work best for each request type; together, these factors define the configuration space sketched below
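The combinations of these factors form the configuration space the sampler chooses from, and K-Means++ decides which cluster's statistics a given request uses. The sketch below illustrates both ideas with invented option values, stand-in request embeddings, and scikit-learn's KMeans; none of this reflects the product's actual internals.

```python
from itertools import product

import numpy as np
from sklearn.cluster import KMeans

# Illustrative option values only; the real choices are configuration-dependent.
temperatures = [0.2, 0.7, 1.0]              # low / medium / high
system_prompts = ["prompt_v1", "prompt_v2"]
models = ["model_small", "model_large"]

# Every combination is a candidate configuration the sampler can select.
candidate_configs = list(product(temperatures, system_prompts, models))

# K-Means++ (here via scikit-learn) groups requests by embedding so each
# cluster keeps its own sampling statistics over candidate_configs.
# request_embeddings stands in for real request vectors.
request_embeddings = np.random.rand(200, 32)
clusterer = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(request_embeddings)

# A new request is routed to its nearest cluster, whose statistics then
# drive configuration selection for that request.
new_request = np.random.rand(1, 32)
assigned_cluster = int(clusterer.predict(new_request)[0])
```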
Evaluation Methods
The evaluation methods you enable determine what gets optimized: their scores become the reward signal that drives configuration selection. Enable the evaluation methods that match your priorities to optimize for balanced quality across the aspects you care about.
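As a sketch of how multiple evaluation scores could be reduced to a single reward for the sampler, the function below takes a weighted average of the enabled methods. The method names and weights are invented for illustration and are not the product's documented evaluation methods.

```python
# Hypothetical reward aggregation; method names and weights are illustrative.
def combine_scores(scores: dict, weights: dict) -> float:
    """Weighted average of the enabled evaluation scores, in [0, 1]."""
    enabled = {name: s for name, s in scores.items() if name in weights}
    total_weight = sum(weights[name] for name in enabled)
    if total_weight == 0:
        return 0.0
    return sum(weights[name] * score for name, score in enabled.items()) / total_weight

# Example: two enabled evaluation methods, weighted equally.
reward = combine_scores(
    scores={"relevance": 0.9, "conciseness": 0.7},
    weights={"relevance": 0.5, "conciseness": 0.5},
)  # 0.8, which would then be passed to the sampler's update step
```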
Convergence Timeline
- 0-50 requests: Exploration phase (all configurations receive roughly even traffic)
- 50-200 requests: Convergence phase (traffic shifts toward better-performing configurations)
- 200+ requests: Stable phase (the best configurations handle most traffic)
Recommendation: Start with 2-3 clusters for most use cases
Best Practices
Setup:
- Start conservative (2-3 clusters, 2 models, 2 prompts); see the illustrative setup after this list
- Enable evaluations matching your priorities
- Allow 100+ requests before judging performance
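Purely as an illustration of a conservative starting point, the field names below are invented and do not reflect Reactive Agents' actual configuration schema.

```python
# Hypothetical starting setup; field names are invented, not the real schema.
starting_setup = {
    "clusters": 3,                                  # 2-3 clusters for most use cases
    "models": ["model_small", "model_large"],       # two candidate models
    "system_prompts": ["prompt_v1", "prompt_v2"],   # two prompt variants
    "evaluations": ["relevance", "conciseness"],    # match your priorities
    "min_requests_before_review": 100,              # judge after convergence begins
}
```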
Expected Results:
- Simple tasks: Converge in 50-100 requests
- Complex tasks: Converge in 100-200 requests
- Typical improvement: 10-30% after convergence