Partitions
Learn how the Reactive Agents partition system automatically optimizes your skills using Thompson Sampling and K-Means++ clustering.
Core Concepts: Partitions
What is a Partition?
A partition in Reactive Agents is a specific configuration of hyperparameters (temperature, reasoning level, model, system prompt) optimized for a particular type of user request. The system uses Thompson Sampling (a multi-armed bandit algorithm) and K-Means++ clustering to automatically discover and select optimal configurations.
How It Works
When you create a skill, Reactive Agents automatically:
- Creates Multiple Partitions: Generates combinations of models, prompts, and hyperparameters
- Groups Requests: Clusters similar requests using semantic similarity
- Tests Simultaneously: Routes requests across partitions to measure performance
- Learns Over Time: Thompson Sampling selects better-performing partitions
- Adapts: Re-clusters periodically to discover new request types
Thompson Sampling
Balances exploration (trying new configurations) with exploitation (using proven ones) by sampling from Beta distributions based on observed performance.
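The Beta-sampling loop can be sketched as follows. This is a minimal illustration, not the Reactive Agents API: partition names, the `choose`/`update` helpers, and the simulated success rates are all hypothetical.

```python
import random

random.seed(0)  # for a reproducible illustration

# Each partition keeps a Beta posterior over its success rate,
# stored as [alpha, beta] counts starting from a uniform Beta(1, 1) prior.
stats = {"partition_a": [1.0, 1.0], "partition_b": [1.0, 1.0]}

def choose(stats):
    # Sample each partition's Beta distribution and pick the highest draw.
    # Uncertain partitions produce spread-out samples (exploration);
    # proven partitions produce consistently high samples (exploitation).
    return max(stats, key=lambda p: random.betavariate(*stats[p]))

def update(stats, partition, reward):
    # A reward in [0, 1] adds success mass to alpha and failure mass to beta.
    stats[partition][0] += reward
    stats[partition][1] += 1 - reward

# Simulate: partition_a truly succeeds 80% of the time, partition_b 30%.
true_rate = {"partition_a": 0.8, "partition_b": 0.3}
for _ in range(500):
    p = choose(stats)
    update(stats, p, 1.0 if random.random() < true_rate[p] else 0.0)
```

After a few hundred simulated requests, the better partition has accumulated far more trials, which is the "better-performing partitions receive more traffic" behavior described above.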
K-Means++ Clustering
Groups requests by semantic similarity, allowing specialized partitions for different request types. The system re-clusters periodically to adapt to changing request patterns.
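The seeding step of K-Means++ can be sketched in a few lines. The toy 2-D points below stand in for real request embeddings; the function name and data are illustrative, not part of the Reactive Agents API.

```python
import random

def kmeans_pp_seeds(points, k, rng=random.Random(42)):
    """Pick k initial cluster centers, K-Means++ style: the first
    uniformly at random, each subsequent one with probability
    proportional to its squared distance from the nearest center
    already chosen (spreading centers across the data)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points]
        # Weighted draw proportional to d2.
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for point, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(point)
                break
    return centers

# Two obvious clusters of "requests" in embedding space.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers = kmeans_pp_seeds(points, k=2)
```

Because later centers are drawn proportionally to squared distance, the two seeds almost always land in different clusters, which is why K-Means++ converges faster than uniform random seeding.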
Configuration Space
The system explores 9 base partitions covering different temperature and reasoning level combinations:
| Temperature Range | Reasoning Levels |
|---|---|
| 0 - 0.33 | 0, 0.5, 1 |
| 0.34 - 0.66 | 0, 0.5, 1 |
| 0.67 - 1 | 0, 0.5, 1 |
Each base partition is combined with different models and system prompts.
Total Partitions Formula:
Total = configuration_count × allowed_models × system_prompt_count × 9
Example:
- 3 clusters × 2 models × 2 prompts × 9 base = 108 partitions
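The same count can be reproduced directly from the table and formula above (variable names here are illustrative):

```python
from itertools import product

# The 9 base partitions: 3 temperature ranges x 3 reasoning levels.
temperature_ranges = [(0.0, 0.33), (0.34, 0.66), (0.67, 1.0)]
reasoning_levels = [0, 0.5, 1]
base_partitions = list(product(temperature_ranges, reasoning_levels))

# The multipliers from the example above.
configuration_count = 3   # request clusters
allowed_models = 2        # e.g. two different models
system_prompt_count = 2   # prompt variations

total = (configuration_count * allowed_models
         * system_prompt_count * len(base_partitions))
print(total)  # 108
```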
Configuration Parameters
- Configuration Count: Number of request clusters (default: 2-3)
- Allowed Models: AI models to test (e.g., GPT-4, Claude)
- System Prompt Count: Number of prompt variations to generate
- Evaluation Methods: How success is measured (accuracy, latency, etc.)
- Clustering Interval: How often to re-analyze request patterns
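Taken together, the parameters above might look like the following configuration. The field names and values are hypothetical, chosen only to mirror the list, and are not the Reactive Agents schema.

```python
# Hypothetical skill configuration mirroring the parameters listed above.
skill_config = {
    "configuration_count": 3,                # number of request clusters
    "allowed_models": ["gpt-4", "claude"],   # models to test against each other
    "system_prompt_count": 2,                # prompt variations to generate
    "evaluation_methods": ["accuracy", "latency", "cost"],  # reward signals
    "clustering_interval": 1000,             # re-cluster every N requests
}
```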
Evaluation and Rewards
Enabled evaluation methods determine what the system optimizes for. Multiple methods average their scores to create a reward signal that updates Thompson Sampling statistics.
Example: if accuracy=0.90, latency=0.75, and cost=0.60, the reward is their average, 0.75, so the system optimizes for all three objectives at once.
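The averaging in the example reduces to a one-liner (score names and values taken from the example above):

```python
# Each enabled evaluation method contributes a score in [0, 1];
# their mean becomes the reward that updates Thompson Sampling statistics.
scores = {"accuracy": 0.90, "latency": 0.75, "cost": 0.60}
reward = sum(scores.values()) / len(scores)
print(round(reward, 2))  # 0.75
```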
Best Practices
- Detailed Skill Descriptions: Help generate better system prompts
- Start Conservative: Begin with 2-3 clusters unless you have diverse request types
- Allow Learning Time: Run 100+ requests before evaluating performance
- Monitor Progress: Watch better partitions receive more traffic over time
- Convergence: Most systems converge within 50-200 requests per cluster
Benefits
- Automatic Optimization: No manual hyperparameter tuning required
- Context-Aware: Different partitions for different request types
- Self-Improving: Performance improves continuously with more data
- Balanced Learning: Explores new options while exploiting known good ones