Factor VII: Smart Routing
Route work to best-fit workflows/agents with measured accuracy
| Aspect | Details |
|---|---|
| Primary Pillar | Learning Science |
| Supporting Pillar | DevOps + SRE |
| Enforces Laws | Law 1 (Extract Learnings), Law 2 (Improve System) |
| Derived From | Pattern recognition + Load balancing + Service mesh routing |
Summary
Not all tasks are the same. Intelligent routing analyzes incoming requests and directs them to the optimal workflow, agent, or skill based on task characteristics, historical patterns, and resource constraints. Measured routing accuracy enables continuous improvement.
The Problem
Without intelligent routing:
- Users guess which workflow to use
- Wrong workflow selection wastes time
- Simple tasks routed to complex workflows (overkill)
- Complex tasks routed to simple workflows (failure)
- No learning from routing mistakes
Familiar pattern:
User: "Create a Kubernetes app"
System: "Use /create-app or /complex-workflow or /research-first?"
User: *Guesses wrong*
Result: 30 minutes wasted on wrong approach
Traditional approach: Manual workflow selection, tribal knowledge
12-Factor AgentOps approach: Intelligent router learns patterns, suggests best workflow
Why This Factor Exists
Grounding in the Five Pillars
Primary: Learning Science
Intelligent routing applies pattern recognition from Learning Science: experts recognize patterns instantly while novices analyze step-by-step. A routing system learns which task characteristics predict success with which workflow, building expertise over time. The learning curve shows this: Month 1 (75% accuracy, cold start), Month 3 (91% accuracy, pattern recognition established).
Human learning research shows that categorization skills develop through repeated exposure and feedback. For routing, each task is a training example: task characteristics + chosen workflow + outcome. Success reinforces the pattern, failure updates the model. After 110 tasks, the router achieves 90.9% accuracy—expert-level pattern recognition.
Supporting: DevOps + SRE
DevOps provides the service mesh routing pattern: route requests based on content, not just destination. Istio and Kubernetes route traffic intelligently based on headers, load, and health. For agents, routing analyzes task content (keywords, complexity, risk) to select the optimal workflow. Cost optimization follows: simple tasks go to quick workflows (30 seconds, $0.01), complex tasks to research workflows (3 hours, $2.00). Routing a simple task to a quick workflow instead of a heavyweight one is 60x faster and 200x cheaper (see the cost example below).
What This Factor Enforces
Law 1: Extract Learnings
Routing enforces learning extraction through feedback loops. Every routing decision generates data: task → workflow → outcome. Analysis reveals patterns: "Create Kubernetes app" succeeds 93% of the time with applications-create-app but only 40% with a generic workflow. This pattern extraction feeds the routing model, improving future decisions.
Concrete example: Month 1 data shows simple tasks routed to complex workflows waste time. Learning extracted: "Simple tasks (no research needed) should route to quick-edit workflow." Router updated with this pattern. Month 2 shows 97% accuracy for simple tasks. The learning compounded into improved routing.
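A minimal sketch of this feedback loop, assuming a simple in-memory log (the `RoutingLog` class and its field names are illustrative, not taken from any existing codebase):
```python
from collections import defaultdict

class RoutingLog:
    """Records task → workflow → outcome triples and surfaces success rates."""

    def __init__(self):
        # (task_pattern, workflow) → [successes, total]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, task_pattern, workflow, succeeded):
        entry = self.counts[(task_pattern, workflow)]
        entry[0] += 1 if succeeded else 0
        entry[1] += 1

    def success_rate(self, task_pattern, workflow):
        successes, total = self.counts[(task_pattern, workflow)]
        return successes / total if total else None

log = RoutingLog()
log.record("create kubernetes app", "applications-create-app", succeeded=True)
log.record("create kubernetes app", "generic-workflow", succeeded=False)
print(log.success_rate("create kubernetes app", "applications-create-app"))  # 1.0
```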
Law 2: Improve System
Routing continuously improves the system by optimizing workflow selection. Poor routes waste time (wrong workflow) and money (expensive workflow for cheap task). Intelligent routing learns from failures, improving accuracy from 75% (Month 1) to 91% (Month 3). Each percentage point improvement reduces wasted effort across all future tasks.
Concrete example: Routing accuracy dashboard shows 10 misrouted tasks per week costing 30 minutes each (5 hours wasted). Routing improvements reduce misroutes to 1 per week (30 minutes wasted). System improvement: 4.5 hours/week saved = 234 hours/year. The router's learning directly improves operational efficiency.
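As a quick sanity check, the arithmetic from that example (numbers taken directly from the paragraph above):
```python
wasted_before = 10 * 30 / 60     # 10 misroutes/week × 30 min = 5.0 hours/week
wasted_after = 1 * 30 / 60       # 1 misroute/week × 30 min = 0.5 hours/week
saved_per_week = wasted_before - wasted_after   # 4.5 hours/week
saved_per_year = saved_per_week * 52            # 234 hours/year
```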
The Principle
Task Classification
Dimensions for routing decisions (a typed sketch follows this list):
- Complexity:
  - Simple: Known pattern, clear solution
  - Medium: Some research needed
  - Complex: Research → Plan → Implement required
- Novelty:
  - Familiar: Similar task completed before
  - New: First time seeing this type
  - Novel: Requires innovation
- Risk:
  - Low: Reversible, non-production
  - Medium: Production, but tested
  - High: Production, untested, critical
- Scope:
  - Single file modification
  - Multiple files, one service
  - Multiple services, system-wide
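A minimal sketch of how these dimensions might be carried through the router as a typed structure (class and enum names are illustrative):
```python
from dataclasses import dataclass
from enum import Enum

class Complexity(Enum):
    SIMPLE = "simple"    # known pattern, clear solution
    MEDIUM = "medium"    # some research needed
    COMPLEX = "complex"  # research → plan → implement required

class Novelty(Enum):
    FAMILIAR = "familiar"
    NEW = "new"
    NOVEL = "novel"

class Risk(Enum):
    LOW = "low"        # reversible, non-production
    MEDIUM = "medium"  # production, but tested
    HIGH = "high"      # production, untested, critical

class Scope(Enum):
    SINGLE_FILE = "single_file"
    SINGLE_SERVICE = "single_service"
    SYSTEM_WIDE = "system_wide"

@dataclass
class TaskClassification:
    complexity: Complexity
    novelty: Novelty
    risk: Risk
    scope: Scope
```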
Routing Decision Tree
Task arrives
├─ Is this a known pattern?
│ ├─ Yes → Route to specialized workflow
│ └─ No → Analyze characteristics
│ ├─ Simple + Low Risk → Quick workflow
│ ├─ Complex + High Risk → Research → Plan → Implement
│ └─ Unknown → Interactive router (ask user)
│
└─ Historical accuracy for this route?
├─ >90% → Auto-route
├─ 70-90% → Suggest with confidence score
└─ <70% → Present options, let user choose
The Router Pattern
Input: Task description
Process: Analyze → Classify → Match → Route
Output: Best-fit workflow + confidence score
class IntelligentRouter:
def route(self, task_description):
# 1. Extract features
features = self.extract_features(task_description)
# 2. Classify
complexity = self.classify_complexity(features)
risk = self.classify_risk(features)
scope = self.classify_scope(features)
# 3. Match to workflows
candidates = self.find_matching_workflows(complexity, risk, scope)
# 4. Rank by historical accuracy
ranked = self.rank_by_accuracy(candidates, features)
# 5. Return best match with confidence
best_match = ranked[0]
confidence = best_match.historical_accuracy
if confidence > 0.90:
return best_match, "auto"
elif confidence > 0.70:
return best_match, "suggested"
else:
return ranked[:3], "choose"
Why This Works
1. Pattern Recognition (Learning Science)
Human learning principle:
"Experts recognize patterns instantly; novices analyze step-by-step"
For AI routing:
Novice: Reads full task → Analyzes → Uncertain decision
Expert: Sees "Kubernetes StatefulSet" → Instantly routes to k8s-stateful workflow
Result: Pattern recognition enables instant, accurate routing
2. Service Mesh Routing (DevOps/SRE)
Kubernetes/Istio pattern:
"Route requests based on content, not just destination"
For AI agents:
Traditional: All tasks → Same agent
Intelligent: Tasks analyzed → Routed to specialized agents
Example:
- "Fix typo" → quick-edit agent (30 seconds)
- "Refactor architecture" → research-plan-implement (3 hours)
3. Feedback Loops Enable Accuracy
Continuous improvement:
Route task → Execute → Measure success → Update routing model
Example:
Week 1: "Create K8s app" → Simple workflow → Failed → Update model
Week 2: "Create K8s app" → Complex workflow → Success → Reinforce
Week 10: "Create K8s app" → Complex workflow (95% accuracy)
4. Cost Optimization
Problem: Using expensive workflows for cheap tasks
Solution: Route by cost/benefit
Task: Fix typo in README
- Complex workflow: 30 minutes, $2.00 in compute
- Quick edit: 30 seconds, $0.01 in compute
Savings: 60x faster, 200x cheaper
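The quoted ratios fall straight out of those numbers:
```python
complex_seconds, quick_seconds = 30 * 60, 30    # 30 minutes vs. 30 seconds
complex_cost_cents, quick_cost_cents = 200, 1   # $2.00 vs. $0.01, in cents

print(complex_seconds // quick_seconds)         # 60  → 60x faster
print(complex_cost_cents // quick_cost_cents)   # 200 → 200x cheaper
```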
Implementation
Feature Extraction
Keywords matter:
class FeatureExtractor:
    def extract_keywords(self, task_description):
        # Minimal stand-in: lowercase tokens; swap in a real keyword extractor as needed
        return task_description.lower().split()

    def extract_features(self, task_description):
features = {
'keywords': self.extract_keywords(task_description),
'complexity_signals': [],
'risk_signals': [],
'scope_signals': []
}
# Complexity signals
if any(kw in task_description.lower() for kw in ['research', 'investigate', 'explore']):
features['complexity_signals'].append('research_needed')
if any(kw in task_description.lower() for kw in ['architecture', 'design', 'plan']):
features['complexity_signals'].append('planning_needed')
# Risk signals
if any(kw in task_description.lower() for kw in ['production', 'prod', 'live']):
features['risk_signals'].append('production')
if any(kw in task_description.lower() for kw in ['critical', 'urgent', 'emergency']):
features['risk_signals'].append('high_priority')
# Scope signals
if any(kw in task_description.lower() for kw in ['system-wide', 'all', 'entire']):
features['scope_signals'].append('broad_scope')
return features
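For example, running the extractor above on a hypothetical task description:
```python
extractor = FeatureExtractor()
features = extractor.extract_features(
    "Investigate migrating the production payment service to microservices"
)
# 'investigate' → complexity_signals: ['research_needed']
# 'production'  → risk_signals: ['production']
# no scope keywords → scope_signals: []
```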
Workflow Matching
Rule-based routing:
class WorkflowMatcher:
def match(self, features):
workflows = []
# Simple, focused tasks
if not features['complexity_signals'] and not features['risk_signals']:
workflows.append({
'workflow': 'quick-edit',
'reason': 'Simple task, low risk'
})
# Research-heavy tasks
if 'research_needed' in features['complexity_signals']:
workflows.append({
'workflow': 'research-plan-implement',
'reason': 'Research required before implementation'
})
# High-risk production tasks
if 'production' in features['risk_signals']:
workflows.append({
'workflow': 'validated-deployment',
'reason': 'Production environment requires validation gates'
})
return workflows
Machine Learning Router (Advanced)
Train on historical data:
from sklearn.ensemble import RandomForestClassifier
class MLRouter:
def __init__(self):
self.model = RandomForestClassifier()
self.trained = False
def train(self, historical_tasks):
# Extract features from past tasks
X = [self.vectorize(task.description) for task in historical_tasks]
# Labels: which workflow was successful
y = [task.successful_workflow for task in historical_tasks]
# Train model
self.model.fit(X, y)
self.trained = True
def predict(self, task_description):
if not self.trained:
raise ValueError("Router not trained yet")
# Vectorize new task
features = self.vectorize(task_description)
# Predict best workflow
workflow = self.model.predict([features])[0]
# Get confidence scores
probabilities = self.model.predict_proba([features])[0]
confidence = max(probabilities)
return {
'workflow': workflow,
'confidence': confidence,
'alternatives': self.get_top_n_workflows(probabilities, n=3)
}
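The `vectorize()` helper is left undefined in the sketch above. One plausible implementation, assuming scikit-learn's `TfidfVectorizer` (the class name here is illustrative):
```python
from sklearn.feature_extraction.text import TfidfVectorizer

class TfidfTaskVectorizer:
    """Turns task descriptions into fixed-length feature vectors."""

    def __init__(self, max_features=512):
        self.tfidf = TfidfVectorizer(max_features=max_features)

    def fit(self, descriptions):
        # Fit the vocabulary on historical task descriptions before training the router
        self.tfidf.fit(descriptions)
        return self

    def vectorize(self, description):
        # Dense 1-D feature vector for a single task description
        return self.tfidf.transform([description]).toarray()[0]
```
An `MLRouter` could delegate its `vectorize()` calls to an instance of this class fitted on the historical task corpus.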
Interactive Routing
When confidence is low, ask:
class InteractiveRouter:
def route_with_interaction(self, task_description):
# Try automatic routing first
result = self.router.predict(task_description)
if result['confidence'] > 0.90:
# High confidence, auto-route
return result['workflow']
elif result['confidence'] > 0.70:
# Medium confidence, suggest with option to override
print(f"Suggested workflow: {result['workflow']} (confidence: {result['confidence']:.0%})")
print(f"Alternatives: {result['alternatives']}")
user_choice = input("Accept suggestion? (y/n): ")
if user_choice.lower() == 'y':
return result['workflow']
else:
return self.present_choices(result['alternatives'])
else:
# Low confidence, present choices
print("I'm not sure which workflow is best. Here are the options:")
return self.present_choices(result['alternatives'])
Validation
✅ You're doing this right if:
- Routing accuracy measured (target: >90%)
- Users rarely override router suggestions
- Simple tasks route to simple workflows
- Complex tasks route to research-first workflows
- Routing decisions improve over time
❌ You're doing this wrong if:
- No measurement of routing accuracy
- Users constantly override suggestions
- All tasks route to the same workflow
- No learning from routing failures
- Manual workflow selection required
Real-World Evidence
Production Routing Accuracy (110 Validation Cases)
Measured routing decisions:
Total tasks routed: 110
Correct routes: 100
Incorrect routes: 10
Accuracy: 90.9%
Breakdown by task type:
Simple tasks (35): 97% accuracy (34/35)
Medium tasks (50): 90% accuracy (45/50)
Complex tasks (25): 84% accuracy (21/25)
Learning curve:
Month 1: 75% accuracy (cold start)
Month 2: 85% accuracy (learning patterns)
Month 3: 91% accuracy (stable)
Specific Examples
Example 1: Kubernetes Application Creation
Task: "Create a new Kubernetes application for Redis caching"
Router analysis:
- Keywords: "create", "Kubernetes", "application"
- Complexity: Medium (known pattern)
- Risk: Low (new app, not modifying existing)
- Historical: 15 similar tasks, 93% success with applications-create-app
Route decision: applications-create-app workflow (confidence: 93%)
Outcome: Success ✅
Time: 10 minutes (vs. 45 minutes if routed to research-first)
Example 2: Architecture Redesign
Task: "Investigate migrating from monolith to microservices"
Router analysis:
- Keywords: "investigate", "migrating", "architecture"
- Complexity: High (research needed)
- Risk: High (system-wide change)
- Historical: 3 similar tasks, 100% success with research-plan-implement
Route decision: research-plan-implement workflow (confidence: 100%)
Outcome: Success ✅
Time: 3 hours (appropriate for complexity)
Example 3: Quick Typo Fix
Task: "Fix typo in README.md"
Router analysis:
- Keywords: "fix typo"
- Complexity: Low (simple edit)
- Risk: Low (documentation)
- Historical: 50+ similar tasks, 100% success with quick-edit
Route decision: quick-edit workflow (confidence: 100%)
Outcome: Success ✅
Time: 30 seconds (vs. 10 minutes if routed to full workflow)
Anti-Patterns
❌ The "One Size Fits All" Trap
Wrong: Route all tasks to the same workflow
Right: Match task characteristics to workflow capabilities
❌ The "Perfect Routing" Trap
Wrong: Spend hours building an ML model for 10 tasks
Right: Start with rule-based routing, upgrade to ML when data exists
❌ The "Ignore Feedback" Trap
Wrong: Never measure routing accuracy
Right: Track successes/failures, continuously improve
❌ The "Black Box" Trap
Wrong: Route without explaining why
Right: Show reasoning (keywords detected, historical accuracy, confidence)
Relationship to Other Factors
- Factor III: Focused Agents: Router selects which single-responsibility agent handles each task
- Factor IV: Continuous Validation: Routing accuracy is a validation metric
- Factor VI: Measure Everything: Measure routing decisions and outcomes
- Factor IX: Mine Patterns: Routing patterns extracted from successful routes
- Factor X: Small Iterations: Routing accuracy drives improvement backlog
Routing Patterns
Pattern 1: Cascading Router
Primary router (high-level)
├─ "Simple task" → Quick workflow router
│ ├─ "Edit file" → quick-edit
│ └─ "Format code" → code-formatter
│
├─ "Complex task" → Research workflow router
│ ├─ "New architecture" → research-plan-implement
│ └─ "Migration" → analysis-migration-validation
│
└─ "Uncertain" → Interactive router
└─ Present options to user
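A minimal sketch of the cascade, assuming the sub-routers shown in the tree (all names are illustrative):
```python
class CascadingRouter:
    """Primary router delegates to specialized sub-routers, or falls back to the user."""

    def __init__(self, quick_router, research_router, interactive_router):
        self.quick_router = quick_router
        self.research_router = research_router
        self.interactive_router = interactive_router

    def route(self, task, features):
        if self.is_simple(features):
            return self.quick_router.route(task)       # e.g. quick-edit, code-formatter
        if self.is_complex(features):
            return self.research_router.route(task)    # e.g. research-plan-implement
        return self.interactive_router.route(task)     # uncertain → ask the user

    def is_simple(self, features):
        return not features['complexity_signals'] and not features['risk_signals']

    def is_complex(self, features):
        return bool(features['complexity_signals'])
```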
Pattern 2: Context-Aware Routing
class ContextAwareRouter:
def route(self, task, context):
# Consider current context
if context.current_phase == "research":
# Already in research, continue with research workflows
return self.research_workflows
elif context.recent_failures > 3:
# Many failures, route to safer, validated workflows
return self.conservative_workflows
elif context.time_remaining < 30 * 60: # 30 minutes
# Limited time, route to quick workflows
return self.quick_workflows
# Default: Standard routing
return self.standard_route(task)
Pattern 3: Confidence Calibration
class CalibratedRouter:
def calibrate_confidence(self, prediction, historical_data):
# Adjust confidence based on calibration
raw_confidence = prediction.confidence
# Calibration curve from historical data
calibrated = self.calibration_curve(raw_confidence, historical_data)
return {
'workflow': prediction.workflow,
'raw_confidence': raw_confidence,
'calibrated_confidence': calibrated,
'recommendation': 'auto' if calibrated > 0.90 else 'suggest'
}
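The `calibration_curve()` helper above is left abstract; a minimal binned-calibration sketch, assuming a history of `(raw_confidence, was_correct)` pairs (function name and schema are illustrative):
```python
def binned_calibration_curve(raw_confidence, history, num_bins=10):
    """Map a raw confidence score to the empirical accuracy observed in its bin.

    history: iterable of (raw_confidence, was_correct) pairs from past routes.
    """
    bin_index = min(int(raw_confidence * num_bins), num_bins - 1)
    in_bin = [
        correct
        for conf, correct in history
        if min(int(conf * num_bins), num_bins - 1) == bin_index
    ]
    if not in_bin:
        return raw_confidence  # no history for this bin; fall back to the raw score
    return sum(in_bin) / len(in_bin)
```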
Metrics to Track
Routing Accuracy:
accuracy = correct_routes / total_routes
Target: >90%
User Override Rate:
override_rate = user_overrides / suggestions
Target: <10% (users trust suggestions)
Time Savings:
time_saved = (manual_selection_time - auto_routing_time) * num_routes
Example: (60s - 0s) * 100 routes = 100 minutes saved
Cost Optimization:
cost_saved = sum(wrong_workflow_cost - optimal_workflow_cost)
Example: 10 tasks routed to cheap workflows instead of expensive = $50 saved
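The first two metrics can be computed directly from a routing log; a minimal sketch assuming each record is a dict with illustrative boolean `correct`, `suggested`, and `overridden` flags:
```python
def routing_metrics(records):
    """records: list of dicts with boolean 'correct', 'suggested', and 'overridden' flags."""
    total = len(records)
    suggestions = [r for r in records if r.get('suggested')]
    return {
        'accuracy': sum(r['correct'] for r in records) / total if total else None,
        'override_rate': (
            sum(r['overridden'] for r in suggestions) / len(suggestions)
            if suggestions else None
        ),
    }
```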
Next Steps
- Implement rule-based router for common task types
- Measure baseline accuracy on historical tasks
- Collect routing data for ML training (when volume justifies)
- Create confidence calibration based on historical accuracy
- Build interactive fallback for uncertain cases
Further Reading
- Learning Science Pillar: ../docs/principles/four-pillars.md#pillar-2-learning-science
- Law 1 (Extract Learnings): ../docs/principles/nine-laws.md
- Law 2 (Improve System): ../docs/principles/nine-laws.md
- Pattern: Intelligent Routing: ../patterns/intelligent-routing.md
Implementation Patterns
These patterns emerge from production deployments in Houston (local-first), Fractal (Kubernetes-native), and ai-platform (IC-hardened). They extend the conceptual routing principles with battle-tested infrastructure patterns.
Pattern 1: Composable Not Chainable (Blackboard Architecture)
The Problem: Traditional orchestration chains agents linearly (A → B → C), creating tight coupling and brittle pipelines.
The Solution: Blackboard-mediated coordination where agents read/write to shared state, infrastructure handles routing.
┌──────────────────────────────────────────────────────────────────────────┐
│ BLACKBOARD ARCHITECTURE │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────┐ │
│ │ BLACKBOARD │ │
│ │ (Shared State Store) │ │
│ │ │ │
│ │ ┌─────────┐ ┌──────────┐ │ │
│ │ │Decisions│ │Directives│ │ │
│ │ │(append) │ │ (upsert) │ │ │
│ │ └─────────┘ └──────────┘ │ │
│ └─────────────────────────────┘ │
│ ▲ │ │
│ ┌────────────┘ └────────────┐ │
│ │ write read │ │
│ │ ▼ │
│ ┌───────┴───────┐ ┌───────────────┐ │
│ │ Agent A │ │ Agent B │ │
│ │ (Research) │ │ (Planning) │ │
│ └───────────────┘ └───────────────┘ │
│ ▲ │ │
│ │ │ write │
│ │ read ▼ │
│ ┌───────────────┐ ┌───────────────┐ │
│ │ Agent D │◄───── read ────────│ Agent C │ │
│ │ (Review) │ │(Implementation)│ │
│ └───────────────┘ └───────────────┘ │
│ │
│ KEY INSIGHT: Agents don't know about each other, only the blackboard │
└──────────────────────────────────────────────────────────────────────────┘
Blackboard Data Model (from Fractal):
# Two types of blackboard entries
apiVersion: fractal.ai/v1alpha1
kind: BlackboardEntry
# Type 1: Decisions (append-only, audit trail)
metadata:
name: research-findings-001
spec:
type: decision
phase: research
content:
findings:
- 'API supports pagination via cursor'
- 'Rate limit is 100 req/min'
confidence: 0.92
author: research-agent
timestamp: '2025-01-15T10:30:00Z'
---
# Type 2: Directives (upsertable, current state)
apiVersion: fractal.ai/v1alpha1
kind: BlackboardEntry
metadata:
name: current-approach
spec:
type: directive
content:
approach: 'cursor-based-pagination'
priority: 'latency'
constraints:
- 'no breaking changes'
lastUpdated: '2025-01-15T10:35:00Z'
Why Composable Beats Chainable:
| Aspect | Chainable (A→B→C) | Composable (Blackboard) |
|---|---|---|
| Coupling | Tight (each agent knows next) | Loose (agents know only blackboard) |
| Failure | Chain breaks | Other agents continue |
| Scaling | Add more chains | Add more agents to same blackboard |
| Debugging | Trace through chain | Read blackboard state |
| Recovery | Restart chain | Resume from blackboard |
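A minimal in-memory sketch of the two entry types (append-only decisions, upsertable directives); a production deployment would back this with a durable store such as the CRD-based model shown above:
```python
import time

class Blackboard:
    """Shared state store: agents read and write here instead of calling each other."""

    def __init__(self):
        self.decisions = []     # append-only audit trail
        self.directives = {}    # upsertable current state, keyed by name

    def append_decision(self, author, phase, content):
        self.decisions.append({
            'author': author,
            'phase': phase,
            'content': content,
            'timestamp': time.time(),
        })

    def upsert_directive(self, name, content):
        self.directives[name] = {'content': content, 'last_updated': time.time()}

    def read_directive(self, name):
        return self.directives.get(name)
```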
Pattern 2: SharedInformer Caching (Kubernetes-Native)
The Problem: Agents polling for routing decisions create N×M API calls (N agents × M decisions).
The Solution: SharedInformer pattern from Kubernetes—local read cache with watch for updates.
┌─────────────────────────────────────────────────────────────────────────┐
│ SHAREDINFORMER CACHING │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ API SERVER │ │
│ │ (Source of Truth) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ Initial List + │ │
│ │ Watch Stream │ │
│ └─────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ SHARED INFORMER │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ LOCAL CACHE │ │ │
│ │ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ │
│ │ │ │ Agent A │ │ Agent B │ │ Agent C │ │ │ │
│ │ │ │ config │ │ config │ │ config │ │ │ │
│ │ │ └──────────┘ └──────────┘ └──────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ EVENT HANDLERS: │ │
│ │ - OnAdd → Route new agent │ │
│ │ - OnUpdate → Re-route changed agent │ │
│ │ - OnDelete → Remove from routing table │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Router 1 │ │ Router 2 │ │ Router 3 │ │
│ │ (instant │ │ (instant │ │ (instant │ │
│ │ local read)│ │ local read)│ │ local read)│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ BENEFIT: O(1) reads from local cache, watch keeps cache fresh │
└─────────────────────────────────────────────────────────────────────────┘
Implementation (from Fractal):
// SharedInformer for routing decisions
type RoutingInformer struct {
informer cache.SharedIndexInformer
lister v1alpha1.KAgentLister
}
func NewRoutingInformer(client kubernetes.Interface) *RoutingInformer {
informer := cache.NewSharedIndexInformer(
&cache.ListWatch{
ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
return client.FractalV1alpha1().KAgents("").List(ctx, options)
},
WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
return client.FractalV1alpha1().KAgents("").Watch(ctx, options)
},
},
&v1alpha1.KAgent{},
resyncPeriod,
cache.Indexers{
"byCapability": indexByCapability, // Index agents by what they can do
"byDomain": indexByDomain, // Index agents by domain
},
)
return &RoutingInformer{informer: informer}
}
// Route task to best agent - O(1) local cache read
func (r *RoutingInformer) Route(task Task) (*v1alpha1.KAgent, error) {
// Get agents with matching capability from local cache
agents, err := r.informer.GetIndexer().ByIndex("byCapability", task.RequiredCapability)
if err != nil {
return nil, err
}
// Score and select best agent
return r.selectBestAgent(agents, task)
}
Pattern 3: Classification-Aware Routing (IC Pattern)
The Problem: In IC environments, data cannot cross classification boundaries. Routing must respect security constraints.
The Solution: Classification-aware routing that keeps data within its security boundary.
┌─────────────────────────────────────────────────────────────────────────┐
│ CLASSIFICATION-AWARE ROUTING │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ TASK ARRIVES WITH CLASSIFICATION LABEL │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ CLASSIFICATION ROUTER │ │
│ │ (Checks label, routes to │ │
│ │ appropriate namespace) │ │
│ └─────────────────────────────────┘ │
│ │ │
│ ┌───────────┼───────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │UNCLASS │ │ SECRET │ │TOP SECRET│ │
│ │Namespace │ │Namespace │ │Namespace │ │
│ │ │ │ │ │ │ │
│ │┌────────┐│ │┌────────┐│ │┌────────┐│ │
│ ││Agent A ││ ││Agent B ││ ││Agent C ││ │
│ │└────────┘│ │└────────┘│ │└────────┘│ │
│ │┌────────┐│ │┌────────┐│ │┌────────┐│ │
│ ││Agent D ││ ││Agent E ││ ││Agent F ││ │
│ │└────────┘│ │└────────┘│ │└────────┘│ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │
│ │ NetworkPolicy │ │
│ │ BLOCKS cross- │ │
│ │ namespace traffic │ │
│ └───────────┴───────────┘ │
│ │
│ ENFORCEMENT: Kubernetes NetworkPolicies prevent data spillage │
└─────────────────────────────────────────────────────────────────────────┘
NetworkPolicy Enforcement:
# Block all cross-namespace traffic by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-cross-namespace
namespace: secret
spec:
podSelector: {} # Apply to all pods in namespace
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector: {} # Only from same namespace
egress:
- to:
- podSelector: {} # Only to same namespace
- to:
- namespaceSelector:
matchLabels:
name: kube-system # Allow DNS
ports:
- port: 53
protocol: UDP
Classification-Aware Router:
class ClassificationRouter:
"""Routes tasks to agents within classification boundary."""
CLASSIFICATION_HIERARCHY = {
'unclassified': 0,
'cui': 1,
'secret': 2,
'top_secret': 3
}
def route(self, task: Task) -> Agent:
# Get task classification
task_level = self.CLASSIFICATION_HIERARCHY[task.classification]
# Get available agents at or below task classification
# (can route DOWN but never UP)
eligible_agents = [
agent for agent in self.agents
if self.CLASSIFICATION_HIERARCHY[agent.namespace] <= task_level
]
if not eligible_agents:
raise SecurityViolation(
f"No agents available at classification {task.classification}"
)
# Route to best agent within boundary
return self.select_best(eligible_agents, task)
def validate_response(self, response: Response, task: Task) -> bool:
"""Ensure response doesn't leak higher classification."""
response_level = self.detect_classification(response.content)
task_level = self.CLASSIFICATION_HIERARCHY[task.classification]
if response_level > task_level:
raise SecurityViolation(
f"Response contains {response_level} data for {task_level} task"
)
return True
Pattern 4: Multi-Tier Model Routing (ai-platform)
The Problem: Different deployment environments have different constraints (GPU, latency, connectivity).
The Solution: Route to appropriate tier based on task requirements and available resources.
┌─────────────────────────────────────────────────────────────────────────┐
│ MULTI-TIER MODEL ROUTING │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ TASK ARRIVES │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ TIER ROUTER │ │
│ │ │ │
│ │ Evaluates: │ │
│ │ - Latency req │ │
│ │ - Model size │ │
│ │ - Connectivity │ │
│ │ - Cost budget │ │
│ └─────────────────┘ │
│ │ │
│ ┌──────────────────┼──────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ TIER 1 │ │ TIER 2 │ │ TIER 3 │ │
│ │ Tactical Edge │ │ Datacenter │ │ Connected │ │
│ │ │ │ │ │ │ │
│ │ • 7B quantized │ │ • 70B-123B │ │ • Frontier │ │
│ │ • 3 GPU nodes │ │ • Shared HPCaaS │ │ • Claude/GPT-4 │ │
│ │ • 0% internet │ │ • Internal only │ │ • Full internet │ │
│  │ • <100ms latency│ │ • <1s latency   │ │ • <5s latency   │      │
│ │ │ │ │ │ │ │
│ │ USE: Tactical │ │ USE: Analysis, │ │ USE: Complex │ │
│ │ decisions, edge │ │ planning, heavy │ │ reasoning, code │ │
│ │ inference │ │ compute │ │ generation │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Tier Selection Logic:
class TierRouter:
"""Routes to appropriate tier based on task requirements."""
def select_tier(self, task: Task) -> Tier:
# Tier 1: Edge - for latency-critical, simple tasks
if task.latency_requirement_ms < 100:
if task.complexity == "simple":
return Tier.EDGE
else:
raise LatencyConstraintViolation(
"Complex task cannot meet ‹100ms latency"
)
# Tier 2: Datacenter - for analysis, medium complexity
if task.requires_large_model and not task.requires_internet:
return Tier.DATACENTER
# Tier 3: Connected - for frontier capabilities
if task.requires_frontier_model or task.requires_internet:
if self.connectivity_available():
return Tier.CONNECTED
else:
# Fallback to datacenter with degraded capability
return Tier.DATACENTER
# Default: Datacenter (best balance)
return Tier.DATACENTER
def route_with_fallback(self, task: Task) -> Response:
"""Route with automatic tier fallback."""
preferred_tier = self.select_tier(task)
try:
return self.execute_on_tier(task, preferred_tier)
except TierUnavailable:
# Fallback chain: Connected → Datacenter → Edge
for fallback in self.get_fallback_chain(preferred_tier):
try:
return self.execute_on_tier(task, fallback)
except TierUnavailable:
continue
raise AllTiersUnavailable(task)
Anti-Patterns for Production Routing
❌ Anti-Pattern 1: Orchestrator Coupling
Wrong: Router calls orchestrator which calls agents
Router → Orchestrator → Agent A → Agent B → Agent C
Right: Router writes to blackboard, agents react
Router → Blackboard ← Agents (react to state changes)
❌ Anti-Pattern 2: Synchronous Routing
Wrong: Router blocks waiting for agent response
router.route(task) # Blocks for 30 seconds
Right: Router enqueues, worker processes
router.enqueue(task) # Returns immediately
worker.process_queue() # Async processing
❌ Anti-Pattern 3: No Classification Enforcement
Wrong: Trust agents to respect classification
agent.process(task) # Agent "promises" to stay in boundary
Right: Infrastructure enforces boundaries
NetworkPolicy + NamespaceIsolation # Cannot violate
Production Checklist for Smart Routing
## Routing Infrastructure Checklist
### Blackboard Architecture
- [ ] Decisions are append-only (audit trail)
- [ ] Directives are upsertable (current state)
- [ ] Agents read/write blackboard, not each other
- [ ] Blackboard survives agent failures
### Caching Layer
- [ ] SharedInformer or equivalent for routing data
- [ ] Local cache with watch-based updates
- [ ] O(1) routing lookups (not O(N) API calls)
- [ ] Cache invalidation on config changes
### Classification (if IC)
- [ ] NetworkPolicies block cross-namespace traffic
- [ ] Router enforces classification boundaries
- [ ] Response validation prevents data spillage
- [ ] Audit log for all routing decisions
### Multi-Tier (if applicable)
- [ ] Tier selection based on task requirements
- [ ] Automatic fallback chain defined
- [ ] Latency SLOs enforced per tier
- [ ] Cost tracking per tier
Remember: Routing is not just about automation—it's about learning. Every routing decision is a learning opportunity. Measure accuracy, analyze failures, and continuously improve. The router should get smarter with every task.