When I started building AI agents, I did what most developers do: I wrote a bunch of if-statements. If the user asks about pricing, call the pricing tool. If they mention “support,” route to the support matching service. If they say “calculate,” use the calculator.
It worked great in testing. Then real users showed up.
“Can you help me figure out what my total cost would be?” – No keyword match for “calculate.”
“I’m looking to get matched with some providers” – Keyword match for “provider” but also “get matched.”
“What’s the difference between Plan A and Plan B?” – No clear keyword, but definitely needs a knowledge base.
Rules-based routing fails because humans don’t speak in keywords. They speak in intent, context, and nuance. Your AI agent needs to understand that nuance, not just pattern match.
The Rules-Based Approach (And Why It Fails)
Here’s what rules-based routing typically looks like:
def route_query(user_message: str) -> str:
    """Route based on keyword matching."""
    message_lower = user_message.lower()

    # Pricing queries
    if any(word in message_lower for word in ["price", "pricing", "cost", "fees"]):
        return "get_pricing"

    # Provider matching
    if any(word in message_lower for word in ["provider", "match", "connect", "options"]):
        return "provider_matching"

    # Calculations
    if any(word in message_lower for word in ["calculate", "total", "cost"]):
        return "calculator"

    # Default to knowledge base
    return "knowledge_base"
This looks reasonable. It’s fast, deterministic, and easy to debug. But it has fatal flaws:
Problem 1: Keyword Collisions
“I want to calculate my price” – matches both “calculate” and “price.” Which route wins?
Problem 2: Paraphrasing
“What would my monthly cost be?” – means “calculate payment” but has no matching keywords.
Problem 3: Context Dependence
User: “What’s a good benchmark for response times?”
Agent: “Generally, under 2 seconds for API calls…”
User: “What about for me?”
The second message has no keywords, but the context makes it clear they’re asking about their specific situation.
Problem 4: Multi-Intent Queries
“Can you help me calculate my cost and find providers?” – needs both calculator and provider matching.
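To make these failures concrete, here is the keyword router from above reproduced verbatim and run against the problem queries. The routes it picks are exactly what the code dictates, not what the user meant:

```python
def route_query(user_message: str) -> str:
    """The keyword router from above, reproduced so this snippet runs standalone."""
    message_lower = user_message.lower()
    if any(word in message_lower for word in ["price", "pricing", "cost", "fees"]):
        return "get_pricing"
    if any(word in message_lower for word in ["provider", "match", "connect", "options"]):
        return "provider_matching"
    if any(word in message_lower for word in ["calculate", "total", "cost"]):
        return "calculator"
    return "knowledge_base"

# Paraphrasing: "cost" hits the pricing list before the calculator check ever runs
print(route_query("What would my monthly cost be?"))  # get_pricing

# Collision: both "calculate" and "price" match; whichever rule is checked first wins
print(route_query("I want to calculate my price"))    # get_pricing

# Context-dependent follow-up: no keywords at all, so it falls through
print(route_query("What about for me?"))              # knowledge_base
```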
Rules-based routing can’t handle this complexity. You need agentic routing.
The Agentic Approach: Let the LLM Decide
Instead of writing rules, let the LLM analyze intent and make routing decisions:
import logging
from dataclasses import dataclass
from enum import Enum
from typing import TypedDict

from langchain_core.messages import HumanMessage

logger = logging.getLogger(__name__)

class RoutingDecision(str, Enum):
    USE_KNOWLEDGE = "knowledge"
    DIRECT_AGENT = "agent"

class RoutingDict(TypedDict):
    """Schema the LLM fills in via structured output."""
    decision: RoutingDecision
    confidence: float
    reasoning: str

@dataclass
class RoutingResult:
    """Validated routing decision returned to the graph."""
    decision: RoutingDecision
    confidence: float
    reasoning: str

# SupervisorState is this project's LangGraph state TypedDict; get_llm_client
# and format_recent_messages are project helpers (definitions omitted).
async def should_use_knowledge_base(state: SupervisorState) -> RoutingResult:
    """
    Use LLM to decide if knowledge base retrieval is needed.
    This is pure agentic reasoning - no keyword matching or rules.
    """
    # Get conversation context
    messages = state["messages"]
    user_messages = [msg for msg in messages if isinstance(msg, HumanMessage)]
    last_message = user_messages[-1].content if user_messages else ""

    # Build routing prompt
    routing_prompt = f"""
Analyze this user query and decide if knowledge base retrieval is needed.

Use KNOWLEDGE BASE for:
- Educational questions ("What is...", "How does...", "Why...")
- Questions requiring detailed explanations
- Comparison questions ("What's the difference between...")
- Definition requests
- General product information

Use DIRECT AGENT for:
- Greetings and conversational responses
- Action-oriented requests (apply, submit, start, begin, get matched)
- Questions requiring calculations (use tools instead)
- Questions requiring personalized recommendations
- Questions about getting matched with providers
- Requests to start applications or processes
- Off-topic questions (will be handled by guardrails)

Consider the conversation context to understand follow-up questions.

Recent conversation:
{format_recent_messages(messages[-4:])}

Current query: {last_message}

Provide your decision with confidence (0.0-1.0) and reasoning.
"""

    # Use structured output for reliable parsing
    llm = get_llm_client(temperature=0.0)
    structured_llm = llm.with_structured_output(RoutingDict)

    try:
        result = await structured_llm.ainvoke([HumanMessage(content=routing_prompt)])
        return RoutingResult(
            decision=RoutingDecision(result["decision"]),
            confidence=result["confidence"],
            reasoning=result["reasoning"]
        )
    except Exception as e:
        logger.error(f"Routing failed: {e}")
        # Fail open - let agent handle it
        return RoutingResult(
            decision=RoutingDecision.DIRECT_AGENT,
            confidence=0.5,
            reasoning="Routing error - defaulting to agent"
        )
This approach handles all the cases that rules-based routing misses.
Intent Planning: Going Deeper
Routing is just the first step. Production systems need intent planning that analyzes:
- User goal: What is the user trying to accomplish?
- Action intent: Are they ready to take action or just researching?
- Calculation intent: Do they want a calculation performed?
- Execution strategy: What’s the best way to fulfill this request?
- Recommended tools: Which tools (if any) should be used?
- Fallback actions: What to do if the primary strategy fails?
Here’s what that looks like:
class IntentPlan(TypedDict):
    # UserIntent, ExecutionStrategy, and FallbackAction are project-specific
    # str enums; their definitions are omitted here.
    user_intent: UserIntent  # "get_pricing", "find_providers", "educational", etc.
    user_goal: str
    action_intent: bool
    calculation_intent: bool
    execution_strategy: ExecutionStrategy  # "use_tool", "use_kb", "direct_response"
    recommended_tools: list[str]
    kb_needed: bool
    fallback_action: FallbackAction
    reasoning: str

async def plan_intent(state: SupervisorState) -> Dict[str, Any]:
    """
    Analyze user intent and create execution plan.

    This implements the Plan-and-Execute pattern:
    1. Understand what the user wants
    2. Determine the best execution strategy
    3. Identify required resources (tools, KB, workers)
    4. Plan fallback actions
    """
    messages = state["messages"]
    user_profile = state.get("user_profile", {})
    available_tools = get_available_tools()

    planning_prompt = f"""
You are an intent analysis system for a customer assistant.
Analyze this query and create an execution plan.

Available tools: {[tool.name for tool in available_tools]}
User profile: {json.dumps(user_profile, indent=2)}

Recent conversation:
{format_recent_messages(messages[-6:])}

Determine:
1. What is the user trying to accomplish?
2. Are they ready to take action (action intent)?
3. Do they want a calculation performed?
4. What's the best execution strategy?
5. Which tools (if any) should be used?
6. Is knowledge base retrieval needed?
7. What fallback action if primary strategy fails?

Provide structured analysis with reasoning.
"""

    llm = get_structured_llm_client()
    structured_llm = llm.with_structured_output(IntentPlan)
    result = await structured_llm.ainvoke([HumanMessage(content=planning_prompt)])

    return {
        "intent_plan": result,
        "last_user_intent": result["user_intent"]
    }
This plan guides the entire execution flow. The agent knows:
- Whether to use tools or knowledge base
- Which specific tools to call
- What to do if tools fail
- How to provide fallback responses
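One way to act on the plan is a small dispatcher over `execution_strategy`. This is a sketch, not the article's actual implementation: the handler names (`run_tools`, `retrieve_from_kb`, `respond_directly`) are illustrative stubs standing in for your tool executor, retriever, and response node.

```python
import asyncio

# Stub handlers so the dispatch is runnable; in a real graph these would be
# your tool-execution, retrieval, and direct-response nodes.
async def run_tools(tools, state):
    return {"route": "tools", "tools": tools}

async def retrieve_from_kb(state):
    return {"route": "kb"}

async def respond_directly(state):
    return {"route": "direct"}

async def execute_plan(plan: dict, state: dict) -> dict:
    """Dispatch on the plan's execution_strategy field."""
    strategy = plan["execution_strategy"]
    if strategy == "use_tool":
        return await run_tools(plan["recommended_tools"], state)
    if strategy == "use_kb":
        return await retrieve_from_kb(state)
    return await respond_directly(state)

plan = {"execution_strategy": "use_tool", "recommended_tools": ["calculator"]}
print(asyncio.run(execute_plan(plan, {})))  # {'route': 'tools', 'tools': ['calculator']}
```

The fallback action from the plan slots naturally into each handler: if `run_tools` raises, look up `plan["fallback_action"]` instead of failing the whole turn.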
Handling Edge Cases
Agentic routing isn’t perfect. You need to handle edge cases:
1. Low Confidence Decisions
if result.confidence < 0.7:
    logger.warning(f"Low confidence routing: {result.confidence}")
    # Use conservative fallback
    return RoutingDecision.DIRECT_AGENT
When the LLM isn’t confident, fail safe.
2. Conversation Context
# Check if this is a follow-up question
recent_messages = messages[-4:]
has_context = len(recent_messages) > 2

if has_context:
    # Include context in routing decision
    context_summary = summarize_recent_context(recent_messages)
    routing_prompt += f"\n\nConversation context: {context_summary}"
Follow-up questions need context to route correctly.
3. Multi-Intent Queries
if result.user_intent == "multiple":
    # Break down into sub-intents
    sub_intents = await analyze_sub_intents(user_message)
    # Execute in sequence
    for intent in sub_intents:
        await execute_intent(intent, state)
Some queries need multiple execution paths.
4. Ambiguous Queries
if result.confidence < 0.6 and result.user_intent == "general_inquiry":
    # Ask clarifying question
    return {
        "messages": [AIMessage(content=generate_clarifying_question(result))]
    }
When intent is unclear, ask for clarification.
Combining Agentic and Rules-Based
Pure agentic routing is powerful but not always necessary. Some decisions are simple:
async def route_with_hybrid_approach(state: SupervisorState) -> str:
    """
    Use rules for simple cases, LLM for complex ones.
    """
    last_message = get_last_user_message(state)

    # Simple rule: greetings
    if is_greeting(last_message):
        return "direct_response"

    # Simple rule: explicit tool requests
    if last_message.lower().startswith("calculate"):
        return "use_tool"

    # Complex case: use LLM
    result = await should_use_knowledge_base(state)
    return result.decision
Use rules for obvious cases, LLM for nuanced ones. This balances performance and accuracy.
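The `is_greeting` helper in the hybrid router is left undefined above. One minimal version (an assumption, not from any library) is itself just a cheap rule, which is exactly the point: the simple path should cost microseconds, not an LLM call.

```python
# Hypothetical helper: short, normalized messages that exactly match a known
# greeting. Deliberately conservative - anything else falls through to the LLM.
GREETINGS = {"hi", "hello", "hey", "good morning", "good afternoon", "thanks", "thank you"}

def is_greeting(message: str) -> bool:
    """Cheap rule: the normalized message is a known greeting phrase."""
    text = message.lower().strip().rstrip("!.?")
    return text in GREETINGS

print(is_greeting("Hello!"))                      # True
print(is_greeting("Hello, can you price this?"))  # False
```

Keeping the set small and requiring an exact match avoids the keyword-collision trap from earlier: "Hello, can you price this?" correctly falls through to the agentic path.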
Observability Matters
Agentic routing is a black box. You need observability:
logger.info(f"🔀 ROUTING: Processing query: '{user_query[:100]}...'")
logger.info(f"🔀 ROUTING: Decision = {result.decision}, Confidence = {result.confidence:.2f}")
logger.info(f"🔀 ROUTING: Reasoning: {result.reasoning}")

# Track metrics
track_routing_decision(
    decision=result.decision,
    confidence=result.confidence,
    user_intent=result.user_intent,
    execution_time=execution_time
)
Log every decision with reasoning. When routing goes wrong, you need to know why.
Performance Considerations
Agentic routing adds latency. Optimize:
1. Use Fast Models for Routing
# Use smaller, faster model for routing
routing_llm = get_llm_client(
    model="fast-and-cheap-model-v1",  # Fast, cheap
    temperature=0.0,
    max_tokens=500
)

# Use larger model for actual responses
response_llm = get_llm_client(
    model="powerful-and-accurate-model-v2",  # Powerful, accurate
    temperature=0.7,
    max_tokens=2000
)
2. Cache Common Patterns
# Cache routing decisions for common queries
routing_cache = {}

async def route_with_cache(user_message: str):
    # Normalize so trivial variations ("Hi!" vs "hi ") share an entry.
    # In production, bound the cache (e.g., an LRU) so it can't grow forever.
    cache_key = user_message.lower().strip()
    if cache_key in routing_cache:
        return routing_cache[cache_key]

    result = await route_with_llm(user_message)
    routing_cache[cache_key] = result
    return result
3. Parallel Execution
# Don't wait for routing to start other work
routing_task = asyncio.create_task(route_query(state))
context_task = asyncio.create_task(load_context(state))
routing_result = await routing_task
context = await context_task
Conclusion
Rules-based routing is tempting. It’s simple, fast, and deterministic. But it doesn’t work for real users with real queries.
Agentic routing uses LLMs to understand intent, context, and nuance. It handles:
- Paraphrased questions
- Multi-intent queries
- Context-dependent meaning
- Ambiguous requests
The tradeoff is latency and cost. But for production AI agents, it’s worth it. Your users don’t speak in keywords. Your routing shouldn’t either.
Let your agent make its own decisions. That’s what makes it intelligent.