How AI Ticket Routing Actually Works (And Why Keywords Fail)
Keyword-based routing breaks the moment a customer phrases something differently. Intent-based routing doesn't. Here is why.
The Keyword Problem
Most support tools route tickets with keyword rules. "If the message contains 'refund,' send it to billing." "If it contains 'bug,' send it to engineering."
This works until a customer writes "the page keeps crashing when I try to check out and I want my money back." That message is about a bug AND a refund. The keyword rules fight each other. Or worse, neither rule triggers because the customer said "money back" instead of "refund."
Keyword routing is brittle. It breaks on synonyms, misspellings, multi-topic messages, and any phrasing the rule author didn't anticipate. And customers are creative — they'll describe the same problem a hundred different ways.
How Intent Classification Works
Intent-based routing doesn't look for keywords. It reads the full message and asks: "What does this person want?"
The message "the page keeps crashing when I try to check out and I want my money back" gets classified as two intents: bug_report (high confidence) and refund_request (high confidence). The routing system can then apply rules for both: create a bug ticket AND notify the billing team about the refund request.
This works because the classifier was trained on thousands of variations of each intent. It has seen "I want my money back," "can I get a refund," "I'd like to return this," "charge was incorrect," and hundreds of other phrasings — they all map to the same intent.
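The useful part is the output shape: a list of (intent, confidence) pairs rather than a single keyword hit. The sketch below only illustrates that interface. The `classify` function is a toy stand-in using substring matching; a real trained model scores every intent from the full message, which is exactly what makes it robust where this toy is not:

```python
# Toy stand-in for a trained classifier, to show the OUTPUT SHAPE only:
# a list of (intent, confidence) pairs. The phrasing table is illustrative.
PHRASINGS = {
    "refund_request": ["money back", "refund", "return this", "charge was incorrect"],
    "bug_report": ["crashing", "error", "broken", "doesn't work"],
}

def classify(message: str) -> list[tuple[str, float]]:
    """Return candidate intents with confidence scores, highest first."""
    text = message.lower()
    results = []
    for intent, phrases in PHRASINGS.items():
        hits = sum(p in text for p in phrases)
        if hits:
            # Crude confidence stand-in; a real model outputs calibrated scores.
            results.append((intent, min(0.5 + 0.4 * hits, 0.99)))
    return sorted(results, key=lambda r: -r[1])

msg = "the page keeps crashing when I try to check out and I want my money back"
print(classify(msg))  # both bug_report and refund_request fire
```

The multi-topic message from earlier yields two intents, so downstream rules can act on both.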
Three Approaches to Routing
Keyword/Rule-Based (Old School)
- How it works: if the message contains X, do Y
- Strengths: simple to set up, easy to understand
- Weaknesses: brittle, misses synonyms, can't handle multi-topic messages
- Best for: very simple use cases with predictable language (internal tools, structured forms)

LLM-Based (The Expensive Way)
- How it works: send the message to GPT-4 or Claude with instructions like "categorize this into one of these 10 buckets"
- Strengths: handles nuance, works with any language, understands context
- Weaknesses: slow (1 to 3 seconds), expensive ($0.01 to $0.10 per call at scale), inconsistent (the same message might get different categories), hallucination risk
- Best for: complex, ambiguous messages where speed doesn't matter

Purpose-Built Classification (The Middle Ground)
- How it works: a trained ML model maps messages to a fixed set of intents
- Strengths: fast (100 to 200ms), cheap ($0.002 to $0.005 per call), consistent, no hallucination
- Weaknesses: limited to trained intents (won't understand something completely novel), less nuance than an LLM
- Best for: high-volume support with predictable question types
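For the LLM-based approach, the work mostly happens in a constrained prompt. Here is a sketch of what that prompt construction might look like; the intent list and wording are illustrative, and the actual API call is omitted:

```python
# Illustrative intent list. A real system would send this prompt to an LLM
# API and parse the single bucket name out of the response.
INTENTS = ["refund_request", "bug_report", "order_status", "password_reset"]

def build_prompt(message: str) -> str:
    """Build a categorization prompt that constrains the model to known buckets."""
    bucket_list = "\n".join(f"- {i}" for i in INTENTS)
    return (
        "Categorize the customer message into exactly one of these buckets:\n"
        f"{bucket_list}\n\n"
        f"Message: {message}\n"
        "Answer with the bucket name only."
    )

print(build_prompt("I want my money back"))
```

Even with a constrained prompt like this, you still pay the latency and consistency costs listed above, which is what motivates the purpose-built middle ground.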
Why Speed Matters for Routing
When a customer sends a message, the routing decision needs to happen fast. If routing takes 3 seconds (typical for an LLM call), the customer is staring at a loading spinner wondering if the system ate their message. If routing takes 150 milliseconds (typical for a trained classifier), the response or acknowledgment feels instant.
Speed also matters for multi-step pipelines. If your routing involves: classify → check confidence → apply rules → trigger action → send response, each step adds latency. A 150ms classification step gives you room for the rest of the pipeline without the total time feeling slow.
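The latency arithmetic for that pipeline is worth writing out. The per-step timings below are illustrative figures in line with the numbers above, not measurements:

```python
# Illustrative latency budget (milliseconds) for the pipeline:
# classify -> check confidence -> apply rules -> trigger action -> send response.
steps = {
    "classify": 150,         # trained classifier, per the figures above
    "check_confidence": 5,
    "apply_rules": 10,
    "trigger_action": 50,
    "send_response": 100,
}
total_ms = sum(steps.values())
print(f"total: {total_ms} ms")  # 315 ms, still feels instant

# Swap in a 3-second LLM call for the classify step:
total_llm_ms = total_ms - steps["classify"] + 3000
print(f"with LLM classify: {total_llm_ms} ms")  # 3165 ms, a visible wait
```

The classification step dominates either way: with a fast classifier the whole pipeline fits well under half a second, while an LLM call alone blows the budget.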
Setting Up Good Routing Rules
The best routing setups follow a simple pattern:
Layer 1: High-confidence auto-resolution
If the classifier is 85%+ confident AND you have a rule for that intent, fire the action automatically. Password resets, pricing questions, order status — these get resolved in seconds.

Layer 2: Medium-confidence assisted routing
If the classifier is 60 to 85% confident, route the message to the right team with the intent label attached. The human sees "Likely intent: billing_dispute (72% confidence)" and can confirm or correct. This saves the human from reading the full message and figuring out the category from scratch.

Layer 3: Low-confidence catch-all
Below 60% confidence, route to a general queue. These are either edge cases, multi-topic messages, or messages the classifier genuinely doesn't understand. Humans handle them and the system learns over time.
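The three layers above can be sketched as a single routing function. The thresholds (0.85 and 0.60) are the article's figures; the rule table and action names are hypothetical:

```python
# Hypothetical auto-resolution rules for Layer 1.
AUTO_RULES = {
    "password_reset": "send_reset_link",
    "order_status": "send_tracking_info",
}

def route(intent: str, confidence: float) -> dict:
    # Layer 1: high confidence AND a known rule -> fire the action automatically.
    if confidence >= 0.85 and intent in AUTO_RULES:
        return {"layer": 1, "action": AUTO_RULES[intent]}
    # Layer 2: medium confidence -> route to a team with the label attached.
    if confidence >= 0.60:
        return {"layer": 2, "action": "assign_to_team",
                "note": f"Likely intent: {intent} ({confidence:.0%} confidence)"}
    # Layer 3: low confidence -> general queue for humans.
    return {"layer": 3, "action": "general_queue"}

print(route("password_reset", 0.93))   # layer 1, auto-resolved
print(route("billing_dispute", 0.72))  # layer 2, labeled for a human
print(route("unknown", 0.41))          # layer 3, general queue
```

Note that a high-confidence intent with no auto-resolution rule falls through to Layer 2: confidence alone is never enough to act automatically.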
Measuring Routing Quality
Track your automation rate weekly — you want 50 to 70% of messages auto-resolved. Spot-check 20 classified messages to gauge routing accuracy; if fewer than 90% are correct, raise your confidence thresholds. Keep an eye on your escalation rate too. If more than 20% of messages land in the general queue, look at which intents are piling up there and create specific rules for them. And track time to resolution: under 5 minutes for auto-resolved, under 2 hours for human-handled.
If your automation rate is below 50%, you probably need more routing rules. If your routing accuracy is below 85%, your confidence thresholds are too low; raise them. And if your escalation rate climbs past 30%, treat it as urgent: your intent set is missing categories your customers actually use.
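These weekly checks reduce to simple arithmetic over ticket counts. A sketch, with the article's target thresholds and assumed field names:

```python
# Weekly routing health check. Thresholds are the targets described above;
# the counts and field names are assumptions for illustration.
def routing_health(auto_resolved: int, human_handled: int, escalated: int) -> dict:
    total = auto_resolved + human_handled + escalated
    automation_rate = auto_resolved / total
    escalation_rate = escalated / total
    return {
        "automation_rate": automation_rate,       # target: 50 to 70%
        "escalation_rate": escalation_rate,       # warn above 20%
        "need_more_rules": automation_rate < 0.50,
        "too_many_escalations": escalation_rate > 0.20,
    }

# Example week: 620 auto-resolved, 300 human-handled, 80 escalated.
print(routing_health(auto_resolved=620, human_handled=300, escalated=80))
```

A week like the example (62% automation, 8% escalation) is inside the target range on both counts.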
The Migration Path
If you're currently using keyword-based routing and want to switch:
1. Export your existing rules. Document what keywords map to what actions.
2. Set up intent-based routing alongside your existing system. Run both for 2 weeks.
3. Compare results. Did intent routing catch messages that keyword routing missed? Were there false positives?
4. Gradually shift traffic. Start with 20% of messages going through intent routing, then 50%, then 100%.
5. Turn off keyword rules once you're confident in the new system.
Most teams complete this in 2 to 3 weeks.
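Step 3 of the migration, comparing the two systems on the same traffic, can be sketched like this. Both router functions are stand-ins for whatever your old and new systems expose:

```python
# Shadow-mode comparison for migration step 3. route_by_keyword and
# route_by_intent are placeholders for the two systems being compared;
# each takes a message and returns a queue name or None.
def compare_routing(messages, route_by_keyword, route_by_intent):
    caught_only_by_intent, disagreements = [], []
    for msg in messages:
        kw, intent = route_by_keyword(msg), route_by_intent(msg)
        if kw is None and intent is not None:
            caught_only_by_intent.append(msg)        # keyword system missed it
        elif kw is not None and intent is not None and kw != intent:
            disagreements.append((msg, kw, intent))  # candidate false positives
    return {"caught_only_by_intent": caught_only_by_intent,
            "disagreements": disagreements}

# Tiny illustration with stub routers:
messages = ["I want a refund", "I want my money back"]
kw = lambda m: "billing" if "refund" in m else None
it = lambda m: "billing" if ("refund" in m or "money back" in m) else None
result = compare_routing(messages, kw, it)
print(result)
```

The "caught only by intent" list quantifies what you gain; the disagreements list is what a human should review before shifting traffic.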