Utterance — Definition and Role in Chatbot Training (2026)
Two meanings
The word "utterance" in the chatbot world has two related-but-distinct meanings:
1. Any user message (general usage)
"The user's utterance was 'where's my order?'" Here the word is just a term for what the user said. This usage is common in conversation analysis and transcripts.
2. Training example (NLU usage)
In platforms like Dialogflow, Rasa, and Microsoft Bot Framework, "utterance" specifically means: a sample message labeled with the intent it represents, used to train the intent classifier.
Example training utterances for the intent book_appointment:
- "I'd like to schedule a meeting"
- "can I book an appointment"
- "set up a time, please"
- "schedule a call for tomorrow"
- "book me for Tuesday"
The NLU engine learns from these examples to recognize new utterances ("can we set up a time?") as the same intent.
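To make the idea concrete, here is a minimal sketch of learning intents from labeled utterances. It is a toy token-overlap classifier written for illustration, not how Dialogflow, Rasa, or any other platform works internally. The check_order_status intent and its three examples are assumptions added so there is more than one intent to choose from.

```python
import re
from collections import Counter

# Labeled training utterances as (text, intent) pairs. The book_appointment
# examples come from the list above; check_order_status is an assumed second
# intent added only for this illustration.
TRAINING_UTTERANCES = [
    ("I'd like to schedule a meeting", "book_appointment"),
    ("can I book an appointment", "book_appointment"),
    ("set up a time, please", "book_appointment"),
    ("schedule a call for tomorrow", "book_appointment"),
    ("book me for Tuesday", "book_appointment"),
    ("where's my order", "check_order_status"),
    ("track my package", "check_order_status"),
    ("has my order shipped yet", "check_order_status"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each token appears under each intent."""
    counts = {}
    for text, intent in examples:
        counts.setdefault(intent, Counter()).update(tokenize(text))
    return counts


def classify(model, text):
    """Return the intent whose training tokens overlap most with the message."""
    tokens = tokenize(text)
    scores = {intent: sum(c[t] for t in tokens) for intent, c in model.items()}
    return max(scores, key=scores.get)


model = train(TRAINING_UTTERANCES)
print(classify(model, "can we set up a time?"))  # book_appointment
```

The new message was never seen during training, but it shares tokens with the book_appointment examples. A production engine would also return a confidence score and fall back when no intent matches well, which this sketch does not do.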
How many utterances per intent?
Standard guidance:
- Minimum: 5-10 utterances per intent — barely enough.
- Typical: 15-30 — covers common phrasing variations.
- Robust: 30-50 — covers edge cases, formal/casual variants, multilingual variants.
Beyond about 50, the marginal benefit of each added utterance diminishes, so effort is usually better spent covering new intents than padding existing ones.
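One possible way to apply these tiers is a small report over a labeled dataset. The sketch below is illustrative: the function names are hypothetical, and because the guidance leaves 11-14 unassigned and lists 30 in two tiers, the code assumes 5-14 counts as minimum and 30 or more counts as robust.

```python
from collections import Counter


def coverage_tier(count):
    """Map an utterance count to the rough tiers listed above.

    Boundary handling is an assumption: 5-14 is treated as "minimum",
    15-29 as "typical", and 30 or more as "robust".
    """
    if count < 5:
        return "below minimum"
    if count < 15:
        return "minimum"
    if count < 30:
        return "typical"
    return "robust"


def tier_report(labeled_utterances):
    """Return {intent: (count, tier)} for a list of (text, intent) pairs."""
    counts = Counter(intent for _, intent in labeled_utterances)
    return {intent: (n, coverage_tier(n)) for intent, n in counts.items()}
```

Calling tier_report on the full training set gives a quick view of which intents still need examples before release.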
Designing good training utterances
- Cover phrasing variation. Mix formal ("I would like to schedule") and casual ("book me for tomorrow").
- Include typos and abbreviations. "whats my order" — real users skip apostrophes, abbreviate.
- Cover edge cases. "Can I get a refund AND cancel?" — multi-intent utterances test boundaries.
- Match user vocabulary. Mine real chat logs. Synthetic "I would like to initiate a cancellation process" doesn't match how users actually speak.
- Include multilingual examples. If your bot serves PT-BR and ES users, provide training utterances for each language separately.
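The point about typos and skipped apostrophes can be sketched as a small helper that derives informal variants from a clean utterance. This is a hypothetical supplement to real chat logs, not a replacement for them, and it only handles straight apostrophes and trailing punctuation.

```python
def informal_variants(utterance):
    """Return the utterance plus simple informal variants, without duplicates.

    Variants: lowercased, apostrophes dropped, trailing punctuation dropped.
    """
    lowered = utterance.lower()
    no_apostrophes = lowered.replace("'", "")
    no_punctuation = no_apostrophes.rstrip("?.!")
    variants = []
    for candidate in (utterance, lowered, no_apostrophes, no_punctuation):
        if candidate not in variants:
            variants.append(candidate)
    return variants


print(informal_variants("What's my order?"))
# ["What's my order?", "what's my order?", 'whats my order?', 'whats my order']
```

Each variant would be added under the same intent as the original utterance.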
In LLM-driven bots
LLM-powered chatbots typically don't rely on explicit training utterances. The LLM can interpret most phrasings without labeled examples. The closest equivalents are:
- Few-shot examples in the system prompt — "Here are 3 examples of how to respond to refund requests."
- System-prompt instructions describing scope and behavior.
Some platforms blend both — explicit NLU classifier with training utterances for structured intents + LLM fallback for open-ended queries.
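A blended setup of this kind might route messages roughly as follows. This is a sketch under stated assumptions: the route function, the 0.7 confidence threshold, and both stub functions are hypothetical stand-ins, not any platform's API.

```python
def route(message, classify, llm_fallback, threshold=0.7):
    """Use the NLU intent when the classifier is confident, else ask the LLM.

    classify must return (intent, confidence); llm_fallback returns reply text.
    The threshold value is an assumption and would need tuning per bot.
    """
    intent, confidence = classify(message)
    if confidence >= threshold:
        return {"handler": "intent", "value": intent}
    return {"handler": "llm", "value": llm_fallback(message)}


def stub_classify(message):
    """Hypothetical stand-in for a trained intent classifier."""
    if "refund" in message.lower():
        return "request_refund", 0.92
    return "unknown", 0.20


def stub_llm(message):
    """Hypothetical stand-in for a call to an LLM."""
    return f"[LLM reply to: {message}]"


print(route("I want a refund", stub_classify, stub_llm))
# {'handler': 'intent', 'value': 'request_refund'}
```

Structured intents keep their predictable handling, while messages the classifier cannot place with confidence go to the LLM.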
Related terms
- Intent recognition — what utterances train.
- Natural Language Understanding — the broader category.
- Chatbot training — the overall setup process utterances fit into.
FAQ
Is "utterance" the same as "query" or "prompt"?
In NLU contexts, "utterance" means a user's message, whether or not it has been labeled with an intent. "Query" is broader and often refers to a search query. "Prompt" refers to the instruction text given to an LLM. The terms overlap in casual use but are kept distinct in technical usage.
Do I need to update utterances over time?
Yes. Real-world phrasings evolve, new product features get released, and new questions emerge. Audit chat logs monthly and add fresh training utterances for intents with new phrasing patterns.
Can I auto-generate training utterances?
Some platforms offer LLM-driven utterance generation ("generate 20 variations of 'I want to cancel'"). This is useful as a starting point, but verify that the generated utterances reflect real user speech and not just LLM-style phrasing.
How do I audit training utterances for bias or quality issues?
Review utterances against three lenses: (1) phrasing diversity — does the set cover formal, casual, typo-laden, multilingual variants? (2) demographic representation — does it reflect how your actual customer base speaks, or only the language of the team that wrote it? (3) intent coverage — are any intents over-represented (training imbalance) or under-represented (real users will hit a fallback)? A monthly review of fallback-triggered conversations is the simplest way to surface gaps.
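The third lens, intent coverage, lends itself to a simple automated check. The sketch below flags intents whose utterance count is far from the median. The function name and the ratio of 3 are assumptions chosen for illustration, not an established standard.

```python
from collections import Counter
from statistics import median


def imbalance_report(labeled_utterances, ratio=3.0):
    """Flag intents whose utterance count is far above or below the median.

    labeled_utterances is a list of (text, intent) pairs. An intent is flagged
    when its count exceeds median * ratio or falls below median / ratio.
    """
    counts = Counter(intent for _, intent in labeled_utterances)
    mid = median(counts.values())
    report = {}
    for intent, n in counts.items():
        if n > mid * ratio:
            report[intent] = "over-represented"
        elif n < mid / ratio:
            report[intent] = "under-represented"
    return report
```

Intents that come back as under-represented are good candidates for the examples mined from fallback-triggered conversations.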
What's the relationship between utterances and prompts?
Utterances are training data for a classifier — labeled user messages that teach the bot to recognize patterns. Prompts are instructions given to an LLM at runtime that steer its behavior. The two are different layers: utterances build a model's knowledge; prompts shape a model's response. Modern LLM-driven chatbots often replace utterances with system prompts plus few-shot examples — no separate training step needed.
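To illustrate how labeled utterances can be reused as few-shot examples, here is a minimal sketch that assembles a classification prompt as plain text. The prompt wording and the function name are assumptions, and the resulting string would still need to be sent to whichever LLM the bot uses.

```python
def build_few_shot_prompt(examples, intents):
    """Build a system prompt that shows labeled utterances as few-shot examples.

    examples is a list of (text, intent) pairs; intents is the list of
    allowed intent names.
    """
    lines = [
        "Classify the user's message into exactly one of these intents: "
        + ", ".join(intents)
        + ".",
        "Reply with the intent name only.",
        "",
        "Examples:",
    ]
    for text, intent in examples:
        lines.append(f"User: {text}")
        lines.append(f"Intent: {intent}")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    [("book me for Tuesday", "book_appointment")],
    ["book_appointment", "cancel_order"],
)
print(prompt)
```

Changing the examples changes behavior on the next request, with no retraining step in between.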
Sources
- Google Cloud. Dialogflow CX training data design. cloud.google.com/dialogflow (verified 26 May 2026).
- Rasa documentation. Training data best practices. rasa.com/docs (verified 26 May 2026).