Utterance · NLU concept
An utterance is a single message from a user to a chatbot — typically one to three sentences. In NLU-driven chatbot platforms, the term utterance specifically refers to a training example: a sample message labeled with the intent it represents, used to teach the bot's intent classifier to recognize similar phrasings. Training utterances — for example: I want to cancel, please cancel my order, remove my subscription — teach the bot to recognize the cancel_subscription intent across phrasing variations.
By Chatbotscape Editorial · Published 26 May 2026 · Updated 26 May 2026

Utterance — Definition and Role in Chatbot Training (2026)

Quick answer
An utterance is a user message — a single thing they say to a chatbot. In NLU training, utterances are labeled examples used to teach the bot to recognize intents.

Two meanings

The word "utterance" in the chatbot world has two related-but-distinct meanings:

1. Any user message (general usage)

"The user's utterance was where's my order?" — just a term for "what the user said." Common in conversation analysis and transcripts.

2. Training example (NLU usage)

In platforms like Dialogflow, Rasa, and Microsoft Bot Framework, "utterance" specifically means: a sample message labeled with the intent it represents, used to train the intent classifier.

Example training utterances for the intent book_appointment:

  • "I'd like to schedule a meeting"
  • "can I book an appointment"
  • "set up a time, please"
  • "schedule a call for tomorrow"
  • "book me for Tuesday"

The NLU engine learns from these examples to recognize new utterances ("can we set up a time?") as the same intent.
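
As a rough illustration of how labeled utterances drive intent recognition, the sketch below matches a new message to the most similar training utterance by word overlap (Jaccard similarity). This is a deliberately simple stand-in, not how Dialogflow, Rasa, or any other platform is implemented; production NLU engines generally use statistical or embedding-based models. The names `TRAINING`, `tokenize`, and `classify` are hypothetical.

```python
import re

# Hypothetical labeled training utterances: (text, intent) pairs.
TRAINING = [
    ("I'd like to schedule a meeting", "book_appointment"),
    ("can I book an appointment", "book_appointment"),
    ("set up a time, please", "book_appointment"),
    ("schedule a call for tomorrow", "book_appointment"),
    ("book me for Tuesday", "book_appointment"),
    ("I want to cancel", "cancel_subscription"),
    ("please cancel my order", "cancel_subscription"),
    ("remove my subscription", "cancel_subscription"),
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(message):
    """Return (intent, score) of the training utterance with the highest word overlap."""
    tokens = tokenize(message)
    best_intent, best_score = None, 0.0
    for text, intent in TRAINING:
        example = tokenize(text)
        union = tokens | example
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(tokens & example) / len(union) if union else 0.0
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent, best_score


intent, score = classify("can we set up a time?")
print(intent, round(score, 2))
```

A message that shares no words with any training utterance comes back as `(None, 0.0)`, which is the kind of case a real bot would route to a fallback.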

How many utterances per intent?

Standard guidance:

  • Minimum: 5-10 utterances per intent — barely enough.
  • Typical: 15-30 — covers common phrasing variations.
  • Robust: 30-50 — covers edge cases, formal/casual variants, multilingual variants.

Beyond 50, the marginal benefit of each extra example diminishes. At that point it is usually better to add new intents than to keep adding examples to existing ones.
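
The guidance above can be turned into a quick check over a training set. The sketch below counts utterances per intent and assigns each a tier. The exact cut-offs between tiers (for example, treating 11 to 14 as still "minimum") are an assumption made here to close gaps in the ranges, and `coverage_tier` and `audit` are hypothetical helper names.

```python
from collections import Counter


def coverage_tier(count):
    """Map an utterance count to a tier; boundary choices are assumptions."""
    if count < 5:
        return "below minimum"
    if count < 15:
        return "minimum"
    if count < 30:
        return "typical"
    if count <= 50:
        return "robust"
    return "diminishing returns"


def audit(labeled):
    """Given (text, intent) pairs, return {intent: (count, tier)}."""
    counts = Counter(intent for _, intent in labeled)
    return {intent: (n, coverage_tier(n)) for intent, n in counts.items()}


# Toy data: 3 examples for one intent, 20 for another.
sample = [("example", "cancel_subscription")] * 3 + [("example", "book_appointment")] * 20
for intent, (n, tier) in audit(sample).items():
    print(f"{intent}: {n} utterances ({tier})")
```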

Designing good training utterances

  • Cover phrasing variation. Mix formal ("I would like to schedule") and casual ("book me for tomorrow").
  • Include typos and abbreviations. "whats my order": real users skip apostrophes and abbreviate.
  • Cover edge cases. "Can I get a refund AND cancel?" — multi-intent utterances test boundaries.
  • Match user vocabulary. Mine real chat logs. Synthetic "I would like to initiate a cancellation process" doesn't match how users actually speak.
  • Include multilingual examples. If your bot serves PT-BR and ES users, add training utterances in each language separately.
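
One cheap check on phrasing variation is to flag utterances that differ only in capitalization or punctuation, since they add little for the classifier to learn from. The sketch below is one possible approach, with hypothetical helpers `normalize` and `redundant_utterances`. Typo variants such as "whats" versus "what's" are intentionally kept as distinct, in line with the advice above.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def redundant_utterances(utterances):
    """Return (duplicate, first_seen) pairs for utterances that normalize identically."""
    seen = {}
    redundant = []
    for utterance in utterances:
        key = normalize(utterance)
        if key in seen:
            redundant.append((utterance, seen[key]))
        else:
            seen[key] = utterance
    return redundant


examples = [
    "book me for Tuesday",
    "Book me for Tuesday!",
    "whats my order",
    "what's my order",
]
print(redundant_utterances(examples))
```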

In LLM-driven bots

LLM-powered chatbots typically don't use explicit training utterances. The LLM can interpret most phrasings without labeled examples. Instead, the equivalent techniques are:

  • Few-shot examples in the system prompt — "Here are 3 examples of how to respond to refund requests."
  • System-prompt instructions describing scope and behavior.
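
A minimal sketch of combining the two items above: scope instructions followed by few-shot examples, assembled into a single system prompt string. The `User:` / `Assistant:` layout and the function name `build_system_prompt` are assumptions for illustration; the format a given LLM provider expects may differ.

```python
def build_system_prompt(scope, examples):
    """Join scope instructions and (user_message, reply) example pairs into one prompt."""
    lines = [scope.strip(), "", "Examples:"]
    for user_message, reply in examples:
        lines.append(f"User: {user_message}")
        lines.append(f"Assistant: {reply}")
        lines.append("")  # blank line between examples
    return "\n".join(lines).rstrip()


prompt = build_system_prompt(
    "You are a support assistant. Only answer questions about refunds.",
    [
        ("I want my money back", "I can help with that. What is your order number?"),
        ("can i get a refund", "Of course. Which order is this about?"),
    ],
)
print(prompt)
```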

Some platforms blend both: an explicit NLU classifier with training utterances for structured intents, plus an LLM fallback for open-ended queries.
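
One common way to wire up such a blend is a confidence threshold: use the classifier's intent when it is confident, and hand the message to the LLM otherwise. The sketch below assumes a classifier that returns an `(intent, confidence)` pair and a threshold of 0.5. Both are illustrative choices, and `toy_classify` and `toy_llm` are stand-ins rather than real platform calls.

```python
def route(message, classify, llm_fallback, threshold=0.5):
    """Use the NLU intent when confident; otherwise defer to the LLM."""
    intent, confidence = classify(message)
    if intent is not None and confidence >= threshold:
        return {"handler": "nlu", "intent": intent}
    return {"handler": "llm", "reply": llm_fallback(message)}


# Stand-ins for illustration only; a real bot would call its NLU engine and LLM here.
def toy_classify(message):
    if "cancel" in message.lower():
        return "cancel_subscription", 0.9
    return None, 0.0


def toy_llm(message):
    return f"(LLM reply to: {message})"


print(route("please cancel my order", toy_classify, toy_llm))
print(route("what should I cook tonight?", toy_classify, toy_llm))
```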

FAQ

Is "utterance" the same as "query" or "prompt"?

In NLU contexts, "utterance" specifically means a user's message, labeled or unlabeled. "Query" is broader and also covers search queries. "Prompt" refers to LLM instruction text. The terms overlap but are distinguished in technical usage.

Do I need to update utterances over time?

Yes. Real-world phrasings evolve, new product features get released, and new questions emerge. Audit chat logs monthly and add fresh training utterances for intents with new phrasing patterns.

Can I auto-generate training utterances?

Some platforms offer LLM-driven utterance generation ("generate 20 variations of I want to cancel"). Useful as a starting point, but verify that generated utterances reflect real user speech and not just LLM-style phrasing.

How do I audit training utterances for bias or quality issues?

Review utterances against three lenses: (1) phrasing diversity — does the set cover formal, casual, typo-laden, multilingual variants? (2) demographic representation — does it reflect how your actual customer base speaks, or only the language of the team that wrote it? (3) intent coverage — are any intents over-represented (training imbalance) or under-represented (real users will hit a fallback)? A monthly review of fallback-triggered conversations is the simplest way to surface gaps.
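
Two of these checks are easy to script. The sketch below computes a simple imbalance ratio across intents and lists the messages that most often triggered a fallback. It assumes a log of `(message, predicted_intent)` pairs in which unrecognized messages are labeled `"fallback"`. That log format and both function names are hypothetical.

```python
from collections import Counter


def imbalance_ratio(counts):
    """Largest intent's utterance count divided by the smallest's."""
    return max(counts.values()) / min(counts.values())


def top_fallback_messages(log, n=3):
    """Return the n most frequent messages that ended in a fallback."""
    fallbacks = Counter(
        message.lower().strip() for message, intent in log if intent == "fallback"
    )
    return fallbacks.most_common(n)


training_counts = {"cancel_subscription": 40, "book_appointment": 8}
print(imbalance_ratio(training_counts))

chat_log = [
    ("Where is my parcel", "fallback"),
    ("where is my parcel ", "fallback"),
    ("hi", "greet"),
    ("pause my plan", "fallback"),
]
print(top_fallback_messages(chat_log, n=1))
```

Messages that recur in the fallback list are natural candidates for new training utterances or new intents.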

What's the relationship between utterances and prompts?

Utterances are training data for a classifier — labeled user messages that teach the bot to recognize patterns. Prompts are instructions given to an LLM at runtime that steer its behavior. The two are different layers: utterances build a model's knowledge; prompts shape a model's response. Modern LLM-driven chatbots often replace utterances with system prompts plus few-shot examples — no separate training step needed.

Sources