Token
A token is the fundamental unit of text that large language models process. Tokens are fragments of words, whole words, or punctuation marks that the model reads and generates. In English, one token is roughly 3/4 of a word, so 100 words is approximately 133 tokens.
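The 3/4-word rule of thumb can be turned into a quick back-of-the-envelope estimator (the function name and exact ratio here are illustrative; real counts depend on the tokenizer):

```python
# Rough rule of thumb: 1 token ~ 3/4 of an English word,
# so tokens ~ words * 4/3. Real counts vary by tokenizer and vocabulary.
def estimate_tokens(word_count: int) -> int:
    return round(word_count * 4 / 3)

print(estimate_tokens(100))  # 133
```

Use this only for rough budgeting; provider tokenizer tools give exact counts.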
How it works
LLMs do not process text as characters or words; they use tokens. A tokenizer splits input text into tokens based on patterns learned from training data. Common words like "the" or "hello" are single tokens, while uncommon words are split into multiple tokens ("tokenization" might become "token" + "ization").
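The splitting step can be sketched with a greedy longest-match tokenizer over a tiny hand-made vocabulary. This is a simplification for illustration only: real tokenizers (such as BPE) learn vocabularies of tens of thousands of entries from training data.

```python
# Toy vocabulary; real tokenizers learn theirs from training data.
VOCAB = {"token", "ization", "izer", "hello", "the", "ing", "a"}

def tokenize(word: str) -> list[str]:
    """Greedily take the longest vocabulary entry at each position."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown: fall back to a single character
            i += 1
    return tokens

print(tokenize("tokenization"))  # ['token', 'ization']
print(tokenize("hello"))         # ['hello']
```

Note how the common word "hello" stays whole while the rarer "tokenization" splits into two subword pieces, exactly the behavior described above.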
Tokens matter for three practical reasons:
1. **Pricing**: LLM APIs charge per token (input + output). More tokens = higher cost.
2. **Context window**: Each model has a maximum token limit for the combined input and output. Exceeding it means truncating context.
3. **Latency**: More output tokens = longer response time, since LLMs generate one token at a time.
For a typical customer support interaction: the system prompt uses 200-500 tokens, RAG context uses 500-2,000 tokens, the customer question uses 20-100 tokens, and the AI response uses 100-500 tokens.
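The budget above can be sanity-checked against a model's context window. The window size and the per-component numbers here are assumptions (upper ends of the ranges quoted above):

```python
# Assumed context window; varies widely by model.
CONTEXT_WINDOW = 8_000

# Upper end of each range from the support-interaction breakdown above.
budget = {
    "system_prompt": 500,
    "rag_context": 2_000,
    "question": 100,
    "response": 500,
}

total = sum(budget.values())
print(total, total <= CONTEXT_WINDOW)  # 3100 True
```

Even at the upper end, this flow fits comfortably, leaving headroom for conversation history.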
Why it matters
How Chatsy uses tokens
Real-world examples
Key takeaways
Frequently asked questions
How many tokens are in a typical sentence?
An average English sentence of 15-20 words uses approximately 20-27 tokens. Exact count varies by vocabulary — common words use fewer tokens while technical or uncommon words use more. Most LLM providers offer free tokenizer tools to check exact counts.
Why do LLMs use tokens instead of words?
Tokens provide a balance between character-level processing (too granular, very slow) and word-level processing (too many unique words to handle efficiently). Tokenization reduces the vocabulary to 50,000-100,000 tokens that can represent any text efficiently, including code, numbers, and multiple languages.
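The vocabularies described above are typically learned with byte-pair encoding (BPE). A minimal sketch of one BPE training step, on invented toy data: count adjacent symbol pairs across the corpus, then merge the most frequent pair into a single token. Repeating this grows the subword vocabulary.

```python
from collections import Counter

def most_frequent_pair(words: dict) -> tuple:
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge(words: dict, pair: tuple) -> dict:
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words as character tuples with invented frequencies.
words = {tuple("lower"): 5, tuple("local"): 3, tuple("low"): 7}
pair = most_frequent_pair(words)  # ('l', 'o') is the most frequent pair
words = merge(words, pair)
print(pair, words)
```

After the merge, "lo" is a single symbol in every word, and the next iteration would count pairs like ("lo", "w"). Production tokenizers run this loop until the vocabulary reaches its target size (e.g. 50,000-100,000 entries).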
How do tokens affect chatbot pricing?
LLM APIs charge per 1,000 tokens (input and output separately). Input tokens (your prompt + context) are cheaper than output tokens (the AI response). A typical support conversation costs $0.003-$0.01 in token fees. Platforms like Chatsy bundle token costs into conversation-based pricing for simpler budgeting.
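A sketch of the per-conversation arithmetic, using illustrative rates (actual prices vary by provider and model; the only structural assumption is that output tokens cost more than input tokens, as noted above):

```python
# Assumed per-1,000-token rates; real prices vary by provider and model.
INPUT_RATE = 0.001   # $/1k input tokens
OUTPUT_RATE = 0.002  # $/1k output tokens (output priced higher than input)

def conversation_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000 * INPUT_RATE
            + output_tokens / 1000 * OUTPUT_RATE)

# e.g. 2,600 input tokens (system prompt + RAG context + question)
# and 400 output tokens:
print(round(conversation_cost(2_600, 400), 4))  # 0.0034
```

At these assumed rates the conversation lands in the $0.003-$0.01 range quoted above.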
Do different languages use different numbers of tokens?
Yes. English is typically the most token-efficient language because LLMs are primarily trained on English text. Languages using non-Latin scripts (Chinese, Japanese, Korean, Arabic) can use 2-3x more tokens for the same semantic content, which increases costs for multilingual deployments.