Chatsy
Glossary

Embedding

An embedding is a dense numerical vector (array of numbers) that represents the semantic meaning of a piece of text. Embedding models convert words, sentences, or documents into fixed-size vectors in high-dimensional space, where semantically similar texts are positioned close together.

How it works

Embedding models (like OpenAI text-embedding-3, Cohere embed, or open-source models) convert text into vectors of 256-3072 dimensions. For example, "How do I cancel?" and "I want to end my subscription" would produce vectors that are close together in the embedding space because they have similar meaning.

Embeddings enable semantic search: instead of matching keywords, you compare the mathematical similarity (typically cosine similarity) between the query embedding and document embeddings. The closest vectors represent the most semantically relevant content.
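The comparison above can be sketched in a few lines of Python. The vectors here are tiny hand-made stand-ins for real embeddings (which have hundreds or thousands of dimensions), chosen only to illustrate how cosine similarity ranks semantically related text higher:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Imagine these came from an embedding model (values are illustrative):
cancel  = [0.8, 0.1, 0.6, 0.0]    # "How do I cancel?"
end_sub = [0.7, 0.2, 0.65, 0.05]  # "I want to end my subscription"
pricing = [0.1, 0.9, 0.0, 0.4]    # "What does it cost?"

print(cosine_similarity(cancel, end_sub))  # high, close to 1.0
print(cosine_similarity(cancel, pricing))  # noticeably lower
```

The two cancellation phrasings score far higher against each other than against the pricing question, even though they share almost no words.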

Why it matters

Embeddings are the foundation of modern AI search and RAG systems. They enable chatbots to find relevant knowledge base content even when the customer uses completely different words than the documentation. Without embeddings, AI chatbots would be limited to keyword matching, missing most relevant content.

How Chatsy uses embeddings

Chatsy generates embeddings for all knowledge base content, documentation, and training data. When a customer asks a question, the query is embedded and compared against all content embeddings using pgvector to find the most semantically relevant passages. These passages are then provided to the language model for answer generation.
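In pgvector terms, this kind of retrieval looks roughly like the sketch below. The table and column names are illustrative, not Chatsy's actual schema, and the dimension must match whatever embedding model is in use:

```sql
-- Hypothetical schema sketch; names and dimensions are assumptions.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE kb_chunks (
    id        bigserial PRIMARY KEY,
    article   text,
    content   text,
    embedding vector(1536)  -- must match the embedding model's output size
);

-- <=> is pgvector's cosine-distance operator, so ordering ascending
-- returns the most semantically similar chunks first.
SELECT content
FROM kb_chunks
ORDER BY embedding <=> $1  -- $1 = the embedded customer query
LIMIT 5;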

Real-world examples

Knowledge base article indexing

When you publish a help article about "Managing team permissions," Chatsy generates embeddings for each section. Later, a customer asking "how do I give my colleague admin access" matches those embeddings despite zero keyword overlap — because the semantic meaning is the same.

Cross-language semantic matching

A multilingual embedding model encodes both "politique de remboursement" (French) and "refund policy" (English) to nearby vectors. This enables a single knowledge base to serve customers in multiple languages without duplicating content.

Chunking strategy for long documents

A 5,000-word product guide is split into 20 overlapping chunks of 300 words each, and each chunk is embedded separately. When a customer asks about a specific feature, only the relevant 1-2 chunks are retrieved — not the entire document.
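A minimal word-based chunker matching the numbers above might look like this. The 300-word size and 50-word overlap are illustrative; production systems often chunk by tokens, sentences, or document structure instead of raw word counts:

```python
def chunk_words(text, chunk_size=300, overlap=50):
    """Split text into overlapping word-based chunks."""
    words = text.split()
    stride = chunk_size - overlap  # step forward less than a full chunk
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final chunk has consumed the rest of the document
    return chunks

# A 5,000-word document with 300-word chunks and a 50-word overlap
# (stride of 250 words) yields 20 chunks.
doc = " ".join(f"word{i}" for i in range(5000))
print(len(chunk_words(doc)))  # 20
```

The overlap matters: it keeps sentences that straddle a chunk boundary fully present in at least one chunk, so retrieval never returns a passage cut off mid-thought.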

Key takeaways

  • Embeddings convert text into numerical vectors where semantically similar content clusters together

  • Embedding models produce vectors of 256-3072 dimensions depending on the model

  • Cosine similarity between embedding vectors measures how semantically related two texts are

  • Embedding quality directly determines RAG retrieval accuracy — better models produce better search results

  • Embeddings are generated once at indexing time and stored in vector databases for fast retrieval

Frequently asked questions

What is the difference between an embedding and a keyword?

A keyword is an exact text string. An embedding is a numerical representation of meaning. Keywords match literally ("pricing" only finds "pricing"). Embeddings match semantically ("pricing" also finds "cost," "fees," "how much," etc.).
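That contrast can be made concrete with a toy comparison. The vectors below are hand-made stand-ins for real embeddings, chosen only to show the behavior described above:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

query = "how much does it cost"

# Keyword matching: a literal check for "pricing" finds nothing in this query.
print("pricing" in query)  # False

# Semantic matching: illustrative embeddings place the query near "pricing"
# and far from an unrelated topic.
query_vec   = [0.20, 0.90, 0.10]  # "how much does it cost"
pricing_vec = [0.25, 0.85, 0.15]  # "pricing"
returns_vec = [0.90, 0.10, 0.30]  # "return shipping label"
print(cosine(query_vec, pricing_vec) > cosine(query_vec, returns_vec))  # True
```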

How are embeddings stored?

Embeddings are stored in vector databases or vector-enabled databases. Chatsy uses pgvector (a PostgreSQL extension) to store embeddings alongside structured data in the same database, simplifying the architecture.

Do embeddings need to be regenerated when I update content?

Yes. When you edit a knowledge base article, the embeddings for the changed sections need to be regenerated to reflect the updated content. Platforms like Chatsy handle this automatically — updating an article triggers re-embedding within seconds.

What embedding model should I use?

For most customer support use cases, OpenAI text-embedding-3-small offers the best balance of quality and cost. For higher accuracy on complex content, text-embedding-3-large is recommended. Cohere embed and open-source models like BGE are strong alternatives.
