Vector Search Explained: How AI Chatbots Find Answers
Vector search powers modern AI chatbots. Learn how it works, why it's better than keyword search, and how it helps chatbots understand what you really mean.

When you ask an AI chatbot "How do I cancel?", how does it know you mean your subscription and not a meeting? The answer is vector search, a technology that understands meaning, not just keywords.
The Problem with Keyword Search
Traditional search matches words. Ask "How do I cancel?" and it looks for documents containing "cancel."
But what if your help docs say "terminate subscription" or "end your plan"? Keyword search misses these completely.
| You Ask | Doc Contains | Keyword Match? |
|---|---|---|
| "cancel" | "cancel" | Yes |
| "cancel" | "terminate" | No |
| "cancel" | "end subscription" | No |
| "cancel my plan" | "how to cancel" | Yes |
| "stop my subscription" | "cancel plan" | No |
This is why old chatbots felt so frustrating: slight wording differences broke everything.
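To make the table concrete, here is a minimal sketch of naive keyword matching. The `keyword_match` helper is a made-up illustration (real keyword engines add stemming, ranking, and more), but it reproduces the hits and misses above.

```python
def keyword_match(query: str, doc: str) -> bool:
    """Return True if any query word appears verbatim in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return bool(query_words & doc_words)


# The same (question, document) pairs as the table above.
pairs = [
    ("cancel", "cancel"),
    ("cancel", "terminate"),
    ("cancel", "end subscription"),
    ("cancel my plan", "how to cancel"),
    ("stop my subscription", "cancel plan"),
]

results = [keyword_match(query, doc) for query, doc in pairs]
for (query, doc), hit in zip(pairs, results):
    print(f"{query!r} vs {doc!r}: {'match' if hit else 'no match'}")
```

Three of the five pairs mean the same thing but share no words, so a plain word match never finds them.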
Enter Vector Search
Vector search converts text into embeddings: numerical representations that capture meaning.
How Embeddings Work
Text → AI Model → Vector (list of numbers)
"How do I cancel?" → [0.23, -0.45, 0.67, 0.12, ...]
"Terminate my subscription" → [0.21, -0.43, 0.69, 0.14, ...]
"What's the weather?" → [-0.56, 0.34, -0.12, 0.78, ...]
Notice how similar meanings have similar numbers? That's the magic.
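You can check this with a few lines of code. The sketch below measures cosine similarity using only the four numbers shown for each sentence. That is an assumption for illustration: real embeddings have hundreds or thousands of dimensions, and these values are not output from an actual model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Illustrative 4-number vectors copied from the example above.
cancel = [0.23, -0.45, 0.67, 0.12]      # "How do I cancel?"
terminate = [0.21, -0.43, 0.69, 0.14]   # "Terminate my subscription"
weather = [-0.56, 0.34, -0.12, 0.78]    # "What's the weather?"

sim_cancel_terminate = cosine_similarity(cancel, terminate)
sim_cancel_weather = cosine_similarity(cancel, weather)

print(f"cancel vs terminate: {sim_cancel_terminate:.3f}")
print(f"cancel vs weather:   {sim_cancel_weather:.3f}")
```

The first score comes out very close to 1 and the second is negative, which is what "similar meanings have similar numbers" looks like in practice.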
Similarity Search
To find relevant content:
- Convert the question to a vector
- Compare it to all document vectors
- Return documents with the closest vectors
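The three steps above can be sketched as a brute-force search. The document titles and vectors here are made-up toy values, and `top_k` is a hypothetical helper; a real system would get vectors from an embedding model and use an index instead of scanning every document.

```python
import math


def normalize(vec):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, doc_vecs, k=2):
    """Compare the query to every document vector and return the k closest."""
    q = normalize(query_vec)
    scored = []
    for title, vec in doc_vecs.items():
        d = normalize(vec)
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, title))
    scored.sort(reverse=True)  # highest similarity first
    return [(title, round(score, 2)) for score, title in scored[:k]]


# Toy vectors for illustration only (not real model output).
doc_vecs = {
    "How to cancel your subscription": [0.21, -0.43, 0.69, 0.14],
    "Refund policy": [0.40, -0.10, 0.50, 0.30],
    "Weather FAQ": [-0.56, 0.34, -0.12, 0.78],
}
query_vec = [0.23, -0.45, 0.67, 0.12]  # stands in for "How do I cancel?"

for title, score in top_k(query_vec, doc_vecs, k=2):
    print(f"{title} (similarity: {score})")
```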
Question Vector: [0.23, -0.45, 0.67, ...]
↓
Compare to all document vectors
↓
Return: "How to cancel your subscription" (similarity: 0.94)
        "Ending your plan early" (similarity: 0.89)
        "Refund policy" (similarity: 0.72)
Why It's Better
Understands Synonyms
"cancel," "terminate," "end," and "stop" cluster together in vector space.
Handles Paraphrasing
"I want my money back" finds "refund policy" even without shared words.
Language-Agnostic
Modern multilingual embedding models place different languages in the same vector space, so a Spanish question can match English docs.
Typo-Tolerant
"How do I cancle my subcription" usually still works, because the misspelled question lands close to the correctly spelled one in vector space.
The Technical Details
Embedding Models
Popular models for text embeddings:
| Model | Dimensions | Best For |
|---|---|---|
| OpenAI text-embedding-3-large | 3072 | General purpose |
| Cohere embed-v3 | 1024 | Multilingual |
| Voyage-3 | 1024 | Long documents |
| BGE-large | 1024 | Open source option |
Higher dimensions can capture more nuance, but they take more storage and make search slower.
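To get a feel for the cost side of that trade-off, here is a rough back-of-the-envelope sketch. It assumes each number is stored as a 4-byte float and ignores index overhead, so treat the results as a lower bound rather than a real sizing estimate. The `raw_vector_storage_gb` helper is made up for this example.

```python
def raw_vector_storage_gb(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, in decimal gigabytes.

    Assumes 4-byte floats by default; index structures add more on top.
    """
    return num_vectors * dimensions * bytes_per_value / 1e9


one_million = 1_000_000
size_3072 = raw_vector_storage_gb(one_million, 3072)
size_1024 = raw_vector_storage_gb(one_million, 1024)

print(f"1M vectors at 3072 dims: {size_3072:.3f} GB")
print(f"1M vectors at 1024 dims: {size_1024:.3f} GB")
```

Under these assumptions, tripling the dimensions triples the storage, and every similarity comparison has three times as many numbers to multiply.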
Vector Databases
Storing and searching vectors requires specialized databases:
- Pinecone: Managed, easy to use
- Weaviate: Open source, feature-rich
- pgvector: PostgreSQL extension
- Qdrant: High performance, open source
At Chatsy, we use pgvector for reliability and cost-effectiveness.
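For a sense of what pgvector looks like in practice, here is a small sketch that builds the SQL as Python strings. The table name `doc_chunks`, its columns, and the 1536 dimension are assumptions for illustration, not Chatsy's actual schema. The `vector` column type and the `<=>` cosine-distance operator come from pgvector itself.

```python
# Hypothetical schema: one row per document chunk.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS doc_chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(1536)
);
"""

# <=> is pgvector's cosine distance, so 1 - distance is cosine similarity.
SEARCH_SQL = """
SELECT content, 1 - (embedding <=> %s::vector) AS similarity
FROM doc_chunks
ORDER BY embedding <=> %s::vector
LIMIT 5;
"""


def to_vector_literal(vec):
    """Format a list of floats as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(x) for x in vec) + "]"


literal = to_vector_literal([0.23, -0.45, 0.67])
print(literal)
```

With a PostgreSQL driver you would pass the literal twice as the query parameters for `SEARCH_SQL`. The exact call depends on the driver you choose.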
Chunking Strategy
Before embedding, documents are split into chunks:
Full Document (5000 words)
↓
Chunk 1 (500 words): "Subscription Management..."
Chunk 2 (500 words): "Cancellation Policy..."
Chunk 3 (500 words): "Refund Process..."
Why chunk?
- LLMs have context limits
- Smaller chunks = more precise matches
- Better relevance scoring
A good starting chunk size: 500-1000 tokens with a 100-200 token overlap.
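Here is a minimal sketch of fixed-size chunking with overlap. It works on a list of tokens and uses placeholder strings as stand-ins; a real pipeline would use the embedding model's own tokenizer and would probably respect headings and paragraph boundaries too. The `chunk_tokens` name is made up for this example.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=100):
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reached the end of the document
    return chunks


# Placeholder tokens standing in for a 1200-token document.
tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens, chunk_size=500, overlap=100)

for i, chunk in enumerate(chunks, start=1):
    print(f"Chunk {i}: {len(chunk)} tokens, starts at {chunk[0]}")
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk.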
Limitations of Pure Vector Search
Vector search isn't perfect:
1. Exact Match Failures
"Error code E-1234" might not match "E-1234 error" well because embeddings focus on semantic meaning, not exact strings.
2. Rare Terms
Uncommon product names or technical terms may not embed well.
3. Negation Confusion
"I don't want to cancel" and "I want to cancel" have similar embeddings despite opposite meanings.
The Solution: Hybrid Search
Combine vector and keyword search:
            User Question
                  ↓
┌─────────────────┬─────────────────┐
│  Vector Search  │ Keyword Search  │
│    (meaning)    │  (exact terms)  │
└─────────────────┴─────────────────┘
         ↓                 ↓
         └────────┬────────┘
                  ↓
          Combine & Rerank
                  ↓
            Final Results
This gets the best of both worlds: semantic understanding and exact matching.
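One simple way to combine the two signals is a weighted blend of a vector score and a keyword score. The sketch below is an assumption for illustration, not necessarily how any production system does it. The vector scores are made-up numbers, and `keyword_score` and `hybrid_rank` are hypothetical helpers.

```python
def keyword_score(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    doc_words = set(doc.lower().split())
    return sum(term in doc_words for term in terms) / len(terms)


def hybrid_rank(query, vector_scores, alpha=0.5):
    """Blend vector and keyword scores; alpha is the weight on the vector side."""
    blended = {
        doc: alpha * vec_score + (1 - alpha) * keyword_score(query, doc)
        for doc, vec_score in vector_scores.items()
    }
    return sorted(blended, key=blended.get, reverse=True)


# Made-up vector scores: the embedding slightly prefers the wrong error code.
vector_scores = {
    "Troubleshooting E-5678": 0.72,
    "Troubleshooting E-1234": 0.70,
}

ranking = hybrid_rank("error E-1234", vector_scores)
print(ranking[0])
```

Vector search alone would put the wrong error code first here. The exact-term signal is what moves the right document to the top.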
How Chatsy Uses Vector Search
Our retrieval pipeline:
- Query Expansion: Generate synonyms and related queries
- Hybrid Search: Vector + keyword across all queries
- Reciprocal Rank Fusion: Combine results intelligently
- Reranking: Use a cross-encoder for final relevance scoring
- Context Assembly: Select best chunks for the LLM
This multi-stage approach delivers 94%+ relevant answer rates.
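Reciprocal Rank Fusion is the easiest stage to show in code. The sketch below is a generic version of the technique, not Chatsy's production code: each document earns 1 / (k + rank) from every result list it appears in, and the totals decide the final order. The constant k = 60 is a commonly used default, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two searches, best match first.
vector_hits = ["cancel-guide", "plan-changes", "refund-policy"]
keyword_hits = ["refund-policy", "cancel-guide", "pricing"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because it only looks at ranks, this method never has to reconcile the different score scales of vector and keyword search.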
Implementing Vector Search
Basic Implementation (Python)
```python
from openai import OpenAI
import numpy as np

client = OpenAI()

def embed(text):
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Embed your documents
docs = ["How to cancel subscription", "Refund policy", "Pricing plans"]
doc_embeddings = [embed(doc) for doc in docs]

# Search
query = "I want my money back"
query_embedding = embed(query)

# Find most similar
similarities = [cosine_similarity(query_embedding, doc_emb) for doc_emb in doc_embeddings]
best_match = docs[np.argmax(similarities)]
print(best_match)  # "Refund policy"
```
Production Considerations
For real applications:
- Use a vector database (not in-memory arrays)
- Implement caching for common queries
- Add hybrid search for exact matches
- Use async operations for speed
- Monitor embedding costs
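Caching is the cheapest of these to try. Here is a minimal in-process sketch using Python's `functools.lru_cache`. The `fake_embed` function is a made-up stand-in for a paid embedding API call, and a real deployment would more likely use a shared cache than a per-process one.

```python
from functools import lru_cache

stats = {"api_calls": 0}  # counts how often the "paid" call actually runs


def fake_embed(text):
    """Hypothetical stand-in for an embedding API call (not a real model)."""
    stats["api_calls"] += 1
    return tuple(float(ord(ch) % 7) for ch in text[:4])


@lru_cache(maxsize=10_000)
def cached_embed(normalized_query):
    """Embed a query once; repeat lookups are served from memory."""
    return fake_embed(normalized_query)


def embed_query(query):
    """Normalize case and whitespace so trivial variants share a cache entry."""
    return cached_embed(" ".join(query.lower().split()))


first = embed_query("How do I cancel?")
second = embed_query("  how do I   CANCEL? ")
print(stats["api_calls"], first == second)
```

Normalizing before the lookup is a design choice: it raises the hit rate, but it also means two queries that differ only in casing always get the same embedding.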
Key Takeaways
- Vector search understands meaning, not just words
- Embeddings are numerical representations of text
- Similar meanings = similar vectors
- Hybrid search combines vector and keyword approaches
- Chunking matters for accuracy
Try It Today
Chatsy handles all this complexity for you. Upload your docs, and we automatically:
- Chunk content optimally
- Generate embeddings
- Enable hybrid search
- Apply query expansion
- Rerank results
Want more? Read about hybrid search and query expansion.