Vector Search
Vector search is a method of finding information based on semantic meaning rather than exact keyword matches. It works by converting text into numerical representations (vectors, also called embeddings) and finding the most similar vectors in a database. It is also called semantic search or similarity search.
How it works
Traditional keyword search requires exact word matches: searching for "pricing" will not find a document about "cost" or "fees." Vector search recognizes that these words are semantically related. It converts both the query and all documents into high-dimensional vectors using an embedding model, then finds the documents whose vectors are closest to the query vector.
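As a rough illustration of "closest vectors," the sketch below scores toy vectors with cosine similarity, one common closeness measure. The three-dimensional vectors are hand-written stand-ins, not output from any real embedding model, and the names `TOY_EMBEDDINGS` and `rank_by_similarity` are made up for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written 3-dimensional stand-ins for real embeddings (illustrative only).
# Real embedding models produce hundreds or thousands of dimensions.
TOY_EMBEDDINGS = {
    "pricing": [0.9, 0.1, 0.0],
    "cost": [0.8, 0.2, 0.1],
    "password reset": [0.0, 0.1, 0.9],
}

def rank_by_similarity(query, embeddings=TOY_EMBEDDINGS):
    """Return the other entries sorted from most to least similar to the query."""
    query_vec = embeddings[query]
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in embeddings.items()
        if text != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank_by_similarity("pricing")
```

With these toy values, "cost" ranks above "password reset" for the query "pricing" even though the two words share no characters, which is the behavior the paragraph above describes.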
Vector databases like pgvector, Pinecone, and Weaviate store these embeddings and perform fast similarity searches. The quality of vector search depends on the embedding model used: better models create more accurate semantic representations.
Operational review
In practice, vector search should be evaluated by what it changes in the support workflow. Ask whether it improves answer accuracy, reduces repeated agent work, clarifies handoff decisions, or makes reporting easier. If the answer is only "it sounds modern," the concept is not yet operational.
A concrete example is synonym matching in support queries: a customer searches for "refund policy" but the help article is titled "Returns and money-back guarantee." Vector search finds the article because both phrases share semantic meaning, whereas a keyword search matching on those exact words would find nothing in the title.
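The keyword side of this example can be checked in a few lines. This is a minimal sketch of naive exact-token matching, not how any particular search engine works; real keyword engines add stemming and ranking, and the helper names `tokens` and `shared_keywords` are hypothetical.

```python
import re

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def shared_keywords(query, title):
    """Return the words that appear in both the query and the title."""
    return tokens(query) & tokens(title)

query = "refund policy"
title = "Returns and money-back guarantee"

# The two phrases have no words in common, so exact-token matching
# has nothing to match on.
overlap = shared_keywords(query, title)
```

Here `overlap` is an empty set, which is the vocabulary gap that embedding-based matching is meant to bridge.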
The simplest takeaway: vector search matches by meaning rather than exact keywords, bridging the vocabulary gap between customers and documentation.
When vector search does not apply
- You have fewer than 1,000 documents. Keyword search is fine.
- Your queries are exact-match (product SKUs, IDs). Use traditional indexing.
- Your content updates several times per day and re-embedding cost outweighs the recall gain.
Frequently asked questions
How is vector search different from keyword search?
Keyword search matches exact words. Vector search matches meaning. Searching for "cancel my account" with keyword search only finds documents containing those exact words. Vector search also finds documents about "close account," "delete profile," or "end subscription" because they have similar meaning.
What is a vector database?
A vector database is specialized storage for embedding vectors that supports fast similarity search. Examples include pgvector (PostgreSQL extension), Pinecone, Weaviate, and Qdrant. Chatsy uses pgvector to keep everything in PostgreSQL.
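For a sense of what a pgvector similarity query can look like, here is a hedged sketch that only builds the SQL and its parameters. The table `kb_chunks` and its columns `id`, `content`, and `embedding` are hypothetical, since Chatsy's actual schema is not described here. The `<=>` operator is pgvector's cosine distance operator.

```python
def to_pgvector_literal(embedding):
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

def build_similarity_query(query_embedding, limit=5):
    """Build a parameterized nearest-neighbor query (hypothetical schema).

    Returns (sql, params) suitable for a driver that uses %s placeholders.
    Smaller cosine distance means more similar, so results are ordered
    ascending by distance.
    """
    sql = (
        "SELECT id, content, embedding <=> %s::vector AS distance "
        "FROM kb_chunks "  # hypothetical table name
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    literal = to_pgvector_literal(query_embedding)
    return sql, (literal, literal, limit)

sql, params = build_similarity_query([0.1, 0.2, 0.3], limit=3)
```

With a driver such as psycopg, the pair would typically be passed as `cursor.execute(sql, params)`. Keeping the distance expression in the `ORDER BY` clause is what lets PostgreSQL consider a vector index for the query.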
How fast is vector search compared to keyword search?
Vector search with approximate nearest-neighbor indexes (HNSW or IVFFlat) typically returns results in 5-50 milliseconds for databases with millions of vectors, depending on hardware and index settings. This is comparable to keyword search and fast enough for real-time chatbot responses.
Does vector search work with short queries like one or two words?
Short queries produce less precise vectors because there is less semantic context to encode. For single-word queries like "pricing," keyword search often outperforms vector search, which is why hybrid search combining both methods is recommended.
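One common way to combine the two result lists is reciprocal rank fusion (RRF). The sketch below is an illustration of that technique under assumed inputs, not a description of how Chatsy or any specific product merges results. The document IDs are invented, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Documents ranked well by several methods
    accumulate the highest totals.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Invented result lists for the query "pricing", best match first.
keyword_hits = ["pricing-page", "billing-faq", "plans-overview"]
vector_hits = ["plans-overview", "pricing-page", "refund-policy"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

In this example "pricing-page" comes out first because both methods rank it highly, followed by "plans-overview", while documents found by only one method fall to the bottom. RRF uses only rank positions, so keyword scores and vector distances never need to be put on a shared scale.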
What is vector search in simple terms?
Vector search is a way of finding "things that mean roughly the same thing" instead of "things that contain the exact same words." It turns each piece of text into a list of numbers (a vector) so the computer can measure how close two ideas are mathematically, even when the words are different.
What is the difference between Elasticsearch and vector search?
Elasticsearch is a search engine traditionally built around inverted-index keyword search (BM25). Vector search uses embeddings and similarity scoring instead. Modern Elasticsearch and OpenSearch now support vector search natively, so the practical question is usually keyword vs hybrid vs pure vector, not Elasticsearch vs vector.
What is a vector lookup?
A vector lookup is a single similarity query against a vector database: you give it a query embedding and it returns the top-K most similar stored embeddings. In a chatbot pipeline, each customer message triggers one or more vector lookups to find the most relevant knowledge base passages before the LLM generates a response.
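A top-K lookup can be sketched as a brute-force scan over stored embeddings. This is for illustration only: production vector databases use approximate indexes rather than scoring every row, and the two-dimensional vectors and document IDs below are invented.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def vector_lookup(query_embedding, stored, k=3):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = (
        (doc_id, cosine_similarity(query_embedding, vec))
        for doc_id, vec in stored.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Invented 2-dimensional embeddings standing in for knowledge base passages.
STORED = {
    "close-account": [0.9, 0.1],
    "delete-profile": [0.8, 0.3],
    "shipping-times": [0.1, 0.9],
}

top = vector_lookup([1.0, 0.0], STORED, k=2)
top_ids = [doc_id for doc_id, _ in top]
```

With these values the lookup returns "close-account" and then "delete-profile", and leaves out "shipping-times". In a chatbot pipeline, the passages behind the returned IDs would be the ones handed to the LLM as context.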