
Hybrid Search

Hybrid search is a retrieval method that combines semantic search (vector/embedding-based) with lexical search (keyword/BM25-based) to find relevant information. By merging both approaches, hybrid search achieves higher accuracy than either method alone.

How it works

Semantic search excels at understanding meaning but can miss specific terms, product names, and exact phrases. Keyword search (BM25) excels at matching specific terms but misses paraphrased content. Hybrid search combines both:

1. **Semantic search**: Finds content with similar meaning to the query
2. **BM25 keyword search**: Finds content containing the exact terms
3. **Reciprocal Rank Fusion (RRF)**: Merges and re-ranks both result sets

This means searching for "Chatsy Pro plan pricing" would find documents about "Chatsy Pro subscription cost" (semantic) AND documents containing the exact term "Pro plan" (keyword). Neither search alone would find both.
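As a rough illustration of the merge step, the sketch below fuses two ranked lists for the example query with Reciprocal Rank Fusion. The document ids, both result lists, and the function name `rrf_fuse` are hypothetical, and k = 60 is a commonly used default rather than a value taken from Chatsy.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked results for the query "Chatsy Pro plan pricing".
semantic_hits = ["pro-subscription-cost", "billing-overview", "pro-plan-features"]
keyword_hits = ["pro-plan-features", "pro-plan-faq", "billing-overview"]

fused = rrf_fuse([semantic_hits, keyword_hits])
print(fused)
# ['pro-plan-features', 'billing-overview', 'pro-subscription-cost', 'pro-plan-faq']
```

In this made-up data, `pro-plan-features` comes out on top because it appears in both lists, while documents found by only one method still make it into the fused results.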

Why it matters

For AI chatbots, retrieval accuracy directly determines answer quality. If the search step misses relevant content, the AI cannot generate a correct answer, no matter how good the language model is. Hybrid search is widely regarded as best practice for retrieval-augmented generation (RAG) systems because it raises recall, typically with little or no loss of precision.

How Chatsy uses hybrid search

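Chatsy uses hybrid search as its core retrieval engine. Customer questions are searched against the knowledge base using both pgvector semantic search and PostgreSQL full-text search (BM25). Note that PostgreSQL's built-in ranking functions, ts_rank and ts_rank_cd, are not BM25 implementations, so BM25 scoring on PostgreSQL typically comes from an extension or from application code. Results are merged using Reciprocal Rank Fusion to provide the most relevant passages to the AI for answer generation.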
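To make the pgvector plus full-text combination more concrete, here is a minimal sketch of how such a query could be written. It is not Chatsy's actual implementation: the `chunks` table, its columns, the parameter names, and the helper functions are all assumptions made for illustration. It uses PostgreSQL's built-in `ts_rank_cd` as a stand-in for BM25 and favors readability over index-friendly query planning.

```python
# Assumed schema (hypothetical, not Chatsy's actual tables):
#   chunks(id bigint, content text, embedding vector, tsv tsvector)
HYBRID_SQL = """
WITH semantic AS (
    -- pgvector: <=> is cosine distance, so smaller means more similar
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY embedding <=> %(query_vec)s::vector) AS rnk
    FROM chunks
    ORDER BY embedding <=> %(query_vec)s::vector
    LIMIT %(top_n)s
),
keyword AS (
    -- ts_rank_cd is PostgreSQL's built-in ranking, used here as a stand-in for BM25
    SELECT id,
           ROW_NUMBER() OVER (
               ORDER BY ts_rank_cd(tsv, plainto_tsquery('english', %(query_text)s)) DESC
           ) AS rnk
    FROM chunks
    WHERE tsv @@ plainto_tsquery('english', %(query_text)s)
    ORDER BY rnk
    LIMIT %(top_n)s
)
-- Reciprocal Rank Fusion: sum 1 / (k + rank) over the lists containing each id
SELECT COALESCE(s.id, k.id) AS id,
       COALESCE(1.0 / (%(rrf_k)s + s.rnk), 0.0)
     + COALESCE(1.0 / (%(rrf_k)s + k.rnk), 0.0) AS rrf_score
FROM semantic s
FULL OUTER JOIN keyword k ON s.id = k.id
ORDER BY rrf_score DESC
LIMIT %(final_n)s;
"""


def to_pgvector_literal(values):
    """Format numbers in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_params(query_text, query_embedding, top_n=20, final_n=5, rrf_k=60):
    """Collect the named parameters that HYBRID_SQL expects."""
    return {
        "query_text": query_text,
        "query_vec": to_pgvector_literal(query_embedding),
        "top_n": top_n,
        "final_n": final_n,
        "rrf_k": rrf_k,
    }
```

The query string and parameter dictionary would be handed to a PostgreSQL driver that supports named `%(name)s` placeholders, such as psycopg. The query embedding itself would come from whatever embedding model the knowledge base was indexed with.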

Real-world examples

Product name + intent matching

A customer asks "how do I set up Chatsy Pro webhooks?" Keyword search finds documents containing "Chatsy Pro" and "webhooks" (exact terms). Semantic search finds documents about "configuring event notifications" (meaning). Hybrid search returns both, and documents that match on both exact terms and meaning rank highest.

Technical jargon handling

A developer asks about "CORS errors on the REST API." Keyword search catches the exact technical terms (CORS, REST API). Semantic search also finds related articles about "cross-origin request configuration" and "API access control." The combined results cover both exact and related content.

Misspelled query recovery

A customer types "refud polcy" (misspelled). Exact keyword search fails because no documents contain those misspellings. Semantic search can often still match the query to the "Refund Policy" article because embedding models tend to tolerate small typos. Hybrid search recovers from the keyword failure.

Key takeaways

  • Hybrid search combines semantic vector search with BM25 keyword search for higher accuracy than either method alone

  • Reciprocal Rank Fusion (RRF) merges and re-ranks results from both search methods

  • Hybrid search is reported to improve recall by 10-30% compared to vector search alone

  • Keyword search catches exact terms and product names that semantic search can miss

  • The additional latency is small (typically 10-50ms) because both searches run in parallel

Frequently asked questions

Is hybrid search better than vector search alone?

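Yes, in most cases. Hybrid search is reported to improve recall by 10-30% compared to vector search alone, especially for queries containing specific terms, product names, or technical jargon that semantic search can miss. The size of the gain depends on the knowledge base and the mix of queries, so it is worth measuring on your own data.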
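Recall figures like these are usually measured as recall@k: the share of known-relevant documents that appear in the top k results. The sketch below shows one way to compute it. The function name and the evaluation data are made up purely to demonstrate the calculation and are not evidence for any particular improvement.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


# Made-up evaluation data for a single query.
relevant = {"doc-1", "doc-4", "doc-7"}
vector_only = ["doc-1", "doc-2", "doc-4", "doc-9", "doc-3"]
hybrid = ["doc-1", "doc-7", "doc-4", "doc-2", "doc-9"]

print(recall_at_k(vector_only, relevant, k=5))  # 2 of the 3 relevant docs found
print(recall_at_k(hybrid, relevant, k=5))       # all 3 relevant docs found
```

Averaging this metric over a set of real support questions with labeled relevant articles gives a like-for-like comparison between vector-only and hybrid retrieval.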

Does hybrid search slow down the chatbot?

The additional latency is small, typically 10-50 milliseconds. Both searches run in parallel, so the added time is roughly that of the slower search plus the merge step, not the sum of the two. The accuracy improvement usually outweighs this cost.
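The effect of running both searches in parallel can be sketched with `asyncio.gather`. The two search functions below are hypothetical stand-ins, and the 0.1 second delays are simulated rather than measured latencies.

```python
import asyncio
import time


async def semantic_search(query):
    # Stand-in for a vector query; the delay is simulated.
    await asyncio.sleep(0.1)
    return ["doc-a", "doc-b"]


async def keyword_search(query):
    # Stand-in for a keyword query; the delay is simulated.
    await asyncio.sleep(0.1)
    return ["doc-b", "doc-c"]


async def hybrid_retrieve(query):
    start = time.perf_counter()
    # Both coroutines run concurrently, so the wait is about the slower one.
    semantic, keyword = await asyncio.gather(
        semantic_search(query), keyword_search(query)
    )
    elapsed = time.perf_counter() - start
    return semantic, keyword, elapsed


semantic, keyword, elapsed = asyncio.run(hybrid_retrieve("refund policy"))
print(f"both searches finished in {elapsed:.2f}s")
```

With these simulated delays the elapsed time should be close to 0.1 seconds, not the 0.2 seconds that running the two searches one after the other would take.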

When should I use hybrid search instead of vector search alone?

Almost always, if your platform supports it. Hybrid search is generally better than vector-only search for customer support because support queries frequently contain specific product names, error codes, and technical terms that keyword search handles better than semantic search.

What is Reciprocal Rank Fusion (RRF)?

RRF is an algorithm that combines ranked result lists from multiple search methods. It scores each result by its rank position in each list, using 1/(k + rank) where k is a smoothing constant (commonly 60), then sums the scores across lists. Results that rank highly in both keyword and semantic search get the highest combined scores, surfacing the most relevant content.
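Written out, the commonly published form of RRF adds a smoothing constant k to each rank, so the score of a document d over a set of ranked lists R is:

```latex
\mathrm{RRF}(d) = \sum_{r \in R} \frac{1}{k + \mathrm{rank}_r(d)}
```

Here rank_r(d) is the 1-based position of d in list r, lists that do not contain d contribute nothing, and k = 60 is a widely used default (an assumption here, not a documented Chatsy setting). For example, a document ranked 1st by keyword search and 3rd by semantic search scores 1/61 + 1/63 ≈ 0.0323, while a document ranked 1st by only one method scores 1/61 ≈ 0.0164.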
