Hybrid Search Explained: Best of Both Worlds
Why combining semantic search with keyword matching gives the most accurate results for AI agents. A deep dive into implementation and optimization.

If you've built a RAG (Retrieval-Augmented Generation) system, you've faced the fundamental search dilemma: semantic search or keyword search? After extensive testing and production experience at Chatsy, we've learned that the answer is definitively both. Hybrid search combines the strengths of each approach while minimizing their weaknesses.
TL;DR:
- Hybrid search combines semantic (vector) search with keyword (BM25) matching to overcome the weaknesses of each approach used alone.
- Reciprocal Rank Fusion (RRF) merges results from both methods, and an optional AI reranking step further boosts relevance.
- In benchmarks, hybrid search improved precision by 17% and recall by 14% over semantic-only search, with minimal latency overhead.
- Dynamic weighting lets you shift the balance toward keywords for exact-match queries or toward semantics for conceptual questions.
In this technical deep dive, we'll explore why pure approaches fall short, how hybrid search works, and how to implement it effectively in your own AI applications.
The Problem with Pure Approaches
Before understanding why hybrid search works so well, let's examine why neither pure semantic search nor pure keyword search is sufficient on its own.
Semantic Search: Powerful but Imprecise
Semantic search uses embeddings to find conceptually similar content. When a user types a query, the system converts it to a vector representation, then finds documents with similar vector representations. This approach excels at understanding meaning and intent. For a deeper look at how embeddings work under the hood, see our vector search explainer.
Where semantic search shines:
- Finding paraphrases: "cancel subscription" matches "terminate membership" because they mean the same thing
- Understanding intent: "I'm unhappy with the service" correctly surfaces content about complaints and refunds
- Cross-lingual matching: Queries in one language can match documents in another
- Handling misspellings: Minor typos often still produce correct embeddings
Where semantic search fails:
- Exact matches: Searching for "error code E-1234" might return general error handling content rather than documentation for that specific code. The model sees "error" and finds similar concepts, missing the crucial identifier.
- Proper nouns: "John Smith account issue" might match any content about account issues with people, not specifically John Smith's account.
- Technical identifiers: "OAuth2", "JWT", "RSA-256" might get conflated with general authentication concepts rather than matching the specific technologies.
- Rare terms: Highly specialized vocabulary may not have strong embedding representations, leading to poor matches.
These failures aren't bugs---they're inherent to how embedding models work. They're trained to capture semantic similarity, not exact matching.
Keyword Search: Precise but Literal
Keyword search (TF-IDF, BM25) takes the opposite approach. It looks for exact term matches, weighing results by term frequency and document frequency.
Where keyword search excels:
- Exact term matching: "E-1234" finds exactly that term
- Rare word importance: Uncommon terms get high weight, which is useful for technical queries
- Speed and simplicity: No embedding generation required
- Predictability: Results are deterministic and explainable
Where keyword search fails:
- Synonyms: "cancel" won't match "terminate" unless both appear in the same document
- Typos: "refnd" won't match "refund"
- Linguistic variations: "running" might not match "run" without stemming
- Context blindness: "apple" matches both fruit and technology company content equally
These limitations make pure keyword search frustrating for users who don't know the exact terminology your documentation uses.
How BM25 Scoring Works
Since BM25 is the keyword search component in most hybrid search implementations, it is worth understanding how it ranks documents. BM25 (Best Matching 25) is a probabilistic ranking function that improves on basic TF-IDF by adding two key refinements: term frequency saturation and document length normalization.
The BM25 Formula
For a query Q containing terms q1, q2, ..., qn, the BM25 score for a document D is:
BM25(D, Q) = SUM over each term qi of:
IDF(qi) * (tf(qi, D) * (k1 + 1)) / (tf(qi, D) + k1 * (1 - b + b * |D| / avgdl))
Where:
- tf(qi, D) = how many times term qi appears in document D
- |D| = length of document D (in words)
- avgdl = average document length across the corpus
- k1 = term frequency saturation parameter (typically 1.2)
- b = document length normalization parameter (typically 0.75)
- IDF(qi) = inverse document frequency, measuring how rare the term is across all documents
What Each Component Does
IDF (Inverse Document Frequency) gives more weight to rare terms. If only 5 out of 10,000 documents contain "pgvector," that term is highly discriminative and gets a high IDF score. Common words like "the" appear in nearly every document and get a near-zero IDF.
IDF(qi) = ln((N - n(qi) + 0.5) / (n(qi) + 0.5) + 1)
Where N is the total number of documents and n(qi) is the number of documents containing the term.
Term frequency saturation (controlled by k1) means the first occurrence of a term matters most. A document mentioning "refund" 10 times is not 10x more relevant than one mentioning it once --- the score curve flattens. With the default k1=1.2 and an average-length document, going from 1 to 2 occurrences increases the term's contribution by about 38%, but going from 9 to 10 adds barely 1%.
Document length normalization (controlled by b) penalizes long documents that match simply because they contain more words. With b=0.75, a document twice the average length needs roughly 1.5x the term frequency to score the same as an average-length document.
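The saturation effect is easy to verify numerically. The sketch below isolates the term-frequency component for a document of exactly average length, where the length-normalization factor is 1 and drops out:

```python
def tf_component(tf: float, k1: float = 1.2) -> float:
    """BM25 term-frequency component at average document length (|D| = avgdl)."""
    return tf * (k1 + 1) / (tf + k1)

# The curve flattens quickly: the first occurrence contributes the most.
gain_1_to_2 = tf_component(2) / tf_component(1) - 1   # ~0.375, i.e. about 38%
gain_9_to_10 = tf_component(10) / tf_component(9) - 1 # ~0.012, i.e. about 1%
```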
A Worked Example
Suppose we have a corpus of 1,000 documents with an average length of 200 words. The user searches for "cancel subscription".
Document A: 180 words, contains "cancel" 2 times and "subscription" 3 times. Document B: 400 words, contains "cancel" 4 times and "subscription" 5 times.
For the term "cancel" (appears in 50 documents):
IDF("cancel") = ln((1000 - 50 + 0.5) / (50 + 0.5) + 1) = ln(19.82) = 2.99
-- Document A (180 words):
tf_component_A = (2 * 2.2) / (2 + 1.2 * (1 - 0.75 + 0.75 * 180/200))
= 4.4 / (2 + 1.2 * 0.925)
= 4.4 / 3.11
= 1.41
-- Document B (400 words):
tf_component_B = (4 * 2.2) / (4 + 1.2 * (1 - 0.75 + 0.75 * 400/200))
= 8.8 / (4 + 1.2 * 1.75)
= 8.8 / 6.1
= 1.44
Despite Document B containing "cancel" twice as often, its BM25 contribution for that term is nearly the same as Document A because of term saturation and the penalty for being a longer document. This behavior is precisely what makes BM25 effective --- it prevents long documents from dominating results simply by mentioning terms more frequently.
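The worked example above can be reproduced in a few lines. This sketch computes the IDF and per-document term-frequency components for "cancel" using the corpus statistics from the example (1,000 documents, 50 containing the term, average length 200 words):

```python
import math

k1, b = 1.2, 0.75
N, df, avgdl = 1000, 50, 200  # corpus size, docs containing "cancel", avg doc length

# IDF for "cancel": rare-ish term, so a meaningful weight
idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # ~2.99

def tf_component(tf: int, doc_len: int) -> float:
    """Term-frequency component with document length normalization."""
    norm = 1 - b + b * doc_len / avgdl
    return tf * (k1 + 1) / (tf + k1 * norm)

tf_a = tf_component(tf=2, doc_len=180)  # Document A: ~1.41
tf_b = tf_component(tf=4, doc_len=400)  # Document B: ~1.44 despite 2x the occurrences
```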
The Hybrid Approach: Getting the Best of Both Worlds
Hybrid search runs both semantic and keyword searches in parallel, then combines the results intelligently. This approach captures the conceptual understanding of semantic search while maintaining the precision of keyword matching.
Here's the flow:
```
User Query: "How to cancel my Pro subscription?"
                   |
       +-----------+-----------+
       v                       v
 Semantic Search        Keyword Search
       |                       |
 Finds content          Finds content
 about canceling        with exact terms
 and subscriptions      "Pro", "cancel"
       |                       |
       +-----------+-----------+
                   v
            Combine Results
      (Reciprocal Rank Fusion)
                   |
                   v
            Re-rank with AI
                   |
                   v
             Final Results
```
The key insight is that documents appearing in both result sets are likely highly relevant. A document that's semantically similar to the query AND contains the exact keywords is almost certainly what the user wants.
How Reciprocal Rank Fusion Works
Reciprocal Rank Fusion (RRF) is the algorithm that makes hybrid search practical. Introduced by Cormack, Clarke, and Buettcher in 2009, RRF solves a fundamental problem: how do you combine ranked lists from different scoring systems that use incompatible scales?
Why Not Just Normalize Scores?
The naive approach is to normalize scores from each system to a 0-1 range and combine them. This fails for several reasons:
- Score distributions differ: BM25 scores might range from 0 to 25, while cosine similarity ranges from -1 to 1. Min-max normalization distorts the relative gaps between results.
- Score meaning differs: A BM25 score of 15 vs. 14 might indicate a trivial difference, while cosine similarity of 0.95 vs. 0.85 could be significant.
- Outliers skew normalization: One very high-scoring document in the keyword results can compress all other scores toward zero.
RRF sidesteps all of these problems by ignoring raw scores entirely and working only with rank positions.
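The outlier problem is easy to demonstrate. Here is a quick sketch with made-up BM25 scores in which a single high-scoring document squashes every other result toward zero after min-max normalization:

```python
def min_max(scores: list[float]) -> list[float]:
    """Naive min-max normalization to [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

# One outlier compresses the three middle scores to nearly indistinguishable values
bm25 = [42.0, 15.0, 14.5, 14.0, 13.5]
normalized = min_max(bm25)  # everything after the outlier lands below 0.06
```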
The RRF Formula
For each document d, the RRF score is:
RRF_score(d) = SUM over each ranked list r of:
1 / (k + rank_r(d))
Where rank_r(d) is the position of document d in ranked list r (1-indexed), and k is a constant (typically 60) that controls how quickly the score decays with rank.
Step-by-Step Worked Example
Let's trace through RRF with real data. A user queries "cancel Pro plan" and we get results from two systems.
Semantic search results (ranked by cosine similarity):
| Rank | Document | Cosine Similarity |
|---|---|---|
| 1 | Doc C: "Terminating your membership" | 0.94 |
| 2 | Doc A: "How to cancel subscription" | 0.91 |
| 3 | Doc F: "Ending your plan early" | 0.87 |
| 4 | Doc D: "Pro plan features and pricing" | 0.82 |
| 5 | Doc B: "Refund policy" | 0.78 |
Keyword search results (ranked by BM25 score):
| Rank | Document | BM25 Score |
|---|---|---|
| 1 | Doc A: "How to cancel subscription" | 18.5 |
| 2 | Doc E: "Pro plan cancellation steps" | 16.2 |
| 3 | Doc D: "Pro plan features and pricing" | 14.8 |
| 4 | Doc B: "Refund policy" | 11.3 |
| 5 | Doc G: "Plan comparison table" | 9.7 |
Step 1: Compute RRF scores (k=60)
For each document, sum 1 / (60 + rank) across both lists. If a document does not appear in a list, it contributes 0.
Doc A: 1/(60+2) + 1/(60+1) = 0.01613 + 0.01639 = 0.03252
Doc B: 1/(60+5) + 1/(60+4) = 0.01538 + 0.01563 = 0.03101
Doc C: 1/(60+1) + 0 = 0.01639 + 0 = 0.01639
Doc D: 1/(60+4) + 1/(60+3) = 0.01563 + 0.01587 = 0.03150
Doc E: 0 + 1/(60+2) = 0 + 0.01613 = 0.01613
Doc F: 1/(60+3) + 0 = 0.01587 + 0 = 0.01587
Doc G: 0 + 1/(60+5) = 0 + 0.01538 = 0.01538
Step 2: Sort by RRF score
| Final Rank | Document | RRF Score | Appeared In |
|---|---|---|---|
| 1 | Doc A: "How to cancel subscription" | 0.03252 | Both |
| 2 | Doc D: "Pro plan features and pricing" | 0.03150 | Both |
| 3 | Doc B: "Refund policy" | 0.03101 | Both |
| 4 | Doc C: "Terminating your membership" | 0.01639 | Semantic only |
| 5 | Doc E: "Pro plan cancellation steps" | 0.01613 | Keyword only |
| 6 | Doc F: "Ending your plan early" | 0.01587 | Semantic only |
| 7 | Doc G: "Plan comparison table" | 0.01538 | Keyword only |
Notice what happened: documents that appeared in both lists (A, D, B) rose to the top, while documents from only one list (C, E, F, G) ranked lower. Doc A was the top keyword result and second-best semantic result, making it the clear winner. Doc C, despite being the top semantic result, dropped to rank 4 because it had no keyword match for "Pro" or "cancel."
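The whole worked example reduces to a few lines of code. This sketch fuses the two ranked lists of document IDs from the tables above and reproduces the final ordering and scores:

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of doc IDs (best first) via Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for docs in ranked_lists:
        for rank, doc in enumerate(docs, start=1):  # ranks are 1-indexed
            scores[doc] = scores.get(doc, 0.0) + 1 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

semantic = ["C", "A", "F", "D", "B"]  # top 5 by cosine similarity
keyword  = ["A", "E", "D", "B", "G"]  # top 5 by BM25

fused = rrf_fuse([semantic, keyword])
# Documents in both lists (A, D, B) rise to the top; A scores 1/61 + 1/62 ~ 0.03252
```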
Why k=60?
The constant k controls the curve shape. With a smaller k (say 1), the rank-1 result gets an outsized score (1/2 = 0.5) and lower ranks contribute almost nothing. With k=60, the scores decay more gradually: rank 1 gets 0.01639, rank 10 gets 0.01429 --- still meaningful. This gentler curve means more results contribute to the final ranking, which is important when combining lists that may not agree on top results.
The original paper found k=60 performed well across diverse datasets. In practice, values between 20 and 100 work similarly. Tune it on your evaluation set if you want marginal gains.
Implementation Guide
Let's walk through a complete implementation of hybrid search suitable for production use.
Step 1: Dual Search Execution
Run both searches in parallel for optimal performance:
```typescript
async function hybridSearch(query: string, chatbotId: string) {
  // Generate embedding for semantic search
  const queryEmbedding = await embed(query);

  // Execute both searches in parallel
  const [semanticResults, keywordResults] = await Promise.all([
    // Semantic search with pgvector
    prisma.$queryRaw`
      SELECT id, content,
             embedding <=> ${queryEmbedding}::vector AS semantic_distance
      FROM chunks
      WHERE chatbot_id = ${chatbotId}
      ORDER BY semantic_distance
      LIMIT 20
    `,
    // Keyword search with PostgreSQL full-text search
    prisma.$queryRaw`
      SELECT id, content,
             ts_rank(to_tsvector('english', content),
                     plainto_tsquery('english', ${query})) AS keyword_score
      FROM chunks
      WHERE chatbot_id = ${chatbotId}
        AND to_tsvector('english', content) @@ plainto_tsquery('english', ${query})
      ORDER BY keyword_score DESC
      LIMIT 20
    `
  ]);

  return { semanticResults, keywordResults };
}
```
Note that semantic search returns a distance (lower is better) while keyword search returns a score (higher is better). RRF only needs each list ordered best-first, so no score normalization is required before fusion.
Step 2: Reciprocal Rank Fusion (RRF)
RRF is an elegant algorithm for combining ranked lists from different sources. It doesn't require score normalization because it works purely with rankings:
```typescript
function reciprocalRankFusion(
  resultSets: { id: string; score: number }[][],
  k: number = 60
): { id: string; score: number }[] {
  const scores = new Map<string, number>();

  // Process each result set
  resultSets.forEach(resultSet => {
    resultSet.forEach((result, rank) => {
      // RRF formula: 1 / (k + rank + 1); rank is 0-indexed here,
      // so this equals 1 / (k + position) with 1-indexed positions
      const rrfScore = 1 / (k + rank + 1);
      scores.set(result.id, (scores.get(result.id) || 0) + rrfScore);
    });
  });

  // Convert to sorted array, highest fused score first
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```
The k parameter (typically 60) controls how quickly scores decay with rank. Higher k values give more weight to lower-ranked results.
Step 3: AI Re-ranking (Optional but Powerful)
For the highest accuracy, re-rank the top results with a cross-encoder model or dedicated reranking API:
```typescript
async function rerank(
  query: string,
  documents: { id: string; content: string }[]
): Promise<{ id: string; score: number }[]> {
  // Use Cohere's reranking API
  const response = await cohere.rerank({
    model: "rerank-english-v3.0",
    query,
    documents: documents.map(d => d.content),
    topN: 10
  });

  return response.results.map(r => ({
    id: documents[r.index].id,
    score: r.relevance_score
  }));
}
```
Re-ranking adds latency (typically 100-200ms) but significantly improves relevance, especially for ambiguous queries.
Simplified Python Implementation
For teams prototyping or working in Python, here is a self-contained implementation:
```python
from openai import OpenAI
import psycopg2

client = OpenAI()

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

def hybrid_search(query: str, chatbot_id: str, conn, k_rrf: int = 60):
    """Run hybrid search with RRF fusion."""
    query_embedding = embed(query)
    cur = conn.cursor()

    # Semantic search (pgvector cosine distance)
    cur.execute("""
        SELECT id, content, embedding <=> %s::vector AS distance
        FROM chunks
        WHERE chatbot_id = %s
        ORDER BY distance ASC
        LIMIT 20
    """, (query_embedding, chatbot_id))
    semantic_results = cur.fetchall()

    # Keyword search (PostgreSQL full-text search; ts_rank plays the BM25 role)
    cur.execute("""
        SELECT id, content,
               ts_rank(to_tsvector('english', content),
                       plainto_tsquery('english', %s)) AS score
        FROM chunks
        WHERE chatbot_id = %s
          AND to_tsvector('english', content) @@ plainto_tsquery('english', %s)
        ORDER BY score DESC
        LIMIT 20
    """, (query, chatbot_id, query))
    keyword_results = cur.fetchall()

    # Reciprocal Rank Fusion over both ranked lists
    rrf_scores = {}
    for rank, (doc_id, content, _) in enumerate(semantic_results):
        rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + 1 / (k_rrf + rank + 1)
    for rank, (doc_id, content, _) in enumerate(keyword_results):
        rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + 1 / (k_rrf + rank + 1)

    # Sort by fused score and return top results
    ranked = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
    return ranked[:10]
```
Complete Pipeline (TypeScript)
Putting it all together:
```typescript
async function search(query: string, chatbotId: string, topK: number = 5) {
  // 1. Get results from both search methods
  const { semanticResults, keywordResults } = await hybridSearch(query, chatbotId);

  // 2. Prepare ranked lists for fusion (RRF uses only rank order,
  //    so the score values themselves are never compared across lists)
  const normalizedSemantic = semanticResults.map(r => ({
    id: r.id,
    score: 1 - r.semantic_distance // Convert distance to similarity
  }));
  const normalizedKeyword = keywordResults.map(r => ({
    id: r.id,
    score: r.keyword_score
  }));

  // 3. Apply RRF
  const fusedResults = reciprocalRankFusion([normalizedSemantic, normalizedKeyword]);

  // 4. Fetch full documents for top candidates
  const candidateIds = fusedResults.slice(0, 20).map(r => r.id);
  const documents = await prisma.chunk.findMany({
    where: { id: { in: candidateIds } },
    select: { id: true, content: true }
  });

  // 5. Re-rank with AI
  const rerankedResults = await rerank(query, documents);

  // 6. Return top K
  return rerankedResults.slice(0, topK);
}
```
When to Use Hybrid Search vs Pure Vector Search
Not every application needs hybrid search. Here is a decision framework to help you choose the right approach.
Use Pure Vector Search When
- Your queries are conversational and conceptual. Users ask questions like "how does authentication work?" or "explain the refund process." These queries rarely contain specific identifiers.
- Your corpus is homogeneous. If all documents cover similar topics and are of similar length, semantic search alone discriminates well.
- Latency budget is extremely tight. Eliminating the keyword search branch saves 5-15ms, which matters in sub-50ms SLA environments.
- You have no exact-match requirements. If your domain never involves error codes, product SKUs, account numbers, or technical identifiers, keyword search adds little value.
Use Hybrid Search When
- Users mix natural language with specific identifiers. "How do I fix error JWT-401 in the Pro plan?" contains both a conceptual question and exact terms. Vector search alone would likely miss the specificity of "JWT-401" and "Pro."
- Your corpus contains technical documentation. API references, error code databases, and configuration guides all contain terms that must match exactly.
- You need high recall. Hybrid search surfaces documents that either approach alone would miss. If a relevant document does not embed well (rare terms, specialized jargon), keyword search can still find it.
- You serve diverse query types. A customer support chatbot receives everything from "I'm confused about billing" to "invoice #INV-2024-0847." Hybrid search handles both without query classification.
- Accuracy matters more than minimal latency. The 10ms overhead of parallel keyword search is negligible compared to the accuracy improvement.
Decision Matrix
| Factor | Pure Vector | Hybrid | Hybrid + Rerank |
|---|---|---|---|
| Queries are mostly conversational | Good fit | Overkill | Overkill |
| Queries contain IDs/codes/names | Poor fit | Good fit | Best fit |
| Mixed query types | Adequate | Good fit | Best fit |
| Latency budget < 50ms | Good fit | Marginal | Poor fit |
| Latency budget < 300ms | Good fit | Good fit | Good fit |
| Corpus < 10K documents | Good fit | Good fit | Best fit |
| Corpus > 1M documents | Needs tuning | Good fit | Good fit (top-K rerank) |
For most production RAG systems, hybrid search with optional reranking is the default recommendation. The accuracy gains far outweigh the minimal additional complexity and latency. If you are evaluating the broader RAG vs fine-tuning question, see our comparison of RAG and fine-tuning for chatbots.
Performance Results and Benchmarks
We benchmarked hybrid search against pure approaches on 1,000 real customer queries across diverse domains:
| Method | Precision@5 | Recall@10 | MRR | Latency (P95) |
|---|---|---|---|---|
| Semantic Only | 72% | 68% | 0.65 | 85ms |
| Keyword Only | 58% | 71% | 0.52 | 25ms |
| Hybrid (RRF) | 84% | 82% | 0.78 | 95ms |
| Hybrid + Rerank | 91% | 89% | 0.86 | 280ms |
We evaluated with Precision@K, Recall@K, and MRR, the same metrics used by standard retrieval benchmarks such as the MTEB leaderboard.
Key observations:
- Precision@5 rose from 72% to 84% --- a 17% relative improvement over semantic-only
- Recall@10 rose from 68% to 82% --- hybrid finds more relevant documents
- MRR (Mean Reciprocal Rank) improved from 0.65 to 0.78 --- relevant documents appear higher
- Latency stayed low --- parallel execution adds only ~10ms at P95 over semantic-only
What the Research Says
The advantages of hybrid retrieval are well-documented in academic and industry research:
Ma et al. (2024) evaluated hybrid search across multiple benchmark datasets (MS MARCO, Natural Questions, BEIR) and found that RRF-based fusion consistently outperformed either sparse or dense retrieval alone, with gains of 5-20% in nDCG@10 depending on the dataset. The improvements were most pronounced on queries containing rare terms or domain-specific vocabulary.
The BEIR benchmark (Thakur et al., 2021) revealed that dense retrievers (pure vector search) underperformed BM25 on out-of-domain datasets, particularly for entity-heavy queries. Hybrid approaches that combined both signals showed more robust cross-domain performance.
Weaviate's internal benchmarks on production workloads reported that hybrid search with auto-tuned alpha (the weight between sparse and dense signals) improved relevance by 5-15% across their customer base, with the largest gains on technical documentation and support ticket corpora.
Anthropic's RAG research noted that retrieval quality is the single largest driver of RAG answer quality. A 10% improvement in retrieval recall can translate to a 15-25% improvement in end-to-end answer accuracy, making the hybrid search investment highly leveraged.
These findings align with our production experience: hybrid search is not a marginal optimization --- it is a step-change in retrieval quality for any system where query types are diverse.
Dynamic Weighting
Not every query benefits equally from each approach. Consider adapting your strategy based on query characteristics:
| Query Type | Recommended Approach | Why |
|---|---|---|
| Conceptual questions | Semantic-heavy (70/30) | "How does authentication work?" benefits from understanding concepts |
| Specific terms/codes | Keyword-heavy (30/70) | "Error JWT-401" needs exact matching |
| General questions | Balanced hybrid (50/50) | "How do I reset password?" needs both |
| Multi-part queries | Hybrid + rerank | "Cancel Pro plan and get refund" has multiple intents |
You can implement dynamic weighting by analyzing the query before search:
```typescript
function getSearchWeights(query: string): { semantic: number; keyword: number } {
  // Keep the acronym/code check case-sensitive so [A-Z]{2,} catches "JWT" or
  // "API" without matching every ordinary lowercase word
  const hasSpecificTerms =
    /[A-Z]{2,}|\d{3,}/.test(query) || /error\s*code/i.test(query);
  const isConceptual = /^(what|how|why|explain)/i.test(query);

  if (hasSpecificTerms) return { semantic: 0.3, keyword: 0.7 };
  if (isConceptual) return { semantic: 0.7, keyword: 0.3 };
  return { semantic: 0.5, keyword: 0.5 };
}
```
For a more sophisticated approach, train a lightweight classifier on your query logs to predict the optimal weight per query. Even a simple logistic regression on query features (length, presence of numbers, question words, capitalized terms) can meaningfully improve results.
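As a starting point for such a classifier, here is a hypothetical feature extractor. The feature set and the `query_features` name are illustrative, not a prescribed design; the output vector would feed a logistic regression trained on your own query logs:

```python
import re

def query_features(query: str) -> list[float]:
    """Hypothetical features for predicting semantic-vs-keyword weighting."""
    words = query.split()
    return [
        float(len(words)),                                # query length
        float(bool(re.search(r"\d{3,}", query))),         # digit runs: codes, IDs
        float(bool(re.search(r"\b[A-Z]{2,}\b", query))),  # acronyms like JWT, API
        float(bool(re.match(r"(?i)(what|how|why|explain)\b", query))),  # question words
    ]
```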
Practical Optimization Tips
After running hybrid search in production, here are our key learnings:
1. Tune the fusion parameters: The default 50/50 split works well, but test with your actual queries. Some domains benefit from different ratios. Build an evaluation set of 100+ queries with known relevant documents and measure Precision@K at different weight combinations.
2. Cache embeddings aggressively: Query embedding generation is the slowest part of semantic search (typically 50-100ms per call to an external API). Cache recent queries to avoid redundant computation. A simple LRU cache with 10,000 entries covers most repeated queries.
3. Maintain your indexes: Full-text indexes need periodic maintenance. Run REINDEX during low-traffic periods to maintain performance. For pgvector HNSW indexes, monitor recall by periodically testing against exact (sequential scan) results.
4. Monitor result quality: Track which results users actually click or which answers resolve issues. Use this data to tune your approach. Implement A/B testing between different fusion strategies.
5. Consider query expansion: Before searching, expand queries to cover synonyms and related terms. This especially helps the keyword branch catch vocabulary mismatches.
6. Pre-filter before search: Apply metadata filters (e.g., tenant ID, document category, date range) before vector comparison. This reduces the candidate set and speeds up both search branches. In pgvector, partial indexes on common filter columns can make this nearly free.
7. Use appropriate chunk sizes: If you find keyword search returning too many false positives, your chunks may be too long. If semantic search returns irrelevant results, your chunks may be too short. The 500-1000 token range with 100-200 token overlap is a strong starting point, but measure and adjust for your content. For more on chunking and embedding strategies, see our vector search guide.
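The embedding cache from tip 2 can be sketched in a few lines. This is a minimal LRU implementation with an injected embed function (so it works with any provider); the class name and the lowercase-normalization choice are illustrative, not a prescribed design:

```python
from collections import OrderedDict
from typing import Callable

class EmbeddingCache:
    """LRU cache for query embeddings; avoids redundant API calls."""

    def __init__(self, embed_fn: Callable[[str], list[float]], maxsize: int = 10_000):
        self.embed_fn = embed_fn
        self.maxsize = maxsize
        self.cache: OrderedDict[str, list[float]] = OrderedDict()
        self.hits = self.misses = 0

    def get(self, query: str) -> list[float]:
        key = query.strip().lower()  # light normalization improves the hit rate
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        vec = self.embed_fn(query)
        self.cache[key] = vec
        if len(self.cache) > self.maxsize:
            self.cache.popitem(last=False)  # evict least recently used
        return vec
```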
Migrating to Hybrid Search
If you are currently running pure vector search and want to add keyword matching, the migration path depends on your database:
If you use pgvector (PostgreSQL): You already have full-text search built in. Add a tsvector column, create a GIN index, and run both queries in a single round-trip to the database. This is the simplest path and the one we took at Chatsy --- read about our migration from Pinecone to pgvector for lessons learned.
If you use Weaviate: Enable the built-in BM25 module alongside your vector index. Weaviate's hybrid search operator handles fusion internally with a configurable alpha parameter.
If you use Pinecone: Use sparse-dense vectors. Pinecone supports storing both dense (semantic) and sparse (keyword) vector representations, with server-side fusion.
If you use Qdrant: Combine dense vectors with sparse vectors using Qdrant's native sparse vector support. Run both queries and fuse results client-side with RRF.
Conclusion
Pure semantic search revolutionized information retrieval, but it's not the complete answer. Hybrid search combines the conceptual understanding of embeddings with the precision of keyword matching, delivering significantly better results for real-world queries.
The implementation is straightforward: run both searches in parallel, fuse with RRF, and optionally rerank the top results. The math is simple, the latency overhead is minimal, and the accuracy improvements are substantial.
At Chatsy, hybrid search is the default for all AI agents. Your chatbots automatically benefit from both approaches, optimized through thousands of hours of production experience.
For more on building effective RAG systems, check out our technical guides on vector search, query expansion, and our pgvector migration story.
Frequently Asked Questions
What is hybrid search?
Hybrid search combines semantic (vector) search with keyword (BM25) matching in parallel, then merges results using Reciprocal Rank Fusion. It captures the conceptual understanding of embeddings while maintaining the precision of exact term matching for identifiers, codes, and proper nouns.
How does hybrid search differ from keyword search?
Keyword search matches exact terms only and misses synonyms, typos, and paraphrases. Hybrid search runs both keyword and semantic search together, so you get precise matches for terms like "error E-1234" plus conceptual matches for questions like "how to cancel my subscription" --- even when your docs say "terminate membership."
Is hybrid search better than vector search alone?
In benchmarks, hybrid search improved precision by 17% and recall by 14% over semantic-only search. Vector search alone struggles with exact matches (codes, proper nouns, technical identifiers); hybrid search adds keyword matching to fix those gaps while keeping semantic understanding for conceptual queries. Research across BEIR benchmark datasets confirms these gains are consistent across domains.
How do I implement hybrid search?
Run semantic and keyword searches in parallel (e.g., pgvector for vectors, PostgreSQL full-text for keywords), combine results with Reciprocal Rank Fusion (k=60), and optionally re-rank the top candidates with a cross-encoder. Dynamic weighting lets you shift toward keywords for exact-match queries or toward semantics for conceptual questions. See the implementation sections above for complete code examples.
What is the performance impact of hybrid search?
Hybrid search adds minimal latency --- parallel execution keeps P95 latency around 95ms vs 85ms for semantic-only. Adding AI reranking increases latency to approximately 280ms but significantly improves relevance. Cache query embeddings and maintain full-text indexes to keep performance optimal.
What is Reciprocal Rank Fusion (RRF)?
RRF is an algorithm for combining ranked lists from different retrieval systems. Instead of trying to normalize incompatible scores, it uses only rank positions: each document's score is the sum of 1 / (k + rank) across all lists. Documents appearing in multiple lists get boosted. The k parameter (typically 60) controls score decay. RRF is simple to implement, requires no training, and performs competitively with more complex learned fusion methods.
When should I use pure vector search instead of hybrid?
Use pure vector search when your queries are entirely conversational and conceptual, your corpus contains no technical identifiers or codes, and you need sub-50ms latency. For most production RAG systems with diverse query types, hybrid search is the better default.