Engineering

Why We Migrated from Pinecone to pgvector: A 97% Cost Reduction Story

How we achieved massive cost savings while improving performance by moving to PostgreSQL with pgvector extension.

Sarah Rodriguez
CTO
December 10, 2024
3 min read

When we started Chatsy, we chose Pinecone for vector storage. It was the obvious choice — purpose-built for vector search, great developer experience, and excellent performance.

But as we scaled, our Pinecone bill grew from $100/month to over $3,000/month. We knew there had to be a better way.

The Decision to Migrate

Our requirements were clear:

  1. Performance: Sub-100ms query latency at the 95th percentile
  2. Scale: Support for 10M+ vectors
  3. Cost: Significant reduction from $3,000/month
  4. Reliability: 99.9% uptime SLA

After evaluating options (Weaviate, Milvus, Qdrant, pgvector), we chose pgvector — the PostgreSQL extension for vector similarity search.

Why pgvector?

1. Unified Data Layer

With pgvector, our vectors live alongside our relational data. No more syncing between databases. One source of truth.

```sql
-- query_embedding is the user's query vector, passed in as a parameter
SELECT content, embedding <=> query_embedding AS distance
FROM documents
WHERE chatbot_id = $1
ORDER BY distance
LIMIT 10;
```

2. Mature Ecosystem

PostgreSQL has 30+ years of battle-tested reliability. We get:

  • ACID transactions
  • Point-in-time recovery
  • Connection pooling (PgBouncer)
  • Mature monitoring tools
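The first of these is what makes the unified data layer safe in practice: a document's text and its embedding can change in a single transaction, so readers never see one without the other. A minimal sketch, using parameter placeholders like the query above:

```sql
-- Text and embedding change together or not at all
BEGIN;
UPDATE documents
SET content   = $1,
    embedding = $2::vector  -- re-computed 1536-dim embedding for the new text
WHERE id = $3;
COMMIT;
```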

3. Cost Efficiency

Our new infrastructure costs $90/month on a managed PostgreSQL instance. That's a 97% reduction.

The Migration Process

Phase 1: Schema Design

We added a vector column to our existing documents table:

```sql
ALTER TABLE documents ADD COLUMN embedding vector(1536);

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
```
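One knob worth knowing for the index above: IVFFlat recall is controlled at query time by the `ivfflat.probes` setting, i.e. how many of the lists each query scans. A minimal sketch with an illustrative value:

```sql
-- Scan 10 of the 100 lists per query; the default is 1.
-- Higher probes improves recall at the cost of latency.
SET ivfflat.probes = 10;
```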

Phase 2: Backfill

We wrote a migration script to copy vectors from Pinecone:

```typescript
async function backfillVectors() {
  // Pull the existing vectors out of Pinecone by document ID
  const pineconeVectors = await pinecone.fetch({ ids: documentIds });

  // Write each vector into the new pgvector column
  for (const [id, vector] of Object.entries(pineconeVectors)) {
    await prisma.document.update({
      where: { id },
      data: { embedding: vector.values },
    });
  }
}
```
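A simple way to track backfill progress is to count documents that still have no embedding (a sketch; it assumes the new column stays NULL until the script reaches that row):

```sql
-- Rows not yet backfilled from Pinecone
SELECT count(*) AS remaining
FROM documents
WHERE embedding IS NULL;
```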

Phase 3: Dual-Write

During the transition, we wrote to both databases:

```typescript
async function indexDocument(doc: Document, embedding: number[]) {
  // Write to both
  await Promise.all([
    pinecone.upsert([{ id: doc.id, values: embedding }]),
    prisma.document.update({ where: { id: doc.id }, data: { embedding } }),
  ]);
}
```

Phase 4: Cutover

After validating pgvector results matched Pinecone, we switched reads:

```typescript
// Before
const results = await pinecone.query({ vector, topK: 10 });

// After
const results = await prisma.$queryRaw`
  SELECT id, content, embedding <=> ${vector}::vector AS distance
  FROM documents
  WHERE chatbot_id = ${chatbotId}
  ORDER BY distance
  LIMIT 10
`;
```

Performance Results

| Metric | Pinecone | pgvector | Change |
| --- | --- | --- | --- |
| P50 Latency | 45ms | 38ms | -16% |
| P95 Latency | 120ms | 85ms | -29% |
| P99 Latency | 250ms | 150ms | -40% |
| Monthly Cost | $3,000 | $90 | -97% |

Yes, pgvector is actually faster for our use case. The co-location of vector and metadata eliminates network round trips.

Lessons Learned

  1. Start with IVFFlat, consider HNSW: IVFFlat is simpler and works great up to ~1M vectors. HNSW is better for larger scales but uses more memory. (A sketch of both index types follows this list.)

  2. Tune your lists parameter: Too few lists = slow queries. Too many = slow inserts. We settled on sqrt(num_vectors).

  3. Use partial indexes: If you filter by tenant, create partial indexes per tenant for dramatic speedups (see the second sketch below).

  4. Monitor vacuum: Vector columns are large. Aggressive vacuuming prevents bloat.
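A sketch of the index options behind lessons 1 and 2 (the table is ours, but the row counts and parameters here are illustrative; HNSW needs pgvector 0.5.0 or later):

```sql
-- IVFFlat: lists ~ sqrt(num_vectors), e.g. ~1,000 lists for 1M vectors.
-- Too few lists means each probe scans huge clusters (slow queries);
-- too many slows index builds and inserts.
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 1000);

-- HNSW: better recall/latency at larger scales, at the cost of memory and build time.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
```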
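And for lessons 3 and 4, a partial index plus per-table autovacuum tuning (the tenant ID and thresholds are illustrative, not our production values):

```sql
-- Partial index scoped to a single high-traffic tenant
CREATE INDEX documents_tenant_42_embedding_idx
  ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100)
  WHERE chatbot_id = 42;

-- Vacuum the large vector table more aggressively to keep bloat in check
ALTER TABLE documents SET (
  autovacuum_vacuum_scale_factor = 0.05,
  autovacuum_analyze_scale_factor = 0.02
);
```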

Conclusion

The migration took 3 weeks of engineering time and saved us $35,000/year. More importantly, it simplified our architecture and improved performance.

Not every workload is suitable for pgvector — if you need billion-scale vectors with sub-10ms latency, dedicated vector databases still make sense. But for most applications, pgvector is the pragmatic choice.

See Our Tech Stack →

Tags: #infrastructure #postgresql #pgvector #cost-optimization
