
How I Built My RAG System With Supabase and pgvector

RAG · Embeddings · Supabase · pgvector · OpenAI

By Neel Vora · December 6, 2025 · 2 min read

This post walks through how I built my RAG system with Supabase and pgvector, and where it fits in the rest of my work.

Years of building content management systems for government agencies gave me a deep appreciation for how organizations structure knowledge - patterns that directly informed this RAG architecture.

This was one of the key AI engineering projects in my portfolio.

It demonstrates real retrieval-augmented generation with:

  • Document ingestion
  • Chunking
  • Embeddings
  • Vector storage
  • Similarity search
  • Streaming model responses

Document ingestion

Users can add documents through a panel. The server (see the sketch after this list):

  • Validates the input
  • Splits the text into chunks
  • Generates embeddings
  • Stores them in Supabase
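
Here is a minimal sketch of that ingestion route. It assumes a Next.js-style handler, the @supabase/supabase-js and openai clients, an illustrative rag_chunks table, the chunkText helper described in the next section, and a particular embedding model; the real names in my project may differ.

```typescript
// Hypothetical ingestion route sketch (path and table name are illustrative)
import { createClient } from "@supabase/supabase-js";
import OpenAI from "openai";
import { chunkText } from "./chunk"; // sliding-window chunker, see the chunking section

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { title, content } = await req.json();

  // Validate the input before doing any work
  if (!title || typeof content !== "string" || content.length === 0) {
    return Response.json({ error: "title and content are required" }, { status: 400 });
  }

  // Split the text into overlapping chunks
  const chunks = chunkText(content);

  // Generate one embedding per chunk (embedding model is an assumption)
  const { data: embeddings } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: chunks,
  });

  // Store the chunks and their embeddings in Supabase
  const rows = chunks.map((chunk, i) => ({
    title,
    content: chunk,
    embedding: embeddings[i].embedding,
  }));
  const { error } = await supabase.from("rag_chunks").insert(rows);
  if (error) return Response.json({ error: error.message }, { status: 500 });

  return Response.json({ inserted: rows.length });
}
```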

Chunking logic

Each chunk is about 500 characters with 50 characters of overlap. This keeps each chunk coherent enough to be useful as context without packing too much unrelated text into a single embedding.
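
A sliding-window chunker along these lines captures that behavior. The helper name and exact boundary handling are illustrative, not the literal code from my repo.

```typescript
// Sliding-window chunker: roughly 500-character chunks with 50 characters of overlap.
export function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;

  for (let start = 0; start < text.length; start += step) {
    const chunk = text.slice(start, start + chunkSize).trim();
    if (chunk.length > 0) chunks.push(chunk);
    // Stop once the window has covered the end of the text
    if (start + chunkSize >= text.length) break;
  }

  return chunks;
}
```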

pgvector

I used a Supabase Postgres table with a vector column.

The similarity search uses:

match_rag_chunks(query_embedding)

This returns rows ordered by cosine similarity.
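
On the application side this is just a Supabase RPC call. The sketch below assumes the same embedding model as ingestion and that match_rag_chunks only needs the query embedding; the real function in my schema may take extra parameters such as a match count.

```typescript
// Hypothetical search helper: embed the query, then call the Postgres function via RPC.
import { createClient } from "@supabase/supabase-js";
import OpenAI from "openai";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function searchChunks(query: string) {
  // Embed the user query with the same model used at ingestion time
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  // match_rag_chunks returns rows ordered by cosine similarity
  const { data: chunks, error } = await supabase.rpc("match_rag_chunks", {
    query_embedding: data[0].embedding,
  });
  if (error) throw error;

  return chunks;
}
```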

Query flow

  1. Embed the user query
  2. Fetch the top matching chunks
  3. Feed them into GPT-4o mini
  4. Stream the response to the browser
  5. Show citation sources
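
Put together, the chat route looks roughly like the sketch below. It assumes the searchChunks helper from the pgvector section and the OpenAI streaming chat API; returning citations in a response header is purely illustrative.

```typescript
// Hypothetical chat route sketch covering the five steps above
import OpenAI from "openai";
import { searchChunks } from "./search"; // helper from the pgvector section

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { question } = await req.json();

  // Steps 1-2: embed the query and fetch the top matching chunks
  const chunks = await searchChunks(question);
  const context = chunks
    .map((c: { content: string }) => c.content)
    .join("\n---\n");

  // Step 3: feed the chunks into GPT-4o mini as grounding context
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    stream: true,
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: question },
    ],
  });

  // Step 4: stream the response tokens to the browser
  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      for await (const part of completion) {
        const token = part.choices[0]?.delta?.content ?? "";
        if (token) controller.enqueue(encoder.encode(token));
      }
      controller.close();
    },
  });

  // Step 5: citation sources (illustrative: sent as a header alongside the stream)
  return new Response(body, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "X-Citations": encodeURIComponent(
        JSON.stringify(chunks.map((c: { title: string }) => c.title))
      ),
    },
  });
}
```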

Why this matters

This is production-quality RAG in a real environment: it combines retrieval system design, vector search pipelines, and streaming AI interfaces in a single working application.

Keep exploring


Thanks for reading! If you found this useful, check out my other posts or explore the live demos in my AI Lab.
