AI & Agents

Retrieval-Augmented Generation (RAG)

An AI pattern that retrieves relevant documents from a vector database and injects them into the LLM prompt — so the model can answer from custom knowledge it was not trained on.

RAG combines (1) an embedding model that turns documents and queries into vectors, (2) a vector store (pgvector, Pinecone, Qdrant, Weaviate) that performs fast nearest-neighbour search, and (3) an LLM that conditions its answer on the retrieved snippets. RAG is the dominant production pattern for "chat with your docs" use cases over Slack history, codebases, policy documents, and support tickets. Modern RAG pipelines often add hybrid search (vector + BM25), re-rankers, query rewriting, and citation enforcement.
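The three components can be sketched end to end in a few lines of Python. This is a minimal illustration under loud assumptions, not a production design: `embed` is a toy bag-of-words stand-in for a real embedding model, `InMemoryVectorStore` is a hypothetical brute-force replacement for a vector database, and the sample documents are invented. The final step only builds the prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical vector store: brute-force nearest-neighbour search."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, doc: str) -> None:
        self._items.append((doc, embed(doc)))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str, snippets: list[str]) -> str:
    """Inject retrieved snippets into the prompt, numbered so they can be cited."""
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Answer using only the context below and cite snippet numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Invented sample documents, for illustration only.
store = InMemoryVectorStore()
for doc in [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]:
    store.add(doc)

question = "How many days do refunds take?"
snippets = store.search(question, k=1)
prompt = build_prompt(question, snippets)
print(prompt)
```

In a real system, `embed` would call an embedding model and `search` would be delegated to one of the vector stores named above. The retrieve-then-inject flow stays the same.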

Related terms

Vector Database

A database optimized for similarity search over high-dimensional embedding vectors — the backbone of RAG and semantic search.

Embeddings

Dense numerical vector representations of text (or images, code, audio) where semantically similar inputs map to nearby vectors.

Large Language Model (LLM)

A neural network with billions of parameters trained on broad text corpora to predict and generate language — the engine behind ChatGPT, Claude, and Gemini.

Agentic AI

AI systems where an LLM plans and executes multi-step tasks by calling external tools, accessing files, browsing, and adjusting its own approach based on results.
