RAG
Retrieval-augmented generation, a pattern where a system retrieves relevant context before asking a model to generate an answer.
RAG quality depends on retrieval, permissions, chunking, citations, and refusal behavior, not only the language model.
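The retrieve-then-generate flow can be sketched in a few lines. This is a minimal illustration, not a production design: `retrieve`, `answer`, and the `generate` callback are hypothetical names, the retriever is a toy word-overlap ranker standing in for a real search backend, and the refusal message is an assumption.

```python
from typing import Callable


def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Toy retriever (assumption): rank documents by shared query words."""
    q_words = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(q_words & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id, text))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(doc_id, text) for _, doc_id, text in scored[:k]]


def answer(query: str, corpus: dict[str, str], generate: Callable[[str], str]) -> str:
    """Retrieve context first, then ask the model; refuse if nothing was found."""
    chunks = retrieve(query, corpus)
    if not chunks:
        return "I could not find supporting context for that question."
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in chunks)
    prompt = (
        "Answer using only the context below and cite sources.\n"
        f"{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```

Passing `generate` as a callback keeps the sketch independent of any particular model API; in tests it can be replaced with a stub that echoes the prompt.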
Vector database
A search backend optimized for storing embeddings and finding semantically similar documents, chunks, products, or records.
The right vector database affects retrieval quality, latency, cost, metadata filtering, and operational complexity.
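To make the core operation concrete, here is a minimal in-memory sketch of what a vector search backend does: store vectors and return the nearest ones by cosine similarity. The class `InMemoryVectorStore` and its methods are hypothetical, and the brute-force scan is an assumption for clarity; real systems typically use approximate indexes to keep latency down.

```python
import math


class InMemoryVectorStore:
    """Brute-force cosine-similarity store (illustrative only)."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []  # (id, unit vector)

    @staticmethod
    def _normalize(vector: list[float]) -> list[float]:
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot use a zero vector")
        return [x / norm for x in vector]

    def add(self, item_id: str, vector: list[float]) -> None:
        # Store unit vectors so a dot product equals cosine similarity.
        self._rows.append((item_id, self._normalize(vector)))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        q = self._normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), item_id)
            for item_id, vec in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]
```

The scan is linear in the number of stored vectors, which is why index choice dominates latency and cost at scale.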
Embedding
A numeric representation of text, code, image, or structured data that lets search systems compare semantic similarity.
Embedding choice affects retrieval quality, storage cost, re-indexing effort, and how well queries match real documents.
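Two of these points can be shown with small calculations: how similarity between embeddings is compared, and how dimensionality drives storage. The sketch below assumes cosine similarity as the comparison and 4-byte float32 values; the function names are hypothetical, and the storage figure ignores index overhead and metadata.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def raw_storage_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw vector storage only (assumes float32 by default, no index overhead)."""
    return num_vectors * dimensions * bytes_per_value
```

Under these assumptions, one million 768-dimensional vectors need 3,072,000,000 bytes of raw storage, and doubling the dimension doubles that figure, which is one reason switching embedding models can force a full re-index.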
Hybrid search
A retrieval method that combines keyword search with vector search to handle exact terms and semantic intent together.
Hybrid search is important for product names, error codes, API methods, legal clauses, and other exact-match queries.
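One common way to combine the two result lists is reciprocal rank fusion (RRF), which adds up 1 / (k + rank) for each document across rankings. The sketch below is an assumption about how the merge might be done, not the only option; the function name and the constant k = 60 (a commonly used default) are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists; documents ranked well in several lists rise."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, with a keyword ranking of `["err-504", "timeouts", "retries"]` and a vector ranking of `["timeouts", "gateway", "err-504"]`, the documents that appear in both lists (`timeouts` and `err-504`) end up ahead of those that appear in only one. RRF uses only ranks, so it avoids having to calibrate keyword scores against vector similarities.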
Metadata filtering
Filtering retrieval results by fields such as tenant, user role, document type, language, timestamp, or product area.
Weak metadata filtering can return context that is irrelevant, outdated, or not authorized for the current user.
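A minimal sketch of such a filter, applied before any chunk can reach the prompt, might look like the following. The `Chunk` fields and the `filter_chunks` function are hypothetical; a real deployment would usually push these conditions into the search backend's query rather than filter in application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant: str
    allowed_roles: frozenset[str]
    language: str = "en"


def filter_chunks(
    chunks: list[Chunk],
    tenant: str,
    role: str,
    language: Optional[str] = None,
) -> list[Chunk]:
    """Keep only chunks the current user may see (and, optionally, one language)."""
    kept = []
    for chunk in chunks:
        if chunk.tenant != tenant:
            continue  # never leak another tenant's data
        if role not in chunk.allowed_roles:
            continue  # role is not authorized for this chunk
        if language is not None and chunk.language != language:
            continue
        kept.append(chunk)
    return kept
```

Treating tenant and role as mandatory arguments, rather than optional ones, makes it harder to run an unfiltered query by accident.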
Reranking
A second-stage step that reorders retrieved candidate documents or chunks so the most useful context reaches the final model prompt.
Reranking can improve answer quality, but it adds latency and cost that should be tested against real queries.
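The shape of a reranking step can be sketched as: score each candidate against the query with a more careful scorer, then keep the top few. In the sketch below, `rerank` and `overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real reranking model such as a cross-encoder.

```python
from typing import Callable


def overlap_score(query: str, text: str) -> float:
    """Stand-in scorer (assumption): fraction of query words found in the text."""
    q_words = set(query.lower().split())
    if not q_words:
        return 0.0
    return len(q_words & set(text.lower().split())) / len(q_words)


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 2,
) -> list[str]:
    """Reorder candidates by score_fn and keep the best top_n."""
    scored = [(score_fn(query, text), index, text) for index, text in enumerate(candidates)]
    # Higher score first; the original retrieval order breaks ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [text for _, _, text in scored[:top_n]]
```

Because `score_fn` runs once per candidate, the added latency grows with the size of the candidate list, so that size is a natural parameter to tune against real queries.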
Grounding
The practice of requiring generated answers to rely on retrieved evidence instead of unsupported model memory.
Grounding improves trust only when citations, retrieved chunks, and refusal behavior are tested together.
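One simple check that ties citations, retrieved chunks, and refusal together is to reject any answer that cites nothing or cites a source that was never retrieved. The sketch below assumes citations are written as bracketed ids like `[doc-1]`; the function names and refusal text are hypothetical. It does not verify that the cited chunk actually supports the claim, which needs a separate evaluation.

```python
import re

REFUSAL = "I don't have enough supporting context to answer that."


def cited_ids(answer: str) -> set[str]:
    """Extract bracketed citation ids such as [doc-1] (assumed format)."""
    return set(re.findall(r"\[([A-Za-z0-9_-]+)\]", answer))


def enforce_grounding(answer: str, retrieved_ids: list[str]) -> str:
    """Return the answer only if every citation points at a retrieved chunk."""
    cites = cited_ids(answer)
    if not cites or not cites <= set(retrieved_ids):
        return REFUSAL
    return answer
```

A test suite for grounding would typically exercise all three paths: a properly cited answer passes through, an uncited answer is refused, and an answer citing an unknown id is refused.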