RAG and working with documents · Lesson 1
What RAG is and when you need it
The core idea: add relevant documents to the context before the model answers.
The idea in one sentence
RAG (retrieval-augmented generation) means searching your document base for relevant fragments and inserting them into the prompt, so that the model's answer is grounded in what was found.
When RAG beats a long context
- Many documents (they don't fit in the window).
- Documents change (you don't want to paste the whole set into the prompt every time).
- Transparency matters ("show what the answer is based on").
When a long context beats RAG
- A single document that the model needs to read in full.
- When splitting into chunks would break the meaning.
- When the questions require seeing the whole context at once.
A minimal RAG pipeline
- Documents → chunks (e.g., 500-1000 tokens each).
- Chunks → embeddings → a vector DB.
- Question → embedding → search for top-K chunks.
- Prompt: "Answer based on these chunks."
- Answer + links to sources.
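The steps above can be sketched in plain Python. This is only an illustration under loud assumptions: `embed` is a toy word-count vector standing in for a real embedding model, the "vector DB" is an in-memory list, chunk size is counted in words rather than tokens, and all function names and sample documents are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping chunks of about `size` words (a stand-in for tokens)."""
    words = text.split()
    step = size - overlap  # assumes size > overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs, size=50, overlap=10):
    """docs: {name: text}. Returns a list of (source, chunk, vector) tuples."""
    index = []
    for name, text in docs.items():
        for chunk in chunk_words(text, size, overlap):
            index.append((name, chunk, embed(chunk)))
    return index


def top_k(index, question, k=3):
    """Return the k best (score, source, chunk) tuples for the question."""
    q = embed(question)
    scored = [(cosine(q, vec), source, chunk) for source, chunk, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical sample documents, invented for this example.
docs = {
    "refunds.txt": "Refunds are issued within 14 days of purchase if the item is unused.",
    "shipping.txt": "Standard shipping takes 3 to 5 business days within the country.",
}
index = build_index(docs)
hits = top_k(index, "How long does shipping take?", k=1)
print(hits[0][1], "->", hits[0][2])
```

Each hit keeps its source name next to the chunk text, which is what makes the last step (answer plus links to sources) possible later.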
Practical exercise
What to do after this lesson
Build a simple RAG over 5-10 PDFs: split into chunks, compute embeddings, find the top-3 for a test question.
Ready-to-use prompt
Template for this lesson
Copy and adapt to your context. Text in angle brackets should be replaced.
Answer the user's question strictly based on the provided document fragments.
Fragments: <…>
Question: <…>
Rules:
1. If the answer isn't in the fragments, say "not found in the documents."
2. Cite the source (document name / page) for every statement.
3. Don't supplement from general knowledge.
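If you fill the template from code, a small helper can do the substitution. The sketch below is one possible way to do it: the function name `build_prompt`, the `(source, text)` pair format, the numbered fragment layout, and the sample fragment are all assumptions made for the example, not part of the template.

```python
def build_prompt(fragments, question):
    """Fill the lesson's template.

    fragments: list of (source, text) pairs, e.g. ("handbook.pdf, p. 4", "...").
    """
    # Label every fragment with its source so the model can cite it.
    numbered = "\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(fragments, start=1)
    )
    return (
        "Answer the user's question strictly based on the provided document fragments.\n\n"
        f"Fragments:\n{numbered}\n\n"
        f"Question: {question}\n\n"
        "Rules:\n"
        '1. If the answer isn\'t in the fragments, say "not found in the documents."\n'
        "2. Cite the source (document name / page) for every statement.\n"
        "3. Don't supplement from general knowledge."
    )


# Hypothetical fragment and question, invented for this example.
prompt = build_prompt(
    [("handbook.pdf, p. 4", "Employees accrue 2 vacation days per month.")],
    "How many vacation days do employees accrue per month?",
)
print(prompt)
```

Putting the source label next to each fragment gives the model something concrete to cite, which rule 2 depends on.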
Common mistakes
What people get wrong
- Doing RAG when a long context would have been enough.
- Careless chunking: the pieces are cut where it breaks the meaning.
- Not citing sources in the answer.
Pro tips
What works but is rarely documented
- Hybrid search (BM25 + embeddings) is often better than pure embeddings.
- Add a reranker as a second stage that reorders the top results of the first search.
- Build a simple baseline first, then improve it against an eval set.
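Two of these tips can be sketched briefly. The lesson does not say how to combine BM25 and embedding results; the code below uses reciprocal rank fusion (RRF), one common way to merge two ranked lists, with the commonly used constant `k=60`. The `hit_rate` function is a very simple eval metric. The chunk ids, rankings, and eval set are invented for the example, and `retrieve` stands for whatever search function you are testing.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of chunk ids (best first) into one fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk earns more when it appears near the top of any list.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


def hit_rate(eval_set, retrieve, k=3):
    """Share of questions whose expected chunk id appears in the top-k results.

    eval_set: list of (question, expected_chunk_id) pairs.
    retrieve: callable mapping a question to a ranked list of chunk ids.
    """
    hits = sum(1 for question, expected in eval_set if expected in retrieve(question)[:k])
    return hits / len(eval_set)


# Hypothetical rankings from a keyword search and an embedding search.
bm25_ranking = ["c2", "c1", "c4"]
dense_ranking = ["c1", "c3", "c2"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)

# Hypothetical eval set with canned retrieval results.
canned = {"q1": ["c1", "c2"], "q2": ["c3", "c4"]}
eval_set = [("q1", "c2"), ("q2", "c9")]
rate = hit_rate(eval_set, lambda q: canned[q], k=2)
print(rate)
```

Measuring the baseline with a metric like this before adding hybrid search or a reranker shows whether each change actually helps.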
When to use
Large document corpora, knowledge that changes frequently, a requirement to cite sources.
When not to use
A single long document that fits into a long-context model's window and can be answered from directly.
Quiz: 2 questions
1. RAG beats a long context when:
2. What is mandatory in a RAG answer?