RAG and working with documents · Lesson 2

Chunking, embeddings, retrieval

Where the typical RAG errors live and how to avoid them.

Chunking

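Start with 500-1000 tokens per chunk and tune from there.
An overlap of 50-100 tokens between neighboring chunks reduces the cases where a sentence or idea is cut in half at a boundary.
Split on logical boundaries (paragraph, section) rather than at arbitrary offsets.
Save metadata with every chunk: doc_id, page, section.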
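
A minimal sketch of the rules above, assuming whitespace-separated words as a stand-in for model tokens (a real pipeline would count tokens with the tokenizer of its embedding model). The names Chunk and chunk_section are hypothetical, and the defaults of 800 and 80 are just one point inside the ranges given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    doc_id: str
    page: Optional[int]
    section: str
    index: int  # position of the chunk within its section
    text: str


def chunk_section(doc_id, section, text, page=None, size=800, overlap=80):
    """Split one section into overlapping windows of whitespace tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    tokens = text.split()  # assumption: words approximate tokens
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + size]
        chunks.append(Chunk(doc_id, page, section, len(chunks), " ".join(window)))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the section
    return chunks
```

Calling the function once per section keeps every chunk inside a logical boundary, and the metadata travels with the text from the start.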

Embeddings

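Modern models work well: OpenAI text-embedding-3-large, Cohere Embed v3, Voyage voyage-2.
Use one embedding model per project: vectors from different models live in different spaces and cannot be compared.
Cache embeddings so that unchanged text is never embedded twice; it saves a lot of cost and indexing time.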
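
One way to cache, sketched under assumptions: embed_fn is a hypothetical callable that wraps whichever embedding API the project uses, and the store is an in-memory dict (a real system would likely persist it). The model name is part of the cache key, so vectors from different models can never be mixed up.

```python
import hashlib


class EmbeddingCache:
    """Memoize embeddings by (model name, text) so repeats cost nothing."""

    def __init__(self, embed_fn, model_name):
        self.embed_fn = embed_fn      # hypothetical: text -> list of floats
        self.model_name = model_name
        self.store = {}               # assumption: in-memory only
        self.misses = 0

    def _key(self, text):
        raw = f"{self.model_name}\n{text}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, text):
        key = self._key(text)
        if key not in self.store:
            self.misses += 1          # only a miss triggers a model call
            self.store[key] = self.embed_fn(text)
        return self.store[key]
```

Re-indexing a corpus where only a few documents changed then calls the model only for the chunks whose text is new.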

Retrieval

Pure vector search by cosine similarity is often noisy.
Use hybrid retrieval (BM25 + vector).
A reranker (Cohere Rerank, a BGE reranker) on a second pass significantly raises quality.
top-K = 5-10 is enough for most tasks.
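
The lesson does not say how to merge the BM25 and vector results. A common choice is reciprocal rank fusion (RRF), sketched below as an assumption rather than the course's method. It takes the two ranked lists of chunk ids as given; the function name and the constant k=60 (a conventional default) are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60, top_k=5):
    """Merge several ranked lists of chunk ids into one list.

    Each list contributes 1 / (k + rank) per chunk, so chunks ranked
    high by both BM25 and vector search rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for a stable order.
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ordered[:top_k]


bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits], top_k=3)
```

RRF uses only ranks, so BM25 scores and cosine similarities never need to be put on a common scale. The fused list is what a second-pass reranker would then reorder.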

What we measure

Recall@K: the share of questions for which the reference chunk lands in the top-K.
Answer quality on an eval set.
Latency.
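
Recall@K can be computed in a few lines. This sketch assumes an eval set of (question, reference chunk id) pairs and a hypothetical retrieve callable that returns chunk ids ranked best first.

```python
def recall_at_k(eval_set, retrieve, k=10):
    """Fraction of questions whose reference chunk appears in the top-k."""
    if not eval_set:
        raise ValueError("eval_set is empty")
    hits = 0
    for question, reference_chunk_id in eval_set:
        if reference_chunk_id in retrieve(question)[:k]:
            hits += 1
    return hits / len(eval_set)
```

When comparing chunking strategies, chunk ids usually change between runs, so one option is to store a reference passage per question and count a hit when a retrieved chunk contains it.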

Practical exercise

What to do after this lesson

Build an eval set of 30 (question, reference chunk) pairs. Run it against several chunking strategies and measure Recall@10 for each.

Ready-to-use prompt

Template for this lesson

Copy and adapt to your context. Text in angle brackets should be replaced.

Help me configure chunking.

Documents: <…>
Content type (legal / technical / marketing): <…>
Average paragraph length: <…>

Give me:
- A splitting strategy.
- Chunk size and overlap.
- Which metadata to save.
- How to validate.

Common mistakes

What people get wrong

Chunks that are too large: retrieval precision is lost.
Chunks that are too small: the context needed to answer is lost.
Not saving metadata with the chunks.

Pro tips

What works but no one documents

An overlap of 50-100 tokens is a simple fix for ideas cut at chunk boundaries.
A reranker almost always improves the result.
Metadata gives you the ability to filter results by source or tag.
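
A small illustration of metadata filtering, assuming chunks are stored as plain dicts with the metadata fields next to the text. The function name is hypothetical; a vector database would typically apply such a filter inside the query itself.

```python
def filter_by_metadata(chunks, **required):
    """Keep only the chunks whose metadata matches every required field."""
    return [
        chunk for chunk in chunks
        if all(chunk.get(key) == value for key, value in required.items())
    ]
```

For example, filter_by_metadata(chunks, doc_id="contract-7") restricts the candidates to a single document before similarity search or reranking.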

When to use

Any RAG system.

When not to use

Tasks that do not involve retrieval over documents.

Official sources

Cohere — RAG guide

Quiz — 2 questions

1. What significantly raises retrieval quality?

2. What is best to store alongside a chunk?
