§ CHEATSHEET · APR 23, 2026 · RAG · RETRIEVAL v1.0

RAG defaults 2026 cheatsheet: copy, paste, ship

The RAG parameter defaults that moved my top-1 accuracy from 74% to 91% in 2026. Chunk size, overlap, rerank, hybrid BM25, and the 2 flags people forget.
Adrian Marcus. Working engineer. Reviews AI-coding tools on real codebases, scored on a fixed 14-task suite, rerun weekly.

The Slack-pasteable version of TCC’s RAG defaults for April 2026. The full reasoning, the ablations, and the per-parameter charts are in the RAG defaults guide; this is just the table and the copy-paste config.

The settings that move top-1 accuracy

| Parameter | Naive default | Ship in 2026 | Direction of move |
|---|---|---|---|
| Chunk size (tokens) | 512 | 768 | +~5 pp on technical corpora |
| Chunk overlap (tokens) | 0 | 128 | +~3 pp; recovers boundary context |
| Chunking strategy | semantic | fixed-size | +~2 pp; simpler, more reliable |
| Retriever top-k (before rerank) | 5 | 20 | +~6 pp when paired with rerank |
| Reranker | none | Cohere Rerank v4.0, final k=3 | +~4–5 pp; the single biggest lever |
| Embedding model | ada-002 era | Voyage 4 Large or OpenAI text-embedding-3-large | +~1 pp; modern embeddings still help |
| Hybrid BM25 weight | 0 | 0.35 (technical corpus) | +~2 pp on code/SKU; 0 on prose |
| Boilerplate strip | off | on | +~2 pp; PDF and HTML extraction |

The direction-of-move column comes from independent and TCC editorial ablations on a mixed-technical corpus. Your fixture will produce different absolute deltas, but the direction holds. The single largest lever is the reranker: if you only have time for one change, set top-k to 20 and bolt on Cohere Rerank v4.0.
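The hybrid BM25 weight in the table is a convex blend of sparse and dense scores. A minimal sketch of how that fusion usually works, with min-max normalization since raw BM25 scores are unbounded (the function names and normalization choice are my assumptions, not TCC's harness):

```python
def minmax(scores):
    # Squash scores into [0, 1] so sparse and dense are comparable.
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def hybrid_rank(dense_scores, bm25_scores, weight=0.35):
    # weight=0.35 for technical corpora; weight=0 falls back to
    # pure dense retrieval, the recommendation for prose.
    d, b = minmax(dense_scores), minmax(bm25_scores)
    fused = [weight * bs + (1 - weight) * ds for ds, bs in zip(d, b)]
    # Return candidate indices, best-first.
    return sorted(range(len(fused)), key=lambda i: -fused[i])
```

At weight 0.35 a strong dense match still wins unless BM25 disagrees hard, which is the behavior you want on code and SKU lookups.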

Paste-this config

# rag.yaml
chunker:
  strategy: fixed_size
  size_tokens: 768
  overlap_tokens: 128
  strip_boilerplate: true

embed:
  provider: voyage
  model: voyage-4-large

retrieve:
  top_k: 20
  hybrid_bm25_weight: 0.35    # set to 0 on prose-only corpora

rerank:
  provider: cohere
  model: rerank-v4.0-pro
  final_k: 3

eval:
  ground_truth_path: evals/ragtruth.jsonl
  min_top_1: 0.88
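The eval block implies a hard gate on top-1 accuracy against the ground-truth set. A minimal sketch of such a gate, assuming one JSON record per line with expected and retrieved doc ids (the field names are hypothetical; the post does not show the ragtruth.jsonl schema):

```python
import json

def top1_accuracy(results_path, min_top_1=0.88):
    # Assumed schema per line:
    # {"query": ..., "expected_doc_id": ..., "retrieved_ids": [...]}
    hits = total = 0
    with open(results_path) as f:
        for line in f:
            rec = json.loads(line)
            total += 1
            # Top-1 hit: the first retrieved id is the expected one.
            hits += rec["retrieved_ids"][:1] == [rec["expected_doc_id"]]
    acc = hits / total
    assert acc >= min_top_1, f"top-1 {acc:.2%} below gate {min_top_1:.0%}"
    return acc
```

Wire this into CI so a chunking or reranker change that regresses retrieval fails the build instead of shipping.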

2 flags people forget

  1. Normalize whitespace before chunking. PDF extraction leaves ragged spacing. Run the text through re.sub(r"\s+", " ", text).strip() before you chunk. Most of the time, the top reply in the recurring r/MachineLearning “RAG quality is bad on PDFs” thread converges on exactly this.
  2. Store the chunk offset, not just the text. When you show the retrieved chunk to a user, you will want to link into the source document at the right byte offset. Reconstructing that offset from the chunk text later is a multi-day project. Store it at index time.
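Both flags fold into the chunker in a few lines. A sketch, assuming a rough 4-chars-per-token budget in place of a real tokenizer (a production chunker should count actual tokens):

```python
import re

def normalize(text):
    # Flag 1: collapse ragged PDF whitespace before chunking.
    return re.sub(r"\s+", " ", text).strip()

def chunk_with_offsets(text, size=768 * 4, overlap=128 * 4):
    # 768-token chunks, 128-token overlap, approximated in chars.
    # Flag 2: record each chunk's start offset at index time. Note
    # these offsets index the normalized text, so store (or serve)
    # the normalized document alongside them.
    chunks, step = [], size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece:
            chunks.append({"text": piece, "offset": start})
        if start + size >= len(text):
            break
    return chunks
```

Usage: `chunk_with_offsets(normalize(raw_text))` gives you chunks that already carry the link-back offset flag 2 asks for.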

Further reading

The full methodology, the ground-truth set, and the per-parameter ablations are in the RAG defaults guide. The eval harness we wrap around every RAG change is described in the evals-without-judges post. The Pinecone Learn series has the long-form explanation of hybrid scoring if you want the math.

One-line takeaway

Fixed-size 768-token chunks with 128 overlap, retrieve top-20, rerank to top-3 with Cohere Rerank v4.0, BM25 hybrid at 0.35 only on technical corpora, boilerplate strip on by default. That is the production-grade RAG default in April 2026.
