Updated 3 weeks ago


§ ARCHIVE · 2 ENTRIES

#Tag · context

All entries, filed and dated.

2 entries updated May 7

APR 23 11 min
analysis

Long-context evals diverge from reality: the 1M-token gap

Vendor 1M-context benchmark scores keep running 30+ points ahead of what I measure on my production RAG task. The three reasons the benchmarks lie, and what I trust instead.


APR 23 12 min
context

Gemini 3.1 Pro review: cheapest frontier token, 4 places it lags

Gemini 3.1 Pro scored 7.8 on refactoring and 7.9 on structured output at $0.21 per task. The domains where cheap wins and where you need to route traffic elsewhere.

