§ GUIDES · 6 ENTRIES

Engineering guides for building with AI

Deep dives, not blog posts. Every guide is peer-reviewed, dated, and versioned. Filter by topic, tool, or stack.

6 guides updated May 7
APR 23 10 min
advanced
Refactor legacy TS with AI: 14 call sites, no re-export trap
A step-by-step AI refactor on a 63k-line TypeScript monorepo. The prompt, the re-export trap, and why Claude Opus 4.7 + Aider beat the IDE agent by 3 call sites.
legacy refactor typescript
8.4 peer score
APR 23 10 min
eval
Evals without LLM judges: a harness that catches regressions
How I score LLM pipelines without an LLM-as-judge. Deterministic graders, property-based checks, and the 4 reasons a judge model keeps biting you in production.
evals intermediate
7.9 peer score
APR 23 10 min
chunking
RAG defaults 2026: chunks, rerankers, 3 settings that matter
The chunk size, overlap, rerank, and top-k values that moved my retrieval accuracy from 74% to 91%. Tested on a 1,400-chunk corpus with a ground-truth answer set.
intermediate rag retrieval
8.1 peer score
APR 23 13 min
advanced
The case against autonomous coding agents in 2026
Autonomous coding agents still fail 1 in 9 production runs on my suite. The three failure modes that cause it, and where a bounded planner is the honest answer.
agents essay process
8.9 peer score
APR 23 13 min
intermediate
Structured outputs, three years in: the one pattern that survived
Three years of shipping LLM structured outputs in production. The one pattern that survived, the three that did not, and the strict-JSON failure rate I run at today.
json schema structured
9.1 peer score
APR 23 6 min
advanced
Agent loops and retries: 4-step policy that cuts 429s 30x
The retry policy that cut my agent-loop 429 rate from 6% to 0.2% across 4 vendors. Jitter, step-budget interlock, and the one thing you should never retry.
agents tool-use
9.4 peer score