thecodingcolosseum.com
Updated 2 weeks ago
§ ARCHIVE · 1 ENTRY
#Tag · evals
All entries, filed and dated.
1 entry
updated May 7
APR 23 · 10 min · eval
Evals without LLM judges: a harness that catches regressions
How I score LLM pipelines without an LLM-as-judge. Deterministic graders, property-based checks, and the 4 reasons a judge model keeps biting you in production.
evals · intermediate · 7.9 peer score