APR 23
10 min
codegen
codegen
GPT-5.3-Codex review: 9/10 on strict JSON, the test-gen surprise
Adrian Marcus tested GPT-5.3-Codex on a 14-task suite: 9.0 on strict JSON, 8.7 on test-gen, and a costly loss on long-horizon agent planning. Full numbers inside.
—
unrated
read →10 min
APR 23
7 min
gpt-5
gpt-5
Property-based test generation prompt: 6 invariants on first run
The prompt that writes 6 Hypothesis invariants for a JSON-diff library on the first run, with shrink strategies. Tested on GPT-5.3-Codex, Claude Opus 4.7, and Aider.
—
unrated
read →7 min