APR 23
7 min
cheatsheet
cheatsheet
Claude Opus 4.7 tool calling: 7 settings for reliable use
The 7 settings that move Claude Opus 4.7 tool-call reliability from 94% to 99.2%. Adaptive thinking, tool_choice, disable_parallel_tool_use, stop_sequences, and the sampling params you must now omit.
—
unrated
read →7 min
APR 23
10 min
codegen
codegen
GPT-5.3-Codex review: 9/10 on strict JSON, the test-gen surprise
Adrian Marcus tested GPT-5.3-Codex on a 14-task suite: 9.0 on strict JSON, 8.7 on test-gen, and a costly loss on long-horizon agent planning. Full numbers inside.
—
unrated
read →10 min