§ PROMPT · APR 23, 2026 · AGENTS · CLAUDE · PLANNER v1.0

Bounded agent planner prompt: force the give-up, save the bill

The 9-line prompt that moves my 5-step agent exit rate from 2/5 to 5/5 on Claude Opus 4.7. Why it works, where it fails, and the models I tested it on.
Adrian Marcus. Working engineer. Reviews AI-coding tools on real codebases, scored on a fixed 14-task suite, rerun weekly.
# AGENTS · claude-opus-4-7 · gpt-5.3-codex
You are a planner for a bounded agent. Budget: {N} steps.
For each step, output JSON:
  {kind: "call" | "give_up",
   tool?: string, args?: object,
   terminal?: boolean, reason?: string}

Rules:
- Decrement budget on every step.
- Prefer give_up over a weak plan.
- If you are missing a required argument, return give_up with reason "missing:<name>".

The bounded-budget task is the most-flagged agent failure mode on the recurring r/LocalLLaMA and r/ChatGPTCoding “agents that respect a step budget” threads. The pattern is always the same: the task needs 5 steps, the budget is 4, and the agent should exit at step 4 with a structured give-up message instead of pushing on to step 5. On a bare default prompt, most frontier models fail this in 3 of 5 runs. The prompt below moves the exit rate to 5 of 5 on Claude Opus 4.7 and 4 of 5 on GPT-5.3-Codex on the TCC editorial fixture (median of 5 runs).

Why agents ignore step budgets by default

LLMs are trained on data where trying again is rewarded and stopping early is penalized. That bias runs deep. When an agent hits its budget without completing the task, the model’s default behaviour is to merge two steps into one, skip a verification, or quietly exceed the limit by one turn. It is not misbehaving. It is doing exactly what the training signal rewarded. You have to explicitly give it a second reward path.

The other structural problem is what engineers call the “tail cost”. The most expensive tokens in any agent run are the last 3-4 turns, where the context is largest. An agent that keeps trying past its budget does not just waste a little compute: it wastes the most expensive compute. Budget enforcement at the plan stage is the cheapest intervention.
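To make the tail cost concrete, here is a rough sketch of how input tokens grow when every turn resends the full history. The token counts are made-up illustration values, not measurements, and `input_tokens_per_turn` is a hypothetical helper, not part of any library.

```python
from typing import List


def input_tokens_per_turn(base_tokens: int, added_per_turn: int, turns: int) -> List[int]:
    """Input tokens sent on each turn when the whole history is resent every time."""
    return [base_tokens + added_per_turn * i for i in range(turns)]


# Illustrative numbers only: a 2,000-token prompt, 1,500 tokens appended per turn.
per_turn = input_tokens_per_turn(base_tokens=2000, added_per_turn=1500, turns=5)
total = sum(per_turn)

print(per_turn)              # [2000, 3500, 5000, 6500, 8000]
print(per_turn[-1] / total)  # 0.32
```

Under these assumed numbers, the one turn past a 4-step budget accounts for roughly a third of all input tokens in the run, which is why cutting it matters more than trimming an early turn.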

The prompt

You are a planner. You will be given a task and a step budget. Follow this loop.

1. Read the task. Write a plan as a numbered list. Each item is one step.
2. If the plan needs more steps than the budget, do not shrink the plan. Instead:
   - Emit exactly: `GIVE_UP: plan requires N steps, budget is M.`
   - Replace N and M with integers. Do nothing else.
3. If the plan fits in the budget, execute step 1. Report the result.
4. Continue, one step per turn, until the task is done OR you hit the budget.
5. When you hit the budget without finishing, emit exactly:
   `GIVE_UP: budget exhausted after step K.`
   Replace K with the last completed step number. Do nothing else.

Rules:
- Never silently exceed the budget.
- Never merge two steps into one to fit the budget.
- The give-up message is a success case. It is not a failure. Emit it cleanly.
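Because the prompt demands two exact exit strings, the harness can parse them strictly instead of guessing at intent. The following is a minimal sketch of such a parser; the function name and the dictionary keys (`kind`, `required`, `budget`, `last_step`) are assumptions for illustration, not part of the prompt.

```python
import re
from typing import Optional

# One pattern per exit message defined in the prompt.
PLAN_TOO_LONG_RE = re.compile(
    r"^GIVE_UP: plan requires (\d+) steps, budget is (\d+)\.$", re.MULTILINE
)
EXHAUSTED_RE = re.compile(
    r"^GIVE_UP: budget exhausted after step (\d+)\.$", re.MULTILINE
)


def parse_give_up(text: str) -> Optional[dict]:
    """Return a structured give-up record, or None if no exact marker is present."""
    match = PLAN_TOO_LONG_RE.search(text)
    if match:
        return {
            "kind": "plan_too_long",
            "required": int(match.group(1)),
            "budget": int(match.group(2)),
        }
    match = EXHAUSTED_RE.search(text)
    if match:
        return {"kind": "budget_exhausted", "last_step": int(match.group(1))}
    return None


print(parse_give_up("Plan:\n1. ...\n\nGIVE_UP: plan requires 5 steps, budget is 3."))
# {'kind': 'plan_too_long', 'required': 5, 'budget': 3}
```

The patterns are deliberately strict: a paraphrased give-up ("I think I should stop here") returns None, so the harness treats it as a non-exit and falls back to its own turn limit.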

Example: prompt in action

Task: “Migrate the payments table to add a currency column, backfill with ‘USD’, update all downstream queries, and run the test suite.” Budget: 3.

Turn 1 (planning):

Plan:
1. Write the migration file (add currency column, default NULL).
2. Backfill existing rows to 'USD'.
3. Update downstream query files.
4. Add NOT NULL constraint after backfill.
5. Run test suite.

GIVE_UP: plan requires 5 steps, budget is 3.

The agent exits cleanly. No code was written, no steps were started. The harness receives `GIVE_UP: plan requires 5 steps, budget is 3.` and routes it to a human.

Compare to the default prompt: without the bounded planner, the model starts writing the migration file, gets to step 3, then writes “finishing the remaining steps in one pass” and merges steps 4 and 5 into a single turn that skips the test run. The bug ships.

Why it works, in 5 bullets

Wiring the GIVE_UP signal in Python

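import re

# Matches the exit marker at the start of any line in the model output.
GIVE_UP_RE = re.compile(r'^GIVE_UP:', re.MULTILINE)

# Placeholder: paste the full prompt text from "The prompt" section here.
BOUNDED_PLANNER_PROMPT = "<paste the bounded planner prompt here>"

class BoundedPlanner:
    def __init__(self, llm_client, max_turns: int = 5):
        # llm_client is assumed to expose chat(messages) and return an
        # OpenAI-style response (response.choices[0].message.content).
        self.client = llm_client
        self.max_turns = max_turns
        self.turns = 0

    def run(self, task: str) -> dict:
        self.turns = 0  # reset so the same planner can be reused across tasks
        content = ""    # defined even if max_turns is 0
        messages = [
            {"role": "system", "content": BOUNDED_PLANNER_PROMPT},
            {"role": "user", "content": f"Task: {task}\nBudget: {self.max_turns} steps."},
        ]
        while self.turns < self.max_turns:
            response = self.client.chat(messages)
            content = response.choices[0].message.content
            self.turns += 1

            # Clean exit: the model emitted the give-up marker.
            if GIVE_UP_RE.search(content):
                return {"status": "GIVE_UP", "turn": self.turns, "message": content}

            if "TASK_COMPLETE" in content:
                return {"status": "COMPLETE", "turn": self.turns, "message": content}

            messages.append({"role": "assistant", "content": content})
            messages.append({"role": "user", "content": "Continue. Next step only."})

        # Hard cap: force exit if model never emitted GIVE_UP
        return {"status": "BUDGET_EXCEEDED", "turn": self.turns, "message": content}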

The hard cap at max_turns is the safety net for the case where the model never emits GIVE_UP. On Gemini 3.1 Pro (2 of 5 clean exits on the TCC fixture), you will hit this path. Never assume the prompt alone is sufficient enforcement.
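To see the safety net fire, here is a self-contained sketch that swaps the LLM client for a stub that never gives up. `run_with_hard_cap` and `stubborn_model` are hypothetical names for this illustration; the stub stands in for a model that ignores the prompt.

```python
from typing import Callable


def run_with_hard_cap(next_reply: Callable[[int], str], max_turns: int) -> dict:
    """Ask for at most max_turns replies; stop early on GIVE_UP or TASK_COMPLETE."""
    last = ""
    for turn in range(1, max_turns + 1):
        last = next_reply(turn)
        if any(line.startswith("GIVE_UP:") for line in last.splitlines()):
            return {"status": "GIVE_UP", "turn": turn, "message": last}
        if "TASK_COMPLETE" in last:
            return {"status": "COMPLETE", "turn": turn, "message": last}
    # The model never exited on its own, so the harness ends the run.
    return {"status": "BUDGET_EXCEEDED", "turn": max_turns, "message": last}


calls = []


def stubborn_model(turn: int) -> str:
    """Hypothetical stub: always keeps working, never emits an exit marker."""
    calls.append(turn)
    return f"Working on step {turn}, almost done."


result = run_with_hard_cap(stubborn_model, max_turns=3)
print(result["status"], len(calls))  # BUDGET_EXCEEDED 3
```

The stub is called exactly three times and no more, which is the property the prompt alone cannot guarantee.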

Tier-based budget recommendations

The step budget should map to your user tier, not just the task complexity. A budget that is too generous on a free tier burns tokens for marginal gains in output quality.

At max_turns=3, the bounded planner rejects tasks that need more than 3 steps at planning time, which avoids the worst case: a task that runs 3 turns and exits mid-way, leaving the codebase in a partial state.

Failure modes

Tested on (TCC editorial scoring)

Methodology and full per-task scoring are on the 14-task editorial scorecard. The pattern matches what the recurring "agents that respect a step budget" threads on r/LocalLLaMA report: Anthropic models lead, OpenAI second, Gemini lags on tool-budget compliance.

Frequently asked questions

What if the task genuinely needs more steps than the budget allows?
That is the correct output. The agent should give up, and the caller should either raise the budget or split the task into smaller pieces. A GIVE_UP on a legitimate task is not a prompt failure; it is the system working as designed.

Can I use this with LangChain or LangGraph?
Yes. Wire it as a conditional edge. If AgentGovernor.check_termination() returns "GIVE_UP" or "MAX_TURNS_REACHED", route to a terminal node. The GIVE_UP signal is the same regardless of framework.
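As a rough, framework-free sketch of that routing decision, the function below maps a termination status to the name of the next node. The `status` state key and the node names `terminal` and `planner` are assumptions for illustration; in LangGraph, a function of this shape is the kind of routing callable you would pass to `add_conditional_edges`.

```python
# Statuses that should end the run, taken from the answer above.
TERMINAL_STATUSES = {"GIVE_UP", "MAX_TURNS_REACHED"}


def route_after_check(state: dict) -> str:
    """Return the next node name based on the termination status in the state."""
    if state.get("status") in TERMINAL_STATUSES:
        return "terminal"
    return "planner"


print(route_after_check({"status": "GIVE_UP"}))   # terminal
print(route_after_check({"status": "CONTINUE"}))  # planner
```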

Does this work for tool-use agents, not just text planners?
The same structure works. Replace "steps" with "tool calls" and "plan" with "tool sequence". The GIVE_UP marker still fires before the first tool call if the sequence is over budget.
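A harness-side guard can back this up by refusing to run any tool when the planned sequence is already over budget. This is a minimal sketch: `check_tool_sequence` and the tool names are hypothetical, and the marker keeps the original "steps" wording so the same harness regex still matches.

```python
from typing import List, Optional


def check_tool_sequence(tool_sequence: List[str], budget: int) -> Optional[str]:
    """Return a GIVE_UP marker if the planned tool calls exceed the budget, else None."""
    required = len(tool_sequence)
    if required > budget:
        return f"GIVE_UP: plan requires {required} steps, budget is {budget}."
    return None


# Hypothetical tool names mirroring the migration example.
planned = ["write_migration", "backfill", "update_queries", "add_constraint", "run_tests"]

print(check_tool_sequence(planned, budget=3))
# GIVE_UP: plan requires 5 steps, budget is 3.
print(check_tool_sequence(planned[:3], budget=3))
# None
```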

What about the case where the model counts steps wrong?
Add a step-counter to your harness. Count turns at the application layer, not just from the model's self-report. The model's step count is a hint; the harness count is the authority.
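One way to sketch that counter: the harness increments its own count on every turn, enforces the budget from that count, and only logs the model's self-reported step number when it disagrees. The class name, the `Step N` pattern, and the sample outputs are assumptions for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed convention: the model mentions its step as "Step N" somewhere in the output.
STEP_CLAIM_RE = re.compile(r"\bstep (\d+)\b", re.IGNORECASE)


@dataclass
class HarnessCounter:
    budget: int
    used: int = 0
    mismatches: List[Tuple[int, int]] = field(default_factory=list)

    def record(self, model_output: str) -> bool:
        """Count one turn; return True while the harness count is within budget."""
        self.used += 1
        claim = STEP_CLAIM_RE.search(model_output)
        if claim and int(claim.group(1)) != self.used:
            # Keep the disagreement for debugging, but never act on the model's number.
            self.mismatches.append((self.used, int(claim.group(1))))
        return self.used <= self.budget


counter = HarnessCounter(budget=3)
outputs = [
    "Step 1: wrote migration.",
    "Step 2: backfilled rows.",
    "Step 2: updated queries.",  # model miscounts here
    "Step 3: ran tests.",        # model believes it is still within budget
]
within_budget = [counter.record(text) for text in outputs]

print(within_budget)       # [True, True, True, False]
print(counter.mismatches)  # [(3, 2), (4, 3)]
```

The fourth turn is rejected even though the model reports "Step 3", because the decision comes from the harness count alone.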

The retry policy that wraps this prompt in production is on the agent loop retry policy post. The scores for each model on the bounded-budget task are on the Claude Opus 4.7 review and the GPT-5.3-Codex review. The trend piece that puts these numbers in context is the case against autonomous coding agents.

One-line takeaway

Give the model a second reward path called "give-up is a success", force an exact exit marker, ban step-merging, add a hard-cap governor in your harness, and the bounded-budget task stops being the flakiest thing in your agent loop.
