The agent is just a loop

Everyone is shipping agents now. There's a product category, a conference track, a job title.

So it's worth saying the quiet part out loud: an agent is a loop.

A model, some tools, and a loop that keeps calling the model until it says it's done. Four beats: call the model, run the tool it asked for, hand back the result, go round again. Every agent you have ever used, from a toy script to a coding agent worth billions, is that loop with more tools and a longer system prompt.

Here's the part that matters for you, though. If you use Claude Code, you are already inside the loop. You don't usually write it. You drive it. And you can drive it pretty hard without touching an SDK.

The loop is the easy bit. Knowing when it should stop is the whole job, and that part nobody puts on the slide.

You're already in the loop

Open Claude Code, type a request, watch it read files, run commands, edit, run the tests. That's the four beats, turn after turn, until it decides it's finished.

"Until it decides" is the soft underbelly. The model has no view of where it's going. It predicts the next token and predicts again. It cannot look at its own output and check it against reality. I've written about that blindness before: nobody is home in there.

Now let that thing decide when the loop ends. A model that can't see where it's going, told to keep going until it feels done, will declare victory on a task it never finished, confidently, every time. The loop doesn't fix the blindness. It multiplies it, because each turn builds on the last.

So the real work is bolting three things onto the loop: a stopping condition a machine can check, real feedback on every turn, and a hard limit so it can't run forever. The good news is that Claude Code gives you all three without code.

The weak way: ask nicely

The first thing people try is prompting. "Keep going until all the tests pass. Run them after each change."

This sometimes works, and it's worth doing. But the thing checking whether the tests pass is the same model that can't verify its own work. It will tell you the tests are green because that's the likely next token after twenty turns of trying, not because it ran anything. Asking nicely puts the model in charge of its own exit condition, which is exactly the part it's bad at. I've made the broader case against asking nicely elsewhere.

You want something that isn't the model deciding when the loop stops.

The strong way: a Stop hook

Claude Code fires a Stop hook when the model thinks it's finished. And a Stop hook can refuse.

If your hook exits with code 2, Claude is blocked from stopping. Whatever you write to stderr is handed back to it as the reason, and the loop continues. That's a loop-until-green you control, enforced by the harness instead of the model's optimism:

bash

#!/usr/bin/env bash
# .claude/hooks/loop-until-green.sh
if ! npm test --silent; then
  echo "Tests are still failing. Keep going until they pass." >&2
  exit 2          # blocks the stop; Claude reads stderr and works on
fi
exit 0            # tests pass: let it stop

Wire it into .claude/settings.json:

json

{
  "hooks": {
    "Stop": [{ "hooks": [{ "type": "command", "command": ".claude/hooks/loop-until-green.sh" }] }]
  }
}

(If you prefer JSON over exit codes, the hook can print {"decision": "block", "reason": "..."} to stdout instead. Same effect.)

Now the stopping condition is a command, not a wish. The model can't talk its way past a failing npm test, because the gate isn't asking its opinion. This is the same deterministic control I keep banging on about with hooks: stop suggesting, start guaranteeing.

One catch worth knowing: a Stop hook that always blocks is an infinite loop. The hook input carries a stop_hook_active flag that's true when Claude is already continuing because of a stop hook. Read it, or count turns in a temp file, and let go after a sane number. A loop you can't exit is worse than no loop.

The built-in way: `/goal` and `/loop`

You don't always need to write a hook. Two commands cover most of it.

/goal sets a completion condition and works autonomously across turns until it's met, tracking turns, tokens, and elapsed time as it goes. The detail that makes it more than a prompt: after each turn a separate, smaller model checks whether the condition is actually satisfied. The writer isn't also the grader. That's the verification gap, closed for you.

text

/goal all tests under tests/auth pass and the linter is clean

Give it a condition the evaluator can actually judge from what's on screen, the same discipline as a Stop hook, just phrased in English.

/loop is the other shape: run a prompt or command on a repeating cadence. Hand it an interval, or let it pace itself and pick a gap each round based on what it sees.

text

/loop 5m check CI on the open PR and ping me if it goes red

That's the loop pointed outward at something that changes over time, instead of inward at a task to finish.

And when the work fans out rather than iterates, spawn subagents: several loops running in parallel, each on its own slice, results gathered at the end.

Building your own

If you're building a product rather than driving Claude Code, you write the loop yourself. It's still ten lines. Against Claude, in TypeScript, no framework:

typescript

let messages: Anthropic.MessageParam[] = [{ role: 'user', content: task }]

while (true) {
  const response = await client.messages.create({
    model: 'claude-opus-4-8',
    max_tokens: 16000,
    tools,
    messages,
  })

  if (response.stop_reason === 'end_turn') break // model says it's done

  messages.push({ role: 'assistant', content: response.content })
  const results = []
  for (const block of response.content) {
    if (block.type === 'tool_use') {
      const output = await runTool(block.name, block.input)
      results.push({ type: 'tool_result', tool_use_id: block.id, content: output })
    }
  }
  messages.push({ role: 'user', content: results })
}

Look at the exit: end_turn means the model decides. You've just inherited the same problem Claude Code solves with hooks and /goal, except now it's yours to solve. The API gives you levers for it. Task budgets (beta) hand the model a token budget for the whole loop so it can wrap up gracefully instead of grinding. And while (true) should never be literal: cap the turns, and surface it loudly when you hit the ceiling rather than pretending the unfinished work is done.

typescript

let turns = 0
while (turns++ < 25) {
  /* ... the loop ... */
}

Feedback is the loop's eyes

One thing runs through all of this, whether you're in Claude Code or the SDK.

Each turn, the model gets back whatever your tools return. Rich, real signal and it can course-correct: the test names the failing assertion, the type checker points at the line. Return nothing useful, or let the model grade its own homework, and you've built an echo chamber with a runtime.

That's why the Stop hook runs the actual tests, why /goal uses a separate evaluator, why your runTool should hand back the real exit code and stderr, failures and all. The confidence of the model's prose tells you nothing. The exit code tells you everything.

What to actually do

Define "done" as a command, not a wish. A Stop hook, a /goal condition, a test gate. If you can't write the check, the loop can't know either.
Feed real results back every turn. Never let the model be the only thing that judges the model.
Put a hard cap on it. stop_hook_active, a turn counter, a token budget. Make it fail loudly at the ceiling, not silently when it gives up.
Verify the output yourself anyway. A green loop is not the same as correct work. Never ship code you don't understand, no matter how good the tests looked on the way out.

None of this is exotic. It's the unglamorous scaffolding around a loop, most of it a hook file and a slash command, and it's the entire difference between an agent that does work and one that performs the appearance of work until you stop watching.

The loop was never the clever part. Anyone can write while (true). The craft is in the condition that lets you write break.

You're already in the loop ​

The weak way: ask nicely ​

The strong way: a Stop hook ​

The built-in way: /goal and /loop ​

Building your own ​

Feedback is the loop's eyes ​

What to actually do ​

related

Working with an agent, properly

The spec you didn't read

You can't spot the bug if you didn't write the code