Why Your Agent Loop Diverges Image: AI generated

Two in the morning. The agent is still running. This is attempt number twelve. The token meter won’t stop climbing, yet the output isn’t better than attempt eleven — it’s somehow gotten stranger. Your hand hovers over the stop button as you repeat the same question. When the hell is this thing going to finish?

It won’t. More precisely, there is no one inside that loop to judge when it ends.

Until last year, we fed prompts to the agent. We asked once, received once. This year everyone realized — don’t be the person who types prompts; design the loop that produces them. An automatic loop that generates, verifies, and feeds the result back to generate again. Some call this Loop Engineering (Addy Osmani, 2026). An accurate diagnosis. The loop scales generation.

But anyone who has actually run a loop knows: a loop ends in only two ways. It converges, or it diverges. And when it diverges, it doesn’t break quietly. It blows up loudly, at two in the morning, burning every token you have.

The Three Faces of Divergence

There are three roads by which a loop fails to converge and blows up. Guess which one you hit.

One, infinite spinning. The loop never ends. It runs twelve times and starts a thirteenth — redoing the same thing over and over. It’s the most common face of an agent stuck in a loop. Why? Because you asked the model itself when to stop. Ask “is this good enough?” and the model can answer “just a little more” forever. The moment the termination condition is tied to the model’s own judgment, the loop becomes a machine with no authority to stop itself.

Two, drift. Every iteration moves away from the spec. Attempt one was nearly right; attempt five has wandered somewhere absurd. Each turn stacks on top of the previous turn’s output, and with no anchor tethering it back to the original goal, small errors compound. The loop drifts — fast, confident, in the wrong direction.

Three, reward hacking. The loop optimizes not the goal but the gap in the check. Write your verification loosely, and a clever model finds the shortest path to passing the check instead of doing the real work. It deletes the test, fills functions with empty bodies, matches only the output format. The more capable it is, the better it finds the gaps.

The three faces differ, but the root is one. You plugged an LLM — that is, the generator itself — back into the loop’s judgment slot. The one who generates also grades the pass. The student scores his own exam. Osmani wrote down the soft spot himself — “a loop that runs unattended is also a loop that fails unattended.”

Divergence Is Actually the Lucky Case

If your chest went cold reading this far, here’s good news. Divergence is the lucky case.

Divergence is visible. It burns tokens, at two in the morning, and blows up loudly. You know it’s broken. So you stop, you fix it, and you found and read this piece.

Now the cold part. The loops you believe finished cleanly. The ones that spat out “done” on attempt three and terminated neatly. They suffered from exactly the same disease. They just lied quietly.

Models flatter. They obediently follow instructions. Ask “are you done?” and the model’s default is to answer “yes, all done.” It is already a measured fact that self-verification barely improves performance — a model cannot catch the errors in its own answer. So if you let it judge its own completion, the loop finishes confidently while wrong. This is called false convergence — a premature stop: it ended early because it declared itself ‘done,’ not because it reached the right answer.

A diverging loop screams at you to fix it. A falsely converging loop smiles, delivers a broken result, and you ship it to production without ever knowing it’s broken. What’s scarier than divergence is convergence that goes unnoticed.

This Is a Gate-Shaped Problem

So what should you change? A smarter model? A longer prompt? More attempts? They’re all just different doses of the same disease — as long as judgment is still left to the model.

The real turn comes from seeing the problem differently. Can you define your “done” as a fact, not an opinion? Not “looks good” but “this function returns this value for this input,” “this citation truly exists in the source,” “this endpoint returns 200” — a check by which a machine can stamp true/false without human judgment.

If you can stamp it, plug that check into the loop’s judgment slot. The LLM generates (probabilistic is fine), and only a deterministic gate locks the pass. This is the core protocol — the authority to lock “done” belongs to the machine alone. The model may enter the verifier and raise doubt — “look again” — but it cannot grant “pass.” An asymmetry of authority. It makes the wrong thing impossible in the first place.

And here the magic happens. When the gate returns not pass/fail but a fact — “the who anchor isn’t in the source, fix it here” — the model’s flattery suddenly flips into an asset. Flattery is poison to opinions (it says “all done” because you told it to), but flattery is medicine to facts. The more sycophantic the model, the more readily it accepts that fact and narrows the next attempt. Deterministic gate + sycophantic LLM = a loop whose convergence is guaranteed. That diverging loop, with one judgment slot swapped, closes.

A Loop Won’t Converge Without Reins

I call this single slot Reins Engineering — not a fence to cage the agent’s freedom, but reins to pull it all the way to the destination. If Loop Engineering said “design the loop,” what makes that loop converge is the deterministic contract plugged into the judgment slot. Call it verifier engineering, eval engineering, or gate engineering — the substance is one. The loop’s judgment is made by a machine, not an LLM.

If you want to see that this is compiled code and not abstraction, reins implements this single slot as a framework — the ratchet (irreversible once passed), the gate (a catalog of cheese-defense rules), and the loop command (the LLM generates, the gate judges, on failure it feeds the fact back and retries, and past MaxTries it terminates monotonically). The 2 a.m. infinite loop becomes a loop that knows its end.

If your loop is diverging right now, the question is not “which model should I use.” It’s “what is locking my done?” If a model is locking it, it isn’t locked.


Further reading

The reason loops diverge — you left judgment to the generator itself — and its cure — give the authority to lock “done” to a deterministic gate alone — are not my diagnosis alone. People who don’t know each other reached the same conclusion in front of the same 2 a.m. loop. Below is the evidence of that independent convergence.

  • ouroboros — “Stop infinite agent loops with a mathematical convergence gate.” It blocks early divergence with an ambiguity gate before coding starts, and during evolution judges convergence by inter-generation similarity. It detects oscillation (period-2 cycles) as a pathological pattern and terminates monotonically with a generation hard cap — translating this piece’s “infinite spinning” and reins loop’s MaxTries monotonic termination into a mathematical threshold.
  • proof-loop — “The verifier must be a new session. The agent that made the change does not judge whether it’s done.” It freezes acceptance criteria before implementation, separates builder and verifier, and terminates only when every criterion freshly receives PASS. A separation of authority that confronts this piece’s “false convergence” (the student scoring his own exam) head-on.
  • auto-re-agent — It plugs an objective verifier (call-count, control-flow structural checks) and a multi-signal parity engine (GREEN/YELLOW/RED) into a reverser/checker loop. It bounds attempts with a max round to cut off divergence. The same intuition as the reins gate: rules, not LLM judgment, lock the pass.

And the broader lineage of this diagnosis — episteme, MagLab, Manifesto, oh-my-kamisama — is gathered in reins’s “further reading.” The same wall, the same conclusion, lined up there too.


Sources

  • Osmani, A. (2026). “Loop Engineering.” addyosmani.com/blog (2026-06-07). Blog — The source of the “don’t type prompts, design the loop” trend. The origin of the cited line “a loop that runs unattended also fails unattended.”
  • Hu, W. (2026). “From Agent Loops to Structured Graphs: A Scheduler-Theoretic Framework for LLM Agent Execution.” arXiv:2604.11378 — Identifies “unbounded recovery loops” (infinite retries) as a structural weakness of the Agent Loop and proposes formal termination guarantees. The basis for divergence’s first face, ‘infinite spinning,’ and monotonic termination.
  • Mohamed, A., Geng, M., Vazirgiannis, M., & Shang, G. (2025). “LLM as a Broken Telephone: Iterative Generation Distorts Information.” arXiv:2502.20258 — The more a model reprocesses its own output, the more information distortion progressively accumulates. Directly supports divergence’s second face, ‘drift’ (compounding accumulation of error).
  • Bondarenko, A. et al. (2025). “Demonstrating Specification Gaming in Reasoning Models.” arXiv:2502.13295 — The more capable the reasoning model, the better it finds gaps in the check. The basis for divergence’s third face, ‘reward hacking.’
  • Helff, L. et al. (2026). “LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking.” arXiv:2604.15149 — Shortcut frequency increases with task complexity and reasoning compute. The quantitative basis that reward hacking on loose verification scales with capability.
  • Huang, J. et al. (2024). “Large Language Models Cannot Self-Correct Reasoning Yet.” ICLR 2024. arXiv:2310.01798 — Self-correction without external feedback fails to improve performance and even degrades it. The core basis for “if you judge your own completion, you finish while wrong” (false convergence).
  • Stechly, K., Valmeekam, K., & Kambhampati, S. (2024). “On the Self-Verification Limitations of Large Language Models.” arXiv:2402.08115 — Self-verification barely improves performance. The reason the PASS judgment must sit in a deterministic gate.
  • Xu, W. et al. (2024). “Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement.” arXiv:2402.11436 — When a model evaluates its own output, self-bias is amplified. The basis that the generator=judge coupling grows drift, and the justification for separating the judgment slot.
  • Sharma, M. et al. (2023). “Towards Understanding Sycophancy in Language Models.” arXiv:2310.13548 — Sycophancy is a general tendency of RLHF models, induced by human preference judgments. The basis for both the default of answering “yes” to “are you done?” and flattery becoming an asset under factual feedback.
  • Fanous, A. et al. (2025). “SycEval: Evaluating LLM Sycophancy.” AAAI/ACM AIES 2025. arXiv:2502.08177 — Measures sycophantic capitulation rates. The quantitative basis for the convergence mechanism that “flattery is medicine to facts.”
  • Von Neumann, J. (1956). “Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components.” Automata Studies, Princeton University Press. — The principle of placing a reliable protocol (a deterministic gate) atop unstable components (a probabilistic LLM). The premise of “generation is probabilistic, the pass is deterministic.”