Looping Agents vs Human Intervention - Why Most People Are Doing It Wrong

Today I watched another wave of social media posts absolutely trashing the idea of looping agents. The trigger was Peter Steinberger from OpenAI, who said, "You shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents."

The backlash was immediate and loud.

“Token costs! Agents get stuck! Comprehension debt! Loss of control!”

After reading the takes, I have one response:

Most of you are just doing it wrong. Plain and simple.

I’ve been successfully running agentic loops for well over a year, with roots going back to self-hosted AI art development in 2023. The difference? I don’t treat agents as magic or theatrics. I treat them like extremely fast, highly intelligent developers who need clear requirements and strong guardrails, like "...even though you think you can, don't refactor this code". Agents move faster than I, as a human, can.

Why Did We Abandon Test Driven Development (TDD)?

We like to say we abandoned TDD because we wanted to “ship faster,” but that’s not the full story.

The real reasons are more complicated. As organizations grew, we added layers of complexity, project management, product owners, and processes between the people defining requirements and the developers writing the code — a giant game of elementary school telephone. The individuals writing the tickets were often not the engineers building the solution. As a result, proper testing got pushed to the end until it was practically forgotten.

On top of that, many coders found writing tests first genuinely difficult and unnatural. Even engineers who liked TDD grew frustrated as most organizations pushed testing toward the end of the process.

How Smart Looping Actually Works

The correct sequence is:

  1. Clearly define your requirements and acceptance criteria, no different than with human developers.
  2. Write comprehensive tests based on those requirements, no different than with human developers.
  3. Give the agent one clear instruction: “Do not stop until all tests pass.” Again, no different than with human developers.

The agent or human developer then enters a proper loop (plan, act, test, reflect, repeat) with a hard, measurable definition of done set before any work starts.
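The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: `run_tests` and `ask_agent` are hypothetical callables you would supply yourself (for example, a wrapper around your test runner and a wrapper around whatever agent you use), and `max_iterations` is an assumed safety cap. In this sketch, `ask_agent` stands in for "plan" and "act", `run_tests` is "test", and feeding the failures back in is "reflect".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    passed: bool              # True only if the final test run had no failures
    iterations: int           # how many times the agent was prompted
    last_failures: List[str]  # failing test names from the final run


def run_until_green(
    run_tests: Callable[[], List[str]],      # hypothetical: returns names of failing tests
    ask_agent: Callable[[List[str]], None],  # hypothetical: hands failures to the agent
    max_iterations: int = 10,                # assumed cap so the loop always terminates
) -> LoopResult:
    """Prompt the agent until the tests pass or the iteration cap is hit."""
    failures = run_tests()
    iterations = 0
    while failures and iterations < max_iterations:
        ask_agent(failures)     # plan + act: the agent works on the failing tests
        iterations += 1
        failures = run_tests()  # test + reflect: the next prompt uses fresh results
    return LoopResult(passed=not failures, iterations=iterations, last_failures=failures)


# Demo with stand-ins: each "agent turn" fixes exactly one failing test.
remaining = ["test_login", "test_logout"]
result = run_until_green(
    run_tests=lambda: list(remaining),
    ask_agent=lambda failures: remaining.remove(failures[0]),
)
print(result.passed, result.iterations)  # True 2
```

The definition of done lives entirely in `run_tests`, which exists before the loop starts. The loop itself never decides what "correct" means.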

Addressing the Common Criticisms

High / Unpredictable Token Costs
Valid concern. But strong, well-defined tests act as guardrails. When you put serious thought into your tests upfront, the agent has a concrete stopping point instead of an open-ended task, so it isn’t allowed to loop forever. Knowing when to use AI and when a local tool will do reduces this risk further.
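If you want a hard ceiling on spend in addition to good tests, a small budget guard is one option. The sketch below is an assumption on my part, not something prescribed above: `TokenBudget` and `BudgetExceeded` are hypothetical names, and it assumes you can get a token count for each agent call from your provider or tooling.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the loop has spent more tokens than allowed."""


class TokenBudget:
    """Tracks cumulative token usage against a fixed limit (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        # Never report a negative balance, even after an overrun.
        return max(self.limit - self.used, 0)

    def charge(self, tokens: int) -> None:
        """Record the tokens from one agent call; raise once the limit is passed."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self.used += tokens
        if self.used > self.limit:
            raise BudgetExceeded(
                f"used {self.used} tokens, limit is {self.limit}"
            )


# Usage: charge after every agent call and let the exception end the loop.
budget = TokenBudget(limit=100)
budget.charge(60)
print(budget.remaining)  # 40
```

Raising an exception rather than returning a flag means the loop cannot quietly ignore the limit. The caller has to handle the overrun explicitly.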

Agents Getting Stuck in Bad Loops
This only happens when you fail to give the agent clear success criteria. Good tests prevent this entirely. If it gets stuck in a bad loop, that is on you.
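One way to make "stuck" measurable is to watch for the same set of failing tests repeating across iterations. The sketch below is illustrative only: `StallDetector` and its `patience` parameter are hypothetical, and treating an unchanged failure set as "no progress" is an assumption that will not fit every project.

```python
from collections import deque
from typing import Deque, FrozenSet, Iterable


class StallDetector:
    """Flags a loop as stalled when the failing tests stop changing (hypothetical helper)."""

    def __init__(self, patience: int = 3) -> None:
        if patience < 2:
            raise ValueError("patience must be at least 2")
        # Keep only the most recent `patience` failure sets.
        self._recent: Deque[FrozenSet[str]] = deque(maxlen=patience)

    def observe(self, failures: Iterable[str]) -> bool:
        """Record one test run; return True if the last `patience` runs were identical."""
        self._recent.append(frozenset(failures))
        window_full = len(self._recent) == self._recent.maxlen
        return window_full and len(set(self._recent)) == 1


# Usage: call observe() after each test run and stop the loop when it returns True.
detector = StallDetector(patience=3)
print(detector.observe(["test_login"]))  # False
print(detector.observe(["test_login"]))  # False
print(detector.observe(["test_login"]))  # True
```

When the detector fires, that is the signal to stop and revisit the success criteria rather than let the agent keep going.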

Lack of Real Verification
If you wrote the tests, then you defined what “correct” looks like. Lack of verification falls on YOU!

Comprehension Debt
This is the most legitimate concern. If you’re worried about one fast agent creating comprehension debt, you should be equally concerned about fifty offshore developers reporting to one architect creating the same comprehension debt. The solution is the same: create strict development guidelines that both agents and humans must follow. If developers, AI or human, get ahead of the architect, then slowing down, reassessing where you are, and determining a path forward may be in order. There is nothing wrong with slowing down to then speed up.

Cognitive Surrender / Loss of Judgment
Not a problem when you do the hard thinking upfront — prior to the loop.

No Reliable Fallback When AI Goes Down
This one is real. This is why you still need strong human architects, engineers, developers, and tools that both humans and agents can use.

Quality & Security Risks
Agents can introduce bugs. So can human developers at every level, junior and senior. Show me a senior who hasn’t taken down production once in their career and I’ll call fraud. The solution doesn’t change: design quality in from the beginning.

Final Thought

Looping agents are incredibly powerful — but only when you treat them like highly productive developers, not magic boxes.

The United States manufacturing industry chose not to listen to W. Edwards Deming and put quality at the end of the process. Deming took his ideas to Japan instead, and we all saw the difference in product quality.

History is repeating itself with AI. What side are you going to fall on?

Architecture Principle

Quality isn’t something you hope the agent gets right during the loop. Quality is something you design before the loop even begins.

Appendix: Real-World Creation Stats

This article was created using the same voice-first + research process it describes.

Last Reviewed — 2026-06-10

Reviewed and confirmed current.
