Jan 6, 2026

The Ralph Wiggum Approach: Running AI Coding Agents for Hours (Not Minutes)

The Problem with One-Shot AI Coding

We’ve all been there. You fire up Claude Code, drop in a prompt like “build me a REST API for todos,” and hope for the best. Maybe it works. Maybe it doesn’t. Either way, you’re staring at your screen, watching tokens burn, wondering if the agent is making progress or just spinning its wheels.

The fundamental issue? Traditional AI coding is a one-shot deal. You get one context window, one shot at the problem, and then you’re either done or you’re not.

But here’s a thought: what if you ran that same agent, on the same prompt, 10 times in a row? Each time it picks up where it left off. Each time it sees what it previously did. Each time it iterates, improves, and gets closer to done.

That’s the Ralph Wiggum approach. And it’s genuinely game-changing.

Quick Start (5 minutes)

Want to try Ralph right now? Here’s the fastest way:

Install the plugin:

/plugin install ralph-wiggum@claude-plugins-official

Run your first loop:

/ralph-loop "Add JSDoc comments to all exported functions in src/utils/" --max-iterations 10

Check the git diff when done

That’s it. You just ran your first autonomous loop. Claude iterated on the same task 10 times, improving its work each time, until it was done.

What is Ralph?

Ralph is a development methodology built on a deceptively simple insight: iteration beats perfection.

The technique comes from Geoffrey Huntley, who described it simply: “Ralph is a Bash loop.”

while :; do cat PROMPT.md | claude ; done

The name comes from Ralph Wiggum of The Simpsons - perpetually confused, always making mistakes, but never stopping. That’s the vibe.

At its core, Ralph repeatedly feeds an AI agent the same prompt until a stop condition is met. The agent sees its previous work (via git history and modified files), learns from it, and iteratively improves.

“The technique is deterministically bad in an undeterministic world. It’s better to fail predictably than succeed unpredictably.” — Geoffrey Huntley

Even Matt Pocock is a fan:

“Ralph Wiggum + Opus 4.5 is really, really good”

How It Actually Works

The Ralph Wiggum plugin uses a Stop hook mechanism. Here’s what happens:

You invoke /ralph-loop with a prompt and completion criteria
Claude works on the task
When Claude tries to exit (thinks it’s done), the Stop hook intercepts it using exit code 2
The hook checks for your completion promise (e.g., <promise>COMPLETE</promise>)
If not found, it re-feeds the original prompt and Claude continues
Each iteration sees modified files and git history from previous runs

/ralph-loop "Migrate all tests from Jest to Vitest" \
  --max-iterations 50 \
  --completion-promise "All tests migrated"

The loop continues until Claude outputs the completion promise or you hit the iteration limit.

Real-World Example: Migrating Tests

Let’s walk through a complete example. Say you want to migrate from Jest to Vitest:

Prompt:

Migrate all tests from Jest to Vitest.
- Update all test files to use Vitest syntax
- Update package.json scripts
- Remove Jest dependencies
- Add Vitest dependencies
- Run tests after migration
Output <promise>MIGRATED</promise> when all tests pass.

What happens:

Iteration	What Claude Does
1	Updates test files to Vitest syntax, tests fail
2	Fixes syntax errors, tests still fail
3	Updates package.json, removes Jest, adds Vitest
4	Runs tests, they pass, outputs `<promise>MIGRATED</promise>`

You wake up to a fully migrated test suite. No manual re-prompting, no debugging in between. Just set it up and let it run.

The Results Are Real

This isn’t theoretical. People are shipping serious work:

Cursed programming language - Built over 3 months with one Ralph loop. A functional compiler with LLVM compilation, standard library, and partial editor support. Keywords include slay (function), sus (variable), and based (true).
6+ repositories overnight - Y Combinator hackathon teams shipped multiple repos for $297 in API costs. Work that would have cost $50K in contractor time.
4-minute tests to 2-second tests - One developer migrated integration tests to unit tests while sleeping. The loop handled the mechanical conversion automatically.
Full APIs with TDD - Iteratively building features, running tests, fixing failures, and repeating until all tests pass.

These are cherry-picked successes. For every overnight win, there are loops that burned through iterations without converging. Failed attempts still cost money. But when it works, it works remarkably well.

When Ralph Shines

Ralph works best for tasks with clear completion criteria and mechanical execution:

Use Case	Example Prompt
Large refactors	”Convert all class components to functional components with hooks. Output `<promise>MIGRATED</promise>` when npm run typecheck passes.”
Framework migrations	”Migrate all tests from Jest to Vitest. Output `<promise>COMPLETE</promise>` when all tests pass.”
TDD workflows	”Implement the checkout flow to make all tests in checkout.test.ts pass. Output `<promise>TESTS_PASS</promise>` when done.”
Test coverage	”Add tests for all uncovered functions in src/“
TypeScript adoption	”Add type annotations to all functions in src/utils/“
Greenfield builds	”Build a REST API with CRUD operations. Output `<promise>COMPLETE</promise>` when all endpoints work and tests pass.”

The common thread: well-defined success metrics. If you can describe “done” precisely, Ralph can iterate toward it.

When NOT to Use It

Ralph doesn’t replace human judgment. It automates mechanical execution. Don’t use autonomous loops for:

Ambiguous requirements - If you can’t define “done” precisely, the loop won’t converge
Architectural decisions - Novel abstractions need human reasoning, not iteration
Security-sensitive code - Auth, payments, data handling need human review at each step
Exploration - “Figure out why the app is slow” isn’t a good Ralph task

Autonomous loops automate the mechanical. They don’t automate the decisions about what’s worth building.

Common Pitfalls

Avoid these mistakes when getting started with Ralph:

❌ Too ambitious on first run → Start with 10-20 iterations, not 50. Get a feel for how it works first.

❌ Vague completion criteria → “Make it better” vs “All tests pass with >80% coverage.” Be specific.

❌ Forgetting to keep CI green → Each iteration must pass tests or you’re in trouble. Broken code compounds.

❌ Not monitoring costs → A 50-iteration loop can cost $50-100+. Watch your usage.

❌ Using it for judgment-heavy tasks → Ralph is for mechanical work, not design decisions or UX choices.

The Core Mechanics

1. Progress Tracking

The agent commits its work at each iteration and appends progress to a progress.txt file. This serves as:

A log for future iterations to read
Documentation of what was attempted
A way to prevent the agent from repeating mistakes

2. Keep CI Green

This is critical. Each iteration must pass tests and type checks. Committing broken code hamstrings future iterations and creates a debugging nightmare.

The rule: if tests fail, the agent must fix them before continuing.

3. Clear Stop Conditions

This is where most people trip up. Ralph needs precise exit criteria:

❌ "Build a todo API and make it good"
✅ "Build a REST API with CRUD operations. Input validation required.
     Tests must pass (>80% coverage). README with API docs.
     Output <promise>COMPLETE</promise> when done."

Important: The --completion-promise flag uses exact string matching, which is unreliable. Always use --max-iterations as your real safety net.

Cost Awareness

This matters. Autonomous loops burn tokens.

A 50-iteration loop on a large codebase can easily cost $50-100+ in API credits depending on context size. On a Claude Code subscription, you’ll hit your usage limits faster.

Best practices:

Set --max-iterations conservatively (start with 10-20)
Scale up once you understand the token consumption pattern
Use tests/build success as the completion criteria, not Claude’s self-assessment
Monitor your usage during longer runs

Troubleshooting

Loop stuck in infinite cycle? → Check your completion promise is actually achievable → Lower --max-iterations → Review progress.txt to see what’s blocking

Tests keep failing? → Your prompt might be asking for too much → Break the task into smaller chunks → Check if dependencies are missing

Costs too high? → Reduce --max-iterations → Start with smaller codebases → Use more specific completion criteria

Claude keeps saying “done” but it’s not? → Your completion promise is too vague → Add objective criteria (tests pass, build succeeds)

Pros and Cons

✅ Pros	❌ Cons
Ships code while you sleep	Can burn through tokens quickly
Great for mechanical tasks	Not for judgment-heavy work
Self-correcting feedback loop	Requires good prompt engineering
Reduces manual re-prompting	Can get stuck if criteria are vague
Builds on git history	Windows setup has jq dependency
Growing ecosystem of tools	Learning curve for effective prompts

The Ecosystem

The pattern has spawned useful tools beyond the official plugin:

ralph-claude-code - 364 stars. Adds rate limiting, tmux dashboards, circuit breakers for failure recovery, and intelligent exit detection.
ralph-orchestrator - Adds token tracking, spending limits, git checkpointing, and multi-AI support.

These solve the operational challenges: cost control, state recovery, monitoring. The official plugin provides the core mechanism. The ecosystem builds the production wrapper.

FAQ

Q: Can I use this with other AI tools? A: The technique works with any AI coding agent. The official plugin is Claude-specific, but the bash loop approach works with Cursor, Copilot, etc.

Q: What if Claude gets stuck? A: Set --max-iterations conservatively. The loop will stop automatically.

Q: Can I run multiple loops at once? A: Yes, but be careful with costs and context windows.

Q: Does this work with large codebases? A: Yes, but start with smaller scopes and scale up. Large codebases = more tokens = more cost.

Q: Can I pause and resume a loop? A: Use /cancel-ralph to stop. To resume, run the same command again - Claude will pick up from git history.

The Broader Context

Ralph Wiggum is one implementation of a larger shift. The SDLC is collapsing. Agents now sustain multi-hour reasoning. The traditional phase boundaries between planning, building, testing, and deployment are dissolving into continuous flow.

Autonomous loops are infrastructure for that flow. Instead of handoffs between human sessions, the agent maintains context across iterations. Progress persists in git history and modified files. Each “session” picks up where the last left off.

The technique has spread widely since launch - listed on AwesomeClaude, documented on DeepWiki, and covered by numerous blogs and YouTube videos (including one from BetterStack with 60K views).

Installation

# Install from official plugin marketplace
/plugin install ralph-wiggum@claude-plugins-official

Windows users: The plugin has an undocumented jq dependency that breaks on Windows/Git Bash. Install jq first or use WSL.

Commands available:

/ralph-loop "<prompt>" --max-iterations N
/ralph-loop "<prompt>" --max-iterations N --completion-promise "text"
/cancel-ralph  # Kill active loop

Getting Started

Start small. Pick a mechanical task with clear success criteria:

Install the plugin (30 seconds)
Try a small scope: “Add JSDoc comments to all exported functions in src/utils/”
Set conservative iterations: --max-iterations 10
Review the git diff when it completes

The technique rewards prompt engineering. If the first attempt doesn’t converge, refine your success criteria and try again.

Next Steps

Try it yourself - Start with a small, safe task
Join the community - Check out the GitHub repos and Discord
Share your results - Post about what you built with Ralph
Contribute - The ecosystem is growing, there’s room for tools
Stay updated - Follow Geoffrey Huntley and the Claude Code team

The Philosophy Shift

Ralph represents a fundamental mindset change:

Traditional AI Coding	Ralph Approach
One-shot perfection	Iteration over perfection
Failures are setbacks	Failures are data
Prompt once	Prompt, observe, repeat
Operator hopes	Operator designs loops
Direct step-by-step	Write prompts that converge

The skill shifts from “directing Claude step by step” to “writing prompts that converge toward correct solutions.” Your job becomes: “How do I set up conditions where iteration leads to success?”

TL;DR - Stop expecting one-shot perfection from AI coding agents. Run them in loops. Track progress. Keep CI green. Let iteration do the heavy lifting. Ship while you sleep.

Have you tried Ralph or a similar approach? What’s been your experience? Drop a comment below.

Sources: