The Ralph Wiggum Approach: Running AI Coding Agents for Hours (Not Minutes)
The Problem with One-Shot AI Coding
We’ve all been there. You fire up Claude Code, drop in a prompt like “build me a REST API for todos,” and hope for the best. Maybe it works. Maybe it doesn’t. Either way, you’re staring at your screen, watching tokens burn, wondering if the agent is making progress or just spinning its wheels.
The fundamental issue? Traditional AI coding is a one-shot deal. You get one context window, one shot at the problem, and then you’re either done or you’re not.
But here’s a thought: what if you ran that same agent, on the same prompt, 10 times in a row? Each time it picks up where it left off. Each time it sees what it previously did. Each time it iterates, improves, and gets closer to done.
That’s the Ralph Wiggum approach. And it’s genuinely game-changing.
Quick Start (5 minutes)
Want to try Ralph right now? Here’s the fastest way:
- Install the plugin:
  /plugin install ralph-wiggum@claude-plugins-official
- Run your first loop:
  /ralph-loop "Add JSDoc comments to all exported functions in src/utils/" --max-iterations 10
- Check the git diff when done
That’s it. You just ran your first autonomous loop. Claude iterated on the same task up to 10 times, building on its previous work each time, and stopped when it was done or when it hit the iteration limit.
What is Ralph?
Ralph is a development methodology built on a deceptively simple insight: iteration beats perfection.
The technique comes from Geoffrey Huntley, who described it simply: “Ralph is a Bash loop.”
while :; do cat PROMPT.md | claude ; done
The name comes from Ralph Wiggum of The Simpsons - perpetually confused, always making mistakes, but never stopping. That’s the vibe.
At its core, Ralph repeatedly feeds an AI agent the same prompt until a stop condition is met. The agent sees its previous work (via git history and modified files), learns from it, and iteratively improves.
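If the Bash one-liner feels too open-ended, the same idea can be sketched in Python with an explicit limit. This is only an illustration of the pattern, not the plugin's code: `run_agent` and `is_done` are hypothetical callables you would supply yourself (one invokes your agent with the prompt, the other decides whether the output meets your stop condition).

```python
from typing import Callable, Tuple


def ralph_loop(
    prompt: str,
    run_agent: Callable[[str], str],  # hypothetical: sends the prompt, returns the agent's output
    is_done: Callable[[str], bool],   # hypothetical: stop condition evaluated on that output
    max_iterations: int = 10,
) -> Tuple[int, bool]:
    """Feed the same prompt repeatedly.

    Returns (iterations_used, converged).
    """
    for iteration in range(1, max_iterations + 1):
        # The prompt never changes; the state lives in the files and git history
        # that the agent reads on each run.
        output = run_agent(prompt)
        if is_done(output):
            return iteration, True
    # Iteration limit reached without meeting the stop condition.
    return max_iterations, False
```

Returning both the count and a converged flag makes it easy to tell "finished on the last iteration" apart from "ran out of iterations".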
“The technique is deterministically bad in an undeterministic world. It’s better to fail predictably than succeed unpredictably.” — Geoffrey Huntley
Even Matt Pocock is a fan:
“Ralph Wiggum + Opus 4.5 is really, really good”
How It Actually Works
The Ralph Wiggum plugin uses a Stop hook mechanism. Here’s what happens:
- You invoke /ralph-loop with a prompt and completion criteria
- Claude works on the task
- When Claude tries to exit (thinks it’s done), the Stop hook intercepts it using exit code 2
- The hook checks for your completion promise (e.g., <promise>COMPLETE</promise>)
- If not found, it re-feeds the original prompt and Claude continues
- Each iteration sees modified files and git history from previous runs
/ralph-loop "Migrate all tests from Jest to Vitest" \
--max-iterations 50 \
--completion-promise "All tests migrated"
The loop continues until Claude outputs the completion promise or you hit the iteration limit.
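To make the decision at each attempted exit concrete, here is a rough Python sketch of the logic described above. It is not the plugin's actual hook: the function names are made up for illustration, and the hook wiring and exit-code handling are left out. It only shows the two checks, promise found or iteration limit reached.

```python
import re
from typing import Optional

# Matches the <promise>...</promise> tag format used in the examples in this post.
PROMISE_RE = re.compile(r"<promise>(.*?)</promise>", re.DOTALL)


def find_promise(output: str) -> Optional[str]:
    """Return the text inside the last <promise>...</promise> tag, or None."""
    matches = PROMISE_RE.findall(output)
    return matches[-1].strip() if matches else None


def stop_decision(output: str, expected: str, iteration: int, max_iterations: int) -> str:
    """Return "stop" or "continue" for one attempted exit (hypothetical helper)."""
    if iteration >= max_iterations:
        return "stop"      # hard safety limit reached
    if find_promise(output) == expected:
        return "stop"      # completion promise matched exactly
    return "continue"      # otherwise re-feed the original prompt
```

Note that the comparison is exact: an output of `<promise>complete</promise>` would not match an expected `COMPLETE`, so the loop would keep going until the iteration limit.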
Real-World Example: Migrating Tests
Let’s walk through a complete example. Say you want to migrate from Jest to Vitest:
Prompt:
Migrate all tests from Jest to Vitest.
- Update all test files to use Vitest syntax
- Update package.json scripts
- Remove Jest dependencies
- Add Vitest dependencies
- Run tests after migration
Output <promise>MIGRATED</promise> when all tests pass.
What happens:
| Iteration | What Claude Does |
|---|---|
| 1 | Updates test files to Vitest syntax, tests fail |
| 2 | Fixes syntax errors, tests still fail |
| 3 | Updates package.json, removes Jest, adds Vitest |
| 4 | Runs tests, they pass, outputs <promise>MIGRATED</promise> |
You wake up to a fully migrated test suite. No manual re-prompting, no debugging in between. Just set it up and let it run.
The Results Are Real
This isn’t theoretical. People are shipping serious work:
- Cursed programming language - Built over 3 months with one Ralph loop. A functional compiler with LLVM compilation, standard library, and partial editor support. Keywords include slay (function), sus (variable), and based (true).
- 6+ repositories overnight - Y Combinator hackathon teams shipped multiple repos for $297 in API costs. Work that would have cost $50K in contractor time.
- 4-minute tests to 2-second tests - One developer migrated integration tests to unit tests while sleeping. The loop handled the mechanical conversion automatically.
- Full APIs with TDD - Iteratively building features, running tests, fixing failures, and repeating until all tests pass.
These are cherry-picked successes. For every overnight win, there are loops that burned through iterations without converging. Failed attempts still cost money. But when it works, it works remarkably well.
When Ralph Shines
Ralph works best for tasks with clear completion criteria and mechanical execution:
| Use Case | Example Prompt |
|---|---|
| Large refactors | “Convert all class components to functional components with hooks. Output <promise>MIGRATED</promise> when npm run typecheck passes.” |
| Framework migrations | “Migrate all tests from Jest to Vitest. Output <promise>COMPLETE</promise> when all tests pass.” |
| TDD workflows | “Implement the checkout flow to make all tests in checkout.test.ts pass. Output <promise>TESTS_PASS</promise> when done.” |
| Test coverage | “Add tests for all uncovered functions in src/” |
| TypeScript adoption | “Add type annotations to all functions in src/utils/” |
| Greenfield builds | “Build a REST API with CRUD operations. Output <promise>COMPLETE</promise> when all endpoints work and tests pass.” |
The common thread: well-defined success metrics. If you can describe “done” precisely, Ralph can iterate toward it.
When NOT to Use It
Ralph doesn’t replace human judgment. It automates mechanical execution. Don’t use autonomous loops for:
- Ambiguous requirements - If you can’t define “done” precisely, the loop won’t converge
- Architectural decisions - Novel abstractions need human reasoning, not iteration
- Security-sensitive code - Auth, payments, data handling need human review at each step
- Exploration - “Figure out why the app is slow” isn’t a good Ralph task
Autonomous loops automate the mechanical. They don’t automate the decisions about what’s worth building.
Common Pitfalls
Avoid these mistakes when getting started with Ralph:
❌ Too ambitious on first run → Start with 10-20 iterations, not 50. Get a feel for how it works first.
❌ Vague completion criteria → “Make it better” vs “All tests pass with >80% coverage.” Be specific.
❌ Forgetting to keep CI green → Each iteration must pass tests or you’re in trouble. Broken code compounds.
❌ Not monitoring costs → A 50-iteration loop can cost $50-100+. Watch your usage.
❌ Using it for judgment-heavy tasks → Ralph is for mechanical work, not design decisions or UX choices.
The Core Mechanics
1. Progress Tracking
The agent commits its work at each iteration and appends progress to a progress.txt file. This serves as:
- A log for future iterations to read
- Documentation of what was attempted
- A way to prevent the agent from repeating mistakes
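A progress log like this needs very little machinery. The sketch below is one possible shape for it in Python. The helper names and the line format are assumptions for illustration; the post only specifies that progress is appended to progress.txt.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_progress(log_path: Path, iteration: int, note: str) -> str:
    """Append one timestamped line to the progress log and return it."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    line = f"[{stamp}] iteration {iteration}: {note}\n"
    # Append mode keeps every earlier entry, so the file doubles as a history.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


def read_progress(log_path: Path) -> list[str]:
    """Return earlier entries so a new iteration can see what was already tried."""
    if not log_path.exists():
        return []
    return log_path.read_text(encoding="utf-8").splitlines()
```

In practice the agent itself writes these entries because the prompt tells it to; a helper like this is mainly useful if you drive the loop from your own script.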
2. Keep CI Green
This is critical. Each iteration must pass tests and type checks. Committing broken code hamstrings future iterations and creates a debugging nightmare.
The rule: if tests fail, the agent must fix them before continuing.
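If you drive the loop from your own script, one way to enforce that rule is to gate every commit on the checks. The following is a minimal sketch, assuming git is on your PATH; the function names are hypothetical, and the check commands are whatever your project uses (for example `["npm", "test"]` or `["npm", "run", "typecheck"]`).

```python
import subprocess
from typing import Sequence


def checks_pass(commands: Sequence[Sequence[str]], cwd: str = ".") -> bool:
    """Run each check command in order; return False at the first failure."""
    for cmd in commands:
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True


def commit_if_green(message: str, commands: Sequence[Sequence[str]], cwd: str = ".") -> bool:
    """Commit the working tree only when every check passes."""
    if not checks_pass(commands, cwd):
        return False  # leave the tree uncommitted so the next iteration fixes it first
    # check=True raises CalledProcessError if git fails (for example, nothing to commit).
    subprocess.run(["git", "add", "-A"], cwd=cwd, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=cwd, check=True)
    return True
```

The point of the gate is that broken code never lands in history, so later iterations always start from a state that passed.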
3. Clear Stop Conditions
This is where most people trip up. Ralph needs precise exit criteria:
❌ "Build a todo API and make it good"
✅ "Build a REST API with CRUD operations. Input validation required.
Tests must pass (>80% coverage). README with API docs.
Output <promise>COMPLETE</promise> when done."
Important: The --completion-promise flag uses exact string matching, which is unreliable: if Claude’s output differs even slightly from the expected text, the promise won’t be detected. Always use --max-iterations as your real safety net.
Cost Awareness
This matters. Autonomous loops burn tokens.
A 50-iteration loop on a large codebase can easily cost $50-100+ in API credits depending on context size. On a Claude Code subscription, you’ll hit your usage limits faster.
Best practices:
- Set --max-iterations conservatively (start with 10-20)
- Scale up once you understand the token consumption pattern
- Use tests/build success as the completion criteria, not Claude’s self-assessment
- Monitor your usage during longer runs
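A back-of-the-envelope estimate helps when picking an iteration limit. The sketch below is plain arithmetic in Python. The token counts and per-million-token prices in the usage example are placeholders, not real pricing, and the calculation ignores caching and anything else your provider bills differently, so substitute your own numbers.

```python
def estimate_loop_cost(
    iterations: int,
    input_tokens_per_iteration: int,
    output_tokens_per_iteration: int,
    usd_per_million_input: float,   # placeholder: look up your provider's current price
    usd_per_million_output: float,  # placeholder: look up your provider's current price
) -> float:
    """Rough total cost in USD for a loop, assuming every iteration costs the same."""
    per_iteration = (
        input_tokens_per_iteration * usd_per_million_input
        + output_tokens_per_iteration * usd_per_million_output
    ) / 1_000_000
    return iterations * per_iteration


def max_affordable_iterations(budget_usd: float, cost_per_iteration_usd: float) -> int:
    """Largest whole number of iterations that fits in the budget."""
    if cost_per_iteration_usd <= 0:
        raise ValueError("cost per iteration must be positive")
    return int(budget_usd // cost_per_iteration_usd)
```

With the placeholder values `estimate_loop_cost(50, 400_000, 20_000, 3.0, 15.0)`, each iteration comes to $1.50 and the 50-iteration loop to $75; with a $20 budget at that rate, `max_affordable_iterations(20, 1.5)` gives 13.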
Troubleshooting
Loop stuck in infinite cycle?
→ Check your completion promise is actually achievable
→ Lower --max-iterations
→ Review progress.txt to see what’s blocking
Tests keep failing?
→ Your prompt might be asking for too much
→ Break the task into smaller chunks
→ Check if dependencies are missing
Costs too high?
→ Reduce --max-iterations
→ Start with smaller codebases
→ Use more specific completion criteria
Claude keeps saying “done” but it’s not?
→ Your completion promise is too vague
→ Add objective criteria (tests pass, build succeeds)
Pros and Cons
| ✅ Pros | ❌ Cons |
|---|---|
| Ships code while you sleep | Can burn through tokens quickly |
| Great for mechanical tasks | Not for judgment-heavy work |
| Self-correcting feedback loop | Requires good prompt engineering |
| Reduces manual re-prompting | Can get stuck if criteria are vague |
| Builds on git history | Windows setup has jq dependency |
| Growing ecosystem of tools | Learning curve for effective prompts |
The Ecosystem
The pattern has spawned useful tools beyond the official plugin:
- ralph-claude-code - 364 stars. Adds rate limiting, tmux dashboards, circuit breakers for failure recovery, and intelligent exit detection.
- ralph-orchestrator - Adds token tracking, spending limits, git checkpointing, and multi-AI support.
These solve the operational challenges: cost control, state recovery, monitoring. The official plugin provides the core mechanism. The ecosystem builds the production wrapper.
FAQ
Q: Can I use this with other AI tools?
A: The technique works with any AI coding agent. The official plugin is Claude-specific, but the bash loop approach works with Cursor, Copilot, etc.
Q: What if Claude gets stuck?
A: Set --max-iterations conservatively. The loop will stop automatically.
Q: Can I run multiple loops at once?
A: Yes, but be careful with costs and context windows.
Q: Does this work with large codebases?
A: Yes, but start with smaller scopes and scale up. Large codebases = more tokens = more cost.
Q: Can I pause and resume a loop?
A: Use /cancel-ralph to stop. To resume, run the same command again - Claude will pick up from git history.
The Broader Context
Ralph Wiggum is one implementation of a larger shift. The SDLC is collapsing. Agents now sustain multi-hour reasoning. The traditional phase boundaries between planning, building, testing, and deployment are dissolving into continuous flow.
Autonomous loops are infrastructure for that flow. Instead of handoffs between human sessions, the agent maintains context across iterations. Progress persists in git history and modified files. Each “session” picks up where the last left off.
The technique has spread widely since launch - listed on AwesomeClaude, documented on DeepWiki, and covered by numerous blogs and YouTube videos (including one from BetterStack with 60K views).
Installation
# Install from official plugin marketplace
/plugin install ralph-wiggum@claude-plugins-official
Windows users: The plugin has an undocumented jq dependency that breaks on Windows/Git Bash. Install jq first or use WSL.
Commands available:
/ralph-loop "<prompt>" --max-iterations N
/ralph-loop "<prompt>" --max-iterations N --completion-promise "text"
/cancel-ralph # Kill active loop
Getting Started
Start small. Pick a mechanical task with clear success criteria:
- Install the plugin (30 seconds)
- Try a small scope: “Add JSDoc comments to all exported functions in src/utils/”
- Set conservative iterations: --max-iterations 10
- Review the git diff when it completes
The technique rewards prompt engineering. If the first attempt doesn’t converge, refine your success criteria and try again.
Next Steps
- Try it yourself - Start with a small, safe task
- Join the community - Check out the GitHub repos and Discord
- Share your results - Post about what you built with Ralph
- Contribute - The ecosystem is growing, there’s room for tools
- Stay updated - Follow Geoffrey Huntley and the Claude Code team
The Philosophy Shift
Ralph represents a fundamental mindset change:
| Traditional AI Coding | Ralph Approach |
|---|---|
| One-shot perfection | Iteration over perfection |
| Failures are setbacks | Failures are data |
| Prompt once | Prompt, observe, repeat |
| Operator hopes | Operator designs loops |
| Direct step-by-step | Write prompts that converge |
The skill shifts from “directing Claude step by step” to “writing prompts that converge toward correct solutions.” Your job becomes: “How do I set up conditions where iteration leads to success?”
TL;DR - Stop expecting one-shot perfection from AI coding agents. Run them in loops. Track progress. Keep CI green. Let iteration do the heavy lifting. Ship while you sleep.
Have you tried Ralph or a similar approach? What’s been your experience? Drop a comment below.