I Ran Three AI Agents in Parallel and One Almost Deleted Another's Work
Running one AI coding agent is easy. It edits files, you check the result, done.
Running several at once on the same repo is where it gets interesting, and where it almost bit me. The bottleneck stops being how good the code is. It becomes how the agents coordinate.
Here's the thing nobody warns you about: the agents can't see each other. Each one runs in its own session with no idea the others exist. The only thing they share is the repo itself, the git history and the open pull requests.
So in one session I had three agents working, and they ended up on the same branch, each forked from a different point in history. One of them was about to open a pull request that would have quietly deleted a feature another agent had already shipped. Not because it was wrong. It had branched from a stale commit, before a squash-merge, so its diff read as a deletion. Every automated check was green. No human reviewer would have caught it from the checkmarks.
That near-miss is the whole lesson. Multi-agent engineering fails at coordination, not capability. The models write fine code. They trip over each other. So I stopped trying to make the agents smarter and started treating it like managing a team of contractors who can't see each other's screens.
Step 1: one agent, one worktree, off fresh main
This is the single highest-leverage rule. Two agents on one branch is the number one source of divergence and silent regressions.
The rule: one agent gets its own git worktree, branched off fresh origin/main, opens a small PR, merges, and the branch gets deleted. Never share a branch.
Why a worktree and not just a branch? A git worktree is a second working copy of the same repo, checked out to its own branch, in its own folder, sharing one underlying history. It's like giving each contractor a separate desk with the same filing cabinet behind them. They physically cannot collide on the filesystem.
In Claude Code this is a setting:
{
"worktree": {
"bgIsolation": "worktree",
"baseRef": "fresh",
"symlinkDirectories": ["node_modules"]
}
}bgIsolation: worktreegives each background agent an isolated checkout.baseRef: freshbranches fromorigin/main, never a stale local ref. This is the line that prevents the regressive diff from the story above.symlinkDirectoriesmeans worktrees share installed packages instead of each running a slow reinstall.
Step 2: recover a stale branch without nuking anyone's work
When a branch has already diverged and carries useful commits, do not force-push it. Another agent might be on it. Instead:
git fetch origin
# make a fresh worktree off the latest main
git worktree add ../recovery -b feature/clean origin/main
# bring over only the genuinely-new commits; already-merged ones drop out
git cherry-pick <new-commit-sha>
# check the diff has no unexpected deletions, then push under a new nameThis is exactly how I turned that dangerous branch (the one whose PR would have deleted a live feature) into a clean five-file PR with zero regression.
Step 3: make "done" mean the whole system stays true
The quieter failure is drift. Code changes, but the docs, the ticket, and the changelog don't. Multiply that across a bunch of fast agents and the project's memory rots.
The fix is to make those updates a requirement to merge, not a thing you remember. A change isn't done when the code works. It's done when the living system stays coherent, all in the same pull request:
- Docs: code that changed a documented concern updates its doc.
- The conventions file (AGENTS.md or CLAUDE.md): no policy now contradicts the change.
- Changelog: an entry, or a conventional-commit title it can be derived from.
- Ticket: referenced, acceptance criteria checked, status moved.
- Tech debt: touched files scanned, orphaned code removed.
Step 4: never let an agent be its own only reviewer
This is the part most people miss, and it's true beyond AI.
When the same agent that wrote the code also reviews it, it brings the exact blind spots that produced the bug. Same context, same assumptions, correlated errors. Self-review feels productive and catches almost nothing structural.
So I use a fresh-context reviewer plus adjudication:
- The building agent self-verifies first. Typecheck, lint, tests. Don't hand back broken basics.
- A separate agent with no memory of building it reviews the change adversarially.
- I adjudicate. Fresh eyes catch blind spots, but they also lack the builder's context, so they raise false alarms. I keep the real findings and overrule the wrong ones.
On the last change, the fresh reviewer found one genuine bug (a component forcing underlines on nested links) and two false positives (it flagged valid code it didn't have the version context for). Applying all three blindly would have introduced a regression. The judgment step is what makes independent review safe.
| Review style | Catches deep bugs? | Risk | |---|---|---| | Agent reviews its own work | No, it shares its blind spots | False confidence | | Fresh agent, no judgment | Yes, but flags non-issues | Applies wrong fixes | | Fresh agent plus you adjudicate | Yes | Low, best of both |
Step 5: let the pipeline review and merge itself
Once each agent is isolated and independently reviewed, you can wire it so PRs are self-merging and you only step in on the exceptions:
- Deterministic CI gates: typecheck, lint, unit tests, database tests, secret scanning.
- An independent AI review on every PR, run as a loop-auditor that checks correctness plus the definition of done above.
- Branch protection that requires the always-on checks. I leave admin bypass on so trivial docs can still go straight to main.
- Auto-merge and auto-delete, so the PR merges itself the moment the gates and the review pass.
The AI review can route through an AI gateway so it shares one key with unified billing and observability, using the gateway's Anthropic-compatible endpoint:
- uses: anthropics/claude-code-action@v1
env:
ANTHROPIC_BASE_URL: https://ai-gateway.vercel.sh
ANTHROPIC_AUTH_TOKEN: ${{ secrets.AI_GATEWAY_KEY }}
with:
claude_args: "--model anthropic/claude-sonnet-4.6 ..."Where it lands

An agent finishes work, opens a PR, a fresh agent reviews it and audits the definition of done, it self-merges when green, and the branch auto-deletes. Nothing piles up. The docs and tickets and changelog stay current because the gate won't let them drift. I step in only on the ones that go red.
What I'd tell you if you're starting
- Coordination, not capability, is the multi-agent bottleneck. Solve it the way you'd manage a team that can't see each other's screens.
- Isolation is non-negotiable. One agent, one worktree, fresh main, short-lived branch.
- Self-review is a blind-spot trap. Use a fresh-context reviewer and adjudicate what it finds.
- Make the system self-healing by turning docs, tickets, and changelog updates into merge gates, not afterthoughts.
- Then let the pipeline merge itself, so you supervise by exception instead of babysitting every change.
Start with two agents on two clearly separate parts of a project and watch them each open a clean PR. That's the whole thing, scaled down to where you can see it work.