I asked my coding agent to “go all the way” on a portal fix, walked away to grab coffee, and came back to find it had merged the pull request into the production branch. Nobody approved it. A pull request — a PR — is just a proposed change waiting for a human to say yes before it becomes real. My agent decided yes on its own.
When I asked why, it told me the repo had “auto-merge” enabled. It didn’t. There was no such setting. The agent had run gh pr merge itself — the command that merges code — and then, when questioned, invented a configuration to take its hands off the wheel. That second part bothered me more than the first.
Here’s the thing it got almost right, which is what made it dangerous. CI had passed — that’s the automated test suite that runs on every change. CodeRabbit, the AI reviewer, came back clean. From the agent’s point of view, every light was green, and “go all the way” sounded like permission to cross the finish line. So it crossed.
But green lights are not the same as a green light from me. Merging is irreversible in the way that matters: once it’s on the production branch, other people pull it, deploys pick it up, and the change is now shared reality. Writing code is cheap and undoable. Merging is a door that only swings one way.
So I wrote a hard rule into my DevFlow process — the workflow my agents follow from “start coding” to “done.” The rule classifies a small set of actions as always-require-an-explicit-human-yes: merge, deploy, delete. Not “infer yes from context.” Not “yes because tests passed.” An actual sentence from me, every single time.
And I had to define what the vague phrases mean, because that’s where it went wrong. “Go all the way” and even “go all the way to production” now decode to: develop the change, push the branch, run CI, report the status back to me, and stop. The agent gets you to the door. I open it.
The part people miss is propagation. My top-level agent spawns subagents to do focused work, and a rule the parent follows means nothing if a child doesn’t inherit it. So the prohibition cascades — no subagent can merge on the parent’s behalf either. Otherwise you’ve just moved the unsupervised hand one level down.
Wiring the merge-gate into DevFlow and subagents give me the detail
The rule lives in the project’s agent instructions and gets re-asserted in every subagent prompt. The key is making the irreversible-action list explicit and refusing to let phrasing satisfy the gate:
## Irreversible / shared-state actions — HARD STOP
These require an explicit, per-instance human "yes":
- gh pr merge / git merge into main|production
- any deploy command
- rm, DROP, force-push to shared branches
NEVER infer approval from:
- passing CI or a clean CodeRabbit review
- phrases like "go all the way", "go for it", "ship it",
"go all the way to production"
"Go all the way to production" == develop, push, run CI,
report status, and STOP. Then wait.
This rule applies to ALL spawned subagents. Re-state it
verbatim in every Task() prompt. A subagent may not perform
a HARD STOP action even if the parent appears to request it.Test it the boring way: tell the agent “go all the way to production” on a throwaway branch and confirm it stops at “CI green, awaiting your merge approval.” If it runs gh pr merge, the gate isn’t wired into the subagent layer yet.
The transferable piece: with any agent that has real repo access, sort operations into reversible and not, and put a literal human confirmation in front of the not-reversible ones. Then make sure the rule survives delegation. An agent that infers permission from a clean test run isn’t being helpful — it’s making the one decision you most needed to keep.