← All posts

The Pipeline Is the Review

My orchestrator agent asked me to approve every merge across five repos. I was the bottleneck. So I let it merge on its own — and moved my eyes to the outcome.

  • agents
  • ci
  • code-review
  • orchestration
  • automation

Last week I watched my orchestrator agent — the AI that babysits all my pull requests across the core service, the Portal, and a handful of websites — finish a clean piece of work, all tests passing, and then stop dead to ask me: “Ready to merge?”

And it did that on every repo. Five of them. A pull request, for the non-coders here, is just a proposed change waiting to be folded into the real codebase — like a draft edit sitting in the tray until someone hits “accept.” My agent had done all the hard parts: written the code, gotten the automated checks to pass, resolved every comment from the review bot. Then it handed the final click back to me.

I’d built a fleet of autonomous agents and made myself the one moving part they all had to wait on.

The honest version: at first that felt responsible. Merging is the scary step — it’s the one that touches the thing real users hit. Of course a human should approve it. That instinct is why I left the approval gate in for weeks longer than I should have.

What changed my mind was noticing what I was actually doing when I approved. I wasn’t reading the diff line by line. I’d glance at it. The real review had already happened — twice. CI (the automated test suite that runs on every change, like spell-check for code) was green. CodeRabbit, the AI reviewer, had flagged its concerns and every thread was resolved. By the time the agent asked me, all the signal was already in. My click added nothing but delay.

So I rewrote the rule. The agent now merges on its own the moment two things are true: CI is green, and there are zero unresolved review threads. No asking. The merge gate is the pipeline.

The piece that made me comfortable wasn’t trusting the agent more — it was moving my attention to after the merge. For the Portal, the agent merges, then I (or it) does a quick visual pass on the dev site to confirm the page actually looks right. For the core service, merging to main automatically kicks off a test run against the live behavior. I review outcomes now, not diffs.

That’s the whole shift. An agent that stops to ask before every merge isn’t autonomous — it just relocated the bottleneck from the work to you. The fix isn’t “approve faster.” It’s to decide, concretely and in advance, what counts as a passed review — and then let the thing that proves it pass be the thing that lets the merge through.

The merge gate, concretely give me the detail

The rule the orchestrator follows per repo, before any merge to a protected branch:

# 1. CI must be green
gh pr checks "$PR" --required --watch || exit 1

# 2. zero unresolved review threads (CodeRabbit + humans)
unresolved=$(gh api graphql -f query='
  query($owner:String!,$repo:String!,$pr:Int!){
    repository(owner:$owner,name:$repo){
      pullRequest(number:$pr){
        reviewThreads(first:100){ nodes{ isResolved } }
      }
    }
  }' -F owner=$OWNER -F repo=$REPO -F pr=$PR \
  --jq '[.data.repository.pullRequest.reviewThreads.nodes[]
         | select(.isResolved==false)] | length')
[ "$unresolved" -eq 0 ] || exit 1

# 3. merge, then verify the OUTCOME — not the diff
gh pr merge "$PR" --squash --delete-branch

Then the post-merge step branches by repo. Portal: a Playwright visual pass against the dev deploy. Core service: dispatch a test ADW (an automated dev workflow) against main and assert behavior. The verification is the part a human spot-checks — a screenshot, a test summary — instead of reading every line.

The key property: every condition in the gate is machine-checkable and recorded. If you can’t write the gate as code, you haven’t actually decided what “reviewed” means — you’ve just been outsourcing that judgment to a vibe at click time.

If you run agents across more than one repo, write that gate down today. The moment your approval is a reflex glance instead of real scrutiny, it’s not a safeguard. It’s a queue with you at the front.