
My Own Filter Hid a CRITICAL Bug — Because I Trusted 'Outdated'

An automated merge gate said 0 unresolved threads and 3 green checks. GitHub disagreed, and GitHub was right — because my filter encoded my own blind spot.

  • agents
  • code-review
  • github
  • automation
  • ci

My merge gate said “3 green / 0 unresolved threads / ready to merge.” I was one click from shipping it. The only thing stopping me was GitHub stubbornly refusing to let the button go green — and GitHub turned out to be right.

Here’s the setup in plain terms. I was building a little automated gate that decides whether a pull request — a proposed code change — is safe to merge. It pulled two signals: the review comments left by CodeRabbit (an AI reviewer that flags problems), and GitHub’s own branch protection rules. If CodeRabbit had zero unresolved comment threads and all the checks were green, my gate called it ready. Think of it as a pre-flight checklist that reads the inspector’s notes before takeoff.

To count “unresolved threads” I wrote a filter: isResolved == false AND isOutdated == false. It returned 0. Clean. Ship it.

But one of those threads was a 🔴 CRITICAL finding — a fail-open security hole, the kind where the system defaults to “allow” when it should default to “deny.” And my filter hid it. Why? Because the thread was marked isOutdated == true.

Here’s the part I had wrong. I assumed isOutdated meant “already dealt with.” It does not. A review thread goes “outdated” when a later commit changes the diff it was anchored to. In my case somebody had added a few lines above it, so GitHub couldn’t pin the comment to the exact spot anymore. Nobody has to fix anything for that flag to flip. The finding was still completely valid against the current code. The lines had shifted; the bug had not. I’d encoded my own optimistic assumption (outdated means addressed) straight into the tool that was supposed to catch my mistakes.

What saved me was the coarser, dumber gate. GitHub’s require_conversation_resolution doesn’t care about outdated state. It counts every unresolved thread, structurally, and it held mergeStateStatus at BLOCKED while every check sat there green. My clever filter said go; the blunt structural rule said no.

That disagreement is the whole lesson. When a fail-closed gate — one that defaults to “blocked” — argues with your hand-rolled “looks clean” logic, the fail-closed gate is almost always the one to trust. It can’t be talked into yes by your assumptions, because it has none.

Counting threads the way that doesn't lie

The bug was the isOutdated exclusion. Drop it entirely — count anything a human or bot hasn’t explicitly resolved:

# GitHub GraphQL: reviewThreads on a PR (fragment; first: 100 returns at most 100 threads per page)
reviewThreads(first: 100) {
  nodes { isResolved isOutdated }
}

# Python: threads is the "nodes" list from the response above

# WRONG: outdated != addressed
unresolved = [t for t in threads
              if not t["isResolved"] and not t["isOutdated"]]

# RIGHT: outdated just means lines moved
unresolved = [t for t in threads if not t["isResolved"]]
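To make the difference between the two filters visible, here is a minimal sketch, not the gate I actually ran. It assumes a hypothetical fetch_page(cursor) callable that returns one page of the reviewThreads connection, with pageInfo { hasNextPage endCursor } added to the query so a PR with more than 100 threads isn't silently truncated. The names iter_threads, summarize, and fetch_page are illustrative only.

```python
from typing import Callable, Iterable, Iterator, Optional

# Assumed page shape, mirroring the GraphQL connection:
# {"nodes": [{"isResolved": bool, "isOutdated": bool}, ...],
#  "pageInfo": {"hasNextPage": bool, "endCursor": str or None}}
FetchPage = Callable[[Optional[str]], dict]


def iter_threads(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every review thread, following cursors until the last page."""
    cursor = None
    while True:
        page = fetch_page(cursor)  # hypothetical: wraps the GraphQL call
        yield from page["nodes"]
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return
        cursor = info["endCursor"]


def summarize(threads: Iterable[dict]) -> dict:
    """Count unresolved threads, and how many the old filter would have hidden."""
    unresolved = [t for t in threads if not t["isResolved"]]
    hidden = [t for t in unresolved if t["isOutdated"]]
    return {
        "unresolved": len(unresolved),
        "hidden_by_outdated_filter": len(hidden),
    }
```

A nonzero hidden_by_outdated_filter is exactly the case that bit me: threads nobody resolved that the old filter dropped from the count.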

Then make the structural gate the source of truth, not your count:

gh pr view "$PR" --json mergeStateStatus -q .mergeStateStatus
# treat anything but CLEAN as not-ready — even with green checks

If mergeStateStatus != CLEAN while your own logic says ready, that’s a signal to investigate, never to override.
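One way to wire that rule in, as a rough sketch under my own assumptions: the function names merge_state and decide and the three verdict strings are made up for illustration, and merge_state simply shells out to the same gh command shown above. The point is that GitHub's verdict can only ever downgrade my own answer, never upgrade it.

```python
import json
import subprocess


def merge_state(pr: str) -> str:
    """Ask the gh CLI for GitHub's own mergeStateStatus on the PR."""
    out = subprocess.run(
        ["gh", "pr", "view", pr, "--json", "mergeStateStatus"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["mergeStateStatus"]


def decide(own_unresolved: int, checks_green: bool, state: str) -> str:
    """Combine the hand-rolled signal with GitHub's structural one.

    Returns "ready", "not-ready", or "investigate" (illustrative labels).
    """
    own_ready = checks_green and own_unresolved == 0
    if state != "CLEAN":
        # GitHub says no. If my logic said yes, the two gates disagree:
        # flag it for a human instead of overriding.
        return "investigate" if own_ready else "not-ready"
    # GitHub says yes; still fail closed if my own count objects.
    return "ready" if own_ready else "not-ready"
```

With my original numbers, decide(0, True, "BLOCKED") comes back as "investigate", which is the outcome I should have had from the start.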

The deepest bug doesn’t live in the code under review. It lives in the tool you built to verify the code, because that tool shares your blind spots — it was written by the same person who’d miss the bug. So when you build a gate, give the final word to a check you didn’t author and can’t argue with. Then when it disagrees with you, go look. Don’t reach for the override.