My Own Filter Hid a CRITICAL Finding

I built a merge gate that counted '0 unresolved threads' and almost shipped a fail-open bug — because the filter I wrote shared my blind spots.

  • code-review
  • automation
  • github
  • coderabbit
  • ci

My merge gate said “3 green / ready to merge.” GitHub said BLOCKED. One of us was wrong, and for about ten minutes I assumed it was GitHub.

Here’s the setup. I was building an automated check that decides when a pull request — a proposed code change — is safe to merge. Part of that decision came from CodeRabbit, an AI reviewer that leaves comments on the code, each one a “thread” you’re supposed to resolve before shipping. My gate counted those threads. Zero unresolved? Green light. Think of it like a pre-flight checklist that won’t let the plane leave the gate until every item is ticked.

My checklist was lying to me.

The way I’d written the count, a thread only mattered if it was both unresolved and not “outdated.” That isOutdated flag felt obviously safe to exclude. Outdated means stale, means handled, means move on — right? So I filtered them out, the count came back zero, and my gate happily declared victory.

But GitHub’s branch protection — its own structural rule that blocks merging until every conversation is resolved — kept the PR at BLOCKED. Every individual check was green. The merge button still wouldn’t light up. That disagreement is the only reason I looked closer.

When I did, I found a 🔴 CRITICAL thread CodeRabbit had flagged: a fail-open bug, the kind where a security check quietly defaults to “allow” when something goes sideways. And it was marked isOutdated. Not because anyone addressed it. Because a later commit had added a few lines above it and shifted the line numbers. The finding was still completely valid against the current code. GitHub didn’t care that the lines moved; it counts the thread either way. My filter saw “outdated” and threw the most important comment in the PR straight in the trash.

That’s the part that still bugs me. The bug wasn’t in the code being reviewed. It was in my reviewer of the reviews. I’d baked an assumption — “outdated means handled” — into the one tool whose whole job was to catch my mistakes. A self-authored filter inherits the author’s blind spots by construction. I can’t see what I can’t see, and now neither could my gate.

The filter, and why the coarse gate wins

The broken query excluded outdated threads:

# WRONG: hides valid findings whose lines merely shifted
reviewThreads(first: 100) { nodes { isResolved isOutdated } }
# count = nodes.filter(t => !t.isResolved && !t.isOutdated).length

isOutdated is purely positional: GitHub sets it when later commits shift or change the lines a comment was anchored to. It says nothing about whether the finding was fixed. Count isResolved == false and stop there:

# count = nodes.filter(t => !t.isResolved).length
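To make the difference concrete, here is a minimal Python sketch of the two filters above running over the same data. The function names and the sample threads are hypothetical, made up for illustration; only the isResolved and isOutdated field names come from the query.

```python
from typing import TypedDict


class Thread(TypedDict):
    """Subset of a review thread: just the two fields the query selects."""
    isResolved: bool
    isOutdated: bool


def count_blocking_wrong(threads: list[Thread]) -> int:
    # The original filter: drops unresolved threads that are merely outdated.
    return sum(1 for t in threads if not t["isResolved"] and not t["isOutdated"])


def count_blocking(threads: list[Thread]) -> int:
    # The corrected filter: isOutdated plays no part in the decision.
    return sum(1 for t in threads if not t["isResolved"])


# Hypothetical sample: one resolved thread, plus one unresolved thread
# that is flagged outdated only because its lines shifted.
sample: list[Thread] = [
    {"isResolved": True, "isOutdated": False},
    {"isResolved": False, "isOutdated": True},
]

print(count_blocking_wrong(sample))  # 0
print(count_blocking(sample))        # 1
```

Same threads, two answers. The first count is the one that told me I was done.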

Better: don’t trust your own count at all. GitHub’s mergeStateStatus already aggregates branch protection, including require_conversation_resolution when that rule is enabled on the branch. The rule is fail-closed: it counts every unresolved thread regardless of outdated state.

gh pr view "$PR" --json mergeStateStatus,reviewDecision
# treat mergeStateStatus == "CLEAN" as the source of truth

When your hand-rolled “clean” and the platform’s CLEAN disagree, the platform is the cheap, dumb, correct one.
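One way to wire that in, as a rough sketch rather than the gate I actually shipped: let the platform's verdict decide, and use my own count only to flag a disagreement worth investigating. The function names fetch_merge_state and gate, and the wording of the reasons, are my assumptions here; the gh invocation is the same one shown above.

```python
import json
import subprocess


def fetch_merge_state(pr: str) -> dict:
    # Runs the gh command from above and parses its JSON output.
    out = subprocess.run(
        ["gh", "pr", "view", pr, "--json", "mergeStateStatus,reviewDecision"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def gate(platform: dict, own_unresolved: int) -> tuple[bool, str]:
    """Return (ok_to_merge, reason). The platform's verdict always wins."""
    state = platform.get("mergeStateStatus")
    if state != "CLEAN":
        # Fail closed: anything other than CLEAN (including a missing
        # field) blocks the merge, whatever the hand-rolled count says.
        if own_unresolved == 0:
            return False, f"platform says {state} but own count is 0: investigate the filter"
        return False, f"platform says {state}"
    return True, "platform says CLEAN"
```

Calling gate(fetch_merge_state(pr), my_count) with my buggy count of 0 against a BLOCKED PR would have returned the "investigate the filter" reason instead of a green light. The design choice is that my own count can never turn a blocked PR green; it can only add detail to the refusal.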

So the rule I follow now: when a coarse, fail-closed gate disagrees with your detailed green checks, the gate is probably right and your cleverness is probably the bug. Don’t override it. Go find out why it’s unhappy. The check you wrote yourself is exactly the one that can’t catch your own mistake.