Agents Reason on Whatever State Exists When They Look

The agent opened a clean pull request, said “done,” and exited. CodeRabbit — the bot reviewer I rely on to catch bugs before a human looks — hadn’t even posted a comment yet. It was still thinking. The agent didn’t wait for it. It saw no review, decided no review meant nothing to fix, and walked away.

If you’re not in the weeds here: I run an automated pipeline that dispatches Claude-based agents to write code and open PRs on GitHub. A PR is a proposed change; before it merges, two automatic gatekeepers weigh in — CI (the test suite, which takes a few minutes to run) and CodeRabbit (an AI reviewer that reads the diff and leaves comments). Both are slow and asynchronous: you ask, then you wait. The agent’s job was to keep fixing until both gatekeepers were happy. It kept quitting early instead.

I watched this fail three different ways across three rounds of manually re-dispatching agents.

Round one, the agent ignored CI and CodeRabbit entirely — opened the PR, called it a win. So I added “wait for the review and address comments.” Round two, the agent dutifully fixed CodeRabbit’s first batch of comments, pushed, and exited — before CodeRabbit re-reviewed the new commits. Round three was the sneaky one. The agent fixed everything, pushed, looked again, saw no new review, and concluded there were zero actionable comments. But “no review yet” and “review says zero comments” are completely different states. It conflated them.

That third bug is the whole lesson. An agent reasons on whatever state exists at the exact moment it looks. If the async tool hasn’t finished, the agent doesn’t see “pending” and wait — it sees absence and treats absence as success. A pending check produces a false “clean” signal, every time, silently.

The fix wasn’t a smarter agent. It was telling the agent to block. Every dispatch prompt now ends with an explicit REVIEW-FIX loop that terminates only when BOTH conditions are truly final: CI is green AND CodeRabbit has posted the literal string Actionable comments posted: 0. Not “no new comments.” That exact string. Poll every 30 seconds until you reach a terminal state — pass, fail, or “Review completed” — never a snapshot of pending.

The blocking-wait loop baked into every dispatch prompt give me the detail

The key is polling gh until terminal state, and matching the exact zero-comments string rather than treating “no new review” as success:

# Block until CI reaches a final state (no PENDING rows)
until ! gh pr checks "$PR" | grep -q -E '\bpending\b'; do
  sleep 30
done
gh pr checks "$PR" | grep -q fail && exit 1   # CI red → keep fixing

# Block until CodeRabbit posts its terminal verdict
until gh pr view "$PR" --json comments \
      --jq '.comments[].body' | grep -q 'Actionable comments posted:'; do
  sleep 30
done

# ONLY this exact string ends the loop
gh pr view "$PR" --json comments --jq '.comments[].body' \
  | grep -q 'Actionable comments posted: 0' && echo "CLEAN" || echo "FIX_NEEDED"

Actionable comments posted: 0 is the terminal “all clear.” Its absence means “still working,” not “nothing to do” — that distinction is the entire bug.

So if you orchestrate agents that lean on slow tools — CI, bot reviewers, deploys, anything that answers later — encode those tools as polling loops with explicit terminal conditions. Spell out the exact string that means finished and clean, and make the agent wait for it. An agent will never wait for a result you didn’t tell it exists.