← All posts

My Agents Cached Their Doctrine, Not My Commit

I refactored a shared skill mid-session and half my worker agents kept obeying the old rules — because they'd loaded their behavior the last time they talked to the boss, not the last time I hit save.

  • multi-agent
  • claude-code
  • agent-orchestration
  • prompt-engineering

Around midday I had one “boss” agent coordinating a fleet of worker agents — nine or so active at once — and I decided to clean up the rulebook they all share. Think of the boss as a shift lead and the workers as people on the floor, all following the same handbook for how to ask for help and report problems. That handbook is what I call a skill: a chunk of instructions a Claude Code agent loads and then acts on.

Mid-session I split that one handbook into two — a boss version and a worker version — and added four new principles about how workers should escalate problems and list out what’s blocking them. Good changes. I saved, felt done, moved on.

Then I noticed the fleet was behaving in two different ways.

Some workers were enumerating blockers the new way, escalating early, crisp. Others were still doing the old thing — sitting on problems, reporting them in the vague old format. Same code, same boss, same task. Two personalities.

It took me an embarrassing minute to see what happened. A worker agent loads the skill when it talks to the boss, and then it keeps that copy in its head for the rest of the session. It doesn’t re-read the file. My edit didn’t reach anyone who was already mid-conversation. The moment I saved became a dividing line: peers who’d last checked in before the change kept running the old doctrine; peers who checked in after got the new one. The fleet had quietly forked into two versions, and the only thing that decided which version a given agent used was when it last happened to sync.

My first instinct was to just wait it out — figure the new rules would propagate as agents naturally came back to the boss. But “eventually” isn’t a plan when half your fleet is escalating wrong right now. I had to walk each active worker and hand it the new handbook myself, pointing at exactly what changed.

Here’s the mechanism worth stealing. A long-lived agent caches its behavior from its last interaction, not your latest commit. Your file is the source of truth for new agents. It is not the source of truth for anyone already awake. Those are two different populations and you have to update them separately.

The gate I use before re-syncing a fleet give me the detail

The trap is treating every edit as a push. Most aren’t. Before re-briefing, I ask one question per active agent:

Would this agent behave differently on its next reply under the new instructions?

If no — you fixed a typo, reworded an example, tightened prose — let it propagate organically. The cached copy is functionally identical; a mass re-brief just burns tokens and rate-limit budget for nothing.

If yes — you changed escalation rules, added principles, split one skill into two — it’s a load-bearing event. Send each in-flight agent a customized message naming the changed artifact, not a broadcast:

Doctrine update: the peer skill was split into boss/worker
and gained 4 principles (escalation + blocker enumeration).
You loaded the OLD version at your last check-in.
Re-read worker-skill.md before your next action. Here's the diff: ...

The customization matters because a boss and a worker need different halves of the change, and a generic “please re-read the docs” ping tends to get acknowledged and ignored. Cite the exact file and the exact delta so the agent has to reconcile it.

So the rule I now run by: mutating a shared prompt that live agents already hold is a deploy, not a save. Treat the save and the rollout as two separate acts. Ask the one gate question — different next reply? — and if the answer is yes, go tap every active agent on the shoulder yourself. Nobody in the fleet is reading your commits. They only know what they last heard from you.