At 00:47 one night, a background job on one of my machines ran git add -u against my personal knowledge base, staged 803,100 files as deletions, committed it — 803100 file(s) changed, 33,214,839 deletions, 0 insertions — and pushed it to the branch every other machine syncs from.
Nobody typed a command. No disk failed. The sync did exactly what it is built to do. That is the unsettling part, and it’s the whole lesson.
My notes — years of them — sync across several machines over a shared git branch. Each machine pulls on a timer and self-heals to match the branch. That self-healing is the feature: open my laptop after a week, it catches up to whatever the desktop committed. It’s also what turns one bad commit into a fleet-wide event. The wipe was already on the shared branch. Every machine on its next tick would have pulled it and cheerfully deleted its own copy to match. The system was working perfectly, in the direction of the cliff.
I caught it, reverted the commit, and paused the sync before the next pull cycle. Then I went looking for how a program I trusted had tried to erase the thing it exists to protect.
git add -u did nothing wrong
Here’s the mechanism, because it’s more boring and more dangerous than a bug.
git add -u stages every tracked change — edits and deletions — so the timer can commit your ongoing work without you naming files. It is the correct command for an unattended committer. It has one assumption baked in so deep nobody states it: the working tree reflects reality. That the files are on disk. That “this file is gone” means you deleted it, not that something upstream made the whole tree vanish for a moment.
That night the tree had gone empty. git add -u walked it, saw 803,100 tracked files absent from disk, and faithfully recorded the only fact it’s designed to record: every one of these was deleted. It committed that fact. It pushed that fact. A faithful command on an anomalous input produces a catastrophe with a clean conscience: no error, no non-zero exit code, nothing to alert on. This is the local-tree twin of a --force push: both overwrite good data with a state nobody verified was real.
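The mechanism is easy to reproduce in a throwaway repository. What follows is a minimal sketch, assuming git is installed; the function name demo_empty_tree, the file names, and the demo identity are invented for the illustration and have nothing to do with the real sync.

```shell
#!/usr/bin/env bash
# Reproduce the failure mode in a scratch repo: empty the working tree
# behind git's back, then let `git add -u` record what it sees.
demo_empty_tree() {
    local repo
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        mkdir notes
        echo "a" > notes/a.md
        echo "b" > notes/b.md
        git add notes
        git -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m "two notes"

        rm -rf notes   # the tree vanishes; git is not told
        git add -u     # succeeds and stages both files as deletions
        # Print how many deletions are now staged.
        git diff --cached --diff-filter=D --name-only | wc -l
    )
    rm -rf "$repo"
}

demo_empty_tree
```

Nothing in that sequence fails. The only evidence that something is wrong is the size of the staged delta.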
So the tree went empty. Why? The commit itself told me, once I read it like a fingerprint.
The survivors were the tell
Of 803,688 tracked files, exactly 588 survived the deletion. I listed them expecting noise. Instead they were suspiciously uniform: .gitignore, .gitattributes, and a handful of dot-prefixed directories — .config, .scripts, .attachments, and the like. Every survivor was a dot-entry. Not one ordinary file or folder made it. Every visible top-level folder — all of my actual notes — was gone.
That asymmetry is a signature. The default * glob in bash and zsh skips dot-prefixed names. So rm -rf * or mv */ <somewhere>, run from inside the directory, removes everything visible and leaves the dotfiles sitting there untouched. A dot-only survivor set is the exact fingerprint of a glob operation.
And it rules out the suspect you’d reach for first. Git itself — a reset --hard, a checkout, a clean — treats dot and non-dot files identically. Git could never produce a dot-only survivor set. Whatever emptied the tree was a shell glob, not a git command. The sync didn’t corrupt anything. Something else swept the working directory with a glob, and the sync’s next tick walked into the aftermath and dutifully wrote it down.
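The glob fingerprint can be checked in a scratch directory. This is a sketch under default shell options (dotglob unset in bash, GLOB_DOTS unset in zsh); the directory layout and the name demo_glob_survivors are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Show which entries a default `*` glob leaves behind.
demo_glob_survivors() {
    local dir
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        mkdir notes projects .config .scripts
        touch readme.md .gitignore .gitattributes notes/a.md .config/settings
        rm -rf -- *   # default glob: dot-prefixed names do not match
        ls -A         # list every survivor, dot-entries included
    )
    rm -rf "$dir"
}

demo_glob_survivors
```

Only the dot-prefixed entries are listed afterwards, which is the same shape as the survivor set in the wipe commit.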
the two forensic one-liners that cracked it
The survivor signature. List what the wipe commit left behind and count how many are not dot-entries:
git ls-tree --name-only <wipe-commit> | grep -vc '^\.'
It returned 0. Zero non-dot survivors out of 588 is not a coincidence you explain away; it’s the mark of a glob that skipped dotfiles. That single number redirected the whole investigation away from git and toward the shell.
Ruling out git. git reflog and the pack timestamps showed no reset/checkout/clean near the event, and the commit’s parent tree was fully intact — the files existed one commit earlier and were present in the object store the whole time. The bytes were never lost; only the working tree was empty. Recovery was a git revert, not a fight with git fsck.
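The "bytes were never lost" check and the recovery can be sketched as one small helper. It assumes the wipe commit has a single parent and that the working tree is clean when it runs; recover_from_wipe is a hypothetical name, not something from the sync itself.

```shell
#!/usr/bin/env bash
# Confirm the wipe commit's parent still records every file, then undo
# the wipe with a new commit instead of rewriting shared history.
recover_from_wipe() {
    local wipe=$1
    local before after
    # File counts recorded in the parent commit and in the wipe commit.
    before=$(git ls-tree -r --name-only "${wipe}^" | wc -l | tr -d ' ')
    after=$(git ls-tree -r --name-only "$wipe" | wc -l | tr -d ' ')
    echo "parent: $before files, wipe commit: $after files"
    # A revert adds a commit that restores the deleted paths; no force-push.
    git revert --no-edit "$wipe"
}
```

Called as recover_from_wipe <wipe-commit>, it prints the two counts before reverting, so a parent tree that is not intact is visible before anything is changed.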
I never did pin the exact process that ran the glob — it left no line in any shell history I could read. And that’s the point that reorganized how I think about this class of failure: I could not enumerate the cause, and I didn’t need to.
The rule that was already there had a blind spot
My sync already had a loud, underlined invariant: never clobber. No --force. No reset --hard. No “resolve a conflict by throwing away one side.” When two machines disagree, keep both and merge; never overwrite. I’d been careful about it for exactly the reason you’d expect — a distributed sync’s nightmare is one machine stomping another’s work.
But read that rule again and you can see the hole. It’s aimed entirely at the remote: don’t let this machine destroy what’s on the branch. It never imagined the destruction originating locally — the working tree itself going empty and the sync faithfully packaging that emptiness as a legitimate commit to send. A faithful git add -u of an empty tree is a --force push wearing the costume of an honest edit. The never-clobber rule waved it right through, because from inside the rule it didn’t look like clobbering. It looked like work.
The fix keys on the delete, not the cause
The instinct after an incident is to find the trigger and block it. But I couldn’t name the trigger, and even if I had, there are a hundred other ways a working tree can transiently go bad — a botched glob, a half-unmounted volume, a race with another process, a bug I haven’t met yet. Blocking last night’s specific cause would leave the other ninety-nine open.
So the gate ignores the cause entirely and refuses on the shape of the action:
Before it commits, the sync counts how many files the commit would delete. If that count crosses a threshold, it does not commit. It halts and it alarms.
The threshold is max(100, 2% of the tracked tree): a floor because a real prune is a few dozen files, not hundreds, and a relative term so the limit scales with the repo. A genuine large deletion, a real reorg that really does remove thousands of files, is rare, and on those rare days I do it by hand and watch it. The unattended timer never gets to.
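To make the arithmetic concrete, here is a small sketch of the threshold as a function. The 2% figure is taken from the tracked/50 in the gate's code; the function name delete_threshold and its optional floor and percentage arguments are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Threshold sketch: max(floor, pct% of tracked files), integer arithmetic.
delete_threshold() {
    local tracked=$1 floor=${2:-100} pct=${3:-2}
    local relative=$(( tracked * pct / 100 ))
    echo $(( relative > floor ? relative : floor ))
}

delete_threshold 1200     # small repo: the floor wins, prints 100
delete_threshold 803688   # a tree the size of this one, prints 16073
```

Against a limit of 16,073, a commit staging 803,100 deletions is over by a factor of roughly fifty.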
The thing that makes this safe rather than just another blocking rule: refusing to commit is non-destructive by construction. The gate doesn’t touch the tree, doesn’t reset, doesn’t resolve. It declines to act and pages a human. If the tree really was wiped, the files are still recoverable from git and nothing propagated. If it was a false alarm, I lost nothing but a cycle. There is no version of “the gate fired” that costs me data. A trip is a state a human resolves, never a thing the sync retries its way past.
That inverts the burden of proof, which is the actual move. The default was commit whatever the tree says. Now the default is a mass deletion is guilty until a human clears it. The sync has to justify destruction, not the other way around.
Where this generalizes — and why I care about it for agents
Strip the git specifics and the shape is one I now look for everywhere I let software act without me watching:
An automated actor will faithfully execute a catastrophic action if its input is anomalous and it has no notion of “this delta is implausible.” The failure is never in the command. It’s in the unquestioned assumption that the input reflects reality.
I build agents. This is the same failure with better vocabulary. An agent handed a corrupted context, a truncated tool result, a half-empty query response, will faithfully act on it — delete the records, send the batch, overwrite the file — because faithfully executing instructions is the entire job, and “wait, this input looks wrong” is not a thing that happens unless you build the organ that notices. You cannot enumerate every way the input goes bad upstream. You can put a gate on the action that asks one question the actor can’t ask itself: is this delta plausibly real? Deleting most of the table, paying an order of magnitude more than any prior invoice, touching every row — cheap to check on the way out, and it doesn’t care how the bad input got there.
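Two of those outbound checks, sketched in shell. Everything here is hypothetical: the function names, the "more than half the rows" reading of "most of the table", and the 10x reading of "an order of magnitude" are assumptions chosen to make the idea concrete, not a description of any real agent framework.

```shell
#!/usr/bin/env bash
# Outbound plausibility checks. Each returns 0 when the delta looks
# implausible, so the caller can refuse instead of acting.

# True when a delete would remove more than half of the rows.
deletes_most_rows() {
    local to_delete=$1 total=$2
    [ $(( to_delete * 2 )) -gt "$total" ]
}

# True when a payment is at least 10x the largest prior invoice (in cents).
dwarfs_prior_invoices() {
    local amount=$1 largest_prior=$2
    [ "$amount" -ge $(( largest_prior * 10 )) ]
}

# Hypothetical wrapper: refuse and escalate rather than act.
guarded_delete() {
    local to_delete=$1 total=$2
    if deletes_most_rows "$to_delete" "$total"; then
        echo "refused: deleting $to_delete of $total rows; needs a human" >&2
        return 1
    fi
    echo "ok: deleting $to_delete of $total rows"
}
```

Neither check knows or cares why the input was wrong. Like the commit gate, each looks only at the size of what is about to happen.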
It’s the same pattern I wrote about for extracting data off bank statements: don’t trust the extractor, don’t trust the answer key — build the gate that reconciles or refuses, and it stays correct no matter which upstream part is wrong today. There the gate refused a statement that didn’t sum to its printed total. Here it refuses a commit that deletes more than a human plausibly would. Same instinct, different blast radius. Reconcile or refuse. If you can’t verify the delta is real, don’t ship it.
the gate, and why the check is trivial
The whole guard is a handful of lines in the commit path, before git commit:
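# Assumption: tracked = number of files at HEAD, counted before this commit.
tracked=$(git ls-tree -r --name-only HEAD | wc -l)
# Deletions staged in the index relative to HEAD.
staged_deletes=$(git diff --cached --diff-filter=D --name-only | wc -l)
# Threshold: max(100, tracked/50), i.e. a floor of 100 or 2% of the tree.
threshold=$(( 100 > tracked/50 ? 100 : tracked/50 ))
if [ "$staged_deletes" -gt "$threshold" ]; then
    alarm "sync halted: $staged_deletes staged deletions (> $threshold)"
    exit 1  # do NOT commit; leave the tree exactly as found
fi
Three properties are doing the work, and none of them are clever: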
- It reads the staged delta, which git computes for free. No tree scan, no heuristic, no model. --diff-filter=D is deletions only.
- It fires before the commit, so the destructive act never happens; there’s nothing to roll back, because nothing was written.
- The trip is loud and terminal. It alarms and stays halted until a human looks. It does not auto-retry, because the next tick would walk into the same anomalous tree and make the same call — correctly.
The cost of the gate is one wc -l per commit. The cost of not having it was a 33-million-line deletion on its way to every machine I own. That ratio is the whole argument for putting a plausibility check on any action your automation can take without you.
I keep the incident commit in my history on purpose — 803100 file(s) changed — as a reminder that the scariest failures aren’t the ones where something breaks. They’re the ones where every part works exactly as designed, in a direction you never told it not to go.