Knowledge Bases Need Garbage Collection
Every note you add is a note someone later has to read past. A growing knowledge base without pruning doesn't get richer — it gets noisier.
Left unattended, a knowledge base only ever grows. You add notes, rules, docs, decisions, and nothing gets removed. The unexamined assumption is that more is better: more captured context, more history, more coverage.
It isn’t. Past a point, every note you add is a note that someone (or some agent) later has to read past to find the live one. Stale entries that reference dead systems don’t just sit there harmlessly; they pollute retrieval, contradict the current truth, and quietly lower the signal of the whole corpus. I watched a rule that referenced retired infrastructure stay “active” long after the infrastructure was gone, still showing up in searches, still half-trusted.
So the base needs a garbage collector, and the collector needs an opinion. Mine, borrowed: default to delete. The burden of proof is on keeping a thing, not on removing it. A note with no live purpose gets archived out of the active set, not kept “just in case.”
Two mechanics make it real. First, lifecycle metadata on everything — a status and a review-by date; an entry with no review date is itself a defect. Second, a recurring sweep that flags dead-infrastructure references, merges duplicates that say the same thing, and pushes anything past its review date toward repeal. Archive with a tombstone rather than hoard — the audit trail survives, the active set stays lean.
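Here is roughly what those two mechanics could look like in Python. Treat it as a minimal sketch, not a description of any particular tool: the `Entry` fields, the `sweep` and `archive` functions, and the idea of passing in a set of retired system names are all assumptions made for illustration. Duplicate detection here is deliberately naive (identical text after lowercasing and whitespace normalization); a real base would likely need something fuzzier.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Entry:
    # Hypothetical schema: every entry carries lifecycle metadata.
    entry_id: str
    text: str
    status: str = "active"             # "active" or "archived"
    review_by: Optional[date] = None   # a missing date is itself a defect
    tombstone: Optional[str] = None    # set when the entry is archived


def sweep(entries, retired_systems, today):
    """Return {entry_id: [reasons]} for active entries that need attention."""
    flags = {}
    seen = {}  # normalized text -> id of the first active entry carrying it
    for entry in entries:
        if entry.status != "active":
            continue
        reasons = []
        if entry.review_by is None:
            reasons.append("no review date")
        elif entry.review_by < today:
            reasons.append("past review date")
        lowered = entry.text.lower()
        # Plain substring match against names of retired systems.
        for name in sorted(retired_systems):
            if name.lower() in lowered:
                reasons.append(f"references retired system: {name}")
        # Naive duplicate check: same text after whitespace normalization.
        key = " ".join(lowered.split())
        if key in seen:
            reasons.append(f"duplicate of {seen[key]}")
        else:
            seen[key] = entry.entry_id
        if reasons:
            flags[entry.entry_id] = reasons
    return flags


def archive(entry, reason, today):
    """Move an entry out of the active set, leaving a tombstone behind."""
    entry.status = "archived"
    entry.tombstone = f"archived {today.isoformat()}: {reason}"
```

The sweep only flags; it never deletes on its own. Under this sketch a person (or a stricter policy layer) still decides what to call `archive` on, and archived entries drop out of later sweeps while the tombstone keeps the audit trail.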
The goal isn’t a smaller knowledge base. It’s a knowledge base where everything still in it is still true.