Part 1. A Problem Discovered Too Late

The models you use are slowly getting dumber. This isn't alarmism, unfortunately, it's empirical: in April 2025, 74.2% of new web pages contained AI-generated text, and each successive model trains partly on the exhaust of its predecessor, which itself digested the exhaust of an even earlier one. Biologists talk about autophagy, an organism consuming its own tissue. Medical geneticists speak of the Habsburg jaw. ML engineers, since July 2024, have used the term “model collapse.”

What do Charles II of Spain (the last Habsburg on the Spanish throne), a photocopier, and ChatGPT have in common? All three degrade when they copy themselves. For the Habsburgs, generations of close-kin marriages accumulated genetic defects. For the photocopier, it's loss of contrast by the tenth copy. For AI, it was a July 2024 paper in Nature — AI models collapse when trained on recursively generated data — that spelled it out: feed a language model on the outputs of its predecessor, and within a few generations it gets substantially dumber.

The scenario is clear: diversity drops, and the tails of the distribution (the rarest, most original outputs) are the first to disappear. By the end of 2024, Dohmatob et al. in Strong Model Collapse had shown something uncomfortable: asymptotically, even one-thousandth synthetic data in a training corpus leads to degradation. 0.1% is already bad. So what happens when roughly 75% of new web text is synthetic, and AI-generated pages accounted for nearly 20% of Google's top results by 2025? What fresh material is the poor model supposed to learn from? Scaling the dataset doesn't help. Scaling the model doesn't help. The classic scaling-laws hypothesis (more data, smarter model), the premise underpinning half of every VC pitch deck out there, collapsed right on investors' slides.

By the time the Nature paper appeared, the industry had already spent two years living in the age of mass AI production, and the web data earmarked for training the next generation of models contained a meaningful share of synthetic content.

Phase One: Alarmism (2024)

After the Nature paper, talking about model collapse as an inevitable curse became fashionable, and two useful terms gained currency. Since 2023, people had been referring to Model Autophagy Disorder (MAD), a label for generative models that feed on their own outputs and go insane. The acronym was too literary not to spread across headlines. Alongside it, journalists picked up "Habsburg AI": a model that breeds with itself comes to a bad end.

By late 2024, an apocalyptic scenario had taken shape. AI generation grows exponentially. Web data becomes contaminated. Future generations of models train on heavily polluted data. Quality falls irreversibly.

Phase Two: Mitigation (2025)

In 2025, several papers appeared showing that yes, things were bad — no one was disputing that — but not quite so fatal. The picture turned out to be more nuanced.

If real data is retained at every generation and synthetic data is merely added on top, the model degrades, but more slowly. A regime that replaces human text with machine text guarantees rapid collapse; an accumulation regime doesn’t. The picture that emerged showed that collapse is real but manageable with three practical steps: accumulate, filter, and blend in the right proportions. The alarmism of 2024 subsided.
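The difference between the two regimes can be seen in a toy experiment. What follows is a minimal sketch, not a reproduction of any cited paper: the "model" is just a Gaussian fitted to one-dimensional data, and the function names, batch size, and generation count are illustrative assumptions.

```python
import math
import random

def fit(samples):
    """Maximum-likelihood mean and standard deviation of a 1-D sample."""
    n = len(samples)
    mean = math.fsum(samples) / n
    var = math.fsum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(var)

def run(regime, generations=300, batch=20, seed=0):
    """Return the fitted std at each generation under 'replace' or 'accumulate'."""
    rng = random.Random(seed)
    # Generation 0 trains on "real" data drawn from N(0, 1).
    pool = [rng.gauss(0.0, 1.0) for _ in range(batch)]
    history = []
    for _ in range(generations):
        mean, std = fit(pool)
        history.append(std)
        # The fitted model writes the next batch of "text".
        synthetic = [rng.gauss(mean, std) for _ in range(batch)]
        if regime == "replace":
            pool = synthetic          # machine output replaces the old corpus
        else:
            pool = pool + synthetic   # machine output is added on top of it
    return history

replace_std = run("replace")[-1]
accumulate_std = run("accumulate")[-1]
print(f"replace:    final std = {replace_std:.2e}")
print(f"accumulate: final std = {accumulate_std:.2f}")
```

In the replace regime the fitted spread shrinks toward zero, which is the toy version of the tails disappearing. In the accumulate regime it drifts but stays on the order of the original. A Gaussian is a long way from a language model, so treat this as intuition only.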

Phase Three: Pragmatism (2026)

By May 2026, the industry had calmed down a bit. The apocalypse was postponed, while the collapse was stripped for parts and produced one useful application: spectral diagnostics of the embedding Gram matrix can now catch degradation before a model starts talking nonsense.
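To make "spectral diagnostics" less abstract, here is a hedged sketch of one possible statistic. The source does not say which one practitioners use, so the participation ratio below is my assumption: it equals (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the Gram matrix, and for a symmetric matrix it can be read off the trace and the squared entries without an eigendecomposition. The function names and the tiny hand-made "embeddings" are illustrative.

```python
def gram_matrix(embeddings):
    """Gram matrix G[i][j] = <e_i, e_j> for a list of embedding vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in embeddings]
            for u in embeddings]

def participation_ratio(gram):
    """Effective number of directions in use: tr(G)^2 / tr(G^2).

    For symmetric G, tr(G) is the sum of eigenvalues and the sum of squared
    entries is the sum of squared eigenvalues.
    """
    trace = sum(gram[i][i] for i in range(len(gram)))
    frobenius_sq = sum(x * x for row in gram for x in row)
    return trace * trace / frobenius_sq

# Four outputs pointing in four different directions.
diverse = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
# Four outputs that use only two directions.
drifting = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# Four outputs that all say the same thing.
collapsed = [[1.0, 0.0, 0.0, 0.0]] * 4

print(participation_ratio(gram_matrix(diverse)))    # 4.0
print(participation_ratio(gram_matrix(drifting)))   # 2.0
print(participation_ratio(gram_matrix(collapsed)))  # 1.0
```

A falling value across training generations would be the early warning: the outputs occupy fewer directions long before the prose turns to nonsense. A real pipeline would probably center the embeddings first and work with thousands of vectors, which this sketch skips.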

Three major shifts happened.

1. The Verifier as Life Preserver (With a Cast-Iron Weight Chained to It)

Yi et al., "Escaping Model Collapse via Synthetic Data Verification" (revised March 2026), reported a major breakthrough. A verifier, an external filter between a model and its synthetic outputs, let training produce improvement rather than collapse. Venture capitalists celebrated but preferred not to think about a few nuances. First, strictly speaking, the theorem is proven only for linear regression in a vacuum. Second, the trained model converges to the verifier's "knowledge center": the student can never surpass the teacher. Third (and this is the really uncomfortable part), if the verifier is imperfect, early gains plateau and can even reverse. Infinite self-play is impossible, and the perfect teacher doesn't exist.
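The second and third caveats are easy to see in a toy setting. This is a sketch under loud assumptions and not the linear-regression setup of the paper: the "model" is a single number, its outputs are noisy samples around that number, and the verifier simply accepts anything close to its own belief. The names train_with_verifier, TRUTH, and VERIFIER_CENTER are hypothetical.

```python
import random

def train_with_verifier(start, verifier_center, tolerance=1.0,
                        rounds=20, samples=2000, seed=0):
    """Repeatedly generate, filter through the verifier, and refit."""
    rng = random.Random(seed)
    estimate = start
    trajectory = [estimate]
    for _ in range(rounds):
        # The model generates synthetic data around its current estimate.
        synthetic = [rng.gauss(estimate, 1.0) for _ in range(samples)]
        # The verifier keeps only what lies near its own "knowledge center".
        accepted = [x for x in synthetic if abs(x - verifier_center) < tolerance]
        if accepted:  # if nothing is accepted, the estimate stays put
            estimate = sum(accepted) / len(accepted)
        trajectory.append(estimate)
    return trajectory

TRUTH = 0.0            # what the world actually says
VERIFIER_CENTER = 0.5  # what an imperfect verifier believes

far = train_with_verifier(start=3.0, verifier_center=VERIFIER_CENTER)
near = train_with_verifier(start=0.1, verifier_center=VERIFIER_CENTER)

print(f"started far:  error {abs(far[0] - TRUTH):.2f} -> {abs(far[-1] - TRUTH):.2f}")
print(f"started near: error {abs(near[0] - TRUTH):.2f} -> {abs(near[-1] - TRUTH):.2f}")
```

The model that starts far from the truth improves, then stalls at the verifier's belief rather than at the truth. The model that starts closer to the truth than its verifier gets worse. Both end up where the teacher stands.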

2. We Found the Delete Button

Deleting data from an LLM (unlearning) used to be like trying to extract a specific teaspoon of sugar from a finished cake. Any given model was a black box with hundreds of billions of parameters. Scholten and colleagues proposed an elegant pivot: the Partial Model Collapse method, which sics a targeted collapse on the unwanted knowledge and burns it out through recursion. Their paper is titled "Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs" — arguably the best headline of the year. The bug became a scalpel.

3. The End of Magic (Clinical Autophagy)

In April 2026, clinicians sounded the alarm: clinical LLMs, trained on AI-generated medical records, were systematically erasing rare pathologies and averaging out complex conditions into "benign normals." Rare pathologies aren't actually that rare — in an aging population, it’s especially common for a patient to have two or more conditions at once. Add a kidney problem to a liver problem, a vascular problem, and a hospital-acquired infection, and the AI starts contradicting itself. The same bland, washed-out prose you'd expect from hack writers has migrated into the workstations of practicing physicians, directly threatening the lives of their patient-readers.

Experts gave the phenomenon the reassuringly bland label "interpretive drift" and proposed watermarks on AI-generated records, plus "Human Vaults" that isolate and store repositories of data from real doctors. I think this is a band-aid at best: how are these vaults supposed to interact with the information AI generates in day-to-day clinical work?

AGI Valuations Take a Nosedive

"Investors and AI evangelists have been breathlessly awaiting Artificial General Intelligence (AGI), a model that can solve any problem at least as well as a human. AGI won't need smoke breaks, worry about mortgage payments, or risk burnout, but it also won't walk through the door everyone's been waiting at: simple scaling.".

You can increase compute by a factor of ten. You can (in theory) do it by a hundred. It won't matter. An AI isn’t going to drop you a line asking "I've engineered a new apple variety, launched a delivery startup, optimized global gas logistics, found a few decent and competent politicians, and proved the Riemann Hypothesis along the way. What's next?" A machine uprising’s not coming any time soon, either. A model cannot conjure from its weights information that was never there to begin with.

The idea behind model collapse isn't new: an Austrian philosopher came up with it long before the first perceptron existed. The meaning of a word, he argued, is not determined by its internal structure but by its use within a community. Without public practice, without other participants who can correct you, language degenerates into noise.

Understanding why a model cannot surpass its teacher requires a philosophical detour. Without one, any conversation about AI devolves into tech-speak, and the central question gets lost: what is meaning, and where does it come from?

Part 2. Wittgenstein Explains What Happened

Early Wittgenstein and why symbolic AI failed

In 1921, a young Ludwig Wittgenstein published the Tractatus Logico-Philosophicus — a book that, by his own modest estimation, had definitively solved all philosophical problems. (A perfectly reasonable state of mind at 32, especially if you're Austrian and came back from World War I a couple of years prior.)

The central idea of the Tractatus: language is a picture of the world. Every meaningful sentence corresponds to a fact in reality. Words are labels on things; sentences are models of situations. Anything that cannot be expressed clearly in this logical form is meaningless. Hence the famous aphorism, number 7:

"Whereof one cannot speak, thereof one must be silent."

This saying inspired logical positivism, and later symbolic AI of the 1960s–80s: build a large enough system of rules, the thinking went, and you'd get artificial intelligence. From 1956 to 1985, DARPA and universities poured billions into the search for AI according to this principle, creating expert systems, logic reasoners, and common-sense knowledge bases.

The program failed. Common sense still hasn't been encoded, not because of a shortage of processing power but because it cannot be formalized. Every rule requires interpretation, every interpretation requires another rule, and so on without end.

The early Wittgenstein inspired a program that collapsed for precisely the reasons the later Wittgenstein predicted.

Late Wittgenstein: meaning is use

By the 1930s, Wittgenstein had begun to doubt that he had solved all the problems of philosophy — a rare quality in any philosopher, and one worth acknowledging. He spent the next twenty years dismantling his own earlier work.

The result was Philosophical Investigations, published posthumously in 1953. It is an entirely different philosophy. Language is no longer a picture of the world but an activity. From "a word denotes a thing," we move to "a word has a use." One correct language disappears in favor of a multiplicity of language games.

Take the word "water." What does it mean?

— "Water!" a man cries out in the desert.

— "This is water," a mother shows her child.

— "Water boils at one hundred degrees," a physics textbook states.

— "Are you serving me distilled water again?" the generative text co-author snaps at ChatGPT.

These are all different uses: not different "senses of one word," but different language games in which the word operates differently. To understand the word "water" is not to memorize the definition H₂O, but to be able to play these games. Understanding means you know how to cry out when thirsty; to point when teaching; to write the formula when doing physics; to throw the word as an insult when irritated.

These uses seem to share no single common core, yet they form a chain of overlapping similarities, like the faces of family members. While a definition of the word "water" is impossible, any native speaker handles it with complete ease. Family resemblance matters more than formal semantics.

Meaning does not live in the mind, in a dictionary, or in logical form. Meaning lives in practice — in how words are used among other people, in concrete situations.

The Private Language Argument: The Key to Everything

In §§243–271 of Philosophical Investigations, Wittgenstein runs a thought experiment — in my view, the single most important argument in all of twentieth-century philosophy for anyone trying to understand AI.

Suppose I have a tickle in my left ear that no one else will ever experience. I invent the sign "S" for it and write that in my diary every time the sensation appears. I tell no one. I have my own, private language.

Wittgenstein asks: what exactly is this? What does "the same S" mean across my entries? How do I know that today's sensation is the very same S as yesterday's?

The answer is that I don't. There is no criterion. I can say "it seems to me that it's the same," but "it seems to me" is not a criterion, it’s part of what needs verification. Without a public practice, without other participants who can correct me or agree with me, the sign "S" has no meaning. It’s not a language, just noise I’m recording for some reason.

Meaning arises only within a community of practices.

If you have read this far, something should have clicked. Modern LLMs are trained on vast corpora of human text. They see which words appear alongside which other words. But they do not participate in language games. They do not drink water, do not cry "Water!" in the desert, do not explain things to a child, do not write physics textbooks, do not hurl words as insults.

From the perspective of the later Wittgenstein, LLMs have statistics of usage but no usage itself. This is precisely the private language argument applied to AI: a model that takes no part in a form of life is dealing with something that looks like language from the outside but is, in essence, its own private vocabulary — endlessly sophisticated and entirely without meaning.

When such a model begins training on its own outputs without human verification, we get a private language devouring itself. In practice, this is called model collapse; in philosophy, it’s the impossibility of private language.

Brandom: Turning Intuition into Theory

Wittgenstein was a brilliant philosopher, but a systematic thinker he was not. Philosophical Investigations is a collection of notes and aphorisms, wonderful to read and agonizing to work with theoretically. By the end of the twentieth century, Wittgenstein's insights were waiting for someone to systematize them.

That someone was the American philosopher Robert Brandom of the University of Pittsburgh. His magnum opus — Making It Explicit (1994) — runs to 800 pages of dense prose, in which every concept is defined in terms of five others, each of those defined in terms of ten more. Best read slowly, pencil in hand, with the growing sense that the author has a personal score to settle with you.

Brandom takes Wittgenstein's "meaning is use" and turns it into a rigorous theory called inferentialism: the inferences a word enters into determine its meaning.

The word "water" means what it means because:

— From "this is water" it follows that "this is drinkable."

— From "this is water" it follows that "this will get you wet."

— From "this is water" it does NOT follow that "this will burn."

— From "ice melts at 0°C" it follows that "water freezes at 0°C."

This is an inferential network — a web of consequences. To understand a word is to know its position in that web.

Brandom's central metaphor: language is the game of giving and asking for reasons. When I assert something, I commit myself to justifying it if challenged. I license others to use my assertion in their own reasoning. If I'm caught in a contradiction, I'm obliged to retract or revise.

This is a normative structure, with rights, obligations, and sanctions for all participants. To understand a sentence is to know its commitments, entitlements, and incompatibilities.
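For readers who think in data structures, the web of consequences can be sketched as a small graph. This is a deliberately crude illustration, not Brandom's formal apparatus: the class name InferentialWeb and its methods are hypothetical, and treating "this will burn" as incompatible with "this is water" (rather than merely not following from it) is my simplification.

```python
class InferentialWeb:
    """Toy model of a word's meaning as its place in a web of inferences."""

    def __init__(self):
        self.entails = {}        # claim -> set of claims that follow from it
        self.incompatible = {}   # claim -> set of claims it rules out

    def add_entailment(self, premise, conclusion):
        self.entails.setdefault(premise, set()).add(conclusion)

    def add_incompatibility(self, a, b):
        # Incompatibility is symmetric: each claim rules out the other.
        self.incompatible.setdefault(a, set()).add(b)
        self.incompatible.setdefault(b, set()).add(a)

    def consequences(self, claim):
        """Everything a speaker is committed to by asserting `claim`."""
        seen, frontier = set(), [claim]
        while frontier:
            current = frontier.pop()
            for nxt in self.entails.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    def clashes(self, assertions):
        """Pairs of commitments the speaker would be obliged to retract or revise."""
        committed = set(assertions)
        for claim in assertions:
            committed |= self.consequences(claim)
        return sorted((a, b) for a in committed
                      for b in self.incompatible.get(a, ())
                      if b in committed and a < b)

web = InferentialWeb()
web.add_entailment("this is water", "this is drinkable")
web.add_entailment("this is water", "this will get you wet")
web.add_incompatibility("this is water", "this will burn")

print(web.consequences("this is water"))
print(web.clashes(["this is water", "this will burn"]))
```

The structure records what follows and what clashes. What it cannot record is anyone actually being held to it, which is the part the rest of the argument turns on.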

And this is where things get interesting for our purposes.

LLM as a Stochastic Parrot in a Normative Vacuum

From Brandom's perspective, an LLM produces assertions, but:

— It takes on no obligation to justify them. The model generated a contradiction? Tomorrow it will generate the next one without missing a beat.

— It faces no consequences for what it asserts. The model told you yesterday that Sydney is the capital of Australia? You can see how that happens.

— It doesn't exist within a normative community where anything is actually at stake. I can be fired for lying in a report; a model won't be fired for hallucinating, and if it gets shut down, it doesn't care. It has nothing on the line.

The optimist camp (Sutskever, Hinton) says that "understanding" emerges from large-scale statistics. If a model predicts the next token well enough, it must have built an internal model of the world.

I don't like the word "emerges." It's the academically respectable way of saying "we have no idea how it works, but apparently it does." The more honest term is "black box." While I don’t want to wade further into that debate — it doesn’t seem like one that arguments can resolve — I can confidently state that within a single model, with no access to a normative community, meaning cannot arise. Emergence from statistics is, at best, a hypothesis. External verification is a mathematically demonstrable necessity.

An LLM has no intrinsic concept of "water." If it's being checked by a verifier (another AI, a human expert, a compiler) that says "this is right, this is wrong," it gradually builds up a picture: "water" is the thing the verifier nods at when I say "liquid," "wet," or "boils at a hundred degrees," and frowns at when I say "burns" or "triangular."

The model doesn't understand the word, it understands the teacher. Its "knowledge" is an imprint of the verifier's normative grip, not contact with the world. Change the verifier, and you change the meaning.

If tomorrow the judge decides that water burns, the model will dutifully learn that water burns.

What This Means

The most honest conclusion of Part 2 is a simple one. When the AI industry debates "does a model understand," "does it have consciousness," or "could it be AGI," it's treading the same ground philosophy covered sixty years ago. That’s not because philosophers are smarter (though sometimes they are), but because they had more time and fewer financial incentives to stay distracted.

Philosophy's central lesson for AI in the twentieth century: meaning doesn't arise from within a system. Meaning comes from participation in a normative community. Yi et al. formalized this for the specific case of training generative models. Brandom did it for all discourse. Wittgenstein did it for language itself.

But the conclusion most people draw from this — that AI is doomed, that synthetic data is poison, that we need to build walls around "human" data — is wrong. In Part 3, I'll show why that same Wittgensteinian logic points in exactly the opposite direction.

Part 3. Where the Skeptics Go Wrong: Provenance Doesn't Matter

The distinction between "AI-generated" and "human" text is false. That’s not because there's no difference, but because the difference doesn't matter for the central question: can you train the next generation of models on these texts? If yes, LLMs have a future. If no, we're at a dead end.

The standard picture looks like this. There's a clean human corpus — books, articles, forums from before 2022 (the golden age of the internet, even though most forum content in 2008 was junk too, just hand-typed). Then there's a contaminated AI corpus — everything models have generated since ChatGPT launched. The assumption is that mixing the two in training leads to model collapse.

This viewpoint underpins every current regulatory proposal: watermarks, clean-data registries, the Human Vaults concept. I'm arguing that it's wrong.

Take a concrete example. I write with AI. For this article, Claude and I discussed Wittgenstein's philosophy; I proposed a thesis, she pushed back, and we arrived at a formulation together. This text is partly hers, partly mine, and I can no longer tell where one ends and the other begins.

Say I then:

  1. Read through and edit it.

  2. Show it to two colleagues who point out errors, then make the corrections they recommend.

  3. Publish it in a peer-reviewed journal, where other researchers scrutinize it for years.

What kind of text is this? "AI-generated" or "human"?

The standard picture demands a binary answer: once AI touches content, it's synthetic. However, in terms of epistemic function, this text has been put through a filtering process no less rigorous than what you'd find at Nature. For training purposes, it's closer to verified knowledge than to "human" noise. What matters isn't the origin but the validation trail.

The later Wittgenstein gives the same clear answer: meaning is determined by participation in a practice, not by its source. A word means the same thing when a parrot squawks it as when a professor speaks it if both use it correctly within the same language game.

The Real Reason Models Collapse

Reframe the problem through this lens, and it looks entirely different. It’s not "AI-generated text contaminates data," but:

The speed at which AI produces content has outpaced the speed at which existing institutions can filter it.

The scandal involving Kenyan content moderators is a perfect illustration of how this works today. When OpenAI trained its safety filters, it outsourced the work through Sama, paying Kenyan workers just over a dollar an hour to read descriptions of violence and abuse. Verification today falls far short of any reasonable technical or ethical standard.

AI-generated content isn't bad because of its machine origin. It's bad because engineers cut corners on annotation to protect shareholder margins, and then lazy writers can't be bothered to proofread whatever the model spits out. That's a different problem, and it calls for different solutions.

Nature faces an analogous challenge and solves it differently: organisms accumulate DNA mutations constantly. Most are harmful. If evolution operated on the principle of "eliminate mutations before they occur," life would never have arisen. Evolution works because selection weeds out bad mutations faster than they accumulate. That is the architecture worth studying.

AI-generated texts are a new type of "mutation" in humanity's epistemic body of knowledge. They are not inherently poisonous. They become poisonous only when the filtering system buckles under their sheer volume.

The right question here is not "how do we label AI content," but "how do we scale verification institutions?"

I half-expect someone to announce a website modestly titled "Everything About Everything" any day now. It’ll include a pipeline that analyzes the interests of several thousand audience segments, then generates news pieces, longform articles, and ostensibly authored opinion columns, complete with real agency photographs and AI-generated images. The site will adapt to each individual reader and publish in the 100–200 languages humanity still reads. The writing quality will be far from perfect, but that's a temporary problem. Besides, is everything humanity has ever produced perfect?

Here's the kicker. Say our hypothetical site owner puts a like button and a read-time estimate at the bottom of every article, and how could he not? Suddenly he has in his hands exactly the filter Yi et al. describe. Hundreds of millions of clicks a day across a hundred languages is a verification mechanism with more throughput than any peer review process in existence.

What the Math Says

We probably won’t get a single mogul creating such a website, but the existing AI moguls could pool their efforts. Instead of spending billions trying to separate "clean human content" from "dirty AI content," they could invest that money in filters capable of distinguishing verified from unverified text, regardless of origin. Humans, incidentally, exploit unverified content for far more nefarious purposes, and far more often, than AI does out of its own inherent limitations. The term fake news still means something, doesn't it?

Where It Already Works (and What That Tells Us)

The funny thing is, industry best practices already operate by Wittgenstein’s logic, but nobody spells it out explicitly. Let's name names.

Code. AlphaCode, Codex, GitHub Copilot, Claude Code. The model generates code. The code runs. If it works, great. If not, it gets feedback. The compiler is the reality verifier. Where the code came from (a human, an AI, or some hybrid) is irrelevant. All that matters is whether the code passes the tests. That's why AI is progressing at a stunning pace in this domain.
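A minimal sketch of what "provenance is irrelevant, the tests decide" looks like in practice. The candidate snippets, their origin labels, and the helper passes_tests are all invented for illustration. Note that exec runs arbitrary code, so anything beyond a toy would need a sandbox.

```python
def passes_tests(source, tests, func_name="add"):
    """Run candidate source and check it against (args, expected) pairs."""
    namespace = {}
    try:
        exec(source, namespace)  # toy only: real systems must sandbox this
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and crashes all count as failures.
        return False

TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

candidates = [
    {"origin": "human",  "source": "def add(a, b):\n    return a - b\n"},
    {"origin": "model",  "source": "def add(a, b):\n    return a + b\n"},
    {"origin": "hybrid", "source": "def add(a, b) return a + b\n"},
]

# The filter never looks at the "origin" field.
accepted = [c for c in candidates if passes_tests(c["source"], TESTS)]
print([c["origin"] for c in accepted])  # ['model']
```

Here the hand-typed version has a bug and the hybrid version does not even parse. The filter neither knows nor cares who wrote what.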

Mathematics. AlphaProof, Lean, Coq. The model proposes a proof. A formal verifier checks it. It passes, it's true. It doesn't, it's wrong. Provenance is irrelevant; verification is everything.

Protein structures. AlphaFold predicts the structure and a lab verifies it experimentally.

Chess and Go. AlphaZero's self-play is the one case where self-improvement doesn't lead to collapse. The reason is the built-in verifier: the rules of the game and the fact of winning or losing. A loss is an objective signal, independent of any verifier's "knowledge base."

The pattern is clear: wherever a cheap reality filter exists, AI progresses without collapse and without needing human-labeled data. Wherever it doesn't (creative writing, politics), we hit the ceiling of the human-as-verifier approach and run straight into the limitations Yi et al. describe.

Under this logic, the current political debates around AI start to look pretty shaky:

— Banning AI on Wikipedia is pointless. Fact-checking is what matters. A text's origin tells you nothing about its quality.

— Mandatory labeling is useful for transparency, not for quality. A labeled AI text doesn't automatically become worse; an unlabeled human text doesn't automatically become better.

— Isolation in Human Vaults is a safety net, not a solution. Building filters that work in the flow of content should take priority over hiding away a "golden corpus."

What actually makes sense instead:

— AI judges in peer review. Paradoxically, AI filters AI better than humans do, recognizing characteristic errors faster. Anthropic, OpenAI, and DeepMind already use this principle in final-stage review.

— Scaling reality grounding. Extend the code-and-math approach as broadly as possible to any domain with an objective verifier.

— Reputation systems. The question to answer isn't "should I trust this article?" but "should I trust this author?"

— Provenance tracking. Watermarks can serve as tracking infrastructure rather than exclusion filters, helping people account for a text’s characteristics by providing information about its journey.
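One way to picture provenance as tracking infrastructure rather than an exclusion filter is a record that stores a text's origin but scores only its validation trail. Everything in this sketch is hypothetical: the Check and Document types, the kinds of checks, and especially the weights, which are placeholders and not a proposed standard.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Check:
    kind: str     # e.g. "author_edit", "colleague_review", "peer_review"
    passed: bool

@dataclass
class Document:
    title: str
    origin: str   # "human", "ai", or "hybrid": recorded, never scored
    trail: list = field(default_factory=list)

# Placeholder weights for illustration only.
WEIGHTS = {"author_edit": 1.0, "colleague_review": 2.0, "peer_review": 5.0}

def trust_score(doc, weights=WEIGHTS):
    """Sum the weights of the checks a document has passed; origin is ignored."""
    return sum(weights.get(check.kind, 0.0) for check in doc.trail if check.passed)

essay = Document("Co-written essay", origin="hybrid", trail=[
    Check("author_edit", True),
    Check("colleague_review", True),
    Check("colleague_review", True),
    Check("peer_review", True),
])
forum_post = Document("Unchecked forum post", origin="human")

print(trust_score(essay))       # 10.0
print(trust_score(forum_post))  # 0
```

The co-written essay from the earlier example outranks the hand-typed post because of what happened to it after it was written, not because of who typed it.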

The Sharpest Implication

Taken to its logical conclusion, this leads to a provocative claim: most of the fear around AI data contamination is displaced anxiety about our own laziness.

We're afraid we won't be able to filter AI-generated content, but we don't properly filter human content either. Social media recommendation algorithms have spent years giving the spotlight to whatever triggers an emotional response, from dubious claims to political propaganda, with no regard for truth whatsoever. Our epistemic institutions were already in deep crisis long before ChatGPT.

AI didn't create the filtering problem. AI laid it bare. The solution is making our epistemic institutions stronger, but that's expensive, tedious, and doesn't generate clickbait headlines. Worse, it carries the risk of being accused of censorship.

The industry would rather talk about watermarks. That's politically safe. It’s also epistemically useless: getting a model to embed a cryptographic signature in its outputs is technically trivial, but it says nothing about the quality of those results.

This Is a Race

AI progress now hinges on a single question: can verification institutions keep pace with the volume of generation? Right now, they can't. The ratio of verified knowledge to synthetic noise is in freefall.

This is a race with two possible outcomes.

In the optimistic scenario, we manage to rebuild institutions in time and AI advances through hybrid filtering. Wittgenstein turns out to be right in the most elegant sense: meaning arises in a form of life, and that form of life turns out to be elastic enough to incorporate AI as a participant rather than a threat.

In the pessimistic scenario, institutions buckle. Web data becomes unusable for training. Frontier models are forced to retreat to closed, curated corpora and reality grounding. That's not the end of the world, but it's a radical slowdown, nothing like what the industry promises in its investor decks.

The sharpest irony is that in both scenarios, the top labs will have to do the same thing: build verification infrastructure. They’ll need it to keep moving forward in one case and to avoid collapse in the other. The only difference is that, in my view, these institutions are the primary engine of AI progress, while in the standard view, they're an annoying line item that costs money better spent on GPUs.

The Honest Counterargument

The central weakness of my position is this: verification doesn't scale well. Not cognitively (careful verification requires a slow human in the loop), and not economically (Kenyan annotators at a dollar an hour represent the industry's ceiling, and it doesn't hold). Generation gets cheaper on an exponential curve. If that gap doesn't close, we end up in the pessimistic scenario for purely economic reasons, not philosophical ones.

I don't know how to close that gap. My suspicion is that part of the answer lies in AI judges (a model checking a model is faster than a human), part in reality grounding (a compiler is free and never gets tired), and part in reputation systems (trust in an author lowers the cost of verifying their work). However, these are all hypotheses. The problem may be unsolvable in principle. If so, the pessimists are right.

A Falsifiable Prediction

If I'm right, three things will happen in the next two to three years.

First: companies like Anthropic and OpenAI will stop keeping AI judges in-house. Someone will raise a round for technology providing verification as a service, and it will be the most boring and most necessary startup of the decade.

Second: reputation layers will grow on top of arXiv and similar platforms. This won’t be a binary "passed/failed peer review," but a trust gradient with a full verification history.

Third: the top labs will quietly stop scraping the web wholesale. They'll shift to carefully curated text collections and domains where reality checks results (code, experiments, mathematics). In their technical reports, the phrase "trained on internet-scale data" will start being sheepishly swapped out for "trained on high-quality curated data."

If, on the other hand, the industry spends another three years debating watermarks and registries of "clean" data, then I was wrong, and the problem really is about where text comes from, not how it's filtered. Fair enough.

What readers should do about this

Verify your work. Don't just "disclose that you used AI" — that won't save anyone. Run your text through a fact-check, a colleague, a test, an experiment, or anything else that might catch you out.

You are part of the verification architecture, whether you like it or not. The quality of your personal filtering matters more now than it ever has. You are not a lone author but a small institution, and you belong to the kind of life form that checks the others.

Including stochastic parrots.

Especially stochastic parrots.

Sources

References cited in this piece. Last verified on the published or revision date.

  9. Hackaday, "Why Model Collapse in LLMs Is Inevitable With Self-Learning"
     hackaday.com/2026/04/29/why-model-collapse-in-llms-is-inevitable-with-self-learning

  11. NewsGuard, "Watch Out: AI 'News' Sites Are on the Rise"
      www.newsguardtech.com/insights/watch-out-ai-news-sites-are-on-the-rise

  13. Perrigo, "OpenAI Used Kenyan Workers on Less Than $2 Per Hour" (TIME)
      time.com/6247678/openai-chatgpt-kenya-workers