The Dead Internet Theory is no longer a theory. As of 2024, automated systems account for 51% of all web traffic, AI-generated content appears in 74% of newly published web pages, and AI-written articles have surpassed human-written work in volume. The internet is still online—but the humans who once filled it are increasingly outnumbered.
What Is the Dead Internet Theory?
The Dead Internet Theory posits that most online content—social media posts, forum replies, comment sections, blog articles—is generated by bots and AI systems rather than real people, with authentic human interaction having been quietly displaced. What was originally a fringe conspiracy now maps neatly onto measurable infrastructure trends.
The theory’s canonical form appeared in a 2021 post on Agora Road’s Macintosh Cafe by a user called “IlluminatiPirate,” who argued that the internet had been overtaken by automated content since approximately 2016–2017. At the time, the claim was treated as paranoid speculation. Enthusiasts circulated the manifesto across subreddits and tech forums, but it remained on the outer edges of discourse.
By 2025, it had moved to the center. Linguist Adam Aleksic, speaking to Time, stated that the theory “used to be a lunatic fringe conspiracy theory, but it’s looking a lot more real.” A February 2025 academic survey published in the Asian Journal of Research in Computer Science formally examined the theory’s claims, concluding that the “commodification of content consumption for revenue has taken precedence over meaningful human connectivity.”1
The Numbers Don’t Lie
Three data points define the current state of the internet’s authenticity crisis.
Automated traffic has crossed 51%. Imperva’s 2025 Bad Bot Report—based on analysis of trillions of requests across thousands of domains—found that for the first time in a decade, automated traffic (bots, scrapers, AI agents) exceeded human traffic, accounting for 51% of all web traffic in 2024.2 Bad bots specifically—those designed for scraping, fraud, and manipulation—now account for 37% of all traffic, up from just over 30% in 2023. The report attributes the acceleration largely to generative AI making bot development accessible without technical expertise.
74% of new web pages contain AI-generated content. Ahrefs analyzed 900,000 newly published web pages in April 2025 using AI content detection tools and found that 74.2% contained AI-generated content.3 Only 2.5% were classified as “pure AI” with no human editing, while 71.7% represented human-AI hybrid work. Among 879 content marketers surveyed in the same study, 87% admitted using AI to create or assist in content creation.
AI-written articles have overtaken human-written ones in volume. Analytics firm Graphite reported that AI-generated articles surpassed human-written work in total volume for the first time in late 2024. Google’s response—serving AI-generated summaries via AI Overviews rather than linking to sources—has created a feedback loop where AI content generates AI summaries that further displace human publishers.
| Metric | Stat | Source | Year |
|---|---|---|---|
| Automated web traffic share | 51% | Imperva Bad Bot Report | 2025 |
| Bad bot traffic share | 37% | Imperva Bad Bot Report | 2025 |
| New web pages with AI content | 74.2% | Ahrefs (900k page study) | 2025 |
| Content marketers using AI | 87% | Ahrefs survey | 2025 |
| Google traffic decline (global publishers) | -33% YoY | Press Gazette / Chartbeat | 2025 |
| Google traffic decline (US publishers) | -38% YoY | Press Gazette / Chartbeat | 2025 |
| Expected publisher traffic drop by 2029 | -43% | Search Engine Land | 2025 |
How AI Content Took Over
The mechanism isn’t nefarious actors in basements. It’s economics.
AI-generated content costs a fraction of human writing to produce, and Google’s own research—cited by Ahrefs—found no correlation between AI content and search ranking penalty. The platform incentive structure never punished AI content at scale, which meant content farms could flood the web with AI-generated pages targeting long-tail keywords at near-zero cost. SEO spam, always a problem, became industrialized.
Social platforms followed. OpenAI CEO Sam Altman posted on X in 2025: “i never took the dead internet theory that seriously but it seems like there are really a lot of LLM-run twitter accounts now.”4 Alexis Ohanian, investor and Reddit cofounder, was less measured: “You all prove the point that so much of the internet is now just dead—this whole dead internet theory, right? Whether it’s botted, whether it’s quasi-AI, LinkedIn slop.”5
The shift isn’t limited to low-quality content farms. Sophisticated AI personas now participate in forum discussions, post reviews, generate comments, and engage in debates—indistinguishably from humans in most contexts. On Reddit, unauthorized AI bots have been documented eroding community norms, prompting Hacker News discussions about whether platform sociality itself is being hollowed out.6
The Model Collapse Problem
The consequences extend beyond the social. They reach into the technical foundations of AI itself.
In July 2024, researchers published a landmark paper in Nature: “AI Models Collapse When Trained on Recursively Generated Data.”7 Lead author Ilia Shumailov and colleagues demonstrated mathematically that AI models trained on AI-generated content progressively degrade, losing diversity in their outputs. “Tails of the original content distribution disappear,” the paper states—meaning rare, nuanced, and minority perspectives get erased first. Within a few generations of recursive training, original content is replaced by unrelated nonsense.
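The tail-loss dynamic can be illustrated with a toy simulation (this is an illustration of the general mechanism, not the paper's actual training setup): each "generation" estimates category frequencies from the previous generation's samples and resamples from that estimate. Rare categories that happen to draw zero samples vanish permanently, so diversity only shrinks.

```python
import random
from collections import Counter

def next_generation(samples, n):
    """'Train' a toy model on the samples (estimate category frequencies)
    and generate n new samples from that estimate. A category that draws
    zero samples can never reappear -- the distribution's tails die first."""
    counts = Counter(samples)
    categories = list(counts.keys())
    weights = [counts[c] for c in categories]
    return random.choices(categories, weights=weights, k=n)

random.seed(0)
# Generation 0: "human" data -- a few common topics plus many rare ones
data = [t for t in range(10) for _ in range(50)]   # 10 common categories, 50 each
data += list(range(10, 110))                       # 100 rare one-off categories

diversity = [len(set(data))]
for gen in range(20):
    data = next_generation(data, len(data))
    diversity.append(len(set(data)))

print(f"distinct categories: {diversity[0]} -> {diversity[-1]}")
```

Run repeatedly, the common categories survive while the one-offs disappear within a few generations: the same asymmetry the paper describes, where "rare, nuanced, and minority perspectives get erased first."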
This is not a distant risk. It describes the current training pipeline for most major AI systems, which scrape the public web for data. As the web tilts toward AI-generated content, future models trained on it will reflect a degraded, homogenized signal. Human insight—the original data source—is being diluted at the input layer of AI development.
What Does Authentic Human Content Even Mean Now?
In 2026, “authentic human content” is not a binary state—it’s a spectrum with contested borders.
A journalist who uses AI to transcribe interviews and suggest structure but writes every sentence is producing hybrid content. A content marketer who prompts an LLM and lightly edits the output is producing something different. A bot account that synthesizes trending topics to generate engagement bait is producing something different again. All three currently exist without clear platform-level distinction.
The deeper question is what authenticity was ever supposed to guarantee. The original appeal of blogs, forums, and social media was the presence of real perspectives, lived experience, and genuine stakes in a conversation. AI content can simulate all of these—but the simulation is hollow. There is no person who learned something and wanted to share it. There is no community member who cares about the outcome. There is no human on the other side.
This matters because the value of online discourse—in theory—was epistemic: it aggregated distributed human knowledge and perspective. An internet dominated by AI-generated content is an epistemic hall of mirrors, reflecting AI outputs back at AI training pipelines.
Can Provenance Technology Close the Gap?
The technical response to the authenticity crisis centers on content provenance: cryptographic attestation of where content came from and how it was created.
The Coalition for Content Provenance and Authenticity (C2PA), formed in 2021 by Adobe, Microsoft, the BBC, and others, has developed an open standard called Content Credentials. Google joined as a steering committee member in 2025 and collaborated on C2PA specification version 2.1, which includes stricter validation requirements for content history.8 The specification is on track for adoption as an ISO international standard.
Google’s SynthID, developed by DeepMind, embeds invisible watermarks in AI-generated text, audio, images, and video. Adobe’s Content Authenticity Initiative integrates Content Credentials into Photoshop, Firefly, and other creative tools, producing cryptographically signed provenance data attached to exported files.
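The core pattern behind these tools is a signed manifest binding a content hash to origin claims. A minimal sketch of that pattern, using only the Python standard library (an HMAC with a shared key stands in for the X.509 certificate signatures the C2PA standard actually uses, and all names here are illustrative, not the Content Credentials format):

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"creator-tool-secret"  # hypothetical key; real C2PA uses certificate chains

def attach_credentials(content: bytes, claims: dict) -> dict:
    """Bind a hash of the content to origin claims with a signature,
    so a verifier can detect tampering with either."""
    manifest = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "claims": claims,  # e.g. which tool produced it, whether AI assisted
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify(content: bytes, manifest: dict) -> bool:
    """Recompute the signature over the manifest body and check both the
    signature and the content hash."""
    body = {k: v for k, v in manifest.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, manifest["signature"])
            and body["content_sha256"] == hashlib.sha256(content).hexdigest())

article = b"original human-written text"
cred = attach_credentials(article, {"generator": "human", "ai_assisted": False})
assert verify(article, cred)            # intact content and claims pass
assert not verify(b"edited text", cred) # any change to the content fails
```

Note what the scheme does and does not guarantee: it proves the claims were attached by whoever held the key and that nothing changed since, but nothing stops a signer from attaching false claims in the first place, which is why the surrounding trust infrastructure matters as much as the cryptography.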
The gap between technical capability and platform adoption remains significant. Watermarks are voluntary, can be stripped, and require platform-level verification infrastructure most social networks haven’t deployed at scale. Detection arms races historically favor offense: bad actors adapt faster than standards bodies.
The Stakes for Publishers
The structural impact on human publishers is already severe. Google traffic to publishers globally fell 33% between November 2024 and November 2025, with US publishers absorbing a 38% decline, according to Chartbeat data analyzed by Press Gazette.9 AI Overviews—Google’s on-SERP summaries—reduce clickthrough rates by roughly 35% when they appear, and they now appear on roughly 30% of all queries.
News publishers expect search referral traffic to drop 43% by 2029, according to a Search Engine Land analysis of industry projections.10 The model that funded investigative journalism, local news, and long-form analysis—advertising revenue tied to search traffic—is collapsing precisely as the content those journalists create is being used to train the AI systems displacing them.
The Dead Internet Theory, in its most cynical reading, predicted this exact dynamic: the displacement of human creators by automated systems optimized for engagement metrics rather than truth or meaning.
Frequently Asked Questions
Q: Is the Dead Internet Theory actually true? A: It is no longer a theory in the conspiratorial sense. Imperva’s 2025 Bad Bot Report confirmed that automated traffic exceeded human traffic (51% vs. 49%) in 2024, and Ahrefs found AI-generated content in 74.2% of new web pages. The fringe claim has become documented infrastructure reality.
Q: Does AI-generated content hurt search rankings? A: Not reliably, according to Ahrefs research across 600,000 web pages. Google’s systems currently show no consistent correlation between AI content percentage and ranking position—which is precisely why the economic incentive to produce AI content at scale remains intact.
Q: What is model collapse and why does it matter? A: Model collapse, documented in a July 2024 Nature paper by Shumailov et al., occurs when AI models are trained on AI-generated data. Over successive generations, output diversity collapses and models begin producing nonsense. As AI content floods the web and enters training pipelines, future model quality is at risk.
Q: Can technology restore content authenticity? A: Partially. C2PA’s Content Credentials standard provides cryptographic provenance attestation, and Google’s SynthID embeds watermarks in AI-generated media. However, these tools require voluntary adoption, are subject to stripping, and depend on platform-level verification infrastructure that most social networks haven’t deployed.
Q: What should content creators do right now? A: Establish provenance where possible—use tools that support Content Credentials, publish original research and primary sources that AI cannot synthesize, build direct audience relationships outside search-dependent channels, and prioritize depth and specificity that AI-generated content cannot replicate at scale.
Footnotes
1. Muzumdar, Prathamesh et al. “The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media.” Asian Journal of Research in Computer Science, 18(1), 67–73. arXiv.00007. February 2025.
2. Imperva. “2025 Bad Bot Report: How AI is Supercharging the Bot Threat.” Thales/Imperva, April 2025. https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/
3. Ahrefs. “74% of New Webpages Include AI Content (Study of 900k Pages).” Ahrefs Blog, April 2025. https://ahrefs.com/blog/what-percentage-of-new-content-is-ai-generated/
4. Sam Altman, post on X (Twitter), 2025. Referenced in Fortune, October 2025.
5. Alexis Ohanian, quoted in Fortune. “Reddit cofounder Alexis Ohanian says ‘so much of the internet is dead.’” October 15, 2025. https://fortune.com/2025/10/15/reddit-co-founder-alexis-ohanian-dead-internet-theory-ai-bots-linkedin-slop/
6. “Unauthorised AI Bots on Reddit Are Eroding Sociality.” Hacker News discussion. https://news.ycombinator.com/item?id=43821600
7. Shumailov, Ilia et al. “AI models collapse when trained on recursively generated data.” Nature, 631, 755–759. July 2024. https://www.nature.com/articles/s41586-024-07566-y
8. Google Blog. “How Google and the C2PA are increasing transparency for gen AI content.” https://blog.google/innovation-and-ai/products/google-gen-ai-content-transparency-c2pa/
9. Press Gazette. “Global publisher Google traffic dropped by a third in 2025.” January 2026. https://pressgazette.co.uk/media-audience-and-business-data/google-traffic-down-2025-trends-report-2026/
10. Search Engine Land. “News publishers expect search traffic to drop 43% by 2029.” https://searchengineland.com/news-publishers-search-referrals-drop-report-467408