900 billion. That's how many web pages the Internet Archive's Wayback Machine has catalogued since 1996, making it the closest thing the web has to a collective memory. That memory is under serious pressure.
Major news organizations are blocking the Wayback Machine from crawling their sites and displaying archived versions of their pages, according to Wired. The stated reasons vary - copyright claims, paywalls, control over content distribution - but the cumulative effect is the same: the archive's usefulness for research and fact-checking shrinks every time a publisher opts out. Journalists and advocacy groups are pushing back, but legal and technical pressure is mounting.
The Archive already had a difficult 2024. A cyberattack in October compromised 31 million user accounts. Courts ruled against it in cases brought by music and book publishers. Now it's losing ground with the news industry - historically one of its strongest institutional allies.
For anyone who uses the internet as a research tool, the practical impact is concrete. Broken links are everywhere. The Wayback Machine has been the standard fallback: paste a dead URL, retrieve what the page said before it was edited, deleted, or moved behind a paywall. That's useful for verifying what an AI company promised before a product launch, confirming pricing that quietly changed, or checking what a service's terms said six months ago.
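The dead-URL lookup described above can also be scripted. What follows is a minimal sketch, assuming the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and its JSON response shape; the function names `build_query`, `closest_snapshot`, and `lookup` are illustrative, not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return the archived copy's URL from a decoded response, or None."""
    # Assumed response shape: {"archived_snapshots": {"closest": {...}}}
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")


def lookup(url, timestamp=None):
    """Query the endpoint over the network and return the snapshot URL, if any."""
    with urlopen(build_query(url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("example.com", "20240101")` would ask for the capture closest to 1 January 2024 (`example.com` is a placeholder). A result of `None` means the service reported no available snapshot for that URL, which is presumably what a researcher would see for a page that was never captured or is no longer served.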
The Archive operates as a nonprofit and doesn't have the legal resources of the major publishers challenging it. If current trends continue, the Wayback Machine won't disappear - but it will cover less of the web, with more gaps, precisely when reliable research tools are in highest demand.