Who's keeping the record?

Part one ended with a question worth sitting with: as we keep adapting to new technology, we leave things behind. Are we okay with that? The honest answer, at a societal level, seems to be: we haven't really decided yet.

This is part two of two. Part one is about what happens when studios close and their websites disappear with them. You can read part one here.

We don't own what we think we own

The Stop Killing Games movement, introduced in part one, is slowly but surely gaining traction. In California, a bill called the Protect Our Games Act has passed the state assembly and is now heading to the senate. All going well, it will require publishers to give 60 days' notice before shutting down a server-dependent game, and to provide either an offline mode, community server support, or a full refund.

The industry pushback has been telling. The Entertainment Software Association (ESA) argues the bill is based on a false premise. Their position is that games are licensed software, not owned property. This means access is not guaranteed indefinitely.

That distinction matters well beyond gaming. When you stream a film, you don't own it. When a platform decides to pull it, it goes. Westworld was removed from HBO Max in a round of cost-cutting that followed a merger. Nearly 200 episodes of Sesame Street were pulled from the same platform overnight in the same purge. Warner Bros. has shelved completed films that were never released at all, writing them off as tax losses. Batgirl, a $90 million film that had already been shot, was scrapped entirely. The creators had no say. The audience had no recourse. And unless someone made a physical copy, some of these things may never be seen at all.

A while back I heard a podcast, and I wish I could remember which one, where a showrunner talked about insisting on a physical copy of her streaming series for exactly this reason. The platform could be gone tomorrow. The company could change direction. The content could be pulled for any number of commercial reasons that have nothing to do with whether the work has cultural value. She wanted a copy that existed independently of anyone's servers or subscription model.

It stuck with me because it's such a simple act of preservation. And most creators never think to do it.

The tool that was quietly keeping the record is under threat

For nearly three decades, the Internet Archive's Wayback Machine has been doing something nobody else was doing at scale: archiving the web. Saving snapshots of pages before they disappeared. Giving researchers, journalists, and curious people a way to see what something looked like before it was edited, deleted, or taken down entirely.

A 2024 Pew Research study found that 38% of web pages that existed in 2013 were no longer online a decade later. The Wayback Machine has been one of the few tools standing between that loss and total erasure.

And now news organisations are blocking it.

The New York Times implemented what the Wayback Machine's own director described as a hard block in late 2025. In total, 241 news sites across nine countries now explicitly disallow at least one of the Archive's crawling bots. The largest newspaper publisher in the US accounts for a significant share of those blocked sites, effectively removing hundreds of local publications from the historical record.
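
For the curious, "disallowing a bot" usually comes down to a few lines in a site's robots.txt file. Here is a minimal sketch of how you might check one yourself, using Python's standard library. The sample file, the example.com URL, and the helper name blocked_agents are all made up for illustration, and the two user-agent strings are ones commonly associated with the Internet Archive's crawlers; treat them as an assumption rather than an official list.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, written for this example only.
SAMPLE_ROBOTS_TXT = """\
User-agent: archive.org_bot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Assumed user-agent names for the Archive's crawlers (not an official list).
ARCHIVE_AGENTS = ["archive.org_bot", "ia_archiver"]


def blocked_agents(robots_txt, agents, url="https://example.com/news/story.html"):
    """Return the agents that this robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(SAMPLE_ROBOTS_TXT, ARCHIVE_AGENTS))
```

A robots.txt rule is a request, not a lock. It only covers crawlers that choose to honour it, and a publisher can also block a crawler in ways that never show up in this file.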

The reason is understandable. AI companies training large language models need vast quantities of high-quality text. Archived news content is exactly that: structured, dated, attributed, high-quality writing accumulated over decades. Publishers are watching their work get hoovered up to train models that then compete with them for readers, and they are trying to protect themselves.

It's a coherent response to a real problem. It just has an unintended consequence that nobody seems to be adequately accounting for: the historical archive of the web is being quietly dismantled as collateral damage in a copyright dispute between publishers and AI companies.

AI is making this worse in a second way

The more we lose trust in what we read online, the more important historical archives become.

AI-generated content has made it genuinely harder to know whether something is real. Fabricated quotes, synthetic images, plausible-sounding misinformation dressed up as journalism. The ability to point to an archived version of a page and say "this is what was actually published, on this date, before anyone could edit it" is increasingly valuable. It's one of the few tools that supports accountability in an environment where everything can be altered after the fact.

And yet we're blocking it. To stop AI scraping. Which is also a legitimate concern.

Both of these things are true simultaneously, and they pull in opposite directions, and there is no clean answer to the tension between them. What there is, is a slow erosion of the record while everyone argues about who caused it.


Video above:

It Gets Worse is a book I researched and wrote as my final-year degree project. It was built with an Arduino board, touch-sensitive paint, and conductive thread, so that when people touched the book, an animation with further information would start on screen, driven by TouchDesigner. Definitely a project that ideally should be shelved in my folio for something more current. But I adore it, so it remains.

The question nobody is answering cleanly

Every thread in this two-part piece arrives at the same place.

Studios close and websites lapse. Platforms pull content. Hardware becomes obsolete. Fan communities try to preserve things and get shut down by copyright claims. The one organisation quietly archiving the open web is being blocked by the publishers whose archives it was preserving.

Blocking the Internet Archive isn't going to stop AI training. What it will do is ensure that significant chunks of our journalistic record and historical cultural context simply disappear.

The same is true of the studio website that wasn't renewed. The game that got its servers turned off. The VR experience that no longer has hardware to run on. The streaming series that got quietly delisted.

Nobody made a deliberate decision to lose any of these things. They just slipped through gaps between commercial priorities, expired subscriptions, and technology that moved on.

The question isn't really whether we can afford to preserve things. We have 200 zettabytes of storage and counting.

The question is whether we'll decide it's worth caring about before enough of it is gone that we notice.