The bots that ate history.
AI’s hunger for data is quietly erasing the internet’s cultural memory, draining archives, rewriting history, and redefining who owns knowledge.
The Silent Stampede
It doesn’t begin with a headline. It begins with a crash. Servers slow, museum websites flicker offline, and a digital librarian somewhere in Europe wonders why their open-access archive suddenly looks like Times Square on a Friday night. The answer: bots. Not the malicious kind built by hackers, but the new, respectable kind — AI training crawlers, quietly devouring every byte of culture they can reach. Paintings, audio archives, public domain films, oral histories — scraped not for preservation, but for ingestion. AI, it turns out, is starving. And our heritage has become the meal.
Progress at the Cost of Memory
In June 2025, researchers at the GLAM-E Lab, NYU’s Engelberg Center on Innovation Law and Policy, published a report with an ominous title: Are AI Bots Knocking Cultural Heritage Offline?
The findings were staggering:
- 39 of 43 cultural institutions surveyed reported massive surges in traffic.
- 27 confirmed the source was automated AI data scrapers.
- Some museums described their archives being hit with hundreds of thousands of requests per hour, which one curator called “a DDoS attack disguised as curiosity.”
The bots didn’t mean harm; they were simply doing their job — hoarding information for future models. But to archives built for human curiosity, not industrial-scale extraction, the result was the same: digital exhaustion.
The report concluded with a chilling sentence:
“The infrastructure of open access is being strained to the point of collapse by automated AI activity.”
It’s a quiet collapse, too — one that doesn’t burn or break, but fades.
UNESCO’s Warning: A Cultural Vacuum
A few months later, UNESCO released its CULTAI Report — a sweeping, 120-page document on the future of artificial intelligence and culture. The language was diplomatic but urgent:
“AI is advancing faster than cultural governance, widening divides and raising new risks for cultural rights, diversity, and sovereignty.”
Translation: we’re so busy building the future that we’re forgetting to back up the past.
The report described an emerging “cultural vacuum” — where local heritage, particularly from the Global South, is disproportionately scraped into datasets controlled by Western tech firms.
A Kenyan photography archive or a Filipino oral history project may find its contents floating in the training data of a multimillion-dollar Silicon Valley model — uncredited, uncontextualised, and untraceable. The machine learns the rhythm of a voice but forgets the language it came from.

The New Colonialism Is Digital
Let’s call this what it is: digital colonialism. If the 19th century was about the extraction of land and labour, the 21st is about the extraction of meaning. Our cultural commons — the shared memory of humanity — has become training material. Museums built to democratise knowledge now find themselves plundered by algorithms under the banner of innovation. The UNESCO CULTAI report warned that this pattern risks reproducing “cultural imperialism through data ownership.” It’s the same old power imbalance, translated into code: the few training on the stories of the many. And just as in empire, erasure doesn’t happen with violence. It happens with silence.
The Cost of Openness
A parallel study by Britt Amell (2025), Are Bots Reshaping Open Access?, found that over 90% of open-access repositories surveyed had faced “aggressive scraping behaviour.” For institutions already running on shoestring budgets, this means rising server costs, throttled access, and sometimes a full digital lockdown. “We built open access for human discovery,” one curator told Museums Journal. “Now we’re defending it from machines pretending to be students.”
It’s a paradox of the digital age: openness makes you vulnerable, but closure betrays your mission. Some archives have responded by limiting downloads or blocking bots entirely — effectively reversing two decades of progress toward democratised knowledge. And still, the scraping continues. Because if there’s one thing AI values more than creativity, it’s quantity.
Culture as Commodity
Every new wave of technology repackages human expression as product. But this time, the speed is different — and the consequences are invisible. The global AI industry is projected to exceed $2 trillion by 2030, fuelled by vast language and image models that rely on millions of cultural artefacts.
Each artwork, text, or recording ingested into these systems becomes an unpaid contributor to a machine that might later sell access to “cultural intelligence” back to the world it learned from.
A museum’s public collection becomes a data set.
A community’s oral tradition becomes training audio.
A nation’s story becomes a statistic.
And we call it progress.
The Emotional Cost of Erasure
It’s easy to treat this as a technical issue, but it’s not. It’s existential. When archives go offline, the loss isn’t just informational — it’s emotional. We lose evidence of who we were, and how we remembered ourselves. At Truffle Culture, we argue that this erosion of context is the cultural crisis of our generation. AI doesn’t erase culture maliciously; it erases it indifferently. The algorithm doesn’t care about the hands that painted the fresco or the prayer that inspired the poem. It cares about correlation. Pattern. Scale.
When Innovation Forgets Its Source
The conversation around AI often oscillates between fear and fascination. But here’s the quieter, more dangerous reality: AI doesn’t destroy culture by replacing it; it destroys it by consuming it faster than it can be replenished. In this light, the race for ever-larger datasets looks less like progress and more like an arms race for memory.
The internet’s collective archives, once symbols of global collaboration, are now supply chains for an intelligence industry. If the library was once a cathedral of knowledge, the data center is its modern equivalent — colder, louder, and far less romantic.
Our Reflection: Toward Cultural Stewardship
The question isn’t whether AI should learn from human culture; it already has. The question is: what does it owe in return?
There’s still time to build systems of reciprocity — frameworks where data collection respects cultural authorship, where archives are credited, compensated, and protected. UNESCO calls this “Cultural AI Sovereignty.” At Truffle Culture, we call it digital stewardship — a cultural contract between progress and preservation. Because the machines aren’t evil. They’re hungry. And it’s up to us to decide whether they feast or learn.
References & Data Sources
- GLAM-E Lab, NYU Engelberg Center (2025). Are AI Bots Knocking Cultural Heritage Offline?
→ 43 surveyed institutions; 39 reported major bot traffic surges; 27 confirmed AI scraping.
- UNESCO CULTAI Report (2025). Artificial Intelligence and Culture: Independent Expert Group on AI and Culture, pp. 14–62.
→ Warns of cultural sovereignty risks and governance lag.
- Britt Amell (2025). Are Bots Reshaping Open Access?
→ Over 90% of repositories reported scraping issues.
- Museums Journal (June 26, 2025). AI scraper bots are disrupting online collections.
→ Quotes from curators on server overloads and “mechanical extraction.”
- AI Market Forecast (Statista, 2025).
→ Projected to exceed $2 trillion by 2030.