AI crawlers cause Wikimedia (the umbrella organization of Wikipedia and a dozen or so other crowdsourced knowledge projects) Commons bandwidth demands to surge 50%.

Tea@programming.dev · 2 days ago

mke@programming.dev · 1 day ago

Apparently the dump doesn’t include media, though there’s ongoing discussion within Wikimedia about changing that. It also seems likely to me that AI scrapers don’t mind externalizing costs onto others if it might mean a competitive advantage (e.g. getting the most recent data, or not having to spend time and resources developing dedicated ingestion systems for specific sites).

I want to stress this: it’s not that “tech bros” are just stupid (even though a lot of them are revoltingly unappreciative of the giants whose shoulders they stand on); it’s that they don’t care.

How crawlers impact the operations of the Wikimedia projects