

Wikipedia Editors Adopt ‘Speedy Deletion’ Policy for AI Slop Articles


“The ability to quickly generate a lot of bogus content is problematic if we don't have a way to delete it just as quickly.”

Wikipedia editors just adopted a new policy to help them deal with the slew of AI-generated articles flooding the online encyclopedia. The new policy, which gives an administrator the authority to quickly delete an AI-generated article that meets certain criteria, isn’t only important to Wikipedia; it’s also an instructive example of how to deal with the growing AI slop problem, coming from a platform that has so far managed to withstand the various forms of enshittification that have plagued the rest of the internet.

Wikipedia is maintained by a global, collaborative community of volunteer contributors and editors, and part of the reason it remains a reliable source of information is that this community takes a lot of time to discuss, deliberate, and argue about everything that happens on the platform, be it changes to individual articles or the policies that govern how those changes are made. It is normal for entire Wikipedia articles to be deleted, but the main process for deletion usually requires a week-long discussion phase during which Wikipedians try to come to consensus on whether to delete the article.

However, in order to deal with common problems that clearly violate Wikipedia’s policies, Wikipedia also has a “speedy deletion” process, where one person flags an article, an administrator checks if it meets certain conditions, and then deletes the article without the discussion period.

For example, articles composed entirely of gibberish, meaningless text, or what Wikipedia calls “patent nonsense,” can be flagged for speedy deletion. The same is true for articles that are just advertisements with no encyclopedic value. If someone flags an article for deletion because it is “most likely not notable,” that is a more subjective evaluation that requires a full discussion.

At the moment, most articles that Wikipedia editors flag as being AI-generated fall into the latter category, because editors can’t be absolutely certain that they were AI-generated. Ilyas Lebleu, a founding member of WikiProject AI Cleanup and an editor who contributed some critical language to the recently adopted policy on AI-generated articles and speedy deletion, told me that this is why previous proposals for regulating AI-generated articles on Wikipedia have struggled.

“While it can be easy to spot hints that something is AI-generated (wording choices, em-dashes, bullet lists with bolded headers, ...), these tells are usually not so clear-cut, and we don't want to mistakenly delete something just because it sounds like AI,” Lebleu told me in an email. “In general, the rise of easy-to-generate AI content has been described as an ‘existential threat’ to Wikipedia: as our processes are geared towards (often long) discussions and consensus-building, the ability to quickly generate a lot of bogus content is problematic if we don't have a way to delete it just as quickly. Of course, AI content is not uniquely bad, and humans are perfectly capable of writing bad content too, but certainly not at the same rate. Our tools were made for a completely different scale.”

The solution Wikipedians came up with is to allow the speedy deletion of clearly AI-generated articles that broadly meet two conditions. The first is if the article includes “communication intended for the user.” This refers to language in the article that is clearly an LLM responding to a user prompt, like “Here is your Wikipedia article on…,” “Up to my last training update…,” and “as a large language model.” This is a clear tell that the article was generated by an LLM, and a method we’ve previously used to identify AI-generated social media posts and scientific papers.
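To illustrate how mechanical this first check can be, here is a minimal sketch of a phrase scan in Python. The phrase list is drawn from the examples above, and the helper function is hypothetical; it is not part of any official Wikipedia tooling, and actual reviewers apply judgment rather than running a script.

```python
import re

# Illustrative only: tell phrases taken from the examples above.
# This helper is hypothetical and not part of Wikipedia's tooling.
LLM_TELL_PHRASES = [
    r"here is your wikipedia article on",
    r"up to my last training update",
    r"as a large language model",
]

def has_llm_tells(article_text: str) -> bool:
    """Return True if the text contains phrasing addressed to a prompting user."""
    lowered = article_text.lower()
    return any(re.search(phrase, lowered) for phrase in LLM_TELL_PHRASES)

print(has_llm_tells("Here is your Wikipedia article on the history of lighthouses..."))  # True
```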

Lebleu, who told me they’ve seen these tells “quite a few times,” said that more importantly, they indicate the user hasn’t even read the article they’re submitting.

“If the user hasn't checked for these basic things, we can safely assume that they haven't reviewed anything of what they copy-pasted, and that it is about as useful as white noise,” they said.

The other condition that would make an AI-generated article eligible for speedy deletion is if its citations are clearly wrong, another type of error LLMs are prone to. This can include external links to books, articles, or scientific papers that don’t exist or don’t resolve, as well as links that lead to completely unrelated content. Wikipedia’s new policy gives the example of “a paper on a beetle species being cited for a computer science article.”
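As a rough sketch of the “don’t resolve” half of that check, the snippet below simply issues an HTTP request for each cited URL and flags the ones that fail. The function and its thresholds are assumptions for illustration; deciding whether a source that does resolve is topically unrelated still requires a human reader.

```python
# Illustrative sketch, not Wikipedia tooling: flag cited URLs that fail to resolve.
# Judging whether a resolving source is unrelated (e.g. a beetle paper cited in a
# computer science article) still requires human review.
import requests

def broken_citations(urls: list[str], timeout: float = 10.0) -> list[str]:
    """Return the cited URLs that do not resolve to a successful HTTP response."""
    broken = []
    for url in urls:
        try:
            resp = requests.head(url, allow_redirects=True, timeout=timeout)
            if resp.status_code >= 400:
                broken.append(url)
        except requests.RequestException:
            broken.append(url)
    return broken

print(broken_citations(["https://example.com/a-paper-that-does-not-exist"]))
```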

Lebleu said that speedy deletion is a “band-aid” that can take care of the most obvious cases, and that the AI problem will persist because editors see a lot more AI-generated content that doesn’t meet these new conditions for speedy deletion. They also noted that AI can be a useful tool that could be a positive force for Wikipedia in the future.

“However, the present situation is very different, and speculation on how the technology might develop in the coming years can easily distract us from solving issues we are facing now,” they said. “A key pillar of Wikipedia is that we have no firm rules, and any decisions we take today can be revisited in a few years when the technology evolves.”

Lebleu said that ultimately the new policy leaves Wikipedia in a better position than before, but not a perfect one.

“The good news (beyond the speedy deletion thing itself) is that we have, formally, made a statement on LLM-generated articles. This has been a controversial aspect in the community before: while the vast majority of us are opposed to AI content, exactly how to deal with it has been a point of contention, and early attempts at wide-ranging policies had failed. Here, building up on the previous incremental wins on AI images, drafts, and discussion comments, we workshopped a much more specific criterion, which nonetheless clearly states that unreviewed LLM content is not compatible in spirit with Wikipedia.”


Wikipedia Pauses AI-Generated Summaries After Editor Backlash


The Wikimedia Foundation, the nonprofit organization which hosts and develops Wikipedia, has paused an experiment that showed users AI-generated summaries at the top of articles after an overwhelmingly negative reaction from the Wikipedia editors community.

“Just because Google has rolled out its AI summaries doesn't mean we need to one-up them, I sincerely beg you not to test this, on mobile or anywhere else,” one editor said in response to Wikimedia Foundation’s announcement that it will launch a two-week trial of the summaries on the mobile version of Wikipedia. “This would do immediate and irreversible harm to our readers and to our reputation as a decently trustworthy and serious source. Wikipedia has in some ways become a byword for sober boringness, which is excellent. Let's not insult our readers' intelligence and join the stampede to roll out flashy AI summaries. Which is what these are, although here the word ‘machine-generated’ is used instead.”

Two other editors simply commented, “Yuck.”

For years, Wikipedia has been one of the most valuable repositories of information in the world, and a laudable model for community-based, democratic internet platform governance. Its importance has only grown in the last couple of years during the generative AI boom, as it’s one of the only internet platforms that has not been significantly degraded by the flood of AI-generated slop and misinformation. As opposed to Google, which since embracing generative AI has instructed its users to eat glue, Wikipedia’s community has kept its articles relatively high quality. As I reported last year, editors are actively working to filter out bad, AI-generated content from Wikipedia.

A page detailing the AI-generated summaries project, called “Simple Article Summaries,” explains that it was proposed after a discussion at Wikimedia’s 2024 conference, Wikimania, where “Wikimedians discussed ways that AI/machine-generated remixing of the already created content can be used to make Wikipedia more accessible and easier to learn from.” Editors who participated in the discussion thought that these summaries could improve the learning experience on Wikipedia, where some article summaries can be quite dense and filled with technical jargon, but that AI features needed to be clearly labeled as such and that users needed an easy way to flag issues with “machine-generated/remixed content once it was published or generated automatically.”

In one experiment, where summaries were enabled for users who have the Wikipedia browser extension installed, the generated summary showed up at the top of the article, and users had to click it to expand and read it. That summary was also flagged with a yellow “unverified” label.
An example of what the AI-generated summary looked like.
Wikimedia announced that it was going to run the generated summaries experiment on June 2, and was immediately met with dozens of replies from editors who said “very bad idea,” “strongest possible oppose,” “Absolutely not,” etc.

“Yes, human editors can introduce reliability and NPOV [neutral point-of-view] issues. But as a collective mass, it evens out into a beautiful corpus,” one editor said. “With Simple Article Summaries, you propose giving one singular editor with known reliability and NPOV issues a platform at the very top of any given article, whilst giving zero editorial control to others. It reinforces the idea that Wikipedia cannot be relied on, destroying a decade of policy work. It reinforces the belief that unsourced, charged content can be added, because this platforms it. I don't think I would feel comfortable contributing to an encyclopedia like this. No other community has mastered collaboration to such a wondrous extent, and this would throw that away.”

A day later, Wikimedia announced that it would pause the launch of the experiment, but indicated that it’s still interested in AI-generated summaries.

“The Wikimedia Foundation has been exploring ways to make Wikipedia and other Wikimedia projects more accessible to readers globally,” a Wikimedia Foundation spokesperson told me in an email. “This two-week, opt-in experiment was focused on making complex Wikipedia articles more accessible to people with different reading levels. For the purposes of this experiment, the summaries were generated by an open-weight Aya model by Cohere. It was meant to gauge interest in a feature like this, and to help us think about the right kind of community moderation systems to ensure humans remain central to deciding what information is shown on Wikipedia.”

“It is common to receive a variety of feedback from volunteers, and we incorporate it in our decisions, and sometimes change course,” the Wikimedia Foundation spokesperson added. “We welcome such thoughtful feedback — this is what continues to make Wikipedia a truly collaborative platform of human knowledge.”

“Reading through the comments, it’s clear we could have done a better job introducing this idea and opening up the conversation here on VPT back in March,” a Wikimedia Foundation project manager said. VPT, or “village pump technical,” is where the Wikimedia Foundation and the community discuss technical aspects of the platform. “As internet usage changes over time, we are trying to discover new ways to help new generations learn from Wikipedia to sustain our movement into the future. In consequence, we need to figure out how we can experiment in safe ways that are appropriate for readers and the Wikimedia community. Looking back, we realize the next step with this message should have been to provide more of that context for you all and to make the space for folks to engage further.”

The project manager also said that “Bringing generative AI into the Wikipedia reading experience is a serious set of decisions, with important implications, and we intend to treat it as such,” and that “We do not have any plans for bringing a summary feature to the wikis without editor involvement. An editor moderation workflow is required under any circumstances, both for this idea, as well as any future idea around AI summarized or adapted content.”

