How AI and Wikipedia have sent vulnerable languages into a doom spiral
Wikipedia is the most ambitious multilingual project after the Bible: There are editions in over 340 languages, and a further 400 even more obscure ones are being developed and tested. Many of these smaller editions have been swamped with automatically translated content as AI has become increasingly accessible. Volunteers working on four African languages, for instance, estimated to MIT Technology Review that between 40% and 60% of articles in their Wikipedia editions were uncorrected machine translations. And after auditing the Wikipedia edition in Inuktitut, an Indigenous language close to Greenlandic that’s spoken in Canada, MIT Technology Review estimates that more than two-thirds of pages containing more than several sentences feature portions created this way.
This is beginning to cause a wicked problem. AI systems, from Google Translate to ChatGPT, learn to “speak” new languages by scraping huge quantities of text from the internet. Wikipedia is sometimes the largest source of online linguistic data for languages with few speakers, so any errors on those pages, grammatical or otherwise, can poison the wells that AI is expected to draw from. That can make the models’ translation of these languages particularly error-prone, which creates a sort of linguistic doom loop as people continue to add more and more poorly translated Wikipedia pages using those tools, and AI models continue to train from poorly translated pages. It’s a complicated problem, but it boils down to a simple concept: Garbage in, garbage out.
“These models are built on raw data,” says Kevin Scannell, a former professor of computer science at Saint Louis University who now builds computer software tailored for endangered languages. “They will try and learn everything about a language from scratch. There is no other input. There are no grammar books. There are no dictionaries. There is nothing other than the text that is inputted.”
There isn’t perfect data on the scale of this problem, particularly because a lot of AI training data is kept confidential and the field continues to evolve rapidly. But back in 2020, Wikipedia was estimated to make up more than half the training data that was fed into AI models translating some languages spoken by millions across Africa, including Malagasy, Yoruba, and Shona. In 2022, a research team from Germany that looked into what data could be obtained by online scraping even found that Wikipedia was the sole easily accessible source of online linguistic data for 27 under-resourced languages.
This could have significant repercussions in cases where Wikipedia is poorly written—potentially pushing the most vulnerable languages on Earth toward the precipice as future generations begin to turn away from them.
“Wikipedia will be reflected in the AI models for these languages,” says Trond Trosterud, a computational linguist at the University of Tromsø in Norway, who has been raising the alarm about the potentially harmful outcomes of badly run Wikipedia editions for years. “I find it hard to imagine it will not have consequences. And, of course, the more dominant position that Wikipedia has, the worse it will be.”
Machine translators have made it easier than ever to create error-plagued Wikipedia articles in obscure languages. What happens when AI models get trained on junk pages? Jacob Judah (MIT Technology Review)
A picture I took today of some mushrooms
Four major Earth system components are losing stability
01.10.2025 – Four key parts of the Earth’s climate system are destabilising, according to a new study with contributions from the Potsdam Institute for Climate Impact Research (PIK). Potsdam Institute for Climate Impact Research
Stubsack: weekly thread for sneers not worth an entire post, week ending 9th November 2025
Want to wade into the sandy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.
The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).
Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this.)
Impending availability of fully-functional Casio ring watches:
newatlas.com/wearables/casio-g…
Casio nano-sizes the rugged G-Shock into full-functioning finger watch
Following up on the 50th anniversary ring watch Casio introduced last year, the new G-Shock Nano adds a beefier build with 200-m water resistance, shock resistance and a working mini-strap. This one is truly a G-Shock shrunken down to finger size. C.C. Weiss (New Atlas)
Jeremy Corbyn says ‘No war with Venezuela’
The operations against Venezuela are not about drugs. The real reason for this military escalation is regime... Stop the War
The Future of Advertising Is AI Generated Ads That Are Directly Personalized to You
Do you and your human family have interest in sharing an exciting IRL experience supporting your [team of choice] with other human fans at The Big Game? In that case, don the chosen color of your [team of choice] and head to the local [iconic stadium]; Ticketmaster has exciting ticket deals, and soon you and your human family can look as happy and excited as these virtual avatars:
Ticketmaster's personalized AI slop ads are a glimpse at the future of social media advertising, a harbinger of a system that Mark Zuckerberg described last week in a Meta earnings call. This future is one where AI is used both for ad targeting and for ad generation; eventually ads are going to be hyperpersonalized to individual users, further siloing the social media experience: "Advertisers are increasingly just going to be able to give us a business objective and give us a credit card or bank account, and have the AI system basically figure out everything else that’s necessary, including generating video or different types of creative that might resonate with different people that are personalized in different ways, finding who the right customers are," Zuckerberg said.
RRF Caserta. Cinema. La vita va così
MIT Sloan quietly shelves AI ransomware study after researcher calls BS
Netflix, Disney & Crunchyroll Named Following Mass Anime Streaming Crackdown [Edit: better source]
cross-posted from: lemmy.dbzer0.com/post/56920779
torrentfreak.com/global-piracy…
Global Piracy Injunction Targets VidSrc Domains, Hydra Regenerates in Russia * TorrentFreak
An attempt to quickly suspend a dozen or so VidSrc domains aimed to do significant damage, but the site lives on thanks to a backup plan. Andy Maxwell (TF Publishing)
It's ironic that Crunchyroll is here.
They should understand better than anyone that if they offered a good service, piracy would be less of an issue.
[CW: meat] Chinese astronauts are now grilling in space
Brazil records biggest annual fall in emissions in 15 years, notably thanks to fight against deforestation
The gross emissions of Latin America's biggest country fell by 16.7% year-on-year, according to Brazil's Climate Observatory, a network of environmental NGOs. Le Monde with AFP (Le Monde)
Resistance to Italian deepfakes is growing
‘I felt violated’: the Italian women taking on porn sites over doctored images
Giorgia Meloni, Sophia Loren and writer Francesca Barra among prominent figures to have ‘nudified’ photos circulated online. Angela Giuffrida (The Guardian)
How fast a tire rolls when it takes on the tallest dune in South America - Il blog di Jacopo Ranieri
When threatened by a predator, the dune spider Carparachne aureoflava gathers its legs around its body and performs a forward pike dive, setting off an avalanche in which it is the only protagonist. Jacopo (Il blog di Jacopo Ranieri)
Theater Review | In ‘Kyoto,’ Seeking Consensus to Save the Earth but Veering Off Course
At Lincoln Center Theater, a new play from the makers of “The Jungle” tries to dramatize the negotiations that led to the Kyoto Protocol.
How to fight climate change without the US: a guide to global action
With the US government absent from the COP30 global climate summit, it will be up to others to avert catastrophe. Tollefson, Jeff
Pluses and minuses in the new strategic plan
The new strategic plan of UEA is easy to take in at a glance and lists five clear goals for the coming year. However, a total of fifteen “strategic” goals for the following years is too many, all the more so because it is unclear what has priority and who is responsible for delivery, in the opinion of Tim Owen. Francisco Javier Moleón would like UEA to consider why Esperantists see no reason to join the association, while Osmo Buller is surprised that the president of UEA named himself as the author of the plan.
Two courts urge ICE to halt deportation of man wrongfully imprisoned for more than 40 years
Legal resident ‘Subu’ Vedam being held in short-term center after getting murder conviction overturned earlier this year
Two different courts have called on immigration officials to halt deportation of a Pennsylvania man who spent more than 40 years in prison for a murder conviction that was recently overturned.
Subramanyam Vedam, 64, was brought to the United States by his parents when he was nine months old. Vedam is a legal permanent resident, and according to his lawyer, had his citizenship application accepted prior to his arrest in 1982. He is known by his relatives as “Subu”, per the Associated Press.
He is currently being held in a short-term center in Alexandria, Louisiana, which is equipped with an airstrip for deportations.
What is the current state of Discourse to threadiverse federation?
I found this article from earlier this year: blog.discourse.org/2025/04/dis…
However, I haven't come across that much content from Discourse platforms over here on Lemmy/Piefed. Is there more work to do with the plugins, or should we work with organizations running Discourse to help them connect with us?
For example, the threadiverse communities for OpenStreetMap are relatively small, and being able to see / contribute to community.openstreetmap.org would be amazing.
Discourse and the Fediverse!
Two years ago, we started working on a plugin that brings Discourse and the Fediverse closer together. Discourse communities are online spaces that facilitate open collaboration and communication. Penar Musaraj (Discourse)
Re: What is the current state of Discourse to threadiverse federation?
The Discourse ActivityPub plugin is developed by a sole developer but I believe it is ready for use. I also believe he is still working on the plugin so that's positive news.
Federation with Discourse forums is tricky; it's been difficult getting NodeBB to work reliably with Discourse.
Are you able to load Discourse categories in Lemmy?
Trial starts for DC man charged with throwing sandwich at federal agent
A video that went viral captured Sean Charles Dunn hurling his sandwich at a Customs and Border Protection agent outside a nightclub. Guardian staff reporter (The Guardian)
Mangione, Mamdani and the Media
Zohran Mamdani looks poised to become mayor of the most important city in the world tomorrow, despite the New York Times (and other major media) doing its best to smother his campaign in its crib.
After confusing driver release, AMD says old GPUs are still actively supported
Re-using old silicon means that dropping “old” GPUs can affect “new” products.
Alberta Forcing Teachers Back To Work Is A Historic Loss
Premier Danielle Smith has used the ‘notwithstanding clause’ to shield the strike-breaking bill from a court challenge.
[Patch Notes] 3.27.0 Hotfix 8
3.27.0 Hotfix 8
- Fixed a bug causing players to sometimes get large latency spikes and disconnections.
Patch Notes - 3.27.0 Hotfix 8 - Forum - Path of Exile
Path of Exile is a free online-only action RPG under development by Grinding Gear Games in New Zealand. Path of Exile
Wrist-Cut Transformation Subculture ✡ Menhera-chan - Chapter 9
Besides finding out that her grades in physical education are poor, Momoka discovers at school that she has a secret admirer...
From Baghdad to Abuja: America’s old script of liberation and ruin
Trump’s claims of Christian genocide in Nigeria and threat of military intervention are merely recycled justifications for domination masked as humanitarian concern. Abu Bilaal Abdulrazaq bn Bello bn Oare (TRT World)
What to watch in Tuesday's big elections: Races for governor, NYC mayor, redistricting and more
What to expect in the 2025 elections: NYC mayor race, Virginia governor and more
Off-year elections on Tuesday provide the first big chance for voters across several states to make their voices heard this year — and shed early light on some major questions ahead of next year’s midterm elections. Ben Kamisar (NBC News)
'Not a Freudian slip': Analyst astonished by Trump's 'confession'
President Donald Trump just made an astonishing "confession" about pardons, an analyst flagged Monday. Nicole Charky-Chami (Raw Story)
Kimberly-Clark to buy Tylenol maker Kenvue in massive consumer merger
...Kimberly-Clark is buying Kenvue in a nearly $50 billion cash and stock deal...
How Google Tracks and Scans Everything on Your Android Device
Why Google Play Services Has More Access Than Any App on Your Phone
They have a hidden app with every permission enabled already and you can't change that. Faisal Rasool (How-To Geek)
how to check if One-Click-Hoster download links are online/offline without jdownloader?
Hi, how can I check whether One-Click-Hoster download links are online or offline without JDownloader? Is there a tool or website for this?
Thanks for any help 😀
China freezes chip chemistry to slash defects by 99 per cent
I wish they linked a source on this, but overall seems like a breakthrough.
Chinese boffins have emerged from their smoke-filled labs with a way to stop chips from going pear-shaped during manufacture by literally freezing the process mid-flow.
According to researchers at Peking University, Tsinghua, and HKU, the new method can slash lithography defects by 99 per cent.
One of the trickiest bits of making semiconductors is photolithography, where light is used to “print” circuits onto silicon wafers. It’s rather like developing a microscopic photograph, except it costs billions and breaks more often.
The process involves spreading a photoresist, a light-sensitive goo, over the wafer. Ultraviolet light then shines through a mask that carries the circuit pattern, and the exposed material is chemically developed so some bits dissolve while others stay put. What remains forms the stencil for the later steps, like etching the metal or silicon layers.
That’s all well and good until the photoresist starts misbehaving. During development, dissolved material sometimes clumps together into microscopic particles that can stick back onto the wafer. At five-nanometre or smaller nodes, even a 30-nanometre blob can ruin a circuit.
Cryogenic trick gives Beijing’s fabs a leg-up. Chinese boffins have emerged from their smoke-filled labs with a way to stop chips from going pear-shaped during manufacture by literally freezing the process mid-flow. Nick Farrell (Fudzilla)
Upvote RSS - Generate rich RSS feeds from Reddit, Lemmy, Hacker News, Lobsters, PieFed, Mbin, and GitHub
Anarcho-Bolshevik
in reply to redbear
As long as none of the victims is a White cishet capitalist man, who cares?
An army massacring hundreds of thousands of non-Whites = yawn.
Somebody murdering one White cishet capitalist man (e.g. Charlie Kirk) = most important event in all of history.