



Is Meta Scraping the Fediverse for AI?


Building on initial reports from the FediPact account and Dropsite News, we dive into potential measures admins can take for their instances.




A new report from Dropsite News claims that Meta is scraping a large number of independent sites for content to train its AI. What’s worse, this scraping operation appears to completely disregard robots.txt, a control list used to tell crawlers, search engines, and bots which parts of a site may be accessed and which should be avoided. It’s worth mentioning that the efficacy of such lists depends on the consuming software honoring them, and not every piece of software does.
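
To make the point about honoring robots.txt concrete, here is a minimal Python sketch of the check a well-behaved crawler performs before fetching a page. It uses the standard library's urllib.robotparser; the bot name ExampleAIBot, the example.social URLs, and the rules themselves are invented for illustration and do not come from the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is banned from the whole site,
# everyone else is only asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.social/@alice")
        print(agent, "may fetch the profile page:", allowed)
```

Nothing in this check is enforced by the server: a crawler that never runs it, or ignores the answer, can still request every URL.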

Meta Denies All Wrongdoing


Andy Stone, a communications representative for Meta, has gone on record claiming that the list is bogus and the story is incorrect. Unfortunately, Dropsite’s story has not spread widely, and there haven’t been any other public statements about the list at this time. This makes it difficult to adequately critique the initial story, but the concept is nevertheless a wake-up call.

However, it’s worth acknowledging Meta’s ongoing efforts to scrape data from many different sources. This includes user data, vast amounts of published books, and independent websites not part of Meta’s sprawling online infrastructure. Given that the Fediverse is very much a public network, it’s not surprising to see instances getting caught in Meta’s net.

Purportedly Affected Instances


The FediPact account has dug into the leaked PDF, and a considerable number of Fediverse instances appear on the list. The document itself is 1,659 pages of URLs, so we filtered it by keyword to find matches. Please keep in mind that these counts only cover sites that use a platform’s name in the domain:

  • Mastodon: 46 matches
  • Lemmy: 6 matches
  • PeerTube: 46 matches

There are likely considerably more unique domain matches in the list for a variety of platforms. Admins are advised to review whether their own instances are documented there. Even if your instance’s domain isn’t on the list, consider whether your instance is federating with something on the list. Due to the way federation works, cached copies of posts from other parts of the network can still show up on an instance that’s been crawled.
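
For admins who want to run a similar check themselves, the following Python sketch shows one way to do it once the URLs have been extracted from the PDF into a list of strings (the extraction step is not shown). It is an illustration under stated assumptions, not the method used to produce the counts above: it counts each hostname once per keyword, and the function names and example domains are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def hostname(url: str) -> str:
    """Extract a lowercase hostname from a URL or a bare domain."""
    url = url.strip()
    if "://" not in url:
        url = "//" + url  # lets urlparse treat a bare domain as a netloc
    return (urlparse(url).hostname or "").lower()


def count_keyword_matches(urls, keywords):
    """Count unique hostnames that contain each keyword."""
    counts = Counter()
    seen = set()
    for url in urls:
        host = hostname(url)
        if not host or host in seen:
            continue
        seen.add(host)
        for keyword in keywords:
            if keyword in host:
                counts[keyword] += 1
    return counts


def is_listed(urls, domain: str) -> bool:
    """Check whether a domain, or any subdomain of it, appears in the list."""
    domain = domain.lower()
    return any(
        host == domain or host.endswith("." + domain)
        for host in map(hostname, urls)
    )


if __name__ == "__main__":
    sample = [
        "https://mastodon.example/about",
        "https://mastodon.example/explore",
        "lemmy.example",
        "https://video.example/",
    ]
    print(count_keyword_matches(sample, ["mastodon", "lemmy", "peertube"]))
    print(is_listed(sample, "video.example"))
```

The second function matters more than the first: most instances do not carry a platform name in their domain, so an exact domain lookup is the only reliable way to check a specific server.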

Access the Leaked List


We are mirroring this document for posterity, in case the original article is taken offline.

Download (PDF)

Protective Measures to Take


Regardless of the accuracy of the Dropsite News article, there’s an open question as to what admins can do to protect their instances from being scraped. Due to the nature of the situation, there is likely no single silver bullet to solve these problems, but there are a few different measures that admins can take:

  • Establish Community Terms of Service – Write a Terms of Service for your instance that explicitly prohibits scraping for data collection and LLM training. While it may have little to no effect on Meta’s own scraping efforts, it at least establishes precedent and a paper trail for your server community’s expectations and consent.
  • Request Data Removal – Meta has a form buried within the Facebook Privacy Center that could be used to submit a formal complaint regarding instance data and posts being part of their AI training data. Whether or not Meta does anything is a matter of debate, but it’s nevertheless an option.
  • (EU-Only) Send a GDPR Form – Similar to the above step, but try to get the request in front of the Meta GDPR representatives who deal with compliance.
  • Establish Blocking Measures Anyway – Even though private companies can choose to disregard things like robots.txt and HTTP headers such as X-Robots-Tag: noindex, you can still reduce your site’s exposure to AI agents that do honor them.
  • Set Up a Firewall – One popular software package that’s seeing a lot of recent adoption for blocking AI traffic is Anubis, which has configurable policies that you can adjust to handle different kinds of traffic.
  • Use Zip Bombs – When all else fails, take matters into your own hands. On the server side, use an Nginx or Apache configuration to detect specific User Agents associated with AI, and serve them ever-expanding compressed archives to slow them down.
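
As a rough illustration of the blocking measures above, the Python sketch below generates a robots.txt that disallows a list of AI crawlers and shows the kind of User-Agent check a server or reverse proxy could apply to requests. The tokens in AI_CRAWLER_TOKENS are examples of crawler names that, to our knowledge, have been publicly documented by their operators; treat the list as an assumption and verify it against current documentation. The function names and the 403 policy are hypothetical choices for this example, not part of any particular platform.

```python
from typing import Dict, Iterable, Tuple

# Assumed example tokens; verify against each operator's own documentation.
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "meta-externalagent")


def build_robots_txt(tokens: Iterable[str] = AI_CRAWLER_TOKENS) -> str:
    """Build a robots.txt that asks each listed crawler to avoid the whole site."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


def is_listed_crawler(user_agent: str,
                      tokens: Iterable[str] = AI_CRAWLER_TOKENS) -> bool:
    """Case-insensitive substring match of known tokens in a User-Agent header."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in tokens)


def decide(user_agent: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, extra headers) for a request with this User-Agent."""
    if is_listed_crawler(user_agent):
        return 403, {}  # refuse crawlers that identify themselves
    return 200, {"X-Robots-Tag": "noindex"}  # ask everyone else not to index


if __name__ == "__main__":
    print(build_robots_txt())
    print(decide("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(decide("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))
```

Matching on User-Agent only stops crawlers that identify themselves honestly, which is why it complements rather than replaces traffic-level tools.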

In reality, fighting AI scraping is still a relatively new problem, complicated by a lack of clear regulation and by companies deciding to do whatever they want. The best we can do for our communities is to adopt protective measures and stay informed of new developments in the space.



in reply to Sean Tilley

Does the GPL apply to content published on the fediverse? If yes we're getting a great open source model for free.

Also obligatory fuck META

in reply to Ziggurat

No, the GPL is strictly a software license. Even then, most Fediverse platforms are AGPL at best, which only really deals with modifications to the source code of AGPL projects. Content licensing itself is more nebulous and more complicated.
in reply to Sean Tilley

Meta's Threads has launched a beta allowing users in some countries to share posts to ActivityPub-compliant Fediverse servers. ActivityPub is a protocol enabling interoperability between social networks, allowing posts to flow between networks regardless of ownership. Key points:

  • Threads users can opt-in to share posts to the Fediverse.
  • Replies to Threads posts on the Fediverse won't appear in Threads itself initially.
  • Meta plans further integrations, including allowing Threads users to see and engage with replies from other servers and follow users on other Fediverse servers.
  • Future Threads profiles will consolidate followers from both Threads and other Fediverse servers.
  • Challenges exist due to the lack of standardized features like quote posts in ActivityPub.
  • Meta aims to provide a fully interoperable experience, reaching new audiences and fostering community.
  • Interoperability allows Meta to better understand user behavior across different platforms for monetization purposes.

theregister.com/2024/03/22/met…



Azerbaijan | Aliyev approves $2 million energy aid to Kyiv following Russian strikes on Azeri-linked sites in Ukraine


The allocated funds will be used to purchase and ship Azerbaijani-made electrical equipment from the president's 2025 reserve fund.


Archived version: archive.is/newest/kyivindepend…


Disclaimer: The article linked is from a single source with a single perspective. Make sure to cross-check information against multiple sources to get a comprehensive view on the situation.


in reply to schizoidman

Wikipedia doesn't have to do shit.

Let them break their internet until they fix it.



In India, Trump's tariffs spark calls to boycott American goods


From McDonald's and Coca-Cola to Amazon and Apple, U.S.-based multinationals are facing calls for a boycott in India as business executives and Prime Minister Narendra Modi's supporters stoke anti-American sentiment to protest against U.S. tariffs.


Archived version: archive.is/20250811152120/reut…





EU to channel $1.7 billion from frozen Russian assets to repay Ukraine's loans


This marks the third such transfer, covering income generated in the first half of 2025. Previous tranches were disbursed in July 2024 and April 2025.


Archived version: archive.is/newest/kyivindepend…





Australia to recognise Palestinian state in September


It follows similar moves by the UK, France and Canada.


Archived version: archive.is/20250811104845/bbc.…





French report links Nestlé bottled waters to record microplastic contamination


French investigators have uncovered microplastic contamination in two of Nestlé’s top mineral water brands, sparking a renewed legal battle and fresh calls for tougher environmental regulation.


Archived version: archive.is/newest/rfi.fr/en/en…





Ukrainian drones reportedly strike oil plant almost 2,000 km inside Russia


Locals have reported internet disruptions around the refinery.


Archived version: archive.is/20250811010318/tvpw…





Just got a new GPU, why is it so hard to use it?


Hey all, just got a GeForce 5070 to replace my 2070 from years ago. Ubuntu's been pretty smooth sailing for me until now, and I'm not exactly the best at navigating this stuff.

When Ubuntu starts to boot, the GPU stops outputting display to my monitor. As though it doesn't detect the new GPU. I tried putting the 2070 back in and downloading the 570 drivers but it didn't change anything. I found a tutorial for what seemed to be my issue that asked me to change the kernel, but halfway through the tutorial, commands that worked on their machine started failing on mine. I wish I'd documented what the error messages were because when I went to poke around more today, I got a message about kernel panic and can't even boot with the 2070. Where do I go from here?

in reply to lilpatchy2eyes

Nvidia


fool me once shame on you, fool me twice shame on me

in reply to lilpatchy2eyes

I had problems with my 3080 and Ubuntu having a flickering screen. Eventually I switched from a DisplayPort cable to HDMI and that fixed the issue.


UK and Canada say peace must not be imposed on Ukraine


British Prime Minister Keir Starmer and Canadian Prime Minister Mark Carney agreed that peace in Ukraine must be built with Kyiv and not imposed upon it, a Downing Street spokesperson said on Monday.


Archived version: archive.is/20250811195857/reut…





Russian forces pierce Ukrainian defense in Donetsk Oblast, bypassing fortifications, monitoring group says


Russian forces advanced toward the Dobropillia–Kramatorsk highway in Donetsk Oblast, seizing positions in nearby settlements to support further offensive operations.


Archived version: archive.is/newest/kyivindepend…





Connex Credit Union data breach impacts 172,000 members


Connex, one of Connecticut's largest credit unions, warned tens of thousands of members that unknown attackers had stolen their personal and financial information after breaching its systems in early June.

https://www.bleepingcomputer.com/news/security/connex-credit-union-discloses-data-breach-impacting-172-000-people/



[Announcement] Merciless Gauntlet Statistics


The Merciless Gauntlet has concluded, congratulations to Ben_ for taking the top spot! It was a blast watching the action, so we've prepared some statistics from the event in today's news post. Check them out!

Base Class Campaign Completion

Class | Characters | Entered Act 2 | Entered Act 3 | Entered Act 4 | Entered Act 5 | Entered Act 6 | Entered Act 7 | Entered Act 8 | Entered Act 9 | Entered Act 10 | Entered Epilogue | % Campaign Completion
Marauder | 197,221 | 31,715 | 25,300 | 21,905 | 14,612 | 12,424 | 9,837 | 9,159 | 8,506 | 6,862 | 6,016 | 3.05%
Ranger | 116,297 | 8,888 | 5,338 | 4,567 | 2,702 | 2,309 | 1,830 | 1,627 | 1,394 | 976 | 684 | 0.59%
Shadow | 81,443 | 12,842 | 8,866 | 7,729 | 4,684 | 4,115 | 3,196 | 2,855 | 2,533 | 1,947 | 1,501 | 1.84%
Witch | 44,893 | 6,126 | 3,759 | 3,124 | 1,837 | 1,570 | 1,132 | 1,025 | 922 | 668 | 547 | 1.22%
Templar | 41,766 | 4,723 | 3,352 | 2,857 | 1,941 | 1,718 | 1,261 | 1,141 | 1,020 | 740 | 614 | 1.47%
Duelist | 40,715 | 9,379 | 6,269 | 5,473 | 3,423 | 3,025 | 2,533 | 2,269 | 1,963 | 1,509 | 1,225 | 3.01%
Scion | 28,230 | 5,066 | 3,149 | 2,697 | 1,417 | 1,219 | 946 | 829 | 759 | 577 | 497 | 1.76%

Campaign Completion Rate by Ascendancy

Ascendancy Class | Completed Act 10
Ancestral Commander | 5,974
Servant of Arakaali | 1,381
Paladin | 1,131
Scavenger | 496
Polytheist | 354
Whisperer | 279
Herald | 256
Daughter of Oshabi | 222
Wildspeaker | 206
Bog Shaman | 191
Puppeteer | 177
Surfcaster | 114
Harbinger | 106
Architect of Chaos | 97
Aristocrat | 75
Antiquarian | 39
Blind Prophet | 27
Gambler | 22
Behemoth | 19
Templar | 3
Duelist | 1
Scion | 1
Witch | 1
Marauder | 0
Ranger | 0
Shadow | 0

Endgame Deaths by Area Level

Area Level | Total Deaths
68 | 358
69 | 298
70 | 683
71 | 359
72 | 425
73 | 477
74 | 510
75 | 537
76 | 376
77 | 437
78 | 287
79 | 593
80 | 203
81 | 281
82 | 243
83 | 905
84 | 159
85 | 35

Total Deaths by Character Level

Character Level | Total Deaths
1-4 | 286,698
5-9 | 33,760
10-14 | 19,569
15-19 | 13,131
20-24 | 6,001
25-29 | 11,019
30-34 | 3,023
35-40 | 4,737
40-45 | 10,860
45-49 | 5,408
50-54 | 4,025
55-59 | 3,489
60-64 | 1,767
65-69 | 2,407
70-74 | 3,157
75-89 | 1,737
80-84 | 1,619
85-89 | 1,810
90-94 | 1,593
95-99 | 564
100 | 31

Pinnacle Boss Encounters

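Pinnacle Boss | Instances | Total Kills | Total Deaths | Kills With Deaths | Gave Up | Percent Deaths | Accounts | Characters
Uber Shaper | 35 | 5 | 11 | 0 | 19 | 0.314 | 24 | 25
Uber Searing Exarch | 18 | 8 | 5 | 0 | 5 | 0.278 | 8 | 9
Uber Eater of Worlds | 22 | 4 | 6 | 0 | 12 | 0.273 | 10 | 11
Uber Maven | 8 | 2 | 2 | 0 | 4 | 0.25 | 3 | 3
Quest Searing Exarch | 773 | 413 | 191 | 0 | 169 | 0.247 | 634 | 659
Incarnation of Fear | 82 | 69 | 13 | 0 | 0 | 0.159 | 48 | 56
Incarnation of Dread | 58 | 44 | 9 | 0 | 5 | 0.155 | 44 | 48
Quest Incarnation of Dread | 105 | 78 | 16 | 0 | 11 | 0.152 | 92 | 97
Uber Uber Elder | 20 | 4 | 3 | 0 | 13 | 0.15 | 10 | 11
Quest Incarnation of Neglect | 360 | 155 | 53 | 0 | 152 | 0.147 | 299 | 309
Maven | 294 | 164 | 34 | 0 | 96 | 0.116 | 132 | 143
Uber Elder | 124 | 107 | 13 | 0 | 4 | 0.105 | 90 | 101
Quest Infinite Hunger | 2,068 | 1,367 | 206 | 0 | 495 | 0.1 | 1,721 | 1,750
Quest Black Star | 1,846 | 1,462 | 183 | 0 | 201 | 0.099 | 1,720 | 1,741
Shaper | 514 | 376 | 48 | 0 | 90 | 0.093 | 215 | 236
Searing Exarch | 174 | 127 | 13 | 0 | 34 | 0.075 | 114 | 119
Quest Eater of Worlds | 774 | 677 | 48 | 1 | 49 | 0.062 | 731 | 735
Quest Incarnation of Fear | 126 | 107 | 6 | 0 | 13 | 0.048 | 112 | 113
Incarnation of Neglect | 47 | 44 | 1 | 0 | 2 | 0.021 | 40 | 43
Eater of Worlds | 252 | 239 | 5 | 0 | 8 | 0.02 | 152 | 160
Elder | 576 | 567 | 8 | 1 | 1 | 0.014 | 248 | 271
Black Star | 52 | 50 | 0 | 0 | 2 | 0 | 18 | 19
Infinite Hunger | 58 | 41 | 0 | 0 | 17 | 0 | 33 | 37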

Top Ten Deadly Monsters

Monster | Area | Total Deaths
Hillock | The Twilight Strand | 158,017
Hillock's Maven | The Twilight Strand | 28,203
Hillock's Zombies | The Twilight Strand | 21,203
Fire Fury | The Coast | 15,731
Sand Spitter | The Twilight Strand | 12,732
Hailrake | The Tidal Island | 9,979
Vaal Oversoul | The Ancient Pyramid | 7,487
Trarthan Mercenary (Ranger) | The Mud Flats | 6,086
Brutus | The Upper Prison | 5,283
Water Elemental | The Submerged Passage | 4,692


[Announcement] Play the Path of Exile 2 Boss Rush Event at Gamescom


Next week, Path of Exile 2's Boss Rush Event will be available at Gamescom! If you're attending the expo, be sure to visit our booth and challenge yourself against 15 bosses. Check out this announcement to find out more about the event.

Gamescom

  • When - 20th - 24th of August
  • Where - Hall 6.1 C071 / A070

Good luck, Exiles!



Is self-hosted OpenVPN on a vps safe enough for p2p?


As the title says, I have my own instance of OpenVPN running in a vps (default settings). Is that "safe" enough for p2p? Any settings I should change? Anything I should watch out for? I guess it would show that the IP address of my vps will be going to these p2p sites and connecting to the IP address of whoever I'm transferring from, but how hard is it for the vps traffic to be traced back to me?
in reply to bender223

no. it all depends on the vps provider. linode for example has sent me emails about detecting torrent traffic, and threatens to end my service. if a government asked them for logs, i assume they would send them right over.
in reply to bender223

Your VPS provider will likely just forward copyright infringement letters to you, same as your ISP would, or they'll suspend your account.

It will hide your ISP IP from torrent peers, but the VPS provider still knows exactly who you are.

but how hard is it for the vps traffic to be traced back to me?


Very easy by the VPS provider, as the VPS has a static IP assigned to you.












in reply to silence7

This reads like "old man yells at clouds"

Like on one hand, sure, tiktok has accelerated the amount of info/media kids are latching onto. But also, remember street sharks? Remember pogs? Or floam? Like.. shits always been like this.



Reddit will block the Internet Archive


in reply to General_Effort

Not that reddit isn't hot garbage right now, and has been for a while actually, but there's a lot of people here who have glazed over the reason why reddit instituted this policy.

AI companies are scraping the Wayback Machine. This is something that should concern all of us.

in reply to Midnight1938

And what do I care about Reddit getting paid?

If the IA doesn't complain about being used, then it's fine for me. The ideal outcome would be, if the archive can make some arrangement where they scrape the data and provide it to everyone. That way, sites only get scraped once and not constantly hammered.



Lenovo Webcams Can Be Turned into Persistent Attack Platforms





GPT-5: has AI just plateaued?





Why TikTok ADHD misinformation is dangerous







Vuoto a perdere 06: Mine or No One Else’s!


Mark Wilson, the Vuoto a Perdere, discovers a secret about Freddie. Now he can blackmail him.

REAL WORLD: Freddie Mercury never had children, but Mary Austin, the woman he loved, did.

The Queen frontman, however, was a cat lover, and so he wrote a song for his favorite one: Delilah.

FICTION: Mark Wilson discovers a very personal secret about Freddie by intercepting letters addressed to him, and the decision is made: Freddie will never learn of that correspondence; it is too powerful a weapon in Mark’s hands!

Story updated after its initial publication (2021)


Vuoto a perdere: Mine or No One Else’s!


For days my late partner Andrew kept repeating it to me in the dream: “Bleed my gifter dry, bring his soul to me!”

From the evening Freddie cut his finger during a dinner, the idea of obtaining his blood became an obsession, and the more time passed, the more my determination turned into rage.

“If Freddie won’t be mine, no one else will have him!” I would say to myself every time I got out of bed and looked at myself in the mirror.

But I never managed to really get close to him: I would see him going around with his staff, smiling, talking with his assistant Melania, but it was impossible to get a chance to be alone with him.

There was nothing left but to devise a plan to set a trap that could bleed him. Unfortunately, researching with the few information sources available, I realized how weak the HIV virus was: transmission does not happen without direct contact with the infected person’s blood!

The Bleeder’s Plan


My only hope was to be there while it happened and quickly collect the precious fluid. Whenever I cleaned his room, I tried to devise a way to sabotage a chandelier, a window, any other glass object that could shatter over him while he slept; I would rush to help him, I would happen to cut myself, and then…

No, in the end I decided it was not worth it; too risky for my reputation. They would have accused me of murder, and I would have ended up like Mark David Chapman, the man who shot John Lennon in 1980. Poor idiot: when John was taken away he stayed there reading The Catcher in the Rye, waiting for events to unfold, but he had no bond with his idol.

I, on the other hand, by doing things properly, could be united with Freddie forever and, above all, spread the virus, allowing my idol to live for eternity. And too bad if to the world I would be just another criminal! I felt covered in my false identity as William Karson.

Good News?


Lost in my thoughts, I did not immediately notice someone knocking at the door of my room. “Karson, we need you!” said Jim, Freddie’s partner, when I opened it.

He was clutching an envelope. “This was sent a couple of days ago, addressed to Freddie, and as soon as I read where it comes from I thought of you right away.”

So it was true, they had read my résumé before hiring me! I took the letter between my fingers and let out a melancholy sigh when I read where the sender was from: Bugliano, cradle and grave of my dreams of becoming a scientist.

I immediately remembered my past as a medical student, when the hope of defeating AIDS single-handedly had led me even to make peace with the one who had always bullied me: Raymond Still, who, when I was failed, wiped away our unexpected friendship and humiliated me twice as much.

Promotions and Failures


“Mark Wilson, you are only fit to clean toilets,” the teacher told me mercilessly when I tried to contest the final result; and Ray Still did not lift a finger to defend me. On the contrary, he turned his back on me to enjoy his promotion with full honors!

Only a few months after accepting the job at Freddie Mercury’s house, I discovered that while I was doing the cleaning, Raymond Still had built a career and launched an important research project on HIV, complete with an announcement: “I am looking for a celebrity who has tested positive for the virus to trial my new study. HIV, suitably modified, can transmit emotions and talent from one person to another!”

What better occasion to let him know about Freddie?

Still holding the envelope Jim had given me, I thought back to the letter I had just sent to Raymond in reply to his announcement; I had told him in full detail about Freddie’s positive status, but even days after sending it I had received no response.

What if the envelope in my hand contained Ray’s reply, and he wanted to speak directly with Freddie from Bugliano?

I had no scruples and put it in my pocket: “Never fear, Jim! I’ll take it to the boss; the message is in good hands.”

Beginning of a Deception


Quickly, I closed the door of my room and, taking out the letter, began to read that handwritten sheet:

I am pregnant, love of my life. We are expecting a baby girl, our miracle but also my greatest fear.

The world is a cruel place, and I feel I cannot protect her enough, because I am in Russia, married to a violent man.


“What? Freddie has a daughter in Russia, or in Bugliano? Where? It hardly matters,” I thought. If it is true, he will never know! I resealed the envelope as if nothing had happened and hid it in my bag. I would be the only one to know the secret; I would travel the world to find that supposed child, after which Freddie would be mine, and mine alone!





BRICS Strengthens Itself Ahead of Putin-Trump Summit