More than 130,000 Claude, Grok, ChatGPT, and Other LLM Chats Readable on Archive.org



A researcher has found that more than 130,000 conversations with AI chatbots including Claude, Grok, ChatGPT, and others are discoverable on the Internet Archive, highlighting how people’s interactions with LLMs may be publicly archived if users are not careful with the sharing settings they may enable.

The news follows earlier findings that Google was indexing ChatGPT conversations that users had set to share, even though those users may not have understood that the chats were now viewable by anyone, not just the people they intended to share them with. OpenAI had also not taken steps to ensure these conversations could not be indexed by Google.

“I obtained URLs for: Grok, Mistral, Qwen, Claude, and Copilot,” the researcher, who goes by the handle dead1nfluence, told 404 Media. They also found material related to ChatGPT, but said “OpenAI has had the ChatGPT[.]com/share links removed it seems.” Searching on the Internet Archive now for ChatGPT share links does not return any results, while Grok results, for example, are still available.

Dead1nfluence wrote a blog post about some of their findings on Sunday and shared the list of more than 130,000 archived LLM chat links with 404 Media. They also shared some of the contents of those chats that they had scraped. Dead1nfluence wrote that they found API keys and other exposed information that could be useful to a hacker.
“While these providers do tell their users that the shared links are public to anyone, I think that most who have used this feature would not have expected that these links could be findable by anyone, and certainly not indexed and readily available for others to view,” dead1nfluence wrote in their blog post. “This could prove to be a very valuable data source for attackers and red teamers alike. With this, I can now search the dataset at any time for target companies to see if employees may have disclosed sensitive information by accident.”

404 Media verified some of dead1nfluence’s findings by discovering specific material they flagged in the dataset, then going to the still-public LLM link and checking the content.

💡
Do you know anything else about this? I would love to hear from you. Using a non-work device, you can message me securely on Signal at joseph.404 or send me an email at joseph@404media.co.

Most of the companies whose AI tools are included in the dataset did not respond to a request for comment. Microsoft, which owns Copilot, acknowledged a request for comment but didn't provide a response in time for publication. A spokesperson for Anthropic, which owns Claude, told 404 Media: “We give people control over sharing their Claude conversations publicly, and in keeping with our privacy principles, we do not share chat directories or sitemaps with search engines like Google. These shareable links are not guessable or discoverable unless people choose to publicize them themselves. When someone shares a conversation, they are making that content publicly accessible, and like other public web content, it may be archived by third-party services. In our review of the sample archived conversations shared with us, these were either manually requested to be indexed by a person with access to the link or submitted by independent archivist organizations who discovered the URLs after they were published elsewhere across the internet first.” 404 Media only shared a small sample of the Claude links with Anthropic, not the entire list.

Fast Company first reported that Google was indexing some ChatGPT conversations on July 30. This was because of a sharing feature ChatGPT had that allowed users to send a link to a ChatGPT conversation to someone else. OpenAI disabled the sharing feature in response. OpenAI CISO Dane Stuckey said in a previous statement sent to 404 Media: “This was a short-lived experiment to help people discover useful conversations. This feature required users to opt-in, first by picking a chat to share, then by clicking a checkbox for it to be shared with search engines.”

A researcher who requested anonymity gave 404 Media access to a dataset of nearly 100,000 ChatGPT conversations indexed on Google. 404 Media found those included the alleged texts of non-disclosure agreements, discussions of confidential contracts, and people trying to use ChatGPT for relationship issues.

Others also found that the Internet Archive contained archived LLM chats.


Part of Article I Section 8, and all of Sections 9 and 10, which address things like habeas corpus, nobility, and militias, are gone from Congress's website for the Constitution.

The lawsuit alleges XVideos, Bang Bros, XNXX, Girls Gone Wild and TrafficFactory are in violation of Florida's law that requires adult platforms to verify visitors are over 18.

The decision highlights hurdles faced by developers as they navigate a world where credit card companies dictate what is and isn't appropriate.

Submit to biometric face scanning or risk your account being deleted, Spotify says, following the enactment of the UK's Online Safety Act.

We talked to people living in the building whose views are being blocked by Tesla's massive four-story screen.

The massive Tea breach; how the UK's age verification law is impacting access to information; and LeBron James' AI-related cease-and-desist.

The Sig Sauer P320 has a reputation for firing without pulling the trigger. The manufacturer says that's impossible, but the firearms community is showing the truth is more complicated.

“Without these safeguards, Mr. Barber eventually developed full-blown PTSD, which he is currently still being treated for,” the former mod's lawyer said.
