#Meta says that its #Llama #AI tools are "the best #OpenSource models of their class, period."

From @kylelwiggers: "There’s only one problem: the Llama…models aren’t really “open source”… Open source implies that devs can use the models how they choose…But…Meta has imposed certain licensing restrictions…Llama models can’t be used to train other models. And app developers with over 700M monthly users must request a special license from Meta."
techcrunch.com/2024/04/20/this…

#LLMs #Licensing

in reply to petersuber

Are they releasing the data sets? Because it's all bullshit if they are "open sourcing" only the code and a trained network weight set.

See lawfaremedia.org/article/why-t…

in reply to petersuber

Update. Another #Zuckerberg defense of the #Meta approach to #OpenSource #AI, this time co-authored by Daniel #Ek, the CEO of #Spotify. Zuck and Ek use the case for OS as an argument against #EU regulations of AI.
archive.is/IybwU

in reply to petersuber

Update. The Open Source Initiative (@osi, #OSI) is trying to define what counts as #OpenSource #AI.
simonwillison.net/2024/Aug/27/…

"There is one very notable absence from the definition: while it requires the code and weights be released under an OSI-approved #license, the #TrainingData itself is exempt from that requirement."

in reply to petersuber

@osi

That's interesting, in particular as the actual text says:

"Preferred form to make modifications to machine-learning systems: ... Data information: Sufficiently detailed information about the data used to train the system ... Data information shall be made available with licenses that comply with the Open Source Definition. ... if used, this would include the training methodologies and techniques, the training data sets used, ..."

This at least recommends sharing the data?

in reply to petersuber

Update. More on #OpenWashing and the complexity of defining #OpenSource #AI.
hackernoon.com/is-that-llm-act…

"The Open Source AI Definition (#OSAID) is still open for public review and feedback. If you’d like to participate in shaping the future of Open Source AI, you can submit comments."

in reply to petersuber

Update. More controversies around the still-evolving Open Source Initiative (@osi, #OSI) definition of #OpenSource #AI.
theregister.com/2024/09/14/opi…

in reply to petersuber

Update. New study: "Our extensive experiments reveal that #OpenAccess high-performance #LLMs can be adeptly reverse-aligned to output harmful content, even in the absence of manually curated malicious datasets. Our research acts as a whistleblower for the community, emphasizing the need to pay more attention to safety of open-accessing LLMs."
aclanthology.org/2024.findings…
in reply to petersuber

Update. If #OpenSource #LLMs are vulnerable to hacks that generate harmful content (this thread, prev post), they also "bring several advantages to cybersecurity systems" that can reduce those risks.
venturebeat.com/security/how-o…

in reply to petersuber

Update. HELIOS Open (@heliosopen) comments on v. 0.0.9 of the Open Source Initiative (#osi, @osi) definition of #OpenSource #AI.

"If the definition doesn’t start by emphasizing the openness of training data out of the gate, [we] worry it will not get added in later."

#TrainingData

in reply to petersuber

@osi Open Source has a lot of potential for good, but we have to be real careful with how we think about weights. We need to bring in some ideas from bio like gain-of-function research and dual-use tech, because just thinking in terms of software licensing leaves out the ability to make important distinctions.

For example, in bio research, #openscience means publishing #openaccess and providing #opendata, but it doesn't mean sending virus samples with dangerous mutations to anyone.

in reply to petersuber

Update. "#Meta has been criticised for calling its #AI models #OpenSource by the group that has spearheaded open-source technology in the software world for the past 25 years. The social media company is 'confusing' users and 'polluting' the term open-source by using it to describe its #Llama family of #LLMs, said Stefano Maffulli, head of the Open Source Initiative [#OSI, @osi]."
archive.is/N5CFG
in reply to petersuber

Update. "The source code for #Winamp [IP owned by #Llama] has been taken offline…This comes as no surprise, as there have been signs. You see, when the source code first appeared on GitHub, there were numerous issues with it. Take, for instance, the fact that forking was not allowed, distribution of modified versions was not allowed, and only official maintainers were allowed to distribute the source code for Winamp."
news.itsfoss.com/winamp-disast…
in reply to petersuber

Update. "While the Open Source Initiative (OSI, @osi) is diligently working on defining the term “#OpenSource #AI,” our work [at the @linuxfoundation] focuses on a narrower scope, extending from the Model Openness Framework we’ve developed in LF AI & Data. These definitions represent a natural evolution of our ongoing efforts and are aligned with the broader goals of openness, transparency, and collaboration that underpin the open source community."
lfaidata.foundation/blog/2024/…
in reply to petersuber

Update. "The Open Source Initiative (OSI, @osi)…today released version 1.0 of its #OpenSource #AI Definition (OSAID)." Good coverage of the controversies and dissents.
techcrunch.com/2024/10/28/we-f…

The definition itself
opensource.org/ai/open-source-…

in reply to petersuber

Update. HELIOS Open (@heliosopen) asked its advisory committee to comment on the new Open Source Initiative (#osi, @osi) definition of #OpenSource #AI.
heliosopen.org/news/defining-o…
in reply to petersuber

Update. "If you believe Mark Zuckerberg, #Meta's #AI large language model (#LLM) Llama 3 is #OpenSource. It's not. The Open Source Initiative (#OSI, @osi) spells it out in the Open Source Definition, and Llama 3's license – with clauses on litigation and branding – flunks it on several grounds. Meta, unfortunately, is far from unique in wanting to claim that some of its software and models are open source. Indeed, the concept has its own name: #OpenWashing."
theregister.com/2024/10/25/opi…
in reply to petersuber

Update. "Maximally ‘open’ #AI allows some forms of oversight and experimentation on top of existing models. However, we find that openness alone does not perturb the concentration of power in AI. Just as many traditional #opensource software projects were co-opted in various ways by large technology companies, we show how rhetoric around ‘open’ AI is frequently wielded in ways that exacerbate rather than reduce concentration of power in the AI sector."
nature.com/articles/s41586-024…

#OpenWashing

in reply to petersuber

Update. @sj argues that the #MozillaFoundation contradicted its own earlier positions when it endorsed the Open Source Initiative (#osi, @osi) definition of #OpenSource #AI.
samjohnston.org/2024/12/18/a-f…
in reply to petersuber

Update. "The #OpenScholar team has released not only the code for the language model but also the entire retrieval pipeline, a specialized 8-billion-parameter model fine-tuned for scientific tasks, and a datastore of [#OpenAccess] scientific papers. 'To our knowledge, this is the first open release of a complete pipeline for a scientific assistant LM —from data to training recipes to model checkpoints,' the researchers wrote in their blog post announcing the system."
venturebeat.com/ai/openscholar…

#LLM #OpenSource

in reply to petersuber

Update. "While #OpenAI has open sourced models in the past, the company has generally favored a proprietary, closed source development approach. “[I personally think we need to] figure out a different open source strategy,” #SamAltman said…In a follow-up reply, Kevin Weil, OpenAI’s chief product officer, said that OpenAI is considering open sourcing older models that aren’t state-of-the-art anymore."
techcrunch.com/2025/01/31/sam-…

#AI #LLMs #OpenSource

in reply to petersuber

Update. "Researchers at #HuggingFace are trying to replicate [#DeepSeek] from scratch in what they’re calling a pursuit of “open knowledge”…[They seek] to build a duplicate of R1 and #OpenSource all of its components, including the data used to train it…Technically, R1 is “open” in that the model is permissively licensed…However, R1 isn’t “open source” by the widely accepted definition because some of the tools used to build it are shrouded in mystery."
techcrunch.com/2025/01/28/hugg…

#AI #LLMs

in reply to petersuber

Update. "On Tuesday, #HuggingFace researchers released an #OpenSource #AI research agent called "Open Deep Research," created by an in-house team as a challenge 24 hours after the launch of OpenAI's Deep Research feature."
arstechnica.com/ai/2025/02/aft…

#LLMs

in reply to petersuber

Update. From Julien Sobrier: "We need a common understanding of what an open model means [for #AI and #LLMs]. We want to watch out for any #OpenWashing, as we saw it with free vs #OpenSource software."
artificialintelligence-news.co…
in reply to petersuber

"Bruce Perens, who wrote the original #OpenSource Definition and parted ways with #OSI [@osi] in 2020, denounced the idea of the OSAID [Open Source #AI Definition] last year. He believes AI is incompatible with the open software movement because 'its output is inherently plagiarism…The Open Source AI Definition requires less of AI than the original Open Source Definition requires of any other form of software,' said Perens…'My contention is that it isn't Open Source and is Openwashing.'"

in reply to petersuber

Useful table showing in what respects major #AI / #LLM tools are open and in what respects they are not.
osai-index.eu/the-index?type=t…

From the European #OpenSource AI Index.
osai-index.eu/

in reply to petersuber

From my perspective, a machine learning model can be considered open source if anyone can "compile" it from scratch using source code and source data. I don't think I can train my Llama model from scratch.