The long-standing tussle between proprietary and open-source software has taken a fresh turn with the advent of artificial intelligence (AI).
AI has sparked a new debate as there’s no clear consensus on what ‘open source’ means in its context.
The New York Times recently published an article praising Meta CEO Mark Zuckerberg for embracing ‘open source AI.’ This move has reportedly bolstered his standing in Silicon Valley.
However, critics argue that Meta’s Llama-branded large language models don’t truly embody the open source spirit, and that disagreement is the crux of the controversy.
- The OSI is tackling the complex task of defining ‘open source’ in AI.
- Meta’s ‘open source AI’ Llama models face criticism for not truly embodying open source principles.
- The OSI is set to unveil the ‘stable version’ of the Open Source AI Definition in late October.
OSI Defines ‘Open Source AI’: Debunks Meta’s Claim, Sets Guidelines
Open Source Initiative Spearheads the Task
The Open Source Initiative (OSI), helmed by Executive Director Stefano Maffulli, is tackling this challenge. It is working toward a definition through conferences, workshops, reports, and webinars.
For over a quarter-century, the OSI has overseen the Open Source Definition (OSD). The OSD lays out guidelines on applying the term ‘open source’ to software.
However, extending these traditional naming and licensing provisions from software to AI has proven to be a complex task due to the inherent differences between the two.
The Complex Task of Defining Open-Source AI
Joseph Jacks, an open source proponent and the founder of VC firm OSS Capital, contends that ‘open source AI’ is a misnomer. He asserts that ‘open source’ was originally coined for software source code. He further explains that ‘neural network weights’ (NNWs), the parameters a model learns during training, cannot be equated to software in any substantial way.
To address this, Jacks and his colleague Heather Meeker proposed a new concept, ‘open weights,’ which Maffulli agrees with.
Established in 1998, the OSI undertakes many open source-related activities and relies on sponsorships from corporations like Amazon, Google, Microsoft, Cisco, Intel, Salesforce, and Meta.
Meta’s involvement is particularly intriguing, given its claims of championing ‘open source AI’ despite placing specific restrictions on the use of its Llama models.
Meta’s descriptions of its LLMs have shifted over time. While the Llama 2 model was initially labeled as open source, with the release of Llama 3 the company dialed back the terminology, opting for phrases like ‘openly accessible’ and ‘openly available’.
Building a Comprehensive Open-Source AI Definition
Creating a comprehensive definition for open source AI involves grappling with the issue of data. Maffulli argues that the data source and the methodology used for labeling, de-duplicating, and filtering it are more important than the dataset itself. He also emphasizes the significance of accessing the code used to assemble the dataset.
The current draft of the Open Source AI Definition stipulates that an Open Source AI system should permit use for any purpose without prior permission, allow others to study its operation, and enable modifications and sharing.
The OSI plans to release the definition officially, dubbed the ‘stable version,’ at the All Things Open conference in late October.
Through a global roadshow spanning five continents, the organization hopes to gather diverse input on the definition of ‘open source AI.’ Nonetheless, any final changes are expected to be minor tweaks.
“This is the final stretch,” Maffulli asserts. “We have reached a feature-complete version of the definition; we have all the necessary elements. Now we have a checklist, so we’re checking that there are no surprises; no systems should be included or excluded.”