Open-Source AI Isn’t Always 'Open' And Free

Free versions of AI models are challenging Big Tech, but they could also benefit the giants more than one would expect.  

ISTANBUL, TURKEY - MAY 06: A woman views historical documents and photographs displayed in a high-tech art installation at SALT Galata on May 6, 2017 in Istanbul, Turkey. The "Archive Dreaming" installation by artist Refik Anadol uses artificial intelligence to visualize nearly 2 million historical Ottoman documents and photographs from the SALT Research Archive. The installation is controlled by a single tablet in the center of a mirrored room; the artist used machine learning algorithms to combine historical documents, art, graphics and photographs into an immersive installation that lets people scroll, read and explore the archives. The SALT Galata archives include around 1.7 million documents ranging from the late-Ottoman era to the present day. The exhibition is on show at the SALT Galata art space through June 11, 2017. (Photo by Chris McGrath/Getty Images)
The most successful marketing phrase of all time may well be “artificial intelligence,” since, no, computers still can’t think. What should take the number two spot? How about “open-source artificial intelligence”?

A popular prediction about the technology’s trajectory in 2024 is that open-source AI models will catch up with proprietary ones such as ChatGPT and Google’s Bard. That sounds promising at first. If free AI tools become as technically capable as those provided by tech monopolies, that means someone is challenging Big Tech’s dominance of what could become the world’s most transformative technology. 

Except there’s a caveat: Some of the most promising open-source AI models are not truly open or free from the control of large tech companies.

Open source refers to software that’s freely available for any member of the public to view, modify and distribute as they see fit. Outside of AI, these tools, such as the blog-hosting platform WordPress or the image-editing software GIMP, can seem a little unpolished compared to what you might buy from the likes of Google or Apple Inc., but they have democratized access to new, digital services. WordPress, for instance, allowed millions of small businesses to establish an online presence cheaply. 

It now looks like AI is heading in a similar direction, with a number of open-source projects from companies like Mistral and Hugging Face offering free alternatives to the models created by established AI firms. But some of the biggest of these projects are backed by tech giants that have added restrictions running counter to open-source standards — making them not so free after all and their descriptions somewhat misleading.

Meta Platforms Inc., for instance, released an open-source language model called Llama 2 in 2023, but its license bans developers from using it to train other language models. Open-source licenses do set certain conditions on how code is used, and those conditions vary; some require, for instance, that software remain open source when it is distributed. But restricting the use of code for other purposes, like training other AI models, is less liberal than normal.

If you’re a startup hoping to build the next Facebook or Google, you also have to jump through extra hoops to use Meta’s system, acquiring a separate license from the company if you amass more than 700 million monthly active users. The model isn’t as transparent as an open-source project should be either, particularly when it comes to the data Meta used to train Llama 2, according to a 2023 study by researchers from Carnegie Mellon University, the AI Now Institute and the Signal Foundation. They concluded that Big Tech was using the term “open source” as a branding effort to look better in front of regulators and the public.

Llama 2 doesn’t fit the commonly accepted definition of “open source” as put forward by the Open Source Initiative (OSI), a nonprofit organization that sets the criteria for software that is considered open source. The OSI has even said that Meta’s use of the term is wrong, and that it has asked the company to “correct its misstatement.”

A spokesperson for Meta said the company was aiming to “help companies and developers that may be resource-constrained still have access to large language models like Llama 2, so it is free for the vast majority of users.” They did not address questions about licensing limits. 

The term “open source” has been used so liberally that there seems to be broad confusion about the label. When Apple recently released an AI model called Ferret, press reports described it as “open source,” but its license contained several clauses that showed it was not, and the model itself is meant for research use only.(1) Apple declined to comment.   

If large tech firms steer “open source” AI projects toward their commercial interests, that will make it harder for smaller companies to compete, and lock many into technology controlled by the biggest vendors — a problem that already exists in cloud computing. These extra limitations aren’t normal in open source, and could ultimately help the largest technology firms maintain their dominance. 

Open-source AI will make plenty of progress in 2024, but it may benefit Big Tech more than we might expect, too.

(1) In Ferret’s license, Apple restricts the use of its name and logo when promoting products that were built using the software, and it offers no guarantee of patent rights, which poses some legal risk for developers who decide to use the model.

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.

Parmy Olson is a Bloomberg Opinion columnist covering technology. A former reporter for the Wall Street Journal and Forbes, she is author of “We Are Anonymous.”

©2024 Bloomberg L.P.