Open-source AI isn’t always ‘open’ and free

If free AI tools become as technically capable as those provided by tech monopolies, that means someone is challenging Big Tech’s dominance of what could become the world’s most transformative technology.
Bloomberg Opinion

Representative image of artificial intelligence.

Credit: iStock Photo

By Parmy Olson


The most successful marketing phrase of all time may well be “artificial intelligence,” since, no, computers still can’t think. What should take the number two spot? How about “open-source artificial intelligence”?

A popular prediction about the technology’s trajectory in 2024 is that open-source AI models will catch up with proprietary ones such as ChatGPT and Google’s Bard. That sounds promising at first. If free AI tools become as technically capable as those provided by tech monopolies, that means someone is challenging Big Tech’s dominance of what could become the world’s most transformative technology.

Except there’s a caveat: Some of the most promising open-source AI models are not truly open or free from the control of large tech companies.

Open source refers to software that’s freely available for any member of the public to view, modify and distribute as they see fit. Outside of AI, these tools, such as the blog-hosting platform WordPress or the image-editing software GIMP, can seem a little unpolished compared to what you might buy from the likes of Google or Apple Inc., but they have democratized access to new digital services. WordPress, for instance, has allowed millions of small businesses to establish an online presence cheaply.

It now looks like AI is heading in a similar direction, with a number of open-source projects from companies like Mistral and Hugging Face offering free alternatives to the models created by established AI firms. But some of the biggest of these projects are backed by tech giants that have added restrictions running counter to open-source standards — making them not so free after all and their descriptions somewhat misleading.

Meta Platforms Inc., for instance, released an open-source language model called Llama 2 in 2023, but its license bans developers from using it to train other language models. Open-source licenses do set certain conditions on how code can be used, such as requiring that software remain open source when it is redistributed. But restricting what the code can be used for, like training other AI models, is far less permissive than the norm.

If you’re a startup hoping to build the next Facebook or Google, you also have to jump through extra hoops to use Meta’s system, acquiring a separate license from the company if you amass more than 700 million monthly active users. The model isn’t as transparent as an open-source project should be either, particularly when it comes to the data Meta used to train Llama 2, according to a 2023 study by researchers from Carnegie Mellon University, the AI Now Institute and the Signal Foundation. They concluded that Big Tech was using the term “open source” as a branding effort to look better in front of regulators and the public.

Llama doesn’t fit the commonly accepted definition of “open source” put forward by the Open Source Initiative (OSI), the nonprofit organization that sets the criteria for what counts as open-source software. The OSI has even said that Meta’s use of the term is wrong and has asked the company to “correct its misstatement.”

A spokesperson for Meta said the company was aiming to “help companies and developers that may be resource-constrained still have access to large language models like Llama 2, so it is free for the vast majority of users.” They did not address questions about licensing limits.

The term “open source” has been used so liberally that there seems to be broad confusion about the label. When Apple recently released an AI model called Ferret, press reports described it as “open source,” but its license contained several clauses that showed it was not, and the model itself is meant for research use only. Apple declined to comment.

If large tech firms steer “open source” AI projects toward their commercial interests, that will make it harder for smaller companies to compete, and lock many into technology controlled by the biggest vendors — a problem that already exists in cloud computing. These extra limitations aren’t normal in open source, and could ultimately help the largest technology firms maintain their dominance.

Open-source AI will make plenty of progress in 2024, but it may end up benefiting Big Tech more than we expect, too.

(Published 05 January 2024, 11:41 IST)