llama3.1 license restrictions
I'd like to raise a point about the model's license terms, as honest feedback to the Meta Llama team. I'm sure anyone who has already read the Llama 3.1 license knows exactly what I'm going to talk about, since it was a hot topic when Llama 3 was released. Yet this part in particular remains:
i. If you distribute or make available the Llama Materials (or any derivative works
thereof), or a product or service (including another AI model) that contains any of them, you shall (A)
provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with
Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use
the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or
otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at
the beginning of any such AI model name.
It doesn't feel too restrictive at first glance: put "Llama3.1-" in front of your model's name and include the required notice. Simple and manageable for most use cases, including commercial ones. But once you think more carefully about where you can realistically apply a Llama 3.1 model, you can't find much beyond "deploying a chatbot/embedded tool". Here are a few examples:
- Base, untuned models may be used for unrestricted and creative tasks where you want to avoid the assistant-ish bias. If you were to use a Llama 3.1 model as a completion tool to manually edit or enrich your data, you'd have to name your model "Llama3.1-" or state that your whole work was "built with Llama", no matter what percentage of it consists of Llama 3.1-generated tokens. There is no notion of "fair use" here.
- Sometimes you may want to create an intermediate LLM for a specific task, for example fixing grammar in dataset-1. You do the manual work of creating a human-written dataset-2 with broken/fixed pairs and fine-tune Llama 3.1 on it to act as such a tool. Actually using dataset-3, the grammar-fixed data produced by your Llama 3.1 derivative, means that any model trained on it legally becomes a "Llama3.1-" model.
- Cases unrelated to LLM development are also affected. If you were to generate a description for your YouTube video using Llama 3.1, everyone would see that your whole YouTube video was "built with Llama". If you were to ask Llama 3.1 to fix the CSS on your website, everyone would see that your whole website was "built with Llama". If you were to ask Llama 3.1 to shorten some text in your project's docs and then forget what exactly it altered, your whole docs become "built with Llama" until you rewrite them completely.
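The chain in the second example can be sketched in a few lines of Python. This only illustrates the reading used in this post (that anything downstream of Llama 3.1 outputs inherits the naming requirement); it is not a statement of what the license actually requires. `Artifact`, `touched_by_llama`, `required_name`, and all artifact names are hypothetical, made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """A model or dataset, plus whatever it was created from (hypothetical)."""
    name: str
    parents: tuple["Artifact", ...] = ()
    is_llama: bool = False  # True only for the original Llama 3.1 weights


def touched_by_llama(artifact: Artifact) -> bool:
    """True if the artifact, or anything in its ancestry, is a Llama model."""
    return artifact.is_llama or any(touched_by_llama(p) for p in artifact.parents)


def required_name(model: Artifact) -> str:
    """Name the model would need under this post's reading of the clause."""
    if touched_by_llama(model) and not model.name.startswith("Llama"):
        return "Llama3.1-" + model.name
    return model.name


# The chain from the second example (all names are stand-ins).
llama = Artifact("llama-3.1-base", is_llama=True)
dataset_1 = Artifact("dataset-1")                      # original data
dataset_2 = Artifact("dataset-2")                      # human-written pairs
fixer = Artifact("grammar-fixer", parents=(llama, dataset_2))
dataset_3 = Artifact("dataset-3", parents=(fixer, dataset_1))
final_model = Artifact("my-model", parents=(dataset_3,))

print(required_name(fixer))        # Llama3.1-grammar-fixer
print(required_name(final_model))  # Llama3.1-my-model
```

Note that `final_model` never sees the Llama weights directly; under this reading, the requirement reaches it only through dataset-3.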
The point is, if your resulting work contains even a single token generated by a Llama model, or was touched by the weights in any other way, however slightly, you legally must state that it was "built with Llama". Simply put, it is unfair and hurtful to the industry. LLM researchers and developers may not want these restrictions and will turn to potentially inferior but more open models, wasting the opportunity to use such a well-performing model to advance this ever-growing industry.
Interestingly, if you compare the llama3.1 and apache-2.0 licenses, there isn't that much of a difference in terms of crediting the work. People would still have to include a notice that their own LLM was built with Llama 3.1, as per the Apache 2.0 licensing terms, but in a much saner, far less intrusive way that still gives the Meta Llama team their well-deserved credit.
I'm not a legal expert. My third example may be completely wrong; it's just my personal interpretation that outputs can be considered derivatives and "Llama Materials" under these terms. So, of course, any input is welcome.