
Definition of Transformative Use: The legal concept of transformative use involves significantly altering the original work to create new expressions, meanings, or messages. AI models like GPT don't merely reproduce text; they analyze, interpret, and recombine information to generate unique responses. This process can be argued as creating new meaning or purpose, different from the original works.

In the case of the famous screenshot, the AI merely relayed information it found on the web; that content wasn't included in its training data.

So you're just wrong.



Nope, it doesn't work that way. The fact that the LLM can regurgitate original articles doesn't remove the possibility that training can be considered transformative work, or, more generally, that using copyrighted material for training can be considered fair use.

Rather, verbatim reproduction is the proof that copyrighted material was used. Then the court has to evaluate whether it was fair use. Without verbatim reproduction, the court might just say that there is not enough proof that the Times's work was important to the training, and dismiss the lawsuit right away.

Instead, the jury or court now will almost certainly have to evaluate OpenAI's operation against the four factors.

In fact, I agree with the parent that ingesting text and creating a representation that can critique historical facts using material that came from the Times is transformative. An LLM is not just a set of compressed texts; people have shown, for example, that some neurons fire when the text concerns specific historical periods or locations on Earth.
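The claim about neurons tracking concepts is usually demonstrated with "probes" fit to a model's internal activations. Here's a toy sketch of that idea on synthetic activations — no real model is involved, and the neuron index and signal strength are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden activations: 200 samples, 64 "neurons".
# Pretend neuron 7 carries a concept (e.g. "this text is about Rome"):
# it fires higher when the concept is present.
n, d = 200, 64
labels = rng.integers(0, 2, size=n)        # 1 = concept present
acts = rng.normal(0.0, 1.0, size=(n, d))
acts[:, 7] += 3.0 * labels                 # inject the signal

# A linear "probe": fit weights by least squares to predict the label
# from the activations, then check which neuron the probe relies on.
w, *_ = np.linalg.lstsq(acts, labels - 0.5, rcond=None)
top_neuron = int(np.argmax(np.abs(w)))     # the neuron with most weight

preds = (acts @ w > 0).astype(int)
accuracy = float((preds == labels).mean())

print(top_neuron)   # the probe should single out neuron 7
print(accuracy)
```

If the probe recovers the planted neuron and predicts the concept well above chance, that's the (toy) version of the evidence the comment alludes to: the representation encodes the concept rather than just storing text.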

However, I don't think that the transformative character is enough to override the other factors, and therefore in the end it won't/shouldn't be considered fair use IMHO.


What if the LLM is running locally and doing all of these things rather than hosted on a webserver which is serving the content?


It doesn't matter; if everything else stays the same, what matters is what it's used for. If it's used to make money, that would certainly hurt claims of fair use—maybe not for those who do the training, but for those who use it.


> If it's used to make money, it would certainly hurt claims of fair use

What if a human manually searches all those articles and transcribes / summarizes them to me in the way ChatGPT did?


It might also be considered copyright violation, after evaluating the four fair use factors.


Only humans can do those things, so the test fails for an LLM.



