
Definition of Transformative Use: The legal concept of transformative use involves significantly altering the original work to create new expressions, meanings, or messages. AI models like GPT don't merely reproduce text; they analyze, interpret, and recombine information to generate unique responses. This process can be argued as creating new meaning or purpose, different from the original works.

In the case of the famous screenshot, the AI merely relayed information it found on the web; that content wasn't included in its training data.

So you're just wrong.



Nope, it doesn't work that way. The fact that the LLM can regurgitate original articles doesn't remove the possibility that training can be considered transformative work, or, more generally, that using copyrighted material for training can be considered fair use.

Rather, verbatim reproduction is the proof that copyrighted material was used. Then the court has to evaluate whether it was fair use. Without verbatim reproduction, the court might just say that there is not enough proof that the Times's work was important to the training, and dismiss the lawsuit right away.

Instead, the jury or court now will almost certainly have to evaluate OpenAI's operation against the four factors.

In fact, I agree with the parent that ingesting text and creating a representation that can critique historical facts using material that came from the Times is transformative. An LLM is not just a set of compressed texts; people have shown, for example, that some neurons fire when the text concerns specific historical periods or locations on Earth.
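The claim about neurons tracking concepts is usually demonstrated with "probes" fit to a model's internal activations. Here's a toy sketch of that idea on synthetic activations — no real model is involved, and the neuron index and signal strength are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden activations: 200 samples, 64 "neurons".
# Pretend neuron 7 carries a concept (e.g. "this text is about Rome"):
# it fires higher when the concept is present.
n, d = 200, 64
labels = rng.integers(0, 2, size=n)        # 1 = concept present
acts = rng.normal(0.0, 1.0, size=(n, d))
acts[:, 7] += 3.0 * labels                 # inject the signal

# A linear "probe": fit weights by least squares to predict the label
# from the activations, then check which neuron the probe relies on.
w, *_ = np.linalg.lstsq(acts, labels - 0.5, rcond=None)
top_neuron = int(np.argmax(np.abs(w)))     # the neuron with most weight

preds = (acts @ w > 0).astype(int)
accuracy = float((preds == labels).mean())

print(top_neuron)   # the probe should single out neuron 7
print(accuracy)
```

If the probe recovers the planted neuron and predicts the concept well above chance, that's the (toy) version of the evidence the comment alludes to: the representation encodes the concept rather than just storing text.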

However, I don't think that the transformative character is enough to override the other factors, and therefore in the end it won't/shouldn't be considered fair use IMHO.


What if the LLM is running locally and doing all of these things rather than hosted on a webserver which is serving the content?


It doesn't matter; if everything else stays the same, what matters is what it's used for. If it's used to make money, that would certainly hurt claims of fair use—maybe not for those who do the training, but for those who use it.


> If it's used to make money, it would certainly hurt claims of fair use

What if a human manually searches all those articles and transcribes / summarizes them to me in the way ChatGPT did?


It might also be considered copyright violation, after evaluating the four fair use factors.


Only humans can do those things, so the test fails for an LLM.



