
The two issues aren't exactly the same but they do seem intimately connected. When you consider what's involved in generating a weights file, it's a mostly mechanical process. You write a model, gather some data, and then train. Maybe the design of the model is patentable, or the model/training code is copyrightable (actually, I'm pretty sure it is), but the training process itself is just the execution of a program on some data. You can argue that what that program is doing is simply compiling a collection of facts, which means you haven't created a derivative work, but in that case the weights file is a database, by definition, so not copyrightable in the US. Or you can argue that the program is a tool which you're using to create a new copyrightable work. But in that case it's probably a derivative work.


Appreciate the distinction in the above comment that they are two distinct questions, but also agree the two questions are very connected.

I should've been more specific: I was thinking mainly of the artists v. Stable Diffusion lawsuit, which makes the specific technical claim that the Stable Diffusion software (which includes a bunch of "weights files") includes compressed copies of the training data (line 17: "By training Stable Diffusion on the Training Images, Stability caused those images to be stored at and incorporated into Stable Diffusion as compressed copies", https://stablediffusionlitigation.com/pdf/00201/1-1-stable-d...).

I expect that if the decision hinges on this claim, it could have far-reaching implications for model licensing. I think this is along the lines of what you've laid out here!



