
> if LLaMa is free from legal claims for commercial use.

That restriction is only in the license though, which was not accepted. So it won't apply, right?

> otherwise, [all?] copyright notices and software licenses would be meaningless, because you could simply launder stolen property

That applies to trade secrets. If I knowingly induce you to leak a secret I can't use it, but if you leak the secret to the public in an unrelated fashion and I discover it, I can. It doesn't apply to copyright, as you note.

> If you’re positive there are no copyrighted materials used in deploying LLaMa, then obviously there’s no copyright claim or breach of contract.

There's no breach of contract because there was no agreement. Therefore there's no license to use, so you're right - the copyrightable parts will be a violation. Basically everything except 'data' or machine translations of someone else's works.

And yes, I assume that there's some incidental content, at least, which will be infringing. Even a readme is copyrighted after all. And yes, of course any code which isn't already public elsewhere.

I think the weights themselves are the interesting bit though, and perhaps someone could rerelease them without any incidentals just to clarify the issue.



Happy to be proven wrong, but the weights are meaningless without the parts written by humans. If someone refers to any protected parts to write a new interface to the weights, it would most likely violate the law. Using the weights alone, it would be impossible to reverse engineer them in a clean room to develop a new interface to the weights.

>> There's no breach of contract because there was no agreement.

No, this is property laundering. If intentional, it’s a crime. If unintentional, the property owner need only notify the party of their rights and the remedy they’re seeking, and if needed, send a cease and desist.


> weights are meaningless without the parts written by humans.

I think this can be true, but the format of these is well known.

The torrent consists of params.json, consolidated.*.pth, and tokenizer.model, plus some .chk files. Notably, there is one script, llama.sh, which is about 2 KB, so even if it were needed, it can't be that complex.

> If someone refers to any protected parts to write a new interface to the weights, it would most likely violate the law

Not at all. I can refer to pages and words in a book I don't own.

> No, this is property laundering. If intentional, it’s a crime.

Only if the weights are copyrightable. Otherwise it'd hinge on being a trade secret, the best theory I've seen yet, and that basically says the cat's out of the bag once the public knows something.


>> can't be that complex.

The point is that unless the new system references only property that’s free of any claims, there’s at the very least a valid legal basis to file a complaint, and at that point it would be in the courts’ hands to decide whether the contracts or copyrights had been breached.

>> Not at all. I can refer to pages and words in a book I don't own.

It depends. The only way to be sure this is not the case is for the author of the new code to have never seen the relevant code. Once they have seen the code, it would be up to the courts to decide the merits of the arguments presented.

>> Only if the weights are copyrightable.

Weights are irrelevant, what is relevant is any aspect of the system that is subject to the related terms of use and/or copyright.


> Weights are irrelevant, what is relevant is any aspect of the system that is subject to the related terms of use and/or copyright.

The leak appears to be essentially just weights. The copyrightability of weights is the central and perhaps only issue.

> it would be up to the courts

That's a non-argument. Everything is ultimately up to the courts despite the letter of the law.

> decide whether the contracts [...] had been breached.

If you didn't sign the contract or induce the breach then it isn't relevant.


>> The leak appears to be essentially just weights. The copyrightability of weights is the central and perhaps only issue.

You’re wrong, there’s a material and significant amount of copyrighted material related to LLaMa which is critical to running it. If you’re so confident it’s legal, feel free to link to a guide on how to run LLaMa that uses only the materials originally provided by Facebook, so it’s possible to assess the system’s dependencies on legally protected materials. Next, feel free to link to a build that is not bound to any property claims by Facebook.

>> If you didn't sign the contract or induce the breach then it isn't relevant

Again, this is not true, that’s property laundering; see the comments above, as repeating points I have already made will not add to this discussion. If anything is unclear, let me know, but the claim that a party is not bound to an agreement related to legally protected property (not referring to the weights) if they launder it is obviously invalid. If it were valid, no terms of use that are separable from the property itself could ever be enforced; again, the party would receive a cease and desist with a copy of the terms of use.

>> That's a non-argument. Everything is ultimately up to the courts despite the letter of the law.

No, legally it’s material. There is a massive difference between clean-room reverse engineering a system from property that’s free from any claims and referencing materials that are subject to claims to build a new system. Further, it is my position that a clean-room build is impossible in this situation. As a result, the only way anyone could have any confidence that a new system was free from material claims is through a ruling.

____

Beyond the points above, it's worth noting Facebook has already begun taking legal actions against developers related to the LLaMa leak, so it’s clear they have no intention of releasing the weights for commercial use. Here’s an example:

https://github.com/shawwn/llama-dl


> there’s a material and significant amount of copyrighted material related to LLaMa which is critical to running it.

Which files from the torrent do you assert are required?

> system’s dependencies on legally protected materials.

Do you think the weights are copyrightable? The dependencies are irrelevant because they weren't in the leak.

> Again, this is not true, that’s property laundering;

Only if there is actual copyrightable material. And not just adjacent copyrightable material that is required to use the weights, but the weights themselves, because they are what leaked.

If not, this is a trade-secret scenario, not a copyright scenario.

If there's no copyright on the weights then there's no "laundering" because there's no general restriction on the public using the material once it leaks. If Coke lost its recipe and it turned up online, everyone including Pepsi would be free to use it.

What is certain is that the "no commercial use" clause is irrelevant. The only people the license is binding on are those who accepted it. If the weights are copyrightable then the license is irrelevant to you because it simply hasn't been offered to you. If the material isn't copyrightable then there's no reason to accept the license once it leaks.

> worth noting Facebook has already begun taking legal actions against developers related to the LLaMa leak

They clearly have a copyright on at least one file (llama.sh) in that archive so yes, they can make a DMCA takedown claim. That doesn't prove anything you're saying though, about weights and the ability to use them.



