
Copilot is using GPL-licensed code snippets as starting points, or combining them with other snippets as derivations. As a result, it's injecting GPL code into other code, or providing GPL code as is.

At that point the code needs to be GPL licensed, or it's a breach of license. It's plain and simple.

The triviality argument is hard to make, because you can code a nice "trade secret", or an algorithm worthy of its own paper, in ~25 lines (at least I did).

So Copilot's use of GPL-licensed code is not fair use in my eyes.

BTW, all my public code is (A)GPLv3+ licensed. I want my code to stay open, not closed.



Beyond notions of de minimis is the important distinction that copyright does not apply to utilitarian features! There are also notions of expressions that were already in the public domain and some copyright cases even refer to this as prior art.


You boldly assume that all GPL code is trivial and that all the functions included in these codebases can be claimed to be in the public domain.

There are many GPL-licensed codebases (e.g. Linux Kernel, GNU Octave, GNU R, etc.) which contain serious research and novel ideas and implementations. These codebases are licensed that way specifically to keep that research in the open.

I have such a codebase which I was planning to open source with a GPL3.0 license, but I postponed those plans because of GitHub's Copilot shenanigans.

As I said, the code contains many novel, yet short snippets which are worthy of their own research paper.

Downplaying the research embedded in open codebases is just harmful to the discussion.


Is looking at GPL code and rewriting it from scratch in your own style a license breach? For me, Copilot always uses the context and writes new code based on your own code as input. It uses the same patterns and code style as your own code.


Yes. When you write the same algorithm, even with a different style, it's a direct derivation of the code at hand. Derivative works of GPL licensed code are still GPL licensed.

Moreover, Copilot doesn't always derive code, but sometimes provides the code as-is, reproducing it comment by comment [0], and with the wrong license while at it.

[0]: https://twitter.com/mitsuhiko/status/1410886329924194309


Only the expressive elements of an algorithm fall under copyright. Generally speaking math equations and algorithms can’t be restricted by copyright.

Expressive, non-technical comments in code are most definitely protected by copyright.


So, I can use code from leaked source as I wish, and duplicate all the algorithms and methods for achieving something.

Nice.

Well, we've been so naive, doing clean-room development for so long, and writing EULAs with sentences all in caps stating that this is a trade secret and you can't do anything with it.

BTW, I want to reiterate that I'm trying to keep things open, not closed. So I'm not using copyright to protect the code from being modified. That is what the GPL is for, actually. The research is already out there, in the form of a paper. You can re-implement that. But my implementation is under GPLv3. So I'm not copyrighting/patenting the algorithm. What's covered is the particular expression I've implemented as code and released under the GPL.

As a result, if you derive anything from my implementation, that's GPL too. If you want that algorithm, it's on paper. Go implement that. I don't care.


Note that the controlling framework here is copyright law, hence the term "copyright license".

A lawyer will tell you that reimplementing code you have read, even if you change around the functions and rename the variables, is inviting a lawsuit. A general defense to a copyright lawsuit is to document that you have never viewed the copyrighted work, e.g. by having one person/team document the function of the code and having another person/team implement the function using that documentation (a.k.a. "clean-room reverse-engineering").

In this case, it would be up to the courts, but having a piece of copyrighted code reproduced verbatim in your project without a license is not a good look, and I highly doubt usage of Copilot would be an affirmative defense.


If you look at NEC v. Intel, we get an affirmation that verbatim similarity is often a sign that it was the only way to implement a given function.

And in Sony v. Connectix we get a ruling that even non-clean-room implementations are not in violation if there is no other way to reverse engineer.


Yes, actually. That's why clean-room design is a thing.



