I like the fact that I can now reproduce any Microsoft content without paying for it. Cheers!
Incidentally, some AI chatbots do link to their sources. And it is a good idea to make that an explicit prompt if you're using one that doesn't. It's also worth prompting for how recent their information is.
I would argue that if I ask ChatGPT something, it doesn't "reproduce" what was written on a certain website (or at least it shouldn't, without attribution). It takes what it scraped before and retells it in its own words. That isn't reproducing; it looks like a grey area not yet addressed in copyright laws.
I would partially agree with the guy that yes, that was a social contract since the '90s, but before the AI era. Back then this use case wasn't anticipated.
Imagine training an LLM vs a group of people from birth on wrong information. The LLM will unquestioningly just repeat the wrong information in "its own words", whereas the group of people will of course believe some of the wrong stuff, but they will also doubt a lot of it.
You could say that an LLM is just not good enough yet, so the comparison isn't fair. In other words, that people are just doing even more LLM'ing than the LLM. But there simply is no mechanism for an LLM to go from wrong information to right information.
People on the other hand will always doubt, hypothesize, and compare and contrast whatever information they have to at least attempt to form correct answers from correct information. This in a sense is because they actually have their own words.
There has, as of today, never been a smart or creative thing an LLM has said that doesn't literally come from other people's words. If LLMs are smart, it's because people are smart.
There’s nothing ambiguous from a copyright perspective: it’s a derivative work. People seem to confuse plagiarism in an academic environment with copyright. Simply using your own words doesn’t mean you’re free from copyright.
However even when something infringes copyright that doesn’t mean anything necessarily happens. Just look at YouTube’s early history or the mountains of fan fiction out there.
But something did happen. Viacom and others sued them, and then YouTube introduced their Content ID system so that they could pay copyright holders for content that others uploaded, as well as to take down videos belonging to copyright holders that did not agree to other people uploading their content.
Yes, it took 2 years after creation and truly massive amounts of copyright infringement before the lawsuits by copyright owners showed up. OpenAI is getting sued, but don’t expect asking for a website to be rewritten to provoke anything unless you publish such rewritten posts at scale or something.
> However even when something infringes copyright that doesn’t mean anything necessarily happens. Just look at YouTube’s early history or the mountains of fan fiction out there.
This part is talking about uploading a copy of something verbatim, the way I read it.
Last time I used Copilot, the "sources" often didn't support what it said and it seemed like they were obtained by adding search results from feeding the answer into Bing after it had already been generated.
And there were of course tons of SEO slop links among them.
I asked ChatGPT for sources and it was impossible to determine whether they were real or not. It'd cite things like "Sky and Telescope magazine": no edition, no page numbers, no year, just a vague, unverifiable citation.
>I like the fact that I can now reproduce any Microsoft content without paying for it
Only if you have the same quality lawyers and financial backup to support them to get you off like MS has. Else what applies to MS doesn't apply to you :)
You are probably joking, but that is literally what MS said; they don't even hide it. A quote from The Register: "Suleyman (Head of MS AI) did allow that there's another category of content, the stuff published by companies with lawyers." (https://www.theregister.com/2024/06/28/microsoft_ceo_ai/)
Has this become any better? Every time I ask ChatGPT for sources it makes up papers, with fragments of real paper titles and topically related authors. The supposed paper itself, though, can't be found anywhere.