I like the fact that I can now reproduce any Microsoft content without paying fo...

greyman · on June 29, 2024

I would argue that if I ask ChatGPT something, it doesn't "reproduce" what was written on a certain website (or at least it shouldn't, without attribution). It takes what it scraped before and retells it in its own words. That isn't reproducing; it looks like a grey area not yet addressed in copyright law.

I would partially agree with the guy that yes, this has been the social contract since the '90s, but that was before the AI era. Back then this use case wasn't anticipated.

asimpletune · on June 29, 2024

> in its own words

LLMs have no words of their own.

Imagine training an LLM vs a group of people from birth on wrong information. The LLM will unquestioningly just repeat in "its own words" the wrong information, whereas the group of people will of course believe some of the wrong stuff, but they will also doubt a lot of it as well.

You could say that an LLM is just not good enough yet so the comparison isn't fair. In other words that people are just even more LLM'ing than the LLM, but there simply is no mechanism for an LLM to go from wrong information to right information.

People on the other hand will always doubt, hypothesize, and compare and contrast whatever information they have to at least attempt to form correct answers from correct information. This in a sense is because they actually have their own words.

There has, as of today, never been a smart or creative thing an LLM has said that doesn't literally come from other people's words. If LLMs are smart, it's because people are smart.

Retric · on June 29, 2024

There’s nothing ambiguous from a copyright perspective: it’s a derivative work. People seem to confuse plagiarism in an academic environment with copyright. Simply using your own words doesn’t mean you’re free from copyright.

However even when something infringes copyright that doesn’t mean anything necessarily happens. Just look at YouTube’s early history or the mountains of fan fiction out there.

codetrotter · on June 29, 2024

> Just look at YouTube’s early history

But something did happen. Viacom and others sued them, and then YouTube introduced their Content ID system so that they could pay copyright holders for content that others uploaded, as well as to take down videos belonging to copyright holders that did not agree to other people uploading their content.

Retric · on June 29, 2024

> something did happen

Yes, it took 2 years after creation and truly massive amounts of copyright infringement before the lawsuits by copyright owners showed up. OpenAI is getting sued, but don’t expect asking for a website to be rewritten to provoke anything unless you publish such rewritten posts at scale or something.

coldtea · on June 29, 2024

> then YouTube introduced their Content ID system

That's for content that's reproduced in part or fully, but verbatim (like a song, movie clip, etc, where Content ID can apply).

But the parent's point is you can have trouble even for content where you "retell" something "in your own words".

codetrotter · on June 29, 2024

The part I was responding to is this:

> However even when something infringes copyright that doesn’t mean anything necessarily happens. Just look at YouTube’s early history or the mountains of fan fiction out there.

This part is talking about uploading a copy of something verbatim, the way I read it.

moritzwarhier · on June 29, 2024

Last time I used Copilot, the "sources" often didn't support what it said and it seemed like they were obtained by adding search results from feeding the answer into Bing after it had already been generated.

And there were of course tons of SEO slop links among them.

malfist · on June 29, 2024

I asked ChatGPT for sources and it was impossible to determine whether they were real or not. It'd cite things like "Sky and Telescope magazine": no edition, no page numbers, no year, just a vague, unverifiable citation.

coldtea · on June 29, 2024

> I like the fact that I can now reproduce any Microsoft content without paying for it

Only if you have the same quality of lawyers, and the financial backing to pay them, to get you off like MS has. Else what applies to MS doesn't apply to you :)

danybittel · on June 29, 2024

You are probably joking, but that is literally what MS said; they don't even hide it. A quote from The Register: "Suleyman (Head of MS AI) did allow that there's another category of content, the stuff published by companies with lawyers." (https://www.theregister.com/2024/06/28/microsoft_ceo_ai/)

mglz · on June 29, 2024

Has this become any better? Every time I ask ChatGPT for sources, it makes up papers, with fragments of real paper titles and topically related authors. The supposed paper itself, though, can't be found anywhere.

BlueTemplar · on June 29, 2024

Good when they do, but depending on what we are discussing, linking to all their sources might be completely impractical.