
In this case I think I'd give ChatGPT the benefit of the doubt. It is possible to invent something that already exists, and it has happened on several occasions throughout history. A great example is the history of who was really first to invent the telephone. In the end Alexander Graham Bell got the patent, but perhaps Elisha Gray was actually first? Historians remain divided on the topic.

For instance, I once found what I thought was an ingeniously original idea about how TV is really just a kind of reflection of reality akin to Plato's Cave. I immediately got started writing a thesis about it, but I didn't have to search for long on the topic before I found an entire book written on this way of thinking about television. I wasn't really disappointed, because in the back of my head I knew it had to be too good to be true that I'd be first with such a great idea. In any case I kept working on the thesis, and I still got a good grade on it despite the idea not being revolutionary.

The questions I now wonder about are: can ChatGPT forget? Or could it be that ChatGPT was never exposed to this game, but could still infer it through other game rules, such as those for Sudoku? Which I guess opens up another rabbit hole on if or how AI can be creative, which in turn opens up another rabbit hole on how creativity works in general.



The funny thing is that it is neither lying, nor inventing something new. What OpenAI did pretty well was collect data. And wouldn't you know it, the folks who developed that new puzzle describe it as what it is: a new kind of puzzle. So now in the training data you have a combination of puzzle, sudoku, and new/novel. And wouldn't you know it, by asking for a new puzzle based on sudoku, you make ChatGPT dig for that kind of text. If ChatGPT really had a novel idea, I would not expect it to be this coherent; after all, logic and coherence are not a constraint on how language models work, just what words are likely to occur next. That is why it is being compared to entry-level college writing, because that is how an excited student writes, hopping from topic to topic.
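To make "just what words are likely to occur next" concrete, here is a toy sketch of next-token sampling; the vocabulary and scores are invented for illustration and not taken from any real model:

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical next-word scores after a context like "a new kind of ..."
    vocab  = ["puzzle", "sudoku", "game", "banana"]
    logits = np.array([2.5, 1.8, 1.2, -3.0])
    # Softmax turns raw scores into a probability distribution
    probs = np.exp(logits) / np.exp(logits).sum()
    # The model just samples whatever is likely given the context
    print(rng.choice(vocab, p=probs))

Nothing in there checks logic, truth, or novelty; likelihood is the only criterion.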


But how is it different from humans? I can't tell you how many times now I've come up with what I thought was a really cool idea but upon web searching found it was already invented/discovered, etc. In fact, before the Internet I had come up with my own algorithms, and only once the Internet existed did I find they had already been discovered years earlier. There's no way I was regurgitating something I had read in that case.


There’s a difference between coming up with a puzzle then finding out it already exists versus finding a puzzle and saying you came up with it.

If I told you “We need a brand new, never-before-seen puzzle for our next game release.” and you searched Google for “brand new, never-before-seen puzzle”, found a puzzle game with those words in its marketing copy and pitched it to me, that would be some combination of unintelligent and dishonest behavior. Like, surprisingly so. It’s different from forgetting some puzzle you played with as a little kid and thinking you made it up, or creating a puzzle you’d never seen but that has been made before.


But ChatGPT is not a person, it is a text generator. By asking it to generate a new puzzle, you are prompting it to find text in its training data showing someone describing a new puzzle, and it is going to speak in their voice. It's going to emit sentences that were influenced by what the puzzle developer originally wrote, and that person correctly said that it was new.


I'm not entirely sure about this. ChatGPT would have to make a model for how such a game was made, and then infer its rules. From that perspective, it would be brand new, although very similar games would perhaps exist out there. And at that point it's also starting to look a lot more like human creativity, although I guess not entirely. As such, the statistical or probabilistic framing, or the Chinese room framing, is getting less and less valid for the AI, because it's not doing simple probabilistic look-ups from some table. Instead it's actually developing something "new", at least with respect to the perspective of the AI and the data or source material available to it.


I agree with everything you’ve written here, so I’m not sure what the “But” that starts your comment is contrasting.

I was answering the question “But how is this different from a person?”. Being asked for something new and finding something that already exists with the word “new” in front of it isn’t normal human behavior. That’s how it’s different from a person.

Zooming out a bit, I think there’s some confusion in this whole chain. There’s a common topic about ChatGPT you could call Question of Creativity. If you ask for a new poem, it just smashes together its patterns around poems. You can debate if this is creativity, and if not, how are humans different. A few comments up, someone brought in a different idea you could call New Matching. If you ask for a new poem it will just grab you a poem that had the words “new poem” in front of it. New Matching is a different idea than Question of Creativity. The person I replied to seemed to be mistaking one idea for the other.


You're not prompting it to "find text". Comparing the size of the model to the size of the training data is sufficient to conclusively establish that it's an impossibility.
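A back-of-envelope sketch of that size comparison, using publicly reported GPT-3 figures (175B parameters; roughly 570 GB of filtered training text distilled from about 45 TB of raw crawl); the numbers are rough and the bytes-per-parameter assumption is mine:

    # ~175B parameters at 2 bytes each (fp16) vs. the training text
    model_bytes   = 175e9 * 2   # ~350 GB of weights
    filtered_text = 570e9       # ~570 GB filtered training corpus
    raw_crawl     = 45e12       # ~45 TB of raw crawl it was distilled from
    print(model_bytes / filtered_text)  # ~0.6
    print(model_bytes / raw_crawl)      # ~0.008

The weights are smaller than even the filtered corpus, so verbatim storage and lookup of arbitrary training text can't be what is happening.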

We train it to predict the next word based on the training data, that is true. But we still have no idea what kind of internal structures said training actually produces inside the neural net. It sure as hell isn't just a "stochastic parrot", though, which is rather obvious if you have ever tried giving it a complicated multi-step task and watching it solve it while "thinking out loud".


This. People who can ground themselves in what ChatGPT is (an auto-completion text predictor) are best able to understand the origins of its output.


It is different from what you do. If I tell you that this is already a thing, you might go back to the drawing board and do something from scratch. Maybe do some abstract drawing with numbers for brainstorming. A language model is not able to do this; the starting point for a language model is always the training data. That is why there are so many instances where you see some wrong (or correct) response from ChatGPT and, when the other person corrects this, the model just agrees with whatever the user says. That is the right thing to do according to language etiquette, but it has nothing to do with what is true and right. (It invokes the image of a sociopath manager trying to sell you a product: they will find a way to agree with you to close the deal.)

I don't know what introspection is, but I know it when I see it. People around me genuinely come up with new concepts (some of what they came up with decades ago is now ubiquitous), and the source is often not language. It comes from observing the world with your eyes, from physical or natural mechanisms. If you want to put it into the language of models: we just have so much more data to draw on. And we have a good feedback mechanism. If you invent a toy, you can build it and test it. Language models only get second-hand feedback from users. They cannot prototype stuff if the data isn't out there already.


>It is different from what you do. If I tell you that this is already a thing, you might go back to the drawing board and do something from scratch.

Wouldn't your "something from scratch" idea be based on your "training set" (knowledge you've learned in your life), and on ways of re-arranging it inside your brain, using neuron structures created, shaped, and reinforced in certain ways by exposure to said training set and various kinds of reinforcement?


The training data for human brains has orders of magnitude more complexity than text. Language models are amazing, but they can only do text, based on previously available text. We have higher-dimensional models and we can relate to those from entirely different contexts. The same thing, to me, severely limits 'computer vision'. We get 3D interactive models to train our brains with; machine learning models are restricted to grids of pixels.


>The training data for human brains has orders of magnitude more complexity than text.

Still a training set though. There's no magic non-training part creating stuff from zero, out of pure determination!


There is never any 'magic'. Magic is just a word for things we don't understand. This is beside the point. Just like you'll never reach orbit with a cannon, there are limits to the tools, and it is useful to know them. There will never be an isolated language model trained on bodies of text that is capable of reasoning, and people shouldn't expect the outputs of language models to be more than accidentally cogent word salads.


One implication though, is that LLMs can currently come up with novel mixes of existing ideas. It might be a good blender, integrating different pieces into a new whole.


Yes, but the language model does not have the feedback mechanism we have. We can test ideas against reality. Language models can make up all kinds of crap until there is data somewhere mentioning that it's not going to work. You could come up with an idea and workshop it, e.g., seeing if it's physically feasible to make something, before sharing it with others; language models cannot.


There are very few new ideas, but many different people have the same ideas.


> Or could it be that ChatGPT was never exposed to this game, but could still infer it through other game rules, such as those for Sudoku?

There is no way; the game type is centuries old. You can read giant Wikipedia articles about games like this.

https://en.wikipedia.org/wiki/Magic_square

ChatGPT "inventing" this is like thinking it invented chess.


From my understanding (anybody please correct me if I'm wrong), ChatGPT cannot really invent anything; it can just generate text based on probabilities obtained from the mountain of source documents used for training it. It does not think in the same way we do; it is just amazing at writing coherent phrases (and very simple code).

There's a quite long article from Stephen Wolfram about how it works, and it is why I believe it can't do that: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...


What does it mean to say that it can't invent anything? If, for example, I ask it to make a new poem with no line previously recorded in the English language, it will do so. If I google that poem to test its originality, I won't find a match. It seems to me it just made something novel, right?


When humans write new literature, or design new games, are we simply remixing elements of language and game mechanics that we've seen before, or is there something more going on?


Who else's experiences can we pull from but our own? It can't be anything else.


You may be splitting the wrong hair here.

However it generates the text, that text may describe what is, for practical purposes, a new invention.


>it does not think in the same way we do

And how do we think, exactly? Don't we have a brain trained on input (lived experience, knowledge from books, school, videos, conversations, etc.) and generating text based on probabilities (weighted sets of neurons, with weights built from that training set)?


This is not a magic square, though. All rows and columns explicitly do not add to the same number.


Yes, but the non-magic square is inspired by the magic square, and such games are everywhere. Just buy a random puzzle book and you'll find pages and pages of puzzles with "make the numbers add up to these columns and rows", because they are very easy to make (see the sketch after this comment).

The point about magic squares is that every culture invents games like that; it is one of the most basic puzzle ideas humans have. I don't see how ChatGPT could not have that in its training set.
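To illustrate how cheap these puzzles are to produce, here is a minimal sketch of the generic "block out cells to hit the target sums" variant (the exact rules of the game in the article may differ; the function and parameters are mine):

    import random

    def make_puzzle(n=4, lo=1, hi=9, keep=0.5, seed=0):
        # Fill an n x n grid with random values, secretly mark a subset of
        # cells as "kept", and publish only the row/column sums of the kept
        # cells. The solver's job is to recover which cells to keep.
        rng = random.Random(seed)
        grid = [[rng.randint(lo, hi) for _ in range(n)] for _ in range(n)]
        mask = [[rng.random() < keep for _ in range(n)] for _ in range(n)]
        row_sums = [sum(v for v, m in zip(row, mrow) if m)
                    for row, mrow in zip(grid, mask)]
        col_sums = [sum(grid[r][c] for r in range(n) if mask[r][c])
                    for c in range(n)]
        return grid, row_sums, col_sums

    grid, rows, cols = make_puzzle()
    print(grid)
    print(rows, cols)

Every run with a different seed is a "new" puzzle, which is exactly why puzzle books are full of them.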


If you're trying to show that a chat bot could not possibly have generated this specific game on its own because it already exists, you kind of have to show that it already exists.

All that you’ve done is shown that similar types of puzzles exist. Which, I mean, is kind of the point of a generative AI.

“Games like this” exist. Does this specific game exist?


Much more advanced versions of it exist, and have existed for a long time; Kakuro, for example, predates computer games. Magic sum is just a special case of it. Finding a discussion with those exact rules is probably a bit hard, since search engines aren't good at searching for that, but given how common these games are and how many game design discussions and ideas there are online, a game where you "block out these numbers to make these sums" is sure to exist somewhere. The poster above even found the exact same game, although that one wasn't described in text; but someone probably described it in text somewhere.

https://en.wikipedia.org/wiki/Kakuro


Once again, to show that one thing is blatantly copying another thing, you kind of have to show that the thing exists already. Kakuro is also a similar game with its own unique rules that only somewhat overlap with this one.

It’s not enough to say “a lot of games with similar rules exist” and if anything, that just shows that a generative AI is good at what it does: break down the rules of a game and make modifications to make what is potentially a new game.

If you can show an example of this exact game having existed for centuries, then you have a point. But showing that magic squares and similar games exist… just shows that magic squares and similar games exist, not that the algorithm incorrectly said this is a new game.


The discussion was about the probability of ChatGPT having invented it, and the probability that a description of such a game is in ChatGPT's dataset is extremely high. We have examples of that exact game existing (the top post of this thread), and we know from my links that there are countless texts about puzzles like this out there, although they aren't exactly the same.

> It’s not enough to say “a lot of games with similar rules exist” and if anything, that just shows that a generative AI is good at what it does: break down the rules of a game and make modifications to make what is potentially a new game.

No it doesn't; even if that is the case, it just shows that it adds random variations. Since we only see the trimmed subset of the ideas it generates that people found good enough to post, the smart one is the person.

You would need to prove that ChatGPT consistently generates working puzzle ideas that are novel to convince anyone that it actually does so. Extraordinary claims require extraordinary evidence, so all I need to do is find plausible explanations for how ChatGPT found it; you would need much better evidence to convince people it actually did make a novel game.


> The discussion was about the probability of ChatGPT having invented it, and the probability that a description of such a game is in ChatGPT's dataset is extremely high.

If this were the case, it would have been trivial for you to find a game with its written rules described, matching the one generated.

You have done nothing but say that is the case. You haven’t actually proven that’s the case.

ChatGPT can’t magically infer the rules of the game from screenshots, and you have only shown that similar games exist and have existed for centuries. But that is not the same as saying that this specific game has and that ChatGPT just pulled it out of its dataset.

That is the extraordinary claim that you don’t have evidence for but are acting like it’s right there obviously out in the open for everyone to see.


> If this were the case, it would have been trivial for you to find a game with its written rules described, matching the one generated.

Search engines don't work like that. You are basically asking me the equivalent of proving that a photo isn't depicting a ghost. No, I can't prove that; I can, however, come up with examples showing how the photo could have been created even if it wasn't a ghost.

If you want to prove that ghosts are real you need plenty of photos from lots of angles and situations, or videos, and from many sources, to show that it isn't all made up by a single person. The equivalent of that would be if they had made ChatGPT generate 100 different working games, for example; that would be much more believable. But a single case of a game that already exists, with countless texts describing similar games? It just looks like handpicked random chance, or plagiarism.

This isn't a court trial, I am not going to sue ChatGPT for plagiarism here, it is just a discussion whether it is reasonable to believe ChatGPT can generate novel puzzle games.

Edit: But do note that since ChatGPT can find such ideas that are hard to find with a search engine, that makes ChatGPT very useful in a way search engines aren't. So I am not saying it doesn't add value, just that people seem to say ChatGPT does a lot of things that it doesn't seem to be able to do.

Edit again:

> That is the extraordinary claim that you don’t have evidence for but are acting like it’s right there obviously out in the open for everyone to see.

Yes, you think it is obvious that ChatGPT is capable of very creative and productive thinking. But most people don't think that; to them, that is an extraordinary claim. I'm not here to convince you, I'm here to explain why you aren't convincing anyone with what you say. People like you were convinced by articles like this before the discussion even began.


> Search engines don't work like that. You are basically asking me the equivalent of proving that a photo isn't depicting a ghost. No, I can't prove that; I can, however, come up with examples showing how the photo could have been created even if it wasn't a ghost.

The claim was that it pulled the game out of its dataset. If this were the case, I would argue it would absolutely be trivial to find them. It’s not some concept that can’t be described in words or would be hard to quantify. The rules have been provided, and, assuming they were plagiarized from somewhere else, would be listed verbatim or close to it.

If a student plagiarized their work, whether in written form or in code, it's been trivially easy to find the exact work it was copied from. It generally takes me a few seconds of searching to find it.

This is the same. If these rules existed in a dataset, then it should be equally easy to pull them up and prove the plagiarism. If all you can find is similar puzzles, you can’t just throw your hands up and say “yep, gottem”. That’s just not how this works.


> The claim was that it pulled the game out of its dataset. If this were the case, I would argue it would absolutely be trivial to find them. It’s not some concept that can’t be described in words or would be hard to quantify. The rules have been provided, and, assuming they were plagiarized from somewhere else, would be listed verbatim or close to it.

ChatGPT uses word vectors, so it won't use the same words but variants of the words. You can't search for that. Cases where word vectors map onto single words, with no variations for any word, are very rare, so ChatGPT is very good at plagiarising things without reproducing them exactly; it rarely fails at it.
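A toy illustration of why that defeats exact-match search (the three-dimensional "word vectors" here are made up for illustration; real embeddings are learned and have hundreds of dimensions):

    import numpy as np

    vec = {
        "puzzle": np.array([0.9, 0.1, 0.2]),
        "riddle": np.array([0.8, 0.2, 0.3]),
        "banana": np.array([0.1, 0.9, 0.1]),
    }

    def cos(a, b):
        # Cosine similarity: how close two vectors point
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print("puzzle" == "riddle")               # False: exact match fails
    print(cos(vec["puzzle"], vec["riddle"]))  # ~0.98: nearly identical to a model
    print(cos(vec["puzzle"], vec["banana"]))  # ~0.24: genuinely different

A search engine looking for the literal string "puzzle" will miss a text that says "riddle", while a model operating on vectors treats them as nearly the same thing.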

> If a student plagiarized their work, whether in written form or in code, it's been trivially easy to find the exact work it was copied from. It generally takes me a few seconds of searching to find it.

No it isn't; they just change the words and rewrite it until it no longer looks the same. ChatGPT is trained to rewrite texts like that, which avoids triggering trivial plagiarism detectors. They train it to produce the same text but with different words; producing exactly the same text is punished.


> No it isn't; they just change the words and rewrite it until it no longer looks the same. ChatGPT is trained to rewrite texts like that, which avoids triggering trivial plagiarism detectors. They train it to produce the same text but with different words; producing exactly the same text is punished.

Do you think students plagiarizing don’t do the exact same thing? Clearly someone has never actually dealt with plagiarized work. This is plagiarizing 101. The structure remains the same even if they use synonyms. Considering it’s trivially easy to find in code, which is magnitudes harder to pull off, I would still argue it should be easy as pie to find this supposed set of rules.

Your point is not very credible without proof of this game existing and ChatGPT pulling it from that source. Without showing this supposed proto-game having existed with rules that ChatGPT could pull from, all you’ve done is wave your hands around and yell “similar games exist, so this can’t possibly be uniquely generated”, and that’s not a very compelling argument.


> Do you think students plagiarizing don’t do the exact same thing? Clearly someone has never actually dealt with plagiarized work. This is plagiarizing 101. The structure remains the same even if they use synonyms.

You rewrite the structure of the text, you don't just use synonyms. ChatGPT is capable of rewriting text to a different structure while keeping the meaning, I hope you are aware of that.

Anyway, even if you just change the words to synonyms, it won't be easy to find in a search engine. Search engines aren't very good at finding matches to synonyms. Google tries, but in doing so it fails to find more specific texts like scientific publications or documentation. So no, search engines aren't good at finding plagiarism.

Edit: And you make it sound like most plagiarism is found. No, that isn't the case; most plagiarism is not found out, because it is a very hard problem to solve. Only the most blatant cases are caught. For humans that is reasonable; for AI we can be stricter, since there isn't a human's career at stake.


> Anyway, even if you just change the words to synonyms, it won't be easy to find in a search engine.

Got it, so you’ve never actually dealt with plagiarized work. You should have just led with that.

I have literally said, from actual experience, that this is the case. But I guess discarding that, pretending it was never said, and assuming the opposite is true is, I’m sure, an easier position to hold.


Do you believe you never missed any plagiarised work? You caught some people doing X, and then you declare that catching people who do X is trivial. But plenty of people get away with doing X, so we know that it isn't easy to catch.

For students they are probably easier to catch, since they use the same tools you do: they use a search engine to find an article and plagiarise that. But ChatGPT takes deep discussions from Reddit or Stack Overflow, and I can't find those with a search engine.


If it’s as blatant as copying the entire game, you’d think it would be easier for you to find the game it copied. By your own account, this is an example of an obvious case of plagiarism. You were dead set on it, 100% sure.

Yet here we are. A dozen comments later, and still no written set of rules produced which definitively shows that it was copied.

Come back when you actually have that and maybe we can continue this conversation.

> But ChatGPT takes deep discussions from Reddit or Stack Overflow, and I can't find those with a search engine.

Where do you think the answers come from? It’s not like Reddit and SO are some massive island missing from Google’s index.


I tend to exaggerate my claims a bit, yes. But you exaggerate your claims as well; for example, you claim that if it had copied the rules it would be easy to find an example, and that isn't true at all. Many examples of plagiarism go unnoticed for years, until someone who is familiar with the original work points it out. I know of a case where the person was only found out at his thesis defence: he had plagiarised his entire PhD work from papers in another language, and for years nobody had noticed, not even the peer reviewers of the papers.

So maybe these rules are described in Japanese? Most similar games come from Japan: Kakuro, Sudoku, etc. Would your plagiarism-detection method of Googling it find a Japanese source? I doubt it. But ChatGPT transcends language barriers; it can translate to English just fine.


Being briefly mentioned in the dataset would not really help it, because it doesn't "remember" the entirety of the dataset anyway. Something would have to be described repeatedly in the training inputs for ChatGPT to remember its rules with this level of precision.


One game I can think of that is very similar is a game within a game: the Dungeons & Diagrams puzzle within Last Call BBS [0]. In that game you place or remove walls so that they add up to the numbers shown per row/column. That game has another layer of strategy built on top, as there are certain "dungeon patterns" you can observe that would in theory guide you through completion. I myself didn't notice any patterns when I tried the game the first time, and just relied on the numbers shown. (Guess that's why I've only played 3-4 levels.)

[0] https://steamuserimages-a.akamaihd.net/ugc/18583143573725211...


https://play.google.com/store/apps/details?id=com.rohitpailw...

This comment thread started with this link; it is exactly the same game.


Sure, and then the next comment said that ChatGPT could have separately invented the game, to which the comment I replied to said that's impossible because the type of game is old and surely would have been written down and included in its corpus, which it then claimed it invented. The rest of the context matters.

ChatGPT can't deduce the rules of the game using the screenshots. They would need to be written somewhere for them to come out of its dataset. And so far, nobody has shown a game with the rules in a format that ChatGPT could consume.

Why is it so hard to believe that a generative AI generated this game from similar ones which exist? That is literally the purpose of it, after all.


Hm, well in that case I may well be wrong. Thanks for the info!


ChatGPT should be able to cluster things and see where clusters could be, collect everything necessary for that theoretical cluster, and then a human could evaluate it.


Re forgetting: we should be careful not to anthropomorphize ChatGPT.

In principle, ChatGPT cannot forget. It is trained on data, and this training will stay as long as it doesn't get deleted or destroyed. In other words, in every case where someone has made ChatGPT say something, it should be possible to repeat this. Perhaps in some cases it will be effectively impossible, for some rare combination of prompt and random seed, so one could say ChatGPT forgot something. But this is not the same as people forgetting something.
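ChatGPT's own weights aren't public, but the frozen-weights point is easy to demonstrate with an open model; a minimal sketch using GPT-2 via the transformers library, with greedy decoding so not even a random seed is involved:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = tok("A new kind of puzzle:", return_tensors="pt").input_ids

    # Greedy decoding is deterministic: same prompt, same frozen weights,
    # same output, every time. Nothing is learned or forgotten between calls.
    out1 = model.generate(ids, max_new_tokens=20, do_sample=False)
    out2 = model.generate(ids, max_new_tokens=20, do_sample=False)
    assert torch.equal(out1, out2)

With sampling turned on, the output varies with the random seed, but the model itself is unchanged between calls.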

Or perhaps during training something was not considered important; but that is not forgetting, that is ignoring.



