
You are exaggerating. LLMs simply don’t hallucinate all that often, especially ChatGPT.

I really hate comments such as yours, because anyone who has used ChatGPT in these contexts would know that it is pretty accurate and safe. People can also generally be trusted to tell good advice from bad. They are smart like that.

We should be encouraging thoughtful ChatGPT use instead of showing fake concern at each opportunity.

Your comment, like many others, just tries to signal pessimism as a virtue and has very little bearing on reality.





All we can do is share anecdotes here, but I have found ChatGPT to be confidently incorrect about important details in nearly every question I ask about a complex topic.

Legal questions, questions about AWS services, products I want to buy, the history of a specific field, so many things.

It gives answers that do a really good job of simulating what a person who knows the topic would say. But details are wrong everywhere, often in ways that completely change the relevant conclusion.


I definitely agree that ChatGPT can be incorrect. I’ve seen that myself. In my experience, though, it’s more often right than wrong.

So when you say “in nearly every question on complex topics”, I’m curious what specific examples you’re seeing.

Would you be open to sharing a concrete example?

Specifically: the question you asked, the part of the answer you know is wrong, and what the correct answer should be.

I have a hypothesis (not a claim) that some of these failures you are seeing might be prompt-sensitive, and I’d be curious to try it as a small experiment if you’re willing.


One example: AWS has two options for automatically deleting objects in versioned S3 buckets.

"Expire current versions" means that the object will be automatically deleted after some period.

"Permanently delete non-current versions" means that old revisions will be permanently removed after some period.

I asked ChatGPT for advice on configuring a bucket. Within a long list of other instructions, it said "Expire noncurrent versions after X days". That setting does not exist, the very similar "Expire current versions" is exactly the wrong behavior, and "Permanently delete noncurrent versions" is the option actually needed.

The prompt I used has other information in it that I don't want to share.
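
For reference, the behavior I actually needed maps to NoncurrentVersionExpiration in the lifecycle API. A minimal boto3 sketch of roughly what the correct rule looks like (the bucket name and the 30-day window here are made up for illustration):

    import boto3

    s3 = boto3.client("s3")

    # Bucket name and retention window are placeholders, not my real config.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-versioned-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "delete-old-revisions",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # apply to the whole bucket
                    # "Permanently delete noncurrent versions": old revisions
                    # are removed 30 days after they stop being current.
                    "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
                    # The look-alike setting ChatGPT blended this with is
                    # "Expiration" ("Expire current versions"), which would
                    # instead expire the live objects:
                    # "Expiration": {"Days": 30},
                }
            ]
        },
    )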


I don't think that LLMs do a significantly worse job than the average human professional. People get details wrong all the time, too.

LLMs give false information often. Your ability to catch incorrect facts is limited by your own knowledge and by your ability and willingness to do independent research.

"LLMs are accurate about everything you don't know, but factually incorrect about the things you're an expert in" is a common observation for a reason.


As I’ve used LLMs more and more for fact-type queries, my realization is that while they give false information sometimes, individual humans also give false information sometimes, even purported subject matter experts. It just turns out that you don’t actually need perfectly true information most of the time to get through life.

No, they don’t give false information often.

They do. To the point where I'm getting absolutely furious at work at the number of times shit's gotten fucked up, and when I ask how it went wrong, the response starts with "ChatGPT said"

Do you double-check every fact, or are you relying on being an expert in the topics you ask an LLM about? If you are an expert on a topic, you probably aren't asking an LLM anyhow.

It reminds me of someone who reads a newspaper article about a topic they know and says it's mostly incorrect, but then reads the rest of the paper and accepts those articles as fact.


Gell-Mann Amnesia

"Often" is relative but they do give false information. Perhaps of greater concern is their confirmation bias.

That being said, I do agree with your general point. These tools are useful for exploring topics and answers; we just need to stay realistic about their current accuracy and bias (they are eager to agree).


I just asked ChatGPT.

"do llms give wrong information often?"

"Yes. Large language models produce incorrect information at a non-trivial rate, and the rate is highly task-dependent."

But wait, it could be lying and they actually don't give false information often! But if that were the case, that answer would itself be false information, verifying that they give false information at a non-trivial rate, because I don't ask it that much stuff.


I have them make up stuff constantly for smaller Rust libraries that are newish or don't get a lot of use.

Whether or not hallucination “happens often” depends heavily on the task domain and how you define correctness. In a simple conversational question about general knowledge, an LLM might be right more often than not, but in complex domains like cloud config, compliance, law, or system design, even a single confidently wrong answer can be catastrophic.

The real risk isn’t frequency averaged across all use cases — it’s impact when it does occur. That’s why confidence alone isn’t a good proxy: models inherently generate fluent text whether they know the right answer or not.

A better way to think about it is: Does this output satisfy the contract you intended for your use case? If not, it’s unfit for production regardless of overall accuracy rates.
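
As a toy sketch of what that contract can look like in practice (the allowed-key list below is simplified by me, not an official S3 schema), even a tiny validation step catches the nonexistent setting described upthread before it ships:

    # Toy "contract" check on an LLM-suggested lifecycle rule.
    # ALLOWED_RULE_KEYS is a simplified, illustrative list, not the full S3 schema.
    ALLOWED_RULE_KEYS = {
        "ID", "Status", "Filter",
        "Expiration",                   # expire current versions
        "NoncurrentVersionExpiration",  # permanently delete noncurrent versions
        "AbortIncompleteMultipartUpload",
    }

    def contract_violations(rule: dict) -> list[str]:
        """Return a list of problems; an empty list means the rule passes."""
        problems = []
        unknown = set(rule) - ALLOWED_RULE_KEYS
        if unknown:
            problems.append(f"unknown settings: {sorted(unknown)}")
        if rule.get("Status") not in {"Enabled", "Disabled"}:
            problems.append("Status must be 'Enabled' or 'Disabled'")
        return problems

    # The hallucinated "expire noncurrent versions" setting fails the check:
    print(contract_violations(
        {"ID": "r1", "Status": "Enabled", "ExpireNoncurrentVersions": {"Days": 30}}
    ))
    # -> ["unknown settings: ['ExpireNoncurrentVersions']"]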



