
Yeah, it only takes a small hallucination to delete emails instead of copying them.


Yes, you should look at the code before you run it, and don't run development code on production resources like email addresses.
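For example (a hypothetical sketch, not anyone's actual script, mailbox names and credentials are made up): one cheap guardrail before letting generated code near a real mailbox is to gate every write action behind a dry-run flag, so the first pass only reports what it would do.

    import imaplib

    def archive_inbox(host, user, password, dry_run=True):
        # Copy (not move or delete) messages from INBOX to "Archive".
        # With dry_run=True nothing is written; we only print the plan.
        conn = imaplib.IMAP4_SSL(host)
        conn.login(user, password)
        conn.select("INBOX")
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            if dry_run:
                print(f"would copy message {num.decode()} to Archive")
            else:
                conn.copy(num, "Archive")
        conn.logout()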


lol, indeed.

As long as the hallucination problem remains, I think we are going to see a significant hype bubble crash within a year or so.

Yes, it is still useful under proper guidance, but building reliable, fully automated systems on top of it doesn't seem to be within reach at present. More innovation will be required for that.


Never underestimate a hype bubble ;) The crypto one lasted for many years with far fewer use cases than LLMs, even accounting for hallucinations.


We don't even have to go to crypto. AI has had many boom/bust cycles. The term "AI Winter" dates back to the 80s!

Of course, at every cycle we get new tools. The thing is, once tools become mainstream, people stop referring to them as "AI". Take assistants (Amazon Echo, Cortana, Siri, etc). They are doing things that were active areas of "AI" research not long ago. Voice recognition and text to speech were very hard problems. Now people use them without remembering that they were once AI.

I predict that GPT will follow the same cycle. It's way too overhyped right now (because it's impressive, just like Dragon Naturally Speaking was). But people will try to apply it to everyday scenarios and, outside of niches, they will be disappointed. Cue crash as investments dry up.

Hopefully this time we won't have too many high profile casualties, like what happened with Lisp Machines.


I never upgraded to pro and I've spent like $2 on credits so far this month. I could easily see them hitting $1B/yr, which to me isn't a niche market or a hype bubble.


A large profitable business can still be overpriced or inflated by a bubble, like Cisco or Intel in the 90s.


My understanding is the base model is pretty good about knowing whether it knows stuff or not; it's the human-feedback training that causes it to lose that signal.


Do you have any references? I know of the emergent deception problem that seems to be created through feedback.

https://bounded-regret.ghost.io/emergent-deception-optimizat...


Base GPT-4 was highly calibrated; read OpenAI's technical report.

Also, this paper on GPT-4's performance on medical challenge problems confirmed the high calibration for medicine: https://arxiv.org/abs/2303.13375


Thanks, but I didn't find any details about performance before and after reinforcement training. I'm looking to understand more about the assertion that hallucinations are introduced by the reinforcement training.


https://arxiv.org/abs/2303.08774 The technical report has before-and-after comparisons. It's a bit worse on some tests, and they pretty explicitly discuss calibration (how well the model's stated confidence on a problem tracks its actual accuracy on that problem).
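To make "calibration" concrete, here's a minimal sketch (Python, with made-up numbers, not anything from the report): bin answers by the model's stated confidence and compare each bin's average confidence to its observed accuracy; the weighted gap is the expected calibration error.

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        # Bin predictions by stated confidence and compare each bin's
        # average confidence to its observed accuracy (ECE).
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if not mask.any():
                continue
            ece += mask.mean() * abs(confidences[mask].mean() - correct[mask].mean())
        return ece

    # Hypothetical data: a well-calibrated model's 80%-confidence answers
    # should be right about 80% of the time, giving an ECE near zero.
    conf = [0.9, 0.8, 0.8, 0.6, 0.95, 0.7]
    right = [1, 1, 0, 1, 1, 1]
    print(expected_calibration_error(conf, right))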

Hallucinations are a different matter.



