
This is exactly the reason OpenAI isn't afraid of the open-source community, as many kneejerk opponents of regulatory capture assume (they are probably still afraid of Google). It's also why they still do the expensive and cumbersome RLHF training instead of those deceptively cheap and fast finetunes. They understand their own tech, and why there's no free lunch.

Recently, John Schulman explained the issue with behavior cloning, and it's a very typical ML problem.[1] Basically: what are we training the model to do? The model updates after finetuning in a holistic manner, based on the sum total of its content and capability. Suppose GPT-4 can correctly answer many requests because it knows the correct answers, in the sense that it has something isomorphic to an internal knowledge graph plus tools for querying it, and that graph contains sufficient data for those tools to derive an answer at inference time. RLHF reinforces this behavior by constraining the distribution of outputs (essentially, steering the model away from applying inappropriate tools to given inputs, e.g. employing fantasy-narrative or bad-yahoo-answers cognitive routines when asked something that looks like a straightforward factual question).
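A minimal sketch of the distinction, in toy PyTorch (the names, shapes, and reward signal here are hypothetical illustrations, not OpenAI's actual pipeline):

    import torch
    import torch.nn.functional as F

    def sft_loss(model, input_ids, target_ids):
        # Behavior cloning: maximize the likelihood of demonstrated tokens,
        # whether or not the model can derive those answers itself.
        logits = model(input_ids)  # (batch, seq, vocab)
        return F.cross_entropy(logits.view(-1, logits.size(-1)),
                               target_ids.view(-1))

    def rlhf_loss(model, input_ids, sampled_ids, reward):
        # REINFORCE-style update: sample from the model's *own* distribution
        # and reweight by a reward (e.g. from a learned preference model).
        # This reshapes a distribution the model already has, instead of
        # imprinting someone else's answers onto it.
        logp = F.log_softmax(model(input_ids), dim=-1)
        logp_sampled = logp.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
        return -(reward * logp_sampled.sum(dim=-1)).mean()

The SFT gradient pulls toward the demonstrator's tokens unconditionally; the RL gradient only upweights continuations the model itself already produces.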

Now suppose you teach LLaMA-13B to imitate those responses by SFTing it on a dump of successful GPT-4 conversations. But LLaMA doesn't have the internals that would have enabled it to find the same answers; so on the object level it shallowly memorizes specific items of the post-training dataset, and on the meta level it learns the stylistic flourishes of a high-powered model. Then it starts to hallucinate confident nonsense whenever you step out of the training distribution, because it never actually learned to query its own knowledge graph. A little anthropomorphism won't hurt: this way you create an incapable impostor, a wannabe nerd, a character used to guessing the teacher's password and being praised instead of understanding the subject, who keeps raising its hand whenever a question is asked but is painfully clueless.
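One crude way to make this visible (a hypothetical probe, not something from the talk): compare the clone's per-token loss on held-out questions resembling its finetuning set against questions outside it.

    import torch.nn.functional as F

    def mean_nll(model, tokenizer, qa_pairs):
        # Average next-token NLL of the answer tokens given the prompt, for a
        # Hugging Face-style causal LM (tokenizer boundary effects ignored).
        # A cloned student tends to score well in-distribution and fall
        # apart out of it.
        total = 0.0
        for prompt, answer in qa_pairs:
            n_prompt = tokenizer(prompt, return_tensors="pt").input_ids.size(1)
            ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
            logits = model(ids).logits[:, :-1]
            total += F.cross_entropy(
                logits[:, n_prompt - 1:].reshape(-1, logits.size(-1)),
                ids[:, n_prompt:].reshape(-1)).item()
        return total / len(qa_pairs)

    # gap = mean_nll(student, tok, ood_pairs) - mean_nll(student, tok, iid_pairs)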

Indeed, the early and cheap success of behavior cloning was a massive red flag in itself. There's no way all the compute and data that went into training GPT-3/3.5/4-tier models can be substituted with gently demonstrating the attitude vector. If we had models that were markedly less capable but comparably honest, we'd have reason to hope that this line terminates in a genuine open-source peer competitor; instead, we have total fraud.

It is a nontrivial task to have a model generalize epistemic honesty from external examples, rather than some lower-order behavior like clamming up, kowtowing, or bullshitting: train it to say "I don't know" whenever it actually doesn't know, but only then.
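One concrete version (a toy sketch in the spirit of the linked talk; the scoring is made up): penalize wrong answers more than abstentions, so honesty falls out of calibration rather than imitation.

    def abstention_reward(answer: str, gold: str) -> float:
        # Hypothetical scheme: +1 for a correct answer, 0 for abstaining,
        # -1 for a wrong one. Attempting an answer is then worth
        # p*(+1) + (1-p)*(-1) = 2p - 1 for a model that is right with
        # probability p, so the optimal policy says "I don't know"
        # exactly when p < 0.5.
        if answer.strip().lower() == "i don't know":
            return 0.0
        return 1.0 if answer.strip() == gold.strip() else -1.0

Note that this only works if the model's internal p is calibrated to begin with, which is precisely what the cloned student lacks.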

There are clever approaches here, but they're not such low-hanging fruit as what passes for open source right now.

1. https://youtu.be/hhiLw5Q_UFg?t=685



Before the internet I used to laugh my ass off at school friends who would wonder/debate about something but never bother to look it up; instead they had elaborate collective "hallucinations": they imagined the facts until they were satisfied their answer was well reasoned enough, then they would consider it a fact. We all do this at times (at all ages), but one must learn to shut down that train of thought and stop polluting one's memory.

I remember one from very early in life. I postulated out loud that Jerusalem, being the birthplace of Jesus, must be the most peaceful place on earth. All those loving and caring religious people who work so hard to be good people. That their religions are slightly different shouldn't matter to Jesus' message?

That LLMs can consume such huge amounts of data doesn't mean they've matured beyond that rather infantile mindset.

In the video you linked, he explains that training the model to say it doesn't know will trigger false negatives: it learns to withhold answers it actually could have given.

The correct formula, I imagine (hah!), is for the model to wonder whether the question is of interest to it and to ask someone else for the answer, or for some help figuring out the question. The human will just have to wait.

What is completely hilarious to me is that we all have heads full of learned answers for which we have no idea "why it is so", or at the very least lack whatever it is that would let one arrive at that solution. I get what Archimedes realized in the bathtub, but what I want is the mechanism by which he arrived at such wild ideas. Could it be that learning a lot of facts is exactly the opposite kind of conditioning?

My mind now says this must be why we humans expire so fast. You keep the calcified brains around for a while as data storage, but the focus of project humanity must be to create young ones. I will have to ponder this fact-free line of reasoning some more. Perhaps I will find ways to convince myself I know something.

It is a fun thought that people created AI; we really want to believe we did. If enough of us pretend it is true, no one can take it away from us.

If you want people to think you are intelligent, you tell them things they already know and hide your sources.


Humans don't expire very quickly.

Most humans are going to outlast whatever they produced during their lives. If anything, human bodies are among the most durable "goods" in the economy. Only real estate, public infrastructure and recorded knowledge (including genes) last longer than a human lifetime. How many of the things you buy and own are going to outlast you?


I was referring to the age distribution. If having a smaller percentage of young people gave us an evolutionary edge, it would have happened(?) Say, the ratio of rebellious exploration vs. applied knowledge. What is ideal?

All the goods we produce are designed to last for a specific time. We can easily make them more durable, and with some serious effort they could last longer than we can imagine. It would be expensive, and it might be beneficial, but who wants to pay for benefits 100 or 200 years into the future?



