I'm not entirely sure if the mental model is somehow a layer deeper than the prediction that humans do. I used to believe it, and still use it as shorthand, but these days I'm not sure it's accurate.
The triangle example doesn't prove it, because our predictive model could also say "hey, things don't just change color and shape like that, they need to move". It's similar to how an LLM can be more accurate at math when asked to step through and reason about its logic - by working through the individual steps it can build up a larger system.
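Roughly what I mean, as a toy sketch (the ask() helper and the exact prompts are made up, just to show the shape of the difference):

    # Hypothetical helper: sends a prompt to some LLM and returns its text reply.
    def ask(prompt: str) -> str:
        ...

    # One shot: the model has to land on "408" in more or less a single guess.
    ask("What is 17 * 24? Answer with just the number.")

    # Step by step: each intermediate token constrains the next one, so the final
    # answer gets conditioned on 17*20 = 340 and 17*4 = 68 along the way.
    ask("What is 17 * 24? Work through it step by step, then give the final answer.")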
The thing that made me question whether a "mental model" is at the base of human cognition: people who do those memory competitions. The clear winning strategy is the memory palace - imagining walking through a house where each room holds another number - and they have to build the memory step by step; it's not like an SQL database where they can just SELECT any item at random. Another one was the insight from GTD that if you're remembering 7 things to do, you're constantly repeating those 7 things to yourself to keep them in active memory.
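To make the SQL analogy concrete (the table and values here are invented, it's just an illustration of random access):

    import sqlite3

    # A database can pull any item directly by its position,
    # no sequential walk-through of everything before it.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE memorized (position INTEGER PRIMARY KEY, digit INTEGER)")
    db.executemany("INSERT INTO memorized VALUES (?, ?)",
                   [(i, (i * 7) % 10) for i in range(1, 1001)])

    # Jump straight to position 500, unlike a memory palace
    # where you'd have to walk the rooms to get there.
    print(db.execute("SELECT digit FROM memorized WHERE position = 500").fetchone())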
There's a strong argument that the mental model is derivative of a predictive model in the human brain, and that we only appear to have a mental model because we have an internal dialogue running so fast in the background that even we rarely notice it. (Anyone who has kept up a steady meditation or similar practice will be familiar with this.)
Well, I can't say any of this for sure, but I want to say upfront: I think LLMs can in theory do a lot (not sure if most) of the computations a human can do. It's important to realize, imo, that they're not actually stepping through the steps the way humans are. When we give complicated step-by-step prompts and so on, it only means we're creating new constraints on the probability of the next token (drawn from what exists in the data/model). If the data/model doesn't contain what's needed to produce the desired result, or the data it was trained on doesn't have examples that generalize to (without being specific to) the desired result, then it can't produce it.
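A crude way to picture it - this is a toy bigram model, nothing like a real LLM, but the mechanism of "the prompt just conditions the next-token distribution, and missing data means no output" is the same in spirit:

    from collections import Counter, defaultdict

    # Toy stand-in for "the data/model": it only knows what appears in its training text.
    training_text = "the cat sat on the mat . the dog sat on the rug .".split()

    next_counts = defaultdict(Counter)
    for prev, nxt in zip(training_text, training_text[1:]):
        next_counts[prev][nxt] += 1

    def next_token_probs(context_word):
        counts = next_counts[context_word]
        total = sum(counts.values())
        return {tok: c / total for tok, c in counts.items()}

    # The "prompt" (here just the last word) constrains the distribution:
    print(next_token_probs("the"))   # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
    print(next_token_probs("sat"))   # {'on': 1.0}

    # And if the training data never covered something, there's nothing to draw from:
    print(next_token_probs("zebra")) # {}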
That's the difference between humans and LLMs, imo. We can generalize any "computation" we have to any other "desired output" we want just by thinking about it, while LLMs aren't, at least not now, general enough to use low-level representations of all the 'objects' we can prompt about. Humans can reason about the objects and things in our minds almost infinitely and recursively while retaining all the physical realities and facts of those objects, while an LLM is limited in this regard. That doesn't mean it can't in theory - there is some weird generalization going on as far as I can tell - but it feels like it's going to need a lot more data or something to get there.