Agreed. I had it solve a small programming problem in a really obscure programming language and, after some prompt tuning, got results clearly better than those from GPT4, Claude, Llama3 and Mixtral. Since the language (which I won't name here) is acceptably documented but has _really_ few examples available online, this seems to indicate very good generalization and reasoning capabilities.