Don’t have enough RAM for this model, but the smaller 20B model runs nice and fast on my MacBook and is reasonably good for my use cases. Pity that function calling is still broken with llama.cpp.
I'm glad to see this was a bug of some sort and (hopefully) not a hard RAM limitation. I've used quite a few of these models on my MacBook Air with 16GB of RAM. I also plan to build an AI chatbot and host it from my bedroom on a $149 mini PC. I'll probably go much smaller than the 20B models for that. The Qwen3 4B model looks quite good.