Don’t have enough RAM for this model, however the smaller 20B model runs nice an...

tarruda · 2025-08-11T13:19:32 1754918372

It is fixed in this PR/branch: https://github.com/ggml-org/llama.cpp/pull/15181

codazoda · 2025-08-11T16:03:29 1754928209

I'm glad to see this was a bug of some sort and (hopefully) not a hard RAM limitation. I've used quite a few of these models on my MacBook Air with 16GB of RAM. I also have a plan to build an AI chat bot and host it from my bedroom on a $149 mini PC. I'll probably go much smaller than the 20B models for that. The Qwen3 4B model looks quite good.

https://joeldare.com/my_plan_to_build_an_ai_chat_bot_in_my_b...

tempotemporary · 2025-08-13T17:04:56 1755104696

What are your use cases? Wondering if it's good enough for coding / agentic stuff.