Any resources you can share for these experimental builds? This is something I was looking into setting up at some point. I'd love to take a look at examples in the wild to gauge if it's worth my time / money.
An aside, if we ever reach a point where it's possible to run an OSS 20b model at reasonable inference on a Macbook Pro type of form factor, then the future is definitely here!
It's possible the future is now.. assuming you have an M series with enough RAM. My sense is that you need ~1gb of RAM for every 1b paramters, so 32gb should in theory work here. I think macs also get a performance boost over other hardware due to unified memory.
Spit balling aside, I'm in the same boat, saving my money, waiting for the right time. If it isn't viable already its damn close.
It seems like the ecosystem around these tools has matured quite rapidly. I am somewhat familiar with Open WebUI, however, the last time I played around with it, I got the sense that it was merely a front-end to Ollama, the llm command line tool & it didn't have any capabilities outside of that.
I got spooked when the Ollama team started monetizing so I ended up doing more research into llama.cpp and realized it could do everything I wanted including serve up a web front end. Once I discovered this I sort of lost interest in Open WebUI.
I'll have to revisit all these tools again to see what's possible in the current moment.
> My sense is that you need ~1gb of RAM for every 1b paramters, so 32gb should in theory work here. I think macs also get a performance boost over other hardware due to unified memory.
This is a handy heuristic to work with, and the links you sent will keep me busy for the next little while. Thanks!
An aside, if we ever reach a point where it's possible to run an OSS 20b model at reasonable inference on a Macbook Pro type of form factor, then the future is definitely here!