It’s an absolute beast. I run it via OpenRouter, where I have Groq and Cerebras ...

jsheard · 2025-11-08T11:40:14 1762602014

Cheap enough for now, but of all the companies selling inference at a loss, Cerebras and Groq are probably losing the most per token. Their hardware is ungodly expensive and its reliance on huge amounts of SRAM bottlenecks how much cheaper it can get, since SRAM density is improving at a snails pace at this point.

rpdillon · 2025-11-08T13:01:28 1762606888

You're pointing out a bunch of high capex costs (hardware, SRAM), but then concluding that their opEx is greater than their revenue on a per unit basis. Are they really losing money on every token? It seems that using hardware acceleration would decrease inference costs and they could make it up on unit economics over time.

But I'm just reasoning from first principles. I don't have any specific data about them.

aurareturn · 2025-11-08T19:40:44 1762630844

  It seems that using hardware acceleration would decrease inference costs and they could make it up on unit economics over time.

Nvidia GPUs are accelerators too. The reason they can do this so fast is because they're storing entire models in SRAM.

rpdillon · 2025-11-09T13:24:27 1762694667

There are degrees of acceleration. My understanding, limited as it is, is that groq and cerebras are using highly optimized acceleration to achieve their token generation rates, far beyond that in a regular GPU, and this leads to lower costs per token.

Is this incorrect?

aurareturn · 2025-11-10T05:04:00 1762751040

Yes, they're called ASICs on Grog. But Cerebras has more general cores that can do more complex things. Inference is mostly limited by bandwidth though.

7thpower · 2025-11-08T15:53:27 1762617207

Switching costs are low, so if that happens we’ll just switch.

petesergeant · 2025-11-08T12:30:37 1762605037

Not doubting you but anything to back that up? Happy enough to burn VC money until someone shows up who can run it without losing money, either way.

rajman187 · 2025-11-08T12:48:05 1762606085

They’ve filed a S1 [1] last year when attempting to go public. It showed something like a $60M+ loss for the first 6 months of 2024. The IPO didn’t happen because the CEO’s past included some financial missteps and the banks didn’t want to deal with this. At the time the majority of their revenue came from a single source in Abu Dhabi, as well

[1] https://www.sec.gov/Archives/edgar/data/2021728/000162828024...

petesergeant · 2025-11-09T03:56:54 1762660614

> the majority of their revenue came from a single source in Abu Dhabi, as well

I live in UAE, whose continuing enthusiasm in AI investment stretches well beyond short-term profit, so having AD on-board seems like a plus not a minus. I'm sure there are specific exceptions, but generally Emirati money has seemed like smart money.