> It is very easy to blurt out "well obviously we need twice the processing power!" but if we scale to twice the processing power, then start accepting twice the request rate – we will actually be serving each request in half the time we originally did.
I don't understand what you are saying. Are you talking about the time the request is buffered in some queue on average assuming they are arriving randomly? Or something like that?
Here is what I'm thinking. We are operating a hospital which does only one type of surgery, which lasts exactly an hour. (Presumably it is a veterinary practice for spherical cows.) A fully staffed operating room can operate on 24 spherical cows a day. If, due to a spherical cow jamboree, we expect more patients and set up a second operating theatre, we will be able to serve 48 of them a day. But we are still serving them for an hour each (because that is how long the operation takes).
Even if we are talking about latency: when 24 cows show up at the stroke of midnight to the one-operating-room hospital, they will each be served, on average, in 12.5 hours. Same average if 48 cows show up to the two-operating-room hospital.
So what am I thinking wrong here? Is "scale to twice the processing power" not the same as getting a second operating room? I'm not seeing where "we will actually be serving each request in half the time" comes from.
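Just to double-check my own arithmetic, here is a throwaway sketch (assuming all cows arrive at t=0 and every surgery takes exactly one hour):

```python
# Batch of cows arrives at midnight; rooms work through the backlog in order.
def avg_completion_hours(n_cows, n_rooms):
    # Cow i starts after i // n_rooms earlier rounds finish, then takes 1 hour itself.
    finish_times = [(i // n_rooms) + 1 for i in range(n_cows)]
    return sum(finish_times) / n_cows

print(avg_completion_hours(24, 1))  # 12.5 hours with one operating room
print(avg_completion_hours(48, 2))  # 12.5 hours with two operating rooms
```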
in queueing theory, you don't expect "operating rooms" to operate 24 hours per day - spherical patients may arrive with gaps, which leave the room idle for some time, but then a jamboree happens and it averages out
doubling the cow input doesn't mean each "burst" becomes twice as big - some of the new cows simply fall into periods that previously weren't used
thus the second wave of patients doesn't need a whole second copy of all the operating rooms - part of them get gobbled up by the idle timeslots of the resources that already exist
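here's a rough sketch of that meshing effect (made-up rates, Poisson arrivals, each cow occupying a room for an hour) - it asks how many rooms you'd need to cover 99% of moments, and doubling the load raises that number by noticeably less than 2x:

```python
import random

def rooms_for_p99(cows_per_hour, sim_hours=20_000, surgery_hours=1.0):
    """How many simultaneously busy rooms cover 99% of arrival instants,
    if cows arrive as a Poisson stream and each occupies a room for an hour."""
    t, busy_until, samples = 0.0, [], []
    while t < sim_hours:
        t += random.expovariate(cows_per_hour)               # next cow arrives
        busy_until = [end for end in busy_until if end > t]  # rooms still occupied
        busy_until.append(t + surgery_hours)                 # this cow takes a room
        samples.append(len(busy_until))
    samples.sort()
    return samples[int(0.99 * len(samples))]

print(rooms_for_p99(12))  # roughly 20 rooms at 12 cows/hour on average
print(rooms_for_p99(24))  # roughly 35 at double the load -- well short of 40
```

the averages do double, but the headroom above the average doesn't, and that headroom is exactly what the extra rooms are for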
(so it seems putting two stochastic processes on top of each other is not like putting two "solid" things on top of each other, right? intuitively they mesh, I guess their stacking height is their "expected value"?
and their worst case will be the sum of their worst cases, where there's no averaging, right? so again, intuitively, the larger a flow is the more dangerous it is, even if it's smooth, because if it backs up it fills queues/buffers faster, so to plan for extreme cases we still need "linear" thinking)
Kind of, yes. Stacking two stochastic processes simply adds up the expectation values but not the noise/dispersion/volatility. The variation adds as a sum of squares: standard deviations combine in quadrature, so the stacked process is relatively smoother than either part.
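A quick numeric illustration (made-up numbers, normal noise purely for convenience):

```python
import random, statistics

# Two independent "load per hour" processes, each with mean 10 and standard deviation 3.
a = [random.gauss(10, 3) for _ in range(100_000)]
b = [random.gauss(10, 3) for _ in range(100_000)]
stacked = [x + y for x, y in zip(a, b)]

print(statistics.mean(stacked))   # ~20: the expectation values simply add
print(statistics.stdev(stacked))  # ~4.24 = sqrt(3**2 + 3**2), not 6: the noise adds in quadrature
```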
If you push utilisation towards 1, what you essentially do is push the next "free" slot farther and farther into the future. That means you always buy higher utilisation with longer latency (at least in the upper bound). But the good thing is: if the numbers are large enough, the maximum latency grows more slowly with utilisation.
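To put numbers on buying utilisation with latency, assume the textbook single-server M/M/1 model (Poisson arrivals, exponentially distributed surgeries averaging an hour) - the model is an assumption, but the shape of the curve is the point:

```python
# Average time in the hospital (waiting + surgery) for one M/M/1 operating room
# with a 1-hour mean surgery, as utilisation creeps towards 1.
service_hours = 1.0
for utilisation in (0.5, 0.8, 0.9, 0.95, 0.99):
    time_in_system = service_hours / (1 - utilisation)  # W = (1/mu) / (1 - rho)
    print(f"utilisation {utilisation:.2f} -> {time_in_system:6.1f} hours per cow")
```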
> So what am I thinking wrong here? Is "scale to twice the processing power" not the same as getting a second operating room? I'm not seeing where "we will actually be serving each request in half the time" comes from.
Single Core vs Multi Core (ish).
With a single thread you must work twice as fast to handle the increased load, which also means each piece of work is done in half the time.
With multithreading you can shuffle two units of work out at the same time, so it's twice the load but the same per-unit speed.
To go back to the cow analogy: rather than adding more rooms (more threads), you give the surgeon a powersaw and they work twice as fast.
So if you scale up the processing power per thread, the time goes down; if you scale the processing power by adding threads (cores), the time stays the same (ish).
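Here is a small simulation sketch of the three options, assuming Poisson arrivals and exponential surgery times (the rates are made up; only the comparison matters): the original room under the original load, a powersaw room under double load, and two ordinary rooms under double load.

```python
import random

def avg_time_in_system(arrival_rate, service_rate, n_servers, n_jobs=200_000):
    """Average hours a cow spends in the hospital (queue + surgery) for a FIFO
    queue with Poisson arrivals and exponentially distributed surgeries."""
    t, total = 0.0, 0.0
    free_at = [0.0] * n_servers                    # when each room next frees up
    for _ in range(n_jobs):
        t += random.expovariate(arrival_rate)      # next cow arrives
        room = min(range(n_servers), key=lambda i: free_at[i])
        start = max(t, free_at[room])              # wait if that room is busy
        free_at[room] = start + random.expovariate(service_rate)
        total += free_at[room] - t                 # time from arrival to discharge
    return total / n_jobs

print(avg_time_in_system(0.75, 1.0, 1))  # original room, original load:    ~4 hours
print(avg_time_in_system(1.50, 2.0, 1))  # powersaw room, double load:      ~2 hours (half)
print(avg_time_in_system(1.50, 1.0, 2))  # two ordinary rooms, double load: ~2.3 hours
```

The faster single surgeon gives the lowest time per cow. The two-room setup keeps each surgery at an hour, though the shared queue still drops the total time below the original - which is why the "stays the same" is only (ish).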
> To go back to the cow analogy: rather than adding more rooms (more threads), you give the surgeon a powersaw and they work twice as fast.
I was thinking that maybe that is what we are talking about, but I convinced myself otherwise. Surely we don't need this much math to show that if we cut the processing time in half, the processing time will be cut in half. If that is all we are saying, I guess I can accept it as quite trivial.
Cut the processing time in half while doubling the load!
The average cow, pre-jamboree, spends one hour in the hospital, including time waiting for a slot in the OR. Then you give the surgeon a power saw that allows him to complete the job in half the time, but he also gets twice as many cows to work on.
Most people's intuition would tell them the cows would still spend an hour in the hospital (the doubling in work rate canceled by the doubling in work amount), but actually now it takes half an hour -- regardless of how swamped the surgeon was initially.
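In the single-server M/M/1 picture (an assumed model, but it's the standard one behind this claim) the time in the system is W = 1/(mu - lambda), so doubling both the service rate and the arrival rate halves W no matter how swamped you start out:

```python
# Hours in the hospital before and after giving the surgeon a powersaw (2x mu)
# AND doubling the cow arrivals (2x lambda), per the M/M/1 formula W = 1/(mu - lambda).
mu = 1.0                                    # one cow per hour without the saw
for rho in (0.5, 0.8, 0.95, 0.99):          # how swamped the surgeon is initially
    lam = rho * mu
    before = 1 / (mu - lam)
    after = 1 / (2 * mu - 2 * lam)          # = before / 2, always
    print(f"initial utilisation {rho:.2f}: {before:6.1f} h -> {after:6.1f} h")
```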