This feels like Gigerenzer has read my mind. Two weeks back I blogged about a si...

mjburgess · on March 3, 2021

I think the only reason this works is that it's a power law. A power law gives you a semi-principled way of talking about different scales: atomic, table, house, world, galaxy... The rate of the power increase corresponds to our "magnification factor"

Probably many sequences of powers works, and many bases: base^0.5, base^1, base^1.5, base^2, ...

eg., 4^0.5, 4^1, 4^1.5, 4^2, 4^2.5 = [2, 4, 8, 16, 32]

If the estimate is 3days, then the scope is: [3, 6, 12, 24, 48, 96]

Now we have a "semi-principled" set of numbers to choose from, the estimator applies their domain knowledge to select the relevant scales: 3 to 24 say, based on "how big the project could be".

ie., is it both possibly a small change in one area (to a table), and possibly "reworking the foundations of the house" -- if so, you have the small-scale starter number, and you can dial up using this sequence to whatever magnification suits.

Probably with base pi, and sequence (0.5, 1, 2) we capture the three most relevant scales for most properly discrete software tasks. That's essentially conincidental, in that i'm sure a (2 to 4) base with several sequences would wokr.

ToJans · on March 3, 2021

Update: looks like my assumptions might be completely wrong... Please ignore my initial post, and I will check if/when I can provide an update to the blog post, potentially discarding the sqrt PI idea.

_v7gu · on March 3, 2021

Okay, I'm stumped. Isn't the gaussian function a probability density function, which means it should have an area of 1 by definition? Are you taking it as f(x) = exp(-x^2)? To keep f(0) equal to 1?

em500 · on March 3, 2021

You're stumped because most of the "statistics primer" section in that post doesn't make any sense. The connection between the Gaussian density and the sqrt(pi) heuristic is mostly imaginary. The original heuristic (pi, sqrt(pi), pi^2) works pretty much the same with 3 instead of pi, so you can view the pi versions as numerology or charitably a nice mnemonic.

ToJans · on March 3, 2021

Could you elaborate? Assume we convert all our happy path estimates to minutes. What I'm saying is that each "estimated minute" is more likely to have a gaussian distribution than a uniform standard distribution,because normal distribution is more likely to occur in nature.

while I can understand that this is a controversial assumption, I'm not the first one to make it. Referring to numerology seems a bit odd?

I'm really looking for a proper way here, I'm quite a rational person, so numerology is not really my cup of tea TBH.

em500 · on March 3, 2021

Your blog post has so many errors that I don't even know where to start. As another poster mentioned, areas under non-degenerate probability density functions are 1 by definition, whether they're uniform, Gaussian, standardized or not. What you described as a "standard uniform distribution" is really a degenerate distribution[1], meaning that you assume no uncertainty at all (stdev=0). There's nothing "uniform" about that, you might just as well start with a Gaussian with stdev=0.

"converting from a standard uniform distribution to a Gaussian distribution" as you described does not make any sense at all. If you replace an initial assumption of a degenerate distribution with a Gaussian, as you seem to be doing, you replace a no-uncertainty (stdev=0) assumption with some uncertainty (so the uncertainty blow-up is infinite), but it doesn't affect point estimates such as the mean or median, unless you make separate assumptions about that. There is nothing in your story that leads to multiplying some initial time estimate by sqrt(pi). The only tenuous connection with sqrt(pi) in the whole story is that the Gaussian integral happens to be sqrt(pi). There are some deep mathematical reasons for that, which has to do with polar coordinates transformations. But it has nothing to do with adjusting uncertainties or best estimates.

[1] https://en.wikipedia.org/wiki/Degenerate_distribution

ToJans · on March 3, 2021

Thank you for your valuable feedback; it will take some time to process, and I will adjust the blog post as my insights grow (potentially discarding the whole idea, but for me it's a learning process.)

yorwba · on March 3, 2021

> What I'm saying is that each "estimated minute" is more likely to have a gaussian distribution

If that were your actual assumption, you should measure the variance of the difference between your estimate and the actual time taken, use that to determine a confidence interval (e.g. 95% of the time, the additional delay is less than x) and then add it to your estimate.

ToJans · on March 3, 2021

Addendum: this is what I was referring to: https://en.wikipedia.org/wiki/Gaussian_integral

ToJans · on March 3, 2021

My apologies, I'm by no means well versed in math, so I might use the wrong terminology.

Here is a good example video of what I am alluding to: https://www.youtube.com/watch?v=9CgOthUUdw4

Maybe I should refine my blog post or include the video in the explanation?

sumtechguy · on March 3, 2021

You are the second person I have come across that uses PI to get software estimates. "I am not sure why but it works, except when it doesnt then I know exactly why"

What I found was most people over/under estimate things. They also tend to do it consistently at the same rate. Typically they have a scaling factor you can use. Around 3 seems to be the sweet spot for most people. You could just as easily use 3.1 or 3.2 and get a similar answer. I usually go for 3 because I can do that in my head without a calculator.

ToJans · on March 6, 2021

**** I stand corrected and marked that section from the blog post as BS; please accept my apologies. ****