
This is really cool, but I'd love to see this narrowed down to something super scientific. Cause I can already hear the haters, screaming about control groups, sample sizes, and all that stuff. Which is actually important.

Right now I take it for what it is: a quick mock-up of some shallow data. But it seems compelling enough to warrant a Nate Silver ( http://fivethirtyeight.com/ ) style deep dive.



I’d love suggestions about how to make this more scientific. I think the biggest potential pitfall is the variability related to the stock size. My approach is pretty simple right now. I just took all of the companies from the Fortune 1000 that had female CEOs since 2002, and bought their stock when they were women-led, and sold it when they no longer were.

From an investment perspective, I think the best thing to do next would be to make sure I have the right benchmark. Ideally the Fortune 1000 would be the best one to use, but I need the historical Fortune 1000 companies for the last 12 years... that will take some manual work to pull together.

At some point, though, the control group is an academic exercise. If the strategy makes money - invest.


In particular, your purchases aren't cap-weighted. You simply divide your portfolio evenly among women-led companies. So aside from biases that might exist for women-led companies (they have to work harder, so they're better performers; or, they tend to be more highly represented in consumer cyclical, etc.), you're also "over-weighting" small-cap equities, which have done quite well relative to the S&P 500 (all large cap) over the period[0].

I would be interested in a breakdown of other factors, but I expect very little variance actually derives solely from the gender of leadership once you account for other conventional biasing factors of the chosen index.

[0]: http://performance.morningstar.com/funds/etf/total-returns.a... (Compare to S&P500 TR)
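
To make the weighting point concrete, here is a minimal Python sketch comparing an equal-weighted and a cap-weighted return for the same holdings. The tickers, market caps, and returns are made up for illustration and are not taken from the original analysis.

```python
# Hypothetical holdings: (ticker, market cap in $bn, one-period return).
# All numbers are invented for illustration only.
holdings = [
    ("BIG", 200.0, 0.05),
    ("MID", 20.0, 0.10),
    ("SML", 2.0, 0.30),
]


def equal_weighted_return(holdings):
    """Portfolio return when every holding gets the same share of capital."""
    return sum(ret for _, _, ret in holdings) / len(holdings)


def cap_weighted_return(holdings):
    """Portfolio return when each holding is weighted by market capitalization."""
    total_cap = sum(cap for _, cap, _ in holdings)
    return sum(cap / total_cap * ret for _, cap, ret in holdings)


ew = equal_weighted_return(holdings)
cw = cap_weighted_return(holdings)
print(f"equal-weighted: {ew:.4f}")
print(f"cap-weighted:   {cw:.4f}")
```

In this toy case the small-cap holding dominates the equal-weighted result, so comparing an equal-weighted portfolio against a cap-weighted benchmark can show a gap that has nothing to do with who runs the companies.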


Calculating four-factor model alphas (http://en.wikipedia.org/wiki/Carhart_four-factor_model) would be a good starting point.
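
As a rough sketch of what that could look like: the four-factor model regresses a portfolio's excess return on the market, size (SMB), value (HML), and momentum (UMD) factors, and the intercept is the alpha. The Python below uses only the standard library and runs on synthetic data; the function names and all numbers are assumptions for illustration, and a real analysis would plug in actual factor returns and the strategy's own return series.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def four_factor_fit(excess_returns, factors):
    """Ordinary least squares of excess returns on (mkt, smb, hml, umd).

    Returns [alpha, b_mkt, b_smb, b_hml, b_umd].
    """
    rows = [[1.0, *f] for f in factors]  # leading 1.0 is the intercept (alpha)
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_returns)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic monthly data: 120 months of made-up factor returns.
rng = random.Random(0)
factors = [tuple(rng.gauss(0.0, 0.04) for _ in range(4)) for _ in range(120)]
true_coefs = [0.002, 1.1, 0.4, -0.2, 0.1]  # assumed alpha and loadings
excess = [
    true_coefs[0]
    + sum(b * f for b, f in zip(true_coefs[1:], row))
    + rng.gauss(0.0, 0.01)  # idiosyncratic noise
    for row in factors
]

alpha, b_mkt, b_smb, b_hml, b_umd = four_factor_fit(excess, factors)
print(f"alpha={alpha:.4f} mkt={b_mkt:.2f} smb={b_smb:.2f} hml={b_hml:.2f} umd={b_umd:.2f}")
```

If the estimated alpha stays clearly positive after the size loading is accounted for, the small-cap tilt alone does not explain the outperformance.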


It's not a sample, it's a census.

sample_size == population_size


This comment was downvoted into 0 territory. It's actually a very important point that people often miss when discussing things... you don't do "population statistics" when you're getting data from every member of the population. So in this case, "sample size" is irrelevant.

Many other elements of the analysis can be, and are being, challenged, which is fine to discuss, but there's no need to discuss "sampling" issues, as there is no sampling going on.
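
One way to see this numerically is the finite population correction: under simple random sampling without replacement, the estimated standard error of a mean is scaled by sqrt(1 - n/N), which is exactly zero when n equals N. A minimal Python sketch, with made-up return figures and a hypothetical helper name:

```python
import math
import statistics


def standard_error_of_mean(sample, population_size):
    """Estimated standard error of the sample mean under simple random
    sampling without replacement, with the finite population correction."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    fpc = math.sqrt(1.0 - n / population_size)
    return s / math.sqrt(n) * fpc


# Invented returns for five companies, for illustration only.
returns = [0.12, -0.03, 0.08, 0.20, 0.01]

# Treated as a sample of 5 drawn from 500 companies: sampling error is nonzero.
print(standard_error_of_mean(returns, population_size=500))

# Treated as a census (all 5 of 5 companies): sampling error is zero.
print(standard_error_of_mean(returns, population_size=5))
```

This only covers sampling error. It says nothing about the other objections in the thread, such as the choice of benchmark or the weighting scheme.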



