So the thing I always ctrl-F for, to see if a paper or course really knows what it's talking about, is called the “multi-armed bandit” problem. Just ctrl-F “bandit”: if an A/B tutorial is long enough, it will usually mention them.
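
For anyone who hasn't met bandits before, here is a minimal sketch of one of the simplest strategies, epsilon-greedy. Instead of splitting traffic evenly for a fixed period, it mostly shows whichever variant has looked best so far and explores at random a small fraction of the time. The conversion rates, the 10% exploration rate, and the function name are all made up for illustration; this is not taken from any of the links in this comment.

```python
import random

def epsilon_greedy(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over variants with hidden conversion rates."""
    rng = random.Random(seed)          # seeded so the simulation is repeatable
    pulls = [0] * len(true_rates)      # how often each variant was shown
    wins = [0] * len(true_rates)       # how often each variant converted
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick a variant uniformly at random.
            arm = rng.randrange(len(true_rates))
        else:
            # Exploit: pick the variant with the best observed rate so far
            # (unseen variants count as 0.0, ties go to the lowest index).
            arm = max(
                range(len(true_rates)),
                key=lambda a: wins[a] / pulls[a] if pulls[a] else 0.0,
            )
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins

# Hypothetical variants: A converts 5% of the time, B converts 10%.
pulls, wins = epsilon_greedy([0.05, 0.10])
print("times shown:", pulls)
print("conversions:", wins)
```

The contrast with a classic A/B test is that the traffic split is not fixed up front; it drifts toward whatever is converting, which is exactly why it takes some judgment to decide when that is appropriate.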

This is not a foolproof method; I'd call it only ±5 dB of evidence. That would shift a 50% chance that they know what they're talking about to roughly 75% if present or 25% if absent, but obviously look at the rest of it and see if that's borne out. And to be clear: even mentioning it just to dismiss it counts!
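
If the decibel arithmetic looks unfamiliar: I'm assuming the usual convention where evidence in dB is 10·log10 of the odds ratio, so +5 dB multiplies the odds by 10^0.5 ≈ 3.16. A quick sketch (the function name is just something I made up):

```python
def shift_probability(prior: float, decibels: float) -> float:
    """Update a probability by some number of decibels of evidence."""
    prior_odds = prior / (1.0 - prior)
    # 10 dB is a factor of 10 in odds, so d dB is a factor of 10**(d/10).
    posterior_odds = prior_odds * 10 ** (decibels / 10.0)
    return posterior_odds / (1.0 + posterior_odds)

present = shift_probability(0.5, +5)   # "bandit" appears in the article
absent = shift_probability(0.5, -5)    # long article, never mentions it
print(f"{present:.2f} {absent:.2f}")   # prints "0.76 0.24"
```

So “75% or 25%” is a slight rounding of 76% and 24%.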

So e.g. I remember reading a whitepaper about “A/B Tests are Leading You Astray” and thinking “hey, that's a fun idea; yeah, effect size is too often accidentally conditioned on whether the result was judged statistically significant, which would be a source of bias” ...and sure enough a sentence came up, just innocently, like, “you might even have a bandit algorithm! But you had to use your judgment to discern that that was appropriate in context.” And it’s like “OK, you know about bandits but you are explicitly interested in human discernment and human decision making, great.” So, +5 dB to you.

And on the flip side, if it makes reference to A/B testing and is decently long but never mentions bandits, then there's only maybe a 25% chance they know what they are talking about. It can still happen: you might see e.g. χ² instead of the t-test [because usually you don't have just “converted” vs “did not convert”... can your analytics grab “thought about it for more than 10s but did not convert” etc.?] or something else that piques interest. Or it's a very short article where it just didn't come up, but that's fine because we are, when reading, performing a secret cost-benefit analysis, and short articles have very low cost.
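
To make the χ² remark concrete, here is a rough sketch of Pearson's χ² test of independence on a variant-by-outcome table with three outcomes instead of two. The counts are invented, and I'm sticking to the standard library; the p-value shortcut only works because a table this shape has exactly 2 degrees of freedom, where the χ² tail probability is exp(-x/2).

```python
import math

# Hypothetical counts. Rows: variant A, variant B.
# Columns: converted, lingered >10s without converting, left quickly.
observed = [
    [120, 300, 580],
    [150, 310, 540],
]

def chi_squared(table):
    """Return Pearson's chi-squared statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if variant and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

stat, dof = chi_squared(observed)
# Closed-form tail probability, valid only for 2 degrees of freedom.
p_value = math.exp(-stat / 2) if dof == 2 else None
print(f"chi2={stat:.3f} dof={dof} p={p_value}")
```

For other table shapes you'd want a proper χ² distribution function from a stats library rather than this shortcut.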

For a non-technical thing you can give to your coworkers, consider https://medium.com/jonathans-musings/ab-testing-101-5576de64...

Researching this comment led me to this video, which looks interesting and which I’ll need to watch later, about how you have to pin down the time needed to properly make the choices in A/B testing: https://youtu.be/Fs8mTrkNpfM?si=ghsOgDEpp43yRmd8

Some more academic-looking discussions of bandit algorithms that I can't vouch for personally, but would be my first stops:

- https://courses.cs.washington.edu/courses/cse599i/21wi/resou...
- https://tor-lattimore.com/downloads/book/book.pdf
- http://proceedings.mlr.press/v35/kaufmann14.pdf