When I see this "How soon statistical significance (of p-value < 0.05) was d...

paraschopra · on June 1, 2012

It's not flawed. Wouldn't you assume that whatever bias (due to "peeking" at results) affected one algorithm would also affect the other? Also, to offset this effect somewhat, I waited until I had seen statistical significance at least 10 times (I know it's a heuristic). Moreover, I'm curious whether there's a theoretical result that compares the statistical power of randomization and MAB (multi-armed bandit).

Though, as a commenter said, comparing these two algorithms is actually like comparing apples to oranges, and that was precisely the conclusion.

rfergie · on June 1, 2012

All you are saying here is that you got it wrong in the same way for all parts of the test.

Which may remove the bias (I'm not sure about this) but it doesn't inspire confidence.
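
For readers unfamiliar with the "peeking" problem this exchange refers to, here is a minimal simulation sketch. It is not the blog post's code or method. It assumes an A/A test (both arms share the same true conversion rate, so any "significant" result is a false positive), a pooled two-proportion z-test, and a look at the data after every batch of visitors. The function names, the 10% rate, the batch size, and the number of looks are all illustrative assumptions, and the "significant at least 10 times" heuristic from the thread is not modeled.

```python
import math
import random


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): no evidence of a difference
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_test(rng, rate=0.1, batch=100, n_batches=20, alpha=0.05):
    """Run one A/A test and return (peeking_hit, fixed_horizon_hit).

    peeking_hit: p < alpha at ANY of the interim looks.
    fixed_horizon_hit: p < alpha at the single, final look.
    All default parameter values are illustrative assumptions.
    """
    conv_a = conv_b = n = 0
    peeking_hit = False
    p = 1.0
    for _ in range(n_batches):
        conv_a += sum(rng.random() < rate for _ in range(batch))
        conv_b += sum(rng.random() < rate for _ in range(batch))
        n += batch
        p = two_proportion_p_value(conv_a, n, conv_b, n)
        if p < alpha:
            peeking_hit = True
    return peeking_hit, p < alpha


def false_positive_rates(n_sims=1000, seed=0):
    """Estimate false-positive rates with and without peeking."""
    rng = random.Random(seed)
    results = [simulate_aa_test(rng) for _ in range(n_sims)]
    peek = sum(hit for hit, _ in results) / n_sims
    fixed = sum(hit for _, hit in results) / n_sims
    return peek, fixed


if __name__ == "__main__":
    peek, fixed = false_positive_rates()
    print(f"false-positive rate when stopping at any significant look: {peek:.3f}")
    print(f"false-positive rate with one look at the end:             {fixed:.3f}")
```

Because a significant final look is also counted as a significant interim look, the peeking rate in this sketch can never be lower than the fixed-horizon rate. The sketch says nothing about whether the inflation is the same for a randomized split and a bandit allocation, which is the open question in the comments above.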

paraschopra · on June 1, 2012

Yes, I agree it is not a rigorous statistical analysis, but I tried to do a fair comparison (based on assumptions/heuristics clearly described in the post). The blog post claims only what is described there.