
I see your point. But isn't that only true for two-tailed tests? This sort of ordering problem seems more suited to a one-tailed test. Let's say, for example, that we represent the bins of the probability distribution as the number of B's we can obtain (0, 1, 2, or 3).

In this case I am defining the statistical significance of our test ordering (AAABBB) as anything less than or equal to 0.05 of our probability distribution - which corresponds to the 0 Bs bin. This corresponds to a one-sided confidence interval of 95%.

What am I doing wrong?



A one-sided test is only valid if you know that it's not possible to get a result in the other direction. In the general case, you don't know for certain that B can't be faster than A. One-sided tests are almost always invalid because it's very difficult to know that the other direction is impossible.

If this doesn't make sense, I would recommend running simulations under the null hypothesis. You will see that 5% of the time you will falsely conclude that A < B and that in another 5% of the time you will conclude that B < A, leading to an overall false positive rate of 10%.
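A rough sketch of such a simulation, for anyone who wants to try it. It assumes the setup implied by the thread: three timings each for A and B, sorted from fastest to slowest, with "significant" meaning a perfectly separated ordering. The uniform timing distribution, the trial count, and the function name `simulate_null` are my own illustrative choices, not something from the original comments.

```python
import random


def simulate_null(trials=100_000, seed=0):
    """Estimate how often each extreme ordering appears when A and B are identical."""
    rng = random.Random(seed)
    a_first = 0  # orderings that look like "A is faster" (AAABBB)
    b_first = 0  # orderings that look like "B is faster" (BBBAAA)
    for _ in range(trials):
        # Null hypothesis: all six timings come from the same distribution
        # (uniform here, purely as an assumption for illustration).
        runs = [(rng.random(), "A") for _ in range(3)]
        runs += [(rng.random(), "B") for _ in range(3)]
        ordering = "".join(label for _, label in sorted(runs))
        if ordering == "AAABBB":
            a_first += 1
        elif ordering == "BBBAAA":
            b_first += 1
    return a_first / trials, b_first / trials


rate_a, rate_b = simulate_null()
print(f"AAABBB rate: {rate_a:.3f}")
print(f"BBBAAA rate: {rate_b:.3f}")
print(f"either extreme: {rate_a + rate_b:.3f}")
```

Each extreme should show up in roughly 5% of trials, so a rule that accepts whichever direction happens to appear ends up with a false positive rate of about 10%.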


I think I'm finally starting to see the heart of the problem now.

> You will see that 5% of the time you will falsely conclude that A < B and that in another 5% of the time you will conclude that B < A, leading to an overall false positive rate of 10%.

From my perspective, this makes perfect sense, since I know there is a 1/20 chance of getting all three A's first (AAABBB) based on the combinations formula. Where I think I am having trouble is understanding why the p-value needs to be derived to account for both of these extremes.

Earlier you wrote:

> The main key to remember when deriving a p value is that you are computing the probability of seeing an outcome at least as extreme given the null hypothesis, not the probability of seeing your particular outcome.

This is where the confusion is coming from for me. Based on this definition of the p-value, everything you say makes sense. But I don't understand what the point of deriving a p-value this way is, or what it tells us. And to clarify, I'm not saying you're wrong, more that I just need to read up on p-value derivation to understand this.
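For reference, the 1/20 figure and the "at least as extreme" reading can both be checked by enumerating every ordering. This is only a sketch under the assumption used throughout the thread (three A's and three B's, every ordering equally likely under the null); the variable names are made up for illustration.

```python
from itertools import combinations
from math import comb

# Combinations formula: ways to choose which 3 of the 6 positions hold an A.
n_orderings = comb(6, 3)

# Build every ordering explicitly so the probabilities can be counted directly.
orderings = []
for a_positions in combinations(range(6), 3):
    slots = ["B"] * 6
    for i in a_positions:
        slots[i] = "A"
    orderings.append("".join(slots))

# Probability of the single observed ordering (the one-sided view).
p_one_sided = sum(o == "AAABBB" for o in orderings) / len(orderings)

# Probability of any ordering at least as extreme in either direction.
p_two_sided = sum(o in ("AAABBB", "BBBAAA") for o in orderings) / len(orderings)

print(n_orderings)   # 20
print(p_one_sided)   # 0.05
print(p_two_sided)   # 0.1
```

The two-sided number counts BBBAAA as well, because under the null it is exactly as surprising as AAABBB.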



