I see your point. But isn't that only true for two-tailed tests? This sort of or...

lalaland1125 · on May 25, 2020

A one-sided test is only valid if you know that it's not possible to get a result in the other direction. In the general case you don't know for certain that B can't be faster than A. One-sided tests are almost always invalid because it's very difficult to know that the other direction is impossible.

If this doesn't make sense, I would recommend running simulations under the null hypothesis. You will see that 5% of the time you will falsely conclude that A < B and that in another 5% of the time you will conclude that B < A, leading to an overall false positive rate of 10%.
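A minimal sketch of such a simulation, under assumptions that are not from the comment above: 3 timings each for A and B, all drawn from the same uniform distribution (so the null hypothesis is true by construction), and a rank-based rule that declares a winner only when all three of one side's timings beat all three of the other's. The function name `simulate_null` and its parameters are hypothetical.

```python
import random


def simulate_null(trials=200_000, runs=3, seed=0):
    """Estimate how often each one-sided conclusion is reached when A and B
    are actually identical (the null hypothesis holds)."""
    rng = random.Random(seed)
    a_wins = 0  # falsely conclude A < B (A is faster)
    b_wins = 0  # falsely conclude B < A (B is faster)
    for _ in range(trials):
        # Assumption: both sets of timings come from the same distribution.
        a = [rng.random() for _ in range(runs)]
        b = [rng.random() for _ in range(runs)]
        if max(a) < min(b):      # every A timing is lower than every B timing
            a_wins += 1
        elif max(b) < min(a):    # every B timing is lower than every A timing
            b_wins += 1
    return a_wins / trials, b_wins / trials


rate_a, rate_b = simulate_null()
print(f"falsely concluded A < B: {rate_a:.3f}")
print(f"falsely concluded B < A: {rate_b:.3f}")
print(f"overall false positive rate: {rate_a + rate_b:.3f}")
```

With these settings each direction should come out at roughly 5%, so accepting a result in either direction gives roughly 10% overall.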

saeranv · on May 26, 2020

I think I'm finally starting to see the heart of the problem now.

> You will see that 5% of the time you will falsely conclude that A < B and that in another 5% of the time you will conclude that B < A, leading to an overall false positive rate of 10%.

From my perspective, this makes perfect sense, since I know there is a 1 / 20 chance of getting 3 AAAs based on the combinations formula. Where I am having trouble is understanding why the p-value needs to be derived to account for both of these extremes.
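For reference, a small sketch of where the 1 / 20 comes from, assuming the setup is 3 A runs and 3 B runs ranked from fastest to slowest (6 positions in total; that count is my assumption, chosen because it gives 20 orderings). The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Number of ways to choose which 3 of the 6 ranked positions belong to A.
orderings = comb(6, 3)

# Enumerate every placement of A and count the two extreme ones.
placements = list(combinations(range(6), 3))
all_a_first = sum(1 for p in placements if p == (0, 1, 2))  # AAABBB
all_a_last = sum(1 for p in placements if p == (3, 4, 5))   # BBBAAA

# Under the null hypothesis every placement is equally likely.
p_one_direction = Fraction(all_a_first, orderings)
p_both_directions = Fraction(all_a_first + all_a_last, orderings)

print(orderings, p_one_direction, p_both_directions)
```

Counting only AAABBB gives 1/20, while counting both AAABBB and BBBAAA as "at least as extreme" gives 2/20.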

Earlier you wrote:

> The main key to remember when deriving a p value is that you are computing the probability of seeing an outcome at least as extreme given the null hypothesis, not the probability of seeing your particular outcome.

This is where the confusion is coming from for me. Based on this definition of the p-value, everything you say makes sense. But I don't understand what the point of deriving a p-value this way is, or what it tells us. And to clarify, I'm not saying you're wrong, more that I just need to read up on p-value derivation to understand this.