The method and math at a glance
- Randomness: the assignment of any one user to scenario A or B is purely random.
- Confidence: the certainty of a test's outcome, quantified using the p-value. For classic two-variant tests, a result is confident (statistically significant) when the p-value is below 0.05 (95% confidence). If you compare more than two variants, the significance threshold is stricter (lower than 0.05) so the test results stay reliable. To learn more, see Multi-variant testing.
- Mathematical formula: the same two-tailed test is applied to every comparison between the control variant and each test variant. Two-tailed tests detect differences in either direction (better or worse) without assuming which variant performs better.
- Relevance improvement: in practice you add variants because you hypothesize that one of them performs better than the control. Multi-variant tests let you try several ideas at once and show which, if any, outperforms the control.
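
To make the points above concrete, here is a minimal Python sketch of a two-tailed comparison between a control and one test variant. It rests on assumptions that this section does not confirm: it uses a pooled two-proportion z-test on conversion counts, and it tightens the threshold for multi-variant tests with a Bonferroni correction (0.05 divided by the number of test variants). The function names `two_tailed_p_value` and `significance_threshold` and all sample numbers are hypothetical; the actual statistic and correction used may differ.

```python
import math


def two_tailed_p_value(conv_control, n_control, conv_variant, n_variant):
    """Two-tailed p-value for a difference in conversion rates.

    Assumption: a pooled two-proportion z-test, chosen for illustration.
    """
    p_control = conv_control / n_control
    p_variant = conv_variant / n_variant
    pooled = (conv_control + conv_variant) / (n_control + n_variant)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_variant - p_control) / se
    # P(|Z| >= |z|) under a standard normal: erfc covers both tails at once,
    # so a variant that is better or worse by the same margin scores the same.
    return math.erfc(abs(z) / math.sqrt(2))


def significance_threshold(num_test_variants, alpha=0.05):
    """Stricter threshold when several variants are compared to the control.

    Assumption: Bonferroni correction (alpha split evenly across comparisons).
    """
    return alpha / num_test_variants


# Hypothetical data: (conversions, users) for the control and two test variants.
control = (100, 1000)
variants = {"B": (150, 1000), "C": (105, 1000)}

threshold = significance_threshold(len(variants))
for name, (conversions, users) in variants.items():
    p = two_tailed_p_value(control[0], control[1], conversions, users)
    verdict = "significant" if p < threshold else "not significant"
    print(f"Variant {name}: p-value = {p:.4f} ({verdict} at {threshold:.3f})")
```

Each test variant is compared with the control using the same function, which mirrors the "same test for every comparison" rule, and the two-tailed form means the sign of the difference does not affect the p-value.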