Welyft calculator
Discover the statistical confidence calculator for your A/B tests

Frequently asked questions about statistical confidence in A/B testing
Why use a statistical confidence calculator?
A statistical confidence calculator enables you to assess whether the difference observed between two variants (for example, in an A/B test) is statistically significant or simply due to chance. It helps you make data-driven decisions by identifying the best-performing version with a quantified level of certainty.
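Under the hood, such a calculator typically runs something like a two-proportion z-test. Here is a minimal sketch of that computation in Python, assuming a frequentist approach; the helper name and the visitor and conversion counts are illustrative, not Welyft's actual implementation:

```python
from statistics import NormalDist

def confidence_level(visitors_a, conversions_a, visitors_b, conversions_b):
    """Return the confidence (1 - two-sided p-value) that A and B differ."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled conversion rate under the null hypothesis "no difference"
    p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = (p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test
    return 1 - p_value

# Illustrative data: 5,000 visitors per variant, 4.0% vs 4.8% conversion
print(f"{confidence_level(5000, 200, 5000, 240):.1%}")
```

On this example data the sketch reports a confidence of roughly 95%, i.e. a difference this large would rarely arise by chance alone.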
Does 95% confidence mean that I have a 95% chance of being right?
Not exactly, and the precise reading depends on the method used. In a frequentist test, 95% confidence means that if there were truly no difference between the variants, a result at least this extreme would occur by chance less than 5% of the time. In a Bayesian test, it can be read more directly as a 95% probability that one variant outperforms the other. In either case, the higher the percentage, the more confidence you can have in the result.
Can I stop my test as soon as I have a good level of confidence?
Yes, but you have to be careful. Stopping a test as soon as the confidence threshold is first crossed (often called "peeking") inflates the risk of a false positive, especially while the data volume is still low. Ideally, combine a good level of confidence with a pre-planned, sufficient volume of data to guarantee the robustness of your decision.
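To see why repeated peeking is risky, here is a hedged simulation sketch (arbitrary traffic numbers and a 95% threshold, not a description of Welyft's methodology). It runs A/A tests, where no real difference exists, and compares the false-positive rate of a single final look against checking after every batch of visitors:

```python
import random
from statistics import NormalDist

def z_confidence(n_a, c_a, n_b, c_b):
    # Confidence = 1 - two-sided p-value of a pooled two-proportion z-test
    p = (c_a + c_b) / (n_a + n_b)
    se = (p * (1 - p) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 0.0
    z = abs(c_b / n_b - c_a / n_a) / se
    return 2 * NormalDist().cdf(z) - 1

random.seed(42)
RATE, BATCH, BATCHES, RUNS = 0.05, 500, 10, 500  # both variants convert at 5%
final_hits = peek_hits = 0
for _ in range(RUNS):
    c_a = c_b = 0
    peeked = False
    for i in range(1, BATCHES + 1):
        c_a += sum(random.random() < RATE for _ in range(BATCH))
        c_b += sum(random.random() < RATE for _ in range(BATCH))
        if z_confidence(i * BATCH, c_a, i * BATCH, c_b) >= 0.95:
            peeked = True  # a peeker would have stopped and declared a winner here
    final_hits += z_confidence(BATCHES * BATCH, c_a, BATCHES * BATCH, c_b) >= 0.95
    peek_hits += peeked
print(f"False positives with one final look: {final_hits / RUNS:.1%}")   # ~5%
print(f"False positives when peeking 10 times: {peek_hits / RUNS:.1%}")  # noticeably higher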
Does this calculator work for all types of test?
This type of calculator is generally designed for classic A/B tests, where two variants are compared on a clear objective (e.g. conversion rate). For more complex tests (multi-variant, complete funnel, etc.), specific tools or more advanced analysis may be required.
What's the difference between a frequentist and a Bayesian test?
Frequentist: the classic method, which uses a p-value to tell you whether the observed difference could plausibly have happened by chance. The interpretation can be counter-intuitive.
Bayesian: a more modern method, which directly gives you the probability that one variant is better than the other.
Both approaches are valid, but Bayesian results are often easier for non-statisticians to interpret.
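As an illustration of the Bayesian reading, here is a minimal Monte Carlo sketch using a standard Beta-Binomial model with uniform priors (a common choice, not necessarily the one this calculator uses); the counts are the same illustrative ones as above:

```python
import random

def prob_b_beats_a(visitors_a, conv_a, visitors_b, conv_b, draws=100_000):
    wins = 0
    for _ in range(draws):
        # Posterior of each conversion rate is Beta(1 + conversions, 1 + non-conversions)
        rate_a = random.betavariate(1 + conv_a, 1 + visitors_a - conv_a)
        rate_b = random.betavariate(1 + conv_b, 1 + visitors_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# Same illustrative data: 4.0% vs 4.8% on 5,000 visitors each
print(f"P(B > A) ≈ {prob_b_beats_a(5000, 200, 5000, 240):.1%}")
```

On this data the sketch reports roughly a 97% probability that B beats A, a statement most stakeholders find easier to act on than a p-value.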
What is MDE (Minimum Detectable Effect) and why is it essential in A/B testing?
The MDE, or Minimum Detectable Effect, is the smallest improvement that your test can reliably detect. For example, if you set an MDE of 5%, this means that the test will be able to identify a difference of at least 5% between variants, if it really exists.
This parameter is crucial, as it determines the sample size required for your test. The smaller the MDE, the more visitors and conversions you'll need for the test to detect a statistically significant variation. Conversely, a higher MDE will shorten the test duration, but at the risk of missing out on small improvements.
Choosing the right MDE is therefore a balance between ambition and realism: aiming for effects that are too small can make the test unnecessarily long or uninterpretable, while an MDE that is too large risks missing out on interesting incremental gains. We recommend that you define your MDE according to your business objectives, the volume of traffic available, and the expected impact.
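To make the MDE/traffic trade-off concrete, here is a hedged sketch using the standard two-proportion sample-size formula at 95% confidence and 80% power (common defaults; the exact formula and parameters behind any given calculator may differ):

```python
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect a relative uplift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 at 80% power
    se_term = (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5
    return round(((z_alpha + z_beta) * se_term / (p2 - p1)) ** 2)

# Illustrative 4% baseline conversion rate, three candidate MDEs
for mde in (0.20, 0.10, 0.05):
    print(f"MDE {mde:.0%}: ~{sample_size_per_variant(0.04, mde):,} visitors per variant")
```

Note how halving the relative MDE roughly quadruples the traffic required per variant, which is why overly ambitious MDEs make tests drag on.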