Power calculator

Calculate the sample size your experiments need to reach a target power

Having enough power in your A/B test requires a large enough sample size.

Power is the probability that a test correctly rejects a false null hypothesis, i.e., that an A/B test is sensitive enough to detect a true effect when there is one. To calculate the sample size needed for a given power, we need several inputs: baseline conversion rate, minimum detectable effect, A/B split ratio, significance, and power.
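As a rough illustration of how these inputs combine, the sketch below uses the standard normal-approximation formula for comparing two proportions. It is not necessarily the formula this calculator uses. The function name `sample_size`, its parameter names, and the choice to treat the minimum detectable effect as a relative lift over the baseline are all assumptions made for the example.

```python
from math import ceil
from statistics import NormalDist


def sample_size(baseline, mde, alpha=0.05, power=0.8, split=0.5, one_sided=True):
    """Return (test_size, control_size) for a two-proportion z-test.

    Assumptions (not taken from the calculator itself): `mde` is a relative
    lift over `baseline`, `split` is the fraction of users in the test group,
    and the unpooled normal approximation is used.
    """
    p_control = baseline
    p_test = baseline * (1 + mde)
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    # Variance of the difference in rates, scaled by the total sample size.
    variance = (p_control * (1 - p_control) / (1 - split)
                + p_test * (1 - p_test) / split)
    total = (z_alpha + z_beta) ** 2 * variance / (p_test - p_control) ** 2
    return ceil(total * split), ceil(total * (1 - split))


test_size, control_size = sample_size(baseline=0.10, mde=0.20)
print(test_size, control_size, test_size + control_size)
```

Each argument maps to one of the inputs listed above, so changing one at a time shows how it moves the required sample size.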

How to use this calculator:

Determine your baseline conversion rate

This is the current conversion rate of your control group. In an A/B test, the baseline conversion rate is the expected rate of conversion (or other desirable outcome) in the control group, i.e., among users who are not exposed to the new experience.

Choose a minimum detectable effect

This is the smallest difference that can be consistently detected in an experiment. In an A/B test, this is the minimum change in the desirable outcome that you’d want to be able to detect.
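To get a feel for how strongly this choice drives sample size, the sketch below uses a simplified normal approximation in which the required sample grows with the inverse square of the effect. The helper `users_per_group` is hypothetical, takes the effect as an absolute difference in conversion rate, and assumes a 50/50 split, a one-sided test, and the same variance in both groups.

```python
from statistics import NormalDist


def users_per_group(baseline, absolute_mde, alpha=0.05, power=0.8):
    """Approximate per-group size for a 50/50, one-sided test.

    Simplification: both groups are assumed to share the baseline variance.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(power)
    return 2 * z ** 2 * baseline * (1 - baseline) / absolute_mde ** 2


for mde in (0.04, 0.02, 0.01):
    print(f"MDE {mde:.2f}: ~{users_per_group(0.10, mde):,.0f} users per group")
```

Under this approximation, halving the minimum detectable effect quadruples the number of users needed.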

Input your values

The calculator outputs the minimum sample size needed to detect your minimum detectable effect at the power level that you choose. Choosing a higher power means a lower rate of false negatives, but it also requires a correspondingly larger sample.
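The cost of extra power can be sketched with the same normal approximation, where sample size is proportional to the square of the sum of the two critical values. The function `relative_sample_size` is a hypothetical helper, and the one-sided test and 0.80 reference power are assumptions for the example.

```python
from statistics import NormalDist


def relative_sample_size(power, reference_power=0.8, alpha=0.05):
    """Sample size at `power` relative to `reference_power` (one-sided test)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)

    def factor(p):
        # Sample size is proportional to (z_alpha + z_beta)^2.
        return (z_alpha + std_normal.inv_cdf(p)) ** 2

    return factor(power) / factor(reference_power)


for power in (0.65, 0.80, 0.90, 0.95):
    print(f"power {power:.2f}: {relative_sample_size(power):.2f}x the sample of power 0.80")
```

The ratio does not depend on the baseline or the effect size, only on the significance level and the two power levels being compared.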

The calculator starts with sensible defaults, but you can adjust the advanced settings to see how they impact your results.

Hypothesis

If you want to determine whether a single test variation is better than the control, use a one-sided test (recommended). If you want to determine whether it is different from the control in either direction, use a two-sided test.
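The practical difference between the two hypotheses is the critical value the test statistic must exceed. The short sketch below compares them using the standard normal distribution from Python's standard library. The alpha of 0.05 and power of 0.8 are example values, not settings read from the calculator.

```python
from statistics import NormalDist

alpha, power = 0.05, 0.8
std_normal = NormalDist()

z_one_sided = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
z_beta = std_normal.inv_cdf(power)

# Sample size scales with (z_alpha + z_beta)^2 in the normal approximation.
extra = ((z_two_sided + z_beta) / (z_one_sided + z_beta)) ** 2

print(f"one-sided critical value: {z_one_sided:.3f}")
print(f"two-sided critical value: {z_two_sided:.3f}")
print(f"two-sided test needs about {extra:.2f}x the sample")
```

Because the two-sided test splits alpha across both tails, it sets a higher bar and therefore needs more users for the same power.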

A/B split ratio

Most A/B tests are conducted with a 50%/50% split across test and control users (represented as an input of 0.5 in this calculator), but this can be tuned to your own experimental design.
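One reason the 50%/50% split is so common is that, for a fixed total, it gives the most precise comparison. The sketch below estimates how much larger the total sample must be for an uneven split to reach the same power. `split_inflation` is a hypothetical helper, and it assumes both groups have the same outcome variance.

```python
def split_inflation(test_fraction):
    """Total sample size relative to a 50/50 split, for the same power.

    Assumes equal outcome variance in both groups, so the variance of the
    difference is proportional to 1 / (test_fraction * (1 - test_fraction)).
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    return 0.25 / (test_fraction * (1 - test_fraction))


for fraction in (0.3, 0.5, 0.7):
    print(f"{fraction:.0%} in test: {split_inflation(fraction):.2f}x the 50/50 total")
```

The factor is symmetric, so putting 30% or 70% of users in the test group costs the same amount of extra traffic under this assumption.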

Significance (α)

Alpha is the probability that a statistically significant difference is detected when one does not exist (a false positive). 0.05 is a common default for alpha. Choosing a lower value reduces the chance that an observed difference is due to chance alone, but requires a larger sample size. Choosing a higher value requires fewer samples, but raises the false positive rate.
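One way to see what alpha means in practice is to simulate A/A tests, where both groups share the same true conversion rate, and count how often a test is declared significant anyway. The simulation below is a minimal sketch: the function `false_positive_rate`, the pooled one-sided z-test, and the group size, conversion rate, and number of runs are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def false_positive_rate(alpha=0.05, rate=0.10, n=500, runs=2000, seed=0):
    """Fraction of simulated A/A tests that a one-sided z-test calls significant."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)
    hits = 0
    for _ in range(runs):
        # Both groups share the same true rate, so any "win" is a false positive.
        control = sum(rng.random() < rate for _ in range(n))
        test = sum(rng.random() < rate for _ in range(n))
        pooled = (control + test) / (2 * n)
        se = sqrt(pooled * (1 - pooled) * 2 / n)
        if se > 0 and (test - control) / n / se > critical:
            hits += 1
    return hits / runs


print(f"observed false positive rate: {false_positive_rate():.3f}")
```

The observed rate should land close to the chosen alpha, with some simulation noise.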

Statistical Power (1 - β)

As shared above, statistical power is the probability that the minimum detectable effect will be detected, assuming it exists. If you’d like to calculate a minimum detectable effect or A/B test duration automatically based on your data each time you run a test, sign up for Statsig!
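The relationship can also be read in the other direction: given a fixed number of users, how much power does the test have? The sketch below computes this for a one-sided two-proportion z-test with equal group sizes, using the normal approximation. `achieved_power` and the example rates of 10% and 12% are hypothetical and are not drawn from the calculator.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(n_per_group, p_control, p_test, alpha=0.05):
    """Power of a one-sided two-proportion z-test with equal group sizes."""
    std_normal = NormalDist()
    # Standard error of the difference in conversion rates (unpooled).
    se = sqrt((p_control * (1 - p_control) + p_test * (1 - p_test)) / n_per_group)
    z_alpha = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(abs(p_test - p_control) / se - z_alpha)


for n in (1000, 2000, 3000, 4000):
    print(f"{n} users per group -> power {achieved_power(n, 0.10, 0.12):.2f}")
```

Power rises with sample size but with diminishing returns, which is why targets such as 0.8 are a common compromise.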
