A/B Testing Calculator

Optimize your conversion rates with our A/B testing calculator designed for UX & marketing pros.



Prototypr.ai is an AI platform that helps you validate ideas with data and accelerate growth. Build test plans using GPT-4 Turbo, one of the world's most powerful language models. Create plans to test changes to your web pages, UI, or creative, and measure the results with our custom Google Analytics reporting integration.


A/B Testing FAQ

A/B testing, also known as split testing, is a method of comparing two versions of a webpage or user experience against each other to determine which one performs better. Essentially, it is an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

This A/B testing calculator helps users determine whether the difference in performance between the control group (A) and the variation group (B) is statistically significant. Enter the sample size exposed to each variant and the number of conversions for each, then select the type of test (1-tailed or 2-tailed) you want to run. The calculator evaluates whether the observed difference in conversion rates between the control and the variation is statistically significant at 95% confidence.
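The computation behind a calculator like this can be sketched as a pooled two-proportion z-test. The sketch below is an assumption about the method (the page does not publish its internals), uses only the Python standard library, and the input numbers are purely illustrative:

```python
import math

def ab_test_z(conv_a, n_a, conv_b, n_b, two_tailed=True):
    """Pooled two-proportion z-test for an A/B test.

    conv_*: number of conversions; n_*: sample size exposed to each variant.
    Returns the z-statistic and the p-value.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se

    # Standard normal CDF via the error function (no SciPy needed)
    cdf = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    p_value = 2 * (1 - cdf(abs(z))) if two_tailed else 1 - cdf(z)
    return z, p_value

# Illustrative numbers: 4% vs 5% conversion on 5,000 users per variant
z, p = ab_test_z(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.3f}, p = {p:.4f}, significant at 95%: {p < 0.05}")
```

The result is significant when the p-value falls below 0.05, the threshold implied by 95% confidence.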

In hypothesis testing, specifically within the context of A/B tests, the choice between a 1-tailed and a 2-tailed test is crucial because it reflects our expectations and the directionality of the test. A 1-tailed test (also known as a one-sided test) is used when we have a specific hypothesis about the direction of the effect; that is, we predict that one group (usually the treatment) will perform better or worse than the other group (usually the control). In this case, we are only interested in finding evidence for an increase or a decrease, not just any difference.

Conversely, a 2-tailed test (or two-sided test) is employed when we do not have a specific prediction regarding which group will perform better. The hypothesis is simply that there is a difference between the two groups, and the test assesses whether the treatment group's conversion rate is either significantly higher or lower than that of the control group. With a 2-tailed test, we are open to finding evidence of a significant effect in either direction.

This distinction affects the interpretation of the test results. In a 1-tailed test, a negative t-statistic (indicating the control group performed better than the treatment group) would not be considered evidence against the hypothesis. In a 2-tailed test, however, this result could still indicate a significant difference between the groups, just in the opposite direction to what might have been expected. The decision on which test to use must be made before conducting the A/B test and should not be changed after seeing the results, as this would invalidate the test's statistical assumptions and could lead to incorrect conclusions.
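To make the distinction concrete, here is a minimal standard-library sketch (the z-value is hypothetical) showing how the same observed effect can clear the 95% bar in a 1-tailed test but not in a 2-tailed one:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = 1.80  # hypothetical z-statistic from an A/B test

p_one_tailed = 1 - normal_cdf(z)             # tests only "B is better than A"
p_two_tailed = 2 * (1 - normal_cdf(abs(z)))  # tests "B differs from A"

print(f"1-tailed p = {p_one_tailed:.4f}")  # ~0.036: significant at 95%
print(f"2-tailed p = {p_two_tailed:.4f}")  # ~0.072: NOT significant at 95%
```

This is exactly why the choice of test must be fixed up front: switching to a 1-tailed test after seeing a borderline result manufactures significance that the experiment design never supported.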

When running an A/B test, several factors should be considered to ensure the results are valid and actionable:

  • Sample Size: Determining the appropriate number of participants to detect a meaningful difference while avoiding false positives or negatives.
  • Randomization: Randomly assigning users to the control or variant group to minimize selection bias.
  • Duration: Running the test for a sufficient duration to account for business cycles and external factors.
  • Consistency: Ensuring the conditions under which the test is conducted are as consistent as possible between the groups.
  • Segmentation: Considering different user segments, as the effect might be significant for certain groups but not for others.
  • Statistical Significance: Choosing the confidence level (usually 95% or 99%) to determine when the results are statistically significant. This calculator defaults to 95% confidence.
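The sample-size point above can be sketched with the standard formula for a two-tailed, two-proportion test. This is a generic textbook approximation, not the calculator's own method, and the function name and example numbers are illustrative:

```python
import math

def sample_size_per_group(p_baseline, mde, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-tailed two-proportion test.

    p_baseline: control conversion rate (e.g. 0.04 for 4%)
    mde: minimum detectable effect, absolute (e.g. 0.01 for +1 point)
    """
    def z_quantile(q):
        # Inverse standard normal CDF via bisection on erf (stdlib only)
        lo, hi = -10.0, 10.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < q:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    p_variant = p_baseline + mde
    z_alpha = z_quantile(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_power = z_quantile(power)          # ~0.84 for 80% power
    variance = p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    return math.ceil((z_alpha + z_power) ** 2 * variance / mde ** 2)

# How many users per group to detect a lift from 4% to 5%?
print(sample_size_per_group(p_baseline=0.04, mde=0.01))
```

Smaller effects require dramatically larger samples: halving the minimum detectable effect roughly quadruples the sample size, which is why underpowered tests so often produce false negatives.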

The best way to record and share the results of an A/B test includes presenting the findings in a clear and comprehensive experiment document. Key elements of a good testing document include:

  • The test's objective and hypothesis
  • A detailed description of the experiment design, including the length of the test, sample size considerations, and a definition of the conversion rate being measured
  • A description of the control and variation(s), with any creative that shows the differences between variations
  • A section for key findings from the test
  • Raw numbers such as sample size, conversions, conversion rate, and the observed difference
  • An interpretation of the results and recommendations for next steps, such as implementing the change permanently or conducting further tests

prototypr.ai has a complete solution for recording and sharing your experiment designs. It's free to try, so sign up today and start recording your learnings and sharing them with your team!