A/B Test Calculator

Measure statistical significance between conversion rates with an A/B testing calculator designed for UX and growth marketing professionals.

Calculator inputs: Sample Size and Conversions for Variation A (Control) and Variation B (Variation), a computed Conversion Rate for each, and the type of test you are running.

New Demo! Build A/B test plans with our GPT-4 tool and start an experiment knowledge base for your company.

AI-LED EXPERIMENT DESIGN AND TESTING

Watch: How to design an A/B test experiment plan with our GPT-4 integration

Prototypr.ai is an AI platform that helps you validate ideas with data and accelerate growth. Build test plans using GPT-4 Turbo, one of the world's most powerful language models. Create plans to measure changes to your web pages, UI, or creative, and track the results with our custom Google Analytics reporting integration.

BUILD A TEST PLAN WITH AI

A/B Testing FAQ

A/B testing is a method of comparing two or more versions of a web page or user experience against a controlled variation in order to determine which one performs better. Essentially, it is an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal. Popular examples of conversion goals include primary call-to-action clicks on a landing page or new user sign-ups to an app or website.

This A/B testing calculator helps you determine whether the difference in performance between the control group (A) and the variation group (B) is statistically significant. Enter a sample size and the number of conversions for each variation, then select the type of test (1-tailed or 2-tailed) you want to run; the calculator evaluates whether the observed difference in conversion rates between the control and variation is statistically significant at 95% confidence.
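
Under the hood, a calculator like this typically runs a two-proportion z-test with a pooled standard error. The following Python sketch shows that computation; the function name ab_test_significance and the example figures are illustrative assumptions, not necessarily the exact method this calculator implements:

```python
from statistics import NormalDist

def ab_test_significance(n_a, conv_a, n_b, conv_b, tails=2, confidence=0.95):
    """Two-proportion z-test with a pooled standard error (illustrative sketch)."""
    p_a = conv_a / n_a                          # control conversion rate
    p_b = conv_b / n_b                          # variation conversion rate
    p_pool = (conv_a + conv_b) / (n_a + n_b)    # pooled rate under the null
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    if tails == 1:
        # 1-tailed: only the predicted direction counts (here, B better than A)
        p_value = 1 - NormalDist().cdf(z)
    else:
        # 2-tailed: a difference in either direction counts
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < (1 - confidence)

# Example: 10,000 users per variation, 520 vs. 580 conversions
z, p, significant = ab_test_significance(10_000, 520, 10_000, 580)
print(f"z = {z:.3f}, p = {p:.4f}, significant at 95%: {significant}")
# z = 1.861, p = 0.0628, significant at 95%: False
```

The pooled standard error reflects the null hypothesis that both groups share a single underlying conversion rate, which is the standard choice for this kind of significance test.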

A 1-tailed test (also known as a one-sided test) is used when we have a specific hypothesis about the direction of the effect; that is, we predict that one group (usually the treatment) will perform better or worse than the other group (usually the control). In this case, we are only interested in finding evidence for an increase or a decrease, not just any difference.

Conversely, a 2-tailed test (or two-sided test) is employed when we do not have a specific prediction regarding which group will perform better. The hypothesis is simply that there is a difference between the two groups, and the test assesses whether the treatment group's conversion rate is either significantly higher or lower than that of the control group. With a 2-tailed test, we are open to finding evidence of a significant effect in either direction.

This distinction affects how results are interpreted. In a 1-tailed test predicting that the treatment will outperform the control, a negative test statistic (indicating the control group performed better) can never yield a significant result, no matter how large its magnitude. In a 2-tailed test, however, the same result could still indicate a significant difference between groups, just in the opposite direction to what was expected. The decision on which test to use must be made before conducting the A/B test and must not be changed after seeing the results, as doing so would invalidate the test's statistical assumptions and could lead to incorrect conclusions.
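
Reusing the sketch above on the same illustrative data shows why the choice matters: a borderline result can clear the 1-tailed threshold while failing the 2-tailed one.

```python
# Same illustrative data as above; only the test type changes
_, p1, sig1 = ab_test_significance(10_000, 520, 10_000, 580, tails=1)
_, p2, sig2 = ab_test_significance(10_000, 520, 10_000, 580, tails=2)
print(f"1-tailed: p = {p1:.4f}, significant: {sig1}")  # p = 0.0314, True
print(f"2-tailed: p = {p2:.4f}, significant: {sig2}")  # p = 0.0628, False
```

This is precisely why the test type must be fixed in advance: switching to a 1-tailed test after seeing a borderline 2-tailed result inflates the false-positive rate.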

When running an A/B test, several factors should be considered to ensure the results are valid and actionable:

  • Sample Size: Determining the appropriate number of participants needed to detect a statistically significant difference (a standard power calculation is sketched after this list).
  • Randomization: Randomly assigning users to the control or variant group to minimize selection bias.
  • Duration: Running the test long enough to reach the required sample size.
  • Measurement: Ensuring all user experiences are properly tracked with a tool such as Google Analytics.
  • Segmentation: Considering different user segments, as the effect might be significant for certain groups but not for others.
  • Statistical Significance: Choosing the confidence level (typically 95% or 99%) used to decide when the results are statistically significant. This calculator defaults to 95% confidence.
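
As noted in the Sample Size bullet, here is a minimal sketch of the standard power calculation for comparing two proportions (normal approximation). The function name sample_size_per_variation, the 80% default power, and the example baseline are illustrative assumptions:

```python
from statistics import NormalDist

def sample_size_per_variation(p_base, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per variation for a two-tailed
    two-proportion test (normal approximation, illustrative sketch)."""
    p_var = p_base * (1 + relative_lift)        # expected rate in the variation
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    n = (z_alpha + z_power) ** 2 * variance / (p_var - p_base) ** 2
    return int(n) + 1

# Example: 5% baseline rate, detecting a 10% relative lift at 95% confidence
print(sample_size_per_variation(0.05, 0.10))  # ~31,231 users per variation
```

Stopping a test early because the results merely look significant, rather than running it until the precomputed sample size is reached, is a common source of false positives.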

The best way to record and share the results of an A/B test is to present the findings in a clear and comprehensive experiment document. Key elements of a good testing document include:

  • The test's objective and hypothesis.
  • A detailed description of the experiment design, including the test length, sample-size considerations, and a definition of the conversion rate being measured.
  • A description of the control and variation(s), with any creative that illustrates the differences between variations.
  • A section for key findings from the test.
  • Raw numbers such as sample size, conversions, conversion rate, and the observed difference.
  • An interpretation of the results and recommendations for next steps, such as implementing the change permanently or conducting further tests.

Prototypr.ai has a complete solution for recording and sharing your experiment designs. It's free to try, so sign up today and start capturing your learnings and sharing them with your team!