
A/B Test Calculator

Analyze A/B test results by calculating conversion rate differences, statistical significance, and confidence intervals. Essential for website optimization, marketing campaigns, and product experiments in data-driven decision making.

Reviewed by Chase Floied

This free online A/B test calculator provides instant results with no signup required. All calculations run directly in your browser; your data is never sent to a server. Enter your values below and see results update in real time as you type. Perfect for everyday calculations, homework, or professional use.

Inputs:

  • Control visitors: number of visitors in the control (original) group.
  • Control conversions: number of conversions (purchases, clicks, signups) in the control group.
  • Variant visitors: number of visitors in the treatment (variant) group.
  • Variant conversions: number of conversions in the treatment group.

How to Use This Calculator

1

Enter your input values

Fill in all four input fields: visitors and conversions for both the control group and the treatment (variant) group. All inputs are whole-number counts, so no unit selection is involved.

2

Review your inputs

Double-check that all values are correct. Each conversion count must not exceed its group's visitor count, and make sure the control and variant figures are not swapped -- transposed inputs are the most common source of error and can reverse the apparent winner.

3

Read the results

The A/B Test Calculator instantly computes the conversion rates, relative lift, and Z-score, with each result clearly labeled. All calculations happen in your browser, with no loading time and no data sent to a server.

4

Explore parameter sensitivity

Try adjusting individual input values to see how the output changes. This is a quick and effective way to develop intuition about how different parameters influence the result and to identify which inputs have the largest effect.


When to Use This Calculator

  • Use the A/B Test Calculator when you need accurate results quickly without the risk of manual computation errors.
  • Use it to verify calculations made by hand or in spreadsheets — an independent check can catch errors before they lead to costly decisions.
  • Use it to explore how changing input parameters affects the output — a quick way to develop intuition and identify the most influential variables.
  • Use it when collaborating with others to ensure everyone is working from the same numbers and applying the same assumptions.

About This Calculator

The A/B Test Calculator is a free, browser-based calculation tool for marketers, analysts, engineers, and students. It analyzes A/B test results by calculating conversion rate differences, statistical significance, and confidence intervals, which makes it useful for website optimization, marketing campaigns, and product experiments. It implements the standard two-proportion Z-test, and all calculations are performed instantly in your browser with no data sent to a server. Use this calculator as a quick reference and sanity-check tool during experiment design and analysis, and verify results against standard statistical references before committing to high-stakes decisions.

About A/B Test Calculator

The A/B test calculator evaluates the statistical significance of differences between two versions of a webpage, email, ad, or product feature. A/B testing (also called split testing) is the gold standard for data-driven optimization, used by companies worldwide to make evidence-based decisions about design changes, pricing, copy, and user experience. This calculator computes conversion rates for both variants, the relative lift (percentage improvement), and a Z-score for a two-proportion hypothesis test. By comparing the Z-score to critical values, you can determine whether the observed difference is statistically significant or could be explained by random chance. Proper A/B testing requires adequate sample sizes, random assignment, and patience to reach significance.

The Math Behind It

A/B testing is a randomized controlled experiment applied to online optimization. Visitors are randomly assigned to either the control (A) or treatment (B) group, and a key metric (conversion rate) is compared between groups. The statistical foundation is a two-proportion Z-test, which tests whether the difference in proportions p_B - p_A is significantly different from zero. The test statistic uses the pooled proportion under the null hypothesis (no difference). For a two-tailed test at alpha = 0.05, |Z| > 1.96 indicates significance. The standard error shrinks with larger sample sizes, so higher traffic always reduces the uncertainty; for low baseline rates, a higher conversion rate also reduces the relative uncertainty.

The required sample size per variant can be estimated as n = 2 * (z_alpha/2 + z_beta)^2 * p(1-p) / (MDE)^2, where MDE is the minimum detectable effect expressed as an absolute difference in rates and p is the baseline conversion rate. For a 5% baseline and 10% relative MDE (detecting an increase to 5.5%), you need about 31,000 visitors per group for 80% power.

Common pitfalls include peeking at results before reaching the planned sample size (which inflates false positive rates), running multiple tests without correcting for multiplicity, and testing changes that are too small to detect with available traffic. Sequential testing methods and Bayesian A/B testing offer alternatives that allow ongoing monitoring without inflating error rates.
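The sample-size estimate above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the calculator's actual code: it assumes a two-sided alpha of 0.05 and 80% power, and uses the average of the baseline and target rates in the variance term (one common convention; using the baseline rate alone gives a slightly smaller n).

```python
from statistics import NormalDist

def sample_size_per_group(p_base, mde_rel, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion test.

    p_base  -- baseline conversion rate (e.g. 0.05 for 5%)
    mde_rel -- minimum detectable effect, relative (e.g. 0.10 for +10%)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # 0.84 for 80% power
    p_target = p_base * (1 + mde_rel)
    p_bar = (p_base + p_target) / 2      # average rate in the variance term
    mde_abs = p_base * mde_rel           # absolute difference to detect
    n = 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / mde_abs ** 2
    return round(n)

# 5% baseline, 10% relative MDE (5.0% -> 5.5%)
print(sample_size_per_group(0.05, 0.10))  # about 31,000 per group
```

This reproduces the roughly 31,000 visitors per group quoted above for a 5% baseline and 10% relative MDE.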

Formula Reference

Two-Proportion Z-Test

Z = (p_B - p_A) / sqrt(p_pool * (1-p_pool) * (1/n_A + 1/n_B))

Variables: p_A, p_B = conversion rates; p_pool = pooled rate; n_A, n_B = sample sizes
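As a sketch, the formula above translates directly into Python with only the standard library (the function name and structure here are illustrative, not the calculator's actual code):

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion Z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-tailed
    return z, p_value

# Numbers from Example 1 below: 200/5,000 vs. 240/5,000
z, p = two_proportion_ztest(200, 5000, 240, 5000)
print(round(z, 2), round(p, 3))  # Z = 1.95, p = 0.051
```

Running it on Example 1's numbers gives Z of about 1.95 and p of about 0.051, matching the worked steps.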

Worked Examples

Example 1: Landing page conversion test

Control: 5,000 visitors, 200 conversions. Variant: 5,000 visitors, 240 conversions.

Step 1: Rate A = 200/5000 = 0.040 (4.0%).
Step 2: Rate B = 240/5000 = 0.048 (4.8%).
Step 3: Lift = (0.048 - 0.040) / 0.040 * 100 = 20%.
Step 4: Pooled rate = 440/10000 = 0.044.
Step 5: SE = sqrt(0.044 * 0.956 * (1/5000 + 1/5000)) = sqrt(0.0000168) = 0.00410.
Step 6: Z = (0.048 - 0.040) / 0.00410 = 1.95.

Z = 1.95, just below 1.96. The 20% lift is not quite significant at the 5% level (p = 0.051). Consider running the test longer.

Example 2: Email subject line test

Control: 10,000 sends, 2,100 opens. Variant: 10,000 sends, 2,350 opens.

Step 1: Rate A = 21.0%, Rate B = 23.5%.
Step 2: Lift = (23.5 - 21.0) / 21.0 * 100 = 11.9%.
Step 3: Pooled rate = 4450/20000 = 22.25%.
Step 4: SE = sqrt(0.2225 * 0.7775 * 2/10000) = 0.00588.
Step 5: Z = (0.235 - 0.210) / 0.00588 = 4.25.

Z = 4.25 (p < 0.0001). The 11.9% lift in open rate is highly statistically significant.
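The calculator also reports confidence intervals; the exact method is not shown above, so as an assumption this sketch uses the standard Wald interval for the difference in proportions with the unpooled standard error:

```python
from math import sqrt
from statistics import NormalDist

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald confidence interval for p_B - p_A (unpooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    z_crit = NormalDist().inv_cdf((1 + level) / 2)  # 1.96 for 95%
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se

# Numbers from Example 2: 2,100/10,000 opens vs. 2,350/10,000 opens
lo, hi = diff_confidence_interval(2100, 10000, 2350, 10000)
print(f"95% CI for the difference: [{lo:.4f}, {hi:.4f}]")
```

For Example 2 this gives roughly [0.013, 0.037] for the absolute difference in open rates; the interval excludes zero, consistent with the highly significant Z-score.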

Common Mistakes & Tips

  • Peeking at results before reaching the planned sample size -- this inflates the false positive rate. Use sequential testing methods if you need to monitor ongoing results.
  • Stopping the test as soon as significance is reached -- significance can fluctuate, and early significant results often revert. Run for the full planned duration.
  • Testing too many variants without adjusting for multiple comparisons -- with 20 variants, you expect one to appear significant by chance at alpha = 0.05.
  • Ignoring practical significance -- a statistically significant 0.1% lift may not justify the engineering cost of implementing the change.
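The multiple-comparison pitfall above is easy to quantify: if each of k independent variants is tested at alpha = 0.05, the chance that at least one shows a spurious "significant" result is 1 - 0.95^k. A minimal sketch:

```python
# Probability of at least one false positive across k independent tests
alpha = 0.05
for k in (1, 5, 10, 20):
    p_any = 1 - (1 - alpha) ** k
    print(f"{k:2d} variants: {p_any:.0%} chance of a spurious winner")
```

With 20 variants the chance of at least one false positive is about 64%, which is why corrections such as Bonferroni are needed when testing many variants at once.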


Frequently Asked Questions

How long should I run an A/B test?

Run the test until you reach the pre-calculated sample size for your desired power and minimum detectable effect. As a minimum, run for at least one full business cycle (typically one week) to capture day-of-week effects. Never stop early just because results look significant -- this leads to inflated false positive rates.

What is a good conversion rate lift to test for?

The minimum detectable effect (MDE) depends on your baseline conversion rate and traffic. For high-traffic sites, you can detect lifts as small as 1-2%. For lower-traffic sites, target 10-20% relative lifts. Be realistic -- most individual changes produce lifts of 1-5%. Major redesigns may produce larger lifts but also carry more risk.
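To make "depends on your baseline conversion rate and traffic" concrete, the sample-size formula from The Math Behind It can be inverted to estimate the smallest relative lift a given amount of traffic can detect. This is a sketch under stated assumptions: two-sided alpha = 0.05, 80% power, and the baseline rate alone in the variance term.

```python
from math import sqrt
from statistics import NormalDist

def detectable_relative_lift(p_base, n_per_group, alpha=0.05, power=0.80):
    """Smallest relative lift detectable with n visitors per variant."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # ~2.80
    mde_abs = z_total * sqrt(2 * p_base * (1 - p_base) / n_per_group)
    return mde_abs / p_base  # convert absolute MDE to relative lift

# 5% baseline conversion rate at several traffic levels
for n in (5_000, 30_000, 100_000):
    lift = detectable_relative_lift(0.05, n)
    print(f"{n:>7,} visitors/group: ~{lift:.0%} relative lift")
```

At a 5% baseline, roughly 30,000 visitors per group can detect a 10% relative lift, while 5,000 visitors can only detect lifts above about 24%, which is why low-traffic sites must target larger effects.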

Should I use Bayesian or frequentist A/B testing?

Frequentist testing (Z-test/chi-squared) is the traditional approach and requires a fixed sample size determined in advance. Bayesian testing provides a probability that B is better than A, allows continuous monitoring, and handles small samples better. Bayesian is increasingly popular for its intuitive interpretation, but both methods give similar conclusions with adequate sample sizes.