Minimum Detectable Effect (MDE) Calculation

Minimum Detectable Effect (MDE) is the smallest difference between a control and a test group that your A/B test can reliably identify as statistically significant. It’s a critical concept because it determines how large a sample an experiment needs and helps in interpreting the results.

How MDE is calculated in experimental design

The MDE calculation depends on several key parameters:

Sample size (\(n\) for the test group, \(m\) for the control group).
Significance level (alpha, \(\alpha\)): This is the probability of a Type I error (falsely rejecting the null hypothesis), typically set at 0.05.
Statistical power (1-beta, \(1-\beta\)): This is the probability of correctly detecting an effect when one truly exists, commonly set at 0.8 (or 80%).
Variance (\(\sigma^2\)) of the metric being measured in the population, often estimated from historical data.
Ratio of control to test group sizes (\(k = m/n\)).

Figure: control and test distributions, with the critical values marked and the alpha, beta, and power regions shaded.
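The significance level and the power enter the calculation only through standard normal quantiles. As a minimal sketch, using base R's `qnorm` with the conventional values listed above and assuming a two-sided test (the variable names are illustrative):

```r
alpha <- 0.05  # significance level (two-sided test assumed)
beta <- 0.2    # 1 - power, i.e. 80% power

z_alpha <- qnorm(1 - alpha / 2)  # critical value for the two-sided test
z_beta <- qnorm(1 - beta)        # quantile matching the target power

round(c(z_alpha = z_alpha, z_beta = z_beta), 3)
```

These come out at roughly 1.96 and 0.84. Tightening alpha or raising the power increases their sum, which in turn raises the MDE for a fixed sample size.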

Calculating MDE for a given sample size

If the total number of users, or the size of each group (\(m\) and \(n\)), is fixed, you can determine the smallest effect \(e\) that your test can reliably detect:

\[e > \sqrt{\dfrac{(z_{1-\alpha/2} + z_{1-\beta})^2 (1 + k)\sigma^2}{m}}\]

\(z_{1-\alpha/2}\) and \(z_{1-\beta}\) are the Z-scores corresponding to the desired significance level and power, respectively.

\(k=\frac{m}{n}\) is the ratio of the control group size to the test group size (e.g., \(k=1\) for a 1:1 split).

\(\sigma^2\) is the estimated variance of the metric.

\(e\) is the minimum detectable effect, expressed in the same units as the metric.
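For readers wondering where the \((1 + k)\) factor comes from, here is a short sketch of the derivation. It assumes that both groups share the same variance \(\sigma^2\) and that the group means are approximately normal; neither assumption is stated explicitly above, and \(\bar{x}_{\text{test}}\) and \(\bar{x}_{\text{control}}\) are notation introduced here for the two group means. Using \(n = m/k\):

\[\operatorname{Var}(\bar{x}_{\text{test}} - \bar{x}_{\text{control}}) = \dfrac{\sigma^2}{n} + \dfrac{\sigma^2}{m} = \dfrac{k\sigma^2}{m} + \dfrac{\sigma^2}{m} = \dfrac{(1 + k)\sigma^2}{m}\]

The detectable effect is then the sum of the two Z-scores multiplied by the standard error, that is, the square root of this variance, which gives the expression under the square root in the formula.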

How to calculate MDE in R

For instance, with 100,000 total users, a standard deviation (sigma) of 500, a control-to-test ratio of k = 2, alpha = 0.05, and beta = 0.2, the minimum detectable effect is approximately 9.397 units. If the metric's mean is also 500, that corresponds to a relative change of about 1.88%.

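# Function to calculate MDE
# Note: n here is the total number of users across both groups,
# not the test-group size n used in the formula
calculate_mde <- function(n, alpha = 0.05, beta = 0.2, sigma = 1, k = 1) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-sided critical value
  z_beta <- qnorm(1 - beta)        # quantile for the target power
  m <- n * k / (k + 1)             # control-group size implied by the ratio k
  e <- sqrt((z_alpha + z_beta)^2 * (1 + k) * sigma^2 / m)
  return(e)
}

calculate_mde(n = 100000, alpha = 0.05, beta = 0.2, sigma = 500, k = 2)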

[1] 9.396802
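As a sanity check, the normal-approximation formula can be compared with base R's `power.t.test`, which supports only equal group sizes, so the comparison below assumes k = 1 (50,000 users per group). This is a sketch rather than part of the original calculation; the function is redefined so the snippet runs on its own, and `mde_formula` and `mde_ttest` are illustrative names.

```r
# Normal-approximation MDE, same formula as above; n is the total number of users
calculate_mde <- function(n, alpha = 0.05, beta = 0.2, sigma = 1, k = 1) {
  z_alpha <- qnorm(1 - alpha / 2)
  z_beta <- qnorm(1 - beta)
  m <- n * k / (k + 1)  # control-group size
  sqrt((z_alpha + z_beta)^2 * (1 + k) * sigma^2 / m)
}

# Equal split (k = 1): 100,000 users in total, 50,000 per group
mde_formula <- calculate_mde(n = 100000, sigma = 500, k = 1)

# power.t.test() solves for whichever argument is left as NULL (here: delta);
# its n argument is the number of observations per group
mde_ttest <- power.t.test(n = 50000, sd = 500, sig.level = 0.05, power = 0.8)$delta

c(formula = mde_formula, power_t_test = mde_ttest)
```

With samples this large the t-distribution is nearly normal, so the two values should agree closely (about 8.86 here). For unequal splits, `power.t.test` does not apply directly and the formula above is the simpler route.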

Interactive MDE calculator

Below is an interactive calculator for finding the MDE based on your sample size, alpha, beta, sigma, and k-ratio. Adjust the parameters to see how they affect the MDE.

Finding the required sample size for a given MDE

The control-group sample size \(m\) needed to detect a specific MDE (\(e\)) is given by the formula below; the test group then needs \(n = m/k\) users:

\[m > \dfrac{(z_{1-\alpha/2} + z_{1-\beta})^2 (1 + k)\sigma^2}{e^2}\]
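One consequence worth spelling out: because \(e^2\) sits in the denominator, the required sample size grows with the inverse square of the effect. Writing \(m(e)\) for the right-hand side of the formula (notation introduced here for illustration) and holding \(\alpha\), \(\beta\), \(\sigma\), and \(k\) fixed:

\[\dfrac{m(e/2)}{m(e)} = \dfrac{e^2}{(e/2)^2} = 4\]

So halving the MDE roughly quadruples the number of users needed, which is why very small effects are expensive to detect.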

How to calculate sample size in R

Example: suppose a monetary metric has a mean of 500 and a standard deviation (sigma) of 500, and you want to detect a 2% effect (MDE = 10) with alpha = 0.05, beta = 0.2, and a control-to-test ratio of k = 2. The required sample size is approximately 58,867 users in the control group and 29,434 users in the test group, for a total of 88,301 users.

# Function to calculate the required control-group sample size (m)
calculate_sample_size <- function(e, alpha = 0.05, beta = 0.2, sigma = 1, k = 1) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-sided critical value
  z_beta <- qnorm(1 - beta)        # quantile for the target power
  m <- (z_alpha + z_beta)^2 * (1 + k) * sigma^2 / e^2
  return(ceiling(m))               # round up to whole users
}

k <- 2  # control-to-test ratio, k = m / n
m <- calculate_sample_size(e = 10, alpha = 0.05, beta = 0.2, sigma = 500, k = k)
n <- ceiling(m / k)  # test-group size
total_users <- m + n

c(m, n, total_users)

[1] 58867 29434 88301
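A quick way to check the result is a round trip: feed the total of 88,301 users back into the MDE formula and confirm that the detectable effect does not exceed the target of 10. The sketch below redefines the MDE function in compact form so it runs on its own; `achieved_mde` is an illustrative name.

```r
# Compact version of the MDE formula; n is the total number of users
calculate_mde <- function(n, alpha = 0.05, beta = 0.2, sigma = 1, k = 1) {
  m <- n * k / (k + 1)  # control-group size implied by the ratio k
  (qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sqrt((1 + k) * sigma^2 / m)
}

total_users <- 88301  # total from the example above
achieved_mde <- calculate_mde(n = total_users, sigma = 500, k = 2)

achieved_mde        # just under the target of 10
achieved_mde <= 10  # TRUE
```

The achieved MDE lands slightly below 10 rather than exactly on it because the group sizes were rounded up to whole users.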

Interactive sample size calculator

Below is an interactive calculator that allows you to input your desired MDE, alpha, beta, sigma, and k-ratio to compute the required sample size. Adjust the parameters to see how they affect the sample size needed for your A/B test.

Conclusion

A/B testing is a powerful tool for product development, and understanding the Minimum Detectable Effect (MDE) is crucial for designing effective experiments. By calculating the required sample size for a target MDE, or the MDE for a fixed sample size, before launching a test, you can ensure that your experiments are statistically sound and capable of providing meaningful insights into user behavior and product performance.