Bonferroni Correction Calculator
Adjust p-values for multiple comparisons to control family-wise error rate
About the Bonferroni Correction Calculator
The Bonferroni Correction Calculator is an essential tool for researchers and statisticians who need to maintain the integrity of their data analysis when performing multiple hypothesis tests. When you run a single statistical test at a 5% significance level, there is a 5% chance of a Type I error (a false positive). However, if you run twenty independent tests at that same level, the probability of finding at least one false positive increases to about 64%. This phenomenon is known as alpha inflation or the 'multiple comparisons problem.'
This calculator helps you determine the revised significance threshold necessary to keep the Family-Wise Error Rate (FWER) at your desired level. By dividing your target alpha by the number of comparisons, the Bonferroni method provides a simple and robust way to ensure that your findings are not merely the result of chance. While it is known for being a conservative adjustment, it remains a standard requirement in medical research, genomics, and experimental psychology to prevent overstating the evidence for a discovery. Use this tool before interpreting p-values in any study where more than one hypothesis is being evaluated simultaneously.
Formula
α_adj = α / n   or   p_adj = p × n
In the first variation, α_adj is the new per-test significance threshold (alpha) required to reject a null hypothesis, α is the original desired significance level (usually 0.05), and n is the total number of statistical tests performed. In the second variation, p_adj is the adjusted p-value for a specific test, which is compared against the original alpha to determine significance. If the adjusted p-value exceeds 1.0, it is capped at 1.0.
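Both variations, along with the alpha-inflation figure from the introduction, can be sketched in a few lines of Python (the function names are illustrative, not part of the calculator):

```python
def bonferroni_alpha(alpha: float, n: int) -> float:
    """Adjusted per-test significance threshold: alpha / n."""
    return alpha / n

def bonferroni_p(p: float, n: int) -> float:
    """Adjusted p-value: p * n, capped at 1.0."""
    return min(p * n, 1.0)

# Alpha inflation: 20 independent tests at alpha = 0.05 give roughly a
# 64% chance of at least one false positive.
family_wise_error = 1 - (1 - 0.05) ** 20   # ≈ 0.64

print(bonferroni_alpha(0.05, 20))  # 0.0025 — stricter per-test threshold
print(bonferroni_p(0.03, 20))      # 0.6 — no longer significant
```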
Worked examples
Example 1: A researcher is performing 5 separate t-tests to compare different soil nutrients across three farms and wants to maintain an overall alpha of 0.05.
Original alpha (α) = 0.05
Number of tests (n) = 5
Calculation: 0.05 / 5 = 0.01
Result: The adjusted significance threshold is 0.01. Any individual test must have a p-value below 0.01 to be considered statistically significant.
Example 2: A data scientist obtains a p-value of 0.03 from one of 15 simultaneous A/B tests and needs the adjusted p-value.
Observed p-value = 0.03
Number of tests (n) = 15
Calculation: 0.03 × 15 = 0.45
Result: The adjusted p-value is 0.45. This result is not significant because 0.45 is greater than the original alpha of 0.05.
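Both worked examples can be verified directly (variable names here are illustrative):

```python
# Example 1: per-test threshold for 5 tests at an overall alpha of 0.05
alpha, n_tests = 0.05, 5
adjusted_alpha = alpha / n_tests
print(adjusted_alpha)        # 0.01

# Example 2: adjusted p-value for one of 15 simultaneous A/B tests
p_observed, n_tests = 0.03, 15
adjusted_p = min(p_observed * n_tests, 1.0)
print(round(adjusted_p, 2))  # 0.45
print(adjusted_p < alpha)    # False — not significant at alpha = 0.05
```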
Common use cases
- A clinical trial comparing the efficacy of a new drug across five different age demographics.
- A marketing team testing eight different website headlines to see which one increases conversions.
- A psychologist conducting post-hoc comparisons between four different treatment groups after a significant ANOVA result.
- A geneticist analyzing whether twelve specific SNPs are associated with a certain trait.
Pitfalls and limitations
- The correction treats each test as a separate chance for a false positive, so it over-corrects and sacrifices power when the tests are highly correlated.
- Applying this to a huge number of tests (e.g., thousands of genes) may result in a threshold so low that no results can ever reach significance.
- It does not protect against Type II errors, where you fail to reject a null hypothesis that is actually false.
Frequently asked questions
Is the Bonferroni correction too conservative for 50 tests?
The Bonferroni correction is famously conservative: with 50 tests, the per-test threshold drops from 0.05 to 0.001. As the number of tests increases, the power of the study falls sharply, which often leads to Type II errors, where you fail to detect an effect that actually exists.
When should you use the Bonferroni correction rather than no correction at all?
You should apply the correction whenever you are testing multiple hypotheses on the same data set to avoid 'p-hacking' or accidental discoveries. Common scenarios include gene expression studies, post-hoc ANOVA tests, and large-scale marketing A/B tests.
Does the Bonferroni correction control the false discovery rate?
No, it controls the Family-Wise Error Rate (FWER), which is the probability of making at least one Type I error. The False Discovery Rate (FDR), often managed by the Benjamini-Hochberg procedure, controls the expected proportion of false positives among all rejected null hypotheses.
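The difference is easiest to see on shared data. The sketch below (not a production implementation) contrasts Bonferroni's single threshold with the Benjamini-Hochberg step-up procedure:

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Reject any test whose p-value is below alpha / m."""
    m = len(pvals)
    return [p < alpha / m for p in pvals]

def benjamini_hochberg_reject(pvals, alpha=0.05):
    """Step-up procedure: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= (k / m) * alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

pvals = [0.01, 0.02, 0.03, 0.04]
print(bonferroni_reject(pvals))          # only 0.01 beats 0.05 / 4 = 0.0125
print(benjamini_hochberg_reject(pvals))  # all four pass the step-up thresholds
```

On these four p-values, Bonferroni rejects only one hypothesis while Benjamini-Hochberg rejects all four, illustrating why FDR control is preferred when many true effects are expected.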
Can I adjust p-values after the experiment is done?
Yes, if you have already performed your tests, you can calculate an 'adjusted p-value' by multiplying your observed p-value by the number of tests. If this product is still less than your original alpha (usually 0.05), the result is significant.
Bonferroni vs. Šidák correction: which is better?
The Šidák correction is slightly less conservative than Bonferroni while still controlling the FWER. However, it assumes the tests are independent, whereas Bonferroni remains valid even when the tests are correlated.
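The two thresholds can be compared directly; the Šidák formula used below is its standard definition, 1 − (1 − α)^(1/n):

```python
alpha, n = 0.05, 5

bonferroni = alpha / n               # 0.01
sidak = 1 - (1 - alpha) ** (1 / n)   # ≈ 0.0102, slightly more permissive

print(bonferroni, sidak)
print(sidak > bonferroni)  # True: Sidak rejects a little more readily
```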