P-Value Correction for Multiple Mann-Whitney Tests When Some Tests Are Dependent
Introduction
When performing multiple comparisons with non-parametric tests such as the Mann-Whitney U test, it is essential to correct for the inflated type I error rate that arises from conducting several tests. This is particularly important for dependent tests, whose outcomes are statistically related, for example because they share subjects or samples. This article covers p-value correction for multiple Mann-Whitney tests, with a focus on the dependent case.
The Problem of Multiple Comparisons
When conducting multiple tests, the probability of obtaining at least one false positive result increases with the number of tests performed. This is known as the multiple comparisons problem. To illustrate, consider comparing two treatments in two separate experiments, conducting a Mann-Whitney U test for each. With the significance level (α) set at 0.05 per test, the probability of at least one false positive across the two experiments rises to 1 − 0.95² ≈ 0.0975 even when the treatments are identical, nearly double the nominal rate.
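The growth of this family-wise error rate with the number of independent tests is easy to compute directly in R (the test counts below are chosen only for illustration):

```r
# Family-wise error rate for m independent tests, each at level alpha:
# P(at least one false positive) = 1 - (1 - alpha)^m
alpha <- 0.05
m <- c(1, 2, 5, 10, 20)
fwer <- 1 - (1 - alpha)^m
print(round(fwer, 4))
# 0.0500 0.0975 0.2262 0.4013 0.6415
```

By 20 tests, the chance of at least one false positive is already about 64%.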
Dependent Tests: A Special Case
Dependent tests arise when the outcomes of the tests are statistically related, for example because the same subjects or samples appear in more than one comparison. In the context of Mann-Whitney U tests, this typically happens when several pairwise comparisons share a group, or when measurements are repeated on the same subjects. Dependence violates the independence assumption behind some correction methods, so it must be accounted for when choosing how to adjust the p-values.
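To see this dependence concretely, here is a small simulation sketch (sample sizes and replication counts are arbitrary choices): two Mann-Whitney tests that reuse the same sample x produce correlated p-values, whereas fully independent tests would not.

```r
# Sketch: p-values of two Mann-Whitney tests sharing the sample x are correlated
set.seed(1)
reps <- 2000
p1 <- p2 <- numeric(reps)
for (i in seq_len(reps)) {
  x <- rnorm(30); y <- rnorm(30); z <- rnorm(30)
  p1[i] <- wilcox.test(x, y)$p.value  # test 1 uses x
  p2[i] <- wilcox.test(x, z)$p.value  # test 2 reuses the same x
}
print(cor(p1, p2))  # clearly positive; for independent tests it would be near 0
</test>
```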
P-value Correction Methods
Several p-value correction methods are available to address the multiple comparisons problem. Some of the most commonly used methods include:
- Bonferroni correction: multiply each p-value by the number of tests performed (capping at 1). Simple to implement, but often overly conservative, leading to a loss of statistical power.
- Holm-Bonferroni method: a step-down refinement of the Bonferroni correction that adjusts the p-values sequentially from smallest to largest. It is uniformly more powerful than plain Bonferroni while still controlling the family-wise error rate.
- Benjamini-Hochberg procedure: a popular choice that controls the false discovery rate (FDR), the expected proportion of false positives among the results declared significant. It is valid under independence and under positive regression dependence.
- Benjamini-Yekutieli procedure: a more conservative variant of Benjamini-Hochberg that controls the FDR under arbitrary dependence between tests, making it a natural choice for dependent Mann-Whitney tests.
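All four corrections are available through base R's p.adjust() function; here is a quick comparison on a toy vector of p-values (the values are chosen only for illustration):

```r
# Comparing the corrections in base R's p.adjust() on illustrative p-values
pvals <- c(0.001, 0.01, 0.02, 0.04, 0.2)
print(round(p.adjust(pvals, method = "bonferroni"), 3))  # multiply by 5, cap at 1
print(round(p.adjust(pvals, method = "holm"), 3))        # step-down Bonferroni
print(round(p.adjust(pvals, method = "BH"), 3))          # FDR control
print(round(p.adjust(pvals, method = "BY"), 3))          # FDR under arbitrary dependence
```

Note how Bonferroni gives the largest adjusted values and Benjamini-Hochberg the smallest, with Holm and Benjamini-Yekutieli in between.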
Implementing P-value Correction in R
R's p.adjust() function, part of the base stats package, implements all of the correction methods above. Here's an example of using it to apply a Bonferroni correction to three Mann-Whitney tests:
# p.adjust() and wilcox.test() are in base R's stats package; no extra library is needed
set.seed(123)
n <- 100
x <- rnorm(n)
y <- rnorm(n)
z <- rnorm(n)

# Three pairwise Mann-Whitney (Wilcoxon rank-sum) tests
mw1 <- wilcox.test(x, y)
mw2 <- wilcox.test(x, z)
mw3 <- wilcox.test(y, z)

# Collect all three p-values and apply the Bonferroni correction
pvals <- c(mw1$p.value, mw2$p.value, mw3$p.value)
pvals_adj <- p.adjust(pvals, method = "bonferroni")
print(pvals_adj)
Example Use Case: Correcting P-values for Dependent Mann-Whitney Tests
Suppose we have two experiments, each with two groups. We want to compare the performance of the groups in each experiment using Mann-Whitney U tests, and also to compare corresponding groups across the experiments. Because the same subjects feature in both experiments, the four tests are dependent. We'll use the Benjamini-Hochberg procedure to correct the p-values for multiple testing.
# p.adjust() and wilcox.test() are in base R's stats package; no extra library is needed
# (for illustration the data are simulated as independent normal draws)
set.seed(123)
n <- 100
x1 <- rnorm(n)
y1 <- rnorm(n)
x2 <- rnorm(n)
y2 <- rnorm(n)

# Within-experiment comparisons and cross-experiment comparisons
mw1 <- wilcox.test(x1, y1)
mw2 <- wilcox.test(x2, y2)
mw3 <- wilcox.test(x1, x2)
mw4 <- wilcox.test(y1, y2)

# Collect all four p-values and apply the Benjamini-Hochberg correction;
# under strong dependence, method = "BY" is the more conservative, always-valid choice
pvals <- c(mw1$p.value, mw2$p.value, mw3$p.value, mw4$p.value)
pvals_adj <- p.adjust(pvals, method = "BH")
print(pvals_adj)
Frequently Asked Questions
The sections above covered why p-value correction matters for multiple Mann-Whitney tests, particularly dependent ones. Below, we address some of the most frequently asked questions on the topic.
Q: What is the difference between Bonferroni correction and Holm-Bonferroni method?
A: The Bonferroni correction multiplies every p-value by the number of tests performed, while the Holm-Bonferroni method adjusts the p-values sequentially from smallest to largest, using decreasing multipliers. Holm is uniformly more powerful than Bonferroni while providing the same family-wise error rate guarantee, though it can still be conservative.
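The difference is easy to see numerically (the p-values below are chosen only for illustration):

```r
# Holm adjusts sorted p-values by decreasing multipliers (m, m-1, ..., 1),
# so it is never less powerful than Bonferroni's flat multiplier m
pvals <- c(0.01, 0.02, 0.03)
print(p.adjust(pvals, method = "bonferroni"))  # 0.03 0.06 0.09
print(p.adjust(pvals, method = "holm"))        # 0.03 0.04 0.04
```

At α = 0.05, Bonferroni rejects only the first hypothesis here, while Holm rejects all three.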
Q: What is the false discovery rate (FDR) and how does it relate to p-value correction?
A: The FDR is the expected proportion of false positives among the results declared significant. Correction methods like the Benjamini-Hochberg procedure control the FDR at a chosen level (commonly 5%). This is a less stringent guarantee than controlling the family-wise error rate, but it preserves considerably more statistical power when many tests are performed.
Q: Can I use p-value correction methods for other types of tests, such as ANOVA or t-tests?
A: Yes, p-value correction methods can be used for other types of tests, including ANOVA and t-tests. However, the choice of method may depend on the specific research question and the type of data being analyzed.
Q: How do I choose the right p-value correction method for my research?
A: The choice of p-value correction method depends on the research question, the type of data being analyzed, and the desired level of statistical power. It's essential to consider the trade-off between controlling the FDR and maintaining statistical power.
Q: Can I use p-value correction methods for dependent tests, such as paired samples?
A: Yes. The Benjamini-Hochberg procedure remains valid under positive regression dependence, and the Benjamini-Yekutieli procedure (method = "BY" in p.adjust()) controls the FDR under arbitrary dependence, at some cost in power. For paired samples, also consider whether the paired Wilcoxon signed-rank test (wilcox.test(..., paired = TRUE)) is a more appropriate base test than the independent-samples Mann-Whitney U test.
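As a sketch (simulated data, with arbitrary effect sizes), here is a paired-samples version: two Wilcoxon signed-rank tests that share the same baseline measurements, corrected with the dependence-safe "BY" method.

```r
# Sketch: two paired tests on the same subjects, corrected with method = "BY",
# which controls the FDR under arbitrary dependence between the tests
set.seed(42)
before <- rnorm(30, mean = 10)
after1 <- before + rnorm(30, mean = 0.5)  # hypothetical treatment 1
after2 <- before + rnorm(30, mean = 0.2)  # hypothetical treatment 2
p1 <- wilcox.test(before, after1, paired = TRUE)$p.value
p2 <- wilcox.test(before, after2, paired = TRUE)$p.value
print(p.adjust(c(p1, p2), method = "BY"))  # both tests share 'before'
```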
Q: How do I implement p-value correction in R?
A: Base R's p.adjust() function (in the stats package, loaded by default) implements all of the corrections discussed above; no extra package is needed. Here's an example of using it to perform a Bonferroni correction:
# p.adjust() and wilcox.test() are in base R's stats package
set.seed(123)
n <- 100
x <- rnorm(n)
y <- rnorm(n)
z <- rnorm(n)
mw1 <- wilcox.test(x, y)
mw2 <- wilcox.test(x, z)
mw3 <- wilcox.test(y, z)
pvals <- c(mw1$p.value, mw2$p.value, mw3$p.value)
pvals_adj <- p.adjust(pvals, method = "bonferroni")
print(pvals_adj)
Q: What are some common pitfalls to avoid when using p-value correction methods?
A: Some common pitfalls to avoid when using p-value correction methods include:
- Correcting only the p-values that look promising instead of every test performed
- Treating strongly dependent tests as independent when choosing a correction method
- Ignoring the trade-off between controlling the FDR and maintaining statistical power
- Not considering the specific research question and type of data being analyzed
Conclusion
P-value correction is a crucial step in multiple testing scenarios, particularly when dealing with dependent tests. By understanding the different p-value correction methods and their applications, researchers can ensure that their results are reliable and meaningful. In this article, we've addressed some of the most frequently asked questions about p-value correction for multiple Mann-Whitney tests. We hope this guide has been helpful in clarifying the importance of p-value correction and its applications in research.