How to Handle a Paired Wilcoxon Test with Incomplete Follow-up
Introduction
In experimental studies, it is common to compare a blood variable at two points in time in the same participants. When the data are not normally distributed, traditional parametric tests such as the paired t-test may not be suitable. The paired Wilcoxon test, also known as the Wilcoxon signed-rank test, is a non-parametric alternative for paired data. It needs both measurements from each participant, however, so incomplete follow-up raises the question of what to do with participants who have only one. This article discusses how to handle the paired Wilcoxon test with incomplete follow-up.
Understanding Paired Wilcoxon Test
The paired Wilcoxon test is a non-parametric test used to compare paired data. It is the paired counterpart of the Wilcoxon rank-sum test, which compares two independent groups. Data are paired when each observation in one set has a corresponding observation in the other, for example two measurements on the same participant. The test computes the difference within each pair, discards zero differences, ranks the absolute differences, and compares the sum of the ranks of the positive differences with the sum of the ranks of the negative differences.
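To make the ranking step concrete, here is a minimal sketch in base R that computes the two rank sums by hand and compares them with `wilcox.test()`. The `before` and `after` values are hypothetical and chosen only for illustration.

```r
# Hypothetical paired measurements (illustrative values only)
before <- c(121, 98, 143, 110, 105, 137)
after  <- c(134, 91, 160, 132, 109, 168)

d <- after - before        # difference within each pair
d <- d[d != 0]             # zero differences are discarded
r <- rank(abs(d))          # rank the absolute differences

v_plus  <- sum(r[d > 0])   # rank sum of the positive differences
v_minus <- sum(r[d < 0])   # rank sum of the negative differences

# wilcox.test() reports the positive rank sum as its statistic V
res <- wilcox.test(after, before, paired = TRUE)
print(c(v_plus = v_plus, v_minus = v_minus, V = unname(res$statistic)))
```

Here the positive rank sum is 19 and the negative rank sum is 2; together they always add up to n(n + 1)/2, which is 21 for six non-zero differences.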
Why Paired Wilcoxon Test is Preferred
The paired Wilcoxon test is preferred over traditional parametric tests for several reasons:
- Non-normal data: The paired Wilcoxon test does not assume normality of the data, making it a suitable choice for non-normally distributed data. It does assume that the paired differences are distributed symmetrically around their median.
- Paired data: The paired Wilcoxon test is specifically designed for paired data, making it a good choice for experimental studies where the same participants are measured at two different points in time.
- Robustness: Because the test works on ranks rather than raw values, it is less affected by outliers than the paired t-test.
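The robustness point can be illustrated with a small sketch. The data below are hypothetical: every participant's value increases, but one increase is extreme. The outlier inflates the standard deviation that the paired t-test relies on, while the rank-based test only registers it as the largest difference.

```r
# Hypothetical data: all ten participants increase, one by an extreme amount
baseline  <- c(50, 52, 47, 55, 49, 51, 53, 48, 54, 50)
change    <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 200)
follow_up <- baseline + change

p_t      <- t.test(follow_up, baseline, paired = TRUE)$p.value
p_wilcox <- wilcox.test(follow_up, baseline, paired = TRUE)$p.value

print(c(paired_t = p_t, paired_wilcoxon = p_wilcox))
```

With these values the paired t-test does not reach the 5% level, whereas the paired Wilcoxon test does, because all ten differences point in the same direction.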
Handling Incomplete Follow-up
Incomplete follow-up is a common issue in experimental studies: some participants have a measurement at the first time point but not at the second. The paired Wilcoxon test needs both values of each pair, so a decision has to be made about the incomplete pairs. Here are some strategies to handle incomplete follow-up:
1. Listwise Deletion
Listwise deletion, also called complete-case analysis, is a common strategy for handling incomplete follow-up. It deletes a participant's entire row of data if any of the values are missing. This reduces the sample size and therefore the power of the test, and it can lead to biased results if the data are not missing completely at random (MCAR).
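A minimal sketch of listwise deletion in base R follows. The data frame `bloods` and its values are hypothetical; `complete.cases()` flags the rows in which both time points are present.

```r
# Hypothetical data: participants 2 and 5 were lost to follow-up
bloods <- data.frame(
  id    = 1:6,
  time1 = c(121, 98, 143, 110, 105, 137),
  time2 = c(134, NA, 160, 132, NA, 168)
)

# Keep only the rows where both measurements are available
complete  <- bloods[complete.cases(bloods[, c("time1", "time2")]), ]
n_dropped <- nrow(bloods) - nrow(complete)

res <- wilcox.test(complete$time2, complete$time1, paired = TRUE)

# With paired = TRUE, wilcox.test() drops incomplete pairs on its own,
# so passing the full columns gives the same result
res_implicit <- wilcox.test(bloods$time2, bloods$time1, paired = TRUE)

print(c(n_dropped = n_dropped, p_value = res$p.value))
```

The example also shows the cost in power: with only four complete pairs, the smallest two-sided exact p-value the test can produce is 0.125, even though every remaining participant increased.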
2. Pairwise Deletion
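Pairwise deletion is another strategy for handling incomplete follow-up. It deletes only the pairs in which one of the values is missing, so a participant is still included in every comparison for which both of their values are available. With only two time points this is the same as listwise deletion; the two differ only when more than two variables or time points are analyzed. Like listwise deletion, it can lead to biased results if the data are not MCAR.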
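The difference between the two deletion strategies only shows up with more than two time points. The sketch below assumes a hypothetical third measurement, `time3`, which is not part of the scenario above, and counts how many participants each strategy keeps. The helper `pairs_available()` is defined here for illustration only.

```r
# Hypothetical data with a third time point added for illustration
bloods <- data.frame(
  id    = 1:6,
  time1 = c(121, 98, 143, 110, 105, 137),
  time2 = c(134, NA, 160, 132, NA, 168),
  time3 = c(140, 101, NA, 139, 112, 170)
)

# Illustrative helper: number of participants with both values of a given pair
pairs_available <- function(df, a, b) sum(complete.cases(df[, c(a, b)]))

# Listwise deletion: a participant must be complete on every time point
n_listwise <- sum(complete.cases(bloods[, c("time1", "time2", "time3")]))

# Pairwise deletion: each comparison uses whoever has both of its values
n_t1_t2 <- pairs_available(bloods, "time1", "time2")
n_t1_t3 <- pairs_available(bloods, "time1", "time3")

print(c(listwise = n_listwise, t1_vs_t2 = n_t1_t2, t1_vs_t3 = n_t1_t3))
```

Pairwise deletion keeps more data per comparison, but each comparison is then based on a different subset of participants, which makes the results harder to compare with one another.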
3. Imputation
Imputation is a strategy for handling incomplete follow-up by replacing the missing values with estimated values. There are several imputation methods available, including:
- Mean imputation: Replacing the missing values with the mean of the available data.
- Median imputation: Replacing the missing values with the median of the available data.
- Regression imputation: Replacing the missing values with the predicted value based on a regression model.
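One caveat applies to all of these single-imputation methods: the filled-in values are treated as if they had been observed, which understates the variability of the data. The sketch below shows this for mean imputation, using hypothetical follow-up values.

```r
# Hypothetical follow-up values with two missing measurements
time2 <- c(134, NA, 160, 132, NA, 168)

observed_sd <- sd(time2, na.rm = TRUE)

# Mean imputation: every missing value becomes the observed mean
imputed    <- ifelse(is.na(time2), mean(time2, na.rm = TRUE), time2)
imputed_sd <- sd(imputed)

print(c(observed_sd = observed_sd, imputed_sd = imputed_sd))
```

The mean is unchanged, but the standard deviation shrinks because the imputed values sit exactly at the centre of the distribution. Median imputation behaves similarly, and regression imputation places every imputed value exactly on the fitted line.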
4. Multiple Imputation
Multiple imputation is a strategy for handling incomplete follow-up that creates several completed versions of the data, each with different imputed values drawn from a model. Each dataset is analyzed separately, and the results are then combined so that the uncertainty about the missing values is reflected in the final result.
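The idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not a full multiple-imputation procedure: the data are hypothetical, the imputation model is a plain linear regression of `time2` on `time1` with random residual noise, the uncertainty in the regression coefficients is ignored, and the p-values are combined by taking their median, which is a simple heuristic rather than a formal pooling rule. Dedicated packages such as `mice` implement the procedure more rigorously.

```r
set.seed(42)  # make the random draws reproducible

# Hypothetical data: participants 2 and 5 have no follow-up value
bloods <- data.frame(
  time1 = c(121, 98, 143, 110, 105, 137, 128, 115),
  time2 = c(134, NA, 160, 132, NA, 168, 139, 121)
)

m        <- 20                                  # number of imputed datasets
missing  <- is.na(bloods$time2)
fit      <- lm(time2 ~ time1, data = bloods)    # fitted on the complete pairs
pred     <- predict(fit, newdata = bloods[missing, , drop = FALSE])
resid_sd <- summary(fit)$sigma                  # residual standard deviation

# Impute, test, and store the p-value, m times over
p_values <- vapply(seq_len(m), function(i) {
  completed <- bloods
  completed$time2[missing] <- pred + rnorm(sum(missing), sd = resid_sd)
  wilcox.test(completed$time2, completed$time1, paired = TRUE)$p.value
}, numeric(1))

pooled_p <- median(p_values)  # heuristic summary across the imputed datasets
print(range(p_values))
print(pooled_p)
```

The spread of `p_values` across the imputed datasets gives a sense of how much the conclusion depends on the values that were never observed.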
5. Sensitivity Analysis
Sensitivity analysis is a strategy for handling incomplete follow-up by analyzing the results under different assumptions. This involves analyzing the results under different imputation methods or deletion strategies to see how the results change.
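One way to sketch such an analysis is to fill in the missing follow-up values under a range of assumed changes and rerun the test each time. The data and the `deltas` grid below are hypothetical; each `delta` is the change from baseline assumed for every participant who dropped out.

```r
# Hypothetical data: participants 2 and 5 have no follow-up value
bloods <- data.frame(
  time1 = c(121, 98, 143, 110, 105, 137, 128, 115),
  time2 = c(134, NA, 160, 132, NA, 168, 139, 121)
)
missing <- is.na(bloods$time2)

# Assumed change from baseline among the dropouts (hypothetical scenarios)
deltas <- c(-20, -10, 0, 10, 20)

sensitivity <- data.frame(
  delta   = deltas,
  p_value = vapply(deltas, function(delta) {
    completed <- bloods
    completed$time2[missing] <- completed$time1[missing] + delta
    # exact = FALSE: the imputed differences are tied (or zero when delta is 0)
    wilcox.test(completed$time2, completed$time1,
                paired = TRUE, exact = FALSE)$p.value
  }, numeric(1))
)
print(sensitivity)
```

If the conclusion is the same across all plausible scenarios, the missing data are unlikely to be driving it. If it flips, as it does here between the most pessimistic and most optimistic scenario, the result should be reported with that caveat.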
Example Code in R
Here is example R code that applies several of these strategies to a small dataset and then runs the paired Wilcoxon test on each result:
# Load the necessary libraries; wilcox.test() ships with base R (stats package)
library(tidyverse)

# Toy data: participant 4 has no follow-up measurement
data <- data.frame(
  id = c(1, 2, 3, 4, 5),
  time1 = c(10, 20, 30, 40, 50),
  time2 = c(15, 25, 35, NA, 55)
)

# Deletion: with two time points, listwise and pairwise deletion keep the same rows
data_listwise <- data %>% drop_na()
data_pairwise <- data %>% drop_na(time1, time2)

# Single imputation of the missing follow-up value
data_mean <- data %>% mutate(time2 = ifelse(is.na(time2), mean(time2, na.rm = TRUE), time2))
data_median <- data %>% mutate(time2 = ifelse(is.na(time2), median(time2, na.rm = TRUE), time2))
fit <- lm(time2 ~ time1, data = data)  # lm() drops the incomplete row by default
data_regression <- data %>% mutate(time2 = ifelse(is.na(time2), predict(fit, newdata = data), time2))

# One random draw for the missing value. This is a single stochastic imputation,
# not multiple imputation, which would repeat the draw and combine the results.
set.seed(123)
data_stochastic <- data %>% mutate(time2 = ifelse(is.na(time2), rnorm(1, mean = mean(time2, na.rm = TRUE), sd = sd(time2, na.rm = TRUE)), time2))

# Paired Wilcoxon signed-rank test on follow-up minus baseline.
# exact = FALSE because the toy differences contain ties.
run_test <- function(d) wilcox.test(d$time2, d$time1, paired = TRUE, exact = FALSE)
paired_wilcoxon_listwise <- run_test(data_listwise)
paired_wilcoxon_pairwise <- run_test(data_pairwise)
paired_wilcoxon_mean <- run_test(data_mean)
paired_wilcoxon_median <- run_test(data_median)
paired_wilcoxon_regression <- run_test(data_regression)
paired_wilcoxon_stochastic <- run_test(data_stochastic)
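Conclusion
Incomplete follow-up does not rule out the paired Wilcoxon test, but the way incomplete pairs are handled should be chosen deliberately, stated clearly, and checked with a sensitivity analysis.
Frequently Asked Questions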
Q: What is the paired Wilcoxon test and why is it used?
A: The paired Wilcoxon test, also known as the Wilcoxon signed-rank test, is a non-parametric test for paired data, where each observation in one set has a corresponding observation in the other, such as two measurements on the same participant. The test ranks the absolute differences between the paired observations and compares the sum of the ranks of the positive differences with the sum of the ranks of the negative differences.
Q: What are the advantages of using the paired Wilcoxon test?
A: The paired Wilcoxon test has several advantages, including:
- Non-normal data: The paired Wilcoxon test does not assume normality of the data, making it a suitable choice for non-normally distributed data. It does assume that the paired differences are distributed symmetrically around their median.
- Paired data: The paired Wilcoxon test is specifically designed for paired data, making it a good choice for experimental studies where the same participants are measured at two different points in time.
- Robustness: Because the test works on ranks rather than raw values, it is less affected by outliers than the paired t-test.
Q: How do I handle incomplete follow-up in the paired Wilcoxon test?
A: There are several strategies for handling incomplete follow-up in the paired Wilcoxon test, including:
- Listwise deletion: Deleting the entire row of data if any of the values are missing.
- Pairwise deletion: Deleting only the pairs in which one of the values is missing, which is the same as listwise deletion when there are only two time points.
- Imputation: Replacing the missing values with estimated values.
- Multiple imputation: Creating multiple versions of the data with different imputed values.
- Sensitivity analysis: Analyzing the results under different assumptions.
Q: What are the limitations of listwise deletion?
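A: Listwise deletion discards every participant with a missing value, which reduces the sample size and therefore the power of the test. If the data are missing completely at random (MCAR), the remaining pairs are still a representative sample. If they are not, for example when participants with worse values are more likely to drop out, the remaining pairs form a biased sample.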
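A simple, informal check on how plausible MCAR is involves comparing the baseline values of participants who completed follow-up with those who did not. The sketch below uses hypothetical data and the Wilcoxon rank-sum test, which fits here because the two groups are independent. A clear difference argues against MCAR, but the absence of one does not prove it, especially with few dropouts.

```r
# Hypothetical data: participants 2 and 5 have no follow-up value
bloods <- data.frame(
  time1 = c(121, 98, 143, 110, 105, 137, 128, 115),
  time2 = c(134, NA, 160, 132, NA, 168, 139, 121)
)
bloods$dropped_out <- is.na(bloods$time2)

# Median baseline value among completers (FALSE) and dropouts (TRUE)
baseline_by_status <- tapply(bloods$time1, bloods$dropped_out, median)

# Rank-sum test of baseline values between the two independent groups
check <- wilcox.test(time1 ~ dropped_out, data = bloods)

print(baseline_by_status)
print(check$p.value)
```

In this hypothetical example the dropouts started with lower baseline values than the completers, which would be a reason to treat a complete-case result with caution.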
Q: What are the limitations of pairwise deletion?
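A: With only two time points, pairwise deletion removes exactly the same participants as listwise deletion and shares its limitations: lower power and, if the data are not MCAR, a biased sample. With more than two time points it keeps more data, but each comparison is then based on a different subset of participants.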
Q: What are the limitations of imputation?
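A: Single imputation treats the imputed values as if they had been observed, so it understates the variability in the data and can make results look more certain than they are. The imputed values also depend on the assumptions of the imputation method, and the results are biased if those assumptions do not hold.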
Q: What are the limitations of multiple imputation?
A: Multiple imputation is more computationally intensive than single imputation, and its validity depends on the imputation model being reasonable for the data. Combining the results is also less straightforward for a rank-based test such as the paired Wilcoxon test than for an estimate that comes with a standard error.
Q: What are the limitations of sensitivity analysis?
A: Sensitivity analysis can be time-consuming and does not produce a single answer. It shows how much the conclusion depends on the assumptions made about the missing data, but it cannot show which assumption is correct.
Q: How do I choose the best strategy for handling incomplete follow-up?
A: The choice of strategy depends on the research question, the amount of missing data, and the likely reasons the data are missing. One reasonable approach is to choose a primary analysis, such as listwise deletion or multiple imputation, and then repeat the test under alternative strategies as a sensitivity analysis to check whether the conclusion holds.
Q: What are the best practices for handling incomplete follow-up?
A: The best practices for handling incomplete follow-up include:
- Using a combination of strategies: Choosing one primary strategy and checking it against alternatives, such as a different deletion or imputation method, rather than relying on a single approach.
- Verifying the assumptions: Checking that the assumptions behind the chosen deletion or imputation method are plausible for the data.
- Reporting the results: Reporting how many pairs were incomplete, which method was used to handle them, and the results of the sensitivity analysis, so that the analysis is transparent.
Q: What are the future directions for handling incomplete follow-up?
A: The future directions for handling incomplete follow-up include:
- Developing new imputation methods: Methods that are more accurate and reliable than those currently available.
- Improving the sensitivity analysis: Making sensitivity analyses more informative about how far the results can be trusted.
- Using machine learning algorithms: Applying machine learning to improve the accuracy and reliability of imputation and sensitivity analysis.