I have carried out a permutation test comprising a null distribution of distances and 5 observed distances as test statistics. Now I would like to correct for multiple comparisons using the max-T method, via the multtest package and its ss.maxT, ss.minP and/or sd.maxT functions.
But I have problems implementing the functions and making sense of the results: the first function returns only 1s, the second returns just the unadjusted p-values, and the third throws an error. Please see the example data below:
## Example data
# Observed distances
obs <- matrix(c(0.001, 0.2, 0.50, 0.9, .9999))
null_values <- runif(20)
# Null distribution of distances
null <- matrix(null_values, nrow = length(obs), ncol = length(null_values), byrow = TRUE)
null
# Hypotheses
alternative <- "more"
# The unadjusted raw p-value
praw <- c(0, 0.1, 0.45, 0.85, 1)
# Only getting 1s as results
adjusted_p_values_max <- multtest::ss.maxT(null, obs, alternative, get.cutoff = FALSE,
                                           get.cr = FALSE, get.adjp = TRUE, alpha = 0.05)
adjusted_p_values_max
# Should probably use this one, but it returns praw, which is supposedly correct (but perhaps odd);
# this is because the null distribution is identical for all 5 variables.
# Hence, should each word be tested against its own unique null distribution?
adjusted_p_values_min <- multtest::ss.minP(null, obs, praw, alternative, get.cutoff = FALSE,
                                           get.cr = FALSE, get.adjp = TRUE, alpha = 0.05)
adjusted_p_values_min
# Throws an error
adjusted_p_values_sdmax <- sd.maxT(null, obs, alternative, get.cutoff = TRUE,
                                   get.cr = TRUE, get.adjp = TRUE, alpha = 0.05)
adjusted_p_values_sdmax
Considering the very different conclusions from the first two methods, I’m wondering whether my plan to use these methods is incorrect in the first place. Basically, I want to test several hundred observed distances against a null distribution of several thousand distances.
obs = the observed distances from different observed points in space to the same “original” point A. (Hence the distances are not independent, since they all relate to the same point.)
null = the null distribution, comprising distances between the same original point A and points that have been randomly selected (replacement = TRUE) from the different observed points.
Using ss.maxT seems way too conservative to me, whereas using ss.minP seems unnecessary if it “just” returns the raw p-values; or what am I missing?
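To illustrate why the raw p-values might come back unchanged, here is a minimal base-R sketch of a single-step max-T adjustment done by hand. It is not the multtest implementation; the upper-tail comparison (>=) and the fixed seed are assumptions made only for illustration (for a lower-tail test one would flip the comparison and take the per-permutation minimum instead). Because every row of null is identical in my example, the per-permutation maximum is just the shared null itself, so the adjusted and raw p-values coincide.

```r
# Toy inputs mirroring the example above: 5 observed distances, 20 null draws.
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
obs <- c(0.001, 0.2, 0.50, 0.9, 0.9999)
null_values <- runif(20)

# One row per hypothesis, one column per permutation.
# byrow = TRUE recycles the same 20 values into every row, so all rows are identical.
null <- matrix(null_values, nrow = length(obs), ncol = length(null_values), byrow = TRUE)

# Raw upper-tail p-values: share of null draws at least as large as each observed value.
# (obs is recycled down the columns, so obs[i] is compared with row i.)
p_raw <- rowMeans(null >= obs)

# Single-step max-T by hand: take the maximum over hypotheses within each permutation,
# then compare every observed value with that distribution of maxima.
max_null <- apply(null, 2, max)
p_maxT <- vapply(obs, function(t) mean(max_null >= t), numeric(1))

# With identical rows, the maximum equals the shared null, so nothing is adjusted.
print(cbind(obs, p_raw, p_maxT))
```

In other words, with one shared null row there is no "maximum over hypotheses" to speak of, so a hand-rolled adjustment of this kind cannot differ from the raw p-values.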
Can I perhaps solve this situation by constructing individual null distributions for every observed distance?
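As a rough sketch of what I have in mind, the snippet below gives every observed distance its own row of null values and then applies a hand-rolled single-step max-T adjustment in base R. The uniform draws, the number of resamples (n_perm) and the upper-tail comparison are all placeholders I made up for illustration; in the real analysis, column b would hold the distances from the same resample b so that the dependence between distances is preserved.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
obs <- c(0.001, 0.2, 0.50, 0.9, 0.9999)
n_perm <- 1000  # hypothetical number of resamples

# Placeholder null: one row per observed distance, one column per resample.
# Independent uniform draws stand in for the real resampled distances.
null <- matrix(runif(length(obs) * n_perm), nrow = length(obs), ncol = n_perm)

# Raw upper-tail p-values, each distance against its own null row.
p_raw <- rowMeans(null >= obs)

# Single-step max-T: per-resample maximum across all distances,
# then the share of maxima at least as large as each observed value.
max_null <- apply(null, 2, max)
p_maxT <- vapply(obs, function(t) mean(max_null >= t), numeric(1))

print(round(cbind(obs, p_raw, p_maxT), 3))
```

By construction the adjusted values can never be smaller than the raw ones, since the per-resample maximum is at least as large as any single row; how much larger they get depends on how strongly the rows are correlated.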
Thank you in advance!