Hi,
I'm having trouble interpreting the intended effect of the absRanking argument on GSVA. I'm running an example with three gene sets, one is strongly positive, one is strongly negative, and one without differential expression.
Standard GSVA (absRanking=FALSE) returns the expected results (up/down/none), but GSVA with absRanking=TRUE does not show any significant enrichment. My ultimate aim is to do ranking with mixed-sign genesets (up/down) but would've naively assumed that all-up / all-down are a special case of that and thus should work too.
Thanks for your help,
Gad
set.seed(19820)
library(GSVA)
library(limma)
p <- 1000 ## number of genes
n <- 100 ## number of samples
nGrp1 <- 50 ## number of samples in group 1
nGrp2 <- n - nGrp1 ## number of samples in group 2
# consider three disjoint gene sets
geneSets <- list(set1 = paste("g", 1:3, sep = ""),
set2 = paste("g", 4:6, sep = ""),
set3 = paste("g", 7:10, sep = ""))
# sample data from a normal distribution with mean 0 and st.dev. 1
x1 <- matrix(rnorm(n*p), nrow = p, ncol = n,
dimnames = list(paste("g", 1:p, sep = ""),
paste("s", 1:n, sep = "")))
# genes in set1 are expressed at higher levels in the last 'nGrp1+1' to 'n' samples
x1[geneSets$set1, (nGrp1+1):n] <- x1[geneSets$set1, (nGrp1+1):n] + 2
# Also set2 are lower
x1[geneSets$set2, (nGrp1+1):n] <- x1[geneSets$set2, (nGrp1+1):n] - 2
## build design matrix
design <- cbind(sampleGroup1 = 1, sampleGroup2vs1 = c(rep(0, nGrp1), rep(1, nGrp2)))
gsvapar1 <- gsvaParam(x1, geneSets, maxDiff = TRUE, absRanking = FALSE)
gsvapar2 <- gsvaParam(x1, geneSets, maxDiff = TRUE, absRanking = TRUE)
gsva_es1 <- gsva(gsvapar1)
gsva_es2 <- gsva(gsvapar2)
topTable(eBayes(lmFit(gsva_es1, design)), coef = "sampleGroup2vs1")
logFC AveExpr t P.Value adj.P.Val B
set2 -1.06001420 -0.007121808 -27.712090 8.321766e-50 2.496530e-49 103.047997
set1 1.09278445 -0.003754398 26.819689 1.651951e-48 2.477926e-48 100.049550
set3 -0.07639161 0.008858442 -1.132907 2.598605e-01 2.598605e-01 -7.663724
topTable(eBayes(lmFit(gsva_es2, design)), coef = "sampleGroup2vs1")
logFC AveExpr t P.Value adj.P.Val B
set1 -0.02693682 0.7847923 -1.2968183 0.1957106 0.4985793 -5.381740
set2 0.02016743 0.7983954 0.9709200 0.3323862 0.4985793 -5.735982
set3 -0.01100966 0.6397191 -0.5300375 0.5964860 0.5964860 -6.053170
sessionInfo( )
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: ---
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8 LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8 LC_PAPER=en_AU.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] limma_3.50.3 GSVA_1.50.1
Thanks Robert, appreciate the quick reply (and fix).