Hi, I am puzzled by the apparently opposite results from the mroast and romer functions in limma.
If I understood correctly, the difference is that romer is optimized for GSEA-style analysis over a whole database of gene sets and uses sample-label rotations, while mroast uses gene rotations and is suited to testing one or a few gene sets.
In my case I have 22 samples (11 vs 11), and I want to test two custom gene sets: M1 (14 genes) and M2 (24 genes).
If I plot the running enrichment scores (y axis) along the overall list of genes ranked by moderated t-statistic, I can clearly see that, especially for the M1 set, the genes in the set accumulate at the top of the list. This agrees with the mroast direction, which says "Up", yet the romer "Up" p-values are high while the "Down" p-values are low.
Why are these results opposite?
Moreover, the mroast PropUp column reports 1 for both the M1 and M2 lists, whereas it should be 0.7857143 for M1 and 0.5833333 for M2.
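(For reference, the kind of plot I mean is basically a barcode/enrichment plot of each set along the genes ranked by moderated t-statistic. This is not my actual plotting code, just a minimal sketch of the idea using limma's barcodeplot, with "fit2" standing in for my lmFit/eBayes object:)

library(limma)

## "fit2" is a stand-in for my moderated fit (lmFit followed by eBayes on gene.level);
## idx maps the gene-set IDs to rows of the expression matrix
idx <- ids2indices(m1m2_list, rownames(gene.level))

## Barcode/enrichment plot of each set against the moderated t-statistics:
## genes ranked along the x axis, set members drawn as vertical bars
barcodeplot(fit2$t[, 1], index = idx$M1)
barcodeplot(fit2$t[, 1], index = idx$M2)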
> gsea.m1m2=mroast(gene.level, index=ids2indices(m1m2_list, rownames(gene.level)),design=design, array.weights=aw, nrot=999999)
> gsea.m1m2
NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
M2 24 0 1 Up 1e-06 1e-06 1e-06 1e-06
M1 14 0 1 Up 1e-06 1e-06 1e-06 1e-06
> gsea.m1m2=romer(gene.level, index=ids2indices(m1m2_list, rownames(gene.level)),design=design, array.weights=aw)
> gsea.m1m2
NGenes Up Down Mixed
M1 14 0.8984 0.1017 0.9999
M2 24 0.9831 0.0170 0.9997
> m1m2_list
$M1
[1] "15945" "20304" "20846" "17329" "21926" "12524" "18126" "20296" "27056" "16193" "16365" "16176" "83430" "20656"
$M2
[1] "17534" "19016" "19224" "21803" "16181" "17533" "16153" "12655" "20295" "20778" "170786" "20852" "11846"
[14] "16600" "57262" "16178" "56221" "170776" "98396" "19015" "170779" "69142" "93671" "69165"
> gene.level[1:3,1:6]
scr.healthy anti210.healthy scr.healthy anti210.healthy scr.healthy anti210.healthy
100008567 6.834766 7.008072 6.935282 6.818102 6.830925 6.676217
100017 6.847484 7.260983 6.894910 7.166771 7.249061 7.224644
100019 7.633976 7.522403 7.595522 7.609688 7.673701 7.553197
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.04
locale:
[1] LC_CTYPE=it_IT.UTF-8 LC_NUMERIC=C LC_TIME=it_IT.UTF-8 LC_COLLATE=it_IT.UTF-8
[5] LC_MONETARY=it_IT.UTF-8 LC_MESSAGES=it_IT.UTF-8 LC_PAPER=it_IT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] BiocInstaller_1.18.4 limma_3.24.15 clusterProfiler_2.2.7 DOSE_2.6.6 org.Mm.eg.db_3.1.2
[6] pathview_1.8.0 org.Hs.eg.db_3.1.2 AnnotationDbi_1.30.1 GenomeInfoDb_1.4.3 IRanges_2.2.7
[11] S4Vectors_0.6.6 KEGGgraph_1.26.0 graph_1.46.0 XML_3.98-1.3 scales_0.3.0
[16] biomaRt_2.24.1 beadarray_2.18.0 ggplot2_1.0.1 Biobase_2.28.0 BiocGenerics_0.14.0
[21] RSQLite_1.0.0 DBI_0.3.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.1 plyr_1.8.3 XVector_0.8.0 bitops_1.0-6 tools_3.2.2
[6] zlibbioc_1.14.0 base64_1.1 digest_0.6.8 gtable_0.1.2 png_0.1-7
[11] igraph_1.0.1 Rgraphviz_2.12.0 proto_0.3-10 stringr_1.0.0 httr_1.0.0
[16] Biostrings_2.36.4 grid_3.2.2 qvalue_2.0.0 R6_2.1.1 GOSemSim_1.26.0
[21] BeadDataPackR_1.20.0 GO.db_3.1.2 reshape2_1.4.1 DO.db_2.9 magrittr_1.5
[26] GenomicRanges_1.20.8 MASS_7.3-44 splines_3.2.2 KEGGREST_1.8.0 colorspace_1.2-6
[31] stringi_0.5-5 RCurl_1.95-4.7 munsell_0.4.2 illuminaio_0.10.0
This reversal between roast (a self-contained test) and romer (a competitive test) can only occur in special circumstances.
I have a suspicion of what the problem may be here. Please show me what your design matrix is.
Also, how exactly have you concluded that PropUp should be 0.786 for M1 and 0.583 for M2?
As regards the PropUp value: if I understood correctly, it stands for "proportion up", so simply by looking at the topTable entries for those genes I count 11 up out of 14 for the M1 gene set and 14 up out of 24 for M2.
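Roughly how I counted them (again, "fit2" is just a stand-in for my moderated fit, not the exact code I ran):

library(limma)

## moderated t-statistics for all genes; rownames are the gene IDs
tt <- topTable(fit2, coef = 1, number = Inf)

## proportion of set genes with a positive moderated t-statistic
mean(tt[m1m2_list$M1, "t"] > 0)   # 11/14 = 0.7857143 for M1
mean(tt[m1m2_list$M2, "t"] > 0)   # 14/24 = 0.5833333 for M2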
Here is my design matrix, which is the same between the mroast and romer calls.
I don't think you are testing what you think you are testing, given the design you show (you have no intercept) and the contrast that is being used (you haven't specified one, so it defaults to the last column of your design matrix).
You can keep the same design and specify contrast=c(1, -1) in your calls to romer and roast, which would be testing the (anti210 / scr) fold changes (see the sketch below).
You said you compared the romer and roast results vs. what you got by looking at where the moderated t-stats landed on the distribution of the t-stats ... what was the (straight-up limma differential expression) code you used to calculate those?
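To make that concrete, here is a sketch of what the calls could look like with the contrast made explicit (same objects as in your post; the sign convention depends on the order of your design columns):

library(limma)

## same objects as above (gene.level, design, aw, m1m2_list); the only change
## is the explicit two-group contrast instead of the default, which otherwise
## falls back to the last column of the design matrix
idx <- ids2indices(m1m2_list, rownames(gene.level))

gsea.m1m2.mroast <- mroast(gene.level, index = idx, design = design,
                           contrast = c(1, -1), array.weights = aw,
                           nrot = 999999)
gsea.m1m2.romer <- romer(gene.level, index = idx, design = design,
                         contrast = c(1, -1), array.weights = aw)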
You are right! I forgot the contrast argument that I had used when fitting the model in limma. What a stupid mistake!
Now results agree:
> gsea.m1m2
NGenes Up Down Mixed
M1 14 0.1246 0.8755 0.4971
M2 24 0.3481 0.6520 0.8300
> gsea.m1m2.mroast
NGenes PropDown PropUp Direction PValue FDR PValue.Mixed FDR.Mixed
M1 14 0.07142857 0.4285714 Up 0.047 0.093 0.037 0.0370
M2 24 0.12500000 0.2500000 Up 0.468 0.468 0.034 0.0365
Anyway, it still seems strange that the two functions diverged from each other even when they were both testing the opposite contrast.
Good to see that you now have correct results.
The divergent results that you got originally are not at all surprising given the hypothesis you were testing -- see my answer below.