Why are padj results different for BetaPrior = TRUE vs lfcShrink()?
1
0
Entering edit mode
Ali Barry ▴ 40
@2f691b31
Last seen 3 days ago
United Kingdom

Hi,

I'm struggling to wrap my head around the change in absolute number of DEGs calculated using different shrinkage methods when I specify independent hypothesis weight (ihw) in both instances. Is this difference expected, or am I feeding results into lfcShrink incorrectly? Possibly an assumption I'm missing?

Overview

Building my DESeq object

> dds <- DESeqDataSetFromMatrix(countData = toc, colData = colData, design = ~batch + Pop_Sex)
> dds <- estimateSizeFactors(dds)
> dds <- estimateDispersions(dds)
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates

Hypothesis Testing

a) Hypothesis testing using normal shrinkage through BetaPrior=TRUE

> dds$Pop_Sex <- relevel(dds$Pop_Sex,"MRTD_F")
> dds <- nbinomWaldTest(dds, betaPrior=TRUE) 
found results columns, replacing these

> resultsNames(dds)
 [1] "Intercept"     "batch1"        "batch2"        "Pop_SexMRTD_F" "Pop_SexCGRT_F" "Pop_SexCGRT_M"
 [7] "Pop_SexCRTH_F" "Pop_SexCRTH_M" "Pop_SexMRTD_M" "Pop_SexTBAC_F" "Pop_SexTBAC_M" "Pop_SexTDNV_F"
[13] "Pop_SexTDNV_M"

> results_MRTD_norm <- results(dds, contrast=c("Pop_Sex","MRTD_M", "MRTD_F"), cooksCutoff=TRUE, filterFun=ihw)
> table(results_MRTD_norm$padj < 0.05)

FALSE  TRUE 
44233    15

b) Hypothesis testing using apeglm shrinkage through lfcShrink

> dds$Pop_Sex <- relevel(dds$Pop_Sex,"MRTD_F") 
> dds <- nbinomWaldTest(dds, betaPrior=FALSE)
found results columns, replacing these

> resultsNames(dds)
 [1] "Intercept"                "batch_2_vs_1"             "Pop_Sex_CGRT_F_vs_MRTD_F"
 [4] "Pop_Sex_CGRT_M_vs_MRTD_F" "Pop_Sex_CRTH_F_vs_MRTD_F" "Pop_Sex_CRTH_M_vs_MRTD_F"
 [7] "Pop_Sex_MRTD_M_vs_MRTD_F" "Pop_Sex_TBAC_F_vs_MRTD_F" "Pop_Sex_TBAC_M_vs_MRTD_F"
[10] "Pop_Sex_TDNV_F_vs_MRTD_F" "Pop_Sex_TDNV_M_vs_MRTD_F"


> res <- results(dds, name="Pop_Sex_MRTD_M_vs_MRTD_F", cooksCutoff=TRUE, filterFun=ihw)
> results_MRTD_ape  <- lfcShrink(dds, coef="Pop_Sex_MRTD_M_vs_MRTD_F", res=res, type="apeglm")
> table(results_MRTD_ape$padj < 0.05)

FALSE  TRUE 
44122   126 

# gives very similar result (112 padj < 0.05) when fed into lfcShrink
# res  <- results(dds, contrast=c("Pop_Sex","MRTD_M", "MRTD_F"), cooksCutoff=TRUE, filterFun=ihw)
> DESeq2::plotMA(results_MRTD_norm)
![MAplot for normal shrinkage][1]

> DESeq2::plotMA(results_MRTD_ape)
![MAplot from apeglm shrinkage][1]
> head(results_MRTD_norm)
log2 fold change (MAP): Pop_Sex MRTD_M vs MRTD_F 
Wald test p-value: Pop_Sex MRTD_M vs MRTD_F 
DataFrame with 6 rows and 7 columns
                      baseMean log2FoldChange     lfcSE      stat    pvalue      padj    weight
                     <numeric>      <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
ENSMUSG00000000001 724.6171634      -0.107774  0.297979 -0.361685  0.717588         1  0.866199
ENSMUSG00000000003   0.0243594       0.000000  0.621820  0.000000  1.000000         1  0.000000
ENSMUSG00000000028  45.0620373      -0.165511  0.637214 -0.259741  0.795064         1  2.245600
ENSMUSG00000000031 698.5754183      -0.154553  1.053695 -0.146677  0.883387         1  1.121025
ENSMUSG00000000037   3.6064554       0.335497  0.659736  0.508532  0.611081         1  1.897672
ENSMUSG00000000049 249.3472568       0.145636  0.543124  0.268145  0.788587         1  1.606663


> head(results_MRTD_ape)
log2 fold change (MAP): Pop Sex MRTD M vs MRTD F 
Wald test p-value: Pop Sex MRTD M vs MRTD F 
DataFrame with 6 rows and 5 columns
                      baseMean log2FoldChange      lfcSE    pvalue      padj
                     <numeric>      <numeric>  <numeric> <numeric> <numeric>
ENSMUSG00000000001 724.6171634   -1.23632e-06 0.00144268  0.716050         1
ENSMUSG00000000003   0.0243594   -7.51882e-09 0.00144270  0.994846         1
ENSMUSG00000000028  45.0620373   -3.66459e-07 0.00144269  0.802388         1
ENSMUSG00000000031 698.5754183   -7.02208e-08 0.00144269  0.943279         1
ENSMUSG00000000037   3.6064554    6.82196e-07 0.00144269  0.629760         1
ENSMUSG00000000049 249.3472568    4.22034e-07 0.00144269  0.825730         1

Session details

> sessionInfo( )
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] msigdbr_7.4.1               pathfindR_1.6.3             pathfindR.data_1.1.2        goseq_1.44.0               
 [5] geneLenDataBase_1.28.0      BiasedUrn_1.07              GenomicFeatures_1.44.2      org.Mm.eg.db_3.13.0        
 [9] AnnotationDbi_1.54.1        BiocParallel_1.26.1         ggrepel_0.9.1               clusterProfiler_4.0.5      
[13] biomaRt_2.48.3              venneuler_1.1-0             rJava_1.0-6                 ggbiplot_0.55              
[17] scales_1.1.1                plyr_1.8.6                  ggpubr_0.4.0                cowplot_1.1.1              
[21] forcats_0.5.1               stringr_1.4.0               purrr_0.3.4                 readr_2.1.1                
[25] tidyr_1.1.4                 tibble_3.1.3                tidyverse_1.3.1             viridis_0.6.2              
[29] viridisLite_0.4.0           dplyr_1.0.7                 ComplexHeatmap_2.8.0        RColorBrewer_1.1-2         
[33] ggplot2_3.3.5               gplots_3.1.1                limma_3.48.3                IHW_1.20.0                 
[37] vsn_3.60.0                  DESeq2_1.32.0               SummarizedExperiment_1.22.0 MatrixGenerics_1.4.3       
[41] matrixStats_0.61.0          GenomicRanges_1.44.0        GenomeInfoDb_1.28.4         IRanges_2.26.0             
[45] S4Vectors_0.30.0            GEOquery_2.60.0             Biobase_2.52.0              BiocGenerics_0.38.0        

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3           rtracklayer_1.52.1       coda_0.19-4              bit64_4.0.5              knitr_1.37              
  [6] DelayedArray_0.18.0      data.table_1.14.2        KEGGREST_1.32.0          RCurl_1.98-1.3           doParallel_1.0.17       
 [11] generics_0.1.2           snow_0.4-4               preprocessCore_1.54.0    RSQLite_2.2.7            shadowtext_0.1.1        
 [16] bit_4.0.4                tzdb_0.2.0               enrichplot_1.12.3        xml2_1.3.2               lubridate_1.8.0         
 [21] assertthat_0.2.1         apeglm_1.14.0            xfun_0.28                hms_1.1.1                babelgene_21.4          
 [26] evaluate_0.15            fansi_0.5.0              restfulr_0.0.13          progress_1.2.2           caTools_1.18.2          
 [31] dbplyr_2.1.1             readxl_1.3.1             igraph_1.2.7             DBI_1.1.2                geneplotter_1.70.0      
 [36] ellipsis_0.3.2           backports_1.4.0          annotate_1.70.0          vctrs_0.3.8              Cairo_1.5-12.2          
 [41] abind_1.4-5              cachem_1.0.5             withr_2.4.3              ggforce_0.3.3            bdsmatrix_1.3-4         
 [46] GenomicAlignments_1.28.0 treeio_1.16.2            fdrtool_1.2.17           prettyunits_1.1.1        cluster_2.1.2           
 [51] DOSE_3.18.3              ape_5.5                  lazyeval_0.2.2           crayon_1.5.0             genefilter_1.74.0       
 [56] pkgconfig_2.0.3          slam_0.1-48              labeling_0.4.2           tweenr_1.0.2             nlme_3.1-152            
 [61] rlang_0.4.11             lifecycle_1.0.1          downloader_0.4           filelock_1.0.2           affyio_1.62.0           
 [66] BiocFileCache_2.0.0      modelr_0.1.8             cellranger_1.1.0         polyclip_1.10-0          Matrix_1.3-3            
 [71] aplot_0.1.2              carData_3.0-5            lpsymphony_1.20.0        reprex_2.0.1             GlobalOptions_0.1.2     
 [76] png_0.1-7                rjson_0.2.20             bitops_1.0-7             KernSmooth_2.23-20       Biostrings_2.60.1       
 [81] blob_1.2.2               shape_1.4.6              qvalue_2.24.0            rstatix_0.7.0            gridGraphics_0.5-1      
 [86] ggsignif_0.6.3           memoise_2.0.1            magrittr_2.0.1           zlibbioc_1.38.0          compiler_4.1.0          
 [91] scatterpie_0.1.7         BiocIO_1.2.0             bbmle_1.0.24             clue_0.3-60              Rsamtools_2.8.0         
 [96] cli_3.1.0                affy_1.70.0              XVector_0.32.0           patchwork_1.1.1          MASS_7.3-54             
[101] mgcv_1.8-35              tidyselect_1.1.1         stringi_1.7.3            emdbook_1.3.12           yaml_2.3.5              
[106] GOSemSim_2.18.1          locfit_1.5-9.4           fastmatch_1.1-3          tools_4.1.0              circlize_0.4.14         
[111] rstudioapi_0.13          foreach_1.5.2            gridExtra_2.3            farver_2.1.0             ggraph_2.0.5            
[116] digest_0.6.27            BiocManager_1.30.16      Rcpp_1.0.7               car_3.0-12               broom_0.7.12            
[121] httr_1.4.2               colorspace_2.0-2         rvest_1.0.2              XML_3.99-0.6             fs_1.5.2                
[126] splines_4.1.0            yulab.utils_0.0.4        tidytree_0.3.8           graphlayouts_0.7.1       ggplotify_0.1.0         
[131] xtable_1.8-4             jsonlite_1.7.2           ggtree_3.0.4             tidygraph_1.2.0          ggfun_0.0.5             
[136] R6_2.5.1                 htmltools_0.5.2          pillar_1.7.0             glue_1.4.2               fastmap_1.1.0           
[141] codetools_0.2-18         fgsea_1.18.0             mvtnorm_1.1-3            utf8_1.2.2               lattice_0.20-44         
[146] numDeriv_2016.8-1.1      curl_4.3.2               gtools_3.9.2             GO.db_3.13.0             survival_3.2-11         
[151] rmarkdown_2.11           munsell_0.5.0            DO.db_2.9                GetoptLong_1.0.5         GenomeInfoDbData_1.2.6  
[156] iterators_1.0.14         haven_2.4.3              reshape2_1.4.4           gtable_0.3.0
IHW lfcShrink DESeq2 • 1.1k views
ADD COMMENT
0
Entering edit mode

MA plots didn't upload properly, here at the links for ref.

normal : MAplot_normal.png
apeglm : MAplot_apeglm.png

R markdown : Rmd file

ADD REPLY
1
Entering edit mode
@mikelove
Last seen 4 days ago
United States

betaPrior=TRUE is essentially deprecated and not recommended since ~2016, as discussed in the apeglm paper. So really these should give the same results, because lfcShrink has no effect on padj.

ADD COMMENT
0
Entering edit mode

Thanks for the quick reply! The discrepancy was making me question my knowledge of shrinkage. I'm moving forward with lfcShrink anyways, but this definitely adds confidence.

ADD REPLY

Login before adding your answer.

Traffic: 361 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6