Entering edit mode
Hello, today I had some problem in DE analysis results replication. I analyzed the same dataset, with exactly the same code, same DESeq2 version but i got different results in a comparison. How could it be?
dds_stage <- DESeqDataSetFromTximport(cur_txi, cur_sampleTable, ~uid+stage)
The dataset is composed by 150 tumoral and matched-normal samples (for this reason I used the design ~uid+stage, where uid stands for patient ID and stage for individual tumor stage). I didn't set any seed. If I summarize DESeq results I obtain:
out of 36405 with nonzero total read count
adjusted p-value < 0.05
LFC > 1.00 (up) : 1, 0.0027%
LFC < -1.00 (down) : 0, 0%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
for the new analysis, while the previous result was
out of 36405 with nonzero total read count
adjusted p-value < 0.05
LFC > 1.00 (up) : 2, 0.0055%
LFC < -1.00 (down) : 0, 0%
outliers [1] : 0, 0%
low counts [2] : 0, 0%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
Thank you Matteo
dds_stage <- DESeqDataSetFromTximport(cur_txi, cur_sampleTable, ~uid+stage)
# include your problematic code here with any corresponding output
# please also include the results of running the following in an R session
sessionInfo( )
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] EnhancedVolcano_1.16.0 ggrepel_0.9.3 BiocParallel_1.32.6
[4] clusterProfiler_4.6.2 lubridate_1.9.2 forcats_1.0.0
[7] stringr_1.5.0 dplyr_1.1.1 purrr_1.0.1
[10] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
[13] ggplot2_3.4.2 tidyverse_2.0.0 openxlsx_4.2.5.2
[16] Nozzle.R1_1.1-1.1 RColorBrewer_1.1-3 gplots_3.1.3
[19] DESeq2_1.38.3 SummarizedExperiment_1.28.0 MatrixGenerics_1.10.0
[22] matrixStats_0.63.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
[25] IRanges_2.32.0 S4Vectors_0.36.2 pheatmap_1.0.12
[28] genefilter_1.80.3 tximport_1.26.1 biomaRt_2.54.1
[31] Biobase_2.58.0 BiocGenerics_0.44.0
loaded via a namespace (and not attached):
[1] utf8_1.2.3 shinydashboard_0.7.2 tidyselect_1.2.0
[4] heatmaply_1.4.2 RSQLite_2.3.1 AnnotationDbi_1.60.2
[7] htmlwidgets_1.6.2 grid_4.2.1 TSP_1.2-4
[10] scatterpie_0.1.8 munsell_0.5.0 codetools_0.2-19
[13] DT_0.27 withr_2.5.0 colorspace_2.1-0
[16] GOSemSim_2.24.0 Category_2.64.0 filelock_1.0.2
[19] pcaExplorer_2.24.0 knitr_1.42 rstudioapi_0.14
[22] DOSE_3.24.2 ca_0.71.1 NMF_0.26
[25] GenomeInfoDbData_1.2.9 polyclip_1.10-4 topGO_2.50.0
[28] farver_2.1.1 bit64_4.0.5 rhdf5_2.42.1
[31] downloader_0.4 treeio_1.22.0 vctrs_0.6.2
[34] generics_0.1.3 gson_0.1.0 xfun_0.38
[37] timechange_0.2.0 BiocFileCache_2.6.1 R6_2.5.1
[40] doParallel_1.0.17 graphlayouts_0.8.4 seriation_1.4.2
[43] locfit_1.5-9.7 rhdf5filters_1.10.1 gridGraphics_0.5-1
[46] bitops_1.0-7 cachem_1.0.7 shinyAce_0.4.2
[49] fgsea_1.24.0 DelayedArray_0.24.0 assertthat_0.2.1
[52] promises_1.2.0.1 scales_1.2.1 ggraph_2.1.0
[55] enrichplot_1.18.4 gtable_0.3.3 tidygraph_1.2.3
[58] rlang_1.1.0 splines_4.2.1 lazyeval_0.2.2
[61] shinyBS_0.61.1 BiocManager_1.30.20 reshape2_1.4.4
[64] threejs_0.3.3 crosstalk_1.2.0 httpuv_1.6.9
[67] qvalue_2.30.0 RBGL_1.74.0 tools_4.2.1
[70] ggplotify_0.1.0 gridBase_0.4-7 ellipsis_0.3.2
[73] Rcpp_1.0.10 plyr_1.8.8 base64enc_0.1-3
[76] progress_1.2.2 zlibbioc_1.44.0 RCurl_1.98-1.12
[79] prettyunits_1.1.1 viridis_0.6.2 cowplot_1.1.1
[82] cluster_2.1.4 magrittr_2.0.3 data.table_1.14.8
[85] SparseM_1.81 patchwork_1.1.2 hms_1.1.3
[88] mime_0.12 evaluate_0.20 xtable_1.8-4
[91] HDO.db_0.99.1 XML_3.99-0.14 gridExtra_2.3
[94] compiler_4.2.1 shadowtext_0.1.2 KernSmooth_2.23-20
[97] crayon_1.5.2 htmltools_0.5.5 GOstats_2.64.0
[100] ggfun_0.0.9 later_1.3.0 tzdb_0.3.0
[103] aplot_0.1.10 geneplotter_1.76.0 DBI_1.1.3
[106] tweenr_2.0.2 dbplyr_2.3.2 MASS_7.3-58.3
[109] rappdirs_0.3.3 Matrix_1.5-4 cli_3.6.1
[112] parallel_4.2.1 igraph_1.4.2 pkgconfig_2.0.3
[115] registry_0.5-1 plotly_4.10.1 xml2_1.3.3
[118] foreach_1.5.2 ggtree_3.6.2 annotate_1.76.0
[121] rngtools_1.5.2 webshot_0.5.4 XVector_0.38.0
[124] AnnotationForge_1.40.2 yulab.utils_0.0.6 digest_0.6.31
[127] graph_1.76.0 Biostrings_2.66.0 rmarkdown_2.21
[130] fastmatch_1.1-3 tidytree_0.4.2 dendextend_1.17.1
[133] GSEABase_1.60.0 curl_5.0.0 shiny_1.7.4
[136] gtools_3.9.4 nlme_3.1-162 lifecycle_1.0.3
[139] jsonlite_1.8.4 Rhdf5lib_1.20.0 viridisLite_0.4.1
[142] limma_3.54.2 fansi_1.0.4 pillar_1.9.0
[145] lattice_0.21-8 KEGGREST_1.38.0 fastmap_1.1.1
[148] httr_1.4.5 survival_3.5-5 GO.db_3.16.0
[151] glue_1.6.2 zip_2.3.0 png_0.1-8
[154] iterators_1.0.14 bit_4.0.5 Rgraphviz_2.42.0
[157] ggforce_0.4.1 stringi_1.7.12 blob_1.2.4
[160] caTools_1.18.2 memoise_2.0.1 ape_5.7-1
Michael Love thank you for answering. I can't be sure 100% but I am actually running DESeq2=1.38.3, which I suppose is the same from May 2023 (also my R version is from Nov 2022). I did an analysis in May 2023 and repeated yesterday. Could you please give me some advice?
No idea. But DESeq2 is deterministic, in that it doesn't have any change due to seed.
Two ways in the future to keep track is to print out .Rout files with sessionInfo() called at the bottom of scripts (likewise with Rmd output).
Also DESeq2 saves it's own version number in the object when it is run, so if you had saved the
dds
that would also have the info.yes I have both the dds objects. I did
metadata(dds)$version
as you sggested and both are generated byAny check/comparison between the two objects that I can do to figure out what's going on? Thanks
You can check
all.equal(x,y)
Thank you Michael Love . I did as you suggested and i get this output:
I really have no idea of what's happening. what do you think about?
Another idea would be:
Thanks, but still i got
TRUE
Not sure then. Just go with the results you can replicate now.