Hi BioC community,
I work on metagenomic data (16S sequencing) and I want to indentify microorganisms differentially abundant between two groups. With this aim, I use the size Factors computed with GMPR, a normalization method adapted to metagenomic data. Consequently,I can not use directly the DESeq function. In addition, I have pvalues set to NA due to outliers, that's why, I want to replace outliers (like in the DESeq process). I am wondering if, after replacing outliers, I have to recompute the size factors ? I also set the cooksCutoff argument to "FALSE" after replacing outliers to avoid getting again NA pvalues, is it correct ?
Briefly : is the code below OK for you?
Many thanks in advance for your help, Best, Eleonore
gmpr.size.factor <- GMPR(t(otu.tab))
dds <- DESeqDataSetFromMatrix(countData = otu.tab,colData = cli,design= ~ gp)
sizeFactors(dds) <- gmpr.size.factor
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)
dds2<-replaceOutliers(dds)
gmpr.size.factor <- GMPR(t(counts(dds2,normalized=FALSE,replaced=FALSE)))
sizeFactors(dds2) <- gmpr.size.factor
dds2 <- estimateDispersions(dds2)
dds2 <- nbinomWaldTest(dds2)
res<-results(dds2, name="gp_A_vs_B",independentFiltering=FALSE,pAdjustMethod="none",cooksCutoff = "FALSE")
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C LC_TIME=French_France.1252
attached base packages:
[1] stats4 grid parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomformat_1.12.0 readxl_1.3.1 coin_1.3-1
[4] survival_3.1-8 pscl_1.5.5 DESeq2_1.24.0
[7] SummarizedExperiment_1.14.1 DelayedArray_0.10.0 BiocParallel_1.18.1
[10] matrixStats_0.56.0 GenomicRanges_1.36.1 GenomeInfoDb_1.20.0
[13] IRanges_2.18.3 S4Vectors_0.22.1 GMPR_0.1.3
[16] MASS_7.3-51.4 lmerTest_3.1-1 lme4_1.1-21
[19] Matrix_1.2-17 rmarkdown_2.1 car_3.0-6
[22] carData_3.0-3 emmeans_1.4.5 EMA_1.4.7
[25] tidyr_1.0.2 scales_1.1.0 gridExtra_2.3
[28] vegan_2.5-6 lattice_0.20-38 permute_0.9-5
[31] htmlwidgets_1.5.1 reshape2_1.4.3 plotly_4.9.2
[34] curatedMetagenomicData_1.14.1 ExperimentHub_1.10.0 dplyr_1.0.2
[37] Biobase_2.44.0 AnnotationHub_2.16.1 BiocFileCache_1.8.0
[40] dbplyr_1.4.2 BiocGenerics_0.30.0 ggplot2_3.2.1
[43] ape_5.3 phyloseq_1.28.0 knitr_1.28
loaded via a namespace (and not attached):
[1] tidyselect_1.1.0 RSQLite_2.2.0 AnnotationDbi_1.46.1
[4] FactoMineR_2.3 munsell_0.5.0 codetools_0.2-16
[7] preprocessCore_1.46.0 withr_2.1.2 colorspace_1.4-1
[10] highr_0.8 rstudioapi_0.11 leaps_3.1
[13] labeling_0.3 GenomeInfoDbData_1.2.1 bit64_0.9-7
[16] farver_2.0.1 rhdf5_2.28.1 coda_0.19-3
[19] vctrs_0.3.4 generics_0.1.0 TH.data_1.0-10
[22] xfun_0.12 R6_2.4.1 locfit_1.5-9.1
[25] bitops_1.0-6 assertthat_0.2.1 promises_1.1.0
[28] multcomp_1.4-14 nnet_7.3-12 gtable_0.3.0
[31] affy_1.62.0 sandwich_3.0-0 rlang_0.4.8
[34] genefilter_1.66.0 scatterplot3d_0.3-41 splines_3.6.1
[37] lazyeval_0.2.2 acepack_1.4.1 checkmate_2.0.0
[40] heatmap.plus_1.3 BiocManager_1.30.10 yaml_2.2.1
[43] abind_1.4-5 backports_1.1.5 httpuv_1.5.2
[46] Hmisc_4.3-1 tools_3.6.1 affyio_1.54.0
[49] ellipsis_0.3.0 RColorBrewer_1.1-2 siggenes_1.58.0
[52] Rcpp_1.0.4.6 plyr_1.8.5 base64enc_0.1-3
[55] progress_1.2.2 zlibbioc_1.30.0 purrr_0.3.3
[58] RCurl_1.98-1.1 prettyunits_1.1.1 rpart_4.1-15
[61] zoo_1.8-7 haven_2.2.0 ggrepel_0.8.1
[64] cluster_2.1.0 magrittr_1.5 data.table_1.12.8
[67] openxlsx_4.1.4 mvtnorm_1.0-11 hms_0.5.3
[70] mime_0.9 evaluate_0.14 xtable_1.8-4
[73] XML_3.99-0.3 rio_0.5.16 jpeg_0.1-8.1
[76] gcrma_2.56.0 compiler_3.6.1 biomaRt_2.40.5
[79] tibble_3.0.4 crayon_1.3.4 minqa_1.2.4
[82] htmltools_0.4.0 mgcv_1.8-28 later_1.0.0
[85] Formula_1.2-3 libcoin_1.0-6 geneplotter_1.62.0
[88] DBI_1.1.0 rappdirs_0.3.1 boot_1.3-22
[91] ade4_1.7-13 igraph_1.2.4.2 forcats_0.5.0
[94] pkgconfig_2.0.3 flashClust_1.01-2 numDeriv_2016.8-1.1
[97] foreign_0.8-71 foreach_1.4.8 annotate_1.62.0
[100] multtest_2.40.0 XVector_0.24.0 estimability_1.3
[103] scrime_1.3.5 stringr_1.4.0 digest_0.6.25
[106] Biostrings_2.52.0 cellranger_1.1.0 htmlTable_1.13.3
[109] curl_4.3 modeltools_0.2-23 shiny_1.4.0
[112] nloptr_1.2.1 GSA_1.03.1 lifecycle_0.2.0
[115] nlme_3.1-140 jsonlite_1.6.1 Rhdf5lib_1.6.3
[118] viridisLite_0.3.0 pillar_1.4.3 fastmap_1.0.1
[121] httr_1.4.1 interactiveDisplayBase_1.22.0 glue_1.4.0
[124] zip_2.0.4 png_0.1-7 iterators_1.0.12
[127] bit_1.1-15.2 stringi_1.4.3 blob_1.2.1
[130] latticeExtra_0.6-29 memoise_1.1.0
Hi Michael and thaks for your answer. How can I use custom size factors on DESeq function please ? Thanks again, Eleonore
sizeFactors(dds) <- ...
Hi again Michael,
I now understand what you mean, sorry for the delay. So, the code is simply as below :
Thanks a lot for your help !