I'm confused in regards to the p-value for a multilevel design in deseq2. I have a timecourse (0h, 3h, 6h, 24h) and I want to know what is changing between each pairwise comparison. I'm curious as to how the p-value is the same for individual and all pairwise comparisons (aka 0hvs3h OR 6hvs24h have same pvalues).
From the research that i've done looking into ?results, it seems the contrast parameter stated is used for the logFC calculation and it seems from my googling (and TRYING my best to read stats equations) that the wald test would result in one p-value based on the design. I'm curious as to what I can do to look at specific up and down genes in regard to specific timepoints, aka pulling out specific p-values for speicif contrasts (0hvs3h OR 6hvs24h)
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~ Condition) dds <- DESeq(dds, test="LRT", reduced=~1) res <- results(dds, contrast = c("Condition", "0hpa", "24hpa")) sessionInfo() R version 3.4.4 (2018-03-15) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: OS X El Capitan 10.11.1 Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] DESeq2_1.16.1 SummarizedExperiment_1.6.5 DelayedArray_0.2.7 matrixStats_0.53.1 Biobase_2.36.2 [6] VennDiagram_1.6.20 futile.logger_1.4.3 RColorBrewer_1.1-2 knitr_1.20 edgeR_3.18.1 [11] limma_3.32.10 Seurat_2.3.0 Matrix_1.2-14 cowplot_0.9.2 ggplot2_2.2.1 [16] GenomicRanges_1.28.6 GenomeInfoDb_1.12.3 IRanges_2.10.5 S4Vectors_0.14.7 BiocGenerics_0.22.1 loaded via a namespace (and not attached): [1] snow_0.4-2 backports_1.1.2 Hmisc_4.1-1 VGAM_1.0-5 sn_1.5-2 [6] plyr_1.8.4 igraph_1.2.1 lazyeval_0.2.1 splines_3.4.4 BiocParallel_1.10.1 [11] digest_0.6.15 foreach_1.4.4 htmltools_0.3.6 lars_1.2 gdata_2.18.0 [16] memoise_1.1.0 magrittr_1.5 checkmate_1.8.5 cluster_2.0.7-1 mixtools_1.1.0 [21] ROCR_1.0-7 sfsmisc_1.1-2 annotate_1.54.0 recipes_0.1.2 gower_0.1.2 [26] dimRed_0.1.0 R.utils_2.6.0-9000 colorspace_1.3-2 blob_1.1.1 dplyr_0.7.4 [31] RCurl_1.95-4.10 genefilter_1.58.1 bindr_0.1.1 survival_2.42-3 zoo_1.8-1 [36] iterators_1.0.9 ape_5.1 glue_1.2.0 DRR_0.0.3 gtable_0.2.0 [41] ipred_0.9-6 zlibbioc_1.22.0 XVector_0.16.0 kernlab_0.9-25 ddalpha_1.3.2 [46] prabclus_2.2-6 DEoptimR_1.0-8 abind_1.4-5 scales_0.5.0 futile.options_1.0.1 [51] mvtnorm_1.0-7 DBI_0.8 Rcpp_0.12.16 metap_0.9 dtw_1.18-1 [56] xtable_1.8-2 htmlTable_1.11.2 magic_1.5-8 tclust_1.3-1 bit_1.1-12 [61] foreign_0.8-70 proxy_0.4-22 mclust_5.4 SDMTools_1.1-221 Formula_1.2-2 [66] tsne_0.1-3 lava_1.6.1 prodlim_2018.04.18 htmlwidgets_1.2 FNN_1.1 [71] gplots_3.0.1 fpc_2.1-11 acepack_1.4.1 modeltools_0.2-21 ica_1.0-1 [76] XML_3.98-1.11 pkgconfig_2.0.1 R.methodsS3_1.7.1 flexmix_2.3-14 nnet_7.3-12 [81] locfit_1.5-9.1 caret_6.0-79 labeling_0.3 AnnotationDbi_1.38.2 tidyselect_0.2.4 [86] rlang_0.2.0 reshape2_1.4.3 munsell_0.4.3 tools_3.4.4 RSQLite_2.1.0 [91] ranger_0.9.0 broom_0.4.4 ggridges_0.5.0 evaluate_0.10.1 geometry_0.3-6 [96] stringr_1.3.0 bit64_0.9-7 ModelMetrics_1.1.0 fitdistrplus_1.0-9 robustbase_0.92-8 [101] caTools_1.17.1 purrr_0.2.4 RANN_2.5.1 bindrcpp_0.2.2 pbapply_1.3-4 [106] nlme_3.1-137 formatR_1.5 R.oo_1.22.0 RcppRoll_0.2.2 compiler_3.4.4 [111] rstudioapi_0.7 png_0.1-7 geneplotter_1.54.0 tibble_1.4.2 stringi_1.1.7 [116] highr_0.6 lattice_0.20-35 trimcluster_0.1-2 psych_1.8.3.3 diffusionMap_1.1-0 [121] pillar_1.2.2 lmtest_0.9-36 data.table_1.10.4-3 bitops_1.0-6 irlba_2.3.2 [126] R6_2.2.2 latticeExtra_0.6-28 KernSmooth_2.23-15 gridExtra_2.3 codetools_0.2-15 [131] lambda.r_1.2.2 MASS_7.3-49 gtools_3.5.0 assertthat_0.2.0 CVST_0.2-1 [136] rprojroot_1.3-2 withr_2.1.2 mnormt_1.5-5 GenomeInfoDbData_0.99.0 diptest_0.75-7 [141] doSNOW_1.0.16 rpart_4.1-13 timeDate_3043.102 tidyr_0.8.0 class_7.3-14 [146] rmarkdown_1.9 segmented_0.5-3.0 Rtsne_0.13 numDeriv_2016.8-1 scatterplot3d_0.3-41 [151] lubridate_1.7.4 base64enc_0.1-3
I did that, as I stated
"""
From the research that i've done looking into ?results, it seems the contrast parameter stated is used for the logFC calculation and it seems from my googling (and TRYING my best to read stats equations) that the wald test would result in one p-value based on the design. I'm curious as to what I can do to look at specific up and down genes in regard to specific timepoints, aka pulling out specific p-values for speicif contrasts (0hvs3h OR 6hvs24h)
"""
My question is what is the common practice for figuring out which genes are changing in specific contrasts? (ie; What is changing along the time-series) I'm used to using edgeR in which a pvalue is generated for each contrast and in this case it seems that one pvalue is generated for the experiment. Is the common practice in DESeq to accomplish my question to find regions that pass a cutoff AND the individual contrast's logFC above a threshold?
If you want to test individual contrasts, and so generate p-values per contrast, you can switch from an LRT to a Wald test, by setting test="wald" when you run results().
Gotcha thanks!