Hello,
I’m observing an unexpected behavior from DESeq2. I obtain a significant padj when I compare the transcripts from 2 groups with 0 reads in all the replicates.
> as.data.frame(res_F3POPS_vs_F3_CTRL) %>% rownames_to_column("id") %>% filter(id == "ENSRNOT00000088387") id baseMean log2FoldChange lfcSE stat 1 ENSRNOT00000088387 381.6177 -25.8996 4.694172 -5.517395 pvalue padj 1 0.00000003440612 0.0000003680966 > as.data.frame(txi$counts) %>% rownames_to_column("id") %>% filter(id == "ENSRNOT00000088387") id control_1 control_2 control_3 F3_CTRL_1 F3_CTRL_2 1 ENSRNOT00000088387 573.3939 1075.113 287.7978 0 0 F3_CTRL_3 F3_FA_1 F3_FA_2 F3_FA_3 F3_POPs_1 F3_POPs_2 F3_POPs_3 F3_POPsFA_1 1 0 0 0 0 0 0 0 0 F3_POPsFA_2 F3_POPsFA_3 F4_CTRL_1 F4_CTRL_3 F4_FA_1 F4_FA_2 F4_FA_3 F4_POPs_1 1 0 0 2 0 0 0 0 0 F4_POPs_2 F4_POPs_3 F4_POPsFA_1 F4_POPsFA_2 F4_POPsFA_3 FA_1 FA_2 1 0 0 0 0 0 3250.33 715.4023 FA_3 FA_PoPs_1 FA_PoPs_2 FA_PoPs_3 PoPs_1 PoPs_2 1 3177.454 881.573 1323.833 1859.41 849.9267 0.00003033968 PoPs_3 1 0.00000002652117
I used the following design:
> design %>% as.data.frame
group sample
1 F3_CTRL F3_CTRL_1
2 F3_CTRL F3_CTRL_2
3 F3_CTRL F3_CTRL_3
4 F3_FA F3_FA_1
5 F3_FA F3_FA_2
6 F3_FA F3_FA_3
7 F3_POPsFA F3_POPsFA_1
8 F3_POPsFA F3_POPsFA_2
9 F3_POPsFA F3_POPsFA_3
10 F3_POPs F3_POPs_1
11 F3_POPs F3_POPs_2
12 F3_POPs F3_POPs_3
13 F4_CTRL F4_CTRL_1
14 F4_CTRL F4_CTRL_3
15 F4_FA F4_FA_1
16 F4_FA F4_FA_2
17 F4_FA F4_FA_3
18 F4_POPsFA F4_POPsFA_1
19 F4_POPsFA F4_POPsFA_2
20 F4_POPsFA F4_POPsFA_3
21 F4_POPs F4_POPs_1
22 F4_POPs F4_POPs_2
23 F4_POPs F4_POPs_3
24 F1_FA FA_1
25 F1_FA FA_2
26 F1_FA FA_3
27 F1_POPsFA FA_PoPs_1
28 F1_POPsFA FA_PoPs_2
29 F1_POPsFA FA_PoPs_3
30 F1_POPs PoPs_1
31 F1_POPs PoPs_2
32 F1_POPs PoPs_3
33 F1_CTRL control_1
34 F1_CTRL control_2
35 F1_CTRL control_3
And I produced the results with this code:
dds <- DESeqDataSetFromTximport(txi, design, ~ group)
dds <- dds[rowSums(counts(dds)) >= 2]
dds <- DESeq(dds)
res_F3POPS_vs_F3_CTRL <- results(dds, contrast = c("group", "F3_POPs", "F3_CTRL"))
I also checked in the dds object to make sure the samples were correctly paired with the design:
> dds$sample
[1] "F3_CTRL_1" "F3_CTRL_2" "F3_CTRL_3" "F3_FA_1" "F3_FA_2"
[6] "F3_FA_3" "F3_POPsFA_1" "F3_POPsFA_2" "F3_POPsFA_3" "F3_POPs_1"
[11] "F3_POPs_2" "F3_POPs_3" "F4_CTRL_1" "F4_CTRL_3" "F4_FA_1"
[16] "F4_FA_2" "F4_FA_3" "F4_POPsFA_1" "F4_POPsFA_2" "F4_POPsFA_3"
[21] "F4_POPs_1" "F4_POPs_2" "F4_POPs_3" "FA_1" "FA_2"
[26] "FA_3" "FA_PoPs_1" "FA_PoPs_2" "FA_PoPs_3" "PoPs_1"
[31] "PoPs_2" "PoPs_3" "control_1" "control_2" "control_3"
> dds$group
[1] F3_CTRL F3_CTRL F3_CTRL F3_FA F3_FA F3_FA F3_POPsFA
[8] F3_POPsFA F3_POPsFA F3_POPs F3_POPs F3_POPs F4_CTRL F4_CTRL
[15] F4_FA F4_FA F4_FA F4_POPsFA F4_POPsFA F4_POPsFA F4_POPs
[22] F4_POPs F4_POPs F1_FA F1_FA F1_FA F1_POPsFA F1_POPsFA
[29] F1_POPsFA F1_POPs F1_POPs F1_POPs F1_CTRL F1_CTRL F1_CTRL
12 Levels: F1_CTRL F1_FA F1_POPs F1_POPsFA F3_CTRL F3_FA F3_POPs ... F4_POPsFA
Everything appears to be OK.
I’m probably missing something, but I don’t know what.
Could you help me with this problem?
Thank you very much!
Charles.
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] bindrcpp_0.2.2 DESeq2_1.20.0
[3] SummarizedExperiment_1.10.1 DelayedArray_0.6.1
[5] BiocParallel_1.14.1 matrixStats_0.53.1
[7] Biobase_2.40.0 GenomicRanges_1.32.3
[9] GenomeInfoDb_1.16.0 IRanges_2.14.10
[11] S4Vectors_0.18.3 BiocGenerics_0.26.0
[13] forcats_0.3.0 stringr_1.3.1
[15] dplyr_0.7.5 purrr_0.2.5
[17] readr_1.1.1 tidyr_0.8.1
[19] tibble_1.4.2 ggplot2_2.2.1
[21] tidyverse_1.2.1 rnaseq_0.0.27
loaded via a namespace (and not attached):
[1] nlme_3.1-137 bitops_1.0-6 lubridate_1.7.4
[4] bit64_0.9-7 httr_1.3.1 RColorBrewer_1.1-2
[7] tools_3.5.0 backports_1.1.2 utf8_1.1.4
[10] R6_2.2.2 rpart_4.1-13 Hmisc_4.1-1
[13] DBI_1.0.0 lazyeval_0.2.1 colorspace_1.3-2
[16] nnet_7.3-12 tidyselect_0.2.4 gridExtra_2.3
[19] mnormt_1.5-5 bit_1.1-14 compiler_3.5.0
[22] cli_1.0.0 rvest_0.3.2 htmlTable_1.12
[25] flashClust_1.01-2 xml2_1.2.0 scales_0.5.0
[28] checkmate_1.8.5 psych_1.8.4 genefilter_1.62.0
[31] digest_0.6.15 foreign_0.8-70 XVector_0.20.0
[34] base64enc_0.1-3 pkgconfig_2.0.1 htmltools_0.3.6
[37] FactoMineR_1.41 readxl_1.1.0 htmlwidgets_1.2
[40] rlang_0.2.1 rstudioapi_0.7 RSQLite_2.1.1
[43] bindr_0.1.1 jsonlite_1.5 acepack_1.4.1
[46] RCurl_1.95-4.10 magrittr_1.5 GenomeInfoDbData_1.1.0
[49] Formula_1.2-3 leaps_3.0 Matrix_1.2-14
[52] Rcpp_0.12.17 munsell_0.5.0 scatterplot3d_0.3-41
[55] stringi_1.2.3 MASS_7.3-50 zlibbioc_1.26.0
[58] plyr_1.8.4 grid_3.5.0 blob_1.1.1
[61] ggrepel_0.8.0 crayon_1.3.4 lattice_0.20-35
[64] haven_1.1.1 splines_3.5.0 annotate_1.58.0
[67] hms_0.4.2 locfit_1.5-9.1 knitr_1.20
[70] pillar_1.2.3 geneplotter_1.58.0 reshape2_1.4.3
[73] XML_3.98-1.11 glue_1.2.0 latticeExtra_0.6-28
[76] modelr_0.1.2 data.table_1.11.4 cellranger_1.1.0
[79] gtable_0.2.0 assertthat_0.2.0 xtable_1.8-2
[82] broom_0.4.4 survival_2.42-3 AnnotationDbi_1.42.1
[85] memoise_1.1.0 tximport_1.8.0 cluster_2.0.7
OK, I re-launched after changing the order of colData and it seems to correct the problem. Thank you for your input.
Charles.