Given the formula for mu in the paper/vignette, I was expecting mu to be on a log-normalized scale:
The mean values μ_ij = s_j * q_ij
where log2(q_ij) = X_j * B_i
head(assays(dds)[["mu"]])
## treated1 treated2 treated3 untreated1 untreated2
## FBgn0000008 154.396031 71.8609656 78.6055308 107.292909 169.04844
## FBgn0000014 1.501799 0.6989863 0.7645902 1.473255 2.32123
But when I looked at mu on my own data, I noticed that it is definitely not on a log scale, and that it correlates most strongly with the un-normalized counts:
> assays(dds)$mu[1:3,1:3]
subject_12207_visit_1 subject_12207_visit_2 subject_12507_visit_1
ENSG00000000419.14 243.76985 218.3846 171.21118
ENSG00000000457.14 257.88875 239.1162 183.68763
ENSG00000000460.17 75.09165 100.5722 70.53956
> assays(normTransform(dds))[[1]][1:3,1:3]
subject_12207_visit_1 subject_12207_visit_2 subject_12507_visit_1
ENSG00000000419.14 7.559637 7.277101 7.514149
ENSG00000000457.14 7.594060 7.321629 8.115484
ENSG00000000460.17 6.895476 6.438979 5.774912
> counts(dds, normalized = TRUE)[1:3,1:3]
subject_12207_visit_1 subject_12207_visit_2 subject_12507_visit_1
ENSG00000000419.14 187.6589 154.10494 181.80345
ENSG00000000457.14 192.2146 158.96685 276.33463
ENSG00000000460.17 118.0543 85.76122 53.75474
> counts(dds)[1:3,1:3]
subject_12207_visit_1 subject_12207_visit_2 subject_12507_visit_1
ENSG00000000419.14 260 191 175
ENSG00000000457.14 253 194 259
ENSG00000000460.17 112 109 48
What is the correct scale for mu?
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] gridExtra_2.3 here_1.0.1 foreach_1.5.2 forcats_0.5.1 stringr_1.4.0
[6] dplyr_1.0.8 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.6
[11] ggplot2_3.3.5 tidyverse_1.3.1 DESeq2_1.34.0 SummarizedExperiment_1.24.0 Biobase_2.54.0
[16] MatrixGenerics_1.6.0 matrixStats_0.61.0 GenomicRanges_1.46.1 GenomeInfoDb_1.30.1 IRanges_2.28.0
[21] S4Vectors_0.32.3 BiocGenerics_0.40.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 fs_1.5.2 lubridate_1.8.0 bit64_4.0.5 RColorBrewer_1.1-2 httr_1.4.2
[7] rprojroot_2.0.2 tools_4.1.2 backports_1.4.1 utf8_1.2.2 R6_2.5.1 DBI_1.1.2
[13] colorspace_2.0-3 withr_2.4.3 tidyselect_1.1.2 bit_4.0.4 compiler_4.1.2 cli_3.2.0
[19] rvest_1.0.2 xml2_1.3.3 DelayedArray_0.20.0 scales_1.1.1 genefilter_1.76.0 XVector_0.34.0
[25] pkgconfig_2.0.3 dbplyr_2.1.1 fastmap_1.1.0 limma_3.50.1 rlang_1.0.1 readxl_1.3.1
[31] rstudioapi_0.13 RSQLite_2.2.10 generics_0.1.2 jsonlite_1.8.0 BiocParallel_1.28.3 RCurl_1.98-1.6
[37] magrittr_2.0.2 GenomeInfoDbData_1.2.7 Matrix_1.4-0 Rcpp_1.0.8 munsell_0.5.0 fansi_1.0.2
[43] lifecycle_1.0.1 stringi_1.7.6 edgeR_3.36.0 zlibbioc_1.40.0 blob_1.2.2 parallel_4.1.2
[49] crayon_1.5.0 lattice_0.20-45 Biostrings_2.62.0 haven_2.4.3 splines_4.1.2 annotate_1.72.0
[55] hms_1.1.1 KEGGREST_1.34.0 locfit_1.5-9.4 pillar_1.7.0 codetools_0.2-18 geneplotter_1.72.0
[61] reprex_2.0.1 XML_3.99-0.9 glue_1.6.2 modelr_0.1.8 png_0.1-7 vctrs_0.3.8
[67] tzdb_0.2.0 cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1 cachem_1.0.6 xtable_1.8-4
[73] broom_0.7.12 survival_3.2-13 iterators_1.0.14 AnnotationDbi_1.56.2 memoise_2.0.1 ellipsis_0.3.2
I did find this related thread:
DESeq fitted values using coefficients vs mu
which seems to imply that mu is normalized but not log-transformed.
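For what it's worth, rearranging the formula quoted at the top seems to point the same way, if I am reading it correctly: the log2 applies only to q_ij, so mu itself would not be on a log scale. A short worked version, keeping the symbols from that formula (the numbers in the last line are made up purely for illustration and are not taken from my data):

```latex
\begin{align*}
\log_2(q_{ij}) &= X_j B_i \\
q_{ij} &= 2^{X_j B_i} \\
\mu_{ij} &= s_j \, q_{ij} = s_j \, 2^{X_j B_i} \\
% illustrative values only: X_j B_i = 7, s_j = 1.2
\mu_{ij} &= 1.2 \times 2^{7} = 153.6
\end{align*}
```

If that reading is right, the s_j factor in mu might also explain why it tracks the un-normalized counts most closely, but I would appreciate confirmation.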