Hello, I am trying to get differentially expressed genes after running Kallisto and importing abundances to DESeq2 by using tximport. I run DESEq2 after I relevel the condition factors for the specific comparison and then collect the specific results by using lfcShrink with apeglm.
When I switch the reference condition from untreated to treated, I get slightly different results. While most genes have the same log2FoldChange with only a sign change, some are different. For example, treated/untreated gives me 1608 significant differences, reverse comparison gives me 1618.
This only happens if I use lfcShrink with "apeglm". It doesn't happen with "normal" or "ashr" or if I just use "results".
Attached is an example of my code and also my ColData for dds1. Due to sequencing issues, I have some missing data that I do not include in the comparison, and I don't know if this is actually the problem.
I really appreciate it if you have any idea why this might be happening. Thank you
## Import kallisto abundances
txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)
## load data into DESEQ2
dds <- DESeqDataSetFromTximport(txi.kallisto, sampleTable, ~condition)
## First comparison
condition1 = "condition_36_ICN_vs_30_ICN"
## Reverse comparison
condition2 = "condition_30_ICN_vs_36_ICN"
# Run DEseq2
dds1 <- DESeq(dds)
# 1st comparison
# relevel to set cond2 as our reference sample
dds1$condition <- relevel(dds1$condition, ref = "30_ICN")
# Re-run DEseq2
dds1 <- DESeq(dds1)
# get results and use lfc shrinkage
resLFC1 <- lfcShrink(dds = dds1, coef = condition1, type = "apeglm", lfcThreshold = lfc)
## Order output on svalue
res1.ordered <- resLFC1[order(resLFC1$svalue),]
# 2nd comparison
dds2 <- DESeq(dds)
# relevel to set cond1 as our reference sample
dds2$condition <- relevel(dds2$condition, ref = "36_ICN")
# Re-run DEseq2
dds2 <- DESeq(dds2)
# get results and use lfc shrinkage
resLFC2 <- lfcShrink(dds = dds2, coef = condition2, type = "apeglm", lfcThreshold = lfc)
## Order output on svalue
res2.ordered <- resLFC2[order(resLFC2$svalue),]
sum(res1.ordered$svalue < 0.0005, na.rm = TRUE)
1608
sum(res2.ordered$svalue < 0.0005, na.rm = TRUE)
1618
Here is my colData for dds1
sampleName fileName sample Temp Site.Name condition replaceable
ES1-30A abundance.h5 ES1_30A 30 ICN 30_ICN FALSE
ES1-33A abundance.h5 ES1_33A 33 ICN 33_ICN FALSE
ES2-30A abundance.h5 ES2_30A 30 ICN 30_ICN FALSE
ES2-33A abundance.h5 ES2_33A 33 ICN 33_ICN FALSE
ES2-36A abundance.h5 ES2_36A 36 ICN 36_ICN FALSE
ES3-30A abundance.h5 ES3_30A 30 ICN 30_ICN FALSE
ES3-36A abundance.h5 ES3_36A 36 ICN 36_ICN FALSE
ES4-30A abundance.h5 ES4_30A 30 ICN 30_ICN FALSE
ES4-33A abundance.h5 ES4_33A 33 ICN 33_ICN FALSE
ES4-36A abundance.h5 ES4_36A 36 ICN 36_ICN FALSE
ES5-30A abundance.h5 ES5_30A 30 ICN 30_ICN FALSE
ES5-33A abundance.h5 ES5_33A 33 ICN 33_ICN FALSE
ES5-36A abundance.h5 ES5_36A 36 ICN 36_ICN FALSE
ES6-33A abundance.h5 ES6_33A 33 ICN 33_ICN FALSE
ES6-36A abundance.h5 ES6_36A 36 ICN 36_ICN FALSE
ES7-30A abundance.h5 ES7_30A 30 ICN 30_ICN FALSE
ES7-33A abundance.h5 ES7_33A 33 ICN 33_ICN FALSE
ES7-36A abundance.h5 ES7_36A 36 ICN 36_ICN FALSE
KS1-30A abundance.h5 KS1_30A 30 AF 30_AF TRUE
KS1-33A abundance.h5 KS1_33A 33 AF 33_AF FALSE
KS1-36A abundance.h5 KS1_36A 36 AF 36_AF TRUE
KS2-30A abundance.h5 KS2_30A 30 AF 30_AF TRUE
KS2-36A abundance.h5 KS2_36A 36 AF 36_AF TRUE
KS3-30A abundance.h5 KS3_30A 30 AF 30_AF TRUE
KS3-33A abundance.h5 KS3_33A 33 AF 33_AF FALSE
KS3-36A abundance.h5 KS3_36A 36 AF 36_AF TRUE
KS4-30A abundance.h5 KS4_30A 30 AF 30_AF TRUE
KS4-33A abundance.h5 KS4_33A 33 AF 33_AF FALSE
KS4-36A abundance.h5 KS4_36A 36 AF 36_AF TRUE
KS5-30A abundance.h5 KS5_30A 30 AF 30_AF TRUE
KS5-33A abundance.h5 KS5_33A 33 AF 33_AF FALSE
KS5-36A abundance.h5 KS5_36A 36 AF 36_AF TRUE
KS6-30A abundance.h5 KS6_30A 30 AF 30_AF TRUE
KS6-33A abundance.h5 KS6_33A 33 AF 33_AF FALSE
KS6-36A abundance.h5 KS6_36A 36 AF 36_AF TRUE
KS7-30A abundance.h5 KS7_30A 30 AF 30_AF TRUE
KS7-33A abundance.h5 KS7_33A 33 AF 33_AF FALSE
KS7-36A abundance.h5 KS7_36A 36 AF 36_AF TRUE
S1-E-ST-30B abundance.h5 S1_E_ST_30B 30 ExT 30_ExT TRUE
S1-E-ST-33B abundance.h5 S1_E_ST_33B 33 ExT 33_ExT TRUE
S1-E-ST-36B abundance.h5 S1_E_ST_36B 36 ExT 36_ExT TRUE
S1-P-ST-30B abundance.h5 S1_P_ST_30B 30 PrT 30_PrT TRUE
S1-P-ST-33B abundance.h5 S1_P_ST_33B 33 PrT 33_PrT TRUE
S1-P-ST-36B abundance.h5 S1_P_ST_36B 36 PrT 36_PrT FALSE
S2-E-ST-30B abundance.h5 S2_E_ST_30B 30 ExT 30_ExT TRUE
S2-E-ST-33B abundance.h5 S2_E_ST_33B 33 ExT 33_ExT TRUE
S2-E-ST-36B abundance.h5 S2_E_ST_36B 36 ExT 36_ExT TRUE
S2-P-ST-30B abundance.h5 S2_P_ST_30B 30 PrT 30_PrT TRUE
S2-P-ST-33B abundance.h5 S2_P_ST_33B 33 PrT 33_PrT TRUE
S2-P-ST-36B abundance.h5 S2_P_ST_36B 36 PrT 36_PrT FALSE
S3-E-ST-30B abundance.h5 S3_E_ST_30B 30 ExT 30_ExT TRUE
S3-E-ST-33B abundance.h5 S3_E_ST_33B 33 ExT 33_ExT TRUE
S3-E-ST-36B abundance.h5 S3_E_ST_36B 36 ExT 36_ExT TRUE
S3-P-ST-30B abundance.h5 S3_P_ST_30B 30 PrT 30_PrT TRUE
S3-P-ST-33B abundance.h5 S3_P_ST_33B 33 PrT 33_PrT TRUE
S3-P-ST-36B abundance.h5 S3_P_ST_36B 36 PrT 36_PrT FALSE
S4-E-ST-30B abundance.h5 S4_E_ST_30B 30 ExT 30_ExT TRUE
S4-E-ST-33B abundance.h5 S4_E_ST_33B 33 ExT 33_ExT TRUE
S4-E-ST-36B abundance.h5 S4_E_ST_36B 36 ExT 36_ExT TRUE
S4-P-ST-30B abundance.h5 S4_P_ST_30B 30 PrT 30_PrT TRUE
S4-P-ST-33B abundance.h5 S4_P_ST_33B 33 PrT 33_PrT TRUE
S4-P-ST-36B abundance.h5 S4_P_ST_36B 36 PrT 36_PrT FALSE
S5-E-ST-30B abundance.h5 S5_E_ST_30B 30 ExT 30_ExT TRUE
S5-E-ST-33B abundance.h5 S5_E_ST_33B 33 ExT 33_ExT TRUE
S5-E-ST-36B abundance.h5 S5_E_ST_36B 36 ExT 36_ExT TRUE
S5-P-ST-30B abundance.h5 S5_P_ST_30B 30 PrT 30_PrT TRUE
S5-P-ST-33B abundance.h5 S5_P_ST_33B 33 PrT 33_PrT TRUE
S5-P-ST-36B abundance.h5 S5_P_ST_36B 36 PrT 36_PrT FALSE
S6-E-ST-30B abundance.h5 S6_E_ST_30B 30 ExT 30_ExT TRUE
S6-E-ST-33B abundance.h5 S6_E_ST_33B 33 ExT 33_ExT TRUE
S6-E-ST-36B abundance.h5 S6_E_ST_36B 36 ExT 36_ExT TRUE
S6-P-ST-30B abundance.h5 S6_P_ST_30B 30 PrT 30_PrT TRUE
S6-P-ST-33B abundance.h5 S6_P_ST_33B 33 PrT 33_PrT TRUE
S7-E-ST-30B abundance.h5 S7_E_ST_30B 30 ExT 30_ExT TRUE
S7-E-ST-33B abundance.h5 S7_E_ST_33B 33 ExT 33_ExT TRUE
S7-E-ST-36B abundance.h5 S7_E_ST_36B 36 ExT 36_ExT TRUE
S7-P-ST-30B abundance.h5 S7_P_ST_30B 30 PrT 30_PrT TRUE
S7-P-ST-33B abundance.h5 S7_P_ST_33B 33 PrT 33_PrT TRUE
S7-P-ST-36B abundance.h5 S7_P_ST_36B 36 PrT 36_PrT FALSE
sessionInfo( )
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /local/R-3.6.2/lib/libRblas.so
LAPACK: /local/R-3.6.2/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2
[4] purrr_0.3.4 readr_1.4.0 tidyr_1.1.2
[7] tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0
[10] DT_0.15 DESeq2_1.26.0 SummarizedExperiment_1.16.1
[13] DelayedArray_0.12.3 BiocParallel_1.20.1 matrixStats_0.57.0
[16] Biobase_2.46.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1
[19] IRanges_2.20.2 S4Vectors_0.24.4 BiocGenerics_0.32.0
[22] rhdf5_2.30.1 tximport_1.14.2 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] colorspace_1.4-1 ellipsis_0.3.1 htmlTable_2.1.0 XVector_0.26.0
[5] base64enc_0.1-3 fs_1.5.0 rstudioapi_0.11 bit64_4.0.5
[9] mvtnorm_1.1-1 apeglm_1.8.0 AnnotationDbi_1.48.0 fansi_0.4.1
[13] lubridate_1.7.9 xml2_1.3.2 splines_3.6.2 geneplotter_1.64.0
[17] knitr_1.30 Formula_1.2-3 jsonlite_1.7.1 broom_0.7.0
[21] annotate_1.64.0 ashr_2.2-47 cluster_2.1.0 dbplyr_1.4.4
[25] png_0.1-7 compiler_3.6.2 httr_1.4.2 backports_1.1.10
[29] assertthat_0.2.1 Matrix_1.2-18 cli_2.0.2 later_1.1.0.1
[33] htmltools_0.5.0 tools_3.6.2 coda_0.19-4 gtable_0.3.0
[37] glue_1.4.2 GenomeInfoDbData_1.2.2 Rcpp_1.0.5 bbmle_1.0.23.1
[41] cellranger_1.1.0 vctrs_0.3.4 xfun_0.18 rvest_0.3.6
[45] irlba_2.3.3 lifecycle_0.2.0 XML_3.99-0.3 MASS_7.3-53
[49] zlibbioc_1.32.0 scales_1.1.1 hms_0.5.3 promises_1.1.1
[53] RColorBrewer_1.1-2 memoise_1.1.0 gridExtra_2.3 emdbook_1.3.12
[57] bdsmatrix_1.3-4 rpart_4.1-15 SQUAREM_2020.5 latticeExtra_0.6-29
[61] stringi_1.5.3 RSQLite_2.2.1 genefilter_1.68.0 checkmate_2.0.0
[65] truncnorm_1.0-8 rlang_0.4.8 pkgconfig_2.0.3 bitops_1.0-6
[69] invgamma_1.1 evaluate_0.14 lattice_0.20-41 Rhdf5lib_1.8.0
[73] htmlwidgets_1.5.2 bit_4.0.4 tidyselect_1.1.0 plyr_1.8.6
[77] magrittr_1.5 R6_2.4.1 generics_0.0.2 Hmisc_4.4-1
[81] DBI_1.1.0 pillar_1.4.6 haven_2.3.1 foreign_0.8-76
[85] withr_2.3.0 mixsqp_0.3-43 survival_3.2-7 RCurl_1.98-1.2
[89] nnet_7.3-14 modelr_0.1.8 crayon_1.3.4 rmarkdown_2.4
[93] jpeg_0.1-8.1 locfit_1.5-9.4 grid_3.6.2 readxl_1.3.1
[97] data.table_1.13.0 blob_1.2.1 reprex_0.3.0 digest_0.6.25
[101] xtable_1.8-4 numDeriv_2016.8-1.1 httpuv_1.5.4 munsell_0.5.0
Hi Michael, Thanks a lot for your prompt reply. I will continue with the natural coding. I wanted to make sure this is not because of an issue with the way I build the comparisons.