Dear all,
as a spin-off of this other topic (HTMLReport of DESeq2 results with Ensembl Gene Ids), here's my question:
I have a dataset with multifactorial design, aligned and quantified with Ensembl references/annotations, and I would like to get the DE genes in each contrast to be reported in the nice framework of ReportingTools.
Now, I have 3 levels for one factor (condition) and 2 for the other one (tissue)
ddsmf <- DESeqDataSetFromMatrix(countMatrix, colData = data.frame(condition=samplesDesign$strain, tissue = samplesDesign$tissueID), design = ~ tissue + condition) ddsmf <- DESeq(ddsmf,parallel=T) res_mf_tissue1vs2 <- results(ddsmf, contrast = c("tissue","t1","t2")) res_mf_condition1vs2 <- results(ddsmf, contrast = c("condition","c1","c2")) res_mf_condition1vs3 <- results(ddsmf, contrast = c("condition","c1","c3")) res_mf_condition2vs3 <- results(ddsmf, contrast = c("condition","c2","c3")) library("hwriter") library("biomaRt") library("ReportingTools") add.anns <- function(df, mart, ...) { nm <- rownames(df) anns <- getBM( attributes = c("ensembl_gene_id", "mgi_symbol", "description"), filters = "ensembl_gene_id", values = nm, mart = mart) anns <- anns[match(nm, anns[, 1]), ] colnames(anns) <- c("Ensembl", "Gene Symbol", "Gene Description") df <- cbind(anns, df[, 2:ncol(df)]) # slight modification to the suggestion of James rownames(df) <- nm df } # to add links to ensembl ensemblLinks <- function(df, ...){ naind <- is.na(df$Ensembl) df$Ensembl <- hwrite(as.character(df$Ensembl), link = paste0("http://www.ensembl.org/Mus_musculus/Gene/Summary?g=", as.character(df$Ensembl)), table = FALSE) df$Ensembl[naind] <- "" return(df) } mart <- useMart("ensembl",dataset="mmusculus_gene_ensembl")
Now, my best case scenario is that I specify the contrast as in the results function of DESeq2, and the corresponding DE genes are extracted and reported. Something like...
rt_cfrcond <- HTMLReport(shortName = 'report_cond',title = 'report_cond',reportDirectory = "./_reports") publish(ddsmf, rt_cfrcond, .modifyDF = list(add.anns,modifyReportDF,ensemblLinks), mart = mart,factor=colData(ddsmf)$condition) finish(rt_cfrcond)
When I provide the factor as
colData(ddsmf)$condition
then I get the comparison between first and last one (as probably expected from the DESeq2 behaviour), also with the nice boxplots (for all 3 factors, actually). So I can't do it for the other comparisons I had in mind - only by "cheating" and providing the DESeqResults (where I provide the contrasts), but then I lose the nice boxplots of normalized counts.
More strangely, when I provide the tissues as the contrast, the results are provided as if I am contrasting conditions...
rt_tis <- HTMLReport(shortName = 'report_tissues',title = 'report_tissues',reportDirectory = "./_reports") publish(ddsmf, rt_tis, .modifyDF = list(add.anns,modifyReportDF,ensemblLinks), mart = mart,factor=colData(ddsmf)$tissue) finish(rt_tis)
As in the other post I linked, I tried already with some searching around, but was unsuccessful in finding *the* solution for my problem, so I thought I could be helped/and maybe helpful to others here.
So, my question is somehow two-fold:
- How can I provide the contrasts correctly when checking on the factor with 3 levels?
- What am I doing wrong/what do I need to tweak for the contrast on tissue to be reported correctly?
Thanks in advance to all who can provide hints and suggestions - I hope my case can be interesting to others too!
Federico
Here is my sessionInfo() output:
sessionInfo() R version 3.1.2 (2014-10-31) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] ReportingTools_2.6.0 AnnotationDbi_1.28.1 [3] Biobase_2.26.0 RSQLite_1.0.0 [5] DBI_0.3.1 knitr_1.9 [7] biomaRt_2.22.0 hwriter_1.3.2 [9] BiocParallel_1.0.3 DESeq2_1.6.3 [11] RcppArmadillo_0.4.650.1.1 Rcpp_0.11.5 [13] GenomicRanges_1.18.4 GenomeInfoDb_1.2.4 [15] IRanges_2.0.1 S4Vectors_0.4.0 [17] BiocGenerics_0.12.1 loaded via a namespace (and not attached): [1] acepack_1.3-3.3 annotate_1.44.0 AnnotationForge_1.8.2 [4] base64enc_0.1-2 BatchJobs_1.5 BBmisc_1.9 [7] Biostrings_2.34.1 biovizBase_1.14.1 bitops_1.0-6 [10] brew_1.0-6 BSgenome_1.34.1 Category_2.32.0 [13] checkmate_1.5.1 cluster_2.0.1 codetools_0.2-11 [16] colorspace_1.2-6 dichromat_2.0-0 digest_0.6.8 [19] edgeR_3.8.6 evaluate_0.5.5 fail_1.2 [22] foreach_1.4.2 foreign_0.8-63 formatR_1.0 [25] Formula_1.2-0 genefilter_1.48.1 geneplotter_1.44.0 [28] GenomicAlignments_1.2.2 GenomicFeatures_1.18.3 GGally_0.5.0 [31] ggbio_1.14.0 ggplot2_1.0.0 GO.db_3.0.0 [34] GOstats_2.32.0 graph_1.44.1 grid_3.1.2 [37] gridExtra_0.9.1 GSEABase_1.28.0 gtable_0.1.2 [40] Hmisc_3.15-0 iterators_1.0.7 lattice_0.20-30 [43] latticeExtra_0.6-26 limma_3.22.7 locfit_1.5-9.1 [46] MASS_7.3-39 Matrix_1.1-5 munsell_0.4.2 [49] nnet_7.3-9 OrganismDbi_1.8.1 PFAM.db_3.0.0 [52] plyr_1.8.1 proto_0.3-10 R.methodsS3_1.7.0 [55] R.oo_1.19.0 R.utils_2.0.0 RBGL_1.42.0 [58] RColorBrewer_1.1-2 RCurl_1.95-4.5 reshape_0.8.5 [61] reshape2_1.4.1 rpart_4.1-9 Rsamtools_1.18.3 [64] rtracklayer_1.26.2 scales_0.2.4 sendmailR_1.2-1 [67] splines_3.1.2 stringr_0.6.2 survival_2.38-1 [70] tools_3.1.2 VariantAnnotation_1.12.9 XML_3.98-1.1 [73] xtable_1.7-4 XVector_0.6.0 zlibbioc_1.12.0
I agree with James. In DESeq2 version 1.4 we added a class DESeqResults to wrap up the results object, which is a simple DataFrame. This allows downstream packages like ReportingTools to annotate the results table directly, so that downstream packages can avoid trying to mimic a results() call internally.
You say,
"only by cheating and providing the DESeqResults (where I provide the contrasts), but then I lose the nice boxplots of normalized counts."
But did you try passing the DESeqResults to
object
, and the DESeqDataSet toDataSet
?Thank you James.
The shameful part for me was that I did not spot correctly the contrast parameter.
With that, if specified correctly, it seems to work like a charm.
Where I should not thank you is for teasing the inner strings of my tiny control freak - but hey, in the end that has always been a push to try and learn a little more on coding especially in R :)
@Mike: you're right, that way works also pretty well. Thank you too for chiming in!
Federico