Dear all,
I am trying to analyze my smallRNAseq data with Deseq2 and to run ReportingTools on it.
I have different treatments and timepoints, I saw that you have a coef option to tell where to do the comparison with, but my data set then will exclude another timepoint or treatment. I am wondering if it is possible to get all pairwise comparisons from the data set as log2FC, or to input the results from Deseq2 into those columns and keep the barplots.
I tried to do them first as pairwise comparisons, but that did not seem to work properly:
dds.sub1 <- dds[ , dds$condition %in% c("M_3h", "R_3h") ]
dds.sub1$condition <- droplevels(dds.sub1$condition)
dds.sub1$type <- droplevels(dds.sub1$type)
publish(dds.sub1,des2Report, pvalueCutoff=1.1, annotation.db="org.Hs.egENSEMBL", factor = colData(dds.sub1)$condition, reportDir=out, n = length(row.names(dds.sub1)))
Error in counts(object) %*% whichSamples : non-conformable arguments
In case you have a comment or suggestion on that it would be nice.
I am also wondering if there is a way to print everything, even if the calculated pvalue is NA, I have one replicate that I would not exclude, but caused that some interesting genes get a pvalue of NA, and for me would be interesting to check them.
Thank you!
R version 3.2.4 (2016-03-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Debian GNU/Linux 8 (jessie) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base other attached packages: [1] ReportingTools_2.10.0 knitr_1.12.3 pheatmap_1.0.8 stringr_1.0.0 RColorBrewer_1.1-2 [6] gplots_3.0.1 DESeq2_1.10.1 RcppArmadillo_0.6.600.4.0 Rcpp_0.12.4 SummarizedExperiment_1.0.2 [11] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3 ggplot2_2.1.0 geneplotter_1.48.0 annotate_1.48.0 [16] XML_3.98-1.4 lattice_0.20-33 genefilter_1.52.1 org.Hs.eg.db_3.2.3 AnnotationDbi_1.32.3 [21] IRanges_2.4.8 S4Vectors_0.8.11 Biobase_2.30.0 BiocGenerics_0.16.1 scatterplot3d_0.3-37 [26] RSQLite_1.0.0 DBI_0.3.1 loaded via a namespace (and not attached): [1] edgeR_3.12.1 splines_3.2.4 R.utils_2.3.0 gtools_3.5.0 Formula_1.2-1 [6] latticeExtra_0.6-28 RBGL_1.46.0 BSgenome_1.38.0 Rsamtools_1.22.0 Category_2.36.0 [11] biovizBase_1.18.0 limma_3.26.9 XVector_0.10.0 colorspace_1.2-6 ggbio_1.18.5 [16] R.oo_1.20.0 Matrix_1.2-5 plyr_1.8.3 OrganismDbi_1.12.1 GSEABase_1.32.0 [21] biomaRt_2.26.1 zlibbioc_1.16.0 xtable_1.8-2 GO.db_3.2.2 scales_0.4.0 [26] gdata_2.17.0 BiocParallel_1.4.3 PFAM.db_3.2.2 GenomicFeatures_1.22.13 nnet_7.3-12 [31] survival_2.39-2 magrittr_1.5 R.methodsS3_1.7.1 GGally_1.0.1 hwriter_1.3.2 [36] foreign_0.8-66 GOstats_2.36.0 graph_1.48.0 BiocInstaller_1.20.3 tools_3.2.4 [41] munsell_0.4.3 locfit_1.5-9.1 cluster_2.0.4 lambda.r_1.1.7 Biostrings_2.38.4 [46] caTools_1.17.1 futile.logger_1.4.1 grid_3.2.4 RCurl_1.95-4.8 dichromat_2.0-0 [51] VariantAnnotation_1.16.4 AnnotationForge_1.12.2 bitops_1.0-6 gtable_0.2.0 reshape_0.8.5 [56] reshape2_1.4.1 GenomicAlignments_1.6.3 gridExtra_2.2.1 rtracklayer_1.30.4 Hmisc_3.17-3 [61] futile.options_1.0.0 KernSmooth_2.23-15 stringi_1.0-1 rpart_4.1-10 acepack_1.3-3.3
That means that I should run results for the overall dds (until now I only did contrast analysis)? So I do not loose this NA values and run the ReportingTools over this OverallComparisson. If I understood correctly? Then I would have to??
OverallComparisson <- results(dds, cooksCutoff = FALSE)
interesting because:
Also in padj there are NA values.
I do not seem to understand the problem, or what did I do wrong.
### I know this will fail, but still to be sure I understoo thisOvCo <- DESeqResults(OverallComparisson)
publish(OverallComparisson,des2Report, pvalueCutoff=1.1, annotation.db="org.Hs.egENSEMBL", factor = colData(dds)$condition, reportDir=out, n = length(row.names(dds)))
You can use whatever contrasts you like, I'm just suggesting that you pass a results object to ReportingTools.
I'm a bit lost and can't figure out what exactly you want from ReportingTools and what code you've run already.
Can you describe in detail what you want the report to look like?
Sorry, I have run already ReportingTools, and what I get (and I would like to add)
Gene names - plots - FC (FC for all comparissons**) : Including the genes with a padj of NA
**I understand that the FC of each pairwise comparisons has to be added later
My problem is with the padj NA. According to the manual I do not need to put a contrast so I can do
>OverallComparisson <- results(dds, cooksCutoff = FALSE)
which runs, but when I want to analyze it I get :
> OverallComparisson
and ReportingTools does not recognize it as an input, it is the same class as dds and any of my contrasts, but it can not read it. I assume is because R is also not recognizing the output with cooksC = Null.
what works:
>R24hZ24h <- results(dds,contrast = c("condition","Z_24h","R_24h"))
>summary(R24hZ24h )
My full command for ReportingTools (this was the first try with a Pvalue restriction):
out <- "/folder/"
des2Report05 <- HTMLReport(shortName = 'RNAseq_analysis_with_DESeq2_p05', title = 'sRNA-seq THP1 analysis using DESeq2, pvalue cutoff 0.05', basePath = out, reportDirectory = 'html/')
publish(dds,des2Report05, pvalueCutoff=0.05, annotation.db="org.Hs.egENSEMBL", factor = colData(dds)$condition, reportDir=out, n = length(row.names(dds)))
finish(des2Report05)
In genral I can only put as input to publish the dds, but not the results even when I try to transform them.
Sorry if I do not make sense, I do not know which other information I should add.
Adjusted p-values of NA are from independent filtering. You can turn this off with independentFiltering=FALSE, or you can replace NA with 1. See the DESeq2 vignette on how to do this.
I guess getting exactly what you want from ReportingTools might require input from those package authors on this thread.
I don't know why you are getting the "Error in .classEnv(classDef)" error, but I can't see what code you've run before this line, so I can't help much.
What happens in a fresh R session if you: build the DESeqDataSet, run DESeq(), then run results(dds, cooksCutoff=FALSE)? Do you still get this error?
I tried from the beginning and I got the same result:
dds <- DESeqDataSetFromMatrix(countData = THP1_countData,
colData = colData_try,
design = ~ condition)
featureData <- data.frame(gene=rownames(THP1_countData))
dds <- dds[ rowSums(counts(dds)) > 1, ]
dds$condition <- factor(dds$condition, levels=c("M_3h","M_24h", "R_3h" ,"R_24h" ,"Z_3h","Z_24h"))
dds <- DESeq(dds)
OverallComparisson <- results(dds, cooksCutoff = FALSE)
R3hZ3hNA <- results(dds,contrast = c("condition","Z_3h","R_3h"), cooksCutoff = FALSE)
summary(R3hZ3hNA)
M3hR3h <- results(dds,contrast = c("condition","R_3h", "M_3h"))
summary(M3hR3h)
However the structure of all the three analysis is the same.
Thanks for the help! (sorry for the delay on answer, seems as a newbie in the forum I cant answer that many times in a row)
You are right was a problem with the versions. thanks!