Hello,
I am having difficulty producing output files with DEXSeqHTML. Only <gene_name>expression.html and <gene_name>results.html files are created, and only the latter have any contents (a data table).
I can generate plots for individual genes by calling
plotDEXSeq(dxr1,"NM_007440",legend=TRUE,cex.axis=1.2,cex=1.3,lwd=2, displayTranscripts = TRUE)
I was able to generate the HTML output following the vignette with the Pasilla dataset. However, when calling DEXSeqHTML in a similar manner with my own data I am only finding the files noted above.
DEXSeqHTML(dxr1, FDR=0.05,color=c("#FF00080","#0000FF80"),fitExpToVar="condition",mart = mart,filter="refseq_mrna" , attributes="external_gene_name")
I receive the following traceback
> traceback() 2: stop(sprintf("The value of the parameter fitExpToVar, '%s', is not a variable from the annotation of the samples. Please specify a column name of the colData slot of the DEXSeqDataSet object.", fitExpToVar)) 1: DEXSeqHTML(dxr1, FDR = 0.05, color = c("#FF00080", "#0000FF80"), fitExpToVar = "treatment", mart = mart, filter = "refseq_mrna", attributes = "external_gene_name")
and warnings (truncated, all 50 are similar)
Warning messages: 1: closing unused connection 13 (DEXSeqReport/files/NM_001002842expression.html) 2: closing unused connection 12 (DEXSeqReport/files/NM_001002764expression.html) 3: closing unused connection 11 (DEXSeqReport/files/NM_001002268expression.html) 4: closing unused connection 10 (DEXSeqReport/files/NM_001002241expression.html) 5: closing unused connection 9 (DEXSeqReport/files/NM_001002239expression.html) 6: closing unused connection 8 (DEXSeqReport/files/NM_001001883expression.html) 7: closing unused connection 7 (DEXSeqReport/files/NM_001001880expression.html) ...
The complete workflow I am following is essentially straight from the vignette as I am just beginning to use DEXSeq.
countFiles <- list.files(pattern="count.txt$",full.names=TRUE) flattenedFile <- list.files(pattern="gff$",full.names=TRUE) sampleTable <- data.frame( row.names = c("KO1","KO2","KO3","WT1","WT2","WT3"), condition = c("knockout","knockout","knockout","wildtype","wildtype","wildtype") ) dxd <- DEXSeqDataSetFromHTSeq( countFiles, sampleData=sampleTable, design = ~ sample + exon + condition:exon, flattenedfile = flattenedFile ) sampleAnnotation(dxd) dxd <- estimateSizeFactors(dxd) dxd <- estimateDispersions(dxd) dxd <- testForDEU(dxd) dxd <- estimateExonFoldChanges (dxd,fitExpToVar="condition") dxr1 <- DEXSeqResults(dxd)
LRT p-value: full vs reduced DataFrame with 205335 rows and 13 columns groupID featureID exonBaseMean dispersion stat pvalue <character> <character> <numeric> <numeric> <numeric> <numeric> NM_001001130:E001 NM_001001130 E001 45.843785 0.2252586 0.1458363 0.7025466 NM_001001130:E002 NM_001001130 E002 3.558073 0.2312652 0.1098326 0.7403348 NM_001001130:E003 NM_001001130 E003 2.616500 0.1867738 0.4723285 0.4919179 NM_001001130:E004 NM_001001130 E004 1.371563 0.1070000 0.9595710 0.3272950 NM_001001152:E001 NM_001001152 E001 9.450050 0.0568492 2.1828880 0.1395524 ... ... ... ... ... ... ... NM_213733:E012 NM_213733 E012 30.43564 0.04353082 3.033824 0.0815456 NR_003278:E001 NR_003278 E001 0.00000 NA NA NA NR_003279:E001 NR_003279 E001 0.00000 NA NA NA NR_003280:E001 NR_003280 E001 0.00000 NA NA NA NR_046233:E001 NR_046233 E001 0.00000 NA NA NA padj knockout wildtype log2fold_wildtype_knockout <numeric> <numeric> <numeric> <numeric> NM_001001130:E001 0.9565526 9.400194 10.128272 0.1076256 NM_001001130:E002 0.9643624 2.788459 3.298784 0.2424661 NM_001001130:E003 0.8903618 3.026732 2.667826 -0.1820963 NM_001001130:E004 0.7989704 2.465953 1.828341 -0.4316096 NM_001001152:E001 0.5876139 4.748336 4.220563 -0.1699867 ... ... ... ... ... NM_213733:E012 0.4639771 6.392214 7.311873 0.1939251 NR_003278:E001 NA NA NA NA NR_003279:E001 NA NA NA NA NR_003280:E001 NA NA NA NA NR_046233:E001 NA NA NA NA genomicData countData transcripts <GRanges> <matrix> <list> NM_001001130:E001 chr13:-:[67747800, 67749722] 62 17 11 ... ######## NM_001001130:E002 chr13:-:[67751610, 67751705] 7 0 0 ... ######## NM_001001130:E003 chr13:-:[67752388, 67752514] 7 1 0 ... ######## NM_001001130:E004 chr13:-:[67755063, 67755134] 4 1 0 ... ######## NM_001001152:E001 chr13:-:[67254918, 67258144] 9 2 2 ... ######## ... ... ... ... NM_213733:E012 chr2:+:[174122091, 174122702] 50 64 54 ... ######## NR_003278:E001 Rn18s:+:[ 1, 1870] 0 0 0 ... ######## NR_003279:E001 Rn28s1:+:[ 1, 4730] 0 0 0 ... ######## NR_003280:E001 Rn5-8s1:+:[ 1, 157] 0 0 0 ... ######## NR_046233:E001 Rn45s:+:[ 1, 13404] 0 0 0 ... ########
My session info is as follows
R version 3.2.3 (2015-12-10) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.11.2 (El Capitan) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] Cairo_1.5-9 biomaRt_2.26.1 DEXSeq_1.16.10 [4] DESeq2_1.10.1 RcppArmadillo_0.6.500.4.0 Rcpp_0.12.3 [7] SummarizedExperiment_1.0.2 GenomicRanges_1.22.4 GenomeInfoDb_1.6.3 [10] IRanges_2.4.8 S4Vectors_0.8.11 Biobase_2.30.0 [13] BiocGenerics_0.16.1 BiocParallel_1.4.3 loaded via a namespace (and not attached): [1] RColorBrewer_1.1-2 futile.logger_1.4.1 plyr_1.8.3 XVector_0.10.0 [5] bitops_1.0-6 futile.options_1.0.0 tools_3.2.3 zlibbioc_1.16.0 [9] statmod_1.4.24 rpart_4.1-10 RSQLite_1.0.0 annotate_1.48.0 [13] gtable_0.2.0 lattice_0.20-33 DBI_0.3.1 gridExtra_2.2.1 [17] stringr_1.0.0 hwriter_1.3.2 genefilter_1.52.1 cluster_2.0.3 [21] Biostrings_2.38.4 locfit_1.5-9.1 grid_3.2.3 nnet_7.3-12 [25] AnnotationDbi_1.32.3 XML_3.98-1.4 survival_2.38-3 foreign_0.8-66 [29] latticeExtra_0.6-28 Formula_1.2-1 magrittr_1.5 geneplotter_1.48.0 [33] ggplot2_2.1.0 lambda.r_1.1.7 Rsamtools_1.22.0 Hmisc_3.17-2 [37] scales_0.4.0 splines_3.2.3 xtable_1.8-2 colorspace_1.2-6 [41] stringi_1.0-1 acepack_1.3-3.3 RCurl_1.95-4.8 munsell_0.4.3
Thanks for any advice
Seth
Hi Seth,
Can you try
DEXSeqHTML(dxr1, FDR = 0.05, color = c("#FF00080", "#0000FF80"),
fitExpToVar = "condition", mart = mart, filter = "refseq_mrna",
attributes = "external_gene_name")
You are specifying "treatment" instead of "condition" for fitExpToVar, and "treatment" does not exist as a column of your colData. Will make sure you get an informative error, sorry about that!
Hi Alejandro,
Sorry about that, I copied over the incorrect line from that history. What you entered is what has produced the described error. I have also tried without entering the fitExpToVar parameter. When I tried with "treatment" it returned an error noting that the column does not exist.
I did try it again anyways, and happened to have the output folder opened and visible. I noticed that an empty svg file was generated for each gene, but is deleted before the process finishes. I don't know if this is expected behaviour. The final output was the same as previously seen. There is no traceback, while the warnings listed are present.
testForDEU.html looks how I'd expect, with the experimental design and a table of genes. Each gene links to an expression.html which has nothing but links back and to multiple page which do not exist. It also links to the results.html page which has the expected table.
Sorry for the confusion, and thanks for the help.
After seeing the brief appearance of zero byte SVG files I checked dev.list() following the DEXSeqHTML call. There were 63 svg in the list. I cleared it with graphics.off() and reran DEXSeqHTML and the dev.list is repopulated. I'm guessing now that there is something funny about my installation or environment, but it isn't apparent to me just what is going on.
I have tried the same workflow on my notebook, using a fresh installation of R 3.2.4 and the various packages, and I am experiencing the same issue with DEXSeqHTML. I briefly see several hundred 0 byte svg files created and then removed, while the function is running. Afterwards dev.list() shows 63 svg devices attached, presumably the first 63 created, and then no further devices can be attached.
Hi Skasowitz,
Would you mind sending me your objects so I could have a closer look at the problem?
Alejandro
Hi Alejandro,
Yesterday I sent a link to some files via the email form at embl.de. Were you able to access the object?
Seth
Hi Seth,
Thanks for checking! I got your dataset, but could not reproduce the behaviour that you were describing. I do get the report as expected. There are some glm fits that fail and a warning is issues, but these are only a couple of genes.
How many empty files are you getting? Could you try the devel version with the same code?
Alejandro
Hi Alejandro,
Thanks for giving it a look. About how many genes are showing up in your results?
I see several hundred empty svg files created when I first start DEXSeqHTML. These files do not persist for more than a few seconds. When R is finished I have a directory with only expression.html and results.html files. There are 1929 pairs of files. Although only the results.html files contain anything besides links, the expression files do remain when the process is complete.
I installed R devel and found the same result. When I tried this on the lab computer I saw over 3000 pairs of html files are left when finished, so I am wondering if part of this is a resource issue. I will create some subsets of the full dataset to continue working this out. If nothing else I can test more quickly.
Seth