Question

DESeq2 Error in rownames<-(*tmp*, value = names(x))

ashley.doane ▴ 20

@ashleydoane-8524

Hi,

Getting an unexpected error with DESeq2.

> dds = DESeq(dds)
using pre-existing normalization factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 792 genes
-- DESeq argument 'minReplicatesForReplace' = 7 
-- original counts are preserved in counts(dds)
estimating dispersions
Error in `rownames<-`(`*tmp*`, value = names(x)) : 
  duplicate rownames not allowed

Of course, I checked that the rownames are unique:

> rn = rownames(dds.ed)
> rn[duplicated(rn)]
character(0)

I also tried setting new rownames with rownames(dds) = 1:length(dds), but I still get this error.

I've tried both installing the macOS binary and compiling from source, with the same result.

I must be missing something obvious. Any ideas?

thanks,

Ashley

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] splines   parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] cqn_1.26.0                  quantreg_5.36               SparseM_1.77                preprocessCore_1.42.0       nor1mix_1.2-3              
 [6] mclust_5.4.1                edgeR_3.22.5                limma_3.36.5                DESeq2_1.20.0               SummarizedExperiment_1.10.1
[11] DelayedArray_0.6.6          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              forcats_0.3.0              
[16] dplyr_0.7.6                 purrr_0.2.5                 tidyr_0.8.1                 tibble_1.4.2                ggplot2_3.0.0.9000         
[21] tidyverse_1.2.1             readr_1.1.1                 stringr_1.3.1               rtracklayer_1.40.6          GenomicRanges_1.32.7       
[26] GenomeInfoDb_1.16.0         IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0         BiocInstaller_1.30.0       

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2         htmlTable_1.12           XVector_0.20.0           base64enc_0.1-3          rstudioapi_0.8           MatrixModels_0.4-1      
 [7] bit64_0.9-7              AnnotationDbi_1.42.1     lubridate_1.7.4          xml2_1.2.0               geneplotter_1.58.0       knitr_1.20              
[13] Formula_1.2-3            jsonlite_1.5             Rsamtools_1.32.3         broom_0.5.0              annotate_1.58.0          cluster_2.0.7-1         
[19] compiler_3.5.1           httr_1.3.1               backports_1.1.2          assertthat_0.2.0         Matrix_1.2-14            lazyeval_0.2.1          
[25] cli_1.0.1                acepack_1.4.1            htmltools_0.3.6          tools_3.5.1              bindrcpp_0.2.2           gtable_0.2.0            
[31] glue_1.3.0               GenomeInfoDbData_1.1.0   Rcpp_0.12.19             cellranger_1.1.0         Biostrings_2.48.0        nlme_3.1-137            
[37] rvest_0.3.2              XML_3.98-1.16            zlibbioc_1.26.0          scales_1.0.0             hms_0.4.2                RColorBrewer_1.1-2      
[43] yaml_2.2.0               memoise_1.1.0            gridExtra_2.3            rpart_4.1-13             latticeExtra_0.6-28      stringi_1.2.4           
[49] RSQLite_2.1.1            genefilter_1.62.0        checkmate_1.8.5          rlang_0.2.2              pkgconfig_2.0.2          bitops_1.0-6            
[55] lattice_0.20-35          bindr_0.1.1              GenomicAlignments_1.16.0 htmlwidgets_1.3          bit_1.1-14               tidyselect_0.2.4        
[61] plyr_1.8.4               magrittr_1.5             R6_2.3.0                 Hmisc_4.1-1              DBI_1.0.0                pillar_1.3.0            
[67] haven_1.1.2              foreign_0.8-71           withr_2.1.2              survival_2.42-6          RCurl_1.95-4.11          nnet_7.3-12             
[73] modelr_0.1.2             crayon_1.3.4             locfit_1.5-9.1           grid_3.5.1               readxl_1.1.0             data.table_1.11.8       
[79] blob_1.1.1               digest_0.6.17            xtable_1.8-3             munsell_0.5.0

software error deseq2 • 4.9k views

ADD COMMENT • link updated 6.6 years ago by Michael Love 43k • written 6.6 years ago by ashley.doane ▴ 20

score 1 · Accepted Answer · 2018-10-07

Michael Love 43k

@mikelove

What happens with DESeq(dds, minRep=Inf)?
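
For readers unfamiliar with the shorthand: minRep partially matches the DESeq() argument minReplicatesForReplace, and setting it to Inf turns off outlier replacement and the refit step where the error was thrown. A minimal sketch, assuming DESeq2 is installed and using simulated data from makeExampleDESeqDataSet() as a stand-in for the counts in the question:

```r
# Assumes the Bioconductor package DESeq2 is installed.
library(DESeq2)

# Simulated stand-in data (7 samples per condition), not the ATAC-seq
# counts from the question.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 14)

# With Inf, no group ever has "enough" replicates, so outlier
# replacement and the subsequent refit are skipped.
dds <- DESeq(dds, minReplicatesForReplace = Inf, quiet = TRUE)

# The per-row maximum Cook's distance is still recorded for inspection.
summary(mcols(dds)$maxCooks)
```

This only sidesteps the replacement step; it does not explain why the refit failed.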

ADD COMMENT • link 6.6 years ago Michael Love 43k

Thanks, this solves the immediate issue. I don't think it would matter, but I failed to mention that it was a large count matrix by number of rows: there were 103,000 "genes" (ATAC-seq peaks). Please let me know if I can provide additional information. And also, thanks so much for DESeq2 and for continuing its development. Best, Ashley

ADD REPLY • link 6.6 years ago ashley.doane ▴ 20

I’m not sure if I can figure out what’s going on because it doesn’t throw this error in our tests. I’ll take a look at the code, but may not find the issue.

I’d say you can also just assess outliers by eye, looking at a few example peaks with large values of maxCooks in mcols(dds), rather than using the outlier replacement heuristic.
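
One way the by-eye check could look, as a rough sketch rather than a prescribed workflow. It assumes DESeq2 is installed, uses simulated data in place of the real peaks, and the choice of four rows is arbitrary:

```r
# Assumes the Bioconductor package DESeq2 is installed.
library(DESeq2)

# Simulated stand-in data, fitted without outlier replacement.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 14)
dds <- DESeq(dds, minReplicatesForReplace = Inf, quiet = TRUE)

# Rows with the largest maximum Cook's distance; order() puts NA last.
max_cooks <- mcols(dds)$maxCooks
top <- head(order(max_cooks, decreasing = TRUE), 4)

# Plot normalized counts per sample for each of those rows, grouped by
# the design variable ("condition" in the simulated data).
for (i in top) {
  plotCounts(dds, gene = i, intgroup = "condition")
}
```

A single sample sitting far from the rest of its group in these plots is the kind of outlier the replacement heuristic targets.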

ADD REPLY • link 6.6 years ago Michael Love 43k

Can you show mcols(dds) before you run DESeq()? Are there any additional columns there?

ADD REPLY • link 6.6 years ago Michael Love 43k

Hello,

Just wanted to add that I had the same issue: https://www.biostars.org/p/343037/

Setting minRep=Inf also fixed the problem for me, and it does look like I had a few outliers in the post-DESeq dds.

When I tried to graph outliers in the pre-DESeq dds using the method described in that post (bottom), I got this error:

Error in apply(assays(dds_kal_agg)[["cooks"]], 1, max) :
dim(X) must have a positive length

Hope this helps.

Kristin

(Edit: I now realize the error occurs because Cook's distances have not been calculated for dds_kal_agg, which is pre-DESeq. Is there another feature of mcols I should check out? It looks pretty empty:)
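
The apply() error above can be reproduced without DESeq2, on the assumption that asking for an assay that does not exist yet returns NULL (which is what the message suggests). In this sketch a plain named list stands in for assays(dds), and a guard avoids the error:

```r
# Stand-in for assays(dds) before DESeq() has run: only counts exist.
# A plain named list is used so the snippet needs no packages.
set.seed(1)
assays_before <- list(counts = matrix(rpois(20, lambda = 10), nrow = 5))

# There is no "cooks" element yet, so this is NULL.
cooks <- assays_before[["cooks"]]

# apply() on NULL fails because NULL has no dimensions.
msg <- tryCatch(apply(cooks, 1, max), error = function(e) conditionMessage(e))
print(msg)

# Guard: only compute per-row maxima once the matrix exists.
max_cooks <- if (is.null(cooks)) NULL else apply(cooks, 1, max)
```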

ADD REPLY • link 6.6 years ago muench.kristin • 0

Can you send the dds to the address given by maintainer("DESeq2")? I’ll try to hunt down the bug.

ADD REPLY • link 6.6 years ago Michael Love 43k

Thank you, I was able to reproduce with v1.20.

The problem is that you have duplicate column names in colData(dds), which breaks some code where replaceOutliers adds a column to colData(dds) and adds some metadata about that column.

sum(duplicated(colnames(colData(dds))))

"Line" and "DESeqAnalysisID" columns both have duplicates.

So a solution is to only have unique column names for colData(dds), which is probably a good idea anyway.

I noticed that the error isn't thrown anyway in the development version, which will be released in a few weeks as v1.22.
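
The duplicate-name check and two possible fixes can be sketched in base R. The table below is hypothetical: it only reuses the two column names mentioned above, and a plain data.frame stands in for colData(dds).

```r
# Hypothetical sample table with repeated column names.
sample_info <- data.frame(
  Line = c("A", "B", "C"),
  DESeqAnalysisID = c("s1", "s2", "s3"),
  Line = c("A", "B", "C"),
  DESeqAnalysisID = c("s1", "s2", "s3"),
  check.names = FALSE  # keep the duplicate names as given
)

# Detect repeated column names.
dup <- duplicated(colnames(sample_info))
sum(dup)
colnames(sample_info)[dup]

# Option 1: drop the repeated columns (reasonable when they hold
# the same values as the first occurrence).
deduped <- sample_info[, !dup, drop = FALSE]

# Option 2: keep every column but give each a distinct name.
renamed <- sample_info
colnames(renamed) <- make.unique(colnames(renamed))
colnames(renamed)
```

The same check should carry over to colnames(colData(dds)); which option is appropriate depends on whether the repeated columns actually carry different information.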

ADD REPLY • link 6.6 years ago Michael Love 43k