Hello,
I have 16,500 OTUs and am trying to align their sequences using DECIPHER. If I run the code on a smaller subset (approximately 11,000 OTUs) everything works fine, but if I run it on the full data set I get the following error:
"error is.leaf(dend) node stack overflow"
Calling traceback() gives the following:
44: .collapse(dend[[2]], collapse)
43: .collapse(dend[[2]], collapse)
42: .collapse(dend[[2]], collapse)
41: .collapse(dend[[2]], collapse)
40: .collapse(dend[[2]], collapse)
39: .collapse(dend[[2]], collapse)
38: .collapse(dend[[2]], collapse)
37: .collapse(dend[[2]], collapse)
36: .collapse(dend[[2]], collapse)
35: .collapse(dend[[2]], collapse)
34: .collapse(dend[[2]], collapse)
33: .collapse(dend[[2]], collapse)
32: .collapse(dend[[2]], collapse)
31: .collapse(dend[[2]], collapse)
30: .collapse(dend[[2]], collapse)
29: .collapse(dend[[2]], collapse)
28: .collapse(dend[[2]], collapse)
27: .collapse(dend[[2]], collapse)
26: .collapse(dend[[2]], collapse)
25: .collapse(dend[[2]], collapse)
24: .collapse(dend[[2]], collapse)
23: .collapse(dend[[2]], collapse)
22: .collapse(dend[[2]], collapse)
21: .collapse(dend[[2]], collapse)
20: .collapse(dend[[2]], collapse)
19: .collapse(dend[[2]], collapse)
18: .collapse(dend[[2]], collapse)
17: .collapse(dend[[2]], collapse)
16: .collapse(dend[[2]], collapse)
15: .collapse(dend[[2]], collapse)
14: .collapse(dend[[2]], collapse)
13: .collapse(dend[[2]], collapse)
12: .collapse(dend[[2]], collapse)
11: .collapse(dend[[2]], collapse)
10: .collapse(dend[[2]], collapse)
9: .collapse(dend[[2]], collapse)
8: .collapse(dend[[2]], collapse)
7: .collapse(dend[[2]], collapse)
6: .collapse(dend[[2]], collapse)
5: .collapse(d, collapse)
4: IdClusters(d, method = "single", type = "dendrogram", verbose = verbose,
processors = processors)
3: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
2: suppressWarnings(guideTree <- IdClusters(d, method = "single",
type = "dendrogram", verbose = verbose, processors = processors))
1: AlignSeqs(DNAStringSet(seqs), anchor = NA, processors = 10)
My original command is: alignment <- AlignSeqs(DNAStringSet(seqs), anchor=NA, processors=10)
Session info is as follows:
R version 3.4.1 (2017-06-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Scientific Linux release 6.9 (Carbon)
Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] phyloseq_1.20.0 dada2_1.4.0 Rcpp_0.12.13
[4] DECIPHER_2.4.0 RSQLite_2.0 Biostrings_2.44.2
[7] XVector_0.16.0 IRanges_2.10.5 S4Vectors_0.14.7
[10] BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] ape_4.1 lattice_0.20-35
[3] Rsamtools_1.28.0 digest_0.6.12
[5] foreach_1.4.3 GenomeInfoDb_1.12.3
[7] plyr_1.8.4 ShortRead_1.34.2
[9] ggplot2_2.2.1 zlibbioc_1.22.0
[11] rlang_0.1.2 lazyeval_0.2.0
[13] data.table_1.10.4-2 vegan_2.4-4
[15] blob_1.1.0 Matrix_1.2-10
[17] splines_3.4.1 BiocParallel_1.10.1
[19] stringr_1.2.0 igraph_1.1.2
[21] RCurl_1.95-4.8 bit_1.1-12
[23] munsell_0.4.3 DelayedArray_0.2.7
[25] compiler_3.4.1 pkgconfig_2.0.1
[27] multtest_2.32.0 mgcv_1.8-17
[29] biomformat_1.4.0 SummarizedExperiment_1.6.5
[31] tibble_1.3.4 GenomeInfoDbData_0.99.0
[33] codetools_0.2-15 matrixStats_0.52.2
[35] permute_0.9-4 GenomicAlignments_1.12.2
[37] MASS_7.3-47 bitops_1.0-6
[39] grid_3.4.1 nlme_3.1-131
[41] jsonlite_1.5 gtable_0.2.0
[43] DBI_0.7 magrittr_1.5
[45] scales_0.5.0 RcppParallel_4.3.20
[47] stringi_1.1.5 hwriter_1.3.2
[49] reshape2_1.4.2 latticeExtra_0.6-28
[51] RColorBrewer_1.1-2 iterators_1.0.8
[53] tools_3.4.1 ade4_1.7-8
[55] bit64_0.9-7 Biobase_2.36.2
[57] survival_2.41-3 colorspace_1.3-2
[59] rhdf5_2.20.0 cluster_2.0.6
[61] GenomicRanges_1.28.6 memoise_1.1.0
I have removed all chimeras and there are no duplicate sequences. I tried increasing R's expression limit (the "expressions" option), but that did not help. Any advice is much appreciated!
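
For anyone trying to reproduce this, here is a minimal sketch of what raising that limit looks like. It assumes that "expression number" refers to R's `expressions` option. The value shown is the maximum that R documents for this option, not necessarily the value used in the run above.

```r
# Assumption: the limit in question is the `expressions` option,
# which caps the number of nested expressions R will evaluate.
old <- options(expressions = 500000)  # 500000 is the documented maximum

# Confirm the new limit is in effect for this session.
new_limit <- getOption("expressions")
print(new_limit)

# Restore the previous setting when done.
options(old)
```

The setting only applies to the current R session, so it has to be in place before AlignSeqs() is called.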