Hi,
I recently started working with RNA-seq data. I used the code below to read 2-4 BAM files (each BAM and its BAI index are in the same directory), but I repeatedly get the following error when running summarizeOverlaps():
Error: stop worker failed:
'clear_cluster' receive data failed:
reached elapsed time limit
One other time I got this error (with the same code):
Error: 'bplapply' receive data failed:
error reading from connection
The BAM files each contain ~40M single-end 75 bp reads and are ~2-2.5 GB in size (aligned with tophat2/bowtie2 against the hg19 reference genome). The code, sessionInfo(), and the last lines of traceback() are below. Of note, this works just fine if I process only one BAM file:
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> grl <- exonsBy(txdb, by="gene")
> bamLst
BamFileList of length 4
names(4): file1.bam file2.bam file3.bam file4.bam
> experiment2 <- summarizeOverlaps(features=grl, reads=bamLst, ignore.strand=T, singleEnd=T)
Error: stop worker failed:
  'clear_cluster' receive data failed:
  reached elapsed time limit
> traceback()
16: stop(.error_worker_comm(e, "stop worker failed"))
15: value[[3L]](cond)
14: tryCatchOne(expr, names, parentenv, handlers[[1L]])
13: tryCatchList(expr, classes, parentenv, handlers)
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] GenomicAlignments_1.8.3 Rsamtools_1.24.0 Biostrings_2.40.2
 [4] XVector_0.12.0 SummarizedExperiment_1.2.3 Biobase_2.32.0
 [7] GenomicRanges_1.24.2 GenomeInfoDb_1.8.1 IRanges_2.6.1
[10] S4Vectors_0.10.1 BiocGenerics_0.18.0
It seems to me that this may be related to the computer's memory (8 GB), its number of cores (4), or something similar. Beyond using a more powerful computer, is there any way to fix (or work around) this?
I would expect a yieldSize of > 100000 to be OK for speed. You could process in serial, or perhaps see Rsubread::featureCounts() or the bamsignals package.

Will definitely try those, thanks!
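For anyone reading this later, the yieldSize and serial-processing suggestions above might be sketched roughly as follows. This is only an illustration under assumptions: the answer does not show the exact call it had in mind for serial processing, so register(SerialParam()) from BiocParallel is a guess at it, the yieldSize value of 1000000 is an arbitrary choice above the suggested 100000, and the file paths are placeholders taken from the names printed in the question.

```r
library(GenomicAlignments)   # summarizeOverlaps()
library(Rsamtools)           # BamFileList()
library(BiocParallel)        # register(), SerialParam()
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

# Placeholder paths; each .bam needs its .bai index alongside it.
bam_paths <- c("file1.bam", "file2.bam", "file3.bam", "file4.bam")

# yieldSize caps how many records are read per chunk, which bounds
# memory use per file. The value here is an assumption, not a tuned setting.
bamLst <- BamFileList(bam_paths, yieldSize = 1000000)

# Assumed way to "process in serial": make the default BiocParallel
# back-end serial so the files are counted one at a time rather than
# by several workers competing for 8 GB of RAM.
register(SerialParam())

grl <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by = "gene")
experiment2 <- summarizeOverlaps(features = grl, reads = bamLst,
                                 ignore.strand = TRUE, singleEnd = TRUE)
```

Running in serial trades speed for a smaller memory footprint, which seems like the relevant trade-off on a 4-core, 8 GB machine where a single file already works.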