Ryan C. Thompson
Icahn School of Medicine at Mount Sinai
Hi Wei,
I've been testing the new featureCounts arguments on my use case, and
I'm getting a segfault when I run it. I suspect that my annotation may
be too large and featureCounts is running out of memory. Specifically,
I'm counting ChIP-Seq reads in bins placed every 50 bp across the
entire human genome, which comes to about 60 million bins. Below is the
log output for the crash. Note that "featureCounts.fragments" is my
function that simply calls featureCounts with the appropriate
arguments. In any case, the crash seems to happen while loading the
annotation file.
Should I try splitting the annotation into smaller pieces, calling
featureCounts on each one, and then rbinding them together?
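To make the splitting idea concrete, here is a minimal sketch of what I have in mind. The names `count_in_chunks` and `fake_counts` are hypothetical, and `fake_counts` is only a stand-in so the example runs without any BAM files; in real use the counting function would call featureCounts on each chunk and return its `$counts` matrix. I don't know whether this would actually avoid the segfault.

```r
# Split a SAF-style annotation into row chunks, count each chunk
# separately, and rbind the per-chunk count matrices back together.
count_in_chunks <- function(saf, chunk_size, count_fun) {
  stopifnot(is.data.frame(saf), chunk_size >= 1)
  # Assign consecutive rows to chunks of at most chunk_size rows.
  chunk_id <- ceiling(seq_len(nrow(saf)) / chunk_size)
  pieces <- lapply(split(saf, chunk_id), count_fun)
  # Each piece is a bins-by-samples matrix; stack them in original order.
  do.call(rbind, unname(pieces))
}

# Stand-in for the real counting step (an assumption for illustration).
# In practice this might be something like:
#   function(chunk) featureCounts(bam, annot.ext = chunk, ...)$counts
fake_counts <- function(chunk) {
  matrix(chunk$End - chunk$Start + 1L, ncol = 1,
         dimnames = list(chunk$GeneID, "sampleA"))
}

# Seven 50 bp bins on one chromosome, in SAF column layout.
saf <- data.frame(GeneID = paste0("bin", 1:7), Chr = "chr1",
                  Start = seq(1L, by = 50L, length.out = 7),
                  End = seq(50L, by = 50L, length.out = 7),
                  Strand = "+", stringsAsFactors = FALSE)

counts <- count_in_chunks(saf, chunk_size = 3, count_fun = fake_counts)
```

Splitting by rows like this assumes every bin has its own GeneID, so that no meta-feature is divided between chunks. I would also expect it to match an unsplit run only with allowMultiOverlap = TRUE, since otherwise a read overlapping bins in two different chunks could be handled differently.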
Also, an unrelated question: when the read is reduced to its 5-prime
end, is this reducing its length to one or zero? I ask because I want
to know whether to extend by the fragment length or the fragment length
minus 1.
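The arithmetic behind the question, as a small sketch. `extension_for` is a hypothetical helper, not part of Rsubread. The value 148 is taken from the log below: a 3' extension of 147 together with `readExtension3 = fraglength - 1` implies a fragment length of 148 in this run.

```r
# Hypothetical helper: the 3' extension needed so that the extended read
# covers exactly `fraglength` bases, given the width left after reduction.
extension_for <- function(fraglength, reduced_width) {
  stopifnot(reduced_width %in% c(0, 1))
  fraglength - reduced_width
}

ext_if_one  <- extension_for(148, 1)  # reduced read keeps one base
ext_if_zero <- extension_for(148, 0)  # reduced read has zero width
```

I'm currently assuming the first case, which is why my call passes `fraglength - 1`.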
Thanks,
-Ryan
Error output:
        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
Rsubread 1.15.2
//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 142 BAM files                                    ||
||                           S /gpfs/home/rcthomps/Projects/sarah-cd4/bam ... ||
...
||                           S /gpfs/home/rcthomps/Projects/sarah-cd4/bam ... ||
||                                                                            ||
||             Output file : ./.Rsubread_featureCounts_pid17803               ||
||             Annotations : ./.Rsubread_UserProvidedAnnotation_pid17803 ...  ||
||                                                                            ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : counted                                          ||
||         Read extensions : 0 on 5' and 147 on 3' ends                       ||
||       Read reduction to : 5' end                                           ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//
//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file ./.Rsubread_UserProvidedAnnotation_pid17803 ...       ||
*** caught segfault ***
address 0x7f428c4fd38d, cause 'memory not mapped'
Traceback:
1: .C("R_readSummary_wrapper", as.integer(n), as.character(cmd),
       PACKAGE = "Rsubread")
2: featureCounts(bam, annot.ext = saf, isPairedEnd = FALSE, read2pos = 5,
       readExtension3 = fraglength - 1, allowMultiOverlap = TRUE,
       strandSpecific = 0, nthreads = nthreads, ...)
3: featureCounts.fragments(bamfiles, windows, fraglength = histone.width)
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
[9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] grDevices datasets parallel graphics utils stats methods
[8] base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg19_1.3.19 BSgenome_1.30.0
[3] BiocParallel_0.4.1 xlsx_0.5.5
[5] xlsxjars_0.6.0 rJava_0.9-6
[7] plyr_1.8 doParallel_1.0.8
[9] iterators_1.0.7 foreach_1.4.2
[11] Rsubread_1.15.2 Rsamtools_1.14.3
[13] Biostrings_2.30.1 GenomicRanges_1.14.4
[15] XVector_0.2.0 BiocInstaller_1.12.1
[17] stringr_0.6.2 IRanges_1.20.6
[19] BiocGenerics_0.8.0 R.utils_1.29.8
[21] R.oo_1.17.0 R.methodsS3_1.6.1
[23] ggplot2_0.9.3.1
loaded via a namespace (and not attached):
[1] BatchJobs_1.2 BBmisc_1.5 bitops_1.0-6 brew_1.0-6
[5] codetools_0.2-8 colorspace_1.2-4 DBI_0.2-7 dichromat_2.0-0
[9] digest_0.6.4 fail_1.2 grid_3.0.2 gtable_0.1.2
[13] labeling_0.2 MASS_7.3-30 munsell_0.4.2 proto_0.3-10
[17] RColorBrewer_1.0-5 reshape2_1.2.2 RSQLite_0.11.4 scales_0.2.3
[21] sendmailR_1.1-2 stats4_3.0.2 tools_3.0.2 zlibbioc_1.8.0