Question

caught segfault when using featureCounts

0

Entering edit mode

Ge Tan ▴ 20

@ge-tan-7918

Last seen 3.2 years ago

Switzerland

Hi,

I encountered segfault when calling featurecounts with Ensembl GRCh37 gtf file. In my dataset, only one bam file has the segfault, not the others.

And if I use featurecounts in built NCBI RefSeq annotation hg19, it works, although the chromosomes are in chr* format, instead of Ensembl format.

The command I used:

gtf <- "/srv/GT/reference/Homo_sapiens/Ensembl/GRCh37.p13/Annotation/Genes/genes.gtf"
bam <- "129278-pericyte_Mnb_sRNA_B_duprm.bam"
stranded=1
mh=TRUE
dup=FALSE
paired=FALSE
threads=8

foo1 <- Rsubread::featureCounts(files=bam,annot.ext=gtf,
      isGTFAnnotationFile=TRUE,GTF.featureType="exon",
      GTF.attrType="gene_id",nthreads=threads, isPairedEnd=paired,
      strandSpecific=stranded,ignoreDup=dup,countMultiMappingReads=mh)

foo2 <- Rsubread::featureCounts(files=bam,annot.inbuilt="hg19",
      nthreads=threads, isPairedEnd=paired,
      strandSpecific=stranded,ignoreDup=dup,countMultiMappingReads=mh)

The bam file is available at https://1drv.ms/u/s!Ar4vJvgFbg4Ki9s2oq8Ava9ebXYyRQ if anyone wants to reproduce the error.

The seesion info:


> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] ezRun_1.2.1          GenomicRanges_1.30.0 GenomeInfoDb_1.14.0
 [4] Biostrings_2.46.0    XVector_0.18.0       IRanges_2.12.0
 [7] S4Vectors_0.16.0     BiocGenerics_0.24.0  data.table_1.10.4-3
[10] colorout_1.1-3       BiocInstaller_1.28.0

loaded via a namespace (and not attached):
[1] bitops_1.0-6            zlibbioc_1.24.0         RCurl_1.95-4.8
[4] compiler_3.4.2          GenomeInfoDbData_0.99.1

and the output of run:

> foo1 <- Rsubread::featureCounts(files=bam,annot.ext=gtf,
+       isGTFAnnotationFile=TRUE,GTF.featureType="exon",
+       GTF.attrType="gene_id",nthreads=threads, isPairedEnd=paired,
+       strandSpecific=stranded,ignoreDup=dup,countMultiMappingReads=mh)


        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.28.0

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           S 129278-pericyte_Mnb_sRNA_B_duprm.bam           ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||         Strand specific : stranded                                         ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /srv/GT/reference/Homo_sapiens/Ensembl/GRCh37.p13 ... ||
||    Features : 1195764                                                      ||
||    Meta-features : 57905                                                   ||
||    Chromosomes/contigs : 61                                                ||
||                                                                            ||
|| Process BAM file 129278-pericyte_Mnb_sRNA_B_duprm.bam...                   ||
||    Single-end reads are included.                                          ||
||    Assign reads to features...                                             ||

 *** caught segfault ***
address 0x7fbd3c521734, cause 'memory not mapped'

Traceback:
 1: Rsubread::featureCounts(files = bam, annot.ext = gtf, isGTFAnnotationFile = TRUE,     GTF.featureType = "exon", GTF.attrType = "gene_id", nthreads = threads,     isPairedEnd = paired, strandSpecific = stranded, ignoreDup = dup,     countMultiMappingReads = mh)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Thanks, Ge

rsubread featurecounts • 2.7k views

ADD COMMENT • link 7.1 years ago Ge Tan ▴ 20

0

Entering edit mode

Please follow the posting guide (http://www.bioconductor.org/help/support/posting-guide/) when you post a question. Provide output of sessionInfo() so we can know what versions of software packages you use. You also need to show the errors you got from running featureCounts. It will be best to provide the complete screen output from your featureCounts run.

ADD REPLY • link 7.1 years ago Wei Shi ★ 3.6k

0

Entering edit mode

I changed as you suggested.

ADD REPLY • link 7.1 years ago Ge Tan ▴ 20

0

Entering edit mode

Thanks. We can reproduce the problem you encountered from our end. We found this was caused by a bug in the program that was related to the processing of bam files processed by Picard. We will release a fix soon.

ADD REPLY • link 7.1 years ago Wei Shi ★ 3.6k

0

Entering edit mode

This is now fixed in both release and devel versions.

ADD REPLY • link 7.1 years ago Wei Shi ★ 3.6k

0

Entering edit mode

Many thanks for fixing this.

ADD REPLY • link 7.0 years ago Ge Tan ▴ 20

score 0 · Answer 1 · 2017-12-19

0

Entering edit mode

sergisayolspuig ▴ 80

@sergisayolspuig-8816

Last seen 2.7 years ago

Germany

We recently saw similar behaviour with some valid BAM files. The repair tool included in the Subread package will fix the issue, even if you have SR data (the tool is in theory designed to fix the order in PE data).

Nevertheless it is an unexpected behaviour, and the result of repairing the order with this tool is, at least, suspicious. Running diff we see a handful of reads which are wrongly sorted after running repair. But the repaired bam file then goes through featureCounts.

We're currently using Rsubread_1.28.0.

ADD COMMENT • link 7.1 years ago sergisayolspuig ▴ 80

0

Entering edit mode

Please do not run repair function on single-end reads since it is designed for processing paired-end reads.

Could you provide an example of reads that were 'wrongly' sorted by repair?

ADD REPLY • link 7.1 years ago Wei Shi ★ 3.6k