Dear Rsubread developers,
I wanted to use the Rsubread package to count features on shotgun RNA-seq data. But when counting, it fails due too long alignments and requests the longread mode which changes runtime from 10 minutes to 30 hours and from multi core to single core! How can we overcome the problem? I also could not find any problematic read within the bam file.
The expected behavior is that it counts in multi core mode in around 10 minutes since the used bam file is a shotgun RNA-seq sample with 76bp read length.
And it works on other local sequencing data with 150bp read length with not problems.
Thanks in advance for any help. Best regards,
Christian Mertes
> featureCounts(file="HG00356.2.M_111215_6.bam", annot.ext=ranges, isLongRead=FALSE)
========== _____ _ _ ____ _____ ______ _____
===== / ____| | | | _ \| __ \| ____| /\ | __ \
===== | (___ | | | | |_) | |__) | |__ / \ | | | |
==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
Rsubread 1.34.7
//========================== featureCounts setting ===========================\\
|| ||
|| Input files : 1 BAM file ||
|| P HG00356.2.M_111215_6.bam ||
|| ||
|| Annotation : R data.frame ||
|| Dir for temp files : . ||
|| Threads : 1 ||
|| Level : meta-feature level ||
|| Paired-end : no ||
|| Multimapping reads : counted ||
|| Multi-overlapping reads : not counted ||
|| Min overlapping bases : 1 ||
|| ||
\\============================================================================//
//================================= Running ==================================\\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid36210 ... ||
|| Features : 3 ||
|| Meta-features : 3 ||
|| Chromosomes/contigs : 2 ||
|| ||
|| Process BAM file HG00356.2.M_111215_6.bam... ||
|| Paired-end reads are included. ||
|| Assign alignments to features... ||
|| ||
|| ERROR: Alignment record is too long. ||
|| Please use the long read mode. ||
FATAL Error: The program has to terminate and no counting file is generated.
Error in featureCounts(file = "HG00356.2.M_111215_6.bam", annot.ext = ranges, :
No counts were generated.
Here is an example showing the problem:
# load libraries
library(Rsamtools)
library(Rsubread)
# get example data
download.file(method = "wget", extra="--continue -nv",
"https://www.ebi.ac.uk/arrayexpress/files/E-GEUV-1/HG00356.2.M_111215_6.bam",
"HG00356.2.M_111215_6.bam")
# index file
indexBam("HG00356.2.M_111215_6.bam")
# ranges to count
ranges <- data.frame(
GeneID=c("1", "2", "3"),
Chr=c("chr3", "chr3", "chr19"),
Start= c(119349459, 119380675, 7126379),
End = c(119349460, 119380676, 7126380),
Strand=c("*", "*", "*")
)
# count features
featureCounts(file="HG00356.2.M_111215_6.bam", annot.ext=ranges,
isLongRead=FALSE)
And my sessionInfo is:
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.7 (Nitrogen)
Matrix products: default
BLAS: /data/nasif12/modules_if12/SL7/i12g/R/3.6.0-Bioc3.9/lib64/R/lib/libRblas.so
LAPACK: /data/nasif12/modules_if12/SL7/i12g/R/3.6.0-Bioc3.9/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices datasets utils methods base
other attached packages:
[1] Rsubread_1.34.7 Rsamtools_2.0.1 Biostrings_2.52.0 XVector_0.24.0 GenomicRanges_1.36.1
[6] GenomeInfoDb_1.20.0 IRanges_2.18.2 S4Vectors_0.22.1 BiocGenerics_0.30.0
loaded via a namespace (and not attached):
[1] packrat_0.5.0 bitops_1.0-6 zlibbioc_1.30.0 BiocParallel_1.18.1
[5] tools_3.6.0 RCurl_1.95-4.12 compiler_3.6.0 GenomeInfoDbData_1.2.1
Hi Christian, Can you share your BAM file, please? I hope to look into the BAM file to see what was the cause of the error message. You can send the link to liao (AT) wehi.edu.au Thanks. Cheers, Yang
Please check my example. It contains the link to the file. And thanks in advance!